February 1, 2020

Paper Group AWR 273

Demand Forecasting from Spatiotemporal Data with Graph Networks and Temporal-Guided Embedding. Supervised Online Hashing via Hadamard Codebook Learning. Reconstructing continuous distributions of 3D protein structure from cryo-EM images. Node Embedding over Temporal Graphs. Minibatch Processing in Spiking Neural Networks. Improving Neural Response …

Demand Forecasting from Spatiotemporal Data with Graph Networks and Temporal-Guided Embedding

Title Demand Forecasting from Spatiotemporal Data with Graph Networks and Temporal-Guided Embedding
Authors Doyup Lee, Suehun Jung, Yeongjae Cheon, Dongil Kim, Seungil You
Abstract Short-term demand forecasting models commonly combine convolutional and recurrent layers to extract complex spatiotemporal patterns in data. Long-term histories are also used to capture periodicity and seasonality patterns in the time series. In this study, we propose an efficient architecture, Temporal-Guided Network (TGNet), which utilizes graph networks and temporal-guided embedding. Instead of convolutional layers, graph networks extract features that are invariant to permutations of adjacent regions. Temporal-guided embedding explicitly learns temporal contexts from training data and is substituted for the input of long-term histories from days/weeks ago. TGNet learns an autoregressive model, conditioned on temporal contexts of forecasting targets from temporal-guided embedding. Finally, our model achieves performance competitive with other baselines on three real-world spatiotemporal demand datasets, while using about 20 times fewer trainable parameters than a state-of-the-art baseline. We also show that temporal-guided embedding learns temporal contexts as intended and that TGNet forecasts robustly even in atypical event situations.
Tasks Time Series
Published 2019-05-26
URL https://arxiv.org/abs/1905.10709v2
PDF https://arxiv.org/pdf/1905.10709v2.pdf
PWC https://paperswithcode.com/paper/190510709
Repo https://github.com/LeeDoYup/TGNet-keras
Framework tf

Supervised Online Hashing via Hadamard Codebook Learning

Title Supervised Online Hashing via Hadamard Codebook Learning
Authors Mingbao Lin, Rongrong Ji, Hong Liu, Yongjian Liu
Abstract In recent years, binary code learning, a.k.a. hashing, has received extensive attention in large-scale multimedia retrieval. It aims to encode high-dimensional data points to binary codes, hence the original high-dimensional metric space can be efficiently approximated via Hamming space. However, most existing hashing methods adopt offline batch learning, which is not suitable for handling incremental datasets with streaming data or new instances. In contrast, the robustness of existing online hashing remains an open problem, while the embedding of supervised/semantic information hardly boosts the performance of online hashing, mainly due to the defect of unknown category numbers in supervised learning. In this paper, we propose an online hashing scheme, termed Hadamard Codebook based Online Hashing (HCOH), which aims to solve the above problems towards robust and supervised online hashing. In particular, we first assign an appropriate high-dimensional binary code to each class label, which is generated randomly by Hadamard codes. Subsequently, LSH is adopted to reduce the length of such Hadamard codes in accordance with the hash bits, which can adapt the predefined binary codes online, and theoretically guarantee the semantic similarity. Finally, we consider the setting of stochastic data acquisition, which enables our method to efficiently learn the corresponding hashing functions via stochastic gradient descent (SGD) online. Notably, the proposed HCOH can be embedded with supervised labels and is not limited to a predefined category number. Extensive experiments on three widely-used benchmarks demonstrate the merits of the proposed scheme over the state-of-the-art methods. The code is available at https://github.com/lmbxmu/mycode/tree/master/2018ACMMM_HCOH.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2019-04-28
URL https://arxiv.org/abs/1905.03694v2
PDF https://arxiv.org/pdf/1905.03694v2.pdf
PWC https://paperswithcode.com/paper/190503694
Repo https://github.com/lmbxmu/mycode
Framework none
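
The codebook step described in the HCOH abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the Sylvester construction, the rule for picking the matrix size, and the Gaussian random projection used as the LSH step are all assumptions made for the example.

```python
import numpy as np

def sylvester_hadamard(n):
    """Return an n x n Hadamard matrix; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1]])
    while h.shape[0] < n:
        # Sylvester doubling: H_2n = [[H, H], [H, -H]]
        h = np.block([[h, h], [h, -h]])
    return h

def class_codebook(num_classes, num_bits, seed=0):
    """Assign one +/-1 code of length num_bits to each class (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = 1
    while n < max(num_classes, num_bits):
        n *= 2
    h = sylvester_hadamard(n)
    # Pick distinct rows at random as the per-class target codes.
    rows = rng.choice(n, size=num_classes, replace=False)
    codes = h[rows].astype(float)
    if n > num_bits:
        # Assumed LSH step: sign of a Gaussian random projection.
        projection = rng.standard_normal((n, num_bits))
        codes = np.sign(codes @ projection)
        codes[codes == 0] = 1
    return codes.astype(int)
```

Distinct rows of a Hadamard matrix are mutually orthogonal, which is why they make well-separated class targets before any length reduction.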

Reconstructing continuous distributions of 3D protein structure from cryo-EM images

Title Reconstructing continuous distributions of 3D protein structure from cryo-EM images
Authors Ellen D. Zhong, Tristan Bepler, Joseph H. Davis, Bonnie Berger
Abstract Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structure of proteins and other macromolecular complexes at near-atomic resolution. In single particle cryo-EM, the central problem is to reconstruct the three-dimensional structure of a macromolecule from $10^{4-7}$ noisy and randomly oriented two-dimensional projections. However, the imaged protein complexes may exhibit structural variability, which complicates reconstruction and is typically addressed using discrete clustering approaches that fail to capture the full range of protein dynamics. Here, we introduce a novel method for cryo-EM reconstruction that extends naturally to modeling continuous generative factors of structural heterogeneity. This method encodes structures in Fourier space using coordinate-based deep neural networks, and trains these networks from unlabeled 2D cryo-EM images by combining exact inference over image orientation with variational inference for structural heterogeneity. We demonstrate that the proposed method, termed cryoDRGN, can perform ab initio reconstruction of 3D protein complexes from simulated and real 2D cryo-EM image data. To our knowledge, cryoDRGN is the first neural network-based approach for cryo-EM reconstruction and the first end-to-end method for directly reconstructing continuous ensembles of protein structures from cryo-EM images.
Tasks
Published 2019-09-11
URL https://arxiv.org/abs/1909.05215v3
PDF https://arxiv.org/pdf/1909.05215v3.pdf
PWC https://paperswithcode.com/paper/reconstructing-continuously-heterogeneous
Repo https://github.com/zhonge/cryodrgn
Framework pytorch

Node Embedding over Temporal Graphs

Title Node Embedding over Temporal Graphs
Authors Uriel Singer, Ido Guy, Kira Radinsky
Abstract In this work, we present a method for node embedding in temporal graphs. We propose an algorithm that learns the evolution of a temporal graph’s nodes and edges over time and incorporates this dynamics in a temporal node embedding framework for different graph prediction tasks. We present a joint loss function that creates a temporal embedding of a node by learning to combine its historical temporal embeddings, such that it optimizes per given task (e.g., link prediction). The algorithm is initialized using static node embeddings, which are then aligned over the representations of a node at different time points, and eventually adapted for the given task in a joint optimization. We evaluate the effectiveness of our approach over a variety of temporal graphs for the two fundamental tasks of temporal link prediction and multi-label node classification, comparing to competitive baselines and algorithmic alternatives. Our algorithm shows performance improvements across many of the datasets and baselines and is found particularly effective for graphs that are less cohesive, with a lower clustering coefficient.
Tasks Link Prediction, Node Classification
Published 2019-03-21
URL http://arxiv.org/abs/1903.08889v2
PDF http://arxiv.org/pdf/1903.08889v2.pdf
PWC https://paperswithcode.com/paper/node-embedding-over-temporal-graphs
Repo https://github.com/urielsinger/Datasets
Framework none

Minibatch Processing in Spiking Neural Networks

Title Minibatch Processing in Spiking Neural Networks
Authors Daniel J. Saunders, Cooper Sigrist, Kenneth Chaney, Robert Kozma, Hava T. Siegelmann
Abstract Spiking neural networks (SNNs) are a promising candidate for biologically-inspired and energy efficient computation. However, their simulation is notoriously time consuming, and may be seen as a bottleneck in developing competitive training methods with potential deployment on neuromorphic hardware platforms. To address this issue, we provide an implementation of mini-batch processing applied to clock-based SNN simulation, leading to drastically increased data throughput. To our knowledge, this is the first general-purpose implementation of mini-batch processing in a spiking neural networks simulator, which works with arbitrary neuron and synapse models. We demonstrate nearly constant-time scaling with batch size on a simulation setup (up to GPU memory limits), and showcase the effectiveness of large batch sizes in two SNN application domains, resulting in $\approx$880X and $\approx$24X reductions in wall-clock time respectively. Different parameter reduction techniques are shown to produce different learning outcomes in a simulation of networks trained with spike-timing-dependent plasticity. Machine learning practitioners and biological modelers alike may benefit from the drastically reduced simulation time and increased iteration speed this method enables. Code to reproduce the benchmarks and experimental findings in this paper can be found at https://github.com/djsaunde/snn-minibatch.
Tasks
Published 2019-09-05
URL https://arxiv.org/abs/1909.02549v1
PDF https://arxiv.org/pdf/1909.02549v1.pdf
PWC https://paperswithcode.com/paper/minibatch-processing-in-spiking-neural
Repo https://github.com/djsaunde/snn-minibatch
Framework pytorch
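
The idea of adding a batch dimension to a clock-based simulation can be illustrated with a toy leaky integrate-and-fire model. This is a sketch under stated assumptions: it uses NumPy rather than the authors' simulator, and the function names, the decay/threshold/reset parameters, and the neuron model are hypothetical choices for the example.

```python
import numpy as np

def lif_step(voltage, input_current, decay=0.9, threshold=1.0, reset=0.0):
    """Advance a (batch, neurons) array of membrane voltages by one clock tick."""
    voltage = decay * voltage + input_current
    spikes = voltage >= threshold
    # Neurons that spiked are reset; the rest keep their voltage.
    voltage = np.where(spikes, reset, voltage)
    return voltage, spikes

def simulate(input_currents, decay=0.9, threshold=1.0):
    """input_currents has shape (time, batch, neurons); returns spike counts."""
    steps, batch, neurons = input_currents.shape
    voltage = np.zeros((batch, neurons))
    counts = np.zeros((batch, neurons), dtype=int)
    for t in range(steps):
        voltage, spikes = lif_step(voltage, input_currents[t], decay, threshold)
        counts += spikes
    return counts
```

Every operation in the step is elementwise over the batch axis, so the time loop runs once regardless of how many samples are processed together.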

Improving Neural Response Diversity with Frequency-Aware Cross-Entropy Loss

Title Improving Neural Response Diversity with Frequency-Aware Cross-Entropy Loss
Authors Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten de Rijke
Abstract Sequence-to-Sequence (Seq2Seq) models have achieved encouraging performance on the dialogue response generation task. However, existing Seq2Seq-based response generation methods suffer from a low-diversity problem: they frequently generate generic responses, which make the conversation less interesting. In this paper, we address the low-diversity problem by investigating its connection with model over-confidence reflected in predicted distributions. Specifically, we first analyze the influence of the commonly used Cross-Entropy (CE) loss function, and find that the CE loss function prefers high-frequency tokens, which results in low-diversity responses. We then propose a Frequency-Aware Cross-Entropy (FACE) loss function that improves over the CE loss function by incorporating a weighting mechanism conditioned on token frequency. Extensive experiments on benchmark datasets show that the FACE loss function is able to substantially improve the diversity of existing state-of-the-art Seq2Seq response generation methods, in terms of both automatic and human evaluations.
Tasks
Published 2019-02-25
URL http://arxiv.org/abs/1902.09191v1
PDF http://arxiv.org/pdf/1902.09191v1.pdf
PWC https://paperswithcode.com/paper/improving-neural-response-diversity-with
Repo https://github.com/ShaojieJiang/FACE
Framework pytorch
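
A frequency-conditioned weighting of the cross-entropy loss, as the FACE abstract describes, might look like the sketch below. The specific weighting rule (a linear decrease with relative frequency, normalized to mean 1) and the function names are assumptions for illustration and may differ from the variants the authors evaluate.

```python
import numpy as np

def frequency_weights(token_counts):
    """Map token counts to loss weights that shrink as frequency grows."""
    counts = np.asarray(token_counts, dtype=float)
    relative = counts / counts.sum()
    weights = 1.0 - relative / relative.max()  # most frequent token gets 0
    if weights.sum() == 0:
        # All tokens equally frequent: fall back to uniform weights.
        return np.ones_like(weights)
    return weights / weights.mean()

def weighted_cross_entropy(logits, targets, weights):
    """Mean cross-entropy where each example is scaled by its target's weight."""
    logits = np.asarray(logits, dtype=float)
    targets = np.asarray(targets)
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(targets)), targets]
    return float((weights[targets] * nll).mean())
```

With such weights, mistakes on rare tokens cost more than mistakes on frequent ones, which counteracts the preference for high-frequency tokens noted in the abstract.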

CXPlain: Causal Explanations for Model Interpretation under Uncertainty

Title CXPlain: Causal Explanations for Model Interpretation under Uncertainty
Authors Patrick Schwab, Walter Karlen
Abstract Feature importance estimates that inform users about the degree to which given inputs influence the output of a predictive model are crucial for understanding, validating, and interpreting machine-learning models. However, providing fast and accurate estimates of feature importance for high-dimensional data, and quantifying the uncertainty of such estimates remain open challenges. Here, we frame the task of providing explanations for the decisions of machine-learning models as a causal learning task, and train causal explanation (CXPlain) models that learn to estimate to what degree certain inputs cause outputs in another machine-learning model. CXPlain can, once trained, be used to explain the target model in little time, and enables the quantification of the uncertainty associated with its feature importance estimates via bootstrap ensembling. We present experiments that demonstrate that CXPlain is significantly more accurate and faster than existing model-agnostic methods for estimating feature importance. In addition, we confirm that the uncertainty estimates provided by CXPlain ensembles are strongly correlated with their ability to accurately estimate feature importance on held-out data.
Tasks Feature Importance
Published 2019-10-27
URL https://arxiv.org/abs/1910.12336v1
PDF https://arxiv.org/pdf/1910.12336v1.pdf
PWC https://paperswithcode.com/paper/cxplain-causal-explanations-for-model
Repo https://github.com/d909b/cxplain
Framework tf

Blockwisely Supervised Neural Architecture Search with Knowledge Distillation

Title Blockwisely Supervised Neural Architecture Search with Knowledge Distillation
Authors Changlin Li, Jiefeng Peng, Liuchun Yuan, Guangrun Wang, Xiaodan Liang, Liang Lin, Xiaojun Chang
Abstract Neural Architecture Search (NAS), aiming at automatically designing network architectures by machines, is hoped and expected to bring about a new revolution in machine learning. Despite these high expectations, the effectiveness and efficiency of existing NAS solutions are unclear, with some recent works going so far as to suggest that many existing NAS solutions are no better than random architecture selection. The inefficiency of NAS solutions may be attributed to inaccurate architecture evaluation. Specifically, to speed up NAS, recent works have proposed under-training different candidate architectures in a large search space concurrently by using shared network parameters; however, this has resulted in incorrect architecture ratings and furthered the ineffectiveness of NAS. In this work, we propose to modularize the large search space of NAS into blocks to ensure that the potential candidate architectures are fully trained; this reduces the representation shift caused by the shared parameters and leads to the correct rating of the candidates. Thanks to the block-wise search, we can also evaluate all of the candidate architectures within a block. Moreover, we find that the knowledge of a network model lies not only in the network parameters but also in the network architecture. Therefore, we propose to distill the neural architecture (DNA) knowledge from a teacher model as the supervision to guide our block-wise architecture search, which significantly improves the effectiveness of NAS. Remarkably, the capacity of our searched architecture has exceeded the teacher model, demonstrating the practicability and scalability of our method. Finally, our method achieves a state-of-the-art 78.4% top-1 accuracy on ImageNet in a mobile setting, which is about a 2.1% gain over EfficientNet-B0. All of our searched models along with the evaluation code are available online.
Tasks Neural Architecture Search
Published 2019-11-29
URL https://arxiv.org/abs/1911.13053v2
PDF https://arxiv.org/pdf/1911.13053v2.pdf
PWC https://paperswithcode.com/paper/blockwisely-supervised-neural-architecture
Repo https://github.com/jiefengpeng/DNA
Framework pytorch

Conditional Density Estimation with Neural Networks: Best Practices and Benchmarks

Title Conditional Density Estimation with Neural Networks: Best Practices and Benchmarks
Authors Jonas Rothfuss, Fabio Ferreira, Simon Walther, Maxim Ulrich
Abstract Given a set of empirical observations, conditional density estimation aims to capture the statistical relationship between a conditional variable $\mathbf{x}$ and a dependent variable $\mathbf{y}$ by modeling their conditional probability $p(\mathbf{y}|\mathbf{x})$. The paper develops best practices for conditional density estimation for finance applications with neural networks, grounded on mathematical insights and empirical evaluations. In particular, we introduce a noise regularization and data normalization scheme, alleviating problems with over-fitting, initialization and hyper-parameter sensitivity of such estimators. We compare our proposed methodology with popular semi- and non-parametric density estimators, underpin its effectiveness in various benchmarks on simulated and Euro Stoxx 50 data and show its superior performance. Our methodology makes it possible to obtain high-quality estimators for statistical expectations of higher moments, quantiles and non-linear return transformations, with very few assumptions about the return dynamic.
Tasks Density Estimation
Published 2019-03-03
URL http://arxiv.org/abs/1903.00954v2
PDF http://arxiv.org/pdf/1903.00954v2.pdf
PWC https://paperswithcode.com/paper/conditional-density-estimation-with-neural
Repo https://github.com/freelunchtheorem/Conditional_Density_Estimation
Framework tf
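
The noise regularization and data normalization scheme mentioned in the abstract can be sketched as two small preprocessing helpers. This is only an illustration under assumptions: the function names, the per-column standardization, and the use of independent Gaussian noise with fixed standard deviations are choices made for the example, not details taken from the paper.

```python
import numpy as np

def standardize(data):
    """Scale each column to zero mean and unit variance; return the statistics too."""
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    return (data - mean) / std, mean, std

def perturb_batch(x, y, std_x, std_y, rng):
    """Add fresh Gaussian noise to a training batch (assumed regularization step)."""
    noisy_x = x + rng.normal(0.0, std_x, size=x.shape)
    noisy_y = y + rng.normal(0.0, std_y, size=y.shape)
    return noisy_x, noisy_y
```

In a training loop, `perturb_batch` would be called on every mini-batch so the estimator never sees exactly the same points twice, while `standardize` is applied once up front and its statistics reused to map predictions back to the original scale.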

Using Clinical Notes with Time Series Data for ICU Management

Title Using Clinical Notes with Time Series Data for ICU Management
Authors Swaraj Khadanga, Karan Aggarwal, Shafiq Joty, Jaideep Srivastava
Abstract Monitoring patients in the ICU is a challenging and high-cost task. Hence, predicting the condition of patients during their ICU stay can help provide better acute care and plan the hospital’s resources. There has been continuous progress in machine learning research for ICU management, and most of this work has focused on using time series signals recorded by ICU instruments. In our work, we show that adding clinical notes as another modality improves the performance of the model for three benchmark tasks that play an important role in ICU management: in-hospital mortality prediction, modeling decompensation, and length of stay forecasting. While the time-series data is measured at regular intervals, doctor notes are charted at irregular times, making it challenging to model them together. We propose a method to model them jointly, achieving considerable improvement across benchmark tasks over a baseline time-series model. Our implementation can be found at https://github.com/kaggarwal/ClinicalNotesICU.
Tasks Mortality Prediction, Time Series
Published 2019-09-12
URL https://arxiv.org/abs/1909.09702v2
PDF https://arxiv.org/pdf/1909.09702v2.pdf
PWC https://paperswithcode.com/paper/using-clinical-notes-with-time-series-data
Repo https://github.com/kaggarwal/ClinicalNotesICU
Framework none

OpenLORIS-Object: A Robotic Vision Dataset and Benchmark for Lifelong Deep Learning

Title OpenLORIS-Object: A Robotic Vision Dataset and Benchmark for Lifelong Deep Learning
Authors Qi She, Fan Feng, Xinyue Hao, Qihan Yang, Chuanlin Lan, Vincenzo Lomonaco, Xuesong Shi, Zhengwei Wang, Yao Guo, Yimin Zhang, Fei Qiao, Rosa H. M. Chan
Abstract The recent breakthroughs in computer vision have benefited from the availability of large representative datasets (e.g. ImageNet and COCO) for training. Yet, robotic vision poses unique challenges for applying visual algorithms developed from these standard computer vision datasets due to their implicit assumption over non-varying distributions for a fixed set of tasks. Fully retraining models each time a new task becomes available is infeasible due to computational, storage and sometimes privacy issues, while naïve incremental strategies have been shown to suffer from catastrophic forgetting. It is crucial for the robots to operate continuously under open-set and detrimental conditions with adaptive visual perceptual systems, where lifelong learning is a fundamental capability. However, very few datasets and benchmarks are available to evaluate and compare emerging techniques. To fill this gap, we provide a new lifelong robotic vision dataset (“OpenLORIS-Object”) collected via RGB-D cameras. The dataset embeds the challenges faced by a robot in the real-life application and provides new benchmarks for validating lifelong object recognition algorithms. Moreover, we have provided a testbed of 9 state-of-the-art lifelong learning algorithms. Each of them involves 48 tasks with 4 evaluation metrics over the OpenLORIS-Object dataset. The results demonstrate that the object recognition task in the ever-changing difficulty environments is far from being solved and the bottlenecks are at the forward/backward transfer designs. Our dataset and benchmark are publicly available at https://lifelong-robotic-vision.github.io/dataset/object.
Tasks Object Recognition
Published 2019-11-15
URL https://arxiv.org/abs/1911.06487v2
PDF https://arxiv.org/pdf/1911.06487v2.pdf
PWC https://paperswithcode.com/paper/openloris-object-a-dataset-and-benchmark
Repo https://github.com/ffeng1996/OpenLORIS-Object-Code
Framework pytorch

Augmenting correlation structures in spatial data using deep generative models

Title Augmenting correlation structures in spatial data using deep generative models
Authors Konstantin Klemmer, Adriano Koshiyama, Sebastian Flennerhag
Abstract State-of-the-art deep learning methods have shown a remarkable capacity to model complex data domains, but struggle with geospatial data. In this paper, we introduce SpaceGAN, a novel generative model for geospatial domains that learns neighbourhood structures through spatial conditioning. We propose to enhance spatial representation beyond mere spatial coordinates, by conditioning each data point on feature vectors of its spatial neighbours, thus allowing for a more flexible representation of the spatial structure. To overcome issues of training convergence, we employ a metric capturing the loss in local spatial autocorrelation between real and generated data as the stopping criterion for SpaceGAN parametrization. This way, we ensure that the generator produces synthetic samples faithful to the spatial patterns observed in the input. SpaceGAN is successfully applied for data augmentation and outperforms other methods of synthetic spatial data generation. Finally, we propose an ensemble learning framework for the geospatial domain, taking augmented SpaceGAN samples as training data for a set of ensemble learners. We empirically show the superiority of this approach over conventional ensemble learning approaches and rival spatial data augmentation methods, using synthetic and real-world prediction tasks. Our findings suggest that SpaceGAN can be used as a tool for (1) artificially inflating sparse geospatial data and (2) improving generalization of geospatial models.
Tasks Data Augmentation
Published 2019-05-23
URL https://arxiv.org/abs/1905.09796v1
PDF https://arxiv.org/pdf/1905.09796v1.pdf
PWC https://paperswithcode.com/paper/augmenting-correlation-structures-in-spatial
Repo https://github.com/konstantinklemmer/spacegan
Framework pytorch

Conditioned-U-Net: Introducing a Control Mechanism in the U-Net for Multiple Source Separations

Title Conditioned-U-Net: Introducing a Control Mechanism in the U-Net for Multiple Source Separations
Authors Gabriel Meseguer-Brocal, Geoffroy Peeters
Abstract Data-driven models for audio source separation such as U-Net or Wave-U-Net are usually models dedicated to and specifically trained for a single task, e.g. a particular instrument isolation. Training them for various tasks at once commonly results in worse performances than training them for a single specialized task. In this work, we introduce the Conditioned-U-Net (C-U-Net) which adds a control mechanism to the standard U-Net. The control mechanism allows us to train a unique and generic U-Net to perform the separation of various instruments. The C-U-Net decides the instrument to isolate according to a one-hot-encoding input vector. The input vector is embedded to obtain the parameters that control Feature-wise Linear Modulation (FiLM) layers. FiLM layers modify the U-Net feature maps in order to separate the desired instrument via affine transformations. The C-U-Net performs different instrument separations, all with a single model achieving the same performances as the dedicated ones at a lower cost.
Tasks
Published 2019-07-02
URL https://arxiv.org/abs/1907.01277v3
PDF https://arxiv.org/pdf/1907.01277v3.pdf
PWC https://paperswithcode.com/paper/conditioned-u-net-introducing-a-control
Repo https://github.com/gabolsgabs/cunet
Framework tf
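
The FiLM mechanism described in the C-U-Net abstract is a per-channel affine transformation whose parameters depend on the conditioning vector. The sketch below is a minimal illustration: it uses a single linear map in place of the control network, and the function names and the weight matrices `w_gamma` and `w_beta` are hypothetical.

```python
import numpy as np

def condition_to_film_params(one_hot, w_gamma, w_beta):
    """Embed a one-hot instrument vector into per-channel scale and shift."""
    # With a one-hot input this simply selects one row of each matrix.
    return one_hot @ w_gamma, one_hot @ w_beta

def film(feature_maps, gamma, beta):
    """Apply gamma * x + beta per channel to (channels, height, width) features."""
    return gamma[:, None, None] * feature_maps + beta[:, None, None]
```

Because only `gamma` and `beta` change with the requested instrument, the same U-Net weights are shared across all separation tasks.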

Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search

Title Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search
Authors Xinyan Dai, Xiao Yan, Kelvin K. W. Ng, Jie Liu, James Cheng
Abstract Vector quantization (VQ) techniques are widely used in similarity search for data compression, fast metric computation, etc. Originally designed for Euclidean distance, existing VQ techniques (e.g., PQ, AQ) explicitly or implicitly minimize the quantization error. In this paper, we present a new angle to analyze the quantization error, which decomposes the quantization error into norm error and direction error. We show that quantization errors in norm have much higher influence on inner products than quantization errors in direction, and small quantization error does not necessarily lead to good performance in maximum inner product search (MIPS). Based on this observation, we propose norm-explicit quantization (NEQ), a general paradigm that improves existing VQ techniques for MIPS. NEQ quantizes the norms of items in a dataset explicitly to reduce errors in norm, which is crucial for MIPS. For the direction vectors, NEQ can simply reuse an existing VQ technique to quantize them without modification. We conducted extensive experiments on a variety of datasets and parameter configurations. The experimental results show that NEQ improves the performance of various VQ techniques for MIPS, including PQ, OPQ, RQ and AQ.
Tasks Quantization
Published 2019-11-12
URL https://arxiv.org/abs/1911.04654v2
PDF https://arxiv.org/pdf/1911.04654v2.pdf
PWC https://paperswithcode.com/paper/norm-explicit-quantization-improving-vector
Repo https://github.com/xinyandai/product-quantization
Framework none
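
The norm/direction split behind NEQ can be illustrated with a toy encoder that quantizes the two parts separately. This is a sketch under assumptions, not the authors' method: a nearest-codeword lookup over unit vectors stands in for the reused VQ technique, the norm is quantized against a small scalar codebook, and all names are hypothetical. Inputs are assumed to be non-zero vectors.

```python
import numpy as np

def neq_encode(x, direction_codebook, norm_codebook):
    """Return (direction index, norm index) for a non-zero vector x."""
    norm = np.linalg.norm(x)
    direction = x / norm
    # Codewords are unit vectors, so the largest dot product is the closest direction.
    d_idx = int(np.argmax(direction_codebook @ direction))
    n_idx = int(np.argmin(np.abs(norm_codebook - norm)))
    return d_idx, n_idx

def neq_decode(d_idx, n_idx, direction_codebook, norm_codebook):
    """Reconstruct the vector as quantized norm times quantized direction."""
    return norm_codebook[n_idx] * direction_codebook[d_idx]
```

Since the inner product of a query with the reconstruction scales linearly with the stored norm, an error in the norm codeword shifts every inner product for that item proportionally, which is the sensitivity the abstract highlights.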

Function Preserving Projection for Scalable Exploration of High-Dimensional Data

Title Function Preserving Projection for Scalable Exploration of High-Dimensional Data
Authors Shusen Liu, Rushil Anirudh, Jayaraman J. Thiagarajan, Peer-Timo Bremer
Abstract We present function preserving projections (FPP), a scalable linear projection technique for discovering interpretable relationships in high-dimensional data. Conventional dimension reduction methods aim to maximally preserve the global and/or local geometric structure of a dataset. However, in practice one is often more interested in determining how one or multiple user-selected response function(s) can be explained by the data. To intuitively connect the responses to the data, FPP constructs 2D linear embeddings optimized to reveal interpretable yet potentially non-linear patterns of the response functions. More specifically, FPP is designed to (i) produce human-interpretable embeddings; (ii) capture non-linear relationships; (iii) allow the simultaneous use of multiple response functions; and (iv) scale to millions of samples. Using FPP on real-world datasets, one can obtain fundamentally new insights about high-dimensional relationships in large-scale data that could not be achieved using existing dimension reduction methods.
Tasks Dimensionality Reduction
Published 2019-09-25
URL https://arxiv.org/abs/1909.11804v1
PDF https://arxiv.org/pdf/1909.11804v1.pdf
PWC https://paperswithcode.com/paper/function-preserving-projection-for-scalable
Repo https://github.com/LLNL/fpp
Framework tf