October 18, 2019

3407 words 16 mins read

Paper Group ANR 678

A theory of sequence indexing and working memory in recurrent neural networks. Parametrized Accelerated Methods Free of Condition Number. Generalized Multichannel Variational Autoencoder for Underdetermined Source Separation. TESSERACT: Eliminating Experimental Bias in Malware Classification across Space and Time. Learning Socially Appropriate Robo …

A theory of sequence indexing and working memory in recurrent neural networks

Title A theory of sequence indexing and working memory in recurrent neural networks
Authors E. Paxon Frady, Denis Kleyko, Friedrich T. Sommer
Abstract To accommodate structured approaches of neural computation, we propose a class of recurrent neural networks for indexing and storing sequences of symbols or analog data vectors. These networks with randomized input weights and orthogonal recurrent weights implement coding principles previously described in vector symbolic architectures (VSA), and leverage properties of reservoir computing. In general, the storage in reservoir computing is lossy and crosstalk noise limits the retrieval accuracy and information capacity. A novel theory to optimize memory performance in such networks is presented and compared with simulation experiments. The theory describes linear readout of analog data, and readout with winner-take-all error correction of symbolic data as proposed in VSA models. We find that diverse VSA models from the literature have universal performance properties, which are superior to what previous analyses predicted. Further, we propose novel VSA models with the statistically optimal Wiener filter in the readout that exhibit much higher information capacity, in particular for storing analog data. The presented theory also applies to memory buffers, networks with gradual forgetting, which can operate on infinite data streams without memory overflow. Interestingly, we find that different forgetting mechanisms, such as attenuating recurrent weights or neural nonlinearities, produce very similar behavior if the forgetting time constants are aligned. Such models exhibit extensive capacity when their forgetting time constant is optimized for given noise conditions and network size. These results enable the design of new types of VSA models for the online processing of data streams.
Tasks
Published 2018-02-28
URL http://arxiv.org/abs/1803.00412v1
PDF http://arxiv.org/pdf/1803.00412v1.pdf
PWC https://paperswithcode.com/paper/a-theory-of-sequence-indexing-and-working
Repo
Framework
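As a rough illustration of the coding principle described in the abstract, the sketch below (not the authors' exact model) stores a symbol sequence in a single vector by superposition, using a fixed random permutation in place of the orthogonal recurrent weights and a winner-take-all readout against a random bipolar codebook. All sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, L = 1024, 26, 10                       # network size, alphabet size, sequence length
codebook = rng.choice([-1, 1], size=(M, N))  # one random bipolar code vector per symbol
perm = rng.permutation(N)                    # fixed random permutation (an orthogonal recurrent map)
inv = np.argsort(perm)                       # its inverse

def encode(seq):
    # Each step "ages" the existing trace by one permutation, then superposes the new symbol.
    x = np.zeros(N)
    for s in seq:
        x = x[perm] + codebook[s]
    return x

def decode(x, length):
    # Winner-take-all readout: undo the permutations accumulated since each write,
    # then pick the codebook vector with the largest correlation (crosstalk permitting).
    recovered = []
    x_read = x.copy()
    for _ in range(length):
        recovered.append(int(np.argmax(codebook @ x_read)))  # most recent symbol first
        x_read = x_read[inv]
    return recovered[::-1]                   # oldest symbol first

seq = list(rng.integers(0, M, size=L))
print(seq, decode(encode(seq), L))           # identical with high probability when N >> M * L
```

With N much larger than the total number of stored symbols, the crosstalk noise discussed in the abstract stays small and readout is exact with high probability; shrinking N shows the capacity limit.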

Parametrized Accelerated Methods Free of Condition Number

Title Parametrized Accelerated Methods Free of Condition Number
Authors Chaoyue Liu, Mikhail Belkin
Abstract Analyses of accelerated (momentum-based) gradient descent usually assume a bounded condition number to obtain exponential convergence rates. However, in many real problems, e.g., kernel methods or deep neural networks, the condition number, even locally, can be unbounded, unknown or mis-estimated. This poses problems in both implementing and analyzing accelerated algorithms. In this paper, we address this issue by proposing parametrized accelerated methods that treat the condition number as a free parameter. We provide spectral-level analysis for several important accelerated algorithms, obtain explicit expressions, and improve worst-case convergence rates. Moreover, we show that these algorithms converge exponentially even when the condition number is unknown or mis-estimated.
Tasks
Published 2018-02-28
URL http://arxiv.org/abs/1802.10235v1
PDF http://arxiv.org/pdf/1802.10235v1.pdf
PWC https://paperswithcode.com/paper/parametrized-accelerated-methods-free-of
Repo
Framework
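A minimal sketch of the idea of treating the condition number as a free parameter, using Polyak's heavy-ball method on a toy quadratic. The step-size and momentum parametrization shown is the textbook one and the mis-estimation scenario is an illustrative assumption, not necessarily the exact scheme analyzed in the paper.

```python
import numpy as np

def heavy_ball(grad, x0, lip, kappa_guess, n_iter=2000):
    # Step size and momentum are derived from a *guessed* condition number,
    # i.e. kappa is treated as a free parameter rather than a known quantity.
    mu = lip / kappa_guess
    alpha = 4.0 / (np.sqrt(lip) + np.sqrt(mu)) ** 2
    beta = ((np.sqrt(lip) - np.sqrt(mu)) / (np.sqrt(lip) + np.sqrt(mu))) ** 2
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(n_iter):
        x, x_prev = x - alpha * grad(x) + beta * (x - x_prev), x
    return x

# Ill-conditioned quadratic f(x) = 0.5 x^T A x; its true condition number (1e4)
# is deliberately mis-estimated by an order of magnitude.
A = np.diag(np.logspace(0, 4, 50))
x = heavy_ball(lambda v: A @ v, np.ones(50), lip=1e4, kappa_guess=1e3)
print(np.linalg.norm(x))   # distance to the minimizer at the origin keeps shrinking with n_iter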

Generalized Multichannel Variational Autoencoder for Underdetermined Source Separation

Title Generalized Multichannel Variational Autoencoder for Underdetermined Source Separation
Authors Shogo Seki, Hirokazu Kameoka, Li Li, Tomoki Toda, Kazuya Takeda
Abstract This paper deals with a multichannel audio source separation problem under underdetermined conditions. Multichannel Non-negative Matrix Factorization (MNMF) is one of the most powerful approaches, adopting the NMF concept for source power spectrogram modeling. This concept is also employed in Independent Low-Rank Matrix Analysis (ILRMA), a special class of the MNMF framework formulated under determined conditions. While these methods work reasonably well for particular types of sound sources, one limitation is that they can fail for sources whose spectrograms do not comply with the NMF model. To address this limitation, an extension of ILRMA called the Multichannel Variational Autoencoder (MVAE) method was recently proposed, where a Conditional VAE (CVAE) is used instead of the NMF model for source power spectrogram modeling. This approach has been shown to perform impressively in determined source separation tasks thanks to the representation power of DNNs. While the original MVAE method was formulated under determined mixing conditions, this paper generalizes it so that it can also deal with underdetermined cases. We call the proposed framework the Generalized MVAE (GMVAE). The proposed method was evaluated on an underdetermined source separation task of separating out three sources from two microphone inputs. Experimental results revealed that the GMVAE method achieved better performance than the MNMF method.
Tasks
Published 2018-09-29
URL http://arxiv.org/abs/1810.00223v1
PDF http://arxiv.org/pdf/1810.00223v1.pdf
PWC https://paperswithcode.com/paper/generalized-multichannel-variational
Repo
Framework
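To make the "CVAE instead of NMF" idea concrete, here is a toy conditional decoder that maps a latent code and a source label to a non-negative power spectrogram. The layer sizes, activations, and class conditioning are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CondSpectrogramDecoder(nn.Module):
    """Toy conditional decoder: latent code + one-hot source label -> non-negative
    power spectrogram (per-frame variances), standing in for the NMF source model."""
    def __init__(self, n_freq=513, n_latent=16, n_classes=4, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_latent + n_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, n_freq), nn.Softplus(),   # power spectra must be non-negative
        )

    def forward(self, z, y_onehot):
        return self.net(torch.cat([z, y_onehot], dim=-1))

dec = CondSpectrogramDecoder()
z = torch.randn(8, 16)                                          # one latent vector per time frame
y = nn.functional.one_hot(torch.zeros(8, dtype=torch.long), 4).float()
var = dec(z, y)                                                  # (8, 513) modeled power spectrogram
```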

TESSERACT: Eliminating Experimental Bias in Malware Classification across Space and Time

Title TESSERACT: Eliminating Experimental Bias in Malware Classification across Space and Time
Authors Feargus Pendlebury, Fabio Pierazzi, Roberto Jordaney, Johannes Kinder, Lorenzo Cavallaro
Abstract Is Android malware classification a solved problem? Published F1 scores of up to 0.99 appear to leave very little room for improvement. In this paper, we argue that results are commonly inflated due to two pervasive sources of experimental bias: “spatial bias” caused by distributions of training and testing data that are not representative of a real-world deployment; and “temporal bias” caused by incorrect time splits of training and testing sets, leading to impossible configurations. We propose a set of space and time constraints for experiment design that eliminates both sources of bias. We introduce a new metric that summarizes the expected robustness of a classifier in a real-world setting, and we present an algorithm to tune its performance. Finally, we demonstrate how this allows us to evaluate mitigation strategies for time decay such as active learning. We have implemented our solutions in TESSERACT, an open source evaluation framework for comparing malware classifiers in a realistic setting. We used TESSERACT to evaluate three Android malware classifiers from the literature on a dataset of 129K applications spanning over three years. Our evaluation confirms that earlier published results are biased, while also revealing counter-intuitive performance and showing that appropriate tuning can lead to significant improvements.
Tasks Active Learning, Malware Classification
Published 2018-07-20
URL https://arxiv.org/abs/1807.07838v4
PDF https://arxiv.org/pdf/1807.07838v4.pdf
PWC https://paperswithcode.com/paper/tesseract-eliminating-experimental-bias-in
Repo
Framework
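A minimal sketch of the temporal constraint the paper advocates: every training sample must strictly predate every test sample, so the classifier never trains on malware "from the future". The tuple layout and field names are assumptions for illustration.

```python
from datetime import datetime

def temporal_split(samples, split_date):
    """samples: list of (features, label, timestamp) tuples (layout is illustrative).
    Enforces the temporal constraint: training data predates all test data."""
    train = [s for s in samples if s[2] < split_date]
    test = [s for s in samples if s[2] >= split_date]
    return train, test

# A plain random shuffle would mix, say, 2016 apps into training while testing on 2014 apps,
# which is exactly the kind of temporal bias the paper identifies.
train, test = temporal_split(samples=[], split_date=datetime(2016, 1, 1))
```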

Learning Socially Appropriate Robot Approaching Behavior Toward Groups using Deep Reinforcement Learning

Title Learning Socially Appropriate Robot Approaching Behavior Toward Groups using Deep Reinforcement Learning
Authors Yuan Gao, Fangkai Yang, Martin Frisk, Daniel Hernandez, Christopher Peters, Ginevra Castellano
Abstract Deep reinforcement learning has recently been widely applied in robotics to study tasks such as locomotion and grasping, but its application to social human-robot interaction (HRI) remains a challenge. In this paper, we present a deep learning scheme that acquires a prior model of robot approaching behavior in simulation and applies it to real-world interaction with a physical robot approaching groups of humans. The scheme, which we refer to as Staged Social Behavior Learning (SSBL), considers different stages of learning in social scenarios. We learn robot approaching behaviors towards small groups in simulation and evaluate the performance of the model using objective and subjective measures in a perceptual study and an HRI user study with human participants. Results show that our model generates more socially appropriate behavior compared to a state-of-the-art model.
Tasks
Published 2018-10-16
URL https://arxiv.org/abs/1810.06979v3
PDF https://arxiv.org/pdf/1810.06979v3.pdf
PWC https://paperswithcode.com/paper/social-behavior-learning-with-realistic
Repo
Framework

Cost-Effective Training of Deep CNNs with Active Model Adaptation

Title Cost-Effective Training of Deep CNNs with Active Model Adaptation
Authors Sheng-Jun Huang, Jia-Wei Zhao, Zhao-Yang Liu
Abstract Deep convolutional neural networks have achieved great success in various applications. However, training an effective DNN model for a specific task is rather challenging because it requires prior knowledge or experience to design the network architecture, a repeated trial-and-error process to tune the parameters, and a large set of labeled data to train the model. In this paper, we propose to overcome these challenges by actively adapting a pre-trained model to a new task with fewer labeled examples. Specifically, the pre-trained model is iteratively fine-tuned based on the most useful examples. The examples are actively selected based on a novel criterion, which jointly estimates the potential contribution of an instance to optimizing the feature representation as well as improving the classification model for the target task. On the one hand, the pre-trained model brings plentiful information from its original task, avoiding redesign of the network architecture or training from scratch; on the other hand, the labeling cost can be significantly reduced by active label querying. Experiments on multiple datasets and different pre-trained models demonstrate that the proposed approach can achieve cost-effective training of DNNs.
Tasks
Published 2018-02-15
URL http://arxiv.org/abs/1802.05394v2
PDF http://arxiv.org/pdf/1802.05394v2.pdf
PWC https://paperswithcode.com/paper/cost-effective-training-of-deep-cnns-with
Repo
Framework
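The selection step can be pictured with a generic uncertainty-based criterion. The snippet below ranks unlabeled examples by the predictive entropy of the pre-trained model and queries the top of the ranking; this is a common stand-in for, not the same as, the paper's joint selection criterion.

```python
import numpy as np

def select_for_labeling(probs, budget):
    """Rank unlabeled examples by predictive entropy and return the `budget`
    most uncertain ones for annotation and subsequent fine-tuning."""
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    return np.argsort(-entropy)[:budget]

probs = np.random.dirichlet(np.ones(10), size=1000)   # softmax outputs on the unlabeled pool
query_idx = select_for_labeling(probs, budget=50)     # send these to annotators, then fine-tune
```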

A Primal-dual Learning Algorithm for Personalized Dynamic Pricing with an Inventory Constraint

Title A Primal-dual Learning Algorithm for Personalized Dynamic Pricing with an Inventory Constraint
Authors Ningyuan Chen, Guillermo Gallego
Abstract A firm is selling a product to different types (based on the features such as education backgrounds, ages, etc.) of customers over a finite season with non-replenishable initial inventory. The type label of an arriving customer can be observed but the demand function associated with each type is initially unknown. The firm sets personalized prices dynamically for each type and attempts to maximize the revenue over the season. We provide a learning algorithm that is near-optimal when the demand and capacity scale in proportion. The algorithm utilizes the primal-dual formulation of the problem and learns the dual optimal solution explicitly. It allows the algorithm to overcome the curse of dimensionality (the rate of regret is independent of the number of types) and sheds light on novel algorithmic designs for learning problems with resource constraints.
Tasks
Published 2018-12-20
URL http://arxiv.org/abs/1812.09234v1
PDF http://arxiv.org/pdf/1812.09234v1.pdf
PWC https://paperswithcode.com/paper/a-primal-dual-learning-algorithm-for
Repo
Framework
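An illustrative, heavily simplified primal-dual loop under assumed, already-estimated demand curves: a dual variable prices the inventory constraint, each arrival is offered the dual-adjusted revenue-maximizing price for its type, and the dual variable is updated by a subgradient step. This is a sketch of the general mechanism, not the paper's algorithm or its learning component.

```python
import numpy as np

rng = np.random.default_rng(1)

def run_pricing(T, capacity, prices, demand_est, lr=0.01):
    """Toy primal-dual pricing loop with a single inventory constraint."""
    lam, stock, revenue = 0.0, capacity, 0.0
    rho = capacity / T                                   # per-period inventory budget
    for _ in range(T):
        ctype = rng.integers(len(demand_est))            # observed customer type
        p = max(prices, key=lambda q: (q - lam) * demand_est[ctype](q))
        sold = (rng.random() < demand_est[ctype](p)) and stock > 0
        if sold:
            stock -= 1
            revenue += p
        lam = max(0.0, lam + lr * (float(sold) - rho))   # dual (sub)gradient step
    return revenue

demand_est = [lambda p: np.exp(-0.5 * p), lambda p: np.exp(-1.0 * p)]   # toy demand curves per type
print(run_pricing(T=5000, capacity=1000, prices=np.linspace(0.5, 5, 10), demand_est=demand_est))
```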

Face Recognition in Low Quality Images: A Survey

Title Face Recognition in Low Quality Images: A Survey
Authors Pei Li, Loreto Prieto, Domingo Mery, Patrick Flynn
Abstract Low-resolution face recognition (LRFR) has received increasing attention over the past few years. Its applications lie widely in real-world environments where high-resolution or high-quality images are hard to capture. One of the biggest demands for LRFR technologies is video surveillance. As the number of surveillance cameras in cities increases, the captured videos will need to be processed automatically. However, those videos or images are usually captured with large standoffs, arbitrary illumination conditions, and diverse angles of view. Faces in these images are generally small in size. Several studies have addressed this problem by employing techniques such as super-resolution, deblurring, or learning a relationship between different resolution domains. In this paper, we provide a comprehensive review of approaches to low-resolution face recognition over the past five years. First, a general problem definition is given. Then, a systematic analysis of the works on this topic is presented by category. In addition to describing the methods, we also focus on datasets and experimental settings. We further address related work on unconstrained low-resolution face recognition and compare it with results that use synthetic low-resolution data. Finally, we summarize the general limitations and suggest priorities for future effort.
Tasks Deblurring, Face Recognition, Super-Resolution
Published 2018-05-29
URL http://arxiv.org/abs/1805.11519v3
PDF http://arxiv.org/pdf/1805.11519v3.pdf
PWC https://paperswithcode.com/paper/face-recognition-in-low-quality-images-a
Repo
Framework

Layer-Parallel Training of Deep Residual Neural Networks

Title Layer-Parallel Training of Deep Residual Neural Networks
Authors S. Günther, L. Ruthotto, J. B. Schroder, E. C. Cyr, N. R. Gauger
Abstract Residual neural networks (ResNets) are a promising class of deep neural networks that have shown excellent performance for a number of learning tasks, e.g., image classification and recognition. Mathematically, ResNet architectures can be interpreted as forward Euler discretizations of a nonlinear initial value problem whose time-dependent control variables represent the weights of the neural network. Hence, training a ResNet can be cast as an optimal control problem of the associated dynamical system. For similar time-dependent optimal control problems arising in engineering applications, parallel-in-time methods have shown notable improvements in scalability. This paper demonstrates the use of those techniques for efficient and effective training of ResNets. The proposed algorithms replace the classical (sequential) forward and backward propagation through the network layers by a parallel nonlinear multigrid iteration applied to the layer domain. This adds a new dimension of parallelism across layers that is attractive when training very deep networks. From this basic idea, we derive multiple layer-parallel methods. The most efficient version employs a simultaneous optimization approach where updates to the network parameters are based on inexact gradient information in order to speed up the training process. Using numerical examples from supervised classification, we demonstrate that the new approach achieves similar training performance to traditional methods, but enables layer-parallelism and thus provides speedup over layer-serial methods through greater concurrency.
Tasks Image Classification
Published 2018-12-11
URL https://arxiv.org/abs/1812.04352v3
PDF https://arxiv.org/pdf/1812.04352v3.pdf
PWC https://paperswithcode.com/paper/layer-parallel-training-of-deep-residual
Repo
Framework
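The ResNet-as-ODE view mentioned in the abstract can be written in a few lines: the forward pass is forward-Euler integration of x'(t) = f(x, theta(t)). The sketch below is the sequential baseline that the layer-parallel multigrid method replaces; the sizes and the tanh residual block are illustrative assumptions.

```python
import numpy as np

def resnet_forward_euler(x, weights, h=0.1):
    """Forward pass as forward-Euler integration: each layer applies x <- x + h * tanh(W x).
    The layer-parallel method replaces this sequential loop over layers/time steps
    with a parallel nonlinear multigrid iteration."""
    for W in weights:                 # one weight matrix per layer == per time step
        x = x + h * np.tanh(W @ x)
    return x

layers = [np.random.randn(64, 64) / 8 for _ in range(32)]   # 32 "time steps"
out = resnet_forward_euler(np.random.randn(64), layers)
```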

An Optimal Policy for Dynamic Assortment Planning Under Uncapacitated Multinomial Logit Models

Title An Optimal Policy for Dynamic Assortment Planning Under Uncapacitated Multinomial Logit Models
Authors Xi Chen, Yining Wang, Yuan Zhou
Abstract We study the dynamic assortment planning problem, where for each arriving customer, the seller offers an assortment of substitutable products and the customer makes a purchase among the offered products according to an uncapacitated multinomial logit (MNL) model. Since all the utility parameters of the MNL model are unknown, the seller needs to simultaneously learn customers’ choice behavior and make dynamic decisions on assortments based on the current knowledge. The goal of the seller is to maximize the expected revenue, or equivalently, to minimize the expected regret. Although the dynamic assortment planning problem has received increasing attention in revenue management, most existing policies require the estimation of the mean utility for each product, and the final regret usually involves the number of products $N$. The optimal regret of the dynamic assortment planning problem under the most basic and popular choice model, the MNL model, is still open. By carefully analyzing a revenue potential function, we develop a trisection-based policy combined with adaptive confidence bound construction, which achieves an item-independent regret bound of $O(\sqrt{T})$, where $T$ is the length of the selling horizon. We further establish a matching lower bound to show the optimality of our policy. There are two major advantages of the proposed policy. First, the regret of all our policies has no dependence on $N$. Second, our policies are almost assumption-free: there is no assumption on the mean utilities nor any “separability” condition on the expected revenues for different assortments. Our result also extends the unimodal bandit literature.
Tasks
Published 2018-05-12
URL http://arxiv.org/abs/1805.04785v2
PDF http://arxiv.org/pdf/1805.04785v2.pdf
PWC https://paperswithcode.com/paper/an-optimal-policy-for-dynamic-assortment
Repo
Framework
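For reference, a minimal sketch of the uncapacitated MNL quantities the policy works with: purchase probabilities and the expected revenue of an assortment. The utilities here are plugged-in numbers for illustration; in the paper's setting they are unknown and must be learned.

```python
import numpy as np

def mnl_expected_revenue(assortment, utilities, revenues):
    """Uncapacitated MNL: the purchase probability of product i in assortment S is
    exp(u_i) / (1 + sum_{j in S} exp(u_j)), with the '1' being the no-purchase option."""
    w = np.exp(utilities[assortment])
    probs = w / (1.0 + w.sum())
    return float(probs @ revenues[assortment])

utilities = np.array([0.8, 0.1, -0.3, 0.5])   # unknown to the seller in the paper's setting
revenues = np.array([4.0, 6.0, 5.0, 3.0])
print(mnl_expected_revenue(np.array([0, 1, 3]), utilities, revenues))
```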

Online Multi-Object Tracking with Historical Appearance Matching and Scene Adaptive Detection Filtering

Title Online Multi-Object Tracking with Historical Appearance Matching and Scene Adaptive Detection Filtering
Authors Young-chul Yoon, Abhijeet Boragule, Young-min Song, Kwangjin Yoon, Moongu Jeon
Abstract In this paper, we propose methods to handle temporal errors during multi-object tracking. Temporal errors occur when objects are occluded or when noisy detections appear near an object. In those situations, tracking may fail and various errors such as drift or ID switching occur. It is hard to overcome temporal errors using only motion and shape information, so we propose a historical appearance matching method and a joint-input siamese network trained by a two-step process. These can prevent tracking failures even when objects are temporarily occluded or the last matching information is unreliable. We also provide a useful technique to remove noisy detections effectively according to the scene condition. Tracking performance, especially identity consistency, is greatly improved by applying our methods.
Tasks Multi-Object Tracking, Object Tracking, Online Multi-Object Tracking
Published 2018-05-28
URL http://arxiv.org/abs/1805.10916v4
PDF http://arxiv.org/pdf/1805.10916v4.pdf
PWC https://paperswithcode.com/paper/online-multi-object-tracking-with-historical
Repo
Framework
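A small sketch of the intuition behind historical appearance matching: score a new detection against several recent appearance embeddings of a track rather than only the last one. The plain cosine-similarity averaging here is an assumption for illustration, not the paper's learned joint-input siamese matching.

```python
import numpy as np

def historical_affinity(track_history, detection_feat, weights=None):
    """Affinity between a track and a detection, averaged over recent embeddings."""
    feats = np.stack(track_history)                                   # (k, d) recent embeddings
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    det = detection_feat / np.linalg.norm(detection_feat)
    sims = feats @ det                                                # cosine similarity per stored frame
    w = np.ones(len(sims)) / len(sims) if weights is None else weights
    return float(w @ sims)

history = [np.random.randn(128) for _ in range(5)]
print(historical_affinity(history, np.random.randn(128)))
```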

Detection of Premature Ventricular Contractions Using Densely Connected Deep Convolutional Neural Network with Spatial Pyramid Pooling Layer

Title Detection of Premature Ventricular Contractions Using Densely Connected Deep Convolutional Neural Network with Spatial Pyramid Pooling Layer
Authors Jianning Li
Abstract Premature ventricular contraction (PVC) is a type of premature ectopic beat originating from the ventricles. An automatic method for accurate and robust detection of PVC is highly clinically desired. Currently, most such methods are developed and tested using the same database divided into training and testing sets, and their generalization performance across databases has not been fully validated. In this paper, a method based on a densely connected convolutional neural network and spatial pyramid pooling is proposed for PVC detection that can take arbitrarily-sized QRS complexes as input in both training and testing. With a much less complicated and more straightforward architecture, the proposed network achieves results comparable to the current state-of-the-art deep-learning-based method with regard to accuracy, sensitivity, and specificity when trained and tested on the MIT-BIH arrhythmia database as a benchmark. Besides the benchmark database, QRS complexes are extracted from four more open databases, namely the St. Petersburg Institute of Cardiological Technics 12-lead Arrhythmia Database, the MIT-BIH Normal Sinus Rhythm Database, the MIT-BIH Long-Term Database, and the European ST-T Database. The extracted QRS complexes differ in length and sampling rate among the five databases. Cross-database training and testing are also examined. The performance of the network improves on the benchmark database, demonstrating the advantage of training on multiple databases over a single database. The network also achieves satisfactory scores on the other four databases, showing good generalization capability.
Tasks
Published 2018-06-12
URL https://arxiv.org/abs/1806.04564v7
PDF https://arxiv.org/pdf/1806.04564v7.pdf
PWC https://paperswithcode.com/paper/detection-of-premature-ventricular
Repo
Framework
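The mechanism that lets the network accept arbitrarily-sized QRS complexes is spatial pyramid pooling. A minimal 1-D NumPy version is sketched below (the bin counts are illustrative) to show how a fixed-length feature vector is produced regardless of input length.

```python
import numpy as np

def spatial_pyramid_pool_1d(feature_map, levels=(1, 2, 4)):
    """1-D spatial pyramid pooling: max-pool the (channels, length) feature map into
    1, 2 and 4 bins and concatenate, yielding a fixed-length vector for any length."""
    C, L = feature_map.shape
    pooled = []
    for n_bins in levels:
        edges = np.linspace(0, L, n_bins + 1).astype(int)
        for b in range(n_bins):
            stop = max(edges[b + 1], edges[b] + 1)       # guard against empty bins
            pooled.append(feature_map[:, edges[b]:stop].max(axis=1))
    return np.concatenate(pooled)                        # length C * (1 + 2 + 4), independent of L

print(spatial_pyramid_pool_1d(np.random.randn(8, 37)).shape)    # (56,)
print(spatial_pyramid_pool_1d(np.random.randn(8, 120)).shape)   # (56,) — same size, longer input
```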

An Unsupervised Approach to Solving Inverse Problems using Generative Adversarial Networks

Title An Unsupervised Approach to Solving Inverse Problems using Generative Adversarial Networks
Authors Rushil Anirudh, Jayaraman J. Thiagarajan, Bhavya Kailkhura, Timo Bremer
Abstract Solving inverse problems continues to be a challenge in a wide array of applications ranging from deblurring and image inpainting to source separation. Most existing techniques solve such inverse problems by either explicitly or implicitly finding the inverse of the model. The former class of techniques requires explicit knowledge of the measurement process, which can be unrealistic, and relies on strong analytical regularizers to constrain the solution space, which often do not generalize well. The latter approaches have had remarkable success in part due to deep learning, but require a large collection of source-observation pairs, which can be prohibitively expensive. In this paper, we propose an unsupervised technique to solve inverse problems with generative adversarial networks (GANs). Using a pre-trained GAN in the space of source signals, we show that one can reliably recover solutions to underdetermined problems in a ‘blind’ fashion, i.e., without knowledge of the measurement process. We solve this by making successive estimates of the model and the solution in an iterative fashion. We show promising results in three challenging applications – blind source separation, image deblurring, and recovering an image from its edge map – and perform better than several baselines.
Tasks Deblurring, Image Inpainting
Published 2018-05-18
URL http://arxiv.org/abs/1805.07281v2
PDF http://arxiv.org/pdf/1805.07281v2.pdf
PWC https://paperswithcode.com/paper/an-unsupervised-approach-to-solving-inverse
Repo
Framework
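A minimal PyTorch sketch of the core recovery step: search the latent space of a pre-trained generator so that the simulated observation matches the measurement. The paper's blind setting additionally alternates estimates of the unknown measurement operator; here `forward_op` is assumed known and differentiable, and `generator` is any pre-trained GAN generator.

```python
import torch

def recover(generator, forward_op, y, n_latent=100, steps=500, lr=0.05):
    """Recover a signal from observation y by minimizing || forward_op(G(z)) - y ||^2 over z."""
    z = torch.randn(1, n_latent, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.sum((forward_op(generator(z)) - y) ** 2)
        loss.backward()
        opt.step()
    return generator(z).detach()
```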

Finding Syntax in Human Encephalography with Beam Search

Title Finding Syntax in Human Encephalography with Beam Search
Authors John Hale, Chris Dyer, Adhiguna Kuncoro, Jonathan R. Brennan
Abstract Recurrent neural network grammars (RNNGs) are generative models of (tree,string) pairs that rely on neural networks to evaluate derivational choices. Parsing with them using beam search yields a variety of incremental complexity metrics such as word surprisal and parser action count. When used as regressors against human electrophysiological responses to naturalistic text, they derive two amplitude effects: an early peak and a P600-like later peak. By contrast, a non-syntactic neural language model yields no reliable effects. Model comparisons attribute the early peak to syntactic composition within the RNNG. This pattern of results recommends the RNNG+beam search combination as a mechanistic model of the syntactic processing that occurs during normal human language comprehension.
Tasks Language Modelling
Published 2018-06-11
URL http://arxiv.org/abs/1806.04127v1
PDF http://arxiv.org/pdf/1806.04127v1.pdf
PWC https://paperswithcode.com/paper/finding-syntax-in-human-encephalography-with
Repo
Framework
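A generic beam-search skeleton of the kind used to parse incrementally with RNNGs and to read off complexity metrics such as surprisal; the `step_fn` interface and scoring are simplifying assumptions, not the word-synchronous beam search used in the paper.

```python
import heapq

def beam_search(step_fn, init_state, beam_size=8, max_len=50):
    """Keep the beam_size highest-scoring partial derivations and expand them.
    step_fn(state) returns a list of (log_prob, action, next_state, is_final) tuples."""
    beam = [(0.0, 0, init_state, [])]        # (cumulative log-prob, tiebreak id, state, actions)
    finished, counter = [], 1
    for _ in range(max_len):
        candidates = []
        for score, _, state, actions in beam:
            for lp, action, nxt, done in step_fn(state):
                item = (score + lp, counter, nxt, actions + [action])
                counter += 1
                (finished if done else candidates).append(item)
        if not candidates:
            break
        beam = heapq.nlargest(beam_size, candidates)     # prune to the best partial derivations
    return max(finished + beam)[3]                       # best action sequence found
```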

Vision Meets Drones: A Challenge

Title Vision Meets Drones: A Challenge
Authors Pengfei Zhu, Longyin Wen, Xiao Bian, Haibin Ling, Qinghua Hu
Abstract In this paper we present a large-scale visual object detection and tracking benchmark, named VisDrone2018, aiming at advancing visual understanding tasks on the drone platform. The images and video sequences in the benchmark were captured over various urban/suburban areas of 14 different cities across China from north to south. Specifically, VisDrone2018 consists of 263 video clips and 10,209 images (with no overlap with the video clips) with rich annotations, including object bounding boxes, object categories, occlusion, truncation ratios, etc. With an intensive amount of effort, our benchmark has more than 2.5 million annotated instances in 179,264 images/video frames. Being the largest such dataset ever published, the benchmark enables extensive evaluation and investigation of visual analysis algorithms on the drone platform. In particular, we design four popular tasks with the benchmark, including object detection in images, object detection in videos, single-object tracking, and multi-object tracking. All these tasks are extremely challenging in the proposed dataset due to factors such as occlusion, large scale and pose variation, and fast motion. We hope the benchmark will largely boost research and development in visual analysis on drone platforms.
Tasks Multi-Object Tracking, Object Detection, Object Tracking
Published 2018-04-20
URL http://arxiv.org/abs/1804.07437v2
PDF http://arxiv.org/pdf/1804.07437v2.pdf
PWC https://paperswithcode.com/paper/vision-meets-drones-a-challenge
Repo
Framework