Paper Group ANR 531
A Comprehensive Survey on Bengali Phoneme Recognition
Title | A Comprehensive Survey on Bengali Phoneme Recognition |
Authors | Sadia Tasnim Swarna, Shamim Ehsan, Md. Saiful Islam, Marium E Jannat |
Abstract | Various hidden Markov model based phoneme recognition methods for the Bengali language are reviewed. Automatic phoneme recognition for Bengali using multilayer neural networks is also reviewed, and the usefulness of multilayer over single-layer neural networks is discussed. Bangla phonetic feature table construction and enhancement for Bengali speech recognition are discussed as well, and the surveyed methods are compared. |
Tasks | Speech Recognition |
Published | 2017-01-27 |
URL | http://arxiv.org/abs/1701.08156v2 |
http://arxiv.org/pdf/1701.08156v2.pdf | |
PWC | https://paperswithcode.com/paper/a-comprehensive-survey-on-bengali-phoneme |
Repo | |
Framework | |
Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation
Title | Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation |
Authors | YuXuan Liu, Abhishek Gupta, Pieter Abbeel, Sergey Levine |
Abstract | Imitation learning is an effective approach for autonomous systems to acquire control policies when an explicit reward function is unavailable, using supervision provided as demonstrations from an expert, typically a human operator. However, standard imitation learning methods assume that the agent receives examples of observation-action tuples that could be provided, for instance, to a supervised learning algorithm. This stands in contrast to how humans and animals imitate: we observe another person performing some behavior and then figure out which actions will realize that behavior, compensating for changes in viewpoint, surroundings, object positions and types, and other factors. We term this kind of imitation learning “imitation-from-observation,” and propose an imitation learning method based on video prediction with context translation and deep reinforcement learning. This lifts the assumption in imitation learning that the demonstration should consist of observations in the same environment configuration, and enables a variety of interesting applications, including learning robotic skills that involve tool use simply by observing videos of human tool use. Our experimental results show the effectiveness of our approach in learning a wide range of real-world robotic tasks modeled after common household chores from videos of a human demonstrator, including sweeping, ladling almonds, pushing objects as well as a number of tasks in simulation. |
Tasks | Imitation Learning, Video Prediction |
Published | 2017-07-11 |
URL | http://arxiv.org/abs/1707.03374v2 |
http://arxiv.org/pdf/1707.03374v2.pdf | |
PWC | https://paperswithcode.com/paper/imitation-from-observation-learning-to |
Repo | |
Framework | |
Simplified Long Short-term Memory Recurrent Neural Networks: part III
Title | Simplified Long Short-term Memory Recurrent Neural Networks: part III |
Authors | Atra Akandeh, Fathi M. Salem |
Abstract | This is part III of a three-part work. In parts I and II, we presented eight variants of simplified Long Short-Term Memory (LSTM) recurrent neural networks (RNNs). Fast computation, especially on constrained computing resources, is an important factor in processing big time-sequence data. In this part III paper, we present and evaluate two new LSTM model variants which dramatically reduce the computational load while retaining comparable performance to the base (standard) LSTM RNNs. In these new variants, we impose (Hadamard) pointwise state multiplications in the cell-memory network in addition to the gating signal networks. (An illustrative sketch of such a pointwise-gated cell follows this entry.) |
Tasks | |
Published | 2017-07-14 |
URL | http://arxiv.org/abs/1707.04626v1 |
http://arxiv.org/pdf/1707.04626v1.pdf | |
PWC | https://paperswithcode.com/paper/simplified-long-short-term-memory-recurrent |
Repo | |
Framework | |
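The exact gate equations for the two new variants are defined in the paper; as a rough illustration of the underlying idea — replacing full matrix products in the gating signals with (Hadamard) pointwise multiplications against the cell state — here is a minimal NumPy sketch of one such reduced cell. The parameter names and the specific wiring below are assumptions for illustration, not the paper's definitions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simplified_lstm_step(x, h, c, p):
    """One step of a reduced LSTM cell: gates use pointwise (Hadamard)
    products with the cell state instead of full matrix products,
    cutting both parameters and compute."""
    f = sigmoid(p["uf"] * c + p["bf"])        # forget gate: Hadamard, not matmul
    i = sigmoid(p["ui"] * c + p["bi"])        # input gate
    o = sigmoid(p["uo"] * c + p["bo"])        # output gate
    c_tilde = np.tanh(p["Wc"] @ x + p["uc"] * h + p["bc"])  # candidate memory
    c_new = f * c + i * c_tilde
    h_new = o * np.tanh(c_new)
    return h_new, c_new

# toy usage: 8-dim input, 16-dim state
rng = np.random.default_rng(0)
nx, nh = 8, 16
p = {"Wc": rng.normal(0, 0.1, (nh, nx)),
     "uc": rng.normal(0, 0.1, nh),
     **{k: rng.normal(0, 0.1, nh) for k in ("uf", "ui", "uo")},
     **{k: np.zeros(nh) for k in ("bf", "bi", "bo", "bc")}}
h, c = np.zeros(nh), np.zeros(nh)
for t in range(5):
    h, c = simplified_lstm_step(rng.normal(size=nx), h, c, p)
```

Note how the only matrix-vector product left is against the input; everything touching the recurrent state is elementwise, which is where the computational savings come from.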
Fingerprint Orientation Refinement through Iterative Smoothing
Title | Fingerprint Orientation Refinement through Iterative Smoothing |
Authors | Pierluigi Maponi, Riccardo Piergallini, Filippo Santarelli |
Abstract | We propose a new gradient-based method for extracting the orientation field associated with a fingerprint, and a regularisation procedure to improve the orientation field computed from noisy fingerprint images. The regularisation algorithm is based on three new integral operators, introduced and discussed in this paper. A pre-processing technique is also proposed to improve the algorithm's performance. The results of a numerical experiment are reported as evidence of the efficiency of the proposed algorithm. (A sketch of the classical gradient-based baseline follows this entry.) |
Tasks | |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1711.03214v1 |
http://arxiv.org/pdf/1711.03214v1.pdf | |
PWC | https://paperswithcode.com/paper/fingerprint-orientation-refinement-through |
Repo | |
Framework | |
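The paper's regularisation is built from three new integral operators; for orientation, here is the classical gradient/structure-tensor estimator of the orientation field, followed by a generic iterative smoothing pass in the doubled-angle representation. This is the standard baseline such methods refine, not the paper's operators; block sizes and iteration counts are placeholder assumptions.

```python
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def orientation_field(img, block=16):
    """Classical gradient-based ridge orientation: average the structure
    tensor over blocks, then use the doubled-angle formula."""
    gx, gy = sobel(img, axis=1), sobel(img, axis=0)
    gxx = uniform_filter(gx * gx, block)
    gyy = uniform_filter(gy * gy, block)
    gxy = uniform_filter(gx * gy, block)
    # ridge direction is perpendicular to the dominant gradient direction
    return 0.5 * np.arctan2(2.0 * gxy, gxx - gyy) + np.pi / 2

def iterative_smoothing(theta, iters=5, size=5):
    """Regularise by repeatedly averaging cos(2*theta) and sin(2*theta);
    the doubled-angle trick avoids the pi-periodicity wrap-around."""
    for _ in range(iters):
        c, s = np.cos(2 * theta), np.sin(2 * theta)
        theta = 0.5 * np.arctan2(uniform_filter(s, size), uniform_filter(c, size))
    return theta

# toy usage on synthetic parallel ridges
yy, xx = np.mgrid[0:128, 0:128]
img = np.sin(0.3 * (xx + 0.5 * yy))
theta = iterative_smoothing(orientation_field(img))
```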
Converting Cascade-Correlation Neural Nets into Probabilistic Generative Models
Title | Converting Cascade-Correlation Neural Nets into Probabilistic Generative Models |
Authors | Ardavan Salehi Nobandegani, Thomas R. Shultz |
Abstract | Humans are not only adept at recognizing what class an input instance belongs to (i.e., the classification task), but perhaps more remarkably, they can imagine (i.e., generate) plausible instances of a desired class with ease when prompted. Inspired by this, we propose a framework for transforming Cascade-Correlation Neural Networks (CCNNs) into probabilistic generative models, thereby enabling CCNNs to generate samples from a category of interest. CCNNs are a well-known class of deterministic, discriminative NNs, which autonomously construct their topology and have been successful in accounting for a variety of psychological phenomena. Our proposed framework is based on a Markov Chain Monte Carlo (MCMC) method, the Metropolis-adjusted Langevin algorithm, which capitalizes on the gradient information of the target distribution to direct its exploration towards regions of high probability, thereby achieving good mixing properties. Through extensive simulations, we demonstrate the efficacy of our proposed framework. (A sketch of the Langevin sampler follows this entry.) |
Tasks | |
Published | 2017-01-18 |
URL | http://arxiv.org/abs/1701.05004v1 |
http://arxiv.org/pdf/1701.05004v1.pdf | |
PWC | https://paperswithcode.com/paper/converting-cascade-correlation-neural-nets |
Repo | |
Framework | |
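The sampler named in the abstract, the Metropolis-adjusted Langevin algorithm (MALA), is standard and short enough to sketch in full. The paper's contribution is computing the gradient of a CCNN-derived target; below, a toy Gaussian stands in for that target as an assumption.

```python
import numpy as np

def mala(grad_log_p, log_p, x0, step=0.05, n_samples=5000, rng=None):
    """MALA: take a noisy gradient-ascent (Langevin) step on log p,
    then Metropolis-accept/reject so the chain targets p exactly."""
    if rng is None:
        rng = np.random.default_rng()
    x = np.asarray(x0, dtype=float)
    out = []
    for _ in range(n_samples):
        noise = rng.standard_normal(x.shape)
        prop = x + step * grad_log_p(x) + np.sqrt(2.0 * step) * noise
        # Hastings correction: the Langevin proposal is not symmetric.
        def log_q(a, b):  # log density of proposing a when currently at b
            return -np.sum((a - b - step * grad_log_p(b)) ** 2) / (4.0 * step)
        log_alpha = log_p(prop) - log_p(x) + log_q(x, prop) - log_q(prop, x)
        if np.log(rng.uniform()) < log_alpha:
            x = prop
        out.append(x.copy())
    return np.array(out)

# toy usage: standard 2-D Gaussian standing in for the CCNN target
samples = mala(lambda x: -x, lambda x: -0.5 * x @ x, np.zeros(2))
```

The gradient term biases proposals uphill, which is exactly the "directed exploration" the abstract credits for good mixing.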
Learning Robust Representations for Computer Vision
Title | Learning Robust Representations for Computer Vision |
Authors | Peng Zheng, Aleksandr Y. Aravkin, Karthikeyan Natesan Ramamurthy, Jayaraman J. Thiagarajan |
Abstract | Unsupervised learning techniques in computer vision often require learning latent representations, such as low-dimensional linear and non-linear subspaces. Noise and outliers in the data can frustrate these approaches by obscuring the latent spaces. Our main goal is a deeper understanding, and new development, of robust approaches for representation learning. We provide a new interpretation of existing robust approaches and present two specific contributions: a new robust PCA approach, which can separate foreground features from a dynamic background, and a novel robust spectral clustering method, which can cluster facial images with high accuracy. Both contributions show superior performance to standard methods on real-world test sets. (A sketch of a classical robust-PCA baseline follows this entry.) |
Tasks | Representation Learning |
Published | 2017-07-31 |
URL | http://arxiv.org/abs/1708.00069v1 |
http://arxiv.org/pdf/1708.00069v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-robust-representations-for-computer |
Repo | |
Framework | |
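The paper develops its own robust formulations; as a point of reference, here is the classical convex baseline — principal component pursuit solved by inexact augmented Lagrangian iterations — which performs the same foreground/background split the abstract mentions. The parameter defaults follow the usual PCP conventions and are assumptions here, not the paper's algorithm.

```python
import numpy as np

def shrink(X, tau):                        # soft-thresholding (prox of the l1 norm)
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):                           # singular value thresholding (prox of the nuclear norm)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def robust_pca(M, max_iter=500, tol=1e-7):
    """Decompose M ~ L (low rank) + S (sparse) by inexact ALM on the
    principal component pursuit objective. Stacking video frames as
    columns of M makes L the static background and S the foreground."""
    M = np.asarray(M, dtype=float)
    lam = 1.0 / np.sqrt(max(M.shape))
    mu = M.size / (4.0 * np.abs(M).sum())
    S, Y = np.zeros_like(M), np.zeros_like(M)
    for _ in range(max_iter):
        L = svt(M - S + Y / mu, 1.0 / mu)      # low-rank update
        S = shrink(M - L + Y / mu, lam / mu)   # sparse update
        Y += mu * (M - L - S)                  # dual ascent on the constraint M = L + S
        if np.linalg.norm(M - L - S) <= tol * np.linalg.norm(M):
            break
    return L, S
```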
Combining Machine Learning and Physics to Understand Glassy Systems
Title | Combining Machine Learning and Physics to Understand Glassy Systems |
Authors | Samuel S. Schoenholz |
Abstract | Our understanding of supercooled liquids and glasses has lagged significantly behind that of simple liquids and crystalline solids. This is due in part to the many potentially relevant degrees of freedom arising from the disorder inherent to these systems, and in part to non-equilibrium effects which are difficult to treat in the standard context of statistical physics. Together these issues have resulted in a field whose theories are under-constrained by experiment and where fundamental questions remain unresolved. Mean-field results have been successful in infinite dimensions, but they assume uniform local structure and it is unclear to what extent they apply to realistic systems. At odds with this are theories premised on the existence of structural defects. However, until recently it has been impossible to find structural signatures that are predictive of dynamics. Here we summarize and recast the results from several recent papers offering a data-driven approach to building a phenomenological theory of disordered materials by combining machine learning with physical intuition. |
Tasks | |
Published | 2017-09-23 |
URL | http://arxiv.org/abs/1709.08015v1 |
http://arxiv.org/pdf/1709.08015v1.pdf | |
PWC | https://paperswithcode.com/paper/combining-machine-learning-and-physics-to |
Repo | |
Framework | |
The landscape of the spiked tensor model
Title | The landscape of the spiked tensor model |
Authors | Gerard Ben Arous, Song Mei, Andrea Montanari, Mihai Nica |
Abstract | We consider the problem of estimating a large rank-one tensor ${\boldsymbol u}^{\otimes k}\in({\mathbb R}^{n})^{\otimes k}$, $k\ge 3$, in Gaussian noise. Earlier work characterized a critical signal-to-noise ratio $\lambda_{Bayes}= O(1)$ above which an ideal estimator achieves strictly positive correlation with the unknown vector of interest. Remarkably, no polynomial-time algorithm is known that achieves this goal unless $\lambda\ge C n^{(k-2)/4}$, and even powerful semidefinite programming relaxations appear to fail for $1\ll \lambda\ll n^{(k-2)/4}$. In order to elucidate this behavior, we consider the maximum likelihood estimator, which requires maximizing a degree-$k$ homogeneous polynomial over the unit sphere in $n$ dimensions. We compute the expected number of critical points and local maxima of this objective function, show that it is exponential in the dimension $n$, and give exact formulas for the exponential growth rate. We show that (for $\lambda$ larger than a constant) critical points are either very close to the unknown vector ${\boldsymbol u}$, or are confined in a band of width $\Theta(\lambda^{-1/(k-1)})$ around the maximum circle that is orthogonal to ${\boldsymbol u}$. For local maxima, this band shrinks to be of size $\Theta(\lambda^{-1/(k-2)})$. These 'uninformative' local maxima are likely to cause the failure of optimization algorithms. (A numerical sketch follows this entry.) |
Tasks | |
Published | 2017-11-15 |
URL | http://arxiv.org/abs/1711.05424v2 |
http://arxiv.org/pdf/1711.05424v2.pdf | |
PWC | https://paperswithcode.com/paper/the-landscape-of-the-spiked-tensor-model |
Repo | |
Framework | |
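A small numerical companion to the abstract: generate a spiked tensor and run tensor power iteration, a simple local ascent on the degree-3 objective $\langle T, x^{\otimes 3}\rangle$ over the sphere. The noise normalisation below is one common convention and may differ from the paper's; the qualitative point is that a warm start near ${\boldsymbol u}$ finds the informative maximum, while a cold random start can stall in the uninformative equatorial band.

```python
import numpy as np

def spiked_tensor(n, lam, rng):
    """T = lam * u (x) u (x) u + W, with one common noise scaling (assumption)."""
    u = rng.standard_normal(n); u /= np.linalg.norm(u)
    W = rng.standard_normal((n, n, n)) / np.sqrt(n)
    return lam * np.einsum('i,j,k->ijk', u, u, u) + W, u

def power_iteration(T, x, iters=200):
    """Local ascent: x <- T(., x, x) / ||T(., x, x)||, a stand-in for any
    local optimizer of the degree-3 MLE objective on the sphere."""
    for _ in range(iters):
        x = np.einsum('ijk,j,k->i', T, x, x)
        x /= np.linalg.norm(x)
    return x

rng = np.random.default_rng(0)
n, lam = 50, 6.0
T, u = spiked_tensor(n, lam, rng)

cold = rng.standard_normal(n); cold /= np.linalg.norm(cold)
warm = u + 0.3 * rng.standard_normal(n); warm /= np.linalg.norm(warm)
for name, x0 in (("cold start", cold), ("warm start", warm)):
    x = power_iteration(T, x0)
    print(name, "|<x,u>| =", round(abs(x @ u), 3))
```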
An Exploration of Word Embedding Initialization in Deep-Learning Tasks
Title | An Exploration of Word Embedding Initialization in Deep-Learning Tasks |
Authors | Tom Kocmi, Ondřej Bojar |
Abstract | Word embeddings are the interface between the world of discrete units of text processing and the continuous, differentiable world of neural networks. In this work, we examine various random and pretrained initialization methods for embeddings used in deep networks and their effect on the performance on four NLP tasks with both recurrent and convolutional architectures. We confirm that pretrained embeddings are a little better than random initialization, especially considering the speed of learning. On the other hand, we do not see any significant difference between various methods of random initialization, as long as the variance is kept reasonably low. High-variance initialization prevents the network from using the space of embeddings and forces it to use other free parameters to accomplish the task. We support this hypothesis by observing the performance in learning lexical relations and by the fact that the network can learn to perform reasonably in its task even with fixed random embeddings. (A sketch of the initialization options follows this entry.) |
Tasks | Word Embeddings |
Published | 2017-11-24 |
URL | http://arxiv.org/abs/1711.09160v1 |
http://arxiv.org/pdf/1711.09160v1.pdf | |
PWC | https://paperswithcode.com/paper/an-exploration-of-word-embedding |
Repo | |
Framework | |
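The regimes the paper compares translate directly into code. A hedged PyTorch sketch of an embedding layer supporting low-variance random initialization, pretrained vectors, and fixed (frozen) random embeddings; the std value is an assumption, the paper's finding being only that the variance should stay reasonably low.

```python
import torch
import torch.nn as nn

def make_embedding(vocab_size, dim, std=0.1, pretrained=None, freeze=False):
    """Embedding layer with the knobs the paper varies: random init with a
    chosen variance, or pretrained vectors, optionally frozen."""
    emb = nn.Embedding(vocab_size, dim)
    if pretrained is not None:
        emb.weight.data.copy_(torch.as_tensor(pretrained, dtype=torch.float32))
    else:
        nn.init.normal_(emb.weight, mean=0.0, std=std)  # keep the variance low
    emb.weight.requires_grad = not freeze               # frozen ~= fixed random embeddings
    return emb

emb = make_embedding(10_000, 300)                   # trainable, low-variance random init
fixed = make_embedding(10_000, 300, freeze=True)    # fixed-random baseline from the abstract
```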
Towards continuous control of flippers for a multi-terrain robot using deep reinforcement learning
Title | Towards continuous control of flippers for a multi-terrain robot using deep reinforcement learning |
Authors | Giuseppe Paolo, Lei Tai, Ming Liu |
Abstract | In this paper we focus on developing a control algorithm for multi-terrain tracked robots with flippers using a reinforcement learning (RL) approach. The work is based on the deep deterministic policy gradient (DDPG) algorithm, proven to be very successful in simple simulation environments. The algorithm works in an end-to-end fashion to control the continuous position of the flippers. This end-to-end approach makes it easy to apply the controller to a wide array of circumstances, but the huge flexibility comes at the cost of an increased difficulty of solution. The complexity of the task is increased further by the fact that real multi-terrain robots move in partially observable environments. Notwithstanding these complications, being able to smoothly control a multi-terrain robot can produce huge benefits in the daily lives of impaired people or in search and rescue situations. (A sketch of the DDPG update follows this entry.) |
Tasks | Continuous Control |
Published | 2017-09-25 |
URL | http://arxiv.org/abs/1709.08430v1 |
http://arxiv.org/pdf/1709.08430v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-continuous-control-of-flippers-for-a |
Repo | |
Framework | |
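DDPG itself is a published algorithm; a compact PyTorch sketch of its core update is below. The observation/action dimensions, network sizes, and learning rates are placeholders, not the paper's robot setup; the tanh output maps naturally to bounded continuous flipper positions.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

obs_dim, act_dim = 10, 4   # placeholder dims: robot state -> continuous flipper commands

actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                      nn.Linear(64, act_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_t, critic_t = copy.deepcopy(actor), copy.deepcopy(critic)
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s2, done, gamma=0.99, tau=0.005):
    # Critic: regress Q(s,a) onto the bootstrapped target from the target nets.
    with torch.no_grad():
        q_next = critic_t(torch.cat([s2, actor_t(s2)], dim=1))
        y = r + gamma * (1 - done) * q_next
    q = critic(torch.cat([s, a], dim=1))
    opt_c.zero_grad(); F.mse_loss(q, y).backward(); opt_c.step()
    # Actor: ascend the critic's value of its own actions (deterministic policy gradient).
    opt_a.zero_grad()
    (-critic(torch.cat([s, actor(s)], dim=1)).mean()).backward()
    opt_a.step()
    # Polyak-average the target networks for stability.
    for net, tgt in ((actor, actor_t), (critic, critic_t)):
        for p, tp in zip(net.parameters(), tgt.parameters()):
            tp.data.mul_(1 - tau).add_(tau * p.data)

# toy usage with a fake minibatch
B = 32
s, s2 = torch.randn(B, obs_dim), torch.randn(B, obs_dim)
a, r, done = torch.randn(B, act_dim), torch.randn(B, 1), torch.zeros(B, 1)
ddpg_update(s, a, r, s2, done)
```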
A Novel Brain Decoding Method: a Correlation Network Framework for Revealing Brain Connections
Title | A Novel Brain Decoding Method: a Correlation Network Framework for Revealing Brain Connections |
Authors | Siyu Yu, Nanning Zheng, Yongqiang Ma, Hao Wu, Badong Chen |
Abstract | Brain decoding is an active topic in cognitive science, focused on reconstructing perceptual images from brain activity. Analyzing the correlations in data collected from human brain activity and representing activity patterns are two problems in brain decoding based on functional magnetic resonance imaging (fMRI) signals. However, existing correlation analysis methods mainly focus on the strength information of voxels, which reveals functional connectivity in the cerebral cortex. They tend to neglect the structural information that implies intracortical or intrinsic connections, that is, structural connectivity. Hence, the effective connectivity inferred by these methods is relatively one-sided. We therefore propose a correlation network (CorrNet) framework that can be flexibly combined with diverse pattern representation models. In the CorrNet framework, topological correlation is introduced to reveal structural information. Rich correlations are obtained, which helps specify the underlying effective connectivity. We also combine the CorrNet framework with a linear support vector machine (SVM) and a dynamic evolving spiking neural network (SNN) for pattern representation separately, thus providing a novel method for decoding cognitive activity patterns. Experimental results verify the reliability and robustness of our CorrNet framework and demonstrate that the new method achieves significant improvement in brain decoding over comparable methods. (A sketch of the plain correlation baseline follows this entry.) |
Tasks | Brain Decoding |
Published | 2017-12-01 |
URL | http://arxiv.org/abs/1712.01668v1 |
http://arxiv.org/pdf/1712.01668v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-brain-decoding-method-a-correlation |
Repo | |
Framework | |
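For context, here is the plain functional-correlation baseline the abstract contrasts with (strength information only), fed to a linear SVM. The paper's CorrNet additionally injects topological correlation, which this sketch does not attempt; data shapes and labels below are synthetic placeholders.

```python
import numpy as np
from sklearn.svm import LinearSVC

def correlation_features(trials):
    """Per-trial functional-connectivity features: the upper triangle of the
    voxel-by-voxel correlation matrix. trials: (n_trials, n_voxels, n_timepoints)."""
    feats = []
    for x in trials:
        c = np.corrcoef(x)                       # voxel-by-voxel correlation
        iu = np.triu_indices_from(c, k=1)        # keep each pair once
        feats.append(c[iu])
    return np.asarray(feats)

# toy usage with synthetic data (random labels -> chance-level accuracy)
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 30, 100))           # 40 trials, 30 voxels, 100 timepoints
y = rng.integers(0, 2, 40)
clf = LinearSVC(max_iter=5000).fit(correlation_features(X), y)
```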
Traffic-Aware Transmission Mode Selection in D2D-enabled Cellular Networks with Token System
Title | Traffic-Aware Transmission Mode Selection in D2D-enabled Cellular Networks with Token System |
Authors | Yiling Yuan, Tao Yang, Hui Feng, Bo Hu, Jianqiu Zhang, Bin Wang, Qiyong Lu |
Abstract | We consider a D2D-enabled cellular network where user equipments (UEs) owned by rational users are incentivized to form D2D pairs using tokens, which they exchange electronically to “buy” and “sell” D2D services. Meanwhile, the devices can choose the transmission mode, i.e. receiving data via cellular links or D2D links. Thus, taking as given the different benefits brought by diverse traffic types, the UEs can utilize their tokens more efficiently via transmission mode selection. In this paper, the optimal transmission mode selection strategy as well as the token collection policy are investigated to maximize the long-term utility in a dynamic network environment. The optimal policy is proved to be a threshold strategy, and the thresholds have a monotonicity property. Numerical simulations verify our observations, and the gain from transmission mode selection is observed. (A toy sketch of the threshold structure follows this entry.) |
Tasks | |
Published | 2017-03-02 |
URL | http://arxiv.org/abs/1703.00660v3 |
http://arxiv.org/pdf/1703.00660v3.pdf | |
PWC | https://paperswithcode.com/paper/traffic-aware-transmission-mode-selection-in |
Repo | |
Framework | |
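A heavily simplified toy of the decision problem — all dynamics, rewards, and probabilities below are invented for illustration, not the paper's model. Value iteration runs over (token count, traffic type) states; the greedy policy printed at the end can then be inspected for the threshold structure in the token count that the paper proves for its model.

```python
import numpy as np

K, gamma = 10, 0.95              # token cap and discount factor (toy values)
p_earn, q_high = 0.4, 0.3        # prob. of earning a token; prob. next request is high-value
r_cell = np.array([0.2, 0.5])    # reward via cellular link, by traffic type (low, high)
r_d2d  = np.array([0.5, 1.5])    # reward via D2D link (better, but spends one token)

def q_values(V):
    """Q-values for each (tokens, traffic type) and each mode, given value V."""
    EV = q_high * V[:, 1] + (1 - q_high) * V[:, 0]   # average over the next traffic type
    cont = lambda t: p_earn * EV[min(t + 1, K)] + (1 - p_earn) * EV[t]
    qc = np.array([[r_cell[g] + gamma * cont(t) for g in (0, 1)]
                   for t in range(K + 1)])
    qd = np.array([[r_d2d[g] + gamma * cont(t - 1) if t > 0 else -np.inf
                    for g in (0, 1)] for t in range(K + 1)])
    return qc, qd

V = np.zeros((K + 1, 2))
for _ in range(1000):            # value iteration to (near) convergence
    V = np.maximum(*q_values(V))

qc, qd = q_values(V)
for g, name in ((0, "low-value "), (1, "high-value")):
    print(name, ["D2D" if qd[t, g] >= qc[t, g] else "cell" for t in range(K + 1)])
```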
Analyzing Knowledge Transfer in Deep Q-Networks for Autonomously Handling Multiple Intersections
Title | Analyzing Knowledge Transfer in Deep Q-Networks for Autonomously Handling Multiple Intersections |
Authors | David Isele, Akansel Cosgun, Kikuo Fujimura |
Abstract | We analyze how the knowledge to autonomously handle one type of intersection, represented as a Deep Q-Network, translates to other types of intersections (tasks). We view intersection handling as a deep reinforcement learning problem, which approximates the state-action Q-function as a deep neural network. Using a traffic simulator, we show that directly copying a network trained for one type of intersection to another type decreases the success rate. We also show that a network pre-trained on Task A and then fine-tuned on Task B not only performs better on Task B than a network exclusively trained on Task A, but also retains knowledge of Task A. Finally, we examine a lifelong learning setting, where we train a single network on five different types of intersections sequentially, and show that the resulting network exhibits catastrophic forgetting of knowledge from previous tasks. This result suggests the need for a long-term memory component to preserve knowledge. (A sketch of the copy/fine-tune regimes follows this entry.) |
Tasks | Transfer Learning |
Published | 2017-05-02 |
URL | http://arxiv.org/abs/1705.01197v1 |
http://arxiv.org/pdf/1705.01197v1.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-knowledge-transfer-in-deep-q |
Repo | |
Framework | |
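The two transfer regimes the abstract compares reduce to a few lines of PyTorch; network sizes and the fine-tuning learning rate below are assumptions, and the DQN training loop itself is omitted.

```python
import copy
import torch
import torch.nn as nn

def make_q_net(obs_dim, n_actions):
    """Placeholder Q-network; the paper's architecture is not reproduced here."""
    return nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                         nn.Linear(128, n_actions))

# Task A network, assumed already trained on intersection type A.
q_a = make_q_net(obs_dim=32, n_actions=5)

# Regime 1 (direct copy): reuse A's weights unchanged on Task B — per the
# paper, this decreases the success rate.
# Regime 2 (fine-tuning): initialise from A, then keep training on Task B.
q_b = copy.deepcopy(q_a)
opt = torch.optim.Adam(q_b.parameters(), lr=1e-4)   # smaller lr for fine-tuning (assumption)

# Forgetting check: after fine-tuning on B, re-evaluate on Task A against a
# frozen snapshot of the original network.
q_a_snapshot = copy.deepcopy(q_a).eval()
for p in q_a_snapshot.parameters():
    p.requires_grad_(False)
```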
Significance of Softmax-based Features in Comparison to Distance Metric Learning-based Features
Title | Significance of Softmax-based Features in Comparison to Distance Metric Learning-based Features |
Authors | Shota Horiguchi, Daiki Ikami, Kiyoharu Aizawa |
Abstract | The extraction of useful deep features is important for many computer vision tasks. Deep features extracted from classification networks have proved to perform well in those tasks. To obtain features of greater usefulness, end-to-end distance metric learning (DML) has been applied to train the feature extractor directly. However, in these DML studies, there were no equitable comparisons between features extracted from a DML-based network and those from a softmax-based network. In this paper, by presenting objective comparisons between these two approaches under the same network architecture, we show that softmax-based features perform competitively with, or even better than, state-of-the-art DML features when the size of the dataset, that is, the number of training samples per class, is large. The results suggest that softmax-based features should be properly taken into account when evaluating the performance of deep features. (A sketch of the softmax feature pipeline follows this entry.) |
Tasks | Metric Learning |
Published | 2017-12-29 |
URL | http://arxiv.org/abs/1712.10151v2 |
http://arxiv.org/pdf/1712.10151v2.pdf | |
PWC | https://paperswithcode.com/paper/significance-of-softmax-based-features-in |
Repo | |
Framework | |
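The softmax-based pipeline the paper evaluates amounts to training an ordinary classifier and reading off penultimate-layer activations at test time. A sketch using a torchvision backbone as a stand-in; the specific architecture and the L2 normalisation step are assumptions, not necessarily the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

# Any classification backbone works; resnet18 is just an example choice.
model = resnet18(num_classes=100)
# ... train with plain softmax cross-entropy on the labelled data ...

# At test time, drop the classifier head and use the penultimate activations
# as the embedding, L2-normalised so cosine/Euclidean retrieval behaves sensibly.
backbone = nn.Sequential(*list(model.children())[:-1])   # everything but model.fc

def extract_features(x):
    with torch.no_grad():
        f = backbone(x).flatten(1)        # (batch, 512) for resnet18
        return F.normalize(f, dim=1)

feats = extract_features(torch.randn(4, 3, 224, 224))
```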
A Gaussian Process Regression Model for Distribution Inputs
Title | A Gaussian Process Regression Model for Distribution Inputs |
Authors | François Bachoc, Fabrice Gamboa, Jean-Michel Loubes, Nil Venet |
Abstract | Monge-Kantorovich distances, otherwise known as Wasserstein distances, have received growing attention in statistics and machine learning as a powerful discrepancy measure for probability distributions. In this paper, we focus on forecasting a Gaussian process indexed by probability distributions. To this end, we provide a family of positive definite kernels built from transportation-based distances. We give a probabilistic understanding of these kernels and characterize the corresponding stochastic processes. We prove that the Gaussian processes indexed by distributions corresponding to these kernels can be efficiently forecast, opening new perspectives in Gaussian process modeling. (A toy sketch of one such kernel follows this entry.) |
Tasks | Gaussian Processes |
Published | 2017-01-31 |
URL | http://arxiv.org/abs/1701.09055v2 |
http://arxiv.org/pdf/1701.09055v2.pdf | |
PWC | https://paperswithcode.com/paper/a-gaussian-process-regression-model-for |
Repo | |
Framework | |
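One concrete instance of the construction: a squared-exponential kernel in the 2-Wasserstein distance between 1-D empirical distributions, plugged into standard GP regression. Whether a given transportation-based kernel is positive definite is precisely the kind of property the paper characterises; the toy below uses the 1-D case, where W2 between equal-size samples reduces to sorted-sample differences. The data, target function, and lengthscale are assumptions.

```python
import numpy as np

def wasserstein2_1d(x, y):
    """W2 between two 1-D empirical distributions with equally many samples:
    the quantile coupling reduces the distance to a sort."""
    xs, ys = np.sort(x), np.sort(y)
    return np.sqrt(np.mean((xs - ys) ** 2))

def gram(dists_a, dists_b, ell=1.0):
    """Squared-exponential kernel in W2 — one transportation-based kernel of
    the family the paper studies."""
    return np.array([[np.exp(-wasserstein2_1d(a, b) ** 2 / (2 * ell ** 2))
                      for b in dists_b] for a in dists_a])

# GP regression where each input is a distribution given by samples
rng = np.random.default_rng(0)
train = [rng.normal(mu, 1.0, 200) for mu in np.linspace(-2, 2, 8)]
y = np.array([np.mean(d) ** 2 for d in train])      # toy target: squared mean
test = [rng.normal(0.7, 1.0, 200)]

K = gram(train, train) + 1e-6 * np.eye(len(train))  # jitter for numerical stability
k_star = gram(test, train)
y_pred = k_star @ np.linalg.solve(K, y)
print(y_pred)   # compare with the toy target 0.7**2 = 0.49
```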