Paper Group ANR 1748
A Novel Universal Solar Energy Predictor
Title | A Novel Universal Solar Energy Predictor |
Authors | Nirupam Bidikar, Kotoju Rajitha, P. Usha Supriya |
Abstract | Solar energy is one of the most economical and clean sustainable energy sources on the planet. However, solar energy throughput is highly unpredictable because it depends on a plethora of conditions, including weather, seasons, and other ecological/environmental factors. Solar energy prediction is therefore a necessity for optimizing solar energy use and improving the efficiency of solar energy systems. Conventionally, this optimization is undertaken by subject matter experts using their domain knowledge, yet it is impractical even for experts to tune solar systems on a continuous basis. We believe that the power of machine learning can be harnessed to better optimize solar energy production by learning the correlation between various conditions and solar energy production from historical data, which is typically readily available. To this end, this paper predicts the daily total energy generation of an installed solar plant using a Naive Bayes classifier. In the forecasting procedure, a one-year historical dataset including daily average temperature, daily total sunshine duration, daily total global solar radiation, and daily total photovoltaic energy generation is used as categorical-valued features. With this Naive Bayes model, the sensitivity and precision measures of the photovoltaic energy prediction are improved, and the effect of the other solar characteristics on solar energy production is also assessed. |
Tasks | |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.06660v2 |
http://arxiv.org/pdf/1902.06660v2.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-universal-solar-energy-predictor |
Repo | |
Framework | |
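The abstract above reduces solar forecasting to a Naive Bayes classification over categorical weather features. Below is a minimal sketch of that setup, assuming scikit-learn; the feature bins, class labels, and toy rows are illustrative placeholders, not the paper's dataset or exact pipeline.

```python
# A minimal sketch, assuming scikit-learn, of the Naive Bayes setup described
# above. The feature bins, class labels, and rows below are illustrative
# placeholders, not the paper's dataset.
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder

# Each row: [daily average temperature bin, sunshine duration bin, global radiation bin]
X_raw = [
    ["low",  "short",  "weak"],
    ["mid",  "medium", "moderate"],
    ["high", "long",   "strong"],
    ["mid",  "long",   "strong"],
    ["low",  "medium", "weak"],
    ["high", "medium", "moderate"],
]
y = ["low", "medium", "high", "high", "low", "medium"]  # daily PV energy class

encoder = OrdinalEncoder()
X = encoder.fit_transform(X_raw)           # categorical strings -> integer codes

clf = CategoricalNB()
clf.fit(X, y)

new_day = encoder.transform([["mid", "long", "moderate"]])
print(clf.predict(new_day))                # predicted daily energy class
```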
Differentiable Ranks and Sorting using Optimal Transport
Title | Differentiable Ranks and Sorting using Optimal Transport |
Authors | Marco Cuturi, Olivier Teboul, Jean-Philippe Vert |
Abstract | Sorting an array is a fundamental routine in machine learning, one that is used to compute rank-based statistics, cumulative distribution functions (CDFs), quantiles, or to select closest neighbors and labels. The sorting function is however piece-wise constant (the sorting permutation of a vector does not change if the entries of that vector are infinitesimally perturbed) and therefore has no gradient information to back-propagate. We propose a framework to sort elements that is algorithmically differentiable. We leverage the fact that sorting can be seen as a particular instance of the optimal transport (OT) problem on $\mathbb{R}$, from input values to a predefined array of sorted values (e.g. $1,2,\dots,n$ if the input array has $n$ elements). Building upon this link, we propose generalized CDFs and quantile operators by varying the size and weights of the target presorted array. Because this amounts to using the so-called Kantorovich formulation of OT, we call these quantities K-sorts, K-CDFs and K-quantiles. We recover differentiable algorithms by adding to the OT problem an entropic regularization, and approximate it using a few Sinkhorn iterations. We call these operators S-sorts, S-CDFs and S-quantiles, and use them in various learning settings: we benchmark them against the recently proposed neuralsort [Grover et al. 2019], propose applications to quantile regression and introduce differentiable formulations of the top-k accuracy that deliver state-of-the-art performance. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11885v2 |
https://arxiv.org/pdf/1905.11885v2.pdf | |
PWC | https://paperswithcode.com/paper/differentiable-sorting-using-optimal |
Repo | |
Framework | |
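The abstract above casts sorting as entropy-regularized optimal transport onto a pre-sorted target array, solved with a few Sinkhorn iterations. The sketch below is a plain NumPy rendering of that idea with uniform weights and a unit-interval target grid; the value of `eps`, the grid, the input rescaling, and the iteration count are illustrative choices rather than the paper's exact S-sort/S-rank operators.

```python
# A plain-NumPy sketch of the entropic-OT view of sorting summarized above:
# transport the inputs onto a fixed pre-sorted grid with a few Sinkhorn
# iterations, then read off soft ranks and soft sorted values.
import numpy as np

def soft_rank_and_sort(x, n_iters=200, eps=0.02):
    n = x.shape[0]
    a = np.full(n, 1.0 / n)                     # uniform weights on inputs
    b = np.full(n, 1.0 / n)                     # uniform weights on target grid
    y = np.linspace(0.0, 1.0, n)                # pre-sorted target values
    xs = (x - x.min()) / (x.max() - x.min() + 1e-9)        # rescale for a stable cost
    K = np.exp(-((xs[:, None] - y[None, :]) ** 2) / eps)   # Gibbs kernel

    u = np.ones(n)
    for _ in range(n_iters):                    # Sinkhorn fixed-point updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]             # approximate transport plan

    soft_ranks = n * (P @ np.cumsum(b)) / a     # blurred 1-based ranks
    soft_sorted = (P.T @ x) / b                 # barycentric sorted values
    return soft_ranks, soft_sorted

x = np.array([0.3, -1.2, 2.5, 0.7])
ranks, sorted_vals = soft_rank_and_sort(x)
print(ranks)        # approaches [2, 1, 4, 3] as eps shrinks
print(sorted_vals)  # approaches sorted(x) as eps shrinks
```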
Finite-Sample Analysis for SARSA with Linear Function Approximation
Title | Finite-Sample Analysis for SARSA with Linear Function Approximation |
Authors | Shaofeng Zou, Tengyu Xu, Yingbin Liang |
Abstract | SARSA is an on-policy algorithm to learn a Markov decision process policy in reinforcement learning. We investigate the SARSA algorithm with linear function approximation under non-i.i.d. data, where a single sample trajectory is available. With a Lipschitz continuous policy improvement operator that is smooth enough, SARSA has been shown to converge asymptotically \cite{perkins2003convergent,melo2008analysis}. However, its non-asymptotic analysis is challenging and remains unsolved due to the non-i.i.d. samples and the fact that the behavior policy changes dynamically with time. In this paper, we develop a novel technique to explicitly characterize the stochastic bias of a class of stochastic approximation procedures with time-varying Markov transition kernels. Our approach enables non-asymptotic convergence analyses of this class of stochastic approximation algorithms, which may be of independent interest. Using our bias characterization technique and a gradient descent type of analysis, we provide a finite-sample analysis of the mean square error of the SARSA algorithm. We then further study a fitted SARSA algorithm, which includes the original SARSA algorithm and its variant in \cite{perkins2003convergent} as special cases. This fitted SARSA algorithm provides a more general framework for \textit{iterative} on-policy fitted policy iteration, which is more memory- and computationally efficient. For this fitted SARSA algorithm, we also provide its finite-sample analysis. |
Tasks | Q-Learning |
Published | 2019-02-06 |
URL | https://arxiv.org/abs/1902.02234v3 |
https://arxiv.org/pdf/1902.02234v3.pdf | |
PWC | https://paperswithcode.com/paper/finite-sample-analysis-for-sarsa-and-q |
Repo | |
Framework | |
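To make the analysed setting concrete, here is a generic SARSA update with linear function approximation along a single sample trajectory. The random feature map, toy rewards and transitions, and the softmax behaviour policy are stand-ins; the paper's analysis assumes a Lipschitz-continuous policy improvement operator rather than this particular policy.

```python
# A generic SARSA update with linear function approximation along a single
# sample trajectory. The feature map, rewards, and transitions are toy stand-ins.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, dim = 10, 2, 6
features = rng.normal(size=(n_states, n_actions, dim))   # phi(s, a)

def softmax_policy(theta, s, temp=1.0):
    q = features[s] @ theta                    # Q(s, a) = phi(s, a)^T theta
    p = np.exp((q - q.max()) / temp)
    return p / p.sum()

theta = np.zeros(dim)
alpha, gamma = 0.05, 0.95
s = rng.integers(n_states)
a = rng.choice(n_actions, p=softmax_policy(theta, s))

for _ in range(5000):                          # single non-i.i.d. trajectory
    r = rng.normal()                           # toy reward
    s_next = rng.integers(n_states)            # toy Markov transition
    a_next = rng.choice(n_actions, p=softmax_policy(theta, s_next))
    td_error = r + gamma * features[s_next, a_next] @ theta - features[s, a] @ theta
    theta += alpha * td_error * features[s, a] # on-policy TD update
    s, a = s_next, a_next

print(theta)
```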
Least Action Principles and Well-Posed Learning Problems
Title | Least Action Principles and Well-Posed Learning Problems |
Authors | Alessandro Betti, Marco Gori |
Abstract | Machine Learning algorithms are typically regarded as appropriate optimization schemes for minimizing risk functions that are constructed on the training set, which conveys statistical flavor to the corresponding learning problem. When the focus shifts to perception, which is inherently interwound with time, recent alternative formulations of learning have been proposed that rely on the principle of Least Cognitive Action, which very much reminds us of the Least Action Principle in mechanics. In this paper, we discuss different forms of the cognitive action and show the well-posedness of learning. In particular, unlike the special case of the action in mechanics, where stationarity is typically attained at saddle points, we prove the existence of the minimum of a special form of cognitive action, which yields fourth-order differential equations of learning. We also briefly discuss the dissipative behavior of these equations, which turns out to characterize the process of learning. |
Tasks | |
Published | 2019-07-04 |
URL | https://arxiv.org/abs/1907.02517v1 |
https://arxiv.org/pdf/1907.02517v1.pdf | |
PWC | https://paperswithcode.com/paper/least-action-principles-and-well-posed |
Repo | |
Framework | |
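For readers less familiar with the mechanical analogy invoked above, the classical least action principle states that trajectories make an action functional stationary, which yields the Euler-Lagrange equation. This is the standard textbook form, not the paper's specific cognitive-action functional (whose minimizers satisfy fourth-order equations):

```latex
% Standard least action principle from mechanics, stated for reference; this is
% the analogy the abstract appeals to, not the paper's cognitive-action functional.
\[
  \mathcal{A}[q] \;=\; \int_{t_0}^{t_1} L\bigl(t, q(t), \dot q(t)\bigr)\, dt ,
  \qquad
  \frac{d}{dt}\,\frac{\partial L}{\partial \dot q} \;-\; \frac{\partial L}{\partial q} \;=\; 0 .
\]
```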
Enhanced Robot Speech Recognition Using Biomimetic Binaural Sound Source Localization
Title | Enhanced Robot Speech Recognition Using Biomimetic Binaural Sound Source Localization |
Authors | Jorge Davila-Chacon, Jindong Liu, Stefan Wermter |
Abstract | Inspired by the behavior of humans talking in noisy environments, we propose an embodied embedded cognition approach to improve automatic speech recognition (ASR) systems for robots in challenging environments, such as with ego noise, using binaural sound source localization (SSL). The approach is verified by measuring the impact of SSL with a humanoid robot head on the performance of an ASR system. More specifically, a robot orients itself toward the angle where the signal-to-noise ratio (SNR) of speech is maximized for one microphone before doing an ASR task. First, a spiking neural network inspired by the midbrain auditory system based on our previous work is applied to calculate the sound signal angle. Then, a feedforward neural network is used to handle high levels of ego noise and reverberation in the signal. Finally, the sound signal is fed into an ASR system. For ASR, we use a system developed by our group and compare its performance with and without the support from SSL. We test our SSL and ASR systems on two humanoid platforms with different structural and material properties. With our approach we halve the sentence error rate with respect to the common downmixing of both channels. Surprisingly, the ASR performance is more than two times better when the angle between the humanoid head and the sound source allows sound waves to be reflected most intensely from the pinna to the ear microphone, rather than when sound waves arrive perpendicularly to the membrane. |
Tasks | Speech Recognition |
Published | 2019-02-13 |
URL | http://arxiv.org/abs/1902.05446v1 |
http://arxiv.org/pdf/1902.05446v1.pdf | |
PWC | https://paperswithcode.com/paper/enhanced-robot-speech-recognition-using |
Repo | |
Framework | |
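The pipeline above first localizes the speaker and then orients the robot before recognition. As a rough illustration of the localization step, the sketch below estimates the inter-channel delay by cross-correlation and converts it to an azimuth; the paper instead uses a midbrain-inspired spiking neural network, and the microphone spacing, sample rate, and helper function here are made-up values.

```python
# A classical cross-correlation baseline for the binaural localization step,
# only to make the SSL-then-ASR pipeline concrete.
import numpy as np

def estimate_azimuth(left, right, fs=16000, mic_distance=0.15, c=343.0):
    """Estimate the source azimuth (radians) from two microphone signals."""
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)   # lag (samples) of best alignment
    tdoa = lag / fs                            # time difference of arrival (s)
    sin_theta = np.clip(tdoa * c / mic_distance, -1.0, 1.0)  # clamp to valid range
    return np.arcsin(sin_theta)

# Toy example: the same waveform reaches the right channel 5 samples later.
fs = 16000
signal = np.sin(2 * np.pi * 440 * np.arange(2048) / fs)
left, right = signal, np.roll(signal, 5)
print(np.degrees(estimate_azimuth(left, right, fs)))   # azimuth estimate in degrees
```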
Stability of Linear Structural Equation Models of Causal Inference
Title | Stability of Linear Structural Equation Models of Causal Inference |
Authors | Karthik Abinav Sankararaman, Anand Louis, Navin Goyal |
Abstract | We consider the numerical stability of the parameter recovery problem in Linear Structural Equation Model (LSEM) of causal inference. A long line of work starting from Wright (1920) has focused on understanding which sub-classes of LSEM allow for efficient parameter recovery. Despite decades of study, this question is not yet fully resolved. The goal of this paper is complementary to this line of work; we want to understand the stability of the recovery problem in the cases when efficient recovery is possible. Numerical stability of Pearl’s notion of causality was first studied in Schulman and Srivastava (2016) using the concept of condition number where they provide ill-conditioned examples. In this work, we provide a condition number analysis for the LSEM. First we prove that under a sufficient condition, for a certain sub-class of LSEM that are \emph{bow-free} (Brito and Pearl (2002)), the parameter recovery is stable. We further prove that \emph{randomly} chosen input parameters for this family satisfy the condition with a substantial probability. Hence for this family, on a large subset of parameter space, recovery is numerically stable. Next we construct an example of LSEM on four vertices with \emph{unbounded} condition number. We then corroborate our theoretical findings via simulations as well as real-world experiments for a sociology application. Finally, we provide a general heuristic for estimating the condition number of any LSEM instance. |
Tasks | Causal Inference |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.06836v2 |
https://arxiv.org/pdf/1905.06836v2.pdf | |
PWC | https://paperswithcode.com/paper/stability-of-linear-structural-equation |
Repo | |
Framework | |
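As a toy illustration of why conditioning matters for parameter recovery in a linear SEM, the snippet below simulates a small model with nearly collinear variables, recovers an edge weight by least squares, and reports the condition number of the matrix being inverted. The paper's condition-number notion for LSEMs is defined more carefully than this; the graph, weights, and noise scales are arbitrary choices.

```python
# Nearly collinear causes make the matrix inverted during recovery
# ill-conditioned, so the recovered weights become noisy.
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Structural equations: x1 -> x2 -> x3, with small noise on x2 so that x1 and
# x2 are nearly collinear.
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.1 * rng.normal(size=n)
x3 = -0.5 * x2 + rng.normal(size=n)

# Recover the weights into x3 by regressing x3 on (x1, x2).
X = np.column_stack([x1, x2])
gram = X.T @ X
coef = np.linalg.solve(gram, X.T @ x3)

print("recovered weights:", coef)              # true values are (0.0, -0.5)
print("condition number :", np.linalg.cond(gram))
```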
Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?
Title | Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning? |
Authors | Ofir Nachum, Haoran Tang, Xingyu Lu, Shixiang Gu, Honglak Lee, Sergey Levine |
Abstract | Hierarchical reinforcement learning has demonstrated significant success at solving difficult reinforcement learning (RL) tasks. Previous works have motivated the use of hierarchy by appealing to a number of intuitive benefits, including learning over temporally extended transitions, exploring over temporally extended periods, and training and exploring in a more semantically meaningful action space, among others. However, in fully observed, Markovian settings, it is not immediately clear why hierarchical RL should provide benefits over standard “shallow” RL architectures. In this work, we isolate and evaluate the claimed benefits of hierarchical RL on a suite of tasks encompassing locomotion, navigation, and manipulation. Surprisingly, we find that most of the observed benefits of hierarchy can be attributed to improved exploration, as opposed to easier policy learning or imposed hierarchical structures. Given this insight, we present exploration techniques inspired by hierarchy that achieve performance competitive with hierarchical RL while at the same time being much simpler to use and implement. |
Tasks | Hierarchical Reinforcement Learning |
Published | 2019-09-23 |
URL | https://arxiv.org/abs/1909.10618v2 |
https://arxiv.org/pdf/1909.10618v2.pdf | |
PWC | https://paperswithcode.com/paper/why-does-hierarchy-sometimes-work-so-well-in |
Repo | |
Framework | |
Few-shot brain segmentation from weakly labeled data with deep heteroscedastic multi-task networks
Title | Few-shot brain segmentation from weakly labeled data with deep heteroscedastic multi-task networks |
Authors | Richard McKinley, Michael Rebsamen, Raphael Meier, Mauricio Reyes, Christian Rummel, Roland Wiest |
Abstract | In applications of supervised learning to medical image segmentation, the need for large amounts of labeled data typically goes unquestioned. In particular, in the case of brain anatomy segmentation, hundreds or thousands of weakly-labeled volumes are often used as training data. In this paper, we first observe that for many brain structures, a small number of training examples (n=9), weakly labeled using Freesurfer 6.0, plus simple data augmentation, suffice as training data to achieve high performance, achieving an overall mean Dice coefficient of $0.84 \pm 0.12$ compared to Freesurfer over 28 brain structures in T1-weighted images of $\approx 4000$ 9-10 year-olds from the Adolescent Brain Cognitive Development study. We then examine two varieties of heteroscedastic network as a method for improving classification results. An existing proposal by Kendall and Gal, which uses Monte-Carlo inference to learn to predict the variance of each prediction, yields an overall mean Dice of $0.85 \pm 0.14$ and showed statistically significant improvements over 25 brain structures. Meanwhile, a novel heteroscedastic network which directly learns the probability that an example has been mislabeled yielded an overall mean Dice of $0.87 \pm 0.11$ and showed statistically significant improvements over all but one of the brain structures considered. The loss function associated with this network can be interpreted as performing a form of learned label smoothing, where labels are only smoothed where they are judged to be uncertain. |
Tasks | Brain Segmentation, Data Augmentation, Medical Image Segmentation, Semantic Segmentation |
Published | 2019-04-04 |
URL | http://arxiv.org/abs/1904.02436v1 |
http://arxiv.org/pdf/1904.02436v1.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-brain-segmentation-from-weakly |
Repo | |
Framework | |
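The Kendall and Gal style heteroscedastic loss referenced above can be sketched as follows: the network predicts a mean and a log-variance per logit, logits are perturbed by Monte-Carlo sampling, and the loss is the NLL of the averaged softmax. The shapes, the sample count, and the helper name `heteroscedastic_ce` are illustrative; this is not the authors' exact segmentation network or their novel mislabeling-probability variant.

```python
# Monte-Carlo heteroscedastic classification loss in the Kendall & Gal style.
import torch
import torch.nn.functional as F

def heteroscedastic_ce(mean_logits, log_var, target, n_samples=10):
    """mean_logits, log_var: (batch, classes); target: (batch,) class indices."""
    std = torch.exp(0.5 * log_var)
    probs = 0.0
    for _ in range(n_samples):
        noisy_logits = mean_logits + std * torch.randn_like(mean_logits)
        probs = probs + F.softmax(noisy_logits, dim=-1)
    probs = probs / n_samples                  # MC estimate of the class probabilities
    return F.nll_loss(torch.log(probs + 1e-8), target)

# Toy usage with random "network outputs".
mean_logits = torch.randn(4, 3, requires_grad=True)
log_var = torch.zeros(4, 3, requires_grad=True)
target = torch.tensor([0, 2, 1, 1])
loss = heteroscedastic_ce(mean_logits, log_var, target)
loss.backward()
print(loss.item())
```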
Visualising Basins of Attraction for the Cross-Entropy and the Squared Error Neural Network Loss Functions
Title | Visualising Basins of Attraction for the Cross-Entropy and the Squared Error Neural Network Loss Functions |
Authors | Anna Sergeevna Bosman, Andries Engelbrecht, Mardé Helbig |
Abstract | Quantification of the stationary points and the associated basins of attraction of neural network loss surfaces is an important step towards a better understanding of neural network loss surfaces at large. This work proposes a novel method to visualise basins of attraction together with the associated stationary points via gradient-based random sampling. The proposed technique is used to perform an empirical study of the loss surfaces generated by two different error metrics: quadratic loss and entropic loss. The empirical observations confirm the theoretical hypothesis regarding the nature of neural network attraction basins. Entropic loss is shown to exhibit stronger gradients and fewer stationary points than quadratic loss, indicating that entropic loss has a more searchable landscape. Quadratic loss is shown to be more resilient to overfitting than entropic loss. Both losses are shown to exhibit local minima, but the number of local minima is shown to decrease with an increase in dimensionality. Thus, the proposed visualisation technique successfully captures the local minima properties exhibited by the neural network loss surfaces, and can be used for the purpose of fitness landscape analysis of neural networks. |
Tasks | |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02302v2 |
http://arxiv.org/pdf/1901.02302v2.pdf | |
PWC | https://paperswithcode.com/paper/visualising-basins-of-attraction-for-the |
Repo | |
Framework | |
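A minimal sketch of gradient-based random sampling in the spirit of the study above: start walks from random weights, take gradient steps, and log the loss and gradient norm under the entropic and quadratic losses. The tiny network, toy data, step size, and walk length are placeholders for the paper's experimental setup.

```python
# Gradient walks from random starting weights, logging loss and gradient norm.
import torch

torch.manual_seed(0)
X = torch.randn(64, 2)
y = (X[:, 0] * X[:, 1] > 0).float()            # toy binary labels

def sample_walk(loss_fn, n_steps=50, lr=0.1):
    model = torch.nn.Sequential(torch.nn.Linear(2, 8), torch.nn.Tanh(),
                                torch.nn.Linear(8, 1))   # fresh random start
    trace = []
    for _ in range(n_steps):
        loss = loss_fn(model(X).squeeze(-1), y)
        grads = torch.autograd.grad(loss, list(model.parameters()))
        gnorm = torch.sqrt(sum((g ** 2).sum() for g in grads))
        trace.append((loss.item(), gnorm.item()))
        with torch.no_grad():
            for p, g in zip(model.parameters(), grads):
                p -= lr * g                    # plain gradient step
    return trace

entropic = torch.nn.functional.binary_cross_entropy_with_logits
quadratic = lambda logits, target: ((torch.sigmoid(logits) - target) ** 2).mean()

for name, loss_fn in [("entropic", entropic), ("quadratic", quadratic)]:
    final_loss, final_gnorm = sample_walk(loss_fn)[-1]
    print(name, final_loss, final_gnorm)
```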
An Auxiliary Classifier Generative Adversarial Framework for Relation Extraction
Title | An Auxiliary Classifier Generative Adversarial Framework for Relation Extraction |
Authors | Yun Zhao |
Abstract | Relation extraction models suffer from limited qualified training data. Using human annotators to label sentences is too expensive and does not scale well, especially when dealing with large datasets. In this paper, we use Auxiliary Classifier Generative Adversarial Networks (AC-GANs) to generate high-quality relational sentences and to improve the performance of the relation classifier in end-to-end models. In an AC-GAN, the discriminator gives not only a probability distribution over the source (real or generated), but also a probability distribution over the relation labels. This helps to generate meaningful relational sentences. Experimental results show that our proposed data augmentation method significantly improves the performance of relation extraction compared to state-of-the-art methods. |
Tasks | Data Augmentation, Relation Extraction |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.05370v1 |
https://arxiv.org/pdf/1909.05370v1.pdf | |
PWC | https://paperswithcode.com/paper/an-auxiliary-classifier-generative |
Repo | |
Framework | |
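The AC-GAN discriminator described above produces both a real/fake score and a distribution over relation labels, and both terms enter the loss. The PyTorch sketch below shows that two-headed discriminator and its loss on placeholder sentence encodings; the encoder, hidden size, and relation count are assumptions, not the paper's architecture.

```python
# Two-headed AC-GAN discriminator: a source (real/fake) head plus an auxiliary
# relation classifier, with both loss terms combined.
import torch
import torch.nn as nn

n_relations, hidden = 5, 128

class ACGANDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        self.source_head = nn.Linear(hidden, 1)           # real vs. generated
        self.label_head = nn.Linear(hidden, n_relations)  # auxiliary classifier

    def forward(self, x):
        h = self.body(x)
        return self.source_head(h), self.label_head(h)

disc = ACGANDiscriminator()
bce, ce = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss()

real_x = torch.randn(8, hidden)                 # placeholder sentence encodings
fake_x = torch.randn(8, hidden)                 # placeholder generator output
real_labels = torch.randint(0, n_relations, (8,))
fake_labels = torch.randint(0, n_relations, (8,))

real_src, real_cls = disc(real_x)
fake_src, fake_cls = disc(fake_x)
d_loss = (bce(real_src.squeeze(-1), torch.ones(8)) +
          bce(fake_src.squeeze(-1), torch.zeros(8)) +
          ce(real_cls, real_labels) + ce(fake_cls, fake_labels))
print(d_loss.item())                            # discriminator loss: source + label terms
```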
Joint Event and Temporal Relation Extraction with Shared Representations and Structured Prediction
Title | Joint Event and Temporal Relation Extraction with Shared Representations and Structured Prediction |
Authors | Rujun Han, Qiang Ning, Nanyun Peng |
Abstract | We propose a joint event and temporal relation extraction model with shared representation learning and structured prediction. The proposed method has two advantages over existing work. First, it improves event representation by allowing the event and relation modules to share the same contextualized embeddings and neural representation learner. Second, it avoids error propagation in the conventional pipeline systems by leveraging structured inference and learning methods to assign both the event labels and the temporal relation labels jointly. Experiments show that the proposed method can improve both event extraction and temporal relation extraction over state-of-the-art systems, with the end-to-end F1 improved by 10% and 6.8% on two benchmark datasets respectively. |
Tasks | Relation Extraction, Representation Learning, Structured Prediction |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.05360v1 |
https://arxiv.org/pdf/1909.05360v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-event-and-temporal-relation-extraction |
Repo | |
Framework | |
A unified neural network for object detection, multiple object tracking and vehicle re-identification
Title | A unified neural network for object detection, multiple object tracking and vehicle re-identification |
Authors | Yuhao Xu, Jiakui Wang |
Abstract | Deep SORT\cite{wojke2017simple} is a tracking-by-detection approach to multiple object tracking with a detector and a re-identification (RE-ID) model. Training and running inference with the two models separately is time-consuming. In this paper, we unify the detector and the RE-ID model into an end-to-end network by adding an additional track branch for tracking to the Faster RCNN architecture. With a unified network, we are able to train the whole model end-to-end with a multi-task loss, which has shown much benefit in other recent works. The RE-ID model in Deep SORT needs deep CNNs to extract feature maps from detected object images; in contrast, the track branch in our proposed network directly makes use of the RoI feature vectors in the Faster RCNN baseline, which reduces the amount of computation. Since a single image lacks repeated instances of the same object, which are necessary when we use the triplet loss to optimize the track branch, we concatenate neighbouring frames in a video to construct our training dataset. We have trained and evaluated our model on the AIC19 vehicle tracking dataset; experiments show that our model with a ResNet-101 backbone achieves 57.79% mAP and tracks vehicles well. |
Tasks | Multiple Object Tracking, Object Detection, Object Tracking, Vehicle Re-Identification |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.03465v1 |
https://arxiv.org/pdf/1907.03465v1.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-neural-network-for-object-detection |
Repo | |
Framework | |
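The track branch above is trained with a triplet loss on RoI feature vectors gathered from neighbouring frames. A small sketch of that loss follows; the feature dimension, margin, and the random "RoI features" are placeholders for what the Faster R-CNN backbone would actually produce.

```python
# Triplet loss on (placeholder) RoI feature vectors.
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.3):
    """anchor/positive: same vehicle in neighbouring frames; negative: another vehicle."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

roi_dim = 256
anchor = torch.randn(16, roi_dim)
positive = anchor + 0.05 * torch.randn(16, roi_dim)   # same identity, next frame
negative = torch.randn(16, roi_dim)                   # different identity
print(triplet_loss(anchor, positive, negative).item())
```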
Re-learning of Child Model for Misclassified data by using KL Divergence in AffectNet: A Database for Facial Expression
Title | Re-learning of Child Model for Misclassified data by using KL Divergence in AffectNet: A Database for Facial Expression |
Authors | Takumi Ichimura, Shin Kamada |
Abstract | AffectNet contains more than 1,000,000 facial images that are manually annotated for the presence of eight discrete facial expressions and the intensity of valence and arousal. The adaptive structural learning method of DBN (Adaptive DBN) ranks among the top deep learning models in classification capability on several large image benchmark databases. A Convolutional Neural Network and an Adaptive DBN were trained on AffectNet and their classification capability was compared; Adaptive DBN showed the higher classification ratio. However, the model could not classify some test cases correctly, because human emotions contain many ambiguous features or patterns that lead to wrong answers and may act as a factor of adversarial examples, since two or more annotators give different subjective judgments for the same image. In order to distinguish such cases, this paper investigates a re-learning model of Adaptive DBN with two or more child models, where the original trained model is seen as a parent model and new child models are generated for misclassified cases. In addition, an appropriate child model is generated according to the difference between the two models, measured by KL divergence. The generated child models showed better performance in classifying two emotion categories: 'Disgust' and 'Anger'. |
Tasks | |
Published | 2019-09-30 |
URL | https://arxiv.org/abs/1909.13481v1 |
https://arxiv.org/pdf/1909.13481v1.pdf | |
PWC | https://paperswithcode.com/paper/re-learning-of-child-model-for-misclassified |
Repo | |
Framework | |
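The re-learning scheme above compares models via KL divergence between their predicted emotion distributions. The snippet below computes that divergence for a parent and two hypothetical child models and picks the closer child; routing by minimum KL is an illustrative reading of the abstract, and the distributions are made up rather than Adaptive DBN outputs.

```python
# KL divergence between a parent model's and two child models' predictions.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    p, q = np.asarray(p, float) + eps, np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

parent  = [0.05, 0.10, 0.40, 0.35, 0.05, 0.02, 0.02, 0.01]   # 8 emotion classes
child_a = [0.02, 0.05, 0.55, 0.30, 0.03, 0.02, 0.02, 0.01]
child_b = [0.30, 0.30, 0.10, 0.10, 0.10, 0.05, 0.03, 0.02]

divergences = {"child_a": kl_divergence(parent, child_a),
               "child_b": kl_divergence(parent, child_b)}
print(divergences)
print("closest child:", min(divergences, key=divergences.get))
```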
Intrinsic Motivation Driven Intuitive Physics Learning using Deep Reinforcement Learning with Intrinsic Reward Normalization
Title | Intrinsic Motivation Driven Intuitive Physics Learning using Deep Reinforcement Learning with Intrinsic Reward Normalization |
Authors | JaeWon Choi, Sung-eui Yoon |
Abstract | At an early age, human infants are able to learn and build a model of the world very quickly by constantly observing and interacting with objects around them. One of the most fundamental intuitions human infants acquire is intuitive physics. Human infants learn and develop these models, which later serve as prior knowledge for further learning. Inspired by such behaviors exhibited by human infants, we introduce a graphical physics network integrated with deep reinforcement learning. Specifically, we introduce an intrinsic reward normalization method that allows our agent to efficiently choose actions that can improve its intuitive physics model the most. Using a 3D physics engine, we show that our graphical physics network is able to infer objects’ positions and velocities very effectively, and our deep reinforcement learning network encourages an agent to improve its model by making it continuously interact with objects using only intrinsic motivation. We evaluate our model on both stationary and non-stationary state problems and show the benefits of our approach in terms of the number of different actions the agent performs and the accuracy of the agent’s intuition model. Videos are at https://www.youtube.com/watch?v=pDbByp91r3M&t=2s |
Tasks | |
Published | 2019-07-06 |
URL | https://arxiv.org/abs/1907.03116v1 |
https://arxiv.org/pdf/1907.03116v1.pdf | |
PWC | https://paperswithcode.com/paper/intrinsic-motivation-driven-intuitive-physics |
Repo | |
Framework | |
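One plausible reading of the intrinsic reward normalization mentioned above is to keep running statistics of the intrinsic signal (for example, the physics model's prediction error) and standardize it before it reaches the agent. The sketch below does exactly that with Welford's online update; the paper's exact normalization may differ.

```python
# Running standardization of an intrinsic reward signal via Welford's update.
import numpy as np

class RunningNormalizer:
    def __init__(self, eps=1e-8):
        self.count, self.mean, self.m2, self.eps = 0, 0.0, 0.0, eps

    def update(self, x):
        # Welford's online update of the mean and (unnormalized) variance
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def normalize(self, x):
        std = np.sqrt(self.m2 / max(self.count - 1, 1)) + self.eps
        return (x - self.mean) / std

norm = RunningNormalizer()
prediction_errors = np.abs(np.random.default_rng(0).normal(2.0, 0.5, size=100))
for e in prediction_errors:
    norm.update(e)
print([round(float(norm.normalize(e)), 2) for e in prediction_errors[:5]])
```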
Competitive Experience Replay
Title | Competitive Experience Replay |
Authors | Hao Liu, Alexander Trott, Richard Socher, Caiming Xiong |
Abstract | Deep learning has achieved remarkable successes in solving challenging reinforcement learning (RL) problems when a dense reward function is provided. However, in sparse reward environments it still often suffers from the need to carefully shape the reward function to guide policy optimization. This limits the applicability of RL in the real world since both reinforcement learning and domain-specific knowledge are required. It is therefore of great practical importance to develop algorithms which can learn from a binary signal indicating successful task completion or other unshaped, sparse reward signals. We propose a novel method called competitive experience replay, which efficiently supplements a sparse reward by placing learning in the context of an exploration competition between a pair of agents. Our method complements the recently proposed hindsight experience replay (HER) by inducing an automatic exploratory curriculum. We evaluate our approach on the tasks of reaching various goal locations in an ant maze and manipulating objects with a robotic arm. Each task provides only binary rewards indicating whether or not the goal is achieved. Our method asymmetrically augments these sparse rewards for a pair of agents each learning the same task, creating a competitive game designed to drive exploration. Extensive experiments demonstrate that this method leads to faster convergence and improved task performance. |
Tasks | |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.00528v4 |
http://arxiv.org/pdf/1902.00528v4.pdf | |
PWC | https://paperswithcode.com/paper/competitive-experience-replay |
Repo | |
Framework | |
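As we read the abstract above, competitive experience replay augments replayed sparse rewards based on overlap between the two agents' visited states. The toy sketch below re-labels one agent's transitions with a penalty when its states fall near states covered by the other agent; the sign convention, the distance threshold, and the helper `competitive_relabel` are our assumptions, not taken verbatim from the paper.

```python
# Toy reward re-labelling based on overlap between two agents' visited states.
import numpy as np

def competitive_relabel(batch_a, states_b, penalty=1.0, radius=0.5):
    """Augment agent A's replayed rewards based on overlap with agent B's states."""
    relabelled = []
    for state, reward in batch_a:
        near_b = any(np.linalg.norm(state - s_b) < radius for s_b in states_b)
        relabelled.append((state, reward - penalty if near_b else reward))
    return relabelled

rng = np.random.default_rng(0)
states_b = [rng.uniform(-1, 1, size=2) for _ in range(20)]      # states visited by B
batch_a = [(rng.uniform(-1, 1, size=2), 0.0) for _ in range(5)] # A's sparse-reward batch
print(competitive_relabel(batch_a, states_b))
```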