October 16, 2019

Paper Group ANR 1084

Machine learning based hyperspectral image analysis: A survey


Title	Machine learning based hyperspectral image analysis: A survey
Authors	Utsav B. Gewali, Sildomar T. Monteiro, Eli Saber
Abstract	Hyperspectral sensors enable the study of the chemical properties of scene materials remotely for the purpose of identification, detection, and chemical composition analysis of objects in the environment. Hence, hyperspectral images captured from earth observing satellites and aircraft have been increasingly important in agriculture, environmental monitoring, urban planning, mining, and defense. Machine learning algorithms, due to their outstanding predictive power, have become a key tool for modern hyperspectral image analysis. Therefore, a solid understanding of machine learning techniques has become essential for remote sensing researchers and practitioners. This paper reviews and compares recent machine learning-based hyperspectral image analysis methods published in the literature. We organize the methods by the image analysis task and by the type of machine learning algorithm, and present a two-way mapping between the image analysis tasks and the types of machine learning algorithms that can be applied to them. The paper is comprehensive in coverage of both hyperspectral image analysis tasks and machine learning algorithms. The image analysis tasks considered are land cover classification, target detection, unmixing, and physical parameter estimation. The machine learning algorithms covered are Gaussian models, linear regression, logistic regression, support vector machines, Gaussian mixture models, latent linear models, sparse linear models, ensemble learning, directed graphical models, undirected graphical models, clustering, Gaussian processes, Dirichlet processes, and deep learning. We also discuss the open challenges in the field of hyperspectral image analysis and explore possible future directions.
Tasks	Gaussian Processes
Published	2018-02-23
URL	http://arxiv.org/abs/1802.08701v2
PDF	http://arxiv.org/pdf/1802.08701v2.pdf
PWC	https://paperswithcode.com/paper/machine-learning-based-hyperspectral-image
Repo
Framework

The Greedy Dirichlet Process Filter - An Online Clustering Multi-Target Tracker


Title	The Greedy Dirichlet Process Filter - An Online Clustering Multi-Target Tracker
Authors	Benjamin Naujoks, Patrick Burger, Hans-Joachim Wuensche
Abstract	Reliable collision avoidance is one of the main requirements for autonomous driving. Hence, it is important to correctly estimate the states of an unknown number of static and dynamic objects in real-time. Here, data association is a major challenge for every multi-target tracker. We propose a novel multi-target tracker called Greedy Dirichlet Process Filter (GDPF) based on the non-parametric Bayesian model called Dirichlet Processes and the fast posterior computation algorithm Sequential Updating and Greedy Search (SUGS). By adding a temporal dependence, we get a real-time capable tracking framework without the need for a prior clustering or data association step. Real-world tests show that GDPF outperforms other multi-target trackers in terms of accuracy and stability.
Tasks	Autonomous Driving
Published	2018-11-14
URL	http://arxiv.org/abs/1811.05911v2
PDF	http://arxiv.org/pdf/1811.05911v2.pdf
PWC	https://paperswithcode.com/paper/the-greedy-dirichlet-process-filter-an-online
Repo
Framework

Generative Low-Shot Network Expansion


Title	Generative Low-Shot Network Expansion
Authors	Adi Hayat, Mark Kliger, Shachar Fleishman, Daniel Cohen-Or
Abstract	Conventional deep learning classifiers are static in the sense that they are trained on a predefined set of classes, and learning to classify a novel class typically requires re-training. In this work, we address the problem of Low-Shot network expansion learning. We introduce a learning framework which enables expanding a pre-trained (base) deep network to classify novel classes when the number of examples for the novel classes is particularly small. We present a simple yet powerful hard distillation method where the base network is augmented with additional weights to classify the novel classes, while keeping the weights of the base network unchanged. We show that since only a small number of weights needs to be trained, the hard distillation excels in low-shot training scenarios. Furthermore, hard distillation avoids detriment to classification performance on the base classes. Finally, we show that low-shot network expansion can be done with a very small memory footprint by using a compact generative model of the base classes' training data with only a negligible degradation relative to learning with the full training set.
Tasks
Published	2018-10-19
URL	http://arxiv.org/abs/1810.08363v1
PDF	http://arxiv.org/pdf/1810.08363v1.pdf
PWC	https://paperswithcode.com/paper/generative-low-shot-network-expansion
Repo
Framework

Multi-Task Learning for Extraction of Adverse Drug Reaction Mentions from Tweets


Title	Multi-Task Learning for Extraction of Adverse Drug Reaction Mentions from Tweets
Authors	Shashank Gupta, Manish Gupta, Vasudeva Varma, Sachin Pawar, Nitin Ramrakhiyani, Girish K. Palshikar
Abstract	Adverse drug reactions (ADRs) are one of the leading causes of mortality in health care. Current ADR surveillance systems are often associated with a substantial time lag before such events are officially published. On the other hand, online social media such as Twitter contain information about ADR events in real-time, much before any official reporting. The current state of the art in ADR mention extraction uses Recurrent Neural Networks (RNN), which typically need large labeled corpora. Towards this end, we propose a multi-task learning based method which can utilize a similar auxiliary task (adverse drug event detection) to enhance the performance of the main task, i.e., ADR extraction. Furthermore, in the absence of an auxiliary task dataset, we propose a novel joint multi-task learning method to automatically generate a weak supervision dataset for the auxiliary task when a large pool of unlabeled tweets is available. Experiments with 0.48M tweets show that the proposed approach outperforms the state-of-the-art methods for the ADR mention extraction task by 7.2% in terms of F1 score.
Tasks	Multi-Task Learning
Published	2018-02-14
URL	http://arxiv.org/abs/1802.05130v1
PDF	http://arxiv.org/pdf/1802.05130v1.pdf
PWC	https://paperswithcode.com/paper/multi-task-learning-for-extraction-of-adverse
Repo
Framework

Contextual Bandits with Cross-learning


Title	Contextual Bandits with Cross-learning
Authors	Santiago Balseiro, Negin Golrezaei, Mohammad Mahdian, Vahab Mirrokni, Jon Schneider
Abstract	In the classical contextual bandits problem, in each round $t$, a learner observes some context $c$, chooses some action $a$ to perform, and receives some reward $r_{a,t}(c)$. We consider the variant of this problem where in addition to receiving the reward $r_{a,t}(c)$, the learner also learns the values of $r_{a,t}(c')$ for all other contexts $c'$; i.e., the rewards that would have been achieved by performing that action under different contexts. This variant arises in several strategic settings, such as learning how to bid in non-truthful repeated auctions (in this setting the context is the decision maker’s private valuation for each auction). We call this problem the contextual bandits problem with cross-learning. The best algorithms for the classical contextual bandits problem achieve $\tilde{O}(\sqrt{CKT})$ regret against all stationary policies, where $C$ is the number of contexts, $K$ the number of actions, and $T$ the number of rounds. We demonstrate algorithms for the contextual bandits problem with cross-learning that remove the dependence on $C$ and achieve regret $O(\sqrt{KT})$ (when contexts are stochastic with known distribution), $\tilde{O}(K^{1/3}T^{2/3})$ (when contexts are stochastic with unknown distribution), and $\tilde{O}(\sqrt{KT})$ (when contexts are adversarial but rewards are stochastic).
Tasks	Multi-Armed Bandits
Published	2018-09-25
URL	https://arxiv.org/abs/1809.09582v2
PDF	https://arxiv.org/pdf/1809.09582v2.pdf
PWC	https://paperswithcode.com/paper/contextual-bandits-with-cross-learning
Repo
Framework
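
The cross-learning feedback described in the abstract above can be illustrated with a small simulation. This is a minimal sketch and not the authors' algorithm: it uses a plain UCB-style index, independent Bernoulli rewards, and uniformly random contexts, all of which are assumptions made for illustration. The names `cross_learning_ucb` and `mean_reward` are hypothetical. The one feature taken from the abstract is that playing action $a$ reveals $r_{a,t}(c')$ for every context $c'$, so a single pull count per action is shared by all contexts.

```python
import math
import random


def cross_learning_ucb(num_contexts, num_actions, horizon, mean_reward, seed=0):
    """UCB-style learner with cross-learning feedback (illustrative only).

    mean_reward[c][a] is the (hypothetical) Bernoulli mean of action a in context c.
    Returns the per-action pull counts and the total reward earned.
    """
    rng = random.Random(seed)
    pulls = [0] * num_actions  # shared across contexts
    totals = [[0.0] * num_actions for _ in range(num_contexts)]
    earned = 0.0

    for t in range(1, horizon + 1):
        context = rng.randrange(num_contexts)  # assumed: uniform random contexts

        untried = [a for a in range(num_actions) if pulls[a] == 0]
        if untried:
            action = untried[0]
        else:
            # Index = empirical mean in this context + exploration bonus.
            action = max(
                range(num_actions),
                key=lambda a: totals[context][a] / pulls[a]
                + math.sqrt(2.0 * math.log(t) / pulls[a]),
            )

        # Cross-learning: the reward of the chosen action is revealed
        # for every context, not only for the one that occurred.
        for c in range(num_contexts):
            reward = 1.0 if rng.random() < mean_reward[c][action] else 0.0
            totals[c][action] += reward
            if c == context:
                earned += reward
        pulls[action] += 1

    return pulls, earned


if __name__ == "__main__":
    means = [[0.2, 0.8, 0.5], [0.7, 0.3, 0.5]]  # 2 contexts, 3 actions
    pulls, earned = cross_learning_ucb(2, 3, 2000, means)
    print(pulls, earned)
```

Because each pull updates the estimates for all contexts at once, the exploration bonus depends only on the number of actions and rounds, which is the intuition behind regret bounds that do not grow with $C$.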

Deep Variational Reinforcement Learning for POMDPs


Title	Deep Variational Reinforcement Learning for POMDPs
Authors	Maximilian Igl, Luisa Zintgraf, Tuan Anh Le, Frank Wood, Shimon Whiteson
Abstract	Many real-world sequential decision making problems are partially observable by nature, and the environment model is typically unknown. Consequently, there is great need for reinforcement learning methods that can tackle such problems given only a stream of incomplete and noisy observations. In this paper, we propose deep variational reinforcement learning (DVRL), which introduces an inductive bias that allows an agent to learn a generative model of the environment and perform inference in that model to effectively aggregate the available information. We develop an n-step approximation to the evidence lower bound (ELBO), allowing the model to be trained jointly with the policy. This ensures that the latent state representation is suitable for the control task. In experiments on Mountain Hike and flickering Atari we show that our method outperforms previous approaches relying on recurrent neural networks to encode the past.
Tasks	Decision Making
Published	2018-06-06
URL	http://arxiv.org/abs/1806.02426v1
PDF	http://arxiv.org/pdf/1806.02426v1.pdf
PWC	https://paperswithcode.com/paper/deep-variational-reinforcement-learning-for
Repo
Framework

Logistic Ensemble Models


Title	Logistic Ensemble Models
Authors	Bob Vanderheyden, Jennifer Priestley
Abstract	Predictive models that are developed in a regulated industry or a regulated application, like determination of credit worthiness, must be interpretable and rational (e.g., meaningful improvements in basic credit behavior must result in improved credit worthiness scores). Machine Learning technologies provide very good performance with minimal analyst intervention, making them well suited to a high volume analytic environment, but the majority are black box tools that provide very limited insight or interpretability into key drivers of model performance or predicted model output values. This paper presents a methodology that blends one of the most popular predictive statistical modeling methods for binary classification with a core model enhancement strategy found in machine learning. The resulting prediction methodology provides solid performance, from minimal analyst effort, while providing the interpretability and rationality required in regulated industries, as well as in other environments where interpretation of model parameters is required (e.g. businesses that require interpretation of models, to take action on them).
Tasks
Published	2018-06-12
URL	http://arxiv.org/abs/1806.04555v1
PDF	http://arxiv.org/pdf/1806.04555v1.pdf
PWC	https://paperswithcode.com/paper/logistic-ensemble-models
Repo
Framework

Deep learning based inverse method for layout design


Title	Deep learning based inverse method for layout design
Authors	Yujie Zhang, Wenjing Ye
Abstract	Layout design with complex constraints is a challenging problem to solve due to the non-uniqueness of the solution and the difficulties in incorporating the constraints into the conventional optimization-based methods. In this paper, we propose a design method based on the recently developed machine learning technique, Variational Autoencoder (VAE). We utilize the learning capability of the VAE to learn the constraints and the generative capability of the VAE to generate design candidates that automatically satisfy all the constraints. As such, no constraints need to be imposed during the design stage. In addition, we show that the VAE network is also capable of learning the underlying physics of the design problem, leading to an efficient design tool that does not need any physical simulation once the network is constructed. We demonstrated the performance of the method on two cases: inverse design of surface diffusion induced morphology change and mask design for optical microlithography.
Tasks
Published	2018-06-07
URL	http://arxiv.org/abs/1806.03182v1
PDF	http://arxiv.org/pdf/1806.03182v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-based-inverse-method-for-layout
Repo
Framework

Action Recognition with Spatio-Temporal Visual Attention on Skeleton Image Sequences


Title	Action Recognition with Spatio-Temporal Visual Attention on Skeleton Image Sequences
Authors	Zhengyuan Yang, Yuncheng Li, Jianchao Yang, Jiebo Luo
Abstract	Action recognition with 3D skeleton sequences is becoming popular due to its speed and robustness. The recently proposed Convolutional Neural Networks (CNN) based methods have shown good performance in learning spatio-temporal representations for skeleton sequences. Despite the good recognition accuracy achieved by previous CNN based methods, there exist two problems that potentially limit the performance. First, previous skeleton representations are generated by chaining joints with a fixed order. The corresponding semantic meaning is unclear and the structural information among the joints is lost. Second, previous models do not have the ability to focus on informative joints. The attention mechanism is important for skeleton based action recognition because there exist spatio-temporal key stages while the joint predictions can be inaccurate. To solve these two problems, we propose a novel CNN based method for skeleton based action recognition. We first redesign the skeleton representations with a depth-first tree traversal order, which enhances the semantic meaning of skeleton images and better preserves the associated structural information. We then propose the idea of a two-branch attention architecture that focuses on spatio-temporal key stages and filters out unreliable joint predictions. A base attention model with the simplest structure is first introduced. By improving the structures in both branches, we further propose a Global Long-sequence Attention Network (GLAN). Furthermore, in order to adjust the kernel’s spatio-temporal aspect ratios and better capture long term dependencies, we propose a Sub-Sequence Attention Network (SSAN) that takes sub-image sequences as inputs. Our experimental results on NTU RGB+D and SBU Kinect Interaction outperform the state-of-the-art. The model is further validated on noisy estimated poses from UCF101 and Kinetics.
Tasks	Skeleton Based Action Recognition, Temporal Action Localization
Published	2018-01-31
URL	http://arxiv.org/abs/1801.10304v2
PDF	http://arxiv.org/pdf/1801.10304v2.pdf
PWC	https://paperswithcode.com/paper/action-recognition-with-spatio-temporal
Repo
Framework
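
The depth-first tree traversal ordering mentioned in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the six-joint `SKELETON` is a hypothetical toy tree (real datasets such as NTU RGB+D use more joints), and revisiting the parent joint when the traversal returns from a child is an assumption chosen so that neighbouring entries in the sequence are always physically connected joints.

```python
def traversal_order(tree, root):
    """Depth-first joint ordering that revisits the parent after each child.

    tree maps a joint name to the list of its child joints.
    """
    order = [root]
    for child in tree.get(root, []):
        order.extend(traversal_order(tree, child))
        order.append(root)  # step back to the parent before the next branch
    return order


# Hypothetical toy skeleton, for illustration only.
SKELETON = {
    "spine": ["head", "left_arm", "right_arm"],
    "left_arm": ["left_hand"],
    "right_arm": ["right_hand"],
}

if __name__ == "__main__":
    print(traversal_order(SKELETON, "spine"))
    # ['spine', 'head', 'spine', 'left_arm', 'left_hand', 'left_arm',
    #  'spine', 'right_arm', 'right_hand', 'right_arm', 'spine']
```

Stacking the joint coordinates in this order, one column per frame, gives an image in which adjacent rows correspond to adjacent joints, unlike a fixed arbitrary joint index order.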

Neural probabilistic motor primitives for humanoid control


Title	Neural probabilistic motor primitives for humanoid control
Authors	Josh Merel, Leonard Hasenclever, Alexandre Galashov, Arun Ahuja, Vu Pham, Greg Wayne, Yee Whye Teh, Nicolas Heess
Abstract	We focus on the problem of learning a single motor module that can flexibly express a range of behaviors for the control of high-dimensional physically simulated humanoids. To do this, we propose a motor architecture that has the general structure of an inverse model with a latent-variable bottleneck. We show that it is possible to train this model entirely offline to compress thousands of expert policies and learn a motor primitive embedding space. The trained neural probabilistic motor primitive system can perform one-shot imitation of whole-body humanoid behaviors, robustly mimicking unseen trajectories. Additionally, we demonstrate that it is also straightforward to train controllers to reuse the learned motor primitive space to solve tasks, and the resulting movements are relatively naturalistic. To support the training of our model, we compare two approaches for offline policy cloning, including an experience efficient method which we call linear feedback policy cloning. We encourage readers to view a supplementary video ( https://youtu.be/CaDEf-QcKwA ) summarizing our results.
Tasks
Published	2018-11-28
URL	http://arxiv.org/abs/1811.11711v2
PDF	http://arxiv.org/pdf/1811.11711v2.pdf
PWC	https://paperswithcode.com/paper/neural-probabilistic-motor-primitives-for
Repo
Framework

Exploring the Vulnerability of Single Shot Module in Object Detectors via Imperceptible Background Patches


Title	Exploring the Vulnerability of Single Shot Module in Object Detectors via Imperceptible Background Patches
Authors	Yuezun Li, Xiao Bian, Ming-ching Chang, Siwei Lyu
Abstract	Recent works have succeeded in generating adversarial perturbations on the entire image or the objects of interest to corrupt CNN based object detectors. In this paper, we focus on exploring the vulnerability of the Single Shot Module (SSM) commonly used in recent object detectors, by adding small perturbations to patches in the background outside the object. The SSM refers to the Region Proposal Network used in a two-stage object detector or the single-stage object detector itself. The SSM is typically a fully convolutional neural network which generates output in a single forward pass. Due to the excessive convolutions used in SSM, the actual receptive field is larger than the object itself. As such, we propose a novel method to corrupt object detectors by generating imperceptible patches only in the background. Our method can find a few background patches for perturbation, which can effectively decrease true positives and dramatically increase false positives. Efficacy is demonstrated on 5 two-stage object detectors and 8 single-stage object detectors on the MS COCO 2014 dataset. Results indicate that perturbations with small distortions outside the bounding box of the object region can still severely damage the detection performance.
Tasks
Published	2018-09-16
URL	https://arxiv.org/abs/1809.05966v3
PDF	https://arxiv.org/pdf/1809.05966v3.pdf
PWC	https://paperswithcode.com/paper/exploring-the-vulnerability-of-single-shot
Repo
Framework

Large Multistream Data Analytics for Monitoring and Diagnostics in Manufacturing Systems


Title	Large Multistream Data Analytics for Monitoring and Diagnostics in Manufacturing Systems
Authors	Samaneh Ebrahimi, Chitta Ranjan, Kamran Paynabar
Abstract	The high dimensionality and volume of large scale multistream data have inhibited significant research progress in developing an integrated monitoring and diagnostics (M&D) approach. This data, also categorized as big data, is becoming common in manufacturing plants. In this paper, we propose an integrated M&D approach for large scale streaming data. We developed a novel monitoring method named Adaptive Principal Component monitoring (APC) which adaptively chooses PCs that are most likely to vary due to the change for early detection. Importantly, we integrate a novel diagnostic approach, Principal Component Signal Recovery (PCSR), to enable a streamlined SPC. This diagnostics approach draws inspiration from Compressed Sensing and uses Adaptive Lasso for identifying the sparse change in the process. We theoretically motivate our approaches and evaluate the performance of our integrated M&D method through simulations and case studies.
Tasks
Published	2018-12-26
URL	http://arxiv.org/abs/1812.10430v1
PDF	http://arxiv.org/pdf/1812.10430v1.pdf
PWC	https://paperswithcode.com/paper/large-multistream-data-analytics-for
Repo
Framework

Gaussian Process Accelerated Feldman-Cousins Approach for Physical Parameter Inference


Title	Gaussian Process Accelerated Feldman-Cousins Approach for Physical Parameter Inference
Authors	Lingge Li, Nitish Nayak, Jianming Bian, Pierre Baldi
Abstract	The unified approach of Feldman and Cousins allows for exact statistical inference of small signals that commonly arise in high energy physics. It has gained widespread use, for instance, in measurements of neutrino oscillation parameters in long-baseline experiments. However, the approach relies on the Neyman construction of the classical confidence interval and is computationally intensive as it is typically done in a grid-based fashion over the entire parameter space. In this letter, we propose an efficient algorithm for the Feldman-Cousins approach using Gaussian processes to construct confidence intervals iteratively. We show that in the neutrino oscillation context, one can obtain confidence intervals 5 times faster in one dimension and 10 times faster in two dimensions, while maintaining an accuracy above 99.5%.
Tasks	Gaussian Processes
Published	2018-11-16
URL	https://arxiv.org/abs/1811.07050v3
PDF	https://arxiv.org/pdf/1811.07050v3.pdf
PWC	https://paperswithcode.com/paper/efficient-neutrino-oscillation-parameter
Repo
Framework
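
For readers unfamiliar with the grid-based construction that the abstract above describes as computationally intensive, here is a minimal sketch of the standard Feldman-Cousins procedure for a Poisson counting experiment with known background. It is not the paper's Gaussian process method, and the neutrino oscillation likelihood is replaced by a toy model; the function names, the grid spacing, and the background value in the usage example are assumptions made for illustration.

```python
import math


def poisson_pmf(n, lam):
    """Poisson probability of observing n counts with mean lam."""
    if lam == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))


def acceptance_region(mu, b, cl=0.90, n_max=100):
    """Counts accepted for signal mu and known background b > 0.

    Counts are ranked by the likelihood ratio P(n | mu + b) / P(n | mu_best + b),
    with mu_best = max(0, n - b), and added until the coverage reaches cl.
    """
    ranked = []
    for n in range(n_max + 1):
        p = poisson_pmf(n, mu + b)
        p_best = poisson_pmf(n, max(0.0, n - b) + b)
        ranked.append((p / p_best, n, p))
    ranked.sort(reverse=True)

    accepted, coverage = set(), 0.0
    for _, n, p in ranked:
        accepted.add(n)
        coverage += p
        if coverage >= cl:
            break
    return accepted, coverage


def confidence_interval(n_obs, b, cl=0.90, mu_max=15.0, step=0.05):
    """Scan a grid of mu values and keep those whose region contains n_obs."""
    grid = [i * step for i in range(int(round(mu_max / step)) + 1)]
    inside = [mu for mu in grid if n_obs in acceptance_region(mu, b, cl)[0]]
    return min(inside), max(inside)


if __name__ == "__main__":
    print(acceptance_region(0.5, 3.0))
    print(confidence_interval(3, 3.0))
```

Every grid point requires its own acceptance region, and in a realistic analysis each region is built from many simulated pseudo-experiments rather than a closed-form probability, which is the cost an iterative, surrogate-guided search aims to reduce.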

Sentylic at IEST 2018: Gated Recurrent Neural Network and Capsule Network Based Approach for Implicit Emotion Detection


Title	Sentylic at IEST 2018: Gated Recurrent Neural Network and Capsule Network Based Approach for Implicit Emotion Detection
Authors	Prabod Rathnayaka, Supun Abeysinghe, Chamod Samarajeewa, Isura Manchanayake, Malaka Walpola
Abstract	In this paper, we present the system we have used for the WASSA 2018 Implicit Emotion Shared Task. The task is to predict the emotion of a tweet from which the explicit mentions of emotion terms have been removed. The idea is to come up with a model which has the ability to implicitly identify the emotion expressed given the context words. We have used a Gated Recurrent Neural Network (GRU) and a Capsule Network based model for the task. Pre-trained word embeddings have been utilized to incorporate contextual knowledge about words into the model. The GRU layer learns latent representations using the input word embeddings. The subsequent Capsule Network layer learns high-level features from that hidden representation. The proposed model managed to achieve a macro-F1 score of 0.692.
Tasks	Word Embeddings
Published	2018-09-05
URL	http://arxiv.org/abs/1809.01452v1
PDF	http://arxiv.org/pdf/1809.01452v1.pdf
PWC	https://paperswithcode.com/paper/sentylic-at-iest-2018-gated-recurrent-neural
Repo
Framework

Taylor-based Optimized Recursive Extended Exponential Smoothed Neural Networks Forecasting Method


Title	Taylor-based Optimized Recursive Extended Exponential Smoothed Neural Networks Forecasting Method
Authors	Emna Krichene, Wael Ouarda, Habib Chabchoub, Adel M. Alimi
Abstract	A newly introduced method called Taylor-based Optimized Recursive Extended Exponential Smoothed Neural Networks Forecasting method is applied and extended in this study to forecast numerical values. Unlike traditional forecasting techniques, which forecast only future values, our proposed method provides a new extension that corrects the predicted values by forecasting the estimated error. Experimental results demonstrate that the proposed method achieves high accuracy on both training and testing data and outperforms the state-of-the-art RNN models on the Mackey-Glass, NARMA, Lorenz and Henon map datasets.
Tasks
Published	2018-11-01
URL	http://arxiv.org/abs/1811.00323v1
PDF	http://arxiv.org/pdf/1811.00323v1.pdf
PWC	https://paperswithcode.com/paper/taylor-based-optimized-recursive-extended
Repo
Framework
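
The idea in the abstract above of correcting a forecast by also forecasting its error can be shown with plain exponential smoothing. This sketch is deliberately much simpler than the method in the paper: it has no neural network and no Taylor expansion, and the function names and the smoothing factors `alpha` and `beta` are hypothetical choices for illustration.

```python
def exponential_smoothing(series, alpha):
    """One-step-ahead simple exponential smoothing.

    forecasts[t] is the prediction for series[t]; the final entry is the
    prediction for the next, not yet observed, value.
    """
    forecasts = [series[0]]
    for y in series:
        forecasts.append(alpha * y + (1.0 - alpha) * forecasts[-1])
    return forecasts


def corrected_forecast(series, alpha=0.5, beta=0.5):
    """Return (base, corrected) forecasts for the next value.

    The correction is itself a smoothed forecast of the past one-step errors.
    """
    base = exponential_smoothing(series, alpha)
    errors = [y - f for y, f in zip(series, base)]
    next_error = exponential_smoothing(errors, beta)[-1]
    return base[-1], base[-1] + next_error


if __name__ == "__main__":
    trend = [float(v) for v in range(1, 11)]  # 1, 2, ..., 10; next value is 11
    print(corrected_forecast(trend))
```

On a trending series the base forecast lags behind the data, so its errors are consistently positive; forecasting that error and adding it back moves the prediction closer to the next value.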