Paper Group ANR 1084
Machine learning based hyperspectral image analysis: A survey
Title | Machine learning based hyperspectral image analysis: A survey |
Authors | Utsav B. Gewali, Sildomar T. Monteiro, Eli Saber |
Abstract | Hyperspectral sensors enable the study of the chemical properties of scene materials remotely for the purpose of identification, detection, and chemical composition analysis of objects in the environment. Hence, hyperspectral images captured from earth observing satellites and aircraft have been increasingly important in agriculture, environmental monitoring, urban planning, mining, and defense. Machine learning algorithms, due to their outstanding predictive power, have become a key tool for modern hyperspectral image analysis. Therefore, a solid understanding of machine learning techniques has become essential for remote sensing researchers and practitioners. This paper reviews and compares recent machine learning-based hyperspectral image analysis methods published in the literature. We organize the methods by the image analysis task and by the type of machine learning algorithm, and present a two-way mapping between the image analysis tasks and the types of machine learning algorithms that can be applied to them. The paper is comprehensive in coverage of both hyperspectral image analysis tasks and machine learning algorithms. The image analysis tasks considered are land cover classification, target detection, unmixing, and physical parameter estimation. The machine learning algorithms covered are Gaussian models, linear regression, logistic regression, support vector machines, Gaussian mixture models, latent linear models, sparse linear models, ensemble learning, directed graphical models, undirected graphical models, clustering, Gaussian processes, Dirichlet processes, and deep learning. We also discuss the open challenges in the field of hyperspectral image analysis and explore possible future directions. |
Tasks | Gaussian Processes |
Published | 2018-02-23 |
URL | http://arxiv.org/abs/1802.08701v2 |
http://arxiv.org/pdf/1802.08701v2.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-based-hyperspectral-image |
Repo | |
Framework | |
The Greedy Dirichlet Process Filter - An Online Clustering Multi-Target Tracker
Title | The Greedy Dirichlet Process Filter - An Online Clustering Multi-Target Tracker |
Authors | Benjamin Naujoks, Patrick Burger, Hans-Joachim Wuensche |
Abstract | Reliable collision avoidance is one of the main requirements for autonomous driving. Hence, it is important to correctly estimate the states of an unknown number of static and dynamic objects in real time. Here, data association is a major challenge for every multi-target tracker. We propose a novel multi-target tracker called the Greedy Dirichlet Process Filter (GDPF), based on the non-parametric Bayesian model called Dirichlet Processes and the fast posterior computation algorithm Sequential Updating and Greedy Search (SUGS). By adding a temporal dependence, we get a real-time capable tracking framework without the need for a prior clustering or data association step. Real-world tests show that GDPF outperforms other multi-target trackers in terms of accuracy and stability. |
Tasks | Autonomous Driving |
Published | 2018-11-14 |
URL | http://arxiv.org/abs/1811.05911v2 |
http://arxiv.org/pdf/1811.05911v2.pdf | |
PWC | https://paperswithcode.com/paper/the-greedy-dirichlet-process-filter-an-online |
Repo | |
Framework | |
Generative Low-Shot Network Expansion
Title | Generative Low-Shot Network Expansion |
Authors | Adi Hayat, Mark Kliger, Shachar Fleishman, Daniel Cohen-Or |
Abstract | Conventional deep learning classifiers are static in the sense that they are trained on a predefined set of classes and learning to classify a novel class typically requires re-training. In this work, we address the problem of Low-Shot network expansion learning. We introduce a learning framework which enables expanding a pre-trained (base) deep network to classify novel classes when the number of examples for the novel classes is particularly small. We present a simple yet powerful hard distillation method where the base network is augmented with additional weights to classify the novel classes, while keeping the weights of the base network unchanged. We show that since only a small number of weights needs to be trained, the hard distillation excels in low-shot training scenarios. Furthermore, hard distillation avoids detriment to classification performance on the base classes. Finally, we show that low-shot network expansion can be done with a very small memory footprint by using a compact generative model of the base classes training data with only a negligible degradation relative to learning with the full training set. |
Tasks | |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08363v1 |
http://arxiv.org/pdf/1810.08363v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-low-shot-network-expansion |
Repo | |
Framework | |
Multi-Task Learning for Extraction of Adverse Drug Reaction Mentions from Tweets
Title | Multi-Task Learning for Extraction of Adverse Drug Reaction Mentions from Tweets |
Authors | Shashank Gupta, Manish Gupta, Vasudeva Varma, Sachin Pawar, Nitin Ramrakhiyani, Girish K. Palshikar |
Abstract | Adverse drug reactions (ADRs) are one of the leading causes of mortality in health care. Current ADR surveillance systems are often associated with a substantial time lag before such events are officially published. On the other hand, online social media such as Twitter contain information about ADR events in real time, well before any official reporting. The current state of the art in ADR mention extraction uses Recurrent Neural Networks (RNN), which typically need large labeled corpora. To address this, we propose a multi-task learning based method which can utilize a similar auxiliary task (adverse drug event detection) to enhance the performance of the main task, i.e., ADR extraction. Furthermore, in the absence of an auxiliary task dataset, we propose a novel joint multi-task learning method to automatically generate a weak supervision dataset for the auxiliary task when a large pool of unlabeled tweets is available. Experiments with 0.48M tweets show that the proposed approach outperforms the state-of-the-art methods for the ADR mention extraction task by 7.2% in terms of F1 score. |
Tasks | Multi-Task Learning |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.05130v1 |
http://arxiv.org/pdf/1802.05130v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-learning-for-extraction-of-adverse |
Repo | |
Framework | |
Contextual Bandits with Cross-learning
Title | Contextual Bandits with Cross-learning |
Authors | Santiago Balseiro, Negin Golrezaei, Mohammad Mahdian, Vahab Mirrokni, Jon Schneider |
Abstract | In the classical contextual bandits problem, in each round $t$, a learner observes some context $c$, chooses some action $a$ to perform, and receives some reward $r_{a,t}(c)$. We consider the variant of this problem where in addition to receiving the reward $r_{a,t}(c)$, the learner also learns the values of $r_{a,t}(c')$ for all other contexts $c'$; i.e., the rewards that would have been achieved by performing that action under different contexts. This variant arises in several strategic settings, such as learning how to bid in non-truthful repeated auctions (in this setting the context is the decision maker’s private valuation for each auction). We call this problem the contextual bandits problem with cross-learning. The best algorithms for the classical contextual bandits problem achieve $\tilde{O}(\sqrt{CKT})$ regret against all stationary policies, where $C$ is the number of contexts, $K$ the number of actions, and $T$ the number of rounds. We demonstrate algorithms for the contextual bandits problem with cross-learning that remove the dependence on $C$ and achieve regret $O(\sqrt{KT})$ (when contexts are stochastic with known distribution), $\tilde{O}(K^{1/3}T^{2/3})$ (when contexts are stochastic with unknown distribution), and $\tilde{O}(\sqrt{KT})$ (when contexts are adversarial but rewards are stochastic). |
Tasks | Multi-Armed Bandits |
Published | 2018-09-25 |
URL | https://arxiv.org/abs/1809.09582v2 |
https://arxiv.org/pdf/1809.09582v2.pdf | |
PWC | https://paperswithcode.com/paper/contextual-bandits-with-cross-learning |
Repo | |
Framework | |
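The cross-learning feedback described in the abstract above can be illustrated with a small bookkeeping sketch. This is not one of the paper's algorithms; it only shows why one pull of an action can update the reward estimates for every context at once. The function names, the table layout, and the first-price auction used as the "non-truthful auction" example are all assumptions made for illustration.

```python
def make_stats(num_contexts, num_actions):
    """Running reward sums per (context, action) and one pull count per action."""
    sums = [[0.0] * num_actions for _ in range(num_contexts)]
    pulls = [0] * num_actions
    return sums, pulls


def cross_learning_update(sums, pulls, action, rewards_by_context):
    """Record one pull of `action`.

    rewards_by_context[c] is the reward the chosen action would have earned
    under context c, so every context's estimate for that action is updated.
    """
    for c, reward in enumerate(rewards_by_context):
        sums[c][action] += reward
    pulls[action] += 1


def mean_reward(sums, pulls, context, action):
    """Empirical mean reward of `action` under `context` (0.0 if never pulled)."""
    if pulls[action] == 0:
        return 0.0
    return sums[context][action] / pulls[action]


def first_price_rewards(valuations, bid, won):
    """Assumed example: in a first-price auction the outcome depends only on
    the bid, so the utility under every possible valuation is known."""
    return [(v - bid) if won else 0.0 for v in valuations]


# Usage: contexts are candidate valuations, actions are indices into a bid grid.
valuations = [0.2, 0.5, 1.0]
bids = [0.1, 0.3]
sums, pulls = make_stats(len(valuations), len(bids))
cross_learning_update(sums, pulls, 1, first_price_rewards(valuations, bids[1], won=True))
```

Because the pull count is shared across contexts, the amount of data per action no longer shrinks with the number of contexts, which is the intuition behind regret bounds that do not depend on $C$.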
Deep Variational Reinforcement Learning for POMDPs
Title | Deep Variational Reinforcement Learning for POMDPs |
Authors | Maximilian Igl, Luisa Zintgraf, Tuan Anh Le, Frank Wood, Shimon Whiteson |
Abstract | Many real-world sequential decision making problems are partially observable by nature, and the environment model is typically unknown. Consequently, there is great need for reinforcement learning methods that can tackle such problems given only a stream of incomplete and noisy observations. In this paper, we propose deep variational reinforcement learning (DVRL), which introduces an inductive bias that allows an agent to learn a generative model of the environment and perform inference in that model to effectively aggregate the available information. We develop an n-step approximation to the evidence lower bound (ELBO), allowing the model to be trained jointly with the policy. This ensures that the latent state representation is suitable for the control task. In experiments on Mountain Hike and flickering Atari we show that our method outperforms previous approaches relying on recurrent neural networks to encode the past. |
Tasks | Decision Making |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02426v1 |
http://arxiv.org/pdf/1806.02426v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-variational-reinforcement-learning-for |
Repo | |
Framework | |
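For readers unfamiliar with the evidence lower bound mentioned in the abstract above, the generic single-observation form is shown here. This is the textbook bound, not the paper's n-step approximation, and the symbols ($o$ for an observation, $z$ for the latent state, $\theta$ and $\phi$ for the model and inference parameters) are notation assumed for illustration.

```latex
\log p_\theta(o)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid o)}
  \left[ \log p_\theta(o, z) - \log q_\phi(z \mid o) \right]
```

Maximizing the right-hand side with respect to both $\theta$ and $\phi$ trains the generative model and the approximate inference model together.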
Logistic Ensemble Models
Title | Logistic Ensemble Models |
Authors | Bob Vanderheyden, Jennifer Priestley |
Abstract | Predictive models that are developed in a regulated industry or a regulated application, like determination of credit worthiness, must be interpretable and rational (e.g., meaningful improvements in basic credit behavior must result in improved credit worthiness scores). Machine Learning technologies provide very good performance with minimal analyst intervention, making them well suited to a high volume analytic environment, but the majority are black box tools that provide very limited insight or interpretability into key drivers of model performance or predicted model output values. This paper presents a methodology that blends one of the most popular predictive statistical modeling methods for binary classification with a core model enhancement strategy found in machine learning. The resulting prediction methodology provides solid performance, from minimal analyst effort, while providing the interpretability and rationality required in regulated industries, as well as in other environments where interpretation of model parameters is required (e.g. businesses that require interpretation of models, to take action on them). |
Tasks | |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04555v1 |
http://arxiv.org/pdf/1806.04555v1.pdf | |
PWC | https://paperswithcode.com/paper/logistic-ensemble-models |
Repo | |
Framework | |
Deep learning based inverse method for layout design
Title | Deep learning based inverse method for layout design |
Authors | Yujie Zhang, Wenjing Ye |
Abstract | Layout design with complex constraints is a challenging problem to solve due to the non-uniqueness of the solution and the difficulties in incorporating the constraints into the conventional optimization-based methods. In this paper, we propose a design method based on the recently developed machine learning technique, Variational Autoencoder (VAE). We utilize the learning capability of the VAE to learn the constraints and the generative capability of the VAE to generate design candidates that automatically satisfy all the constraints. As such, no constraints need to be imposed during the design stage. In addition, we show that the VAE network is also capable of learning the underlying physics of the design problem, leading to an efficient design tool that does not need any physical simulation once the network is constructed. We demonstrated the performance of the method on two cases: inverse design of surface diffusion induced morphology change and mask design for optical microlithography. |
Tasks | |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.03182v1 |
http://arxiv.org/pdf/1806.03182v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-based-inverse-method-for-layout |
Repo | |
Framework | |
Action Recognition with Spatio-Temporal Visual Attention on Skeleton Image Sequences
Title | Action Recognition with Spatio-Temporal Visual Attention on Skeleton Image Sequences |
Authors | Zhengyuan Yang, Yuncheng Li, Jianchao Yang, Jiebo Luo |
Abstract | Action recognition with 3D skeleton sequences is becoming popular due to its speed and robustness. The recently proposed Convolutional Neural Networks (CNN) based methods have shown good performance in learning spatio-temporal representations for skeleton sequences. Despite the good recognition accuracy achieved by previous CNN based methods, there exist two problems that potentially limit the performance. First, previous skeleton representations are generated by chaining joints in a fixed order. The corresponding semantic meaning is unclear and the structural information among the joints is lost. Second, previous models lack the ability to focus on informative joints. The attention mechanism is important for skeleton based action recognition because there exist spatio-temporal key stages while the joint predictions can be inaccurate. To solve these two problems, we propose a novel CNN based method for skeleton based action recognition. We first redesign the skeleton representations with a depth-first tree traversal order, which enhances the semantic meaning of skeleton images and better preserves the associated structural information. We then propose the idea of a two-branch attention architecture that focuses on spatio-temporal key stages and filters out unreliable joint predictions. A base attention model with the simplest structure is first introduced. By improving the structures in both branches, we further propose a Global Long-sequence Attention Network (GLAN). Furthermore, in order to adjust the kernel’s spatio-temporal aspect ratios and better capture long term dependencies, we propose a Sub-Sequence Attention Network (SSAN) that takes sub-image sequences as inputs. Our experimental results on NTU RGB+D and SBU Kinect Interaction outperform the state of the art. The model is further validated on noisy estimated poses from UCF101 and Kinetics. |
Tasks | Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1801.10304v2 |
http://arxiv.org/pdf/1801.10304v2.pdf | |
PWC | https://paperswithcode.com/paper/action-recognition-with-spatio-temporal |
Repo | |
Framework | |
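The depth-first joint ordering mentioned in the abstract above can be sketched as a pre-order traversal of the skeleton tree. The toy skeleton, the joint names, and the choice of plain pre-order traversal are assumptions for illustration; the paper's exact joint set and ordering are not given here.

```python
def dfs_joint_order(children, root):
    """Return joints in depth-first (pre-order) order starting from `root`.

    `children` maps a joint name to the list of joints attached to it.
    """
    order = [root]
    for child in children.get(root, []):
        order.extend(dfs_joint_order(children, child))
    return order


# Hypothetical 11-joint skeleton rooted at the spine.
toy_skeleton = {
    "spine": ["neck", "left_hip", "right_hip"],
    "neck": ["head", "left_shoulder", "right_shoulder"],
    "left_shoulder": ["left_hand"],
    "right_shoulder": ["right_hand"],
    "left_hip": ["left_foot"],
    "right_hip": ["right_foot"],
}

joint_order = dfs_joint_order(toy_skeleton, "spine")
```

In this ordering each limb appears as a contiguous run of rows (for example, shoulder followed by hand), so a convolution kernel sliding over the joint axis sees physically connected joints next to each other, unlike an arbitrary fixed index order.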
Neural probabilistic motor primitives for humanoid control
Title | Neural probabilistic motor primitives for humanoid control |
Authors | Josh Merel, Leonard Hasenclever, Alexandre Galashov, Arun Ahuja, Vu Pham, Greg Wayne, Yee Whye Teh, Nicolas Heess |
Abstract | We focus on the problem of learning a single motor module that can flexibly express a range of behaviors for the control of high-dimensional physically simulated humanoids. To do this, we propose a motor architecture that has the general structure of an inverse model with a latent-variable bottleneck. We show that it is possible to train this model entirely offline to compress thousands of expert policies and learn a motor primitive embedding space. The trained neural probabilistic motor primitive system can perform one-shot imitation of whole-body humanoid behaviors, robustly mimicking unseen trajectories. Additionally, we demonstrate that it is also straightforward to train controllers to reuse the learned motor primitive space to solve tasks, and the resulting movements are relatively naturalistic. To support the training of our model, we compare two approaches for offline policy cloning, including an experience efficient method which we call linear feedback policy cloning. We encourage readers to view a supplementary video ( https://youtu.be/CaDEf-QcKwA ) summarizing our results. |
Tasks | |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1811.11711v2 |
http://arxiv.org/pdf/1811.11711v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-probabilistic-motor-primitives-for |
Repo | |
Framework | |
Exploring the Vulnerability of Single Shot Module in Object Detectors via Imperceptible Background Patches
Title | Exploring the Vulnerability of Single Shot Module in Object Detectors via Imperceptible Background Patches |
Authors | Yuezun Li, Xiao Bian, Ming-ching Chang, Siwei Lyu |
Abstract | Recent works have succeeded in generating adversarial perturbations on the entire image or on the objects of interest to corrupt CNN based object detectors. In this paper, we focus on exploring the vulnerability of the Single Shot Module (SSM) commonly used in recent object detectors, by adding small perturbations to patches in the background outside the object. The SSM refers to the Region Proposal Network used in a two-stage object detector or to the single-stage object detector itself. The SSM is typically a fully convolutional neural network which generates output in a single forward pass. Due to the excessive convolutions used in the SSM, the actual receptive field is larger than the object itself. As such, we propose a novel method to corrupt object detectors by generating imperceptible patches only in the background. Our method can find a few background patches for perturbation, which can effectively decrease true positives and dramatically increase false positives. Efficacy is demonstrated on 5 two-stage object detectors and 8 single-stage object detectors on the MS COCO 2014 dataset. Results indicate that perturbations with small distortions outside the bounding box of the object region can still severely damage the detection performance. |
Tasks | |
Published | 2018-09-16 |
URL | https://arxiv.org/abs/1809.05966v3 |
https://arxiv.org/pdf/1809.05966v3.pdf | |
PWC | https://paperswithcode.com/paper/exploring-the-vulnerability-of-single-shot |
Repo | |
Framework | |
Large Multistream Data Analytics for Monitoring and Diagnostics in Manufacturing Systems
Title | Large Multistream Data Analytics for Monitoring and Diagnostics in Manufacturing Systems |
Authors | Samaneh Ebrahimi, Chitta Ranjan, Kamran Paynabar |
Abstract | The high dimensionality and volume of large scale multistream data have inhibited significant research progress in developing an integrated monitoring and diagnostics (M&D) approach. This data, also categorized as big data, is becoming common in manufacturing plants. In this paper, we propose an integrated M&D approach for large scale streaming data. We developed a novel monitoring method named Adaptive Principal Component monitoring (APC) which, for early detection, adaptively chooses the PCs that are most likely to vary due to the change. Importantly, we integrate a novel diagnostic approach, Principal Component Signal Recovery (PCSR), to enable a streamlined SPC. This diagnostics approach draws inspiration from Compressed Sensing and uses Adaptive Lasso for identifying the sparse change in the process. We theoretically motivate our approaches and evaluate the performance of our integrated M&D method through simulations and case studies. |
Tasks | |
Published | 2018-12-26 |
URL | http://arxiv.org/abs/1812.10430v1 |
http://arxiv.org/pdf/1812.10430v1.pdf | |
PWC | https://paperswithcode.com/paper/large-multistream-data-analytics-for |
Repo | |
Framework | |
Gaussian Process Accelerated Feldman-Cousins Approach for Physical Parameter Inference
Title | Gaussian Process Accelerated Feldman-Cousins Approach for Physical Parameter Inference |
Authors | Lingge Li, Nitish Nayak, Jianming Bian, Pierre Baldi |
Abstract | The unified approach of Feldman and Cousins allows for exact statistical inference of small signals that commonly arise in high energy physics. It has gained widespread use, for instance, in measurements of neutrino oscillation parameters in long-baseline experiments. However, the approach relies on the Neyman construction of the classical confidence interval and is computationally intensive as it is typically done in a grid-based fashion over the entire parameter space. In this letter, we propose an efficient algorithm for the Feldman-Cousins approach using Gaussian processes to construct confidence intervals iteratively. We show that in the neutrino oscillation context, one can obtain confidence intervals 5 times faster in one dimension and 10 times faster in two dimensions, while maintaining an accuracy above 99.5%. |
Tasks | Gaussian Processes |
Published | 2018-11-16 |
URL | https://arxiv.org/abs/1811.07050v3 |
https://arxiv.org/pdf/1811.07050v3.pdf | |
PWC | https://paperswithcode.com/paper/efficient-neutrino-oscillation-parameter |
Repo | |
Framework | |
Sentylic at IEST 2018: Gated Recurrent Neural Network and Capsule Network Based Approach for Implicit Emotion Detection
Title | Sentylic at IEST 2018: Gated Recurrent Neural Network and Capsule Network Based Approach for Implicit Emotion Detection |
Authors | Prabod Rathnayaka, Supun Abeysinghe, Chamod Samarajeewa, Isura Manchanayake, Malaka Walpola |
Abstract | In this paper, we present the system we have used for the WASSA 2018 Implicit Emotion Shared Task. The task is to predict the emotion of a tweet from which the explicit mentions of emotion terms have been removed. The idea is to come up with a model which has the ability to implicitly identify the emotion expressed given the context words. We have used a Gated Recurrent Neural Network (GRU) and a Capsule Network based model for the task. Pre-trained word embeddings have been utilized to incorporate contextual knowledge about words into the model. The GRU layer learns latent representations using the input word embeddings. The subsequent Capsule Network layer learns high-level features from that hidden representation. The proposed model managed to achieve a macro-F1 score of 0.692. |
Tasks | Word Embeddings |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01452v1 |
http://arxiv.org/pdf/1809.01452v1.pdf | |
PWC | https://paperswithcode.com/paper/sentylic-at-iest-2018-gated-recurrent-neural |
Repo | |
Framework | |
Taylor-based Optimized Recursive Extended Exponential Smoothed Neural Networks Forecasting Method
Title | Taylor-based Optimized Recursive Extended Exponential Smoothed Neural Networks Forecasting Method |
Authors | Emna Krichene, Wael Ouarda, Habib Chabchoub, Adel M. Alimi |
Abstract | A newly introduced method called the Taylor-based Optimized Recursive Extended Exponential Smoothed Neural Networks Forecasting method is applied and extended in this study to forecast numerical values. Unlike traditional forecasting techniques, which forecast only future values, our proposed method provides a new extension that corrects the predicted values by forecasting the estimated error. Experimental results demonstrate that the proposed method has high accuracy on both training and testing data and outperforms the state-of-the-art RNN models on the Mackey-Glass, NARMA, Lorenz and Henon map datasets. |
Tasks | |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00323v1 |
http://arxiv.org/pdf/1811.00323v1.pdf | |
PWC | https://paperswithcode.com/paper/taylor-based-optimized-recursive-extended |
Repo | |
Framework | |
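The idea in the abstract above of correcting a forecast by also forecasting its error can be sketched with plain simple exponential smoothing. This is not the paper's neural network method; the smoothing model, the function names, and the parameters `alpha` and `beta` are stand-ins assumed for illustration.

```python
def ses_forecasts(series, alpha):
    """Simple exponential smoothing.

    Returns (forecasts, next_forecast), where forecasts[t] is the prediction
    for series[t] made before seeing it, and next_forecast is the prediction
    for the first unseen value.
    """
    level = series[0]
    forecasts = []
    for x in series:
        forecasts.append(level)
        level = alpha * x + (1 - alpha) * level
    return forecasts, level


def corrected_forecast(series, alpha=0.5, beta=0.5):
    """Forecast the next value, then add a forecast of the next error."""
    forecasts, next_value = ses_forecasts(series, alpha)
    # One-step-ahead errors of the base forecaster on the observed data.
    errors = [x - f for x, f in zip(series, forecasts)]
    # Smooth the error sequence to predict the next error.
    _, next_error = ses_forecasts(errors, beta)
    return next_value + next_error


trend = [1.0, 2.0, 3.0, 4.0]
plain = ses_forecasts(trend, 0.5)[1]   # base forecast for the next value
corrected = corrected_forecast(trend)  # base forecast plus forecast error
```

On a trending series the base smoother lags behind, so its errors are consistently positive; adding the forecast error moves the prediction toward the next value of the trend.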