July 27, 2019

Paper Group ANR 472

Variational Inference for Gaussian Process Models with Linear Complexity. Boltzmann Exploration Done Right. Fully Convolutional Neural Networks for Dynamic Object Detection in Grid Maps (Masters Thesis). Hypothesis Testing for High-Dimensional Multinomials: A Selective Review. Towards Building Large Scale Multimodal Domain-Aware Conversation System …

Variational Inference for Gaussian Process Models with Linear Complexity


Title	Variational Inference for Gaussian Process Models with Linear Complexity
Authors	Ching-An Cheng, Byron Boots
Abstract	Large-scale Gaussian process inference has long faced practical challenges due to time and space complexity that is superlinear in dataset size. While sparse variational Gaussian process models are capable of learning from large-scale data, standard strategies for sparsifying the model can prevent the approximation of complex functions. In this work, we propose a novel variational Gaussian process model that decouples the representation of mean and covariance functions in reproducing kernel Hilbert space. We show that this new parametrization generalizes previous models. Furthermore, it yields a variational inference problem that can be solved by stochastic gradient ascent with time and space complexity that is only linear in the number of mean function parameters, regardless of the choice of kernels, likelihoods, and inducing points. This strategy makes the adoption of large-scale expressive Gaussian process models possible. We run several experiments on regression tasks and show that this decoupled approach greatly outperforms previous sparse variational Gaussian process inference procedures.
Tasks
Published	2017-11-28
URL	http://arxiv.org/abs/1711.10127v2
PDF	http://arxiv.org/pdf/1711.10127v2.pdf
PWC	https://paperswithcode.com/paper/variational-inference-for-gaussian-process-2
Repo
Framework

Boltzmann Exploration Done Right


Title	Boltzmann Exploration Done Right
Authors	Nicolò Cesa-Bianchi, Claudio Gentile, Gábor Lugosi, Gergely Neu
Abstract	Boltzmann exploration is a classic strategy for sequential decision-making under uncertainty, and is one of the most standard tools in Reinforcement Learning (RL). Despite its widespread use, there is virtually no theoretical understanding about the limitations or the actual benefits of this exploration scheme. Does it drive exploration in a meaningful way? Is it prone to misidentifying the optimal actions or spending too much time exploring the suboptimal ones? What is the right tuning for the learning rate? In this paper, we address several of these questions in the classic setup of stochastic multi-armed bandits. One of our main results is showing that the Boltzmann exploration strategy with any monotone learning-rate sequence will induce suboptimal behavior. As a remedy, we offer a simple non-monotone schedule that guarantees near-optimal performance, albeit only when given prior access to key problem parameters that are typically not available in practical situations (like the time horizon $T$ and the suboptimality gap $\Delta$). More importantly, we propose a novel variant that uses different learning rates for different arms, and achieves a distribution-dependent regret bound of order $\frac{K\log^2 T}{\Delta}$ and a distribution-independent bound of order $\sqrt{KT}\log K$ without requiring such prior knowledge. To demonstrate the flexibility of our technique, we also propose a variant that guarantees the same performance bounds even if the rewards are heavy-tailed.
Tasks	Decision Making, Decision Making Under Uncertainty, Multi-Armed Bandits
Published	2017-05-29
URL	http://arxiv.org/abs/1705.10257v2
PDF	http://arxiv.org/pdf/1705.10257v2.pdf
PWC	https://paperswithcode.com/paper/boltzmann-exploration-done-right
Repo
Framework
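
For readers unfamiliar with the baseline rule the abstract analyses, here is a minimal sketch of classic Boltzmann (softmax) exploration on a Bernoulli bandit. It draws arm $i$ with probability proportional to $\exp(\eta_t \hat{\mu}_{t,i})$, where $\hat{\mu}_{t,i}$ is the empirical mean reward. The function names, the arm means, and the $\eta_t = \log t$ schedule are assumptions chosen for illustration; this is the monotone-schedule baseline, not the per-arm variant the paper proposes.

```python
import math
import random


def boltzmann_probs(mean_estimates, eta):
    """Return softmax probabilities p_i proportional to exp(eta * mean_i)."""
    top = max(mean_estimates)  # subtract the max for numerical stability
    weights = [math.exp(eta * (m - top)) for m in mean_estimates]
    total = sum(weights)
    return [w / total for w in weights]


def run_boltzmann(true_means, horizon, eta_schedule, seed=0):
    """Play a Bernoulli bandit for `horizon` rounds; return pull counts per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    reward_sums = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once so each empirical mean is defined
        else:
            estimates = [reward_sums[i] / counts[i] for i in range(k)]
            probs = boltzmann_probs(estimates, eta_schedule(t))
            arm = rng.choices(range(k), weights=probs)[0]
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        reward_sums[arm] += reward
    return counts


# Hypothetical three-armed problem with a monotone schedule eta_t = log t.
counts = run_boltzmann([0.2, 0.5, 0.8], horizon=1000,
                       eta_schedule=lambda t: math.log(t))
```

A single scalar `eta_schedule` shared by all arms is exactly the design the abstract argues against; swapping in a different schedule only requires changing the lambda.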

Fully Convolutional Neural Networks for Dynamic Object Detection in Grid Maps (Masters Thesis)


Title	Fully Convolutional Neural Networks for Dynamic Object Detection in Grid Maps (Masters Thesis)
Authors	Florian Piewak
Abstract	One of the most important parts of environment perception is the detection of obstacles in the surrounding of the vehicle. To achieve that, several sensors like radars, LiDARs and cameras are installed in autonomous vehicles. The produced sensor data is fused to a general representation of the surrounding. In this thesis the dynamic occupancy grid map approach of Nuss et al. is used while three goals are achieved. First, the approach of Nuss et al. to distinguish between moving and non-moving obstacles is improved by using Fully Convolutional Neural Networks to create a class prediction for each grid cell. For this purpose, the network is initialized with public pre-trained network models and the training is executed with a semi-automatic generated dataset. The second goal is to provide orientation information for each detected moving obstacle. This could improve tracking algorithms, which are based on the dynamic occupancy grid map. The orientation extraction based on the Convolutional Neural Network shows a better performance in comparison to an orientation extraction directly over the velocity information of the dynamic occupancy grid map. A general problem of developing machine learning approaches like Neural Networks is the number of labeled data, which can always be increased. For this reason, the last goal is to evaluate a semi-supervised learning algorithm, to generate automatically more labeled data. The result of this evaluation shows that the automated labeled data does not improve the performance of the Convolutional Neural Network. All in all, the best results are combined to compare the detection against the approach of Nuss et al. [36] and a relative improvement of 34.8% is reached.
Tasks	Autonomous Vehicles, Object Detection
Published	2017-09-10
URL	http://arxiv.org/abs/1709.03138v1
PDF	http://arxiv.org/pdf/1709.03138v1.pdf
PWC	https://paperswithcode.com/paper/fully-convolutional-neural-networks-for-1
Repo
Framework

Hypothesis Testing for High-Dimensional Multinomials: A Selective Review


Title	Hypothesis Testing for High-Dimensional Multinomials: A Selective Review
Authors	Sivaraman Balakrishnan, Larry Wasserman
Abstract	The statistical analysis of discrete data has been the subject of extensive statistical research dating back to the work of Pearson. In this survey we review some recently developed methods for testing hypotheses about high-dimensional multinomials. Traditional tests like the $\chi^2$ test and the likelihood ratio test can have poor power in the high-dimensional setting. Much of the research in this area has focused on finding tests with asymptotically Normal limits and developing (stringent) conditions under which tests have Normal limits. We argue that this perspective suffers from a significant deficiency: it can exclude many high-dimensional cases when - despite having non Normal null distributions - carefully designed tests can have high power. Finally, we illustrate that taking a minimax perspective and considering refinements of this perspective can lead naturally to powerful and practical tests.
Tasks
Published	2017-12-17
URL	http://arxiv.org/abs/1712.06120v1
PDF	http://arxiv.org/pdf/1712.06120v1.pdf
PWC	https://paperswithcode.com/paper/hypothesis-testing-for-high-dimensional
Repo
Framework
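
As a point of reference for the two traditional tests named in the abstract, the sketch below computes Pearson's $\chi^2$ statistic and the likelihood ratio statistic for a goodness-of-fit null $p_0$. The function names and the toy counts are assumptions for illustration. With a fixed number of categories both statistics are approximately $\chi^2$ distributed with $d-1$ degrees of freedom under the null; the survey is concerned with the regime where that approximation breaks down.

```python
import math


def pearson_chi2(counts, null_probs):
    """Pearson statistic: sum over cells of (observed - expected)^2 / expected."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, null_probs))


def likelihood_ratio(counts, null_probs):
    """Likelihood ratio statistic: 2 * sum of observed * log(observed / expected)."""
    n = sum(counts)
    # Empty cells contribute zero (the limit of c * log c as c -> 0).
    return 2.0 * sum(c * math.log(c / (n * p))
                     for c, p in zip(counts, null_probs) if c > 0)


# Hypothetical data: 20 draws that all landed in the first of two equally likely cells.
chi2_stat = pearson_chi2([20, 0], [0.5, 0.5])
lr_stat = likelihood_ratio([20, 0], [0.5, 0.5])
```

Each Pearson term divides by the expected count $n p_i$, so cells with tiny expected counts can dominate the statistic, which is one intuition for why these tests lose power when the number of categories is large relative to the sample size.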

Towards Building Large Scale Multimodal Domain-Aware Conversation Systems


Title	Towards Building Large Scale Multimodal Domain-Aware Conversation Systems
Authors	Amrita Saha, Mitesh Khapra, Karthik Sankaranarayanan
Abstract	While multimodal conversation agents are gaining importance in several domains such as retail, travel etc., deep learning research in this area has been limited primarily due to the lack of availability of large-scale, open chatlogs. To overcome this bottleneck, in this paper we introduce the task of multimodal, domain-aware conversations, and propose the MMD benchmark dataset. This dataset was gathered by working in close coordination with a large number of domain experts in the retail domain. These experts suggested various conversation flows and dialog states which are typically seen in multimodal conversations in the fashion domain. Keeping these flows and states in mind, we created a dataset consisting of over 150K conversation sessions between shoppers and sales agents, with the help of in-house annotators using a semi-automated manually intense iterative process. With this dataset, we propose 5 new sub-tasks for multimodal conversations along with their evaluation methodology. We also propose two multimodal neural models in the encode-attend-decode paradigm and demonstrate their performance on two of the sub-tasks, namely text response generation and best image response selection. These experiments serve to establish baseline performance and open new research directions for each of these sub-tasks. Further, for each of the sub-tasks, we present a 'per-state evaluation' of the 9 most significant dialog states, which would enable more focused research into understanding the challenges and complexities involved in each of these states.
Tasks
Published	2017-04-01
URL	http://arxiv.org/abs/1704.00200v3
PDF	http://arxiv.org/pdf/1704.00200v3.pdf
PWC	https://paperswithcode.com/paper/towards-building-large-scale-multimodal
Repo
Framework

Analysis of Vanilla Rolling Horizon Evolution Parameters in General Video Game Playing


Title	Analysis of Vanilla Rolling Horizon Evolution Parameters in General Video Game Playing
Authors	Raluca D. Gaina, Jialin Liu, Simon M. Lucas, Diego Perez-Liebana
Abstract	Monte Carlo Tree Search techniques have generally dominated General Video Game Playing, but recent research has started looking at Evolutionary Algorithms and their potential at matching Tree Search level of play or even outperforming these methods. Online or Rolling Horizon Evolution is one of the options available to evolve sequences of actions for planning in General Video Game Playing, but no research has been done up to date that explores the capabilities of the vanilla version of this algorithm in multiple games. This study aims to critically analyse the different configurations regarding population size and individual length in a set of 20 games from the General Video Game AI corpus. Distinctions are made between deterministic and stochastic games, and the implications of using superior time budgets are studied. Results show that there is scope for the use of these techniques, which in some configurations outperform Monte Carlo Tree Search, and also suggest that further research in these methods could boost their performance.
Tasks
Published	2017-04-24
URL	http://arxiv.org/abs/1704.07075v1
PDF	http://arxiv.org/pdf/1704.07075v1.pdf
PWC	https://paperswithcode.com/paper/analysis-of-vanilla-rolling-horizon-evolution
Repo
Framework
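
To make the two parameters studied in the abstract concrete (population size and individual length), here is a minimal sketch of a rolling horizon evolutionary planner on a toy one-dimensional game. Everything about the game is an assumption for illustration: the agent moves left, stays, or moves right, and a state is scored by its negative distance to a target. The planner uses elitism plus single-gene mutation only; the configuration evaluated in the paper may differ.

```python
import random

ACTIONS = (-1, 0, 1)  # hypothetical toy game: move left, stay, move right


def rollout(state, plan, target):
    """Forward model: apply every action in the plan and score the final state."""
    for action in plan:
        state += action
    return -abs(state - target)


def rhea_action(state, target, pop_size, length, generations, rng):
    """Evolve action sequences from `state`; return the first action of the best one."""
    population = [[rng.choice(ACTIONS) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda plan: rollout(state, plan, target), reverse=True)
        elite = population[: max(1, pop_size // 2)]
        children = []
        while len(elite) + len(children) < pop_size:
            child = list(rng.choice(elite))
            child[rng.randrange(length)] = rng.choice(ACTIONS)  # mutate one gene
            children.append(child)
        population = elite + children
    best = max(population, key=lambda plan: rollout(state, plan, target))
    return best[0]


def play(start, target, steps, pop_size=10, length=5, generations=10, seed=0):
    """Re-plan from scratch at every step and execute only the first action."""
    rng = random.Random(seed)
    state = start
    for _ in range(steps):
        state += rhea_action(state, target, pop_size, length, generations, rng)
    return state


final_state = play(start=0, target=10, steps=20)
```

The "rolling" part is the loop in `play`: only the first action of the best plan is executed, then the horizon shifts forward and the search restarts from the new state.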

Outlier Detection by Consistent Data Selection Method


Title	Outlier Detection by Consistent Data Selection Method
Authors	Utkarsh Porwal, Smruthi Mukund
Abstract	Often the challenge associated with tasks like fraud and spam detection[1] is the lack of all likely patterns needed to train suitable supervised learning models. In order to overcome this limitation, such tasks are attempted as outlier or anomaly detection tasks. We also hypothesize that outliers have behavioral patterns that change over time. Limited data and continuously changing patterns make learning significantly difficult. In this work we are proposing an approach that detects outliers in large data sets by relying on data points that are consistent. The primary contribution of this work is that it will quickly help retrieve samples for both consistent and non-outlier data sets and is also mindful of new outlier patterns. No prior knowledge of each set is required to extract the samples. The method consists of two phases: in the first phase, consistent data points (non-outliers) are retrieved by an ensemble method of unsupervised clustering techniques, and in the second phase a one-class classifier trained on the consistent data point set is applied on the remaining sample set to identify the outliers. The approach is tested on three publicly available data sets and the performance scores are competitive.
Tasks	Anomaly Detection, One-class classifier, Outlier Detection
Published	2017-12-12
URL	http://arxiv.org/abs/1712.04129v2
PDF	http://arxiv.org/pdf/1712.04129v2.pdf
PWC	https://paperswithcode.com/paper/outlier-detection-by-consistent-data
Repo
Framework

Extreme Dimension Reduction for Handling Covariate Shift


Title	Extreme Dimension Reduction for Handling Covariate Shift
Authors	Fulton Wang, Cynthia Rudin
Abstract	In the covariate shift learning scenario, the training and test covariate distributions differ, so that a predictor’s average loss over the training and test distributions also differ. In this work, we explore the potential of extreme dimension reduction, i.e. to very low dimensions, in improving the performance of importance weighting methods for handling covariate shift, which fail in high dimensions due to potentially high train/test covariate divergence and the inability to accurately estimate the requisite density ratios. We first formulate and solve a problem optimizing over linear subspaces a combination of their predictive utility and train/test divergence within. Applying it to simulated and real data, we show extreme dimension reduction helps sometimes but not always, due to a bias introduced by dimension reduction.
Tasks	Dimensionality Reduction
Published	2017-11-29
URL	http://arxiv.org/abs/1711.10938v2
PDF	http://arxiv.org/pdf/1711.10938v2.pdf
PWC	https://paperswithcode.com/paper/extreme-dimension-reduction-for-handling
Repo
Framework
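
The importance weighting idea the abstract builds on can be sketched in a few lines: training losses are reweighted by the density ratio $w(x) = p_{\text{test}}(x) / p_{\text{train}}(x)$ so that their average estimates the test loss. The example below assumes a discrete covariate with known distributions, which is an illustrative simplification; in practice the ratio has to be estimated, and that estimation is what the abstract says becomes unreliable in high dimensions. The function names and numbers are hypothetical.

```python
def importance_weights(xs, p_train, p_test):
    """Density ratio p_test(x) / p_train(x) for each training covariate."""
    return [p_test[x] / p_train[x] for x in xs]


def importance_weighted_loss(xs, losses, p_train, p_test):
    """Estimate the expected test loss from training points and their losses."""
    weights = importance_weights(xs, p_train, p_test)
    return sum(w * l for w, l in zip(weights, losses)) / len(xs)


def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum w^2."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Hypothetical setup: covariate 'a' is far more common at test time than in training.
p_train = {"a": 0.5, "b": 0.5}
p_test = {"a": 0.9, "b": 0.1}
xs = ["a", "b"]          # training covariates
losses = [1.0, 0.0]      # the predictor errs on 'a' and is correct on 'b'

estimate = importance_weighted_loss(xs, losses, p_train, p_test)
ess = effective_sample_size(importance_weights(xs, p_train, p_test))
```

The unweighted training loss here is 0.5, while the weighted estimate matches the test-distribution loss of 0.9. The effective sample size drops below the nominal two points because the weights are unequal; large train/test divergence makes this drop severe.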

Semi-Automatic Algorithm for Breast MRI Lesion Segmentation Using Marker-Controlled Watershed Transformation


Title	Semi-Automatic Algorithm for Breast MRI Lesion Segmentation Using Marker-Controlled Watershed Transformation
Authors	Sulaiman Vesal, Andres Diaz-Pinto, Nishant Ravikumar, Stephan Ellmann, Amirabbas Davari, Andreas Maier
Abstract	Magnetic resonance imaging (MRI) is an effective imaging modality for identifying and localizing breast lesions in women. Accurate and precise lesion segmentation using a computer-aided-diagnosis (CAD) system is a crucial step in evaluating tumor volume and in the quantification of tumor characteristics. However, this is a challenging task, since breast lesions have sophisticated shape, topological structure, and high variance in their intensity distribution across patients. In this paper, we propose a novel marker-controlled watershed transformation-based approach, which uses the brightest pixels in a region of interest (determined by experts) as markers to overcome this challenge, and accurately segment lesions in breast MRI. The proposed approach was evaluated on 106 lesions, which include 64 malignant and 42 benign cases. Segmentation results were quantified by comparison with ground truth labels, using the Dice similarity coefficient (DSC) and Jaccard index (JI) metrics. The proposed method achieved an average Dice coefficient of 0.7808$\pm$0.1729 and Jaccard index of 0.6704$\pm$0.2167. These results illustrate that the proposed method shows promise for future work related to the segmentation and classification of benign and malignant breast lesions.
Tasks	Lesion Segmentation
Published	2017-12-14
URL	http://arxiv.org/abs/1712.05200v1
PDF	http://arxiv.org/pdf/1712.05200v1.pdf
PWC	https://paperswithcode.com/paper/semi-automatic-algorithm-for-breast-mri
Repo
Framework
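
The two overlap metrics reported in the abstract are straightforward to compute from binary masks. The sketch below is illustrative only: the function name and the tiny masks are assumptions, and the convention of returning 1.0 when both masks are empty is a choice made here, not something stated in the source.

```python
def dice_and_jaccard(pred_mask, truth_mask):
    """Return (Dice similarity coefficient, Jaccard index) for two binary masks.

    Masks are equally sized lists of rows; any truthy value counts as foreground.
    """
    pred = {(r, c) for r, row in enumerate(pred_mask)
            for c, value in enumerate(row) if value}
    truth = {(r, c) for r, row in enumerate(truth_mask)
             for c, value in enumerate(row) if value}
    if not pred and not truth:
        return 1.0, 1.0  # assumed convention: two empty masks agree perfectly
    overlap = len(pred & truth)
    dice = 2.0 * overlap / (len(pred) + len(truth))
    jaccard = overlap / len(pred | truth)
    return dice, jaccard


# Hypothetical 2x3 masks: three foreground pixels each, two of them shared.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
dice, jaccard = dice_and_jaccard(predicted, ground_truth)
```

For a single pair of masks the two metrics are tied by $\text{JI} = \text{DSC} / (2 - \text{DSC})$. That identity holds per case and not for averages over many cases, so averaged scores such as those in the abstract need not satisfy it exactly.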

Structured Low-Rank Matrix Factorization: Global Optimality, Algorithms, and Applications


Title	Structured Low-Rank Matrix Factorization: Global Optimality, Algorithms, and Applications
Authors	Benjamin D. Haeffele, Rene Vidal
Abstract	Recently, convex formulations of low-rank matrix factorization problems have received considerable attention in machine learning. However, such formulations often require solving for a matrix of the size of the data matrix, making it challenging to apply them to large scale datasets. Moreover, in many applications the data can display structures beyond simply being low-rank, e.g., images and videos present complex spatio-temporal structures that are largely ignored by standard low-rank methods. In this paper we study a matrix factorization technique that is suitable for large datasets and captures additional structure in the factors by using a particular form of regularization that includes well-known regularizers such as total variation and the nuclear norm as particular cases. Although the resulting optimization problem is non-convex, we show that if the size of the factors is large enough, under certain conditions, any local minimizer for the factors yields a global minimizer. A few practical algorithms are also provided to solve the matrix factorization problem, and bounds on the distance from a given approximate solution of the optimization problem to the global optimum are derived. Examples in neural calcium imaging video segmentation and hyperspectral compressed recovery show the advantages of our approach on high-dimensional datasets.
Tasks	Video Semantic Segmentation
Published	2017-08-25
URL	http://arxiv.org/abs/1708.07850v1
PDF	http://arxiv.org/pdf/1708.07850v1.pdf
PWC	https://paperswithcode.com/paper/structured-low-rank-matrix-factorization-1
Repo
Framework

Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent


Title	Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent
Authors	Matteo Pirotta, Marcello Restelli
Abstract	In this paper, we propose a novel approach to automatically determine the batch size in stochastic gradient descent methods. The choice of the batch size induces a trade-off between the accuracy of the gradient estimate and the cost in terms of samples of each update. We propose to determine the batch size by optimizing the ratio between a lower bound to a linear or quadratic Taylor approximation of the expected improvement and the number of samples used to estimate the gradient. The performance of the proposed approach is empirically compared with related methods on popular classification tasks. The work was presented at the NIPS workshop on Optimizing the Optimizers. Barcelona, Spain, 2016.
Tasks
Published	2017-12-09
URL	http://arxiv.org/abs/1712.03428v1
PDF	http://arxiv.org/pdf/1712.03428v1.pdf
PWC	https://paperswithcode.com/paper/cost-sensitive-approach-to-batch-size
Repo
Framework

State Space LSTM Models with Particle MCMC Inference


Title	State Space LSTM Models with Particle MCMC Inference
Authors	Xun Zheng, Manzil Zaheer, Amr Ahmed, Yuan Wang, Eric P Xing, Alexander J Smola
Abstract	Long Short-Term Memory (LSTM) is one of the most powerful sequence models. Despite the strong performance, however, it lacks the nice interpretability as in state space models. In this paper, we present a way to combine the best of both worlds by introducing State Space LSTM (SSL) models that generalize the earlier work of Zaheer et al. (2017) on combining topic models with LSTM. However, unlike Zaheer et al. (2017), we do not make any factorization assumptions in our inference algorithm. We present an efficient sampler based on the sequential Monte Carlo (SMC) method that draws from the joint posterior directly. Experimental results confirm the superiority and stability of this SMC inference algorithm on a variety of domains.
Tasks	Topic Models
Published	2017-11-30
URL	http://arxiv.org/abs/1711.11179v1
PDF	http://arxiv.org/pdf/1711.11179v1.pdf
PWC	https://paperswithcode.com/paper/state-space-lstm-models-with-particle-mcmc
Repo
Framework

A Study of Metrics of Distance and Correlation Between Ranked Lists for Compositionality Detection


Title	A Study of Metrics of Distance and Correlation Between Ranked Lists for Compositionality Detection
Authors	Christina Lioma, Niels Dalum Hansen
Abstract	Compositionality in language refers to how much the meaning of some phrase can be decomposed into the meaning of its constituents and the way these constituents are combined. Based on the premise that substitution by synonyms is meaning-preserving, compositionality can be approximated as the semantic similarity between a phrase and a version of that phrase where words have been replaced by their synonyms. Different ways of representing such phrases exist (e.g., vectors [1] or language models [2]), and the choice of representation affects the measurement of semantic similarity. We propose a new compositionality detection method that represents phrases as ranked lists of term weights. Our method approximates the semantic similarity between two ranked list representations using a range of well-known distance and correlation metrics. In contrast to most state-of-the-art approaches in compositionality detection, our method is completely unsupervised. Experiments with a publicly available dataset of 1048 human-annotated phrases show that, compared to strong supervised baselines, our approach provides superior measurement of compositionality using any of the distance and correlation metrics considered.
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2017-03-10
URL	http://arxiv.org/abs/1703.03640v1
PDF	http://arxiv.org/pdf/1703.03640v1.pdf
PWC	https://paperswithcode.com/paper/a-study-of-metrics-of-distance-and
Repo
Framework
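
As an illustration of comparing two ranked list representations, the sketch below turns term weights into ranks and computes two well-known rank correlation measures, Spearman's $\rho$ and Kendall's $\tau$. The abstract does not say which metrics were used, so these two are an assumption, as are the function names, the term weights, and the simplification that both lists cover the same terms with no tied ranks.

```python
from itertools import combinations


def ranks_from_weights(weights):
    """Map each term to its rank (1 = highest weight); ties are broken alphabetically."""
    ordered = sorted(weights, key=lambda term: (-weights[term], term))
    return {term: position for position, term in enumerate(ordered, start=1)}


def spearman_rho(ranks_a, ranks_b):
    """Spearman correlation for two untied rankings over the same terms."""
    terms = sorted(ranks_a)
    n = len(terms)
    squared_diffs = sum((ranks_a[t] - ranks_b[t]) ** 2 for t in terms)
    return 1.0 - 6.0 * squared_diffs / (n * (n * n - 1))


def kendall_tau(ranks_a, ranks_b):
    """Kendall correlation: (concordant - discordant) pairs over all pairs."""
    terms = sorted(ranks_a)
    score = 0
    for s, t in combinations(terms, 2):
        agreement = (ranks_a[s] - ranks_a[t]) * (ranks_b[s] - ranks_b[t])
        score += 1 if agreement > 0 else -1
    n = len(terms)
    return score / (n * (n - 1) / 2)


# Hypothetical term weights for a phrase and its synonym-substituted version.
phrase = ranks_from_weights({"red": 0.9, "tape": 0.7, "office": 0.4, "delay": 0.1})
substituted = ranks_from_weights({"red": 0.8, "office": 0.6, "tape": 0.5, "delay": 0.2})

rho = spearman_rho(phrase, substituted)
tau = kendall_tau(phrase, substituted)
```

Under the abstract's premise, a correlation close to 1 between the original and substituted rankings would suggest a more compositional phrase, since replacing words with synonyms left the representation largely unchanged.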

xUnit: Learning a Spatial Activation Function for Efficient Image Restoration


Title	xUnit: Learning a Spatial Activation Function for Efficient Image Restoration
Authors	Idan Kligvasser, Tamar Rott Shaham, Tomer Michaeli
Abstract	In recent years, deep neural networks (DNNs) achieved unprecedented performance in many low-level vision tasks. However, state-of-the-art results are typically achieved by very deep networks, which can reach tens of layers with tens of millions of parameters. To make DNNs implementable on platforms with limited resources, it is necessary to weaken the tradeoff between performance and efficiency. In this paper, we propose a new activation unit, which is particularly suitable for image restoration problems. In contrast to the widespread per-pixel activation units, like ReLUs and sigmoids, our unit implements a learnable nonlinear function with spatial connections. This enables the net to capture much more complex features, thus requiring a significantly smaller number of layers in order to reach the same performance. We illustrate the effectiveness of our units through experiments with state-of-the-art nets for denoising, de-raining, and super resolution, which are already considered to be very small. With our approach, we are able to further reduce these models by nearly 50% without incurring any degradation in performance.
Tasks	Denoising, Image Restoration, Super-Resolution
Published	2017-11-17
URL	http://arxiv.org/abs/1711.06445v3
PDF	http://arxiv.org/pdf/1711.06445v3.pdf
PWC	https://paperswithcode.com/paper/xunit-learning-a-spatial-activation-function
Repo
Framework

Sequence Mining and Pattern Analysis in Drilling Reports with Deep Natural Language Processing


Title	Sequence Mining and Pattern Analysis in Drilling Reports with Deep Natural Language Processing
Authors	Júlio Hoffimann, Youli Mao, Avinash Wesley, Aimee Taylor
Abstract	Drilling activities in the oil and gas industry have been reported over decades for thousands of wells on a daily basis, yet the analysis of this text at large-scale for information retrieval, sequence mining, and pattern analysis is very challenging. Drilling reports contain interpretations written by drillers from noting measurements in downhole sensors and surface equipment, and can be used for operation optimization and accident mitigation. In this initial work, a methodology is proposed for automatic classification of sentences written in drilling reports into three relevant labels (EVENT, SYMPTOM and ACTION) for hundreds of wells in an actual field. Some of the main challenges in the text corpus were overcome, which include the high frequency of technical symbols, mistyping/abbreviation of technical terms, and the presence of incomplete sentences in the drilling reports. We obtain state-of-the-art classification accuracy within this technical language and illustrate advanced queries enabled by the tool.
Tasks	Information Retrieval
Published	2017-12-05
URL	http://arxiv.org/abs/1712.01476v1
PDF	http://arxiv.org/pdf/1712.01476v1.pdf
PWC	https://paperswithcode.com/paper/sequence-mining-and-pattern-analysis-in
Repo
Framework