Paper Group ANR 472
Variational Inference for Gaussian Process Models with Linear Complexity
Title | Variational Inference for Gaussian Process Models with Linear Complexity |
Authors | Ching-An Cheng, Byron Boots |
Abstract | Large-scale Gaussian process inference has long faced practical challenges due to time and space complexity that is superlinear in dataset size. While sparse variational Gaussian process models are capable of learning from large-scale data, standard strategies for sparsifying the model can prevent the approximation of complex functions. In this work, we propose a novel variational Gaussian process model that decouples the representation of mean and covariance functions in reproducing kernel Hilbert space. We show that this new parametrization generalizes previous models. Furthermore, it yields a variational inference problem that can be solved by stochastic gradient ascent with time and space complexity that is only linear in the number of mean function parameters, regardless of the choice of kernels, likelihoods, and inducing points. This strategy makes the adoption of large-scale expressive Gaussian process models possible. We run several experiments on regression tasks and show that this decoupled approach greatly outperforms previous sparse variational Gaussian process inference procedures. |
Tasks | |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10127v2, http://arxiv.org/pdf/1711.10127v2.pdf |
PWC | https://paperswithcode.com/paper/variational-inference-for-gaussian-process-2 |
Repo | |
Framework | |
Boltzmann Exploration Done Right
Title | Boltzmann Exploration Done Right |
Authors | Nicolò Cesa-Bianchi, Claudio Gentile, Gábor Lugosi, Gergely Neu |
Abstract | Boltzmann exploration is a classic strategy for sequential decision-making under uncertainty, and is one of the most standard tools in Reinforcement Learning (RL). Despite its widespread use, there is virtually no theoretical understanding about the limitations or the actual benefits of this exploration scheme. Does it drive exploration in a meaningful way? Is it prone to misidentifying the optimal actions or spending too much time exploring the suboptimal ones? What is the right tuning for the learning rate? In this paper, we address several of these questions in the classic setup of stochastic multi-armed bandits. One of our main results is showing that the Boltzmann exploration strategy with any monotone learning-rate sequence will induce suboptimal behavior. As a remedy, we offer a simple non-monotone schedule that guarantees near-optimal performance, albeit only when given prior access to key problem parameters that are typically not available in practical situations (like the time horizon $T$ and the suboptimality gap $\Delta$). More importantly, we propose a novel variant that uses different learning rates for different arms, and achieves a distribution-dependent regret bound of order $\frac{K\log^2 T}{\Delta}$ and a distribution-independent bound of order $\sqrt{KT}\log K$ without requiring such prior knowledge. To demonstrate the flexibility of our technique, we also propose a variant that guarantees the same performance bounds even if the rewards are heavy-tailed. |
Tasks | Decision Making, Decision Making Under Uncertainty, Multi-Armed Bandits |
Published | 2017-05-29 |
URL | http://arxiv.org/abs/1705.10257v2, http://arxiv.org/pdf/1705.10257v2.pdf |
PWC | https://paperswithcode.com/paper/boltzmann-exploration-done-right |
Repo | |
Framework | |
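As a rough illustration of the classic scheme the abstract analyzes, the sketch below plays a Bernoulli bandit by sampling each arm with probability proportional to the exponential of its empirical mean reward times a learning rate. It is not the paper's proposed per-arm variant. The function names, the Bernoulli reward model, and the logarithmic schedule in the usage line are assumptions made for this example only.

```python
import math
import random


def boltzmann_probabilities(mean_estimates, learning_rate):
    """Softmax over estimated mean rewards, scaled by the learning rate."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(mean_estimates)
    weights = [math.exp(learning_rate * (m - top)) for m in mean_estimates]
    total = sum(weights)
    return [w / total for w in weights]


def run_boltzmann_bandit(true_means, horizon, learning_rate_at, seed=0):
    """Play a Bernoulli bandit with Boltzmann exploration; return pulls per arm.

    learning_rate_at is a hypothetical callback mapping the round index t
    (starting at 1) to the learning rate used in that round.
    """
    rng = random.Random(seed)
    k = len(true_means)
    pulls = [0] * k
    estimates = [0.0] * k
    for t in range(1, horizon + 1):
        probs = boltzmann_probabilities(estimates, learning_rate_at(t))
        arm = rng.choices(range(k), weights=probs)[0]
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        pulls[arm] += 1
        # Incremental update of the empirical mean reward of the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return pulls


# Usage: three arms, a monotone (logarithmic) learning-rate schedule.
pulls = run_boltzmann_bandit([0.2, 0.5, 0.8], 1000, lambda t: math.log(t + 1))
print(pulls)
```

A single monotone schedule like the one in the usage line is the kind of tuning the abstract says leads to suboptimal behavior, so the sketch is best read as a baseline to experiment with rather than a recommended configuration.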
Fully Convolutional Neural Networks for Dynamic Object Detection in Grid Maps (Masters Thesis)
Title | Fully Convolutional Neural Networks for Dynamic Object Detection in Grid Maps (Masters Thesis) |
Authors | Florian Piewak |
Abstract | One of the most important parts of environment perception is the detection of obstacles in the vehicle's surroundings. To achieve that, several sensors like radars, LiDARs and cameras are installed in autonomous vehicles. The produced sensor data is fused into a general representation of the surroundings. In this thesis the dynamic occupancy grid map approach of Nuss et al. is used while three goals are achieved. First, the approach of Nuss et al. to distinguish between moving and non-moving obstacles is improved by using Fully Convolutional Neural Networks to create a class prediction for each grid cell. For this purpose, the network is initialized with public pre-trained network models and the training is executed with a semi-automatically generated dataset. The second goal is to provide orientation information for each detected moving obstacle. This could improve tracking algorithms that are based on the dynamic occupancy grid map. The orientation extraction based on the Convolutional Neural Network performs better than an orientation extraction taken directly from the velocity information of the dynamic occupancy grid map. A general problem in developing machine learning approaches like Neural Networks is the amount of labeled data, which can always be increased. For this reason, the last goal is to evaluate a semi-supervised learning algorithm that automatically generates more labeled data. The result of this evaluation shows that the automatically labeled data does not improve the performance of the Convolutional Neural Network. All in all, the best results are combined to compare the detection against the approach of Nuss et al. [36], and a relative improvement of 34.8% is reached. |
Tasks | Autonomous Vehicles, Object Detection |
Published | 2017-09-10 |
URL | http://arxiv.org/abs/1709.03138v1, http://arxiv.org/pdf/1709.03138v1.pdf |
PWC | https://paperswithcode.com/paper/fully-convolutional-neural-networks-for-1 |
Repo | |
Framework | |
Hypothesis Testing for High-Dimensional Multinomials: A Selective Review
Title | Hypothesis Testing for High-Dimensional Multinomials: A Selective Review |
Authors | Sivaraman Balakrishnan, Larry Wasserman |
Abstract | The statistical analysis of discrete data has been the subject of extensive statistical research dating back to the work of Pearson. In this survey we review some recently developed methods for testing hypotheses about high-dimensional multinomials. Traditional tests like the $\chi^2$ test and the likelihood ratio test can have poor power in the high-dimensional setting. Much of the research in this area has focused on finding tests with asymptotically Normal limits and developing (stringent) conditions under which tests have Normal limits. We argue that this perspective suffers from a significant deficiency: it can exclude many high-dimensional cases when, despite having non-Normal null distributions, carefully designed tests can have high power. Finally, we illustrate that taking a minimax perspective and considering refinements of this perspective can lead naturally to powerful and practical tests. |
Tasks | |
Published | 2017-12-17 |
URL | http://arxiv.org/abs/1712.06120v1, http://arxiv.org/pdf/1712.06120v1.pdf |
PWC | https://paperswithcode.com/paper/hypothesis-testing-for-high-dimensional |
Repo | |
Framework | |
Towards Building Large Scale Multimodal Domain-Aware Conversation Systems
Title | Towards Building Large Scale Multimodal Domain-Aware Conversation Systems |
Authors | Amrita Saha, Mitesh Khapra, Karthik Sankaranarayanan |
Abstract | While multimodal conversation agents are gaining importance in several domains such as retail, travel etc., deep learning research in this area has been limited primarily due to the lack of availability of large-scale, open chatlogs. To overcome this bottleneck, in this paper we introduce the task of multimodal, domain-aware conversations, and propose the MMD benchmark dataset. This dataset was gathered by working in close coordination with a large number of domain experts in the retail domain. These experts suggested various conversation flows and dialog states which are typically seen in multimodal conversations in the fashion domain. Keeping these flows and states in mind, we created a dataset consisting of over 150K conversation sessions between shoppers and sales agents, with the help of in-house annotators using a semi-automated, manually intensive iterative process. With this dataset, we propose 5 new sub-tasks for multimodal conversations along with their evaluation methodology. We also propose two multimodal neural models in the encode-attend-decode paradigm and demonstrate their performance on two of the sub-tasks, namely text response generation and best image response selection. These experiments serve to establish baseline performance and open new research directions for each of these sub-tasks. Further, for each of the sub-tasks, we present a 'per-state evaluation' of the 9 most significant dialog states, which would enable more focused research into understanding the challenges and complexities involved in each of these states. |
Tasks | |
Published | 2017-04-01 |
URL | http://arxiv.org/abs/1704.00200v3, http://arxiv.org/pdf/1704.00200v3.pdf |
PWC | https://paperswithcode.com/paper/towards-building-large-scale-multimodal |
Repo | |
Framework | |
Analysis of Vanilla Rolling Horizon Evolution Parameters in General Video Game Playing
Title | Analysis of Vanilla Rolling Horizon Evolution Parameters in General Video Game Playing |
Authors | Raluca D. Gaina, Jialin Liu, Simon M. Lucas, Diego Perez-Liebana |
Abstract | Monte Carlo Tree Search techniques have generally dominated General Video Game Playing, but recent research has started looking at Evolutionary Algorithms and their potential for matching Tree Search level of play or even outperforming these methods. Online or Rolling Horizon Evolution is one of the options available to evolve sequences of actions for planning in General Video Game Playing, but no research to date has explored the capabilities of the vanilla version of this algorithm in multiple games. This study aims to critically analyse the different configurations regarding population size and individual length in a set of 20 games from the General Video Game AI corpus. Distinctions are made between deterministic and stochastic games, and the implications of using larger time budgets are studied. Results show that there is scope for the use of these techniques, which in some configurations outperform Monte Carlo Tree Search, and also suggest that further research in these methods could boost their performance. |
Tasks | |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07075v1, http://arxiv.org/pdf/1704.07075v1.pdf |
PWC | https://paperswithcode.com/paper/analysis-of-vanilla-rolling-horizon-evolution |
Repo | |
Framework | |
Outlier Detection by Consistent Data Selection Method
Title | Outlier Detection by Consistent Data Selection Method |
Authors | Utkarsh Porwal, Smruthi Mukund |
Abstract | Often the challenge associated with tasks like fraud and spam detection [1] is the lack of all likely patterns needed to train suitable supervised learning models. In order to overcome this limitation, such tasks are attempted as outlier or anomaly detection tasks. We also hypothesize that outliers have behavioral patterns that change over time. Limited data and continuously changing patterns make learning significantly difficult. In this work we propose an approach that detects outliers in large data sets by relying on data points that are consistent. The primary contribution of this work is that it will quickly help retrieve samples for both consistent and non-outlier data sets and is also mindful of new outlier patterns. No prior knowledge of each set is required to extract the samples. The method consists of two phases: in the first phase, consistent data points (non-outliers) are retrieved by an ensemble method of unsupervised clustering techniques, and in the second phase a one-class classifier trained on the consistent data point set is applied on the remaining sample set to identify the outliers. The approach is tested on three publicly available data sets and the performance scores are competitive. |
Tasks | Anomaly Detection, One-class classifier, Outlier Detection |
Published | 2017-12-12 |
URL | http://arxiv.org/abs/1712.04129v2, http://arxiv.org/pdf/1712.04129v2.pdf |
PWC | https://paperswithcode.com/paper/outlier-detection-by-consistent-data |
Repo | |
Framework | |
Extreme Dimension Reduction for Handling Covariate Shift
Title | Extreme Dimension Reduction for Handling Covariate Shift |
Authors | Fulton Wang, Cynthia Rudin |
Abstract | In the covariate shift learning scenario, the training and test covariate distributions differ, so that a predictor's average losses over the training and test distributions also differ. In this work, we explore the potential of extreme dimension reduction, i.e. to very low dimensions, in improving the performance of importance weighting methods for handling covariate shift, which fail in high dimensions due to potentially high train/test covariate divergence and the inability to accurately estimate the requisite density ratios. We first formulate and solve a problem that optimizes, over linear subspaces, a combination of their predictive utility and the train/test divergence within them. Applying it to simulated and real data, we show extreme dimension reduction helps sometimes but not always, due to a bias introduced by dimension reduction. |
Tasks | Dimensionality Reduction |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1711.10938v2, http://arxiv.org/pdf/1711.10938v2.pdf |
PWC | https://paperswithcode.com/paper/extreme-dimension-reduction-for-handling |
Repo | |
Framework | |
Semi-Automatic Algorithm for Breast MRI Lesion Segmentation Using Marker-Controlled Watershed Transformation
Title | Semi-Automatic Algorithm for Breast MRI Lesion Segmentation Using Marker-Controlled Watershed Transformation |
Authors | Sulaiman Vesal, Andres Diaz-Pinto, Nishant Ravikumar, Stephan Ellmann, Amirabbas Davari, Andreas Maier |
Abstract | Magnetic resonance imaging (MRI) is an effective imaging modality for identifying and localizing breast lesions in women. Accurate and precise lesion segmentation using a computer-aided-diagnosis (CAD) system is a crucial step in evaluating tumor volume and in the quantification of tumor characteristics. However, this is a challenging task, since breast lesions have sophisticated shape, topological structure, and high variance in their intensity distribution across patients. In this paper, we propose a novel marker-controlled watershed transformation-based approach, which uses the brightest pixels in a region of interest (determined by experts) as markers to overcome this challenge, and accurately segment lesions in breast MRI. The proposed approach was evaluated on 106 lesions, which include 64 malignant and 42 benign cases. Segmentation results were quantified by comparison with ground truth labels, using the Dice similarity coefficient (DSC) and Jaccard index (JI) metrics. The proposed method achieved an average Dice coefficient of 0.7808$\pm$0.1729 and Jaccard index of 0.6704$\pm$0.2167. These results illustrate that the proposed method shows promise for future work related to the segmentation and classification of benign and malignant breast lesions. |
Tasks | Lesion Segmentation |
Published | 2017-12-14 |
URL | http://arxiv.org/abs/1712.05200v1, http://arxiv.org/pdf/1712.05200v1.pdf |
PWC | https://paperswithcode.com/paper/semi-automatic-algorithm-for-breast-mri |
Repo | |
Framework | |
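The two overlap metrics named in the abstract can be computed from a predicted and a reference segmentation as in the sketch below. It represents each mask as a set of pixel indices, which is an assumption made for brevity, and the function name and the empty-mask convention are likewise illustrative rather than taken from the paper.

```python
def dice_and_jaccard(predicted, reference):
    """Return (Dice, Jaccard) for two masks given as collections of pixel indices."""
    predicted, reference = set(predicted), set(reference)
    overlap = len(predicted & reference)
    union = len(predicted | reference)
    if union == 0:
        # Both masks are empty: treated as perfect agreement (assumed convention).
        return 1.0, 1.0
    dice = 2.0 * overlap / (len(predicted) + len(reference))
    jaccard = overlap / union
    return dice, jaccard


# Usage with two hypothetical four-pixel masks that share two pixels.
dice, jaccard = dice_and_jaccard({1, 2, 3, 4}, {3, 4, 5, 6})
print(dice, jaccard)
```

For a single pair of masks the two scores are linked by J = D / (2 - D). That identity does not carry over to averages across lesions, so averaged Dice and Jaccard values such as those reported above need not satisfy it exactly.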
Structured Low-Rank Matrix Factorization: Global Optimality, Algorithms, and Applications
Title | Structured Low-Rank Matrix Factorization: Global Optimality, Algorithms, and Applications |
Authors | Benjamin D. Haeffele, Rene Vidal |
Abstract | Recently, convex formulations of low-rank matrix factorization problems have received considerable attention in machine learning. However, such formulations often require solving for a matrix of the size of the data matrix, making it challenging to apply them to large scale datasets. Moreover, in many applications the data can display structures beyond simply being low-rank, e.g., images and videos present complex spatio-temporal structures that are largely ignored by standard low-rank methods. In this paper we study a matrix factorization technique that is suitable for large datasets and captures additional structure in the factors by using a particular form of regularization that includes well-known regularizers such as total variation and the nuclear norm as particular cases. Although the resulting optimization problem is non-convex, we show that if the size of the factors is large enough, under certain conditions, any local minimizer for the factors yields a global minimizer. A few practical algorithms are also provided to solve the matrix factorization problem, and bounds on the distance from a given approximate solution of the optimization problem to the global optimum are derived. Examples in neural calcium imaging video segmentation and hyperspectral compressed recovery show the advantages of our approach on high-dimensional datasets. |
Tasks | Video Semantic Segmentation |
Published | 2017-08-25 |
URL | http://arxiv.org/abs/1708.07850v1, http://arxiv.org/pdf/1708.07850v1.pdf |
PWC | https://paperswithcode.com/paper/structured-low-rank-matrix-factorization-1 |
Repo | |
Framework | |
Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent
Title | Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent |
Authors | Matteo Pirotta, Marcello Restelli |
Abstract | In this paper, we propose a novel approach to automatically determine the batch size in stochastic gradient descent methods. The choice of the batch size induces a trade-off between the accuracy of the gradient estimate and the cost in terms of samples of each update. We propose to determine the batch size by optimizing the ratio between a lower bound to a linear or quadratic Taylor approximation of the expected improvement and the number of samples used to estimate the gradient. The performance of the proposed approach is empirically compared with related methods on popular classification tasks. The work was presented at the NIPS workshop on Optimizing the Optimizers, Barcelona, Spain, 2016. |
Tasks | |
Published | 2017-12-09 |
URL | http://arxiv.org/abs/1712.03428v1, http://arxiv.org/pdf/1712.03428v1.pdf |
PWC | https://paperswithcode.com/paper/cost-sensitive-approach-to-batch-size |
Repo | |
Framework | |
State Space LSTM Models with Particle MCMC Inference
Title | State Space LSTM Models with Particle MCMC Inference |
Authors | Xun Zheng, Manzil Zaheer, Amr Ahmed, Yuan Wang, Eric P Xing, Alexander J Smola |
Abstract | Long Short-Term Memory (LSTM) is one of the most powerful sequence models. Despite its strong performance, however, it lacks the interpretability of state space models. In this paper, we present a way to combine the best of both worlds by introducing State Space LSTM (SSL) models that generalize the earlier work of Zaheer et al. (2017) on combining topic models with LSTM. However, unlike Zaheer et al. (2017), we do not make any factorization assumptions in our inference algorithm. We present an efficient sampler based on the sequential Monte Carlo (SMC) method that draws from the joint posterior directly. Experimental results confirm the superiority and stability of this SMC inference algorithm on a variety of domains. |
Tasks | Topic Models |
Published | 2017-11-30 |
URL | http://arxiv.org/abs/1711.11179v1, http://arxiv.org/pdf/1711.11179v1.pdf |
PWC | https://paperswithcode.com/paper/state-space-lstm-models-with-particle-mcmc |
Repo | |
Framework | |
A Study of Metrics of Distance and Correlation Between Ranked Lists for Compositionality Detection
Title | A Study of Metrics of Distance and Correlation Between Ranked Lists for Compositionality Detection |
Authors | Christina Lioma, Niels Dalum Hansen |
Abstract | Compositionality in language refers to how much the meaning of some phrase can be decomposed into the meaning of its constituents and the way these constituents are combined. Based on the premise that substitution by synonyms is meaning-preserving, compositionality can be approximated as the semantic similarity between a phrase and a version of that phrase where words have been replaced by their synonyms. Different ways of representing such phrases exist (e.g., vectors [1] or language models [2]), and the choice of representation affects the measurement of semantic similarity. We propose a new compositionality detection method that represents phrases as ranked lists of term weights. Our method approximates the semantic similarity between two ranked list representations using a range of well-known distance and correlation metrics. In contrast to most state-of-the-art approaches in compositionality detection, our method is completely unsupervised. Experiments with a publicly available dataset of 1048 human-annotated phrases show that, compared to strong supervised baselines, our approach provides superior measurement of compositionality using any of the distance and correlation metrics considered. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2017-03-10 |
URL | http://arxiv.org/abs/1703.03640v1, http://arxiv.org/pdf/1703.03640v1.pdf |
PWC | https://paperswithcode.com/paper/a-study-of-metrics-of-distance-and |
Repo | |
Framework | |
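To make the idea of comparing two ranked lists concrete, the sketch below computes one well-known correlation metric (Kendall's tau) and one well-known distance (Spearman's footrule). The abstract does not say which metrics the paper uses, so these two are stand-ins, and the term lists in the usage lines are invented. The sketch also assumes both lists rank the same items, with no ties and at least two items.

```python
from itertools import combinations


def rank_positions(ranked_list):
    """Map each item to its zero-based position in the ranking."""
    return {item: position for position, item in enumerate(ranked_list)}


def kendall_tau(list_a, list_b):
    """Kendall's tau: (concordant pairs - discordant pairs) / total pairs."""
    pos_a, pos_b = rank_positions(list_a), rank_positions(list_b)
    concordant = discordant = 0
    for x, y in combinations(list_a, 2):
        # A pair is concordant when both rankings order x and y the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


def spearman_footrule(list_a, list_b):
    """Spearman's footrule: sum of absolute rank displacements per item."""
    pos_a, pos_b = rank_positions(list_a), rank_positions(list_b)
    return sum(abs(pos_a[item] - pos_b[item]) for item in list_a)


# Usage: hypothetical term rankings for a phrase and its synonym-substituted version.
original = ["hot", "dog", "sausage", "bun"]
substituted = ["dog", "hot", "sausage", "bun"]
print(kendall_tau(original, substituted), spearman_footrule(original, substituted))
```

Under the abstract's premise, a high correlation (or small distance) between the two rankings would suggest that synonym substitution preserved the meaning, which is taken as a sign of a more compositional phrase.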
xUnit: Learning a Spatial Activation Function for Efficient Image Restoration
Title | xUnit: Learning a Spatial Activation Function for Efficient Image Restoration |
Authors | Idan Kligvasser, Tamar Rott Shaham, Tomer Michaeli |
Abstract | In recent years, deep neural networks (DNNs) achieved unprecedented performance in many low-level vision tasks. However, state-of-the-art results are typically achieved by very deep networks, which can reach tens of layers with tens of millions of parameters. To make DNNs implementable on platforms with limited resources, it is necessary to weaken the tradeoff between performance and efficiency. In this paper, we propose a new activation unit, which is particularly suitable for image restoration problems. In contrast to the widespread per-pixel activation units, like ReLUs and sigmoids, our unit implements a learnable nonlinear function with spatial connections. This enables the net to capture much more complex features, thus requiring a significantly smaller number of layers in order to reach the same performance. We illustrate the effectiveness of our units through experiments with state-of-the-art nets for denoising, de-raining, and super resolution, which are already considered to be very small. With our approach, we are able to further reduce these models by nearly 50% without incurring any degradation in performance. |
Tasks | Denoising, Image Restoration, Super-Resolution |
Published | 2017-11-17 |
URL | http://arxiv.org/abs/1711.06445v3, http://arxiv.org/pdf/1711.06445v3.pdf |
PWC | https://paperswithcode.com/paper/xunit-learning-a-spatial-activation-function |
Repo | |
Framework | |
Sequence Mining and Pattern Analysis in Drilling Reports with Deep Natural Language Processing
Title | Sequence Mining and Pattern Analysis in Drilling Reports with Deep Natural Language Processing |
Authors | Júlio Hoffimann, Youli Mao, Avinash Wesley, Aimee Taylor |
Abstract | Drilling activities in the oil and gas industry have been reported over decades for thousands of wells on a daily basis, yet the analysis of this text at large scale for information retrieval, sequence mining, and pattern analysis is very challenging. Drilling reports contain interpretations written by drillers based on measurements from downhole sensors and surface equipment, and can be used for operation optimization and accident mitigation. In this initial work, a methodology is proposed for automatic classification of sentences written in drilling reports into three relevant labels (EVENT, SYMPTOM and ACTION) for hundreds of wells in an actual field. Some of the main challenges in the text corpus were overcome, including the high frequency of technical symbols, mistyping/abbreviation of technical terms, and the presence of incomplete sentences in the drilling reports. We obtain state-of-the-art classification accuracy within this technical language and illustrate advanced queries enabled by the tool. |
Tasks | Information Retrieval |
Published | 2017-12-05 |
URL | http://arxiv.org/abs/1712.01476v1, http://arxiv.org/pdf/1712.01476v1.pdf |
PWC | https://paperswithcode.com/paper/sequence-mining-and-pattern-analysis-in |
Repo | |
Framework | |