Paper Group AWR 301
BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning
Title | BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning |
Authors | Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Salem Lahlou, Lucas Willems, Chitwan Saharia, Thien Huu Nguyen, Yoshua Bengio |
Abstract | Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require substantial research efforts. Here, we introduce the BabyAI research platform to support investigations towards including humans in the loop for grounded language learning. The BabyAI platform comprises an extensible suite of 19 levels of increasing difficulty. The levels gradually lead the agent towards acquiring a combinatorially rich synthetic language which is a proper subset of English. The platform also provides a heuristic expert agent for the purpose of simulating a human teacher. We report baseline results and estimate the amount of human involvement that would be required to train a neural network-based agent on some of the BabyAI levels. We put forward strong evidence that current deep learning methods are not yet sufficiently sample efficient when it comes to learning a language with compositional properties. |
Tasks | |
Published | 2018-10-18 |
URL | https://arxiv.org/abs/1810.08272v4 |
https://arxiv.org/pdf/1810.08272v4.pdf | |
PWC | https://paperswithcode.com/paper/babyai-first-steps-towards-grounded-language |
Repo | https://github.com/taungerine/babyai |
Framework | pytorch |
Meta-Learning Priors for Efficient Online Bayesian Regression
Title | Meta-Learning Priors for Efficient Online Bayesian Regression |
Authors | James Harrison, Apoorva Sharma, Marco Pavone |
Abstract | Gaussian Process (GP) regression has seen widespread use in robotics due to its generality, simplicity of use, and the utility of Bayesian predictions. The predominant implementation of GP regression is a nonparametric kernel-based approach, as it enables fitting of arbitrary nonlinear functions. However, this approach suffers from two main drawbacks: (1) it is computationally inefficient, as computation scales poorly with the number of samples; and (2) it can be data inefficient, as encoding prior knowledge that can aid the model through the choice of kernel and associated hyperparameters is often challenging and unintuitive. In this work, we propose ALPaCA, an algorithm for efficient Bayesian regression which addresses these issues. ALPaCA uses a dataset of sample functions to learn a domain-specific, finite-dimensional feature encoding, as well as a prior over the associated weights, such that Bayesian linear regression in this feature space yields accurate online predictions of the posterior predictive density. These features are neural networks, which are trained via a meta-learning (or “learning-to-learn”) approach. ALPaCA extracts all prior information directly from the dataset, rather than restricting prior information to the choice of kernel hyperparameters. Furthermore, by operating in the weight space, it substantially reduces sample complexity. We investigate the performance of ALPaCA on two simple regression problems, two simulated robotic systems, and on a lane-change driving task performed by humans. We find our approach outperforms kernel-based GP regression, as well as state of the art meta-learning approaches, thereby providing a promising plug-in tool for many regression tasks in robotics where scalability and data-efficiency are important. |
Tasks | Meta-Learning |
Published | 2018-07-24 |
URL | http://arxiv.org/abs/1807.08912v2 |
http://arxiv.org/pdf/1807.08912v2.pdf | |
PWC | https://paperswithcode.com/paper/meta-learning-priors-for-efficient-online |
Repo | https://github.com/StanfordASL/ALPaCA |
Framework | none |
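The abstract above describes Bayesian linear regression in a learned feature space with online updates. The following is only a minimal sketch of that regression step, not the ALPaCA implementation: the feature map `features` is a hypothetical hand-written stand-in for the meta-learned neural network features, the prior is an assumed zero-mean isotropic Gaussian rather than a learned one, and the class name `OnlineBayesLinReg` is invented for illustration. It uses only the Python standard library.

```python
def features(x):
    """Hypothetical fixed feature map phi(x) = [1, x] (ALPaCA would learn this)."""
    return [1.0, x]


class OnlineBayesLinReg:
    """Bayesian linear regression y = w . phi + noise, updated one sample at a time.

    Assumed prior: w ~ N(0, noise_var * I / prior_precision).
    """

    def __init__(self, dim, prior_precision=1e-3, noise_var=1.0):
        self.noise_var = noise_var
        # S is the inverse of the precision matrix; it starts at the prior value.
        self.S = [[1.0 / prior_precision if i == j else 0.0 for j in range(dim)]
                  for i in range(dim)]
        # q accumulates phi * y over all observations (the prior mean is zero).
        self.q = [0.0] * dim

    def _S_times(self, phi):
        return [sum(s * p for s, p in zip(row, phi)) for row in self.S]

    def update(self, phi, y):
        """Rank-one Sherman-Morrison update of S, so no matrix is ever inverted."""
        s_phi = self._S_times(phi)
        denom = 1.0 + sum(p * sp for p, sp in zip(phi, s_phi))
        n = len(phi)
        for i in range(n):
            for j in range(n):
                self.S[i][j] -= s_phi[i] * s_phi[j] / denom
        self.q = [q_i + p * y for q_i, p in zip(self.q, phi)]

    def mean_weights(self):
        """Posterior mean of the weights, S @ q."""
        return self._S_times(self.q)

    def predict(self, phi):
        """Posterior predictive mean and variance at feature vector phi."""
        w = self.mean_weights()
        mean = sum(w_i * p for w_i, p in zip(w, phi))
        s_phi = self._S_times(phi)
        var = self.noise_var * (1.0 + sum(p * sp for p, sp in zip(phi, s_phi)))
        return mean, var
```

Each call to `update` costs time quadratic in the feature dimension and is independent of how many samples have been seen, which is the kind of scaling advantage the abstract attributes to working in weight space rather than with a kernel matrix.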
A Differentially Private Kernel Two-Sample Test
Title | A Differentially Private Kernel Two-Sample Test |
Authors | Anant Raj, Ho Chung Leon Law, Dino Sejdinovic, Mijung Park |
Abstract | Kernel two-sample testing is a useful statistical tool in determining whether data samples arise from different distributions without imposing any parametric assumptions on those distributions. However, raw data samples can expose sensitive information about individuals who participate in scientific studies, which makes the current tests vulnerable to privacy breaches. Hence, we design a new framework for kernel two-sample testing conforming to differential privacy constraints, in order to guarantee the privacy of subjects in the data. Unlike existing differentially private parametric tests that simply add noise to data, kernel-based testing imposes a challenge due to a complex dependence of test statistics on the raw data, as these statistics correspond to estimators of distances between representations of probability measures in Hilbert spaces. Our approach considers finite dimensional approximations to those representations. As a result, a simple chi-squared test is obtained, where a test statistic depends on a mean and covariance of empirical differences between the samples, which we perturb for a privacy guarantee. We investigate the utility of our framework in two realistic settings and conclude that our method requires only a relatively modest increase in sample size to achieve a similar level of power to the non-private tests in both settings. |
Tasks | |
Published | 2018-08-01 |
URL | http://arxiv.org/abs/1808.00380v1 |
http://arxiv.org/pdf/1808.00380v1.pdf | |
PWC | https://paperswithcode.com/paper/a-differentially-private-kernel-two-sample |
Repo | https://github.com/hcllaw/private_tst |
Framework | none |
ensmallen: a flexible C++ library for efficient function optimization
Title | ensmallen: a flexible C++ library for efficient function optimization |
Authors | Shikhar Bhardwaj, Ryan R. Curtin, Marcus Edel, Yannis Mentekidis, Conrad Sanderson |
Abstract | We present ensmallen, a fast and flexible C++ library for mathematical optimization of arbitrary user-supplied functions, which can be applied to many machine learning problems. Several types of optimizations are supported, including differentiable, separable, constrained, and categorical objective functions. The library provides many pre-built optimizers (including numerous variants of SGD and Quasi-Newton optimizers) as well as a flexible framework for implementing new optimizers and objective functions. Implementation of a new optimizer requires only one method and a new objective function requires typically one or two C++ functions. This can aid in the quick implementation and prototyping of new machine learning algorithms. Due to the use of C++ template metaprogramming, ensmallen is able to support compiler optimizations that provide fast runtimes. Empirical comparisons show that ensmallen is able to outperform other optimization frameworks (like Julia and SciPy), sometimes by large margins. The library is distributed under the BSD license and is ready for use in production environments. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09361v2 |
http://arxiv.org/pdf/1810.09361v2.pdf | |
PWC | https://paperswithcode.com/paper/ensmallen-a-flexible-c-library-for-efficient |
Repo | https://github.com/mlpack/ensmallen |
Framework | none |
Towards Good Practices on Building Effective CNN Baseline Model for Person Re-identification
Title | Towards Good Practices on Building Effective CNN Baseline Model for Person Re-identification |
Authors | Fu Xiong, Yang Xiao, Zhiguo Cao, Kaicheng Gong, Zhiwen Fang, Joey Tianyi Zhou |
Abstract | Person re-identification is a challenging visual recognition task due to the critical issues of human pose variation, human body occlusion, camera view variation, etc. To address this, most state-of-the-art approaches are based on deep convolutional neural networks (CNN), leveraging their strong feature learning power and classification boundary fitting capacity. Despite its vital role in person re-identification, how to build an effective CNN baseline model has not yet been well studied. To answer this open question, we propose 3 good practices in this paper from the perspectives of adjusting CNN architecture and training procedure. In particular, they are adding batch normalization after the global pooling layer, executing identity categorization directly using only one fully-connected layer, and using Adam as the optimizer. Extensive experiments on 3 widely-used benchmark datasets demonstrate that our propositions enable the CNN baseline model to achieve state-of-the-art performance without any other high-level domain knowledge or low-level technical trick. |
Tasks | Person Re-Identification |
Published | 2018-07-29 |
URL | http://arxiv.org/abs/1807.11042v1 |
http://arxiv.org/pdf/1807.11042v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-good-practices-on-building-effective |
Repo | https://github.com/xf1994/good_practices_for_person_reID |
Framework | pytorch |
Adaptation and Robust Learning of Probabilistic Movement Primitives
Title | Adaptation and Robust Learning of Probabilistic Movement Primitives |
Authors | Sebastian Gomez-Gonzalez, Gerhard Neumann, Bernhard Schölkopf, Jan Peters |
Abstract | Probabilistic representations of movement primitives open important new possibilities for machine learning in robotics. These representations are able to capture the variability of the demonstrations from a teacher as a probability distribution over trajectories, providing a sensible region of exploration and the ability to adapt to changes in the robot environment. However, to be able to capture variability and correlations between different joints, a probabilistic movement primitive requires the estimation of a larger number of parameters than its deterministic counterparts, which focus on modeling only the mean behavior. In this paper, we make use of prior distributions over the parameters of a probabilistic movement primitive to make robust estimates of the parameters with few training instances. In addition, we introduce general purpose operators to adapt movement primitives in joint and task space. The proposed training method and adaptation operators are tested in a coffee preparation task and in a robot table tennis task. In the coffee preparation task we evaluate the generalization performance to changes in the location of the coffee grinder and brewing chamber in a target area, achieving the desired behavior after only two demonstrations. In the table tennis task we evaluate the hit and return rates, outperforming previous approaches while using fewer task specific heuristics. |
Tasks | |
Published | 2018-08-31 |
URL | https://arxiv.org/abs/1808.10648v2 |
https://arxiv.org/pdf/1808.10648v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptation-and-robust-learning-of |
Repo | https://github.com/SamuelBG13/robo-cheesecake |
Framework | none |
Unsupervised Detection of Anomalous Sound based on Deep Learning and the Neyman-Pearson Lemma
Title | Unsupervised Detection of Anomalous Sound based on Deep Learning and the Neyman-Pearson Lemma |
Authors | Yuma Koizumi, Shoichiro Saito, Hisashi Uematsu, Yuta Kawachi, Noboru Harada |
Abstract | This paper proposes a novel optimization principle and its implementation for unsupervised anomaly detection in sound (ADS) using an autoencoder (AE). The goal of unsupervised-ADS is to detect unknown anomalous sound without training data of anomalous sound. Use of an AE as a normal model is a state-of-the-art technique for unsupervised-ADS. To decrease the false positive rate (FPR), the AE is trained to minimize the reconstruction error of normal sounds and the anomaly score is calculated as the reconstruction error of the observed sound. Unfortunately, since this training procedure does not take into account the anomaly score for anomalous sounds, the true positive rate (TPR) does not necessarily increase. In this study, we define an objective function based on the Neyman-Pearson lemma by considering ADS as a statistical hypothesis test. The proposed objective function trains the AE to maximize the TPR under an arbitrary low FPR condition. To calculate the TPR in the objective function, we consider that the set of anomalous sounds is the complementary set of normal sounds and simulate anomalous sounds by using a rejection sampling algorithm. Through experiments using synthetic data, we found that the proposed method improved the performance measures of ADS under low FPR conditions. In addition, we confirmed that the proposed method could detect anomalous sounds in real environments. |
Tasks | Anomaly Detection, Unsupervised Anomaly Detection, Unsupervised Anomaly Detection In Sound |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09133v1 |
http://arxiv.org/pdf/1810.09133v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-detection-of-anomalous-sound |
Repo | https://github.com/lifesailor/data-driven-predictive-maintenance |
Framework | none |
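The abstract above treats anomaly detection as a hypothesis test: the anomaly score is the autoencoder's reconstruction error, and performance is judged by the true positive rate (TPR) at a low false positive rate (FPR). The sketch below illustrates only that evaluation view, under stated assumptions: the scores are supplied by the caller, no autoencoder is trained, the Neyman-Pearson objective from the paper is not implemented, and all function names are hypothetical.

```python
def reconstruction_error(x, x_hat):
    """Anomaly score: mean squared error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def threshold_for_fpr(normal_scores, target_fpr):
    """Pick a threshold so that at most target_fpr of the normal scores exceed it."""
    ordered = sorted(normal_scores)
    n = len(ordered)
    # Number of normal samples that may be (wrongly) flagged as anomalous.
    allowed = min(int(target_fpr * n), n - 1)
    return ordered[n - allowed - 1]


def rate_above(scores, threshold):
    """Fraction of scores strictly above the threshold (FPR or TPR, depending on input)."""
    return sum(s > threshold for s in scores) / len(scores)
```

Calling `rate_above` on held-out normal scores gives the empirical FPR, and calling it on scores of known or simulated anomalies gives the TPR at that operating point.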
Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation
Title | Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation |
Authors | Peter R. Florence, Lucas Manuelli, Russ Tedrake |
Abstract | What is the right object representation for manipulation? We would like robots to visually perceive scenes and learn an understanding of the objects in them that (i) is task-agnostic and can be used as a building block for a variety of manipulation tasks, (ii) is generally applicable to both rigid and non-rigid objects, (iii) takes advantage of the strong priors provided by 3D vision, and (iv) is entirely learned from self-supervision. This is hard to achieve with previous methods: much recent work in grasping does not extend to grasping specific objects or other tasks, whereas task-specific learning may require many trials to generalize well across object configurations or other tasks. In this paper we present Dense Object Nets, which build on recent developments in self-supervised dense descriptor learning, as a consistent object representation for visual understanding and manipulation. We demonstrate they can be trained quickly (approximately 20 minutes) for a wide variety of previously unseen and potentially non-rigid objects. We additionally present novel contributions to enable multi-object descriptor learning, and show that by modifying our training procedure, we can either acquire descriptors which generalize across classes of objects, or descriptors that are distinct for each object instance. Finally, we demonstrate the novel application of learned dense descriptors to robotic manipulation. We demonstrate grasping of specific points on an object across potentially deformed object configurations, and demonstrate using class general descriptors to transfer specific grasps across objects in a class. |
Tasks | |
Published | 2018-06-22 |
URL | http://arxiv.org/abs/1806.08756v2 |
http://arxiv.org/pdf/1806.08756v2.pdf | |
PWC | https://paperswithcode.com/paper/dense-object-nets-learning-dense-visual |
Repo | https://github.com/RobotLocomotion/pytorch-dense-correspondence |
Framework | pytorch |
Chinese NER Using Lattice LSTM
Title | Chinese NER Using Lattice LSTM |
Authors | Yue Zhang, Jie Yang |
Abstract | We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results. |
Tasks | Chinese Named Entity Recognition |
Published | 2018-05-05 |
URL | http://arxiv.org/abs/1805.02023v4 |
http://arxiv.org/pdf/1805.02023v4.pdf | |
PWC | https://paperswithcode.com/paper/chinese-ner-using-lattice-lstm |
Repo | https://github.com/jiesutd/LatticeLSTM |
Framework | pytorch |
Efficient and Accurate MRI Super-Resolution using a Generative Adversarial Network and 3D Multi-Level Densely Connected Network
Title | Efficient and Accurate MRI Super-Resolution using a Generative Adversarial Network and 3D Multi-Level Densely Connected Network |
Authors | Yuhua Chen, Feng Shi, Anthony G. Christodoulou, Zhengwei Zhou, Yibin Xie, Debiao Li |
Abstract | High-resolution (HR) magnetic resonance images (MRI) provide detailed anatomical information important for clinical application and quantitative image analysis. However, HR MRI conventionally comes at the cost of longer scan time, smaller spatial coverage, and lower signal-to-noise ratio (SNR). Recent studies have shown that single image super-resolution (SISR), a technique to recover HR details from one single low-resolution (LR) input image, could provide high-quality image details with the help of advanced deep convolutional neural networks (CNN). However, deep neural networks consume memory heavily and run slowly, especially in 3D settings. In this paper, we propose a novel 3D neural network design, namely a multi-level densely connected super-resolution network (mDCSRN) with generative adversarial network (GAN)-guided training. The mDCSRN trains and runs inference quickly, and the GAN promotes realistic output hardly distinguishable from original HR images. Our results from experiments on a dataset with 1,113 subjects show that our new architecture beats other popular deep learning methods in recovering 4x resolution-downgraded images and runs 6x faster. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-03-04 |
URL | http://arxiv.org/abs/1803.01417v3 |
http://arxiv.org/pdf/1803.01417v3.pdf | |
PWC | https://paperswithcode.com/paper/efficient-and-accurate-mri-super-resolution |
Repo | https://github.com/Hadrien-Cornier/E6040-super-resolution-project |
Framework | pytorch |
On learning with shift-invariant structures
Title | On learning with shift-invariant structures |
Authors | Cristian Rusu |
Abstract | We describe new results and algorithms for two different, but related, problems which deal with circulant matrices: learning shift-invariant components from training data and calculating the shift (or alignment) between two given signals. In the first instance, we deal with the shift-invariant dictionary learning problem while the latter bears the name of (compressive) shift retrieval. We formulate these problems using circulant and convolutional matrices (including unions of such matrices), define optimization problems that describe our goals and propose efficient ways to solve them. Based on these findings, we also show how to learn a wavelet-like dictionary from training data. We connect our work with various previous results from the literature and we show the effectiveness of our proposed algorithms using synthetic, ECG signals and images. |
Tasks | Dictionary Learning |
Published | 2018-12-03 |
URL | https://arxiv.org/abs/1812.01115v2 |
https://arxiv.org/pdf/1812.01115v2.pdf | |
PWC | https://paperswithcode.com/paper/on-learning-with-shift-invariant-structures |
Repo | https://github.com/cristian-rusu-research/shift-invariance |
Framework | none |
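The shift retrieval problem mentioned in the abstract above asks for the cyclic shift that aligns two signals. As an illustration of the problem statement only, the sketch below finds the shift by brute-force circular cross-correlation in quadratic time; it is not the algorithm proposed in the paper, and the function names are hypothetical.

```python
def circular_shift(signal, k):
    """Return a copy of signal cyclically shifted right by k positions."""
    n = len(signal)
    k %= n
    return list(signal[-k:]) + list(signal[:-k]) if k else list(signal)


def retrieve_shift(x, y):
    """Return the right-shift k that maximizes the circular correlation of x and y."""
    n = len(x)

    def correlation(k):
        # Inner product of y with x shifted right by k positions.
        return sum(x[(i - k) % n] * y[i] for i in range(n))

    return max(range(n), key=correlation)
```

When `y` is an exact cyclic shift of a non-periodic `x`, the correlation is maximized only at the true shift, by the Cauchy-Schwarz inequality. A practical implementation would compute all correlations at once with the fast Fourier transform, since circulant matrices are diagonalized by the Fourier basis.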
Detecting Traffic Lights by Single Shot Detection
Title | Detecting Traffic Lights by Single Shot Detection |
Authors | Julian Müller, Klaus Dietmayer |
Abstract | Recent improvements in object detection are driven by the success of convolutional neural networks (CNN). They are able to learn rich features outperforming hand-crafted features. So far, research in traffic light detection mainly focused on hand-crafted features, such as color, shape or brightness of the traffic light bulb. This paper presents a deep learning approach for accurate traffic light detection by adapting a single shot detection (SSD) approach. SSD performs object proposal creation and classification using a single CNN. The original SSD struggles in detecting very small objects, which is essential for traffic light detection. By our adaptations it is possible to detect objects much smaller than ten pixels without increasing the input image size. We present an extensive evaluation on the DriveU Traffic Light Dataset (DTLD). We reach both high accuracy and low false positive rates. The trained model is real-time capable with ten frames per second on a Nvidia Titan Xp. Code has been made available at https://github.com/julimueller/tl_ssd. |
Tasks | Object Detection |
Published | 2018-05-07 |
URL | http://arxiv.org/abs/1805.02523v3 |
http://arxiv.org/pdf/1805.02523v3.pdf | |
PWC | https://paperswithcode.com/paper/detecting-traffic-lights-by-single-shot |
Repo | https://github.com/julimueller/tl_ssd |
Framework | none |
Mixture Density Generative Adversarial Networks
Title | Mixture Density Generative Adversarial Networks |
Authors | Hamid Eghbal-zadeh, Werner Zellinger, Gerhard Widmer |
Abstract | Generative Adversarial Networks have surprising ability for generating sharp and realistic images, though they are known to suffer from the so-called mode collapse problem. In this paper, we propose a new GAN variant called Mixture Density GAN that while being capable of generating high-quality images, overcomes this problem by encouraging the Discriminator to form clusters in its embedding space, which in turn leads the Generator to exploit these and discover different modes in the data. This is achieved by positioning Gaussian density functions in the corners of a simplex, using the resulting Gaussian mixture as a likelihood function over discriminator embeddings, and formulating an objective function for GAN training that is based on these likelihoods. We demonstrate empirically (1) the quality of the generated images in Mixture Density GAN and their strong similarity to real images, as measured by the Fréchet Inception Distance (FID), which compares very favourably with state-of-the-art methods, and (2) the ability to avoid mode collapse and discover all data modes. |
Tasks | |
Published | 2018-10-31 |
URL | http://arxiv.org/abs/1811.00152v2 |
http://arxiv.org/pdf/1811.00152v2.pdf | |
PWC | https://paperswithcode.com/paper/mixture-density-generative-adversarial |
Repo | https://github.com/eghbalz/mdgan |
Framework | none |
Improving Named Entity Recognition by Jointly Learning to Disambiguate Morphological Tags
Title | Improving Named Entity Recognition by Jointly Learning to Disambiguate Morphological Tags |
Authors | Onur Güngör, Suzan Üsküdarlı, Tunga Güngör |
Abstract | Previous studies have shown that linguistic features of a word such as possession, genitive or other grammatical cases can be employed in word representations of a named entity recognition (NER) tagger to improve the performance for morphologically rich languages. However, these taggers require external morphological disambiguation (MD) tools to function which are hard to obtain or non-existent for many languages. In this work, we propose a model which alleviates the need for such disambiguators by jointly learning NER and MD taggers in languages for which one can provide a list of candidate morphological analyses. We show that this can be done independent of the morphological annotation schemes, which differ among languages. Our experiments employing three different model architectures that join these two tasks show that joint learning improves NER performance. Furthermore, the morphological disambiguator’s performance is shown to be competitive. |
Tasks | Named Entity Recognition |
Published | 2018-07-17 |
URL | http://arxiv.org/abs/1807.06683v1 |
http://arxiv.org/pdf/1807.06683v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-named-entity-recognition-by-jointly |
Repo | https://github.com/onurgu/joint-ner-and-md-tagger |
Framework | none |
Deep Gated Recurrent and Convolutional Network Hybrid Model for Univariate Time Series Classification
Title | Deep Gated Recurrent and Convolutional Network Hybrid Model for Univariate Time Series Classification |
Authors | Nelly Elsayed, Anthony S. Maida, Magdy Bayoumi |
Abstract | Hybrid LSTM-fully convolutional networks (LSTM-FCN) for time series classification have produced state-of-the-art classification results on univariate time series. We show that replacing the LSTM with a gated recurrent unit (GRU) to create a GRU-fully convolutional network hybrid model (GRU-FCN) can offer even better performance on many time series datasets. The proposed GRU-FCN model outperforms state-of-the-art classification performance in many univariate and multivariate time series datasets. In addition, since the GRU uses a simpler architecture than the LSTM, it has fewer training parameters, less training time, and a simpler hardware implementation, compared to the LSTM-based models. |
Tasks | Time Series, Time Series Classification |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07683v3 |
http://arxiv.org/pdf/1812.07683v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-gated-recurrent-and-convolutional |
Repo | https://github.com/NellyElsayed/GRU-FCN-model-for-univariate-time-series-classification |
Framework | none |
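The claim in the abstract above that a GRU has fewer trainable parameters than an LSTM can be checked with simple arithmetic. The sketch below assumes the textbook formulation with one bias vector per gate; some libraries keep two bias vectors per gate, which changes the totals but not the 3:4 ratio. The function names are hypothetical.

```python
def recurrent_params(input_size, hidden_size, num_gates):
    """Parameter count of one recurrent layer with num_gates weight sets.

    Each set has an input-to-hidden matrix, a hidden-to-hidden matrix,
    and (by assumption) a single bias vector.
    """
    per_gate = hidden_size * (input_size + hidden_size) + hidden_size
    return num_gates * per_gate


def lstm_params(input_size, hidden_size):
    # LSTM: input, forget, and output gates plus the candidate cell state.
    return recurrent_params(input_size, hidden_size, 4)


def gru_params(input_size, hidden_size):
    # GRU: reset and update gates plus the candidate hidden state.
    return recurrent_params(input_size, hidden_size, 3)
```

For a univariate series (`input_size=1`) and any hidden size, the GRU layer therefore has three quarters of the parameters of the corresponding LSTM layer under this assumption.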