May 7, 2019

Paper Group AWR 79

Actor-critic versus direct policy search: a comparison based on sample complexity

Title Actor-critic versus direct policy search: a comparison based on sample complexity
Authors Arnaud de Froissard de Broissia, Olivier Sigaud
Abstract Sample efficiency is a critical property when optimizing policy parameters for the controller of a robot. In this paper, we evaluate two state-of-the-art policy optimization algorithms. One is a recent deep reinforcement learning method based on an actor-critic algorithm, Deep Deterministic Policy Gradient (DDPG), that has been shown to perform well on various control benchmarks. The other one is a direct policy search method, Covariance Matrix Adaptation Evolution Strategy (CMA-ES), a black-box optimization method that is widely used for robot learning. The algorithms are evaluated on a continuous version of the mountain car benchmark problem, so as to compare their sample complexity. From a preliminary analysis, we expect DDPG to be more sample efficient than CMA-ES, which is confirmed by our experimental results.
Tasks
Published 2016-06-29
URL http://arxiv.org/abs/1606.09152v2
PDF http://arxiv.org/pdf/1606.09152v2.pdf
PWC https://paperswithcode.com/paper/actor-critic-versus-direct-policy-search-a
Repo https://github.com/MOCR/DDPG
Framework tf
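
The sample-complexity accounting is easy to make concrete. Below is a minimal Python sketch, not the linked repo's code, of a simplified evolution strategy in the spirit of CMA-ES (real CMA-ES also adapts a full covariance matrix): every candidate evaluation costs one episode of interaction, which is exactly the quantity the paper compares against DDPG. The `rollout_return` objective is a hypothetical stand-in for an episode return on the continuous mountain car task.

```python
# Simplified evolution strategy, CMA-ES flavored, with an explicit
# sample-complexity counter. Not the paper's implementation.
import numpy as np

def rollout_return(theta):
    # Hypothetical stand-in for an episode return on mountain car.
    return -np.sum((theta - 1.0) ** 2)

def simple_es(dim=8, popsize=16, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    mean, sigma = np.zeros(dim), 0.5
    n_episodes = 0                      # sample-complexity counter
    for _ in range(iters):
        pop = mean + sigma * rng.standard_normal((popsize, dim))
        returns = np.array([rollout_return(p) for p in pop])
        n_episodes += popsize           # one rollout per candidate
        elite = pop[np.argsort(returns)[-popsize // 2:]]
        mean = elite.mean(axis=0)       # real CMA-ES would also adapt
        sigma *= 0.97                   # a full covariance matrix here
    return mean, n_episodes

theta, episodes = simple_es()
print(f"best return {rollout_return(theta):.4f} after {episodes} episodes")
```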

Variational Graph Auto-Encoders

Title Variational Graph Auto-Encoders
Authors Thomas N. Kipf, Max Welling
Abstract We introduce the variational graph auto-encoder (VGAE), a framework for unsupervised learning on graph-structured data based on the variational auto-encoder (VAE). This model makes use of latent variables and is capable of learning interpretable latent representations for undirected graphs. We demonstrate this model using a graph convolutional network (GCN) encoder and a simple inner product decoder. Our model achieves competitive results on a link prediction task in citation networks. In contrast to most existing models for unsupervised learning on graph-structured data and link prediction, our model can naturally incorporate node features, which significantly improves predictive performance on a number of benchmark datasets.
Tasks Graph Clustering, Link Prediction
Published 2016-11-21
URL http://arxiv.org/abs/1611.07308v1
PDF http://arxiv.org/pdf/1611.07308v1.pdf
PWC https://paperswithcode.com/paper/variational-graph-auto-encoders
Repo https://github.com/tkipf/gae
Framework tf
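
The model's two main pieces are compact enough to sketch. The numpy snippet below (the authors' TensorFlow implementation is in the linked repo) shows a GCN encoder layer and the inner-product decoder sigmoid(ZZᵀ) that scores candidate links; for brevity it shows the deterministic GAE variant, whereas the variational version outputs a mean and log-variance per node and samples Z with the reparameterization trick.

```python
# Minimal GAE sketch: one GCN layer as encoder, inner-product decoder.
import numpy as np

def normalize_adj(A):
    # Symmetric normalization D^{-1/2} (A + I) D^{-1/2} used by GCNs.
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_layer(A_norm, X, W):
    return np.maximum(A_norm @ X @ W, 0.0)    # ReLU(ÂXW)

def inner_product_decoder(Z):
    logits = Z @ Z.T
    return 1.0 / (1.0 + np.exp(-logits))      # P(edge between i and j)

rng = np.random.default_rng(0)
A = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]], float)
X = rng.standard_normal((4, 5))               # node features
W = rng.standard_normal((5, 2))
Z = gcn_layer(normalize_adj(A), X, W)         # latent node embeddings
print(inner_product_decoder(Z).round(2))      # reconstructed adjacency
```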

Linear Algebraic Structure of Word Senses, with Applications to Polysemy

Title Linear Algebraic Structure of Word Senses, with Applications to Polysemy
Authors Sanjeev Arora, Yuanzhi Li, Yingyu Liang, Tengyu Ma, Andrej Risteski
Abstract Word embeddings are ubiquitous in NLP and information retrieval, but it is unclear what they represent when the word is polysemous. Here it is shown that multiple word senses reside in linear superposition within the word embedding and simple sparse coding can recover vectors that approximately capture the senses. The success of our approach, which applies to several embedding methods, is mathematically explained using a variant of the random walk on discourses model (Arora et al., 2016). A novel aspect of our technique is that each extracted word sense is accompanied by one of about 2000 “discourse atoms” that gives a succinct description of which other words co-occur with that word sense. Discourse atoms can be of independent interest, and make the method potentially more useful. Empirical tests are used to verify and support the theory.
Tasks Information Retrieval, Word Embeddings
Published 2016-01-14
URL http://arxiv.org/abs/1601.03764v6
PDF http://arxiv.org/pdf/1601.03764v6.pdf
PWC https://paperswithcode.com/paper/linear-algebraic-structure-of-word-senses
Repo https://github.com/PrincetonML/SemanticVector
Framework none
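
The recipe itself is a standard sparse-coding problem, so a hedged sketch is straightforward: factor the word-embedding matrix into sparse codes over a learned dictionary whose rows play the role of discourse atoms. This sketch uses scikit-learn's DictionaryLearning with toy sizes as placeholders; the paper works with roughly 2000 atoms over real embeddings, and is not this code.

```python
# Sparse coding over a (toy, random) word-embedding matrix so each word
# is a sparse combination of "discourse atoms".
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((200, 20))   # stand-in word vectors

coder = DictionaryLearning(
    n_components=30,             # paper: ~2000 discourse atoms
    transform_algorithm="omp",   # sparse code per word
    transform_n_nonzero_coefs=5,
    random_state=0,
)
codes = coder.fit_transform(embeddings)       # (words, atoms), sparse
atoms = coder.components_                     # each row ~ one atom

# For a polysemous word, the nonzero atoms point at its distinct senses.
print("active atoms for word 0:", np.flatnonzero(codes[0]))
```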

Ancestral Causal Inference

Title Ancestral Causal Inference
Authors Sara Magliacane, Tom Claassen, Joris M. Mooij
Abstract Constraint-based causal discovery from limited data is a notoriously difficult challenge due to the many borderline independence test decisions. Several approaches to improve the reliability of the predictions by exploiting redundancy in the independence information have been proposed recently. Though promising, existing approaches can still be greatly improved in terms of accuracy and scalability. We present a novel method that reduces the combinatorial explosion of the search space by using a more coarse-grained representation of causal information, drastically reducing computation time. Additionally, we propose a method to score causal predictions based on their confidence. Crucially, our implementation also allows one to easily combine observational and interventional data and to incorporate various types of available background knowledge. We prove soundness and asymptotic consistency of our method and demonstrate that it can outperform the state-of-the-art on synthetic data, achieving a speedup of several orders of magnitude. We illustrate its practical feasibility by applying it on a challenging protein data set.
Tasks Causal Discovery, Causal Inference
Published 2016-06-22
URL http://arxiv.org/abs/1606.07035v3
PDF http://arxiv.org/pdf/1606.07035v3.pdf
PWC https://paperswithcode.com/paper/ancestral-causal-inference
Repo https://github.com/caus-am/aci
Framework none
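
The confidence-scoring idea can be illustrated in miniature. The heavily simplified Python sketch below (not the authors' implementation, which is in the linked repo) weights conflicting causal statements and scores a candidate ancestral claim by the net weight of inputs supporting it; the real method instead solves a discrete optimization over ancestral relations and scores a prediction by how much the optimal loss changes when it is forced false.

```python
# Toy weighted ancestral statements: (source, target, is_ancestor, weight).
inputs = [
    ("A", "B", True, 2.0),
    ("A", "B", False, 0.5),   # a conflicting, lower-confidence input
    ("B", "C", True, 1.0),
]

def score(src, dst, claim):
    # Net weight in favor of the claim "src is (not) an ancestor of dst".
    s = 0.0
    for a, b, anc, w in inputs:
        if (a, b) == (src, dst):
            s += w if anc == claim else -w
    return s

print(score("A", "B", True))   # 1.5: supported despite the conflict
```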

Discovering Causal Signals in Images

Title Discovering Causal Signals in Images
Authors David Lopez-Paz, Robert Nishihara, Soumith Chintala, Bernhard Schölkopf, Léon Bottou
Abstract This paper establishes the existence of observable footprints that reveal the “causal dispositions” of the object categories appearing in collections of images. We achieve this goal in two steps. First, we take a learning approach to observational causal discovery, and build a classifier that achieves state-of-the-art performance on finding the causal direction between pairs of random variables, given samples from their joint distribution. Second, we use our causal direction classifier to effectively distinguish between features of objects and features of their contexts in collections of static images. Our experiments demonstrate the existence of a relation between the direction of causality and the difference between objects and their contexts, and by the same token, the existence of observable signals that reveal the causal dispositions of objects.
Tasks Causal Discovery
Published 2016-05-26
URL http://arxiv.org/abs/1605.08179v2
PDF http://arxiv.org/pdf/1605.08179v2.pdf
PWC https://paperswithcode.com/paper/discovering-causal-signals-in-images
Repo https://github.com/kyrs/NCC-experiments
Framework tf
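
The first step, a learned classifier for causal direction, can be sketched as follows. The paper's Neural Causation Coefficient averages learned per-point embeddings of the scatter {(x_i, y_i)}; this hedged sketch swaps in hand-crafted moment features so the example stays self-contained, and trains on synthetic cause-to-effect pairs and their flipped (anticausal) versions.

```python
# Learning-based causal direction classifier on synthetic pairs.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample_pair(n=200):
    x = rng.standard_normal(n)
    y = np.tanh(x) + 0.3 * rng.standard_normal(n)   # X causes Y
    return x, y

def featurize(x, y):
    # Moment summary of the joint sample; NCC learns this embedding.
    x = (x - x.mean()) / x.std(); y = (y - y.mean()) / y.std()
    return [np.mean(x * y), np.mean(x**2 * y), np.mean(x * y**2),
            np.mean(x**3), np.mean(y**3)]

X_feat, labels = [], []
for _ in range(500):
    x, y = sample_pair()
    X_feat.append(featurize(x, y)); labels.append(1)   # causal order
    X_feat.append(featurize(y, x)); labels.append(0)   # anticausal
clf = LogisticRegression(max_iter=1000).fit(X_feat, labels)
print("train accuracy:", clf.score(X_feat, labels))
```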

Lecture Notes on Randomized Linear Algebra

Title Lecture Notes on Randomized Linear Algebra
Authors Michael W. Mahoney
Abstract These lecture notes are based on a class I taught on the topic of Randomized Linear Algebra (RLA) at UC Berkeley during the Fall 2013 semester.
Tasks
Published 2016-08-16
URL http://arxiv.org/abs/1608.04481v1
PDF http://arxiv.org/pdf/1608.04481v1.pdf
PWC https://paperswithcode.com/paper/lecture-notes-on-randomized-linear-algebra
Repo https://github.com/NumericalMax/Randomized-Matrix-Product
Framework none
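
As a taste of the notes' subject matter, here is a short sketch of one RLA staple, the randomized range finder behind randomized SVD: project A through a Gaussian test matrix to get an orthonormal basis Q for its approximate range, then take an exact SVD of the small matrix QᵀA. Sizes below are toy.

```python
# Randomized SVD via the Gaussian range finder.
import numpy as np

def randomized_svd(A, rank, oversample=5, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, rank + oversample))  # test matrix
    Q, _ = np.linalg.qr(A @ Omega)             # orthonormal range basis
    B = Q.T @ A                                # small (k+p) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :rank], s[:rank], Vt[:rank]

rng = np.random.default_rng(1)
A = rng.standard_normal((300, 10)) @ rng.standard_normal((10, 200))
U, s, Vt = randomized_svd(A, rank=10)
print("relative error:", np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))
```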

What Is the Best Practice for CNNs Applied to Visual Instance Retrieval?

Title What Is the Best Practice for CNNs Applied to Visual Instance Retrieval?
Authors Jiedong Hao, Jing Dong, Wei Wang, Tieniu Tan
Abstract Previous work has shown that feature maps of deep convolutional neural networks (CNNs) can be interpreted as feature representations of a particular image region. Features aggregated from these feature maps have been exploited for image retrieval tasks and have achieved state-of-the-art performance in recent years. The key to the success of such methods is the feature representation. However, the different factors that impact the effectiveness of features have still not been explored thoroughly, and there has been much less discussion about the best combination of them. The main contribution of our paper is a thorough evaluation of the various factors that affect the discriminative ability of the features extracted from CNNs. Based on the evaluation results, we also identify the best choices for different factors and propose a new multi-scale image feature representation method to encode the image effectively. Finally, we show that the proposed method generalises well and outperforms the state-of-the-art methods on four typical datasets used for visual instance retrieval.
Tasks Image Retrieval
Published 2016-11-05
URL http://arxiv.org/abs/1611.01640v1
PDF http://arxiv.org/pdf/1611.01640v1.pdf
PWC https://paperswithcode.com/paper/what-is-the-best-practice-for-cnns-applied-to
Repo https://github.com/hbwang1427/image_retrieval
Framework none
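
The kind of aggregation under evaluation is easy to sketch. The numpy snippet below (an illustration, not the paper's code) collapses a conv feature map into a global descriptor by spatial max- or sum-pooling and L2-normalizes it for cosine-similarity retrieval; a multi-scale variant would combine descriptors computed at several input resolutions, which is elided here.

```python
# Global descriptor from a conv feature map for instance retrieval.
import numpy as np

def aggregate(feature_map, how="max"):
    # feature_map: (channels, height, width) activations from a CNN.
    pooled = feature_map.max(axis=(1, 2)) if how == "max" \
             else feature_map.sum(axis=(1, 2))
    return pooled / (np.linalg.norm(pooled) + 1e-12)  # L2-normalize

rng = np.random.default_rng(0)
fmap_query = np.abs(rng.standard_normal((256, 14, 14)))
fmap_db = np.abs(rng.standard_normal((256, 14, 14)))
q, d = aggregate(fmap_query), aggregate(fmap_db)
print("cosine similarity:", float(q @ d))
```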

Learning Scalable Deep Kernels with Recurrent Structure

Title Learning Scalable Deep Kernels with Recurrent Structure
Authors Maruan Al-Shedivat, Andrew Gordon Wilson, Yunus Saatchi, Zhiting Hu, Eric P. Xing
Abstract Many applications in speech, robotics, finance, and biology deal with sequential data, where ordering matters and recurrent structures are common. However, this structure cannot be easily captured by standard kernel functions. To model such structure, we propose expressive closed-form kernel functions for Gaussian processes. The resulting model, GP-LSTM, fully encapsulates the inductive biases of long short-term memory (LSTM) recurrent networks, while retaining the non-parametric probabilistic advantages of Gaussian processes. We learn the properties of the proposed kernels by optimizing the Gaussian process marginal likelihood using a new provably convergent semi-stochastic gradient procedure and exploit the structure of these kernels for scalable training and prediction. This approach provides a practical representation for Bayesian LSTMs. We demonstrate state-of-the-art performance on several benchmarks, and thoroughly investigate a consequential autonomous driving application, where the predictive uncertainties provided by GP-LSTM are uniquely valuable.
Tasks Autonomous Driving, Gaussian Processes, Smart Grid Prediction
Published 2016-10-27
URL http://arxiv.org/abs/1610.08936v3
PDF http://arxiv.org/pdf/1610.08936v3.pdf
PWC https://paperswithcode.com/paper/learning-scalable-deep-kernels-with-recurrent
Repo https://github.com/alshedivat/keras-gp
Framework tf
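
The deep-kernel construction is k(x, x') = k_RBF(h(x), h(x')) with h a recurrent embedding. In the sketch below (an assumption-laden illustration, not the linked keras-gp code), a fixed vanilla RNN stands in for the LSTM, whose weights the paper actually learns by maximizing the GP marginal likelihood.

```python
# RBF kernel over recurrent sequence embeddings.
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.standard_normal(4) * 0.5
W_h = rng.standard_normal((4, 4)) * 0.3

def embed(seq):
    h = np.zeros(4)
    for x in seq:                       # stand-in for the paper's LSTM
        h = np.tanh(W_in * x + W_h @ h)
    return h

def deep_kernel(seq_a, seq_b, lengthscale=1.0):
    d = embed(seq_a) - embed(seq_b)
    return np.exp(-0.5 * d @ d / lengthscale**2)

seqs = [[0.1, 0.2], [0.2, 0.1], [5.0, 5.0]]
K = np.array([[deep_kernel(a, b) for b in seqs] for a in seqs])
print(K.round(3))                       # Gram matrix for the GP
```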

Constructing a Natural Language Inference Dataset using Generative Neural Networks

Title Constructing a Natural Language Inference Dataset using Generative Neural Networks
Authors Janez Starc, Dunja Mladenić
Abstract Natural Language Inference is an important task for Natural Language Understanding. It is concerned with classifying the logical relation between two sentences. In this paper, we propose several generative neural networks for generating text hypotheses, which allow the construction of new Natural Language Inference datasets. To evaluate the models, we propose a new metric – the accuracy of a classifier trained on the generated dataset. The accuracy obtained by our best generative model is only 2.7% lower than the accuracy of the classifier trained on the original, human-crafted dataset. Furthermore, the best generated dataset combined with the original dataset achieves the highest accuracy. The best model learns a mapping embedding for each training example. By comparing various metrics we show that datasets that obtain higher ROUGE or METEOR scores do not necessarily yield higher classification accuracies. We also provide an analysis of the characteristics of a good dataset, including how distinguishable the generated datasets are from the original one.
Tasks Natural Language Inference
Published 2016-07-20
URL http://arxiv.org/abs/1607.06025v2
PDF http://arxiv.org/pdf/1607.06025v2.pdf
PWC https://paperswithcode.com/paper/constructing-a-natural-language-inference
Repo https://github.com/jstarc/nli_generation
Framework none
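
The proposed metric is simple to operationalize: train a classifier on the generated dataset and report its accuracy on the original human-written test set. A toy sketch with stand-in data (real NLI data would be premise-hypothesis pairs, not random vectors):

```python
# Generated-dataset quality metric: accuracy on the original test set.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def toy_dataset(n, noise):
    X = rng.standard_normal((n, 20))
    y = (X[:, 0] + noise * rng.standard_normal(n) > 0).astype(int)
    return X, y

X_gen, y_gen = toy_dataset(1000, noise=0.5)    # "generated" data
X_orig, y_orig = toy_dataset(500, noise=0.1)   # "original" test data

clf = LogisticRegression(max_iter=1000).fit(X_gen, y_gen)
print("metric (accuracy on original test set):", clf.score(X_orig, y_orig))
```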

Rationalizing Neural Predictions

Title Rationalizing Neural Predictions
Authors Tao Lei, Regina Barzilay, Tommi Jaakkola
Abstract Prediction without justification has limited applicability. As a remedy, we learn to extract pieces of input text as justifications – rationales – that are tailored to be short and coherent, yet sufficient for making the same prediction. Our approach combines two modular components, a generator and an encoder, which are trained to operate well together. The generator specifies a distribution over text fragments as candidate rationales, and these are passed through the encoder for prediction. Rationales are never given during training. Instead, the model is regularized by desiderata for rationales. We evaluate the approach on multi-aspect sentiment analysis against manually annotated test cases. Our approach outperforms an attention-based baseline by a significant margin. We also successfully illustrate the method on a question retrieval task.
Tasks Sentiment Analysis
Published 2016-06-13
URL http://arxiv.org/abs/1606.04155v2
PDF http://arxiv.org/pdf/1606.04155v2.pdf
PWC https://paperswithcode.com/paper/rationalizing-neural-predictions
Repo https://github.com/Gorov/three_player_for_emnlp
Framework pytorch
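
The desiderata that substitute for rationale supervision can be written down directly: a selection mask is penalized for its size (short rationales) and for its number of on/off transitions (coherent, contiguous rationales), on top of the prediction loss. A minimal numpy sketch of that objective; the trained generator and encoder are elided.

```python
# Rationale regularizers: sparsity + coherence over a binary token mask.
import numpy as np

def rationale_cost(pred_loss, z, lam_sparse=0.1, lam_coherent=0.05):
    sparsity = z.sum()                         # number of kept tokens
    coherence = np.abs(np.diff(z)).sum()       # on/off transitions
    return pred_loss + lam_sparse * sparsity + lam_coherent * coherence

z_contiguous = np.array([0, 0, 1, 1, 1, 0, 0])
z_scattered  = np.array([1, 0, 1, 0, 1, 0, 0])
# Same prediction loss and length: the contiguous rationale is cheaper.
print(rationale_cost(0.3, z_contiguous), rationale_cost(0.3, z_scattered))
```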

Connecting Generative Adversarial Networks and Actor-Critic Methods

Title Connecting Generative Adversarial Networks and Actor-Critic Methods
Authors David Pfau, Oriol Vinyals
Abstract Both generative adversarial networks (GAN) in unsupervised learning and actor-critic methods in reinforcement learning (RL) have gained a reputation for being difficult to optimize. Practitioners in both fields have amassed a large number of strategies to mitigate these instabilities and improve training. Here we show that GANs can be viewed as actor-critic methods in an environment where the actor cannot affect the reward. We review the strategies for stabilizing training for each class of models, both those that generalize between the two and those that are particular to each model. We also review a number of extensions to GANs and RL algorithms with even more complicated information flow. We hope that by highlighting this formal connection we will encourage both GAN and RL communities to develop general, scalable, and stable algorithms for multilevel optimization with deep networks, and to draw inspiration across communities.
Tasks
Published 2016-10-06
URL http://arxiv.org/abs/1610.01945v3
PDF http://arxiv.org/pdf/1610.01945v3.pdf
PWC https://paperswithcode.com/paper/connecting-generative-adversarial-networks
Repo https://github.com/170928/Multi-Reinforcement-Learning-Study-List
Framework none

Transfer String Kernel for Cross-Context DNA-Protein Binding Prediction

Title Transfer String Kernel for Cross-Context DNA-Protein Binding Prediction
Authors Ritambhara Singh, Jack Lanchantin, Gabriel Robins, Yanjun Qi
Abstract Through sequence-based classification, this paper tries to accurately predict the DNA binding sites of transcription factors (TFs) in an unannotated cellular context. Related methods in the literature fail to perform such predictions accurately, since they do not consider sample distribution shift of sequence segments from an annotated (source) context to an unannotated (target) context. We, therefore, propose a method called “Transfer String Kernel” (TSK) that achieves improved prediction of transcription factor binding sites (TFBS) using knowledge transfer via cross-context sample adaptation. TSK maps sequence segments to a high-dimensional feature space using a discriminative mismatch string kernel framework. In this high-dimensional space, labeled examples of the source context are re-weighted so that the revised sample distribution matches the target context more closely. We have experimentally verified TSK for TFBS identification on fourteen different TFs under a cross-organism setting. We find that TSK consistently outperforms the state-of-the-art TFBS tools, especially when working with TFs whose binding sequences are not conserved across contexts. We also demonstrate the generalizability of TSK by showing its cutting-edge performance on a different set of cross-context tasks for MHC peptide binding prediction.
Tasks Transfer Learning
Published 2016-09-12
URL http://arxiv.org/abs/1609.03490v1
PDF http://arxiv.org/pdf/1609.03490v1.pdf
PWC https://paperswithcode.com/paper/transfer-string-kernel-for-cross-context-dna
Repo https://github.com/QData/TransferStringKernel
Framework none
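
Both ingredients can be sketched on toy DNA strings. The snippet below uses a plain k-mer spectrum feature map (the paper's mismatch kernel additionally allows inexact k-mer matches) and re-weights source sequences so their mean feature vector matches the target context, via a crude least-squares fit rather than the paper's kernel-based sample adaptation.

```python
# k-mer spectrum features plus a toy source-reweighting step.
import itertools
import numpy as np

K = 2
KMERS = ["".join(p) for p in itertools.product("ACGT", repeat=K)]

def spectrum(seq):
    v = np.array([sum(seq[i:i+K] == km for i in range(len(seq)-K+1))
                  for km in KMERS], float)
    return v / (np.linalg.norm(v) + 1e-12)

source = ["ACGTACGT", "AACCGGTT", "ACACACAC"]
target = ["GTGTGTGT", "GGTTGGTT"]

Phi_s = np.array([spectrum(s) for s in source])      # (n_source, 16)
mu_t = np.array([spectrum(s) for s in target]).mean(axis=0)

# Weights so the weighted source mean matches the target mean profile.
w, *_ = np.linalg.lstsq(Phi_s.T, mu_t, rcond=None)
w = np.clip(w, 0, None)
print("source weights:", w.round(3))                 # up-weights GT-rich reads
```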

DeepDiary: Automatic Caption Generation for Lifelogging Image Streams

Title DeepDiary: Automatic Caption Generation for Lifelogging Image Streams
Authors Chenyou Fan, David J. Crandall
Abstract Lifelogging cameras capture everyday life from a first-person perspective, but generate so much data that it is hard for users to browse and organize their image collections effectively. In this paper, we propose to use automatic image captioning algorithms to generate textual representations of these collections. We develop and explore novel techniques based on deep learning to generate captions for both individual images and image streams, using temporal consistency constraints to create summaries that are both more compact and less noisy. We evaluate our techniques with quantitative and qualitative results, and apply captioning to an image retrieval application for finding potentially private images. Our results suggest that our automatic captioning algorithms, while imperfect, may work well enough to help users manage lifelogging photo collections.
Tasks Image Captioning, Image Retrieval
Published 2016-08-12
URL http://arxiv.org/abs/1608.03819v1
PDF http://arxiv.org/pdf/1608.03819v1.pdf
PWC https://paperswithcode.com/paper/deepdiary-automatic-caption-generation-for
Repo https://github.com/fanchenyou/deepdiary
Framework none

Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks

Title Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks
Authors Yossi Adi, Einat Kermany, Yonatan Belinkov, Ofer Lavi, Yoav Goldberg
Abstract There is a lot of research interest in encoding variable length sentences into fixed length vectors, in a way that preserves the sentence meanings. Two common methods include representations based on averaging word vectors, and representations based on the hidden states of recurrent neural networks such as LSTMs. The sentence vectors are used as features for subsequent machine learning tasks or for pre-training in the context of deep learning. However, not much is known about the properties that are encoded in these sentence representations and about the language information they capture. We propose a framework that facilitates better understanding of the encoded representations. We define prediction tasks around isolated aspects of sentence structure (namely sentence length, word content, and word order), and score representations by the ability to train a classifier to solve each prediction task when using the representation as input. We demonstrate the potential contribution of the approach by analyzing different sentence representation mechanisms. The analysis sheds light on the relative strengths of different sentence embedding methods with respect to these low level prediction tasks, and on the effect of the encoded vector’s dimensionality on the resulting representations.
Tasks Sentence Embedding, Sentence Embeddings
Published 2016-08-15
URL http://arxiv.org/abs/1608.04207v3
PDF http://arxiv.org/pdf/1608.04207v3.pdf
PWC https://paperswithcode.com/paper/fine-grained-analysis-of-sentence-embeddings
Repo https://github.com/facebookresearch/InferSent
Framework pytorch
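
The probing methodology itself fits in a few lines: freeze the sentence representation, train a simple classifier to predict one surface property from it, and read the test accuracy as the score. In this sketch, random averaged word vectors stand in for a real encoder and the property is (binarized) sentence length.

```python
# Probing a sentence embedding for a surface property (length).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
word_vecs = rng.standard_normal((1000, 50))          # toy vocabulary

def embed(length):
    idx = rng.integers(0, 1000, size=length)
    return word_vecs[idx].mean(axis=0)               # averaged word vectors

lengths = rng.integers(3, 20, size=2000)
X = np.stack([embed(n) for n in lengths])
y = (lengths > 10).astype(int)                       # binarized length task

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
print("length-probe accuracy:", probe.score(Xte, yte))
```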

Finding Alternate Features in Lasso

Title Finding Alternate Features in Lasso
Authors Satoshi Hara, Takanori Maehara
Abstract We propose a method for finding alternate features missing from the Lasso optimal solution. In the ordinary Lasso problem, one global optimum is obtained and the resulting features are interpreted as task-relevant features. However, this can overlook possibly relevant features that the Lasso does not select. With the proposed method, we can provide not only the Lasso optimal solution but also possible alternate features to the Lasso solution. We show that such alternate features can be computed efficiently by avoiding redundant computations. We also demonstrate how the proposed method works on the 20 Newsgroups data, showing that reasonable features are found as alternates.
Tasks
Published 2016-11-18
URL http://arxiv.org/abs/1611.05940v2
PDF http://arxiv.org/pdf/1611.05940v2.pdf
PWC https://paperswithcode.com/paper/finding-alternate-features-in-lasso
Repo https://github.com/sato9hara/LassoVariants
Framework none
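
The search for alternates can be illustrated by brute force (the paper's contribution is doing this efficiently by avoiding redundant computations): fit the Lasso, then for each selected feature, refit with that feature removed and report which features newly enter the solution.

```python
# Brute-force alternate-feature search around a Lasso solution.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.standard_normal((n, d))
X[:, 1] = X[:, 0] + 0.1 * rng.standard_normal(n)   # feature 1 ~ feature 0
y = X[:, 0] + 0.5 * X[:, 2] + 0.1 * rng.standard_normal(n)

def selected(X, y, alpha=0.1):
    return set(np.flatnonzero(Lasso(alpha=alpha).fit(X, y).coef_))

base = selected(X, y)
for j in sorted(base):
    X_masked = X.copy()
    X_masked[:, j] = 0.0                           # drop feature j
    alternates = selected(X_masked, y) - base
    print(f"feature {j}: alternates {sorted(alternates)}")
```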