October 21, 2019

Paper Group AWR 101

Fast yet Simple Natural-Gradient Descent for Variational Inference in Complex Models

Title Fast yet Simple Natural-Gradient Descent for Variational Inference in Complex Models
Authors Mohammad Emtiyaz Khan, Didrik Nielsen
Abstract Bayesian inference plays an important role in advancing machine learning, but faces computational challenges when applied to complex models such as deep neural networks. Variational inference circumvents these challenges by formulating Bayesian inference as an optimization problem and solving it using gradient-based optimization. In this paper, we argue in favor of natural-gradient approaches which, unlike their gradient-based counterparts, can improve convergence by exploiting the information geometry of the solutions. We show how to derive fast yet simple natural-gradient updates by using a duality associated with exponential-family distributions. An attractive feature of these methods is that, by using natural-gradients, they are able to extract accurate local approximations for individual model components. We summarize recent results for Bayesian deep learning showing the superiority of natural-gradient approaches over their gradient counterparts.
Tasks Bayesian Inference
Published 2018-07-12
URL http://arxiv.org/abs/1807.04489v2
PDF http://arxiv.org/pdf/1807.04489v2.pdf
PWC https://paperswithcode.com/paper/fast-yet-simple-natural-gradient-descent-for
Repo https://github.com/ssggreg/active_learning
Framework pytorch

How to train your MAML

Title How to train your MAML
Authors Antreas Antoniou, Harrison Edwards, Amos Storkey
Abstract The field of few-shot learning has recently seen substantial advancements. Most of these advancements came from casting few-shot learning as a meta-learning problem. Model Agnostic Meta Learning, or MAML, is currently one of the best approaches for few-shot learning via meta-learning. MAML is simple, elegant and very powerful; however, it has a variety of issues, such as being very sensitive to neural network architectures, often leading to instability during training, requiring arduous hyperparameter searches to stabilize training and achieve high generalization, and being very computationally expensive at both training and inference times. In this paper, we propose various modifications to MAML that not only stabilize the system, but also substantially improve the generalization performance, convergence speed and computational overhead of MAML. We call the resulting method MAML++.
Tasks Few-Shot Image Classification, Few-Shot Learning, Meta-Learning
Published 2018-10-22
URL http://arxiv.org/abs/1810.09502v3
PDF http://arxiv.org/pdf/1810.09502v3.pdf
PWC https://paperswithcode.com/paper/how-to-train-your-maml
Repo https://github.com/AntreasAntoniou/HowToTrainYourMAMLPytorch
Framework pytorch

A Corpus for Reasoning About Natural Language Grounded in Photographs

Title A Corpus for Reasoning About Natural Language Grounded in Photographs
Authors Alane Suhr, Stephanie Zhou, Ally Zhang, Iris Zhang, Huajun Bai, Yoav Artzi
Abstract We introduce a new dataset for joint reasoning about natural language and images, with a focus on semantic diversity, compositionality, and visual reasoning challenges. The data contains 107,292 examples of English sentences paired with web photographs. The task is to determine whether a natural language caption is true about a pair of photographs. We crowdsource the data using sets of visually rich images and a compare-and-contrast task to elicit linguistically diverse language. Qualitative analysis shows the data requires compositional joint reasoning, including about quantities, comparisons, and relations. Evaluation using state-of-the-art visual reasoning methods shows the data presents a strong challenge.
Tasks Visual Reasoning
Published 2018-11-01
URL https://arxiv.org/abs/1811.00491v3
PDF https://arxiv.org/pdf/1811.00491v3.pdf
PWC https://paperswithcode.com/paper/a-corpus-for-reasoning-about-natural-language
Repo https://github.com/vortexJCH/nlvr
Framework none

Collective Entity Disambiguation with Structured Gradient Tree Boosting

Title Collective Entity Disambiguation with Structured Gradient Tree Boosting
Authors Yi Yang, Ozan Irsoy, Kazi Shefaet Rahman
Abstract We present a gradient-tree-boosting-based structured learning model for jointly disambiguating named entities in a document. Gradient tree boosting is a widely used machine learning algorithm that underlies many top-performing natural language processing systems. Surprisingly, most works limit the use of gradient tree boosting as a tool for regular classification or regression problems, despite the structured nature of language. To the best of our knowledge, our work is the first one that employs the structured gradient tree boosting (SGTB) algorithm for collective entity disambiguation. By defining global features over previous disambiguation decisions and jointly modeling them with local features, our system is able to produce globally optimized entity assignments for mentions in a document. Exact inference is prohibitively expensive for our globally normalized model. To solve this problem, we propose Bidirectional Beam Search with Gold path (BiBSG), an approximate inference algorithm that is a variant of the standard beam search algorithm. BiBSG makes use of global information from both past and future to perform better local search. Experiments on standard benchmark datasets show that SGTB significantly improves upon published results. Specifically, SGTB outperforms the previous state-of-the-art neural system by near 1% absolute accuracy on the popular AIDA-CoNLL dataset.
Tasks Entity Disambiguation
Published 2018-02-28
URL http://arxiv.org/abs/1802.10229v2
PDF http://arxiv.org/pdf/1802.10229v2.pdf
PWC https://paperswithcode.com/paper/collective-entity-disambiguation-with
Repo https://github.com/bloomberg/sgtb
Framework none

Sampling Theory for Graph Signals on Product Graphs

Title Sampling Theory for Graph Signals on Product Graphs
Authors Rohan Varma, Jelena Kovačević
Abstract In this paper, we extend the sampling theory on graphs by constructing a framework that exploits the structure in product graphs for efficient sampling and recovery of bandlimited graph signals that lie on them. Product graphs are graphs that are composed from smaller graph atoms; we motivate how this model is a flexible and useful way to model richer classes of data that can be multi-modal in nature. Previous works have established a sampling theory on graphs for bandlimited signals. Importantly, the framework achieves significant savings in both sample complexity and computational complexity.
Tasks
Published 2018-09-26
URL http://arxiv.org/abs/1809.10049v1
PDF http://arxiv.org/pdf/1809.10049v1.pdf
PWC https://paperswithcode.com/paper/sampling-theory-for-graph-signals-on-product
Repo https://github.com/CrowdArt/node-chat-app
Framework none

Exploring the Semantic Content of Unsupervised Graph Embeddings: An Empirical Study

Title Exploring the Semantic Content of Unsupervised Graph Embeddings: An Empirical Study
Authors Stephen Bonner, Ibad Kureshi, John Brennan, Georgios Theodoropoulos, Andrew Stephen McGough, Boguslaw Obara
Abstract Graph embeddings have become a key and widely used technique within the field of graph mining, proving to be successful across a broad range of domains including social, citation, transportation and biological. Graph embedding techniques aim to automatically create a low-dimensional representation of a given graph, which captures key structural elements in the resulting embedding space. However, to date, there has been little work exploring exactly which topological structures are being learned in the embeddings process. In this paper, we investigate if graph embeddings are approximating something analogous with traditional vertex level graph features. If such a relationship can be found, it could be used to provide a theoretical insight into how graph embedding approaches function. We perform this investigation by predicting known topological features, using supervised and unsupervised methods, directly from the embedding space. If a mapping between the embeddings and topological features can be found, then we argue that the structural information encapsulated by the features is represented in the embedding space. To explore this, we present extensive experimental evaluation from five state-of-the-art unsupervised graph embedding techniques, across a range of empirical graph datasets, measuring a selection of topological features. We demonstrate that several topological features are indeed being approximated by the embedding space, allowing key insight into how graph embeddings create good representations.
Tasks Graph Embedding
Published 2018-06-19
URL http://arxiv.org/abs/1806.07464v1
PDF http://arxiv.org/pdf/1806.07464v1.pdf
PWC https://paperswithcode.com/paper/exploring-the-semantic-content-of
Repo https://github.com/sbonner0/unsupervised-graph-embeddings
Framework tf

Learning Private Neural Language Modeling with Attentive Aggregation

Title Learning Private Neural Language Modeling with Attentive Aggregation
Authors Shaoxiong Ji, Shirui Pan, Guodong Long, Xue Li, Jing Jiang, Zi Huang
Abstract Mobile keyboard suggestion is typically regarded as a word-level language modeling problem. Centralized machine learning techniques require massive amounts of collected user data for training, which may raise privacy concerns about users' sensitive personal typing data. Federated learning (FL) provides a promising approach to learning private language modeling for intelligent personalized keyboard suggestion by training models on distributed clients rather than on a central server. To obtain a global model for prediction, existing FL algorithms simply average the client models and ignore the importance of each client during model aggregation. Furthermore, there is no optimization for learning a well-generalized global model on the central server. To solve these problems, we propose a novel model aggregation method with an attention mechanism that considers the contribution of client models to the global model, together with an optimization technique during server aggregation. Our proposed attentive aggregation method minimizes the weighted distance between the server model and client models through iterative parameter updates while attending to the distance between the server model and client models. In experiments on two popular language modeling datasets and a social media dataset, our proposed method outperforms its counterparts in terms of perplexity and communication cost in most settings of comparison.
Tasks Language Modelling
Published 2018-12-17
URL http://arxiv.org/abs/1812.07108v2
PDF http://arxiv.org/pdf/1812.07108v2.pdf
PWC https://paperswithcode.com/paper/learning-private-neural-language-modeling
Repo https://github.com/shaoxiongji/fed-att
Framework pytorch
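
The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, so the softmax over Euclidean distances, the flat parameter lists, and the names attentive_aggregate and step_size are assumptions made for illustration, not the authors' implementation.

```python
import math


def attentive_aggregate(server, clients, step_size=1.0):
    """One attention-weighted server update (illustrative sketch only).

    server:  flat list of server parameters
    clients: list of flat parameter lists, one per client
    Assumption: attention is a softmax over the Euclidean distance
    between the server model and each client model.
    """
    # Distance between the server model and each client model.
    dists = [
        math.sqrt(sum((s - c) ** 2 for s, c in zip(server, client)))
        for client in clients
    ]
    # Numerically stable softmax over the distances.
    peak = max(dists)
    exps = [math.exp(d - peak) for d in dists]
    total = sum(exps)
    attn = [e / total for e in exps]
    # Move the server toward the clients, weighted by attention.
    new_server = [
        s - step_size * sum(a * (s - client[j]) for a, client in zip(attn, clients))
        for j, s in enumerate(server)
    ]
    return new_server, attn


new_server, attn = attentive_aggregate([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

With uniform weights and step_size set to 1, this update reduces to the plain average of the client models, which is the baseline the abstract contrasts against.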

Meta-Learning Probabilistic Inference For Prediction

Title Meta-Learning Probabilistic Inference For Prediction
Authors Jonathan Gordon, John Bronskill, Matthias Bauer, Sebastian Nowozin, Richard E. Turner
Abstract This paper introduces a new framework for data efficient and versatile learning. Specifically: 1) We develop ML-PIP, a general framework for Meta-Learning approximate Probabilistic Inference for Prediction. ML-PIP extends existing probabilistic interpretations of meta-learning to cover a broad class of methods. 2) We introduce VERSA, an instance of the framework employing a flexible and versatile amortization network that takes few-shot learning datasets as inputs, with arbitrary numbers of shots, and outputs a distribution over task-specific parameters in a single forward pass. VERSA substitutes optimization at test time with forward passes through inference networks, amortizing the cost of inference and relieving the need for second derivatives during training. 3) We evaluate VERSA on benchmark datasets where the method sets new state-of-the-art results, handles arbitrary numbers of shots, and for classification, arbitrary numbers of classes at train and test time. The power of the approach is then demonstrated through a challenging few-shot ShapeNet view reconstruction task.
Tasks Few-Shot Learning, Meta-Learning
Published 2018-05-24
URL https://arxiv.org/abs/1805.09921v4
PDF https://arxiv.org/pdf/1805.09921v4.pdf
PWC https://paperswithcode.com/paper/meta-learning-probabilistic-inference-for
Repo https://github.com/Gordonjo/versa
Framework tf

Quaternion Recurrent Neural Networks

Title Quaternion Recurrent Neural Networks
Authors Titouan Parcollet, Mirco Ravanelli, Mohamed Morchid, Georges Linarès, Chiheb Trabelsi, Renato De Mori, Yoshua Bengio
Abstract Recurrent neural networks (RNNs) are powerful architectures to model sequential data, due to their capability to learn short and long-term dependencies between the basic elements of a sequence. Nonetheless, popular tasks such as speech or image recognition involve multi-dimensional input features that are characterized by strong internal dependencies between the dimensions of the input vector. We propose a novel quaternion recurrent neural network (QRNN), along with a quaternion long short-term memory neural network (QLSTM), that take into account both the external relations and these internal structural dependencies with the quaternion algebra. Similarly to capsules, quaternions allow the QRNN to code internal dependencies by composing and processing multidimensional features as single entities, while the recurrent operation reveals correlations between the elements composing the sequence. We show that both QRNN and QLSTM achieve better performance than RNN and LSTM in a realistic application of automatic speech recognition. Finally, we show that QRNN and QLSTM reduce the number of free parameters needed by a maximum factor of 3.3x compared to real-valued RNNs and LSTMs while reaching better results, leading to a more compact representation of the relevant information.
Tasks Speech Recognition
Published 2018-06-12
URL http://arxiv.org/abs/1806.04418v3
PDF http://arxiv.org/pdf/1806.04418v3.pdf
PWC https://paperswithcode.com/paper/quaternion-recurrent-neural-networks
Repo https://github.com/Riccardo-Vecchi/Pytorch-Quaternion-Neural-Networks
Framework pytorch
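
The quaternion algebra mentioned in the abstract rests on the Hamilton product, which a short sketch can make concrete. The tuple layout (r, x, y, z) and the function name hamilton_product are illustrative choices, not taken from the paper or its repository.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (r, x, y, z) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# Basis elements: i * j = k, and the product is not commutative.
i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
ij = hamilton_product(i, j)
ji = hamilton_product(j, i)
```

A single quaternion weight maps four input components to four output components using 4 real parameters, where an unconstrained real-valued 4x4 linear map would use 16. This sharing is one way to read the parameter reduction reported in the abstract.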

Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates

Title Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
Authors Taku Kudo
Abstract Subword units are an effective way to alleviate the open vocabulary problems in neural machine translation (NMT). While sentences are usually converted into unique subword sequences, subword segmentation is potentially ambiguous and multiple segmentations are possible even with the same vocabulary. The question addressed in this paper is whether it is possible to harness the segmentation ambiguity as a noise to improve the robustness of NMT. We present a simple regularization method, subword regularization, which trains the model with multiple subword segmentations probabilistically sampled during training. In addition, for better subword sampling, we propose a new subword segmentation algorithm based on a unigram language model. We experiment with multiple corpora and report consistent improvements especially on low resource and out-of-domain settings.
Tasks Language Modelling, Machine Translation
Published 2018-04-29
URL http://arxiv.org/abs/1804.10959v1
PDF http://arxiv.org/pdf/1804.10959v1.pdf
PWC https://paperswithcode.com/paper/subword-regularization-improving-neural
Repo https://github.com/Waino/OpenNMT-py
Framework pytorch
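
The idea of sampling one of several valid segmentations under a unigram language model can be sketched as follows. This is a toy illustration under stated assumptions: the vocabulary and its probabilities are invented, segmentations are enumerated exhaustively (practical only for short strings), and the function names are hypothetical rather than the paper's algorithm or any library API.

```python
import random


def segmentations(text, vocab):
    """Yield every split of text whose pieces all appear in vocab."""
    if not text:
        yield []
        return
    for end in range(1, len(text) + 1):
        piece = text[:end]
        if piece in vocab:
            for rest in segmentations(text[end:], vocab):
                yield [piece] + rest


def sample_segmentation(text, vocab, rng):
    """Sample a segmentation with probability proportional to the
    product of its pieces' unigram probabilities."""
    candidates = list(segmentations(text, vocab))
    if not candidates:
        raise ValueError("text cannot be segmented with this vocabulary")
    weights = []
    for seg in candidates:
        weight = 1.0
        for piece in seg:
            weight *= vocab[piece]
        weights.append(weight)
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical unigram probabilities, for illustration only.
toy_vocab = {"h": 0.1, "e": 0.1, "l": 0.1, "o": 0.1,
             "he": 0.2, "ll": 0.2, "hello": 0.2}
sampled = sample_segmentation("hello", toy_vocab, random.Random(0))
```

Calling sample_segmentation once per training step would expose the model to different segmentations of the same sentence, which is the source of the regularizing noise the abstract describes.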

Conditional Inference in Pre-trained Variational Autoencoders via Cross-coding

Title Conditional Inference in Pre-trained Variational Autoencoders via Cross-coding
Authors Ga Wu, Justin Domke, Scott Sanner
Abstract Variational Autoencoders (VAEs) are a popular generative model, but one in which conditional inference can be challenging. If the decomposition into query and evidence variables is fixed, conditional VAEs provide an attractive solution. To support arbitrary queries, one is generally reduced to Markov Chain Monte Carlo sampling methods that can suffer from long mixing times. In this paper, we propose an idea we term cross-coding to approximate the distribution over the latent variables after conditioning on an evidence assignment to some subset of the variables. This allows generating query samples without retraining the full VAE. We experimentally evaluate three variations of cross-coding showing that (i) they can be quickly optimized for different decompositions of evidence and query and (ii) they quantitatively and qualitatively outperform Hamiltonian Monte Carlo.
Tasks
Published 2018-05-20
URL http://arxiv.org/abs/1805.07785v2
PDF http://arxiv.org/pdf/1805.07785v2.pdf
PWC https://paperswithcode.com/paper/conditional-inference-in-pre-trained
Repo https://github.com/wuga214/XCoder_VAE_Conditional_Inference
Framework tf

What made you do this? Understanding black-box decisions with sufficient input subsets

Title What made you do this? Understanding black-box decisions with sufficient input subsets
Authors Brandon Carter, Jonas Mueller, Siddhartha Jain, David Gifford
Abstract Local explanation frameworks aim to rationalize particular decisions made by a black-box prediction model. Existing techniques are often restricted to a specific type of predictor or based on input saliency, which may be undesirably sensitive to factors unrelated to the model’s decision making process. We instead propose sufficient input subsets that identify minimal subsets of features whose observed values alone suffice for the same decision to be reached, even if all other input feature values are missing. General principles that globally govern a model’s decision-making can also be revealed by searching for clusters of such input patterns across many data points. Our approach is conceptually straightforward, entirely model-agnostic, simply implemented using instance-wise backward selection, and able to produce more concise rationales than existing techniques. We demonstrate the utility of our interpretation method on various neural network models trained on text, image, and genomic data.
Tasks Decision Making
Published 2018-10-09
URL http://arxiv.org/abs/1810.03805v2
PDF http://arxiv.org/pdf/1810.03805v2.pdf
PWC https://paperswithcode.com/paper/what-made-you-do-this-understanding-black-box
Repo https://github.com/b-carter/SufficientInputSubsets
Framework tf
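
The instance-wise backward selection mentioned in the abstract can be sketched in a few lines. This is a simplified illustration, not the authors' code: the model f, the baseline value standing in for a "missing" feature, the threshold tau, and the function names are all assumptions.

```python
def back_select(f, x, baseline):
    """Greedy backward selection: return feature indices in removal order.

    At each step, mask the feature whose removal leaves f highest.
    """
    remaining = list(range(len(x)))
    current = list(x)
    removed = []

    def score_without(i):
        trial = list(current)
        trial[i] = baseline
        return f(trial)

    while remaining:
        best = max(remaining, key=score_without)
        current[best] = baseline
        remaining.remove(best)
        removed.append(best)
    return removed


def sufficient_subset(f, x, baseline, tau):
    """Restore features, last-removed first, until f reaches tau."""
    if f(x) < tau:
        return None  # the full input does not reach the decision threshold
    order = back_select(f, x, baseline)
    masked = [baseline] * len(x)
    subset = []
    for i in reversed(order):
        masked[i] = x[i]
        subset.append(i)
        if f(masked) >= tau:
            break
    return sorted(subset)


def toy_model(v):
    # Hypothetical scorer: feature 0 matters most, feature 1 least.
    return 0.6 * v[0] + 0.3 * v[2] + 0.1 * v[1]


subset = sufficient_subset(toy_model, [1, 1, 1], baseline=0, tau=0.85)
```

The procedure only queries f on masked inputs, so it treats the model as a black box, in line with the model-agnostic claim in the abstract.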

EPINET: A Fully-Convolutional Neural Network Using Epipolar Geometry for Depth from Light Field Images

Title EPINET: A Fully-Convolutional Neural Network Using Epipolar Geometry for Depth from Light Field Images
Authors Changha Shin, Hae-Gon Jeon, Youngjin Yoon, In So Kweon, Seon Joo Kim
Abstract Light field cameras capture both the spatial and the angular properties of light rays in space. Due to this property, one can compute the depth from light fields in uncontrolled lighting environments, which is a big advantage over active sensing devices. Depth computed from light fields can be used for many applications including 3D modelling and refocusing. However, light field images from hand-held cameras have very narrow baselines with noise, making the depth estimation difficult. Many approaches have been proposed to overcome these limitations for the light field depth estimation, but there is a clear trade-off between the accuracy and the speed in these methods. In this paper, we introduce a fast and accurate light field depth estimation method based on a fully-convolutional neural network. Our network is designed by considering the light field geometry and we also overcome the lack of training data by proposing light field specific data augmentation methods. We achieved the top rank in the HCI 4D Light Field Benchmark on most metrics, and we also demonstrate the effectiveness of the proposed method on real-world light-field images.
Tasks Data Augmentation, Depth Estimation
Published 2018-04-06
URL http://arxiv.org/abs/1804.02379v1
PDF http://arxiv.org/pdf/1804.02379v1.pdf
PWC https://paperswithcode.com/paper/epinet-a-fully-convolutional-neural-network
Repo https://github.com/chshin10/epinet
Framework tf

Teaching Machines to Code: Neural Markup Generation with Visual Attention

Title Teaching Machines to Code: Neural Markup Generation with Visual Attention
Authors Sumeet S. Singh
Abstract We present a neural transducer model with visual attention that learns to generate LaTeX markup of a real-world math formula given its image. Applying sequence modeling and transduction techniques that have been very successful across modalities such as natural language, image, handwriting, speech and audio, we construct an image-to-markup model that learns to produce syntactically and semantically correct LaTeX markup code over 150 words long and achieves a BLEU score of 89%, improving upon the previous state of the art for the Im2Latex problem. We also demonstrate with heat-map visualization how attention helps in interpreting the model and can pinpoint (detect and localize) symbols on the image accurately despite having been trained without any bounding box data.
Tasks
Published 2018-02-15
URL http://arxiv.org/abs/1802.05415v2
PDF http://arxiv.org/pdf/1802.05415v2.pdf
PWC https://paperswithcode.com/paper/teaching-machines-to-code-neural-markup
Repo https://github.com/untrix/im2latex
Framework tf

Eliminating the Blind Spot: Adapting 3D Object Detection and Monocular Depth Estimation to 360° Panoramic Imagery

Title Eliminating the Blind Spot: Adapting 3D Object Detection and Monocular Depth Estimation to 360° Panoramic Imagery
Authors Grégoire Payen de La Garanderie, Amir Atapour Abarghouei, Toby P. Breckon
Abstract Recent automotive vision work has focused almost exclusively on processing forward-facing cameras. However, future autonomous vehicles will not be viable without a more comprehensive surround sensing, akin to a human driver, as can be provided by 360° panoramic cameras. We present an approach to adapt contemporary deep network architectures developed on conventional rectilinear imagery to work on equirectangular 360° panoramic imagery. To address the lack of available annotated panoramic automotive datasets, we adapt a contemporary automotive dataset, via style and projection transformations, to facilitate the cross-domain retraining of contemporary algorithms for panoramic imagery. Following this approach we retrain and adapt existing architectures to recover scene depth and 3D pose of vehicles from monocular panoramic imagery without any panoramic training labels or calibration parameters. Our approach is evaluated qualitatively on crowd-sourced panoramic images and quantitatively using an automotive environment simulator to provide the first benchmark for such techniques within panoramic imagery.
Tasks 3D Object Detection, Autonomous Vehicles, Calibration, Depth Estimation, Monocular Depth Estimation, Object Detection
Published 2018-08-19
URL http://arxiv.org/abs/1808.06253v1
PDF http://arxiv.org/pdf/1808.06253v1.pdf
PWC https://paperswithcode.com/paper/eliminating-the-blind-spot-adapting-3d-object-1
Repo https://github.com/gdlg/panoramic-depth-estimation
Framework tf