October 21, 2019

Paper Group AWR 110

An Interpretable Reasoning Network for Multi-Relation Question Answering

Title An Interpretable Reasoning Network for Multi-Relation Question Answering
Authors Mantong Zhou, Minlie Huang, Xiaoyan Zhu
Abstract Multi-relation Question Answering is a challenging task, due to the requirement of elaborated analysis on questions and reasoning over multiple fact triples in knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis, thereby allowing manual manipulation in predicting the final answer.
Tasks Question Answering
Published 2018-01-15
URL http://arxiv.org/abs/1801.04726v3
PDF http://arxiv.org/pdf/1801.04726v3.pdf
PWC https://paperswithcode.com/paper/an-interpretable-reasoning-network-for-multi
Repo https://github.com/zmtkeke/IRN
Framework tf

Learning 3D Shapes as Multi-Layered Height-maps using 2D Convolutional Networks

Title Learning 3D Shapes as Multi-Layered Height-maps using 2D Convolutional Networks
Authors Kripasindhu Sarkar, Basavaraj Hampiholi, Kiran Varanasi, Didier Stricker
Abstract We present a novel global representation of 3D shapes, suitable for the application of 2D CNNs. We represent 3D shapes as multi-layered height-maps (MLH) where at each grid location, we store multiple instances of height maps, thereby representing 3D shape detail that is hidden behind several layers of occlusion. We provide a novel view merging method for combining view dependent information (e.g., MLH descriptors) from multiple views. Because it can use 2D CNNs, our method is highly memory efficient in terms of input resolution compared to voxel-based input. Together with MLH descriptors and our multi-view merging, we achieve state-of-the-art classification results on the ModelNet dataset.
Tasks 3D Object Classification
Published 2018-07-23
URL http://arxiv.org/abs/1807.08485v2
PDF http://arxiv.org/pdf/1807.08485v2.pdf
PWC https://paperswithcode.com/paper/learning-3d-shapes-as-multi-layered-height
Repo https://github.com/krips89/mlh_mvcnn
Framework pytorch

A Temporally-Aware Interpolation Network for Video Frame Inpainting

Title A Temporally-Aware Interpolation Network for Video Frame Inpainting
Authors Ximeng Sun, Ryan Szeto, Jason J. Corso
Abstract We propose the first deep learning solution to video frame inpainting, a challenging instance of the general video inpainting problem with applications in video editing, manipulation, and forensics. Our task is less ambiguous than frame interpolation and video prediction because we have access to both the temporal context and a partial glimpse of the future, allowing us to better evaluate the quality of a model’s predictions objectively. We devise a pipeline composed of two modules: a bidirectional video prediction module, and a temporally-aware frame interpolation module. The prediction module makes two intermediate predictions of the missing frames, one conditioned on the preceding frames and the other conditioned on the following frames, using a shared convolutional LSTM-based encoder-decoder. The interpolation module blends the intermediate predictions to form the final result. Specifically, it utilizes time information and hidden activations from the video prediction module to resolve disagreements between the predictions. Our experiments demonstrate that our approach produces more accurate and qualitatively satisfying results than a state-of-the-art video prediction method and many strong frame inpainting baselines.
Tasks Video Inpainting, Video Prediction
Published 2018-03-20
URL http://arxiv.org/abs/1803.07218v2
PDF http://arxiv.org/pdf/1803.07218v2.pdf
PWC https://paperswithcode.com/paper/a-temporally-aware-interpolation-network-for
Repo https://github.com/sunxm2357/TAI_video_frame_inpainting
Framework pytorch
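
The abstract above says the interpolation module uses time information to blend a forward prediction (conditioned on the preceding frames) with a backward prediction (conditioned on the following frames). As a rough illustration only, the sketch below shows a fixed, non-learned time-weighted average; the paper's module is learned and also uses hidden activations, so this is a simplified stand-in. The function name and the flat-list frame format are assumptions made for the example.

```python
def time_weighted_blend(forward_pred, backward_pred, t, n_missing):
    """Blend two predictions of the t-th missing frame (1-indexed).

    Hypothetical baseline: the weight on the backward prediction grows
    linearly as the frame gets closer to the following context.
    Frames are flat lists of pixel intensities for simplicity.
    """
    if not 1 <= t <= n_missing:
        raise ValueError("t must lie in 1..n_missing")
    w = t / (n_missing + 1)  # 0 near the preceding frames, 1 near the following ones
    return [(1 - w) * f + w * b for f, b in zip(forward_pred, backward_pred)]


if __name__ == "__main__":
    # Middle frame of three missing frames: both predictions count equally.
    print(time_weighted_blend([0.0, 2.0], [4.0, 6.0], t=2, n_missing=3))
```

A learned module can depart from this fixed schedule wherever the two predictions disagree, which is the role the abstract assigns to the time information and hidden activations.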

Solving the Empirical Bayes Normal Means Problem with Correlated Noise

Title Solving the Empirical Bayes Normal Means Problem with Correlated Noise
Authors Lei Sun, Matthew Stephens
Abstract The Normal Means problem plays a fundamental role in many areas of modern high-dimensional statistics, both in theory and practice. And the Empirical Bayes (EB) approach to solving this problem has been shown to be highly effective, again both in theory and practice. However, almost all EB treatments of the Normal Means problem assume that the observations are independent. In practice correlations are ubiquitous in real-world applications, and these correlations can grossly distort EB estimates. Here, exploiting theory from Schwartzman (2010), we develop new EB methods for solving the Normal Means problem that take account of unknown correlations among observations. We provide practical software implementations of these methods, and illustrate them in the context of large-scale multiple testing problems and False Discovery Rate (FDR) control. In realistic numerical experiments our methods compare favorably with other commonly-used multiple testing methods.
Tasks
Published 2018-12-18
URL http://arxiv.org/abs/1812.07488v2
PDF http://arxiv.org/pdf/1812.07488v2.pdf
PWC https://paperswithcode.com/paper/solving-the-empirical-bayes-normal-means
Repo https://github.com/LSun/cashr_paper
Framework none
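
For orientation, the sketch below shows the textbook Empirical Bayes treatment of the Normal Means problem under the independence assumption that the abstract criticizes. It is not the paper's correlated-noise method. It assumes observations x_i ~ N(theta_i, noise_var) with a N(0, prior_var) prior on each theta_i, estimates prior_var from the data by the method of moments, and returns the posterior means. The function name and the moment estimator are choices made for this example.

```python
def eb_normal_means(x, noise_var=1.0):
    """Empirical Bayes shrinkage for independent normal means.

    Marginally x_i ~ N(0, prior_var + noise_var), so the mean of x_i**2
    estimates prior_var + noise_var. The posterior mean of theta_i is
    then prior_var / (prior_var + noise_var) * x_i.
    """
    n = len(x)
    second_moment = sum(v * v for v in x) / n
    prior_var = max(0.0, second_moment - noise_var)  # clip at zero
    shrink = prior_var / (prior_var + noise_var)
    return [shrink * v for v in x], prior_var


if __name__ == "__main__":
    estimates, prior_var = eb_normal_means([2.0, -2.0, 2.0, -2.0])
    print(prior_var, estimates)
```

When the observations are correlated, the empirical second moment can be far from its independent-case expectation, which is one way the estimated prior, and hence every shrunken estimate, can be distorted.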

Evaluating Generative Adversarial Networks on Explicitly Parameterized Distributions

Title Evaluating Generative Adversarial Networks on Explicitly Parameterized Distributions
Authors Shayne O’Brien, Matt Groh, Abhimanyu Dubey
Abstract The true distribution parameterizations of commonly used image datasets are inaccessible. Rather than designing metrics for feature spaces with unknown characteristics, we propose to measure GAN performance by evaluating on explicitly parameterized, synthetic data distributions. As a case study, we examine the performance of 16 GAN variants on six multivariate distributions of varying dimensionalities and training set sizes. In this learning environment, we observe that: GANs exhibit similar performance trends across dimensionalities; learning depends on the underlying distribution and its complexity; the number of training samples can have a large impact on performance; evaluation and relative comparisons are metric-dependent; diverse sets of hyperparameters can produce a “best” result; and some GANs are more robust to hyperparameter changes than others. These observations both corroborate findings of previous GAN evaluation studies and make novel contributions regarding the relationship between size, complexity, and GAN performance.
Tasks
Published 2018-12-27
URL http://arxiv.org/abs/1812.10782v1
PDF http://arxiv.org/pdf/1812.10782v1.pdf
PWC https://paperswithcode.com/paper/evaluating-generative-adversarial-networks-on
Repo https://github.com/shayneobrien/explicit-gan-eval
Framework pytorch

Deep Generative Modeling of LiDAR Data

Title Deep Generative Modeling of LiDAR Data
Authors Lucas Caccia, Herke van Hoof, Aaron Courville, Joelle Pineau
Abstract Building models capable of generating structured output is a key challenge for AI and robotics. While generative models have been explored on many types of data, little work has been done on synthesizing lidar scans, which play a key role in robot mapping and localization. In this work, we show that one can adapt deep generative models for this task by unravelling lidar scans into a 2D point map. Our approach can generate high quality samples, while simultaneously learning a meaningful latent representation of the data. We demonstrate significant improvements against state-of-the-art point cloud generation methods. Furthermore, we propose a novel data representation that augments the 2D signal with absolute positional information. We show that this helps robustness to noisy and imputed input; the learned model can recover the underlying lidar scan from seemingly uninformative data.
Tasks Point Cloud Generation
Published 2018-12-04
URL https://arxiv.org/abs/1812.01180v4
PDF https://arxiv.org/pdf/1812.01180v4.pdf
PWC https://paperswithcode.com/paper/deep-generative-modeling-of-lidar-data
Repo https://github.com/pclucas14/lidar_generation
Framework pytorch
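
The abstract describes unravelling a lidar scan into a 2D point map. One plausible way to do this, sketched below, is a spherical projection: each 3D point is binned by azimuth (column) and elevation (row), and the cell stores the point's range. The grid size, the elevation limits, and the function name are assumptions for illustration; the paper's exact preprocessing is not given in the abstract.

```python
import math


def scan_to_grid(points, n_rows, n_cols, min_elev, max_elev):
    """Project (x, y, z) points onto an n_rows x n_cols range image.

    Columns index azimuth in [-pi, pi], rows index elevation in
    [min_elev, max_elev] (radians). Empty cells stay at 0.0; if several
    points fall in one cell, the last one wins.
    """
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # the sensor origin carries no direction
        azimuth = math.atan2(y, x)
        elev = math.asin(z / r)
        if not (min_elev <= elev <= max_elev):
            continue  # outside the assumed vertical field of view
        col = min(int((azimuth + math.pi) / (2 * math.pi) * n_cols), n_cols - 1)
        row = min(int((elev - min_elev) / (max_elev - min_elev) * n_rows), n_rows - 1)
        grid[row][col] = r
    return grid


if __name__ == "__main__":
    for row in scan_to_grid([(1.0, 0.0, 0.0)], 2, 4, -math.pi / 4, math.pi / 4):
        print(row)
```

Once the scan is a dense 2D array, standard 2D convolutional generative models can be applied to it, which is the adaptation the abstract refers to.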

Adaptive Input Representations for Neural Language Modeling

Title Adaptive Input Representations for Neural Language Modeling
Authors Alexei Baevski, Michael Auli
Abstract We introduce adaptive input representations for neural language modeling which extend the adaptive softmax of Grave et al. (2017) to input representations of variable capacity. There are several choices on how to factorize the input and output layers, and whether to model words, characters or sub-word units. We perform a systematic comparison of popular choices for a self-attentional architecture. Our experiments show that models equipped with adaptive embeddings are more than twice as fast to train than the popular character input CNN while having a lower number of parameters. On the WikiText-103 benchmark we achieve 18.7 perplexity, an improvement of 10.5 perplexity compared to the previously best published result and on the Billion Word benchmark, we achieve 23.02 perplexity.
Tasks Language Modelling
Published 2018-09-28
URL http://arxiv.org/abs/1809.10853v3
PDF http://arxiv.org/pdf/1809.10853v3.pdf
PWC https://paperswithcode.com/paper/adaptive-input-representations-for-neural
Repo https://github.com/AranKomat/adapinp
Framework pytorch
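
To make "input representations of variable capacity" concrete, the sketch below counts embedding parameters when the vocabulary is split into frequency clusters, each cluster gets a smaller embedding width than the one before it, and every cluster is linearly projected back to the model dimension. The cutoffs, the reduction factor, and the function name are illustrative assumptions, not values taken from the abstract.

```python
def adaptive_embedding_params(vocab_size, cutoffs, dim, factor):
    """Count parameters of a clustered, variable-width embedding table.

    Cluster i covers word ranks [bounds[i], bounds[i+1]) and uses width
    dim // factor**i, plus a (width x dim) projection to the model size.
    """
    bounds = [0] + list(cutoffs) + [vocab_size]
    total = 0
    for i in range(len(bounds) - 1):
        cluster_size = bounds[i + 1] - bounds[i]
        cluster_dim = max(1, dim // (factor ** i))
        total += cluster_size * cluster_dim  # embedding rows
        total += cluster_dim * dim           # projection to model dimension
    return total


if __name__ == "__main__":
    adaptive = adaptive_embedding_params(10000, [1000, 4000], dim=64, factor=4)
    dense = 10000 * 64
    print(adaptive, dense)
```

In this toy setting the clustered table needs 141,376 parameters against 640,000 for a dense table, because the many rare words share a much narrower embedding.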

One-Shot Instance Segmentation

Title One-Shot Instance Segmentation
Authors Claudio Michaelis, Ivan Ustyuzhaninov, Matthias Bethge, Alexander S. Ecker
Abstract We tackle the problem of one-shot instance segmentation: Given an example image of a novel, previously unknown object category, find and segment all objects of this category within a complex scene. To address this challenging new task, we propose Siamese Mask R-CNN. It extends Mask R-CNN by a Siamese backbone encoding both reference image and scene, allowing it to target detection and segmentation towards the reference category. We demonstrate empirical results on MS Coco highlighting challenges of the one-shot setting: while transferring knowledge about instance segmentation to novel object categories works very well, targeting the detection network towards the reference category appears to be more difficult. Our work provides a first strong baseline for one-shot instance segmentation and will hopefully inspire further research into more powerful and flexible scene analysis algorithms. Code is available at: https://github.com/bethgelab/siamese-mask-rcnn
Tasks Few-Shot Learning, Few-Shot Object Detection, Instance Segmentation, Object Detection, One-Shot Instance Segmentation, One-Shot Learning, One-Shot Object Detection
Published 2018-11-28
URL https://arxiv.org/abs/1811.11507v2
PDF https://arxiv.org/pdf/1811.11507v2.pdf
PWC https://paperswithcode.com/paper/one-shot-instance-segmentation
Repo https://github.com/bethgelab/siamese-mask-rcnn
Framework tf

Adversarial Autoencoders for Compact Representations of 3D Point Clouds

Title Adversarial Autoencoders for Compact Representations of 3D Point Clouds
Authors Maciej Zamorski, Maciej Zięba, Piotr Klukowski, Rafał Nowak, Karol Kurach, Wojciech Stokowiec, Tomasz Trzciński
Abstract Deep generative architectures provide a way to model not only images but also complex, 3-dimensional objects, such as point clouds. In this work, we present a novel method to obtain meaningful representations of 3D shapes that can be used for challenging tasks including 3D point generation, reconstruction, compression, and clustering. Contrary to existing methods for 3D point cloud generation that train separate decoupled models for representation learning and generation, our approach is the first end-to-end solution that allows simultaneously learning a latent space of representation and generating 3D shapes from it. Moreover, our model is capable of learning meaningful compact binary descriptors with adversarial training conducted on a latent space. To achieve this goal, we extend a deep Adversarial Autoencoder model (AAE) to accept 3D input and create 3D output. Thanks to our end-to-end training regime, the resulting method called 3D Adversarial Autoencoder (3dAAE) obtains either binary or continuous latent space that covers a much wider portion of training data distribution. Finally, our quantitative evaluation shows that 3dAAE provides state-of-the-art results for 3D point clustering and 3D object retrieval.
Tasks 3D Object Retrieval, Generating 3D Point Clouds, Point Cloud Generation, Representation Learning
Published 2018-11-19
URL http://arxiv.org/abs/1811.07605v3
PDF http://arxiv.org/pdf/1811.07605v3.pdf
PWC https://paperswithcode.com/paper/adversarial-autoencoders-for-generating-3d
Repo https://github.com/MaciejZamorski/3d-AAE
Framework pytorch

Imitation Learning for Neural Morphological String Transduction

Title Imitation Learning for Neural Morphological String Transduction
Authors Peter Makarov, Simon Clematide
Abstract We employ imitation learning to train a neural transition-based string transducer for morphological tasks such as inflection generation and lemmatization. Previous approaches to training this type of model either rely on an external character aligner for the production of gold action sequences, which results in a suboptimal model due to the unwarranted dependence on a single gold action sequence despite spurious ambiguity, or require warm starting with an MLE model. Our approach only requires a simple expert policy, eliminating the need for a character aligner or warm start. It also addresses familiar MLE training biases and leads to strong and state-of-the-art performance on several benchmarks.
Tasks Imitation Learning, Lemmatization
Published 2018-08-31
URL http://arxiv.org/abs/1808.10701v1
PDF http://arxiv.org/pdf/1808.10701v1.pdf
PWC https://paperswithcode.com/paper/imitation-learning-for-neural-morphological
Repo https://github.com/ZurichNLP/emnlp2018-imitation-learning-for-neural-morphology
Framework none

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

Title GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Authors Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
Abstract For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset. In pursuit of this objective, we introduce the General Language Understanding Evaluation benchmark (GLUE), a tool for evaluating and analyzing the performance of models across a diverse range of existing NLU tasks. GLUE is model-agnostic, but it incentivizes sharing knowledge across tasks because certain tasks have very limited training data. We further provide a hand-crafted diagnostic test suite that enables detailed linguistic analysis of NLU models. We evaluate baselines based on current methods for multi-task and transfer learning and find that they do not immediately give substantial improvements over the aggregate performance of training a separate model per task, indicating room for improvement in developing general and robust NLU systems.
Tasks Natural Language Inference, Transfer Learning
Published 2018-04-20
URL http://arxiv.org/abs/1804.07461v3
PDF http://arxiv.org/pdf/1804.07461v3.pdf
PWC https://paperswithcode.com/paper/glue-a-multi-task-benchmark-and-analysis
Repo https://github.com/nyu-mll/GLUE-baselines
Framework pytorch

Rapid Adaptation of Neural Machine Translation to New Languages

Title Rapid Adaptation of Neural Machine Translation to New Languages
Authors Graham Neubig, Junjie Hu
Abstract This paper examines the problem of adapting neural machine translation systems to new, low-resourced languages (LRLs) as effectively and rapidly as possible. We propose methods based on starting with massively multilingual “seed models”, which can be trained ahead-of-time, and then continuing training on data related to the LRL. We contrast a number of strategies, leading to a novel, simple, yet effective method of “similar-language regularization”, where we jointly train on both an LRL of interest and a similar high-resourced language to prevent over-fitting to small LRL data. Experiments demonstrate that massively multilingual models, even without any explicit adaptation, are surprisingly effective, achieving BLEU scores of up to 15.5 with no data from the LRL, and that the proposed similar-language regularization method improves over other adaptation methods by 1.7 BLEU points on average over 4 LRL settings. Code to reproduce experiments is available at https://github.com/neubig/rapid-adaptation
Tasks Machine Translation
Published 2018-08-13
URL http://arxiv.org/abs/1808.04189v1
PDF http://arxiv.org/pdf/1808.04189v1.pdf
PWC https://paperswithcode.com/paper/rapid-adaptation-of-neural-machine
Repo https://github.com/neubig/rapid-adaptation
Framework none

Is Robustness the Cost of Accuracy? – A Comprehensive Study on the Robustness of 18 Deep Image Classification Models

Title Is Robustness the Cost of Accuracy? – A Comprehensive Study on the Robustness of 18 Deep Image Classification Models
Authors Dong Su, Huan Zhang, Hongge Chen, Jinfeng Yi, Pin-Yu Chen, Yupeng Gao
Abstract The prediction accuracy has been the long-lasting and sole standard for comparing the performance of different image classification models, including the ImageNet competition. However, recent studies have highlighted the lack of robustness in well-trained deep neural networks to adversarial examples. Visually imperceptible perturbations to natural images can easily be crafted and mislead the image classifiers towards misclassification. To demystify the trade-offs between robustness and accuracy, in this paper we thoroughly benchmark 18 ImageNet models using multiple robustness metrics, including the distortion, success rate and transferability of adversarial examples between 306 pairs of models. Our extensive experimental results reveal several new insights: (1) linear scaling law - the empirical $\ell_2$ and $\ell_\infty$ distortion metrics scale linearly with the logarithm of classification error; (2) model architecture is a more critical factor to robustness than model size, and the disclosed accuracy-robustness Pareto frontier can be used as an evaluation criterion for ImageNet model designers; (3) for a similar network architecture, increasing network depth slightly improves robustness in $\ell_\infty$ distortion; (4) there exist models (in VGG family) that exhibit high adversarial transferability, while most adversarial examples crafted from one model can only be transferred within the same family. Experiment code is publicly available at https://github.com/huanzhang12/Adversarial_Survey.
Tasks Image Classification
Published 2018-08-05
URL http://arxiv.org/abs/1808.01688v2
PDF http://arxiv.org/pdf/1808.01688v2.pdf
PWC https://paperswithcode.com/paper/is-robustness-the-cost-of-accuracy-a
Repo https://github.com/huanzhang12/Adversarial_Survey
Framework tf

Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling

Title Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling
Authors Carlos Riquelme, George Tucker, Jasper Snoek
Abstract Recent advances in deep reinforcement learning have made significant strides in performance on applications such as Go and Atari games. However, developing practical methods to balance exploration and exploitation in complex domains remains largely unsolved. Thompson Sampling and its extension to reinforcement learning provide an elegant approach to exploration that only requires access to posterior samples of the model. At the same time, advances in approximate Bayesian methods have made posterior approximation for flexible neural network models practical. Thus, it is attractive to consider approximate Bayesian neural networks in a Thompson Sampling framework. To understand the impact of using an approximate posterior on Thompson Sampling, we benchmark well-established and recently developed methods for approximate posterior sampling combined with Thompson Sampling over a series of contextual bandit problems. We found that many approaches that have been successful in the supervised learning setting underperformed in the sequential decision-making scenario. In particular, we highlight the challenge of adapting slowly converging uncertainty estimates to the online setting.
Tasks Decision Making, Multi-Armed Bandits
Published 2018-02-26
URL http://arxiv.org/abs/1802.09127v1
PDF http://arxiv.org/pdf/1802.09127v1.pdf
PWC https://paperswithcode.com/paper/deep-bayesian-bandits-showdown-an-empirical
Repo https://github.com/tensorflow/models
Framework tf
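
The abstract notes that Thompson Sampling only needs posterior samples of the model. The sketch below shows that loop in its simplest form: a non-contextual Bernoulli bandit with an exact Beta posterior per arm. This is a deliberately simplified stand-in; the paper studies contextual bandits with approximate neural-network posteriors. The function name and the uniform Beta(1, 1) prior are assumptions for the example.

```python
import random


def thompson_bernoulli(true_probs, n_rounds, seed=0):
    """Run Thompson Sampling on a Bernoulli bandit; return pulls per arm."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(n_rounds):
        # One posterior sample per arm from Beta(1 + successes, 1 + failures).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled means.
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print(thompson_bernoulli([0.1, 0.9], n_rounds=2000))
```

Exploration here comes entirely from posterior uncertainty: an arm with few pulls has a wide posterior and is occasionally sampled high. If the posterior approximation is slow to become accurate, that mechanism degrades, which is the difficulty the abstract highlights for the online setting.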

From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction

Title From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction
Authors Zihang Dai, Qizhe Xie, Eduard Hovy
Abstract In this work, we study the credit assignment problem in reward augmented maximum likelihood (RAML) learning, and establish a theoretical equivalence between the token-level counterpart of RAML and the entropy regularized reinforcement learning. Inspired by the connection, we propose two sequence prediction algorithms, one extending RAML with fine-grained credit assignment and the other improving Actor-Critic with a systematic entropy regularization. On two benchmark datasets, we show the proposed algorithms outperform RAML and Actor-Critic respectively, providing new alternatives to sequence prediction.
Tasks
Published 2018-04-29
URL http://arxiv.org/abs/1804.10974v1
PDF http://arxiv.org/pdf/1804.10974v1.pdf
PWC https://paperswithcode.com/paper/from-credit-assignment-to-entropy
Repo https://github.com/zihangdai/ERAC-VAML
Framework pytorch