October 21, 2019


Paper Group AWR 110

An Interpretable Reasoning Network for Multi-Relation Question Answering. Learning 3D Shapes as Multi-Layered Height-maps using 2D Convolutional Networks. A Temporally-Aware Interpolation Network for Video Frame Inpainting. Solving the Empirical Bayes Normal Means Problem with Correlated Noise. Evaluating Generative Adversarial Networks on Explicit …

An Interpretable Reasoning Network for Multi-Relation Question Answering


Title	An Interpretable Reasoning Network for Multi-Relation Question Answering
Authors	Mantong Zhou, Minlie Huang, Xiaoyan Zhu
Abstract	Multi-relation Question Answering is a challenging task, as it requires elaborate analysis of questions and reasoning over multiple fact triples in a knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis, thereby allowing manual manipulation in predicting the final answer.
Tasks	Question Answering
Published	2018-01-15
URL	http://arxiv.org/abs/1801.04726v3
PDF	http://arxiv.org/pdf/1801.04726v3.pdf
PWC	https://paperswithcode.com/paper/an-interpretable-reasoning-network-for-multi
Repo	https://github.com/zmtkeke/IRN
Framework	tf

Learning 3D Shapes as Multi-Layered Height-maps using 2D Convolutional Networks


Title	Learning 3D Shapes as Multi-Layered Height-maps using 2D Convolutional Networks
Authors	Kripasindhu Sarkar, Basavaraj Hampiholi, Kiran Varanasi, Didier Stricker
Abstract	We present a novel global representation of 3D shapes, suitable for the application of 2D CNNs. We represent 3D shapes as multi-layered height-maps (MLH) where at each grid location, we store multiple instances of height maps, thereby representing 3D shape detail that is hidden behind several layers of occlusion. We provide a novel view merging method for combining view-dependent information (e.g., MLH descriptors) from multiple views. Because it can use 2D CNNs, our method is highly memory efficient in terms of input resolution compared to voxel-based input. Together with MLH descriptors and our multi-view merging, we achieve state-of-the-art classification results on the ModelNet dataset.
Tasks	3D Object Classification
Published	2018-07-23
URL	http://arxiv.org/abs/1807.08485v2
PDF	http://arxiv.org/pdf/1807.08485v2.pdf
PWC	https://paperswithcode.com/paper/learning-3d-shapes-as-multi-layered-height
Repo	https://github.com/krips89/mlh_mvcnn
Framework	pytorch
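
The multi-layered height-map idea from the abstract above can be illustrated on a plain voxel grid. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a binary occupancy grid indexed as `voxels[x][y][z]`, treats a "layer" as the top of each occupied run seen from above, and pads missing layers with `-1`. The function names and the padding value are hypothetical.

```python
from typing import List


def multi_layer_heights(column: List[int], k: int) -> List[int]:
    """Return up to k surface heights for one (x, y) grid cell.

    `column` holds 0/1 occupancy values indexed by height z. Scanning from
    the top down, a height is recorded each time an occupied run begins,
    so surfaces hidden behind earlier ones are still captured.
    Missing layers are padded with -1 (an assumed "no surface" marker).
    """
    heights = []
    prev = 0
    for z in range(len(column) - 1, -1, -1):
        if column[z] and not prev:
            heights.append(z)
        prev = column[z]
    heights = heights[:k]
    return heights + [-1] * (k - len(heights))


def voxels_to_mlh(voxels: List[List[List[int]]], k: int) -> List[List[List[int]]]:
    """Convert an occupancy grid into a k-layer height-map of shape (X, Y, k)."""
    return [[multi_layer_heights(col, k) for col in row] for row in voxels]
```

The result is a 2D grid with `k` channels per cell, which is the kind of input a 2D CNN can consume directly.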

A Temporally-Aware Interpolation Network for Video Frame Inpainting


Title	A Temporally-Aware Interpolation Network for Video Frame Inpainting
Authors	Ximeng Sun, Ryan Szeto, Jason J. Corso
Abstract	We propose the first deep learning solution to video frame inpainting, a challenging instance of the general video inpainting problem with applications in video editing, manipulation, and forensics. Our task is less ambiguous than frame interpolation and video prediction because we have access to both the temporal context and a partial glimpse of the future, allowing us to better evaluate the quality of a model’s predictions objectively. We devise a pipeline composed of two modules: a bidirectional video prediction module, and a temporally-aware frame interpolation module. The prediction module makes two intermediate predictions of the missing frames, one conditioned on the preceding frames and the other conditioned on the following frames, using a shared convolutional LSTM-based encoder-decoder. The interpolation module blends the intermediate predictions to form the final result. Specifically, it utilizes time information and hidden activations from the video prediction module to resolve disagreements between the predictions. Our experiments demonstrate that our approach produces more accurate and qualitatively satisfying results than a state-of-the-art video prediction method and many strong frame inpainting baselines.
Tasks	Video Inpainting, Video Prediction
Published	2018-03-20
URL	http://arxiv.org/abs/1803.07218v2
PDF	http://arxiv.org/pdf/1803.07218v2.pdf
PWC	https://paperswithcode.com/paper/a-temporally-aware-interpolation-network-for
Repo	https://github.com/sunxm2357/TAI_video_frame_inpainting
Framework	pytorch
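
To make the blending step concrete, here is a fixed time-weighted average of a forward and a backward prediction. This is only a hypothetical simple baseline: the interpolation module described in the abstract is learned and also uses hidden activations from the prediction module, which this sketch does not model. Frames are assumed to be flat lists of pixel intensities, and `blend_predictions` is an illustrative name.

```python
from typing import List


def blend_predictions(forward: List[float], backward: List[float],
                      t: int, num_missing: int) -> List[float]:
    """Blend two predictions of missing frame t (1-based) out of num_missing.

    The backward prediction (conditioned on the following frames) gets more
    weight as t moves toward the end of the gap, and the forward prediction
    (conditioned on the preceding frames) gets more weight near the start.
    """
    w_back = t / (num_missing + 1)
    w_fwd = 1.0 - w_back
    return [w_fwd * f + w_back * b for f, b in zip(forward, backward)]
```

A learned module can improve on this by deciding per pixel which prediction to trust, rather than using one weight for the whole frame.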

Solving the Empirical Bayes Normal Means Problem with Correlated Noise


Title	Solving the Empirical Bayes Normal Means Problem with Correlated Noise
Authors	Lei Sun, Matthew Stephens
Abstract	The Normal Means problem plays a fundamental role in many areas of modern high-dimensional statistics, both in theory and practice. And the Empirical Bayes (EB) approach to solving this problem has been shown to be highly effective, again both in theory and practice. However, almost all EB treatments of the Normal Means problem assume that the observations are independent. In practice correlations are ubiquitous in real-world applications, and these correlations can grossly distort EB estimates. Here, exploiting theory from Schwartzman (2010), we develop new EB methods for solving the Normal Means problem that take account of unknown correlations among observations. We provide practical software implementations of these methods, and illustrate them in the context of large-scale multiple testing problems and False Discovery Rate (FDR) control. In realistic numerical experiments our methods compare favorably with other commonly-used multiple testing methods.
Tasks
Published	2018-12-18
URL	http://arxiv.org/abs/1812.07488v2
PDF	http://arxiv.org/pdf/1812.07488v2.pdf
PWC	https://paperswithcode.com/paper/solving-the-empirical-bayes-normal-means
Repo	https://github.com/LSun/cashr_paper
Framework	none
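
For orientation, the classical Empirical Bayes treatment of the Normal Means problem with independent noise can be sketched in a few lines. This is the textbook independent case with a zero-mean normal prior, not the correlated-noise method the abstract proposes. It assumes observations x_j ~ N(theta_j, s^2) with theta_j ~ N(0, tau^2), estimates tau^2 by the method of moments, and requires `noise_sd > 0`. The function name is hypothetical.

```python
from typing import List, Tuple


def eb_normal_means(x: List[float], noise_sd: float) -> Tuple[List[float], float]:
    """Shrink observations toward zero using an estimated normal prior.

    Marginally x_j ~ N(0, tau^2 + s^2), so tau^2 is estimated as
    mean(x^2) - s^2, truncated at zero. The posterior mean of theta_j is
    x_j * tau^2 / (tau^2 + s^2).
    """
    s2 = noise_sd ** 2
    tau2 = max(0.0, sum(v * v for v in x) / len(x) - s2)
    shrink = tau2 / (tau2 + s2)
    return [shrink * v for v in x], tau2
```

When the observations are correlated, the moment estimate above can be badly off, which is the distortion the abstract refers to.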

Evaluating Generative Adversarial Networks on Explicitly Parameterized Distributions


Title	Evaluating Generative Adversarial Networks on Explicitly Parameterized Distributions
Authors	Shayne O’Brien, Matt Groh, Abhimanyu Dubey
Abstract	The true distribution parameterizations of commonly used image datasets are inaccessible. Rather than designing metrics for feature spaces with unknown characteristics, we propose to measure GAN performance by evaluating on explicitly parameterized, synthetic data distributions. As a case study, we examine the performance of 16 GAN variants on six multivariate distributions of varying dimensionalities and training set sizes. In this learning environment, we observe that: GANs exhibit similar performance trends across dimensionalities; learning depends on the underlying distribution and its complexity; the number of training samples can have a large impact on performance; evaluation and relative comparisons are metric-dependent; diverse sets of hyperparameters can produce a “best” result; and some GANs are more robust to hyperparameter changes than others. These observations both corroborate findings of previous GAN evaluation studies and make novel contributions regarding the relationship between size, complexity, and GAN performance.
Tasks
Published	2018-12-27
URL	http://arxiv.org/abs/1812.10782v1
PDF	http://arxiv.org/pdf/1812.10782v1.pdf
PWC	https://paperswithcode.com/paper/evaluating-generative-adversarial-networks-on
Repo	https://github.com/shayneobrien/explicit-gan-eval
Framework	pytorch

Deep Generative Modeling of LiDAR Data


Title	Deep Generative Modeling of LiDAR Data
Authors	Lucas Caccia, Herke van Hoof, Aaron Courville, Joelle Pineau
Abstract	Building models capable of generating structured output is a key challenge for AI and robotics. While generative models have been explored on many types of data, little work has been done on synthesizing lidar scans, which play a key role in robot mapping and localization. In this work, we show that one can adapt deep generative models for this task by unravelling lidar scans into a 2D point map. Our approach can generate high quality samples, while simultaneously learning a meaningful latent representation of the data. We demonstrate significant improvements against state-of-the-art point cloud generation methods. Furthermore, we propose a novel data representation that augments the 2D signal with absolute positional information. We show that this helps robustness to noisy and imputed input; the learned model can recover the underlying lidar scan from seemingly uninformative data.
Tasks	Point Cloud Generation
Published	2018-12-04
URL	https://arxiv.org/abs/1812.01180v4
PDF	https://arxiv.org/pdf/1812.01180v4.pdf
PWC	https://paperswithcode.com/paper/deep-generative-modeling-of-lidar-data
Repo	https://github.com/pclucas14/lidar_generation
Framework	pytorch

Adaptive Input Representations for Neural Language Modeling


Title	Adaptive Input Representations for Neural Language Modeling
Authors	Alexei Baevski, Michael Auli
Abstract	We introduce adaptive input representations for neural language modeling which extend the adaptive softmax of Grave et al. (2017) to input representations of variable capacity. There are several choices on how to factorize the input and output layers, and whether to model words, characters or sub-word units. We perform a systematic comparison of popular choices for a self-attentional architecture. Our experiments show that models equipped with adaptive embeddings are more than twice as fast to train as the popular character input CNN while having a lower number of parameters. On the WikiText-103 benchmark we achieve 18.7 perplexity, an improvement of 10.5 perplexity compared to the previously best published result, and on the Billion Word benchmark we achieve 23.02 perplexity.
Tasks	Language Modelling
Published	2018-09-28
URL	http://arxiv.org/abs/1809.10853v3
PDF	http://arxiv.org/pdf/1809.10853v3.pdf
PWC	https://paperswithcode.com/paper/adaptive-input-representations-for-neural
Repo	https://github.com/AranKomat/adapinp
Framework	pytorch
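
The "variable capacity" idea can be made concrete by counting parameters. The sketch below assumes the vocabulary is sorted by frequency and split into bands, that each successive band uses an embedding dimension reduced by a constant factor, and that every band is projected back to a common dimension. The cutoffs, the factor, and the function name are illustrative assumptions, not values confirmed by the abstract.

```python
from typing import List, Sequence, Tuple


def adaptive_embedding_plan(vocab_size: int, cutoffs: Sequence[int],
                            base_dim: int, factor: int) -> List[Tuple[int, int, int]]:
    """Return (band_size, band_dim, n_params) for each frequency band.

    Band i covers word ranks [bounds[i], bounds[i + 1]) and uses dimension
    base_dim // factor**i (at least 1). Its parameter count is the
    embedding table plus a linear projection up to base_dim.
    """
    bounds = [0] + list(cutoffs) + [vocab_size]
    plan = []
    for i in range(len(bounds) - 1):
        n_words = bounds[i + 1] - bounds[i]
        dim = max(1, base_dim // factor ** i)
        params = n_words * dim + dim * base_dim
        plan.append((n_words, dim, params))
    return plan
```

For a toy vocabulary of 100 words with cutoffs `[10, 40]`, `base_dim=16`, and `factor=4`, the plan totals 676 parameters, compared with 1600 for a single 100 x 16 embedding table.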

One-Shot Instance Segmentation


Title	One-Shot Instance Segmentation
Authors	Claudio Michaelis, Ivan Ustyuzhaninov, Matthias Bethge, Alexander S. Ecker
Abstract	We tackle the problem of one-shot instance segmentation: Given an example image of a novel, previously unknown object category, find and segment all objects of this category within a complex scene. To address this challenging new task, we propose Siamese Mask R-CNN. It extends Mask R-CNN by a Siamese backbone encoding both reference image and scene, allowing it to target detection and segmentation towards the reference category. We demonstrate empirical results on MS COCO highlighting challenges of the one-shot setting: while transferring knowledge about instance segmentation to novel object categories works very well, targeting the detection network towards the reference category appears to be more difficult. Our work provides a first strong baseline for one-shot instance segmentation and will hopefully inspire further research into more powerful and flexible scene analysis algorithms. Code is available at: https://github.com/bethgelab/siamese-mask-rcnn
Tasks	Few-Shot Learning, Few-Shot Object Detection, Instance Segmentation, Object Detection, One-Shot Instance Segmentation, One-Shot Learning, One-Shot Object Detection
Published	2018-11-28
URL	https://arxiv.org/abs/1811.11507v2
PDF	https://arxiv.org/pdf/1811.11507v2.pdf
PWC	https://paperswithcode.com/paper/one-shot-instance-segmentation
Repo	https://github.com/bethgelab/siamese-mask-rcnn
Framework	tf

Adversarial Autoencoders for Compact Representations of 3D Point Clouds


Title	Adversarial Autoencoders for Compact Representations of 3D Point Clouds
Authors	Maciej Zamorski, Maciej Zięba, Piotr Klukowski, Rafał Nowak, Karol Kurach, Wojciech Stokowiec, Tomasz Trzciński
Abstract	Deep generative architectures provide a way to model not only images but also complex, 3-dimensional objects, such as point clouds. In this work, we present a novel method to obtain meaningful representations of 3D shapes that can be used for challenging tasks including 3D points generation, reconstruction, compression, and clustering. Contrary to existing methods for 3D point cloud generation that train separate decoupled models for representation learning and generation, our approach is the first end-to-end solution that simultaneously learns a latent representation space and generates 3D shapes from it. Moreover, our model is capable of learning meaningful compact binary descriptors with adversarial training conducted on a latent space. To achieve this goal, we extend a deep Adversarial Autoencoder model (AAE) to accept 3D input and create 3D output. Thanks to our end-to-end training regime, the resulting method, called 3D Adversarial Autoencoder (3dAAE), obtains either a binary or continuous latent space that covers a much wider portion of the training data distribution. Finally, our quantitative evaluation shows that 3dAAE provides state-of-the-art results for 3D points clustering and 3D object retrieval.
Tasks	3D Object Retrieval, Generating 3D Point Clouds, Point Cloud Generation, Representation Learning
Published	2018-11-19
URL	http://arxiv.org/abs/1811.07605v3
PDF	http://arxiv.org/pdf/1811.07605v3.pdf
PWC	https://paperswithcode.com/paper/adversarial-autoencoders-for-generating-3d
Repo	https://github.com/MaciejZamorski/3d-AAE
Framework	pytorch

Imitation Learning for Neural Morphological String Transduction


Title	Imitation Learning for Neural Morphological String Transduction
Authors	Peter Makarov, Simon Clematide
Abstract	We employ imitation learning to train a neural transition-based string transducer for morphological tasks such as inflection generation and lemmatization. Previous approaches to training this type of model either rely on an external character aligner for the production of gold action sequences, which results in a suboptimal model due to the unwarranted dependence on a single gold action sequence despite spurious ambiguity, or require warm starting with an MLE model. Our approach only requires a simple expert policy, eliminating the need for a character aligner or warm start. It also addresses familiar MLE training biases and leads to strong and state-of-the-art performance on several benchmarks.
Tasks	Imitation Learning, Lemmatization
Published	2018-08-31
URL	http://arxiv.org/abs/1808.10701v1
PDF	http://arxiv.org/pdf/1808.10701v1.pdf
PWC	https://paperswithcode.com/paper/imitation-learning-for-neural-morphological
Repo	https://github.com/ZurichNLP/emnlp2018-imitation-learning-for-neural-morphology
Framework	none

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding


Title	GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Authors	Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
Abstract	For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset. In pursuit of this objective, we introduce the General Language Understanding Evaluation benchmark (GLUE), a tool for evaluating and analyzing the performance of models across a diverse range of existing NLU tasks. GLUE is model-agnostic, but it incentivizes sharing knowledge across tasks because certain tasks have very limited training data. We further provide a hand-crafted diagnostic test suite that enables detailed linguistic analysis of NLU models. We evaluate baselines based on current methods for multi-task and transfer learning and find that they do not immediately give substantial improvements over the aggregate performance of training a separate model per task, indicating room for improvement in developing general and robust NLU systems.
Tasks	Natural Language Inference, Transfer Learning
Published	2018-04-20
URL	http://arxiv.org/abs/1804.07461v3
PDF	http://arxiv.org/pdf/1804.07461v3.pdf
PWC	https://paperswithcode.com/paper/glue-a-multi-task-benchmark-and-analysis
Repo	https://github.com/nyu-mll/GLUE-baselines
Framework	pytorch

Rapid Adaptation of Neural Machine Translation to New Languages


Title	Rapid Adaptation of Neural Machine Translation to New Languages
Authors	Graham Neubig, Junjie Hu
Abstract	This paper examines the problem of adapting neural machine translation systems to new, low-resourced languages (LRLs) as effectively and rapidly as possible. We propose methods based on starting with massively multilingual “seed models”, which can be trained ahead-of-time, and then continuing training on data related to the LRL. We contrast a number of strategies, leading to a novel, simple, yet effective method of “similar-language regularization”, where we jointly train on both an LRL of interest and a similar high-resourced language to prevent over-fitting to small LRL data. Experiments demonstrate that massively multilingual models, even without any explicit adaptation, are surprisingly effective, achieving BLEU scores of up to 15.5 with no data from the LRL, and that the proposed similar-language regularization method improves over other adaptation methods by 1.7 BLEU points on average over 4 LRL settings. Code to reproduce experiments is available at https://github.com/neubig/rapid-adaptation
Tasks	Machine Translation
Published	2018-08-13
URL	http://arxiv.org/abs/1808.04189v1
PDF	http://arxiv.org/pdf/1808.04189v1.pdf
PWC	https://paperswithcode.com/paper/rapid-adaptation-of-neural-machine
Repo	https://github.com/neubig/rapid-adaptation
Framework	none

Is Robustness the Cost of Accuracy? – A Comprehensive Study on the Robustness of 18 Deep Image Classification Models


Title	Is Robustness the Cost of Accuracy? – A Comprehensive Study on the Robustness of 18 Deep Image Classification Models
Authors	Dong Su, Huan Zhang, Hongge Chen, Jinfeng Yi, Pin-Yu Chen, Yupeng Gao
Abstract	Prediction accuracy has long been the sole standard for comparing the performance of different image classification models, including in the ImageNet competition. However, recent studies have highlighted the lack of robustness in well-trained deep neural networks to adversarial examples. Visually imperceptible perturbations to natural images can easily be crafted and mislead the image classifiers towards misclassification. To demystify the trade-offs between robustness and accuracy, in this paper we thoroughly benchmark 18 ImageNet models using multiple robustness metrics, including the distortion, success rate and transferability of adversarial examples between 306 pairs of models. Our extensive experimental results reveal several new insights: (1) linear scaling law - the empirical $\ell_2$ and $\ell_\infty$ distortion metrics scale linearly with the logarithm of classification error; (2) model architecture is a more critical factor to robustness than model size, and the disclosed accuracy-robustness Pareto frontier can be used as an evaluation criterion for ImageNet model designers; (3) for a similar network architecture, increasing network depth slightly improves robustness in $\ell_\infty$ distortion; (4) there exist models (in VGG family) that exhibit high adversarial transferability, while most adversarial examples crafted from one model can only be transferred within the same family. Experiment code is publicly available at https://github.com/huanzhang12/Adversarial_Survey.
Tasks	Image Classification
Published	2018-08-05
URL	http://arxiv.org/abs/1808.01688v2
PDF	http://arxiv.org/pdf/1808.01688v2.pdf
PWC	https://paperswithcode.com/paper/is-robustness-the-cost-of-accuracy-a
Repo	https://github.com/huanzhang12/Adversarial_Survey
Framework	tf

Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling


Title	Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling
Authors	Carlos Riquelme, George Tucker, Jasper Snoek
Abstract	Recent advances in deep reinforcement learning have made significant strides in performance on applications such as Go and Atari games. However, developing practical methods to balance exploration and exploitation in complex domains remains largely unsolved. Thompson Sampling and its extension to reinforcement learning provide an elegant approach to exploration that only requires access to posterior samples of the model. At the same time, advances in approximate Bayesian methods have made posterior approximation for flexible neural network models practical. Thus, it is attractive to consider approximate Bayesian neural networks in a Thompson Sampling framework. To understand the impact of using an approximate posterior on Thompson Sampling, we benchmark well-established and recently developed methods for approximate posterior sampling combined with Thompson Sampling over a series of contextual bandit problems. We found that many approaches that have been successful in the supervised learning setting underperformed in the sequential decision-making scenario. In particular, we highlight the challenge of adapting slowly converging uncertainty estimates to the online setting.
Tasks	Decision Making, Multi-Armed Bandits
Published	2018-02-26
URL	http://arxiv.org/abs/1802.09127v1
PDF	http://arxiv.org/pdf/1802.09127v1.pdf
PWC	https://paperswithcode.com/paper/deep-bayesian-bandits-showdown-an-empirical
Repo	https://github.com/tensorflow/models
Framework	tf
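
Thompson Sampling, as the abstract notes, only needs posterior samples of the model. The simplest instance is a Bernoulli bandit with an exact Beta posterior per arm, sketched below. This is not the contextual, neural-network setting benchmarked in the paper; the arm probabilities, the uniform Beta(1, 1) prior, and the function name are assumptions made for illustration.

```python
import random
from typing import List


def thompson_bernoulli(true_probs: List[float], steps: int, seed: int = 0) -> List[int]:
    """Run Beta-Bernoulli Thompson Sampling and return pull counts per arm."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    wins = [0] * n_arms
    losses = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(steps):
        # Draw one sample from each arm's Beta posterior (prior is Beta(1, 1)).
        samples = [rng.betavariate(wins[a] + 1, losses[a] + 1) for a in range(n_arms)]
        # Act greedily with respect to the sampled values.
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = 1 if rng.random() < true_probs[arm] else 0
        wins[arm] += reward
        losses[arm] += 1 - reward
        pulls[arm] += 1
    return pulls
```

Exploration comes entirely from posterior uncertainty: arms with few pulls have wide posteriors and are occasionally sampled high, while well-explored poor arms are rarely chosen.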

From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction


Title	From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction
Authors	Zihang Dai, Qizhe Xie, Eduard Hovy
Abstract	In this work, we study the credit assignment problem in reward augmented maximum likelihood (RAML) learning, and establish a theoretical equivalence between the token-level counterpart of RAML and the entropy regularized reinforcement learning. Inspired by the connection, we propose two sequence prediction algorithms, one extending RAML with fine-grained credit assignment and the other improving Actor-Critic with a systematic entropy regularization. On two benchmark datasets, we show the proposed algorithms outperform RAML and Actor-Critic respectively, providing new alternatives to sequence prediction.
Tasks
Published	2018-04-29
URL	http://arxiv.org/abs/1804.10974v1
PDF	http://arxiv.org/pdf/1804.10974v1.pdf
PWC	https://paperswithcode.com/paper/from-credit-assignment-to-entropy
Repo	https://github.com/zihangdai/ERAC-VAML
Framework	pytorch
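
As background for the entropy-regularized view, the standard single-step result is that the optimal policy is a softmax over action values with temperature tau, and the corresponding soft value is tau times the log-sum-exp of the scaled values. The sketch below shows only that textbook relationship, not either of the two algorithms proposed in the paper. The function names are hypothetical and `tau` is assumed to be positive.

```python
import math
from typing import List


def soft_value(q_values: List[float], tau: float) -> float:
    """Return tau * log(sum(exp(q / tau))), computed stably."""
    m = max(q_values)
    return m + tau * math.log(sum(math.exp((q - m) / tau) for q in q_values))


def soft_policy(q_values: List[float], tau: float) -> List[float]:
    """Return the entropy-regularized optimal policy pi(a) = exp((Q(a) - V) / tau)."""
    v = soft_value(q_values, tau)
    return [math.exp((q - v) / tau) for q in q_values]
```

As `tau` shrinks, the soft value approaches the maximum action value and the policy approaches a greedy choice; larger `tau` spreads probability more evenly across actions.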