Paper Group AWR 110
An Interpretable Reasoning Network for Multi-Relation Question Answering
Title | An Interpretable Reasoning Network for Multi-Relation Question Answering |
Authors | Mantong Zhou, Minlie Huang, Xiaoyan Zhu |
Abstract | Multi-relation Question Answering is a challenging task, due to the requirement of elaborated analysis on questions and reasoning over multiple fact triples in knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis, thereby allowing manual manipulation in predicting the final answer. |
Tasks | Question Answering |
Published | 2018-01-15 |
URL | http://arxiv.org/abs/1801.04726v3 |
http://arxiv.org/pdf/1801.04726v3.pdf | |
PWC | https://paperswithcode.com/paper/an-interpretable-reasoning-network-for-multi |
Repo | https://github.com/zmtkeke/IRN |
Framework | tf |
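The hop-by-hop process described in the abstract can be illustrated with a toy traversal over fact triples. This is only a sketch under loud assumptions: the knowledge base, the entity names, and the `follow_hops` helper are hypothetical, and the relation for each hop is supplied by hand, whereas the paper's model predicts it from the question. The point is that keeping every intermediate entity set makes each hop inspectable.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy knowledge base: (subject, relation) -> set of objects.
KB: Dict[Tuple[str, str], Set[str]] = {
    ("alice", "works_at"): {"acme"},
    ("acme", "located_in"): {"berlin"},
    ("berlin", "capital_of"): {"germany"},
}


def follow_hops(start: str, relations: List[str]) -> Tuple[Set[str], List[Set[str]]]:
    """Apply one relation per hop and record every intermediate entity set."""
    current = {start}
    trace: List[Set[str]] = []
    for rel in relations:
        reached: Set[str] = set()
        for entity in current:
            reached |= KB.get((entity, rel), set())
        trace.append(reached)  # traceable intermediate prediction for this hop
        current = reached
    return current, trace


# "Where is the company Alice works at located?" needs two hops.
answer, trace = follow_hops("alice", ["works_at", "located_in"])
print(answer)  # {'berlin'}
```

If a hop returns an empty set, the trace shows exactly which relation failed, which is the kind of failure diagnosis the abstract refers to.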
Learning 3D Shapes as Multi-Layered Height-maps using 2D Convolutional Networks
Title | Learning 3D Shapes as Multi-Layered Height-maps using 2D Convolutional Networks |
Authors | Kripasindhu Sarkar, Basavaraj Hampiholi, Kiran Varanasi, Didier Stricker |
Abstract | We present a novel global representation of 3D shapes, suitable for the application of 2D CNNs. We represent 3D shapes as multi-layered height-maps (MLH) where at each grid location, we store multiple instances of height maps, thereby representing 3D shape detail that is hidden behind several layers of occlusion. We provide a novel view merging method for combining view-dependent information (e.g., MLH descriptors) from multiple views. Because it can use 2D CNNs, our method is highly memory efficient in terms of input resolution compared to voxel-based input. Together with MLH descriptors and our multi-view merging, we achieve state-of-the-art classification results on the ModelNet dataset. |
Tasks | 3D Object Classification |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.08485v2 |
http://arxiv.org/pdf/1807.08485v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-3d-shapes-as-multi-layered-height |
Repo | https://github.com/krips89/mlh_mvcnn |
Framework | pytorch |
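To make the idea of storing several height values per grid location concrete, here is a minimal sketch. It assumes points normalized to the unit cube, a top-down view along the z axis, and layers picked at evenly spaced positions in each cell's sorted heights; the function name and the layer-selection rule are illustrative assumptions, not the authors' exact descriptor.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def multi_layer_heightmap(points: List[Point], grid: int, layers: int) -> List[List[List[float]]]:
    """Return a grid x grid map holding `layers` height values per cell."""
    # Collect the z values that fall into each (x, y) cell.
    cells: List[List[List[float]]] = [[[] for _ in range(grid)] for _ in range(grid)]
    for x, y, z in points:
        i = min(int(x * grid), grid - 1)
        j = min(int(y * grid), grid - 1)
        cells[i][j].append(z)

    # Empty cells keep 0.0 in every layer (an assumption of this sketch).
    heightmap = [[[0.0] * layers for _ in range(grid)] for _ in range(grid)]
    for i in range(grid):
        for j in range(grid):
            heights = sorted(cells[i][j])
            if not heights:
                continue
            for layer in range(layers):
                # Evenly spaced picks from lowest to highest height in the cell.
                pos = 0 if layers == 1 else layer * (len(heights) - 1) // (layers - 1)
                heightmap[i][j][layer] = heights[pos]
    return heightmap


points = [(0.1, 0.1, 0.9), (0.1, 0.1, 0.1), (0.1, 0.1, 0.5), (0.9, 0.9, 0.3)]
hm = multi_layer_heightmap(points, grid=2, layers=3)
print(hm[0][0])  # [0.1, 0.5, 0.9]
```

The result is a `grid x grid x layers` array, so the layers can be treated as image channels by a 2D CNN.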
A Temporally-Aware Interpolation Network for Video Frame Inpainting
Title | A Temporally-Aware Interpolation Network for Video Frame Inpainting |
Authors | Ximeng Sun, Ryan Szeto, Jason J. Corso |
Abstract | We propose the first deep learning solution to video frame inpainting, a challenging instance of the general video inpainting problem with applications in video editing, manipulation, and forensics. Our task is less ambiguous than frame interpolation and video prediction because we have access to both the temporal context and a partial glimpse of the future, allowing us to better evaluate the quality of a model’s predictions objectively. We devise a pipeline composed of two modules: a bidirectional video prediction module, and a temporally-aware frame interpolation module. The prediction module makes two intermediate predictions of the missing frames, one conditioned on the preceding frames and the other conditioned on the following frames, using a shared convolutional LSTM-based encoder-decoder. The interpolation module blends the intermediate predictions to form the final result. Specifically, it utilizes time information and hidden activations from the video prediction module to resolve disagreements between the predictions. Our experiments demonstrate that our approach produces more accurate and qualitatively satisfying results than a state-of-the-art video prediction method and many strong frame inpainting baselines. |
Tasks | Video Inpainting, Video Prediction |
Published | 2018-03-20 |
URL | http://arxiv.org/abs/1803.07218v2 |
http://arxiv.org/pdf/1803.07218v2.pdf | |
PWC | https://paperswithcode.com/paper/a-temporally-aware-interpolation-network-for |
Repo | https://github.com/sunxm2357/TAI_video_frame_inpainting |
Framework | pytorch |
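The abstract describes blending a forward and a backward prediction of each missing frame. The sketch below shows a fixed, time-weighted linear blend as a simple stand-in; it is not the paper's learned interpolation module, which also uses hidden activations. The function name, the flattened-frame representation, and the weighting formula are assumptions made for illustration.

```python
from typing import List

Frame = List[float]  # a frame flattened to a list of pixel intensities


def time_weighted_blend(forward: List[Frame], backward: List[Frame]) -> List[Frame]:
    """Blend two predictions of the same missing frames, both ordered in time.

    The forward prediction is trusted more near the preceding frames and
    the backward prediction more near the following frames.
    """
    m = len(forward)
    if m != len(backward):
        raise ValueError("both predictions must cover the same missing frames")
    blended: List[Frame] = []
    for t in range(1, m + 1):
        w_fwd = 1.0 - t / (m + 1)  # weight decays as we move away from the past
        frame = [w_fwd * f + (1.0 - w_fwd) * b
                 for f, b in zip(forward[t - 1], backward[t - 1])]
        blended.append(frame)
    return blended


fwd = [[0.0, 0.0] for _ in range(3)]  # toy forward predictions
bwd = [[1.0, 1.0] for _ in range(3)]  # toy backward predictions
print(time_weighted_blend(fwd, bwd))  # [[0.25, 0.25], [0.5, 0.5], [0.75, 0.75]]
```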
Solving the Empirical Bayes Normal Means Problem with Correlated Noise
Title | Solving the Empirical Bayes Normal Means Problem with Correlated Noise |
Authors | Lei Sun, Matthew Stephens |
Abstract | The Normal Means problem plays a fundamental role in many areas of modern high-dimensional statistics, both in theory and practice. And the Empirical Bayes (EB) approach to solving this problem has been shown to be highly effective, again both in theory and practice. However, almost all EB treatments of the Normal Means problem assume that the observations are independent. In practice correlations are ubiquitous in real-world applications, and these correlations can grossly distort EB estimates. Here, exploiting theory from Schwartzman (2010), we develop new EB methods for solving the Normal Means problem that take account of unknown correlations among observations. We provide practical software implementations of these methods, and illustrate them in the context of large-scale multiple testing problems and False Discovery Rate (FDR) control. In realistic numerical experiments our methods compare favorably with other commonly-used multiple testing methods. |
Tasks | |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07488v2 |
http://arxiv.org/pdf/1812.07488v2.pdf | |
PWC | https://paperswithcode.com/paper/solving-the-empirical-bayes-normal-means |
Repo | https://github.com/LSun/cashr_paper |
Framework | none |
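For context on the FDR control mentioned in the abstract, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure, a widely used multiple testing method. It is not the authors' Empirical Bayes method, and its standard guarantee assumes independent (or positively dependent) tests, which is the kind of assumption the paper argues correlations can undermine. The function name and the example p-values are illustrative.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return a reject/keep flag per hypothesis at FDR level `alpha`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: find the largest rank whose p-value is under its threshold.
        if p_values[idx] <= rank * alpha / m:
            largest_k = rank
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(pvals))  # only the first two hypotheses are rejected
```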
Evaluating Generative Adversarial Networks on Explicitly Parameterized Distributions
Title | Evaluating Generative Adversarial Networks on Explicitly Parameterized Distributions |
Authors | Shayne O’Brien, Matt Groh, Abhimanyu Dubey |
Abstract | The true distribution parameterizations of commonly used image datasets are inaccessible. Rather than designing metrics for feature spaces with unknown characteristics, we propose to measure GAN performance by evaluating on explicitly parameterized, synthetic data distributions. As a case study, we examine the performance of 16 GAN variants on six multivariate distributions of varying dimensionalities and training set sizes. In this learning environment, we observe that: GANs exhibit similar performance trends across dimensionalities; learning depends on the underlying distribution and its complexity; the number of training samples can have a large impact on performance; evaluation and relative comparisons are metric-dependent; diverse sets of hyperparameters can produce a “best” result; and some GANs are more robust to hyperparameter changes than others. These observations both corroborate findings of previous GAN evaluation studies and make novel contributions regarding the relationship between size, complexity, and GAN performance. |
Tasks | |
Published | 2018-12-27 |
URL | http://arxiv.org/abs/1812.10782v1 |
http://arxiv.org/pdf/1812.10782v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-generative-adversarial-networks-on |
Repo | https://github.com/shayneobrien/explicit-gan-eval |
Framework | pytorch |
Deep Generative Modeling of LiDAR Data
Title | Deep Generative Modeling of LiDAR Data |
Authors | Lucas Caccia, Herke van Hoof, Aaron Courville, Joelle Pineau |
Abstract | Building models capable of generating structured output is a key challenge for AI and robotics. While generative models have been explored on many types of data, little work has been done on synthesizing lidar scans, which play a key role in robot mapping and localization. In this work, we show that one can adapt deep generative models for this task by unravelling lidar scans into a 2D point map. Our approach can generate high quality samples, while simultaneously learning a meaningful latent representation of the data. We demonstrate significant improvements against state-of-the-art point cloud generation methods. Furthermore, we propose a novel data representation that augments the 2D signal with absolute positional information. We show that this helps robustness to noisy and imputed input; the learned model can recover the underlying lidar scan from seemingly uninformative data. |
Tasks | Point Cloud Generation |
Published | 2018-12-04 |
URL | https://arxiv.org/abs/1812.01180v4 |
https://arxiv.org/pdf/1812.01180v4.pdf | |
PWC | https://paperswithcode.com/paper/deep-generative-modeling-of-lidar-data |
Repo | https://github.com/pclucas14/lidar_generation |
Framework | pytorch |
Adaptive Input Representations for Neural Language Modeling
Title | Adaptive Input Representations for Neural Language Modeling |
Authors | Alexei Baevski, Michael Auli |
Abstract | We introduce adaptive input representations for neural language modeling which extend the adaptive softmax of Grave et al. (2017) to input representations of variable capacity. There are several choices on how to factorize the input and output layers, and whether to model words, characters or sub-word units. We perform a systematic comparison of popular choices for a self-attentional architecture. Our experiments show that models equipped with adaptive embeddings are more than twice as fast to train than the popular character input CNN while having a lower number of parameters. On the WikiText-103 benchmark we achieve 18.7 perplexity, an improvement of 10.5 perplexity compared to the previously best published result and on the Billion Word benchmark, we achieve 23.02 perplexity. |
Tasks | Language Modelling |
Published | 2018-09-28 |
URL | http://arxiv.org/abs/1809.10853v3 |
http://arxiv.org/pdf/1809.10853v3.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-input-representations-for-neural |
Repo | https://github.com/AranKomat/adapinp |
Framework | pytorch |
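The phrase "input representations of variable capacity" can be made concrete with a parameter count. The sketch assumes a scheme in the style of adaptive softmax: the frequency-sorted vocabulary is split into bands, each band's embedding width shrinks by a constant factor, and each band is projected back to the model dimension. The cutoffs, the factor, and the function name are hypothetical values chosen for illustration, not the paper's configuration.

```python
from typing import List


def adaptive_embedding_params(vocab_size: int, model_dim: int,
                              cutoffs: List[int], factor: int) -> int:
    """Count parameters when rarer word bands get narrower embeddings."""
    boundaries = [0] + list(cutoffs) + [vocab_size]
    total = 0
    for band, (lo, hi) in enumerate(zip(boundaries[:-1], boundaries[1:])):
        band_dim = model_dim // (factor ** band)  # narrower for rarer words
        total += (hi - lo) * band_dim             # embedding table of this band
        total += band_dim * model_dim             # projection up to model_dim
    return total


full = 100_000 * 1024  # one fixed-width embedding table
adaptive = adaptive_embedding_params(100_000, 1024, cutoffs=[20_000, 60_000], factor=4)
print(full, adaptive)  # 102400000 34656256
```

Under these assumed settings the input layer needs roughly a third of the parameters of a fixed-width table, which is consistent with the abstract's claim of a lower parameter count.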
One-Shot Instance Segmentation
Title | One-Shot Instance Segmentation |
Authors | Claudio Michaelis, Ivan Ustyuzhaninov, Matthias Bethge, Alexander S. Ecker |
Abstract | We tackle the problem of one-shot instance segmentation: Given an example image of a novel, previously unknown object category, find and segment all objects of this category within a complex scene. To address this challenging new task, we propose Siamese Mask R-CNN. It extends Mask R-CNN by a Siamese backbone encoding both reference image and scene, allowing it to target detection and segmentation towards the reference category. We demonstrate empirical results on MS Coco highlighting challenges of the one-shot setting: while transferring knowledge about instance segmentation to novel object categories works very well, targeting the detection network towards the reference category appears to be more difficult. Our work provides a first strong baseline for one-shot instance segmentation and will hopefully inspire further research into more powerful and flexible scene analysis algorithms. Code is available at: https://github.com/bethgelab/siamese-mask-rcnn |
Tasks | Few-Shot Learning, Few-Shot Object Detection, Instance Segmentation, Object Detection, One-Shot Instance Segmentation, One-Shot Learning, One-Shot Object Detection |
Published | 2018-11-28 |
URL | https://arxiv.org/abs/1811.11507v2 |
https://arxiv.org/pdf/1811.11507v2.pdf | |
PWC | https://paperswithcode.com/paper/one-shot-instance-segmentation |
Repo | https://github.com/bethgelab/siamese-mask-rcnn |
Framework | tf |
Adversarial Autoencoders for Compact Representations of 3D Point Clouds
Title | Adversarial Autoencoders for Compact Representations of 3D Point Clouds |
Authors | Maciej Zamorski, Maciej Zięba, Piotr Klukowski, Rafał Nowak, Karol Kurach, Wojciech Stokowiec, Tomasz Trzciński |
Abstract | Deep generative architectures provide a way to model not only images but also complex, 3-dimensional objects, such as point clouds. In this work, we present a novel method to obtain meaningful representations of 3D shapes that can be used for challenging tasks including 3D points generation, reconstruction, compression, and clustering. Contrary to existing methods for 3D point cloud generation that train separate decoupled models for representation learning and generation, our approach is the first end-to-end solution that simultaneously learns a latent space of representation and generates 3D shapes from it. Moreover, our model is capable of learning meaningful compact binary descriptors with adversarial training conducted on a latent space. To achieve this goal, we extend a deep Adversarial Autoencoder model (AAE) to accept 3D input and create 3D output. Thanks to our end-to-end training regime, the resulting method called 3D Adversarial Autoencoder (3dAAE) obtains either binary or continuous latent space that covers a much wider portion of training data distribution. Finally, our quantitative evaluation shows that 3dAAE provides state-of-the-art results for 3D points clustering and 3D object retrieval. |
Tasks | 3D Object Retrieval, Generating 3D Point Clouds, Point Cloud Generation, Representation Learning |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07605v3 |
http://arxiv.org/pdf/1811.07605v3.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-autoencoders-for-generating-3d |
Repo | https://github.com/MaciejZamorski/3d-AAE |
Framework | pytorch |
Imitation Learning for Neural Morphological String Transduction
Title | Imitation Learning for Neural Morphological String Transduction |
Authors | Peter Makarov, Simon Clematide |
Abstract | We employ imitation learning to train a neural transition-based string transducer for morphological tasks such as inflection generation and lemmatization. Previous approaches to training this type of model either rely on an external character aligner for the production of gold action sequences, which results in a suboptimal model due to the unwarranted dependence on a single gold action sequence despite spurious ambiguity, or require warm starting with an MLE model. Our approach only requires a simple expert policy, eliminating the need for a character aligner or warm start. It also addresses familiar MLE training biases and leads to strong and state-of-the-art performance on several benchmarks. |
Tasks | Imitation Learning, Lemmatization |
Published | 2018-08-31 |
URL | http://arxiv.org/abs/1808.10701v1 |
http://arxiv.org/pdf/1808.10701v1.pdf | |
PWC | https://paperswithcode.com/paper/imitation-learning-for-neural-morphological |
Repo | https://github.com/ZurichNLP/emnlp2018-imitation-learning-for-neural-morphology |
Framework | none |
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Title | GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding |
Authors | Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman |
Abstract | For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset. In pursuit of this objective, we introduce the General Language Understanding Evaluation benchmark (GLUE), a tool for evaluating and analyzing the performance of models across a diverse range of existing NLU tasks. GLUE is model-agnostic, but it incentivizes sharing knowledge across tasks because certain tasks have very limited training data. We further provide a hand-crafted diagnostic test suite that enables detailed linguistic analysis of NLU models. We evaluate baselines based on current methods for multi-task and transfer learning and find that they do not immediately give substantial improvements over the aggregate performance of training a separate model per task, indicating room for improvement in developing general and robust NLU systems. |
Tasks | Natural Language Inference, Transfer Learning |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07461v3 |
http://arxiv.org/pdf/1804.07461v3.pdf | |
PWC | https://paperswithcode.com/paper/glue-a-multi-task-benchmark-and-analysis |
Repo | https://github.com/nyu-mll/GLUE-baselines |
Framework | pytorch |
Rapid Adaptation of Neural Machine Translation to New Languages
Title | Rapid Adaptation of Neural Machine Translation to New Languages |
Authors | Graham Neubig, Junjie Hu |
Abstract | This paper examines the problem of adapting neural machine translation systems to new, low-resourced languages (LRLs) as effectively and rapidly as possible. We propose methods based on starting with massively multilingual “seed models”, which can be trained ahead-of-time, and then continuing training on data related to the LRL. We contrast a number of strategies, leading to a novel, simple, yet effective method of “similar-language regularization”, where we jointly train on both a LRL of interest and a similar high-resourced language to prevent over-fitting to small LRL data. Experiments demonstrate that massively multilingual models, even without any explicit adaptation, are surprisingly effective, achieving BLEU scores of up to 15.5 with no data from the LRL, and that the proposed similar-language regularization method improves over other adaptation methods by 1.7 BLEU points average over 4 LRL settings. Code to reproduce experiments at https://github.com/neubig/rapid-adaptation |
Tasks | Machine Translation |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04189v1 |
http://arxiv.org/pdf/1808.04189v1.pdf | |
PWC | https://paperswithcode.com/paper/rapid-adaptation-of-neural-machine |
Repo | https://github.com/neubig/rapid-adaptation |
Framework | none |
Is Robustness the Cost of Accuracy? – A Comprehensive Study on the Robustness of 18 Deep Image Classification Models
Title | Is Robustness the Cost of Accuracy? – A Comprehensive Study on the Robustness of 18 Deep Image Classification Models |
Authors | Dong Su, Huan Zhang, Hongge Chen, Jinfeng Yi, Pin-Yu Chen, Yupeng Gao |
Abstract | The prediction accuracy has been the long-lasting and sole standard for comparing the performance of different image classification models, including the ImageNet competition. However, recent studies have highlighted the lack of robustness in well-trained deep neural networks to adversarial examples. Visually imperceptible perturbations to natural images can easily be crafted and mislead the image classifiers towards misclassification. To demystify the trade-offs between robustness and accuracy, in this paper we thoroughly benchmark 18 ImageNet models using multiple robustness metrics, including the distortion, success rate and transferability of adversarial examples between 306 pairs of models. Our extensive experimental results reveal several new insights: (1) linear scaling law - the empirical $\ell_2$ and $\ell_\infty$ distortion metrics scale linearly with the logarithm of classification error; (2) model architecture is a more critical factor to robustness than model size, and the disclosed accuracy-robustness Pareto frontier can be used as an evaluation criterion for ImageNet model designers; (3) for a similar network architecture, increasing network depth slightly improves robustness in $\ell_\infty$ distortion; (4) there exist models (in VGG family) that exhibit high adversarial transferability, while most adversarial examples crafted from one model can only be transferred within the same family. Experiment code is publicly available at https://github.com/huanzhang12/Adversarial_Survey. |
Tasks | Image Classification |
Published | 2018-08-05 |
URL | http://arxiv.org/abs/1808.01688v2 |
http://arxiv.org/pdf/1808.01688v2.pdf | |
PWC | https://paperswithcode.com/paper/is-robustness-the-cost-of-accuracy-a |
Repo | https://github.com/huanzhang12/Adversarial_Survey |
Framework | tf |
Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling
Title | Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling |
Authors | Carlos Riquelme, George Tucker, Jasper Snoek |
Abstract | Recent advances in deep reinforcement learning have made significant strides in performance on applications such as Go and Atari games. However, developing practical methods to balance exploration and exploitation in complex domains remains largely unsolved. Thompson Sampling and its extension to reinforcement learning provide an elegant approach to exploration that only requires access to posterior samples of the model. At the same time, advances in approximate Bayesian methods have made posterior approximation for flexible neural network models practical. Thus, it is attractive to consider approximate Bayesian neural networks in a Thompson Sampling framework. To understand the impact of using an approximate posterior on Thompson Sampling, we benchmark well-established and recently developed methods for approximate posterior sampling combined with Thompson Sampling over a series of contextual bandit problems. We found that many approaches that have been successful in the supervised learning setting underperformed in the sequential decision-making scenario. In particular, we highlight the challenge of adapting slowly converging uncertainty estimates to the online setting. |
Tasks | Decision Making, Multi-Armed Bandits |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09127v1 |
http://arxiv.org/pdf/1802.09127v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-bayesian-bandits-showdown-an-empirical |
Repo | https://github.com/tensorflow/models |
Framework | tf |
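As background for the abstract's statement that Thompson Sampling only requires posterior samples, here is a minimal sketch for the simplest case: a non-contextual Bernoulli bandit with an exact Beta posterior. The paper studies contextual bandits with approximate neural-network posteriors, which this sketch does not attempt; the arm probabilities, round count, and function name are illustrative assumptions.

```python
import random
from typing import List


def thompson_sampling(true_probs: List[float], rounds: int, seed: int = 0) -> List[int]:
    """Run Beta-Bernoulli Thompson Sampling and return pull counts per arm."""
    rng = random.Random(seed)
    k = len(true_probs)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(rounds):
        # Draw one sample per arm from its Beta(1 + successes, 1 + failures) posterior.
        samples = [rng.betavariate(successes[a] + 1, failures[a] + 1) for a in range(k)]
        arm = max(range(k), key=lambda a: samples[a])  # act greedily on the samples
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


pulls = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
print(pulls)  # most pulls should go to the last arm
```

Exploration comes entirely from posterior randomness: arms with few observations have wide posteriors and are still sampled occasionally, which is why the quality of the approximate posterior matters in the deep setting.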
From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction
Title | From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction |
Authors | Zihang Dai, Qizhe Xie, Eduard Hovy |
Abstract | In this work, we study the credit assignment problem in reward augmented maximum likelihood (RAML) learning, and establish a theoretical equivalence between the token-level counterpart of RAML and the entropy regularized reinforcement learning. Inspired by the connection, we propose two sequence prediction algorithms, one extending RAML with fine-grained credit assignment and the other improving Actor-Critic with a systematic entropy regularization. On two benchmark datasets, we show the proposed algorithms outperform RAML and Actor-Critic respectively, providing new alternatives to sequence prediction. |
Tasks | |
Published | 2018-04-29 |
URL | http://arxiv.org/abs/1804.10974v1 |
http://arxiv.org/pdf/1804.10974v1.pdf | |
PWC | https://paperswithcode.com/paper/from-credit-assignment-to-entropy |
Repo | https://github.com/zihangdai/ERAC-VAML |
Framework | pytorch |