July 29, 2019

Paper Group AWR 82

Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU


Title	Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU
Authors	Jacob Devlin
Abstract	Attentional sequence-to-sequence models have become the new standard for machine translation, but one challenge of such models is a significant increase in training and decoding cost compared to phrase-based systems. Here, we focus on efficient decoding, with a goal of achieving accuracy close to the state-of-the-art in neural machine translation (NMT), while achieving CPU decoding speed/throughput close to that of a phrasal decoder. We approach this problem from two angles: First, we describe several techniques for speeding up an NMT beam search decoder, which obtain a 4.4x speedup over a very efficient baseline decoder without changing the decoder output. Second, we propose a simple but powerful network architecture which uses an RNN (GRU/LSTM) layer at bottom, followed by a series of stacked fully-connected layers applied at every timestep. This architecture achieves similar accuracy to a deep recurrent model, at a small fraction of the training and decoding cost. By combining these techniques, our best system achieves a very competitive accuracy of 38.3 BLEU on WMT English-French NewsTest2014, while decoding at 100 words/sec on single-threaded CPU. We believe this is the best published accuracy/speed trade-off of an NMT system.
Tasks	Machine Translation
Published	2017-05-04
URL	http://arxiv.org/abs/1705.01991v1
PDF	http://arxiv.org/pdf/1705.01991v1.pdf
PWC	https://paperswithcode.com/paper/sharp-models-on-dull-hardware-fast-and
Repo	https://github.com/kpu/intgemm
Framework	none

Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks


Title	Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks
Authors	Afshin Rahimi, Timothy Baldwin, Trevor Cohn
Abstract	We propose a method for embedding two-dimensional locations in a continuous vector space using a neural network-based model incorporating mixtures of Gaussian distributions, presenting two model variants for text-based geolocation and lexical dialectology. Evaluated over Twitter data, the proposed model outperforms conventional regression-based geolocation and provides a better estimate of uncertainty. We also show the effectiveness of the representation for predicting words from location in lexical dialectology, and evaluate it using the DARE dataset.
Tasks
Published	2017-08-14
URL	http://arxiv.org/abs/1708.04358v1
PDF	http://arxiv.org/pdf/1708.04358v1.pdf
PWC	https://paperswithcode.com/paper/continuous-representation-of-location-for
Repo	https://github.com/afshinrahimi/geomdn
Framework	none
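
The abstract above describes modelling a location with a mixture of Gaussians. As a rough illustration of what scoring a 2D coordinate under such a mixture involves, here is a minimal sketch of the negative log-likelihood. It is not the authors' code: the function names are hypothetical, and the axis-aligned (diagonal) covariance is a simplifying assumption made for brevity.

```python
import math


def gaussian2d_logpdf(point, mean, sigma):
    """Log density of an axis-aligned 2D Gaussian (diagonal covariance is an assumption)."""
    log_p = 0.0
    for x, m, s in zip(point, mean, sigma):
        log_p += -0.5 * math.log(2.0 * math.pi) - math.log(s) - 0.5 * ((x - m) / s) ** 2
    return log_p


def mixture_nll(point, weights, means, sigmas):
    """Negative log-likelihood of `point` under a Gaussian mixture.

    `weights` must be positive and sum to 1. The log-sum-exp trick keeps
    the computation stable when individual densities are tiny.
    """
    log_terms = [
        math.log(w) + gaussian2d_logpdf(point, m, s)
        for w, m, s in zip(weights, means, sigmas)
    ]
    top = max(log_terms)
    log_lik = top + math.log(sum(math.exp(t - top) for t in log_terms))
    return -log_lik
```

In a mixture density network, the weights, means, and standard deviations would be produced by the network from the input text, and this quantity would serve as the training loss. A point near one of the component means gets a lower loss than a point far from all of them.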

Learning Word Relatedness over Time


Title	Learning Word Relatedness over Time
Authors	Guy D. Rosin, Eytan Adar, Kira Radinsky
Abstract	Search systems are often focused on providing relevant results for the “now”, assuming both corpora and user needs that focus on the present. However, many corpora today reflect significant longitudinal collections ranging from 20 years of the Web to hundreds of years of digitized newspapers and books. Understanding the temporal intent of the user and retrieving the most relevant historical content has become a significant challenge. Common search features, such as query expansion, leverage the relationship between terms but cannot function well across all times when relationships vary temporally. In this work, we introduce a temporal relationship model that is extracted from longitudinal data collections. The model supports the task of identifying, given two words, when they relate to each other. We present an algorithmic framework for this task and show its application for the task of query expansion, achieving high gain.
Tasks
Published	2017-07-25
URL	http://arxiv.org/abs/1707.08081v2
PDF	http://arxiv.org/pdf/1707.08081v2.pdf
PWC	https://paperswithcode.com/paper/learning-word-relatedness-over-time
Repo	https://github.com/guyrosin/learning-word-relatedness
Framework	none

Learning Background-Aware Correlation Filters for Visual Tracking


Title	Learning Background-Aware Correlation Filters for Visual Tracking
Authors	Hamed Kiani Galoogahi, Ashton Fagg, Simon Lucey
Abstract	Correlation Filters (CFs) have recently demonstrated excellent performance in terms of rapidly tracking objects under challenging photometric and geometric variations. The strength of the approach comes from its ability to efficiently learn - “on the fly” - how the object is changing over time. A fundamental drawback to CFs, however, is that the background of the object is not modelled over time, which can result in suboptimal results. In this paper we propose a Background-Aware CF that can model how both the foreground and background of the object vary over time. Our approach, like conventional CFs, is extremely computationally efficient - and extensive experiments over multiple tracking benchmarks demonstrate the superior accuracy and real-time performance of our method compared to the state-of-the-art trackers including those based on a deep learning paradigm.
Tasks	Visual Tracking
Published	2017-03-14
URL	http://arxiv.org/abs/1703.04590v2
PDF	http://arxiv.org/pdf/1703.04590v2.pdf
PWC	https://paperswithcode.com/paper/learning-background-aware-correlation-filters
Repo	https://github.com/4kubo/bacf_python
Framework	none

Dynamic Time-Aware Attention to Speaker Roles and Contexts for Spoken Language Understanding


Title	Dynamic Time-Aware Attention to Speaker Roles and Contexts for Spoken Language Understanding
Authors	Po-Chun Chen, Ta-Chung Chi, Shang-Yu Su, Yun-Nung Chen
Abstract	Spoken language understanding (SLU) is an essential component in conversational systems. Most SLU components treat each utterance independently, and then the following components aggregate the multi-turn information in separate phases. In order to avoid error propagation and effectively utilize contexts, prior work leveraged history for contextual SLU. However, the previous model only paid attention to the content in history utterances without considering their temporal information and speaker roles. In dialogues, the most recent utterances should be more important than the least recent ones. Furthermore, users usually pay attention to 1) self history for reasoning and 2) others’ utterances for listening; the speaker of an utterance may provide informative cues to help understanding. Therefore, this paper proposes an attention-based network that additionally leverages temporal information and speaker role for better SLU, where the attention to contexts and speaker roles can be automatically learned in an end-to-end manner. The experiments on the benchmark Dialogue State Tracking Challenge 4 (DSTC4) dataset show that the time-aware dynamic role attention networks significantly improve the understanding performance.
Tasks	Dialogue State Tracking, Spoken Language Understanding
Published	2017-09-30
URL	http://arxiv.org/abs/1710.00165v2
PDF	http://arxiv.org/pdf/1710.00165v2.pdf
PWC	https://paperswithcode.com/paper/dynamic-time-aware-attention-to-speaker-roles
Repo	https://github.com/MiuLab/Time-SLU
Framework	tf
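
The abstract above argues that recent utterances should weigh more than older ones. One simple way to picture that idea is a fixed decay over the distance to the current turn. The sketch below is only an illustration: the reciprocal form and the name `time_decay_weights` are assumptions, not taken from the abstract, and the paper learns its attention end-to-end rather than fixing it by hand.

```python
def time_decay_weights(distances):
    """Return normalised weights that shrink as an utterance gets older.

    `distances` holds, for each history utterance, how many turns back it
    occurred (1 = the most recent). The 1/d decay is an assumed form.
    """
    raw = [1.0 / d for d in distances]
    total = sum(raw)
    return [r / total for r in raw]
```

For a history at distances 1, 2, and 4, the most recent utterance receives the largest share of the attention and the weights sum to one.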

Accurately and Efficiently Interpreting Human-Robot Instructions of Varying Granularities


Title	Accurately and Efficiently Interpreting Human-Robot Instructions of Varying Granularities
Authors	Dilip Arumugam, Siddharth Karamcheti, Nakul Gopalan, Lawson L. S. Wong, Stefanie Tellex
Abstract	Humans can ground natural language commands to tasks at both abstract and fine-grained levels of specificity. For instance, a human forklift operator can be instructed to perform a high-level action, like “grab a pallet” or a low-level action like “tilt back a little bit.” While robots are also capable of grounding language commands to tasks, previous methods implicitly assume that all commands and tasks reside at a single, fixed level of abstraction. Additionally, methods that do not use multiple levels of abstraction encounter inefficient planning and execution times as they solve tasks at a single level of abstraction with large, intractable state-action spaces closely resembling real world complexity. In this work, by grounding commands to all the tasks or subtasks available in a hierarchical planning framework, we arrive at a model capable of interpreting language at multiple levels of specificity ranging from coarse to more granular. We show that the accuracy of the grounding procedure is improved when simultaneously inferring the degree of abstraction in language used to communicate the task. Leveraging hierarchy also improves efficiency: our proposed approach enables a robot to respond to a command within one second on 90% of our tasks, while baselines take over twenty seconds on half the tasks. Finally, we demonstrate that a real, physical robot can ground commands at multiple levels of abstraction allowing it to efficiently plan different subtasks within the same planning hierarchy.
Tasks
Published	2017-04-21
URL	http://arxiv.org/abs/1704.06616v2
PDF	http://arxiv.org/pdf/1704.06616v2.pdf
PWC	https://paperswithcode.com/paper/accurately-and-efficiently-interpreting-human
Repo	https://github.com/h2r/GLAMDP
Framework	tf

Know-Evolve: Deep Temporal Reasoning for Dynamic Knowledge Graphs


Title	Know-Evolve: Deep Temporal Reasoning for Dynamic Knowledge Graphs
Authors	Rakshit Trivedi, Hanjun Dai, Yichen Wang, Le Song
Abstract	The availability of large scale event data with time stamps has given rise to dynamically evolving knowledge graphs that contain temporal information for each edge. Reasoning over time in such dynamic knowledge graphs is not yet well understood. To this end, we present Know-Evolve, a novel deep evolutionary knowledge network that learns non-linearly evolving entity representations over time. The occurrence of a fact (edge) is modeled as a multivariate point process whose intensity function is modulated by the score for that fact computed based on the learned entity embeddings. We demonstrate significantly improved performance over various relational learning approaches on two large scale real-world datasets. Further, our method effectively predicts the occurrence or recurrence time of a fact, which is novel compared to prior reasoning approaches in a multi-relational setting.
Tasks	Entity Embeddings, Knowledge Graphs, Relational Reasoning
Published	2017-05-16
URL	http://arxiv.org/abs/1705.05742v3
PDF	http://arxiv.org/pdf/1705.05742v3.pdf
PWC	https://paperswithcode.com/paper/know-evolve-deep-temporal-reasoning-for
Repo	https://github.com/INK-USC/RENet
Framework	pytorch

Stochastic reconstruction of an oolitic limestone by generative adversarial networks


Title	Stochastic reconstruction of an oolitic limestone by generative adversarial networks
Authors	Lukas Mosser, Olivier Dubrule, Martin J. Blunt
Abstract	Stochastic image reconstruction is a key part of modern digital rock physics and materials analysis that aims to create numerous representative samples of material micro-structures for upscaling, numerical computation of effective properties and uncertainty quantification. We present a method of three-dimensional stochastic image reconstruction based on generative adversarial neural networks (GANs). GANs represent a framework of unsupervised learning methods that require no a priori inference of the probability distribution associated with the training data. Using a fully convolutional neural network allows fast sampling of large volumetric images. We apply a GAN based workflow of network training and image generation to an oolitic Ketton limestone micro-CT dataset. Minkowski functionals, effective permeability as well as velocity distributions of simulated flow within the acquired images are compared with the synthetic reconstructions generated by the deep neural network. While our results show that GANs allow a fast and accurate reconstruction of the evaluated image dataset, we address a number of open questions and challenges involved in the evaluation of generative network-based methods.
Tasks	Image Generation, Image Reconstruction
Published	2017-12-07
URL	http://arxiv.org/abs/1712.02854v1
PDF	http://arxiv.org/pdf/1712.02854v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-reconstruction-of-an-oolitic
Repo	https://github.com/LukasMosser/geogan
Framework	pytorch

Gated-Attention Architectures for Task-Oriented Language Grounding


Title	Gated-Attention Architectures for Task-Oriented Language Grounding
Authors	Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
Abstract	To perform tasks specified by natural language instructions, autonomous agents need to extract semantically meaningful representations of language and map them to visual elements and actions in the environment. This problem is called task-oriented language grounding. We propose an end-to-end trainable neural architecture for task-oriented language grounding in 3D environments which assumes no prior linguistic or perceptual knowledge and requires only raw pixels from the environment and the natural language instruction as input. The proposed model combines the image and text representations using a Gated-Attention mechanism and learns a policy to execute the natural language instruction using standard reinforcement and imitation learning methods. We show the effectiveness of the proposed model on unseen instructions as well as unseen maps, both quantitatively and qualitatively. We also introduce a novel environment based on a 3D game engine to simulate the challenges of task-oriented language grounding over a rich set of instructions and environment states.
Tasks	Imitation Learning
Published	2017-06-22
URL	http://arxiv.org/abs/1706.07230v2
PDF	http://arxiv.org/pdf/1706.07230v2.pdf
PWC	https://paperswithcode.com/paper/gated-attention-architectures-for-task
Repo	https://github.com/devendrachaplot/DeepRL-Grounding
Framework	pytorch

On the Compactness, Efficiency, and Representation of 3D Convolutional Networks: Brain Parcellation as a Pretext Task


Title	On the Compactness, Efficiency, and Representation of 3D Convolutional Networks: Brain Parcellation as a Pretext Task
Authors	Wenqi Li, Guotai Wang, Lucas Fidon, Sebastien Ourselin, M. Jorge Cardoso, Tom Vercauteren
Abstract	Deep convolutional neural networks are powerful tools for learning visual representations from images. However, designing efficient deep architectures to analyse volumetric medical images remains challenging. This work investigates efficient and flexible elements of modern convolutional networks such as dilated convolution and residual connection. With these essential building blocks, we propose a high-resolution, compact convolutional network for volumetric image segmentation. To illustrate its efficiency in learning 3D representations from large-scale image data, the proposed network is validated with the challenging task of parcellating 155 neuroanatomical structures from brain MR images. Our experiments show that the proposed network architecture compares favourably with state-of-the-art volumetric segmentation networks while being an order of magnitude more compact. We consider the brain parcellation task as a pretext task for volumetric image segmentation; our trained network potentially provides a good starting point for transfer learning. Additionally, we show the feasibility of voxel-level uncertainty estimation using a sampling approximation through dropout.
Tasks	Semantic Segmentation, Transfer Learning
Published	2017-07-06
URL	http://arxiv.org/abs/1707.01992v1
PDF	http://arxiv.org/pdf/1707.01992v1.pdf
PWC	https://paperswithcode.com/paper/on-the-compactness-efficiency-and
Repo	https://github.com/fepegar/highresnet
Framework	pytorch
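
The abstract above names dilated convolution as one of its building blocks. To make the operation concrete, here is a minimal one-dimensional sketch in plain Python. It is an illustration only: the paper works with 3D volumes, and the function name, the lack of padding, and the cross-correlation form (no kernel flip, as is common in deep learning libraries) are assumptions made here.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Apply a 1D dilated convolution with no padding.

    Kernel taps are spaced `dilation` samples apart, so the receptive
    field grows to (len(kernel) - 1) * dilation + 1 samples without
    adding any parameters.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]
```

With `dilation=1` this reduces to an ordinary convolution. Larger dilations let a small kernel see a wider context, which is one reason a network built from such layers can stay compact while keeping high resolution.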

Zero-Shot Activity Recognition with Verb Attribute Induction


Title	Zero-Shot Activity Recognition with Verb Attribute Induction
Authors	Rowan Zellers, Yejin Choi
Abstract	In this paper, we investigate large-scale zero-shot activity recognition by modeling the visual and linguistic attributes of action verbs. For example, the verb “salute” has several properties, such as being a light movement, a social act, and short in duration. We use these attributes as the internal mapping between visual and textual representations to reason about a previously unseen action. In contrast to much prior work that assumes access to gold standard attributes for zero-shot classes and focuses primarily on object attributes, our model uniquely learns to infer action attributes from dictionary definitions and distributed word representations. Experimental results confirm that action attributes inferred from language can provide a predictive signal for zero-shot prediction of previously unseen activities.
Tasks	Activity Recognition
Published	2017-07-29
URL	http://arxiv.org/abs/1707.09468v2
PDF	http://arxiv.org/pdf/1707.09468v2.pdf
PWC	https://paperswithcode.com/paper/zero-shot-activity-recognition-with-verb
Repo	https://github.com/uwnlp/verb-attributes
Framework	pytorch

Gate Activation Signal Analysis for Gated Recurrent Neural Networks and Its Correlation with Phoneme Boundaries


Title	Gate Activation Signal Analysis for Gated Recurrent Neural Networks and Its Correlation with Phoneme Boundaries
Authors	Yu-Hsuan Wang, Cheng-Tao Chung, Hung-yi Lee
Abstract	In this paper we analyze the gate activation signals inside the gated recurrent neural networks, and find the temporal structure of such signals is highly correlated with the phoneme boundaries. This correlation is further verified by a set of experiments for phoneme segmentation, in which better results compared to standard approaches were obtained.
Tasks
Published	2017-03-22
URL	http://arxiv.org/abs/1703.07588v2
PDF	http://arxiv.org/pdf/1703.07588v2.pdf
PWC	https://paperswithcode.com/paper/gate-activation-signal-analysis-for-gated
Repo	https://github.com/allyoushawn/timit_gas
Framework	tf

Wasserstein Learning of Deep Generative Point Process Models


Title	Wasserstein Learning of Deep Generative Point Process Models
Authors	Shuai Xiao, Mehrdad Farajtabar, Xiaojing Ye, Junchi Yan, Le Song, Hongyuan Zha
Abstract	Point processes are becoming very popular in modeling asynchronous sequential data due to their sound mathematical foundation and strength in modeling a variety of real-world phenomena. Currently, they are often characterized via an intensity function, which limits the model’s expressiveness due to unrealistic assumptions on its parametric form used in practice. Furthermore, they are learned via a maximum likelihood approach, which is prone to failure in multi-modal distributions of sequences. In this paper, we propose an intensity-free approach for point process modeling that transforms nuisance processes to a target one. Furthermore, we train the model using a likelihood-free approach leveraging the Wasserstein distance between point processes. Experiments on various synthetic and real-world data substantiate the superiority of the proposed point process model over conventional ones.
Tasks	Point Processes
Published	2017-05-23
URL	http://arxiv.org/abs/1705.08051v1
PDF	http://arxiv.org/pdf/1705.08051v1.pdf
PWC	https://paperswithcode.com/paper/wasserstein-learning-of-deep-generative-point
Repo	https://github.com/xiaoshuai09/Wasserstein-Learning-For-Point-Process
Framework	tf

From Parity to Preference-based Notions of Fairness in Classification


Title	From Parity to Preference-based Notions of Fairness in Classification
Authors	Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, Krishna P. Gummadi, Adrian Weller
Abstract	The adoption of automated, data-driven decision making in an ever expanding range of applications has raised concerns about its potential unfairness towards certain social groups. In this context, a number of recent studies have focused on defining, detecting, and removing unfairness from data-driven decision systems. However, the existing notions of fairness, based on parity (equality) in treatment or outcomes for different social groups, tend to be quite stringent, limiting the overall decision making accuracy. In this paper, we draw inspiration from the fair-division and envy-freeness literature in economics and game theory and propose preference-based notions of fairness – given the choice between various sets of decision treatments or outcomes, any group of users would collectively prefer its treatment or outcomes, regardless of the (dis)parity as compared to the other groups. Then, we introduce tractable proxies to design margin-based classifiers that satisfy these preference-based notions of fairness. Finally, we experiment with a variety of synthetic and real-world datasets and show that preference-based fairness allows for greater decision accuracy than parity-based fairness.
Tasks	Decision Making
Published	2017-06-30
URL	http://arxiv.org/abs/1707.00010v2
PDF	http://arxiv.org/pdf/1707.00010v2.pdf
PWC	https://paperswithcode.com/paper/from-parity-to-preference-based-notions-of
Repo	https://github.com/mbilalzafar/fair-classification
Framework	none

DeLiGAN : Generative Adversarial Networks for Diverse and Limited Data


Title	DeLiGAN : Generative Adversarial Networks for Diverse and Limited Data
Authors	Swaminathan Gurumurthy, Ravi Kiran Sarvadevabhatla, Venkatesh Babu Radhakrishnan
Abstract	A class of recent approaches for generating images, called Generative Adversarial Networks (GAN), have been used to generate impressively realistic images of objects, bedrooms, handwritten digits and a variety of other image modalities. However, typical GAN-based approaches require large amounts of training data to capture the diversity across the image modality. In this paper, we propose DeLiGAN – a novel GAN-based architecture for diverse and limited training data scenarios. In our approach, we reparameterize the latent generative space as a mixture model and learn the mixture model’s parameters along with those of the GAN. This seemingly simple modification to the GAN framework is surprisingly effective and results in models which enable diversity in generated samples even when trained with limited data. In our work, we show that DeLiGAN can generate images of handwritten digits, objects and hand-drawn sketches, all using limited amounts of data. To quantitatively characterize intra-class diversity of generated samples, we also introduce a modified version of “inception-score”, a measure which has been found to correlate well with human assessment of generated samples.
Tasks	Image Generation
Published	2017-06-07
URL	http://arxiv.org/abs/1706.02071v1
PDF	http://arxiv.org/pdf/1706.02071v1.pdf
PWC	https://paperswithcode.com/paper/deligan-generative-adversarial-networks-for
Repo	https://github.com/RAF96/ifmo-2019-deep-learning-coursework
Framework	none
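
The abstract above describes reparameterizing the latent space as a mixture model whose parameters are learned with the GAN. A minimal sketch of drawing one latent vector from such a mixture could look like the following. The name `sample_latent`, the Gaussian components, and the uniform choice of component are assumptions for illustration; in training, `means` and `sigmas` would be learnable parameters updated together with the generator.

```python
import random


def sample_latent(means, sigmas, rng):
    """Draw one latent vector z = mu_k + sigma_k * eps from a Gaussian mixture.

    A component k is picked uniformly at random (an assumption), then
    standard normal noise eps is shifted and scaled by that component's
    parameters, so gradients could flow to `means` and `sigmas`.
    """
    k = rng.randrange(len(means))
    eps = [rng.gauss(0.0, 1.0) for _ in means[k]]
    return [m + s * e for m, s, e in zip(means[k], sigmas[k], eps)]
```

Calling `sample_latent([[1.0, 2.0], [5.0, 6.0]], [[0.1, 0.1], [0.1, 0.1]], random.Random(0))` returns a two-dimensional vector close to one of the two component means; that vector would then be fed to the generator in place of a sample from a single Gaussian.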