Paper Group AWR 82
Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU. Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks. Learning Word Relatedness over Time. Learning Background-Aware Correlation Filters for Visual Tracking. Dynamic Time-Aware Attention to Speake …
Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU
Title | Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU |
Authors | Jacob Devlin |
Abstract | Attentional sequence-to-sequence models have become the new standard for machine translation, but one challenge of such models is a significant increase in training and decoding cost compared to phrase-based systems. Here, we focus on efficient decoding, with a goal of achieving accuracy close to the state-of-the-art in neural machine translation (NMT), while achieving CPU decoding speed/throughput close to that of a phrasal decoder. We approach this problem from two angles: First, we describe several techniques for speeding up an NMT beam search decoder, which obtain a 4.4x speedup over a very efficient baseline decoder without changing the decoder output. Second, we propose a simple but powerful network architecture which uses an RNN (GRU/LSTM) layer at the bottom, followed by a series of stacked fully-connected layers applied at every timestep. This architecture achieves similar accuracy to a deep recurrent model, at a small fraction of the training and decoding cost. By combining these techniques, our best system achieves a very competitive accuracy of 38.3 BLEU on WMT English-French NewsTest2014, while decoding at 100 words/sec on single-threaded CPU. We believe this is the best published accuracy/speed trade-off of an NMT system. |
Tasks | Machine Translation |
Published | 2017-05-04 |
URL | http://arxiv.org/abs/1705.01991v1 |
http://arxiv.org/pdf/1705.01991v1.pdf | |
PWC | https://paperswithcode.com/paper/sharp-models-on-dull-hardware-fast-and |
Repo | https://github.com/kpu/intgemm |
Framework | none |
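The architecture described in the abstract above (one recurrent layer at the bottom, then fully-connected layers applied independently at every timestep) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the bias-free GRU cell, the ReLU activation, the layer sizes and the names `GRUCell`, `encode` and `random_matrix` are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Small random weights; stands in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """A bias-free GRU cell, kept minimal for illustration (assumption)."""

    def __init__(self, input_size, hidden_size, rng):
        size = input_size + hidden_size
        self.w_update = random_matrix(hidden_size, size, rng)
        self.w_reset = random_matrix(hidden_size, size, rng)
        self.w_candidate = random_matrix(hidden_size, size, rng)

    def step(self, x, h):
        xh = x + h  # concatenate input and previous hidden state
        z = [sigmoid(a) for a in matvec(self.w_update, xh)]
        r = [sigmoid(a) for a in matvec(self.w_reset, xh)]
        x_rh = x + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(self.w_candidate, x_rh)]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def encode(inputs, cell, fc_layers, hidden_size):
    """Run one recurrent layer, then the same FC stack at every timestep."""
    h = [0.0] * hidden_size
    outputs = []
    for x in inputs:
        h = cell.step(x, h)  # the only sequential dependency
        y = h
        for weights in fc_layers:
            # ReLU is an assumed activation, not taken from the abstract
            y = [max(0.0, a) for a in matvec(weights, y)]
        outputs.append(y)
    return outputs
```

Only `cell.step` depends on the previous timestep; the fully-connected stack has no recurrence, which is one plausible reading of why such a model is cheaper than a deep recurrent one.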
Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks
Title | Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks |
Authors | Afshin Rahimi, Timothy Baldwin, Trevor Cohn |
Abstract | We propose a method for embedding two-dimensional locations in a continuous vector space using a neural network-based model incorporating mixtures of Gaussian distributions, presenting two model variants for text-based geolocation and lexical dialectology. Evaluated over Twitter data, the proposed model outperforms conventional regression-based geolocation and provides a better estimate of uncertainty. We also show the effectiveness of the representation for predicting words from location in lexical dialectology, and evaluate it using the DARE dataset. |
Tasks | |
Published | 2017-08-14 |
URL | http://arxiv.org/abs/1708.04358v1 |
http://arxiv.org/pdf/1708.04358v1.pdf | |
PWC | https://paperswithcode.com/paper/continuous-representation-of-location-for |
Repo | https://github.com/afshinrahimi/geomdn |
Framework | none |
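To make the mixture-of-Gaussians idea in the abstract above concrete, the sketch below computes the negative log-likelihood of a 2D location under a Gaussian mixture, the kind of loss a mixture density network is typically trained with. It assumes axis-aligned components (no correlation between the two coordinates) and takes the mixture parameters as plain lists; in a real model a network would predict them. The function names are illustrative only.

```python
import math


def log_gaussian_2d(point, mean, sigma):
    """Log-density of an axis-aligned 2D Gaussian (independent coordinates)."""
    total = 0.0
    for x, m, s in zip(point, mean, sigma):
        total += -0.5 * ((x - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
    return total


def mixture_nll(point, weights, means, sigmas):
    """Negative log-likelihood of a 2D point under a Gaussian mixture."""
    logs = [math.log(w) + log_gaussian_2d(point, m, s)
            for w, m, s in zip(weights, means, sigmas)]
    peak = max(logs)  # log-sum-exp trick for numerical stability
    return -(peak + math.log(sum(math.exp(v - peak) for v in logs)))
```

Because the output is a full density rather than a single point, a multi-modal prediction (for example, two plausible cities) can be represented directly, which is consistent with the abstract's claim of better uncertainty estimates.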
Learning Word Relatedness over Time
Title | Learning Word Relatedness over Time |
Authors | Guy D. Rosin, Eytan Adar, Kira Radinsky |
Abstract | Search systems are often focused on providing relevant results for the “now”, assuming both corpora and user needs that focus on the present. However, many corpora today reflect significant longitudinal collections ranging from 20 years of the Web to hundreds of years of digitized newspapers and books. Understanding the temporal intent of the user and retrieving the most relevant historical content has become a significant challenge. Common search features, such as query expansion, leverage the relationship between terms but cannot function well across all times when relationships vary temporally. In this work, we introduce a temporal relationship model that is extracted from longitudinal data collections. The model supports the task of identifying, given two words, when they relate to each other. We present an algorithmic framework for this task and show its application for the task of query expansion, achieving high gain. |
Tasks | |
Published | 2017-07-25 |
URL | http://arxiv.org/abs/1707.08081v2 |
http://arxiv.org/pdf/1707.08081v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-word-relatedness-over-time |
Repo | https://github.com/guyrosin/learning-word-relatedness |
Framework | none |
Learning Background-Aware Correlation Filters for Visual Tracking
Title | Learning Background-Aware Correlation Filters for Visual Tracking |
Authors | Hamed Kiani Galoogahi, Ashton Fagg, Simon Lucey |
Abstract | Correlation Filters (CFs) have recently demonstrated excellent performance in terms of rapidly tracking objects under challenging photometric and geometric variations. The strength of the approach comes from its ability to efficiently learn - “on the fly” - how the object is changing over time. A fundamental drawback to CFs, however, is that the background of the object is not modelled over time, which can result in suboptimal results. In this paper we propose a Background-Aware CF that can model how both the foreground and background of the object vary over time. Our approach, like conventional CFs, is extremely computationally efficient - and extensive experiments over multiple tracking benchmarks demonstrate the superior accuracy and real-time performance of our method compared to the state-of-the-art trackers including those based on a deep learning paradigm. |
Tasks | Visual Tracking |
Published | 2017-03-14 |
URL | http://arxiv.org/abs/1703.04590v2 |
http://arxiv.org/pdf/1703.04590v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-background-aware-correlation-filters |
Repo | https://github.com/4kubo/bacf_python |
Framework | none |
Dynamic Time-Aware Attention to Speaker Roles and Contexts for Spoken Language Understanding
Title | Dynamic Time-Aware Attention to Speaker Roles and Contexts for Spoken Language Understanding |
Authors | Po-Chun Chen, Ta-Chung Chi, Shang-Yu Su, Yun-Nung Chen |
Abstract | Spoken language understanding (SLU) is an essential component in conversational systems. Most SLU components treat each utterance independently, and the following components then aggregate the multi-turn information in separate phases. In order to avoid error propagation and effectively utilize contexts, prior work leveraged history for contextual SLU. However, the previous model only paid attention to the content in history utterances without considering their temporal information and speaker roles. In dialogues, the most recent utterances should be more important than the least recent ones. Furthermore, users usually pay attention to 1) self history for reasoning and 2) others’ utterances for listening; the speaker of the utterances may provide informative cues to help understanding. Therefore, this paper proposes an attention-based network that additionally leverages temporal information and speaker role for better SLU, where the attention to contexts and speaker roles can be automatically learned in an end-to-end manner. The experiments on the benchmark Dialogue State Tracking Challenge 4 (DSTC4) dataset show that the time-aware dynamic role attention networks significantly improve the understanding performance. |
Tasks | Dialogue State Tracking, Spoken Language Understanding |
Published | 2017-09-30 |
URL | http://arxiv.org/abs/1710.00165v2 |
http://arxiv.org/pdf/1710.00165v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-time-aware-attention-to-speaker-roles |
Repo | https://github.com/MiuLab/Time-SLU |
Framework | tf |
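The intuition in the abstract above, that recent utterances should count for more than older ones, can be illustrated with a fixed time-decay weighting. This is only a sketch: the reciprocal decay `1/d` and the function names are assumptions chosen for simplicity, whereas the paper describes attention that is learned end-to-end and also conditioned on speaker roles.

```python
def time_decay_weights(distances):
    """Normalised weights proportional to 1/d.

    distances: turns elapsed since each history utterance, each >= 1.
    """
    raw = [1.0 / d for d in distances]
    total = sum(raw)
    return [r / total for r in raw]


def summarise_history(utterance_vectors, distances):
    """Weighted average of history utterance vectors, favouring recent turns."""
    weights = time_decay_weights(distances)
    dim = len(utterance_vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, utterance_vectors))
            for k in range(dim)]
```

A role-aware variant could compute one such summary per speaker and then combine the two, but that combination is not shown here.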
Accurately and Efficiently Interpreting Human-Robot Instructions of Varying Granularities
Title | Accurately and Efficiently Interpreting Human-Robot Instructions of Varying Granularities |
Authors | Dilip Arumugam, Siddharth Karamcheti, Nakul Gopalan, Lawson L. S. Wong, Stefanie Tellex |
Abstract | Humans can ground natural language commands to tasks at both abstract and fine-grained levels of specificity. For instance, a human forklift operator can be instructed to perform a high-level action, like “grab a pallet” or a low-level action like “tilt back a little bit.” While robots are also capable of grounding language commands to tasks, previous methods implicitly assume that all commands and tasks reside at a single, fixed level of abstraction. Additionally, methods that do not use multiple levels of abstraction encounter inefficient planning and execution times as they solve tasks at a single level of abstraction with large, intractable state-action spaces closely resembling real world complexity. In this work, by grounding commands to all the tasks or subtasks available in a hierarchical planning framework, we arrive at a model capable of interpreting language at multiple levels of specificity ranging from coarse to more granular. We show that the accuracy of the grounding procedure is improved when simultaneously inferring the degree of abstraction in language used to communicate the task. Leveraging hierarchy also improves efficiency: our proposed approach enables a robot to respond to a command within one second on 90% of our tasks, while baselines take over twenty seconds on half the tasks. Finally, we demonstrate that a real, physical robot can ground commands at multiple levels of abstraction allowing it to efficiently plan different subtasks within the same planning hierarchy. |
Tasks | |
Published | 2017-04-21 |
URL | http://arxiv.org/abs/1704.06616v2 |
http://arxiv.org/pdf/1704.06616v2.pdf | |
PWC | https://paperswithcode.com/paper/accurately-and-efficiently-interpreting-human |
Repo | https://github.com/h2r/GLAMDP |
Framework | tf |
Know-Evolve: Deep Temporal Reasoning for Dynamic Knowledge Graphs
Title | Know-Evolve: Deep Temporal Reasoning for Dynamic Knowledge Graphs |
Authors | Rakshit Trivedi, Hanjun Dai, Yichen Wang, Le Song |
Abstract | The availability of large scale event data with time stamps has given rise to dynamically evolving knowledge graphs that contain temporal information for each edge. Reasoning over time in such dynamic knowledge graphs is not yet well understood. To this end, we present Know-Evolve, a novel deep evolutionary knowledge network that learns non-linearly evolving entity representations over time. The occurrence of a fact (edge) is modeled as a multivariate point process whose intensity function is modulated by the score for that fact computed based on the learned entity embeddings. We demonstrate significantly improved performance over various relational learning approaches on two large scale real-world datasets. Further, our method effectively predicts the occurrence or recurrence time of a fact, which is novel compared to prior reasoning approaches in a multi-relational setting. |
Tasks | Entity Embeddings, Knowledge Graphs, Relational Reasoning |
Published | 2017-05-16 |
URL | http://arxiv.org/abs/1705.05742v3 |
http://arxiv.org/pdf/1705.05742v3.pdf | |
PWC | https://paperswithcode.com/paper/know-evolve-deep-temporal-reasoning-for |
Repo | https://github.com/INK-USC/RENet |
Framework | pytorch |
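The abstract above describes a point process whose intensity is modulated by a score for each fact. One simple way to picture this, assumed here for illustration rather than taken from the abstract, is a Rayleigh-style intensity that grows linearly with the time since the last event and is scaled by `exp(score)`. Under that assumption the survival probability and the expected waiting time have closed forms, sketched below with hypothetical function names.

```python
import math


def intensity(score, t, t_last):
    """Assumed intensity: exp(score) * (t - t_last), for t >= t_last."""
    return math.exp(score) * (t - t_last)


def survival(score, t, t_last):
    """Probability that the fact has not occurred in (t_last, t].

    Equals exp(-integral of the intensity from t_last to t).
    """
    return math.exp(-0.5 * math.exp(score) * (t - t_last) ** 2)


def expected_wait(score):
    """Mean waiting time of the resulting Rayleigh distribution."""
    return math.sqrt(math.pi / (2.0 * math.exp(score)))
```

With this form, a higher score for a fact shortens its expected waiting time, which gives a direct way to turn a learned score into a time prediction.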
Stochastic reconstruction of an oolitic limestone by generative adversarial networks
Title | Stochastic reconstruction of an oolitic limestone by generative adversarial networks |
Authors | Lukas Mosser, Olivier Dubrule, Martin J. Blunt |
Abstract | Stochastic image reconstruction is a key part of modern digital rock physics and materials analysis that aims to create numerous representative samples of material micro-structures for upscaling, numerical computation of effective properties and uncertainty quantification. We present a method of three-dimensional stochastic image reconstruction based on generative adversarial neural networks (GANs). GANs represent a framework of unsupervised learning methods that require no a priori inference of the probability distribution associated with the training data. Using a fully convolutional neural network allows fast sampling of large volumetric images. We apply a GAN-based workflow of network training and image generation to an oolitic Ketton limestone micro-CT dataset. Minkowski functionals, effective permeability as well as velocity distributions of simulated flow within the acquired images are compared with the synthetic reconstructions generated by the deep neural network. While our results show that GANs allow a fast and accurate reconstruction of the evaluated image dataset, we address a number of open questions and challenges involved in the evaluation of generative network-based methods. |
Tasks | Image Generation, Image Reconstruction |
Published | 2017-12-07 |
URL | http://arxiv.org/abs/1712.02854v1 |
http://arxiv.org/pdf/1712.02854v1.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-reconstruction-of-an-oolitic |
Repo | https://github.com/LukasMosser/geogan |
Framework | pytorch |
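The abstract above compares real and generated volumes using Minkowski functionals. As a rough illustration of what such a comparison measures, the sketch below computes two simple voxel-based quantities on a binary 3D volume: the pore fraction (porosity) and the number of voxel faces separating pore from solid, a crude proxy for surface area. The nested-list representation, the convention that 1 means pore, and the function names are assumptions for this example.

```python
def porosity(volume):
    """Fraction of voxels that are pore (value 1) in a binary 3D volume."""
    voxels = [v for plane in volume for row in plane for v in row]
    return sum(voxels) / len(voxels)


def interface_faces(volume):
    """Count voxel faces shared by a pore voxel and a solid voxel."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    count = 0
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                v = volume[z][y][x]
                # Look only at the +z, +y, +x neighbours so each face counts once
                if z + 1 < depth and volume[z + 1][y][x] != v:
                    count += 1
                if y + 1 < height and volume[z][y + 1][x] != v:
                    count += 1
                if x + 1 < width and volume[z][y][x + 1] != v:
                    count += 1
    return count
```

Computing the same statistics on training images and on generated samples gives a simple numerical check that the generator reproduces basic morphology.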
Gated-Attention Architectures for Task-Oriented Language Grounding
Title | Gated-Attention Architectures for Task-Oriented Language Grounding |
Authors | Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov |
Abstract | To perform tasks specified by natural language instructions, autonomous agents need to extract semantically meaningful representations of language and map them to visual elements and actions in the environment. This problem is called task-oriented language grounding. We propose an end-to-end trainable neural architecture for task-oriented language grounding in 3D environments which assumes no prior linguistic or perceptual knowledge and requires only raw pixels from the environment and the natural language instruction as input. The proposed model combines the image and text representations using a Gated-Attention mechanism and learns a policy to execute the natural language instruction using standard reinforcement and imitation learning methods. We show the effectiveness of the proposed model on unseen instructions as well as unseen maps, both quantitatively and qualitatively. We also introduce a novel environment based on a 3D game engine to simulate the challenges of task-oriented language grounding over a rich set of instructions and environment states. |
Tasks | Imitation Learning |
Published | 2017-06-22 |
URL | http://arxiv.org/abs/1706.07230v2 |
http://arxiv.org/pdf/1706.07230v2.pdf | |
PWC | https://paperswithcode.com/paper/gated-attention-architectures-for-task |
Repo | https://github.com/devendrachaplot/DeepRL-Grounding |
Framework | pytorch |
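One way to picture the Gated-Attention fusion mentioned in the abstract above is as a per-channel gate: the instruction embedding is mapped to one value in (0, 1) per image feature channel, and each feature map is scaled by its gate. The sketch below follows that reading; the linear-plus-sigmoid gate, the nested-list tensors and the parameter name `gate_weights` are assumptions for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_attention(feature_maps, instruction_embedding, gate_weights):
    """Scale each image feature map by a gate computed from the instruction.

    feature_maps: list of C maps, each a list of rows (H x W).
    instruction_embedding: list of D floats.
    gate_weights: C x D matrix (hypothetical learned parameters).
    """
    gates = [sigmoid(sum(w * e for w, e in zip(row, instruction_embedding)))
             for row in gate_weights]
    # Broadcast each scalar gate over every spatial position of its channel
    return [[[g * value for value in row] for row in fmap]
            for g, fmap in zip(gates, feature_maps)]
```

Channels whose gate is near zero are suppressed, so the instruction effectively selects which visual features reach the policy.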
On the Compactness, Efficiency, and Representation of 3D Convolutional Networks: Brain Parcellation as a Pretext Task
Title | On the Compactness, Efficiency, and Representation of 3D Convolutional Networks: Brain Parcellation as a Pretext Task |
Authors | Wenqi Li, Guotai Wang, Lucas Fidon, Sebastien Ourselin, M. Jorge Cardoso, Tom Vercauteren |
Abstract | Deep convolutional neural networks are powerful tools for learning visual representations from images. However, designing efficient deep architectures to analyse volumetric medical images remains challenging. This work investigates efficient and flexible elements of modern convolutional networks such as dilated convolution and residual connection. With these essential building blocks, we propose a high-resolution, compact convolutional network for volumetric image segmentation. To illustrate its efficiency in learning 3D representations from large-scale image data, the proposed network is validated with the challenging task of parcellating 155 neuroanatomical structures from brain MR images. Our experiments show that the proposed network architecture compares favourably with state-of-the-art volumetric segmentation networks while being an order of magnitude more compact. We consider the brain parcellation task as a pretext task for volumetric image segmentation; our trained network potentially provides a good starting point for transfer learning. Additionally, we show the feasibility of voxel-level uncertainty estimation using a sampling approximation through dropout. |
Tasks | Semantic Segmentation, Transfer Learning |
Published | 2017-07-06 |
URL | http://arxiv.org/abs/1707.01992v1 |
http://arxiv.org/pdf/1707.01992v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-compactness-efficiency-and |
Repo | https://github.com/fepegar/highresnet |
Framework | pytorch |
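The two building blocks named in the abstract above, dilated convolution and residual connections, can be shown in one dimension. The sketch below is a simplification of the 3D case: it assumes an odd-length kernel, zero padding so the output keeps the input length, and a ReLU inside the residual branch. None of these details are taken from the paper.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Cross-correlate signal with a kernel whose taps are `dilation` apart.

    Assumes an odd-length kernel; out-of-range taps read as zero, so the
    output has the same length as the input.
    """
    half = (len(kernel) // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - half + k * dilation
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def residual_block(signal, kernel, dilation):
    """Add the (ReLU-activated) convolution output back onto the input."""
    conv = dilated_conv1d(signal, kernel, dilation)
    return [x + max(0.0, c) for x, c in zip(signal, conv)]
```

Increasing `dilation` widens the receptive field without adding kernel weights, which is one reason such blocks suit compact, high-resolution networks.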
Zero-Shot Activity Recognition with Verb Attribute Induction
Title | Zero-Shot Activity Recognition with Verb Attribute Induction |
Authors | Rowan Zellers, Yejin Choi |
Abstract | In this paper, we investigate large-scale zero-shot activity recognition by modeling the visual and linguistic attributes of action verbs. For example, the verb “salute” has several properties, such as being a light movement, a social act, and short in duration. We use these attributes as the internal mapping between visual and textual representations to reason about a previously unseen action. In contrast to much prior work that assumes access to gold standard attributes for zero-shot classes and focuses primarily on object attributes, our model uniquely learns to infer action attributes from dictionary definitions and distributed word representations. Experimental results confirm that action attributes inferred from language can provide a predictive signal for zero-shot prediction of previously unseen activities. |
Tasks | Activity Recognition |
Published | 2017-07-29 |
URL | http://arxiv.org/abs/1707.09468v2 |
http://arxiv.org/pdf/1707.09468v2.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-activity-recognition-with-verb |
Repo | https://github.com/uwnlp/verb-attributes |
Framework | pytorch |
Gate Activation Signal Analysis for Gated Recurrent Neural Networks and Its Correlation with Phoneme Boundaries
Title | Gate Activation Signal Analysis for Gated Recurrent Neural Networks and Its Correlation with Phoneme Boundaries |
Authors | Yu-Hsuan Wang, Cheng-Tao Chung, Hung-yi Lee |
Abstract | In this paper we analyze the gate activation signals inside the gated recurrent neural networks, and find the temporal structure of such signals is highly correlated with the phoneme boundaries. This correlation is further verified by a set of experiments for phoneme segmentation, in which better results compared to standard approaches were obtained. |
Tasks | |
Published | 2017-03-22 |
URL | http://arxiv.org/abs/1703.07588v2 |
http://arxiv.org/pdf/1703.07588v2.pdf | |
PWC | https://paperswithcode.com/paper/gate-activation-signal-analysis-for-gated |
Repo | https://github.com/allyoushawn/timit_gas |
Framework | tf |
Wasserstein Learning of Deep Generative Point Process Models
Title | Wasserstein Learning of Deep Generative Point Process Models |
Authors | Shuai Xiao, Mehrdad Farajtabar, Xiaojing Ye, Junchi Yan, Le Song, Hongyuan Zha |
Abstract | Point processes are becoming very popular in modeling asynchronous sequential data due to their sound mathematical foundation and strength in modeling a variety of real-world phenomena. Currently, they are often characterized via an intensity function, which limits the model’s expressiveness due to unrealistic assumptions on its parametric form used in practice. Furthermore, they are learned via a maximum likelihood approach, which is prone to failure in multi-modal distributions of sequences. In this paper, we propose an intensity-free approach for point process modeling that transforms nuisance processes to a target one. Furthermore, we train the model using a likelihood-free approach leveraging the Wasserstein distance between point processes. Experiments on various synthetic and real-world data substantiate the superiority of the proposed point process model over conventional ones. |
Tasks | Point Processes |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08051v1 |
http://arxiv.org/pdf/1705.08051v1.pdf | |
PWC | https://paperswithcode.com/paper/wasserstein-learning-of-deep-generative-point |
Repo | https://github.com/xiaoshuai09/Wasserstein-Learning-For-Point-Process |
Framework | tf |
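To give a feel for a transport-style distance between two event sequences, as used in the abstract above, the sketch below matches sorted event times pairwise and sums the absolute differences, which is the optimal one-dimensional matching when both sequences have the same number of events. Handling sequences of different lengths by padding the shorter one with events at the end of the observation window (`horizon`) is an assumption made here for illustration, as is the function name.

```python
def event_sequence_distance(xs, ys, horizon):
    """Sum of |x_i - y_i| over sorted, horizon-padded event times.

    xs, ys: event times in [0, horizon].
    horizon: end of the observation window, used to pad the shorter sequence.
    """
    a, b = sorted(xs), sorted(ys)
    length = max(len(a), len(b))
    a += [horizon] * (length - len(a))
    b += [horizon] * (length - len(b))
    return sum(abs(x - y) for x, y in zip(a, b))
```

A distance of this kind needs only samples from the model and from the data, not a likelihood, which is what makes a likelihood-free training objective possible.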
From Parity to Preference-based Notions of Fairness in Classification
Title | From Parity to Preference-based Notions of Fairness in Classification |
Authors | Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, Krishna P. Gummadi, Adrian Weller |
Abstract | The adoption of automated, data-driven decision making in an ever expanding range of applications has raised concerns about its potential unfairness towards certain social groups. In this context, a number of recent studies have focused on defining, detecting, and removing unfairness from data-driven decision systems. However, the existing notions of fairness, based on parity (equality) in treatment or outcomes for different social groups, tend to be quite stringent, limiting the overall decision making accuracy. In this paper, we draw inspiration from the fair-division and envy-freeness literature in economics and game theory and propose preference-based notions of fairness – given the choice between various sets of decision treatments or outcomes, any group of users would collectively prefer its treatment or outcomes, regardless of the (dis)parity as compared to the other groups. Then, we introduce tractable proxies to design margin-based classifiers that satisfy these preference-based notions of fairness. Finally, we experiment with a variety of synthetic and real-world datasets and show that preference-based fairness allows for greater decision accuracy than parity-based fairness. |
Tasks | Decision Making |
Published | 2017-06-30 |
URL | http://arxiv.org/abs/1707.00010v2 |
http://arxiv.org/pdf/1707.00010v2.pdf | |
PWC | https://paperswithcode.com/paper/from-parity-to-preference-based-notions-of |
Repo | https://github.com/mbilalzafar/fair-classification |
Framework | none |
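The contrast drawn in the abstract above between parity-based and preference-based fairness can be made concrete with a small check. The sketch below assumes one classifier per group, measures a group's benefit as the fraction of its members receiving the positive outcome, and tests whether every group does at least as well under its own classifier as under any other group's. The benefit definition, the tolerance and the function names are assumptions for this example.

```python
def group_benefit(classifier, members):
    """Fraction of group members who receive the positive outcome."""
    return sum(1 for x in members if classifier(x)) / len(members)


def satisfies_parity(classifiers, groups, tolerance=0.05):
    """Parity: every group receives (nearly) the same benefit."""
    benefits = [group_benefit(classifiers[g], members)
                for g, members in groups.items()]
    return max(benefits) - min(benefits) <= tolerance


def satisfies_preferred_treatment(classifiers, groups):
    """Each group does at least as well under its own classifier as under any other."""
    for g, members in groups.items():
        own = group_benefit(classifiers[g], members)
        for other in classifiers:
            if group_benefit(classifiers[other], members) > own:
                return False
    return True
```

A pair of classifiers can pass the preference check while failing parity, which illustrates why the preference-based notion is less restrictive and can leave more room for accuracy.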
DeLiGAN : Generative Adversarial Networks for Diverse and Limited Data
Title | DeLiGAN : Generative Adversarial Networks for Diverse and Limited Data |
Authors | Swaminathan Gurumurthy, Ravi Kiran Sarvadevabhatla, Venkatesh Babu Radhakrishnan |
Abstract | A class of recent approaches for generating images, called Generative Adversarial Networks (GAN), has been used to generate impressively realistic images of objects, bedrooms, handwritten digits and a variety of other image modalities. However, typical GAN-based approaches require large amounts of training data to capture the diversity across the image modality. In this paper, we propose DeLiGAN – a novel GAN-based architecture for diverse and limited training data scenarios. In our approach, we reparameterize the latent generative space as a mixture model and learn the mixture model’s parameters along with those of GAN. This seemingly simple modification to the GAN framework is surprisingly effective and results in models which enable diversity in generated samples although trained with limited data. In our work, we show that DeLiGAN can generate images of handwritten digits, objects and hand-drawn sketches, all using limited amounts of data. To quantitatively characterize intra-class diversity of generated samples, we also introduce a modified version of “inception-score”, a measure which has been found to correlate well with human assessment of generated samples. |
Tasks | Image Generation |
Published | 2017-06-07 |
URL | http://arxiv.org/abs/1706.02071v1 |
http://arxiv.org/pdf/1706.02071v1.pdf | |
PWC | https://paperswithcode.com/paper/deligan-generative-adversarial-networks-for |
Repo | https://github.com/RAF96/ifmo-2019-deep-learning-coursework |
Framework | none |
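The reparameterized latent space described in the abstract above can be sketched as sampling from a mixture of Gaussians: pick a component, then shift and scale standard normal noise by that component's mean and standard deviation. The uniform choice of component, the diagonal covariance and the function name are assumptions for this illustration; in training, the means and scales would be learned together with the generator.

```python
import random


def sample_mixture_latent(means, sigmas, rng):
    """Draw z = mu_i + sigma_i * eps with a uniformly chosen component i.

    means, sigmas: one list of floats per mixture component.
    rng: a random.Random instance, passed in for reproducibility.
    """
    i = rng.randrange(len(means))
    eps = [rng.gauss(0.0, 1.0) for _ in means[i]]
    z = [m + s * e for m, s, e in zip(means[i], sigmas[i], eps)]
    return i, z
```

Writing the sample as a deterministic function of `eps` is what lets gradients reach the mixture parameters, so they can be updated alongside the generator weights.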