July 29, 2019


Paper Group AWR 160

High-Order Attention Models for Visual Question Answering. Imagination-Augmented Agents for Deep Reinforcement Learning. BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages. Interpretable Explanations of Black Boxes by Meaningful Perturbation. Structure propagation for zero-shot learning. A Neural Language Model for Dynamically …

High-Order Attention Models for Visual Question Answering


Title	High-Order Attention Models for Visual Question Answering
Authors	Idan Schwartz, Alexander G. Schwing, Tamir Hazan
Abstract	The quest for algorithms that enable cognitive abilities is an important part of machine learning. A common trait in many recently investigated cognitive-like tasks is that they take into account different data modalities, such as visual and textual input. In this paper we propose a novel and generally applicable form of attention mechanism that learns high-order correlations between various data modalities. We show that high-order correlations effectively direct the appropriate attention to the relevant elements in the different data modalities that are required to solve the joint task. We demonstrate the effectiveness of our high-order attention mechanism on the task of visual question answering (VQA), where we achieve state-of-the-art performance on the standard VQA dataset.
Tasks	Question Answering, Visual Question Answering
Published	2017-11-12
URL	http://arxiv.org/abs/1711.04323v1
PDF	http://arxiv.org/pdf/1711.04323v1.pdf
PWC	https://paperswithcode.com/paper/high-order-attention-models-for-visual
Repo	https://github.com/idansc/HighOrderAtten
Framework	torch

Imagination-Augmented Agents for Deep Reinforcement Learning


Title	Imagination-Augmented Agents for Deep Reinforcement Learning
Authors	Théophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adria Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, Demis Hassabis, David Silver, Daan Wierstra
Abstract	We introduce Imagination-Augmented Agents (I2As), a novel architecture for deep reinforcement learning combining model-free and model-based aspects. In contrast to most existing model-based reinforcement learning and planning methods, which prescribe how a model should be used to arrive at a policy, I2As learn to interpret predictions from a learned environment model to construct implicit plans in arbitrary ways, by using the predictions as additional context in deep policy networks. I2As show improved data efficiency, performance, and robustness to model misspecification compared to several baselines.
Tasks
Published	2017-07-19
URL	http://arxiv.org/abs/1707.06203v2
PDF	http://arxiv.org/pdf/1707.06203v2.pdf
PWC	https://paperswithcode.com/paper/imagination-augmented-agents-for-deep
Repo	https://github.com/Olloxan/Pytorch-A2C
Framework	pytorch

BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages


Title	BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages
Authors	Benjamin Heinzerling, Michael Strube
Abstract	We present BPEmb, a collection of pre-trained subword unit embeddings in 275 languages, based on Byte-Pair Encoding (BPE). In an evaluation using fine-grained entity typing as testbed, BPEmb performs competitively, and for some languages better than alternative subword approaches, while requiring vastly fewer resources and no tokenization. BPEmb is available at https://github.com/bheinzerling/bpemb
Tasks	Entity Typing, Tokenization, Word Embeddings
Published	2017-10-05
URL	http://arxiv.org/abs/1710.02187v1
PDF	http://arxiv.org/pdf/1710.02187v1.pdf
PWC	https://paperswithcode.com/paper/bpemb-tokenization-free-pre-trained-subword
Repo	https://github.com/bheinzerling/bpemb
Framework	none
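
The Byte-Pair Encoding idea behind the abstract above can be illustrated with a minimal sketch. This is a toy version of generic BPE merge learning, not the BPEmb code; the function names (`learn_bpe`, `merge_pair`, `segment`) and the tiny corpus are hypothetical and chosen only for illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy corpus)."""
    # Each word starts as a tuple of single characters, weighted by its frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken lexicographically (an assumption).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a word into subword units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


merges = learn_bpe(["low", "low", "lower", "lowest"], num_merges=2)
pieces = segment("lowly", merges)
```

Each resulting subword unit would then be looked up in an embedding table; the pre-trained tables themselves are what the linked repository distributes.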

Interpretable Explanations of Black Boxes by Meaningful Perturbation


Title	Interpretable Explanations of Black Boxes by Meaningful Perturbation
Authors	Ruth Fong, Andrea Vedaldi
Abstract	As machine learning algorithms are increasingly applied to high impact yet high risk tasks, such as medical diagnosis or autonomous driving, it is critical that researchers can explain how such algorithms arrived at their predictions. In recent years, a number of image saliency methods have been developed to summarize where highly complex neural networks “look” in an image for evidence for their predictions. However, these techniques are limited by their heuristic nature and architectural constraints. In this paper, we make two main contributions: First, we propose a general framework for learning different kinds of explanations for any black box algorithm. Second, we specialise the framework to find the part of an image most responsible for a classifier decision. Unlike previous works, our method is model-agnostic and testable because it is grounded in explicit and interpretable image perturbations.
Tasks	Interpretable Machine Learning
Published	2017-04-11
URL	http://arxiv.org/abs/1704.03296v3
PDF	http://arxiv.org/pdf/1704.03296v3.pdf
PWC	https://paperswithcode.com/paper/interpretable-explanations-of-black-boxes-by
Repo	https://github.com/dizcza/EmbedderSDR
Framework	pytorch

Structure propagation for zero-shot learning


Title	Structure propagation for zero-shot learning
Authors	Guangfeng Lin, Yajun Chen, Fan Zhao
Abstract	The key of zero-shot learning (ZSL) is how to find the information transfer model for bridging the gap between images and semantic information (texts or attributes). Existing ZSL methods usually construct the compatibility function between images and class labels with the consideration of the relevance on the semantic classes (the manifold structure of semantic classes). However, the relationship of image classes (the manifold structure of image classes) is also very important for the compatibility model construction. It is difficult to capture the relationship among image classes due to unseen classes, so that the manifold structure of image classes often is ignored in ZSL. To complement each other between the manifold structure of image classes and that of semantic classes information, we propose structure propagation (SP) for improving the performance of ZSL for classification. SP can jointly consider the manifold structure of image classes and that of semantic classes for approximating to the intrinsic structure of object classes. Moreover, the SP can describe the constrain condition between the compatibility function and these manifold structures for balancing the influence of the structure propagation iteration. The SP solution provides not only unseen class labels but also the relationship of two manifold structures that encode the positive transfer in structure propagation. Experimental results demonstrate that SP can attain the promising results on the AwA, CUB, Dogs and SUN databases.
Tasks	Zero-Shot Learning
Published	2017-11-27
URL	http://arxiv.org/abs/1711.09513v1
PDF	http://arxiv.org/pdf/1711.09513v1.pdf
PWC	https://paperswithcode.com/paper/structure-propagation-for-zero-shot-learning
Repo	https://github.com/lgf78103/Structure-propagation-for-zero-shot-learning
Framework	none

A Neural Language Model for Dynamically Representing the Meanings of Unknown Words and Entities in a Discourse


Title	A Neural Language Model for Dynamically Representing the Meanings of Unknown Words and Entities in a Discourse
Authors	Sosuke Kobayashi, Naoaki Okazaki, Kentaro Inui
Abstract	This study addresses the problem of identifying the meaning of unknown words or entities in a discourse with respect to the word embedding approaches used in neural language models. We proposed a method for on-the-fly construction and exploitation of word embeddings in both the input and output layers of a neural model by tracking contexts. This extends the dynamic entity representation used in Kobayashi et al. (2016) and incorporates a copy mechanism proposed independently by Gu et al. (2016) and Gulcehre et al. (2016). In addition, we construct a new task and dataset called Anonymized Language Modeling for evaluating the ability to capture word meanings while reading. Experiments conducted using our novel dataset show that the proposed variant of RNN language model outperformed the baseline model. Furthermore, the experiments also demonstrate that dynamic updates of an output layer help a model predict reappearing entities, whereas those of an input layer are effective to predict words following reappearing entities.
Tasks	Language Modelling, Word Embeddings
Published	2017-09-06
URL	http://arxiv.org/abs/1709.01679v2
PDF	http://arxiv.org/pdf/1709.01679v2.pdf
PWC	https://paperswithcode.com/paper/a-neural-language-model-for-dynamically
Repo	https://github.com/soskek/dynamic_neural_text_model
Framework	none

Continual Learning with Deep Generative Replay


Title	Continual Learning with Deep Generative Replay
Authors	Hanul Shin, Jung Kwon Lee, Jaehong Kim, Jiwon Kim
Abstract	Attempts to train a comprehensive artificial intelligence capable of solving multiple tasks have been impeded by a chronic problem called catastrophic forgetting. Although simply replaying all previous data alleviates the problem, it requires large memory and, even worse, is often infeasible in real-world applications where access to past data is limited. Inspired by the generative nature of the hippocampus as a short-term memory system in the primate brain, we propose Deep Generative Replay, a novel framework with a cooperative dual-model architecture consisting of a deep generative model (“generator”) and a task-solving model (“solver”). With only these two models, training data for previous tasks can easily be sampled and interleaved with those for a new task. We test our methods in several sequential learning settings involving image classification tasks.
Tasks	Continual Learning, Image Classification
Published	2017-05-24
URL	http://arxiv.org/abs/1705.08690v3
PDF	http://arxiv.org/pdf/1705.08690v3.pdf
PWC	https://paperswithcode.com/paper/continual-learning-with-deep-generative
Repo	https://github.com/hursung1/DeepGenerativeReplay
Framework	pytorch
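
The interleaving of sampled and new data described in the abstract above might be sketched as follows. This is a minimal illustration, not the authors' implementation: `replay_batch`, `replay_ratio`, and the stand-in `generator` and `old_solver` callables are hypothetical, and labeling generated inputs with the previous solver is an assumption the abstract does not spell out.

```python
import random


def replay_batch(new_data, generator, old_solver, batch_size,
                 replay_ratio=0.5, rng=random):
    """Build one training batch mixing real new-task pairs with replayed pairs.

    new_data   : list of (input, label) pairs for the new task
    generator  : zero-argument callable returning a pseudo-input for old tasks
    old_solver : callable mapping an input to the label the previous solver gives
    """
    n_replay = int(batch_size * replay_ratio)
    n_new = batch_size - n_replay
    # Real examples from the task currently being learned.
    new_part = rng.sample(new_data, n_new)
    # Pseudo-examples standing in for data from earlier tasks.
    replayed_inputs = [generator() for _ in range(n_replay)]
    replay_part = [(x, old_solver(x)) for x in replayed_inputs]
    batch = new_part + replay_part
    rng.shuffle(batch)
    return batch


# Toy usage with stand-in models.
new_data = [(i, "new") for i in range(10)]
batch = replay_batch(new_data, generator=lambda: -1,
                     old_solver=lambda x: "old",
                     batch_size=4, rng=random.Random(0))
```

In a real setting the generator would be a trained generative model and the solver a classifier; no stored data from earlier tasks is needed to form the batch.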

Treelogy: A Novel Tree Classifier Utilizing Deep and Hand-crafted Representations


Title	Treelogy: A Novel Tree Classifier Utilizing Deep and Hand-crafted Representations
Authors	İlke Çuğu, Eren Şener, Çağrı Erciyes, Burak Balcı, Emre Akın, Itır Önal, Ahmet Oğuz Akyüz
Abstract	We propose a novel tree classification system called Treelogy, that fuses deep representations with hand-crafted features obtained from leaf images to perform leaf-based plant classification. Key to this system are segmentation of the leaf from an untextured background, using convolutional neural networks (CNNs) for learning deep representations, extracting hand-crafted features with a number of image processing techniques, training a linear SVM with feature vectors, merging SVM and CNN results, and identifying the species from a dataset of 57 trees. Our classification results show that fusion of deep representations with hand-crafted features leads to the highest accuracy. The proposed algorithm is embedded in a smart-phone application, which is publicly available. Furthermore, our novel dataset comprised of 5408 leaf images is also made public for use of other researchers.
Tasks
Published	2017-01-28
URL	http://arxiv.org/abs/1701.08291v1
PDF	http://arxiv.org/pdf/1701.08291v1.pdf
PWC	https://paperswithcode.com/paper/treelogy-a-novel-tree-classifier-utilizing
Repo	https://github.com/cuguilke/Treelogy
Framework	caffe2

Streaming Word Embeddings with the Space-Saving Algorithm


Title	Streaming Word Embeddings with the Space-Saving Algorithm
Authors	Chandler May, Kevin Duh, Benjamin Van Durme, Ashwin Lall
Abstract	We develop a streaming (one-pass, bounded-memory) word embedding algorithm based on the canonical skip-gram with negative sampling algorithm implemented in word2vec. We compare our streaming algorithm to word2vec empirically by measuring the cosine similarity between word pairs under each algorithm and by applying each algorithm in the downstream task of hashtag prediction on a two-month interval of the Twitter sample stream. We then discuss the results of these experiments, concluding they provide partial validation of our approach as a streaming replacement for word2vec. Finally, we discuss potential failure modes and suggest directions for future work.
Tasks	Word Embeddings
Published	2017-04-24
URL	http://arxiv.org/abs/1704.07463v1
PDF	http://arxiv.org/pdf/1704.07463v1.pdf
PWC	https://paperswithcode.com/paper/streaming-word-embeddings-with-the-space
Repo	https://github.com/cjmay/athena
Framework	none
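
The title above names the Space-Saving algorithm, a bounded-memory frequency counter. A minimal sketch of that general counting scheme follows; the class name and the toy stream are hypothetical, and how the paper couples the counter to skip-gram training is not shown here.

```python
class SpaceSaving:
    """Track approximate counts for at most `capacity` items in a single pass."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}

    def update(self, item):
        if item in self.counts:
            # Already monitored: just increment its counter.
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            # Free slot available: start monitoring the item.
            self.counts[item] = 1
        else:
            # Table full: evict the item with the smallest count and let the
            # newcomer inherit that count plus one (an overestimate).
            victim = min(self.counts, key=self.counts.get)
            min_count = self.counts.pop(victim)
            self.counts[item] = min_count + 1


counter = SpaceSaving(capacity=2)
for token in ["a", "a", "b", "c", "a"]:
    counter.update(token)
```

The linear scan for the minimum keeps the sketch short; a production version would typically use a structure that finds the minimum in constant time.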

RAIL: Risk-Averse Imitation Learning


Title	RAIL: Risk-Averse Imitation Learning
Authors	Anirban Santara, Abhishek Naik, Balaraman Ravindran, Dipankar Das, Dheevatsa Mudigere, Sasikanth Avancha, Bharat Kaul
Abstract	Imitation learning algorithms learn viable policies by imitating an expert’s behavior when reward signals are not available. Generative Adversarial Imitation Learning (GAIL) is a state-of-the-art algorithm for learning policies when the expert’s behavior is available as a fixed set of trajectories. We evaluate in terms of the expert’s cost function and observe that the distribution of trajectory-costs is often more heavy-tailed for GAIL-agents than the expert at a number of benchmark continuous-control tasks. Thus, high-cost trajectories, corresponding to tail-end events of catastrophic failure, are more likely to be encountered by the GAIL-agents than the expert. This makes the reliability of GAIL-agents questionable when it comes to deployment in risk-sensitive applications like robotic surgery and autonomous driving. In this work, we aim to minimize the occurrence of tail-end events by minimizing tail risk within the GAIL framework. We quantify tail risk by the Conditional-Value-at-Risk (CVaR) of trajectories and develop the Risk-Averse Imitation Learning (RAIL) algorithm. We observe that the policies learned with RAIL show lower tail-end risk than those of vanilla GAIL. Thus the proposed RAIL algorithm appears as a potent alternative to GAIL for improved reliability in risk-sensitive applications.
Tasks	Autonomous Driving, Continuous Control, Imitation Learning
Published	2017-07-20
URL	http://arxiv.org/abs/1707.06658v4
PDF	http://arxiv.org/pdf/1707.06658v4.pdf
PWC	https://paperswithcode.com/paper/rail-risk-averse-imitation-learning
Repo	https://github.com/Santara/RAIL
Framework	none
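
The tail-risk measure mentioned in the abstract above, Conditional-Value-at-Risk, can be illustrated with a standard empirical estimator over sampled trajectory costs. This is a sketch under assumptions: the function name `cvar`, the confidence level, and the sample costs are hypothetical, and the estimator used in the paper may differ.

```python
import math


def cvar(costs, alpha=0.9):
    """Return (VaR, CVaR) of `costs` at confidence level `alpha`.

    VaR is the smallest cost inside the worst (1 - alpha) tail;
    CVaR is the mean cost over that tail.
    """
    ordered = sorted(costs)
    # Size of the worst-case tail, at least one sample; rounding guards
    # against floating-point noise in (1 - alpha) * n.
    k = max(1, math.ceil(round((1 - alpha) * len(ordered), 9)))
    tail = ordered[-k:]
    return tail[0], sum(tail) / k


# Ten trajectory costs with one catastrophic outlier.
costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
var_80, cvar_80 = cvar(costs, alpha=0.8)
```

A heavy-tailed cost distribution shows up as a CVaR far above the VaR, which is the gap a risk-averse objective would try to shrink.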

Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers


Title	Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers
Authors	Nafise Sadat Moosavi, Michael Strube
Abstract	Coreference resolution is an intermediate step for text understanding. It is used in tasks and domains for which we do not necessarily have coreference annotated corpora. Therefore, generalization is of special importance for coreference resolution. However, while recent coreference resolvers have notable improvements on the CoNLL dataset, they struggle to generalize properly to new domains or datasets. In this paper, we investigate the role of linguistic features in building more generalizable coreference resolvers. We show that generalization improves only slightly by merely using a set of additional linguistic features. However, employing features and subsets of their values that are informative for coreference resolution, considerably improves generalization. Thanks to better generalization, our system achieves state-of-the-art results in out-of-domain evaluations, e.g., on WikiCoref, our system, which is trained on CoNLL, achieves on-par performance with a system designed for this dataset.
Tasks	Coreference Resolution
Published	2017-08-01
URL	http://arxiv.org/abs/1708.00160v2
PDF	http://arxiv.org/pdf/1708.00160v2.pdf
PWC	https://paperswithcode.com/paper/using-linguistic-features-to-improve-the
Repo	https://github.com/ns-moosavi/epm
Framework	none

Multi-scale Multi-band DenseNets for Audio Source Separation


Title	Multi-scale Multi-band DenseNets for Audio Source Separation
Authors	Naoya Takahashi, Yuki Mitsufuji
Abstract	This paper deals with the problem of audio source separation. To handle the complex and ill-posed nature of the problems of audio source separation, the current state-of-the-art approaches employ deep neural networks to obtain instrumental spectra from a mixture. In this study, we propose a novel network architecture that extends the recently developed densely connected convolutional network (DenseNet), which has shown excellent results on image classification tasks. To deal with the specific problem of audio source separation, an up-sampling layer, block skip connection and band-dedicated dense blocks are incorporated on top of DenseNet. The proposed approach takes advantage of long contextual information and outperforms state-of-the-art results on SiSEC 2016 competition by a large margin in terms of signal-to-distortion ratio. Moreover, the proposed architecture requires significantly fewer parameters and considerably less training time compared with other methods.
Tasks	Music Source Separation
Published	2017-06-29
URL	http://arxiv.org/abs/1706.09588v1
PDF	http://arxiv.org/pdf/1706.09588v1.pdf
PWC	https://paperswithcode.com/paper/multi-scale-multi-band-densenets-for-audio
Repo	https://github.com/tsurumeso/vocal-remover
Framework	none

Criticality as It Could Be: organizational invariance as self-organized criticality in embodied agents


Title	Criticality as It Could Be: organizational invariance as self-organized criticality in embodied agents
Authors	Miguel Aguilera, Manuel G. Bedia
Abstract	This paper outlines a methodological approach for designing adaptive agents driving themselves near points of criticality. Using a synthetic approach we construct a conceptual model that, instead of specifying mechanistic requirements to generate criticality, exploits the maintenance of an organizational structure capable of reproducing critical behavior. Our approach exploits the well-known principle of universality, which classifies critical phenomena inside a few universality classes of systems independently of their specific mechanisms or topologies. In particular, we implement an artificial embodied agent controlled by a neural network maintaining a correlation structure randomly sampled from a lattice Ising model at a critical point. We evaluate the agent in two classical reinforcement learning scenarios: the Mountain Car benchmark and the Acrobot double pendulum, finding that in both cases the neural controller reaches a point of criticality, which coincides with a transition point between two regimes of the agent’s behaviour, maximizing the mutual information between neurons and sensorimotor patterns. Finally, we discuss the possible applications of this synthetic approach to the comprehension of deeper principles connected to the pervasive presence of criticality in biological and cognitive systems.
Tasks
Published	2017-04-18
URL	http://arxiv.org/abs/1704.05255v3
PDF	http://arxiv.org/pdf/1704.05255v3.pdf
PWC	https://paperswithcode.com/paper/criticality-as-it-could-be-organizational
Repo	https://github.com/heysoos/CriticalForagingOrgs
Framework	none

Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with $\ell_1$ and $\ell_2$ Regularization


Title	Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with $\ell_1$ and $\ell_2$ Regularization
Authors	Zhifeng Kong
Abstract	In this paper, we made an extension to the convergence analysis of the dynamics of two-layered bias-free networks with one $ReLU$ output. We took into consideration two popular regularization terms: the $\ell_1$ and $\ell_2$ norm of the parameter vector $w$, and added it to the square loss function with coefficient $\lambda/2$. We proved that when $\lambda$ is small, the weight vector $w$ converges to the optimal solution $\hat{w}$ (with respect to the new loss function) with probability $\geq (1-\varepsilon)(1-A_d)/2$ under random initiations in a sphere centered at the origin, where $\varepsilon$ is a small value and $A_d$ is a constant. Numerical experiments including phase diagrams and repeated simulations verified our theory.
Tasks
Published	2017-11-19
URL	http://arxiv.org/abs/1711.07005v1
PDF	http://arxiv.org/pdf/1711.07005v1.pdf
PWC	https://paperswithcode.com/paper/convergence-analysis-of-the-dynamics-of-a
Repo	https://github.com/FengNiMa/ReLU_Convergence
Framework	pytorch
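
The regularized objective described in the abstract above can be written compactly as follows. This is a sketch: the symbols $L$, $L_\lambda$, and $p$ are introduced here for illustration, and the abstract does not say whether the $\ell_2$ term is squared, so the norm is left as stated.

$$
L_\lambda(w) = L(w) + \frac{\lambda}{2}\,\lVert w \rVert_p, \qquad p \in \{1, 2\}, \qquad \hat{w} = \arg\min_{w} L_\lambda(w),
$$

where $L(w)$ is the square loss of the network and $\hat{w}$ is the optimal solution with respect to the new loss function.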

Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting


Title	Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting
Authors	Bing Yu, Haoteng Yin, Zhanxing Zhu
Abstract	Timely accurate traffic forecast is crucial for urban traffic control and guidance. Due to the high nonlinearity and complexity of traffic flow, traditional methods cannot satisfy the requirements of mid-and-long term prediction tasks and often neglect spatial and temporal dependencies. In this paper, we propose a novel deep learning framework, Spatio-Temporal Graph Convolutional Networks (STGCN), to tackle the time series prediction problem in traffic domain. Instead of applying regular convolutional and recurrent units, we formulate the problem on graphs and build the model with complete convolutional structures, which enable much faster training speed with fewer parameters. Experiments show that our model STGCN effectively captures comprehensive spatio-temporal correlations through modeling multi-scale traffic networks and consistently outperforms state-of-the-art baselines on various real-world traffic datasets.
Tasks	Time Series, Time Series Prediction, Traffic Prediction
Published	2017-09-14
URL	http://arxiv.org/abs/1709.04875v4
PDF	http://arxiv.org/pdf/1709.04875v4.pdf
PWC	https://paperswithcode.com/paper/spatio-temporal-graph-convolutional-networks
Repo	https://github.com/Aguin/STGCN-PyTorch
Framework	pytorch