July 29, 2019

Paper Group AWR 160

High-Order Attention Models for Visual Question Answering. Imagination-Augmented Agents for Deep Reinforcement Learning. BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages. Interpretable Explanations of Black Boxes by Meaningful Perturbation. Structure propagation for zero-shot learning. A Neural Language Model for Dynamically …

High-Order Attention Models for Visual Question Answering

Title High-Order Attention Models for Visual Question Answering
Authors Idan Schwartz, Alexander G. Schwing, Tamir Hazan
Abstract The quest for algorithms that enable cognitive abilities is an important part of machine learning. A common trait in many recently investigated cognitive-like tasks is that they take into account different data modalities, such as visual and textual input. In this paper we propose a novel and generally applicable form of attention mechanism that learns high-order correlations between various data modalities. We show that high-order correlations effectively direct the appropriate attention to the relevant elements in the different data modalities that are required to solve the joint task. We demonstrate the effectiveness of our high-order attention mechanism on the task of visual question answering (VQA), where we achieve state-of-the-art performance on the standard VQA dataset.
Tasks Question Answering, Visual Question Answering
Published 2017-11-12
URL http://arxiv.org/abs/1711.04323v1
PDF http://arxiv.org/pdf/1711.04323v1.pdf
PWC https://paperswithcode.com/paper/high-order-attention-models-for-visual
Repo https://github.com/idansc/HighOrderAtten
Framework torch

Imagination-Augmented Agents for Deep Reinforcement Learning

Title Imagination-Augmented Agents for Deep Reinforcement Learning
Authors Théophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adria Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, Demis Hassabis, David Silver, Daan Wierstra
Abstract We introduce Imagination-Augmented Agents (I2As), a novel architecture for deep reinforcement learning combining model-free and model-based aspects. In contrast to most existing model-based reinforcement learning and planning methods, which prescribe how a model should be used to arrive at a policy, I2As learn to interpret predictions from a learned environment model to construct implicit plans in arbitrary ways, by using the predictions as additional context in deep policy networks. I2As show improved data efficiency, performance, and robustness to model misspecification compared to several baselines.
Tasks
Published 2017-07-19
URL http://arxiv.org/abs/1707.06203v2
PDF http://arxiv.org/pdf/1707.06203v2.pdf
PWC https://paperswithcode.com/paper/imagination-augmented-agents-for-deep
Repo https://github.com/Olloxan/Pytorch-A2C
Framework pytorch

BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages

Title BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages
Authors Benjamin Heinzerling, Michael Strube
Abstract We present BPEmb, a collection of pre-trained subword unit embeddings in 275 languages, based on Byte-Pair Encoding (BPE). In an evaluation using fine-grained entity typing as testbed, BPEmb performs competitively, and for some languages better than alternative subword approaches, while requiring vastly fewer resources and no tokenization. BPEmb is available at https://github.com/bheinzerling/bpemb
Tasks Entity Typing, Tokenization, Word Embeddings
Published 2017-10-05
URL http://arxiv.org/abs/1710.02187v1
PDF http://arxiv.org/pdf/1710.02187v1.pdf
PWC https://paperswithcode.com/paper/bpemb-tokenization-free-pre-trained-subword
Repo https://github.com/bheinzerling/bpemb
Framework none
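
The abstract above builds on Byte-Pair Encoding. As a rough illustration of the general BPE idea (not the BPEmb code or its training setup), the following toy sketch learns merge rules from a word list and applies them to a new word. The function names `learn_bpe`, `merge_pair`, and `segment` are hypothetical, and the tie-breaking rule is an arbitrary choice made here for determinism.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    # Each word starts as a tuple of single characters, weighted by its frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself (an assumption).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a word into subword units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)
```

Because segmentation starts from raw characters, any string can be split into known units without a separate tokenizer, which is the property the abstract refers to as requiring no tokenization.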

Interpretable Explanations of Black Boxes by Meaningful Perturbation

Title Interpretable Explanations of Black Boxes by Meaningful Perturbation
Authors Ruth Fong, Andrea Vedaldi
Abstract As machine learning algorithms are increasingly applied to high impact yet high risk tasks, such as medical diagnosis or autonomous driving, it is critical that researchers can explain how such algorithms arrived at their predictions. In recent years, a number of image saliency methods have been developed to summarize where highly complex neural networks “look” in an image for evidence for their predictions. However, these techniques are limited by their heuristic nature and architectural constraints. In this paper, we make two main contributions: First, we propose a general framework for learning different kinds of explanations for any black box algorithm. Second, we specialise the framework to find the part of an image most responsible for a classifier decision. Unlike previous works, our method is model-agnostic and testable because it is grounded in explicit and interpretable image perturbations.
Tasks Interpretable Machine Learning
Published 2017-04-11
URL http://arxiv.org/abs/1704.03296v3
PDF http://arxiv.org/pdf/1704.03296v3.pdf
PWC https://paperswithcode.com/paper/interpretable-explanations-of-black-boxes-by
Repo https://github.com/dizcza/EmbedderSDR
Framework pytorch
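
To make the idea of explaining a black box through explicit image perturbations concrete, here is a deliberately simplified sketch. It is an exhaustive occlusion search, not the paper's method: it replaces each square patch with a constant value and reports the patch whose removal lowers the classifier score the most. The `classifier` callable, the `fill` value, and the function name are all assumptions made for illustration.

```python
def most_responsible_patch(image, classifier, patch, fill=0.0):
    """Return ((top, left), score_drop) for the patch whose occlusion hurts the score most.

    `image` is a list of rows of floats; `classifier` is any black-box
    function mapping such an image to a scalar score.
    """
    height, width = len(image), len(image[0])
    base_score = classifier(image)
    best_pos, best_drop = None, float("-inf")
    for top in range(height - patch + 1):
        for left in range(width - patch + 1):
            # Copy the image and overwrite one patch with the fill value.
            perturbed = [row[:] for row in image]
            for r in range(top, top + patch):
                for c in range(left, left + patch):
                    perturbed[r][c] = fill
            drop = base_score - classifier(perturbed)
            if drop > best_drop:
                best_pos, best_drop = (top, left), drop
    return best_pos, best_drop
```

The search only queries the classifier's output, so it works for any model, and its result is testable: deleting the reported region should measurably reduce the score.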

Structure propagation for zero-shot learning

Title Structure propagation for zero-shot learning
Authors Guangfeng Lin, Yajun Chen, Fan Zhao
Abstract The key of zero-shot learning (ZSL) is how to find the information transfer model for bridging the gap between images and semantic information (texts or attributes). Existing ZSL methods usually construct the compatibility function between images and class labels with the consideration of the relevance on the semantic classes (the manifold structure of semantic classes). However, the relationship of image classes (the manifold structure of image classes) is also very important for the compatibility model construction. It is difficult to capture the relationship among image classes due to unseen classes, so that the manifold structure of image classes often is ignored in ZSL. To complement each other between the manifold structure of image classes and that of semantic classes information, we propose structure propagation (SP) for improving the performance of ZSL for classification. SP can jointly consider the manifold structure of image classes and that of semantic classes for approximating to the intrinsic structure of object classes. Moreover, the SP can describe the constrain condition between the compatibility function and these manifold structures for balancing the influence of the structure propagation iteration. The SP solution provides not only unseen class labels but also the relationship of two manifold structures that encode the positive transfer in structure propagation. Experimental results demonstrate that SP can attain the promising results on the AwA, CUB, Dogs and SUN databases.
Tasks Zero-Shot Learning
Published 2017-11-27
URL http://arxiv.org/abs/1711.09513v1
PDF http://arxiv.org/pdf/1711.09513v1.pdf
PWC https://paperswithcode.com/paper/structure-propagation-for-zero-shot-learning
Repo https://github.com/lgf78103/Structure-propagation-for-zero-shot-learning
Framework none

A Neural Language Model for Dynamically Representing the Meanings of Unknown Words and Entities in a Discourse

Title A Neural Language Model for Dynamically Representing the Meanings of Unknown Words and Entities in a Discourse
Authors Sosuke Kobayashi, Naoaki Okazaki, Kentaro Inui
Abstract This study addresses the problem of identifying the meaning of unknown words or entities in a discourse with respect to the word embedding approaches used in neural language models. We proposed a method for on-the-fly construction and exploitation of word embeddings in both the input and output layers of a neural model by tracking contexts. This extends the dynamic entity representation used in Kobayashi et al. (2016) and incorporates a copy mechanism proposed independently by Gu et al. (2016) and Gulcehre et al. (2016). In addition, we construct a new task and dataset called Anonymized Language Modeling for evaluating the ability to capture word meanings while reading. Experiments conducted using our novel dataset show that the proposed variant of RNN language model outperformed the baseline model. Furthermore, the experiments also demonstrate that dynamic updates of an output layer help a model predict reappearing entities, whereas those of an input layer are effective to predict words following reappearing entities.
Tasks Language Modelling, Word Embeddings
Published 2017-09-06
URL http://arxiv.org/abs/1709.01679v2
PDF http://arxiv.org/pdf/1709.01679v2.pdf
PWC https://paperswithcode.com/paper/a-neural-language-model-for-dynamically
Repo https://github.com/soskek/dynamic_neural_text_model
Framework none

Continual Learning with Deep Generative Replay

Title Continual Learning with Deep Generative Replay
Authors Hanul Shin, Jung Kwon Lee, Jaehong Kim, Jiwon Kim
Abstract Attempts to train a comprehensive artificial intelligence capable of solving multiple tasks have been impeded by a chronic problem called catastrophic forgetting. Although simply replaying all previous data alleviates the problem, it requires large memory and, even worse, is often infeasible in real-world applications where access to past data is limited. Inspired by the generative nature of the hippocampus as a short-term memory system in the primate brain, we propose Deep Generative Replay, a novel framework with a cooperative dual model architecture consisting of a deep generative model (“generator”) and a task solving model (“solver”). With only these two models, training data for previous tasks can easily be sampled and interleaved with those for a new task. We test our methods in several sequential learning settings involving image classification tasks.
Tasks Continual Learning, Image Classification
Published 2017-05-24
URL http://arxiv.org/abs/1705.08690v3
PDF http://arxiv.org/pdf/1705.08690v3.pdf
PWC https://paperswithcode.com/paper/continual-learning-with-deep-generative
Repo https://github.com/hursung1/DeepGenerativeReplay
Framework pytorch
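
The interleaving step mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `generator` is taken to be a zero-argument callable that samples an input resembling earlier tasks, `old_solver` is taken to label that input, and `replay_ratio` is a hypothetical knob controlling how many replayed samples are mixed in per new sample.

```python
import random


def build_replay_batch(new_batch, generator, old_solver, replay_ratio, rng=random):
    """Mix real samples for the new task with generated samples for old tasks.

    `new_batch` is a list of (input, label) pairs. Replayed inputs come from
    `generator()` and are labeled by `old_solver` (both are assumptions here).
    """
    n_replay = round(len(new_batch) * replay_ratio)
    replayed = []
    for _ in range(n_replay):
        x = generator()
        replayed.append((x, old_solver(x)))
    batch = list(new_batch) + replayed
    rng.shuffle(batch)  # interleave old and new samples
    return batch
```

No stored data from earlier tasks is needed: only the two models are kept, which is the memory saving the abstract describes.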

Treelogy: A Novel Tree Classifier Utilizing Deep and Hand-crafted Representations

Title Treelogy: A Novel Tree Classifier Utilizing Deep and Hand-crafted Representations
Authors İlke Çuğu, Eren Şener, Çağrı Erciyes, Burak Balcı, Emre Akın, Itır Önal, Ahmet Oğuz Akyüz
Abstract We propose a novel tree classification system called Treelogy, that fuses deep representations with hand-crafted features obtained from leaf images to perform leaf-based plant classification. Key to this system are segmentation of the leaf from an untextured background, using convolutional neural networks (CNNs) for learning deep representations, extracting hand-crafted features with a number of image processing techniques, training a linear SVM with feature vectors, merging SVM and CNN results, and identifying the species from a dataset of 57 trees. Our classification results show that fusion of deep representations with hand-crafted features leads to the highest accuracy. The proposed algorithm is embedded in a smart-phone application, which is publicly available. Furthermore, our novel dataset comprised of 5408 leaf images is also made public for use of other researchers.
Tasks
Published 2017-01-28
URL http://arxiv.org/abs/1701.08291v1
PDF http://arxiv.org/pdf/1701.08291v1.pdf
PWC https://paperswithcode.com/paper/treelogy-a-novel-tree-classifier-utilizing
Repo https://github.com/cuguilke/Treelogy
Framework caffe2

Streaming Word Embeddings with the Space-Saving Algorithm

Title Streaming Word Embeddings with the Space-Saving Algorithm
Authors Chandler May, Kevin Duh, Benjamin Van Durme, Ashwin Lall
Abstract We develop a streaming (one-pass, bounded-memory) word embedding algorithm based on the canonical skip-gram with negative sampling algorithm implemented in word2vec. We compare our streaming algorithm to word2vec empirically by measuring the cosine similarity between word pairs under each algorithm and by applying each algorithm in the downstream task of hashtag prediction on a two-month interval of the Twitter sample stream. We then discuss the results of these experiments, concluding they provide partial validation of our approach as a streaming replacement for word2vec. Finally, we discuss potential failure modes and suggest directions for future work.
Tasks Word Embeddings
Published 2017-04-24
URL http://arxiv.org/abs/1704.07463v1
PDF http://arxiv.org/pdf/1704.07463v1.pdf
PWC https://paperswithcode.com/paper/streaming-word-embeddings-with-the-space
Repo https://github.com/cjmay/athena
Framework none
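
For readers unfamiliar with the Space-Saving algorithm named in the title, the sketch below shows the classic bounded-memory frequency counter. It illustrates only the counting scheme; how the paper couples it to skip-gram training is not described in the abstract and is not modeled here. The function name and the tie-breaking among equal minimum counts are assumptions.

```python
def space_saving(stream, k):
    """Approximate item counts over a stream using at most `k` counters."""
    counts = {}
    for item in stream:
        if item in counts:
            counts[item] += 1
        elif len(counts) < k:
            counts[item] = 1
        else:
            # Evict the item with the smallest count; the newcomer inherits
            # that count plus one, so counts may overestimate but never underestimate.
            victim = min(counts, key=counts.get)
            counts[item] = counts.pop(victim) + 1
    return counts
```

The counters always sum to the number of items seen, and memory use stays fixed at `k` entries regardless of how many distinct items the stream contains.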

RAIL: Risk-Averse Imitation Learning

Title RAIL: Risk-Averse Imitation Learning
Authors Anirban Santara, Abhishek Naik, Balaraman Ravindran, Dipankar Das, Dheevatsa Mudigere, Sasikanth Avancha, Bharat Kaul
Abstract Imitation learning algorithms learn viable policies by imitating an expert’s behavior when reward signals are not available. Generative Adversarial Imitation Learning (GAIL) is a state-of-the-art algorithm for learning policies when the expert’s behavior is available as a fixed set of trajectories. We evaluate in terms of the expert’s cost function and observe that the distribution of trajectory-costs is often more heavy-tailed for GAIL-agents than the expert at a number of benchmark continuous-control tasks. Thus, high-cost trajectories, corresponding to tail-end events of catastrophic failure, are more likely to be encountered by the GAIL-agents than the expert. This makes the reliability of GAIL-agents questionable when it comes to deployment in risk-sensitive applications like robotic surgery and autonomous driving. In this work, we aim to minimize the occurrence of tail-end events by minimizing tail risk within the GAIL framework. We quantify tail risk by the Conditional-Value-at-Risk (CVaR) of trajectories and develop the Risk-Averse Imitation Learning (RAIL) algorithm. We observe that the policies learned with RAIL show lower tail-end risk than those of vanilla GAIL. Thus the proposed RAIL algorithm appears as a potent alternative to GAIL for improved reliability in risk-sensitive applications.
Tasks Autonomous Driving, Continuous Control, Imitation Learning
Published 2017-07-20
URL http://arxiv.org/abs/1707.06658v4
PDF http://arxiv.org/pdf/1707.06658v4.pdf
PWC https://paperswithcode.com/paper/rail-risk-averse-imitation-learning
Repo https://github.com/Santara/RAIL
Framework none
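
The abstract quantifies tail risk with the Conditional-Value-at-Risk (CVaR) of trajectory costs. A common empirical estimate, shown here as a minimal sketch and not as the RAIL training objective, is the mean of the worst (1 - alpha) fraction of observed costs. The function name and the rounding of the tail size are assumptions.

```python
import math


def empirical_cvar(costs, alpha):
    """Mean of the highest (1 - alpha) fraction of `costs` (at least one sample)."""
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(costs, reverse=True)  # highest cost first
    tail_size = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    return sum(ordered[:tail_size]) / tail_size
```

With `alpha = 0` this reduces to the plain mean cost, while values close to 1 focus on the rare, very expensive trajectories that the abstract associates with catastrophic failure.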

Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers

Title Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers
Authors Nafise Sadat Moosavi, Michael Strube
Abstract Coreference resolution is an intermediate step for text understanding. It is used in tasks and domains for which we do not necessarily have coreference annotated corpora. Therefore, generalization is of special importance for coreference resolution. However, while recent coreference resolvers have notable improvements on the CoNLL dataset, they struggle to generalize properly to new domains or datasets. In this paper, we investigate the role of linguistic features in building more generalizable coreference resolvers. We show that generalization improves only slightly by merely using a set of additional linguistic features. However, employing features and subsets of their values that are informative for coreference resolution, considerably improves generalization. Thanks to better generalization, our system achieves state-of-the-art results in out-of-domain evaluations, e.g., on WikiCoref, our system, which is trained on CoNLL, achieves on-par performance with a system designed for this dataset.
Tasks Coreference Resolution
Published 2017-08-01
URL http://arxiv.org/abs/1708.00160v2
PDF http://arxiv.org/pdf/1708.00160v2.pdf
PWC https://paperswithcode.com/paper/using-linguistic-features-to-improve-the
Repo https://github.com/ns-moosavi/epm
Framework none

Multi-scale Multi-band DenseNets for Audio Source Separation

Title Multi-scale Multi-band DenseNets for Audio Source Separation
Authors Naoya Takahashi, Yuki Mitsufuji
Abstract This paper deals with the problem of audio source separation. To handle the complex and ill-posed nature of the problems of audio source separation, the current state-of-the-art approaches employ deep neural networks to obtain instrumental spectra from a mixture. In this study, we propose a novel network architecture that extends the recently developed densely connected convolutional network (DenseNet), which has shown excellent results on image classification tasks. To deal with the specific problem of audio source separation, an up-sampling layer, block skip connection and band-dedicated dense blocks are incorporated on top of DenseNet. The proposed approach takes advantage of long contextual information and outperforms state-of-the-art results on SiSEC 2016 competition by a large margin in terms of signal-to-distortion ratio. Moreover, the proposed architecture requires significantly fewer parameters and considerably less training time compared with other methods.
Tasks Music Source Separation
Published 2017-06-29
URL http://arxiv.org/abs/1706.09588v1
PDF http://arxiv.org/pdf/1706.09588v1.pdf
PWC https://paperswithcode.com/paper/multi-scale-multi-band-densenets-for-audio
Repo https://github.com/tsurumeso/vocal-remover
Framework none

Criticality as It Could Be: organizational invariance as self-organized criticality in embodied agents

Title Criticality as It Could Be: organizational invariance as self-organized criticality in embodied agents
Authors Miguel Aguilera, Manuel G. Bedia
Abstract This paper outlines a methodological approach for designing adaptive agents driving themselves near points of criticality. Using a synthetic approach we construct a conceptual model that, instead of specifying mechanistic requirements to generate criticality, exploits the maintenance of an organizational structure capable of reproducing critical behavior. Our approach exploits the well-known principle of universality, which classifies critical phenomena inside a few universality classes of systems independently of their specific mechanisms or topologies. In particular, we implement an artificial embodied agent controlled by a neural network maintaining a correlation structure randomly sampled from a lattice Ising model at a critical point. We evaluate the agent in two classical reinforcement learning scenarios: the Mountain Car benchmark and the Acrobot double pendulum, finding that in both cases the neural controller reaches a point of criticality, which coincides with a transition point between two regimes of the agent’s behaviour, maximizing the mutual information between neurons and sensorimotor patterns. Finally, we discuss the possible applications of this synthetic approach to the comprehension of deeper principles connected to the pervasive presence of criticality in biological and cognitive systems.
Tasks
Published 2017-04-18
URL http://arxiv.org/abs/1704.05255v3
PDF http://arxiv.org/pdf/1704.05255v3.pdf
PWC https://paperswithcode.com/paper/criticality-as-it-could-be-organizational
Repo https://github.com/heysoos/CriticalForagingOrgs
Framework none

Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with $\ell_1$ and $\ell_2$ Regularization

Title Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with $\ell_1$ and $\ell_2$ Regularization
Authors Zhifeng Kong
Abstract In this paper, we made an extension to the convergence analysis of the dynamics of two-layered bias-free networks with one $ReLU$ output. We took into consideration two popular regularization terms: the $\ell_1$ and $\ell_2$ norm of the parameter vector $w$, and added it to the square loss function with coefficient $\lambda/2$. We proved that when $\lambda$ is small, the weight vector $w$ converges to the optimal solution $\hat{w}$ (with respect to the new loss function) with probability $\geq (1-\varepsilon)(1-A_d)/2$ under random initiations in a sphere centered at the origin, where $\varepsilon$ is a small value and $A_d$ is a constant. Numerical experiments including phase diagrams and repeated simulations verified our theory.
Tasks
Published 2017-11-19
URL http://arxiv.org/abs/1711.07005v1
PDF http://arxiv.org/pdf/1711.07005v1.pdf
PWC https://paperswithcode.com/paper/convergence-analysis-of-the-dynamics-of-a
Repo https://github.com/FengNiMa/ReLU_Convergence
Framework pytorch

Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting

Title Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting
Authors Bing Yu, Haoteng Yin, Zhanxing Zhu
Abstract Timely accurate traffic forecast is crucial for urban traffic control and guidance. Due to the high nonlinearity and complexity of traffic flow, traditional methods cannot satisfy the requirements of mid-and-long term prediction tasks and often neglect spatial and temporal dependencies. In this paper, we propose a novel deep learning framework, Spatio-Temporal Graph Convolutional Networks (STGCN), to tackle the time series prediction problem in traffic domain. Instead of applying regular convolutional and recurrent units, we formulate the problem on graphs and build the model with complete convolutional structures, which enable much faster training speed with fewer parameters. Experiments show that our model STGCN effectively captures comprehensive spatio-temporal correlations through modeling multi-scale traffic networks and consistently outperforms state-of-the-art baselines on various real-world traffic datasets.
Tasks Time Series, Time Series Prediction, Traffic Prediction
Published 2017-09-14
URL http://arxiv.org/abs/1709.04875v4
PDF http://arxiv.org/pdf/1709.04875v4.pdf
PWC https://paperswithcode.com/paper/spatio-temporal-graph-convolutional-networks
Repo https://github.com/Aguin/STGCN-PyTorch
Framework pytorch