Paper Group AWR 160
High-Order Attention Models for Visual Question Answering
Title | High-Order Attention Models for Visual Question Answering |
Authors | Idan Schwartz, Alexander G. Schwing, Tamir Hazan |
Abstract | The quest for algorithms that enable cognitive abilities is an important part of machine learning. A common trait in many recently investigated cognitive-like tasks is that they take into account different data modalities, such as visual and textual input. In this paper we propose a novel and generally applicable form of attention mechanism that learns high-order correlations between various data modalities. We show that high-order correlations effectively direct the appropriate attention to the relevant elements in the different data modalities that are required to solve the joint task. We demonstrate the effectiveness of our high-order attention mechanism on the task of visual question answering (VQA), where we achieve state-of-the-art performance on the standard VQA dataset. |
Tasks | Question Answering, Visual Question Answering |
Published | 2017-11-12 |
URL | http://arxiv.org/abs/1711.04323v1 |
http://arxiv.org/pdf/1711.04323v1.pdf | |
PWC | https://paperswithcode.com/paper/high-order-attention-models-for-visual |
Repo | https://github.com/idansc/HighOrderAtten |
Framework | torch |
Imagination-Augmented Agents for Deep Reinforcement Learning
Title | Imagination-Augmented Agents for Deep Reinforcement Learning |
Authors | Théophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adria Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, Demis Hassabis, David Silver, Daan Wierstra |
Abstract | We introduce Imagination-Augmented Agents (I2As), a novel architecture for deep reinforcement learning combining model-free and model-based aspects. In contrast to most existing model-based reinforcement learning and planning methods, which prescribe how a model should be used to arrive at a policy, I2As learn to interpret predictions from a learned environment model to construct implicit plans in arbitrary ways, by using the predictions as additional context in deep policy networks. I2As show improved data efficiency, performance, and robustness to model misspecification compared to several baselines. |
Tasks | |
Published | 2017-07-19 |
URL | http://arxiv.org/abs/1707.06203v2 |
http://arxiv.org/pdf/1707.06203v2.pdf | |
PWC | https://paperswithcode.com/paper/imagination-augmented-agents-for-deep |
Repo | https://github.com/Olloxan/Pytorch-A2C |
Framework | pytorch |
BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages
Title | BPEmb: Tokenization-free Pre-trained Subword Embeddings in 275 Languages |
Authors | Benjamin Heinzerling, Michael Strube |
Abstract | We present BPEmb, a collection of pre-trained subword unit embeddings in 275 languages, based on Byte-Pair Encoding (BPE). In an evaluation using fine-grained entity typing as testbed, BPEmb performs competitively, and for some languages better than alternative subword approaches, while requiring vastly fewer resources and no tokenization. BPEmb is available at https://github.com/bheinzerling/bpemb |
Tasks | Entity Typing, Tokenization, Word Embeddings |
Published | 2017-10-05 |
URL | http://arxiv.org/abs/1710.02187v1 |
http://arxiv.org/pdf/1710.02187v1.pdf | |
PWC | https://paperswithcode.com/paper/bpemb-tokenization-free-pre-trained-subword |
Repo | https://github.com/bheinzerling/bpemb |
Framework | none |
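The BPEmb abstract above builds on Byte-Pair Encoding. As a rough illustration of the idea only, the following Python sketch learns merge rules from a toy word list and uses them to segment an unseen word. It is not the BPEmb training pipeline: the function names (`learn_bpe`, `merge_pair`, `segment`), the toy corpus, and the tie-breaking rule are all assumptions made for this example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    # Each distinct word starts as a tuple of characters, weighted by its frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself (an assumption).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split `word` into subword units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


merges = learn_bpe(["low", "low", "low", "lower"], num_merges=2)
print(segment("lowest", merges))  # ['low', 'e', 's', 't']
```

Because segmentation falls back to single characters, any input string can be split into known units, which is what makes a subword vocabulary usable without a separate tokenizer or an unknown-word token.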
Interpretable Explanations of Black Boxes by Meaningful Perturbation
Title | Interpretable Explanations of Black Boxes by Meaningful Perturbation |
Authors | Ruth Fong, Andrea Vedaldi |
Abstract | As machine learning algorithms are increasingly applied to high impact yet high risk tasks, such as medical diagnosis or autonomous driving, it is critical that researchers can explain how such algorithms arrived at their predictions. In recent years, a number of image saliency methods have been developed to summarize where highly complex neural networks “look” in an image for evidence for their predictions. However, these techniques are limited by their heuristic nature and architectural constraints. In this paper, we make two main contributions: First, we propose a general framework for learning different kinds of explanations for any black box algorithm. Second, we specialise the framework to find the part of an image most responsible for a classifier decision. Unlike previous works, our method is model-agnostic and testable because it is grounded in explicit and interpretable image perturbations. |
Tasks | Interpretable Machine Learning |
Published | 2017-04-11 |
URL | http://arxiv.org/abs/1704.03296v3 |
http://arxiv.org/pdf/1704.03296v3.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-explanations-of-black-boxes-by |
Repo | https://github.com/dizcza/EmbedderSDR |
Framework | pytorch |
Structure propagation for zero-shot learning
Title | Structure propagation for zero-shot learning |
Authors | Guangfeng Lin, Yajun Chen, Fan Zhao |
Abstract | The key to zero-shot learning (ZSL) is finding an information transfer model that bridges the gap between images and semantic information (texts or attributes). Existing ZSL methods usually construct the compatibility function between images and class labels by considering the relevance of the semantic classes (the manifold structure of semantic classes). However, the relationship among image classes (the manifold structure of image classes) is also very important for constructing the compatibility model. Because of unseen classes, it is difficult to capture the relationship among image classes, so the manifold structure of image classes is often ignored in ZSL. So that the manifold structure of image classes and that of semantic classes can complement each other, we propose structure propagation (SP) to improve the performance of ZSL for classification. SP jointly considers the manifold structure of image classes and that of semantic classes to approximate the intrinsic structure of object classes. Moreover, SP can describe the constraint condition between the compatibility function and these manifold structures to balance the influence of the structure propagation iteration. The SP solution provides not only unseen class labels but also the relationship between the two manifold structures that encode the positive transfer in structure propagation. Experimental results demonstrate that SP attains promising results on the AwA, CUB, Dogs and SUN databases. |
Tasks | Zero-Shot Learning |
Published | 2017-11-27 |
URL | http://arxiv.org/abs/1711.09513v1 |
http://arxiv.org/pdf/1711.09513v1.pdf | |
PWC | https://paperswithcode.com/paper/structure-propagation-for-zero-shot-learning |
Repo | https://github.com/lgf78103/Structure-propagation-for-zero-shot-learning |
Framework | none |
A Neural Language Model for Dynamically Representing the Meanings of Unknown Words and Entities in a Discourse
Title | A Neural Language Model for Dynamically Representing the Meanings of Unknown Words and Entities in a Discourse |
Authors | Sosuke Kobayashi, Naoaki Okazaki, Kentaro Inui |
Abstract | This study addresses the problem of identifying the meaning of unknown words or entities in a discourse with respect to the word embedding approaches used in neural language models. We proposed a method for on-the-fly construction and exploitation of word embeddings in both the input and output layers of a neural model by tracking contexts. This extends the dynamic entity representation used in Kobayashi et al. (2016) and incorporates a copy mechanism proposed independently by Gu et al. (2016) and Gulcehre et al. (2016). In addition, we construct a new task and dataset called Anonymized Language Modeling for evaluating the ability to capture word meanings while reading. Experiments conducted using our novel dataset show that the proposed variant of RNN language model outperformed the baseline model. Furthermore, the experiments also demonstrate that dynamic updates of an output layer help a model predict reappearing entities, whereas those of an input layer are effective to predict words following reappearing entities. |
Tasks | Language Modelling, Word Embeddings |
Published | 2017-09-06 |
URL | http://arxiv.org/abs/1709.01679v2 |
http://arxiv.org/pdf/1709.01679v2.pdf | |
PWC | https://paperswithcode.com/paper/a-neural-language-model-for-dynamically |
Repo | https://github.com/soskek/dynamic_neural_text_model |
Framework | none |
Continual Learning with Deep Generative Replay
Title | Continual Learning with Deep Generative Replay |
Authors | Hanul Shin, Jung Kwon Lee, Jaehong Kim, Jiwon Kim |
Abstract | Attempts to train a comprehensive artificial intelligence capable of solving multiple tasks have been impeded by a chronic problem called catastrophic forgetting. Although simply replaying all previous data alleviates the problem, it requires large memory and, even worse, is often infeasible in real-world applications where access to past data is limited. Inspired by the generative nature of the hippocampus as a short-term memory system in the primate brain, we propose Deep Generative Replay, a novel framework with a cooperative dual model architecture consisting of a deep generative model (“generator”) and a task-solving model (“solver”). With only these two models, training data for previous tasks can easily be sampled and interleaved with those for a new task. We test our methods in several sequential learning settings involving image classification tasks. |
Tasks | Continual Learning, Image Classification |
Published | 2017-05-24 |
URL | http://arxiv.org/abs/1705.08690v3 |
http://arxiv.org/pdf/1705.08690v3.pdf | |
PWC | https://paperswithcode.com/paper/continual-learning-with-deep-generative |
Repo | https://github.com/hursung1/DeepGenerativeReplay |
Framework | pytorch |
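The Deep Generative Replay abstract above describes sampling data for previous tasks and interleaving it with data for a new task. The following Python sketch shows one way such a mixed batch could be assembled. It is only an illustration under stated assumptions: `replay_batch`, `sample_old_input`, `label_with_old_solver`, and `replay_ratio` are hypothetical names, the generator and solver are stand-in callables rather than neural networks, and labeling replayed inputs with the previous solver is an assumption not stated in the abstract.

```python
import random


def replay_batch(new_examples, sample_old_input, label_with_old_solver,
                 batch_size, replay_ratio, rng):
    """Build one training batch mixing real new-task data with replayed data.

    new_examples: list of (input, label) pairs for the new task.
    sample_old_input: hypothetical callable standing in for the generator.
    label_with_old_solver: hypothetical callable standing in for the old solver.
    replay_ratio: fraction of the batch drawn from the generator (assumed knob).
    """
    if not 0.0 <= replay_ratio <= 1.0:
        raise ValueError("replay_ratio must be in [0, 1]")
    n_replay = round(batch_size * replay_ratio)
    n_new = batch_size - n_replay
    # Real examples for the new task, sampled with replacement.
    batch = rng.choices(new_examples, k=n_new)
    # Pseudo-examples for earlier tasks: generated input, label from the old solver.
    for _ in range(n_replay):
        x = sample_old_input()
        batch.append((x, label_with_old_solver(x)))
    rng.shuffle(batch)  # interleave replayed and new examples
    return batch


rng = random.Random(0)
new_task = [(f"new_x{i}", "new") for i in range(10)]
batch = replay_batch(new_task, lambda: "generated_x", lambda x: "old",
                     batch_size=8, replay_ratio=0.25, rng=rng)
print(sum(1 for _, y in batch if y == "old"))  # 2
```

The point of the design is that no stored data from earlier tasks is needed: only the two models have to be kept in order to produce the replayed part of each batch.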
Treelogy: A Novel Tree Classifier Utilizing Deep and Hand-crafted Representations
Title | Treelogy: A Novel Tree Classifier Utilizing Deep and Hand-crafted Representations |
Authors | İlke Çuğu, Eren Şener, Çağrı Erciyes, Burak Balcı, Emre Akın, Itır Önal, Ahmet Oğuz Akyüz |
Abstract | We propose a novel tree classification system called Treelogy, that fuses deep representations with hand-crafted features obtained from leaf images to perform leaf-based plant classification. Key to this system are segmentation of the leaf from an untextured background, using convolutional neural networks (CNNs) for learning deep representations, extracting hand-crafted features with a number of image processing techniques, training a linear SVM with feature vectors, merging SVM and CNN results, and identifying the species from a dataset of 57 trees. Our classification results show that fusion of deep representations with hand-crafted features leads to the highest accuracy. The proposed algorithm is embedded in a smart-phone application, which is publicly available. Furthermore, our novel dataset comprised of 5408 leaf images is also made public for use of other researchers. |
Tasks | |
Published | 2017-01-28 |
URL | http://arxiv.org/abs/1701.08291v1 |
http://arxiv.org/pdf/1701.08291v1.pdf | |
PWC | https://paperswithcode.com/paper/treelogy-a-novel-tree-classifier-utilizing |
Repo | https://github.com/cuguilke/Treelogy |
Framework | caffe2 |
Streaming Word Embeddings with the Space-Saving Algorithm
Title | Streaming Word Embeddings with the Space-Saving Algorithm |
Authors | Chandler May, Kevin Duh, Benjamin Van Durme, Ashwin Lall |
Abstract | We develop a streaming (one-pass, bounded-memory) word embedding algorithm based on the canonical skip-gram with negative sampling algorithm implemented in word2vec. We compare our streaming algorithm to word2vec empirically by measuring the cosine similarity between word pairs under each algorithm and by applying each algorithm in the downstream task of hashtag prediction on a two-month interval of the Twitter sample stream. We then discuss the results of these experiments, concluding they provide partial validation of our approach as a streaming replacement for word2vec. Finally, we discuss potential failure modes and suggest directions for future work. |
Tasks | Word Embeddings |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07463v1 |
http://arxiv.org/pdf/1704.07463v1.pdf | |
PWC | https://paperswithcode.com/paper/streaming-word-embeddings-with-the-space |
Repo | https://github.com/cjmay/athena |
Framework | none |
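The streaming word embedding entry above is named after the Space-Saving algorithm, a bounded-memory method for tracking frequent items in a stream. The following Python sketch shows the basic counter-replacement rule on a toy stream. How the paper couples this with skip-gram training is not described in the abstract and is not shown here; the class name `SpaceSaving` and its attributes are choices made for this example.

```python
class SpaceSaving:
    """Track approximate counts of frequent items using at most `capacity` counters."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.counts = {}  # item -> estimated count (never an underestimate)
        self.errors = {}  # item -> maximum possible overestimate

    def update(self, item):
        if item in self.counts:
            # Already monitored: just increment.
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            # Free slot: start monitoring with an exact count.
            self.counts[item] = 1
            self.errors[item] = 0
        else:
            # No free slot: evict the item with the smallest count and let the
            # newcomer inherit that count plus one.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            del self.errors[victim]
            self.counts[item] = floor + 1
            self.errors[item] = floor


sketch = SpaceSaving(capacity=2)
for token in ["a", "a", "a", "b", "c", "d"]:
    sketch.update(token)
print(sketch.counts)  # {'a': 3, 'd': 3}
print(sketch.errors)  # {'a': 0, 'd': 2}
```

The linear scan for the minimum keeps the sketch short; an implementation meant for large vocabularies would need a faster way to find the smallest counter.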
RAIL: Risk-Averse Imitation Learning
Title | RAIL: Risk-Averse Imitation Learning |
Authors | Anirban Santara, Abhishek Naik, Balaraman Ravindran, Dipankar Das, Dheevatsa Mudigere, Sasikanth Avancha, Bharat Kaul |
Abstract | Imitation learning algorithms learn viable policies by imitating an expert’s behavior when reward signals are not available. Generative Adversarial Imitation Learning (GAIL) is a state-of-the-art algorithm for learning policies when the expert’s behavior is available as a fixed set of trajectories. We evaluate in terms of the expert’s cost function and observe that the distribution of trajectory-costs is often more heavy-tailed for GAIL-agents than the expert at a number of benchmark continuous-control tasks. Thus, high-cost trajectories, corresponding to tail-end events of catastrophic failure, are more likely to be encountered by the GAIL-agents than the expert. This makes the reliability of GAIL-agents questionable when it comes to deployment in risk-sensitive applications like robotic surgery and autonomous driving. In this work, we aim to minimize the occurrence of tail-end events by minimizing tail risk within the GAIL framework. We quantify tail risk by the Conditional-Value-at-Risk (CVaR) of trajectories and develop the Risk-Averse Imitation Learning (RAIL) algorithm. We observe that the policies learned with RAIL show lower tail-end risk than those of vanilla GAIL. Thus the proposed RAIL algorithm appears as a potent alternative to GAIL for improved reliability in risk-sensitive applications. |
Tasks | Autonomous Driving, Continuous Control, Imitation Learning |
Published | 2017-07-20 |
URL | http://arxiv.org/abs/1707.06658v4 |
http://arxiv.org/pdf/1707.06658v4.pdf | |
PWC | https://paperswithcode.com/paper/rail-risk-averse-imitation-learning |
Repo | https://github.com/Santara/RAIL |
Framework | none |
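The RAIL abstract above quantifies tail risk by the Conditional-Value-at-Risk (CVaR) of trajectory costs. The following Python sketch computes a simple empirical estimate: the mean of the worst `(1 - alpha)` fraction of costs. It only illustrates the risk measure, not the RAIL training objective; the function name `empirical_cvar`, the rounding convention, and the cost values are assumptions made for this example.

```python
import math


def empirical_cvar(costs, alpha):
    """Return (VaR, CVaR) at level `alpha` for a list of trajectory costs.

    CVaR is estimated as the mean of the ceil((1 - alpha) * n) highest costs,
    and VaR as the smallest cost inside that tail (one common convention).
    """
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(costs, reverse=True)
    k = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    tail = ordered[:k]
    return tail[-1], sum(tail) / k


# Two made-up cost samples with the same mean but different tails.
expert_costs = [4.0, 4.0, 4.0, 4.0]
agent_costs = [1.0, 1.0, 1.0, 13.0]
print(empirical_cvar(expert_costs, alpha=0.75))  # (4.0, 4.0)
print(empirical_cvar(agent_costs, alpha=0.75))   # (13.0, 13.0)
```

Both samples have a mean cost of 4, yet the second has a much larger CVaR, which is the kind of heavy-tailed behavior that an average-cost comparison would miss.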
Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers
Title | Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers |
Authors | Nafise Sadat Moosavi, Michael Strube |
Abstract | Coreference resolution is an intermediate step for text understanding. It is used in tasks and domains for which we do not necessarily have coreference annotated corpora. Therefore, generalization is of special importance for coreference resolution. However, while recent coreference resolvers have notable improvements on the CoNLL dataset, they struggle to generalize properly to new domains or datasets. In this paper, we investigate the role of linguistic features in building more generalizable coreference resolvers. We show that generalization improves only slightly by merely using a set of additional linguistic features. However, employing features and subsets of their values that are informative for coreference resolution considerably improves generalization. Thanks to better generalization, our system achieves state-of-the-art results in out-of-domain evaluations; e.g., on WikiCoref, our system, which is trained on CoNLL, performs on par with a system designed for this dataset. |
Tasks | Coreference Resolution |
Published | 2017-08-01 |
URL | http://arxiv.org/abs/1708.00160v2 |
http://arxiv.org/pdf/1708.00160v2.pdf | |
PWC | https://paperswithcode.com/paper/using-linguistic-features-to-improve-the |
Repo | https://github.com/ns-moosavi/epm |
Framework | none |
Multi-scale Multi-band DenseNets for Audio Source Separation
Title | Multi-scale Multi-band DenseNets for Audio Source Separation |
Authors | Naoya Takahashi, Yuki Mitsufuji |
Abstract | This paper deals with the problem of audio source separation. To handle the complex and ill-posed nature of the problems of audio source separation, the current state-of-the-art approaches employ deep neural networks to obtain instrumental spectra from a mixture. In this study, we propose a novel network architecture that extends the recently developed densely connected convolutional network (DenseNet), which has shown excellent results on image classification tasks. To deal with the specific problem of audio source separation, an up-sampling layer, block skip connection and band-dedicated dense blocks are incorporated on top of DenseNet. The proposed approach takes advantage of long contextual information and outperforms state-of-the-art results on SiSEC 2016 competition by a large margin in terms of signal-to-distortion ratio. Moreover, the proposed architecture requires significantly fewer parameters and considerably less training time compared with other methods. |
Tasks | Music Source Separation |
Published | 2017-06-29 |
URL | http://arxiv.org/abs/1706.09588v1 |
http://arxiv.org/pdf/1706.09588v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-scale-multi-band-densenets-for-audio |
Repo | https://github.com/tsurumeso/vocal-remover |
Framework | none |
Criticality as It Could Be: organizational invariance as self-organized criticality in embodied agents
Title | Criticality as It Could Be: organizational invariance as self-organized criticality in embodied agents |
Authors | Miguel Aguilera, Manuel G. Bedia |
Abstract | This paper outlines a methodological approach for designing adaptive agents driving themselves near points of criticality. Using a synthetic approach we construct a conceptual model that, instead of specifying mechanistic requirements to generate criticality, exploits the maintenance of an organizational structure capable of reproducing critical behavior. Our approach exploits the well-known principle of universality, which classifies critical phenomena inside a few universality classes of systems independently of their specific mechanisms or topologies. In particular, we implement an artificial embodied agent controlled by a neural network maintaining a correlation structure randomly sampled from a lattice Ising model at a critical point. We evaluate the agent in two classical reinforcement learning scenarios: the Mountain Car benchmark and the Acrobot double pendulum, finding that in both cases the neural controller reaches a point of criticality, which coincides with a transition point between two regimes of the agent’s behaviour, maximizing the mutual information between neurons and sensorimotor patterns. Finally, we discuss the possible applications of this synthetic approach to the comprehension of deeper principles connected to the pervasive presence of criticality in biological and cognitive systems. |
Tasks | |
Published | 2017-04-18 |
URL | http://arxiv.org/abs/1704.05255v3 |
http://arxiv.org/pdf/1704.05255v3.pdf | |
PWC | https://paperswithcode.com/paper/criticality-as-it-could-be-organizational |
Repo | https://github.com/heysoos/CriticalForagingOrgs |
Framework | none |
Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with $\ell_1$ and $\ell_2$ Regularization
Title | Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with $\ell_1$ and $\ell_2$ Regularization |
Authors | Zhifeng Kong |
Abstract | In this paper, we made an extension to the convergence analysis of the dynamics of two-layered bias-free networks with one $ReLU$ output. We took into consideration two popular regularization terms: the $\ell_1$ and $\ell_2$ norm of the parameter vector $w$, and added it to the square loss function with coefficient $\lambda/2$. We proved that when $\lambda$ is small, the weight vector $w$ converges to the optimal solution $\hat{w}$ (with respect to the new loss function) with probability $\geq (1-\varepsilon)(1-A_d)/2$ under random initializations in a sphere centered at the origin, where $\varepsilon$ is a small value and $A_d$ is a constant. Numerical experiments including phase diagrams and repeated simulations verified our theory. |
Tasks | |
Published | 2017-11-19 |
URL | http://arxiv.org/abs/1711.07005v1 |
http://arxiv.org/pdf/1711.07005v1.pdf | |
PWC | https://paperswithcode.com/paper/convergence-analysis-of-the-dynamics-of-a |
Repo | https://github.com/FengNiMa/ReLU_Convergence |
Framework | pytorch |
Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting
Title | Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting |
Authors | Bing Yu, Haoteng Yin, Zhanxing Zhu |
Abstract | Timely and accurate traffic forecasting is crucial for urban traffic control and guidance. Due to the high nonlinearity and complexity of traffic flow, traditional methods cannot satisfy the requirements of mid- and long-term prediction tasks and often neglect spatial and temporal dependencies. In this paper, we propose a novel deep learning framework, Spatio-Temporal Graph Convolutional Networks (STGCN), to tackle the time series prediction problem in the traffic domain. Instead of applying regular convolutional and recurrent units, we formulate the problem on graphs and build the model with complete convolutional structures, which enable much faster training with fewer parameters. Experiments show that our model STGCN effectively captures comprehensive spatio-temporal correlations through modeling multi-scale traffic networks and consistently outperforms state-of-the-art baselines on various real-world traffic datasets. |
Tasks | Time Series, Time Series Prediction, Traffic Prediction |
Published | 2017-09-14 |
URL | http://arxiv.org/abs/1709.04875v4 |
http://arxiv.org/pdf/1709.04875v4.pdf | |
PWC | https://paperswithcode.com/paper/spatio-temporal-graph-convolutional-networks |
Repo | https://github.com/Aguin/STGCN-PyTorch |
Framework | pytorch |