February 1, 2020

Paper Group AWR 88

SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference


Title	SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference
Authors	Lasse Espeholt, Raphaël Marinier, Piotr Stanczyk, Ke Wang, Marcin Michalski
Abstract	We present a modern scalable reinforcement learning agent called SEED (Scalable, Efficient Deep-RL). By effectively utilizing modern accelerators, we show that it is not only possible to train on millions of frames per second but also to lower the cost of experiments compared to current methods. We achieve this with a simple architecture that features centralized inference and an optimized communication layer. SEED adopts two state of the art distributed algorithms, IMPALA/V-trace (policy gradients) and R2D2 (Q-learning), and is evaluated on Atari-57, DeepMind Lab and Google Research Football. We improve the state of the art on Football and are able to reach state of the art on Atari-57 three times faster in wall-time. For the scenarios we consider, a 40% to 80% cost reduction for running experiments is achieved. The implementation along with experiments is open-sourced so results can be reproduced and novel ideas tried out.
Tasks	Q-Learning
Published	2019-10-15
URL	https://arxiv.org/abs/1910.06591v2
PDF	https://arxiv.org/pdf/1910.06591v2.pdf
PWC	https://paperswithcode.com/paper/seed-rl-scalable-and-efficient-deep-rl-with-1
Repo	https://github.com/google-research/seed_rl
Framework	tf

Explicit Explore-Exploit Algorithms in Continuous State Spaces


Title	Explicit Explore-Exploit Algorithms in Continuous State Spaces
Authors	Mikael Henaff
Abstract	We present a new model-based algorithm for reinforcement learning (RL) which consists of explicit exploration and exploitation phases, and is applicable in large or infinite state spaces. The algorithm maintains a set of dynamics models consistent with current experience and explores by finding policies which induce high disagreement between their state predictions. It then exploits using the refined set of models or experience gathered during exploration. We show that under realizability and optimal planning assumptions, our algorithm provably finds a near-optimal policy with a number of samples that is polynomial in a structural complexity measure which we show to be low in several natural settings. We then give a practical approximation using neural networks and demonstrate its performance and sample efficiency in practice.
Tasks
Published	2019-11-01
URL	https://arxiv.org/abs/1911.00617v2
PDF	https://arxiv.org/pdf/1911.00617v2.pdf
PWC	https://paperswithcode.com/paper/explicit-explore-exploit-algorithms-in
Repo	https://github.com/mbhenaff/neural-e3
Framework	pytorch

The Explanation Game: Explaining Machine Learning Models with Cooperative Game Theory


Title	The Explanation Game: Explaining Machine Learning Models with Cooperative Game Theory
Authors	Luke Merrick, Ankur Taly
Abstract	A number of techniques have been proposed to explain a machine learning (ML) model’s prediction by attributing it to the corresponding input features. Popular among these are techniques that apply the Shapley value method from cooperative game theory. While existing papers focus on the axiomatic motivation of Shapley values, and efficient techniques for computing them, they neither justify the game formulations used nor address the uncertainty implicit in their methods’ outputs. For instance, the SHAP algorithm’s formulation may give substantial attributions to features that play no role in a model. Furthermore, without infinite data and computation, SHAP attributions are approximations subject to hitherto uncharacterized uncertainty. In this work, we illustrate how subtle differences in the underlying game formulations of existing methods can cause large differences in attribution for a prediction. We then present a general game formulation that unifies existing methods. Using the primitive of single-reference games, we decompose the Shapley values of the general game formulation into Shapley values of single-reference games. This decomposition enables us to introduce confidence intervals to quantify the uncertainty in estimated attributions. Additionally, this decomposition enables contrastive explanations of a prediction through comparisons with different groups of reference inputs. We tie this idea to classic work on Norm Theory in cognitive psychology, and propose a general framework for generating explanations for ML models, called formulate, approximate, and explain (FAE).
Tasks
Published	2019-09-17
URL	https://arxiv.org/abs/1909.08128v2
PDF	https://arxiv.org/pdf/1909.08128v2.pdf
PWC	https://paperswithcode.com/paper/the-explanation-game-explaining-machine
Repo	https://github.com/fiddler-labs/the-explanation-game-supplemental
Framework	none
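
The decomposition described in the abstract can be illustrated with a small sketch. The Python below is not the authors' code. It assumes a single-reference game in which features inside a coalition take the explicand's values and all other features take the reference's values, computes exact Shapley values by enumerating feature orderings (feasible only for a handful of features), and averages them over a set of references. The function names and the toy model are hypothetical.

```python
from itertools import permutations
from math import factorial


def single_reference_shapley(model, explicand, reference):
    """Exact Shapley values for one (explicand, reference) pair.

    Assumed game: a coalition's value is the model output when coalition
    features come from the explicand and the rest from the reference.
    """
    n = len(explicand)
    phi = [0.0] * n
    for order in permutations(range(n)):
        z = list(reference)          # start from the empty coalition
        prev = model(z)
        for i in order:
            z[i] = explicand[i]      # feature i joins the coalition
            cur = model(z)
            phi[i] += cur - prev     # marginal contribution of feature i
            prev = cur
    return [p / factorial(n) for p in phi]


def averaged_shapley(model, explicand, references):
    """Average single-reference Shapley values over several references."""
    per_ref = [single_reference_shapley(model, explicand, r) for r in references]
    n = len(explicand)
    return [sum(p[i] for p in per_ref) / len(per_ref) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical toy model with one additive term and one interaction term.
    def toy_model(z):
        return 2 * z[0] + z[1] * z[2]

    print(single_reference_shapley(toy_model, [1, 1, 1], [0, 0, 0]))
```

Because the averaged attribution is a sample mean of per-reference attributions, a standard error over that sample is one natural way to attach a confidence interval to each feature's attribution.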

Reducing Overestimation Bias in Multi-Agent Domains Using Double Centralized Critics


Title	Reducing Overestimation Bias in Multi-Agent Domains Using Double Centralized Critics
Authors	Johannes Ackermann, Volker Gabler, Takayuki Osa, Masashi Sugiyama
Abstract	Many real world tasks require multiple agents to work together. Multi-agent reinforcement learning (RL) methods have been proposed in recent years to solve these tasks, but current methods often fail to efficiently learn policies. We thus investigate the presence of a common weakness in single-agent RL, namely value function overestimation bias, in the multi-agent setting. Based on our findings, we propose an approach that reduces this bias by using double centralized critics. We evaluate it on six mixed cooperative-competitive tasks, showing a significant advantage over current methods. Finally, we investigate the application of multi-agent methods to high-dimensional robotic tasks and show that our approach can be used to learn decentralized policies in this domain.
Tasks	Multi-agent Reinforcement Learning
Published	2019-10-03
URL	https://arxiv.org/abs/1910.01465v2
PDF	https://arxiv.org/pdf/1910.01465v2.pdf
PWC	https://paperswithcode.com/paper/reducing-overestimation-bias-in-multi-agent
Repo	https://github.com/JohannesAck/MATD3implementation
Framework	tf
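
As a rough illustration of the idea, the sketch below computes a bootstrapped target from two centralized critics and keeps the smaller of the two estimates, in the style of clipped double Q-learning from TD3. Whether this matches the paper's exact update is an assumption. The function name, the critic signature (joint observations and joint actions in, scalar value out), and the toy critics are hypothetical.

```python
def double_critic_target(reward, done, gamma, critic_1, critic_2,
                         next_joint_obs, next_joint_actions):
    """Temporal-difference target using the minimum of two centralized critics.

    Each critic is assumed to be a callable that sees the observations and
    actions of all agents and returns a scalar value estimate.
    """
    q1 = critic_1(next_joint_obs, next_joint_actions)
    q2 = critic_2(next_joint_obs, next_joint_actions)
    not_done = 0.0 if done else 1.0   # no bootstrapping past a terminal state
    # Taking the minimum counteracts the upward bias of a single estimate.
    return reward + gamma * not_done * min(q1, q2)


if __name__ == "__main__":
    # Toy critics that ignore their inputs and return fixed estimates.
    optimistic = lambda obs, acts: 4.0
    cautious = lambda obs, acts: 2.0
    print(double_critic_target(1.0, False, 0.5, optimistic, cautious, [], []))
```

Both critics would be regressed toward the same target, so an overestimate in one critic cannot propagate unless the other critic shares it.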

Interpolation Consistency Training for Semi-Supervised Learning


Title	Interpolation Consistency Training for Semi-Supervised Learning
Authors	Vikas Verma, Alex Lamb, Juho Kannala, Yoshua Bengio, David Lopez-Paz
Abstract	We introduce Interpolation Consistency Training (ICT), a simple and computationally efficient algorithm for training Deep Neural Networks in the semi-supervised learning paradigm. ICT encourages the prediction at an interpolation of unlabeled points to be consistent with the interpolation of the predictions at those points. In classification problems, ICT moves the decision boundary to low-density regions of the data distribution. Our experiments show that ICT achieves state-of-the-art performance when applied to standard neural network architectures on the CIFAR-10 and SVHN benchmark datasets.
Tasks	Semi-Supervised Image Classification
Published	2019-03-09
URL	https://arxiv.org/abs/1903.03825v3
PDF	https://arxiv.org/pdf/1903.03825v3.pdf
PWC	https://paperswithcode.com/paper/interpolation-consistency-training-for-semi
Repo	https://github.com/vikasverma1077/ICT
Framework	pytorch
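
The consistency term described in the abstract can be written down in a few lines. The sketch below is a minimal, framework-free version that assumes a mean-squared-error penalty and a single model producing both sides of the comparison. Training details that the abstract does not state, such as how the mixing coefficient is sampled or which network produces the targets, are not modeled. The names `mix` and `ict_consistency_loss` are hypothetical.

```python
def mix(a, b, lam):
    """Element-wise interpolation lam * a + (1 - lam) * b of two vectors."""
    return [lam * ai + (1.0 - lam) * bi for ai, bi in zip(a, b)]


def ict_consistency_loss(model, u1, u2, lam):
    """Squared gap between the prediction at a mixed input and the mixed predictions.

    `model` maps a list of floats to a list of floats; u1 and u2 are
    unlabeled inputs; lam is the interpolation coefficient in [0, 1].
    """
    pred_at_mix = model(mix(u1, u2, lam))
    mix_of_preds = mix(model(u1), model(u2), lam)
    squared = [(p - q) ** 2 for p, q in zip(pred_at_mix, mix_of_preds)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    # A nonlinear toy model is penalized; an affine one would not be.
    square = lambda u: [u[0] ** 2]
    print(ict_consistency_loss(square, [0.0], [2.0], 0.5))
```

An affine model incurs zero loss under this term, which gives some intuition for why the penalty pushes a classifier to behave linearly between unlabeled points.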

On the Measure of Intelligence


Title	On the Measure of Intelligence
Authors	François Chollet
Abstract	To make deliberate progress towards more intelligent and more human-like artificial systems, we need to be following an appropriate feedback signal: we need to be able to define and evaluate intelligence in a way that enables comparisons between two systems, as well as comparisons with humans. Over the past hundred years, there has been an abundance of attempts to define and measure intelligence, across both the fields of psychology and AI. We summarize and critically assess these definitions and evaluation approaches, while making apparent the two historical conceptions of intelligence that have implicitly guided them. We note that in practice, the contemporary AI community still gravitates towards benchmarking intelligence by comparing the skill exhibited by AIs and humans at specific tasks such as board games and video games. We argue that solely measuring skill at any given task falls short of measuring intelligence, because skill is heavily modulated by prior knowledge and experience: unlimited priors or unlimited training data allow experimenters to “buy” arbitrary levels of skills for a system, in a way that masks the system’s own generalization power. We then articulate a new formal definition of intelligence based on Algorithmic Information Theory, describing intelligence as skill-acquisition efficiency and highlighting the concepts of scope, generalization difficulty, priors, and experience. Using this definition, we propose a set of guidelines for what a general AI benchmark should look like. Finally, we present a benchmark closely following these guidelines, the Abstraction and Reasoning Corpus (ARC), built upon an explicit set of priors designed to be as close as possible to innate human priors. We argue that ARC can be used to measure a human-like form of general fluid intelligence and that it enables fair general intelligence comparisons between AI systems and humans.
Tasks
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01547v2
PDF	https://arxiv.org/pdf/1911.01547v2.pdf
PWC	https://paperswithcode.com/paper/the-measure-of-intelligence
Repo	https://github.com/fchollet/ARC
Framework	none

Deep Point-wise Prediction for Action Temporal Proposal


Title	Deep Point-wise Prediction for Action Temporal Proposal
Authors	Luxuan Li, Tao Kong, Fuchun Sun, Huaping Liu
Abstract	Detecting actions in videos is an important yet challenging task. Previous works usually utilize (a) sliding window paradigms, or (b) per-frame action scoring and grouping to enumerate the possible temporal locations. Their performance is limited by the design of the sliding windows or grouping strategies. In this paper, we present a simple and effective method for temporal action proposal generation, named Deep Point-wise Prediction (DPP). DPP simultaneously predicts the probability that an action exists and the corresponding temporal locations, without using any handcrafted sliding window or grouping. The whole system is trained end-to-end with a joint loss of temporal action proposal classification and location prediction. We conduct extensive experiments to verify its effectiveness, generality and robustness on the standard THUMOS14 dataset. DPP runs at more than 1000 frames per second, which comfortably satisfies the real-time requirement. The code is available at https://github.com/liluxuan1997/DPP.
Tasks	Action Recognition In Videos, Temporal Action Proposal Generation
Published	2019-09-17
URL	https://arxiv.org/abs/1909.07725v1
PDF	https://arxiv.org/pdf/1909.07725v1.pdf
PWC	https://paperswithcode.com/paper/deep-point-wise-prediction-for-action
Repo	https://github.com/liluxuan1997/DPP
Framework	pytorch

Segmentation Guided Image-to-Image Translation with Adversarial Networks


Title	Segmentation Guided Image-to-Image Translation with Adversarial Networks
Authors	Songyao Jiang, Zhiqiang Tao, Yun Fu
Abstract	Image-to-image translation, which aims to map images in one domain to another specific domain, has recently received increasing attention. Existing methods mainly solve this task via a deep generative model and focus on exploring the relationship between different domains. However, these methods neglect to utilize higher-level and instance-specific information to guide the training process, leading to a great deal of unrealistic, low-quality generated images. Existing methods also lack spatial controllability during translation. To address these challenges, we propose a novel Segmentation Guided Generative Adversarial Network (SGGAN), which leverages semantic segmentation to further boost generation performance and provide spatial mapping. In particular, a segmentor network is designed to impose semantic information on the generated images. Experimental results on a multi-domain face image translation task empirically demonstrate our method's capacity for spatial modification and its superiority in image quality over several state-of-the-art methods.
Tasks	Image-to-Image Translation, Semantic Segmentation
Published	2019-01-06
URL	http://arxiv.org/abs/1901.01569v2
PDF	http://arxiv.org/pdf/1901.01569v2.pdf
PWC	https://paperswithcode.com/paper/segmentation-guided-image-to-image
Repo	https://github.com/jackyjsy/SGGAN
Framework	pytorch

Recurrent Event Network: Global Structure Inference over Temporal Knowledge Graph


Title	Recurrent Event Network: Global Structure Inference over Temporal Knowledge Graph
Authors	Woojeong Jin, He Jiang, Meng Qu, Tong Chen, Changlin Zhang, Pedro Szekely, Xiang Ren
Abstract	Modeling dynamically-evolving, multi-relational graph data has attracted a surge of interest with the rapid growth of heterogeneous event data. However, predicting future events on such data requires global structure inference over time and the ability to integrate temporal and structural information, which are not yet well understood. We present Recurrent Event Network (RE-Net), a novel autoregressive architecture for modeling temporal sequences of multi-relational graphs (e.g., temporal knowledge graphs), which can perform sequential, global structure inference over future time stamps to predict new events. RE-Net employs a recurrent event encoder to model the temporally conditioned joint probability distribution for the event sequences, and equips the event encoder with a neighborhood aggregator for modeling the concurrent events within a time window associated with each entity. We apply teacher forcing for model training over historical data, and infer graph sequences over future time stamps by sampling from the learned joint distribution in a sequential manner. We evaluate the proposed method via temporal link prediction on five public datasets. Extensive experiments demonstrate the strength of RE-Net, especially on multi-step inference over future time stamps. Code and data can be found at https://github.com/INK-USC/RE-Net.
Tasks	Knowledge Graphs, Link Prediction
Published	2019-04-11
URL	https://arxiv.org/abs/1904.05530v3
PDF	https://arxiv.org/pdf/1904.05530v3.pdf
PWC	https://paperswithcode.com/paper/recurrent-event-network-for-reasoning-over
Repo	https://github.com/INK-USC/RENet
Framework	pytorch

BHAAV- A Text Corpus for Emotion Analysis from Hindi Stories


Title	BHAAV- A Text Corpus for Emotion Analysis from Hindi Stories
Authors	Yaman Kumar, Debanjan Mahata, Sagar Aggarwal, Anmol Chugh, Rajat Maheshwari, Rajiv Ratn Shah
Abstract	In this paper, we introduce the first and largest Hindi text corpus, named BHAAV, which means emotions in Hindi, for analyzing emotions that a writer expresses through his characters in a story, as perceived by a narrator/reader. The corpus consists of 20,304 sentences collected from 230 different short stories spanning 18 genres such as Inspirational and Mystery. Each sentence has been annotated with one of five emotion categories (anger, joy, suspense, sad, and neutral) by three native Hindi speakers with at least ten years of formal education in Hindi. We also discuss challenges in the annotation of low-resource languages such as Hindi, and discuss the scope of the proposed corpus along with its possible uses. We also provide a detailed analysis of the dataset and train strong baseline classifiers, reporting their performance.
Tasks	Emotion Recognition
Published	2019-10-09
URL	https://arxiv.org/abs/1910.04073v1
PDF	https://arxiv.org/pdf/1910.04073v1.pdf
PWC	https://paperswithcode.com/paper/bhaav-a-text-corpus-for-emotion-analysis-from
Repo	https://github.com/midas-research/bhaav
Framework	none

Solving Multiagent Planning Problems with Concurrent Conditional Effects


Title	Solving Multiagent Planning Problems with Concurrent Conditional Effects
Authors	Daniel Furelos-Blanco, Anders Jonsson
Abstract	In this work we present a novel approach to solving concurrent multiagent planning problems in which several agents act in parallel. Our approach relies on a compilation from concurrent multiagent planning to classical planning, allowing us to use an off-the-shelf classical planner to solve the original multiagent problem. The solution can be directly interpreted as a concurrent plan that satisfies a given set of concurrency constraints, while avoiding the exponential blowup associated with concurrent actions. Our planner is the first to handle action effects that are conditional on what other agents are doing. Theoretically, we show that the compilation is sound and complete. Empirically, we show that our compilation can solve challenging multiagent planning problems that require concurrent actions.
Tasks
Published	2019-06-19
URL	https://arxiv.org/abs/1906.08157v1
PDF	https://arxiv.org/pdf/1906.08157v1.pdf
PWC	https://paperswithcode.com/paper/solving-multiagent-planning-problems-with
Repo	https://github.com/aig-upf/universal-pddl-parser-multiagent
Framework	none

Synthetic patches, real images: screening for centrosome aberrations in EM images of human cancer cells


Title	Synthetic patches, real images: screening for centrosome aberrations in EM images of human cancer cells
Authors	Artem Lukoyanov, Isabella Haberbosch, Constantin Pape, Alwin Kraemer, Yannick Schwab, Anna Kreshuk
Abstract	Recent advances in high-throughput electron microscopy imaging enable detailed study of centrosome aberrations in cancer cells. While the image acquisition in such pipelines is automated, manual detection of centrioles is still necessary to select cells for re-imaging at higher magnification. In this contribution we propose an algorithm which performs this step automatically and with high accuracy. From the image labels produced by human experts and a 3D model of a centriole we construct an additional training set with patch-level labels. A two-level DenseNet is trained on the hybrid training data with synthetic patches and real images, achieving much better results on real patient data than training only at the image-level. The code can be found at https://github.com/kreshuklab/centriole_detection.
Tasks
Published	2019-08-27
URL	https://arxiv.org/abs/1908.10109v1
PDF	https://arxiv.org/pdf/1908.10109v1.pdf
PWC	https://paperswithcode.com/paper/synthetic-patches-real-images-screening-for
Repo	https://github.com/kreshuklab/centriole_detection
Framework	pytorch

Lost in Machine Translation: A Method to Reduce Meaning Loss


Title	Lost in Machine Translation: A Method to Reduce Meaning Loss
Authors	Reuben Cohn-Gordon, Noah Goodman
Abstract	A desideratum of high-quality translation systems is that they preserve meaning, in the sense that two sentences with different meanings should not translate to one and the same sentence in another language. However, state-of-the-art systems often fail in this regard, particularly in cases where the source and target languages partition the “meaning space” in different ways. For instance, “I cut my finger.” and “I cut my finger off.” describe different states of the world but are translated to French (by both Fairseq and Google Translate) as “Je me suis coupé le doigt.”, which is ambiguous as to whether the finger is detached. More generally, translation systems are typically many-to-one (non-injective) functions from source to target language, which in many cases results in important distinctions in meaning being lost in translation. Building on Bayesian models of informative utterance production, we present a method to define a less ambiguous translation system in terms of an underlying pre-trained neural sequence-to-sequence model. This method increases injectivity, resulting in greater preservation of meaning as measured by improvement in cycle-consistency, without impeding translation quality (measured by BLEU score).
Tasks	Machine Translation
Published	2019-02-25
URL	http://arxiv.org/abs/1902.09514v4
PDF	http://arxiv.org/pdf/1902.09514v4.pdf
PWC	https://paperswithcode.com/paper/lost-in-machine-translation-a-method-to
Repo	https://github.com/reubenharry/pragmatic-translation
Framework	pytorch
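
The Bayesian idea in the abstract can be sketched on a toy table. The Python below is an illustrative assumption, not the paper's system: a base translator is represented as a table of probabilities P(target | source), a "listener" inverts it with Bayes' rule under a uniform prior over sources, and the informative translator rescores each candidate by how well the listener would recover the intended source. The function names, the abstract labels for sentences, and all probabilities are hypothetical.

```python
def listener(base, target):
    """P(source | target) from the base translator, uniform prior over sources."""
    scores = {src: dist.get(target, 0.0) for src, dist in base.items()}
    total = sum(scores.values())
    return {src: score / total for src, score in scores.items()}


def informative_translation(base, source):
    """Pick the target maximizing P_base(target | source) * P_listener(source | target)."""
    scores = {}
    for target, prob in base[source].items():
        if prob > 0.0:  # skip impossible targets; keeps the listener well defined
            scores[target] = prob * listener(base, target)[source]
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Two sources whose most likely base translation is the same ambiguous target.
    base = {
        "source_a": {"ambiguous": 0.9, "explicit_a": 0.1},
        "source_b": {"ambiguous": 0.6, "explicit_b": 0.4},
    }
    for src in base:
        greedy = max(base[src], key=base[src].get)
        print(src, greedy, informative_translation(base, src))
```

In this toy table the greedy base translator maps both sources to the same target, while the rescored translator separates them, which is the sense in which the mapping becomes closer to injective.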

Learning Deep Generative Models with Annealed Importance Sampling


Title	Learning Deep Generative Models with Annealed Importance Sampling
Authors	Xinqiang Ding, David J. Freedman
Abstract	Variational inference (VI) and Markov chain Monte Carlo (MCMC) are two main approximate approaches for learning deep generative models by maximizing marginal likelihood. In this paper, we propose using annealed importance sampling for learning deep generative models. Our proposed approach bridges VI with MCMC. It generalizes VI methods such as variational auto-encoders and importance weighted auto-encoders (IWAE) and the MCMC method proposed in (Hoffman, 2017). It also provides insights into why running multiple short MCMC chains can help learning deep generative models. Through experiments, we show that our approach yields better density models than IWAE and can effectively trade computation for model accuracy without increasing memory cost.
Tasks	Latent Variable Models
Published	2019-06-12
URL	https://arxiv.org/abs/1906.04904v2
PDF	https://arxiv.org/pdf/1906.04904v2.pdf
PWC	https://paperswithcode.com/paper/improving-importance-weighted-auto-encoders
Repo	https://github.com/xqding/AIWAE
Framework	pytorch

A Cross-Season Correspondence Dataset for Robust Semantic Segmentation


Title	A Cross-Season Correspondence Dataset for Robust Semantic Segmentation
Authors	Måns Larsson, Erik Stenborg, Lars Hammarstrand, Torsten Sattler, Mark Pollefeys, Fredrik Kahl
Abstract	In this paper, we present a method to utilize 2D-2D point matches between images taken during different image conditions to train a convolutional neural network for semantic segmentation. Enforcing label consistency across the matches makes the final segmentation algorithm robust to seasonal changes. We describe how these 2D-2D matches can be generated with little human interaction by geometrically matching points from 3D models built from images. Two cross-season correspondence datasets are created providing 2D-2D matches across seasonal changes as well as from day to night. The datasets are made publicly available to facilitate further research. We show that adding the correspondences as extra supervision during training improves the segmentation performance of the convolutional neural network, making it more robust to seasonal changes and weather conditions.
Tasks	Semantic Segmentation
Published	2019-03-16
URL	https://arxiv.org/abs/1903.06916v2
PDF	https://arxiv.org/pdf/1903.06916v2.pdf
PWC	https://paperswithcode.com/paper/a-cross-season-correspondence-dataset-for
Repo	https://github.com/maunzzz/cross-season-segmentation
Framework	pytorch