Paper Group AWR 88
SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference
Title | SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference |
Authors | Lasse Espeholt, Raphaël Marinier, Piotr Stanczyk, Ke Wang, Marcin Michalski |
Abstract | We present a modern scalable reinforcement learning agent called SEED (Scalable, Efficient Deep-RL). By effectively utilizing modern accelerators, we show that it is not only possible to train on millions of frames per second but also to lower the cost of experiments compared to current methods. We achieve this with a simple architecture that features centralized inference and an optimized communication layer. SEED adopts two state of the art distributed algorithms, IMPALA/V-trace (policy gradients) and R2D2 (Q-learning), and is evaluated on Atari-57, DeepMind Lab and Google Research Football. We improve the state of the art on Football and are able to reach state of the art on Atari-57 three times faster in wall-time. For the scenarios we consider, a 40% to 80% cost reduction for running experiments is achieved. The implementation along with experiments is open-sourced so results can be reproduced and novel ideas tried out. |
Tasks | Q-Learning |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06591v2 |
https://arxiv.org/pdf/1910.06591v2.pdf | |
PWC | https://paperswithcode.com/paper/seed-rl-scalable-and-efficient-deep-rl-with-1 |
Repo | https://github.com/google-research/seed_rl |
Framework | tf |
Explicit Explore-Exploit Algorithms in Continuous State Spaces
Title | Explicit Explore-Exploit Algorithms in Continuous State Spaces |
Authors | Mikael Henaff |
Abstract | We present a new model-based algorithm for reinforcement learning (RL) which consists of explicit exploration and exploitation phases, and is applicable in large or infinite state spaces. The algorithm maintains a set of dynamics models consistent with current experience and explores by finding policies which induce high disagreement between their state predictions. It then exploits using the refined set of models or experience gathered during exploration. We show that under realizability and optimal planning assumptions, our algorithm provably finds a near-optimal policy with a number of samples that is polynomial in a structural complexity measure which we show to be low in several natural settings. We then give a practical approximation using neural networks and demonstrate its performance and sample efficiency in practice. |
Tasks | |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00617v2 |
https://arxiv.org/pdf/1911.00617v2.pdf | |
PWC | https://paperswithcode.com/paper/explicit-explore-exploit-algorithms-in |
Repo | https://github.com/mbhenaff/neural-e3 |
Framework | pytorch |
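
The exploration idea in the abstract above (prefer behaviour on which the retained dynamics models disagree) can be illustrated with a minimal one-step sketch. It is not the authors' implementation: the paper searches over policies, whereas this toy scores single actions, and the function names and the two hand-written models are hypothetical.

```python
import numpy as np

def disagreement(models, state, action):
    """Largest pairwise distance between the models' next-state predictions."""
    preds = np.array([m(state, action) for m in models])
    diffs = preds[:, None, :] - preds[None, :, :]
    return float(np.linalg.norm(diffs, axis=-1).max())

def most_informative_action(models, state, actions):
    """Pick the candidate action on which the model set disagrees most."""
    scores = [disagreement(models, state, a) for a in actions]
    return actions[int(np.argmax(scores))]

# Two hypothetical dynamics models that agree when the action is zero
# and diverge as the action grows.
models = [lambda s, a: s + a, lambda s, a: s + 2.0 * a]
state = np.array([0.5, -0.5])
actions = [np.array([0.0, 0.0]), np.array([1.0, 0.0])]
best = most_informative_action(models, state, actions)
```

Observing the true outcome of `best` would rule out at least one of the two models, which is the sense in which high disagreement drives exploration.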
The Explanation Game: Explaining Machine Learning Models with Cooperative Game Theory
Title | The Explanation Game: Explaining Machine Learning Models with Cooperative Game Theory |
Authors | Luke Merrick, Ankur Taly |
Abstract | A number of techniques have been proposed to explain a machine learning (ML) model’s prediction by attributing it to the corresponding input features. Popular among these are techniques that apply the Shapley value method from cooperative game theory. While existing papers focus on the axiomatic motivation of Shapley values, and efficient techniques for computing them, they neither justify the game formulations used nor address the uncertainty implicit in their methods’ outputs. For instance, the SHAP algorithm’s formulation may give substantial attributions to features that play no role in a model. Furthermore, without infinite data and computation, SHAP attributions are approximations subject to hitherto uncharacterized uncertainty. In this work, we illustrate how subtle differences in the underlying game formulations of existing methods can cause large differences in attribution for a prediction. We then present a general game formulation that unifies existing methods. Using the primitive of single-reference games, we decompose the Shapley values of the general game formulation into Shapley values of single-reference games. This decomposition enables us to introduce confidence intervals to quantify the uncertainty in estimated attributions. Additionally, this decomposition enables contrastive explanations of a prediction through comparisons with different groups of reference inputs. We tie this idea to classic work on Norm Theory in cognitive psychology, and propose a general framework for generating explanations for ML models, called formulate, approximate, and explain (FAE). |
Tasks | |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.08128v2 |
https://arxiv.org/pdf/1909.08128v2.pdf | |
PWC | https://paperswithcode.com/paper/the-explanation-game-explaining-machine |
Repo | https://github.com/fiddler-labs/the-explanation-game-supplemental |
Framework | none |
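
The decomposition described in the abstract above can be sketched for a tiny model: compute exact Shapley values of each single-reference game, then average them over a set of references. The toy model `f`, the reference set, and the normal-approximation interval (1.96 standard errors) are assumptions made for illustration, not details taken from the paper.

```python
import itertools
import math
import numpy as np

def single_reference_shapley(f, x, r):
    """Exact Shapley values of the game v(S) = f(x on S, r elsewhere)."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(others, k):
                z = np.array(r, dtype=float)  # start from the reference input
                for j in subset:
                    z[j] = x[j]               # switch coalition members to x
                without_i = f(z)
                z[i] = x[i]
                phi[i] += weight * (f(z) - without_i)
    return phi

def explain(f, x, references):
    """Average single-reference attributions and attach a rough 95% interval."""
    per_ref = np.array([single_reference_shapley(f, x, r) for r in references])
    mean = per_ref.mean(axis=0)
    sem = per_ref.std(axis=0, ddof=1) / np.sqrt(len(references))
    return mean, mean - 1.96 * sem, mean + 1.96 * sem

# Hypothetical model and references.
f = lambda z: 2.0 * z[0] + z[1] * z[2]
x = np.array([1.0, 1.0, 1.0])
references = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
mean, lower, upper = explain(f, x, references)
```

Swapping in a different group of references gives a contrastive explanation of the same prediction. Exact enumeration is exponential in the number of features, so this only scales to small examples.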
Reducing Overestimation Bias in Multi-Agent Domains Using Double Centralized Critics
Title | Reducing Overestimation Bias in Multi-Agent Domains Using Double Centralized Critics |
Authors | Johannes Ackermann, Volker Gabler, Takayuki Osa, Masashi Sugiyama |
Abstract | Many real world tasks require multiple agents to work together. Multi-agent reinforcement learning (RL) methods have been proposed in recent years to solve these tasks, but current methods often fail to efficiently learn policies. We thus investigate the presence of a common weakness in single-agent RL, namely value function overestimation bias, in the multi-agent setting. Based on our findings, we propose an approach that reduces this bias by using double centralized critics. We evaluate it on six mixed cooperative-competitive tasks, showing a significant advantage over current methods. Finally, we investigate the application of multi-agent methods to high-dimensional robotic tasks and show that our approach can be used to learn decentralized policies in this domain. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2019-10-03 |
URL | https://arxiv.org/abs/1910.01465v2 |
https://arxiv.org/pdf/1910.01465v2.pdf | |
PWC | https://paperswithcode.com/paper/reducing-overestimation-bias-in-multi-agent |
Repo | https://github.com/JohannesAck/MATD3implementation |
Framework | tf |
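
The abstract above does not spell out the update rule. A common way to use two critics against overestimation is the TD3-style clipped target, which takes the smaller of the two critics' estimates; the sketch below assumes that form. The function name, the default `gamma`, and the argument layout are hypothetical.

```python
import numpy as np

def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.95):
    """Bootstrap from the smaller of two centralized critic estimates.

    q1_next and q2_next are the two target critics' values for the joint
    next observation and joint next action (assumed to be given).
    """
    min_q = np.minimum(q1_next, q2_next)
    return reward + gamma * (1.0 - done) * min_q

# Two transitions; the second one is terminal.
targets = clipped_double_q_target(
    reward=np.array([1.0, 0.0]),
    done=np.array([0.0, 1.0]),
    q1_next=np.array([2.0, 5.0]),
    q2_next=np.array([3.0, 4.0]),
    gamma=0.5,
)
```

Taking the minimum biases the target downward, which counteracts the upward bias that a single maximizing critic tends to accumulate.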
Interpolation Consistency Training for Semi-Supervised Learning
Title | Interpolation Consistency Training for Semi-Supervised Learning |
Authors | Vikas Verma, Alex Lamb, Juho Kannala, Yoshua Bengio, David Lopez-Paz |
Abstract | We introduce Interpolation Consistency Training (ICT), a simple and computationally efficient algorithm for training Deep Neural Networks in the semi-supervised learning paradigm. ICT encourages the prediction at an interpolation of unlabeled points to be consistent with the interpolation of the predictions at those points. In classification problems, ICT moves the decision boundary to low-density regions of the data distribution. Our experiments show that ICT achieves state-of-the-art performance when applied to standard neural network architectures on the CIFAR-10 and SVHN benchmark datasets. |
Tasks | Semi-Supervised Image Classification |
Published | 2019-03-09 |
URL | https://arxiv.org/abs/1903.03825v3 |
https://arxiv.org/pdf/1903.03825v3.pdf | |
PWC | https://paperswithcode.com/paper/interpolation-consistency-training-for-semi |
Repo | https://github.com/vikasverma1077/ICT |
Framework | pytorch |
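
The consistency term described in the abstract above can be written down directly: compare the prediction at a mixed input with the same mix of the two individual predictions. This is a minimal NumPy sketch, assuming a single `predict` function produces both sides and a squared-error penalty; the training setup in the paper may differ, and all names are hypothetical.

```python
import numpy as np

def ict_consistency_loss(predict, u1, u2, lam):
    """Mean squared gap between predict(mix(u1, u2)) and mix(predict(u1), predict(u2))."""
    mixed_input = lam * u1 + (1.0 - lam) * u2
    mixed_target = lam * predict(u1) + (1.0 - lam) * predict(u2)
    return float(np.mean((predict(mixed_input) - mixed_target) ** 2))

rng = np.random.default_rng(0)
u1 = rng.standard_normal((4, 3))   # two batches of unlabeled points
u2 = rng.standard_normal((4, 3))
weights = rng.standard_normal((3, 2))

linear_loss = ict_consistency_loss(lambda u: u @ weights, u1, u2, lam=0.3)
nonlinear_loss = ict_consistency_loss(lambda u: np.tanh(u @ weights), u1, u2, lam=0.3)
```

A linear model satisfies the constraint exactly, so `linear_loss` is zero up to rounding, while the `tanh` model pays a penalty wherever it bends between the two points.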
On the Measure of Intelligence
Title | On the Measure of Intelligence |
Authors | François Chollet |
Abstract | To make deliberate progress towards more intelligent and more human-like artificial systems, we need to be following an appropriate feedback signal: we need to be able to define and evaluate intelligence in a way that enables comparisons between two systems, as well as comparisons with humans. Over the past hundred years, there has been an abundance of attempts to define and measure intelligence, across both the fields of psychology and AI. We summarize and critically assess these definitions and evaluation approaches, while making apparent the two historical conceptions of intelligence that have implicitly guided them. We note that in practice, the contemporary AI community still gravitates towards benchmarking intelligence by comparing the skill exhibited by AIs and humans at specific tasks such as board games and video games. We argue that solely measuring skill at any given task falls short of measuring intelligence, because skill is heavily modulated by prior knowledge and experience: unlimited priors or unlimited training data allow experimenters to “buy” arbitrary levels of skills for a system, in a way that masks the system’s own generalization power. We then articulate a new formal definition of intelligence based on Algorithmic Information Theory, describing intelligence as skill-acquisition efficiency and highlighting the concepts of scope, generalization difficulty, priors, and experience. Using this definition, we propose a set of guidelines for what a general AI benchmark should look like. Finally, we present a benchmark closely following these guidelines, the Abstraction and Reasoning Corpus (ARC), built upon an explicit set of priors designed to be as close as possible to innate human priors. We argue that ARC can be used to measure a human-like form of general fluid intelligence and that it enables fair general intelligence comparisons between AI systems and humans. |
Tasks | |
Published | 2019-11-05 |
URL | https://arxiv.org/abs/1911.01547v2 |
https://arxiv.org/pdf/1911.01547v2.pdf | |
PWC | https://paperswithcode.com/paper/the-measure-of-intelligence |
Repo | https://github.com/fchollet/ARC |
Framework | none |
Deep Point-wise Prediction for Action Temporal Proposal
Title | Deep Point-wise Prediction for Action Temporal Proposal |
Authors | Luxuan Li, Tao Kong, Fuchun Sun, Huaping Liu |
Abstract | Detecting actions in videos is an important yet challenging task. Previous works usually utilize (a) sliding window paradigms, or (b) per-frame action scoring and grouping to enumerate the possible temporal locations. Their performance is also limited by the design of the sliding windows or grouping strategies. In this paper, we present a simple and effective method for temporal action proposal generation, named Deep Point-wise Prediction (DPP). DPP simultaneously predicts the probability that an action exists and the corresponding temporal locations, without using any handcrafted sliding window or grouping. The whole system is trained end-to-end with a joint loss of temporal action proposal classification and location prediction. We conduct extensive experiments to verify its effectiveness, generality and robustness on the standard THUMOS14 dataset. DPP runs at more than 1000 frames per second, which largely satisfies the real-time requirement. The code is available at https://github.com/liluxuan1997/DPP. |
Tasks | Action Recognition In Videos, Temporal Action Proposal Generation |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07725v1 |
https://arxiv.org/pdf/1909.07725v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-point-wise-prediction-for-action |
Repo | https://github.com/liluxuan1997/DPP |
Framework | pytorch |
Segmentation Guided Image-to-Image Translation with Adversarial Networks
Title | Segmentation Guided Image-to-Image Translation with Adversarial Networks |
Authors | Songyao Jiang, Zhiqiang Tao, Yun Fu |
Abstract | Image-to-image translation, which aims to map images in one domain to another specific domain, has recently received increasing attention. Existing methods mainly solve this task via a deep generative model, and focus on exploring the relationship between different domains. However, these methods neglect to utilize higher-level and instance-specific information to guide the training process, leading to many unrealistic, low-quality generated images. Existing methods also lack spatial controllability during translation. To address these challenges, we propose a novel Segmentation Guided Generative Adversarial Network (SGGAN), which leverages semantic segmentation to further boost the generation performance and provide spatial mapping. In particular, a segmentor network is designed to impose semantic information on the generated images. Experimental results on a multi-domain face image translation task empirically demonstrate our capacity for spatial modification and our superiority in image quality over several state-of-the-art methods. |
Tasks | Image-to-Image Translation, Semantic Segmentation |
Published | 2019-01-06 |
URL | http://arxiv.org/abs/1901.01569v2 |
http://arxiv.org/pdf/1901.01569v2.pdf | |
PWC | https://paperswithcode.com/paper/segmentation-guided-image-to-image |
Repo | https://github.com/jackyjsy/SGGAN |
Framework | pytorch |
Recurrent Event Network: Global Structure Inference over Temporal Knowledge Graph
Title | Recurrent Event Network: Global Structure Inference over Temporal Knowledge Graph |
Authors | Woojeong Jin, He Jiang, Meng Qu, Tong Chen, Changlin Zhang, Pedro Szekely, Xiang Ren |
Abstract | Modeling dynamically-evolving, multi-relational graph data has received a surge of interest with the rapid growth of heterogeneous event data. However, predicting future events on such data requires global structure inference over time and the ability to integrate temporal and structural information, which are not yet well understood. We present Recurrent Event Network (RE-Net), a novel autoregressive architecture for modeling temporal sequences of multi-relational graphs (e.g., temporal knowledge graphs), which can perform sequential, global structure inference over future time stamps to predict new events. RE-Net employs a recurrent event encoder to model the temporally conditioned joint probability distribution for the event sequences, and equips the event encoder with a neighborhood aggregator for modeling the concurrent events within a time window associated with each entity. We apply teacher forcing for model training over historical data, and infer graph sequences over future time stamps by sampling from the learned joint distribution in a sequential manner. We evaluate the proposed method via temporal link prediction on five public datasets. Extensive experiments demonstrate the strength of RE-Net, especially on multi-step inference over future time stamps. Code and data can be found at https://github.com/INK-USC/RE-Net. |
Tasks | Knowledge Graphs, Link Prediction |
Published | 2019-04-11 |
URL | https://arxiv.org/abs/1904.05530v3 |
https://arxiv.org/pdf/1904.05530v3.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-event-network-for-reasoning-over |
Repo | https://github.com/INK-USC/RENet |
Framework | pytorch |
BHAAV- A Text Corpus for Emotion Analysis from Hindi Stories
Title | BHAAV- A Text Corpus for Emotion Analysis from Hindi Stories |
Authors | Yaman Kumar, Debanjan Mahata, Sagar Aggarwal, Anmol Chugh, Rajat Maheshwari, Rajiv Ratn Shah |
Abstract | In this paper, we introduce the first and largest Hindi text corpus, named BHAAV, which means emotions in Hindi, for analyzing emotions that a writer expresses through his characters in a story, as perceived by a narrator/reader. The corpus consists of 20,304 sentences collected from 230 different short stories spanning 18 genres such as Inspirational and Mystery. Each sentence has been annotated with one of five emotion categories (anger, joy, suspense, sad, and neutral) by three native Hindi speakers with at least ten years of formal education in Hindi. We also discuss challenges in the annotation of low-resource languages such as Hindi, and discuss the scope of the proposed corpus along with its possible uses. We also provide a detailed analysis of the dataset, train strong baseline classifiers, and report their performance. |
Tasks | Emotion Recognition |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.04073v1 |
https://arxiv.org/pdf/1910.04073v1.pdf | |
PWC | https://paperswithcode.com/paper/bhaav-a-text-corpus-for-emotion-analysis-from |
Repo | https://github.com/midas-research/bhaav |
Framework | none |
Solving Multiagent Planning Problems with Concurrent Conditional Effects
Title | Solving Multiagent Planning Problems with Concurrent Conditional Effects |
Authors | Daniel Furelos-Blanco, Anders Jonsson |
Abstract | In this work we present a novel approach to solving concurrent multiagent planning problems in which several agents act in parallel. Our approach relies on a compilation from concurrent multiagent planning to classical planning, allowing us to use an off-the-shelf classical planner to solve the original multiagent problem. The solution can be directly interpreted as a concurrent plan that satisfies a given set of concurrency constraints, while avoiding the exponential blowup associated with concurrent actions. Our planner is the first to handle action effects that are conditional on what other agents are doing. Theoretically, we show that the compilation is sound and complete. Empirically, we show that our compilation can solve challenging multiagent planning problems that require concurrent actions. |
Tasks | |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08157v1 |
https://arxiv.org/pdf/1906.08157v1.pdf | |
PWC | https://paperswithcode.com/paper/solving-multiagent-planning-problems-with |
Repo | https://github.com/aig-upf/universal-pddl-parser-multiagent |
Framework | none |
Synthetic patches, real images: screening for centrosome aberrations in EM images of human cancer cells
Title | Synthetic patches, real images: screening for centrosome aberrations in EM images of human cancer cells |
Authors | Artem Lukoyanov, Isabella Haberbosch, Constantin Pape, Alwin Kraemer, Yannick Schwab, Anna Kreshuk |
Abstract | Recent advances in high-throughput electron microscopy imaging enable detailed study of centrosome aberrations in cancer cells. While the image acquisition in such pipelines is automated, manual detection of centrioles is still necessary to select cells for re-imaging at higher magnification. In this contribution we propose an algorithm which performs this step automatically and with high accuracy. From the image labels produced by human experts and a 3D model of a centriole we construct an additional training set with patch-level labels. A two-level DenseNet is trained on the hybrid training data with synthetic patches and real images, achieving much better results on real patient data than training only at the image-level. The code can be found at https://github.com/kreshuklab/centriole_detection. |
Tasks | |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10109v1 |
https://arxiv.org/pdf/1908.10109v1.pdf | |
PWC | https://paperswithcode.com/paper/synthetic-patches-real-images-screening-for |
Repo | https://github.com/kreshuklab/centriole_detection |
Framework | pytorch |
Lost in Machine Translation: A Method to Reduce Meaning Loss
Title | Lost in Machine Translation: A Method to Reduce Meaning Loss |
Authors | Reuben Cohn-Gordon, Noah Goodman |
Abstract | A desideratum of high-quality translation systems is that they preserve meaning, in the sense that two sentences with different meanings should not translate to one and the same sentence in another language. However, state-of-the-art systems often fail in this regard, particularly in cases where the source and target languages partition the “meaning space” in different ways. For instance, “I cut my finger.” and “I cut my finger off.” describe different states of the world but are translated to French (by both Fairseq and Google Translate) as “Je me suis coupé le doigt.”, which is ambiguous as to whether the finger is detached. More generally, translation systems are typically many-to-one (non-injective) functions from source to target language, which in many cases results in important distinctions in meaning being lost in translation. Building on Bayesian models of informative utterance production, we present a method to define a less ambiguous translation system in terms of an underlying pre-trained neural sequence-to-sequence model. This method increases injectivity, resulting in greater preservation of meaning as measured by improvement in cycle-consistency, without impeding translation quality (measured by BLEU score). |
Tasks | Machine Translation |
Published | 2019-02-25 |
URL | http://arxiv.org/abs/1902.09514v4 |
http://arxiv.org/pdf/1902.09514v4.pdf | |
PWC | https://paperswithcode.com/paper/lost-in-machine-translation-a-method-to |
Repo | https://github.com/reubenharry/pragmatic-translation |
Framework | pytorch |
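
The Bayesian idea in the abstract above can be shown on a finite toy table. A base translator assigns probabilities to target sentences, a listener inverts it with Bayes' rule, and the informative translator reweights each candidate by how well the listener would recover the source. The probabilities, the uniform prior over sources, and the `alpha` exponent are assumptions for illustration; the paper works with neural sequence-to-sequence models rather than explicit tables.

```python
import numpy as np

def informative_speaker(s0, alpha=1.0):
    """Reweight base translations by how well they identify the source.

    s0[i, j] is the base model's P(target j | source i).
    """
    # Listener: P(source | target), assuming a uniform prior over sources.
    listener = s0 / s0.sum(axis=0, keepdims=True)
    scores = s0 * listener ** alpha
    return scores / scores.sum(axis=1, keepdims=True)

# Rows: "I cut my finger." and "I cut my finger off."
# Columns: one ambiguous target plus one unambiguous target for each meaning.
s0 = np.array([
    [0.6, 0.4, 0.0],
    [0.6, 0.0, 0.4],
])
s1 = informative_speaker(s0)
base_choice = s0.argmax(axis=1)         # both sources pick the ambiguous target
informative_choice = s1.argmax(axis=1)  # each source picks its own target
```

In this toy the base model maps both sources to the same target, while the reweighted model maps them to different targets, which is the injectivity gain the abstract refers to.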
Learning Deep Generative Models with Annealed Importance Sampling
Title | Learning Deep Generative Models with Annealed Importance Sampling |
Authors | Xinqiang Ding, David J. Freedman |
Abstract | Variational inference (VI) and Markov chain Monte Carlo (MCMC) are the two main approximate approaches for learning deep generative models by maximizing marginal likelihood. In this paper, we propose using annealed importance sampling for learning deep generative models. Our proposed approach bridges VI with MCMC. It generalizes VI methods such as variational auto-encoders and importance weighted auto-encoders (IWAE) and the MCMC method proposed in (Hoffman, 2017). It also provides insights into why running multiple short MCMC chains can help in learning deep generative models. Through experiments, we show that our approach yields better density models than IWAE and can effectively trade computation for model accuracy without increasing memory cost. |
Tasks | Latent Variable Models |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.04904v2 |
https://arxiv.org/pdf/1906.04904v2.pdf | |
PWC | https://paperswithcode.com/paper/improving-importance-weighted-auto-encoders |
Repo | https://github.com/xqding/AIWAE |
Framework | pytorch |
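
For readers unfamiliar with annealed importance sampling itself, here is a minimal one-dimensional sketch that estimates a normalizing constant by annealing from a standard normal to an unnormalized Gaussian target. It illustrates the general estimator only, not the training procedure of the paper; the target, the linear temperature schedule, and the random-walk Metropolis transition are all assumptions.

```python
import numpy as np

def log_p0(x):
    """Log density of the normalized starting distribution N(0, 1)."""
    return -0.5 * x ** 2 - 0.5 * np.log(2.0 * np.pi)

def log_gamma(x, mu=1.0, s=0.5):
    """Unnormalized log target; its true normalizer is s * sqrt(2 * pi)."""
    return -0.5 * ((x - mu) / s) ** 2

def ais_estimate_z(n_chains=4000, n_temps=50, step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    betas = np.linspace(0.0, 1.0, n_temps + 1)
    x = rng.standard_normal(n_chains)      # exact samples from p0
    log_w = np.zeros(n_chains)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Importance weight for moving from temperature b_prev to b.
        log_w += (b - b_prev) * (log_gamma(x) - log_p0(x))

        def log_f(y):
            # Intermediate distribution: geometric mix of p0 and the target.
            return (1.0 - b) * log_p0(y) + b * log_gamma(y)

        # One random-walk Metropolis step that leaves log_f invariant.
        proposal = x + step * rng.standard_normal(n_chains)
        accept = np.log(rng.random(n_chains)) < log_f(proposal) - log_f(x)
        x = np.where(accept, proposal, x)
    return float(np.exp(log_w).mean())

z_hat = ais_estimate_z()
z_true = 0.5 * np.sqrt(2.0 * np.pi)
```

Each chain is short and independent, so the chains run in parallel as a single vectorized array. Adding temperatures or chains spends more computation for a lower-variance estimate.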
A Cross-Season Correspondence Dataset for Robust Semantic Segmentation
Title | A Cross-Season Correspondence Dataset for Robust Semantic Segmentation |
Authors | Måns Larsson, Erik Stenborg, Lars Hammarstrand, Torsten Sattler, Mark Pollefeys, Fredrik Kahl |
Abstract | In this paper, we present a method to utilize 2D-2D point matches between images taken during different image conditions to train a convolutional neural network for semantic segmentation. Enforcing label consistency across the matches makes the final segmentation algorithm robust to seasonal changes. We describe how these 2D-2D matches can be generated with little human interaction by geometrically matching points from 3D models built from images. Two cross-season correspondence datasets are created providing 2D-2D matches across seasonal changes as well as from day to night. The datasets are made publicly available to facilitate further research. We show that adding the correspondences as extra supervision during training improves the segmentation performance of the convolutional neural network, making it more robust to seasonal changes and weather conditions. |
Tasks | Semantic Segmentation |
Published | 2019-03-16 |
URL | https://arxiv.org/abs/1903.06916v2 |
https://arxiv.org/pdf/1903.06916v2.pdf | |
PWC | https://paperswithcode.com/paper/a-cross-season-correspondence-dataset-for |
Repo | https://github.com/maunzzz/cross-season-segmentation |
Framework | pytorch |