April 2, 2020

Paper Group ANR 264

Distributed Embodied Evolution in Networks of Agents. On Catastrophic Interference in Atari 2600 Games. EPOS: Estimating 6D Pose of Objects with Symmetries. No-regret learning dynamics for extensive-form correlated and coarse correlated equilibria. Parallel Predictive Entropy Search for Multi-objective Bayesian Optimization with Constraints. Work i …

Distributed Embodied Evolution in Networks of Agents

Title Distributed Embodied Evolution in Networks of Agents
Authors Anil Yaman, Giovanni Iacca
Abstract In most network problems, the optimum behaviors of agents in the network are not known before deployment. In addition to that, agents might be required to adapt, i.e. change their behavior based on the environment conditions. In these scenarios, offline optimization is usually costly and inefficient, while online methods might be more suitable. In this work we propose a distributed embodied evolutionary approach to optimize spatially distributed, locally interacting agents by allowing them to exchange their behavior parameters and learn from each other to adapt to a certain task within a given environment. Our numerical results show that the local exchange of information, performed by means of crossover of behavior parameters with neighbors, allows the network to converge to the global optimum more efficiently than the cases where local interactions are not allowed, even when there are large differences on the optimal behaviors within a neighborhood.
Tasks
Published 2020-03-28
URL https://arxiv.org/abs/2003.12848v2
PDF https://arxiv.org/pdf/2003.12848v2.pdf
PWC https://paperswithcode.com/paper/distributed-embodied-evolution-in-networks-of
Repo
Framework
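
The abstract above describes agents that improve by crossing over behavior parameters with their neighbors. A minimal sketch of that idea follows, under loudly stated assumptions: the ring topology, the quadratic `fitness`, the Gaussian mutation, and the keep-if-not-worse acceptance rule are all hypothetical choices made for illustration and are not taken from the paper.

```python
import random

def fitness(params, target):
    # Hypothetical task: negative squared distance to an agent-specific target.
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def uniform_crossover(own, other, rng):
    # Each parameter is copied from the agent itself or from its neighbor.
    return [o if rng.random() < 0.5 else n for o, n in zip(own, other)]

def evolve_ring(population, targets, steps, rng, sigma=0.1):
    """Local evolution on a ring: each agent only sees its two neighbors."""
    n = len(population)
    for _ in range(steps):
        snapshot = [list(p) for p in population]  # synchronous update
        for i in range(n):
            j = rng.choice([(i - 1) % n, (i + 1) % n])
            child = uniform_crossover(snapshot[i], snapshot[j], rng)
            # Assumed mutation step so that new parameter values can appear.
            child = [c + rng.gauss(0.0, sigma) for c in child]
            # Assumed elitist rule: keep the child only if it is not worse.
            if fitness(child, targets[i]) >= fitness(snapshot[i], targets[i]):
                population[i] = child
    return population

rng = random.Random(0)
targets = [[1.0, -1.0] for _ in range(6)]
population = [[rng.uniform(-3, 3), rng.uniform(-3, 3)] for _ in range(6)]
before = [fitness(p, t) for p, t in zip(population, targets)]
evolve_ring(population, targets, steps=50, rng=rng)
after = [fitness(p, t) for p, t in zip(population, targets)]
```

Because of the acceptance rule, each agent's fitness can only stay the same or improve; whether local exchange speeds up convergence relative to isolated agents is the question the paper studies and is not demonstrated by this sketch.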

On Catastrophic Interference in Atari 2600 Games

Title On Catastrophic Interference in Atari 2600 Games
Authors William Fedus, Dibya Ghosh, John D. Martin, Marc G. Bellemare, Yoshua Bengio, Hugo Larochelle
Abstract Model-free deep reinforcement learning algorithms are troubled with poor sample efficiency – learning reliable policies generally requires a vast amount of interaction with the environment. One hypothesis is that catastrophic interference between various segments within the environment is an issue. In this paper, we perform a large-scale empirical study on the presence of catastrophic interference in the Arcade Learning Environment and find that learning particular game segments frequently degrades performance on previously learned segments. In what we term the Memento observation, we show that an identically parameterized agent spawned from a state where the original agent plateaued, reliably makes further progress. This phenomenon is general – we find consistent performance boosts across architectures, learning algorithms and environments. Our results indicate that eliminating catastrophic interference can contribute towards improved performance and data efficiency of deep reinforcement learning algorithms.
Tasks Atari Games
Published 2020-02-28
URL https://arxiv.org/abs/2002.12499v1
PDF https://arxiv.org/pdf/2002.12499v1.pdf
PWC https://paperswithcode.com/paper/on-catastrophic-interference-in-atari-2600
Repo
Framework

EPOS: Estimating 6D Pose of Objects with Symmetries

Title EPOS: Estimating 6D Pose of Objects with Symmetries
Authors Tomas Hodan, Daniel Barath, Jiri Matas
Abstract We present a new method for estimating the 6D pose of rigid objects with available 3D models from a single RGB input image. The method is applicable to a broad range of objects, including challenging ones with global or partial symmetries. An object is represented by compact surface fragments which allow handling symmetries in a systematic manner. Correspondences between densely sampled pixels and the fragments are predicted using an encoder-decoder network. At each pixel, the network predicts: (i) the probability of each object’s presence, (ii) the probability of the fragments given the object’s presence, and (iii) the precise 3D location on each fragment. A data-dependent number of corresponding 3D locations is selected per pixel, and poses of possibly multiple object instances are estimated using a robust and efficient variant of the PnP-RANSAC algorithm. In the BOP Challenge 2019, the method outperforms all RGB and most RGB-D and D methods on the T-LESS and LM-O datasets. On the YCB-V dataset, it is superior to all competitors, with a large margin over the second-best RGB method. Source code is at: cmp.felk.cvut.cz/epos.
Tasks
Published 2020-04-01
URL https://arxiv.org/abs/2004.00605v1
PDF https://arxiv.org/pdf/2004.00605v1.pdf
PWC https://paperswithcode.com/paper/epos-estimating-6d-pose-of-objects-with
Repo
Framework

No-regret learning dynamics for extensive-form correlated and coarse correlated equilibria

Title No-regret learning dynamics for extensive-form correlated and coarse correlated equilibria
Authors Andrea Celli, Alberto Marchesi, Gabriele Farina, Nicola Gatti
Abstract Recently, there has been growing interest around less-restrictive solution concepts than Nash equilibrium in extensive-form games, with significant effort towards the computation of extensive-form correlated equilibrium (EFCE) and extensive-form coarse correlated equilibrium (EFCCE). In this paper, we show how to leverage the popular counterfactual regret minimization (CFR) paradigm to induce simple no-regret dynamics that converge to the set of EFCEs and EFCCEs in n-player general-sum extensive-form games. For EFCE, we define a notion of internal regret suitable for extensive-form games and exhibit an efficient no-internal-regret algorithm. These results complement those for normal-form games introduced in the seminal paper by Hart and Mas-Colell. For EFCCE, we show that no modification of CFR is needed, and that in fact the empirical frequency of play generated when all the players use the original CFR algorithm converges to the set of EFCCEs.
Tasks
Published 2020-04-01
URL https://arxiv.org/abs/2004.00603v1
PDF https://arxiv.org/pdf/2004.00603v1.pdf
PWC https://paperswithcode.com/paper/no-regret-learning-dynamics-for-extensive
Repo
Framework
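
CFR is commonly built from regret matching applied at each decision point. The sketch below shows only that local building block on a single decision with three actions; it is not the paper's EFCE or EFCCE procedure, and the function names and the fixed "always rock" opponent are illustrative assumptions.

```python
def regret_matching(cumulative_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No action has positive regret: fall back to the uniform strategy.
    n = len(cumulative_regret)
    return [1.0 / n] * n

def update_regret(cumulative_regret, strategy, action_utilities):
    """Add, for each action, how much better it was than the mixed strategy."""
    expected = sum(s * u for s, u in zip(strategy, action_utilities))
    return [r + (u - expected) for r, u in zip(cumulative_regret, action_utilities)]

# Rock-paper-scissors payoffs for (rock, paper, scissors) against an
# opponent that is assumed to always play rock.
utilities_vs_rock = [0.0, 1.0, -1.0]

regret = [0.0, 0.0, 0.0]
for _ in range(10):
    strategy = regret_matching(regret)
    regret = update_regret(regret, strategy, utilities_vs_rock)

final_strategy = regret_matching(regret)
```

Against this fixed opponent the learner settles on paper after one update. In self-play, the quantity the abstract refers to is the empirical frequency of the joint actions across iterations rather than any single iterate.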

Parallel Predictive Entropy Search for Multi-objective Bayesian Optimization with Constraints

Title Parallel Predictive Entropy Search for Multi-objective Bayesian Optimization with Constraints
Authors Eduardo C. Garrido-Merchán, Daniel Hernández-Lobato
Abstract Real-world problems often involve the optimization of several objectives under multiple constraints. Furthermore, we may not have an expression for each objective or constraint; they may be expensive to evaluate; and the evaluations can be noisy. These functions are referred to as black-boxes. Bayesian optimization (BO) can efficiently solve the problems described. For this, BO iteratively fits a model to the observations of each black-box. The models are then used to choose where to evaluate the black-boxes next, with the goal of solving the optimization problem in a few iterations. In particular, they guide the search for the problem solution, and avoid evaluations in regions of little expected utility. A limitation, however, is that current BO methods for these problems choose a point at a time at which to evaluate the black-boxes. If the expensive evaluations can be carried out in parallel (as when a cluster of computers is available), this results in a waste of resources. Here, we introduce PPESMOC, Parallel Predictive Entropy Search for Multi-objective Optimization with Constraints, a BO strategy for solving the problems described. PPESMOC selects, at each iteration, a batch of input locations at which to evaluate the black-boxes, in parallel, to maximally reduce the entropy of the problem solution. To our knowledge, this is the first batch method for constrained multi-objective BO. We present empirical evidence in the form of synthetic, benchmark and real-world experiments that illustrate the effectiveness of PPESMOC.
Tasks
Published 2020-04-01
URL https://arxiv.org/abs/2004.00601v1
PDF https://arxiv.org/pdf/2004.00601v1.pdf
PWC https://paperswithcode.com/paper/parallel-predictive-entropy-search-for-multi
Repo
Framework

Work in Progress: Temporally Extended Auxiliary Tasks

Title Work in Progress: Temporally Extended Auxiliary Tasks
Authors Craig Sherstan, Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor
Abstract Predictive auxiliary tasks have been shown to improve performance in numerous reinforcement learning works, however, this effect is still not well understood. The primary purpose of the work presented here is to investigate the impact that an auxiliary task’s prediction timescale has on the agent’s policy performance. We consider auxiliary tasks which learn to make on-policy predictions using temporal difference learning. We test the impact of prediction timescale using a specific form of auxiliary task in which the input image is used as the prediction target, which we refer to as temporal difference autoencoders (TD-AE). We empirically evaluate the effect of TD-AE on the A2C algorithm in the VizDoom environment using different prediction timescales. While we do not observe a clear relationship between the prediction timescale on performance, we make the following observations: 1) using auxiliary tasks allows us to reduce the trajectory length of the A2C algorithm, 2) in some cases temporally extended TD-AE performs better than a straight autoencoder, 3) performance with auxiliary tasks is sensitive to the weight placed on the auxiliary loss, 4) despite this sensitivity, auxiliary tasks improved performance without extensive hyper-parameter tuning. Our overall conclusions are that TD-AE increases the robustness of the A2C algorithm to the trajectory length and while promising, further study is required to fully understand the relationship between auxiliary task prediction timescale and the agent’s performance.
Tasks
Published 2020-04-01
URL https://arxiv.org/abs/2004.00600v1
PDF https://arxiv.org/pdf/2004.00600v1.pdf
PWC https://paperswithcode.com/paper/work-in-progress-temporally-extended
Repo
Framework

RMP-SNN: Residual Membrane Potential Neuron for Enabling Deeper High-Accuracy and Low-Latency Spiking Neural Network

Title RMP-SNN: Residual Membrane Potential Neuron for Enabling Deeper High-Accuracy and Low-Latency Spiking Neural Network
Authors Bing Han, Gopalakrishnan Srinivasan, Kaushik Roy
Abstract Spiking Neural Networks (SNNs) have recently attracted significant research interest as the third generation of artificial neural networks that can enable low-power event-driven data analytics. The best performing SNNs for image recognition tasks are obtained by converting a trained Analog Neural Network (ANN), consisting of Rectified Linear Units (ReLU), to SNN composed of integrate-and-fire neurons with “proper” firing thresholds. The converted SNNs typically incur loss in accuracy compared to that provided by the original ANN and require sizable number of inference time-steps to achieve the best accuracy. We find that performance degradation in the converted SNN stems from using “hard reset” spiking neuron that is driven to fixed reset potential once its membrane potential exceeds the firing threshold, leading to information loss during SNN inference. We propose ANN-SNN conversion using “soft reset” spiking neuron model, referred to as Residual Membrane Potential (RMP) spiking neuron, which retains the “residual” membrane potential above threshold at the firing instants. We demonstrate near loss-less ANN-SNN conversion using RMP neurons for VGG-16, ResNet-20, and ResNet-34 SNNs on challenging datasets including CIFAR-10 (93.63% top-1), CIFAR-100 (70.93% top-1), and ImageNet (73.09% top-1 accuracy). Our results also show that RMP-SNN surpasses the best inference accuracy provided by the converted SNN with “hard reset” spiking neurons using 2-8 times fewer inference time-steps across network architectures and datasets.
Tasks
Published 2020-02-25
URL https://arxiv.org/abs/2003.01811v2
PDF https://arxiv.org/pdf/2003.01811v2.pdf
PWC https://paperswithcode.com/paper/rmp-snns-residual-membrane-potential-neuron
Repo
Framework
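
The difference between the "hard reset" and "soft reset" neurons described in the abstract can be seen on a single integrate-and-fire neuron. This is a minimal sketch, assuming a constant input current, a unit threshold, and a firing test of `v >= threshold`; the function name and these settings are illustrative and not taken from the paper's code.

```python
def simulate_if_neuron(input_current, threshold, n_steps, soft_reset):
    """Count the spikes of one integrate-and-fire neuron with constant input."""
    v = 0.0  # membrane potential
    spikes = 0
    for _ in range(n_steps):
        v += input_current
        if v >= threshold:
            spikes += 1
            if soft_reset:
                # Soft reset: subtract the threshold, keeping the residual.
                v -= threshold
            else:
                # Hard reset: discard whatever was accumulated above threshold.
                v = 0.0
    return spikes

n_steps = 100
hard_spikes = simulate_if_neuron(0.75, 1.0, n_steps, soft_reset=False)
soft_spikes = simulate_if_neuron(0.75, 1.0, n_steps, soft_reset=True)
hard_rate = hard_spikes / n_steps  # 0.5
soft_rate = soft_spikes / n_steps  # 0.75
```

With an input of 0.75 per step, the soft-reset neuron fires at a rate of 0.75, matching the activation it is meant to encode, while the hard-reset neuron fires at 0.5 because the potential above threshold is thrown away at every spike.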

Robust Image Reconstruction with Misaligned Structural Information

Title Robust Image Reconstruction with Misaligned Structural Information
Authors Leon Bungert, Matthias J. Ehrhardt
Abstract Multi-modality (or multi-channel) imaging is becoming increasingly important and more widely available, e.g. hyperspectral imaging in remote sensing, spectral CT in material sciences as well as multi-contrast MRI and PET-MR in medicine. Research in the last decades resulted in a plethora of mathematical methods to combine data from several modalities. State-of-the-art methods, often formulated as variational regularization, have shown to significantly improve image reconstruction both quantitatively and qualitatively. Almost all of these models rely on the assumption that the modalities are perfectly registered, which is not the case in most real world applications. We propose a variational framework which jointly performs reconstruction and registration, thereby overcoming this hurdle. Numerical results show the potential of the proposed strategy for various applications for hyperspectral imaging, PET-MR and multi-contrast MRI: typical misalignments between modalities such as rotations, translations, zooms can be effectively corrected during the reconstruction process. Therefore the proposed framework allows the robust exploitation of shared information across multiple modalities under real conditions.
Tasks Image Reconstruction
Published 2020-04-01
URL https://arxiv.org/abs/2004.00589v1
PDF https://arxiv.org/pdf/2004.00589v1.pdf
PWC https://paperswithcode.com/paper/robust-image-reconstruction-with-misaligned
Repo
Framework

Sign Language Translation with Transformers

Title Sign Language Translation with Transformers
Authors Kayo Yin
Abstract Sign Language Translation (SLT) first uses a Sign Language Recognition (SLR) system to extract sign language glosses from videos. Then, a translation system generates spoken language translations from the sign language glosses. Though SLT has gathered interest recently, little study has been performed on the translation system. This paper focuses on the translation system and improves performance by utilizing Transformer networks. We report a wide range of experimental results for various Transformer setups and introduce the use of Spatial-Temporal Multi-Cue (STMC) networks in an end-to-end SLT system with Transformer. We perform experiments on RWTH-PHOENIX-Weather 2014T, a challenging SLT benchmark dataset of German sign language, and ASLG-PC12, a dataset involving American Sign Language (ASL) recently used in gloss-to-text translation. Our methodology improves on the current state-of-the-art by over 5 and 7 points respectively in BLEU-4 score on ground truth glosses and by using an STMC network to predict glosses of the RWTH-PHOENIX-Weather 2014T dataset. On the ASLG-PC12 corpus, we report an improvement of over 16 points in BLEU-4. Our findings also demonstrate that end-to-end translation on predicted glosses provides even better performance than translation on ground truth glosses. This shows potential for further improvement in SLT by either jointly training the SLR and translation systems or by revising the gloss annotation system.
Tasks Sign Language Recognition, Sign Language Translation
Published 2020-04-01
URL https://arxiv.org/abs/2004.00588v1
PDF https://arxiv.org/pdf/2004.00588v1.pdf
PWC https://paperswithcode.com/paper/sign-language-translation-with-transformers
Repo
Framework

Boosting Deep Hyperspectral Image Classification with Spectral Unmixing

Title Boosting Deep Hyperspectral Image Classification with Spectral Unmixing
Authors Alan J. X. Guo, Fei Zhu
Abstract Recent advances in neural networks have made great progress in addressing the hyperspectral image (HSI) classification problem. However, the overfitting effect, which is mainly caused by complicated model structure and small training set, remains a major concern when applying neural networks to HSI analysis. Reducing the complexity of the neural networks could prevent overfitting to some extent, but would also diminish the networks’ ability to extract more abstract features. Enlarging the training set is also difficult. To tackle the overfitting issue, we propose an abundance-based multi-HSI classification method. By applying an autoencoder-based spectral unmixing technique, different HSIs are first converted from the spectral domain to the abundance domain. After that, the obtained abundance data from multi-HSI are collected to form an enlarged dataset. Lastly, a simple classifier is trained, which is capable of predicting on all the involved datasets. Taking advantage of spectral unmixing, converting the data from the spectral domain to the abundance domain can significantly simplify the classification tasks. This enables the use of a simple network as the classifier, thus alleviating the overfitting effect. Moreover, as much dataset-specific information is eliminated after spectral unmixing, a compatible classifier suitable for different HSIs is trained. In view of this, a training set several times larger is constructed by bundling different HSIs’ training data. The effectiveness of the proposed method is verified by ablation study and comparative experiments. On four public HSIs, the proposed method provides classification results comparable to two competing methods, but with a far simpler model.
Tasks Hyperspectral Image Classification, Image Classification
Published 2020-04-01
URL https://arxiv.org/abs/2004.00583v1
PDF https://arxiv.org/pdf/2004.00583v1.pdf
PWC https://paperswithcode.com/paper/boosting-deep-hyperspectral-image
Repo
Framework

Detecting Troll Behavior via Inverse Reinforcement Learning: A Case Study of Russian Trolls in the 2016 US Election

Title Detecting Troll Behavior via Inverse Reinforcement Learning: A Case Study of Russian Trolls in the 2016 US Election
Authors Luca Luceri, Silvia Giordano, Emilio Ferrara
Abstract Since the 2016 US Presidential election, social media abuse has been eliciting massive concern in the academic community and beyond. Preventing and limiting the malicious activity of users, such as trolls and bots, in their manipulation campaigns is of paramount importance for the integrity of democracy, public health, and more. However, the automated detection of troll accounts is an open challenge. In this work, we propose an approach based on Inverse Reinforcement Learning (IRL) to capture troll behavior and identify troll accounts. We employ IRL to infer a set of online incentives that may steer user behavior, which in turn highlights behavioral differences between troll and non-troll accounts, enabling their accurate classification. As a case study, we consider the troll accounts identified by the US Congress during the investigation of Russian meddling in the 2016 US Presidential election. We report promising results: the IRL-based approach is able to accurately detect troll accounts (AUC=89.1%). The differences in the predictive features between the two classes of accounts enable a principled understanding of the distinctive behaviors reflecting the incentives trolls and non-trolls respond to.
Tasks
Published 2020-01-28
URL https://arxiv.org/abs/2001.10570v2
PDF https://arxiv.org/pdf/2001.10570v2.pdf
PWC https://paperswithcode.com/paper/dont-feed-the-troll-detecting-troll-behavior
Repo
Framework

Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding

Title Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding
Authors Seong Hyeon Park, Gyubok Lee, Manoj Bhat, Jimin Seo, Minseok Kang, Jonathan Francis, Ashwin R. Jadhav, Paul Pu Liang, Louis-Philippe Morency
Abstract Multi-agent trajectory forecasting in autonomous driving requires an agent to accurately anticipate the behaviors of the surrounding vehicles and pedestrians, for safe and reliable decision-making. Due to partial observability over the goals, contexts, and interactions of agents in these dynamical scenes, directly obtaining the posterior distribution over future agent trajectories remains a challenging problem. In realistic embodied environments, each agent’s future trajectories should be diverse since multiple plausible sequences of actions can be used to reach its intended goals, and they should be admissible since they must obey physical constraints and stay in drivable areas. In this paper, we propose a model that fully synthesizes multiple input signals from the multimodal world (the environment’s scene context and interactions between multiple surrounding agents) to best model all diverse and admissible trajectories. We offer new metrics to evaluate the diversity of trajectory predictions, while ensuring admissibility of each trajectory. Based on our new metrics as well as those used in prior work, we compare our model with strong baselines and ablations across two datasets and show a 35% performance-improvement over the state-of-the-art.
Tasks Autonomous Driving, Decision Making
Published 2020-03-06
URL https://arxiv.org/abs/2003.03212v2
PDF https://arxiv.org/pdf/2003.03212v2.pdf
PWC https://paperswithcode.com/paper/diverse-and-admissible-trajectory-forecasting
Repo
Framework

Tightened Convex Relaxations for Neural Network Robustness Certification

Title Tightened Convex Relaxations for Neural Network Robustness Certification
Authors Brendon G. Anderson, Ziye Ma, Jingqi Li, Somayeh Sojoudi
Abstract In this paper, we consider the problem of certifying the robustness of neural networks to perturbed and adversarial input data. Such certification is imperative for the application of neural networks in safety-critical decision-making and control systems. Certification techniques using convex optimization have been proposed, but they often suffer from relaxation errors that void the certificate. Our work exploits the structure of ReLU networks to improve relaxation errors through a novel partition-based certification procedure. The proposed method is proven to tighten existing linear programming relaxations, and asymptotically achieves zero relaxation error as the partition is made finer. We develop a finite partition that attains zero relaxation error and use the result to derive a tractable partitioning scheme that minimizes the worst-case relaxation error. Experiments using real data show that the partitioning procedure is able to issue robustness certificates in cases where prior methods fail. Consequently, partition-based certification procedures are found to provide an intuitive, effective, and theoretically justified method for tightening existing convex relaxation techniques.
Tasks Decision Making
Published 2020-04-01
URL https://arxiv.org/abs/2004.00570v1
PDF https://arxiv.org/pdf/2004.00570v1.pdf
PWC https://paperswithcode.com/paper/tightened-convex-relaxations-for-neural
Repo
Framework

One-shot path planning for multi-agent systems using fully convolutional neural network

Title One-shot path planning for multi-agent systems using fully convolutional neural network
Authors Tomas Kulvicius, Sebastian Herzog, Timo Lüddecke, Minija Tamosiunaite, Florentin Wörgötter
Abstract Path planning plays a crucial role in robot action execution, since a path or a motion trajectory for a particular action has to be defined first before the action can be executed. Most of the current approaches are iterative methods where the trajectory is generated iteratively by predicting the next state based on the current state. Moreover, in case of multi-agent systems, paths are planned for each agent separately. In contrast to that, we propose a novel method by utilising fully convolutional neural network, which allows generation of complete paths, even for more than one agent, in one-shot, i.e., with a single prediction step. We demonstrate that our method is able to successfully generate optimal or close to optimal paths in more than 98% of the cases for single path predictions. Moreover, we show that although the network has never been trained on multi-path planning it is also able to generate optimal or close to optimal paths in 85.7% and 65.4% of the cases when generating two and three paths, respectively.
Tasks
Published 2020-04-01
URL https://arxiv.org/abs/2004.00568v1
PDF https://arxiv.org/pdf/2004.00568v1.pdf
PWC https://paperswithcode.com/paper/one-shot-path-planning-for-multi-agent
Repo
Framework

How Much Can A Retailer Sell? Sales Forecasting on Tmall

Title How Much Can A Retailer Sell? Sales Forecasting on Tmall
Authors Chaochao Chen, Ziqi Liu, Jun Zhou, Xiaolong Li, Yuan Qi, Yujing Jiao, Xingyu Zhong
Abstract Time-series forecasting is an important task in both academia and industry, which can be applied to solve many real forecasting problems like stock, water-supply, and sales predictions. In this paper, we study the case of retailers’ sales forecasting on Tmall, the world’s leading online B2C platform. By analyzing the data, we have two main observations, i.e., sales seasonality after we group different groups of retailers and a Tweedie distribution after we transform the sales (target to forecast). Based on our observations, we design two mechanisms for sales forecasting, i.e., seasonality extraction and distribution transformation. First, we adopt Fourier decomposition to automatically extract the seasonalities for different categories of retailers, which can further be used as additional features for any established regression algorithms. Second, we propose to optimize the Tweedie loss of sales after logarithmic transformations. We apply these two mechanisms to classic regression models, i.e., neural network and Gradient Boosting Decision Tree, and the experimental results on the Tmall dataset show that both mechanisms can significantly improve the forecasting results.
Tasks Time Series, Time Series Forecasting
Published 2020-02-27
URL https://arxiv.org/abs/2002.11940v1
PDF https://arxiv.org/pdf/2002.11940v1.pdf
PWC https://paperswithcode.com/paper/how-much-can-a-retailer-sell-sales
Repo
Framework
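
The two mechanisms named in the abstract, Fourier-based seasonality extraction and a Tweedie loss, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the synthetic weekly series, the helper names, the choice of a log link, and the variance power of 1.5. The paper's exact feature construction and its logarithmic transformation of sales are not reproduced here.

```python
import cmath
import math

def dominant_period(series):
    """Return the period of the strongest Fourier component of a series."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # drop the constant component
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return n / best_k

def tweedie_loss(y, log_mu, power=1.5):
    """Tweedie negative log-likelihood with a log link, up to a term
    that depends only on y. Valid for 1 < power < 2."""
    a = y * math.exp((1.0 - power) * log_mu) / (1.0 - power)
    b = math.exp((2.0 - power) * log_mu) / (2.0 - power)
    return -a + b

# Hypothetical sales series with a 7-day cycle over 4 weeks.
weekly_sales = [10.0 + 3.0 * math.sin(2 * math.pi * t / 7) for t in range(28)]
period = dominant_period(weekly_sales)

# The loss is smallest when the predicted mean equals the observed value.
loss_at_truth = tweedie_loss(3.0, math.log(3.0))
loss_too_low = tweedie_loss(3.0, math.log(2.0))
loss_too_high = tweedie_loss(3.0, math.log(4.0))
```

The detected period could then be turned into sine and cosine features for a regression model, and the loss used as its training objective; how the paper combines the two is described only at the level of the abstract.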