January 26, 2020

Paper Group ANR 1419

Gaze Training by Modulated Dropout Improves Imitation Learning


Title	Gaze Training by Modulated Dropout Improves Imitation Learning
Authors	Yuying Chen, Congcong Liu, Lei Tai, Ming Liu, Bertram E. Shi
Abstract	Imitation learning by behavioral cloning is a prevalent method that has achieved some success in vision-based autonomous driving. The basic idea behind behavioral cloning is to have the neural network learn from observing a human expert’s behavior. Typically, a convolutional neural network learns to predict the steering commands from raw driver-view images by mimicking the behaviors of human drivers. However, there are other cues, such as gaze behavior, available from human drivers that have yet to be exploited. Previous research has shown that novice human learners can benefit from observing experts’ gaze patterns. We show here that deep neural networks can also profit from this. We propose a method, gaze-modulated dropout, for integrating this gaze information into a deep driving network implicitly rather than as an additional input. Our experimental results demonstrate that gaze-modulated dropout enhances the generalization capability of the network to unseen scenes. Prediction error in steering commands is reduced by 23.5% compared to uniform dropout. Running closed loop in the simulator, the gaze-modulated dropout net increased the average distance travelled between infractions by 58.5%. Consistent with these results, the gaze-modulated dropout net shows lower model uncertainty.
Tasks	Autonomous Driving, Imitation Learning
Published	2019-04-17
URL	https://arxiv.org/abs/1904.08377v2
PDF	https://arxiv.org/pdf/1904.08377v2.pdf
PWC	https://paperswithcode.com/paper/gaze-training-by-modulated-dropout-improves
Repo
Framework
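
The abstract does not spell out how the dropout is modulated, so the following is only a minimal sketch under stated assumptions: the drop probability falls linearly as the (normalised) gaze density rises, and the same spatial probabilities are shared across channels. The function name `gaze_modulated_dropout` and the parameter `max_drop` are hypothetical and are not taken from the paper.

```python
import numpy as np

def gaze_modulated_dropout(features, gaze_map, max_drop=0.5, rng=None):
    """Drop feature-map units with a probability that falls as gaze density rises.

    features: array of shape (C, H, W)
    gaze_map: array of shape (H, W), non-negative gaze density (assumed input)
    max_drop: drop probability where the gaze density is zero (assumption, < 1)
    """
    rng = np.random.default_rng() if rng is None else rng
    gaze_map = np.asarray(gaze_map, dtype=float)
    peak = gaze_map.max()
    # Normalise to [0, 1]; an all-zero map falls back to uniform dropout.
    norm = gaze_map / peak if peak > 0 else np.zeros_like(gaze_map)
    keep_prob = 1.0 - max_drop * (1.0 - norm)        # shape (H, W)
    mask = rng.random(features.shape) < keep_prob    # broadcasts over channels
    # Inverted-dropout scaling keeps the expected activation unchanged.
    return features * mask / keep_prob
```

With a uniform `gaze_map` of zeros this reduces to ordinary dropout at rate `max_drop`, which is the baseline the abstract compares against.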

Wasserstein Neural Processes


Title	Wasserstein Neural Processes
Authors	Andrew Carr, Jared Nielsen, David Wingate
Abstract	Neural Processes (NPs) are a class of models that learn a mapping from a context set of input-output pairs to a distribution over functions. They are traditionally trained using maximum likelihood with a KL divergence regularization term. We show that there are desirable classes of problems where NPs, with this loss, fail to learn any reasonable distribution. We also show that this drawback is solved by using approximations of Wasserstein distance which calculates optimal transport distances even for distributions of disjoint support. We give experimental justification for our method and demonstrate performance. These Wasserstein Neural Processes (WNPs) maintain all of the benefits of traditional NPs while being able to approximate a new class of function mappings.
Tasks
Published	2019-10-01
URL	https://arxiv.org/abs/1910.00668v2
PDF	https://arxiv.org/pdf/1910.00668v2.pdf
PWC	https://paperswithcode.com/paper/wasserstein-neural-processes
Repo
Framework
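
To illustrate the point about disjoint support, here is a small sketch comparing KL divergence with the exact one-dimensional Wasserstein-1 distance on toy data. It is not the approximation used by the authors (the abstract does not say which one they use); the helper names `wasserstein_1d` and `kl_divergence` are hypothetical.

```python
import math

def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # In one dimension the optimal transport plan matches sorted samples.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as probability lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total

# Point masses on a 3-bin grid: q_near and q_far are both disjoint from p.
p = [1.0, 0.0, 0.0]
q_near = [0.0, 1.0, 0.0]
q_far = [0.0, 0.0, 1.0]
print(kl_divergence(p, q_near), kl_divergence(p, q_far))
print(wasserstein_1d([0.0], [1.0]), wasserstein_1d([0.0], [2.0]))
```

KL is infinite in both cases and so cannot tell the nearer distribution from the farther one, whereas the Wasserstein distance grows with the separation.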

Metareasoning in Modular Software Systems: On-the-Fly Configuration using Reinforcement Learning with Rich Contextual Representations


Title	Metareasoning in Modular Software Systems: On-the-Fly Configuration using Reinforcement Learning with Rich Contextual Representations
Authors	Aditya Modi, Debadeepta Dey, Alekh Agarwal, Adith Swaminathan, Besmira Nushi, Sean Andrist, Eric Horvitz
Abstract	Assemblies of modular subsystems are being pressed into service to perform sensing, reasoning, and decision making in high-stakes, time-critical tasks in such areas as transportation, healthcare, and industrial automation. We address the opportunity to maximize the utility of an overall computing system by employing reinforcement learning to guide the configuration of the set of interacting modules that comprise the system. The challenge of doing system-wide optimization is a combinatorial problem. Local attempts to boost the performance of a specific module by modifying its configuration often lead to losses in overall utility of the system’s performance as the distribution of inputs to downstream modules changes drastically. We present metareasoning techniques which consider a rich representation of the input, monitor the state of the entire pipeline, and adjust the configuration of modules on-the-fly so as to maximize the utility of a system’s operation. We show significant improvement in both real-world and synthetic pipelines across a variety of reinforcement learning techniques.
Tasks	Decision Making
Published	2019-05-12
URL	https://arxiv.org/abs/1905.05179v1
PDF	https://arxiv.org/pdf/1905.05179v1.pdf
PWC	https://paperswithcode.com/paper/metareasoning-in-modular-software-systems-on
Repo
Framework

Cascading: Association Augmented Sequential Recommendation


Title	Cascading: Association Augmented Sequential Recommendation
Authors	Xu Chen, Kenan Cui, Ya Zhang, Yanfeng Wang
Abstract	Recently, recommendation according to sequential user behaviors has shown promising results in many application scenarios. Generally speaking, real-world sequential user behaviors usually reflect a hybrid of sequential influences and association relationships. However, most existing sequential recommendation methods mainly concentrate on sequential relationships while ignoring association relationships. In this paper, we propose a unified method that incorporates item association and sequential relationships for sequential recommendation. Specifically, we encode the item association as relations in an item co-occurrence graph and mine it through graph embedding by GCNs. Meanwhile, we model the sequential relationships through a widely used RNN-based sequential recommendation method named GRU4Rec. The two parts are connected into an end-to-end network in a cascading style, which guarantees that representations for item associations and sequential relationships are learned simultaneously and keeps the complexity of the learning process low. We perform extensive experiments on three widely used real-world datasets: TaoBao, Amazon Instant Video and Amazon Cell Phone and Accessories. Comprehensive results show that the proposed method outperforms several state-of-the-art methods. Furthermore, a qualitative analysis is also provided to encourage a better understanding of association relationships in sequential recommendation and to illustrate that the performance gain comes from item association.
Tasks	Graph Embedding
Published	2019-10-17
URL	https://arxiv.org/abs/1910.07792v1
PDF	https://arxiv.org/pdf/1910.07792v1.pdf
PWC	https://paperswithcode.com/paper/cascading-association-augmented-sequential
Repo
Framework

SLM Lab: A Comprehensive Benchmark and Modular Software Framework for Reproducible Deep Reinforcement Learning


Title	SLM Lab: A Comprehensive Benchmark and Modular Software Framework for Reproducible Deep Reinforcement Learning
Authors	Keng Wah Loon, Laura Graesser, Milan Cvitkovic
Abstract	We introduce SLM Lab, a software framework for reproducible reinforcement learning (RL) research. SLM Lab implements a number of popular RL algorithms, provides synchronous and asynchronous parallel experiment execution, hyperparameter search, and result analysis. RL algorithms in SLM Lab are implemented in a modular way such that differences in algorithm performance can be confidently ascribed to differences between algorithms, not between implementations. In this work we present the design choices behind SLM Lab and use it to produce a comprehensive single-codebase RL algorithm benchmark. In addition, as a consequence of SLM Lab’s modular design, we introduce and evaluate a discrete-action variant of the Soft Actor-Critic algorithm (Haarnoja et al., 2018) and a hybrid synchronous/asynchronous training method for RL agents.
Tasks
Published	2019-12-28
URL	https://arxiv.org/abs/1912.12482v1
PDF	https://arxiv.org/pdf/1912.12482v1.pdf
PWC	https://paperswithcode.com/paper/slm-lab-a-comprehensive-benchmark-and-modular-1
Repo
Framework

Deep Reinforcement Learning for Unmanned Aerial Vehicle-Assisted Vehicular Networks


Title	Deep Reinforcement Learning for Unmanned Aerial Vehicle-Assisted Vehicular Networks
Authors	Ming Zhu, Xiao-Yang Liu, Xiaodong Wang
Abstract	Unmanned aerial vehicles (UAVs) are envisioned to complement the 5G communication infrastructure in future smart cities. Hot spots easily appear in road intersections, where effective communication among vehicles is challenging. UAVs may serve as relays with the advantages of low price, easy deployment, line-of-sight links, and flexible mobility. In this paper, we study a UAV-assisted vehicular network where the UAV jointly adjusts its transmission control (power and channel) and 3D flight to maximize the total throughput. First, we formulate a Markov decision process (MDP) problem by modeling the mobility of the UAV/vehicles and the state transitions. Secondly, we solve the target problem using a deep reinforcement learning method, namely, the deep deterministic policy gradient (DDPG), and propose three solutions with different control objectives. Considering the energy consumption of 3D flight, we extend the proposed solutions to maximize the total throughput per energy unit by encouraging or discouraging the UAV’s mobility. To achieve this goal, the DDPG framework is modified. Thirdly, in a simplified model with small state space and action space, we verify the optimality of the proposed algorithms. Compared with two baseline schemes, we demonstrate the effectiveness of the proposed algorithms in a realistic model.
Tasks
Published	2019-06-12
URL	https://arxiv.org/abs/1906.05015v9
PDF	https://arxiv.org/pdf/1906.05015v9.pdf
PWC	https://paperswithcode.com/paper/deep-reinforcement-learning-for-unmanned
Repo
Framework

Enhancing the Extraction of Interpretable Information for Ischemic Stroke Imaging from Deep Neural Networks


Title	Enhancing the Extraction of Interpretable Information for Ischemic Stroke Imaging from Deep Neural Networks
Authors	Erico Tjoa, Guo Heng, Lu Yuhao, Cuntai Guan
Abstract	We implement a visual interpretability method, Layer-wise Relevance Propagation (LRP), on top of a 3D U-Net trained to perform lesion segmentation on the small dataset of multi-modal images provided by the ISLES 2017 competition. We demonstrate that LRP modifications could provide more sensible visual explanations to an otherwise highly noise-skewed saliency map. We also link the amplitude of modified signals to useful information content. High-amplitude localized signals appear to constitute the noise that undermines the interpretability capacity of LRP. Furthermore, a mathematical framework for possible analysis of function approximation is developed by analogy.
Tasks	Lesion Segmentation
Published	2019-11-19
URL	https://arxiv.org/abs/1911.08136v2
PDF	https://arxiv.org/pdf/1911.08136v2.pdf
PWC	https://paperswithcode.com/paper/enhancing-the-extraction-of-interpretable
Repo
Framework

CSPLib: Twenty Years On


Title	CSPLib: Twenty Years On
Authors	Ian Gent, Toby Walsh
Abstract	In 1999, we introduced CSPLib, a benchmark library for the constraints community. Our CP-1999 poster paper about CSPLib discussed the advantages and disadvantages of building such a library. Unlike some other domains such as theorem proving, or machine learning, representation was then and remains today a major issue in the success or failure to solve problems. Benchmarks in CSPLib are therefore specified in natural language as this allows users to find good representations for themselves. The community responded positively and CSPLib has become a valuable resource but, as we discuss here, we cannot rest.
Tasks	Automated Theorem Proving
Published	2019-09-30
URL	https://arxiv.org/abs/1909.13430v1
PDF	https://arxiv.org/pdf/1909.13430v1.pdf
PWC	https://paperswithcode.com/paper/csplib-twenty-years-on
Repo
Framework

Spectral Approximate Inference


Title	Spectral Approximate Inference
Authors	Sejun Park, Eunho Yang, Se-Young Yun, Jinwoo Shin
Abstract	Given a graphical model (GM), computing its partition function is the most essential inference task, but it is computationally intractable in general. To address the issue, iterative approximation algorithms exploring certain local structure/consistency of GM have been investigated as popular choices in practice. However, due to their local/iterative nature, they often output poor approximations or even fail to converge, e.g., in low-temperature regimes (hard instances of large parameters). To overcome the limitation, we propose a novel approach utilizing the global spectral feature of GM. Our contribution is two-fold: (a) we first propose a fully polynomial-time approximation scheme (FPTAS) for approximating the partition function of GM associated with a low-rank coupling matrix; (b) for general high-rank GMs, we design a spectral mean-field scheme utilizing (a) as a subroutine, where it approximates a high-rank GM into a product of rank-1 GMs for an efficient approximation of the partition function. The proposed algorithm is more robust in its running time and accuracy than prior methods, i.e., neither suffers from the convergence issue nor depends on hard local structures, as demonstrated in our experiments.
Tasks
Published	2019-05-14
URL	https://arxiv.org/abs/1905.05348v1
PDF	https://arxiv.org/pdf/1905.05348v1.pdf
PWC	https://paperswithcode.com/paper/spectral-approximate-inference
Repo
Framework
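
For context on why the partition function is intractable in general, here is a brute-force sketch that enumerates all 2^n states. It assumes a pairwise binary model of the form p(x) ∝ exp(bias·x + ½ xᵀ A x) with x in {-1, +1}^n, which is a common convention but is not confirmed by the abstract; the function name `partition_function` is hypothetical, and this is the naive baseline, not the spectral scheme the paper proposes.

```python
import itertools
import math

def partition_function(coupling, bias):
    """Brute-force Z = sum over x in {-1,+1}^n of exp(bias.x + 0.5 * x^T A x).

    coupling: n x n nested list (the coupling matrix A, assumed symmetric)
    bias:     length-n list
    Cost grows as 2^n, so this is only usable for very small n.
    """
    n = len(bias)
    z = 0.0
    for x in itertools.product((-1, 1), repeat=n):
        linear = sum(b * xi for b, xi in zip(bias, x))
        quadratic = sum(
            coupling[i][j] * x[i] * x[j] for i in range(n) for j in range(n)
        )
        z += math.exp(linear + 0.5 * quadratic)
    return z

# Two spins with coupling J and no bias: Z = 4 * cosh(J).
J = 0.5
print(partition_function([[0.0, J], [J, 0.0]], [0.0, 0.0]))
```

Each extra variable doubles the number of terms in the sum, which is the cost that approximation schemes aim to avoid.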

A Joint Convolutional Neural Networks and Context Transfer for Street Scenes Labeling


Title	A Joint Convolutional Neural Networks and Context Transfer for Street Scenes Labeling
Authors	Qi Wang, Junyu Gao, Yuan Yuan
Abstract	Street scene understanding is an essential task for autonomous driving. One important step in this direction is scene labeling, which annotates each pixel in the images with a correct class label. Although many approaches have been developed, there are still some weak points. Firstly, many methods are based on hand-crafted features whose image representation ability is limited. Secondly, they cannot label foreground objects accurately due to dataset bias. Thirdly, in the refinement stage, the traditional Markov Random Field (MRF) inference is prone to over-smoothing. To address these problems, this paper proposes a joint method of priori convolutional neural networks at superpixel level (called “priori s-CNNs”) and soft restricted context transfer. Our contributions are threefold: (1) A priori s-CNNs model that learns priori location information at superpixel level is proposed to describe various objects discriminatingly; (2) A hierarchical data augmentation method is presented to alleviate dataset bias in the priori s-CNNs training stage, which improves foreground object labeling significantly; (3) A soft restricted MRF energy function is defined to improve the priori s-CNNs model’s labeling performance and reduce over-smoothing at the same time. The proposed approach is verified on the CamVid dataset (11 classes) and the SIFT Flow Street dataset (16 classes) and achieves competitive performance.
Tasks	Autonomous Driving, Data Augmentation, Scene Labeling, Scene Understanding
Published	2019-05-05
URL	https://arxiv.org/abs/1905.01574v1
PDF	https://arxiv.org/pdf/1905.01574v1.pdf
PWC	https://paperswithcode.com/paper/a-joint-convolutional-neural-networks-and
Repo
Framework

Counterexample-Guided Synthesis of Perception Models and Control


Title	Counterexample-Guided Synthesis of Perception Models and Control
Authors	Shromona Ghosh, Hadi Ravanbakhsh, Sanjit A. Seshia
Abstract	We consider the problem of synthesizing safe and robust controllers for real world robotic systems like autonomous vehicles, which rely on complex perception modules. We propose a counterexample-guided synthesis framework which iteratively learns perception models that enable finding safe control policies. We use counterexamples to extract information relevant for modeling the errors in perception modules. Such models then can be used to synthesize controllers robust to errors in perception. If the resulting policy is not safe, we gather new counterexamples. By repeating the process, we eventually find a controller which can keep the system safe even when there is perception failure. Finally, we show that our framework computes robust controllers for autonomous vehicles in two different simulated scenarios: (i) lane keeping, and (ii) automatic braking.
Tasks	Autonomous Vehicles
Published	2019-11-04
URL	https://arxiv.org/abs/1911.01523v2
PDF	https://arxiv.org/pdf/1911.01523v2.pdf
PWC	https://paperswithcode.com/paper/counterexample-guided-synthesis-of-perception
Repo
Framework

Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation


Title	Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation
Authors	Nima Pourdamghani, Nada Aldarrab, Marjan Ghazvininejad, Kevin Knight, Jonathan May
Abstract	Given a rough, word-by-word gloss of a source language sentence, target language natives can uncover the latent, fully-fluent rendering of the translation. In this work we explore this intuition by breaking translation into a two step process: generating a rough gloss by means of a dictionary and then “translating” the resulting pseudo-translation, or “Translationese”, into a fully fluent translation. We build our Translationese decoder once from a mish-mash of parallel data that has the target language in common and then can build dictionaries on demand using unsupervised techniques, resulting in rapidly generated unsupervised neural MT systems for many source languages. We apply this process to 14 test languages, obtaining better or comparable translation results on high-resource languages than previously published unsupervised MT studies, and obtaining good quality results for low-resource languages that have never been used in an unsupervised MT scenario.
Tasks	Machine Translation, Unsupervised Machine Translation
Published	2019-06-11
URL	https://arxiv.org/abs/1906.05683v1
PDF	https://arxiv.org/pdf/1906.05683v1.pdf
PWC	https://paperswithcode.com/paper/translating-translationese-a-two-step
Repo
Framework

Associative Embedding for Game-Agnostic Team Discrimination


Title	Associative Embedding for Game-Agnostic Team Discrimination
Authors	Maxime Istasse, Julien Moreau, Christophe De Vleeschouwer
Abstract	Assigning team labels to players in a sports game is not a trivial task when no prior is known about the visual appearance of each team. Our work builds on a Convolutional Neural Network (CNN) to learn a descriptor, namely a pixel-wise embedding vector, that is similar for pixels depicting players from the same team, and dissimilar when pixels correspond to distinct teams. The advantage of this idea is that no per-game learning is needed, allowing efficient team discrimination as soon as the game starts. In principle, the approach follows the associative embedding framework introduced in arXiv:1611.05424 to differentiate instances of objects. Our work is however different in that it derives the embeddings from a lightweight segmentation network and, more fundamentally, because it considers the assignment of the same embedding to unconnected pixels, as required by pixels of distinct players from the same team. Excellent results, both in terms of team labelling accuracy and generalization to new games/arenas, have been achieved on panoramic views of a large variety of basketball games involving player interactions and occlusions. This makes our method a good candidate to integrate team separation in many CNN-based sport analytics pipelines.
Tasks
Published	2019-07-01
URL	https://arxiv.org/abs/1907.01058v1
PDF	https://arxiv.org/pdf/1907.01058v1.pdf
PWC	https://paperswithcode.com/paper/associative-embedding-for-game-agnostic-team
Repo
Framework

X-Armed Bandits: Optimizing Quantiles, CVaR and Other Risks


Title	X-Armed Bandits: Optimizing Quantiles, CVaR and Other Risks
Authors	Léonard Torossian, Aurélien Garivier, Victor Picheny
Abstract	We propose and analyze StoROO, an algorithm for risk optimization on stochastic black-box functions derived from StoOO. Motivated by risk-averse decision making fields like agriculture, medicine, biology or finance, we do not focus on the mean payoff but on generic functionals of the return distribution. We provide a generic regret analysis of StoROO and illustrate its applicability with two examples: the optimization of quantiles and CVaR. Inspired by the bandit literature and black-box mean optimizers, StoROO relies on the possibility to construct confidence intervals for the targeted functional based on random-size samples. We detail their construction in the case of quantiles, providing tight bounds based on Kullback-Leibler divergence. We finally present numerical experiments that show a dramatic impact of tight bounds for the optimization of quantiles and CVaR.
Tasks	Decision Making
Published	2019-04-17
URL	https://arxiv.org/abs/1904.08205v3
PDF	https://arxiv.org/pdf/1904.08205v3.pdf
PWC	https://paperswithcode.com/paper/x-armed-bandits-optimizing-quantiles-and
Repo
Framework
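
The two risk functionals named in the abstract can be estimated from samples as in the sketch below. It uses one common convention (CVaR as the mean of the worst, i.e. lowest, alpha-fraction of payoffs), which is an assumption and may differ from the paper's definition; the function names are hypothetical, and the code shows only the plug-in estimators, not StoROO or its confidence intervals.

```python
import math

def empirical_quantile(samples, alpha):
    """Smallest sample whose empirical CDF value is at least alpha (0 < alpha <= 1)."""
    ordered = sorted(samples)
    k = max(1, math.ceil(alpha * len(ordered)))
    return ordered[k - 1]

def empirical_cvar(samples, alpha):
    """Mean of the lowest ceil(alpha * n) payoffs (lower-tail convention, assumed)."""
    ordered = sorted(samples)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k

payoffs = [7, 2, 9, 4, 1, 10, 3, 8, 6, 5]
print(empirical_quantile(payoffs, 0.5))  # median-level quantile
print(empirical_cvar(payoffs, 0.2))      # average of the two worst payoffs
```

Unlike the mean, both quantities depend only on part of the return distribution, which is why the mean-based confidence intervals used by standard bandit algorithms do not carry over directly.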

Hearing your touch: A new acoustic side channel on smartphones


Title	Hearing your touch: A new acoustic side channel on smartphones
Authors	Ilia Shumailov, Laurent Simon, Jeff Yan, Ross Anderson
Abstract	We present the first acoustic side-channel attack that recovers what users type on the virtual keyboard of their touch-screen smartphone or tablet. When a user taps the screen with a finger, the tap generates a sound wave that propagates on the screen surface and in the air. We found the device’s microphone(s) can recover this wave and “hear” the finger’s touch, and the wave’s distortions are characteristic of the tap’s location on the screen. Hence, by recording audio through the built-in microphone(s), a malicious app can infer text as the user enters it on their device. We evaluate the effectiveness of the attack with 45 participants in a real-world environment on an Android tablet and an Android smartphone. For the tablet, we recover 61% of 200 4-digit PIN-codes within 20 attempts, even if the model is not trained with the victim’s data. For the smartphone, we recover 9 words of size 7–13 letters with 50 attempts in a common side-channel attack benchmark. Our results suggest that it is not always sufficient to rely on isolation mechanisms such as TrustZone to protect user input. We propose and discuss hardware, operating-system and application-level mechanisms to block this attack more effectively. Mobile devices may need a richer capability model, a more user-friendly notification system for sensor usage and a more thorough evaluation of the information leaked by the underlying hardware.
Tasks
Published	2019-03-26
URL	http://arxiv.org/abs/1903.11137v1
PDF	http://arxiv.org/pdf/1903.11137v1.pdf
PWC	https://paperswithcode.com/paper/hearing-your-touch-a-new-acoustic-side
Repo
Framework