Paper Group AWR 296
Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks
Title | Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks |
Authors | Yuenan Hou, Zheng Ma, Chunxiao Liu, Chen Change Loy |
Abstract | The training of many existing end-to-end steering angle prediction models heavily relies on steering angles as the supervisory signal. Without learning from much richer contexts, these methods are susceptible to the presence of sharp road curves, challenging traffic conditions, strong shadows, and severe lighting changes. In this paper, we considerably improve the accuracy and robustness of predictions through heterogeneous auxiliary networks feature mimicking, a new and effective training method that provides us with much richer contextual signals apart from steering direction. Specifically, we train our steering angle predictive model by distilling multi-layer knowledge from multiple heterogeneous auxiliary networks that perform related but different tasks, e.g., image segmentation or optical flow estimation. As opposed to multi-task learning, our method does not require expensive annotations of related tasks on the target set. This is made possible by applying contemporary off-the-shelf networks on the target set and mimicking their features in different layers after transformation. The auxiliary networks are discarded after training without affecting the runtime efficiency of our model. Our approach achieves a new state-of-the-art on Udacity and Comma.ai, outperforming the previous best by a large margin of 12.8% and 52.1%, respectively. Encouraging results are also shown on Berkeley Deep Drive (BDD) dataset. |
Tasks | Multi-Task Learning, Optical Flow Estimation, Semantic Segmentation, Steering Control |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.02759v1 |
PDF | http://arxiv.org/pdf/1811.02759v1.pdf
PWC | https://paperswithcode.com/paper/learning-to-steer-by-mimicking-features-from |
Repo | https://github.com/aj9011/Car-Speed-Prediction |
Framework | pytorch |
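
A rough PyTorch sketch of the feature-mimicking loss described in the abstract above: the student's intermediate features are passed through a learned transformation and matched against features from a frozen, off-the-shelf auxiliary network. Layer choices, channel sizes, and the loss weights in the commented training step are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureMimicLoss(nn.Module):
    """Distill multi-layer knowledge from a frozen auxiliary network.

    For each chosen layer, a 1x1 conv adapts the student feature map to the
    teacher's channel count before an L2 matching loss is applied.
    Channel counts and layer choices are illustrative assumptions.
    """
    def __init__(self, student_channels, teacher_channels):
        super().__init__()
        self.adapters = nn.ModuleList(
            [nn.Conv2d(s, t, kernel_size=1) for s, t in zip(student_channels, teacher_channels)]
        )

    def forward(self, student_feats, teacher_feats):
        loss = 0.0
        for adapter, fs, ft in zip(self.adapters, student_feats, teacher_feats):
            fs = adapter(fs)
            # Resize in case the auxiliary network works at a different resolution.
            if fs.shape[-2:] != ft.shape[-2:]:
                fs = F.interpolate(fs, size=ft.shape[-2:], mode="bilinear", align_corners=False)
            loss = loss + F.mse_loss(fs, ft.detach())  # teacher features carry no gradient
        return loss

# Hypothetical training step: steering loss plus mimicking losses from a
# segmentation network and an optical-flow network (weights are assumptions).
# total = F.l1_loss(pred_angle, gt_angle) \
#       + 0.1 * mimic_seg(student_feats, seg_feats) \
#       + 0.1 * mimic_flow(student_feats, flow_feats)
```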
The N-Tuple Bandit Evolutionary Algorithm for Game Agent Optimisation
Title | The N-Tuple Bandit Evolutionary Algorithm for Game Agent Optimisation |
Authors | Simon M Lucas, Jialin Liu, Diego Perez-Liebana |
Abstract | This paper describes the N-Tuple Bandit Evolutionary Algorithm (NTBEA), an optimisation algorithm developed for noisy and expensive discrete (combinatorial) optimisation problems. The algorithm is applied to two game-based hyper-parameter optimisation problems. The N-Tuple system directly models the statistics, approximating the fitness and number of evaluations of each modelled combination of parameters. The model is simple, efficient and informative. Results show that the NTBEA significantly outperforms grid search and an estimation of distribution algorithm. |
Tasks | |
Published | 2018-02-16 |
URL | http://arxiv.org/abs/1802.05991v2 |
PDF | http://arxiv.org/pdf/1802.05991v2.pdf
PWC | https://paperswithcode.com/paper/the-n-tuple-bandit-evolutionary-algorithm-for |
Repo | https://github.com/SimonLucas/KotlinGamesJS |
Framework | none |
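
A minimal sketch of the NTBEA loop, assuming a simplified statistical model that only tracks 1-tuples (single parameters); the full algorithm also keeps statistics for higher-order tuples. The exploration constant and helper names are assumptions.

```python
import math
import random

class OneTupleModel:
    """Bandit statistics for each (parameter index, value) pair."""
    def __init__(self, n_params, n_values):
        self.sum = [[0.0] * n_values for _ in range(n_params)]
        self.count = [[0] * n_values for _ in range(n_params)]
        self.total = 0

    def update(self, solution, fitness):
        self.total += 1
        for i, v in enumerate(solution):
            self.sum[i][v] += fitness
            self.count[i][v] += 1

    def ucb(self, solution, k=2.0, eps=1e-6):
        # Average of the per-tuple mean fitness plus an exploration bonus for
        # rarely tried values (a simplified form of the NTBEA estimate).
        score = 0.0
        for i, v in enumerate(solution):
            n = self.count[i][v]
            mean = self.sum[i][v] / n if n else 0.0
            score += mean + k * math.sqrt(math.log(self.total + 1) / (n + eps))
        return score / len(solution)

def ntbea(fitness_fn, n_values, n_params, iterations=200, neighbours=50, mutate_p=0.3):
    model = OneTupleModel(n_params, n_values)
    current = [random.randrange(n_values) for _ in range(n_params)]
    for _ in range(iterations):
        # One noisy evaluation per iteration keeps the method cheap.
        model.update(current, fitness_fn(current))
        # Generate mutated neighbours and move to the most promising one.
        candidates = [[random.randrange(n_values) if random.random() < mutate_p else v
                       for v in current] for _ in range(neighbours)]
        current = max(candidates, key=model.ucb)
    return current

# Example: tune 4 discrete hyper-parameters with 5 values each under a noisy toy fitness.
best = ntbea(lambda s: sum(s) + random.gauss(0, 1), n_values=5, n_params=4)
```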
Bayesian Coreset Construction via Greedy Iterative Geodesic Ascent
Title | Bayesian Coreset Construction via Greedy Iterative Geodesic Ascent |
Authors | Trevor Campbell, Tamara Broderick |
Abstract | Coherent uncertainty quantification is a key strength of Bayesian methods. But modern algorithms for approximate Bayesian posterior inference often sacrifice accurate posterior uncertainty estimation in the pursuit of scalability. This work shows that previous Bayesian coreset construction algorithms—which build a small, weighted subset of the data that approximates the full dataset—are no exception. We demonstrate that these algorithms scale the coreset log-likelihood suboptimally, resulting in underestimated posterior uncertainty. To address this shortcoming, we develop greedy iterative geodesic ascent (GIGA), a novel algorithm for Bayesian coreset construction that scales the coreset log-likelihood optimally. GIGA provides geometric decay in posterior approximation error as a function of coreset size, and maintains the fast running time of its predecessors. The paper concludes with validation of GIGA on both synthetic and real datasets, demonstrating that it reduces posterior approximation error by orders of magnitude compared with previous coreset constructions. |
Tasks | |
Published | 2018-02-05 |
URL | http://arxiv.org/abs/1802.01737v2 |
PDF | http://arxiv.org/pdf/1802.01737v2.pdf
PWC | https://paperswithcode.com/paper/bayesian-coreset-construction-via-greedy |
Repo | https://github.com/trevorcampbell/bayesian-coresets |
Framework | none |
Zero-shot User Intent Detection via Capsule Neural Networks
Title | Zero-shot User Intent Detection via Capsule Neural Networks |
Authors | Congying Xia, Chenwei Zhang, Xiaohui Yan, Yi Chang, Philip S. Yu |
Abstract | User intent detection plays a critical role in question-answering and dialog systems. Most previous works treat intent detection as a classification problem where utterances are labeled with predefined intents. However, it is labor-intensive and time-consuming to label users’ utterances, as intents are diversely expressed and novel intents will continually be involved. Instead, we study the zero-shot intent detection problem, which aims to detect emerging user intents where no labeled utterances are currently available. We propose two capsule-based architectures: INTENTCAPSNET that extracts semantic features from utterances and aggregates them to discriminate existing intents, and INTENTCAPSNET-ZSL which gives INTENTCAPSNET the zero-shot learning ability to discriminate emerging intents via knowledge transfer from existing intents. Experiments on two real-world datasets show that our model not only can better discriminate diversely expressed existing intents, but is also able to discriminate emerging intents when no labeled utterances are available. |
Tasks | Intent Detection, Question Answering, Transfer Learning, Zero-Shot Learning |
Published | 2018-09-02 |
URL | http://arxiv.org/abs/1809.00385v1 |
PDF | http://arxiv.org/pdf/1809.00385v1.pdf
PWC | https://paperswithcode.com/paper/zero-shot-user-intent-detection-via-capsule |
Repo | https://github.com/joel-huang/zeroshot-capsnet-pytorch |
Framework | pytorch |
Differentiable Learning-to-Normalize via Switchable Normalization
Title | Differentiable Learning-to-Normalize via Switchable Normalization |
Authors | Ping Luo, Jiamin Ren, Zhanglin Peng, Ruimao Zhang, Jingyu Li |
Abstract | We address a learning-to-normalize problem by proposing Switchable Normalization (SN), which learns to select different normalizers for different normalization layers of a deep neural network. SN employs three distinct scopes to compute statistics (means and variances) including a channel, a layer, and a minibatch. SN switches between them by learning their importance weights in an end-to-end manner. It has several good properties. First, it adapts to various network architectures and tasks (see Fig.1). Second, it is robust to a wide range of batch sizes, maintaining high performance even when a small minibatch is presented (e.g. 2 images/GPU). Third, SN does not have a sensitive hyper-parameter, unlike group normalization that searches the number of groups as a hyper-parameter. Without bells and whistles, SN outperforms its counterparts on various challenging benchmarks, such as ImageNet, COCO, CityScapes, ADE20K, and Kinetics. Analyses of SN are also presented. We hope SN will help ease the usage and understanding of normalization techniques in deep learning. The code of SN has been made available at https://github.com/switchablenorms/. |
Tasks | |
Published | 2018-06-28 |
URL | http://arxiv.org/abs/1806.10779v5 |
PDF | http://arxiv.org/pdf/1806.10779v5.pdf
PWC | https://paperswithcode.com/paper/differentiable-learning-to-normalize-via |
Repo | https://github.com/switchablenorms/Switchable-Normalization |
Framework | pytorch |
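
The switching mechanism described above (instance, layer, and batch statistics mixed by learned softmax weights) can be sketched in PyTorch as below. This is a simplified reading: the released implementation also keeps separate importance weights for means and variances, and running statistics for inference.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchableNorm2d(nn.Module):
    """Simplified Switchable Normalization for NCHW feature maps."""
    def __init__(self, num_channels, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(1, num_channels, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.mix = nn.Parameter(torch.zeros(3))  # logits over {IN, LN, BN}

    def forward(self, x):
        # Instance-norm statistics: per sample, per channel.
        mean_in = x.mean(dim=(2, 3), keepdim=True)
        var_in = x.var(dim=(2, 3), keepdim=True, unbiased=False)
        # Layer-norm statistics: per sample, over all channels and positions.
        mean_ln = x.mean(dim=(1, 2, 3), keepdim=True)
        var_ln = x.var(dim=(1, 2, 3), keepdim=True, unbiased=False)
        # Batch-norm statistics: per channel, over the whole minibatch.
        mean_bn = x.mean(dim=(0, 2, 3), keepdim=True)
        var_bn = x.var(dim=(0, 2, 3), keepdim=True, unbiased=False)

        # Learned importance weights decide how the three scopes are mixed.
        w = F.softmax(self.mix, dim=0)
        mean = w[0] * mean_in + w[1] * mean_ln + w[2] * mean_bn
        var = w[0] * var_in + w[1] * var_ln + w[2] * var_bn
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        return x_hat * self.weight + self.bias

# x = torch.randn(8, 64, 32, 32); y = SwitchableNorm2d(64)(x)
```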
A Zero-Shot Framework for Sketch-based Image Retrieval
Title | A Zero-Shot Framework for Sketch-based Image Retrieval |
Authors | Sasi Kiran Yelamarthi, Shiva Krishna Reddy, Ashish Mishra, Anurag Mittal |
Abstract | Sketch-based image retrieval (SBIR) is the task of retrieving images from a natural image database that correspond to a given hand-drawn sketch. Ideally, an SBIR model should learn to associate components in the sketch (say, feet, tail, etc.) with the corresponding components in the image having similar shape characteristics. However, current evaluation methods simply focus only on coarse-grained evaluation where the focus is on retrieving images which belong to the same class as the sketch but not necessarily having the same shape characteristics as in the sketch. As a result, existing methods simply learn to associate sketches with classes seen during training and hence fail to generalize to unseen classes. In this paper, we propose a new benchmark for zero-shot SBIR where the model is evaluated in novel classes that are not seen during training. We show through extensive experiments that existing models for SBIR that are trained in a discriminative setting learn only class specific mappings and fail to generalize to the proposed zero-shot setting. To circumvent this, we propose a generative approach for the SBIR task by proposing deep conditional generative models that take the sketch as an input and fill the missing information stochastically. Experiments on this new benchmark created from the “Sketchy” dataset, which is a large-scale database of sketch-photo pairs demonstrate that the performance of these generative models is significantly better than several state-of-the-art approaches in the proposed zero-shot framework of the coarse-grained SBIR task. |
Tasks | Image Retrieval, Sketch-Based Image Retrieval |
Published | 2018-07-31 |
URL | http://arxiv.org/abs/1807.11724v1 |
PDF | http://arxiv.org/pdf/1807.11724v1.pdf
PWC | https://paperswithcode.com/paper/a-zero-shot-framework-for-sketch-based-image |
Repo | https://github.com/ShivaKrishnaM/ZS-SBIR |
Framework | tf |
Talking to myself: self-dialogues as data for conversational agents
Title | Talking to myself: self-dialogues as data for conversational agents |
Authors | Joachim Fainberg, Ben Krause, Mihai Dobre, Marco Damonte, Emmanuel Kahembwe, Daniel Duma, Bonnie Webber, Federico Fancellu |
Abstract | Conversational agents are gaining popularity with the increasing ubiquity of smart devices. However, training agents in a data driven manner is challenging due to a lack of suitable corpora. This paper presents a novel method for gathering topical, unstructured conversational data in an efficient way: self-dialogues through crowd-sourcing. Alongside this paper, we include a corpus of 3.6 million words across 23 topics. We argue the utility of the corpus by comparing self-dialogues with standard two-party conversations as well as data from other corpora. |
Tasks | |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06641v2 |
PDF | http://arxiv.org/pdf/1809.06641v2.pdf
PWC | https://paperswithcode.com/paper/talking-to-myself-self-dialogues-as-data-for |
Repo | https://github.com/jfainberg/self_dialogue_corpus |
Framework | none |
An Adaptive Conversational Bot Framework
Title | An Adaptive Conversational Bot Framework |
Authors | Isak Czeresnia Etinger |
Abstract | How can we enable users to heavily specify criteria for database queries in a user-friendly way? This paper describes a general framework of a conversational bot that extracts meaningful information from user’s sentences, that asks subsequent questions to complete missing information, and that adjusts its questions and information-extraction parameters for later conversations depending on users’ behavior. Additionally, we provide a comparison of existing tools and give novel techniques to implement such framework. Finally, we exemplify the framework with a bot to query movies in a database, whose code is available for Microsoft employees. |
Tasks | |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.09890v1 |
PDF | http://arxiv.org/pdf/1808.09890v1.pdf
PWC | https://paperswithcode.com/paper/an-adaptive-conversational-bot-framework |
Repo | https://github.com/ICEtinger/AdaptiveConversationBotFramework |
Framework | none |
MGCN: Semi-supervised Classification in Multi-layer Graphs with Graph Convolutional Networks
Title | MGCN: Semi-supervised Classification in Multi-layer Graphs with Graph Convolutional Networks |
Authors | Mahsa Ghorbani, Mahdieh Soleymani Baghshah, Hamid R. Rabiee |
Abstract | Graph embedding is an important approach for graph analysis tasks such as node classification and link prediction. The goal of graph embedding is to find a low-dimensional representation of graph nodes that preserves the graph information. Recent methods like Graph Convolutional Network (GCN) try to consider node attributes (if available) besides node relations and learn node embeddings for unsupervised and semi-supervised tasks on graphs. On the other hand, multi-layer graph analysis has received attention recently. However, the existing methods for multi-layer graph embedding cannot incorporate all available information (like node attributes). Moreover, most of them consider either the type of nodes or the type of edges, and they do not treat within- and between-layer edges differently. In this paper, we propose a method called MGCN that utilizes the GCN for multi-layer graphs. MGCN embeds nodes of multi-layer graphs using both within- and between-layer relations and node attributes. We evaluate our method on the semi-supervised node classification task. Experimental results demonstrate the superiority of the proposed method to other multi-layer and single-layer competitors and also show the positive effect of using cross-layer edges. |
Tasks | Graph Embedding, Link Prediction, Network Embedding, Node Classification |
Published | 2018-11-21 |
URL | https://arxiv.org/abs/1811.08800v3 |
PDF | https://arxiv.org/pdf/1811.08800v3.pdf
PWC | https://paperswithcode.com/paper/multi-layered-graph-embedding-with-graph |
Repo | https://github.com/mahsa91/py_mgcn |
Framework | pytorch |
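
One way to read the abstract's within-layer and between-layer propagation is as a GCN-style layer for a two-layer (multiplex) graph, sketched below. The weight sharing and aggregation details are assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class MultiLayerGCNConv(nn.Module):
    """Illustrative GCN-style propagation for one layer of a multiplex graph.

    Each graph layer aggregates its own (within-layer) neighbours and the
    nodes it is linked to in the other layer (between-layer edges), with
    separate weight matrices for the two relation types.
    """
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.w_within = nn.Linear(in_dim, out_dim, bias=False)
        self.w_between = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, h_self, h_other, a_within, a_between):
        # a_within: normalized adjacency of this graph layer (n_self x n_self)
        # a_between: normalized cross-layer adjacency (n_self x n_other)
        within = a_within @ self.w_within(h_self)
        between = a_between @ self.w_between(h_other)
        return torch.relu(within + between)

# Hypothetical usage for the two layers of a multiplex graph:
# h1 = conv1(h1, h2, a11, a12)
# h2 = conv2(h2, h1, a22, a21)
```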
Implicit Quantile Networks for Distributional Reinforcement Learning
Title | Implicit Quantile Networks for Distributional Reinforcement Learning |
Authors | Will Dabney, Georg Ostrovski, David Silver, Rémi Munos |
Abstract | In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN. We achieve this by using quantile regression to approximate the full quantile function for the state-action return distribution. By reparameterizing a distribution over the sample space, this yields an implicitly defined return distribution and gives rise to a large class of risk-sensitive policies. We demonstrate improved performance on the 57 Atari 2600 games in the ALE, and use our algorithm’s implicitly defined distributions to study the effects of risk-sensitive policies in Atari games. |
Tasks | Atari Games, Distributional Reinforcement Learning |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.06923v1 |
PDF | http://arxiv.org/pdf/1806.06923v1.pdf
PWC | https://paperswithcode.com/paper/implicit-quantile-networks-for-distributional |
Repo | https://github.com/ACampero/dopamine |
Framework | tf |
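
A sketch of the quantile reparameterization at the core of IQN: sampled fractions tau are embedded with a cosine basis, passed through a linear layer, and multiplied elementwise into the state embedding before the output head. Layer sizes and the number of sampled quantiles are illustrative; the quantile-Huber training loss is omitted.

```python
import math
import torch
import torch.nn as nn

class ImplicitQuantileHead(nn.Module):
    """Maps a state embedding and sampled quantile fractions tau to
    per-quantile action values, following the cosine-embedding recipe
    described in the abstract. Layer sizes are assumptions."""
    def __init__(self, feature_dim, n_actions, n_cos=64):
        super().__init__()
        self.n_cos = n_cos
        self.tau_embed = nn.Linear(n_cos, feature_dim)
        self.out = nn.Sequential(
            nn.Linear(feature_dim, 512), nn.ReLU(), nn.Linear(512, n_actions)
        )

    def forward(self, state_feat, n_tau=8):
        batch = state_feat.size(0)
        # Sample quantile fractions uniformly; the policy's risk profile can be
        # changed by sampling tau from a distorted distribution instead.
        tau = torch.rand(batch, n_tau, 1, device=state_feat.device)
        # Cosine basis: cos(pi * i * tau) for i = 0 .. n_cos - 1.
        i = torch.arange(self.n_cos, device=state_feat.device).float()
        phi = torch.relu(self.tau_embed(torch.cos(math.pi * i * tau)))
        # Elementwise (Hadamard) interaction of state and quantile embeddings.
        x = state_feat.unsqueeze(1) * phi      # (batch, n_tau, feature_dim)
        return self.out(x), tau                # per-quantile action values
```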
MGGAN: Solving Mode Collapse using Manifold Guided Training
Title | MGGAN: Solving Mode Collapse using Manifold Guided Training |
Authors | Duhyeon Bang, Hyunjung Shim |
Abstract | Mode collapse is a critical problem in training generative adversarial networks. To alleviate mode collapse, several recent studies introduce new objective functions, network architectures or alternative training schemes. However, their achievement is often the result of sacrificing image quality. In this paper, we propose a new algorithm, namely a manifold guided generative adversarial network (MGGAN), which leverages a guidance network on an existing GAN architecture to induce the generator to learn all modes of the data distribution. Based on extensive evaluations, we show that our algorithm resolves mode collapse without losing image quality. In particular, we demonstrate that our algorithm is easily extendable to various existing GANs. Experimental analysis justifies that the proposed algorithm is an effective and efficient tool for training GANs. |
Tasks | |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04391v1 |
PDF | http://arxiv.org/pdf/1804.04391v1.pdf
PWC | https://paperswithcode.com/paper/mggan-solving-mode-collapse-using-manifold |
Repo | https://github.com/QuickSolverKyle/Tensorflow-MyGANs |
Framework | tf |
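
A heavily simplified reading of the guidance idea: a frozen, pretrained encoder maps images into a manifold space, and a second discriminator operates on those embeddings, so the generator is pushed to cover the data manifold as well as fool the image discriminator. The objective below is an assumption for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def generator_step(G, D_image, D_manifold, E, z, opt_g):
    """One illustrative generator update with a manifold-guidance branch.

    E is a frozen, pretrained encoder (e.g. from an autoencoder) mapping
    images into a manifold space; D_manifold discriminates real vs. generated
    embeddings in that space. E's parameters are assumed to have
    requires_grad=False, so gradients flow through it only into G.
    """
    fake = G(z)
    logits_img = D_image(fake)
    logits_man = D_manifold(E(fake))
    # Non-saturating GAN loss on both the image branch and the manifold branch.
    loss = (F.binary_cross_entropy_with_logits(logits_img, torch.ones_like(logits_img))
            + F.binary_cross_entropy_with_logits(logits_man, torch.ones_like(logits_man)))
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    return loss.item()
```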
Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study
Title | Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study |
Authors | Tao Ge, Furu Wei, Ming Zhou |
Abstract | Neural sequence-to-sequence (seq2seq) approaches have proven to be successful in grammatical error correction (GEC). Based on the seq2seq framework, we propose a novel fluency boost learning and inference mechanism. Fluency boosting learning generates diverse error-corrected sentence pairs during training, enabling the error correction model to learn how to improve a sentence’s fluency from more instances, while fluency boosting inference allows the model to correct a sentence incrementally with multiple inference steps. Combining fluency boost learning and inference with convolutional seq2seq models, our approach achieves the state-of-the-art performance: 75.72 (F_{0.5}) on CoNLL-2014 10 annotation dataset and 62.42 (GLEU) on JFLEG test set respectively, becoming the first GEC system that reaches human-level performance (72.58 for CoNLL and 62.37 for JFLEG) on both of the benchmarks. |
Tasks | Grammatical Error Correction |
Published | 2018-07-03 |
URL | http://arxiv.org/abs/1807.01270v5 |
PDF | http://arxiv.org/pdf/1807.01270v5.pdf
PWC | https://paperswithcode.com/paper/reaching-human-level-performance-in-automatic |
Repo | https://github.com/getao/human-performance-gec |
Framework | none |
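
The fluency the abstract refers to is scored in the paper from a language model's average per-token negative log-probability, as 1 / (1 + H(x)). A small sketch, with a hypothetical lm_log_prob callable standing in for a trained language model:

```python
def fluency(tokens, lm_log_prob):
    """Fluency score in (0, 1]: 1 / (1 + H(x)), where H(x) is the language
    model's average negative log-probability per token. `lm_log_prob(prefix,
    token)` is a hypothetical callable returning log P(token | prefix)."""
    nll = -sum(lm_log_prob(tokens[:i], tokens[i]) for i in range(len(tokens)))
    h = nll / max(len(tokens), 1)
    return 1.0 / (1.0 + h)

# During fluency boost learning, a generated correction is only used as a new
# training pair when it is more fluent than its source; at inference, the model
# keeps re-correcting its own output while the fluency score keeps rising.
```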
Neural Modular Control for Embodied Question Answering
Title | Neural Modular Control for Embodied Question Answering |
Authors | Abhishek Das, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra |
Abstract | We present a modular approach for learning policies for navigation over long planning horizons from language input. Our hierarchical policy operates at multiple timescales, where the higher-level master policy proposes subgoals to be executed by specialized sub-policies. Our choice of subgoals is compositional and semantic, i.e. they can be sequentially combined in arbitrary orderings, and assume human-interpretable descriptions (e.g. ‘exit room’, ‘find kitchen’, ‘find refrigerator’, etc.). We use imitation learning to warm-start policies at each level of the hierarchy, dramatically increasing sample efficiency, followed by reinforcement learning. Independent reinforcement learning at each level of hierarchy enables sub-policies to adapt to consequences of their actions and recover from errors. Subsequent joint hierarchical training enables the master policy to adapt to the sub-policies. On the challenging EQA (Das et al., 2018) benchmark in House3D (Wu et al., 2018), requiring navigating diverse realistic indoor environments, our approach outperforms prior work by a significant margin, both in terms of navigation and question answering. |
Tasks | Embodied Question Answering, Imitation Learning, Question Answering |
Published | 2018-10-26 |
URL | https://arxiv.org/abs/1810.11181v2 |
PDF | https://arxiv.org/pdf/1810.11181v2.pdf
PWC | https://paperswithcode.com/paper/neural-modular-control-for-embodied-question |
Repo | https://github.com/facebookresearch/House3D |
Framework | none |
Neural Segmental Hypergraphs for Overlapping Mention Recognition
Title | Neural Segmental Hypergraphs for Overlapping Mention Recognition |
Authors | Bailin Wang, Wei Lu |
Abstract | In this work, we propose a novel segmental hypergraph representation to model overlapping entity mentions that are prevalent in many practical datasets. We show that our model built on top of such a new representation is able to capture features and interactions that cannot be captured by previous models while maintaining a low time complexity for inference. We also present a theoretical analysis to formally assess how our representation is better than alternative representations reported in the literature in terms of representational power. Coupled with neural networks for feature learning, our model achieves the state-of-the-art performance in three benchmark datasets annotated with overlapping mentions. |
Tasks | Named Entity Recognition, Nested Mention Recognition, Nested Named Entity Recognition, Overlapping Mention Recognition |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01817v1 |
PDF | http://arxiv.org/pdf/1810.01817v1.pdf
PWC | https://paperswithcode.com/paper/neural-segmental-hypergraphs-for-overlapping |
Repo | https://github.com/berlino/overlapping-ner-em18 |
Framework | pytorch |
Targeted Adversarial Examples for Black Box Audio Systems
Title | Targeted Adversarial Examples for Black Box Audio Systems |
Authors | Rohan Taori, Amog Kamsetty, Brenton Chu, Nikita Vemuri |
Abstract | The application of deep recurrent networks to audio transcription has led to impressive gains in automatic speech recognition (ASR) systems. Many have demonstrated that small adversarial perturbations can fool deep neural networks into incorrectly predicting a specified target with high confidence. Current work on fooling ASR systems has focused on white-box attacks, in which the model architecture and parameters are known. In this paper, we adopt a black-box approach to adversarial generation, combining the approaches of both genetic algorithms and gradient estimation to solve the task. We achieve an 89.25% targeted attack similarity after 3000 generations while maintaining 94.6% audio file similarity. |
Tasks | Speech Recognition |
Published | 2018-05-20 |
URL | https://arxiv.org/abs/1805.07820v2 |
PDF | https://arxiv.org/pdf/1805.07820v2.pdf
PWC | https://paperswithcode.com/paper/targeted-adversarial-examples-for-black-box |
Repo | https://github.com/rtaori/Black-Box-Audio |
Framework | tf |
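
The gradient-estimation half of the attack can be illustrated with plain finite differences on a black-box loss over a random subset of waveform samples. The loss function, coordinate budget, and step sizes below are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def estimate_gradient(audio, loss_fn, n_coords=100, delta=1e-3):
    """Finite-difference gradient estimate of a black-box loss.

    `audio` is a 1-D float array (raw waveform); `loss_fn(audio) -> float`
    queries the target ASR system (e.g. a loss of the transcription against
    the attack target phrase). Only a random subset of samples is perturbed
    per step to keep the number of queries manageable.
    """
    grad = np.zeros_like(audio)
    base = loss_fn(audio)
    idx = np.random.choice(audio.size, size=min(n_coords, audio.size), replace=False)
    for i in idx:
        perturbed = audio.copy()
        perturbed[i] += delta
        grad[i] = (loss_fn(perturbed) - base) / delta
    return grad

# One illustrative attack step: nudge the current candidate (e.g. the best
# member of the genetic population) against the estimated gradient.
# audio = audio - 1e-2 * np.sign(estimate_gradient(audio, loss_fn))
```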