Paper Group AWR 296
Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks
Title | Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks |
Authors | Yuenan Hou, Zheng Ma, Chunxiao Liu, Chen Change Loy |
Abstract | The training of many existing end-to-end steering angle prediction models heavily relies on steering angles as the supervisory signal. Without learning from much richer contexts, these methods are susceptible to the presence of sharp road curves, challenging traffic conditions, strong shadows, and severe lighting changes. In this paper, we considerably improve the accuracy and robustness of predictions through heterogeneous auxiliary networks feature mimicking, a new and effective training method that provides us with much richer contextual signals apart from steering direction. Specifically, we train our steering angle predictive model by distilling multi-layer knowledge from multiple heterogeneous auxiliary networks that perform related but different tasks, e.g., image segmentation or optical flow estimation. As opposed to multi-task learning, our method does not require expensive annotations of related tasks on the target set. This is made possible by applying contemporary off-the-shelf networks on the target set and mimicking their features in different layers after transformation. The auxiliary networks are discarded after training without affecting the runtime efficiency of our model. Our approach achieves a new state-of-the-art on Udacity and Comma.ai, outperforming the previous best by a large margin of 12.8% and 52.1%, respectively. Encouraging results are also shown on Berkeley Deep Drive (BDD) dataset. |
Tasks | Multi-Task Learning, Optical Flow Estimation, Semantic Segmentation, Steering Control |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.02759v1 |
PDF | http://arxiv.org/pdf/1811.02759v1.pdf
PWC | https://paperswithcode.com/paper/learning-to-steer-by-mimicking-features-from |
Repo | https://github.com/aj9011/Car-Speed-Prediction |
Framework | pytorch |
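
A rough PyTorch sketch of the feature-mimicking loss described in the abstract above: the student's intermediate features are passed through a learned transformation and matched against features from a frozen, off-the-shelf auxiliary network. Layer choices, channel sizes, and the loss weights in the commented training step are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureMimicLoss(nn.Module):
    """Distill multi-layer knowledge from a frozen auxiliary network.

    For each chosen layer, a 1x1 conv adapts the student feature map to the
    teacher's channel count before an L2 matching loss is applied.
    Channel counts and layer choices are illustrative assumptions.
    """
    def __init__(self, student_channels, teacher_channels):
        super().__init__()
        self.adapters = nn.ModuleList(
            [nn.Conv2d(s, t, kernel_size=1) for s, t in zip(student_channels, teacher_channels)]
        )

    def forward(self, student_feats, teacher_feats):
        loss = 0.0
        for adapter, fs, ft in zip(self.adapters, student_feats, teacher_feats):
            fs = adapter(fs)
            # Resize in case the auxiliary network works at a different resolution.
            if fs.shape[-2:] != ft.shape[-2:]:
                fs = F.interpolate(fs, size=ft.shape[-2:], mode="bilinear", align_corners=False)
            loss = loss + F.mse_loss(fs, ft.detach())  # teacher features carry no gradient
        return loss

# Hypothetical training step: steering loss plus mimicking losses from a
# segmentation network and an optical-flow network (weights are assumptions).
# total = F.l1_loss(pred_angle, gt_angle) \
#       + 0.1 * mimic_seg(student_feats, seg_feats) \
#       + 0.1 * mimic_flow(student_feats, flow_feats)
```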
The N-Tuple Bandit Evolutionary Algorithm for Game Agent Optimisation
Title | The N-Tuple Bandit Evolutionary Algorithm for Game Agent Optimisation |
Authors | Simon M Lucas, Jialin Liu, Diego Perez-Liebana |
Abstract | This paper describes the N-Tuple Bandit Evolutionary Algorithm (NTBEA), an optimisation algorithm developed for noisy and expensive discrete (combinatorial) optimisation problems. The algorithm is applied to two game-based hyper-parameter optimisation problems. The N-Tuple system directly models the statistics, approximating the fitness and number of evaluations of each modelled combination of parameters. The model is simple, efficient and informative. Results show that the NTBEA significantly outperforms grid search and an estimation of distribution algorithm. |
Tasks | |
Published | 2018-02-16 |
URL | http://arxiv.org/abs/1802.05991v2 |
PDF | http://arxiv.org/pdf/1802.05991v2.pdf
PWC | https://paperswithcode.com/paper/the-n-tuple-bandit-evolutionary-algorithm-for |
Repo | https://github.com/SimonLucas/KotlinGamesJS |
Framework | none |
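
A minimal sketch of the NTBEA loop, assuming a simplified statistical model that only tracks 1-tuples (single parameters); the full algorithm also keeps statistics for higher-order tuples. The exploration constant and helper names are assumptions.

```python
import math
import random

class OneTupleModel:
    """Bandit statistics for each (parameter index, value) pair."""
    def __init__(self, n_params, n_values):
        self.sum = [[0.0] * n_values for _ in range(n_params)]
        self.count = [[0] * n_values for _ in range(n_params)]
        self.total = 0

    def update(self, solution, fitness):
        self.total += 1
        for i, v in enumerate(solution):
            self.sum[i][v] += fitness
            self.count[i][v] += 1

    def ucb(self, solution, k=2.0, eps=1e-6):
        # Average of the per-tuple mean fitness plus an exploration bonus for
        # rarely tried values (a simplified form of the NTBEA estimate).
        score = 0.0
        for i, v in enumerate(solution):
            n = self.count[i][v]
            mean = self.sum[i][v] / n if n else 0.0
            score += mean + k * math.sqrt(math.log(self.total + 1) / (n + eps))
        return score / len(solution)

def ntbea(fitness_fn, n_values, n_params, iterations=200, neighbours=50, mutate_p=0.3):
    model = OneTupleModel(n_params, n_values)
    current = [random.randrange(n_values) for _ in range(n_params)]
    for _ in range(iterations):
        # One noisy evaluation per iteration keeps the method cheap.
        model.update(current, fitness_fn(current))
        # Generate mutated neighbours and move to the most promising one.
        candidates = [[random.randrange(n_values) if random.random() < mutate_p else v
                       for v in current] for _ in range(neighbours)]
        current = max(candidates, key=model.ucb)
    return current

# Example: tune 4 discrete hyper-parameters with 5 values each under a noisy toy fitness.
best = ntbea(lambda s: sum(s) + random.gauss(0, 1), n_values=5, n_params=4)
```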
Bayesian Coreset Construction via Greedy Iterative Geodesic Ascent
Title | Bayesian Coreset Construction via Greedy Iterative Geodesic Ascent |
Authors | Trevor Campbell, Tamara Broderick |
Abstract | Coherent uncertainty quantification is a key strength of Bayesian methods. But modern algorithms for approximate Bayesian posterior inference often sacrifice accurate posterior uncertainty estimation in the pursuit of scalability. This work shows that previous Bayesian coreset construction algorithms—which build a small, weighted subset of the data that approximates the full dataset—are no exception. We demonstrate that these algorithms scale the coreset log-likelihood suboptimally, resulting in underestimated posterior uncertainty. To address this shortcoming, we develop greedy iterative geodesic ascent (GIGA), a novel algorithm for Bayesian coreset construction that scales the coreset log-likelihood optimally. GIGA provides geometric decay in posterior approximation error as a function of coreset size, and maintains the fast running time of its predecessors. The paper concludes with validation of GIGA on both synthetic and real datasets, demonstrating that it reduces posterior approximation error by orders of magnitude compared with previous coreset constructions. |
Tasks | |
Published | 2018-02-05 |
URL | http://arxiv.org/abs/1802.01737v2 |
PDF | http://arxiv.org/pdf/1802.01737v2.pdf
PWC | https://paperswithcode.com/paper/bayesian-coreset-construction-via-greedy |
Repo | https://github.com/trevorcampbell/bayesian-coresets |
Framework | none |
Zero-shot User Intent Detection via Capsule Neural Networks
Title | Zero-shot User Intent Detection via Capsule Neural Networks |
Authors | Congying Xia, Chenwei Zhang, Xiaohui Yan, Yi Chang, Philip S. Yu |
Abstract | User intent detection plays a critical role in question-answering and dialog systems. Most previous works treat intent detection as a classification problem where utterances are labeled with predefined intents. However, it is labor-intensive and time-consuming to label users’ utterances, as intents are diversely expressed and novel intents will continually be involved. Instead, we study the zero-shot intent detection problem, which aims to detect emerging user intents where no labeled utterances are currently available. We propose two capsule-based architectures: INTENTCAPSNET that extracts semantic features from utterances and aggregates them to discriminate existing intents, and INTENTCAPSNET-ZSL which gives INTENTCAPSNET the zero-shot learning ability to discriminate emerging intents via knowledge transfer from existing intents. Experiments on two real-world datasets show that our model not only can better discriminate diversely expressed existing intents, but is also able to discriminate emerging intents when no labeled utterances are available. |
Tasks | Intent Detection, Question Answering, Transfer Learning, Zero-Shot Learning |
Published | 2018-09-02 |
URL | http://arxiv.org/abs/1809.00385v1 |
PDF | http://arxiv.org/pdf/1809.00385v1.pdf
PWC | https://paperswithcode.com/paper/zero-shot-user-intent-detection-via-capsule |
Repo | https://github.com/joel-huang/zeroshot-capsnet-pytorch |
Framework | pytorch |
Differentiable Learning-to-Normalize via Switchable Normalization
Title | Differentiable Learning-to-Normalize via Switchable Normalization |
Authors | Ping Luo, Jiamin Ren, Zhanglin Peng, Ruimao Zhang, Jingyu Li |
Abstract | We address a learning-to-normalize problem by proposing Switchable Normalization (SN), which learns to select different normalizers for different normalization layers of a deep neural network. SN employs three distinct scopes to compute statistics (means and variances) including a channel, a layer, and a minibatch. SN switches between them by learning their importance weights in an end-to-end manner. It has several good properties. First, it adapts to various network architectures and tasks (see Fig.1). Second, it is robust to a wide range of batch sizes, maintaining high performance even when a small minibatch is presented (e.g. 2 images/GPU). Third, SN does not have a sensitive hyper-parameter, unlike group normalization that searches the number of groups as a hyper-parameter. Without bells and whistles, SN outperforms its counterparts on various challenging benchmarks, such as ImageNet, COCO, CityScapes, ADE20K, and Kinetics. Analyses of SN are also presented. We hope SN will help ease the usage and understanding of normalization techniques in deep learning. The code of SN has been made available at https://github.com/switchablenorms/. |
Tasks | |
Published | 2018-06-28 |
URL | http://arxiv.org/abs/1806.10779v5 |
PDF | http://arxiv.org/pdf/1806.10779v5.pdf
PWC | https://paperswithcode.com/paper/differentiable-learning-to-normalize-via |
Repo | https://github.com/switchablenorms/Switchable-Normalization |
Framework | pytorch |
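
The switching mechanism described above (instance, layer, and batch statistics mixed by learned softmax weights) can be sketched in PyTorch as below. This is a simplified reading: the released implementation also keeps separate importance weights for means and variances, and running statistics for inference.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchableNorm2d(nn.Module):
    """Simplified Switchable Normalization for NCHW feature maps."""
    def __init__(self, num_channels, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(1, num_channels, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.mix = nn.Parameter(torch.zeros(3))  # logits over {IN, LN, BN}

    def forward(self, x):
        # Instance-norm statistics: per sample, per channel.
        mean_in = x.mean(dim=(2, 3), keepdim=True)
        var_in = x.var(dim=(2, 3), keepdim=True, unbiased=False)
        # Layer-norm statistics: per sample, over all channels and positions.
        mean_ln = x.mean(dim=(1, 2, 3), keepdim=True)
        var_ln = x.var(dim=(1, 2, 3), keepdim=True, unbiased=False)
        # Batch-norm statistics: per channel, over the whole minibatch.
        mean_bn = x.mean(dim=(0, 2, 3), keepdim=True)
        var_bn = x.var(dim=(0, 2, 3), keepdim=True, unbiased=False)

        # Learned importance weights decide how the three scopes are mixed.
        w = F.softmax(self.mix, dim=0)
        mean = w[0] * mean_in + w[1] * mean_ln + w[2] * mean_bn
        var = w[0] * var_in + w[1] * var_ln + w[2] * var_bn
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        return x_hat * self.weight + self.bias

# x = torch.randn(8, 64, 32, 32); y = SwitchableNorm2d(64)(x)
```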
A Zero-Shot Framework for Sketch-based Image Retrieval
Title | A Zero-Shot Framework for Sketch-based Image Retrieval |
Authors | Sasi Kiran Yelamarthi, Shiva Krishna Reddy, Ashish Mishra, Anurag Mittal |
Abstract | Sketch-based image retrieval (SBIR) is the task of retrieving images from a natural image database that correspond to a given hand-drawn sketch. Ideally, an SBIR model should learn to associate components in the sketch (say, feet, tail, etc.) with the corresponding components in the image having similar shape characteristics. However, current evaluation methods simply focus only on coarse-grained evaluation where the focus is on retrieving images which belong to the same class as the sketch but not necessarily having the same shape characteristics as in the sketch. As a result, existing methods simply learn to associate sketches with classes seen during training and hence fail to generalize to unseen classes. In this paper, we propose a new benchmark for zero-shot SBIR where the model is evaluated in novel classes that are not seen during training. We show through extensive experiments that existing models for SBIR that are trained in a discriminative setting learn only class specific mappings and fail to generalize to the proposed zero-shot setting. To circumvent this, we propose a generative approach for the SBIR task by proposing deep conditional generative models that take the sketch as an input and fill the missing information stochastically. Experiments on this new benchmark created from the “Sketchy” dataset, which is a large-scale database of sketch-photo pairs demonstrate that the performance of these generative models is significantly better than several state-of-the-art approaches in the proposed zero-shot framework of the coarse-grained SBIR task. |
Tasks | Image Retrieval, Sketch-Based Image Retrieval |
Published | 2018-07-31 |
URL | http://arxiv.org/abs/1807.11724v1 |
PDF | http://arxiv.org/pdf/1807.11724v1.pdf
PWC | https://paperswithcode.com/paper/a-zero-shot-framework-for-sketch-based-image |
Repo | https://github.com/ShivaKrishnaM/ZS-SBIR |
Framework | tf |
Talking to myself: self-dialogues as data for conversational agents
Title | Talking to myself: self-dialogues as data for conversational agents |
Authors | Joachim Fainberg, Ben Krause, Mihai Dobre, Marco Damonte, Emmanuel Kahembwe, Daniel Duma, Bonnie Webber, Federico Fancellu |
Abstract | Conversational agents are gaining popularity with the increasing ubiquity of smart devices. However, training agents in a data driven manner is challenging due to a lack of suitable corpora. This paper presents a novel method for gathering topical, unstructured conversational data in an efficient way: self-dialogues through crowd-sourcing. Alongside this paper, we include a corpus of 3.6 million words across 23 topics. We argue the utility of the corpus by comparing self-dialogues with standard two-party conversations as well as data from other corpora. |
Tasks | |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06641v2 |
PDF | http://arxiv.org/pdf/1809.06641v2.pdf
PWC | https://paperswithcode.com/paper/talking-to-myself-self-dialogues-as-data-for |
Repo | https://github.com/jfainberg/self_dialogue_corpus |
Framework | none |
An Adaptive Conversational Bot Framework
Title | An Adaptive Conversational Bot Framework |
Authors | Isak Czeresnia Etinger |
Abstract | How can we enable users to heavily specify criteria for database queries in a user-friendly way? This paper describes a general framework of a conversational bot that extracts meaningful information from user’s sentences, that asks subsequent questions to complete missing information, and that adjusts its questions and information-extraction parameters for later conversations depending on users’ behavior. Additionally, we provide a comparison of existing tools and give novel techniques to implement such framework. Finally, we exemplify the framework with a bot to query movies in a database, whose code is available for Microsoft employees. |
Tasks | |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.09890v1 |
PDF | http://arxiv.org/pdf/1808.09890v1.pdf
PWC | https://paperswithcode.com/paper/an-adaptive-conversational-bot-framework |
Repo | https://github.com/ICEtinger/AdaptiveConversationBotFramework |
Framework | none |
MGCN: Semi-supervised Classification in Multi-layer Graphs with Graph Convolutional Networks
Title | MGCN: Semi-supervised Classification in Multi-layer Graphs with Graph Convolutional Networks |
Authors | Mahsa Ghorbani, Mahdieh Soleymani Baghshah, Hamid R. Rabiee |
Abstract | Graph embedding is an important approach for graph analysis tasks such as node classification and link prediction. The goal of graph embedding is to find a low-dimensional representation of graph nodes that preserves the graph information. Recent methods like Graph Convolutional Network (GCN) try to consider node attributes (if available) besides node relations and learn node embeddings for unsupervised and semi-supervised tasks on graphs. On the other hand, multi-layer graph analysis has received attention recently. However, the existing methods for multi-layer graph embedding cannot incorporate all available information (like node attributes). Moreover, most of them consider either the type of nodes or the type of edges, and they do not treat within- and between-layer edges differently. In this paper, we propose a method called MGCN that utilizes the GCN for multi-layer graphs. MGCN embeds nodes of multi-layer graphs using both within- and between-layer relations and node attributes. We evaluate our method on the semi-supervised node classification task. Experimental results demonstrate the superiority of the proposed method to other multi-layer and single-layer competitors and also show the positive effect of using cross-layer edges. |
Tasks | Graph Embedding, Link Prediction, Network Embedding, Node Classification |
Published | 2018-11-21 |
URL | https://arxiv.org/abs/1811.08800v3 |
PDF | https://arxiv.org/pdf/1811.08800v3.pdf
PWC | https://paperswithcode.com/paper/multi-layered-graph-embedding-with-graph |
Repo | https://github.com/mahsa91/py_mgcn |
Framework | pytorch |
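
One way to read the abstract's within-layer and between-layer propagation is as a GCN-style layer for a two-layer (multiplex) graph, sketched below. The weight sharing and aggregation details are assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class MultiLayerGCNConv(nn.Module):
    """Illustrative GCN-style propagation for one layer of a multiplex graph.

    Each graph layer aggregates its own (within-layer) neighbours and the
    nodes it is linked to in the other layer (between-layer edges), with
    separate weight matrices for the two relation types.
    """
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.w_within = nn.Linear(in_dim, out_dim, bias=False)
        self.w_between = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, h_self, h_other, a_within, a_between):
        # a_within: normalized adjacency of this graph layer (n_self x n_self)
        # a_between: normalized cross-layer adjacency (n_self x n_other)
        within = a_within @ self.w_within(h_self)
        between = a_between @ self.w_between(h_other)
        return torch.relu(within + between)

# Hypothetical usage for the two layers of a multiplex graph:
# h1 = conv1(h1, h2, a11, a12)
# h2 = conv2(h2, h1, a22, a21)
```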
Implicit Quantile Networks for Distributional Reinforcement Learning
Title | Implicit Quantile Networks for Distributional Reinforcement Learning |
Authors | Will Dabney, Georg Ostrovski, David Silver, Rémi Munos |
Abstract | In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN. We achieve this by using quantile regression to approximate the full quantile function for the state-action return distribution. By reparameterizing a distribution over the sample space, this yields an implicitly defined return distribution and gives rise to a large class of risk-sensitive policies. We demonstrate improved performance on the 57 Atari 2600 games in the ALE, and use our algorithm’s implicitly defined distributions to study the effects of risk-sensitive policies in Atari games. |
Tasks | Atari Games, Distributional Reinforcement Learning |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.06923v1 |
PDF | http://arxiv.org/pdf/1806.06923v1.pdf
PWC | https://paperswithcode.com/paper/implicit-quantile-networks-for-distributional |
Repo | https://github.com/ACampero/dopamine |
Framework | tf |
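
A sketch of the quantile reparameterization at the core of IQN: sampled fractions tau are embedded with a cosine basis, passed through a linear layer, and multiplied elementwise into the state embedding before the output head. Layer sizes and the number of sampled quantiles are illustrative; the quantile-Huber training loss is omitted.

```python
import math
import torch
import torch.nn as nn

class ImplicitQuantileHead(nn.Module):
    """Maps a state embedding and sampled quantile fractions tau to
    per-quantile action values, following the cosine-embedding recipe
    described in the abstract. Layer sizes are assumptions."""
    def __init__(self, feature_dim, n_actions, n_cos=64):
        super().__init__()
        self.n_cos = n_cos
        self.tau_embed = nn.Linear(n_cos, feature_dim)
        self.out = nn.Sequential(
            nn.Linear(feature_dim, 512), nn.ReLU(), nn.Linear(512, n_actions)
        )

    def forward(self, state_feat, n_tau=8):
        batch = state_feat.size(0)
        # Sample quantile fractions uniformly; the policy's risk profile can be
        # changed by sampling tau from a distorted distribution instead.
        tau = torch.rand(batch, n_tau, 1, device=state_feat.device)
        # Cosine basis: cos(pi * i * tau) for i = 0 .. n_cos - 1.
        i = torch.arange(self.n_cos, device=state_feat.device).float()
        phi = torch.relu(self.tau_embed(torch.cos(math.pi * i * tau)))
        # Elementwise (Hadamard) interaction of state and quantile embeddings.
        x = state_feat.unsqueeze(1) * phi      # (batch, n_tau, feature_dim)
        return self.out(x), tau                # per-quantile action values
```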
MGGAN: Solving Mode Collapse using Manifold Guided Training
Title | MGGAN: Solving Mode Collapse using Manifold Guided Training |
Authors | Duhyeon Bang, Hyunjung Shim |
Abstract | Mode collapse is a critical problem in training generative adversarial networks. To alleviate mode collapse, several recent studies introduce new objective functions, network architectures or alternative training schemes. However, their achievement is often the result of sacrificing image quality. In this paper, we propose a new algorithm, namely a manifold guided generative adversarial network (MGGAN), which leverages a guidance network on an existing GAN architecture to induce the generator to learn all modes of the data distribution. Based on extensive evaluations, we show that our algorithm resolves mode collapse without losing image quality. In particular, we demonstrate that our algorithm is easily extendable to various existing GANs. Experimental analysis justifies that the proposed algorithm is an effective and efficient tool for training GANs. |
Tasks | |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04391v1 |
PDF | http://arxiv.org/pdf/1804.04391v1.pdf
PWC | https://paperswithcode.com/paper/mggan-solving-mode-collapse-using-manifold |
Repo | https://github.com/QuickSolverKyle/Tensorflow-MyGANs |
Framework | tf |
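
A heavily simplified reading of the guidance idea: a frozen, pretrained encoder maps images into a manifold space, and a second discriminator operates on those embeddings, so the generator is pushed to cover the data manifold as well as fool the image discriminator. The objective below is an assumption for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def generator_step(G, D_image, D_manifold, E, z, opt_g):
    """One illustrative generator update with a manifold-guidance branch.

    E is a frozen, pretrained encoder (e.g. from an autoencoder) mapping
    images into a manifold space; D_manifold discriminates real vs. generated
    embeddings in that space. E's parameters are assumed to have
    requires_grad=False, so gradients flow through it only into G.
    """
    fake = G(z)
    logits_img = D_image(fake)
    logits_man = D_manifold(E(fake))
    # Non-saturating GAN loss on both the image branch and the manifold branch.
    loss = (F.binary_cross_entropy_with_logits(logits_img, torch.ones_like(logits_img))
            + F.binary_cross_entropy_with_logits(logits_man, torch.ones_like(logits_man)))
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
    return loss.item()
```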
Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study
Title | Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study |
Authors | Tao Ge, Furu Wei, Ming Zhou |
Abstract | Neural sequence-to-sequence (seq2seq) approaches have proven to be successful in grammatical error correction (GEC). Based on the seq2seq framework, we propose a novel fluency boost learning and inference mechanism. Fluency boosting learning generates diverse error-corrected sentence pairs during training, enabling the error correction model to learn how to improve a sentence’s fluency from more instances, while fluency boosting inference allows the model to correct a sentence incrementally with multiple inference steps. Combining fluency boost learning and inference with convolutional seq2seq models, our approach achieves the state-of-the-art performance: 75.72 (F_{0.5}) on CoNLL-2014 10 annotation dataset and 62.42 (GLEU) on JFLEG test set respectively, becoming the first GEC system that reaches human-level performance (72.58 for CoNLL and 62.37 for JFLEG) on both of the benchmarks. |
Tasks | Grammatical Error Correction |
Published | 2018-07-03 |
URL | http://arxiv.org/abs/1807.01270v5 |
PDF | http://arxiv.org/pdf/1807.01270v5.pdf
PWC | https://paperswithcode.com/paper/reaching-human-level-performance-in-automatic |
Repo | https://github.com/getao/human-performance-gec |
Framework | none |
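
The fluency the abstract refers to is scored in the paper from a language model's average per-token negative log-probability, as 1 / (1 + H(x)). A small sketch, with a hypothetical lm_log_prob callable standing in for a trained language model:

```python
def fluency(tokens, lm_log_prob):
    """Fluency score in (0, 1]: 1 / (1 + H(x)), where H(x) is the language
    model's average negative log-probability per token. `lm_log_prob(prefix,
    token)` is a hypothetical callable returning log P(token | prefix)."""
    nll = -sum(lm_log_prob(tokens[:i], tokens[i]) for i in range(len(tokens)))
    h = nll / max(len(tokens), 1)
    return 1.0 / (1.0 + h)

# During fluency boost learning, a generated correction is only used as a new
# training pair when it is more fluent than its source; at inference, the model
# keeps re-correcting its own output while the fluency score keeps rising.
```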
Neural Modular Control for Embodied Question Answering
Title | Neural Modular Control for Embodied Question Answering |
Authors | Abhishek Das, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra |
Abstract | We present a modular approach for learning policies for navigation over long planning horizons from language input. Our hierarchical policy operates at multiple timescales, where the higher-level master policy proposes subgoals to be executed by specialized sub-policies. Our choice of subgoals is compositional and semantic, i.e. they can be sequentially combined in arbitrary orderings, and assume human-interpretable descriptions (e.g. ‘exit room’, ‘find kitchen’, ‘find refrigerator’, etc.). We use imitation learning to warm-start policies at each level of the hierarchy, dramatically increasing sample efficiency, followed by reinforcement learning. Independent reinforcement learning at each level of hierarchy enables sub-policies to adapt to consequences of their actions and recover from errors. Subsequent joint hierarchical training enables the master policy to adapt to the sub-policies. On the challenging EQA (Das et al., 2018) benchmark in House3D (Wu et al., 2018), requiring navigating diverse realistic indoor environments, our approach outperforms prior work by a significant margin, both in terms of navigation and question answering. |
Tasks | Embodied Question Answering, Imitation Learning, Question Answering |
Published | 2018-10-26 |
URL | https://arxiv.org/abs/1810.11181v2 |
PDF | https://arxiv.org/pdf/1810.11181v2.pdf
PWC | https://paperswithcode.com/paper/neural-modular-control-for-embodied-question |
Repo | https://github.com/facebookresearch/House3D |
Framework | none |
Neural Segmental Hypergraphs for Overlapping Mention Recognition
Title | Neural Segmental Hypergraphs for Overlapping Mention Recognition |
Authors | Bailin Wang, Wei Lu |
Abstract | In this work, we propose a novel segmental hypergraph representation to model overlapping entity mentions that are prevalent in many practical datasets. We show that our model built on top of such a new representation is able to capture features and interactions that cannot be captured by previous models while maintaining a low time complexity for inference. We also present a theoretical analysis to formally assess how our representation is better than alternative representations reported in the literature in terms of representational power. Coupled with neural networks for feature learning, our model achieves the state-of-the-art performance in three benchmark datasets annotated with overlapping mentions. |
Tasks | Named Entity Recognition, Nested Mention Recognition, Nested Named Entity Recognition, Overlapping Mention Recognition |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01817v1 |
PDF | http://arxiv.org/pdf/1810.01817v1.pdf
PWC | https://paperswithcode.com/paper/neural-segmental-hypergraphs-for-overlapping |
Repo | https://github.com/berlino/overlapping-ner-em18 |
Framework | pytorch |
Targeted Adversarial Examples for Black Box Audio Systems
Title | Targeted Adversarial Examples for Black Box Audio Systems |
Authors | Rohan Taori, Amog Kamsetty, Brenton Chu, Nikita Vemuri |
Abstract | The application of deep recurrent networks to audio transcription has led to impressive gains in automatic speech recognition (ASR) systems. Many have demonstrated that small adversarial perturbations can fool deep neural networks into incorrectly predicting a specified target with high confidence. Current work on fooling ASR systems has focused on white-box attacks, in which the model architecture and parameters are known. In this paper, we adopt a black-box approach to adversarial generation, combining the approaches of both genetic algorithms and gradient estimation to solve the task. We achieve an 89.25% targeted attack similarity after 3000 generations while maintaining 94.6% audio file similarity. |
Tasks | Speech Recognition |
Published | 2018-05-20 |
URL | https://arxiv.org/abs/1805.07820v2 |
PDF | https://arxiv.org/pdf/1805.07820v2.pdf
PWC | https://paperswithcode.com/paper/targeted-adversarial-examples-for-black-box |
Repo | https://github.com/rtaori/Black-Box-Audio |
Framework | tf |
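
The gradient-estimation half of the attack can be illustrated with plain finite differences on a black-box loss over a random subset of waveform samples. The loss function, coordinate budget, and step sizes below are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def estimate_gradient(audio, loss_fn, n_coords=100, delta=1e-3):
    """Finite-difference gradient estimate of a black-box loss.

    `audio` is a 1-D float array (raw waveform); `loss_fn(audio) -> float`
    queries the target ASR system (e.g. a loss of the transcription against
    the attack target phrase). Only a random subset of samples is perturbed
    per step to keep the number of queries manageable.
    """
    grad = np.zeros_like(audio)
    base = loss_fn(audio)
    idx = np.random.choice(audio.size, size=min(n_coords, audio.size), replace=False)
    for i in idx:
        perturbed = audio.copy()
        perturbed[i] += delta
        grad[i] = (loss_fn(perturbed) - base) / delta
    return grad

# One illustrative attack step: nudge the current candidate (e.g. the best
# member of the genetic population) against the estimated gradient.
# audio = audio - 1e-2 * np.sign(estimate_gradient(audio, loss_fn))
```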