February 1, 2020

2950 words 14 mins read

Paper Group AWR 196

SECTOR: A Neural Model for Coherent Topic Segmentation and Classification. Contrastive Attention Mechanism for Abstractive Sentence Summarization. Talk Proposal: Towards the Realistic Evaluation of Evasion Attacks using CARLA. HarDNet: A Low Memory Traffic Network. Multi-Graph Transformer for Free-Hand Sketch Recognition. Understanding Architecture …

SECTOR: A Neural Model for Coherent Topic Segmentation and Classification

Title SECTOR: A Neural Model for Coherent Topic Segmentation and Classification
Authors Sebastian Arnold, Rudolf Schneider, Philippe Cudré-Mauroux, Felix A. Gers, Alexander Löser
Abstract When searching for information, a human reader first glances over a document, spots relevant sections and then focuses on a few sentences for resolving her intention. However, the high variance of document structure makes it difficult to identify the salient topic of a given section at a glance. To tackle this challenge, we present SECTOR, a model to support machine reading systems by segmenting documents into coherent sections and assigning topic labels to each section. Our deep neural network architecture learns a latent topic embedding over the course of a document. This can be leveraged to classify local topics from plain text and segment a document at topic shifts. In addition, we contribute WikiSection, a publicly available dataset with 242k labeled sections in English and German from two distinct domains: diseases and cities. From our extensive evaluation of 20 architectures, we report a highest score of 71.6% F1 for the segmentation and classification of 30 topics from the English city domain, scored by our SECTOR LSTM model with bloom filter embeddings and bidirectional segmentation. This is a significant improvement of 29.5 points F1 compared to state-of-the-art CNN classifiers with baseline segmentation.
Tasks Reading Comprehension
Published 2019-02-13
URL http://arxiv.org/abs/1902.04793v1
PDF http://arxiv.org/pdf/1902.04793v1.pdf
PWC https://paperswithcode.com/paper/sector-a-neural-model-for-coherent-topic
Repo https://github.com/sebastianarnold/WikiSection
Framework none
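
The abstract above describes per-sentence topic embeddings learned with an LSTM and segmentation at topic shifts. Below is a minimal, hypothetical sketch of that idea, not the authors' code (which lives in the linked repo): a bidirectional LSTM turns sentence embeddings into topic vectors, a linear head classifies each sentence's topic, and boundaries are placed where consecutive topic vectors diverge. All dimensions and thresholds are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopicSegmenter(nn.Module):
    """Toy SECTOR-style model: sentence embeddings -> BiLSTM topic
    embeddings -> per-sentence topic logits. Dimensions are illustrative."""
    def __init__(self, sent_dim=300, topic_dim=128, num_topics=30):
        super().__init__()
        self.lstm = nn.LSTM(sent_dim, topic_dim, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * topic_dim, num_topics)

    def forward(self, sents):                 # sents: (1, num_sents, sent_dim)
        topic_emb, _ = self.lstm(sents)       # (1, num_sents, 2*topic_dim)
        return topic_emb, self.head(topic_emb)

def segment_at_topic_shifts(topic_emb, threshold=0.5):
    """Place a boundary wherever adjacent topic embeddings diverge."""
    sim = F.cosine_similarity(topic_emb[:, :-1], topic_emb[:, 1:], dim=-1)
    return (sim < threshold).nonzero(as_tuple=True)[1] + 1  # boundary indices

model = TopicSegmenter()
doc = torch.randn(1, 12, 300)                 # 12 sentence embeddings
emb, logits = model(doc)
print(segment_at_topic_shifts(emb), logits.argmax(-1))
```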

Contrastive Attention Mechanism for Abstractive Sentence Summarization

Title Contrastive Attention Mechanism for Abstractive Sentence Summarization
Authors Xiangyu Duan, Hongfei Yu, Mingming Yin, Min Zhang, Weihua Luo, Yue Zhang
Abstract We propose a contrastive attention mechanism to extend the sequence-to-sequence framework for the abstractive sentence summarization task, which aims to generate a brief summary of a given source sentence. The proposed contrastive attention mechanism accommodates two categories of attention: one is the conventional attention that attends to relevant parts of the source sentence; the other is the opponent attention that attends to irrelevant or less relevant parts of the source sentence. Both attentions are trained in opposite ways so that the contribution from the conventional attention is encouraged and the contribution from the opponent attention is discouraged through a novel softmax and softmin functionality. Experiments on benchmark datasets show that the proposed contrastive attention mechanism is more focused on the relevant parts for the summary than the conventional attention mechanism, and greatly advances the state-of-the-art performance on the abstractive sentence summarization task. We release the code at https://github.com/travel-go/Abstractive-Text-Summarization
Tasks Abstractive Sentence Summarization, Abstractive Text Summarization, Text Summarization
Published 2019-10-29
URL https://arxiv.org/abs/1910.13114v2
PDF https://arxiv.org/pdf/1910.13114v2.pdf
PWC https://paperswithcode.com/paper/contrastive-attention-mechanism-for
Repo https://github.com/travel-go/Abstractive-Text-Summarization
Framework pytorch
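
A minimal sketch of the contrastive idea described above: conventional attention uses a softmax over alignment scores, while the opponent attention uses a softmin (here computed as a softmax over negated scores) so it concentrates on the least relevant source positions; training would then encourage the former contribution and discourage the latter. Shapes and function names are assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def conventional_and_opponent_attention(query, keys, values):
    """query: (d,), keys/values: (src_len, d). Returns the two context
    vectors produced by softmax (relevant) and softmin (irrelevant) weights."""
    scores = keys @ query                      # (src_len,) alignment scores
    attn = F.softmax(scores, dim=0)            # attends to relevant tokens
    opponent = F.softmax(-scores, dim=0)       # "softmin": least relevant tokens
    return attn @ values, opponent @ values

q = torch.randn(64)
K = torch.randn(10, 64)
V = torch.randn(10, 64)
ctx, opp_ctx = conventional_and_opponent_attention(q, K, V)
# A training objective would reward predictions made from `ctx` and
# penalize those made from `opp_ctx` (the contrastive part).
```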

Talk Proposal: Towards the Realistic Evaluation of Evasion Attacks using CARLA

Title Talk Proposal: Towards the Realistic Evaluation of Evasion Attacks using CARLA
Authors Cory Cornelius, Shang-Tse Chen, Jason Martin, Duen Horng Chau
Abstract In this talk we describe our content-preserving attack on object detectors, ShapeShifter, and demonstrate how to evaluate this threat in realistic scenarios. We describe how we use CARLA, a realistic urban driving simulator, to create these scenarios, and how we use ShapeShifter to generate content-preserving attacks against those scenarios.
Tasks
Published 2019-04-18
URL http://arxiv.org/abs/1904.12622v1
PDF http://arxiv.org/pdf/1904.12622v1.pdf
PWC https://paperswithcode.com/paper/190412622
Repo https://github.com/shangtse/robust-physical-attack
Framework tf

HarDNet: A Low Memory Traffic Network

Title HarDNet: A Low Memory Traffic Network
Authors Ping Chao, Chao-Yang Kao, Yu-Shan Ruan, Chien-Hsiang Huang, Youn-Long Lin
Abstract State-of-the-art neural network architectures such as ResNet, MobileNet, and DenseNet have achieved outstanding accuracy over low MACs and small model size counterparts. However, these metrics might not be accurate for predicting the inference time. We suggest that memory traffic for accessing intermediate feature maps can be a factor dominating the inference latency, especially in such tasks as real-time object detection and semantic segmentation of high-resolution video. We propose a Harmonic Densely Connected Network to achieve high efficiency in terms of both low MACs and memory traffic. The new network achieves 35%, 36%, 30%, 32%, and 45% inference time reduction compared with FC-DenseNet-103, DenseNet-264, ResNet-50, ResNet-152, and SSD-VGG, respectively. We use tools including Nvidia profiler and ARM Scale-Sim to measure the memory traffic and verify that the inference latency is indeed proportional to the memory traffic consumption and the proposed network consumes low memory traffic. We conclude that one should take memory traffic into consideration when designing neural network architectures for high-resolution applications at the edge.
Tasks Object Detection, Real-Time Object Detection, Real-Time Semantic Segmentation, Semantic Segmentation
Published 2019-09-03
URL https://arxiv.org/abs/1909.00948v1
PDF https://arxiv.org/pdf/1909.00948v1.pdf
PWC https://paperswithcode.com/paper/hardnet-a-low-memory-traffic-network
Repo https://github.com/osmr/imgclsmob
Framework mxnet
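
The "harmonic" wiring behind HarDNet keeps concatenation traffic far below DenseNet's all-to-all pattern. Under a common reading of the connection rule (an assumption on our part, not a quote from the paper), layer k takes shortcut inputs from layer k - 2^n whenever 2^n divides k. The small helper below enumerates those inputs:

```python
def hardnet_inputs(k):
    """Indices of layers feeding layer k under the harmonic wiring rule
    (layer k reads from layer k - 2**n whenever 2**n divides k)."""
    inputs, n = [], 0
    while k % (2 ** n) == 0 and k - 2 ** n >= 0:
        inputs.append(k - 2 ** n)
        n += 1
    return inputs

if __name__ == "__main__":
    for k in range(1, 9):
        print(k, hardnet_inputs(k))
    # e.g. layer 8 -> [7, 6, 4, 0]: four concatenated inputs, whereas a
    # DenseNet layer 8 would read all of layers 0..7.
```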

Multi-Graph Transformer for Free-Hand Sketch Recognition

Title Multi-Graph Transformer for Free-Hand Sketch Recognition
Authors Peng Xu, Chaitanya K. Joshi, Xavier Bresson
Abstract Learning meaningful representations of free-hand sketches remains a challenging task given the signal sparsity and the high-level abstraction of sketches. Existing techniques have focused on exploiting either the static nature of sketches with Convolutional Neural Networks (CNNs) or the temporal sequential property with Recurrent Neural Networks (RNNs). In this work, we propose a new representation of sketches as multiple sparsely connected graphs. We design a novel Graph Neural Network (GNN), the Multi-Graph Transformer (MGT), for learning representations of sketches from multiple graphs which simultaneously capture global and local geometric stroke structures, as well as temporal information. We report extensive numerical experiments on a sketch recognition task to demonstrate the performance of the proposed approach. In particular, MGT applied to 414k sketches from Google QuickDraw: (i) achieves a small recognition gap to the CNN-based performance upper bound (72.80% vs. 74.22%), and (ii) outperforms all RNN-based models by a significant margin. To the best of our knowledge, this is the first work proposing to represent sketches as graphs and apply GNNs for sketch recognition. Code and trained models are available at https://github.com/PengBoXiangShang/multigraph_transformer.
Tasks Sketch Recognition
Published 2019-12-24
URL https://arxiv.org/abs/1912.11258v2
PDF https://arxiv.org/pdf/1912.11258v2.pdf
PWC https://paperswithcode.com/paper/multi-graph-transformer-for-free-hand-sketch
Repo https://github.com/PengBoXiangShang/multigraph_transformer
Framework pytorch
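
A rough sketch of the multi-graph idea described above: a sketch is a sequence of stroke points, several sparse adjacency matrices are built over those points (for example a temporal chain), and each attention layer is masked by one of the graphs so it only attends along that graph's edges. The graph choices and names below are illustrative assumptions, not the released code.

```python
import torch
import torch.nn.functional as F

def temporal_graph(n):
    """Chain graph: point i may attend to i-1, i, and i+1."""
    a = torch.eye(n)
    idx = torch.arange(n - 1)
    a[idx, idx + 1] = 1
    a[idx + 1, idx] = 1
    return a

def masked_self_attention(x, adj):
    """Self-attention restricted to the edges of `adj` (an n x n 0/1 mask)."""
    scores = x @ x.t() / x.shape[-1] ** 0.5
    scores = scores.masked_fill(adj == 0, float("-inf"))
    return F.softmax(scores, dim=-1) @ x

points = torch.randn(16, 64)                 # 16 stroke points, 64-dim features
graphs = [temporal_graph(16), torch.eye(16)] # second graph is a trivial placeholder
# One MGT-style layer: run attention once per graph and merge the results.
out = torch.stack([masked_self_attention(points, g) for g in graphs]).mean(0)
print(out.shape)
```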

Understanding Architectures Learnt by Cell-based Neural Architecture Search

Title Understanding Architectures Learnt by Cell-based Neural Architecture Search
Authors Yao Shu, Wei Wang, Shaofeng Cai
Abstract Neural architecture search (NAS) searches architectures automatically for given tasks, e.g., image classification and language modeling. Improving search efficiency and effectiveness has attracted increasing attention in recent years. However, few efforts have been devoted to understanding the generated architectures. In this paper, we first reveal that existing NAS algorithms (e.g., DARTS, ENAS) tend to favor architectures with wide and shallow cell structures. These favorable architectures consistently achieve fast convergence and are consequently selected by NAS algorithms. Our empirical and theoretical study further confirms that their fast convergence derives from their smooth loss landscape and accurate gradient information. Nonetheless, these architectures may not necessarily lead to better generalization performance compared with other candidate architectures in the same search space, and therefore further improvement is possible by revising existing NAS algorithms.
Tasks Image Classification, Language Modelling, Neural Architecture Search
Published 2019-09-20
URL https://arxiv.org/abs/1909.09569v3
PDF https://arxiv.org/pdf/1909.09569v3.pdf
PWC https://paperswithcode.com/paper/understanding-architectures-learnt-by-cell
Repo https://github.com/shuyao95/Understanding-NAS
Framework pytorch
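
The finding above is topological: searched cells tend to be wide and shallow. As a concrete illustration, here is a small helper that measures depth (longest path) and width (number of intermediate nodes fed directly by an input) of a cell represented as a DAG. These are simplified stand-in definitions, not necessarily the exact ones used in the paper.

```python
from functools import lru_cache

def cell_depth_and_width(edges, num_nodes, input_nodes=(0, 1)):
    """edges: list of (src, dst) pairs in a cell DAG.
    depth = longest path (counted in edges) ending at any node,
    width = number of non-input nodes that read directly from an input."""
    preds = {v: [] for v in range(num_nodes)}
    for s, d in edges:
        preds[d].append(s)

    @lru_cache(None)
    def depth(v):
        return 0 if v in input_nodes else 1 + max(depth(p) for p in preds[v])

    width = sum(1 for v in range(num_nodes)
                if v not in input_nodes and any(p in input_nodes for p in preds[v]))
    return max(depth(v) for v in range(num_nodes)), width

# A "wide and shallow" cell: every intermediate node reads from the inputs.
print(cell_depth_and_width([(0, 2), (1, 2), (0, 3), (1, 3), (0, 4), (1, 4)], 5))
# A "deep" cell: intermediate nodes are chained one after another.
print(cell_depth_and_width([(0, 2), (2, 3), (3, 4)], 5))
```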

A Unified Neural Architecture for Instrumental Audio Tasks

Title A Unified Neural Architecture for Instrumental Audio Tasks
Authors Steven Spratley, Daniel Beck, Trevor Cohn
Abstract Within Music Information Retrieval (MIR), prominent tasks – including pitch-tracking, source-separation, super-resolution, and synthesis – typically call for specialised methods, despite their similarities. Conditional Generative Adversarial Networks (cGANs) have been shown to be highly versatile in learning general image-to-image translations, but have not yet been adapted across MIR. In this work, we present an end-to-end supervisable architecture to perform all aforementioned audio tasks, consisting of a WaveNet synthesiser conditioned on the output of a jointly-trained cGAN spectrogram translator. In doing so, we demonstrate the potential of such flexible techniques to unify MIR tasks, promote efficient transfer learning, and converge research to the improvement of powerful, general methods. Finally, to the best of our knowledge, we present the first application of GANs to guided instrument synthesis.
Tasks Information Retrieval, Music Information Retrieval, Super-Resolution, Transfer Learning
Published 2019-03-01
URL http://arxiv.org/abs/1903.00142v1
PDF http://arxiv.org/pdf/1903.00142v1.pdf
PWC https://paperswithcode.com/paper/a-unified-neural-architecture-for
Repo https://github.com/r9y9/wavenet_vocoder
Framework pytorch

Explain Yourself! Leveraging Language Models for Commonsense Reasoning

Title Explain Yourself! Leveraging Language Models for Commonsense Reasoning
Authors Nazneen Fatema Rajani, Bryan McCann, Caiming Xiong, Richard Socher
Abstract Deep learning models perform poorly on tasks that require commonsense reasoning, which often necessitates some form of world-knowledge or reasoning over information not immediately present in the input. We collect human explanations for commonsense reasoning in the form of natural language sequences and highlighted annotations in a new dataset called Common Sense Explanations (CoS-E). We use CoS-E to train language models to automatically generate explanations that can be used during training and inference in a novel Commonsense Auto-Generated Explanation (CAGE) framework. CAGE improves the state-of-the-art by 10% on the challenging CommonsenseQA task. We further study commonsense reasoning in DNNs using both human and auto-generated explanations including transfer to out-of-domain tasks. Empirical results indicate that we can effectively leverage language models for commonsense reasoning.
Tasks Common Sense Reasoning
Published 2019-06-06
URL https://arxiv.org/abs/1906.02361v1
PDF https://arxiv.org/pdf/1906.02361v1.pdf
PWC https://paperswithcode.com/paper/explain-yourself-leveraging-language-models
Repo https://github.com/salesforce/cos-e
Framework none
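
CAGE, as described above, is a two-stage pipeline: a language model first generates a free-text explanation for the question, and a classifier then answers using the question plus that explanation. The sketch below shows only this control flow; both functions are hypothetical stand-ins (a canned sentence and a word-overlap scorer), not the released models.

```python
def generate_explanation(question, choices):
    """Stage 1 (stand-in): a fine-tuned language model would generate a
    free-text explanation conditioned on the question and answer choices."""
    return "Jam spoils at room temperature once opened, so it belongs in the refrigerator."

def answer_with_explanation(question, choices, explanation):
    """Stage 2 (stand-in): a classifier would score each choice given the
    question concatenated with the explanation; here we use word overlap."""
    words = set(explanation.lower().replace(".", "").split())
    return max(choices, key=lambda c: len(set(c.lower().split()) & words))

question = "Where would you store a jar of jam after opening it?"
choices = ["pantry", "refrigerator", "toolbox"]
explanation = generate_explanation(question, choices)          # explain first ...
print(explanation)
print(answer_with_explanation(question, choices, explanation)) # ... then predict
```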

Training Neural Networks with Local Error Signals

Title Training Neural Networks with Local Error Signals
Authors Arild Nøkland, Lars Hiller Eidnes
Abstract Supervised training of neural networks for classification is typically performed with a global loss function. The loss function provides a gradient for the output layer, and this gradient is back-propagated to hidden layers to dictate an update direction for the weights. An alternative approach is to train the network with layer-wise loss functions. In this paper we demonstrate, for the first time, that layer-wise training can approach the state-of-the-art on a variety of image datasets. We use single-layer sub-networks and two different supervised loss functions to generate local error signals for the hidden layers, and we show that the combination of these losses helps with optimization in the context of local learning. Using local errors could be a step towards more biologically plausible deep learning because the global error does not have to be transported back to hidden layers. A completely backprop-free variant outperforms previously reported results among methods aiming for higher biological plausibility. Code is available at https://github.com/anokland/local-loss
Tasks Image Classification
Published 2019-01-20
URL https://arxiv.org/abs/1901.06656v2
PDF https://arxiv.org/pdf/1901.06656v2.pdf
PWC https://paperswithcode.com/paper/training-neural-networks-with-local-error
Repo https://github.com/anokland/local-loss
Framework pytorch
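
A minimal PyTorch sketch of layer-wise training with local error signals, in the spirit of the abstract above (a simplified stand-in, not the linked implementation): each hidden layer gets its own auxiliary linear classifier and local cross-entropy loss, and the input to the next layer is detached so no global gradient is back-propagated. The paper combines a prediction loss with a similarity loss; only the prediction loss is shown here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

layers = nn.ModuleList([nn.Sequential(nn.Linear(784, 256), nn.ReLU()),
                        nn.Sequential(nn.Linear(256, 256), nn.ReLU())])
aux_heads = nn.ModuleList([nn.Linear(256, 10), nn.Linear(256, 10)])
opts = [torch.optim.SGD(list(l.parameters()) + list(h.parameters()), lr=0.1)
        for l, h in zip(layers, aux_heads)]

def local_training_step(x, y):
    """Train each layer with its own local loss; detach() blocks the global
    backward pass, so no error signal travels back through earlier layers."""
    h = x
    for layer, head, opt in zip(layers, aux_heads, opts):
        h = layer(h)
        loss = F.cross_entropy(head(h), y)   # local (prediction) loss only
        opt.zero_grad()
        loss.backward()
        opt.step()
        h = h.detach()                       # next layer sees a constant input
    return loss.item()

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
print(local_training_step(x, y))
```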

Group Re-identification via Transferred Single and Couple Representation Learning

Title Group Re-identification via Transferred Single and Couple Representation Learning
Authors Ziling Huang, Zheng Wang, Shin’ichi Satoh, Chia-Wen Lin
Abstract Group re-identification (G-ReID) is an important yet less-studied task. Its challenges not only lie in appearance changes of individuals, which have been well investigated in general person re-identification (ReID), but also derive from group layout and membership changes. The key task of G-ReID is therefore to learn representations robust to such changes. To address this issue, we propose a Transferred Single and Couple Representation Learning Network (TSCN). Its merits are two aspects: 1) Due to the lack of labeled training samples, existing G-ReID methods mainly rely on unsatisfactory hand-crafted features. To gain the superiority of deep learning models, we treat a group as multiple persons and transfer the domain of a labeled ReID dataset to a G-ReID target dataset style to learn single representations. 2) Taking into account the neighborhood relationship in a group, we further propose learning a novel couple representation between two group members, which achieves more discriminative power in G-ReID tasks. In addition, an unsupervised weight learning method is exploited to adaptively fuse the results of different views according to result patterns. Extensive experimental results demonstrate the effectiveness of our approach, which significantly outperforms state-of-the-art methods by 11.7% CMC-1 on the Road Group dataset and by 39.0% CMC-1 on the DukeMTMC dataset.
Tasks Person Re-Identification, Representation Learning
Published 2019-05-13
URL https://arxiv.org/abs/1905.04854v1
PDF https://arxiv.org/pdf/1905.04854v1.pdf
PWC https://paperswithcode.com/paper/group-re-identification-via-transferred
Repo https://github.com/huangzilingcv/G-ReID
Framework pytorch
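
To make the "couple representation" above concrete, here is a small hypothetical sketch (not the linked code): single representations are per-person embeddings, and a couple representation is produced by feeding the concatenated features of two group members through a small network, so that neighborhood relations inside the group are encoded explicitly.

```python
import torch
import torch.nn as nn

class CoupleEncoder(nn.Module):
    """Maps the concatenated single features of two group members to a
    couple embedding. Dimensions are illustrative."""
    def __init__(self, feat_dim=512, out_dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * feat_dim, out_dim), nn.ReLU(),
                                 nn.Linear(out_dim, out_dim))

    def forward(self, f_a, f_b):
        return self.net(torch.cat([f_a, f_b], dim=-1))

singles = torch.randn(4, 512)            # single representations of 4 members
enc = CoupleEncoder()
couples = torch.stack([enc(singles[i], singles[j])
                       for i in range(4) for j in range(i + 1, 4)])
print(couples.shape)                     # one embedding per member pair
```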

pySOT and POAP: An event-driven asynchronous framework for surrogate optimization

Title pySOT and POAP: An event-driven asynchronous framework for surrogate optimization
Authors David Eriksson, David Bindel, Christine A. Shoemaker
Abstract This paper describes Plumbing for Optimization with Asynchronous Parallelism (POAP) and the Python Surrogate Optimization Toolbox (pySOT). POAP is an event-driven framework for building and combining asynchronous optimization strategies, designed for global optimization of expensive functions where concurrent function evaluations are useful. POAP consists of three components: a worker pool capable of function evaluations, strategies to propose evaluations or other actions, and a controller that mediates the interaction between the workers and strategies. pySOT is a collection of synchronous and asynchronous surrogate optimization strategies, implemented in the POAP framework. We support the stochastic RBF method by Regis and Shoemaker along with various extensions of this method, and a general surrogate optimization strategy that covers most Bayesian optimization methods. We have implemented many different surrogate models, experimental designs, acquisition functions, and a large set of test problems. We make an extensive comparison between synchronous and asynchronous parallelism and find that the advantage of asynchronous computation increases as the variance of the evaluation time or number of processors increases. We observe a close to linear speed-up with 4, 8, and 16 processors in both the synchronous and asynchronous setting.
Tasks
Published 2019-07-30
URL https://arxiv.org/abs/1908.00420v1
PDF https://arxiv.org/pdf/1908.00420v1.pdf
PWC https://paperswithcode.com/paper/pysot-and-poap-an-event-driven-asynchronous
Repo https://github.com/dme65/pySOT
Framework none
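
The abstract above describes POAP's three components: a worker pool that evaluates the objective, a strategy that proposes evaluations, and a controller that mediates between them. The toy sketch below mimics that event-driven split with the standard library only; it is a structural illustration and deliberately does not use the real pySOT/POAP API.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

def expensive_objective(x):                   # stand-in for a costly simulation
    return (x - 0.3) ** 2

class RandomStrategy:
    """Strategy: proposes evaluation points and is told about finished ones
    (a surrogate-based strategy would refit its model in on_complete)."""
    def propose(self):
        return random.uniform(0.0, 1.0)
    def on_complete(self, x, fx):
        pass

def run_controller(strategy, num_workers=4, budget=20):
    """Controller: keeps the worker pool busy and feeds results back to the
    strategy as soon as each evaluation finishes (asynchronous parallelism)."""
    best_x, best_f = None, float("inf")
    with ThreadPoolExecutor(num_workers) as pool:
        pending = {pool.submit(expensive_objective, x): x
                   for x in (strategy.propose() for _ in range(budget))}
        for fut in as_completed(pending):
            x, fx = pending[fut], fut.result()
            strategy.on_complete(x, fx)
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f

print(run_controller(RandomStrategy()))
```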

Chemical Names Standardization using Neural Sequence to Sequence Model

Title Chemical Names Standardization using Neural Sequence to Sequence Model
Authors Junlang Zhan, Hai Zhao
Abstract Chemical information extraction aims to convert chemical knowledge expressed in text into a structured chemical database, a text-processing task that relies heavily on identifying and standardizing chemical compound names. Once a systematic name for a chemical compound is given, it can be converted straightforwardly into the required molecular formula. However, many chemical substances appear under numerous names besides their systematic ones, which poses a great challenge for this task. In this paper, we propose a framework for automatic standardization from non-systematic names to the corresponding systematic names using spelling error correction, byte pair encoding tokenization and a neural sequence to sequence model. Our framework is trained end to end and is fully data-driven. Our standardization accuracy on the test dataset reaches 54.04%, a great improvement over the previous state-of-the-art result.
Tasks Tokenization
Published 2019-01-21
URL http://arxiv.org/abs/1901.07003v1
PDF http://arxiv.org/pdf/1901.07003v1.pdf
PWC https://paperswithcode.com/paper/chemical-names-standardization-using-neural
Repo https://github.com/zhanjunlang/Neural_Chemical_Name_Standardization
Framework pytorch
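
The pipeline described above has three stages: spelling correction of the non-systematic name, byte pair encoding tokenization, and a sequence-to-sequence model that emits the systematic name. The sketch below shows only the plumbing of that pipeline; every function in it is a hypothetical placeholder rather than the released code.

```python
def correct_spelling(name):
    """Stage 1 (stand-in): fix character-level spelling errors."""
    return name.replace("hydorxide", "hydroxide")

def bpe_tokenize(name):
    """Stage 2 (stand-in): a trained BPE model would merge frequent character
    pairs; here we simply fall back to character tokens."""
    return list(name)

def seq2seq_standardize(tokens):
    """Stage 3 (stand-in): an encoder-decoder model would generate the
    systematic name token by token from the BPE input."""
    return "".join(tokens)  # identity placeholder

def standardize(non_systematic_name):
    return seq2seq_standardize(bpe_tokenize(correct_spelling(non_systematic_name)))

print(standardize("sodium hydorxide"))   # -> "sodium hydroxide"
```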

C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure From Motion

Title C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure From Motion
Authors David Novotny, Nikhila Ravi, Benjamin Graham, Natalia Neverova, Andrea Vedaldi
Abstract We propose C3DPO, a method for extracting 3D models of deformable objects from 2D keypoint annotations in unconstrained images. We do so by learning a deep network that reconstructs a 3D object from a single view at a time, accounting for partial occlusions, and explicitly factoring the effects of viewpoint changes and object deformations. In order to achieve this factorization, we introduce a novel regularization technique. We first show that the factorization is successful if, and only if, there exists a certain canonicalization function of the reconstructed shapes. Then, we learn the canonicalization function together with the reconstruction one, which constrains the result to be consistent. We demonstrate state-of-the-art reconstruction results for methods that do not use ground-truth 3D supervision for a number of benchmarks, including Up3D and PASCAL3D+. Source code has been made available at https://github.com/facebookresearch/c3dpo_nrsfm.
Tasks
Published 2019-09-05
URL https://arxiv.org/abs/1909.02533v2
PDF https://arxiv.org/pdf/1909.02533v2.pdf
PWC https://paperswithcode.com/paper/c3dpo-canonical-3d-pose-networks-for-non
Repo https://github.com/facebookresearch/c3dpo_nrsfm
Framework pytorch
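
The regularization technique described above can be sketched as a consistency constraint: rotating a reconstructed canonical shape and passing it through a learned canonicalization network should recover the original shape, so the model is penalized when canonicalize(R·X) differs from X. This is a simplified PyTorch illustration of that loss, not the linked implementation.

```python
import torch
import torch.nn as nn

def random_rotation():
    """Random 3x3 rotation matrix via QR decomposition."""
    q, _ = torch.linalg.qr(torch.randn(3, 3))
    if torch.det(q) < 0:
        q[:, 0] = -q[:, 0]        # ensure a proper rotation (det = +1)
    return q

# Toy canonicalization network applied pointwise to 3D keypoints.
canonicalizer = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 3))

def canonicalization_loss(shape):
    """shape: (num_keypoints, 3) canonical reconstruction. Rotating it and
    passing it through the canonicalizer should give back the original."""
    rotated = shape @ random_rotation().t()
    return ((canonicalizer(rotated) - shape) ** 2).mean()

X = torch.randn(17, 3)            # e.g. 17 human-pose keypoints
print(canonicalization_loss(X))
```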

Attention-based Curiosity-driven Exploration in Deep Reinforcement Learning

Title Attention-based Curiosity-driven Exploration in Deep Reinforcement Learning
Authors Patrik Reizinger, Márton Szemenyei
Abstract Reinforcement Learning makes it possible to train an agent via interaction with the environment. However, in the majority of real-world scenarios, the extrinsic feedback is sparse or not sufficient, thus intrinsic reward formulations are needed to successfully train the agent. This work investigates and extends the paradigm of curiosity-driven exploration. First, a probabilistic approach is taken to exploit the advantages of the attention mechanism, which has been successfully applied in other domains of Deep Learning. Combining the two, we propose new methods, such as AttA2C, an extension of the Actor-Critic framework. Second, another curiosity-based approach, ICM, is extended. The proposed model utilizes attention to emphasize features for the dynamic models within ICM; moreover, we also modify the loss function, resulting in a new curiosity formulation, which we call rational curiosity. The corresponding implementation can be found at https://github.com/rpatrik96/AttA2C/.
Tasks
Published 2019-10-23
URL https://arxiv.org/abs/1910.10840v1
PDF https://arxiv.org/pdf/1910.10840v1.pdf
PWC https://paperswithcode.com/paper/attention-based-curiosity-driven-exploration
Repo https://github.com/rpatrik96/AttA2C
Framework pytorch
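
To ground the curiosity formulation mentioned above, here is a minimal sketch of an ICM-style intrinsic reward in PyTorch: a forward model predicts the next state's features from the current features and action, and the prediction error serves as the curiosity bonus. The attention-based weighting proposed in the paper is only hinted at with a learned per-feature gate; all names and dimensions are illustrative assumptions, not the linked code.

```python
import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    """Predicts next-state features from current features and a one-hot action,
    with a learned per-feature gate standing in for the attention weighting."""
    def __init__(self, feat_dim=32, num_actions=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim + num_actions, 64),
                                 nn.ReLU(), nn.Linear(64, feat_dim))
        self.gate = nn.Parameter(torch.zeros(feat_dim))   # softmax -> feature weights

    def forward(self, phi, action_onehot):
        return self.net(torch.cat([phi, action_onehot], dim=-1))

    def intrinsic_reward(self, phi, action_onehot, phi_next):
        weights = torch.softmax(self.gate, dim=0)
        error = (self.forward(phi, action_onehot) - phi_next) ** 2
        return (weights * error).sum(dim=-1)              # curiosity bonus

fm = ForwardModel()
phi, phi_next = torch.randn(8, 32), torch.randn(8, 32)
actions = torch.eye(4)[torch.randint(0, 4, (8,))]
print(fm.intrinsic_reward(phi, actions, phi_next))
```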

Improved Training Speed, Accuracy, and Data Utilization Through Loss Function Optimization

Title Improved Training Speed, Accuracy, and Data Utilization Through Loss Function Optimization
Authors Santiago Gonzalez, Risto Miikkulainen
Abstract As the complexity of neural network models has grown, it has become increasingly important to optimize their design automatically through metalearning. Methods for discovering hyperparameters, topologies, and learning rate schedules have led to significant increases in performance. This paper shows that loss functions can be optimized with metalearning as well, resulting in similar improvements. The method, Genetic Loss-function Optimization (GLO), discovers loss functions de novo and optimizes them for a target task. Leveraging techniques from genetic programming, GLO builds loss functions hierarchically from a set of operators and leaf nodes. These functions are repeatedly recombined and mutated to find an optimal structure, and then a covariance-matrix adaptation evolutionary strategy (CMA-ES) is used to find optimal coefficients. Networks trained with GLO loss functions are found to outperform the standard cross-entropy loss on standard image classification tasks. Training with these new loss functions requires fewer steps, results in lower test error, and allows smaller datasets to be used. Loss-function optimization thus provides a new dimension of metalearning and constitutes an important step towards AutoML.
Tasks AutoML, Image Classification
Published 2019-05-27
URL https://arxiv.org/abs/1905.11528v2
PDF https://arxiv.org/pdf/1905.11528v2.pdf
PWC https://paperswithcode.com/paper/improved-training-speed-accuracy-and-data
Repo https://github.com/sgonzalez/SwiftGenetics
Framework none
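
GLO, as summarized above, evolves loss functions as expression trees and then tunes their coefficients. The toy sketch below shows the representation and mutation half of that idea: a loss is a small tree of operators over the prediction, the target, and constant leaves, and mutation rewrites a random subtree. Coefficient tuning with CMA-ES is omitted, and everything here is an illustrative assumption rather than the paper's implementation.

```python
import math
import random

OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b,
       "mul": lambda a, b: a * b, "log": lambda a, _: math.log(abs(a) + 1e-8)}
LEAVES = ["y_true", "y_pred", "const"]

def random_tree(depth=2):
    """A loss genome: nested (op, left, right) tuples with leaf strings."""
    if depth <= 0:
        leaf = random.choice(LEAVES)
        return ("const", random.uniform(-1, 1)) if leaf == "const" else leaf
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, y_true, y_pred):
    if tree == "y_true":
        return y_true
    if tree == "y_pred":
        return y_pred
    if tree[0] == "const":
        return tree[1]
    op, left, right = tree
    return OPS[op](evaluate(left, y_true, y_pred), evaluate(right, y_true, y_pred))

def mutate(tree, depth=2):
    """Replace a randomly chosen subtree with a freshly sampled one."""
    if not isinstance(tree, tuple) or tree[0] == "const" or random.random() < 0.3:
        return random_tree(depth)
    op, left, right = tree
    return (op, mutate(left, depth - 1), right) if random.random() < 0.5 \
        else (op, left, mutate(right, depth - 1))

loss = random_tree()
print(loss, evaluate(loss, y_true=1.0, y_pred=0.8), mutate(loss))
```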