February 1, 2020

2950 words 14 mins read

Paper Group AWR 196

SECTOR: A Neural Model for Coherent Topic Segmentation and Classification. Contrastive Attention Mechanism for Abstractive Sentence Summarization. Talk Proposal: Towards the Realistic Evaluation of Evasion Attacks using CARLA. HarDNet: A Low Memory Traffic Network. Multi-Graph Transformer for Free-Hand Sketch Recognition. Understanding Architecture …

SECTOR: A Neural Model for Coherent Topic Segmentation and Classification

Title SECTOR: A Neural Model for Coherent Topic Segmentation and Classification
Authors Sebastian Arnold, Rudolf Schneider, Philippe Cudré-Mauroux, Felix A. Gers, Alexander Löser
Abstract When searching for information, a human reader first glances over a document, spots relevant sections and then focuses on a few sentences for resolving her intention. However, the high variance of document structure makes it difficult to identify the salient topic of a given section at a glance. To tackle this challenge, we present SECTOR, a model to support machine reading systems by segmenting documents into coherent sections and assigning topic labels to each section. Our deep neural network architecture learns a latent topic embedding over the course of a document. This can be leveraged to classify local topics from plain text and segment a document at topic shifts. In addition, we contribute WikiSection, a publicly available dataset with 242k labeled sections in English and German from two distinct domains: diseases and cities. From our extensive evaluation of 20 architectures, we report a highest score of 71.6% F1 for the segmentation and classification of 30 topics from the English city domain, scored by our SECTOR LSTM model with bloom filter embeddings and bidirectional segmentation. This is a significant improvement of 29.5 points F1 compared to state-of-the-art CNN classifiers with baseline segmentation.
Tasks Reading Comprehension
Published 2019-02-13
URL http://arxiv.org/abs/1902.04793v1
PDF http://arxiv.org/pdf/1902.04793v1.pdf
PWC https://paperswithcode.com/paper/sector-a-neural-model-for-coherent-topic
Repo https://github.com/sebastianarnold/WikiSection
Framework none
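
The abstract above describes per-sentence topic embeddings learned with an LSTM and segmentation at topic shifts. Below is a minimal, hypothetical sketch of that idea, not the authors' code (which lives in the linked repo): a bidirectional LSTM turns sentence embeddings into topic vectors, a linear head classifies each sentence's topic, and boundaries are placed where consecutive topic vectors diverge. All dimensions and thresholds are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopicSegmenter(nn.Module):
    """Toy SECTOR-style model: sentence embeddings -> BiLSTM topic
    embeddings -> per-sentence topic logits. Dimensions are illustrative."""
    def __init__(self, sent_dim=300, topic_dim=128, num_topics=30):
        super().__init__()
        self.lstm = nn.LSTM(sent_dim, topic_dim, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * topic_dim, num_topics)

    def forward(self, sents):                 # sents: (1, num_sents, sent_dim)
        topic_emb, _ = self.lstm(sents)       # (1, num_sents, 2*topic_dim)
        return topic_emb, self.head(topic_emb)

def segment_at_topic_shifts(topic_emb, threshold=0.5):
    """Place a boundary wherever adjacent topic embeddings diverge."""
    sim = F.cosine_similarity(topic_emb[:, :-1], topic_emb[:, 1:], dim=-1)
    return (sim < threshold).nonzero(as_tuple=True)[1] + 1  # boundary indices

model = TopicSegmenter()
doc = torch.randn(1, 12, 300)                 # 12 sentence embeddings
emb, logits = model(doc)
print(segment_at_topic_shifts(emb), logits.argmax(-1))
```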

Contrastive Attention Mechanism for Abstractive Sentence Summarization

Title Contrastive Attention Mechanism for Abstractive Sentence Summarization
Authors Xiangyu Duan, Hongfei Yu, Mingming Yin, Min Zhang, Weihua Luo, Yue Zhang
Abstract We propose a contrastive attention mechanism to extend the sequence-to-sequence framework for the abstractive sentence summarization task, which aims to generate a brief summary of a given source sentence. The proposed contrastive attention mechanism accommodates two categories of attention: one is the conventional attention that attends to relevant parts of the source sentence; the other is the opponent attention that attends to irrelevant or less relevant parts of the source sentence. Both attentions are trained in opposite ways so that the contribution from the conventional attention is encouraged and the contribution from the opponent attention is discouraged through a novel softmax and softmin functionality. Experiments on benchmark datasets show that the proposed contrastive attention mechanism is more focused on the relevant parts for the summary than the conventional attention mechanism, and greatly advances the state-of-the-art performance on the abstractive sentence summarization task. We release the code at https://github.com/travel-go/Abstractive-Text-Summarization
Tasks Abstractive Sentence Summarization, Abstractive Text Summarization, Text Summarization
Published 2019-10-29
URL https://arxiv.org/abs/1910.13114v2
PDF https://arxiv.org/pdf/1910.13114v2.pdf
PWC https://paperswithcode.com/paper/contrastive-attention-mechanism-for
Repo https://github.com/travel-go/Abstractive-Text-Summarization
Framework pytorch
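
A minimal sketch of the contrastive idea described above: conventional attention uses a softmax over alignment scores, while the opponent attention uses a softmin (here computed as a softmax over negated scores) so it concentrates on the least relevant source positions; training would then encourage the former contribution and discourage the latter. Shapes and function names are assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def conventional_and_opponent_attention(query, keys, values):
    """query: (d,), keys/values: (src_len, d). Returns the two context
    vectors produced by softmax (relevant) and softmin (irrelevant) weights."""
    scores = keys @ query                      # (src_len,) alignment scores
    attn = F.softmax(scores, dim=0)            # attends to relevant tokens
    opponent = F.softmax(-scores, dim=0)       # "softmin": least relevant tokens
    return attn @ values, opponent @ values

q = torch.randn(64)
K = torch.randn(10, 64)
V = torch.randn(10, 64)
ctx, opp_ctx = conventional_and_opponent_attention(q, K, V)
# A training objective would reward predictions made from `ctx` and
# penalize those made from `opp_ctx` (the contrastive part).
```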

Talk Proposal: Towards the Realistic Evaluation of Evasion Attacks using CARLA

Title Talk Proposal: Towards the Realistic Evaluation of Evasion Attacks using CARLA
Authors Cory Cornelius, Shang-Tse Chen, Jason Martin, Duen Horng Chau
Abstract In this talk we describe our content-preserving attack on object detectors, ShapeShifter, and demonstrate how to evaluate this threat in realistic scenarios. We describe how we use CARLA, a realistic urban driving simulator, to create these scenarios, and how we use ShapeShifter to generate content-preserving attacks against those scenarios.
Tasks
Published 2019-04-18
URL http://arxiv.org/abs/1904.12622v1
PDF http://arxiv.org/pdf/1904.12622v1.pdf
PWC https://paperswithcode.com/paper/190412622
Repo https://github.com/shangtse/robust-physical-attack
Framework tf

HarDNet: A Low Memory Traffic Network

Title HarDNet: A Low Memory Traffic Network
Authors Ping Chao, Chao-Yang Kao, Yu-Shan Ruan, Chien-Hsiang Huang, Youn-Long Lin
Abstract State-of-the-art neural network architectures such as ResNet, MobileNet, and DenseNet have achieved outstanding accuracy over low MACs and small model size counterparts. However, these metrics might not be accurate for predicting the inference time. We suggest that memory traffic for accessing intermediate feature maps can be a factor dominating the inference latency, especially in such tasks as real-time object detection and semantic segmentation of high-resolution video. We propose a Harmonic Densely Connected Network to achieve high efficiency in terms of both low MACs and memory traffic. The new network achieves 35%, 36%, 30%, 32%, and 45% inference time reduction compared with FC-DenseNet-103, DenseNet-264, ResNet-50, ResNet-152, and SSD-VGG, respectively. We use tools including Nvidia profiler and ARM Scale-Sim to measure the memory traffic and verify that the inference latency is indeed proportional to the memory traffic consumption and the proposed network consumes low memory traffic. We conclude that one should take memory traffic into consideration when designing neural network architectures for high-resolution applications at the edge.
Tasks Object Detection, Real-Time Object Detection, Real-Time Semantic Segmentation, Semantic Segmentation
Published 2019-09-03
URL https://arxiv.org/abs/1909.00948v1
PDF https://arxiv.org/pdf/1909.00948v1.pdf
PWC https://paperswithcode.com/paper/hardnet-a-low-memory-traffic-network
Repo https://github.com/osmr/imgclsmob
Framework mxnet
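
The "harmonic" wiring behind HarDNet keeps concatenation traffic far below DenseNet's all-to-all pattern. Under a common reading of the connection rule (an assumption on our part, not a quote from the paper), layer k takes shortcut inputs from layer k - 2^n whenever 2^n divides k. The small helper below enumerates those inputs:

```python
def hardnet_inputs(k):
    """Indices of layers feeding layer k under the harmonic wiring rule
    (layer k reads from layer k - 2**n whenever 2**n divides k)."""
    inputs, n = [], 0
    while k % (2 ** n) == 0 and k - 2 ** n >= 0:
        inputs.append(k - 2 ** n)
        n += 1
    return inputs

if __name__ == "__main__":
    for k in range(1, 9):
        print(k, hardnet_inputs(k))
    # e.g. layer 8 -> [7, 6, 4, 0]: four concatenated inputs, whereas a
    # DenseNet layer 8 would read all of layers 0..7.
```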

Multi-Graph Transformer for Free-Hand Sketch Recognition

Title Multi-Graph Transformer for Free-Hand Sketch Recognition
Authors Peng Xu, Chaitanya K. Joshi, Xavier Bresson
Abstract Learning meaningful representations of free-hand sketches remains a challenging task given the signal sparsity and the high-level abstraction of sketches. Existing techniques have focused on exploiting either the static nature of sketches with Convolutional Neural Networks (CNNs) or the temporal sequential property with Recurrent Neural Networks (RNNs). In this work, we propose a new representation of sketches as multiple sparsely connected graphs. We design a novel Graph Neural Network (GNN), the Multi-Graph Transformer (MGT), for learning representations of sketches from multiple graphs which simultaneously capture global and local geometric stroke structures, as well as temporal information. We report extensive numerical experiments on a sketch recognition task to demonstrate the performance of the proposed approach. In particular, MGT applied to 414k sketches from Google QuickDraw: (i) achieves a small recognition gap to the CNN-based performance upper bound (72.80% vs. 74.22%), and (ii) outperforms all RNN-based models by a significant margin. To the best of our knowledge, this is the first work proposing to represent sketches as graphs and apply GNNs for sketch recognition. Code and trained models are available at https://github.com/PengBoXiangShang/multigraph_transformer.
Tasks Sketch Recognition
Published 2019-12-24
URL https://arxiv.org/abs/1912.11258v2
PDF https://arxiv.org/pdf/1912.11258v2.pdf
PWC https://paperswithcode.com/paper/multi-graph-transformer-for-free-hand-sketch
Repo https://github.com/PengBoXiangShang/multigraph_transformer
Framework pytorch
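
A rough sketch of the multi-graph idea described above: a sketch is a sequence of stroke points, several sparse adjacency matrices are built over those points (for example a temporal chain), and each attention layer is masked by one of the graphs so it only attends along that graph's edges. The graph choices and names below are illustrative assumptions, not the released code.

```python
import torch
import torch.nn.functional as F

def temporal_graph(n):
    """Chain graph: point i may attend to i-1, i, and i+1."""
    a = torch.eye(n)
    idx = torch.arange(n - 1)
    a[idx, idx + 1] = 1
    a[idx + 1, idx] = 1
    return a

def masked_self_attention(x, adj):
    """Self-attention restricted to the edges of `adj` (an n x n 0/1 mask)."""
    scores = x @ x.t() / x.shape[-1] ** 0.5
    scores = scores.masked_fill(adj == 0, float("-inf"))
    return F.softmax(scores, dim=-1) @ x

points = torch.randn(16, 64)                 # 16 stroke points, 64-dim features
graphs = [temporal_graph(16), torch.eye(16)] # second graph is a trivial placeholder
# One MGT-style layer: run attention once per graph and merge the results.
out = torch.stack([masked_self_attention(points, g) for g in graphs]).mean(0)
print(out.shape)
```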

Understanding Architectures Learnt by Cell-based Neural Architecture Search

Title Understanding Architectures Learnt by Cell-based Neural Architecture Search
Authors Yao Shu, Wei Wang, Shaofeng Cai
Abstract Neural architecture search (NAS) searches architectures automatically for given tasks, e.g., image classification and language modeling. Improving search efficiency and effectiveness has attracted increasing attention in recent years. However, few efforts have been devoted to understanding the generated architectures. In this paper, we first reveal that existing NAS algorithms (e.g., DARTS, ENAS) tend to favor architectures with wide and shallow cell structures. These favorable architectures consistently achieve fast convergence and are consequently selected by NAS algorithms. Our empirical and theoretical study further confirms that their fast convergence derives from their smooth loss landscape and accurate gradient information. Nonetheless, these architectures may not necessarily lead to better generalization performance compared with other candidate architectures in the same search space, and therefore further improvement is possible by revising existing NAS algorithms.
Tasks Image Classification, Language Modelling, Neural Architecture Search
Published 2019-09-20
URL https://arxiv.org/abs/1909.09569v3
PDF https://arxiv.org/pdf/1909.09569v3.pdf
PWC https://paperswithcode.com/paper/understanding-architectures-learnt-by-cell
Repo https://github.com/shuyao95/Understanding-NAS
Framework pytorch
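
The finding above is topological: searched cells tend to be wide and shallow. As a concrete illustration, here is a small helper that measures depth (longest path) and width (number of intermediate nodes fed directly by an input) of a cell represented as a DAG. These are simplified stand-in definitions, not necessarily the exact ones used in the paper.

```python
from functools import lru_cache

def cell_depth_and_width(edges, num_nodes, input_nodes=(0, 1)):
    """edges: list of (src, dst) pairs in a cell DAG.
    depth = longest path (counted in edges) ending at any node,
    width = number of non-input nodes that read directly from an input."""
    preds = {v: [] for v in range(num_nodes)}
    for s, d in edges:
        preds[d].append(s)

    @lru_cache(None)
    def depth(v):
        return 0 if v in input_nodes else 1 + max(depth(p) for p in preds[v])

    width = sum(1 for v in range(num_nodes)
                if v not in input_nodes and any(p in input_nodes for p in preds[v]))
    return max(depth(v) for v in range(num_nodes)), width

# A "wide and shallow" cell: every intermediate node reads from the inputs.
print(cell_depth_and_width([(0, 2), (1, 2), (0, 3), (1, 3), (0, 4), (1, 4)], 5))
# A "deep" cell: intermediate nodes are chained one after another.
print(cell_depth_and_width([(0, 2), (2, 3), (3, 4)], 5))
```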

A Unified Neural Architecture for Instrumental Audio Tasks

Title A Unified Neural Architecture for Instrumental Audio Tasks
Authors Steven Spratley, Daniel Beck, Trevor Cohn
Abstract Within Music Information Retrieval (MIR), prominent tasks – including pitch-tracking, source-separation, super-resolution, and synthesis – typically call for specialised methods, despite their similarities. Conditional Generative Adversarial Networks (cGANs) have been shown to be highly versatile in learning general image-to-image translations, but have not yet been adapted across MIR. In this work, we present an end-to-end supervisable architecture to perform all aforementioned audio tasks, consisting of a WaveNet synthesiser conditioned on the output of a jointly-trained cGAN spectrogram translator. In doing so, we demonstrate the potential of such flexible techniques to unify MIR tasks, promote efficient transfer learning, and converge research to the improvement of powerful, general methods. Finally, to the best of our knowledge, we present the first application of GANs to guided instrument synthesis.
Tasks Information Retrieval, Music Information Retrieval, Super-Resolution, Transfer Learning
Published 2019-03-01
URL http://arxiv.org/abs/1903.00142v1
PDF http://arxiv.org/pdf/1903.00142v1.pdf
PWC https://paperswithcode.com/paper/a-unified-neural-architecture-for
Repo https://github.com/r9y9/wavenet_vocoder
Framework pytorch

Explain Yourself! Leveraging Language Models for Commonsense Reasoning

Title Explain Yourself! Leveraging Language Models for Commonsense Reasoning
Authors Nazneen Fatema Rajani, Bryan McCann, Caiming Xiong, Richard Socher
Abstract Deep learning models perform poorly on tasks that require commonsense reasoning, which often necessitates some form of world-knowledge or reasoning over information not immediately present in the input. We collect human explanations for commonsense reasoning in the form of natural language sequences and highlighted annotations in a new dataset called Common Sense Explanations (CoS-E). We use CoS-E to train language models to automatically generate explanations that can be used during training and inference in a novel Commonsense Auto-Generated Explanation (CAGE) framework. CAGE improves the state-of-the-art by 10% on the challenging CommonsenseQA task. We further study commonsense reasoning in DNNs using both human and auto-generated explanations including transfer to out-of-domain tasks. Empirical results indicate that we can effectively leverage language models for commonsense reasoning.
Tasks Common Sense Reasoning
Published 2019-06-06
URL https://arxiv.org/abs/1906.02361v1
PDF https://arxiv.org/pdf/1906.02361v1.pdf
PWC https://paperswithcode.com/paper/explain-yourself-leveraging-language-models
Repo https://github.com/salesforce/cos-e
Framework none
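
CAGE, as described above, is a two-stage pipeline: a language model first generates a free-text explanation for the question, and a classifier then answers using the question plus that explanation. The sketch below shows only this control flow; both functions are hypothetical stand-ins (a canned sentence and a word-overlap scorer), not the released models.

```python
def generate_explanation(question, choices):
    """Stage 1 (stand-in): a fine-tuned language model would generate a
    free-text explanation conditioned on the question and answer choices."""
    return "Jam spoils at room temperature once opened, so it belongs in the refrigerator."

def answer_with_explanation(question, choices, explanation):
    """Stage 2 (stand-in): a classifier would score each choice given the
    question concatenated with the explanation; here we use word overlap."""
    words = set(explanation.lower().replace(".", "").split())
    return max(choices, key=lambda c: len(set(c.lower().split()) & words))

question = "Where would you store a jar of jam after opening it?"
choices = ["pantry", "refrigerator", "toolbox"]
explanation = generate_explanation(question, choices)          # explain first ...
print(explanation)
print(answer_with_explanation(question, choices, explanation)) # ... then predict
```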

Training Neural Networks with Local Error Signals

Title Training Neural Networks with Local Error Signals
Authors Arild Nøkland, Lars Hiller Eidnes
Abstract Supervised training of neural networks for classification is typically performed with a global loss function. The loss function provides a gradient for the output layer, and this gradient is back-propagated to hidden layers to dictate an update direction for the weights. An alternative approach is to train the network with layer-wise loss functions. In this paper we demonstrate, for the first time, that layer-wise training can approach the state-of-the-art on a variety of image datasets. We use single-layer sub-networks and two different supervised loss functions to generate local error signals for the hidden layers, and we show that the combination of these losses helps with optimization in the context of local learning. Using local errors could be a step towards more biologically plausible deep learning because the global error does not have to be transported back to hidden layers. A completely backprop-free variant outperforms previously reported results among methods aiming for higher biological plausibility. Code is available at https://github.com/anokland/local-loss
Tasks Image Classification
Published 2019-01-20
URL https://arxiv.org/abs/1901.06656v2
PDF https://arxiv.org/pdf/1901.06656v2.pdf
PWC https://paperswithcode.com/paper/training-neural-networks-with-local-error
Repo https://github.com/anokland/local-loss
Framework pytorch
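
A minimal PyTorch sketch of layer-wise training with local error signals, in the spirit of the abstract above (a simplified stand-in, not the linked implementation): each hidden layer gets its own auxiliary linear classifier and local cross-entropy loss, and the input to the next layer is detached so no global gradient is back-propagated. The paper combines a prediction loss with a similarity loss; only the prediction loss is shown here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

layers = nn.ModuleList([nn.Sequential(nn.Linear(784, 256), nn.ReLU()),
                        nn.Sequential(nn.Linear(256, 256), nn.ReLU())])
aux_heads = nn.ModuleList([nn.Linear(256, 10), nn.Linear(256, 10)])
opts = [torch.optim.SGD(list(l.parameters()) + list(h.parameters()), lr=0.1)
        for l, h in zip(layers, aux_heads)]

def local_training_step(x, y):
    """Train each layer with its own local loss; detach() blocks the global
    backward pass, so no error signal travels back through earlier layers."""
    h = x
    for layer, head, opt in zip(layers, aux_heads, opts):
        h = layer(h)
        loss = F.cross_entropy(head(h), y)   # local (prediction) loss only
        opt.zero_grad()
        loss.backward()
        opt.step()
        h = h.detach()                       # next layer sees a constant input
    return loss.item()

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
print(local_training_step(x, y))
```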

Group Re-identification via Transferred Single and Couple Representation Learning

Title Group Re-identification via Transferred Single and Couple Representation Learning
Authors Ziling Huang, Zheng Wang, Shin’ichi Satoh, Chia-Wen Lin
Abstract Group re-identification (G-ReID) is an important yet less-studied task. Its challenges not only lie in appearance changes of individuals, which have been well investigated in general person re-identification (ReID), but also derive from group layout and membership changes. The key task of G-ReID is therefore to learn representations robust to such changes. To address this issue, we propose a Transferred Single and Couple Representation Learning Network (TSCN). Its merits are two aspects: 1) Due to the lack of labeled training samples, existing G-ReID methods mainly rely on unsatisfactory hand-crafted features. To gain the superiority of deep learning models, we treat a group as multiple persons and transfer the domain of a labeled ReID dataset to a G-ReID target dataset style to learn single representations. 2) Taking into account the neighborhood relationship in a group, we further propose learning a novel couple representation between two group members, which achieves more discriminative power in G-ReID tasks. In addition, an unsupervised weight learning method is exploited to adaptively fuse the results of different views according to result patterns. Extensive experimental results demonstrate the effectiveness of our approach, which significantly outperforms state-of-the-art methods by 11.7% CMC-1 on the Road Group dataset and by 39.0% CMC-1 on the DukeMTMC dataset.
Tasks Person Re-Identification, Representation Learning
Published 2019-05-13
URL https://arxiv.org/abs/1905.04854v1
PDF https://arxiv.org/pdf/1905.04854v1.pdf
PWC https://paperswithcode.com/paper/group-re-identification-via-transferred
Repo https://github.com/huangzilingcv/G-ReID
Framework pytorch
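
To make the "couple representation" above concrete, here is a small hypothetical sketch (not the linked code): single representations are per-person embeddings, and a couple representation is produced by feeding the concatenated features of two group members through a small network, so that neighborhood relations inside the group are encoded explicitly.

```python
import torch
import torch.nn as nn

class CoupleEncoder(nn.Module):
    """Maps the concatenated single features of two group members to a
    couple embedding. Dimensions are illustrative."""
    def __init__(self, feat_dim=512, out_dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * feat_dim, out_dim), nn.ReLU(),
                                 nn.Linear(out_dim, out_dim))

    def forward(self, f_a, f_b):
        return self.net(torch.cat([f_a, f_b], dim=-1))

singles = torch.randn(4, 512)            # single representations of 4 members
enc = CoupleEncoder()
couples = torch.stack([enc(singles[i], singles[j])
                       for i in range(4) for j in range(i + 1, 4)])
print(couples.shape)                     # one embedding per member pair
```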

pySOT and POAP: An event-driven asynchronous framework for surrogate optimization

Title pySOT and POAP: An event-driven asynchronous framework for surrogate optimization
Authors David Eriksson, David Bindel, Christine A. Shoemaker
Abstract This paper describes Plumbing for Optimization with Asynchronous Parallelism (POAP) and the Python Surrogate Optimization Toolbox (pySOT). POAP is an event-driven framework for building and combining asynchronous optimization strategies, designed for global optimization of expensive functions where concurrent function evaluations are useful. POAP consists of three components: a worker pool capable of function evaluations, strategies to propose evaluations or other actions, and a controller that mediates the interaction between the workers and strategies. pySOT is a collection of synchronous and asynchronous surrogate optimization strategies, implemented in the POAP framework. We support the stochastic RBF method by Regis and Shoemaker along with various extensions of this method, and a general surrogate optimization strategy that covers most Bayesian optimization methods. We have implemented many different surrogate models, experimental designs, acquisition functions, and a large set of test problems. We make an extensive comparison between synchronous and asynchronous parallelism and find that the advantage of asynchronous computation increases as the variance of the evaluation time or number of processors increases. We observe a close to linear speed-up with 4, 8, and 16 processors in both the synchronous and asynchronous setting.
Tasks
Published 2019-07-30
URL https://arxiv.org/abs/1908.00420v1
PDF https://arxiv.org/pdf/1908.00420v1.pdf
PWC https://paperswithcode.com/paper/pysot-and-poap-an-event-driven-asynchronous
Repo https://github.com/dme65/pySOT
Framework none
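
The abstract above describes POAP's three components: a worker pool that evaluates the objective, a strategy that proposes evaluations, and a controller that mediates between them. The toy sketch below mimics that event-driven split with the standard library only; it is a structural illustration and deliberately does not use the real pySOT/POAP API.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed

def expensive_objective(x):                   # stand-in for a costly simulation
    return (x - 0.3) ** 2

class RandomStrategy:
    """Strategy: proposes evaluation points and is told about finished ones
    (a surrogate-based strategy would refit its model in on_complete)."""
    def propose(self):
        return random.uniform(0.0, 1.0)
    def on_complete(self, x, fx):
        pass

def run_controller(strategy, num_workers=4, budget=20):
    """Controller: keeps the worker pool busy and feeds results back to the
    strategy as soon as each evaluation finishes (asynchronous parallelism)."""
    best_x, best_f = None, float("inf")
    with ThreadPoolExecutor(num_workers) as pool:
        pending = {pool.submit(expensive_objective, x): x
                   for x in (strategy.propose() for _ in range(budget))}
        for fut in as_completed(pending):
            x, fx = pending[fut], fut.result()
            strategy.on_complete(x, fx)
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f

print(run_controller(RandomStrategy()))
```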

Chemical Names Standardization using Neural Sequence to Sequence Model

Title Chemical Names Standardization using Neural Sequence to Sequence Model
Authors Junlang Zhan, Hai Zhao
Abstract Chemical information extraction aims to convert chemical knowledge expressed in text into a structured chemical database, a text-processing task that relies heavily on identifying and standardizing chemical compound names. Once a systematic name for a chemical compound is given, it can be converted straightforwardly into the required molecular formula. However, many chemical substances appear under numerous names besides their systematic ones, which poses a great challenge for this task. In this paper, we propose a framework for automatic standardization from non-systematic names to the corresponding systematic names using spelling error correction, byte pair encoding tokenization and a neural sequence to sequence model. Our framework is trained end to end and is fully data-driven. Our standardization accuracy on the test dataset reaches 54.04%, a great improvement over the previous state-of-the-art result.
Tasks Tokenization
Published 2019-01-21
URL http://arxiv.org/abs/1901.07003v1
PDF http://arxiv.org/pdf/1901.07003v1.pdf
PWC https://paperswithcode.com/paper/chemical-names-standardization-using-neural
Repo https://github.com/zhanjunlang/Neural_Chemical_Name_Standardization
Framework pytorch
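
The pipeline described above has three stages: spelling correction of the non-systematic name, byte pair encoding tokenization, and a sequence-to-sequence model that emits the systematic name. The sketch below shows only the plumbing of that pipeline; every function in it is a hypothetical placeholder rather than the released code.

```python
def correct_spelling(name):
    """Stage 1 (stand-in): fix character-level spelling errors."""
    return name.replace("hydorxide", "hydroxide")

def bpe_tokenize(name):
    """Stage 2 (stand-in): a trained BPE model would merge frequent character
    pairs; here we simply fall back to character tokens."""
    return list(name)

def seq2seq_standardize(tokens):
    """Stage 3 (stand-in): an encoder-decoder model would generate the
    systematic name token by token from the BPE input."""
    return "".join(tokens)  # identity placeholder

def standardize(non_systematic_name):
    return seq2seq_standardize(bpe_tokenize(correct_spelling(non_systematic_name)))

print(standardize("sodium hydorxide"))   # -> "sodium hydroxide"
```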

C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure From Motion

Title C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure From Motion
Authors David Novotny, Nikhila Ravi, Benjamin Graham, Natalia Neverova, Andrea Vedaldi
Abstract We propose C3DPO, a method for extracting 3D models of deformable objects from 2D keypoint annotations in unconstrained images. We do so by learning a deep network that reconstructs a 3D object from a single view at a time, accounting for partial occlusions, and explicitly factoring the effects of viewpoint changes and object deformations. In order to achieve this factorization, we introduce a novel regularization technique. We first show that the factorization is successful if, and only if, there exists a certain canonicalization function of the reconstructed shapes. Then, we learn the canonicalization function together with the reconstruction one, which constrains the result to be consistent. We demonstrate state-of-the-art reconstruction results for methods that do not use ground-truth 3D supervision for a number of benchmarks, including Up3D and PASCAL3D+. Source code has been made available at https://github.com/facebookresearch/c3dpo_nrsfm.
Tasks
Published 2019-09-05
URL https://arxiv.org/abs/1909.02533v2
PDF https://arxiv.org/pdf/1909.02533v2.pdf
PWC https://paperswithcode.com/paper/c3dpo-canonical-3d-pose-networks-for-non
Repo https://github.com/facebookresearch/c3dpo_nrsfm
Framework pytorch
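
The regularization technique described above can be sketched as a consistency constraint: rotating a reconstructed canonical shape and passing it through a learned canonicalization network should recover the original shape, so the model is penalized when canonicalize(R·X) differs from X. This is a simplified PyTorch illustration of that loss, not the linked implementation.

```python
import torch
import torch.nn as nn

def random_rotation():
    """Random 3x3 rotation matrix via QR decomposition."""
    q, _ = torch.linalg.qr(torch.randn(3, 3))
    if torch.det(q) < 0:
        q[:, 0] = -q[:, 0]        # ensure a proper rotation (det = +1)
    return q

# Toy canonicalization network applied pointwise to 3D keypoints.
canonicalizer = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 3))

def canonicalization_loss(shape):
    """shape: (num_keypoints, 3) canonical reconstruction. Rotating it and
    passing it through the canonicalizer should give back the original."""
    rotated = shape @ random_rotation().t()
    return ((canonicalizer(rotated) - shape) ** 2).mean()

X = torch.randn(17, 3)            # e.g. 17 human-pose keypoints
print(canonicalization_loss(X))
```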

Attention-based Curiosity-driven Exploration in Deep Reinforcement Learning

Title Attention-based Curiosity-driven Exploration in Deep Reinforcement Learning
Authors Patrik Reizinger, Márton Szemenyei
Abstract Reinforcement Learning makes it possible to train an agent via interaction with the environment. However, in the majority of real-world scenarios, the extrinsic feedback is sparse or not sufficient, thus intrinsic reward formulations are needed to successfully train the agent. This work investigates and extends the paradigm of curiosity-driven exploration. First, a probabilistic approach is taken to exploit the advantages of the attention mechanism, which has been successfully applied in other domains of Deep Learning. Combining the two, we propose new methods, such as AttA2C, an extension of the Actor-Critic framework. Second, another curiosity-based approach, ICM, is extended. The proposed model utilizes attention to emphasize features for the dynamic models within ICM; moreover, we also modify the loss function, resulting in a new curiosity formulation, which we call rational curiosity. The corresponding implementation can be found at https://github.com/rpatrik96/AttA2C/.
Tasks
Published 2019-10-23
URL https://arxiv.org/abs/1910.10840v1
PDF https://arxiv.org/pdf/1910.10840v1.pdf
PWC https://paperswithcode.com/paper/attention-based-curiosity-driven-exploration
Repo https://github.com/rpatrik96/AttA2C
Framework pytorch
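
To ground the curiosity formulation mentioned above, here is a minimal sketch of an ICM-style intrinsic reward in PyTorch: a forward model predicts the next state's features from the current features and action, and the prediction error serves as the curiosity bonus. The attention-based weighting proposed in the paper is only hinted at with a learned per-feature gate; all names and dimensions are illustrative assumptions, not the linked code.

```python
import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    """Predicts next-state features from current features and a one-hot action,
    with a learned per-feature gate standing in for the attention weighting."""
    def __init__(self, feat_dim=32, num_actions=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim + num_actions, 64),
                                 nn.ReLU(), nn.Linear(64, feat_dim))
        self.gate = nn.Parameter(torch.zeros(feat_dim))   # softmax -> feature weights

    def forward(self, phi, action_onehot):
        return self.net(torch.cat([phi, action_onehot], dim=-1))

    def intrinsic_reward(self, phi, action_onehot, phi_next):
        weights = torch.softmax(self.gate, dim=0)
        error = (self.forward(phi, action_onehot) - phi_next) ** 2
        return (weights * error).sum(dim=-1)              # curiosity bonus

fm = ForwardModel()
phi, phi_next = torch.randn(8, 32), torch.randn(8, 32)
actions = torch.eye(4)[torch.randint(0, 4, (8,))]
print(fm.intrinsic_reward(phi, actions, phi_next))
```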

Improved Training Speed, Accuracy, and Data Utilization Through Loss Function Optimization

Title Improved Training Speed, Accuracy, and Data Utilization Through Loss Function Optimization
Authors Santiago Gonzalez, Risto Miikkulainen
Abstract As the complexity of neural network models has grown, it has become increasingly important to optimize their design automatically through metalearning. Methods for discovering hyperparameters, topologies, and learning rate schedules have led to significant increases in performance. This paper shows that loss functions can be optimized with metalearning as well, resulting in similar improvements. The method, Genetic Loss-function Optimization (GLO), discovers loss functions de novo and optimizes them for a target task. Leveraging techniques from genetic programming, GLO builds loss functions hierarchically from a set of operators and leaf nodes. These functions are repeatedly recombined and mutated to find an optimal structure, and then a covariance-matrix adaptation evolutionary strategy (CMA-ES) is used to find optimal coefficients. Networks trained with GLO loss functions are found to outperform the standard cross-entropy loss on standard image classification tasks. Training with these new loss functions requires fewer steps, results in lower test error, and allows smaller datasets to be used. Loss-function optimization thus provides a new dimension of metalearning and constitutes an important step towards AutoML.
Tasks AutoML, Image Classification
Published 2019-05-27
URL https://arxiv.org/abs/1905.11528v2
PDF https://arxiv.org/pdf/1905.11528v2.pdf
PWC https://paperswithcode.com/paper/improved-training-speed-accuracy-and-data
Repo https://github.com/sgonzalez/SwiftGenetics
Framework none
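
GLO, as summarized above, evolves loss functions as expression trees and then tunes their coefficients. The toy sketch below shows the representation and mutation half of that idea: a loss is a small tree of operators over the prediction, the target, and constant leaves, and mutation rewrites a random subtree. Coefficient tuning with CMA-ES is omitted, and everything here is an illustrative assumption rather than the paper's implementation.

```python
import math
import random

OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b,
       "mul": lambda a, b: a * b, "log": lambda a, _: math.log(abs(a) + 1e-8)}
LEAVES = ["y_true", "y_pred", "const"]

def random_tree(depth=2):
    """A loss genome: nested (op, left, right) tuples with leaf strings."""
    if depth <= 0:
        leaf = random.choice(LEAVES)
        return ("const", random.uniform(-1, 1)) if leaf == "const" else leaf
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, y_true, y_pred):
    if tree == "y_true":
        return y_true
    if tree == "y_pred":
        return y_pred
    if tree[0] == "const":
        return tree[1]
    op, left, right = tree
    return OPS[op](evaluate(left, y_true, y_pred), evaluate(right, y_true, y_pred))

def mutate(tree, depth=2):
    """Replace a randomly chosen subtree with a freshly sampled one."""
    if not isinstance(tree, tuple) or tree[0] == "const" or random.random() < 0.3:
        return random_tree(depth)
    op, left, right = tree
    return (op, mutate(left, depth - 1), right) if random.random() < 0.5 \
        else (op, left, mutate(right, depth - 1))

loss = random_tree()
print(loss, evaluate(loss, y_true=1.0, y_pred=0.8), mutate(loss))
```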