Paper Group AWR 66
Generalized optimal sub-pattern assignment metric. A Decomposable Attention Model for Natural Language Inference. Deep Video Deblurring. A Deep Learning Approach to Block-based Compressed Sensing of Images. A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data. Text Classification Improved by Integrating Bidirectional LSTM with Two …
Generalized optimal sub-pattern assignment metric
Title | Generalized optimal sub-pattern assignment metric |
Authors | Abu Sajana Rahmathullah, Ángel F. García-Fernández, Lennart Svensson |
Abstract | This paper presents the generalized optimal sub-pattern assignment (GOSPA) metric on the space of finite sets of targets. Compared to the well-established optimal sub-pattern assignment (OSPA) metric, GOSPA is unnormalized as a function of the cardinality and it penalizes cardinality errors differently, which enables us to express it as an optimization over assignments instead of permutations. An important consequence of this is that GOSPA allows us to penalize localization errors for detected targets and the errors due to missed and false targets, as indicated by traditional multiple target tracking (MTT) performance measures, in a sound manner. In addition, we extend the GOSPA metric to the space of random finite sets, which is important for evaluating MTT algorithms via simulations in a rigorous way. |
Tasks | none
Published | 2016-01-21 |
URL | http://arxiv.org/abs/1601.05585v7 |
PDF | http://arxiv.org/pdf/1601.05585v7.pdf
PWC | https://paperswithcode.com/paper/generalized-optimal-sub-pattern-assignment |
Repo | https://github.com/abusajana/GOSPA |
Framework | none |
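The authors' implementation lives in the repository above; below is an independent Python sketch of the metric for the MTT-oriented choice α = 2, where GOSPA reduces to an assignment problem. Parameter names c and p follow the paper; the SciPy-based implementation details are ours.

```python
# Hedged sketch of GOSPA for alpha = 2, assuming point targets in R^d.
import numpy as np
from scipy.optimize import linear_sum_assignment

def gospa(X, Y, c=10.0, p=2):
    """X, Y: (m, d) and (n, d) arrays of target states."""
    m, n = len(X), len(Y)
    if m == 0 and n == 0:
        return 0.0
    if m == 0 or n == 0:
        # Only missed or false targets: each costs c^p / 2.
        return ((c ** p / 2) * max(m, n)) ** (1 / p)
    # Pairwise distances, cut off at c: assigning a pair with d >= c costs
    # exactly as much as declaring one missed and one false target.
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    cost = np.minimum(D, c) ** p
    row, col = linear_sum_assignment(cost)
    # Localization cost for assigned pairs plus c^p/2 per unmatched target.
    total = cost[row, col].sum() + (c ** p / 2) * (m + n - 2 * len(row))
    return total ** (1 / p)
```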
A Decomposable Attention Model for Natural Language Inference
Title | A Decomposable Attention Model for Natural Language Inference |
Authors | Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit |
Abstract | We propose a simple neural architecture for natural language inference. Our approach uses attention to decompose the problem into subproblems that can be solved separately, thus making it trivially parallelizable. On the Stanford Natural Language Inference (SNLI) dataset, we obtain state-of-the-art results with almost an order of magnitude fewer parameters than previous work and without relying on any word-order information. Adding intra-sentence attention that takes a minimum amount of order into account yields further improvements. |
Tasks | Natural Language Inference |
Published | 2016-06-06 |
URL | http://arxiv.org/abs/1606.01933v2 |
PDF | http://arxiv.org/pdf/1606.01933v2.pdf
PWC | https://paperswithcode.com/paper/a-decomposable-attention-model-for-natural |
Repo | https://github.com/blcunlp/CNLI |
Framework | tf |
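As a reading aid, here is a minimal numpy sketch of the attend / compare / aggregate pipeline, treating the paper's feed-forward networks F, G and H as given callables; shapes and names beyond those are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decomposable_attention(a, b, F, G, H):
    """a: (la, d) and b: (lb, d) word-embedding matrices of the two sentences."""
    # Attend: unnormalized alignment scores decompose as F(a_i) . F(b_j).
    E = F(a) @ F(b).T                       # (la, lb)
    beta  = softmax(E, axis=1) @ b          # subphrase of b aligned to each a_i
    alpha = softmax(E, axis=0).T @ a        # subphrase of a aligned to each b_j
    # Compare: process each aligned pair separately (trivially parallel).
    v1 = G(np.concatenate([a, beta],  axis=1))
    v2 = G(np.concatenate([b, alpha], axis=1))
    # Aggregate: sum over positions, then classify.
    return H(np.concatenate([v1.sum(0), v2.sum(0)]))
```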
Deep Video Deblurring
Title | Deep Video Deblurring |
Authors | Shuochen Su, Mauricio Delbracio, Jue Wang, Guillermo Sapiro, Wolfgang Heidrich, Oliver Wang |
Abstract | Motion blur from camera shake is a major problem in videos captured by hand-held devices. Unlike single-image deblurring, video-based approaches can take advantage of the abundant information that exists across neighboring frames. As a result, the best-performing methods rely on aligning nearby frames. However, aligning images is a computationally expensive and fragile procedure, and methods that aggregate information must therefore be able to identify which regions have been accurately aligned and which have not, a task which requires high-level scene understanding. In this work, we introduce a deep learning solution to video deblurring, where a CNN is trained end-to-end to learn how to accumulate information across frames. To train this network, we collected a dataset of real videos recorded with a high-framerate camera, which we use to generate synthetic motion blur for supervision. We show that the features learned from this dataset extend to deblurring motion blur that arises due to camera shake in a wide range of videos, and compare the quality of results to a number of other baselines. |
Tasks | Deblurring, Scene Understanding |
Published | 2016-11-25 |
URL | http://arxiv.org/abs/1611.08387v1 |
PDF | http://arxiv.org/pdf/1611.08387v1.pdf
PWC | https://paperswithcode.com/paper/deep-video-deblurring |
Repo | https://github.com/susomena/DeepSlowMotion |
Framework | tf |
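A skeletal PyTorch sketch of the core idea follows: stack 2k+1 neighboring (aligned) blurry frames along the channel axis and let a CNN predict the sharp central frame. Layer widths, depths and the residual output are placeholders of ours, not the paper's encoder-decoder architecture.

```python
import torch
import torch.nn as nn

class StackDeblurNet(nn.Module):
    def __init__(self, num_frames=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 * num_frames, 64, 5, padding=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, frames):                # frames: (B, num_frames, 3, H, W)
        b, n, c, h, w = frames.shape
        x = frames.reshape(b, n * c, h, w)    # stack frames along channels
        center = frames[:, n // 2]
        # Residual formulation (our choice): predict a correction to the
        # central blurry frame rather than the sharp frame from scratch.
        return center + self.net(x)
```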
A Deep Learning Approach to Block-based Compressed Sensing of Images
Title | A Deep Learning Approach to Block-based Compressed Sensing of Images |
Authors | Amir Adler, David Boublil, Michael Elad, Michael Zibulevsky |
Abstract | Compressed sensing (CS) is a signal processing framework for efficiently reconstructing a signal from a small number of measurements, obtained by linear projections of the signal. Block-based CS is a lightweight CS approach that is mostly suitable for processing very high-dimensional images and videos: it operates on local patches, employs a low-complexity reconstruction operator and requires significantly less memory to store the sensing matrix. In this paper we present a deep learning approach for block-based CS, in which a fully-connected network performs both the block-based linear sensing and non-linear reconstruction stages. During the training phase, the sensing matrix and the non-linear reconstruction operator are *jointly* optimized, and the proposed approach outperforms the state of the art in both reconstruction quality and computation time. For example, at a 25% sensing rate the average PSNR advantage is 0.77dB and computation is over 200 times faster. |
Tasks | none
Published | 2016-06-05 |
URL | http://arxiv.org/abs/1606.01519v1 |
PDF | http://arxiv.org/pdf/1606.01519v1.pdf
PWC | https://paperswithcode.com/paper/a-deep-learning-approach-to-block-based |
Repo | https://github.com/asalp/Block-Based-CS-NNs |
Framework | none |
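The joint sensing/reconstruction idea fits in a few lines of PyTorch: the first linear layer plays the role of the learned block sensing matrix, the remaining fully-connected layers the non-linear reconstruction. Block size and layer widths below are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

block = 16 * 16                  # flattened image patch
rate = 0.25                      # sensing rate (measurements / pixels)
m = int(rate * block)

model = nn.Sequential(
    nn.Linear(block, m, bias=False),    # linear sensing: y = B x (B is learned)
    nn.Linear(m, 8 * block), nn.ReLU(), # non-linear reconstruction
    nn.Linear(8 * block, block),        # reconstructed patch
)
# Training jointly optimizes the sensing matrix and the reconstruction:
# loss = ||model(x) - x||^2 over flattened image patches x.
```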
A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data
Title | A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data |
Authors | Adam Trischler, Zheng Ye, Xingdi Yuan, Jing He, Phillip Bachman, Kaheer Suleman |
Abstract | Understanding unstructured text is a major goal within natural language processing. Comprehension tests pose questions based on short text passages to evaluate such understanding. In this work, we investigate machine comprehension on the challenging *MCTest* benchmark. Partly because of its limited size, prior work on *MCTest* has focused mainly on engineering better features. We tackle the dataset with a neural approach, harnessing simple neural networks arranged in a parallel hierarchy. The parallel hierarchy enables our model to compare the passage, question, and answer from a variety of trainable perspectives, as opposed to using a manually designed, rigid feature set. Perspectives range from the word level to sentence fragments to sequences of sentences; the networks operate only on word-embedding representations of text. When trained with a methodology designed to help cope with limited training data, our Parallel-Hierarchical model sets a new state of the art for *MCTest*, outperforming previous feature-engineered approaches slightly and previous neural approaches by a significant margin (over 15% absolute). |
Tasks | Question Answering, Reading Comprehension |
Published | 2016-03-29 |
URL | http://arxiv.org/abs/1603.08884v1 |
PDF | http://arxiv.org/pdf/1603.08884v1.pdf
PWC | https://paperswithcode.com/paper/a-parallel-hierarchical-model-for-machine |
Repo | https://github.com/Maluuba/mctest-model |
Framework | none |
Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling
Title | Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling |
Authors | Peng Zhou, Zhenyu Qi, Suncong Zheng, Jiaming Xu, Hongyun Bao, Bo Xu |
Abstract | The Recurrent Neural Network (RNN) is one of the most popular architectures used in Natural Language Processing (NLP) tasks because its recurrent structure is very suitable for processing variable-length text. An RNN can utilize distributed representations of words by first converting the tokens comprising each text into vectors, which form a matrix with two dimensions: the time-step dimension and the feature vector dimension. Most existing models then apply a one-dimensional (1D) max pooling operation, or an attention-based operation, only over the time-step dimension to obtain a fixed-length vector. However, the features on the feature vector dimension are not mutually independent, and simply applying a 1D pooling operation over the time-step dimension alone may destroy the structure of the feature representation. Applying a two-dimensional (2D) pooling operation over both dimensions, on the other hand, may sample more meaningful features for sequence modeling tasks. To integrate the features on both dimensions of the matrix, this paper explores applying a 2D max pooling operation to obtain a fixed-length representation of the text. This paper also utilizes 2D convolution to sample more meaningful information from the matrix. Experiments are conducted on six text classification tasks, including sentiment analysis, question classification, subjectivity classification and newsgroup classification. Compared with the state-of-the-art models, the proposed models achieve excellent performance on 4 out of 6 tasks. Specifically, one of the proposed models achieves the highest accuracy on the Stanford Sentiment Treebank binary classification and fine-grained classification tasks. |
Tasks | Sentiment Analysis, Text Classification |
Published | 2016-11-21 |
URL | http://arxiv.org/abs/1611.06639v1 |
PDF | http://arxiv.org/pdf/1611.06639v1.pdf
PWC | https://paperswithcode.com/paper/text-classification-improved-by-integrating |
Repo | https://github.com/ManuelVs/NeuralNetworks |
Framework | tf |
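A hedged PyTorch sketch of the pooling idea: the BiLSTM output is a (time-steps × features) matrix, and max pooling is applied over both dimensions rather than only over time. Hyperparameters are placeholders, and sequences are assumed padded to a fixed length.

```python
import torch
import torch.nn as nn

class BLSTM2DPool(nn.Module):
    def __init__(self, vocab, emb=100, hidden=100, classes=5):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, bidirectional=True, batch_first=True)
        self.pool = nn.MaxPool2d(kernel_size=(2, 2))  # 2D max pooling
        self.fc = nn.LazyLinear(classes)

    def forward(self, tokens):                  # tokens: (B, T), fixed T
        h, _ = self.lstm(self.embed(tokens))    # (B, T, 2*hidden)
        z = self.pool(h.unsqueeze(1))           # pool over time AND feature dims
        return self.fc(z.flatten(1))
```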
Semantic Tagging with Deep Residual Networks
Title | Semantic Tagging with Deep Residual Networks |
Authors | Johannes Bjerva, Barbara Plank, Johan Bos |
Abstract | We propose a novel semantic tagging task, sem-tagging, tailored for the purpose of multilingual semantic parsing, and present the first tagger using deep residual networks (ResNets). Our tagger uses both word and character representations and includes a novel residual bypass architecture. We evaluate the tagset both intrinsically, on the new task of semantic tagging, and on Part-of-Speech (POS) tagging. Our system, consisting of a ResNet and an auxiliary loss function predicting our semantic tags, significantly outperforms prior results on English Universal Dependencies POS tagging (95.71% accuracy on UD v1.2 and 95.67% accuracy on UD v1.3). |
Tasks | Part-Of-Speech Tagging, Semantic Parsing |
Published | 2016-09-22 |
URL | http://arxiv.org/abs/1609.07053v2 |
PDF | http://arxiv.org/pdf/1609.07053v2.pdf
PWC | https://paperswithcode.com/paper/semantic-tagging-with-deep-residual-networks |
Repo | https://github.com/bjerva/semantic-tagging |
Framework | none |
Improving Coreference Resolution by Learning Entity-Level Distributed Representations
Title | Improving Coreference Resolution by Learning Entity-Level Distributed Representations |
Authors | Kevin Clark, Christopher D. Manning |
Abstract | A long-standing challenge in coreference resolution has been the incorporation of entity-level information: features defined over clusters of mentions instead of mention pairs. We present a neural network based coreference system that produces high-dimensional vector representations for pairs of coreference clusters. Using these representations, our system learns when combining clusters is desirable. We train the system with a learning-to-search algorithm that teaches it which local decisions (cluster merges) will lead to a high-scoring final coreference partition. The system substantially outperforms the current state of the art on the English and Chinese portions of the CoNLL 2012 Shared Task dataset despite using few hand-engineered features. |
Tasks | Coreference Resolution |
Published | 2016-06-04 |
URL | http://arxiv.org/abs/1606.01323v2 |
PDF | http://arxiv.org/pdf/1606.01323v2.pdf
PWC | https://paperswithcode.com/paper/improving-coreference-resolution-by-learning |
Repo | https://github.com/clarkkev/deep-coref |
Framework | none |
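One way to read "representations for pairs of coreference clusters" is as pooling over the mention-pair vectors between two clusters; below is a minimal numpy sketch of that reading, with max-plus-average pooling as our assumption.

```python
import numpy as np

def cluster_pair_rep(pair_reps):
    """pair_reps: (num_mention_pairs, d) array of mention-pair vectors for
    all pairs (m1, m2) with m1 in cluster c1 and m2 in cluster c2.
    Returns a fixed-size representation of the cluster pair, which a scorer
    can use to decide whether merging c1 and c2 is desirable."""
    return np.concatenate([pair_reps.max(axis=0), pair_reps.mean(axis=0)])
```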
Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders
Title | Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders |
Authors | Antonio Valerio Miceli Barone |
Abstract | Current approaches to learning vector representations of text that are compatible between different languages usually require some amount of parallel text, aligned at the word, sentence or at least document level. We hypothesize, however, that different natural languages share enough semantic structure that it should be possible, in principle, to learn compatible vector representations just by analyzing the monolingual distribution of words. In order to evaluate this hypothesis, we propose a scheme to map word vectors trained on a source language to vectors semantically compatible with word vectors trained on a target language using an adversarial autoencoder. We present preliminary qualitative results and discuss possible future developments of this technique, such as applications to cross-lingual sentence representations. |
Tasks | none
Published | 2016-08-09 |
URL | http://arxiv.org/abs/1608.02996v1 |
PDF | http://arxiv.org/pdf/1608.02996v1.pdf
PWC | https://paperswithcode.com/paper/towards-cross-lingual-distributed |
Repo | https://github.com/Avmb/clweadv |
Framework | none |
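A hedged PyTorch sketch of the adversarial component: a generator maps source-language word vectors into the target embedding space while a discriminator tries to tell mapped vectors from real target vectors. The paper's full model is an adversarial *autoencoder* with a reconstruction path; the linear generator and the widths here are our simplifications.

```python
import torch
import torch.nn as nn

d = 300
G = nn.Linear(d, d, bias=False)                     # source -> target map
D = nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, 1))
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)

def step(src_batch, tgt_batch):
    # Discriminator: real target vectors vs. mapped source vectors.
    fake = G(src_batch).detach()
    loss_d = bce(D(tgt_batch), torch.ones(len(tgt_batch), 1)) + \
             bce(D(fake), torch.zeros(len(fake), 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator: try to fool the discriminator.
    loss_g = bce(D(G(src_batch)), torch.ones(len(src_batch), 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```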
Residual Networks Behave Like Ensembles of Relatively Shallow Networks
Title | Residual Networks Behave Like Ensembles of Relatively Shallow Networks |
Authors | Andreas Veit, Michael Wilber, Serge Belongie |
Abstract | In this work we propose a novel interpretation of residual networks showing that they can be seen as a collection of many paths of differing length. Moreover, residual networks seem to enable very deep networks by leveraging only the short paths during training. To support this observation, we rewrite residual networks as an explicit collection of paths. Unlike traditional models, paths through residual networks vary in length. Further, a lesion study reveals that these paths show ensemble-like behavior in the sense that they do not strongly depend on each other. Finally, and most surprisingly, most paths are shorter than one might expect, and only the short paths are needed during training, as longer paths do not contribute any gradient. For example, most of the gradient in a residual network with 110 layers comes from paths that are only 10-34 layers deep. Our results reveal one of the key characteristics that seem to enable the training of very deep networks: residual networks avoid the vanishing gradient problem by introducing short paths which can carry gradient throughout the extent of very deep networks. |
Tasks | none
Published | 2016-05-20 |
URL | http://arxiv.org/abs/1605.06431v2 |
PDF | http://arxiv.org/pdf/1605.06431v2.pdf
PWC | https://paperswithcode.com/paper/residual-networks-behave-like-ensembles-of |
Repo | https://github.com/andreasveit/densenet-pytorch |
Framework | pytorch |
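The "collection of paths" view is easy to probe numerically: a path either enters or skips each of the n residual blocks, so path lengths follow Binomial(n, 0.5). Taking 54 residual blocks for the 110-layer network (the usual two-layers-per-block CIFAR design, an assumption here) shows that most paths are far shorter than the nominal depth; the paper's gradient result additionally weights each length by its measured gradient norm.

```python
from scipy.stats import binom

n_blocks = 54                       # assumed residual blocks in the 110-layer net
dist = binom(n_blocks, 0.5)         # path length = number of blocks entered
print("mean path length:", dist.mean())                       # 27 blocks
print("P(10 <= length <= 34):", dist.cdf(34) - dist.cdf(9))   # bulk of the mass
```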
Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification
Title | Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification |
Authors | Tong Xiao, Hongsheng Li, Wanli Ouyang, Xiaogang Wang |
Abstract | Learning generic and robust feature representations with data from multiple domains for the same problem is of great value, especially for problems that have multiple datasets, none of which is large enough to provide abundant data variations. In this work, we present a pipeline for learning deep feature representations from multiple domains with Convolutional Neural Networks (CNNs). When training a CNN with data from all the domains, some neurons learn representations shared across several domains, while some others are effective only for a specific one. Based on this important observation, we propose a Domain Guided Dropout algorithm to improve the feature learning procedure. Experiments show the effectiveness of our pipeline and the proposed algorithm. Our methods on the person re-identification problem outperform state-of-the-art methods on multiple datasets by large margins. |
Tasks | Person Re-Identification |
Published | 2016-04-26 |
URL | http://arxiv.org/abs/1604.07528v1 |
PDF | http://arxiv.org/pdf/1604.07528v1.pdf
PWC | https://paperswithcode.com/paper/learning-deep-feature-representations-with |
Repo | https://github.com/Cysu/person_reid |
Framework | none |
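A numpy sketch of the Domain Guided Dropout recipe as described: score every neuron by how much the loss increases when it is muted on one domain's data, then gate neurons per domain from those scores. The deterministic variant (keep a neuron iff its impact is positive) is shown; loss_fn and the activation matrix are assumed inputs.

```python
import numpy as np

def impact_scores(loss_fn, activations):
    """activations: (N, num_neurons) hidden responses on one domain's data.
    Impact of neuron i = loss with neuron i muted minus baseline loss."""
    base = loss_fn(activations)
    scores = np.empty(activations.shape[1])
    for i in range(activations.shape[1]):
        muted = activations.copy()
        muted[:, i] = 0.0
        scores[i] = loss_fn(muted) - base
    return scores

def domain_mask(scores):
    # Deterministic variant: keep only neurons whose muting hurts the loss.
    return (scores > 0).astype(np.float32)

# Forward pass for a sample from domain k: h = h * domain_mask(scores_k)
```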
Feature Selection: A Data Perspective
Title | Feature Selection: A Data Perspective |
Authors | Jundong Li, Kewei Cheng, Suhang Wang, Fred Morstatter, Robert P. Trevino, Jiliang Tang, Huan Liu |
Abstract | Feature selection, as a data preprocessing strategy, has been proven to be effective and efficient in preparing data (especially high-dimensional data) for various data mining and machine learning problems. The objectives of feature selection include: building simpler and more comprehensible models, improving data mining performance, and preparing clean, understandable data. The recent proliferation of big data has presented some substantial challenges and opportunities to feature selection. In this survey, we provide a comprehensive and structured overview of recent advances in feature selection research. Motivated by current challenges and opportunities in the era of big data, we revisit feature selection research from a data perspective and review representative feature selection algorithms for conventional data, structured data, heterogeneous data and streaming data. Methodologically, to emphasize the differences and similarities of most existing feature selection algorithms for conventional data, we categorize them into four main groups: similarity based, information theoretical based, sparse learning based and statistical based methods. To facilitate and promote the research in this community, we also present an open-source feature selection repository that consists of most of the popular feature selection algorithms (http://featureselection.asu.edu/). We also use it as an example to show how to evaluate feature selection algorithms. At the end of the survey, we present a discussion about some open problems and challenges that require more attention in future research. |
Tasks | Feature Selection, Sparse Learning |
Published | 2016-01-29 |
URL | http://arxiv.org/abs/1601.07996v5 |
PDF | http://arxiv.org/pdf/1601.07996v5.pdf
PWC | https://paperswithcode.com/paper/feature-selection-a-data-perspective |
Repo | https://github.com/jundongl/scikit-feature |
Framework | none |
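To make the "similarity based" category concrete, here is a plain numpy version of the Fisher score, one of the classic filter criteria the survey covers (the linked scikit-feature repository ships reference implementations of this and many more).

```python
import numpy as np

def fisher_score(X, y):
    """X: (n_samples, n_features); y: integer class labels.
    Higher score = between-class spread dominates within-class spread."""
    classes, overall_mean = np.unique(y), X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / np.maximum(den, 1e-12)

# Rank features, best first: ranking = np.argsort(fisher_score(X, y))[::-1]
```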
#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
Title | #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning |
Authors | Haoran Tang, Rein Houthooft, Davis Foote, Adam Stooke, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel |
Abstract | Count-based exploration algorithms are known to perform near-optimally when used in conjunction with tabular reinforcement learning (RL) methods for solving small discrete Markov decision processes (MDPs). It is generally thought that count-based methods cannot be applied in high-dimensional state spaces, since most states will only occur once. Recent deep RL exploration strategies are able to deal with high-dimensional continuous state spaces through complex heuristics, often relying on optimism in the face of uncertainty or intrinsic motivation. In this work, we describe a surprising finding: a simple generalization of the classic count-based approach can reach near state-of-the-art performance on various high-dimensional and/or continuous deep RL benchmarks. States are mapped to hash codes, which makes it possible to count their occurrences with a hash table. These counts are then used to compute a reward bonus according to the classic count-based exploration theory. We find that simple hash functions can achieve surprisingly good results on many challenging tasks. Furthermore, we show that a domain-dependent learned hash code may further improve these results. Detailed analysis reveals important aspects of a good hash function: 1) having appropriate granularity and 2) encoding information relevant to solving the MDP. This exploration strategy achieves near state-of-the-art performance on both continuous control tasks and Atari 2600 games, hence providing a simple yet powerful baseline for solving MDPs that require considerable exploration. |
Tasks | Atari Games, Continuous Control |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04717v3 |
PDF | http://arxiv.org/pdf/1611.04717v3.pdf
PWC | https://paperswithcode.com/paper/exploration-a-study-of-count-based |
Repo | https://github.com/nhynes/abc |
Framework | pytorch |
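A numpy sketch of the hashing trick: SimHash maps a (possibly preprocessed) state to a short binary code, a table counts code occurrences, and the agent receives a bonus of the classic count-based form β/√n(code). The code length k and β below are placeholder values.

```python
import numpy as np
from collections import defaultdict

class HashExploration:
    def __init__(self, state_dim, k=32, beta=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.A = rng.standard_normal((k, state_dim))   # fixed random projection
        self.counts = defaultdict(int)
        self.beta = beta

    def bonus(self, state):
        code = tuple((self.A @ state) > 0)             # SimHash binary code
        self.counts[code] += 1                         # hash-table count
        return self.beta / np.sqrt(self.counts[code])

# In the RL loop: reward_augmented = env_reward + explorer.bonus(observation)
```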
Text Understanding with the Attention Sum Reader Network
Title | Text Understanding with the Attention Sum Reader Network |
Authors | Rudolf Kadlec, Martin Schmid, Ondrej Bajgar, Jan Kleindienst |
Abstract | Several large cloze-style context-question-answer datasets have been introduced recently: the CNN and Daily Mail news data and the Children’s Book Test. Thanks to the size of these datasets, the associated text comprehension task is well suited for deep-learning techniques that currently seem to outperform all alternative approaches. We present a new, simple model that uses attention to directly pick the answer from the context, as opposed to computing the answer using a blended representation of words in the document as is usual in similar models. This makes the model particularly suitable for question-answering problems where the answer is a single word from the document. An ensemble of our models sets a new state of the art on all evaluated datasets. |
Tasks | Machine Reading Comprehension, Open-Domain Question Answering, Question Answering, Reading Comprehension |
Published | 2016-03-04 |
URL | http://arxiv.org/abs/1603.01547v2 |
PDF | http://arxiv.org/pdf/1603.01547v2.pdf
PWC | https://paperswithcode.com/paper/text-understanding-with-the-attention-sum |
Repo | https://github.com/shuxiaobo/QA-Experiment |
Framework | tf |
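The model's defining step is compact enough to write out: attention weights over document positions are computed from a query vector, and the probabilities of positions are summed per candidate word. The encoders producing d_enc (T × h) and q_enc (h,) are abstracted away in this sketch.

```python
import numpy as np

def attention_sum(d_enc, q_enc, doc_tokens, candidates):
    """Return p(candidate | document, query) by summing position probabilities."""
    scores = d_enc @ q_enc
    p = np.exp(scores - scores.max())
    p /= p.sum()                               # softmax over document positions
    return {w: p[[i for i, t in enumerate(doc_tokens) if t == w]].sum()
            for w in candidates}

# answer = max(probs, key=probs.get)
```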
End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
Title | End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF |
Authors | Xuezhe Ma, Eduard Hovy |
Abstract | State-of-the-art sequence labeling systems traditionally require large amounts of task-specific knowledge in the form of hand-crafted features and data pre-processing. In this paper, we introduce a novel neural network architecture that benefits from both word- and character-level representations automatically, by using a combination of bidirectional LSTM, CNN and CRF. Our system is truly end-to-end, requiring no feature engineering or data pre-processing, thus making it applicable to a wide range of sequence labeling tasks. We evaluate our system on two datasets for two sequence labeling tasks: the Penn Treebank WSJ corpus for part-of-speech (POS) tagging and the CoNLL 2003 corpus for named entity recognition (NER). We obtain state-of-the-art performance on both: 97.55% accuracy for POS tagging and 91.21% F1 for NER. |
Tasks | Feature Engineering, Named Entity Recognition, Part-Of-Speech Tagging |
Published | 2016-03-04 |
URL | http://arxiv.org/abs/1603.01354v5 |
PDF | http://arxiv.org/pdf/1603.01354v5.pdf
PWC | https://paperswithcode.com/paper/end-to-end-sequence-labeling-via-bi |
Repo | https://github.com/XiafeiYu/CNN_BILSTM_CRF |
Framework | tf |
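To illustrate the CRF layer's role, here is its decoding step in plain numpy: given per-token label scores from the BiLSTM-CNN encoder (emissions) and a learned label-transition matrix, Viterbi recovers the best label sequence. Training, which needs the forward algorithm, is omitted in this sketch.

```python
import numpy as np

def viterbi(emissions, transitions):
    """emissions: (T, K) per-token label scores;
    transitions: (K, K), transitions[i, j] = score of moving from label i to j."""
    T, K = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # total[i, j] = best score ending in label i at t-1, then label j at t.
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    # Backtrack from the best final label.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```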