May 7, 2019

2931 words 14 mins read

Paper Group AWR 66

Generalized optimal sub-pattern assignment metric. A Decomposable Attention Model for Natural Language Inference. Deep Video Deblurring. A Deep Learning Approach to Block-based Compressed Sensing of Images. A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data. Text Classification Improved by Integrating Bidirectional LSTM with Two …

Generalized optimal sub-pattern assignment metric

Title Generalized optimal sub-pattern assignment metric
Authors Abu Sajana Rahmathullah, Ángel F. García-Fernández, Lennart Svensson
Abstract This paper presents the generalized optimal sub-pattern assignment (GOSPA) metric on the space of finite sets of targets. Compared to the well-established optimal sub-pattern assignment (OSPA) metric, GOSPA is unnormalized as a function of the cardinality and it penalizes cardinality errors differently, which enables us to express it as an optimization over assignments instead of permutations. An important consequence of this is that GOSPA allows us to penalize localization errors for detected targets and the errors due to missed and false targets, as indicated by traditional multiple target tracking (MTT) performance measures, in a sound manner. In addition, we extend the GOSPA metric to the space of random finite sets, which is important for evaluating MTT algorithms rigorously via simulations.
Tasks
Published 2016-01-21
URL http://arxiv.org/abs/1601.05585v7
PDF http://arxiv.org/pdf/1601.05585v7.pdf
PWC https://paperswithcode.com/paper/generalized-optimal-sub-pattern-assignment
Repo https://github.com/abusajana/GOSPA
Framework none
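
Since GOSPA is defined as an optimization over assignments rather than permutations, it can be computed exactly with the Hungarian algorithm. Below is a minimal sketch for the common alpha = 2 case, assuming a Euclidean ground distance and NumPy/SciPy; it is an illustration of the metric's structure, not the authors' reference implementation (see the linked repo for that).

```python
# Minimal GOSPA (alpha = 2) sketch: Euclidean ground distance, cutoff c, order p.
import numpy as np
from scipy.optimize import linear_sum_assignment

def gospa(X, Y, c=10.0, p=2):
    """GOSPA distance between two sets of target states (one state per row)."""
    X, Y = np.atleast_2d(X), np.atleast_2d(Y)
    m, n = len(X), len(Y)
    if m == 0 or n == 0:
        # Only cardinality (missed/false target) penalties remain.
        return (c ** p / 2 * (m + n)) ** (1 / p)
    # Pairwise ground distances, cut off at c, raised to the p-th power.
    D = np.minimum(np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1), c) ** p
    # Optimal assignment between the two sets (not a permutation).
    row, col = linear_sum_assignment(D)
    # Pairs that hit the cutoff are equivalent to a missed + false target.
    matched = D[row, col] < c ** p
    loc_cost = D[row, col][matched].sum()
    card_cost = c ** p / 2 * (m + n - 2 * matched.sum())
    return (loc_cost + card_cost) ** (1 / p)
```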

A Decomposable Attention Model for Natural Language Inference

Title A Decomposable Attention Model for Natural Language Inference
Authors Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit
Abstract We propose a simple neural architecture for natural language inference. Our approach uses attention to decompose the problem into subproblems that can be solved separately, thus making it trivially parallelizable. On the Stanford Natural Language Inference (SNLI) dataset, we obtain state-of-the-art results with almost an order of magnitude fewer parameters than previous work and without relying on any word-order information. Adding intra-sentence attention that takes a minimum amount of order into account yields further improvements.
Tasks Natural Language Inference
Published 2016-06-06
URL http://arxiv.org/abs/1606.01933v2
PDF http://arxiv.org/pdf/1606.01933v2.pdf
PWC https://paperswithcode.com/paper/a-decomposable-attention-model-for-natural
Repo https://github.com/blcunlp/CNLI
Framework tf
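
The three steps the abstract alludes to (attend, compare, aggregate) fit in a few lines. The sketch below is a hedged, unbatched PyTorch rendition with illustrative layer sizes; F, G and H stand in for the paper's feed-forward networks, and intra-sentence attention is omitted.

```python
# Decomposable attention sketch: attend / compare / aggregate over two
# pre-embedded sentences a (la x d) and b (lb x d). Unbatched for clarity.
import torch
import torch.nn as nn

class DecomposableAttention(nn.Module):
    def __init__(self, d, hidden, n_classes=3):
        super().__init__()
        mlp = lambda d_in: nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU(),
                                         nn.Linear(hidden, hidden), nn.ReLU())
        self.F, self.G, self.H = mlp(d), mlp(2 * d), mlp(2 * hidden)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, a, b):
        # Attend: unnormalized alignment scores between all word pairs.
        e = self.F(a) @ self.F(b).T                  # (la, lb)
        beta = torch.softmax(e, dim=1) @ b           # b aligned to each a_i
        alpha = torch.softmax(e, dim=0).T @ a        # a aligned to each b_j
        # Compare: process each word jointly with its aligned counterpart.
        v1 = self.G(torch.cat([a, beta], dim=-1))
        v2 = self.G(torch.cat([b, alpha], dim=-1))
        # Aggregate: sum over words, then classify.
        return self.out(self.H(torch.cat([v1.sum(0), v2.sum(0)], dim=-1)))
```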

Deep Video Deblurring

Title Deep Video Deblurring
Authors Shuochen Su, Mauricio Delbracio, Jue Wang, Guillermo Sapiro, Wolfgang Heidrich, Oliver Wang
Abstract Motion blur from camera shake is a major problem in videos captured by hand-held devices. Unlike single-image deblurring, video-based approaches can take advantage of the abundant information that exists across neighboring frames. As a result, the best-performing methods rely on aligning nearby frames. However, aligning images is a computationally expensive and fragile procedure, and methods that aggregate information must therefore be able to identify which regions have been accurately aligned and which have not, a task that requires high-level scene understanding. In this work, we introduce a deep learning solution to video deblurring, where a CNN is trained end-to-end to learn how to accumulate information across frames. To train this network, we collected a dataset of real videos recorded with a high-framerate camera, which we use to generate synthetic motion blur for supervision. We show that the features learned from this dataset extend to deblurring motion blur that arises due to camera shake in a wide range of videos, and compare the quality of results to a number of other baselines.
Tasks Deblurring, Scene Understanding
Published 2016-11-25
URL http://arxiv.org/abs/1611.08387v1
PDF http://arxiv.org/pdf/1611.08387v1.pdf
PWC https://paperswithcode.com/paper/deep-video-deblurring
Repo https://github.com/susomena/DeepSlowMotion
Framework tf
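
A hedged sketch of the core input design, assuming PyTorch: a window of neighboring frames is stacked along the channel axis and a CNN regresses the sharp center frame. The paper's actual network is a much deeper encoder-decoder with skip connections; the layer sizes here are illustrative only.

```python
# Frame-stacking sketch: 2k+1 RGB frames in, one sharp RGB frame out.
import torch
import torch.nn as nn

class StackDeblurNet(nn.Module):
    def __init__(self, n_frames=5):
        super().__init__()
        c_in = 3 * n_frames  # RGB frames concatenated on the channel axis
        self.net = nn.Sequential(
            nn.Conv2d(c_in, 64, 5, padding=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),  # sharp frame in [0, 1]
        )

    def forward(self, frames):            # frames: (B, n_frames, 3, H, W)
        B, N, C, H, W = frames.shape
        return self.net(frames.reshape(B, N * C, H, W))

# blurry = torch.rand(1, 5, 3, 128, 128)  # center frame + 2 neighbors each side
# sharp = StackDeblurNet()(blurry)        # (1, 3, 128, 128)
```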

A Deep Learning Approach to Block-based Compressed Sensing of Images

Title A Deep Learning Approach to Block-based Compressed Sensing of Images
Authors Amir Adler, David Boublil, Michael Elad, Michael Zibulevsky
Abstract Compressed sensing (CS) is a signal processing framework for efficiently reconstructing a signal from a small number of measurements, obtained by linear projections of the signal. Block-based CS is a lightweight CS approach that is mostly suitable for processing very high-dimensional images and videos: it operates on local patches, employs a low-complexity reconstruction operator and requires significantly less memory to store the sensing matrix. In this paper we present a deep learning approach for block-based CS, in which a fully-connected network performs both the block-based linear sensing and non-linear reconstruction stages. During the training phase, the sensing matrix and the non-linear reconstruction operator are jointly optimized, and the proposed approach outperforms the state of the art in both reconstruction quality and computation time. For example, at a 25% sensing rate the average PSNR advantage is 0.77 dB and reconstruction is over 200 times faster.
Tasks
Published 2016-06-05
URL http://arxiv.org/abs/1606.01519v1
PDF http://arxiv.org/pdf/1606.01519v1.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-approach-to-block-based
Repo https://github.com/asalp/Block-Based-CS-NNs
Framework none
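
The joint optimization the abstract describes reduces to backpropagating a reconstruction loss through both stages. A minimal PyTorch sketch, with illustrative sizes and random data standing in for flattened image blocks:

```python
# Joint sensing/reconstruction sketch: the linear layer plays the role of
# the block sensing matrix and is learned together with the recovery MLP.
import torch
import torch.nn as nn

block_dim, sensing_rate = 16 * 16, 0.25
m = int(block_dim * sensing_rate)              # measurements per block

sensing = nn.Linear(block_dim, m, bias=False)  # y = A x with learned A
reconstruct = nn.Sequential(                   # non-linear recovery stage
    nn.Linear(m, 512), nn.ReLU(),
    nn.Linear(512, block_dim),
)

opt = torch.optim.Adam(list(sensing.parameters()) +
                       list(reconstruct.parameters()), lr=1e-3)
x = torch.rand(64, block_dim)                  # toy batch of image blocks
for _ in range(100):                           # toy training loop
    loss = nn.functional.mse_loss(reconstruct(sensing(x)), x)
    opt.zero_grad(); loss.backward(); opt.step()
```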

A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data

Title A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data
Authors Adam Trischler, Zheng Ye, Xingdi Yuan, Jing He, Phillip Bachman, Kaheer Suleman
Abstract Understanding unstructured text is a major goal within natural language processing. Comprehension tests pose questions based on short text passages to evaluate such understanding. In this work, we investigate machine comprehension on the challenging MCTest benchmark. Partly because of its limited size, prior work on MCTest has focused mainly on engineering better features. We tackle the dataset with a neural approach, harnessing simple neural networks arranged in a parallel hierarchy. The parallel hierarchy enables our model to compare the passage, question, and answer from a variety of trainable perspectives, as opposed to using a manually designed, rigid feature set. Perspectives range from the word level to sentence fragments to sequences of sentences; the networks operate only on word-embedding representations of text. When trained with a methodology designed to help cope with limited training data, our Parallel-Hierarchical model sets a new state of the art for MCTest, outperforming previous feature-engineered approaches slightly and previous neural approaches by a significant margin (over 15% absolute).
Tasks Question Answering, Reading Comprehension
Published 2016-03-29
URL http://arxiv.org/abs/1603.08884v1
PDF http://arxiv.org/pdf/1603.08884v1.pdf
PWC https://paperswithcode.com/paper/a-parallel-hierarchical-model-for-machine
Repo https://github.com/Maluuba/mctest-model
Framework none
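
As a toy illustration of what one "perspective" can look like, the sketch below scores each candidate answer by the best cosine match between a bag-of-embeddings of the question plus candidate and each passage sentence. This is only a rough, non-trainable analogue of one of the model's semantic views; the full model combines many trainable perspectives from the word level up to sentence sequences.

```python
# Toy sentence-level "semantic perspective": sum word vectors, take the
# best-matching passage sentence per candidate answer. Illustrative only.
import numpy as np

def embed(words, vecs):                    # bag-of-embeddings of a word list
    return np.sum([vecs[w] for w in words if w in vecs], axis=0)

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)

def score_candidate(sentences, question, candidate, vecs):
    q = embed(question + candidate, vecs)  # question joined with the answer
    return max(cosine(embed(s, vecs), q) for s in sentences)

# best = max(candidates, key=lambda c: score_candidate(sents, q_words, c, vecs))
```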

Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling

Title Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling
Authors Peng Zhou, Zhenyu Qi, Suncong Zheng, Jiaming Xu, Hongyun Bao, Bo Xu
Abstract Recurrent Neural Networks (RNNs) are among the most popular architectures used in Natural Language Processing (NLP) tasks because their recurrent structure is well suited to processing variable-length text. An RNN can utilize distributed representations of words by first converting the tokens comprising each text into vectors, which form a matrix with two dimensions: the time-step dimension and the feature-vector dimension. Most existing models then apply a one-dimensional (1D) max pooling operation or an attention-based operation only over the time-step dimension to obtain a fixed-length vector. However, the features along the feature-vector dimension are not mutually independent, and simply applying a 1D pooling operation over the time-step dimension alone may destroy the structure of the feature representation. Applying a two-dimensional (2D) pooling operation over both dimensions, on the other hand, may sample more meaningful features for sequence modeling tasks. To integrate the features along both dimensions of the matrix, this paper explores applying a 2D max pooling operation to obtain a fixed-length representation of the text, and also utilizes 2D convolution to sample more meaningful information from the matrix. Experiments are conducted on six text classification tasks, including sentiment analysis, question classification, subjectivity classification and newsgroup classification. Compared with state-of-the-art models, the proposed models achieve excellent performance on 4 out of 6 tasks. Specifically, one of the proposed models achieves the highest accuracy on the Stanford Sentiment Treebank binary and fine-grained classification tasks.
Tasks Sentiment Analysis, Text Classification
Published 2016-11-21
URL http://arxiv.org/abs/1611.06639v1
PDF http://arxiv.org/pdf/1611.06639v1.pdf
PWC https://paperswithcode.com/paper/text-classification-improved-by-integrating
Repo https://github.com/ManuelVs/NeuralNetworks
Framework tf
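
A compact PyTorch sketch of the pipeline, with illustrative dimensions: the BLSTM output matrix (time steps by feature dimensions) is treated as a one-channel image so that convolution and max pooling act over both dimensions. Adaptive pooling is used here for simplicity; the paper uses fixed pooling windows.

```python
# BLSTM + 2D convolution + 2D max pooling sketch for text classification.
import torch
import torch.nn as nn

class BLSTM2DCNN(nn.Module):
    def __init__(self, vocab, emb=100, hidden=150, n_classes=5):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, bidirectional=True, batch_first=True)
        self.conv = nn.Conv2d(1, 100, kernel_size=3)
        self.pool = nn.AdaptiveMaxPool2d((4, 4))  # pools over BOTH dimensions
        self.fc = nn.Linear(100 * 4 * 4, n_classes)

    def forward(self, tokens):                    # tokens: (B, T)
        h, _ = self.lstm(self.emb(tokens))        # (B, T, 2*hidden)
        x = torch.relu(self.conv(h.unsqueeze(1))) # (B, 100, T-2, 2*hidden-2)
        return self.fc(self.pool(x).flatten(1))   # (B, n_classes)
```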

Semantic Tagging with Deep Residual Networks

Title Semantic Tagging with Deep Residual Networks
Authors Johannes Bjerva, Barbara Plank, Johan Bos
Abstract We propose a novel semantic tagging task, sem-tagging, tailored for the purpose of multilingual semantic parsing, and present the first tagger using deep residual networks (ResNets). Our tagger uses both word and character representations and includes a novel residual bypass architecture. We evaluate the tagset both intrinsically on the new task of semantic tagging, as well as on Part-of-Speech (POS) tagging. Our system, consisting of a ResNet and an auxiliary loss function predicting our semantic tags, significantly outperforms prior results on English Universal Dependencies POS tagging (95.71% accuracy on UD v1.2 and 95.67% accuracy on UD v1.3).
Tasks Part-Of-Speech Tagging, Semantic Parsing
Published 2016-09-22
URL http://arxiv.org/abs/1609.07053v2
PDF http://arxiv.org/pdf/1609.07053v2.pdf
PWC https://paperswithcode.com/paper/semantic-tagging-with-deep-residual-networks
Repo https://github.com/bjerva/semantic-tagging
Framework none
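
A minimal sketch of the residual idea applied to tagging, assuming PyTorch: stacked convolutions over token representations with identity bypasses, followed by per-token tag scores. The paper's full model additionally uses a character-level path, a residual bypass of the input, and an auxiliary loss, none of which are shown here.

```python
# Residual tagging sketch: 1D conv blocks with identity shortcuts.
import torch
import torch.nn as nn

class ResBlock1d(nn.Module):
    def __init__(self, channels, k=3):
        super().__init__()
        self.conv1 = nn.Conv1d(channels, channels, k, padding=k // 2)
        self.conv2 = nn.Conv1d(channels, channels, k, padding=k // 2)

    def forward(self, x):                       # x: (B, channels, T)
        h = torch.relu(self.conv1(x))
        return torch.relu(x + self.conv2(h))    # identity shortcut

class ResNetTagger(nn.Module):
    def __init__(self, vocab, n_tags, emb=64, depth=4):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.blocks = nn.Sequential(*[ResBlock1d(emb) for _ in range(depth)])
        self.out = nn.Conv1d(emb, n_tags, 1)    # per-token tag scores

    def forward(self, tokens):                  # tokens: (B, T)
        x = self.emb(tokens).transpose(1, 2)    # (B, emb, T)
        return self.out(self.blocks(x)).transpose(1, 2)  # (B, T, n_tags)
```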

Improving Coreference Resolution by Learning Entity-Level Distributed Representations

Title Improving Coreference Resolution by Learning Entity-Level Distributed Representations
Authors Kevin Clark, Christopher D. Manning
Abstract A long-standing challenge in coreference resolution has been the incorporation of entity-level information - features defined over clusters of mentions instead of mention pairs. We present a neural network based coreference system that produces high-dimensional vector representations for pairs of coreference clusters. Using these representations, our system learns when combining clusters is desirable. We train the system with a learning-to-search algorithm that teaches it which local decisions (cluster merges) will lead to a high-scoring final coreference partition. The system substantially outperforms the current state-of-the-art on the English and Chinese portions of the CoNLL 2012 Shared Task dataset despite using few hand-engineered features.
Tasks Coreference Resolution
Published 2016-06-04
URL http://arxiv.org/abs/1606.01323v2
PDF http://arxiv.org/pdf/1606.01323v2.pdf
PWC https://paperswithcode.com/paper/improving-coreference-resolution-by-learning
Repo https://github.com/clarkkev/deep-coref
Framework none
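
The cluster-pair representations can be sketched as pooling over mention-pair representations across two clusters. Below is a hedged PyTorch fragment in that spirit; `mention_pair_encoder` is an assumed stand-in for the paper's mention-pair network, and the learning-to-search training loop is not shown.

```python
# Cluster-pair representation sketch: pool a mention-pair encoder's outputs
# over all cross-cluster mention pairs (max and mean pooling), producing a
# fixed-size vector that a scorer can use to decide whether to merge.
import torch

def cluster_pair_rep(c1, c2, mention_pair_encoder):
    """c1, c2: lists of mention feature tensors; returns pooled pair features."""
    pair_reps = torch.stack([mention_pair_encoder(m1, m2)
                             for m1 in c1 for m2 in c2])   # (|c1|*|c2|, d)
    return torch.cat([pair_reps.max(dim=0).values,
                      pair_reps.mean(dim=0)])              # (2d,)
```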

Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders

Title Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders
Authors Antonio Valerio Miceli Barone
Abstract Current approaches to learning vector representations of text that are compatible between different languages usually require some amount of parallel text, aligned at the word, sentence or at least document level. We hypothesize, however, that different natural languages share enough semantic structure that it should be possible, in principle, to learn compatible vector representations just by analyzing the monolingual distribution of words. In order to evaluate this hypothesis, we propose a scheme to map word vectors trained on a source language to vectors semantically compatible with word vectors trained on a target language, using an adversarial autoencoder. We present preliminary qualitative results and discuss possible future developments of this technique, such as applications to cross-lingual sentence representations.
Tasks
Published 2016-08-09
URL http://arxiv.org/abs/1608.02996v1
PDF http://arxiv.org/pdf/1608.02996v1.pdf
PWC https://paperswithcode.com/paper/towards-cross-lingual-distributed
Repo https://github.com/Avmb/clweadv
Framework none
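
A minimal sketch of the adversarial mapping, assuming PyTorch: an encoder maps source-language word vectors into the target space, a discriminator tries to tell mapped vectors from real target vectors, and a decoder reconstructs the source vector so the mapping stays information-preserving. Sizes and loss weights are illustrative.

```python
# Adversarial autoencoder sketch for cross-lingual word-vector mapping.
import torch
import torch.nn as nn

d = 300
enc = nn.Linear(d, d)                        # source -> target space
dec = nn.Linear(d, d)                        # map back, for reconstruction
disc = nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, 1))
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), 1e-4)
opt_d = torch.optim.Adam(disc.parameters(), 1e-4)

def step(src_vecs, tgt_vecs):
    # Discriminator: real target vectors vs. mapped source vectors.
    d_loss = bce(disc(tgt_vecs), torch.ones(len(tgt_vecs), 1)) + \
             bce(disc(enc(src_vecs).detach()), torch.zeros(len(src_vecs), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: fool the discriminator while reconstructing the input.
    mapped = enc(src_vecs)
    g_loss = bce(disc(mapped), torch.ones(len(src_vecs), 1)) + \
             nn.functional.mse_loss(dec(mapped), src_vecs)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```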

Residual Networks Behave Like Ensembles of Relatively Shallow Networks

Title Residual Networks Behave Like Ensembles of Relatively Shallow Networks
Authors Andreas Veit, Michael Wilber, Serge Belongie
Abstract In this work, we propose a novel interpretation of residual networks, showing that they can be seen as a collection of many paths of differing length. Moreover, residual networks seem to enable very deep networks by leveraging only the short paths during training. To support this observation, we rewrite residual networks as an explicit collection of paths. Unlike traditional models, paths through residual networks vary in length. Further, a lesion study reveals that these paths show ensemble-like behavior in the sense that they do not strongly depend on each other. Finally, and most surprisingly, most paths are shorter than one might expect, and only the short paths are needed during training, as longer paths do not contribute any gradient. For example, most of the gradient in a residual network with 110 layers comes from paths that are only 10-34 layers deep. Our results reveal one of the key characteristics that seem to enable the training of very deep networks: residual networks avoid the vanishing gradient problem by introducing short paths which can carry gradient throughout the extent of very deep networks.
Tasks
Published 2016-05-20
URL http://arxiv.org/abs/1605.06431v2
PDF http://arxiv.org/pdf/1605.06431v2.pdf
PWC https://paperswithcode.com/paper/residual-networks-behave-like-ensembles-of
Repo https://github.com/andreasveit/densenet-pytorch
Framework pytorch
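
The "collection of paths" view admits a quick back-of-the-envelope check: unrolling n residual blocks y = x + f(x) yields 2^n paths, and the number of residual modules a path traverses follows a Binomial(n, 1/2) distribution. For n = 54 blocks (roughly a 110-layer ResNet), almost all paths fall in the 10-34 range the abstract mentions:

```python
# Path-length distribution of an unraveled ResNet with n residual blocks.
from math import comb

n = 54
counts = [comb(n, k) for k in range(n + 1)]   # number of paths of each length
mass_10_34 = sum(counts[10:35]) / 2 ** n      # lengths 10..34 inclusive
print(f"fraction of paths of length 10-34: {mass_10_34:.3f}")  # ~0.98
```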

Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification

Title Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification
Authors Tong Xiao, Hongsheng Li, Wanli Ouyang, Xiaogang Wang
Abstract Learning generic and robust feature representations with data from multiple domains for the same problem is of great value, especially for problems that have multiple datasets, none of which is large enough to provide abundant data variations. In this work, we present a pipeline for learning deep feature representations from multiple domains with Convolutional Neural Networks (CNNs). When training a CNN with data from all the domains, some neurons learn representations shared across several domains, while others are effective only for a specific one. Based on this important observation, we propose a Domain Guided Dropout algorithm to improve the feature learning procedure. Experiments show the effectiveness of our pipeline and the proposed algorithm. Our methods on the person re-identification problem outperform state-of-the-art methods on multiple datasets by large margins.
Tasks Person Re-Identification
Published 2016-04-26
URL http://arxiv.org/abs/1604.07528v1
PDF http://arxiv.org/pdf/1604.07528v1.pdf
PWC https://paperswithcode.com/paper/learning-deep-feature-representations-with
Repo https://github.com/Cysu/person_reid
Framework none
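
A hedged sketch of the idea, assuming NumPy: a neuron's impact on a domain is the increase in the domain's loss when that neuron is muted, and the stochastic variant converts impacts into per-neuron keep probabilities through a sigmoid with temperature T. `loss_without` is an assumed helper, not from the authors' code.

```python
# Domain Guided Dropout sketch: turn per-neuron, per-domain impacts into
# keep probabilities. loss_without(i) is an assumed helper that evaluates
# the domain loss with neuron i zeroed out.
import numpy as np

def domain_dropout_keep_probs(base_loss, loss_without, n_neurons, T=1.0):
    # Impact s_i > 0 means muting the neuron hurts this domain: keep it.
    s = np.array([loss_without(i) - base_loss for i in range(n_neurons)])
    return 1.0 / (1.0 + np.exp(-s / T))   # stochastic variant

# During training on a sample from this domain, each neuron's activation is
# kept with its probability (deterministic variant: keep neurons with s > 0).
```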

Feature Selection: A Data Perspective

Title Feature Selection: A Data Perspective
Authors Jundong Li, Kewei Cheng, Suhang Wang, Fred Morstatter, Robert P. Trevino, Jiliang Tang, Huan Liu
Abstract Feature selection, as a data preprocessing strategy, has been proven to be effective and efficient in preparing data (especially high-dimensional data) for various data mining and machine learning problems. The objectives of feature selection include: building simpler and more comprehensible models, improving data mining performance, and preparing clean, understandable data. The recent proliferation of big data has presented some substantial challenges and opportunities to feature selection. In this survey, we provide a comprehensive and structured overview of recent advances in feature selection research. Motivated by current challenges and opportunities in the era of big data, we revisit feature selection research from a data perspective and review representative feature selection algorithms for conventional data, structured data, heterogeneous data and streaming data. Methodologically, to emphasize the differences and similarities of most existing feature selection algorithms for conventional data, we categorize them into four main groups: similarity based, information theoretical based, sparse learning based and statistical based methods. To facilitate and promote the research in this community, we also present an open-source feature selection repository that consists of most of the popular feature selection algorithms (http://featureselection.asu.edu/). Also, we use it as an example to show how to evaluate feature selection algorithms. At the end of the survey, we present a discussion about some open problems and challenges that require more attention in future research.
Tasks Feature Selection, Sparse Learning
Published 2016-01-29
URL http://arxiv.org/abs/1601.07996v5
PDF http://arxiv.org/pdf/1601.07996v5.pdf
PWC https://paperswithcode.com/paper/feature-selection-a-data-perspective
Repo https://github.com/jundongl/scikit-feature
Framework none
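
As a worked example of one similarity-based method from the survey's taxonomy, here is the Fisher score implemented directly in NumPy; it ranks features by between-class spread over within-class variance. The authors' scikit-feature repository bundles this and many other methods.

```python
# Fisher score: a classic similarity-based feature ranking criterion.
import numpy as np

def fisher_score(X, y):
    """X: (n_samples, n_features); y: integer class labels."""
    mu = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - mu) ** 2   # between-class spread
        den += len(Xc) * Xc.var(axis=0)                # within-class variance
    return num / (den + 1e-12)     # higher score = more discriminative

# ranking = np.argsort(-fisher_score(X, y))   # best features first
```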

#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning

Title #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
Authors Haoran Tang, Rein Houthooft, Davis Foote, Adam Stooke, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel
Abstract Count-based exploration algorithms are known to perform near-optimally when used in conjunction with tabular reinforcement learning (RL) methods for solving small discrete Markov decision processes (MDPs). It is generally thought that count-based methods cannot be applied in high-dimensional state spaces, since most states will only occur once. Recent deep RL exploration strategies are able to deal with high-dimensional continuous state spaces through complex heuristics, often relying on optimism in the face of uncertainty or intrinsic motivation. In this work, we describe a surprising finding: a simple generalization of the classic count-based approach can reach near state-of-the-art performance on various high-dimensional and/or continuous deep RL benchmarks. States are mapped to hash codes, which allows their occurrences to be counted with a hash table. These counts are then used to compute a reward bonus according to classic count-based exploration theory. We find that simple hash functions can achieve surprisingly good results on many challenging tasks. Furthermore, we show that a domain-dependent learned hash code may further improve these results. Detailed analysis reveals important aspects of a good hash function: 1) having appropriate granularity and 2) encoding information relevant to solving the MDP. This exploration strategy achieves near state-of-the-art performance on both continuous control tasks and Atari 2600 games, hence providing a simple yet powerful baseline for solving MDPs that require considerable exploration.
Tasks Atari Games, Continuous Control
Published 2016-11-15
URL http://arxiv.org/abs/1611.04717v3
PDF http://arxiv.org/pdf/1611.04717v3.pdf
PWC https://paperswithcode.com/paper/exploration-a-study-of-count-based
Repo https://github.com/nhynes/abc
Framework pytorch
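
A minimal sketch of the hash-based bonus, assuming NumPy and the paper's SimHash variant: states are projected with a fixed Gaussian matrix, binarized into a k-bit code, counted in a table, and rewarded with a beta / sqrt(n) bonus. The code length k controls the granularity the abstract highlights.

```python
# SimHash count-based exploration bonus sketch.
from collections import defaultdict
import numpy as np

class HashExploration:
    def __init__(self, state_dim, k=32, beta=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.A = rng.standard_normal((k, state_dim))  # fixed random projection
        self.counts = defaultdict(int)
        self.beta = beta

    def bonus(self, state):
        code = tuple((self.A @ state) > 0)            # k-bit SimHash code
        self.counts[code] += 1
        return self.beta / np.sqrt(self.counts[code])
```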

Text Understanding with the Attention Sum Reader Network

Title Text Understanding with the Attention Sum Reader Network
Authors Rudolf Kadlec, Martin Schmid, Ondrej Bajgar, Jan Kleindienst
Abstract Several large cloze-style context-question-answer datasets have been introduced recently: the CNN and Daily Mail news data and the Children’s Book Test. Thanks to the size of these datasets, the associated text comprehension task is well suited for deep-learning techniques that currently seem to outperform all alternative approaches. We present a new, simple model that uses attention to directly pick the answer from the context, as opposed to computing the answer using a blended representation of words in the document as is usual in similar models. This makes the model particularly suitable for question-answering problems where the answer is a single word from the document. An ensemble of our models sets a new state of the art on all evaluated datasets.
Tasks Machine Reading Comprehension, Open-Domain Question Answering, Question Answering, Reading Comprehension
Published 2016-03-04
URL http://arxiv.org/abs/1603.01547v2
PDF http://arxiv.org/pdf/1603.01547v2.pdf
PWC https://paperswithcode.com/paper/text-understanding-with-the-attention-sum
Repo https://github.com/shuxiaobo/QA-Experiment
Framework tf
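
The attention-sum step itself is tiny. A hedged, unbatched PyTorch sketch with the encoders stubbed out: attention weights come from a dot product between the question embedding and each context token state, and probabilities of repeated occurrences of the same candidate are summed (pointer-sum attention).

```python
# Pointer-sum attention sketch: pick the answer directly from the context.
import torch

def attention_sum(context_states, question_state, context_token_ids, candidates):
    """context_states: (T, d); question_state: (d,); context_token_ids: (T,)."""
    attn = torch.softmax(context_states @ question_state, dim=0)  # (T,)
    scores = {}
    for cand in candidates:
        # Sum attention over every position where this candidate word occurs.
        scores[cand] = attn[context_token_ids == cand].sum().item()
    return max(scores, key=scores.get)
```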

End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF

Title End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
Authors Xuezhe Ma, Eduard Hovy
Abstract State-of-the-art sequence labeling systems traditionally require large amounts of task-specific knowledge in the form of hand-crafted features and data pre-processing. In this paper, we introduce a novel neural network architecture that benefits from both word- and character-level representations automatically, by using a combination of bidirectional LSTM, CNN and CRF. Our system is truly end-to-end, requiring no feature engineering or data pre-processing, thus making it applicable to a wide range of sequence labeling tasks. We evaluate our system on two datasets for two sequence labeling tasks: the Penn Treebank WSJ corpus for part-of-speech (POS) tagging and the CoNLL 2003 corpus for named entity recognition (NER). We obtain state-of-the-art performance on both datasets: 97.55% accuracy for POS tagging and 91.21% F1 for NER.
Tasks Feature Engineering, Named Entity Recognition, Part-Of-Speech Tagging
Published 2016-03-04
URL http://arxiv.org/abs/1603.01354v5
PDF http://arxiv.org/pdf/1603.01354v5.pdf
PWC https://paperswithcode.com/paper/end-to-end-sequence-labeling-via-bi
Repo https://github.com/XiafeiYu/CNN_BILSTM_CRF
Framework tf
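
A skeletal PyTorch sketch of the architecture, with illustrative sizes: a character-level CNN builds a representation for each word, which is concatenated with the word embedding, fed through a bidirectional LSTM, and projected to per-tag emission scores. The CRF transition scoring and Viterbi decoding on top are omitted for brevity.

```python
# BiLSTM-CNN backbone sketch; emissions feed a CRF layer (not shown).
import torch
import torch.nn as nn

class BiLSTMCNN(nn.Module):
    def __init__(self, n_words, n_chars, n_tags,
                 w_emb=100, c_emb=30, c_out=30, hidden=200):
        super().__init__()
        self.wemb = nn.Embedding(n_words, w_emb)
        self.cemb = nn.Embedding(n_chars, c_emb)
        self.char_cnn = nn.Conv1d(c_emb, c_out, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(w_emb + c_out, hidden, bidirectional=True,
                            batch_first=True)
        self.emit = nn.Linear(2 * hidden, n_tags)  # CRF emission scores

    def forward(self, words, chars):   # words: (B, T); chars: (B, T, L)
        B, T, L = chars.shape
        c = self.cemb(chars.view(B * T, L)).transpose(1, 2)  # (B*T, c_emb, L)
        c = torch.relu(self.char_cnn(c)).max(dim=2).values   # (B*T, c_out)
        x = torch.cat([self.wemb(words), c.view(B, T, -1)], dim=-1)
        h, _ = self.lstm(x)
        return self.emit(h)            # per-token emission scores for the CRF
```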