Paper Group AWR 66
Generalized optimal sub-pattern assignment metric. A Decomposable Attention Model for Natural Language Inference. Deep Video Deblurring. A Deep Learning Approach to Block-based Compressed Sensing of Images. A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data. Text Classification Improved by Integrating Bidirectional LSTM with Two …
Generalized optimal sub-pattern assignment metric
Title | Generalized optimal sub-pattern assignment metric |
Authors | Abu Sajana Rahmathullah, Ángel F. García-Fernández, Lennart Svensson |
Abstract | This paper presents the generalized optimal sub-pattern assignment (GOSPA) metric on the space of finite sets of targets. Compared to the well-established optimal sub-pattern assignment (OSPA) metric, GOSPA is unnormalized as a function of the cardinality and it penalizes cardinality errors differently, which enables us to express it as an optimization over assignments instead of permutations. An important consequence of this is that GOSPA allows us to penalize localization errors for detected targets and the errors due to missed and false targets, as indicated by traditional multiple target tracking (MTT) performance measures, in a sound manner. In addition, we extend the GOSPA metric to the space of random finite sets, which is important for evaluating MTT algorithms via simulations in a rigorous way. |
Tasks | none
Published | 2016-01-21 |
URL | http://arxiv.org/abs/1601.05585v7 |
PDF | http://arxiv.org/pdf/1601.05585v7.pdf
PWC | https://paperswithcode.com/paper/generalized-optimal-sub-pattern-assignment |
Repo | https://github.com/abusajana/GOSPA |
Framework | none |
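The authors' implementation lives in the repository above; below is an independent Python sketch of the metric for the MTT-oriented choice α = 2, where GOSPA reduces to an assignment problem. Parameter names c and p follow the paper; the SciPy-based implementation details are ours.

```python
# Hedged sketch of GOSPA for alpha = 2, assuming point targets in R^d.
import numpy as np
from scipy.optimize import linear_sum_assignment

def gospa(X, Y, c=10.0, p=2):
    """X, Y: (m, d) and (n, d) arrays of target states."""
    m, n = len(X), len(Y)
    if m == 0 and n == 0:
        return 0.0
    if m == 0 or n == 0:
        # Only missed or false targets: each costs c^p / 2.
        return ((c ** p / 2) * max(m, n)) ** (1 / p)
    # Pairwise distances, cut off at c: assigning a pair with d >= c costs
    # exactly as much as declaring one missed and one false target.
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    cost = np.minimum(D, c) ** p
    row, col = linear_sum_assignment(cost)
    # Localization cost for assigned pairs plus c^p/2 per unmatched target.
    total = cost[row, col].sum() + (c ** p / 2) * (m + n - 2 * len(row))
    return total ** (1 / p)
```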
A Decomposable Attention Model for Natural Language Inference
Title | A Decomposable Attention Model for Natural Language Inference |
Authors | Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit |
Abstract | We propose a simple neural architecture for natural language inference. Our approach uses attention to decompose the problem into subproblems that can be solved separately, thus making it trivially parallelizable. On the Stanford Natural Language Inference (SNLI) dataset, we obtain state-of-the-art results with almost an order of magnitude fewer parameters than previous work and without relying on any word-order information. Adding intra-sentence attention that takes a minimum amount of order into account yields further improvements. |
Tasks | Natural Language Inference |
Published | 2016-06-06 |
URL | http://arxiv.org/abs/1606.01933v2 |
PDF | http://arxiv.org/pdf/1606.01933v2.pdf
PWC | https://paperswithcode.com/paper/a-decomposable-attention-model-for-natural |
Repo | https://github.com/blcunlp/CNLI |
Framework | tf |
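As a reading aid, here is a minimal numpy sketch of the attend / compare / aggregate pipeline, treating the paper's feed-forward networks F, G and H as given callables; shapes and names beyond those are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decomposable_attention(a, b, F, G, H):
    """a: (la, d) and b: (lb, d) word-embedding matrices of the two sentences."""
    # Attend: unnormalized alignment scores decompose as F(a_i) . F(b_j).
    E = F(a) @ F(b).T                       # (la, lb)
    beta  = softmax(E, axis=1) @ b          # subphrase of b aligned to each a_i
    alpha = softmax(E, axis=0).T @ a        # subphrase of a aligned to each b_j
    # Compare: process each aligned pair separately (trivially parallel).
    v1 = G(np.concatenate([a, beta],  axis=1))
    v2 = G(np.concatenate([b, alpha], axis=1))
    # Aggregate: sum over positions, then classify.
    return H(np.concatenate([v1.sum(0), v2.sum(0)]))
```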
Deep Video Deblurring
Title | Deep Video Deblurring |
Authors | Shuochen Su, Mauricio Delbracio, Jue Wang, Guillermo Sapiro, Wolfgang Heidrich, Oliver Wang |
Abstract | Motion blur from camera shake is a major problem in videos captured by hand-held devices. Unlike single-image deblurring, video-based approaches can take advantage of the abundant information that exists across neighboring frames. As a result, the best-performing methods rely on aligning nearby frames. However, aligning images is a computationally expensive and fragile procedure, and methods that aggregate information must therefore be able to identify which regions have been accurately aligned and which have not, a task which requires high-level scene understanding. In this work, we introduce a deep learning solution to video deblurring, where a CNN is trained end-to-end to learn how to accumulate information across frames. To train this network, we collected a dataset of real videos recorded with a high-framerate camera, which we use to generate synthetic motion blur for supervision. We show that the features learned from this dataset extend to deblurring motion blur that arises due to camera shake in a wide range of videos, and compare the quality of results to a number of other baselines. |
Tasks | Deblurring, Scene Understanding |
Published | 2016-11-25 |
URL | http://arxiv.org/abs/1611.08387v1 |
PDF | http://arxiv.org/pdf/1611.08387v1.pdf
PWC | https://paperswithcode.com/paper/deep-video-deblurring |
Repo | https://github.com/susomena/DeepSlowMotion |
Framework | tf |
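A skeletal PyTorch sketch of the core idea follows: stack 2k+1 neighboring (aligned) blurry frames along the channel axis and let a CNN predict the sharp central frame. Layer widths, depths and the residual output are placeholders of ours, not the paper's encoder-decoder architecture.

```python
import torch
import torch.nn as nn

class StackDeblurNet(nn.Module):
    def __init__(self, num_frames=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 * num_frames, 64, 5, padding=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, frames):                # frames: (B, num_frames, 3, H, W)
        b, n, c, h, w = frames.shape
        x = frames.reshape(b, n * c, h, w)    # stack frames along channels
        center = frames[:, n // 2]
        # Residual formulation (our choice): predict a correction to the
        # central blurry frame rather than the sharp frame from scratch.
        return center + self.net(x)
```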
A Deep Learning Approach to Block-based Compressed Sensing of Images
Title | A Deep Learning Approach to Block-based Compressed Sensing of Images |
Authors | Amir Adler, David Boublil, Michael Elad, Michael Zibulevsky |
Abstract | Compressed sensing (CS) is a signal processing framework for efficiently reconstructing a signal from a small number of measurements, obtained by linear projections of the signal. Block-based CS is a lightweight CS approach that is mostly suitable for processing very high-dimensional images and videos: it operates on local patches, employs a low-complexity reconstruction operator and requires significantly less memory to store the sensing matrix. In this paper we present a deep learning approach for block-based CS, in which a fully-connected network performs both the block-based linear sensing and non-linear reconstruction stages. During the training phase, the sensing matrix and the non-linear reconstruction operator are *jointly* optimized, and the proposed approach outperforms the state of the art in both reconstruction quality and computation time. For example, at a 25% sensing rate the average PSNR advantage is 0.77dB and computation is over 200 times faster. |
Tasks | none
Published | 2016-06-05 |
URL | http://arxiv.org/abs/1606.01519v1 |
PDF | http://arxiv.org/pdf/1606.01519v1.pdf
PWC | https://paperswithcode.com/paper/a-deep-learning-approach-to-block-based |
Repo | https://github.com/asalp/Block-Based-CS-NNs |
Framework | none |
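The joint sensing/reconstruction idea fits in a few lines of PyTorch: the first linear layer plays the role of the learned block sensing matrix, the remaining fully-connected layers the non-linear reconstruction. Block size and layer widths below are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

block = 16 * 16                  # flattened image patch
rate = 0.25                      # sensing rate (measurements / pixels)
m = int(rate * block)

model = nn.Sequential(
    nn.Linear(block, m, bias=False),    # linear sensing: y = B x (B is learned)
    nn.Linear(m, 8 * block), nn.ReLU(), # non-linear reconstruction
    nn.Linear(8 * block, block),        # reconstructed patch
)
# Training jointly optimizes the sensing matrix and the reconstruction:
# loss = ||model(x) - x||^2 over flattened image patches x.
```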
A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data
Title | A Parallel-Hierarchical Model for Machine Comprehension on Sparse Data |
Authors | Adam Trischler, Zheng Ye, Xingdi Yuan, Jing He, Phillip Bachman, Kaheer Suleman |
Abstract | Understanding unstructured text is a major goal within natural language processing. Comprehension tests pose questions based on short text passages to evaluate such understanding. In this work, we investigate machine comprehension on the challenging *MCTest* benchmark. Partly because of its limited size, prior work on *MCTest* has focused mainly on engineering better features. We tackle the dataset with a neural approach, harnessing simple neural networks arranged in a parallel hierarchy. The parallel hierarchy enables our model to compare the passage, question, and answer from a variety of trainable perspectives, as opposed to using a manually designed, rigid feature set. Perspectives range from the word level to sentence fragments to sequences of sentences; the networks operate only on word-embedding representations of text. When trained with a methodology designed to help cope with limited training data, our Parallel-Hierarchical model sets a new state of the art for *MCTest*, outperforming previous feature-engineered approaches slightly and previous neural approaches by a significant margin (over 15% absolute). |
Tasks | Question Answering, Reading Comprehension |
Published | 2016-03-29 |
URL | http://arxiv.org/abs/1603.08884v1 |
PDF | http://arxiv.org/pdf/1603.08884v1.pdf
PWC | https://paperswithcode.com/paper/a-parallel-hierarchical-model-for-machine |
Repo | https://github.com/Maluuba/mctest-model |
Framework | none |
Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling
Title | Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling |
Authors | Peng Zhou, Zhenyu Qi, Suncong Zheng, Jiaming Xu, Hongyun Bao, Bo Xu |
Abstract | The Recurrent Neural Network (RNN) is one of the most popular architectures used in Natural Language Processing (NLP) tasks because its recurrent structure is very suitable for processing variable-length text. An RNN can utilize distributed representations of words by first converting the tokens comprising each text into vectors, which form a matrix with two dimensions: the time-step dimension and the feature vector dimension. Most existing models then apply a one-dimensional (1D) max pooling operation, or an attention-based operation, only over the time-step dimension to obtain a fixed-length vector. However, the features on the feature vector dimension are not mutually independent, and simply applying a 1D pooling operation over the time-step dimension alone may destroy the structure of the feature representation. Applying a two-dimensional (2D) pooling operation over both dimensions, on the other hand, may sample more meaningful features for sequence modeling tasks. To integrate the features on both dimensions of the matrix, this paper explores applying a 2D max pooling operation to obtain a fixed-length representation of the text. This paper also utilizes 2D convolution to sample more meaningful information from the matrix. Experiments are conducted on six text classification tasks, including sentiment analysis, question classification, subjectivity classification and newsgroup classification. Compared with the state-of-the-art models, the proposed models achieve excellent performance on 4 out of 6 tasks. Specifically, one of the proposed models achieves the highest accuracy on the Stanford Sentiment Treebank binary classification and fine-grained classification tasks. |
Tasks | Sentiment Analysis, Text Classification |
Published | 2016-11-21 |
URL | http://arxiv.org/abs/1611.06639v1 |
PDF | http://arxiv.org/pdf/1611.06639v1.pdf
PWC | https://paperswithcode.com/paper/text-classification-improved-by-integrating |
Repo | https://github.com/ManuelVs/NeuralNetworks |
Framework | tf |
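A hedged PyTorch sketch of the pooling idea: the BiLSTM output is a (time-steps × features) matrix, and max pooling is applied over both dimensions rather than only over time. Hyperparameters are placeholders, and sequences are assumed padded to a fixed length.

```python
import torch
import torch.nn as nn

class BLSTM2DPool(nn.Module):
    def __init__(self, vocab, emb=100, hidden=100, classes=5):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hidden, bidirectional=True, batch_first=True)
        self.pool = nn.MaxPool2d(kernel_size=(2, 2))  # 2D max pooling
        self.fc = nn.LazyLinear(classes)

    def forward(self, tokens):                  # tokens: (B, T), fixed T
        h, _ = self.lstm(self.embed(tokens))    # (B, T, 2*hidden)
        z = self.pool(h.unsqueeze(1))           # pool over time AND feature dims
        return self.fc(z.flatten(1))
```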
Semantic Tagging with Deep Residual Networks
Title | Semantic Tagging with Deep Residual Networks |
Authors | Johannes Bjerva, Barbara Plank, Johan Bos |
Abstract | We propose a novel semantic tagging task, sem-tagging, tailored for the purpose of multilingual semantic parsing, and present the first tagger using deep residual networks (ResNets). Our tagger uses both word and character representations and includes a novel residual bypass architecture. We evaluate the tagset both intrinsically, on the new task of semantic tagging, and on Part-of-Speech (POS) tagging. Our system, consisting of a ResNet and an auxiliary loss function predicting our semantic tags, significantly outperforms prior results on English Universal Dependencies POS tagging (95.71% accuracy on UD v1.2 and 95.67% accuracy on UD v1.3). |
Tasks | Part-Of-Speech Tagging, Semantic Parsing |
Published | 2016-09-22 |
URL | http://arxiv.org/abs/1609.07053v2 |
PDF | http://arxiv.org/pdf/1609.07053v2.pdf
PWC | https://paperswithcode.com/paper/semantic-tagging-with-deep-residual-networks |
Repo | https://github.com/bjerva/semantic-tagging |
Framework | none |
Improving Coreference Resolution by Learning Entity-Level Distributed Representations
Title | Improving Coreference Resolution by Learning Entity-Level Distributed Representations |
Authors | Kevin Clark, Christopher D. Manning |
Abstract | A long-standing challenge in coreference resolution has been the incorporation of entity-level information: features defined over clusters of mentions instead of mention pairs. We present a neural network based coreference system that produces high-dimensional vector representations for pairs of coreference clusters. Using these representations, our system learns when combining clusters is desirable. We train the system with a learning-to-search algorithm that teaches it which local decisions (cluster merges) will lead to a high-scoring final coreference partition. The system substantially outperforms the current state of the art on the English and Chinese portions of the CoNLL 2012 Shared Task dataset despite using few hand-engineered features. |
Tasks | Coreference Resolution |
Published | 2016-06-04 |
URL | http://arxiv.org/abs/1606.01323v2 |
PDF | http://arxiv.org/pdf/1606.01323v2.pdf
PWC | https://paperswithcode.com/paper/improving-coreference-resolution-by-learning |
Repo | https://github.com/clarkkev/deep-coref |
Framework | none |
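One way to read "representations for pairs of coreference clusters" is as pooling over the mention-pair vectors between two clusters; below is a minimal numpy sketch of that reading, with max-plus-average pooling as our assumption.

```python
import numpy as np

def cluster_pair_rep(pair_reps):
    """pair_reps: (num_mention_pairs, d) array of mention-pair vectors for
    all pairs (m1, m2) with m1 in cluster c1 and m2 in cluster c2.
    Returns a fixed-size representation of the cluster pair, which a scorer
    can use to decide whether merging c1 and c2 is desirable."""
    return np.concatenate([pair_reps.max(axis=0), pair_reps.mean(axis=0)])
```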
Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders
Title | Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders |
Authors | Antonio Valerio Miceli Barone |
Abstract | Current approaches to learning vector representations of text that are compatible between different languages usually require some amount of parallel text, aligned at the word, sentence or at least document level. We hypothesize, however, that different natural languages share enough semantic structure that it should be possible, in principle, to learn compatible vector representations just by analyzing the monolingual distribution of words. In order to evaluate this hypothesis, we propose a scheme to map word vectors trained on a source language to vectors semantically compatible with word vectors trained on a target language using an adversarial autoencoder. We present preliminary qualitative results and discuss possible future developments of this technique, such as applications to cross-lingual sentence representations. |
Tasks | none
Published | 2016-08-09 |
URL | http://arxiv.org/abs/1608.02996v1 |
PDF | http://arxiv.org/pdf/1608.02996v1.pdf
PWC | https://paperswithcode.com/paper/towards-cross-lingual-distributed |
Repo | https://github.com/Avmb/clweadv |
Framework | none |
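A hedged PyTorch sketch of the adversarial component: a generator maps source-language word vectors into the target embedding space while a discriminator tries to tell mapped vectors from real target vectors. The paper's full model is an adversarial *autoencoder* with a reconstruction path; the linear generator and the widths here are our simplifications.

```python
import torch
import torch.nn as nn

d = 300
G = nn.Linear(d, d, bias=False)                     # source -> target map
D = nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, 1))
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)

def step(src_batch, tgt_batch):
    # Discriminator: real target vectors vs. mapped source vectors.
    fake = G(src_batch).detach()
    loss_d = bce(D(tgt_batch), torch.ones(len(tgt_batch), 1)) + \
             bce(D(fake), torch.zeros(len(fake), 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator: try to fool the discriminator.
    loss_g = bce(D(G(src_batch)), torch.ones(len(src_batch), 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```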
Residual Networks Behave Like Ensembles of Relatively Shallow Networks
Title | Residual Networks Behave Like Ensembles of Relatively Shallow Networks |
Authors | Andreas Veit, Michael Wilber, Serge Belongie |
Abstract | In this work we propose a novel interpretation of residual networks showing that they can be seen as a collection of many paths of differing length. Moreover, residual networks seem to enable very deep networks by leveraging only the short paths during training. To support this observation, we rewrite residual networks as an explicit collection of paths. Unlike traditional models, paths through residual networks vary in length. Further, a lesion study reveals that these paths show ensemble-like behavior in the sense that they do not strongly depend on each other. Finally, and most surprisingly, most paths are shorter than one might expect, and only the short paths are needed during training, as longer paths do not contribute any gradient. For example, most of the gradient in a residual network with 110 layers comes from paths that are only 10-34 layers deep. Our results reveal one of the key characteristics that seem to enable the training of very deep networks: residual networks avoid the vanishing gradient problem by introducing short paths which can carry gradient throughout the extent of very deep networks. |
Tasks | none
Published | 2016-05-20 |
URL | http://arxiv.org/abs/1605.06431v2 |
PDF | http://arxiv.org/pdf/1605.06431v2.pdf
PWC | https://paperswithcode.com/paper/residual-networks-behave-like-ensembles-of |
Repo | https://github.com/andreasveit/densenet-pytorch |
Framework | pytorch |
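The "collection of paths" view is easy to probe numerically: a path either enters or skips each of the n residual blocks, so path lengths follow Binomial(n, 0.5). Taking 54 residual blocks for the 110-layer network (the usual two-layers-per-block CIFAR design, an assumption here) shows that most paths are far shorter than the nominal depth; the paper's gradient result additionally weights each length by its measured gradient norm.

```python
from scipy.stats import binom

n_blocks = 54                       # assumed residual blocks in the 110-layer net
dist = binom(n_blocks, 0.5)         # path length = number of blocks entered
print("mean path length:", dist.mean())                       # 27 blocks
print("P(10 <= length <= 34):", dist.cdf(34) - dist.cdf(9))   # bulk of the mass
```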
Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification
Title | Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification |
Authors | Tong Xiao, Hongsheng Li, Wanli Ouyang, Xiaogang Wang |
Abstract | Learning generic and robust feature representations with data from multiple domains for the same problem is of great value, especially for problems that have multiple datasets, none of which is large enough to provide abundant data variations. In this work, we present a pipeline for learning deep feature representations from multiple domains with Convolutional Neural Networks (CNNs). When training a CNN with data from all the domains, some neurons learn representations shared across several domains, while some others are effective only for a specific one. Based on this important observation, we propose a Domain Guided Dropout algorithm to improve the feature learning procedure. Experiments show the effectiveness of our pipeline and the proposed algorithm. Our methods on the person re-identification problem outperform state-of-the-art methods on multiple datasets by large margins. |
Tasks | Person Re-Identification |
Published | 2016-04-26 |
URL | http://arxiv.org/abs/1604.07528v1 |
PDF | http://arxiv.org/pdf/1604.07528v1.pdf
PWC | https://paperswithcode.com/paper/learning-deep-feature-representations-with |
Repo | https://github.com/Cysu/person_reid |
Framework | none |
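A numpy sketch of the Domain Guided Dropout recipe as described: score every neuron by how much the loss increases when it is muted on one domain's data, then gate neurons per domain from those scores. The deterministic variant (keep a neuron iff its impact is positive) is shown; loss_fn and the activation matrix are assumed inputs.

```python
import numpy as np

def impact_scores(loss_fn, activations):
    """activations: (N, num_neurons) hidden responses on one domain's data.
    Impact of neuron i = loss with neuron i muted minus baseline loss."""
    base = loss_fn(activations)
    scores = np.empty(activations.shape[1])
    for i in range(activations.shape[1]):
        muted = activations.copy()
        muted[:, i] = 0.0
        scores[i] = loss_fn(muted) - base
    return scores

def domain_mask(scores):
    # Deterministic variant: keep only neurons whose muting hurts the loss.
    return (scores > 0).astype(np.float32)

# Forward pass for a sample from domain k: h = h * domain_mask(scores_k)
```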
Feature Selection: A Data Perspective
Title | Feature Selection: A Data Perspective |
Authors | Jundong Li, Kewei Cheng, Suhang Wang, Fred Morstatter, Robert P. Trevino, Jiliang Tang, Huan Liu |
Abstract | Feature selection, as a data preprocessing strategy, has been proven to be effective and efficient in preparing data (especially high-dimensional data) for various data mining and machine learning problems. The objectives of feature selection include: building simpler and more comprehensible models, improving data mining performance, and preparing clean, understandable data. The recent proliferation of big data has presented some substantial challenges and opportunities to feature selection. In this survey, we provide a comprehensive and structured overview of recent advances in feature selection research. Motivated by current challenges and opportunities in the era of big data, we revisit feature selection research from a data perspective and review representative feature selection algorithms for conventional data, structured data, heterogeneous data and streaming data. Methodologically, to emphasize the differences and similarities of most existing feature selection algorithms for conventional data, we categorize them into four main groups: similarity based, information theoretical based, sparse learning based and statistical based methods. To facilitate and promote the research in this community, we also present an open-source feature selection repository that consists of most of the popular feature selection algorithms (http://featureselection.asu.edu/). We also use it as an example to show how to evaluate feature selection algorithms. At the end of the survey, we present a discussion about some open problems and challenges that require more attention in future research. |
Tasks | Feature Selection, Sparse Learning |
Published | 2016-01-29 |
URL | http://arxiv.org/abs/1601.07996v5 |
PDF | http://arxiv.org/pdf/1601.07996v5.pdf
PWC | https://paperswithcode.com/paper/feature-selection-a-data-perspective |
Repo | https://github.com/jundongl/scikit-feature |
Framework | none |
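To make the "similarity based" category concrete, here is a plain numpy version of the Fisher score, one of the classic filter criteria the survey covers (the linked scikit-feature repository ships reference implementations of this and many more).

```python
import numpy as np

def fisher_score(X, y):
    """X: (n_samples, n_features); y: integer class labels.
    Higher score = between-class spread dominates within-class spread."""
    classes, overall_mean = np.unique(y), X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / np.maximum(den, 1e-12)

# Rank features, best first: ranking = np.argsort(fisher_score(X, y))[::-1]
```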
#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
Title | #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning |
Authors | Haoran Tang, Rein Houthooft, Davis Foote, Adam Stooke, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel |
Abstract | Count-based exploration algorithms are known to perform near-optimally when used in conjunction with tabular reinforcement learning (RL) methods for solving small discrete Markov decision processes (MDPs). It is generally thought that count-based methods cannot be applied in high-dimensional state spaces, since most states will only occur once. Recent deep RL exploration strategies are able to deal with high-dimensional continuous state spaces through complex heuristics, often relying on optimism in the face of uncertainty or intrinsic motivation. In this work, we describe a surprising finding: a simple generalization of the classic count-based approach can reach near state-of-the-art performance on various high-dimensional and/or continuous deep RL benchmarks. States are mapped to hash codes, which makes it possible to count their occurrences with a hash table. These counts are then used to compute a reward bonus according to the classic count-based exploration theory. We find that simple hash functions can achieve surprisingly good results on many challenging tasks. Furthermore, we show that a domain-dependent learned hash code may further improve these results. Detailed analysis reveals important aspects of a good hash function: 1) having appropriate granularity and 2) encoding information relevant to solving the MDP. This exploration strategy achieves near state-of-the-art performance on both continuous control tasks and Atari 2600 games, hence providing a simple yet powerful baseline for solving MDPs that require considerable exploration. |
Tasks | Atari Games, Continuous Control |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04717v3 |
PDF | http://arxiv.org/pdf/1611.04717v3.pdf
PWC | https://paperswithcode.com/paper/exploration-a-study-of-count-based |
Repo | https://github.com/nhynes/abc |
Framework | pytorch |
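A numpy sketch of the hashing trick: SimHash maps a (possibly preprocessed) state to a short binary code, a table counts code occurrences, and the agent receives a bonus of the classic count-based form β/√n(code). The code length k and β below are placeholder values.

```python
import numpy as np
from collections import defaultdict

class HashExploration:
    def __init__(self, state_dim, k=32, beta=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.A = rng.standard_normal((k, state_dim))   # fixed random projection
        self.counts = defaultdict(int)
        self.beta = beta

    def bonus(self, state):
        code = tuple((self.A @ state) > 0)             # SimHash binary code
        self.counts[code] += 1                         # hash-table count
        return self.beta / np.sqrt(self.counts[code])

# In the RL loop: reward_augmented = env_reward + explorer.bonus(observation)
```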
Text Understanding with the Attention Sum Reader Network
Title | Text Understanding with the Attention Sum Reader Network |
Authors | Rudolf Kadlec, Martin Schmid, Ondrej Bajgar, Jan Kleindienst |
Abstract | Several large cloze-style context-question-answer datasets have been introduced recently: the CNN and Daily Mail news data and the Children’s Book Test. Thanks to the size of these datasets, the associated text comprehension task is well suited for deep-learning techniques that currently seem to outperform all alternative approaches. We present a new, simple model that uses attention to directly pick the answer from the context, as opposed to computing the answer using a blended representation of words in the document as is usual in similar models. This makes the model particularly suitable for question-answering problems where the answer is a single word from the document. An ensemble of our models sets a new state of the art on all evaluated datasets. |
Tasks | Machine Reading Comprehension, Open-Domain Question Answering, Question Answering, Reading Comprehension |
Published | 2016-03-04 |
URL | http://arxiv.org/abs/1603.01547v2 |
PDF | http://arxiv.org/pdf/1603.01547v2.pdf
PWC | https://paperswithcode.com/paper/text-understanding-with-the-attention-sum |
Repo | https://github.com/shuxiaobo/QA-Experiment |
Framework | tf |
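The model's defining step is compact enough to write out: attention weights over document positions are computed from a query vector, and the probabilities of positions are summed per candidate word. The encoders producing d_enc (T × h) and q_enc (h,) are abstracted away in this sketch.

```python
import numpy as np

def attention_sum(d_enc, q_enc, doc_tokens, candidates):
    """Return p(candidate | document, query) by summing position probabilities."""
    scores = d_enc @ q_enc
    p = np.exp(scores - scores.max())
    p /= p.sum()                               # softmax over document positions
    return {w: p[[i for i, t in enumerate(doc_tokens) if t == w]].sum()
            for w in candidates}

# answer = max(probs, key=probs.get)
```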
End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
Title | End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF |
Authors | Xuezhe Ma, Eduard Hovy |
Abstract | State-of-the-art sequence labeling systems traditionally require large amounts of task-specific knowledge in the form of hand-crafted features and data pre-processing. In this paper, we introduce a novel neural network architecture that benefits from both word- and character-level representations automatically, by using a combination of bidirectional LSTM, CNN and CRF. Our system is truly end-to-end, requiring no feature engineering or data pre-processing, thus making it applicable to a wide range of sequence labeling tasks. We evaluate our system on two datasets for two sequence labeling tasks: the Penn Treebank WSJ corpus for part-of-speech (POS) tagging and the CoNLL 2003 corpus for named entity recognition (NER). We obtain state-of-the-art performance on both: 97.55% accuracy for POS tagging and 91.21% F1 for NER. |
Tasks | Feature Engineering, Named Entity Recognition, Part-Of-Speech Tagging |
Published | 2016-03-04 |
URL | http://arxiv.org/abs/1603.01354v5 |
PDF | http://arxiv.org/pdf/1603.01354v5.pdf
PWC | https://paperswithcode.com/paper/end-to-end-sequence-labeling-via-bi |
Repo | https://github.com/XiafeiYu/CNN_BILSTM_CRF |
Framework | tf |
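To illustrate the CRF layer's role, here is its decoding step in plain numpy: given per-token label scores from the BiLSTM-CNN encoder (emissions) and a learned label-transition matrix, Viterbi recovers the best label sequence. Training, which needs the forward algorithm, is omitted in this sketch.

```python
import numpy as np

def viterbi(emissions, transitions):
    """emissions: (T, K) per-token label scores;
    transitions: (K, K), transitions[i, j] = score of moving from label i to j."""
    T, K = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # total[i, j] = best score ending in label i at t-1, then label j at t.
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    # Backtrack from the best final label.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```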