October 16, 2019

2379 words 12 mins read

Paper Group NAWR 2

Connectionist Temporal Classification with Maximum Entropy Regularization

Title Connectionist Temporal Classification with Maximum Entropy Regularization
Authors Hu Liu, Sheng Jin, Changshui Zhang
Abstract Connectionist Temporal Classification (CTC) is an objective function for end-to-end sequence learning, which adopts dynamic programming algorithms to directly learn the mapping between sequences. CTC has shown promising results in many sequence learning applications including speech recognition and scene text recognition. However, CTC tends to produce highly peaky and overconfident distributions, which is a symptom of overfitting. To remedy this, we propose a regularization method based on maximum conditional entropy which penalizes peaky distributions and encourages exploration. We also introduce an entropy-based pruning method to dramatically reduce the number of CTC feasible paths by ruling out unreasonable alignments. Experiments on scene text recognition show that our proposed methods consistently improve over the CTC baseline without the need to adjust training settings. Code has been made publicly available at: https://github.com/liuhu-bigeye/enctc.crnn.
Tasks Scene Text Recognition, Speech Recognition
Published 2018-12-01
URL http://papers.nips.cc/paper/7363-connectionist-temporal-classification-with-maximum-entropy-regularization
PDF http://papers.nips.cc/paper/7363-connectionist-temporal-classification-with-maximum-entropy-regularization.pdf
PWC https://paperswithcode.com/paper/connectionist-temporal-classification-with
Repo https://github.com/liuhu-bigeye/enctc.crnn
Framework pytorch
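
The regularizer described in the abstract penalizes peaky output distributions. As a rough illustration only (not the authors' EnCTC, which regularizes the entropy over feasible alignment paths), the following PyTorch sketch adds a simple per-frame entropy bonus to the standard CTC loss; the weight beta and all variable names are assumptions.

```python
# Simplified sketch: standard CTC loss plus a per-frame entropy bonus.
# The paper's EnCTC works on the entropy over feasible CTC paths;
# this per-frame version only illustrates the general idea.
import torch
import torch.nn.functional as F

def ctc_with_entropy(log_probs, targets, input_lengths, target_lengths, beta=0.2):
    """log_probs: (T, N, C) log-softmax outputs; beta: assumed regularization weight."""
    ctc = F.ctc_loss(log_probs, targets, input_lengths, target_lengths,
                     blank=0, zero_infinity=True)
    frame_entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()
    # Subtracting the entropy term rewards flatter, less peaky distributions.
    return ctc - beta * frame_entropy

# Toy usage with random data.
T, N, C, S = 50, 4, 20, 10
log_probs = torch.randn(T, N, C).log_softmax(-1).requires_grad_()
targets = torch.randint(1, C, (N, S))
loss = ctc_with_entropy(log_probs, targets,
                        torch.full((N,), T, dtype=torch.long),
                        torch.full((N,), S, dtype=torch.long))
loss.backward()
```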

UFSAC: Unification of Sense Annotated Corpora and Tools

Title UFSAC: Unification of Sense Annotated Corpora and Tools
Authors Loïc Vial, Benjamin Lecouteux, Didier Schwab
Abstract
Tasks Word Sense Disambiguation
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1166/
PDF https://www.aclweb.org/anthology/L18-1166
PWC https://paperswithcode.com/paper/ufsac-unification-of-sense-annotated-corpora
Repo https://github.com/getalp/UFSAC
Framework none

Incorporating Argument-Level Interactions for Persuasion Comments Evaluation using Co-attention Model

Title Incorporating Argument-Level Interactions for Persuasion Comments Evaluation using Co-attention Model
Authors Lu Ji, Zhongyu Wei, Xiangkun Hu, Yang Liu, Qi Zhang, Xuanjing Huang
Abstract In this paper, we investigate the issue of persuasiveness evaluation for argumentative comments. Most of the existing research explores different text features of reply comments on word level and ignores interactions between participants. In general, viewpoints are usually expressed by multiple arguments and exchanged on argument level. To better model the process of dialogical argumentation, we propose a novel co-attention mechanism based neural network to capture the interactions between participants on argument level. Experimental results on a publicly available dataset show that the proposed model significantly outperforms some state-of-the-art methods for persuasiveness evaluation. Further analysis reveals that attention weights computed in our model are able to extract interactive argument pairs from the original post and the reply.
Tasks
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1314/
PDF https://www.aclweb.org/anthology/C18-1314
PWC https://paperswithcode.com/paper/incorporating-argument-level-interactions-for
Repo https://github.com/lji0126/Persuasion-Comments-Evaluation
Framework tf
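
To make the co-attention idea concrete, here is a minimal, hypothetical sketch (not the authors' TensorFlow code): an affinity matrix between argument representations of the original post and of the reply yields attention weights in both directions. The bilinear parameter W and all shapes are assumptions.

```python
# Hypothetical argument-level co-attention (not the authors' implementation).
import torch

def co_attention(post_args, reply_args, W):
    """post_args: (m, d), reply_args: (n, d), W: (d, d) assumed bilinear parameter."""
    affinity = reply_args @ W @ post_args.T            # (n, m) argument-pair affinities
    reply_to_post = torch.softmax(affinity, dim=1)     # each reply argument attends over post arguments
    post_to_reply = torch.softmax(affinity.T, dim=1)   # each post argument attends over reply arguments
    post_ctx = reply_to_post @ post_args               # (n, d) post context for each reply argument
    reply_ctx = post_to_reply @ reply_args             # (m, d) reply context for each post argument
    return post_ctx, reply_ctx, affinity

m, n, d = 5, 3, 64
post_ctx, reply_ctx, aff = co_attention(torch.randn(m, d), torch.randn(n, d), torch.randn(d, d))
# High-affinity entries in `aff` correspond to interactive argument pairs.
```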

Learning Efficient Single-stage Pedestrian Detectors by Asymptotic Localization Fitting

Title Learning Efficient Single-stage Pedestrian Detectors by Asymptotic Localization Fitting
Authors Wei Liu, Shengcai Liao, Weidong Hu, Xuezhi Liang, Xiao Chen
Abstract Though Faster R-CNN based two-stage detectors have achieved a significant boost in pedestrian detection accuracy, they are still too slow for practical applications. One solution is to simplify this workflow into a single-stage detector. However, current single-stage detectors (e.g. SSD) have not presented competitive accuracy on common pedestrian detection benchmarks. This paper works towards a pedestrian detector that enjoys the speed of SSD while maintaining the accuracy of Faster R-CNN. Specifically, a structurally simple but effective module called Asymptotic Localization Fitting (ALF) is proposed, which stacks a series of predictors to directly evolve the default anchor boxes of SSD step by step into improved detection results. As a result, during training the later predictors enjoy more and better-quality positive samples, while harder negatives can be mined with increasing IoU thresholds. On top of this, an efficient single-stage pedestrian detection architecture (denoted as ALFNet) is designed, achieving state-of-the-art performance on CityPersons and Caltech, two of the largest pedestrian detection benchmarks, and hence resulting in an attractive pedestrian detector in both accuracy and speed. Code is available at https://github.com/VideoObjectSearch/ALFNet.
Tasks Pedestrian Detection
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Wei_Liu_Learning_Efficient_Single-stage_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Wei_Liu_Learning_Efficient_Single-stage_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/learning-efficient-single-stage-pedestrian
Repo https://github.com/VideoObjectSearch/ALFNet
Framework tf
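
As a sketch of the asymptotic localization fitting idea only (not the ALFNet implementation), the snippet below steps default anchor boxes through a stack of stand-in predictors and notes where the growing IoU threshold would enter during training; the delta parameterization and thresholds are assumptions.

```python
# Illustrative-only sketch of stepwise anchor refinement (not the ALFNet code).
import numpy as np

def apply_deltas(boxes, deltas):
    """boxes: (N, 4) x1,y1,x2,y2; deltas: (N, 4) dx,dy,dw,dh box-regression offsets."""
    w, h = boxes[:, 2] - boxes[:, 0], boxes[:, 3] - boxes[:, 1]
    cx, cy = boxes[:, 0] + 0.5 * w, boxes[:, 1] + 0.5 * h
    cx, cy = cx + deltas[:, 0] * w, cy + deltas[:, 1] * h
    w, h = w * np.exp(deltas[:, 2]), h * np.exp(deltas[:, 3])
    return np.stack([cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h], axis=1)

def alf_refine(anchors, predictors, iou_thresholds=(0.5, 0.6, 0.7)):
    """predictors: callables standing in for the stacked heads that output deltas."""
    boxes = anchors
    for predict, thr in zip(predictors, iou_thresholds):
        boxes = apply_deltas(boxes, predict(boxes))
        # During training, positives at this step would be boxes with IoU >= thr
        # against ground truth; the matching itself is omitted in this sketch.
    return boxes

anchors = np.array([[10.0, 10.0, 50.0, 90.0]])
heads = [lambda b: np.zeros((len(b), 4))] * 3   # no-op stand-in predictors
print(alf_refine(anchors, heads))
```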

Debiasing Evidence Approximations: On Importance-weighted Autoencoders and Jackknife Variational Inference

Title Debiasing Evidence Approximations: On Importance-weighted Autoencoders and Jackknife Variational Inference
Authors Sebastian Nowozin
Abstract The importance-weighted autoencoder (IWAE) approach of Burda et al. defines a sequence of increasingly tighter bounds on the marginal likelihood of latent variable models. Recently, Cremer et al. reinterpreted the IWAE bounds as ordinary variational evidence lower bounds (ELBO) applied to increasingly accurate variational distributions. In this work, we provide yet another perspective on the IWAE bounds. We interpret each IWAE bound as a biased estimator of the true marginal likelihood where for the bound defined on $K$ samples we show the bias to be of order $O(1/K)$. In our theoretical analysis of the IWAE objective we derive asymptotic bias and variance expressions. Based on this analysis we develop jackknife variational inference (JVI), a family of bias-reduced estimators reducing the bias to $O(K^{-(m+1)})$ for any given $m < K$ while retaining computational efficiency. Finally, we demonstrate that JVI leads to improved evidence estimates in variational autoencoders. We also report first results on applying JVI to learning variational autoencoders. Our implementation is available at https://github.com/Microsoft/jackknife-variational-inference
Tasks Latent Variable Models
Published 2018-01-01
URL https://openreview.net/forum?id=HyZoi-WRb
PDF https://openreview.net/pdf?id=HyZoi-WRb
PWC https://paperswithcode.com/paper/debiasing-evidence-approximations-on
Repo https://github.com/Microsoft/jackknife-variational-inference
Framework none
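
A minimal numpy sketch of the first-order jackknife correction (JVI with m = 1), assuming only a vector of K log importance weights; this follows the standard jackknife recipe rather than the paper's exact implementation.

```python
# Numpy sketch: IWAE-K evidence estimate and its first-order jackknife correction.
import numpy as np

def logmeanexp(x):
    m = np.max(x)
    return m + np.log(np.mean(np.exp(x - m)))

def iwae_bound(log_w):
    """IWAE estimate of log p(x) from K log importance weights."""
    return logmeanexp(log_w)

def jvi1(log_w):
    """First-order jackknife: K * L_K - (K - 1) * mean of leave-one-out L_{K-1}."""
    K = len(log_w)
    loo = np.array([iwae_bound(np.delete(log_w, j)) for j in range(K)])
    return K * iwae_bound(log_w) - (K - 1) * loo.mean()

# Toy check: log-normal weights with E[w] = 1, so the true log evidence is 0.
rng = np.random.default_rng(0)
log_w = rng.normal(loc=-0.5, scale=1.0, size=16)
print(iwae_bound(log_w), jvi1(log_w))   # the JVI estimate is typically closer to 0
```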

Valency-Augmented Dependency Parsing

Title Valency-Augmented Dependency Parsing
Authors Tianze Shi, Lillian Lee
Abstract We present a complete, automated, and efficient approach for utilizing valency analysis in making dependency parsing decisions. It includes extraction of valency patterns, a probabilistic model for tagging these patterns, and a joint decoding process that explicitly considers the number and types of each token's syntactic dependents. On 53 treebanks representing 41 languages in the Universal Dependencies data, we find that incorporating valency information yields higher precision and F1 scores on the core arguments (subjects and complements) and functional relations (e.g., auxiliaries) that we employ for valency analysis. Precision on core arguments improves from 80.87 to 85.43. We further show that our approach can be applied to an ostensibly different formalism and dataset, Tree Adjoining Grammar as extracted from the Penn Treebank; there, we outperform the previous state-of-the-art labeled attachment score by 0.7. Finally, we explore the potential of extending valency patterns beyond their traditional domain by confirming their helpfulness in improving PP attachment decisions.
Tasks Dependency Parsing
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1159/
PDF https://www.aclweb.org/anthology/D18-1159
PWC https://paperswithcode.com/paper/valency-augmented-dependency-parsing
Repo https://github.com/tzshi/valency-parser-emnlp18
Framework none
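
As an illustrative sketch of the first ingredient, valency-pattern extraction, the function below collects the bag of core-argument and functional relations governed by each head in a dependency tree; the relation inventory and data layout are assumptions, not the authors' code.

```python
# Illustrative valency-pattern extraction (assumed relation set and data layout).
from collections import defaultdict

CORE_RELS = {"nsubj", "obj", "iobj", "ccomp", "xcomp", "aux", "cop"}

def valency_patterns(heads, deprels):
    """heads[i]: head index (1-based, 0 = root) of token i+1; deprels[i]: its relation."""
    deps = defaultdict(list)
    for head, rel in zip(heads, deprels):
        if rel in CORE_RELS:
            deps[head].append(rel)
    return {head: tuple(sorted(rels)) for head, rels in deps.items()}

# "She gave him a book": the verb (token 2) governs nsubj, iobj and obj.
heads = [2, 0, 2, 5, 2]
deprels = ["nsubj", "root", "iobj", "det", "obj"]
print(valency_patterns(heads, deprels))   # {2: ('iobj', 'nsubj', 'obj')}
```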

Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment

Title Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment
Authors Sonja Bosch, Thomas Eckart, Bettina Klimek, Dirk Goldhahn, Uwe Quasthoff
Abstract
Tasks Language Modelling
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1692/
PDF https://www.aclweb.org/anthology/L18-1692
PWC https://paperswithcode.com/paper/preparation-and-usage-of-xhosa
Repo https://github.com/MMoOn-Project/OpenBantu
Framework none

Discourse Representation Structure Parsing

Title Discourse Representation Structure Parsing
Authors Jiangming Liu, Shay B. Cohen, Mirella Lapata
Abstract We introduce an open-domain neural semantic parser which generates formal meaning representations in the style of Discourse Representation Theory (DRT; Kamp and Reyle 1993). We propose a method which transforms Discourse Representation Structures (DRSs) to trees and develop a structure-aware model which decomposes the decoding process into three stages: basic DRS structure prediction, condition prediction (i.e., predicates and relations), and referent prediction (i.e., variables). Experimental results on the Groningen Meaning Bank (GMB) show that our model outperforms competitive baselines by a wide margin.
Tasks Question Answering, Semantic Parsing
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-1040/
PDF https://www.aclweb.org/anthology/P18-1040
PWC https://paperswithcode.com/paper/discourse-representation-structure-parsing
Repo https://github.com/EdinburghNLP/EncDecDRSparsing
Framework pytorch

Semi-Supervised Neural System for Tagging, Parsing and Lematization

Title Semi-Supervised Neural System for Tagging, Parsing and Lematization
Authors Piotr Rybak, Alina Wróblewska
Abstract This paper describes the ICS PAS system which took part in the CoNLL 2018 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. The system consists of a jointly trained tagger, lemmatizer, and dependency parser which are based on features extracted by a biLSTM network. The system uses both fully connected and dilated convolutional neural architectures. The novelty of our approach is the use of an additional loss function, which reduces the number of cycles in the predicted dependency graphs, and the use of self-training to increase the system performance. The proposed system, i.e. ICS PAS (Warszawa), ranked 3rd/4th in the official evaluation, obtaining the following overall results: 73.02 (LAS), 60.25 (MLAS) and 64.44 (BLEX).
Tasks Dependency Parsing, Lemmatization, Machine Translation, Question Answering, Sentiment Analysis
Published 2018-10-01
URL https://www.aclweb.org/anthology/K18-2004/
PDF https://www.aclweb.org/anthology/K18-2004
PWC https://paperswithcode.com/paper/semi-supervised-neural-system-for-tagging
Repo https://github.com/360er0/COMBO
Framework tf
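
The cycle-reducing loss is the distinctive ingredient here. As a rough, non-differentiable sketch of what it targets (the paper uses a differentiable penalty during training), the function below counts cycles in a predicted head assignment by following head pointers; the data layout is an assumption.

```python
# Non-differentiable sketch: count cycles in a predicted head assignment.
def count_cycles(heads):
    """heads[i]: predicted head (1-based) of token i+1; 0 denotes the artificial root."""
    n = len(heads)
    color = [0] * (n + 1)          # 0 = unvisited, 1 = on current path, 2 = finished
    cycles = 0
    for start in range(1, n + 1):
        path, node = [], start
        while node != 0 and color[node] == 0:
            color[node] = 1
            path.append(node)
            node = heads[node - 1]
        if node != 0 and color[node] == 1:   # walked back into the current path
            cycles += 1
        for v in path:
            color[v] = 2
    return cycles

print(count_cycles([2, 3, 1, 0]))   # tokens 1-2-3 form a cycle -> 1
print(count_cycles([2, 0, 2, 3]))   # a valid tree rooted at token 2 -> 0
```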

Forecasting Treatment Responses Over Time Using Recurrent Marginal Structural Networks

Title Forecasting Treatment Responses Over Time Using Recurrent Marginal Structural Networks
Authors Bryan Lim
Abstract Electronic health records provide a rich source of data for machine learning methods to learn dynamic treatment responses over time. However, any direct estimation is hampered by the presence of time-dependent confounding, where actions taken are dependent on time-varying variables related to the outcome of interest. Drawing inspiration from marginal structural models, a class of methods in epidemiology which use propensity weighting to adjust for time-dependent confounders, we introduce the Recurrent Marginal Structural Network - a sequence-to-sequence architecture for forecasting a patient’s expected response to a series of planned treatments. Using simulations of a state-of-the-art pharmacokinetic-pharmacodynamic (PK-PD) model of tumor growth, we demonstrate the ability of our network to accurately learn unbiased treatment responses from observational data – even under changes in the policy of treatment assignments – and performance gains over benchmarks.
Tasks Epidemiology
Published 2018-12-01
URL http://papers.nips.cc/paper/7977-forecasting-treatment-responses-over-time-using-recurrent-marginal-structural-networks
PDF http://papers.nips.cc/paper/7977-forecasting-treatment-responses-over-time-using-recurrent-marginal-structural-networks.pdf
PWC https://paperswithcode.com/paper/forecasting-treatment-responses-over-time
Repo https://github.com/sjblim/rmsn_nips_2018
Framework tf
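
The network builds on marginal structural models, whose key quantity is the stabilized inverse-probability-of-treatment weight. A small numpy sketch of that weight, assuming per-step propensities from two hypothetical models, might look like this:

```python
# Numpy sketch of stabilized inverse-probability-of-treatment weights:
# sw_t = prod_{s<=t} p(A_s | treatment history) / p(A_s | treatment history, covariates).
import numpy as np

def stabilized_weights(p_num, p_den):
    """p_num[t], p_den[t]: probability of the treatment actually taken at step t
    under the numerator (history-only) and denominator (covariate-adjusted) models."""
    return np.cumprod(np.asarray(p_num) / np.asarray(p_den))

# Toy example for one patient over four time steps (all probabilities assumed).
p_num = [0.5, 0.6, 0.6, 0.7]
p_den = [0.4, 0.8, 0.5, 0.9]
print(stabilized_weights(p_num, p_den))   # per-step weights applied to the training loss
```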

Deep Learning for Epidemiological Predictions

Title Deep Learning for Epidemiological Predictions
Authors Yuexin Wu, Yiming Yang, Hiroshi Nishiura, Masaya Saitoh
Abstract Predicting new and urgent trends in epidemiological data is an important problem for public health, and has attracted increasing attention in the data mining and machine learning communities. The temporal nature of epidemiology data and the need for real-time prediction by the system place the problem in the category of time-series forecasting or prediction. While traditional autoregressive (AR) methods and Gaussian Process Regression (GPR) have been actively studied for solving this problem, deep learning techniques have not been explored in this domain. In this paper, we develop a deep learning framework, for the first time, to predict epidemiology profiles in the time-series perspective. We adopt Recurrent Neural Networks (RNNs) to capture the long-term correlation in the data and Convolutional Neural Networks (CNNs) to fuse information from data of different sources. A residual structure is also applied to prevent overfitting issues in the training process. We compared our model with the most widely used AR models on USA and Japan datasets. Our approach provides consistently better results than these baseline methods.
Tasks Epidemiology, Multivariate Time Series Forecasting, Time Series, Time Series Forecasting
Published 2018-07-21
URL https://www.onacademic.com/detail/journal_1000040816757010_db82.html
PDF https://www.onacademic.com/detail/journal_1000040816757010_db82.html
PWC https://paperswithcode.com/paper/deep-learning-for-epidemiological-predictions
Repo https://github.com/CrickWu/DL4Epi
Framework pytorch
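
In the spirit of the architecture sketched in the abstract (CNN fusion across sources, RNN for temporal correlation, plus a residual link), here is a hypothetical PyTorch module; layer sizes and the linear residual branch are assumptions, not the authors' exact model.

```python
# Hypothetical CNN + RNN + residual model for multi-region epidemic forecasting.
import torch
import torch.nn as nn

class CnnRnnRes(nn.Module):
    def __init__(self, n_regions, hidden=32, kernel=3, window=16):
        super().__init__()
        self.conv = nn.Conv1d(n_regions, hidden, kernel, padding=kernel // 2)  # fuse regions
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)                    # temporal correlation
        self.head = nn.Linear(hidden, n_regions)
        self.ar = nn.Linear(window, 1)   # residual autoregressive-style branch

    def forward(self, x):                                  # x: (batch, window, n_regions)
        h = torch.relu(self.conv(x.transpose(1, 2)))       # (batch, hidden, window)
        out, _ = self.rnn(h.transpose(1, 2))               # (batch, window, hidden)
        deep = self.head(out[:, -1])                       # next-step prediction per region
        linear = self.ar(x.transpose(1, 2)).squeeze(-1)    # residual linear term per region
        return deep + linear

model = CnnRnnRes(n_regions=10)
print(model(torch.randn(8, 16, 10)).shape)   # torch.Size([8, 10])
```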

Non-metric Similarity Graphs for Maximum Inner Product Search

Title Non-metric Similarity Graphs for Maximum Inner Product Search
Authors Stanislav Morozov, Artem Babenko
Abstract In this paper we address the problem of Maximum Inner Product Search (MIPS) that is currently the computational bottleneck in a large number of machine learning applications. While being similar to the nearest neighbor search (NNS), the MIPS problem was shown to be more challenging, as the inner product is not a proper metric function. We propose to solve the MIPS problem with the usage of similarity graphs, i.e., graphs where each vertex is connected to the vertices that are the most similar in terms of some similarity function. Originally, the framework of similarity graphs was proposed for metric spaces and in this paper we naturally extend it to the non-metric MIPS scenario. We demonstrate that, unlike existing approaches, similarity graphs do not require any data transformation to reduce MIPS to the NNS problem and should be used for the original data. Moreover, we explain why such a reduction is detrimental for similarity graphs. By an extensive comparison to the existing approaches, we show that the proposed method is a game-changer in terms of the runtime/accuracy trade-off for the MIPS problem.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7722-non-metric-similarity-graphs-for-maximum-inner-product-search
PDF http://papers.nips.cc/paper/7722-non-metric-similarity-graphs-for-maximum-inner-product-search.pdf
PWC https://paperswithcode.com/paper/non-metric-similarity-graphs-for-maximum
Repo https://github.com/stanis-morozov/ip-nsw
Framework none
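
A toy sketch of greedy routing on a similarity graph where the score is the raw inner product, with no transformation of the data; the random graph here merely exercises the search, whereas real similarity graphs connect each vector to its most similar neighbors.

```python
# Toy greedy search on a graph, scoring candidates by raw inner product.
import numpy as np

def greedy_mips(query, vectors, neighbors, start=0):
    """Walk the graph, always moving to a neighbor with a larger inner product."""
    current = start
    best = vectors[current] @ query
    while True:
        cand = max(neighbors[current], key=lambda j: vectors[j] @ query)
        score = vectors[cand] @ query
        if score <= best:
            return current, best
        current, best = cand, score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
# Random toy graph; a real similarity graph links each vector to its most similar ones.
graph = {i: list(rng.choice(100, size=5, replace=False)) for i in range(100)}
q = rng.normal(size=8)
print(greedy_mips(q, X, graph))   # (index of a local optimum, its inner product)
```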

IIT(BHU)–IIITH at CoNLL–SIGMORPHON 2018 Shared Task on Universal Morphological Reinflection

Title IIT(BHU)–IIITH at CoNLL–SIGMORPHON 2018 Shared Task on Universal Morphological Reinflection
Authors Abhishek Sharma, Ganesh Katrapati, Dipti Misra Sharma
Abstract
Tasks Feature Engineering, Morphological Inflection
Published 2018-10-01
URL https://www.aclweb.org/anthology/K18-3013/
PDF https://www.aclweb.org/anthology/K18-3013
PWC https://paperswithcode.com/paper/iitbhuaiiith-at-conllasigmorphon-2018-shared
Repo https://github.com/abhishek0318/conll-sigmorphon-2018
Framework pytorch

Generalizing A Person Retrieval Model Hetero- and Homogeneously

Title Generalizing A Person Retrieval Model Hetero- and Homogeneously
Authors Zhun Zhong, Liang Zheng, Shaozi Li, Yi Yang
Abstract Person re-identification (re-ID) poses unique challenges for unsupervised domain adaptation (UDA) in that classes in the source and target sets (domains) are entirely different and that image variations are largely caused by cameras. Given a labeled source training set and an unlabeled target training set, we aim to improve the generalization ability of re-ID models on the target testing set. To this end, we introduce a Hetero-Homogeneous Learning (HHL) method. Our method enforces two properties simultaneously: 1) camera invariance, learned via positive pairs formed by unlabeled target images and their camera style transferred counterparts; 2) domain connectedness, by regarding source / target images as negative matching pairs to the target / source images. The first property is implemented by homogeneous learning because training pairs are collected from the same domain. The second property is achieved by heterogeneous learning because we sample training pairs from both the source and target domains. On Market-1501, DukeMTMC-reID and CUHK03, we show that the two properties contribute indispensably and that very competitive re-ID UDA accuracy is achieved. Code is available at: https://github.com/zhunzhong07/HHL
Tasks Domain Adaptation, Person Re-Identification, Person Retrieval, Unsupervised Domain Adaptation
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Zhun_Zhong_Generalizing_A_Person_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Zhun_Zhong_Generalizing_A_Person_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/generalizing-a-person-retrieval-model-hetero
Repo https://github.com/zhunzhong07/HHL
Framework pytorch
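
As a hedged sketch of the pair construction described in the abstract (not the authors' HHL code): positives pair a target image with its camera-style-transferred copy, negatives pair images across domains, combined here in a triplet-style loss. The margin and shapes are assumptions.

```python
# Hedged sketch of the two pair types from the abstract, combined as a triplet loss.
import torch
import torch.nn.functional as F

def hhl_style_loss(f_target, f_target_styled, f_source, margin=0.3):
    """f_*: (B, d) L2-normalized embeddings; f_target_styled stands for the
    camera-style-transferred copies of the target images (assumed available)."""
    d_pos = (f_target - f_target_styled).pow(2).sum(1)   # camera invariance: same image, new style
    d_neg = (f_target - f_source).pow(2).sum(1)          # domain connectedness: cross-domain negative
    return F.relu(d_pos - d_neg + margin).mean()

B, d = 16, 128
rand_emb = lambda: F.normalize(torch.randn(B, d), dim=1)   # random stand-in embeddings
print(hhl_style_loss(rand_emb(), rand_emb(), rand_emb()))
```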

Neural Transition-based String Transduction for Limited-Resource Setting in Morphology

Title Neural Transition-based String Transduction for Limited-Resource Setting in Morphology
Authors Peter Makarov, Simon Clematide
Abstract We present a neural transition-based model that uses a simple set of edit actions (copy, delete, insert) for morphological transduction tasks such as inflection generation, lemmatization, and reinflection. In a large-scale evaluation on four datasets and dozens of languages, our approach consistently outperforms state-of-the-art systems on low and medium training-set sizes and is competitive in the high-resource setting. Learning to apply a generic copy action enables our approach to generalize quickly from a few data points. We successfully leverage minimum risk training to compensate for the weaknesses of MLE parameter learning and neutralize the negative effects of training a pipeline with a separate character aligner.
Tasks Lemmatization, Machine Translation, Morphological Inflection
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1008/
PDF https://www.aclweb.org/anthology/C18-1008
PWC https://paperswithcode.com/paper/neural-transition-based-string-transduction
Repo https://github.com/ZurichNLP/coling2018-neural-transition-based-morphology
Framework none
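
The edit-action inventory (copy, delete, insert) is easy to make concrete. The sketch below applies a hypothetical action sequence to a source string; the example inflection and data layout are illustrative, not taken from the paper.

```python
# Apply a copy/delete/insert action sequence to a source string.
def apply_actions(source, actions):
    """actions: list of ('copy',), ('delete',) or ('insert', char),
    consumed left to right over the source characters."""
    out, i = [], 0
    for act in actions:
        if act[0] == "copy":
            out.append(source[i]); i += 1
        elif act[0] == "delete":
            i += 1
        elif act[0] == "insert":
            out.append(act[1])
    return "".join(out)

# Toy German participle: "machen" -> "gemacht".
actions = [("insert", "g"), ("insert", "e"), ("copy",), ("copy",), ("copy",),
           ("copy",), ("delete",), ("delete",), ("insert", "t")]
print(apply_actions("machen", actions))   # gemacht
```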