Paper Group NAWR 2
Connectionist Temporal Classification with Maximum Entropy Regularization
Title | Connectionist Temporal Classification with Maximum Entropy Regularization |
Authors | Hu Liu, Sheng Jin, Changshui Zhang |
Abstract | Connectionist Temporal Classification (CTC) is an objective function for end-to-end sequence learning, which adopts dynamic programming algorithms to directly learn the mapping between sequences. CTC has shown promising results in many sequence learning applications including speech recognition and scene text recognition. However, CTC tends to produce highly peaky and overconfident distributions, which is a symptom of overfitting. To remedy this, we propose a regularization method based on maximum conditional entropy which penalizes peaky distributions and encourages exploration. We also introduce an entropy-based pruning method to dramatically reduce the number of CTC feasible paths by ruling out unreasonable alignments. Experiments on scene text recognition show that our proposed methods consistently improve over the CTC baseline without the need to adjust training settings. Code has been made publicly available at: https://github.com/liuhu-bigeye/enctc.crnn. |
Tasks | Scene Text Recognition, Speech Recognition |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7363-connectionist-temporal-classification-with-maximum-entropy-regularization |
http://papers.nips.cc/paper/7363-connectionist-temporal-classification-with-maximum-entropy-regularization.pdf | |
PWC | https://paperswithcode.com/paper/connectionist-temporal-classification-with |
Repo | https://github.com/liuhu-bigeye/enctc.crnn |
Framework | pytorch |
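As a rough illustration of the regularization idea in this abstract, the sketch below adds a frame-level entropy bonus to PyTorch's built-in CTC loss. It is a simplified surrogate, not the paper's path-level conditional entropy or its pruning scheme; the weight `beta` and the per-frame formulation are assumptions.

```python
import torch
import torch.nn.functional as F

def ctc_with_entropy_bonus(log_probs, targets, input_lengths, target_lengths, beta=0.1):
    """CTC loss minus a frame-level entropy bonus that discourages peaky,
    overconfident frame distributions.
    log_probs: (T, N, C) log-softmax outputs, as expected by F.ctc_loss.
    beta and the per-frame formulation are illustrative choices, not the paper's."""
    ctc = F.ctc_loss(log_probs, targets, input_lengths, target_lengths,
                     blank=0, reduction='mean', zero_infinity=True)
    probs = log_probs.exp()
    frame_entropy = -(probs * log_probs).sum(dim=-1)  # (T, N)
    return ctc - beta * frame_entropy.mean()          # maximizing entropy = subtracting it
```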
UFSAC: Unification of Sense Annotated Corpora and Tools
Title | UFSAC: Unification of Sense Annotated Corpora and Tools |
Authors | Lo{"\i}c Vial, Benjamin Lecouteux, Didier Schwab |
Abstract | |
Tasks | Word Sense Disambiguation |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1166/ |
https://www.aclweb.org/anthology/L18-1166 | |
PWC | https://paperswithcode.com/paper/ufsac-unification-of-sense-annotated-corpora |
Repo | https://github.com/getalp/UFSAC |
Framework | none |
Incorporating Argument-Level Interactions for Persuasion Comments Evaluation using Co-attention Model
Title | Incorporating Argument-Level Interactions for Persuasion Comments Evaluation using Co-attention Model |
Authors | Lu Ji, Zhongyu Wei, Xiangkun Hu, Yang Liu, Qi Zhang, Xuanjing Huang |
Abstract | In this paper, we investigate the issue of persuasiveness evaluation for argumentative comments. Most of the existing research explores different text features of reply comments on word level and ignores interactions between participants. In general, viewpoints are usually expressed by multiple arguments and exchanged on argument level. To better model the process of dialogical argumentation, we propose a novel co-attention mechanism based neural network to capture the interactions between participants on argument level. Experimental results on a publicly available dataset show that the proposed model significantly outperforms some state-of-the-art methods for persuasiveness evaluation. Further analysis reveals that attention weights computed in our model are able to extract interactive argument pairs from the original post and the reply. |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1314/ |
https://www.aclweb.org/anthology/C18-1314 | |
PWC | https://paperswithcode.com/paper/incorporating-argument-level-interactions-for |
Repo | https://github.com/lji0126/Persuasion-Comments-Evaluation |
Framework | tf |
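A minimal sketch of argument-level co-attention of the kind the abstract describes: an affinity matrix between post-argument and reply-argument embeddings, normalized in both directions. The linked repo uses TensorFlow; this illustration uses PyTorch for brevity, and the paper's exact scoring and pooling layers are omitted.

```python
import torch
import torch.nn.functional as F

def co_attention(post_args, reply_args):
    """Bidirectional attention between argument embeddings.
    post_args: (P, d) arguments from the original post; reply_args: (R, d) from the reply.
    Returns each side summarized under attention conditioned on the other side."""
    affinity = post_args @ reply_args.t()         # (P, R) pairwise argument affinities
    attn_over_post = F.softmax(affinity, dim=0)   # for each reply arg, weights over post args
    attn_over_reply = F.softmax(affinity, dim=1)  # for each post arg, weights over reply args
    post_ctx = attn_over_post.t() @ post_args     # (R, d)
    reply_ctx = attn_over_reply @ reply_args      # (P, d)
    return post_ctx, reply_ctx
```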
Learning Efficient Single-stage Pedestrian Detectors by Asymptotic Localization Fitting
Title | Learning Efficient Single-stage Pedestrian Detectors by Asymptotic Localization Fitting |
Authors | Wei Liu, Shengcai Liao, Weidong Hu, Xuezhi Liang, Xiao Chen |
Abstract | Though Faster R-CNN based two-stage detectors have brought a significant boost in pedestrian detection accuracy, they are still too slow for practical applications. One solution is to simplify this workflow into a single-stage detector. However, current single-stage detectors (e.g. SSD) have not presented competitive accuracy on common pedestrian detection benchmarks. This paper works towards a pedestrian detector that enjoys the speed of SSD while maintaining the accuracy of Faster R-CNN. Specifically, a structurally simple but effective module called Asymptotic Localization Fitting (ALF) is proposed, which stacks a series of predictors to directly evolve the default anchor boxes of SSD step by step into improved detection results. As a result, during training the later predictors enjoy more and better-quality positive samples, while harder negatives can be mined with increasing IoU thresholds. On top of this, an efficient single-stage pedestrian detection architecture (denoted ALFNet) is designed, achieving state-of-the-art performance on CityPersons and Caltech, two of the largest pedestrian detection benchmarks, and hence resulting in an attractive pedestrian detector in both accuracy and speed. Code is available at https://github.com/VideoObjectSearch/ALFNet. |
Tasks | Pedestrian Detection |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Wei_Liu_Learning_Efficient_Single-stage_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Wei_Liu_Learning_Efficient_Single-stage_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/learning-efficient-single-stage-pedestrian |
Repo | https://github.com/VideoObjectSearch/ALFNet |
Framework | tf |
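The core ALF idea is progressive anchor refinement: each stacked predictor outputs box offsets that move the current boxes closer to the targets, with later steps trained under stricter IoU thresholds. The sketch below shows only the box-decoding loop under the common (dx, dy, dw, dh) parameterization; in the actual detector the offsets come from convolutional predictors over feature maps, and the parameterization here is an assumption.

```python
import torch

def alf_decode(anchors, step_offsets):
    """Decode anchors through a stack of refinement steps.
    anchors: (N, 4) boxes as (cx, cy, w, h); step_offsets: list of (N, 4) tensors,
    one (dx, dy, dw, dh) prediction per ALF step."""
    boxes = anchors
    for d in step_offsets:                        # later steps start from better boxes
        cx = boxes[:, 0] + d[:, 0] * boxes[:, 2]
        cy = boxes[:, 1] + d[:, 1] * boxes[:, 3]
        w = boxes[:, 2] * d[:, 2].exp()
        h = boxes[:, 3] * d[:, 3].exp()
        boxes = torch.stack([cx, cy, w, h], dim=1)
    return boxes
```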
Debiasing Evidence Approximations: On Importance-weighted Autoencoders and Jackknife Variational Inference
Title | Debiasing Evidence Approximations: On Importance-weighted Autoencoders and Jackknife Variational Inference |
Authors | Sebastian Nowozin |
Abstract | The importance-weighted autoencoder (IWAE) approach of Burda et al. defines a sequence of increasingly tighter bounds on the marginal likelihood of latent variable models. Recently, Cremer et al. reinterpreted the IWAE bounds as ordinary variational evidence lower bounds (ELBO) applied to increasingly accurate variational distributions. In this work, we provide yet another perspective on the IWAE bounds. We interpret each IWAE bound as a biased estimator of the true marginal likelihood where for the bound defined on $K$ samples we show the bias to be of order $O(1/K)$. In our theoretical analysis of the IWAE objective we derive asymptotic bias and variance expressions. Based on this analysis we develop jackknife variational inference (JVI), a family of bias-reduced estimators reducing the bias to $O(K^{-(m+1)})$ for any given $m < K$ while retaining computational efficiency. Finally, we demonstrate that JVI leads to improved evidence estimates in variational autoencoders. We also report first results on applying JVI to learning variational autoencoders. Our implementation is available at https://github.com/Microsoft/jackknife-variational-inference. |
Tasks | Latent Variable Models |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HyZoi-WRb |
https://openreview.net/pdf?id=HyZoi-WRb | |
PWC | https://paperswithcode.com/paper/debiasing-evidence-approximations-on |
Repo | https://github.com/Microsoft/jackknife-variational-inference |
Framework | none |
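The first-order jackknife correction is simple to state: with the K-sample IWAE bound L_K and the K leave-one-out bounds L_{K-1}, the debiased estimate is K·L_K − (K−1)·mean(L_{K-1}), which cancels the leading O(1/K) bias term. The sketch below implements only that first-order case for a single data point; the general m-th order estimator and its efficient batched computation are not shown.

```python
import math
import torch

def iwae_bound(log_w):
    """IWAE evidence bound from K log importance weights: log((1/K) * sum_k w_k)."""
    return torch.logsumexp(log_w, dim=0) - math.log(log_w.shape[0])

def jvi_first_order(log_w):
    """First-order jackknife variational inference estimate for one data point:
    K * L_K - (K - 1) * mean of the K leave-one-out bounds."""
    K = log_w.shape[0]
    loo = torch.stack([iwae_bound(torch.cat([log_w[:i], log_w[i + 1:]]))
                       for i in range(K)])
    return K * iwae_bound(log_w) - (K - 1) * loo.mean()
```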
Valency-Augmented Dependency Parsing
Title | Valency-Augmented Dependency Parsing |
Authors | Tianze Shi, Lillian Lee |
Abstract | We present a complete, automated, and efficient approach for utilizing valency analysis in making dependency parsing decisions. It includes extraction of valency patterns, a probabilistic model for tagging these patterns, and a joint decoding process that explicitly considers the number and types of each token's syntactic dependents. On 53 treebanks representing 41 languages in the Universal Dependencies data, we find that incorporating valency information yields higher precision and F1 scores on the core arguments (subjects and complements) and functional relations (e.g., auxiliaries) that we employ for valency analysis. Precision on core arguments improves from 80.87 to 85.43. We further show that our approach can be applied to an ostensibly different formalism and dataset, Tree Adjoining Grammar as extracted from the Penn Treebank; there, we outperform the previous state-of-the-art labeled attachment score by 0.7. Finally, we explore the potential of extending valency patterns beyond their traditional domain by confirming their helpfulness in improving PP attachment decisions. |
Tasks | Dependency Parsing |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1159/ |
https://www.aclweb.org/anthology/D18-1159 | |
PWC | https://paperswithcode.com/paper/valency-augmented-dependency-parsing |
Repo | https://github.com/tzshi/valency-parser-emnlp18 |
Framework | none |
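For illustration only, here is one plausible way to extract the kind of valency pattern the abstract mentions: the multiset of core-argument and functional relations governed by a head. The relation inventory and the pattern representation are assumptions, not the paper's exact definitions; the pattern tagger and joint decoder are omitted.

```python
from collections import Counter

# Assumed relation inventory; the paper's exact set may differ.
CORE_RELS = {"nsubj", "csubj", "obj", "iobj", "ccomp", "xcomp", "aux", "cop"}

def valency_pattern(head, deps, rels=CORE_RELS):
    """Return the multiset of selected relations governed by `head`.
    deps: iterable of (head_index, dependent_index, relation) triples."""
    counts = Counter(rel for h, _, rel in deps if h == head and rel in rels)
    return tuple(sorted(counts.items()))

# e.g. valency_pattern(2, [(2, 1, "nsubj"), (2, 4, "obj"), (2, 3, "advmod")])
# -> (("nsubj", 1), ("obj", 1))
```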
Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment
Title | Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment |
Authors | Sonja Bosch, Thomas Eckart, Bettina Klimek, Dirk Goldhahn, Uwe Quasthoff |
Abstract | |
Tasks | Language Modelling |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1692/ |
https://www.aclweb.org/anthology/L18-1692 | |
PWC | https://paperswithcode.com/paper/preparation-and-usage-of-xhosa |
Repo | https://github.com/MMoOn-Project/OpenBantu |
Framework | none |
Discourse Representation Structure Parsing
Title | Discourse Representation Structure Parsing |
Authors | Jiangming Liu, Shay B. Cohen, Mirella Lapata |
Abstract | We introduce an open-domain neural semantic parser which generates formal meaning representations in the style of Discourse Representation Theory (DRT; Kamp and Reyle 1993). We propose a method which transforms Discourse Representation Structures (DRSs) to trees and develop a structure-aware model which decomposes the decoding process into three stages: basic DRS structure prediction, condition prediction (i.e., predicates and relations), and referent prediction (i.e., variables). Experimental results on the Groningen Meaning Bank (GMB) show that our model outperforms competitive baselines by a wide margin. |
Tasks | Question Answering, Semantic Parsing |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1040/ |
https://www.aclweb.org/anthology/P18-1040 | |
PWC | https://paperswithcode.com/paper/discourse-representation-structure-parsing |
Repo | https://github.com/EdinburghNLP/EncDecDRSparsing |
Framework | pytorch |
Semi-Supervised Neural System for Tagging, Parsing and Lematization
Title | Semi-Supervised Neural System for Tagging, Parsing and Lematization |
Authors | Piotr Rybak, Alina Wróblewska |
Abstract | This paper describes the ICS PAS system which took part in the CoNLL 2018 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. The system consists of a jointly trained tagger, lemmatizer, and dependency parser which are based on features extracted by a biLSTM network. The system uses both fully connected and dilated convolutional neural architectures. The novelty of our approach is the use of an additional loss function, which reduces the number of cycles in the predicted dependency graphs, and the use of self-training to increase the system performance. The proposed system, i.e. ICS PAS (Warszawa), ranked 3rd/4th in the official evaluation, obtaining the following overall results: 73.02 (LAS), 60.25 (MLAS) and 64.44 (BLEX). |
Tasks | Dependency Parsing, Lemmatization, Machine Translation, Question Answering, Sentiment Analysis |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-2004/ |
https://www.aclweb.org/anthology/K18-2004 | |
PWC | https://paperswithcode.com/paper/semi-supervised-neural-system-for-tagging |
Repo | https://github.com/360er0/COMBO |
Framework | tf |
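The abstract mentions an extra loss term that discourages cycles in the predicted dependency graph. The paper's exact formulation is not reproduced here; the sketch below is just one standard differentiable way to penalize cycles, by summing trace(A^k)/k over powers of the soft head-probability matrix, and should be read as an assumption rather than the authors' loss.

```python
import torch

def soft_cycle_penalty(head_probs, max_len=None):
    """Differentiable cycle penalty over a soft dependency graph.
    head_probs: (n, n) matrix with head_probs[i, j] = P(head of token i is token j);
    trace(A^k) accumulates probability mass on length-k cycles."""
    n = head_probs.shape[0]
    max_len = max_len or n
    penalty = head_probs.new_zeros(())
    power = head_probs
    for k in range(1, max_len + 1):
        penalty = penalty + torch.trace(power) / k
        power = power @ head_probs
    return penalty
```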
Forecasting Treatment Responses Over Time Using Recurrent Marginal Structural Networks
Title | Forecasting Treatment Responses Over Time Using Recurrent Marginal Structural Networks |
Authors | Bryan Lim |
Abstract | Electronic health records provide a rich source of data for machine learning methods to learn dynamic treatment responses over time. However, any direct estimation is hampered by the presence of time-dependent confounding, where actions taken are dependent on time-varying variables related to the outcome of interest. Drawing inspiration from marginal structural models, a class of methods in epidemiology which use propensity weighting to adjust for time-dependent confounders, we introduce the Recurrent Marginal Structural Network - a sequence-to-sequence architecture for forecasting a patient’s expected response to a series of planned treatments. Using simulations of a state-of-the-art pharmacokinetic-pharmacodynamic (PK-PD) model of tumor growth, we demonstrate the ability of our network to accurately learn unbiased treatment responses from observational data – even under changes in the policy of treatment assignments – and performance gains over benchmarks. |
Tasks | Epidemiology |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7977-forecasting-treatment-responses-over-time-using-recurrent-marginal-structural-networks |
http://papers.nips.cc/paper/7977-forecasting-treatment-responses-over-time-using-recurrent-marginal-structural-networks.pdf | |
PWC | https://paperswithcode.com/paper/forecasting-treatment-responses-over-time |
Repo | https://github.com/sjblim/rmsn_nips_2018 |
Framework | tf |
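Marginal structural models reweight observed trajectories with stabilized inverse-probability-of-treatment weights, which is the quantity the propensity networks in an RMSN estimate. Assuming the two per-timestep propensities have already been computed, the cumulative weights are a one-liner; this is the classical weighting formula, not the paper's full seq2seq architecture.

```python
import numpy as np

def stabilized_ip_weights(p_treatment_given_past, p_treatment_given_history):
    """Stabilized inverse-probability-of-treatment weights SW_t.
    p_treatment_given_past:    (T,) numerator propensities P(A_t | past treatments)
    p_treatment_given_history: (T,) denominator propensities P(A_t | past treatments and covariates)
    Both are probabilities of the treatment actually received at each step."""
    return np.cumprod(p_treatment_given_past / p_treatment_given_history)
```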
Deep Learning for Epidemiological Predictions
Title | Deep Learning for Epidemiological Predictions |
Authors | Yuexin Wu, Yiming Yang, Hiroshi Nishiura, Masaya Saitoh |
Abstract | Predicting new and urgent trends in epidemiological data is an important problem for public health, and has attracted increasing attention in the data mining and machine learning communities. The temporal nature of epidemiology data and the need for real-time prediction place the problem in the category of time-series forecasting or prediction. While traditional autoregressive (AR) methods and Gaussian Process Regression (GPR) have been actively studied for solving this problem, deep learning techniques have not been explored in this domain. In this paper, we develop a deep learning framework, for the first time, to predict epidemiology profiles in the time-series perspective. We adopt Recurrent Neural Networks (RNNs) to capture the long-term correlation in the data and Convolutional Neural Networks (CNNs) to fuse information from data of different sources. A residual structure is also applied to prevent overfitting issues in the training process. We compared our model with the most widely used AR models on USA and Japan datasets. Our approach provides consistently better results than these baseline methods. |
Tasks | Epidemiology, Multivariate Time Series Forecasting, Time Series, Time Series Forecasting |
Published | 2018-07-21 |
URL | https://www.onacademic.com/detail/journal_1000040816757010_db82.html |
PWC | https://paperswithcode.com/paper/deep-learning-for-epidemiological-predictions |
Repo | https://github.com/CrickWu/DL4Epi |
Framework | pytorch |
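A minimal PyTorch sketch of the CNN-plus-RNN-plus-residual idea described above; the layer sizes, the GRU choice, and the linear autoregressive residual path are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class CNNRNNRes(nn.Module):
    """A 1-D convolution fuses signals across regions, a GRU models temporal
    dependence, and a linear autoregressive term acts as the residual path."""
    def __init__(self, n_regions, hidden=32, kernel=3):
        super().__init__()
        self.conv = nn.Conv1d(n_regions, hidden, kernel, padding=kernel // 2)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_regions)
        self.ar = nn.Linear(n_regions, n_regions)   # residual autoregressive path

    def forward(self, x):                            # x: (batch, time, n_regions)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)  # (batch, time, hidden)
        h, _ = self.rnn(h)
        return self.out(h[:, -1]) + self.ar(x[:, -1])     # next-step forecast
```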
Non-metric Similarity Graphs for Maximum Inner Product Search
Title | Non-metric Similarity Graphs for Maximum Inner Product Search |
Authors | Stanislav Morozov, Artem Babenko |
Abstract | In this paper we address the problem of Maximum Inner Product Search (MIPS) that is currently the computational bottleneck in a large number of machine learning applications. While being similar to the nearest neighbor search (NNS), the MIPS problem was shown to be more challenging, as the inner product is not a proper metric function. We propose to solve the MIPS problem with the usage of similarity graphs, i.e., graphs where each vertex is connected to the vertices that are the most similar in terms of some similarity function. Originally, the framework of similarity graphs was proposed for metric spaces and in this paper we naturally extend it to the non-metric MIPS scenario. We demonstrate that, unlike existing approaches, similarity graphs do not require any data transformation to reduce MIPS to the NNS problem and should be used for the original data. Moreover, we explain why such a reduction is detrimental for similarity graphs. By an extensive comparison to the existing approaches, we show that the proposed method is a game-changer in terms of the runtime/accuracy trade-off for the MIPS problem. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7722-non-metric-similarity-graphs-for-maximum-inner-product-search |
http://papers.nips.cc/paper/7722-non-metric-similarity-graphs-for-maximum-inner-product-search.pdf | |
PWC | https://paperswithcode.com/paper/non-metric-similarity-graphs-for-maximum |
Repo | https://github.com/stanis-morozov/ip-nsw |
Framework | none |
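The search procedure on a similarity graph is a greedy walk that moves to whichever neighbor has the larger inner product with the query, with no transformation of the data. The sketch below shows that walk from a single entry point; graph construction and the beam/priority-queue variants used in practice are omitted.

```python
import numpy as np

def greedy_mips(query, vectors, neighbors, start, max_steps=1000):
    """Greedy walk on a similarity graph, scoring candidates by raw inner product.
    vectors: (n, d) database; neighbors: mapping vertex -> adjacent vertices;
    start: entry vertex index. Returns the vertex reached and its score."""
    current, best = start, float(vectors[start] @ query)
    for _ in range(max_steps):
        scores = {v: float(vectors[v] @ query) for v in neighbors[current]}
        if not scores:
            break
        nxt = max(scores, key=scores.get)
        if scores[nxt] <= best:
            break                      # local maximum of the inner product
        current, best = nxt, scores[nxt]
    return current, best
```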
IIT(BHU)–IIITH at CoNLL–SIGMORPHON 2018 Shared Task on Universal Morphological Reinflection
Title | IIT(BHU)–IIITH at CoNLL–SIGMORPHON 2018 Shared Task on Universal Morphological Reinflection |
Authors | Abhishek Sharma, Ganesh Katrapati, Dipti Misra Sharma |
Abstract | |
Tasks | Feature Engineering, Morphological Inflection |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-3013/ |
https://www.aclweb.org/anthology/K18-3013 | |
PWC | https://paperswithcode.com/paper/iitbhuaiiith-at-conllasigmorphon-2018-shared |
Repo | https://github.com/abhishek0318/conll-sigmorphon-2018 |
Framework | pytorch |
Generalizing A Person Retrieval Model Hetero- and Homogeneously
Title | Generalizing A Person Retrieval Model Hetero- and Homogeneously |
Authors | Zhun Zhong, Liang Zheng, Shaozi Li, Yi Yang |
Abstract | Person re-identification (re-ID) poses unique challenges for unsupervised domain adaptation (UDA) in that classes in the source and target sets (domains) are entirely different and that image variations are largely caused by cameras. Given a labeled source training set and an unlabeled target training set, we aim to improve the generalization ability of re-ID models on the target testing set. To this end, we introduce a Hetero-Homogeneous Learning (HHL) method. Our method enforces two properties simultaneously: 1) camera invariance, learned via positive pairs formed by unlabeled target images and their camera style transferred counterparts; 2) domain connectedness, by regarding source / target images as negative matching pairs to the target / source images. The first property is implemented by homogeneous learning because training pairs are collected from the same domain. The second property is achieved by heterogeneous learning because we sample training pairs from both the source and target domains. On Market-1501, DukeMTMC-reID and CUHK03, we show that the two properties contribute indispensably and that very competitive re-ID UDA accuracy is achieved. Code is available at: https://github.com/zhunzhong07/HHL |
Tasks | Domain Adaptation, Person Re-Identification, Person Retrieval, Unsupervised Domain Adaptation |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Zhun_Zhong_Generalizing_A_Person_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Zhun_Zhong_Generalizing_A_Person_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/generalizing-a-person-retrieval-model-hetero |
Repo | https://github.com/zhunzhong07/HHL |
Framework | pytorch |
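One way to picture the two HHL properties is a triplet loss in embedding space: the camera-style-transferred copy of a target image acts as the positive (camera invariance), and an image from the other domain acts as the negative (domain connectedness). The sketch below is a simplified stand-in with an illustrative margin, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def hhl_style_triplet(anchor, cam_transfer, cross_domain, margin=0.3):
    """anchor:       (B, d) embeddings of unlabeled target images
    cam_transfer: (B, d) embeddings of their camera-style-transferred copies (positives)
    cross_domain: (B, d) embeddings of images from the other domain (negatives)"""
    d_pos = F.pairwise_distance(anchor, cam_transfer)
    d_neg = F.pairwise_distance(anchor, cross_domain)
    return F.relu(d_pos - d_neg + margin).mean()
```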
Neural Transition-based String Transduction for Limited-Resource Setting in Morphology
Title | Neural Transition-based String Transduction for Limited-Resource Setting in Morphology |
Authors | Peter Makarov, Simon Clematide |
Abstract | We present a neural transition-based model that uses a simple set of edit actions (copy, delete, insert) for morphological transduction tasks such as inflection generation, lemmatization, and reinflection. In a large-scale evaluation on four datasets and dozens of languages, our approach consistently outperforms state-of-the-art systems on low and medium training-set sizes and is competitive in the high-resource setting. Learning to apply a generic copy action enables our approach to generalize quickly from a few data points. We successfully leverage minimum risk training to compensate for the weaknesses of MLE parameter learning and neutralize the negative effects of training a pipeline with a separate character aligner. |
Tasks | Lemmatization, Machine Translation, Morphological Inflection |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1008/ |
https://www.aclweb.org/anthology/C18-1008 | |
PWC | https://paperswithcode.com/paper/neural-transition-based-string-transduction |
Repo | https://github.com/ZurichNLP/coling2018-neural-transition-based-morphology |
Framework | none |
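The action inventory is small enough to spell out: the sketch below simply replays a predicted copy/delete/insert sequence over an input string. The neural model that scores actions and the character aligner used for supervision are omitted, and the German inflection example is illustrative.

```python
def apply_edits(source, actions):
    """Replay a copy/delete/insert action sequence over `source`.
    actions: list like [("copy",), ("delete",), ("insert", "t")].
    Example: apply_edits("singen", [("copy",)] * 4 + [("delete",), ("delete",), ("insert", "t")])
    -> "singt"."""
    out, i = [], 0
    for act in actions:
        if act[0] == "copy" and i < len(source):
            out.append(source[i]); i += 1
        elif act[0] == "delete" and i < len(source):
            i += 1
        elif act[0] == "insert":
            out.append(act[1])
    return "".join(out)
```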