Paper Group NAWR 2
Connectionist Temporal Classification with Maximum Entropy Regularization
Title | Connectionist Temporal Classification with Maximum Entropy Regularization |
Authors | Hu Liu, Sheng Jin, Changshui Zhang |
Abstract | Connectionist Temporal Classification (CTC) is an objective function for end-to-end sequence learning, which adopts dynamic programming algorithms to directly learn the mapping between sequences. CTC has shown promising results in many sequence learning applications including speech recognition and scene text recognition. However, CTC tends to produce highly peaky and overconfident distributions, which is a symptom of overfitting. To remedy this, we propose a regularization method based on maximum conditional entropy which penalizes peaky distributions and encourages exploration. We also introduce an entropy-based pruning method to dramatically reduce the number of CTC feasible paths by ruling out unreasonable alignments. Experiments on scene text recognition show that our proposed methods consistently improve over the CTC baseline without the need to adjust training settings. Code has been made publicly available at: https://github.com/liuhu-bigeye/enctc.crnn. |
Tasks | Scene Text Recognition, Speech Recognition |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7363-connectionist-temporal-classification-with-maximum-entropy-regularization |
http://papers.nips.cc/paper/7363-connectionist-temporal-classification-with-maximum-entropy-regularization.pdf | |
PWC | https://paperswithcode.com/paper/connectionist-temporal-classification-with |
Repo | https://github.com/liuhu-bigeye/enctc.crnn |
Framework | pytorch |
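As a rough illustration of the regularization idea in this abstract, the sketch below adds a frame-level entropy bonus to PyTorch's built-in CTC loss. It is a simplified surrogate, not the paper's path-level conditional entropy or its pruning scheme; the weight `beta` and the per-frame formulation are assumptions.

```python
import torch
import torch.nn.functional as F

def ctc_with_entropy_bonus(log_probs, targets, input_lengths, target_lengths, beta=0.1):
    """CTC loss minus a frame-level entropy bonus that discourages peaky,
    overconfident frame distributions.
    log_probs: (T, N, C) log-softmax outputs, as expected by F.ctc_loss.
    beta and the per-frame formulation are illustrative choices, not the paper's."""
    ctc = F.ctc_loss(log_probs, targets, input_lengths, target_lengths,
                     blank=0, reduction='mean', zero_infinity=True)
    probs = log_probs.exp()
    frame_entropy = -(probs * log_probs).sum(dim=-1)  # (T, N)
    return ctc - beta * frame_entropy.mean()          # maximizing entropy = subtracting it
```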
UFSAC: Unification of Sense Annotated Corpora and Tools
Title | UFSAC: Unification of Sense Annotated Corpora and Tools |
Authors | Lo{"\i}c Vial, Benjamin Lecouteux, Didier Schwab |
Abstract | |
Tasks | Word Sense Disambiguation |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1166/ |
https://www.aclweb.org/anthology/L18-1166 | |
PWC | https://paperswithcode.com/paper/ufsac-unification-of-sense-annotated-corpora |
Repo | https://github.com/getalp/UFSAC |
Framework | none |
Incorporating Argument-Level Interactions for Persuasion Comments Evaluation using Co-attention Model
Title | Incorporating Argument-Level Interactions for Persuasion Comments Evaluation using Co-attention Model |
Authors | Lu Ji, Zhongyu Wei, Xiangkun Hu, Yang Liu, Qi Zhang, Xuanjing Huang |
Abstract | In this paper, we investigate the issue of persuasiveness evaluation for argumentative comments. Most of the existing research explores different text features of reply comments on word level and ignores interactions between participants. In general, viewpoints are usually expressed by multiple arguments and exchanged on argument level. To better model the process of dialogical argumentation, we propose a novel co-attention mechanism based neural network to capture the interactions between participants on argument level. Experimental results on a publicly available dataset show that the proposed model significantly outperforms some state-of-the-art methods for persuasiveness evaluation. Further analysis reveals that attention weights computed in our model are able to extract interactive argument pairs from the original post and the reply. |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1314/ |
https://www.aclweb.org/anthology/C18-1314 | |
PWC | https://paperswithcode.com/paper/incorporating-argument-level-interactions-for |
Repo | https://github.com/lji0126/Persuasion-Comments-Evaluation |
Framework | tf |
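A minimal sketch of argument-level co-attention of the kind the abstract describes: an affinity matrix between post-argument and reply-argument embeddings, normalized in both directions. The linked repo uses TensorFlow; this illustration uses PyTorch for brevity, and the paper's exact scoring and pooling layers are omitted.

```python
import torch
import torch.nn.functional as F

def co_attention(post_args, reply_args):
    """Bidirectional attention between argument embeddings.
    post_args: (P, d) arguments from the original post; reply_args: (R, d) from the reply.
    Returns each side summarized under attention conditioned on the other side."""
    affinity = post_args @ reply_args.t()         # (P, R) pairwise argument affinities
    attn_over_post = F.softmax(affinity, dim=0)   # for each reply arg, weights over post args
    attn_over_reply = F.softmax(affinity, dim=1)  # for each post arg, weights over reply args
    post_ctx = attn_over_post.t() @ post_args     # (R, d)
    reply_ctx = attn_over_reply @ reply_args      # (P, d)
    return post_ctx, reply_ctx
```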
Learning Efficient Single-stage Pedestrian Detectors by Asymptotic Localization Fitting
Title | Learning Efficient Single-stage Pedestrian Detectors by Asymptotic Localization Fitting |
Authors | Wei Liu, Shengcai Liao, Weidong Hu, Xuezhi Liang, Xiao Chen |
Abstract | Though Faster R-CNN based two-stage detectors have brought a significant boost in pedestrian detection accuracy, they are still too slow for practical applications. One solution is to simplify this workflow into a single-stage detector. However, current single-stage detectors (e.g. SSD) have not presented competitive accuracy on common pedestrian detection benchmarks. This paper works towards a pedestrian detector that enjoys the speed of SSD while maintaining the accuracy of Faster R-CNN. Specifically, a structurally simple but effective module called Asymptotic Localization Fitting (ALF) is proposed, which stacks a series of predictors to directly evolve the default anchor boxes of SSD step by step into improved detection results. As a result, during training the later predictors enjoy more and better-quality positive samples, while harder negatives can be mined with increasing IoU thresholds. On top of this, an efficient single-stage pedestrian detection architecture (denoted ALFNet) is designed, achieving state-of-the-art performance on CityPersons and Caltech, two of the largest pedestrian detection benchmarks, and hence resulting in an attractive pedestrian detector in both accuracy and speed. Code is available at https://github.com/VideoObjectSearch/ALFNet. |
Tasks | Pedestrian Detection |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Wei_Liu_Learning_Efficient_Single-stage_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Wei_Liu_Learning_Efficient_Single-stage_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/learning-efficient-single-stage-pedestrian |
Repo | https://github.com/VideoObjectSearch/ALFNet |
Framework | tf |
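The core ALF idea is progressive anchor refinement: each stacked predictor outputs box offsets that move the current boxes closer to the targets, with later steps trained under stricter IoU thresholds. The sketch below shows only the box-decoding loop under the common (dx, dy, dw, dh) parameterization; in the actual detector the offsets come from convolutional predictors over feature maps, and the parameterization here is an assumption.

```python
import torch

def alf_decode(anchors, step_offsets):
    """Decode anchors through a stack of refinement steps.
    anchors: (N, 4) boxes as (cx, cy, w, h); step_offsets: list of (N, 4) tensors,
    one (dx, dy, dw, dh) prediction per ALF step."""
    boxes = anchors
    for d in step_offsets:                        # later steps start from better boxes
        cx = boxes[:, 0] + d[:, 0] * boxes[:, 2]
        cy = boxes[:, 1] + d[:, 1] * boxes[:, 3]
        w = boxes[:, 2] * d[:, 2].exp()
        h = boxes[:, 3] * d[:, 3].exp()
        boxes = torch.stack([cx, cy, w, h], dim=1)
    return boxes
```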
Debiasing Evidence Approximations: On Importance-weighted Autoencoders and Jackknife Variational Inference
Title | Debiasing Evidence Approximations: On Importance-weighted Autoencoders and Jackknife Variational Inference |
Authors | Sebastian Nowozin |
Abstract | The importance-weighted autoencoder (IWAE) approach of Burda et al. defines a sequence of increasingly tighter bounds on the marginal likelihood of latent variable models. Recently, Cremer et al. reinterpreted the IWAE bounds as ordinary variational evidence lower bounds (ELBO) applied to increasingly accurate variational distributions. In this work, we provide yet another perspective on the IWAE bounds. We interpret each IWAE bound as a biased estimator of the true marginal likelihood where for the bound defined on $K$ samples we show the bias to be of order $O(1/K)$. In our theoretical analysis of the IWAE objective we derive asymptotic bias and variance expressions. Based on this analysis we develop jackknife variational inference (JVI), a family of bias-reduced estimators reducing the bias to $O(K^{-(m+1)})$ for any given $m < K$ while retaining computational efficiency. Finally, we demonstrate that JVI leads to improved evidence estimates in variational autoencoders. We also report first results on applying JVI to learning variational autoencoders. Our implementation is available at https://github.com/Microsoft/jackknife-variational-inference. |
Tasks | Latent Variable Models |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HyZoi-WRb |
https://openreview.net/pdf?id=HyZoi-WRb | |
PWC | https://paperswithcode.com/paper/debiasing-evidence-approximations-on |
Repo | https://github.com/Microsoft/jackknife-variational-inference |
Framework | none |
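The first-order jackknife correction is simple to state: with the K-sample IWAE bound L_K and the K leave-one-out bounds L_{K-1}, the debiased estimate is K·L_K − (K−1)·mean(L_{K-1}), which cancels the leading O(1/K) bias term. The sketch below implements only that first-order case for a single data point; the general m-th order estimator and its efficient batched computation are not shown.

```python
import math
import torch

def iwae_bound(log_w):
    """IWAE evidence bound from K log importance weights: log((1/K) * sum_k w_k)."""
    return torch.logsumexp(log_w, dim=0) - math.log(log_w.shape[0])

def jvi_first_order(log_w):
    """First-order jackknife variational inference estimate for one data point:
    K * L_K - (K - 1) * mean of the K leave-one-out bounds."""
    K = log_w.shape[0]
    loo = torch.stack([iwae_bound(torch.cat([log_w[:i], log_w[i + 1:]]))
                       for i in range(K)])
    return K * iwae_bound(log_w) - (K - 1) * loo.mean()
```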
Valency-Augmented Dependency Parsing
Title | Valency-Augmented Dependency Parsing |
Authors | Tianze Shi, Lillian Lee |
Abstract | We present a complete, automated, and efficient approach for utilizing valency analysis in making dependency parsing decisions. It includes extraction of valency patterns, a probabilistic model for tagging these patterns, and a joint decoding process that explicitly considers the number and types of each token's syntactic dependents. On 53 treebanks representing 41 languages in the Universal Dependencies data, we find that incorporating valency information yields higher precision and F1 scores on the core arguments (subjects and complements) and functional relations (e.g., auxiliaries) that we employ for valency analysis. Precision on core arguments improves from 80.87 to 85.43. We further show that our approach can be applied to an ostensibly different formalism and dataset, Tree Adjoining Grammar as extracted from the Penn Treebank; there, we outperform the previous state-of-the-art labeled attachment score by 0.7. Finally, we explore the potential of extending valency patterns beyond their traditional domain by confirming their helpfulness in improving PP attachment decisions. |
Tasks | Dependency Parsing |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1159/ |
https://www.aclweb.org/anthology/D18-1159 | |
PWC | https://paperswithcode.com/paper/valency-augmented-dependency-parsing |
Repo | https://github.com/tzshi/valency-parser-emnlp18 |
Framework | none |
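For illustration only, here is one plausible way to extract the kind of valency pattern the abstract mentions: the multiset of core-argument and functional relations governed by a head. The relation inventory and the pattern representation are assumptions, not the paper's exact definitions; the pattern tagger and joint decoder are omitted.

```python
from collections import Counter

# Assumed relation inventory; the paper's exact set may differ.
CORE_RELS = {"nsubj", "csubj", "obj", "iobj", "ccomp", "xcomp", "aux", "cop"}

def valency_pattern(head, deps, rels=CORE_RELS):
    """Return the multiset of selected relations governed by `head`.
    deps: iterable of (head_index, dependent_index, relation) triples."""
    counts = Counter(rel for h, _, rel in deps if h == head and rel in rels)
    return tuple(sorted(counts.items()))

# e.g. valency_pattern(2, [(2, 1, "nsubj"), (2, 4, "obj"), (2, 3, "advmod")])
# -> (("nsubj", 1), ("obj", 1))
```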
Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment
Title | Preparation and Usage of Xhosa Lexicographical Data for a Multilingual, Federated Environment |
Authors | Sonja Bosch, Thomas Eckart, Bettina Klimek, Dirk Goldhahn, Uwe Quasthoff |
Abstract | |
Tasks | Language Modelling |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1692/ |
https://www.aclweb.org/anthology/L18-1692 | |
PWC | https://paperswithcode.com/paper/preparation-and-usage-of-xhosa |
Repo | https://github.com/MMoOn-Project/OpenBantu |
Framework | none |
Discourse Representation Structure Parsing
Title | Discourse Representation Structure Parsing |
Authors | Jiangming Liu, Shay B. Cohen, Mirella Lapata |
Abstract | We introduce an open-domain neural semantic parser which generates formal meaning representations in the style of Discourse Representation Theory (DRT; Kamp and Reyle 1993). We propose a method which transforms Discourse Representation Structures (DRSs) to trees and develop a structure-aware model which decomposes the decoding process into three stages: basic DRS structure prediction, condition prediction (i.e., predicates and relations), and referent prediction (i.e., variables). Experimental results on the Groningen Meaning Bank (GMB) show that our model outperforms competitive baselines by a wide margin. |
Tasks | Question Answering, Semantic Parsing |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1040/ |
https://www.aclweb.org/anthology/P18-1040 | |
PWC | https://paperswithcode.com/paper/discourse-representation-structure-parsing |
Repo | https://github.com/EdinburghNLP/EncDecDRSparsing |
Framework | pytorch |
Semi-Supervised Neural System for Tagging, Parsing and Lematization
Title | Semi-Supervised Neural System for Tagging, Parsing and Lematization |
Authors | Piotr Rybak, Alina Wróblewska |
Abstract | This paper describes the ICS PAS system which took part in the CoNLL 2018 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. The system consists of a jointly trained tagger, lemmatizer, and dependency parser which are based on features extracted by a biLSTM network. The system uses both fully connected and dilated convolutional neural architectures. The novelty of our approach is the use of an additional loss function, which reduces the number of cycles in the predicted dependency graphs, and the use of self-training to increase the system performance. The proposed system, i.e. ICS PAS (Warszawa), ranked 3rd/4th in the official evaluation, obtaining the following overall results: 73.02 (LAS), 60.25 (MLAS) and 64.44 (BLEX). |
Tasks | Dependency Parsing, Lemmatization, Machine Translation, Question Answering, Sentiment Analysis |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-2004/ |
https://www.aclweb.org/anthology/K18-2004 | |
PWC | https://paperswithcode.com/paper/semi-supervised-neural-system-for-tagging |
Repo | https://github.com/360er0/COMBO |
Framework | tf |
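The abstract mentions an extra loss term that discourages cycles in the predicted dependency graph. The paper's exact formulation is not reproduced here; the sketch below is just one standard differentiable way to penalize cycles, by summing trace(A^k)/k over powers of the soft head-probability matrix, and should be read as an assumption rather than the authors' loss.

```python
import torch

def soft_cycle_penalty(head_probs, max_len=None):
    """Differentiable cycle penalty over a soft dependency graph.
    head_probs: (n, n) matrix with head_probs[i, j] = P(head of token i is token j);
    trace(A^k) accumulates probability mass on length-k cycles."""
    n = head_probs.shape[0]
    max_len = max_len or n
    penalty = head_probs.new_zeros(())
    power = head_probs
    for k in range(1, max_len + 1):
        penalty = penalty + torch.trace(power) / k
        power = power @ head_probs
    return penalty
```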
Forecasting Treatment Responses Over Time Using Recurrent Marginal Structural Networks
Title | Forecasting Treatment Responses Over Time Using Recurrent Marginal Structural Networks |
Authors | Bryan Lim |
Abstract | Electronic health records provide a rich source of data for machine learning methods to learn dynamic treatment responses over time. However, any direct estimation is hampered by the presence of time-dependent confounding, where actions taken are dependent on time-varying variables related to the outcome of interest. Drawing inspiration from marginal structural models, a class of methods in epidemiology which use propensity weighting to adjust for time-dependent confounders, we introduce the Recurrent Marginal Structural Network - a sequence-to-sequence architecture for forecasting a patient’s expected response to a series of planned treatments. Using simulations of a state-of-the-art pharmacokinetic-pharmacodynamic (PK-PD) model of tumor growth, we demonstrate the ability of our network to accurately learn unbiased treatment responses from observational data – even under changes in the policy of treatment assignments – and performance gains over benchmarks. |
Tasks | Epidemiology |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7977-forecasting-treatment-responses-over-time-using-recurrent-marginal-structural-networks |
http://papers.nips.cc/paper/7977-forecasting-treatment-responses-over-time-using-recurrent-marginal-structural-networks.pdf | |
PWC | https://paperswithcode.com/paper/forecasting-treatment-responses-over-time |
Repo | https://github.com/sjblim/rmsn_nips_2018 |
Framework | tf |
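Marginal structural models reweight observed trajectories with stabilized inverse-probability-of-treatment weights, which is the quantity the propensity networks in an RMSN estimate. Assuming the two per-timestep propensities have already been computed, the cumulative weights are a one-liner; this is the classical weighting formula, not the paper's full seq2seq architecture.

```python
import numpy as np

def stabilized_ip_weights(p_treatment_given_past, p_treatment_given_history):
    """Stabilized inverse-probability-of-treatment weights SW_t.
    p_treatment_given_past:    (T,) numerator propensities P(A_t | past treatments)
    p_treatment_given_history: (T,) denominator propensities P(A_t | past treatments and covariates)
    Both are probabilities of the treatment actually received at each step."""
    return np.cumprod(p_treatment_given_past / p_treatment_given_history)
```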
Deep Learning for Epidemiological Predictions
Title | Deep Learning for Epidemiological Predictions |
Authors | Yuexin Wu, Yiming Yang, Hiroshi Nishiura, Masaya Saitoh |
Abstract | Predicting new and urgent trends in epidemiological data is an important problem for public health, and has attracted increasing attention in the data mining and machine learning communities. The temporal nature of epidemiology data and the need for real-time prediction place the problem in the category of time-series forecasting or prediction. While traditional autoregressive (AR) methods and Gaussian Process Regression (GPR) have been actively studied for solving this problem, deep learning techniques have not been explored in this domain. In this paper, we develop a deep learning framework, for the first time, to predict epidemiology profiles in the time-series perspective. We adopt Recurrent Neural Networks (RNNs) to capture the long-term correlation in the data and Convolutional Neural Networks (CNNs) to fuse information from data of different sources. A residual structure is also applied to prevent overfitting issues in the training process. We compared our model with the most widely used AR models on USA and Japan datasets. Our approach provides consistently better results than these baseline methods. |
Tasks | Epidemiology, Multivariate Time Series Forecasting, Time Series, Time Series Forecasting |
Published | 2018-07-21 |
URL | https://www.onacademic.com/detail/journal_1000040816757010_db82.html |
PWC | https://paperswithcode.com/paper/deep-learning-for-epidemiological-predictions |
Repo | https://github.com/CrickWu/DL4Epi |
Framework | pytorch |
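A minimal PyTorch sketch of the CNN-plus-RNN-plus-residual idea described above; the layer sizes, the GRU choice, and the linear autoregressive residual path are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class CNNRNNRes(nn.Module):
    """A 1-D convolution fuses signals across regions, a GRU models temporal
    dependence, and a linear autoregressive term acts as the residual path."""
    def __init__(self, n_regions, hidden=32, kernel=3):
        super().__init__()
        self.conv = nn.Conv1d(n_regions, hidden, kernel, padding=kernel // 2)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_regions)
        self.ar = nn.Linear(n_regions, n_regions)   # residual autoregressive path

    def forward(self, x):                            # x: (batch, time, n_regions)
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)  # (batch, time, hidden)
        h, _ = self.rnn(h)
        return self.out(h[:, -1]) + self.ar(x[:, -1])     # next-step forecast
```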
Non-metric Similarity Graphs for Maximum Inner Product Search
Title | Non-metric Similarity Graphs for Maximum Inner Product Search |
Authors | Stanislav Morozov, Artem Babenko |
Abstract | In this paper we address the problem of Maximum Inner Product Search (MIPS) that is currently the computational bottleneck in a large number of machine learning applications. While being similar to the nearest neighbor search (NNS), the MIPS problem was shown to be more challenging, as the inner product is not a proper metric function. We propose to solve the MIPS problem with the usage of similarity graphs, i.e., graphs where each vertex is connected to the vertices that are the most similar in terms of some similarity function. Originally, the framework of similarity graphs was proposed for metric spaces and in this paper we naturally extend it to the non-metric MIPS scenario. We demonstrate that, unlike existing approaches, similarity graphs do not require any data transformation to reduce MIPS to the NNS problem and should be used for the original data. Moreover, we explain why such a reduction is detrimental for similarity graphs. By an extensive comparison to the existing approaches, we show that the proposed method is a game-changer in terms of the runtime/accuracy trade-off for the MIPS problem. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7722-non-metric-similarity-graphs-for-maximum-inner-product-search |
http://papers.nips.cc/paper/7722-non-metric-similarity-graphs-for-maximum-inner-product-search.pdf | |
PWC | https://paperswithcode.com/paper/non-metric-similarity-graphs-for-maximum |
Repo | https://github.com/stanis-morozov/ip-nsw |
Framework | none |
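The search procedure on a similarity graph is a greedy walk that moves to whichever neighbor has the larger inner product with the query, with no transformation of the data. The sketch below shows that walk from a single entry point; graph construction and the beam/priority-queue variants used in practice are omitted.

```python
import numpy as np

def greedy_mips(query, vectors, neighbors, start, max_steps=1000):
    """Greedy walk on a similarity graph, scoring candidates by raw inner product.
    vectors: (n, d) database; neighbors: mapping vertex -> adjacent vertices;
    start: entry vertex index. Returns the vertex reached and its score."""
    current, best = start, float(vectors[start] @ query)
    for _ in range(max_steps):
        scores = {v: float(vectors[v] @ query) for v in neighbors[current]}
        if not scores:
            break
        nxt = max(scores, key=scores.get)
        if scores[nxt] <= best:
            break                      # local maximum of the inner product
        current, best = nxt, scores[nxt]
    return current, best
```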
IIT(BHU)–IIITH at CoNLL–SIGMORPHON 2018 Shared Task on Universal Morphological Reinflection
Title | IIT(BHU)–IIITH at CoNLL–SIGMORPHON 2018 Shared Task on Universal Morphological Reinflection |
Authors | Abhishek Sharma, Ganesh Katrapati, Dipti Misra Sharma |
Abstract | |
Tasks | Feature Engineering, Morphological Inflection |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-3013/ |
https://www.aclweb.org/anthology/K18-3013 | |
PWC | https://paperswithcode.com/paper/iitbhuaiiith-at-conllasigmorphon-2018-shared |
Repo | https://github.com/abhishek0318/conll-sigmorphon-2018 |
Framework | pytorch |
Generalizing A Person Retrieval Model Hetero- and Homogeneously
Title | Generalizing A Person Retrieval Model Hetero- and Homogeneously |
Authors | Zhun Zhong, Liang Zheng, Shaozi Li, Yi Yang |
Abstract | Person re-identification (re-ID) poses unique challenges for unsupervised domain adaptation (UDA) in that classes in the source and target sets (domains) are entirely different and that image variations are largely caused by cameras. Given a labeled source training set and an unlabeled target training set, we aim to improve the generalization ability of re-ID models on the target testing set. To this end, we introduce a Hetero-Homogeneous Learning (HHL) method. Our method enforces two properties simultaneously: 1) camera invariance, learned via positive pairs formed by unlabeled target images and their camera style transferred counterparts; 2) domain connectedness, by regarding source / target images as negative matching pairs to the target / source images. The first property is implemented by homogeneous learning because training pairs are collected from the same domain. The second property is achieved by heterogeneous learning because we sample training pairs from both the source and target domains. On Market-1501, DukeMTMC-reID and CUHK03, we show that the two properties contribute indispensably and that very competitive re-ID UDA accuracy is achieved. Code is available at: https://github.com/zhunzhong07/HHL |
Tasks | Domain Adaptation, Person Re-Identification, Person Retrieval, Unsupervised Domain Adaptation |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Zhun_Zhong_Generalizing_A_Person_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Zhun_Zhong_Generalizing_A_Person_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/generalizing-a-person-retrieval-model-hetero |
Repo | https://github.com/zhunzhong07/HHL |
Framework | pytorch |
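One way to picture the two HHL properties is a triplet loss in embedding space: the camera-style-transferred copy of a target image acts as the positive (camera invariance), and an image from the other domain acts as the negative (domain connectedness). The sketch below is a simplified stand-in with an illustrative margin, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def hhl_style_triplet(anchor, cam_transfer, cross_domain, margin=0.3):
    """anchor:       (B, d) embeddings of unlabeled target images
    cam_transfer: (B, d) embeddings of their camera-style-transferred copies (positives)
    cross_domain: (B, d) embeddings of images from the other domain (negatives)"""
    d_pos = F.pairwise_distance(anchor, cam_transfer)
    d_neg = F.pairwise_distance(anchor, cross_domain)
    return F.relu(d_pos - d_neg + margin).mean()
```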
Neural Transition-based String Transduction for Limited-Resource Setting in Morphology
Title | Neural Transition-based String Transduction for Limited-Resource Setting in Morphology |
Authors | Peter Makarov, Simon Clematide |
Abstract | We present a neural transition-based model that uses a simple set of edit actions (copy, delete, insert) for morphological transduction tasks such as inflection generation, lemmatization, and reinflection. In a large-scale evaluation on four datasets and dozens of languages, our approach consistently outperforms state-of-the-art systems on low and medium training-set sizes and is competitive in the high-resource setting. Learning to apply a generic copy action enables our approach to generalize quickly from a few data points. We successfully leverage minimum risk training to compensate for the weaknesses of MLE parameter learning and neutralize the negative effects of training a pipeline with a separate character aligner. |
Tasks | Lemmatization, Machine Translation, Morphological Inflection |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1008/ |
https://www.aclweb.org/anthology/C18-1008 | |
PWC | https://paperswithcode.com/paper/neural-transition-based-string-transduction |
Repo | https://github.com/ZurichNLP/coling2018-neural-transition-based-morphology |
Framework | none |
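The action inventory is small enough to spell out: the sketch below simply replays a predicted copy/delete/insert sequence over an input string. The neural model that scores actions and the character aligner used for supervision are omitted, and the German inflection example is illustrative.

```python
def apply_edits(source, actions):
    """Replay a copy/delete/insert action sequence over `source`.
    actions: list like [("copy",), ("delete",), ("insert", "t")].
    Example: apply_edits("singen", [("copy",)] * 4 + [("delete",), ("delete",), ("insert", "t")])
    -> "singt"."""
    out, i = [], 0
    for act in actions:
        if act[0] == "copy" and i < len(source):
            out.append(source[i]); i += 1
        elif act[0] == "delete" and i < len(source):
            i += 1
        elif act[0] == "insert":
            out.append(act[1])
    return "".join(out)
```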