July 26, 2019

Paper Group NANR 15

Graph Databases for Designing High-Performance Speech Recognition Grammars. Automatic Diagnosis Coding of Radiology Reports: A Comparison of Deep Learning and Conventional Classification Methods. An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation. Annotating Negation in Spanish Clinical Texts. Information Bottleneck …

Graph Databases for Designing High-Performance Speech Recognition Grammars


Title	Graph Databases for Designing High-Performance Speech Recognition Grammars
Authors	Maria Di Maro, Marco Valentino, Anna Riccio, Antonio Origlia
Abstract
Tasks	Language Modelling, Machine Translation, Speech Recognition, Spoken Dialogue Systems, Spoken Language Understanding
Published	2017-01-01
URL	https://www.aclweb.org/anthology/W17-6907/
PDF	https://www.aclweb.org/anthology/W17-6907
PWC	https://paperswithcode.com/paper/graph-databases-for-designing-high
Repo
Framework

Automatic Diagnosis Coding of Radiology Reports: A Comparison of Deep Learning and Conventional Classification Methods


Title	Automatic Diagnosis Coding of Radiology Reports: A Comparison of Deep Learning and Conventional Classification Methods
Authors	Sarvnaz Karimi, Xiang Dai, Hamed Hassanzadeh, Anthony Nguyen
Abstract	Diagnosis autocoding services and research intend to both improve the productivity of clinical coders and the accuracy of the coding. It is an important step in data analysis for funding and reimbursement, as well as health services planning and resource allocation. We investigate the applicability of deep learning at autocoding of radiology reports using International Classification of Diseases (ICD). Deep learning methods are known to require large training data. Our goal is to explore how to use these methods when the training data is sparse, skewed and relatively small, and how their effectiveness compares to conventional methods. We identify optimal parameters that could be used in setting up a convolutional neural network for autocoding with comparable results to that of conventional methods.
Tasks	Decision Making, Feature Engineering, Multi-Label Classification, Text Classification
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2342/
PDF	https://www.aclweb.org/anthology/W17-2342
PWC	https://paperswithcode.com/paper/automatic-diagnosis-coding-of-radiology
Repo
Framework

An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation


Title	An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation
Authors	Chenhui Chu, Raj Dabre, Sadao Kurohashi
Abstract	In this paper, we propose a novel domain adaptation method named "mixed fine tuning" for neural machine translation (NMT). We combine two existing approaches, namely fine tuning and multi domain NMT. We first train an NMT model on an out-of-domain parallel corpus, and then fine tune it on a parallel corpus which is a mix of the in-domain and out-of-domain corpora. All corpora are augmented with artificial tags to indicate specific domains. We empirically compare our proposed method against fine tuning and multi domain methods and discuss its benefits and shortcomings.
Tasks	Domain Adaptation, Machine Translation
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-2061/
PDF	https://www.aclweb.org/anthology/P17-2061
PWC	https://paperswithcode.com/paper/an-empirical-comparison-of-domain-adaptation
Repo
Framework
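
The data preparation described in the abstract above (tag every corpus with its domain, then fine tune on a mix of in-domain and out-of-domain data) can be sketched as follows. This is a minimal illustration, not the authors' code: the tag format `<2domain>`, the function names, and the toy sentence pairs are all assumptions, and the NMT training itself is left out.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def tag_corpus(pairs: List[Pair], domain: str) -> List[Pair]:
    """Prepend an artificial domain tag to every source sentence."""
    tag = f"<2{domain}>"  # hypothetical tag format, chosen for illustration
    return [(f"{tag} {src}", tgt) for src, tgt in pairs]


def mixed_fine_tuning_corpus(in_domain: List[Pair],
                             out_domain: List[Pair]) -> List[Pair]:
    """Build the mixed corpus used in the second (fine tuning) stage."""
    return tag_corpus(in_domain, "in") + tag_corpus(out_domain, "out")


# Toy data, invented for this example only.
out_domain = [("hello world", "hallo welt"),
              ("good morning", "guten morgen")]
in_domain = [("the patient has fever", "der patient hat fieber")]

# Stage 1: train on the tagged out-of-domain corpus.
stage1 = tag_corpus(out_domain, "out")
# Stage 2: fine tune on the tagged mix of both corpora.
stage2 = mixed_fine_tuning_corpus(in_domain, out_domain)
```

Because the tag is part of the source sentence, the same model can be asked for in-domain or out-of-domain output at test time by choosing the tag.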

Annotating Negation in Spanish Clinical Texts


Title	Annotating Negation in Spanish Clinical Texts
Authors	Noa Cruz, Roser Morante, Manuel J. Maña López, Jacinto Mata Vázquez, Carlos L. Parra Calderón
Abstract	In this paper we present on-going work on annotating negation in Spanish clinical documents. A corpus of anamnesis and radiology reports has been annotated by two domain expert annotators with negation markers and negated events. The Dice coefficient for inter-annotator agreement is higher than 0.94 for negation markers and higher than 0.72 for negated events. The corpus will be publicly released when the annotation process is finished, constituting the first corpus annotated with negation for Spanish clinical reports available for the NLP community.
Tasks
Published	2017-04-01
URL	https://www.aclweb.org/anthology/W17-1808/
PDF	https://www.aclweb.org/anthology/W17-1808
PWC	https://paperswithcode.com/paper/annotating-negation-in-spanish-clinical-texts
Repo
Framework
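
The Dice coefficient mentioned in the abstract above measures agreement between two annotators as twice the number of shared annotations divided by the total number of annotations made by both. A minimal sketch, assuming each annotation is represented as a hypothetical `(document id, start, end)` span and that only exact span matches count as agreement (the paper's matching criterion may differ):

```python
from typing import Set, Tuple

Span = Tuple[str, int, int]  # (document id, start offset, end offset); assumed layout


def dice(a: Set[Span], b: Set[Span]) -> float:
    """Dice coefficient: 2 * |A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty annotation sets agree fully
    return 2 * len(a & b) / (len(a) + len(b))


# Toy annotations, invented for this example only.
annotator_1 = {("doc1", 0, 2), ("doc1", 10, 13), ("doc2", 5, 8)}
annotator_2 = {("doc1", 0, 2), ("doc1", 10, 13), ("doc2", 6, 8)}

score = dice(annotator_1, annotator_2)  # two shared spans out of three each
```

In this toy case the annotators share two of their three spans each, so the score is 2 * 2 / (3 + 3).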

Information Bottleneck Inspired Method For Chat Text Segmentation


Title	Information Bottleneck Inspired Method For Chat Text Segmentation
Authors	S Vishal, Mohit Yadav, Lovekesh Vig, Gautam Shroff
Abstract	We present a novel technique for segmenting chat conversations using the information bottleneck method (Tishby et al., 2000), augmented with sequential continuity constraints. Furthermore, we utilize critical non-textual clues such as time between two consecutive posts and people mentions within the posts. To ascertain the effectiveness of the proposed method, we have collected data from public Slack conversations and Fresco, a proprietary platform deployed inside our organization. Experiments demonstrate that the proposed method yields an absolute (relative) improvement of as high as 3.23% (11.25%). To facilitate future research, we are releasing manual annotations for segmentation on public Slack conversations.
Tasks	Representation Learning, Text Generation, Topic Models
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-1020/
PDF	https://www.aclweb.org/anthology/I17-1020
PWC	https://paperswithcode.com/paper/information-bottleneck-inspired-method-for
Repo
Framework

Sequence-to-Dependency Neural Machine Translation


Title	Sequence-to-Dependency Neural Machine Translation
Authors	Shuangzhi Wu, Dongdong Zhang, Nan Yang, Mu Li, Ming Zhou
Abstract	Nowadays a typical Neural Machine Translation (NMT) model generates translations from left to right as a linear sequence, during which latent syntactic structures of the target sentences are not explicitly concerned. Inspired by the success of using syntactic knowledge of target language for improving statistical machine translation, in this paper we propose a novel Sequence-to-Dependency Neural Machine Translation (SD-NMT) method, in which the target word sequence and its corresponding dependency structure are jointly constructed and modeled, and this structure is used as context to facilitate word generations. Experimental results show that the proposed method significantly outperforms state-of-the-art baselines on Chinese-English and Japanese-English translation tasks.
Tasks	Machine Translation
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-1065/
PDF	https://www.aclweb.org/anthology/P17-1065
PWC	https://paperswithcode.com/paper/sequence-to-dependency-neural-machine
Repo
Framework

Cross-language forced alignment to assist community-based linguistics for low resource languages


Title	Cross-language forced alignment to assist community-based linguistics for low resource languages
Authors	Timothy Kempton
Abstract
Tasks
Published	2017-03-01
URL	https://www.aclweb.org/anthology/W17-0122/
PDF	https://www.aclweb.org/anthology/W17-0122
PWC	https://paperswithcode.com/paper/cross-language-forced-alignment-to-assist
Repo
Framework

SemEval-2017 Task 7: Detection and Interpretation of English Puns


Title	SemEval-2017 Task 7: Detection and Interpretation of English Puns
Authors	Tristan Miller, Christian Hempelmann, Iryna Gurevych
Abstract	A pun is a form of wordplay in which a word suggests two or more meanings by exploiting polysemy, homonymy, or phonological similarity to another word, for an intended humorous or rhetorical effect. Though a recurrent and expected feature in many discourse types, puns stymie traditional approaches to computational lexical semantics because they violate their one-sense-per-context assumption. This paper describes the first competitive evaluation for the automatic detection, location, and interpretation of puns. We describe the motivation for these tasks, the evaluation methods, and the manually annotated data set. Finally, we present an overview and discussion of the participating systems' methodologies, resources, and results.
Tasks	Word Sense Disambiguation
Published	2017-08-01
URL	https://www.aclweb.org/anthology/S17-2005/
PDF	https://www.aclweb.org/anthology/S17-2005
PWC	https://paperswithcode.com/paper/semeval-2017-task-7-detection-and
Repo
Framework

Thy Friend is My Friend: Iterative Collaborative Filtering for Sparse Matrix Estimation


Title	Thy Friend is My Friend: Iterative Collaborative Filtering for Sparse Matrix Estimation
Authors	Christian Borgs, Jennifer Chayes, Christina E. Lee, Devavrat Shah
Abstract	The sparse matrix estimation problem consists of estimating the distribution of an $n\times n$ matrix $Y$, from a sparsely observed single instance of this matrix where the entries of $Y$ are independent random variables. This captures a wide array of problems; special instances include matrix completion in the context of recommendation systems, graphon estimation, and community detection in (mixed membership) stochastic block models. Inspired by classical collaborative filtering for recommendation systems, we propose a novel iterative, collaborative filtering-style algorithm for matrix estimation in this generic setting. We show that the mean squared error (MSE) of our estimator converges to $0$ at the rate of $O(d^2 (pn)^{-2/5})$ as long as $\omega(d^5 n)$ random entries from a total of $n^2$ entries of $Y$ are observed (uniformly sampled), $\mathbb{E}[Y]$ has rank $d$, and the entries of $Y$ have bounded support. The maximum squared error across all entries converges to $0$ with high probability as long as we observe a little more, $\Omega(d^5 n \ln^5(n))$ entries. Our results are the best known sample complexity results in this generality.
Tasks	Community Detection, Graphon Estimation, Matrix Completion, Recommendation Systems
Published	2017-12-01
URL	http://papers.nips.cc/paper/7057-thy-friend-is-my-friend-iterative-collaborative-filtering-for-sparse-matrix-estimation
PDF	http://papers.nips.cc/paper/7057-thy-friend-is-my-friend-iterative-collaborative-filtering-for-sparse-matrix-estimation.pdf
PWC	https://paperswithcode.com/paper/thy-friend-is-my-friend-iterative
Repo
Framework

Investigating Phrase-Based and Neural-Based Machine Translation on Low-Resource Settings


Title	Investigating Phrase-Based and Neural-Based Machine Translation on Low-Resource Settings
Authors	Hai Long Trieu, Duc-Vu Tran, Le Minh Nguyen
Abstract
Tasks	Machine Translation
Published	2017-11-01
URL	https://www.aclweb.org/anthology/Y17-1051/
PDF	https://www.aclweb.org/anthology/Y17-1051
PWC	https://paperswithcode.com/paper/investigating-phrase-based-and-neural-based
Repo
Framework

STCP: Simplified-Traditional Chinese Conversion and Proofreading


Title	STCP: Simplified-Traditional Chinese Conversion and Proofreading
Authors	Jiarui Xu, Xuezhe Ma, Chen-Tse Tsai, Eduard Hovy
Abstract	This paper aims to provide an effective tool for conversion between Simplified Chinese and Traditional Chinese. We present STCP, a customizable system comprising a statistical conversion model and a proofreading web interface. Experiments show that our system achieves character-level conversion performance comparable with state-of-the-art systems. In addition, our proofreading interface can effectively support diagnostics and data annotation. STCP is available at http://lagos.lti.cs.cmu.edu:8002/
Tasks
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-3016/
PDF	https://www.aclweb.org/anthology/I17-3016
PWC	https://paperswithcode.com/paper/stcp-simplified-traditional-chinese
Repo
Framework

Universal Dependencies for Greek


Title	Universal Dependencies for Greek
Authors	Prokopis Prokopidis, Haris Papageorgiou
Abstract
Tasks
Published	2017-05-01
URL	https://www.aclweb.org/anthology/W17-0413/
PDF	https://www.aclweb.org/anthology/W17-0413
PWC	https://paperswithcode.com/paper/universal-dependencies-for-greek
Repo
Framework

An Empirical Analysis of Edit Importance between Document Versions


Title	An Empirical Analysis of Edit Importance between Document Versions
Authors	Tanya Goyal, Sachin Kelkar, Manas Agarwal, Jeenu Grover
Abstract	In this paper, we present a novel approach to infer significance of various textual edits to documents. An author may make several edits to a document; each edit varies in its impact to the content of the document. While some edits are surface changes and introduce negligible change, other edits may change the content/tone of the document significantly. In this paper, we perform an analysis on the human perceptions of edit importance while reviewing documents from one version to the next. We identify linguistic features that influence edit importance and model it in a regression based setting. We show that the predicted importance by our approach is highly correlated with the human perceived importance, established by a Mechanical Turk study.
Tasks	Document Summarization
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1295/
PDF	https://www.aclweb.org/anthology/D17-1295
PWC	https://paperswithcode.com/paper/an-empirical-analysis-of-edit-importance
Repo
Framework

Counterfactual Learning from Bandit Feedback under Deterministic Logging : A Case Study in Statistical Machine Translation


Title	Counterfactual Learning from Bandit Feedback under Deterministic Logging : A Case Study in Statistical Machine Translation
Authors	Carolin Lawrence, Artem Sokolov, Stefan Riezler
Abstract	The goal of counterfactual learning for statistical machine translation (SMT) is to optimize a target SMT system from logged data that consist of user feedback to translations that were predicted by another, historic SMT system. A challenge arises by the fact that risk-averse commercial SMT systems deterministically log the most probable translation. The lack of sufficient exploration of the SMT output space seemingly contradicts the theoretical requirements for counterfactual learning. We show that counterfactual learning from deterministic bandit logs is possible nevertheless by smoothing out deterministic components in learning. This can be achieved by additive and multiplicative control variates that avoid degenerate behavior in empirical risk minimization. Our simulation experiments show improvements of up to 2 BLEU points by counterfactual learning from deterministic bandit feedback.
Tasks	Machine Translation, Structured Prediction
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1272/
PDF	https://www.aclweb.org/anthology/D17-1272
PWC	https://paperswithcode.com/paper/counterfactual-learning-from-bandit-feedback-1
Repo
Framework

MayoNLP at SemEval 2017 Task 10: Word Embedding Distance Pattern for Keyphrase Classification in Scientific Publications


Title	MayoNLP at SemEval 2017 Task 10: Word Embedding Distance Pattern for Keyphrase Classification in Scientific Publications
Authors	Sijia Liu, Feichen Shen, Vipin Chaudhary, Hongfang Liu
Abstract	In this paper, we present MayoNLP's results from the participation in the ScienceIE shared task at SemEval 2017. We focused on the keyphrase classification task (Subtask B). We explored semantic similarities and patterns of keyphrases in scientific publications using pre-trained word embedding models. Word Embedding Distance Pattern, which uses the head noun word embedding to generate distance patterns based on labeled keyphrases, is proposed as an incremental feature set to enhance the conventional Named Entity Recognition feature sets. A support vector machine is used as the supervised classifier for keyphrase classification. Our system achieved an overall F1 score of 0.67 for keyphrase classification and 0.64 for keyphrase classification and relation detection.
Tasks	Named Entity Recognition, Word Embeddings
Published	2017-08-01
URL	https://www.aclweb.org/anthology/S17-2166/
PDF	https://www.aclweb.org/anthology/S17-2166
PWC	https://paperswithcode.com/paper/mayonlp-at-semeval-2017-task-10-word
Repo
Framework