July 26, 2019

Paper Group NANR 15

Graph Databases for Designing High-Performance Speech Recognition Grammars. Automatic Diagnosis Coding of Radiology Reports: A Comparison of Deep Learning and Conventional Classification Methods. An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation. Annotating Negation in Spanish Clinical Texts. Information Bottleneck …

Graph Databases for Designing High-Performance Speech Recognition Grammars

Title Graph Databases for Designing High-Performance Speech Recognition Grammars
Authors Maria Di Maro, Marco Valentino, Anna Riccio, Antonio Origlia
Abstract
Tasks Language Modelling, Machine Translation, Speech Recognition, Spoken Dialogue Systems, Spoken Language Understanding
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-6907/
PDF https://www.aclweb.org/anthology/W17-6907
PWC https://paperswithcode.com/paper/graph-databases-for-designing-high
Repo
Framework

Automatic Diagnosis Coding of Radiology Reports: A Comparison of Deep Learning and Conventional Classification Methods

Title Automatic Diagnosis Coding of Radiology Reports: A Comparison of Deep Learning and Conventional Classification Methods
Authors Sarvnaz Karimi, Xiang Dai, Hamed Hassanzadeh, Anthony Nguyen
Abstract Diagnosis autocoding services and research intend to improve both the productivity of clinical coders and the accuracy of the coding. It is an important step in data analysis for funding and reimbursement, as well as health services planning and resource allocation. We investigate the applicability of deep learning to autocoding of radiology reports using the International Classification of Diseases (ICD). Deep learning methods are known to require large training data. Our goal is to explore how to use these methods when the training data is sparse, skewed and relatively small, and how their effectiveness compares to conventional methods. We identify optimal parameters that could be used in setting up a convolutional neural network for autocoding with results comparable to those of conventional methods.
Tasks Decision Making, Feature Engineering, Multi-Label Classification, Text Classification
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2342/
PDF https://www.aclweb.org/anthology/W17-2342
PWC https://paperswithcode.com/paper/automatic-diagnosis-coding-of-radiology
Repo
Framework

An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation

Title An Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation
Authors Chenhui Chu, Raj Dabre, Sadao Kurohashi
Abstract In this paper, we propose a novel domain adaptation method named "mixed fine tuning" for neural machine translation (NMT). We combine two existing approaches, namely fine tuning and multi domain NMT. We first train an NMT model on an out-of-domain parallel corpus, and then fine tune it on a parallel corpus which is a mix of the in-domain and out-of-domain corpora. All corpora are augmented with artificial tags to indicate specific domains. We empirically compare our proposed method against fine tuning and multi domain methods and discuss its benefits and shortcomings.
Tasks Domain Adaptation, Machine Translation
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2061/
PDF https://www.aclweb.org/anthology/P17-2061
PWC https://paperswithcode.com/paper/an-empirical-comparison-of-domain-adaptation
Repo
Framework
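
The data preparation described in the abstract above (tag every corpus with its domain, train on out-of-domain data, then fine tune on a mix of both) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the tag strings `<2in>` and `<2out>`, the choice to prepend the tag to the source sentence, the function names, and the toy sentence pairs are all assumptions made for the example.

```python
import random

def tag_corpus(pairs, domain_tag):
    """Prepend an artificial domain tag to every source sentence."""
    return [(f"{domain_tag} {src}", tgt) for src, tgt in pairs]

def build_mixed_corpus(in_domain, out_of_domain, seed=0):
    """Tag both corpora and shuffle them into one fine-tuning set."""
    # Tag strings are hypothetical; the abstract only says "artificial tags".
    mixed = tag_corpus(in_domain, "<2in>") + tag_corpus(out_of_domain, "<2out>")
    random.Random(seed).shuffle(mixed)
    return mixed

# Toy parallel data (invented for illustration only).
out_of_domain = [("the cat sat", "le chat était assis"), ("good morning", "bonjour")]
in_domain = [("take one tablet daily", "prendre un comprimé par jour")]

# Stage 1: train on the tagged out-of-domain corpus.
stage1_data = tag_corpus(out_of_domain, "<2out>")
# Stage 2: fine tune on the tagged mix of in-domain and out-of-domain corpora.
stage2_data = build_mixed_corpus(in_domain, out_of_domain)
```

Here `stage1_data` and `stage2_data` would be passed to whatever NMT training routine is in use; the training step itself is outside the scope of this sketch.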

Annotating Negation in Spanish Clinical Texts

Title Annotating Negation in Spanish Clinical Texts
Authors Noa Cruz, Roser Morante, Manuel J. Maña López, Jacinto Mata Vázquez, Carlos L. Parra Calderón
Abstract In this paper we present on-going work on annotating negation in Spanish clinical documents. A corpus of anamnesis and radiology reports has been annotated by two domain expert annotators with negation markers and negated events. The Dice coefficient for inter-annotator agreement is higher than 0.94 for negation markers and higher than 0.72 for negated events. The corpus will be publicly released when the annotation process is finished, constituting the first corpus annotated with negation for Spanish clinical reports available for the NLP community.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1808/
PDF https://www.aclweb.org/anthology/W17-1808
PWC https://paperswithcode.com/paper/annotating-negation-in-spanish-clinical-texts
Repo
Framework
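
The Dice coefficient reported in the abstract above measures the overlap between two annotators' sets of annotations: twice the number of shared items divided by the total number of items in both sets. A minimal sketch follows, assuming annotations are compared as exact-match character spans; the abstract does not state the matching criterion, and the span values below are invented for illustration.

```python
def dice_coefficient(a, b):
    """Dice = 2 * |A & B| / (|A| + |B|) for two collections of annotations."""
    a, b = set(a), set(b)
    if not a and not b:
        # Convention chosen here: two empty annotation sets agree perfectly.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical (start, end) spans of negation markers from two annotators.
annotator_1 = {(0, 2), (10, 13), (25, 28)}
annotator_2 = {(0, 2), (10, 13), (40, 45)}

# Two of three spans match on each side: 2 * 2 / (3 + 3).
agreement = dice_coefficient(annotator_1, annotator_2)
```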

Information Bottleneck Inspired Method For Chat Text Segmentation

Title Information Bottleneck Inspired Method For Chat Text Segmentation
Authors S Vishal, Mohit Yadav, Lovekesh Vig, Gautam Shroff
Abstract We present a novel technique for segmenting chat conversations using the information bottleneck method (Tishby et al., 2000), augmented with sequential continuity constraints. Furthermore, we utilize critical non-textual clues such as time between two consecutive posts and people mentions within the posts. To ascertain the effectiveness of the proposed method, we have collected data from public Slack conversations and Fresco, a proprietary platform deployed inside our organization. Experiments demonstrate that the proposed method yields an absolute (relative) improvement of as high as 3.23% (11.25%). To facilitate future research, we are releasing manual annotations for segmentation on public Slack conversations.
Tasks Representation Learning, Text Generation, Topic Models
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1020/
PDF https://www.aclweb.org/anthology/I17-1020
PWC https://paperswithcode.com/paper/information-bottleneck-inspired-method-for
Repo
Framework

Sequence-to-Dependency Neural Machine Translation

Title Sequence-to-Dependency Neural Machine Translation
Authors Shuangzhi Wu, Dongdong Zhang, Nan Yang, Mu Li, Ming Zhou
Abstract Nowadays a typical Neural Machine Translation (NMT) model generates translations from left to right as a linear sequence, during which latent syntactic structures of the target sentences are not explicitly considered. Inspired by the success of using syntactic knowledge of the target language for improving statistical machine translation, in this paper we propose a novel Sequence-to-Dependency Neural Machine Translation (SD-NMT) method, in which the target word sequence and its corresponding dependency structure are jointly constructed and modeled, and this structure is used as context to facilitate word generation. Experimental results show that the proposed method significantly outperforms state-of-the-art baselines on Chinese-English and Japanese-English translation tasks.
Tasks Machine Translation
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1065/
PDF https://www.aclweb.org/anthology/P17-1065
PWC https://paperswithcode.com/paper/sequence-to-dependency-neural-machine
Repo
Framework

Cross-language forced alignment to assist community-based linguistics for low resource languages

Title Cross-language forced alignment to assist community-based linguistics for low resource languages
Authors Timothy Kempton
Abstract
Tasks
Published 2017-03-01
URL https://www.aclweb.org/anthology/W17-0122/
PDF https://www.aclweb.org/anthology/W17-0122
PWC https://paperswithcode.com/paper/cross-language-forced-alignment-to-assist
Repo
Framework

SemEval-2017 Task 7: Detection and Interpretation of English Puns

Title SemEval-2017 Task 7: Detection and Interpretation of English Puns
Authors Tristan Miller, Christian Hempelmann, Iryna Gurevych
Abstract A pun is a form of wordplay in which a word suggests two or more meanings by exploiting polysemy, homonymy, or phonological similarity to another word, for an intended humorous or rhetorical effect. Though a recurrent and expected feature in many discourse types, puns stymie traditional approaches to computational lexical semantics because they violate their one-sense-per-context assumption. This paper describes the first competitive evaluation for the automatic detection, location, and interpretation of puns. We describe the motivation for these tasks, the evaluation methods, and the manually annotated data set. Finally, we present an overview and discussion of the participating systems' methodologies, resources, and results.
Tasks Word Sense Disambiguation
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2005/
PDF https://www.aclweb.org/anthology/S17-2005
PWC https://paperswithcode.com/paper/semeval-2017-task-7-detection-and
Repo
Framework

Thy Friend is My Friend: Iterative Collaborative Filtering for Sparse Matrix Estimation

Title Thy Friend is My Friend: Iterative Collaborative Filtering for Sparse Matrix Estimation
Authors Christian Borgs, Jennifer Chayes, Christina E. Lee, Devavrat Shah
Abstract The sparse matrix estimation problem consists of estimating the distribution of an $n\times n$ matrix $Y$, from a sparsely observed single instance of this matrix where the entries of $Y$ are independent random variables. This captures a wide array of problems; special instances include matrix completion in the context of recommendation systems, graphon estimation, and community detection in (mixed membership) stochastic block models. Inspired by classical collaborative filtering for recommendation systems, we propose a novel iterative, collaborative filtering-style algorithm for matrix estimation in this generic setting. We show that the mean squared error (MSE) of our estimator converges to $0$ at the rate of $O(d^2 (pn)^{-2/5})$ as long as $\omega(d^5 n)$ random entries from a total of $n^2$ entries of $Y$ are observed (uniformly sampled), $\mathbb{E}[Y]$ has rank $d$, and the entries of $Y$ have bounded support. The maximum squared error across all entries converges to $0$ with high probability as long as we observe a little more, $\Omega(d^5 n \ln^5(n))$ entries. Our results are the best known sample complexity results in this generality.
Tasks Community Detection, Graphon Estimation, Matrix Completion, Recommendation Systems
Published 2017-12-01
URL http://papers.nips.cc/paper/7057-thy-friend-is-my-friend-iterative-collaborative-filtering-for-sparse-matrix-estimation
PDF http://papers.nips.cc/paper/7057-thy-friend-is-my-friend-iterative-collaborative-filtering-for-sparse-matrix-estimation.pdf
PWC https://paperswithcode.com/paper/thy-friend-is-my-friend-iterative
Repo
Framework

Investigating Phrase-Based and Neural-Based Machine Translation on Low-Resource Settings

Title Investigating Phrase-Based and Neural-Based Machine Translation on Low-Resource Settings
Authors Hai Long Trieu, Duc-Vu Tran, Le Minh Nguyen
Abstract
Tasks Machine Translation
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1051/
PDF https://www.aclweb.org/anthology/Y17-1051
PWC https://paperswithcode.com/paper/investigating-phrase-based-and-neural-based
Repo
Framework

STCP: Simplified-Traditional Chinese Conversion and Proofreading

Title STCP: Simplified-Traditional Chinese Conversion and Proofreading
Authors Jiarui Xu, Xuezhe Ma, Chen-Tse Tsai, Eduard Hovy
Abstract This paper aims to provide an effective tool for conversion between Simplified Chinese and Traditional Chinese. We present STCP, a customizable system comprising a statistical conversion model and a proofreading web interface. Experiments show that our system achieves character-level conversion performance comparable with state-of-the-art systems. In addition, our proofreading interface can effectively support diagnostics and data annotation. STCP is available at http://lagos.lti.cs.cmu.edu:8002/
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-3016/
PDF https://www.aclweb.org/anthology/I17-3016
PWC https://paperswithcode.com/paper/stcp-simplified-traditional-chinese
Repo
Framework

Universal Dependencies for Greek

Title Universal Dependencies for Greek
Authors Prokopis Prokopidis, Haris Papageorgiou
Abstract
Tasks
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0413/
PDF https://www.aclweb.org/anthology/W17-0413
PWC https://paperswithcode.com/paper/universal-dependencies-for-greek
Repo
Framework

An Empirical Analysis of Edit Importance between Document Versions

Title An Empirical Analysis of Edit Importance between Document Versions
Authors Tanya Goyal, Sachin Kelkar, Manas Agarwal, Jeenu Grover
Abstract In this paper, we present a novel approach to infer the significance of various textual edits to documents. An author may make several edits to a document; each edit varies in its impact on the content of the document. While some edits are surface changes and introduce negligible change, other edits may change the content/tone of the document significantly. In this paper, we perform an analysis of human perceptions of edit importance while reviewing documents from one version to the next. We identify linguistic features that influence edit importance and model it in a regression-based setting. We show that the importance predicted by our approach is highly correlated with the human-perceived importance, established by a Mechanical Turk study.
Tasks Document Summarization
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1295/
PDF https://www.aclweb.org/anthology/D17-1295
PWC https://paperswithcode.com/paper/an-empirical-analysis-of-edit-importance
Repo
Framework

Counterfactual Learning from Bandit Feedback under Deterministic Logging : A Case Study in Statistical Machine Translation

Title Counterfactual Learning from Bandit Feedback under Deterministic Logging : A Case Study in Statistical Machine Translation
Authors Carolin Lawrence, Artem Sokolov, Stefan Riezler
Abstract The goal of counterfactual learning for statistical machine translation (SMT) is to optimize a target SMT system from logged data that consist of user feedback to translations that were predicted by another, historic SMT system. A challenge arises from the fact that risk-averse commercial SMT systems deterministically log the most probable translation. The lack of sufficient exploration of the SMT output space seemingly contradicts the theoretical requirements for counterfactual learning. We show that counterfactual learning from deterministic bandit logs is possible nevertheless by smoothing out deterministic components in learning. This can be achieved by additive and multiplicative control variates that avoid degenerate behavior in empirical risk minimization. Our simulation experiments show improvements of up to 2 BLEU points by counterfactual learning from deterministic bandit feedback.
Tasks Machine Translation, Structured Prediction
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1272/
PDF https://www.aclweb.org/anthology/D17-1272
PWC https://paperswithcode.com/paper/counterfactual-learning-from-bandit-feedback-1
Repo
Framework
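
To illustrate why a multiplicative control variate helps with deterministic logs, the sketch below contrasts an unnormalized reward estimate with a self-normalized one. Self-normalization is one standard form of multiplicative control variate; whether it matches the exact estimators used in the paper is an assumption, since the abstract only names the two families of control variates. The function names, rewards, and probabilities are invented for illustration.

```python
def unnormalized_estimate(rewards, probs):
    """Mean of reward times the target system's probability of the logged output.

    With deterministic logging the logging propensity is 1, so there is no
    inverse-propensity weight to divide by.
    """
    return sum(r * p for r, p in zip(rewards, probs)) / len(rewards)

def self_normalized_estimate(rewards, probs):
    """Same numerator, divided by the sum of probabilities (multiplicative control variate)."""
    return sum(r * p for r, p in zip(rewards, probs)) / sum(probs)

# Hypothetical logged feedback and target-system probabilities of the logged translations.
rewards = [0.2, 0.9, 0.5]
probs = [0.1, 0.3, 0.2]
doubled = [2 * p for p in probs]

# Raising every probability inflates the unnormalized estimate regardless of
# reward, which is the degenerate direction a learner could exploit.
inflated = unnormalized_estimate(rewards, doubled) > unnormalized_estimate(rewards, probs)
# The self-normalized estimate is unchanged by uniform scaling of the probabilities.
stable = abs(self_normalized_estimate(rewards, doubled)
             - self_normalized_estimate(rewards, probs)) < 1e-9
```

With non-negative rewards, the unnormalized objective can be increased simply by putting more probability on every logged output; the normalized form only rewards shifting probability toward outputs with higher feedback.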

MayoNLP at SemEval 2017 Task 10: Word Embedding Distance Pattern for Keyphrase Classification in Scientific Publications

Title MayoNLP at SemEval 2017 Task 10: Word Embedding Distance Pattern for Keyphrase Classification in Scientific Publications
Authors Sijia Liu, Feichen Shen, Vipin Chaudhary, Hongfang Liu
Abstract In this paper, we present MayoNLP's results from the participation in the ScienceIE shared task at SemEval 2017. We focused on the keyphrase classification task (Subtask B). We explored semantic similarities and patterns of keyphrases in scientific publications using pre-trained word embedding models. Word Embedding Distance Pattern, which uses the head noun word embedding to generate distance patterns based on labeled keyphrases, is proposed as an incremental feature set to enhance the conventional Named Entity Recognition feature sets. A support vector machine is used as the supervised classifier for keyphrase classification. Our system achieved an overall F1 score of 0.67 for keyphrase classification and 0.64 for keyphrase classification and relation detection.
Tasks Named Entity Recognition, Word Embeddings
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2166/
PDF https://www.aclweb.org/anthology/S17-2166
PWC https://paperswithcode.com/paper/mayonlp-at-semeval-2017-task-10-word
Repo
Framework