May 4, 2019

Paper Group NANR 210

Tractable Operations for Arithmetic Circuits of Probabilistic Models. The impact of simple feature engineering in multilingual medical NER. Climbing Mont BLEU: The Strange World of Reachable High-BLEU Translations. Sparse Support Recovery with Non-smooth Loss Functions. Agreement on Target-bidirectional Neural Machine Translation. Towards a Distrib …

Tractable Operations for Arithmetic Circuits of Probabilistic Models

Title Tractable Operations for Arithmetic Circuits of Probabilistic Models
Authors Yujia Shen, Arthur Choi, Adnan Darwiche
Abstract We consider tractable representations of probability distributions and the polytime operations they support. In particular, we consider a recently proposed arithmetic circuit representation, the Probabilistic Sentential Decision Diagram (PSDD). We show that PSDDs support a polytime multiplication operator, while they do not support a polytime operator for summing out variables. A polytime multiplication operator makes PSDDs suitable for a broader class of applications compared to arithmetic circuits, which do not in general support multiplication. As one example, we show that PSDD multiplication leads to a very simple but effective compilation algorithm for probabilistic graphical models: represent each model factor as a PSDD, and then multiply them.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6363-tractable-operations-for-arithmetic-circuits-of-probabilistic-models
PDF http://papers.nips.cc/paper/6363-tractable-operations-for-arithmetic-circuits-of-probabilistic-models.pdf
PWC https://paperswithcode.com/paper/tractable-operations-for-arithmetic-circuits
Repo
Framework
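
The "represent each factor, then multiply" recipe in the abstract can be illustrated with ordinary tabular factors. This is a minimal sketch only: it uses dense Python dictionaries over binary variables rather than PSDDs, and the name `multiply_factors` and the `(variables, table)` layout are assumptions made for illustration, not the authors' code. Dense tables grow exponentially with the number of variables, which is the cost a circuit representation is meant to avoid.

```python
from itertools import product


def multiply_factors(f, g):
    """Multiply two tabular factors over binary variables.

    Each factor is a pair (variables, table), where `variables` is a tuple
    of names and `table` maps an assignment tuple (one 0/1 value per
    variable, in order) to a non-negative number.
    This is a hypothetical dense-table stand-in, not a PSDD operation.
    """
    f_vars, f_table = f
    g_vars, g_table = g
    # Union of the variable lists, keeping first-seen order.
    out_vars = tuple(dict.fromkeys(f_vars + g_vars))
    out_table = {}
    for assignment in product((0, 1), repeat=len(out_vars)):
        value_of = dict(zip(out_vars, assignment))
        f_key = tuple(value_of[v] for v in f_vars)
        g_key = tuple(value_of[v] for v in g_vars)
        out_table[assignment] = f_table[f_key] * g_table[g_key]
    return out_vars, out_table


# Two factors of a tiny Bayesian network A -> B (illustrative numbers).
p_a = (("A",), {(0,): 0.6, (1,): 0.4})
p_b_given_a = (
    ("A", "B"),
    {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8},
)

joint_vars, joint_table = multiply_factors(p_a, p_b_given_a)
```

Multiplying every factor of a model this way yields its joint distribution; in the example, `joint_table` holds P(A, B) for the two-node network.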

The impact of simple feature engineering in multilingual medical NER

Title The impact of simple feature engineering in multilingual medical NER
Authors Rebecka Weegar, Arantza Casillas, Arantza Diaz de Ilarraza, Maite Oronoz, Alicia Pérez, Koldo Gojenola
Abstract The goal of this paper is to examine the impact of simple feature engineering mechanisms before applying more sophisticated techniques to the task of medical NER. Sometimes papers using scientifically sound techniques present raw baselines that could be improved by adding simple and cheap features. This work focuses on entity recognition for the clinical domain in three languages: English, Swedish and Spanish. The task is tackled using simple features, starting from the window size, capitalization, and prefixes, and moving to POS and semantic tags. This work demonstrates that a simple initial step of feature engineering can improve the baseline results significantly. Hence, the contributions of this paper are: first, a short list of guidelines well supported with experimental results on three languages and, second, a detailed description of the relevance of these features for medical NER.
Tasks Feature Engineering, Lemmatization, Named Entity Recognition, Relation Extraction
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4201/
PDF https://www.aclweb.org/anthology/W16-4201
PWC https://paperswithcode.com/paper/the-impact-of-simple-feature-engineering-in
Repo
Framework
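
The simple features named in the abstract (context window, capitalization, prefixes) might be extracted per token roughly as follows. This is a hedged sketch: the function name `token_features`, the feature keys, the `<PAD>` marker, and the default window and prefix lengths are all assumptions for illustration, not the feature set used in the paper.

```python
def token_features(tokens, i, window=1, prefix_len=3):
    """Build a feature dict for tokens[i] from its surrounding window.

    Hypothetical helper: the keys and defaults are illustrative only.
    """
    feats = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            tok = tokens[j]
            feats[f"word[{offset}]"] = tok.lower()
            feats[f"is_capitalized[{offset}]"] = tok[:1].isupper()
        else:
            # Positions outside the sentence get a padding marker.
            feats[f"word[{offset}]"] = "<PAD>"
    # Prefix of the current token only.
    feats["prefix"] = tokens[i][:prefix_len].lower()
    return feats


sentence = ["Patient", "has", "Diabetes"]
feats = token_features(sentence, 2)
```

Dictionaries of this shape can be fed to most sequence labelers that accept sparse features; POS and semantic tags would be added as further keys.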

Climbing Mont BLEU: The Strange World of Reachable High-BLEU Translations

Title Climbing Mont BLEU: The Strange World of Reachable High-BLEU Translations
Authors Aaron Smith, Christian Hardmeier, Joerg Tiedemann
Abstract
Tasks Machine Translation
Published 2016-01-01
URL https://www.aclweb.org/anthology/W16-3414/
PDF https://www.aclweb.org/anthology/W16-3414
PWC https://paperswithcode.com/paper/climbing-mont-bleu-the-strange-world-of
Repo
Framework

Sparse Support Recovery with Non-smooth Loss Functions

Title Sparse Support Recovery with Non-smooth Loss Functions
Authors Kévin Degraux, Gabriel Peyré, Jalal Fadili, Laurent Jacques
Abstract In this paper, we study the support recovery guarantees of underdetermined sparse regression using the $\ell_1$-norm as a regularizer and a non-smooth loss function for data fidelity. More precisely, we focus in detail on the cases of $\ell_1$ and $\ell_\infty$ losses, and contrast them with the usual $\ell_2$ loss. While these losses are routinely used to account for either sparse ($\ell_1$ loss) or uniform ($\ell_\infty$ loss) noise models, a theoretical analysis of their performance is still lacking. In this article, we extend the existing theory from the smooth $\ell_2$ case to these non-smooth cases. We derive a sharp condition which ensures that the support of the vector to recover is stable to small additive noise in the observations, as long as the loss constraint size is tuned proportionally to the noise level. A distinctive feature of our theory is that it also explains what happens when the support is unstable. While the support is not stable anymore, we identify an “extended support” and show that this extended support is stable to small additive noise. To exemplify the usefulness of our theory, we give a detailed numerical analysis of the support stability/instability of compressed sensing recovery with these different losses. This highlights different parameter regimes, ranging from total support stability to progressively increasing support instability.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6559-sparse-support-recovery-with-non-smooth-loss-functions
PDF http://papers.nips.cc/paper/6559-sparse-support-recovery-with-non-smooth-loss-functions.pdf
PWC https://paperswithcode.com/paper/sparse-support-recovery-with-non-smooth-loss
Repo
Framework
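
One way to write down the problem the abstract describes, with an $\ell_1$ regularizer and a constraint on the loss, is sketched below. The symbols $\Phi$, $y$, $x_0$, $w$, $\tau$ and $\alpha$ are notation assumed here for illustration and may differ from the paper's.

```latex
% Observations: an underdetermined linear model with additive noise w.
y = \Phi x_0 + w, \qquad \Phi \in \mathbb{R}^{m \times n}, \quad m < n

% Recovery: l1-regularized regression with an l_alpha loss constraint,
% where alpha = 1 and alpha = infinity are the non-smooth cases
% and alpha = 2 is the usual smooth case.
\min_{x \in \mathbb{R}^n} \; \|x\|_1
\quad \text{subject to} \quad
\|\Phi x - y\|_\alpha \le \tau,
\qquad \alpha \in \{1, 2, \infty\}
```

In this reading, "tuning the loss constraint size proportionally to the noise level" corresponds to choosing $\tau$ proportional to $\|w\|_\alpha$.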

Agreement on Target-bidirectional Neural Machine Translation

Title Agreement on Target-bidirectional Neural Machine Translation
Authors Lemao Liu, Masao Utiyama, Andrew Finch, Eiichiro Sumita
Abstract
Tasks Machine Translation, Structured Prediction, Transliteration
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-1046/
PDF https://www.aclweb.org/anthology/N16-1046
PWC https://paperswithcode.com/paper/agreement-on-target-bidirectional-neural
Repo
Framework

Towards a Distributional Model of Semantic Complexity

Title Towards a Distributional Model of Semantic Complexity
Authors Emmanuele Chersoni, Philippe Blache, Alessandro Lenci
Abstract In this paper, we introduce for the first time a Distributional Model for computing semantic complexity, inspired by the general principles of the Memory, Unification and Control framework (Hagoort, 2013; Hagoort, 2016). We argue that sentence comprehension is an incremental process driven by the goal of constructing a coherent representation of the event represented by the sentence. The composition cost of a sentence depends on the semantic coherence of the event being constructed and on the activation degree of the linguistic constructions. We also report the results of a first evaluation of the model on the Bicknell dataset (Bicknell et al., 2010).
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4102/
PDF https://www.aclweb.org/anthology/W16-4102
PWC https://paperswithcode.com/paper/towards-a-distributional-model-of-semantic
Repo
Framework

EmpiriST: AIPHES - Robust Tokenization and POS-Tagging for Different Genres

Title EmpiriST: AIPHES - Robust Tokenization and POS-Tagging for Different Genres
Authors Steffen Remus, Gerold Hintz, Chris Biemann, Christian M. Meyer, Darina Benikova, Judith Eckle-Kohler, Margot Mieskes, Thomas Arnold
Abstract
Tasks Machine Translation, Part-Of-Speech Tagging, Tokenization
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2613/
PDF https://www.aclweb.org/anthology/W16-2613
PWC https://paperswithcode.com/paper/empirist-aiphes-robust-tokenization-and-pos
Repo
Framework

SoMaJo: State-of-the-art tokenization for German web and social media texts

Title SoMaJo: State-of-the-art tokenization for German web and social media texts
Authors Thomas Proisl, Peter Uhrig
Abstract
Tasks Lemmatization, Tokenization
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2607/
PDF https://www.aclweb.org/anthology/W16-2607
PWC https://paperswithcode.com/paper/somajo-state-of-the-art-tokenization-for
Repo
Framework

LTL-UDE @ EmpiriST 2015: Tokenization and PoS Tagging of Social Media Text

Title LTL-UDE @ EmpiriST 2015: Tokenization and PoS Tagging of Social Media Text
Authors Tobias Horsmann, Torsten Zesch
Abstract
Tasks Boundary Detection, Part-Of-Speech Tagging, Tokenization
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2615/
PDF https://www.aclweb.org/anthology/W16-2615
PWC https://paperswithcode.com/paper/ltl-ude-empirist-2015-tokenization-and-pos
Repo
Framework

Deep Learning Architecture for Patient Data De-identification in Clinical Records

Title Deep Learning Architecture for Patient Data De-identification in Clinical Records
Authors Shweta Yadav, Asif Ekbal, Sriparna Saha, Pushpak Bhattacharyya
Abstract Rapid growth in Electronic Medical Records (EMR) has led to an expansion of data in the clinical domain. The majority of the available health care information is sealed in the form of narrative documents, which form a rich source of clinical information. Text mining of such clinical records has gained huge attention in various medical applications like treatment and decision making. However, medical records contain patient Private Health Information (PHI), which can reveal the identities of the patients. In order to preserve the privacy of patients, it is mandatory to remove all PHI prior to making the records publicly available. The aim is to de-identify or encrypt the PHI in the patient medical records. In this paper, we propose an algorithm based on a deep learning architecture to solve this problem. We perform de-identification of seven PHI terms from the clinical records. Experiments on benchmark datasets show that our proposed approach achieves encouraging performance, which is better than the baseline model developed with Conditional Random Field.
Tasks Decision Making, Named Entity Recognition
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4206/
PDF https://www.aclweb.org/anthology/W16-4206
PWC https://paperswithcode.com/paper/deep-learning-architecture-for-patient-data
Repo
Framework

Detecting Japanese Patients with Alzheimer’s Disease based on Word Category Frequencies

Title Detecting Japanese Patients with Alzheimer’s Disease based on Word Category Frequencies
Authors Daisaku Shibata, Shoko Wakamiya, Ayae Kinoshita, Eiji Aramaki
Abstract In recent years, detecting Alzheimer's disease (AD) in early stages based on natural language processing (NLP) has drawn much attention. To date, vocabulary size, grammatical complexity, and fluency have been studied using NLP metrics. However, the content analysis of AD narratives is still out of reach for NLP. This study investigates features of the words that AD patients use in their spoken language. Eighteen examinees aged 53–90 years (mean: 76.89) were recruited and divided into two groups based on MMSE scores. The AD group comprised 9 examinees with scores of 21 or lower. The healthy control group comprised 9 examinees with a score of 22 or higher. Linguistic Inquiry and Word Count (LIWC) classified words were used to categorize the words that the examinees used. The word frequency was found from observation. Significant differences were confirmed for the usage of impersonal pronouns in the AD group. This result demonstrated the basic feasibility of the proposed NLP-based detection approach.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4211/
PDF https://www.aclweb.org/anthology/W16-4211
PWC https://paperswithcode.com/paper/detecting-japanese-patients-with-alzheimeras
Repo
Framework

SPA: Web-based Platform for easy Access to Speech Processing Modules

Title SPA: Web-based Platform for easy Access to Speech Processing Modules
Authors Fernando Batista, Pedro Curto, Isabel Trancoso, Alberto Abad, Jaime Ferreira, Eugénio Ribeiro, Helena Moniz, David Martins de Matos, Ricardo Ribeiro
Abstract This paper presents SPA, a web-based Speech Analytics platform that integrates several speech processing modules and that makes it possible to use them through the web. It was developed with the aim of facilitating the usage of the modules, without the need to know about software dependencies and specific configurations. Apart from being accessed by a web browser, the platform also provides a REST API for easy integration with other applications. The platform is flexible, scalable, provides authentication for access restrictions, and was developed taking into consideration the time and effort of providing new services. The platform is still being improved, but it already integrates a considerable number of audio and text processing modules, including: automatic transcription, speech disfluency classification, emotion detection, dialog act recognition, age and gender classification, non-nativeness detection, hyper-articulation detection, and two external modules for feature extraction and DTMF detection. This paper describes the SPA architecture, presents the already integrated modules, and provides a detailed description for the ones most recently integrated.
Tasks Age And Gender Classification
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1615/
PDF https://www.aclweb.org/anthology/L16-1615
PWC https://paperswithcode.com/paper/spa-web-based-platform-for-easy-access-to
Repo
Framework

Automated Anonymization as Spelling Variant Detection

Title Automated Anonymization as Spelling Variant Detection
Authors Steven Kester Yuwono, Hwee Tou Ng, Kee Yuan Ngiam
Abstract The issue of privacy has always been a concern when clinical texts are used for research purposes. Personal health information (PHI) (such as name and identification number) needs to be removed so that patients cannot be identified. Manual anonymization is not feasible due to the large number of clinical texts to be anonymized. In this paper, we tackle the task of anonymizing clinical texts written in sentence fragments and which frequently contain symbols, abbreviations, and misspelled words. Our clinical texts therefore differ from those in the i2b2 shared tasks which are in prose form with complete sentences. Our clinical texts are also part of a structured database which contains patient name and identification number in structured fields. As such, we formulate our anonymization task as spelling variant detection, exploiting patients' personal information in the structured fields to detect their spelling variants in clinical texts. We successfully anonymized clinical texts consisting of more than 200 million words, using minimum edit distance and regular expression patterns.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4214/
PDF https://www.aclweb.org/anthology/W16-4214
PWC https://paperswithcode.com/paper/automated-anonymization-as-spelling-variant
Repo
Framework
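
The idea of treating anonymization as spelling variant detection could look roughly like this: take the name stored in the structured field, then mask any word in the free text within a small edit distance of it. This is a minimal sketch under stated assumptions: the helper names `edit_distance` and `mask_variants`, the threshold `max_dist=1`, the `[NAME]` placeholder, and the sample note are hypothetical, and the paper's actual patterns and thresholds are not given in the abstract.

```python
import re


def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mask_variants(text, name_parts, max_dist=1, placeholder="[NAME]"):
    """Replace words within max_dist edits of any known name part.

    Hypothetical helper: threshold and placeholder are illustrative.
    """
    def replace(match):
        token = match.group(0)
        for part in name_parts:
            if edit_distance(token.lower(), part.lower()) <= max_dist:
                return placeholder
        return token

    # A simple regular expression that picks out alphabetic words.
    return re.sub(r"[A-Za-z]+", replace, text)


# Name parts as they might appear in a structured field (made-up example).
masked = mask_variants("pt Jon Tan c/o chest pain", ["John", "Tan"])
```

A fixed threshold like this over-matches short names (for example, "can" is one edit from "Tan"), so a practical system would likely scale the threshold with name length or add further checks.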

The Effect of Gender and Age Differences on the Recognition of Emotions from Facial Expressions

Title The Effect of Gender and Age Differences on the Recognition of Emotions from Facial Expressions
Authors Daniela Schneevogt, Patrizia Paggio
Abstract Recent studies have demonstrated gender and cultural differences in the recognition of emotions in facial expressions. However, most studies were conducted on American subjects. In this paper, we explore the generalizability of several findings to a non-American culture in the form of Danish subjects. We conduct an emotion recognition task followed by two stereotype questionnaires with different genders and age groups. While recent findings (Krems et al., 2015) suggest that women are biased to see anger in neutral facial expressions posed by females, in our sample both genders assign higher ratings of anger to all emotions expressed by females. Furthermore, we demonstrate an effect of gender on the fear-surprise-confusion observed by Tomkins and McCarter (1964); females overpredict fear, while males overpredict surprise.
Tasks Emotion Recognition, Sentiment Analysis
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4302/
PDF https://www.aclweb.org/anthology/W16-4302
PWC https://paperswithcode.com/paper/the-effect-of-gender-and-age-differences-on
Repo
Framework

A Recurrent and Compositional Model for Personality Trait Recognition from Short Texts

Title A Recurrent and Compositional Model for Personality Trait Recognition from Short Texts
Authors Fei Liu, Julien Perez, Scott Nowson
Abstract Many methods have been used to recognise author personality traits from text, typically combining linguistic feature engineering with shallow learning models, e.g. linear regression or Support Vector Machines. This work uses deep-learning-based models and atomic features of text, the characters, to build hierarchical, vectorial word and sentence representations for trait inference. This method, applied to a corpus of tweets, shows state-of-the-art performance across five traits compared with prior work. The results, supported by preliminary visualisation work, are encouraging for the ability to detect complex human traits.
Tasks Feature Engineering, Part-Of-Speech Tagging, Personality Trait Recognition, Sentiment Analysis
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4303/
PDF https://www.aclweb.org/anthology/W16-4303
PWC https://paperswithcode.com/paper/a-recurrent-and-compositional-model-for
Repo
Framework