Paper Group NANR 210
Tractable Operations for Arithmetic Circuits of Probabilistic Models
Title | Tractable Operations for Arithmetic Circuits of Probabilistic Models |
Authors | Yujia Shen, Arthur Choi, Adnan Darwiche |
Abstract | We consider tractable representations of probability distributions and the polytime operations they support. In particular, we consider a recently proposed arithmetic circuit representation, the Probabilistic Sentential Decision Diagram (PSDD). We show that PSDDs support a polytime multiplication operator, while they do not support a polytime operator for summing out variables. A polytime multiplication operator makes PSDDs suitable for a broader class of applications compared to arithmetic circuits, which do not in general support multiplication. As one example, we show that PSDD multiplication leads to a very simple but effective compilation algorithm for probabilistic graphical models: represent each model factor as a PSDD, and then multiply them. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6363-tractable-operations-for-arithmetic-circuits-of-probabilistic-models |
http://papers.nips.cc/paper/6363-tractable-operations-for-arithmetic-circuits-of-probabilistic-models.pdf | |
PWC | https://paperswithcode.com/paper/tractable-operations-for-arithmetic-circuits |
Repo | |
Framework | |
The impact of simple feature engineering in multilingual medical NER
Title | The impact of simple feature engineering in multilingual medical NER |
Authors | Rebecka Weegar, Arantza Casillas, Arantza Diaz de Ilarraza, Maite Oronoz, Alicia Pérez, Koldo Gojenola |
Abstract | The goal of this paper is to examine the impact of simple feature engineering mechanisms before applying more sophisticated techniques to the task of medical NER. Sometimes papers using scientifically sound techniques present raw baselines that could be improved by adding simple and cheap features. This work focuses on entity recognition for the clinical domain for three languages: English, Swedish and Spanish. The task is tackled using simple features, starting from the window size, capitalization, prefixes, and moving to POS and semantic tags. This work demonstrates that a simple initial step of feature engineering can improve the baseline results significantly. Hence, the contributions of this paper are: first, a short list of guidelines well supported with experimental results on three languages and, second, a detailed description of the relevance of these features for medical NER. |
Tasks | Feature Engineering, Lemmatization, Named Entity Recognition, Relation Extraction |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4201/ |
https://www.aclweb.org/anthology/W16-4201 | |
PWC | https://paperswithcode.com/paper/the-impact-of-simple-feature-engineering-in |
Repo | |
Framework | |
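The surface features named in the abstract above (context window, capitalization, prefixes) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `token_features`, the feature keys, the `<PAD>` marker, and the default window and prefix sizes are all assumptions chosen for illustration, and POS and semantic tags are left out because they need external taggers.

```python
def token_features(tokens, i, window=2, prefix_len=3):
    """Build a feature dict for tokens[i] from simple surface cues.

    All names and defaults here are illustrative assumptions,
    not taken from the paper.
    """
    word = tokens[i]
    feats = {
        "word.lower": word.lower(),
        # Capitalization: does the token start with an uppercase letter?
        "is_capitalized": word[:1].isupper(),
        # Prefix: the first prefix_len characters, lowercased.
        "prefix": word[:prefix_len].lower(),
    }
    # Context window: neighbouring tokens on both sides, padded at the edges.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        in_range = 0 <= j < len(tokens)
        feats[f"word[{offset:+d}]"] = tokens[j].lower() if in_range else "<PAD>"
    return feats


sentence = ["Patient", "received", "Ibuprofen", "daily"]
features = token_features(sentence, 2, window=1)
```

Feature dicts of this shape are what sequence labellers such as CRF toolkits typically consume, one dict per token.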
Climbing Mont BLEU: The Strange World of Reachable High-BLEU Translations
Title | Climbing Mont BLEU: The Strange World of Reachable High-BLEU Translations |
Authors | Aaron Smith, Christian Hardmeier, Joerg Tiedemann |
Abstract | |
Tasks | Machine Translation |
Published | 2016-01-01 |
URL | https://www.aclweb.org/anthology/W16-3414/ |
https://www.aclweb.org/anthology/W16-3414 | |
PWC | https://paperswithcode.com/paper/climbing-mont-bleu-the-strange-world-of |
Repo | |
Framework | |
Sparse Support Recovery with Non-smooth Loss Functions
Title | Sparse Support Recovery with Non-smooth Loss Functions |
Authors | Kévin Degraux, Gabriel Peyré, Jalal Fadili, Laurent Jacques |
Abstract | In this paper, we study the support recovery guarantees of underdetermined sparse regression using the $\ell_1$-norm as a regularizer and a non-smooth loss function for data fidelity. More precisely, we focus in detail on the cases of $\ell_1$ and $\ell_\infty$ losses, and contrast them with the usual $\ell_2$ loss. While these losses are routinely used to account for either sparse ($\ell_1$ loss) or uniform ($\ell_\infty$ loss) noise models, a theoretical analysis of their performance is still lacking. In this article, we extend the existing theory from the smooth $\ell_2$ case to these non-smooth cases. We derive a sharp condition which ensures that the support of the vector to recover is stable to small additive noise in the observations, as long as the loss constraint size is tuned proportionally to the noise level. A distinctive feature of our theory is that it also explains what happens when the support is unstable. While the support is not stable anymore, we identify an “extended support” and show that this extended support is stable to small additive noise. To exemplify the usefulness of our theory, we give a detailed numerical analysis of the support stability/instability of compressed sensing recovery with these different losses. This highlights different parameter regimes, ranging from total support stability to progressively increasing support instability. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6559-sparse-support-recovery-with-non-smooth-loss-functions |
http://papers.nips.cc/paper/6559-sparse-support-recovery-with-non-smooth-loss-functions.pdf | |
PWC | https://paperswithcode.com/paper/sparse-support-recovery-with-non-smooth-loss |
Repo | |
Framework | |
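One way to write down the setting described in the abstract above is as a constrained $\ell_1$ minimization. The notation below is an assumption, since the abstract does not fix symbols: $\Phi$ is the underdetermined measurement matrix, $x_0$ the sparse vector to recover, $w$ the additive noise, $\alpha$ the choice of loss, and $\tau$ the size of the loss constraint.

```latex
% Observations: sparse vector x_0 seen through \Phi with additive noise w
y = \Phi x_0 + w

% \ell_1-regularized recovery with an \ell_\alpha data-fidelity constraint,
% where \alpha \in \{1, 2, \infty\} selects the loss
\min_{x} \; \|x\|_1
\quad \text{subject to} \quad
\|\Phi x - y\|_\alpha \le \tau

% Constraint size tuned proportionally to the noise level
\tau \propto \|w\|_\alpha
```

Here $\alpha = 2$ corresponds to the usual smooth case, while $\alpha = 1$ and $\alpha = \infty$ are the non-smooth losses the abstract associates with sparse and uniform noise.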
Agreement on Target-bidirectional Neural Machine Translation
Title | Agreement on Target-bidirectional Neural Machine Translation |
Authors | Lemao Liu, Masao Utiyama, Andrew Finch, Eiichiro Sumita |
Abstract | |
Tasks | Machine Translation, Structured Prediction, Transliteration |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1046/ |
https://www.aclweb.org/anthology/N16-1046 | |
PWC | https://paperswithcode.com/paper/agreement-on-target-bidirectional-neural |
Repo | |
Framework | |
Towards a Distributional Model of Semantic Complexity
Title | Towards a Distributional Model of Semantic Complexity |
Authors | Emmanuele Chersoni, Philippe Blache, Alessandro Lenci |
Abstract | In this paper, we introduce for the first time a Distributional Model for computing semantic complexity, inspired by the general principles of the Memory, Unification and Control framework (Hagoort, 2013; Hagoort, 2016). We argue that sentence comprehension is an incremental process driven by the goal of constructing a coherent representation of the event represented by the sentence. The composition cost of a sentence depends on the semantic coherence of the event being constructed and on the activation degree of the linguistic constructions. We also report the results of a first evaluation of the model on the Bicknell dataset (Bicknell et al., 2010). |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4102/ |
https://www.aclweb.org/anthology/W16-4102 | |
PWC | https://paperswithcode.com/paper/towards-a-distributional-model-of-semantic |
Repo | |
Framework | |
EmpiriST: AIPHES - Robust Tokenization and POS-Tagging for Different Genres
Title | EmpiriST: AIPHES - Robust Tokenization and POS-Tagging for Different Genres |
Authors | Steffen Remus, Gerold Hintz, Chris Biemann, Christian M. Meyer, Darina Benikova, Judith Eckle-Kohler, Margot Mieskes, Thomas Arnold |
Abstract | |
Tasks | Machine Translation, Part-Of-Speech Tagging, Tokenization |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2613/ |
https://www.aclweb.org/anthology/W16-2613 | |
PWC | https://paperswithcode.com/paper/empirist-aiphes-robust-tokenization-and-pos |
Repo | |
Framework | |
SoMaJo: State-of-the-art tokenization for German web and social media texts
Title | SoMaJo: State-of-the-art tokenization for German web and social media texts |
Authors | Thomas Proisl, Peter Uhrig |
Abstract | |
Tasks | Lemmatization, Tokenization |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2607/ |
https://www.aclweb.org/anthology/W16-2607 | |
PWC | https://paperswithcode.com/paper/somajo-state-of-the-art-tokenization-for |
Repo | |
Framework | |
LTL-UDE @ EmpiriST 2015: Tokenization and PoS Tagging of Social Media Text
Title | LTL-UDE @ EmpiriST 2015: Tokenization and PoS Tagging of Social Media Text |
Authors | Tobias Horsmann, Torsten Zesch |
Abstract | |
Tasks | Boundary Detection, Part-Of-Speech Tagging, Tokenization |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2615/ |
https://www.aclweb.org/anthology/W16-2615 | |
PWC | https://paperswithcode.com/paper/ltl-ude-empirist-2015-tokenization-and-pos |
Repo | |
Framework | |
Deep Learning Architecture for Patient Data De-identification in Clinical Records
Title | Deep Learning Architecture for Patient Data De-identification in Clinical Records |
Authors | Shweta Yadav, Asif Ekbal, Sriparna Saha, Pushpak Bhattacharyya |
Abstract | Rapid growth in Electronic Medical Records (EMR) has led to an expansion of data in the clinical domain. The majority of the available health care information is sealed in the form of narrative documents, which form a rich source of clinical information. Text mining of such clinical records has gained huge attention in various medical applications like treatment and decision making. However, medical records enclose patient Private Health Information (PHI) which can reveal the identities of the patients. In order to retain the privacy of patients, it is mandatory to remove all the PHI prior to making the records publicly available. The aim is to de-identify or encrypt the PHI from the patient medical records. In this paper, we propose an algorithm based on deep learning architecture to solve this problem. We perform de-identification of seven PHI terms from the clinical records. Experiments on benchmark datasets show that our proposed approach achieves encouraging performance, which is better than the baseline model developed with Conditional Random Field. |
Tasks | Decision Making, Named Entity Recognition |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4206/ |
https://www.aclweb.org/anthology/W16-4206 | |
PWC | https://paperswithcode.com/paper/deep-learning-architecture-for-patient-data |
Repo | |
Framework | |
Detecting Japanese Patients with Alzheimer’s Disease based on Word Category Frequencies
Title | Detecting Japanese Patients with Alzheimer’s Disease based on Word Category Frequencies |
Authors | Daisaku Shibata, Shoko Wakamiya, Ayae Kinoshita, Eiji Aramaki |
Abstract | In recent years, detecting Alzheimer’s disease (AD) in early stages based on natural language processing (NLP) has drawn much attention. To date, vocabulary size, grammatical complexity, and fluency have been studied using NLP metrics. However, the content analysis of AD narratives is still unreachable for NLP. This study investigates features of the words that AD patients use in their spoken language. Eighteen examinees aged 53–90 years (mean: 76.89) were recruited and divided into two groups based on MMSE scores. The AD group comprised 9 examinees with scores of 21 or lower. The healthy control group comprised 9 examinees with a score of 22 or higher. Linguistic Inquiry and Word Count (LIWC) classified words were used to categorize the words that the examinees used. The word frequency was found from observation. Significant differences were confirmed for the usage of impersonal pronouns in the AD group. This result demonstrated the basic feasibility of the proposed NLP-based detection approach. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4211/ |
https://www.aclweb.org/anthology/W16-4211 | |
PWC | https://paperswithcode.com/paper/detecting-japanese-patients-with-alzheimeras |
Repo | |
Framework | |
SPA: Web-based Platform for easy Access to Speech Processing Modules
Title | SPA: Web-based Platform for easy Access to Speech Processing Modules |
Authors | Fernando Batista, Pedro Curto, Isabel Trancoso, Alberto Abad, Jaime Ferreira, Eugénio Ribeiro, Helena Moniz, David Martins de Matos, Ricardo Ribeiro |
Abstract | This paper presents SPA, a web-based Speech Analytics platform that integrates several speech processing modules and that makes it possible to use them through the web. It was developed with the aim of facilitating the usage of the modules, without the need to know about software dependencies and specific configurations. Apart from being accessed by a web-browser, the platform also provides a REST API for easy integration with other applications. The platform is flexible, scalable, provides authentication for access restrictions, and was developed taking into consideration the time and effort of providing new services. The platform is still being improved, but it already integrates a considerable number of audio and text processing modules, including: automatic transcription, speech disfluency classification, emotion detection, dialog act recognition, age and gender classification, non-nativeness detection, hyper-articulation detection, and two external modules for feature extraction and DTMF detection. This paper describes the SPA architecture, presents the already integrated modules, and provides a detailed description for the ones most recently integrated. |
Tasks | Age And Gender Classification |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1615/ |
https://www.aclweb.org/anthology/L16-1615 | |
PWC | https://paperswithcode.com/paper/spa-web-based-platform-for-easy-access-to |
Repo | |
Framework | |
Automated Anonymization as Spelling Variant Detection
Title | Automated Anonymization as Spelling Variant Detection |
Authors | Steven Kester Yuwono, Hwee Tou Ng, Kee Yuan Ngiam |
Abstract | The issue of privacy has always been a concern when clinical texts are used for research purposes. Personal health information (PHI) (such as name and identification number) needs to be removed so that patients cannot be identified. Manual anonymization is not feasible due to the large number of clinical texts to be anonymized. In this paper, we tackle the task of anonymizing clinical texts written in sentence fragments and which frequently contain symbols, abbreviations, and misspelled words. Our clinical texts therefore differ from those in the i2b2 shared tasks which are in prose form with complete sentences. Our clinical texts are also part of a structured database which contains patient name and identification number in structured fields. As such, we formulate our anonymization task as spelling variant detection, exploiting patients' personal information in the structured fields to detect their spelling variants in clinical texts. We successfully anonymized clinical texts consisting of more than 200 million words, using minimum edit distance and regular expression patterns. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4214/ |
https://www.aclweb.org/anthology/W16-4214 | |
PWC | https://paperswithcode.com/paper/automated-anonymization-as-spelling-variant |
Repo | |
Framework | |
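The idea in the abstract above, matching a patient name from a structured field against its spelling variants in free text by minimum edit distance, can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' system: the function names, the `[NAME]` placeholder, the distance threshold of 1, and the alphabetic-token pattern are all hypothetical choices.

```python
import re


def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


def mask_name_variants(text, name, max_dist=1, placeholder="[NAME]"):
    """Replace alphabetic tokens within max_dist edits of name.

    name would come from a structured database field; the threshold
    and placeholder are illustrative assumptions.
    """
    target = name.lower()

    def replace(match):
        token = match.group(0)
        if edit_distance(token.lower(), target) <= max_dist:
            return placeholder
        return token

    # A simple regular expression picks out candidate alphabetic tokens.
    return re.sub(r"[A-Za-z]+", replace, text)


note = "pt Jon c/o chest pain"
masked = mask_name_variants(note, "John")
```

A fixed threshold is a blunt instrument: on short names it can mask unrelated words, so a real system would likely scale the threshold with name length or add further checks.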
The Effect of Gender and Age Differences on the Recognition of Emotions from Facial Expressions
Title | The Effect of Gender and Age Differences on the Recognition of Emotions from Facial Expressions |
Authors | Daniela Schneevogt, Patrizia Paggio |
Abstract | Recent studies have demonstrated gender and cultural differences in the recognition of emotions in facial expressions. However, most studies were conducted on American subjects. In this paper, we explore the generalizability of several findings to a non-American culture in the form of Danish subjects. We conduct an emotion recognition task followed by two stereotype questionnaires with different genders and age groups. While recent findings (Krems et al., 2015) suggest that women are biased to see anger in neutral facial expressions posed by females, in our sample both genders assign higher ratings of anger to all emotions expressed by females. Furthermore, we demonstrate an effect of gender on the fear-surprise-confusion observed by Tomkins and McCarter (1964); females overpredict fear, while males overpredict surprise. |
Tasks | Emotion Recognition, Sentiment Analysis |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4302/ |
https://www.aclweb.org/anthology/W16-4302 | |
PWC | https://paperswithcode.com/paper/the-effect-of-gender-and-age-differences-on |
Repo | |
Framework | |
A Recurrent and Compositional Model for Personality Trait Recognition from Short Texts
Title | A Recurrent and Compositional Model for Personality Trait Recognition from Short Texts |
Authors | Fei Liu, Julien Perez, Scott Nowson |
Abstract | Many methods have been used to recognise author personality traits from text, typically combining linguistic feature engineering with shallow learning models, e.g. linear regression or Support Vector Machines. This work uses deep-learning-based models and atomic features of text, the characters, to build hierarchical, vectorial word and sentence representations for trait inference. This method, applied to a corpus of tweets, shows state-of-the-art performance across five traits compared with prior work. The results, supported by preliminary visualisation work, are encouraging for the ability to detect complex human traits. |
Tasks | Feature Engineering, Part-Of-Speech Tagging, Personality Trait Recognition, Sentiment Analysis |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4303/ |
https://www.aclweb.org/anthology/W16-4303 | |
PWC | https://paperswithcode.com/paper/a-recurrent-and-compositional-model-for |
Repo | |
Framework | |