July 26, 2019


Paper Group NANR 180


A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing

Title A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing
Authors Daniel Fernández-González, Carlos Gómez-Rodríguez
Abstract Restricted non-monotonicity has been shown beneficial for the projective arc-eager dependency parser in previous research, as posterior decisions can repair mistakes made in previous states due to the lack of information. In this paper, we propose a novel, fully non-monotonic transition system based on the non-projective Covington algorithm. As a non-monotonic system requires exploration of erroneous actions during the training process, we develop several non-monotonic variants of the recently defined dynamic oracle for the Covington parser, based on tight approximations of the loss. Experiments on datasets from the CoNLL-X and CoNLL-XI shared tasks show that a non-monotonic dynamic oracle outperforms the monotonic version in the majority of languages.
Tasks
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1027/
PDF https://www.aclweb.org/anthology/P17-1027
PWC https://paperswithcode.com/paper/a-full-non-monotonic-transition-system-for-1
Repo
Framework

An analysis of eye-movements during reading for the detection of mild cognitive impairment

Title An analysis of eye-movements during reading for the detection of mild cognitive impairment
Authors Kathleen C. Fraser, Kristina Lundholm Fors, Dimitrios Kokkinakis, Arto Nordlund
Abstract We present a machine learning analysis of eye-tracking data for the detection of mild cognitive impairment, a decline in cognitive abilities that is associated with an increased risk of developing dementia. We compare two experimental configurations (reading aloud versus reading silently), as well as two methods of combining information from the two trials (concatenation and merging). Additionally, we annotate the words being read with information about their frequency and syntactic category, and use these annotations to generate new features. Ultimately, we are able to distinguish between participants with and without cognitive impairment with up to 86% accuracy.
Tasks Eye Tracking
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1107/
PDF https://www.aclweb.org/anthology/D17-1107
PWC https://paperswithcode.com/paper/an-analysis-of-eye-movements-during-reading
Repo
Framework

Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning

Title Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning
Authors Prashanth Vijayaraghavan, Soroush Vosoughi, Deb Roy
Abstract Twitter should be an ideal place to get a fresh read on how different issues are playing with the public, one that's potentially more reflective of democracy in this new media age than traditional polls. Pollsters typically ask people a fixed set of questions, while in social media people use their own voices to speak about whatever is on their minds. However, the demographic distribution of users on Twitter is not representative of the general population. In this paper, we present a demographic classifier for gender, age, political orientation and location on Twitter. We collected and curated a robust Twitter demographic dataset for this task. Our classifier uses a deep multi-modal multi-task learning architecture to reach a state-of-the-art performance, achieving an F1-score of 0.89, 0.82, 0.86, and 0.68 for gender, age, political orientation, and location respectively.
Tasks Multi-Task Learning
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2076/
PDF https://www.aclweb.org/anthology/P17-2076
PWC https://paperswithcode.com/paper/twitter-demographic-classification-using-deep
Repo
Framework

Don’t Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic

Title Don’t Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic
Authors Nasser Zalmout, Nizar Habash
Abstract This paper presents a model for Arabic morphological disambiguation based on Recurrent Neural Networks (RNN). We train Long Short-Term Memory (LSTM) cells in several configurations and embedding levels to model the various morphological features. Our experiments show that these models outperform state-of-the-art systems without explicit use of feature engineering. However, adding learning features from a morphological analyzer to model the space of possible analyses provides additional improvement. We make use of the resulting morphological models for scoring and ranking the analyses of the morphological analyzer for morphological disambiguation. The results show significant gains in accuracy across several evaluation metrics. Our system results in 4.4% absolute increase over the state-of-the-art in full morphological analysis accuracy (30.6% relative error reduction), and 10.6% (31.5% relative error reduction) for out-of-vocabulary words.
Tasks Feature Engineering, Language Modelling, Morphological Analysis, Morphological Tagging
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1073/
PDF https://www.aclweb.org/anthology/D17-1073
PWC https://paperswithcode.com/paper/dont-throw-those-morphological-analyzers-away
Repo
Framework
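
The absolute gains and relative error reductions quoted in the abstract above are linked by simple arithmetic. The sketch below assumes the usual definition, relative error reduction = (old error - new error) / old error, and back-solves for the baseline accuracy that the quoted figures imply. The function name is hypothetical, and the derived values are only approximate because the abstract's figures are rounded; they are not numbers reported in this listing.

```python
def implied_baseline_accuracy(absolute_gain, relative_error_reduction):
    """Back-solve the baseline accuracy (in %) from the two quoted figures.

    Assumes relative error reduction = absolute gain / baseline error rate.
    """
    baseline_error = absolute_gain / relative_error_reduction
    return 100.0 - baseline_error


# Figures taken from the abstract: 4.4% absolute (30.6% relative) for all
# words, 10.6% absolute (31.5% relative) for out-of-vocabulary words.
full = implied_baseline_accuracy(4.4, 0.306)
oov = implied_baseline_accuracy(10.6, 0.315)
print(f"implied baseline, all words: {full:.1f}%")
print(f"implied baseline, OOV words: {oov:.1f}%")
```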

Cross-lingual Character-Level Neural Morphological Tagging

Title Cross-lingual Character-Level Neural Morphological Tagging
Authors Ryan Cotterell, Georg Heigold
Abstract Even for common NLP tasks, sufficient supervision is not available in many languages; morphological tagging is no exception. In the work presented here, we explore a transfer learning scheme, whereby we train character-level recurrent neural taggers to predict morphological taggings for high-resource languages and low-resource languages together. Learning joint character representations among multiple related languages successfully enables knowledge transfer from the high-resource languages to the low-resource ones.
Tasks Language Modelling, Morphological Tagging, Part-Of-Speech Tagging, Transfer Learning
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1078/
PDF https://www.aclweb.org/anthology/D17-1078
PWC https://paperswithcode.com/paper/cross-lingual-character-level-neural-1
Repo
Framework

Entity Extraction in Biomedical Corpora: An Approach to Evaluate Word Embedding Features with PSO based Feature Selection

Title Entity Extraction in Biomedical Corpora: An Approach to Evaluate Word Embedding Features with PSO based Feature Selection
Authors Shweta Yadav, Asif Ekbal, Sriparna Saha, Pushpak Bhattacharyya
Abstract Text mining has drawn significant attention in the recent past due to the rapid growth in biomedical and clinical records. Entity extraction is one of the fundamental components of biomedical text mining. In this paper, we propose a novel approach to feature selection for entity extraction that exploits the concept of deep learning and Particle Swarm Optimization (PSO). The system utilizes word embedding features along with several other features extracted by studying the properties of the datasets. We observe that the compact word embedding features determined by PSO are more effective for entity extraction than the entire word embedding feature set. The proposed system is evaluated on three benchmark biomedical datasets: GENIA, GENETAG, and AiMed. The effectiveness of the proposed approach is evident from significant performance gains over the baseline models as well as the other existing systems. We observe improvements of 7.86%, 5.27% and 7.25% F-measure points over the baseline models for the GENIA, GENETAG, and AiMed datasets, respectively.
Tasks Boundary Detection, Entity Extraction, Feature Selection, Word Sense Disambiguation
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1109/
PDF https://www.aclweb.org/anthology/E17-1109
PWC https://paperswithcode.com/paper/entity-extraction-in-biomedical-corpora-an
Repo
Framework

YNU-HPCC at IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis Using a Bi-directional LSTM-CRF Model

Title YNU-HPCC at IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis Using a Bi-directional LSTM-CRF Model
Authors Quanlei Liao, Jin Wang, Jinnan Yang, Xuejie Zhang
Abstract Building a system to detect Chinese grammatical errors is a challenge for natural-language processing researchers. As the number of Chinese learners grows, developing such a system can help them study Chinese more easily. Since we regard Chinese grammatical error diagnosis (CGED) as a sequence-labeling problem, this paper introduces a bi-directional long short-term memory (BiLSTM) - conditional random field (CRF) model that produces a sequence indicating an error type for every position of a sentence.
Tasks
Published 2017-12-01
URL https://www.aclweb.org/anthology/I17-4011/
PDF https://www.aclweb.org/anthology/I17-4011
PWC https://paperswithcode.com/paper/ynu-hpcc-at-ijcnlp-2017-task-1-chinese
Repo
Framework

A Biomedical Question Answering System in BioASQ 2017

Title A Biomedical Question Answering System in BioASQ 2017
Authors Mourad Sarrouti, Said Ouatik El Alaoui
Abstract Question answering, the identification of short accurate answers to users' questions, is a longstanding challenge widely studied over the last decades in the open domain. However, it still requires further efforts in the biomedical domain. In this paper, we describe our participation in phase B of task 5b in the 2017 BioASQ challenge using our biomedical question answering system. Our system, dealing with four types of questions (i.e., yes/no, factoid, list, and summary), is based on (1) a dictionary-based approach for generating the exact answers of yes/no questions, (2) the UMLS metathesaurus and a term frequency metric for extracting the exact answers of factoid and list questions, and (3) the BM25 model and UMLS concepts for retrieving the ideal answers (i.e., paragraph-sized summaries). Preliminary results show that our system achieves good and competitive results in both exact and ideal answer extraction tasks as compared with the participating systems.
Tasks Question Answering
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2337/
PDF https://www.aclweb.org/anthology/W17-2337
PWC https://paperswithcode.com/paper/a-biomedical-question-answering-system-in
Repo
Framework
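
The abstract above names the BM25 model as the retrieval component for ideal answers. As a rough illustration of what BM25 ranking computes, here is a minimal sketch of the standard Okapi BM25 formula. It is not the authors' implementation: the parameter values `k1=1.5` and `b=0.75`, the non-negative IDF variant, the function name, and the toy snippets are all assumptions made for this example, and the UMLS concept matching mentioned in the abstract is not modeled.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Non-negative IDF variant (assumed here; other variants exist).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Toy snippets, invented for illustration only.
docs = [
    "aspirin and heart attack risk".split(),
    "gene encoding a membrane protein".split(),
    "aspirin as pain medication".split(),
]
scores = bm25_scores("aspirin heart".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])
```

The first snippet matches both query terms and ranks highest; the second matches neither and scores zero.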

Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling

Title Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling
Authors Andrei-Cristian Barbos, Francois Caron, Jean-François Giovannelli, Arnaud Doucet
Abstract We propose a generalized Gibbs sampler algorithm for obtaining samples approximately distributed from a high-dimensional Gaussian distribution. Similarly to Hogwild methods, our approach does not target the original Gaussian distribution of interest, but an approximation to it. Contrary to Hogwild methods, a single parameter allows us to trade bias for variance. We show empirically that our method is very flexible and performs well compared to Hogwild-type algorithms.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/7087-clone-mcmc-parallel-high-dimensional-gaussian-gibbs-sampling
PDF http://papers.nips.cc/paper/7087-clone-mcmc-parallel-high-dimensional-gaussian-gibbs-sampling.pdf
PWC https://paperswithcode.com/paper/clone-mcmc-parallel-high-dimensional-gaussian
Repo
Framework
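
For context on the abstract above, the following is a minimal sketch of a plain sequential Gibbs sampler for a two-dimensional Gaussian with unit variances and correlation `rho`. It is the textbook baseline that targets the exact distribution, not Clone MCMC and not a Hogwild-type method; the function names and the choice of `rho=0.8` are assumptions made for illustration.

```python
import math
import random


def gibbs_bivariate_gaussian(rho, n_samples, seed=0):
    """Sequential Gibbs sampling from N(0, [[1, rho], [rho, 1]])."""
    rng = random.Random(seed)
    cond_std = math.sqrt(1.0 - rho * rho)
    x, y = 0.0, 0.0
    samples = []
    for _ in range(n_samples):
        # Each coordinate is drawn from its exact conditional given the other:
        # x | y ~ N(rho * y, 1 - rho^2), and symmetrically for y | x.
        x = rng.gauss(rho * y, cond_std)
        y = rng.gauss(rho * x, cond_std)
        samples.append((x, y))
    return samples


def sample_correlation(samples):
    """Empirical Pearson correlation of a list of (x, y) pairs."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    cov = sum((x - mx) * (y - my) for x, y in samples) / n
    vx = sum((x - mx) ** 2 for x, _ in samples) / n
    vy = sum((y - my) ** 2 for _, y in samples) / n
    return cov / math.sqrt(vx * vy)


samples = gibbs_bivariate_gaussian(rho=0.8, n_samples=20000)
print(round(sample_correlation(samples), 2))
```

The coordinates here are updated one after another, which is what makes the sampler hard to parallelize in high dimensions; the abstract describes approaches that relax this at the cost of targeting an approximation.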

Findings of the WMT 2017 Biomedical Translation Shared Task

Title Findings of the WMT 2017 Biomedical Translation Shared Task
Authors Antonio Jimeno Yepes, Aurélie Névéol, Mariana Neves, Karin Verspoor, Ondřej Bojar, Arthur Boyer, Cristian Grozea, Barry Haddow, Madeleine Kittner, Yvonne Lichtblau, Pavel Pecina, Roland Roller, Rudolf Rosa, Amy Siu, Philippe Thomas, Saskia Trescher
Abstract
Tasks Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4719/
PDF https://www.aclweb.org/anthology/W17-4719
PWC https://paperswithcode.com/paper/findings-of-the-wmt-2017-biomedical
Repo
Framework

Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs

Title Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs
Authors Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan
Abstract We propose a new Maximum Subgraph algorithm for first-order parsing to 1-endpoint-crossing, pagenumber-2 graphs. Our algorithm has two characteristics: (1) it separates the construction for noncrossing edges and crossing edges; (2) in a single construction step, whether to create a new arc is deterministic. These two characteristics make our algorithm relatively easy to extend to incorporate crossing-sensitive second-order features. We then introduce a new algorithm for quasi-second-order parsing. Experiments demonstrate that second-order features are helpful for Maximum Subgraph parsing.
Tasks Dependency Parsing
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1003/
PDF https://www.aclweb.org/anthology/D17-1003
PWC https://paperswithcode.com/paper/quasi-second-order-parsing-for-1-endpoint
Repo
Framework

WSISA: Making Survival Prediction From Whole Slide Histopathological Images

Title WSISA: Making Survival Prediction From Whole Slide Histopathological Images
Authors Xinliang Zhu, Jiawen Yao, Feiyun Zhu, Junzhou Huang
Abstract Image-based precision medicine techniques can be used to better treat cancer patients. However, the gigapixel resolution of Whole Slide Histopathological Images (WSIs) makes traditional survival models computationally impossible. These models usually adopt manually labeled discriminative patches from regions of interest (ROIs) and are unable to directly learn discriminative patches from WSIs. We argue that a small set of patches alone cannot fully represent the patients’ survival status due to the heterogeneity of tumors. Another challenge is that survival prediction usually comes with insufficient training patient samples. In this paper, we propose an effective Whole Slide Histopathological Images Survival Analysis framework (WSISA) to overcome the above challenges. To exploit survival-discriminative patterns from WSIs, we first extract hundreds of patches from each WSI by adaptive sampling and then group these images into different clusters. Then we propose to train an aggregation model to make patient-level predictions based on cluster-level Deep Convolutional Survival (DeepConvSurv) prediction results. Unlike existing state-of-the-art image-based survival models, which extract features using some patches from small regions of WSIs, the proposed framework can efficiently exploit and utilize all discriminative patterns in WSIs to predict patients’ survival status. To the best of our knowledge, this has not been shown before. We apply our method to the survival predictions of glioma and non-small-cell lung cancer using three datasets. Results demonstrate that the proposed framework can significantly improve the prediction performance compared with the existing state-of-the-art survival methods.
Tasks Survival Analysis
Published 2017-07-01
URL http://openaccess.thecvf.com/content_cvpr_2017/html/Zhu_WSISA_Making_Survival_CVPR_2017_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhu_WSISA_Making_Survival_CVPR_2017_paper.pdf
PWC https://paperswithcode.com/paper/wsisa-making-survival-prediction-from-whole
Repo
Framework

Controlling Target Features in Neural Machine Translation via Prefix Constraints

Title Controlling Target Features in Neural Machine Translation via Prefix Constraints
Authors Shunsuke Takeno, Masaaki Nagata, Kazuhide Yamamoto
Abstract We propose prefix constraints, a novel method to enforce constraints on target sentences in neural machine translation. It places a sequence of special tokens at the beginning of the target sentence (target prefix), while side constraints place a special token at the end of the source sentence (source suffix). Prefix constraints can be predicted from the source sentence jointly with the target sentence, while side constraints (Sennrich et al., 2016) must be provided by the user or predicted by some other method. In both methods, special tokens are designed to encode arbitrary features on the target side or metatextual information. We show that prefix constraints are more flexible than side constraints and can be used to control the behavior of neural machine translation, in terms of output length, bidirectional decoding, domain adaptation, and unaligned target word generation.
Tasks Domain Adaptation, Machine Translation
Published 2017-11-01
URL https://www.aclweb.org/anthology/W17-5702/
PDF https://www.aclweb.org/anthology/W17-5702
PWC https://paperswithcode.com/paper/controlling-target-features-in-neural-machine
Repo
Framework
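
The difference between the two constraint placements described in the abstract above can be sketched on a single training pair. This is only an illustration of where the special tokens go: the helper names and the token spellings such as `<len:short>` and `<domain:chat>` are hypothetical placeholders, not the vocabulary used in the paper.

```python
def add_side_constraint(source_tokens, constraint_token):
    """Side constraint: append one special token to the end of the source."""
    return source_tokens + [constraint_token]


def add_prefix_constraint(target_tokens, constraint_tokens):
    """Prefix constraint: prepend special tokens to the start of the target."""
    return list(constraint_tokens) + target_tokens


source = ["guten", "morgen"]
target = ["good", "morning"]

# Token spellings below are hypothetical placeholders.
side_source = add_side_constraint(source, "<len:short>")
prefixed_target = add_prefix_constraint(target, ["<len:short>", "<domain:chat>"])

print(side_source)      # source side carries the constraint
print(prefixed_target)  # target side carries the constraints
```

Because the prefix tokens sit on the target side, a decoder can either be forced to start with them or generate them itself, which matches the abstract's point that prefix constraints can be predicted jointly with the target sentence.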

Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation

Title Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation
Authors Alexander Panchenko, Eugen Ruppert, Stefano Faralli, Simone Paolo Ponzetto, Chris Biemann
Abstract The current trend in NLP is the use of highly opaque models, e.g. neural networks and word embeddings. While these models yield state-of-the-art results on a range of tasks, their drawback is poor interpretability. Using word sense induction and disambiguation (WSID) as an example, we show that it is possible to develop an interpretable model that matches the state-of-the-art models in accuracy. Namely, we present an unsupervised, knowledge-free WSID approach, which is interpretable at three levels: word sense inventory, sense feature representations, and disambiguation procedure. Experiments show that our model performs on par with state-of-the-art word sense embeddings and other unsupervised systems while offering the possibility to justify its decisions in human-readable form.
Tasks Word Embeddings, Word Sense Disambiguation, Word Sense Induction
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1009/
PDF https://www.aclweb.org/anthology/E17-1009
PWC https://paperswithcode.com/paper/unsupervised-does-not-mean-uninterpretable
Repo
Framework

Identifying the Provision of Choices in Privacy Policy Text

Title Identifying the Provision of Choices in Privacy Policy Text
Authors Kanthashree Mysore Sathyendra, Shomir Wilson, Florian Schaub, Sebastian Zimmeck, Norman Sadeh
Abstract Websites' and mobile apps' privacy policies, written in natural language, tend to be long and difficult to understand. Information privacy revolves around the fundamental principle of notice and choice, namely the idea that users should be able to make informed decisions about what information about them can be collected and how it can be used. Internet users want control over their privacy, but their choices are often hidden in long and convoluted privacy policy texts. Moreover, little (if any) prior work has been done to detect the provision of choices in text. We address this challenge of enabling user choice by automatically identifying and extracting pertinent choice language in privacy policies. In particular, we present a two-stage architecture of classification models to identify opt-out choices in privacy policy text, labelling common varieties of choices with a mean F1 score of 0.735. Our techniques enable the creation of systems to help Internet users to learn about their choices, thereby effectuating notice and choice and improving Internet privacy.
Tasks Question Answering
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1294/
PDF https://www.aclweb.org/anthology/D17-1294
PWC https://paperswithcode.com/paper/identifying-the-provision-of-choices-in
Repo
Framework