July 26, 2019


Paper Group NANR 180

A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing. An analysis of eye-movements during reading for the detection of mild cognitive impairment. Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning. Don’t Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Ar …

A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing


Title	A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing
Authors	Daniel Fernández-González, Carlos Gómez-Rodríguez
Abstract	Restricted non-monotonicity has been shown beneficial for the projective arc-eager dependency parser in previous research, as posterior decisions can repair mistakes made in previous states due to the lack of information. In this paper, we propose a novel, fully non-monotonic transition system based on the non-projective Covington algorithm. As a non-monotonic system requires exploration of erroneous actions during the training process, we develop several non-monotonic variants of the recently defined dynamic oracle for the Covington parser, based on tight approximations of the loss. Experiments on datasets from the CoNLL-X and CoNLL-XI shared tasks show that a non-monotonic dynamic oracle outperforms the monotonic version in the majority of languages.
Tasks
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-1027/
PDF	https://www.aclweb.org/anthology/P17-1027
PWC	https://paperswithcode.com/paper/a-full-non-monotonic-transition-system-for-1
Repo
Framework

An analysis of eye-movements during reading for the detection of mild cognitive impairment


Title	An analysis of eye-movements during reading for the detection of mild cognitive impairment
Authors	Kathleen C. Fraser, Kristina Lundholm Fors, Dimitrios Kokkinakis, Arto Nordlund
Abstract	We present a machine learning analysis of eye-tracking data for the detection of mild cognitive impairment, a decline in cognitive abilities that is associated with an increased risk of developing dementia. We compare two experimental configurations (reading aloud versus reading silently), as well as two methods of combining information from the two trials (concatenation and merging). Additionally, we annotate the words being read with information about their frequency and syntactic category, and use these annotations to generate new features. Ultimately, we are able to distinguish between participants with and without cognitive impairment with up to 86% accuracy.
Tasks	Eye Tracking
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1107/
PDF	https://www.aclweb.org/anthology/D17-1107
PWC	https://paperswithcode.com/paper/an-analysis-of-eye-movements-during-reading
Repo
Framework

Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning


Title	Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning
Authors	Prashanth Vijayaraghavan, Soroush Vosoughi, Deb Roy
Abstract	Twitter should be an ideal place to get a fresh read on how different issues are playing with the public, one that's potentially more reflective of democracy in this new media age than traditional polls. Pollsters typically ask people a fixed set of questions, while in social media people use their own voices to speak about whatever is on their minds. However, the demographic distribution of users on Twitter is not representative of the general population. In this paper, we present a demographic classifier for gender, age, political orientation and location on Twitter. We collected and curated a robust Twitter demographic dataset for this task. Our classifier uses a deep multi-modal multi-task learning architecture to reach a state-of-the-art performance, achieving an F1-score of 0.89, 0.82, 0.86, and 0.68 for gender, age, political orientation, and location respectively.
Tasks	Multi-Task Learning
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-2076/
PDF	https://www.aclweb.org/anthology/P17-2076
PWC	https://paperswithcode.com/paper/twitter-demographic-classification-using-deep
Repo
Framework

Don’t Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic


Title	Don’t Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic
Authors	Nasser Zalmout, Nizar Habash
Abstract	This paper presents a model for Arabic morphological disambiguation based on Recurrent Neural Networks (RNN). We train Long Short-Term Memory (LSTM) cells in several configurations and embedding levels to model the various morphological features. Our experiments show that these models outperform state-of-the-art systems without explicit use of feature engineering. However, adding learning features from a morphological analyzer to model the space of possible analyses provides additional improvement. We make use of the resulting morphological models for scoring and ranking the analyses of the morphological analyzer for morphological disambiguation. The results show significant gains in accuracy across several evaluation metrics. Our system results in 4.4% absolute increase over the state-of-the-art in full morphological analysis accuracy (30.6% relative error reduction), and 10.6% (31.5% relative error reduction) for out-of-vocabulary words.
Tasks	Feature Engineering, Language Modelling, Morphological Analysis, Morphological Tagging
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1073/
PDF	https://www.aclweb.org/anthology/D17-1073
PWC	https://paperswithcode.com/paper/dont-throw-those-morphological-analyzers-away
Repo
Framework
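
The abstract above quotes both an absolute accuracy gain and a relative error reduction. The sketch below shows how the two figures relate, assuming the usual definition (relative error reduction = absolute gain divided by the baseline error rate). The function name is hypothetical, and the baseline values it prints are back-of-the-envelope estimates derived from the abstract's rounded figures, not numbers reported by the authors.

```python
def implied_baseline_error(abs_gain_pts, rel_error_reduction_pct):
    """Baseline error rate (in percentage points) implied by an absolute
    accuracy gain and a relative error reduction, assuming
    rel_error_reduction = abs_gain / baseline_error."""
    return abs_gain_pts / (rel_error_reduction_pct / 100.0)


# Figures quoted in the abstract: (absolute gain, relative error reduction).
quoted = [("all words", 4.4, 30.6), ("out-of-vocabulary words", 10.6, 31.5)]

for label, gain, rer in quoted:
    err_before = implied_baseline_error(gain, rer)
    acc_before = 100.0 - err_before
    # Estimates only: the inputs are rounded values from the abstract.
    print(f"{label}: implied baseline accuracy ~{acc_before:.1f}%, "
          f"implied new accuracy ~{acc_before + gain:.1f}%")
```

Under this assumption, the same absolute gain corresponds to a larger relative error reduction when the baseline error is smaller, which is why the two figures are usually reported together.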

Cross-lingual Character-Level Neural Morphological Tagging


Title	Cross-lingual Character-Level Neural Morphological Tagging
Authors	Ryan Cotterell, Georg Heigold
Abstract	Even for common NLP tasks, sufficient supervision is not available in many languages – morphological tagging is no exception. In the work presented here, we explore a transfer learning scheme, whereby we train character-level recurrent neural taggers to predict morphological taggings for high-resource languages and low-resource languages together. Learning joint character representations among multiple related languages successfully enables knowledge transfer from the high-resource languages to the low-resource ones.
Tasks	Language Modelling, Morphological Tagging, Part-Of-Speech Tagging, Transfer Learning
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1078/
PDF	https://www.aclweb.org/anthology/D17-1078
PWC	https://paperswithcode.com/paper/cross-lingual-character-level-neural-1
Repo
Framework

Entity Extraction in Biomedical Corpora: An Approach to Evaluate Word Embedding Features with PSO based Feature Selection


Title	Entity Extraction in Biomedical Corpora: An Approach to Evaluate Word Embedding Features with PSO based Feature Selection
Authors	Shweta Yadav, Asif Ekbal, Sriparna Saha, Pushpak Bhattacharyya
Abstract	Text mining has drawn significant attention in recent past due to the rapid growth in biomedical and clinical records. Entity extraction is one of the fundamental components for biomedical text mining. In this paper, we propose a novel approach of feature selection for entity extraction that exploits the concept of deep learning and Particle Swarm Optimization (PSO). The system utilizes word embedding features along with several other features extracted by studying the properties of the datasets. We obtain an interesting observation that compact word embedding features as determined by PSO are more effective compared to the entire word embedding feature set for entity extraction. The proposed system is evaluated on three benchmark biomedical datasets such as GENIA, GENETAG, and AiMed. The effectiveness of the proposed approach is evident with significant performance gains over the baseline models as well as the other existing systems. We observe improvements of 7.86%, 5.27% and 7.25% F-measure points over the baseline models for GENIA, GENETAG, and AiMed dataset respectively.
Tasks	Boundary Detection, Entity Extraction, Feature Selection, Word Sense Disambiguation
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-1109/
PDF	https://www.aclweb.org/anthology/E17-1109
PWC	https://paperswithcode.com/paper/entity-extraction-in-biomedical-corpora-an
Repo
Framework

YNU-HPCC at IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis Using a Bi-directional LSTM-CRF Model


Title	YNU-HPCC at IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis Using a Bi-directional LSTM-CRF Model
Authors	Quanlei Liao, Jin Wang, Jinnan Yang, Xuejie Zhang
Abstract	Building a system to detect Chinese grammatical errors is a challenge for natural-language processing researchers. As Chinese learners are increasing, developing such a system can help them study Chinese more easily. This paper introduces a bi-directional long short-term memory (BiLSTM) - conditional random field (CRF) model to produce the sequences that indicate an error type for every position of a sentence, since we regard Chinese grammatical error diagnosis (CGED) as a sequence-labeling problem.
Tasks
Published	2017-12-01
URL	https://www.aclweb.org/anthology/I17-4011/
PDF	https://www.aclweb.org/anthology/I17-4011
PWC	https://paperswithcode.com/paper/ynu-hpcc-at-ijcnlp-2017-task-1-chinese
Repo
Framework

A Biomedical Question Answering System in BioASQ 2017


Title	A Biomedical Question Answering System in BioASQ 2017
Authors	Mourad Sarrouti, Said Ouatik El Alaoui
Abstract	Question answering, the identification of short accurate answers to users' questions, is a longstanding challenge widely studied over the last decades in the open domain. However, it still requires further efforts in the biomedical domain. In this paper, we describe our participation in phase B of task 5b in the 2017 BioASQ challenge using our biomedical question answering system. Our system, dealing with four types of questions (i.e., yes/no, factoid, list, and summary), is based on (1) a dictionary-based approach for generating the exact answers of yes/no questions, (2) UMLS metathesaurus and term frequency metric for extracting the exact answers of factoid and list questions, and (3) the BM25 model and UMLS concepts for retrieving the ideal answers (i.e., paragraph-sized summaries). Preliminary results show that our system achieves good and competitive results in both exact and ideal answers extraction tasks as compared with the participating systems.
Tasks	Question Answering
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2337/
PDF	https://www.aclweb.org/anthology/W17-2337
PWC	https://paperswithcode.com/paper/a-biomedical-question-answering-system-in
Repo
Framework

Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling


Title	Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling
Authors	Andrei-Cristian Barbos, Francois Caron, Jean-François Giovannelli, Arnaud Doucet
Abstract	We propose a generalized Gibbs sampler algorithm for obtaining samples approximately distributed from a high-dimensional Gaussian distribution. Similarly to Hogwild methods, our approach does not target the original Gaussian distribution of interest, but an approximation to it. Contrary to Hogwild methods, a single parameter allows us to trade bias for variance. We show empirically that our method is very flexible and performs well compared to Hogwild-type algorithms.
Tasks
Published	2017-12-01
URL	http://papers.nips.cc/paper/7087-clone-mcmc-parallel-high-dimensional-gaussian-gibbs-sampling
PDF	http://papers.nips.cc/paper/7087-clone-mcmc-parallel-high-dimensional-gaussian-gibbs-sampling.pdf
PWC	https://paperswithcode.com/paper/clone-mcmc-parallel-high-dimensional-gaussian
Repo
Framework

Findings of the WMT 2017 Biomedical Translation Shared Task


Title	Findings of the WMT 2017 Biomedical Translation Shared Task
Authors	Antonio Jimeno Yepes, Aurélie Névéol, Mariana Neves, Karin Verspoor, Ondřej Bojar, Arthur Boyer, Cristian Grozea, Barry Haddow, Madeleine Kittner, Yvonne Lichtblau, Pavel Pecina, Roland Roller, Rudolf Rosa, Amy Siu, Philippe Thomas, Saskia Trescher
Abstract
Tasks	Machine Translation
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4719/
PDF	https://www.aclweb.org/anthology/W17-4719
PWC	https://paperswithcode.com/paper/findings-of-the-wmt-2017-biomedical
Repo
Framework

Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs


Title	Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs
Authors	Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan
Abstract	We propose a new Maximum Subgraph algorithm for first-order parsing to 1-endpoint-crossing, pagenumber-2 graphs. Our algorithm has two characteristics: (1) it separates the construction for noncrossing edges and crossing edges; (2) in a single construction step, whether to create a new arc is deterministic. These two characteristics make our algorithm relatively easy to extend to incorporate crossing-sensitive second-order features. We then introduce a new algorithm for quasi-second-order parsing. Experiments demonstrate that second-order features are helpful for Maximum Subgraph parsing.
Tasks	Dependency Parsing
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1003/
PDF	https://www.aclweb.org/anthology/D17-1003
PWC	https://paperswithcode.com/paper/quasi-second-order-parsing-for-1-endpoint
Repo
Framework

WSISA: Making Survival Prediction From Whole Slide Histopathological Images


Title	WSISA: Making Survival Prediction From Whole Slide Histopathological Images
Authors	Xinliang Zhu, Jiawen Yao, Feiyun Zhu, Junzhou Huang
Abstract	Image-based precision medicine techniques can be used to better treat cancer patients. However, the gigapixel resolution of Whole Slide Histopathological Images (WSIs) makes traditional survival models computationally impossible. These models usually adopt manually labeled discriminative patches from region of interests (ROIs) and are unable to directly learn discriminative patches from WSIs. We argue that only a small set of patches cannot fully represent the patients’ survival status due to the heterogeneity of tumor. Another challenge is that survival prediction usually comes with insufficient training patient samples. In this paper, we propose an effective Whole Slide Histopathological Images Survival Analysis framework (WSISA) to overcome above challenges. To exploit survival-discriminative patterns from WSIs, we first extract hundreds of patches from each WSI by adaptive sampling and then group these images into different clusters. Then we propose to train an aggregation model to make patient-level predictions based on cluster-level Deep Convolutional Survival (DeepConvSurv) prediction results. Different from existing state-of-the-arts image-based survival models which extract features using some patches from small regions of WSIs, the proposed framework can efficiently exploit and utilize all discriminative patterns in WSIs to predict patients’ survival status. To the best of our knowledge, this has not been shown before. We apply our method to the survival predictions of glioma and non-small-cell lung cancer using three datasets. Results demonstrate the proposed framework can significantly improve the prediction performance compared with the existing state-of-the-arts survival methods.
Tasks	Survival Analysis
Published	2017-07-01
URL	http://openaccess.thecvf.com/content_cvpr_2017/html/Zhu_WSISA_Making_Survival_CVPR_2017_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhu_WSISA_Making_Survival_CVPR_2017_paper.pdf
PWC	https://paperswithcode.com/paper/wsisa-making-survival-prediction-from-whole
Repo
Framework

Controlling Target Features in Neural Machine Translation via Prefix Constraints


Title	Controlling Target Features in Neural Machine Translation via Prefix Constraints
Authors	Shunsuke Takeno, Masaaki Nagata, Kazuhide Yamamoto
Abstract	We propose prefix constraints, a novel method to enforce constraints on target sentences in neural machine translation. It places a sequence of special tokens at the beginning of target sentence (target prefix), while side constraints places a special token at the end of source sentence (source suffix). Prefix constraints can be predicted from source sentence jointly with target sentence, while side constraints (Sennrich et al., 2016) must be provided by the user or predicted by some other methods. In both methods, special tokens are designed to encode arbitrary features on target-side or metatextual information. We show that prefix constraints are more flexible than side constraints and can be used to control the behavior of neural machine translation, in terms of output length, bidirectional decoding, domain adaptation, and unaligned target word generation.
Tasks	Domain Adaptation, Machine Translation
Published	2017-11-01
URL	https://www.aclweb.org/anthology/W17-5702/
PDF	https://www.aclweb.org/anthology/W17-5702
PWC	https://paperswithcode.com/paper/controlling-target-features-in-neural-machine
Repo
Framework
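
The abstract above contrasts two places to put control tokens: at the start of the target sentence (prefix constraints) or at the end of the source sentence (side constraints). The following is a minimal sketch of that data-preparation difference only. The `<name:value>` token format, the function names, and the example features are all hypothetical and are not taken from the paper.

```python
def add_prefix_constraints(target_tokens, features):
    """Prepend one special token per feature to the target sentence.

    Because the tokens are part of the target sequence, a model could
    predict them itself before generating the sentence.
    """
    prefix = [f"<{name}:{value}>" for name, value in features.items()]
    return prefix + list(target_tokens)


def add_side_constraint(source_tokens, name, value):
    """Append a single special token to the source sentence.

    The token is part of the input, so it has to be supplied from outside.
    """
    return list(source_tokens) + [f"<{name}:{value}>"]


source = ["a", "small", "house"]
target = ["ein", "kleines", "Haus"]

# Hypothetical features: output length bucket and domain.
print(add_prefix_constraints(target, {"length": "short", "domain": "news"}))
print(add_side_constraint(source, "length", "short"))
```

At inference time, a decoder trained on prefixed targets could either be forced to start from a chosen prefix or left free to predict one, which is the flexibility the abstract refers to.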

Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation


Title	Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation
Authors	Alexander Panchenko, Eugen Ruppert, Stefano Faralli, Simone Paolo Ponzetto, Chris Biemann
Abstract	The current trend in NLP is the use of highly opaque models, e.g. neural networks and word embeddings. While these models yield state-of-the-art results on a range of tasks, their drawback is poor interpretability. On the example of word sense induction and disambiguation (WSID), we show that it is possible to develop an interpretable model that matches the state-of-the-art models in accuracy. Namely, we present an unsupervised, knowledge-free WSID approach, which is interpretable at three levels: word sense inventory, sense feature representations, and disambiguation procedure. Experiments show that our model performs on par with state-of-the-art word sense embeddings and other unsupervised systems while offering the possibility to justify its decisions in human-readable form.
Tasks	Word Embeddings, Word Sense Disambiguation, Word Sense Induction
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-1009/
PDF	https://www.aclweb.org/anthology/E17-1009
PWC	https://paperswithcode.com/paper/unsupervised-does-not-mean-uninterpretable
Repo
Framework

Identifying the Provision of Choices in Privacy Policy Text


Title	Identifying the Provision of Choices in Privacy Policy Text
Authors	Kanthashree Mysore Sathyendra, Shomir Wilson, Florian Schaub, Sebastian Zimmeck, Norman Sadeh
Abstract	Websites' and mobile apps' privacy policies, written in natural language, tend to be long and difficult to understand. Information privacy revolves around the fundamental principle of Notice and choice, namely the idea that users should be able to make informed decisions about what information about them can be collected and how it can be used. Internet users want control over their privacy, but their choices are often hidden in long and convoluted privacy policy texts. Moreover, little (if any) prior work has been done to detect the provision of choices in text. We address this challenge of enabling user choice by automatically identifying and extracting pertinent choice language in privacy policies. In particular, we present a two-stage architecture of classification models to identify opt-out choices in privacy policy text, labelling common varieties of choices with a mean F1 score of 0.735. Our techniques enable the creation of systems to help Internet users to learn about their choices, thereby effectuating notice and choice and improving Internet privacy.
Tasks	Question Answering
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1294/
PDF	https://www.aclweb.org/anthology/D17-1294
PWC	https://paperswithcode.com/paper/identifying-the-provision-of-choices-in
Repo
Framework