Paper Group NANR 180
A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing
Title | A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing |
Authors | Daniel Fernández-González, Carlos Gómez-Rodríguez |
Abstract | Restricted non-monotonicity has been shown beneficial for the projective arc-eager dependency parser in previous research, as posterior decisions can repair mistakes made in previous states due to the lack of information. In this paper, we propose a novel, fully non-monotonic transition system based on the non-projective Covington algorithm. As a non-monotonic system requires exploration of erroneous actions during the training process, we develop several non-monotonic variants of the recently defined dynamic oracle for the Covington parser, based on tight approximations of the loss. Experiments on datasets from the CoNLL-X and CoNLL-XI shared tasks show that a non-monotonic dynamic oracle outperforms the monotonic version in the majority of languages. |
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1027/ |
https://www.aclweb.org/anthology/P17-1027 | |
PWC | https://paperswithcode.com/paper/a-full-non-monotonic-transition-system-for-1 |
Repo | |
Framework | |
An analysis of eye-movements during reading for the detection of mild cognitive impairment
Title | An analysis of eye-movements during reading for the detection of mild cognitive impairment |
Authors | Kathleen C. Fraser, Kristina Lundholm Fors, Dimitrios Kokkinakis, Arto Nordlund |
Abstract | We present a machine learning analysis of eye-tracking data for the detection of mild cognitive impairment, a decline in cognitive abilities that is associated with an increased risk of developing dementia. We compare two experimental configurations (reading aloud versus reading silently), as well as two methods of combining information from the two trials (concatenation and merging). Additionally, we annotate the words being read with information about their frequency and syntactic category, and use these annotations to generate new features. Ultimately, we are able to distinguish between participants with and without cognitive impairment with up to 86% accuracy. |
Tasks | Eye Tracking |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1107/ |
https://www.aclweb.org/anthology/D17-1107 | |
PWC | https://paperswithcode.com/paper/an-analysis-of-eye-movements-during-reading |
Repo | |
Framework | |
Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning
Title | Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning |
Authors | Prashanth Vijayaraghavan, Soroush Vosoughi, Deb Roy |
Abstract | Twitter should be an ideal place to get a fresh read on how different issues are playing with the public, one that's potentially more reflective of democracy in this new media age than traditional polls. Pollsters typically ask people a fixed set of questions, while in social media people use their own voices to speak about whatever is on their minds. However, the demographic distribution of users on Twitter is not representative of the general population. In this paper, we present a demographic classifier for gender, age, political orientation and location on Twitter. We collected and curated a robust Twitter demographic dataset for this task. Our classifier uses a deep multi-modal multi-task learning architecture to reach a state-of-the-art performance, achieving an F1-score of 0.89, 0.82, 0.86, and 0.68 for gender, age, political orientation, and location respectively. |
Tasks | Multi-Task Learning |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2076/ |
https://www.aclweb.org/anthology/P17-2076 | |
PWC | https://paperswithcode.com/paper/twitter-demographic-classification-using-deep |
Repo | |
Framework | |
Don’t Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic
Title | Don’t Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic |
Authors | Nasser Zalmout, Nizar Habash |
Abstract | This paper presents a model for Arabic morphological disambiguation based on Recurrent Neural Networks (RNN). We train Long Short-Term Memory (LSTM) cells in several configurations and embedding levels to model the various morphological features. Our experiments show that these models outperform state-of-the-art systems without explicit use of feature engineering. However, adding learning features from a morphological analyzer to model the space of possible analyses provides additional improvement. We make use of the resulting morphological models for scoring and ranking the analyses of the morphological analyzer for morphological disambiguation. The results show significant gains in accuracy across several evaluation metrics. Our system results in 4.4% absolute increase over the state-of-the-art in full morphological analysis accuracy (30.6% relative error reduction), and 10.6% (31.5% relative error reduction) for out-of-vocabulary words. |
Tasks | Feature Engineering, Language Modelling, Morphological Analysis, Morphological Tagging |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1073/ |
https://www.aclweb.org/anthology/D17-1073 | |
PWC | https://paperswithcode.com/paper/dont-throw-those-morphological-analyzers-away |
Repo | |
Framework | |
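The abstract above reports each gain twice, as an absolute accuracy increase and as a relative error reduction. The two figures are linked by relative error reduction = absolute gain / baseline error, so together they imply the baseline error rate. The sketch below works through that arithmetic; the function name is hypothetical, and because the inputs are the rounded figures from the abstract, the derived values are approximate and are not numbers stated by the authors.

```python
def implied_baseline_error(absolute_gain: float, relative_error_reduction: float) -> float:
    """Baseline error (in percentage points) implied by the two reported figures.

    Uses: relative_error_reduction = absolute_gain / baseline_error.
    """
    if relative_error_reduction <= 0:
        raise ValueError("relative error reduction must be positive")
    return absolute_gain / relative_error_reduction


# (label, absolute gain in points, relative error reduction as a fraction),
# taken from the abstract; everything derived from them is approximate.
reported = [("all words", 4.4, 0.306), ("out-of-vocabulary words", 10.6, 0.315)]

for label, gain, rer in reported:
    base_err = implied_baseline_error(gain, rer)
    base_acc = 100.0 - base_err
    print(f"{label}: implied baseline accuracy {base_acc:.1f}%, "
          f"implied new accuracy {base_acc + gain:.1f}%")
```

For the full-vocabulary figures this gives an implied baseline error of about 14.4 points.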
Cross-lingual Character-Level Neural Morphological Tagging
Title | Cross-lingual Character-Level Neural Morphological Tagging |
Authors | Ryan Cotterell, Georg Heigold |
Abstract | Even for common NLP tasks, sufficient supervision is not available in many languages; morphological tagging is no exception. In the work presented here, we explore a transfer learning scheme, whereby we train character-level recurrent neural taggers to predict morphological taggings for high-resource languages and low-resource languages together. Learning joint character representations among multiple related languages successfully enables knowledge transfer from the high-resource languages to the low-resource ones. |
Tasks | Language Modelling, Morphological Tagging, Part-Of-Speech Tagging, Transfer Learning |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1078/ |
https://www.aclweb.org/anthology/D17-1078 | |
PWC | https://paperswithcode.com/paper/cross-lingual-character-level-neural-1 |
Repo | |
Framework | |
Entity Extraction in Biomedical Corpora: An Approach to Evaluate Word Embedding Features with PSO based Feature Selection
Title | Entity Extraction in Biomedical Corpora: An Approach to Evaluate Word Embedding Features with PSO based Feature Selection |
Authors | Shweta Yadav, Asif Ekbal, Sriparna Saha, Pushpak Bhattacharyya |
Abstract | Text mining has drawn significant attention in the recent past due to the rapid growth in biomedical and clinical records. Entity extraction is one of the fundamental components of biomedical text mining. In this paper, we propose a novel approach to feature selection for entity extraction that exploits the concepts of deep learning and Particle Swarm Optimization (PSO). The system utilizes word embedding features along with several other features extracted by studying the properties of the datasets. We observe that the compact word embedding features determined by PSO are more effective for entity extraction than the entire word embedding feature set. The proposed system is evaluated on three benchmark biomedical datasets: GENIA, GENETAG, and AiMed. The effectiveness of the proposed approach is evident in significant performance gains over the baseline models as well as other existing systems. We observe improvements of 7.86%, 5.27% and 7.25% F-measure points over the baseline models for the GENIA, GENETAG, and AiMed datasets, respectively. |
Tasks | Boundary Detection, Entity Extraction, Feature Selection, Word Sense Disambiguation |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1109/ |
https://www.aclweb.org/anthology/E17-1109 | |
PWC | https://paperswithcode.com/paper/entity-extraction-in-biomedical-corpora-an |
Repo | |
Framework | |
YNU-HPCC at IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis Using a Bi-directional LSTM-CRF Model
Title | YNU-HPCC at IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis Using a Bi-directional LSTM-CRF Model |
Authors | Quanlei Liao, Jin Wang, Jinnan Yang, Xuejie Zhang |
Abstract | Building a system to detect Chinese grammatical errors is a challenge for natural-language processing researchers. As the number of Chinese learners grows, such a system can help them study Chinese more easily. We regard Chinese grammatical error diagnosis (CGED) as a sequence-labeling problem, so this paper introduces a bi-directional long short-term memory (BiLSTM) conditional random field (CRF) model that produces a sequence indicating an error type for every position of a sentence. |
Tasks | |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/I17-4011/ |
https://www.aclweb.org/anthology/I17-4011 | |
PWC | https://paperswithcode.com/paper/ynu-hpcc-at-ijcnlp-2017-task-1-chinese |
Repo | |
Framework | |
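The abstract above treats error diagnosis as sequence labeling, with one label per sentence position. A minimal sketch of that framing follows: it converts annotated error spans into a per-position label sequence of the kind a tagger such as a BiLSTM-CRF would be trained to predict. The BIO label scheme, the span format (0-based, inclusive) and the error-type codes `R` and `S` are assumptions made for illustration; the abstract does not specify the tag set the authors used.

```python
def spans_to_labels(sentence_length, error_spans):
    """Turn (start, end, error_type) spans into one BIO label per position.

    Spans are 0-based and inclusive; positions outside any span get "O".
    The BIO scheme and the span format are illustrative assumptions.
    """
    labels = ["O"] * sentence_length
    for start, end, error_type in error_spans:
        if not (0 <= start <= end < sentence_length):
            raise ValueError(f"span ({start}, {end}) is outside the sentence")
        labels[start] = f"B-{error_type}"
        for pos in range(start + 1, end + 1):
            labels[pos] = f"I-{error_type}"
    return labels


# A six-position sentence with two hypothetical error spans.
print(spans_to_labels(6, [(1, 2, "R"), (4, 4, "S")]))
```

A tagger trained on such sequences outputs one label per position, and contiguous `B-`/`I-` runs can be decoded back into error spans.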
A Biomedical Question Answering System in BioASQ 2017
Title | A Biomedical Question Answering System in BioASQ 2017 |
Authors | Mourad Sarrouti, Said Ouatik El Alaoui |
Abstract | Question answering, the identification of short accurate answers to users' questions, is a longstanding challenge widely studied over the last decades in the open domain. However, it still requires further efforts in the biomedical domain. In this paper, we describe our participation in phase B of task 5b in the 2017 BioASQ challenge using our biomedical question answering system. Our system, dealing with four types of questions (i.e., yes/no, factoid, list, and summary), is based on (1) a dictionary-based approach for generating the exact answers of yes/no questions, (2) the UMLS Metathesaurus and a term frequency metric for extracting the exact answers of factoid and list questions, and (3) the BM25 model and UMLS concepts for retrieving the ideal answers (i.e., paragraph-sized summaries). Preliminary results show that our system achieves good and competitive results in both the exact and ideal answer extraction tasks as compared with the participating systems. |
Tasks | Question Answering |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2337/ |
https://www.aclweb.org/anthology/W17-2337 | |
PWC | https://paperswithcode.com/paper/a-biomedical-question-answering-system-in |
Repo | |
Framework | |
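The abstract above names the BM25 model as the basis for retrieving ideal answers. As a rough illustration of how BM25 ranks passages against a question, the sketch below implements the standard Okapi BM25 formula over whitespace tokens. It is not the authors' system: the UMLS concept matching they describe is omitted, the IDF variant (with the `+ 1` inside the logarithm, which keeps weights non-negative) and the defaults `k1=1.5`, `b=0.75` are common choices assumed here, and the example passages are made up.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    Tokenization is lowercase whitespace splitting (an assumption).
    Returns one score per document, in input order.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in tokenized) / n_docs
    # Number of documents containing each term.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))

    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            length_norm = k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1.0) / (tf + length_norm)
        scores.append(score)
    return scores


passages = [
    "aspirin reduces fever",
    "insulin regulates blood glucose",
    "fever is a symptom",
]
ranked = sorted(zip(bm25_scores("insulin glucose", passages), passages), reverse=True)
print(ranked[0][1])
```

Passages sharing no term with the question score zero, so in practice some form of term normalization or concept mapping matters a great deal for recall.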
Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling
Title | Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling |
Authors | Andrei-Cristian Barbos, François Caron, Jean-François Giovannelli, Arnaud Doucet |
Abstract | We propose a generalized Gibbs sampler algorithm for obtaining samples approximately distributed from a high-dimensional Gaussian distribution. Similarly to Hogwild methods, our approach does not target the original Gaussian distribution of interest, but an approximation to it. Contrary to Hogwild methods, a single parameter allows us to trade bias for variance. We show empirically that our method is very flexible and performs well compared to Hogwild-type algorithms. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7087-clone-mcmc-parallel-high-dimensional-gaussian-gibbs-sampling |
http://papers.nips.cc/paper/7087-clone-mcmc-parallel-high-dimensional-gaussian-gibbs-sampling.pdf | |
PWC | https://paperswithcode.com/paper/clone-mcmc-parallel-high-dimensional-gaussian |
Repo | |
Framework | |
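For context on the abstract above, the sketch below shows the standard sequential (component-wise) Gibbs sampler for a zero-mean Gaussian specified by its precision matrix. This is the textbook baseline, not the Clone MCMC algorithm and not a Hogwild variant; the abstract does not give enough detail to reconstruct either. The function name, the zero-mean assumption and the 2x2 example matrix are illustrative choices.

```python
import random


def gibbs_gaussian(precision, n_sweeps, seed=0):
    """Sequential Gibbs sampling from N(0, precision^{-1}).

    For precision matrix J, the conditional of x[i] given the rest is
    Gaussian with mean -(1 / J[i][i]) * sum_{j != i} J[i][j] * x[j]
    and variance 1 / J[i][i]. Returns one state per full sweep.
    """
    rng = random.Random(seed)
    dim = len(precision)
    x = [0.0] * dim
    samples = []
    for _ in range(n_sweeps):
        for i in range(dim):
            coupling = sum(precision[i][j] * x[j] for j in range(dim) if j != i)
            cond_mean = -coupling / precision[i][i]
            cond_std = (1.0 / precision[i][i]) ** 0.5
            x[i] = rng.gauss(cond_mean, cond_std)
        samples.append(list(x))
    return samples


# Precision [[2, -1], [-1, 2]] has covariance (1/3) * [[2, 1], [1, 2]].
draws = gibbs_gaussian([[2.0, -1.0], [-1.0, 2.0]], n_sweeps=20000)
var_0 = sum(s[0] ** 2 for s in draws) / len(draws)
print(f"empirical variance of x[0]: {var_0:.3f}")
```

Each coordinate update depends on the most recent values of all the others, which is what makes this sampler inherently sequential and motivates parallel approximations in high dimensions.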
Findings of the WMT 2017 Biomedical Translation Shared Task
Title | Findings of the WMT 2017 Biomedical Translation Shared Task |
Authors | Antonio Jimeno Yepes, Aurélie Névéol, Mariana Neves, Karin Verspoor, Ondřej Bojar, Arthur Boyer, Cristian Grozea, Barry Haddow, Madeleine Kittner, Yvonne Lichtblau, Pavel Pecina, Roland Roller, Rudolf Rosa, Amy Siu, Philippe Thomas, Saskia Trescher |
Abstract | |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4719/ |
https://www.aclweb.org/anthology/W17-4719 | |
PWC | https://paperswithcode.com/paper/findings-of-the-wmt-2017-biomedical |
Repo | |
Framework | |
Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs
Title | Quasi-Second-Order Parsing for 1-Endpoint-Crossing, Pagenumber-2 Graphs |
Authors | Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan |
Abstract | We propose a new Maximum Subgraph algorithm for first-order parsing to 1-endpoint-crossing, pagenumber-2 graphs. Our algorithm has two characteristics: (1) it separates the construction for noncrossing edges and crossing edges; (2) in a single construction step, whether to create a new arc is deterministic. These two characteristics make our algorithm relatively easy to extend to incorporate crossing-sensitive second-order features. We then introduce a new algorithm for quasi-second-order parsing. Experiments demonstrate that second-order features are helpful for Maximum Subgraph parsing. |
Tasks | Dependency Parsing |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1003/ |
https://www.aclweb.org/anthology/D17-1003 | |
PWC | https://paperswithcode.com/paper/quasi-second-order-parsing-for-1-endpoint |
Repo | |
Framework | |
WSISA: Making Survival Prediction From Whole Slide Histopathological Images
Title | WSISA: Making Survival Prediction From Whole Slide Histopathological Images |
Authors | Xinliang Zhu, Jiawen Yao, Feiyun Zhu, Junzhou Huang |
Abstract | Image-based precision medicine techniques can be used to better treat cancer patients. However, the gigapixel resolution of Whole Slide Histopathological Images (WSIs) makes traditional survival models computationally impossible. These models usually adopt manually labeled discriminative patches from regions of interest (ROIs) and are unable to directly learn discriminative patches from WSIs. We argue that a small set of patches alone cannot fully represent the patients’ survival status due to tumor heterogeneity. Another challenge is that survival prediction usually comes with insufficient training patient samples. In this paper, we propose an effective Whole Slide Histopathological Images Survival Analysis framework (WSISA) to overcome the above challenges. To exploit survival-discriminative patterns from WSIs, we first extract hundreds of patches from each WSI by adaptive sampling and then group these images into different clusters. Then we propose to train an aggregation model to make patient-level predictions based on cluster-level Deep Convolutional Survival (DeepConvSurv) prediction results. Unlike existing state-of-the-art image-based survival models, which extract features using some patches from small regions of WSIs, the proposed framework can efficiently exploit and utilize all discriminative patterns in WSIs to predict patients’ survival status. To the best of our knowledge, this has not been shown before. We apply our method to the survival predictions of glioma and non-small-cell lung cancer using three datasets. Results demonstrate that the proposed framework can significantly improve the prediction performance compared with the existing state-of-the-art survival methods. |
Tasks | Survival Analysis |
Published | 2017-07-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2017/html/Zhu_WSISA_Making_Survival_CVPR_2017_paper.html |
http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhu_WSISA_Making_Survival_CVPR_2017_paper.pdf | |
PWC | https://paperswithcode.com/paper/wsisa-making-survival-prediction-from-whole |
Repo | |
Framework | |
Controlling Target Features in Neural Machine Translation via Prefix Constraints
Title | Controlling Target Features in Neural Machine Translation via Prefix Constraints |
Authors | Shunsuke Takeno, Masaaki Nagata, Kazuhide Yamamoto |
Abstract | We propose prefix constraints, a novel method to enforce constraints on target sentences in neural machine translation. It places a sequence of special tokens at the beginning of the target sentence (target prefix), while side constraints place a special token at the end of the source sentence (source suffix). Prefix constraints can be predicted from the source sentence jointly with the target sentence, while side constraints (Sennrich et al., 2016) must be provided by the user or predicted by some other method. In both methods, special tokens are designed to encode arbitrary target-side features or metatextual information. We show that prefix constraints are more flexible than side constraints and can be used to control the behavior of neural machine translation, in terms of output length, bidirectional decoding, domain adaptation, and unaligned target word generation. |
Tasks | Domain Adaptation, Machine Translation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/W17-5702/ |
https://www.aclweb.org/anthology/W17-5702 | |
PWC | https://paperswithcode.com/paper/controlling-target-features-in-neural-machine |
Repo | |
Framework | |
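The difference between the two constraint styles described in the abstract above comes down to where the special tokens go in the training data. The sketch below shows one way to prepare examples for each: prefix constraints prepend a sequence of special tokens to the target sentence, while a side constraint appends one special token to the source sentence. The function names, the `<name:value>` token format and the example feature names are all hypothetical; the abstract does not specify how the authors encode their tokens.

```python
def add_prefix_constraints(target_tokens, constraints):
    """Prepend one special token per (name, value) constraint to the target."""
    prefix = [f"<{name}:{value}>" for name, value in constraints]
    return prefix + list(target_tokens)


def add_side_constraint(source_tokens, name, value):
    """Append a single special token to the end of the source sentence."""
    return list(source_tokens) + [f"<{name}:{value}>"]


source = ["das", "ist", "gut"]
target = ["that", "is", "good"]

# Prefix constraints live on the target side, so a decoder can either be
# forced to start with them or generate them itself before the translation.
print(add_prefix_constraints(target, [("length", "short"), ("domain", "general")]))

# A side constraint lives on the source side and must be known at input time.
print(add_side_constraint(source, "length", "short"))
```

Because the prefix is part of the decoder's output sequence, leaving it unforced at inference time lets the model predict the constraint values from the source, which is the flexibility the abstract points to.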
Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation
Title | Unsupervised Does Not Mean Uninterpretable: The Case for Word Sense Induction and Disambiguation |
Authors | Alexander Panchenko, Eugen Ruppert, Stefano Faralli, Simone Paolo Ponzetto, Chris Biemann |
Abstract | The current trend in NLP is the use of highly opaque models, e.g. neural networks and word embeddings. While these models yield state-of-the-art results on a range of tasks, their drawback is poor interpretability. On the example of word sense induction and disambiguation (WSID), we show that it is possible to develop an interpretable model that matches the state-of-the-art models in accuracy. Namely, we present an unsupervised, knowledge-free WSID approach, which is interpretable at three levels: word sense inventory, sense feature representations, and disambiguation procedure. Experiments show that our model performs on par with state-of-the-art word sense embeddings and other unsupervised systems while offering the possibility to justify its decisions in human-readable form. |
Tasks | Word Embeddings, Word Sense Disambiguation, Word Sense Induction |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1009/ |
https://www.aclweb.org/anthology/E17-1009 | |
PWC | https://paperswithcode.com/paper/unsupervised-does-not-mean-uninterpretable |
Repo | |
Framework | |
Identifying the Provision of Choices in Privacy Policy Text
Title | Identifying the Provision of Choices in Privacy Policy Text |
Authors | Kanthashree Mysore Sathyendra, Shomir Wilson, Florian Schaub, Sebastian Zimmeck, Norman Sadeh |
Abstract | Websites' and mobile apps' privacy policies, written in natural language, tend to be long and difficult to understand. Information privacy revolves around the fundamental principle of notice and choice, namely the idea that users should be able to make informed decisions about what information about them can be collected and how it can be used. Internet users want control over their privacy, but their choices are often hidden in long and convoluted privacy policy texts. Moreover, little (if any) prior work has been done to detect the provision of choices in text. We address this challenge of enabling user choice by automatically identifying and extracting pertinent choice language in privacy policies. In particular, we present a two-stage architecture of classification models to identify opt-out choices in privacy policy text, labelling common varieties of choices with a mean F1 score of 0.735. Our techniques enable the creation of systems to help Internet users to learn about their choices, thereby effectuating notice and choice and improving Internet privacy. |
Tasks | Question Answering |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1294/ |
https://www.aclweb.org/anthology/D17-1294 | |
PWC | https://paperswithcode.com/paper/identifying-the-provision-of-choices-in |
Repo | |
Framework | |