Paper Group NANR 142
Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules
Title | Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules |
Authors | Xiaoshi Zhong, Aixin Sun, Erik Cambria |
Abstract | Extracting time expressions from free text is a fundamental task for many applications. We analyze the time expressions from four datasets and find that only a small group of words are used to express time information, and the words in time expressions demonstrate similar syntactic behaviour. Based on the findings, we propose a type-based approach, named SynTime, to recognize time expressions. Specifically, we define three main syntactic token types, namely time token, modifier, and numeral, to group time-related regular expressions over tokens. On the types we design general heuristic rules to recognize time expressions. In recognition, SynTime first identifies the time tokens from raw text, then searches their surroundings for modifiers and numerals to form time segments, and finally merges the time segments into time expressions. As a light-weight rule-based tagger, SynTime runs in real time, and can be easily expanded by simply adding keywords for the text of different types and of different domains. Experiments on benchmark datasets and tweets data show that SynTime outperforms state-of-the-art methods. |
Tasks | Information Retrieval |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1039/ |
https://www.aclweb.org/anthology/P17-1039 | |
PWC | https://paperswithcode.com/paper/time-expression-analysis-and-recognition |
Repo | |
Framework | |
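The three recognition steps named in the SynTime abstract (find time tokens, expand over neighbouring modifiers and numerals, merge segments) can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword sets `TIME_TOKENS` and `MODIFIERS`, the numeral pattern, and the function names are all hypothetical stand-ins for the regular expressions and heuristic rules the paper defines.

```python
import re

# Hypothetical, tiny keyword sets chosen for illustration only; the paper
# groups time-related regular expressions under each token type instead.
TIME_TOKENS = {"monday", "friday", "march", "week", "weeks", "year", "today"}
MODIFIERS = {"last", "next", "this", "early", "late", "ago"}


def token_type(token):
    """Map a token to TIME, MOD, NUM, or None (assumed three-type scheme)."""
    word = token.lower()
    if word in TIME_TOKENS:
        return "TIME"
    if word in MODIFIERS:
        return "MOD"
    if re.fullmatch(r"\d+(st|nd|rd|th)?", word):
        return "NUM"
    return None


def recognize(text):
    tokens = re.findall(r"\w+", text)
    types = [token_type(t) for t in tokens]

    # Steps 1 and 2: for each time token, expand left and right over
    # adjacent modifiers and numerals to form a time segment.
    segments = []
    for i, kind in enumerate(types):
        if kind != "TIME":
            continue
        start = i
        while start > 0 and types[start - 1] in ("MOD", "NUM"):
            start -= 1
        end = i
        while end + 1 < len(tokens) and types[end + 1] in ("MOD", "NUM"):
            end += 1
        segments.append((start, end))

    # Step 3: merge overlapping or directly adjacent segments.
    merged = []
    for start, end in segments:
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return [" ".join(tokens[s:e + 1]) for s, e in merged]


print(recognize("We met last Monday and will meet again in 2 weeks"))
# ['last Monday', '2 weeks']
```

Adding a keyword to one of the sets is the only change needed to cover a new expression in this sketch, which mirrors the abstract's claim that the tagger is expanded by adding keywords.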
A parallel collection of clinical trials in Portuguese and English
Title | A parallel collection of clinical trials in Portuguese and English |
Authors | Mariana Neves |
Abstract | Parallel collections of documents are crucial resources for training and evaluating machine translation (MT) systems. Even though large collections are available for certain domains and language pairs, these are still scarce in the biomedical domain. We developed a parallel corpus of clinical trials in Portuguese and English. The documents are derived from the Brazilian Clinical Trials Registry and the corpus currently contains a total of 1188 documents. In this paper, we describe the corpus construction and discuss the quality of the translation and the sentence alignment that we obtained. |
Tasks | Machine Translation |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2507/ |
https://www.aclweb.org/anthology/W17-2507 | |
PWC | https://paperswithcode.com/paper/a-parallel-collection-of-clinical-trials-in |
Repo | |
Framework | |
Dynamic Importance Sampling for Anytime Bounds of the Partition Function
Title | Dynamic Importance Sampling for Anytime Bounds of the Partition Function |
Authors | Qi Lou, Rina Dechter, Alexander T. Ihler |
Abstract | Computing the partition function is a key inference task in many graphical models. In this paper, we propose a dynamic importance sampling scheme that provides anytime finite-sample bounds for the partition function. Our algorithm balances the advantages of the three major inference strategies, heuristic search, variational bounds, and Monte Carlo methods, blending sampling with search to refine a variationally defined proposal. Our algorithm combines and generalizes recent work on anytime search and probabilistic bounds of the partition function. By using an intelligently chosen weighted average over the samples, we construct an unbiased estimator of the partition function with strong finite-sample confidence intervals that inherit both the rapid early improvement rate of sampling and the long-term benefits of an improved proposal from search. This gives significantly improved anytime behavior, and more flexible trade-offs between memory, time, and solution quality. We demonstrate the effectiveness of our approach empirically on real-world problem instances taken from recent UAI competitions. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6912-dynamic-importance-sampling-for-anytime-bounds-of-the-partition-function |
http://papers.nips.cc/paper/6912-dynamic-importance-sampling-for-anytime-bounds-of-the-partition-function.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-importance-sampling-for-anytime |
Repo | |
Framework | |
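For readers unfamiliar with the Monte Carlo ingredient mentioned in the abstract above, the following is a minimal sketch of plain importance sampling for a partition function. It is not the paper's dynamic scheme and it produces no confidence bounds; the chain model, the coupling value, and the uniform proposal are assumptions chosen so that the exact answer can be checked by enumeration.

```python
import itertools
import math
import random


def unnormalized(x, coupling=0.5):
    """Toy chain model: f(x) = exp(coupling * sum_i s_i * s_{i+1}), s in {-1, +1}."""
    spins = [2 * bit - 1 for bit in x]
    return math.exp(coupling * sum(a * b for a, b in zip(spins, spins[1:])))


def exact_partition(n):
    """Z = sum over all 2^n configurations (feasible only for small n)."""
    return sum(unnormalized(x) for x in itertools.product((0, 1), repeat=n))


def importance_estimate(n, num_samples, rng):
    """Unbiased estimate of Z: average of f(x) / q(x) with x drawn from q.

    Here q is the uniform distribution over {0, 1}^n (an assumption; a
    better proposal would reduce the variance of the weights).
    """
    q = 1.0 / 2 ** n
    total = 0.0
    for _ in range(num_samples):
        x = tuple(rng.randint(0, 1) for _ in range(n))
        total += unnormalized(x) / q
    return total / num_samples


rng = random.Random(0)
print(exact_partition(4), importance_estimate(4, 20000, rng))
```

The estimator is unbiased for any proposal that covers the support of `f`; the quality of the proposal only affects the variance, which is the quantity the abstract says search is used to improve.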
Identification of Risk Factors in Clinical Texts through Association Rules
Title | Identification of Risk Factors in Clinical Texts through Association Rules |
Authors | Svetla Boytcheva, Ivelina Nikolova, Galia Angelova, Zhivko Angelov |
Abstract | We describe a method which extracts Association Rules from texts in order to recognise verbalisations of risk factors. Usually some basic vocabulary about risk factors is known but medical conditions are expressed in clinical narratives with much higher variety. We propose an approach for data-driven learning of specialised medical vocabulary which, once collected, enables early alerting of potentially affected patients. The method is illustrated by experiments with clinical records of patients with Chronic Obstructive Pulmonary Disease (COPD) and comorbidity of CORD, Diabetes Mellitus and Schizophrenia. Our input data come from the Bulgarian Diabetic Register, which is built using a pseudonymised collection of outpatient records for about 500,000 diabetic patients. The generated Association Rules for CORD are analysed in the context of demographic, gender, and age information. Valuable amounts of meaningful words, signalling risk factors, are discovered with high precision and confidence. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-8009/ |
https://doi.org/10.26615/978-954-452-044-1_009 | |
PWC | https://paperswithcode.com/paper/identification-of-risk-factors-in-clinical |
Repo | |
Framework | |
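The support and confidence measures behind association rules, which the abstract above relies on, can be sketched as follows. The mining procedure (pairs of single terms only), the thresholds, and the toy records are illustrative assumptions; they are invented for this example and do not come from the Bulgarian Diabetic Register or from the paper.

```python
from itertools import combinations


def mine_rules(transactions, min_support, min_confidence):
    """Return rules (lhs, rhs, support, confidence) between single terms.

    support(A -> B)    = fraction of records containing both A and B
    confidence(A -> B) = support(A and B) / support(A)
    """
    n = len(transactions)
    sets = [set(t) for t in transactions]
    items = sorted(set().union(*sets))

    def support(itemset):
        return sum(1 for s in sets if itemset <= s) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Invented toy records: each one is the set of terms found in a narrative.
records = [
    ["smoking", "cough", "copd"],
    ["smoking", "copd"],
    ["cough"],
    ["smoking", "cough", "copd"],
]
for rule in mine_rules(records, min_support=0.5, min_confidence=0.9):
    print(rule)
```

On these toy records only the two rules linking "smoking" and "copd" pass both thresholds, each with support 0.75 and confidence 1.0.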
Stochastic Mirror Descent in Variationally Coherent Optimization Problems
Title | Stochastic Mirror Descent in Variationally Coherent Optimization Problems |
Authors | Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Stephen Boyd, Peter W. Glynn |
Abstract | In this paper, we examine a class of non-convex stochastic optimization problems which we call variationally coherent, and which properly includes pseudo-/quasiconvex and star-convex optimization problems. To solve such problems, we focus on the widely used stochastic mirror descent (SMD) family of algorithms (which contains stochastic gradient descent as a special case), and we show that the last iterate of SMD converges to the problem’s solution set with probability 1. This result contributes to the landscape of non-convex stochastic optimization by clarifying that neither pseudo-/quasi-convexity nor star-convexity is essential for (almost sure) global convergence; rather, variational coherence, a much weaker requirement, suffices. Characterization of convergence rates for the subclass of strongly variationally coherent optimization problems as well as simulation results are also presented. |
Tasks | Stochastic Optimization |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7279-stochastic-mirror-descent-in-variationally-coherent-optimization-problems |
http://papers.nips.cc/paper/7279-stochastic-mirror-descent-in-variationally-coherent-optimization-problems.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-mirror-descent-in-variationally |
Repo | |
Framework | |
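As a concrete instance of the stochastic mirror descent family discussed in the abstract above, here is a minimal sketch using the entropic mirror map on the probability simplex (the exponentiated-gradient update). The objective, the noise model, and the step-size schedule `t ** -0.6` are assumptions made for illustration; the toy objective is linear, hence convex, so it says nothing about the non-convex cases the paper targets.

```python
import math
import random


def softmax(scores):
    """Mirror step for the entropic regularizer: map dual scores to the simplex."""
    top = max(scores)
    exps = [math.exp(v - top) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]


def smd_simplex(cost, steps, rng, noise=0.5):
    """Minimize E[<cost + noise, x>] over the simplex with entropic SMD.

    Each iteration takes a noisy gradient, moves the dual variable y against
    it, and maps y back to the simplex. Replacing softmax with a Euclidean
    projection would give projected stochastic gradient descent instead.
    """
    y = [0.0] * len(cost)
    x = softmax(y)
    for t in range(1, steps + 1):
        grad = [c + rng.gauss(0.0, noise) for c in cost]
        step = t ** -0.6  # assumed schedule: not summable, but square-summable
        y = [yi - step * g for yi, g in zip(y, grad)]
        x = softmax(y)
    return x


x = smd_simplex([1.0, 0.2, 0.6], steps=5000, rng=random.Random(0))
print([round(v, 4) for v in x])
```

The value returned is the last iterate rather than an average of iterates, which is the quantity the abstract's convergence statement concerns.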
Proceedings of the Events and Stories in the News Workshop
Title | Proceedings of the Events and Stories in the News Workshop |
Authors | |
Abstract | |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2700/ |
https://www.aclweb.org/anthology/W17-2700 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-events-and-stories-in-the |
Repo | |
Framework | |
Automatic Measures to Characterise Verbal Alignment in Human-Agent Interaction
Title | Automatic Measures to Characterise Verbal Alignment in Human-Agent Interaction |
Authors | Guillaume Dubuisson Duplessis, Chloé Clavel, Frédéric Landragin |
Abstract | This work aims at characterising verbal alignment processes for improving virtual agent communicative capabilities. We propose computationally inexpensive measures of verbal alignment based on expression repetition in dyadic textual dialogues. Using these measures, we present a contrastive study between Human-Human and Human-Agent dialogues on a negotiation task. We exhibit quantitative differences in the strength and orientation of verbal alignment showing the ability of our approach to characterise important aspects of verbal alignment. |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-5510/ |
https://www.aclweb.org/anthology/W17-5510 | |
PWC | https://paperswithcode.com/paper/automatic-measures-to-characterise-verbal |
Repo | |
Framework | |
Weakly supervised learning of allomorphy
Title | Weakly supervised learning of allomorphy |
Authors | Miikka Silfverberg, Mans Hulden |
Abstract | Most NLP resources that offer annotations at the word segment level provide morphological annotation that includes features indicating tense, aspect, modality, gender, case, and other inflectional information. Such information is rarely aligned to the relevant parts of the words, i.e. the allomorphs, as such annotation would be very costly. These unaligned weak labelings are commonly provided by annotated NLP corpora such as treebanks in various languages. Although they lack alignment information, the presence/absence of labels at the word level is also consistent with the amount of supervision assumed to be provided to L1 and L2 learners. In this paper, we explore several methods to learn this latent alignment between parts of word forms and the grammatical information provided. All the methods under investigation favor hypotheses regarding allomorphs of morphemes that re-use a small inventory, i.e. implicitly minimize the number of allomorphs that a morpheme can be realized as. We show that the provided information offers a significant advantage for both word segmentation and the learning of allomorphy. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4107/ |
https://www.aclweb.org/anthology/W17-4107 | |
PWC | https://paperswithcode.com/paper/weakly-supervised-learning-of-allomorphy |
Repo | |
Framework | |
From Clickbait to Fake News Detection: An Approach based on Detecting the Stance of Headlines to Articles
Title | From Clickbait to Fake News Detection: An Approach based on Detecting the Stance of Headlines to Articles |
Authors | Peter Bourgonje, Julian Moreno Schneider, Georg Rehm |
Abstract | We present a system for the detection of the stance of headlines with regard to their corresponding article bodies. The approach can be applied in fake news, especially clickbait detection scenarios. The component is part of a larger platform for the curation of digital content; we consider veracity and relevancy an increasingly important part of curating online information. We want to contribute to the debate on how to deal with fake news and related online phenomena with technological means, by providing means to separate related from unrelated headlines and further classifying the related headlines. On a publicly available data set annotated for the stance of headlines with regard to their corresponding article bodies, we achieve a (weighted) accuracy score of 89.59. |
Tasks | Clickbait Detection, Fake News Detection, Rumour Detection |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4215/ |
https://www.aclweb.org/anthology/W17-4215 | |
PWC | https://paperswithcode.com/paper/from-clickbait-to-fake-news-detection-an |
Repo | |
Framework | |
"i have a feeling trump will win………………": Forecasting Winners and Losers from User Predictions on Twitter
Title | "i have a feeling trump will win………………": Forecasting Winners and Losers from User Predictions on Twitter |
Authors | Sandesh Swamy, Alan Ritter, Marie-Catherine de Marneffe |
Abstract | Social media users often make explicit predictions about upcoming events. Such statements vary in the degree of certainty the author expresses toward the outcome: "Leonardo DiCaprio will win Best Actor" vs. "Leonardo DiCaprio may win" or "No way Leonardo wins!". Can popular beliefs on social media predict who will win? To answer this question, we build a corpus of tweets annotated for veridicality on which we train a log-linear classifier that detects positive veridicality with high precision. We then forecast uncertain outcomes using the wisdom of crowds, by aggregating users' explicit predictions. Our method for forecasting winners is fully automated, relying only on a set of contenders as input. It requires no training data of past outcomes and outperforms sentiment and tweet volume baselines on a broad range of contest prediction tasks. We further demonstrate how our approach can be used to measure the reliability of individual accounts' predictions and retrospectively identify surprise outcomes. |
Tasks | Sentiment Analysis |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1166/ |
https://www.aclweb.org/anthology/D17-1166 | |
PWC | https://paperswithcode.com/paper/i-have-a-feeling-trump-will-win-forecasting-1 |
Repo | |
Framework | |
Instances and concepts in distributional space
Title | Instances and concepts in distributional space |
Authors | Gemma Boleda, Abhijeet Gupta, Sebastian Padó |
Abstract | Instances ("Mozart") are ontologically distinct from concepts or classes ("composer"). Natural language encompasses both, but instances have received comparatively little attention in distributional semantics. Our results show that instances and concepts differ in their distributional properties. We also establish that instantiation detection ("Mozart – composer") is generally easier than hypernymy detection ("chemist – scientist"), and that results on the influence of input representation do not transfer from hyponymy to instantiation. |
Tasks | |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2013/ |
https://www.aclweb.org/anthology/E17-2013 | |
PWC | https://paperswithcode.com/paper/instances-and-concepts-in-distributional |
Repo | |
Framework | |
Proceedings of the 4th Workshop on Asian Translation (WAT2017)
Title | Proceedings of the 4th Workshop on Asian Translation (WAT2017) |
Authors | |
Abstract | |
Tasks | |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/W17-5700/ |
https://www.aclweb.org/anthology/W17-5700 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-4th-workshop-on-asian |
Repo | |
Framework | |
Creating POS Tagging and Dependency Parsing Experts via Topic Modeling
Title | Creating POS Tagging and Dependency Parsing Experts via Topic Modeling |
Authors | Atreyee Mukherjee, Sandra Kübler, Matthias Scheutz |
Abstract | Part of speech (POS) taggers and dependency parsers tend to work well on homogeneous datasets but their performance suffers on datasets containing data from different genres. In our current work, we investigate how to create POS tagging and dependency parsing experts for heterogeneous data by employing topic modeling. We create topic models (using Latent Dirichlet Allocation) to determine genres from a heterogeneous dataset and then train an expert for each of the genres. Our results show that the topic modeling experts reach substantial improvements when compared to the general versions. For dependency parsing, the improvement reaches 2 percentage points over the full training baseline when we use two topics. |
Tasks | Dependency Parsing, Domain Adaptation, Topic Models |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1033/ |
https://www.aclweb.org/anthology/E17-1033 | |
PWC | https://paperswithcode.com/paper/creating-pos-tagging-and-dependency-parsing |
Repo | |
Framework | |
MainiwayAI at IJCNLP-2017 Task 2: Ensembles of Deep Architectures for Valence-Arousal Prediction
Title | MainiwayAI at IJCNLP-2017 Task 2: Ensembles of Deep Architectures for Valence-Arousal Prediction |
Authors | Yassine Benajiba, Jin Sun, Yong Zhang, Zhiliang Weng, Or Biran |
Abstract | This paper introduces Mainiway AI Labs' submitted system for the IJCNLP 2017 shared task on Dimensional Sentiment Analysis of Chinese Phrases (DSAP), and related experiments. Our approach consists of deep neural networks with various architectures, and our best system is a voted ensemble of networks. We achieve a Mean Absolute Error of 0.64 in valence prediction and 0.68 in arousal prediction on the test set, both placing us as the 5th ranked team in the competition. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/I17-4019/ |
https://www.aclweb.org/anthology/I17-4019 | |
PWC | https://paperswithcode.com/paper/mainiwayai-at-ijcnlp-2017-task-2-ensembles-of |
Repo | |
Framework | |
Experiments in taxonomy induction in Spanish and French
Title | Experiments in taxonomy induction in Spanish and French |
Authors | Irene Renau, Rogelio Nazar, Rafael Marín |
Abstract | |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-7008/ |
https://www.aclweb.org/anthology/W17-7008 | |
PWC | https://paperswithcode.com/paper/experiments-in-taxonomy-induction-in-spanish |
Repo | |
Framework | |