July 26, 2019

Paper Group NANR 142

Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules


Title	Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules
Authors	Xiaoshi Zhong, Aixin Sun, Erik Cambria
Abstract	Extracting time expressions from free text is a fundamental task for many applications. We analyze the time expressions from four datasets and find that only a small group of words are used to express time information, and the words in time expressions demonstrate similar syntactic behaviour. Based on the findings, we propose a type-based approach, named SynTime, to recognize time expressions. Specifically, we define three main syntactic token types, namely time token, modifier, and numeral, to group time-related regular expressions over tokens. On the types we design general heuristic rules to recognize time expressions. In recognition, SynTime first identifies the time tokens from raw text, then searches their surroundings for modifiers and numerals to form time segments, and finally merges the time segments into time expressions. As a light-weight rule-based tagger, SynTime runs in real time, and can be easily expanded by simply adding keywords for the text of different types and of different domains. Experiments on benchmark datasets and tweets data show that SynTime outperforms state-of-the-art methods.
Tasks	Information Retrieval
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-1039/
PDF	https://www.aclweb.org/anthology/P17-1039
PWC	https://paperswithcode.com/paper/time-expression-analysis-and-recognition
Repo
Framework
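
The three recognition steps in the abstract (find time tokens, expand over neighbouring modifiers and numerals, merge segments) can be sketched roughly as follows. This is only an illustration under strong assumptions: the keyword sets, the `token_type` and `recognize` names, and the merge rule are hypothetical and much simpler than whatever SynTime actually uses.

```python
import re

# Hypothetical, heavily reduced keyword lists (assumption, not SynTime's lexicon).
TIME_TOKENS = {"monday", "tuesday", "january", "march", "week", "year",
               "today", "yesterday", "morning"}
MODIFIERS = {"last", "next", "this", "early", "late", "ago"}


def token_type(token):
    """Map a token to one of the three syntactic types, or None."""
    word = token.lower()
    if word in TIME_TOKENS:
        return "TIME"
    if word in MODIFIERS:
        return "MOD"
    if re.fullmatch(r"\d+", word):
        return "NUM"
    return None


def recognize(text):
    """Return the time expressions found in text as strings."""
    tokens = re.findall(r"\w+", text)
    types = [token_type(t) for t in tokens]

    # Steps 1 and 2: grow a segment around each time token over
    # directly adjacent modifiers and numerals.
    segments = []
    for i, kind in enumerate(types):
        if kind != "TIME":
            continue
        start = end = i
        while start > 0 and types[start - 1] in ("MOD", "NUM"):
            start -= 1
        while end + 1 < len(tokens) and types[end + 1] in ("MOD", "NUM"):
            end += 1
        segments.append((start, end))

    # Step 3: merge segments that overlap or touch each other.
    merged = []
    for start, end in segments:
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return [" ".join(tokens[s:e + 1]) for s, e in merged]
```

Extending the sketch to another domain would amount to adding entries to the two keyword sets, which mirrors the "simply adding keywords" claim in the abstract.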

A parallel collection of clinical trials in Portuguese and English


Title	A parallel collection of clinical trials in Portuguese and English
Authors	Mariana Neves
Abstract	Parallel collections of documents are crucial resources for training and evaluating machine translation (MT) systems. Even though large collections are available for certain domains and language pairs, these are still scarce in the biomedical domain. We developed a parallel corpus of clinical trials in Portuguese and English. The documents are derived from the Brazilian Clinical Trials Registry and the corpus currently contains a total of 1188 documents. In this paper, we describe the corpus construction and discuss the quality of the translation and the sentence alignment that we obtained.
Tasks	Machine Translation
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2507/
PDF	https://www.aclweb.org/anthology/W17-2507
PWC	https://paperswithcode.com/paper/a-parallel-collection-of-clinical-trials-in
Repo
Framework

Dynamic Importance Sampling for Anytime Bounds of the Partition Function


Title	Dynamic Importance Sampling for Anytime Bounds of the Partition Function
Authors	Qi Lou, Rina Dechter, Alexander T. Ihler
Abstract	Computing the partition function is a key inference task in many graphical models. In this paper, we propose a dynamic importance sampling scheme that provides anytime finite-sample bounds for the partition function. Our algorithm balances the advantages of the three major inference strategies, heuristic search, variational bounds, and Monte Carlo methods, blending sampling with search to refine a variationally defined proposal. Our algorithm combines and generalizes recent work on anytime search and probabilistic bounds of the partition function. By using an intelligently chosen weighted average over the samples, we construct an unbiased estimator of the partition function with strong finite-sample confidence intervals that inherit both the rapid early improvement rate of sampling and the long-term benefits of an improved proposal from search. This gives significantly improved anytime behavior, and more flexible trade-offs between memory, time, and solution quality. We demonstrate the effectiveness of our approach empirically on real-world problem instances taken from recent UAI competitions.
Tasks
Published	2017-12-01
URL	http://papers.nips.cc/paper/6912-dynamic-importance-sampling-for-anytime-bounds-of-the-partition-function
PDF	http://papers.nips.cc/paper/6912-dynamic-importance-sampling-for-anytime-bounds-of-the-partition-function.pdf
PWC	https://paperswithcode.com/paper/dynamic-importance-sampling-for-anytime
Repo
Framework
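
As background for the abstract above, the following sketch shows plain importance sampling for a partition function, which is the basic unbiased estimator that the paper's scheme builds on. It is not the dynamic algorithm itself: the toy chain model, the uniform proposal, and all function names are assumptions made for illustration.

```python
import itertools
import math
import random


def unnormalized(x):
    """Unnormalized weight of a binary chain (hypothetical toy potentials)."""
    return math.exp(sum(0.5 if a == b else -0.5 for a, b in zip(x, x[1:])))


def exact_partition(n=4):
    """Brute-force partition function, feasible only for tiny models."""
    return sum(unnormalized(x) for x in itertools.product((0, 1), repeat=n))


def importance_estimate(num_samples, n=4, seed=0):
    """Average of f(x) / q(x) with x drawn from a uniform proposal q."""
    rng = random.Random(seed)
    q = 1.0 / 2 ** n  # probability of any configuration under the proposal
    total = 0.0
    for _ in range(num_samples):
        x = tuple(rng.randint(0, 1) for _ in range(n))
        total += unnormalized(x) / q
    return total / num_samples
```

Because each term f(x)/q(x) has expectation Z under q, the running average can be reported at any time, and a better proposal reduces its variance.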

Identification of Risk Factors in Clinical Texts through Association Rules


Title	Identification of Risk Factors in Clinical Texts through Association Rules
Authors	Svetla Boytcheva, Ivelina Nikolova, Galia Angelova, Zhivko Angelov
Abstract	We describe a method which extracts Association Rules from texts in order to recognise verbalisations of risk factors. Usually some basic vocabulary about risk factors is known but medical conditions are expressed in clinical narratives with much higher variety. We propose an approach for data-driven learning of specialised medical vocabulary which, once collected, enables early alerting of potentially affected patients. The method is illustrated by experiments with clinical records of patients with Chronic Obstructive Pulmonary Disease (COPD) and comorbidity of COPD, Diabetes Mellitus and Schizophrenia. Our input data come from the Bulgarian Diabetic Register, which is built using a pseudonymised collection of outpatient records for about 500,000 diabetic patients. The generated Association Rules for COPD are analysed in the context of demographic, gender, and age information. Valuable amounts of meaningful words, signalling risk factors, are discovered with high precision and confidence.
Tasks
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-8009/
PDF	https://doi.org/10.26615/978-954-452-044-1_009
PWC	https://paperswithcode.com/paper/identification-of-risk-factors-in-clinical
Repo
Framework
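
For readers unfamiliar with association rules, the two standard quantities behind a rule "antecedent implies consequent" are support and confidence. The sketch below computes both over sets of words; the example records are made-up placeholders, not data from the paper, and the function names are assumptions.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / support(transactions, antecedent)


# Made-up word sets standing in for tokenised records (illustration only).
records = [
    {"smoker", "cough", "copd"},
    {"smoker", "copd"},
    {"cough"},
    {"smoker", "cough"},
]
rule_confidence = confidence(records, {"smoker"}, {"copd"})
```

A rule is typically kept only if both its support and its confidence exceed chosen thresholds.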

Stochastic Mirror Descent in Variationally Coherent Optimization Problems


Title	Stochastic Mirror Descent in Variationally Coherent Optimization Problems
Authors	Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Stephen Boyd, Peter W. Glynn
Abstract	In this paper, we examine a class of non-convex stochastic optimization problems which we call variationally coherent, and which properly includes pseudo-/quasiconvex and star-convex optimization problems. To solve such problems, we focus on the widely used stochastic mirror descent (SMD) family of algorithms (which contains stochastic gradient descent as a special case), and we show that the last iterate of SMD converges to the problem’s solution set with probability 1. This result contributes to the landscape of non-convex stochastic optimization by clarifying that neither pseudo-/quasi-convexity nor star-convexity is essential for (almost sure) global convergence; rather, variational coherence, a much weaker requirement, suffices. Characterization of convergence rates for the subclass of strongly variationally coherent optimization problems as well as simulation results are also presented.
Tasks	Stochastic Optimization
Published	2017-12-01
URL	http://papers.nips.cc/paper/7279-stochastic-mirror-descent-in-variationally-coherent-optimization-problems
PDF	http://papers.nips.cc/paper/7279-stochastic-mirror-descent-in-variationally-coherent-optimization-problems.pdf
PWC	https://paperswithcode.com/paper/stochastic-mirror-descent-in-variationally
Repo
Framework
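
To make the SMD family concrete, here is a minimal sketch of stochastic mirror descent with the entropy mirror map on the probability simplex, where the mirror step becomes a multiplicative update followed by renormalisation. The quadratic toy objective, the noise level, the step-size schedule, and the name `entropic_smd` are all assumptions for illustration; the objective is convex, so it does not exercise the non-convex setting the paper analyses.

```python
import math
import random


def entropic_smd(target, steps=3000, noise=0.05, seed=0):
    """Minimise 0.5 * ||x - target||^2 over the simplex from noisy gradients."""
    rng = random.Random(seed)
    dim = len(target)
    x = [1.0 / dim] * dim  # start at the centre of the simplex
    for t in range(1, steps + 1):
        # Stochastic gradient: true gradient plus zero-mean Gaussian noise.
        grad = [xi - ti + rng.gauss(0.0, noise) for xi, ti in zip(x, target)]
        eta = 1.0 / math.sqrt(t)  # decreasing step size
        # Mirror step for the entropy mirror map: multiplicative update,
        # then renormalise so the iterate stays on the simplex.
        weights = [xi * math.exp(-eta * gi) for xi, gi in zip(x, grad)]
        total = sum(weights)
        x = [w / total for w in weights]
    return x  # the last iterate, not an average
```

Replacing the entropy mirror map with the squared Euclidean norm would turn the update into projected stochastic gradient descent, the special case mentioned in the abstract.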

Proceedings of the Events and Stories in the News Workshop


Title	Proceedings of the Events and Stories in the News Workshop
Authors
Abstract
Tasks
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2700/
PDF	https://www.aclweb.org/anthology/W17-2700
PWC	https://paperswithcode.com/paper/proceedings-of-the-events-and-stories-in-the
Repo
Framework

Automatic Measures to Characterise Verbal Alignment in Human-Agent Interaction


Title	Automatic Measures to Characterise Verbal Alignment in Human-Agent Interaction
Authors	Guillaume Dubuisson Duplessis, Chloé Clavel, Frédéric Landragin
Abstract	This work aims at characterising verbal alignment processes for improving virtual agent communicative capabilities. We propose computationally inexpensive measures of verbal alignment based on expression repetition in dyadic textual dialogues. Using these measures, we present a contrastive study between Human-Human and Human-Agent dialogues on a negotiation task. We exhibit quantitative differences in the strength and orientation of verbal alignment showing the ability of our approach to characterise important aspects of verbal alignment.
Tasks
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-5510/
PDF	https://www.aclweb.org/anthology/W17-5510
PWC	https://paperswithcode.com/paper/automatic-measures-to-characterise-verbal
Repo
Framework
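
One inexpensive way to operationalise "expression repetition" is to collect word n-grams that both speakers use and record who used each one first. The sketch below does that; it is a hypothetical simplification, not the measures defined in the paper, and the names `ngrams` and `shared_expressions` are assumptions.

```python
from collections import defaultdict


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def shared_expressions(dialogue, n=2):
    """Map each n-gram used by more than one speaker to its first user.

    dialogue is a list of (speaker, utterance) pairs in chronological order.
    """
    first_user = {}
    users = defaultdict(set)
    for speaker, utterance in dialogue:
        for gram in ngrams(utterance.lower().split(), n):
            first_user.setdefault(gram, speaker)  # remember who introduced it
            users[gram].add(speaker)
    return {g: first_user[g] for g, who in users.items() if len(who) > 1}
```

The number of shared expressions gives a rough notion of alignment strength, and the split of initiators between the two speakers gives a rough notion of orientation.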

Weakly supervised learning of allomorphy


Title	Weakly supervised learning of allomorphy
Authors	Miikka Silfverberg, Mans Hulden
Abstract	Most NLP resources that offer annotations at the word segment level provide morphological annotation that includes features indicating tense, aspect, modality, gender, case, and other inflectional information. Such information is rarely aligned to the relevant parts of the words, i.e. the allomorphs, as such annotation would be very costly. These unaligned weak labelings are commonly provided by annotated NLP corpora such as treebanks in various languages. Although they lack alignment information, the presence/absence of labels at the word level is also consistent with the amount of supervision assumed to be provided to L1 and L2 learners. In this paper, we explore several methods to learn this latent alignment between parts of word forms and the grammatical information provided. All the methods under investigation favor hypotheses regarding allomorphs of morphemes that re-use a small inventory, i.e. implicitly minimize the number of allomorphs that a morpheme can be realized as. We show that the provided information offers a significant advantage for both word segmentation and the learning of allomorphy.
Tasks
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4107/
PDF	https://www.aclweb.org/anthology/W17-4107
PWC	https://paperswithcode.com/paper/weakly-supervised-learning-of-allomorphy
Repo
Framework

From Clickbait to Fake News Detection: An Approach based on Detecting the Stance of Headlines to Articles


Title	From Clickbait to Fake News Detection: An Approach based on Detecting the Stance of Headlines to Articles
Authors	Peter Bourgonje, Julian Moreno Schneider, Georg Rehm
Abstract	We present a system for the detection of the stance of headlines with regard to their corresponding article bodies. The approach can be applied in fake news, especially clickbait detection scenarios. The component is part of a larger platform for the curation of digital content; we consider veracity and relevancy an increasingly important part of curating online information. We want to contribute to the debate on how to deal with fake news and related online phenomena with technological means, by providing means to separate related from unrelated headlines and further classifying the related headlines. On a publicly available data set annotated for the stance of headlines with regard to their corresponding article bodies, we achieve a (weighted) accuracy score of 89.59.
Tasks	Clickbait Detection, Fake News Detection, Rumour Detection
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4215/
PDF	https://www.aclweb.org/anthology/W17-4215
PWC	https://paperswithcode.com/paper/from-clickbait-to-fake-news-detection-an
Repo
Framework

"i have a feeling trump will win………………": Forecasting Winners and Losers from User Predictions on Twitter


Title	"i have a feeling trump will win………………": Forecasting Winners and Losers from User Predictions on Twitter
Authors	Sandesh Swamy, Alan Ritter, Marie-Catherine de Marneffe
Abstract	Social media users often make explicit predictions about upcoming events. Such statements vary in the degree of certainty the author expresses toward the outcome: "Leonardo DiCaprio will win Best Actor" vs. "Leonardo DiCaprio may win" or "No way Leonardo wins!". Can popular beliefs on social media predict who will win? To answer this question, we build a corpus of tweets annotated for veridicality on which we train a log-linear classifier that detects positive veridicality with high precision. We then forecast uncertain outcomes using the wisdom of crowds, by aggregating users' explicit predictions. Our method for forecasting winners is fully automated, relying only on a set of contenders as input. It requires no training data of past outcomes and outperforms sentiment and tweet volume baselines on a broad range of contest prediction tasks. We further demonstrate how our approach can be used to measure the reliability of individual accounts' predictions and retrospectively identify surprise outcomes.
Tasks	Sentiment Analysis
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1166/
PDF	https://www.aclweb.org/anthology/D17-1166
PWC	https://paperswithcode.com/paper/i-have-a-feeling-trump-will-win-forecasting-1
Repo
Framework
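
The aggregation step described in the abstract can be sketched as a simple vote count over tweets that a classifier has already tagged. Everything here is an assumption for illustration: the `forecast` name, the input format of (contender, is_positive) pairs, and the share-based score are hypothetical and may differ from the scoring used in the paper.

```python
def forecast(contenders, tagged_tweets):
    """Pick the contender with the largest share of positive predictions.

    tagged_tweets is an iterable of (contender, is_positive) pairs, where
    is_positive stands for the output of a veridicality classifier.
    """
    votes = {name: 0 for name in contenders}
    for name, is_positive in tagged_tweets:
        if is_positive and name in votes:
            votes[name] += 1
    total = sum(votes.values())
    # Share of all positive predictions that went to each contender.
    shares = {name: (votes[name] / total if total else 0.0)
              for name in contenders}
    winner = max(contenders, key=lambda name: shares[name])
    return winner, shares
```

Only the list of contenders is needed as input besides the tweets, which matches the abstract's point that no training data of past outcomes is required for the forecasting step.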

Instances and concepts in distributional space


Title	Instances and concepts in distributional space
Authors	Gemma Boleda, Abhijeet Gupta, Sebastian Padó
Abstract	Instances ("Mozart") are ontologically distinct from concepts or classes ("composer"). Natural language encompasses both, but instances have received comparatively little attention in distributional semantics. Our results show that instances and concepts differ in their distributional properties. We also establish that instantiation detection ("Mozart – composer") is generally easier than hypernymy detection ("chemist – scientist"), and that results on the influence of input representation do not transfer from hyponymy to instantiation.
Tasks
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-2013/
PDF	https://www.aclweb.org/anthology/E17-2013
PWC	https://paperswithcode.com/paper/instances-and-concepts-in-distributional
Repo
Framework

Proceedings of the 4th Workshop on Asian Translation (WAT2017)


Title	Proceedings of the 4th Workshop on Asian Translation (WAT2017)
Authors
Abstract
Tasks
Published	2017-11-01
URL	https://www.aclweb.org/anthology/W17-5700/
PDF	https://www.aclweb.org/anthology/W17-5700
PWC	https://paperswithcode.com/paper/proceedings-of-the-4th-workshop-on-asian
Repo
Framework

Creating POS Tagging and Dependency Parsing Experts via Topic Modeling


Title	Creating POS Tagging and Dependency Parsing Experts via Topic Modeling
Authors	Atreyee Mukherjee, Sandra Kübler, Matthias Scheutz
Abstract	Part of speech (POS) taggers and dependency parsers tend to work well on homogeneous datasets but their performance suffers on datasets containing data from different genres. In our current work, we investigate how to create POS tagging and dependency parsing experts for heterogeneous data by employing topic modeling. We create topic models (using Latent Dirichlet Allocation) to determine genres from a heterogeneous dataset and then train an expert for each of the genres. Our results show that the topic modeling experts reach substantial improvements when compared to the general versions. For dependency parsing, the improvement reaches 2 percentage points over the full training baseline when we use two topics.
Tasks	Dependency Parsing, Domain Adaptation, Topic Models
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-1033/
PDF	https://www.aclweb.org/anthology/E17-1033
PWC	https://paperswithcode.com/paper/creating-pos-tagging-and-dependency-parsing
Repo
Framework

MainiwayAI at IJCNLP-2017 Task 2: Ensembles of Deep Architectures for Valence-Arousal Prediction


Title	MainiwayAI at IJCNLP-2017 Task 2: Ensembles of Deep Architectures for Valence-Arousal Prediction
Authors	Yassine Benajiba, Jin Sun, Yong Zhang, Zhiliang Weng, Or Biran
Abstract	This paper introduces Mainiway AI Labs' submitted system for the IJCNLP 2017 shared task on Dimensional Sentiment Analysis of Chinese Phrases (DSAP), and related experiments. Our approach consists of deep neural networks with various architectures, and our best system is a voted ensemble of networks. We achieve a Mean Absolute Error of 0.64 in valence prediction and 0.68 in arousal prediction on the test set, both placing us as the 5th ranked team in the competition.
Tasks	Sentiment Analysis, Word Embeddings
Published	2017-12-01
URL	https://www.aclweb.org/anthology/I17-4019/
PDF	https://www.aclweb.org/anthology/I17-4019
PWC	https://paperswithcode.com/paper/mainiwayai-at-ijcnlp-2017-task-2-ensembles-of
Repo
Framework
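
The evaluation metric named in the abstract, Mean Absolute Error, and one simple way to combine several regressors can be sketched as below. The combination shown is plain averaging of per-model predictions, which is an assumption and may differ from the "voted ensemble" the authors built; the numbers in the usage lines are made up.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between gold and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def ensemble_average(model_outputs):
    """Average the predictions of several models, item by item."""
    return [sum(column) / len(column) for column in zip(*model_outputs)]


# Made-up valence scores for three phrases and two hypothetical models.
gold = [5.0, 3.0, 7.0]
outputs = [[4.0, 3.0, 8.0], [6.0, 4.0, 8.0]]
combined = ensemble_average(outputs)
error = mean_absolute_error(gold, combined)
```

Lower MAE is better, so the 0.64 and 0.68 figures in the abstract are average distances from the gold valence and arousal scores.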

Experiments in taxonomy induction in Spanish and French


Title	Experiments in taxonomy induction in Spanish and French
Authors	Irene Renau, Rogelio Nazar, Rafael Marín
Abstract
Tasks
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-7008/
PDF	https://www.aclweb.org/anthology/W17-7008
PWC	https://paperswithcode.com/paper/experiments-in-taxonomy-induction-in-spanish
Repo
Framework