July 26, 2019


Paper Group NANR 142


Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules. A parallel collection of clinical trials in Portuguese and English. Dynamic Importance Sampling for Anytime Bounds of the Partition Function. Identification of Risk Factors in Clinical Texts through Association Rules. Stochastic Mirror Descent in Varia …

Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules

Title Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules
Authors Xiaoshi Zhong, Aixin Sun, Erik Cambria
Abstract Extracting time expressions from free text is a fundamental task for many applications. We analyze the time expressions from four datasets and find that only a small group of words are used to express time information, and that the words in time expressions demonstrate similar syntactic behaviour. Based on these findings, we propose a type-based approach, named SynTime, to recognize time expressions. Specifically, we define three main syntactic token types, namely time token, modifier, and numeral, to group time-related regular expressions over tokens. On these types we design general heuristic rules to recognize time expressions. In recognition, SynTime first identifies the time tokens from raw text, then searches their surroundings for modifiers and numerals to form time segments, and finally merges the time segments into time expressions. As a light-weight rule-based tagger, SynTime runs in real time, and can be easily expanded by simply adding keywords for text of different types and different domains. Experiments on benchmark datasets and tweets data show that SynTime outperforms state-of-the-art methods.
Tasks Information Retrieval
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1039/
PDF https://www.aclweb.org/anthology/P17-1039
PWC https://paperswithcode.com/paper/time-expression-analysis-and-recognition
Repo
Framework
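
The three recognition steps described in the abstract (find time tokens, grow segments over neighbouring modifiers and numerals, merge segments) can be sketched roughly as below. This is only an illustrative toy, not SynTime itself: the keyword sets `TIME_TOKENS` and `MODIFIERS`, the naive plural handling, and the function names are all assumptions made for this example, whereas the paper defines its token types through regular expressions and a fuller set of heuristic rules.

```python
import re

# Hypothetical, tiny keyword sets chosen for illustration only.
TIME_TOKENS = {"monday", "week", "year", "day", "month", "today", "yesterday"}
MODIFIERS = {"last", "next", "this", "ago", "early", "late"}


def token_type(tok):
    """Assign one of the three token types, or None for unrelated words."""
    t = tok.lower()
    if t in MODIFIERS:
        return "MOD"
    if t in TIME_TOKENS or (t.endswith("s") and t[:-1] in TIME_TOKENS):
        return "TIME"
    if re.fullmatch(r"\d+", t):
        return "NUM"
    return None


def recognize(text):
    """Return time expressions found by the three-step heuristic."""
    tokens = re.findall(r"\w+", text)
    types = [token_type(t) for t in tokens]

    # Steps 1 and 2: for each time token, expand over adjacent modifiers/numerals.
    segments = []
    for i, ty in enumerate(types):
        if ty != "TIME":
            continue
        lo = hi = i
        while lo > 0 and types[lo - 1] in ("MOD", "NUM"):
            lo -= 1
        while hi + 1 < len(tokens) and types[hi + 1] in ("MOD", "NUM"):
            hi += 1
        segments.append((lo, hi))

    # Step 3: merge overlapping or directly adjacent segments.
    merged = []
    for lo, hi in segments:
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return [" ".join(tokens[lo:hi + 1]) for lo, hi in merged]


print(recognize("We met 3 days ago and will meet again next Monday"))
# prints ['3 days ago', 'next Monday']
```

Extending such a tagger to a new domain would, in the spirit of the abstract, amount to adding keywords to the sets rather than retraining anything.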

A parallel collection of clinical trials in Portuguese and English

Title A parallel collection of clinical trials in Portuguese and English
Authors Mariana Neves
Abstract Parallel collections of documents are crucial resources for training and evaluating machine translation (MT) systems. Even though large collections are available for certain domains and language pairs, these are still scarce in the biomedical domain. We developed a parallel corpus of clinical trials in Portuguese and English. The documents are derived from the Brazilian Clinical Trials Registry and the corpus currently contains a total of 1188 documents. In this paper, we describe the corpus construction and discuss the quality of the translation and the sentence alignment that we obtained.
Tasks Machine Translation
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2507/
PDF https://www.aclweb.org/anthology/W17-2507
PWC https://paperswithcode.com/paper/a-parallel-collection-of-clinical-trials-in
Repo
Framework

Dynamic Importance Sampling for Anytime Bounds of the Partition Function

Title Dynamic Importance Sampling for Anytime Bounds of the Partition Function
Authors Qi Lou, Rina Dechter, Alexander T. Ihler
Abstract Computing the partition function is a key inference task in many graphical models. In this paper, we propose a dynamic importance sampling scheme that provides anytime finite-sample bounds for the partition function. Our algorithm balances the advantages of the three major inference strategies, heuristic search, variational bounds, and Monte Carlo methods, blending sampling with search to refine a variationally defined proposal. Our algorithm combines and generalizes recent work on anytime search and probabilistic bounds of the partition function. By using an intelligently chosen weighted average over the samples, we construct an unbiased estimator of the partition function with strong finite-sample confidence intervals that inherit both the rapid early improvement rate of sampling and the long-term benefits of an improved proposal from search. This gives significantly improved anytime behavior, and more flexible trade-offs between memory, time, and solution quality. We demonstrate the effectiveness of our approach empirically on real-world problem instances taken from recent UAI competitions.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/6912-dynamic-importance-sampling-for-anytime-bounds-of-the-partition-function
PDF http://papers.nips.cc/paper/6912-dynamic-importance-sampling-for-anytime-bounds-of-the-partition-function.pdf
PWC https://paperswithcode.com/paper/dynamic-importance-sampling-for-anytime
Repo
Framework
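
As background for the abstract above, the following sketch shows plain importance sampling for a partition function on a model small enough to check by brute force. It is not the paper's dynamic scheme (there is no search, no variational proposal, and no confidence bound); the chain model, the `coupling` value, and the uniform proposal are assumptions picked only to keep the example self-contained.

```python
import itertools
import math
import random


def unnormalized(x, coupling=0.5):
    """Unnormalized weight of a chain of +/-1 variables with pairwise couplings."""
    return math.exp(coupling * sum(a * b for a, b in zip(x, x[1:])))


def exact_partition(n):
    """Brute-force partition function: sum of weights over all 2**n states."""
    return sum(unnormalized(x) for x in itertools.product((-1, 1), repeat=n))


def importance_sampling_partition(n, num_samples, seed=0):
    """Unbiased estimate of Z as the average of f(x)/q(x) with x drawn from q."""
    rng = random.Random(seed)
    q = 1.0 / 2 ** n  # uniform proposal probability of any single state
    total = 0.0
    for _ in range(num_samples):
        x = [rng.choice((-1, 1)) for _ in range(n)]
        total += unnormalized(x) / q
    return total / num_samples


z_true = exact_partition(6)
z_hat = importance_sampling_partition(6, num_samples=20000)
print(z_true, z_hat)
```

The quality of such an estimate depends heavily on how close the proposal is to the target distribution, which is the part the paper improves by refining the proposal with search.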

Identification of Risk Factors in Clinical Texts through Association Rules

Title Identification of Risk Factors in Clinical Texts through Association Rules
Authors Svetla Boytcheva, Ivelina Nikolova, Galia Angelova, Zhivko Angelov
Abstract We describe a method which extracts Association Rules from texts in order to recognise verbalisations of risk factors. Usually some basic vocabulary about risk factors is known, but medical conditions are expressed in clinical narratives with much higher variety. We propose an approach for data-driven learning of specialised medical vocabulary which, once collected, enables early alerting of potentially affected patients. The method is illustrated by experiments with clinical records of patients with Chronic Obstructive Pulmonary Disease (COPD) and comorbidity of CORD, Diabetes Mellitus and Schizophrenia. Our input data come from the Bulgarian Diabetic Register, which is built using a pseudonymised collection of outpatient records for about 500,000 diabetic patients. The generated Association Rules for CORD are analysed in the context of demographic, gender, and age information. Valuable amounts of meaningful words, signalling risk factors, are discovered with high precision and confidence.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-8009/
PDF https://doi.org/10.26615/978-954-452-044-1_009
PWC https://paperswithcode.com/paper/identification-of-risk-factors-in-clinical
Repo
Framework
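
For readers unfamiliar with association rules, the sketch below computes support and confidence for single-word rules of the form "lhs implies rhs" over bags of words. The records, thresholds, and function name are invented for illustration; the abstract does not specify the mining algorithm or parameters the authors used.

```python
from itertools import combinations


def association_rules(transactions, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for word pairs passing both thresholds."""
    n = len(transactions)
    sets = [set(t) for t in transactions]
    items = sorted(set().union(*sets))

    def support(itemset):
        # Fraction of records that contain every word in the itemset.
        return sum(1 for s in sets if itemset <= s) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Hypothetical toy records, each reduced to a bag of words.
records = [
    ["smoker", "cough", "copd"],
    ["smoker", "copd"],
    ["cough", "fever"],
    ["smoker", "cough", "copd"],
]
for rule in association_rules(records, min_support=0.5, min_confidence=0.9):
    print(rule)
# prints ('copd', 'smoker', 0.75, 1.0) then ('smoker', 'copd', 0.75, 1.0)
```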

Stochastic Mirror Descent in Variationally Coherent Optimization Problems

Title Stochastic Mirror Descent in Variationally Coherent Optimization Problems
Authors Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Stephen Boyd, Peter W. Glynn
Abstract In this paper, we examine a class of non-convex stochastic optimization problems which we call variationally coherent, and which properly includes pseudo-/quasiconvex and star-convex optimization problems. To solve such problems, we focus on the widely used stochastic mirror descent (SMD) family of algorithms (which contains stochastic gradient descent as a special case), and we show that the last iterate of SMD converges to the problem’s solution set with probability 1. This result contributes to the landscape of non-convex stochastic optimization by clarifying that neither pseudo-/quasi-convexity nor star-convexity is essential for (almost sure) global convergence; rather, variational coherence, a much weaker requirement, suffices. Characterization of convergence rates for the subclass of strongly variationally coherent optimization problems as well as simulation results are also presented.
Tasks Stochastic Optimization
Published 2017-12-01
URL http://papers.nips.cc/paper/7279-stochastic-mirror-descent-in-variationally-coherent-optimization-problems
PDF http://papers.nips.cc/paper/7279-stochastic-mirror-descent-in-variationally-coherent-optimization-problems.pdf
PWC https://paperswithcode.com/paper/stochastic-mirror-descent-in-variationally
Repo
Framework
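
To make the SMD update concrete, here is a minimal sketch of stochastic mirror descent on the probability simplex with the entropic mirror map, where the update becomes a multiplicative (exponentiated-gradient) step. The linear objective, the noise level, and the step-size schedule `t ** -0.75` are assumptions chosen for the example, not taken from the paper; with a Euclidean mirror map the same loop would reduce to projected stochastic gradient descent.

```python
import math
import random


def smd_simplex(grad_oracle, dim, steps, seed=0):
    """Run entropic stochastic mirror descent and return the last iterate."""
    rng = random.Random(seed)
    x = [1.0 / dim] * dim  # start at the centre of the simplex
    for t in range(1, steps + 1):
        g = grad_oracle(x, rng)   # noisy gradient sample
        step = t ** -0.75         # diminishing step size (assumed schedule)
        # Mirror step for negative entropy: scale multiplicatively, then renormalize.
        w = [xi * math.exp(-step * gi) for xi, gi in zip(x, g)]
        total = sum(w)
        x = [wi / total for wi in w]
    return x


def noisy_linear_grad(x, rng):
    """Gradient of the linear cost <c, x> plus zero-mean Gaussian noise."""
    costs = [0.9, 0.2, 0.6]  # hypothetical costs; coordinate 1 is the cheapest
    return [c + rng.gauss(0.0, 0.1) for c in costs]


x_last = smd_simplex(noisy_linear_grad, dim=3, steps=2000)
print(x_last)  # most of the mass should sit on coordinate 1
```

The sketch inspects the last iterate rather than an average of iterates, which matches the kind of guarantee the abstract emphasizes.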

Proceedings of the Events and Stories in the News Workshop

Title Proceedings of the Events and Stories in the News Workshop
Authors
Abstract
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2700/
PDF https://www.aclweb.org/anthology/W17-2700
PWC https://paperswithcode.com/paper/proceedings-of-the-events-and-stories-in-the
Repo
Framework

Automatic Measures to Characterise Verbal Alignment in Human-Agent Interaction

Title Automatic Measures to Characterise Verbal Alignment in Human-Agent Interaction
Authors Guillaume Dubuisson Duplessis, Chloé Clavel, Frédéric Landragin
Abstract This work aims at characterising verbal alignment processes for improving virtual agent communicative capabilities. We propose computationally inexpensive measures of verbal alignment based on expression repetition in dyadic textual dialogues. Using these measures, we present a contrastive study between Human-Human and Human-Agent dialogues on a negotiation task. We exhibit quantitative differences in the strength and orientation of verbal alignment showing the ability of our approach to characterise important aspects of verbal alignment.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-5510/
PDF https://www.aclweb.org/anthology/W17-5510
PWC https://paperswithcode.com/paper/automatic-measures-to-characterise-verbal
Repo
Framework

Weakly supervised learning of allomorphy

Title Weakly supervised learning of allomorphy
Authors Miikka Silfverberg, Mans Hulden
Abstract Most NLP resources that offer annotations at the word segment level provide morphological annotation that includes features indicating tense, aspect, modality, gender, case, and other inflectional information. Such information is rarely aligned to the relevant parts of the words (i.e. the allomorphs), as such annotation would be very costly. These unaligned weak labelings are commonly provided by annotated NLP corpora such as treebanks in various languages. Although they lack alignment information, the presence/absence of labels at the word level is also consistent with the amount of supervision assumed to be provided to L1 and L2 learners. In this paper, we explore several methods to learn this latent alignment between parts of word forms and the grammatical information provided. All the methods under investigation favor hypotheses regarding allomorphs of morphemes that re-use a small inventory, i.e. implicitly minimize the number of allomorphs that a morpheme can be realized as. We show that the provided information offers a significant advantage for both word segmentation and the learning of allomorphy.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4107/
PDF https://www.aclweb.org/anthology/W17-4107
PWC https://paperswithcode.com/paper/weakly-supervised-learning-of-allomorphy
Repo
Framework

From Clickbait to Fake News Detection: An Approach based on Detecting the Stance of Headlines to Articles

Title From Clickbait to Fake News Detection: An Approach based on Detecting the Stance of Headlines to Articles
Authors Peter Bourgonje, Julian Moreno Schneider, Georg Rehm
Abstract We present a system for the detection of the stance of headlines with regard to their corresponding article bodies. The approach can be applied in fake news, especially clickbait detection scenarios. The component is part of a larger platform for the curation of digital content; we consider veracity and relevancy an increasingly important part of curating online information. We want to contribute to the debate on how to deal with fake news and related online phenomena with technological means, by providing means to separate related from unrelated headlines and further classifying the related headlines. On a publicly available data set annotated for the stance of headlines with regard to their corresponding article bodies, we achieve a (weighted) accuracy score of 89.59.
Tasks Clickbait Detection, Fake News Detection, Rumour Detection
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4215/
PDF https://www.aclweb.org/anthology/W17-4215
PWC https://paperswithcode.com/paper/from-clickbait-to-fake-news-detection-an
Repo
Framework

"i have a feeling trump will win………………": Forecasting Winners and Losers from User Predictions on Twitter

Title "i have a feeling trump will win………………": Forecasting Winners and Losers from User Predictions on Twitter
Authors Sandesh Swamy, Alan Ritter, Marie-Catherine de Marneffe
Abstract Social media users often make explicit predictions about upcoming events. Such statements vary in the degree of certainty the author expresses toward the outcome: "Leonardo DiCaprio will win Best Actor" vs. "Leonardo DiCaprio may win" or "No way Leonardo wins!". Can popular beliefs on social media predict who will win? To answer this question, we build a corpus of tweets annotated for veridicality on which we train a log-linear classifier that detects positive veridicality with high precision. We then forecast uncertain outcomes using the wisdom of crowds, by aggregating users' explicit predictions. Our method for forecasting winners is fully automated, relying only on a set of contenders as input. It requires no training data of past outcomes and outperforms sentiment and tweet volume baselines on a broad range of contest prediction tasks. We further demonstrate how our approach can be used to measure the reliability of individual accounts' predictions and retrospectively identify surprise outcomes.
Tasks Sentiment Analysis
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1166/
PDF https://www.aclweb.org/anthology/D17-1166
PWC https://paperswithcode.com/paper/i-have-a-feeling-trump-will-win-forecasting-1
Repo
Framework
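
The aggregation step mentioned in the abstract can be pictured with a simple counting sketch: each tweet has already been scored for how strongly it asserts that a contender will win, and the contender with the most confident predictions is forecast as the winner. The score values, the `threshold`, and the tie-breaking rule are assumptions for illustration; the paper's classifier and exact scoring function are not reproduced here.

```python
from collections import Counter


def forecast_winner(predictions, contenders, threshold=0.5):
    """Count confident positive predictions per contender and pick the top one.

    predictions: iterable of (contender, p_positive) pairs, where p_positive is
    a hypothetical classifier probability that the tweet predicts a win.
    """
    votes = Counter({c: 0 for c in contenders})
    for contender, p_positive in predictions:
        if contender in votes and p_positive >= threshold:
            votes[contender] += 1
    # Highest vote count first; ties are broken alphabetically for determinism.
    ranked = sorted(contenders, key=lambda c: (-votes[c], c))
    return ranked[0], dict(votes)


# Hypothetical scored tweets about three contenders.
scored = [("A", 0.9), ("B", 0.8), ("A", 0.7), ("B", 0.2), ("C", 0.95)]
winner, votes = forecast_winner(scored, ["A", "B", "C"])
print(winner, votes)
# prints A {'A': 2, 'B': 1, 'C': 1}
```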

Instances and concepts in distributional space

Title Instances and concepts in distributional space
Authors Gemma Boleda, Abhijeet Gupta, Sebastian Padó
Abstract Instances ("Mozart") are ontologically distinct from concepts or classes ("composer"). Natural language encompasses both, but instances have received comparatively little attention in distributional semantics. Our results show that instances and concepts differ in their distributional properties. We also establish that instantiation detection ("Mozart – composer") is generally easier than hypernymy detection ("chemist – scientist"), and that results on the influence of input representation do not transfer from hyponymy to instantiation.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2013/
PDF https://www.aclweb.org/anthology/E17-2013
PWC https://paperswithcode.com/paper/instances-and-concepts-in-distributional
Repo
Framework

Proceedings of the 4th Workshop on Asian Translation (WAT2017)

Title Proceedings of the 4th Workshop on Asian Translation (WAT2017)
Authors
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/W17-5700/
PDF https://www.aclweb.org/anthology/W17-5700
PWC https://paperswithcode.com/paper/proceedings-of-the-4th-workshop-on-asian
Repo
Framework

Creating POS Tagging and Dependency Parsing Experts via Topic Modeling

Title Creating POS Tagging and Dependency Parsing Experts via Topic Modeling
Authors Atreyee Mukherjee, Sandra Kübler, Matthias Scheutz
Abstract Part of speech (POS) taggers and dependency parsers tend to work well on homogeneous datasets but their performance suffers on datasets containing data from different genres. In our current work, we investigate how to create POS tagging and dependency parsing experts for heterogeneous data by employing topic modeling. We create topic models (using Latent Dirichlet Allocation) to determine genres from a heterogeneous dataset and then train an expert for each of the genres. Our results show that the topic modeling experts reach substantial improvements when compared to the general versions. For dependency parsing, the improvement reaches 2 percentage points over the full training baseline when we use two topics.
Tasks Dependency Parsing, Domain Adaptation, Topic Models
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1033/
PDF https://www.aclweb.org/anthology/E17-1033
PWC https://paperswithcode.com/paper/creating-pos-tagging-and-dependency-parsing
Repo
Framework

MainiwayAI at IJCNLP-2017 Task 2: Ensembles of Deep Architectures for Valence-Arousal Prediction

Title MainiwayAI at IJCNLP-2017 Task 2: Ensembles of Deep Architectures for Valence-Arousal Prediction
Authors Yassine Benajiba, Jin Sun, Yong Zhang, Zhiliang Weng, Or Biran
Abstract This paper introduces Mainiway AI Labs' submitted system for the IJCNLP 2017 shared task on Dimensional Sentiment Analysis of Chinese Phrases (DSAP), and related experiments. Our approach consists of deep neural networks with various architectures, and our best system is a voted ensemble of networks. We achieve a Mean Absolute Error of 0.64 in valence prediction and 0.68 in arousal prediction on the test set, both placing us as the 5th ranked team in the competition.
Tasks Sentiment Analysis, Word Embeddings
Published 2017-12-01
URL https://www.aclweb.org/anthology/I17-4019/
PDF https://www.aclweb.org/anthology/I17-4019
PWC https://paperswithcode.com/paper/mainiwayai-at-ijcnlp-2017-task-2-ensembles-of
Repo
Framework

Experiments in taxonomy induction in Spanish and French

Title Experiments in taxonomy induction in Spanish and French
Authors Irene Renau, Rogelio Nazar, Rafael Marín
Abstract
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-7008/
PDF https://www.aclweb.org/anthology/W17-7008
PWC https://paperswithcode.com/paper/experiments-in-taxonomy-induction-in-spanish
Repo
Framework