July 26, 2019

Paper Group NANR 186

Near Minimax Optimal Players for the Finite-Time 3-Expert Prediction Problem

Title Near Minimax Optimal Players for the Finite-Time 3-Expert Prediction Problem
Authors Yasin Abbasi, Peter L. Bartlett, Victor Gabillon
Abstract We study minimax strategies for the online prediction problem with expert advice. It has been conjectured that a simple adversary strategy, called COMB, is near optimal in this game for any number of experts. Our results and new insights make progress in this direction by showing that, up to a small additive term, COMB is minimax optimal in the finite-time three expert problem. In addition, we provide for this setting a new near minimax optimal COMB-based learner. Prior to this work, in this problem, learners obtaining the optimal multiplicative constant in their regret rate were known only when $K=2$ or $K\rightarrow\infty$. We characterize, when $K=3$, the regret of the game scaling as $\sqrt{8/(9\pi)T}\pm \log(T)^2$ which gives for the first time the optimal constant in the leading ($\sqrt{T}$) term of the regret.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/6896-near-minimax-optimal-players-for-the-finite-time-3-expert-prediction-problem
PDF http://papers.nips.cc/paper/6896-near-minimax-optimal-players-for-the-finite-time-3-expert-prediction-problem.pdf
PWC https://paperswithcode.com/paper/near-minimax-optimal-players-for-the-finite
Repo
Framework
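
As a rough numeric illustration of the rate quoted in the abstract above, the sketch below evaluates only the leading term $\sqrt{8/(9\pi)T}$ and the $\log(T)^2$ slack for a given horizon. The function names are hypothetical and the code does not implement COMB or the learner from the paper; it only does the arithmetic.

```python
import math


def leading_regret_term(T: int) -> float:
    """Leading sqrt(T) term quoted in the abstract for K=3: sqrt(8 * T / (9 * pi))."""
    return math.sqrt(8.0 * T / (9.0 * math.pi))


def log_squared_slack(T: int) -> float:
    """Size of the additive log(T)^2 term quoted in the abstract."""
    return math.log(T) ** 2


if __name__ == "__main__":
    for horizon in (100, 10_000, 1_000_000):
        lead = leading_regret_term(horizon)
        slack = log_squared_slack(horizon)
        print(f"T={horizon}: leading term {lead:.2f}, slack {slack:.2f}")
```

Because the leading term grows like $\sqrt{T}$ and the slack only polylogarithmically, the slack becomes small relative to the leading term as the horizon grows.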

Ideological Phrase Indicators for Classification of Political Discourse Framing on Twitter

Title Ideological Phrase Indicators for Classification of Political Discourse Framing on Twitter
Authors Kristen Johnson, I-Ta Lee, Dan Goldwasser
Abstract Politicians carefully word their statements in order to influence how others view an issue, a political strategy called framing. Simultaneously, these frames may also reveal the beliefs or positions on an issue of the politician. Simple language features such as unigrams, bigrams, and trigrams are important indicators for identifying the general frame of a text, for both longer congressional speeches and shorter tweets of politicians. However, tweets may contain multiple unigrams across different frames which limits the effectiveness of this approach. In this paper, we present a joint model which uses both linguistic features of tweets and ideological phrase indicators extracted from a state-of-the-art embedding-based model to predict the general frame of political tweets.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2913/
PDF https://www.aclweb.org/anthology/W17-2913
PWC https://paperswithcode.com/paper/ideological-phrase-indicators-for
Repo
Framework
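
The unigram, bigram, and trigram features mentioned in the abstract above can be sketched as follows. This is a minimal illustration that assumes lowercased whitespace tokenization; the helper names are hypothetical and this is not the authors' feature pipeline or joint model.

```python
from collections import Counter
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of the token list, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text: str, max_n: int = 3) -> Counter:
    """Count unigrams up to max_n-grams of a text (assumed: whitespace tokens, lowercased)."""
    tokens = text.lower().split()
    counts: Counter = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


if __name__ == "__main__":
    print(ngram_features("we must protect health care"))
```

A short tweet yields few distinct n-grams, which is one way to see why the abstract pairs these surface features with additional phrase indicators.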

Detecting annotation noise in automatically labelled data

Title Detecting annotation noise in automatically labelled data
Authors Ines Rehbein, Josef Ruppenhofer
Abstract We introduce a method for error detection in automatically annotated text, aimed at supporting the creation of high-quality language resources at affordable cost. Our method combines an unsupervised generative model with human supervision from active learning. We test our approach on in-domain and out-of-domain data in two languages, in AL simulations and in a real world setting. For all settings, the results show that our method is able to detect annotation errors with high precision and high recall.
Tasks Active Learning, Domain Adaptation, Language Modelling, Named Entity Recognition
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1107/
PDF https://www.aclweb.org/anthology/P17-1107
PWC https://paperswithcode.com/paper/detecting-annotation-noise-in-automatically
Repo
Framework

Investigating the Opacity of Verb-Noun Multiword Expression Usages in Context

Title Investigating the Opacity of Verb-Noun Multiword Expression Usages in Context
Authors Shiva Taslimipoor, Omid Rohanian, Ruslan Mitkov, Afsaneh Fazly
Abstract This study investigates the supervised token-based identification of Multiword Expressions (MWEs). This is ongoing research to exploit the information contained in the contexts in which different instances of an expression could occur. This information is used to investigate the question of whether an expression is literal or an MWE. Lexical and syntactic context features derived from vector representations are shown to be more effective than traditional statistical measures at identifying tokens of MWEs.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1718/
PDF https://www.aclweb.org/anthology/W17-1718
PWC https://paperswithcode.com/paper/investigating-the-opacity-of-verb-noun
Repo
Framework

Pocket Knowledge Base Population

Title Pocket Knowledge Base Population
Authors Travis Wolfe, Mark Dredze, Benjamin Van Durme
Abstract Existing Knowledge Base Population methods extract relations from a closed relational schema with limited coverage, leading to sparse KBs. We propose Pocket Knowledge Base Population (PKBP), the task of dynamically constructing a KB of entities related to a query and finding the best characterization of relationships between entities. We describe novel Open Information Extraction methods which leverage the PKB to find informative trigger words. We evaluate using existing KBP shared-task data as well as new annotations collected for this work. Our methods produce high quality KB from just text with many more entities and relationships than existing KBP systems.
Tasks Knowledge Base Population, Open Information Extraction, Slot Filling
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2048/
PDF https://www.aclweb.org/anthology/P17-2048
PWC https://paperswithcode.com/paper/pocket-knowledge-base-population
Repo
Framework

A Constituent-Centric Neural Architecture for Reading Comprehension

Title A Constituent-Centric Neural Architecture for Reading Comprehension
Authors Pengtao Xie, Eric Xing
Abstract Reading comprehension (RC), aiming to understand natural texts and answer questions therein, is a challenging task. In this paper, we study the RC problem on the Stanford Question Answering Dataset (SQuAD). Observing from the training set that most correct answers are centered around constituents in the parse tree, we design a constituent-centric neural architecture where the generation of candidate answers and their representation learning are both based on constituents and guided by the parse tree. Under this architecture, the search space of candidate answers can be greatly reduced without sacrificing the coverage of correct answers and the syntactic, hierarchical and compositional structure among constituents can be well captured, which contributes to better representation learning of the candidate answers. On SQuAD, our method achieves the state of the art performance and the ablation study corroborates the effectiveness of individual modules.
Tasks Machine Reading Comprehension, Question Answering, Reading Comprehension, Representation Learning
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1129/
PDF https://www.aclweb.org/anthology/P17-1129
PWC https://paperswithcode.com/paper/a-constituent-centric-neural-architecture-for
Repo
Framework

A BiLSTM-based System for Cross-lingual Pronoun Prediction

Title A BiLSTM-based System for Cross-lingual Pronoun Prediction
Authors Sara Stymne, Sharid Loáiciga, Fabienne Cap
Abstract We describe the Uppsala system for the 2017 DiscoMT shared task on cross-lingual pronoun prediction. The system is based on a lower layer of BiLSTMs reading the source and target sentences respectively. Classification is based on the BiLSTM representation of the source and target positions for the pronouns. In addition we enrich our system with dependency representations from an external parser and character representations of the source sentence. We show that these additions perform well for German and Spanish as source languages. Our system is competitive and is in first or second place for all language pairs.
Tasks Machine Translation, Word Alignment
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4805/
PDF https://www.aclweb.org/anthology/W17-4805
PWC https://paperswithcode.com/paper/a-bilstm-based-system-for-cross-lingual
Repo
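Framework

Discovery of Discourse-Related Language Contrasts through Alignment Discrepancies in English-German Translation

Title Discovery of Discourse-Related Language Contrasts through Alignment Discrepancies in English-German Translation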
Authors Ekaterina Lapshinova-Koltunski, Christian Hardmeier
Abstract In this paper, we analyse alignment discrepancies for discourse structures in English-German parallel data: sentence pairs in which discourse structures in target or source texts have no alignment in the corresponding parallel sentences. The discourse-related structures are designed in the form of linguistic patterns based on the information delivered by automatic part-of-speech and dependency annotation. In addition to alignment errors (existing structures left unaligned), these alignment discrepancies can be caused by language contrasts or through the phenomena of explicitation and implicitation in the translation process. We propose a new approach including a new type of resource for corpus-based language contrast analysis and apply it to study and classify the contrasts found in our English-German parallel corpus. As unaligned discourse structures may also result in the loss of discourse information in the MT training data, we hope to deliver information in support of discourse-aware machine translation (MT).
Tasks Machine Translation, Word Alignment
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4810/
PDF https://www.aclweb.org/anthology/W17-4810
PWC https://paperswithcode.com/paper/discovery-of-discourse-related-language
Repo
Framework

Automatic classification of doctor-patient questions for a virtual patient record query task

Title Automatic classification of doctor-patient questions for a virtual patient record query task
Authors Leonardo Campillos Llanos, Sophie Rosset, Pierre Zweigenbaum
Abstract We present the work-in-progress of automating the classification of doctor-patient questions in the context of a simulated consultation with a virtual patient. We classify questions according to the computational strategy (rule-based or other) needed for looking up data in the clinical record. We compare 'traditional' machine learning methods (Gaussian and Multinomial Naive Bayes, and Support Vector Machines) and a neural network classifier (FastText). We obtained the best results with the SVM using semantic annotations, whereas the neural classifier achieved promising results without it.
Tasks Dialogue Management, Information Retrieval, Named Entity Recognition, Question Answering
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2343/
PDF https://www.aclweb.org/anthology/W17-2343
PWC https://paperswithcode.com/paper/automatic-classification-of-doctor-patient
Repo
Framework

Prices go Up, Surge, Jump, Spike, Skyrocket, Go through the Roof... Intensifier Collocations with Parametric Nouns of Type PRICE

Title Prices go Up, Surge, Jump, Spike, Skyrocket, Go through the Roof... Intensifier Collocations with Parametric Nouns of Type PRICE
Authors Jasmina Milićević
Abstract
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-6518/
PDF https://www.aclweb.org/anthology/W17-6518
PWC https://paperswithcode.com/paper/prices-go-up-surge-jump-spike-skyrocket-go
Repo
Framework

Identification of Languages in Algerian Arabic Multilingual Documents

Title Identification of Languages in Algerian Arabic Multilingual Documents
Authors Wafia Adouane, Simon Dobnik
Abstract This paper presents a language identification system designed to detect the language of each word, in its context, in multilingual documents as generated in social media by bilingual/multilingual communities, in our case speakers of Algerian Arabic. We frame the task as a sequence tagging problem and use supervised machine learning with standard methods like HMM and Ngram classification tagging. We also experiment with a lexicon-based method. Combining all the methods in a fall-back mechanism and introducing some linguistic rules, to deal with unseen tokens and ambiguous words, gives an overall accuracy of 93.14%. Finally, we introduced rules for language identification from sequences of recognised words.
Tasks Chunking, Language Identification
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1301/
PDF https://www.aclweb.org/anthology/W17-1301
PWC https://paperswithcode.com/paper/identification-of-languages-in-algerian
Repo
Framework
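
The fall-back mechanism mentioned in the abstract above can be pictured as a chain of taggers tried in order for each word. The sketch below is a minimal illustration under stated assumptions: the lexicon entries, the label names ("ALG", "FRA", "UNK"), and all function names are hypothetical, and the trained HMM and n-gram taggers from the paper are stood in for by simple lexicon lookups.

```python
from typing import Callable, Dict, List, Optional, Sequence

# A tagger looks at the token sequence and a position, and returns a
# language label or None when it cannot decide.
Tagger = Callable[[Sequence[str], int], Optional[str]]


def lexicon_tagger(lexicon: Dict[str, str]) -> Tagger:
    """Build a tagger that looks up the lowercased word in a lexicon."""
    def tag(tokens: Sequence[str], i: int) -> Optional[str]:
        return lexicon.get(tokens[i].lower())
    return tag


def tag_with_fallback(tokens: Sequence[str],
                      taggers: Sequence[Tagger],
                      default: str = "UNK") -> List[str]:
    """Label each token with the first non-None answer from the tagger chain."""
    labels = []
    for i in range(len(tokens)):
        label = default
        for tagger in taggers:
            guess = tagger(tokens, i)
            if guess is not None:
                label = guess
                break
        labels.append(label)
    return labels


if __name__ == "__main__":
    # Toy lexicons, invented for illustration only.
    primary = lexicon_tagger({"salam": "ALG"})
    backup = lexicon_tagger({"bonjour": "FRA", "salam": "OTHER"})
    print(tag_with_fallback(["Salam", "bonjour", "xyz"], [primary, backup]))
```

Ordering the chain from most to least trusted method means a later tagger is only consulted for tokens the earlier ones could not label.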

Aye or naw, whit dae ye hink? Scottish independence and linguistic identity on social media

Title Aye or naw, whit dae ye hink? Scottish independence and linguistic identity on social media
Authors Philippa Shoemark, Debnil Sur, Luke Shrimpton, Iain Murray, Sharon Goldwater
Abstract Political surveys have indicated a relationship between a sense of Scottish identity and voting decisions in the 2014 Scottish Independence Referendum. Identity is often reflected in language use, suggesting the intuitive hypothesis that individuals who support Scottish independence are more likely to use distinctively Scottish words than those who oppose it. In the first large-scale study of sociolinguistic variation on social media in the UK, we identify distinctively Scottish terms in a data-driven way, and find that these terms are indeed used at a higher rate by users of pro-independence hashtags than by users of anti-independence hashtags. However, we also find that in general people are less likely to use distinctively Scottish words in tweets with referendum-related hashtags than in their general Twitter activity. We attribute this difference to style shifting relative to audience, aligning with previous work showing that Twitter users tend to use fewer local variants when addressing a broader audience.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1116/
PDF https://www.aclweb.org/anthology/E17-1116
PWC https://paperswithcode.com/paper/aye-or-naw-whit-dae-ye-hink-scottish
Repo
Framework

A Simple and Strong Baseline: NAIST-NICT Neural Machine Translation System for WAT2017 English-Japanese Translation Task

Title A Simple and Strong Baseline: NAIST-NICT Neural Machine Translation System for WAT2017 English-Japanese Translation Task
Authors Yusuke Oda, Katsuhito Sudoh, Satoshi Nakamura, Masao Utiyama, Eiichiro Sumita
Abstract This paper describes the details of the NAIST-NICT machine translation system for the WAT2017 English-Japanese Scientific Paper Translation Task. The system consists of a language-independent tokenizer and an attentional encoder-decoder style neural machine translation model. According to the official results, our system achieves higher translation accuracy than any system submitted to previous campaigns despite its simple model architecture.
Tasks Machine Translation
Published 2017-11-01
URL https://www.aclweb.org/anthology/W17-5712/
PDF https://www.aclweb.org/anthology/W17-5712
PWC https://paperswithcode.com/paper/a-simple-and-strong-baseline-naist-nict
Repo
Framework

Parsing to 1-Endpoint-Crossing, Pagenumber-2 Graphs

Title Parsing to 1-Endpoint-Crossing, Pagenumber-2 Graphs
Authors Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan
Abstract We study the Maximum Subgraph problem in deep dependency parsing. We consider two restrictions to deep dependency graphs: (a) 1-endpoint-crossing and (b) pagenumber-2. Our main contribution is an exact algorithm that obtains maximum subgraphs satisfying both restrictions simultaneously in time $O(n^5)$. Moreover, ignoring one linguistically-rare structure decreases the complexity to $O(n^4)$. We also extend our quartic-time algorithm into a practical parser with a discriminative disambiguation model and evaluate its performance on four linguistic data sets used in semantic dependency parsing.
Tasks Dependency Parsing, Semantic Dependency Parsing
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1193/
PDF https://www.aclweb.org/anthology/P17-1193
PWC https://paperswithcode.com/paper/parsing-to-1-endpoint-crossing-pagenumber-2
Repo
Framework

Content Selection for Real-time Sports News Construction from Commentary Texts

Title Content Selection for Real-time Sports News Construction from Commentary Texts
Authors Jin-ge Yao, Jianmin Zhang, Xiaojun Wan, Jianguo Xiao
Abstract We study the task of constructing sports news reports automatically from live commentary and focus on content selection. Rather than receiving every piece of text of a sports match before news construction, as in previous related work, we verify the feasibility of a more challenging but more useful setting: generating news reports on the fly by treating live text input as a stream. Specifically, we design various scoring functions to address different requirements of the task. The near submodularity of scoring functions makes it possible to adapt efficient greedy algorithms even in stream data settings. Experiments suggest that our proposed framework can already produce results comparable with previous work that relies on a supervised learning-to-rank model with heavy feature engineering.
Tasks Document Summarization, Feature Engineering, Learning-To-Rank, Text Generation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-3504/
PDF https://www.aclweb.org/anthology/W17-3504
PWC https://paperswithcode.com/paper/content-selection-for-real-time-sports-news
Repo
Framework
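
To make the idea of greedy selection over a stream concrete, here is a minimal sketch that swaps in a generic threshold rule with a word-coverage score (a standard submodular function). It is an assumption-laden illustration, not the scoring functions or the algorithm from the paper; `stream_select`, `budget`, and `min_gain` are hypothetical names.

```python
from typing import Iterable, List, Set


def stream_select(stream: Iterable[str], budget: int, min_gain: int) -> List[str]:
    """Keep a sentence when it adds at least min_gain unseen words.

    Each sentence is seen once, in arrival order, and the decision is
    never revisited. Selection stops once budget sentences are kept.
    """
    selected: List[str] = []
    covered: Set[str] = set()
    for sentence in stream:
        if len(selected) >= budget:
            break
        words = set(sentence.lower().split())
        gain = len(words - covered)  # marginal gain of the coverage score
        if gain >= min_gain:
            selected.append(sentence)
            covered |= words
    return selected


if __name__ == "__main__":
    commentary = [
        "goal by smith",
        "goal by smith again",
        "red card for jones",
    ]
    print(stream_select(commentary, budget=2, min_gain=2))
```

With a coverage score, the marginal gain of a sentence can only shrink as more text is selected, which is the diminishing-returns property that lets a one-pass threshold rule behave sensibly.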