July 26, 2019


Paper Group NANR 186

Near Minimax Optimal Players for the Finite-Time 3-Expert Prediction Problem. Ideological Phrase Indicators for Classification of Political Discourse Framing on Twitter. Detecting annotation noise in automatically labelled data. Investigating the Opacity of Verb-Noun Multiword Expression Usages in Context. Pocket Knowledge Base Population. A Consti …

Near Minimax Optimal Players for the Finite-Time 3-Expert Prediction Problem


Title	Near Minimax Optimal Players for the Finite-Time 3-Expert Prediction Problem
Authors	Yasin Abbasi, Peter L. Bartlett, Victor Gabillon
Abstract	We study minimax strategies for the online prediction problem with expert advice. It has been conjectured that a simple adversary strategy, called COMB, is near optimal in this game for any number of experts. Our results and new insights make progress in this direction by showing that, up to a small additive term, COMB is minimax optimal in the finite-time three expert problem. In addition, we provide for this setting a new near minimax optimal COMB-based learner. Prior to this work, in this problem, learners obtaining the optimal multiplicative constant in their regret rate were known only when $K=2$ or $K\rightarrow\infty$. We characterize, when $K=3$, the regret of the game scaling as $\sqrt{8/(9\pi)T}\pm \log(T)^2$ which gives for the first time the optimal constant in the leading ($\sqrt{T}$) term of the regret.
Tasks
Published	2017-12-01
URL	http://papers.nips.cc/paper/6896-near-minimax-optimal-players-for-the-finite-time-3-expert-prediction-problem
PDF	http://papers.nips.cc/paper/6896-near-minimax-optimal-players-for-the-finite-time-3-expert-prediction-problem.pdf
PWC	https://paperswithcode.com/paper/near-minimax-optimal-players-for-the-finite
Repo
Framework
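
As a rough numerical illustration of the rate quoted in the abstract above, the sketch below evaluates the leading term $\sqrt{8T/(9\pi)}$ and the additive $\log(T)^2$ term for a few horizons. The function names are hypothetical and the code only evaluates the stated formula; it is not the authors' algorithm, and it assumes the natural logarithm.

```python
import math
from typing import Tuple


def leading_constant() -> float:
    """Constant multiplying sqrt(T) in the K=3 regret rate: sqrt(8 / (9*pi))."""
    return math.sqrt(8.0 / (9.0 * math.pi))


def regret_terms(T: int) -> Tuple[float, float]:
    """Return (leading term, additive slack) for horizon T.

    leading = sqrt(8*T / (9*pi)); slack = log(T)**2 (natural log assumed).
    """
    leading = math.sqrt(8.0 * T / (9.0 * math.pi))
    slack = math.log(T) ** 2
    return leading, slack


if __name__ == "__main__":
    print(f"constant = {leading_constant():.4f}")
    for T in (10**3, 10**6, 10**9):
        leading, slack = regret_terms(T)
        print(f"T={T:>10}  leading={leading:10.2f}  log(T)^2={slack:8.2f}")
```

For short horizons the $\log(T)^2$ term can exceed the leading term, so the characterization is mainly informative for large $T$.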

Ideological Phrase Indicators for Classification of Political Discourse Framing on Twitter


Title	Ideological Phrase Indicators for Classification of Political Discourse Framing on Twitter
Authors	Kristen Johnson, I-Ta Lee, Dan Goldwasser
Abstract	Politicians carefully word their statements in order to influence how others view an issue, a political strategy called framing. Simultaneously, these frames may also reveal the beliefs or positions on an issue of the politician. Simple language features such as unigrams, bigrams, and trigrams are important indicators for identifying the general frame of a text, for both longer congressional speeches and shorter tweets of politicians. However, tweets may contain multiple unigrams across different frames which limits the effectiveness of this approach. In this paper, we present a joint model which uses both linguistic features of tweets and ideological phrase indicators extracted from a state-of-the-art embedding-based model to predict the general frame of political tweets.
Tasks
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2913/
PDF	https://www.aclweb.org/anthology/W17-2913
PWC	https://paperswithcode.com/paper/ideological-phrase-indicators-for
Repo
Framework

Detecting annotation noise in automatically labelled data


Title	Detecting annotation noise in automatically labelled data
Authors	Ines Rehbein, Josef Ruppenhofer
Abstract	We introduce a method for error detection in automatically annotated text, aimed at supporting the creation of high-quality language resources at affordable cost. Our method combines an unsupervised generative model with human supervision from active learning. We test our approach on in-domain and out-of-domain data in two languages, in AL simulations and in a real world setting. For all settings, the results show that our method is able to detect annotation errors with high precision and high recall.
Tasks	Active Learning, Domain Adaptation, Language Modelling, Named Entity Recognition
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-1107/
PDF	https://www.aclweb.org/anthology/P17-1107
PWC	https://paperswithcode.com/paper/detecting-annotation-noise-in-automatically
Repo
Framework

Investigating the Opacity of Verb-Noun Multiword Expression Usages in Context


Title	Investigating the Opacity of Verb-Noun Multiword Expression Usages in Context
Authors	Shiva Taslimipoor, Omid Rohanian, Ruslan Mitkov, Afsaneh Fazly
Abstract	This study investigates the supervised token-based identification of Multiword Expressions (MWEs). This is ongoing research to exploit the information contained in the contexts in which different instances of an expression could occur. This information is used to investigate the question of whether an expression is literal or an MWE. Lexical and syntactic context features derived from vector representations are shown to be more effective than traditional statistical measures for identifying tokens of MWEs.
Tasks
Published	2017-04-01
URL	https://www.aclweb.org/anthology/W17-1718/
PDF	https://www.aclweb.org/anthology/W17-1718
PWC	https://paperswithcode.com/paper/investigating-the-opacity-of-verb-noun
Repo
Framework

Pocket Knowledge Base Population


Title	Pocket Knowledge Base Population
Authors	Travis Wolfe, Mark Dredze, Benjamin Van Durme
Abstract	Existing Knowledge Base Population methods extract relations from a closed relational schema with limited coverage, leading to sparse KBs. We propose Pocket Knowledge Base Population (PKBP), the task of dynamically constructing a KB of entities related to a query and finding the best characterization of relationships between entities. We describe novel Open Information Extraction methods which leverage the PKB to find informative trigger words. We evaluate using existing KBP shared-task data as well as new annotations collected for this work. Our methods produce high-quality KBs from just text, with many more entities and relationships than existing KBP systems.
Tasks	Knowledge Base Population, Open Information Extraction, Slot Filling
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-2048/
PDF	https://www.aclweb.org/anthology/P17-2048
PWC	https://paperswithcode.com/paper/pocket-knowledge-base-population
Repo
Framework

A Constituent-Centric Neural Architecture for Reading Comprehension


Title	A Constituent-Centric Neural Architecture for Reading Comprehension
Authors	Pengtao Xie, Eric Xing
Abstract	Reading comprehension (RC), aiming to understand natural texts and answer questions therein, is a challenging task. In this paper, we study the RC problem on the Stanford Question Answering Dataset (SQuAD). Observing from the training set that most correct answers are centered around constituents in the parse tree, we design a constituent-centric neural architecture where the generation of candidate answers and their representation learning are both based on constituents and guided by the parse tree. Under this architecture, the search space of candidate answers can be greatly reduced without sacrificing the coverage of correct answers and the syntactic, hierarchical and compositional structure among constituents can be well captured, which contributes to better representation learning of the candidate answers. On SQuAD, our method achieves the state of the art performance and the ablation study corroborates the effectiveness of individual modules.
Tasks	Machine Reading Comprehension, Question Answering, Reading Comprehension, Representation Learning
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-1129/
PDF	https://www.aclweb.org/anthology/P17-1129
PWC	https://paperswithcode.com/paper/a-constituent-centric-neural-architecture-for
Repo
Framework
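
The abstract above says that restricting candidate answers to parse-tree constituents shrinks the search space. A minimal sketch of that idea follows, assuming a toy parse tree written as nested tuples (the tree, labels, and function name are hypothetical, and this is not the authors' architecture): it enumerates constituent spans and compares their number with the number of all contiguous spans.

```python
from typing import List, Tuple, Union

# A tree is either a token (str) or a tuple: (label, child, child, ...).
Tree = Union[str, tuple]
Span = Tuple[str, int, int]  # (label, start, end), end exclusive


def constituent_spans(tree: Tree, start: int = 0) -> Tuple[int, List[Span]]:
    """Return (end position, spans) for every constituent under `tree`."""
    if isinstance(tree, str):
        # A leaf token occupies exactly one position and adds no constituent.
        return start + 1, []
    label, *children = tree
    spans: List[Span] = []
    pos = start
    for child in children:
        pos, child_spans = constituent_spans(child, pos)
        spans.extend(child_spans)
    spans.append((label, start, pos))
    return pos, spans


# Hypothetical toy parse of "the cat sat on the mat".
TREE = ("S", ("NP", "the", "cat"),
        ("VP", "sat", ("PP", "on", ("NP", "the", "mat"))))

if __name__ == "__main__":
    n_tokens, spans = constituent_spans(TREE)
    all_spans = n_tokens * (n_tokens + 1) // 2  # every contiguous span
    print(f"{len(spans)} constituent candidates vs {all_spans} contiguous spans")
    for label, i, j in spans:
        print(label, i, j)
```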

A BiLSTM-based System for Cross-lingual Pronoun Prediction


Title	A BiLSTM-based System for Cross-lingual Pronoun Prediction
Authors	Sara Stymne, Sharid Loáiciga, Fabienne Cap
Abstract	We describe the Uppsala system for the 2017 DiscoMT shared task on cross-lingual pronoun prediction. The system is based on a lower layer of BiLSTMs reading the source and target sentences respectively. Classification is based on the BiLSTM representation of the source and target positions for the pronouns. In addition we enrich our system with dependency representations from an external parser and character representations of the source sentence. We show that these additions perform well for German and Spanish as source languages. Our system is competitive and is in first or second place for all language pairs.
Tasks	Machine Translation, Word Alignment
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4805/
PDF	https://www.aclweb.org/anthology/W17-4805
PWC	https://paperswithcode.com/paper/a-bilstm-based-system-for-cross-lingual
Repo
Framework

Discovery of Discourse-Related Language Contrasts through Alignment Discrepancies in English-German Translation


Title	Discovery of Discourse-Related Language Contrasts through Alignment Discrepancies in English-German Translation
Authors	Ekaterina Lapshinova-Koltunski, Christian Hardmeier
Abstract	In this paper, we analyse alignment discrepancies for discourse structures in English-German parallel data: sentence pairs in which discourse structures in target or source texts have no alignment in the corresponding parallel sentences. The discourse-related structures are designed in the form of linguistic patterns based on the information delivered by automatic part-of-speech and dependency annotation. In addition to alignment errors (existing structures left unaligned), these alignment discrepancies can be caused by language contrasts or through the phenomena of explicitation and implicitation in the translation process. We propose a new approach, including a new type of resource for corpus-based language contrast analysis, and apply it to study and classify the contrasts found in our English-German parallel corpus. As unaligned discourse structures may also result in the loss of discourse information in the MT training data, we hope to deliver information in support of discourse-aware machine translation (MT).
Tasks	Machine Translation, Word Alignment
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4810/
PDF	https://www.aclweb.org/anthology/W17-4810
PWC	https://paperswithcode.com/paper/discovery-of-discourse-related-language
Repo
Framework

Automatic classification of doctor-patient questions for a virtual patient record query task


Title	Automatic classification of doctor-patient questions for a virtual patient record query task
Authors	Leonardo Campillos Llanos, Sophie Rosset, Pierre Zweigenbaum
Abstract	We present the work-in-progress of automating the classification of doctor-patient questions in the context of a simulated consultation with a virtual patient. We classify questions according to the computational strategy (rule-based or other) needed for looking up data in the clinical record. We compare 'traditional' machine learning methods (Gaussian and Multinomial Naive Bayes, and Support Vector Machines) and a neural network classifier (FastText). We obtained the best results with the SVM using semantic annotations, whereas the neural classifier achieved promising results without it.
Tasks	Dialogue Management, Information Retrieval, Named Entity Recognition, Question Answering
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2343/
PDF	https://www.aclweb.org/anthology/W17-2343
PWC	https://paperswithcode.com/paper/automatic-classification-of-doctor-patient
Repo
Framework

Prices go Up, Surge, Jump, Spike, Skyrocket, Go through the Roof... Intensifier Collocations with Parametric Nouns of Type PRICE


Title	Prices go Up, Surge, Jump, Spike, Skyrocket, Go through the Roof... Intensifier Collocations with Parametric Nouns of Type PRICE
Authors	Jasmina Milićević
Abstract
Tasks
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-6518/
PDF	https://www.aclweb.org/anthology/W17-6518
PWC	https://paperswithcode.com/paper/prices-go-up-surge-jump-spike-skyrocket-go
Repo
Framework

Identification of Languages in Algerian Arabic Multilingual Documents


Title	Identification of Languages in Algerian Arabic Multilingual Documents
Authors	Wafia Adouane, Simon Dobnik
Abstract	This paper presents a language identification system designed to detect the language of each word, in its context, in multilingual documents as generated in social media by bilingual/multilingual communities, in our case speakers of Algerian Arabic. We frame the task as a sequence tagging problem and use supervised machine learning with standard methods like HMM and Ngram classification tagging. We also experiment with a lexicon-based method. Combining all the methods in a fall-back mechanism and introducing some linguistic rules, to deal with unseen tokens and ambiguous words, gives an overall accuracy of 93.14%. Finally, we introduced rules for language identification from sequences of recognised words.
Tasks	Chunking, Language Identification
Published	2017-04-01
URL	https://www.aclweb.org/anthology/W17-1301/
PDF	https://www.aclweb.org/anthology/W17-1301
PWC	https://paperswithcode.com/paper/identification-of-languages-in-algerian
Repo
Framework

Aye or naw, whit dae ye hink? Scottish independence and linguistic identity on social media


Title	Aye or naw, whit dae ye hink? Scottish independence and linguistic identity on social media
Authors	Philippa Shoemark, Debnil Sur, Luke Shrimpton, Iain Murray, Sharon Goldwater
Abstract	Political surveys have indicated a relationship between a sense of Scottish identity and voting decisions in the 2014 Scottish Independence Referendum. Identity is often reflected in language use, suggesting the intuitive hypothesis that individuals who support Scottish independence are more likely to use distinctively Scottish words than those who oppose it. In the first large-scale study of sociolinguistic variation on social media in the UK, we identify distinctively Scottish terms in a data-driven way, and find that these terms are indeed used at a higher rate by users of pro-independence hashtags than by users of anti-independence hashtags. However, we also find that in general people are less likely to use distinctively Scottish words in tweets with referendum-related hashtags than in their general Twitter activity. We attribute this difference to style shifting relative to audience, aligning with previous work showing that Twitter users tend to use fewer local variants when addressing a broader audience.
Tasks
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-1116/
PDF	https://www.aclweb.org/anthology/E17-1116
PWC	https://paperswithcode.com/paper/aye-or-naw-whit-dae-ye-hink-scottish
Repo
Framework

A Simple and Strong Baseline: NAIST-NICT Neural Machine Translation System for WAT2017 English-Japanese Translation Task


Title	A Simple and Strong Baseline: NAIST-NICT Neural Machine Translation System for WAT2017 English-Japanese Translation Task
Authors	Yusuke Oda, Katsuhito Sudoh, Satoshi Nakamura, Masao Utiyama, Eiichiro Sumita
Abstract	This paper describes the NAIST-NICT machine translation system for the WAT2017 English-Japanese Scientific Paper Translation Task. The system consists of a language-independent tokenizer and an attentional encoder-decoder style neural machine translation model. According to the official results, our system achieves higher translation accuracy than any system submitted to previous campaigns, despite its simple model architecture.
Tasks	Machine Translation
Published	2017-11-01
URL	https://www.aclweb.org/anthology/W17-5712/
PDF	https://www.aclweb.org/anthology/W17-5712
PWC	https://paperswithcode.com/paper/a-simple-and-strong-baseline-naist-nict
Repo
Framework

Parsing to 1-Endpoint-Crossing, Pagenumber-2 Graphs


Title	Parsing to 1-Endpoint-Crossing, Pagenumber-2 Graphs
Authors	Junjie Cao, Sheng Huang, Weiwei Sun, Xiaojun Wan
Abstract	We study the Maximum Subgraph problem in deep dependency parsing. We consider two restrictions to deep dependency graphs: (a) 1-endpoint-crossing and (b) pagenumber-2. Our main contribution is an exact algorithm that obtains maximum subgraphs satisfying both restrictions simultaneously in time $O(n^5)$. Moreover, ignoring one linguistically-rare structure decreases the complexity to $O(n^4)$. We also extend our quartic-time algorithm into a practical parser with a discriminative disambiguation model and evaluate its performance on four linguistic data sets used in semantic dependency parsing.
Tasks	Dependency Parsing, Semantic Dependency Parsing
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-1193/
PDF	https://www.aclweb.org/anthology/P17-1193
PWC	https://paperswithcode.com/paper/parsing-to-1-endpoint-crossing-pagenumber-2
Repo
Framework

Content Selection for Real-time Sports News Construction from Commentary Texts


Title	Content Selection for Real-time Sports News Construction from Commentary Texts
Authors	Jin-ge Yao, Jianmin Zhang, Xiaojun Wan, Jianguo Xiao
Abstract	We study the task of automatically constructing sports news reports from live commentary, focusing on content selection. Rather than receiving every piece of text of a sports match before news construction, as in previous related work, we verify the feasibility of a more challenging but more useful setting: generating news reports on the fly by treating live text input as a stream. Specifically, we design various scoring functions to address different requirements of the task. The near submodularity of the scoring functions makes it possible to adapt efficient greedy algorithms even in stream data settings. Experiments suggest that our proposed framework already produces results comparable with previous work that relies on a supervised learning-to-rank model with heavy feature engineering.
Tasks	Document Summarization, Feature Engineering, Learning-To-Rank, Text Generation
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-3504/
PDF	https://www.aclweb.org/anthology/W17-3504
PWC	https://paperswithcode.com/paper/content-selection-for-real-time-sports-news
Repo
Framework
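
To make the streaming greedy idea in the abstract above concrete, here is a minimal sketch under stated assumptions: the scoring function is plain word coverage (a standard submodular function chosen for illustration, not one of the paper's scoring functions), and the names `stream_select`, `budget`, and `min_gain` are hypothetical. Each incoming commentary line is kept only if its marginal gain over what is already selected reaches a threshold.

```python
from typing import Iterable, List, Set


def stream_select(sentences: Iterable[str], budget: int, min_gain: int) -> List[str]:
    """Single-pass greedy selection over a stream of commentary lines.

    A line is kept when it adds at least `min_gain` previously unseen words
    (its marginal coverage gain) and fewer than `budget` lines are selected.
    """
    covered: Set[str] = set()
    selected: List[str] = []
    for sentence in sentences:
        if len(selected) >= budget:
            break  # budget exhausted; ignore the rest of the stream
        words = set(sentence.lower().split())
        gain = len(words - covered)  # marginal gain of adding this line
        if gain >= min_gain:
            selected.append(sentence)
            covered |= words
    return selected


if __name__ == "__main__":
    stream = [
        "goal by messi",
        "goal by messi again",
        "red card for ramos",
        "messi goal",
    ]
    print(stream_select(stream, budget=2, min_gain=2))
```

Because a coverage function has diminishing returns, a line that repeats already-covered words contributes little gain and is skipped without revisiting earlier decisions.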