July 26, 2019

Paper Group NANR 12

Interoperable annotation of (co)references in the Democrat project


Title	Interoperable annotation of (co)references in the Democrat project
Authors	Loïc Grobol, Frédéric Landragin, Serge Heiden
Abstract
Tasks
Published	2017-01-01
URL	https://www.aclweb.org/anthology/W17-7411/
PDF	https://www.aclweb.org/anthology/W17-7411
PWC	https://paperswithcode.com/paper/interoperable-annotation-of-coreferences-in
Repo
Framework

Reference Scope Identification for Citances Using Convolutional Neural Networks


Title	Reference Scope Identification for Citances Using Convolutional Neural Networks
Authors	Saurav Jha, Aanchal Chaurasia, Akhilesh Sudhakar, Anil Kumar Singh
Abstract
Tasks
Published	2017-12-01
URL	https://www.aclweb.org/anthology/W17-7504/
PDF	https://www.aclweb.org/anthology/W17-7504
PWC	https://paperswithcode.com/paper/reference-scope-identification-for-citances
Repo
Framework

Optimistic posterior sampling for reinforcement learning: worst-case regret bounds


Title	Optimistic posterior sampling for reinforcement learning: worst-case regret bounds
Authors	Shipra Agrawal, Randy Jia
Abstract	We present an algorithm based on posterior sampling (aka Thompson sampling) that achieves near-optimal worst-case regret bounds when the underlying Markov Decision Process (MDP) is communicating with a finite, though unknown, diameter. Our main result is a high probability regret upper bound of $\tilde{O}(D\sqrt{SAT})$ for any communicating MDP with $S$ states, $A$ actions and diameter $D$, when $T\ge S^5A$. Here, regret compares the total reward achieved by the algorithm to the total expected reward of an optimal infinite-horizon undiscounted average reward policy, in time horizon $T$. This result improves over the best previously known upper bound of $\tilde{O}(DS\sqrt{AT})$ achieved by any algorithm in this setting, and matches the dependence on $S$ in the established lower bound of $\Omega(\sqrt{DSAT})$ for this problem. Our techniques involve proving some novel results about the anti-concentration of Dirichlet distribution, which may be of independent interest.
Tasks
Published	2017-12-01
URL	http://papers.nips.cc/paper/6718-optimistic-posterior-sampling-for-reinforcement-learning-worst-case-regret-bounds
PDF	http://papers.nips.cc/paper/6718-optimistic-posterior-sampling-for-reinforcement-learning-worst-case-regret-bounds.pdf
PWC	https://paperswithcode.com/paper/optimistic-posterior-sampling-for
Repo
Framework
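
As a quick check on the bounds quoted in the abstract above (plain arithmetic on those expressions, ignoring the logarithmic factors hidden by $\tilde{O}$), the new upper bound improves on the earlier one by a factor of $\sqrt{S}$ and leaves a gap of $\sqrt{D}$ to the lower bound, which is why only the dependence on $S$ is said to match:

$$\frac{DS\sqrt{AT}}{D\sqrt{SAT}} = \sqrt{S}, \qquad \frac{D\sqrt{SAT}}{\sqrt{DSAT}} = \sqrt{D}.$$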

Rule-based Reordering and Post-Processing for Indonesian-Korean Statistical Machine Translation


Title	Rule-based Reordering and Post-Processing for Indonesian-Korean Statistical Machine Translation
Authors	Candy Olivia Mawalim, Dessi Puji Lestari, Ayu Purwarianti
Abstract
Tasks	Language Modelling, Machine Translation
Published	2017-11-01
URL	https://www.aclweb.org/anthology/Y17-1039/
PDF	https://www.aclweb.org/anthology/Y17-1039
PWC	https://paperswithcode.com/paper/rule-based-reordering-and-post-processing-for
Repo
Framework

Normalization of Social Media Text using Deep Neural Networks


Title	Normalization of Social Media Text using Deep Neural Networks
Authors	Ajay Shankar Tiwari, Sudip Kumar Naskar
Abstract
Tasks
Published	2017-12-01
URL	https://www.aclweb.org/anthology/W17-7539/
PDF	https://www.aclweb.org/anthology/W17-7539
PWC	https://paperswithcode.com/paper/normalization-of-social-media-text-using-deep
Repo
Framework

Downstream use of syntactic analysis: does representation matter?


Title	Downstream use of syntactic analysis: does representation matter?
Authors	Lilja Øvrelid
Abstract
Tasks
Published	2017-01-01
URL	https://www.aclweb.org/anthology/W17-7602/
PDF	https://www.aclweb.org/anthology/W17-7602
PWC	https://paperswithcode.com/paper/downstream-use-of-syntactic-analysis-does
Repo
Framework

Document Level Novelty Detection: Textual Entailment Lends a Helping Hand


Title	Document Level Novelty Detection: Textual Entailment Lends a Helping Hand
Authors	Tanik Saikh, Tirthankar Ghosal, Asif Ekbal, Pushpak Bhattacharyya
Abstract
Tasks	Natural Language Inference
Published	2017-12-01
URL	https://www.aclweb.org/anthology/W17-7517/
PDF	https://www.aclweb.org/anthology/W17-7517
PWC	https://paperswithcode.com/paper/document-level-novelty-detection-textual
Repo
Framework

Query-based summarization using MDL principle


Title	Query-based summarization using MDL principle
Authors	Marina Litvak, Natalia Vanetik
Abstract	Query-based text summarization aims to extract essential information that answers the query from the original text. The answer is presented in a minimal, often predefined, number of words. In this paper we introduce a new unsupervised approach for query-based extractive summarization, based on the minimum description length (MDL) principle, that employs the Krimp compression algorithm (Vreeken et al., 2011). The key idea of our approach is to select frequent word sets related to a given query that compress document sentences better and therefore describe the document better. A summary is extracted by selecting sentences that best cover query-related frequent word sets. The approach is evaluated on the DUC 2005 and DUC 2006 datasets, which are specifically designed for query-based summarization (DUC, 2005, 2006). It competes with the best results.
Tasks	Language Modelling, Query-Based Extractive Summarization, Text Summarization
Published	2017-04-01
URL	https://www.aclweb.org/anthology/W17-1004/
PDF	https://www.aclweb.org/anthology/W17-1004
PWC	https://paperswithcode.com/paper/query-based-summarization-using-mdl-principle
Repo
Framework
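
The sentence-selection step described in the abstract above can be illustrated with a minimal sketch. It assumes the query-related frequent word sets have already been mined (the Krimp step is not shown), and it uses a simple greedy coverage rule under a word budget. The function name `select_sentences` and the greedy rule are illustrative assumptions, not the authors' implementation.

```python
def select_sentences(sentences, word_sets, max_words):
    """Greedily pick sentences that cover the most not-yet-covered word sets.

    sentences: list of strings.
    word_sets: list of sets of lowercase words (assumed to be mined beforehand).
    max_words: word budget for the whole summary.
    """
    tokenized = [set(s.lower().split()) for s in sentences]
    lengths = [len(s.split()) for s in sentences]
    covered, chosen, used = set(), [], 0
    while True:
        best, best_gain = None, 0
        for i, tokens in enumerate(tokenized):
            if i in chosen or used + lengths[i] > max_words:
                continue
            # A word set counts as covered when the sentence contains all its words.
            gain = sum(
                1
                for j, ws in enumerate(word_sets)
                if j not in covered and ws <= tokens
            )
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # nothing left that fits the budget and adds coverage
            break
        chosen.append(best)
        used += lengths[best]
        covered |= {j for j, ws in enumerate(word_sets) if ws <= tokenized[best]}
    # Return the chosen sentences in document order.
    return [sentences[i] for i in sorted(chosen)]
```

A sentence that covers no new word set is never selected, so the summary can be shorter than the budget allows.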

LIM-LIG at SemEval-2017 Task1: Enhancing the Semantic Similarity for Arabic Sentences with Vectors Weighting


Title	LIM-LIG at SemEval-2017 Task1: Enhancing the Semantic Similarity for Arabic Sentences with Vectors Weighting
Authors	El Moatez Billah Nagoudi, Jérémy Ferrero, Didier Schwab
Abstract	This article describes our proposed system named LIM-LIG. This system is designed for SemEval 2017 Task1: Semantic Textual Similarity (Track1). LIM-LIG proposes an innovative enhancement to a word embedding-based model devoted to measuring the semantic similarity of Arabic sentences. The main idea is to exploit the word representations as vectors in a multidimensional space to capture the semantic and syntactic properties of words. IDF weighting and Part-of-Speech tagging are applied to the examined sentences to support the identification of words that are highly descriptive in each sentence. The LIM-LIG system achieves a Pearson's correlation of 0.74633, ranking 2nd among all participants in the Arabic monolingual pairs STS task organized within the SemEval 2017 evaluation campaign.
Tasks	Information Retrieval, Machine Translation, Paraphrase Identification, Part-Of-Speech Tagging, Semantic Similarity, Semantic Textual Similarity, Word Sense Disambiguation
Published	2017-08-01
URL	https://www.aclweb.org/anthology/S17-2017/
PDF	https://www.aclweb.org/anthology/S17-2017
PWC	https://paperswithcode.com/paper/lim-lig-at-semeval-2017-task1-enhancing-the
Repo
Framework
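
The IDF-weighting idea in the abstract above can be sketched as follows: each sentence becomes an IDF-weighted sum of its word vectors, and two sentences are compared by cosine similarity. This is a minimal illustration under stated assumptions. The smoothed IDF formula, the function names, and the toy embeddings in the usage note are hypothetical, and the Part-of-Speech weighting mentioned in the abstract is left out.

```python
import math
from collections import Counter


def idf_table(corpus):
    """Smoothed inverse document frequency for every token in a tokenized corpus."""
    n_docs = len(corpus)
    doc_freq = Counter(tok for sent in corpus for tok in set(sent))
    return {
        tok: math.log((1 + n_docs) / (1 + df)) + 1.0 for tok, df in doc_freq.items()
    }


def sentence_vector(tokens, embeddings, idf):
    """IDF-weighted sum of the word vectors of the tokens found in `embeddings`."""
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    for tok in tokens:
        if tok in embeddings:
            weight = idf.get(tok, 1.0)  # unseen tokens get a neutral weight
            for i, x in enumerate(embeddings[tok]):
                vec[i] += weight * x
    return vec


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

Usage, with made-up two-dimensional embeddings: build `idf = idf_table(corpus)` from a list of tokenized sentences, then call `cosine(sentence_vector(a, emb, idf), sentence_vector(b, emb, idf))`. Rare words receive larger weights, so they dominate the sentence vector.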

STS-UHH at SemEval-2017 Task 1: Scoring Semantic Textual Similarity Using Supervised and Unsupervised Ensemble


Title	STS-UHH at SemEval-2017 Task 1: Scoring Semantic Textual Similarity Using Supervised and Unsupervised Ensemble
Authors	Sarah Kohail, Amr Rekaby Salama, Chris Biemann
Abstract	This paper reports the STS-UHH participation in the SemEval 2017 shared Task 1 of Semantic Textual Similarity (STS). Overall, we submitted 3 runs covering monolingual and cross-lingual STS tracks. Our participation involves two approaches: an unsupervised approach, which estimates a word alignment-based similarity score, and a supervised approach, which combines dependency graph similarity and coverage features with lexical similarity measures using regression methods. We also present a way of ensembling both models. Out of 84 submitted runs, our team's best multi-lingual run was ranked 12th in overall performance with a correlation of 0.61, and 7th among 31 participating teams.
Tasks	Graph Similarity, Information Retrieval, Natural Language Inference, Paraphrase Identification, Semantic Textual Similarity, Word Alignment
Published	2017-08-01
URL	https://www.aclweb.org/anthology/S17-2025/
PDF	https://www.aclweb.org/anthology/S17-2025
PWC	https://paperswithcode.com/paper/sts-uhh-at-semeval-2017-task-1-scoring
Repo
Framework

Sentence Alignment using Unfolding Recursive Autoencoders


Title	Sentence Alignment using Unfolding Recursive Autoencoders
Authors	Jeenu Grover, Pabitra Mitra
Abstract	In this paper, we propose a novel two step algorithm for sentence alignment in monolingual corpora using Unfolding Recursive Autoencoders. First, we use unfolding recursive auto-encoders (RAE) to learn feature vectors for phrases in syntactical tree of the sentence. To compare two sentences we use a similarity matrix which has dimensions proportional to the size of the two sentences. Since the similarity matrix generated to compare two sentences has varying dimension due to different sentence lengths, a dynamic pooling layer is used to map it to a matrix of fixed dimension. The resulting matrix is used to calculate the similarity scores between the two sentences. The second step of the algorithm captures the contexts in which the sentences occur in the document by using a dynamic programming algorithm for global alignment.
Tasks	Information Retrieval, Machine Translation, Paraphrase Identification, Question Answering, Text Summarization
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2503/
PDF	https://www.aclweb.org/anthology/W17-2503
PWC	https://paperswithcode.com/paper/sentence-alignment-using-unfolding-recursive
Repo
Framework
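
The dynamic pooling step mentioned in the abstract above maps a similarity matrix of any size to a fixed-size grid. The sketch below is a generic illustration of that idea, not the authors' code. The function name `dynamic_pool`, the choice of `max` as the default reducer, and the handling of matrices smaller than the target size (cells are reused) are assumptions.

```python
def dynamic_pool(matrix, out_size, reduce=max):
    """Pool a rows x cols matrix down (or up) to out_size x out_size.

    Rows and columns are split into out_size roughly equal spans, and each
    block of cells is collapsed to one value with `reduce`.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])

    def spans(n):
        # Span i covers indices [start, end); every span holds at least one
        # index, so inputs shorter than out_size reuse cells.
        result = []
        for i in range(out_size):
            start = i * n // out_size
            end = max(start + 1, (i + 1) * n // out_size)
            result.append((start, end))
        return result

    row_spans, col_spans = spans(n_rows), spans(n_cols)
    return [
        [
            reduce(matrix[r][c] for r in range(r0, r1) for c in range(c0, c1))
            for (c0, c1) in col_spans
        ]
        for (r0, r1) in row_spans
    ]
```

With similarity scores, `max` keeps the strongest match in each block; passing `reduce=min` would be the natural choice if the matrix held distances instead.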

LABDA at SemEval-2017 Task 10: Extracting Keyphrases from Scientific Publications by combining the BANNER tool and the UMLS Semantic Network


Title	LABDA at SemEval-2017 Task 10: Extracting Keyphrases from Scientific Publications by combining the BANNER tool and the UMLS Semantic Network
Authors	Isabel Segura-Bedmar, Cristóbal Colón-Ruiz, Paloma Martínez
Abstract	This paper describes the system presented by the LABDA group at SemEval 2017 Task 10 ScienceIE, specifically for the subtasks of identification and classification of keyphrases from scientific articles. For the task of identification, we use the BANNER tool, a named entity recognition system, which is based on conditional random fields (CRF) and has obtained successful results in the biomedical domain. To classify keyphrases, we study the UMLS semantic network and propose a possible linking between the keyphrase types and the UMLS semantic groups. Based on this semantic linking, we create a dictionary for each keyphrase type. Then, a feature indicating whether a token is found in one of these dictionaries is incorporated into the feature set used by the BANNER tool. The final results on the test dataset show that our system still needs to be improved, but the conditional random fields and, consequently, the BANNER system can be used as a first approximation to identify and classify keyphrases.
Tasks	Named Entity Recognition
Published	2017-08-01
URL	https://www.aclweb.org/anthology/S17-2164/
PDF	https://www.aclweb.org/anthology/S17-2164
PWC	https://paperswithcode.com/paper/labda-at-semeval-2017-task-10-extracting
Repo
Framework

A data-driven approach to verbal multiword expression detection. PARSEME Shared Task system description paper


Title	A data-driven approach to verbal multiword expression detection. PARSEME Shared Task system description paper
Authors	Tiberiu Boros, Sonia Pipa, Verginica Barbu Mititelu, Dan Tufis
Abstract	"Multiword expressions" are groups of words acting as a morphologic, syntactic and semantic unit in linguistic analysis. Verbal multiword expressions represent the subgroup of multiword expressions, namely that in which a verb is the syntactic head of the group considered in its canonical (or dictionary) form. All multiword expressions are a great challenge for natural language processing, but the verbal ones are particularly interesting for tasks such as parsing, as the verb is the central element in the syntactic organization of a sentence. In this paper we introduce our data-driven approach to verbal multiword expressions which was objectively validated during the PARSEME shared task on verbal multiword expressions identification. We tested our approach on 12 languages, and we provide detailed information about corpora composition, feature selection process, validation procedure and performance on all languages.
Tasks	Feature Selection, Lemmatization, Tokenization
Published	2017-04-01
URL	https://www.aclweb.org/anthology/W17-1716/
PDF	https://www.aclweb.org/anthology/W17-1716
PWC	https://paperswithcode.com/paper/a-data-driven-approach-to-verbal-multiword
Repo
Framework

LIMSI@CoNLL’17: UD Shared Task


Title	LIMSI@CoNLL’17: UD Shared Task
Authors	Lauriane Aufrant, Guillaume Wisniewski, François Yvon
Abstract	This paper describes LIMSI's submission to the CoNLL 2017 UD Shared Task, which is focused on small treebanks, and how to improve low-resourced parsing only by ad hoc combination of multiple views and resources. We present our approach for low-resourced parsing, together with a detailed analysis of the results for each test treebank. We also report extensive analysis experiments on model selection for the PUD treebanks, and on annotation consistency among UD treebanks.
Tasks	Model Selection, Tokenization
Published	2017-08-01
URL	https://www.aclweb.org/anthology/K17-3017/
PDF	https://www.aclweb.org/anthology/K17-3017
PWC	https://paperswithcode.com/paper/limsiconll17-ud-shared-task
Repo
Framework

Playing with Embeddings : Evaluating embeddings for Robot Language Learning through MUD Games


Title	Playing with Embeddings : Evaluating embeddings for Robot Language Learning through MUD Games
Authors	Anmol Gulati, Kumar Krishna Agrawal
Abstract	Acquiring language provides a ubiquitous mode of communication, across humans and robots. To this effect, distributional representations of words based on co-occurrence statistics, have provided significant advancements ranging across machine translation to comprehension. In this paper, we study the suitability of using general purpose word-embeddings for language learning in robots. We propose using text-based games as a proxy to evaluating word embedding on real robots. Based in a risk-reward setting, we review the effectiveness of the embeddings in navigating tasks in fantasy games, as an approximation to their performance on more complex scenarios, like language assisted robot navigation.
Tasks	Machine Translation, Robot Navigation, Self-Driving Cars, Word Embeddings
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-5305/
PDF	https://www.aclweb.org/anthology/W17-5305
PWC	https://paperswithcode.com/paper/playing-with-embeddings-evaluating-embeddings
Repo
Framework