July 26, 2019


Paper Group NANR 52


LDCCNLP at IJCNLP-2017 Task 2: Dimensional Sentiment Analysis for Chinese Phrases Using Machine Learning


Title	LDCCNLP at IJCNLP-2017 Task 2: Dimensional Sentiment Analysis for Chinese Phrases Using Machine Learning
Authors	Peng Zhong, Jingbin Wang
Abstract	Sentiment analysis on Chinese text has been intensively studied. The basic task for related research is to construct an affective lexicon and thereby predict emotional scores of different levels. However, finite lexicon resources make it difficult to effectively and automatically distinguish between various types of sentiment information in Chinese texts. This IJCNLP2017-Task2 competition seeks to automatically calculate Valence and Arousal ratings within the hierarchies of vocabulary and phrases in Chinese. We introduce a regression methodology to automatically recognize continuous emotional values, and incorporate a word embedding technique. In our system, the MAE predictive values of Valence and Arousal were 0.811 and 0.996, respectively, for the sentiment dimension prediction of words in Chinese. In phrase prediction, the corresponding results were 0.822 and 0.489, ranking sixth among all teams.
Tasks	Sentiment Analysis, Word Embeddings
Published	2017-12-01
URL	https://www.aclweb.org/anthology/I17-4013/
PDF	https://www.aclweb.org/anthology/I17-4013
PWC	https://paperswithcode.com/paper/ldccnlp-at-ijcnlp-2017-task-2-dimensional
Repo
Framework

Initial Explorations of CCG Supertagging for Universal Dependency Parsing


Title	Initial Explorations of CCG Supertagging for Universal Dependency Parsing
Authors	Burak Kerim Akkus, Heval Azizoglu, Ruket Cakici
Abstract	In this paper we describe the system by the METU team for universal dependency parsing of multilingual text. We use a neural network-based dependency parser that takes a greedy transition approach to dependency parsing. CCG supertags contain rich structural information that proves useful in certain NLP tasks. We experiment with CCG supertags as additional features. The neural network parser is trained together with dependencies and simplified CCG tags as well as other features provided.
Tasks	CCG Supertagging, Dependency Parsing, Machine Translation
Published	2017-08-01
URL	https://www.aclweb.org/anthology/K17-3023/
PDF	https://www.aclweb.org/anthology/K17-3023
PWC	https://paperswithcode.com/paper/initial-explorations-of-ccg-supertagging-for
Repo
Framework

Translation Memory Systems Have a Long Way to Go


Title	Translation Memory Systems Have a Long Way to Go
Authors	Andrea Silvestre Baquero, Ruslan Mitkov
Abstract	Translation memory (TM) systems have changed the work of translators, and translators not benefiting from these tools are now a tiny minority. These tools operate mostly on fuzzy (surface) matching and cannot benefit from already translated texts which are synonymous to (or paraphrased versions of) the text to be translated. The match score is mostly based on character-string similarity, calculated through Levenshtein distance. The TM tools have difficulties with detecting similarities even in sentences which represent a minor revision of sentences already available in the translation memory. This shortcoming of the current TM systems was the subject of the present study and was empirically proven in the experiments we conducted. To this end, we compiled a small translation memory (English-Spanish) and applied several lexical and syntactic transformation rules to the source sentences with both English and Spanish being the source language. The results of this study show that current TM systems have a long way to go and highlight the need for TM systems equipped with NLP capabilities which will offer the translator the advantage of not having to translate a sentence again if an almost identical sentence has already been translated.
Tasks	Machine Translation
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-7906/
PDF	https://doi.org/10.26615/978-954-452-042-7_006
PWC	https://paperswithcode.com/paper/translation-memory-systems-have-a-long-way-to
Repo
Framework
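
The abstract above says that TM match scores are mostly based on character-string similarity computed through Levenshtein distance. The sketch below illustrates that idea only; the normalization used in `fuzzy_match_score` (one minus the distance divided by the longer string's length) is an assumption made for illustration, not the formula of any particular TM tool or of the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_match_score(segment: str, tm_entry: str) -> float:
    """Hypothetical match score in [0, 1]: 1 minus the normalized edit distance."""
    longest = max(len(segment), len(tm_entry))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(segment, tm_entry) / longest


# A surface-level revision scores high; a paraphrase with the same meaning scores low.
near_copy = fuzzy_match_score("I like cats", "I like cats.")
paraphrase = fuzzy_match_score("I like cats", "Cats are liked by me")
print(near_copy > paraphrase)
```

A score of this kind only sees characters, which is consistent with the limitation the abstract describes: a paraphrase can be far away in edit distance even when the meaning is unchanged.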

QU-BIGIR at SemEval 2017 Task 3: Using Similarity Features for Arabic Community Question Answering Forums


Title	QU-BIGIR at SemEval 2017 Task 3: Using Similarity Features for Arabic Community Question Answering Forums
Authors	Marwan Torki, Maram Hasanain, Tamer Elsayed
Abstract	In this paper we describe our QU-BIGIR system for the Arabic subtask D of the SemEval 2017 Task 3. Our approach builds on our participation in the past version of the same subtask. This year, our system uses different similarity measures that encode lexical and semantic pairwise similarity of text pairs. In addition to well-known similarity measures such as cosine similarity, we use other measures based on the summary statistics of word embedding representation for a given text. To rank a list of candidate question answer pairs for a given question, we learn a linear SVM classifier over our similarity features. Our best resulting run came second in subtask D with a very competitive performance to the first-ranking system.
Tasks	Community Question Answering, Question Answering, Semantic Textual Similarity
Published	2017-08-01
URL	https://www.aclweb.org/anthology/S17-2059/
PDF	https://www.aclweb.org/anthology/S17-2059
PWC	https://paperswithcode.com/paper/qu-bigir-at-semeval-2017-task-3-using
Repo
Framework
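
The abstract above mentions cosine similarity and measures based on summary statistics of word embeddings. As a minimal sketch of one such feature, the code below averages per-token vectors (the mean being one possible summary statistic, chosen here as an assumption) and compares two texts by cosine similarity. The toy `embeddings` table and both function names are hypothetical and are not taken from the QU-BIGIR system.

```python
import math


def mean_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding; None if none do."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[k] for v in vecs) / len(vecs) for k in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy two-dimensional embeddings, invented for illustration only.
embeddings = {
    "visa": [0.9, 0.1],
    "permit": [0.8, 0.2],
    "weather": [0.1, 0.9],
}
question = mean_vector(["visa", "renewal"], embeddings)
candidate = mean_vector(["permit"], embeddings)
print(cosine(question, candidate))
```

In a setup like the one the abstract describes, a value such as this would be one entry in a feature vector passed to a classifier, rather than a ranking score on its own.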

RelTextRank: An Open Source Framework for Building Relational Syntactic-Semantic Text Pair Representations


Title	RelTextRank: An Open Source Framework for Building Relational Syntactic-Semantic Text Pair Representations
Authors	Kateryna Tymoshenko, Alessandro Moschitti, Massimo Nicosia, Aliaksei Severyn
Abstract
Tasks	Community Question Answering, Natural Language Inference, Question Answering, Relational Reasoning
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-4014/
PDF	https://www.aclweb.org/anthology/P17-4014
PWC	https://paperswithcode.com/paper/reltextrank-an-open-source-framework-for
Repo
Framework

The Influence of Spelling Errors on Content Scoring Performance


Title	The Influence of Spelling Errors on Content Scoring Performance
Authors	Andrea Horbach, Yuning Ding, Torsten Zesch
Abstract	Spelling errors occur frequently in educational settings, but their influence on automatic scoring is largely unknown. We therefore investigate the influence of spelling errors on content scoring performance using the example of the ASAP corpus. We conduct an annotation study on the nature of spelling errors in the ASAP dataset and utilize these findings in machine learning experiments that measure the influence of spelling errors on automatic content scoring. Our main finding is that scoring methods using both token and character n-gram features are robust against spelling errors up to the error frequency in ASAP.
Tasks
Published	2017-12-01
URL	https://www.aclweb.org/anthology/W17-5908/
PDF	https://www.aclweb.org/anthology/W17-5908
PWC	https://paperswithcode.com/paper/the-influence-of-spelling-errors-on-content
Repo
Framework
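
The abstract above reports that features combining tokens and character n-grams are robust to spelling errors. One intuition, sketched below under simplifying assumptions, is that a misspelled word shares no token with its correct form but still shares several character n-grams. The helper `char_ngrams` and the choice of bigrams are illustrative only and do not reproduce the paper's feature extraction.

```python
def char_ngrams(text: str, n: int = 2) -> set:
    """Set of all contiguous character n-grams in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


correct, misspelled = "receive", "recieve"

# Token level: the two spellings are simply different tokens.
shared_tokens = {correct} & {misspelled}

# Character level: part of the evidence survives the transposition.
shared_bigrams = char_ngrams(correct) & char_ngrams(misspelled)

print(len(shared_tokens))      # 0
print(sorted(shared_bigrams))  # ['ec', 're', 've']
```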

A graph-theoretic approach to multitasking


Title	A graph-theoretic approach to multitasking
Authors	Noga Alon, Daniel Reichman, Igor Shinkar, Tal Wagner, Sebastian Musslick, Jonathan D. Cohen, Tom Griffiths, Biswadip Dey, Kayhan Ozcimder
Abstract	A key feature of neural network architectures is their ability to support the simultaneous interaction among large numbers of units in the learning and processing of representations. However, how the richness of such interactions trades off against the ability of a network to simultaneously carry out multiple independent processes – a salient limitation in many domains of human cognition – remains largely unexplored. In this paper we use a graph-theoretic analysis of network architecture to address this question, where tasks are represented as edges in a bipartite graph $G=(A \cup B, E)$. We define a new measure of multitasking capacity of such networks, based on the assumptions that tasks that need to be multitasked rely on independent resources, i.e., form a matching, and that tasks can be performed without interference if they form an induced matching. Our main result is an inherent tradeoff between the multitasking capacity and the average degree of the network that holds regardless of the network architecture. These results are also extended to networks of depth greater than $2$. On the positive side, we demonstrate that networks that are random-like (e.g., locally sparse) can have desirable multitasking properties. Our results shed light on the parallel-processing limitations of neural systems and provide insights that may be useful for the analysis and design of parallel architectures.
Tasks
Published	2017-12-01
URL	http://papers.nips.cc/paper/6805-a-graph-theoretic-approach-to-multitasking
PDF	http://papers.nips.cc/paper/6805-a-graph-theoretic-approach-to-multitasking.pdf
PWC	https://paperswithcode.com/paper/a-graph-theoretic-approach-to-multitasking
Repo
Framework
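
The abstract above distinguishes a matching from an induced matching in the bipartite task graph $G=(A \cup B, E)$. The sketch below checks only those two standard graph definitions for a given set of tasks; it does not compute the paper's multitasking capacity measure. The function names and the representation of edges as `(a, b)` tuples are assumptions made for illustration.

```python
def is_matching(tasks) -> bool:
    """True if no two tasks (edges) share an endpoint on either side."""
    tasks = list(tasks)
    left = [a for a, _ in tasks]
    right = [b for _, b in tasks]
    return len(set(left)) == len(left) and len(set(right)) == len(right)


def is_induced_matching(tasks, edges) -> bool:
    """True if tasks is a matching in the graph and no other edge of the
    graph connects an input node of one task to an output node of another."""
    tasks, edges = set(tasks), set(edges)
    if not tasks <= edges or not is_matching(tasks):
        return False
    left = {a for a, _ in tasks}
    right = {b for _, b in tasks}
    # Edges of the subgraph induced by the endpoints of the chosen tasks.
    induced = {(a, b) for (a, b) in edges if a in left and b in right}
    return induced == tasks


edges = {("a1", "b1"), ("a2", "b2"), ("a1", "b2")}
tasks = {("a1", "b1"), ("a2", "b2")}
print(is_matching(tasks))                 # True: no shared endpoints
print(is_induced_matching(tasks, edges))  # False: edge a1-b2 links the two tasks
```

In this toy graph the two tasks use separate nodes, yet the extra edge `("a1", "b2")` connects them, which is the kind of interference the induced-matching condition rules out.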

Interpreting Strategies Annotation in the WAW Corpus


Title	Interpreting Strategies Annotation in the WAW Corpus
Authors	Irina Temnikova, Ahmed Abdelali, Samy Hedaya, Stephan Vogel, Aishah Al Daher
Abstract	With the aim of teaching our automatic speech-to-text translation system human interpreting strategies, our first step is to identify which interpreting strategies are most often used in the language pair of our interest (English-Arabic). In this article we run an automatic analysis of a corpus of parallel speeches and their human interpretations, and provide the results of manually annotating the human interpreting strategies in a sample of the corpus. We give a glimpse of the corpus, whose value surpasses the fact that it contains a high number of scientific speeches with their interpretations from English into Arabic, as it also provides rich information about the interpreters. We also discuss the difficulties we encountered along the way, as well as our solutions to them: our methodology for manual re-segmentation and alignment of parallel segments, the choice of annotation tool, and the annotation procedure. Our annotation findings explain the previously extracted specific statistical features of the interpreted corpus (compared with a translation one) as well as the quality of interpretation provided by different interpreters.
Tasks	Machine Translation
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-7905/
PDF	https://doi.org/10.26615/978-954-452-042-7_005
PWC	https://paperswithcode.com/paper/interpreting-strategies-annotation-in-the-waw
Repo
Framework

Adaptive Batch Size for Safe Policy Gradients


Title	Adaptive Batch Size for Safe Policy Gradients
Authors	Matteo Papini, Matteo Pirotta, Marcello Restelli
Abstract	Policy gradient methods are among the best Reinforcement Learning (RL) techniques to solve complex control problems. In real-world RL applications, it is common to have a good initial policy whose performance needs to be improved, and it may not be acceptable to try bad policies during the learning process. Although several methods for choosing the step size exist, research has paid less attention to determining the batch size, that is, the number of samples used to estimate the gradient direction for each update of the policy parameters. In this paper, we propose a set of methods to jointly optimize the step and the batch sizes that guarantee (with high probability) to improve the policy performance after each update. Besides providing theoretical guarantees, we show numerical simulations to analyse the behaviour of our methods.
Tasks	Policy Gradient Methods
Published	2017-12-01
URL	http://papers.nips.cc/paper/6950-adaptive-batch-size-for-safe-policy-gradients
PDF	http://papers.nips.cc/paper/6950-adaptive-batch-size-for-safe-policy-gradients.pdf
PWC	https://paperswithcode.com/paper/adaptive-batch-size-for-safe-policy-gradients
Repo
Framework

(Re)introducing Regular Graph Languages


Title	(Re)introducing Regular Graph Languages
Authors	Sorcha Gilroy, Adam Lopez, Sebastian Maneth, Pijus Simonaitis
Abstract
Tasks	Machine Translation
Published	2017-07-01
URL	https://www.aclweb.org/anthology/W17-3410/
PDF	https://www.aclweb.org/anthology/W17-3410
PWC	https://paperswithcode.com/paper/reintroducing-regular-graph-languages
Repo
Framework

Enhancing Automatic ICD-9-CM Code Assignment for Medical Texts with PubMed


Title	Enhancing Automatic ICD-9-CM Code Assignment for Medical Texts with PubMed
Authors	Danchen Zhang, Daqing He, Sanqiang Zhao, Lei Li
Abstract	Assigning a standard ICD-9-CM code to disease symptoms in medical texts is an important task in the medical domain. Automating this process could greatly reduce the costs. However, the effectiveness of an automatic ICD-9-CM code classifier faces a serious problem, which can be triggered by unbalanced training data. Frequent diseases often have more training data, which helps its classification to perform better than that of an infrequent disease. However, a disease's frequency does not necessarily reflect its importance. To resolve this training data shortage problem, we propose to strategically draw data from PubMed to enrich the training data when there is such need. We validate our method on the CMC dataset, and the evaluation results indicate that our method can significantly improve the code assignment classifiers' performance at the macro-averaging level.
Tasks
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2333/
PDF	https://www.aclweb.org/anthology/W17-2333
PWC	https://paperswithcode.com/paper/enhancing-automatic-icd-9-cm-code-assignment
Repo
Framework

A Cross-modal Review of Indicators for Depression Detection Systems


Title	A Cross-modal Review of Indicators for Depression Detection Systems
Authors	Michelle Morales, Stefan Scherer, Rivka Levitan
Abstract	Automatic detection of depression has attracted increasing attention from researchers in psychology, computer science, linguistics, and related disciplines. As a result, promising depression detection systems have been reported. This paper surveys these efforts by presenting the first cross-modal review of depression detection systems and discusses best practices and most promising approaches to this task.
Tasks
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-3101/
PDF	https://www.aclweb.org/anthology/W17-3101
PWC	https://paperswithcode.com/paper/a-cross-modal-review-of-indicators-for
Repo
Framework

Shared Task Proposal: Multilingual Surface Realization Using Universal Dependency Trees


Title	Shared Task Proposal: Multilingual Surface Realization Using Universal Dependency Trees
Authors	Simon Mille, Bernd Bohnet, Leo Wanner, Anja Belz
Abstract	We propose a shared task on multilingual Surface Realization, i.e., on mapping unordered and uninflected universal dependency trees to correctly ordered and inflected sentences in a number of languages. A second deeper input will be available in which, in addition, functional words, fine-grained PoS and morphological information will be removed from the input trees. The first shared task on Surface Realization was carried out in 2011 with a similar setup, with a focus on English. We think that it is time for relaunching such a shared task effort in view of the arrival of Universal Dependencies annotated treebanks for a large number of languages on the one hand, and the increasing dominance of Deep Learning, which proved to be a game changer for NLP, on the other hand.
Tasks	Machine Translation, Text Generation
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-3517/
PDF	https://www.aclweb.org/anthology/W17-3517
PWC	https://paperswithcode.com/paper/shared-task-proposal-multilingual-surface
Repo
Framework

Trust, but Verify! Better Entity Linking through Automatic Verification


Title	Trust, but Verify! Better Entity Linking through Automatic Verification
Authors	Benjamin Heinzerling, Michael Strube, Chin-Yew Lin
Abstract	We introduce automatic verification as a post-processing step for entity linking (EL). The proposed method trusts EL system results collectively, by assuming entity mentions are mostly linked correctly, in order to create a semantic profile of the given text using geospatial and temporal information, as well as fine-grained entity types. This profile is then used to automatically verify each linked mention individually, i.e., to predict whether it has been linked correctly or not. Verification allows leveraging a rich set of global and pairwise features that would be prohibitively expensive for EL systems employing global inference. Evaluation shows consistent improvements across datasets and systems. In particular, when applied to state-of-the-art systems, our method yields an absolute improvement in linking performance of up to 1.7 F1 on AIDA/CoNLL'03 and up to 2.4 F1 on the English TAC KBP 2015 TEDL dataset.
Tasks	Entity Linking
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-1078/
PDF	https://www.aclweb.org/anthology/E17-1078
PWC	https://paperswithcode.com/paper/trust-but-verify-better-entity-linking
Repo
Framework

Probabilistic Inference for Cold Start Knowledge Base Population with Prior World Knowledge


Title	Probabilistic Inference for Cold Start Knowledge Base Population with Prior World Knowledge
Authors	Bonan Min, Marjorie Freedman, Talya Meltzer
Abstract	Building knowledge bases (KB) automatically from text corpora is crucial for many applications such as question answering and web search. The problem is very challenging and has been divided into sub-problems such as mention and named entity recognition, entity linking and relation extraction. However, combining these components has been shown to be under-constrained and often produces KBs with supersize entities and common-sense errors in relations (a person has multiple birthdates). The errors are difficult to resolve solely with IE tools but become obvious with world knowledge at the corpus level. By analyzing Freebase and a large text collection, we found that per-relation cardinality and the popularity of entities follow the power-law distribution favoring flat long tails with low-frequency instances. We present a probabilistic joint inference algorithm to incorporate this world knowledge during KB construction. Our approach yields state-of-the-art performance on the TAC Cold Start task, and 42% and 19.4% relative improvements in F1 over our baseline on Cold Start hop-1 and all-hop queries respectively.
Tasks	Common Sense Reasoning, Entity Linking, Knowledge Base Population, Named Entity Recognition, Question Answering, Relation Extraction
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-1057/
PDF	https://www.aclweb.org/anthology/E17-1057
PWC	https://paperswithcode.com/paper/probabilistic-inference-for-cold-start
Repo
Framework