July 26, 2019

Paper Group NANR 52

LDCCNLP at IJCNLP-2017 Task 2: Dimensional Sentiment Analysis for Chinese Phrases Using Machine Learning. Initial Explorations of CCG Supertagging for Universal Dependency Parsing. Translation Memory Systems Have a Long Way to Go. QU-BIGIR at SemEval 2017 Task 3: Using Similarity Features for Arabic Community Question Answering Forums. RelTextRank: …

LDCCNLP at IJCNLP-2017 Task 2: Dimensional Sentiment Analysis for Chinese Phrases Using Machine Learning

Title LDCCNLP at IJCNLP-2017 Task 2: Dimensional Sentiment Analysis for Chinese Phrases Using Machine Learning
Authors Peng Zhong, Jingbin Wang
Abstract Sentiment analysis on Chinese text has been intensively studied. The basic task for related research is to construct an affective lexicon and thereby predict emotional scores of different levels. However, finite lexicon resources make it difficult to effectively and automatically distinguish between various types of sentiment information in Chinese texts. This IJCNLP2017-Task2 competition seeks to automatically calculate Valence and Arousal ratings within the hierarchies of vocabulary and phrases in Chinese. We introduce a regression methodology to automatically recognize continuous emotional values, and incorporate a word embedding technique. In our system, the mean absolute error (MAE) values for Valence and Arousal were 0.811 and 0.996, respectively, for the sentiment dimension prediction of words in Chinese. In phrase prediction, the corresponding results were 0.822 and 0.489, ranking sixth among all teams.
Tasks Sentiment Analysis, Word Embeddings
Published 2017-12-01
URL https://www.aclweb.org/anthology/I17-4013/
PDF https://www.aclweb.org/anthology/I17-4013
PWC https://paperswithcode.com/paper/ldccnlp-at-ijcnlp-2017-task-2-dimensional
Repo
Framework
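
The Valence and Arousal results in the abstract above are reported as MAE (mean absolute error). As a minimal sketch of how that metric is computed, the snippet below averages the absolute differences between gold and predicted ratings. The function name and the sample ratings are hypothetical and are not taken from the paper or the shared task data.

```python
def mean_absolute_error(gold, predicted):
    """Average absolute difference between gold and predicted ratings."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(g - p) for g, p in zip(gold, predicted)) / len(gold)


# Hypothetical valence ratings on a 1-9 scale (illustration only).
gold_valence = [6.0, 3.5, 7.2]
predicted_valence = [5.5, 4.0, 7.0]

mae = mean_absolute_error(gold_valence, predicted_valence)
print(f"MAE = {mae:.3f}")
```

Lower values are better: an MAE of 0 would mean every predicted rating matches its gold rating exactly.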

Initial Explorations of CCG Supertagging for Universal Dependency Parsing

Title Initial Explorations of CCG Supertagging for Universal Dependency Parsing
Authors Burak Kerim Akkus, Heval Azizoglu, Ruket Cakici
Abstract In this paper we describe the system by the METU team for universal dependency parsing of multilingual text. We use a neural network-based dependency parser that takes a greedy transition approach to dependency parsing. CCG supertags contain rich structural information that proves useful in certain NLP tasks. We use CCG supertags as additional features in our experiments. The neural network parser is trained together with dependencies and simplified CCG tags as well as other features provided.
Tasks CCG Supertagging, Dependency Parsing, Machine Translation
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-3023/
PDF https://www.aclweb.org/anthology/K17-3023
PWC https://paperswithcode.com/paper/initial-explorations-of-ccg-supertagging-for
Repo
Framework

Translation Memory Systems Have a Long Way to Go

Title Translation Memory Systems Have a Long Way to Go
Authors Andrea Silvestre Baquero, Ruslan Mitkov
Abstract Translation memory (TM) systems have changed the work of translators, and translators who do not benefit from these tools are now a tiny minority. These tools operate mostly on fuzzy (surface) matching and cannot benefit from already translated texts which are synonymous to (or paraphrased versions of) the text to be translated. The match score is mostly based on character-string similarity, calculated through Levenshtein distance. The TM tools have difficulties with detecting similarities even in sentences which represent a minor revision of sentences already available in the translation memory. This shortcoming of the current TM systems was the subject of the present study and was empirically proven in the experiments we conducted. To this end, we compiled a small translation memory (English-Spanish) and applied several lexical and syntactic transformation rules to the source sentences with both English and Spanish being the source language. The results of this study show that current TM systems have a long way to go and highlight the need for TM systems equipped with NLP capabilities, which will offer translators the advantage of not having to translate a sentence again if an almost identical sentence has already been translated.
Tasks Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-7906/
PDF https://doi.org/10.26615/978-954-452-042-7_006
PWC https://paperswithcode.com/paper/translation-memory-systems-have-a-long-way-to
Repo
Framework
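
The abstract above notes that TM match scores are mostly based on character-string similarity computed through Levenshtein distance. The sketch below illustrates that idea under one common assumption: the score is one minus the edit distance divided by the longer string's length. Commercial TM tools use their own undisclosed formulas, so `fuzzy_match_score` and the example sentences are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def fuzzy_match_score(segment: str, tm_entry: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(segment), len(tm_entry))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(segment, tm_entry) / longest


# Hypothetical TM entry and two new segments to translate.
tm_entry = "The committee approved the proposal."
minor_edit = "The committee approved the proposals."
paraphrase = "The proposal was approved by the committee."

print(fuzzy_match_score(minor_edit, tm_entry))
print(fuzzy_match_score(paraphrase, tm_entry))
```

The passive-voice paraphrase carries the same meaning as the stored entry yet scores far lower than the one-character revision, which is the surface-matching limitation the study examines.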

QU-BIGIR at SemEval 2017 Task 3: Using Similarity Features for Arabic Community Question Answering Forums

Title QU-BIGIR at SemEval 2017 Task 3: Using Similarity Features for Arabic Community Question Answering Forums
Authors Marwan Torki, Maram Hasanain, Tamer Elsayed
Abstract In this paper we describe our QU-BIGIR system for the Arabic subtask D of the SemEval 2017 Task 3. Our approach builds on our participation in the past version of the same subtask. This year, our system uses different similarity measures that encode lexical and semantic pairwise similarity of text pairs. In addition to well-known similarity measures such as cosine similarity, we use other measures based on the summary statistics of word embedding representation for a given text. To rank a list of candidate question answer pairs for a given question, we learn a linear SVM classifier over our similarity features. Our best resulting run came second in subtask D with a very competitive performance to the first-ranking system.
Tasks Community Question Answering, Question Answering, Semantic Textual Similarity
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2059/
PDF https://www.aclweb.org/anthology/S17-2059
PWC https://paperswithcode.com/paper/qu-bigir-at-semeval-2017-task-3-using
Repo
Framework
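
The abstract above mentions cosine similarity and measures built on summary statistics of word embeddings. A minimal sketch of one such feature follows, assuming the mean is the summary statistic; the abstract does not say which statistics the system actually uses. The two-dimensional embedding table and helper names are hypothetical, and the linear SVM ranking step is not shown.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for all-zero vectors
    return dot / (norm_u * norm_v)


def mean_embedding(tokens, embeddings, dim):
    """Average the vectors of known tokens; zero vector if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Hypothetical toy embeddings (real systems use pretrained vectors).
toy_embeddings = {
    "visa": [1.0, 0.0],
    "passport": [0.9, 0.1],
    "weather": [0.0, 1.0],
}

question = mean_embedding(["visa", "renewal"], toy_embeddings, dim=2)
candidate_a = mean_embedding(["passport", "visa"], toy_embeddings, dim=2)
candidate_b = mean_embedding(["weather"], toy_embeddings, dim=2)

# One similarity feature per (question, candidate) pair.
print(cosine_similarity(question, candidate_a))
print(cosine_similarity(question, candidate_b))
```

In a full pipeline, several features of this kind would be stacked into one vector per question-candidate pair and passed to the classifier.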

RelTextRank: An Open Source Framework for Building Relational Syntactic-Semantic Text Pair Representations

Title RelTextRank: An Open Source Framework for Building Relational Syntactic-Semantic Text Pair Representations
Authors Kateryna Tymoshenko, Alessandro Moschitti, Massimo Nicosia, Aliaksei Severyn
Abstract
Tasks Community Question Answering, Natural Language Inference, Question Answering, Relational Reasoning
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-4014/
PDF https://www.aclweb.org/anthology/P17-4014
PWC https://paperswithcode.com/paper/reltextrank-an-open-source-framework-for
Repo
Framework

The Influence of Spelling Errors on Content Scoring Performance

Title The Influence of Spelling Errors on Content Scoring Performance
Authors Andrea Horbach, Yuning Ding, Torsten Zesch
Abstract Spelling errors occur frequently in educational settings, but their influence on automatic scoring is largely unknown. We therefore investigate the influence of spelling errors on content scoring performance using the example of the ASAP corpus. We conduct an annotation study on the nature of spelling errors in the ASAP dataset and utilize these findings in machine learning experiments that measure the influence of spelling errors on automatic content scoring. Our main finding is that scoring methods using both token and character n-gram features are robust against spelling errors up to the error frequency in ASAP.
Tasks
Published 2017-12-01
URL https://www.aclweb.org/anthology/W17-5908/
PDF https://www.aclweb.org/anthology/W17-5908
PWC https://paperswithcode.com/paper/the-influence-of-spelling-errors-on-content
Repo
Framework

A graph-theoretic approach to multitasking

Title A graph-theoretic approach to multitasking
Authors Noga Alon, Daniel Reichman, Igor Shinkar, Tal Wagner, Sebastian Musslick, Jonathan D. Cohen, Tom Griffiths, Biswadip Dey, Kayhan Ozcimder
Abstract A key feature of neural network architectures is their ability to support the simultaneous interaction among large numbers of units in the learning and processing of representations. However, how the richness of such interactions trades off against the ability of a network to simultaneously carry out multiple independent processes – a salient limitation in many domains of human cognition – remains largely unexplored. In this paper we use a graph-theoretic analysis of network architecture to address this question, where tasks are represented as edges in a bipartite graph $G=(A \cup B, E)$. We define a new measure of multitasking capacity of such networks, based on the assumptions that tasks that need to be multitasked rely on independent resources, i.e., form a matching, and that tasks can be performed without interference if they form an induced matching. Our main result is an inherent tradeoff between the multitasking capacity and the average degree of the network that holds regardless of the network architecture. These results are also extended to networks of depth greater than $2$. On the positive side, we demonstrate that networks that are random-like (e.g., locally sparse) can have desirable multitasking properties. Our results shed light on the parallel-processing limitations of neural systems and provide insights that may be useful for the analysis and design of parallel architectures.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/6805-a-graph-theoretic-approach-to-multitasking
PDF http://papers.nips.cc/paper/6805-a-graph-theoretic-approach-to-multitasking.pdf
PWC https://paperswithcode.com/paper/a-graph-theoretic-approach-to-multitasking
Repo
Framework
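
The abstract above distinguishes a matching (tasks that share no input or output node) from an induced matching (a matching with no graph edge running between two different tasks in the set). The sketch below checks both properties for a set of tasks given as edges of a bipartite graph. It uses the standard graph-theoretic definitions; the function names and the tiny example graph are hypothetical, and the paper's capacity measure itself is not computed here.

```python
from itertools import combinations


def is_matching(tasks):
    """True if no two tasks (edges) share an input or an output node."""
    inputs = [a for a, _ in tasks]
    outputs = [b for _, b in tasks]
    return len(set(inputs)) == len(inputs) and len(set(outputs)) == len(outputs)


def is_induced_matching(tasks, edges):
    """True if tasks form a matching in the graph and no graph edge
    connects the endpoints of two different tasks."""
    edge_set = set(edges)
    if not is_matching(tasks) or not set(tasks) <= edge_set:
        return False
    return all(
        (a1, b2) not in edge_set and (a2, b1) not in edge_set
        for (a1, b1), (a2, b2) in combinations(tasks, 2)
    )


# Hypothetical bipartite graph: inputs a1, a2 and outputs b1, b2.
edges = [("a1", "b1"), ("a2", "b2"), ("a1", "b2")]
tasks = [("a1", "b1"), ("a2", "b2")]

print(is_matching(tasks))                 # True: no shared endpoints
print(is_induced_matching(tasks, edges))  # False: edge (a1, b2) crosses the two tasks
```

In this example the two tasks use independent resources, but the extra edge from `a1` to `b2` is the kind of cross-connection that, under the abstract's assumptions, causes interference.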

Interpreting Strategies Annotation in the WAW Corpus

Title Interpreting Strategies Annotation in the WAW Corpus
Authors Irina Temnikova, Ahmed Abdelali, Samy Hedaya, Stephan Vogel, Aishah Al Daher
Abstract With the aim of teaching our automatic speech-to-text translation system human interpreting strategies, our first step is to identify which interpreting strategies are most often used in our language pair of interest (English-Arabic). In this article we run an automatic analysis of a corpus of parallel speeches and their human interpretations, and provide the results of manually annotating the human interpreting strategies in a sample of the corpus. We give a glimpse of the corpus, whose value surpasses the fact that it contains a high number of scientific speeches with their interpretations from English into Arabic, as it also provides rich information about the interpreters. We also discuss the difficulties we encountered along the way, as well as our solutions to them: our methodology for manual re-segmentation and alignment of parallel segments, the choice of annotation tool, and the annotation procedure. Our annotation findings explain the previously extracted specific statistical features of the interpreted corpus (compared with a translation one) as well as the quality of interpretation provided by different interpreters.
Tasks Machine Translation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-7905/
PDF https://doi.org/10.26615/978-954-452-042-7_005
PWC https://paperswithcode.com/paper/interpreting-strategies-annotation-in-the-waw
Repo
Framework

Adaptive Batch Size for Safe Policy Gradients

Title Adaptive Batch Size for Safe Policy Gradients
Authors Matteo Papini, Matteo Pirotta, Marcello Restelli
Abstract Policy gradient methods are among the best Reinforcement Learning (RL) techniques to solve complex control problems. In real-world RL applications, it is common to have a good initial policy whose performance needs to be improved, and it may not be acceptable to try bad policies during the learning process. Although several methods for choosing the step size exist, research has paid less attention to determining the batch size, that is, the number of samples used to estimate the gradient direction for each update of the policy parameters. In this paper, we propose a set of methods to jointly optimize the step and the batch sizes that guarantee (with high probability) to improve the policy performance after each update. Besides providing theoretical guarantees, we show numerical simulations to analyse the behaviour of our methods.
Tasks Policy Gradient Methods
Published 2017-12-01
URL http://papers.nips.cc/paper/6950-adaptive-batch-size-for-safe-policy-gradients
PDF http://papers.nips.cc/paper/6950-adaptive-batch-size-for-safe-policy-gradients.pdf
PWC https://paperswithcode.com/paper/adaptive-batch-size-for-safe-policy-gradients
Repo
Framework

(Re)introducing Regular Graph Languages

Title (Re)introducing Regular Graph Languages
Authors Sorcha Gilroy, Adam Lopez, Sebastian Maneth, Pijus Simonaitis
Abstract
Tasks Machine Translation
Published 2017-07-01
URL https://www.aclweb.org/anthology/W17-3410/
PDF https://www.aclweb.org/anthology/W17-3410
PWC https://paperswithcode.com/paper/reintroducing-regular-graph-languages
Repo
Framework

Enhancing Automatic ICD-9-CM Code Assignment for Medical Texts with PubMed

Title Enhancing Automatic ICD-9-CM Code Assignment for Medical Texts with PubMed
Authors Danchen Zhang, Daqing He, Sanqiang Zhao, Lei Li
Abstract Assigning a standard ICD-9-CM code to disease symptoms in medical texts is an important task in the medical domain. Automating this process could greatly reduce costs. However, the effectiveness of an automatic ICD-9-CM code classifier faces a serious problem, which can be triggered by unbalanced training data. Frequent diseases often have more training data, which helps their classification perform better than that of infrequent diseases. However, a disease's frequency does not necessarily reflect its importance. To resolve this training data shortage problem, we propose to strategically draw data from PubMed to enrich the training data when there is such a need. We validate our method on the CMC dataset, and the evaluation results indicate that our method can significantly improve the code assignment classifiers' performance at the macro-averaging level.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2333/
PDF https://www.aclweb.org/anthology/W17-2333
PWC https://paperswithcode.com/paper/enhancing-automatic-icd-9-cm-code-assignment
Repo
Framework
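
The abstract above reports gains at the macro-averaging level, which weights every code equally regardless of how often it occurs. The sketch below contrasts macro-averaged and micro-averaged F1 on per-code counts to show why the distinction matters for unbalanced data. The code labels and counts are hypothetical and are not drawn from the CMC dataset or the paper's results.

```python
def f1(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_micro_f1(counts):
    """counts maps each label to a (tp, fp, fn) tuple."""
    # Macro: average the per-label F1 scores, one vote per label.
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    # Micro: pool the counts first, so frequent labels dominate.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return macro, f1(tp, fp, fn)


# Hypothetical per-code counts: one frequent code, one rare code.
counts = {
    "frequent_code": (90, 10, 10),
    "rare_code": (1, 1, 3),
}

macro, micro = macro_micro_f1(counts)
print(f"macro-F1 = {macro:.3f}, micro-F1 = {micro:.3f}")
```

Poor performance on the rare code pulls the macro average down sharply while barely moving the micro average, so improvements on infrequent codes show up mainly in the macro score.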

A Cross-modal Review of Indicators for Depression Detection Systems

Title A Cross-modal Review of Indicators for Depression Detection Systems
Authors Michelle Morales, Stefan Scherer, Rivka Levitan
Abstract Automatic detection of depression has attracted increasing attention from researchers in psychology, computer science, linguistics, and related disciplines. As a result, promising depression detection systems have been reported. This paper surveys these efforts by presenting the first cross-modal review of depression detection systems and discusses best practices and most promising approaches to this task.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-3101/
PDF https://www.aclweb.org/anthology/W17-3101
PWC https://paperswithcode.com/paper/a-cross-modal-review-of-indicators-for
Repo
Framework

Shared Task Proposal: Multilingual Surface Realization Using Universal Dependency Trees

Title Shared Task Proposal: Multilingual Surface Realization Using Universal Dependency Trees
Authors Simon Mille, Bernd Bohnet, Leo Wanner, Anja Belz
Abstract We propose a shared task on multilingual Surface Realization, i.e., on mapping unordered and uninflected universal dependency trees to correctly ordered and inflected sentences in a number of languages. A second deeper input will be available in which, in addition, functional words, fine-grained PoS and morphological information will be removed from the input trees. The first shared task on Surface Realization was carried out in 2011 with a similar setup, with a focus on English. We think that it is time for relaunching such a shared task effort in view of the arrival of Universal Dependencies annotated treebanks for a large number of languages on the one hand, and the increasing dominance of Deep Learning, which proved to be a game changer for NLP, on the other hand.
Tasks Machine Translation, Text Generation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-3517/
PDF https://www.aclweb.org/anthology/W17-3517
PWC https://paperswithcode.com/paper/shared-task-proposal-multilingual-surface
Repo
Framework

Trust, but Verify! Better Entity Linking through Automatic Verification

Title Trust, but Verify! Better Entity Linking through Automatic Verification
Authors Benjamin Heinzerling, Michael Strube, Chin-Yew Lin
Abstract We introduce automatic verification as a post-processing step for entity linking (EL). The proposed method trusts EL system results collectively, by assuming entity mentions are mostly linked correctly, in order to create a semantic profile of the given text using geospatial and temporal information, as well as fine-grained entity types. This profile is then used to automatically verify each linked mention individually, i.e., to predict whether it has been linked correctly or not. Verification allows leveraging a rich set of global and pairwise features that would be prohibitively expensive for EL systems employing global inference. Evaluation shows consistent improvements across datasets and systems. In particular, when applied to state-of-the-art systems, our method yields an absolute improvement in linking performance of up to 1.7 F1 on AIDA/CoNLL'03 and up to 2.4 F1 on the English TAC KBP 2015 TEDL dataset.
Tasks Entity Linking
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1078/
PDF https://www.aclweb.org/anthology/E17-1078
PWC https://paperswithcode.com/paper/trust-but-verify-better-entity-linking
Repo
Framework

Probabilistic Inference for Cold Start Knowledge Base Population with Prior World Knowledge

Title Probabilistic Inference for Cold Start Knowledge Base Population with Prior World Knowledge
Authors Bonan Min, Marjorie Freedman, Talya Meltzer
Abstract Building knowledge bases (KB) automatically from text corpora is crucial for many applications such as question answering and web search. The problem is very challenging and has been divided into sub-problems such as mention and named entity recognition, entity linking and relation extraction. However, combining these components has been shown to be under-constrained and often produces KBs with supersize entities and common-sense errors in relations (e.g., a person has multiple birthdates). The errors are difficult to resolve solely with IE tools but become obvious with world knowledge at the corpus level. By analyzing Freebase and a large text collection, we found that per-relation cardinality and the popularity of entities follow the power-law distribution favoring flat long tails with low-frequency instances. We present a probabilistic joint inference algorithm to incorporate this world knowledge during KB construction. Our approach yields state-of-the-art performance on the TAC Cold Start task, and 42% and 19.4% relative improvements in F1 over our baseline on Cold Start hop-1 and all-hop queries respectively.
Tasks Common Sense Reasoning, Entity Linking, Knowledge Base Population, Named Entity Recognition, Question Answering, Relation Extraction
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1057/
PDF https://www.aclweb.org/anthology/E17-1057
PWC https://paperswithcode.com/paper/probabilistic-inference-for-cold-start
Repo
Framework