Paper Group NANR 52
LDCCNLP at IJCNLP-2017 Task 2: Dimensional Sentiment Analysis for Chinese Phrases Using Machine Learning
Title | LDCCNLP at IJCNLP-2017 Task 2: Dimensional Sentiment Analysis for Chinese Phrases Using Machine Learning |
Authors | Peng Zhong, Jingbin Wang |
Abstract | Sentiment analysis on Chinese text has been intensively studied. The basic task for related research is to construct an affective lexicon and thereby predict emotional scores of different levels. However, finite lexicon resources make it difficult to effectively and automatically distinguish between various types of sentiment information in Chinese texts. This IJCNLP2017-Task2 competition seeks to automatically calculate Valence and Arousal ratings within the hierarchies of vocabulary and phrases in Chinese. We introduce a regression methodology to automatically recognize continuous emotional values, and incorporate a word embedding technique. In our system, the MAE predictive values of Valence and Arousal were 0.811 and 0.996, respectively, for the sentiment dimension prediction of words in Chinese. In phrase prediction, the corresponding results were 0.822 and 0.489, ranking sixth among all teams. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/I17-4013/ |
https://www.aclweb.org/anthology/I17-4013 | |
PWC | https://paperswithcode.com/paper/ldccnlp-at-ijcnlp-2017-task-2-dimensional |
Repo | |
Framework | |
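The MAE figures quoted in the abstract above are mean absolute errors between predicted and gold ratings. As a minimal sketch of how such a score is computed (the rating values and variable names below are hypothetical, not taken from the task data):

```python
def mean_absolute_error(gold, predicted):
    """Average of |gold - predicted| over paired ratings."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(g - p) for g, p in zip(gold, predicted)) / len(gold)


# Hypothetical gold and predicted Valence ratings for three words.
gold_valence = [6.0, 3.5, 7.2]
pred_valence = [5.5, 4.5, 7.0]

mae = mean_absolute_error(gold_valence, pred_valence)
print(f"Valence MAE: {mae:.3f}")
```

Lower is better: an MAE of 0 would mean every predicted rating matches the gold rating exactly.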
Initial Explorations of CCG Supertagging for Universal Dependency Parsing
Title | Initial Explorations of CCG Supertagging for Universal Dependency Parsing |
Authors | Burak Kerim Akkus, Heval Azizoglu, Ruket Cakici |
Abstract | In this paper we describe the system by the METU team for universal dependency parsing of multilingual text. We use a neural network-based dependency parser that has a greedy transition approach to dependency parsing. CCG supertags contain rich structural information that proves useful in certain NLP tasks. We experiment with CCG supertags as additional features in our experiments. The neural network parser is trained together with dependencies and simplified CCG tags as well as other features provided. |
Tasks | CCG Supertagging, Dependency Parsing, Machine Translation |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/K17-3023/ |
https://www.aclweb.org/anthology/K17-3023 | |
PWC | https://paperswithcode.com/paper/initial-explorations-of-ccg-supertagging-for |
Repo | |
Framework | |
Translation Memory Systems Have a Long Way to Go
Title | Translation Memory Systems Have a Long Way to Go |
Authors | Andrea Silvestre Baquero, Ruslan Mitkov |
Abstract | Translation memory (TM) systems have changed the work of translators, and translators who do not benefit from these tools are now a tiny minority. These tools operate mostly on fuzzy (surface) matching and cannot benefit from already translated texts which are synonymous to (or paraphrased versions of) the text to be translated. The match score is mostly based on character-string similarity, calculated through Levenshtein distance. TM tools have difficulty detecting similarities even in sentences which represent a minor revision of sentences already available in the translation memory. This shortcoming of current TM systems was the subject of the present study and was empirically proven in the experiments we conducted. To this end, we compiled a small translation memory (English-Spanish) and applied several lexical and syntactic transformation rules to the source sentences with both English and Spanish being the source language. The results of this study show that current TM systems have a long way to go and highlight the need for TM systems equipped with NLP capabilities, which would spare the translator from translating a sentence again if an almost identical sentence has already been translated. |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-7906/ |
https://doi.org/10.26615/978-954-452-042-7_006 | |
PWC | https://paperswithcode.com/paper/translation-memory-systems-have-a-long-way-to |
Repo | |
Framework | |
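The abstract above says TM match scores rest mostly on character-string similarity computed through Levenshtein distance. The sketch below illustrates that idea. The normalization in `fuzzy_match_score` (one minus distance over the longer length) is an assumption chosen for illustration; commercial TM tools use their own formulas, which the abstract does not specify.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def fuzzy_match_score(segment: str, tm_entry: str) -> float:
    """Hypothetical match score in [0, 1]; 1.0 means identical strings."""
    longest = max(len(segment), len(tm_entry))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(segment, tm_entry) / longest


# A passive/active paraphrase: same meaning, different surface form.
stored = "The contract was signed by the client."
new = "The client signed the contract."
print(f"match score: {fuzzy_match_score(new, stored):.2f}")
```

Because the score only looks at characters, a paraphrase that reorders words is penalized even though a human translator would reuse the stored translation almost unchanged.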
QU-BIGIR at SemEval 2017 Task 3: Using Similarity Features for Arabic Community Question Answering Forums
Title | QU-BIGIR at SemEval 2017 Task 3: Using Similarity Features for Arabic Community Question Answering Forums |
Authors | Marwan Torki, Maram Hasanain, Tamer Elsayed |
Abstract | In this paper we describe our QU-BIGIR system for the Arabic subtask D of the SemEval 2017 Task 3. Our approach builds on our participation in the past version of the same subtask. This year, our system uses different similarity measures that encode lexical and semantic pairwise similarity of text pairs. In addition to well known similarity measures such as cosine similarity, we use other measures based on the summary statistics of word embedding representation for a given text. To rank a list of candidate question answer pairs for a given question, we learn a linear SVM classifier over our similarity features. Our best resulting run came second in subtask D with a very competitive performance to the first-ranking system. |
Tasks | Community Question Answering, Question Answering, Semantic Textual Similarity |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2059/ |
https://www.aclweb.org/anthology/S17-2059 | |
PWC | https://paperswithcode.com/paper/qu-bigir-at-semeval-2017-task-3-using |
Repo | |
Framework | |
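The abstract above mentions cosine similarity and measures built from summary statistics of word embeddings. The following is a rough sketch of one such feature, assuming the mean is the summary statistic; the two-dimensional embedding table is a toy stand-in, not the embeddings the authors used.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (real systems use hundreds of dimensions).
toy_embeddings = {
    "visa": [0.9, 0.1],
    "renewal": [0.8, 0.3],
    "passport": [0.7, 0.2],
    "recipe": [0.1, 0.9],
}

question = ["visa", "renewal"]
candidate = ["passport", "renewal"]

q_vec = mean_vector([toy_embeddings[w] for w in question])
c_vec = mean_vector([toy_embeddings[w] for w in candidate])
print(f"similarity feature: {cosine_similarity(q_vec, c_vec):.3f}")
```

In a pipeline like the one described, a value such as this would be one entry in the feature vector passed to the classifier that ranks candidate pairs.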
RelTextRank: An Open Source Framework for Building Relational Syntactic-Semantic Text Pair Representations
Title | RelTextRank: An Open Source Framework for Building Relational Syntactic-Semantic Text Pair Representations |
Authors | Kateryna Tymoshenko, Alessandro Moschitti, Massimo Nicosia, Aliaksei Severyn |
Abstract | |
Tasks | Community Question Answering, Natural Language Inference, Question Answering, Relational Reasoning |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-4014/ |
https://www.aclweb.org/anthology/P17-4014 | |
PWC | https://paperswithcode.com/paper/reltextrank-an-open-source-framework-for |
Repo | |
Framework | |
The Influence of Spelling Errors on Content Scoring Performance
Title | The Influence of Spelling Errors on Content Scoring Performance |
Authors | Andrea Horbach, Yuning Ding, Torsten Zesch |
Abstract | Spelling errors occur frequently in educational settings, but their influence on automatic scoring is largely unknown. We therefore investigate the influence of spelling errors on content scoring performance using the example of the ASAP corpus. We conduct an annotation study on the nature of spelling errors in the ASAP dataset and utilize these findings in machine learning experiments that measure the influence of spelling errors on automatic content scoring. Our main finding is that scoring methods using both token and character n-gram features are robust against spelling errors up to the error frequency in ASAP. |
Tasks | |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/W17-5908/ |
https://www.aclweb.org/anthology/W17-5908 | |
PWC | https://paperswithcode.com/paper/the-influence-of-spelling-errors-on-content |
Repo | |
Framework | |
A graph-theoretic approach to multitasking
Title | A graph-theoretic approach to multitasking |
Authors | Noga Alon, Daniel Reichman, Igor Shinkar, Tal Wagner, Sebastian Musslick, Jonathan D. Cohen, Tom Griffiths, Biswadip Dey, Kayhan Ozcimder |
Abstract | A key feature of neural network architectures is their ability to support the simultaneous interaction among large numbers of units in the learning and processing of representations. However, how the richness of such interactions trades off against the ability of a network to simultaneously carry out multiple independent processes – a salient limitation in many domains of human cognition – remains largely unexplored. In this paper we use a graph-theoretic analysis of network architecture to address this question, where tasks are represented as edges in a bipartite graph $G=(A \cup B, E)$. We define a new measure of multitasking capacity of such networks, based on the assumptions that tasks that *need* to be multitasked rely on independent resources, i.e., form a matching, and that tasks *can* be performed without interference if they form an induced matching. Our main result is an inherent tradeoff between the multitasking capacity and the average degree of the network that holds *regardless of the network architecture*. These results are also extended to networks of depth greater than $2$. On the positive side, we demonstrate that networks that are random-like (e.g., locally sparse) can have desirable multitasking properties. Our results shed light on the parallel-processing limitations of neural systems and provide insights that may be useful for the analysis and design of parallel architectures. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6805-a-graph-theoretic-approach-to-multitasking |
http://papers.nips.cc/paper/6805-a-graph-theoretic-approach-to-multitasking.pdf | |
PWC | https://paperswithcode.com/paper/a-graph-theoretic-approach-to-multitasking |
Repo | |
Framework | |
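The abstract above separates a matching (tasks that share no node) from an induced matching (a matching with no additional graph edge linking the endpoints of two different tasks). The sketch below checks both properties for a bipartite graph given as a list of `(a, b)` edges. The node labels and the example graph are hypothetical, and this is an illustration of the definitions, not the paper's capacity measure.

```python
def is_matching(edges):
    """True if no two edges share a left node or a right node."""
    left = [a for a, _ in edges]
    right = [b for _, b in edges]
    return len(set(left)) == len(left) and len(set(right)) == len(right)


def is_induced_matching(graph_edges, subset):
    """True if subset is a matching in the graph and no graph edge
    joins endpoints belonging to two different edges of subset."""
    graph = set(graph_edges)
    subset = list(subset)
    if not set(subset) <= graph or not is_matching(subset):
        return False
    for a1, b1 in subset:
        for a2, b2 in subset:
            # A cross edge (a1, b2) would let task 1 interfere with task 2.
            if (a1, b1) != (a2, b2) and (a1, b2) in graph:
                return False
    return True


# Hypothetical network: inputs a1, a2; outputs b1, b2; three tasks (edges).
G = [("a1", "b1"), ("a2", "b2"), ("a1", "b2")]

tasks = [("a1", "b1"), ("a2", "b2")]
print(is_matching(tasks))             # the two tasks share no node
print(is_induced_matching(G, tasks))  # but edge (a1, b2) links them
```

In this toy graph the pair of tasks is a matching but not an induced matching, because the extra edge from `a1` to `b2` connects the two tasks.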
Interpreting Strategies Annotation in the WAW Corpus
Title | Interpreting Strategies Annotation in the WAW Corpus |
Authors | Irina Temnikova, Ahmed Abdelali, Samy Hedaya, Stephan Vogel, Aishah Al Daher |
Abstract | With the aim of teaching our automatic speech-to-text translation system human interpreting strategies, our first step is to identify which interpreting strategies are most often used in the language pair of our interest (English-Arabic). In this article we run an automatic analysis of a corpus of parallel speeches and their human interpretations, and provide the results of manually annotating the human interpreting strategies in a sample of the corpus. We give a glimpse of the corpus, whose value surpasses the fact that it contains a high number of scientific speeches with their interpretations from English into Arabic, as it also provides rich information about the interpreters. We also discuss the difficulties we encountered along the way, as well as our solutions to them: our methodology for manual re-segmentation and alignment of parallel segments, the choice of annotation tool, and the annotation procedure. Our annotation findings explain the previously extracted specific statistical features of the interpreted corpus (compared with a translation one) as well as the quality of interpretation provided by different interpreters. |
Tasks | Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-7905/ |
https://doi.org/10.26615/978-954-452-042-7_005 | |
PWC | https://paperswithcode.com/paper/interpreting-strategies-annotation-in-the-waw |
Repo | |
Framework | |
Adaptive Batch Size for Safe Policy Gradients
Title | Adaptive Batch Size for Safe Policy Gradients |
Authors | Matteo Papini, Matteo Pirotta, Marcello Restelli |
Abstract | Policy gradient methods are among the best Reinforcement Learning (RL) techniques to solve complex control problems. In real-world RL applications, it is common to have a good initial policy whose performance needs to be improved and it may not be acceptable to try bad policies during the learning process. Although several methods for choosing the step size exist, research has paid less attention to determining the batch size, that is, the number of samples used to estimate the gradient direction for each update of the policy parameters. In this paper, we propose a set of methods to jointly optimize the step and the batch sizes that guarantee (with high probability) to improve the policy performance after each update. Besides providing theoretical guarantees, we show numerical simulations to analyse the behaviour of our methods. |
Tasks | Policy Gradient Methods |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6950-adaptive-batch-size-for-safe-policy-gradients |
http://papers.nips.cc/paper/6950-adaptive-batch-size-for-safe-policy-gradients.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-batch-size-for-safe-policy-gradients |
Repo | |
Framework | |
(Re)introducing Regular Graph Languages
Title | (Re)introducing Regular Graph Languages |
Authors | Sorcha Gilroy, Adam Lopez, Sebastian Maneth, Pijus Simonaitis |
Abstract | |
Tasks | Machine Translation |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/W17-3410/ |
https://www.aclweb.org/anthology/W17-3410 | |
PWC | https://paperswithcode.com/paper/reintroducing-regular-graph-languages |
Repo | |
Framework | |
Enhancing Automatic ICD-9-CM Code Assignment for Medical Texts with PubMed
Title | Enhancing Automatic ICD-9-CM Code Assignment for Medical Texts with PubMed |
Authors | Danchen Zhang, Daqing He, Sanqiang Zhao, Lei Li |
Abstract | Assigning a standard ICD-9-CM code to disease symptoms in medical texts is an important task in the medical domain. Automating this process could greatly reduce the costs. However, the effectiveness of an automatic ICD-9-CM code classifier faces a serious problem, which can be triggered by unbalanced training data. Frequent diseases often have more training data, which helps their classification perform better than that of an infrequent disease. However, a disease's frequency does not necessarily reflect its importance. To resolve this training data shortage problem, we propose to strategically draw data from PubMed to enrich the training data when there is such a need. We validate our method on the CMC dataset, and the evaluation results indicate that our method can significantly improve the code assignment classifiers' performance at the macro-averaging level. |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2333/ |
https://www.aclweb.org/anthology/W17-2333 | |
PWC | https://paperswithcode.com/paper/enhancing-automatic-icd-9-cm-code-assignment |
Repo | |
Framework | |
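The abstract above reports gains at the macro-averaging level, which weights every code equally regardless of how often it occurs. A small sketch with made-up per-code counts (the labels `frequent_code` and `rare_code` and all numbers are hypothetical) shows why macro-averaged F1 is sensitive to rare codes while micro-averaged F1 is dominated by frequent ones:

```python
def f1(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts):
    """Unweighted mean of per-class F1 scores."""
    return sum(f1(*c) for c in counts.values()) / len(counts)


def micro_f1(counts):
    """F1 computed after pooling counts over all classes."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp, fp, fn)


# Hypothetical (tp, fp, fn) counts for one frequent and one rare code.
counts = {
    "frequent_code": (90, 10, 10),
    "rare_code": (1, 1, 3),
}

print(f"macro F1: {macro_f1(counts):.3f}")
print(f"micro F1: {micro_f1(counts):.3f}")
```

With these counts the rare code pulls the macro average well below the micro average, so extra training data for infrequent codes would show up mainly in the macro score.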
A Cross-modal Review of Indicators for Depression Detection Systems
Title | A Cross-modal Review of Indicators for Depression Detection Systems |
Authors | Michelle Morales, Stefan Scherer, Rivka Levitan |
Abstract | Automatic detection of depression has attracted increasing attention from researchers in psychology, computer science, linguistics, and related disciplines. As a result, promising depression detection systems have been reported. This paper surveys these efforts by presenting the first cross-modal review of depression detection systems and discusses best practices and most promising approaches to this task. |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-3101/ |
https://www.aclweb.org/anthology/W17-3101 | |
PWC | https://paperswithcode.com/paper/a-cross-modal-review-of-indicators-for |
Repo | |
Framework | |
Shared Task Proposal: Multilingual Surface Realization Using Universal Dependency Trees
Title | Shared Task Proposal: Multilingual Surface Realization Using Universal Dependency Trees |
Authors | Simon Mille, Bernd Bohnet, Leo Wanner, Anja Belz |
Abstract | We propose a shared task on multilingual Surface Realization, i.e., on mapping unordered and uninflected universal dependency trees to correctly ordered and inflected sentences in a number of languages. A second deeper input will be available in which, in addition, functional words, fine-grained PoS and morphological information will be removed from the input trees. The first shared task on Surface Realization was carried out in 2011 with a similar setup, with a focus on English. We think that it is time for relaunching such a shared task effort in view of the arrival of Universal Dependencies annotated treebanks for a large number of languages on the one hand, and the increasing dominance of Deep Learning, which proved to be a game changer for NLP, on the other hand. |
Tasks | Machine Translation, Text Generation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-3517/ |
https://www.aclweb.org/anthology/W17-3517 | |
PWC | https://paperswithcode.com/paper/shared-task-proposal-multilingual-surface |
Repo | |
Framework | |
Trust, but Verify! Better Entity Linking through Automatic Verification
Title | Trust, but Verify! Better Entity Linking through Automatic Verification |
Authors | Benjamin Heinzerling, Michael Strube, Chin-Yew Lin |
Abstract | We introduce automatic verification as a post-processing step for entity linking (EL). The proposed method trusts EL system results collectively, by assuming entity mentions are mostly linked correctly, in order to create a semantic profile of the given text using geospatial and temporal information, as well as fine-grained entity types. This profile is then used to automatically verify each linked mention individually, i.e., to predict whether it has been linked correctly or not. Verification allows leveraging a rich set of global and pairwise features that would be prohibitively expensive for EL systems employing global inference. Evaluation shows consistent improvements across datasets and systems. In particular, when applied to state-of-the-art systems, our method yields an absolute improvement in linking performance of up to 1.7 F1 on AIDA/CoNLL'03 and up to 2.4 F1 on the English TAC KBP 2015 TEDL dataset. |
Tasks | Entity Linking |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1078/ |
https://www.aclweb.org/anthology/E17-1078 | |
PWC | https://paperswithcode.com/paper/trust-but-verify-better-entity-linking |
Repo | |
Framework | |
Probabilistic Inference for Cold Start Knowledge Base Population with Prior World Knowledge
Title | Probabilistic Inference for Cold Start Knowledge Base Population with Prior World Knowledge |
Authors | Bonan Min, Marjorie Freedman, Talya Meltzer |
Abstract | Building knowledge bases (KB) automatically from text corpora is crucial for many applications such as question answering and web search. The problem is very challenging and has been divided into sub-problems such as mention and named entity recognition, entity linking and relation extraction. However, combining these components has been shown to be under-constrained and often produces KBs with supersize entities and common-sense errors in relations (a person has multiple birthdates). The errors are difficult to resolve solely with IE tools but become obvious with world knowledge at the corpus level. By analyzing Freebase and a large text collection, we found that per-relation cardinality and the popularity of entities follow the power-law distribution favoring flat long tails with low-frequency instances. We present a probabilistic joint inference algorithm to incorporate this world knowledge during KB construction. Our approach yields state-of-the-art performance on the TAC Cold Start task, and 42% and 19.4% relative improvements in F1 over our baseline on Cold Start hop-1 and all-hop queries respectively. |
Tasks | Common Sense Reasoning, Entity Linking, Knowledge Base Population, Named Entity Recognition, Question Answering, Relation Extraction |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1057/ |
https://www.aclweb.org/anthology/E17-1057 | |
PWC | https://paperswithcode.com/paper/probabilistic-inference-for-cold-start |
Repo | |
Framework | |