May 5, 2019

Paper Group NANR 107

Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Tutorial Abstracts

Title Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Tutorial Abstracts
Authors
Abstract
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-3000/
PDF https://www.aclweb.org/anthology/C16-3000
PWC https://paperswithcode.com/paper/proceedings-of-coling-2016-the-26th-2
Repo
Framework

Bootstrapping Translation Detection and Sentence Extraction from Comparable Corpora

Title Bootstrapping Translation Detection and Sentence Extraction from Comparable Corpora
Authors Kriste Krstovski, David Smith
Abstract
Tasks Machine Translation, Topic Models
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-1132/
PDF https://www.aclweb.org/anthology/N16-1132
PWC https://paperswithcode.com/paper/bootstrapping-translation-detection-and
Repo
Framework

Comparing Two Basic Methods for Discriminating Between Similar Languages and Varieties

Title Comparing Two Basic Methods for Discriminating Between Similar Languages and Varieties
Authors Pablo Gamallo, Iñaki Alegria, José Ramom Pichel, Manex Agirrezabal
Abstract This article describes the systems submitted by the Citius_Ixa_Imaxin team to the Discriminating Similar Languages Shared Task 2016. The systems are based on two different strategies: classification with ranked dictionaries and Naive Bayes classifiers. The evaluation results show that ranked dictionaries are more sound and stable across different domains, while basic Bayesian models perform reasonably well on in-domain datasets but drop in performance when applied to out-of-domain texts.
Tasks Language Identification, Speech Recognition
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4822/
PDF https://www.aclweb.org/anthology/W16-4822
PWC https://paperswithcode.com/paper/comparing-two-basic-methods-for
Repo
Framework
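
As an illustration of the second strategy named in the abstract above, here is a minimal sketch of a Naive Bayes classifier for discriminating similar languages. It is not the team's system: the use of character trigrams, add-one smoothing, and the names `char_ngrams` and `NgramNaiveBayes` are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of a padded, lowercased string."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Multinomial Naive Bayes over character n-grams (hypothetical helper)."""

    def __init__(self, n=3):
        self.n = n
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.docs = Counter()               # label -> number of training texts
        self.vocab = set()                  # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.counts[label].update(grams)
            self.docs[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, gram_counts in self.counts.items():
            # Add-one smoothing: every vocabulary n-gram gets one extra count.
            denom = sum(gram_counts.values()) + len(self.vocab)
            score = math.log(self.docs[label] / total_docs)  # log prior
            for gram in char_ngrams(text, self.n):
                score += math.log((gram_counts[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A model would be trained with `NgramNaiveBayes().fit(texts, labels)` and queried with `predict(text)`. Because the scores depend entirely on n-gram counts seen in training, a sketch like this is also a convenient way to observe the in-domain versus out-of-domain gap the abstract reports.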

Advances in Ngram-based Discrimination of Similar Languages

Title Advances in Ngram-based Discrimination of Similar Languages
Authors Cyril Goutte, Serge Léger
Abstract We describe the systems entered by the National Research Council in the 2016 shared task on discriminating similar languages. As in previous years, we relied on character ngram features and a mixture of discriminative and generative statistical classifiers. We mostly investigated the influence of the amount of data on performance in the open task, and compared the two-stage approach (predicting language/group, then variant) to a flat approach. Results suggest that ngrams are still state-of-the-art for language and variant identification, and that additional data has a small but decisive impact.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4823/
PDF https://www.aclweb.org/anthology/W16-4823
PWC https://paperswithcode.com/paper/advances-in-ngram-based-discrimination-of
Repo
Framework

MAWPS: A Math Word Problem Repository

Title MAWPS: A Math Word Problem Repository
Authors Rik Koncel-Kedziorski, Subhro Roy, Aida Amini, Nate Kushman, Hannaneh Hajishirzi
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-1136/
PDF https://www.aclweb.org/anthology/N16-1136
PWC https://paperswithcode.com/paper/mawps-a-math-word-problem-repository
Repo
Framework

Multi-label Annotation in Scientific Articles - The Multi-label Cancer Risk Assessment Corpus

Title Multi-label Annotation in Scientific Articles - The Multi-label Cancer Risk Assessment Corpus
Authors James Ravenscroft, Anika Oellrich, Shyamasree Saha, Maria Liakata
Abstract With the constant growth of the scientific literature, automated processes to enable access to its contents are increasingly in demand. Several functional discourse annotation schemes have been proposed to facilitate information extraction and summarisation from scientific articles, the most well known being argumentative zoning. Core Scientific concepts (CoreSC) is a three-layered, fine-grained annotation scheme providing content-based annotations at the sentence level and has been used to index, extract and summarise scientific publications in the biomedical literature. A previously developed CoreSC corpus, on which existing automated tools have been trained, contains a single annotation for each sentence. However, more than one CoreSC concept can appear in the same sentence. Here, we present the Multi-CoreSC CRA corpus, a text corpus specific to the domain of cancer risk assessment (CRA), consisting of 50 full text papers, each of which contains sentences annotated with one or more CoreSCs. The full text papers have been annotated by three biology experts. We present several inter-annotator agreement measures appropriate for multi-label annotation assessment. Employing these measures, we were able to identify the most reliable annotator and we built a harmonised consensus (gold standard) from the three different annotators, while also taking concept priority (as specified in the guidelines) into account. We also show that the new Multi-CoreSC CRA corpus allows us to improve performance in the recognition of CoreSCs. The updated guidelines, the multi-label CoreSC CRA corpus and other relevant, related materials are available at the time of publication at http://www.sapientaproject.com/.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1650/
PDF https://www.aclweb.org/anthology/L16-1650
PWC https://paperswithcode.com/paper/multi-label-annotation-in-scientific-articles
Repo
Framework
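
The abstract above does not say which multi-label agreement measures were used, so the following is only a sketch of one plausible option: mean Jaccard overlap between the label sets two annotators assign to each sentence. The function names and the example labels are assumptions for illustration, not taken from the corpus.

```python
def jaccard(labels_a, labels_b):
    """Jaccard overlap between two label sets for a single sentence."""
    a, b = set(labels_a), set(labels_b)
    if not a and not b:
        return 1.0  # two empty annotations are treated as full agreement
    return len(a & b) / len(a | b)


def mean_pairwise_agreement(annotations_a, annotations_b):
    """Average per-sentence Jaccard overlap between two annotators."""
    if len(annotations_a) != len(annotations_b):
        raise ValueError("both annotators must label the same sentences")
    if not annotations_a:
        raise ValueError("at least one sentence is required")
    scores = [jaccard(a, b) for a, b in zip(annotations_a, annotations_b)]
    return sum(scores) / len(scores)


# Illustrative label sets for three sentences (hypothetical data).
annotator_1 = [{"Result"}, {"Method", "Result"}, {"Background"}]
annotator_2 = [{"Result"}, {"Method"}, {"Conclusion"}]
print(mean_pairwise_agreement(annotator_1, annotator_2))
```

Unlike exact-match agreement, this measure gives partial credit when annotators share some but not all labels for a sentence. It does not correct for chance agreement, which a full evaluation would normally also consider.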

Proceedings of the Workshop on Multilingual and Cross-lingual Methods in NLP

Title Proceedings of the Workshop on Multilingual and Cross-lingual Methods in NLP
Authors
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-1200/
PDF https://www.aclweb.org/anthology/W16-1200
PWC https://paperswithcode.com/paper/proceedings-of-the-workshop-on-multilingual
Repo
Framework

Towards error annotation in a learner corpus of Portuguese

Title Towards error annotation in a learner corpus of Portuguese
Authors Iria del Río, Sandra Antunes, Amália Mendes, Maarten Janssen
Abstract
Tasks Language Acquisition, Lemmatization
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-6502/
PDF https://www.aclweb.org/anthology/W16-6502
PWC https://paperswithcode.com/paper/towards-error-annotation-in-a-learner-corpus
Repo
Framework

NER for Medical Entities in Twitter using Sequence to Sequence Neural Networks

Title NER for Medical Entities in Twitter using Sequence to Sequence Neural Networks
Authors Antonio Jimeno Yepes, Andrew MacKinlay
Abstract
Tasks Feature Engineering, Named Entity Recognition, Word Embeddings
Published 2016-12-01
URL https://www.aclweb.org/anthology/U16-1016/
PDF https://www.aclweb.org/anthology/U16-1016
PWC https://paperswithcode.com/paper/ner-for-medical-entities-in-twitter-using
Repo
Framework

JUNITMZ at SemEval-2016 Task 1: Identifying Semantic Similarity Using Levenshtein Ratio

Title JUNITMZ at SemEval-2016 Task 1: Identifying Semantic Similarity Using Levenshtein Ratio
Authors Sandip Sarkar, Dipankar Das, Partha Pakray, Alexander Gelbukh
Abstract
Tasks Information Retrieval, Machine Translation, Semantic Similarity, Semantic Textual Similarity, Text Summarization
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1108/
PDF https://www.aclweb.org/anthology/S16-1108
PWC https://paperswithcode.com/paper/junitmz-at-semeval-2016-task-1-identifying
Repo
Framework
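
No abstract is listed for this entry, so the sketch below only illustrates the general idea in the title: turning Levenshtein edit distance into a similarity score between 0 and 1. The normalisation by the longer string's length is an assumption; other definitions of "Levenshtein ratio" exist, and the paper may use a different one.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insertion, deletion and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free on a match
            ))
        previous = current
    return previous[-1]


def levenshtein_ratio(a, b):
    """Similarity in [0, 1]: 1 minus distance over the longer length (assumed)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

For example, `levenshtein("kitten", "sitting")` is 3, so under this normalisation the ratio is 1 - 3/7. The same functions work on lists of tokens, which is closer to how a sentence-level similarity score would be computed.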

Spanish Word Vectors from Wikipedia

Title Spanish Word Vectors from Wikipedia
Authors Mathias Etcheverry, Dina Wonsever
Abstract Content analysis of text data requires semantic representations that are difficult to obtain automatically, as they may require large handcrafted knowledge bases or manually annotated examples. Unsupervised autonomous methods for generating semantic representations are of great interest given the huge volumes of text to be exploited in all kinds of applications. In this work we describe the generation and validation of semantic representations in the vector space paradigm for Spanish. The method used is GloVe (Pennington, 2014), one of the best performing reported methods, and vectors were trained over Spanish Wikipedia. The learned vectors are evaluated on word analogy and similarity tasks (Pennington, 2014; Baroni, 2014; Mikolov, 2013a). The vector set and a Spanish version of some widely used semantic relatedness tests are made publicly available.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1584/
PDF https://www.aclweb.org/anthology/L16-1584
PWC https://paperswithcode.com/paper/spanish-word-vectors-from-wikipedia
Repo
Framework
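
The word analogy evaluation mentioned in the abstract above is commonly run by vector arithmetic followed by a cosine nearest-neighbour search. The sketch below shows that procedure on hand-made toy vectors. The vectors, the words chosen, and the function names are assumptions for illustration and are not the released vector set.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' with the word closest to b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The three query words are excluded from the candidates, as is usual.
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Toy 3-dimensional vectors (hypothetical, not trained embeddings).
toy_vectors = {
    "hombre": [1.0, 0.0, 0.0],
    "mujer": [1.0, 1.0, 0.0],
    "rey": [1.0, 0.0, 1.0],
    "reina": [1.0, 1.0, 1.0],
    "casa": [0.0, 0.0, 0.2],
}
print(solve_analogy(toy_vectors, "hombre", "mujer", "rey"))  # prints "reina"
```

Accuracy on an analogy test set is then the fraction of questions for which the returned word matches the expected answer.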

LIPN-IIMAS at SemEval-2016 Task 1: Random Forest Regression Experiments on Align-and-Differentiate and Word Embeddings penalizing strategies

Title LIPN-IIMAS at SemEval-2016 Task 1: Random Forest Regression Experiments on Align-and-Differentiate and Word Embeddings penalizing strategies
Authors Oscar William Lightgow Serrano, Ivan Vladimir Meza Ruiz, Albert Manuel Orozco Camacho, Jorge Garcia Flores, Davide Buscaldi
Abstract
Tasks Information Retrieval, Semantic Textual Similarity, Word Embeddings
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1112/
PDF https://www.aclweb.org/anthology/S16-1112
PWC https://paperswithcode.com/paper/lipn-iimas-at-semeval-2016-task-1-random
Repo
Framework

“Congruent” and “Opposite” Neurons: Sisters for Multisensory Integration and Segregation

Title “Congruent” and “Opposite” Neurons: Sisters for Multisensory Integration and Segregation
Authors Wen-Hao Zhang, He Wang, K. Y. Michael Wong, Si Wu
Abstract Experiments reveal that in the dorsal medial superior temporal (MSTd) and the ventral intraparietal (VIP) areas, where visual and vestibular cues are integrated to infer heading direction, there are two types of neurons in roughly equal numbers. One type is “congruent” cells, whose preferred heading directions are similar in response to visual and vestibular cues; the other is “opposite” cells, whose preferred heading directions are nearly “opposite” (with an offset of 180 degrees) in response to visual vs. vestibular cues. Congruent neurons are known to be responsible for cue integration, but the computational role of opposite neurons remains largely unknown. Here, we propose that opposite neurons may serve to encode the disparity information between cues necessary for multisensory segregation. We build a computational model composed of two reciprocally coupled modules, MSTd and VIP, each consisting of groups of congruent and opposite neurons. In the model, congruent neurons in the two modules are reciprocally connected with each other in the congruent manner, whereas opposite neurons are reciprocally connected in the opposite manner. Mimicking the experimental protocol, our model reproduces the characteristics of congruent and opposite neurons, and demonstrates that in each module, the sisters of congruent and opposite neurons can jointly achieve optimal multisensory information integration and segregation. This study sheds light on our understanding of how the brain implements optimal multisensory integration and segregation concurrently in a distributed manner.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6317-congruent-and-opposite-neurons-sisters-for-multisensory-integration-and-segregation
PDF http://papers.nips.cc/paper/6317-congruent-and-opposite-neurons-sisters-for-multisensory-integration-and-segregation.pdf
PWC https://paperswithcode.com/paper/congruent-and-opposite-neurons-sisters-for
Repo
Framework

Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than O(1/\epsilon)

Title Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than O(1/\epsilon)
Authors Yi Xu, Yan Yan, Qihang Lin, Tianbao Yang
Abstract In this paper, we develop a novel homotopy smoothing (HOPS) algorithm for solving a family of non-smooth problems that is composed of a non-smooth term with an explicit max-structure and a smooth term or a simple non-smooth term whose proximal mapping is easy to compute. The best known iteration complexity for solving such non-smooth optimization problems is $O(1/\epsilon)$ without any assumption on the strong convexity. In this work, we show that the proposed HOPS achieves a lower iteration complexity of $\tilde O(1/\epsilon^{1-\theta})$ with $\theta\in(0,1]$ capturing the local sharpness of the objective function around the optimal solutions. To the best of our knowledge, this is the lowest iteration complexity achieved so far for the considered non-smooth optimization problems without a strong convexity assumption. The HOPS algorithm employs Nesterov’s smoothing technique and Nesterov’s accelerated gradient method and runs in stages, gradually decreasing the smoothing parameter in a stage-wise manner until it yields a sufficiently good approximation of the original function. We show that HOPS enjoys a linear convergence for many well-known non-smooth problems (e.g., empirical risk minimization with a piece-wise linear loss function and $\ell_1$ norm regularizer, finding a point in a polyhedron, cone programming, etc.). Experimental results verify the effectiveness of HOPS in comparison with Nesterov’s smoothing algorithm and the primal-dual style of first-order methods.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6407-homotopy-smoothing-for-non-smooth-problems-with-lower-complexity-than-o1epsilon
PDF http://papers.nips.cc/paper/6407-homotopy-smoothing-for-non-smooth-problems-with-lower-complexity-than-o1epsilon.pdf
PWC https://paperswithcode.com/paper/homotopy-smoothing-for-non-smooth-problems-1
Repo
Framework
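
The stage-wise idea in the abstract above can be sketched on a toy problem: minimising a sum of absolute values, where each |r| is replaced by its smoothed (Huber-style) version with parameter mu, and mu is halved after each stage with a warm start. This is not the HOPS algorithm itself. It uses plain gradient descent in place of Nesterov's accelerated method, and the toy objective, stage count, and function names are assumptions chosen for illustration.

```python
def smoothed_abs_grad(r, mu):
    """Gradient of the smoothed |r|: r/mu clipped to the interval [-1, 1]."""
    return max(-1.0, min(1.0, r / mu))


def homotopy_median(points, x0=0.0, mu0=1.0, stages=8, iters=200):
    """Minimise sum_i |x - b_i| by gradient descent on smoothed objectives.

    Each stage solves a smoothed problem, then the smoothing parameter mu
    is halved and the next stage starts from the previous solution.
    """
    n = len(points)
    x, mu = x0, mu0
    for _ in range(stages):
        # The smoothed gradient is (n / mu)-Lipschitz, so 1/L = mu / n.
        step = mu / n
        for _ in range(iters):
            grad = sum(smoothed_abs_grad(x - b, mu) for b in points)
            x -= step * grad
        mu /= 2.0  # tighter approximation of the original objective
    return x


# The minimiser of sum_i |x - b_i| is the median of the points.
print(homotopy_median([1.0, 2.0, 10.0]))
```

A smaller mu approximates the original objective more closely but forces a smaller step size, which is why starting with a large mu and warm-starting each later stage can be cheaper than running with a tiny mu from the beginning.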

Opinion Mining in a Code-Mixed Environment: A Case Study with Government Portals

Title Opinion Mining in a Code-Mixed Environment: A Case Study with Government Portals
Authors Deepak Gupta, Ankit Lamba, Asif Ekbal, Pushpak Bhattacharyya
Abstract
Tasks Opinion Mining, Transliteration
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-6331/
PDF https://www.aclweb.org/anthology/W16-6331
PWC https://paperswithcode.com/paper/opinion-mining-in-a-code-mixed-environment-a
Repo
Framework