May 5, 2019

Paper Group NANR 107

Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Tutorial Abstracts


Title	Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Tutorial Abstracts
Authors
Abstract
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-3000/
PDF	https://www.aclweb.org/anthology/C16-3000
PWC	https://paperswithcode.com/paper/proceedings-of-coling-2016-the-26th-2
Repo
Framework

Bootstrapping Translation Detection and Sentence Extraction from Comparable Corpora


Title	Bootstrapping Translation Detection and Sentence Extraction from Comparable Corpora
Authors	Kriste Krstovski, David Smith
Abstract
Tasks	Machine Translation, Topic Models
Published	2016-06-01
URL	https://www.aclweb.org/anthology/N16-1132/
PDF	https://www.aclweb.org/anthology/N16-1132
PWC	https://paperswithcode.com/paper/bootstrapping-translation-detection-and
Repo
Framework

Comparing Two Basic Methods for Discriminating Between Similar Languages and Varieties


Title	Comparing Two Basic Methods for Discriminating Between Similar Languages and Varieties
Authors	Pablo Gamallo, Iñaki Alegria, José Ramom Pichel, Manex Agirrezabal
Abstract	This article describes the systems submitted by the Citius_Ixa_Imaxin team to the Discriminating Similar Languages Shared Task 2016. The systems are based on two different strategies: classification with ranked dictionaries and Naive Bayes classifiers. The evaluation results show that ranked dictionaries are more sound and stable across different domains, while basic Bayesian models perform reasonably well on in-domain datasets but lose performance when applied to out-of-domain texts.
Tasks	Language Identification, Speech Recognition
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4822/
PDF	https://www.aclweb.org/anthology/W16-4822
PWC	https://paperswithcode.com/paper/comparing-two-basic-methods-for
Repo
Framework

Advances in Ngram-based Discrimination of Similar Languages


Title	Advances in Ngram-based Discrimination of Similar Languages
Authors	Cyril Goutte, Serge Léger
Abstract	We describe the systems entered by the National Research Council in the 2016 shared task on discriminating similar languages. As in previous years, we relied on character ngram features and a mixture of discriminative and generative statistical classifiers. We mostly investigated the influence of the amount of data on performance in the open task, and compared the two-stage approach (predicting language/group, then variant) to a flat approach. Results suggest that ngrams are still state-of-the-art for language and variant identification, and that additional data has a small but decisive impact.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4823/
PDF	https://www.aclweb.org/anthology/W16-4823
PWC	https://paperswithcode.com/paper/advances-in-ngram-based-discrimination-of
Repo
Framework
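
The abstract above mentions character ngram features fed to generative classifiers. A minimal sketch of that general idea follows, assuming a multinomial Naive Bayes model with add-one smoothing over character trigrams. It is not the authors' system: the class name `NgramNaiveBayes`, the helper `char_ngrams`, and the four training sentences are all hypothetical and far too small for real use.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of a padded, lowercased string."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Multinomial Naive Bayes over character n-gram counts (add-one smoothing)."""

    def __init__(self, n=3):
        self.n = n
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.docs = Counter()               # label -> number of training texts
        self.vocab = set()                  # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.counts[label].update(grams)
            self.docs[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.docs.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, grams in self.counts.items():
            total = sum(grams.values())
            # Log prior plus smoothed log likelihood of each n-gram.
            score = math.log(self.docs[label] / total_docs)
            for gram in char_ngrams(text, self.n):
                score += math.log((grams[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: two Portuguese varieties, two sentences each.
train_texts = [
    "o menino está a ler o livro",
    "estou a falar com a minha mãe",
    "o menino está lendo o livro",
    "estou falando com a minha mãe",
]
train_labels = ["pt-PT", "pt-PT", "pt-BR", "pt-BR"]

model = NgramNaiveBayes(n=3).fit(train_texts, train_labels)
print(model.predict("ela está cantando"))
print(model.predict("ele está a cantar"))
```

The padding spaces let the model see word-boundary trigrams such as `do `, which is often where closely related varieties differ. A two-stage variant would first predict the language group and then train one such model per group.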

MAWPS: A Math Word Problem Repository


Title	MAWPS: A Math Word Problem Repository
Authors	Rik Koncel-Kedziorski, Subhro Roy, Aida Amini, Nate Kushman, Hannaneh Hajishirzi
Abstract
Tasks
Published	2016-06-01
URL	https://www.aclweb.org/anthology/N16-1136/
PDF	https://www.aclweb.org/anthology/N16-1136
PWC	https://paperswithcode.com/paper/mawps-a-math-word-problem-repository
Repo
Framework

Multi-label Annotation in Scientific Articles - The Multi-label Cancer Risk Assessment Corpus


Title	Multi-label Annotation in Scientific Articles - The Multi-label Cancer Risk Assessment Corpus
Authors	James Ravenscroft, Anika Oellrich, Shyamasree Saha, Maria Liakata
Abstract	With the constant growth of the scientific literature, automated processes to enable access to its contents are increasingly in demand. Several functional discourse annotation schemes have been proposed to facilitate information extraction and summarisation from scientific articles, the best known being argumentative zoning. Core Scientific Concepts (CoreSC) is a three-layered, fine-grained annotation scheme that provides content-based annotations at the sentence level and has been used to index, extract and summarise scientific publications in the biomedical literature. A previously developed CoreSC corpus, on which existing automated tools have been trained, contains a single annotation for each sentence. However, more than one CoreSC concept can appear in the same sentence. Here, we present the Multi-CoreSC CRA corpus, a text corpus specific to the domain of cancer risk assessment (CRA), consisting of 50 full-text papers, each of which contains sentences annotated with one or more CoreSCs. The full-text papers have been annotated by three biology experts. We present several inter-annotator agreement measures appropriate for multi-label annotation assessment. Using these measures, we were able to identify the most reliable annotator and build a harmonised consensus (gold standard) from the three annotators, while also taking concept priority (as specified in the guidelines) into account. We also show that the new Multi-CoreSC CRA corpus allows us to improve performance in the recognition of CoreSCs. The updated guidelines, the multi-label CoreSC CRA corpus and other related materials are available at the time of publication at http://www.sapientaproject.com/.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1650/
PDF	https://www.aclweb.org/anthology/L16-1650
PWC	https://paperswithcode.com/paper/multi-label-annotation-in-scientific-articles
Repo
Framework

Proceedings of the Workshop on Multilingual and Cross-lingual Methods in NLP


Title	Proceedings of the Workshop on Multilingual and Cross-lingual Methods in NLP
Authors
Abstract
Tasks
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-1200/
PDF	https://www.aclweb.org/anthology/W16-1200
PWC	https://paperswithcode.com/paper/proceedings-of-the-workshop-on-multilingual
Repo
Framework

Towards error annotation in a learner corpus of Portuguese


Title	Towards error annotation in a learner corpus of Portuguese
Authors	Iria del Río, Sandra Antunes, Amália Mendes, Maarten Janssen
Abstract
Tasks	Language Acquisition, Lemmatization
Published	2016-11-01
URL	https://www.aclweb.org/anthology/W16-6502/
PDF	https://www.aclweb.org/anthology/W16-6502
PWC	https://paperswithcode.com/paper/towards-error-annotation-in-a-learner-corpus
Repo
Framework

NER for Medical Entities in Twitter using Sequence to Sequence Neural Networks


Title	NER for Medical Entities in Twitter using Sequence to Sequence Neural Networks
Authors	Antonio Jimeno Yepes, Andrew MacKinlay
Abstract
Tasks	Feature Engineering, Named Entity Recognition, Word Embeddings
Published	2016-12-01
URL	https://www.aclweb.org/anthology/U16-1016/
PDF	https://www.aclweb.org/anthology/U16-1016
PWC	https://paperswithcode.com/paper/ner-for-medical-entities-in-twitter-using
Repo
Framework

JUNITMZ at SemEval-2016 Task 1: Identifying Semantic Similarity Using Levenshtein Ratio


Title	JUNITMZ at SemEval-2016 Task 1: Identifying Semantic Similarity Using Levenshtein Ratio
Authors	Sandip Sarkar, Dipankar Das, Partha Pakray, Alexander Gelbukh
Abstract
Tasks	Information Retrieval, Machine Translation, Semantic Similarity, Semantic Textual Similarity, Text Summarization
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1108/
PDF	https://www.aclweb.org/anthology/S16-1108
PWC	https://paperswithcode.com/paper/junitmz-at-semeval-2016-task-1-identifying
Repo
Framework

Spanish Word Vectors from Wikipedia


Title	Spanish Word Vectors from Wikipedia
Authors	Mathias Etcheverry, Dina Wonsever
Abstract	Content analysis of text data requires semantic representations that are difficult to obtain automatically, as they may require large handcrafted knowledge bases or manually annotated examples. Unsupervised autonomous methods for generating semantic representations are of great interest in the face of the huge volumes of text to be exploited in all kinds of applications. In this work we describe the generation and validation of semantic representations in the vector space paradigm for Spanish. The method used is GloVe (Pennington, 2014), one of the best performing reported methods, and vectors were trained on Spanish Wikipedia. The learned vectors are evaluated on word analogy and similarity tasks (Pennington, 2014; Baroni, 2014; Mikolov, 2013a). The vector set and a Spanish version of some widely used semantic relatedness tests are made publicly available.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1584/
PDF	https://www.aclweb.org/anthology/L16-1584
PWC	https://paperswithcode.com/paper/spanish-word-vectors-from-wikipedia
Repo
Framework
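
The abstract above evaluates vectors on word analogy and similarity tasks. A minimal sketch of how such an evaluation is commonly scored follows, assuming cosine similarity and the usual vector-offset rule for analogies. The 3-dimensional vectors and the function names `cosine` and `solve_analogy` are hypothetical and hand-made for illustration; they are not taken from the released vector set.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' by maximizing cosine to b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # The three query words are excluded from the candidate answers.
    candidates = (w for w in vectors if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# Hypothetical toy vectors, not real GloVe output.
toy_vectors = {
    "rey": [0.9, 0.8, 0.1],
    "reina": [0.9, 0.1, 0.8],
    "hombre": [0.1, 0.9, 0.1],
    "mujer": [0.1, 0.1, 0.9],
    "perro": [0.5, 0.5, 0.5],
}

# hombre : mujer :: rey : ?
print(solve_analogy(toy_vectors, "hombre", "mujer", "rey"))
```

For a similarity task, the same `cosine` function would be applied to each word pair and the scores correlated with human judgments.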

LIPN-IIMAS at SemEval-2016 Task 1: Random Forest Regression Experiments on Align-and-Differentiate and Word Embeddings penalizing strategies


Title	LIPN-IIMAS at SemEval-2016 Task 1: Random Forest Regression Experiments on Align-and-Differentiate and Word Embeddings penalizing strategies
Authors	Oscar William Lightgow Serrano, Ivan Vladimir Meza Ruiz, Albert Manuel Orozco Camacho, Jorge Garcia Flores, Davide Buscaldi
Abstract
Tasks	Information Retrieval, Semantic Textual Similarity, Word Embeddings
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1112/
PDF	https://www.aclweb.org/anthology/S16-1112
PWC	https://paperswithcode.com/paper/lipn-iimas-at-semeval-2016-task-1-random
Repo
Framework

“Congruent” and “Opposite” Neurons: Sisters for Multisensory Integration and Segregation


Title	“Congruent” and “Opposite” Neurons: Sisters for Multisensory Integration and Segregation
Authors	Wen-Hao Zhang, He Wang, K. Y. Michael Wong, Si Wu
Abstract	Experiments reveal that in the dorsal medial superior temporal (MSTd) and the ventral intraparietal (VIP) areas, where visual and vestibular cues are integrated to infer heading direction, there are two types of neurons in roughly equal numbers. One type is “congruent” cells, whose preferred heading directions are similar in response to visual and vestibular cues; the other is “opposite” cells, whose preferred heading directions are nearly opposite (offset by 180 degrees) in response to visual vs. vestibular cues. Congruent neurons are known to be responsible for cue integration, but the computational role of opposite neurons remains largely unknown. Here, we propose that opposite neurons may serve to encode the disparity information between cues necessary for multisensory segregation. We build a computational model composed of two reciprocally coupled modules, MSTd and VIP, each consisting of groups of congruent and opposite neurons. In the model, congruent neurons in the two modules are reciprocally connected in a congruent manner, whereas opposite neurons are reciprocally connected in an opposite manner. Mimicking the experimental protocol, our model reproduces the characteristics of congruent and opposite neurons, and demonstrates that in each module, the sisters of congruent and opposite neurons can jointly achieve optimal multisensory information integration and segregation. This study sheds light on how the brain implements optimal multisensory integration and segregation concurrently in a distributed manner.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6317-congruent-and-opposite-neurons-sisters-for-multisensory-integration-and-segregation
PDF	http://papers.nips.cc/paper/6317-congruent-and-opposite-neurons-sisters-for-multisensory-integration-and-segregation.pdf
PWC	https://paperswithcode.com/paper/congruent-and-opposite-neurons-sisters-for
Repo
Framework

Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than O(1/\epsilon)


Title	Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than O(1/\epsilon)
Authors	Yi Xu, Yan Yan, Qihang Lin, Tianbao Yang
Abstract	In this paper, we develop a novel homotopy smoothing (HOPS) algorithm for solving a family of non-smooth problems composed of a non-smooth term with an explicit max-structure and a smooth term or a simple non-smooth term whose proximal mapping is easy to compute. The best known iteration complexity for solving such non-smooth optimization problems is $O(1/\epsilon)$ without any assumption on the strong convexity. In this work, we show that the proposed HOPS achieves a lower iteration complexity of $\tilde O(1/\epsilon^{1-\theta})$, with $\theta\in(0,1]$ capturing the local sharpness of the objective function around the optimal solutions. To the best of our knowledge, this is the lowest iteration complexity achieved so far for the considered non-smooth optimization problems without a strong convexity assumption. The HOPS algorithm employs Nesterov’s smoothing technique and Nesterov’s accelerated gradient method and runs in stages, gradually decreasing the smoothing parameter in a stage-wise manner until it yields a sufficiently good approximation of the original function. We show that HOPS enjoys linear convergence for many well-known non-smooth problems (e.g., empirical risk minimization with a piece-wise linear loss function and $\ell_1$ norm regularizer, finding a point in a polyhedron, cone programming, etc.). Experimental results verify the effectiveness of HOPS in comparison with Nesterov’s smoothing algorithm and the primal-dual style of first-order methods.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6407-homotopy-smoothing-for-non-smooth-problems-with-lower-complexity-than-o1epsilon
PDF	http://papers.nips.cc/paper/6407-homotopy-smoothing-for-non-smooth-problems-with-lower-complexity-than-o1epsilon.pdf
PWC	https://paperswithcode.com/paper/homotopy-smoothing-for-non-smooth-problems-1
Repo
Framework
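
The abstract above describes smoothing a max-structured term and decreasing the smoothing parameter stage by stage with warm starts. A toy sketch of that idea follows, assuming the one-dimensional objective $|x - t|$, whose Nesterov smoothing with a quadratic prox term is the Huber function. This is not the HOPS algorithm itself: plain gradient steps stand in for Nesterov's accelerated method, and the function names, the halving schedule, and the iteration counts are all illustrative assumptions.

```python
def smoothed_abs(z, mu):
    """Nesterov smoothing of |z| = max_{|u|<=1} u*z with prox term mu*u^2/2."""
    if abs(z) <= mu:
        return z * z / (2.0 * mu)
    return abs(z) - mu / 2.0


def smoothed_abs_grad(z, mu):
    """Gradient of smoothed_abs; it is (1/mu)-Lipschitz."""
    return max(-1.0, min(1.0, z / mu))


def homotopy_minimize(target, x0=0.0, mu0=1.0, stages=5, iters_per_stage=20):
    """Minimize |x - target| in stages, halving mu and warm-starting each stage."""
    x, mu = x0, mu0
    for _ in range(stages):
        for _ in range(iters_per_stage):
            # Plain gradient step with step size 1/L = mu (no acceleration).
            x -= mu * smoothed_abs_grad(x - target, mu)
        mu /= 2.0  # tighter approximation for the next stage
    return x


print(homotopy_minimize(3.0))
```

The smoothed function stays within `mu / 2` of `abs(z)`, so a smaller `mu` gives a closer approximation but a larger gradient Lipschitz constant, which is the trade-off that a stage-wise schedule is meant to manage.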

Opinion Mining in a Code-Mixed Environment: A Case Study with Government Portals


Title	Opinion Mining in a Code-Mixed Environment: A Case Study with Government Portals
Authors	Deepak Gupta, Ankit Lamba, Asif Ekbal, Pushpak Bhattacharyya
Abstract
Tasks	Opinion Mining, Transliteration
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-6331/
PDF	https://www.aclweb.org/anthology/W16-6331
PWC	https://paperswithcode.com/paper/opinion-mining-in-a-code-mixed-environment-a
Repo
Framework