Paper Group NANR 56
A Report on the 2017 Native Language Identification Shared Task
Title | A Report on the 2017 Native Language Identification Shared Task |
Authors | Shervin Malmasi, Keelan Evanini, Aoife Cahill, Joel Tetreault, Robert Pugh, Christopher Hamill, Diane Napolitano, Yao Qian |
Abstract | Native Language Identification (NLI) is the task of automatically identifying the native language (L1) of an individual based on their language production in a learned language. It is typically framed as a classification task where the set of L1s is known a priori. Two previous shared tasks on NLI have been organized where the aim was to identify the L1 of learners of English based on essays (2013) and spoken responses (2016) they provided during a standardized assessment of academic English proficiency. The 2017 shared task combines the inputs from the two prior tasks for the first time. There are three tracks: NLI on the essay only, NLI on the spoken response only (based on a transcription of the response and i-vector acoustic features), and NLI using both responses. We believe this makes for a more interesting shared task while building on the methods and results from the previous two shared tasks. In this paper, we report the results of the shared task. A total of 19 teams competed across the three different sub-tasks. The fusion track showed that combining the written and spoken responses provides a large boost in prediction accuracy. Multiple classifier systems (e.g. ensembles and meta-classifiers) were the most effective in all tasks, with most based on traditional classifiers (e.g. SVMs) with lexical/syntactic features. |
Tasks | Grammatical Error Correction, Language Acquisition, Language Identification, Native Language Identification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5007/ |
https://www.aclweb.org/anthology/W17-5007 | |
PWC | https://paperswithcode.com/paper/a-report-on-the-2017-native-language |
Repo | |
Framework | |
Improving Implicit Discourse Relation Recognition with Discourse-specific Word Embeddings
Title | Improving Implicit Discourse Relation Recognition with Discourse-specific Word Embeddings |
Authors | Changxing Wu, Xiaodong Shi, Yidong Chen, Jinsong Su, Boli Wang |
Abstract | We introduce a simple and effective method to learn discourse-specific word embeddings (DSWE) for implicit discourse relation recognition. Specifically, DSWE is learned by performing connective classification on massive explicit discourse data, and is capable of capturing discourse relationships between words. On the PDTB data set, using DSWE as features achieves significant improvements over baselines. |
Tasks | Machine Translation, Question Answering, Word Embeddings |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2042/ |
https://www.aclweb.org/anthology/P17-2042 | |
PWC | https://paperswithcode.com/paper/improving-implicit-discourse-relation-1 |
Repo | |
Framework | |
Native Language Identification Using a Mixture of Character and Word N-grams
Title | Native Language Identification Using a Mixture of Character and Word N-grams |
Authors | Elham Mohammadi, Hadi Veisi, Hessam Amini |
Abstract | Native language identification (NLI) is the task of determining an author's native language, based on a piece of his/her writing in a second language. In recent years, NLI has received much attention due to its challenging nature and its applications in language pedagogy and forensic linguistics. We participated in the NLI2017 shared task under the name UT-DSP. In our effort to implement a method for native language identification, we made use of a fusion of character and word N-grams, and achieved an optimal F1-Score of 77.64%, using both essay and speech transcription datasets. |
Tasks | Language Acquisition, Language Identification, Native Language Identification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5022/ |
https://www.aclweb.org/anthology/W17-5022 | |
PWC | https://paperswithcode.com/paper/native-language-identification-using-a |
Repo | |
Framework | |
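The abstract above describes fusing character and word N-grams as features. The following is a minimal sketch of that general idea only, not the UT-DSP system: the function names, the N-gram orders (character trigrams, word bigrams), the `c:`/`w:` key prefixes, and the sample sentence are all assumptions made for illustration.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """Return all overlapping word n-grams, splitting on whitespace."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def mixed_ngram_features(text, char_n=3, word_n=2):
    """Count character and word n-grams in one feature dictionary.

    Keys are prefixed ("c:" / "w:") so the two feature families never collide.
    The orders char_n and word_n are illustrative defaults, not values
    taken from the paper.
    """
    features = Counter()
    for gram in char_ngrams(text.lower(), char_n):
        features["c:" + gram] += 1
    for gram in word_ngrams(text, word_n):
        features["w:" + gram] += 1
    return features


# Hypothetical learner sentence used only to exercise the functions.
feats = mixed_ngram_features("I am agree with you")
```

A dictionary like `feats` could then be passed to any classifier that accepts sparse count features; which classifier the authors used is not stated in the abstract.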
Identifying and Tracking Sentiments and Topics from Social Media Texts during Natural Disasters
Title | Identifying and Tracking Sentiments and Topics from Social Media Texts during Natural Disasters |
Authors | Min Yang, Jincheng Mei, Heng Ji, Wei Zhao, Zhou Zhao, Xiaojun Chen |
Abstract | We study the problem of identifying the topics and sentiments and tracking their shifts from social media texts in different geographical regions during emergencies and disasters. We propose a location-based dynamic sentiment-topic model (LDST) which can jointly model topic, sentiment, time and geolocation information. The experimental results demonstrate that LDST performs very well at discovering topics and sentiments from social media and tracking their shifts in different geographical regions during emergencies and disasters. We will release the data and source code after this work is published. |
Tasks | Topic Models |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1055/ |
https://www.aclweb.org/anthology/D17-1055 | |
PWC | https://paperswithcode.com/paper/identifying-and-tracking-sentiments-and |
Repo | |
Framework | |
Elucidating Conceptual Properties from Word Embeddings
Title | Elucidating Conceptual Properties from Word Embeddings |
Authors | Kyoung-Rok Jang, Sung-Hyon Myaeng |
Abstract | In this paper, we introduce a method of identifying the components (i.e. dimensions) of word embeddings that strongly signify properties of a word. By elucidating such properties hidden in word embeddings, we could make word embeddings more interpretable, and also could perform property-based meaning comparison. With the capability, we can answer questions like "To what degree a given word has the property cuteness?" or "In what perspective two words are similar?". We verify our method by examining how the strength of property-signifying components correlates with the degree of prototypicality of a target word. |
Tasks | Decision Making, Named Entity Recognition, Sentiment Analysis, Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1911/ |
https://www.aclweb.org/anthology/W17-1911 | |
PWC | https://paperswithcode.com/paper/elucidating-conceptual-properties-from-word |
Repo | |
Framework | |
Automatic Community Creation for Abstractive Spoken Conversations Summarization
Title | Automatic Community Creation for Abstractive Spoken Conversations Summarization |
Authors | Karan Singla, Evgeny Stepanov, Ali Orkan Bayer, Giuseppe Carenini, Giuseppe Riccardi |
Abstract | Summarization of spoken conversations is a challenging task, since it requires deep understanding of dialogs. Abstractive summarization techniques rely on linking the summary sentences to sets of original conversation sentences, i.e. communities. Unfortunately, such linking information is rarely available or requires trained annotators. We propose and experiment with automatic community creation using cosine similarity on different levels of representation: raw text, WordNet SynSet IDs, and word embeddings. We show that the abstractive summarization systems with automatic communities significantly outperform previously published results on both English and Italian corpora. |
Tasks | Abstractive Text Summarization, Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4506/ |
https://www.aclweb.org/anthology/W17-4506 | |
PWC | https://paperswithcode.com/paper/automatic-community-creation-for-abstractive |
Repo | |
Framework | |
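The abstract above links each summary sentence to a set of conversation sentences (a community) using cosine similarity. The sketch below illustrates that idea on the raw-text level only, with bag-of-words counts. It is not the authors' implementation: the function names, the whitespace tokenization, the 0.25 threshold, and the sample sentences are assumptions chosen for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_communities(summary_sentences, conversation_sentences, threshold=0.25):
    """For each summary sentence, return indices of similar conversation sentences.

    The threshold is a hypothetical value, not one reported in the paper.
    """
    conv_vectors = [Counter(s.lower().split()) for s in conversation_sentences]
    communities = []
    for summary in summary_sentences:
        summary_vec = Counter(summary.lower().split())
        members = [
            i for i, vec in enumerate(conv_vectors)
            if cosine_similarity(summary_vec, vec) >= threshold
        ]
        communities.append(members)
    return communities


# Made-up sentences used only to exercise the functions.
summaries = ["the customer asks about a refund"]
conversation = [
    "i want a refund for my order",
    "the weather is nice today",
    "can the customer get a refund",
]
communities = build_communities(summaries, conversation)
```

Swapping the `Counter` vectors for WordNet synset counts or averaged word embeddings would correspond to the other representation levels the abstract mentions, though the details of how the authors did this are not given here.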
Identifying Deception in Indonesian Transcribed Interviews through Lexical-based Approach
Title | Identifying Deception in Indonesian Transcribed Interviews through Lexical-based Approach |
Authors | Tifani Warnita, Dessi Puji Lestari |
Abstract | |
Tasks | |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/Y17-1022/ |
https://www.aclweb.org/anthology/Y17-1022 | |
PWC | https://paperswithcode.com/paper/identifying-deception-in-indonesian |
Repo | |
Framework | |
Process-constrained batch Bayesian optimisation
Title | Process-constrained batch Bayesian optimisation |
Authors | Pratibha Vellanki, Santu Rana, Sunil Gupta, David Rubin, Alessandra Sutti, Thomas Dorin, Murray Height, Paul Sanders, Svetha Venkatesh |
Abstract | Prevailing batch Bayesian optimisation methods allow all control variables to be freely altered at each iteration. Real-world experiments, however, often have physical limitations making it time-consuming to alter all settings for each recommendation in a batch. This gives rise to a unique problem in BO: in a recommended batch, a set of variables that are expensive to experimentally change need to be fixed, while the remaining control variables can be varied. We formulate this as a process-constrained batch Bayesian optimisation problem. We propose two algorithms, pc-BO(basic) and pc-BO(nested). pc-BO(basic) is simpler but lacks a convergence guarantee. In contrast, pc-BO(nested) is slightly more complex, but admits convergence analysis. We show that the regret of pc-BO(nested) is sublinear. We demonstrate the performance of both pc-BO(basic) and pc-BO(nested) by optimising benchmark test functions, tuning hyper-parameters of the SVM classifier, optimising the heat-treatment process for an Al-Sc alloy to achieve target hardness, and optimising the short polymer fibre production process. |
Tasks | Bayesian Optimisation |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6933-process-constrained-batch-bayesian-optimisation |
http://papers.nips.cc/paper/6933-process-constrained-batch-bayesian-optimisation.pdf | |
PWC | https://paperswithcode.com/paper/process-constrained-batch-bayesian |
Repo | |
Framework | |
Alto: Rapid Prototyping for Parsing and Translation
Title | Alto: Rapid Prototyping for Parsing and Translation |
Authors | Johannes Gontrum, Jonas Groschwitz, Alexander Koller, Christoph Teichmann |
Abstract | We present Alto, a rapid prototyping tool for new grammar formalisms. Alto implements generic but efficient algorithms for parsing, translation, and training for a range of monolingual and synchronous grammar formalisms. It can easily be extended to new formalisms, which makes all of these algorithms immediately available for the new formalism. |
Tasks | Machine Translation, Semantic Parsing |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-3008/ |
https://www.aclweb.org/anthology/E17-3008 | |
PWC | https://paperswithcode.com/paper/alto-rapid-prototyping-for-parsing-and |
Repo | |
Framework | |
EviNets: Neural Networks for Combining Evidence Signals for Factoid Question Answering
Title | EviNets: Neural Networks for Combining Evidence Signals for Factoid Question Answering |
Authors | Denis Savenkov, Eugene Agichtein |
Abstract | A critical task for question answering is the final answer selection stage, which has to combine multiple signals available about each answer candidate. This paper proposes EviNets: a novel neural network architecture for factoid question answering. EviNets scores candidate answer entities by combining the available supporting evidence, e.g., structured knowledge bases and unstructured text documents. EviNets represents each piece of evidence with a dense embeddings vector, scores their relevance to the question, and aggregates the support for each candidate to predict their final scores. Each of the components is generic and allows plugging in a variety of models for semantic similarity scoring and information aggregation. We demonstrate the effectiveness of EviNets in experiments on the existing TREC QA and WikiMovies benchmarks, and on the new Yahoo! Answers dataset introduced in this paper. EviNets can be extended to other information types and could facilitate future work on combining evidence signals for joint reasoning in question answering. |
Tasks | Answer Selection, Feature Engineering, Information Retrieval, Knowledge Base Question Answering, Question Answering, Semantic Similarity, Semantic Textual Similarity |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2047/ |
https://www.aclweb.org/anthology/P17-2047 | |
PWC | https://paperswithcode.com/paper/evinets-neural-networks-for-combining |
Repo | |
Framework | |
Do LSTMs really work so well for PoS tagging? – A replication study
Title | Do LSTMs really work so well for PoS tagging? – A replication study |
Authors | Tobias Horsmann, Torsten Zesch |
Abstract | A recent study by Plank et al. (2016) found that LSTM-based PoS taggers considerably improve over the current state-of-the-art when evaluated on the corpora of the Universal Dependencies project that use a coarse-grained tagset. We replicate this study using a fresh collection of 27 corpora of 21 languages that are annotated with fine-grained tagsets of varying size. Our replication confirms the result in general, and we additionally find that the advantage of LSTMs is even bigger for larger tagsets. However, we also find that for the very large tagsets of morphologically rich languages, hand-crafted morphological lexicons are still necessary to reach state-of-the-art performance. |
Tasks | Feature Engineering, Part-Of-Speech Tagging |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1076/ |
https://www.aclweb.org/anthology/D17-1076 | |
PWC | https://paperswithcode.com/paper/do-lstms-really-work-so-well-for-pos-tagging |
Repo | |
Framework | |
Adaptive SVRG Methods under Error Bound Conditions with Unknown Growth Parameter
Title | Adaptive SVRG Methods under Error Bound Conditions with Unknown Growth Parameter |
Authors | Yi Xu, Qihang Lin, Tianbao Yang |
Abstract | Error bound, an inherent property of an optimization problem, has recently been revived in the development of algorithms with improved global convergence without strong convexity. The most studied error bound is the quadratic error bound, which generalizes strong convexity and is satisfied by a large family of machine learning problems. Quadratic error bounds have been leveraged to achieve linear convergence in many first-order methods including the stochastic variance reduced gradient (SVRG) method, which is one of the most important stochastic optimization methods in machine learning. However, the studies along this direction face the critical issue that the algorithms must depend on an unknown growth parameter (a generalization of strong convexity modulus) in the error bound. This parameter is difficult to estimate exactly and the algorithms choosing this parameter heuristically do not have a theoretical convergence guarantee. To address this issue, we propose novel SVRG methods that automatically search for this unknown parameter on the fly during optimization while still obtaining almost the same convergence rate as when this parameter is known. We also analyze the convergence property of SVRG methods under the Hölderian error bound, which generalizes the quadratic error bound. |
Tasks | Stochastic Optimization |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6920-adaptive-svrg-methods-under-error-bound-conditions-with-unknown-growth-parameter |
http://papers.nips.cc/paper/6920-adaptive-svrg-methods-under-error-bound-conditions-with-unknown-growth-parameter.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-svrg-methods-under-error-bound |
Repo | |
Framework | |
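For readers unfamiliar with SVRG, the sketch below shows the standard, non-adaptive method that the abstract above builds on: each outer loop takes a snapshot, computes the full gradient there once, and uses it to correct the stochastic gradients of the inner loop. This is not the adaptive variant proposed in the paper, which searches for the unknown growth parameter. The one-dimensional least-squares problem, the fixed step size, the loop lengths, and all names are assumptions chosen for illustration.

```python
import random


def svrg_least_squares(xs, ys, step=0.05, epochs=30, inner_steps=None, seed=0):
    """Fit y = w * x with standard SVRG on the average squared error.

    step, epochs and inner_steps are illustrative choices, not tuned values.
    """
    rng = random.Random(seed)
    n = len(xs)
    inner_steps = inner_steps or 2 * n
    w = 0.0

    def grad_i(w_val, i):
        # Gradient of the i-th loss 0.5 * (w * x_i - y_i)^2 with respect to w.
        return (w_val * xs[i] - ys[i]) * xs[i]

    for _ in range(epochs):
        w_snapshot = w
        # Full gradient at the snapshot, computed once per outer loop.
        full_grad = sum(grad_i(w_snapshot, i) for i in range(n)) / n
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Variance-reduced gradient: stochastic gradient at w, corrected
            # by the same sample's gradient at the snapshot plus the full gradient.
            g = grad_i(w, i) - grad_i(w_snapshot, i) + full_grad
            w -= step * g
    return w


# Toy data generated by y = 2 * x, so the minimizer is w = 2.
w_hat = svrg_least_squares([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

In this fixed-step form, a safe step size and inner-loop length depend on problem constants that are generally unknown in practice, which is the kind of dependence the paper's adaptive methods aim to remove.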
Extending hybrid word-character neural machine translation with multi-task learning of morphological analysis
Title | Extending hybrid word-character neural machine translation with multi-task learning of morphological analysis |
Authors | Stig-Arne Grönroos, Sami Virpioja, Mikko Kurimo |
Abstract | |
Tasks | Machine Translation, Morphological Analysis, Multi-Task Learning |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4727/ |
https://www.aclweb.org/anthology/W17-4727 | |
PWC | https://paperswithcode.com/paper/extending-hybrid-word-character-neural |
Repo | |
Framework | |
A rule-based system for cross-lingual parsing of Romance languages with Universal Dependencies
Title | A rule-based system for cross-lingual parsing of Romance languages with Universal Dependencies |
Authors | Marcos Garcia, Pablo Gamallo |
Abstract | This article describes MetaRomance, a rule-based cross-lingual parser for Romance languages submitted to the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. The system is an almost delexicalized parser which does not need training data to analyze Romance languages. It contains linguistically motivated rules based on PoS-tag patterns. The rules included in MetaRomance were developed in about 12 hours by one expert with no prior knowledge of Universal Dependencies, and can be easily extended using a transparent formalism. In this paper we compare the performance of MetaRomance with other supervised systems participating in the competition, paying special attention to the parsing of different treebanks of the same language. We also compare our system with a delexicalized parser for Romance languages, and take advantage of the harmonized annotation of Universal Dependencies to propose a language ranking based on the syntactic distance each variety has from Romance languages. |
Tasks | Dependency Parsing |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/K17-3029/ |
https://www.aclweb.org/anthology/K17-3029 | |
PWC | https://paperswithcode.com/paper/a-rule-based-system-for-cross-lingual-parsing |
Repo | |
Framework | |
運用類神經網路方法之語言端點偵測研究 (A Study on Voice Activation Detection by Using Neural Networks) [In Chinese]
Title | 運用類神經網路方法之語言端點偵測研究 (A Study on Voice Activation Detection by Using Neural Networks) [In Chinese] |
Authors | Yu-Chih Deng, Chen-Yu Chiang, Chen-Ming Pan |
Abstract | |
Tasks | |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/O17-1002/ |
https://www.aclweb.org/anthology/O17-1002 | |
PWC | https://paperswithcode.com/paper/ec-eccc2e-13a1eae-c-ea-c-c-a-study-on-voice |
Repo | |
Framework | |