July 26, 2019

Paper Group NANR 56

A Report on the 2017 Native Language Identification Shared Task

Title A Report on the 2017 Native Language Identification Shared Task
Authors Shervin Malmasi, Keelan Evanini, Aoife Cahill, Joel Tetreault, Robert Pugh, Christopher Hamill, Diane Napolitano, Yao Qian
Abstract Native Language Identification (NLI) is the task of automatically identifying the native language (L1) of an individual based on their language production in a learned language. It is typically framed as a classification task where the set of L1s is known a priori. Two previous shared tasks on NLI have been organized where the aim was to identify the L1 of learners of English based on essays (2013) and spoken responses (2016) they provided during a standardized assessment of academic English proficiency. The 2017 shared task combines the inputs from the two prior tasks for the first time. There are three tracks: NLI on the essay only, NLI on the spoken response only (based on a transcription of the response and i-vector acoustic features), and NLI using both responses. We believe this makes for a more interesting shared task while building on the methods and results from the previous two shared tasks. In this paper, we report the results of the shared task. A total of 19 teams competed across the three different sub-tasks. The fusion track showed that combining the written and spoken responses provides a large boost in prediction accuracy. Multiple classifier systems (e.g. ensembles and meta-classifiers) were the most effective in all tasks, with most based on traditional classifiers (e.g. SVMs) with lexical/syntactic features.
Tasks Grammatical Error Correction, Language Acquisition, Language Identification, Native Language Identification
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5007/
PDF https://www.aclweb.org/anthology/W17-5007
PWC https://paperswithcode.com/paper/a-report-on-the-2017-native-language
Repo
Framework

Improving Implicit Discourse Relation Recognition with Discourse-specific Word Embeddings

Title Improving Implicit Discourse Relation Recognition with Discourse-specific Word Embeddings
Authors Changxing Wu, Xiaodong Shi, Yidong Chen, Jinsong Su, Boli Wang
Abstract We introduce a simple and effective method to learn discourse-specific word embeddings (DSWE) for implicit discourse relation recognition. Specifically, DSWE is learned by performing connective classification on massive explicit discourse data, and is capable of capturing discourse relationships between words. On the PDTB data set, using DSWE as features achieves significant improvements over baselines.
Tasks Machine Translation, Question Answering, Word Embeddings
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2042/
PDF https://www.aclweb.org/anthology/P17-2042
PWC https://paperswithcode.com/paper/improving-implicit-discourse-relation-1
Repo
Framework

Native Language Identification Using a Mixture of Character and Word N-grams

Title Native Language Identification Using a Mixture of Character and Word N-grams
Authors Elham Mohammadi, Hadi Veisi, Hessam Amini
Abstract Native language identification (NLI) is the task of determining an author's native language, based on a piece of his/her writing in a second language. In recent years, NLI has received much attention due to its challenging nature and its applications in language pedagogy and forensic linguistics. We participated in the NLI2017 shared task under the name UT-DSP. In our effort to implement a method for native language identification, we made use of a fusion of character and word N-grams, and achieved an optimal F1-Score of 77.64%, using both essay and speech transcription datasets.
Tasks Language Acquisition, Language Identification, Native Language Identification
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5022/
PDF https://www.aclweb.org/anthology/W17-5022
PWC https://paperswithcode.com/paper/native-language-identification-using-a
Repo
Framework
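
The abstract above mentions a fusion of character and word N-grams but does not give the N-gram orders, the weighting scheme, or the classifier. The following is only a minimal sketch of how the two feature types could be extracted and merged into one feature dictionary. The function names, the orders (`char_n=3`, `word_n=2`), the whitespace tokenization, and the `c:`/`w:` key prefixes are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """Return all overlapping word n-grams (lowercased, whitespace tokens)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def mixed_ngram_features(text, char_n=3, word_n=2):
    """Count character and word n-grams in a single feature dictionary.

    Keys are prefixed ("c:" / "w:") so the two feature spaces cannot collide.
    The orders char_n and word_n are illustrative assumptions.
    """
    features = Counter()
    for gram in char_ngrams(text.lower(), char_n):
        features["c:" + gram] += 1
    for gram in word_ngrams(text, word_n):
        features["w:" + gram] += 1
    return features


# Hypothetical learner sentence, used only to exercise the functions.
features = mixed_ngram_features("I am agree with you")
```

A dictionary like `features` could then be passed to any standard classifier that accepts sparse count features; which classifier the authors used is not stated in the abstract.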

Identifying and Tracking Sentiments and Topics from Social Media Texts during Natural Disasters

Title Identifying and Tracking Sentiments and Topics from Social Media Texts during Natural Disasters
Authors Min Yang, Jincheng Mei, Heng Ji, Wei Zhao, Zhou Zhao, Xiaojun Chen
Abstract We study the problem of identifying the topics and sentiments and tracking their shifts from social media texts in different geographical regions during emergencies and disasters. We propose a location-based dynamic sentiment-topic model (LDST) which can jointly model topic, sentiment, time and geolocation information. The experimental results demonstrate that LDST performs very well at discovering topics and sentiments from social media and tracking their shifts in different geographical regions during emergencies and disasters. We will release the data and source code after this work is published.
Tasks Topic Models
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1055/
PDF https://www.aclweb.org/anthology/D17-1055
PWC https://paperswithcode.com/paper/identifying-and-tracking-sentiments-and
Repo
Framework

Elucidating Conceptual Properties from Word Embeddings

Title Elucidating Conceptual Properties from Word Embeddings
Authors Kyoung-Rok Jang, Sung-Hyon Myaeng
Abstract In this paper, we introduce a method of identifying the components (i.e. dimensions) of word embeddings that strongly signify properties of a word. By elucidating such properties hidden in word embeddings, we could make word embeddings more interpretable, and also could perform property-based meaning comparison. With this capability, we can answer questions like "To what degree does a given word have the property cuteness?" or "In what perspective are two words similar?". We verify our method by examining how the strength of property-signifying components correlates with the degree of prototypicality of a target word.
Tasks Decision Making, Named Entity Recognition, Sentiment Analysis, Word Embeddings
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1911/
PDF https://www.aclweb.org/anthology/W17-1911
PWC https://paperswithcode.com/paper/elucidating-conceptual-properties-from-word
Repo
Framework

Automatic Community Creation for Abstractive Spoken Conversations Summarization

Title Automatic Community Creation for Abstractive Spoken Conversations Summarization
Authors Karan Singla, Evgeny Stepanov, Ali Orkan Bayer, Giuseppe Carenini, Giuseppe Riccardi
Abstract Summarization of spoken conversations is a challenging task, since it requires deep understanding of dialogs. Abstractive summarization techniques rely on linking the summary sentences to sets of original conversation sentences, i.e. communities. Unfortunately, such linking information is rarely available or requires trained annotators. We propose and experiment with automatic community creation using cosine similarity on different levels of representation: raw text, WordNet SynSet IDs, and word embeddings. We show that the abstractive summarization systems with automatic communities significantly outperform previously published results on both English and Italian corpora.
Tasks Abstractive Text Summarization, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4506/
PDF https://www.aclweb.org/anthology/W17-4506
PWC https://paperswithcode.com/paper/automatic-community-creation-for-abstractive
Repo
Framework
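
The abstract above describes linking each summary sentence to a set of conversation sentences (a community) using cosine similarity. A rough sketch of the raw-text variant is given below, assuming bag-of-words count vectors and a fixed similarity cutoff. The function names, the whitespace tokenization, and `threshold=0.3` are hypothetical choices for illustration; the abstract does not say how the authors select community members.

```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Lowercased whitespace tokens mapped to their counts."""
    return Counter(sentence.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_communities(summary_sentences, conversation_sentences, threshold=0.3):
    """For each summary sentence, list the indices of similar conversation sentences.

    The threshold is an assumed cutoff, not a value from the paper.
    """
    conv_vectors = [bag_of_words(s) for s in conversation_sentences]
    communities = []
    for summary in summary_sentences:
        summary_vec = bag_of_words(summary)
        members = [
            i for i, vec in enumerate(conv_vectors)
            if cosine_similarity(summary_vec, vec) >= threshold
        ]
        communities.append(members)
    return communities


# Made-up example data.
conversation = [
    "my internet connection is down",
    "thank you for calling",
    "please restart your router",
]
summary = ["the customer reports the internet connection is down"]
communities = build_communities(summary, conversation)
```

The same `cosine_similarity` function could in principle be reused for the other representation levels the abstract names (WordNet SynSet IDs, word embeddings) by changing how each sentence is turned into a vector.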

Identifying Deception in Indonesian Transcribed Interviews through Lexical-based Approach

Title Identifying Deception in Indonesian Transcribed Interviews through Lexical-based Approach
Authors Tifani Warnita, Dessi Puji Lestari
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1022/
PDF https://www.aclweb.org/anthology/Y17-1022
PWC https://paperswithcode.com/paper/identifying-deception-in-indonesian
Repo
Framework

Process-constrained batch Bayesian optimisation

Title Process-constrained batch Bayesian optimisation
Authors Pratibha Vellanki, Santu Rana, Sunil Gupta, David Rubin, Alessandra Sutti, Thomas Dorin, Murray Height, Paul Sanders, Svetha Venkatesh
Abstract Prevailing batch Bayesian optimisation methods allow all control variables to be freely altered at each iteration. Real-world experiments, however, often have physical limitations making it time-consuming to alter all settings for each recommendation in a batch. This gives rise to a unique problem in BO: in a recommended batch, a set of variables that are expensive to experimentally change needs to be fixed, while the remaining control variables can be varied. We formulate this as a process-constrained batch Bayesian optimisation problem. We propose two algorithms, pc-BO(basic) and pc-BO(nested). pc-BO(basic) is simpler but lacks a convergence guarantee. In contrast, pc-BO(nested) is slightly more complex, but admits convergence analysis. We show that the regret of pc-BO(nested) is sublinear. We demonstrate the performance of both pc-BO(basic) and pc-BO(nested) by optimising benchmark test functions, tuning hyper-parameters of the SVM classifier, optimising the heat-treatment process for an Al-Sc alloy to achieve target hardness, and optimising the short polymer fibre production process.
Tasks Bayesian Optimisation
Published 2017-12-01
URL http://papers.nips.cc/paper/6933-process-constrained-batch-bayesian-optimisation
PDF http://papers.nips.cc/paper/6933-process-constrained-batch-bayesian-optimisation.pdf
PWC https://paperswithcode.com/paper/process-constrained-batch-bayesian
Repo
Framework

Alto: Rapid Prototyping for Parsing and Translation

Title Alto: Rapid Prototyping for Parsing and Translation
Authors Johannes Gontrum, Jonas Groschwitz, Alexander Koller, Christoph Teichmann
Abstract We present Alto, a rapid prototyping tool for new grammar formalisms. Alto implements generic but efficient algorithms for parsing, translation, and training for a range of monolingual and synchronous grammar formalisms. It can easily be extended to new formalisms, which makes all of these algorithms immediately available for the new formalism.
Tasks Machine Translation, Semantic Parsing
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-3008/
PDF https://www.aclweb.org/anthology/E17-3008
PWC https://paperswithcode.com/paper/alto-rapid-prototyping-for-parsing-and
Repo
Framework

EviNets: Neural Networks for Combining Evidence Signals for Factoid Question Answering

Title EviNets: Neural Networks for Combining Evidence Signals for Factoid Question Answering
Authors Denis Savenkov, Eugene Agichtein
Abstract A critical task for question answering is the final answer selection stage, which has to combine multiple signals available about each answer candidate. This paper proposes EviNets: a novel neural network architecture for factoid question answering. EviNets scores candidate answer entities by combining the available supporting evidence, e.g., structured knowledge bases and unstructured text documents. EviNets represents each piece of evidence with a dense embeddings vector, scores their relevance to the question, and aggregates the support for each candidate to predict their final scores. Each of the components is generic and allows plugging in a variety of models for semantic similarity scoring and information aggregation. We demonstrate the effectiveness of EviNets in experiments on the existing TREC QA and WikiMovies benchmarks, and on the new Yahoo! Answers dataset introduced in this paper. EviNets can be extended to other information types and could facilitate future work on combining evidence signals for joint reasoning in question answering.
Tasks Answer Selection, Feature Engineering, Information Retrieval, Knowledge Base Question Answering, Question Answering, Semantic Similarity, Semantic Textual Similarity
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2047/
PDF https://www.aclweb.org/anthology/P17-2047
PWC https://paperswithcode.com/paper/evinets-neural-networks-for-combining
Repo
Framework

Do LSTMs really work so well for PoS tagging? – A replication study

Title Do LSTMs really work so well for PoS tagging? – A replication study
Authors Tobias Horsmann, Torsten Zesch
Abstract A recent study by Plank et al. (2016) found that LSTM-based PoS taggers considerably improve over the current state-of-the-art when evaluated on the corpora of the Universal Dependencies project that use a coarse-grained tagset. We replicate this study using a fresh collection of 27 corpora of 21 languages that are annotated with fine-grained tagsets of varying size. Our replication confirms the result in general, and we additionally find that the advantage of LSTMs is even bigger for larger tagsets. However, we also find that for the very large tagsets of morphologically rich languages, hand-crafted morphological lexicons are still necessary to reach state-of-the-art performance.
Tasks Feature Engineering, Part-Of-Speech Tagging
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1076/
PDF https://www.aclweb.org/anthology/D17-1076
PWC https://paperswithcode.com/paper/do-lstms-really-work-so-well-for-pos-tagging
Repo
Framework

Adaptive SVRG Methods under Error Bound Conditions with Unknown Growth Parameter

Title Adaptive SVRG Methods under Error Bound Conditions with Unknown Growth Parameter
Authors Yi Xu, Qihang Lin, Tianbao Yang
Abstract Error bound, an inherent property of an optimization problem, has recently been revived in the development of algorithms with improved global convergence without strong convexity. The most studied error bound is the quadratic error bound, which generalizes strong convexity and is satisfied by a large family of machine learning problems. Quadratic error bounds have been leveraged to achieve linear convergence in many first-order methods including the stochastic variance reduced gradient (SVRG) method, which is one of the most important stochastic optimization methods in machine learning. However, the studies along this direction face the critical issue that the algorithms must depend on an unknown growth parameter (a generalization of strong convexity modulus) in the error bound. This parameter is difficult to estimate exactly and the algorithms choosing this parameter heuristically do not have a theoretical convergence guarantee. To address this issue, we propose novel SVRG methods that automatically search for this unknown parameter on the fly during optimization while still obtaining almost the same convergence rate as when this parameter is known. We also analyze the convergence property of SVRG methods under the Hölderian error bound, which generalizes the quadratic error bound.
Tasks Stochastic Optimization
Published 2017-12-01
URL http://papers.nips.cc/paper/6920-adaptive-svrg-methods-under-error-bound-conditions-with-unknown-growth-parameter
PDF http://papers.nips.cc/paper/6920-adaptive-svrg-methods-under-error-bound-conditions-with-unknown-growth-parameter.pdf
PWC https://paperswithcode.com/paper/adaptive-svrg-methods-under-error-bound
Repo
Framework
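
For readers unfamiliar with the base method named in the abstract above, here is a minimal sketch of plain SVRG on a one-dimensional least-squares problem. It is the standard fixed-step algorithm, not the adaptive variant the paper proposes: the step size and inner-loop length are set by hand, which is exactly the kind of manual choice the paper's parameter search is meant to avoid. The function name, the toy data, and all default values are assumptions for illustration.

```python
import random


def svrg_least_squares(xs, ys, step=0.02, epochs=40, inner_steps=None, seed=0):
    """Plain SVRG for min_w (1/n) * sum_i 0.5 * (w * x_i - y_i)^2.

    step, epochs and inner_steps are hand-picked illustrative values.
    """
    rng = random.Random(seed)
    n = len(xs)
    if inner_steps is None:
        inner_steps = 5 * n

    def grad_i(w, i):
        # Gradient of the i-th component function at w.
        return (w * xs[i] - ys[i]) * xs[i]

    w_snapshot = 0.0
    for _ in range(epochs):
        # Full gradient at the snapshot, computed once per outer epoch.
        full_grad = sum(grad_i(w_snapshot, i) for i in range(n)) / n
        w = w_snapshot
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Variance-reduced stochastic gradient estimate.
            estimate = grad_i(w, i) - grad_i(w_snapshot, i) + full_grad
            w -= step * estimate
        # Use the last inner iterate as the next snapshot.
        w_snapshot = w
    return w_snapshot


# Toy data generated by y = 2x, so the minimizer is w = 2.
w_hat = svrg_least_squares([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

On this toy problem `w_hat` should come out very close to 2.0. The inner update subtracts the snapshot's component gradient and adds back the full gradient, which keeps the estimate unbiased while shrinking its variance as the iterate approaches the snapshot.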

Extending hybrid word-character neural machine translation with multi-task learning of morphological analysis

Title Extending hybrid word-character neural machine translation with multi-task learning of morphological analysis
Authors Stig-Arne Grönroos, Sami Virpioja, Mikko Kurimo
Abstract
Tasks Machine Translation, Morphological Analysis, Multi-Task Learning
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4727/
PDF https://www.aclweb.org/anthology/W17-4727
PWC https://paperswithcode.com/paper/extending-hybrid-word-character-neural
Repo
Framework

A rule-based system for cross-lingual parsing of Romance languages with Universal Dependencies

Title A rule-based system for cross-lingual parsing of Romance languages with Universal Dependencies
Authors Marcos Garcia, Pablo Gamallo
Abstract This article describes MetaRomance, a rule-based cross-lingual parser for Romance languages submitted to the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. The system is an almost delexicalized parser which does not need training data to analyze Romance languages. It contains linguistically motivated rules based on PoS-tag patterns. The rules included in MetaRomance were developed in about 12 hours by one expert with no prior knowledge of Universal Dependencies, and can be easily extended using a transparent formalism. In this paper we compare the performance of MetaRomance with other supervised systems participating in the competition, paying special attention to the parsing of different treebanks of the same language. We also compare our system with a delexicalized parser for Romance languages, and take advantage of the harmonized annotation of Universal Dependencies to propose a language ranking based on the syntactic distance each variety has from Romance languages.
Tasks Dependency Parsing
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-3029/
PDF https://www.aclweb.org/anthology/K17-3029
PWC https://paperswithcode.com/paper/a-rule-based-system-for-cross-lingual-parsing
Repo
Framework

運用類神經網路方法之語言端點偵測研究 (A Study on Voice Activation Detection by Using Neural Networks) [In Chinese]

Title 運用類神經網路方法之語言端點偵測研究 (A Study on Voice Activation Detection by Using Neural Networks) [In Chinese]
Authors Yu-Chih Deng, Chen-Yu Chiang, Chen-Ming Pan
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/O17-1002/
PDF https://www.aclweb.org/anthology/O17-1002
PWC https://paperswithcode.com/paper/ec-eccc2e-13a1eae-c-ea-c-c-a-study-on-voice
Repo
Framework