July 26, 2019

1847 words 9 mins read

Paper Group NANR 89

Subjecthood and Grammatical Relations in Korean: An Experimental Study with Honorific Agreement and Plural Copying. Wh-island Effects in Korean Scrambling Constructions. Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling. NTUCLE: Developing a Corpus of Learner English to Provide Writing Support for Engineering Students. Zara …

Subjecthood and Grammatical Relations in Korean: An Experimental Study with Honorific Agreement and Plural Copying

Title Subjecthood and Grammatical Relations in Korean: An Experimental Study with Honorific Agreement and Plural Copying
Authors Ji-Hye Kim, Yong-Hun Lee, James Hye-Suk Yoon
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1029/
PDF https://www.aclweb.org/anthology/Y17-1029
PWC https://paperswithcode.com/paper/subjecthood-and-grammatical-relations-in
Repo
Framework

Wh-island Effects in Korean Scrambling Constructions

Title Wh-island Effects in Korean Scrambling Constructions
Authors Juyeon Cho
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1044/
PDF https://www.aclweb.org/anthology/Y17-1044
PWC https://paperswithcode.com/paper/wh-island-effects-in-korean-scrambling
Repo
Framework

Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling

Title Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling
Authors Gábor Berend
Abstract In this paper we propose and carefully evaluate a sequence labeling framework which solely utilizes sparse indicator features derived from dense distributed word representations. The proposed model obtains (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition for a variety of languages. Our model relies only on a few thousand sparse coding-derived features, without applying any modification of the word representations employed for the different tasks. The proposed model has favorable generalization properties as it retains over 89.8% of its average POS tagging accuracy when trained on 1.2% of the total available training data, i.e. 150 sentences per language.
Tasks Feature Engineering, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings
Published 2017-01-01
URL https://www.aclweb.org/anthology/Q17-1018/
PDF https://www.aclweb.org/anthology/Q17-1018
PWC https://paperswithcode.com/paper/sparse-coding-of-neural-word-embeddings-for-1
Repo
Framework
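
The indicator-feature scheme the abstract describes can be sketched in a few lines (this is not the authors' code; the function name and the `D{i}` feature labels are hypothetical). Each word's sparse coefficient vector, obtained from a dictionary-learning step, is mapped to symbolic features that record which dictionary atoms fire and with what sign, discarding magnitudes.

```python
def sparse_indicator_features(code, prefix="D"):
    """Turn a sparse coefficient vector into symbolic indicator features.

    Each nonzero coordinate contributes one feature naming the dictionary
    atom and the coefficient's sign; magnitudes are deliberately dropped,
    so downstream sequence labelers see purely binary indicators.
    """
    feats = []
    for i, c in enumerate(code):
        if c > 0:
            feats.append(f"{prefix}{i}+")
        elif c < 0:
            feats.append(f"{prefix}{i}-")
    return feats
```

For example, `sparse_indicator_features([0.0, 0.7, 0.0, -0.2])` yields `["D1+", "D3-"]`: only two of the few thousand possible indicators are active for this word, which is what keeps the feature set sparse.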

NTUCLE: Developing a Corpus of Learner English to Provide Writing Support for Engineering Students

Title NTUCLE: Developing a Corpus of Learner English to Provide Writing Support for Engineering Students
Authors Roger Vivek Placidus Winder, Joseph MacKinnon, Shu Yun Li, Benedict Christopher Tzer Liang Lin, Carmel Lee Hah Heah, Luís Morgado da Costa, Takayuki Kuribayashi, Francis Bond
Abstract This paper describes the creation of a new annotated learner corpus. The aim is to use this corpus to develop an automated system for corrective feedback on students' writing. With this system, students will be able to receive timely feedback on language errors before they submit their assignments for grading. A corpus of assignments submitted by first year engineering students was compiled, and a new error tag set for the NTU Corpus of Learner English (NTUCLE) was developed based on that of the NUS Corpus of Learner English (NUCLE), as well as marking rubrics used at NTU. After a description of the corpus, error tag set and annotation process, the paper presents the results of the annotation exercise as well as follow-up actions. The final error tag set, which is significantly larger than that for the NUCLE error categories, is then presented before a brief conclusion summarising our experience and future plans.
Tasks
Published 2017-12-01
URL https://www.aclweb.org/anthology/W17-5901/
PDF https://www.aclweb.org/anthology/W17-5901
PWC https://paperswithcode.com/paper/ntucle-developing-a-corpus-of-learner-english
Repo
Framework

Zara Returns: Improved Personality Induction and Adaptation by an Empathetic Virtual Agent

Title Zara Returns: Improved Personality Induction and Adaptation by an Empathetic Virtual Agent
Authors Farhad Bin Siddique, Onno Kampman, Yang Yang, Anik Dey, Pascale Fung
Abstract
Tasks Word Embeddings
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-4021/
PDF https://www.aclweb.org/anthology/P17-4021
PWC https://paperswithcode.com/paper/zara-returns-improved-personality-induction
Repo
Framework

Extracting Important Tweets for News Writers using Recurrent Neural Network with Attention Mechanism and Multi-task Learning

Title Extracting Important Tweets for News Writers using Recurrent Neural Network with Attention Mechanism and Multi-task Learning
Authors Taro Miyazaki, Shin Toriumi, Yuka Takei, Ichiro Yamada, Jun Goto
Abstract
Tasks Multi-Task Learning
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1048/
PDF https://www.aclweb.org/anthology/Y17-1048
PWC https://paperswithcode.com/paper/extracting-important-tweets-for-news-writers
Repo
Framework

Refining Word Embeddings for Sentiment Analysis

Title Refining Word Embeddings for Sentiment Analysis
Authors Liang-Chih Yu, Jin Wang, K. Robert Lai, Xuejie Zhang
Abstract Word embeddings that can capture semantic and syntactic information from contexts have been extensively used for various natural language processing tasks. However, existing methods for learning context-based word embeddings typically fail to capture sufficient sentiment information. This may result in words with similar vector representations having an opposite sentiment polarity (e.g., good and bad), thus degrading sentiment analysis performance. Therefore, this study proposes a word vector refinement model that can be applied to any pre-trained word vectors (e.g., Word2vec and GloVe). The refinement model is based on adjusting the vector representations of words such that they can be closer to both semantically and sentimentally similar words and further away from sentimentally dissimilar words. Experimental results show that the proposed method can improve conventional word embeddings and outperform previously proposed sentiment embeddings for both binary and fine-grained classification on Stanford Sentiment Treebank (SST).
Tasks Learning Word Embeddings, Sentiment Analysis, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1056/
PDF https://www.aclweb.org/anthology/D17-1056
PWC https://paperswithcode.com/paper/refining-word-embeddings-for-sentiment
Repo
Framework
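
The kind of refinement the abstract describes can be illustrated with a minimal sketch (assumed details: a simple iterative pull of each vector toward the mean of its sentiment-similar neighbors; this is not the paper's actual objective, and the neighbor lists are supplied externally).

```python
def refine(vectors, neighbors, alpha=0.1, iters=10):
    """Iteratively nudge each word vector toward the centroid of its
    sentimentally similar neighbors, leaving isolated words unchanged.

    vectors:   dict mapping word -> list of floats
    neighbors: dict mapping word -> list of sentiment-similar words
    alpha:     step size toward the neighbor centroid per iteration
    """
    vecs = {w: list(v) for w, v in vectors.items()}
    for _ in range(iters):
        new = {}
        for w, v in vecs.items():
            nbrs = neighbors.get(w, [])
            if not nbrs:
                new[w] = v  # no sentiment neighbors: keep as-is
                continue
            dim = len(v)
            mean = [sum(vecs[n][d] for n in nbrs) / len(nbrs)
                    for d in range(dim)]
            new[w] = [(1 - alpha) * v[d] + alpha * mean[d]
                      for d in range(dim)]
        vecs = new
    return vecs
```

After refinement, words that are sentimentally similar (e.g. *good* and *great*) end up closer together than in the pre-trained space, which is the intuition behind separating *good* from *bad* even when their context-based vectors are near each other.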

Part-of-Speech Tagging for Twitter with Adversarial Neural Networks

Title Part-of-Speech Tagging for Twitter with Adversarial Neural Networks
Authors Tao Gui, Qi Zhang, Haoran Huang, Minlong Peng, Xuanjing Huang
Abstract In this work, we study the problem of part-of-speech tagging for Tweets. In contrast to newswire articles, Tweets are usually informal and contain numerous out-of-vocabulary words. Moreover, there is a lack of large scale labeled datasets for this domain. To tackle these challenges, we propose a novel neural network to make use of out-of-domain labeled data, unlabeled in-domain data, and labeled in-domain data. Inspired by adversarial neural networks, the proposed method tries to learn common features through an adversarial discriminator. In addition, we hypothesize that domain-specific features of the target domain should be preserved to some degree. Hence, the proposed method adopts a sequence-to-sequence autoencoder to perform this task. Experimental results on three different datasets show that our method achieves better performance than state-of-the-art methods.
Tasks Part-Of-Speech Tagging, Stock Prediction
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1256/
PDF https://www.aclweb.org/anthology/D17-1256
PWC https://paperswithcode.com/paper/part-of-speech-tagging-for-twitter-with
Repo
Framework

LearningToQuestion at SemEval 2017 Task 3: Ranking Similar Questions by Learning to Rank Using Rich Features

Title LearningToQuestion at SemEval 2017 Task 3: Ranking Similar Questions by Learning to Rank Using Rich Features
Authors Naman Goyal
Abstract This paper describes our official entry LearningToQuestion for SemEval 2017 Task 3, community question answering, subtask B. The objective is to rerank questions obtained from a web forum according to their similarity to the original question. Our system uses pairwise learning-to-rank methods on a rich set of hand-designed and representation learning features. We use various semantic features that help our system achieve promising results on the task. The system achieved the second highest results on the official metric MAP and good results on other search metrics.
Tasks Information Retrieval, Learning-To-Rank, Question Answering, Representation Learning
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2050/
PDF https://www.aclweb.org/anthology/S17-2050
PWC https://paperswithcode.com/paper/learningtoquestion-at-semeval-2017-task-3
Repo
Framework
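
The pairwise learning-to-rank setup the abstract mentions can be sketched as follows (a toy ranking perceptron over hand-made feature vectors; the actual system uses much richer features and learners, so everything here is illustrative):

```python
def pairwise_examples(items):
    """items: list of (feature_vector, relevance).
    Emit (better, worse) feature pairs for every ordered pair where the
    first item is strictly more relevant than the second."""
    pairs = []
    for fa, ra in items:
        for fb, rb in items:
            if ra > rb:
                pairs.append((fa, fb))
    return pairs


def train_rank_perceptron(pairs, dim, epochs=20):
    """Learn a linear scorer w so that w.better > w.worse for each pair,
    with simple perceptron updates on violated pairs."""
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            margin = sum(wi * (b - c) for wi, b, c in zip(w, better, worse))
            if margin <= 0:  # pair mis-ordered: push w toward the better item
                for i in range(dim):
                    w[i] += better[i] - worse[i]
    return w
```

Ranking a candidate list then reduces to sorting by the learned linear score; the same pattern generalizes to any pairwise objective over richer features.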

Proceedings of the IJCNLP 2017, Tutorial Abstracts

Title Proceedings of the IJCNLP 2017, Tutorial Abstracts
Authors
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-5000/
PDF https://www.aclweb.org/anthology/I17-5000
PWC https://paperswithcode.com/paper/proceedings-of-the-ijcnlp-2017-tutorial
Repo
Framework

Lexical Simplification with Neural Ranking

Title Lexical Simplification with Neural Ranking
Authors Gustavo Paetzold, Lucia Specia
Abstract We present a new Lexical Simplification approach that exploits Neural Networks to learn substitutions from the Newsela corpus - a large set of professionally produced simplifications. We extract candidate substitutions by combining the Newsela corpus with a retrofitted context-aware word embeddings model and rank them using a new neural regression model that learns rankings from annotated data. This strategy leads to the highest Accuracy, Precision and F1 scores to date in standard datasets for the task.
Tasks Complex Word Identification, Information Retrieval, Lexical Simplification, Word Embeddings
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2006/
PDF https://www.aclweb.org/anthology/E17-2006
PWC https://paperswithcode.com/paper/lexical-simplification-with-neural-ranking
Repo
Framework

Towards a General, Continuous Model of Turn-taking in Spoken Dialogue using LSTM Recurrent Neural Networks

Title Towards a General, Continuous Model of Turn-taking in Spoken Dialogue using LSTM Recurrent Neural Networks
Authors Gabriel Skantze
Abstract Previous models of turn-taking have mostly been trained for specific turn-taking decisions, such as discriminating between turn shifts and turn retention in pauses. In this paper, we present a predictive, continuous model of turn-taking using Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNN). The model is trained on human-human dialogue data to predict upcoming speech activity in a future time window. We show how this general model can be applied to two different tasks that it was not specifically trained for. First, to predict whether a turn-shift will occur or not in pauses, where the model achieves better performance than human observers and outperforms more traditional models. Second, to make a prediction at speech onset whether the utterance will be a short backchannel or a longer utterance. Finally, we show how the hidden layer in the network can be used as a feature vector for turn-taking decisions in a human-robot interaction scenario.
Tasks Feature Engineering, Spoken Dialogue Systems
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-5527/
PDF https://www.aclweb.org/anthology/W17-5527
PWC https://paperswithcode.com/paper/towards-a-general-continuous-model-of-turn
Repo
Framework
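
How a continuous activity prediction turns into a discrete turn-shift decision can be caricatured with a tiny rule (purely illustrative; the actual model is an LSTM over dialogue features, and the function and threshold here are invented): aggregate each speaker's predicted future speech activity over the window and predict a shift when the other speaker's activity dominates.

```python
def turn_shift_decision(pred_current, pred_other, threshold=0.0):
    """Decide at a pause whether a turn shift is predicted.

    pred_current / pred_other: per-frame predicted speech-activity
    probabilities for the current speaker and the interlocutor over a
    future window. A shift is predicted when the interlocutor's mean
    predicted activity exceeds the current speaker's by `threshold`.
    """
    mean_current = sum(pred_current) / len(pred_current)
    mean_other = sum(pred_other) / len(pred_other)
    return mean_other - mean_current > threshold
```

The point of the general model is precisely that such task-specific decision rules can be layered on top of one shared predictor instead of training a separate classifier per decision.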

Linked Data for Language-Learning Applications

Title Linked Data for Language-Learning Applications
Authors Robyn Loughnane, Kate McCurdy, Peter Kolb, Stefan Selent
Abstract The use of linked data within language-learning applications is an open research question. A research prototype is presented that applies linked-data principles to store linguistic annotation generated from language-learning content using a variety of NLP tools. The result is a database that links learning content, linguistic annotation and open-source resources, on top of which a diverse range of tools for language-learning applications can be built.
Tasks Part-Of-Speech Tagging
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5005/
PDF https://www.aclweb.org/anthology/W17-5005
PWC https://paperswithcode.com/paper/linked-data-for-language-learning
Repo
Framework

Modeling intra-textual variation with entropy and surprisal: topical vs. stylistic patterns

Title Modeling intra-textual variation with entropy and surprisal: topical vs. stylistic patterns
Authors Stefania Degaetano-Ortlieb, Elke Teich
Abstract We present a data-driven approach to investigate intra-textual variation by combining entropy and surprisal. With this approach we detect linguistic variation based on phrasal lexico-grammatical patterns across sections of research articles. Entropy is used to detect patterns typical of specific sections. Surprisal is used to differentiate between more and less informationally-loaded patterns as well as type of information (topical vs. stylistic). While we here focus on research articles in biology/genetics, the methodology is especially interesting for digital humanities scholars, as it can be applied to any text type or domain and combined with additional variables (e.g. time, author or social group).
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2209/
PDF https://www.aclweb.org/anthology/W17-2209
PWC https://paperswithcode.com/paper/modeling-intra-textual-variation-with-entropy
Repo
Framework
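
The two quantities the abstract combines have standard information-theoretic definitions, sketched here with relative-frequency estimates (the paper estimates them over phrasal lexico-grammatical patterns; the helper names are ours):

```python
from collections import Counter
from math import log2


def surprisal(pattern, counts, total):
    """Surprisal -log2 p(pattern), with p estimated by relative frequency.
    High surprisal = informationally loaded; low = predictable/stylistic."""
    return -log2(counts[pattern] / total)


def entropy(counts):
    """Shannon entropy (bits) of the distribution over observed patterns,
    usable to characterize how varied a text section's patterns are."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())
```

With four equally frequent patterns, entropy is exactly 2 bits and each pattern's surprisal is 2 bits; skewed counts drive entropy down while rare patterns' surprisal goes up, which is the contrast the approach exploits across article sections.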

Towards Lexical Chains for Knowledge-Graph-based Word Embeddings

Title Towards Lexical Chains for Knowledge-Graph-based Word Embeddings
Authors Kiril Simov, Svetla Boytcheva, Petya Osenova
Abstract Word vectors with varying dimensionalities and produced by different algorithms have been extensively used in NLP. The corpora that the algorithms are trained on can contain either natural language text (e.g. Wikipedia or newswire articles) or artificially-generated pseudo corpora due to natural data sparseness. We exploit Lexical Chain-based templates over a Knowledge Graph for generating pseudo-corpora with controlled linguistic value. These corpora are then used for learning word embeddings. A number of experiments have been conducted over the following test sets: WordSim353 Similarity, WordSim353 Relatedness and SimLex-999. The results show that, on the one hand, the incorporation of many-relation lexical chains improves results, but on the other hand, unrestricted-length chains remain difficult to handle with respect to their huge quantity.
Tasks Language Modelling, Learning Word Embeddings, Word Embeddings, Word Sense Disambiguation
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1087/
PDF https://doi.org/10.26615/978-954-452-049-6_087
PWC https://paperswithcode.com/paper/towards-lexical-chains-for-knowledge-graph
Repo
Framework
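
A toy illustration of generating pseudo-corpus text by following a lexical chain over a knowledge graph (the graph encoding, chain policy, and length cap here are invented for illustration; the paper uses templates over a real knowledge graph):

```python
def pseudo_corpus(graph, start, max_len):
    """Generate one pseudo-sentence by walking relation edges from a
    start concept, emitting 'node relation node' triples along the chain.

    graph:   dict mapping concept -> list of (relation, target) edges
    start:   concept to begin the chain from
    max_len: cap on emitted tokens, bounding the chain length
    """
    sentence, node, seen = [], start, {start}
    while len(sentence) < max_len:
        edges = [(r, t) for (r, t) in graph.get(node, []) if t not in seen]
        if not edges:
            break  # chain exhausted: no unvisited neighbor
        rel, nxt = edges[0]
        sentence.extend([node, rel, nxt])
        seen.add(nxt)
        node = nxt
    return " ".join(sentence)
```

Capping `max_len` is the toy counterpart of the chain-length restriction the abstract discusses: unrestricted chains explode combinatorially, so practical generation bounds either chain length or the relation set.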