July 26, 2019

1847 words 9 mins read

Paper Group NANR 89

Subjecthood and Grammatical Relations in Korean: An Experimental Study with Honorific Agreement and Plural Copying. Wh-island Effects in Korean Scrambling Constructions. Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling. NTUCLE: Developing a Corpus of Learner English to Provide Writing Support for Engineering Students. Zara …

Subjecthood and Grammatical Relations in Korean: An Experimental Study with Honorific Agreement and Plural Copying

Title Subjecthood and Grammatical Relations in Korean: An Experimental Study with Honorific Agreement and Plural Copying
Authors Ji-Hye Kim, Yong-Hun Lee, James Hye-Suk Yoon
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1029/
PDF https://www.aclweb.org/anthology/Y17-1029
PWC https://paperswithcode.com/paper/subjecthood-and-grammatical-relations-in
Repo
Framework

Wh-island Effects in Korean Scrambling Constructions

Title Wh-island Effects in Korean Scrambling Constructions
Authors Juyeon Cho
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1044/
PDF https://www.aclweb.org/anthology/Y17-1044
PWC https://paperswithcode.com/paper/wh-island-effects-in-korean-scrambling
Repo
Framework

Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling

Title Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling
Authors Gábor Berend
Abstract In this paper we propose and carefully evaluate a sequence labeling framework which solely utilizes sparse indicator features derived from dense distributed word representations. The proposed model obtains (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition for a variety of languages. Our model relies only on a few thousand sparse coding-derived features, without applying any modification of the word representations employed for the different tasks. The proposed model has favorable generalization properties as it retains over 89.8% of its average POS tagging accuracy when trained on 1.2% of the total available training data, i.e. 150 sentences per language.
Tasks Feature Engineering, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings
Published 2017-01-01
URL https://www.aclweb.org/anthology/Q17-1018/
PDF https://www.aclweb.org/anthology/Q17-1018
PWC https://paperswithcode.com/paper/sparse-coding-of-neural-word-embeddings-for-1
Repo
Framework
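
The indicator-feature scheme the abstract describes can be sketched in a few lines (this is not the authors' code; the function name and the `D{i}` feature labels are hypothetical). Each word's sparse coefficient vector, obtained from a dictionary-learning step, is mapped to symbolic features that record which dictionary atoms fire and with what sign, discarding magnitudes.

```python
def sparse_indicator_features(code, prefix="D"):
    """Turn a sparse coefficient vector into symbolic indicator features.

    Each nonzero coordinate contributes one feature naming the dictionary
    atom and the coefficient's sign; magnitudes are deliberately dropped,
    so downstream sequence labelers see purely binary indicators.
    """
    feats = []
    for i, c in enumerate(code):
        if c > 0:
            feats.append(f"{prefix}{i}+")
        elif c < 0:
            feats.append(f"{prefix}{i}-")
    return feats
```

For example, `sparse_indicator_features([0.0, 0.7, 0.0, -0.2])` yields `["D1+", "D3-"]`: only two of the few thousand possible indicators are active for this word, which is what keeps the feature set sparse.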

NTUCLE: Developing a Corpus of Learner English to Provide Writing Support for Engineering Students

Title NTUCLE: Developing a Corpus of Learner English to Provide Writing Support for Engineering Students
Authors Roger Vivek Placidus Winder, Joseph MacKinnon, Shu Yun Li, Benedict Christopher Tzer Liang Lin, Carmel Lee Hah Heah, Luís Morgado da Costa, Takayuki Kuribayashi, Francis Bond
Abstract This paper describes the creation of a new annotated learner corpus. The aim is to use this corpus to develop an automated system for corrective feedback on students' writing. With this system, students will be able to receive timely feedback on language errors before they submit their assignments for grading. A corpus of assignments submitted by first year engineering students was compiled, and a new error tag set for the NTU Corpus of Learner English (NTUCLE) was developed based on that of the NUS Corpus of Learner English (NUCLE), as well as marking rubrics used at NTU. After a description of the corpus, error tag set and annotation process, the paper presents the results of the annotation exercise as well as follow-up actions. The final error tag set, which is significantly larger than that for the NUCLE error categories, is then presented before a brief conclusion summarising our experience and future plans.
Tasks
Published 2017-12-01
URL https://www.aclweb.org/anthology/W17-5901/
PDF https://www.aclweb.org/anthology/W17-5901
PWC https://paperswithcode.com/paper/ntucle-developing-a-corpus-of-learner-english
Repo
Framework

Zara Returns: Improved Personality Induction and Adaptation by an Empathetic Virtual Agent

Title Zara Returns: Improved Personality Induction and Adaptation by an Empathetic Virtual Agent
Authors Farhad Bin Siddique, Onno Kampman, Yang Yang, Anik Dey, Pascale Fung
Abstract
Tasks Word Embeddings
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-4021/
PDF https://www.aclweb.org/anthology/P17-4021
PWC https://paperswithcode.com/paper/zara-returns-improved-personality-induction
Repo
Framework

Extracting Important Tweets for News Writers using Recurrent Neural Network with Attention Mechanism and Multi-task Learning

Title Extracting Important Tweets for News Writers using Recurrent Neural Network with Attention Mechanism and Multi-task Learning
Authors Taro Miyazaki, Shin Toriumi, Yuka Takei, Ichiro Yamada, Jun Goto
Abstract
Tasks Multi-Task Learning
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1048/
PDF https://www.aclweb.org/anthology/Y17-1048
PWC https://paperswithcode.com/paper/extracting-important-tweets-for-news-writers
Repo
Framework

Refining Word Embeddings for Sentiment Analysis

Title Refining Word Embeddings for Sentiment Analysis
Authors Liang-Chih Yu, Jin Wang, K. Robert Lai, Xuejie Zhang
Abstract Word embeddings that can capture semantic and syntactic information from contexts have been extensively used for various natural language processing tasks. However, existing methods for learning context-based word embeddings typically fail to capture sufficient sentiment information. This may result in words with similar vector representations having an opposite sentiment polarity (e.g., good and bad), thus degrading sentiment analysis performance. Therefore, this study proposes a word vector refinement model that can be applied to any pre-trained word vectors (e.g., Word2vec and GloVe). The refinement model is based on adjusting the vector representations of words such that they can be closer to both semantically and sentimentally similar words and further away from sentimentally dissimilar words. Experimental results show that the proposed method can improve conventional word embeddings and outperform previously proposed sentiment embeddings for both binary and fine-grained classification on Stanford Sentiment Treebank (SST).
Tasks Learning Word Embeddings, Sentiment Analysis, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1056/
PDF https://www.aclweb.org/anthology/D17-1056
PWC https://paperswithcode.com/paper/refining-word-embeddings-for-sentiment
Repo
Framework
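
The kind of refinement the abstract describes can be illustrated with a minimal sketch (assumed details: a simple iterative pull of each vector toward the mean of its sentiment-similar neighbors; this is not the paper's actual objective, and the neighbor lists are supplied externally).

```python
def refine(vectors, neighbors, alpha=0.1, iters=10):
    """Iteratively nudge each word vector toward the centroid of its
    sentimentally similar neighbors, leaving isolated words unchanged.

    vectors:   dict mapping word -> list of floats
    neighbors: dict mapping word -> list of sentiment-similar words
    alpha:     step size toward the neighbor centroid per iteration
    """
    vecs = {w: list(v) for w, v in vectors.items()}
    for _ in range(iters):
        new = {}
        for w, v in vecs.items():
            nbrs = neighbors.get(w, [])
            if not nbrs:
                new[w] = v  # no sentiment neighbors: keep as-is
                continue
            dim = len(v)
            mean = [sum(vecs[n][d] for n in nbrs) / len(nbrs)
                    for d in range(dim)]
            new[w] = [(1 - alpha) * v[d] + alpha * mean[d]
                      for d in range(dim)]
        vecs = new
    return vecs
```

After refinement, words that are sentimentally similar (e.g. *good* and *great*) end up closer together than in the pre-trained space, which is the intuition behind separating *good* from *bad* even when their context-based vectors are near each other.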

Part-of-Speech Tagging for Twitter with Adversarial Neural Networks

Title Part-of-Speech Tagging for Twitter with Adversarial Neural Networks
Authors Tao Gui, Qi Zhang, Haoran Huang, Minlong Peng, Xuanjing Huang
Abstract In this work, we study the problem of part-of-speech tagging for Tweets. In contrast to newswire articles, Tweets are usually informal and contain numerous out-of-vocabulary words. Moreover, there is a lack of large scale labeled datasets for this domain. To tackle these challenges, we propose a novel neural network to make use of out-of-domain labeled data, unlabeled in-domain data, and labeled in-domain data. Inspired by adversarial neural networks, the proposed method tries to learn common features through an adversarial discriminator. In addition, we hypothesize that domain-specific features of the target domain should be preserved to some degree. Hence, the proposed method adopts a sequence-to-sequence autoencoder to perform this task. Experimental results on three different datasets show that our method achieves better performance than state-of-the-art methods.
Tasks Part-Of-Speech Tagging, Stock Prediction
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1256/
PDF https://www.aclweb.org/anthology/D17-1256
PWC https://paperswithcode.com/paper/part-of-speech-tagging-for-twitter-with
Repo
Framework

LearningToQuestion at SemEval 2017 Task 3: Ranking Similar Questions by Learning to Rank Using Rich Features

Title LearningToQuestion at SemEval 2017 Task 3: Ranking Similar Questions by Learning to Rank Using Rich Features
Authors Naman Goyal
Abstract This paper describes our official entry LearningToQuestion for SemEval 2017 Task 3, community question answering, subtask B. The objective is to rerank questions obtained from a web forum according to their similarity to the original question. Our system uses pairwise learning-to-rank methods on a rich set of hand-designed and representation learning features. We use various semantic features that help our system achieve promising results on the task. The system achieved the second highest results on the official metric MAP and good results on other search metrics.
Tasks Information Retrieval, Learning-To-Rank, Question Answering, Representation Learning
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2050/
PDF https://www.aclweb.org/anthology/S17-2050
PWC https://paperswithcode.com/paper/learningtoquestion-at-semeval-2017-task-3
Repo
Framework
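
The pairwise learning-to-rank setup the abstract mentions can be sketched as follows (a toy ranking perceptron over hand-made feature vectors; the actual system uses much richer features and learners, so everything here is illustrative):

```python
def pairwise_examples(items):
    """items: list of (feature_vector, relevance).
    Emit (better, worse) feature pairs for every ordered pair where the
    first item is strictly more relevant than the second."""
    pairs = []
    for fa, ra in items:
        for fb, rb in items:
            if ra > rb:
                pairs.append((fa, fb))
    return pairs


def train_rank_perceptron(pairs, dim, epochs=20):
    """Learn a linear scorer w so that w.better > w.worse for each pair,
    with simple perceptron updates on violated pairs."""
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            margin = sum(wi * (b - c) for wi, b, c in zip(w, better, worse))
            if margin <= 0:  # pair mis-ordered: push w toward the better item
                for i in range(dim):
                    w[i] += better[i] - worse[i]
    return w
```

Ranking a candidate list then reduces to sorting by the learned linear score; the same pattern generalizes to any pairwise objective over richer features.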

Proceedings of the IJCNLP 2017, Tutorial Abstracts

Title Proceedings of the IJCNLP 2017, Tutorial Abstracts
Authors
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-5000/
PDF https://www.aclweb.org/anthology/I17-5000
PWC https://paperswithcode.com/paper/proceedings-of-the-ijcnlp-2017-tutorial
Repo
Framework

Lexical Simplification with Neural Ranking

Title Lexical Simplification with Neural Ranking
Authors Gustavo Paetzold, Lucia Specia
Abstract We present a new Lexical Simplification approach that exploits Neural Networks to learn substitutions from the Newsela corpus - a large set of professionally produced simplifications. We extract candidate substitutions by combining the Newsela corpus with a retrofitted context-aware word embeddings model and rank them using a new neural regression model that learns rankings from annotated data. This strategy leads to the highest Accuracy, Precision and F1 scores to date in standard datasets for the task.
Tasks Complex Word Identification, Information Retrieval, Lexical Simplification, Word Embeddings
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2006/
PDF https://www.aclweb.org/anthology/E17-2006
PWC https://paperswithcode.com/paper/lexical-simplification-with-neural-ranking
Repo
Framework

Towards a General, Continuous Model of Turn-taking in Spoken Dialogue using LSTM Recurrent Neural Networks

Title Towards a General, Continuous Model of Turn-taking in Spoken Dialogue using LSTM Recurrent Neural Networks
Authors Gabriel Skantze
Abstract Previous models of turn-taking have mostly been trained for specific turn-taking decisions, such as discriminating between turn shifts and turn retention in pauses. In this paper, we present a predictive, continuous model of turn-taking using Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNN). The model is trained on human-human dialogue data to predict upcoming speech activity in a future time window. We show how this general model can be applied to two different tasks that it was not specifically trained for. First, to predict whether a turn-shift will occur or not in pauses, where the model achieves better performance than human observers and outperforms more traditional models. Second, to make a prediction at speech onset whether the utterance will be a short backchannel or a longer utterance. Finally, we show how the hidden layer in the network can be used as a feature vector for turn-taking decisions in a human-robot interaction scenario.
Tasks Feature Engineering, Spoken Dialogue Systems
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-5527/
PDF https://www.aclweb.org/anthology/W17-5527
PWC https://paperswithcode.com/paper/towards-a-general-continuous-model-of-turn
Repo
Framework
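
How a continuous activity prediction turns into a discrete turn-shift decision can be caricatured with a tiny rule (purely illustrative; the actual model is an LSTM over dialogue features, and the function and threshold here are invented): aggregate each speaker's predicted future speech activity over the window and predict a shift when the other speaker's activity dominates.

```python
def turn_shift_decision(pred_current, pred_other, threshold=0.0):
    """Decide at a pause whether a turn shift is predicted.

    pred_current / pred_other: per-frame predicted speech-activity
    probabilities for the current speaker and the interlocutor over a
    future window. A shift is predicted when the interlocutor's mean
    predicted activity exceeds the current speaker's by `threshold`.
    """
    mean_current = sum(pred_current) / len(pred_current)
    mean_other = sum(pred_other) / len(pred_other)
    return mean_other - mean_current > threshold
```

The point of the general model is precisely that such task-specific decision rules can be layered on top of one shared predictor instead of training a separate classifier per decision.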

Linked Data for Language-Learning Applications

Title Linked Data for Language-Learning Applications
Authors Robyn Loughnane, Kate McCurdy, Peter Kolb, Stefan Selent
Abstract The use of linked data within language-learning applications is an open research question. A research prototype is presented that applies linked-data principles to store linguistic annotation generated from language-learning content using a variety of NLP tools. The result is a database that links learning content, linguistic annotation and open-source resources, on top of which a diverse range of tools for language-learning applications can be built.
Tasks Part-Of-Speech Tagging
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5005/
PDF https://www.aclweb.org/anthology/W17-5005
PWC https://paperswithcode.com/paper/linked-data-for-language-learning
Repo
Framework

Modeling intra-textual variation with entropy and surprisal: topical vs. stylistic patterns

Title Modeling intra-textual variation with entropy and surprisal: topical vs. stylistic patterns
Authors Stefania Degaetano-Ortlieb, Elke Teich
Abstract We present a data-driven approach to investigate intra-textual variation by combining entropy and surprisal. With this approach we detect linguistic variation based on phrasal lexico-grammatical patterns across sections of research articles. Entropy is used to detect patterns typical of specific sections. Surprisal is used to differentiate between more and less informationally-loaded patterns as well as type of information (topical vs. stylistic). While we here focus on research articles in biology/genetics, the methodology is especially interesting for digital humanities scholars, as it can be applied to any text type or domain and combined with additional variables (e.g. time, author or social group).
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2209/
PDF https://www.aclweb.org/anthology/W17-2209
PWC https://paperswithcode.com/paper/modeling-intra-textual-variation-with-entropy
Repo
Framework
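
The two quantities the abstract combines have standard information-theoretic definitions, sketched here with relative-frequency estimates (the paper estimates them over phrasal lexico-grammatical patterns; the helper names are ours):

```python
from collections import Counter
from math import log2


def surprisal(pattern, counts, total):
    """Surprisal -log2 p(pattern), with p estimated by relative frequency.
    High surprisal = informationally loaded; low = predictable/stylistic."""
    return -log2(counts[pattern] / total)


def entropy(counts):
    """Shannon entropy (bits) of the distribution over observed patterns,
    usable to characterize how varied a text section's patterns are."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())
```

With four equally frequent patterns, entropy is exactly 2 bits and each pattern's surprisal is 2 bits; skewed counts drive entropy down while rare patterns' surprisal goes up, which is the contrast the approach exploits across article sections.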

Towards Lexical Chains for Knowledge-Graph-based Word Embeddings

Title Towards Lexical Chains for Knowledge-Graph-based Word Embeddings
Authors Kiril Simov, Svetla Boytcheva, Petya Osenova
Abstract Word vectors with varying dimensionalities and produced by different algorithms have been extensively used in NLP. The corpora that the algorithms are trained on can contain either natural language text (e.g. Wikipedia or newswire articles) or artificially-generated pseudo corpora due to natural data sparseness. We exploit Lexical Chain-based templates over a Knowledge Graph for generating pseudo-corpora with controlled linguistic value. These corpora are then used for learning word embeddings. A number of experiments have been conducted over the following test sets: WordSim353 Similarity, WordSim353 Relatedness and SimLex-999. The results show that, on the one hand, the incorporation of many-relation lexical chains improves results, but on the other hand, unrestricted-length chains remain difficult to handle with respect to their huge quantity.
Tasks Language Modelling, Learning Word Embeddings, Word Embeddings, Word Sense Disambiguation
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1087/
PDF https://doi.org/10.26615/978-954-452-049-6_087
PWC https://paperswithcode.com/paper/towards-lexical-chains-for-knowledge-graph
Repo
Framework
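
A toy illustration of generating pseudo-corpus text by following a lexical chain over a knowledge graph (the graph encoding, chain policy, and length cap here are invented for illustration; the paper uses templates over a real knowledge graph):

```python
def pseudo_corpus(graph, start, max_len):
    """Generate one pseudo-sentence by walking relation edges from a
    start concept, emitting 'node relation node' triples along the chain.

    graph:   dict mapping concept -> list of (relation, target) edges
    start:   concept to begin the chain from
    max_len: cap on emitted tokens, bounding the chain length
    """
    sentence, node, seen = [], start, {start}
    while len(sentence) < max_len:
        edges = [(r, t) for (r, t) in graph.get(node, []) if t not in seen]
        if not edges:
            break  # chain exhausted: no unvisited neighbor
        rel, nxt = edges[0]
        sentence.extend([node, rel, nxt])
        seen.add(nxt)
        node = nxt
    return " ".join(sentence)
```

Capping `max_len` is the toy counterpart of the chain-length restriction the abstract discusses: unrestricted chains explode combinatorially, so practical generation bounds either chain length or the relation set.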