Paper Group NANR 89
Subjecthood and Grammatical Relations in Korean: An Experimental Study with Honorific Agreement and Plural Copying
| Title | Subjecthood and Grammatical Relations in Korean: An Experimental Study with Honorific Agreement and Plural Copying |
| Authors | Ji-Hye Kim, Yong-Hun Lee, James Hye-Suk Yoon |
| Abstract | |
| Tasks | |
| Published | 2017-11-01 |
| URL | https://www.aclweb.org/anthology/Y17-1029/ |
| PWC | https://paperswithcode.com/paper/subjecthood-and-grammatical-relations-in |
| Repo | |
| Framework | |
Wh-island Effects in Korean Scrambling Constructions
| Title | Wh-island Effects in Korean Scrambling Constructions |
| Authors | Juyeon Cho |
| Abstract | |
| Tasks | |
| Published | 2017-11-01 |
| URL | https://www.aclweb.org/anthology/Y17-1044/ |
| PWC | https://paperswithcode.com/paper/wh-island-effects-in-korean-scrambling |
| Repo | |
| Framework | |
Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling
| Title | Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling |
| Authors | Gábor Berend |
| Abstract | In this paper we propose and carefully evaluate a sequence labeling framework which solely utilizes sparse indicator features derived from dense distributed word representations. The proposed model obtains (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition for a variety of languages. Our model relies only on a few thousand sparse coding-derived features, without applying any modification of the word representations employed for the different tasks. The proposed model has favorable generalization properties as it retains over 89.8% of its average POS tagging accuracy when trained at 1.2% of the total available training data, i.e. 150 sentences per language. |
| Tasks | Feature Engineering, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings |
| Published | 2017-01-01 |
| URL | https://www.aclweb.org/anthology/Q17-1018/ |
| PWC | https://paperswithcode.com/paper/sparse-coding-of-neural-word-embeddings-for-1 |
| Repo | |
| Framework | |
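The pipeline this abstract describes, sparse codes derived from dense embeddings with the indices of active atoms used as indicator features, can be approximated with off-the-shelf dictionary learning. The following is a minimal sketch, not the author's implementation; the random matrix stands in for real pre-trained vectors, and all dimensions are illustrative.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((500, 50))   # 500 "words", 50-dim dense vectors

# Learn an overcomplete dictionary and a sparse code for every word.
dl = DictionaryLearning(n_components=100, transform_algorithm="lasso_lars",
                        transform_alpha=0.1, random_state=0)
codes = dl.fit_transform(embeddings)          # shape (500, 100), mostly zeros

def indicator_features(word_idx):
    """The indices of active dictionary atoms act as sparse binary features."""
    return np.flatnonzero(codes[word_idx])

print(indicator_features(0))                  # a handful of atom ids per word
```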
NTUCLE: Developing a Corpus of Learner English to Provide Writing Support for Engineering Students
| Title | NTUCLE: Developing a Corpus of Learner English to Provide Writing Support for Engineering Students |
| Authors | Roger Vivek Placidus Winder, Joseph MacKinnon, Shu Yun Li, Benedict Christopher Tzer Liang Lin, Carmel Lee Hah Heah, Luís Morgado da Costa, Takayuki Kuribayashi, Francis Bond |
| Abstract | This paper describes the creation of a new annotated learner corpus. The aim is to use this corpus to develop an automated system for corrective feedback on students' writing. With this system, students will be able to receive timely feedback on language errors before they submit their assignments for grading. A corpus of assignments submitted by first year engineering students was compiled, and a new error tag set for the NTU Corpus of Learner English (NTUCLE) was developed based on that of the NUS Corpus of Learner English (NUCLE), as well as marking rubrics used at NTU. After a description of the corpus, error tag set and annotation process, the paper presents the results of the annotation exercise as well as follow-up actions. The final error tag set, which is significantly larger than the set of NUCLE error categories, is then presented before a brief conclusion summarising our experience and future plans. |
| Tasks | |
| Published | 2017-12-01 |
| URL | https://www.aclweb.org/anthology/W17-5901/ |
| PWC | https://paperswithcode.com/paper/ntucle-developing-a-corpus-of-learner-english |
| Repo | |
| Framework | |
Zara Returns: Improved Personality Induction and Adaptation by an Empathetic Virtual Agent
| Title | Zara Returns: Improved Personality Induction and Adaptation by an Empathetic Virtual Agent |
| Authors | Farhad Bin Siddique, Onno Kampman, Yang Yang, Anik Dey, Pascale Fung |
| Abstract | |
| Tasks | Word Embeddings |
| Published | 2017-07-01 |
| URL | https://www.aclweb.org/anthology/P17-4021/ |
| PWC | https://paperswithcode.com/paper/zara-returns-improved-personality-induction |
| Repo | |
| Framework | |
Extracting Important Tweets for News Writers using Recurrent Neural Network with Attention Mechanism and Multi-task Learning
| Title | Extracting Important Tweets for News Writers using Recurrent Neural Network with Attention Mechanism and Multi-task Learning |
| Authors | Taro Miyazaki, Shin Toriumi, Yuka Takei, Ichiro Yamada, Jun Goto |
| Abstract | |
| Tasks | Multi-Task Learning |
| Published | 2017-11-01 |
| URL | https://www.aclweb.org/anthology/Y17-1048/ |
| PWC | https://paperswithcode.com/paper/extracting-important-tweets-for-news-writers |
| Repo | |
| Framework | |
Refining Word Embeddings for Sentiment Analysis
| Title | Refining Word Embeddings for Sentiment Analysis |
| Authors | Liang-Chih Yu, Jin Wang, K. Robert Lai, Xuejie Zhang |
| Abstract | Word embeddings that can capture semantic and syntactic information from contexts have been extensively used for various natural language processing tasks. However, existing methods for learning context-based word embeddings typically fail to capture sufficient sentiment information. This may result in words with similar vector representations having an opposite sentiment polarity (e.g., good and bad), thus degrading sentiment analysis performance. Therefore, this study proposes a word vector refinement model that can be applied to any pre-trained word vectors (e.g., Word2vec and GloVe). The refinement model is based on adjusting the vector representations of words such that they can be closer to both semantically and sentimentally similar words and further away from sentimentally dissimilar words. Experimental results show that the proposed method can improve conventional word embeddings and outperform previously proposed sentiment embeddings for both binary and fine-grained classification on Stanford Sentiment Treebank (SST). |
| Tasks | Learning Word Embeddings, Sentiment Analysis, Word Embeddings |
| Published | 2017-09-01 |
| URL | https://www.aclweb.org/anthology/D17-1056/ |
| PWC | https://paperswithcode.com/paper/refining-word-embeddings-for-sentiment |
| Repo | |
| Framework | |
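The refinement idea in the abstract above can be pictured as an iterative update that pulls each vector toward a rank-weighted average of its sentimentally closest neighbours. Everything in the sketch below (the toy vectors, the sentiment lexicon, the 1/rank weighting) is an illustrative assumption, not the paper's exact objective.

```python
import numpy as np

vecs = {"good": np.array([1.0, 0.0]), "great": np.array([0.9, 0.2]),
        "fine": np.array([0.8, 0.1]), "bad": np.array([0.95, -0.1])}
sentiment = {"good": 0.8, "great": 0.9, "fine": 0.6, "bad": -0.7}

def refine(word, k=2, alpha=0.1, iters=10):
    # Rank neighbours by how close their sentiment score is to the target's.
    ranked = sorted((w for w in vecs if w != word),
                    key=lambda w: abs(sentiment[w] - sentiment[word]))[:k]
    weights = np.array([1.0 / (r + 1) for r in range(len(ranked))])
    weights /= weights.sum()
    v = vecs[word].copy()
    for _ in range(iters):
        target = sum(wt * vecs[n] for wt, n in zip(weights, ranked))
        v = (1 - alpha) * v + alpha * target  # drift toward similar-sentiment words
    return v

print(refine("good"))  # nudged toward "great"/"fine", away from "bad"
```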
Part-of-Speech Tagging for Twitter with Adversarial Neural Networks
| Title | Part-of-Speech Tagging for Twitter with Adversarial Neural Networks |
| Authors | Tao Gui, Qi Zhang, Haoran Huang, Minlong Peng, Xuanjing Huang |
| Abstract | In this work, we study the problem of part-of-speech tagging for Tweets. In contrast to newswire articles, Tweets are usually informal and contain numerous out-of-vocabulary words. Moreover, there is a lack of large scale labeled datasets for this domain. To tackle these challenges, we propose a novel neural network to make use of out-of-domain labeled data, unlabeled in-domain data, and labeled in-domain data. Inspired by adversarial neural networks, the proposed method tries to learn common features through an adversarial discriminator. In addition, we hypothesize that domain-specific features of the target domain should be preserved to some degree. Hence, the proposed method adopts a sequence-to-sequence autoencoder to perform this task. Experimental results on three different datasets show that our method achieves better performance than state-of-the-art methods. |
| Tasks | Part-Of-Speech Tagging, Stock Prediction |
| Published | 2017-09-01 |
| URL | https://www.aclweb.org/anthology/D17-1256/ |
| PWC | https://paperswithcode.com/paper/part-of-speech-tagging-for-twitter-with |
| Repo | |
| Framework | |
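The adversarial component described above boils down to a gradient-reversal step between a shared encoder and a domain discriminator. Here is a minimal PyTorch sketch under that assumption; the tiny GRU encoder and linear discriminator are stand-ins, not the paper's architecture (which additionally uses a sequence-to-sequence autoencoder to preserve target-domain features).

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        # Flipping the gradient makes the shared encoder *fool* the
        # discriminator, encouraging domain-invariant features.
        return -grad

shared = nn.GRU(50, 64, batch_first=True)   # shared feature extractor
domain_clf = nn.Linear(64, 2)               # newswire vs. Twitter discriminator

x = torch.randn(8, 20, 50)                  # a batch of 20-token sentences
feats, _ = shared(x)
domain_logits = domain_clf(GradReverse.apply(feats.mean(dim=1)))
loss = nn.functional.cross_entropy(domain_logits, torch.randint(0, 2, (8,)))
loss.backward()                             # encoder receives reversed gradients
```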
LearningToQuestion at SemEval 2017 Task 3: Ranking Similar Questions by Learning to Rank Using Rich Features
| Title | LearningToQuestion at SemEval 2017 Task 3: Ranking Similar Questions by Learning to Rank Using Rich Features |
| Authors | Naman Goyal |
| Abstract | This paper describes our official entry LearningToQuestion for SemEval 2017 Task 3 (Community Question Answering), Subtask B. The objective is to rerank questions obtained from a web forum according to their similarity to the original question. Our system applies pairwise learning-to-rank methods to a rich set of hand-designed and representation-learning features. We use various semantic features that help our system achieve promising results on the task. The system achieved the second-highest results on the official MAP metric and good results on other search metrics. |
| Tasks | Information Retrieval, Learning-To-Rank, Question Answering, Representation Learning |
| Published | 2017-08-01 |
| URL | https://www.aclweb.org/anthology/S17-2050/ |
| PWC | https://paperswithcode.com/paper/learningtoquestion-at-semeval-2017-task-3 |
| Repo | |
| Framework | |
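Pairwise learning to rank over hand-built features reduces to binary classification on feature-vector differences (the classic RankSVM transform). The sketch below assumes toy random features; it is not the system's actual feature set or model.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 5))       # one feature row per candidate question
rel = rng.integers(0, 2, size=30)      # 1 = similar to the original question

# Pairwise transform: difference vectors for each (relevant, irrelevant) pair.
pairs, labels = [], []
for i in np.flatnonzero(rel == 1):
    for j in np.flatnonzero(rel == 0):
        pairs.append(X[i] - X[j]); labels.append(1)
        pairs.append(X[j] - X[i]); labels.append(0)

model = LinearSVC().fit(np.array(pairs), np.array(labels))
scores = X @ model.coef_.ravel()       # higher score = ranked earlier
print(np.argsort(-scores)[:5])         # top-5 candidates after reranking
```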
Proceedings of the IJCNLP 2017, Tutorial Abstracts
| Title | Proceedings of the IJCNLP 2017, Tutorial Abstracts |
| Authors | |
| Abstract | |
| Tasks | |
| Published | 2017-11-01 |
| URL | https://www.aclweb.org/anthology/I17-5000/ |
| PWC | https://paperswithcode.com/paper/proceedings-of-the-ijcnlp-2017-tutorial |
| Repo | |
| Framework | |
Lexical Simplification with Neural Ranking
| Title | Lexical Simplification with Neural Ranking |
| Authors | Gustavo Paetzold, Lucia Specia |
| Abstract | We present a new Lexical Simplification approach that exploits Neural Networks to learn substitutions from the Newsela corpus - a large set of professionally produced simplifications. We extract candidate substitutions by combining the Newsela corpus with a retrofitted context-aware word embeddings model and rank them using a new neural regression model that learns rankings from annotated data. This strategy leads to the highest Accuracy, Precision and F1 scores to date in standard datasets for the task. |
| Tasks | Complex Word Identification, Information Retrieval, Lexical Simplification, Word Embeddings |
| Published | 2017-04-01 |
| URL | https://www.aclweb.org/anthology/E17-2006/ |
| PWC | https://paperswithcode.com/paper/lexical-simplification-with-neural-ranking |
| Repo | |
| Framework | |
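The ranking step described above can be pictured as a small regression network that scores each candidate substitution from its features and sorts by score. In this sketch, random features stand in for the paper's embedding- and context-derived ones, so it only illustrates the scoring-and-sorting mechanics.

```python
import torch
from torch import nn

candidates = ["big", "large", "sizable"]   # substitutions for a complex word
feats = torch.randn(len(candidates), 10)   # stand-in feature rows

ranker = nn.Sequential(nn.Linear(10, 8), nn.ReLU(), nn.Linear(8, 1))
scores = ranker(feats).squeeze(1)          # one simplicity score per candidate
order = scores.argsort(descending=True)
print([candidates[i] for i in order])      # highest-scoring candidate first
```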
Towards a General, Continuous Model of Turn-taking in Spoken Dialogue using LSTM Recurrent Neural Networks
| Title | Towards a General, Continuous Model of Turn-taking in Spoken Dialogue using LSTM Recurrent Neural Networks |
| Authors | Gabriel Skantze |
| Abstract | Previous models of turn-taking have mostly been trained for specific turn-taking decisions, such as discriminating between turn shifts and turn retention in pauses. In this paper, we present a predictive, continuous model of turn-taking using Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNN). The model is trained on human-human dialogue data to predict upcoming speech activity in a future time window. We show how this general model can be applied to two different tasks that it was not specifically trained for. First, to predict whether a turn shift will occur or not in pauses, where the model achieves better performance than human observers and than more traditional models. Second, to predict at speech onset whether the utterance will be a short backchannel or a longer utterance. Finally, we show how the hidden layer in the network can be used as a feature vector for turn-taking decisions in a human-robot interaction scenario. |
| Tasks | Feature Engineering, Spoken Dialogue Systems |
| Published | 2017-08-01 |
| URL | https://www.aclweb.org/anthology/W17-5527/ |
| PWC | https://paperswithcode.com/paper/towards-a-general-continuous-model-of-turn |
| Repo | |
| Framework | |
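The continuous model in the abstract maps frame-level dialogue features to a prediction of speech activity over a future window at every time step. A minimal PyTorch sketch with assumed toy dimensions (not Skantze's exact feature set or horizon):

```python
import torch
from torch import nn

FRAME_FEATS, FUTURE_BINS = 40, 20   # assumed: 40 acoustic features, 20-frame horizon

class TurnTakingLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(FRAME_FEATS, 64, batch_first=True)
        self.head = nn.Linear(64, FUTURE_BINS)

    def forward(self, frames):
        out, _ = self.lstm(frames)
        # At every frame, predict the probability of speech activity in
        # each of the next FUTURE_BINS frames.
        return torch.sigmoid(self.head(out))

x = torch.randn(4, 100, FRAME_FEATS)   # 4 dialogues, 100 frames each
probs = TurnTakingLSTM()(x)            # shape: (4, 100, FUTURE_BINS)
print(probs.shape)
```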
Linked Data for Language-Learning Applications
| Title | Linked Data for Language-Learning Applications |
| Authors | Robyn Loughnane, Kate McCurdy, Peter Kolb, Stefan Selent |
| Abstract | The use of linked data within language-learning applications is an open research question. A research prototype is presented that applies linked-data principles to store linguistic annotation generated from language-learning content using a variety of NLP tools. The result is a database that links learning content, linguistic annotation and open-source resources, on top of which a diverse range of tools for language-learning applications can be built. |
| Tasks | Part-Of-Speech Tagging |
| Published | 2017-09-01 |
| URL | https://www.aclweb.org/anthology/W17-5005/ |
| PWC | https://paperswithcode.com/paper/linked-data-for-language-learning |
| Repo | |
| Framework | |
Modeling intra-textual variation with entropy and surprisal: topical vs. stylistic patterns
| Title | Modeling intra-textual variation with entropy and surprisal: topical vs. stylistic patterns |
| Authors | Stefania Degaetano-Ortlieb, Elke Teich |
| Abstract | We present a data-driven approach to investigate intra-textual variation by combining entropy and surprisal. With this approach we detect linguistic variation based on phrasal lexico-grammatical patterns across sections of research articles. Entropy is used to detect patterns typical of specific sections. Surprisal is used to differentiate between more and less informationally-loaded patterns as well as type of information (topical vs. stylistic). While we here focus on research articles in biology/genetics, the methodology is especially interesting for digital humanities scholars, as it can be applied to any text type or domain and combined with additional variables (e.g. time, author or social group). |
| Tasks | |
| Published | 2017-08-01 |
| URL | https://www.aclweb.org/anthology/W17-2209/ |
| PWC | https://paperswithcode.com/paper/modeling-intra-textual-variation-with-entropy |
| Repo | |
| Framework | |
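Both measures used above are standard information-theoretic quantities: entropy H = -Σ p(w) log₂ p(w) over a section's pattern distribution, and surprisal S(w) = -log₂ P(w) per pattern. A toy computation (made-up pattern counts in place of real lexico-grammatical patterns) shows the mechanics:

```python
import math
from collections import Counter

# Toy pattern counts for one article section.
patterns = ["in_this_paper", "we_show_that", "in_this_paper",
            "results_suggest", "we_show_that", "in_this_paper"]
counts = Counter(patterns)
total = sum(counts.values())

# Entropy of the section's pattern distribution: H = -sum(p * log2 p).
entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())

# Surprisal of each pattern: S(w) = -log2 P(w); rarer patterns carry more bits.
surprisal = {w: -math.log2(c / total) for w, c in counts.items()}

print(f"H = {entropy:.2f} bits")
print(surprisal)
```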
Towards Lexical Chains for Knowledge-Graph-based Word Embeddings
| Title | Towards Lexical Chains for Knowledge-Graph-based Word Embeddings |
| Authors | Kiril Simov, Svetla Boytcheva, Petya Osenova |
| Abstract | Word vectors with varying dimensionalities and produced by different algorithms have been extensively used in NLP. The corpora that the algorithms are trained on can contain either natural language text (e.g. Wikipedia or newswire articles) or artificially-generated pseudo corpora due to natural data sparseness. We exploit Lexical Chain-based templates over a Knowledge Graph for generating pseudo-corpora with controlled linguistic value. These corpora are then used for learning word embeddings. A number of experiments have been conducted over the following test sets: WordSim353 Similarity, WordSim353 Relatedness and SimLex-999. The results show that, on the one hand, the incorporation of many-relation lexical chains improves results, but on the other hand, unrestricted-length chains remain difficult to handle with respect to their huge quantity. |
| Tasks | Language Modelling, Learning Word Embeddings, Word Embeddings, Word Sense Disambiguation |
| Published | 2017-09-01 |
| URL | https://www.aclweb.org/anthology/R17-1087/ |
| DOI | https://doi.org/10.26615/978-954-452-049-6_087 |
| PWC | https://paperswithcode.com/paper/towards-lexical-chains-for-knowledge-graph |
| Repo | |
| Framework | |
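Pseudo-corpus generation of this kind can be approximated by sampling lexical chains as walks over the graph and emitting them as pseudo-sentences for a word2vec-style trainer. The toy graph below is an assumption, and the paper constrains its chains with relation templates, which this sketch omits.

```python
import random

# Toy knowledge graph: node -> related nodes.
graph = {"dog": ["animal", "pet"], "animal": ["organism", "dog"],
         "pet": ["dog"], "organism": ["animal"]}

def lexical_chain(start, length=4):
    """Sample a chain of related concepts as one pseudo-sentence."""
    walk = [start]
    while len(walk) < length and graph.get(walk[-1]):
        walk.append(random.choice(graph[walk[-1]]))
    return " ".join(walk)

random.seed(0)
pseudo_corpus = [lexical_chain(w) for w in graph for _ in range(2)]
print(pseudo_corpus[:4])   # lines like these feed a word-embedding trainer
```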