May 5, 2019

Paper Group NANR 135

The Power of Language Music: Arabic Lemmatization through Patterns

Title The Power of Language Music: Arabic Lemmatization through Patterns
Authors Mohammed Attia, Ayah Zirikly, Mona Diab
Abstract The interaction between roots and patterns in Arabic has intrigued lexicographers and morphologists for centuries. While roots provide the consonantal building blocks, patterns provide the syllabic vocalic moulds. While roots provide abstract semantic classes, patterns realize these classes in specific instances. In this way both roots and patterns are indispensable for understanding the derivational, morphological and, to some extent, the cognitive aspects of the Arabic language. In this paper we perform lemmatization (a high-level lexical processing) without relying on a lookup dictionary. We use a hybrid approach that consists of a machine learning classifier to predict the lemma pattern for a given stem, and mapping rules to convert stems to their respective lemmas with the vocalization defined by the pattern.
Tasks Information Retrieval, Lemmatization
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5306/
PDF https://www.aclweb.org/anthology/W16-5306
PWC https://paperswithcode.com/paper/the-power-of-language-music-arabic
Repo
Framework
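
The mapping step described in the abstract above (turning a stem into a lemma using the vocalization a pattern defines) can be illustrated with a toy sketch. This is not the authors' implementation: the Latin transliteration, the digit-slot pattern notation, and the function name `apply_pattern` are all assumptions made for illustration, and the classifier that would predict the pattern is left out.

```python
def apply_pattern(root, pattern):
    """Fill the digit slots of a pattern (1, 2, 3, ...) with root consonants.

    Hypothetical notation: "1a2a3" means first consonant, "a",
    second consonant, "a", third consonant.
    """
    out = []
    for ch in pattern:
        if ch.isdigit():
            # Slots are 1-based, so slot "1" selects root[0].
            out.append(root[int(ch) - 1])
        else:
            # Any non-digit character is part of the vocalic mould.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # The root k-t-b combined with two illustrative patterns.
    print(apply_pattern("ktb", "1a2a3"))
    print(apply_pattern("ktb", "ma12a3"))
```

In a full system along the lines the abstract describes, the pattern string would come from a trained classifier rather than being supplied by hand.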

Effects of Communicative Pressures on Novice L2 Learners’ Use of Optional Formal Devices

Title Effects of Communicative Pressures on Novice L2 Learners’ Use of Optional Formal Devices
Authors Yoav Binoun, Francesca Delogu, Clayton Greenberg, Mindaugas Mozuraitis, Matthew Crocker
Abstract
Tasks Language Modelling
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-2009/
PDF https://www.aclweb.org/anthology/N16-2009
PWC https://paperswithcode.com/paper/effects-of-communicative-pressures-on-novice
Repo
Framework

Scaling a Natural Language Generation System

Title Scaling a Natural Language Generation System
Authors Jonathan Pfeil, Soumya Ray
Abstract
Tasks Text Generation
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1109/
PDF https://www.aclweb.org/anthology/P16-1109
PWC https://paperswithcode.com/paper/scaling-a-natural-language-generation-system
Repo
Framework

Best of Both Worlds: Making Word Sense Embeddings Interpretable

Title Best of Both Worlds: Making Word Sense Embeddings Interpretable
Authors Alexander Panchenko
Abstract Word sense embeddings represent a word sense as a low-dimensional numeric vector. While this representation is potentially useful for NLP applications, its interpretability is inherently limited. We propose a simple technique that improves interpretability of sense vectors by mapping them to synsets of a lexical resource. Our experiments with AdaGram sense embeddings and BabelNet synsets show that it is possible to retrieve synsets that correspond to automatically learned sense vectors with Precision of 0.87, Recall of 0.42 and AUC of 0.78.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1421/
PDF https://www.aclweb.org/anthology/L16-1421
PWC https://paperswithcode.com/paper/best-of-both-worlds-making-word-sense
Repo
Framework
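
One way to picture mapping sense vectors to synsets is a nearest-neighbour search with a similarity threshold. The sketch below is only an assumption about how such a mapping could look: the abstract does not say how senses and synsets are compared, and the cosine measure, the threshold value, the shared vector space, and all identifiers here are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def map_senses_to_synsets(sense_vectors, synset_vectors, threshold=0.5):
    """Link each sense to its most similar synset, or None below the threshold."""
    mapping = {}
    for sense_id, sense_vec in sense_vectors.items():
        best_id, best_sim = None, threshold
        for synset_id, synset_vec in synset_vectors.items():
            sim = cosine(sense_vec, synset_vec)
            if sim >= best_sim:
                best_id, best_sim = synset_id, sim
        mapping[sense_id] = best_id
    return mapping
```

Raising the threshold leaves more senses unmapped but makes the remaining links more reliable, which is the kind of trade-off a high-precision, lower-recall result such as the one reported in the abstract reflects.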

The hunvec framework for NN-CRF-based sequential tagging

Title The hunvec framework for NN-CRF-based sequential tagging
Authors Katalin Pajkossy, Attila Zséder
Abstract In this work we present the open source hunvec framework for sequential tagging, built upon Theano and Pylearn2. The underlying statistical model, which connects linear CRF-s with neural networks, was used by Collobert and co-workers, and several other researchers. For demonstrating the flexibility of our tool, we describe a set of experiments on part-of-speech and named-entity-recognition tasks, using English and Hungarian datasets, where we modify both model and training parameters, and illustrate the usage of custom features. Model parameters we experiment with affect the vectorial word representations used by the model; we apply different word vector initializations, defined by Word2vec and GloVe embeddings and enrich the representation of words by vectors assigned trigram features. We extend training methods by using their regularized (l2 and dropout) version. When testing our framework on a Hungarian named entity corpus, we find that its performance reaches the best published results on this dataset, with no need for language-specific feature engineering. Our code is available at http://github.com/zseder/hunvec
Tasks Feature Engineering, Named Entity Recognition
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1678/
PDF https://www.aclweb.org/anthology/L16-1678
PWC https://paperswithcode.com/paper/the-hunvec-framework-for-nn-crf-based
Repo
Framework
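
In a model that connects a linear-chain CRF with a neural network, the network typically supplies per-token tag scores and the CRF adds tag-to-tag transition scores; the best tag sequence is then found with Viterbi decoding. The following is a minimal pure-Python sketch of that decoding step, not code from hunvec: the function name and the list-of-lists score layout are assumptions.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring tag index sequence.

    emissions[t][y]: score of tag y at position t (e.g. from a neural network).
    transitions[p][y]: score of moving from tag p to tag y.
    """
    n_tags = len(transitions)
    best = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []
    for scores in emissions[1:]:
        step_best, step_back = [], []
        for y in range(n_tags):
            # Pick the previous tag that maximizes the path score into y.
            prev = max(range(n_tags), key=lambda p: best[p] + transitions[p][y])
            step_back.append(prev)
            step_best.append(best[prev] + transitions[prev][y] + scores[y])
        best = step_best
        backpointers.append(step_back)
    # Follow the backpointers from the best final tag to recover the path.
    tag = max(range(n_tags), key=lambda y: best[y])
    path = [tag]
    for step_back in reversed(backpointers):
        tag = step_back[tag]
        path.append(tag)
    return path[::-1]
```

With all transition scores set to zero the result is simply the per-token argmax; non-zero transitions let the model prefer or penalize particular tag sequences.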

Learning Word Meta-Embeddings

Title Learning Word Meta-Embeddings
Authors Wenpeng Yin, Hinrich Schütze
Abstract
Tasks Dependency Parsing, Dimensionality Reduction, Machine Translation, Part-Of-Speech Tagging, Word Embeddings
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1128/
PDF https://www.aclweb.org/anthology/P16-1128
PWC https://paperswithcode.com/paper/learning-word-meta-embeddings
Repo
Framework

Time-Independent and Language-Independent Extraction of Multiword Expressions From Twitter

Title Time-Independent and Language-Independent Extraction of Multiword Expressions From Twitter
Authors Nikhil Londhe, Rohini Srihari, Vishrawas Gopalakrishnan
Abstract Multiword Expressions (MWEs) are crucial lexico-semantic units in any language. However, most work on MWEs has been focused on standard monolingual corpora. In this work, we examine MWE usage on Twitter - an inherently multilingual medium with an extremely short average text length that is often replete with grammatical errors. In this work we present a new graph based, language agnostic method for automatically extracting MWEs from tweets. We show how our method outperforms standard Association Measures. We also present a novel unsupervised evaluation technique to ascertain the accuracy of MWE extraction.
Tasks Sentiment Analysis
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1214/
PDF https://www.aclweb.org/anthology/C16-1214
PWC https://paperswithcode.com/paper/time-independent-and-language-independent
Repo
Framework
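
The abstract compares against standard association measures without naming them here. Pointwise mutual information (PMI) is one widely used measure of this kind, so the sketch below scores adjacent word pairs with PMI as an example of such a baseline. It is an illustration only, not the paper's graph-based method, and the function name is hypothetical.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI in bits} for every adjacent word pair in tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI compares the observed pair probability with what
        # independence of the two words would predict.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores
```

Pairs with high PMI co-occur more often than chance would suggest, which is why the measure is commonly used to rank multiword expression candidates; it is known to overrate rare pairs, so a minimum-frequency cutoff is usually applied alongside it.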

Feature based Sentiment Analysis using a Domain Ontology

Title Feature based Sentiment Analysis using a Domain Ontology
Authors Neha Yadav, C Ravindranath Chowdary
Abstract
Tasks Opinion Mining, Sentiment Analysis, Word Sense Disambiguation
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-6312/
PDF https://www.aclweb.org/anthology/W16-6312
PWC https://paperswithcode.com/paper/feature-based-sentiment-analysis-using-a
Repo
Framework

Learning Knowledge Base Inference with Neural Theorem Provers

Title Learning Knowledge Base Inference with Neural Theorem Provers
Authors Tim Rocktäschel, Sebastian Riedel
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-1309/
PDF https://www.aclweb.org/anthology/W16-1309
PWC https://paperswithcode.com/paper/learning-knowledge-base-inference-with-neural
Repo
Framework

A Corpus of Argument Networks: Using Graph Properties to Analyse Divisive Issues

Title A Corpus of Argument Networks: Using Graph Properties to Analyse Divisive Issues
Authors Barbara Konat, John Lawrence, Joonsuk Park, Katarzyna Budzynska, Chris Reed
Abstract Governments are increasingly utilising online platforms in order to engage with, and ascertain the opinions of, their citizens. Whilst policy makers could potentially benefit from such enormous feedback from society, they first face the challenge of making sense out of the large volumes of data produced. This creates a demand for tools and technologies which will enable governments to quickly and thoroughly digest the points being made and to respond accordingly. By determining the argumentative and dialogical structures contained within a debate, we are able to determine the issues which are divisive and those which attract agreement. This paper proposes a method of graph-based analytics which uses properties of graphs representing networks of arguments pro- & con- in order to automatically analyse issues which divide citizens about new regulations. By future application of the most recent advances in argument mining, the results reported here will have a chance to scale up to enable sense-making of the vast amount of feedback received from citizens on directions that policy should take.
Tasks Argument Mining
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1617/
PDF https://www.aclweb.org/anthology/L16-1617
PWC https://paperswithcode.com/paper/a-corpus-of-argument-networks-using-graph
Repo
Framework

Towards a Convex HMM Surrogate for Word Alignment

Title Towards a Convex HMM Surrogate for Word Alignment
Authors Andrei Simion, Michael Collins, Cliff Stein
Abstract
Tasks Machine Translation, Word Alignment
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1051/
PDF https://www.aclweb.org/anthology/D16-1051
PWC https://paperswithcode.com/paper/towards-a-convex-hmm-surrogate-for-word
Repo
Framework

Corpus for Children’s Writing with Enhanced Output for Specific Spelling Patterns (2nd and 3rd Grade)

Title Corpus for Children’s Writing with Enhanced Output for Specific Spelling Patterns (2nd and 3rd Grade)
Authors Kay Berkling
Abstract This paper describes the collection of the H1 Corpus of children’s weekly writing over the course of 3 months in 2nd and 3rd grades, aged 7-11. The texts were collected within the normal classroom setting by the teacher. Texts of children whose parents signed the permission to donate the texts to science were collected and transcribed. The corpus consists of the elicitation techniques, an overview of the data collected and the transcriptions of the texts both with and without spelling errors, aligned on a word by word basis, as well as the scanned in texts. The corpus is available for research via Linguistic Data Consortium (LDC). Researchers are strongly encouraged to make additional annotations and improvements and return it to the public domain via LDC.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1510/
PDF https://www.aclweb.org/anthology/L16-1510
PWC https://paperswithcode.com/paper/corpus-for-childrens-writing-with-enhanced
Repo
Framework

Proceedings of the Workshop on Grammar and Lexicon: interactions and interfaces (GramLex)

Title Proceedings of the Workshop on Grammar and Lexicon: interactions and interfaces (GramLex)
Authors
Abstract
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-3800/
PDF https://www.aclweb.org/anthology/W16-3800
PWC https://paperswithcode.com/paper/proceedings-of-the-workshop-on-grammar-and
Repo
Framework

Korean Language Resources for Everyone

Title Korean Language Resources for Everyone
Authors Jungyeul Park, Jeen-Pyo Hong, Jeong-Won Cha
Abstract
Tasks Machine Translation, Morphological Analysis, Part-Of-Speech Tagging
Published 2016-10-01
URL https://www.aclweb.org/anthology/Y16-2002/
PDF https://www.aclweb.org/anthology/Y16-2002
PWC https://paperswithcode.com/paper/korean-language-resources-for-everyone
Repo
Framework

Experiments in Idiom Recognition

Title Experiments in Idiom Recognition
Authors Jing Peng, Anna Feldman
Abstract Some expressions can be ambiguous between idiomatic and literal interpretations depending on the context they occur in, e.g., ‘sales hit the roof’ vs. ‘hit the roof of the car’. We present a novel method of classifying whether a given instance is literal or idiomatic, focusing on verb-noun constructions. We report state-of-the-art results on this task using an approach based on the hypothesis that the distributions of the contexts of the idiomatic phrases will be different from the contexts of the literal usages. We measure contexts by using projections of the words into vector space. For comparison, we implement Fazly et al. (2009)’s, Sporleder and Li (2009)’s, and Li and Sporleder (2010b)’s methods and apply them to our data. We provide experimental results validating the proposed techniques.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1259/
PDF https://www.aclweb.org/anthology/C16-1259
PWC https://paperswithcode.com/paper/experiments-in-idiom-recognition
Repo
Framework