October 15, 2019

Paper Group NANR 142

Language Informed Modeling of Code-Switched Text

Title Language Informed Modeling of Code-Switched Text
Authors Khyathi Chandu, Thomas Manzini, Sumeet Singh, Alan W. Black
Abstract Code-switching (CS), the practice of alternating between two or more languages in conversations, is pervasive in most multi-lingual communities. CS texts have a complex interplay between languages and occur in informal contexts that make them harder to collect and construct NLP tools for. We approach this problem through Language Modeling (LM) on a new Hindi-English mixed corpus containing 59,189 unique sentences collected from blogging websites. We implement and discuss different Language Models derived from a multi-layered LSTM architecture. We hypothesize that encoding language information strengthens a language model by helping to learn code-switching points. We show that our highest performing model achieves a test perplexity of 19.52 on the CS corpus that we collected and processed. On this data we demonstrate that our performance is an improvement over AWD-LSTM LM (a recent state of the art on monolingual English).
Tasks Language Modelling, Machine Translation, Speech Recognition
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3211/
PDF https://www.aclweb.org/anthology/W18-3211
PWC https://paperswithcode.com/paper/language-informed-modeling-of-code-switched
Repo
Framework
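
The abstract above reports a test perplexity of 19.52. As a reminder of what that metric measures, here is a minimal sketch that assumes the standard definition (the exponential of the average negative log-probability per token). The function name and the toy probabilities are illustrative and are not taken from the paper.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative natural-log probability) over the tokens."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Hypothetical model that assigns probability 0.25 to every token:
# it is as uncertain as a uniform choice among four words.
toy_log_probs = [math.log(0.25)] * 4
print(round(perplexity(toy_log_probs), 6))
```

Under this definition, a lower value means the model is, on average, less surprised by the held-out text.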

GHHT at CALCS 2018: Named Entity Recognition for Dialectal Arabic Using Neural Networks

Title GHHT at CALCS 2018: Named Entity Recognition for Dialectal Arabic Using Neural Networks
Authors Mohammed Attia, Younes Samih, Wolfgang Maier
Abstract This paper describes our system submission to the CALCS 2018 shared task on named entity recognition on code-switched data for the language variant pair of Modern Standard Arabic and Egyptian dialectal Arabic. We build a Deep Neural Network that combines word and character-based representations in convolutional and recurrent networks with a CRF layer. The model is augmented with stacked layers of enriched information such as pre-trained embeddings, Brown clusters and named entity gazetteers. Our system is ranked second among those participating in the shared task, achieving an FB1 average of 70.09%.
Tasks Named Entity Recognition
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3212/
PDF https://www.aclweb.org/anthology/W18-3212
PWC https://paperswithcode.com/paper/ghht-at-calcs-2018-named-entity-recognition
Repo
Framework

Learning Unsupervised Word Translations Without Adversaries

Title Learning Unsupervised Word Translations Without Adversaries
Authors Tanmoy Mukherjee, Makoto Yamada, Timothy Hospedales
Abstract Word translation, or bilingual dictionary induction, is an important capability that impacts many multilingual language processing tasks. Recent research has shown that word translation can be achieved in an unsupervised manner, without parallel seed dictionaries or aligned corpora. However, state-of-the-art methods for unsupervised bilingual dictionary induction are based on generative adversarial models, and as such suffer from their well-known problems of instability and hyper-parameter sensitivity. We present a statistical dependency-based approach to bilingual dictionary induction that is unsupervised – no seed dictionary or parallel corpora required; and introduces no adversary – therefore being much easier to train. Our method performs comparably to adversarial alternatives and outperforms prior non-adversarial methods.
Tasks Machine Translation, Multilingual Word Embeddings, Transfer Learning, Word Embeddings
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1063/
PDF https://www.aclweb.org/anthology/D18-1063
PWC https://paperswithcode.com/paper/learning-unsupervised-word-translations
Repo
Framework

Tackling Code-Switched NER: Participation of CMU

Title Tackling Code-Switched NER: Participation of CMU
Authors Parvathy Geetha, Khyathi Chandu, Alan W Black
Abstract Named Entity Recognition plays a major role in several downstream applications in NLP. Though this task has been heavily studied in formal monolingual texts and also noisy texts like Twitter data, it is still an emerging task in code-switched (CS) content on social media. This paper describes our participation in the shared task of NER on code-switched data for Spanglish (Spanish + English) and Arabish (Arabic + English). We describe models that were developed intuitively from the data for the shared task Named Entity Recognition on Code-switched Data. Owing to the sparse and non-linear relationships between words in Twitter data, we explored neural architectures that are capable of modeling non-linearities fairly well. Specifically, we trained character level models and word level models based on Bidirectional LSTMs (Bi-LSTMs) to perform sequential tagging. We train multiple models to identify nominal mentions and subsequently use this information to predict the labels of named entities in a sequence. Our best model is a character level model along with word level pre-trained multilingual embeddings that gave an F-score of 56.72 in Spanglish and a word level model that gave an F-score of 65.02 in Arabish on the test data.
Tasks Named Entity Recognition, Question Answering
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3217/
PDF https://www.aclweb.org/anthology/W18-3217
PWC https://paperswithcode.com/paper/tackling-code-switched-ner-participation-of
Repo
Framework

Fine-Grained Arabic Dialect Identification

Title Fine-Grained Arabic Dialect Identification
Authors Mohammad Salameh, Houda Bouamor, Nizar Habash
Abstract Previous work on the problem of Arabic Dialect Identification typically targeted coarse-grained five dialect classes plus Standard Arabic (6-way classification). This paper presents the first results on a fine-grained dialect classification task covering 25 specific cities from across the Arab World, in addition to Standard Arabic – a very challenging task. We build several classification systems and explore a large space of features. Our results show that we can identify the exact city of a speaker at an accuracy of 67.9% for sentences with an average length of 7 words (a 9% relative error reduction over the state-of-the-art technique for Arabic dialect identification) and reach more than 90% when we consider 16 words. We also report on additional insights from a data analysis of similarity and difference across Arabic dialects.
Tasks Machine Translation, Sentiment Analysis
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1113/
PDF https://www.aclweb.org/anthology/C18-1113
PWC https://paperswithcode.com/paper/fine-grained-arabic-dialect-identification
Repo
Framework
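
The abstract above states its gain as a relative error reduction. The sketch below shows how that quantity is usually computed from two accuracies, assuming error is simply one minus accuracy. The function name and the example accuracies are hypothetical and are not figures from the paper.

```python
def relative_error_reduction(baseline_accuracy, new_accuracy):
    """Fraction of the baseline's error rate that the new system removes."""
    baseline_error = 1.0 - baseline_accuracy
    new_error = 1.0 - new_accuracy
    if baseline_error <= 0.0:
        raise ValueError("baseline has no error left to reduce")
    return (baseline_error - new_error) / baseline_error


# Illustrative numbers only: moving from 60% to 64% accuracy cuts the
# error rate from 40% to 36%, i.e. about a tenth of the baseline error.
print(round(relative_error_reduction(0.60, 0.64), 4))
```

Reporting the relative figure makes a small absolute gain comparable across tasks whose baselines sit at very different error levels.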

Clausal Modifiers in the Grammar Matrix

Title Clausal Modifiers in the Grammar Matrix
Authors Kristen Howell, Olga Zamaraeva
Abstract We extend the coverage of an existing grammar customization system to clausal modifiers, also referred to as adverbial clauses. We present an analysis, taking a typologically-driven approach to account for this phenomenon across the world's languages, which we implement in the Grammar Matrix customization system (Bender et al., 2002, 2010). Testing our analysis on testsuites from five genetically and geographically diverse languages that were not considered in development, we achieve 88.4% coverage and 1.5% overgeneration.
Tasks
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1249/
PDF https://www.aclweb.org/anthology/C18-1249
PWC https://paperswithcode.com/paper/clausal-modifiers-in-the-grammar-matrix
Repo
Framework

IIT (BHU) Submission for the ACL Shared Task on Named Entity Recognition on Code-switched Data

Title IIT (BHU) Submission for the ACL Shared Task on Named Entity Recognition on Code-switched Data
Authors Shashwat Trivedi, Harsh Rangwani, Anil Kumar Singh
Abstract This paper describes the best performing system for the shared task on Named Entity Recognition (NER) on code-switched data for the language pair Spanish-English (ENG-SPA). We introduce a gated neural architecture for the NER task. Our final model achieves an F1 score of 63.76%, outperforming the baseline by 10%.
Tasks Named Entity Recognition
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3220/
PDF https://www.aclweb.org/anthology/W18-3220
PWC https://paperswithcode.com/paper/iit-bhu-submission-for-the-acl-shared-task-on
Repo
Framework

Unified Guidelines and Resources for Arabic Dialect Orthography

Title Unified Guidelines and Resources for Arabic Dialect Orthography
Authors Nizar Habash, Fadhl Eryani, Salam Khalifa, Owen Rambow, Dana Abdulrahim, Alexander Erdmann, Reem Faraj, Wajdi Zaghouani, Houda Bouamor, Nasser Zalmout, Sara Hassan, Faisal Al-Shargi, Sakhar Alkhereyf, Basma Abdulkareem, Ramy Eskander, Mohammad Salameh, Hind Saddiki
Abstract
Tasks Machine Translation, Speech Recognition, Transliteration
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1574/
PDF https://www.aclweb.org/anthology/L18-1574
PWC https://paperswithcode.com/paper/unified-guidelines-and-resources-for-arabic
Repo
Framework

The LIA Treebank of Spoken Norwegian Dialects

Title The LIA Treebank of Spoken Norwegian Dialects
Authors Lilja Øvrelid, Andre Kåsen, Kristin Hagen, Anders Nøklestad, Per Erik Solberg, Janne Bondi Johannessen
Abstract
Tasks Dependency Parsing, Transliteration
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1710/
PDF https://www.aclweb.org/anthology/L18-1710
PWC https://paperswithcode.com/paper/the-lia-treebank-of-spoken-norwegian-dialects
Repo
Framework

EmotionX-Area66: Predicting Emotions in Dialogues using Hierarchical Attention Network with Sequence Labeling

Title EmotionX-Area66: Predicting Emotions in Dialogues using Hierarchical Attention Network with Sequence Labeling
Authors Rohit Saxena, Savita Bhat, Niranjan Pedanekar
Abstract This paper presents our system submitted to the EmotionX challenge, an emotion detection task on dialogues in the EmotionLines dataset. We formulate this as a hierarchical network in which the network learns data representations at both the utterance level and the dialogue level. Our model is inspired by the Hierarchical Attention Network (HAN) and uses pre-trained word embeddings as features. We formulate emotion detection in dialogues as a sequence labeling problem to capture the dependencies among labels. We report the performance accuracy for four emotions (anger, joy, neutral and sadness). The model achieved an unweighted accuracy of 55.38% on the Friends test dataset and 56.73% on the EmotionPush test dataset. We report an improvement of 22.51% on the Friends dataset and 36.04% on the EmotionPush dataset over baseline results.
Tasks Emotion Classification, Emotion Recognition, Word Embeddings
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3509/
PDF https://www.aclweb.org/anthology/W18-3509
PWC https://paperswithcode.com/paper/emotionx-area66-predicting-emotions-in
Repo
Framework
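
The abstract above reports unweighted accuracy but does not define it. The sketch below assumes the common reading: the mean of per-class accuracies, so that rare emotions count as much as frequent ones. The function name and the toy labels are illustrative only.

```python
from collections import defaultdict


def unweighted_accuracy(gold, pred):
    """Average the per-class accuracy so every class weighs the same."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equal in length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for g, p in zip(gold, pred):
        total[g] += 1
        if g == p:
            correct[g] += 1
    # Only classes that occur in the gold labels contribute to the mean.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical utterance labels: "joy" is frequent, "anger" is rare.
gold = ["joy", "joy", "joy", "anger"]
pred = ["joy", "joy", "neutral", "anger"]
print(round(unweighted_accuracy(gold, pred), 4))
```

On these toy labels plain accuracy is 3 out of 4, while the unweighted score averages 2/3 for joy and 1/1 for anger, which shows how the two measures diverge on imbalanced data.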

Mixed-integer programming formulation of a data-driven solver in computational elasticity

Title Mixed-integer programming formulation of a data-driven solver in computational elasticity
Authors Yoshihiro Kanno
Abstract This paper presents a mixed-integer quadratic programming formulation of an existing data-driven approach to computational elasticity. This formulation is suitable for application of a standard mixed-integer programming solver, which finds a global optimal solution. Therefore, the results obtained by the presented method can be used as benchmark instances for any other algorithm. Preliminary numerical experiments are performed to compare the quality of solutions obtained by the proposed method and a heuristic conventionally used in data-driven computational mechanics.
Tasks Stress-Strain Relation
Published 2018-02-25
URL https://link.springer.com/article/10.1007/s11590-019-01409-w
PDF https://link.springer.com/article/10.1007/s11590-019-01409-w
PWC https://paperswithcode.com/paper/mixed-integer-programming-formulation-of-a
Repo
Framework

Universal Sketch Perceptual Grouping

Title Universal Sketch Perceptual Grouping
Authors Ke Li, Kaiyue Pang, Jifei Song, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Honggang Zhang
Abstract In this work we aim to develop a universal sketch grouper. That is, a grouper that can be applied to sketches of any category in any domain to group constituent strokes/segments into semantically meaningful object parts. The first obstacle to this goal is the lack of large-scale datasets with grouping annotation. To overcome this, we contribute the largest sketch perceptual grouping (SPG) dataset to date, consisting of 20,000 unique sketches evenly distributed over 25 object categories. Furthermore, we propose a novel deep universal perceptual grouping model. The model is learned with both generative and discriminative losses. The generative losses improve the generalisation ability of the model to unseen object categories and datasets. The discriminative losses include a local grouping loss and a novel global grouping loss to enforce global grouping consistency. We show that the proposed model significantly outperforms the state-of-the-art groupers. Further, we show that our grouper is useful for a number of sketch analysis tasks including sketch synthesis and fine-grained sketch-based image retrieval (FG-SBIR).
Tasks Image Retrieval, Sketch-Based Image Retrieval
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Ke_LI_Universal_Sketch_Perceptual_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Ke_LI_Universal_Sketch_Perceptual_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/universal-sketch-perceptual-grouping
Repo
Framework

Thank "Goodness"! A Way to Measure Style in Student Essays

Title Thank "Goodness"! A Way to Measure Style in Student Essays
Authors Sandeep Mathias, Pushpak Bhattacharyya
Abstract Essays have two major components for scoring: content and style. In this paper, we describe a property of the essay, called goodness, and use it to predict the score given for the style of student essays. We compare our approach with baseline approaches, such as language modeling, and with a state-of-the-art deep learning system. We show that, despite being quite intuitive, our approach is very powerful in predicting the style of the essays.
Tasks Language Modelling
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3705/
PDF https://www.aclweb.org/anthology/W18-3705
PWC https://paperswithcode.com/paper/thank-agoodnessa-a-way-to-measure-style-in
Repo
Framework

Abstract Meaning Representation for Paraphrase Detection

Title Abstract Meaning Representation for Paraphrase Detection
Authors Fuad Issa, Marco Damonte, Shay B. Cohen, Xiaohui Yan, Yi Chang
Abstract Abstract Meaning Representation (AMR) parsing aims at abstracting away from the syntactic realization of a sentence and denoting only its meaning in a canonical form. As such, it is ideal for paraphrase detection, a problem in which one is required to specify whether two sentences have the same meaning. We show that naïve use of AMR in paraphrase detection is not necessarily useful, and then describe a technique based on latent semantic analysis in combination with AMR parsing that significantly advances state-of-the-art results in paraphrase detection for the Microsoft Research Paraphrase Corpus. Our best results in the transductive setting are 86.6% for accuracy and 90.0% for F1 measure.
Tasks Amr Parsing
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-1041/
PDF https://www.aclweb.org/anthology/N18-1041
PWC https://paperswithcode.com/paper/abstract-meaning-representation-for-1
Repo
Framework

You Tweet What You Speak: A City-Level Dataset of Arabic Dialects

Title You Tweet What You Speak: A City-Level Dataset of Arabic Dialects
Authors Muhammad Abdul-Mageed, Hassan Alhuzali, Mohamed Elaraby
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1577/
PDF https://www.aclweb.org/anthology/L18-1577
PWC https://paperswithcode.com/paper/you-tweet-what-you-speak-a-city-level-dataset
Repo
Framework