Paper Group NANR 142
Language Informed Modeling of Code-Switched Text
Title | Language Informed Modeling of Code-Switched Text |
Authors | Khyathi Chandu, Thomas Manzini, Sumeet Singh, Alan W. Black |
Abstract | Code-switching (CS), the practice of alternating between two or more languages in conversations, is pervasive in most multi-lingual communities. CS texts have a complex interplay between languages and occur in informal contexts that make them harder to collect and construct NLP tools for. We approach this problem through Language Modeling (LM) on a new Hindi-English mixed corpus containing 59,189 unique sentences collected from blogging websites. We implement and discuss different Language Models derived from a multi-layered LSTM architecture. We hypothesize that encoding language information strengthens a language model by helping to learn code-switching points. We show that our highest performing model achieves a test perplexity of 19.52 on the CS corpus that we collected and processed. On this data we demonstrate that our performance is an improvement over AWD-LSTM LM (a recent state of the art on monolingual English). |
Tasks | Language Modelling, Machine Translation, Speech Recognition |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3211/ |
https://www.aclweb.org/anthology/W18-3211 | |
PWC | https://paperswithcode.com/paper/language-informed-modeling-of-code-switched |
Repo | |
Framework | |
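The test perplexity quoted in the abstract above is, by the standard definition, the exponential of the average negative log-probability a model assigns to each token. The sketch below illustrates only that definition; the function name and the example probabilities are hypothetical and are not taken from the paper or its corpus.

```python
import math


def perplexity(token_probs):
    """Return exp of the mean negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood over all tokens.
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


# Hypothetical probabilities assigned by some model to four consecutive tokens.
example_probs = [0.25, 0.25, 0.25, 0.25]
print(perplexity(example_probs))
```

A model that spreads its probability uniformly over k choices at every step has perplexity k, so lower values mean the model is less "surprised" by the test text.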
GHHT at CALCS 2018: Named Entity Recognition for Dialectal Arabic Using Neural Networks
Title | GHHT at CALCS 2018: Named Entity Recognition for Dialectal Arabic Using Neural Networks |
Authors | Mohammed Attia, Younes Samih, Wolfgang Maier |
Abstract | This paper describes our system submission to the CALCS 2018 shared task on named entity recognition on code-switched data for the language variant pair of Modern Standard Arabic and Egyptian dialectal Arabic. We build a Deep Neural Network that combines word and character-based representations in convolutional and recurrent networks with a CRF layer. The model is augmented with stacked layers of enriched information such as pre-trained embeddings, Brown clusters and named entity gazetteers. Our system is ranked second among those participating in the shared task, achieving an FB1 average of 70.09%. |
Tasks | Named Entity Recognition |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3212/ |
https://www.aclweb.org/anthology/W18-3212 | |
PWC | https://paperswithcode.com/paper/ghht-at-calcs-2018-named-entity-recognition |
Repo | |
Framework | |
Learning Unsupervised Word Translations Without Adversaries
Title | Learning Unsupervised Word Translations Without Adversaries |
Authors | Tanmoy Mukherjee, Makoto Yamada, Timothy Hospedales |
Abstract | Word translation, or bilingual dictionary induction, is an important capability that impacts many multilingual language processing tasks. Recent research has shown that word translation can be achieved in an unsupervised manner, without parallel seed dictionaries or aligned corpora. However, state-of-the-art methods for unsupervised bilingual dictionary induction are based on generative adversarial models, and as such suffer from their well-known problems of instability and hyper-parameter sensitivity. We present a statistical dependency-based approach to bilingual dictionary induction that is unsupervised (no seed dictionary or parallel corpora required) and introduces no adversary, and is therefore much easier to train. Our method performs comparably to adversarial alternatives and outperforms prior non-adversarial methods. |
Tasks | Machine Translation, Multilingual Word Embeddings, Transfer Learning, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1063/ |
https://www.aclweb.org/anthology/D18-1063 | |
PWC | https://paperswithcode.com/paper/learning-unsupervised-word-translations |
Repo | |
Framework | |
Tackling Code-Switched NER: Participation of CMU
Title | Tackling Code-Switched NER: Participation of CMU |
Authors | Parvathy Geetha, Khyathi Chandu, Alan W Black |
Abstract | Named Entity Recognition plays a major role in several downstream applications in NLP. Though this task has been heavily studied in formal monolingual texts and also in noisy texts like Twitter data, it is still an emerging task in code-switched (CS) content on social media. This paper describes our participation in the shared task of NER on code-switched data for Spanglish (Spanish + English) and Arabish (Arabic + English). We describe models that were developed intuitively from the data for the shared task Named Entity Recognition on Code-switched Data. Owing to the sparse and non-linear relationships between words in Twitter data, we explored neural architectures that capture non-linearities fairly well. Specifically, we trained character level models and word level models based on Bidirectional LSTMs (Bi-LSTMs) to perform sequential tagging. We train multiple models to identify nominal mentions and subsequently use this information to predict the labels of named entities in a sequence. Our best model is a character level model along with word level pre-trained multilingual embeddings that gave an F-score of 56.72 in Spanglish and a word level model that gave an F-score of 65.02 in Arabish on the test data. |
Tasks | Named Entity Recognition, Question Answering |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3217/ |
https://www.aclweb.org/anthology/W18-3217 | |
PWC | https://paperswithcode.com/paper/tackling-code-switched-ner-participation-of |
Repo | |
Framework | |
Fine-Grained Arabic Dialect Identification
Title | Fine-Grained Arabic Dialect Identification |
Authors | Mohammad Salameh, Houda Bouamor, Nizar Habash |
Abstract | Previous work on the problem of Arabic Dialect Identification typically targeted coarse-grained five dialect classes plus Standard Arabic (6-way classification). This paper presents the first results on a fine-grained dialect classification task covering 25 specific cities from across the Arab World, in addition to Standard Arabic, which is a very challenging task. We build several classification systems and explore a large space of features. Our results show that we can identify the exact city of a speaker at an accuracy of 67.9% for sentences with an average length of 7 words (a 9% relative error reduction over the state-of-the-art technique for Arabic dialect identification) and reach more than 90% when we consider 16 words. We also report on additional insights from a data analysis of similarity and difference across Arabic dialects. |
Tasks | Machine Translation, Sentiment Analysis |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1113/ |
https://www.aclweb.org/anthology/C18-1113 | |
PWC | https://paperswithcode.com/paper/fine-grained-arabic-dialect-identification |
Repo | |
Framework | |
Clausal Modifiers in the Grammar Matrix
Title | Clausal Modifiers in the Grammar Matrix |
Authors | Kristen Howell, Olga Zamaraeva |
Abstract | We extend the coverage of an existing grammar customization system to clausal modifiers, also referred to as adverbial clauses. We present an analysis, taking a typologically-driven approach to account for this phenomenon across the world's languages, which we implement in the Grammar Matrix customization system (Bender et al., 2002, 2010). Testing our analysis on testsuites from five genetically and geographically diverse languages that were not considered in development, we achieve 88.4% coverage and 1.5% overgeneration. |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1249/ |
https://www.aclweb.org/anthology/C18-1249 | |
PWC | https://paperswithcode.com/paper/clausal-modifiers-in-the-grammar-matrix |
Repo | |
Framework | |
IIT (BHU) Submission for the ACL Shared Task on Named Entity Recognition on Code-switched Data
Title | IIT (BHU) Submission for the ACL Shared Task on Named Entity Recognition on Code-switched Data |
Authors | Shashwat Trivedi, Harsh Rangwani, Anil Kumar Singh |
Abstract | This paper describes the best performing system for the shared task on Named Entity Recognition (NER) on code-switched data for the language pair Spanish-English (ENG-SPA). We introduce a gated neural architecture for the NER task. Our final model achieves an F1 score of 63.76%, outperforming the baseline by 10%. |
Tasks | Named Entity Recognition |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3220/ |
https://www.aclweb.org/anthology/W18-3220 | |
PWC | https://paperswithcode.com/paper/iit-bhu-submission-for-the-acl-shared-task-on |
Repo | |
Framework | |
Unified Guidelines and Resources for Arabic Dialect Orthography
Title | Unified Guidelines and Resources for Arabic Dialect Orthography |
Authors | Nizar Habash, Fadhl Eryani, Salam Khalifa, Owen Rambow, Dana Abdulrahim, Alexander Erdmann, Reem Faraj, Wajdi Zaghouani, Houda Bouamor, Nasser Zalmout, Sara Hassan, Faisal Al-Shargi, Sakhar Alkhereyf, Basma Abdulkareem, Ramy Eskander, Mohammad Salameh, Hind Saddiki |
Abstract | |
Tasks | Machine Translation, Speech Recognition, Transliteration |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1574/ |
https://www.aclweb.org/anthology/L18-1574 | |
PWC | https://paperswithcode.com/paper/unified-guidelines-and-resources-for-arabic |
Repo | |
Framework | |
The LIA Treebank of Spoken Norwegian Dialects
Title | The LIA Treebank of Spoken Norwegian Dialects |
Authors | Lilja Øvrelid, Andre Kåsen, Kristin Hagen, Anders Nøklestad, Per Erik Solberg, Janne Bondi Johannessen |
Abstract | |
Tasks | Dependency Parsing, Transliteration |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1710/ |
https://www.aclweb.org/anthology/L18-1710 | |
PWC | https://paperswithcode.com/paper/the-lia-treebank-of-spoken-norwegian-dialects |
Repo | |
Framework | |
EmotionX-Area66: Predicting Emotions in Dialogues using Hierarchical Attention Network with Sequence Labeling
Title | EmotionX-Area66: Predicting Emotions in Dialogues using Hierarchical Attention Network with Sequence Labeling |
Authors | Rohit Saxena, Savita Bhat, Niranjan Pedanekar |
Abstract | This paper presents our system submitted to the EmotionX challenge, an emotion detection task on dialogues in the EmotionLines dataset. We formulate this as a hierarchical network in which the network learns data representations at both the utterance level and the dialogue level. Our model is inspired by the Hierarchical Attention Network (HAN) and uses pre-trained word embeddings as features. We formulate emotion detection in dialogues as a sequence labeling problem to capture the dependencies among labels. We report the performance accuracy for four emotions (anger, joy, neutral and sadness). The model achieved an unweighted accuracy of 55.38% on the Friends test dataset and 56.73% on the EmotionPush test dataset. We report an improvement of 22.51% on the Friends dataset and 36.04% on the EmotionPush dataset over baseline results. |
Tasks | Emotion Classification, Emotion Recognition, Word Embeddings |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3509/ |
https://www.aclweb.org/anthology/W18-3509 | |
PWC | https://paperswithcode.com/paper/emotionx-area66-predicting-emotions-in |
Repo | |
Framework | |
Mixed-integer programming formulation of a data-driven solver in computational elasticity
Title | Mixed-integer programming formulation of a data-driven solver in computational elasticity |
Authors | Yoshihiro Kanno |
Abstract | This paper presents a mixed-integer quadratic programming formulation of an existing data-driven approach to computational elasticity. This formulation is suitable for application of a standard mixed-integer programming solver, which finds a global optimal solution. Therefore, the results obtained by the presented method can be used as benchmark instances for any other algorithm. Preliminary numerical experiments are performed to compare the quality of solutions obtained by the proposed method and by a heuristic conventionally used in data-driven computational mechanics. |
Tasks | Stress-Strain Relation |
Published | 2018-02-25 |
URL | https://link.springer.com/article/10.1007/s11590-019-01409-w |
https://link.springer.com/article/10.1007/s11590-019-01409-w | |
PWC | https://paperswithcode.com/paper/mixed-integer-programming-formulation-of-a |
Repo | |
Framework | |
Universal Sketch Perceptual Grouping
Title | Universal Sketch Perceptual Grouping |
Authors | Ke Li, Kaiyue Pang, Jifei Song, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Honggang Zhang |
Abstract | In this work we aim to develop a universal sketch grouper. That is, a grouper that can be applied to sketches of any category in any domain to group constituent strokes/segments into semantically meaningful object parts. The first obstacle to this goal is the lack of large-scale datasets with grouping annotation. To overcome this, we contribute the largest sketch perceptual grouping (SPG) dataset to date, consisting of 20,000 unique sketches evenly distributed over 25 object categories. Furthermore, we propose a novel deep universal perceptual grouping model. The model is learned with both generative and discriminative losses. The generative losses improve the generalisation ability of the model to unseen object categories and datasets. The discriminative losses include a local grouping loss and a novel global grouping loss to enforce global grouping consistency. We show that the proposed model significantly outperforms the state-of-the-art groupers. Further, we show that our grouper is useful for a number of sketch analysis tasks including sketch synthesis and fine-grained sketch-based image retrieval (FG-SBIR). |
Tasks | Image Retrieval, Sketch-Based Image Retrieval |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Ke_LI_Universal_Sketch_Perceptual_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Ke_LI_Universal_Sketch_Perceptual_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/universal-sketch-perceptual-grouping |
Repo | |
Framework | |
Thank "Goodness"! A Way to Measure Style in Student Essays
Title | Thank "Goodness"! A Way to Measure Style in Student Essays |
Authors | Sandeep Mathias, Pushpak Bhattacharyya |
Abstract | Essays have two major components for scoring - content and style. In this paper, we describe a property of the essay, called goodness, and use it to predict the score given for the style of student essays. We compare our approach to solve this problem with baseline approaches, like language modeling and also a state-of-the-art deep learning system. We show that, despite being quite intuitive, our approach is very powerful in predicting the style of the essays. |
Tasks | Language Modelling |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3705/ |
https://www.aclweb.org/anthology/W18-3705 | |
PWC | https://paperswithcode.com/paper/thank-agoodnessa-a-way-to-measure-style-in |
Repo | |
Framework | |
Abstract Meaning Representation for Paraphrase Detection
Title | Abstract Meaning Representation for Paraphrase Detection |
Authors | Fuad Issa, Marco Damonte, Shay B. Cohen, Xiaohui Yan, Yi Chang |
Abstract | Abstract Meaning Representation (AMR) parsing aims at abstracting away from the syntactic realization of a sentence, and denoting only its meaning in a canonical form. As such, it is ideal for paraphrase detection, a problem in which one is required to specify whether two sentences have the same meaning. We show that naïve use of AMR in paraphrase detection is not necessarily useful, and turn to describe a technique based on latent semantic analysis in combination with AMR parsing that significantly advances state-of-the-art results in paraphrase detection for the Microsoft Research Paraphrase Corpus. Our best results in the transductive setting are 86.6% for accuracy and 90.0% for F1 measure. |
Tasks | AMR Parsing |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-1041/ |
https://www.aclweb.org/anthology/N18-1041 | |
PWC | https://paperswithcode.com/paper/abstract-meaning-representation-for-1 |
Repo | |
Framework | |
You Tweet What You Speak: A City-Level Dataset of Arabic Dialects
Title | You Tweet What You Speak: A City-Level Dataset of Arabic Dialects |
Authors | Muhammad Abdul-Mageed, Hassan Alhuzali, Mohamed Elaraby |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1577/ |
https://www.aclweb.org/anthology/L18-1577 | |
PWC | https://paperswithcode.com/paper/you-tweet-what-you-speak-a-city-level-dataset |
Repo | |
Framework | |