May 5, 2019

Paper Group NANR 27

That’ll Do Fine!: A Coarse Lexical Resource for English-Hindi MT, Using Polylingual Topic Models

Title That’ll Do Fine!: A Coarse Lexical Resource for English-Hindi MT, Using Polylingual Topic Models
Authors Diptesh Kanojia, Aditya Joshi, Pushpak Bhattacharyya, Mark James Carman
Abstract Parallel corpora are often injected with bilingual lexical resources for improved Indian language machine translation (MT). In the absence of such lexical resources, multilingual topic models have been used to create coarse lexical resources in the past, using a Cartesian product approach. Our results show that for morphologically rich languages like Hindi, the Cartesian product approach is detrimental for MT. We then present a novel 'sentential' approach to use this coarse lexical resource from a multilingual topic model. Our coarse lexical resource, when injected with a parallel corpus, outperforms a system trained using a parallel corpus and a good quality lexical resource. As demonstrated by the quality of our coarse lexical resource and its benefit to MT, we believe that our sentential approach to create such a resource will help MT for resource-constrained languages.
Tasks Machine Translation, Topic Models
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1349/
PDF https://www.aclweb.org/anthology/L16-1349
PWC https://paperswithcode.com/paper/thatll-do-fine-a-coarse-lexical-resource-for
Repo
Framework

Bidirectional LSTM for Named Entity Recognition in Twitter Messages

Title Bidirectional LSTM for Named Entity Recognition in Twitter Messages
Authors Nut Limsopatham, Nigel Collier
Abstract In this paper, we present our approach for named entity recognition in Twitter messages that we used in our participation in the Named Entity Recognition in Twitter shared task at the COLING 2016 Workshop on Noisy User-generated Text (WNUT). The main challenge that we aim to tackle in our participation is the short, noisy and colloquial nature of tweets, which makes named entity recognition in Twitter messages a challenging task. In particular, we investigate an approach for dealing with this problem by enabling bidirectional long short-term memory (LSTM) to automatically learn orthographic features without requiring feature engineering. In comparison with other systems participating in the shared task, our system achieved the most effective performance on both the 'segmentation and categorisation' and the 'segmentation only' sub-tasks.
Tasks Feature Engineering, Named Entity Recognition
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-3920/
PDF https://www.aclweb.org/anthology/W16-3920
PWC https://paperswithcode.com/paper/bidirectional-lstm-for-named-entity
Repo
Framework

Adding syntactic structure to bilingual terminology for improved domain adaptation

Title Adding syntactic structure to bilingual terminology for improved domain adaptation
Authors Mikel Artetxe, Gorka Labaka, Chakaveh Saedi, João Rodrigues, João Silva, António Branco, Eneko Agirre
Abstract
Tasks Domain Adaptation, Machine Translation
Published 2016-10-01
URL https://www.aclweb.org/anthology/W16-6405/
PDF https://www.aclweb.org/anthology/W16-6405
PWC https://paperswithcode.com/paper/adding-syntactic-structure-to-bilingual
Repo
Framework

Data Resource Acquisition from People at Various Stages of Cognitive Decline – Design and Exploration Considerations

Title Data Resource Acquisition from People at Various Stages of Cognitive Decline – Design and Exploration Considerations
Authors Dimitrios Kokkinakis, Kristina Lundholm Fors, Arto Nordlund
Abstract
Tasks
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-6104/
PDF https://www.aclweb.org/anthology/W16-6104
PWC https://paperswithcode.com/paper/data-resource-acquisition-from-people-at
Repo
Framework

Relating semantic similarity and semantic association to how humans label other people

Title Relating semantic similarity and semantic association to how humans label other people
Authors Kenneth Joseph, Kathleen M. Carley
Abstract
Tasks Semantic Similarity, Semantic Textual Similarity, Topic Models
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-5601/
PDF https://www.aclweb.org/anthology/W16-5601
PWC https://paperswithcode.com/paper/relating-semantic-similarity-and-semantic
Repo
Framework

A Vector Model for Type-Theoretical Semantics

Title A Vector Model for Type-Theoretical Semantics
Authors Konstantin Sokolov
Abstract
Tasks Representation Learning
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-1627/
PDF https://www.aclweb.org/anthology/W16-1627
PWC https://paperswithcode.com/paper/a-vector-model-for-type-theoretical-semantics
Repo
Framework

Twitter at the Grammys: A Social Media Corpus for Entity Linking and Disambiguation

Title Twitter at the Grammys: A Social Media Corpus for Entity Linking and Disambiguation
Authors Mark Dredze, Nicholas Andrews, Jay DeYoung
Abstract
Tasks Coreference Resolution, Entity Linking
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-6204/
PDF https://www.aclweb.org/anthology/W16-6204
PWC https://paperswithcode.com/paper/twitter-at-the-grammys-a-social-media-corpus
Repo
Framework

Human versus Machine Attention in Document Classification: A Dataset with Crowdsourced Annotations

Title Human versus Machine Attention in Document Classification: A Dataset with Crowdsourced Annotations
Authors Nikolaos Pappas, Andrei Popescu-Belis
Abstract
Tasks Document Classification, Multiple Instance Learning, Sentiment Analysis
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-6213/
PDF https://www.aclweb.org/anthology/W16-6213
PWC https://paperswithcode.com/paper/human-versus-machine-attention-in-document
Repo
Framework

Integrating WordNet for Multiple Sense Embeddings in Vector Semantics

Title Integrating WordNet for Multiple Sense Embeddings in Vector Semantics
Authors David Foley, Jugal Kalita
Abstract
Tasks Word Sense Disambiguation
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-6302/
PDF https://www.aclweb.org/anthology/W16-6302
PWC https://paperswithcode.com/paper/integrating-wordnet-for-multiple-sense
Repo
Framework

Proceedings of the 12th Workshop on Multiword Expressions

Title Proceedings of the 12th Workshop on Multiword Expressions
Authors
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-1800/
PDF https://www.aclweb.org/anthology/W16-1800
PWC https://paperswithcode.com/paper/proceedings-of-the-12th-workshop-on-multiword
Repo
Framework

The TALP–UPC Spanish–English WMT Biomedical Task: Bilingual Embeddings and Char-based Neural Language Model Rescoring in a Phrase-based System

Title The TALP–UPC Spanish–English WMT Biomedical Task: Bilingual Embeddings and Char-based Neural Language Model Rescoring in a Phrase-based System
Authors Marta R. Costa-jussà, Cristina España-Bonet, Pranava Madhyastha, Carlos Escolano, José A. R. Fonollosa
Abstract
Tasks Language Modelling, Machine Translation, Word Embeddings
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2336/
PDF https://www.aclweb.org/anthology/W16-2336
PWC https://paperswithcode.com/paper/the-talpaupc-spanishaenglish-wmt-biomedical
Repo
Framework

網路新興語言「耍」之語意辨析:以批踢踢語料庫為本 (On the semantic analysis of the verb shua3 in Taiwan Mandarin: The PTT corpus-based study) [In Chinese]

Title 網路新興語言「耍」之語意辨析:以批踢踢語料庫為本 (On the semantic analysis of the verb shua3 in Taiwan Mandarin: The PTT corpus-based study) [In Chinese]
Authors Hsueh-ying Hu, Siaw-Fong Chung
Abstract
Tasks
Published 2016-10-01
URL https://www.aclweb.org/anthology/O16-1017/
PDF https://www.aclweb.org/anthology/O16-1017
PWC https://paperswithcode.com/paper/c2e-eeae-ea1eae34-i14a1e-e-eaaocoon-the
Repo
Framework

Exploring Topic Discriminating Power of Words in Latent Dirichlet Allocation

Title Exploring Topic Discriminating Power of Words in Latent Dirichlet Allocation
Authors Kai Yang, Yi Cai, Zhenhong Chen, Ho-fung Leung, Raymond Lau
Abstract Latent Dirichlet Allocation (LDA) and its variants have been widely used to discover latent topics in textual documents. However, some of the topics generated by LDA may be noisy, with irrelevant words scattered across these topics. We call such words topic-indiscriminate words; they tend to make topics more ambiguous and less interpretable by humans. In our work, we propose a new topic model named TWLDA, which assigns low weights to words with low topic discriminating power (ability). Our experimental results show that the proposed approach, which effectively reduces the number of topic-indiscriminate words in discovered topics, improves the effectiveness of LDA.
Tasks Topic Models
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1211/
PDF https://www.aclweb.org/anthology/C16-1211
PWC https://paperswithcode.com/paper/exploring-topic-discriminating-power-of-words
Repo
Framework

MUSEEC: A Multilingual Text Summarization Tool

Title MUSEEC: A Multilingual Text Summarization Tool
Authors Marina Litvak, Natalia Vanetik, Mark Last, Elena Churkin
Abstract
Tasks Abstractive Text Summarization, Document Summarization, Sentence Compression, Text Summarization
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-4013/
PDF https://www.aclweb.org/anthology/P16-4013
PWC https://paperswithcode.com/paper/museec-a-multilingual-text-summarization-tool
Repo
Framework

Learning to Identify Subjective Sentences

Title Learning to Identify Subjective Sentences
Authors Girish K. Palshikar, Manoj Apte, Deepak Pandita, Vikram Singh
Abstract
Tasks Opinion Mining, Sentiment Analysis
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-6330/
PDF https://www.aclweb.org/anthology/W16-6330
PWC https://paperswithcode.com/paper/learning-to-identify-subjective-sentences
Repo
Framework