Paper Group NANR 27
That’ll Do Fine!: A Coarse Lexical Resource for English-Hindi MT, Using Polylingual Topic Models
Title | That’ll Do Fine!: A Coarse Lexical Resource for English-Hindi MT, Using Polylingual Topic Models |
Authors | Diptesh Kanojia, Aditya Joshi, Pushpak Bhattacharyya, Mark James Carman |
Abstract | Parallel corpora are often injected with bilingual lexical resources for improved Indian language machine translation (MT). In the absence of such lexical resources, multilingual topic models have been used to create coarse lexical resources in the past, using a Cartesian product approach. Our results show that for morphologically rich languages like Hindi, the Cartesian product approach is detrimental for MT. We then present a novel 'sentential' approach to use this coarse lexical resource from a multilingual topic model. Our coarse lexical resource, when injected with a parallel corpus, outperforms a system trained using a parallel corpus and a good-quality lexical resource. As demonstrated by the quality of our coarse lexical resource and its benefit to MT, we believe that our sentential approach to create such a resource will help MT for resource-constrained languages. |
Tasks | Machine Translation, Topic Models |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1349/ |
PWC | https://paperswithcode.com/paper/thatll-do-fine-a-coarse-lexical-resource-for |
Repo | |
Framework | |
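The Cartesian product approach mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that each topic of a multilingual topic model yields a ranked list of top words per language, and that every English word is paired with every Hindi word of the same topic. The function name `cartesian_lexicon`, the `top_k` parameter, and the toy topics are hypothetical and are not taken from the paper.

```python
from itertools import product


def cartesian_lexicon(topics_en, topics_hi, top_k=3):
    """Pair every top-k English word with every top-k Hindi word of the same topic.

    topics_en and topics_hi are parallel lists: element i of each holds the
    ranked top words of topic i in that language (an assumption of this sketch).
    """
    pairs = set()
    for en_words, hi_words in zip(topics_en, topics_hi):
        # Cartesian product of the two word lists for this topic.
        pairs.update(product(en_words[:top_k], hi_words[:top_k]))
    return sorted(pairs)


# Toy, hand-made topics for illustration only (not data from the paper).
topics_en = [["water", "river"], ["school", "teacher"]]
topics_hi = [["पानी", "नदी"], ["विद्यालय", "शिक्षक"]]

lexicon = cartesian_lexicon(topics_en, topics_hi, top_k=2)
for en, hi in lexicon:
    print(en, hi)
```

With `top_k` words per language, each topic contributes up to `top_k * top_k` candidate pairs, and most of them (for example "water" paired with "नदी", "river") are not translations of each other. This is one plausible reading of why such a resource is described as coarse.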
Bidirectional LSTM for Named Entity Recognition in Twitter Messages
Title | Bidirectional LSTM for Named Entity Recognition in Twitter Messages |
Authors | Nut Limsopatham, Nigel Collier |
Abstract | In this paper, we present our approach for named entity recognition in Twitter messages that we used in our participation in the Named Entity Recognition in Twitter shared task at the COLING 2016 Workshop on Noisy User-generated Text (WNUT). The main challenge that we aim to tackle in our participation is the short, noisy and colloquial nature of tweets, which makes named entity recognition in Twitter messages a challenging task. In particular, we investigate an approach for dealing with this problem by enabling bidirectional long short-term memory (LSTM) to automatically learn orthographic features without requiring feature engineering. In comparison with other systems participating in the shared task, our system achieved the most effective performance on both the 'segmentation and categorisation' and the 'segmentation only' sub-tasks. |
Tasks | Feature Engineering, Named Entity Recognition |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3920/ |
PWC | https://paperswithcode.com/paper/bidirectional-lstm-for-named-entity |
Repo | |
Framework | |
Adding syntactic structure to bilingual terminology for improved domain adaptation
Title | Adding syntactic structure to bilingual terminology for improved domain adaptation |
Authors | Mikel Artetxe, Gorka Labaka, Chakaveh Saedi, João Rodrigues, João Silva, António Branco, Eneko Agirre |
Abstract | |
Tasks | Domain Adaptation, Machine Translation |
Published | 2016-10-01 |
URL | https://www.aclweb.org/anthology/W16-6405/ |
PWC | https://paperswithcode.com/paper/adding-syntactic-structure-to-bilingual |
Repo | |
Framework | |
Data Resource Acquisition from People at Various Stages of Cognitive Decline – Design and Exploration Considerations
Title | Data Resource Acquisition from People at Various Stages of Cognitive Decline – Design and Exploration Considerations |
Authors | Dimitrios Kokkinakis, Kristina Lundholm Fors, Arto Nordlund |
Abstract | |
Tasks | |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/W16-6104/ |
PWC | https://paperswithcode.com/paper/data-resource-acquisition-from-people-at |
Repo | |
Framework | |
Relating semantic similarity and semantic association to how humans label other people
Title | Relating semantic similarity and semantic association to how humans label other people |
Authors | Kenneth Joseph, Kathleen M. Carley |
Abstract | |
Tasks | Semantic Similarity, Semantic Textual Similarity, Topic Models |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/W16-5601/ |
PWC | https://paperswithcode.com/paper/relating-semantic-similarity-and-semantic |
Repo | |
Framework | |
A Vector Model for Type-Theoretical Semantics
Title | A Vector Model for Type-Theoretical Semantics |
Authors | Konstantin Sokolov |
Abstract | |
Tasks | Representation Learning |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-1627/ |
PWC | https://paperswithcode.com/paper/a-vector-model-for-type-theoretical-semantics |
Repo | |
Framework | |
Twitter at the Grammys: A Social Media Corpus for Entity Linking and Disambiguation
Title | Twitter at the Grammys: A Social Media Corpus for Entity Linking and Disambiguation |
Authors | Mark Dredze, Nicholas Andrews, Jay DeYoung |
Abstract | |
Tasks | Coreference Resolution, Entity Linking |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/W16-6204/ |
PWC | https://paperswithcode.com/paper/twitter-at-the-grammys-a-social-media-corpus |
Repo | |
Framework | |
Human versus Machine Attention in Document Classification: A Dataset with Crowdsourced Annotations
Title | Human versus Machine Attention in Document Classification: A Dataset with Crowdsourced Annotations |
Authors | Nikolaos Pappas, Andrei Popescu-Belis |
Abstract | |
Tasks | Document Classification, Multiple Instance Learning, Sentiment Analysis |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/W16-6213/ |
PWC | https://paperswithcode.com/paper/human-versus-machine-attention-in-document |
Repo | |
Framework | |
Integrating WordNet for Multiple Sense Embeddings in Vector Semantics
Title | Integrating WordNet for Multiple Sense Embeddings in Vector Semantics |
Authors | David Foley, Jugal Kalita |
Abstract | |
Tasks | Word Sense Disambiguation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-6302/ |
PWC | https://paperswithcode.com/paper/integrating-wordnet-for-multiple-sense |
Repo | |
Framework | |
Proceedings of the 12th Workshop on Multiword Expressions
Title | Proceedings of the 12th Workshop on Multiword Expressions |
Authors | |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-1800/ |
PWC | https://paperswithcode.com/paper/proceedings-of-the-12th-workshop-on-multiword |
Repo | |
Framework | |
The TALP–UPC Spanish–English WMT Biomedical Task: Bilingual Embeddings and Char-based Neural Language Model Rescoring in a Phrase-based System
Title | The TALP–UPC Spanish–English WMT Biomedical Task: Bilingual Embeddings and Char-based Neural Language Model Rescoring in a Phrase-based System |
Authors | Marta R. Costa-jussà, Cristina España-Bonet, Pranava Madhyastha, Carlos Escolano, José A. R. Fonollosa |
Abstract | |
Tasks | Language Modelling, Machine Translation, Word Embeddings |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2336/ |
PWC | https://paperswithcode.com/paper/the-talpaupc-spanishaenglish-wmt-biomedical |
Repo | |
Framework | |
網路新興語言「耍」之語意辨析:以批踢踢語料庫為本 (On the semantic analysis of the verb shua3 in Taiwan Mandarin: The PTT corpus-based study) [In Chinese]
Title | 網路新興語言「耍」之語意辨析:以批踢踢語料庫為本 (On the semantic analysis of the verb shua3 in Taiwan Mandarin: The PTT corpus-based study) [In Chinese] |
Authors | Hsueh-ying Hu, Siaw-Fong Chung |
Abstract | |
Tasks | |
Published | 2016-10-01 |
URL | https://www.aclweb.org/anthology/O16-1017/ |
PWC | https://paperswithcode.com/paper/c2e-eeae-ea1eae34-i14a1e-e-eaaocoon-the |
Repo | |
Framework | |
Exploring Topic Discriminating Power of Words in Latent Dirichlet Allocation
Title | Exploring Topic Discriminating Power of Words in Latent Dirichlet Allocation |
Authors | Kai Yang, Yi Cai, Zhenhong Chen, Ho-fung Leung, Raymond Lau |
Abstract | Latent Dirichlet Allocation (LDA) and its variants have been widely used to discover latent topics in textual documents. However, some of the topics generated by LDA may be noisy, with irrelevant words scattered across them. We call such words topic-indiscriminate words; they tend to make topics more ambiguous and less interpretable by humans. In our work, we propose a new topic model named TWLDA, which assigns low weights to words with low topic-discriminating power. Our experimental results show that the proposed approach, which effectively reduces the number of topic-indiscriminate words in discovered topics, improves the effectiveness of LDA. |
Tasks | Topic Models |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1211/ |
PWC | https://paperswithcode.com/paper/exploring-topic-discriminating-power-of-words |
Repo | |
Framework | |
MUSEEC: A Multilingual Text Summarization Tool
Title | MUSEEC: A Multilingual Text Summarization Tool |
Authors | Marina Litvak, Natalia Vanetik, Mark Last, Elena Churkin |
Abstract | |
Tasks | Abstractive Text Summarization, Document Summarization, Sentence Compression, Text Summarization |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-4013/ |
PWC | https://paperswithcode.com/paper/museec-a-multilingual-text-summarization-tool |
Repo | |
Framework | |
Learning to Identify Subjective Sentences
Title | Learning to Identify Subjective Sentences |
Authors | Girish K. Palshikar, Manoj Apte, Deepak Pandita, Vikram Singh |
Abstract | |
Tasks | Opinion Mining, Sentiment Analysis |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-6330/ |
PWC | https://paperswithcode.com/paper/learning-to-identify-subjective-sentences |
Repo | |
Framework | |