May 5, 2019

Paper Group NANR 140

An Investigation on The Effectiveness of Employing Topic Modeling Techniques to Provide Topic Awareness For Conversational Agents

Title An Investigation on The Effectiveness of Employing Topic Modeling Techniques to Provide Topic Awareness For Conversational Agents
Authors Omid Moradiannasab
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-3012/
PDF https://www.aclweb.org/anthology/P16-3012
PWC https://paperswithcode.com/paper/an-investigation-on-the-effectiveness-of
Repo
Framework

Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities

Title Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities
Authors Meritxell Fernández Barrera, Vladimir Popescu, Antonio Toral, Federico Gaspari, Khalid Choukri
Abstract This paper discusses the role that statistical machine translation (SMT) can play in the development of cross-border EU e-commerce, by highlighting extant obstacles and identifying relevant technologies to overcome them. In this sense, it firstly proposes a typology of e-commerce static and dynamic textual genres and it identifies those that may be more successfully targeted by SMT. The specific challenges concerning the automatic translation of user-generated content are discussed in detail. Secondly, the paper highlights the risk of data sparsity inherent to e-commerce and it explores the state-of-the-art strategies to achieve domain adequacy via adaptation. Thirdly, it proposes a robust workflow for the development of SMT systems adapted to the e-commerce domain by relying on inexpensive methods. Given the scarcity of user-generated language corpora for most language pairs, the paper proposes to obtain monolingual target-language data to train language models and aligned parallel corpora to tune and evaluate MT systems by means of crowdsourcing.
Tasks Machine Translation
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1721/
PDF https://www.aclweb.org/anthology/L16-1721
PWC https://paperswithcode.com/paper/enhancing-cross-border-eu-e-commerce-through
Repo
Framework

Quality Estimation for Language Output Applications

Title Quality Estimation for Language Output Applications
Authors Carolina Scarton, Gustavo Paetzold, Lucia Specia
Abstract Quality Estimation (QE) of language output applications is a research area that has been attracting significant attention. The goal of QE is to estimate the quality of language output applications without the need for human references. Instead, machine learning algorithms are used to build supervised models based on a few labelled training instances. Such models are able to generalise over unseen data and thus QE is a robust method applicable to scenarios where human input is not available or possible. One such scenario where QE is particularly appealing is that of Machine Translation, where a score for predicted quality can help decide whether or not a translation is useful (e.g. for post-editing) or reliable (e.g. for gisting). Other potential applications within Natural Language Processing (NLP) include Text Summarisation and Text Simplification. In this tutorial we present the task of QE and its application in NLP, focusing on Machine Translation. We also introduce QuEst++, a toolkit for QE that encompasses feature extraction and machine learning, and propose a practical activity to extend this toolkit in various ways.
Tasks Machine Translation, Multi-Task Learning, Text Simplification
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-3004/
PDF https://www.aclweb.org/anthology/C16-3004
PWC https://paperswithcode.com/paper/quality-estimation-for-language-output
Repo
Framework

Combination of Convolutional and Recurrent Neural Network for Sentiment Analysis of Short Texts

Title Combination of Convolutional and Recurrent Neural Network for Sentiment Analysis of Short Texts
Authors Xingyou Wang, Weijie Jiang, Zhiyong Luo
Abstract Sentiment analysis of short texts is challenging because of the limited contextual information they usually contain. In recent years, deep learning models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been applied to text sentiment analysis with comparatively remarkable results. In this paper, we describe a jointed CNN and RNN architecture, taking advantage of the coarse-grained local features generated by CNN and long-distance dependencies learned via RNN for sentiment analysis of short texts. Experimental results show an obvious improvement upon the state-of-the-art on three benchmark corpora, MR, SST1 and SST2, with 82.28%, 51.50% and 89.95% accuracy, respectively.
Tasks Information Retrieval, Sentiment Analysis, Speech Recognition, Word Embeddings
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1229/
PDF https://www.aclweb.org/anthology/C16-1229
PWC https://paperswithcode.com/paper/combination-of-convolutional-and-recurrent
Repo
Framework

Adjusting Word Embeddings with Semantic Intensity Orders

Title Adjusting Word Embeddings with Semantic Intensity Orders
Authors Joo-Kyung Kim, Marie-Catherine de Marneffe, Eric Fosler-Lussier
Abstract
Tasks Representation Learning, Word Embeddings
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-1607/
PDF https://www.aclweb.org/anthology/W16-1607
PWC https://paperswithcode.com/paper/adjusting-word-embeddings-with-semantic
Repo
Framework

Assisting Discussion Forum Users using Deep Recurrent Neural Networks

Title Assisting Discussion Forum Users using Deep Recurrent Neural Networks
Authors Jacob Hagstedt P Suorra, Olof Mogren
Abstract
Tasks Representation Learning
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-1606/
PDF https://www.aclweb.org/anthology/W16-1606
PWC https://paperswithcode.com/paper/assisting-discussion-forum-users-using-deep
Repo
Framework

Entity Disambiguation by Knowledge and Text Jointly Embedding

Title Entity Disambiguation by Knowledge and Text Jointly Embedding
Authors Wei Fang, Jianwen Zhang, Dilin Wang, Zheng Chen, Ming Li
Abstract
Tasks Entity Disambiguation
Published 2016-08-01
URL https://www.aclweb.org/anthology/K16-1026/
PDF https://www.aclweb.org/anthology/K16-1026
PWC https://paperswithcode.com/paper/entity-disambiguation-by-knowledge-and-text
Repo
Framework

Chinese Tense Labelling and Causal Analysis

Title Chinese Tense Labelling and Causal Analysis
Authors Hen-Hsen Huang, Chang-Rui Yang, Hsin-Hsi Chen
Abstract This paper explores the role of tense information in Chinese causal analysis. Experiments on both causal type classification and causal directionality identification show the significant improvement gained from tense features. To automatically extract the tense features, a Chinese tense predictor is proposed. Based on a large amount of parallel data, our semi-supervised approach improves the dependency-based convolutional neural network (DCNN) models for Chinese tense labelling and thus the causal analysis.
Tasks Question Answering
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1210/
PDF https://www.aclweb.org/anthology/C16-1210
PWC https://paperswithcode.com/paper/chinese-tense-labelling-and-causal-analysis
Repo
Framework

Domain Adaptation for Authorship Attribution: Improved Structural Correspondence Learning

Title Domain Adaptation for Authorship Attribution: Improved Structural Correspondence Learning
Authors Upendra Sapkota, Thamar Solorio, Manuel Montes, Steven Bethard
Abstract
Tasks Dimensionality Reduction, Domain Adaptation
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1210/
PDF https://www.aclweb.org/anthology/P16-1210
PWC https://paperswithcode.com/paper/domain-adaptation-for-authorship-attribution
Repo
Framework

Decoding Anagrammed Texts Written in an Unknown Language and Script

Title Decoding Anagrammed Texts Written in an Unknown Language and Script
Authors Bradley Hauer, Grzegorz Kondrak
Abstract Algorithmic decipherment is a prime example of a truly unsupervised problem. The first step in the decipherment process is the identification of the encrypted language. We propose three methods for determining the source language of a document enciphered with a monoalphabetic substitution cipher. The best method achieves 97% accuracy on 380 languages. We then present an approach to decoding anagrammed substitution ciphers, in which the letters within words have been arbitrarily transposed. It obtains the average decryption word accuracy of 93% on a set of 50 ciphertexts in 5 languages. Finally, we report the results on the Voynich manuscript, an unsolved fifteenth century cipher, which suggest Hebrew as the language of the document.
Tasks Language Identification, Optical Character Recognition, Transliteration
Published 2016-01-01
URL https://www.aclweb.org/anthology/Q16-1006/
PDF https://www.aclweb.org/anthology/Q16-1006
PWC https://paperswithcode.com/paper/decoding-anagrammed-texts-written-in-an
Repo
Framework

Graph-Based Induction of Word Senses in Croatian

Title Graph-Based Induction of Word Senses in Croatian
Authors Marko Bekavac, Jan Šnajder
Abstract Word sense induction (WSI) seeks to induce senses of words from unannotated corpora. In this paper, we address the WSI task for the Croatian language. We adopt the word clustering approach based on co-occurrence graphs, in which senses are taken to correspond to strongly inter-connected components of co-occurring words. We experiment with a number of graph construction techniques and clustering algorithms, and evaluate the sense inventories both as a clustering problem and extrinsically on a word sense disambiguation (WSD) task. In the cluster-based evaluation, Chinese Whispers algorithm outperformed Markov Clustering, yielding a normalized mutual information score of 64.3. In contrast, in WSD evaluation Markov Clustering performed better, yielding an accuracy of about 75%. We are making available two induced sense inventories of 10,000 most frequent Croatian words: one coarse-grained and one fine-grained inventory, both obtained using the Markov Clustering algorithm.
Tasks graph construction, Word Sense Disambiguation, Word Sense Induction
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1481/
PDF https://www.aclweb.org/anthology/L16-1481
PWC https://paperswithcode.com/paper/graph-based-induction-of-word-senses-in
Repo
Framework
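
The abstract above describes clustering a co-occurrence graph so that densely connected groups of context words stand for word senses. As a rough illustration only, here is a minimal sketch of the Chinese Whispers label-propagation idea on a toy graph. The function name `chinese_whispers`, the toy English context words, the edge weights, and the tie-breaking rule are all assumptions made for this example; the paper's own graph construction and parameters are not given in the abstract.

```python
import random
from collections import defaultdict


def chinese_whispers(edges, iterations=10, seed=0):
    """Cluster an undirected weighted graph; returns {node: cluster label}.

    edges is a list of (node_a, node_b, weight) tuples (hypothetical format).
    """
    rng = random.Random(seed)
    neighbours = defaultdict(dict)
    for a, b, weight in edges:
        neighbours[a][b] = weight
        neighbours[b][a] = weight

    # Every node starts in its own cluster, labelled by its own name.
    labels = {node: node for node in neighbours}
    nodes = list(neighbours)

    for _ in range(iterations):
        rng.shuffle(nodes)  # nodes are visited in random order each pass
        for node in nodes:
            scores = defaultdict(float)
            for other, weight in neighbours[node].items():
                scores[labels[other]] += weight
            # Adopt the label with the highest total edge weight among the
            # neighbours; ties are broken by label (an assumption of this sketch).
            labels[node] = max(scores, key=lambda lab: (scores[lab], lab))
    return labels


# Toy co-occurrence graph around an ambiguous word such as "mouse":
# one group of animal contexts and one group of device contexts.
edges = [
    ("cat", "cheese", 1.0), ("cheese", "tail", 1.0), ("cat", "tail", 1.0),
    ("keyboard", "screen", 1.0), ("screen", "click", 1.0), ("keyboard", "click", 1.0),
]
senses = chinese_whispers(edges)
```

In this toy graph the two groups share no edges, so labels can only spread within a group and each group ends up with a single shared label, which plays the role of one induced sense.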

Learning Cross-lingual Representations with Matrix Factorization

Title Learning Cross-lingual Representations with Matrix Factorization
Authors Hanan Aldarmaki, Mona Diab
Abstract
Tasks Cross-Lingual Document Classification, Cross-Lingual Semantic Textual Similarity, Document Classification, Machine Translation, Question Answering, Semantic Textual Similarity, Sentence Embeddings, Word Embeddings
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-1201/
PDF https://www.aclweb.org/anthology/W16-1201
PWC https://paperswithcode.com/paper/learning-cross-lingual-representations-with
Repo
Framework

Pair Distance Distribution: A Model of Semantic Representation

Title Pair Distance Distribution: A Model of Semantic Representation
Authors Yonatan Ramni, Oded Maimon, Evgeni Khmelnitsky
Abstract
Tasks Dimensionality Reduction, Representation Learning, Semantic Textual Similarity
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-1621/
PDF https://www.aclweb.org/anthology/W16-1621
PWC https://paperswithcode.com/paper/pair-distance-distribution-a-model-of
Repo
Framework

A Corpus of Preposition Supersenses

Title A Corpus of Preposition Supersenses
Authors Nathan Schneider, Jena D. Hwang, Vivek Srikumar, Meredith Green, Abhijit Suresh, Kathryn Conger, Tim O'Gorman, Martha Palmer
Abstract
Tasks Machine Translation, Semantic Parsing
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-1712/
PDF https://www.aclweb.org/anthology/W16-1712
PWC https://paperswithcode.com/paper/a-corpus-of-preposition-supersenses
Repo
Framework

ELMD: An Automatically Generated Entity Linking Gold Standard Dataset in the Music Domain

Title ELMD: An Automatically Generated Entity Linking Gold Standard Dataset in the Music Domain
Authors Sergio Oramas, Luis Espinosa Anke, Mohamed Sordo, Horacio Saggion, Xavier Serra
Abstract In this paper we present a gold standard dataset for Entity Linking (EL) in the Music Domain. It contains thousands of musical named entities such as Artist, Song or Record Label, which have been automatically annotated on a set of artist biographies coming from the Music website and social network Last.fm. The annotation process relies on the analysis of the hyperlinks present in the source texts and in a voting-based algorithm for EL, which considers, for each entity mention in text, the degree of agreement across three state-of-the-art EL systems. Manual evaluation shows that EL Precision is at least 94%, and due to its tunable nature, it is possible to derive annotations favouring higher Precision or Recall, at will. We make available the annotated dataset along with evaluation data and the code.
Tasks Entity Linking
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1528/
PDF https://www.aclweb.org/anthology/L16-1528
PWC https://paperswithcode.com/paper/elmd-an-automatically-generated-entity
Repo
Framework
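
The ELMD abstract mentions a voting scheme over three EL systems whose agreement level can be tuned towards Precision or Recall. The following is a minimal sketch of that general idea, not the authors' implementation: the function `vote`, the `min_agreement` parameter, and the sample mentions and entity identifiers are hypothetical, and the hyperlink analysis step is left out entirely.

```python
from collections import Counter


def vote(mention_predictions, min_agreement=2):
    """Keep a link only if enough systems agree on the same entity.

    mention_predictions maps each mention to a list with one prediction per
    EL system (None when a system links nothing). Hypothetical data format.
    """
    linked = {}
    for mention, candidates in mention_predictions.items():
        votes = Counter(c for c in candidates if c is not None)
        if not votes:
            continue  # no system proposed an entity for this mention
        entity, count = votes.most_common(1)[0]
        if count >= min_agreement:
            linked[mention] = entity
    return linked


# Made-up outputs of three systems for three mentions.
predictions = {
    "Blur": ["Blur_(band)", "Blur_(band)", "Motion_blur"],
    "Parklife": ["Parklife", "Parklife", "Parklife"],
    "Food": ["Food_Records", None, "Food"],
}

majority = vote(predictions, min_agreement=2)   # two of three must agree
unanimous = vote(predictions, min_agreement=3)  # all three must agree
```

Raising `min_agreement` keeps fewer but more reliable links, which favours Precision; lowering it keeps more links, which favours Recall.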