May 5, 2019


Paper Group NANR 140

An Investigation on The Effectiveness of Employing Topic Modeling Techniques to Provide Topic Awareness For Conversational Agents. Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities. Quality Estimation for Language Output Applications. Combination of Convolutional and Recurrent …

An Investigation on The Effectiveness of Employing Topic Modeling Techniques to Provide Topic Awareness For Conversational Agents


Title	An Investigation on The Effectiveness of Employing Topic Modeling Techniques to Provide Topic Awareness For Conversational Agents
Authors	Omid Moradiannasab
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-3012/
PDF	https://www.aclweb.org/anthology/P16-3012
PWC	https://paperswithcode.com/paper/an-investigation-on-the-effectiveness-of
Repo
Framework

Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities


Title	Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities
Authors	Meritxell Fernández Barrera, Vladimir Popescu, Antonio Toral, Federico Gaspari, Khalid Choukri
Abstract	This paper discusses the role that statistical machine translation (SMT) can play in the development of cross-border EU e-commerce, by highlighting extant obstacles and identifying relevant technologies to overcome them. In this sense, it firstly proposes a typology of e-commerce static and dynamic textual genres and it identifies those that may be more successfully targeted by SMT. The specific challenges concerning the automatic translation of user-generated content are discussed in detail. Secondly, the paper highlights the risk of data sparsity inherent to e-commerce and it explores the state-of-the-art strategies to achieve domain adequacy via adaptation. Thirdly, it proposes a robust workflow for the development of SMT systems adapted to the e-commerce domain by relying on inexpensive methods. Given the scarcity of user-generated language corpora for most language pairs, the paper proposes to obtain monolingual target-language data to train language models and aligned parallel corpora to tune and evaluate MT systems by means of crowdsourcing.
Tasks	Machine Translation
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1721/
PDF	https://www.aclweb.org/anthology/L16-1721
PWC	https://paperswithcode.com/paper/enhancing-cross-border-eu-e-commerce-through
Repo
Framework

Quality Estimation for Language Output Applications


Title	Quality Estimation for Language Output Applications
Authors	Carolina Scarton, Gustavo Paetzold, Lucia Specia
Abstract	Quality Estimation (QE) of language output applications is a research area that has been attracting significant attention. The goal of QE is to estimate the quality of language output applications without the need for human references. Instead, machine learning algorithms are used to build supervised models based on a few labelled training instances. Such models are able to generalise over unseen data and thus QE is a robust method applicable to scenarios where human input is not available or possible. One such scenario where QE is particularly appealing is that of Machine Translation, where a score for predicted quality can help decide whether or not a translation is useful (e.g. for post-editing) or reliable (e.g. for gisting). Other potential applications within Natural Language Processing (NLP) include Text Summarisation and Text Simplification. In this tutorial we present the task of QE and its application in NLP, focusing on Machine Translation. We also introduce QuEst++, a toolkit for QE that encompasses feature extraction and machine learning, and propose a practical activity to extend this toolkit in various ways.
Tasks	Machine Translation, Multi-Task Learning, Text Simplification
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-3004/
PDF	https://www.aclweb.org/anthology/C16-3004
PWC	https://paperswithcode.com/paper/quality-estimation-for-language-output
Repo
Framework

Combination of Convolutional and Recurrent Neural Network for Sentiment Analysis of Short Texts


Title	Combination of Convolutional and Recurrent Neural Network for Sentiment Analysis of Short Texts
Authors	Xingyou Wang, Weijie Jiang, Zhiyong Luo
Abstract	Sentiment analysis of short texts is challenging because of the limited contextual information they usually contain. In recent years, deep learning models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been applied to text sentiment analysis with comparatively remarkable results. In this paper, we describe a jointed CNN and RNN architecture, taking advantage of the coarse-grained local features generated by CNN and long-distance dependencies learned via RNN for sentiment analysis of short texts. Experimental results show an obvious improvement upon the state-of-the-art on three benchmark corpora, MR, SST1 and SST2, with 82.28%, 51.50% and 89.95% accuracy, respectively.
Tasks	Information Retrieval, Sentiment Analysis, Speech Recognition, Word Embeddings
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1229/
PDF	https://www.aclweb.org/anthology/C16-1229
PWC	https://paperswithcode.com/paper/combination-of-convolutional-and-recurrent
Repo
Framework

Adjusting Word Embeddings with Semantic Intensity Orders


Title	Adjusting Word Embeddings with Semantic Intensity Orders
Authors	Joo-Kyung Kim, Marie-Catherine de Marneffe, Eric Fosler-Lussier
Abstract
Tasks	Representation Learning, Word Embeddings
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-1607/
PDF	https://www.aclweb.org/anthology/W16-1607
PWC	https://paperswithcode.com/paper/adjusting-word-embeddings-with-semantic
Repo
Framework

Assisting Discussion Forum Users using Deep Recurrent Neural Networks


Title	Assisting Discussion Forum Users using Deep Recurrent Neural Networks
Authors	Jacob Hagstedt P Suorra, Olof Mogren
Abstract
Tasks	Representation Learning
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-1606/
PDF	https://www.aclweb.org/anthology/W16-1606
PWC	https://paperswithcode.com/paper/assisting-discussion-forum-users-using-deep
Repo
Framework

Entity Disambiguation by Knowledge and Text Jointly Embedding


Title	Entity Disambiguation by Knowledge and Text Jointly Embedding
Authors	Wei Fang, Jianwen Zhang, Dilin Wang, Zheng Chen, Ming Li
Abstract
Tasks	Entity Disambiguation
Published	2016-08-01
URL	https://www.aclweb.org/anthology/K16-1026/
PDF	https://www.aclweb.org/anthology/K16-1026
PWC	https://paperswithcode.com/paper/entity-disambiguation-by-knowledge-and-text
Repo
Framework

Chinese Tense Labelling and Causal Analysis


Title	Chinese Tense Labelling and Causal Analysis
Authors	Hen-Hsen Huang, Chang-Rui Yang, Hsin-Hsi Chen
Abstract	This paper explores the role of tense information in Chinese causal analysis. Experiments on both causal type classification and causal directionality identification show a significant improvement gained from tense features. To automatically extract the tense features, a Chinese tense predictor is proposed. Based on a large amount of parallel data, our semi-supervised approach improves the dependency-based convolutional neural network (DCNN) models for Chinese tense labelling and thus the causal analysis.
Tasks	Question Answering
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1210/
PDF	https://www.aclweb.org/anthology/C16-1210
PWC	https://paperswithcode.com/paper/chinese-tense-labelling-and-causal-analysis
Repo
Framework

Domain Adaptation for Authorship Attribution: Improved Structural Correspondence Learning


Title	Domain Adaptation for Authorship Attribution: Improved Structural Correspondence Learning
Authors	Upendra Sapkota, Thamar Solorio, Manuel Montes, Steven Bethard
Abstract
Tasks	Dimensionality Reduction, Domain Adaptation
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-1210/
PDF	https://www.aclweb.org/anthology/P16-1210
PWC	https://paperswithcode.com/paper/domain-adaptation-for-authorship-attribution
Repo
Framework

Decoding Anagrammed Texts Written in an Unknown Language and Script


Title	Decoding Anagrammed Texts Written in an Unknown Language and Script
Authors	Bradley Hauer, Grzegorz Kondrak
Abstract	Algorithmic decipherment is a prime example of a truly unsupervised problem. The first step in the decipherment process is the identification of the encrypted language. We propose three methods for determining the source language of a document enciphered with a monoalphabetic substitution cipher. The best method achieves 97% accuracy on 380 languages. We then present an approach to decoding anagrammed substitution ciphers, in which the letters within words have been arbitrarily transposed. It obtains an average decryption word accuracy of 93% on a set of 50 ciphertexts in 5 languages. Finally, we report the results on the Voynich manuscript, an unsolved fifteenth century cipher, which suggest Hebrew as the language of the document.
Tasks	Language Identification, Optical Character Recognition, Transliteration
Published	2016-01-01
URL	https://www.aclweb.org/anthology/Q16-1006/
PDF	https://www.aclweb.org/anthology/Q16-1006
PWC	https://paperswithcode.com/paper/decoding-anagrammed-texts-written-in-an
Repo
Framework
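
The abstract above notes that language identification comes first when the text is enciphered with a monoalphabetic substitution cipher. One simple signal that survives both substitution and within-word anagramming is the sorted symbol-frequency profile. The sketch below illustrates that property only; it is not necessarily one of the three methods the paper proposes, and the function names and sample texts are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def frequency_profile(text: str) -> List[float]:
    """Relative symbol frequencies, sorted in descending order.

    Sorting discards symbol identity, so a one-to-one substitution
    (or shuffling letters inside words) leaves the profile unchanged.
    """
    symbols = [ch for ch in text if not ch.isspace()]
    counts = Counter(symbols)
    total = len(symbols)
    return sorted((c / total for c in counts.values()), reverse=True)


def profile_distance(p: List[float], q: List[float]) -> float:
    """Sum of absolute differences, padding the shorter profile with zeros."""
    n = max(len(p), len(q))
    p = p + [0.0] * (n - len(p))
    q = q + [0.0] * (n - len(q))
    return sum(abs(a - b) for a, b in zip(p, q))


def guess_language(ciphertext: str, samples: Dict[str, str]) -> str:
    """Return the key of the sample text whose profile is closest (hypothetical helper)."""
    target = frequency_profile(ciphertext)
    return min(
        samples,
        key=lambda lang: profile_distance(target, frequency_profile(samples[lang])),
    )
```

In practice `samples` would map language names to reasonably long reference texts; short samples give noisy profiles, so this should be read as a baseline idea rather than a competitive method.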

Graph-Based Induction of Word Senses in Croatian


Title	Graph-Based Induction of Word Senses in Croatian
Authors	Marko Bekavac, Jan Šnajder
Abstract	Word sense induction (WSI) seeks to induce senses of words from unannotated corpora. In this paper, we address the WSI task for the Croatian language. We adopt the word clustering approach based on co-occurrence graphs, in which senses are taken to correspond to strongly inter-connected components of co-occurring words. We experiment with a number of graph construction techniques and clustering algorithms, and evaluate the sense inventories both as a clustering problem and extrinsically on a word sense disambiguation (WSD) task. In the cluster-based evaluation, the Chinese Whispers algorithm outperformed Markov Clustering, yielding a normalized mutual information score of 64.3. In contrast, in WSD evaluation Markov Clustering performed better, yielding an accuracy of about 75%. We are making available two induced sense inventories of the 10,000 most frequent Croatian words: one coarse-grained and one fine-grained inventory, both obtained using the Markov Clustering algorithm.
Tasks	Graph Construction, Word Sense Disambiguation, Word Sense Induction
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1481/
PDF	https://www.aclweb.org/anthology/L16-1481
PWC	https://paperswithcode.com/paper/graph-based-induction-of-word-senses-in
Repo
Framework

Learning Cross-lingual Representations with Matrix Factorization


Title	Learning Cross-lingual Representations with Matrix Factorization
Authors	Hanan Aldarmaki, Mona Diab
Abstract
Tasks	Cross-Lingual Document Classification, Cross-Lingual Semantic Textual Similarity, Document Classification, Machine Translation, Question Answering, Semantic Textual Similarity, Sentence Embeddings, Word Embeddings
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-1201/
PDF	https://www.aclweb.org/anthology/W16-1201
PWC	https://paperswithcode.com/paper/learning-cross-lingual-representations-with
Repo
Framework

Pair Distance Distribution: A Model of Semantic Representation


Title	Pair Distance Distribution: A Model of Semantic Representation
Authors	Yonatan Ramni, Oded Maimon, Evgeni Khmelnitsky
Abstract
Tasks	Dimensionality Reduction, Representation Learning, Semantic Textual Similarity
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-1621/
PDF	https://www.aclweb.org/anthology/W16-1621
PWC	https://paperswithcode.com/paper/pair-distance-distribution-a-model-of
Repo
Framework

A Corpus of Preposition Supersenses


Title	A Corpus of Preposition Supersenses
Authors	Nathan Schneider, Jena D. Hwang, Vivek Srikumar, Meredith Green, Abhijit Suresh, Kathryn Conger, Tim O'Gorman, Martha Palmer
Abstract
Tasks	Machine Translation, Semantic Parsing
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-1712/
PDF	https://www.aclweb.org/anthology/W16-1712
PWC	https://paperswithcode.com/paper/a-corpus-of-preposition-supersenses
Repo
Framework

ELMD: An Automatically Generated Entity Linking Gold Standard Dataset in the Music Domain


Title	ELMD: An Automatically Generated Entity Linking Gold Standard Dataset in the Music Domain
Authors	Sergio Oramas, Luis Espinosa Anke, Mohamed Sordo, Horacio Saggion, Xavier Serra
Abstract	In this paper we present a gold standard dataset for Entity Linking (EL) in the Music Domain. It contains thousands of musical named entities such as Artist, Song or Record Label, which have been automatically annotated on a set of artist biographies coming from the Music website and social network Last.fm. The annotation process relies on the analysis of the hyperlinks present in the source texts and on a voting-based algorithm for EL, which considers, for each entity mention in text, the degree of agreement across three state-of-the-art EL systems. Manual evaluation shows that EL Precision is at least 94%, and due to its tunable nature, it is possible to derive annotations favouring higher Precision or Recall, at will. We make available the annotated dataset along with evaluation data and the code.
Tasks	Entity Linking
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1528/
PDF	https://www.aclweb.org/anthology/L16-1528
PWC	https://paperswithcode.com/paper/elmd-an-automatically-generated-entity
Repo
Framework
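
The agreement-based voting mentioned in the ELMD abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' released code: the function name `vote_entity`, the `min_agreement` parameter, and the one-candidate-per-system interface are all hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def vote_entity(
    predictions: Sequence[Optional[str]], min_agreement: int = 2
) -> Optional[str]:
    """Return the entity proposed by at least `min_agreement` systems, else None.

    `predictions` holds one candidate per EL system for a single mention;
    None means that system produced no link (assumed interface).
    """
    votes = Counter(p for p in predictions if p is not None)
    if not votes:
        return None
    entity, count = votes.most_common(1)[0]
    return entity if count >= min_agreement else None
```

With three systems, `min_agreement=3` keeps only unanimous links and so favours precision, while `min_agreement=1` accepts any proposed link and favours recall, which mirrors the tunable trade-off the abstract describes.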