May 4, 2019

Paper Group NANR 174

FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies. Using Term Position Similarity and Language Modeling for Bilingual Document Alignment. A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings. A Personalized Markov Clustering and Deep Learning Approach for Arabic Text Categorization. …

FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies


Title	FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies
Authors	Milan Dojchinovski, Felix Sasaki, Tatjana Gornostaja, Sebastian Hellmann, Erik Mannens, Frank Salliau, Michele Osella, Phil Ritchie, Giannis Stoitsis, Kevin Koidl, Markus Ackermann, Nilesh Chakraborty
Abstract	In the recent years, Linked Data and Language Technology solutions gained popularity. Nevertheless, their coupling in real-world business is limited due to several issues. Existing products and services are developed for a particular domain, can be used only in combination with already integrated datasets or their language coverage is limited. In this paper, we present an innovative solution FREME - an open framework of e-Services for multilingual and semantic enrichment of digital content. The framework integrates six interoperable e-Services. We describe the core features of each e-Service and illustrate their usage in the context of four business cases: i) authoring and publishing; ii) translation and localisation; iii) cross-lingual access to data; and iv) personalised Web content recommendations. Business cases drive the design and development of the framework.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1660/
PDF	https://www.aclweb.org/anthology/L16-1660
PWC	https://paperswithcode.com/paper/freme-multilingual-semantic-enrichment-with
Repo
Framework

Using Term Position Similarity and Language Modeling for Bilingual Document Alignment


Title	Using Term Position Similarity and Language Modeling for Bilingual Document Alignment
Authors	Thanh C. Le, Hoa Trong Vu, Jonathan Oberländer, Ondřej Bojar
Abstract
Tasks	Information Retrieval, Language Modelling, Machine Translation
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2371/
PDF	https://www.aclweb.org/anthology/W16-2371
PWC	https://paperswithcode.com/paper/using-term-position-similarity-and-language
Repo
Framework

A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings


Title	A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings
Authors	Aitor García Pablos, Montse Cuadros, German Rigau
Abstract	A key point in Sentiment Analysis is to determine the polarity of the sentiment implied by a certain word or expression. In basic Sentiment Analysis systems this sentiment polarity of the words is accounted and weighted in different ways to provide a degree of positivity/negativity. Currently words are also modelled as continuous dense vectors, known as word embeddings, which seem to encode interesting semantic knowledge. With regard to Sentiment Analysis, word embeddings are used as features to more complex supervised classification systems to obtain sentiment classifiers. In this paper we compare a set of existing sentiment lexicons and sentiment lexicon generation techniques. We also show a simple but effective technique to calculate a word polarity value for each word in a domain using existing continuous word embeddings generation methods. Further, we also show that word embeddings calculated on in-domain corpus capture the polarity better than the ones calculated on general-domain corpus.
Tasks	Sentiment Analysis, Word Embeddings
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1009/
PDF	https://www.aclweb.org/anthology/L16-1009
PWC	https://paperswithcode.com/paper/a-comparison-of-domain-based-word-polarity
Repo
Framework

A Personalized Markov Clustering and Deep Learning Approach for Arabic Text Categorization


Title	A Personalized Markov Clustering and Deep Learning Approach for Arabic Text Categorization
Authors	Vasu Jindal
Abstract
Tasks	Text Categorization, Text Classification
Published	2016-08-01
URL	https://www.aclweb.org/anthology/papers/P16-3022/p16-3022
PDF	https://www.aclweb.org/anthology/P16-3022v2
PWC	https://paperswithcode.com/paper/a-personalized-markov-clustering-and-deep
Repo
Framework

Interactive Relation Extraction in Main Memory Database Systems


Title	Interactive Relation Extraction in Main Memory Database Systems
Authors	Rudolf Schneider, Cordula Guder, Torsten Kilias, Alexander Löser, Jens Graupmann, Oleksandr Kozachuk
Abstract	We present INDREX-MM, a main memory database system for interactively executing two interwoven tasks, declarative relation extraction from text and their exploitation with SQL. INDREX-MM simplifies these tasks for the user with powerful SQL extensions for gathering statistical semantics, for executing open information extraction and for integrating relation candidates with domain specific data. We demonstrate these functions on 800k documents from Reuters RCV1 with more than a billion linguistic annotations and report execution times in the order of seconds.
Tasks	Open Information Extraction, Relation Extraction
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-2022/
PDF	https://www.aclweb.org/anthology/C16-2022
PWC	https://paperswithcode.com/paper/interactive-relation-extraction-in-main
Repo
Framework

Learning Phone Embeddings for Word Segmentation of Child-Directed Speech


Title	Learning Phone Embeddings for Word Segmentation of Child-Directed Speech
Authors	Jianqiang Ma, Çağrı Çöltekin, Erhard Hinrichs
Abstract
Tasks	Chinese Word Segmentation, Language Acquisition
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-1908/
PDF	https://www.aclweb.org/anthology/W16-1908
PWC	https://paperswithcode.com/paper/learning-phone-embeddings-for-word
Repo
Framework

Community Detection on Evolving Graphs


Title	Community Detection on Evolving Graphs
Authors	Aris Anagnostopoulos, Jakub Łącki, Silvio Lattanzi, Stefano Leonardi, Mohammad Mahdian
Abstract	Clustering is a fundamental step in many information-retrieval and data-mining applications. Detecting clusters in graphs is also a key tool for finding the community structure in social and behavioral networks. In many of these applications, the input graph evolves over time in a continual and decentralized manner, and, to maintain a good clustering, the clustering algorithm needs to repeatedly probe the graph. Furthermore, there are often limitations on the frequency of such probes, either imposed explicitly by the online platform (e.g., in the case of crawling proprietary social networks like twitter) or implicitly because of resource limitations (e.g., in the case of crawling the web). In this paper, we study a model of clustering on evolving graphs that captures this aspect of the problem. Our model is based on the classical stochastic block model, which has been used to assess rigorously the quality of various static clustering methods. In our model, the algorithm is supposed to reconstruct the planted clustering, given the ability to query for small pieces of local information about the graph, at a limited rate. We design and analyze clustering algorithms that work in this model, and show asymptotically tight upper and lower bounds on their accuracy. Finally, we perform simulations, which demonstrate that our main asymptotic results hold true also in practice.
Tasks	Community Detection, Information Retrieval
Published	2016-12-01
URL	http://papers.nips.cc/paper/6173-community-detection-on-evolving-graphs
PDF	http://papers.nips.cc/paper/6173-community-detection-on-evolving-graphs.pdf
PWC	https://paperswithcode.com/paper/community-detection-on-evolving-graphs
Repo
Framework

Refactoring the Genia Event Extraction Shared Task Toward a General Framework for IE-Driven KB Development


Title	Refactoring the Genia Event Extraction Shared Task Toward a General Framework for IE-Driven KB Development
Authors	Jin-Dong Kim, Yue Wang, Nicola Colic, Seung Han Beak, Yong Hwan Kim, Min Song
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-3003/
PDF	https://www.aclweb.org/anthology/W16-3003
PWC	https://paperswithcode.com/paper/refactoring-the-genia-event-extraction-shared
Repo
Framework

bot.zen @ EmpiriST 2015 - A minimally-deep learning PoS-tagger (trained for German CMC and Web data)


Title	bot.zen @ EmpiriST 2015 - A minimally-deep learning PoS-tagger (trained for German CMC and Web data)
Authors	Egon Stemle
Abstract
Tasks	Machine Translation, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2614/
PDF	https://www.aclweb.org/anthology/W16-2614
PWC	https://paperswithcode.com/paper/botzen-empirist-2015-a-minimally-deep
Repo
Framework

Babler - Data Collection from the Web to Support Speech Recognition and Keyword Search


Title	Babler - Data Collection from the Web to Support Speech Recognition and Keyword Search
Authors	Gideon Mendels, Erica Cooper, Julia Hirschberg
Abstract
Tasks	Language Identification, Language Modelling, Speech Recognition
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2609/
PDF	https://www.aclweb.org/anthology/W16-2609
PWC	https://paperswithcode.com/paper/babler-data-collection-from-the-web-to
Repo
Framework

SLEDDED: A Proposed Dataset of Event Descriptions for Evaluating Phrase Representations


Title	SLEDDED: A Proposed Dataset of Event Descriptions for Evaluating Phrase Representations
Authors	Laura Rimell, Eva Maria Vecchi
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2525/
PDF	https://www.aclweb.org/anthology/W16-2525
PWC	https://paperswithcode.com/paper/sledded-a-proposed-dataset-of-event
Repo
Framework

Antonymy and Canonicity: Experimental and Distributional Evidence


Title	Antonymy and Canonicity: Experimental and Distributional Evidence
Authors	Andreana Pastena, Alessandro Lenci
Abstract	The present paper investigates the phenomenon of antonym canonicity by providing new behavioural and distributional evidence on Italian adjectives. Previous studies have showed that some pairs of antonyms are perceived to be better examples of opposition than others, and are so considered representative of the whole category (e.g., Deese, 1964; Murphy, 2003; Paradis et al., 2009). Our goal is to further investigate why such canonical pairs (Murphy, 2003) exist and how they come to be associated. In the literature, two different approaches have dealt with this issue. The lexical-categorical approach (Charles and Miller, 1989; Justeson and Katz, 1991) finds the cause of canonicity in the high co-occurrence frequency of the two adjectives. The cognitive-prototype approach (Paradis et al., 2009; Jones et al., 2012) instead claims that two adjectives form a canonical pair because they are aligned along a simple and salient dimension. Our empirical evidence, while supporting the latter view, shows that the paradigmatic distributional properties of adjectives can also contribute to explain the phenomenon of canonicity, providing a corpus-based correlate of the cognitive notion of salience.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-5322/
PDF	https://www.aclweb.org/anthology/W16-5322
PWC	https://paperswithcode.com/paper/antonymy-and-canonicity-experimental-and
Repo
Framework

ICL00 at SemEval-2016 Task 3: Translation-Based Method for CQA System


Title	ICL00 at SemEval-2016 Task 3: Translation-Based Method for CQA System
Authors	Yunfang Wu, Minghua Zhang
Abstract
Tasks	Community Question Answering, Question Answering
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1132/
PDF	https://www.aclweb.org/anthology/S16-1132
PWC	https://paperswithcode.com/paper/icl00-at-semeval-2016-task-3-translation
Repo
Framework

Impact of Automatic Segmentation on the Quality, Productivity and Self-reported Post-editing Effort of Intralingual Subtitles


Title	Impact of Automatic Segmentation on the Quality, Productivity and Self-reported Post-editing Effort of Intralingual Subtitles
Authors	Aitor Álvarez, Marina Balenciaga, Arantza del Pozo, Haritz Arzelus, Anna Matamala, Carlos-D. Martínez-Hinarejos
Abstract	This paper describes the evaluation methodology followed to measure the impact of using a machine learning algorithm to automatically segment intralingual subtitles. The segmentation quality, productivity and self-reported post-editing effort achieved with such approach are shown to improve those obtained by the technique based in counting characters, mainly employed for automatic subtitle segmentation currently. The corpus used to train and test the proposed automated segmentation method is also described and shared with the community, in order to foster further research in this area.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1487/
PDF	https://www.aclweb.org/anthology/L16-1487
PWC	https://paperswithcode.com/paper/impact-of-automatic-segmentation-on-the
Repo
Framework

Controlled and Balanced Dataset for Japanese Lexical Simplification


Title	Controlled and Balanced Dataset for Japanese Lexical Simplification
Authors	Tomonori Kodaira, Tomoyuki Kajiwara, Mamoru Komachi
Abstract
Tasks	Lexical Simplification, Reading Comprehension
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-3001/
PDF	https://www.aclweb.org/anthology/P16-3001
PWC	https://paperswithcode.com/paper/controlled-and-balanced-dataset-for-japanese
Repo
Framework