May 4, 2019

Paper Group NANR 174

FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies. Using Term Position Similarity and Language Modeling for Bilingual Document Alignment. A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings. A Personalized Markov Clustering and Deep Learning Approach for Arabic Text Categorization. …

FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies

Title FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies
Authors Milan Dojchinovski, Felix Sasaki, Tatjana Gornostaja, Sebastian Hellmann, Erik Mannens, Frank Salliau, Michele Osella, Phil Ritchie, Giannis Stoitsis, Kevin Koidl, Markus Ackermann, Nilesh Chakraborty
Abstract In recent years, Linked Data and Language Technology solutions have gained popularity. Nevertheless, their coupling in real-world business is limited due to several issues. Existing products and services are developed for a particular domain, can be used only in combination with already integrated datasets, or have limited language coverage. In this paper, we present an innovative solution, FREME: an open framework of e-Services for multilingual and semantic enrichment of digital content. The framework integrates six interoperable e-Services. We describe the core features of each e-Service and illustrate their usage in the context of four business cases: i) authoring and publishing; ii) translation and localisation; iii) cross-lingual access to data; and iv) personalised Web content recommendations. Business cases drive the design and development of the framework.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1660/
PDF https://www.aclweb.org/anthology/L16-1660
PWC https://paperswithcode.com/paper/freme-multilingual-semantic-enrichment-with
Repo
Framework

Using Term Position Similarity and Language Modeling for Bilingual Document Alignment

Title Using Term Position Similarity and Language Modeling for Bilingual Document Alignment
Authors Thanh C. Le, Hoa Trong Vu, Jonathan Oberländer, Ondřej Bojar
Abstract
Tasks Information Retrieval, Language Modelling, Machine Translation
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2371/
PDF https://www.aclweb.org/anthology/W16-2371
PWC https://paperswithcode.com/paper/using-term-position-similarity-and-language
Repo
Framework

A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings

Title A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings
Authors Aitor García Pablos, Montse Cuadros, German Rigau
Abstract A key point in Sentiment Analysis is to determine the polarity of the sentiment implied by a certain word or expression. In basic Sentiment Analysis systems, this sentiment polarity of the words is counted and weighted in different ways to provide a degree of positivity/negativity. Currently, words are also modelled as continuous dense vectors, known as word embeddings, which seem to encode interesting semantic knowledge. With regard to Sentiment Analysis, word embeddings are used as features for more complex supervised classification systems to obtain sentiment classifiers. In this paper we compare a set of existing sentiment lexicons and sentiment lexicon generation techniques. We also show a simple but effective technique to calculate a word polarity value for each word in a domain using existing continuous word embedding generation methods. Further, we show that word embeddings calculated on an in-domain corpus capture the polarity better than the ones calculated on a general-domain corpus.
Tasks Sentiment Analysis, Word Embeddings
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1009/
PDF https://www.aclweb.org/anthology/L16-1009
PWC https://paperswithcode.com/paper/a-comparison-of-domain-based-word-polarity
Repo
Framework

A Personalized Markov Clustering and Deep Learning Approach for Arabic Text Categorization

Title A Personalized Markov Clustering and Deep Learning Approach for Arabic Text Categorization
Authors Vasu Jindal
Abstract
Tasks Text Categorization, Text Classification
Published 2016-08-01
URL https://www.aclweb.org/anthology/papers/P16-3022/p16-3022
PDF https://www.aclweb.org/anthology/P16-3022v2
PWC https://paperswithcode.com/paper/a-personalized-markov-clustering-and-deep
Repo
Framework

Interactive Relation Extraction in Main Memory Database Systems

Title Interactive Relation Extraction in Main Memory Database Systems
Authors Rudolf Schneider, Cordula Guder, Torsten Kilias, Alexander Löser, Jens Graupmann, Oleksandr Kozachuk
Abstract We present INDREX-MM, a main memory database system for interactively executing two interwoven tasks, declarative relation extraction from text and their exploitation with SQL. INDREX-MM simplifies these tasks for the user with powerful SQL extensions for gathering statistical semantics, for executing open information extraction and for integrating relation candidates with domain specific data. We demonstrate these functions on 800k documents from Reuters RCV1 with more than a billion linguistic annotations and report execution times in the order of seconds.
Tasks Open Information Extraction, Relation Extraction
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-2022/
PDF https://www.aclweb.org/anthology/C16-2022
PWC https://paperswithcode.com/paper/interactive-relation-extraction-in-main
Repo
Framework

Learning Phone Embeddings for Word Segmentation of Child-Directed Speech

Title Learning Phone Embeddings for Word Segmentation of Child-Directed Speech
Authors Jianqiang Ma, Çağrı Çöltekin, Erhard Hinrichs
Abstract
Tasks Chinese Word Segmentation, Language Acquisition
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-1908/
PDF https://www.aclweb.org/anthology/W16-1908
PWC https://paperswithcode.com/paper/learning-phone-embeddings-for-word
Repo
Framework

Community Detection on Evolving Graphs

Title Community Detection on Evolving Graphs
Authors Aris Anagnostopoulos, Jakub Łącki, Silvio Lattanzi, Stefano Leonardi, Mohammad Mahdian
Abstract Clustering is a fundamental step in many information-retrieval and data-mining applications. Detecting clusters in graphs is also a key tool for finding the community structure in social and behavioral networks. In many of these applications, the input graph evolves over time in a continual and decentralized manner, and, to maintain a good clustering, the clustering algorithm needs to repeatedly probe the graph. Furthermore, there are often limitations on the frequency of such probes, either imposed explicitly by the online platform (e.g., in the case of crawling proprietary social networks like Twitter) or implicitly because of resource limitations (e.g., in the case of crawling the web). In this paper, we study a model of clustering on evolving graphs that captures this aspect of the problem. Our model is based on the classical stochastic block model, which has been used to assess rigorously the quality of various static clustering methods. In our model, the algorithm is supposed to reconstruct the planted clustering, given the ability to query for small pieces of local information about the graph, at a limited rate. We design and analyze clustering algorithms that work in this model, and show asymptotically tight upper and lower bounds on their accuracy. Finally, we perform simulations, which demonstrate that our main asymptotic results also hold true in practice.
Tasks Community Detection, Information Retrieval
Published 2016-12-01
URL http://papers.nips.cc/paper/6173-community-detection-on-evolving-graphs
PDF http://papers.nips.cc/paper/6173-community-detection-on-evolving-graphs.pdf
PWC https://paperswithcode.com/paper/community-detection-on-evolving-graphs
Repo
Framework

Refactoring the Genia Event Extraction Shared Task Toward a General Framework for IE-Driven KB Development

Title Refactoring the Genia Event Extraction Shared Task Toward a General Framework for IE-Driven KB Development
Authors Jin-Dong Kim, Yue Wang, Nicola Colic, Seung Han Beak, Yong Hwan Kim, Min Song
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-3003/
PDF https://www.aclweb.org/anthology/W16-3003
PWC https://paperswithcode.com/paper/refactoring-the-genia-event-extraction-shared
Repo
Framework

bot.zen @ EmpiriST 2015 - A minimally-deep learning PoS-tagger (trained for German CMC and Web data)

Title bot.zen @ EmpiriST 2015 - A minimally-deep learning PoS-tagger (trained for German CMC and Web data)
Authors Egon Stemle
Abstract
Tasks Machine Translation, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2614/
PDF https://www.aclweb.org/anthology/W16-2614
PWC https://paperswithcode.com/paper/botzen-empirist-2015-a-minimally-deep
Repo
Framework

Babler - Data Collection from the Web to Support Speech Recognition and Keyword Search

Title Babler - Data Collection from the Web to Support Speech Recognition and Keyword Search
Authors Gideon Mendels, Erica Cooper, Julia Hirschberg
Abstract
Tasks Language Identification, Language Modelling, Speech Recognition
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2609/
PDF https://www.aclweb.org/anthology/W16-2609
PWC https://paperswithcode.com/paper/babler-data-collection-from-the-web-to
Repo
Framework

SLEDDED: A Proposed Dataset of Event Descriptions for Evaluating Phrase Representations

Title SLEDDED: A Proposed Dataset of Event Descriptions for Evaluating Phrase Representations
Authors Laura Rimell, Eva Maria Vecchi
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2525/
PDF https://www.aclweb.org/anthology/W16-2525
PWC https://paperswithcode.com/paper/sledded-a-proposed-dataset-of-event
Repo
Framework

Antonymy and Canonicity: Experimental and Distributional Evidence

Title Antonymy and Canonicity: Experimental and Distributional Evidence
Authors Andreana Pastena, Alessandro Lenci
Abstract The present paper investigates the phenomenon of antonym canonicity by providing new behavioural and distributional evidence on Italian adjectives. Previous studies have shown that some pairs of antonyms are perceived to be better examples of opposition than others, and are thus considered representative of the whole category (e.g., Deese, 1964; Murphy, 2003; Paradis et al., 2009). Our goal is to further investigate why such canonical pairs (Murphy, 2003) exist and how they come to be associated. In the literature, two different approaches have dealt with this issue. The lexical-categorical approach (Charles and Miller, 1989; Justeson and Katz, 1991) finds the cause of canonicity in the high co-occurrence frequency of the two adjectives. The cognitive-prototype approach (Paradis et al., 2009; Jones et al., 2012) instead claims that two adjectives form a canonical pair because they are aligned along a simple and salient dimension. Our empirical evidence, while supporting the latter view, shows that the paradigmatic distributional properties of adjectives can also contribute to explaining the phenomenon of canonicity, providing a corpus-based correlate of the cognitive notion of salience.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5322/
PDF https://www.aclweb.org/anthology/W16-5322
PWC https://paperswithcode.com/paper/antonymy-and-canonicity-experimental-and
Repo
Framework

ICL00 at SemEval-2016 Task 3: Translation-Based Method for CQA System

Title ICL00 at SemEval-2016 Task 3: Translation-Based Method for CQA System
Authors Yunfang Wu, Minghua Zhang
Abstract
Tasks Community Question Answering, Question Answering
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1132/
PDF https://www.aclweb.org/anthology/S16-1132
PWC https://paperswithcode.com/paper/icl00-at-semeval-2016-task-3-translation
Repo
Framework

Impact of Automatic Segmentation on the Quality, Productivity and Self-reported Post-editing Effort of Intralingual Subtitles

Title Impact of Automatic Segmentation on the Quality, Productivity and Self-reported Post-editing Effort of Intralingual Subtitles
Authors Aitor Álvarez, Marina Balenciaga, Arantza del Pozo, Haritz Arzelus, Anna Matamala, Carlos-D. Martínez-Hinarejos
Abstract This paper describes the evaluation methodology followed to measure the impact of using a machine learning algorithm to automatically segment intralingual subtitles. The segmentation quality, productivity and self-reported post-editing effort achieved with such an approach are shown to improve on those obtained by the technique based on counting characters, which is currently the main technique employed for automatic subtitle segmentation. The corpus used to train and test the proposed automated segmentation method is also described and shared with the community, in order to foster further research in this area.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1487/
PDF https://www.aclweb.org/anthology/L16-1487
PWC https://paperswithcode.com/paper/impact-of-automatic-segmentation-on-the
Repo
Framework

Controlled and Balanced Dataset for Japanese Lexical Simplification

Title Controlled and Balanced Dataset for Japanese Lexical Simplification
Authors Tomonori Kodaira, Tomoyuki Kajiwara, Mamoru Komachi
Abstract
Tasks Lexical Simplification, Reading Comprehension
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-3001/
PDF https://www.aclweb.org/anthology/P16-3001
PWC https://paperswithcode.com/paper/controlled-and-balanced-dataset-for-japanese
Repo
Framework