Paper Group NANR 174
FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies
Title | FREME: Multilingual Semantic Enrichment with Linked Data and Language Technologies |
Authors | Milan Dojchinovski, Felix Sasaki, Tatjana Gornostaja, Sebastian Hellmann, Erik Mannens, Frank Salliau, Michele Osella, Phil Ritchie, Giannis Stoitsis, Kevin Koidl, Markus Ackermann, Nilesh Chakraborty |
Abstract | In recent years, Linked Data and Language Technology solutions have gained popularity. Nevertheless, their coupling in real-world business is limited due to several issues. Existing products and services are developed for a particular domain, can be used only in combination with already integrated datasets, or have limited language coverage. In this paper, we present FREME, an innovative open framework of e-Services for multilingual and semantic enrichment of digital content. The framework integrates six interoperable e-Services. We describe the core features of each e-Service and illustrate their usage in the context of four business cases: i) authoring and publishing; ii) translation and localisation; iii) cross-lingual access to data; and iv) personalised Web content recommendations. Business cases drive the design and development of the framework. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1660/ |
https://www.aclweb.org/anthology/L16-1660 | |
PWC | https://paperswithcode.com/paper/freme-multilingual-semantic-enrichment-with |
Repo | |
Framework | |
Using Term Position Similarity and Language Modeling for Bilingual Document Alignment
Title | Using Term Position Similarity and Language Modeling for Bilingual Document Alignment |
Authors | Thanh C. Le, Hoa Trong Vu, Jonathan Oberländer, Ondřej Bojar |
Abstract | |
Tasks | Information Retrieval, Language Modelling, Machine Translation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2371/ |
https://www.aclweb.org/anthology/W16-2371 | |
PWC | https://paperswithcode.com/paper/using-term-position-similarity-and-language |
Repo | |
Framework | |
A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings
Title | A Comparison of Domain-based Word Polarity Estimation using different Word Embeddings |
Authors | Aitor García Pablos, Montse Cuadros, German Rigau |
Abstract | A key point in Sentiment Analysis is to determine the polarity of the sentiment implied by a certain word or expression. In basic Sentiment Analysis systems, the sentiment polarity of the words is accounted for and weighted in different ways to provide a degree of positivity/negativity. Currently, words are also modelled as continuous dense vectors, known as word embeddings, which seem to encode interesting semantic knowledge. With regard to Sentiment Analysis, word embeddings are used as features for more complex supervised classification systems to obtain sentiment classifiers. In this paper we compare a set of existing sentiment lexicons and sentiment lexicon generation techniques. We also show a simple but effective technique to calculate a word polarity value for each word in a domain using existing continuous word embedding generation methods. Further, we show that word embeddings calculated on an in-domain corpus capture polarity better than those calculated on a general-domain corpus. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1009/ |
https://www.aclweb.org/anthology/L16-1009 | |
PWC | https://paperswithcode.com/paper/a-comparison-of-domain-based-word-polarity |
Repo | |
Framework | |
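
The abstract above does not spell out how the word polarity value is computed. A minimal sketch of one common embedding-based approach, offered here as an assumption and not as the authors' method, scores a word by its cosine similarity to a positive seed word minus its similarity to a negative seed word. The seed words, the function names, and the toy two-dimensional vectors are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def polarity(word, vectors, pos_seed="good", neg_seed="bad"):
    """Hypothetical polarity score: sim(word, pos_seed) - sim(word, neg_seed).

    Positive values suggest positive sentiment, negative values the opposite.
    """
    v = vectors[word]
    return cosine(v, vectors[pos_seed]) - cosine(v, vectors[neg_seed])

# Toy embeddings standing in for vectors trained on an in-domain corpus.
toy_vectors = {
    "good": [1.0, 0.0],
    "bad": [0.0, 1.0],
    "tasty": [0.9, 0.1],
    "stale": [0.2, 0.8],
}

for w in ("tasty", "stale"):
    print(w, round(polarity(w, toy_vectors), 3))
```

In practice the vectors would come from an embedding model trained on domain text, which is the setting the abstract reports as capturing polarity better than general-domain embeddings.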
A Personalized Markov Clustering and Deep Learning Approach for Arabic Text Categorization
Title | A Personalized Markov Clustering and Deep Learning Approach for Arabic Text Categorization |
Authors | Vasu Jindal |
Abstract | |
Tasks | Text Categorization, Text Classification |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/papers/P16-3022/p16-3022 |
https://www.aclweb.org/anthology/P16-3022v2 | |
PWC | https://paperswithcode.com/paper/a-personalized-markov-clustering-and-deep |
Repo | |
Framework | |
Interactive Relation Extraction in Main Memory Database Systems
Title | Interactive Relation Extraction in Main Memory Database Systems |
Authors | Rudolf Schneider, Cordula Guder, Torsten Kilias, Alexander Löser, Jens Graupmann, Oleksandr Kozachuk |
Abstract | We present INDREX-MM, a main memory database system for interactively executing two interwoven tasks: declarative relation extraction from text and the exploitation of the extracted relations with SQL. INDREX-MM simplifies these tasks for the user with powerful SQL extensions for gathering statistical semantics, for executing open information extraction and for integrating relation candidates with domain-specific data. We demonstrate these functions on 800k documents from Reuters RCV1 with more than a billion linguistic annotations and report execution times on the order of seconds. |
Tasks | Open Information Extraction, Relation Extraction |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-2022/ |
https://www.aclweb.org/anthology/C16-2022 | |
PWC | https://paperswithcode.com/paper/interactive-relation-extraction-in-main |
Repo | |
Framework | |
Learning Phone Embeddings for Word Segmentation of Child-Directed Speech
Title | Learning Phone Embeddings for Word Segmentation of Child-Directed Speech |
Authors | Jianqiang Ma, Çağrı Çöltekin, Erhard Hinrichs |
Abstract | |
Tasks | Chinese Word Segmentation, Language Acquisition |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-1908/ |
https://www.aclweb.org/anthology/W16-1908 | |
PWC | https://paperswithcode.com/paper/learning-phone-embeddings-for-word |
Repo | |
Framework | |
Community Detection on Evolving Graphs
Title | Community Detection on Evolving Graphs |
Authors | Aris Anagnostopoulos, Jakub Łącki, Silvio Lattanzi, Stefano Leonardi, Mohammad Mahdian |
Abstract | Clustering is a fundamental step in many information-retrieval and data-mining applications. Detecting clusters in graphs is also a key tool for finding the community structure in social and behavioral networks. In many of these applications, the input graph evolves over time in a continual and decentralized manner, and, to maintain a good clustering, the clustering algorithm needs to repeatedly probe the graph. Furthermore, there are often limitations on the frequency of such probes, either imposed explicitly by the online platform (e.g., in the case of crawling proprietary social networks like Twitter) or implicitly because of resource limitations (e.g., in the case of crawling the web). In this paper, we study a model of clustering on evolving graphs that captures this aspect of the problem. Our model is based on the classical stochastic block model, which has been used to assess rigorously the quality of various static clustering methods. In our model, the algorithm is supposed to reconstruct the planted clustering, given the ability to query for small pieces of local information about the graph, at a limited rate. We design and analyze clustering algorithms that work in this model, and show asymptotically tight upper and lower bounds on their accuracy. Finally, we perform simulations, which demonstrate that our main asymptotic results also hold in practice. |
Tasks | Community Detection, Information Retrieval |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6173-community-detection-on-evolving-graphs |
http://papers.nips.cc/paper/6173-community-detection-on-evolving-graphs.pdf | |
PWC | https://paperswithcode.com/paper/community-detection-on-evolving-graphs |
Repo | |
Framework | |
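
For readers unfamiliar with the classical stochastic block model mentioned in the abstract above, here is a minimal sketch of sampling a static graph with a planted clustering. It covers only the static model, not the evolving-graph or limited-query setting studied in the paper. The function name and the parameters `p_in` and `q_out` are illustrative choices, not taken from the paper.

```python
import random

def sample_sbm(block_sizes, p_in, q_out, seed=0):
    """Sample an undirected graph from a stochastic block model.

    Nodes in the same block are joined with probability p_in,
    nodes in different blocks with probability q_out.
    Returns (labels, edges): labels[v] is the planted cluster of node v,
    edges is a set of (u, v) pairs with u < v.
    """
    rng = random.Random(seed)
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            prob = p_in if labels[u] == labels[v] else q_out
            if rng.random() < prob:
                edges.add((u, v))
    return labels, edges

# Two planted communities of 20 nodes each, dense inside and sparse across.
labels, edges = sample_sbm([20, 20], p_in=0.6, q_out=0.05)
intra = sum(1 for u, v in edges if labels[u] == labels[v])
print("intra-cluster edges:", intra, "inter-cluster edges:", len(edges) - intra)
```

A clustering algorithm is then judged by how well it recovers `labels` from `edges` alone, which is the sense in which the abstract speaks of reconstructing the planted clustering.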
Refactoring the Genia Event Extraction Shared Task Toward a General Framework for IE-Driven KB Development
Title | Refactoring the Genia Event Extraction Shared Task Toward a General Framework for IE-Driven KB Development |
Authors | Jin-Dong Kim, Yue Wang, Nicola Colic, Seung Han Beak, Yong Hwan Kim, Min Song |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-3003/ |
https://www.aclweb.org/anthology/W16-3003 | |
PWC | https://paperswithcode.com/paper/refactoring-the-genia-event-extraction-shared |
Repo | |
Framework | |
bot.zen @ EmpiriST 2015 - A minimally-deep learning PoS-tagger (trained for German CMC and Web data)
Title | bot.zen @ EmpiriST 2015 - A minimally-deep learning PoS-tagger (trained for German CMC and Web data) |
Authors | Egon Stemle |
Abstract | |
Tasks | Machine Translation, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2614/ |
https://www.aclweb.org/anthology/W16-2614 | |
PWC | https://paperswithcode.com/paper/botzen-empirist-2015-a-minimally-deep |
Repo | |
Framework | |
Babler - Data Collection from the Web to Support Speech Recognition and Keyword Search
Title | Babler - Data Collection from the Web to Support Speech Recognition and Keyword Search |
Authors | Gideon Mendels, Erica Cooper, Julia Hirschberg |
Abstract | |
Tasks | Language Identification, Language Modelling, Speech Recognition |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2609/ |
https://www.aclweb.org/anthology/W16-2609 | |
PWC | https://paperswithcode.com/paper/babler-data-collection-from-the-web-to |
Repo | |
Framework | |
SLEDDED: A Proposed Dataset of Event Descriptions for Evaluating Phrase Representations
Title | SLEDDED: A Proposed Dataset of Event Descriptions for Evaluating Phrase Representations |
Authors | Laura Rimell, Eva Maria Vecchi |
Abstract | |
Tasks | |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2525/ |
https://www.aclweb.org/anthology/W16-2525 | |
PWC | https://paperswithcode.com/paper/sledded-a-proposed-dataset-of-event |
Repo | |
Framework | |
Antonymy and Canonicity: Experimental and Distributional Evidence
Title | Antonymy and Canonicity: Experimental and Distributional Evidence |
Authors | Andreana Pastena, Alessandro Lenci |
Abstract | The present paper investigates the phenomenon of antonym canonicity by providing new behavioural and distributional evidence on Italian adjectives. Previous studies have shown that some pairs of antonyms are perceived to be better examples of opposition than others, and are thus considered representative of the whole category (e.g., Deese, 1964; Murphy, 2003; Paradis et al., 2009). Our goal is to further investigate why such canonical pairs (Murphy, 2003) exist and how they come to be associated. In the literature, two different approaches have dealt with this issue. The lexical-categorical approach (Charles and Miller, 1989; Justeson and Katz, 1991) finds the cause of canonicity in the high co-occurrence frequency of the two adjectives. The cognitive-prototype approach (Paradis et al., 2009; Jones et al., 2012) instead claims that two adjectives form a canonical pair because they are aligned along a simple and salient dimension. Our empirical evidence, while supporting the latter view, shows that the paradigmatic distributional properties of adjectives can also contribute to explaining the phenomenon of canonicity, providing a corpus-based correlate of the cognitive notion of salience. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5322/ |
https://www.aclweb.org/anthology/W16-5322 | |
PWC | https://paperswithcode.com/paper/antonymy-and-canonicity-experimental-and |
Repo | |
Framework | |
ICL00 at SemEval-2016 Task 3: Translation-Based Method for CQA System
Title | ICL00 at SemEval-2016 Task 3: Translation-Based Method for CQA System |
Authors | Yunfang Wu, Minghua Zhang |
Abstract | |
Tasks | Community Question Answering, Question Answering |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1132/ |
https://www.aclweb.org/anthology/S16-1132 | |
PWC | https://paperswithcode.com/paper/icl00-at-semeval-2016-task-3-translation |
Repo | |
Framework | |
Impact of Automatic Segmentation on the Quality, Productivity and Self-reported Post-editing Effort of Intralingual Subtitles
Title | Impact of Automatic Segmentation on the Quality, Productivity and Self-reported Post-editing Effort of Intralingual Subtitles |
Authors | Aitor Álvarez, Marina Balenciaga, Arantza del Pozo, Haritz Arzelus, Anna Matamala, Carlos-D. Martínez-Hinarejos |
Abstract | This paper describes the evaluation methodology followed to measure the impact of using a machine learning algorithm to automatically segment intralingual subtitles. The segmentation quality, productivity and self-reported post-editing effort achieved with this approach are shown to improve on those obtained by the technique based on counting characters, which is currently the main method for automatic subtitle segmentation. The corpus used to train and test the proposed automated segmentation method is also described and shared with the community, in order to foster further research in this area. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1487/ |
https://www.aclweb.org/anthology/L16-1487 | |
PWC | https://paperswithcode.com/paper/impact-of-automatic-segmentation-on-the |
Repo | |
Framework | |
Controlled and Balanced Dataset for Japanese Lexical Simplification
Title | Controlled and Balanced Dataset for Japanese Lexical Simplification |
Authors | Tomonori Kodaira, Tomoyuki Kajiwara, Mamoru Komachi |
Abstract | |
Tasks | Lexical Simplification, Reading Comprehension |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-3001/ |
https://www.aclweb.org/anthology/P16-3001 | |
PWC | https://paperswithcode.com/paper/controlled-and-balanced-dataset-for-japanese |
Repo | |
Framework | |