May 4, 2019

Paper Group NANR 156

A Paraphrase and Semantic Similarity Detection System for User Generated Short-Text Content on Microblogs

Title A Paraphrase and Semantic Similarity Detection System for User Generated Short-Text Content on Microblogs
Authors Kuntal Dey, Ritvik Shrivastava, Saroj Kaushik
Abstract Existing systems deliver high accuracy and F1-scores for detecting paraphrase and semantic similarity on traditional clean-text corpora. For instance, on the clean-text Microsoft Paraphrase benchmark database, the existing systems attain an accuracy as high as 0.8596. However, detecting paraphrases and semantic similarity on user-generated short-text content on microblogs such as Twitter, which comprises noisy and ad hoc short text, needs significant research attention. In this paper, we propose a machine learning based approach towards this. We propose a set of features that, although well-known in the NLP literature for solving other problems, have not been explored for detecting paraphrase or semantic similarity on noisy user-generated short-text data such as Twitter. We apply support vector machine (SVM) based learning. We use the benchmark Twitter paraphrase data, released as a part of SemEval 2015, for experiments. Our system delivers a paraphrase detection F1-score of 0.717 and semantic similarity detection F1-score of 0.741, thereby significantly outperforming the existing systems, which deliver F1-scores of 0.696 and 0.724 for the two problems respectively. Our features also allow us to obtain a rank among the top-10, when trained on the Microsoft Paraphrase corpus and tested on the corresponding test data, thereby empirically establishing our approach as ubiquitous across the different paraphrase detection databases.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1271/
PDF https://www.aclweb.org/anthology/C16-1271
PWC https://paperswithcode.com/paper/a-paraphrase-and-semantic-similarity
Repo
Framework
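
The abstract above describes feeding pairwise features to an SVM and reporting F1-scores. As a rough illustration only, the sketch below computes two simple overlap features for a sentence pair and an F1-score for the positive class. The feature choice and the function names (`pair_features`, `char_ngrams`, `f1_score`) are assumptions made for this example; they are not the feature set or code from the paper, and the SVM training step is left out.

```python
from typing import List, Set


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a lowercased string."""
    s = text.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(s1: str, s2: str) -> List[float]:
    """Hypothetical feature vector for a sentence pair (not the paper's features)."""
    words1, words2 = set(s1.lower().split()), set(s2.lower().split())
    return [
        jaccard(words1, words2),                      # word-level overlap
        jaccard(char_ngrams(s1), char_ngrams(s2)),    # character trigram overlap
    ]


def f1_score(gold: List[int], pred: List[int]) -> float:
    """F1 for the positive class (label 1), given 0/1 gold and predicted labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Character n-grams are used here because they tolerate the spelling variation common in tweets; a real system would likely combine many more signals before training a classifier on the resulting vectors.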

Predicting Restaurant Consumption Level through Social Media Footprints

Title Predicting Restaurant Consumption Level through Social Media Footprints
Authors Yang Xiao, Yuan Wang, Hangyu Mao, Zhen Xiao
Abstract Accurate prediction of user attributes from social media is valuable for both social science analysis and consumer targeting. In this paper, we propose a systematic method to leverage users' online social media content for predicting offline restaurant consumption level. We utilize the social login as a bridge and construct a dataset of 8,844 users who have been linked across Dianping (similar to Yelp) and Sina Weibo. More specifically, we construct consumption level ground truth based on users' self-reported spending. We build predictive models using both raw features and, especially, latent features, such as topic distributions and celebrity clusters. The employed methods demonstrate that online social media content has strong predictive power for offline spending. Finally, combined with qualitative feature analysis, we present the differences in word usage, topic interests and following behavior between different consumption level groups.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1314/
PDF https://www.aclweb.org/anthology/C16-1314
PWC https://paperswithcode.com/paper/predicting-restaurant-consumption-level
Repo
Framework

The CAMOMILE Collaborative Annotation Platform for Multi-modal, Multi-lingual and Multi-media Documents

Title The CAMOMILE Collaborative Annotation Platform for Multi-modal, Multi-lingual and Multi-media Documents
Authors Johann Poignant, Mateusz Budnik, Hervé Bredin, Claude Barras, Mickael Stefas, Pierrick Bruneau, Gilles Adda, Laurent Besacier, Hazim Ekenel, Gil Francopoulo, Javier Hernando, Joseph Mariani, Ramon Morros, Georges Quénot, Sophie Rosset, Thomas Tamisier
Abstract In this paper, we describe the organization and the implementation of the CAMOMILE collaborative annotation framework for multimodal, multimedia, multilingual (3M) data. Given the versatile nature of the analysis which can be performed on 3M data, the structure of the server was kept intentionally simple in order to preserve its genericity, relying on standard Web technologies. Layers of annotations, defined as data associated with a media fragment from the corpus, are stored in a database and can be managed through standard interfaces with authentication. Interfaces tailored specifically to the needed task can then be developed in an agile way, relying on simple but reliable services for the management of the centralized annotations. We then present our implementation of an active learning scenario for person annotation in video, relying on the CAMOMILE server; during a dry run experiment, the manual annotation of 716 speech segments was thus propagated to 3504 labeled tracks. The code of the CAMOMILE framework is distributed in open source.
Tasks Active Learning
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1226/
PDF https://www.aclweb.org/anthology/L16-1226
PWC https://paperswithcode.com/paper/the-camomile-collaborative-annotation
Repo
Framework
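
The abstract above defines a layer of annotations as data associated with a media fragment from the corpus. The sketch below is one possible way to model that idea in memory. The class names (`Fragment`, `Annotation`, `Layer`), the fields, and the choice of a time span as the fragment are all assumptions for illustration; this is not the CAMOMILE server's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Fragment:
    """A media fragment, modeled here as a time span in seconds (an assumption)."""
    medium_id: str
    start: float
    end: float


@dataclass
class Annotation:
    """Arbitrary data attached to one fragment."""
    fragment: Fragment
    data: Dict[str, Any]


@dataclass
class Layer:
    """A named collection of annotations over the corpus."""
    name: str
    annotations: List[Annotation] = field(default_factory=list)

    def add(self, medium_id: str, start: float, end: float, **data: Any) -> Annotation:
        """Create an annotation on the given span and store it in this layer."""
        ann = Annotation(Fragment(medium_id, start, end), dict(data))
        self.annotations.append(ann)
        return ann

    def for_medium(self, medium_id: str) -> List[Annotation]:
        """Return all annotations in this layer attached to one medium."""
        return [a for a in self.annotations if a.fragment.medium_id == medium_id]
```

For example, `Layer("speaker").add("video1", 0.0, 2.5, person="A")` would record that person A is associated with the first 2.5 seconds of a hypothetical medium `video1`.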

Hyperedge Replacement and Nonprojective Dependency Structures

Title Hyperedge Replacement and Nonprojective Dependency Structures
Authors Daniel Bauer, Owen Rambow
Abstract
Tasks Machine Translation, Semantic Parsing
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-3311/
PDF https://www.aclweb.org/anthology/W16-3311
PWC https://paperswithcode.com/paper/hyperedge-replacement-and-nonprojective
Repo
Framework

LSIS at SemEval-2016 Task 7: Using Web Search Engines for English and Arabic Unsupervised Sentiment Intensity Prediction

Title LSIS at SemEval-2016 Task 7: Using Web Search Engines for English and Arabic Unsupervised Sentiment Intensity Prediction
Authors Amal Htait, Sebastien Fournier, Patrice Bellot
Abstract
Tasks Sentiment Analysis
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1076/
PDF https://www.aclweb.org/anthology/S16-1076
PWC https://paperswithcode.com/paper/lsis-at-semeval-2016-task-7-using-web-search
Repo
Framework

Use of Semantic Knowledge Base for Enhancement of Coherence of Code-mixed Topic-Based Aspect Clusters

Title Use of Semantic Knowledge Base for Enhancement of Coherence of Code-mixed Topic-Based Aspect Clusters
Authors Kavita Asnani, Jyoti D Pawar
Abstract
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-6332/
PDF https://www.aclweb.org/anthology/W16-6332
PWC https://paperswithcode.com/paper/use-of-semantic-knowledge-base-for
Repo
Framework

Liberal Event Extraction and Event Schema Induction

Title Liberal Event Extraction and Event Schema Induction
Authors Lifu Huang, Taylor Cassidy, Xiaocheng Feng, Heng Ji, Clare R. Voss, Jiawei Han, Avirup Sil
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1025/
PDF https://www.aclweb.org/anthology/P16-1025
PWC https://paperswithcode.com/paper/liberal-event-extraction-and-event-schema
Repo
Framework

Efficient construction of metadata-enhanced web corpora

Title Efficient construction of metadata-enhanced web corpora
Authors Adrien Barbaresi
Abstract
Tasks Information Retrieval
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2602/
PDF https://www.aclweb.org/anthology/W16-2602
PWC https://paperswithcode.com/paper/efficient-construction-of-metadata-enhanced
Repo
Framework

Unsupervised Event Coreference for Abstract Words

Title Unsupervised Event Coreference for Abstract Words
Authors Dheeraj Rajagopal, Eduard Hovy, Teruko Mitamura
Abstract
Tasks
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-6005/
PDF https://www.aclweb.org/anthology/W16-6005
PWC https://paperswithcode.com/paper/unsupervised-event-coreference-for-abstract
Repo
Framework

Extending WordNet with Fine-Grained Collocational Information via Supervised Distributional Learning

Title Extending WordNet with Fine-Grained Collocational Information via Supervised Distributional Learning
Authors Luis Espinosa-Anke, Jose Camacho-Collados, Sara Rodríguez-Fernández, Horacio Saggion, Leo Wanner
Abstract WordNet is probably the best known lexical resource in Natural Language Processing. While it is widely regarded as a high quality repository of concepts and semantic relations, updating and extending it manually is costly. One important type of relation which could potentially add enormous value to WordNet is the inclusion of collocational information, which is paramount in tasks such as Machine Translation, Natural Language Generation and Second Language Learning. In this paper, we present ColWordNet (CWN), an extended WordNet version with fine-grained collocational information, automatically introduced thanks to a method exploiting linear relations between analogous sense-level embedding spaces. We perform both intrinsic and extrinsic evaluations, and release CWN for the use and scrutiny of the community.
Tasks Machine Translation, Semantic Textual Similarity, Sentiment Analysis, Text Generation, Word Embeddings, Word Sense Disambiguation
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1323/
PDF https://www.aclweb.org/anthology/C16-1323
PWC https://paperswithcode.com/paper/extending-wordnet-with-fine-grained
Repo
Framework

Demonstration of ChaKi.NET – beyond the corpus search system

Title Demonstration of ChaKi.NET – beyond the corpus search system
Authors Masayuki Asahara, Yuji Matsumoto, Toshio Morita
Abstract ChaKi.NET is a corpus management system for dependency structure annotated corpora. After more than 10 years of continuous development, the system is now usable not only for corpus search, but also for visualization, annotation, labelling, and formatting for statistical analysis. This paper describes the various functions included in the current ChaKi.NET system.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-2011/
PDF https://www.aclweb.org/anthology/C16-2011
PWC https://paperswithcode.com/paper/demonstration-of-chakinet-a-beyond-the-corpus
Repo
Framework

The Power of Adaptivity in Identifying Statistical Alternatives

Title The Power of Adaptivity in Identifying Statistical Alternatives
Authors Kevin G. Jamieson, Daniel Haas, Benjamin Recht
Abstract This paper studies the trade-off between two different kinds of pure exploration: breadth versus depth. We focus on the most biased coin problem, asking how many total coin flips are required to identify a "heavy" coin from an infinite bag containing both "heavy" coins with mean $\theta_1 \in (0,1)$, and "light" coins with mean $\theta_0 \in (0,\theta_1)$, where heavy coins are drawn from the bag with proportion $\alpha \in (0,1/2)$. When $\alpha,\theta_0,\theta_1$ are unknown, the key difficulty of this problem lies in distinguishing whether the two kinds of coins have very similar means, or whether heavy coins are just extremely rare. While existing solutions to this problem require some prior knowledge of the parameters $\theta_0,\theta_1,\alpha$, we propose an adaptive algorithm that requires no such knowledge yet still obtains near-optimal sample complexity guarantees. In contrast, we provide a lower bound showing that non-adaptive strategies require at least quadratically more samples. In characterizing this gap between adaptive and nonadaptive strategies, we make connections to anomaly detection and prove lower bounds on the sample complexity of differentiating between a single parametric distribution and a mixture of two such distributions.
Tasks Anomaly Detection
Published 2016-12-01
URL http://papers.nips.cc/paper/6072-the-power-of-adaptivity-in-identifying-statistical-alternatives
PDF http://papers.nips.cc/paper/6072-the-power-of-adaptivity-in-identifying-statistical-alternatives.pdf
PWC https://paperswithcode.com/paper/the-power-of-adaptivity-in-identifying
Repo
Framework
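
To make the problem setting in the abstract above concrete, the sketch below simulates the bag of coins and a naive non-adaptive baseline that flips every drawn coin a fixed number of times. This is not the adaptive algorithm proposed in the paper. The function names and the parameters `flips_per_coin`, `threshold`, and `max_coins` are assumptions for illustration, and picking a sensible `threshold` already requires the kind of prior knowledge of $\theta_0$ and $\theta_1$ that the paper's method avoids.

```python
import random
from typing import Optional, Tuple


def draw_coin(rng: random.Random, alpha: float, theta0: float, theta1: float) -> float:
    """Draw a coin from the bag: heavy (mean theta1) with probability alpha, else light."""
    return theta1 if rng.random() < alpha else theta0


def flip(rng: random.Random, mean: float) -> int:
    """Flip a coin whose probability of heads is `mean`; return 1 for heads, 0 for tails."""
    return 1 if rng.random() < mean else 0


def naive_search(
    rng: random.Random,
    alpha: float,
    theta0: float,
    theta1: float,
    flips_per_coin: int,
    threshold: float,
    max_coins: int,
) -> Tuple[Optional[int], int]:
    """Non-adaptive baseline: give every coin the same number of flips and accept
    the first coin whose empirical mean exceeds `threshold`.

    Returns (index of the accepted coin or None, total number of flips used).
    """
    total_flips = 0
    for index in range(max_coins):
        mean = draw_coin(rng, alpha, theta0, theta1)
        heads = sum(flip(rng, mean) for _ in range(flips_per_coin))
        total_flips += flips_per_coin
        if heads / flips_per_coin > threshold:
            return index, total_flips
    return None, total_flips
```

Running `naive_search(random.Random(0), 0.2, 0.3, 0.9, 50, 0.6, 1000)` with different seeds gives a feel for how the total flip count grows as heavy coins become rarer or the two means move closer together.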

An Open Source Library for Semantic-Based Datetime Resolution

Title An Open Source Library for Semantic-Based Datetime Resolution
Authors Aurélie Merlo, Denis Pasin
Abstract In this paper, we introduce an original Python implementation of datetime resolution in French, which we make available as an open-source library. Our approach is based on Frame Semantics and Corpus Pattern Analysis in order to provide a precise semantic interpretation of datetime expressions. This interpretation facilitates the contextual resolution of datetime expressions in timestamp format.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-2023/
PDF https://www.aclweb.org/anthology/C16-2023
PWC https://paperswithcode.com/paper/an-open-source-library-for-semantic-based
Repo
Framework
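
The abstract above speaks of resolving French datetime expressions against a context. The sketch below shows the simplest version of that idea: mapping a handful of relative-day expressions to a date, given a reference date. The lexicon `RELATIVE_DAYS` and the function `resolve` are hypothetical names invented for this example; they do not reflect the API of the library described in the paper, which relies on a much richer semantic analysis.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical lexicon: French relative-day expressions mapped to day offsets.
RELATIVE_DAYS = {
    "avant-hier": -2,
    "hier": -1,
    "aujourd'hui": 0,
    "demain": 1,
    "après-demain": 2,
}


def resolve(expression: str, reference: date) -> Optional[date]:
    """Resolve a relative-day expression against a reference date.

    Returns None for expressions outside the small lexicon above.
    """
    offset = RELATIVE_DAYS.get(expression.strip().lower())
    if offset is None:
        return None
    return reference + timedelta(days=offset)
```

Passing the reference date explicitly, rather than reading the system clock, keeps the resolution reproducible and mirrors the contextual nature of the task: the same expression resolves differently depending on when the text was written.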

Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorial Abstracts

Title Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorial Abstracts
Authors
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-4000/
PDF https://www.aclweb.org/anthology/N16-4000
PWC https://paperswithcode.com/paper/proceedings-of-the-2016-conference-of-the-1
Repo
Framework

USFD at SemEval-2016 Task 1: Putting different State-of-the-Arts into a Box

Title USFD at SemEval-2016 Task 1: Putting different State-of-the-Arts into a Box
Authors Ahmet Aker, Frederic Blain, Andres Duque, Marina Fomicheva, Jurica Seva, Kashif Shah, Daniel Beck
Abstract
Tasks Information Retrieval, Machine Translation, Semantic Textual Similarity, Word Alignment
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1092/
PDF https://www.aclweb.org/anthology/S16-1092
PWC https://paperswithcode.com/paper/usfd-at-semeval-2016-task-1-putting-different
Repo
Framework