May 4, 2019


Paper Group NANR 156

A Paraphrase and Semantic Similarity Detection System for User Generated Short-Text Content on Microblogs. Predicting Restaurant Consumption Level through Social Media Footprints. The CAMOMILE Collaborative Annotation Platform for Multi-modal, Multi-lingual and Multi-media Documents. Hyperedge Replacement and Nonprojective Dependency Structures. LS …

A Paraphrase and Semantic Similarity Detection System for User Generated Short-Text Content on Microblogs


Title	A Paraphrase and Semantic Similarity Detection System for User Generated Short-Text Content on Microblogs
Authors	Kuntal Dey, Ritvik Shrivastava, Saroj Kaushik
Abstract	Existing systems deliver high accuracy and F1-scores for detecting paraphrase and semantic similarity on traditional clean-text corpora. For instance, on the clean-text Microsoft Paraphrase benchmark database, the existing systems attain an accuracy as high as 0.8596. However, detecting paraphrases and semantic similarity on user-generated short-text content on microblogs such as Twitter, which comprises noisy and ad hoc short text, still needs significant research attention. In this paper, we propose a machine learning based approach to this problem. We propose a set of features that, although well known in the NLP literature for solving other problems, have not been explored for detecting paraphrase or semantic similarity on noisy user-generated short-text data such as Twitter. We apply support vector machine (SVM) based learning. We use the benchmark Twitter paraphrase data, released as a part of SemEval 2015, for experiments. Our system delivers a paraphrase detection F1-score of 0.717 and a semantic similarity detection F1-score of 0.741, thereby significantly outperforming the existing systems, which deliver F1-scores of 0.696 and 0.724 for the two problems respectively. Our features also allow us to obtain a rank among the top 10 when trained on the Microsoft Paraphrase corpus and tested on the corresponding test data, thereby empirically establishing our approach as ubiquitous across the different paraphrase detection databases.
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1271/
PDF	https://www.aclweb.org/anthology/C16-1271
PWC	https://paperswithcode.com/paper/a-paraphrase-and-semantic-similarity
Repo
Framework
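
The abstract above reports results as F1-scores over sentence pairs. As a rough illustration only, the sketch below computes one simple pairwise feature (token-level Jaccard overlap) and the F1 metric. The feature is a hypothetical stand-in, not one of the paper's features, and the SVM training step is omitted; the function names `jaccard` and `f1_score` are assumptions made for this example.

```python
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase whitespace tokenization (an assumed, deliberately naive choice)."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard overlap between two short texts."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def f1_score(gold: List[int], pred: List[int]) -> float:
    """F1 for the positive class (label 1) over parallel label lists."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In a fuller pipeline, a vector of such pairwise features would be fed to a classifier such as an SVM, and its predictions scored with `f1_score`.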


Predicting Restaurant Consumption Level through Social Media Footprints


Title	Predicting Restaurant Consumption Level through Social Media Footprints
Authors	Yang Xiao, Yuan Wang, Hangyu Mao, Zhen Xiao
Abstract	Accurate prediction of user attributes from social media is valuable for both social science analysis and consumer targeting. In this paper, we propose a systematic method to leverage users' online social media content for predicting offline restaurant consumption level. We use the social login as a bridge and construct a dataset of 8,844 users who have been linked across Dianping (similar to Yelp) and Sina Weibo. More specifically, we construct consumption level ground truth based on users' self-reported spending. We build predictive models using both raw features and, especially, latent features, such as topic distributions and celebrity clusters. The employed methods demonstrate that online social media content has strong predictive power for offline spending. Finally, combined with qualitative feature analysis, we present the differences in word usage, topic interests and following behavior between different consumption level groups.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1314/
PDF	https://www.aclweb.org/anthology/C16-1314
PWC	https://paperswithcode.com/paper/predicting-restaurant-consumption-level
Repo
Framework


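The CAMOMILE Collaborative Annotation Platform for Multi-modal, Multi-lingual and Multi-media Documents


Title	The CAMOMILE Collaborative Annotation Platform for Multi-modal, Multi-lingual and Multi-media Documents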
Authors	Johann Poignant, Mateusz Budnik, Hervé Bredin, Claude Barras, Mickael Stefas, Pierrick Bruneau, Gilles Adda, Laurent Besacier, Hazim Ekenel, Gil Francopoulo, Javier Hernando, Joseph Mariani, Ramon Morros, Georges Quénot, Sophie Rosset, Thomas Tamisier
Abstract	In this paper, we describe the organization and the implementation of the CAMOMILE collaborative annotation framework for multimodal, multimedia, multilingual (3M) data. Given the versatile nature of the analysis which can be performed on 3M data, the structure of the server was kept intentionally simple in order to preserve its genericity, relying on standard Web technologies. Layers of annotations, defined as data associated with a media fragment from the corpus, are stored in a database and can be managed through standard interfaces with authentication. Interfaces tailored specifically to the needed task can then be developed in an agile way, relying on simple but reliable services for the management of the centralized annotations. We then present our implementation of an active learning scenario for person annotation in video, relying on the CAMOMILE server; during a dry run experiment, the manual annotation of 716 speech segments was thus propagated to 3504 labeled tracks. The code of the CAMOMILE framework is distributed as open source.
Tasks	Active Learning
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1226/
PDF	https://www.aclweb.org/anthology/L16-1226
PWC	https://paperswithcode.com/paper/the-camomile-collaborative-annotation
Repo
Framework

Hyperedge Replacement and Nonprojective Dependency Structures


Title	Hyperedge Replacement and Nonprojective Dependency Structures
Authors	Daniel Bauer, Owen Rambow
Abstract
Tasks	Machine Translation, Semantic Parsing
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-3311/
PDF	https://www.aclweb.org/anthology/W16-3311
PWC	https://paperswithcode.com/paper/hyperedge-replacement-and-nonprojective
Repo
Framework

LSIS at SemEval-2016 Task 7: Using Web Search Engines for English and Arabic Unsupervised Sentiment Intensity Prediction


Title	LSIS at SemEval-2016 Task 7: Using Web Search Engines for English and Arabic Unsupervised Sentiment Intensity Prediction
Authors	Amal Htait, Sebastien Fournier, Patrice Bellot
Abstract
Tasks	Sentiment Analysis
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1076/
PDF	https://www.aclweb.org/anthology/S16-1076
PWC	https://paperswithcode.com/paper/lsis-at-semeval-2016-task-7-using-web-search
Repo
Framework

Use of Semantic Knowledge Base for Enhancement of Coherence of Code-mixed Topic-Based Aspect Clusters


Title	Use of Semantic Knowledge Base for Enhancement of Coherence of Code-mixed Topic-Based Aspect Clusters
Authors	Kavita Asnani, Jyoti D Pawar
Abstract
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-6332/
PDF	https://www.aclweb.org/anthology/W16-6332
PWC	https://paperswithcode.com/paper/use-of-semantic-knowledge-base-for
Repo
Framework

Liberal Event Extraction and Event Schema Induction


Title	Liberal Event Extraction and Event Schema Induction
Authors	Lifu Huang, Taylor Cassidy, Xiaocheng Feng, Heng Ji, Clare R. Voss, Jiawei Han, Avirup Sil
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-1025/
PDF	https://www.aclweb.org/anthology/P16-1025
PWC	https://paperswithcode.com/paper/liberal-event-extraction-and-event-schema
Repo
Framework

Efficient construction of metadata-enhanced web corpora


Title	Efficient construction of metadata-enhanced web corpora
Authors	Adrien Barbaresi
Abstract
Tasks	Information Retrieval
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2602/
PDF	https://www.aclweb.org/anthology/W16-2602
PWC	https://paperswithcode.com/paper/efficient-construction-of-metadata-enhanced
Repo
Framework

Unsupervised Event Coreference for Abstract Words


Title	Unsupervised Event Coreference for Abstract Words
Authors	Dheeraj Rajagopal, Eduard Hovy, Teruko Mitamura
Abstract
Tasks
Published	2016-11-01
URL	https://www.aclweb.org/anthology/W16-6005/
PDF	https://www.aclweb.org/anthology/W16-6005
PWC	https://paperswithcode.com/paper/unsupervised-event-coreference-for-abstract
Repo
Framework

Extending WordNet with Fine-Grained Collocational Information via Supervised Distributional Learning


Title	Extending WordNet with Fine-Grained Collocational Information via Supervised Distributional Learning
Authors	Luis Espinosa-Anke, Jose Camacho-Collados, Sara Rodríguez-Fernández, Horacio Saggion, Leo Wanner
Abstract	WordNet is probably the best known lexical resource in Natural Language Processing. While it is widely regarded as a high quality repository of concepts and semantic relations, updating and extending it manually is costly. One important type of relation which could potentially add enormous value to WordNet is the inclusion of collocational information, which is paramount in tasks such as Machine Translation, Natural Language Generation and Second Language Learning. In this paper, we present ColWordNet (CWN), an extended WordNet version with fine-grained collocational information, automatically introduced thanks to a method exploiting linear relations between analogous sense-level embedding spaces. We perform both intrinsic and extrinsic evaluations, and release CWN for the use and scrutiny of the community.
Tasks	Machine Translation, Semantic Textual Similarity, Sentiment Analysis, Text Generation, Word Embeddings, Word Sense Disambiguation
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1323/
PDF	https://www.aclweb.org/anthology/C16-1323
PWC	https://paperswithcode.com/paper/extending-wordnet-with-fine-grained
Repo
Framework

Demonstration of ChaKi.NET – beyond the corpus search system


Title	Demonstration of ChaKi.NET – beyond the corpus search system
Authors	Masayuki Asahara, Yuji Matsumoto, Toshio Morita
Abstract	ChaKi.NET is a corpus management system for dependency structure annotated corpora. After more than 10 years of continuous development, the system is now usable not only for corpus search, but also for visualization, annotation, labelling, and formatting for statistical analysis. This paper describes the various functions included in the current ChaKi.NET system.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-2011/
PDF	https://www.aclweb.org/anthology/C16-2011
PWC	https://paperswithcode.com/paper/demonstration-of-chakinet-a-beyond-the-corpus
Repo
Framework

The Power of Adaptivity in Identifying Statistical Alternatives


Title	The Power of Adaptivity in Identifying Statistical Alternatives
Authors	Kevin G. Jamieson, Daniel Haas, Benjamin Recht
Abstract	This paper studies the trade-off between two different kinds of pure exploration: breadth versus depth. We focus on the most biased coin problem, asking how many total coin flips are required to identify a "heavy" coin from an infinite bag containing both "heavy" coins with mean $\theta_1 \in (0,1)$, and "light" coins with mean $\theta_0 \in (0,\theta_1)$, where heavy coins are drawn from the bag with proportion $\alpha \in (0,1/2)$. When $\alpha,\theta_0,\theta_1$ are unknown, the key difficulty of this problem lies in distinguishing whether the two kinds of coins have very similar means, or whether heavy coins are just extremely rare. While existing solutions to this problem require some prior knowledge of the parameters $\theta_0,\theta_1,\alpha$, we propose an adaptive algorithm that requires no such knowledge yet still obtains near-optimal sample complexity guarantees. In contrast, we provide a lower bound showing that non-adaptive strategies require at least quadratically more samples. In characterizing this gap between adaptive and nonadaptive strategies, we make connections to anomaly detection and prove lower bounds on the sample complexity of differentiating between a single parametric distribution and a mixture of two such distributions.
Tasks	Anomaly Detection
Published	2016-12-01
URL	http://papers.nips.cc/paper/6072-the-power-of-adaptivity-in-identifying-statistical-alternatives
PDF	http://papers.nips.cc/paper/6072-the-power-of-adaptivity-in-identifying-statistical-alternatives.pdf
PWC	https://paperswithcode.com/paper/the-power-of-adaptivity-in-identifying
Repo
Framework
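
To make the most biased coin setting above concrete, the sketch below simulates the infinite bag and runs a naive non-adaptive baseline: draw a fixed number of coins, flip each the same number of times, and keep the one with the highest empirical mean. This is not the paper's adaptive algorithm; it only illustrates the problem setup. All names (`Coin`, `draw_coin`, `naive_search`) and the fixed budgets are assumptions made for this example.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Coin:
    mean: float      # probability of heads
    is_heavy: bool   # hidden ground truth, used only for evaluation


def draw_coin(rng: random.Random, alpha: float, theta0: float, theta1: float) -> Coin:
    """Draw from the bag: heavy with probability alpha, light otherwise."""
    heavy = rng.random() < alpha
    return Coin(theta1 if heavy else theta0, heavy)


def empirical_mean(rng: random.Random, coin: Coin, flips: int) -> float:
    """Flip the coin `flips` times and return the fraction of heads."""
    return sum(rng.random() < coin.mean for _ in range(flips)) / flips


def naive_search(rng: random.Random, alpha: float, theta0: float, theta1: float,
                 n_coins: int, flips_per_coin: int) -> Tuple[Coin, float]:
    """Non-adaptive baseline: same flip budget for every coin, keep the best."""
    best_coin, best_mean = None, -1.0
    for _ in range(n_coins):
        coin = draw_coin(rng, alpha, theta0, theta1)
        mean = empirical_mean(rng, coin, flips_per_coin)
        if mean > best_mean:
            best_coin, best_mean = coin, mean
    return best_coin, best_mean
```

The baseline needs `n_coins` and `flips_per_coin` chosen in advance, which in turn requires knowing roughly how rare heavy coins are and how close the two means are. That dependence is the prior knowledge the abstract says an adaptive strategy can avoid.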

An Open Source Library for Semantic-Based Datetime Resolution


Title	An Open Source Library for Semantic-Based Datetime Resolution
Authors	Aurélie Merlo, Denis Pasin
Abstract	In this paper, we introduce an original Python implementation of datetime resolution in French, which we make available as an open-source library. Our approach is based on Frame Semantics and Corpus Pattern Analysis in order to provide a precise semantic interpretation of datetime expressions. This interpretation facilitates the contextual resolution of datetime expressions in timestamp format.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-2023/
PDF	https://www.aclweb.org/anthology/C16-2023
PWC	https://paperswithcode.com/paper/an-open-source-library-for-semantic-based
Repo
Framework
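
As a rough illustration of what contextual resolution means in the abstract above, the sketch below resolves a few French relative-day expressions against a reference date. It is a hypothetical toy lookup, not the library's API and not its Frame Semantics approach; the `RELATIVE_DAYS` table and the `resolve` function are assumptions made for this example, and it returns a calendar date rather than a full timestamp.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical table mapping French relative-day expressions to day offsets.
RELATIVE_DAYS = {
    "avant-hier": -2,    # the day before yesterday
    "hier": -1,          # yesterday
    "aujourd'hui": 0,    # today
    "demain": 1,         # tomorrow
    "après-demain": 2,   # the day after tomorrow
}


def resolve(expression: str, reference: date) -> Optional[date]:
    """Resolve a relative-day expression against a reference date.

    Returns None for expressions outside the toy table.
    """
    offset = RELATIVE_DAYS.get(expression.strip().lower())
    if offset is None:
        return None
    return reference + timedelta(days=offset)
```

The same expression resolves to different dates depending on the reference: `resolve("demain", date(2016, 12, 31))` gives 1 January 2017, while the same call with another reference date gives the day after that date.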

Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorial Abstracts


Title	Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorial Abstracts
Authors
Abstract
Tasks
Published	2016-06-01
URL	https://www.aclweb.org/anthology/N16-4000/
PDF	https://www.aclweb.org/anthology/N16-4000
PWC	https://paperswithcode.com/paper/proceedings-of-the-2016-conference-of-the-1
Repo
Framework

USFD at SemEval-2016 Task 1: Putting different State-of-the-Arts into a Box


Title	USFD at SemEval-2016 Task 1: Putting different State-of-the-Arts into a Box
Authors	Ahmet Aker, Frederic Blain, Andres Duque, Marina Fomicheva, Jurica Seva, Kashif Shah, Daniel Beck
Abstract
Tasks	Information Retrieval, Machine Translation, Semantic Textual Similarity, Word Alignment
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1092/
PDF	https://www.aclweb.org/anthology/S16-1092
PWC	https://paperswithcode.com/paper/usfd-at-semeval-2016-task-1-putting-different
Repo
Framework