May 5, 2019

Paper Group NANR 104

CLARIAH in the Netherlands. This is how we do it: Answer Reranking for Open-domain How Questions with Paragraph Vectors and Minimal Feature Engineering. Chinese Textual Sentiment Analysis: Datasets, Resources and Tools. ArabTAG: from a Handcrafted to a Semi-automatically Generated TAG. Sentence Similarity based on Dependency Tree Kernels for Multi- …

CLARIAH in the Netherlands


Title	CLARIAH in the Netherlands
Authors	Jan Odijk
Abstract	I introduce CLARIAH in the Netherlands, which aims to contribute the Netherlands part of a Europe-wide humanities research infrastructure. I describe the digital turn in the humanities, the background and context of CLARIAH, both nationally and internationally, its relation to the CLARIN and DARIAH infrastructures, and the rationale for joining forces between CLARIN and DARIAH in the Netherlands. I also describe the first results of joining forces as achieved in the CLARIAH-SEED project, and the plans of the CLARIAH-CORE project, which is currently running.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1394/
PDF	https://www.aclweb.org/anthology/L16-1394
PWC	https://paperswithcode.com/paper/clariah-in-the-netherlands
Repo
Framework

This is how we do it: Answer Reranking for Open-domain How Questions with Paragraph Vectors and Minimal Feature Engineering


Title	This is how we do it: Answer Reranking for Open-domain How Questions with Paragraph Vectors and Minimal Feature Engineering
Authors	Dasha Bogdanova, Jennifer Foster
Abstract
Tasks	Feature Engineering, Question Answering
Published	2016-06-01
URL	https://www.aclweb.org/anthology/N16-1154/
PDF	https://www.aclweb.org/anthology/N16-1154
PWC	https://paperswithcode.com/paper/this-is-how-we-do-it-answer-reranking-for
Repo
Framework

Chinese Textual Sentiment Analysis: Datasets, Resources and Tools


Title	Chinese Textual Sentiment Analysis: Datasets, Resources and Tools
Authors	Lun-Wei Ku, Wei-Fan Chen
Abstract	The rapid accumulation of data in social media (at million and billion scales) has imposed great challenges in information extraction, knowledge discovery, and data mining, and texts bearing sentiment and opinions are one of the major categories of user-generated data in social media. Sentiment analysis is the main technology to quickly capture what people think from these text data, and is a research direction with immediate practical value in the 'big data' era. Learning such techniques will allow data miners to perform advanced mining tasks considering real sentiment and opinions expressed by users in addition to the statistics calculated from the physical actions (such as viewing or purchasing records) users perform, which facilitates the development of real-world applications. However, the fact that most tools are limited to the English language might stop academic or industrial people from doing research or building products which cover a wider scope of data, retrieving information from people who speak different languages, or developing applications for worldwide users. More specifically, sentiment analysis determines the polarities and strength of the sentiment-bearing expressions, and it has been an important and attractive research area. In the past decade, resources and tools have been developed for sentiment analysis in order to support subsequent vital applications, such as product reviews, reputation management, call center robots, automatic public surveys, etc. However, most of these resources are for the English language. Being the key to the understanding of business and government issues, sentiment analysis resources and tools are required for other major languages, e.g., Chinese. In this tutorial, the audience can learn the skills for retrieving sentiment from texts in another major language, Chinese, to overcome this obstacle.
The goal of this tutorial is to introduce the proposed sentiment analysis technologies and datasets in the literature, and give the audience the opportunity to use resources and tools to process Chinese texts from the very basic preprocessing, i.e., word segmentation and part-of-speech tagging, to sentiment analysis, i.e., applying sentiment dictionaries and obtaining sentiment scores, through step-by-step instructions and hands-on practice. The basic processing tools are from CKIP. Participants can download these resources, use them and solve the problems they encounter in this tutorial. This tutorial will begin with some background knowledge of sentiment analysis, such as how sentiments are categorized, where to find available corpora and which models are commonly applied, especially for the Chinese language. Then a set of basic Chinese text processing tools for word segmentation, tagging and parsing will be introduced for the preparation of mining sentiment and opinions. After bringing the idea of how to pre-process the Chinese language to the audience, I will describe our work on compositional Chinese sentiment analysis from words to sentences, and an application on social media text (Facebook) as an example. All our involved and recently developed related resources, including Chinese Morphological Dataset, Augmented NTU Sentiment Dictionary (aug-NTUSD), E-hownet with sentiment information, Chinese Opinion Treebank, and the CopeOpi Sentiment Scorer, will also be introduced and distributed in this tutorial. The tutorial will end with a hands-on session on how to use these materials and tools to process Chinese sentiment. For content details, materials, and the program, please refer to the tutorial URL: http://www.lunweiku.com/
Tasks	Part-Of-Speech Tagging, Sentiment Analysis
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-3002/
PDF	https://www.aclweb.org/anthology/C16-3002
PWC	https://paperswithcode.com/paper/chinese-textual-sentiment-analysis-datasets
Repo
Framework

ArabTAG: from a Handcrafted to a Semi-automatically Generated TAG


Title	ArabTAG: from a Handcrafted to a Semi-automatically Generated TAG
Authors	Chérifa Ben Khelil, Denys Duchier, Yannick Parmentier, Chiraz Zribi, Fériel Ben Fraj
Abstract
Tasks	Question Answering
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-3302/
PDF	https://www.aclweb.org/anthology/W16-3302
PWC	https://paperswithcode.com/paper/arabtag-from-a-handcrafted-to-a-semi
Repo
Framework

Sentence Similarity based on Dependency Tree Kernels for Multi-document Summarization


Title	Sentence Similarity based on Dependency Tree Kernels for Multi-document Summarization
Authors	Şaziye Betül Özateş, Arzucan Özgür, Dragomir Radev
Abstract	We introduce an approach based on using the dependency grammar representations of sentences to compute sentence similarity for extractive multi-document summarization. We adapt and investigate the effects of two untyped dependency tree kernels, which have originally been proposed for relation extraction, to the multi-document summarization problem. In addition, we propose a series of novel dependency grammar based kernels to better represent the syntactic and semantic similarities among the sentences. The proposed methods incorporate the type information of the dependency relations for sentence similarity calculation. To our knowledge, this is the first study that investigates using dependency tree based sentence similarity for multi-document summarization.
Tasks	Document Summarization, Multi-Document Summarization, Relation Extraction
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1452/
PDF	https://www.aclweb.org/anthology/L16-1452
PWC	https://paperswithcode.com/paper/sentence-similarity-based-on-dependency-tree
Repo
Framework


The Naming Sharing Structure and its Cognitive Meaning in Chinese and English


Title	The Naming Sharing Structure and its Cognitive Meaning in Chinese and English
Authors	Shili Ge, Rou Song
Abstract
Tasks	Machine Translation
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-0603/
PDF	https://www.aclweb.org/anthology/W16-0603
PWC	https://paperswithcode.com/paper/the-naming-sharing-structure-and-its
Repo
Framework

Negation Detection in Clinical Reports Written in German


Title	Negation Detection in Clinical Reports Written in German
Authors	Viviana Cotik, Roland Roller, Feiyu Xu, Hans Uszkoreit, Klemens Budde, Danilo Schmidt
Abstract	An important subtask in clinical text mining tries to identify whether a clinical finding is expressed as present, absent or unsure in a text. This work presents a system for detecting mentions of clinical findings that are negated or just speculated. The system has been applied to two different types of German clinical texts: clinical notes and discharge summaries. Our approach is built on top of NegEx, a well known algorithm for identifying non-factive mentions of medical findings. In this work, we adjust a previous adaptation of NegEx to German and evaluate the system on our data to detect negation and speculation. The results are compared to a baseline algorithm and are analyzed for both types of clinical documents. Our system achieves an F1-Score above 0.9 on both types of reports.
Tasks	Named Entity Recognition, Negation Detection, Relation Extraction
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-5113/
PDF	https://www.aclweb.org/anthology/W16-5113
PWC	https://paperswithcode.com/paper/negation-detection-in-clinical-reports
Repo
Framework
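
The abstract describes a NegEx-based system that marks clinical findings as negated or speculated. As a rough illustration of the trigger-term idea behind that family of methods, here is a minimal sketch. It is not the authors' system and not the NegEx lexicon: the trigger lists, the `classify_finding` function, and the fixed look-back window are all assumptions made for this example, and it only checks triggers that precede the finding, whereas the full algorithm also uses following triggers and termination terms.

```python
import re

# Hypothetical, tiny trigger lists for illustration only; a real system
# would use a curated German trigger lexicon.
NEGATION_TRIGGERS = {"kein", "keine", "keinen", "ohne", "nicht"}
SPECULATION_TRIGGERS = {"verdacht", "fraglich", "möglich"}


def classify_finding(sentence, finding, window=5):
    """Label a finding as 'negated', 'speculated', 'affirmed' or 'not found'.

    Only the `window` tokens before the finding are inspected.
    """
    tokens = re.findall(r"\w+", sentence.lower())
    finding_tokens = re.findall(r"\w+", finding.lower())
    n = len(finding_tokens)
    if n == 0:
        return "not found"
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == finding_tokens:
            context = tokens[max(0, i - window):i]
            if any(t in NEGATION_TRIGGERS for t in context):
                return "negated"
            if any(t in SPECULATION_TRIGGERS for t in context):
                return "speculated"
            return "affirmed"
    return "not found"


if __name__ == "__main__":
    print(classify_finding("Kein Hinweis auf Fieber.", "Fieber"))
    print(classify_finding("Verdacht auf Pneumonie.", "Pneumonie"))
    print(classify_finding("Patient hat Fieber.", "Fieber"))
```

Checking negation before speculation is an arbitrary choice in this sketch; the abstract does not say how the two are prioritised.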

Unsupervised Learning of Spoken Language with Visual Context


Title	Unsupervised Learning of Spoken Language with Visual Context
Authors	David Harwath, Antonio Torralba, James Glass
Abstract	Humans learn to speak before they can read or write, so why can’t computers do the same? In this paper, we present a deep neural network model capable of rudimentary spoken language acquisition using untranscribed audio training data, whose only supervision comes in the form of contextually relevant visual images. We describe the collection of our data comprised of over 120,000 spoken audio captions for the Places image dataset and evaluate our model on an image search and annotation task. We also provide some visualizations which suggest that our model is learning to recognize meaningful words within the caption spectrograms.
Tasks	Image Retrieval, Language Acquisition
Published	2016-12-01
URL	http://papers.nips.cc/paper/6186-unsupervised-learning-of-spoken-language-with-visual-context
PDF	http://papers.nips.cc/paper/6186-unsupervised-learning-of-spoken-language-with-visual-context.pdf
PWC	https://paperswithcode.com/paper/unsupervised-learning-of-spoken-language-with
Repo
Framework

Joint search in a bilingual valency lexicon and an annotated corpus


Title	Joint search in a bilingual valency lexicon and an annotated corpus
Authors	Eva Fučíková, Jan Hajič, Zdeňka Urešová
Abstract	In this paper and the associated system demo, we present an advanced search system that allows users to perform a joint search over a (bilingual) valency lexicon and a correspondingly annotated linked parallel corpus. This search tool has been developed on the basis of the Prague Czech-English Dependency Treebank, but its ideas are applicable in principle to any bilingual parallel corpus that is annotated for dependencies and valency (i.e., predicate-argument structure), and where verbs are linked to appropriate entries in an associated valency lexicon. Our online search tool consolidates several search interfaces into one, providing expanded structured search capability and a more efficient advanced way to search, allowing users to search for verb pairs, verbal argument pairs, their surface realization as recorded in the lexicon, or for their surface form actually appearing in the linked parallel corpus. The search system is currently under development, and is replacing our current search tool available at http://lindat.mff.cuni.cz/services/CzEngVallex, which can search the lexicon, but its queries cannot take advantage of the underlying corpus nor use the additional surface form information from the lexicon(s). The system is available as open source.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-2009/
PDF	https://www.aclweb.org/anthology/C16-2009
PWC	https://paperswithcode.com/paper/joint-search-in-a-bilingual-valency-lexicon
Repo
Framework

Harnessing Sequence Labeling for Sarcasm Detection in Dialogue from TV Series 'Friends'


Title	Harnessing Sequence Labeling for Sarcasm Detection in Dialogue from TV Series 'Friends'
Authors	Aditya Joshi, Vaibhav Tripathi, Pushpak Bhattacharyya, Mark J. Carman
Abstract
Tasks	Sarcasm Detection, Sentiment Analysis
Published	2016-08-01
URL	https://www.aclweb.org/anthology/K16-1015/
PDF	https://www.aclweb.org/anthology/K16-1015
PWC	https://paperswithcode.com/paper/harnessing-sequence-labeling-for-sarcasm
Repo
Framework


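Data Formats and Management Strategies from the Perspective of Language Resource Producers ― Personal Diachronic and Social Synchronic Data Sharing ―


Title	Data Formats and Management Strategies from the Perspective of Language Resource Producers ― Personal Diachronic and Social Synchronic Data Sharing ―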
Authors	Kazushi Ohya
Abstract	This is a report of findings from on-going language documentation research based on three consecutive projects from 2008 to 2016. In the light of this research, we propose that (1) we should stand on the side of language resource producers to enhance the research of language processing. We support personal data management in addition to social data sharing. (2) This support leads to adopting simple data formats instead of the multi-link-path data models proposed as international standards up to the present. (3) We should set up a framework for total language resource study that includes not only pivotal data formats such as standard formats, but also the surroundings of data formation to capture a wider range of language activities, e.g. annotation, hesitant language formation, and reference-referent relations. A study of this framework is expected to be a foundation of rebuilding man-machine interface studies in which we seek to observe generative processes of informational symbols in order to establish a high affinity interface in regard to documentation.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1516/
PDF	https://www.aclweb.org/anthology/L16-1516
PWC	https://paperswithcode.com/paper/data-formats-and-management-strategies-from
Repo
Framework

Augmenting Course Material with Open Access Textbooks


Title	Augmenting Course Material with Open Access Textbooks
Authors	Smitha Milli, Marti A. Hearst
Abstract
Tasks	Information Retrieval
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-0526/
PDF	https://www.aclweb.org/anthology/W16-0526
PWC	https://paperswithcode.com/paper/augmenting-course-material-with-open-access
Repo
Framework

A Constant-Factor Bi-Criteria Approximation Guarantee for k-means++


Title	A Constant-Factor Bi-Criteria Approximation Guarantee for k-means++
Authors	Dennis Wei
Abstract	This paper studies the $k$-means++ algorithm for clustering as well as the class of $D^\ell$ sampling algorithms to which $k$-means++ belongs. It is shown that for any constant factor $\beta > 1$, selecting $\beta k$ cluster centers by $D^\ell$ sampling yields a constant-factor approximation to the optimal clustering with $k$ centers, in expectation and without conditions on the dataset. This result extends the previously known $O(\log k)$ guarantee for the case $\beta = 1$ to the constant-factor bi-criteria regime. It also improves upon an existing constant-factor bi-criteria result that holds only with constant probability.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6309-a-constant-factor-bi-criteria-approximation-guarantee-for-k-means
PDF	http://papers.nips.cc/paper/6309-a-constant-factor-bi-criteria-approximation-guarantee-for-k-means.pdf
PWC	https://paperswithcode.com/paper/a-constant-factor-bi-criteria-approximation-1
Repo
Framework
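
To make the sampling scheme in the abstract concrete, the following is a minimal sketch of $D^\ell$ sampling used to pick $\beta k$ centers. With `ell=2` it is the seeding step of $k$-means++. The function names, the default `beta=2.0`, and the toy data are assumptions for illustration; the sketch shows only the sampling procedure and says nothing about the approximation guarantee proved in the paper.

```python
import math
import random


def clustering_cost(points, centers, ell=2):
    """Sum over points of (distance to the nearest center) ** ell."""
    return sum(min(math.dist(p, c) for c in centers) ** ell for p in points)


def d_ell_sampling(points, num_centers, ell=2, seed=0):
    """Choose centers one by one.

    The first center is uniform at random; each later center is drawn with
    probability proportional to (distance to nearest chosen center) ** ell.
    """
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    # cost[i] is the current contribution of points[i] to the clustering cost.
    cost = [math.dist(p, centers[0]) ** ell for p in points]
    while len(centers) < num_centers and sum(cost) > 0:
        idx = rng.choices(range(len(points)), weights=cost, k=1)[0]
        centers.append(points[idx])
        cost = [min(c, math.dist(p, points[idx]) ** ell)
                for c, p in zip(cost, points)]
    return centers


def bicriteria_centers(points, k, beta=2.0, ell=2, seed=0):
    """Sample ceil(beta * k) centers, i.e. more centers than the target k."""
    return d_ell_sampling(points, math.ceil(beta * k), ell=ell, seed=seed)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
           (20, 0), (20, 1), (21, 0)]
    chosen = bicriteria_centers(pts, k=3, beta=2.0)
    print(len(chosen), clustering_cost(pts, chosen))
```

The loop stops early if every point already coincides with a chosen center, so fewer than `num_centers` centers can be returned on degenerate inputs.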

Microblog Emotion Classification by Computing Similarity in Text, Time, and Space


Title	Microblog Emotion Classification by Computing Similarity in Text, Time, and Space
Authors	Anja Summa, Bernd Resch, Michael Strube
Abstract	Most work in NLP analysing microblogs focuses on textual content thus neglecting temporal and spatial information. We present a new interdisciplinary method for emotion classification that combines linguistic, temporal, and spatial information into a single metric. We create a graph of labeled and unlabeled tweets that encodes the relations between neighboring tweets with respect to their emotion labels. Graph-based semi-supervised learning labels all tweets with an emotion.
Tasks	Emotion Classification, Sentiment Analysis
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4317/
PDF	https://www.aclweb.org/anthology/W16-4317
PWC	https://paperswithcode.com/paper/microblog-emotion-classification-by-computing
Repo
Framework
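
The abstract combines textual, temporal, and spatial similarity into one metric and then labels tweets over a graph. The sketch below shows one way such a combined metric could look, followed by a single round of similarity-weighted voting over a fully connected graph, which is a deliberately simple stand-in for graph-based semi-supervised learning. Everything specific here is an assumption, not the paper's method: the `Tweet` fields, Jaccard word overlap, the exponential decays, the scale parameters, and the equal weights.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Tweet:
    text: str
    time_s: float  # posting time in seconds
    x_km: float    # position on a flat local grid, in kilometres
    y_km: float


def similarity(a, b, time_scale_s=3600.0, space_scale_km=10.0,
               weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of text, time and space similarity, each in [0, 1]."""
    words_a = set(a.text.lower().split())
    words_b = set(b.text.lower().split())
    union = words_a | words_b
    text_sim = len(words_a & words_b) / len(union) if union else 0.0
    time_sim = math.exp(-abs(a.time_s - b.time_s) / time_scale_s)
    distance = math.hypot(a.x_km - b.x_km, a.y_km - b.y_km)
    space_sim = math.exp(-distance / space_scale_km)
    w_text, w_time, w_space = weights
    return w_text * text_sim + w_time * time_sim + w_space * space_sim


def propagate_labels(tweets, labels):
    """Give each unlabeled tweet (None) the label with the largest
    similarity-weighted vote from the labeled tweets."""
    result = list(labels)
    for i, tweet in enumerate(tweets):
        if labels[i] is not None:
            continue
        votes = defaultdict(float)
        for j, other in enumerate(tweets):
            if labels[j] is not None:
                votes[labels[j]] += similarity(tweet, other)
        if votes:
            result[i] = max(votes, key=votes.get)
    return result


if __name__ == "__main__":
    tweets = [
        Tweet("so happy today", 0, 0.0, 0.0),
        Tweet("angry about traffic", 10000, 50.0, 50.0),
        Tweet("happy happy day", 60, 0.1, 0.1),
        Tweet("traffic makes me angry", 10050, 50.0, 50.2),
    ]
    print(propagate_labels(tweets, ["joy", "anger", None, None]))
```

A full label-propagation scheme would iterate and let newly labeled tweets influence their neighbours; this sketch votes once from the initially labeled tweets only.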

Multiplicative Representations for Unsupervised Semantic Role Induction


Title	Multiplicative Representations for Unsupervised Semantic Role Induction
Authors	Yi Luan, Yangfeng Ji, Hannaneh Hajishirzi, Boyang Li
Abstract
Tasks	Representation Learning, Semantic Role Labeling
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-2020/
PDF	https://www.aclweb.org/anthology/P16-2020
PWC	https://paperswithcode.com/paper/multiplicative-representations-for
Repo
Framework