Paper Group NANR 104
CLARIAH in the Netherlands. This is how we do it: Answer Reranking for Open-domain How Questions with Paragraph Vectors and Minimal Feature Engineering. Chinese Textual Sentiment Analysis: Datasets, Resources and Tools. ArabTAG: from a Handcrafted to a Semi-automatically Generated TAG. Sentence Similarity based on Dependency Tree Kernels for Multi- …
CLARIAH in the Netherlands
Title | CLARIAH in the Netherlands |
Authors | Jan Odijk |
Abstract | I introduce CLARIAH in the Netherlands, which aims to contribute the Netherlands part of a Europe-wide humanities research infrastructure. I describe the digital turn in the humanities, the background and context of CLARIAH, both nationally and internationally, its relation to the CLARIN and DARIAH infrastructures, and the rationale for joining forces between CLARIN and DARIAH in the Netherlands. I also describe the first results of joining forces as achieved in the CLARIAH-SEED project, and the plans of the CLARIAH-CORE project, which is currently running. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1394/ |
https://www.aclweb.org/anthology/L16-1394 | |
PWC | https://paperswithcode.com/paper/clariah-in-the-netherlands |
Repo | |
Framework | |
This is how we do it: Answer Reranking for Open-domain How Questions with Paragraph Vectors and Minimal Feature Engineering
Title | This is how we do it: Answer Reranking for Open-domain How Questions with Paragraph Vectors and Minimal Feature Engineering |
Authors | Dasha Bogdanova, Jennifer Foster |
Abstract | |
Tasks | Feature Engineering, Question Answering |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1154/ |
https://www.aclweb.org/anthology/N16-1154 | |
PWC | https://paperswithcode.com/paper/this-is-how-we-do-it-answer-reranking-for |
Repo | |
Framework | |
Chinese Textual Sentiment Analysis: Datasets, Resources and Tools
Title | Chinese Textual Sentiment Analysis: Datasets, Resources and Tools |
Authors | Lun-Wei Ku, Wei-Fan Chen |
Abstract | The rapid accumulation of data in social media (on the scale of millions and billions) has imposed great challenges in information extraction, knowledge discovery, and data mining, and texts bearing sentiment and opinions are one of the major categories of user-generated data in social media. Sentiment analysis is the main technology to quickly capture what people think from these text data, and is a research direction with immediate practical value in the 'big data' era. Learning such techniques will allow data miners to perform advanced mining tasks that consider the real sentiment and opinions expressed by users in addition to the statistics calculated from the physical actions (such as viewing or purchasing records) users perform, which facilitates the development of real-world applications. However, the fact that most tools are limited to the English language might stop people in academia or industry from doing research or building products that cover a wider scope of data, retrieving information from people who speak different languages, or developing applications for worldwide users. More specifically, sentiment analysis determines the polarities and strength of sentiment-bearing expressions, and it has been an important and attractive research area. In the past decade, resources and tools have been developed for sentiment analysis in order to support subsequent vital applications, such as product reviews, reputation management, call center robots, automatic public surveys, etc. However, most of these resources are for the English language. Being the key to the understanding of business and government issues, sentiment analysis resources and tools are required for other major languages, e.g., Chinese. In this tutorial, the audience can learn the skills for retrieving sentiment from texts in another major language, Chinese, to overcome this obstacle.
The goal of this tutorial is to introduce the sentiment analysis technologies and datasets proposed in the literature, and give the audience the opportunity to use resources and tools to process Chinese texts, from the very basic preprocessing, i.e., word segmentation and part-of-speech tagging, to sentiment analysis, i.e., applying sentiment dictionaries and obtaining sentiment scores, through step-by-step instructions and hands-on practice. The basic processing tools are from CKIP. Participants can download these resources, use them, and solve the problems they encounter in this tutorial. This tutorial will begin with some background knowledge of sentiment analysis, such as how sentiments are categorized, where to find available corpora, and which models are commonly applied, especially for the Chinese language. Then a set of basic Chinese text processing tools for word segmentation, tagging, and parsing will be introduced in preparation for mining sentiment and opinions. After conveying the idea of how to pre-process the Chinese language to the audience, I will describe our work on compositional Chinese sentiment analysis from words to sentences, and an application to social media text (Facebook) as an example. All of our recently developed related resources, including the Chinese Morphological Dataset, Augmented NTU Sentiment Dictionary (aug-NTUSD), E-hownet with sentiment information, Chinese Opinion Treebank, and the CopeOpi Sentiment Scorer, will also be introduced and distributed in this tutorial. The tutorial will end with a hands-on session on how to use these materials and tools to process Chinese sentiment. For content details, materials, and the program, please refer to the tutorial URL: http://www.lunweiku.com/ |
Tasks | Part-Of-Speech Tagging, Sentiment Analysis |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-3002/ |
https://www.aclweb.org/anthology/C16-3002 | |
PWC | https://paperswithcode.com/paper/chinese-textual-sentiment-analysis-datasets |
Repo | |
Framework | |
ArabTAG: from a Handcrafted to a Semi-automatically Generated TAG
Title | ArabTAG: from a Handcrafted to a Semi-automatically Generated TAG |
Authors | Chérifa Ben Khelil, Denys Duchier, Yannick Parmentier, Chiraz Zribi, Fériel Ben Fraj |
Abstract | |
Tasks | Question Answering |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-3302/ |
https://www.aclweb.org/anthology/W16-3302 | |
PWC | https://paperswithcode.com/paper/arabtag-from-a-handcrafted-to-a-semi |
Repo | |
Framework | |
Sentence Similarity based on Dependency Tree Kernels for Multi-document Summarization
Title | Sentence Similarity based on Dependency Tree Kernels for Multi-document Summarization |
Authors | Şaziye Betül Özateş, Arzucan Özgür, Dragomir Radev |
Abstract | We introduce an approach based on using the dependency grammar representations of sentences to compute sentence similarity for extractive multi-document summarization. We adapt and investigate the effects of two untyped dependency tree kernels, which have originally been proposed for relation extraction, to the multi-document summarization problem. In addition, we propose a series of novel dependency grammar based kernels to better represent the syntactic and semantic similarities among the sentences. The proposed methods incorporate the type information of the dependency relations for sentence similarity calculation. To our knowledge, this is the first study that investigates using dependency tree based sentence similarity for multi-document summarization. |
Tasks | Document Summarization, Multi-Document Summarization, Relation Extraction |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1452/ |
https://www.aclweb.org/anthology/L16-1452 | |
PWC | https://paperswithcode.com/paper/sentence-similarity-based-on-dependency-tree |
Repo | |
Framework | |
The Naming Sharing Structure and its Cognitive Meaning in Chinese and English
Title | The Naming Sharing Structure and its Cognitive Meaning in Chinese and English |
Authors | Shili Ge, Rou Song |
Abstract | |
Tasks | Machine Translation |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0603/ |
https://www.aclweb.org/anthology/W16-0603 | |
PWC | https://paperswithcode.com/paper/the-naming-sharing-structure-and-its |
Repo | |
Framework | |
Negation Detection in Clinical Reports Written in German
Title | Negation Detection in Clinical Reports Written in German |
Authors | Viviana Cotik, Roland Roller, Feiyu Xu, Hans Uszkoreit, Klemens Budde, Danilo Schmidt |
Abstract | An important subtask in clinical text mining tries to identify whether a clinical finding is expressed as present, absent or unsure in a text. This work presents a system for detecting mentions of clinical findings that are negated or just speculated. The system has been applied to two different types of German clinical texts: clinical notes and discharge summaries. Our approach is built on top of NegEx, a well-known algorithm for identifying non-factive mentions of medical findings. In this work, we adjust a previous adaptation of NegEx to German and evaluate the system on our data to detect negation and speculation. The results are compared to a baseline algorithm and are analyzed for both types of clinical documents. Our system achieves an F1-score above 0.9 on both types of reports. |
Tasks | Named Entity Recognition, Negation Detection, Relation Extraction |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5113/ |
https://www.aclweb.org/anthology/W16-5113 | |
PWC | https://paperswithcode.com/paper/negation-detection-in-clinical-reports |
Repo | |
Framework | |
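The abstract above describes a NegEx-style system that marks clinical findings as negated or speculated. As a rough illustration of the general trigger-and-scope idea only, here is a minimal sketch. The trigger words, terminators, window size, and the function name `classify_finding` are all assumptions made for this example and are not taken from the paper or from any NegEx release. It only looks at triggers that precede the finding.

```python
import re

# Illustrative German trigger words; NOT the trigger list used in the paper.
NEGATION_TRIGGERS = {"kein", "keine", "keinen", "ohne"}
SPECULATION_TRIGGERS = {"verdacht", "möglicherweise", "fraglich"}
# Tokens that end a trigger's scope (also illustrative).
TERMINATORS = {"aber", "jedoch", ".", ";"}
WINDOW = 5  # assumed maximum number of tokens a trigger can reach forward


def classify_finding(sentence, finding):
    """Return 'negated', 'speculated', or 'affirmed' for a single-token finding."""
    tokens = re.findall(r"\w+|[.;,]", sentence.lower())
    target = finding.lower()
    if target not in tokens:
        raise ValueError("finding does not occur in the sentence")
    idx = tokens.index(target)
    # Walk backwards from the finding, looking at no more than WINDOW tokens.
    for tok in reversed(tokens[max(0, idx - WINDOW):idx]):
        if tok in TERMINATORS:
            break  # a terminator cuts off any earlier trigger
        if tok in NEGATION_TRIGGERS:
            return "negated"
        if tok in SPECULATION_TRIGGERS:
            return "speculated"
    return "affirmed"


if __name__ == "__main__":
    print(classify_finding("Der Patient hat kein Fieber.", "Fieber"))
    print(classify_finding("Verdacht auf Pneumonie", "Pneumonie"))
    print(classify_finding("Kein Fieber, aber Husten.", "Husten"))
```

A fuller rule-based system would typically also handle triggers that follow the finding, multi-word triggers and findings, and phrases that look like negation but are not.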
Unsupervised Learning of Spoken Language with Visual Context
Title | Unsupervised Learning of Spoken Language with Visual Context |
Authors | David Harwath, Antonio Torralba, James Glass |
Abstract | Humans learn to speak before they can read or write, so why can’t computers do the same? In this paper, we present a deep neural network model capable of rudimentary spoken language acquisition using untranscribed audio training data, whose only supervision comes in the form of contextually relevant visual images. We describe the collection of our data comprised of over 120,000 spoken audio captions for the Places image dataset and evaluate our model on an image search and annotation task. We also provide some visualizations which suggest that our model is learning to recognize meaningful words within the caption spectrograms. |
Tasks | Image Retrieval, Language Acquisition |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6186-unsupervised-learning-of-spoken-language-with-visual-context |
http://papers.nips.cc/paper/6186-unsupervised-learning-of-spoken-language-with-visual-context.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-spoken-language-with |
Repo | |
Framework | |
Joint search in a bilingual valency lexicon and an annotated corpus
Title | Joint search in a bilingual valency lexicon and an annotated corpus |
Authors | Eva Fučíková, Jan Hajič, Zdeňka Urešová |
Abstract | In this paper and the associated system demo, we present an advanced search system that allows users to perform a joint search over a (bilingual) valency lexicon and a correspondingly annotated linked parallel corpus. This search tool has been developed on the basis of the Prague Czech-English Dependency Treebank, but its ideas are applicable in principle to any bilingual parallel corpus that is annotated for dependencies and valency (i.e., predicate-argument structure), and where verbs are linked to appropriate entries in an associated valency lexicon. Our online search tool consolidates several search interfaces into one, providing expanded structured search capability and a more efficient advanced way to search, allowing users to search for verb pairs, verbal argument pairs, their surface realization as recorded in the lexicon, or their surface form actually appearing in the linked parallel corpus. The search system is currently under development, and is replacing our current search tool available at http://lindat.mff.cuni.cz/services/CzEngVallex, which can search the lexicon but whose queries cannot take advantage of the underlying corpus or use the additional surface form information from the lexicon(s). The system is available as open source. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-2009/ |
https://www.aclweb.org/anthology/C16-2009 | |
PWC | https://paperswithcode.com/paper/joint-search-in-a-bilingual-valency-lexicon |
Repo | |
Framework | |
Harnessing Sequence Labeling for Sarcasm Detection in Dialogue from TV Series ‘Friends’
Title | Harnessing Sequence Labeling for Sarcasm Detection in Dialogue from TV Series ‘Friends’ |
Authors | Aditya Joshi, Vaibhav Tripathi, Pushpak Bhattacharyya, Mark J. Carman |
Abstract | |
Tasks | Sarcasm Detection, Sentiment Analysis |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/K16-1015/ |
https://www.aclweb.org/anthology/K16-1015 | |
PWC | https://paperswithcode.com/paper/harnessing-sequence-labeling-for-sarcasm |
Repo | |
Framework | |
Data Formats and Management Strategies from the Perspective of Language Resource Producers ― Personal Diachronic and Social Synchronic Data Sharing ―
Title | Data Formats and Management Strategies from the Perspective of Language Resource Producers ― Personal Diachronic and Social Synchronic Data Sharing ― |
Authors | Kazushi Ohya |
Abstract | This is a report of findings from on-going language documentation research based on three consecutive projects from 2008 to 2016. In the light of this research, we propose that (1) we should stand on the side of language resource producers to enhance the research of language processing. We support personal data management in addition to social data sharing. (2) This support leads to adopting simple data formats instead of the multi-link-path data models proposed as international standards up to the present. (3) We should set up a framework for total language resource study that includes not only pivotal data formats such as standard formats, but also the surroundings of data formation to capture a wider range of language activities, e.g. annotation, hesitant language formation, and reference-referent relations. A study of this framework is expected to be a foundation of rebuilding man-machine interface studies in which we seek to observe generative processes of informational symbols in order to establish a high affinity interface in regard to documentation. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1516/ |
https://www.aclweb.org/anthology/L16-1516 | |
PWC | https://paperswithcode.com/paper/data-formats-and-management-strategies-from |
Repo | |
Framework | |
Augmenting Course Material with Open Access Textbooks
Title | Augmenting Course Material with Open Access Textbooks |
Authors | Smitha Milli, Marti A. Hearst |
Abstract | |
Tasks | Information Retrieval |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0526/ |
https://www.aclweb.org/anthology/W16-0526 | |
PWC | https://paperswithcode.com/paper/augmenting-course-material-with-open-access |
Repo | |
Framework | |
A Constant-Factor Bi-Criteria Approximation Guarantee for k-means++
Title | A Constant-Factor Bi-Criteria Approximation Guarantee for k-means++ |
Authors | Dennis Wei |
Abstract | This paper studies the $k$-means++ algorithm for clustering as well as the class of $D^\ell$ sampling algorithms to which $k$-means++ belongs. It is shown that for any constant factor $\beta > 1$, selecting $\beta k$ cluster centers by $D^\ell$ sampling yields a constant-factor approximation to the optimal clustering with $k$ centers, in expectation and without conditions on the dataset. This result extends the previously known $O(\log k)$ guarantee for the case $\beta = 1$ to the constant-factor bi-criteria regime. It also improves upon an existing constant-factor bi-criteria result that holds only with constant probability. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6309-a-constant-factor-bi-criteria-approximation-guarantee-for-k-means |
http://papers.nips.cc/paper/6309-a-constant-factor-bi-criteria-approximation-guarantee-for-k-means.pdf | |
PWC | https://paperswithcode.com/paper/a-constant-factor-bi-criteria-approximation-1 |
Repo | |
Framework | |
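The abstract above concerns $D^\ell$ sampling, where each new center is drawn with probability proportional to the $\ell$-th power of a point's distance to its nearest already-chosen center, and the bi-criteria setting, where $\beta k$ centers are selected instead of $k$. The following is a minimal sketch of that seeding step under stated assumptions: the function names, the Euclidean distance, rounding $\beta k$ up with `ceil`, and the uniform fallback when all weights are zero are choices made for this example, not details from the paper.

```python
import math
import random


def _dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def d_ell_sampling(points, num_centers, ell=2, seed=0):
    """Pick `num_centers` points by D^ell sampling (ell=2 is k-means++ seeding)."""
    rng = random.Random(seed)
    # First center: uniformly at random.
    centers = [rng.choice(points)]
    # Distance from every point to its nearest chosen center so far.
    dists = [_dist(p, centers[0]) for p in points]
    while len(centers) < num_centers:
        weights = [d ** ell for d in dists]
        if sum(weights) == 0:
            # All points coincide with chosen centers; fall back to uniform.
            nxt = rng.choice(points)
        else:
            nxt = rng.choices(points, weights=weights, k=1)[0]
        centers.append(nxt)
        dists = [min(d, _dist(p, nxt)) for p, d in zip(points, dists)]
    return centers


def bicriteria_seeding(points, k, beta=2.0, ell=2, seed=0):
    """Select ceil(beta * k) centers, to be compared against the optimal k-clustering."""
    return d_ell_sampling(points, math.ceil(beta * k), ell=ell, seed=seed)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (20, 1)]
    print(bicriteria_seeding(data, k=2, beta=1.5))
```

Because an already-chosen point has distance zero to itself, it gets zero sampling weight, so with distinct input points the selected centers are distinct.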
Microblog Emotion Classification by Computing Similarity in Text, Time, and Space
Title | Microblog Emotion Classification by Computing Similarity in Text, Time, and Space |
Authors | Anja Summa, Bernd Resch, Michael Strube |
Abstract | Most work in NLP analysing microblogs focuses on textual content, thus neglecting temporal and spatial information. We present a new interdisciplinary method for emotion classification that combines linguistic, temporal, and spatial information into a single metric. We create a graph of labeled and unlabeled tweets that encodes the relations between neighboring tweets with respect to their emotion labels. Graph-based semi-supervised learning labels all tweets with an emotion. |
Tasks | Emotion Classification, Sentiment Analysis |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4317/ |
https://www.aclweb.org/anthology/W16-4317 | |
PWC | https://paperswithcode.com/paper/microblog-emotion-classification-by-computing |
Repo | |
Framework | |
Multiplicative Representations for Unsupervised Semantic Role Induction
Title | Multiplicative Representations for Unsupervised Semantic Role Induction |
Authors | Yi Luan, Yangfeng Ji, Hannaneh Hajishirzi, Boyang Li |
Abstract | |
Tasks | Representation Learning, Semantic Role Labeling |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-2020/ |
https://www.aclweb.org/anthology/P16-2020 | |
PWC | https://paperswithcode.com/paper/multiplicative-representations-for |
Repo | |
Framework | |