October 16, 2019

Paper Group NANR 42

Measuring language distance among historical varieties using perplexity. Application to European Portuguese.


Title	Measuring language distance among historical varieties using perplexity. Application to European Portuguese.
Authors	Jose Ramom Pichel Campos, Pablo Gamallo, Iñaki Alegria
Abstract	The objective of this work is to quantify, with a simple and robust measure, the distance between historical varieties of a language. The measure will be inferred from text corpora corresponding to historical periods. Different approaches have been proposed for similar aims: Language Identification, Phylogenetics, Historical Linguistics or Dialectology. In our approach, we used a perplexity-based measure to calculate language distance between all the historical periods of a specific language: European Portuguese. Perplexity has also proven to be a robust metric to calculate distance between languages. However, this measure has not been tested yet to identify diachronic periods within the historical evolution of a specific language. For this purpose, a historical Portuguese corpus has been constructed from different open sources containing texts with close original spelling. The results of our experiments show that Portuguese keeps an important degree of homogeneity over time. We anticipate this metric to be a starting point to be applied to other languages.
Tasks	Language Acquisition, Language Identification
Published	2018-08-01
URL	https://www.aclweb.org/anthology/W18-3916/
PDF	https://www.aclweb.org/anthology/W18-3916
PWC	https://paperswithcode.com/paper/measuring-language-distance-among-historical
Repo
Framework
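
The perplexity-based distance described in the abstract can be illustrated with a small sketch. The character trigram model, the add-one smoothing, and the symmetric averaging of the two perplexities are assumptions made for illustration; the abstract does not state which language model or smoothing the authors used, and the function names are hypothetical.

```python
import math
from collections import Counter


def train_char_ngrams(text, n=3):
    """Count character n-grams and their (n-1)-character contexts."""
    padded = " " * (n - 1) + text
    positions = range(len(padded) - n + 1)
    ngrams = Counter(padded[i:i + n] for i in positions)
    contexts = Counter(padded[i:i + n - 1] for i in positions)
    return ngrams, contexts, set(padded), n


def perplexity(model, text):
    """Perplexity of `text` under an add-one smoothed character n-gram model."""
    ngrams, contexts, vocab, n = model
    if not text:
        raise ValueError("text must not be empty")
    padded = " " * (n - 1) + text
    v = len(vocab) + 1  # one extra slot for characters unseen in training
    log_prob = 0.0
    count = 0
    for i in range(len(padded) - n + 1):
        gram = padded[i:i + n]
        p = (ngrams[gram] + 1) / (contexts[gram[:-1]] + v)
        log_prob += math.log(p)
        count += 1
    return math.exp(-log_prob / count)


def perplexity_distance(text_a, text_b, n=3):
    """Assumed symmetric distance: mean of the two cross-perplexities."""
    pp_ab = perplexity(train_char_ngrams(text_a, n), text_b)
    pp_ba = perplexity(train_char_ngrams(text_b, n), text_a)
    return (pp_ab + pp_ba) / 2
```

In this sketch, two samples from the same period should yield a lower value than samples from distant periods, since each model assigns higher probability to text that resembles its training data.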

A New Large Scale Dynamic Texture Dataset with Application to ConvNet Understanding


Title	A New Large Scale Dynamic Texture Dataset with Application to ConvNet Understanding
Authors	Isma Hadji, Richard P. Wildes
Abstract	This paper introduces a new large scale dynamic texture dataset. The dataset is provided with two complementary organizations, one based on dynamics independent of spatial appearance and one based on spatial appearance independent of dynamics. With over 10,000 videos, the proposed Dynamic Texture DataBase (DTDB) is two orders of magnitude larger than any previously available dynamic texture dataset. The complementary organizations of the dataset allow for uniquely insightful experiments regarding the abilities of major classes of spatiotemporal ConvNet architectures to exploit appearance vs. dynamic information. We also present a novel two-stream ConvNet that provides an alternative to the standard optical-flow-based motion stream to broaden the range of dynamic patterns that can be encompassed. The resulting motion stream is shown to outperform the traditional optical flow stream by considerable margins. Finally, the utility of the dataset as a pre-training substrate is demonstrated via transfer learning experiments with a different dynamic texture dataset as well as the companion task of dynamic scene recognition resulting in a new state-of-the-art.
Tasks	Optical Flow Estimation, Scene Recognition, Transfer Learning
Published	2018-09-01
URL	http://openaccess.thecvf.com/content_ECCV_2018/html/Isma_Hadji_A_New_Large_ECCV_2018_paper.html
PDF	http://openaccess.thecvf.com/content_ECCV_2018/papers/Isma_Hadji_A_New_Large_ECCV_2018_paper.pdf
PWC	https://paperswithcode.com/paper/a-new-large-scale-dynamic-texture-dataset
Repo
Framework

Single Image Intrinsic Decomposition without a Single Intrinsic Image


Title	Single Image Intrinsic Decomposition without a Single Intrinsic Image
Authors	Wei-Chiu Ma, Hang Chu, Bolei Zhou, Raquel Urtasun, Antonio Torralba
Abstract	Intrinsic image decomposition (decomposing a natural image into a set of images corresponding to different physical causes) is one of the key and fundamental problems of computer vision. Previous intrinsic decomposition approaches either address the problem in a fully supervised manner, or require multiple images of the same scene as input. These approaches are less desirable in practice, as ground truth intrinsic images are extremely difficult to acquire, and the requirement of multiple images poses a severe limitation on applicable scenarios. In this paper, we propose to bring the best of both worlds. We present a two-stream convolutional neural network framework that is capable of learning the decomposition effectively in the absence of any ground truth intrinsic images, and can be easily extended to a (semi-)supervised setup. At inference time, our model can be easily reduced to a single-stream module that performs intrinsic decomposition on a single input image. We demonstrate the effectiveness of our framework through extensive experimental study on both synthetic and real-world datasets, showing superior performance over previous approaches in both single-image and multi-image settings. Notably, our approach outperforms previous state-of-the-art single image methods while using only 50% of ground truth supervision.
Tasks	Intrinsic Image Decomposition
Published	2018-09-01
URL	http://openaccess.thecvf.com/content_ECCV_2018/html/Wei-Chiu_Single_Image_Intrinsic_ECCV_2018_paper.html
PDF	http://openaccess.thecvf.com/content_ECCV_2018/papers/Wei-Chiu_Single_Image_Intrinsic_ECCV_2018_paper.pdf
PWC	https://paperswithcode.com/paper/single-image-intrinsic-decomposition-without
Repo
Framework

Stacked Semantics-Guided Attention Model for Fine-Grained Zero-Shot Learning


Title	Stacked Semantics-Guided Attention Model for Fine-Grained Zero-Shot Learning
Authors	Yunlong Yu, Zhong Ji, Yanwei Fu, Jichang Guo, Yanwei Pang, Zhongfei (Mark) Zhang
Abstract	Zero-Shot Learning (ZSL) is generally achieved via aligning the semantic relationships between the visual features and the corresponding class semantic descriptions. However, using the global features to represent fine-grained images may lead to sub-optimal results since they neglect the discriminative differences of local regions. Besides, different regions contain distinct discriminative information. The important regions should contribute more to the prediction. To this end, we propose a novel stacked semantics-guided attention (S2GA) model to obtain semantic relevant features by using individual class semantic features to progressively guide the visual features to generate an attention map for weighting the importance of different local regions. Feeding both the integrated visual features and the class semantic features into a multi-class classification architecture, the proposed framework can be trained end-to-end. Extensive experimental results on CUB and NABird datasets show that the proposed approach has a consistent improvement on both fine-grained zero-shot classification and retrieval tasks.
Tasks	Zero-Shot Learning
Published	2018-12-01
URL	http://papers.nips.cc/paper/7839-stacked-semantics-guided-attention-model-for-fine-grained-zero-shot-learning
PDF	http://papers.nips.cc/paper/7839-stacked-semantics-guided-attention-model-for-fine-grained-zero-shot-learning.pdf
PWC	https://paperswithcode.com/paper/stacked-semantics-guided-attention-model-for
Repo
Framework
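
The idea of letting class semantic features weight local image regions can be sketched as a single attention step. This is a deliberately simplified illustration: it assumes region features and the semantic vector already live in the same space, uses a plain dot product followed by a softmax, and omits the learned projections and the stacking that the S2GA model applies. All names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def semantic_attention(regions, semantic):
    """Weight local region features by their relevance to a semantic vector.

    regions:  list of region feature vectors, each of length d
    semantic: class semantic vector of length d (assumed already projected)
    Returns the attention weights and the attention-pooled feature vector.
    """
    scores = [sum(r * s for r, s in zip(region, semantic)) for region in regions]
    weights = softmax(scores)
    dim = len(semantic)
    pooled = [
        sum(w * region[j] for w, region in zip(weights, regions))
        for j in range(dim)
    ]
    return weights, pooled
```

Regions that align with the semantic description receive larger weights, so they contribute more to the pooled feature passed on to the classifier.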

Will it Blend? Blending Weak and Strong Labeled Data in a Neural Network for Argumentation Mining


Title	Will it Blend? Blending Weak and Strong Labeled Data in a Neural Network for Argumentation Mining
Authors	Eyal Shnarch, Carlos Alzate, Lena Dankin, Martin Gleize, Yufang Hou, Leshem Choshen, Ranit Aharonov, Noam Slonim
Abstract	The process of obtaining high quality labeled data for natural language understanding tasks is often slow, error-prone, complicated and expensive. With the vast usage of neural networks, this issue becomes more notorious since these networks require a large amount of labeled data to produce satisfactory results. We propose a methodology to blend high quality but scarce strong labeled data with noisy but abundant weak labeled data during the training of neural networks. Experiments in the context of topic-dependent evidence detection with two forms of weak labeled data show the advantages of the blending scheme. In addition, we provide a manually annotated data set for the task of topic-dependent evidence detection. We believe that blending weak and strong labeled data is a general notion that may be applicable to many language understanding tasks, and can especially assist researchers who wish to train a network but have a small amount of high quality labeled data for their task of interest.
Tasks	Information Retrieval, Relation Extraction, Sarcasm Detection, Sentiment Analysis
Published	2018-07-01
URL	https://www.aclweb.org/anthology/P18-2095/
PDF	https://www.aclweb.org/anthology/P18-2095
PWC	https://paperswithcode.com/paper/will-it-blend-blending-weak-and-strong
Repo
Framework

YNU-HPCC at SemEval-2018 Task 3: Ensemble Neural Network Models for Irony Detection on Twitter


Title	YNU-HPCC at SemEval-2018 Task 3: Ensemble Neural Network Models for Irony Detection on Twitter
Authors	Bo Peng, Jin Wang, Xuejie Zhang
Abstract	This paper describes the system we proposed to participate in the first year of the Irony Detection in English Tweets competition. Previous work demonstrates that LSTM models have achieved remarkable performance in natural language processing; besides, combining multiple classifications from various individual classifiers is in general more powerful than a single classification. In order to obtain more precise classification for irony detection, our system trained several individual neural network classifiers and combined their results according to the ensemble-learning algorithm.
Tasks	Sarcasm Detection, Sentiment Analysis
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1101/
PDF	https://www.aclweb.org/anthology/S18-1101
PWC	https://paperswithcode.com/paper/ynu-hpcc-at-semeval-2018-task-3-ensemble
Repo
Framework
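
The abstract does not say which ensemble algorithm combines the individual classifiers. As one common possibility, the sketch below uses soft voting, which averages the predicted probabilities of several models; the function name and the 0.5 threshold are assumptions for illustration only.

```python
def soft_vote(prob_lists, threshold=0.5):
    """Average per-model probabilities of the positive (ironic) class.

    prob_lists: one list per model, each holding a probability per tweet.
    Returns the averaged probabilities and the resulting 0/1 labels.
    """
    n_models = len(prob_lists)
    n_items = len(prob_lists[0])
    averaged = [
        sum(model[i] for model in prob_lists) / n_models
        for i in range(n_items)
    ]
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels


# Three hypothetical classifiers scoring the same three tweets.
model_outputs = [
    [0.9, 0.2, 0.6],
    [0.7, 0.4, 0.3],
    [0.8, 0.1, 0.5],
]
averaged, labels = soft_vote(model_outputs)
```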

An Analysis of Annotated Corpora for Emotion Classification in Text


Title	An Analysis of Annotated Corpora for Emotion Classification in Text
Authors	Laura-Ana-Maria Bostan, Roman Klinger
Abstract	Several datasets have been annotated and published for classification of emotions. They differ in several ways: (1) the use of different annotation schemata (e.g., discrete label sets, including joy, anger, fear, or sadness or continuous values including valence or arousal), (2) the domain, and (3) the file formats. This leads to several research gaps: supervised models often only use a limited set of available resources. Additionally, no previous work has compared emotion corpora in a systematic manner. We aim at contributing to this situation with a survey of the datasets, and aggregate them in a common file format with a common annotation schema. Based on this aggregation, we perform the first cross-corpus classification experiments in the spirit of future research enabled by this paper, in order to gain insight and a better understanding of differences of models inferred from the data. This work also simplifies the choice of the most appropriate resources for developing a model for a novel domain. One result from our analysis is that a subset of corpora is better classified with models trained on a different corpus. For none of the corpora, training on all data altogether is better than using a subselection of the resources. Our unified corpus is available at http://www.ims.uni-stuttgart.de/data/unifyemotion.
Tasks	Emotion Classification
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-1179/
PDF	https://www.aclweb.org/anthology/C18-1179
PWC	https://paperswithcode.com/paper/an-analysis-of-annotated-corpora-for-emotion
Repo
Framework

NLPRL-IITBHU at SemEval-2018 Task 3: Combining Linguistic Features and Emoji pre-trained CNN for Irony Detection in Tweets


Title	NLPRL-IITBHU at SemEval-2018 Task 3: Combining Linguistic Features and Emoji pre-trained CNN for Irony Detection in Tweets
Authors	Harsh Rangwani, Devang Kulshreshtha, Anil Kumar Singh
Abstract	This paper describes our participation in SemEval 2018 Task 3 on Irony Detection in Tweets. We combine linguistic features with pre-trained activations of a neural network. The CNN is trained on the emoji prediction task. We combine the two feature sets and feed them into an XGBoost Classifier for classification. Subtask-A involves classification of tweets into ironic and non-ironic instances whereas Subtask-B involves classification of the tweet into: non-ironic, verbal irony, situational irony or other verbal irony. It is observed that combining features from these two different feature spaces improves our system results. We leverage the SMOTE algorithm to handle the problem of class imbalance in Subtask-B. Our final model achieves an F1-score of 0.65 and 0.47 on Subtask-A and Subtask-B respectively. Our system ranks 4th on both tasks respectively, outperforming the baseline by 6% on Subtask-A and 14% on Subtask-B.
Tasks	Sarcasm Detection
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1104/
PDF	https://www.aclweb.org/anthology/S18-1104
PWC	https://paperswithcode.com/paper/nlprl-iitbhu-at-semeval-2018-task-3-combining
Repo
Framework
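
The SMOTE step mentioned in the abstract oversamples a minority class by interpolating between a sample and one of its nearest minority-class neighbours. The sketch below shows that core idea in plain Python; it is not the authors' pipeline, and the function name, the default `k`, and the fixed seed are assumptions for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation.

    minority: list of feature vectors from the minority class (at least 2).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, excluding x itself
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: squared_distance(x, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        u = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([xi + u * (ni - xi) for xi, ni in zip(x, neighbour)])
    return synthetic
```

Each synthetic point lies on the segment between two real minority samples, so the new data stays inside the region the minority class already occupies.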

ELiRF-UPV at SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge


Title	ELiRF-UPV at SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge
Authors	José-Ángel González, Lluís-F. Hurtado, Encarna Segarra, Ferran Pla
Abstract	This paper describes the participation of ELiRF-UPV team at task 11, Machine Comprehension using Commonsense Knowledge, of SemEval-2018. Our approach is based on the use of word embeddings, NumberBatch Embeddings, and a Deep Learning architecture to find the best answer for the multiple-choice questions based on the narrative text. The results obtained are in line with those obtained by the other participants and they encourage us to continue working on this problem.
Tasks	Question Answering, Reading Comprehension, Tokenization, Word Embeddings
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1172/
PDF	https://www.aclweb.org/anthology/S18-1172
PWC	https://paperswithcode.com/paper/elirf-upv-at-semeval-2018-task-11-machine
Repo
Framework

Assessing Meaning Components in German Complex Verbs: A Collection of Source-Target Domains and Directionality


Title	Assessing Meaning Components in German Complex Verbs: A Collection of Source-Target Domains and Directionality
Authors	Sabine Schulte im Walde, Maximilian Köper, Sylvia Springorum
Abstract	This paper presents a collection to assess meaning components in German complex verbs, which frequently undergo meaning shifts. We use a novel strategy to obtain source and target domain characterisations via sentence generation rather than sentence annotation. A selection of arrows adds spatial directional information to the generated contexts. We provide a broad qualitative description of the dataset, and a series of standard classification experiments verifies the quantitative reliability of the presented resource. The setup for collecting the meaning components is applicable also to other languages, regarding complex verbs as well as other language-specific targets that involve meaning shifts.
Tasks
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-2003/
PDF	https://www.aclweb.org/anthology/S18-2003
PWC	https://paperswithcode.com/paper/assessing-meaning-components-in-german
Repo
Framework

Towards Qualitative Word Embeddings Evaluation: Measuring Neighbors Variation


Title	Towards Qualitative Word Embeddings Evaluation: Measuring Neighbors Variation
Authors	Bénédicte Pierrejean, Ludovic Tanguy
Abstract	We propose a method to study the variation lying between different word embeddings models trained with different parameters. We explore the variation between models trained with only one varying parameter by observing the distributional neighbors variation and show how changing only one parameter can have a massive impact on a given semantic space. We show that the variation is not affecting all words of the semantic space equally. Variation is influenced by parameters such as setting a parameter to its minimum or maximum value but it also depends on the corpus intrinsic features such as the frequency of a word. We identify semantic classes of words remaining stable across the models trained and specific words having high variation.
Tasks	Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings
Published	2018-06-01
URL	https://www.aclweb.org/anthology/N18-4005/
PDF	https://www.aclweb.org/anthology/N18-4005
PWC	https://paperswithcode.com/paper/towards-qualitative-word-embeddings
Repo
Framework
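
Observing how distributional neighbours change between two embedding models can be sketched as follows. The measure used here, one minus the share of nearest neighbours that both models agree on, is an assumption chosen for illustration; the abstract does not give the exact formula. The sketch also assumes both models contain the same vocabulary and no zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbours(model, word, k):
    """The k words closest to `word` by cosine similarity."""
    target = model[word]
    others = [w for w in model if w != word]
    others.sort(key=lambda w: cosine(target, model[w]), reverse=True)
    return set(others[:k])


def neighbour_variation(model_a, model_b, word, k):
    """0.0 when both models share all k neighbours, 1.0 when they share none."""
    shared = nearest_neighbours(model_a, word, k) & nearest_neighbours(model_b, word, k)
    return 1 - len(shared) / k
```

Computing this value for every word in the vocabulary would show which words keep stable neighbourhoods across training configurations and which ones shift.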

What type of happiness are you looking for? - A closer look at detecting mental health from language


Title	What type of happiness are you looking for? - A closer look at detecting mental health from language
Authors	Alina Arseniev-Koehler, Sharon Mozgai, Stefan Scherer
Abstract	Computational models to detect mental illnesses from text and speech could enhance our understanding of mental health while offering opportunities for early detection and intervention. However, these models are often disconnected from the lived experience of depression and the larger diagnostic debates in mental health. This article investigates these disconnects, primarily focusing on the labels used to diagnose depression, how these labels are computationally represented, and the performance metrics used to evaluate computational models. We also consider how medical instruments used to measure depression, such as the Patient Health Questionnaire (PHQ), contribute to these disconnects. To illustrate our points, we incorporate mixed-methods analyses of 698 interviews on emotional health, which are coupled with self-report PHQ screens for depression. We propose possible strategies to bridge these gaps between modern psychiatric understandings of depression, lay experience of depression, and computational representation.
Tasks
Published	2018-06-01
URL	https://www.aclweb.org/anthology/W18-0601/
PDF	https://www.aclweb.org/anthology/W18-0601
PWC	https://paperswithcode.com/paper/what-type-of-happiness-are-you-looking-for-a
Repo
Framework

Is Something Better than Nothing? Automatically Predicting Stance-based Arguments Using Deep Learning and Small Labelled Dataset


Title	Is Something Better than Nothing? Automatically Predicting Stance-based Arguments Using Deep Learning and Small Labelled Dataset
Authors	Pavithra Rajendran, Danushka Bollegala, Simon Parsons
Abstract	Online reviews have become a popular portal among customers making decisions about purchasing products. A number of corpora of reviews have been widely investigated in NLP in general, and, in particular, in argument mining. This is a subset of NLP that deals with extracting arguments and the relations among them from user-based content. A major problem faced by argument mining research is the lack of human-annotated data. In this paper, we investigate the use of weakly supervised and semi-supervised methods for automatically annotating data, and thus providing large annotated datasets. We do this by building on previous work that explores the classification of opinions present in reviews based on whether the stance is expressed explicitly or implicitly. In the work described here, we automatically annotate stance as implicit or explicit and our results show that the datasets we generate, although noisy, can be used to learn better models for implicit/explicit opinion classification.
Tasks	Abstract Argumentation, Argument Mining, Opinion Mining, Sentiment Analysis
Published	2018-06-01
URL	https://www.aclweb.org/anthology/N18-2005/
PDF	https://www.aclweb.org/anthology/N18-2005
PWC	https://paperswithcode.com/paper/is-something-better-than-nothing
Repo
Framework

HHU at SemEval-2018 Task 12: Analyzing an Ensemble-based Deep Learning Approach for the Argument Mining Task of Choosing the Correct Warrant


Title	HHU at SemEval-2018 Task 12: Analyzing an Ensemble-based Deep Learning Approach for the Argument Mining Task of Choosing the Correct Warrant
Authors	Matthias Liebeck, Andreas Funke, Stefan Conrad
Abstract	This paper describes our participation in the SemEval-2018 Task 12 Argument Reasoning Comprehension Task which calls to develop systems that, given a reason and a claim, predict the correct warrant from two opposing options. We decided to use a deep learning architecture and combined 623 models with different hyperparameters into an ensemble. Our extensive analysis of our architecture and ensemble reveals that the decision to use an ensemble was suboptimal. Additionally, we benchmark a support vector machine as a baseline. Furthermore, we experimented with an alternative data split and achieved more stable results.
Tasks	Argument Mining
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1188/
PDF	https://www.aclweb.org/anthology/S18-1188
PWC	https://paperswithcode.com/paper/hhu-at-semeval-2018-task-12-analyzing-an
Repo
Framework

Multi-Cell Detection and Classification Using a Generative Convolutional Model


Title	Multi-Cell Detection and Classification Using a Generative Convolutional Model
Authors	Florence Yellin, Benjamin D. Haeffele, Sophie Roth, René Vidal
Abstract	Detecting, counting, and classifying various cell types in images of human blood is important in many biomedical applications. However, these tasks can be very difficult due to the wide range of biological variability and the resolution limitations of many imaging modalities. This paper proposes a new approach to detecting, counting and classifying white blood cell populations in holographic images, which capitalizes on the fact that the variability in a mixture of blood cells is constrained by physiology. The proposed approach is based on a probabilistic generative model that describes an image of a population of cells as the sum of atoms from a convolutional dictionary of cell templates. The class of each template is drawn from a prior distribution that captures statistical information about blood cell mixtures. The parameters of the prior distribution are learned from a database of complete blood count results obtained from patients, and the cell templates are learned from images of purified cells from a single cell class using an extension of convolutional dictionary learning. Cell detection, counting and classification is then done using an extension of convolutional sparse coding that accounts for class proportion priors. This method has been successfully used to detect, count and classify white blood cell populations in holographic images of lysed blood obtained from 20 normal blood donors and 12 abnormal clinical blood discard samples. The error from our method is under 6.8% for all class populations, compared to errors of over 28.6% for all other methods tested.
Tasks	Dictionary Learning
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Yellin_Multi-Cell_Detection_and_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Yellin_Multi-Cell_Detection_and_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/multi-cell-detection-and-classification-using
Repo
Framework
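
The generative view in the abstract, where an image is the sum of cell templates placed at detected locations, can be illustrated with a one-dimensional toy. This sketch covers only the synthesis direction; it does not implement the paper's inference (convolutional sparse coding with class proportion priors), and the templates, detections, and function names are made up for illustration.

```python
from collections import Counter


def synthesize(length, templates, detections):
    """Build a 1D signal as a sum of shifted, scaled templates.

    templates:  dict mapping class id to a list of template values
    detections: list of (class_id, position, strength) tuples
    """
    signal = [0.0] * length
    for class_id, position, strength in detections:
        for offset, value in enumerate(templates[class_id]):
            idx = position + offset
            if 0 <= idx < length:  # clip templates that run off the edge
                signal[idx] += strength * value
    return signal


def class_counts(detections):
    """Count detections per class, as in counting cells per cell type."""
    return Counter(class_id for class_id, _, _ in detections)


# Two hypothetical cell classes with different template shapes.
templates = {0: [1.0, 2.0, 1.0], 1: [3.0, 3.0]}
detections = [(0, 1, 1.0), (1, 2, 2.0)]
signal = synthesize(6, templates, detections)
counts = class_counts(detections)
```

Detection and classification then amount to recovering `detections` from an observed signal, which is where a prior over class proportions can constrain the answer.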