October 15, 2019


Paper Group NANR 181


Getting to "Hearer-old": Charting Referring Expressions Across Time

Title Getting to "Hearer-old": Charting Referring Expressions Across Time
Authors Ieva Staliūnaitė, Hannah Rohde, Bonnie Webber, Annie Louis
Abstract When a reader is first introduced to an entity, its referring expression must describe the entity. For entities that are widely known, a single word or phrase often suffices. This paper presents the first study of how expressions that refer to the same entity develop over time. We track thousands of person and organization entities over 20 years of New York Times (NYT). As entities move from hearer-new (first introduction to the NYT audience) to hearer-old (common knowledge) status, we show empirically that the referring expressions along this trajectory depend on the type of the entity, and exhibit linguistic properties related to becoming common knowledge (e.g., shorter length, less use of appositives, more definiteness). These properties can also be used to build a model to predict how long it will take for an entity to reach hearer-old status. Our results reach 10-30% absolute improvement over a majority-class baseline.
Tasks Coreference Resolution
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1466/
PDF https://www.aclweb.org/anthology/D18-1466
PWC https://paperswithcode.com/paper/getting-to-hearer-old-charting-referring
Repo
Framework
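
The linguistic cues named in the abstract (expression length, appositives, definiteness) can be captured with simple surface features. The sketch below is illustrative only; the feature definitions and example expressions are assumptions, not the authors' implementation.

```python
def referring_expression_features(expr: str) -> dict:
    """Toy surface features of a referring expression, loosely following
    the cues named in the abstract (length, appositives, definiteness)."""
    tokens = expr.split()
    return {
        "length": len(tokens),
        # A comma inside the expression is a crude proxy for an appositive.
        "has_appositive": "," in expr,
        # A leading definite article is a crude definiteness signal.
        "definite": tokens[0].lower() == "the" if tokens else False,
    }

# Hearer-new style: long, appositive-bearing first introduction.
new = referring_expression_features(
    "Satya Nadella, the chief executive of Microsoft")
# Hearer-old style: short, definite later mention.
old = referring_expression_features("the Microsoft chief")

print(new["length"] > old["length"])  # first mentions tend to be longer
print(new["has_appositive"], old["has_appositive"])
```

Features like these, tracked over years of mentions, are the kind of input a classifier could use to predict time-to-hearer-old status.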

Modeling Personality Traits of Filipino Twitter Users

Title Modeling Personality Traits of Filipino Twitter Users
Authors Edward Tighe, Charibeth Cheng
Abstract Recent studies in the field of text-based personality recognition experiment with different languages, feature extraction techniques, and machine learning algorithms to create better and more accurate models; however, little focus is placed on exploring the language use of a group of individuals defined by nationality. Individuals of the same nationality share certain practices and communicate certain ideas that can become embedded in their natural language. Many nationals are also not limited to speaking just one language: Filipinos, for example, speak Filipino and English, the two national languages of the Philippines. The addition of several regional/indigenous languages, along with the commonness of code-switching, allows Filipinos to have a rich vocabulary. This presents an opportunity to create a text-based personality model based on how Filipinos speak, regardless of the language they use. To do so, data was collected from 250 Filipino Twitter users. Different combinations of data processing techniques were tested to create personality models for each of the Big Five. The results for both regression and classification show that Conscientiousness is consistently the easiest trait to model, followed by Extraversion. Classification models for Agreeableness and Neuroticism had subpar performances, but performed better than those of Openness. An analysis of personality trait score representation showed that classifying extreme outliers generally produces better results for all traits except Neuroticism and Openness.
Tasks
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-1115/
PDF https://www.aclweb.org/anthology/W18-1115
PWC https://paperswithcode.com/paper/modeling-personality-traits-of-filipino
Repo
Framework

An Approach to the CLPsych 2018 Shared Task Using Top-Down Text Representation and Simple Bottom-Up Model Selection

Title An Approach to the CLPsych 2018 Shared Task Using Top-Down Text Representation and Simple Bottom-Up Model Selection
Authors Micah Iserman, Molly Ireland, Andrew Littlefield, Tyler Davis, Sage Maliepaard
Abstract
Tasks Model Selection
Published 2018-06-01
URL https://www.aclweb.org/anthology/papers/W18-0605/w18-0605
PDF https://www.aclweb.org/anthology/W18-0605
PWC https://paperswithcode.com/paper/an-approach-to-the-clpsych-2018-shared-task
Repo
Framework

MIAPARLE: Online training for the discrimination of stress contrasts

Title MIAPARLE: Online training for the discrimination of stress contrasts
Authors Jean-Philippe Goldman, Sandra Schwab
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1364/
PDF https://www.aclweb.org/anthology/L18-1364
PWC https://paperswithcode.com/paper/miaparle-online-training-for-the
Repo
Framework

Limbic: Author-Based Sentiment Aspect Modeling Regularized with Word Embeddings and Discourse Relations

Title Limbic: Author-Based Sentiment Aspect Modeling Regularized with Word Embeddings and Discourse Relations
Authors Zhe Zhang, Munindar Singh
Abstract We propose Limbic, an unsupervised probabilistic model that addresses the problem of discovering aspects and sentiments and associating them with authors of opinionated texts. Limbic combines three ideas, incorporating authors, discourse relations, and word embeddings. For discourse relations, Limbic adopts a generative process regularized by a Markov Random Field. To promote words with high semantic similarity into the same topic, Limbic captures semantic regularities from word embeddings via a generalized Pólya Urn process. We demonstrate that Limbic (1) discovers aspects associated with sentiments with high lexical diversity, and (2) outperforms state-of-the-art models by a substantial margin in topic cohesion and sentiment classification.
Tasks Semantic Similarity, Semantic Textual Similarity, Sentiment Analysis, Topic Models, Word Embeddings
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1378/
PDF https://www.aclweb.org/anthology/D18-1378
PWC https://paperswithcode.com/paper/limbic-author-based-sentiment-aspect-modeling
Repo
Framework

Infant Word Comprehension-to-Production Index Applied to Investigation of Noun Learning Predominance Using Cross-lingual CDI database

Title Infant Word Comprehension-to-Production Index Applied to Investigation of Noun Learning Predominance Using Cross-lingual CDI database
Authors Yasuhiro Minami, Tessei Kobayashi, Yuko Okumura
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1362/
PDF https://www.aclweb.org/anthology/L18-1362
PWC https://paperswithcode.com/paper/infant-word-comprehension-to-production-index
Repo
Framework

Automata Guided Hierarchical Reinforcement Learning for Zero-Shot Skill Composition

Title Automata Guided Hierarchical Reinforcement Learning for Zero-Shot Skill Composition
Authors Xiao Li, Yao Ma, Calin Belta
Abstract An obstacle that prevents the wide adoption of (deep) reinforcement learning (RL) in control systems is its need for a large number of interactions with the environment in order to master a skill. The learned skill usually generalizes poorly across domains, and re-training is often necessary when presented with a new task. We present a framework that combines techniques in formal methods with hierarchical reinforcement learning (HRL). The set of techniques we provide allows for the convenient specification of tasks with logical expressions, learns hierarchical policies (a meta-controller and low-level controllers) with well-defined intrinsic rewards using any RL method, and is able to construct new skills from existing ones without additional learning. We evaluate the proposed methods in a simple grid world simulation as well as in simulation on a Baxter robot.
Tasks Hierarchical Reinforcement Learning
Published 2018-01-01
URL https://openreview.net/forum?id=BJgVaG-Ab
PDF https://openreview.net/pdf?id=BJgVaG-Ab
PWC https://paperswithcode.com/paper/automata-guided-hierarchical-reinforcement-1
Repo
Framework
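
The core idea of specifying tasks with logical expressions and rewarding progress can be sketched as a finite-state automaton that tracks task completion and pays an intrinsic reward on each transition. The task ("fetch the key, then open the door"), the event names, and the reward values below are illustrative assumptions, not the paper's construction.

```python
# Hypothetical task "fetch the key, then open the door", encoded as a DFA.
# Transitions fire on high-level events emitted by low-level controllers.
TRANSITIONS = {
    ("start", "got_key"): "has_key",
    ("has_key", "opened_door"): "done",
}

def step(state, event):
    """Advance the automaton; intrinsic reward 1.0 for progress, else 0.0."""
    nxt = TRANSITIONS.get((state, event), state)
    return nxt, (1.0 if nxt != state else 0.0)

state, total = "start", 0.0
for event in ["bumped_wall", "got_key", "got_key", "opened_door"]:
    state, r = step(state, event)
    total += r

print(state, total)  # -> done 2.0
```

Because the automaton is separate from the low-level controllers, two such task automata can in principle be composed (e.g., by a product construction) without retraining the skills, which is the zero-shot composition the title refers to.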

Aggressive language in an online hacking forum

Title Aggressive language in an online hacking forum
Authors Andrew Caines, Sergio Pastrana, Alice Hutchings, Paula Buttery
Abstract We probe the heterogeneity in levels of abusive language in different sections of the Internet, using an annotated corpus of Wikipedia page edit comments to train a binary classifier for abuse detection. Our test data come from the CrimeBB Corpus of hacking-related forum posts, and we find that (a) forum interactions are rarely abusive, (b) the abusive language which does exist tends to be relatively mild compared to that found in the Wikipedia comments domain, and (c) it tends to involve aggressive posturing rather than hate speech or threats of violence. We observe that the purpose of conversations in online forums tends to be more constructive and informative than in Wikipedia page edit comments, which are geared more towards adversarial interactions, and that this may explain the lower levels of abuse found in our forum data. Further work remains to compare these results with other inter-domain classification experiments, and to understand the impact of aggressive language in forum conversations.
Tasks Abuse Detection
Published 2018-10-01
URL https://www.aclweb.org/anthology/W18-5109/
PDF https://www.aclweb.org/anthology/W18-5109
PWC https://paperswithcode.com/paper/aggressive-language-in-an-online-hacking
Repo
Framework
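
A binary abuse classifier of the kind the abstract describes can be sketched with a bag-of-words Naive Bayes model. The toy documents and the choice of Naive Bayes below are assumptions for illustration; the paper's actual classifier and training data (Wikipedia edit comments) are not reproduced here.

```python
from collections import Counter
from math import log

def train_nb(docs, labels):
    """Multinomial Naive Bayes with add-one smoothing over token counts."""
    counts = {0: Counter(), 1: Counter()}
    priors = Counter(labels)
    for doc, y in zip(docs, labels):
        counts[y].update(doc.lower().split())
    vocab = set(counts[0]) | set(counts[1])
    return counts, priors, vocab

def predict(model, doc):
    """Pick the class with the higher smoothed log-likelihood."""
    counts, priors, vocab = model
    scores = {}
    for y in (0, 1):
        total = sum(counts[y].values())
        s = log(priors[y] / sum(priors.values()))
        for tok in doc.lower().split():
            s += log((counts[y][tok] + 1) / (total + len(vocab)))
        scores[y] = s
    return max(scores, key=scores.get)

# Toy training data standing in for annotated comments (1 = abusive).
docs = ["you are an idiot", "thanks for the fix", "idiot go away",
        "nice edit thanks", "you fool"]
labels = [1, 0, 1, 0, 1]
model = train_nb(docs, labels)
print(predict(model, "you idiot"))          # -> 1
print(predict(model, "thanks for the edit"))  # -> 0
```

Training on one domain (Wikipedia comments) and testing on another (forum posts), as the paper does, is exactly where such simple lexical models tend to degrade, which motivates the inter-domain comparisons the authors call for.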

Sanskrit Sandhi Splitting using seq2(seq)2

Title Sanskrit Sandhi Splitting using seq2(seq)2
Authors Rahul Aralikatte, Neelamadhav Gantayat, Naveen Panwar, Anush Sankaran, Senthil Mani
Abstract In Sanskrit, small words (morphemes) are combined to form compound words through a process known as Sandhi. Sandhi splitting is the process of splitting a given compound word into its constituent morphemes. Although rules governing word splitting exist in the language, it is highly challenging to identify the location of the splits in a compound word. Though existing Sandhi splitting systems incorporate these pre-defined splitting rules, they have low accuracy, as the same compound word might be broken down in multiple ways that all yield syntactically correct splits. In this research, we propose a novel deep learning architecture called Double Decoder RNN (DD-RNN), which (i) predicts the location of the split(s) with 95% accuracy, and (ii) predicts the constituent words (learning the Sandhi splitting rules) with 79.5% accuracy, outperforming the state-of-the-art by 20%. Additionally, we show the generalization capability of our deep learning model with competitive results on the problem of Chinese word segmentation.
Tasks Chinese Word Segmentation
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1530/
PDF https://www.aclweb.org/anthology/D18-1530
PWC https://paperswithcode.com/paper/sanskrit-sandhi-splitting-using-seq2seq2
Repo
Framework
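
The ambiguity that makes rule-based splitting hard can be illustrated by enumerating split points whose halves are both known morphemes. The toy vocabulary and compound below are invented for illustration, and real Sandhi additionally rewrites sounds at the boundary, which the sketch ignores.

```python
def candidate_splits(compound, vocab):
    """Enumerate binary splits whose halves are both known morphemes.
    A deliberately simplified view: real Sandhi also rewrites sounds
    at the boundary, which multiplies the candidates further."""
    return [(compound[:i], compound[i:])
            for i in range(1, len(compound))
            if compound[:i] in vocab and compound[i:] in vocab]

# Toy vocabulary; two splits survive the rules, mirroring the ambiguity
# the abstract describes.
vocab = {"rama", "laya", "ramal", "aya"}
print(candidate_splits("ramalaya", vocab))
# -> [('rama', 'laya'), ('ramal', 'aya')]
```

Since rules alone cannot decide between such candidates, the paper learns the choice from data: one decoder predicts where to split and the other predicts what the constituent words are.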

Learning and Using the Arrow of Time

Title Learning and Using the Arrow of Time
Authors Donglai Wei, Joseph J. Lim, Andrew Zisserman, William T. Freeman
Abstract We seek to understand the arrow of time in videos: what makes videos look like they are playing forwards or backwards? Can we visualize the cues? Can the arrow of time be a supervisory signal useful for activity analysis? To this end, we build three large-scale video datasets and apply a learning-based approach to these tasks. To learn the arrow of time efficiently and reliably, we design a ConvNet suitable for extended temporal footprints and for class activation visualization, and study the effect of artificial cues, such as cinematographic conventions, on learning. Our trained model achieves state-of-the-art performance on large-scale real-world video datasets. Through cluster analysis and localization of important regions for the prediction, we examine learned visual cues that are consistent among many samples and show when and where they occur. Lastly, we use the trained ConvNet for two applications: self-supervision for action recognition, and video forensics, i.e., determining whether Hollywood film clips have been deliberately reversed in time, a manipulation often used for special effects.
Tasks Temporal Action Localization
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Wei_Learning_and_Using_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Wei_Learning_and_Using_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/learning-and-using-the-arrow-of-time
Repo
Framework

Read and Comprehend by Gated-Attention Reader with More Belief

Title Read and Comprehend by Gated-Attention Reader with More Belief
Authors Haohui Deng, Yik-Cheung Tam
Abstract Gated-Attention (GA) Reader has been effective for reading comprehension. GA Reader makes two assumptions: (1) a uni-directional attention that uses an input query to gate token encodings of a document; (2) only the encoding at the cloze position of an input query is considered for answer prediction. In this paper, we propose Collaborative Gating (CG) and Self-Belief Aggregation (SBA) to address these assumptions respectively. In CG, we first use an input document to gate token encodings of an input query so that the influence of irrelevant query tokens may be reduced. Then the filtered query is used to gate token encodings of the document in a collaborative fashion. In SBA, we conjecture that query tokens other than the cloze token may be informative for answer prediction. We apply self-attention to link the cloze token with other tokens in a query so that the importance of query tokens with respect to the cloze position is weighted. Then their evidences are weighted, propagated, and aggregated for better reading comprehension. Experiments show that our approaches advance the state-of-the-art results on the CNN, Daily Mail, and Who Did What public test sets.
Tasks Reading Comprehension, Word Alignment
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-4012/
PDF https://www.aclweb.org/anthology/N18-4012
PWC https://paperswithcode.com/paper/read-and-comprehend-by-gated-attention-reader
Repo
Framework
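
The gating idea, i.e., attending from each target vector over context vectors and multiplying the target elementwise by the attended mix, can be sketched in a few lines. The tiny vectors and the specific gating function below are assumptions chosen to show the mechanism, not the paper's architecture.

```python
from math import exp

def softmax(xs):
    m = max(xs)
    es = [exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def gate(target, context):
    """Gate each target vector by an attention-weighted mix of context
    vectors (elementwise product): one reading of GA-style gating."""
    gated = []
    for t in target:
        weights = softmax([sum(a * b for a, b in zip(t, c)) for c in context])
        mix = [sum(w * c[i] for w, c in zip(weights, context))
               for i in range(len(t))]
        gated.append([a * b for a, b in zip(t, mix)])
    return gated

# Collaborative gating sketch: the document filters the query first,
# then the filtered query gates the document.
doc = [[1.0, 0.0], [0.0, 1.0]]
query = [[1.0, 1.0]]
filtered_query = gate(query, doc)
gated_doc = gate(doc, filtered_query)
print(gated_doc)
```

The two-pass ordering (document gates query, then filtered query gates document) is the collaborative step that distinguishes CG from the original uni-directional GA gating.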

Chrono at SemEval-2018 Task 6: A System for Normalizing Temporal Expressions

Title Chrono at SemEval-2018 Task 6: A System for Normalizing Temporal Expressions
Authors Amy Olex, Luke Maffey, Nicholas Morgan, Bridget McInnes
Abstract Temporal information extraction is a challenging task. Here we describe Chrono, a hybrid rule-based and machine learning system that identifies temporal expressions in text and normalizes them into the SCATE schema. After minor parsing logic adjustments, Chrono has emerged as the top performing system for SemEval 2018 Task 6: Parsing Time Normalizations.
Tasks Temporal Information Extraction
Published 2018-06-01
URL https://www.aclweb.org/anthology/S18-1012/
PDF https://www.aclweb.org/anthology/S18-1012
PWC https://paperswithcode.com/paper/chrono-at-semeval-2018-task-6-a-system-for
Repo
Framework
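
A minimal example of the rule-based half of such a hybrid system is a regex that finds one date shape and normalizes it to ISO form. The single pattern below is a toy assumption; Chrono itself targets the much richer SCATE schema and many more expression types.

```python
import re
from datetime import datetime

MONTHS = ("January|February|March|April|May|June|July|August"
          "|September|October|November|December")
PATTERN = re.compile(rf"({MONTHS}) (\d{{1,2}}), (\d{{4}})")

def normalize_dates(text):
    """Find 'Month D, YYYY' expressions and normalize them to ISO dates.
    A toy rule in the spirit of rule-based temporal normalization."""
    out = []
    for m in PATTERN.finditer(text):
        dt = datetime.strptime(" ".join(m.groups()), "%B %d %Y")
        out.append((m.group(0), dt.date().isoformat()))
    return out

print(normalize_dates("The task ran on June 5, 2018 and ended July 12, 2018."))
# -> [('June 5, 2018', '2018-06-05'), ('July 12, 2018', '2018-07-12')]
```

Rules like this handle the regular cases; the machine-learned component in a hybrid system then covers expressions too irregular or context-dependent for patterns.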

Treat the system like a human student: Automatic naturalness evaluation of generated text without reference texts

Title Treat the system like a human student: Automatic naturalness evaluation of generated text without reference texts
Authors Isabel Groves, Ye Tian, Ioannis Douratsos
Abstract The currently most popular method for automatic Natural Language Generation (NLG) evaluation is comparing generated text with human-written reference sentences using a metrics system, which has drawbacks in reliability and scalability. We draw inspiration from second language (L2) assessment and extract a set of linguistic features to predict human judgments of sentence naturalness. Our experiment using a small dataset showed that the feature-based approach yields promising results, with the added potential of offering interpretable insight into the source of the problems.
Tasks Image Captioning, Machine Translation, Text Generation, Text Summarization
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6512/
PDF https://www.aclweb.org/anthology/W18-6512
PWC https://paperswithcode.com/paper/treat-the-system-like-a-human-student
Repo
Framework
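
The paper's reference-free approach rests on shallow linguistic features of the generated sentence itself. The particular features below (length, type-token ratio, repeated bigrams) are plausible examples of L2-assessment-style cues, not the paper's actual feature set.

```python
def naturalness_features(sentence):
    """A few shallow, L2-assessment-style cues; illustrative only."""
    tokens = sentence.lower().split()
    types = set(tokens)
    return {
        "num_tokens": len(tokens),
        "type_token_ratio": len(types) / len(tokens) if tokens else 0.0,
        "mean_word_len": sum(map(len, tokens)) / len(tokens) if tokens else 0.0,
        # NLG systems sometimes stutter; a repeated adjacent bigram is a
        # cheap unnaturalness signal.
        "repeated_bigram": any(
            tokens[i:i + 2] == tokens[i + 2:i + 4]
            for i in range(len(tokens) - 3)),
    }

print(naturalness_features("the cat the cat sat on the mat"))
print(naturalness_features("the quick brown fox jumps over the lazy dog"))
```

A regressor trained on such features against human naturalness ratings needs no reference sentences at all, which is the scalability advantage the abstract claims over reference-based metrics.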

HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization

Title HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization
Authors Bin Zhao, Xuelong Li, Xiaoqiang Lu
Abstract Although video summarization has achieved great success in recent years, few approaches have recognized the influence of video structure on the summarization results. Video data follow a hierarchical structure: a video is composed of shots, and a shot is composed of several frames. Generally, shots provide the activity-level information people need to understand the video content. However, few existing summarization approaches pay attention to the shot segmentation procedure. They generate shots by trivial strategies, such as fixed-length segmentation, which may destroy the underlying hierarchical structure of the video data and further reduce the quality of the generated summaries. To address this problem, we propose a structure-adaptive video summarization approach that integrates shot segmentation and video summarization into a Hierarchical Structure-Adaptive RNN, denoted HSA-RNN. We evaluate the proposed approach on four popular datasets, i.e., SumMe, TVsum, CoSum and VTW. The experimental results demonstrate the effectiveness of HSA-RNN on the video summarization task.
Tasks Video Summarization
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Zhao_HSA-RNN_Hierarchical_Structure-Adaptive_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhao_HSA-RNN_Hierarchical_Structure-Adaptive_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/hsa-rnn-hierarchical-structure-adaptive-rnn
Repo
Framework

A Bird’s-eye View of Language Processing Projects at the Romanian Academy

Title A Bird’s-eye View of Language Processing Projects at the Romanian Academy
Authors Dan Tufiș, Dan Cristea
Abstract
Tasks Autonomous Vehicles, Keyword Spotting, Transliteration
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1388/
PDF https://www.aclweb.org/anthology/L18-1388
PWC https://paperswithcode.com/paper/a-birdas-eye-view-of-language-processing
Repo
Framework