Paper Group NANR 181
Getting to “Hearer-old”: Charting Referring Expressions Across Time. Modeling Personality Traits of Filipino Twitter Users. An Approach to the CLPsych 2018 Shared Task Using Top-Down Text Representation and Simple Bottom-Up Model Selection. MIAPARLE: Online training for the discrimination of stress contrasts. Limbic: Author-Based Sentiment Aspect …
Getting to “Hearer-old”: Charting Referring Expressions Across Time
Title | Getting to “Hearer-old”: Charting Referring Expressions Across Time | |
Authors | Ieva Staliūnaitė, Hannah Rohde, Bonnie Webber, Annie Louis |
Abstract | When a reader is first introduced to an entity, its referring expression must describe the entity. For entities that are widely known, a single word or phrase often suffices. This paper presents the first study of how expressions that refer to the same entity develop over time. We track thousands of person and organization entities over 20 years of New York Times (NYT). As entities move from hearer-new (first introduction to the NYT audience) to hearer-old (common knowledge) status, we show empirically that the referring expressions along this trajectory depend on the type of the entity, and exhibit linguistic properties related to becoming common knowledge (e.g., shorter length, less use of appositives, more definiteness). These properties can also be used to build a model to predict how long it will take for an entity to reach hearer-old status. Our results reach 10-30% absolute improvement over a majority-class baseline. |
Tasks | Coreference Resolution |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1466/ |
https://www.aclweb.org/anthology/D18-1466 | |
PWC | https://paperswithcode.com/paper/getting-to-hearer-old-charting-referring |
Repo | |
Framework | |
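The surface cues the abstract ties to hearer-old status (shorter mentions, fewer appositives, more definiteness) can be sketched as a feature extractor. The function name and heuristics below are illustrative stand-ins, not the authors' actual pipeline.

```python
# Toy extractor for the cues associated with hearer-old status:
# shorter length, fewer appositives, more definiteness.

def mention_features(mention: str) -> dict:
    tokens = mention.split()
    lowered = [t.lower().strip(",") for t in tokens]
    return {
        "length": len(tokens),
        # crude appositive cue: an internal comma-delimited description
        "has_appositive": "," in mention.rstrip(","),
        # definiteness cue: mention begins with a definite article
        "is_definite": lowered[:1] == ["the"],
    }

new_mention = mention_features("Tim Cook, the chief executive of Apple")
old_mention = mention_features("Cook")
```

A hearer-new introduction scores long, appositive-bearing, and indefinite; a hearer-old mention scores the opposite, which is what makes such features usable in a predictor.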
Modeling Personality Traits of Filipino Twitter Users
Title | Modeling Personality Traits of Filipino Twitter Users |
Authors | Edward Tighe, Charibeth Cheng |
Abstract | Recent studies in text-based personality recognition experiment with different languages, feature extraction techniques, and machine learning algorithms to create better and more accurate models; however, little focus is placed on exploring the language use of a group of individuals defined by nationality. Individuals of the same nationality share certain practices and communicate certain ideas that can become embedded in their natural language. Many nationals also speak more than one language; for example, Filipinos speak Filipino and English, the two national languages of the Philippines. The addition of several regional/indigenous languages, along with the commonness of code-switching, allows a Filipino to have a rich vocabulary. This presents an opportunity to create a text-based personality model based on how Filipinos speak, regardless of the language they use. To do so, data were collected from 250 Filipino Twitter users. Different combinations of data processing techniques were tested to create personality models for each of the Big Five traits. The results for both regression and classification show that Conscientiousness is consistently the easiest trait to model, followed by Extraversion. Classification models for Agreeableness and Neuroticism had subpar performance, but performed better than those for Openness. An analysis of personality trait score representation showed that classifying extreme outliers generally produces better results for all traits except Neuroticism and Openness. |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-1115/ |
https://www.aclweb.org/anthology/W18-1115 | |
PWC | https://paperswithcode.com/paper/modeling-personality-traits-of-filipino |
Repo | |
Framework | |
An Approach to the CLPsych 2018 Shared Task Using Top-Down Text Representation and Simple Bottom-Up Model Selection
Title | An Approach to the CLPsych 2018 Shared Task Using Top-Down Text Representation and Simple Bottom-Up Model Selection |
Authors | Micah Iserman, Molly Ireland, Andrew Littlefield, Tyler Davis, Sage Maliepaard |
Abstract | |
Tasks | Model Selection |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/papers/W18-0605/w18-0605 |
https://www.aclweb.org/anthology/W18-0605 | |
PWC | https://paperswithcode.com/paper/an-approach-to-the-clpsych-2018-shared-task |
Repo | |
Framework | |
MIAPARLE: Online training for the discrimination of stress contrasts
Title | MIAPARLE: Online training for the discrimination of stress contrasts |
Authors | Jean-Philippe Goldman, Sandra Schwab |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1364/ |
https://www.aclweb.org/anthology/L18-1364 | |
PWC | https://paperswithcode.com/paper/miaparle-online-training-for-the |
Repo | |
Framework | |
Limbic: Author-Based Sentiment Aspect Modeling Regularized with Word Embeddings and Discourse Relations
Title | Limbic: Author-Based Sentiment Aspect Modeling Regularized with Word Embeddings and Discourse Relations |
Authors | Zhe Zhang, Munindar Singh |
Abstract | We propose Limbic, an unsupervised probabilistic model that addresses the problem of discovering aspects and sentiments and associating them with authors of opinionated texts. Limbic combines three ideas, incorporating authors, discourse relations, and word embeddings. For discourse relations, Limbic adopts a generative process regularized by a Markov Random Field. To promote words with high semantic similarity into the same topic, Limbic captures semantic regularities from word embeddings via a generalized Pólya Urn process. We demonstrate that Limbic (1) discovers aspects associated with sentiments with high lexical diversity; (2) outperforms state-of-the-art models by a substantial margin in topic cohesion and sentiment classification. |
Tasks | Semantic Similarity, Semantic Textual Similarity, Sentiment Analysis, Topic Models, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1378/ |
https://www.aclweb.org/anthology/D18-1378 | |
PWC | https://paperswithcode.com/paper/limbic-author-based-sentiment-aspect-modeling |
Repo | |
Framework | |
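The generalized Pólya urn update mentioned in the Limbic abstract can be sketched in a few lines: when a word is assigned to a topic, its embedding neighbors also gain fractional weight, pulling semantically similar words into the same topic. The similarity table below is a hypothetical stand-in for word-embedding neighborhoods, not the paper's parameters.

```python
# Sketch of a generalized Polya urn update: assigning a word to a
# topic also boosts related words, proportional to their similarity.

similar = {  # hypothetical embedding neighbors with similarity scores
    "tasty": {"delicious": 0.8},
    "delicious": {"tasty": 0.8},
}

def gpu_update(topic_counts: dict, word: str, weight: float = 1.0) -> None:
    """Add `word` to a topic and propagate weight to its neighbors."""
    topic_counts[word] = topic_counts.get(word, 0.0) + weight
    for neighbor, sim in similar.get(word, {}).items():
        topic_counts[neighbor] = topic_counts.get(neighbor, 0.0) + weight * sim

topic = {}
gpu_update(topic, "tasty")    # boosts "tasty" and, fractionally, "delicious"
gpu_update(topic, "service")  # no known neighbors: only "service" grows
```

In the full model this update regularizes Gibbs sampling counts; here it only illustrates why co-similar words end up sharing topic mass.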
Infant Word Comprehension-to-Production Index Applied to Investigation of Noun Learning Predominance Using Cross-lingual CDI database
Title | Infant Word Comprehension-to-Production Index Applied to Investigation of Noun Learning Predominance Using Cross-lingual CDI database |
Authors | Yasuhiro Minami, Tessei Kobayashi, Yuko Okumura |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1362/ |
https://www.aclweb.org/anthology/L18-1362 | |
PWC | https://paperswithcode.com/paper/infant-word-comprehension-to-production-index |
Repo | |
Framework | |
AUTOMATA GUIDED HIERARCHICAL REINFORCEMENT LEARNING FOR ZERO-SHOT SKILL COMPOSITION
Title | AUTOMATA GUIDED HIERARCHICAL REINFORCEMENT LEARNING FOR ZERO-SHOT SKILL COMPOSITION |
Authors | Xiao Li, Yao Ma, Calin Belta |
Abstract | An obstacle that prevents the wide adoption of (deep) reinforcement learning (RL) in control systems is its need for a large number of interactions with the environment in order to master a skill. The learned skill usually generalizes poorly across domains, and re-training is often necessary when presented with a new task. We present a framework that combines techniques in formal methods with hierarchical reinforcement learning (HRL). The set of techniques we provide allows for the convenient specification of tasks with logical expressions, learns hierarchical policies (meta-controller and low-level controllers) with well-defined intrinsic rewards using any RL method, and is able to construct new skills from existing ones without additional learning. We evaluate the proposed methods in a simple grid-world simulation as well as in simulation on a Baxter robot. |
Tasks | Hierarchical Reinforcement Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=BJgVaG-Ab |
https://openreview.net/pdf?id=BJgVaG-Ab | |
PWC | https://paperswithcode.com/paper/automata-guided-hierarchical-reinforcement-1 |
Repo | |
Framework | |
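The "specification of tasks with logical expressions" in the abstract above can be illustrated with a tiny finite-state automaton: a sequential task such as "reach region A, then region B" becomes a machine whose accepting state marks completion, and a meta-controller can condition on the automaton state. The states and labels here are illustrative, not the paper's actual FSA construction.

```python
# Minimal automata-guided task specification: "visit A, then B".

FSA = {  # (state, observed proposition) -> next state
    (0, "A"): 1,
    (1, "B"): 2,
}
ACCEPT = 2

def run(trace) -> int:
    """Advance the automaton over a trace of observed propositions."""
    state = 0
    for prop in trace:
        state = FSA.get((state, prop), state)  # no matching edge: stay put
    return state

done = run(["B", "A", "B"]) == ACCEPT  # the early B is ignored; A then B completes
```

In an HRL setting the current automaton state would select which low-level controller to invoke, and intrinsic reward would be given for transitions toward the accepting state.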
Aggressive language in an online hacking forum
Title | Aggressive language in an online hacking forum |
Authors | Andrew Caines, Sergio Pastrana, Alice Hutchings, Paula Buttery |
Abstract | We probe the heterogeneity in levels of abusive language in different sections of the Internet, using an annotated corpus of Wikipedia page edit comments to train a binary classifier for abuse detection. Our test data come from the CrimeBB Corpus of hacking-related forum posts, and we find that (a) forum interactions are rarely abusive, and (b) the abusive language that does exist tends to be relatively mild compared to that found in the Wikipedia comments domain, involving aggressive posturing rather than hate speech or threats of violence. We observe that the purpose of conversations in online forums tends to be more constructive and informative than in Wikipedia page edit comments, which are geared more towards adversarial interactions, and that this may explain the lower levels of abuse found in our forum data. Further work remains to compare these results with other inter-domain classification experiments, and to understand the impact of aggressive language in forum conversations. |
Tasks | Abuse Detection |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5109/ |
https://www.aclweb.org/anthology/W18-5109 | |
PWC | https://paperswithcode.com/paper/aggressive-language-in-an-online-hacking |
Repo | |
Framework | |
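The cross-domain setup in the abstract above (train a binary abuse classifier on labeled comments, apply it to forum posts) can be sketched with a minimal bag-of-words Naive Bayes. The training examples below are invented toy data, not the Wikipedia or CrimeBB corpora, and the paper does not specify this particular classifier.

```python
# A minimal Naive Bayes text classifier with Laplace smoothing,
# illustrating train-on-one-domain / test-on-another abuse detection.
import math
from collections import Counter

def train(docs):
    """docs: list of (text, label) pairs; returns (word counts, priors)."""
    counts = {0: Counter(), 1: Counter()}
    priors = Counter()
    for text, label in docs:
        priors[label] += 1
        counts[label].update(text.lower().split())
    return counts, priors

def predict(model, text):
    counts, priors = model
    vocab = set(counts[0]) | set(counts[1])
    scores = {}
    for label in (0, 1):
        total = sum(counts[label].values())
        score = math.log(priors[label] / sum(priors.values()))
        for word in text.lower().split():
            # Laplace smoothing over the joint vocabulary
            score += math.log((counts[label][word] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train([
    ("thanks for the helpful reply", 0),   # 0 = not abusive
    ("great tutorial much appreciated", 0),
    ("you are an idiot", 1),               # 1 = abusive
    ("shut up idiot", 1),
])
label = predict(model, "what an idiot")
```

The paper's point is what happens when the test domain shifts: a model fit to Wikipedia-style abuse fires rarely, and mildly, on forum posts.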
Sanskrit Sandhi Splitting using seq2(seq)2
Title | Sanskrit Sandhi Splitting using seq2(seq)2 |
Authors | Rahul Aralikatte, Neelamadhav Gantayat, Naveen Panwar, Anush Sankaran, Senthil Mani |
Abstract | In Sanskrit, small words (morphemes) are combined to form compound words through a process known as Sandhi. Sandhi splitting is the process of splitting a given compound word into its constituent morphemes. Although rules governing word splitting exist in the language, it is highly challenging to identify the location of the splits in a compound word. Though existing Sandhi splitting systems incorporate these pre-defined splitting rules, they have low accuracy, as the same compound word can be broken down in multiple ways that all yield syntactically correct splits. In this research, we propose a novel deep learning architecture called Double Decoder RNN (DD-RNN), which (i) predicts the location of the split(s) with 95% accuracy, and (ii) predicts the constituent words (learning the Sandhi splitting rules) with 79.5% accuracy, outperforming the state of the art by 20%. Additionally, we show the generalization capability of our deep learning model through competitive results on the problem of Chinese word segmentation as well. |
Tasks | Chinese Word Segmentation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1530/ |
https://www.aclweb.org/anthology/D18-1530 | |
PWC | https://paperswithcode.com/paper/sanskrit-sandhi-splitting-using-seq2seq2 |
Repo | |
Framework | |
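The ambiguity the abstract describes, where rules alone over-generate candidate splits, can be illustrated with a toy reverse-sandhi generator: each surface character that a rule could have produced yields one candidate boundary. The rule table below covers only two vowel-sandhi patterns (a + i → e, a + u → o) and the example word is hypothetical; a real system needs a full grammar plus a model, such as the paper's DD-RNN, to rank the candidates.

```python
# Toy reverse-sandhi candidate generation: map each surface character
# back to the boundary pairs that could have fused into it.

RULES = {"e": [("a", "i")], "o": [("a", "u")]}  # illustrative subset only

def candidate_splits(word):
    """Enumerate (left, right) splits licensed by the rule table."""
    splits = []
    for i, ch in enumerate(word):
        for left_end, right_start in RULES.get(ch, []):
            splits.append((word[:i] + left_end, right_start + word[i + 1:]))
    return splits

cands = candidate_splits("gane")  # hypothetical surface form
```

Even this two-rule table produces one candidate per matching character; with a full rule set, most compounds admit many candidates, which is why rule-only systems have low accuracy.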
Learning and Using the Arrow of Time
Title | Learning and Using the Arrow of Time |
Authors | Donglai Wei, Joseph J. Lim, Andrew Zisserman, William T. Freeman |
Abstract | We seek to understand the arrow of time in videos – what makes videos look like they are playing forwards or backwards? Can we visualize the cues? Can the arrow of time be a supervisory signal useful for activity analysis? To this end, we build three large-scale video datasets and apply a learning-based approach to these tasks. To learn the arrow of time efficiently and reliably, we design a ConvNet suitable for extended temporal footprints and for class activation visualization, and study the effect of artificial cues, such as cinematographic conventions, on learning. Our trained model achieves state-of-the-art performance on large-scale real-world video datasets. Through cluster analysis and localization of important regions for the prediction, we examine learned visual cues that are consistent among many samples and show when and where they occur. Lastly, we use the trained ConvNet for two applications: self-supervision for action recognition, and video forensics – determining whether Hollywood film clips have been deliberately reversed in time, often used as special effects. |
Tasks | Temporal Action Localization |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Wei_Learning_and_Using_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Wei_Learning_and_Using_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/learning-and-using-the-arrow-of-time |
Repo | |
Framework | |
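The self-supervision signal described in the abstract is simple to state precisely: each training clip appears twice, once in its original frame order (labeled "forwards") and once reversed (labeled "backwards"). A toy pair generator, with integers standing in for frames:

```python
# Generate forward/backward training pairs for arrow-of-time learning.

def arrow_of_time_pairs(frames):
    """Yield (clip, label) pairs for forward/backward classification."""
    return [
        (list(frames), 1),           # 1 = playing forwards
        (list(reversed(frames)), 0), # 0 = playing backwards
    ]

pairs = arrow_of_time_pairs([0, 1, 2, 3])
```

The labels are free, which is what makes the arrow of time usable as a self-supervised pre-training signal for action recognition.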
Read and Comprehend by Gated-Attention Reader with More Belief
Title | Read and Comprehend by Gated-Attention Reader with More Belief |
Authors | Haohui Deng, Yik-Cheung Tam |
Abstract | Gated-Attention (GA) Reader has been effective for reading comprehension. GA Reader makes two assumptions: (1) a uni-directional attention that uses an input query to gate token encodings of a document; (2) only the encoding at the cloze position of an input query is considered for answer prediction. In this paper, we propose Collaborative Gating (CG) and Self-Belief Aggregation (SBA) to address these assumptions respectively. In CG, we first use an input document to gate token encodings of an input query so that the influence of irrelevant query tokens may be reduced. Then the filtered query is used to gate token encodings of a document in a collaborative fashion. In SBA, we conjecture that query tokens other than the cloze token may be informative for answer prediction. We apply self-attention to link the cloze token with other tokens in a query so that the importance of query tokens with respect to the cloze position is weighted. Their evidence is then weighted, propagated, and aggregated for better reading comprehension. Experiments show that our approaches advance the state-of-the-art results on the CNN, Daily Mail, and Who Did What public test sets. |
Tasks | Reading Comprehension, Word Alignment |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-4012/ |
https://www.aclweb.org/anthology/N18-4012 | |
PWC | https://paperswithcode.com/paper/read-and-comprehend-by-gated-attention-reader |
Repo | |
Framework | |
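The gating primitive behind GA-style readers can be sketched concretely: a document token attends over the query tokens, and the attention-weighted query summary is multiplied element-wise into the document encoding. Collaborative Gating applies this same primitive in both directions (document gates query, then the filtered query gates the document); the vectors below are hand-picked toy encodings, not learned ones.

```python
# Attention-then-gate, the core operation of gated-attention readers.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def gate(doc_token, query_tokens):
    """Attend over query tokens, then gate the document token element-wise."""
    weights = softmax([sum(d * q for d, q in zip(doc_token, q_tok))
                       for q_tok in query_tokens])
    gate_vec = [sum(w * q_tok[i] for w, q_tok in zip(weights, query_tokens))
                for i in range(len(doc_token))]
    return [d * g for d, g in zip(doc_token, gate_vec)]

out = gate([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Dimensions of the document encoding that align with the query are amplified and the rest are suppressed, which is the filtering effect both GA and CG rely on.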
Chrono at SemEval-2018 Task 6: A System for Normalizing Temporal Expressions
Title | Chrono at SemEval-2018 Task 6: A System for Normalizing Temporal Expressions |
Authors | Amy Olex, Luke Maffey, Nicholas Morgan, Bridget McInnes |
Abstract | Temporal information extraction is a challenging task. Here we describe Chrono, a hybrid rule-based and machine learning system that identifies temporal expressions in text and normalizes them into the SCATE schema. After minor parsing logic adjustments, Chrono emerged as the top-performing system for SemEval 2018 Task 6: Parsing Time Normalizations. |
Tasks | Temporal Information Extraction |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1012/ |
https://www.aclweb.org/anthology/S18-1012 | |
PWC | https://paperswithcode.com/paper/chrono-at-semeval-2018-task-6-a-system-for |
Repo | |
Framework | |
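The rule-based half of a hybrid system like Chrono can be illustrated in miniature: match a "Month D, YYYY" expression with a regex and normalize it to an ISO 8601 date. Chrono itself targets the much richer SCATE schema; this pattern and output format are only a simplified sketch.

```python
# Toy rule-based temporal expression identification and normalization.
import re
from datetime import date

MONTHS = {m: i + 1 for i, m in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"])}

PATTERN = re.compile(r"(%s) (\d{1,2}), (\d{4})" % "|".join(MONTHS))

def normalize(text):
    """Return ISO 8601 dates for every matched temporal expression."""
    return [date(int(y), MONTHS[mon], int(d)).isoformat()
            for mon, d, y in PATTERN.findall(text)]

found = normalize("The task ran on June 5, 2018 in New Orleans.")
```

Real systems layer many such patterns, then use learned components for disambiguation, which is where the "hybrid" in the abstract comes in.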
Treat the system like a human student: Automatic naturalness evaluation of generated text without reference texts
Title | Treat the system like a human student: Automatic naturalness evaluation of generated text without reference texts |
Authors | Isabel Groves, Ye Tian, Ioannis Douratsos |
Abstract | The current most popular method for automatic Natural Language Generation (NLG) evaluation is comparing generated text with human-written reference sentences using a metrics system, which has drawbacks around reliability and scalability. We draw inspiration from second language (L2) assessment and extract a set of linguistic features to predict human judgments of sentence naturalness. Our experiment using a small dataset showed that the feature-based approach yields promising results, with the added potential of providing interpretability into the source of the problems. |
Tasks | Image Captioning, Machine Translation, Text Generation, Text Summarization |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6512/ |
https://www.aclweb.org/anthology/W18-6512 | |
PWC | https://paperswithcode.com/paper/treat-the-system-like-a-human-student |
Repo | |
Framework | |
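The reference-free, feature-based idea in the abstract above can be shown in miniature: compute surface linguistic features of a generated sentence instead of comparing it to human references. The two features below (length, type-token ratio) are illustrative choices, not the paper's feature set, and a real system would feed them to a trained predictor of human naturalness judgments.

```python
# Toy reference-free linguistic features for a generated sentence.

def naturalness_features(sentence: str) -> dict:
    tokens = sentence.lower().split()
    return {
        "n_tokens": len(tokens),
        # lexical diversity: distinct tokens over total tokens
        "type_token_ratio": len(set(tokens)) / len(tokens),
    }

feats = naturalness_features("the the the dog dog")  # repetitive output
```

A degenerate, repetitive generation scores a low type-token ratio, which is exactly the kind of interpretable signal the feature-based approach offers over opaque reference-matching metrics.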
HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization
Title | HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization |
Authors | Bin Zhao, Xuelong Li, Xiaoqiang Lu |
Abstract | Although video summarization has achieved great success in recent years, few approaches have considered the influence of video structure on the summarization results. Video data follow a hierarchical structure: a video is composed of shots, and a shot is composed of several frames. Generally, shots provide the activity-level information people use to understand the video content. However, few existing summarization approaches pay attention to the shot segmentation procedure; they generate shots by trivial strategies, such as fixed-length segmentation, which may destroy the underlying hierarchical structure of video data and further reduce the quality of generated summaries. To address this problem, we propose a structure-adaptive video summarization approach that integrates shot segmentation and video summarization into a Hierarchical Structure-Adaptive RNN, denoted HSA-RNN. We evaluate the proposed approach on four popular datasets, i.e., SumMe, TVsum, CoSum and VTW. The experimental results demonstrate the effectiveness of HSA-RNN on the video summarization task. |
Tasks | Video Summarization |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Zhao_HSA-RNN_Hierarchical_Structure-Adaptive_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhao_HSA-RNN_Hierarchical_Structure-Adaptive_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/hsa-rnn-hierarchical-structure-adaptive-rnn |
Repo | |
Framework | |
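The contrast the abstract draws, fixed-length cuts versus structure-aware shot boundaries, can be illustrated with a toy content-based segmenter: place a boundary wherever consecutive frame features change by more than a threshold. Scalar "features" stand in for real frame descriptors, and this thresholding is a simple stand-in for HSA-RNN's learned, jointly-trained segmentation.

```python
# Toy content-based shot segmentation by frame-to-frame feature jumps.

def segment_shots(features, threshold=0.5):
    """Split a feature sequence into shots at large frame-to-frame changes."""
    shots, current = [], [features[0]]
    for prev, cur in zip(features, features[1:]):
        if abs(cur - prev) > threshold:
            shots.append(current)
            current = []
        current.append(cur)
    shots.append(current)
    return shots

shots = segment_shots([0.0, 0.1, 0.2, 0.9, 1.0, 0.1])
```

Unlike a fixed-length splitter, this respects where the content actually changes, which is the property the paper argues matters for downstream summary quality.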
A Bird’s-eye View of Language Processing Projects at the Romanian Academy
Title | A Bird’s-eye View of Language Processing Projects at the Romanian Academy |
Authors | Dan Tufiș, Dan Cristea |
Abstract | |
Tasks | Autonomous Vehicles, Keyword Spotting, Transliteration |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1388/ |
https://www.aclweb.org/anthology/L18-1388 | |
PWC | https://paperswithcode.com/paper/a-birdas-eye-view-of-language-processing |
Repo | |
Framework | |