July 26, 2019


Paper Group NANR 122

Behind the Scenes of an Evolving Event Cloze Test. Intrinsic and Extrinsic Evaluation of Spatiotemporal Text Representations in Twitter Streams. How (not) to train a dependency parser: The curious case of jackknifing part-of-speech taggers. Does the Geometry of Word Embeddings Help Document Classification? A Case Study on Persistent Homology-Based …

Behind the Scenes of an Evolving Event Cloze Test

Title Behind the Scenes of an Evolving Event Cloze Test
Authors Nathanael Chambers
Abstract This paper analyzes the narrative event cloze test and its recent evolution. The test removes one event from a document's chain of events, and systems predict the missing event. Originally proposed to evaluate learned knowledge of event scenarios (e.g., scripts and frames), most recent work now builds ngram-like language models (LM) to beat the test. This paper argues that the test has slowly/unknowingly been altered to accommodate LMs. Most notably, tests are auto-generated rather than by hand, and no effort is taken to include core script events. Recent work is not clear on evaluation goals and contains contradictory results. We implement several models, and show that the test's bias to high-frequency events explains the inconsistencies. We conclude with recommendations on how to return to the test's original intent, and offer brief suggestions on a path forward.
Tasks Common Sense Reasoning, Language Modelling
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-0905/
PDF https://www.aclweb.org/anthology/W17-0905
PWC https://paperswithcode.com/paper/behind-the-scenes-of-an-evolving-event-cloze
Repo
Framework
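
The cloze setup described in the abstract above can be illustrated with a toy sketch. This is not the paper's code: the event strings, the function names (`make_cloze`, `frequency_ranking`, `recall_at_k`) and the recall-at-k scoring are assumptions chosen for illustration. It holds one event out of a chain and scores a baseline that ranks events by training frequency alone, which hints at how a bias toward high-frequency events can reward a predictor that ignores context.

```python
from collections import Counter


def make_cloze(chain, held_out_index):
    """Remove one event from a chain; return (remaining events, held-out answer)."""
    remaining = chain[:held_out_index] + chain[held_out_index + 1:]
    return remaining, chain[held_out_index]


def frequency_ranking(training_chains):
    """Rank every event by raw training frequency, ignoring all context."""
    counts = Counter(event for chain in training_chains for event in chain)
    return [event for event, _ in counts.most_common()]


def recall_at_k(ranking, answers, k):
    """Fraction of held-out answers that appear in the top-k guesses."""
    top = set(ranking[:k])
    return sum(answer in top for answer in answers) / len(answers)


# Hypothetical toy chains; real tests use events extracted from documents.
training_chains = [
    ["say", "arrest", "charge"],
    ["say", "order", "eat", "pay"],
    ["say", "arrest", "convict"],
]
test_chains = [
    ["say", "arrest", "charge", "convict"],
    ["order", "say", "pay"],
]

# Hold out the second event of each test chain.
answers = [make_cloze(chain, 1)[1] for chain in test_chains]
ranking = frequency_ranking(training_chains)
print(recall_at_k(ranking, answers, k=2))
```

In this toy data both held-out events are frequent ones, so the context-free baseline recovers them within its top two guesses.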

Intrinsic and Extrinsic Evaluation of Spatiotemporal Text Representations in Twitter Streams

Title Intrinsic and Extrinsic Evaluation of Spatiotemporal Text Representations in Twitter Streams
Authors Lawrence Phillips, Kyle Shaffer, Dustin Arendt, Nathan Hodas, Svitlana Volkova
Abstract Language in social media is a dynamic system, constantly evolving and adapting, with words and concepts rapidly emerging, disappearing, and changing their meaning. These changes can be estimated using word representations in context, over time and across locations. A number of methods have been proposed to track these spatiotemporal changes but no general method exists to evaluate the quality of these representations. Previous work largely focused on qualitative evaluation, which we improve by proposing a set of visualizations that highlight changes in text representation over both space and time. We demonstrate usefulness of novel spatiotemporal representations to explore and characterize specific aspects of the corpus of tweets collected from European countries over a two-week period centered around the terrorist attacks in Brussels in March 2016. In addition, we quantitatively evaluate spatiotemporal representations by feeding them into a downstream classification task: event type prediction. Thus, our work is the first to provide both intrinsic (qualitative) and extrinsic (quantitative) evaluation of text representations for spatiotemporal trends.
Tasks Representation Learning
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2624/
PDF https://www.aclweb.org/anthology/W17-2624
PWC https://paperswithcode.com/paper/intrinsic-and-extrinsic-evaluation-of
Repo
Framework

How (not) to train a dependency parser: The curious case of jackknifing part-of-speech taggers

Title How (not) to train a dependency parser: The curious case of jackknifing part-of-speech taggers
Authors Željko Agić, Natalie Schluter
Abstract In dependency parsing, jackknifing taggers is indiscriminately used as a simple adaptation strategy. Here, we empirically evaluate when and how (not) to use jackknifing in parsing. On 26 languages, we reveal a preference that conflicts with, and surpasses the ubiquitous ten-folding. We show no clear benefits of tagging the training data in cross-lingual parsing.
Tasks Dependency Parsing
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2107/
PDF https://www.aclweb.org/anthology/P17-2107
PWC https://paperswithcode.com/paper/how-not-to-train-a-dependency-parser-the
Repo
Framework
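
For readers unfamiliar with the term, jackknifing here means tagging the parser's training data with taggers that never saw the sentences they tag. A minimal sketch follows, assuming a most-frequent-tag baseline in place of a real tagger; `train_tagger`, `tag`, `jackknife` and the `NOUN` fallback are hypothetical names and choices for illustration, not the authors' setup.

```python
from collections import Counter, defaultdict


def train_tagger(sentences):
    """Most-frequent-tag baseline: map each word to its commonest tag."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, gold_tag in sentence:
            counts[word][gold_tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag(model, words, default="NOUN"):
    """Tag words with the model; unseen words get the default tag."""
    return [model.get(word, default) for word in words]


def jackknife(sentences, k):
    """Predict tags for every sentence using a tagger trained on the other folds."""
    predicted = [None] * len(sentences)
    for fold in range(k):
        # Sentence i belongs to fold i % k.
        train = [s for i, s in enumerate(sentences) if i % k != fold]
        model = train_tagger(train)
        for i, sentence in enumerate(sentences):
            if i % k == fold:
                predicted[i] = tag(model, [word for word, _ in sentence])
    return predicted


# Hypothetical toy treebank of (word, gold tag) pairs.
treebank = [
    [("dogs", "NOUN"), ("bark", "VERB")],
    [("dogs", "NOUN"), ("run", "VERB")],
    [("cats", "NOUN"), ("run", "VERB")],
    [("cats", "NOUN"), ("bark", "VERB")],
]
predicted_tags = jackknife(treebank, k=2)
print(predicted_tags)
```

The parser would then be trained on `predicted_tags` rather than gold tags, so that its training input resembles the noisy tags it sees at test time. The number of folds `k` is the setting the abstract says is usually fixed at ten.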

Does the Geometry of Word Embeddings Help Document Classification? A Case Study on Persistent Homology-Based Representations

Title Does the Geometry of Word Embeddings Help Document Classification? A Case Study on Persistent Homology-Based Representations
Authors Paul Michel, Abhilasha Ravichander, Shruti Rijhwani
Abstract We investigate the pertinence of methods from algebraic topology for text data analysis. These methods enable the development of mathematically-principled isometric-invariant mappings from a set of vectors to a document embedding, which is stable with respect to the geometry of the document in the selected metric space. In this work, we evaluate the utility of these topology-based document representations in traditional NLP tasks, specifically document clustering and sentiment classification. We find that the embeddings do not benefit text analysis. In fact, performance is worse than simple techniques like tf-idf, indicating that the geometry of the document does not provide enough variability for classification on the basis of topic or sentiment in the chosen datasets.
Tasks Document Classification, Document Embedding, Representation Learning, Sentiment Analysis, Text Classification, Word Embeddings
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2628/
PDF https://www.aclweb.org/anthology/W17-2628
PWC https://paperswithcode.com/paper/does-the-geometry-of-word-embeddings-help
Repo
Framework

Learning Synchronous Grammar Patterns for Assisted Writing for Second Language Learners

Title Learning Synchronous Grammar Patterns for Assisted Writing for Second Language Learners
Authors Chi-En Wu, Jhih-Jie Chen, Jim Chang, Jason Chang
Abstract In this paper, we present a method for extracting Synchronous Grammar Patterns (SGPs) from a given parallel corpus in order to assist second language learners in writing. A grammar pattern consists of a head word (verb, noun, or adjective) and its syntactic environment. A synchronous grammar pattern describes a grammar pattern in the target language (e.g., English) and its counterpart in another language (e.g., Mandarin), serving the purpose of native language support. Our method involves identifying the grammar patterns in the target language, aligning these patterns with the target language patterns, and finally filtering valid SGPs. The extracted SGPs with examples are then used to develop a prototype writing assistant system, called WriteAhead/bilingual. Evaluation on a set of randomly selected SGPs shows that our system provides satisfactory writing suggestions for English as a Second Language (ESL) learners.
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-3014/
PDF https://www.aclweb.org/anthology/I17-3014
PWC https://paperswithcode.com/paper/learning-synchronous-grammar-patterns-for
Repo
Framework

Proceedings of the 10th Workshop on Building and Using Comparable Corpora

Title Proceedings of the 10th Workshop on Building and Using Comparable Corpora
Authors
Abstract
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2500/
PDF https://www.aclweb.org/anthology/W17-2500
PWC https://paperswithcode.com/paper/proceedings-of-the-10th-workshop-on-building
Repo
Framework

Issues in digital text representation, on-line dissemination, sharing and re-use for African minority languages

Title Issues in digital text representation, on-line dissemination, sharing and re-use for African minority languages
Authors Emmanuel Ngué Um
Abstract
Tasks
Published 2017-03-01
URL https://www.aclweb.org/anthology/W17-0104/
PDF https://www.aclweb.org/anthology/W17-0104
PWC https://paperswithcode.com/paper/issues-in-digital-text-representation-on-line
Repo
Framework

Instant annotations in ELAN corpora of spoken and written Komi, an endangered language of the Barents Sea region

Title Instant annotations in ELAN corpora of spoken and written Komi, an endangered language of the Barents Sea region
Authors Ciprian Gerstenberger, Niko Partanen, Michael Rießler
Abstract
Tasks
Published 2017-03-01
URL https://www.aclweb.org/anthology/W17-0109/
PDF https://www.aclweb.org/anthology/W17-0109
PWC https://paperswithcode.com/paper/instant-annotations-in-elan-corpora-of-spoken
Repo
Framework

Case Studies in the Automatic Characterization of Grammars from Small Wordlists

Title Case Studies in the Automatic Characterization of Grammars from Small Wordlists
Authors Jordan Kodner, Spencer Caplan, Hongzhi Xu, Mitchell P. Marcus, Charles Yang
Abstract
Tasks Machine Translation
Published 2017-03-01
URL https://www.aclweb.org/anthology/W17-0111/
PDF https://www.aclweb.org/anthology/W17-0111
PWC https://paperswithcode.com/paper/case-studies-in-the-automatic
Repo
Framework

Random Projection Filter Bank for Time Series Data

Title Random Projection Filter Bank for Time Series Data
Authors Amir-Massoud Farahmand, Sepideh Pourazarm, Daniel Nikovski
Abstract We propose Random Projection Filter Bank (RPFB) as a generic and simple approach to extract features from time series data. RPFB is a set of randomly generated stable autoregressive filters that are convolved with the input time series to generate the features. These features can be used by any conventional machine learning algorithm for solving tasks such as time series prediction, classification with time series data, etc. Different filters in RPFB extract different aspects of the time series, and together they provide a reasonably good summary of the time series. RPFB is easy to implement, fast to compute, and parallelizable. We provide an error upper bound indicating that RPFB provides a reasonable approximation to a class of dynamical systems. The empirical results in a series of synthetic and real-world problems show that RPFB is an effective method to extract features from time series.
Tasks Time Series, Time Series Prediction
Published 2017-12-01
URL http://papers.nips.cc/paper/7234-random-projection-filter-bank-for-time-series-data
PDF http://papers.nips.cc/paper/7234-random-projection-filter-bank-for-time-series-data.pdf
PWC https://paperswithcode.com/paper/random-projection-filter-bank-for-time-series
Repo
Framework
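
The idea in the abstract above, random stable autoregressive filters applied to the input series, can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: it assumes first-order filters with real poles drawn uniformly from inside the unit interval, and `make_poles`, `rpfb_features` and the `max_abs` bound are hypothetical names and parameters.

```python
import random


def make_poles(n_filters, seed=0, max_abs=0.95):
    """Draw one real pole per filter; |pole| < 1 keeps each filter stable."""
    rng = random.Random(seed)
    return [rng.uniform(-max_abs, max_abs) for _ in range(n_filters)]


def rpfb_features(series, poles):
    """Run each first-order filter y_t = pole * y_{t-1} + x_t over the series.

    Returns one feature vector per time step, with one entry per filter.
    """
    state = [0.0] * len(poles)
    features = []
    for x in series:
        state = [pole * y + x for pole, y in zip(poles, state)]
        features.append(list(state))
    return features


poles = make_poles(n_filters=4)
features = rpfb_features([1.0, 0.0, 0.0, 2.0], poles)
print(features[-1])  # feature vector summarising the series up to the last step
```

Each filter keeps an exponentially weighted memory of the past with its own decay rate, so the feature vector at time `t` can be passed to any standard regressor or classifier. The filters are independent of each other, which is why the computation parallelizes easily.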

newsLens: building and visualizing long-ranging news stories

Title newsLens: building and visualizing long-ranging news stories
Authors Philippe Laban, Marti Hearst
Abstract We propose a method to aggregate and organize a large, multi-source dataset of news articles into a collection of major stories, and automatically name and visualize these stories in a working system. The approach is able to run online, as new articles are added, processing 4 million news articles from 20 news sources, and extracting 80000 major stories, some of which span several years. The visual interface consists of lanes of timelines, each annotated with information that is deemed important for the story, including extracted quotations. The working system allows a user to search and navigate 8 years of story information.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2701/
PDF https://www.aclweb.org/anthology/W17-2701
PWC https://paperswithcode.com/paper/newslens-building-and-visualizing-long
Repo
Framework

Predicting User Competence from Linguistic Data

Title Predicting User Competence from Linguistic Data
Authors Yonas Woldemariam, Henrik Björklund, Suna Bensch
Abstract
Tasks
Published 2017-12-01
URL https://www.aclweb.org/anthology/W17-7558/
PDF https://www.aclweb.org/anthology/W17-7558
PWC https://paperswithcode.com/paper/predicting-user-competence-from-linguistic
Repo
Framework

Learning Efficient Object Detection Models with Knowledge Distillation

Title Learning Efficient Object Detection Models with Knowledge Distillation
Authors Guobin Chen, Wongun Choi, Xiang Yu, Tony Han, Manmohan Chandraker
Abstract Despite significant accuracy improvement in convolutional neural networks (CNN) based object detectors, they often require prohibitive runtimes to process an image for real-time applications. State-of-the-art models often use very deep networks with a large number of floating point operations. Efforts such as model compression learn compact models with fewer number of parameters, but with much reduced accuracy. In this work, we propose a new framework to learn compact and fast object detection networks with improved accuracy using knowledge distillation [20] and hint learning [34]. Although knowledge distillation has demonstrated excellent improvements for simpler classification setups, the complexity of detection poses new challenges in the form of regression, region proposals and less voluminous labels. We address this through several innovations such as a weighted cross-entropy loss to address class imbalance, a teacher bounded loss to handle the regression component and adaptation layers to better learn from intermediate teacher distributions. We conduct comprehensive empirical evaluation with different distillation configurations over multiple datasets including PASCAL, KITTI, ILSVRC and MS-COCO. Our results show consistent improvement in accuracy-speed trade-offs for modern multi-class detection models.
Tasks Model Compression, Object Detection
Published 2017-12-01
URL http://papers.nips.cc/paper/6676-learning-efficient-object-detection-models-with-knowledge-distillation
PDF http://papers.nips.cc/paper/6676-learning-efficient-object-detection-models-with-knowledge-distillation.pdf
PWC https://paperswithcode.com/paper/learning-efficient-object-detection-models
Repo
Framework
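
Two of the loss terms named in the abstract above can be sketched as follows. The exact formulas are assumptions made for illustration, since the abstract does not state them: the classification term is taken to be a class-weighted cross-entropy between teacher and student probabilities, and the regression term is taken to penalise the student only while its squared error (plus a margin) exceeds the teacher's. The function names and the `margin` parameter are hypothetical.

```python
import math


def weighted_cross_entropy(teacher_probs, student_probs, class_weights):
    """Assumed form: -sum_c w_c * p_teacher(c) * log p_student(c)."""
    return -sum(
        w * p_t * math.log(p_s)
        for w, p_t, p_s in zip(class_weights, teacher_probs, student_probs)
    )


def squared_error(box, target):
    """Sum of squared coordinate differences between two boxes."""
    return sum((b - t) ** 2 for b, t in zip(box, target))


def teacher_bounded_l2(student_box, teacher_box, target_box, margin=0.0):
    """Assumed form: student L2 error, counted only when the teacher does better."""
    student_err = squared_error(student_box, target_box)
    teacher_err = squared_error(teacher_box, target_box)
    return student_err if student_err + margin > teacher_err else 0.0


# Student is worse than the teacher, so the regression penalty applies.
print(teacher_bounded_l2([2.0, 0.0], [1.0, 0.0], [0.0, 0.0]))
# Student already beats the teacher, so no penalty is applied.
print(teacher_bounded_l2([0.0, 0.0], [1.0, 1.0], [0.0, 0.0]))
```

Under this reading, a larger weight on a class (for example, the background class in detection) makes student mistakes on that class cost more, and the teacher acts as a bound rather than a target for box regression.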

Event Detection Using Frame-Semantic Parser

Title Event Detection Using Frame-Semantic Parser
Authors Evangelia Spiliopoulou, Eduard Hovy, Teruko Mitamura
Abstract Recent methods for Event Detection focus on Deep Learning for automatic feature generation and feature ranking. However, most of those approaches fail to exploit rich semantic information, which results in relatively poor recall. This paper is a small & focused contribution, where we introduce an Event Detection and classification system, based on deep semantic information retrieved from a frame-semantic parser. Our experiments show that our system achieves higher recall than state-of-the-art systems. Further, we claim that enhancing our system with deep learning techniques like feature ranking can achieve even better results, as it can benefit from both approaches.
Tasks Word Embeddings
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2703/
PDF https://www.aclweb.org/anthology/W17-2703
PWC https://paperswithcode.com/paper/event-detection-using-frame-semantic-parser
Repo
Framework

Improving Shared Argument Identification in Japanese Event Knowledge Acquisition

Title Improving Shared Argument Identification in Japanese Event Knowledge Acquisition
Authors Yin Jou Huang, Sadao Kurohashi
Abstract Event knowledge represents the knowledge of causal and temporal relations between events. Shared arguments of event knowledge encode patterns of role shifting in successive events. A two-stage framework was proposed for the task of Japanese event knowledge acquisition, in which related event pairs are first extracted, and shared arguments are then identified to form the complete event knowledge. This paper focuses on the second stage of this framework, and proposes a method to improve the shared argument identification of related event pairs. We constructed a gold dataset for shared argument learning. By evaluating our system on this gold dataset, we found that our proposed model outperformed the baseline models by a large margin.
Tasks Coreference Resolution, Text Generation
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2704/
PDF https://www.aclweb.org/anthology/W17-2704
PWC https://paperswithcode.com/paper/improving-shared-argument-identification-in
Repo
Framework