July 26, 2019

Paper Group NANR 157

Identifying Usage Expression Sentences in Consumer Product Reviews. Discovering Stylistic Variations in Distributional Vector Space Models via Lexical Paraphrases. Adversarial Surrogate Losses for Ordinal Regression. Pay Attention to the Ending: Strong Neural Baselines for the ROC Story Cloze Task. Named Entity Recognition in the Medical Domain with …

Identifying Usage Expression Sentences in Consumer Product Reviews

Title Identifying Usage Expression Sentences in Consumer Product Reviews
Authors Shibamouli Lahiri, V.G.Vinod Vydiswaran, Rada Mihalcea
Abstract In this paper we introduce the problem of identifying usage expression sentences in a consumer product review. We create a human-annotated gold standard dataset of 565 reviews spanning five distinct product categories. Our dataset consists of more than 3,000 annotated sentences. We further introduce a classification system to label sentences according to whether or not they describe some "usage". The system combines lexical, syntactic, and semantic features in a product-agnostic fashion to yield good classification performance. We show the effectiveness of our approach using importance ranking of features, error analysis, and cross-product classification experiments.
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1040/
PDF https://www.aclweb.org/anthology/I17-1040
PWC https://paperswithcode.com/paper/identifying-usage-expression-sentences-in
Repo
Framework

Discovering Stylistic Variations in Distributional Vector Space Models via Lexical Paraphrases

Title Discovering Stylistic Variations in Distributional Vector Space Models via Lexical Paraphrases
Authors Xing Niu, Marine Carpuat
Abstract Detecting and analyzing stylistic variation in language is relevant to diverse Natural Language Processing applications. In this work, we investigate whether salient dimensions of style variations are embedded in standard distributional vector spaces of word meaning. We hypothesize that distances between embeddings of lexical paraphrases can help isolate style from meaning variations and help identify latent style dimensions. We conduct a qualitative analysis of latent style dimensions, and show the effectiveness of identified style subspaces on a lexical formality prediction task.
Tasks Semantic Textual Similarity, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4903/
PDF https://www.aclweb.org/anthology/W17-4903
PWC https://paperswithcode.com/paper/discovering-stylistic-variations-in
Repo
Framework

Adversarial Surrogate Losses for Ordinal Regression

Title Adversarial Surrogate Losses for Ordinal Regression
Authors Rizal Fathony, Mohammad Ali Bashiri, Brian Ziebart
Abstract Ordinal regression seeks class label predictions when the penalty incurred for mistakes increases according to an ordering over the labels. The absolute error is a canonical example. Many existing methods for this task reduce to binary classification problems and employ surrogate losses, such as the hinge loss. We instead derive uniquely defined surrogate ordinal regression loss functions by seeking the predictor that is robust to the worst-case approximations of training data labels, subject to matching certain provided training data statistics. We demonstrate the advantages of our approach over other surrogate losses based on hinge loss approximations using UCI ordinal prediction tasks.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/6659-adversarial-surrogate-losses-for-ordinal-regression
PDF http://papers.nips.cc/paper/6659-adversarial-surrogate-losses-for-ordinal-regression.pdf
PWC https://paperswithcode.com/paper/adversarial-surrogate-losses-for-ordinal
Repo
Framework
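
The abstract above names the absolute error as a canonical ordinal loss. The following is a minimal sketch of why that loss reflects label ordering while the zero-one loss does not. It is not the adversarial surrogate loss derived in the paper; the function names and the toy labels (integers 1 to 5) are assumptions made for illustration.

```python
def absolute_error(y_true, y_pred):
    """Mean absolute error between ordinal labels encoded as integers."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def zero_one_error(y_true, y_pred):
    """Fraction of mismatched labels; ignores how far off a prediction is."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical ratings on a 1-5 scale.
y_true = [1, 2, 3, 4, 5]
near_miss = [2, 2, 3, 4, 5]  # first prediction is off by one level
far_miss = [5, 2, 3, 4, 5]   # first prediction is off by four levels

print(zero_one_error(y_true, near_miss), zero_one_error(y_true, far_miss))
print(absolute_error(y_true, near_miss), absolute_error(y_true, far_miss))
```

Both prediction lists contain exactly one mistake, so the zero-one error cannot tell them apart, whereas the absolute error grows with the distance between the predicted and true levels.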

Pay Attention to the Ending: Strong Neural Baselines for the ROC Story Cloze Task

Title Pay Attention to the Ending: Strong Neural Baselines for the ROC Story Cloze Task
Authors Zheng Cai, Lifu Tu, Kevin Gimpel
Abstract We consider the ROC story cloze task (Mostafazadeh et al., 2016) and present several findings. We develop a model that uses hierarchical recurrent networks with attention to encode the sentences in the story and score candidate endings. By discarding the large training set and only training on the validation set, we achieve an accuracy of 74.7%. Even when we discard the story plots (sentences before the ending) and only train to choose the better of two endings, we can still reach 72.5%. We then analyze this "ending-only" task setting. We estimate human accuracy to be 78% and find several types of clues that lead to this high accuracy, including those related to sentiment, negation, and general ending likelihood regardless of the story context.
Tasks Outlier Detection, Sentiment Analysis
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2097/
PDF https://www.aclweb.org/anthology/P17-2097
PWC https://paperswithcode.com/paper/pay-attention-to-the-endingstrong-neural
Repo
Framework

Named Entity Recognition in the Medical Domain with Constrained CRF Models

Title Named Entity Recognition in the Medical Domain with Constrained CRF Models
Authors Charles Jochim, Léa Deleris
Abstract This paper investigates how to improve performance on information extraction tasks by constraining and sequencing CRF-based approaches. We consider two different relation extraction tasks, both from the medical literature: dependence relations and probability statements. We explore whether adding constraints can lead to an improvement over standard CRF decoding. Results on our relation extraction tasks are promising, showing significant increases in performance from both (i) adding constraints to post-process the output of a baseline CRF, which captures "domain knowledge", and (ii) further allowing flexibility in the application of those constraints by leveraging a binary classifier as a pre-processing step.
Tasks Entity Extraction, Named Entity Recognition, Relation Extraction
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1079/
PDF https://www.aclweb.org/anthology/E17-1079
PWC https://paperswithcode.com/paper/named-entity-recognition-in-the-medical
Repo
Framework

Uniform Convergence Rates for Kernel Density Estimation

Title Uniform Convergence Rates for Kernel Density Estimation
Authors Heinrich Jiang
Abstract Kernel density estimation (KDE) is a popular nonparametric density estimation method. We (1) derive finite-sample high-probability density estimation bounds for multivariate KDE under mild density assumptions which hold uniformly in $x \in \mathbb{R}^d$ and bandwidth matrices. We apply these results to (2) mode, (3) density level set, and (4) class probability estimation and attain optimal rates up to logarithmic factors. We then (5) provide an extension of our results under the manifold hypothesis. Finally, we (6) give uniform convergence results for local intrinsic dimension estimation.
Tasks Density Estimation
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=545
PDF http://proceedings.mlr.press/v70/jiang17b/jiang17b.pdf
PWC https://paperswithcode.com/paper/uniform-convergence-rates-for-kernel-density
Repo
Framework
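
For readers unfamiliar with the estimator discussed in the abstract above, here is a minimal one-dimensional sketch of kernel density estimation with a Gaussian kernel. The paper treats the multivariate case with bandwidth matrices; the scalar bandwidth, the function name `gaussian_kde`, and the toy samples below are simplifying assumptions for illustration only.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function x -> estimated density, using a Gaussian kernel.

    The estimate is (1 / (n * h)) * sum_i K((x - x_i) / h), where K is the
    standard normal density and h is the bandwidth.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical toy data: two nearby points and one farther away.
samples = [0.0, 1.0, 4.0]
density = gaussian_kde(samples, bandwidth=0.5)

# Sanity check: a Riemann sum over a wide interval should be close to 1.
step = 0.01
total_mass = sum(density(-10.0 + i * step) for i in range(2401)) * step
print(round(total_mass, 3))
```

A smaller bandwidth makes the estimate spikier around each sample and a larger one smooths it out, which is why bounds that hold uniformly over bandwidths, as the abstract describes, are of practical interest.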

Word Sense Filtering Improves Embedding-Based Lexical Substitution

Title Word Sense Filtering Improves Embedding-Based Lexical Substitution
Authors Anne Cocos, Marianna Apidianaki, Chris Callison-Burch
Abstract The role of word sense disambiguation in lexical substitution has been questioned due to the high performance of vector space models which propose good substitutes without explicitly accounting for sense. We show that a filtering mechanism based on a sense inventory optimized for substitutability can improve the results of these models. Our sense inventory is constructed using a clustering method which generates paraphrase clusters that are congruent with lexical substitution annotations in a development set. The results show that lexical substitution can still benefit from senses which can improve the output of vector space paraphrase ranking models.
Tasks Entity Extraction, Part-Of-Speech Tagging, Semantic Textual Similarity, Sentiment Analysis, Word Embeddings, Word Sense Disambiguation
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1914/
PDF https://www.aclweb.org/anthology/W17-1914
PWC https://paperswithcode.com/paper/word-sense-filtering-improves-embedding-based
Repo
Framework

Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing

Title Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing
Authors
Abstract
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1400/
PDF https://www.aclweb.org/anthology/W17-1400
PWC https://paperswithcode.com/paper/proceedings-of-the-6th-workshop-on-balto
Repo
Framework

Globally Induced Forest: A Prepruning Compression Scheme

Title Globally Induced Forest: A Prepruning Compression Scheme
Authors Jean-Michel Begon, Arnaud Joly, Pierre Geurts
Abstract Tree-based ensemble models are heavy memory-wise, an undesirable state of affairs given today's datasets, memory-constrained environments, and fitting/prediction times. In this paper, we propose the Globally Induced Forest (GIF) to remedy this problem. GIF is a fast prepruning approach to build lightweight ensembles by iteratively deepening the current forest. It mixes local and global optimizations to produce accurate predictions under memory constraints in reasonable time. We show that the proposed method is more than competitive with standard tree-based ensembles under corresponding constraints, and can sometimes even surpass much larger models.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=802
PDF http://proceedings.mlr.press/v70/begon17a/begon17a.pdf
PWC https://paperswithcode.com/paper/globally-induced-forest-a-prepruning
Repo
Framework

Event Argument Identification on Dependency Graphs with Bidirectional LSTMs

Title Event Argument Identification on Dependency Graphs with Bidirectional LSTMs
Authors Alex Judea, Michael Strube
Abstract In this paper we investigate the performance of event argument identification. We show that the performance is tied to syntactic complexity. Based on this finding, we propose a novel and effective system for event argument identification. Recurrent Neural Networks learn to produce meaningful representations of long and short dependency paths. Convolutional Neural Networks learn to decompose the lexical context of argument candidates. They are combined into a simple system which outperforms a feature-based, state-of-the-art event argument identifier without any manual feature engineering.
Tasks Feature Engineering
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1083/
PDF https://www.aclweb.org/anthology/I17-1083
PWC https://paperswithcode.com/paper/event-argument-identification-on-dependency
Repo
Framework

Forest-type Regression with General Losses and Robust Forest

Title Forest-type Regression with General Losses and Robust Forest
Authors Alexander Hanbo Li, Andrew Martin
Abstract This paper introduces a new general framework for forest-type regression which allows the development of robust forest regressors by selecting from a large family of robust loss functions. In particular, when plugged in the squared error and quantile losses, it will recover the classical random forest and quantile random forest. We then use robust loss functions to develop more robust forest-type regression algorithms. In the experiments, we show by simulation and real data that our robust forests are indeed much more insensitive to outliers, and choosing the right number of nearest neighbors can quickly improve the generalization performance of random forest.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=871
PDF http://proceedings.mlr.press/v70/li17e/li17e.pdf
PWC https://paperswithcode.com/paper/forest-type-regression-with-general-losses
Repo
Framework
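
The abstract above says that plugging in the squared error and quantile losses recovers the classical random forest and the quantile random forest. A rough sketch of that idea, assuming a prediction is the minimizer of a weighted loss over training responses: squared error yields a weighted mean, and the quantile (pinball) loss yields a weighted quantile. The weights here are hypothetical placeholders for whatever a forest would assign to each training point; this is not the authors' implementation.

```python
def pinball_loss(residual, tau):
    """Quantile (pinball) loss for residual = y - theta at level tau."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def weighted_squared_minimizer(ys, weights):
    """Minimizer of sum_i w_i * (y_i - theta)^2, i.e. the weighted mean."""
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def weighted_quantile_minimizer(ys, weights, tau):
    """Minimizer of sum_i w_i * pinball(y_i - theta, tau).

    The objective is piecewise linear and convex in theta, so a minimizer
    is attained at one of the observed responses; we simply try them all.
    """
    def objective(theta):
        return sum(
            w * pinball_loss(y - theta, tau) for w, y in zip(weights, ys)
        )

    return min(sorted(ys), key=objective)


# Hypothetical responses with one gross outlier, equally weighted.
ys = [1.0, 2.0, 3.0, 4.0, 100.0]
weights = [1.0] * len(ys)

mean_estimate = weighted_squared_minimizer(ys, weights)
median_estimate = weighted_quantile_minimizer(ys, weights, tau=0.5)
print(mean_estimate, median_estimate)
```

The squared-error estimate is pulled far toward the outlier while the median-type estimate stays with the bulk of the data, which is the kind of insensitivity to outliers the abstract refers to.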

Sparse + Group-Sparse Dirty Models: Statistical Guarantees without Unreasonable Conditions and a Case for Non-Convexity

Title Sparse + Group-Sparse Dirty Models: Statistical Guarantees without Unreasonable Conditions and a Case for Non-Convexity
Authors Eunho Yang, Aurélie C. Lozano
Abstract Imposing sparse + group-sparse superposition structures in high-dimensional parameter estimation is known to provide flexible regularization that is more realistic for many real-world problems. For example, such a superposition enables partially-shared support sets in multi-task learning, thereby striking the right balance between parameter overlap across tasks and task specificity. Existing theoretical results on estimation consistency, however, are problematic as they require too stringent an assumption: the incoherence between sparse and group-sparse superposed components. In this paper, we fill the gap between the practical success and suboptimal analysis of sparse + group-sparse models, by providing the first consistency results that do not require unrealistic assumptions. We also study non-convex counterparts of sparse + group-sparse models. Interestingly, we show that these are guaranteed to recover the true support set under much milder conditions and with smaller sample size than convex models, which might be critical in practical applications as illustrated by our experiments.
Tasks Multi-Task Learning
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=647
PDF http://proceedings.mlr.press/v70/yang17g/yang17g.pdf
PWC https://paperswithcode.com/paper/sparse-group-sparse-dirty-models-statistical
Repo
Framework

Proceedings of the 2nd Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2017)

Title Proceedings of the 2nd Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2017)
Authors
Abstract
Tasks Coreference Resolution
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1500/
PDF https://www.aclweb.org/anthology/W17-1500
PWC https://paperswithcode.com/paper/proceedings-of-the-2nd-workshop-on
Repo
Framework

Investigating the content and form of referring expressions in Mandarin: introducing the Mtuna corpus

Title Investigating the content and form of referring expressions in Mandarin: introducing the Mtuna corpus
Authors Kees van Deemter, Le Sun, Rint Sybesma, Xiao Li, Bo Chen, Muyun Yang
Abstract East Asian languages are thought to handle reference differently from languages such as English, particularly in terms of the marking of definiteness and number. We present the first Data-Text corpus for Referring Expressions in Mandarin, and we use this corpus to test some initial hypotheses inspired by the theoretical linguistics literature. Our findings suggest that function words deserve more attention in Referring Expressions Generation than they have so far received, and they have a bearing on the debate about whether different languages make different trade-offs between clarity and brevity.
Tasks Text Generation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-3532/
PDF https://www.aclweb.org/anthology/W17-3532
PWC https://paperswithcode.com/paper/investigating-the-content-and-form-of
Repo
Framework

Automated Historical Fact-Checking by Passage Retrieval, Word Statistics, and Virtual Question-Answering

Title Automated Historical Fact-Checking by Passage Retrieval, Word Statistics, and Virtual Question-Answering
Authors Mio Kobayashi, Ai Ishii, Chikara Hoshino, Hiroshi Miyashita, Takuya Matsuzaki
Abstract This paper presents a hybrid approach to the verification of statements about historical facts. The test data was collected from the world history examinations in a standardized achievement test for high school students. The data includes various kinds of false statements that were carefully written so as to deceive the students while they can be disproven on the basis of the teaching materials. Our system predicts the truth or falsehood of a statement based on text search, word cooccurrence statistics, factoid-style question answering, and temporal relation recognition. These features contribute to the judgement complementarily and achieved the state-of-the-art accuracy.
Tasks Question Answering
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-1097/
PDF https://www.aclweb.org/anthology/I17-1097
PWC https://paperswithcode.com/paper/automated-historical-fact-checking-by-passage
Repo
Framework