July 26, 2019


Paper Group NANR 145


A working, non-trivial, topically indifferent NLG System for 17 languages

Title A working, non-trivial, topically indifferent NLG System for 17 languages
Authors Robert Weißgraeber, Andreas Madsack
Abstract A fully fledged, practical working application for a rule-based NLG system is presented that is able to create non-trivial, human-sounding narrative from structured data, in any language and for any topic.
Tasks Text Generation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-3524/
PDF https://www.aclweb.org/anthology/W17-3524
PWC https://paperswithcode.com/paper/a-working-non-trivial-topically-indifferent
Repo
Framework

Exploiting Morphological Regularities in Distributional Word Representations

Title Exploiting Morphological Regularities in Distributional Word Representations
Authors Arihant Gupta, Syed Sarfaraz Akhtar, Avijit Vajpayee, Arjit Srivastava, Madan Gopal Jhanwar, Manish Shrivastava
Abstract We present an unsupervised, language-agnostic approach for exploiting morphological regularities present in high-dimensional vector spaces. We propose a novel method for generating embeddings of words from their morphological variants using morphological transformation operators. We evaluate this approach on the MSR word analogy test set with an accuracy of 85%, which is 12% higher than the previous best known system.
Tasks Chunking, Document Classification, Question Answering, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1028/
PDF https://www.aclweb.org/anthology/D17-1028
PWC https://paperswithcode.com/paper/exploiting-morphological-regularities-in
Repo
Framework
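
The word analogy evaluation mentioned in the abstract above can be illustrated with a minimal sketch. It uses the common vector-offset formulation ("a is to b as c is to d", answered by the word closest to b - a + c), which is an assumption about the evaluation protocol and is not the authors' morphological transformation operators. The toy vectors and the names `solve_analogy` and `analogy_accuracy` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def solve_analogy(embeddings, a, b, c):
    """Return the word d that best completes 'a is to b as c is to d'."""
    target = [vb - va + vc
              for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])]
    # The three question words are excluded from the candidates, as is usual.
    candidates = (w for w in embeddings if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(embeddings[w], target))

def analogy_accuracy(embeddings, questions):
    """Fraction of (a, b, c, d) questions answered correctly."""
    correct = sum(solve_analogy(embeddings, a, b, c) == d
                  for a, b, c, d in questions)
    return correct / len(questions)

# Hypothetical toy vectors: the second dimension plays the role of "past tense".
TOY_EMBEDDINGS = {
    "walk":   [1.0, 0.0, 0.0],
    "walked": [1.0, 1.0, 0.0],
    "jump":   [0.0, 0.0, 1.0],
    "jumped": [0.0, 1.0, 1.0],
    "table":  [0.5, 0.0, 0.5],
}

TOY_QUESTIONS = [
    ("walk", "walked", "jump", "jumped"),
    ("jump", "jumped", "walk", "walked"),
]

if __name__ == "__main__":
    print(solve_analogy(TOY_EMBEDDINGS, "walk", "walked", "jump"))
    print(analogy_accuracy(TOY_EMBEDDINGS, TOY_QUESTIONS))
```

In a real evaluation the embeddings would come from a trained model and the questions from the test set; only the scoring loop would stay the same.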

iSurvive: An Interpretable, Event-time Prediction Model for mHealth

Title iSurvive: An Interpretable, Event-time Prediction Model for mHealth
Authors Walter H. Dempsey, Alexander Moreno, Christy K. Scott, Michael L. Dennis, David H. Gustafson, Susan A. Murphy, James M. Rehg
Abstract An important mobile health (mHealth) task is the use of multimodal data, such as sensor streams and self-report, to construct interpretable time-to-event predictions of, for example, lapse to alcohol or illicit drug use. Interpretability of the prediction model is important for acceptance and adoption by domain scientists, enabling model outputs and parameters to inform theory and guide intervention design. Temporal latent state models are therefore attractive, and so we adopt the continuous time hidden Markov model (CT-HMM) due to its ability to describe irregular arrival times of event data. Standard CT-HMMs, however, are not specialized for predicting the time to a future event, the key variable for mHealth interventions. Also, standard emission models lack a sufficiently rich structure to describe multimodal data and incorporate domain knowledge. We present iSurvive, an extension of classical survival analysis to a CT-HMM. We present a parameter learning method for GLM emissions and survival model fitting, and present promising results on both synthetic data and an mHealth drug use dataset.
Tasks Survival Analysis
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=732
PDF http://proceedings.mlr.press/v70/dempsey17a/dempsey17a.pdf
PWC https://paperswithcode.com/paper/isurvive-an-interpretable-event-time
Repo
Framework
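
As background for the classical survival analysis that the abstract above says iSurvive extends, the following is a minimal sketch of the Kaplan-Meier estimator, one standard way to estimate a survival curve from time-to-event data with censoring. It is not the authors' CT-HMM model, and the durations and event flags are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of the survival function.

    durations: time until the event or until censoring, one per subject.
    observed:  True if the event was observed, False if the subject was censored.
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Events that occur exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve

if __name__ == "__main__":
    # Hypothetical data: days until lapse; False marks a subject who left the study early.
    days = [2, 3, 3, 5, 8]
    lapsed = [True, True, False, True, False]
    for t, s in kaplan_meier(days, lapsed):
        print(f"t={t}: S(t)={s:.2f}")
```

Censored subjects count toward the at-risk set up to the time they drop out but never contribute an event, which is what distinguishes this estimate from a simple fraction of subjects who have not yet lapsed.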

Czech Dataset for Semantic Similarity and Relatedness

Title Czech Dataset for Semantic Similarity and Relatedness
Authors Miloslav Konopík, Ondřej Pražák, David Steinberger
Abstract This paper introduces a Czech dataset for semantic similarity and semantic relatedness. The dataset contains word pairs with hand-annotated scores that indicate the semantic similarity and semantic relatedness of the words. The dataset contains 953 word pairs compiled from 9 different sources. It contains words and their contexts taken from real text corpora, including extra examples when the words are ambiguous. The dataset is annotated by 5 independent annotators. The average Spearman correlation coefficient of the annotation agreement is $r = 0.81$. We provide reference evaluation experiments with several methods for computing semantic similarity and relatedness.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1053/
PDF https://doi.org/10.26615/978-954-452-049-6_053
PWC https://paperswithcode.com/paper/czech-dataset-for-semantic-similarity-and
Repo
Framework
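
The agreement figure in the abstract above is an average Spearman correlation. A minimal sketch of one way to compute such a number follows, assuming the average is taken over all annotator pairs; the abstract does not say how the average was formed, so that choice, the function names, and the scores are all hypothetical.

```python
import math
from itertools import combinations

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

def mean_pairwise_spearman(annotations):
    """Average Spearman correlation over all pairs of annotators (an assumption)."""
    pairs = list(combinations(annotations, 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)

if __name__ == "__main__":
    # Hypothetical scores from three annotators for the same five word pairs.
    scores = [
        [4, 1, 3, 0, 2],
        [4, 0, 3, 1, 2],
        [3, 1, 4, 0, 2],
    ]
    print(round(mean_pairwise_spearman(scores), 3))
```

Working on ranks rather than raw scores means annotators who use the scale differently but order the word pairs the same way still count as agreeing.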

Co-reference Resolution in Tamil Text

Title Co-reference Resolution in Tamil Text
Authors Vijay Sundar Ram, Sobha Lalitha Devi
Abstract
Tasks
Published 2017-12-01
URL https://www.aclweb.org/anthology/W17-7548/
PDF https://www.aclweb.org/anthology/W17-7548
PWC https://paperswithcode.com/paper/co-reference-resolution-in-tamil-text
Repo
Framework

An Empirical Bayes Approach to Optimizing Machine Learning Algorithms

Title An Empirical Bayes Approach to Optimizing Machine Learning Algorithms
Authors James McInerney
Abstract There is rapidly growing interest in using Bayesian optimization to tune model and inference hyperparameters for machine learning algorithms that take a long time to run. For example, Spearmint is a popular software package for selecting the optimal number of layers and learning rate in neural networks. But given that there is uncertainty about which hyperparameters give the best predictive performance, and given that fitting a model for each choice of hyperparameters is costly, it is arguably wasteful to “throw away” all but the best result, as per Bayesian optimization. A related issue is the danger of overfitting the validation data when optimizing many hyperparameters. In this paper, we consider an alternative approach that uses more samples from the hyperparameter selection procedure to average over the uncertainty in model hyperparameters. The resulting approach, empirical Bayes for hyperparameter averaging (EB-Hyp), predicts held-out data better than Bayesian optimization in two experiments on latent Dirichlet allocation and deep latent Gaussian models. EB-Hyp suggests a simpler approach to evaluating and deploying machine learning algorithms that does not require a separate validation data set and hyperparameter selection procedure.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/6864-an-empirical-bayes-approach-to-optimizing-machine-learning-algorithms
PDF http://papers.nips.cc/paper/6864-an-empirical-bayes-approach-to-optimizing-machine-learning-algorithms.pdf
PWC https://paperswithcode.com/paper/an-empirical-bayes-approach-to-optimizing
Repo
Framework

Out-of-domain FrameNet Semantic Role Labeling

Title Out-of-domain FrameNet Semantic Role Labeling
Authors Silvana Hartmann, Ilia Kuznetsov, Teresa Martin, Iryna Gurevych
Abstract Domain dependence of NLP systems is one of the major obstacles to their application in large-scale text analysis, also restricting the applicability of FrameNet semantic role labeling (SRL) systems. Yet, current FrameNet SRL systems are still only evaluated on a single in-domain test set. For the first time, we study the domain dependence of FrameNet SRL on a wide range of benchmark sets. We create a novel test set for FrameNet SRL based on user-generated web text and find that the major bottleneck for out-of-domain FrameNet SRL is the frame identification step. To address this problem, we develop a simple, yet efficient system based on distributed word representations. Our system closely approaches the state-of-the-art in-domain while outperforming the best available frame identification system out-of-domain. We publish our system and test data for research purposes.
Tasks Semantic Role Labeling
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1045/
PDF https://www.aclweb.org/anthology/E17-1045
PWC https://paperswithcode.com/paper/out-of-domain-framenet-semantic-role-labeling
Repo
Framework

Vectors for Counterspeech on Twitter

Title Vectors for Counterspeech on Twitter
Authors Lucas Wright, Derek Ruths, Kelly P Dillon, Haji Mohammad Saleem, Susan Benesch
Abstract A study of conversations on Twitter found that some arguments between strangers led to favorable change in discourse and even in attitudes. The authors propose that such exchanges can be usefully distinguished according to whether individuals or groups take part on each side, since the opportunity for a constructive exchange of views seems to vary accordingly.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-3009/
PDF https://www.aclweb.org/anthology/W17-3009
PWC https://paperswithcode.com/paper/vectors-for-counterspeech-on-twitter
Repo
Framework

Parsing for Grammatical Relations via Graph Merging

Title Parsing for Grammatical Relations via Graph Merging
Authors Weiwei Sun, Yantao Du, Xiaojun Wan
Abstract This paper is concerned with building deep grammatical relation (GR) analysis using a data-driven approach. To deal with this problem, we propose graph merging, a new perspective for building flexible dependency graphs: constructing complex graphs via constructing simple subgraphs. We discuss two key problems in this perspective: (1) how to decompose a complex graph into simple subgraphs, and (2) how to combine subgraphs into a coherent complex graph. Experiments demonstrate the effectiveness of graph merging. Our parser reaches state-of-the-art performance and is significantly better than two transition-based parsers.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-1005/
PDF https://www.aclweb.org/anthology/K17-1005
PWC https://paperswithcode.com/paper/parsing-for-grammatical-relations-via-graph
Repo
Framework

Rephrasing Profanity in Chinese Text

Title Rephrasing Profanity in Chinese Text
Authors Hui-Po Su, Zhen-Jie Huang, Hao-Tsung Chang, Chuan-Jie Lin
Abstract This paper proposes a system that can detect and rephrase profanity in Chinese text. Rather than just masking detected profanity, we want to revise the input sentence by using inoffensive words while keeping its original meaning. 29 such rephrasing rules were devised after observing sentences on real-world social websites. The overall accuracy of the proposed system is 85.56%.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-3003/
PDF https://www.aclweb.org/anthology/W17-3003
PWC https://paperswithcode.com/paper/rephrasing-profanity-in-chinese-text
Repo
Framework

PurdueNLP at SemEval-2017 Task 1: Predicting Semantic Textual Similarity with Paraphrase and Event Embeddings

Title PurdueNLP at SemEval-2017 Task 1: Predicting Semantic Textual Similarity with Paraphrase and Event Embeddings
Authors I-Ta Lee, Mahak Goindani, Chang Li, Di Jin, Kristen Marie Johnson, Xiao Zhang, Maria Leonor Pacheco, Dan Goldwasser
Abstract This paper describes our proposed solution for SemEval 2017 Task 1: Semantic Textual Similarity (Daniel Cer and Specia, 2017). The task aims at measuring the degree of equivalence between sentences given in English. Performance is evaluated by computing Pearson correlation scores between the predicted scores and human judgements. Our proposed system consists of two subsystems and one regression model for predicting STS scores. The two subsystems are designed to learn Paraphrase and Event Embeddings that bring paraphrasing characteristics and sentence structures into our system. The regression model combines these embeddings to make the final predictions. The experimental results show that our system achieves a Pearson correlation score of 0.8 on this task.
Tasks Question Answering, Semantic Textual Similarity, Word Embeddings
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2029/
PDF https://www.aclweb.org/anthology/S17-2029
PWC https://paperswithcode.com/paper/purduenlp-at-semeval-2017-task-1-predicting
Repo
Framework

The Agreement Measure γcat a Complement to γ Focused on Categorization of a Continuum

Title The Agreement Measure γcat a Complement to γ Focused on Categorization of a Continuum
Authors Yann Mathet
Abstract Agreement on unitizing, where several annotators freely put units of various sizes and categories on a continuum, is difficult to assess because of the simultaneous discrepancies in positioning and categorizing. The recent agreement measure γ offers an overall solution that simultaneously takes into account positions and categories. In this article, I propose the additional coefficient γcat, which complements γ by assessing the agreement on categorization of a continuum, putting aside positional discrepancies. When applied to pure categorization (with predefined units), γcat behaves the same way as the famous dedicated Krippendorff's α, even with missing values, which proves its consistency. A variation of γcat is also proposed that provides an in-depth assessment of categorizing for each individual category. The entire family of γ coefficients is implemented in free software.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/J17-3006/
PDF https://www.aclweb.org/anthology/J17-3006
PWC https://paperswithcode.com/paper/the-agreement-measure-i3cat-a-complement-to
Repo
Framework

Annotating Italian Social Media Texts in Universal Dependencies

Title Annotating Italian Social Media Texts in Universal Dependencies
Authors Manuela Sanguinetti, Cristina Bosco, Alessandro Mazzei, Alberto Lavelli, Fabio Tamburini
Abstract
Tasks Opinion Mining, Sentiment Analysis
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-6526/
PDF https://www.aclweb.org/anthology/W17-6526
PWC https://paperswithcode.com/paper/annotating-italian-social-media-texts-in
Repo
Framework

OMAM at SemEval-2017 Task 4: Evaluation of English State-of-the-Art Sentiment Analysis Models for Arabic and a New Topic-based Model

Title OMAM at SemEval-2017 Task 4: Evaluation of English State-of-the-Art Sentiment Analysis Models for Arabic and a New Topic-based Model
Authors Ramy Baly, Gilbert Badaro, Ali Hamdi, Rawan Moukalled, Rita Aoun, Georges El-Khoury, Ahmad Al Sallab, Hazem Hajj, Nizar Habash, Khaled Shaban, Wassim El-Hajj
Abstract While sentiment analysis in English has achieved significant progress, it remains a challenging task in Arabic given the rich morphology of the language. It becomes more challenging when applied to Twitter data that comes with additional sources of noise including dialects, misspellings, grammatical mistakes, code switching and the use of non-textual objects to express sentiments. This paper describes the “OMAM” systems that we developed as part of SemEval-2017 task 4. We evaluate English state-of-the-art methods on Arabic tweets for subtask A. As for the remaining subtasks, we introduce a topic-based approach that accounts for topic specificities by predicting topics or domains of upcoming tweets, and then using this information to predict their sentiment. Results indicate that applying the English state-of-the-art method to Arabic has achieved solid results without significant enhancements. Furthermore, the topic-based method ranked 1st in subtasks C and E, and 2nd in subtask D.
Tasks Opinion Mining, Sentiment Analysis
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2099/
PDF https://www.aclweb.org/anthology/S17-2099
PWC https://paperswithcode.com/paper/omam-at-semeval-2017-task-4-evaluation-of
Repo
Framework

OMAM at SemEval-2017 Task 4: English Sentiment Analysis with Conditional Random Fields

Title OMAM at SemEval-2017 Task 4: English Sentiment Analysis with Conditional Random Fields
Authors Chukwuyem Onyibe, Nizar Habash
Abstract We describe a supervised system that uses optimized Conditional Random Fields and lexical features to predict the sentiment of a tweet. The system was submitted to the English version of all subtasks in SemEval-2017 Task 4.
Tasks Opinion Mining, Sentiment Analysis, Stance Detection
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2111/
PDF https://www.aclweb.org/anthology/S17-2111
PWC https://paperswithcode.com/paper/omam-at-semeval-2017-task-4-english-sentiment
Repo
Framework