July 26, 2019

Paper Group NANR 145


A working, non-trivial, topically indifferent NLG System for 17 languages


Title	A working, non-trivial, topically indifferent NLG System for 17 languages
Authors	Robert Weißgraeber, Andreas Madsack
Abstract	A fully fledged practical working application for a rule-based NLG system is presented that is able to create non-trivial, human sounding narrative from structured data, in any language and for any topic.
Tasks	Text Generation
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-3524/
PDF	https://www.aclweb.org/anthology/W17-3524
PWC	https://paperswithcode.com/paper/a-working-non-trivial-topically-indifferent
Repo
Framework

Exploiting Morphological Regularities in Distributional Word Representations


Title	Exploiting Morphological Regularities in Distributional Word Representations
Authors	Arihant Gupta, Syed Sarfaraz Akhtar, Avijit Vajpayee, Arjit Srivastava, Madan Gopal Jhanwar, Manish Shrivastava
Abstract	We present an unsupervised, language agnostic approach for exploiting morphological regularities present in high dimensional vector spaces. We propose a novel method for generating embeddings of words from their morphological variants using morphological transformation operators. We evaluate this approach on MSR word analogy test set with an accuracy of 85% which is 12% higher than the previous best known system.
Tasks	Chunking, Document Classification, Question Answering, Word Embeddings
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1028/
PDF	https://www.aclweb.org/anthology/D17-1028
PWC	https://paperswithcode.com/paper/exploiting-morphological-regularities-in
Repo
Framework
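
The MSR word analogy test set mentioned in the abstract poses questions of the form "a is to b as c is to ?". A minimal sketch of the common vector-offset way of answering such questions follows. It is not the authors' method; the three-dimensional toy vectors and the `solve_analogy` helper are invented here for illustration.

```python
import math

# Toy embeddings, invented for illustration only (not taken from the paper).
EMBEDDINGS = {
    "walk":    [1.0, 0.0, 0.0],
    "walked":  [1.0, 1.0, 0.0],
    "jump":    [0.0, 0.0, 1.0],
    "jumped":  [0.0, 1.0, 1.0],
    "jumping": [0.0, -1.0, 1.0],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def solve_analogy(a, b, c, embeddings):
    """Answer 'a is to b as c is to ?' using the offset vector b - a + c."""
    target = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    # The three query words are excluded from the candidate answers.
    candidates = (w for w in embeddings if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))

answer = solve_analogy("walk", "walked", "jump", EMBEDDINGS)
print(answer)
```

Accuracy on an analogy test set is then the fraction of questions for which the returned word matches the gold answer.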

iSurvive: An Interpretable, Event-time Prediction Model for mHealth


Title	iSurvive: An Interpretable, Event-time Prediction Model for mHealth
Authors	Walter H. Dempsey, Alexander Moreno, Christy K. Scott, Michael L. Dennis, David H. Gustafson, Susan A. Murphy, James M. Rehg
Abstract	An important mobile health (mHealth) task is the use of multimodal data, such as sensor streams and self-report, to construct interpretable time-to-event predictions of, for example, lapse to alcohol or illicit drug use. Interpretability of the prediction model is important for acceptance and adoption by domain scientists, enabling model outputs and parameters to inform theory and guide intervention design. Temporal latent state models are therefore attractive, and so we adopt the continuous time hidden Markov model (CT-HMM) due to its ability to describe irregular arrival times of event data. Standard CT-HMMs, however, are not specialized for predicting the time to a future event, the key variable for mHealth interventions. Also, standard emission models lack a sufficiently rich structure to describe multimodal data and incorporate domain knowledge. We present iSurvive, an extension of classical survival analysis to a CT-HMM. We present a parameter learning method for GLM emissions and survival model fitting, and present promising results on both synthetic data and an mHealth drug use dataset.
Tasks	Survival Analysis
Published	2017-08-01
URL	https://icml.cc/Conferences/2017/Schedule?showEvent=732
PDF	http://proceedings.mlr.press/v70/dempsey17a/dempsey17a.pdf
PWC	https://paperswithcode.com/paper/isurvive-an-interpretable-event-time
Repo
Framework

Czech Dataset for Semantic Similarity and Relatedness


Title	Czech Dataset for Semantic Similarity and Relatedness
Authors	Miloslav Konopík, Ondřej Pražák, David Steinberger
Abstract	This paper introduces a Czech dataset for semantic similarity and semantic relatedness. The dataset contains word pairs with hand annotated scores that indicate the semantic similarity and semantic relatedness of the words. The dataset contains 953 word pairs compiled from 9 different sources. It contains words and their contexts taken from real text corpora including extra examples when the words are ambiguous. The dataset is annotated by 5 independent annotators. The average Spearman correlation coefficient of the annotation agreement is $r = 0.81$. We provide reference evaluation experiments with several methods for computing semantic similarity and relatedness.
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2017-09-01
URL	https://www.aclweb.org/anthology/R17-1053/
PDF	https://doi.org/10.26615/978-954-452-049-6_053
PWC	https://paperswithcode.com/paper/czech-dataset-for-semantic-similarity-and
Repo
Framework
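
The agreement figure in the abstract is an average Spearman correlation between annotators. A minimal sketch of one way to compute such a figure is given below, assuming it is the mean over all annotator pairs (the abstract does not say how the average is taken). The scores and function names are made up for illustration.

```python
from itertools import combinations
import math

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

def mean_pairwise_spearman(annotations):
    """Mean Spearman correlation over every pair of annotators."""
    pairs = list(combinations(annotations, 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)

# Hypothetical scores from three annotators for four word pairs.
annotations = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
agreement = mean_pairwise_spearman(annotations)
print(round(agreement, 3))
```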

Co-reference Resolution in Tamil Text


Title	Co-reference Resolution in Tamil Text
Authors	Vijay Sundar Ram, Sobha Lalitha Devi
Abstract
Tasks
Published	2017-12-01
URL	https://www.aclweb.org/anthology/W17-7548/
PDF	https://www.aclweb.org/anthology/W17-7548
PWC	https://paperswithcode.com/paper/co-reference-resolution-in-tamil-text
Repo
Framework

An Empirical Bayes Approach to Optimizing Machine Learning Algorithms


Title	An Empirical Bayes Approach to Optimizing Machine Learning Algorithms
Authors	James McInerney
Abstract	There is rapidly growing interest in using Bayesian optimization to tune model and inference hyperparameters for machine learning algorithms that take a long time to run. For example, Spearmint is a popular software package for selecting the optimal number of layers and learning rate in neural networks. But given that there is uncertainty about which hyperparameters give the best predictive performance, and given that fitting a model for each choice of hyperparameters is costly, it is arguably wasteful to “throw away” all but the best result, as per Bayesian optimization. A related issue is the danger of overfitting the validation data when optimizing many hyperparameters. In this paper, we consider an alternative approach that uses more samples from the hyperparameter selection procedure to average over the uncertainty in model hyperparameters. The resulting approach, empirical Bayes for hyperparameter averaging (EB-Hyp) predicts held-out data better than Bayesian optimization in two experiments on latent Dirichlet allocation and deep latent Gaussian models. EB-Hyp suggests a simpler approach to evaluating and deploying machine learning algorithms that does not require a separate validation data set and hyperparameter selection procedure.
Tasks
Published	2017-12-01
URL	http://papers.nips.cc/paper/6864-an-empirical-bayes-approach-to-optimizing-machine-learning-algorithms
PDF	http://papers.nips.cc/paper/6864-an-empirical-bayes-approach-to-optimizing-machine-learning-algorithms.pdf
PWC	https://paperswithcode.com/paper/an-empirical-bayes-approach-to-optimizing
Repo
Framework
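
The contrast the abstract draws, keeping only the best hyperparameter setting versus averaging over several, can be illustrated with a small sketch. This is not the EB-Hyp algorithm itself: weighting each setting by a softmax over its log score is an assumption made here for illustration, and all numbers and function names are hypothetical.

```python
import math

def softmax(log_scores):
    """Turn log scores into normalized weights (max subtracted for stability)."""
    top = max(log_scores)
    exps = [math.exp(s - top) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_best(predictions, log_scores):
    """Selection: keep only the prediction of the top-scoring setting."""
    best = max(range(len(log_scores)), key=lambda i: log_scores[i])
    return predictions[best]

def average_over_settings(predictions, log_scores):
    """Averaging: weight every setting's prediction by its normalized score."""
    weights = softmax(log_scores)
    return sum(w * p for w, p in zip(weights, predictions))

# Hypothetical predictions for one held-out point under three hyperparameter
# settings, with a hypothetical log score for each setting.
predictions = [0.2, 0.6, 0.7]
log_scores = [math.log(1.0), math.log(2.0), math.log(2.0)]

selected = select_best(predictions, log_scores)
averaged = average_over_settings(predictions, log_scores)
print(selected, round(averaged, 2))
```

Selection discards the two settings it does not pick, even though one of them scores as well as the winner; averaging lets every evaluated setting contribute in proportion to its weight.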

Out-of-domain FrameNet Semantic Role Labeling


Title	Out-of-domain FrameNet Semantic Role Labeling
Authors	Silvana Hartmann, Ilia Kuznetsov, Teresa Martin, Iryna Gurevych
Abstract	Domain dependence of NLP systems is one of the major obstacles to their application in large-scale text analysis, also restricting the applicability of FrameNet semantic role labeling (SRL) systems. Yet, current FrameNet SRL systems are still only evaluated on a single in-domain test set. For the first time, we study the domain dependence of FrameNet SRL on a wide range of benchmark sets. We create a novel test set for FrameNet SRL based on user-generated web text and find that the major bottleneck for out-of-domain FrameNet SRL is the frame identification step. To address this problem, we develop a simple, yet efficient system based on distributed word representations. Our system closely approaches the state-of-the-art in-domain while outperforming the best available frame identification system out-of-domain. We publish our system and test data for research purposes.
Tasks	Semantic Role Labeling
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-1045/
PDF	https://www.aclweb.org/anthology/E17-1045
PWC	https://paperswithcode.com/paper/out-of-domain-framenet-semantic-role-labeling
Repo
Framework

Vectors for Counterspeech on Twitter


Title	Vectors for Counterspeech on Twitter
Authors	Lucas Wright, Derek Ruths, Kelly P Dillon, Haji Mohammad Saleem, Susan Benesch
Abstract	A study of conversations on Twitter found that some arguments between strangers led to favorable change in discourse and even in attitudes. The authors propose that such exchanges can be usefully distinguished according to whether individuals or groups take part on each side, since the opportunity for a constructive exchange of views seems to vary accordingly.
Tasks
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-3009/
PDF	https://www.aclweb.org/anthology/W17-3009
PWC	https://paperswithcode.com/paper/vectors-for-counterspeech-on-twitter
Repo
Framework

Parsing for Grammatical Relations via Graph Merging


Title	Parsing for Grammatical Relations via Graph Merging
Authors	Weiwei Sun, Yantao Du, Xiaojun Wan
Abstract	This paper is concerned with building deep grammatical relation (GR) analysis using a data-driven approach. To deal with this problem, we propose graph merging, a new perspective, for building flexible dependency graphs: Constructing complex graphs via constructing simple subgraphs. We discuss two key problems in this perspective: (1) how to decompose a complex graph into simple subgraphs, and (2) how to combine subgraphs into a coherent complex graph. Experiments demonstrate the effectiveness of graph merging. Our parser reaches state-of-the-art performance and is significantly better than two transition-based parsers.
Tasks
Published	2017-08-01
URL	https://www.aclweb.org/anthology/K17-1005/
PDF	https://www.aclweb.org/anthology/K17-1005
PWC	https://paperswithcode.com/paper/parsing-for-grammatical-relations-via-graph
Repo
Framework

Rephrasing Profanity in Chinese Text


Title	Rephrasing Profanity in Chinese Text
Authors	Hui-Po Su, Zhen-Jie Huang, Hao-Tsung Chang, Chuan-Jie Lin
Abstract	This paper proposes a system that can detect and rephrase profanity in Chinese text. Rather than just masking detected profanity, we want to revise the input sentence by using inoffensive words while keeping their original meanings. 29 of such rephrasing rules were invented after observing sentences on real-world social websites. The overall accuracy of the proposed system is 85.56%.
Tasks
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-3003/
PDF	https://www.aclweb.org/anthology/W17-3003
PWC	https://paperswithcode.com/paper/rephrasing-profanity-in-chinese-text
Repo
Framework

PurdueNLP at SemEval-2017 Task 1: Predicting Semantic Textual Similarity with Paraphrase and Event Embeddings


Title	PurdueNLP at SemEval-2017 Task 1: Predicting Semantic Textual Similarity with Paraphrase and Event Embeddings
Authors	I-Ta Lee, Mahak Goindani, Chang Li, Di Jin, Kristen Marie Johnson, Xiao Zhang, Maria Leonor Pacheco, Dan Goldwasser
Abstract	This paper describes our proposed solution for SemEval 2017 Task 1: Semantic Textual Similarity (Daniel Cer and Specia, 2017). The task aims at measuring the degree of equivalence between sentences given in English. Performance is evaluated by computing Pearson Correlation scores between the predicted scores and human judgements. Our proposed system consists of two subsystems and one regression model for predicting STS scores. The two subsystems are designed to learn Paraphrase and Event Embeddings that can take the consideration of paraphrasing characteristics and sentence structures into our system. The regression model associates these embeddings to make the final predictions. The experimental result shows that our system acquires 0.8 of Pearson Correlation Scores in this task.
Tasks	Question Answering, Semantic Textual Similarity, Word Embeddings
Published	2017-08-01
URL	https://www.aclweb.org/anthology/S17-2029/
PDF	https://www.aclweb.org/anthology/S17-2029
PWC	https://paperswithcode.com/paper/purduenlp-at-semeval-2017-task-1-predicting
Repo
Framework

The Agreement Measure γcat a Complement to γ Focused on Categorization of a Continuum


Title	The Agreement Measure γcat a Complement to γ Focused on Categorization of a Continuum
Authors	Yann Mathet
Abstract	Agreement on unitizing, where several annotators freely put units of various sizes and categories on a continuum, is difficult to assess because of the simultaneous discrepancies in positioning and categorizing. The recent agreement measure γ offers an overall solution that simultaneously takes into account positions and categories. In this article, I propose the additional coefficient γcat, which complements γ by assessing the agreement on categorization of a continuum, putting aside positional discrepancies. When applied to pure categorization (with predefined units), γcat behaves the same way as the famous dedicated Krippendorff's α, even with missing values, which proves its consistency. A variation of γcat is also proposed that provides an in-depth assessment of categorizing for each individual category. The entire family of γ coefficients is implemented in free software.
Tasks
Published	2017-09-01
URL	https://www.aclweb.org/anthology/J17-3006/
PDF	https://www.aclweb.org/anthology/J17-3006
PWC	https://paperswithcode.com/paper/the-agreement-measure-i3cat-a-complement-to
Repo
Framework


Annotating Italian Social Media Texts in Universal Dependencies


Title	Annotating Italian Social Media Texts in Universal Dependencies
Authors	Manuela Sanguinetti, Cristina Bosco, Alessandro Mazzei, Alberto Lavelli, Fabio Tamburini
Abstract
Tasks	Opinion Mining, Sentiment Analysis
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-6526/
PDF	https://www.aclweb.org/anthology/W17-6526
PWC	https://paperswithcode.com/paper/annotating-italian-social-media-texts-in
Repo
Framework

OMAM at SemEval-2017 Task 4: Evaluation of English State-of-the-Art Sentiment Analysis Models for Arabic and a New Topic-based Model


Title	OMAM at SemEval-2017 Task 4: Evaluation of English State-of-the-Art Sentiment Analysis Models for Arabic and a New Topic-based Model
Authors	Ramy Baly, Gilbert Badaro, Ali Hamdi, Rawan Moukalled, Rita Aoun, Georges El-Khoury, Ahmad Al Sallab, Hazem Hajj, Nizar Habash, Khaled Shaban, Wassim El-Hajj
Abstract	While sentiment analysis in English has achieved significant progress, it remains a challenging task in Arabic given the rich morphology of the language. It becomes more challenging when applied to Twitter data that comes with additional sources of noise including dialects, misspellings, grammatical mistakes, code switching and the use of non-textual objects to express sentiments. This paper describes the “OMAM” systems that we developed as part of SemEval-2017 task 4. We evaluate English state-of-the-art methods on Arabic tweets for subtask A. As for the remaining subtasks, we introduce a topic-based approach that accounts for topic specificities by predicting topics or domains of upcoming tweets, and then using this information to predict their sentiment. Results indicate that applying the English state-of-the-art method to Arabic has achieved solid results without significant enhancements. Furthermore, the topic-based method ranked 1st in subtasks C and E, and 2nd in subtask D.
Tasks	Opinion Mining, Sentiment Analysis
Published	2017-08-01
URL	https://www.aclweb.org/anthology/S17-2099/
PDF	https://www.aclweb.org/anthology/S17-2099
PWC	https://paperswithcode.com/paper/omam-at-semeval-2017-task-4-evaluation-of
Repo
Framework

OMAM at SemEval-2017 Task 4: English Sentiment Analysis with Conditional Random Fields


Title	OMAM at SemEval-2017 Task 4: English Sentiment Analysis with Conditional Random Fields
Authors	Chukwuyem Onyibe, Nizar Habash
Abstract	We describe a supervised system that uses optimized Conditional Random Fields and lexical features to predict the sentiment of a tweet. The system was submitted to the English version of all subtasks in SemEval-2017 Task 4.
Tasks	Opinion Mining, Sentiment Analysis, Stance Detection
Published	2017-08-01
URL	https://www.aclweb.org/anthology/S17-2111/
PDF	https://www.aclweb.org/anthology/S17-2111
PWC	https://paperswithcode.com/paper/omam-at-semeval-2017-task-4-english-sentiment
Repo
Framework