October 15, 2019

Paper Group NANR 129

Rule-based vs. Neural Net Approaches to Semantic Textual Similarity. MEMD: A Diversity-Promoting Learning Framework for Short-Text Conversation. The Remarkable Benefit of User-Level Aggregation for Lexical-based Population-Level Predictions. LREMap, a Song of Resources and Evaluation. Building Literary Corpora for Computational Literary Analysis - …

Rule-based vs. Neural Net Approaches to Semantic Textual Similarity

Title Rule-based vs. Neural Net Approaches to Semantic Textual Similarity
Authors Linrui Zhang, Dan Moldovan
Abstract This paper presents a neural net approach to determining Semantic Textual Similarity (STS) using attention-based bidirectional Long Short-Term Memory networks (Bi-LSTM). To date, most traditional STS systems have been rule-based, built on top of extensive linguistic features and resources. In this paper, we present an end-to-end attention-based Bi-LSTM neural network system that takes only word-level features, without expensive feature engineering or the use of external resources. By comparing its performance with that of traditional rule-based systems on the SemEval-2012 benchmark, we assess the limitations and strengths of neural net systems relative to rule-based systems for Semantic Textual Similarity.
Tasks Feature Engineering, Semantic Textual Similarity, Sentence Pair Modeling
Published 2018-08-01
URL https://www.aclweb.org/anthology/W18-3803/
PDF https://www.aclweb.org/anthology/W18-3803
PWC https://paperswithcode.com/paper/rule-based-vs-neural-net-approaches-to
Repo
Framework
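
As a concrete illustration of the architecture the abstract describes, below is a minimal PyTorch sketch (not the authors' code) of an attention-based Bi-LSTM sentence-pair scorer. The pair-feature layout [a; b; |a-b|; a*b] is a common convention for sentence-pair modeling, assumed here rather than taken from the paper.

```python
# Minimal sketch, assuming pre-tokenized index tensors as input.
import torch
import torch.nn as nn

class BiLSTMAttentionSTS(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=150):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)
        # Regressor over [a; b; |a-b|; a*b], a common pair representation.
        self.score = nn.Linear(8 * hidden_dim, 1)

    def encode(self, tokens):
        h, _ = self.lstm(self.embed(tokens))          # (B, T, 2H)
        weights = torch.softmax(self.attn(h), dim=1)  # attention over time
        return (weights * h).sum(dim=1)               # (B, 2H)

    def forward(self, sent_a, sent_b):
        a, b = self.encode(sent_a), self.encode(sent_b)
        pair = torch.cat([a, b, (a - b).abs(), a * b], dim=-1)
        return self.score(pair).squeeze(-1)           # similarity score
```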

MEMD: A Diversity-Promoting Learning Framework for Short-Text Conversation

Title MEMD: A Diversity-Promoting Learning Framework for Short-Text Conversation
Authors Meng Zou, Xihan Li, Haokun Liu, Zhihong Deng
Abstract Neural encoder-decoder models have been widely applied to conversational response generation, a research hot spot in recent years. However, conventional neural encoder-decoder models tend to generate commonplace responses like "I don't know" regardless of the input. In this paper, we analyze this problem from a new perspective: latent vectors. Based on this, we propose an easy-to-extend learning framework named MEMD (Multi-Encoder to Multi-Decoder), in which an auxiliary encoder and an auxiliary decoder are introduced to provide the necessary training guidance without resorting to extra data or complicating the network's inner structure. Experimental results demonstrate that our method effectively improves the quality of generated responses according to automatic metrics and human evaluations, yielding more diverse and smooth replies.
Tasks Conversational Response Generation, Short-Text Conversation
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1109/
PDF https://www.aclweb.org/anthology/C18-1109
PWC https://paperswithcode.com/paper/memd-a-diversity-promoting-learning-framework
Repo
Framework
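
The abstract does not spell out how the auxiliary encoder and decoder are wired, so the sketch below is a generic multi-encoder/multi-decoder training setup, not MEMD's exact design: an auxiliary decoder reconstructs the query from the shared latent vector, a common way to keep latents input-specific and discourage generic responses.

```python
# Hedged sketch; the module wiring is an assumption, not the paper's design.
import torch.nn as nn

class MultiEncoderMultiDecoder(nn.Module):
    def __init__(self, encoder, decoder, aux_decoder):
        super().__init__()
        self.encoder = encoder          # query -> latent
        self.decoder = decoder          # latent -> response (returns loss)
        self.aux_decoder = aux_decoder  # latent -> query reconstruction

    def forward(self, query, response):
        latent = self.encoder(query)
        gen_loss = self.decoder(latent, response)     # main generation loss
        recon_loss = self.aux_decoder(latent, query)  # latent regularizer
        return gen_loss + recon_loss
```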

The Remarkable Benefit of User-Level Aggregation for Lexical-based Population-Level Predictions

Title The Remarkable Benefit of User-Level Aggregation for Lexical-based Population-Level Predictions
Authors Salvatore Giorgi, Daniel Preoţiuc-Pietro, Anneke Buffone, Daniel Rieman, Lyle Ungar, H. Andrew Schwartz
Abstract Nowcasting based on social media text promises to provide unobtrusive and near real-time predictions of community-level outcomes. These outcomes typically concern people, but the data are often aggregated without regard to users in the Twitter populations of each community. This paper describes a simple yet effective method for building community-level models using Twitter language aggregated by user. Results on four different U.S. county-level tasks, spanning demographic, health, and psychological outcomes, show large and consistent improvements in prediction accuracy (e.g. from Pearson r=.73 to .82 for median income prediction, or r=.37 to .47 for life satisfaction prediction) over the standard approach of aggregating all tweets. We make our aggregated and anonymized community-level data, derived from 37 billion tweets – over 1 billion of which were mapped to counties – available for research.
Tasks
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1148/
PDF https://www.aclweb.org/anthology/D18-1148
PWC https://paperswithcode.com/paper/the-remarkable-benefit-of-user-level
Repo
Framework
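
To make the two aggregation orders concrete, here is a tiny hedged pandas sketch with invented column names and counts: the standard approach averages a lexicon feature over all tweets in a county, while the paper's approach averages within each user first, so prolific users do not dominate the county estimate.

```python
# Hedged sketch with made-up data, not the paper's pipeline: compare
# tweet-level vs. user-level aggregation of one lexicon feature.
import pandas as pd

tweets = pd.DataFrame({
    "county": ["A", "A", "A", "B"],
    "user":   ["u1", "u1", "u2", "u3"],
    "happy":  [1, 0, 1, 0],     # per-tweet count of a lexicon word
    "tokens": [5, 7, 4, 6],     # tweet length in tokens
})
tweets["freq"] = tweets["happy"] / tweets["tokens"]

# Standard approach: average over all tweets in a county.
tweet_level = tweets.groupby("county")["freq"].mean()

# User-level aggregation: average within each user first, then across
# users, so prolific users don't dominate the county estimate.
user_level = (tweets.groupby(["county", "user"])["freq"].mean()
                    .groupby(level="county").mean())
print(tweet_level, user_level, sep="\n\n")
```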

LREMap, a Song of Resources and Evaluation

Title LREMap, a Song of Resources and Evaluation
Authors Riccardo Del Gratta, Sara Goggi, Gabriella Pardelli, Nicoletta Calzolari
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1203/
PDF https://www.aclweb.org/anthology/L18-1203
PWC https://paperswithcode.com/paper/lremap-a-song-of-resources-and-evaluation
Repo
Framework

Building Literary Corpora for Computational Literary Analysis - A Prototype to Bridge the Gap between CL and DH

Title Building Literary Corpora for Computational Literary Analysis - A Prototype to Bridge the Gap between CL and DH
Authors Andrew Frank, Christine Ivanovic
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1128/
PDF https://www.aclweb.org/anthology/L18-1128
PWC https://paperswithcode.com/paper/building-literary-corpora-for-computational
Repo
Framework

Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Title Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
Authors
Abstract
Tasks
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-1000/
PDF https://www.aclweb.org/anthology/N18-1000
PWC https://paperswithcode.com/paper/proceedings-of-the-2018-conference-of-the-5
Repo
Framework

A Lexicon-Based Supervised Attention Model for Neural Sentiment Analysis

Title A Lexicon-Based Supervised Attention Model for Neural Sentiment Analysis
Authors Yicheng Zou, Tao Gui, Qi Zhang, Xuanjing Huang
Abstract Attention mechanisms have been leveraged for sentiment classification tasks because not all words have the same importance. However, most existing attention models do not take full advantage of sentiment lexicons, which provide rich sentiment information and play a critical role in sentiment analysis. To make use of this information, we propose a novel lexicon-based supervised attention model (LBSA), which allows a recurrent neural network to focus on the sentiment content, thus generating sentiment-informative representations. Compared with general attention models, our model has better interpretability and less noise. Experimental results on three large-scale sentiment classification datasets show that the proposed method outperforms previous approaches.
Tasks Sentiment Analysis
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1074/
PDF https://www.aclweb.org/anthology/C18-1074
PWC https://paperswithcode.com/paper/a-lexicon-based-supervised-attention-model
Repo
Framework
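
One simple way to realize lexicon-supervised attention, shown below as an assumption rather than the paper's exact objective, is an auxiliary loss that pulls the attention distribution toward a distribution derived from per-token lexicon intensities.

```python
# Assumed auxiliary objective, not necessarily the paper's exact loss:
# a KL term pulling attention toward lexicon-derived weights.
import torch.nn.functional as F

def lexicon_attention_loss(attn_weights, lexicon_scores, eps=1e-8):
    """attn_weights: (B, T) post-softmax attention; lexicon_scores:
    (B, T) non-negative per-token sentiment intensities."""
    target = lexicon_scores / (lexicon_scores.sum(dim=1, keepdim=True) + eps)
    # KL(attention || target) in PyTorch's convention: input is log-probs.
    return F.kl_div(attn_weights.clamp_min(eps).log(), target,
                    reduction="batchmean")

# Combined objective (lambda_attn is a tuning knob):
# total_loss = ce_loss + lambda_attn * lexicon_attention_loss(attn, lex)
```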

Cooperative Denoising for Distantly Supervised Relation Extraction

Title Cooperative Denoising for Distantly Supervised Relation Extraction
Authors Kai Lei, Daoyuan Chen, Yaliang Li, Nan Du, Min Yang, Wei Fan, Ying Shen
Abstract Distantly supervised relation extraction greatly reduces the human effort needed to extract relational facts from unstructured texts. However, it suffers from the noisy labeling problem, which can degrade its performance. Meanwhile, the useful information expressed in knowledge graphs is still underutilized by state-of-the-art methods for distantly supervised relation extraction. In light of these challenges, we propose CORD, a novel COopeRative Denoising framework, which consists of two base networks leveraging the text corpus and the knowledge graph respectively, and a cooperative module in which they learn from each other through adaptive bi-directional knowledge distillation and a dynamic ensemble over noise-varying instances. Experimental results on a real-world dataset demonstrate that the proposed method reduces noisy labels and achieves substantial improvement over state-of-the-art methods.
Tasks Denoising, Information Retrieval, Question Answering, Relation Extraction
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1036/
PDF https://www.aclweb.org/anthology/C18-1036
PWC https://paperswithcode.com/paper/cooperative-denoising-for-distantly
Repo
Framework
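
The core of cooperative denoising is two networks teaching each other. The sketch below shows only a generic bi-directional distillation term in the style of deep mutual learning; the adaptive weighting and dynamic ensemble the abstract mentions are omitted, so treat this as an assumption about the shape of the loss, not CORD itself.

```python
# Generic bi-directional distillation core (deep mutual learning style);
# CORD's adaptive weighting and dynamic ensemble are omitted.
import torch.nn.functional as F

def mutual_distillation_loss(logits_text, logits_kg, temperature=2.0):
    log_p_text = F.log_softmax(logits_text / temperature, dim=-1)
    log_p_kg = F.log_softmax(logits_kg / temperature, dim=-1)
    # Each base network learns from the other's softened predictions;
    # detach() stops gradients from flowing into the "teacher" side.
    kl_text = F.kl_div(log_p_text, log_p_kg.exp().detach(),
                       reduction="batchmean")
    kl_kg = F.kl_div(log_p_kg, log_p_text.exp().detach(),
                     reduction="batchmean")
    return kl_text + kl_kg
```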

Word Affect Intensities

Title Word Affect Intensities
Authors Saif Mohammad
Abstract
Tasks Emotion Recognition, Sentiment Analysis, Text Generation
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1027/
PDF https://www.aclweb.org/anthology/L18-1027
PWC https://paperswithcode.com/paper/word-affect-intensities-1
Repo
Framework

Constrained Interacting Submodular Groupings

Title Constrained Interacting Submodular Groupings
Authors Andrew Cotter, Mahdi Milani Fard, Seungil You, Maya Gupta, Jeff Bilmes
Abstract We introduce the problem of grouping a finite ground set into blocks, where each block is a subset of the ground set and where: (i) the blocks are individually highly valued by a submodular function (both robustly and in the average case) while satisfying block-specific matroid constraints; and (ii) block scores interact so that blocks are jointly scored highly, making the blocks mutually non-redundant. Submodular functions are good models of information and diversity; thus, the above can be seen as grouping the ground set into matroid-constrained blocks that are both intra- and inter-diverse. Potential applications include forming ensembles of classification/regression models, partitioning data for parallel processing, and summarization. In the non-robust case, we reduce the problem to non-monotone submodular maximization subject to multiple matroid constraints. In the mixed robust/average case, we offer a bi-criterion guarantee for a polynomial-time deterministic algorithm and a probabilistic guarantee for a randomized algorithm, as long as the involved submodular functions (including the inter-block interaction terms) are monotone. We close with a case study in which we use these algorithms to find high-quality, diverse ensembles of classifiers, showing good results.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2129
PDF http://proceedings.mlr.press/v80/cotter18a/cotter18a.pdf
PWC https://paperswithcode.com/paper/constrained-interacting-submodular-groupings
Repo
Framework
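
For intuition about the submodular machinery, here is a self-contained toy sketch of plain greedy maximization of a monotone submodular function under a cardinality constraint, a much simpler special case than the paper's interacting, matroid-constrained setting; the facility-location objective and similarity values are made up.

```python
# Toy sketch: plain greedy for monotone submodular maximization under a
# cardinality constraint; objective and similarities are invented.
import itertools

def greedy_submodular(ground_set, f, k):
    """Pick k elements greedily by marginal gain of the set function f."""
    selected = set()
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for e in ground_set - selected:
            gain = f(selected | {e}) - f(selected)
            if gain > best_gain:
                best, best_gain = e, gain
        selected.add(best)
    return selected

# Facility-location objective over a toy similarity matrix: monotone and
# submodular, so greedy enjoys the classic (1 - 1/e) guarantee.
sims = {(i, j): 1.0 / (1 + abs(i - j))
        for i, j in itertools.product(range(6), repeat=2)}
cover = lambda S: sum(max((sims[i, j] for j in S), default=0.0)
                      for i in range(6))
print(greedy_submodular(set(range(6)), cover, k=2))  # a spread-out pair
```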

Universal Sentence Encoder for English

Title Universal Sentence Encoder for English
Authors Daniel Cer, Yinfei Yang, Sheng-yi Kong, Nan Hua, Nicole Limtiaco, Rhomni St. John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar, Brian Strope, Ray Kurzweil
Abstract We present easy-to-use TensorFlow Hub sentence embedding models with good task-transfer performance. Model variants allow for trade-offs between accuracy and compute resources. We report the relationship between model complexity, resources, and transfer performance. Comparisons are made with baselines that use no transfer learning and with baselines that incorporate word-level transfer. Transfer learning using sentence-level embeddings is shown to outperform models without transfer learning, and often those that use only word-level transfer. We show good transfer-task performance with minimal training data and obtain encouraging results on word embedding association tests (WEAT) of model bias.
Tasks Multi-Task Learning, Sentence Embedding, Sentence Embeddings, Tokenization, Transfer Learning
Published 2018-11-01
URL https://www.aclweb.org/anthology/D18-2029/
PDF https://www.aclweb.org/anthology/D18-2029
PWC https://paperswithcode.com/paper/universal-sentence-encoder-for-english
Repo
Framework
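
The released models are on TensorFlow Hub, so a usage example is short. The sketch below uses the TF2-style hub.load API and the currently published model version, both of which postdate the paper's original hub.Module release; check the tfhub.dev page for the model you want.

```python
# Usage sketch; the TF2-style API and model version shown here postdate
# the paper's original release, so verify against tfhub.dev.
import numpy as np
import tensorflow_hub as hub

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
sentences = ["The quick brown fox.", "A fast auburn fox."]
embeddings = np.asarray(embed(sentences))  # shape (2, 512)

# Cosine similarity between the two sentence embeddings.
a, b = embeddings
print(float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b))))
```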

DropMax: Adaptive Stochastic Softmax

Title DropMax: Adaptive Stochastic Softmax
Authors Hae Beom Lee, Juho Lee, Eunho Yang, Sung Ju Hwang
Abstract We propose DropMax, a stochastic version of the softmax classifier which, at each iteration, drops non-target classes with some probability for each instance. Specifically, we overlay binary masking variables over the class output probabilities, which are learned from the input via regularized variational inference. This stochastic regularization has the effect of building an ensemble classifier out of a combinatorial number of classifiers with different decision boundaries. Moreover, learning the dropout probabilities of non-target classes for each instance allows the classifier to focus more on classification against the most confusing classes. We validate our model on multiple public classification datasets, on which it obtains improved accuracy over the regular softmax classifier and other baselines. Further analysis of the learned dropout masks shows that our model indeed selects confusing classes more often when it performs classification.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=Sy4c-3xRW
PDF https://openreview.net/pdf?id=Sy4c-3xRW
PWC https://paperswithcode.com/paper/dropmax-adaptive-stochastic-softmax
Repo
Framework
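
A minimal sketch of the core trick, assuming fixed drop probabilities rather than the paper's learned, variationally regularized ones: during training, each non-target class logit is dropped with some probability by masking it to negative infinity before the softmax.

```python
# Minimal sketch with a fixed keep probability instead of the paper's
# learned, variationally regularized per-class probabilities.
import torch
import torch.nn.functional as F

def dropmax_logits(logits, targets, keep_prob=0.5):
    """Drop each non-target class with prob 1 - keep_prob by masking its
    logit to -inf, so this iteration's softmax ignores it."""
    keep = torch.bernoulli(torch.full_like(logits, keep_prob)).bool()
    keep.scatter_(1, targets.unsqueeze(1), True)  # never drop the target
    return logits.masked_fill(~keep, float("-inf"))

# Usage during training:
# loss = F.cross_entropy(dropmax_logits(logits, y), y)
```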

Cross-corpus Native Language Identification via Statistical Embedding

Title Cross-corpus Native Language Identification via Statistical Embedding
Authors Francisco Rangel, Paolo Rosso, Julian Brooke, Alexandra Uitdenbogerd
Abstract In this paper, we approach the task of native language identification in a realistic cross-corpus scenario, where a model is trained on available data and must predict the native language on data from a different corpus. The motivation behind this study is to investigate native language identification in the Australian academic context, where a majority of students come from China, Indonesia, and Arabic-speaking nations. We propose a statistical embedding representation that yields a significant improvement over common single-layer state-of-the-art approaches when identifying Chinese, Arabic, and Indonesian in a cross-corpus scenario. The proposed approach remains competitive even when the data are scarce and imbalanced.
Tasks Language Identification, Native Language Identification
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-1605/
PDF https://www.aclweb.org/anthology/W18-1605
PWC https://paperswithcode.com/paper/cross-corpus-native-language-identification
Repo
Framework

TCAV: Relative concept importance testing with Linear Concept Activation Vectors

Title TCAV: Relative concept importance testing with Linear Concept Activation Vectors
Authors Been Kim, Justin Gilmer, Martin Wattenberg, Fernanda Viégas
Abstract Despite neural networks' high performance, their lack of interpretability has been the main bottleneck for safe use in practice. In high-stakes domains (e.g., medical diagnosis), gaining insight into the network is critical for earning trust and being adopted. One way to improve the interpretability of a neural network is to explain the importance of a particular concept (e.g., gender) in its predictions. This is useful for explaining the reasoning behind the network's predictions and for revealing any biases the network may have. This work aims to provide quantitative answers to the relative importance of concepts of interest via concept activation vectors (CAVs). In particular, this framework enables non-machine-learning experts to express concepts of interest and test hypotheses using examples (e.g., a set of pictures that illustrate the concept). We show that a CAV can be learned from a relatively small set of examples. Testing with CAVs can, for example, answer whether a particular concept (e.g., gender) is more important in predicting a given class (e.g., doctor) than another set of concepts. Interpreting with CAVs does not require any retraining or modification of the network. We show that many levels of meaningful concepts are learned (e.g., color, texture, objects, a person's occupation), and we present CAV's empirical deepdream, in which we maximize an activation using a set of example pictures. We show how various insights can be gained from relative importance testing with CAVs.
Tasks Medical Diagnosis
Published 2018-01-01
URL https://openreview.net/forum?id=S1viikbCW
PDF https://openreview.net/pdf?id=S1viikbCW
PWC https://paperswithcode.com/paper/tcav-relative-concept-importance-testing-with
Repo
Framework
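
The CAV construction itself is easy to sketch: train a linear classifier to separate concept examples from random counterexamples in a layer's activation space, take its weight vector as the concept direction, and measure the directional derivative of a class logit along it. The sketch below is a hedged rendition of that recipe, not the authors' released code.

```python
# Hedged sketch of the CAV recipe (not the authors' released code).
import numpy as np
from sklearn.linear_model import LogisticRegression

def learn_cav(concept_acts, random_acts):
    """concept_acts, random_acts: (n, d) arrays of layer activations."""
    X = np.vstack([concept_acts, random_acts])
    y = np.array([1] * len(concept_acts) + [0] * len(random_acts))
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    v = clf.coef_[0]
    return v / np.linalg.norm(v)  # unit-norm concept direction

def concept_sensitivity(grad_logit_wrt_acts, cav):
    """Directional derivative of a class logit along the CAV; positive
    values mean the concept pushes that class's logit up."""
    return grad_logit_wrt_acts @ cav
```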

Improving Neural Machine Translation by Incorporating Hierarchical Subword Features

Title Improving Neural Machine Translation by Incorporating Hierarchical Subword Features
Authors Makoto Morishita, Jun Suzuki, Masaaki Nagata
Abstract This paper focuses on subword-based Neural Machine Translation (NMT). We hypothesize that the appropriate subword units can differ across three modules (layers) of the NMT model: (1) the encoder embedding layer, (2) the decoder embedding layer, and (3) the decoder output layer. We observe that the subword units of Sennrich et al. (2016) have the property that a large vocabulary is a superset of a small vocabulary, and we modify the NMT model to enable the incorporation of several different subword units in a single embedding layer. We refer to these subword features as hierarchical subword features. To empirically investigate our hypothesis, we compare the performance of several different subword units and hierarchical subword features for both the encoder and decoder embedding layers. We confirm that incorporating hierarchical subword features in the encoder consistently improves BLEU scores on the IWSLT evaluation datasets.
Tasks Machine Translation
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1052/
PDF https://www.aclweb.org/anthology/C18-1052
PWC https://paperswithcode.com/paper/improving-neural-machine-translation-by
Repo
Framework
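
One plausible reading of "several different subword units in a single embedding layer", sketched below as an assumption rather than the paper's exact design, is to sum per-granularity embeddings of the same token positions; the preprocessing that aligns finer BPE units to coarser positions is assumed to happen upstream.

```python
# Assumed realization, not the paper's exact design: sum embeddings of
# the same positions segmented under several BPE granularities.
import torch.nn as nn

class HierarchicalSubwordEmbedding(nn.Module):
    def __init__(self, vocab_sizes, dim=512):
        super().__init__()
        # One table per BPE vocabulary size, e.g. (16000, 1000).
        self.tables = nn.ModuleList(nn.Embedding(v, dim) for v in vocab_sizes)

    def forward(self, ids_per_level):
        # ids_per_level[k]: (batch, seq) ids at granularity k; aligning
        # finer units to coarse positions happens upstream.
        return sum(t(ids) for t, ids in zip(self.tables, ids_per_level))
```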