April 2, 2020

Paper Group ANR 115

Text Classification with Lexicon from PreAttention Mechanism. Hierarchical Transformer Network for Utterance-level Emotion Recognition. Explicit agreement extremes for a $2\times2$ table with given marginals. The Utility of General Domain Transfer Learning for Medical Language Tasks. Learning Not to Learn in the Presence of Noisy Labels. Comparison …

Text Classification with Lexicon from PreAttention Mechanism

Title Text Classification with Lexicon from PreAttention Mechanism
Authors QingBiao Li, ChunHua Wu, KangFeng Zheng
Abstract A comprehensive, high-quality lexicon plays a crucial role in traditional text classification approaches and improves the use of linguistic knowledge. Although lexicons are helpful for the task, they have received little attention in recent neural network models. First, obtaining a high-quality lexicon is not easy: effective automated lexicon extraction methods are lacking, and most lexicons are hand-crafted, which is very inefficient for big data. Moreover, there is no effective way to use a lexicon in a neural network. To address these limitations, we propose a Pre-Attention mechanism for text classification, which can learn the attention of different words according to their effects on the classification task. The words, weighted by their attention values, can form a domain lexicon. Experiments on three benchmark text classification tasks show that our models achieve competitive results compared with state-of-the-art methods: 90.5% accuracy on the Stanford Large Movie Review dataset, 82.3% on the Subjectivity dataset, and 93.7% on Movie Reviews. Compared with the same text classification models without the Pre-Attention mechanism, those with it improve accuracy by 0.9%-2.4%, which demonstrates the validity of the Pre-Attention mechanism. In addition, the Pre-Attention mechanism performs well when followed by different types of neural networks (e.g., convolutional neural networks and Long Short-Term Memory networks). For the same dataset, when the Pre-Attention mechanism is followed by different neural networks, the words with high attention values largely coincide, which demonstrates the versatility and portability of the mechanism. We can thus obtain stable lexicons from attention values, which is an inspiring method of information extraction.
Tasks Text Classification
Published 2020-02-18
URL https://arxiv.org/abs/2002.07591v1
PDF https://arxiv.org/pdf/2002.07591v1.pdf
PWC https://paperswithcode.com/paper/text-classification-with-lexicon-from
Repo
Framework
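
The Repo field above is empty, so as a rough illustration of the idea — score each word before encoding, then read a lexicon off the learned scores — here is a minimal PyTorch sketch. The module name `PreAttention`, the sigmoid gating, and the lexicon-extraction step are our assumptions, not the authors' exact design.

```python
# Hypothetical sketch of a Pre-Attention layer (not the authors' code):
# score each word embedding, scale it, then feed it to any encoder (LSTM/CNN).
import torch
import torch.nn as nn

class PreAttention(nn.Module):
    def __init__(self, embed_dim):
        super().__init__()
        self.score = nn.Linear(embed_dim, 1)  # one scalar score per token

    def forward(self, embedded):                     # (batch, seq_len, embed_dim)
        attn = torch.sigmoid(self.score(embedded))   # (batch, seq_len, 1)
        return embedded * attn, attn.squeeze(-1)

class Classifier(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden, n_classes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.pre_attn = PreAttention(embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, tokens):
        weighted, attn = self.pre_attn(self.embed(tokens))
        _, (h, _) = self.encoder(weighted)
        return self.out(h[-1]), attn

# A lexicon could then be read off by averaging each word's attention value
# over the corpus and keeping the highest-scoring words.
```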

Hierarchical Transformer Network for Utterance-level Emotion Recognition

Title Hierarchical Transformer Network for Utterance-level Emotion Recognition
Authors QingBiao Li, ChunHua Wu, KangFeng Zheng, Zhe Wang
Abstract While there have been significant advances in detecting emotions in text, many problems remain to be solved in the field of utterance-level emotion recognition (ULER). In this paper, we address several challenges in ULER in dialog systems. (1) The same utterance can convey different emotions in different contexts or from different speakers. (2) Long-range contextual information is hard to capture effectively. (3) Unlike the traditional text classification problem, this task is supported by only a limited number of datasets, most of which contain inadequate conversations or speech. To address these problems, we propose a hierarchical transformer framework (apart from descriptions of other studies, "transformer" in this paper usually refers to the encoder part of the transformer) with a lower-level transformer to model the word-level input and an upper-level transformer to capture the context of utterance-level embeddings. We use a pretrained language model, bidirectional encoder representations from transformers (BERT), as the lower-level transformer, which is equivalent to introducing external data into the model and alleviates the data-shortage problem to some extent. In addition, we add speaker embeddings to the model for the first time, which enables our model to capture the interaction between speakers. Experiments on three dialog emotion datasets, Friends, EmotionPush, and EmoryNLP, demonstrate that our proposed hierarchical transformer network models achieve 1.98%, 2.83%, and 3.94% improvements in macro-F1, respectively, over the state-of-the-art methods on each dataset.
Tasks Emotion Recognition, Language Modelling, Text Classification
Published 2020-02-18
URL https://arxiv.org/abs/2002.07551v1
PDF https://arxiv.org/pdf/2002.07551v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-transformer-network-for
Repo
Framework
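
A minimal sketch of the two-level architecture the abstract describes, assuming BERT's pooled output as the utterance embedding and additive speaker embeddings; layer sizes and pooling choices are placeholders, not the paper's configuration.

```python
# Rough sketch of a hierarchical (two-level) transformer for dialog emotion
# recognition; how speaker embeddings are injected is our assumption.
import torch
import torch.nn as nn
from transformers import BertModel

class HierarchicalEmotionModel(nn.Module):
    def __init__(self, n_speakers, n_emotions, dim=768):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")  # lower level
        self.speaker_embed = nn.Embedding(n_speakers, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.context = nn.TransformerEncoder(layer, num_layers=2)   # upper level
        self.out = nn.Linear(dim, n_emotions)

    def forward(self, input_ids, attention_mask, speaker_ids):
        # input_ids: (n_utterances, seq_len) for one dialog
        utt = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).pooler_output
        utt = utt + self.speaker_embed(speaker_ids)  # inject speaker identity
        ctx = self.context(utt.unsqueeze(0))         # attend across utterances
        return self.out(ctx.squeeze(0))              # one logit row per utterance
```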

Explicit agreement extremes for a $2\times2$ table with given marginals

Title Explicit agreement extremes for a $2\times2$ table with given marginals
Authors José E. Chacón
Abstract The problem of maximizing (or minimizing) the agreement between clusterings, subject to given marginals, can be posed formally under a common framework for several agreement measures. Until now, its solution could be found only through numerical algorithms. Here, an explicit solution is shown for the case where the two clusterings have two clusters each (a brute-force numerical check is sketched after this entry).
Tasks
Published 2020-01-21
URL https://arxiv.org/abs/2001.07415v1
PDF https://arxiv.org/pdf/2001.07415v1.pdf
PWC https://paperswithcode.com/paper/explicit-agreement-extremes-for-a-2times2
Repo
Framework
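
The paper's closed-form solution is not reproduced here, but the problem is easy to illustrate: with row sums (r1, r2) and column sums (c1, c2) fixed, a single cell n11 determines the whole 2x2 table and ranges over max(0, r1+c1-n) <= n11 <= min(r1, c1), so the agreement extremes can be recovered by a brute-force scan. Cohen's kappa below is just one example agreement measure.

```python
# Numerical check of agreement extremes for a 2x2 table with fixed marginals.
# The cell n11 determines the table; we scan its feasible range.

def kappa(n11, r1, c1, n):
    n12, n21 = r1 - n11, c1 - n11
    n22 = n - n11 - n12 - n21
    po = (n11 + n22) / n                          # observed agreement
    pe = (r1 * c1 + (n - r1) * (n - c1)) / n**2   # chance agreement
    return (po - pe) / (1 - pe)

def kappa_extremes(r1, c1, n):
    lo, hi = max(0, r1 + c1 - n), min(r1, c1)     # feasible range of n11
    vals = [kappa(n11, r1, c1, n) for n11 in range(lo, hi + 1)]
    return min(vals), max(vals)

print(kappa_extremes(r1=30, c1=40, n=100))
```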

The Utility of General Domain Transfer Learning for Medical Language Tasks

Title The Utility of General Domain Transfer Learning for Medical Language Tasks
Authors Daniel Ranti, Katie Hanss, Shan Zhao, Varun Arvind, Joseph Titano, Anthony Costa, Eric Oermann
Abstract The purpose of this study is to analyze the efficacy of transfer learning techniques and transformer-based models as applied to medical natural language processing (NLP) tasks, specifically radiological text classification. We used 1,977 labeled head CT reports, from a corpus of 96,303 total reports, to evaluate the efficacy of pretraining with general domain corpora and with a combined general and medical domain corpus, using a bidirectional representations from transformers (BERT) model for radiological text classification. Model performance was benchmarked against a logistic regression using bag-of-words vectorization and a long short-term memory (LSTM) multi-label multi-class classification model, and compared to the published literature in medical text classification. The BERT models using either set of pretrained checkpoints outperformed the logistic regression model, achieving sample-weighted average F1-scores of 0.87 for both the general domain model and the combined general and biomedical-domain model. General-text transfer learning may be a viable technique for producing state-of-the-art results on medical NLP tasks over radiological corpora, outperforming other deep models such as LSTMs. The efficacy of pretraining and transformer-based models could help facilitate the creation of groundbreaking NLP models in the uniquely challenging data environment of medical text.
Tasks Text Classification, Transfer Learning
Published 2020-02-16
URL https://arxiv.org/abs/2002.06670v1
PDF https://arxiv.org/pdf/2002.06670v1.pdf
PWC https://paperswithcode.com/paper/the-utility-of-general-domain-transfer
Repo
Framework
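
As a hedged sketch of the two ends of the comparison, the snippet below pairs a bag-of-words logistic regression baseline with a Hugging Face BERT fine-tuning skeleton; checkpoint names, label count, and training settings are placeholders, not the study's setup.

```python
# Baseline: bag-of-words + logistic regression.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

baseline = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
# baseline.fit(train_texts, train_labels)

# Transfer learning: fine-tune a pretrained BERT checkpoint.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=4)  # placeholder label count
args = TrainingArguments(output_dir="ct-report-clf", num_train_epochs=3,
                         per_device_train_batch_size=16)
# trainer = Trainer(model=model, args=args, train_dataset=...)  # tokenized reports
# trainer.train()
```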

Learning Not to Learn in the Presence of Noisy Labels

Title Learning Not to Learn in the Presence of Noisy Labels
Authors Liu Ziyin, Blair Chen, Ru Wang, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency, Masahito Ueda
Abstract Learning in the presence of label noise is a challenging yet important task: it is crucial to design models that are robust to mislabeled datasets. In this paper, we discover that a new class of loss functions called the gambler's loss provides strong robustness to label noise across various levels of corruption. We show that training with this loss function encourages the model to "abstain" from learning on data points with noisy labels, resulting in a simple and effective method to improve robustness and generalization. In addition, we propose two practical extensions of the method: 1) an analytical early stopping criterion to approximately stop training before the memorization of noisy labels, and 2) a heuristic for setting hyperparameters that does not require knowledge of the noise corruption rate. We demonstrate the effectiveness of our method by achieving strong results across three image and text classification tasks as compared to existing baselines.
Tasks Text Classification
Published 2020-02-16
URL https://arxiv.org/abs/2002.06541v1
PDF https://arxiv.org/pdf/2002.06541v1.pdf
PWC https://paperswithcode.com/paper/learning-not-to-learn-in-the-presence-of
Repo
Framework
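
One common formulation of the gambler's loss (from the related "Deep Gamblers" line of work) adds an abstention output and scores a bet on the true class, with payoff o, against the option to abstain; whether this matches the paper's exact variant is an assumption on our part.

```python
# Hedged sketch of a gambler's loss: the model outputs K class logits plus one
# abstention logit, and the loss is -log(o * p_true + p_abstain), o in (1, K].
import torch
import torch.nn.functional as F

def gamblers_loss(logits, targets, payoff=2.5):
    # logits: (batch, K + 1); the last column is the abstention "bet"
    probs = F.softmax(logits, dim=1)
    p_true = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    p_abstain = probs[:, -1]
    return -torch.log(payoff * p_true + p_abstain + 1e-8).mean()

logits = torch.randn(8, 11, requires_grad=True)  # 10 classes + abstention
targets = torch.randint(0, 10, (8,))
loss = gamblers_loss(logits, targets)
loss.backward()
```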

Comparison of Turkish Word Representations Trained on Different Morphological Forms

Title Comparison of Turkish Word Representations Trained on Different Morphological Forms
Authors Gökhan Güler, A. Cüneyd Tantuğ
Abstract The increased popularity of different text representations has brought many improvements to Natural Language Processing (NLP) tasks. Without the need for supervised data, embeddings trained on large corpora provide meaningful relations that can be used in different NLP tasks. Even though training these vectors is relatively easy with recent methods, the information gained from the data heavily depends on the structure of the corpus language. Since the most popularly researched languages have similar morphological structures, problems arising for morphologically rich languages are largely disregarded in such studies; for these languages, context-free word vectors ignore morphological structure. In this study, we prepared texts in morphologically different forms in a morphologically rich language, Turkish, and compared the results on different intrinsic and extrinsic tasks. To see the effect of morphological structure, we trained word2vec models on texts in which lemmas and suffixes are treated differently. We also trained the subword model fastText and compared the embeddings on word analogy, text classification, sentiment analysis, and language modeling tasks.
Tasks Language Modelling, Text Classification
Published 2020-02-13
URL https://arxiv.org/abs/2002.05417v1
PDF https://arxiv.org/pdf/2002.05417v1.pdf
PWC https://paperswithcode.com/paper/comparison-of-turkish-word-representations
Repo
Framework
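
A minimal gensim sketch of the experimental setup: the same corpus in surface and lemma-separated form for word2vec, plus subword-aware fastText. File names, the example word, and hyperparameters are placeholders.

```python
# Sketch of the comparison: word2vec on two morphological preprocessings of
# the same corpus, plus fastText, which uses character n-grams (subwords).
from gensim.models import Word2Vec, FastText

def load_sentences(path):
    with open(path, encoding="utf-8") as f:
        return [line.split() for line in f]

surface = load_sentences("corpus_surface.txt")  # full inflected forms
lemmas = load_sentences("corpus_lemmas.txt")    # lemmas with suffixes split off

w2v_surface = Word2Vec(surface, vector_size=300, window=5, min_count=5)
w2v_lemmas = Word2Vec(lemmas, vector_size=300, window=5, min_count=5)
ft = FastText(surface, vector_size=300, window=5, min_count=5)

# Intrinsic check: nearest neighbours of an inflected form under each model
# (assuming the form occurs in the word2vec vocabulary; fastText also handles
# out-of-vocabulary forms via subwords).
print(w2v_surface.wv.most_similar("evlerimizde", topn=5))
print(ft.wv.most_similar("evlerimizde", topn=5))
```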

Localized Flood Detection With Minimal Labeled Social Media Data Using Transfer Learning

Title Localized Flood Detection With Minimal Labeled Social Media Data Using Transfer Learning
Authors Neha Singh, Nirmalya Roy, Aryya Gangopadhyay
Abstract Social media generates an enormous amount of data on a daily basis, but it is very challenging to use this data effectively without annotating or labeling it for the target application. We investigate the problem of localized flood detection using social sensing (Twitter) in order to provide an efficient, reliable, and accurate flood text classification model with minimal labeled data. This study is important because it can immensely help in providing flood-related updates and notifications to city officials for emergency decision making, rescue operations, and early warnings. We propose to perform the text classification using inductive transfer learning, i.e., the pre-trained language model ULMFiT, fine-tuned to effectively classify flood-related feeds in any new location. Finally, we show that using very little new labeled data in the target domain, we can successfully build an efficient and high-performing model for flood detection and analysis with human-generated facts and observations from Twitter.
Tasks Decision Making, Language Modelling, Text Classification, Transfer Learning
Published 2020-02-10
URL https://arxiv.org/abs/2003.04973v1
PDF https://arxiv.org/pdf/2003.04973v1.pdf
PWC https://paperswithcode.com/paper/localized-flood-detectionwith-minimal-labeled
Repo
Framework
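
A minimal fastai sketch of the ULMFiT recipe the abstract describes; the data columns and training schedule are placeholders, and the intermediate language-model fine-tuning stage of full ULMFiT is noted only in a comment.

```python
# Sketch: fine-tune a pretrained AWD-LSTM classifier on a small labeled set of
# flood-related tweets. For full ULMFiT you would first fine-tune the language
# model itself on unlabeled target-domain text (language_model_learner).
from fastai.text.all import (TextDataLoaders, AWD_LSTM, accuracy,
                             text_classifier_learner)
import pandas as pd

df = pd.read_csv("flood_tweets.csv")  # placeholder: columns "text", "label"
dls = TextDataLoaders.from_df(df, text_col="text", label_col="label",
                              valid_pct=0.2)
learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5,
                                metrics=accuracy)  # pretrained on WikiText-103
learn.fine_tune(4, base_lr=2e-3)
preds, _ = learn.get_preds()
```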

Short Text Classification via Knowledge powered Attention with Similarity Matrix based CNN

Title Short Text Classification via Knowledge powered Attention with Similarity Matrix based CNN
Authors Mingchen Li, Gabtone. Clinton, Yijia Miao, Feng Gao
Abstract Short text is becoming more and more popular on the web, in forms such as chat messages, SMS, and product reviews. Accurately classifying short text is an important and challenging task. A number of studies have had difficulty addressing this problem because of word ambiguity and data sparsity. To address this issue, we propose a knowledge powered attention with similarity matrix based convolutional neural network (KASM) model, which computes comprehensive information by utilizing knowledge and a deep neural network. We use a knowledge graph (KG) to enrich the semantic representation of short text; in particular, parent-entity information is introduced into our model. Meanwhile, we consider the literal-level word interaction between the short text and the label representation, and use a similarity matrix based convolutional neural network (CNN) to extract it. To measure the importance of the knowledge, we introduce an attention mechanism to select the important information. Experimental results on five standard datasets show that our model significantly outperforms state-of-the-art methods.
Tasks Text Classification
Published 2020-02-09
URL https://arxiv.org/abs/2002.03350v1
PDF https://arxiv.org/pdf/2002.03350v1.pdf
PWC https://paperswithcode.com/paper/short-text-classification-via-knowledge
Repo
Framework
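
A sketch of just the similarity-matrix CNN component of a KASM-style model, assuming a word-by-word cosine similarity map between text and label embeddings; the knowledge-graph enrichment and attention parts are omitted, and all names and shapes are ours.

```python
# Build a cosine similarity matrix between the short text and a label
# representation, then convolve over it and pool to class logits.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimilarityMatrixCNN(nn.Module):
    def __init__(self, n_filters=32, n_classes=5):
        super().__init__()
        self.conv = nn.Conv2d(1, n_filters, kernel_size=3, padding=1)
        self.out = nn.Linear(n_filters, n_classes)

    def forward(self, text_emb, label_emb):
        # text_emb: (batch, text_len, dim); label_emb: (batch, label_len, dim)
        sim = torch.einsum("btd,bld->btl",
                           F.normalize(text_emb, dim=-1),
                           F.normalize(label_emb, dim=-1))  # cosine matrix
        h = F.relu(self.conv(sim.unsqueeze(1)))  # (batch, filters, t, l)
        h = h.amax(dim=(2, 3))                   # global max pool
        return self.out(h)
```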

Predicting Goal-directed Attention Control Using Inverse-Reinforcement Learning

Title Predicting Goal-directed Attention Control Using Inverse-Reinforcement Learning
Authors Gregory J. Zelinsky, Yupei Chen, Seoyoung Ahn, Hossein Adeli, Zhibo Yang, Lihan Huang, Dimitrios Samaras, Minh Hoai
Abstract Understanding how goal states control behavior is a question ripe for interrogation by new methods from machine learning. These methods require large and labeled datasets to train models. To annotate a large-scale image dataset with observed search fixations, we collected 16,184 fixations from people searching for either microwaves or clocks in a dataset of 4,366 images (MS-COCO). We then used this behaviorally annotated dataset and the machine learning method of Inverse-Reinforcement Learning (IRL) to learn target-specific reward functions and policies for these two search goals. Finally, we used these learned policies to predict the fixations of 60 new behavioral searchers (clock = 30, microwave = 30) on a disjoint test dataset of kitchen scenes depicting both a microwave and a clock (thus controlling for differences in low-level image contrast). We found that the IRL model predicted behavioral search efficiency and fixation-density maps under multiple metrics. Moreover, reward maps from the IRL model revealed target-specific patterns that suggest not just attention guidance by target features, but also guidance by scene context (e.g., fixations along walls in the search for clocks). Using machine learning and the psychologically meaningful principle of reward, it is possible to learn the visual features used in goal-directed attention control.
Tasks
Published 2020-01-31
URL https://arxiv.org/abs/2001.11921v1
PDF https://arxiv.org/pdf/2001.11921v1.pdf
PWC https://paperswithcode.com/paper/predicting-goal-directed-attention-control
Repo
Framework
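
The IRL model itself is beyond a short sketch, but the evaluation idea in the abstract — compare predicted and observed fixation-density maps — is simple to illustrate. Pearson correlation stands in for the paper's multiple metrics; map size and smoothing are assumptions.

```python
# Turn fixations into a smoothed fixation-density map, then score a predicted
# map against an observed one.
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(fixations, shape=(240, 320), sigma=15):
    m = np.zeros(shape)
    for y, x in fixations:          # fixation coordinates in pixels
        m[int(y), int(x)] += 1
    m = gaussian_filter(m, sigma)   # smooth point fixations into a density
    return m / m.sum()

observed = density_map([(120, 160), (60, 200), (118, 150)])
predicted = density_map([(115, 158), (70, 210)])
r = np.corrcoef(observed.ravel(), predicted.ravel())[0, 1]
print(f"map correlation: {r:.3f}")
```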

Statistical Tests and Confidential Intervals as Thresholds for Quantum Neural Networks

Title Statistical Tests and Confidential Intervals as Thresholds for Quantum Neural Networks
Authors Do Ngoc Diep
Abstract Some basic quantum neural networks were analyzed and constructed in the recent work of the author \cite{dndiep3}, published in the International Journal of Theoretical Physics (2020); in particular, the Least Square Problem (LSP) and the Linear Regression Problem (LRP) were discussed. In this second paper we continue to analyze and construct the least square quantum neural network (LS-QNN), the polynomial interpolation quantum neural network (PI-QNN), the polynomial regression quantum neural network (PR-QNN), and the chi-squared quantum neural network ($\chi^2$-QNN). We use the corresponding solutions or tests as thresholds for the corresponding training rules.
Tasks
Published 2020-01-30
URL https://arxiv.org/abs/2001.11844v1
PDF https://arxiv.org/pdf/2001.11844v1.pdf
PWC https://paperswithcode.com/paper/statistical-tests-and-confidential-intervals
Repo
Framework
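
As a purely classical illustration (no quantum circuit here) of using a statistical test as a threshold, the sketch below accepts or rejects a fit by comparing a chi-squared goodness-of-fit statistic to the 5% critical value; this is our reading of "tests as thresholds", not the paper's construction.

```python
# Stop a fitting loop once the chi-squared statistic falls below the critical
# value at the 5% significance level.
import numpy as np
from scipy.stats import chi2

observed = np.array([18.0, 22.0, 25.0, 35.0])
expected = np.array([20.0, 20.0, 30.0, 30.0])  # model predictions (placeholder)
dof = len(observed) - 1
threshold = chi2.ppf(0.95, dof)                # critical value

stat = np.sum((observed - expected) ** 2 / expected)
if stat < threshold:
    print(f"stat {stat:.2f} < {threshold:.2f}: fit accepted, stop training")
else:
    print(f"stat {stat:.2f} >= {threshold:.2f}: keep training")
```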

Description Based Text Classification with Reinforcement Learning

Title Description Based Text Classification with Reinforcement Learning
Authors Duo Chai, Wei Wu, Qinghong Han, Fei Wu, Jiwei Li
Abstract The task of text classification is usually divided into two stages: text feature extraction and classification. In this standard formalization, categories are merely represented as indexes in the label vocabulary, and the model lacks explicit instructions on what to classify. Inspired by the current trend of formalizing NLP problems as question answering tasks, we propose a new framework for text classification in which each category label is associated with a category description. Descriptions are generated by hand-crafted templates or by abstractive/extractive models trained with reinforcement learning. The concatenation of the description and the text is fed to the classifier to decide whether or not the current label should be assigned to the text. The proposed strategy forces the model to attend to the most salient text with respect to the label, which can be regarded as a hard version of attention, leading to better performance. We observe significant performance boosts over strong baselines on a wide range of text classification tasks, including single-label classification, multi-label classification, and multi-aspect sentiment analysis.
Tasks Multi-Label Classification, Question Answering, Sentiment Analysis, Text Classification
Published 2020-02-08
URL https://arxiv.org/abs/2002.03067v1
PDF https://arxiv.org/pdf/2002.03067v1.pdf
PWC https://paperswithcode.com/paper/description-based-text-classification-with
Repo
Framework
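
A sketch of the core scoring step the abstract describes: encode each label description together with the text and ask a binary classifier whether the label applies. The template descriptions and checkpoint are placeholders, and the RL-generated descriptions are not shown.

```python
# Pair each candidate label's description with the text; a binary head
# decides whether the label should be assigned.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # binary: label applies or not

text = "The battery dies within an hour and the screen flickers."
descriptions = {
    "positive": "The review expresses satisfaction with the product.",
    "negative": "The review expresses dissatisfaction with the product.",
}
for label, desc in descriptions.items():
    enc = tok(desc, text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        p = model(**enc).logits.softmax(-1)[0, 1].item()
    print(f"{label}: {p:.3f}")  # untrained head: scores are illustrative
```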

An efficient automated data analytics approach to large scale computational comparative linguistics

Title An efficient automated data analytics approach to large scale computational comparative linguistics
Authors Gabija Mikulyte, David Gilbert
Abstract This research project aimed to overcome the challenge of analysing human language relationships, and to facilitate the grouping of languages and the formation of genealogical relationships between them, by developing automated comparison techniques. The techniques were based on the phonetic representation of certain key words and concepts. Example word sets included the numbers 1-10 (curated), a large database of numbers 1-10 and sheep-counting numbers 1-10 (other sources), colours (curated), and basic words (curated). To enable comparison within the sets, edit distance was calculated using the Levenshtein distance metric; this metric between two strings is the minimum number of single-character edits (insertions, deletions, or substitutions), as sketched after this entry. To explore which words exhibit more or less variation, which words are better preserved, and how languages could be grouped based on linguistic distances within sets, several data analytics techniques were applied, including density evaluation, hierarchical clustering, silhouette analysis, mean and standard deviation calculations, and Bhattacharyya coefficient calculations. These techniques led to the development of a workflow, later implemented by combining Unix shell scripts, a purpose-built R package, and SWI-Prolog. This proved computationally efficient and permitted the fast exploration and analysis of large language sets.
Tasks
Published 2020-01-31
URL https://arxiv.org/abs/2001.11899v1
PDF https://arxiv.org/pdf/2001.11899v1.pdf
PWC https://paperswithcode.com/paper/an-efficient-automated-data-analytics
Repo
Framework
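
The Levenshtein metric defined in the abstract maps directly to a standard dynamic program; the project itself used R, Prolog, and shell scripts, so the Python below is purely illustrative.

```python
# Levenshtein distance: minimum number of single-character insertions,
# deletions, or substitutions turning string a into string b.
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # -> 3
```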

Reward Shaping for Reinforcement Learning with Omega-Regular Objectives

Title Reward Shaping for Reinforcement Learning with Omega-Regular Objectives
Authors E. M. Hahn, M. Perez, S. Schewe, F. Somenzi, A. Trivedi, D. Wojtczak
Abstract Recently, successful approaches have exploited good-for-MDPs automata (Büchi automata with a restricted form of nondeterminism) for model-free reinforcement learning; this class of automata subsumes good-for-games automata and the most widespread class of limit-deterministic automata. The foundation of using these Büchi automata is that, for good-for-MDPs automata, the Büchi condition can be translated to reachability. The drawback of this translation is that the rewards are, on average, reaped very late, which requires long episodes during the learning process. We devise a new reward shaping approach that overcomes this issue. We show that the resulting model is equivalent to a discounted payoff objective with a biased discount that simplifies and improves on prior work in this direction.
Tasks
Published 2020-01-16
URL https://arxiv.org/abs/2001.05977v1
PDF https://arxiv.org/pdf/2001.05977v1.pdf
PWC https://paperswithcode.com/paper/reward-shaping-for-reinforcement-learning
Repo
Framework
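
A heavily simplified tabular sketch of one way a biased discount can shape reachability-style rewards: accepting transitions pay a small reward and carry their own, stronger discount, while ordinary steps are barely discounted. This is our reading of the idea, not the paper's exact construction or proofs.

```python
# Q-learning update with a biased discount: accepting transitions (from the
# Büchi condition) are rewarded and discounted differently from other steps.
from collections import defaultdict

GAMMA, GAMMA_B, ALPHA = 0.999, 0.9, 0.1  # biased: GAMMA_B < GAMMA
Q = defaultdict(float)

def update(state, action, next_state, accepting, actions):
    # Accepting transitions pay 1 - GAMMA_B and are discounted by GAMMA_B;
    # all other transitions pay nothing and are discounted by GAMMA.
    reward, discount = ((1 - GAMMA_B), GAMMA_B) if accepting else (0.0, GAMMA)
    best_next = max(Q[(next_state, a)] for a in actions)
    target = reward + discount * best_next
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])
```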

Multi-task Learning Based Neural Bridging Reference Resolution

Title Multi-task Learning Based Neural Bridging Reference Resolution
Authors Juntao Yu, Massimo Poesio
Abstract We propose a multi-task learning-based neural model for bridging reference resolution that tackles two key challenges. The first challenge is the lack of large corpora annotated with bridging references. To address this, we use multi-task learning to support bridging reference resolution with coreference resolution, and show that substantial improvements of up to 8 percentage points can be achieved on full bridging resolution with this architecture. The second challenge is that different corpora use different definitions of bridging, meaning that hand-coded systems, or systems using special features designed for one corpus, do not work well on other corpora. Our neural model uses only a small number of corpus-independent features and can thus be applied easily to different corpora. Evaluations on very different bridging corpora (ARRAU, ISNOTES, BASHI, and SCICORP) suggest that our architecture works equally well on all of them, achieving state-of-the-art results on full bridging resolution for all corpora and outperforming the best reported results by up to 34.9 percentage points.
Tasks Coreference Resolution, Multi-Task Learning
Published 2020-03-07
URL https://arxiv.org/abs/2003.03666v1
PDF https://arxiv.org/pdf/2003.03666v1.pdf
PWC https://paperswithcode.com/paper/multi-task-learning-based-neural-bridging
Repo
Framework
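
A minimal sketch of the multi-task setup the abstract describes: span-pair representations from a shared encoder feed two scoring heads, one per task, trained on a joint loss. The encoder choice, head shapes, and loss weighting are assumptions.

```python
# Shared pair representation, separate antecedent-scoring heads for bridging
# and coreference, trained jointly.
import torch
import torch.nn as nn

class MultiTaskResolver(nn.Module):
    def __init__(self, dim=768):
        super().__init__()
        self.pair = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU())
        self.bridging_head = nn.Linear(dim, 1)  # antecedent score (bridging)
        self.coref_head = nn.Linear(dim, 1)     # antecedent score (coreference)

    def forward(self, mention, candidate):
        # mention, candidate: (batch, dim) span embeddings from a shared encoder
        h = self.pair(torch.cat([mention, candidate, mention * candidate], -1))
        return self.bridging_head(h), self.coref_head(h)

model = MultiTaskResolver()
bce = nn.BCEWithLogitsLoss()
m, c = torch.randn(4, 768), torch.randn(4, 768)
y_bridge = torch.randint(0, 2, (4, 1)).float()
y_coref = torch.randint(0, 2, (4, 1)).float()
s_bridge, s_coref = model(m, c)
loss = bce(s_bridge, y_bridge) + bce(s_coref, y_coref)  # joint objective
loss.backward()
```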

Application of Support Vector Machines for Seismogram Analysis and Differentiation

Title Application of Support Vector Machines for Seismogram Analysis and Differentiation
Authors Rohit Kumar Shrivastava
Abstract Support Vector Machines (SVMs) are a computational technique that has been used in various fields of science as a classifier with k-class classification capability, k being 2, 3, 4, etc. Seismograms of volcanic tremors often contain noise that can hinder correct interpretation. The PCAB station (located in the northern region of the island of Panarea, Italy) has been recording seismic signals from a pump installed nearby, which corrupt the useful signals from the Stromboli volcano. An SVM with k=2, optimized through grid search, has been instrumental in identifying and classifying the seismic signals coming from the pump, correctly classifying 99.7149% of patterns (determined through cross-validation). The SVM's predicted labels have been used to estimate the pump's periods of activity, allowing the corresponding seismograms to be declared redundant (not fit for processing and interpretation). However, when the same trained SVM was used to determine whether the seismogram used by Pino et al., 2011, recorded at the same PCAB station on 4th April 2003, contained the pump's signals, the SVM indicated a complete absence of such signals, thereby validating that earlier work.
Tasks
Published 2020-03-06
URL https://arxiv.org/abs/2003.04219v1
PDF https://arxiv.org/pdf/2003.04219v1.pdf
PWC https://paperswithcode.com/paper/application-of-support-vector-machines-for
Repo
Framework
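
The grid-search-tuned binary SVM described above is straightforward to sketch with scikit-learn; the feature extraction from seismogram windows (e.g., spectral features) is assumed and not shown.

```python
# Binary SVM with hyperparameters chosen by cross-validated grid search.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {"C": [0.1, 1, 10, 100],
              "gamma": [1e-3, 1e-2, 1e-1],
              "kernel": ["rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5)
# X: feature vectors per seismogram window; y: 1 = pump signal, 0 = volcanic
# search.fit(X, y)
# print(search.best_params_, search.best_score_)
```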