Paper Group NANR 39
Detection of Adverse Drug Reaction Mentions in Tweets Using ELMo. Detection of Adverse Drug Reaction in Tweets Using a Combination of Heterogeneous Word Embeddings. A Simple Approach to Classify Fictional and Non-Fictional Genres. Dynamic Recursive Neural Network. Think out of the “Box”: Generically-Constrained Asynchronous Composite Optimization a …
Detection of Adverse Drug Reaction Mentions in Tweets Using ELMo
Title | Detection of Adverse Drug Reaction Mentions in Tweets Using ELMo |
Authors | Sarah Sarabadani |
Abstract | This paper describes the models used by our team in the SMM4H 2019 shared task. We submitted results for subtasks 1 and 2. For task 1, which aims to detect tweets with Adverse Drug Reaction (ADR) mentions, we used ELMo embeddings, a deep contextualized word representation able to capture both syntactic and semantic characteristics. For task 2, which focuses on the extraction of ADR mentions, the same architecture as in task 1 was first used to identify whether or not a tweet contains an ADR. Then, for tweets positively classified as mentioning an ADR, the relevant text span was identified by similarity matching against 3 different lexicon sets. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3221/ |
PDF | https://www.aclweb.org/anthology/W19-3221 |
PWC | https://paperswithcode.com/paper/detection-of-adverse-drug-reaction-mentions |
Repo | |
Framework | |
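The listing gives no implementation details beyond ELMo embeddings feeding a classifier, so here is a minimal hypothetical sketch of the task-1 setup in PyTorch with AllenNLP's ELMo module. The mean-pooling and single linear head are assumptions, not the authors' reported architecture.

```python
# Hypothetical sketch: ELMo features + a binary head for ADR tweet detection.
import torch
import torch.nn as nn
from allennlp.modules.elmo import Elmo, batch_to_ids

# Standard public ELMo files; the authors' exact checkpoint is not stated.
OPTIONS = "https://allennlp.s3.amazonaws.com/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_options.json"
WEIGHTS = "https://allennlp.s3.amazonaws.com/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5"

class ADRClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # One mixed ELMo layer (1024-dim output per token).
        self.elmo = Elmo(OPTIONS, WEIGHTS, num_output_representations=1, dropout=0.5)
        self.head = nn.Linear(1024, 1)  # assumed head; the paper does not specify

    def forward(self, tokenized_tweets):
        char_ids = batch_to_ids(tokenized_tweets)               # (batch, len, 50)
        reps = self.elmo(char_ids)["elmo_representations"][0]   # (batch, len, 1024)
        pooled = reps.mean(dim=1)                               # mean-pool over tokens
        return torch.sigmoid(self.head(pooled)).squeeze(-1)     # P(ADR mention)

model = ADRClassifier()
probs = model([["this", "drug", "gave", "me", "headaches"]])
```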
Detection of Adverse Drug Reaction in Tweets Using a Combination of Heterogeneous Word Embeddings
Title | Detection of Adverse Drug Reaction in Tweets Using a Combination of Heterogeneous Word Embeddings |
Authors | Segun Taofeek Aroyehun, Alexander Gelbukh |
Abstract | This paper details our approach to the task of detecting reports of adverse drug reactions in tweets as part of the 2019 social media mining for healthcare applications shared task. We employed a combination of three types of word representations as input to an LSTM model. With this approach, we achieved an F1 score of 0.5209. |
Tasks | Word Embeddings |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3224/ |
PDF | https://www.aclweb.org/anthology/W19-3224 |
PWC | https://paperswithcode.com/paper/detection-of-adverse-drug-reaction-in-tweets |
Repo | |
Framework | |
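A minimal sketch of the described input scheme: three pre-trained embedding tables concatenated per token and fed to an LSTM classifier. The embedding sources and dimensions, the frozen lookups, and the single-layer head are assumptions.

```python
# Hypothetical sketch: heterogeneous word embeddings concatenated as LSTM input.
import torch
import torch.nn as nn

class MultiEmbeddingLSTM(nn.Module):
    def __init__(self, emb_matrices, hidden=128):
        super().__init__()
        # emb_matrices: three pre-trained (vocab, dim_i) float tensors sharing one
        # vocabulary, e.g. GloVe / word2vec / fastText (an assumption, not the paper's list).
        self.embs = nn.ModuleList(
            nn.Embedding.from_pretrained(m, freeze=True) for m in emb_matrices
        )
        self.lstm = nn.LSTM(sum(m.shape[1] for m in emb_matrices), hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = torch.cat([e(token_ids) for e in self.embs], dim=-1)
        _, (h, _) = self.lstm(x)                       # final hidden state
        return torch.sigmoid(self.out(h[-1])).squeeze(-1)
```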
A Simple Approach to Classify Fictional and Non-Fictional Genres
Title | A Simple Approach to Classify Fictional and Non-Fictional Genres |
Authors | Mohammed Rameez Qureshi, Sidharth Ranjan, Rajakrishnan Rajkumar, Kushal Shah |
Abstract | In this work, we deploy a logistic regression classifier to ascertain whether a given document belongs to the fiction or non-fiction genre. For genre identification, previous work had proposed three classes of features, viz., low-level (character-level and token counts), high-level (lexical and syntactic information) and derived features (type-token ratio, average word length or average sentence length). Using the recursive feature elimination with cross-validation (RFECV) algorithm, we perform feature selection experiments on an exhaustive set of nineteen features (belonging to all the classes mentioned above) extracted from Brown corpus text. As a result, two simple features, viz., the ratio of the number of adverbs to adjectives and the number of adjectives to pronouns, turn out to be the most significant. Subsequently, our classification experiments aimed at genre identification of documents from the Brown and Baby BNC corpora demonstrate that the performance of a classifier containing just the two aforementioned features is on par with that of a classifier containing the exhaustive feature set. |
Tasks | Feature Selection |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3409/ |
PDF | https://www.aclweb.org/anthology/W19-3409 |
PWC | https://paperswithcode.com/paper/a-simple-approach-to-classify-fictional-and |
Repo | |
Framework | |
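A sketch of the headline finding: just the two winning ratio features in a logistic-regression classifier. The NLTK tag prefixes (RB*, JJ*, PRP*) and the `documents`/`labels` variables are illustrative assumptions.

```python
# Hypothetical sketch: adverb/adjective and adjective/pronoun ratios as the only
# two features for fiction vs. non-fiction classification.
import nltk  # assumes nltk's 'punkt' and tagger models are downloaded
from sklearn.linear_model import LogisticRegression

def ratio_features(text):
    tags = [t for _, t in nltk.pos_tag(nltk.word_tokenize(text))]
    adv = sum(t.startswith("RB") for t in tags)    # adverbs
    adj = sum(t.startswith("JJ") for t in tags)    # adjectives
    prn = sum(t.startswith("PRP") for t in tags)   # pronouns
    return [adv / max(adj, 1), adj / max(prn, 1)]  # guard against division by zero

# `documents` (list of strings) and `labels` (1 = fiction) are assumed given.
X = [ratio_features(doc) for doc in documents]
clf = LogisticRegression().fit(X, labels)
```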
Dynamic Recursive Neural Network
Title | Dynamic Recursive Neural Network |
Authors | Qiushan Guo, Zhipeng Yu, Yichao Wu, Ding Liang, Haoyu Qin, Junjie Yan |
Abstract | This paper proposes the dynamic recursive neural network (DRNN), which simplifies the duplicated building blocks in deep neural networks. In contrast to previous networks, which forward through different blocks sequentially, we demonstrate that the DRNN can achieve better performance with fewer blocks by employing blocks recursively. We further add a gate structure to each block, which can adaptively decide the number of loop iterations of the recursive blocks to reduce the computational cost. Since recursive networks are hard to train, we propose the Loopy Variable Batch Normalization (LVBN) to stabilize the volatile gradient. Further, we improve the LVBN to correct the statistical bias caused by the gate structure. Experiments show that the DRNN reduces parameters and computational cost while consistently outperforming the original model in terms of accuracy on CIFAR-10 and ImageNet-1k. Lastly, we visualize and discuss the relation between image saliency and the number of loop iterations. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Guo_Dynamic_Recursive_Neural_Network_CVPR_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_CVPR_2019/papers/Guo_Dynamic_Recursive_Neural_Network_CVPR_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/dynamic-recursive-neural-network |
Repo | |
Framework | |
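A schematic of the core idea rather than the paper's architecture: one weight-shared residual block applied recursively, with a sigmoid gate deciding when to stop looping. The gate placement, the 0.5 threshold, and the use of plain BatchNorm (the paper proposes LVBN precisely because plain BN misbehaves across loops) are simplifications.

```python
# Hypothetical sketch of a dynamic recursive block with an early-exit gate.
import torch
import torch.nn as nn

class RecursiveBlock(nn.Module):
    def __init__(self, channels, max_loops=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),   # stand-in for the paper's LVBN
            nn.ReLU(inplace=True),
        )
        self.gate = nn.Linear(channels, 1)  # decides whether to loop again
        self.max_loops = max_loops

    def forward(self, x):
        for _ in range(self.max_loops):
            x = self.body(x) + x                    # same weights on every pass
            pooled = x.mean(dim=(2, 3))             # global average pool -> (B, C)
            if torch.sigmoid(self.gate(pooled)).mean() < 0.5:
                break                               # gate: no further recursion
        return x
```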
Think out of the “Box”: Generically-Constrained Asynchronous Composite Optimization and Hedging
Title | Think out of the “Box”: Generically-Constrained Asynchronous Composite Optimization and Hedging |
Authors | Pooria Joulani, András György, Csaba Szepesvari |
Abstract | We present two new algorithms, ASYNCADA and HEDGEHOG, for asynchronous sparse online and stochastic optimization. ASYNCADA is, to our knowledge, the first asynchronous stochastic optimization algorithm with finite-time data-dependent convergence guarantees for generic convex constraints. In addition, ASYNCADA: (a) allows for proximal (i.e., composite-objective) updates and adaptive step-sizes; (b) enjoys any-time convergence guarantees without requiring an exact global clock; and (c) when the data is sufficiently sparse, its convergence rate for (non-)smooth, (non-)strongly-convex, and even a limited class of non-convex objectives matches the corresponding serial rate, implying a theoretical “linear speed-up”. The second algorithm, HEDGEHOG, is an asynchronous parallel version of the Exponentiated Gradient (EG) algorithm for optimization over the probability simplex (a.k.a. Hedge in online learning), and, to our knowledge, the first asynchronous algorithm enjoying linear speed-ups under sparsity with non-SGD-style updates. Unlike previous work, ASYNCADA and HEDGEHOG, together with their convergence and speed-up analyses, are not limited to individual coordinate-wise (i.e., “box-shaped”) constraints or smooth and strongly-convex objectives. Underlying both results is a generic analysis framework that is of independent interest, and further applicable to distributed and delayed-feedback optimization. |
Tasks | Stochastic Optimization |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9391-think-out-of-the-box-generically-constrained-asynchronous-composite-optimization-and-hedging |
PDF | http://papers.nips.cc/paper/9391-think-out-of-the-box-generically-constrained-asynchronous-composite-optimization-and-hedging.pdf |
PWC | https://paperswithcode.com/paper/think-out-of-the-box-generically-constrained |
Repo | |
Framework | |
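For concreteness, here is the serial Exponentiated Gradient (Hedge) update that HEDGEHOG parallelizes; the asynchrony and the sparsity-dependent speed-up analysis, which are the paper's actual contributions, are not shown.

```python
# Serial EG/Hedge step: multiplicative update, then renormalize onto the simplex.
import numpy as np

def hedge_step(w, grad, eta=0.1):
    """w: current point on the probability simplex; grad: loss gradient."""
    w = w * np.exp(-eta * grad)   # w_i <- w_i * exp(-eta * g_i)
    return w / w.sum()            # project back to sum-to-one

w = np.full(4, 0.25)                                # uniform start on the 4-simplex
w = hedge_step(w, np.array([1.0, 0.0, 0.5, 0.2]))   # mass shifts to low-loss coordinates
```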
Procedural Text Generation from a Photo Sequence
Title | Procedural Text Generation from a Photo Sequence |
Authors | Taichi Nishimura, Atsushi Hashimoto, Shinsuke Mori |
Abstract | Multimedia procedural texts, such as instructions and manuals with pictures, help people share how-to knowledge. In this paper, we propose a method for generating a procedural text given a photo sequence, allowing users to obtain a multimedia procedural text. We propose a single embedding space for both images and text, enabling them to be interconnected and appropriate words to be selected to describe a photo. We implemented our method and tested it on cooking instructions, i.e., recipes. Various experimental results showed that our method outperforms standard baselines. |
Tasks | Text Generation |
Published | 2019-10-01 |
URL | https://www.aclweb.org/anthology/W19-8650/ |
PDF | https://www.aclweb.org/anthology/W19-8650 |
PWC | https://paperswithcode.com/paper/procedural-text-generation-from-a-photo |
Repo | |
Framework | |
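A minimal sketch of the shared image-text space: two linear projections into one embedding space, with vocabulary words scored against a photo by cosine similarity. The feature dimensions and linear projections are assumptions; the paper's encoders are not specified in this listing.

```python
# Hypothetical sketch: a single embedding space for images and words.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointSpace(nn.Module):
    def __init__(self, img_dim=2048, word_dim=300, joint_dim=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, joint_dim)    # e.g. CNN photo features
        self.word_proj = nn.Linear(word_dim, joint_dim)  # e.g. word embeddings

    def score_words(self, img_feat, word_vecs):
        """Cosine similarity of every vocabulary word to one photo."""
        i = F.normalize(self.img_proj(img_feat), dim=-1)      # (joint,)
        w = F.normalize(self.word_proj(word_vecs), dim=-1)    # (vocab, joint)
        return w @ i                                          # (vocab,) word scores
```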
Provable Online Dictionary Learning and Sparse Coding
Title | Provable Online Dictionary Learning and Sparse Coding |
Authors | Sirisha Rambhatla, Xingguo Li, Jarvis Haupt |
Abstract | We consider the dictionary learning problem, where the aim is to model the given data as a linear combination of a few columns of a matrix known as a dictionary, where the sparse weights forming the linear combination are known as coefficients. Since both the dictionary and coefficients parameterizing the linear model are unknown, the corresponding optimization is inherently non-convex. This was a major challenge until recently, when provable algorithms for dictionary learning were proposed. Yet, these provide guarantees only on the recovery of the dictionary, without explicit recovery guarantees on the coefficients. Moreover, any estimation error in the dictionary adversely impacts the ability to successfully localize and estimate the coefficients. This potentially limits the utility of existing provable dictionary learning methods in applications where coefficient recovery is of interest. To this end, we develop a simple online alternating optimization-based algorithm for dictionary learning, which recovers both the dictionary and coefficients exactly at a geometric rate. Specifically, we show that – when initialized appropriately – the algorithm linearly converges to the true factors. Our algorithm is also scalable and amenable to large-scale distributed implementations in neural architectures, by which we mean that it only involves simple linear and non-linear operations. Finally, we corroborate these theoretical results via experimental evaluation of the proposed algorithm against the current state-of-the-art techniques. |
Tasks | Dictionary Learning |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=HJeu43ActQ |
PDF | https://openreview.net/pdf?id=HJeu43ActQ |
PWC | https://paperswithcode.com/paper/provable-online-dictionary-learning-and |
Repo | |
Framework | |
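A sketch of one online alternating step in the spirit described: hard-thresholding gives the sparse coefficient estimate, then the dictionary takes a gradient step on the reconstruction loss. The thresholding rule, step sizes, and column normalization are illustrative assumptions, not the paper's exact updates or conditions.

```python
# Hypothetical sketch: one online alternating dictionary-learning update.
import numpy as np

def online_dl_step(A, y, tau=0.1, eta=0.05):
    """A: dictionary (n, m) with unit-norm columns; y: one data sample (n,)."""
    x = A.T @ y                                 # correlate sample with columns
    x[np.abs(x) < tau] = 0.0                    # hard-threshold -> sparse coefficients
    A = A - eta * np.outer(A @ x - y, x)        # grad of 0.5*||y - A x||^2 w.r.t. A
    A /= np.maximum(np.linalg.norm(A, axis=0), 1e-12)  # re-normalize columns
    return A, x
```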
Monash University’s Submissions to the WNGT 2019 Document Translation Task
Title | Monash University’s Submissions to the WNGT 2019 Document Translation Task |
Authors | Sameen Maruf, Gholamreza Haffari |
Abstract | We describe the work of Monash University for the shared task of Rotowire document translation organised by the 3rd Workshop on Neural Generation and Translation (WNGT 2019). We submitted systems for both directions of the English-German language pair. Our main focus is on employing an established document-level neural machine translation model for this task. We achieve a BLEU score of 39.83 (41.46 BLEU per WNGT evaluation) for En-De and 45.06 (47.39 BLEU per WNGT evaluation) for De-En translation directions on the Rotowire test set. All experiments conducted in the process are also described. |
Tasks | Machine Translation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5628/ |
PDF | https://www.aclweb.org/anthology/D19-5628 |
PWC | https://paperswithcode.com/paper/monash-universitys-submissions-to-the-wngt |
Repo | |
Framework | |
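The system itself is an adapted document-level NMT model, so there is little to sketch beyond evaluation; below is how BLEU scores like those quoted are typically computed with sacreBLEU (the shared task's exact evaluation settings are an assumption here).

```python
# Scoring detokenized system output with sacreBLEU.
import sacrebleu

hyps = ["the striker scored twice in the second half ."]          # system output
refs = [["the striker scored two goals in the second half ."]]    # one reference stream
print(sacrebleu.corpus_bleu(hyps, refs).score)
```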
Bridging the Gap: Improve Part-of-speech Tagging for Chinese Social Media Texts with Foreign Words
Title | Bridging the Gap: Improve Part-of-speech Tagging for Chinese Social Media Texts with Foreign Words |
Authors | Dingmin Wang, Meng Fang, Yan Song, Juntao Li |
Abstract | |
Tasks | Part-Of-Speech Tagging |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5803/ |
PDF | https://www.aclweb.org/anthology/W19-5803 |
PWC | https://paperswithcode.com/paper/bridging-the-gap-improve-part-of-speech |
Repo | |
Framework | |
Leveraging Local and Global Patterns for Self-Attention Networks
Title | Leveraging Local and Global Patterns for Self-Attention Networks |
Authors | Mingzhou Xu, Derek F. Wong, Baosong Yang, Yue Zhang, Lidia S. Chao |
Abstract | Self-attention networks have received increasing research attention. By default, the hidden states of each word are hierarchically calculated by attending to all words in the sentence, which assembles global information. However, several studies have pointed out that taking all signals into account may lead to overlooking neighboring information (e.g., phrase patterns). To address this, we propose a hybrid attention mechanism to dynamically leverage both local and global information. Specifically, our approach uses a gating scalar to integrate both sources of information, which also makes it convenient to quantify their contributions. Experiments on various neural machine translation tasks demonstrate the effectiveness of the proposed method. The extensive analyses verify that the two types of contexts are complementary to each other, and our method yields highly effective improvements in their integration. |
Tasks | Machine Translation |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1295/ |
PDF | https://www.aclweb.org/anthology/P19-1295 |
PWC | https://paperswithcode.com/paper/leveraging-local-and-global-patterns-for-self |
Repo | |
Framework | |
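A sketch of the gating idea: one attention pass restricted to a local window, one over the whole sentence, mixed by a learned per-position scalar. The window size, gate parameterization, and single-head form are assumptions.

```python
# Hypothetical sketch: gated fusion of local and global self-attention.
import torch
import torch.nn as nn
import torch.nn.functional as F

def attention(q, k, v, mask=None):
    scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

class HybridAttention(nn.Module):
    def __init__(self, dim, window=3):
        super().__init__()
        self.gate = nn.Linear(dim, 1)   # per-position mixing scalar in (0, 1)
        self.window = window

    def forward(self, q, k, v):                    # all (batch, n, dim)
        n = q.shape[1]
        idx = torch.arange(n)
        local_mask = (idx[None, :] - idx[:, None]).abs() <= self.window
        local = attention(q, k, v, local_mask)     # neighborhood context only
        globl = attention(q, k, v)                 # full-sentence context
        g = torch.sigmoid(self.gate(q))            # how much local context to keep
        return g * local + (1 - g) * globl
```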
Selecting, Planning, and Rewriting: A Modular Approach for Data-to-Document Generation and Translation
Title | Selecting, Planning, and Rewriting: A Modular Approach for Data-to-Document Generation and Translation |
Authors | Lesly Miculicich, Marc Marone, Hany Hassan |
Abstract | In this paper, we report our system submissions to all 6 tracks of the WNGT 2019 shared task on Document-Level Generation and Translation. The objective is to generate a textual document either from structured data (generation task) or from a document in a different language (translation task). For the translation task, we focused on adapting a large-scale system trained on WMT data by fine-tuning it on the RotoWire data. For the generation task, we participated with two systems based on a selection and planning model followed by (a) a simple language model generation, and (b) a GPT-2 pre-trained language model approach. The selection and planning module chooses a subset of table records in order, and the language models produce text given such a subset. |
Tasks | Language Modelling |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5633/ |
PDF | https://www.aclweb.org/anthology/D19-5633 |
PWC | https://paperswithcode.com/paper/selecting-planning-and-rewriting-a-modular |
Repo | |
Framework | |
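A pipeline-shaped sketch of select-plan-rewrite. All three stubs are placeholders: the `salient`/`importance` record fields and the `lm` callable are hypothetical stand-ins for the learned selection/planning modules and the LM or GPT-2 generator.

```python
# Hypothetical sketch of the three-stage data-to-document pipeline.
def select(records):
    # stand-in for the learned selector: keep records flagged as salient
    return [r for r in records if r["salient"]]

def plan(selected):
    # stand-in for the learned planner: a deterministic ordering
    return sorted(selected, key=lambda r: r["importance"], reverse=True)

def rewrite(planned, lm):
    # lm: any conditional text generator, e.g. a fine-tuned GPT-2 (system b)
    prompt = " ; ".join(f'{r["entity"]} {r["type"]} {r["value"]}' for r in planned)
    return lm(prompt)

def generate(records, lm):
    return rewrite(plan(select(records)), lm)
```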
Women’s Syntactic Resilience and Men’s Grammatical Luck: Gender-Bias in Part-of-Speech Tagging and Dependency Parsing
Title | Women’s Syntactic Resilience and Men’s Grammatical Luck: Gender-Bias in Part-of-Speech Tagging and Dependency Parsing |
Authors | Aparna Garimella, Carmen Banea, Dirk Hovy, Rada Mihalcea |
Abstract | Several linguistic studies have shown the prevalence of various lexical and grammatical patterns in texts authored by a person of a particular gender, but models for part-of-speech tagging and dependency parsing have still not adapted to account for these differences. To address this, we annotate the Wall Street Journal part of the Penn Treebank with the gender information of the articles' authors, and build taggers and parsers trained on this data that show performance differences in text written by men and women. Further analyses reveal numerous part-of-speech tags and syntactic relations whose prediction performances benefit from the prevalence of a specific gender in the training data. The results underscore the importance of accounting for gendered differences in syntactic tasks, and outline future venues for developing more accurate taggers and parsers. We release our data to the research community. |
Tasks | Dependency Parsing, Part-Of-Speech Tagging |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1339/ |
PDF | https://www.aclweb.org/anthology/P19-1339 |
PWC | https://paperswithcode.com/paper/womens-syntactic-resilience-and-mens |
Repo | |
Framework | |
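The paper's central measurement is a per-gender performance breakdown; a minimal sketch of that comparison is below, with `tagger` (a words-to-tags function) and the two gold-tagged, gender-split test sets assumed given.

```python
# Hypothetical sketch: tagging accuracy broken down by author gender.
def tagging_accuracy(tagger, sentences):
    """sentences: list of [(word, gold_tag), ...]; tagger: words -> predicted tags."""
    correct = total = 0
    for sent in sentences:
        words = [w for w, _ in sent]
        for (_, gold), pred in zip(sent, tagger(words)):
            correct += int(pred == gold)
            total += 1
    return correct / total

# `female_authored` and `male_authored` are assumed gold-annotated test splits.
gap = tagging_accuracy(tagger, female_authored) - tagging_accuracy(tagger, male_authored)
```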
Segmentation of Argumentative Texts with Contextualised Word Representations
Title | Segmentation of Argumentative Texts with Contextualised Word Representations |
Authors | Georgios Petasis |
Abstract | The segmentation of argumentative units is an important subtask of argument mining, which is frequently addressed at a coarse granularity, usually assuming argumentative units to be no smaller than sentences. Approaches focusing on the clause-level granularity typically address the task as sequence labeling at the token level, aiming to classify whether a token begins, is inside, or is outside of an argumentative unit. Most approaches exploit highly engineered, manually constructed features and algorithms typically used in sequential tagging, such as Conditional Random Fields, while more recent approaches try to exploit manually constructed features in the context of deep neural networks. In this context, we examined to what extent recent advances in sequential labelling allow us to reduce the need for highly sophisticated, manually constructed features, and whether limiting features to embeddings pre-trained on large corpora is a promising approach. Evaluation results suggest the examined models and approaches can exhibit comparable performance, minimising the need for feature engineering. |
Tasks | Argument Mining, Feature Engineering |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4501/ |
PDF | https://www.aclweb.org/anthology/W19-4501 |
PWC | https://paperswithcode.com/paper/segmentation-of-argumentative-texts-with |
Repo | |
Framework | |
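A sketch of the token-level BIO formulation the abstract describes, using only pre-trained contextual embeddings as input. The BiLSTM encoder and the absence of a CRF layer are assumptions; the paper compares several sequential labelling models.

```python
# Hypothetical sketch: B/I/O tagging of argumentative units over embeddings.
import torch.nn as nn

class BIOTagger(nn.Module):
    NUM_TAGS = 3  # B / I / O: begins, is inside, is outside an argumentative unit

    def __init__(self, emb_dim=1024, hidden=256):
        super().__init__()
        # Input: pre-computed contextualized embeddings (e.g. ELMo) per token,
        # i.e. no hand-engineered features at all.
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, self.NUM_TAGS)

    def forward(self, token_embeddings):           # (batch, seq_len, emb_dim)
        h, _ = self.lstm(token_embeddings)
        return self.proj(h)                        # per-token B/I/O logits
```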
Robust Determinantal Generative Classifier for Noisy Labels and Adversarial Attacks
Title | Robust Determinantal Generative Classifier for Noisy Labels and Adversarial Attacks |
Authors | Kimin Lee, Sukmin Yun, Kibok Lee, Honglak Lee, Bo Li, Jinwoo Shin |
Abstract | Large-scale datasets may contain significant proportions of noisy (incorrect) class labels, and it is well-known that modern deep neural networks poorly generalize from such noisy training datasets. In this paper, we propose a novel inference method, Deep Determinantal Generative Classifier (DDGC), which can obtain a more robust decision boundary under any softmax neural classifier pre-trained on noisy datasets. Our main idea is inducing a generative classifier on top of hidden feature spaces of the discriminative deep model. By estimating the parameters of generative classifier using the minimum covariance determinant estimator, we significantly improve the classification accuracy, with neither re-training of the deep model nor changing its architectures. In particular, we show that DDGC not only generalizes well from noisy labels, but also is robust against adversarial perturbations due to its large margin property. Finally, we propose the ensemble version of DDGC to improve its performance, by investigating the layer-wise characteristics of generative classifier. Our extensive experimental results demonstrate the superiority of DDGC given different learning models optimized by various training techniques to handle noisy labels or adversarial samples. For instance, on CIFAR-10 dataset containing 45% noisy training labels, we improve the test accuracy of a deep model optimized by the state-of-the-art noise-handling training method from 33.34% to 43.02%. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=rkle3i09K7 |
PDF | https://openreview.net/pdf?id=rkle3i09K7 |
PWC | https://paperswithcode.com/paper/robust-determinantal-generative-classifier |
Repo | |
Framework | |
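A sketch of the inference idea: class-conditional Gaussians fitted on a pre-trained network's hidden features with scikit-learn's minimum covariance determinant estimator, then nearest-class-mean classification under Mahalanobis distance. The tied covariance and the plain nearest-mean rule are simplifications of DDGC.

```python
# Hypothetical sketch: a robust generative classifier on hidden features.
import numpy as np
from sklearn.covariance import MinCovDet

def fit_generative_classifier(feats, labels):
    """feats: (N, d) hidden features from the frozen deep model; labels: (N,)."""
    means, covs = {}, []
    for c in np.unique(labels):
        mcd = MinCovDet().fit(feats[labels == c])  # robust to noisy labels
        means[c] = mcd.location_
        covs.append(mcd.covariance_)
    precision = np.linalg.inv(np.mean(covs, axis=0))  # tied covariance, as in LDA
    return means, precision

def predict(x, means, precision):
    dists = {c: (x - m) @ precision @ (x - m) for c, m in means.items()}
    return min(dists, key=dists.get)  # closest class mean in Mahalanobis distance
```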
DisSent: Learning Sentence Representations from Explicit Discourse Relations
Title | DisSent: Learning Sentence Representations from Explicit Discourse Relations |
Authors | Allen Nie, Erin Bennett, Noah Goodman |
Abstract | Learning effective representations of sentences is one of the core missions of natural language understanding. Existing models either train on a vast amount of text, or require costly, manually curated sentence relation datasets. We show that with dependency parsing and rule-based rubrics, we can curate a high quality sentence relation task by leveraging explicit discourse relations. We show that our curated dataset provides an excellent signal for learning vector representations of sentence meaning, representing relations that can only be determined when the meanings of two sentences are combined. We demonstrate that the automatically curated corpus allows a bidirectional LSTM sentence encoder to yield high quality sentence embeddings and can serve as a supervised fine-tuning dataset for larger models such as BERT. Our fixed sentence embeddings achieve high performance on a variety of transfer tasks, including SentEval, and we achieve state-of-the-art results on Penn Discourse Treebank's implicit relation prediction task. |
Tasks | Dependency Parsing, Sentence Embeddings |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1442/ |
PDF | https://www.aclweb.org/anthology/P19-1442 |
PWC | https://paperswithcode.com/paper/dissent-learning-sentence-representations |
Repo | |
Framework | |
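A sketch of the curation idea: harvest sentence pairs whose discourse relation is marked by an explicit connective, so the connective itself supplies a free training label. The paper extracts pairs with dependency-parse patterns; the regex and the short connective list here are simplifications.

```python
# Hypothetical sketch: mining (s1, s2, connective) triples from raw sentences.
import re

CONNECTIVES = ["because", "but", "although", "so", "when", "while"]

def extract_pairs(sentences):
    """Yield (s1, s2, connective) from sentences like 'S1, because S2.'"""
    pattern = re.compile(
        r"^(.+?),\s*(" + "|".join(CONNECTIVES) + r")\s+(.+)$", re.IGNORECASE
    )
    for s in sentences:
        m = pattern.match(s)
        if m:
            yield m.group(1), m.group(3), m.group(2).lower()

pairs = list(extract_pairs(["I stayed home, because it was raining."]))
# -> [('I stayed home', 'it was raining.', 'because')]
```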