Paper Group AWR 264
SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)
Title | SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval) |
Authors | Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, Ritesh Kumar |
Abstract | We present the results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval). The task was based on a new dataset, the Offensive Language Identification Dataset (OLID), which contains over 14,000 English tweets. It featured three sub-tasks. In sub-task A, the goal was to discriminate between offensive and non-offensive posts. In sub-task B, the focus was on the type of offensive content in the post. Finally, in sub-task C, systems had to detect the target of the offensive posts. OffensEval attracted a large number of participants and it was one of the most popular tasks in SemEval-2019. In total, about 800 teams signed up to participate in the task, and 115 of them submitted results, which we present and analyze in this report. |
Tasks | Language Identification |
Published | 2019-03-19 |
URL | http://arxiv.org/abs/1903.08983v3, http://arxiv.org/pdf/1903.08983v3.pdf |
PWC | https://paperswithcode.com/paper/semeval-2019-task-6-identifying-and-1 |
Repo | https://github.com/VadymV/OffensEval |
Framework | tf |
Simultaneous Semantic Segmentation and Outlier Detection in Presence of Domain Shift
Title | Simultaneous Semantic Segmentation and Outlier Detection in Presence of Domain Shift |
Authors | Petra Bevandić, Ivan Krešo, Marin Oršić, Siniša Šegvić |
Abstract | Recent success on realistic road driving datasets has increased interest in exploring robust performance in real-world applications. One of the major unsolved problems is to identify image content which can not be reliably recognized with a given inference engine. We therefore study approaches to recover a dense outlier map alongside the primary task with a single forward pass, by relying on shared convolutional features. We consider semantic segmentation as the primary task and perform extensive validation on WildDash val (inliers), LSUN val (outliers), and pasted objects from Pascal VOC 2007 (outliers). We achieve the best validation performance by training to discriminate inliers from pasted ImageNet-1k content, even though ImageNet-1k contains many road-driving pixels, and, at least nominally, fails to account for the full diversity of the visual world. The proposed two-head model performs comparably to the C-way multi-class model trained to predict uniform distribution in outliers, while outperforming several other validated approaches. We evaluate our best two models on the WildDash test dataset and set a new state of the art on the WildDash benchmark. |
Tasks | Outlier Detection, Semantic Segmentation |
Published | 2019-08-03 |
URL | https://arxiv.org/abs/1908.01098v1, https://arxiv.org/pdf/1908.01098v1.pdf |
PWC | https://paperswithcode.com/paper/simultaneous-semantic-segmentation-and |
Repo | https://github.com/pb-brainiac/semseg_od |
Framework | pytorch |
Graph Residual Flow for Molecular Graph Generation
Title | Graph Residual Flow for Molecular Graph Generation |
Authors | Shion Honda, Hirotaka Akita, Katsuhiko Ishiguro, Toshiki Nakanishi, Kenta Oono |
Abstract | Statistical generative models for molecular graphs attract attention from many researchers from the fields of bio- and chemo-informatics. Among these models, invertible flow-based approaches have not yet been fully explored. In this paper, we propose a powerful invertible flow for molecular graphs, called graph residual flow (GRF). The GRF is based on residual flows, which are known for more flexible and complex non-linear mappings than traditional coupling flows. We theoretically derive non-trivial conditions such that GRF is invertible, and present a way of keeping the entire flows invertible throughout the training and sampling. Experimental results show that a generative model based on the proposed GRF achieves comparable generation performance, with a much smaller number of trainable parameters than the existing flow-based model. |
Tasks | Graph Generation |
Published | 2019-09-30 |
URL | https://arxiv.org/abs/1909.13521v1, https://arxiv.org/pdf/1909.13521v1.pdf |
PWC | https://paperswithcode.com/paper/graph-residual-flow-for-molecular-graph |
Repo | https://github.com/pfnet-research/chainer-chemistry |
Framework | none |
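The residual-flow idea mentioned in the abstract above can be illustrated in one dimension. The sketch below is not the GRF model and does not reproduce the paper's invertibility conditions: it assumes a hypothetical scalar residual branch `residual` whose slope is bounded by 0.5, and shows the standard fact that a layer y = x + g(x) with a contractive g can be inverted by fixed-point iteration. All function names and constants are illustrative assumptions.

```python
import math

def residual(x, weight=0.5):
    # Hypothetical residual branch g(x); |g'(x)| <= weight < 1, so g is a contraction.
    return weight * math.tanh(x)

def forward(x):
    # Residual layer: y = x + g(x).
    return x + residual(x)

def inverse(y, iterations=50):
    # Fixed-point iteration x <- y - g(x); it converges because g is a contraction.
    x = y
    for _ in range(iterations):
        x = y - residual(x)
    return x

x = 1.3
y = forward(x)
x_rec = inverse(y)  # numerically recovers x
```

If the slope bound were 1 or larger, the iteration would no longer be guaranteed to converge, which is why a condition on the residual branch is needed for invertibility.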
Attention Based Glaucoma Detection: A Large-scale Database and CNN Model
Title | Attention Based Glaucoma Detection: A Large-scale Database and CNN Model |
Authors | Liu Li, Mai Xu, Xiaofei Wang, Lai Jiang, Hanruo Liu |
Abstract | Recently, the attention mechanism has been successfully applied in convolutional neural networks (CNNs), significantly boosting the performance of many computer vision tasks. Unfortunately, few medical image recognition approaches incorporate the attention mechanism in the CNNs. In particular, there exists high redundancy in fundus images for glaucoma detection, such that the attention mechanism has potential in improving the performance of CNN-based glaucoma detection. This paper proposes an attention-based CNN for glaucoma detection (AG-CNN). Specifically, we first establish a large-scale attention based glaucoma (LAG) database, which includes 5,824 fundus images labeled with either positive glaucoma (2,392) or negative glaucoma (3,432). The attention maps of the ophthalmologists are also collected in LAG database through a simulated eye-tracking experiment. Then, a new structure of AG-CNN is designed, including an attention prediction subnet, a pathological area localization subnet and a glaucoma classification subnet. Different from other attention-based CNN methods, the features are also visualized as the localized pathological area, which can advance the performance of glaucoma detection. Finally, the experiment results show that the proposed AG-CNN approach significantly advances state-of-the-art glaucoma detection. |
Tasks | Eye Tracking |
Published | 2019-03-26 |
URL | http://arxiv.org/abs/1903.10831v3, http://arxiv.org/pdf/1903.10831v3.pdf |
PWC | https://paperswithcode.com/paper/attention-based-glaucoma-detection-a-large |
Repo | https://github.com/smilell/AG-CNN |
Framework | none |
Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages
Title | Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages |
Authors | Shauli Ravfogel, Yoav Goldberg, Tal Linzen |
Abstract | How do typological properties such as word order and morphological case marking affect the ability of neural sequence models to acquire the syntax of a language? Cross-linguistic comparisons of RNNs’ syntactic performance (e.g., on subject-verb agreement prediction) are complicated by the fact that any two languages differ in multiple typological properties, as well as by differences in training corpus. We propose a paradigm that addresses these issues: we create synthetic versions of English, which differ from English in one or more typological parameters, and generate corpora for those languages based on a parsed English corpus. We report a series of experiments in which RNNs were trained to predict agreement features for verbs in each of those synthetic languages. Among other findings, (1) performance was higher in subject-verb-object order (as in English) than in subject-object-verb order (as in Japanese), suggesting that RNNs have a recency bias; (2) predicting agreement with both subject and object (polypersonal agreement) improves over predicting each separately, suggesting that underlying syntactic knowledge transfers across the two tasks; and (3) overt morphological case makes agreement prediction significantly easier, regardless of word order. |
Tasks | |
Published | 2019-03-15 |
URL | http://arxiv.org/abs/1903.06400v2, http://arxiv.org/pdf/1903.06400v2.pdf |
PWC | https://paperswithcode.com/paper/studying-the-inductive-biases-of-rnns-with |
Repo | https://github.com/Shaul1321/rnn_typology |
Framework | none |
Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods
Title | Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods |
Authors | Kevin J Liang, Guoyin Wang, Yitong Li, Ricardo Henao, Lawrence Carin |
Abstract | We investigate time-dependent data analysis from the perspective of recurrent kernel machines, from which models with hidden units and gated memory cells arise naturally. By considering dynamic gating of the memory cell, a model closely related to the long short-term memory (LSTM) recurrent neural network is derived. Extending this setup to $n$-gram filters, the convolutional neural network (CNN), Gated CNN, and recurrent additive network (RAN) are also recovered as special cases. Our analysis provides a new perspective on the LSTM, while also extending it to $n$-gram convolutional filters. Experiments are performed on natural language processing tasks and on analysis of local field potentials (neuroscience). We demonstrate that the variants we derive from kernels perform on par with or even better than traditional neural methods. For the neuroscience application, the new models demonstrate significant improvements relative to the prior state of the art. |
Tasks | |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.04233v1, https://arxiv.org/pdf/1910.04233v1.pdf |
PWC | https://paperswithcode.com/paper/kernel-based-approaches-for-sequence-modeling |
Repo | https://github.com/kevinjliang/kernels2rnns |
Framework | tf |
A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation
Title | A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation |
Authors | Mitsuru Kusumoto, Takuya Inoue, Gentaro Watanabe, Takuya Akiba, Masanori Koyama |
Abstract | Recomputation algorithms collectively refer to a family of methods that aim to reduce the memory consumption of backpropagation by selectively discarding the intermediate results of the forward propagation and recomputing the discarded results as needed. In this paper, we propose a novel and efficient recomputation method that can be applied to a wider range of neural nets than previous methods. We use the language of graph theory to formalize the general recomputation problem of minimizing the computational overhead under a fixed memory budget constraint, and provide a dynamic programming solution to the problem. Our method can reduce the peak memory consumption on various benchmark networks by 36% to 81%, which outperforms the reduction achieved by other methods. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11722v1, https://arxiv.org/pdf/1905.11722v1.pdf |
PWC | https://paperswithcode.com/paper/a-graph-theoretic-framework-of-recomputation |
Repo | https://github.com/pfnet-research/recompute |
Framework | none |
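The trade-off described in the abstract above (discard intermediate results, recompute them when needed) can be shown on a toy chain of scalar layers. This sketch is not the paper's graph-theoretic dynamic program: it assumes a hypothetical chain network and a fixed checkpoint interval `segment`, and only demonstrates that recomputation yields the same gradient while holding fewer activations at once. The layer function, weights, and the memory count are illustrative assumptions.

```python
import math

def layer(x, w):
    # One layer of a hypothetical chain network.
    return math.tanh(w * x)

def layer_grad(x, w):
    # Derivative of the layer output with respect to its input x.
    return w * (1.0 - math.tanh(w * x) ** 2)

def grad_store_all(x0, weights):
    # Baseline: keep every layer input from the forward pass.
    inputs = [x0]
    for w in weights[:-1]:
        inputs.append(layer(inputs[-1], w))
    grad = 1.0
    for x, w in zip(reversed(inputs), reversed(weights)):
        grad *= layer_grad(x, w)
    return grad, len(inputs)

def grad_recompute(x0, weights, segment):
    # Forward pass: keep only the input of every `segment`-th layer.
    checkpoints = {}
    x = x0
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints[i] = x
        x = layer(x, w)
    grad = 1.0
    peak = 0
    # Backward pass: rebuild one segment at a time from its checkpoint.
    for start in sorted(checkpoints, reverse=True):
        end = min(start + segment, len(weights))
        inputs = [checkpoints[start]]
        for w in weights[start:end - 1]:
            inputs.append(layer(inputs[-1], w))
        # Count all checkpoints plus the recomputed inputs of this segment.
        peak = max(peak, len(checkpoints) + len(inputs) - 1)
        for x, w in zip(reversed(inputs), reversed(weights[start:end])):
            grad *= layer_grad(x, w)
    return grad, peak

weights = [0.9 + 0.01 * i for i in range(16)]
full_grad, full_memory = grad_store_all(0.5, weights)
ckpt_grad, ckpt_memory = grad_recompute(0.5, weights, segment=4)
```

With 16 layers and a checkpoint every 4 layers, the sketch holds at most 7 activations instead of 16, at the cost of one extra partial forward pass per segment.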
Optic-Net: A Novel Convolutional Neural Network for Diagnosis of Retinal Diseases from Optical Tomography Images
Title | Optic-Net: A Novel Convolutional Neural Network for Diagnosis of Retinal Diseases from Optical Tomography Images |
Authors | Sharif Amit Kamran, Sourajit Saha, Ali Shihab Sabbir, Alireza Tavakkoli |
Abstract | Diagnosing different retinal diseases from Spectral Domain Optical Coherence Tomography (SD-OCT) images is a challenging task. Different automated approaches such as image processing, machine learning and deep learning algorithms have been used for early detection and diagnosis of retinal diseases. Unfortunately, these are prone to error and computational inefficiency, which requires further intervention from human experts. In this paper, we propose a novel convolutional neural network architecture to successfully distinguish between different degenerations of retinal layers and their underlying causes. The proposed novel architecture outperforms other classification models while addressing the issue of gradient explosion. Our approach reaches near-perfect accuracy of 99.8% and 100% on two separately available retinal SD-OCT datasets, respectively. Additionally, our architecture predicts retinal diseases in real time while outperforming human diagnosticians. |
Tasks | |
Published | 2019-10-13 |
URL | https://arxiv.org/abs/1910.05672v1, https://arxiv.org/pdf/1910.05672v1.pdf |
PWC | https://paperswithcode.com/paper/optic-net-a-novel-convolutional-neural |
Repo | https://github.com/SharifAmit/OCT_Classification |
Framework | tf |
RLCard: A Toolkit for Reinforcement Learning in Card Games
Title | RLCard: A Toolkit for Reinforcement Learning in Card Games |
Authors | Daochen Zha, Kwei-Herng Lai, Yuanpu Cao, Songyi Huang, Ruzhe Wei, Junyu Guo, Xia Hu |
Abstract | RLCard is an open-source toolkit for reinforcement learning research in card games. It supports various card environments with easy-to-use interfaces, including Blackjack, Leduc Hold’em, Texas Hold’em, UNO, Dou Dizhu and Mahjong. The goal of RLCard is to bridge reinforcement learning and imperfect information games, and push forward the research of reinforcement learning in domains with multiple agents, large state and action space, and sparse reward. In this paper, we provide an overview of the key components in RLCard, a discussion of the design principles, a brief introduction of the interfaces, and comprehensive evaluations of the environments. The code and documentation are available at https://github.com/datamllab/rlcard |
Tasks | Board Games, Card Games, Game of Poker, Multi-agent Reinforcement Learning |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04376v2, https://arxiv.org/pdf/1910.04376v2.pdf |
PWC | https://paperswithcode.com/paper/rlcard-a-toolkit-for-reinforcement-learning |
Repo | https://github.com/datamllab/rlcard |
Framework | pytorch |
Block Coordinate Regularization by Denoising
Title | Block Coordinate Regularization by Denoising |
Authors | Yu Sun, Jiaming Liu, Ulugbek S. Kamilov |
Abstract | We consider the problem of estimating a vector from its noisy measurements using a prior specified only through a denoising function. Recent work on plug-and-play priors (PnP) and regularization-by-denoising (RED) has shown the state-of-the-art performance of estimators under such priors in a range of imaging tasks. In this work, we develop a new block coordinate RED algorithm that decomposes a large-scale estimation problem into a sequence of updates over a small subset of the unknown variables. We theoretically analyze the convergence of the algorithm and discuss its relationship to the traditional proximal optimization. Our analysis complements and extends recent theoretical results for RED-based estimation methods. We numerically validate our method using several denoiser priors, including those based on convolutional neural network (CNN) denoisers. |
Tasks | Denoising |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.05113v2, https://arxiv.org/pdf/1905.05113v2.pdf |
PWC | https://paperswithcode.com/paper/block-coordinate-regularization-by-denoising |
Repo | https://github.com/wustl-cig/bcred |
Framework | tf |
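The block-wise update described in the abstract above can be sketched on a small denoising problem. This is a toy under stated assumptions, not the authors' implementation: the data term is g(x) = 0.5 * ||x - y||^2, the denoiser is a hypothetical circular 3-tap smoothing filter, blocks are visited cyclically for determinism, and the RED-style operator G(x) = grad g(x) + tau * (x - D(x)) is evaluated in full at every step for brevity. A practical version would evaluate the denoiser only around the active block.

```python
def denoise(x):
    # Hypothetical denoiser D: circular 3-tap smoothing filter.
    n = len(x)
    return [0.25 * x[i - 1] + 0.5 * x[i] + 0.25 * x[(i + 1) % n] for i in range(n)]

def red_operator(x, y, tau):
    # G(x) = grad g(x) + tau * (x - D(x)), with g(x) = 0.5 * ||x - y||^2.
    d = denoise(x)
    return [(xi - yi) + tau * (xi - di) for xi, yi, di in zip(x, y, d)]

def block_coordinate_red(y, block_size=4, tau=1.0, gamma=0.5, sweeps=200):
    x = list(y)
    n = len(x)
    for _ in range(sweeps):
        for start in range(0, n, block_size):
            g = red_operator(x, y, tau)
            # Update only the current block; all other entries stay fixed.
            for i in range(start, min(start + block_size, n)):
                x[i] -= gamma * g[i]
    return x

# Noisy measurements (made-up values for illustration).
y = [0.0, 1.2, -0.3, 0.9, 0.1, 1.1, -0.2, 1.0, 0.2, 0.8, -0.1, 1.3]
x_hat = block_coordinate_red(y)
residual_norm = max(abs(v) for v in red_operator(x_hat, y, 1.0))
```

At convergence the operator G vanishes, so `residual_norm` close to zero indicates that the block updates reached the same fixed point a full-vector update would.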
Zoho at SemEval-2019 Task 9: Semi-supervised Domain Adaptation using Tri-training for Suggestion Mining
Title | Zoho at SemEval-2019 Task 9: Semi-supervised Domain Adaptation using Tri-training for Suggestion Mining |
Authors | Sai Prasanna, Sri Ananda Seelan |
Abstract | This paper describes our submission for the SemEval-2019 Suggestion Mining task. A simple Convolutional Neural Network (CNN) classifier with contextual word representations from a pre-trained language model was used for sentence classification. The model is trained using tri-training, a semi-supervised bootstrapping mechanism for labelling unseen data. Tri-training proved to be an effective technique to accommodate domain shift for cross-domain suggestion mining (Subtask B) where there is no hand labelled training data. For in-domain evaluation (Subtask A), we use the same technique to augment the training set. Our system ranks thirteenth in Subtask A with an $F_1$-score of 68.07 and third in Subtask B with an $F_1$-score of 81.94. |
Tasks | Domain Adaptation, Language Modelling, Sentence Classification |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10623v2, http://arxiv.org/pdf/1902.10623v2.pdf |
PWC | https://paperswithcode.com/paper/zoho-at-semeval-2019-task-9-semi-supervised |
Repo | https://github.com/sai-prasanna/suggestion-mining-semeval19 |
Framework | none |
BS-Nets: An End-to-End Framework For Band Selection of Hyperspectral Image
Title | BS-Nets: An End-to-End Framework For Band Selection of Hyperspectral Image |
Authors | Yaoming Cai, Xiaobo Liu, Zhihua Cai |
Abstract | A hyperspectral image (HSI) consists of hundreds of continuous narrow bands with high spectral correlation, which would lead to the so-called Hughes phenomenon and the high computational cost in processing. Band selection has been proven effective in avoiding such problems by removing the redundant bands. However, many existing band selection methods separately estimate the significance for every single band and cannot fully consider the nonlinear and global interaction between spectral bands. In this paper, by assuming that a complete HSI can be reconstructed from its few informative bands, we propose a general band selection framework, Band Selection Network (termed as BS-Net). The framework consists of a band attention module (BAM), which aims to explicitly model the nonlinear inter-dependencies between spectral bands, and a reconstruction network (RecNet), which is used to restore the original HSI cube from the learned informative bands, resulting in a flexible architecture. The resulting framework is end-to-end trainable, making it easier to train from scratch and to combine with existing networks. We implement two BS-Nets respectively using fully connected networks (BS-Net-FC) and convolutional neural networks (BS-Net-Conv), and compare the results with many existing band selection approaches for three real hyperspectral images, demonstrating that the proposed BS-Nets can accurately select an informative band subset with less redundancy and achieve significantly better classification performance with an acceptable time cost. |
Tasks | Classification Of Hyperspectral Images, Hyperspectral Image Classification |
Published | 2019-04-17 |
URL | http://arxiv.org/abs/1904.08269v1, http://arxiv.org/pdf/1904.08269v1.pdf |
PWC | https://paperswithcode.com/paper/bs-nets-an-end-to-end-framework-for-band |
Repo | https://github.com/ucalyptus/BS-Nets-Implementation-Pytorch |
Framework | pytorch |
Minimax Confidence Intervals for the Sliced Wasserstein Distance
Title | Minimax Confidence Intervals for the Sliced Wasserstein Distance |
Authors | Tudor Manole, Sivaraman Balakrishnan, Larry Wasserman |
Abstract | The Wasserstein distance has risen in popularity in the statistics and machine learning communities as a useful metric for comparing probability distributions. We study the problem of uncertainty quantification for the Sliced Wasserstein distance, an easily computable approximation of the Wasserstein distance. Specifically, we construct confidence intervals for the Sliced Wasserstein distance which have finite-sample validity under no assumptions or mild moment assumptions, and are adaptive in length to the smoothness of the underlying distributions. We also bound the minimax risk of estimating the Sliced Wasserstein distance, and show that the length of our proposed confidence intervals is minimax optimal over appropriate distribution classes. To motivate the choice of these classes, we also study minimax rates of estimating a distribution under the Sliced Wasserstein distance. These theoretical findings are complemented with a simulation study. |
Tasks | |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07862v1, https://arxiv.org/pdf/1909.07862v1.pdf |
PWC | https://paperswithcode.com/paper/minimax-confidence-intervals-for-the-sliced |
Repo | https://github.com/kaiwenwu96/bib-transformer |
Framework | none |
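The abstract above calls the Sliced Wasserstein distance easily computable; the sketch below shows why, under simplifying assumptions. It is a plain Monte Carlo estimate, not the paper's confidence-interval construction: both samples are assumed to have the same size, so each 1D Wasserstein distance reduces to matching sorted projections, and the average over directions is approximated with `n_projections` random unit vectors. Function names, the seed, and the sample data are illustrative.

```python
import math
import random

def wasserstein_1d(xs, ys, p=2):
    # For equal-size samples, the optimal 1D coupling matches sorted values.
    xs, ys = sorted(xs), sorted(ys)
    return (sum(abs(a - b) ** p for a, b in zip(xs, ys)) / len(xs)) ** (1.0 / p)

def random_direction(dim, rng):
    # A normalised Gaussian vector is uniformly distributed on the unit sphere.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def sliced_wasserstein(xs, ys, n_projections=200, p=2, seed=0):
    # Average the p-th power of 1D distances over random projection directions.
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_direction(dim, rng)
        px = [sum(a * t for a, t in zip(x, theta)) for x in xs]
        py = [sum(a * t for a, t in zip(y, theta)) for y in ys]
        total += wasserstein_1d(px, py, p) ** p
    return (total / n_projections) ** (1.0 / p)

data_rng = random.Random(1)
sample_a = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(100)]
sample_b = [[a + 3.0, b] for a, b in sample_a]  # same cloud shifted by 3 along the first axis
distance = sliced_wasserstein(sample_a, sample_b)
```

For a pure shift of length 3 in two dimensions, the exact value with p = 2 is 3 / sqrt(2), so the estimate should land near 2.12 and can never exceed the shift length itself.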
Relevance Factor VAE: Learning and Identifying Disentangled Factors
Title | Relevance Factor VAE: Learning and Identifying Disentangled Factors |
Authors | Minyoung Kim, Yuting Wang, Pritish Sahu, Vladimir Pavlovic |
Abstract | We propose a novel VAE-based deep auto-encoder model that can learn disentangled latent representations in a fully unsupervised manner, endowed with the ability to identify all meaningful sources of variation and their cardinality. Our model, dubbed Relevance-Factor-VAE, leverages the total correlation (TC) in the latent space to achieve the disentanglement goal, but also addresses the key issue of existing approaches which cannot distinguish between meaningful and nuisance factors of latent variation, often the source of considerable degradation in disentanglement performance. We tackle this issue by introducing the so-called relevance indicator variables that can be automatically learned from data, together with the VAE parameters. Our model effectively focuses the TC loss onto the relevant factors only by tolerating large prior KL divergences, a desideratum justified by our semi-parametric theoretical analysis. Using a suite of disentanglement metrics, including a newly proposed one, as well as qualitative evidence, we demonstrate that our model outperforms existing methods across several challenging benchmark datasets. |
Tasks | |
Published | 2019-02-05 |
URL | http://arxiv.org/abs/1902.01568v1, http://arxiv.org/pdf/1902.01568v1.pdf |
PWC | https://paperswithcode.com/paper/relevance-factor-vae-learning-and-identifying |
Repo | https://github.com/ThomasMrY/RF-VAE |
Framework | pytorch |
Structural Scaffolds for Citation Intent Classification in Scientific Publications
Title | Structural Scaffolds for Citation Intent Classification in Scientific Publications |
Authors | Arman Cohan, Waleed Ammar, Madeleine van Zuylen, Field Cady |
Abstract | Identifying the intent of a citation in scientific papers (e.g., background information, use of methods, comparing results) is critical for machine reading of individual publications and automated analysis of the scientific literature. We propose structural scaffolds, a multitask model to incorporate structural information of scientific papers into citations for effective classification of citation intents. Our model achieves a new state-of-the-art on an existing ACL anthology dataset (ACL-ARC) with a 13.3% absolute increase in F1 score, without relying on external linguistic resources or hand-engineered features as done in existing methods. In addition, we introduce a new dataset of citation intents (SciCite) which is more than five times larger and covers multiple scientific domains compared with existing datasets. Our code and data are available at: https://github.com/allenai/scicite. |
Tasks | Citation Intent Classification, Intent Classification, Reading Comprehension, Sentence Classification |
Published | 2019-04-02 |
URL | https://arxiv.org/abs/1904.01608v2, https://arxiv.org/pdf/1904.01608v2.pdf |
PWC | https://paperswithcode.com/paper/structural-scaffolds-for-citation-intent |
Repo | https://github.com/allenai/scicite |
Framework | none |