February 1, 2020

Paper Group AWR 264

SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)

Title SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)
Authors Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, Ritesh Kumar
Abstract We present the results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval). The task was based on a new dataset, the Offensive Language Identification Dataset (OLID), which contains over 14,000 English tweets. It featured three sub-tasks. In sub-task A, the goal was to discriminate between offensive and non-offensive posts. In sub-task B, the focus was on the type of offensive content in the post. Finally, in sub-task C, systems had to detect the target of the offensive posts. OffensEval attracted a large number of participants and it was one of the most popular tasks in SemEval-2019. In total, about 800 teams signed up to participate in the task, and 115 of them submitted results, which we present and analyze in this report.
Tasks Language Identification
Published 2019-03-19
URL http://arxiv.org/abs/1903.08983v3
PDF http://arxiv.org/pdf/1903.08983v3.pdf
PWC https://paperswithcode.com/paper/semeval-2019-task-6-identifying-and-1
Repo https://github.com/VadymV/OffensEval
Framework tf
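
The three sub-tasks above can be read as a label hierarchy. The sketch below is only an illustration: the abstract names the sub-tasks but does not spell out the nesting, so the rules that sub-task B applies only to offensive posts and sub-task C only to targeted offenses, together with the `IND`/`GRP`/`OTH` target names, are assumptions here. `OffenseLabel` and `make_label` are hypothetical helper names.

```python
from typing import NamedTuple, Optional

# Assumed target labels: individual, group, other.
TARGETS = {"IND", "GRP", "OTH"}


class OffenseLabel(NamedTuple):
    offensive: bool            # sub-task A: offensive vs. non-offensive
    targeted: Optional[bool]   # sub-task B: assumed defined only for offensive posts
    target: Optional[str]      # sub-task C: assumed defined only for targeted offenses


def make_label(offensive, targeted=None, target=None):
    """Build a label, rejecting combinations the assumed hierarchy rules out."""
    if not offensive and (targeted is not None or target is not None):
        raise ValueError("non-offensive posts carry no B or C label")
    if offensive and targeted is None:
        raise ValueError("offensive posts need a sub-task B label")
    if targeted and target not in TARGETS:
        raise ValueError("targeted offenses need a target label")
    if not targeted and target is not None:
        raise ValueError("untargeted posts carry no target")
    return OffenseLabel(offensive, targeted, target)
```

For instance, `make_label(True, True, "GRP")` would describe an offensive post aimed at a group, while `make_label(False)` describes a non-offensive post with no further labels.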

Simultaneous Semantic Segmentation and Outlier Detection in Presence of Domain Shift

Title Simultaneous Semantic Segmentation and Outlier Detection in Presence of Domain Shift
Authors Petra Bevandić, Ivan Krešo, Marin Oršić, Siniša Šegvić
Abstract Recent success on realistic road driving datasets has increased interest in exploring robust performance in real-world applications. One of the major unsolved problems is to identify image content that cannot be reliably recognized with a given inference engine. We therefore study approaches to recover a dense outlier map alongside the primary task with a single forward pass, by relying on shared convolutional features. We consider semantic segmentation as the primary task and perform extensive validation on WildDash val (inliers), LSUN val (outliers), and pasted objects from Pascal VOC 2007 (outliers). We achieve the best validation performance by training to discriminate inliers from pasted ImageNet-1k content, even though ImageNet-1k contains many road-driving pixels, and, at least nominally, fails to account for the full diversity of the visual world. The proposed two-head model performs comparably to the C-way multi-class model trained to predict a uniform distribution in outliers, while outperforming several other validated approaches. We evaluate our best two models on the WildDash test dataset and set a new state of the art on the WildDash benchmark.
Tasks Outlier Detection, Semantic Segmentation
Published 2019-08-03
URL https://arxiv.org/abs/1908.01098v1
PDF https://arxiv.org/pdf/1908.01098v1.pdf
PWC https://paperswithcode.com/paper/simultaneous-semantic-segmentation-and
Repo https://github.com/pb-brainiac/semseg_od
Framework pytorch

Graph Residual Flow for Molecular Graph Generation

Title Graph Residual Flow for Molecular Graph Generation
Authors Shion Honda, Hirotaka Akita, Katsuhiko Ishiguro, Toshiki Nakanishi, Kenta Oono
Abstract Statistical generative models for molecular graphs attract attention from many researchers in the fields of bio- and chemo-informatics. Among these models, invertible flow-based approaches have not yet been fully explored. In this paper, we propose a powerful invertible flow for molecular graphs, called graph residual flow (GRF). The GRF is based on residual flows, which are known for more flexible and complex non-linear mappings than traditional coupling flows. We theoretically derive non-trivial conditions such that GRF is invertible, and present a way of keeping the entire flow invertible throughout training and sampling. Experimental results show that a generative model based on the proposed GRF achieves comparable generation performance with a much smaller number of trainable parameters than the existing flow-based model.
Tasks Graph Generation
Published 2019-09-30
URL https://arxiv.org/abs/1909.13521v1
PDF https://arxiv.org/pdf/1909.13521v1.pdf
PWC https://paperswithcode.com/paper/graph-residual-flow-for-molecular-graph
Repo https://github.com/pfnet-research/chainer-chemistry
Framework none
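
To make the invertibility idea concrete, here is a minimal scalar sketch of a generic residual flow, not the graph-specific construction of the paper. It assumes the standard setting in which a residual map y = x + g(x) is invertible when g is a contraction (Lipschitz constant below 1), and the inverse is found by fixed-point iteration. The branch `g` is a hypothetical stand-in for a learned residual block.

```python
import math


def g(x):
    # Hypothetical residual branch; |g'(x)| <= 0.5 < 1, so g is a contraction.
    return 0.5 * math.tanh(x)


def forward(x):
    # Residual map y = x + g(x).
    return x + g(x)


def inverse(y, iters=50):
    # Fixed-point iteration x <- y - g(x); it converges because g is a contraction.
    x = y
    for _ in range(iters):
        x = y - g(x)
    return x
```

Because the contraction factor is at most 0.5, each iteration at least halves the error, so a few dozen iterations are enough to recover the input to floating-point precision in this toy case.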

Attention Based Glaucoma Detection: A Large-scale Database and CNN Model

Title Attention Based Glaucoma Detection: A Large-scale Database and CNN Model
Authors Liu Li, Mai Xu, Xiaofei Wang, Lai Jiang, Hanruo Liu
Abstract Recently, the attention mechanism has been successfully applied in convolutional neural networks (CNNs), significantly boosting the performance of many computer vision tasks. Unfortunately, few medical image recognition approaches incorporate the attention mechanism in the CNNs. In particular, there exists high redundancy in fundus images for glaucoma detection, such that the attention mechanism has potential in improving the performance of CNN-based glaucoma detection. This paper proposes an attention-based CNN for glaucoma detection (AG-CNN). Specifically, we first establish a large-scale attention based glaucoma (LAG) database, which includes 5,824 fundus images labeled with either positive glaucoma (2,392) or negative glaucoma (3,432). The attention maps of the ophthalmologists are also collected in LAG database through a simulated eye-tracking experiment. Then, a new structure of AG-CNN is designed, including an attention prediction subnet, a pathological area localization subnet and a glaucoma classification subnet. Different from other attention-based CNN methods, the features are also visualized as the localized pathological area, which can advance the performance of glaucoma detection. Finally, the experiment results show that the proposed AG-CNN approach significantly advances state-of-the-art glaucoma detection.
Tasks Eye Tracking
Published 2019-03-26
URL http://arxiv.org/abs/1903.10831v3
PDF http://arxiv.org/pdf/1903.10831v3.pdf
PWC https://paperswithcode.com/paper/attention-based-glaucoma-detection-a-large
Repo https://github.com/smilell/AG-CNN
Framework none

Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages

Title Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages
Authors Shauli Ravfogel, Yoav Goldberg, Tal Linzen
Abstract How do typological properties such as word order and morphological case marking affect the ability of neural sequence models to acquire the syntax of a language? Cross-linguistic comparisons of RNNs’ syntactic performance (e.g., on subject-verb agreement prediction) are complicated by the fact that any two languages differ in multiple typological properties, as well as by differences in training corpus. We propose a paradigm that addresses these issues: we create synthetic versions of English, which differ from English in one or more typological parameters, and generate corpora for those languages based on a parsed English corpus. We report a series of experiments in which RNNs were trained to predict agreement features for verbs in each of those synthetic languages. Among other findings, (1) performance was higher in subject-verb-object order (as in English) than in subject-object-verb order (as in Japanese), suggesting that RNNs have a recency bias; (2) predicting agreement with both subject and object (polypersonal agreement) improves over predicting each separately, suggesting that underlying syntactic knowledge transfers across the two tasks; and (3) overt morphological case makes agreement prediction significantly easier, regardless of word order.
Tasks
Published 2019-03-15
URL http://arxiv.org/abs/1903.06400v2
PDF http://arxiv.org/pdf/1903.06400v2.pdf
PWC https://paperswithcode.com/paper/studying-the-inductive-biases-of-rnns-with
Repo https://github.com/Shaul1321/rnn_typology
Framework none

Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods

Title Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods
Authors Kevin J Liang, Guoyin Wang, Yitong Li, Ricardo Henao, Lawrence Carin
Abstract We investigate time-dependent data analysis from the perspective of recurrent kernel machines, from which models with hidden units and gated memory cells arise naturally. By considering dynamic gating of the memory cell, a model closely related to the long short-term memory (LSTM) recurrent neural network is derived. Extending this setup to $n$-gram filters, the convolutional neural network (CNN), Gated CNN, and recurrent additive network (RAN) are also recovered as special cases. Our analysis provides a new perspective on the LSTM, while also extending it to $n$-gram convolutional filters. Experiments are performed on natural language processing tasks and on analysis of local field potentials (neuroscience). We demonstrate that the variants we derive from kernels perform on par with, or even better than, traditional neural methods. For the neuroscience application, the new models demonstrate significant improvements relative to the prior state of the art.
Tasks
Published 2019-10-09
URL https://arxiv.org/abs/1910.04233v1
PDF https://arxiv.org/pdf/1910.04233v1.pdf
PWC https://paperswithcode.com/paper/kernel-based-approaches-for-sequence-modeling
Repo https://github.com/kevinjliang/kernels2rnns
Framework tf

A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation

Title A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation
Authors Mitsuru Kusumoto, Takuya Inoue, Gentaro Watanabe, Takuya Akiba, Masanori Koyama
Abstract Recomputation algorithms collectively refer to a family of methods that aim to reduce the memory consumption of backpropagation by selectively discarding the intermediate results of the forward propagation and recomputing the discarded results as needed. In this paper, we propose a novel and efficient recomputation method that can be applied to a wider range of neural nets than previous methods. We use the language of graph theory to formalize the general recomputation problem of minimizing the computational overhead under a fixed memory budget constraint, and provide a dynamic programming solution to the problem. Our method can reduce the peak memory consumption on various benchmark networks by 36% to 81%, which outperforms the reduction achieved by other methods.
Tasks
Published 2019-05-28
URL https://arxiv.org/abs/1905.11722v1
PDF https://arxiv.org/pdf/1905.11722v1.pdf
PWC https://paperswithcode.com/paper/a-graph-theoretic-framework-of-recomputation
Repo https://github.com/pfnet-research/recompute
Framework none
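
The trade-off described in the abstract can be illustrated on a plain chain of layers. The sketch below is not the paper's dynamic programming method; it is the simpler fixed-interval checkpointing heuristic, shown only to make "discard and recompute" concrete. The layer `tanh(w * x)`, the function names, and the way memory is counted (number of stored layer inputs, with checkpoints never freed) are all assumptions for illustration.

```python
import math


def layer(w, x):
    return math.tanh(w * x)


def layer_grad(w, x):
    # Derivative of tanh(w * x) with respect to x.
    return w * (1.0 - math.tanh(w * x) ** 2)


def grad_full(ws, x0):
    """Store every layer input. Returns (d output / d x0, stored inputs)."""
    inputs = []
    x = x0
    for w in ws:
        inputs.append(x)
        x = layer(w, x)
    g = 1.0
    for w, xin in zip(reversed(ws), reversed(inputs)):
        g *= layer_grad(w, xin)
    return g, len(inputs)


def grad_checkpointed(ws, x0, k):
    """Store only every k-th layer input and recompute the rest per segment.

    Returns (gradient, peak stored inputs, recomputed layer evaluations).
    """
    checkpoints = {}
    x = x0
    for i, w in enumerate(ws):
        if i % k == 0:
            checkpoints[i] = x
        x = layer(w, x)
    g, peak, recomputed = 1.0, 0, 0
    for start in sorted(checkpoints, reverse=True):
        seg = ws[start:start + k]
        # Rebuild the inputs of this segment from its checkpoint.
        inputs = [checkpoints[start]]
        for w in seg[:-1]:
            inputs.append(layer(w, inputs[-1]))
            recomputed += 1
        # The checkpoint itself is already counted among the checkpoints.
        peak = max(peak, len(checkpoints) + len(inputs) - 1)
        for w, xin in zip(reversed(seg), reversed(inputs)):
            g *= layer_grad(w, xin)
    return g, peak, recomputed
```

On a 16-layer chain with `k=4`, this stores at most 7 layer inputs instead of 16, at the price of 12 extra layer evaluations, and returns the same gradient.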

Optic-Net: A Novel Convolutional Neural Network for Diagnosis of Retinal Diseases from Optical Tomography Images

Title Optic-Net: A Novel Convolutional Neural Network for Diagnosis of Retinal Diseases from Optical Tomography Images
Authors Sharif Amit Kamran, Sourajit Saha, Ali Shihab Sabbir, Alireza Tavakkoli
Abstract Diagnosing different retinal diseases from Spectral Domain Optical Coherence Tomography (SD-OCT) images is a challenging task. Different automated approaches such as image processing, machine learning and deep learning algorithms have been used for early detection and diagnosis of retinal diseases. Unfortunately, these are prone to error and computational inefficiency, which requires further intervention from human experts. In this paper, we propose a novel convolutional neural network architecture to successfully distinguish between different degenerations of the retinal layers and their underlying causes. The proposed architecture outperforms other classification models while addressing the issue of gradient explosion. Our approach reaches near-perfect accuracy of 99.8% and 100% on two separately available retinal SD-OCT datasets, respectively. Additionally, our architecture predicts retinal diseases in real time while outperforming human diagnosticians.
Tasks
Published 2019-10-13
URL https://arxiv.org/abs/1910.05672v1
PDF https://arxiv.org/pdf/1910.05672v1.pdf
PWC https://paperswithcode.com/paper/optic-net-a-novel-convolutional-neural
Repo https://github.com/SharifAmit/OCT_Classification
Framework tf

RLCard: A Toolkit for Reinforcement Learning in Card Games

Title RLCard: A Toolkit for Reinforcement Learning in Card Games
Authors Daochen Zha, Kwei-Herng Lai, Yuanpu Cao, Songyi Huang, Ruzhe Wei, Junyu Guo, Xia Hu
Abstract RLCard is an open-source toolkit for reinforcement learning research in card games. It supports various card environments with easy-to-use interfaces, including Blackjack, Leduc Hold’em, Texas Hold’em, UNO, Dou Dizhu and Mahjong. The goal of RLCard is to bridge reinforcement learning and imperfect information games, and push forward the research of reinforcement learning in domains with multiple agents, large state and action space, and sparse reward. In this paper, we provide an overview of the key components in RLCard, a discussion of the design principles, a brief introduction of the interfaces, and comprehensive evaluations of the environments. The codes and documents are available at https://github.com/datamllab/rlcard
Tasks Board Games, Card Games, Game of Poker, Multi-agent Reinforcement Learning
Published 2019-10-10
URL https://arxiv.org/abs/1910.04376v2
PDF https://arxiv.org/pdf/1910.04376v2.pdf
PWC https://paperswithcode.com/paper/rlcard-a-toolkit-for-reinforcement-learning
Repo https://github.com/datamllab/rlcard
Framework pytorch

Block Coordinate Regularization by Denoising

Title Block Coordinate Regularization by Denoising
Authors Yu Sun, Jiaming Liu, Ulugbek S. Kamilov
Abstract We consider the problem of estimating a vector from its noisy measurements using a prior specified only through a denoising function. Recent work on plug-and-play priors (PnP) and regularization-by-denoising (RED) has shown the state-of-the-art performance of estimators under such priors in a range of imaging tasks. In this work, we develop a new block coordinate RED algorithm that decomposes a large-scale estimation problem into a sequence of updates over a small subset of the unknown variables. We theoretically analyze the convergence of the algorithm and discuss its relationship to the traditional proximal optimization. Our analysis complements and extends recent theoretical results for RED-based estimation methods. We numerically validate our method using several denoiser priors, including those based on convolutional neural network (CNN) denoisers.
Tasks Denoising
Published 2019-05-13
URL https://arxiv.org/abs/1905.05113v2
PDF https://arxiv.org/pdf/1905.05113v2.pdf
PWC https://paperswithcode.com/paper/block-coordinate-regularization-by-denoising
Repo https://github.com/wustl-cig/bcred
Framework tf
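
A toy sketch of a block coordinate RED iteration may help fix ideas. It assumes the commonly used RED operator G(x) = ∇g(x) + τ(x − D(x)) and a block update x_i ← x_i − γ G_i(x) on a randomly chosen block. The data-fidelity term g(x) = ½‖x − y‖², the circular moving-average denoiser, and all parameter values are hypothetical choices for illustration, not those of the paper.

```python
import random


def denoise(x):
    """Hypothetical denoiser D: circular 3-tap moving average."""
    n = len(x)
    return [(x[i - 1] + x[i] + x[(i + 1) % n]) / 3.0 for i in range(n)]


def red_operator(x, y, tau):
    """G(x) = grad g(x) + tau * (x - D(x)), with g(x) = 0.5 * ||x - y||^2."""
    d = denoise(x)
    return [(xi - yi) + tau * (xi - di) for xi, yi, di in zip(x, y, d)]


def bc_red(y, tau=1.0, gamma=0.4, block_size=4, iters=400, seed=0):
    """Update one randomly chosen block of x per iteration."""
    rng = random.Random(seed)
    x = list(y)
    starts = list(range(0, len(x), block_size))
    for _ in range(iters):
        s = rng.choice(starts)
        # Computed in full here for simplicity; only the entries of the
        # selected block are actually used in the update below.
        g = red_operator(x, y, tau)
        for i in range(s, min(s + block_size, len(x))):
            x[i] -= gamma * g[i]
    return x
```

With this linear denoiser the operator is the gradient of a strongly convex quadratic, so the iterates approach a point where G(x) is close to zero; for a learned CNN denoiser no such guarantee follows from this sketch.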

Zoho at SemEval-2019 Task 9: Semi-supervised Domain Adaptation using Tri-training for Suggestion Mining

Title Zoho at SemEval-2019 Task 9: Semi-supervised Domain Adaptation using Tri-training for Suggestion Mining
Authors Sai Prasanna, Sri Ananda Seelan
Abstract This paper describes our submission for the SemEval-2019 Suggestion Mining task. A simple Convolutional Neural Network (CNN) classifier with contextual word representations from a pre-trained language model was used for sentence classification. The model is trained using tri-training, a semi-supervised bootstrapping mechanism for labelling unseen data. Tri-training proved to be an effective technique to accommodate domain shift for cross-domain suggestion mining (Subtask B) where there is no hand labelled training data. For in-domain evaluation (Subtask A), we use the same technique to augment the training set. Our system ranks thirteenth in Subtask A with an $F_1$-score of 68.07 and third in Subtask B with an $F_1$-score of 81.94.
Tasks Domain Adaptation, Language Modelling, Sentence Classification
Published 2019-02-27
URL http://arxiv.org/abs/1902.10623v2
PDF http://arxiv.org/pdf/1902.10623v2.pdf
PWC https://paperswithcode.com/paper/zoho-at-semeval-2019-task-9-semi-supervised
Repo https://github.com/sai-prasanna/suggestion-mining-semeval19
Framework none
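
The tri-training idea mentioned in the abstract can be sketched in a few lines. This is a simplified version under stated assumptions: three models are trained on resampled copies of the labeled data, and each model is retrained with pseudo-labels for the unlabeled points on which the other two agree. The one-dimensional nearest-centroid classifier, the per-class resampling, and the omission of the error-rate checks of the original tri-training algorithm are simplifications, and all function names are hypothetical.

```python
import random


def fit_centroids(samples):
    """samples: list of (x, label) pairs. Returns {label: mean of x}."""
    sums, counts = {}, {}
    for x, lab in samples:
        sums[lab] = sums.get(lab, 0.0) + x
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: sums[lab] / counts[lab] for lab in sums}


def predict(centroids, x):
    return min(centroids, key=lambda lab: abs(centroids[lab] - x))


def bootstrap(labeled, rng):
    # Resample within each class so that no class is lost (a simplification).
    by_label = {}
    for x, lab in labeled:
        by_label.setdefault(lab, []).append((x, lab))
    return [rng.choice(group) for group in by_label.values() for _ in group]


def tri_train(labeled, unlabeled, rounds=3, seed=0):
    rng = random.Random(seed)
    train_sets = [bootstrap(labeled, rng) for _ in range(3)]
    models = [fit_centroids(s) for s in train_sets]
    for _ in range(rounds):
        new_models = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            # Pseudo-label the points on which the other two models agree.
            pseudo = [(x, predict(models[j], x)) for x in unlabeled
                      if predict(models[j], x) == predict(models[k], x)]
            new_models.append(fit_centroids(train_sets[i] + pseudo))
        models = new_models
    return models


def vote(models, x):
    preds = [predict(m, x) for m in models]
    return max(set(preds), key=preds.count)
```

In the cross-domain setting the abstract describes, `labeled` would come from the source domain and `unlabeled` from the target domain, so the pseudo-labels are what lets the models adapt.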

BS-Nets: An End-to-End Framework For Band Selection of Hyperspectral Image

Title BS-Nets: An End-to-End Framework For Band Selection of Hyperspectral Image
Authors Yaoming Cai, Xiaobo Liu, Zhihua Cai
Abstract A hyperspectral image (HSI) consists of hundreds of continuous narrow bands with high spectral correlation, which leads to the so-called Hughes phenomenon and a high computational cost in processing. Band selection has been proven effective in avoiding such problems by removing the redundant bands. However, many existing band selection methods separately estimate the significance of every single band and cannot fully consider the nonlinear and global interaction between spectral bands. In this paper, by assuming that a complete HSI can be reconstructed from its few informative bands, we propose a general band selection framework, Band Selection Network (termed BS-Net). The framework consists of a band attention module (BAM), which aims to explicitly model the nonlinear inter-dependencies between spectral bands, and a reconstruction network (RecNet), which is used to restore the original HSI cube from the learned informative bands, resulting in a flexible architecture. The resulting framework is end-to-end trainable, making it easier to train from scratch and to combine with existing networks. We implement two BS-Nets, using fully connected networks (BS-Net-FC) and convolutional neural networks (BS-Net-Conv) respectively, and compare the results with many existing band selection approaches on three real hyperspectral images, demonstrating that the proposed BS-Nets can accurately select an informative band subset with less redundancy and achieve significantly better classification performance at an acceptable time cost.
Tasks Classification Of Hyperspectral Images, Hyperspectral Image Classification
Published 2019-04-17
URL http://arxiv.org/abs/1904.08269v1
PDF http://arxiv.org/pdf/1904.08269v1.pdf
PWC https://paperswithcode.com/paper/bs-nets-an-end-to-end-framework-for-band
Repo https://github.com/ucalyptus/BS-Nets-Implementation-Pytorch
Framework pytorch

Minimax Confidence Intervals for the Sliced Wasserstein Distance

Title Minimax Confidence Intervals for the Sliced Wasserstein Distance
Authors Tudor Manole, Sivaraman Balakrishnan, Larry Wasserman
Abstract The Wasserstein distance has risen in popularity in the statistics and machine learning communities as a useful metric for comparing probability distributions. We study the problem of uncertainty quantification for the Sliced Wasserstein distance, an easily computable approximation of the Wasserstein distance. Specifically, we construct confidence intervals for the Sliced Wasserstein distance which have finite-sample validity under no assumptions or mild moment assumptions, and are adaptive in length to the smoothness of the underlying distributions. We also bound the minimax risk of estimating the Sliced Wasserstein distance, and show that the length of our proposed confidence intervals is minimax optimal over appropriate distribution classes. To motivate the choice of these classes, we also study minimax rates of estimating a distribution under the Sliced Wasserstein distance. These theoretical findings are complemented with a simulation study.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.07862v1
PDF https://arxiv.org/pdf/1909.07862v1.pdf
PWC https://paperswithcode.com/paper/minimax-confidence-intervals-for-the-sliced
Repo https://github.com/kaiwenwu96/bib-transformer
Framework none
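
To show why the Sliced Wasserstein distance is easy to compute, here is a minimal Monte Carlo plug-in estimate. It assumes two samples of equal size, projects them onto random unit directions, and averages the one-dimensional Wasserstein costs, which reduce to comparing sorted projections. It is only a point estimate; the confidence intervals studied in the paper are not reproduced here, and the function name and defaults are hypothetical.

```python
import math
import random


def sliced_wasserstein(xs, ys, n_proj=200, p=2, seed=0):
    """Monte Carlo estimate of SW_p between two equal-size samples in R^d."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    rng = random.Random(seed)
    d = len(xs[0])
    total = 0.0
    for _ in range(n_proj):
        # Draw a direction uniformly on the unit sphere.
        theta = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(t * t for t in theta))
        theta = [t / norm for t in theta]
        # Project both samples to one dimension and sort.
        px = sorted(sum(a * t for a, t in zip(x, theta)) for x in xs)
        py = sorted(sum(a * t for a, t in zip(y, theta)) for y in ys)
        # In 1D, W_p^p between equal-size samples pairs the order statistics.
        total += sum(abs(a - b) ** p for a, b in zip(px, py)) / len(px)
    return (total / n_proj) ** (1.0 / p)
```

Each projection costs only a sort, which is the source of the computational advantage over the full Wasserstein distance in higher dimensions.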

Relevance Factor VAE: Learning and Identifying Disentangled Factors

Title Relevance Factor VAE: Learning and Identifying Disentangled Factors
Authors Minyoung Kim, Yuting Wang, Pritish Sahu, Vladimir Pavlovic
Abstract We propose a novel VAE-based deep auto-encoder model that can learn disentangled latent representations in a fully unsupervised manner, endowed with the ability to identify all meaningful sources of variation and their cardinality. Our model, dubbed Relevance-Factor-VAE, leverages the total correlation (TC) in the latent space to achieve the disentanglement goal, but also addresses the key issue of existing approaches which cannot distinguish between meaningful and nuisance factors of latent variation, often the source of considerable degradation in disentanglement performance. We tackle this issue by introducing the so-called relevance indicator variables that can be automatically learned from data, together with the VAE parameters. Our model effectively focuses the TC loss onto the relevant factors only by tolerating large prior KL divergences, a desideratum justified by our semi-parametric theoretical analysis. Using a suite of disentanglement metrics, including a newly proposed one, as well as qualitative evidence, we demonstrate that our model outperforms existing methods across several challenging benchmark datasets.
Tasks
Published 2019-02-05
URL http://arxiv.org/abs/1902.01568v1
PDF http://arxiv.org/pdf/1902.01568v1.pdf
PWC https://paperswithcode.com/paper/relevance-factor-vae-learning-and-identifying
Repo https://github.com/ThomasMrY/RF-VAE
Framework pytorch

Structural Scaffolds for Citation Intent Classification in Scientific Publications

Title Structural Scaffolds for Citation Intent Classification in Scientific Publications
Authors Arman Cohan, Waleed Ammar, Madeleine van Zuylen, Field Cady
Abstract Identifying the intent of a citation in scientific papers (e.g., background information, use of methods, comparing results) is critical for machine reading of individual publications and automated analysis of the scientific literature. We propose structural scaffolds, a multitask model to incorporate structural information of scientific papers into citations for effective classification of citation intents. Our model achieves a new state-of-the-art on an existing ACL anthology dataset (ACL-ARC) with a 13.3% absolute increase in F1 score, without relying on external linguistic resources or hand-engineered features as done in existing methods. In addition, we introduce a new dataset of citation intents (SciCite) which is more than five times larger and covers multiple scientific domains compared with existing datasets. Our code and data are available at: https://github.com/allenai/scicite.
Tasks Citation Intent Classification, Intent Classification, Reading Comprehension, Sentence Classification
Published 2019-04-02
URL https://arxiv.org/abs/1904.01608v2
PDF https://arxiv.org/pdf/1904.01608v2.pdf
PWC https://paperswithcode.com/paper/structural-scaffolds-for-citation-intent
Repo https://github.com/allenai/scicite
Framework none