February 1, 2020


Paper Group AWR 264



Title	SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)
Authors	Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, Ritesh Kumar
Abstract	We present the results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval). The task was based on a new dataset, the Offensive Language Identification Dataset (OLID), which contains over 14,000 English tweets. It featured three sub-tasks. In sub-task A, the goal was to discriminate between offensive and non-offensive posts. In sub-task B, the focus was on the type of offensive content in the post. Finally, in sub-task C, systems had to detect the target of the offensive posts. OffensEval attracted a large number of participants and it was one of the most popular tasks in SemEval-2019. In total, about 800 teams signed up to participate in the task, and 115 of them submitted results, which we present and analyze in this report.
Tasks	Language Identification
Published	2019-03-19
URL	http://arxiv.org/abs/1903.08983v3
PDF	http://arxiv.org/pdf/1903.08983v3.pdf
PWC	https://paperswithcode.com/paper/semeval-2019-task-6-identifying-and-1
Repo	https://github.com/VadymV/OffensEval
Framework	tf

Simultaneous Semantic Segmentation and Outlier Detection in Presence of Domain Shift


Title	Simultaneous Semantic Segmentation and Outlier Detection in Presence of Domain Shift
Authors	Petra Bevandić, Ivan Krešo, Marin Oršić, Siniša Šegvić
Abstract	Recent success on realistic road driving datasets has increased interest in exploring robust performance in real-world applications. One of the major unsolved problems is to identify image content which cannot be reliably recognized with a given inference engine. We therefore study approaches to recover a dense outlier map alongside the primary task with a single forward pass, by relying on shared convolutional features. We consider semantic segmentation as the primary task and perform extensive validation on WildDash val (inliers), LSUN val (outliers), and pasted objects from Pascal VOC 2007 (outliers). We achieve the best validation performance by training to discriminate inliers from pasted ImageNet-1k content, even though ImageNet-1k contains many road-driving pixels, and, at least nominally, fails to account for the full diversity of the visual world. The proposed two-head model performs comparably to the C-way multi-class model trained to predict uniform distribution in outliers, while outperforming several other validated approaches. We evaluate our best two models on the WildDash test dataset and set a new state of the art on the WildDash benchmark.
Tasks	Outlier Detection, Semantic Segmentation
Published	2019-08-03
URL	https://arxiv.org/abs/1908.01098v1
PDF	https://arxiv.org/pdf/1908.01098v1.pdf
PWC	https://paperswithcode.com/paper/simultaneous-semantic-segmentation-and
Repo	https://github.com/pb-brainiac/semseg_od
Framework	pytorch

Graph Residual Flow for Molecular Graph Generation


Title	Graph Residual Flow for Molecular Graph Generation
Authors	Shion Honda, Hirotaka Akita, Katsuhiko Ishiguro, Toshiki Nakanishi, Kenta Oono
Abstract	Statistical generative models for molecular graphs attract attention from many researchers from the fields of bio- and chemo-informatics. Among these models, invertible flow-based approaches have not yet been fully explored. In this paper, we propose a powerful invertible flow for molecular graphs, called graph residual flow (GRF). The GRF is based on residual flows, which are known for more flexible and complex non-linear mappings than traditional coupling flows. We theoretically derive non-trivial conditions such that GRF is invertible, and present a way of keeping the entire flows invertible throughout the training and sampling. Experimental results show that a generative model based on the proposed GRF achieves comparable generation performance, with a much smaller number of trainable parameters compared to the existing flow-based model.
Tasks	Graph Generation
Published	2019-09-30
URL	https://arxiv.org/abs/1909.13521v1
PDF	https://arxiv.org/pdf/1909.13521v1.pdf
PWC	https://paperswithcode.com/paper/graph-residual-flow-for-molecular-graph
Repo	https://github.com/pfnet-research/chainer-chemistry
Framework	none
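
The abstract above relies on residual flows staying invertible. A minimal sketch of that general idea, assuming the standard contraction condition for a residual map y = x + g(x) (g has Lipschitz constant below 1), is shown here. It is not the GRF layer itself, which operates on molecular graphs; the names `make_residual_block`, `forward`, and `inverse` are hypothetical, and the dense `tanh` block stands in for whatever residual function a real model would use.

```python
import numpy as np

def make_residual_block(dim, lipschitz=0.5, seed=0):
    """Build g(x) = tanh(W x) with the spectral norm of W scaled to `lipschitz`.

    tanh is 1-Lipschitz, so g is `lipschitz`-Lipschitz (assumed < 1 here).
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(dim, dim))
    W *= lipschitz / np.linalg.norm(W, 2)  # ord=2 on a matrix is the spectral norm
    return lambda x: np.tanh(W @ x)

def forward(g, x):
    """Residual map y = x + g(x)."""
    return x + g(x)

def inverse(g, y, n_iters=100):
    """Invert y = x + g(x) by the fixed-point iteration x <- y - g(x).

    The iteration converges because g is a contraction.
    """
    x = y.copy()
    for _ in range(n_iters):
        x = y - g(x)
    return x

g = make_residual_block(dim=5)
x = np.random.default_rng(1).normal(size=5)
x_rec = inverse(g, forward(g, x))  # should recover x up to numerical error
```

The error of the fixed-point iteration shrinks by at least the Lipschitz factor per step, which is why keeping that constant below 1 during training matters for sampling.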

Attention Based Glaucoma Detection: A Large-scale Database and CNN Model


Title	Attention Based Glaucoma Detection: A Large-scale Database and CNN Model
Authors	Liu Li, Mai Xu, Xiaofei Wang, Lai Jiang, Hanruo Liu
Abstract	Recently, the attention mechanism has been successfully applied in convolutional neural networks (CNNs), significantly boosting the performance of many computer vision tasks. Unfortunately, few medical image recognition approaches incorporate the attention mechanism in the CNNs. In particular, there exists high redundancy in fundus images for glaucoma detection, such that the attention mechanism has potential in improving the performance of CNN-based glaucoma detection. This paper proposes an attention-based CNN for glaucoma detection (AG-CNN). Specifically, we first establish a large-scale attention based glaucoma (LAG) database, which includes 5,824 fundus images labeled with either positive glaucoma (2,392) or negative glaucoma (3,432). The attention maps of the ophthalmologists are also collected in LAG database through a simulated eye-tracking experiment. Then, a new structure of AG-CNN is designed, including an attention prediction subnet, a pathological area localization subnet and a glaucoma classification subnet. Different from other attention-based CNN methods, the features are also visualized as the localized pathological area, which can advance the performance of glaucoma detection. Finally, the experiment results show that the proposed AG-CNN approach significantly advances state-of-the-art glaucoma detection.
Tasks	Eye Tracking
Published	2019-03-26
URL	http://arxiv.org/abs/1903.10831v3
PDF	http://arxiv.org/pdf/1903.10831v3.pdf
PWC	https://paperswithcode.com/paper/attention-based-glaucoma-detection-a-large
Repo	https://github.com/smilell/AG-CNN
Framework	none

Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages


Title	Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages
Authors	Shauli Ravfogel, Yoav Goldberg, Tal Linzen
Abstract	How do typological properties such as word order and morphological case marking affect the ability of neural sequence models to acquire the syntax of a language? Cross-linguistic comparisons of RNNs’ syntactic performance (e.g., on subject-verb agreement prediction) are complicated by the fact that any two languages differ in multiple typological properties, as well as by differences in training corpus. We propose a paradigm that addresses these issues: we create synthetic versions of English, which differ from English in one or more typological parameters, and generate corpora for those languages based on a parsed English corpus. We report a series of experiments in which RNNs were trained to predict agreement features for verbs in each of those synthetic languages. Among other findings, (1) performance was higher in subject-verb-object order (as in English) than in subject-object-verb order (as in Japanese), suggesting that RNNs have a recency bias; (2) predicting agreement with both subject and object (polypersonal agreement) improves over predicting each separately, suggesting that underlying syntactic knowledge transfers across the two tasks; and (3) overt morphological case makes agreement prediction significantly easier, regardless of word order.
Tasks
Published	2019-03-15
URL	http://arxiv.org/abs/1903.06400v2
PDF	http://arxiv.org/pdf/1903.06400v2.pdf
PWC	https://paperswithcode.com/paper/studying-the-inductive-biases-of-rnns-with
Repo	https://github.com/Shaul1321/rnn_typology
Framework	none

Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods


Title	Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods
Authors	Kevin J Liang, Guoyin Wang, Yitong Li, Ricardo Henao, Lawrence Carin
Abstract	We investigate time-dependent data analysis from the perspective of recurrent kernel machines, from which models with hidden units and gated memory cells arise naturally. By considering dynamic gating of the memory cell, a model closely related to the long short-term memory (LSTM) recurrent neural network is derived. Extending this setup to $n$-gram filters, the convolutional neural network (CNN), Gated CNN, and recurrent additive network (RAN) are also recovered as special cases. Our analysis provides a new perspective on the LSTM, while also extending it to $n$-gram convolutional filters. Experiments are performed on natural language processing tasks and on analysis of local field potentials (neuroscience). We demonstrate that the variants we derive from kernels perform on par or even better than traditional neural methods. For the neuroscience application, the new models demonstrate significant improvements relative to the prior state of the art.
Tasks
Published	2019-10-09
URL	https://arxiv.org/abs/1910.04233v1
PDF	https://arxiv.org/pdf/1910.04233v1.pdf
PWC	https://paperswithcode.com/paper/kernel-based-approaches-for-sequence-modeling
Repo	https://github.com/kevinjliang/kernels2rnns
Framework	tf

A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation


Title	A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation
Authors	Mitsuru Kusumoto, Takuya Inoue, Gentaro Watanabe, Takuya Akiba, Masanori Koyama
Abstract	Recomputation algorithms collectively refer to a family of methods that aim to reduce the memory consumption of backpropagation by selectively discarding the intermediate results of the forward propagation and recomputing the discarded results as needed. In this paper, we propose a novel and efficient recomputation method that can be applied to a wider range of neural nets than previous methods. We use the language of graph theory to formalize the general recomputation problem of minimizing the computational overhead under a fixed memory budget constraint, and provide a dynamic programming solution to the problem. Our method can reduce the peak memory consumption on various benchmark networks by 36% to 81%, which outperforms the reduction achieved by other methods.
Tasks
Published	2019-05-28
URL	https://arxiv.org/abs/1905.11722v1
PDF	https://arxiv.org/pdf/1905.11722v1.pdf
PWC	https://paperswithcode.com/paper/a-graph-theoretic-framework-of-recomputation
Repo	https://github.com/pfnet-research/recompute
Framework	none
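
To make the recomputation idea in this abstract concrete, here is a minimal sketch of fixed-interval checkpointing on a simple chain of layers. It is an illustration under stated assumptions, not the paper's method: the paper formulates the problem on general computation graphs and solves it with dynamic programming, whereas this sketch stores the input of every `segment`-th layer and recomputes the rest. The layer `tanh(w * x)` and all function names are hypothetical.

```python
import numpy as np

def forward_layer(w, x):
    """Toy layer: elementwise tanh(w * x) with a scalar weight w."""
    return np.tanh(w * x)

def backward_layer(w, x, grad_out):
    """Gradient w.r.t. the layer input: d/dx tanh(w x) = w * (1 - tanh(w x)^2)."""
    return grad_out * w * (1.0 - np.tanh(w * x) ** 2)

def grad_with_recompute(weights, x0, segment=2):
    """Gradient of sum(f_L(...f_1(x0))) w.r.t. x0, keeping only checkpointed inputs."""
    # Forward pass: keep the input of layers 0, segment, 2*segment, ... and
    # discard every other intermediate result.
    checkpoints = {}
    x = x0
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints[i] = x
        x = forward_layer(w, x)

    grad = np.ones_like(x)  # gradient of the sum w.r.t. the final output
    # Backward pass: process segments from last to first.
    for start in sorted(checkpoints, reverse=True):
        end = min(start + segment, len(weights))
        # Recompute the discarded layer inputs inside this segment.
        inputs = [checkpoints[start]]
        for w in weights[start:end - 1]:
            inputs.append(forward_layer(w, inputs[-1]))
        # Backpropagate through the segment using the recomputed inputs.
        for w, inp in zip(reversed(weights[start:end]), reversed(inputs)):
            grad = backward_layer(w, inp, grad)
    return grad

weights = [0.5, -1.2, 0.8, 1.5, -0.3]
x0 = np.array([0.1, -0.4, 0.7])
grad = grad_with_recompute(weights, x0, segment=2)
```

A larger `segment` stores fewer activations between the forward and backward passes but holds more recomputed activations at once during the backward pass, which is the kind of memory-versus-overhead trade-off the abstract describes.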

Optic-Net: A Novel Convolutional Neural Network for Diagnosis of Retinal Diseases from Optical Tomography Images


Title	Optic-Net: A Novel Convolutional Neural Network for Diagnosis of Retinal Diseases from Optical Tomography Images
Authors	Sharif Amit Kamran, Sourajit Saha, Ali Shihab Sabbir, Alireza Tavakkoli
Abstract	Diagnosing different retinal diseases from Spectral Domain Optical Coherence Tomography (SD-OCT) images is a challenging task. Different automated approaches such as image processing, machine learning and deep learning algorithms have been used for early detection and diagnosis of retinal diseases. Unfortunately, these are prone to error and computational inefficiency, which requires further intervention from human experts. In this paper, we propose a novel convolutional neural network architecture to successfully distinguish between different degenerations of retinal layers and their underlying causes. The proposed novel architecture outperforms other classification models while addressing the issue of gradient explosion. Our approach reaches near-perfect accuracy of 99.8% and 100% on two separately available retinal SD-OCT data sets, respectively. Additionally, our architecture predicts retinal diseases in real time while outperforming human diagnosticians.
Tasks
Published	2019-10-13
URL	https://arxiv.org/abs/1910.05672v1
PDF	https://arxiv.org/pdf/1910.05672v1.pdf
PWC	https://paperswithcode.com/paper/optic-net-a-novel-convolutional-neural
Repo	https://github.com/SharifAmit/OCT_Classification
Framework	tf

RLCard: A Toolkit for Reinforcement Learning in Card Games


Title	RLCard: A Toolkit for Reinforcement Learning in Card Games
Authors	Daochen Zha, Kwei-Herng Lai, Yuanpu Cao, Songyi Huang, Ruzhe Wei, Junyu Guo, Xia Hu
Abstract	RLCard is an open-source toolkit for reinforcement learning research in card games. It supports various card environments with easy-to-use interfaces, including Blackjack, Leduc Hold’em, Texas Hold’em, UNO, Dou Dizhu and Mahjong. The goal of RLCard is to bridge reinforcement learning and imperfect information games, and push forward the research of reinforcement learning in domains with multiple agents, large state and action space, and sparse reward. In this paper, we provide an overview of the key components in RLCard, a discussion of the design principles, a brief introduction of the interfaces, and comprehensive evaluations of the environments. The code and documentation are available at https://github.com/datamllab/rlcard
Tasks	Board Games, Card Games, Game of Poker, Multi-agent Reinforcement Learning
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04376v2
PDF	https://arxiv.org/pdf/1910.04376v2.pdf
PWC	https://paperswithcode.com/paper/rlcard-a-toolkit-for-reinforcement-learning
Repo	https://github.com/datamllab/rlcard
Framework	pytorch

Block Coordinate Regularization by Denoising


Title	Block Coordinate Regularization by Denoising
Authors	Yu Sun, Jiaming Liu, Ulugbek S. Kamilov
Abstract	We consider the problem of estimating a vector from its noisy measurements using a prior specified only through a denoising function. Recent work on plug-and-play priors (PnP) and regularization-by-denoising (RED) has shown the state-of-the-art performance of estimators under such priors in a range of imaging tasks. In this work, we develop a new block coordinate RED algorithm that decomposes a large-scale estimation problem into a sequence of updates over a small subset of the unknown variables. We theoretically analyze the convergence of the algorithm and discuss its relationship to the traditional proximal optimization. Our analysis complements and extends recent theoretical results for RED-based estimation methods. We numerically validate our method using several denoiser priors, including those based on convolutional neural network (CNN) denoisers.
Tasks	Denoising
Published	2019-05-13
URL	https://arxiv.org/abs/1905.05113v2
PDF	https://arxiv.org/pdf/1905.05113v2.pdf
PWC	https://paperswithcode.com/paper/block-coordinate-regularization-by-denoising
Repo	https://github.com/wustl-cig/bcred
Framework	tf
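
The block coordinate update described in this abstract can be sketched on a toy problem. The sketch assumes the commonly used RED update direction G(x) = ∇g(x) + τ(x − D(x)), a quadratic data term g(x) = ½‖x − y‖², and a simple circular smoothing filter as a stand-in denoiser; none of these choices are taken from the paper's experiments, and the names `smooth_denoiser` and `bc_red` are hypothetical.

```python
import numpy as np

def smooth_denoiser(x):
    """Stand-in denoiser: circular 3-tap smoothing filter (an assumption)."""
    return 0.25 * np.roll(x, 1) + 0.5 * x + 0.25 * np.roll(x, -1)

def bc_red(y, denoiser=smooth_denoiser, tau=1.0, gamma=0.5,
           n_blocks=4, n_iters=2000, seed=0):
    """Randomized block coordinate RED for g(x) = 0.5 * ||x - y||^2."""
    rng = np.random.default_rng(seed)
    x = y.astype(float)  # astype returns a copy, so y is left untouched
    blocks = np.array_split(np.arange(y.size), n_blocks)
    for _ in range(n_iters):
        idx = blocks[rng.integers(n_blocks)]  # pick one block at random
        # For clarity the full direction is computed and then restricted to the
        # block; a practical implementation would only evaluate the block part.
        direction = (x - y) + tau * (x - denoiser(x))
        x[idx] -= gamma * direction[idx]
    return x

y = np.random.default_rng(1).normal(size=32)
x_hat = bc_red(y)
```

With this linear, symmetric denoiser the iteration is randomized block coordinate descent on a strongly convex quadratic, so its fixed point can be checked against a direct linear solve.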

Zoho at SemEval-2019 Task 9: Semi-supervised Domain Adaptation using Tri-training for Suggestion Mining


Title	Zoho at SemEval-2019 Task 9: Semi-supervised Domain Adaptation using Tri-training for Suggestion Mining
Authors	Sai Prasanna, Sri Ananda Seelan
Abstract	This paper describes our submission for the SemEval-2019 Suggestion Mining task. A simple Convolutional Neural Network (CNN) classifier with contextual word representations from a pre-trained language model was used for sentence classification. The model is trained using tri-training, a semi-supervised bootstrapping mechanism for labelling unseen data. Tri-training proved to be an effective technique to accommodate domain shift for cross-domain suggestion mining (Subtask B) where there is no hand-labelled training data. For in-domain evaluation (Subtask A), we use the same technique to augment the training set. Our system ranks thirteenth in Subtask A with an $F_1$-score of 68.07 and third in Subtask B with an $F_1$-score of 81.94.
Tasks	Domain Adaptation, Language Modelling, Sentence Classification
Published	2019-02-27
URL	http://arxiv.org/abs/1902.10623v2
PDF	http://arxiv.org/pdf/1902.10623v2.pdf
PWC	https://paperswithcode.com/paper/zoho-at-semeval-2019-task-9-semi-supervised
Repo	https://github.com/sai-prasanna/suggestion-mining-semeval19
Framework	none
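
A simplified sketch of the tri-training loop mentioned in this abstract follows. It assumes the basic agreement rule (an unlabelled example is pseudo-labelled for one model when the other two models agree on it) and omits the error-rate checks of the original tri-training algorithm. The nearest-centroid classifier is a stand-in for the paper's CNN with contextual embeddings, and all names are hypothetical.

```python
import numpy as np

class NearestCentroid:
    """Tiny stand-in classifier: predict the class with the closest mean."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        dist = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[dist.argmin(axis=1)]

def tri_train(X_lab, y_lab, X_unl, rounds=3, seed=0):
    """Train three models; each learns from examples the other two agree on."""
    rng = np.random.default_rng(seed)
    n = len(X_lab)
    # Initialise the three models on bootstrap samples of the labelled data.
    models = []
    for _ in range(3):
        idx = rng.integers(0, n, size=n)
        models.append(NearestCentroid().fit(X_lab[idx], y_lab[idx]))
    for _ in range(rounds):
        preds = [m.predict(X_unl) for m in models]
        new_models = []
        for i in range(3):
            j, k = [t for t in range(3) if t != i]
            agree = preds[j] == preds[k]  # pseudo-labels for model i
            X_aug = np.vstack([X_lab, X_unl[agree]])
            y_aug = np.concatenate([y_lab, preds[j][agree]])
            new_models.append(NearestCentroid().fit(X_aug, y_aug))
        models = new_models
    return models

def predict_majority(models, X):
    """Majority vote over the models (assumes non-negative integer labels)."""
    votes = np.stack([m.predict(X) for m in models])
    return np.array([np.bincount(col).argmax() for col in votes.T])
```

For the cross-domain setting the abstract describes, `X_unl` would hold unlabelled target-domain examples, so the pseudo-labels are what carry the models across the domain shift.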

BS-Nets: An End-to-End Framework For Band Selection of Hyperspectral Image


Title	BS-Nets: An End-to-End Framework For Band Selection of Hyperspectral Image
Authors	Yaoming Cai, Xiaobo Liu, Zhihua Cai
Abstract	A hyperspectral image (HSI) consists of hundreds of continuous narrow bands with high spectral correlation, which would lead to the so-called Hughes phenomenon and the high computational cost in processing. Band selection has been proven effective in avoiding such problems by removing the redundant bands. However, many existing band selection methods separately estimate the significance for every single band and cannot fully consider the nonlinear and global interaction between spectral bands. In this paper, by assuming that a complete HSI can be reconstructed from its few informative bands, we propose a general band selection framework, Band Selection Network (termed as BS-Net). The framework consists of a band attention module (BAM), which aims to explicitly model the nonlinear inter-dependencies between spectral bands, and a reconstruction network (RecNet), which is used to restore the original HSI cube from the learned informative bands, resulting in a flexible architecture. The resulting framework is end-to-end trainable, making it easier to train from scratch and to combine with existing networks. We implement two BS-Nets respectively using fully connected networks (BS-Net-FC) and convolutional neural networks (BS-Net-Conv), and compare the results with many existing band selection approaches for three real hyperspectral images, demonstrating that the proposed BS-Nets can accurately select an informative band subset with less redundancy and achieve significantly better classification performance with an acceptable time cost.
Tasks	Classification Of Hyperspectral Images, Hyperspectral Image Classification
Published	2019-04-17
URL	http://arxiv.org/abs/1904.08269v1
PDF	http://arxiv.org/pdf/1904.08269v1.pdf
PWC	https://paperswithcode.com/paper/bs-nets-an-end-to-end-framework-for-band
Repo	https://github.com/ucalyptus/BS-Nets-Implementation-Pytorch
Framework	pytorch

Minimax Confidence Intervals for the Sliced Wasserstein Distance


Title	Minimax Confidence Intervals for the Sliced Wasserstein Distance
Authors	Tudor Manole, Sivaraman Balakrishnan, Larry Wasserman
Abstract	The Wasserstein distance has risen in popularity in the statistics and machine learning communities as a useful metric for comparing probability distributions. We study the problem of uncertainty quantification for the Sliced Wasserstein distance, an easily computable approximation of the Wasserstein distance. Specifically, we construct confidence intervals for the Sliced Wasserstein distance which have finite-sample validity under no assumptions or mild moment assumptions, and are adaptive in length to the smoothness of the underlying distributions. We also bound the minimax risk of estimating the Sliced Wasserstein distance, and show that the length of our proposed confidence intervals is minimax optimal over appropriate distribution classes. To motivate the choice of these classes, we also study minimax rates of estimating a distribution under the Sliced Wasserstein distance. These theoretical findings are complemented with a simulation study.
Tasks
Published	2019-09-17
URL	https://arxiv.org/abs/1909.07862v1
PDF	https://arxiv.org/pdf/1909.07862v1.pdf
PWC	https://paperswithcode.com/paper/minimax-confidence-intervals-for-the-sliced
Repo	https://github.com/kaiwenwu96/bib-transformer
Framework	none
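
The abstract calls the Sliced Wasserstein distance easily computable. A minimal sketch of why: project both samples onto random directions and use the closed-form one-dimensional distance between sorted projections. This is a plain Monte Carlo point estimate for equal-size samples, not the confidence-interval construction the paper studies; the function name and its parameters are hypothetical.

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=200, p=2, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-p distance.

    x and y are samples of shape (n, d) with the same n (an assumption of
    this sketch; unequal sizes would need quantile interpolation).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal-size samples")
    rng = np.random.default_rng(seed)
    # Random directions, uniform on the unit sphere.
    theta = rng.normal(size=(n_projections, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project and sort: in one dimension the optimal coupling between two
    # equal-size empirical distributions matches order statistics.
    px = np.sort(x @ theta.T, axis=0)
    py = np.sort(y @ theta.T, axis=0)
    # Average the p-th power cost over samples and directions, then take the root.
    return float(np.mean(np.abs(px - py) ** p) ** (1.0 / p))

rng = np.random.default_rng(0)
a = rng.normal(size=(500, 3))
b = rng.normal(loc=1.0, size=(500, 3))
d = sliced_wasserstein(a, b)
```

Each projection costs one sort, so the estimate scales as O(n log n) per direction rather than requiring a full optimal transport solve.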

Relevance Factor VAE: Learning and Identifying Disentangled Factors


Title	Relevance Factor VAE: Learning and Identifying Disentangled Factors
Authors	Minyoung Kim, Yuting Wang, Pritish Sahu, Vladimir Pavlovic
Abstract	We propose a novel VAE-based deep auto-encoder model that can learn disentangled latent representations in a fully unsupervised manner, endowed with the ability to identify all meaningful sources of variation and their cardinality. Our model, dubbed Relevance-Factor-VAE, leverages the total correlation (TC) in the latent space to achieve the disentanglement goal, but also addresses the key issue of existing approaches which cannot distinguish between meaningful and nuisance factors of latent variation, often the source of considerable degradation in disentanglement performance. We tackle this issue by introducing the so-called relevance indicator variables that can be automatically learned from data, together with the VAE parameters. Our model effectively focuses the TC loss onto the relevant factors only by tolerating large prior KL divergences, a desideratum justified by our semi-parametric theoretical analysis. Using a suite of disentanglement metrics, including a newly proposed one, as well as qualitative evidence, we demonstrate that our model outperforms existing methods across several challenging benchmark datasets.
Tasks
Published	2019-02-05
URL	http://arxiv.org/abs/1902.01568v1
PDF	http://arxiv.org/pdf/1902.01568v1.pdf
PWC	https://paperswithcode.com/paper/relevance-factor-vae-learning-and-identifying
Repo	https://github.com/ThomasMrY/RF-VAE
Framework	pytorch

Structural Scaffolds for Citation Intent Classification in Scientific Publications


Title	Structural Scaffolds for Citation Intent Classification in Scientific Publications
Authors	Arman Cohan, Waleed Ammar, Madeleine van Zuylen, Field Cady
Abstract	Identifying the intent of a citation in scientific papers (e.g., background information, use of methods, comparing results) is critical for machine reading of individual publications and automated analysis of the scientific literature. We propose structural scaffolds, a multitask model to incorporate structural information of scientific papers into citations for effective classification of citation intents. Our model achieves a new state-of-the-art on an existing ACL anthology dataset (ACL-ARC) with a 13.3% absolute increase in F1 score, without relying on external linguistic resources or hand-engineered features as done in existing methods. In addition, we introduce a new dataset of citation intents (SciCite) which is more than five times larger and covers multiple scientific domains compared with existing datasets. Our code and data are available at: https://github.com/allenai/scicite.
Tasks	Citation Intent Classification, Intent Classification, Reading Comprehension, Sentence Classification
Published	2019-04-02
URL	https://arxiv.org/abs/1904.01608v2
PDF	https://arxiv.org/pdf/1904.01608v2.pdf
PWC	https://paperswithcode.com/paper/structural-scaffolds-for-citation-intent
Repo	https://github.com/allenai/scicite
Framework	none