October 20, 2019


Paper Group AWR 223


A linear time method for the detection of point and collective anomalies


Title	A linear time method for the detection of point and collective anomalies
Authors	Alexander T. M. Fisch, Idris A. Eckley, Paul Fearnhead
Abstract	The challenge of efficiently identifying anomalies in data sequences is an important statistical problem that now arises in many applications. Whilst there has been substantial work aimed at making statistical analyses robust to outliers, or point anomalies, there has been much less work on detecting anomalous segments, or collective anomalies, particularly in those settings where point anomalies might also occur. In this article, we introduce Collective And Point Anomalies (CAPA), a computationally efficient approach that is suitable when collective anomalies are characterised by either a change in mean, variance, or both, and distinguishes them from point anomalies. Theoretical results establish the consistency of CAPA at detecting collective anomalies and, as a by-product, the consistency of a popular penalised cost based change in mean and variance detection method. Empirical results show that CAPA has close to linear computational cost as well as being more accurate at detecting and locating collective anomalies than other approaches. We demonstrate the utility of CAPA through its ability to detect exoplanets from light curve data from the Kepler telescope.
Tasks
Published	2018-06-05
URL	http://arxiv.org/abs/1806.01947v2
PDF	http://arxiv.org/pdf/1806.01947v2.pdf
PWC	https://paperswithcode.com/paper/a-linear-time-method-for-the-detection-of
Repo	https://github.com/Fisch-Alex/anomaly
Framework	none
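
The penalised cost idea described in the abstract above can be illustrated with a toy dynamic programme. This is a minimal sketch, not the authors' CAPA implementation (see the Repo link for that). It assumes the typical data have already been standardised to mean 0 and variance 1, it only handles collective anomalies that shift the mean, and it treats a point anomaly as a length-one mean shift. The function name, the penalty values `beta_collective` and `beta_point`, and `min_len` are all hypothetical choices made for illustration. The sketch also runs in quadratic time because it omits the pruning that the paper relies on for its near-linear cost.

```python
def detect_anomalies(x, beta_collective=20.0, beta_point=20.0, min_len=2):
    """Toy penalised-cost detector for point and collective (mean-shift) anomalies.

    Assumes typical observations are N(0, 1). Returns (points, segments) where
    points are indices and segments are half-open (start, end) index pairs.
    """
    n = len(x)
    prefix = [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)

    best = [0.0] * (n + 1)     # best[m]: largest penalised cost saving for x[:m]
    choice = [None] * (n + 1)  # (kind, previous index) used to reach best[m]
    for m in range(1, n + 1):
        # Option 1: x[m-1] is a typical observation, no saving.
        best[m], choice[m] = best[m - 1], ("typical", m - 1)
        # Option 2: x[m-1] is a point anomaly; fitting it exactly saves x^2.
        point = best[m - 1] + x[m - 1] ** 2 - beta_point
        if point > best[m]:
            best[m], choice[m] = point, ("point", m - 1)
        # Option 3: x[k:m] is a collective anomaly with its own fitted mean.
        for k in range(0, m - min_len + 1):
            seg_sum = prefix[m] - prefix[k]
            saving = seg_sum ** 2 / (m - k)
            cand = best[k] + saving - beta_collective
            if cand > best[m]:
                best[m], choice[m] = cand, ("collective", k)

    # Walk back through the recorded choices to recover the anomalies.
    points, segments = [], []
    m = n
    while m > 0:
        kind, k = choice[m]
        if kind == "point":
            points.append(m - 1)
        elif kind == "collective":
            segments.append((k, m))
        m = k
    return sorted(points), sorted(segments)
```

Calling `detect_anomalies` on a standardised series returns the indices flagged as point anomalies and the half-open index ranges flagged as collective anomalies. Larger penalties make the detector more conservative.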

Visual Global Localization with a Hybrid WNN-CNN Approach


Title	Visual Global Localization with a Hybrid WNN-CNN Approach
Authors	Avelino Forechi, Thiago Oliveira-Santos, Claudine Badue, Alberto F. De Souza
Abstract	Currently, self-driving cars rely greatly on the Global Positioning System (GPS) infrastructure, albeit there is an increasing demand for alternative methods for GPS-denied environments. One of them is known as place recognition, which associates images of places with their corresponding positions. We previously proposed systems based on Weightless Neural Networks (WNN) to address this problem as a classification task. This encompasses solely one part of the global localization, which is not precise enough for driverless cars. Instead of just recognizing past places and outputting their poses, it is desired that a global localization system estimates the pose of current place images. In this paper, we propose to tackle this problem as follows. Firstly, given a live image, the place recognition system returns the most similar image and its pose. Then, given live and recollected images, a visual localization system outputs the relative camera pose represented by those images. To estimate the relative camera pose between the recollected and the current images, a Convolutional Neural Network (CNN) is trained with the two images as input and a relative pose vector as output. Together, these systems solve the global localization problem using the topological and metric information to approximate the current vehicle pose. The full approach is compared to a Real-Time Kinematic GPS system and a Simultaneous Localization and Mapping (SLAM) system. Experimental results show that the proposed approach correctly localizes a vehicle 90% of the time with a mean error of 1.20m compared to 1.12m of the SLAM system and 0.37m of the GPS, 89% of the time.
Tasks	Self-Driving Cars, Simultaneous Localization and Mapping, Visual Localization
Published	2018-05-08
URL	http://arxiv.org/abs/1805.03183v2
PDF	http://arxiv.org/pdf/1805.03183v2.pdf
PWC	https://paperswithcode.com/paper/visual-global-localization-with-a-hybrid-wnn
Repo	https://github.com/LCAD-UFES/WNN-CNN-GL
Framework	pytorch
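
The final step described in the abstract above, combining the pose of the recollected image with the estimated relative pose, can be sketched as a rigid-body composition. This is only an illustration under stated assumptions: it treats poses as planar `(x, y, heading)` triples and assumes the relative pose is expressed in the frame of the recollected image. The abstract does not specify the paper's pose representation, and the function name `compose_pose` is hypothetical.

```python
import math


def compose_pose(reference, relative):
    """Compose a global planar pose with a relative pose given in the reference frame.

    Both poses are (x, y, theta) tuples with theta in radians.
    """
    x, y, theta = reference
    dx, dy, dtheta = relative
    # Rotate the relative translation into the global frame, then translate.
    gx = x + dx * math.cos(theta) - dy * math.sin(theta)
    gy = y + dx * math.sin(theta) + dy * math.cos(theta)
    # Wrap the summed heading into (-pi, pi].
    gtheta = math.atan2(math.sin(theta + dtheta), math.cos(theta + dtheta))
    return gx, gy, gtheta
```

For example, a recollected pose at `(1, 2)` facing along the positive y axis, combined with a relative pose of one metre straight ahead, places the vehicle at roughly `(1, 3)` with the same heading.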

Simple Models for Word Formation in English Slang


Title	Simple Models for Word Formation in English Slang
Authors	Vivek Kulkarni, William Yang Wang
Abstract	We propose generative models for three types of extra-grammatical word formation phenomena abounding in English slang: Blends, Clippings, and Reduplicatives. Adopting a data-driven approach coupled with linguistic knowledge, we propose simple models with state of the art performance on human annotated gold standard datasets. Overall, our models reveal insights into the generative processes of word formation in slang – insights which are increasingly relevant in the context of the rising prevalence of slang and non-standard varieties on the Internet.
Tasks
Published	2018-04-07
URL	http://arxiv.org/abs/1804.02596v1
PDF	http://arxiv.org/pdf/1804.02596v1.pdf
PWC	https://paperswithcode.com/paper/simple-models-for-word-formation-in-english
Repo	https://github.com/viveksck/simplicity
Framework	pytorch

Scalable Spectral Clustering Using Random Binning Features


Title	Scalable Spectral Clustering Using Random Binning Features
Authors	Lingfei Wu, Pin-Yu Chen, Ian En-Hsu Yen, Fangli Xu, Yinglong Xia, Charu Aggarwal
Abstract	Spectral clustering is one of the most effective clustering approaches that capture hidden cluster structures in the data. However, it does not scale well to large-scale problems due to its quadratic complexity in constructing similarity graphs and computing subsequent eigendecomposition. Although a number of methods have been proposed to accelerate spectral clustering, most of them compromise considerable information loss in the original data for reducing computational bottlenecks. In this paper, we present a novel scalable spectral clustering method using Random Binning features (RB) to simultaneously accelerate both similarity graph construction and the eigendecomposition. Specifically, we implicitly approximate the graph similarity (kernel) matrix by the inner product of a large sparse feature matrix generated by RB. Then we introduce a state-of-the-art SVD solver to effectively compute eigenvectors of this large matrix for spectral clustering. Using these two building blocks, we reduce the computational cost from quadratic to linear in the number of data points while achieving similar accuracy. Our theoretical analysis shows that spectral clustering via RB converges faster to the exact spectral clustering than the standard Random Feature approximation. Extensive experiments on 8 benchmarks show that the proposed method either outperforms or matches the state-of-the-art methods in both accuracy and runtime. Moreover, our method exhibits linear scalability in both the number of data samples and the number of RB features.
Tasks	graph construction, Graph Similarity
Published	2018-05-25
URL	https://arxiv.org/abs/1805.11048v3
PDF	https://arxiv.org/pdf/1805.11048v3.pdf
PWC	https://paperswithcode.com/paper/scalable-spectral-clustering-using-random
Repo	https://github.com/IBM/SpectralClustering_RandomBinning
Framework	none
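
To make the kernel approximation mentioned in the abstract above more concrete, here is a small sketch of Random Binning features in pure Python. It is not taken from the linked repository. It assumes a Laplacian kernel `exp(-||a - b||_1 / sigma)`, for which the grid widths are drawn from a Gamma distribution with shape 2 and scale `sigma`, and all function names are hypothetical. Each point receives one bin id per random grid, which corresponds to one nonzero entry per grid in a large sparse feature matrix. The fraction of grids in which two points share a bin then estimates their kernel value.

```python
import math
import random


def random_binning_features(points, num_grids, sigma, seed=0):
    """Return, for each point, a list with one hashable bin id per random grid."""
    rng = random.Random(seed)
    dim = len(points[0])
    features = [[] for _ in points]
    for _ in range(num_grids):
        # Laplacian kernel assumption: widths ~ Gamma(shape=2, scale=sigma).
        widths = [rng.gammavariate(2.0, sigma) for _ in range(dim)]
        offsets = [rng.uniform(0.0, w) for w in widths]
        for feats, p in zip(features, points):
            bin_id = tuple(
                math.floor((v - u) / w) for v, u, w in zip(p, offsets, widths)
            )
            feats.append(bin_id)
    return features


def approx_kernel(feats_a, feats_b):
    """Estimate the kernel as the fraction of grids where both points share a bin."""
    same = sum(a == b for a, b in zip(feats_a, feats_b))
    return same / len(feats_a)


def laplacian_kernel(a, b, sigma):
    """Exact Laplacian kernel, used here only to check the approximation."""
    return math.exp(-sum(abs(u - v) for u, v in zip(a, b)) / sigma)
```

With a few thousand grids the estimate from `approx_kernel` should land close to `laplacian_kernel` for the same pair of points. The spectral clustering step itself (the SVD of the sparse feature matrix) is not shown here.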

Scene Text Detection with Supervised Pyramid Context Network


Title	Scene Text Detection with Supervised Pyramid Context Network
Authors	Enze Xie, Yuhang Zang, Shuai Shao, Gang Yu, Cong Yao, Guangyao Li
Abstract	Scene text detection methods based on deep learning have achieved remarkable results over the past years. However, due to the high diversity and complexity of natural scenes, previous state-of-the-art text detection methods may still produce a considerable amount of false positives, when applied to images captured in real-world environments. To tackle this issue, mainly inspired by Mask R-CNN, we propose in this paper an effective model for scene text detection, which is based on Feature Pyramid Network (FPN) and instance segmentation. We propose a supervised pyramid context network (SPCNET) to precisely locate text regions while suppressing false positives. Benefited from the guidance of semantic information and sharing FPN, SPCNET obtains significantly enhanced performance while introducing marginal extra computation. Experiments on standard datasets demonstrate that our SPCNET clearly outperforms state-of-the-art methods. Specifically, it achieves an F-measure of 92.1% on ICDAR2013, 87.2% on ICDAR2015, 74.1% on ICDAR2017 MLT and 82.9% on Total-Text.
Tasks	Instance Segmentation, Scene Text Detection, Semantic Segmentation
Published	2018-11-21
URL	http://arxiv.org/abs/1811.08605v1
PDF	http://arxiv.org/pdf/1811.08605v1.pdf
PWC	https://paperswithcode.com/paper/scene-text-detection-with-supervised-pyramid
Repo	https://github.com/brooklyn1900/SPCNet
Framework	tf

HyperDense-Net: A hyper-densely connected CNN for multi-modal image segmentation


Title	HyperDense-Net: A hyper-densely connected CNN for multi-modal image segmentation
Authors	Jose Dolz, Karthik Gopinath, Jing Yuan, Herve Lombaert, Christian Desrosiers, Ismail Ben Ayed
Abstract	Recently, dense connections have attracted substantial attention in computer vision because they facilitate gradient flow and implicit deep supervision during training. Particularly, DenseNet, which connects each layer to every other layer in a feed-forward fashion, has shown impressive performances in natural image classification tasks. We propose HyperDenseNet, a 3D fully convolutional neural network that extends the definition of dense connectivity to multi-modal segmentation problems. Each imaging modality has a path, and dense connections occur not only between the pairs of layers within the same path, but also between those across different paths. This contrasts with the existing multi-modal CNN approaches, in which modeling several modalities relies entirely on a single joint layer (or level of abstraction) for fusion, typically either at the input or at the output of the network. Therefore, the proposed network has total freedom to learn more complex combinations between the modalities, within and in-between all the levels of abstraction, which increases significantly the learning representation. We report extensive evaluations over two different and highly competitive multi-modal brain tissue segmentation challenges, iSEG 2017 and MRBrainS 2013, with the former focusing on 6-month infant data and the latter on adult images. HyperDenseNet yielded significant improvements over many state-of-the-art segmentation networks, ranking at the top on both benchmarks. We further provide a comprehensive experimental analysis of features re-use, which confirms the importance of hyper-dense connections in multi-modal representation learning. Our code is publicly available at https://www.github.com/josedolz/HyperDenseNet.
Tasks	Brain Segmentation, Image Classification, Medical Image Segmentation, Multi-modal image segmentation, Representation Learning, Semantic Segmentation
Published	2018-04-09
URL	http://arxiv.org/abs/1804.02967v2
PDF	http://arxiv.org/pdf/1804.02967v2.pdf
PWC	https://paperswithcode.com/paper/hyperdense-net-a-hyper-densely-connected-cnn
Repo	https://github.com/josedolz/HyperDenseNet_pytorch
Framework	pytorch

A Two-Stage Method for Text Line Detection in Historical Documents


Title	A Two-Stage Method for Text Line Detection in Historical Documents
Authors	Tobias Grüning, Gundram Leifert, Tobias Strauß, Johannes Michael, Roger Labahn
Abstract	This work presents a two-stage text line detection method for historical documents. Each detected text line is represented by its baseline. In a first stage, a deep neural network called ARU-Net labels pixels to belong to one of the three classes: baseline, separator or other. The separator class marks beginning and end of each text line. The ARU-Net is trainable from scratch with manageably few manually annotated example images (less than 50). This is achieved by utilizing data augmentation strategies. The network predictions are used as input for the second stage which performs a bottom-up clustering to build baselines. The developed method is capable of handling complex layouts as well as curved and arbitrarily oriented text lines. It substantially outperforms current state-of-the-art approaches. For example, for the complex track of the cBAD: ICDAR2017 Competition on Baseline Detection the F-value is increased from 0.859 to 0.922. The framework to train and run the ARU-Net is open source.
Tasks	Data Augmentation
Published	2018-02-09
URL	https://arxiv.org/abs/1802.03345v2
PDF	https://arxiv.org/pdf/1802.03345v2.pdf
PWC	https://paperswithcode.com/paper/a-two-stage-method-for-text-line-detection-in
Repo	https://github.com/TobiasGruening/ARU-Net
Framework	tf

Coarse-Graining Auto-Encoders for Molecular Dynamics


Title	Coarse-Graining Auto-Encoders for Molecular Dynamics
Authors	Wujie Wang, Rafael Gómez-Bombarelli
Abstract	Molecular dynamics simulations provide theoretical insight into the microscopic behavior of materials in condensed phase and, as a predictive tool, enable computational design of new compounds. However, because of the large temporal and spatial scales involved in thermodynamic and kinetic phenomena in materials, atomistic simulations are often computationally unfeasible. Coarse-graining methods allow simulating larger systems, by reducing the dimensionality of the simulation, and propagating longer timesteps, by averaging out fast motions. Coarse-graining involves two coupled learning problems; defining the mapping from an all-atom to a reduced representation, and the parametrization of a Hamiltonian over coarse-grained coordinates. Multiple statistical mechanics approaches have addressed the latter, but the former is generally a hand-tuned process based on chemical intuition. Here we present Autograin, an optimization framework based on auto-encoders to learn both tasks simultaneously. Autograin is trained to learn the optimal mapping between all-atom and reduced representation, using the reconstruction loss to facilitate the learning of coarse-grained variables. In addition, a force-matching method is applied to variationally determine the coarse-grained potential energy function. This procedure is tested on a number of model systems including single-molecule and bulk-phase periodic simulations.
Tasks
Published	2018-12-06
URL	http://arxiv.org/abs/1812.02706v2
PDF	http://arxiv.org/pdf/1812.02706v2.pdf
PWC	https://paperswithcode.com/paper/variational-coarse-graining-for-molecular
Repo	https://github.com/wwang2/Coarse-Graining-Auto-encoders
Framework	pytorch

Provable Robustness of ReLU networks via Maximization of Linear Regions


Title	Provable Robustness of ReLU networks via Maximization of Linear Regions
Authors	Francesco Croce, Maksym Andriushchenko, Matthias Hein
Abstract	It has been shown that neural network classifiers are not robust. This raises concerns about their usage in safety-critical systems. We propose in this paper a regularization scheme for ReLU networks which provably improves the robustness of the classifier by maximizing the linear regions of the classifier as well as the distance to the decision boundary. Our techniques even allow finding the minimal adversarial perturbation for a fraction of test points for large networks. In the experiments we show that our approach improves upon adversarial training both in terms of lower and upper bounds on the robustness and is comparable or better than the state-of-the-art in terms of test error and robustness.
Tasks
Published	2018-10-17
URL	http://arxiv.org/abs/1810.07481v2
PDF	http://arxiv.org/pdf/1810.07481v2.pdf
PWC	https://paperswithcode.com/paper/provable-robustness-of-relu-networks-via
Repo	https://github.com/fra31/mmr-universal
Framework	pytorch

Evidential Deep Learning to Quantify Classification Uncertainty


Title	Evidential Deep Learning to Quantify Classification Uncertainty
Authors	Murat Sensoy, Lance Kaplan, Melih Kandemir
Abstract	Deterministic neural nets have been shown to learn effective predictors on a wide range of machine learning problems. However, as the standard approach is to train the network to minimize a prediction loss, the resultant model remains ignorant to its prediction confidence. Orthogonally to Bayesian neural nets that indirectly infer prediction uncertainty through weight uncertainties, we propose explicit modeling of the same using the theory of subjective logic. By placing a Dirichlet distribution on the class probabilities, we treat predictions of a neural net as subjective opinions and learn the function that collects the evidence leading to these opinions by a deterministic neural net from data. The resultant predictor for a multi-class classification problem is another Dirichlet distribution whose parameters are set by the continuous output of a neural net. We provide a preliminary analysis on how the peculiarities of our new loss function drive improved uncertainty estimation. We observe that our method achieves unprecedented success on detection of out-of-distribution queries and endurance against adversarial perturbations.
Tasks
Published	2018-06-05
URL	http://arxiv.org/abs/1806.01768v3
PDF	http://arxiv.org/pdf/1806.01768v3.pdf
PWC	https://paperswithcode.com/paper/evidential-deep-learning-to-quantify
Repo	https://github.com/atilberk/evidential-deep-learning-to-quantify-classification-uncertainty
Framework	none
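
The abstract above describes treating network outputs as evidence for a Dirichlet distribution over class probabilities. A minimal sketch of that mapping follows. It assumes the usual subjective-logic convention in which the Dirichlet parameters are the per-class evidence plus one, which the abstract itself does not spell out. The function name `dirichlet_opinion` is hypothetical, and the code covers only the conversion from evidence to an opinion, not the network or its loss function.

```python
def dirichlet_opinion(evidence):
    """Map non-negative per-class evidence to a subjective-logic opinion.

    Returns (belief, uncertainty, expected_prob), where the belief masses and
    the uncertainty mass sum to one.
    """
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]    # assumed convention: alpha = evidence + 1
    strength = sum(alpha)                  # Dirichlet strength
    belief = [e / strength for e in evidence]
    uncertainty = num_classes / strength   # mass left unassigned to any class
    expected_prob = [a / strength for a in alpha]
    return belief, uncertainty, expected_prob
```

With zero evidence for every class the uncertainty is 1 and the expected probabilities are uniform. As evidence for one class grows, the uncertainty shrinks and the expected probability concentrates on that class.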

Diverse feature visualizations reveal invariances in early layers of deep neural networks


Title	Diverse feature visualizations reveal invariances in early layers of deep neural networks
Authors	Santiago A. Cadena, Marissa A. Weis, Leon A. Gatys, Matthias Bethge, Alexander S. Ecker
Abstract	Visualizing features in deep neural networks (DNNs) can help in understanding their computations. Many previous studies aimed to visualize the selectivity of individual units by finding meaningful images that maximize their activation. However, comparatively little attention has been paid to visualizing to what image transformations units in DNNs are invariant. Here we propose a method to discover invariances in the responses of hidden layer units of deep neural networks. Our approach is based on simultaneously searching for a batch of images that strongly activate a unit while at the same time being as distinct from each other as possible. We find that even early convolutional layers in VGG-19 exhibit various forms of response invariance: near-perfect phase invariance in some units and invariance to local diffeomorphic transformations in others. At the same time, we uncover representational differences with ResNet-50 in its corresponding layers. We conclude that invariance transformations are a major computational component learned by DNNs and we provide a systematic method to study them.
Tasks
Published	2018-07-27
URL	http://arxiv.org/abs/1807.10589v1
PDF	http://arxiv.org/pdf/1807.10589v1.pdf
PWC	https://paperswithcode.com/paper/diverse-feature-visualizations-reveal
Repo	https://github.com/sacadena/diverse_feature_vis
Framework	tf

NIHRIO at SemEval-2018 Task 3: A Simple and Accurate Neural Network Model for Irony Detection in Twitter


Title	NIHRIO at SemEval-2018 Task 3: A Simple and Accurate Neural Network Model for Irony Detection in Twitter
Authors	Thanh Vu, Dat Quoc Nguyen, Xuan-Son Vu, Dai Quoc Nguyen, Michael Catt, Michael Trenell
Abstract	This paper describes our NIHRIO system for SemEval-2018 Task 3 “Irony detection in English tweets”. We propose to use a simple neural network architecture of Multilayer Perceptron with various types of input features including: lexical, syntactic, semantic and polarity features. Our system achieves very high performance in both subtasks of binary and multi-class irony detection in tweets. In particular, we rank third using the accuracy metric and fifth using the F1 metric. Our code is available at https://github.com/NIHRIO/IronyDetectionInTwitter
Tasks
Published	2018-04-02
URL	http://arxiv.org/abs/1804.00520v2
PDF	http://arxiv.org/pdf/1804.00520v2.pdf
PWC	https://paperswithcode.com/paper/nihrio-at-semeval-2018-task-3-a-simple-and
Repo	https://github.com/NIHRIO/IronyDetectionInTwitter
Framework	tf

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes


Title	Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes
Authors	Pengyuan Lyu, Minghui Liao, Cong Yao, Wenhao Wu, Xiang Bai
Abstract	Recently, models based on deep neural networks have dominated the fields of scene text detection and recognition. In this paper, we investigate the problem of scene text spotting, which aims at simultaneous text detection and recognition in natural images. An end-to-end trainable neural network model for scene text spotting is proposed. The proposed model, named as Mask TextSpotter, is inspired by the newly published work Mask R-CNN. Different from previous methods that also accomplish text spotting with end-to-end trainable deep neural networks, Mask TextSpotter takes advantage of simple and smooth end-to-end learning procedure, in which precise text detection and recognition are acquired via semantic segmentation. Moreover, it is superior to previous methods in handling text instances of irregular shapes, for example, curved text. Experiments on ICDAR2013, ICDAR2015 and Total-Text demonstrate that the proposed method achieves state-of-the-art results in both scene text detection and end-to-end text recognition tasks.
Tasks	Scene Text Detection, Semantic Segmentation, Text Spotting
Published	2018-07-06
URL	http://arxiv.org/abs/1807.02242v2
PDF	http://arxiv.org/pdf/1807.02242v2.pdf
PWC	https://paperswithcode.com/paper/mask-textspotter-an-end-to-end-trainable
Repo	https://github.com/lvpengyuan/masktextspotter.caffe2
Framework	pytorch

Attentive Filtering Networks for Audio Replay Attack Detection


Title	Attentive Filtering Networks for Audio Replay Attack Detection
Authors	Cheng-I Lai, Alberto Abad, Korin Richmond, Junichi Yamagishi, Najim Dehak, Simon King
Abstract	An attacker may use a variety of techniques to fool an automatic speaker verification system into accepting them as a genuine user. Anti-spoofing methods meanwhile aim to make the system robust against such attacks. The ASVspoof 2017 Challenge focused specifically on replay attacks, with the intention of measuring the limits of replay attack detection as well as developing countermeasures against them. In this work, we propose our replay attacks detection system - Attentive Filtering Network, which is composed of an attention-based filtering mechanism that enhances feature representations in both the frequency and time domains, and a ResNet-based classifier. We show that the network enables us to visualize the automatically acquired feature representations that are helpful for spoofing detection. Attentive Filtering Network attains an evaluation EER of 8.99% on the ASVspoof 2017 Version 2.0 dataset. With system fusion, our best system further obtains a 30% relative improvement over the ASVspoof 2017 enhanced baseline system.
Tasks	Speaker Verification
Published	2018-10-31
URL	http://arxiv.org/abs/1810.13048v1
PDF	http://arxiv.org/pdf/1810.13048v1.pdf
PWC	https://paperswithcode.com/paper/attentive-filtering-networks-for-audio-replay
Repo	https://github.com/jefflai108/Attentive-Filtering-Network
Framework	pytorch

Learning Explanations from Language Data


Title	Learning Explanations from Language Data
Authors	David Harbecke, Robert Schwarzenberg, Christoph Alt
Abstract	PatternAttribution is a recent method, introduced in the vision domain, that explains classifications of deep neural networks. We demonstrate that it also generates meaningful interpretations in the language domain.
Tasks
Published	2018-08-13
URL	http://arxiv.org/abs/1808.04127v1
PDF	http://arxiv.org/pdf/1808.04127v1.pdf
PWC	https://paperswithcode.com/paper/learning-explanations-from-language-data
Repo	https://github.com/DFKI-NLP/language-attributions
Framework	pytorch