October 20, 2019

3165 words 15 mins read

Paper Group AWR 229

Numerical Coordinate Regression with Convolutional Neural Networks. Constructing Unrestricted Adversarial Examples with Generative Models. PointPillars: Fast Encoders for Object Detection from Point Clouds. Learning to Design RNA. Semantic Adversarial Examples. Classification is a Strong Baseline for Deep Metric Learning. Isospectralization, or how …

Numerical Coordinate Regression with Convolutional Neural Networks

Title Numerical Coordinate Regression with Convolutional Neural Networks
Authors Aiden Nibali, Zhen He, Stuart Morgan, Luke Prendergast
Abstract We study deep learning approaches to inferring numerical coordinates for points of interest in an input image. Existing convolutional neural network-based solutions to this problem either take a heatmap matching approach or regress to coordinates with a fully connected output layer. Neither of these approaches is ideal, since the former is not entirely differentiable, and the latter lacks inherent spatial generalization. We propose our differentiable spatial to numerical transform (DSNT) to fill this gap. The DSNT layer adds no trainable parameters, is fully differentiable, and exhibits good spatial generalization. Unlike heatmap matching, DSNT works well with low heatmap resolutions, so it can be dropped in as an output layer for a wide range of existing fully convolutional architectures. Consequently, DSNT offers a better trade-off between inference speed and prediction accuracy compared to existing techniques. When used to replace the popular heatmap matching approach used in almost all state-of-the-art methods for pose estimation, DSNT gives better prediction accuracy for all model architectures tested.
Tasks Pose Estimation
Published 2018-01-23
URL http://arxiv.org/abs/1801.07372v2
PDF http://arxiv.org/pdf/1801.07372v2.pdf
PWC https://paperswithcode.com/paper/numerical-coordinate-regression-with
Repo https://github.com/anibali/dsntnn
Framework pytorch
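
The core of DSNT is compact enough to sketch directly. Below is a minimal NumPy version of the transform as described in the abstract: softmax-normalize the heatmap into a probability map, then take the probability-weighted mean of a normalized coordinate grid. Names here are illustrative, not the API of the linked dsntnn repo.

```python
import numpy as np

def dsnt(heatmap):
    """Differentiable spatial-to-numerical transform (sketch).

    Normalizes an unconstrained heatmap into a probability map, then
    returns the probability-weighted mean of a normalized (x, y) grid.
    Coordinates lie in [-1, 1], matching the paper's convention.
    """
    h, w = heatmap.shape
    # Softmax over all spatial locations so values sum to 1.
    z = np.exp(heatmap - heatmap.max())
    p = z / z.sum()
    # Coordinate grids: first pixel center near -1, last near +1.
    xs = (2 * np.arange(w) + 1) / w - 1
    ys = (2 * np.arange(h) + 1) / h - 1
    x = (p * xs[None, :]).sum()   # expected x coordinate
    y = (p * ys[:, None]).sum()   # expected y coordinate
    return x, y

# Toy usage: a peak near the top-left yields negative x and y.
hm = np.zeros((32, 32)); hm[4, 4] = 10.0
print(dsnt(hm))
```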

Constructing Unrestricted Adversarial Examples with Generative Models

Title Constructing Unrestricted Adversarial Examples with Generative Models
Authors Yang Song, Rui Shu, Nate Kushman, Stefano Ermon
Abstract Adversarial examples are typically constructed by perturbing an existing data point within a small matrix norm, and current defense methods are focused on guarding against this type of attack. In this paper, we propose unrestricted adversarial examples, a new threat model where the attackers are not restricted to small norm-bounded perturbations. Different from perturbation-based attacks, we propose to synthesize unrestricted adversarial examples entirely from scratch using conditional generative models. Specifically, we first train an Auxiliary Classifier Generative Adversarial Network (AC-GAN) to model the class-conditional distribution over data samples. Then, conditioned on a desired class, we search over the AC-GAN latent space to find images that are likely under the generative model and are misclassified by a target classifier. We demonstrate through human evaluation that unrestricted adversarial examples generated this way are legitimate and belong to the desired class. Our empirical results on the MNIST, SVHN, and CelebA datasets show that unrestricted adversarial examples can bypass strong adversarial training and certified defense methods designed for traditional adversarial attacks.
Tasks
Published 2018-05-21
URL http://arxiv.org/abs/1805.07894v4
PDF http://arxiv.org/pdf/1805.07894v4.pdf
PWC https://paperswithcode.com/paper/constructing-unrestricted-adversarial
Repo https://github.com/ermongroup/generative_adversary
Framework tf
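
A hedged sketch of the latent-space search described above: hold a desired class fixed and optimize the latent code so the target classifier is fooled, with a soft penalty keeping the code near its random initialization (the paper uses a similar closeness constraint). The generator and classifier below are untrained stand-ins; the actual attack searches the latent space of a trained class-conditional AC-GAN and also constrains its auxiliary classifier, both simplified away here.

```python
import torch
import torch.nn.functional as F

# Stand-ins: in the paper G is a trained class-conditional AC-GAN
# generator and f is the target classifier under attack.
G = torch.nn.Sequential(torch.nn.Linear(64, 784), torch.nn.Tanh())
f = torch.nn.Linear(784, 10)

def unrestricted_attack(target_class, steps=200, lr=0.05, lam=0.1):
    z = torch.randn(1, 64, requires_grad=True)
    z0 = z.detach().clone()           # stay near a likely latent region
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        x = G(z)                      # synthesize an image from scratch
        # Push the classifier toward the target label while keeping z
        # close to its initialization under the latent prior.
        loss = F.cross_entropy(f(x), torch.tensor([target_class]))
        loss = loss + lam * (z - z0).pow(2).sum()
        opt.zero_grad(); loss.backward(); opt.step()
    return G(z).detach()

adv = unrestricted_attack(target_class=3)
```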

PointPillars: Fast Encoders for Object Detection from Point Clouds

Title PointPillars: Fast Encoders for Object Detection from Point Clouds
Authors Alex H. Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, Oscar Beijbom
Abstract Object detection in point clouds is an important aspect of many robotics applications such as autonomous driving. In this paper we consider the problem of encoding a point cloud into a format appropriate for a downstream detection pipeline. Recent literature suggests two types of encoders: fixed encoders tend to be fast but sacrifice accuracy, while encoders that are learned from data are more accurate, but slower. In this work we propose PointPillars, a novel encoder which utilizes PointNets to learn a representation of point clouds organized in vertical columns (pillars). While the encoded features can be used with any standard 2D convolutional detection architecture, we further propose a lean downstream network. Extensive experimentation shows that PointPillars outperforms previous encoders with respect to both speed and accuracy by a large margin. Despite only using lidar, our full detection pipeline significantly outperforms the state of the art, even among fusion methods, with respect to both the 3D and bird’s eye view KITTI benchmarks. This detection performance is achieved while running at 62 Hz: a 2-4 fold runtime improvement. A faster version of our method matches the state of the art at 105 Hz. These benchmarks suggest that PointPillars is an appropriate encoding for object detection in point clouds.
Tasks 3D Object Detection, Autonomous Driving, Birds Eye View Object Detection, Object Detection
Published 2018-12-14
URL https://arxiv.org/abs/1812.05784v2
PDF https://arxiv.org/pdf/1812.05784v2.pdf
PWC https://paperswithcode.com/paper/pointpillars-fast-encoders-for-object
Repo https://github.com/SmallMunich/nutonomy_pointpillars
Framework pytorch
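
The pillar encoding itself is easy to sketch. Below is a minimal NumPy version under the paper's stated conventions (0.16 m resolution, 9-dimensional decorated points): bucket points into an x-y grid and decorate each point with offsets to the pillar mean and pillar center. The per-pillar PointNet (shared linear + BN + ReLU followed by a max over points) and the scatter back to a 2D pseudo-image are omitted; names are illustrative.

```python
import numpy as np

def pillarize(points, x_range=(0, 69.12), y_range=(-39.68, 39.68),
              res=0.16, max_pts=32):
    """Group lidar points (N, 4: x, y, z, reflectance) into vertical
    pillars and decorate each point with offsets to the pillar mean
    and to the pillar center (9-dim, as in the paper)."""
    pillars = {}
    for p in points:
        if not (x_range[0] <= p[0] < x_range[1]
                and y_range[0] <= p[1] < y_range[1]):
            continue                          # drop out-of-range points
        ix = int((p[0] - x_range[0]) / res)
        iy = int((p[1] - y_range[0]) / res)
        pillars.setdefault((ix, iy), []).append(p)
    feats, coords = [], []
    for (ix, iy), pts in pillars.items():
        pts = np.asarray(pts[:max_pts])
        mean = pts[:, :3].mean(axis=0)
        cx = x_range[0] + (ix + 0.5) * res
        cy = y_range[0] + (iy + 0.5) * res
        dec = np.hstack([pts,                       # x, y, z, r
                         pts[:, :3] - mean,         # offsets to mean
                         pts[:, :2] - [cx, cy]])    # offsets to center
        pad = np.zeros((max_pts, 9)); pad[:len(dec)] = dec
        feats.append(pad); coords.append((ix, iy))
    return np.stack(feats), coords  # ready for a per-pillar PointNet

pts = np.random.rand(100, 4) * [60, 60, 3, 1] + [0, -30, -1, 0]
feats, coords = pillarize(pts)
```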

Learning to Design RNA

Title Learning to Design RNA
Authors Frederic Runge, Danny Stoll, Stefan Falkner, Frank Hutter
Abstract Designing RNA molecules has garnered recent interest in medicine, synthetic biology, biotechnology and bioinformatics since many functional RNA molecules were shown to be involved in regulatory processes for transcription, epigenetics and translation. Since an RNA’s function depends on its structural properties, the RNA Design problem is to find an RNA sequence which satisfies given structural constraints. Here, we propose a new algorithm for the RNA Design problem, dubbed LEARNA. LEARNA uses deep reinforcement learning to train a policy network to sequentially design an entire RNA sequence given a specified target structure. By meta-learning across 65000 different RNA Design tasks for one hour on 20 CPU cores, our extension Meta-LEARNA constructs an RNA Design policy that can be applied out of the box to solve novel RNA Design tasks. Methodologically, for what we believe to be the first time, we jointly optimize over a rich space of architectures for the policy network, the hyperparameters of the training procedure and the formulation of the decision process. Comprehensive empirical results on two widely-used RNA Design benchmarks, as well as a third one that we introduce, show that our approach achieves new state-of-the-art performance on the former while also being orders of magnitude faster in reaching the previous state-of-the-art performance. In an ablation study, we analyze the importance of our method’s different components.
Tasks Meta-Learning
Published 2018-12-31
URL http://arxiv.org/abs/1812.11951v2
PDF http://arxiv.org/pdf/1812.11951v2.pdf
PWC https://paperswithcode.com/paper/learning-to-design-rna
Repo https://github.com/automl/learna
Framework tf
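
A hedged sketch of the reinforcement-learning loop: a policy emits one nucleotide per position of the target dot-bracket structure and is updated with REINFORCE. The one-layer policy and the Watson-Crick reward below are crude stand-ins; LEARNA's policy is a jointly optimized CNN/RNN, and its reward folds the candidate sequence (e.g., with RNAfold) and measures the distance between the resulting structure and the target.

```python
import torch
import torch.nn.functional as F

NUCS = "ACGU"
SYM = {".": 0, "(": 1, ")": 2}          # local structure symbols

# Stand-in policy: one linear layer over the local structure symbol.
policy = torch.nn.Linear(3, 4)          # symbol -> logits over A,C,G,U

def design(structure):
    logps, seq = [], []
    for ch in structure:                # build the sequence position by position
        x = F.one_hot(torch.tensor(SYM[ch]), 3).float()
        dist = torch.distributions.Categorical(logits=policy(x))
        a = dist.sample()
        logps.append(dist.log_prob(a))
        seq.append(NUCS[int(a)])
    return "".join(seq), torch.stack(logps).sum()

def reward(seq, structure):
    # Crude stand-in: fraction of '('-')' pairs that are Watson-Crick
    # complementary. The paper instead folds `seq` and scores the
    # Hamming distance of its structure to the target.
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    stack, ok, total = [], 0, 0
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop(); total += 1
            ok += comp[seq[j]] == seq[i]
    return ok / max(total, 1)

opt = torch.optim.Adam(policy.parameters(), lr=1e-2)
target = "((....))"
seq, logp = design(target)
loss = -reward(seq, target) * logp      # REINFORCE update
opt.zero_grad(); loss.backward(); opt.step()
```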

Semantic Adversarial Examples

Title Semantic Adversarial Examples
Authors Hossein Hosseini, Radha Poovendran
Abstract Deep neural networks are known to be vulnerable to adversarial examples, i.e., images that are maliciously perturbed to fool the model. Generating adversarial examples has been mostly limited to finding small perturbations that maximize the model prediction error. Such images, however, contain artificial perturbations that make them somewhat distinguishable from natural images. This property is used by several defense methods to counter adversarial examples by applying denoising filters or training the model to be robust to small perturbations. In this paper, we introduce a new class of adversarial examples, namely “Semantic Adversarial Examples,” as images that are arbitrarily perturbed to fool the model, but in such a way that the modified image semantically represents the same object as the original image. We formulate the problem of generating such images as a constrained optimization problem and develop an adversarial transformation based on the shape-bias property of the human cognitive system. In our method, we generate adversarial images by first converting the RGB image into the HSV (Hue, Saturation and Value) color space and then randomly shifting the Hue and Saturation components, while keeping the Value component the same. Our experimental results on the CIFAR10 dataset show that the accuracy of the VGG16 network on adversarial color-shifted images is 5.7%.
Tasks Denoising
Published 2018-03-16
URL http://arxiv.org/abs/1804.00499v1
PDF http://arxiv.org/pdf/1804.00499v1.pdf
PWC https://paperswithcode.com/paper/semantic-adversarial-examples
Repo https://github.com/HosseinHosseini/Semantic-Adversarial-Examples
Framework none
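
The attack itself is a few lines. A minimal sketch using only the standard library's colorsys, assuming a float RGB image in [0, 1]: shift hue (cyclically) and saturation at random while leaving value untouched, optionally retrying until a supplied prediction callback flips. The `predict` hook and trial loop are illustrative conveniences, not the authors' exact interface.

```python
import random
import colorsys
import numpy as np

def hue_saturation_shift(img, trials=1, predict=None):
    """Randomly shift hue and saturation while keeping value fixed.

    `img` is an H x W x 3 float array in [0, 1]. The paper repeats the
    random shift until the target model's prediction flips; here that
    loop is controlled by `trials` and an optional `predict` callback.
    """
    h, w, _ = img.shape
    for _ in range(trials):
        dh = random.random()            # hue is cyclic: shift mod 1
        ds = random.uniform(-1.0, 1.0)  # saturation shift, then clip
        out = np.empty_like(img)
        for i in range(h):
            for j in range(w):
                hh, ss, vv = colorsys.rgb_to_hsv(*img[i, j])
                out[i, j] = colorsys.hsv_to_rgb(
                    (hh + dh) % 1.0, np.clip(ss + ds, 0, 1), vv)
        if predict is None or predict(out) != predict(img):
            return out
    return out
```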

Classification is a Strong Baseline for Deep Metric Learning

Title Classification is a Strong Baseline for Deep Metric Learning
Authors Andrew Zhai, Hao-Yu Wu
Abstract Deep metric learning aims to learn a function mapping image pixels to embedding feature vectors that model the similarity between images. Two major applications of metric learning are content-based image retrieval and face verification. For retrieval tasks, the majority of current state-of-the-art (SOTA) approaches rely on triplet-based non-parametric training. For face verification tasks, however, recent SOTA approaches have adopted classification-based parametric training. In this paper, we look into the effectiveness of classification-based approaches on image retrieval datasets. We evaluate on several standard retrieval datasets (CAR-196, CUB-200-2011, Stanford Online Products, and In-Shop) for image retrieval and clustering, and establish that our classification-based approach is competitive across different feature dimensions and base feature networks. We further provide insights into the performance effects of subsampling classes for scalable classification-based training, and the effects of binarization, enabling efficient storage and computation for practical applications.
Tasks Content-Based Image Retrieval, Face Verification, Image Retrieval, Metric Learning
Published 2018-11-30
URL https://arxiv.org/abs/1811.12649v2
PDF https://arxiv.org/pdf/1811.12649v2.pdf
PWC https://paperswithcode.com/paper/making-classification-competitive-for-deep
Repo https://github.com/microsoft/computervision-recipes
Framework pytorch
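
One common concrete instantiation of such classification-based training is a normalized (cosine) softmax layer, sketched below in PyTorch: unit-normalize both the embeddings and the class weights so the cross-entropy objective directly shapes a cosine-similarity embedding space. Treat this as a representative sketch of the approach family, not a line-for-line extract of the linked repo.

```python
import torch
import torch.nn.functional as F

class NormSoftmax(torch.nn.Module):
    """L2-normalized softmax classifier: embeddings and class weights
    are unit-length, so training a plain classifier directly shapes a
    cosine-similarity embedding space usable for retrieval."""
    def __init__(self, dim, n_classes, temperature=0.05):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(n_classes, dim))
        self.t = temperature

    def forward(self, emb, labels):
        emb = F.normalize(emb, dim=1)
        w = F.normalize(self.weight, dim=1)
        logits = emb @ w.T / self.t     # scaled cosine similarities
        return F.cross_entropy(logits, labels)

# At test time, discard the classifier head and rank gallery images by
# cosine similarity of their (optionally binarized) embeddings.
```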

Isospectralization, or how to hear shape, style, and correspondence

Title Isospectralization, or how to hear shape, style, and correspondence
Authors Luca Cosmo, Mikhail Panine, Arianna Rampini, Maks Ovsjanikov, Michael M. Bronstein, Emanuele Rodolà
Abstract The question of whether one can recover the shape of a geometric object from its Laplacian spectrum (‘hear the shape of the drum’) is a classical problem in spectral geometry with a broad range of implications and applications. While theoretically the answer to this question is negative (there exist examples of iso-spectral but non-isometric manifolds), little is known about the practical possibility of using the spectrum for shape reconstruction and optimization. In this paper, we introduce a numerical procedure called isospectralization, consisting of deforming one shape to make its Laplacian spectrum match that of another. We implement the isospectralization procedure using modern differentiable programming techniques and exemplify its applications in some of the classical and notoriously hard problems in geometry processing, computer vision, and graphics such as shape reconstruction, pose and style transfer, and dense deformable correspondence.
Tasks Style Transfer
Published 2018-11-28
URL http://arxiv.org/abs/1811.11465v2
PDF http://arxiv.org/pdf/1811.11465v2.pdf
PWC https://paperswithcode.com/paper/isospectralization-or-how-to-hear-shape-style
Repo https://github.com/lcosmo/isospectralization
Framework tf
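
A toy version of isospectralization, sketched in PyTorch on graph Laplacians rather than mesh Laplacians: since the eigendecomposition is differentiable, one can gradient-descend on the available degrees of freedom (here, edge weights) until the spectrum matches a target. The mesh setting, regularizers, and initialization tricks from the paper are omitted.

```python
import torch

# Toy isospectralization: deform a weighted graph Laplacian until its
# spectrum matches a target spectrum. The paper does this for mesh
# Laplacians by moving vertex positions; the differentiable
# eigendecomposition is the shared ingredient.
n = 8
w = torch.rand(n, n, requires_grad=True)

def laplacian(w):
    a = (w + w.T).abs() * (1 - torch.eye(n))  # symmetric, no self-loops
    return torch.diag(a.sum(1)) - a

target = torch.linalg.eigvalsh(laplacian(torch.rand(n, n))).detach()

opt = torch.optim.Adam([w], lr=0.05)
for step in range(500):
    evals = torch.linalg.eigvalsh(laplacian(w))  # differentiable
    loss = (evals - target).pow(2).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```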

Accelerating the Evolution of Convolutional Neural Networks with Node-Level Mutations and Epigenetic Weight Initialization

Title Accelerating the Evolution of Convolutional Neural Networks with Node-Level Mutations and Epigenetic Weight Initialization
Authors Travis Desell
Abstract This paper examines three generic strategies for improving the performance of neuro-evolution techniques aimed at evolving convolutional neural networks (CNNs). These were implemented as part of the Evolutionary eXploration of Augmenting Convolutional Topologies (EXACT) algorithm. EXACT evolves arbitrary convolutional neural networks (CNNs) with the goals of better discovering and understanding new, effective CNN architectures for machine learning tasks and of potentially automating the process of network design and selection. The strategies examined are node-level mutation operations, epigenetic weight initialization and pooling connections. Results were gathered over the period of a month using a volunteer computing project, where over 225,000 CNNs were trained and evaluated across 16 different EXACT searches. The node mutation operations were shown to dramatically improve evolution rates over traditional edge mutation operations (as used by the NEAT algorithm), and epigenetic weight initialization was shown to further increase the accuracy and generalizability of the trained CNNs. As a negative but interesting result, allowing for pooling connections was shown to degrade the evolution progress. The best trained CNNs reached 99.46% accuracy on the MNIST test data in under 13,500 CNN evaluations – accuracy comparable with some of the best human-designed CNNs.
Tasks
Published 2018-11-17
URL http://arxiv.org/abs/1811.08286v1
PDF http://arxiv.org/pdf/1811.08286v1.pdf
PWC https://paperswithcode.com/paper/accelerating-the-evolution-of-convolutional
Repo https://github.com/travisdesell/exact
Framework none
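
The epigenetic idea is simple to sketch: after a structural mutation, the child inherits the parent's trained weights everywhere, and only the structure introduced by the mutation is freshly initialized. The genome representation below is a deliberately simplified stand-in for EXACT's CNN genomes.

```python
import copy
import random

def add_node_mutation(parent):
    """Node-level mutation with epigenetic weight initialization (sketch).

    `parent` is a genome: {"nodes": [...], "edges": {(src, dst): weight}}.
    The child keeps all trained parent weights; only the connections
    touching the newly inserted node start from fresh random values.
    """
    child = copy.deepcopy(parent)       # epigenetic: weights carry over
    new = max(child["nodes"]) + 1
    child["nodes"].append(new)
    src, dst = random.sample(parent["nodes"], 2)
    child["edges"][(src, new)] = random.gauss(0, 0.1)  # only new weights
    child["edges"][(new, dst)] = random.gauss(0, 0.1)  # are re-initialized
    return child

genome = {"nodes": [0, 1, 2], "edges": {(0, 1): 0.5, (1, 2): -0.3}}
child = add_node_mutation(genome)
```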

Modeling Diverse Relevance Patterns in Ad-hoc Retrieval

Title Modeling Diverse Relevance Patterns in Ad-hoc Retrieval
Authors Yixing Fan, Jiafeng Guo, Yanyan Lan, Jun Xu, Chengxiang Zhai, Xueqi Cheng
Abstract Assessing relevance between a query and a document is challenging in ad-hoc retrieval due to its diverse patterns, i.e., a document could be relevant to a query as a whole or partially, as long as it provides sufficient information for users’ needs. Such diverse relevance patterns require an ideal retrieval model to be able to assess relevance at the right granularity adaptively. Unfortunately, most existing retrieval models compute relevance at a single granularity, either document-wide or passage-level, or use a fixed combination strategy, restricting their ability to capture diverse relevance patterns. In this work, we propose a data-driven method to allow relevance signals at different granularities to compete with each other for final relevance assessment. Specifically, we propose a HIerarchical Neural maTching model (HiNT) which consists of two stacked components, namely a local matching layer and a global decision layer. The local matching layer focuses on producing a set of local relevance signals by modeling the semantic matching between a query and each passage of a document. The global decision layer accumulates local signals into different granularities and allows them to compete with each other to decide the final relevance score. Experimental results demonstrate that our HiNT model outperforms existing state-of-the-art retrieval models significantly on benchmark ad-hoc retrieval datasets.
Tasks
Published 2018-05-15
URL https://arxiv.org/abs/1805.05737v1
PDF https://arxiv.org/pdf/1805.05737v1.pdf
PWC https://paperswithcode.com/paper/modeling-diverse-relevance-patterns-in-ad-hoc-1
Repo https://github.com/faneshion/HiNT
Framework none
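
A deliberately simplified sketch of the two-layer structure, assuming precomputed term embeddings: a per-passage matching matrix yields one local relevance signal per passage, and the strongest signals then compete via k-max pooling for the document score. HiNT's actual layers are learned (spatial GRUs and hybrid pooling) rather than these fixed reductions.

```python
import numpy as np

def hint_score(query_vecs, passages, k=3):
    """Simplified HiNT-style scoring (sketch, not the full model).

    Local matching layer: a cosine-similarity matrix between the query
    terms and each passage yields one local relevance signal.
    Global decision layer: the strongest signals compete via k-max
    pooling to form the document score.
    """
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    signals = []
    for pv in passages:                  # pv: (num_terms, dim)
        p = pv / np.linalg.norm(pv, axis=1, keepdims=True)
        sim = q @ p.T                    # term-level matching matrix
        signals.append(sim.max())        # best local match in passage
    top = np.sort(signals)[-k:]          # local signals compete
    return float(np.mean(top))
```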

Spatially Controllable Image Synthesis with Internal Representation Collaging

Title Spatially Controllable Image Synthesis with Internal Representation Collaging
Authors Ryohei Suzuki, Masanori Koyama, Takeru Miyato, Taizan Yonetsuji, Huachun Zhu
Abstract We present a novel CNN-based image editing strategy that allows the user to change the semantic information of an image over an arbitrary region by manipulating the feature-space representation of the image in a trained GAN model. We will present two variants of our strategy: (1) spatial conditional batch normalization (sCBN), a type of conditional batch normalization with user-specifiable spatial weight maps, and (2) feature-blending, a method of directly modifying the intermediate features. Our methods can be used to edit both artificial and real images, and both can be combined with any GAN that has conditional normalization layers. We will demonstrate the power of our method through experiments on various types of GANs trained on different datasets. Code will be available at https://github.com/pfnet-research/neural-collage.
Tasks Image Generation
Published 2018-11-26
URL http://arxiv.org/abs/1811.10153v2
PDF http://arxiv.org/pdf/1811.10153v2.pdf
PWC https://paperswithcode.com/paper/collaging-on-internal-representations-an
Repo https://github.com/quolc/neural-collage
Framework none
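
Of the two variants, sCBN is the easier to sketch. Below, per-pixel user weight maps blend the class-conditional scale and shift parameters of a conditional batch-norm layer, so different regions of one image follow different classes. Tensor shapes are stated in the docstring; the parameter names are illustrative.

```python
import torch

def spatial_cbn(x, gamma, beta, weight_maps, eps=1e-5):
    """Spatial conditional batch normalization (sketch).

    x:            (N, C, H, W) feature maps
    gamma, beta:  (K, C) per-class affine parameters from a trained
                  conditional-BN GAN
    weight_maps:  (N, K, H, W) user-specified spatial class weights
                  (summing to 1 over K at each pixel)

    Each pixel is normalized as in batch norm, then modulated by a
    per-pixel blend of the class-conditional scales and shifts.
    """
    mean = x.mean(dim=(0, 2, 3), keepdim=True)
    var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
    xn = (x - mean) / torch.sqrt(var + eps)
    # Blend class parameters pixelwise: (N,K,H,W) x (K,C) -> (N,C,H,W)
    g = torch.einsum("nkhw,kc->nchw", weight_maps, gamma)
    b = torch.einsum("nkhw,kc->nchw", weight_maps, beta)
    return g * xn + b
```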

A Dataset To Evaluate The Representations Learned By Video Prediction Models

Title A Dataset To Evaluate The Representations Learned By Video Prediction Models
Authors Ryan Szeto, Simon Stent, German Ros, Jason J. Corso
Abstract We present a parameterized synthetic dataset called Moving Symbols to support the objective study of video prediction networks. Using several instantiations of the dataset in which variation is explicitly controlled, we highlight issues in an existing state-of-the-art approach and propose the use of a performance metric with greater semantic meaning to improve experimental interpretability. Our dataset provides canonical test cases that will help the community better understand, and eventually improve, the representations learned by such networks in the future. Code is available at https://github.com/rszeto/moving-symbols .
Tasks Video Prediction
Published 2018-02-25
URL http://arxiv.org/abs/1802.08936v3
PDF http://arxiv.org/pdf/1802.08936v3.pdf
PWC https://paperswithcode.com/paper/a-dataset-to-evaluate-the-representations
Repo https://github.com/rszeto/moving-symbols
Framework none
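
A toy generator in the dataset's spirit: every factor of variation (speed, direction) is an explicit parameter, which is what makes controlled test cases possible. The real dataset also varies symbol appearance, scale, and rotation; this sketch only translates a symbol and bounces it off the frame borders.

```python
import numpy as np

def moving_symbol_video(symbol, frames=20, size=64, speed=2.0, angle=0.0):
    """Render a toy Moving-Symbols-style clip with controlled motion."""
    sh, sw = symbol.shape
    vy, vx = speed * np.sin(angle), speed * np.cos(angle)
    y, x = (size - sh) / 2.0, (size - sw) / 2.0
    video = np.zeros((frames, size, size), dtype=float)
    for t in range(frames):
        iy = int(np.clip(round(y), 0, size - sh))
        ix = int(np.clip(round(x), 0, size - sw))
        video[t, iy:iy + sh, ix:ix + sw] = symbol
        y += vy; x += vx
        if not 0 <= y <= size - sh: vy = -vy   # bounce off the border
        if not 0 <= x <= size - sw: vx = -vx
    return video

clip = moving_symbol_video(np.ones((8, 8)), speed=3, angle=np.pi / 4)
```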

GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models

Title GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models
Authors Jiaxuan You, Rex Ying, Xiang Ren, William L. Hamilton, Jure Leskovec
Abstract Modeling and generating graphs is fundamental for studying networks in biology, engineering, and social sciences. However, modeling complex distributions over graphs and then efficiently sampling from these distributions is challenging due to the non-unique, high-dimensional nature of graphs and the complex, non-local dependencies that exist between edges in a given graph. Here we propose GraphRNN, a deep autoregressive model that addresses the above challenges and approximates any distribution of graphs with minimal assumptions about their structure. GraphRNN learns to generate graphs by training on a representative set of graphs and decomposes the graph generation process into a sequence of node and edge formations, conditioned on the graph structure generated so far. In order to quantitatively evaluate the performance of GraphRNN, we introduce a benchmark suite of datasets, baselines and novel evaluation metrics based on Maximum Mean Discrepancy, which measure distances between sets of graphs. Our experiments show that GraphRNN significantly outperforms all baselines, learning to generate diverse graphs that match the structural characteristics of a target set, while also scaling to graphs 50 times larger than previous deep models.
Tasks Graph Generation
Published 2018-02-24
URL http://arxiv.org/abs/1802.08773v3
PDF http://arxiv.org/pdf/1802.08773v3.pdf
PWC https://paperswithcode.com/paper/graphrnn-generating-realistic-graphs-with
Repo https://github.com/snap-stanford/GraphRNN
Framework pytorch
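
The MMD-based evaluation is easy to sketch. Below is a squared MMD between two sets of graph descriptors (e.g., degree histograms) with a Gaussian kernel; the paper derives its kernels from Earth Mover's Distance, so treat the kernel choice here as a simplification.

```python
import numpy as np

def gaussian_mmd(samples_a, samples_b, sigma=1.0):
    """Squared MMD between two sets of graph descriptors, the kind of
    set-level metric used to compare generated and reference graphs.
    Inputs are arrays of shape (n, d) and (m, d)."""
    def k(x, y):
        d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    a = np.asarray(samples_a, float)
    b = np.asarray(samples_b, float)
    return k(a, a).mean() + k(b, b).mean() - 2 * k(a, b).mean()

# Toy usage: compare degree histograms of generated vs. reference graphs.
ref = np.random.rand(10, 5)   # stand-ins for real descriptor sets
gen = np.random.rand(12, 5)
print(gaussian_mmd(ref, gen))
```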

PixelLink: Detecting Scene Text via Instance Segmentation

Title PixelLink: Detecting Scene Text via Instance Segmentation
Authors Dan Deng, Haifeng Liu, Xuelong Li, Deng Cai
Abstract Most state-of-the-art scene text detection algorithms are deep learning based methods that depend on bounding box regression and perform at least two kinds of predictions: text/non-text classification and location regression. Regression plays a key role in the acquisition of bounding boxes in these methods, but it is not indispensable because text/non-text prediction can also be considered as a kind of semantic segmentation that contains full location information in itself. However, text instances in scene images often lie very close to each other, making them very difficult to separate via semantic segmentation. Therefore, instance segmentation is needed to address this problem. In this paper, PixelLink, a novel scene text detection algorithm based on instance segmentation, is proposed. Text instances are first segmented out by linking pixels within the same instance together. Text bounding boxes are then extracted directly from the segmentation result without location regression. Experiments show that, compared with regression-based methods, PixelLink can achieve better or comparable performance on several benchmarks, while requiring many fewer training iterations and less training data.
Tasks Instance Segmentation, Scene Text Detection, Semantic Segmentation, Text Classification
Published 2018-01-04
URL http://arxiv.org/abs/1801.01315v1
PDF http://arxiv.org/pdf/1801.01315v1.pdf
PWC https://paperswithcode.com/paper/pixellink-detecting-scene-text-via-instance
Repo https://github.com/opconty/pixellink_keras
Framework tf
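
The linking step is essentially union-find over positive pixels, sketched below: a pixel joins a neighbor's component when both are predicted as text and the corresponding link score passes a threshold. Extracting rotated boxes from the resulting components (e.g., via minAreaRect) is omitted; the array layouts are assumptions.

```python
import numpy as np

def link_pixels(text_mask, link_maps, threshold=0.5):
    """Group positive text pixels into instances via union-find (sketch).

    text_mask: (H, W) boolean text/non-text prediction.
    link_maps: (H, W, 8) link scores toward the 8 neighbors; a link is
    taken when both endpoints are text pixels and the score passes the
    threshold, as in the paper's post-processing.
    """
    parent = {p: p for p in zip(*np.nonzero(text_mask))}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]    # path compression
            p = parent[p]
        return p

    nbrs = [(-1,-1),(-1,0),(-1,1),(0,-1),(0,1),(1,-1),(1,0),(1,1)]
    for (y, x) in list(parent):
        for k, (dy, dx) in enumerate(nbrs):
            q = (y + dy, x + dx)
            if q in parent and link_maps[y, x, k] > threshold:
                parent[find((y, x))] = find(q)   # union the components
    labels = {}
    return {p: labels.setdefault(find(p), len(labels)) for p in parent}
```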

Training Domain Specific Models for Energy-Efficient Object Detection

Title Training Domain Specific Models for Energy-Efficient Object Detection
Authors Kentaro Yoshioka, Edward Lee, Mark Horowitz
Abstract We propose an end-to-end framework for training domain specific models (DSMs) to obtain both high accuracy and computational efficiency for object detection tasks. DSMs are trained with distillation (Hinton et al., 2015) and focus on achieving high accuracy at a limited domain (e.g. fixed view of an intersection). We argue that DSMs can capture essential features well even with a small model size, enabling higher accuracy and efficiency than traditional techniques. In addition, we improve the training efficiency by reducing the dataset size by culling easy-to-classify images from the training set. For the limited domain, we observed that compact DSMs significantly surpass the accuracy of COCO trained models of the same size. By training on a compact dataset, we show that with an accuracy drop of only 3.6%, the training time can be reduced by 93%. The code is available at https://github.com/kentaroy47/training-domain-specific-models.
Tasks Object Detection
Published 2018-11-06
URL http://arxiv.org/abs/1811.02689v2
PDF http://arxiv.org/pdf/1811.02689v2.pdf
PWC https://paperswithcode.com/paper/training-domain-specific-models-for-energy
Repo https://github.com/kentaroy47/training-domain-specific-models
Framework pytorch
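
Both ingredients are small enough to sketch in PyTorch. The distillation term is the standard temperature-softened KL of Hinton et al.; the culling rule below (drop frames the student already gets right and the teacher is confident about) is a simplification of the paper's confidence-based criterion.

```python
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, T=4.0):
    """Soft-label distillation loss: KL divergence between
    temperature-softened teacher and student distributions."""
    s = F.log_softmax(student_logits / T, dim=1)
    t = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(s, t, reduction="batchmean") * T * T

def cull_easy(images, teacher_conf, student_correct, conf_thresh=0.9):
    """Dataset culling (sketch): drop frames the compact model already
    handles and the teacher is confident about, shrinking training time."""
    return [img for img, c, ok in zip(images, teacher_conf, student_correct)
            if not (ok and c > conf_thresh)]
```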

INFERNO: Inference-Aware Neural Optimisation

Title INFERNO: Inference-Aware Neural Optimisation
Authors Pablo de Castro, Tommaso Dorigo
Abstract Complex computer simulations are commonly required for accurate data modelling in many scientific disciplines, making statistical inference challenging due to the intractability of the likelihood evaluation for the observed data. Furthermore, sometimes one is interested in inference over a subset of the generative model parameters while taking into account model uncertainty or misspecification of the remaining nuisance parameters. In this work, we show how non-linear summary statistics can be constructed by minimising inference-motivated losses via stochastic gradient descent such that they provide the smallest uncertainty for the parameters of interest. As a use case, the problem of confidence interval estimation for the mixture coefficient in a multi-dimensional two-component mixture model (i.e. signal vs background) is considered, where the proposed technique clearly outperforms summary statistics based on probabilistic classification, which are a commonly used alternative but do not account for the presence of nuisance parameters.
Tasks
Published 2018-06-12
URL http://arxiv.org/abs/1806.04743v2
PDF http://arxiv.org/pdf/1806.04743v2.pdf
PWC https://paperswithcode.com/paper/inferno-inference-aware-neural-optimisation
Repo https://github.com/pablodecm/paper-inferno
Framework tf
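
A hedged sketch of the inference-aware objective for the mixture-coefficient use case: a summary network feeds a softmax-binned (hence differentiable) histogram, and the loss is the inverse Fisher information of the binned Poisson likelihood in the mixture coefficient s, i.e., an estimate of the very uncertainty one wants to be small. Nuisance parameters, which the paper handles via the full covariance, are omitted here.

```python
import torch

def inferno_loss(summary_net, x_sig, x_bkg, s=0.2, n_total=1000.0, T=0.1):
    """Inference-aware loss (sketch of the paper's idea).

    The network maps events to soft (softmax-binned, differentiable)
    histogram fractions; the loss is the estimated variance of the
    mixture coefficient `s` from the Fisher information of the binned
    Poisson likelihood, so training directly minimizes the final
    parameter uncertainty.
    """
    h_sig = torch.softmax(summary_net(x_sig) / T, dim=1).mean(0)  # bin fracs
    h_bkg = torch.softmax(summary_net(x_bkg) / T, dim=1).mean(0)
    mu = n_total * (s * h_sig + (1 - s) * h_bkg)   # expected bin counts
    dmu_ds = n_total * (h_sig - h_bkg)
    fisher = (dmu_ds ** 2 / mu).sum()              # Poisson Fisher info
    return 1.0 / fisher                            # variance of s-hat

net = torch.nn.Sequential(torch.nn.Linear(4, 16), torch.nn.ReLU(),
                          torch.nn.Linear(16, 10))  # 10 soft bins
loss = inferno_loss(net, torch.randn(256, 4), torch.randn(256, 4))
loss.backward()
```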