Paper Group AWR 377
Let’s Transfer Transformations of Shared Semantic Representations
Title | Let’s Transfer Transformations of Shared Semantic Representations |
Authors | Nam Vo, Lu Jiang, James Hays |
Abstract | With a good image understanding capability, can we manipulate an image's high-level semantic representation? Such a transformation operation can be used to generate or retrieve similar images with a desired modification (for example, changing a beach background to a street background); a similar ability has been demonstrated in zero-shot learning, attribute composition and attribute manipulation image search. In this work we show how one can learn transformations with no training examples by learning them in another domain and then transferring them to the target domain. This is feasible if: first, transformation training data is more accessible in the other domain, and second, both domains share similar semantics such that transformations can be learned in a shared embedding space. We demonstrate this on an image retrieval task where the search query is an image plus an additional transformation specification (for example: search for images similar to this one, but with a street background instead of a beach). In one experiment, we transfer transformations from synthesized 2D blob images to 3D rendered images, and in the other, we transfer from the text domain to the natural image domain. |
Tasks | Image Retrieval, Zero-Shot Learning |
Published | 2019-03-02 |
URL | http://arxiv.org/abs/1903.00793v1 |
http://arxiv.org/pdf/1903.00793v1.pdf | |
PWC | https://paperswithcode.com/paper/lets-transfer-transformations-of-shared |
Repo | https://github.com/gchb2012/VQA |
Framework | none |
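The abstract does not fix a concrete formulation, but the core idea of applying a transformation learned in one domain to queries from another can be illustrated with a minimal sketch. Here the transformation is assumed to be a simple additive vector estimated from source-domain example pairs (the paper learns more general transformations), and the retrieval step is a plain cosine-similarity nearest-neighbour search; all names are illustrative.

```python
import numpy as np

def estimate_transform(src_before, src_after):
    """Estimate a transformation as the mean embedding offset between
    source-domain (before, after) pairs -- a deliberately simple, additive assumption."""
    return (src_after - src_before).mean(axis=0)

def retrieve(query_emb, transform, gallery_embs, k=5):
    """Apply the transferred transformation to a target-domain query embedding
    and return the indices of the k most similar gallery items."""
    target = query_emb + transform
    target = target / np.linalg.norm(target)
    gallery = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    sims = gallery @ target                      # cosine similarity
    return np.argsort(-sims)[:k]

# toy usage in a 64-d shared embedding space
rng = np.random.default_rng(0)
before, after = rng.normal(size=(100, 64)), rng.normal(size=(100, 64))
t = estimate_transform(before, after)
print(retrieve(rng.normal(size=64), t, rng.normal(size=(1000, 64))))
```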
3D Point Cloud Generative Adversarial Network Based on Tree Structured Graph Convolutions
Title | 3D Point Cloud Generative Adversarial Network Based on Tree Structured Graph Convolutions |
Authors | Dong Wook Shu, Sung Woo Park, Junseok Kwon |
Abstract | In this paper, we propose a novel generative adversarial network (GAN) for 3D point clouds generation, which is called tree-GAN. To achieve state-of-the-art performance for multi-class 3D point cloud generation, a tree-structured graph convolution network (TreeGCN) is introduced as a generator for tree-GAN. Because TreeGCN performs graph convolutions within a tree, it can use ancestor information to boost the representation power for features. To evaluate GANs for 3D point clouds accurately, we develop a novel evaluation metric called Frechet point cloud distance (FPD). Experimental results demonstrate that the proposed tree-GAN outperforms state-of-the-art GANs in terms of both conventional metrics and FPD, and can generate point clouds for different semantic parts without prior knowledge. |
Tasks | Point Cloud Generation |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06292v2 |
https://arxiv.org/pdf/1905.06292v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-point-cloud-generative-adversarial-network |
Repo | https://github.com/seowok/TreeGAN |
Framework | pytorch |
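The abstract introduces the Fréchet point cloud distance (FPD) as an FID-style evaluation metric. Assuming FPD follows the usual Fréchet distance between Gaussians fitted to extracted features (the point-cloud feature extractor itself is not reproduced here), a minimal computation looks like this:

```python
import numpy as np
from scipy import linalg

def frechet_distance(feats_real, feats_gen):
    """Fréchet distance between Gaussians fitted to two feature sets:
    ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2 (S_r S_g)^{1/2})."""
    mu_r, mu_g = feats_real.mean(0), feats_gen.mean(0)
    s_r = np.cov(feats_real, rowvar=False)
    s_g = np.cov(feats_gen, rowvar=False)
    covmean = linalg.sqrtm(s_r @ s_g)
    if np.iscomplexobj(covmean):   # numerical noise can introduce tiny imaginary parts
        covmean = covmean.real
    return float(((mu_r - mu_g) ** 2).sum() + np.trace(s_r + s_g - 2.0 * covmean))

# toy usage with random stand-ins for point-cloud features
rng = np.random.default_rng(0)
print(frechet_distance(rng.normal(size=(500, 128)),
                       rng.normal(0.1, 1.0, size=(500, 128))))
```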
Lund jet images from generative and cycle-consistent adversarial networks
Title | Lund jet images from generative and cycle-consistent adversarial networks |
Authors | Stefano Carrazza, Frédéric A. Dreyer |
Abstract | We introduce a generative model to simulate radiation patterns within a jet using the Lund jet plane. We show that using an appropriate neural network architecture with a stochastic generation of images, it is possible to construct a generative model which retrieves the underlying two-dimensional distribution to within a few percent. We compare our model with several alternative state-of-the-art generative techniques. Finally, we show how a mapping can be created between different categories of jets, and use this method to retroactively change simulation settings or the underlying process on an existing sample. These results provide a framework for significantly reducing simulation times through fast inference of the neural network as well as for data augmentation of physical measurements. |
Tasks | Data Augmentation |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.01359v2 |
https://arxiv.org/pdf/1909.01359v2.pdf | |
PWC | https://paperswithcode.com/paper/lund-jet-images-from-generative-and-cycle |
Repo | https://github.com/JetsGame/gLund |
Framework | tf |
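The mapping between different jet categories described in the abstract relies on cycle-consistent adversarial training. As a reference point only, the standard cycle-consistency term (the CycleGAN formulation, not necessarily the exact loss used by gLund) requires that images mapped A to B and back to A return unchanged:

```python
import numpy as np

def cycle_consistency_loss(x_a, x_b, g_ab, g_ba):
    """Standard cycle-consistency term: A->B->A and B->A->B round trips
    should reproduce the inputs, measured with an L1 norm."""
    loss_a = np.abs(g_ba(g_ab(x_a)) - x_a).mean()
    loss_b = np.abs(g_ab(g_ba(x_b)) - x_b).mean()
    return loss_a + loss_b

# toy usage with identity "generators" on fake Lund-plane images
imgs_a = np.random.rand(8, 24, 24)
imgs_b = np.random.rand(8, 24, 24)
identity = lambda x: x
print(cycle_consistency_loss(imgs_a, imgs_b, identity, identity))  # 0.0
```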
Autoregressive Policies for Continuous Control Deep Reinforcement Learning
Title | Autoregressive Policies for Continuous Control Deep Reinforcement Learning |
Authors | Dmytro Korenkevych, A. Rupam Mahmood, Gautham Vasan, James Bergstra |
Abstract | Reinforcement learning algorithms rely on exploration to discover new behaviors, which is typically achieved by following a stochastic policy. In continuous control tasks, policies with a Gaussian distribution have been widely adopted. Gaussian exploration, however, does not produce the smooth trajectories that generally correspond to safe and rewarding behaviors in practical tasks. In addition, Gaussian policies do not result in effective exploration of an environment and become increasingly inefficient as the action rate increases. This contributes to the low sample efficiency often observed in learning continuous control tasks. We introduce a family of stationary autoregressive (AR) stochastic processes to facilitate exploration in continuous control domains. We show that the proposed processes possess two desirable features: subsequent process observations are temporally coherent with a continuously adjustable degree of coherence, and the process stationary distribution is standard normal. We derive an autoregressive policy (ARP) that implements such processes while maintaining the standard agent-environment interface. We show how ARPs can be easily used with existing off-the-shelf learning algorithms. Empirically, we demonstrate that using ARPs results in improved exploration and sample efficiency in both simulated and real-world domains and, furthermore, provides smooth exploration trajectories that enable safe operation of robotic hardware. |
Tasks | Continuous Control |
Published | 2019-03-27 |
URL | http://arxiv.org/abs/1903.11524v1 |
http://arxiv.org/pdf/1903.11524v1.pdf | |
PWC | https://paperswithcode.com/paper/autoregressive-policies-for-continuous |
Repo | https://github.com/kindredresearch/arp |
Framework | tf |
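The key construction in the abstract is a stationary AR process whose marginal distribution is standard normal. A minimal sketch of a first-order instance of that idea (the paper covers a more general family; the coefficient name and values here are illustrative):

```python
import numpy as np

def ar1_exploration_noise(steps, alpha=0.8, dim=4, seed=0):
    """First-order autoregressive noise x_{t+1} = alpha * x_t + sqrt(1 - alpha^2) * eps_t.
    With eps_t ~ N(0, I) and x_0 ~ N(0, I), every x_t is marginally N(0, I) while
    consecutive samples have correlation alpha (temporal coherence)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)
    out = [x]
    for _ in range(steps - 1):
        x = alpha * x + np.sqrt(1.0 - alpha ** 2) * rng.standard_normal(dim)
        out.append(x)
    return np.stack(out)

noise = ar1_exploration_noise(1000, alpha=0.9)
print(noise.mean(), noise.std())   # close to 0 and 1: standard-normal marginals
```

Larger alpha gives smoother, more temporally coherent exploration while leaving the marginal distribution unchanged, which is the property the abstract emphasizes.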
HybridNetSeg: A Compact Hybrid Network for Retinal Vessel Segmentation
Title | HybridNetSeg: A Compact Hybrid Network for Retinal Vessel Segmentation |
Authors | Ling Luo, Dingyu Xue, Xinglong Feng |
Abstract | A large number of retinal vessel analysis methods based on image segmentation have emerged in recent years. However, existing methods depend on cumbersome backbones, such as VGG16 and ResNet-50, benefiting from their powerful feature extraction capabilities but suffering from high computational costs. In this paper, we propose a novel neural network (HybridNetSeg) dedicated to solving this drawback while further improving overall performance. Considering that deformable convolution can extract complex and variable structural information, and that the larger kernels in mixed depthwise convolution contribute to higher accuracy, we integrate these two modules into a Hybrid Convolution Block (HCB) using the idea of heuristic learning. Inspired by U-Net, we use the HCB to replace part of the common convolutions in the U-Net encoder, drastically reducing the parameter count to 0.71M while accelerating the inference process. We also propose a multi-scale mixed loss mechanism. Extensive experiments on three major benchmark datasets demonstrate the effectiveness of our proposed method. |
Tasks | Retinal Vessel Segmentation, Semantic Segmentation |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1911.09982v1 |
https://arxiv.org/pdf/1911.09982v1.pdf | |
PWC | https://paperswithcode.com/paper/hybridnetseg-a-compact-hybrid-network-for |
Repo | https://github.com/JACKYLUO1991/HybridNetSeg |
Framework | pytorch |
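The HCB combines deformable convolution with mixed depthwise convolution; the full block is not reproduced here, but a minimal sketch of the mixed depthwise part (input channels split into groups, each processed depthwise with a different kernel size) conveys the idea. Group sizes and kernel choices below are assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

class MixedDepthwiseConv(nn.Module):
    """Depthwise convolution with mixed kernel sizes: channels are split into
    groups and each group uses its own kernel size."""
    def __init__(self, channels, kernel_sizes=(3, 5, 7)):
        super().__init__()
        splits = [channels // len(kernel_sizes)] * len(kernel_sizes)
        splits[0] += channels - sum(splits)            # absorb the remainder
        self.splits = splits
        self.convs = nn.ModuleList(
            nn.Conv2d(c, c, k, padding=k // 2, groups=c)   # depthwise: groups == channels
            for c, k in zip(splits, kernel_sizes)
        )

    def forward(self, x):
        chunks = torch.split(x, self.splits, dim=1)
        return torch.cat([conv(c) for conv, c in zip(self.convs, chunks)], dim=1)

# toy usage
y = MixedDepthwiseConv(32)(torch.randn(1, 32, 64, 64))
print(y.shape)   # torch.Size([1, 32, 64, 64])
```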
Enhancing Cross-task Black-Box Transferability of Adversarial Examples with Dispersion Reduction
Title | Enhancing Cross-task Black-Box Transferability of Adversarial Examples with Dispersion Reduction |
Authors | Yantao Lu, Yunhan Jia, Jianyu Wang, Bai Li, Weiheng Chai, Lawrence Carin, Senem Velipasalar |
Abstract | Neural networks are known to be vulnerable to carefully crafted adversarial examples, and these malicious samples often transfer, i.e., they remain adversarial even against other models. Although great effort has been devoted to transferability across models, surprisingly, less attention has been paid to cross-task transferability, which represents the real-world cybercriminal's situation, where an ensemble of different defense/detection mechanisms needs to be evaded all at once. In this paper, we investigate the transferability of adversarial examples across a wide range of real-world computer vision tasks, including image classification, object detection, semantic segmentation, explicit content detection, and text detection. Our proposed attack minimizes the "dispersion" of the internal feature map, which overcomes existing attacks' limitation of requiring task-specific loss functions and/or probing a target model. We conduct evaluation on open source detection and segmentation models as well as four different computer vision tasks provided by Google Cloud Vision (GCV) APIs, to show how our approach outperforms existing attacks by degrading the performance of multiple CV tasks by a large margin with only modest perturbations (L_inf = 16). |
Tasks | Adversarial Attack, Image Classification, Object Detection, Semantic Segmentation |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1911.11616v1 |
https://arxiv.org/pdf/1911.11616v1.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-cross-task-black-box |
Repo | https://github.com/anonymous0120/dr |
Framework | pytorch |
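The attack minimizes the "dispersion" of an internal feature map instead of a task-specific loss. A minimal PGD-style sketch under that interpretation, using the standard deviation of a feature map as the dispersion measure; the layer choice, step sizes, and the randomly initialized backbone are assumptions, not the paper's exact settings:

```python
import torch
import torchvision.models as models

def dispersion_reduction_attack(x, feature_extractor, eps=16 / 255, steps=10, step_size=2 / 255):
    """Iteratively perturb x within an L_inf ball so that the chosen internal
    feature map has minimal standard deviation ("dispersion")."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = feature_extractor(x_adv).std()              # dispersion of the feature map
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv - step_size * x_adv.grad.sign()   # descend on dispersion
            x_adv = x + (x_adv - x).clamp(-eps, eps)        # project into the eps-ball
            x_adv = x_adv.clamp(0, 1).detach()
    return x_adv

# toy usage: dispersion taken at an early VGG-16 block (an illustrative choice)
vgg = models.vgg16(weights=None).eval()
feats = vgg.features[:10]
x = torch.rand(1, 3, 224, 224)
adv = dispersion_reduction_attack(x, feats)
print((adv - x).abs().max())   # bounded by eps
```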
Query-guided End-to-End Person Search
Title | Query-guided End-to-End Person Search |
Authors | Bharti Munjal, Sikandar Amin, Federico Tombari, Fabio Galasso |
Abstract | Person search has recently gained attention as the novel task of finding a person, provided as a cropped sample, in a gallery of non-cropped images in which several other people are also visible. We believe that i. person detection and re-identification should be pursued in a joint optimization framework and that ii. the person search should leverage the query image extensively (e.g. emphasizing unique query patterns). However, so far, no prior art realizes this. We introduce a novel query-guided end-to-end person search network (QEEPS) to address both aspects. We leverage a recent joint detection and re-identification work, OIM [37]. We extend this with i. a query-guided Siamese squeeze-and-excitation network (QSSE-Net) that uses global context from both the query and gallery images, ii. a query-guided region proposal network (QRPN) to produce query-relevant proposals, and iii. a query-guided similarity subnetwork (QSimNet) to learn a query-guided re-identification score. QEEPS is the first end-to-end query-guided detection and re-id network. On both the recent CUHK-SYSU [37] and PRW [46] datasets, we outperform the previous state-of-the-art by a large margin. |
Tasks | Human Detection, Person Search |
Published | 2019-05-03 |
URL | https://arxiv.org/abs/1905.01203v1 |
https://arxiv.org/pdf/1905.01203v1.pdf | |
PWC | https://paperswithcode.com/paper/query-guided-end-to-end-person-search |
Repo | https://github.com/munjalbharti/Query-guided-End-to-End-Person-Search |
Framework | none |
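QSSE-Net re-weights features using global context from the query. The exact architecture is not reproduced here; the sketch below only illustrates the general pattern of query-conditioned channel re-weighting (a squeeze-and-excitation-style gate driven by pooled query features), with layer sizes chosen arbitrarily:

```python
import torch
import torch.nn as nn

class QueryGuidedSE(nn.Module):
    """Channel gate computed from pooled query features and applied to the
    gallery feature map (an illustrative query-guided SE-style block)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, gallery_feat, query_feat):
        q = query_feat.mean(dim=(2, 3))                 # squeeze: pool the query map
        gate = self.fc(q).unsqueeze(-1).unsqueeze(-1)   # excite: query-derived channel gate
        return gallery_feat * gate

block = QueryGuidedSE(256)
out = block(torch.randn(2, 256, 32, 32), torch.randn(2, 256, 8, 8))
print(out.shape)   # torch.Size([2, 256, 32, 32])
```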
Fooling automated surveillance cameras: adversarial patches to attack person detection
Title | Fooling automated surveillance cameras: adversarial patches to attack person detection |
Authors | Simen Thys, Wiebe Van Ranst, Toon Goedemé |
Abstract | Adversarial attacks on machine learning models have seen increasing interest in the past years. By making only subtle changes to the input of a convolutional neural network, the output of the network can be swayed to a completely different result. The first attacks did this by slightly changing the pixel values of an input image to fool a classifier into outputting the wrong class. Other approaches have tried to learn “patches” that can be applied to an object to fool detectors and classifiers. Some of these approaches have also shown that these attacks are feasible in the real world, i.e. by modifying an object and filming it with a video camera. However, all of these approaches target classes that contain almost no intra-class variety (e.g. stop signs). The known structure of the object is then used to generate an adversarial patch on top of it. In this paper, we present an approach to generate adversarial patches for targets with lots of intra-class variety, namely persons. The goal is to generate a patch that is able to successfully hide a person from a person detector. Such an attack could, for instance, be used maliciously to circumvent surveillance systems: intruders can sneak around undetected by holding a small cardboard plate in front of their body aimed towards the surveillance camera. From our results we can see that our system is able to significantly lower the accuracy of a person detector. Our approach also functions well in real-life scenarios where the patch is filmed by a camera. To the best of our knowledge, we are the first to attempt this kind of attack on targets with a high level of intra-class variety like persons. |
Tasks | Human Detection |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.08653v1 |
http://arxiv.org/pdf/1904.08653v1.pdf | |
PWC | https://paperswithcode.com/paper/fooling-automated-surveillance-cameras |
Repo | https://github.com/sfc-computational-creativity-lab/x-adversarialfashion |
Framework | pytorch |
Stein Variational Gradient Descent With Matrix-Valued Kernels
Title | Stein Variational Gradient Descent With Matrix-Valued Kernels |
Authors | Dilin Wang, Ziyang Tang, Chandrajit Bajaj, Qiang Liu |
Abstract | Stein variational gradient descent (SVGD) is a particle-based inference algorithm that leverages gradient information for efficient approximate inference. In this work, we enhance SVGD by leveraging preconditioning matrices, such as the Hessian and Fisher information matrix, to incorporate geometric information into SVGD updates. We achieve this by presenting a generalization of SVGD that replaces the scalar-valued kernels in vanilla SVGD with more general matrix-valued kernels. This yields a significant extension of SVGD, and more importantly, allows us to flexibly incorporate various preconditioning matrices to accelerate the exploration in the probability landscape. Empirical results show that our method outperforms vanilla SVGD and a variety of baseline approaches over a range of real-world Bayesian inference tasks. |
Tasks | Bayesian Inference |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12794v2 |
https://arxiv.org/pdf/1910.12794v2.pdf | |
PWC | https://paperswithcode.com/paper/stein-variational-gradient-descent-with |
Repo | https://github.com/dilinwang820/matrix_svgd |
Framework | tf |
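For reference, vanilla SVGD (the scalar-kernel special case that the paper generalizes to matrix-valued kernels) performs the following particle update; the fixed RBF bandwidth used here is a simplification of the usual median heuristic:

```python
import numpy as np

def svgd_step(x, grad_logp, h=1.0, stepsize=0.1):
    """One vanilla-SVGD update with a scalar RBF kernel k(x, y) = exp(-||x - y||^2 / h):
    phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]."""
    diff = x[:, None, :] - x[None, :, :]                        # diff[j, i] = x_j - x_i
    k = np.exp(-(diff ** 2).sum(-1) / h)                        # kernel matrix
    attract = k.T @ grad_logp(x)                                # pulls particles towards high density
    repulse = (-2.0 / h) * (k[..., None] * diff).sum(axis=0)    # keeps particles spread out
    return x + stepsize * (attract + repulse) / x.shape[0]

# toy usage: approximate a standard normal, for which grad log p(x) = -x
rng = np.random.default_rng(0)
particles = rng.normal(3.0, 0.1, size=(100, 2))                 # start far from the target
for _ in range(500):
    particles = svgd_step(particles, lambda z: -z)
print(particles.mean(0), particles.std(0))                      # mean near 0, spread near 1
```

The matrix-valued generalization replaces the scalar k(x_j, x_i) with a matrix-valued kernel (e.g. built from a preconditioner such as the Hessian or Fisher information), which reshapes both terms of this update.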
Generalized Planning with Positive and Negative Examples
Title | Generalized Planning with Positive and Negative Examples |
Authors | Javier Segovia-Aguas, Sergio Jiménez, Anders Jonsson |
Abstract | Generalized planning aims at computing an algorithm-like structure (generalized plan) that solves a set of multiple planning instances. In this paper we define negative examples for generalized planning as planning instances that must not be solved by a generalized plan. In this regard, the paper extends the notion of validation of a generalized plan to the problem of verifying that a given generalized plan solves the set of input positive instances while failing to solve a given input set of negative examples. This notion of plan validation allows us to define quantitative metrics to assess the generalization capacity of generalized plans. The paper also shows how to incorporate this new notion of plan validation into a compilation for plan synthesis that takes both positive and negative instances as input. Experiments show that incorporating negative examples can accelerate plan synthesis in several domains, and that the proposed quantitative metrics can be leveraged to evaluate the generalization capacity of the synthesized plans. |
Tasks | |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09365v1 |
https://arxiv.org/pdf/1911.09365v1.pdf | |
PWC | https://paperswithcode.com/paper/generalized-planning-with-positive-and |
Repo | https://github.com/aig-upf/automated-programming-framework |
Framework | none |
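The validation notion described in the abstract (a generalized plan must solve every positive instance and fail every negative one) is simple to state directly. The sketch below assumes a black-box solves(plan, instance) check, which is a placeholder rather than the paper's compilation-based machinery:

```python
def validate(plan, positives, negatives, solves):
    """A generalized plan is valid iff it solves all positive instances
    and solves none of the negative instances."""
    return all(solves(plan, p) for p in positives) and not any(solves(plan, n) for n in negatives)

def generalization_score(plan, positives, negatives, solves):
    """A simple quantitative metric in the spirit of the abstract: the fraction
    of instances handled correctly (positives solved, negatives not solved)."""
    correct = sum(solves(plan, p) for p in positives) + sum(not solves(plan, n) for n in negatives)
    return correct / (len(positives) + len(negatives))

# toy usage with a trivial stand-in solver: "plan" divides the instance number
solves = lambda plan, inst: inst % plan == 0
print(validate(2, positives=[2, 4, 6], negatives=[3, 5], solves=solves))   # True
print(generalization_score(2, [2, 4, 6], [3, 4], solves))                  # 0.8
```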
Symmetric Graph Convolutional Autoencoder for Unsupervised Graph Representation Learning
Title | Symmetric Graph Convolutional Autoencoder for Unsupervised Graph Representation Learning |
Authors | Jiwoong Park, Minsik Lee, Hyung Jin Chang, Kyuewang Lee, Jin Young Choi |
Abstract | We propose a symmetric graph convolutional autoencoder which produces a low-dimensional latent representation from a graph. In contrast to existing graph autoencoders with asymmetric decoder parts, the proposed autoencoder has a newly designed decoder which yields a completely symmetric autoencoder form. For the reconstruction of node features, the decoder is designed based on Laplacian sharpening as the counterpart of the Laplacian smoothing in the encoder, which allows the graph structure to be utilized throughout the whole proposed autoencoder architecture. In order to prevent the numerical instability of the network caused by introducing Laplacian sharpening, we further propose a new, numerically stable form of Laplacian sharpening that incorporates signed graphs. In addition, a new cost function which finds a latent representation and a latent affinity matrix simultaneously is devised to boost the performance of image clustering tasks. The experimental results on clustering, link prediction and visualization tasks strongly support that the proposed model is stable and outperforms various state-of-the-art algorithms. |
Tasks | Graph Clustering, Graph Representation Learning, Image Clustering, Link Prediction, Representation Learning |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.02441v1 |
https://arxiv.org/pdf/1908.02441v1.pdf | |
PWC | https://paperswithcode.com/paper/symmetric-graph-convolutional-autoencoder-for |
Repo | https://github.com/sseung0703/GALA_TF2.0 |
Framework | tf |
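The encoder/decoder pairing in the abstract is built on Laplacian smoothing and a sharpening counterpart. As a rough reference only (one common form with self-loops and symmetric normalization, not the paper's numerically stable signed-graph variant), the two propagation operators can be sketched as:

```python
import numpy as np

def smoothing_and_sharpening_ops(adj):
    """GCN-style Laplacian-smoothing propagation (with self-loops) and a
    sharpening counterpart of the form 2I - smoothing; a simplified reference,
    not the signed-graph form proposed in the paper."""
    n = adj.shape[0]
    a_tilde = adj + np.eye(n)                              # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_tilde.sum(1)))
    smooth = d_inv_sqrt @ a_tilde @ d_inv_sqrt             # encoder propagation
    sharpen = 2.0 * np.eye(n) - smooth                     # decoder counterpart
    return smooth, sharpen

# toy usage on a 3-node path graph, propagating node features h
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
smooth, sharpen = smoothing_and_sharpening_ops(adj)
h = np.array([[1.0], [0.0], [-1.0]])
print(smooth @ h)     # neighbours averaged in: features pulled together
print(sharpen @ h)    # differences amplified: features pushed apart
```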
Automatic Source Code Summarization with Extended Tree-LSTM
Title | Automatic Source Code Summarization with Extended Tree-LSTM |
Authors | Yusuke Shido, Yasuaki Kobayashi, Akihiro Yamamoto, Atsushi Miyamoto, Tadayuki Matsumura |
Abstract | Neural machine translation models are used to automatically generate a document from given source code, since this can be regarded as a machine translation task. Source code summarization is one of the components of automatic document generation; it generates a summary in natural language from given source code. This suggests that techniques used in neural machine translation, such as Long Short-Term Memory (LSTM), can be used for source code summarization. However, there is a considerable difference between source code and natural language: source code is essentially structured, having loops, conditional branching, etc. Therefore, there are obstacles to applying known machine translation models to source code. Abstract syntax trees (ASTs) capture these structural properties and play an important role in recent machine learning studies on source code. Tree-LSTM has been proposed as a generalization of LSTMs for tree-structured data. However, there is a critical issue when applying it to ASTs: it cannot simultaneously handle nodes that have an arbitrary number of children and the order of those children, yet ASTs generally contain such nodes. To address this issue, we propose an extension of Tree-LSTM, which we call Multi-way Tree-LSTM, and apply it to source code summarization. In computational experiments, our proposal achieved better results than several state-of-the-art techniques. |
Tasks | Code Summarization, Machine Translation |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08094v2 |
https://arxiv.org/pdf/1906.08094v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-source-code-summarization-with |
Repo | https://github.com/sh1doy/summarization_tf |
Framework | tf |
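For context, the child-sum Tree-LSTM cell of Tai et al. (2015), which the abstract extends, combines children by summing their hidden states and gates each child's memory with its own forget gate. The Multi-way Tree-LSTM itself, which additionally respects child order, is not reproduced here; the sketch below is the standard cell only.

```python
import torch
import torch.nn as nn

class ChildSumTreeLSTMCell(nn.Module):
    """Child-sum Tree-LSTM cell: children's hidden states are summed, and each
    child's memory is gated by its own forget gate."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.iou = nn.Linear(in_dim + hid_dim, 3 * hid_dim)   # input, output, update gates
        self.f_x = nn.Linear(in_dim, hid_dim)
        self.f_h = nn.Linear(hid_dim, hid_dim)

    def forward(self, x, child_h, child_c):
        # x: (in_dim,), child_h / child_c: (num_children, hid_dim)
        h_sum = child_h.sum(dim=0)
        i, o, u = torch.chunk(self.iou(torch.cat([x, h_sum])), 3)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        f = torch.sigmoid(self.f_x(x) + self.f_h(child_h))    # one forget gate per child
        c = i * u + (f * child_c).sum(dim=0)
        h = o * torch.tanh(c)
        return h, c

cell = ChildSumTreeLSTMCell(in_dim=8, hid_dim=16)
h, c = cell(torch.randn(8), torch.randn(3, 16), torch.randn(3, 16))
print(h.shape, c.shape)   # torch.Size([16]) torch.Size([16])
```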
Sampling Bias in Deep Active Classification: An Empirical Study
Title | Sampling Bias in Deep Active Classification: An Empirical Study |
Authors | Ameya Prabhu, Charles Dognin, Maneesh Singh |
Abstract | The exploding cost and time needed for data labeling and model training are bottlenecks for training DNN models on large datasets. Identifying smaller representative data samples with strategies like active learning can help mitigate such bottlenecks. Previous works on active learning in NLP identify the problem of sampling bias in the samples acquired by uncertainty-based querying and develop costly approaches to address it. Using a large empirical study, we demonstrate that active set selection using the posterior entropy of deep models like FastText.zip (FTZ) is robust to sampling biases and to various algorithmic choices (query size and strategies), contrary to what traditional literature suggests. We also show that the FTZ-based query strategy produces sample sets similar to those from more sophisticated approaches (e.g. ensemble networks). Finally, we show the effectiveness of the selected samples by creating tiny high-quality datasets and utilizing them for fast and cheap training of large models. Based on the above, we propose a simple baseline for deep active text classification that outperforms the state-of-the-art. We expect the presented work to be useful and informative for dataset compression and for problems involving active, semi-supervised or online learning scenarios. Code and models are available at: https://github.com/drimpossible/Sampling-Bias-Active-Learning |
Tasks | Active Learning, Text Classification |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09389v1 |
https://arxiv.org/pdf/1909.09389v1.pdf | |
PWC | https://paperswithcode.com/paper/sampling-bias-in-deep-active-classification |
Repo | https://github.com/drimpossible/Sampling-Bias-Active-Learning |
Framework | none |
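The acquisition rule studied in the abstract (select the unlabeled examples with the highest posterior entropy) is a one-liner in practice. The sketch below assumes softmax class probabilities are already available from the classifier (FTZ in the paper; the model itself is not reproduced here):

```python
import numpy as np

def entropy_query(probs, query_size):
    """Return indices of the `query_size` unlabeled examples whose predicted
    class distributions have the highest entropy (most uncertain)."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.argsort(-entropy)[:query_size]

# toy usage: 5 unlabeled examples, 3 classes
probs = np.array([
    [0.98, 0.01, 0.01],   # confident
    [0.34, 0.33, 0.33],   # very uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
    [0.90, 0.05, 0.05],
])
print(entropy_query(probs, query_size=2))   # [1 2]: the two most uncertain rows
```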
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
Title | ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks |
Authors | Mohit Shridhar, Jesse Thomason, Daniel Gordon, Yonatan Bisk, Winson Han, Roozbeh Mottaghi, Luke Zettlemoyer, Dieter Fox |
Abstract | We present ALFRED (Action Learning From Realistic Environments and Directives), a benchmark for learning a mapping from natural language instructions and egocentric vision to sequences of actions for household tasks. ALFRED includes long, compositional tasks with non-reversible state changes to shrink the gap between research benchmarks and real-world applications. ALFRED consists of expert demonstrations in interactive visual environments for 25k natural language directives. These directives contain both high-level goals like “Rinse off a mug and place it in the coffee maker.” and low-level language instructions like “Walk to the coffee maker on the right.” ALFRED tasks are more complex in terms of sequence length, action space, and language than existing vision-and-language task datasets. We show that a baseline model based on recent embodied vision-and-language tasks performs poorly on ALFRED, suggesting that there is significant room for developing innovative grounded visual language understanding models with this benchmark. |
Tasks | |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01734v2 |
https://arxiv.org/pdf/1912.01734v2.pdf | |
PWC | https://paperswithcode.com/paper/alfred-a-benchmark-for-interpreting-grounded |
Repo | https://github.com/askforalfred/alfred |
Framework | pytorch |
Dynamic Deep Networks for Retinal Vessel Segmentation
Title | Dynamic Deep Networks for Retinal Vessel Segmentation |
Authors | Aashis Khanal, Rolando Estrada |
Abstract | Segmenting the retinal vasculature entails a trade-off between how much of the overall vascular structure we identify vs. how precisely we segment individual vessels. In particular, state-of-the-art methods tend to under-segment faint vessels, as well as pixels that lie on the edges of thicker vessels. Thus, they underestimate the width of individual vessels, as well as the ratio of large to small vessels. More generally, many crucial bio-markers—including the artery-vein (AV) ratio, branching angles, number of bifurcations, fractal dimension, tortuosity, vascular length-to-diameter ratio and wall-to-lumen length—require precise measurements of individual vessels. To address this limitation, we propose a novel, stochastic training scheme for deep neural networks that better classifies the faint, ambiguous regions of the image. Our approach relies on two key innovations. First, we train our deep networks with dynamic weights that fluctuate during each training iteration. This stochastic approach forces the network to learn a mapping that robustly balances precision and recall. Second, we decouple the segmentation process into two steps. In the first half of our pipeline, we estimate the likelihood of every pixel and then use these likelihoods to segment pixels that are clearly vessel or background. In the latter part of our pipeline, we use a second network to classify the ambiguous regions in the image. Our proposed method obtained state-of-the-art results on five retinal datasets—DRIVE, STARE, CHASE-DB, AV-WIDE, and VEVIO—by learning a robust balance between false positive and false negative rates. In addition, we are the first to report segmentation results on the AV-WIDE dataset, and we have made the ground-truth annotations for this dataset publicly available. |
Tasks | Retinal Vessel Segmentation |
Published | 2019-03-19 |
URL | http://arxiv.org/abs/1903.07803v2 |
http://arxiv.org/pdf/1903.07803v2.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-deep-networks-for-retinal-vessel |
Repo | https://github.com/sraashis/deepdyn |
Framework | pytorch |
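The two-step pipeline in the abstract first commits the clearly-vessel and clearly-background pixels from a likelihood map and sends only the ambiguous band to a second network. A minimal sketch of that split; the thresholds and the stand-in second-stage classifier are placeholders, not the paper's learned settings:

```python
import numpy as np

def two_stage_segmentation(likelihood, second_stage, low=0.2, high=0.8):
    """Commit confident pixels from the likelihood map and let a second
    classifier decide only the ambiguous band in between."""
    seg = np.zeros_like(likelihood, dtype=np.uint8)
    seg[likelihood >= high] = 1                              # clearly vessel
    ambiguous = (likelihood > low) & (likelihood < high)
    seg[ambiguous] = second_stage(likelihood, ambiguous)     # resolve the hard pixels
    return seg

# toy usage: the "second stage" is just a 0.5 threshold stand-in
rng = np.random.default_rng(0)
lik = rng.random((8, 8))
second = lambda lik, mask: (lik[mask] >= 0.5).astype(np.uint8)
print(two_stage_segmentation(lik, second))
```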