Paper Group AWR 142
DeepSaucer: Unified Environment for Verifying Deep Neural Networks
Title | DeepSaucer: Unified Environment for Verifying Deep Neural Networks |
Authors | Naoto Sato, Hironobu Kuruma, Masanori Kaneko, Yuichiroh Nakagawa, Hideto Ogawa, Thai Son Hoang, Michael Butler |
Abstract | In recent years, a number of methods for verifying DNNs have been developed. Because these methods take different approaches and have their own limitations, we think that multiple verification methods should be applied to a developed DNN. To apply multiple methods to a DNN, it is necessary to translate either the implementation of the DNN or the verification method so that one runs in the same environment as the other. Since such translations are time-consuming, we propose a utility tool, named DeepSaucer, which helps retain and reuse implementations of DNNs, verification methods, and their environments. In DeepSaucer, code snippets for loading DNNs, running verification methods, and creating their environments are retained and reused as software assets in order to reduce the cost of verifying DNNs. The feasibility of DeepSaucer is confirmed by implementing it on the basis of Anaconda, which provides a virtual environment for loading a DNN and running a verification method. In addition, the effectiveness of DeepSaucer is demonstrated by use-case examples. |
Tasks | |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.03752v1 |
PDF | http://arxiv.org/pdf/1811.03752v1.pdf |
PWC | https://paperswithcode.com/paper/deepsaucer-unified-environment-for-verifying |
Repo | https://github.com/hitachi-rd-yokohama-sato/deep_saucer |
Framework | tf |
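The abstract describes DeepSaucer's workflow only at a high level. Below is a minimal conceptual sketch, not the tool's actual API, of the idea of pairing reusable assets (a model-loading script, a verification script, and a conda environment) and running them together; every name in it is hypothetical.

```python
import subprocess

# Hypothetical asset registry: each verification method is paired with the
# conda environment it needs and the script that loads the DNN under test.
ASSETS = {
    "robustness_check": {
        "env_name": "verify_tf",              # created beforehand from an environment.yml
        "model_loader": "load_mnist_model.py",
        "verifier": "run_robustness_check.py",
    },
}

def run_verification(asset_key: str) -> None:
    """Run a registered verification method inside its own conda environment."""
    asset = ASSETS[asset_key]
    # `conda run -n <env>` executes a command inside the named environment,
    # so the DNN loader and the verifier share the same dependencies.
    for script in (asset["model_loader"], asset["verifier"]):
        subprocess.run(["conda", "run", "-n", asset["env_name"], "python", script],
                       check=True)

if __name__ == "__main__":
    run_verification("robustness_check")
```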
Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models
Title | Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models |
Authors | Pouya Samangouei, Maya Kabkab, Rama Chellappa |
Abstract | In recent years, deep neural network approaches have been widely adopted for machine learning tasks, including classification. However, they were shown to be vulnerable to adversarial perturbations: carefully crafted small perturbations can cause misclassification of legitimate images. We propose Defense-GAN, a new framework leveraging the expressive capability of generative models to defend deep neural networks against such attacks. Defense-GAN is trained to model the distribution of unperturbed images. At inference time, it finds a close output to a given image which does not contain the adversarial changes. This output is then fed to the classifier. Our proposed method can be used with any classification model and does not modify the classifier structure or training procedure. It can also be used as a defense against any attack as it does not assume knowledge of the process for generating the adversarial examples. We empirically show that Defense-GAN is consistently effective against different attack methods and improves on existing defense strategies. Our code has been made publicly available at https://github.com/kabkabm/defensegan |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06605v2 |
PDF | http://arxiv.org/pdf/1805.06605v2.pdf |
PWC | https://paperswithcode.com/paper/defense-gan-protecting-classifiers-against |
Repo | https://github.com/bibin-sebastian/Physical_Adversarial_examples_GAN |
Framework | tf |
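As a rough illustration of the reconstruction step described in the abstract, the sketch below projects an input onto the range of a pretrained generator by minimizing ||G(z) − x||² over the latent code, and only the projection is passed to the classifier. The generator, classifier, and hyperparameters are placeholders rather than the paper's exact settings.

```python
import torch

def defense_gan_reconstruct(x, generator, z_dim=100, n_restarts=10, n_steps=200, lr=0.05):
    """Project image batch x onto the generator's range: argmin_z ||G(z) - x||^2."""
    best_rec, best_err = None, None
    for _ in range(n_restarts):                      # random restarts in latent space
        z = torch.randn(x.size(0), z_dim, requires_grad=True)
        opt = torch.optim.SGD([z], lr=lr)
        for _ in range(n_steps):
            opt.zero_grad()
            err = ((generator(z) - x) ** 2).flatten(1).sum(dim=1)
            err.sum().backward()
            opt.step()
        with torch.no_grad():
            rec = generator(z)
            err = ((rec - x) ** 2).flatten(1).sum(dim=1)
            if best_err is None:
                best_rec, best_err = rec, err
            else:                                    # keep the best restart per image
                mask = (err < best_err).view(-1, *([1] * (rec.dim() - 1)))
                best_rec = torch.where(mask, rec, best_rec)
                best_err = torch.minimum(err, best_err)
    return best_rec

# The classifier then sees only the projection:
# logits = classifier(defense_gan_reconstruct(x_adv, generator))
```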
Stochastic seismic waveform inversion using generative adversarial networks as a geological prior
Title | Stochastic seismic waveform inversion using generative adversarial networks as a geological prior |
Authors | Lukas Mosser, Olivier Dubrule, Martin J. Blunt |
Abstract | We present an application of deep generative models in the context of partial-differential equation (PDE) constrained inverse problems. We combine a generative adversarial network (GAN) representing an a priori model that creates subsurface geological structures and their petrophysical properties, with the numerical solution of the PDE governing the propagation of acoustic waves within the earth’s interior. We perform Bayesian inversion using an approximate Metropolis-adjusted Langevin algorithm (MALA) to sample from the posterior given seismic observations. Gradients with respect to the model parameters governing the forward problem are obtained by solving the adjoint of the acoustic wave equation. Gradients of the mismatch with respect to the latent variables are obtained by leveraging the differentiable nature of the deep neural network used to represent the generative model. We show that approximate MALA sampling allows efficient Bayesian inversion of model parameters obtained from a prior represented by a deep generative model, obtaining a diverse set of realizations that reflect the observed seismic response. |
Tasks | |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03720v1 |
PDF | http://arxiv.org/pdf/1806.03720v1.pdf |
PWC | https://paperswithcode.com/paper/stochastic-seismic-waveform-inversion-using |
Repo | https://github.com/LukasMosser/stochastic_seismic_waveform_inversion |
Framework | pytorch |
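The sampler described in the abstract reduces to a Metropolis-adjusted Langevin step over the GAN's latent variables. The generic sketch below leaves the log-posterior and its gradient (which in the paper come from the seismic forward model, its adjoint, and backpropagation through the generator) as user-supplied callables.

```python
import numpy as np

def mala_step(z, log_post, grad_log_post, step):
    """One Metropolis-adjusted Langevin (MALA) update of latent vector z."""
    noise = np.random.randn(*z.shape)
    z_prop = z + 0.5 * step**2 * grad_log_post(z) + step * noise   # Langevin proposal

    def log_q(a, b):  # log density of proposing a when currently at b
        diff = a - b - 0.5 * step**2 * grad_log_post(b)
        return -np.sum(diff**2) / (2.0 * step**2)

    log_alpha = (log_post(z_prop) + log_q(z, z_prop)) - (log_post(z) + log_q(z_prop, z))
    if np.log(np.random.rand()) < log_alpha:
        return z_prop, True    # accept
    return z, False            # reject

# Usage: iterate mala_step, where log_post(z) combines the seismic data misfit of the
# generator output with a standard-normal prior on z.
```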
Audio-Visual Event Localization in Unconstrained Videos
Title | Audio-Visual Event Localization in Unconstrained Videos |
Authors | Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, Chenliang Xu |
Abstract | In this paper, we introduce a novel problem of audio-visual event localization in unconstrained videos. We define an audio-visual event as an event that is both visible and audible in a video segment. We collect an Audio-Visual Event (AVE) dataset to systemically investigate three temporal localization tasks: supervised and weakly-supervised audio-visual event localization, and cross-modality localization. We develop an audio-guided visual attention mechanism to explore audio-visual correlations, propose a dual multimodal residual network (DMRN) to fuse information over the two modalities, and introduce an audio-visual distance learning network to handle the cross-modality localization. Our experiments support the following findings: joint modeling of auditory and visual modalities outperforms independent modeling, the learned attention can capture semantics of sounding objects, temporal alignment is important for audio-visual fusion, the proposed DMRN is effective in fusing audio-visual features, and strong correlations between the two modalities enable cross-modality localization. |
Tasks | Temporal Localization |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08842v1 |
PDF | http://arxiv.org/pdf/1803.08842v1.pdf |
PWC | https://paperswithcode.com/paper/audio-visual-event-localization-in |
Repo | https://github.com/YashNita/Audio-Visual-Event-Localization-in-Unconstrained-Videos |
Framework | pytorch |
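A minimal audio-guided visual attention layer in the spirit of the mechanism described in the abstract: audio features attend over spatial visual features, and the attention-weighted pooling gives the visual representation for a segment. Dimensions and the scoring function are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class AudioGuidedAttention(nn.Module):
    """Pool spatial visual features with attention weights conditioned on audio."""
    def __init__(self, vis_dim=512, aud_dim=128, hid_dim=256):
        super().__init__()
        self.proj_v = nn.Linear(vis_dim, hid_dim)
        self.proj_a = nn.Linear(aud_dim, hid_dim)
        self.score = nn.Linear(hid_dim, 1)

    def forward(self, vis, aud):
        # vis: (B, R, vis_dim) spatial regions; aud: (B, aud_dim) audio feature
        joint = torch.tanh(self.proj_v(vis) + self.proj_a(aud).unsqueeze(1))
        weights = torch.softmax(self.score(joint).squeeze(-1), dim=1)   # (B, R)
        return (weights.unsqueeze(-1) * vis).sum(dim=1)                 # (B, vis_dim)
```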
Why do deep convolutional networks generalize so poorly to small image transformations?
Title | Why do deep convolutional networks generalize so poorly to small image transformations? |
Authors | Aharon Azulay, Yair Weiss |
Abstract | Convolutional Neural Networks (CNNs) are commonly assumed to be invariant to small image transformations: either because of the convolutional architecture or because they were trained using data augmentation. Recently, several authors have shown that this is not the case: small translations or rescalings of the input image can drastically change the network’s prediction. In this paper, we quantify this phenomenon and ask why neither the convolutional architecture nor data augmentation is sufficient to achieve the desired invariance. Specifically, we show that the convolutional architecture does not give invariance since architectures ignore the classical sampling theorem, and data augmentation does not give invariance because the CNNs learn to be invariant to transformations only for images that are very similar to typical images from the training set. We discuss two possible solutions to this problem: (1) antialiasing the intermediate representations and (2) increasing data augmentation, and show that they provide only a partial solution at best. Taken together, our results indicate that the problem of ensuring invariance to small image transformations in neural networks while preserving high accuracy remains unsolved. |
Tasks | Data Augmentation, Object Recognition |
Published | 2018-05-30 |
URL | https://arxiv.org/abs/1805.12177v4 |
PDF | https://arxiv.org/pdf/1805.12177v4.pdf |
PWC | https://paperswithcode.com/paper/why-do-deep-convolutional-networks-generalize |
Repo | https://github.com/premthomas/keras-image-classification |
Framework | tf |
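To make the kind of measurement the abstract reports concrete, the sketch below shifts each image by a few pixels and counts how often a classifier's top-1 prediction flips; the model, preprocessing, and choice of circular shift are placeholders, not the paper's exact protocol.

```python
import torch

def prediction_flip_rate(model, images, max_shift=3):
    """Fraction of images whose top-1 class changes under small horizontal shifts."""
    model.eval()
    with torch.no_grad():
        base = model(images).argmax(dim=1)
        flips = torch.zeros(images.size(0), dtype=torch.bool)
        for s in range(1, max_shift + 1):
            shifted = torch.roll(images, shifts=s, dims=-1)   # circular shift along width
            flips |= model(shifted).argmax(dim=1) != base
    return flips.float().mean().item()

# Usage: rate = prediction_flip_rate(pretrained_cnn, batch_of_images)
```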
Layered TPOT: Speeding up Tree-based Pipeline Optimization
Title | Layered TPOT: Speeding up Tree-based Pipeline Optimization |
Authors | Pieter Gijsbers, Joaquin Vanschoren, Randal S. Olson |
Abstract | With the demand for machine learning increasing, so does the demand for tools which make it easier to use. Automated machine learning (AutoML) tools have been developed to address this need, such as the Tree-Based Pipeline Optimization Tool (TPOT) which uses genetic programming to build optimal pipelines. We introduce Layered TPOT, a modification to TPOT which aims to create pipelines equally good as the original, but in significantly less time. This approach evaluates candidate pipelines on increasingly large subsets of the data according to their fitness, using a modified evolutionary algorithm to allow for separate competition between pipelines trained on different sample sizes. Empirical evaluation shows that, on sufficiently large datasets, Layered TPOT indeed finds better models faster. |
Tasks | Automated Feature Engineering, AutoML, Hyperparameter Optimization |
Published | 2018-01-18 |
URL | http://arxiv.org/abs/1801.06007v2 |
PDF | http://arxiv.org/pdf/1801.06007v2.pdf |
PWC | https://paperswithcode.com/paper/layered-tpot-speeding-up-tree-based-pipeline |
Repo | https://github.com/EpistasisLab/tpot |
Framework | none |
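The layered-evaluation idea can be sketched independently of TPOT's internals: candidate pipelines are scored on a small data subset first, and only the better ones graduate to larger subsets. The subset schedule, survival fraction, and scoring below are illustrative, not Layered TPOT's actual evolutionary loop.

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import cross_val_score

def layered_evaluation(pipelines, X, y, fractions=(0.1, 0.3, 1.0), keep=0.5):
    """Score pipelines on growing data subsets, discarding the worst at each layer."""
    survivors = list(pipelines)
    rng = np.random.default_rng(0)
    for frac in fractions:
        idx = rng.choice(len(X), size=int(frac * len(X)), replace=False)
        scores = [cross_val_score(clone(p), X[idx], y[idx], cv=3).mean()
                  for p in survivors]
        order = np.argsort(scores)[::-1]                       # best first
        n_keep = max(1, int(keep * len(survivors)))
        survivors = [survivors[i] for i in order[:n_keep]]
    return survivors[0]                                        # best surviving pipeline
```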
Geometry Score: A Method For Comparing Generative Adversarial Networks
Title | Geometry Score: A Method For Comparing Generative Adversarial Networks |
Authors | Valentin Khrulkov, Ivan Oseledets |
Abstract | One of the biggest challenges in the research of generative adversarial networks (GANs) is assessing the quality of generated samples and detecting various levels of mode collapse. In this work, we construct a novel measure of performance of a GAN by comparing geometrical properties of the underlying data manifold and the generated one, which provides both qualitative and quantitative means for evaluation. Our algorithm can be applied to datasets of an arbitrary nature and is not limited to visual data. We test the obtained metric on various real-life models and datasets and demonstrate that our method provides new insights into properties of GANs. |
Tasks | |
Published | 2018-02-07 |
URL | http://arxiv.org/abs/1802.02664v3 |
PDF | http://arxiv.org/pdf/1802.02664v3.pdf |
PWC | https://paperswithcode.com/paper/geometry-score-a-method-for-comparing |
Repo | https://github.com/KhrulkovV/geometry-score |
Framework | none |
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Title | Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis |
Authors | Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous |
Abstract | In this work, we propose “global style tokens” (GSTs), a bank of embeddings that are jointly trained within Tacotron, a state-of-the-art end-to-end speech synthesis system. The embeddings are trained with no explicit labels, yet learn to model a large range of acoustic expressiveness. GSTs lead to a rich set of significant results. The soft interpretable “labels” they generate can be used to control synthesis in novel ways, such as varying speed and speaking style - independently of the text content. They can also be used for style transfer, replicating the speaking style of a single audio clip across an entire long-form text corpus. When trained on noisy, unlabeled found data, GSTs learn to factorize noise and speaker identity, providing a path towards highly scalable but robust speech synthesis. |
Tasks | Speech Synthesis, Style Transfer |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.09017v1 |
PDF | http://arxiv.org/pdf/1803.09017v1.pdf |
PWC | https://paperswithcode.com/paper/style-tokens-unsupervised-style-modeling |
Repo | https://github.com/KinglittleQ/GST-Tacotron |
Framework | pytorch |
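A stripped-down global-style-token layer in the spirit of the abstract: a learned bank of token embeddings is attended to by a reference (prosody) embedding, and the attention-weighted sum is the style embedding conditioning synthesis. A single attention head is used here for brevity; the paper uses multi-head attention and integrates the module into Tacotron.

```python
import torch
import torch.nn as nn

class GlobalStyleTokens(nn.Module):
    """Single-head attention over a learned bank of style token embeddings."""
    def __init__(self, n_tokens=10, token_dim=256, ref_dim=128):
        super().__init__()
        self.tokens = nn.Parameter(torch.randn(n_tokens, token_dim) * 0.3)
        self.query = nn.Linear(ref_dim, token_dim)

    def forward(self, ref_embedding):
        # ref_embedding: (B, ref_dim) summary of a reference audio clip
        q = self.query(ref_embedding)                              # (B, token_dim)
        keys = torch.tanh(self.tokens)                             # (n_tokens, token_dim)
        attn = torch.softmax(q @ keys.t() / keys.size(-1) ** 0.5, dim=-1)
        return attn @ keys                                         # (B, token_dim) style embedding
```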
Highly Scalable Image Reconstruction using Deep Neural Networks with Bandpass Filtering
Title | Highly Scalable Image Reconstruction using Deep Neural Networks with Bandpass Filtering |
Authors | Joseph Y. Cheng, Feiyu Chen, Marcus T. Alley, John M. Pauly, Shreyas S. Vasanawala |
Abstract | To increase the flexibility and scalability of deep neural networks for image reconstruction, a framework is proposed based on bandpass filtering. For many applications, sensing measurements are performed indirectly. For example, in magnetic resonance imaging, data are sampled in the frequency domain. The introduction of bandpass filtering enables leveraging known imaging physics while ensuring that the final reconstruction is consistent with actual measurements to maintain reconstruction accuracy. We demonstrate this flexible architecture for reconstructing subsampled datasets of MRI scans. The resulting high subsampling rates increase the speed of MRI acquisitions and enable the visualization of rapid hemodynamics. |
Tasks | Image Reconstruction |
Published | 2018-05-08 |
URL | http://arxiv.org/abs/1805.03300v2 |
PDF | http://arxiv.org/pdf/1805.03300v2.pdf |
PWC | https://paperswithcode.com/paper/highly-scalable-image-reconstruction-using |
Repo | https://github.com/MRSRL/dl-cs |
Framework | tf |
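The core idea, splitting k-space into frequency bands that can be processed independently and recombined, can be sketched with a plain FFT; the band boundaries and the per-band reconstruction network are placeholders for what the paper actually trains.

```python
import numpy as np

def split_into_bands(image, n_bands=4):
    """Decompose an image into frequency bands along one axis of its k-space."""
    kspace = np.fft.fftshift(np.fft.fft2(image))
    bands, width = [], kspace.shape[1] // n_bands
    for b in range(n_bands):
        mask = np.zeros_like(kspace)
        mask[:, b * width:(b + 1) * width] = 1.0        # bandpass mask
        bands.append(np.fft.ifft2(np.fft.ifftshift(kspace * mask)))
    return bands                                         # each band could be fed to its own network

image = np.random.rand(128, 128)
bands = split_into_bands(image)
# Summing the band reconstructions recovers the original image (up to float error),
# which is what lets per-band outputs be recombined consistently with the measurements.
assert np.allclose(sum(bands).real, image, atol=1e-8)
```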
Multi Task Deep Morphological Analyzer: Context Aware Joint Morphological Tagging and Lemma Prediction
Title | Multi Task Deep Morphological Analyzer: Context Aware Joint Morphological Tagging and Lemma Prediction |
Authors | Saurav Jha, Akhilesh Sudhakar, Anil Kumar Singh |
Abstract | The ambiguities introduced by the recombination of morphemes constructing several possible inflections for a word make the prediction of syntactic traits in Morphologically Rich Languages (MRLs) a notoriously complicated task. We propose the Multi Task Deep Morphological analyzer (MT-DMA), a character-level neural morphological analyzer based on multitask learning of word-level tag markers for Hindi and Urdu. MT-DMA predicts a set of six morphological tags for words of Indo-Aryan languages: Parts-of-speech (POS), Gender (G), Number (N), Person (P), Case (C), and the Tense-Aspect-Modality (TAM) marker, as well as the Lemma (L), by jointly learning all of these in one trainable framework. We show the effectiveness of training such deep neural networks by the simultaneous optimization of multiple loss functions and sharing of initial parameters for context-aware morphological analysis. Exploiting character-level features in phonological space optimized for each tag using a multi-objective genetic algorithm, our model establishes a new state-of-the-art accuracy score on all seven tasks for both languages. MT-DMA is publicly accessible: code, models and data are available at https://github.com/Saurav0074/morph_analyzer. |
Tasks | Dependency Parsing, Machine Translation, Morphological Analysis, Morphological Tagging |
Published | 2018-11-21 |
URL | https://arxiv.org/abs/1811.08619v2 |
PDF | https://arxiv.org/pdf/1811.08619v2.pdf |
PWC | https://paperswithcode.com/paper/multi-task-deep-morphological-analyzer |
Repo | https://github.com/Saurav0074/morph_analyzer |
Framework | none |
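A minimal multitask head arrangement in the spirit of MT-DMA: a shared character-level encoder feeds several tag-specific classifiers whose losses are summed during training. Layer sizes, tag inventories, and the encoder itself are illustrative, not the published architecture.

```python
import torch
import torch.nn as nn

class MultiTaskTagger(nn.Module):
    """Shared character-level encoder with one classification head per morphological tag."""
    def __init__(self, n_chars=128, emb_dim=64, hid_dim=128, tag_sizes=None):
        super().__init__()
        tag_sizes = tag_sizes or {"POS": 30, "Gender": 4, "Number": 3, "Case": 8}
        self.embed = nn.Embedding(n_chars, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.heads = nn.ModuleDict({t: nn.Linear(2 * hid_dim, n) for t, n in tag_sizes.items()})

    def forward(self, char_ids):
        # char_ids: (B, L) character indices of one word
        _, (h, _) = self.encoder(self.embed(char_ids))
        word_repr = torch.cat([h[-2], h[-1]], dim=-1)   # concat forward and backward final states
        return {tag: head(word_repr) for tag, head in self.heads.items()}

# Joint training sums one cross-entropy loss per head:
#   loss = sum(F.cross_entropy(logits[tag], labels[tag]) for tag in logits)
```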
Assessing Composition in Sentence Vector Representations
Title | Assessing Composition in Sentence Vector Representations |
Authors | Allyson Ettinger, Ahmed Elgohary, Colin Phillips, Philip Resnik |
Abstract | An important component of achieving language understanding is mastering the composition of sentence meaning, but an immediate challenge to solving this problem is the opacity of sentence vector representations produced by current neural sentence composition models. We present a method to address this challenge, developing tasks that directly target compositional meaning information in sentence vector representations with a high degree of precision and control. To enable the creation of these controlled tasks, we introduce a specialized sentence generation system that produces large, annotated sentence sets meeting specified syntactic, semantic and lexical constraints. We describe the details of the method and generation system, and then present results of experiments applying our method to probe for compositional information in embeddings from a number of existing sentence composition models. We find that the method is able to extract useful information about the differing capacities of these models, and we discuss the implications of our results with respect to these systems’ capturing of sentence information. We make available for public use the datasets used for these experiments, as well as the generation system. |
Tasks | |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.03992v1 |
PDF | http://arxiv.org/pdf/1809.03992v1.pdf |
PWC | https://paperswithcode.com/paper/assessing-composition-in-sentence-vector |
Repo | https://github.com/aetting/compeval-generation-system |
Framework | none |
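In its simplest form, the probing setup described above amounts to training a lightweight classifier on frozen sentence embeddings to predict a compositional property (for example, which noun is the agent). The sketch below assumes the embeddings and labels are already available; it is not the authors' generation system or task suite.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def probe_embeddings(embeddings, labels):
    """Train and evaluate a linear probe on frozen sentence embeddings."""
    X_tr, X_te, y_tr, y_te = train_test_split(embeddings, labels,
                                              test_size=0.2, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    # Accuracy well above chance suggests the property is linearly decodable
    # from the sentence representation.
    return probe.score(X_te, y_te)

# Usage: acc = probe_embeddings(sentence_vectors, agent_labels)
```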
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Title | Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks |
Authors | Lechao Xiao, Yasaman Bahri, Jascha Sohl-Dickstein, Samuel S. Schoenholz, Jeffrey Pennington |
Abstract | In recent years, state-of-the-art methods in computer vision have utilized increasingly deep convolutional neural network architectures (CNNs), with some of the most successful models employing hundreds or even thousands of layers. A variety of pathologies such as vanishing/exploding gradients make training such deep networks challenging. While residual connections and batch normalization do enable training at these depths, it has remained unclear whether such specialized architecture designs are truly necessary to train deep CNNs. In this work, we demonstrate that it is possible to train vanilla CNNs with ten thousand layers or more simply by using an appropriate initialization scheme. We derive this initialization scheme theoretically by developing a mean field theory for signal propagation and by characterizing the conditions for dynamical isometry, the equilibration of singular values of the input-output Jacobian matrix. These conditions require that the convolution operator be an orthogonal transformation in the sense that it is norm-preserving. We present an algorithm for generating such random initial orthogonal convolution kernels and demonstrate empirically that they enable efficient training of extremely deep architectures. |
Tasks | |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05393v2 |
PDF | http://arxiv.org/pdf/1806.05393v2.pdf |
PWC | https://paperswithcode.com/paper/dynamical-isometry-and-a-mean-field-theory-of-2 |
Repo | https://github.com/JiJingYu/delta_orthogonal_init_pytorch |
Framework | pytorch |
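The initialization the abstract refers to, a delta-orthogonal kernel with a random orthogonal matrix placed at the spatial center of the filter and zeros elsewhere, can be sketched as follows. The construction below assumes equal input and output channel counts; the paper also treats the general case.

```python
import torch

def delta_orthogonal_(weight, gain=1.0):
    """In-place delta-orthogonal init for a conv kernel of shape (C_out, C_in, k, k)."""
    c_out, c_in, kh, kw = weight.shape
    assert c_out == c_in, "sketch assumes equal channel counts"
    # Random orthogonal matrix via QR decomposition of a Gaussian matrix.
    q, r = torch.linalg.qr(torch.randn(c_out, c_in))
    q *= torch.sign(torch.diagonal(r))          # sign fix so Q is Haar-distributed
    with torch.no_grad():
        weight.zero_()
        weight[:, :, kh // 2, kw // 2] = gain * q   # orthogonal block at the spatial center
    return weight

# Usage:
# conv = torch.nn.Conv2d(64, 64, kernel_size=3, padding=1)
# delta_orthogonal_(conv.weight)
```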
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
Title | TVM: An Automated End-to-End Optimizing Compiler for Deep Learning |
Authors | Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy |
Abstract | There is an increasing need to bring machine learning to a wide diversity of hardware devices. Current frameworks rely on vendor-specific operator libraries and optimize for a narrow range of server-class GPUs. Deploying workloads to new platforms – such as mobile phones, embedded devices, and accelerators (e.g., FPGAs, ASICs) – requires significant manual effort. We propose TVM, a compiler that exposes graph-level and operator-level optimizations to provide performance portability to deep learning workloads across diverse hardware back-ends. TVM solves optimization challenges specific to deep learning, such as high-level operator fusion, mapping to arbitrary hardware primitives, and memory latency hiding. It also automates optimization of low-level programs to hardware characteristics by employing a novel, learning-based cost modeling method for rapid exploration of code optimizations. Experimental results show that TVM delivers performance across hardware back-ends that are competitive with state-of-the-art, hand-tuned libraries for low-power CPU, mobile GPU, and server-class GPUs. We also demonstrate TVM’s ability to target new accelerator back-ends, such as the FPGA-based generic deep learning accelerator. The system is open sourced and in production use inside several major companies. |
Tasks | |
Published | 2018-02-12 |
URL | http://arxiv.org/abs/1802.04799v3 |
PDF | http://arxiv.org/pdf/1802.04799v3.pdf |
PWC | https://paperswithcode.com/paper/tvm-an-automated-end-to-end-optimizing |
Repo | https://github.com/ctuning/ck-tvm |
Framework | mxnet |
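For a flavor of the separation between operator definition and schedule that the abstract describes, the snippet below uses TVM's tensor-expression API to declare and compile a vector addition for the local CPU. It assumes a TVM build in which the `te` schedule API and `NDArray.numpy()` are available, and it is a toy example rather than a tuned kernel.

```python
import numpy as np
import tvm
from tvm import te

n = 1024
A = te.placeholder((n,), name="A")                     # declare operator inputs
B = te.placeholder((n,), name="B")
C = te.compute((n,), lambda i: A[i] + B[i], name="C")  # declare the computation

s = te.create_schedule(C.op)                           # the schedule decides how it runs
fadd = tvm.build(s, [A, B, C], target="llvm")          # compile for the local CPU back-end

a = tvm.nd.array(np.random.rand(n).astype("float32"))
b = tvm.nd.array(np.random.rand(n).astype("float32"))
c = tvm.nd.array(np.zeros(n, dtype="float32"))
fadd(a, b, c)
np.testing.assert_allclose(c.numpy(), a.numpy() + b.numpy())
```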
Causal Queries from Observational Data in Biological Systems via Bayesian Networks: An Empirical Study in Small Networks
Title | Causal Queries from Observational Data in Biological Systems via Bayesian Networks: An Empirical Study in Small Networks |
Authors | Alex White, Matthieu Vignes |
Abstract | Biological networks are a very convenient modelling and visualisation tool to discover knowledge from modern high-throughput genomics and postgenomics data sets. Indeed, biological entities are not isolated, but are components of complex multi-level systems. We go one step further and advocate for the consideration of causal representations of the interactions in living systems. We present the causal formalism and bring it out in the context of biological networks, when the data is observational. We also discuss its ability to decipher the causal information flow as observed in gene expression. We illustrate our exploration with experiments on small simulated networks as well as on a real biological data set. |
Tasks | |
Published | 2018-05-04 |
URL | http://arxiv.org/abs/1805.01608v1 |
PDF | http://arxiv.org/pdf/1805.01608v1.pdf |
PWC | https://paperswithcode.com/paper/causal-queries-from-observational-data-in |
Repo | https://github.com/alexW335/BNCausalExperiments |
Framework | none |
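As a concrete instance of the kind of causal query discussed above, the sketch below computes an interventional distribution P(Y | do(X = 1)) on a toy three-node network Z → X → Y with Z → Y, using the truncated-factorization (back-door) formula, and contrasts it with the confounded observational conditional. The probability tables are invented for illustration.

```python
import numpy as np

# Toy Bayesian network: Z -> X, Z -> Y, X -> Y, all variables binary.
p_z = np.array([0.6, 0.4])                            # P(Z)
p_x_given_z = np.array([[0.7, 0.3],                   # P(X | Z=0)
                        [0.2, 0.8]])                  # P(X | Z=1)
p_y_given_xz = np.array([[[0.9, 0.1], [0.6, 0.4]],    # P(Y | X=0, Z=0), P(Y | X=0, Z=1)
                         [[0.5, 0.5], [0.1, 0.9]]])   # P(Y | X=1, Z=0), P(Y | X=1, Z=1)

def p_y_do_x(x):
    """P(Y | do(X=x)) = sum_z P(z) * P(Y | X=x, Z=z)  (truncated factorization)."""
    return sum(p_z[z] * p_y_given_xz[x, z] for z in range(2))

def p_y_given_x(x):
    """Observational P(Y | X=x) for comparison: confounded by Z."""
    joint = sum(p_z[z] * p_x_given_z[z, x] * p_y_given_xz[x, z] for z in range(2))
    return joint / joint.sum()

print("interventional:", p_y_do_x(1))
print("observational:", p_y_given_x(1))
```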
Manifold regularization with GANs for semi-supervised learning
Title | Manifold regularization with GANs for semi-supervised learning |
Authors | Bruno Lecouat, Chuan-Sheng Foo, Houssam Zenati, Vijay Chandrasekhar |
Abstract | Generative Adversarial Networks are powerful generative models that are able to model the manifold of natural images. We leverage this property to perform manifold regularization by approximating a variant of the Laplacian norm using a Monte Carlo approximation that is easily computed with the GAN. When incorporated into the semi-supervised feature-matching GAN we achieve state-of-the-art results for GAN-based semi-supervised learning on CIFAR-10 and SVHN benchmarks, with a method that is significantly easier to implement than competing methods. We also find that manifold regularization improves the quality of generated images, and is affected by the quality of the GAN used to approximate the regularizer. |
Tasks | |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.04307v1 |
PDF | http://arxiv.org/pdf/1807.04307v1.pdf |
PWC | https://paperswithcode.com/paper/manifold-regularization-with-gans-for-semi |
Repo | https://github.com/bruno-31/gan-manifold-reg |
Framework | tf |
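The Monte Carlo regularizer described above amounts to penalizing how much the classifier's output changes between a generated point and a small step along the GAN manifold. The sketch below shows that penalty term in isolation; the step size, norm, and how the term is weighted into the feature-matching GAN objective follow the paper only loosely.

```python
import torch
import torch.nn.functional as F

def manifold_regularizer(classifier, generator, batch_size=64, z_dim=100, eps=1e-2):
    """Monte Carlo manifold smoothness penalty: ||f(G(z)) - f(G(z + eps*d))||^2."""
    z = torch.randn(batch_size, z_dim)
    delta = F.normalize(torch.randn(batch_size, z_dim), dim=1)   # random unit direction in latent space
    logits = classifier(generator(z))
    logits_perturbed = classifier(generator(z + eps * delta))
    return ((logits - logits_perturbed) ** 2).sum(dim=1).mean()

# Usage in a semi-supervised objective:
#   total_loss = supervised_loss + unsupervised_gan_loss + lam * manifold_regularizer(f, G)
```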