Paper Group AWR 142
DeepSaucer: Unified Environment for Verifying Deep Neural Networks
Title | DeepSaucer: Unified Environment for Verifying Deep Neural Networks |
Authors | Naoto Sato, Hironobu Kuruma, Masanori Kaneko, Yuichiroh Nakagawa, Hideto Ogawa, Thai Son Hoang, Michael Butler |
Abstract | In recent years, a number of methods for verifying DNNs have been developed. Because these methods take different approaches and have their own limitations, we think that multiple verification methods should be applied to a developed DNN. To apply multiple methods to a DNN, it is necessary to translate either the implementation of the DNN or the verification method so that one runs in the same environment as the other. Since such translations are time-consuming, we propose a utility tool, named DeepSaucer, which helps retain and reuse implementations of DNNs, verification methods, and their environments. In DeepSaucer, code snippets for loading DNNs, running verification methods, and creating their environments are retained and reused as software assets in order to reduce the cost of verifying DNNs. The feasibility of DeepSaucer is confirmed by implementing it on the basis of Anaconda, which provides a virtual environment for loading a DNN and running a verification method. In addition, the effectiveness of DeepSaucer is demonstrated by use-case examples. |
Tasks | |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.03752v1 |
PDF | http://arxiv.org/pdf/1811.03752v1.pdf |
PWC | https://paperswithcode.com/paper/deepsaucer-unified-environment-for-verifying |
Repo | https://github.com/hitachi-rd-yokohama-sato/deep_saucer |
Framework | tf |
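The abstract describes DeepSaucer's workflow only at a high level. Below is a minimal conceptual sketch, not the tool's actual API, of the idea of pairing reusable assets (a model-loading script, a verification script, and a conda environment) and running them together; every name in it is hypothetical.

```python
import subprocess

# Hypothetical asset registry: each verification method is paired with the
# conda environment it needs and the script that loads the DNN under test.
ASSETS = {
    "robustness_check": {
        "env_name": "verify_tf",              # created beforehand from an environment.yml
        "model_loader": "load_mnist_model.py",
        "verifier": "run_robustness_check.py",
    },
}

def run_verification(asset_key: str) -> None:
    """Run a registered verification method inside its own conda environment."""
    asset = ASSETS[asset_key]
    # `conda run -n <env>` executes a command inside the named environment,
    # so the DNN loader and the verifier share the same dependencies.
    for script in (asset["model_loader"], asset["verifier"]):
        subprocess.run(["conda", "run", "-n", asset["env_name"], "python", script],
                       check=True)

if __name__ == "__main__":
    run_verification("robustness_check")
```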
Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models
Title | Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models |
Authors | Pouya Samangouei, Maya Kabkab, Rama Chellappa |
Abstract | In recent years, deep neural network approaches have been widely adopted for machine learning tasks, including classification. However, they were shown to be vulnerable to adversarial perturbations: carefully crafted small perturbations can cause misclassification of legitimate images. We propose Defense-GAN, a new framework leveraging the expressive capability of generative models to defend deep neural networks against such attacks. Defense-GAN is trained to model the distribution of unperturbed images. At inference time, it finds a close output to a given image which does not contain the adversarial changes. This output is then fed to the classifier. Our proposed method can be used with any classification model and does not modify the classifier structure or training procedure. It can also be used as a defense against any attack as it does not assume knowledge of the process for generating the adversarial examples. We empirically show that Defense-GAN is consistently effective against different attack methods and improves on existing defense strategies. Our code has been made publicly available at https://github.com/kabkabm/defensegan |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06605v2 |
PDF | http://arxiv.org/pdf/1805.06605v2.pdf |
PWC | https://paperswithcode.com/paper/defense-gan-protecting-classifiers-against |
Repo | https://github.com/bibin-sebastian/Physical_Adversarial_examples_GAN |
Framework | tf |
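As a rough illustration of the reconstruction step described in the abstract, the sketch below projects an input onto the range of a pretrained generator by minimizing ||G(z) − x||² over the latent code, and only the projection is passed to the classifier. The generator, classifier, and hyperparameters are placeholders rather than the paper's exact settings.

```python
import torch

def defense_gan_reconstruct(x, generator, z_dim=100, n_restarts=10, n_steps=200, lr=0.05):
    """Project image batch x onto the generator's range: argmin_z ||G(z) - x||^2."""
    best_rec, best_err = None, None
    for _ in range(n_restarts):                      # random restarts in latent space
        z = torch.randn(x.size(0), z_dim, requires_grad=True)
        opt = torch.optim.SGD([z], lr=lr)
        for _ in range(n_steps):
            opt.zero_grad()
            err = ((generator(z) - x) ** 2).flatten(1).sum(dim=1)
            err.sum().backward()
            opt.step()
        with torch.no_grad():
            rec = generator(z)
            err = ((rec - x) ** 2).flatten(1).sum(dim=1)
            if best_err is None:
                best_rec, best_err = rec, err
            else:                                    # keep the best restart per image
                mask = (err < best_err).view(-1, *([1] * (rec.dim() - 1)))
                best_rec = torch.where(mask, rec, best_rec)
                best_err = torch.minimum(err, best_err)
    return best_rec

# The classifier then sees only the projection:
# logits = classifier(defense_gan_reconstruct(x_adv, generator))
```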
Stochastic seismic waveform inversion using generative adversarial networks as a geological prior
Title | Stochastic seismic waveform inversion using generative adversarial networks as a geological prior |
Authors | Lukas Mosser, Olivier Dubrule, Martin J. Blunt |
Abstract | We present an application of deep generative models in the context of partial-differential equation (PDE) constrained inverse problems. We combine a generative adversarial network (GAN) representing an a priori model that creates subsurface geological structures and their petrophysical properties, with the numerical solution of the PDE governing the propagation of acoustic waves within the earth’s interior. We perform Bayesian inversion using an approximate Metropolis-adjusted Langevin algorithm (MALA) to sample from the posterior given seismic observations. Gradients with respect to the model parameters governing the forward problem are obtained by solving the adjoint of the acoustic wave equation. Gradients of the mismatch with respect to the latent variables are obtained by leveraging the differentiable nature of the deep neural network used to represent the generative model. We show that approximate MALA sampling allows efficient Bayesian inversion of model parameters obtained from a prior represented by a deep generative model, obtaining a diverse set of realizations that reflect the observed seismic response. |
Tasks | |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03720v1 |
PDF | http://arxiv.org/pdf/1806.03720v1.pdf |
PWC | https://paperswithcode.com/paper/stochastic-seismic-waveform-inversion-using |
Repo | https://github.com/LukasMosser/stochastic_seismic_waveform_inversion |
Framework | pytorch |
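The sampler described in the abstract reduces to a Metropolis-adjusted Langevin step over the GAN's latent variables. The generic sketch below leaves the log-posterior and its gradient (which in the paper come from the seismic forward model, its adjoint, and backpropagation through the generator) as user-supplied callables.

```python
import numpy as np

def mala_step(z, log_post, grad_log_post, step):
    """One Metropolis-adjusted Langevin (MALA) update of latent vector z."""
    noise = np.random.randn(*z.shape)
    z_prop = z + 0.5 * step**2 * grad_log_post(z) + step * noise   # Langevin proposal

    def log_q(a, b):  # log density of proposing a when currently at b
        diff = a - b - 0.5 * step**2 * grad_log_post(b)
        return -np.sum(diff**2) / (2.0 * step**2)

    log_alpha = (log_post(z_prop) + log_q(z, z_prop)) - (log_post(z) + log_q(z_prop, z))
    if np.log(np.random.rand()) < log_alpha:
        return z_prop, True    # accept
    return z, False            # reject

# Usage: iterate mala_step, where log_post(z) combines the seismic data misfit of the
# generator output with a standard-normal prior on z.
```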
Audio-Visual Event Localization in Unconstrained Videos
Title | Audio-Visual Event Localization in Unconstrained Videos |
Authors | Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, Chenliang Xu |
Abstract | In this paper, we introduce a novel problem of audio-visual event localization in unconstrained videos. We define an audio-visual event as an event that is both visible and audible in a video segment. We collect an Audio-Visual Event (AVE) dataset to systemically investigate three temporal localization tasks: supervised and weakly-supervised audio-visual event localization, and cross-modality localization. We develop an audio-guided visual attention mechanism to explore audio-visual correlations, propose a dual multimodal residual network (DMRN) to fuse information over the two modalities, and introduce an audio-visual distance learning network to handle the cross-modality localization. Our experiments support the following findings: joint modeling of auditory and visual modalities outperforms independent modeling, the learned attention can capture semantics of sounding objects, temporal alignment is important for audio-visual fusion, the proposed DMRN is effective in fusing audio-visual features, and strong correlations between the two modalities enable cross-modality localization. |
Tasks | Temporal Localization |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08842v1 |
PDF | http://arxiv.org/pdf/1803.08842v1.pdf |
PWC | https://paperswithcode.com/paper/audio-visual-event-localization-in |
Repo | https://github.com/YashNita/Audio-Visual-Event-Localization-in-Unconstrained-Videos |
Framework | pytorch |
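A minimal audio-guided visual attention layer in the spirit of the mechanism described in the abstract: audio features attend over spatial visual features, and the attention-weighted pooling gives the visual representation for a segment. Dimensions and the scoring function are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class AudioGuidedAttention(nn.Module):
    """Pool spatial visual features with attention weights conditioned on audio."""
    def __init__(self, vis_dim=512, aud_dim=128, hid_dim=256):
        super().__init__()
        self.proj_v = nn.Linear(vis_dim, hid_dim)
        self.proj_a = nn.Linear(aud_dim, hid_dim)
        self.score = nn.Linear(hid_dim, 1)

    def forward(self, vis, aud):
        # vis: (B, R, vis_dim) spatial regions; aud: (B, aud_dim) audio feature
        joint = torch.tanh(self.proj_v(vis) + self.proj_a(aud).unsqueeze(1))
        weights = torch.softmax(self.score(joint).squeeze(-1), dim=1)   # (B, R)
        return (weights.unsqueeze(-1) * vis).sum(dim=1)                 # (B, vis_dim)
```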
Why do deep convolutional networks generalize so poorly to small image transformations?
Title | Why do deep convolutional networks generalize so poorly to small image transformations? |
Authors | Aharon Azulay, Yair Weiss |
Abstract | Convolutional Neural Networks (CNNs) are commonly assumed to be invariant to small image transformations: either because of the convolutional architecture or because they were trained using data augmentation. Recently, several authors have shown that this is not the case: small translations or rescalings of the input image can drastically change the network’s prediction. In this paper, we quantify this phenomenon and ask why neither the convolutional architecture nor data augmentation is sufficient to achieve the desired invariance. Specifically, we show that the convolutional architecture does not give invariance since architectures ignore the classical sampling theorem, and data augmentation does not give invariance because the CNNs learn to be invariant to transformations only for images that are very similar to typical images from the training set. We discuss two possible solutions to this problem: (1) antialiasing the intermediate representations and (2) increasing data augmentation, and show that they provide only a partial solution at best. Taken together, our results indicate that the problem of ensuring invariance to small image transformations in neural networks while preserving high accuracy remains unsolved. |
Tasks | Data Augmentation, Object Recognition |
Published | 2018-05-30 |
URL | https://arxiv.org/abs/1805.12177v4 |
PDF | https://arxiv.org/pdf/1805.12177v4.pdf |
PWC | https://paperswithcode.com/paper/why-do-deep-convolutional-networks-generalize |
Repo | https://github.com/premthomas/keras-image-classification |
Framework | tf |
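To make the kind of measurement the abstract reports concrete, the sketch below shifts each image by a few pixels and counts how often a classifier's top-1 prediction flips; the model, preprocessing, and choice of circular shift are placeholders, not the paper's exact protocol.

```python
import torch

def prediction_flip_rate(model, images, max_shift=3):
    """Fraction of images whose top-1 class changes under small horizontal shifts."""
    model.eval()
    with torch.no_grad():
        base = model(images).argmax(dim=1)
        flips = torch.zeros(images.size(0), dtype=torch.bool)
        for s in range(1, max_shift + 1):
            shifted = torch.roll(images, shifts=s, dims=-1)   # circular shift along width
            flips |= model(shifted).argmax(dim=1) != base
    return flips.float().mean().item()

# Usage: rate = prediction_flip_rate(pretrained_cnn, batch_of_images)
```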
Layered TPOT: Speeding up Tree-based Pipeline Optimization
Title | Layered TPOT: Speeding up Tree-based Pipeline Optimization |
Authors | Pieter Gijsbers, Joaquin Vanschoren, Randal S. Olson |
Abstract | With the demand for machine learning increasing, so does the demand for tools which make it easier to use. Automated machine learning (AutoML) tools have been developed to address this need, such as the Tree-Based Pipeline Optimization Tool (TPOT) which uses genetic programming to build optimal pipelines. We introduce Layered TPOT, a modification to TPOT which aims to create pipelines equally good as the original, but in significantly less time. This approach evaluates candidate pipelines on increasingly large subsets of the data according to their fitness, using a modified evolutionary algorithm to allow for separate competition between pipelines trained on different sample sizes. Empirical evaluation shows that, on sufficiently large datasets, Layered TPOT indeed finds better models faster. |
Tasks | Automated Feature Engineering, AutoML, Hyperparameter Optimization |
Published | 2018-01-18 |
URL | http://arxiv.org/abs/1801.06007v2 |
PDF | http://arxiv.org/pdf/1801.06007v2.pdf |
PWC | https://paperswithcode.com/paper/layered-tpot-speeding-up-tree-based-pipeline |
Repo | https://github.com/EpistasisLab/tpot |
Framework | none |
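The layered-evaluation idea can be sketched independently of TPOT's internals: candidate pipelines are scored on a small data subset first, and only the better ones graduate to larger subsets. The subset schedule, survival fraction, and scoring below are illustrative, not Layered TPOT's actual evolutionary loop.

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import cross_val_score

def layered_evaluation(pipelines, X, y, fractions=(0.1, 0.3, 1.0), keep=0.5):
    """Score pipelines on growing data subsets, discarding the worst at each layer."""
    survivors = list(pipelines)
    rng = np.random.default_rng(0)
    for frac in fractions:
        idx = rng.choice(len(X), size=int(frac * len(X)), replace=False)
        scores = [cross_val_score(clone(p), X[idx], y[idx], cv=3).mean()
                  for p in survivors]
        order = np.argsort(scores)[::-1]                       # best first
        n_keep = max(1, int(keep * len(survivors)))
        survivors = [survivors[i] for i in order[:n_keep]]
    return survivors[0]                                        # best surviving pipeline
```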
Geometry Score: A Method For Comparing Generative Adversarial Networks
Title | Geometry Score: A Method For Comparing Generative Adversarial Networks |
Authors | Valentin Khrulkov, Ivan Oseledets |
Abstract | One of the biggest challenges in the research of generative adversarial networks (GANs) is assessing the quality of generated samples and detecting various levels of mode collapse. In this work, we construct a novel measure of performance of a GAN by comparing geometrical properties of the underlying data manifold and the generated one, which provides both qualitative and quantitative means for evaluation. Our algorithm can be applied to datasets of an arbitrary nature and is not limited to visual data. We test the obtained metric on various real-life models and datasets and demonstrate that our method provides new insights into properties of GANs. |
Tasks | |
Published | 2018-02-07 |
URL | http://arxiv.org/abs/1802.02664v3 |
PDF | http://arxiv.org/pdf/1802.02664v3.pdf |
PWC | https://paperswithcode.com/paper/geometry-score-a-method-for-comparing |
Repo | https://github.com/KhrulkovV/geometry-score |
Framework | none |
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Title | Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis |
Authors | Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous |
Abstract | In this work, we propose “global style tokens” (GSTs), a bank of embeddings that are jointly trained within Tacotron, a state-of-the-art end-to-end speech synthesis system. The embeddings are trained with no explicit labels, yet learn to model a large range of acoustic expressiveness. GSTs lead to a rich set of significant results. The soft interpretable “labels” they generate can be used to control synthesis in novel ways, such as varying speed and speaking style - independently of the text content. They can also be used for style transfer, replicating the speaking style of a single audio clip across an entire long-form text corpus. When trained on noisy, unlabeled found data, GSTs learn to factorize noise and speaker identity, providing a path towards highly scalable but robust speech synthesis. |
Tasks | Speech Synthesis, Style Transfer |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.09017v1 |
PDF | http://arxiv.org/pdf/1803.09017v1.pdf |
PWC | https://paperswithcode.com/paper/style-tokens-unsupervised-style-modeling |
Repo | https://github.com/KinglittleQ/GST-Tacotron |
Framework | pytorch |
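A stripped-down global-style-token layer in the spirit of the abstract: a learned bank of token embeddings is attended to by a reference (prosody) embedding, and the attention-weighted sum is the style embedding conditioning synthesis. A single attention head is used here for brevity; the paper uses multi-head attention and integrates the module into Tacotron.

```python
import torch
import torch.nn as nn

class GlobalStyleTokens(nn.Module):
    """Single-head attention over a learned bank of style token embeddings."""
    def __init__(self, n_tokens=10, token_dim=256, ref_dim=128):
        super().__init__()
        self.tokens = nn.Parameter(torch.randn(n_tokens, token_dim) * 0.3)
        self.query = nn.Linear(ref_dim, token_dim)

    def forward(self, ref_embedding):
        # ref_embedding: (B, ref_dim) summary of a reference audio clip
        q = self.query(ref_embedding)                              # (B, token_dim)
        keys = torch.tanh(self.tokens)                             # (n_tokens, token_dim)
        attn = torch.softmax(q @ keys.t() / keys.size(-1) ** 0.5, dim=-1)
        return attn @ keys                                         # (B, token_dim) style embedding
```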
Highly Scalable Image Reconstruction using Deep Neural Networks with Bandpass Filtering
Title | Highly Scalable Image Reconstruction using Deep Neural Networks with Bandpass Filtering |
Authors | Joseph Y. Cheng, Feiyu Chen, Marcus T. Alley, John M. Pauly, Shreyas S. Vasanawala |
Abstract | To increase the flexibility and scalability of deep neural networks for image reconstruction, a framework is proposed based on bandpass filtering. For many applications, sensing measurements are performed indirectly. For example, in magnetic resonance imaging, data are sampled in the frequency domain. The introduction of bandpass filtering enables leveraging known imaging physics while ensuring that the final reconstruction is consistent with actual measurements to maintain reconstruction accuracy. We demonstrate this flexible architecture for reconstructing subsampled datasets of MRI scans. The resulting high subsampling rates increase the speed of MRI acquisitions and enable the visualization of rapid hemodynamics. |
Tasks | Image Reconstruction |
Published | 2018-05-08 |
URL | http://arxiv.org/abs/1805.03300v2 |
PDF | http://arxiv.org/pdf/1805.03300v2.pdf |
PWC | https://paperswithcode.com/paper/highly-scalable-image-reconstruction-using |
Repo | https://github.com/MRSRL/dl-cs |
Framework | tf |
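The core idea, splitting k-space into frequency bands that can be processed independently and recombined, can be sketched with a plain FFT; the band boundaries and the per-band reconstruction network are placeholders for what the paper actually trains.

```python
import numpy as np

def split_into_bands(image, n_bands=4):
    """Decompose an image into frequency bands along one axis of its k-space."""
    kspace = np.fft.fftshift(np.fft.fft2(image))
    bands, width = [], kspace.shape[1] // n_bands
    for b in range(n_bands):
        mask = np.zeros_like(kspace)
        mask[:, b * width:(b + 1) * width] = 1.0        # bandpass mask
        bands.append(np.fft.ifft2(np.fft.ifftshift(kspace * mask)))
    return bands                                         # each band could be fed to its own network

image = np.random.rand(128, 128)
bands = split_into_bands(image)
# Summing the band reconstructions recovers the original image (up to float error),
# which is what lets per-band outputs be recombined consistently with the measurements.
assert np.allclose(sum(bands).real, image, atol=1e-8)
```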
Multi Task Deep Morphological Analyzer: Context Aware Joint Morphological Tagging and Lemma Prediction
Title | Multi Task Deep Morphological Analyzer: Context Aware Joint Morphological Tagging and Lemma Prediction |
Authors | Saurav Jha, Akhilesh Sudhakar, Anil Kumar Singh |
Abstract | The ambiguities introduced by the recombination of morphemes constructing several possible inflections for a word make the prediction of syntactic traits in Morphologically Rich Languages (MRLs) a notoriously complicated task. We propose the Multi Task Deep Morphological analyzer (MT-DMA), a character-level neural morphological analyzer based on multitask learning of word-level tag markers for Hindi and Urdu. MT-DMA predicts a set of six morphological tags for words of Indo-Aryan languages: Parts-of-speech (POS), Gender (G), Number (N), Person (P), Case (C), and the Tense-Aspect-Modality (TAM) marker, as well as the Lemma (L), by jointly learning all of these in one trainable framework. We show the effectiveness of training such deep neural networks by the simultaneous optimization of multiple loss functions and sharing of initial parameters for context-aware morphological analysis. Exploiting character-level features in phonological space optimized for each tag using a multi-objective genetic algorithm, our model establishes a new state-of-the-art accuracy score on all seven tasks for both languages. MT-DMA is publicly accessible: code, models and data are available at https://github.com/Saurav0074/morph_analyzer. |
Tasks | Dependency Parsing, Machine Translation, Morphological Analysis, Morphological Tagging |
Published | 2018-11-21 |
URL | https://arxiv.org/abs/1811.08619v2 |
PDF | https://arxiv.org/pdf/1811.08619v2.pdf |
PWC | https://paperswithcode.com/paper/multi-task-deep-morphological-analyzer |
Repo | https://github.com/Saurav0074/morph_analyzer |
Framework | none |
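A minimal multitask head arrangement in the spirit of MT-DMA: a shared character-level encoder feeds several tag-specific classifiers whose losses are summed during training. Layer sizes, tag inventories, and the encoder itself are illustrative, not the published architecture.

```python
import torch
import torch.nn as nn

class MultiTaskTagger(nn.Module):
    """Shared character-level encoder with one classification head per morphological tag."""
    def __init__(self, n_chars=128, emb_dim=64, hid_dim=128, tag_sizes=None):
        super().__init__()
        tag_sizes = tag_sizes or {"POS": 30, "Gender": 4, "Number": 3, "Case": 8}
        self.embed = nn.Embedding(n_chars, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.heads = nn.ModuleDict({t: nn.Linear(2 * hid_dim, n) for t, n in tag_sizes.items()})

    def forward(self, char_ids):
        # char_ids: (B, L) character indices of one word
        _, (h, _) = self.encoder(self.embed(char_ids))
        word_repr = torch.cat([h[-2], h[-1]], dim=-1)   # concat forward and backward final states
        return {tag: head(word_repr) for tag, head in self.heads.items()}

# Joint training sums one cross-entropy loss per head:
#   loss = sum(F.cross_entropy(logits[tag], labels[tag]) for tag in logits)
```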
Assessing Composition in Sentence Vector Representations
Title | Assessing Composition in Sentence Vector Representations |
Authors | Allyson Ettinger, Ahmed Elgohary, Colin Phillips, Philip Resnik |
Abstract | An important component of achieving language understanding is mastering the composition of sentence meaning, but an immediate challenge to solving this problem is the opacity of sentence vector representations produced by current neural sentence composition models. We present a method to address this challenge, developing tasks that directly target compositional meaning information in sentence vector representations with a high degree of precision and control. To enable the creation of these controlled tasks, we introduce a specialized sentence generation system that produces large, annotated sentence sets meeting specified syntactic, semantic and lexical constraints. We describe the details of the method and generation system, and then present results of experiments applying our method to probe for compositional information in embeddings from a number of existing sentence composition models. We find that the method is able to extract useful information about the differing capacities of these models, and we discuss the implications of our results with respect to these systems’ capturing of sentence information. We make available for public use the datasets used for these experiments, as well as the generation system. |
Tasks | |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.03992v1 |
PDF | http://arxiv.org/pdf/1809.03992v1.pdf |
PWC | https://paperswithcode.com/paper/assessing-composition-in-sentence-vector |
Repo | https://github.com/aetting/compeval-generation-system |
Framework | none |
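In its simplest form, the probing setup described above amounts to training a lightweight classifier on frozen sentence embeddings to predict a compositional property (for example, which noun is the agent). The sketch below assumes the embeddings and labels are already available; it is not the authors' generation system or task suite.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def probe_embeddings(embeddings, labels):
    """Train and evaluate a linear probe on frozen sentence embeddings."""
    X_tr, X_te, y_tr, y_te = train_test_split(embeddings, labels,
                                              test_size=0.2, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    # Accuracy well above chance suggests the property is linearly decodable
    # from the sentence representation.
    return probe.score(X_te, y_te)

# Usage: acc = probe_embeddings(sentence_vectors, agent_labels)
```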
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Title | Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks |
Authors | Lechao Xiao, Yasaman Bahri, Jascha Sohl-Dickstein, Samuel S. Schoenholz, Jeffrey Pennington |
Abstract | In recent years, state-of-the-art methods in computer vision have utilized increasingly deep convolutional neural network architectures (CNNs), with some of the most successful models employing hundreds or even thousands of layers. A variety of pathologies such as vanishing/exploding gradients make training such deep networks challenging. While residual connections and batch normalization do enable training at these depths, it has remained unclear whether such specialized architecture designs are truly necessary to train deep CNNs. In this work, we demonstrate that it is possible to train vanilla CNNs with ten thousand layers or more simply by using an appropriate initialization scheme. We derive this initialization scheme theoretically by developing a mean field theory for signal propagation and by characterizing the conditions for dynamical isometry, the equilibration of singular values of the input-output Jacobian matrix. These conditions require that the convolution operator be an orthogonal transformation in the sense that it is norm-preserving. We present an algorithm for generating such random initial orthogonal convolution kernels and demonstrate empirically that they enable efficient training of extremely deep architectures. |
Tasks | |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05393v2 |
PDF | http://arxiv.org/pdf/1806.05393v2.pdf |
PWC | https://paperswithcode.com/paper/dynamical-isometry-and-a-mean-field-theory-of-2 |
Repo | https://github.com/JiJingYu/delta_orthogonal_init_pytorch |
Framework | pytorch |
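The initialization the abstract refers to, a delta-orthogonal kernel with a random orthogonal matrix placed at the spatial center of the filter and zeros elsewhere, can be sketched as follows. The construction below assumes equal input and output channel counts; the paper also treats the general case.

```python
import torch

def delta_orthogonal_(weight, gain=1.0):
    """In-place delta-orthogonal init for a conv kernel of shape (C_out, C_in, k, k)."""
    c_out, c_in, kh, kw = weight.shape
    assert c_out == c_in, "sketch assumes equal channel counts"
    # Random orthogonal matrix via QR decomposition of a Gaussian matrix.
    q, r = torch.linalg.qr(torch.randn(c_out, c_in))
    q *= torch.sign(torch.diagonal(r))          # sign fix so Q is Haar-distributed
    with torch.no_grad():
        weight.zero_()
        weight[:, :, kh // 2, kw // 2] = gain * q   # orthogonal block at the spatial center
    return weight

# Usage:
# conv = torch.nn.Conv2d(64, 64, kernel_size=3, padding=1)
# delta_orthogonal_(conv.weight)
```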
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
Title | TVM: An Automated End-to-End Optimizing Compiler for Deep Learning |
Authors | Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy |
Abstract | There is an increasing need to bring machine learning to a wide diversity of hardware devices. Current frameworks rely on vendor-specific operator libraries and optimize for a narrow range of server-class GPUs. Deploying workloads to new platforms – such as mobile phones, embedded devices, and accelerators (e.g., FPGAs, ASICs) – requires significant manual effort. We propose TVM, a compiler that exposes graph-level and operator-level optimizations to provide performance portability to deep learning workloads across diverse hardware back-ends. TVM solves optimization challenges specific to deep learning, such as high-level operator fusion, mapping to arbitrary hardware primitives, and memory latency hiding. It also automates optimization of low-level programs to hardware characteristics by employing a novel, learning-based cost modeling method for rapid exploration of code optimizations. Experimental results show that TVM delivers performance across hardware back-ends that are competitive with state-of-the-art, hand-tuned libraries for low-power CPU, mobile GPU, and server-class GPUs. We also demonstrate TVM’s ability to target new accelerator back-ends, such as the FPGA-based generic deep learning accelerator. The system is open sourced and in production use inside several major companies. |
Tasks | |
Published | 2018-02-12 |
URL | http://arxiv.org/abs/1802.04799v3 |
PDF | http://arxiv.org/pdf/1802.04799v3.pdf |
PWC | https://paperswithcode.com/paper/tvm-an-automated-end-to-end-optimizing |
Repo | https://github.com/ctuning/ck-tvm |
Framework | mxnet |
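For a flavor of the separation between operator definition and schedule that the abstract describes, the snippet below uses TVM's tensor-expression API to declare and compile a vector addition for the local CPU. It assumes a TVM build in which the `te` schedule API and `NDArray.numpy()` are available, and it is a toy example rather than a tuned kernel.

```python
import numpy as np
import tvm
from tvm import te

n = 1024
A = te.placeholder((n,), name="A")                     # declare operator inputs
B = te.placeholder((n,), name="B")
C = te.compute((n,), lambda i: A[i] + B[i], name="C")  # declare the computation

s = te.create_schedule(C.op)                           # the schedule decides how it runs
fadd = tvm.build(s, [A, B, C], target="llvm")          # compile for the local CPU back-end

a = tvm.nd.array(np.random.rand(n).astype("float32"))
b = tvm.nd.array(np.random.rand(n).astype("float32"))
c = tvm.nd.array(np.zeros(n, dtype="float32"))
fadd(a, b, c)
np.testing.assert_allclose(c.numpy(), a.numpy() + b.numpy())
```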
Causal Queries from Observational Data in Biological Systems via Bayesian Networks: An Empirical Study in Small Networks
Title | Causal Queries from Observational Data in Biological Systems via Bayesian Networks: An Empirical Study in Small Networks |
Authors | Alex White, Matthieu Vignes |
Abstract | Biological networks are a very convenient modelling and visualisation tool to discover knowledge from modern high-throughput genomics and postgenomics data sets. Indeed, biological entities are not isolated, but are components of complex multi-level systems. We go one step further and advocate for the consideration of causal representations of the interactions in living systems. We present the causal formalism and bring it out in the context of biological networks, when the data is observational. We also discuss its ability to decipher the causal information flow as observed in gene expression. We illustrate our exploration with experiments on small simulated networks as well as on a real biological data set. |
Tasks | |
Published | 2018-05-04 |
URL | http://arxiv.org/abs/1805.01608v1 |
PDF | http://arxiv.org/pdf/1805.01608v1.pdf |
PWC | https://paperswithcode.com/paper/causal-queries-from-observational-data-in |
Repo | https://github.com/alexW335/BNCausalExperiments |
Framework | none |
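As a concrete instance of the kind of causal query discussed above, the sketch below computes an interventional distribution P(Y | do(X = 1)) on a toy three-node network Z → X → Y with Z → Y, using the truncated-factorization (back-door) formula, and contrasts it with the confounded observational conditional. The probability tables are invented for illustration.

```python
import numpy as np

# Toy Bayesian network: Z -> X, Z -> Y, X -> Y, all variables binary.
p_z = np.array([0.6, 0.4])                            # P(Z)
p_x_given_z = np.array([[0.7, 0.3],                   # P(X | Z=0)
                        [0.2, 0.8]])                  # P(X | Z=1)
p_y_given_xz = np.array([[[0.9, 0.1], [0.6, 0.4]],    # P(Y | X=0, Z=0), P(Y | X=0, Z=1)
                         [[0.5, 0.5], [0.1, 0.9]]])   # P(Y | X=1, Z=0), P(Y | X=1, Z=1)

def p_y_do_x(x):
    """P(Y | do(X=x)) = sum_z P(z) * P(Y | X=x, Z=z)  (truncated factorization)."""
    return sum(p_z[z] * p_y_given_xz[x, z] for z in range(2))

def p_y_given_x(x):
    """Observational P(Y | X=x) for comparison: confounded by Z."""
    joint = sum(p_z[z] * p_x_given_z[z, x] * p_y_given_xz[x, z] for z in range(2))
    return joint / joint.sum()

print("interventional:", p_y_do_x(1))
print("observational:", p_y_given_x(1))
```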
Manifold regularization with GANs for semi-supervised learning
Title | Manifold regularization with GANs for semi-supervised learning |
Authors | Bruno Lecouat, Chuan-Sheng Foo, Houssam Zenati, Vijay Chandrasekhar |
Abstract | Generative Adversarial Networks are powerful generative models that are able to model the manifold of natural images. We leverage this property to perform manifold regularization by approximating a variant of the Laplacian norm using a Monte Carlo approximation that is easily computed with the GAN. When incorporated into the semi-supervised feature-matching GAN we achieve state-of-the-art results for GAN-based semi-supervised learning on CIFAR-10 and SVHN benchmarks, with a method that is significantly easier to implement than competing methods. We also find that manifold regularization improves the quality of generated images, and is affected by the quality of the GAN used to approximate the regularizer. |
Tasks | |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.04307v1 |
PDF | http://arxiv.org/pdf/1807.04307v1.pdf |
PWC | https://paperswithcode.com/paper/manifold-regularization-with-gans-for-semi |
Repo | https://github.com/bruno-31/gan-manifold-reg |
Framework | tf |
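The Monte Carlo regularizer described above amounts to penalizing how much the classifier's output changes between a generated point and a small step along the GAN manifold. The sketch below shows that penalty term in isolation; the step size, norm, and how the term is weighted into the feature-matching GAN objective follow the paper only loosely.

```python
import torch
import torch.nn.functional as F

def manifold_regularizer(classifier, generator, batch_size=64, z_dim=100, eps=1e-2):
    """Monte Carlo manifold smoothness penalty: ||f(G(z)) - f(G(z + eps*d))||^2."""
    z = torch.randn(batch_size, z_dim)
    delta = F.normalize(torch.randn(batch_size, z_dim), dim=1)   # random unit direction in latent space
    logits = classifier(generator(z))
    logits_perturbed = classifier(generator(z + eps * delta))
    return ((logits - logits_perturbed) ** 2).sum(dim=1).mean()

# Usage in a semi-supervised objective:
#   total_loss = supervised_loss + unsupervised_gan_loss + lam * manifold_regularizer(f, G)
```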