October 20, 2019


Paper Group AWR 210

Deep Generative Models for Distribution-Preserving Lossy Compression. Bayesian Deep Learning on a Quantum Computer. TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection. Deep Convolutional AutoEncoder-based Lossy Image Compression. Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples. Conditional Probability Mode …

Deep Generative Models for Distribution-Preserving Lossy Compression

Title Deep Generative Models for Distribution-Preserving Lossy Compression
Authors Michael Tschannen, Eirikur Agustsson, Mario Lucic
Abstract We propose and study the problem of distribution-preserving lossy compression. Motivated by recent advances in extreme image compression that maintain artifact-free reconstructions even at very low bitrates, we propose to optimize the rate-distortion tradeoff under the constraint that the reconstructed samples follow the distribution of the training data. The resulting compression system recovers both ends of the spectrum: at zero bitrate it learns a generative model of the data, while at high enough bitrates it achieves perfect reconstruction. Furthermore, for intermediate bitrates it smoothly interpolates between learning a generative model of the training data and perfectly reconstructing the training samples. We study several methods to approximately solve the proposed optimization problem, including a novel combination of Wasserstein GAN and Wasserstein Autoencoder, and present an extensive theoretical and empirical characterization of the proposed compression systems.
Tasks Image Compression, Image Generation
Published 2018-05-28
URL http://arxiv.org/abs/1805.11057v2
PDF http://arxiv.org/pdf/1805.11057v2.pdf
PWC https://paperswithcode.com/paper/deep-generative-models-for-distribution
Repo https://github.com/mitscha/dplc
Framework pytorch

Bayesian Deep Learning on a Quantum Computer

Title Bayesian Deep Learning on a Quantum Computer
Authors Zhikuan Zhao, Alejandro Pozas-Kerstjens, Patrick Rebentrost, Peter Wittek
Abstract Bayesian methods in machine learning, such as Gaussian processes, have great advantages compared to other techniques. In particular, they provide estimates of the uncertainty associated with a prediction. Extending the Bayesian approach to deep architectures has remained a major challenge. Recent results connected deep feedforward neural networks with Gaussian processes, allowing training without backpropagation. This connection enables us to leverage a quantum algorithm designed for Gaussian processes and develop a new algorithm for Bayesian deep learning on quantum computers. The properties of the kernel matrix in the Gaussian process ensure the efficient execution of the core component of the protocol, quantum matrix inversion, providing an at least polynomial speedup over classical algorithms. Furthermore, we demonstrate the execution of the algorithm on contemporary quantum computers and analyze its robustness with respect to realistic noise models.
Tasks Gaussian Processes
Published 2018-06-29
URL https://arxiv.org/abs/1806.11463v3
PDF https://arxiv.org/pdf/1806.11463v3.pdf
PWC https://paperswithcode.com/paper/bayesian-deep-learning-on-a-quantum-computer
Repo https://github.com/balbok0/bayes-nn-qsh
Framework none
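
The abstract above names matrix inversion of the Gaussian process kernel matrix as the core step. As a rough classical illustration only (not the quantum protocol, and not code from the linked repository), the sketch below computes a GP posterior with NumPy. The RBF kernel, the noise level, and the function names rbf_kernel and gp_posterior are assumptions made for this example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between two sets of 1-D inputs (assumed kernel).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    # Kernel matrix of the training inputs plus observation noise.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_test, x_train)
    K_ss = rbf_kernel(x_test, x_test)
    # Solving this linear system is the classical counterpart of the
    # matrix-inversion step mentioned in the abstract.
    alpha = np.linalg.solve(K, y_train)
    mean = K_s @ alpha
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
    return mean, np.diag(cov)

x_train = np.linspace(0.0, 3.0, 8)
y_train = np.sin(x_train)
# One test point inside the training range, one far outside it.
mean, var = gp_posterior(x_train, y_train, np.array([1.5, 10.0]))
```

The predictive variance is what provides the uncertainty estimate: it stays small near the training inputs and returns to the prior variance far away from them.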

TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection

Title TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection
Authors Tirthankar Ghosal, Amitra Salam, Swati Tiwari, Asif Ekbal, Pushpak Bhattacharyya
Abstract Detecting novelty of an entire document is an Artificial Intelligence (AI) frontier problem that has widespread NLP applications, such as extractive document summarization, tracking development of news events, predicting impact of scholarly articles, etc. Important though the problem is, we are unaware of any benchmark document level data that correctly addresses the evaluation of automatic novelty detection techniques in a classification framework. To bridge this gap, we present here a resource for benchmarking the techniques for document level novelty detection. We create the resource via event-specific crawling of news documents across several domains in a periodic manner. We release the annotated corpus with necessary statistics and show its use with a developed system for the problem in concern.
Tasks Document Summarization, Extractive Document Summarization
Published 2018-02-20
URL http://arxiv.org/abs/1802.06950v1
PDF http://arxiv.org/pdf/1802.06950v1.pdf
PWC https://paperswithcode.com/paper/tap-dlnd-10-a-corpus-for-document-level
Repo https://github.com/edithal-14/A-Deep-Neural-Solution-To-Document-Level-Novelty-Detection-COLING-2018-
Framework tf

Deep Convolutional AutoEncoder-based Lossy Image Compression

Title Deep Convolutional AutoEncoder-based Lossy Image Compression
Authors Zhengxue Cheng, Heming Sun, Masaru Takeuchi, Jiro Katto
Abstract Image compression has been investigated as a fundamental research topic for many decades. Recently, deep learning has achieved great success in many computer vision tasks, and is gradually being used in image compression. In this paper, we present a lossy image compression architecture, which utilizes the advantages of convolutional autoencoder (CAE) to achieve a high coding efficiency. First, we design a novel CAE architecture to replace the conventional transforms and train this CAE using a rate-distortion loss function. Second, to generate a more energy-compact representation, we utilize the principal components analysis (PCA) to rotate the feature maps produced by the CAE, and then apply the quantization and entropy coder to generate the codes. Experimental results demonstrate that our method outperforms traditional image coding algorithms, by achieving a 13.7% BD-rate decrement on the Kodak database images compared to JPEG2000. Besides, our method maintains a moderate complexity similar to JPEG2000.
Tasks Image Compression, Quantization
Published 2018-04-25
URL http://arxiv.org/abs/1804.09535v1
PDF http://arxiv.org/pdf/1804.09535v1.pdf
PWC https://paperswithcode.com/paper/deep-convolutional-autoencoder-based-lossy
Repo https://github.com/cachett/DCGANandCAE
Framework pytorch
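
The second step in the abstract above (rotate the feature maps with PCA, then quantize) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the feature maps are a random array standing in for CAE output, quantization is uniform with a hypothetical step size, and the entropy coder is omitted. The function names are invented for the example and do not come from the paper or the linked repository.

```python
import numpy as np

def pca_rotate_and_quantize(features, step=0.5):
    """features: array of shape (channels, height, width)."""
    c = features.shape[0]
    flat = features.reshape(c, -1)                 # one row per channel
    mean = flat.mean(axis=1, keepdims=True)
    centered = flat - mean
    cov = centered @ centered.T / centered.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]              # largest variance first
    basis = eigvecs[:, order]                      # orthonormal columns
    rotated = basis.T @ centered                   # energy concentrates in the first rows
    symbols = np.round(rotated / step).astype(np.int64)  # uniform quantization
    return symbols, basis, mean

def reconstruct(symbols, basis, mean, shape, step=0.5):
    # Dequantize, rotate back, and restore the per-channel mean.
    return (basis @ (symbols * step) + mean).reshape(shape)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8, 8))              # stand-in for CAE feature maps
symbols, basis, mean = pca_rotate_and_quantize(features)
recon = reconstruct(symbols, basis, mean, features.shape)
```

Because the rotation is orthonormal, the quantization error per spatial position is bounded by sqrt(channels) * step / 2, so the step size directly controls the distortion.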

Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples

Title Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples
Authors Zihao Liu, Qi Liu, Tao Liu, Nuo Xu, Xue Lin, Yanzhi Wang, Wujie Wen
Abstract Image compression-based approaches for defending against the adversarial-example attacks, which threaten the safe use of deep neural networks (DNN), have been investigated recently. However, prior works mainly rely on directly tuning parameters like compression rate, to blindly reduce image features, thereby lacking guarantees on both defense efficiency (i.e. accuracy of polluted images) and classification accuracy of benign images, after applying defense methods. To overcome these limitations, we propose a JPEG-based defensive compression framework, namely “feature distillation”, to effectively rectify adversarial examples without impacting classification accuracy on benign data. Our framework significantly escalates the defense efficiency with marginal accuracy reduction using a two-step method: First, we maximize malicious features filtering of adversarial input perturbations by developing defensive quantization in frequency domain of JPEG compression or decompression, guided by a semi-analytical method; Second, we suppress the distortions of benign features to restore classification accuracy through a DNN-oriented quantization refine process. Our experimental results show that the proposed “feature distillation” can significantly surpass the latest input-transformation based mitigations such as Quilting and TV Minimization in three aspects, including defense efficiency (improve classification accuracy from $\sim20\%$ to $\sim90\%$ on adversarial examples), accuracy of benign images after defense ($\le1\%$ accuracy degradation), and processing time per image ($\sim259\times$ speedup). Moreover, our solution can also provide the best defense efficiency ($\sim60\%$ accuracy) against the recent adaptive attack with least accuracy reduction ($\sim1\%$) on benign images when compared with other input-transformation based defense methods.
Tasks Image Compression, Quantization
Published 2018-03-14
URL http://arxiv.org/abs/1803.05787v2
PDF http://arxiv.org/pdf/1803.05787v2.pdf
PWC https://paperswithcode.com/paper/feature-distillation-dnn-oriented-jpeg
Repo https://github.com/zihaoliu123/Feature-Distillation-DNN-Oriented-JPEG-Compression-Against-Adversarial-Examples
Framework tf
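
The frequency-domain quantization that the abstract above builds on can be illustrated with a plain JPEG-style step: transform an 8x8 block with the DCT, round each coefficient to a multiple of its table entry, and transform back. The sketch below is only that generic step. The quantization table is a hypothetical one that grows with frequency; it is not the defensive table derived in the paper, and the helper names are invented for this example.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II matrix, as used for JPEG blocks.
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)
    return mat

def quantize_block(block, table):
    c = dct_matrix(block.shape[0])
    coeffs = c @ block @ c.T                       # forward 2-D DCT
    quantized = np.round(coeffs / table) * table   # quantize, then dequantize
    return c.T @ quantized @ c                     # inverse 2-D DCT

u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
table = 4.0 + 4.0 * (u + v)            # hypothetical table: coarser at high frequency
clean = np.full((8, 8), 100.0)
perturbation = 2.0 * (-1.0) ** (u + v) # small high-frequency checkerboard pattern
restored = quantize_block(clean + perturbation, table)
```

In this toy case every coefficient of the perturbation falls below half of its quantization step and is rounded to zero, so the restored block equals the clean one. Choosing the steps so that this happens for adversarial perturbations without damaging benign features is the problem the paper addresses.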

Conditional Probability Models for Deep Image Compression

Title Conditional Probability Models for Deep Image Compression
Authors Fabian Mentzer, Eirikur Agustsson, Michael Tschannen, Radu Timofte, Luc Van Gool
Abstract Deep Neural Networks trained as image auto-encoders have recently emerged as a promising direction for advancing the state-of-the-art in image compression. The key challenge in learning such networks is twofold: To deal with quantization, and to control the trade-off between reconstruction error (distortion) and entropy (rate) of the latent image representation. In this paper, we focus on the latter challenge and propose a new technique to navigate the rate-distortion trade-off for an image compression auto-encoder. The main idea is to directly model the entropy of the latent representation by using a context model: A 3D-CNN which learns a conditional probability model of the latent distribution of the auto-encoder. During training, the auto-encoder makes use of the context model to estimate the entropy of its representation, and the context model is concurrently updated to learn the dependencies between the symbols in the latent representation. Our experiments show that this approach, when measured in MS-SSIM, yields a state-of-the-art image compression system based on a simple convolutional auto-encoder.
Tasks Image Compression, Quantization
Published 2018-01-12
URL https://arxiv.org/abs/1801.04260v4
PDF https://arxiv.org/pdf/1801.04260v4.pdf
PWC https://paperswithcode.com/paper/conditional-probability-models-for-deep-image
Repo https://github.com/fab-jul/imgcomp-cvpr
Framework tf
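
The idea of estimating rate from a conditional probability model can be shown with a much simpler stand-in than the 3D-CNN described above. The sketch below swaps in a first-order count-based model, P(symbol | previous symbol), fitted on the same toy sequence it scores, and compares its estimated code length with that of a context-free model. The function name and the toy data are assumptions for illustration; nothing here reproduces the paper's network.

```python
import numpy as np

def estimate_bits(symbols, num_levels):
    symbols = np.asarray(symbols)
    # Context-free model: smoothed empirical symbol frequencies.
    counts = np.bincount(symbols, minlength=num_levels) + 1.0
    p_ind = counts / counts.sum()
    bits_ind = -np.log2(p_ind[symbols]).sum()
    # Context model: P(s_i | s_{i-1}) from bigram counts with add-one smoothing.
    bigram = np.ones((num_levels, num_levels))
    np.add.at(bigram, (symbols[:-1], symbols[1:]), 1.0)
    p_ctx = bigram / bigram.sum(axis=1, keepdims=True)
    # The first symbol has no context, so it is coded with the context-free model.
    bits_ctx = -np.log2(p_ind[symbols[0]])
    bits_ctx += -np.log2(p_ctx[symbols[:-1], symbols[1:]]).sum()
    return bits_ind, bits_ctx

# Toy quantized latent with strong dependencies between neighbouring symbols.
latent = np.tile([0, 1, 2, 3], 50)
bits_ind, bits_ctx = estimate_bits(latent, num_levels=4)
```

The negative log-likelihood under the model is the estimated number of bits, so a model that captures dependencies between symbols reports a lower rate for the same latent representation.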

Structured Semantic Model supported Deep Neural Network for Click-Through Rate Prediction

Title Structured Semantic Model supported Deep Neural Network for Click-Through Rate Prediction
Authors Chenglei Niu, Guojing Zhong, Ying Liu, Yandong Zhang, Yongsheng Sun, Ailong He, Zhaoji Chen
Abstract With the rapid development of online advertising and recommendation systems, click-through rate prediction is expected to play an increasingly important role. Recently, many DNN-based models that follow a similar Embedding&MLP paradigm have been proposed and have achieved good results in image, voice, and NLP fields. Among these methods, the Wide&Deep model announced by Google plays a key role. Most models first map large-scale sparse input features into low-dimensional vectors, which are transformed into fixed-length vectors and then concatenated before being fed into a multilayer perceptron (MLP) to learn non-linear relations among input features. The number of trainable variables normally grows dramatically as the number of feature fields and the embedding dimension grow. It is a big challenge to obtain state-of-the-art results by training the deep neural network and the embeddings together, since this easily falls into local optima or overfits. In this paper, we propose a Structured Semantic Model (SSM) to tackle this challenge by designing an orthogonal base convolution and pooling model that adaptively learns the multi-scale base semantic representation between features, supervised by the click label. The output of the SSM is then used in the Wide&Deep model for CTR prediction. Experiments on two public datasets as well as a real Weibo production dataset with over 1 billion samples demonstrate the effectiveness of the proposed approach, with superior performance compared to state-of-the-art methods.
Tasks Click-Through Rate Prediction, Recommendation Systems
Published 2018-12-04
URL http://arxiv.org/abs/1812.01353v5
PDF http://arxiv.org/pdf/1812.01353v5.pdf
PWC https://paperswithcode.com/paper/structured-semantic-model-supported-deep
Repo https://github.com/niuchenglei/ssm-dnn
Framework tf

End-to-End Learning on 3D Protein Structure for Interface Prediction

Title End-to-End Learning on 3D Protein Structure for Interface Prediction
Authors Raphael J. L. Townshend, Rishi Bedi, Patricia A. Suriana, Ron O. Dror
Abstract Despite an explosion in the number of experimentally determined, atomically detailed structures of biomolecules, many critical tasks in structural biology remain data-limited. Whether performance in such tasks can be improved by using large repositories of tangentially related structural data remains an open question. To address this question, we focused on a central problem in biology: predicting how proteins interact with one another—that is, which surfaces of one protein bind to those of another protein. We built a training dataset, the Database of Interacting Protein Structures (DIPS), that contains biases but is two orders of magnitude larger than those used previously. We found that these biases significantly degrade the performance of existing methods on gold-standard data. Hypothesizing that assumptions baked into the hand-crafted features on which these methods depend were the source of the problem, we developed the first end-to-end learning model for protein interface prediction, the Siamese Atomic Surfacelet Network (SASNet). Using only spatial coordinates and identities of atoms, SASNet outperforms state-of-the-art methods trained on gold-standard structural data, even when trained on only 3% of our new dataset. Code and data available at https://github.com/drorlab/DIPS.
Tasks Transfer Learning
Published 2018-07-03
URL https://arxiv.org/abs/1807.01297v5
PDF https://arxiv.org/pdf/1807.01297v5.pdf
PWC https://paperswithcode.com/paper/transferrable-end-to-end-learning-for-protein
Repo https://github.com/drorlab/DIPS
Framework tf

Visualizing Convolutional Neural Network Protein-Ligand Scoring

Title Visualizing Convolutional Neural Network Protein-Ligand Scoring
Authors Joshua Hochuli, Alec Helbling, Tamar Skaist, Matthew Ragoza, David Ryan Koes
Abstract Protein-ligand scoring is an important step in a structure-based drug design pipeline. Selecting a correct binding pose and predicting the binding affinity of a protein-ligand complex enables effective virtual screening. Machine learning techniques can make use of the increasing amounts of structural data that are becoming publicly available. Convolutional neural network (CNN) scoring functions in particular have shown promise in pose selection and affinity prediction for protein-ligand complexes. Neural networks are known for being difficult to interpret. Understanding the decisions of a particular network can help tune parameters and training data to maximize performance. Visualization of neural networks helps decompose complex scoring functions into pictures that are more easily parsed by humans. Here we present three methods for visualizing how individual protein-ligand complexes are interpreted by 3D convolutional neural networks. We also present a visualization of the convolutional filters and their weights. We describe how the intuition provided by these visualizations aids in network design.
Tasks
Published 2018-03-06
URL http://arxiv.org/abs/1803.02398v1
PDF http://arxiv.org/pdf/1803.02398v1.pdf
PWC https://paperswithcode.com/paper/visualizing-convolutional-neural-network
Repo https://github.com/gnina/gnina
Framework none

Are Deep Policy Gradient Algorithms Truly Policy Gradient Algorithms?

Title Are Deep Policy Gradient Algorithms Truly Policy Gradient Algorithms?
Authors Andrew Ilyas, Logan Engstrom, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry
Abstract We study how the behavior of deep policy gradient algorithms reflects the conceptual framework motivating their development. We propose a fine-grained analysis of state-of-the-art methods based on key aspects of this framework: gradient estimation, value prediction, optimization landscapes, and trust region enforcement. We find that from this perspective, the behavior of deep policy gradient algorithms often deviates from what their motivating framework would predict. Our analysis suggests first steps towards solidifying the foundations of these algorithms, and in particular indicates that we may need to move beyond the current benchmark-centric evaluation methodology.
Tasks
Published 2018-11-06
URL http://arxiv.org/abs/1811.02553v3
PDF http://arxiv.org/pdf/1811.02553v3.pdf
PWC https://paperswithcode.com/paper/are-deep-policy-gradient-algorithms-truly
Repo https://github.com/BPDanek/learning_resources
Framework none

Deep Face Recognition: A Survey

Title Deep Face Recognition: A Survey
Authors Mei Wang, Weihong Deng
Abstract Deep learning applies multiple processing layers to learn representations of data with multiple levels of feature extraction. This emerging technique has reshaped the research landscape of face recognition since 2014, launched by the breakthroughs of Deepface and DeepID methods. Since then, deep face recognition (FR) technique, which leverages the hierarchical architecture to learn discriminative face representation, has dramatically improved the state-of-the-art performance and fostered numerous successful real-world applications. In this paper, we provide a comprehensive survey of the recent developments on deep FR, covering the broad topics on algorithms, data, and scenes. First, we summarize different network architectures and loss functions proposed in the rapid evolution of the deep FR methods. Second, the related face processing methods are categorized into two classes: 'one-to-many augmentation' and 'many-to-one normalization'. Then, we summarize and compare the commonly used databases for both model training and evaluation. Third, we review miscellaneous scenes in deep FR, such as cross-factor, heterogeneous, multiple-media and industry scenes. Finally, potential deficiencies of the current methods and several future directions are highlighted.
Tasks Face Recognition
Published 2018-04-18
URL http://arxiv.org/abs/1804.06655v8
PDF http://arxiv.org/pdf/1804.06655v8.pdf
PWC https://paperswithcode.com/paper/deep-face-recognition-a-survey
Repo https://github.com/parvatijay2901/FaceNet_FR
Framework pytorch

Attributing Fake Images to GANs: Learning and Analyzing GAN Fingerprints

Title Attributing Fake Images to GANs: Learning and Analyzing GAN Fingerprints
Authors Ning Yu, Larry Davis, Mario Fritz
Abstract Recent advances in Generative Adversarial Networks (GANs) have shown increasing success in generating photorealistic images. But they also raise challenges to visual forensics and model attribution. We present the first study of learning GAN fingerprints towards image attribution and using them to classify an image as real or GAN-generated. For GAN-generated images, we further identify their sources. Our experiments show that (1) GANs carry distinct model fingerprints and leave stable fingerprints in their generated images, which support image attribution; (2) even minor differences in GAN training can result in different fingerprints, which enables fine-grained model authentication; (3) fingerprints persist across different image frequencies and patches and are not biased by GAN artifacts; (4) fingerprint finetuning is effective in immunizing against five types of adversarial image perturbations; and (5) comparisons also show our learned fingerprints consistently outperform several baselines in a variety of setups.
Tasks Image Generation
Published 2018-11-20
URL https://arxiv.org/abs/1811.08180v3
PDF https://arxiv.org/pdf/1811.08180v3.pdf
PWC https://paperswithcode.com/paper/attributing-fake-images-to-gans-analyzing
Repo https://github.com/ningyu1991/GANFingerprints
Framework tf

Phase Harmonic Correlations and Convolutional Neural Networks

Title Phase Harmonic Correlations and Convolutional Neural Networks
Authors Stéphane Mallat, Sixin Zhang, Gaspar Rochette
Abstract A major issue in harmonic analysis is to capture the phase dependence of frequency representations, which carries important signal properties. It seems that convolutional neural networks have found a way. Over time-series and images, convolutional networks often learn a first layer of filters which are well localized in the frequency domain, with different phases. We show that a rectifier then acts as a filter on the phase of the resulting coefficients. It computes signal descriptors which are local in space, frequency and phase. The non-linear phase filter becomes a multiplicative operator over phase harmonics computed with a Fourier transform along the phase. We prove that it defines a bi-Lipschitz and invertible representation. The correlations of phase harmonics coefficients characterise coherent structures from their phase dependence across frequencies. For wavelet filters, we show numerically that signals having sparse wavelet coefficients can be recovered from few phase harmonic correlations, which provide a compressive representation.
Tasks Time Series
Published 2018-10-29
URL https://arxiv.org/abs/1810.12136v2
PDF https://arxiv.org/pdf/1810.12136v2.pdf
PWC https://paperswithcode.com/paper/phase-harmonics-and-correlation-invariants-in
Repo https://github.com/kymatio/phaseharmonics
Framework pytorch
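
A small sketch may help make "phase harmonics" concrete. It assumes the k-th phase harmonic of a complex coefficient z keeps the modulus and multiplies the phase by k, that is |z| * exp(i * k * phase(z)); this reading of the definition, and the function name phase_harmonic, are assumptions of the example rather than code from the linked repository.

```python
import numpy as np

def phase_harmonic(z, k):
    # Keep the modulus of z and multiply its phase by the integer k
    # (assumed definition of the k-th phase harmonic).
    z = np.asarray(z, dtype=complex)
    return np.abs(z) * np.exp(1j * k * np.angle(z))

z = np.array([1.0 + 1.0j, -2.0j, 3.0])
h0 = phase_harmonic(z, 0)   # k = 0 removes the phase and leaves the modulus
h1 = phase_harmonic(z, 1)   # k = 1 returns z unchanged
h2 = phase_harmonic(z, 2)   # k = 2 doubles the phase
```

Unlike the ordinary power z**k, this operation does not change the modulus, which is consistent with the Lipschitz property claimed in the abstract.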

Face-Cap: Image Captioning using Facial Expression Analysis

Title Face-Cap: Image Captioning using Facial Expression Analysis
Authors Omid Mohamad Nezami, Mark Dras, Peter Anderson, Len Hamey
Abstract Image captioning is the process of generating a natural language description of an image. Most current image captioning models, however, do not take into account the emotional aspect of an image, which is very relevant to activities and interpersonal relationships represented therein. Towards developing a model that can produce human-like captions incorporating these, we use facial expression features extracted from images including human faces, with the aim of improving the descriptive ability of the model. In this work, we present two variants of our Face-Cap model, which embed facial expression features in different ways, to generate image captions. Using all standard evaluation metrics, our Face-Cap models outperform a state-of-the-art baseline model for generating image captions when applied to an image caption dataset extracted from the standard Flickr 30K dataset, consisting of around 11K images containing faces. An analysis of the captions finds that, perhaps surprisingly, the improvement in caption quality appears to come not from the addition of adjectives linked to emotional aspects of the images, but from more variety in the actions described in the captions.
Tasks Image Captioning
Published 2018-07-06
URL http://arxiv.org/abs/1807.02250v2
PDF http://arxiv.org/pdf/1807.02250v2.pdf
PWC https://paperswithcode.com/paper/face-cap-image-captioning-using-facial
Repo https://github.com/omidmn/Face-Cap
Framework none

Deep Reinforcement Learning for Surgical Gesture Segmentation and Classification

Title Deep Reinforcement Learning for Surgical Gesture Segmentation and Classification
Authors Daochang Liu, Tingting Jiang
Abstract Recognition of surgical gesture is crucial for surgical skill assessment and efficient surgery training. Prior works on this task are based on either variant graphical models such as HMMs and CRFs, or deep learning models such as Recurrent Neural Networks and Temporal Convolutional Networks. Most of the current approaches usually suffer from over-segmentation and therefore low segment-level edit scores. In contrast, we present an essentially different methodology by modeling the task as a sequential decision-making process. An intelligent agent is trained using reinforcement learning with hierarchical features from a deep model. Temporal consistency is integrated into our action design and reward mechanism to reduce over-segmentation errors. Experiments on JIGSAWS dataset demonstrate that the proposed method performs better than state-of-the-art methods in terms of the edit score and on par in frame-wise accuracy. Our code will be released later.
Tasks Decision Making, Surgical Gesture Recognition
Published 2018-06-21
URL http://arxiv.org/abs/1806.08089v1
PDF http://arxiv.org/pdf/1806.08089v1.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-for-surgical
Repo https://github.com/Finspire13/RL-Surgical-Gesture-Segmentation
Framework pytorch
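
The abstract above contrasts frame-wise accuracy with the segment-level edit score, which penalizes over-segmentation. The sketch below shows one common way to compute such a score: collapse frame labels into a segment sequence and take a normalized Levenshtein distance. The normalization by the longer sequence and the function names are assumptions for illustration and may differ from the evaluation code in the linked repository.

```python
def segments(frame_labels):
    # Collapse consecutive identical frame labels into one segment label.
    out = []
    for label in frame_labels:
        if not out or out[-1] != label:
            out.append(label)
    return out

def levenshtein(a, b):
    # Classic dynamic-programming edit distance between two sequences.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def edit_score(pred, truth):
    p, t = segments(pred), segments(truth)
    return 100.0 * (1.0 - levenshtein(p, t) / max(len(p), len(t), 1))

def frame_accuracy(pred, truth):
    correct = sum(1 for a, b in zip(pred, truth) if a == b)
    return 100.0 * correct / len(truth)

truth = ["A"] * 5 + ["B"] * 5
pred = ["A", "A", "B", "A", "A", "B", "B", "B", "B", "B"]  # one spurious frame
score = edit_score(pred, truth)
accuracy = frame_accuracy(pred, truth)
```

A single mislabeled frame leaves frame-wise accuracy at 90 but splits the prediction into four segments instead of two, which halves the edit score. This is the kind of over-segmentation error the temporal consistency terms in the paper are meant to reduce.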