October 20, 2019

Paper Group AWR 210

Deep Generative Models for Distribution-Preserving Lossy Compression. Bayesian Deep Learning on a Quantum Computer. TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection. Deep Convolutional AutoEncoder-based Lossy Image Compression. Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples. Conditional Probability Mode …

Deep Generative Models for Distribution-Preserving Lossy Compression


Title	Deep Generative Models for Distribution-Preserving Lossy Compression
Authors	Michael Tschannen, Eirikur Agustsson, Mario Lucic
Abstract	We propose and study the problem of distribution-preserving lossy compression. Motivated by recent advances in extreme image compression that maintain artifact-free reconstructions even at very low bitrates, we propose to optimize the rate-distortion tradeoff under the constraint that the reconstructed samples follow the distribution of the training data. The resulting compression system recovers both ends of the spectrum: at zero bitrate it learns a generative model of the data, and at high enough bitrates it achieves perfect reconstruction. Furthermore, for intermediate bitrates it smoothly interpolates between learning a generative model of the training data and perfectly reconstructing the training samples. We study several methods to approximately solve the proposed optimization problem, including a novel combination of Wasserstein GAN and Wasserstein Autoencoder, and present an extensive theoretical and empirical characterization of the proposed compression systems.
Tasks	Image Compression, Image Generation
Published	2018-05-28
URL	http://arxiv.org/abs/1805.11057v2
PDF	http://arxiv.org/pdf/1805.11057v2.pdf
PWC	https://paperswithcode.com/paper/deep-generative-models-for-distribution
Repo	https://github.com/mitscha/dplc
Framework	pytorch
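
The constrained problem described in the abstract above can be written compactly. The following is a sketch in assumed notation (encoder E producing a code of R bits, stochastic decoder D, distortion measure d, data distribution P_X); the symbols are illustrative and not necessarily those used in the paper.

```latex
\min_{E,\,D}\; \mathbb{E}\big[\, d\big(X,\; D(E(X))\big) \big]
\quad \text{subject to} \quad D(E(X)) \sim P_X ,
\qquad E : \mathcal{X} \to \{0,1\}^{R}
```

With R = 0 the code carries no information, so the constraint alone forces D to act as a generative model of P_X. As R grows, the distortion term can be driven toward zero, which corresponds to the perfect-reconstruction end of the spectrum.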

Bayesian Deep Learning on a Quantum Computer


Title	Bayesian Deep Learning on a Quantum Computer
Authors	Zhikuan Zhao, Alejandro Pozas-Kerstjens, Patrick Rebentrost, Peter Wittek
Abstract	Bayesian methods in machine learning, such as Gaussian processes, have great advantages compared to other techniques. In particular, they provide estimates of the uncertainty associated with a prediction. Extending the Bayesian approach to deep architectures has remained a major challenge. Recent results connected deep feedforward neural networks with Gaussian processes, allowing training without backpropagation. This connection enables us to leverage a quantum algorithm designed for Gaussian processes and develop a new algorithm for Bayesian deep learning on quantum computers. The properties of the kernel matrix in the Gaussian process ensure the efficient execution of the core component of the protocol, quantum matrix inversion, providing an at least polynomial speedup over classical algorithms. Furthermore, we demonstrate the execution of the algorithm on contemporary quantum computers and analyze its robustness with respect to realistic noise models.
Tasks	Gaussian Processes
Published	2018-06-29
URL	https://arxiv.org/abs/1806.11463v3
PDF	https://arxiv.org/pdf/1806.11463v3.pdf
PWC	https://paperswithcode.com/paper/bayesian-deep-learning-on-a-quantum-computer
Repo	https://github.com/balbok0/bayes-nn-qsh
Framework	none
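
To make the role of matrix inversion concrete, here is a minimal classical sketch of Gaussian process regression, which returns both a prediction and an uncertainty estimate. It is not the quantum protocol from the paper: the RBF kernel, the function names, and the Gaussian elimination solver are all assumptions chosen for illustration. The two `solve` calls are the kernel matrix inversions that the abstract says the quantum algorithm speeds up.

```python
import math


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalars (an assumed kernel choice)."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_predict(xs, ys, x_star, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at a single test input."""
    K = [[rbf_kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [rbf_kernel(a, x_star) for a in xs]
    alpha = solve(K, ys)       # K^{-1} y
    v = solve(K, k_star)       # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf_kernel(x_star, x_star) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)
```

Near a training input the predicted variance is close to the noise level, while far from the data it returns to the prior variance of the kernel.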

TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection


Title	TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection
Authors	Tirthankar Ghosal, Amitra Salam, Swati Tiwari, Asif Ekbal, Pushpak Bhattacharyya
Abstract	Detecting novelty of an entire document is an Artificial Intelligence (AI) frontier problem that has widespread NLP applications, such as extractive document summarization, tracking development of news events, predicting impact of scholarly articles, etc. Important though the problem is, we are unaware of any benchmark document level data that correctly addresses the evaluation of automatic novelty detection techniques in a classification framework. To bridge this gap, we present here a resource for benchmarking the techniques for document level novelty detection. We create the resource via event-specific crawling of news documents across several domains in a periodic manner. We release the annotated corpus with necessary statistics and show its use with a developed system for the problem in concern.
Tasks	Document Summarization, Extractive Document Summarization
Published	2018-02-20
URL	http://arxiv.org/abs/1802.06950v1
PDF	http://arxiv.org/pdf/1802.06950v1.pdf
PWC	https://paperswithcode.com/paper/tap-dlnd-10-a-corpus-for-document-level
Repo	https://github.com/edithal-14/A-Deep-Neural-Solution-To-Document-Level-Novelty-Detection-COLING-2018-
Framework	tf

Deep Convolutional AutoEncoder-based Lossy Image Compression


Title	Deep Convolutional AutoEncoder-based Lossy Image Compression
Authors	Zhengxue Cheng, Heming Sun, Masaru Takeuchi, Jiro Katto
Abstract	Image compression has been investigated as a fundamental research topic for many decades. Recently, deep learning has achieved great success in many computer vision tasks, and is gradually being used in image compression. In this paper, we present a lossy image compression architecture, which utilizes the advantages of convolutional autoencoder (CAE) to achieve a high coding efficiency. First, we design a novel CAE architecture to replace the conventional transforms and train this CAE using a rate-distortion loss function. Second, to generate a more energy-compact representation, we utilize the principal components analysis (PCA) to rotate the feature maps produced by the CAE, and then apply the quantization and entropy coder to generate the codes. Experimental results demonstrate that our method outperforms traditional image coding algorithms, by achieving a 13.7% BD-rate decrement on the Kodak database images compared to JPEG2000. Besides, our method maintains a moderate complexity similar to JPEG2000.
Tasks	Image Compression, Quantization
Published	2018-04-25
URL	http://arxiv.org/abs/1804.09535v1
PDF	http://arxiv.org/pdf/1804.09535v1.pdf
PWC	https://paperswithcode.com/paper/deep-convolutional-autoencoder-based-lossy
Repo	https://github.com/cachett/DCGANandCAE
Framework	pytorch
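
The PCA rotation and quantization steps mentioned in the abstract above can be illustrated on a toy scale. The sketch below assumes two-dimensional features so that the principal axis has a closed form; the function names and the step size are hypothetical, and the paper applies the idea to CAE feature maps rather than to 2D points.

```python
import math


def pca_rotate_2d(points):
    """Center 2D points and rotate them onto their principal axes."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Angle of the major axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    rotated = [((p[0] - mx) * c + (p[1] - my) * s,
                -(p[0] - mx) * s + (p[1] - my) * c) for p in points]
    return rotated, theta


def quantize(value, step):
    """Uniform scalar quantization: map a value to an integer bin index."""
    return round(value / step)


# Correlated features: after rotation most of the energy sits in component 0.
features = [(float(t), float(t)) for t in range(-3, 4)]
rotated, angle = pca_rotate_2d(features)
codes = [(quantize(a, 0.5), quantize(b, 0.5)) for a, b in rotated]
```

Because the rotated second component is nearly zero for correlated inputs, its quantized indices collapse to a single value, which is what makes the representation cheaper for an entropy coder.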

Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples


Title	Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples
Authors	Zihao Liu, Qi Liu, Tao Liu, Nuo Xu, Xue Lin, Yanzhi Wang, Wujie Wen
Abstract	Image compression-based approaches for defending against the adversarial-example attacks, which threaten the safe use of deep neural networks (DNN), have been investigated recently. However, prior works mainly rely on directly tuning parameters like compression rate, to blindly reduce image features, thereby lacking guarantees on both defense efficiency (i.e. accuracy of polluted images) and classification accuracy of benign images, after applying defense methods. To overcome these limitations, we propose a JPEG-based defensive compression framework, namely “feature distillation”, to effectively rectify adversarial examples without impacting classification accuracy on benign data. Our framework significantly escalates the defense efficiency with marginal accuracy reduction using a two-step method: First, we maximize malicious features filtering of adversarial input perturbations by developing defensive quantization in frequency domain of JPEG compression or decompression, guided by a semi-analytical method; Second, we suppress the distortions of benign features to restore classification accuracy through a DNN-oriented quantization refine process. Our experimental results show that proposed “feature distillation” can significantly surpass the latest input-transformation based mitigations such as Quilting and TV Minimization in three aspects, including defense efficiency (improve classification accuracy from $\sim20\%$ to $\sim90\%$ on adversarial examples), accuracy of benign images after defense ($\le1\%$ accuracy degradation), and processing time per image ($\sim259\times$ Speedup). Moreover, our solution can also provide the best defense efficiency ($\sim60\%$ accuracy) against the recent adaptive attack with least accuracy reduction ($\sim1\%$) on benign images when compared with other input-transformation based defense methods.
Tasks	Image Compression, Quantization
Published	2018-03-14
URL	http://arxiv.org/abs/1803.05787v2
PDF	http://arxiv.org/pdf/1803.05787v2.pdf
PWC	https://paperswithcode.com/paper/feature-distillation-dnn-oriented-jpeg
Repo	https://github.com/zihaoliu123/Feature-Distillation-DNN-Oriented-JPEG-Compression-Against-Adversarial-Examples
Framework	tf
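
As background for the frequency-domain quantization the abstract refers to, the sketch below runs one 8x8 block through a JPEG-style pipeline: 2D DCT, division by a step-size table, rounding, and inverse transform. The step-size table in the usage lines is hypothetical and is not the defensive quantization table derived in the paper; the function names are also assumptions.

```python
import math

N = 8  # JPEG operates on 8x8 blocks


def _scale(k):
    """Orthonormal DCT-II scaling factor for frequency index k."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct_1d(v):
    """Orthonormal 1D DCT-II of a length-8 sequence."""
    return [_scale(k) * sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                            for n in range(N)) for k in range(N)]


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    return [sum(_scale(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N)) for n in range(N)]


def transform_2d(block, fn):
    """Apply a 1D transform along rows, then along columns."""
    rows = [fn(r) for r in block]
    cols = [fn([rows[i][j] for i in range(N)]) for j in range(N)]
    return [[cols[j][i] for j in range(N)] for i in range(N)]


def quantize_block(block, table):
    """DCT, quantize with per-frequency step sizes, dequantize, inverse DCT."""
    coeffs = transform_2d(block, dct_1d)
    q = [[round(coeffs[i][j] / table[i][j]) for j in range(N)] for i in range(N)]
    deq = [[q[i][j] * table[i][j] for j in range(N)] for i in range(N)]
    return transform_2d(deq, idct_1d)


# Hypothetical table: one coarse step for every frequency.
table = [[16.0] * N for _ in range(N)]
# Flat block plus a small high-frequency checkerboard perturbation.
block = [[10.0 + 0.4 * (-1) ** (i + j) for j in range(N)] for i in range(N)]
cleaned = quantize_block(block, table)
```

In this toy case the small checkerboard perturbation falls below half a quantization step in every frequency, so it is rounded away while the flat content survives. That is the general mechanism by which coarse quantization can filter small perturbations, although choosing the table well is the hard part the paper addresses.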

Conditional Probability Models for Deep Image Compression


Title	Conditional Probability Models for Deep Image Compression
Authors	Fabian Mentzer, Eirikur Agustsson, Michael Tschannen, Radu Timofte, Luc Van Gool
Abstract	Deep Neural Networks trained as image auto-encoders have recently emerged as a promising direction for advancing the state-of-the-art in image compression. The key challenge in learning such networks is twofold: To deal with quantization, and to control the trade-off between reconstruction error (distortion) and entropy (rate) of the latent image representation. In this paper, we focus on the latter challenge and propose a new technique to navigate the rate-distortion trade-off for an image compression auto-encoder. The main idea is to directly model the entropy of the latent representation by using a context model: A 3D-CNN which learns a conditional probability model of the latent distribution of the auto-encoder. During training, the auto-encoder makes use of the context model to estimate the entropy of its representation, and the context model is concurrently updated to learn the dependencies between the symbols in the latent representation. Our experiments show that this approach, when measured in MS-SSIM, yields a state-of-the-art image compression system based on a simple convolutional auto-encoder.
Tasks	Image Compression, Quantization
Published	2018-01-12
URL	https://arxiv.org/abs/1801.04260v4
PDF	https://arxiv.org/pdf/1801.04260v4.pdf
PWC	https://paperswithcode.com/paper/conditional-probability-models-for-deep-image
Repo	https://github.com/fab-jul/imgcomp-cvpr
Framework	tf
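
The idea of estimating the rate from a conditional probability model can be shown with a much simpler stand-in for the 3D-CNN. The sketch below assumes a count-based model that conditions only on the previous symbol, and sums the negative log-probabilities to get an estimated code length in bits; the function names and the smoothing scheme are assumptions for illustration.

```python
import math
from collections import defaultdict


def fit_context_model(symbols, alphabet_size):
    """Count how often each symbol follows each context (the previous symbol)."""
    counts = defaultdict(lambda: [1] * alphabet_size)  # add-one smoothing
    prev = None
    for s in symbols:
        counts[prev][s] += 1
        prev = s
    return counts


def estimated_bits(symbols, counts, alphabet_size):
    """Estimated code length: sum of -log2 p(symbol | previous symbol)."""
    bits = 0.0
    prev = None
    for s in symbols:
        row = counts.get(prev, [1] * alphabet_size)
        bits += -math.log2(row[s] / sum(row))
        prev = s
    return bits


# A highly predictable latent sequence: alternating 0, 1, 0, 1, ...
latents = [i % 2 for i in range(100)]
model = fit_context_model(latents, alphabet_size=2)
rate = estimated_bits(latents, model, alphabet_size=2)
```

A model that ignores context would need about one bit per binary symbol here, whereas the context model assigns high probability to each next symbol and the estimated rate drops sharply. A rate term of this kind is what an auto-encoder can trade off against distortion during training.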

Structured Semantic Model supported Deep Neural Network for Click-Through Rate Prediction


Title	Structured Semantic Model supported Deep Neural Network for Click-Through Rate Prediction
Authors	Chenglei Niu, Guojing Zhong, Ying Liu, Yandong Zhang, Yongsheng Sun, Ailong He, Zhaoji Chen
Abstract	With the rapid development of online advertising and recommendation systems, click-through rate prediction is expected to play an increasingly important role. Recently, many DNN-based models that follow a similar Embedding&MLP paradigm have been proposed and have achieved good results in image/voice and NLP fields. In these methods, the Wide&Deep model announced by Google plays a key role. Most models first map large-scale sparse input features into low-dimensional vectors, which are transformed to fixed-length vectors and then concatenated before being fed into a multilayer perceptron (MLP) to learn non-linear relations among input features. The number of trainable variables normally grows dramatically as the number of feature fields and the embedding dimension grow. It is a big challenge to get state-of-the-art results by training the deep neural network and the embedding together, since this easily falls into local optima or overfitting. In this paper, we propose a Structured Semantic Model (SSM) to tackle this challenge by designing an orthogonal base convolution and pooling model that adaptively learns the multi-scale base semantic representation between features, supervised by the click label. The output of the SSM is then used in the Wide&Deep model for CTR prediction. Experiments on two public datasets as well as a real Weibo production dataset with over 1 billion samples have demonstrated the effectiveness of our proposed approach, with superior performance compared to state-of-the-art methods.
Tasks	Click-Through Rate Prediction, Recommendation Systems
Published	2018-12-04
URL	http://arxiv.org/abs/1812.01353v5
PDF	http://arxiv.org/pdf/1812.01353v5.pdf
PWC	https://paperswithcode.com/paper/structured-semantic-model-supported-deep
Repo	https://github.com/niuchenglei/ssm-dnn
Framework	tf
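
The Embedding&MLP paradigm described in the abstract can be sketched in a few lines. This is a generic illustration with random weights, not the SSM or Wide&Deep model: the layer sizes, function names, and single hidden layer are assumptions. The `num_parameters` helper shows how the trainable variable count depends on the number of feature fields and the embedding dimension.

```python
import math
import random


def init_model(vocab_sizes, dim, hidden, seed=0):
    """One embedding table per feature field, then a one-hidden-layer MLP."""
    rng = random.Random(seed)
    emb = [[[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(v)]
           for v in vocab_sizes]
    in_dim = len(vocab_sizes) * dim
    w1 = [[rng.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(hidden)]
    w2 = [rng.gauss(0, 0.1) for _ in range(hidden)]
    return {"emb": emb, "w1": w1, "b1": [0.0] * hidden, "w2": w2, "b2": 0.0}


def predict_ctr(model, feature_ids):
    """Look up embeddings, concatenate them, and run the MLP with a sigmoid output."""
    x = [v for table, idx in zip(model["emb"], feature_ids) for v in table[idx]]
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(model["w1"], model["b1"])]
    logit = sum(w * hi for w, hi in zip(model["w2"], h)) + model["b2"]
    return 1.0 / (1.0 + math.exp(-logit))


def num_parameters(vocab_sizes, dim, hidden):
    """Embedding weights + first layer (weights, biases) + output layer."""
    in_dim = len(vocab_sizes) * dim
    return sum(vocab_sizes) * dim + in_dim * hidden + hidden + hidden + 1


model = init_model(vocab_sizes=[10, 5, 3], dim=4, hidden=8)
click_probability = predict_ctr(model, feature_ids=[2, 0, 1])
```

Both the embedding tables and the first MLP layer scale with the embedding dimension, and the first layer also scales with the number of fields, which is the growth in trainable variables that the abstract points to.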

End-to-End Learning on 3D Protein Structure for Interface Prediction


Title	End-to-End Learning on 3D Protein Structure for Interface Prediction
Authors	Raphael J. L. Townshend, Rishi Bedi, Patricia A. Suriana, Ron O. Dror
Abstract	Despite an explosion in the number of experimentally determined, atomically detailed structures of biomolecules, many critical tasks in structural biology remain data-limited. Whether performance in such tasks can be improved by using large repositories of tangentially related structural data remains an open question. To address this question, we focused on a central problem in biology: predicting how proteins interact with one another—that is, which surfaces of one protein bind to those of another protein. We built a training dataset, the Database of Interacting Protein Structures (DIPS), that contains biases but is two orders of magnitude larger than those used previously. We found that these biases significantly degrade the performance of existing methods on gold-standard data. Hypothesizing that assumptions baked into the hand-crafted features on which these methods depend were the source of the problem, we developed the first end-to-end learning model for protein interface prediction, the Siamese Atomic Surfacelet Network (SASNet). Using only spatial coordinates and identities of atoms, SASNet outperforms state-of-the-art methods trained on gold-standard structural data, even when trained on only 3% of our new dataset. Code and data available at https://github.com/drorlab/DIPS.
Tasks	Transfer Learning
Published	2018-07-03
URL	https://arxiv.org/abs/1807.01297v5
PDF	https://arxiv.org/pdf/1807.01297v5.pdf
PWC	https://paperswithcode.com/paper/transferrable-end-to-end-learning-for-protein
Repo	https://github.com/drorlab/DIPS
Framework	tf

Visualizing Convolutional Neural Network Protein-Ligand Scoring


Title	Visualizing Convolutional Neural Network Protein-Ligand Scoring
Authors	Joshua Hochuli, Alec Helbling, Tamar Skaist, Matthew Ragoza, David Ryan Koes
Abstract	Protein-ligand scoring is an important step in a structure-based drug design pipeline. Selecting a correct binding pose and predicting the binding affinity of a protein-ligand complex enables effective virtual screening. Machine learning techniques can make use of the increasing amounts of structural data that are becoming publicly available. Convolutional neural network (CNN) scoring functions in particular have shown promise in pose selection and affinity prediction for protein-ligand complexes. Neural networks are known for being difficult to interpret. Understanding the decisions of a particular network can help tune parameters and training data to maximize performance. Visualization of neural networks helps decompose complex scoring functions into pictures that are more easily parsed by humans. Here we present three methods for visualizing how individual protein-ligand complexes are interpreted by 3D convolutional neural networks. We also present a visualization of the convolutional filters and their weights. We describe how the intuition provided by these visualizations aids in network design.
Tasks
Published	2018-03-06
URL	http://arxiv.org/abs/1803.02398v1
PDF	http://arxiv.org/pdf/1803.02398v1.pdf
PWC	https://paperswithcode.com/paper/visualizing-convolutional-neural-network
Repo	https://github.com/gnina/gnina
Framework	none

Are Deep Policy Gradient Algorithms Truly Policy Gradient Algorithms?


Title	Are Deep Policy Gradient Algorithms Truly Policy Gradient Algorithms?
Authors	Andrew Ilyas, Logan Engstrom, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry
Abstract	We study how the behavior of deep policy gradient algorithms reflects the conceptual framework motivating their development. We propose a fine-grained analysis of state-of-the-art methods based on key aspects of this framework: gradient estimation, value prediction, optimization landscapes, and trust region enforcement. We find that from this perspective, the behavior of deep policy gradient algorithms often deviates from what their motivating framework would predict. Our analysis suggests first steps towards solidifying the foundations of these algorithms, and in particular indicates that we may need to move beyond the current benchmark-centric evaluation methodology.
Tasks
Published	2018-11-06
URL	http://arxiv.org/abs/1811.02553v3
PDF	http://arxiv.org/pdf/1811.02553v3.pdf
PWC	https://paperswithcode.com/paper/are-deep-policy-gradient-algorithms-truly
Repo	https://github.com/BPDanek/learning_resources
Framework	none

Deep Face Recognition: A Survey


Title	Deep Face Recognition: A Survey
Authors	Mei Wang, Weihong Deng
Abstract	Deep learning applies multiple processing layers to learn representations of data with multiple levels of feature extraction. This emerging technique has reshaped the research landscape of face recognition since 2014, launched by the breakthroughs of Deepface and DeepID methods. Since then, deep face recognition (FR) technique, which leverages the hierarchical architecture to learn discriminative face representation, has dramatically improved the state-of-the-art performance and fostered numerous successful real-world applications. In this paper, we provide a comprehensive survey of the recent developments on deep FR, covering the broad topics on algorithms, data, and scenes. First, we summarize different network architectures and loss functions proposed in the rapid evolution of the deep FR methods. Second, the related face processing methods are categorized into two classes: "one-to-many augmentation" and "many-to-one normalization". Then, we summarize and compare the commonly used databases for both model training and evaluation. Third, we review miscellaneous scenes in deep FR, such as cross-factor, heterogeneous, multiple-media and industry scenes. Finally, potential deficiencies of the current methods and several future directions are highlighted.
Tasks	Face Recognition
Published	2018-04-18
URL	http://arxiv.org/abs/1804.06655v8
PDF	http://arxiv.org/pdf/1804.06655v8.pdf
PWC	https://paperswithcode.com/paper/deep-face-recognition-a-survey
Repo	https://github.com/parvatijay2901/FaceNet_FR
Framework	pytorch

Attributing Fake Images to GANs: Learning and Analyzing GAN Fingerprints


Title	Attributing Fake Images to GANs: Learning and Analyzing GAN Fingerprints
Authors	Ning Yu, Larry Davis, Mario Fritz
Abstract	Recent advances in Generative Adversarial Networks (GANs) have shown increasing success in generating photorealistic images. But they also raise challenges to visual forensics and model attribution. We present the first study of learning GAN fingerprints towards image attribution and using them to classify an image as real or GAN-generated. For GAN-generated images, we further identify their sources. Our experiments show that (1) GANs carry distinct model fingerprints and leave stable fingerprints in their generated images, which support image attribution; (2) even minor differences in GAN training can result in different fingerprints, which enables fine-grained model authentication; (3) fingerprints persist across different image frequencies and patches and are not biased by GAN artifacts; (4) fingerprint finetuning is effective in immunizing against five types of adversarial image perturbations; and (5) comparisons also show our learned fingerprints consistently outperform several baselines in a variety of setups.
Tasks	Image Generation
Published	2018-11-20
URL	https://arxiv.org/abs/1811.08180v3
PDF	https://arxiv.org/pdf/1811.08180v3.pdf
PWC	https://paperswithcode.com/paper/attributing-fake-images-to-gans-analyzing
Repo	https://github.com/ningyu1991/GANFingerprints
Framework	tf

Phase Harmonic Correlations and Convolutional Neural Networks


Title	Phase Harmonic Correlations and Convolutional Neural Networks
Authors	Stéphane Mallat, Sixin Zhang, Gaspar Rochette
Abstract	A major issue in harmonic analysis is to capture the phase dependence of frequency representations, which carries important signal properties. It seems that convolutional neural networks have found a way. Over time-series and images, convolutional networks often learn a first layer of filters which are well localized in the frequency domain, with different phases. We show that a rectifier then acts as a filter on the phase of the resulting coefficients. It computes signal descriptors which are local in space, frequency and phase. The non-linear phase filter becomes a multiplicative operator over phase harmonics computed with a Fourier transform along the phase. We prove that it defines a bi-Lipschitz and invertible representation. The correlations of phase harmonics coefficients characterise coherent structures from their phase dependence across frequencies. For wavelet filters, we show numerically that signals having sparse wavelet coefficients can be recovered from few phase harmonic correlations, which provide a compressive representation.
Tasks	Time Series
Published	2018-10-29
URL	https://arxiv.org/abs/1810.12136v2
PDF	https://arxiv.org/pdf/1810.12136v2.pdf
PWC	https://paperswithcode.com/paper/phase-harmonics-and-correlation-invariants-in
Repo	https://github.com/kymatio/phaseharmonics
Framework	pytorch
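
A phase harmonic can be illustrated with a short sketch. It assumes the definition in which the k-th phase harmonic of a complex coefficient z keeps the modulus and multiplies the phase by k, that is |z| e^{i k arg(z)}; the function name is hypothetical and the abstract itself does not spell out this formula.

```python
import cmath


def phase_harmonic(z, k):
    """Keep the modulus of z and multiply its phase by the integer k."""
    return abs(z) * cmath.exp(1j * k * cmath.phase(z))


# Unlike the ordinary power z**k, the modulus is left unchanged.
z = 2j                          # modulus 2, phase pi/2
h2 = phase_harmonic(z, 2)       # modulus 2, phase pi
```

Leaving the modulus unchanged, in contrast to `z ** k`, is what keeps the operation from amplifying large coefficients.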

Face-Cap: Image Captioning using Facial Expression Analysis


Title	Face-Cap: Image Captioning using Facial Expression Analysis
Authors	Omid Mohamad Nezami, Mark Dras, Peter Anderson, Len Hamey
Abstract	Image captioning is the process of generating a natural language description of an image. Most current image captioning models, however, do not take into account the emotional aspect of an image, which is very relevant to activities and interpersonal relationships represented therein. Towards developing a model that can produce human-like captions incorporating these, we use facial expression features extracted from images including human faces, with the aim of improving the descriptive ability of the model. In this work, we present two variants of our Face-Cap model, which embed facial expression features in different ways, to generate image captions. Using all standard evaluation metrics, our Face-Cap models outperform a state-of-the-art baseline model for generating image captions when applied to an image caption dataset extracted from the standard Flickr 30K dataset, consisting of around 11K images containing faces. An analysis of the captions finds that, perhaps surprisingly, the improvement in caption quality appears to come not from the addition of adjectives linked to emotional aspects of the images, but from more variety in the actions described in the captions.
Tasks	Image Captioning
Published	2018-07-06
URL	http://arxiv.org/abs/1807.02250v2
PDF	http://arxiv.org/pdf/1807.02250v2.pdf
PWC	https://paperswithcode.com/paper/face-cap-image-captioning-using-facial
Repo	https://github.com/omidmn/Face-Cap
Framework	none

Deep Reinforcement Learning for Surgical Gesture Segmentation and Classification


Title	Deep Reinforcement Learning for Surgical Gesture Segmentation and Classification
Authors	Daochang Liu, Tingting Jiang
Abstract	Recognition of surgical gesture is crucial for surgical skill assessment and efficient surgery training. Prior works on this task are based on either variant graphical models such as HMMs and CRFs, or deep learning models such as Recurrent Neural Networks and Temporal Convolutional Networks. Most of the current approaches usually suffer from over-segmentation and therefore low segment-level edit scores. In contrast, we present an essentially different methodology by modeling the task as a sequential decision-making process. An intelligent agent is trained using reinforcement learning with hierarchical features from a deep model. Temporal consistency is integrated into our action design and reward mechanism to reduce over-segmentation errors. Experiments on JIGSAWS dataset demonstrate that the proposed method performs better than state-of-the-art methods in terms of the edit score and on par in frame-wise accuracy. Our code will be released later.
Tasks	Decision Making, Surgical Gesture Recognition
Published	2018-06-21
URL	http://arxiv.org/abs/1806.08089v1
PDF	http://arxiv.org/pdf/1806.08089v1.pdf
PWC	https://paperswithcode.com/paper/deep-reinforcement-learning-for-surgical
Repo	https://github.com/Finspire13/RL-Surgical-Gesture-Segmentation
Framework	pytorch
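
The segment-level edit score mentioned in the abstract penalizes over-segmentation even when most frames are labeled correctly. The sketch below assumes a common definition: collapse frame labels into a sequence of segments, compute the Levenshtein distance between predicted and true segment sequences, and normalize by the longer sequence. The function names are illustrative and the exact normalization used in the paper may differ.

```python
def to_segments(frame_labels):
    """Collapse consecutive identical frame labels into one segment label."""
    segments = []
    for label in frame_labels:
        if not segments or segments[-1] != label:
            segments.append(label)
    return segments


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def edit_score(pred_frames, true_frames):
    """Normalized segment-level edit score in the range 0 to 100."""
    p, t = to_segments(pred_frames), to_segments(true_frames)
    denom = max(len(p), len(t))
    if denom == 0:
        return 100.0
    return (1.0 - levenshtein(p, t) / denom) * 100.0


truth = ["A"] * 5 + ["B"] * 5
# One wrong frame splits the first gesture into extra segments.
over_segmented = ["A", "A", "B", "A", "A", "B", "B", "B", "B", "B"]
score = edit_score(over_segmented, truth)   # 50.0
frame_accuracy = sum(p == t for p, t in zip(over_segmented, truth)) / len(truth)
```

In this example a single mislabeled frame leaves frame-wise accuracy at 0.9 but halves the edit score, which is why a method can be on par in frame-wise accuracy and still differ substantially in edit score.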