Paper Group AWR 210
Deep Generative Models for Distribution-Preserving Lossy Compression. Bayesian Deep Learning on a Quantum Computer. TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection. Deep Convolutional AutoEncoder-based Lossy Image Compression. Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples. Conditional Probability Mode …
Deep Generative Models for Distribution-Preserving Lossy Compression
Title | Deep Generative Models for Distribution-Preserving Lossy Compression |
Authors | Michael Tschannen, Eirikur Agustsson, Mario Lucic |
Abstract | We propose and study the problem of distribution-preserving lossy compression. Motivated by recent advances in extreme image compression which maintain artifact-free reconstructions even at very low bitrates, we propose to optimize the rate-distortion tradeoff under the constraint that the reconstructed samples follow the distribution of the training data. The resulting compression system recovers both ends of the spectrum: at zero bitrate it learns a generative model of the data, and at high enough bitrates it achieves perfect reconstruction. Furthermore, for intermediate bitrates it smoothly interpolates between learning a generative model of the training data and perfectly reconstructing the training samples. We study several methods to approximately solve the proposed optimization problem, including a novel combination of Wasserstein GAN and Wasserstein Autoencoder, and present an extensive theoretical and empirical characterization of the proposed compression systems. |
Tasks | Image Compression, Image Generation |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.11057v2 |
http://arxiv.org/pdf/1805.11057v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-generative-models-for-distribution |
Repo | https://github.com/mitscha/dplc |
Framework | pytorch |
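As a rough sketch of the constrained objective the abstract describes, the problem can be written as follows. The notation is assumed here for illustration and is not taken from the listing: $E$ is an encoder limited to $R$ bits, $G$ is a (possibly stochastic) decoder, $d$ is a distortion measure, and $P_X$ is the data distribution.

$$
\min_{E,\,G}\; \mathbb{E}_{X \sim P_X}\big[\, d\big(X,\, G(E(X))\big) \,\big]
\quad \text{subject to} \quad G(E(X)) \sim P_X, \qquad E(X) \in \{1, \dots, 2^{R}\}.
$$

Under this reading, at $R = 0$ the code carries no information about $X$, so the constraint alone forces $G$ to behave like a generative model of $P_X$; as $R$ grows large enough, $G(E(X)) = X$ satisfies the constraint with zero distortion.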
Bayesian Deep Learning on a Quantum Computer
Title | Bayesian Deep Learning on a Quantum Computer |
Authors | Zhikuan Zhao, Alejandro Pozas-Kerstjens, Patrick Rebentrost, Peter Wittek |
Abstract | Bayesian methods in machine learning, such as Gaussian processes, have great advantages compared to other techniques. In particular, they provide estimates of the uncertainty associated with a prediction. Extending the Bayesian approach to deep architectures has remained a major challenge. Recent results connected deep feedforward neural networks with Gaussian processes, allowing training without backpropagation. This connection enables us to leverage a quantum algorithm designed for Gaussian processes and develop a new algorithm for Bayesian deep learning on quantum computers. The properties of the kernel matrix in the Gaussian process ensure the efficient execution of the core component of the protocol, quantum matrix inversion, providing an at least polynomial speedup over classical algorithms. Furthermore, we demonstrate the execution of the algorithm on contemporary quantum computers and analyze its robustness with respect to realistic noise models. |
Tasks | Gaussian Processes |
Published | 2018-06-29 |
URL | https://arxiv.org/abs/1806.11463v3 |
https://arxiv.org/pdf/1806.11463v3.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-deep-learning-on-a-quantum-computer |
Repo | https://github.com/balbok0/bayes-nn-qsh |
Framework | none |
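To see why matrix inversion is the core component, it may help to recall the standard Gaussian process regression predictive equations. These are the textbook form, not something stated in the listing; the symbols are assumptions: $K$ is the kernel matrix on the training inputs, $k_*$ the vector of kernel values between a test input $x_*$ and the training inputs, $y$ the training targets, and $\sigma_n^2$ the noise variance.

$$
\mu_* = k_*^{\top} \big(K + \sigma_n^2 I\big)^{-1} y,
\qquad
\sigma_*^2 = k(x_*, x_*) - k_*^{\top} \big(K + \sigma_n^2 I\big)^{-1} k_* .
$$

Both the predictive mean and the uncertainty estimate depend on applying $(K + \sigma_n^2 I)^{-1}$, which is the step a matrix inversion routine would accelerate.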
TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection
Title | TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection |
Authors | Tirthankar Ghosal, Amitra Salam, Swati Tiwari, Asif Ekbal, Pushpak Bhattacharyya |
Abstract | Detecting novelty of an entire document is an Artificial Intelligence (AI) frontier problem that has widespread NLP applications, such as extractive document summarization, tracking development of news events, predicting impact of scholarly articles, etc. Important though the problem is, we are unaware of any benchmark document level data that correctly addresses the evaluation of automatic novelty detection techniques in a classification framework. To bridge this gap, we present here a resource for benchmarking the techniques for document level novelty detection. We create the resource via event-specific crawling of news documents across several domains in a periodic manner. We release the annotated corpus with necessary statistics and show its use with a developed system for the problem in concern. |
Tasks | Document Summarization, Extractive Document Summarization |
Published | 2018-02-20 |
URL | http://arxiv.org/abs/1802.06950v1 |
http://arxiv.org/pdf/1802.06950v1.pdf | |
PWC | https://paperswithcode.com/paper/tap-dlnd-10-a-corpus-for-document-level |
Repo | https://github.com/edithal-14/A-Deep-Neural-Solution-To-Document-Level-Novelty-Detection-COLING-2018- |
Framework | tf |
Deep Convolutional AutoEncoder-based Lossy Image Compression
Title | Deep Convolutional AutoEncoder-based Lossy Image Compression |
Authors | Zhengxue Cheng, Heming Sun, Masaru Takeuchi, Jiro Katto |
Abstract | Image compression has been investigated as a fundamental research topic for many decades. Recently, deep learning has achieved great success in many computer vision tasks, and is gradually being used in image compression. In this paper, we present a lossy image compression architecture, which utilizes the advantages of convolutional autoencoder (CAE) to achieve a high coding efficiency. First, we design a novel CAE architecture to replace the conventional transforms and train this CAE using a rate-distortion loss function. Second, to generate a more energy-compact representation, we utilize the principal components analysis (PCA) to rotate the feature maps produced by the CAE, and then apply the quantization and entropy coder to generate the codes. Experimental results demonstrate that our method outperforms traditional image coding algorithms, by achieving a 13.7% BD-rate decrement on the Kodak database images compared to JPEG2000. Besides, our method maintains a moderate complexity similar to JPEG2000. |
Tasks | Image Compression, Quantization |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09535v1 |
http://arxiv.org/pdf/1804.09535v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-convolutional-autoencoder-based-lossy |
Repo | https://github.com/cachett/DCGANandCAE |
Framework | pytorch |
Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples
Title | Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples |
Authors | Zihao Liu, Qi Liu, Tao Liu, Nuo Xu, Xue Lin, Yanzhi Wang, Wujie Wen |
Abstract | Image compression-based approaches for defending against adversarial-example attacks, which threaten the safe use of deep neural networks (DNN), have been investigated recently. However, prior works mainly rely on directly tuning parameters like compression rate to blindly reduce image features, thereby lacking guarantees on both defense efficiency (i.e. accuracy of polluted images) and classification accuracy of benign images after applying defense methods. To overcome these limitations, we propose a JPEG-based defensive compression framework, namely “feature distillation”, to effectively rectify adversarial examples without impacting classification accuracy on benign data. Our framework significantly escalates the defense efficiency with marginal accuracy reduction using a two-step method: First, we maximize the filtering of malicious features in adversarial input perturbations by developing defensive quantization in the frequency domain of JPEG compression or decompression, guided by a semi-analytical method; Second, we suppress the distortions of benign features to restore classification accuracy through a DNN-oriented quantization refinement process. Our experimental results show that the proposed “feature distillation” can significantly surpass the latest input-transformation based mitigations such as Quilting and TV Minimization in three aspects: defense efficiency (improving classification accuracy from $\sim 20\%$ to $\sim 90\%$ on adversarial examples), accuracy of benign images after defense ($\le 1\%$ accuracy degradation), and processing time per image ($\sim 259\times$ speedup). Moreover, our solution can also provide the best defense efficiency ($\sim 60\%$ accuracy) against the recent adaptive attack with the least accuracy reduction ($\sim 1\%$) on benign images when compared with other input-transformation based defense methods. |
Tasks | Image Compression, Quantization |
Published | 2018-03-14 |
URL | http://arxiv.org/abs/1803.05787v2 |
http://arxiv.org/pdf/1803.05787v2.pdf | |
PWC | https://paperswithcode.com/paper/feature-distillation-dnn-oriented-jpeg |
Repo | https://github.com/zihaoliu123/Feature-Distillation-DNN-Oriented-JPEG-Compression-Against-Adversarial-Examples |
Framework | tf |
Conditional Probability Models for Deep Image Compression
Title | Conditional Probability Models for Deep Image Compression |
Authors | Fabian Mentzer, Eirikur Agustsson, Michael Tschannen, Radu Timofte, Luc Van Gool |
Abstract | Deep Neural Networks trained as image auto-encoders have recently emerged as a promising direction for advancing the state-of-the-art in image compression. The key challenge in learning such networks is twofold: To deal with quantization, and to control the trade-off between reconstruction error (distortion) and entropy (rate) of the latent image representation. In this paper, we focus on the latter challenge and propose a new technique to navigate the rate-distortion trade-off for an image compression auto-encoder. The main idea is to directly model the entropy of the latent representation by using a context model: A 3D-CNN which learns a conditional probability model of the latent distribution of the auto-encoder. During training, the auto-encoder makes use of the context model to estimate the entropy of its representation, and the context model is concurrently updated to learn the dependencies between the symbols in the latent representation. Our experiments show that this approach, when measured in MS-SSIM, yields a state-of-the-art image compression system based on a simple convolutional auto-encoder. |
Tasks | Image Compression, Quantization |
Published | 2018-01-12 |
URL | https://arxiv.org/abs/1801.04260v4 |
https://arxiv.org/pdf/1801.04260v4.pdf | |
PWC | https://paperswithcode.com/paper/conditional-probability-models-for-deep-image |
Repo | https://github.com/fab-jul/imgcomp-cvpr |
Framework | tf |
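The idea of estimating the rate of a quantized latent representation with a conditional probability model can be illustrated with a much simpler stand-in. The sketch below is not the paper's 3D-CNN: it assumes a hypothetical first-order context (only the previous symbol) with Laplace-smoothed counts, and all function names are invented for illustration. It shows how the estimated bit cost is the sum of $-\log_2 P(\text{symbol} \mid \text{context})$ over the symbols.

```python
import math
from collections import Counter, defaultdict


def train_context_model(symbols):
    """Count how often each symbol follows each previous symbol.

    Hypothetical first-order context model; None marks the start context.
    """
    counts = defaultdict(Counter)
    prev = None
    for s in symbols:
        counts[prev][s] += 1
        prev = s
    return counts


def conditional_prob(counts, context, symbol, num_levels):
    """Laplace-smoothed estimate of P(symbol | context)."""
    ctx = counts.get(context, Counter())
    return (ctx[symbol] + 1) / (sum(ctx.values()) + num_levels)


def estimated_bits(symbols, counts, num_levels):
    """Estimated code length in bits: sum of -log2 P(symbol | previous symbol)."""
    total = 0.0
    prev = None
    for s in symbols:
        total -= math.log2(conditional_prob(counts, prev, s, num_levels))
        prev = s
    return total


if __name__ == "__main__":
    # Toy "latent" with strong dependencies between neighbouring symbols.
    latent = [0, 1, 2, 3] * 16
    model = train_context_model(latent)
    print("context-model estimate:", estimated_bits(latent, model, 4))
    print("uniform baseline:", len(latent) * math.log2(4))
```

When the symbols depend strongly on their context, the estimate falls well below the uniform baseline of `len(latent) * log2(num_levels)` bits, which is the kind of dependency a learned context model is meant to exploit.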
Structured Semantic Model supported Deep Neural Network for Click-Through Rate Prediction
Title | Structured Semantic Model supported Deep Neural Network for Click-Through Rate Prediction |
Authors | Chenglei Niu, Guojing Zhong, Ying Liu, Yandong Zhang, Yongsheng Sun, Ailong He, Zhaoji Chen |
Abstract | With the rapid development of online advertising and recommendation systems, click-through rate prediction is expected to play an increasingly important role. Recently, many DNN-based models that follow a similar Embedding&MLP paradigm have been proposed and have achieved good results in image, voice, and NLP fields. Among these methods, the Wide&Deep model announced by Google plays a key role. Most models first map large-scale sparse input features into low-dimensional vectors, which are transformed to fixed-length vectors and then concatenated before being fed into a multilayer perceptron (MLP) to learn non-linear relations among input features. The number of trainable variables normally grows dramatically as the number of feature fields and the embedding dimension grow. It is a big challenge to get state-of-the-art results by training the deep neural network and the embeddings together, since this easily falls into local optima or overfits. In this paper, we propose a Structured Semantic Model (SSM) to tackle this challenge by designing an orthogonal base convolution and pooling model which adaptively learns the multi-scale base semantic representation between features, supervised by the click label. The output of the SSM is then used in Wide&Deep for CTR prediction. Experiments on two public datasets as well as a real Weibo production dataset with over 1 billion samples have demonstrated the effectiveness of our proposed approach, with superior performance compared to state-of-the-art methods. |
Tasks | Click-Through Rate Prediction, Recommendation Systems |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01353v5 |
http://arxiv.org/pdf/1812.01353v5.pdf | |
PWC | https://paperswithcode.com/paper/structured-semantic-model-supported-deep |
Repo | https://github.com/niuchenglei/ssm-dnn |
Framework | tf |
End-to-End Learning on 3D Protein Structure for Interface Prediction
Title | End-to-End Learning on 3D Protein Structure for Interface Prediction |
Authors | Raphael J. L. Townshend, Rishi Bedi, Patricia A. Suriana, Ron O. Dror |
Abstract | Despite an explosion in the number of experimentally determined, atomically detailed structures of biomolecules, many critical tasks in structural biology remain data-limited. Whether performance in such tasks can be improved by using large repositories of tangentially related structural data remains an open question. To address this question, we focused on a central problem in biology: predicting how proteins interact with one another—that is, which surfaces of one protein bind to those of another protein. We built a training dataset, the Database of Interacting Protein Structures (DIPS), that contains biases but is two orders of magnitude larger than those used previously. We found that these biases significantly degrade the performance of existing methods on gold-standard data. Hypothesizing that assumptions baked into the hand-crafted features on which these methods depend were the source of the problem, we developed the first end-to-end learning model for protein interface prediction, the Siamese Atomic Surfacelet Network (SASNet). Using only spatial coordinates and identities of atoms, SASNet outperforms state-of-the-art methods trained on gold-standard structural data, even when trained on only 3% of our new dataset. Code and data available at https://github.com/drorlab/DIPS. |
Tasks | Transfer Learning |
Published | 2018-07-03 |
URL | https://arxiv.org/abs/1807.01297v5 |
https://arxiv.org/pdf/1807.01297v5.pdf | |
PWC | https://paperswithcode.com/paper/transferrable-end-to-end-learning-for-protein |
Repo | https://github.com/drorlab/DIPS |
Framework | tf |
Visualizing Convolutional Neural Network Protein-Ligand Scoring
Title | Visualizing Convolutional Neural Network Protein-Ligand Scoring |
Authors | Joshua Hochuli, Alec Helbling, Tamar Skaist, Matthew Ragoza, David Ryan Koes |
Abstract | Protein-ligand scoring is an important step in a structure-based drug design pipeline. Selecting a correct binding pose and predicting the binding affinity of a protein-ligand complex enables effective virtual screening. Machine learning techniques can make use of the increasing amounts of structural data that are becoming publicly available. Convolutional neural network (CNN) scoring functions in particular have shown promise in pose selection and affinity prediction for protein-ligand complexes. Neural networks are known for being difficult to interpret. Understanding the decisions of a particular network can help tune parameters and training data to maximize performance. Visualization of neural networks helps decompose complex scoring functions into pictures that are more easily parsed by humans. Here we present three methods for visualizing how individual protein-ligand complexes are interpreted by 3D convolutional neural networks. We also present a visualization of the convolutional filters and their weights. We describe how the intuition provided by these visualizations aids in network design. |
Tasks | |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.02398v1 |
http://arxiv.org/pdf/1803.02398v1.pdf | |
PWC | https://paperswithcode.com/paper/visualizing-convolutional-neural-network |
Repo | https://github.com/gnina/gnina |
Framework | none |
Are Deep Policy Gradient Algorithms Truly Policy Gradient Algorithms?
Title | Are Deep Policy Gradient Algorithms Truly Policy Gradient Algorithms? |
Authors | Andrew Ilyas, Logan Engstrom, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry |
Abstract | We study how the behavior of deep policy gradient algorithms reflects the conceptual framework motivating their development. We propose a fine-grained analysis of state-of-the-art methods based on key aspects of this framework: gradient estimation, value prediction, optimization landscapes, and trust region enforcement. We find that from this perspective, the behavior of deep policy gradient algorithms often deviates from what their motivating framework would predict. Our analysis suggests first steps towards solidifying the foundations of these algorithms, and in particular indicates that we may need to move beyond the current benchmark-centric evaluation methodology. |
Tasks | |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02553v3 |
http://arxiv.org/pdf/1811.02553v3.pdf | |
PWC | https://paperswithcode.com/paper/are-deep-policy-gradient-algorithms-truly |
Repo | https://github.com/BPDanek/learning_resources |
Framework | none |
Deep Face Recognition: A Survey
Title | Deep Face Recognition: A Survey |
Authors | Mei Wang, Weihong Deng |
Abstract | Deep learning applies multiple processing layers to learn representations of data with multiple levels of feature extraction. This emerging technique has reshaped the research landscape of face recognition since 2014, launched by the breakthroughs of Deepface and DeepID methods. Since then, deep face recognition (FR) technique, which leverages the hierarchical architecture to learn discriminative face representation, has dramatically improved the state-of-the-art performance and fostered numerous successful real-world applications. In this paper, we provide a comprehensive survey of the recent developments on deep FR, covering the broad topics on algorithms, data, and scenes. First, we summarize different network architectures and loss functions proposed in the rapid evolution of the deep FR methods. Second, the related face processing methods are categorized into two classes: “one-to-many augmentation” and “many-to-one normalization”. Then, we summarize and compare the commonly used databases for both model training and evaluation. Third, we review miscellaneous scenes in deep FR, such as cross-factor, heterogeneous, multiple-media and industry scenes. Finally, potential deficiencies of the current methods and several future directions are highlighted. |
Tasks | Face Recognition |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.06655v8 |
http://arxiv.org/pdf/1804.06655v8.pdf | |
PWC | https://paperswithcode.com/paper/deep-face-recognition-a-survey |
Repo | https://github.com/parvatijay2901/FaceNet_FR |
Framework | pytorch |
Attributing Fake Images to GANs: Learning and Analyzing GAN Fingerprints
Title | Attributing Fake Images to GANs: Learning and Analyzing GAN Fingerprints |
Authors | Ning Yu, Larry Davis, Mario Fritz |
Abstract | Recent advances in Generative Adversarial Networks (GANs) have shown increasing success in generating photorealistic images. But they also raise challenges to visual forensics and model attribution. We present the first study of learning GAN fingerprints towards image attribution and using them to classify an image as real or GAN-generated. For GAN-generated images, we further identify their sources. Our experiments show that (1) GANs carry distinct model fingerprints and leave stable fingerprints in their generated images, which support image attribution; (2) even minor differences in GAN training can result in different fingerprints, which enables fine-grained model authentication; (3) fingerprints persist across different image frequencies and patches and are not biased by GAN artifacts; (4) fingerprint finetuning is effective in immunizing against five types of adversarial image perturbations; and (5) comparisons also show our learned fingerprints consistently outperform several baselines in a variety of setups. |
Tasks | Image Generation |
Published | 2018-11-20 |
URL | https://arxiv.org/abs/1811.08180v3 |
https://arxiv.org/pdf/1811.08180v3.pdf | |
PWC | https://paperswithcode.com/paper/attributing-fake-images-to-gans-analyzing |
Repo | https://github.com/ningyu1991/GANFingerprints |
Framework | tf |
Phase Harmonic Correlations and Convolutional Neural Networks
Title | Phase Harmonic Correlations and Convolutional Neural Networks |
Authors | Stéphane Mallat, Sixin Zhang, Gaspar Rochette |
Abstract | A major issue in harmonic analysis is to capture the phase dependence of frequency representations, which carries important signal properties. It seems that convolutional neural networks have found a way. Over time-series and images, convolutional networks often learn a first layer of filters which are well localized in the frequency domain, with different phases. We show that a rectifier then acts as a filter on the phase of the resulting coefficients. It computes signal descriptors which are local in space, frequency and phase. The non-linear phase filter becomes a multiplicative operator over phase harmonics computed with a Fourier transform along the phase. We prove that it defines a bi-Lipschitz and invertible representation. The correlations of phase harmonics coefficients characterise coherent structures from their phase dependence across frequencies. For wavelet filters, we show numerically that signals having sparse wavelet coefficients can be recovered from few phase harmonic correlations, which provide a compressive representation. |
Tasks | Time Series |
Published | 2018-10-29 |
URL | https://arxiv.org/abs/1810.12136v2 |
https://arxiv.org/pdf/1810.12136v2.pdf | |
PWC | https://paperswithcode.com/paper/phase-harmonics-and-correlation-invariants-in |
Repo | https://github.com/kymatio/phaseharmonics |
Framework | pytorch |
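The statement that a rectifier acts as a filter on the phase can be made concrete with a short derivation. The notation below is assumed for illustration and may differ from the paper's: $z = |z| e^{i\varphi}$ is a complex filter coefficient, $\alpha$ is a phase shift, and $\rho(a) = \max(a, 0)$ is the rectifier.

$$
\rho\big(\mathrm{Re}(e^{i\alpha} z)\big) = \rho\big(|z| \cos(\alpha + \varphi)\big) = |z|\, \gamma(\alpha + \varphi),
\qquad \gamma(\theta) = \max(\cos\theta,\, 0).
$$

Since $\gamma$ is $2\pi$-periodic, expanding it in a Fourier series gives

$$
\gamma(\theta) = \sum_{k \in \mathbb{Z}} \hat{\gamma}(k)\, e^{ik\theta}
\quad \Longrightarrow \quad
\rho\big(\mathrm{Re}(e^{i\alpha} z)\big) = \sum_{k \in \mathbb{Z}} \hat{\gamma}(k)\, e^{ik\alpha}\, [z]^k,
\qquad [z]^k = |z|\, e^{ik\varphi}.
$$

Read this way, a Fourier transform along $\alpha$ turns the rectifier into a multiplication of each phase harmonic $[z]^k$ by the fixed coefficient $\hat{\gamma}(k)$.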
Face-Cap: Image Captioning using Facial Expression Analysis
Title | Face-Cap: Image Captioning using Facial Expression Analysis |
Authors | Omid Mohamad Nezami, Mark Dras, Peter Anderson, Len Hamey |
Abstract | Image captioning is the process of generating a natural language description of an image. Most current image captioning models, however, do not take into account the emotional aspect of an image, which is very relevant to activities and interpersonal relationships represented therein. Towards developing a model that can produce human-like captions incorporating these, we use facial expression features extracted from images including human faces, with the aim of improving the descriptive ability of the model. In this work, we present two variants of our Face-Cap model, which embed facial expression features in different ways, to generate image captions. Using all standard evaluation metrics, our Face-Cap models outperform a state-of-the-art baseline model for generating image captions when applied to an image caption dataset extracted from the standard Flickr 30K dataset, consisting of around 11K images containing faces. An analysis of the captions finds that, perhaps surprisingly, the improvement in caption quality appears to come not from the addition of adjectives linked to emotional aspects of the images, but from more variety in the actions described in the captions. |
Tasks | Image Captioning |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02250v2 |
http://arxiv.org/pdf/1807.02250v2.pdf | |
PWC | https://paperswithcode.com/paper/face-cap-image-captioning-using-facial |
Repo | https://github.com/omidmn/Face-Cap |
Framework | none |
Deep Reinforcement Learning for Surgical Gesture Segmentation and Classification
Title | Deep Reinforcement Learning for Surgical Gesture Segmentation and Classification |
Authors | Daochang Liu, Tingting Jiang |
Abstract | Recognition of surgical gesture is crucial for surgical skill assessment and efficient surgery training. Prior works on this task are based on either variant graphical models such as HMMs and CRFs, or deep learning models such as Recurrent Neural Networks and Temporal Convolutional Networks. Most of the current approaches usually suffer from over-segmentation and therefore low segment-level edit scores. In contrast, we present an essentially different methodology by modeling the task as a sequential decision-making process. An intelligent agent is trained using reinforcement learning with hierarchical features from a deep model. Temporal consistency is integrated into our action design and reward mechanism to reduce over-segmentation errors. Experiments on JIGSAWS dataset demonstrate that the proposed method performs better than state-of-the-art methods in terms of the edit score and on par in frame-wise accuracy. Our code will be released later. |
Tasks | Decision Making, Surgical Gesture Recognition |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08089v1 |
http://arxiv.org/pdf/1806.08089v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-for-surgical |
Repo | https://github.com/Finspire13/RL-Surgical-Gesture-Segmentation |
Framework | pytorch |
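The contrast between segment-level edit score and frame-wise accuracy can be illustrated with a small sketch. The abstract does not define the edit score, so the code assumes a commonly used form: 100 times one minus the Levenshtein distance between the predicted and true segment label sequences, normalized by the longer sequence. The function names and gesture labels are hypothetical.

```python
def to_segments(frame_labels):
    """Collapse a frame-wise label sequence into its ordered segment labels."""
    segments = []
    for label in frame_labels:
        if not segments or segments[-1] != label:
            segments.append(label)
    return segments


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def edit_score(pred_frames, true_frames):
    """Assumed definition: 100 * (1 - normalized segment-level edit distance)."""
    pred, true = to_segments(pred_frames), to_segments(true_frames)
    longest = max(len(pred), len(true))
    if longest == 0:
        return 100.0
    return 100.0 * (1.0 - levenshtein(pred, true) / longest)


def frame_accuracy(pred_frames, true_frames):
    """Percentage of frames whose predicted label matches the true label."""
    hits = sum(p == t for p, t in zip(pred_frames, true_frames))
    return 100.0 * hits / len(true_frames)


if __name__ == "__main__":
    true_frames = ["G1"] * 10 + ["G2"] * 10
    pred_frames = list(true_frames)
    pred_frames[4] = "G2"   # two isolated frame errors cause over-segmentation
    pred_frames[14] = "G1"
    print("frame accuracy:", frame_accuracy(pred_frames, true_frames))
    print("edit score:", edit_score(pred_frames, true_frames))
```

In this toy case two isolated frame errors leave frame-wise accuracy at 90% but split the prediction into six segments instead of two, so the edit score under the assumed definition drops to about 33. That gap is why temporal consistency matters for segment-level metrics.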