January 26, 2020

Paper Group ANR 1581

Non-discriminative data or weak model? On the relative importance of data and model resolution

Title Non-discriminative data or weak model? On the relative importance of data and model resolution
Authors Mark Sandler, Jonathan Baccash, Andrey Zhmoginov, Andrew Howard
Abstract We explore the question of how the resolution of the input image (“input resolution”) affects the performance of a neural network when compared to the resolution of the hidden layers (“internal resolution”). Adjusting these characteristics is frequently used as a hyperparameter providing a trade-off between model performance and accuracy. An intuitive interpretation is that the reduced information content in the low-resolution input causes decay in the accuracy. In this paper, we show that up to a point, the input resolution alone plays little role in the network performance, and it is the internal resolution that is the critical driver of model quality. We then build on these insights to develop novel neural network architectures that we call Isometric Neural Networks. These models maintain a fixed internal resolution throughout their entire depth. We demonstrate that they lead to high accuracy models with low activation footprint and parameter count.
Tasks
Published 2019-09-07
URL https://arxiv.org/abs/1909.03205v2
PDF https://arxiv.org/pdf/1909.03205v2.pdf
PWC https://paperswithcode.com/paper/non-discriminative-data-or-weak-model-on-the
Repo
Framework

Implicit Generative Modeling for Efficient Exploration

Title Implicit Generative Modeling for Efficient Exploration
Authors Neale Ratzlaff, Qinxun Bai, Li Fuxin, Wei Xu
Abstract Efficient exploration remains a challenging problem in reinforcement learning, especially for those tasks where rewards from environments are sparse. A commonly used approach for exploring such environments is to introduce some “intrinsic” reward. In this work, we focus on model uncertainty estimation as an intrinsic reward for efficient exploration. In particular, we introduce an implicit generative modeling approach to estimate a Bayesian uncertainty of the agent’s belief of the environment dynamics. Each random draw from our generative model is a neural network that instantiates the dynamic function, hence multiple draws would approximate the posterior, and the variance in the future prediction based on this posterior is used as an intrinsic reward for exploration. We design a training algorithm for our generative model based on the amortized Stein Variational Gradient Descent. In experiments, we compare our implementation with state-of-the-art intrinsic reward-based exploration approaches, including two recent approaches based on an ensemble of dynamic models. In challenging exploration tasks, our implicit generative model consistently outperforms competing approaches regarding data efficiency in exploration.
Tasks Efficient Exploration, Future prediction
Published 2019-11-19
URL https://arxiv.org/abs/1911.08017v2
PDF https://arxiv.org/pdf/1911.08017v2.pdf
PWC https://paperswithcode.com/paper/implicit-generative-modeling-for-efficient-1
Repo
Framework
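
The intrinsic reward described in the abstract above (variance of future predictions across several sampled dynamics models) can be illustrated with a small sketch. This is not the authors' implementation: the function names are hypothetical, and the random linear models merely stand in for draws from a trained generative model, which the abstract says is trained with amortized Stein Variational Gradient Descent.

```python
import random
from statistics import pvariance


def sample_dynamics_models(n_models, state_dim, seed=0):
    """Hypothetical stand-in for draws from a generative model.

    Each draw is a random linear dynamics function, stored as a weight
    matrix mapping (state, action) to the next state.
    """
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(state_dim + 1)]
         for _ in range(state_dim)]
        for _ in range(n_models)
    ]


def predict_next_state(weights, state, action):
    """Apply one sampled linear dynamics model to a (state, action) pair."""
    inputs = list(state) + [action]
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def intrinsic_reward(models, state, action):
    """Average per-dimension variance of the models' next-state predictions."""
    predictions = [predict_next_state(w, state, action) for w in models]
    dims = len(predictions[0])
    return sum(pvariance([p[d] for p in predictions]) for d in range(dims)) / dims


models = sample_dynamics_models(n_models=5, state_dim=2, seed=1)
print(intrinsic_reward(models, [1.0, -0.5], 0.3))
```

Where the sampled models disagree, the reward is large and the agent is drawn towards that region; where they agree, the reward shrinks towards zero.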

All SMILES Variational Autoencoder

Title All SMILES Variational Autoencoder
Authors Zaccary Alperstein, Artem Cherkasov, Jason Tyler Rolfe
Abstract Variational autoencoders (VAEs) defined over SMILES string and graph-based representations of molecules promise to improve the optimization of molecular properties, thereby revolutionizing the pharmaceuticals and materials industries. However, these VAEs are hindered by the non-unique nature of SMILES strings and the computational cost of graph convolutions. To efficiently pass messages along all paths through the molecular graph, we encode multiple SMILES strings of a single molecule using a set of stacked recurrent neural networks, pooling hidden representations of each atom between SMILES representations, and use attentional pooling to build a final fixed-length latent representation. By then decoding to a disjoint set of SMILES strings of the molecule, our All SMILES VAE learns an almost bijective mapping between molecules and latent representations near the high-probability-mass subspace of the prior. Our SMILES-derived but molecule-based latent representations significantly surpass the state-of-the-art in a variety of fully- and semi-supervised property regression and molecular property optimization tasks.
Tasks Drug Discovery
Published 2019-05-30
URL https://arxiv.org/abs/1905.13343v2
PDF https://arxiv.org/pdf/1905.13343v2.pdf
PWC https://paperswithcode.com/paper/all-smiles-vae
Repo
Framework

DeepCheck: A Non-intrusive Control-flow Integrity Checking based on Deep Learning

Title DeepCheck: A Non-intrusive Control-flow Integrity Checking based on Deep Learning
Authors Jiliang Zhang, Wuqiao Chen, Yuqi Niu
Abstract Code reuse attack (CRA) is a powerful attack that reuses existing code to hijack the program control flow. Control flow integrity (CFI) is one of the most popular mechanisms to protect against CRAs. However, current CFI techniques are difficult to deploy in real applications because they require modifying binaries or the compiler, extending instruction set architectures (ISA), or incurring unacceptable runtime overhead. To address these issues, we propose the first deep learning-based CFI technique, named DeepCheck, where the control flow graph (CFG) is split into chains for deep neural network (DNN) training. Then the integrity features of the CFG can be learned by the DNN to detect abnormal control flows. DeepCheck does not interrupt the application and hence incurs zero runtime overhead. Experimental results on Adobe Flash Player, Nginx, Proftpd and Firefox show that the average detection accuracy of DeepCheck is as high as 98.9%. In addition, 64 ROP exploits created by ROPGadget and Ropper are used to further test the effectiveness, which shows that the detection success rate reaches 100%.
Tasks
Published 2019-05-06
URL https://arxiv.org/abs/1905.01858v1
PDF https://arxiv.org/pdf/1905.01858v1.pdf
PWC https://paperswithcode.com/paper/deepcheck-a-non-intrusive-control-flow
Repo
Framework

GPT-based Generation for Classical Chinese Poetry

Title GPT-based Generation for Classical Chinese Poetry
Authors Yi Liao, Yasheng Wang, Qun Liu, Xin Jiang
Abstract We present a simple yet effective method for generating high quality classical Chinese poetry with Generative Pre-trained Language Model (GPT). The method adopts a simple GPT model, without using any human crafted rules or features, or designing any additional neural components. While the proposed model learns to generate various forms of classical Chinese poems, including Jueju, Lüshi, various Cipai and Couples, the generated poems are of very high quality. We also propose and implement a method to fine-tune the model to generate acrostic poetry. To the best of our knowledge, this is the first to employ GPT in developing a poetry generation system. We have released an online mini demonstration program on Wechat to show the generation capability of the proposed method for classical Chinese poetry.
Tasks Language Modelling
Published 2019-06-29
URL https://arxiv.org/abs/1907.00151v5
PDF https://arxiv.org/pdf/1907.00151v5.pdf
PWC https://paperswithcode.com/paper/gpt-based-generation-for-classical-chinese
Repo
Framework

Histogram Transform Ensembles for Density Estimation

Title Histogram Transform Ensembles for Density Estimation
Authors Hanyuan Hang
Abstract We investigate an algorithm named histogram transform ensembles (HTE) density estimator whose effectiveness is supported by both solid theoretical analysis and significant experimental performance. On the theoretical side, by decomposing the error term into approximation error and estimation error, we are able to conduct the following analysis: First of all, we establish the universal consistency under $L_1(\mu)$-norm. Secondly, under the assumption that the underlying density function resides in the Hölder space $C^{0,\alpha}$, we prove almost optimal convergence rates for both single and ensemble density estimators under $L_1(\mu)$-norm and $L_{\infty}(\mu)$-norm for different tail distributions, whereas in contrast, for its subspace $C^{1,\alpha}$ consisting of smoother functions, almost optimal convergence rates can only be established for the ensembles and the lower bound of the single estimators illustrates the benefits of ensembles over single density estimators. In the experiments, we first carry out simulations to illustrate that histogram transform ensembles surpass single histogram transforms, which offers powerful evidence to support the theoretical results in the space $C^{1,\alpha}$. Moreover, to further exert the experimental performances, we propose an adaptive version of HTE and study the parameters by generating several synthetic datasets with diversities in dimensions and distributions. Last but not least, real data experiments with other state-of-the-art density estimators demonstrate the accuracy of the adaptive HTE algorithm.
Tasks Density Estimation
Published 2019-11-24
URL https://arxiv.org/abs/1911.11581v1
PDF https://arxiv.org/pdf/1911.11581v1.pdf
PWC https://paperswithcode.com/paper/histogram-transform-ensembles-for-density
Repo
Framework
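
To give a feel for why averaging several randomly transformed histograms can help, here is a minimal one-dimensional sketch. It is not the paper's algorithm: the abstract does not specify the transform, so the random stretch and shift used here, along with every function name and parameter, are assumptions made for illustration only.

```python
import math
import random


def fit_hte_1d(samples, n_transforms=20, seed=0, min_scale=2.0, max_scale=6.0):
    """Fit an ensemble of histograms, each on a randomly stretched and shifted axis.

    Assumption: a transform is x -> scale * x + shift, and the histogram
    cells are the unit intervals of the transformed axis.
    """
    rng = random.Random(seed)
    estimators = []
    for _ in range(n_transforms):
        scale = rng.uniform(min_scale, max_scale)
        shift = rng.random()
        counts = {}
        for x in samples:
            cell = math.floor(scale * x + shift)
            counts[cell] = counts.get(cell, 0) + 1
        estimators.append((scale, shift, counts))
    return estimators, len(samples)


def hte_density(model, x):
    """Average the single-histogram density estimates at point x."""
    estimators, n = model
    total = 0.0
    for scale, shift, counts in estimators:
        cell = math.floor(scale * x + shift)
        # A cell has width 1 / scale in the original space,
        # so density = count / (n * width) = count * scale / n.
        total += counts.get(cell, 0) * scale / n
    return total / len(estimators)


rng = random.Random(1)
samples = [rng.random() for _ in range(2000)]  # uniform on [0, 1)
model = fit_hte_1d(samples)
print(hte_density(model, 0.5))  # should be close to the true density of 1
```

Each single histogram is piecewise constant with jumps at its cell boundaries; because the boundaries differ between transforms, the averaged estimate is smoother than any single member.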

Unsupervised Graph-based Rank Aggregation for Improved Retrieval

Title Unsupervised Graph-based Rank Aggregation for Improved Retrieval
Authors Icaro Cavalcante Dourado, Daniel Carlos Guimarães Pedronette, Ricardo da Silva Torres
Abstract This paper presents a robust and comprehensive graph-based rank aggregation approach, used to combine results of isolated ranker models in retrieval tasks. The method follows an unsupervised scheme, which is independent of how the isolated ranks are formulated. Our approach is able to combine arbitrary models, defined in terms of different ranking criteria, such as those based on textual, image or hybrid content representations. We reformulate the ad-hoc retrieval problem as a document retrieval based on fusion graphs, which we propose as a new unified representation model capable of merging multiple ranks and expressing inter-relationships of retrieval results automatically. By doing so, we claim that the retrieval system can benefit from learning the manifold structure of datasets, thus leading to more effective results. Another contribution is that our graph-based aggregation formulation, unlike existing approaches, allows for encapsulating contextual information encoded from multiple ranks, which can be directly used for ranking, without further computations and post-processing steps over the graphs. Based on the graphs, a novel similarity retrieval score is formulated using an efficient computation of minimum common subgraphs. Finally, another benefit over existing approaches is the absence of hyperparameters. A comprehensive experimental evaluation was conducted considering diverse well-known public datasets, composed of textual, image, and multimodal documents. Performed experiments demonstrate that our method reaches top performance, yielding better effectiveness scores than state-of-the-art baseline methods and promoting large gains over the rankers being fused, thus demonstrating the successful capability of the proposal in representing queries based on a unified graph-based model of rank fusions.
Tasks
Published 2019-01-17
URL http://arxiv.org/abs/1901.05743v2
PDF http://arxiv.org/pdf/1901.05743v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-graph-based-rank-aggregation-for
Repo
Framework

Characterizing Scalability of Sparse Matrix-Vector Multiplications on Phytium FT-2000+ Many-cores

Title Characterizing Scalability of Sparse Matrix-Vector Multiplications on Phytium FT-2000+ Many-cores
Authors Donglin Chen, Jianbin Fang, Chuanfu Xu, Shizhao Chen, Zheng Wang
Abstract Understanding the scalability of parallel programs is crucial for software optimization and hardware architecture design. As HPC hardware is moving towards many-core design, it becomes increasingly difficult for a parallel program to make effective use of all available processor cores. This makes scalability analysis increasingly important. This paper presents a quantitative study for characterizing the scalability of sparse matrix-vector multiplications (SpMV) on Phytium FT-2000+, an ARM-based many-core architecture for HPC. We choose to study SpMV as it is a common operation in scientific and HPC applications. Due to the newness of ARM-based many-core architectures, there is little work on understanding the SpMV scalability on such hardware design. To close the gap, we carry out a large-scale empirical evaluation involving over 1,000 representative SpMV datasets. We show that, while many computation-intensive SpMV applications contain extensive parallelism, achieving a linear speedup is non-trivial on Phytium FT-2000+. To better understand what software and hardware parameters are most important for determining the scalability of a given SpMV kernel, we develop a performance analytical model based on the regression tree. We show that our model is highly effective in characterizing SpMV scalability, offering useful insights to help application developers better optimize SpMV on an emerging HPC architecture.
Tasks
Published 2019-11-20
URL https://arxiv.org/abs/1911.08779v1
PDF https://arxiv.org/pdf/1911.08779v1.pdf
PWC https://paperswithcode.com/paper/characterizing-scalability-of-sparse-matrix
Repo
Framework
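
For readers unfamiliar with the kernel studied above, a sparse matrix-vector multiplication can be sketched as follows. The compressed sparse row (CSR) layout is an assumption made here for illustration; the abstract does not say which storage formats the study covers, and the function name is hypothetical.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]:row_ptr[i + 1] delimits the nonzeros of row i,
    col_idx holds their column indices and values their entries.
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
print(spmv_csr(row_ptr, col_idx, values, [1.0, 2.0, 3.0]))  # [7.0, 6.0, 19.0]
```

The rows are independent of one another, which is where the parallelism comes from; the indirect reads of `x[col_idx[k]]` and the uneven number of nonzeros per row are typical reasons why speedup can fall short of linear.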

Affect-Driven Dialog Generation

Title Affect-Driven Dialog Generation
Authors Pierre Colombo, Wojciech Witon, Ashutosh Modi, James Kennedy, Mubbasir Kapadia
Abstract The majority of current systems for end-to-end dialog generation focus on response quality without an explicit control over the affective content of the responses. In this paper, we present an affect-driven dialog system, which generates emotional responses in a controlled manner using a continuous representation of emotions. The system achieves this by modeling emotions at a word and sequence level using: (1) a vector representation of the desired emotion, (2) an affect regularizer, which penalizes neutral words, and (3) an affect sampling method, which forces the neural network to generate diverse words that are emotionally relevant. During inference, we use a reranking procedure that aims to extract the most emotionally relevant responses using a human-in-the-loop optimization process. We study the performance of our system in terms of both quantitative (BLEU score and response diversity), and qualitative (emotional appropriateness) measures.
Tasks
Published 2019-04-04
URL http://arxiv.org/abs/1904.02793v1
PDF http://arxiv.org/pdf/1904.02793v1.pdf
PWC https://paperswithcode.com/paper/affect-driven-dialog-generation
Repo
Framework

Smart Hypothesis Generation for Efficient and Robust Room Layout Estimation

Title Smart Hypothesis Generation for Efficient and Robust Room Layout Estimation
Authors Martin Hirzer, Peter M. Roth, Vincent Lepetit
Abstract We propose a novel method to efficiently estimate the spatial layout of a room from a single monocular RGB image. As existing approaches based on low-level feature extraction, followed by a vanishing point estimation are very slow and often unreliable in realistic scenarios, we build on semantic segmentation of the input image. To obtain better segmentations, we introduce a robust, accurate and very efficient hypothesize-and-test scheme. The key idea is to use three segmentation hypotheses, each based on a different number of visible walls. For each hypothesis, we predict the image locations of the room corners and select the hypothesis for which the layout estimated from the room corners is consistent with the segmentation. We demonstrate the efficiency and robustness of our method on three challenging benchmark datasets, where we significantly outperform the state-of-the-art.
Tasks Room Layout Estimation, Semantic Segmentation
Published 2019-10-27
URL https://arxiv.org/abs/1910.12257v1
PDF https://arxiv.org/pdf/1910.12257v1.pdf
PWC https://paperswithcode.com/paper/smart-hypothesis-generation-for-efficient-and
Repo
Framework

On Training Robust PDF Malware Classifiers

Title On Training Robust PDF Malware Classifiers
Authors Yizheng Chen, Shiqi Wang, Dongdong She, Suman Jana
Abstract Although state-of-the-art PDF malware classifiers can be trained with almost perfect test accuracy (99%) and extremely low false positive rate (under 0.1%), it has been shown that even a simple adversary can evade them. A practically useful malware classifier must be robust against evasion attacks. However, achieving such robustness is an extremely challenging task. In this paper, we take the first steps towards training robust PDF malware classifiers with verifiable robustness properties. For instance, a robustness property can enforce that no matter how many pages from benign documents are inserted into a PDF malware, the classifier must still classify it as malicious. We demonstrate how the worst-case behavior of a malware classifier with respect to specific robustness properties can be formally verified. Furthermore, we find that training classifiers that satisfy formally verified robustness properties can increase the evasion cost of unbounded (i.e., not bounded by the robustness properties) attackers by eliminating simple evasion attacks. Specifically, we propose a new distance metric that operates on the PDF tree structure and specify two classes of robustness properties including subtree insertions and deletions. We utilize a state-of-the-art verifiably robust training method to build robust PDF malware classifiers. Our results show that we can achieve 92.27% average verified robust accuracy over three properties, while maintaining 99.74% accuracy and 0.56% false positive rate. With simple robustness properties, our robust model maintains 7% higher robust accuracy than all the baseline models against unrestricted whitebox attacks. Moreover, the state-of-the-art and new adaptive evolutionary attackers need up to 10 times larger $L_0$ feature distance and 21 times more PDF basic mutations (e.g., inserting and deleting objects) to evade our robust model than the baselines.
Tasks
Published 2019-04-06
URL https://arxiv.org/abs/1904.03542v2
PDF https://arxiv.org/pdf/1904.03542v2.pdf
PWC https://paperswithcode.com/paper/on-training-robust-pdf-malware-classifiers
Repo
Framework

DeepFPC: Deep Unfolding of a Fixed-Point Continuation Algorithm for Sparse Signal Recovery from Quantized Measurements

Title DeepFPC: Deep Unfolding of a Fixed-Point Continuation Algorithm for Sparse Signal Recovery from Quantized Measurements
Authors Peng Xiao, Bin Liao, Nikos Deligiannis
Abstract We present DeepFPC, a novel deep neural network designed by unfolding the iterations of the fixed-point continuation algorithm with one-sided l1-norm (FPC-l1), which has been proposed for solving the 1-bit compressed sensing problem. The network architecture resembles that of deep residual learning and incorporates prior knowledge about the signal structure (i.e., sparsity), thereby offering interpretability by design. Once DeepFPC is properly trained, a sparse signal can be recovered fast and accurately from quantized measurements. The proposed model is evaluated in the task of direction-of-arrival (DOA) estimation and is shown to outperform state-of-the-art algorithms, namely, the iterative FPC-l1 algorithm and the 1-bit MUSIC method.
Tasks
Published 2019-12-02
URL https://arxiv.org/abs/1912.00838v3
PDF https://arxiv.org/pdf/1912.00838v3.pdf
PWC https://paperswithcode.com/paper/deepfpc-deep-unfolding-of-a-fixed-point
Repo
Framework
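
The idea of unfolding an iterative solver into network layers can be sketched generically. The code below is not DeepFPC itself: the abstract gives no update equations, so the particular step used here (a subgradient step on a one-sided l1 penalty, then soft-thresholding, then normalization) and all names are assumptions. In a trained network the per-layer step sizes and thresholds would be learned; here they are fixed by hand.

```python
import math


def soft_threshold(v, t):
    """Shrink every entry towards zero by t (promotes sparsity)."""
    return [math.copysign(max(abs(a) - t, 0.0), a) for a in v]


def normalize(v):
    """Scale v to unit Euclidean norm; 1-bit measurements lose the amplitude."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm > 0 else list(v)


def one_sided_l1_subgradient(A, y, x):
    """Subgradient of sum_i max(0, -y_i * <a_i, x>) with respect to x."""
    grad = [0.0] * len(x)
    for row, sign in zip(A, y):
        if sign * sum(a * b for a, b in zip(row, x)) < 0:  # sign mismatch
            for j, a in enumerate(row):
                grad[j] -= sign * a
    return grad


def unfolded_recovery(A, y, x0, step_sizes, thresholds):
    """Run one 'layer' per (step size, threshold) pair."""
    x = normalize(x0)
    for step, thr in zip(step_sizes, thresholds):
        g = one_sided_l1_subgradient(A, y, x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
        x = normalize(soft_threshold(x, thr))
    return x


A = [[1.0, 0.0], [0.0, 1.0]]
y = [1, 1]  # signs of the measurements A @ x_true
print(unfolded_recovery(A, y, [-1.0, 1.0], step_sizes=[1.0], thresholds=[0.0]))
```

Because every layer has the same fixed structure as one solver iteration, the number of layers plays the role of the iteration count, and the sparsity prior enters through the thresholding step.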

Semi-Supervised Image-to-Image Translation

Title Semi-Supervised Image-to-Image Translation
Authors Manan Oza, Himanshu Vaghela, Sudhir Bagul
Abstract Image-to-image translation is a long-established and difficult problem in computer vision. In this paper we propose an adversarial-based model for image-to-image translation. The regular deep neural-network based methods perform the task of image-to-image translation by comparing gram matrices and using image segmentation, which requires human intervention. Our generative adversarial network based model works on a conditional probability approach. This approach makes the image translation independent of any local, global and content or style features. In our approach we use a bidirectional reconstruction model appended with the affine transform factor that helps in conserving the content and photorealism as compared to other models. The advantage of using such an approach is that the image-to-image translation is semi-supervised, independent of image segmentation and inherits the properties of generative adversarial networks, tending to produce realistic images. This method has proven to produce better results than Multimodal Unsupervised Image-to-image translation.
Tasks Image-to-Image Translation, Multimodal Unsupervised Image-To-Image Translation, Semantic Segmentation, Unsupervised Image-To-Image Translation
Published 2019-01-24
URL http://arxiv.org/abs/1901.08212v1
PDF http://arxiv.org/pdf/1901.08212v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-image-to-image-translation
Repo
Framework

TKD: Temporal Knowledge Distillation for Active Perception

Title TKD: Temporal Knowledge Distillation for Active Perception
Authors Mohammad Farhadi, Yezhou Yang
Abstract Deep neural network based methods have been proven to achieve outstanding performance on object detection and classification tasks. Despite significant performance improvement, due to the deep structures, they still require prohibitive runtime to process images and maintain the highest possible performance for real-time applications. Observing the phenomenon that the human vision system (HVS) relies heavily on the temporal dependencies among frames from the visual input to conduct recognition efficiently, we propose a novel framework dubbed TKD: temporal knowledge distillation. This framework distills the temporal knowledge from a heavy neural network based model over selected video frames (the perception of the moments) to a light-weight model. To enable the distillation, we put forward two novel procedures: 1) a Long Short-Term Memory (LSTM) based key frame selection method; and 2) a novel teacher-bounded loss design. To validate, we conduct comprehensive empirical evaluations using different object detection methods over multiple datasets including Youtube-Objects and Hollywood scene dataset. Our results show consistent improvement in accuracy-speed trade-offs for object detection over the frames of the dynamic scene, compared to other modern object recognition methods.
Tasks Object Detection, Object Recognition
Published 2019-03-04
URL https://arxiv.org/abs/1903.01522v2
PDF https://arxiv.org/pdf/1903.01522v2.pdf
PWC https://paperswithcode.com/paper/tkd-temporal-knowledge-distillation-for
Repo
Framework

Compositional uncertainty in deep Gaussian processes

Title Compositional uncertainty in deep Gaussian processes
Authors Ivan Ustyuzhaninov, Ieva Kazlauskaite, Markus Kaiser, Erik Bodin, Neill D. F. Campbell, Carl Henrik Ek
Abstract Gaussian processes (GPs) are nonparametric priors over functions. Fitting a GP implies computing a posterior distribution of functions consistent with the observed data. Similarly, deep Gaussian processes (DGPs) should allow us to compute a posterior distribution of compositions of multiple functions giving rise to the observations. However, exact Bayesian inference is intractable for DGPs, motivating the use of various approximations. We show that the application of simplifying mean-field assumptions across the hierarchy leads to the layers of a DGP collapsing to near-deterministic transformations. We argue that such an inference scheme is suboptimal, not taking advantage of the potential of the model to discover the compositional structure in the data. To address this issue, we examine alternative variational inference schemes allowing for dependencies across different layers and discuss their advantages and limitations.
Tasks Bayesian Inference, Gaussian Processes
Published 2019-09-17
URL https://arxiv.org/abs/1909.07698v3
PDF https://arxiv.org/pdf/1909.07698v3.pdf
PWC https://paperswithcode.com/paper/compositional-uncertainty-in-deep-gaussian
Repo
Framework