Paper Group ANR 578
Stealing Hyperparameters in Machine Learning
Title | Stealing Hyperparameters in Machine Learning |
Authors | Binghui Wang, Neil Zhenqiang Gong |
Abstract | Hyperparameters are critical in machine learning, as different hyperparameters often result in models with significantly different performance. Hyperparameters may be deemed confidential because of their commercial value and the confidentiality of the proprietary algorithms that the learner uses to learn them. In this work, we propose attacks that steal the hyperparameters learned by a learner; we call these hyperparameter stealing attacks. Our attacks are applicable to a variety of popular machine learning algorithms such as ridge regression, logistic regression, support vector machines, and neural networks. We evaluate the effectiveness of our attacks both theoretically and empirically; for instance, we evaluate our attacks on Amazon Machine Learning. Our results demonstrate that our attacks can accurately steal hyperparameters. We also study countermeasures. Our results highlight the need for new defenses against hyperparameter stealing attacks for certain machine learning algorithms. |
Tasks | |
Published | 2018-02-14 |
URL | https://arxiv.org/abs/1802.05351v3 |
https://arxiv.org/pdf/1802.05351v3.pdf | |
PWC | https://paperswithcode.com/paper/stealing-hyperparameters-in-machine-learning |
Repo | |
Framework | |
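The attack's key observation is that the learned parameters satisfy the first-order optimality condition of the regularized objective, which leaves the regularization hyperparameter as the only unknown in a linear system. Below is a minimal sketch of that idea for ridge regression; the function name and toy data are our own illustration, not the authors' code.

```python
import numpy as np

def steal_ridge_lambda(X, y, w):
    # w is assumed to minimize ||X w - y||^2 + lam * ||w||^2, so the
    # first-order condition gives X^T (X w - y) + lam * w = 0. Treating
    # lam as the single unknown yields the overdetermined linear system
    # lam * w = b, solved in closed form by least squares.
    b = -X.T @ (X @ w - y)
    return float(w @ b) / float(w @ w)

# toy check: train ridge with a known lambda, then recover it
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)
true_lam = 0.5
w = np.linalg.solve(X.T @ X + true_lam * np.eye(10), X.T @ y)
print(steal_ridge_lambda(X, y, w))  # ~0.5 (exact up to numerics)
```

For exact ridge solutions the recovery is exact, since the normal equations give lambda * w = X^T y - X^T X w identically.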
Revisit Multinomial Logistic Regression in Deep Learning: Data Dependent Model Initialization for Image Recognition
Title | Revisit Multinomial Logistic Regression in Deep Learning: Data Dependent Model Initialization for Image Recognition |
Authors | Bowen Cheng, Rong Xiao, Yandong Guo, Yuxiao Hu, Jianfeng Wang, Lei Zhang |
Abstract | We study in this paper how to initialize the parameters of multinomial logistic regression (a fully connected layer followed by softmax and cross-entropy loss), which is widely used in deep neural network (DNN) models for classification problems. As logistic regression is widely known to have no closed-form solution, it is usually randomly initialized, leading to several deficiencies, especially in transfer learning, where all the layers except for the last task-specific layer are initialized using a pre-trained model. The deficiencies include slow convergence, the possibility of getting stuck in a local minimum, and the risk of over-fitting. To address those deficiencies, we first study the properties of logistic regression and propose a closed-form approximate solution named the regularized Gaussian classifier (RGC). We then adopt this approximate solution to initialize the task-specific linear layer and demonstrate superior performance over random initialization in terms of both accuracy and convergence speed on various tasks and datasets. For example, for image classification, our approach can reduce training time by a factor of 10 and achieve a 3.2% gain in accuracy for Flickr-style classification. For object detection, our approach can also be 10 times faster in training for the same accuracy, or 5% better in terms of mAP on VOC 2007 with slightly longer training. |
Tasks | Image Classification, Object Detection, Transfer Learning |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06131v1 |
http://arxiv.org/pdf/1809.06131v1.pdf | |
PWC | https://paperswithcode.com/paper/revisit-multinomial-logistic-regression-in |
Repo | |
Framework | |
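As a hedged illustration of a closed-form, data-dependent initializer in the RGC spirit: fit one Gaussian per class over pre-trained features with a shared, shrinkage-regularized covariance, which yields linear (softmax-compatible) weights and biases in closed form. The shrinkage scheme and parameter below are assumptions, not the paper's exact recipe.

```python
import numpy as np

def rgc_init(features, labels, num_classes, shrinkage=0.1):
    """Closed-form initializer for the last linear layer: per-class Gaussians
    with a shared, shrinkage-regularized covariance give LDA-style linear
    weights W_k = Sigma^-1 mu_k and biases -0.5 * mu_k^T Sigma^-1 mu_k.
    `shrinkage` is an assumed regularizer, not the paper's exact one."""
    d = features.shape[1]
    means = np.stack([features[labels == k].mean(axis=0)
                      for k in range(num_classes)])
    centered = features - means[labels]
    cov = centered.T @ centered / len(features)
    cov = (1 - shrinkage) * cov + shrinkage * (np.trace(cov) / d) * np.eye(d)
    prec = np.linalg.inv(cov)
    W = means @ prec                                  # (num_classes, d)
    b = -0.5 * np.einsum("kd,kd->k", W, means)
    return W, b  # use as the softmax layer's initial weights and biases
```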
Discrete-Continuous Mixtures in Probabilistic Programming: Generalized Semantics and Inference Algorithms
Title | Discrete-Continuous Mixtures in Probabilistic Programming: Generalized Semantics and Inference Algorithms |
Authors | Yi Wu, Siddharth Srivastava, Nicholas Hay, Simon Du, Stuart Russell |
Abstract | Despite the recent successes of probabilistic programming languages (PPLs) in AI applications, PPLs offer only limited support for random variables whose distributions combine discrete and continuous elements. We develop the notion of measure-theoretic Bayesian networks (MTBNs) and use it to provide more general semantics for PPLs with arbitrarily many random variables defined over arbitrary measure spaces. We develop two new general sampling algorithms that are provably correct under the MTBN framework: the lexicographic likelihood weighting (LLW) for general MTBNs and the lexicographic particle filter (LPF), a specialized algorithm for state-space models. We further integrate MTBNs into a widely used PPL system, BLOG, and verify the effectiveness of the new inference algorithms through representative examples. |
Tasks | Probabilistic Programming |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02027v3 |
http://arxiv.org/pdf/1806.02027v3.pdf | |
PWC | https://paperswithcode.com/paper/discrete-continuous-mixtures-in-probabilistic |
Repo | |
Framework | |
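The difficulty with mixed discrete-continuous evidence is that point-mass likelihoods and density values are not directly comparable. A hedged sketch of the lexicographic idea on a toy model (our own example, not the BLOG implementation): sample weights live in tiers indexed by how many point-mass observations a sample matches, and higher tiers dominate lower ones.

```python
import math
import random

def llw_posterior(n_samples, x_obs):
    """Toy mixed model: Z ~ Bernoulli(0.5); X = 0 exactly (a point mass) if
    Z = 1, else X ~ Normal(0, 1). Estimate P(Z = 1 | X = x_obs) with
    lexicographic weights (number of point-mass matches, density product):
    samples in a higher point-mass tier dominate all lower tiers."""
    buckets = {}                       # tier -> (total weight, weight with Z=1)
    for _ in range(n_samples):
        z = random.random() < 0.5
        if z:                          # point-mass branch: X = 0 surely
            if x_obs != 0.0:
                continue               # zero likelihood, discard sample
            tier, w = 1, 1.0
        else:                          # continuous branch: Gaussian density
            tier, w = 0, math.exp(-x_obs ** 2 / 2) / math.sqrt(2 * math.pi)
        tot, pos = buckets.get(tier, (0.0, 0.0))
        buckets[tier] = (tot + w, pos + (w if z else 0.0))
    tot, pos = buckets[max(buckets)]   # highest tier lexicographically dominates
    return pos / tot

print(llw_posterior(100000, 0.0))  # -> 1.0: the point mass explains X = 0 exactly
print(llw_posterior(100000, 0.7))  # -> 0.0: only the continuous branch can emit 0.7
```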
Haze Density Estimation via Modeling of Scattering Coefficients of Iso-depth Regions
Title | Haze Density Estimation via Modeling of Scattering Coefficients of Iso-depth Regions |
Authors | Jie Chen, Cheen-Hau Tan, Lap-Pui Chau |
Abstract | Vision-based haze density estimation has practical implications for precautionary alarms and emergency reactions to disastrous hazy weather. In this paper, we introduce a haze density estimation framework based on modeling the scattering coefficients of iso-depth regions. A haze density metric, the Normalized Scattering Coefficient (NSC), is proposed to measure the current haze density level relative to two reference scales. Iso-depth regions are determined via superpixel segmentation. Efficient searching and matching of iso-depth units can be carried out for measurements from non-stationary cameras. A robust dark superpixel (SP) selection method is used to produce reliable predictions for most outdoor scenarios. |
Tasks | Density Estimation |
Published | 2018-08-19 |
URL | http://arxiv.org/abs/1808.06207v1 |
http://arxiv.org/pdf/1808.06207v1.pdf | |
PWC | https://paperswithcode.com/paper/haze-density-estimation-via-modeling-of |
Repo | |
Framework | |
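Under the standard Koschmieder model, transmission decays exponentially with depth, t = exp(-beta * d), so a scattering coefficient can be read off each iso-depth region. A hedged sketch of such a metric is below; the transmission/depth inputs and the two reference scales are assumptions about the interface, not taken from the paper.

```python
import numpy as np

def scattering_coefficient(transmission, depth):
    """Koschmieder model: t = exp(-beta * d)  =>  beta = -ln(t) / d.
    Per-region `transmission` (e.g., from a dehazing prior) and relative
    `depth` per iso-depth region are assumed inputs."""
    t = np.clip(transmission, 1e-6, 1.0)
    return -np.log(t) / np.maximum(depth, 1e-6)

def normalized_scattering_coefficient(beta, beta_clear, beta_dense):
    """Map the current scattering level onto [0, 1] between two reference
    scales (haze-free and dense-haze), one reading of the NSC idea."""
    return np.clip((beta - beta_clear) / (beta_dense - beta_clear), 0.0, 1.0)

# hypothetical per-region estimates
beta = scattering_coefficient(np.array([0.9, 0.6, 0.3]), np.array([1.0, 2.0, 3.0]))
print(normalized_scattering_coefficient(beta, beta_clear=0.01, beta_dense=1.0))
```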
Structured Point Cloud Data Analysis via Regularized Tensor Regression for Process Modeling and Optimization
Title | Structured Point Cloud Data Analysis via Regularized Tensor Regression for Process Modeling and Optimization |
Authors | Hao Yan, Kamran Paynabar, Massimo Pacella |
Abstract | Advanced 3D metrology technologies such as Coordinate Measuring Machine (CMM) and laser 3D scanners have facilitated the collection of massive point cloud data, beneficial for process monitoring, control and optimization. However, due to their high dimensionality and structure complexity, modeling and analysis of point clouds are still a challenge. In this paper, we utilize multilinear algebra techniques and propose a set of tensor regression approaches to model the variational patterns of point clouds and to link them to process variables. The performance of the proposed methods is evaluated through simulations and a real case study of turning process optimization. |
Tasks | |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10278v3 |
http://arxiv.org/pdf/1807.10278v3.pdf | |
PWC | https://paperswithcode.com/paper/structured-point-cloud-data-analysis-via |
Repo | |
Framework | |
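As a hedged sketch of the general shape of such a model (not the authors' estimator): regress a matricized point-cloud response on process variables with a ridge penalty, then truncate the coefficient matrix to low rank as a crude surrogate for the low-rank/smoothness regularization of tensor regression.

```python
import numpy as np

def low_rank_pc_regression(X, Y, rank, lam=1e-2):
    """X: (n, q) process variables; Y: (n, p) matricized point clouds.
    Ridge regression followed by SVD truncation of the coefficient matrix,
    a crude stand-in for regularized low-rank tensor regression."""
    B = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)  # (q, p)
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]        # rank-constrained B

# predict a new point cloud from process settings x_new: y_hat = x_new @ B_lr
```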
Privacy-Preserving Collaborative Prediction using Random Forests
Title | Privacy-Preserving Collaborative Prediction using Random Forests |
Authors | Irene Giacomelli, Somesh Jha, Ross Kleiman, David Page, Kyonghwan Yoon |
Abstract | We study the problem of privacy-preserving machine learning (PPML) for ensemble methods, focusing our effort on random forests. In collaborative analysis, PPML attempts to resolve the conflict between the need for data sharing and privacy. This is especially important in privacy-sensitive applications such as learning predictive models for clinical decision support from EHR data from different clinics, where each clinic is responsible for its patients' privacy. We propose a new approach for ensemble methods: each entity learns a model from its own data, and when a client requests a prediction for a new private instance, the answers from all the locally trained models are used to compute the prediction in such a way that no extra information is revealed. We implement this approach for random forests and demonstrate its high efficiency and potential accuracy benefit via experiments on real-world datasets, including actual EHR data. |
Tasks | |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08695v1 |
http://arxiv.org/pdf/1811.08695v1.pdf | |
PWC | https://paperswithcode.com/paper/privacy-preserving-collaborative-prediction |
Repo | |
Framework | |
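One standard way to combine the locally trained models' answers "in such a way that no extra information is revealed" is additive secret sharing of the vote vectors, so that only the aggregate is ever reconstructed. The sketch below illustrates that primitive with hypothetical vote counts; it is not the paper's exact protocol.

```python
import secrets

MOD = 2 ** 32

def additive_shares(votes, n):
    # split a per-class vote vector into n random additive shares mod MOD;
    # any n-1 shares look uniformly random and reveal nothing about votes
    shares = [[secrets.randbelow(MOD) for _ in votes] for _ in range(n - 1)]
    last = [(v - sum(s[i] for s in shares)) % MOD for i, v in enumerate(votes)]
    return shares + [last]

# each clinic secret-shares its forest's class-vote vector between two
# non-colluding servers; the servers only ever see random-looking shares,
# and the client reconstructs just the aggregate vote
clinic_votes = [[3, 7], [5, 5], [9, 1]]          # hypothetical 2-class vote counts
server_totals = [[0, 0], [0, 0]]
for votes in clinic_votes:
    for server, share in zip(server_totals, additive_shares(votes, 2)):
        for i, x in enumerate(share):
            server[i] = (server[i] + x) % MOD
aggregate = [(a + b) % MOD for a, b in zip(*server_totals)]
print(aggregate)  # [17, 13]: argmax gives the ensemble prediction
```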
Road Detection Technique Using Filters with Application to Autonomous Driving System
Title | Road Detection Technique Using Filters with Application to Autonomous Driving System |
Authors | Y. O. Agunbiade, J. O. Dehinbo, T. Zuva, A. K. Akanbi |
Abstract | Autonomous driving systems are widely used in industry and in daily life; they assist in production but are mainly used for exploration in dangerous or unfamiliar locations. Navigation therefore plays a significant role in successful exploration, and road detection is essential for autonomous robots to navigate reliably. Various camera-based techniques have been proposed by numerous scholars with promising results, but they remain vulnerable to environmental noise such as rain, snow, light intensity, and shadow. To address these problems, this paper proposes enhancing a road detection system with filtering algorithms to overcome these limitations. The Normalized Differences Index (NDI) and morphological operations are used to address the effect of shadow; guidance and re-guidance image filtering algorithms are used to address the effect of rain and/or snow; and dark channel image and specular-to-diffuse filters are used to address light intensity effects. The road detection system with filtering algorithms was evaluated qualitatively and quantitatively using the following evaluation schemes: False Negative Rate (FNR) and False Positive Rate (FPR). Comparing the road detection system with and without the filtering algorithms shows that the filtering suppresses environmental noise, since the system with filtering achieves better road/non-road classification. This improvement further benefits path planning and region classification for autonomous driving systems. |
Tasks | Autonomous Driving |
Published | 2018-09-16 |
URL | http://arxiv.org/abs/1809.05878v1 |
http://arxiv.org/pdf/1809.05878v1.pdf | |
PWC | https://paperswithcode.com/paper/road-detection-technique-using-filters-with |
Repo | |
Framework | |
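As a hedged illustration of the shadow-handling stage, the sketch below computes a normalized difference index over two color channels and cleans the resulting mask with a morphological opening. The specific channel pair and threshold are assumptions; the paper's NDI formulation may differ.

```python
import numpy as np
from scipy import ndimage

def ndi_shadow_mask(rgb, thresh=0.1):
    """Normalized difference index over two channels, followed by a
    morphological opening to remove small spurious detections.
    The (G, B) channel pair and the threshold are illustrative assumptions."""
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    ndi = (g - b) / np.maximum(g + b, 1e-6)
    mask = ndi > thresh
    return ndimage.binary_opening(mask, structure=np.ones((5, 5), bool))
```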
Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling
Title | Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling |
Authors | Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Sen Wang, Chengqi Zhang |
Abstract | Many natural language processing tasks rely solely on sparse dependencies between a few tokens in a sentence. Soft attention mechanisms show promising performance in modeling local/global dependencies via soft probabilities between every pair of tokens, but they are neither effective nor efficient when applied to long sentences. By contrast, hard attention mechanisms directly select a subset of tokens but are difficult and inefficient to train due to their combinatorial nature. In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", so that the two benefit each other. In ReSA, a hard attention trims a sequence for a soft self-attention to process, while the soft attention feeds reward signals back to facilitate the training of the hard one. For this purpose, we develop a novel hard attention called "reinforced sequence sampling (RSS)", which selects tokens in parallel and is trained via policy gradient. Using two RSS modules, ReSA efficiently extracts the sparse dependencies between each pair of selected tokens. We finally propose an RNN/CNN-free sentence-encoding model, "reinforced self-attention network (ReSAN)", based solely on ReSA. It achieves state-of-the-art performance on both the Stanford Natural Language Inference (SNLI) and Sentences Involving Compositional Knowledge (SICK) datasets. |
Tasks | Natural Language Inference |
Published | 2018-01-31 |
URL | http://arxiv.org/abs/1801.10296v2 |
http://arxiv.org/pdf/1801.10296v2.pdf | |
PWC | https://paperswithcode.com/paper/reinforced-self-attention-network-a-hybrid-of |
Repo | |
Framework | |
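A minimal sketch of an RSS-style module: a per-token score gives a Bernoulli keep/drop probability, all tokens are sampled in parallel, and the summed log-probability supports a REINFORCE update. Module and variable names are our own; the reward wiring from the soft attention is omitted.

```python
import torch

class RSS(torch.nn.Module):
    """Per-token Bernoulli keep/drop decisions, sampled in parallel and
    trainable with REINFORCE (names are ours, not the authors' code)."""

    def __init__(self, d_model):
        super().__init__()
        self.scorer = torch.nn.Linear(d_model, 1)

    def forward(self, x):                        # x: (batch, seq, d_model)
        probs = torch.sigmoid(self.scorer(x)).squeeze(-1)
        keep = torch.bernoulli(probs)            # hard, non-differentiable mask
        log_prob = (keep * probs.clamp_min(1e-8).log()
                    + (1 - keep) * (1 - probs).clamp_min(1e-8).log()).sum(-1)
        return keep, log_prob                    # (batch, seq), (batch,)
```

Ascending `log_prob * reward`, where the reward comes from the downstream soft self-attention's task performance, is the policy-gradient step that lets the soft module train the hard one.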
Discovering and Generating Hard Examples for Training a Red Tide Detector
Title | Discovering and Generating Hard Examples for Training a Red Tide Detector |
Authors | Hyungtae Lee, Heesung Kwon, Wonkook Kim |
Abstract | Satellite imagery is increasingly used for the accurate detection of natural phenomena, such as red tide, that adversely affect wildlife and humans. However, red tide detection in satellite images remains a very hard task due to the unpredictable nature of red tide occurrence, the extreme sparsity of red tide samples, difficulties in accurate groundtruthing, etc. In this paper, we aim to tackle both the data sparsity and groundtruthing issues by primarily addressing two challenges: i) a significant lack of hard non-red-tide examples that can enhance detection performance and ii) extreme data imbalance between red tide and non-red-tide examples. In the proposed work, we devise a 9-layer fully convolutional network jointly optimized with two plug-in modules tailored to overcoming the two challenges: i) a hard negative example generator (HNG) to supplement the hard negative (non-red-tide) examples and ii) cascaded online hard example mining (cOHEM) to ease the data imbalance. Our proposed network jointly trained with HNG and cOHEM provides state-of-the-art red tide detection accuracy on GOCI satellite images. |
Tasks | |
Published | 2018-12-13 |
URL | http://arxiv.org/abs/1812.05447v2 |
http://arxiv.org/pdf/1812.05447v2.pdf | |
PWC | https://paperswithcode.com/paper/discovering-and-generating-hard-examples-for |
Repo | |
Framework | |
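The single-stage core of online hard example mining can be sketched as: keep every positive and only the highest-loss negatives at a fixed negative-to-positive ratio. The cascading across stages (the "c" in cOHEM) and the HNG generator are omitted; the ratio below is an assumption.

```python
import torch
import torch.nn.functional as F

def ohem_loss(logits, targets, neg_ratio=3):
    """Keep all positive examples and only the hardest negatives, at
    `neg_ratio` negatives per positive (single OHEM stage; the cascading
    and the HNG module are omitted)."""
    losses = F.cross_entropy(logits, targets, reduction="none")
    pos = targets == 1                       # assume label 1 = red tide
    n_keep = int(neg_ratio * max(int(pos.sum()), 1))
    neg_losses = losses[~pos]
    hard_neg, _ = neg_losses.topk(min(n_keep, neg_losses.numel()))
    return torch.cat([losses[pos], hard_neg]).mean()
```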
Learning a Probabilistic Model for Diffeomorphic Registration
Title | Learning a Probabilistic Model for Diffeomorphic Registration |
Authors | Julian Krebs, Hervé Delingette, Boris Mailhé, Nicholas Ayache, Tommaso Mansi |
Abstract | We propose to learn a low-dimensional probabilistic deformation model from data, which can be used for registration and the analysis of deformations. The latent variable model maps similar deformations close to each other in an encoding space. It enables comparing deformations, generating normal or pathological deformations for any new image, and transporting deformations from one image pair to any other image. Our unsupervised method is based on variational inference. In particular, we use a conditional variational autoencoder (CVAE) network and constrain transformations to be symmetric and diffeomorphic by applying a differentiable exponentiation layer with a symmetric loss function. We also present a formulation that includes spatial regularization such as diffusion-based filters. Additionally, our framework provides multi-scale velocity field estimations. We evaluated our method on 3-D intra-subject registration using 334 cardiac cine-MRIs. On this dataset, our method showed state-of-the-art performance, with a mean DICE score of 81.2% and a mean Hausdorff distance of 7.3 mm using 32 latent dimensions, compared to three state-of-the-art methods, while also producing more regular deformation fields. The average time per registration was 0.32 s. In addition, we visualize the learned latent space and show that the encoded deformations can be used to transport deformations and to cluster diseases, with a classification accuracy of 83% after applying a linear projection. |
Tasks | Deformable Medical Image Registration, Diffeomorphic Medical Image Registration, Medical Image Registration |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07460v2 |
http://arxiv.org/pdf/1812.07460v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-a-probabilistic-model-for |
Repo | |
Framework | |
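The differentiable exponentiation layer is typically realized by scaling and squaring a stationary velocity field: divide by 2^N, then self-compose N times. A 2-D sketch with PyTorch's grid_sample is below; it assumes velocities are expressed in normalized [-1, 1] coordinates, which is our convention rather than the paper's.

```python
import torch
import torch.nn.functional as F

def exp_velocity_field(v, steps=6):
    """Scaling and squaring for a stationary 2-D velocity field
    v: (batch, 2, H, W), channels (x, y), in normalized [-1, 1] units.
    exp(v) = (exp(v / 2^N))^(2^N): scale down, then self-compose N times."""
    b, _, h, w = v.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    identity = torch.stack([xs, ys]).unsqueeze(0).expand(b, -1, -1, -1)
    u = v / (2 ** steps)                               # small displacement
    for _ in range(steps):
        grid = (identity + u).permute(0, 2, 3, 1)      # (b, H, W, 2), xy order
        u = u + F.grid_sample(u, grid, align_corners=True)  # u <- u + u o (id + u)
    return u                                           # displacement of exp(v)
```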
Perceptual Video Super Resolution with Enhanced Temporal Consistency
Title | Perceptual Video Super Resolution with Enhanced Temporal Consistency |
Authors | Eduardo Pérez-Pellitero, Mehdi S. M. Sajjadi, Michael Hirsch, Bernhard Schölkopf |
Abstract | With the advent of perceptual loss functions, new possibilities in super-resolution have emerged, and we currently have models that successfully generate near-photorealistic high-resolution images from their low-resolution observations. Up to now, however, such approaches have been exclusively limited to single image super-resolution. The application of perceptual loss functions on video processing still entails several challenges, mostly related to the lack of temporal consistency of the generated images, i.e., flickering artifacts. In this work, we present a novel adversarial recurrent network for video upscaling that is able to produce realistic textures in a temporally consistent way. The proposed architecture naturally leverages information from previous frames due to its recurrent architecture, i.e., the input to the generator is composed of the low-resolution image and, additionally, the warped output of the network at the previous step. Together with a video discriminator, we also propose additional loss functions to further reinforce temporal consistency in the generated sequences. The experimental validation of our algorithm shows the effectiveness of our approach, which obtains images with high perceptual quality and improved temporal consistency. |
Tasks | Image Super-Resolution, Super-Resolution, Video Super-Resolution |
Published | 2018-07-20 |
URL | https://arxiv.org/abs/1807.07930v2 |
https://arxiv.org/pdf/1807.07930v2.pdf | |
PWC | https://paperswithcode.com/paper/photorealistic-video-super-resolution |
Repo | |
Framework | |
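The recurrent generator feeds the warped previous output back in, and temporal consistency can additionally be encouraged by penalizing the difference between the current output and the flow-warped previous one. A sketch of that warping and one such loss term is below; the flow source and loss weighting are assumptions.

```python
import torch
import torch.nn.functional as F

def flow_warp(img, flow):
    """Warp img (b, C, H, W) by an optical flow (b, 2, H, W) in pixel units."""
    b, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    coords = torch.stack([xs, ys]).float().unsqueeze(0) + flow  # absolute pixels
    grid = torch.stack([2 * coords[:, 0] / (w - 1) - 1,         # map to [-1, 1]
                        2 * coords[:, 1] / (h - 1) - 1], dim=-1)
    return F.grid_sample(img, grid, align_corners=True)

def temporal_consistency_loss(sr_t, sr_prev, flow):
    # penalize the current output's deviation from the flow-warped previous
    # output, discouraging flicker between consecutive frames
    return F.l1_loss(sr_t, flow_warp(sr_prev, flow))
```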
Deep Learning and Glaucoma Specialists: The Relative Importance of Optic Disc Features to Predict Glaucoma Referral in Fundus Photos
Title | Deep Learning and Glaucoma Specialists: The Relative Importance of Optic Disc Features to Predict Glaucoma Referral in Fundus Photos |
Authors | Sonia Phene, R. Carter Dunn, Naama Hammel, Yun Liu, Jonathan Krause, Naho Kitade, Mike Schaekermann, Rory Sayres, Derek J. Wu, Ashish Bora, Christopher Semturs, Anita Misra, Abigail E. Huang, Arielle Spitze, Felipe A. Medeiros, April Y. Maa, Monica Gandhi, Greg S. Corrado, Lily Peng, Dale R. Webster |
Abstract | Glaucoma is the leading cause of preventable, irreversible blindness world-wide. The disease can remain asymptomatic until severe, and an estimated 50%-90% of people with glaucoma remain undiagnosed. Glaucoma screening is recommended for early detection and treatment. A cost-effective tool to detect glaucoma could expand screening access to a much larger patient population, but such a tool is currently unavailable. We trained a deep learning algorithm using a retrospective dataset of 86,618 images, assessed for glaucomatous optic nerve head features and referable glaucomatous optic neuropathy (GON). The algorithm was validated using 3 datasets. For referable GON, the algorithm had an AUC of 0.945 (95% CI, 0.929-0.960) in dataset A (1205 images, 1 image/patient; 18.1% referable), images adjudicated by panels of Glaucoma Specialists (GSs); 0.855 (95% CI, 0.841-0.870) in dataset B (9642 images, 1 image/patient; 9.2% referable), images from Atlanta Veterans Affairs Eye Clinic diabetic teleretinal screening program; and 0.881 (95% CI, 0.838-0.918) in dataset C (346 images, 1 image/patient; 81.7% referable), images from Dr. Shroff’s Charity Eye Hospital’s glaucoma clinic. The algorithm showed significantly higher sensitivity than 7 of 10 graders not involved in determining the reference standard, including 2 of 3 GSs, and showed higher specificity than 3 graders, while remaining comparable to others. For both GSs and the algorithm, the most crucial features related to referable GON were: presence of vertical cup-to-disc ratio of 0.7 or more, neuroretinal rim notching, retinal nerve fiber layer defect, and bared circumlinear vessels. An algorithm trained on fundus images alone can detect referable GON with higher sensitivity than and comparable specificity to eye care providers. The algorithm maintained good performance on an independent dataset with diagnoses based on a full glaucoma workup. |
Tasks | |
Published | 2018-12-21 |
URL | https://arxiv.org/abs/1812.08911v2 |
https://arxiv.org/pdf/1812.08911v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-to-assess-glaucoma-risk-and |
Repo | |
Framework | |
DeepBillboard: Systematic Physical-World Testing of Autonomous Driving Systems
Title | DeepBillboard: Systematic Physical-World Testing of Autonomous Driving Systems |
Authors | Husheng Zhou, Wei Li, Yuankun Zhu, Yuqun Zhang, Bei Yu, Lingming Zhang, Cong Liu |
Abstract | Deep Neural Networks (DNNs) have been widely applied in many autonomous systems such as autonomous driving. Recently, DNN testing has been intensively studied to automatically generate adversarial examples, which inject small-magnitude perturbations into inputs to test DNNs under extreme situations. While existing testing techniques prove to be effective, they mostly focus on generating digital adversarial perturbations (particularly for autonomous driving), e.g., changing image pixels, which may never happen in the physical world. There is a critical missing piece in the literature on autonomous driving testing: understanding and exploiting both digital and physical adversarial perturbation generation for impacting steering decisions. In this paper, we present DeepBillboard, a systematic physical-world testing approach targeting a common and practical driving scenario: drive-by billboards. DeepBillboard is capable of generating a robust and resilient printable adversarial billboard that works under dynamically changing driving conditions, including viewing angle, distance, and lighting. The objective is to maximize the possibility, degree, and duration of the steering-angle errors of an autonomous vehicle driving by the generated adversarial billboard. We have extensively evaluated the efficacy and robustness of DeepBillboard through both digital and physical-world experiments. Results show that DeepBillboard is effective for various steering models and scenes. Furthermore, DeepBillboard is sufficiently robust and resilient for generating physical-world adversarial billboard tests for real-world driving under various weather conditions. To the best of our knowledge, this is the first study demonstrating the possibility of generating realistic and continuous physical-world tests for practical autonomous driving systems. |
Tasks | Autonomous Driving |
Published | 2018-12-27 |
URL | http://arxiv.org/abs/1812.10812v1 |
http://arxiv.org/pdf/1812.10812v1.pdf | |
PWC | https://paperswithcode.com/paper/deepbillboard-systematic-physical-world |
Repo | |
Framework | |
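The core optimization can be sketched as: one shared perturbation, pasted into the billboard region of every drive-by frame, updated by gradient ascent on the total steering deviation. The sketch below assumes equal-sized axis-aligned billboard boxes and a hypothetical `model` mapping image batches to steering angles; the real method's perspective mapping and printability constraints are omitted.

```python
import torch

def billboard_attack(model, frames, boxes, steps=200, lr=0.05):
    """One shared perturbation `delta` is pasted into each frame's billboard
    box and optimized to maximize total steering deviation. `model` maps an
    image batch (b, 3, H, W) in [0, 1] to steering angles; `boxes` holds
    (y0, x0) corners of equal-sized billboards (hypothetical interfaces)."""
    h, w = 32, 96                                    # assumed billboard size
    delta = torch.zeros(3, h, w, requires_grad=True)
    clean = model(frames).detach()                   # unattacked steering angles
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        adv = frames.clone()
        for i, (y0, x0) in enumerate(boxes):
            patch = frames[i, :, y0:y0 + h, x0:x0 + w] + delta
            adv[i, :, y0:y0 + h, x0:x0 + w] = patch.clamp(0, 1)
        loss = -(model(adv) - clean).abs().sum()     # ascend the steering error
        opt.zero_grad()
        loss.backward()
        opt.step()
    return delta.detach()
```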
Active Deep Q-learning with Demonstration
Title | Active Deep Q-learning with Demonstration |
Authors | Si-An Chen, Voot Tangkaratt, Hsuan-Tien Lin, Masashi Sugiyama |
Abstract | Recent research has shown that although Reinforcement Learning (RL) can benefit from expert demonstration, it usually takes considerable effort to obtain enough demonstrations, which prevents training decent RL agents with expert demonstrations in practice. In this work, we propose Active Reinforcement Learning with Demonstration (ARLD), a new framework that streamlines RL in terms of demonstration effort by allowing the RL agent to actively query for demonstrations during training. Under this framework, we propose Active Deep Q-Network, a novel query strategy that adapts to the dynamically changing distributions during RL training by estimating the uncertainty of recent states. The expert demonstration data within Active DQN are then utilized by optimizing a supervised max-margin loss in addition to the temporal-difference loss of usual DQN training. We propose two methods of estimating the uncertainty based on two state-of-the-art DQN models, namely the divergence of bootstrapped DQN and the variance of noisy DQN. The empirical results validate that both methods not only learn faster than other passive expert demonstration methods with the same amount of demonstration but also reach a super-expert level of performance across four different tasks. |
Tasks | Q-Learning |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02632v1 |
http://arxiv.org/pdf/1812.02632v1.pdf | |
PWC | https://paperswithcode.com/paper/active-deep-q-learning-with-demonstration |
Repo | |
Framework | |
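The query rule reduces to: estimate the uncertainty of the current state and ask the expert only when it exceeds a threshold calibrated on recent states. The sketch below uses disagreement among bootstrapped DQN heads as one simple proxy for the paper's divergence measure; the threshold calibration is omitted.

```python
import torch

def should_query_expert(q_heads, threshold):
    """q_heads: (n_heads, n_actions) Q-values from bootstrapped DQN heads
    for the current state. Query the expert when the heads' disagreement
    (a simple proxy for their divergence) exceeds the threshold."""
    greedy = q_heads.argmax(dim=1)                   # each head's action choice
    votes = torch.bincount(greedy, minlength=q_heads.shape[1])
    disagreement = 1.0 - votes.max().item() / q_heads.shape[0]
    return disagreement > threshold
```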
Speeding-up Object Detection Training for Robotics with FALKON
Title | Speeding-up Object Detection Training for Robotics with FALKON |
Authors | Elisa Maiettini, Giulia Pasquale, Lorenzo Rosasco, Lorenzo Natale |
Abstract | The latest deep learning methods for object detection provide remarkable performance but have limits when used in robotic applications. One of the most relevant issues is the long training time, which is due to the large size and imbalance of the associated training sets, characterized by few positive and a large number of negative examples (i.e., background). Existing approaches are based on end-to-end learning by back-propagation [22] or kernel methods trained with Hard Negatives Mining on top of deep features [8]. These solutions are effective but prohibitively slow for on-line applications. In this paper we propose a novel pipeline for object detection that overcomes this problem and provides comparable performance, with a 60x training speedup. Our pipeline combines (i) the Region Proposal Network and the deep feature extractor from [22], to efficiently select candidate RoIs and encode them into powerful representations, with (ii) the FALKON [23] algorithm, a novel kernel-based method that allows fast training on large-scale problems (millions of points). We address the size and imbalance of the training data by exploiting the stochastic subsampling intrinsic to the method and a novel, fast, bootstrapping approach. We assess the effectiveness of the approach on a standard Computer Vision dataset (PASCAL VOC 2007 [5]) and demonstrate its applicability to a real robotic scenario with the iCubWorld Transformations [18] dataset. |
Tasks | Object Detection |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08740v2 |
http://arxiv.org/pdf/1803.08740v2.pdf | |
PWC | https://paperswithcode.com/paper/speeding-up-object-detection-training-for |
Repo | |
Framework | |
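The classification core is a Nystrom-approximated kernel ridge regression; FALKON's contribution is solving it fast with preconditioning and conjugate gradients, which the direct-solve sketch below deliberately omits. The kernel choice and hyperparameters are assumptions.

```python
import numpy as np

def nystrom_krr(X, y, m=500, lam=1e-3, gamma=0.1, seed=0):
    """Nystrom kernel ridge regression on (X: (n, d) RoI features,
    y: (n,) +/-1 labels): solve (Knm^T Knm + lam * n * Kmm) a = Knm^T y
    over m sampled centers. FALKON solves the same system with
    preconditioned conjugate gradients; here we solve it directly."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=min(m, len(X)), replace=False)]

    def kernel(A, B):                       # Gaussian (RBF) kernel
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    Knm = kernel(X, centers)
    Kmm = kernel(centers, centers)
    alpha = np.linalg.solve(Knm.T @ Knm + lam * len(X) * Kmm, Knm.T @ y)
    return centers, alpha                   # score new x: kernel(x, centers) @ alpha
```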