January 28, 2020

3024 words 15 mins read

Paper Group ANR 1064

A Note on Bounding Regret of the C$^2$UCB Contextual Combinatorial Bandit. Approximate Sampling using an Accelerated Metropolis-Hastings based on Bayesian Optimization and Gaussian Processes. Distributionally Robust Optimization with Correlated Data from Vector Autoregressive Processes. RATQ: A Universal Fixed-Length Quantizer for Stochastic Optimi …

A Note on Bounding Regret of the C$^2$UCB Contextual Combinatorial Bandit

Title A Note on Bounding Regret of the C$^2$UCB Contextual Combinatorial Bandit
Authors Bastian Oetomo, Malinga Perera, Renata Borovica-Gajic, Benjamin I. P. Rubinstein
Abstract We revisit the proof by Qin et al. (2014) of bounded regret of the C$^2$UCB contextual combinatorial bandit. We demonstrate an error in the proof of volumetric expansion of the moment matrix, used in upper bounding a function of context vector norms. We prove a relaxed inequality that yields the originally-stated regret bound.
Tasks
Published 2019-02-20
URL http://arxiv.org/abs/1902.07500v1
PDF http://arxiv.org/pdf/1902.07500v1.pdf
PWC https://paperswithcode.com/paper/a-note-on-bounding-regret-of-the-c2ucb
Repo
Framework
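
For context, regret bounds of this type typically rest on an elliptical potential (volumetric expansion) argument relating summed context norms to the log-determinant growth of the moment matrix. A standard form of that inequality, stated here for reference only (it is not the relaxed inequality derived in the note), is:

$$\sum_{t=1}^{T} \lVert x_t \rVert_{V_{t-1}^{-1}}^{2} \;\le\; 2 \log \frac{\det V_T}{\det V_0}, \qquad V_t = V_0 + \sum_{s=1}^{t} x_s x_s^{\top},\quad V_0 = \lambda I,\ \lambda \ge 1,\ \lVert x_t \rVert_2 \le 1 .$$

C$^2$UCB applies a bound of this flavor to the selected context vectors; the note shows that the original volumetric step needs a correction, and that a relaxed inequality still recovers the stated regret bound.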

Approximate Sampling using an Accelerated Metropolis-Hastings based on Bayesian Optimization and Gaussian Processes

Title Approximate Sampling using an Accelerated Metropolis-Hastings based on Bayesian Optimization and Gaussian Processes
Authors Asif J. Chowdhury, Gabriel Terejanu
Abstract Markov Chain Monte Carlo (MCMC) methods have a drawback when working with a target distribution or likelihood function that is computationally expensive to evaluate, especially when working with big data. This paper focuses on the Metropolis-Hastings (MH) algorithm for unimodal distributions. An enhanced MH algorithm is proposed that requires fewer expensive function evaluations, has a shorter burn-in period, and uses a better proposal distribution. The main innovations include the use of Bayesian optimization to reach the high-probability region quickly, emulating the target distribution using Gaussian processes (GP), and using a Laplace approximation of the GP to build a proposal distribution that better captures the underlying correlation. The experiments show significant improvement over regular MH, and a statistical comparison between the results of the two algorithms is presented.
Tasks Gaussian Processes
Published 2019-10-21
URL https://arxiv.org/abs/1910.09347v1
PDF https://arxiv.org/pdf/1910.09347v1.pdf
PWC https://paperswithcode.com/paper/approximate-sampling-using-an-accelerated
Repo
Framework
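
A minimal sketch of one ingredient of the idea above: emulate an expensive log-target with a Gaussian process fitted to a handful of evaluations, then run Metropolis-Hastings against the cheap emulator. This is not the authors' algorithm (which also uses Bayesian optimization and a Laplace-approximation proposal); the `expensive_log_target` function and the design points are hypothetical stand-ins.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def expensive_log_target(x):
    # Hypothetical stand-in for a costly unnormalized log-density evaluation.
    return -0.5 * np.sum((x - 1.0) ** 2)

rng = np.random.default_rng(0)

# Fit a GP emulator of the log-target on a small set of design points.
X_design = rng.uniform(-4, 4, size=(30, 2))
y_design = np.array([expensive_log_target(x) for x in X_design])
gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(length_scale=1.0),
                              normalize_y=True).fit(X_design, y_design)

def emulated_log_target(x):
    return gp.predict(x.reshape(1, -1))[0]

# Random-walk Metropolis-Hastings driven by the cheap emulator.
x = np.zeros(2)
logp = emulated_log_target(x)
samples = []
for _ in range(5000):
    prop = x + 0.5 * rng.standard_normal(2)
    logp_prop = emulated_log_target(prop)
    if np.log(rng.uniform()) < logp_prop - logp:
        x, logp = prop, logp_prop
    samples.append(x.copy())
samples = np.array(samples)
print(samples.mean(axis=0))  # should land near the target mode at (1, 1)

The point of the construction is that the MH loop never calls the expensive function; only the 30 design evaluations touch it.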

Distributionally Robust Optimization with Correlated Data from Vector Autoregressive Processes

Title Distributionally Robust Optimization with Correlated Data from Vector Autoregressive Processes
Authors Xialiang Dou, Mihai Anitescu
Abstract We present a distributionally robust formulation of a stochastic optimization problem for non-i.i.d vector autoregressive data. We use the Wasserstein distance to define robustness in the space of distributions and we show, using duality theory, that the problem is equivalent to a finite convex-concave saddle point problem. The performance of the method is demonstrated on both synthetic and real data.
Tasks Stochastic Optimization
Published 2019-09-08
URL https://arxiv.org/abs/1909.03433v1
PDF https://arxiv.org/pdf/1909.03433v1.pdf
PWC https://paperswithcode.com/paper/distributionally-robust-optimization-with
Repo
Framework
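
For reference, the type-1 Wasserstein DRO strong-duality identity from the i.i.d. setting, which formulations of this kind build on; the paper's contribution is the extension to correlated, vector-autoregressive data, which this generic display does not capture:

$$\sup_{Q:\,W(Q,\hat{P}_N)\le\rho} \mathbb{E}_{\xi\sim Q}\big[\ell(x,\xi)\big] \;=\; \inf_{\lambda\ge 0}\ \Big\{ \lambda\rho \;+\; \frac{1}{N}\sum_{i=1}^{N} \sup_{\xi}\big[\ell(x,\xi) - \lambda \lVert \xi - \hat{\xi}_i \rVert \big] \Big\},$$

where $\hat{P}_N$ is the empirical distribution of the samples $\hat{\xi}_1,\dots,\hat{\xi}_N$, $W$ is the type-1 Wasserstein distance induced by the norm $\lVert\cdot\rVert$, and $\rho$ is the ambiguity radius. Minimizing the right-hand side over the decision $x$ gives a finite convex problem; under suitable structure this takes the convex-concave saddle point form mentioned in the abstract.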

RATQ: A Universal Fixed-Length Quantizer for Stochastic Optimization

Title RATQ: A Universal Fixed-Length Quantizer for Stochastic Optimization
Authors Prathamesh Mayekar, Himanshu Tyagi
Abstract We present Rotated Adaptive Tetra-iterated Quantizer (RATQ), a fixed-length quantizer for gradients in first-order stochastic optimization. RATQ is easy to implement and involves only a Hadamard transform computation and adaptive uniform quantization with appropriately chosen dynamic ranges. For noisy gradients with almost surely bounded Euclidean norms, we establish an information-theoretic lower bound for optimization accuracy using finite precision gradients and show that RATQ almost attains this lower bound. For mean square bounded noisy gradients, we use a gain-shape quantizer which separately quantizes the Euclidean norm and uses RATQ to quantize the normalized unit norm vector. We establish lower bounds for performance of any optimization procedure and shape quantizer, when used with a uniform gain quantizer. Finally, we propose an adaptive gain quantizer which, when used with RATQ as the shape quantizer, outperforms uniform gain quantization and is, in fact, close to optimal. As a by-product, we show that our fixed-length quantizer RATQ has almost the same performance as the optimal variable-length quantizers for distributed mean estimation. Also, we obtain an efficient quantizer for Gaussian vectors which attains a rate very close to the Gaussian rate-distortion function and is, in fact, universal for subgaussian input vectors.
Tasks Quantization, Stochastic Optimization
Published 2019-08-22
URL https://arxiv.org/abs/1908.08200v3
PDF https://arxiv.org/pdf/1908.08200v3.pdf
PWC https://paperswithcode.com/paper/finite-precision-stochastic-optimisation
Repo
Framework
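
A minimal sketch of the rotate-then-uniformly-quantize skeleton underlying quantizers of this family: a randomized Hadamard rotation spreads the vector's energy across coordinates, after which a fixed-length uniform quantizer is applied and the rotation is inverted. The tetra-iterated (adaptive) dynamic-range schedule that makes RATQ nearly optimal is omitted; the fixed dynamic range and level count below are hypothetical choices for illustration.

import numpy as np

def hadamard_transform(v):
    # Fast Walsh-Hadamard transform (orthonormal); len(v) must be a power of two.
    v = np.asarray(v, dtype=float).copy()
    n, h = len(v), 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    return v / np.sqrt(n)

def quantize(v, num_levels=16, dynamic_range=1.0, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    n = len(v)
    # Randomized rotation: random sign flips followed by a Hadamard transform.
    signs = rng.choice([-1.0, 1.0], size=n)
    rotated = hadamard_transform(signs * v)
    # Fixed-length uniform quantization: snap each coordinate to the nearest level.
    levels = np.linspace(-dynamic_range, dynamic_range, num_levels)
    dequant = levels[np.abs(rotated[:, None] - levels[None, :]).argmin(axis=1)]
    # Invert the rotation to recover an estimate of the original vector.
    return signs * hadamard_transform(dequant)

rng = np.random.default_rng(1)
g = rng.standard_normal(8) / np.sqrt(8)   # a unit-norm-ish "gradient"
print(g)
print(quantize(g, rng=rng))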

Bayesian high-dimensional linear regression with generic spike-and-slab priors

Title Bayesian high-dimensional linear regression with generic spike-and-slab priors
Authors Bai Jiang, Qiang Sun
Abstract Spike-and-slab priors are popular Bayesian solutions for high-dimensional linear regression problems. Previous theoretical studies on spike-and-slab methods focus on specific prior formulations and use prior-dependent conditions and analyses, and thus cannot be generalized directly. In this paper, we propose a class of generic spike-and-slab priors and develop a unified framework to rigorously assess their theoretical properties. Technically, we provide general conditions under which generic spike-and-slab priors can achieve the nearly-optimal posterior contraction rate and model selection consistency. Our results include those of Narisetty and He (2014) and Castillo et al. (2015) as special cases.
Tasks Model Selection
Published 2019-12-19
URL https://arxiv.org/abs/1912.08993v2
PDF https://arxiv.org/pdf/1912.08993v2.pdf
PWC https://paperswithcode.com/paper/bayesian-high-dimensional-linear-regression
Repo
Framework
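
For concreteness, the canonical spike-and-slab template that generic priors of this kind generalize places an independent two-component mixture on each regression coefficient:

$$\beta_j \;\overset{\text{i.i.d.}}{\sim}\; (1-q)\,\pi_{\mathrm{spike}}(\beta_j) \;+\; q\,\pi_{\mathrm{slab}}(\beta_j), \qquad j = 1,\dots,p,$$

where $q \in (0,1)$ is the prior inclusion probability, the spike $\pi_{\mathrm{spike}}$ is a point mass at zero or a density tightly concentrated around zero, and the slab $\pi_{\mathrm{slab}}$ is a comparatively diffuse density. The general conditions in the paper constrain how these components (and $q$) may be chosen while retaining the contraction and selection guarantees.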

Generative Adversarial Networks for Black-Box API Attacks with Limited Training Data

Title Generative Adversarial Networks for Black-Box API Attacks with Limited Training Data
Authors Yi Shi, Yalin E. Sagduyu, Kemal Davaslioglu, Jason H. Li
Abstract As online systems based on machine learning are offered to public or paid subscribers via application programming interfaces (APIs), they become vulnerable to frequent exploits and attacks. This paper studies adversarial machine learning in the practical case when there are rate limitations on API calls. The adversary launches an exploratory (inference) attack by querying the API of an online machine learning system (in particular, a classifier) with input data samples, collecting returned labels to build up the training data, and training an adversarial classifier that is functionally equivalent and statistically close to the target classifier. The exploratory attack with limited training data is shown to fail to reliably infer the target classifier of a real text classifier API that is available online to the public. In return, a generative adversarial network (GAN) based on deep learning is built to generate synthetic training data from a limited number of real training data samples, thereby extending the training data and improving the performance of the inferred classifier. The exploratory attack provides the basis to launch the causative attack (that aims to poison the training process) and evasion attack (that aims to fool the classifier into making wrong decisions) by selecting training and test data samples, respectively, based on the confidence scores obtained from the inferred classifier. These stealth attacks with small footprint (using a small number of API calls) make adversarial machine learning practical under the realistic case with limited training data available to the adversary.
Tasks Inference Attack
Published 2019-01-25
URL http://arxiv.org/abs/1901.09113v1
PDF http://arxiv.org/pdf/1901.09113v1.pdf
PWC https://paperswithcode.com/paper/generative-adversarial-networks-for-black-box
Repo
Framework
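
A minimal sketch of the exploratory (model-extraction) step only: query a rate-limited black-box classifier, collect the returned labels, and fit a surrogate that mimics it. The GAN-based augmentation of the limited query set is not shown, and the target model behind `query_target_api` is a synthetic stand-in for a real online API.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic black-box target; in practice this would be a rate-limited web API.
rng = np.random.default_rng(0)
X_all = rng.standard_normal((2000, 10))
y_all = (X_all[:, 0] + X_all[:, 1] ** 2 > 1).astype(int)
_target = RandomForestClassifier(random_state=0).fit(X_all, y_all)

def query_target_api(X):
    return _target.predict(X)

# Adversary: limited query budget -> small labeled set -> surrogate classifier.
budget = 100
X_query = rng.standard_normal((budget, 10))
y_query = query_target_api(X_query)
surrogate = LogisticRegression(max_iter=1000).fit(X_query, y_query)

# Measure functional agreement between surrogate and target on fresh inputs.
X_test = rng.standard_normal((1000, 10))
agreement = (surrogate.predict(X_test) == query_target_api(X_test)).mean()
print(f"surrogate/target agreement: {agreement:.2f}")

With only 100 queries the agreement is limited, which is exactly the gap the paper's GAN-generated synthetic training data is meant to close.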

Model Adaption Object Detection System for Robot

Title Model Adaption Object Detection System for Robot
Authors Jingwen Fu, Licheng Zong, Yinbing Li, Ke Li, Bingqian Yang, Xibei Liu
Abstract Object detection for robot guidance is a crucial task for autonomous robots and has attracted extensive attention from researchers. However, the changing viewpoint caused by robot movement and the limited available data hinder research in this area. To address these issues, we propose a new vision system for robots, the model adaptation object detection system. Instead of using a single network to solve the problem, we use different object detection neural networks to guide the robot in different situations, with the help of a meta neural network that allocates among the object detection networks. Furthermore, by taking advantage of transfer learning and depthwise separable convolutions, our model is easy to train and can address small-dataset problems.
Tasks Object Detection, Transfer Learning
Published 2019-11-07
URL https://arxiv.org/abs/1911.02718v2
PDF https://arxiv.org/pdf/1911.02718v2.pdf
PWC https://paperswithcode.com/paper/model-adaption-object-detection-system-for
Repo
Framework
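
A minimal sketch of the allocation idea only: a lightweight meta-classifier looks at cheap features of the input image and routes it to one of several specialized detectors. The detectors, features, and routing rule below are toy placeholders, not the authors' architecture.

import numpy as np

class MetaAllocator:
    """Routes an image to one of several specialized detectors."""

    def __init__(self, detectors, meta_classifier):
        self.detectors = detectors              # list of detector callables
        self.meta_classifier = meta_classifier  # callable: features -> detector index

    def detect(self, image):
        features = image.mean(axis=(0, 1))      # cheap per-channel color features
        return self.detectors[self.meta_classifier(features)](image)

# Hypothetical specialized detectors for two situations (e.g. indoor vs. outdoor).
def indoor_detector(image):
    return [("chair", (10, 10, 50, 50), 0.9)]

def outdoor_detector(image):
    return [("tree", (0, 0, 120, 200), 0.8)]

def simple_meta(features):
    # Toy rule standing in for a trained meta neural network.
    return 0 if features[2] < 0.5 else 1        # route by mean blue channel

system = MetaAllocator([indoor_detector, outdoor_detector], simple_meta)
image = np.random.default_rng(0).random((240, 320, 3))
print(system.detect(image))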

Understanding Perceptions and Attitudes in Breast Cancer Discussions on Twitter

Title Understanding Perceptions and Attitudes in Breast Cancer Discussions on Twitter
Authors Francois Modave, Yunpeng Zhao, Janice Krieger, Zhe He, Yi Guo, Jinhai Huo, Mattia Prosperi, Jiang Bian
Abstract Among American women, the rate of breast cancer is second only to that of lung cancer. An estimated 12.4% of women will develop breast cancer over the course of their lifetime. The widespread use of social media across the socio-economic spectrum offers unparalleled ways to facilitate information sharing, in particular as it pertains to health. Social media is also used by many healthcare stakeholders, ranging from government agencies to the healthcare industry, to disseminate health information and to engage patients. The purpose of this study is to investigate people’s perceptions and attitudes related to breast cancer, especially those related to physical activity, on Twitter. To achieve this, we first identified and collected tweets related to breast cancer, and then used topic modeling and sentiment analysis techniques to understand discussion themes and quantify Twitter users’ perceptions and emotions w.r.t. breast cancer, answering five research questions.
Tasks Sentiment Analysis
Published 2019-05-22
URL https://arxiv.org/abs/1905.12469v1
PDF https://arxiv.org/pdf/1905.12469v1.pdf
PWC https://paperswithcode.com/paper/190512469
Repo
Framework
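
A minimal sketch of a generic topic-modeling plus sentiment pipeline of the kind described: LDA over a bag-of-words representation of the tweets, and a lexicon-based sentiment scorer. The example tweets are placeholders, and this is not the authors' exact preprocessing, model configuration, or sentiment method.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = [
    "walking every day after my breast cancer treatment feels great",
    "new study links physical activity to lower breast cancer risk",
    "so scared waiting for my mammogram results",
    "fundraiser 5k run for breast cancer research this weekend",
]

# Topic modeling with LDA over a simple bag-of-words representation.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(tweets)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
terms = vectorizer.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    print(f"topic {k}:", [terms[i] for i in topic.argsort()[-5:]])

# Lexicon-based sentiment scoring with VADER from NLTK.
# Requires: import nltk; nltk.download("vader_lexicon")
from nltk.sentiment.vader import SentimentIntensityAnalyzer
sia = SentimentIntensityAnalyzer()
for t in tweets:
    print(sia.polarity_scores(t)["compound"], t)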

Corpus Wide Argument Mining – a Working Solution

Title Corpus Wide Argument Mining – a Working Solution
Authors Liat Ein-Dor, Eyal Shnarch, Lena Dankin, Alon Halfon, Benjamin Sznajder, Ariel Gera, Carlos Alzate, Martin Gleize, Leshem Choshen, Yufang Hou, Yonatan Bilu, Ranit Aharonov, Noam Slonim
Abstract One of the main tasks in argument mining is the retrieval of argumentative content pertaining to a given topic. Most previous work addressed this task by retrieving a relatively small number of relevant documents as the initial source for such content. This line of research yielded moderate success, which is of limited use in a real-world system. Furthermore, for such a system to yield a comprehensive set of relevant arguments, over a wide range of topics, it requires leveraging a large and diverse corpus in an appropriate manner. Here we present a first end-to-end high-precision, corpus-wide argument mining system. This is made possible by combining sentence-level queries over an appropriate indexing of a very large corpus of newspaper articles, with an iterative annotation scheme. This scheme addresses the inherent label bias in the data and pinpoints the regions of the sample space whose manual labeling is required to obtain high-precision among top-ranked candidates.
Tasks Argument Mining
Published 2019-11-25
URL https://arxiv.org/abs/1911.10763v1
PDF https://arxiv.org/pdf/1911.10763v1.pdf
PWC https://paperswithcode.com/paper/corpus-wide-argument-mining-a-working
Repo
Framework

Deep Probabilistic Models to Detect Data Poisoning Attacks

Title Deep Probabilistic Models to Detect Data Poisoning Attacks
Authors Mahesh Subedar, Nilesh Ahuja, Ranganath Krishnan, Ibrahima J. Ndiour, Omesh Tickoo
Abstract Data poisoning attacks compromise the integrity of machine-learning models by introducing malicious training samples to influence the results at test time. In this work, we investigate a backdoor data poisoning attack on deep neural networks (DNNs) that inserts a backdoor pattern in the training images. The resulting attack will misclassify poisoned test samples while maintaining high accuracy on the clean test set. We present two approaches for detecting such poisoned samples by quantifying the uncertainty estimates associated with the trained models. In the first approach, we model the outputs of the various layers (deep features) with parametric probability distributions learnt from the clean held-out dataset. At inference, the likelihoods of deep features w.r.t. these distributions are calculated to derive uncertainty estimates. In the second approach, we use Bayesian deep neural networks trained with mean-field variational inference to estimate model uncertainty associated with the predictions. The uncertainty estimates from these methods are used to discriminate clean from poisoned samples.
Tasks data poisoning
Published 2019-12-03
URL https://arxiv.org/abs/1912.01206v1
PDF https://arxiv.org/pdf/1912.01206v1.pdf
PWC https://paperswithcode.com/paper/deep-probabilistic-models-to-detect-data
Repo
Framework
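
A minimal sketch in the spirit of the first detection approach: fit a parametric density to deep features of a clean held-out set, then flag test samples whose feature likelihood is unusually low. The linear feature extractor below is a placeholder; the paper fits distributions to features taken from the layers of the trained DNN.

import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Placeholder for an intermediate-layer feature extractor of the trained model.
W = rng.standard_normal((32, 8))
def deep_features(x):
    return x @ W

# Fit a Gaussian to deep features of the clean held-out set.
clean_heldout = rng.standard_normal((500, 32))
feats = deep_features(clean_heldout)
density = multivariate_normal(mean=feats.mean(axis=0),
                              cov=np.cov(feats, rowvar=False) + 1e-6 * np.eye(8))

# Threshold at a low percentile of the clean log-likelihoods.
threshold = np.percentile(density.logpdf(feats), 1)

def flag_poisoned(batch):
    return density.logpdf(deep_features(batch)) < threshold

test_clean = rng.standard_normal((10, 32))
test_shifted = test_clean + 5.0     # crude stand-in for backdoor-perturbed inputs
print(flag_poisoned(test_clean))    # expected: mostly False
print(flag_poisoned(test_shifted))  # expected: mostly True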

AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

Title AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning
Authors Ximeng Sun, Rameswar Panda, Rogerio Feris
Abstract Multi-task learning is an open and challenging problem in computer vision. The typical way of conducting multi-task learning with deep neural networks is either through handcrafting schemes that share all initial layers and branch out at an ad hoc point or through using separate task-specific networks with an additional feature sharing/fusion mechanism. Unlike existing methods, we propose an adaptive sharing approach, called AdaShare, that decides what to share across which tasks for achieving the best recognition accuracy, while taking resource efficiency into account. Specifically, our main idea is to learn the sharing pattern through a task-specific policy that selectively chooses which layers to execute for a given task in the multi-task network. We efficiently optimize the task-specific policy jointly with the network weights using standard back-propagation. Experiments on three challenging and diverse benchmark datasets with a variable number of tasks well demonstrate the efficacy of our approach over state-of-the-art methods.
Tasks Multi-Task Learning
Published 2019-11-27
URL https://arxiv.org/abs/1911.12423v1
PDF https://arxiv.org/pdf/1911.12423v1.pdf
PWC https://paperswithcode.com/paper/adashare-learning-what-to-share-for-efficient
Repo
Framework
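
A minimal sketch of the core mechanism described above: per-task logits over layers, relaxed with a Gumbel-softmax so the execute/skip decisions can be trained jointly with the network weights by back-propagation. The layer sizes and the two toy tasks are hypothetical, not the paper's configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveSharingNet(nn.Module):
    def __init__(self, num_layers=4, num_tasks=2, dim=16):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_layers)])
        self.heads = nn.ModuleList([nn.Linear(dim, 1) for _ in range(num_tasks)])
        # Task-specific policy logits: for each task and layer, (execute, skip).
        self.policy = nn.Parameter(torch.zeros(num_tasks, num_layers, 2))

    def forward(self, x, task):
        for l, layer in enumerate(self.layers):
            # Differentiable (straight-through) sample of the execute/skip decision.
            decision = F.gumbel_softmax(self.policy[task, l], tau=1.0, hard=True)
            x = decision[0] * torch.relu(layer(x)) + decision[1] * x
        return self.heads[task](x)

net = AdaptiveSharingNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
x = torch.randn(8, 16)
targets = [x.sum(dim=1, keepdim=True), x.mean(dim=1, keepdim=True)]  # two toy tasks
for _ in range(100):
    loss = sum(F.mse_loss(net(x, t), targets[t]) for t in range(2))
    opt.zero_grad()
    loss.backward()
    opt.step()
print(F.softmax(net.policy, dim=-1)[..., 0])  # learned per-task execute probabilities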

Accelerating Data Loading in Deep Neural Network Training

Title Accelerating Data Loading in Deep Neural Network Training
Authors Chih-Chieh Yang, Guojing Cong
Abstract Data loading can dominate deep neural network training time on large-scale systems. We present a comprehensive study on accelerating data loading performance in large-scale distributed training. We first identify performance and scalability issues in current data loading implementations. We then propose optimizations to the data loader design that utilize CPU resources. We use an analytical model to characterize the impact of data loading on the overall training time and establish the performance trend as we scale up distributed training. Our model suggests that I/O rate limits the scalability of distributed training, which inspires us to design a locality-aware data loading method. By utilizing software caches, our method can drastically reduce the data loading communication volume in comparison with the original data loading implementation. Finally, we evaluate the proposed optimizations with various experiments. We achieved more than 30x speedup in data loading using 256 nodes with 1,024 learners.
Tasks
Published 2019-10-02
URL https://arxiv.org/abs/1910.01196v1
PDF https://arxiv.org/pdf/1910.01196v1.pdf
PWC https://paperswithcode.com/paper/accelerating-data-loading-in-deep-neural
Repo
Framework
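
A minimal sketch of the software-cache idea: wrap the dataset so that samples already fetched by a worker are served from local memory instead of being re-read from shared storage. The locality-aware assignment of samples to nodes used in the paper is not shown, and `load_from_storage` is a hypothetical I/O routine.

class CachedDataset:
    """Serves repeated sample reads from an in-memory software cache."""

    def __init__(self, sample_ids, load_fn, capacity=10000):
        self.sample_ids = sample_ids
        self.load_fn = load_fn          # e.g. reads one sample from shared storage
        self.capacity = capacity
        self.cache = {}

    def __len__(self):
        return len(self.sample_ids)

    def __getitem__(self, index):
        key = self.sample_ids[index]
        if key not in self.cache:
            if len(self.cache) >= self.capacity:
                self.cache.pop(next(iter(self.cache)))   # evict oldest entry
            self.cache[key] = self.load_fn(key)          # expensive I/O only on a miss
        return self.cache[key]

def load_from_storage(key):
    # Hypothetical stand-in for reading and decoding one training sample.
    return f"sample-{key}"

ds = CachedDataset(list(range(100)), load_from_storage, capacity=32)
print(ds[3], ds[3])   # the second access is a cache hit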

Overcoming Small Minirhizotron Datasets Using Transfer Learning

Title Overcoming Small Minirhizotron Datasets Using Transfer Learning
Authors Weihuang Xu, Guohao Yu, Alina Zare, Brendan Zurweller, Diane Rowland, Joel Reyes-Cabrera, Felix B Fritschi, Roser Matamala, Thomas E. Juenger
Abstract Minirhizotron technology is widely used for studying the development of roots. Such systems collect visible-wavelength color imagery of plant roots in-situ by scanning an imaging system within a clear tube driven into the soil. Automated analysis of root systems could facilitate new scientific discoveries that would be critical to address the world’s pressing food, resource, and climate issues. A key component of automated analysis of plant roots from imagery is the automated pixel-level segmentation of roots from their surrounding soil. Supervised learning techniques appear to be an appropriate tool for this challenge due to varying local soil and root conditions; however, the lack of sufficient annotated training data is a major limitation because the manual labeling process is error-prone and time-consuming. In this paper, we investigate the use of deep neural networks based on the U-net architecture for automated, precise pixel-wise root segmentation in minirhizotron imagery. We compiled two minirhizotron image datasets to accomplish this study: one with 17,550 peanut root images and another with 28 switchgrass root images. Both datasets were paired with manually labeled ground truth masks. We trained three neural networks with different architectures on the larger peanut root dataset to explore the effect of the neural network depth on segmentation performance. To tackle the more limited switchgrass root dataset, we showed that models initialized with features pre-trained on the peanut dataset and then fine-tuned on the switchgrass dataset can improve segmentation performance significantly. We obtained 99% segmentation accuracy in switchgrass imagery using only 21 training images. We also observed that features pre-trained on a closely related but relatively moderate-size dataset like our peanut dataset are more effective than features pre-trained on the large but unrelated ImageNet dataset.
Tasks Transfer Learning
Published 2019-03-22
URL https://arxiv.org/abs/1903.09344v2
PDF https://arxiv.org/pdf/1903.09344v2.pdf
PWC https://paperswithcode.com/paper/overcoming-small-minirhizotron-datasets-using
Repo
Framework
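
A minimal sketch of the fine-tuning recipe described: initialize a segmentation network with weights trained on the larger source dataset, freeze the encoder, and fine-tune on the small target set. The tiny model and synthetic data below are placeholders standing in for the U-Net and the peanut/switchgrass imagery; this is not the authors' code.

import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Toy encoder-decoder standing in for the U-net used in the paper."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        self.decoder = nn.Conv2d(8, 1, 3, padding=1)

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Pretend these weights were trained on the large (peanut-root) source dataset.
source_model = TinySegNet()
target_model = TinySegNet()
target_model.load_state_dict(source_model.state_dict())   # transfer initialization

# Freeze the encoder so only the decoder adapts to the small target dataset.
for p in target_model.encoder.parameters():
    p.requires_grad = False

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(
    [p for p in target_model.parameters() if p.requires_grad], lr=1e-4)

# Synthetic stand-in for the 21 labeled target (switchgrass) images and masks.
images = torch.rand(21, 3, 64, 64)
masks = (torch.rand(21, 1, 64, 64) > 0.5).float()
for epoch in range(5):
    optimizer.zero_grad()
    loss = criterion(target_model(images), masks)
    loss.backward()
    optimizer.step()
print(loss.item())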

Learning with Modular Representations for Long-Term Multi-Agent Motion Predictions

Title Learning with Modular Representations for Long-Term Multi-Agent Motion Predictions
Authors Todor Davchev, Michael Burke, Subramanian Ramamoorthy
Abstract Context plays a significant role in the generation of motion for dynamic agents in interactive environments. This work proposes a modular method that utilises a model of the environment to aid motion prediction of tracked agents. This paper shows that modelling the spatial and dynamic aspects of a given environment alongside the local per agent behaviour results in more accurate and informed long-term motion prediction. Further, we observe that this decoupling of dynamics and environment models allows for better adaptation to unseen environments, requiring that only a spatial representation of a new environment be learned. We highlight the model’s prediction capability using a benchmark pedestrian tracking problem and by tracking a robot arm performing a tabletop manipulation task. The proposed approach allows for robust and data efficient forward modelling, and relaxes the need for full model re-training in new environments. We evaluate this through an ablation study which shows better performance gain when utilising both representation modules in addition to improved generalisation on tasks with dynamics unseen at training time.
Tasks motion prediction
Published 2019-11-29
URL https://arxiv.org/abs/1911.13044v3
PDF https://arxiv.org/pdf/1911.13044v3.pdf
PWC https://paperswithcode.com/paper/learning-modular-representations-for-long
Repo
Framework

Distribution-dependent and Time-uniform Bounds for Piecewise i.i.d Bandits

Title Distribution-dependent and Time-uniform Bounds for Piecewise i.i.d Bandits
Authors Subhojyoti Mukherjee, Odalric-Ambrym Maillard
Abstract We consider the setup of stochastic multi-armed bandits in the case when reward distributions are piecewise i.i.d. and bounded with unknown changepoints. We focus on the case when changes happen simultaneously on all arms, and in stark contrast with the existing literature, we target gap-dependent (as opposed to only gap-independent) regret bounds involving the magnitude of changes ($\Delta^{chg}_{i,g}$) and optimality-gaps ($\Delta^{opt}_{i,g}$). Diverging from previous works, we assume the more realistic scenario that there can be undetectable changepoint gaps and under a different set of assumptions, we show that as long as the compounded delayed detection for each changepoint is bounded there is no need for forced exploration to actively detect changepoints. We introduce two adaptations of UCB-strategies that employ scan-statistics in order to actively detect the changepoints, without knowing in advance the changepoints and also the mean before and after any change. Our first method UCBLCPD does not know the number of changepoints $G$ or time horizon $T$ and achieves the first time-uniform concentration bound for this setting using the Laplace method of integration. The second strategy ImpCPD makes use of the knowledge of $T$ to achieve the order optimal regret bound of $\min\big\lbrace O(\sum\limits_{i=1}^{K} \sum\limits_{g=1}^{G}\frac{\log(T/H_{1,g})}{\Delta^{opt}_{i,g}}), O(\sqrt{GT})\big\rbrace$, (where $H_{1,g}$ is the problem complexity) thereby closing an important gap with respect to the lower bound in a specific challenging setting. Our theoretical findings are supported by numerical experiments on synthetic and real-life datasets.
Tasks Multi-Armed Bandits
Published 2019-05-30
URL https://arxiv.org/abs/1905.13159v2
PDF https://arxiv.org/pdf/1905.13159v2.pdf
PWC https://paperswithcode.com/paper/distribution-dependent-and-time-uniform
Repo
Framework
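
A minimal sketch of the general pattern of UCB with an active changepoint scan: pick arms by a UCB index, periodically scan each arm's reward history for a split point where the two halves' means differ by more than their combined confidence widths, and restart all statistics on detection. This illustrates the idea only; it is not UCBLCPD or ImpCPD and uses a simplified scan statistic and synthetic rewards.

import math
import numpy as np

rng = np.random.default_rng(0)
K, T, change_at = 3, 6000, 3000
means = np.array([[0.2, 0.8, 0.5],    # arm means before the changepoint
                  [0.8, 0.2, 0.5]])   # arm means after the changepoint

history = [[] for _ in range(K)]
restarts = []

def width(n, t):
    # Confidence width used both for the UCB index and the scan.
    return math.sqrt(2.0 * math.log(t + 1) / max(n, 1))

for t in range(T):
    # UCB arm selection on the statistics gathered since the last restart.
    ucb = [np.mean(h) + width(len(h), t) if h else float("inf") for h in history]
    a = int(np.argmax(ucb))
    phase = int(t >= change_at)
    history[a].append(float(rng.random() < means[phase, a]))

    # Scan the pulled arm's history for a split whose two halves differ
    # by more than their combined confidence widths (simplified scan statistic).
    h = history[a]
    if len(h) >= 60 and len(h) % 50 == 0:
        csum = np.cumsum(h)
        n = len(h)
        for s in range(30, n - 30):
            left, right = csum[s - 1] / s, (csum[-1] - csum[s - 1]) / (n - s)
            if abs(left - right) > width(s, t) + width(n - s, t):
                restarts.append(t)
                history = [[] for _ in range(K)]   # restart all arms on detection
                break

print("detected restarts at t =", restarts)   # typically shortly after t = 3000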