Paper Group ANR 1064
A Note on Bounding Regret of the C$^2$UCB Contextual Combinatorial Bandit. Approximate Sampling using an Accelerated Metropolis-Hastings based on Bayesian Optimization and Gaussian Processes. Distributionally Robust Optimization with Correlated Data from Vector Autoregressive Processes. RATQ: A Universal Fixed-Length Quantizer for Stochastic Optimi …
A Note on Bounding Regret of the C$^2$UCB Contextual Combinatorial Bandit
Title | A Note on Bounding Regret of the C$^2$UCB Contextual Combinatorial Bandit |
Authors | Bastian Oetomo, Malinga Perera, Renata Borovica-Gajic, Benjamin I. P. Rubinstein |
Abstract | We revisit the proof by Qin et al. (2014) of bounded regret of the C$^2$UCB contextual combinatorial bandit. We demonstrate an error in the proof of volumetric expansion of the moment matrix, used in upper bounding a function of context vector norms. We prove a relaxed inequality that yields the originally-stated regret bound. |
Tasks | |
Published | 2019-02-20 |
URL | http://arxiv.org/abs/1902.07500v1 |
http://arxiv.org/pdf/1902.07500v1.pdf | |
PWC | https://paperswithcode.com/paper/a-note-on-bounding-regret-of-the-c2ucb |
Repo | |
Framework | |
Approximate Sampling using an Accelerated Metropolis-Hastings based on Bayesian Optimization and Gaussian Processes
Title | Approximate Sampling using an Accelerated Metropolis-Hastings based on Bayesian Optimization and Gaussian Processes |
Authors | Asif J. Chowdhury, Gabriel Terejanu |
Abstract | Markov Chain Monte Carlo (MCMC) methods have a drawback when working with a target distribution or likelihood function that is computationally expensive to evaluate, specially when working with big data. This paper focuses on Metropolis-Hastings (MH) algorithm for unimodal distributions. Here, an enhanced MH algorithm is proposed that requires less number of expensive function evaluations, has shorter burn-in period, and uses a better proposal distribution. The main innovations include the use of Bayesian optimization to reach the high probability region quickly, emulating the target distribution using Gaussian processes (GP), and using Laplace approximation of the GP to build a proposal distribution that captures the underlying correlation better. The experiments show significant improvement over the regular MH. Statistical comparison between the results from two algorithms is presented. |
Tasks | Gaussian Processes |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09347v1 |
https://arxiv.org/pdf/1910.09347v1.pdf | |
PWC | https://paperswithcode.com/paper/approximate-sampling-using-an-accelerated |
Repo | |
Framework | |
Distributionally Robust Optimization with Correlated Data from Vector Autoregressive Processes
Title | Distributionally Robust Optimization with Correlated Data from Vector Autoregressive Processes |
Authors | Xialiang Dou, Mihai Anitescu |
Abstract | We present a distributionally robust formulation of a stochastic optimization problem for non-i.i.d vector autoregressive data. We use the Wasserstein distance to define robustness in the space of distributions and we show, using duality theory, that the problem is equivalent to a finite convex-concave saddle point problem. The performance of the method is demonstrated on both synthetic and real data. |
Tasks | Stochastic Optimization |
Published | 2019-09-08 |
URL | https://arxiv.org/abs/1909.03433v1 |
https://arxiv.org/pdf/1909.03433v1.pdf | |
PWC | https://paperswithcode.com/paper/distributionally-robust-optimization-with |
Repo | |
Framework | |
RATQ: A Universal Fixed-Length Quantizer for Stochastic Optimization
Title | RATQ: A Universal Fixed-Length Quantizer for Stochastic Optimization |
Authors | Prathamesh Mayekar, Himanshu Tyagi |
Abstract | We present Rotated Adaptive Tetra-iterated Quantizer (RATQ), a fixed-length quantizer for gradients in first order stochastic optimization. RATQ is easy to implement and involves only a Hadamard transform computation and adaptive uniform quantization with appropriately chosen dynamic ranges. For noisy gradients with almost surely bounded Euclidean norms, we establish an information theoretic lower bound for optimization accuracy using finite precision gradients and show that RATQ almost attains this lower bound. For mean square bounded noisy gradients, we use a gain-shape quantizer which separately quantizes the Euclidean norm and uses RATQ to quantize the normalized unit norm vector. We establish lower bounds for performance of any optimization procedure and shape quantizer, when used with a uniform gain quantizer. Finally, we propose an adaptive quantizer for gain which when used with RATQ for shape quantizer outperforms uniform gain quantization and is, in fact, close to optimal. As a by-product, we show that our fixed-length quantizer RATQ has almost the same performance as the optimal variable-length quantizers for distributed mean estimation. Also, we obtain an efficient quantizer for Gaussian vectors which attains a rate very close to the Gaussian rate-distortion function and is, in fact, universal for subgaussian input vectors. |
Tasks | Quantization, Stochastic Optimization |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08200v3 |
https://arxiv.org/pdf/1908.08200v3.pdf | |
PWC | https://paperswithcode.com/paper/finite-precision-stochastic-optimisation |
Repo | |
Framework | |
Bayesian high-dimensional linear regression with generic spike-and-slab priors
Title | Bayesian high-dimensional linear regression with generic spike-and-slab priors |
Authors | Bai Jiang, Qiang Sun |
Abstract | Spike-and-slab priors are popular Bayesian solutions for high-dimensional linear regression problems. Previous theoretical studies on spike-and-slab methods focus on specific prior formulations and use prior-dependent conditions and analyses, and thus can not be generalized directly. In this paper, we propose a class of generic spike-and-slab priors and develop a unified framework to rigorously assess their theoretical properties. Technically, we provide general conditions under which generic spike-and-slab priors can achieve the nearly-optimal posterior contraction rate and the model selection consistency. Our results include those of Narisetty and He (2014) and Castillo et al. (2015) as special cases. |
Tasks | Model Selection |
Published | 2019-12-19 |
URL | https://arxiv.org/abs/1912.08993v2 |
https://arxiv.org/pdf/1912.08993v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-high-dimensional-linear-regression |
Repo | |
Framework | |
Generative Adversarial Networks for Black-Box API Attacks with Limited Training Data
Title | Generative Adversarial Networks for Black-Box API Attacks with Limited Training Data |
Authors | Yi Shi, Yalin E. Sagduyu, Kemal Davaslioglu, Jason H. Li |
Abstract | As online systems based on machine learning are offered to public or paid subscribers via application programming interfaces (APIs), they become vulnerable to frequent exploits and attacks. This paper studies adversarial machine learning in the practical case when there are rate limitations on API calls. The adversary launches an exploratory (inference) attack by querying the API of an online machine learning system (in particular, a classifier) with input data samples, collecting returned labels to build up the training data, and training an adversarial classifier that is functionally equivalent and statistically close to the target classifier. The exploratory attack with limited training data is shown to fail to reliably infer the target classifier of a real text classifier API that is available online to the public. In return, a generative adversarial network (GAN) based on deep learning is built to generate synthetic training data from a limited number of real training data samples, thereby extending the training data and improving the performance of the inferred classifier. The exploratory attack provides the basis to launch the causative attack (that aims to poison the training process) and evasion attack (that aims to fool the classifier into making wrong decisions) by selecting training and test data samples, respectively, based on the confidence scores obtained from the inferred classifier. These stealth attacks with small footprint (using a small number of API calls) make adversarial machine learning practical under the realistic case with limited training data available to the adversary. |
Tasks | Inference Attack |
Published | 2019-01-25 |
URL | http://arxiv.org/abs/1901.09113v1 |
http://arxiv.org/pdf/1901.09113v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-networks-for-black-box |
Repo | |
Framework | |
Model Adaption Object Detection System for Robot
Title | Model Adaption Object Detection System for Robot |
Authors | Jingwen Fu, Licheng Zong, Yinbing Li, Ke Li, Bingqian Yang, Xibei Liu |
Abstract | Object detection for robot guidance is a crucial mission for autonomous robots, which has provoked extensive attention for researchers. However, the changing view of robot movement and limited available data hinder the research in this area. To address these matters, we proposed a new vision system for robots, the model adaptation object detection system. Instead of using a single one to solve problems, We made use of different object detection neural networks to guide the robot in accordance with various situations, with the help of a meta neural network to allocate the object detection neural networks. Furthermore, taking advantage of transfer learning technology and depthwise separable convolutions, our model is easy to train and can address small dataset problems. |
Tasks | Object Detection, Transfer Learning |
Published | 2019-11-07 |
URL | https://arxiv.org/abs/1911.02718v2 |
https://arxiv.org/pdf/1911.02718v2.pdf | |
PWC | https://paperswithcode.com/paper/model-adaption-object-detection-system-for |
Repo | |
Framework | |
Understanding Perceptions and Attitudes in Breast Cancer Discussions on Twitter
Title | Understanding Perceptions and Attitudes in Breast Cancer Discussions on Twitter |
Authors | Francois Modave, Yunpeng Zhao, Janice Krieger, Zhe He, Yi Guo, Jinhai Huo, Mattia Prosperi, Jiang Bian |
Abstract | Among American women, the rate of breast cancer is only second to lung cancer. An estimated 12.4% women will develop breast cancer over the course of their lifetime. The widespread use of social media across the socio-economic spectrum offers unparalleled ways to facilitate information sharing, in particular as it pertains to health. Social media is also used by many healthcare stakeholders, ranging from government agencies to healthcare industry, to disseminate health information and to engage patients. The purpose of this study is to investigate people’s perceptions and attitudes relate to breast cancer, especially those that are related to physical activities, on Twitter. To achieve this, we first identified and collected tweets related to breast cancer; and then used topic modeling and sentiment analysis techniques to understanding discussion themes and quantify Twitter users’ perceptions and emotions w.r.t breast cancer to answer 5 research questions. |
Tasks | Sentiment Analysis |
Published | 2019-05-22 |
URL | https://arxiv.org/abs/1905.12469v1 |
https://arxiv.org/pdf/1905.12469v1.pdf | |
PWC | https://paperswithcode.com/paper/190512469 |
Repo | |
Framework | |
Corpus Wide Argument Mining – a Working Solution
Title | Corpus Wide Argument Mining – a Working Solution |
Authors | Liat Ein-Dor, Eyal Shnarch, Lena Dankin, Alon Halfon, Benjamin Sznajder, Ariel Gera, Carlos Alzate, Martin Gleize, Leshem Choshen, Yufang Hou, Yonatan Bilu, Ranit Aharonov, Noam Slonim |
Abstract | One of the main tasks in argument mining is the retrieval of argumentative content pertaining to a given topic. Most previous work addressed this task by retrieving a relatively small number of relevant documents as the initial source for such content. This line of research yielded moderate success, which is of limited use in a real-world system. Furthermore, for such a system to yield a comprehensive set of relevant arguments, over a wide range of topics, it requires leveraging a large and diverse corpus in an appropriate manner. Here we present a first end-to-end high-precision, corpus-wide argument mining system. This is made possible by combining sentence-level queries over an appropriate indexing of a very large corpus of newspaper articles, with an iterative annotation scheme. This scheme addresses the inherent label bias in the data and pinpoints the regions of the sample space whose manual labeling is required to obtain high-precision among top-ranked candidates. |
Tasks | Argument Mining |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.10763v1 |
https://arxiv.org/pdf/1911.10763v1.pdf | |
PWC | https://paperswithcode.com/paper/corpus-wide-argument-mining-a-working |
Repo | |
Framework | |
Deep Probabilistic Models to Detect Data Poisoning Attacks
Title | Deep Probabilistic Models to Detect Data Poisoning Attacks |
Authors | Mahesh Subedar, Nilesh Ahuja, Ranganath Krishnan, Ibrahima J. Ndiour, Omesh Tickoo |
Abstract | Data poisoning attacks compromise the integrity of machine-learning models by introducing malicious training samples to influence the results during test time. In this work, we investigate backdoor data poisoning attack on deep neural networks (DNNs) by inserting a backdoor pattern in the training images. The resulting attack will misclassify poisoned test samples while maintaining high accuracies for the clean test-set. We present two approaches for detection of such poisoned samples by quantifying the uncertainty estimates associated with the trained models. In the first approach, we model the outputs of the various layers (deep features) with parametric probability distributions learnt from the clean held-out dataset. At inference, the likelihoods of deep features w.r.t these distributions are calculated to derive uncertainty estimates. In the second approach, we use Bayesian deep neural networks trained with mean-field variational inference to estimate model uncertainty associated with the predictions. The uncertainty estimates from these methods are used to discriminate clean from the poisoned samples. |
Tasks | data poisoning |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01206v1 |
https://arxiv.org/pdf/1912.01206v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-probabilistic-models-to-detect-data |
Repo | |
Framework | |
AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning
Title | AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning |
Authors | Ximeng Sun, Rameswar Panda, Rogerio Feris |
Abstract | Multi-task learning is an open and challenging problem in computer vision. The typical way of conducting multi-task learning with deep neural networks is either through handcrafting schemes that share all initial layers and branch out at an adhoc point or through using separate task-specific networks with an additional feature sharing/fusion mechanism. Unlike existing methods, we propose an adaptive sharing approach, called AdaShare, that decides what to share across which tasks for achieving the best recognition accuracy, while taking resource efficiency into account. Specifically, our main idea is to learn the sharing pattern through a task-specific policy that selectively chooses which layers to execute for a given task in the multi-task network. We efficiently optimize the task-specific policy jointly with the network weights using standard back-propagation. Experiments on three challenging and diverse benchmark datasets with a variable number of tasks well demonstrate the efficacy of our approach over state-of-the-art methods. |
Tasks | Multi-Task Learning |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.12423v1 |
https://arxiv.org/pdf/1911.12423v1.pdf | |
PWC | https://paperswithcode.com/paper/adashare-learning-what-to-share-for-efficient |
Repo | |
Framework | |
Accelerating Data Loading in Deep Neural Network Training
Title | Accelerating Data Loading in Deep Neural Network Training |
Authors | Chih-Chieh Yang, Guojing Cong |
Abstract | Data loading can dominate deep neural network training time on large-scale systems. We present a comprehensive study on accelerating data loading performance in large-scale distributed training. We first identify performance and scalability issues in current data loading implementations. We then propose optimizations that utilize CPU resources to the data loader design. We use an analytical model to characterize the impact of data loading on the overall training time and establish the performance trend as we scale up distributed training. Our model suggests that I/O rate limits the scalability of distributed training, which inspires us to design a locality-aware data loading method. By utilizing software caches, our method can drastically reduce the data loading communication volume in comparison with the original data loading implementation. Finally, we evaluate the proposed optimizations with various experiments. We achieved more than 30x speedup in data loading using 256 nodes with 1,024 learners. |
Tasks | |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.01196v1 |
https://arxiv.org/pdf/1910.01196v1.pdf | |
PWC | https://paperswithcode.com/paper/accelerating-data-loading-in-deep-neural |
Repo | |
Framework | |
Overcoming Small Minirhizotron Datasets Using Transfer Learning
Title | Overcoming Small Minirhizotron Datasets Using Transfer Learning |
Authors | Weihuang Xu, Guohao Yu, Alina Zare, Brendan Zurweller, Diane Rowland, Joel Reyes-Cabrera, Felix B Fritschi, Roser Matamala, Thomas E. Juenger |
Abstract | Minirhizotron technology is widely used for studying the development of roots. Such systems collect visible-wavelength color imagery of plant roots in-situ by scanning an imaging system within a clear tube driven into the soil. Automated analysis of root systems could facilitate new scientific discoveries that would be critical to address the world’s pressing food, resource, and climate issues. A key component of automated analysis of plant roots from imagery is the automated pixel-level segmentation of roots from their surrounding soil. Supervised learning techniques appear to be an appropriate tool for the challenge due to varying local soil and root conditions, however, lack of enough annotated training data is a major limitation due to the error-prone and time-consuming manually labeling process. In this paper, we investigate the use of deep neural networks based on the U-net architecture for automated, precise pixel-wise root segmentation in minirhizotron imagery. We compiled two minirhizotron image datasets to accomplish this study: one with 17,550 peanut root images and another with 28 switchgrass root images. Both datasets were paired with manually labeled ground truth masks. We trained three neural networks with different architectures on the larger peanut root dataset to explore the effect of the neural network depth on segmentation performance. To tackle the more limited switchgrass root dataset, we showed that models initialized with features pre-trained on the peanut dataset and then fine-tuned on the switchgrass dataset can improve segmentation performance significantly. We obtained 99% segmentation accuracy in switchgrass imagery using only 21 training images. We also observed that features pre-trained on a closely related but relatively moderate size dataset like our peanut dataset are more effective than features pre-trained on the large but unrelated ImageNet dataset. |
Tasks | Transfer Learning |
Published | 2019-03-22 |
URL | https://arxiv.org/abs/1903.09344v2 |
https://arxiv.org/pdf/1903.09344v2.pdf | |
PWC | https://paperswithcode.com/paper/overcoming-small-minirhizotron-datasets-using |
Repo | |
Framework | |
Learning with Modular Representations for Long-Term Multi-Agent Motion Predictions
Title | Learning with Modular Representations for Long-Term Multi-Agent Motion Predictions |
Authors | Todor Davchev, Michael Burke, Subramanian Ramamoorthy |
Abstract | Context plays a significant role in the generation of motion for dynamic agents in interactive environments. This work proposes a modular method that utilises a model of the environment to aid motion prediction of tracked agents. This paper shows that modelling the spatial and dynamic aspects of a given environment alongside the local per agent behaviour results in more accurate and informed long-term motion prediction. Further, we observe that this decoupling of dynamics and environment models allows for better adaptation to unseen environments, requiring that only a spatial representation of a new environment be learned. We highlight the model’s prediction capability using a benchmark pedestrian tracking problem and by tracking a robot arm performing a tabletop manipulation task. The proposed approach allows for robust and data efficient forward modelling, and relaxes the need for full model re-training in new environments. We evaluate this through an ablation study which shows better performance gain when utilising both representation modules in addition to improved generalisation on tasks with dynamics unseen at training time. |
Tasks | motion prediction |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1911.13044v3 |
https://arxiv.org/pdf/1911.13044v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-modular-representations-for-long |
Repo | |
Framework | |
Distribution-dependent and Time-uniform Bounds for Piecewise i.i.d Bandits
Title | Distribution-dependent and Time-uniform Bounds for Piecewise i.i.d Bandits |
Authors | Subhojyoti Mukherjee, Odalric-Ambrym Maillard |
Abstract | We consider the setup of stochastic multi-armed bandits in the case when reward distributions are piecewise i.i.d. and bounded with unknown changepoints. We focus on the case when changes happen simultaneously on all arms, and in stark contrast with the existing literature, we target gap-dependent (as opposed to only gap-independent) regret bounds involving the magnitude of changes $(\Delta^{chg}{i,g})$ and optimality-gaps ($\Delta^{opt}{i,g}$). Diverging from previous works, we assume the more realistic scenario that there can be undetectable changepoint gaps and under a different set of assumptions, we show that as long as the compounded delayed detection for each changepoint is bounded there is no need for forced exploration to actively detect changepoints. We introduce two adaptations of UCB-strategies that employ scan-statistics in order to actively detect the changepoints, without knowing in advance the changepoints and also the mean before and after any change. Our first method \UCBLCPD does not know the number of changepoints $G$ or time horizon $T$ and achieves the first time-uniform concentration bound for this setting using the Laplace method of integration. The second strategy \ImpCPD makes use of the knowledge of $T$ to achieve the order optimal regret bound of $\min\big\lbrace O(\sum\limits_{i=1}^{K} \sum\limits_{g=1}^{G}\frac{\log(T/H_{1,g})}{\Delta^{opt}_{i,g}}), O(\sqrt{GT})\big\rbrace$, (where $H_{1,g}$ is the problem complexity) thereby closing an important gap with respect to the lower bound in a specific challenging setting. Our theoretical findings are supported by numerical experiments on synthetic and real-life datasets. |
Tasks | Multi-Armed Bandits |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13159v2 |
https://arxiv.org/pdf/1905.13159v2.pdf | |
PWC | https://paperswithcode.com/paper/distribution-dependent-and-time-uniform |
Repo | |
Framework | |