January 25, 2020

3094 words 15 mins read

Paper Group ANR 1735

Estimating Kullback-Leibler Divergence Using Kernel Machines. Visual Privacy Protection via Mapping Distortion. Approximate Guarantees for Dictionary Learning. Jointly Learning Structured Analysis Discriminative Dictionary and Analysis Multiclass Classifier. White-Box Adversarial Defense via Self-Supervised Data Estimation. Regret Minimisation in M …

Estimating Kullback-Leibler Divergence Using Kernel Machines

Title Estimating Kullback-Leibler Divergence Using Kernel Machines
Authors Kartik Ahuja
Abstract Recently, a method called the Mutual Information Neural Estimator (MINE), which uses neural networks, has been proposed to estimate mutual information and, more generally, the Kullback-Leibler (KL) divergence between two distributions. The method uses the Donsker-Varadhan representation to arrive at the estimate of the KL divergence and improves on existing estimators in terms of scalability and flexibility. However, the output of the MINE algorithm is not guaranteed to be a consistent estimator. We propose a new estimator that, instead of searching among functions parameterized by neural networks, searches over functions in a Reproducing Kernel Hilbert Space. We prove that the proposed estimator is consistent. We carry out simulations and show that when the datasets are small the proposed estimator is more reliable than the MINE estimator, and when the datasets are large the performance of the two methods is comparable.
Tasks
Published 2019-05-02
URL https://arxiv.org/abs/1905.00586v2
PDF https://arxiv.org/pdf/1905.00586v2.pdf
PWC https://paperswithcode.com/paper/estimating-kullback-leibler-divergence-using
Repo
Framework
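To make the Donsker-Varadhan (DV) objective from this abstract concrete, here is a minimal sketch that maximizes the DV lower bound on KL(P||Q) over a random-feature approximation of an RBF Reproducing Kernel Hilbert Space. All names, hyperparameters, and the random-feature shortcut are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples from P = N(1, 1) and Q = N(0, 1); the true KL(P||Q) is 0.5.
xp = rng.normal(1.0, 1.0, size=(2000, 1))
xq = rng.normal(0.0, 1.0, size=(2000, 1))

# Random Fourier features approximating an RBF kernel, so that
# f(x) = phi(x) @ theta lies (approximately) in the corresponding RKHS.
D, gamma = 256, 0.5
W = rng.normal(0.0, np.sqrt(2 * gamma), size=(1, D))
b = rng.uniform(0.0, 2 * np.pi, size=D)
phi = lambda x: np.sqrt(2.0 / D) * np.cos(x @ W + b)

theta = np.zeros(D)
for _ in range(500):
    # Donsker-Varadhan bound: E_P[f] - log E_Q[exp(f)]
    fq = phi(xq) @ theta
    w = np.exp(fq - fq.max())
    w /= w.sum()                                   # softmax weights over Q samples
    grad = phi(xp).mean(axis=0) - w @ phi(xq)      # gradient of the DV bound
    theta += 0.5 * (grad - 1e-3 * theta)           # ascent + small RKHS-norm penalty

fp, fq = phi(xp) @ theta, phi(xq) @ theta
kl_hat = fp.mean() - (np.log(np.mean(np.exp(fq - fq.max()))) + fq.max())
print(f"estimated KL: {kl_hat:.3f}  (ground truth 0.5)")
```

Restricting the witness function to a ball in an RKHS, rather than a neural network class, is what the abstract credits with making the estimator provably consistent.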

Visual Privacy Protection via Mapping Distortion

Title Visual Privacy Protection via Mapping Distortion
Authors Peidong Liu, Yiming Li, Yong Jiang, Shu-Tao Xia
Abstract Data privacy protection is an important research area, which is especially critical in the big data era. To a large extent, the privacy of visual classification tasks lies in the one-to-one mapping between an image and its corresponding label, since this relation provides a great amount of information and can be used in other scenarios. In this paper, we propose Mapping Distortion Protection (MDP) and its augmentation-based extension (AugMDP) to protect data privacy by modifying the original dataset. In MDP, the label of the modified image does not match the ground-truth mapping, yet DNNs can still learn the ground-truth relation even when the provided mapping is distorted. As such, this method protects privacy when the dataset is leaked. Extensive experiments conducted on CIFAR-10 and a restricted CASIA-WebFace dataset verify the effectiveness and feasibility of the method.
Tasks
Published 2019-11-05
URL https://arxiv.org/abs/1911.01769v2
PDF https://arxiv.org/pdf/1911.01769v2.pdf
PWC https://paperswithcode.com/paper/visual-privacy-protection-via-mapping
Repo
Framework

Approximate Guarantees for Dictionary Learning

Title Approximate Guarantees for Dictionary Learning
Authors Aditya Bhaskara, Wai Ming Tai
Abstract In the dictionary learning (or sparse coding) problem, we are given a collection of signals (vectors in $\mathbb{R}^d$), and the goal is to find a “basis” in which the signals have a sparse (approximate) representation. The problem has received a lot of attention in signal processing, learning, and theoretical computer science. The problem is formalized as factorizing a matrix $X (d \times n)$ (whose columns are the signals) as $X = AY$, where $A$ has a prescribed number $m$ of columns (typically $m \ll n$), and $Y$ has columns that are $k$-sparse (typically $k \ll d$). Most of the known theoretical results involve assuming that the columns of the unknown $A$ have certain incoherence properties, and that the coefficient matrix $Y$ has random (or partly random) structure. The goal of our work is to understand what can be said in the absence of such assumptions. Can we still find $A$ and $Y$ such that $X \approx AY$? We show that this is possible, if we allow violating the bounds on $m$ and $k$ by appropriate factors that depend on $k$ and the desired approximation. Our results rely on an algorithm for what we call the threshold correlation problem, which turns out to be related to hypercontractive norms of matrices. We also show that our algorithmic ideas apply to a setting in which some of the columns of $X$ are outliers, thus giving similar guarantees even in this challenging setting.
Tasks Dictionary Learning
Published 2019-05-28
URL https://arxiv.org/abs/1905.12091v1
PDF https://arxiv.org/pdf/1905.12091v1.pdf
PWC https://paperswithcode.com/paper/approximate-guarantees-for-dictionary
Repo
Framework
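As a point of reference for the factorization above, the following toy alternating-minimization baseline finds $A$ and $Y$ with $k$-sparse columns such that $X \approx AY$. It is emphatically not the paper's threshold-correlation algorithm (and inherits none of its guarantees); it only makes the problem statement concrete, and all function names are hypothetical.

```python
import numpy as np

def sparse_code(X, A, k):
    """Greedy coding: least squares on the k atoms most correlated with each signal."""
    Y = np.zeros((A.shape[1], X.shape[1]))
    for j in range(X.shape[1]):
        support = np.argsort(-np.abs(A.T @ X[:, j]))[:k]   # top-k correlations
        Y[support, j] = np.linalg.lstsq(A[:, support], X[:, j], rcond=None)[0]
    return Y

def dictionary_learning(X, m, k, iters=30, seed=0):
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(X.shape[0], m))
    A /= np.linalg.norm(A, axis=0)                         # unit-norm atoms
    for _ in range(iters):
        Y = sparse_code(X, A, k)                           # fix A, update Y
        A = X @ np.linalg.pinv(Y)                          # fix Y, update A
        A /= np.linalg.norm(A, axis=0) + 1e-12
    return A, sparse_code(X, A, k)

# Toy check on signals that truly are 3-sparse in a planted dictionary.
rng = np.random.default_rng(1)
A_true = rng.normal(size=(20, 30))
A_true /= np.linalg.norm(A_true, axis=0)
Y_true = np.zeros((30, 200))
for j in range(200):
    Y_true[rng.choice(30, 3, replace=False), j] = rng.normal(size=3)
X = A_true @ Y_true
A, Y = dictionary_learning(X, m=30, k=3)
print("relative error:", np.linalg.norm(X - A @ Y) / np.linalg.norm(X))
```

The paper's contribution is precisely what such heuristics lack: approximation guarantees without incoherence or randomness assumptions, at the cost of relaxing the bounds on $m$ and $k$.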

Jointly Learning Structured Analysis Discriminative Dictionary and Analysis Multiclass Classifier

Title Jointly Learning Structured Analysis Discriminative Dictionary and Analysis Multiclass Classifier
Authors Zhao Zhang, Weiming Jiang, Jie Qin, Li Zhang, Fanzhang Li, Min Zhang, Shuicheng Yan
Abstract In this paper, we propose a structured Analysis Discriminative Dictionary Learning (ADDL) framework based on the analysis mechanism. ADDL seamlessly integrates analysis discriminative dictionary learning, analysis representation, and analysis classifier training into a unified model. The applied analysis mechanism ensures that the learnt dictionaries, representations, and linear classifiers over different classes are as independent and discriminative as possible. The dictionary is obtained by minimizing a reconstruction error together with an analytical incoherence-promoting term that encourages the sub-dictionaries associated with different classes to be independent. To obtain the representation coefficients, ADDL imposes a sparse $\ell_{2,1}$-norm constraint on the coding coefficients instead of using the $\ell_0$- or $\ell_1$-norm, since the $\ell_0$- or $\ell_1$-norm constraints applied in most existing DL criteria make the training phase time consuming. The codes-extraction projection, which bridges data with the sparse codes by extracting special features from the given samples, is calculated by minimizing a sparse-codes approximation term. We then compute a linear classifier based on the approximated sparse codes by an analysis mechanism, so as to simultaneously account for classification and representation power. The classification stage of our model is therefore very efficient, because it avoids the extra time-consuming sparse reconstruction with the trained dictionary for each new test sample that most existing DL algorithms require. Simulations on real image databases demonstrate that our ADDL model obtains superior performance over other state-of-the-art methods.
Tasks Dictionary Learning
Published 2019-05-27
URL https://arxiv.org/abs/1905.11543v1
PDF https://arxiv.org/pdf/1905.11543v1.pdf
PWC https://paperswithcode.com/paper/jointly-learning-structured-analysis
Repo
Framework
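For readers unfamiliar with the $\ell_{2,1}$-norm that ADDL places on the coding coefficients, here is a minimal illustration (not ADDL itself): it sums the $\ell_2$ norms of the rows of a matrix, so minimizing it pushes entire rows to zero, giving sparsity patterns shared across samples.

```python
import numpy as np

def l21_norm(Y):
    """Sum of the l2 norms of the rows of Y."""
    return float(np.sum(np.linalg.norm(Y, axis=1)))

Y = np.array([[3.0, 4.0],
              [0.0, 0.0],    # an all-zero row contributes nothing
              [1.0, 0.0]])
print(l21_norm(Y))           # 5.0 + 0.0 + 1.0 = 6.0
```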

White-Box Adversarial Defense via Self-Supervised Data Estimation

Title White-Box Adversarial Defense via Self-Supervised Data Estimation
Authors Zudi Lin, Hanspeter Pfister, Ziming Zhang
Abstract In this paper, we study the problem of how to defend classifiers against adversarial attacks that fool them using subtly modified input data. In contrast to previous works, here we focus on white-box adversarial defense, where the attackers are granted full access not only to the classifiers but also to the defenders, to produce attacks that are as strong as possible. In such a context, we propose viewing a defender as a functional, a higher-order function that takes functions as its argument to represent a function space, rather than as a fixed function as is conventional. From this perspective, a defender should be realized and optimized individually for each adversarial input. To this end, we propose RIDE, an efficient and provably convergent self-supervised learning algorithm for individual data estimation that protects predictions from adversarial attacks. We demonstrate significant improvements in adversarial defense performance on image recognition, e.g., 98%, 76%, and 43% test accuracy on the MNIST, CIFAR-10, and ImageNet datasets, respectively, under the state-of-the-art BPDA attacker.
Tasks Adversarial Defense
Published 2019-09-13
URL https://arxiv.org/abs/1909.06271v1
PDF https://arxiv.org/pdf/1909.06271v1.pdf
PWC https://paperswithcode.com/paper/white-box-adversarial-defense-via-self
Repo
Framework

Regret Minimisation in Multi-Armed Bandits Using Bounded Arm Memory

Title Regret Minimisation in Multi-Armed Bandits Using Bounded Arm Memory
Authors Arghya Roy Chaudhuri, Shivaram Kalyanakrishnan
Abstract In this paper, we propose a constant-word (RAM model) algorithm for regret minimisation for both finite and infinite stochastic Multi-Armed Bandit (MAB) instances. Most existing regret minimisation algorithms need to remember the statistics of all the arms they encounter. This may become a problem when the number of available words of memory is limited. Designing an efficient regret minimisation algorithm that uses a constant number of words has long been of interest to the community. Some early attempts consider the number of arms to be infinite and require the reward distributions of the arms to belong to some particular family. Recently, for bandits with finitely many arms, an explore-then-commit based algorithm (Liau et al., 2018) seems to escape such assumptions. However, due to its underlying PAC-based elimination, that method incurs high regret. We present a conceptually simple and efficient algorithm that needs to remember the statistics of at most $M$ arms, and for any $K$-armed finite bandit instance it enjoys an $O(KM + K^{1.5}\sqrt{T\log(T/MK)}/M)$ upper bound on regret. We extend it to achieve sub-linear quantile regret (Roy Chaudhuri and Kalyanakrishnan, 2018) and empirically verify the efficiency of our algorithm via experiments.
Tasks Multi-Armed Bandits
Published 2019-01-24
URL http://arxiv.org/abs/1901.08387v1
PDF http://arxiv.org/pdf/1901.08387v1.pdf
PWC https://paperswithcode.com/paper/regret-minimisation-in-multi-armed-bandits
Repo
Framework
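To illustrate the memory constraint only (this is not the paper's algorithm; the pool size, swap schedule, and reward model below are all made up), here is a toy loop that keeps statistics for at most $M$ arms at a time, plays the current pool with UCB, and evicts the empirically weakest arm to make room for unseen ones.

```python
import numpy as np

rng = np.random.default_rng(0)
K, M, rounds_per_pool = 50, 4, 2000
true_means = rng.uniform(0.2, 0.8, size=K)   # Bernoulli arms

pool = list(range(M))                        # arm indices currently in memory
counts, sums = np.zeros(M), np.zeros(M)
next_arm, total_reward, total_pulls = M, 0.0, 0

while True:
    for t in range(1, rounds_per_pool + 1):  # UCB over the M arms in memory
        n = np.maximum(counts, 1)
        ucb = sums / n + np.sqrt(2 * np.log(t + 1) / n)
        i = int(np.argmax(np.where(counts == 0, np.inf, ucb)))
        r = rng.binomial(1, true_means[pool[i]])
        counts[i] += 1
        sums[i] += r
        total_reward += r
        total_pulls += 1
    if next_arm == K:
        break
    worst = int(np.argmin(sums / counts))    # evict the weakest arm...
    pool[worst] = next_arm                   # ...and admit an unseen one
    counts[worst], sums[worst] = 0, 0.0
    next_arm += 1

print(f"average reward {total_reward / total_pulls:.3f} "
      f"vs best arm {true_means.max():.3f}")
```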

Structured Discriminative Tensor Dictionary Learning for Unsupervised Domain Adaptation

Title Structured Discriminative Tensor Dictionary Learning for Unsupervised Domain Adaptation
Authors Songsong Wu, Yan Yan, Hao Tang, Jianjun Qian, Jian Zhang, Xiao-Yuan Jing
Abstract Unsupervised Domain Adaptation (UDA) addresses the problem of performance degradation due to domain shift between the training and testing sets, which is common in computer vision applications. Most existing UDA approaches are based on vector-form data, although the typical format of data or features in visual applications is the multi-dimensional tensor. Moreover, current methods, including deep network approaches, assume that abundant labeled source samples are provided for training. In practice, however, the number of labeled source samples is often limited by expensive annotation costs, leading to sub-optimal performance. In this paper, we propose to seek discriminative representations for multi-dimensional data by learning a structured dictionary in tensor space. The dictionary separates domain-specific information from class-specific information to guarantee that the representation is robust across domains. In addition, a pseudo-label estimation scheme is developed and combined with discriminant analysis in the algorithm's iterations, avoiding the need for an external classifier. We perform extensive experiments on different datasets with limited source samples. Experimental results demonstrate that the proposed method outperforms state-of-the-art approaches.
Tasks Dictionary Learning, Domain Adaptation, Unsupervised Domain Adaptation
Published 2019-05-11
URL https://arxiv.org/abs/1905.04424v1
PDF https://arxiv.org/pdf/1905.04424v1.pdf
PWC https://paperswithcode.com/paper/structured-discriminative-tensor-dictionary
Repo
Framework

Hijacking Malaria Simulators with Probabilistic Programming

Title Hijacking Malaria Simulators with Probabilistic Programming
Authors Bradley Gram-Hansen, Christian Schröder de Witt, Tom Rainforth, Philip H. S. Torr, Yee Whye Teh, Atılım Güneş Baydin
Abstract Epidemiology simulations have become a fundamental tool in the fight against the epidemics of various infectious diseases like AIDS and malaria. However, the complicated and stochastic nature of these simulators can mean their output is difficult to interpret, which reduces their usefulness to policymakers. In this paper, we introduce an approach that allows one to treat a large class of population-based epidemiology simulators as probabilistic generative models. This is achieved by hijacking the internal random number generator calls, through the use of a universal probabilistic programming system (PPS). In contrast to other methods, our approach can be easily retrofitted to simulators written in popular industrial programming frameworks. We demonstrate that our method can be used for interpretable introspection and inference, thus shedding light on black-box simulators. This reinstates much-needed trust between policymakers and evidence-based methods.
Tasks Epidemiology, Probabilistic Programming
Published 2019-05-29
URL https://arxiv.org/abs/1905.12432v1
PDF https://arxiv.org/pdf/1905.12432v1.pdf
PWC https://paperswithcode.com/paper/hijacking-malaria-simulators-with
Repo
Framework
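The core mechanism, hijacking a simulator's random number generator so its stochastic choices become a recordable and replayable probabilistic-program trace, can be illustrated in a few lines. A real probabilistic programming system does far more (addressing, conditioning, inference); the class and simulator below are purely hypothetical.

```python
import random

class TracingRNG:
    """Intercepts RNG calls, recording each draw into a trace."""
    def __init__(self, replay=None):
        self.trace, self.replay = [], replay

    def random(self):
        if self.replay is not None:           # replay a recorded trace
            u = self.replay[len(self.trace)]
        else:                                 # fresh draw, recorded as we go
            u = random.random()
        self.trace.append(u)
        return u

def simulator(rng, n_people=5):
    """A toy epidemic step: each person is infected with probability 0.3."""
    return [rng.random() < 0.3 for _ in range(n_people)]

rng = TracingRNG()
outcome = simulator(rng)
print(outcome, "from trace", [round(u, 2) for u in rng.trace])
# Replaying the recorded trace reproduces the outcome exactly:
assert simulator(TracingRNG(replay=rng.trace)) == outcome
```

Because only the RNG is swapped out, this kind of instrumentation can be retrofitted to an existing simulator without rewriting its internals, which is the selling point of the paper's approach.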

Automatic Reparameterisation of Probabilistic Programs

Title Automatic Reparameterisation of Probabilistic Programs
Authors Maria I. Gorinova, Dave Moore, Matthew D. Hoffman
Abstract Probabilistic programming has emerged as a powerful paradigm in statistics, applied science, and machine learning: by decoupling modelling from inference, it promises to allow modellers to directly reason about the processes generating data. However, the performance of inference algorithms can be dramatically affected by the parameterisation used to express a model, requiring users to transform their programs in non-intuitive ways. We argue for automating these transformations, and demonstrate that mechanisms available in recent modeling frameworks can implement non-centring and related reparameterisations. This enables new inference algorithms, and we propose two: a simple approach using interleaved sampling and a novel variational formulation that searches over a continuous space of parameterisations. We show that these approaches enable robust inference across a range of models, and can yield more efficient samplers than the best fixed parameterisation.
Tasks Probabilistic Programming
Published 2019-06-07
URL https://arxiv.org/abs/1906.03028v1
PDF https://arxiv.org/pdf/1906.03028v1.pdf
PWC https://paperswithcode.com/paper/automatic-reparameterisation-of-probabilistic
Repo
Framework
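The non-centring transformation that the paper automates is easy to state by hand. Below is a minimal sketch in plain NumPy (not one of the paper's modeling frameworks) showing that the centred and non-centred parameterisations define the same distribution while exposing different variables to the inference algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, tau = 0.0, 3.0

# Centred parameterisation: sample theta ~ N(mu, tau^2) directly.
theta_centred = rng.normal(mu, tau)

# Non-centred parameterisation: sample a standard normal "innovation" z,
# then shift and scale deterministically.
z = rng.normal(0.0, 1.0)
theta_noncentred = mu + tau * z

# Both thetas are N(mu, tau^2) draws; but in a hierarchical model the
# non-centred form decouples theta from (mu, tau), which often removes the
# "funnel" geometry that defeats samplers when the data are weak.
```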

Learning Fair Rule Lists

Title Learning Fair Rule Lists
Authors Ulrich Aïvodji, Julien Ferry, Sébastien Gambs, Marie-José Huguet, Mohamed Siala
Abstract As the use of black-box models becomes ubiquitous in high-stakes decision-making systems, demands for fair and interpretable models are increasing. While it has been shown that interpretable models can be as accurate as black-box models in several critical domains, existing fair classification techniques that are interpretable by design often display poor accuracy/fairness tradeoffs in comparison with their non-interpretable counterparts. In this paper, we propose FairCORELS, a fair classification technique that is interpretable by design and whose objective is to learn fair rule lists. Our solution is a multi-objective variant of CORELS, a branch-and-bound algorithm for learning rule lists, that supports several statistical notions of fairness, including statistical parity, equal opportunity, and equalized odds. The empirical evaluation of FairCORELS on real-world datasets demonstrates that it outperforms state-of-the-art fair classification techniques that are interpretable by design while being competitive with non-interpretable ones.
Tasks Decision Making
Published 2019-09-09
URL https://arxiv.org/abs/1909.03977v2
PDF https://arxiv.org/pdf/1909.03977v2.pdf
PWC https://paperswithcode.com/paper/learning-fair-rule-lists
Repo
Framework
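As a reminder of what one of these fairness notions measures, here is statistical parity on toy predictions: the positive-prediction rate should match across protected groups. The arrays are made-up data for illustration.

```python
import numpy as np

y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])   # model decisions
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # protected attribute

rate_0 = y_pred[group == 0].mean()            # 3/4 positive
rate_1 = y_pred[group == 1].mean()            # 1/4 positive
print("statistical parity difference:", abs(rate_0 - rate_1))  # 0.5
```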

Self-Training and Adversarial Background Regularization for Unsupervised Domain Adaptive One-Stage Object Detection

Title Self-Training and Adversarial Background Regularization for Unsupervised Domain Adaptive One-Stage Object Detection
Authors Seunghyeon Kim, Jaehoon Choi, Taekyung Kim, Changick Kim
Abstract Deep learning-based object detectors have shown remarkable improvements. However, supervised learning-based methods perform poorly when the training data and the test data have different distributions. To address this issue, domain adaptation transfers knowledge from the label-sufficient domain (source domain) to the label-scarce domain (target domain). Self-training is one of the most powerful ways to achieve domain adaptation, since it enables class-wise domain adaptation. Unfortunately, a naive approach that uses pseudo-labels as ground truth degrades performance due to incorrect pseudo-labels. In this paper, we introduce a weak self-training (WST) method and adversarial background score regularization (BSR) for domain adaptive one-stage object detection. WST diminishes the adverse effects of inaccurate pseudo-labels to stabilize the learning procedure. BSR helps the network extract discriminative features for target backgrounds to reduce the domain shift. The two components are complementary: BSR enhances discrimination between foregrounds and backgrounds, whereas WST strengthens class-wise discrimination. Experimental results show that our approach effectively improves the performance of one-stage object detection in the unsupervised domain adaptation setting.
Tasks Domain Adaptation, Object Detection, Unsupervised Domain Adaptation
Published 2019-09-02
URL https://arxiv.org/abs/1909.00597v1
PDF https://arxiv.org/pdf/1909.00597v1.pdf
PWC https://paperswithcode.com/paper/self-training-and-adversarial-background
Repo
Framework
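The basic pseudo-labeling step underlying self-training can be sketched as simple confidence filtering. Note that this is only the naive baseline the abstract warns about; WST is specifically designed to temper the damage done by the mislabeled survivors of a filter like this. The function and data below are hypothetical.

```python
def select_pseudo_labels(detections, threshold=0.8):
    """Keep only high-confidence detections as training targets.

    detections: list of (box, class_id, score) tuples from the current model.
    """
    return [(box, cls) for box, cls, score in detections if score >= threshold]

dets = [((0, 0, 10, 10), 1, 0.95),   # confident detection: kept
        ((5, 5, 20, 20), 2, 0.40)]   # uncertain detection: discarded
print(select_pseudo_labels(dets))    # [((0, 0, 10, 10), 1)]
```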

Histogram Transform Ensembles for Large-scale Regression

Title Histogram Transform Ensembles for Large-scale Regression
Authors Hanyuan Hang, Zhouchen Lin, Xiaoyu Liu, Hongwei Wen
Abstract We propose a novel algorithm for large-scale regression problems named histogram transform ensembles (HTE), composed of random rotations, stretchings, and translations. First of all, we investigate the theoretical properties of HTE when the regression function lies in the Hölder space $C^{k,\alpha}$, $k \in \mathbb{N}_0$, $\alpha \in (0,1]$. In the case that $k=0, 1$, we adopt constant regressors and develop the naïve histogram transforms (NHT). Within the space $C^{0,\alpha}$, although almost optimal convergence rates can be derived for both single and ensemble NHT, we fail to show the benefits of ensembles over single estimators theoretically. In contrast, in the subspace $C^{1,\alpha}$, we prove that if $d \geq 2(1+\alpha)/\alpha$, the lower bound of the convergence rates for single NHT turns out to be worse than the upper bound of the convergence rates for ensemble NHT. In the other case, when $k \geq 2$, NHT may no longer be appropriate for predicting smoother regression functions. Instead, we apply kernel histogram transforms (KHT) equipped with smoother regressors such as support vector machines (SVMs), and it turns out that both single and ensemble KHT enjoy almost optimal convergence rates. We then validate the above theoretical results with numerical experiments. On the one hand, simulations are conducted to elucidate that ensemble NHT outperforms single NHT. On the other hand, the effects of bin sizes on the accuracy of both NHT and KHT also accord with the theoretical analysis. Last but not least, in real-data experiments, comparisons between the ensemble KHT, equipped with adaptive histogram transforms, and other state-of-the-art large-scale regression estimators verify the effectiveness and accuracy of our algorithm.
Tasks
Published 2019-12-08
URL https://arxiv.org/abs/1912.04738v1
PDF https://arxiv.org/pdf/1912.04738v1.pdf
PWC https://paperswithcode.com/paper/191204738
Repo
Framework
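A single naive histogram transform (NHT) in the sense above can be sketched compactly: randomly rotate, stretch, and translate the inputs, bin them on a grid, and predict the bin-mean response; the ensemble averages several such estimators. All constants below are illustrative, and this omits the paper's kernel and adaptive variants.

```python
import numpy as np

def fit_nht(X, y, bin_width=0.3, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # random rotation
    s = rng.uniform(0.8, 1.2, size=d)              # random stretching
    b = rng.uniform(0, bin_width, size=d)          # random translation
    keys = np.floor(((X @ Q) * s + b) / bin_width)
    bins = {}
    for key, yi in zip(map(tuple, keys), y):
        bins.setdefault(key, []).append(yi)
    means = {key: float(np.mean(v)) for key, v in bins.items()}
    default = float(np.mean(y))                    # fallback for empty bins

    def predict(Xq):
        kq = np.floor(((Xq @ Q) * s + b) / bin_width)
        return np.array([means.get(tuple(k), default) for k in kq])
    return predict

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(2000, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2
ensemble = [fit_nht(X, y, seed=s) for s in range(10)]  # 10 random transforms
X_test = rng.uniform(-1, 1, size=(5, 2))
print(np.mean([p(X_test) for p in ensemble], axis=0))  # averaged prediction
```

Averaging over independently drawn transforms smooths out the piecewise-constant artifacts of any single grid, which is where the paper locates the provable benefit of ensembles in $C^{1,\alpha}$.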

Learning data representation using modified autoencoder for the integrative analysis of multi-omics data

Title Learning data representation using modified autoencoder for the integrative analysis of multi-omics data
Authors Tianwei Yu
Abstract In integrative analyses of omics data, it is often of interest to extract data embedding from one data type that best reflect relations with another data type. This task is traditionally fulfilled by linear methods such as canonical correlation and partial least squares. However, information contained in one data type pertaining to the other data type may not be in the linear form. Deep learning provides a convenient alternative to extract nonlinear information. Here we develop a method Autoencoder-based Integrative Multi-omics data Embedding (AIME) to extract such information. Using a real gene expression - methylation dataset, we show that AIME extracted meaningful information that the linear approach could not find. The R implementation is available at http://web1.sph.emory.edu/users/tyu8/AIME/.
Tasks
Published 2019-06-18
URL https://arxiv.org/abs/1906.07800v1
PDF https://arxiv.org/pdf/1906.07800v1.pdf
PWC https://paperswithcode.com/paper/learning-data-representation-using-modified
Repo
Framework

Time series classification for varying length series

Title Time series classification for varying length series
Authors Chang Wei Tan, Francois Petitjean, Eamonn Keogh, Geoffrey I. Webb
Abstract Research into time series classification has tended to focus on the case of series of uniform length. However, it is common for real-world time series data to have unequal lengths. Differing time series lengths may arise from a number of fundamentally different mechanisms. In this work, we identify and evaluate two classes of such mechanisms – variations in sampling rate relative to the relevant signal and variations between the start and end points of one time series relative to one another. We investigate how time series generated by each of these classes of mechanism are best addressed for time series classification. We perform extensive experiments and provide practical recommendations on how variations in length should be handled in time series classification.
Tasks Time Series, Time Series Classification
Published 2019-10-10
URL https://arxiv.org/abs/1910.04341v1
PDF https://arxiv.org/pdf/1910.04341v1.pdf
PWC https://paperswithcode.com/paper/time-series-classification-for-varying-length
Repo
Framework
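The two mechanisms identified above suggest two different normalizations, sketched below: resampling to a common length when lengths reflect differing sampling rates, and padding when series share a start point but stop at different times. These helpers are illustrative assumptions, not the paper's recommended pipeline.

```python
import numpy as np

def resample(series, length):
    """Linearly resample a 1-D series to a fixed length (sampling-rate case)."""
    old = np.linspace(0.0, 1.0, num=len(series))
    new = np.linspace(0.0, 1.0, num=length)
    return np.interp(new, old, series)

def pad_to(series, length, value=0.0):
    """End-pad a 1-D series to a fixed length (differing end-point case)."""
    out = np.full(length, value)
    out[:len(series)] = series
    return out

a = np.array([0.0, 1.0, 0.0])   # short, coarsely sampled series
print(resample(a, 5))            # [0.  0.5 1.  0.5 0. ]
print(pad_to(a, 5))              # [0. 1. 0. 0. 0.]
```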

PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable

Title PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable
Authors Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
Abstract Pre-trained models have proven effective for a wide range of natural language processing tasks. Inspired by this, we propose a novel dialogue generation pre-training framework to support various kinds of conversations, including chit-chat, knowledge-grounded dialogue, and conversational question answering. In this framework, we adopt flexible attention mechanisms to fully leverage the bi-directional context and the uni-directional characteristic of language generation. We also introduce discrete latent variables to tackle the inherent one-to-many mapping problem in response generation. Two reciprocal tasks, response generation and latent act recognition, are designed and carried out simultaneously within a shared network. Comprehensive experiments on three publicly available datasets verify the effectiveness and superiority of the proposed framework.
Tasks Dialogue Generation, Question Answering, Text Generation
Published 2019-10-17
URL https://arxiv.org/abs/1910.07931v2
PDF https://arxiv.org/pdf/1910.07931v2.pdf
PWC https://paperswithcode.com/paper/plato-pre-trained-dialogue-generation-model
Repo
Framework