Paper Group ANR 307
MM Algorithms for Distance Covariance based Sufficient Dimension Reduction and Sufficient Variable Selection. Annealing for Distributed Global Optimization. Langevin Monte Carlo without smoothness. Saliency Methods for Explaining Adversarial Attacks. TiK-means: $K$-means clustering for skewed groups. A Machine Learning Model for Long-Term Power Gen …
MM Algorithms for Distance Covariance based Sufficient Dimension Reduction and Sufficient Variable Selection
Title | MM Algorithms for Distance Covariance based Sufficient Dimension Reduction and Sufficient Variable Selection |
Authors | Runxiong Wu, Xin Chen |
Abstract | Sufficient dimension reduction (SDR) using distance covariance (DCOV) was recently proposed as an approach to dimension-reduction problems. Compared with other SDR methods, it is model-free, requires no estimation of a link function, and imposes no particular distributional assumptions on the predictors (see Sheng and Yin, 2013, 2016). However, the DCOV-based SDR method involves optimizing a nonsmooth and nonconvex objective function over the Stiefel manifold. To tackle this numerical challenge, we equivalently reformulate the original objective function as a DC (difference of convex functions) program and construct an iterative algorithm based on the majorization-minimization (MM) principle. At each step of the MM algorithm, we inexactly solve the quadratic subproblem on the Stiefel manifold by taking one iteration of Riemannian Newton’s method. The algorithm can also be readily extended to sufficient variable selection (SVS) using distance covariance. We establish the convergence of the proposed algorithm under some regularity conditions. Simulation studies show that our algorithm drastically improves computational efficiency and is robust across various settings compared with the existing method. Supplemental materials for this article are available. |
Tasks | Dimensionality Reduction |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06342v1 |
https://arxiv.org/pdf/1912.06342v1.pdf | |
PWC | https://paperswithcode.com/paper/mm-algorithms-for-distance-covariance-based |
Repo | |
Framework | |
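The abstract's key computational move is the MM step for a DC objective: the convex function $h$ is linearized at the current iterate, which majorizes $-h$, and the resulting convex surrogate is minimized. The sketch below illustrates that generic mechanism on a toy unconstrained DC problem in NumPy; it is not the paper's Stiefel-manifold algorithm, and the toy objective and tolerance are made up for illustration.

```python
import numpy as np

# Toy DC program: minimize f(x) = g(x) - h(x) with
#   g(x) = 0.5*||Ax - b||^2  (convex),  h(x) = lam*||x||_1  (convex).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)
lam = 0.1

def f(x):
    return 0.5 * np.sum((A @ x - b) ** 2) - lam * np.sum(np.abs(x))

x = np.zeros(10)
for _ in range(100):
    s = lam * np.sign(x)                 # subgradient of h at the current iterate
    # MM step: since h is convex, g(x) - s @ x majorizes f up to a constant;
    # its minimizer solves the normal equations below in closed form.
    x_new = np.linalg.solve(A.T @ A, A.T @ b + s)
    if np.linalg.norm(x_new - x) < 1e-8:
        break
    x = x_new
print("f at the MM fixed point:", f(x))
```

Each surrogate minimization decreases $f$, which is the monotonicity property the paper's convergence analysis builds on.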
Annealing for Distributed Global Optimization
Title | Annealing for Distributed Global Optimization |
Authors | Brian Swenson, Soummya Kar, H. Vincent Poor, José M. F. Moura |
Abstract | The paper proves convergence to global optima for a class of distributed algorithms for nonconvex optimization in network-based multi-agent settings. Agents are permitted to communicate over a time-varying undirected graph. Each agent is assumed to possess a local objective function (assumed to be smooth, but possibly nonconvex). The paper considers algorithms for optimizing the sum function. A distributed algorithm of the consensus+innovations type is proposed which relies on first-order information at the agent level. Under appropriate conditions on network connectivity and the cost objective, convergence to the set of global optima is achieved by an annealing-type approach, with decaying Gaussian noise independently added into each agent’s update step. It is shown that the proposed algorithm converges in probability to the set of global minima of the sum function. |
Tasks | |
Published | 2019-03-18 |
URL | http://arxiv.org/abs/1903.07258v1 |
http://arxiv.org/pdf/1903.07258v1.pdf | |
PWC | https://paperswithcode.com/paper/annealing-for-distributed-global-optimization |
Repo | |
Framework | |
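A minimal sketch of the consensus+innovations update the abstract describes: each agent averages toward its neighbors on a graph, takes a local gradient step, and adds Gaussian noise whose scale decays over time. The ring graph, local costs, and decay schedules below are illustrative assumptions, not the paper's conditions or rates.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6                                   # agents on a ring graph
a = np.linspace(-2.0, 2.0, n)           # offsets of the local costs

def grad_f(i, x):
    # Local nonconvex cost: f_i(x) = (x - a_i)^2 + sin(3x)
    return 2.0 * (x - a[i]) + 3.0 * np.cos(3.0 * x)

x = rng.standard_normal(n)              # each agent holds a scalar state
for t in range(1, 20001):
    beta = 0.3 / t ** 0.51              # consensus weight (illustrative decay)
    alpha = 0.3 / t                     # innovation (gradient) weight
    gamma = 0.3 / (t ** 0.5 * np.log(t + 1.0))   # annealing noise scale
    x_new = x.copy()
    for i in range(n):
        consensus = (x[i] - x[(i - 1) % n]) + (x[i] - x[(i + 1) % n])
        x_new[i] = (x[i] - beta * consensus - alpha * grad_f(i, x[i])
                    + gamma * rng.standard_normal())
    x = x_new
print("agent states (should roughly agree):", np.round(x, 3))
```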
Langevin Monte Carlo without smoothness
Title | Langevin Monte Carlo without smoothness |
Authors | Niladri S. Chatterji, Jelena Diakonikolas, Michael I. Jordan, Peter L. Bartlett |
Abstract | Langevin Monte Carlo (LMC) is an iterative algorithm used to generate samples from a distribution that is known only up to a normalizing constant. The nonasymptotic dependence of its mixing time on the dimension and target accuracy is understood mainly in the setting of smooth (gradient-Lipschitz) log-densities, a serious limitation for applications in machine learning. In this paper, we remove this limitation, providing polynomial-time convergence guarantees for a variant of LMC in the setting of nonsmooth log-concave distributions. At a high level, our results follow by leveraging the implicit smoothing of the log-density that comes from a small Gaussian perturbation that we add to the iterates of the algorithm and controlling the bias and variance that are induced by this perturbation. |
Tasks | |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13285v3 |
https://arxiv.org/pdf/1905.13285v3.pdf | |
PWC | https://paperswithcode.com/paper/langevin-monte-carlo-without-smoothness |
Repo | |
Framework | |
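Read schematically, the abstract's variant of LMC evaluates a subgradient of the nonsmooth potential at a Gaussian-perturbed copy of the iterate, which implicitly smooths the log-density. A toy version for the Laplace target $U(x)=|x|$ might look as follows; the step size and perturbation scale are placeholders, not the paper's tuned choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def sub_grad_U(x):
    # Nonsmooth log-concave target: U(x) = |x|, i.e. a Laplace(0, 1) density
    return np.sign(x)

eta, sigma = 1e-2, 1e-1        # step size and smoothing scale (placeholders)
x, samples = 0.0, []
for k in range(50000):
    y = x + sigma * rng.standard_normal()        # Gaussian-perturbed iterate
    x = x - eta * sub_grad_U(y) + np.sqrt(2 * eta) * rng.standard_normal()
    if k >= 5000:                                # discard burn-in
        samples.append(x)
print("sample mean (target 0):", np.mean(samples))
print("sample variance (target 2):", np.var(samples))
```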
Saliency Methods for Explaining Adversarial Attacks
Title | Saliency Methods for Explaining Adversarial Attacks |
Authors | Jindong Gu, Volker Tresp |
Abstract | The classification decisions of neural networks can be misled by small imperceptible perturbations. This work aims to explain the misled classifications using saliency methods. The idea behind saliency methods is to explain the classification decisions of neural networks by creating so-called saliency maps. Unfortunately, a number of recent publications have shown that many of the proposed saliency methods do not provide insightful explanations. A prominent example is Guided Backpropagation (GuidedBP), which simply performs (partial) image recovery. However, our numerical analysis shows that the saliency maps created by GuidedBP do indeed contain class-discriminative information. We propose a simple and efficient way to enhance the saliency maps. The proposed enhanced GuidedBP achieves state-of-the-art performance in explaining adversarial classifications. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08413v4 |
https://arxiv.org/pdf/1908.08413v4.pdf | |
PWC | https://paperswithcode.com/paper/saliency-methods-for-explaining-adversarial |
Repo | |
Framework | |
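For context, vanilla Guided Backpropagation replaces the ReLU backward pass so that only positive gradients flow through positive activations. A minimal PyTorch rendering of that mechanism is sketched below; the paper's proposed enhancement is not reproduced here, and the tiny network is a stand-in.

```python
import torch
import torch.nn as nn

class GuidedReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Guided rule: pass gradient only where both the forward input
        # and the incoming gradient are positive.
        return grad_out * (x > 0) * (grad_out > 0)

conv = nn.Conv2d(3, 8, 3, padding=1)
head = nn.Linear(8 * 32 * 32, 10)

img = torch.randn(1, 3, 32, 32, requires_grad=True)
logits = head(GuidedReLU.apply(conv(img)).flatten(1))
logits[0, logits.argmax()].backward()       # backprop the top class score
saliency = img.grad.abs().max(dim=1)[0]     # collapse channels to a map
print(saliency.shape)                       # torch.Size([1, 32, 32])
```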
TiK-means: $K$-means clustering for skewed groups
Title | TiK-means: $K$-means clustering for skewed groups |
Authors | Nicholas S. Berry, Ranjan Maitra |
Abstract | The $K$-means algorithm is extended to allow for partitioning of skewed groups. The proposed algorithm, called TiK-Means, is a $K$-means-type method that assigns observations to groups while estimating their skewness-transformation parameters. The resulting groups and transformation reveal general-structured clusters that can be explained by inverting the estimated transformation. Further, a modification of the jump statistic is used to choose the number of groups. Our algorithm is evaluated on simulated and real-life datasets and then applied to a long-standing astronomical dispute regarding the distinct kinds of gamma-ray bursts. |
Tasks | |
Published | 2019-04-21 |
URL | http://arxiv.org/abs/1904.09609v1 |
http://arxiv.org/pdf/1904.09609v1.pdf | |
PWC | https://paperswithcode.com/paper/tik-means-k-means-clustering-for-skewed |
Repo | |
Framework | |
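A toy alternation in the spirit of the abstract: cluster Yeo-Johnson-transformed data with $K$-means and grid-search the transformation parameter that minimizes within-cluster scatter on standardized data. The actual TiK-Means estimation and model selection are in the paper; everything below is a simplified one-dimensional illustration.

```python
import numpy as np
from scipy.stats import yeojohnson

rng = np.random.default_rng(3)
X = np.concatenate([rng.gamma(2.0, 2.0, 150),          # two skewed 1-D groups
                    10.0 + rng.gamma(2.0, 3.0, 150)])

def kmeans_1d(z, k=2, iters=25):
    centers = np.quantile(z, np.linspace(0.1, 0.9, k))
    labels = np.zeros(len(z), dtype=int)
    for _ in range(iters):
        labels = np.argmin(np.abs(z[:, None] - centers[None, :]), axis=1)
        centers = np.array([z[labels == j].mean() if np.any(labels == j)
                            else centers[j] for j in range(k)])
    wss = sum(((z[labels == j] - centers[j]) ** 2).sum() for j in range(k))
    return labels, wss

def transform(x, lam):
    z = yeojohnson(x, lmbda=lam)
    return (z - z.mean()) / z.std()     # standardize so WSS values compare

results = [(kmeans_1d(transform(X, lam))[1], lam)
           for lam in np.linspace(-1.0, 2.0, 13)]
wss, lam = min(results, key=lambda t: t[0])
print(f"chosen lambda={lam:.2f}, within-cluster SS={wss:.1f}")
```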
A Machine Learning Model for Long-Term Power Generation Forecasting at Bidding Zone Level
Title | A Machine Learning Model for Long-Term Power Generation Forecasting at Bidding Zone Level |
Authors | Michela Moschella, Mauro Tucci, Emanuele Crisostomi, Alessandro Betti |
Abstract | The increasing penetration of energy generation from renewable sources demands more accurate and reliable forecasting tools to support classic power grid operations (e.g., unit commitment, electricity market clearing or maintenance planning). For this purpose, many physical models have been employed; more recently, statistical and machine learning algorithms, and data-driven methods in general, have become the subject of intense research. While the power research community generally focuses on power forecasting for single plants over short time horizons, here we are interested in aggregated macro-area power generation (i.e., over a territory larger than 100000 km^2) with a forecasting horizon of up to 15 days ahead. Real data are used to validate the proposed forecasting methodology on a test set of several months. |
Tasks | |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03276v1 |
https://arxiv.org/pdf/1910.03276v1.pdf | |
PWC | https://paperswithcode.com/paper/a-machine-learning-model-for-long-term-power |
Repo | |
Framework | |
A Semantics-Guided Class Imbalance Learning Model for Zero-Shot Classification
Title | A Semantics-Guided Class Imbalance Learning Model for Zero-Shot Classification |
Authors | Zhong Ji, Xuejie Yu, Yunlong Yu, Yanwei Pang, Zhongfei Zhang |
Abstract | Zero-Shot Classification (ZSC) equips the learned model with the ability to recognize visual instances from novel classes by constructing interactions between the visual and semantic modalities. In contrast to traditional image classification, ZSC easily suffers from the class-imbalance issue, since it is more concerned with class-level knowledge transfer. In the real world, class samples follow a long-tailed distribution, and the discriminative information in the sample-scarce seen classes is hard to transfer to the related unseen classes under traditional batch-based training, which substantially degrades the overall generalization ability. To alleviate the class-imbalance issue in ZSC, we propose a sample-balanced training process to encourage all training classes to contribute equally to the learned model. Specifically, we randomly select the same number of images from each class across all training classes to form a training batch, ensuring that the sample-scarce classes contribute as much as the sample-rich classes during each iteration. Considering that instances from the same class differ in class representativeness, we further develop an efficient semantics-guided feature fusion model that obtains discriminative class visual prototypes for the subsequent visual-semantic interaction by assigning different weights to the selected samples based on their class representativeness. Extensive experiments on three imbalanced ZSC benchmark datasets for both the Traditional ZSC (TZSC) and the Generalized ZSC (GZSC) tasks demonstrate that our approach achieves promising results, especially for unseen categories that are closely related to the sample-scarce seen categories. |
Tasks | Image Classification, Transfer Learning, Zero-Shot Learning |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1908.09745v1 |
https://arxiv.org/pdf/1908.09745v1.pdf | |
PWC | https://paperswithcode.com/paper/a-semantics-guided-class-imbalance-learning |
Repo | |
Framework | |
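The sample-balanced batching step is concrete enough to sketch: draw the same number of indices from every class (with replacement, so sample-scarce classes can fill their quota). This is a hedged reading of the abstract, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(4)

def balanced_batch(labels, per_class, num_classes):
    """Indices of a batch with exactly `per_class` samples from each class."""
    idx = []
    for c in range(num_classes):
        pool = np.flatnonzero(labels == c)
        # Sampling with replacement lets classes smaller than per_class keep up.
        idx.extend(rng.choice(pool, size=per_class, replace=True))
    return rng.permutation(np.array(idx))

labels = np.array([0] * 500 + [1] * 40 + [2] * 8)   # long-tailed toy labels
batch = balanced_batch(labels, per_class=4, num_classes=3)
print(labels[batch])                                # four samples per class
```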
Localized Adversarial Training for Increased Accuracy and Robustness in Image Classification
Title | Localized Adversarial Training for Increased Accuracy and Robustness in Image Classification |
Authors | Eitan Rothberg, Tingting Chen, Luo Jie, Hao Ji |
Abstract | Today’s state-of-the-art image classifiers fail to correctly classify carefully manipulated adversarial images. In this work, we develop a new, localized adversarial attack that generates adversarial examples by imperceptibly altering the backgrounds of normal images. We first use this attack to highlight the unnecessary sensitivity of neural networks to changes in the background of an image, then use it as part of a new training technique: localized adversarial training. By including locally adversarial images in the training set, we are able to create a classifier that suffers less loss than a non-adversarially trained counterpart model on both natural and adversarial inputs. The evaluation of our localized adversarial training algorithm on MNIST and CIFAR-10 datasets shows decreased accuracy loss on natural images, and increased robustness against adversarial inputs. |
Tasks | Adversarial Attack, Image Classification |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04779v1 |
https://arxiv.org/pdf/1909.04779v1.pdf | |
PWC | https://paperswithcode.com/paper/localized-adversarial-training-for-increased |
Repo | |
Framework | |
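One plausible rendering of a localized attack as described: projected gradient ascent on the classification loss with the perturbation confined to a background mask. The PyTorch sketch below assumes an L-infinity budget and a given binary mask; the paper's exact attack construction may differ.

```python
import torch
import torch.nn.functional as F

def localized_pgd(model, x, y, mask, eps=8 / 255, alpha=2 / 255, steps=10):
    """mask: 1 on (background) pixels that may be perturbed, 0 elsewhere."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta * mask), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()   # ascend the loss
            delta.clamp_(-eps, eps)              # project onto the L-inf ball
            delta.grad.zero_()
    return (x + delta.detach() * mask).clamp(0.0, 1.0)

# Usage (hypothetical names): x_adv = localized_pgd(net, images, targets, bg_mask)
```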
An Efficient Algorithm for Multiple-Pursuer-Multiple-Evader Pursuit/Evasion Game
Title | An Efficient Algorithm for Multiple-Pursuer-Multiple-Evader Pursuit/Evasion Game |
Authors | Joshua R. Bertram, Peng Wei |
Abstract | We present a method for pursuit/evasion that is highly efficient and scales to large teams of aircraft. The underlying algorithm is an efficient algorithm for solving Markov Decision Processes (MDPs) that supports fully continuous state spaces. We demonstrate the algorithm in a team pursuit/evasion setting in a 3D environment using a pseudo-6DOF model and study performance by varying team sizes. We show that as the number of aircraft in the simulation grows, computational performance remains efficient and is suitable for real-time systems. We also define probability-to-win and survivability metrics that describe the teams’ performance over multiple trials, and show that the algorithm performs consistently. We provide numerical results showing control inputs for a typical 1v1 encounter and provide videos for 1v1, 2v2, 3v3, 4v4, and 10v10 contests to demonstrate the ability of the algorithm to adapt seamlessly to complex environments. |
Tasks | |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.04171v1 |
https://arxiv.org/pdf/1909.04171v1.pdf | |
PWC | https://paperswithcode.com/paper/an-efficient-algorithm-for-multiple-pursuer |
Repo | |
Framework | |
Bayesian Nonparametrics for Non-exhaustive Learning
Title | Bayesian Nonparametrics for Non-exhaustive Learning |
Authors | Yicheng Cheng, Bartek Rajwa, Murat Dundar |
Abstract | Non-exhaustive learning (NEL) is an emerging machine-learning paradigm designed to confront the challenge of non-stationary environments characterized by non-exhaustive training sets lacking full information about the available classes. Unlike traditional supervised learning that relies on fixed models, NEL utilizes self-adjusting machine learning to better accommodate the non-stationary nature of real-world problems, which is at the root of many recently discovered limitations of deep learning. Some of these hurdles led to a surge of interest in several research areas relevant to NEL, such as open set classification and zero-shot learning. The presented study, motivated by two important applications, proposes a NEL algorithm built on a highly flexible, doubly non-parametric Bayesian Gaussian mixture model that can grow arbitrarily large in terms of the number of classes and their components. We report several experiments that demonstrate the promising performance of the introduced model for NEL. |
Tasks | Zero-Shot Learning |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1908.09736v1 |
https://arxiv.org/pdf/1908.09736v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-nonparametrics-for-non-exhaustive |
Repo | |
Framework | |
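The "grow arbitrarily large in the number of classes" behavior is the hallmark of nonparametric mixtures. As a tiny illustration, a Chinese-restaurant-process draw keeps opening new classes as data arrive; this shows only the class-count prior, not the paper's full doubly non-parametric model.

```python
import numpy as np

rng = np.random.default_rng(5)
alpha, counts = 1.0, []                 # concentration, current class sizes
for _ in range(200):
    probs = np.array(counts + [alpha], dtype=float)
    probs /= probs.sum()                # seat proportional to size, or open new
    k = rng.choice(len(probs), p=probs)
    if k == len(counts):
        counts.append(1)                # a brand-new class is created
    else:
        counts[k] += 1
print("classes discovered:", len(counts), "sizes:", counts)
```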
From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction
Title | From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction |
Authors | Hidenori Tanaka, Aran Nayebi, Niru Maheswaranathan, Lane McIntosh, Stephen A. Baccus, Surya Ganguli |
Abstract | Recently, deep feedforward neural networks have achieved considerable success in modeling biological sensory processing, in terms of reproducing the input-output map of sensory neurons. However, such models raise profound questions about the very nature of explanation in neuroscience. Are we simply replacing one complex system (a biological circuit) with another (a deep network), without understanding either? Moreover, beyond neural representations, are the deep network’s computational mechanisms for generating neural responses the same as those in the brain? Without a systematic approach to extracting and understanding computational mechanisms from deep neural network models, it can be difficult both to assess the degree of utility of deep learning approaches in neuroscience, and to extract experimentally testable hypotheses from deep networks. We develop such a systematic approach by combining dimensionality reduction and modern attribution methods for determining the relative importance of interneurons for specific visual computations. We apply this approach to deep network models of the retina, revealing a conceptual understanding of how the retina acts as a predictive feature extractor that signals deviations from expectations for diverse spatiotemporal stimuli. For each stimulus, our extracted computational mechanisms are consistent with prior scientific literature, and in one case yield a new mechanistic hypothesis. Thus overall, this work not only yields insights into the computational mechanisms underlying the striking predictive capabilities of the retina, but also places the framework of deep networks as neuroscientific models on firmer theoretical foundations, by providing a new roadmap to go beyond comparing neural representations to extracting and understanding computational mechanisms. |
Tasks | Dimensionality Reduction |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.06207v1 |
https://arxiv.org/pdf/1912.06207v1.pdf | |
PWC | https://paperswithcode.com/paper/from-deep-learning-to-mechanistic-1 |
Repo | |
Framework | |
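The attribution half of the pipeline can be sketched in miniature: score each hidden unit ("interneuron") of a toy model by gradient-times-activation with respect to the output. The two-layer model and this particular attribution rule are simplifying assumptions, not the paper's retina models or exact method.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net_in = nn.Linear(40, 8)               # stimulus -> 8 model "interneurons"
net_out = nn.Linear(8, 1)               # interneurons -> predicted response

x = torch.randn(1, 40)                  # a toy stimulus
h = torch.tanh(net_in(x))               # interneuron activations
h.retain_grad()                         # keep gradients at the hidden layer
net_out(h).sum().backward()
importance = (h.grad * h).abs().squeeze()   # gradient-times-activation score
print(importance)                           # one score per interneuron
```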
Transferable Contrastive Network for Generalized Zero-Shot Learning
Title | Transferable Contrastive Network for Generalized Zero-Shot Learning |
Authors | Huajie Jiang, Ruiping Wang, Shiguang Shan, Xilin Chen |
Abstract | Zero-shot learning (ZSL) is a challenging problem that aims to recognize target categories without seen data, where semantic information is leveraged to transfer knowledge from source classes. Although ZSL has made great progress in recent years, most existing approaches easily overfit the source classes in the generalized zero-shot learning (GZSL) task, which indicates that they learn little knowledge about the target classes. To tackle this problem, we propose a novel Transferable Contrastive Network (TCN) that explicitly transfers knowledge from the source classes to the target classes. It automatically contrasts one image with different classes to judge whether they are consistent or not. By exploiting class similarities to transfer knowledge from source images to similar target classes, our approach recognizes target images more robustly. Experiments on five benchmark datasets show the superiority of our approach for GZSL. |
Tasks | Transfer Learning, Zero-Shot Learning |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.05832v1 |
https://arxiv.org/pdf/1908.05832v1.pdf | |
PWC | https://paperswithcode.com/paper/transferable-contrastive-network-for |
Repo | |
Framework | |
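The contrastive judgment the abstract describes can be pictured as a small network scoring the consistency of an (image feature, class embedding) pair. The sketch below uses placeholder dimensions and a plain binary cross-entropy contrast; the paper's class-similarity-weighted transfer is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Compat(nn.Module):
    """Scores whether an image feature and a class embedding are consistent."""
    def __init__(self, dv=2048, ds=300, dh=512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dv + ds, dh), nn.ReLU(),
                                 nn.Linear(dh, 1))

    def forward(self, v, s):
        return self.net(torch.cat([v, s], dim=-1)).squeeze(-1)

model = Compat()
v = torch.randn(8, 2048)                # visual features for a batch
s_pos = torch.randn(8, 300)             # embeddings of the true classes
s_neg = torch.randn(8, 300)             # embeddings of contrasted classes
loss = (F.binary_cross_entropy_with_logits(model(v, s_pos), torch.ones(8))
        + F.binary_cross_entropy_with_logits(model(v, s_neg), torch.zeros(8)))
loss.backward()
```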
Generalised Zero-Shot Learning with Domain Classification in a Joint Semantic and Visual Space
Title | Generalised Zero-Shot Learning with Domain Classification in a Joint Semantic and Visual Space |
Authors | Rafael Felix, Ben Harwood, Michele Sasdelli, Gustavo Carneiro |
Abstract | Generalised zero-shot learning (GZSL) is a classification problem where the learning stage relies on a set of seen visual classes and the inference stage aims to identify both the seen visual classes and a new set of unseen visual classes. Critically, both the learning and inference stages can leverage a semantic representation that is available for the seen and unseen classes. Most state-of-the-art GZSL approaches rely on a mapping between latent visual and semantic spaces without considering whether a particular sample belongs to the set of seen or unseen classes. In this paper, we propose a novel GZSL method that learns a joint latent representation that combines both visual and semantic information. This mitigates the need for learning a mapping between the two spaces. Our method also introduces a domain classifier that estimates whether a sample belongs to a seen or an unseen class. Our classifier then combines a class discriminator with this domain classifier with the goal of reducing the natural bias that GZSL approaches have toward the seen classes. Experiments show that our method achieves state-of-the-art results in terms of harmonic mean, the area under the seen and unseen curve and unseen classification accuracy on public GZSL benchmark data sets. Our code will be available upon acceptance of this paper. |
Tasks | Zero-Shot Learning |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.04930v1 |
https://arxiv.org/pdf/1908.04930v1.pdf | |
PWC | https://paperswithcode.com/paper/generalised-zero-shot-learning-with-domain |
Repo | |
Framework | |
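One natural way to combine a seen/unseen domain classifier with class discriminators, consistent with the abstract, is to gate the two sets of class scores by the estimated domain probability. The gating rule and shapes below are assumptions made for illustration.

```python
import torch

def combine(p_seen, p_unseen, p_domain_seen):
    """p_domain_seen: (B, 1) probability that the sample is from a seen class."""
    seen = p_domain_seen * p_seen                 # (B, S) gated seen scores
    unseen = (1.0 - p_domain_seen) * p_unseen     # (B, U) gated unseen scores
    return torch.cat([seen, unseen], dim=-1)      # scores over all S+U classes

scores = combine(torch.softmax(torch.randn(4, 10), dim=-1),
                 torch.softmax(torch.randn(4, 5), dim=-1),
                 torch.sigmoid(torch.randn(4, 1)))
print(scores.argmax(dim=-1))                      # predicted class per sample
```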
On the Privacy of dK-Random Graphs
Title | On the Privacy of dK-Random Graphs |
Authors | Sameera Horawalavithana, Adriana Iamnitchi |
Abstract | Real social network datasets provide significant benefits for understanding phenomena such as information diffusion or network evolution. Yet the privacy risks raised by sharing real graph datasets, even when stripped of user identity information, are significant. Previous research shows that many graph anonymization techniques fail against existing graph de-anonymization attacks. However, the specific reason for the success of such de-anonymization attacks is yet to be understood. This paper systematically studies the structural properties of real graphs that make them more vulnerable to machine learning-based techniques for de-anonymization. More precisely, we study the boundaries of anonymity based on the structural properties of real graph datasets in terms of how their dK-based anonymized versions resist (or fail to resist) various types of attacks. Our experimental results lead to three contributions. First, we identify the strength of an attacker based on the graph characteristics of the subset of nodes from which it starts the de-anonymization attack. Second, we quantify the relative effectiveness of dK-series for graph anonymization. And third, we identify the properties of the original graph that make it more vulnerable to de-anonymization. |
Tasks | |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.01695v1 |
https://arxiv.org/pdf/1907.01695v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-privacy-of-dk-random-graphs |
Repo | |
Framework | |
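As background, a dK-1 anonymization keeps only the degree sequence of the original graph and randomizes everything else; higher-order dK-2/dK-3 generators preserve progressively more structure and are closer to what the paper analyzes. A small networkx illustration of the dK-1 case, with a stand-in graph:

```python
import networkx as nx

G = nx.karate_club_graph()                        # stand-in "real" graph
degrees = [d for _, d in G.degree()]

anon = nx.configuration_model(degrees, seed=7)    # dK-1: exact degree sequence
assert sorted(d for _, d in anon.degree()) == sorted(degrees)

simple = nx.Graph(anon)                           # a simple-graph view loses a
simple.remove_edges_from(nx.selfloop_edges(simple))  # few multi/self-loop edges
print(G.number_of_edges(), simple.number_of_edges())
```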
Network Randomization: A Simple Technique for Generalization in Deep Reinforcement Learning
Title | Network Randomization: A Simple Technique for Generalization in Deep Reinforcement Learning |
Authors | Kimin Lee, Kibok Lee, Jinwoo Shin, Honglak Lee |
Abstract | Deep reinforcement learning (RL) agents often fail to generalize to unseen environments (even ones semantically similar to the training environments), particularly when they are trained on high-dimensional state spaces, such as images. In this paper, we propose a simple technique to improve the generalization ability of deep RL agents by introducing a randomized (convolutional) neural network that randomly perturbs input observations. It enables trained agents to adapt to new domains by learning robust features invariant across varied and randomized environments. Furthermore, we consider an inference method based on the Monte Carlo approximation to reduce the variance induced by this randomization. We demonstrate the superiority of our method across 2D CoinRun, 3D DeepMind Lab exploration and 3D robotics control tasks: it significantly outperforms various regularization and data augmentation methods for the same purpose. |
Tasks | Data Augmentation |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05396v3 |
https://arxiv.org/pdf/1910.05396v3.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-randomization-technique-for-1 |
Repo | |
Framework | |
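A minimal sketch of the randomization idea: pass observations through a freshly re-initialized convolution before the policy network, and average policy outputs over several random draws at inference time (the Monte Carlo approximation mentioned in the abstract). The initializer, shapes, and the tiny policy below are stand-ins, not the paper's architecture.

```python
import torch
import torch.nn as nn

rand_conv = nn.Conv2d(3, 3, kernel_size=3, padding=1, bias=False)
policy = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                       nn.Flatten(), nn.Linear(16 * 64 * 64, 4))

def randomize(obs):
    nn.init.xavier_normal_(rand_conv.weight)      # fresh random filters per call
    with torch.no_grad():
        return rand_conv(obs)

obs = torch.rand(1, 3, 64, 64)                    # a dummy image observation
# Monte Carlo inference: average the policy over several random perturbations.
logits = torch.stack([policy(randomize(obs)) for _ in range(10)]).mean(dim=0)
print(logits.argmax(dim=-1))                      # the selected action
```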