Paper Group ANR 307
MM Algorithms for Distance Covariance based Sufficient Dimension Reduction and Sufficient Variable Selection. Annealing for Distributed Global Optimization. Langevin Monte Carlo without smoothness. Saliency Methods for Explaining Adversarial Attacks. TiK-means: $K$-means clustering for skewed groups. A Machine Learning Model for Long-Term Power Gen …
MM Algorithms for Distance Covariance based Sufficient Dimension Reduction and Sufficient Variable Selection
Title | MM Algorithms for Distance Covariance based Sufficient Dimension Reduction and Sufficient Variable Selection |
Authors | Runxiong Wu, Xin Chen |
Abstract | Sufficient dimension reduction (SDR) using distance covariance (DCOV) was recently proposed as an approach to dimension-reduction problems. Compared with other SDR methods, it is model-free, requires no estimation of a link function, and imposes no particular distributional assumptions on the predictors (see Sheng and Yin, 2013, 2016). However, the DCOV-based SDR method involves optimizing a nonsmooth and nonconvex objective function over the Stiefel manifold. To tackle this numerical challenge, we equivalently reformulate the original objective function as a DC (difference of convex functions) program and construct an iterative algorithm based on the majorization-minimization (MM) principle. At each step of the MM algorithm, we inexactly solve the quadratic subproblem on the Stiefel manifold by taking one iteration of Riemannian Newton’s method. The algorithm can also be readily extended to sufficient variable selection (SVS) using distance covariance. We establish the convergence of the proposed algorithm under some regularity conditions. Simulation studies show that our algorithm drastically improves computational efficiency and is robust across various settings compared with the existing method. Supplemental materials for this article are available. |
Tasks | Dimensionality Reduction |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06342v1 |
https://arxiv.org/pdf/1912.06342v1.pdf | |
PWC | https://paperswithcode.com/paper/mm-algorithms-for-distance-covariance-based |
Repo | |
Framework | |
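The abstract's key computational move is the MM step for a DC objective: the convex function $h$ is linearized at the current iterate, which majorizes $-h$, and the resulting convex surrogate is minimized. The sketch below illustrates that generic mechanism on a toy unconstrained DC problem in NumPy; it is not the paper's Stiefel-manifold algorithm, and the toy objective and tolerance are made up for illustration.

```python
import numpy as np

# Toy DC program: minimize f(x) = g(x) - h(x) with
#   g(x) = 0.5*||Ax - b||^2  (convex),  h(x) = lam*||x||_1  (convex).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)
lam = 0.1

def f(x):
    return 0.5 * np.sum((A @ x - b) ** 2) - lam * np.sum(np.abs(x))

x = np.zeros(10)
for _ in range(100):
    s = lam * np.sign(x)                 # subgradient of h at the current iterate
    # MM step: since h is convex, g(x) - s @ x majorizes f up to a constant;
    # its minimizer solves the normal equations below in closed form.
    x_new = np.linalg.solve(A.T @ A, A.T @ b + s)
    if np.linalg.norm(x_new - x) < 1e-8:
        break
    x = x_new
print("f at the MM fixed point:", f(x))
```

Each surrogate minimization decreases $f$, which is the monotonicity property the paper's convergence analysis builds on.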
Annealing for Distributed Global Optimization
Title | Annealing for Distributed Global Optimization |
Authors | Brian Swenson, Soummya Kar, H. Vincent Poor, José M. F. Moura |
Abstract | The paper proves convergence to global optima for a class of distributed algorithms for nonconvex optimization in network-based multi-agent settings. Agents are permitted to communicate over a time-varying undirected graph. Each agent is assumed to possess a local objective function (assumed to be smooth, but possibly nonconvex). The paper considers algorithms for optimizing the sum function. A distributed algorithm of the consensus+innovations type is proposed which relies on first-order information at the agent level. Under appropriate conditions on network connectivity and the cost objective, convergence to the set of global optima is achieved by an annealing-type approach, with decaying Gaussian noise independently added into each agent’s update step. It is shown that the proposed algorithm converges in probability to the set of global minima of the sum function. |
Tasks | |
Published | 2019-03-18 |
URL | http://arxiv.org/abs/1903.07258v1 |
http://arxiv.org/pdf/1903.07258v1.pdf | |
PWC | https://paperswithcode.com/paper/annealing-for-distributed-global-optimization |
Repo | |
Framework | |
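A minimal sketch of the consensus+innovations update the abstract describes: each agent averages toward its neighbors on a graph, takes a local gradient step, and adds Gaussian noise whose scale decays over time. The ring graph, local costs, and decay schedules below are illustrative assumptions, not the paper's conditions or rates.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6                                   # agents on a ring graph
a = np.linspace(-2.0, 2.0, n)           # offsets of the local costs

def grad_f(i, x):
    # Local nonconvex cost: f_i(x) = (x - a_i)^2 + sin(3x)
    return 2.0 * (x - a[i]) + 3.0 * np.cos(3.0 * x)

x = rng.standard_normal(n)              # each agent holds a scalar state
for t in range(1, 20001):
    beta = 0.3 / t ** 0.51              # consensus weight (illustrative decay)
    alpha = 0.3 / t                     # innovation (gradient) weight
    gamma = 0.3 / (t ** 0.5 * np.log(t + 1.0))   # annealing noise scale
    x_new = x.copy()
    for i in range(n):
        consensus = (x[i] - x[(i - 1) % n]) + (x[i] - x[(i + 1) % n])
        x_new[i] = (x[i] - beta * consensus - alpha * grad_f(i, x[i])
                    + gamma * rng.standard_normal())
    x = x_new
print("agent states (should roughly agree):", np.round(x, 3))
```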
Langevin Monte Carlo without smoothness
Title | Langevin Monte Carlo without smoothness |
Authors | Niladri S. Chatterji, Jelena Diakonikolas, Michael I. Jordan, Peter L. Bartlett |
Abstract | Langevin Monte Carlo (LMC) is an iterative algorithm used to generate samples from a distribution that is known only up to a normalizing constant. The nonasymptotic dependence of its mixing time on the dimension and target accuracy is understood mainly in the setting of smooth (gradient-Lipschitz) log-densities, a serious limitation for applications in machine learning. In this paper, we remove this limitation, providing polynomial-time convergence guarantees for a variant of LMC in the setting of nonsmooth log-concave distributions. At a high level, our results follow by leveraging the implicit smoothing of the log-density that comes from a small Gaussian perturbation that we add to the iterates of the algorithm and controlling the bias and variance that are induced by this perturbation. |
Tasks | |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13285v3 |
https://arxiv.org/pdf/1905.13285v3.pdf | |
PWC | https://paperswithcode.com/paper/langevin-monte-carlo-without-smoothness |
Repo | |
Framework | |
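Read schematically, the abstract's variant of LMC evaluates a subgradient of the nonsmooth potential at a Gaussian-perturbed copy of the iterate, which implicitly smooths the log-density. A toy version for the Laplace target $U(x)=|x|$ might look as follows; the step size and perturbation scale are placeholders, not the paper's tuned choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def sub_grad_U(x):
    # Nonsmooth log-concave target: U(x) = |x|, i.e. a Laplace(0, 1) density
    return np.sign(x)

eta, sigma = 1e-2, 1e-1        # step size and smoothing scale (placeholders)
x, samples = 0.0, []
for k in range(50000):
    y = x + sigma * rng.standard_normal()        # Gaussian-perturbed iterate
    x = x - eta * sub_grad_U(y) + np.sqrt(2 * eta) * rng.standard_normal()
    if k >= 5000:                                # discard burn-in
        samples.append(x)
print("sample mean (target 0):", np.mean(samples))
print("sample variance (target 2):", np.var(samples))
```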
Saliency Methods for Explaining Adversarial Attacks
Title | Saliency Methods for Explaining Adversarial Attacks |
Authors | Jindong Gu, Volker Tresp |
Abstract | The classification decisions of neural networks can be misled by small imperceptible perturbations. This work aims to explain the misled classifications using saliency methods. The idea behind saliency methods is to explain the classification decisions of neural networks by creating so-called saliency maps. Unfortunately, a number of recent publications have shown that many of the proposed saliency methods do not provide insightful explanations. A prominent example is Guided Backpropagation (GuidedBP), which simply performs (partial) image recovery. However, our numerical analysis shows that the saliency maps created by GuidedBP do indeed contain class-discriminative information. We propose a simple and efficient way to enhance the saliency maps. The proposed enhanced GuidedBP achieves state-of-the-art performance in explaining adversarial classifications. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08413v4 |
https://arxiv.org/pdf/1908.08413v4.pdf | |
PWC | https://paperswithcode.com/paper/saliency-methods-for-explaining-adversarial |
Repo | |
Framework | |
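For context, vanilla Guided Backpropagation replaces the ReLU backward pass so that only positive gradients flow through positive activations. A minimal PyTorch rendering of that mechanism is sketched below; the paper's proposed enhancement is not reproduced here, and the tiny network is a stand-in.

```python
import torch
import torch.nn as nn

class GuidedReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Guided rule: pass gradient only where both the forward input
        # and the incoming gradient are positive.
        return grad_out * (x > 0) * (grad_out > 0)

conv = nn.Conv2d(3, 8, 3, padding=1)
head = nn.Linear(8 * 32 * 32, 10)

img = torch.randn(1, 3, 32, 32, requires_grad=True)
logits = head(GuidedReLU.apply(conv(img)).flatten(1))
logits[0, logits.argmax()].backward()       # backprop the top class score
saliency = img.grad.abs().max(dim=1)[0]     # collapse channels to a map
print(saliency.shape)                       # torch.Size([1, 32, 32])
```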
TiK-means: $K$-means clustering for skewed groups
Title | TiK-means: $K$-means clustering for skewed groups |
Authors | Nicholas S. Berry, Ranjan Maitra |
Abstract | The $K$-means algorithm is extended to allow for partitioning of skewed groups. The proposed algorithm, called TiK-Means, is a $K$-means-type method that assigns observations to groups while estimating their skewness-transformation parameters. The resulting groups and transformation reveal general-structured clusters that can be explained by inverting the estimated transformation. Further, a modification of the jump statistic is used to choose the number of groups. Our algorithm is evaluated on simulated and real-life datasets and then applied to a long-standing astronomical dispute regarding the distinct kinds of gamma-ray bursts. |
Tasks | |
Published | 2019-04-21 |
URL | http://arxiv.org/abs/1904.09609v1 |
http://arxiv.org/pdf/1904.09609v1.pdf | |
PWC | https://paperswithcode.com/paper/tik-means-k-means-clustering-for-skewed |
Repo | |
Framework | |
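A toy alternation in the spirit of the abstract: cluster Yeo-Johnson-transformed data with $K$-means and grid-search the transformation parameter that minimizes within-cluster scatter on standardized data. The actual TiK-Means estimation and model selection are in the paper; everything below is a simplified one-dimensional illustration.

```python
import numpy as np
from scipy.stats import yeojohnson

rng = np.random.default_rng(3)
X = np.concatenate([rng.gamma(2.0, 2.0, 150),          # two skewed 1-D groups
                    10.0 + rng.gamma(2.0, 3.0, 150)])

def kmeans_1d(z, k=2, iters=25):
    centers = np.quantile(z, np.linspace(0.1, 0.9, k))
    labels = np.zeros(len(z), dtype=int)
    for _ in range(iters):
        labels = np.argmin(np.abs(z[:, None] - centers[None, :]), axis=1)
        centers = np.array([z[labels == j].mean() if np.any(labels == j)
                            else centers[j] for j in range(k)])
    wss = sum(((z[labels == j] - centers[j]) ** 2).sum() for j in range(k))
    return labels, wss

def transform(x, lam):
    z = yeojohnson(x, lmbda=lam)
    return (z - z.mean()) / z.std()     # standardize so WSS values compare

results = [(kmeans_1d(transform(X, lam))[1], lam)
           for lam in np.linspace(-1.0, 2.0, 13)]
wss, lam = min(results, key=lambda t: t[0])
print(f"chosen lambda={lam:.2f}, within-cluster SS={wss:.1f}")
```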
A Machine Learning Model for Long-Term Power Generation Forecasting at Bidding Zone Level
Title | A Machine Learning Model for Long-Term Power Generation Forecasting at Bidding Zone Level |
Authors | Michela Moschella, Mauro Tucci, Emanuele Crisostomi, Alessandro Betti |
Abstract | The increasing penetration of energy generation from renewable sources demands more accurate and reliable forecasting tools to support classic power grid operations (e.g., unit commitment, electricity market clearing or maintenance planning). For this purpose, many physical models have been employed; more recently, statistical and machine learning algorithms, and data-driven methods in general, have become the subject of intense research. While the power research community generally focuses on power forecasting for single plants over short time horizons, here we are interested in aggregated macro-area power generation (i.e., over a territory larger than 100000 km^2) with a forecasting horizon of up to 15 days ahead. Real data are used to validate the proposed forecasting methodology on a test set of several months. |
Tasks | |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03276v1 |
https://arxiv.org/pdf/1910.03276v1.pdf | |
PWC | https://paperswithcode.com/paper/a-machine-learning-model-for-long-term-power |
Repo | |
Framework | |
A Semantics-Guided Class Imbalance Learning Model for Zero-Shot Classification
Title | A Semantics-Guided Class Imbalance Learning Model for Zero-Shot Classification |
Authors | Zhong Ji, Xuejie Yu, Yunlong Yu, Yanwei Pang, Zhongfei Zhang |
Abstract | Zero-Shot Classification (ZSC) equips the learned model with the ability to recognize visual instances from novel classes by constructing interactions between the visual and semantic modalities. In contrast to traditional image classification, ZSC easily suffers from the class-imbalance issue, since it is more concerned with class-level knowledge transfer. In the real world, class samples follow a long-tailed distribution, and the discriminative information in the sample-scarce seen classes is hard to transfer to the related unseen classes under traditional batch-based training, which substantially degrades the overall generalization ability. To alleviate the class-imbalance issue in ZSC, we propose a sample-balanced training process to encourage all training classes to contribute equally to the learned model. Specifically, we randomly select the same number of images from each class across all training classes to form a training batch, ensuring that the sample-scarce classes contribute as much as the sample-rich classes during each iteration. Considering that instances from the same class differ in class representativeness, we further develop an efficient semantics-guided feature fusion model that obtains discriminative class visual prototypes for the subsequent visual-semantic interaction by assigning different weights to the selected samples based on their class representativeness. Extensive experiments on three imbalanced ZSC benchmark datasets for both the Traditional ZSC (TZSC) and the Generalized ZSC (GZSC) tasks demonstrate that our approach achieves promising results, especially for unseen categories that are closely related to the sample-scarce seen categories. |
Tasks | Image Classification, Transfer Learning, Zero-Shot Learning |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1908.09745v1 |
https://arxiv.org/pdf/1908.09745v1.pdf | |
PWC | https://paperswithcode.com/paper/a-semantics-guided-class-imbalance-learning |
Repo | |
Framework | |
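The sample-balanced batching step is concrete enough to sketch: draw the same number of indices from every class (with replacement, so sample-scarce classes can fill their quota). This is a hedged reading of the abstract, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(4)

def balanced_batch(labels, per_class, num_classes):
    """Indices of a batch with exactly `per_class` samples from each class."""
    idx = []
    for c in range(num_classes):
        pool = np.flatnonzero(labels == c)
        # Sampling with replacement lets classes smaller than per_class keep up.
        idx.extend(rng.choice(pool, size=per_class, replace=True))
    return rng.permutation(np.array(idx))

labels = np.array([0] * 500 + [1] * 40 + [2] * 8)   # long-tailed toy labels
batch = balanced_batch(labels, per_class=4, num_classes=3)
print(labels[batch])                                # four samples per class
```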
Localized Adversarial Training for Increased Accuracy and Robustness in Image Classification
Title | Localized Adversarial Training for Increased Accuracy and Robustness in Image Classification |
Authors | Eitan Rothberg, Tingting Chen, Luo Jie, Hao Ji |
Abstract | Today’s state-of-the-art image classifiers fail to correctly classify carefully manipulated adversarial images. In this work, we develop a new, localized adversarial attack that generates adversarial examples by imperceptibly altering the backgrounds of normal images. We first use this attack to highlight the unnecessary sensitivity of neural networks to changes in the background of an image, then use it as part of a new training technique: localized adversarial training. By including locally adversarial images in the training set, we are able to create a classifier that suffers less loss than a non-adversarially trained counterpart model on both natural and adversarial inputs. The evaluation of our localized adversarial training algorithm on MNIST and CIFAR-10 datasets shows decreased accuracy loss on natural images, and increased robustness against adversarial inputs. |
Tasks | Adversarial Attack, Image Classification |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04779v1 |
https://arxiv.org/pdf/1909.04779v1.pdf | |
PWC | https://paperswithcode.com/paper/localized-adversarial-training-for-increased |
Repo | |
Framework | |
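One plausible rendering of a localized attack as described: projected gradient ascent on the classification loss with the perturbation confined to a background mask. The PyTorch sketch below assumes an L-infinity budget and a given binary mask; the paper's exact attack construction may differ.

```python
import torch
import torch.nn.functional as F

def localized_pgd(model, x, y, mask, eps=8 / 255, alpha=2 / 255, steps=10):
    """mask: 1 on (background) pixels that may be perturbed, 0 elsewhere."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta * mask), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()   # ascend the loss
            delta.clamp_(-eps, eps)              # project onto the L-inf ball
            delta.grad.zero_()
    return (x + delta.detach() * mask).clamp(0.0, 1.0)

# Usage (hypothetical names): x_adv = localized_pgd(net, images, targets, bg_mask)
```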
An Efficient Algorithm for Multiple-Pursuer-Multiple-Evader Pursuit/Evasion Game
Title | An Efficient Algorithm for Multiple-Pursuer-Multiple-Evader Pursuit/Evasion Game |
Authors | Joshua R. Bertram, Peng Wei |
Abstract | We present a method for pursuit/evasion that is highly efficient and scales to large teams of aircraft. The underlying algorithm is an efficient algorithm for solving Markov Decision Processes (MDPs) that supports fully continuous state spaces. We demonstrate the algorithm in a team pursuit/evasion setting in a 3D environment using a pseudo-6DOF model and study performance by varying team sizes. We show that as the number of aircraft in the simulation grows, computational performance remains efficient and is suitable for real-time systems. We also define probability-to-win and survivability metrics that describe the teams’ performance over multiple trials, and show that the algorithm performs consistently. We provide numerical results showing control inputs for a typical 1v1 encounter and provide videos for 1v1, 2v2, 3v3, 4v4, and 10v10 contests to demonstrate the ability of the algorithm to adapt seamlessly to complex environments. |
Tasks | |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.04171v1 |
https://arxiv.org/pdf/1909.04171v1.pdf | |
PWC | https://paperswithcode.com/paper/an-efficient-algorithm-for-multiple-pursuer |
Repo | |
Framework | |
Bayesian Nonparametrics for Non-exhaustive Learning
Title | Bayesian Nonparametrics for Non-exhaustive Learning |
Authors | Yicheng Cheng, Bartek Rajwa, Murat Dundar |
Abstract | Non-exhaustive learning (NEL) is an emerging machine-learning paradigm designed to confront the challenge of non-stationary environments characterized by non-exhaustive training sets lacking full information about the available classes. Unlike traditional supervised learning that relies on fixed models, NEL utilizes self-adjusting machine learning to better accommodate the non-stationary nature of real-world problems, which is at the root of many recently discovered limitations of deep learning. Some of these hurdles led to a surge of interest in several research areas relevant to NEL, such as open set classification and zero-shot learning. The presented study, motivated by two important applications, proposes a NEL algorithm built on a highly flexible, doubly non-parametric Bayesian Gaussian mixture model that can grow arbitrarily large in terms of the number of classes and their components. We report several experiments that demonstrate the promising performance of the introduced model for NEL. |
Tasks | Zero-Shot Learning |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1908.09736v1 |
https://arxiv.org/pdf/1908.09736v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-nonparametrics-for-non-exhaustive |
Repo | |
Framework | |
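The "grow arbitrarily large in the number of classes" behavior is the hallmark of nonparametric mixtures. As a tiny illustration, a Chinese-restaurant-process draw keeps opening new classes as data arrive; this shows only the class-count prior, not the paper's full doubly non-parametric model.

```python
import numpy as np

rng = np.random.default_rng(5)
alpha, counts = 1.0, []                 # concentration, current class sizes
for _ in range(200):
    probs = np.array(counts + [alpha], dtype=float)
    probs /= probs.sum()                # seat proportional to size, or open new
    k = rng.choice(len(probs), p=probs)
    if k == len(counts):
        counts.append(1)                # a brand-new class is created
    else:
        counts[k] += 1
print("classes discovered:", len(counts), "sizes:", counts)
```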
From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction
Title | From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction |
Authors | Hidenori Tanaka, Aran Nayebi, Niru Maheswaranathan, Lane McIntosh, Stephen A. Baccus, Surya Ganguli |
Abstract | Recently, deep feedforward neural networks have achieved considerable success in modeling biological sensory processing, in terms of reproducing the input-output map of sensory neurons. However, such models raise profound questions about the very nature of explanation in neuroscience. Are we simply replacing one complex system (a biological circuit) with another (a deep network), without understanding either? Moreover, beyond neural representations, are the deep network’s computational mechanisms for generating neural responses the same as those in the brain? Without a systematic approach to extracting and understanding computational mechanisms from deep neural network models, it can be difficult both to assess the degree of utility of deep learning approaches in neuroscience, and to extract experimentally testable hypotheses from deep networks. We develop such a systematic approach by combining dimensionality reduction and modern attribution methods for determining the relative importance of interneurons for specific visual computations. We apply this approach to deep network models of the retina, revealing a conceptual understanding of how the retina acts as a predictive feature extractor that signals deviations from expectations for diverse spatiotemporal stimuli. For each stimulus, our extracted computational mechanisms are consistent with prior scientific literature, and in one case yield a new mechanistic hypothesis. Thus overall, this work not only yields insights into the computational mechanisms underlying the striking predictive capabilities of the retina, but also places the framework of deep networks as neuroscientific models on firmer theoretical foundations, by providing a new roadmap to go beyond comparing neural representations to extracting and understanding computational mechanisms. |
Tasks | Dimensionality Reduction |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.06207v1 |
https://arxiv.org/pdf/1912.06207v1.pdf | |
PWC | https://paperswithcode.com/paper/from-deep-learning-to-mechanistic-1 |
Repo | |
Framework | |
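The attribution half of the pipeline can be sketched in miniature: score each hidden unit ("interneuron") of a toy model by gradient-times-activation with respect to the output. The two-layer model and this particular attribution rule are simplifying assumptions, not the paper's retina models or exact method.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net_in = nn.Linear(40, 8)               # stimulus -> 8 model "interneurons"
net_out = nn.Linear(8, 1)               # interneurons -> predicted response

x = torch.randn(1, 40)                  # a toy stimulus
h = torch.tanh(net_in(x))               # interneuron activations
h.retain_grad()                         # keep gradients at the hidden layer
net_out(h).sum().backward()
importance = (h.grad * h).abs().squeeze()   # gradient-times-activation score
print(importance)                           # one score per interneuron
```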
Transferable Contrastive Network for Generalized Zero-Shot Learning
Title | Transferable Contrastive Network for Generalized Zero-Shot Learning |
Authors | Huajie Jiang, Ruiping Wang, Shiguang Shan, Xilin Chen |
Abstract | Zero-shot learning (ZSL) is a challenging problem that aims to recognize target categories without seen data, where semantic information is leveraged to transfer knowledge from source classes. Although ZSL has made great progress in recent years, most existing approaches easily overfit the source classes in the generalized zero-shot learning (GZSL) task, which indicates that they learn little knowledge about the target classes. To tackle this problem, we propose a novel Transferable Contrastive Network (TCN) that explicitly transfers knowledge from the source classes to the target classes. It automatically contrasts one image with different classes to judge whether they are consistent or not. By exploiting class similarities to transfer knowledge from source images to similar target classes, our approach recognizes target images more robustly. Experiments on five benchmark datasets show the superiority of our approach for GZSL. |
Tasks | Transfer Learning, Zero-Shot Learning |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.05832v1 |
https://arxiv.org/pdf/1908.05832v1.pdf | |
PWC | https://paperswithcode.com/paper/transferable-contrastive-network-for |
Repo | |
Framework | |
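The contrastive judgment the abstract describes can be pictured as a small network scoring the consistency of an (image feature, class embedding) pair. The sketch below uses placeholder dimensions and a plain binary cross-entropy contrast; the paper's class-similarity-weighted transfer is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Compat(nn.Module):
    """Scores whether an image feature and a class embedding are consistent."""
    def __init__(self, dv=2048, ds=300, dh=512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dv + ds, dh), nn.ReLU(),
                                 nn.Linear(dh, 1))

    def forward(self, v, s):
        return self.net(torch.cat([v, s], dim=-1)).squeeze(-1)

model = Compat()
v = torch.randn(8, 2048)                # visual features for a batch
s_pos = torch.randn(8, 300)             # embeddings of the true classes
s_neg = torch.randn(8, 300)             # embeddings of contrasted classes
loss = (F.binary_cross_entropy_with_logits(model(v, s_pos), torch.ones(8))
        + F.binary_cross_entropy_with_logits(model(v, s_neg), torch.zeros(8)))
loss.backward()
```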
Generalised Zero-Shot Learning with Domain Classification in a Joint Semantic and Visual Space
Title | Generalised Zero-Shot Learning with Domain Classification in a Joint Semantic and Visual Space |
Authors | Rafael Felix, Ben Harwood, Michele Sasdelli, Gustavo Carneiro |
Abstract | Generalised zero-shot learning (GZSL) is a classification problem where the learning stage relies on a set of seen visual classes and the inference stage aims to identify both the seen visual classes and a new set of unseen visual classes. Critically, both the learning and inference stages can leverage a semantic representation that is available for the seen and unseen classes. Most state-of-the-art GZSL approaches rely on a mapping between latent visual and semantic spaces without considering whether a particular sample belongs to the set of seen or unseen classes. In this paper, we propose a novel GZSL method that learns a joint latent representation that combines both visual and semantic information. This mitigates the need for learning a mapping between the two spaces. Our method also introduces a domain classifier that estimates whether a sample belongs to a seen or an unseen class. Our classifier then combines a class discriminator with this domain classifier with the goal of reducing the natural bias that GZSL approaches have toward the seen classes. Experiments show that our method achieves state-of-the-art results in terms of harmonic mean, the area under the seen and unseen curve and unseen classification accuracy on public GZSL benchmark data sets. Our code will be available upon acceptance of this paper. |
Tasks | Zero-Shot Learning |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.04930v1 |
https://arxiv.org/pdf/1908.04930v1.pdf | |
PWC | https://paperswithcode.com/paper/generalised-zero-shot-learning-with-domain |
Repo | |
Framework | |
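One natural way to combine a seen/unseen domain classifier with class discriminators, consistent with the abstract, is to gate the two sets of class scores by the estimated domain probability. The gating rule and shapes below are assumptions made for illustration.

```python
import torch

def combine(p_seen, p_unseen, p_domain_seen):
    """p_domain_seen: (B, 1) probability that the sample is from a seen class."""
    seen = p_domain_seen * p_seen                 # (B, S) gated seen scores
    unseen = (1.0 - p_domain_seen) * p_unseen     # (B, U) gated unseen scores
    return torch.cat([seen, unseen], dim=-1)      # scores over all S+U classes

scores = combine(torch.softmax(torch.randn(4, 10), dim=-1),
                 torch.softmax(torch.randn(4, 5), dim=-1),
                 torch.sigmoid(torch.randn(4, 1)))
print(scores.argmax(dim=-1))                      # predicted class per sample
```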
On the Privacy of dK-Random Graphs
Title | On the Privacy of dK-Random Graphs |
Authors | Sameera Horawalavithana, Adriana Iamnitchi |
Abstract | Real social network datasets provide significant benefits for understanding phenomena such as information diffusion or network evolution. Yet the privacy risks raised by sharing real graph datasets, even when stripped of user identity information, are significant. Previous research shows that many graph anonymization techniques fail against existing graph de-anonymization attacks. However, the specific reason for the success of such de-anonymization attacks is yet to be understood. This paper systematically studies the structural properties of real graphs that make them more vulnerable to machine learning-based techniques for de-anonymization. More precisely, we study the boundaries of anonymity based on the structural properties of real graph datasets in terms of how their dK-based anonymized versions resist (or fail to resist) various types of attacks. Our experimental results lead to three contributions. First, we identify the strength of an attacker based on the graph characteristics of the subset of nodes from which it starts the de-anonymization attack. Second, we quantify the relative effectiveness of dK-series for graph anonymization. And third, we identify the properties of the original graph that make it more vulnerable to de-anonymization. |
Tasks | |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.01695v1 |
https://arxiv.org/pdf/1907.01695v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-privacy-of-dk-random-graphs |
Repo | |
Framework | |
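As background, a dK-1 anonymization keeps only the degree sequence of the original graph and randomizes everything else; higher-order dK-2/dK-3 generators preserve progressively more structure and are closer to what the paper analyzes. A small networkx illustration of the dK-1 case, with a stand-in graph:

```python
import networkx as nx

G = nx.karate_club_graph()                        # stand-in "real" graph
degrees = [d for _, d in G.degree()]

anon = nx.configuration_model(degrees, seed=7)    # dK-1: exact degree sequence
assert sorted(d for _, d in anon.degree()) == sorted(degrees)

simple = nx.Graph(anon)                           # a simple-graph view loses a
simple.remove_edges_from(nx.selfloop_edges(simple))  # few multi/self-loop edges
print(G.number_of_edges(), simple.number_of_edges())
```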
Network Randomization: A Simple Technique for Generalization in Deep Reinforcement Learning
Title | Network Randomization: A Simple Technique for Generalization in Deep Reinforcement Learning |
Authors | Kimin Lee, Kibok Lee, Jinwoo Shin, Honglak Lee |
Abstract | Deep reinforcement learning (RL) agents often fail to generalize to unseen environments (even ones semantically similar to the training environments), particularly when they are trained on high-dimensional state spaces, such as images. In this paper, we propose a simple technique to improve the generalization ability of deep RL agents by introducing a randomized (convolutional) neural network that randomly perturbs input observations. It enables trained agents to adapt to new domains by learning robust features invariant across varied and randomized environments. Furthermore, we consider an inference method based on the Monte Carlo approximation to reduce the variance induced by this randomization. We demonstrate the superiority of our method across 2D CoinRun, 3D DeepMind Lab exploration and 3D robotics control tasks: it significantly outperforms various regularization and data augmentation methods for the same purpose. |
Tasks | Data Augmentation |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05396v3 |
https://arxiv.org/pdf/1910.05396v3.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-randomization-technique-for-1 |
Repo | |
Framework | |
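A minimal sketch of the randomization idea: pass observations through a freshly re-initialized convolution before the policy network, and average policy outputs over several random draws at inference time (the Monte Carlo approximation mentioned in the abstract). The initializer, shapes, and the tiny policy below are stand-ins, not the paper's architecture.

```python
import torch
import torch.nn as nn

rand_conv = nn.Conv2d(3, 3, kernel_size=3, padding=1, bias=False)
policy = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                       nn.Flatten(), nn.Linear(16 * 64 * 64, 4))

def randomize(obs):
    nn.init.xavier_normal_(rand_conv.weight)      # fresh random filters per call
    with torch.no_grad():
        return rand_conv(obs)

obs = torch.rand(1, 3, 64, 64)                    # a dummy image observation
# Monte Carlo inference: average the policy over several random perturbations.
logits = torch.stack([policy(randomize(obs)) for _ in range(10)]).mean(dim=0)
print(logits.argmax(dim=-1))                      # the selected action
```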