April 3, 2020

Paper Group ANR 29

Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks


Title	Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks
Authors	Blake Bordelon, Abdulkadir Canatar, Cengiz Pehlevan
Abstract	A fundamental question in modern machine learning is how deep neural networks can generalize. We address this question using 1) an equivalence between training infinitely wide neural networks and performing kernel regression with a deterministic kernel called the Neural Tangent Kernel (NTK) (Jacot et al. 2018), and 2) theoretical tools from statistical physics. We derive analytical expressions for learning curves for kernel regression, and use them to evaluate how the test loss of a trained neural network depends on the number of samples. Our approach allows us to compute not only the total test risk but also the decomposition of the risk due to different spectral components of the kernel. Complementary to recent results showing that during gradient descent, neural networks fit low frequency components first, we identify a new type of frequency principle: as the size of the training set grows, kernel machines and neural networks begin to fit successively higher frequency modes of the target function. We verify our theory with simulations of kernel regression and training wide artificial neural networks.
Tasks
Published	2020-02-07
URL	https://arxiv.org/abs/2002.02561v2
PDF	https://arxiv.org/pdf/2002.02561v2.pdf
PWC	https://paperswithcode.com/paper/spectrum-dependent-learning-curves-in-kernel
Repo
Framework

Lipschitz standardization for robust multivariate learning


Title	Lipschitz standardization for robust multivariate learning
Authors	Adrián Javaloy, Isabel Valera
Abstract	Current trends in machine learning rely on out-of-the-box gradient-based approaches. To mitigate numerical errors and improve the convergence of the learning process, a common empirical practice is to standardize or normalize the data. However, there is a lack of theoretical analysis regarding why and when these methods result in an improvement of the learning process. In this work, we first study these methods in the context of black-box variational inference, specifically analyzing the effect that scaling the data has on the smoothness of the optimization landscape. Our analysis shows that no general rule applies in order to decide which of the existing data scaling methods, or even whether any of them, will improve the learning process. Second, we highlight the issues that arise when dealing with multivariate data, due to the discrepancy in smoothness of the likelihood functions for different variables, and the inability to scale discrete data. Finally, we propose a novel Lipschitz standardization, and its extension for discrete data, which overcomes the aforementioned limitations. Specifically, as backed by our experiments, Lipschitz standardization i) favors a fairer learning across different variables in the data; and ii) results in faster and more accurate learning.
Tasks
Published	2020-02-26
URL	https://arxiv.org/abs/2002.11369v1
PDF	https://arxiv.org/pdf/2002.11369v1.pdf
PWC	https://paperswithcode.com/paper/lipschitz-standardization-for-robust
Repo
Framework

Distilling portable Generative Adversarial Networks for Image Translation


Title	Distilling portable Generative Adversarial Networks for Image Translation
Authors	Hanting Chen, Yunhe Wang, Han Shu, Changyuan Wen, Chunjing Xu, Boxin Shi, Chao Xu, Chang Xu
Abstract	Although Generative Adversarial Networks (GANs) have been widely used in various image-to-image translation tasks, they can hardly be applied on mobile devices due to their heavy computation and storage cost. Traditional network compression methods focus on visual recognition tasks, but never deal with generation tasks. Inspired by knowledge distillation, a student generator with fewer parameters is trained by inheriting the low-level and high-level information from the original heavy teacher generator. To promote the capability of the student generator, we include a student discriminator to measure the distances between real images and images generated by the student and teacher generators. An adversarial learning process is therefore established to optimize the student generator and the student discriminator. Qualitative and quantitative analyses on benchmark datasets demonstrate that the proposed method can learn portable generative models with strong performance.
Tasks	Image-to-Image Translation
Published	2020-03-07
URL	https://arxiv.org/abs/2003.03519v1
PDF	https://arxiv.org/pdf/2003.03519v1.pdf
PWC	https://paperswithcode.com/paper/distilling-portable-generative-adversarial
Repo
Framework

Identification of AC Networks via Online Learning


Title	Identification of AC Networks via Online Learning
Authors	Emanuele Fabbiani, Pulkit Nahata, Giuseppe De Nicolao, Giancarlo Ferrari-Trecate
Abstract	The increasing integration of intermittent renewable generation in power networks calls for novel planning and control methodologies, which hinge on detailed knowledge of the grid. However, reliable information concerning the system topology and parameters may be missing or outdated for temporally varying AC networks. This paper proposes an online learning procedure to estimate the admittance matrix of an AC network capturing topological information and line parameters. We start off by providing a recursive identification algorithm that exploits phasor measurements of voltages and currents. With the goal of accelerating convergence, we subsequently complement our base algorithm with a design-of-experiment procedure, which maximizes the information content of data at each step by computing optimal voltage excitations. Our approach improves on existing techniques and its effectiveness is substantiated by numerical studies on a 6-bus AC network.
Tasks
Published	2020-03-13
URL	https://arxiv.org/abs/2003.06210v1
PDF	https://arxiv.org/pdf/2003.06210v1.pdf
PWC	https://paperswithcode.com/paper/identification-of-ac-networks-via-online
Repo
Framework
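
The recursive identification step mentioned in the abstract above can be illustrated with a scalar recursive least-squares update. This is only a minimal sketch for a single line obeying i = y * v on complex phasors; the function names `rls_update` and `identify`, the initial value `p0`, and the starting estimate of zero are assumptions made for illustration. It is not the authors' algorithm, which estimates a full admittance matrix and adds a design-of-experiment step.

```python
def rls_update(y_est, p, v, i):
    """One recursive least-squares step for the scalar model i = y * v.

    y_est: current admittance estimate (complex)
    p:     current inverse information (real, positive)
    v, i:  measured voltage and current phasors (complex)
    """
    p_new = p / (1.0 + p * abs(v) ** 2)      # information grows by |v|^2
    gain = p_new * v.conjugate()             # least-squares gain
    y_new = y_est + gain * (i - y_est * v)   # correct by the prediction error
    return y_new, p_new


def identify(voltages, currents, p0=1e6):
    """Run the update over a stream of phasor pairs, starting from y = 0.

    p0 is an assumed (hypothetical) large initial value meaning "no prior knowledge".
    """
    y_est, p = 0j, p0
    for v, i in zip(voltages, currents):
        y_est, p = rls_update(y_est, p, v, i)
    return y_est
```

With noiseless measurements the estimate approaches the true admittance after a handful of samples; with a large `p0` the remaining bias comes only from the weak prior toward zero.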

Deep regularization and direct training of the inner layers of Neural Networks with Kernel Flows


Title	Deep regularization and direct training of the inner layers of Neural Networks with Kernel Flows
Authors	Gene Ryan Yoo, Houman Owhadi
Abstract	We introduce a new regularization method for Artificial Neural Networks (ANNs) based on Kernel Flows (KFs). KFs were introduced as a method for kernel selection in regression/kriging based on the minimization of the loss of accuracy incurred by halving the number of interpolation points in random batches of the dataset. Writing $f_\theta(x) = \big(f^{(n)}_{\theta_n}\circ f^{(n-1)}_{\theta_{n-1}} \circ \dots \circ f^{(1)}_{\theta_1}\big)(x)$ for the functional representation of the compositional structure of the ANN, the inner layer outputs $h^{(i)}(x) = \big(f^{(i)}_{\theta_i}\circ f^{(i-1)}_{\theta_{i-1}} \circ \dots \circ f^{(1)}_{\theta_1}\big)(x)$ define a hierarchy of feature maps and kernels $k^{(i)}(x,x')=\exp(-\gamma_i \|h^{(i)}(x)-h^{(i)}(x')\|_2^2)$. When combined with a batch of the dataset these kernels produce KF losses $e_2^{(i)}$ (the $L^2$ regression error incurred by using a random half of the batch to predict the other half) depending on parameters of inner layers $\theta_1,\ldots,\theta_i$ (and $\gamma_i$). The proposed method simply consists in aggregating a subset of these KF losses with a classical output loss. We test the proposed method on CNNs and WRNs without altering the structure or the output classifier and report reduced test errors, decreased generalization gaps, and increased robustness to distribution shift without significant increase in computational complexity. We suspect that these results might be explained by the fact that while conventional training only employs a linear functional (a generalized moment) of the empirical distribution defined by the dataset and can be prone to trapping in the Neural Tangent Kernel regime (under over-parameterizations), the proposed loss function (defined as a nonlinear functional of the empirical distribution) effectively trains the underlying kernel defined by the CNN beyond regressing the data with that kernel.
Tasks
Published	2020-02-19
URL	https://arxiv.org/abs/2002.08335v1
PDF	https://arxiv.org/pdf/2002.08335v1.pdf
PWC	https://paperswithcode.com/paper/deep-regularization-and-direct-training-of
Repo
Framework
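
The half-batch regression error described in the abstract above can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: the feature vectors stand in for the inner layer outputs, the split is passed explicitly as `half` instead of being drawn at random, the small `nugget` on the diagonal is an assumed numerical stabilizer, and the loss is left unnormalized. All function names are hypothetical.

```python
import math


def gaussian_kernel(a, b, gamma):
    """k(a, b) = exp(-gamma * ||a - b||_2^2) for two feature vectors."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[r]] for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def kf_l2_loss(features, targets, half, gamma, nugget=1e-8):
    """Squared error of predicting the held-out points from the `half` indices."""
    rest = [j for j in range(len(features)) if j not in half]
    # Gram matrix on the retained half, with an assumed nugget on the diagonal.
    gram = [[gaussian_kernel(features[a], features[b], gamma)
             + (nugget if a == b else 0.0) for b in half] for a in half]
    weights = solve(gram, [targets[a] for a in half])
    loss = 0.0
    for j in rest:
        pred = sum(w * gaussian_kernel(features[j], features[a], gamma)
                   for w, a in zip(weights, half))
        loss += (targets[j] - pred) ** 2
    return loss
```

In the method described above, a loss of this kind would be computed on inner layer features and added to the usual output loss; how the terms are weighted is not stated in the abstract and is left out here.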

Getting Fairness Right: Towards a Toolbox for Practitioners


Title	Getting Fairness Right: Towards a Toolbox for Practitioners
Authors	Boris Ruf, Chaouki Boutharouite, Marcin Detyniecki
Abstract	The potential risk of AI systems unintentionally embedding and reproducing bias has attracted the attention of machine learning practitioners and society at large. As policy makers are willing to set the standards of algorithms and AI techniques, the issue of how to refine existing regulation, in order to enforce that decisions made by automated systems are fair and non-discriminatory, is again critical. Meanwhile, researchers have demonstrated that the various existing metrics for fairness are statistically mutually exclusive and the right choice mostly depends on the use case and the definition of fairness. Recognizing that the solutions for implementing fair AI are not purely mathematical but require the commitments of the stakeholders to define the desired nature of fairness, this paper proposes to draft a toolbox which helps practitioners to ensure fair AI practices. Based on the nature of the application and the available training data, but also on legal requirements and ethical, philosophical and cultural dimensions, the toolbox aims to identify the most appropriate fairness objective. This approach attempts to structure the complex landscape of fairness metrics and, therefore, makes the different available options more accessible to non-technical people. In the proven absence of a silver bullet solution for fair AI, this toolbox intends to produce the fairest AI systems possible with respect to their local context.
Tasks
Published	2020-03-15
URL	https://arxiv.org/abs/2003.06920v1
PDF	https://arxiv.org/pdf/2003.06920v1.pdf
PWC	https://paperswithcode.com/paper/getting-fairness-right-towards-a-toolbox-for
Repo
Framework

Algorithmic Fairness from a Non-ideal Perspective


Title	Algorithmic Fairness from a Non-ideal Perspective
Authors	Sina Fazelpour, Zachary C. Lipton
Abstract	Inspired by recent breakthroughs in predictive modeling, practitioners in both industry and government have turned to machine learning with hopes of operationalizing predictions to drive automated decisions. Unfortunately, many social desiderata concerning consequential decisions, such as justice or fairness, have no natural formulation within a purely predictive framework. In efforts to mitigate these problems, researchers have proposed a variety of metrics for quantifying deviations from various statistical parities that we might expect to observe in a fair world and offered a variety of algorithms in attempts to satisfy subsets of these parities or to trade off the degree to which they are satisfied against utility. In this paper, we connect this approach to fair machine learning to the literature on ideal and non-ideal methodological approaches in political philosophy. The ideal approach requires positing the principles according to which a just world would operate. In the most straightforward application of ideal theory, one supports a proposed policy by arguing that it closes a discrepancy between the real and the perfectly just world. However, by failing to account for the mechanisms by which our non-ideal world arose, the responsibilities of various decision-makers, and the impacts of proposed policies, naive applications of ideal thinking can lead to misguided interventions. In this paper, we demonstrate a connection between the fair machine learning literature and the ideal approach in political philosophy, and argue that the increasingly apparent shortcomings of proposed fair machine learning algorithms reflect broader troubles faced by the ideal approach. We conclude with a critical discussion of the harms of misguided solutions, a reinterpretation of impossibility results, and directions for future research.
Tasks
Published	2020-01-08
URL	https://arxiv.org/abs/2001.09773v1
PDF	https://arxiv.org/pdf/2001.09773v1.pdf
PWC	https://paperswithcode.com/paper/algorithmic-fairness-from-a-non-ideal
Repo
Framework

CBIR using features derived by Deep Learning


Title	CBIR using features derived by Deep Learning
Authors	Subhadip Maji, Smarajit Bose
Abstract	In a Content Based Image Retrieval (CBIR) System, the task is to retrieve similar images from a large database given a query image. The usual procedure is to extract some useful features from the query image, and retrieve images which have a similar set of features. For this purpose, a suitable similarity measure is chosen, and images with high similarity scores are retrieved. Naturally, the choice of these features plays a very important role in the success of this system, and high level features are required to reduce the semantic gap. In this paper, we propose to use features derived from pre-trained network models from a deep-learning convolution network trained for a large image classification problem. This approach appears to produce vastly superior results for a variety of databases, and it outperforms many contemporary CBIR systems. We analyse the retrieval time of the method, and also propose a pre-clustering of the database based on the above-mentioned features which yields comparable results in a much shorter time in most of the cases.
Tasks	Content-Based Image Retrieval, Image Classification, Image Retrieval
Published	2020-02-13
URL	https://arxiv.org/abs/2002.07877v1
PDF	https://arxiv.org/pdf/2002.07877v1.pdf
PWC	https://paperswithcode.com/paper/cbir-using-features-derived-by-deep-learning
Repo
Framework
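
The retrieval procedure outlined in the abstract above (extract features, score similarity, return the best matches) can be sketched as follows. The abstract does not name the similarity measure, so cosine similarity is an assumption here, and the short feature vectors are hypothetical stand-ins for features taken from a pre-trained network.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


def retrieve(query, database, top_k=3):
    """Return the ids of the top_k database entries most similar to the query.

    database maps an image id to its (assumed precomputed) feature vector.
    """
    ranked = sorted(
        database,
        key=lambda image_id: cosine_similarity(query, database[image_id]),
        reverse=True,
    )
    return ranked[:top_k]
```

The pre-clustering idea from the abstract would restrict `database` to the cluster nearest the query before ranking; that step is omitted from this sketch.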

Spectral neighbor joining for reconstruction of latent tree models


Title	Spectral neighbor joining for reconstruction of latent tree models
Authors	Ariel Jaffe, Noah Amsel, Boaz Nadler, Joseph T. Chang, Yuval Kluger
Abstract	A key assumption in multiple scientific applications is that the distribution of observed data can be modeled by a latent tree graphical model. An important example is phylogenetics, where the tree models the evolutionary lineages of various organisms. Given a set of independent realizations of the random variables at the leaves of the tree, a common task is to infer the underlying tree topology. In this work we develop Spectral Neighbor Joining (SNJ), a novel method to recover latent tree graphical models. In contrast to distance based methods, SNJ is based on a spectral measure of similarity between all pairs of observed variables. We prove that SNJ is consistent, and derive a sufficient condition for correct tree recovery from an estimated similarity matrix. Combining this condition with a concentration of measure result on the similarity matrix, we bound the number of samples required to recover the tree with high probability. We illustrate via extensive simulations that SNJ requires fewer samples to accurately recover trees in regimes where the tree contains a large number of leaves or long edges. We provide theoretical support for this observation by analyzing the model of a perfect binary tree.
Tasks
Published	2020-02-28
URL	https://arxiv.org/abs/2002.12547v1
PDF	https://arxiv.org/pdf/2002.12547v1.pdf
PWC	https://paperswithcode.com/paper/spectral-neighbor-joining-for-reconstruction
Repo
Framework

BCNet: Learning Body and Cloth Shape from A Single Image


Title	BCNet: Learning Body and Cloth Shape from A Single Image
Authors	Boyi Jiang, Juyong Zhang, Yang Hong, Jinhao Luo, Ligang Liu, Hujun Bao
Abstract	In this paper, we consider the problem of automatically reconstructing both garment and body shapes from a single near-front-view RGB image. To this end, we propose a layered garment representation on top of SMPL and, as a novel step, make the skinning weights of the garment independent of the body mesh, which significantly improves the expressive power of our garment model. Compared with existing methods, our method can support more garment categories, such as skirts, and recover more accurate garment geometry. To train our model, we construct two large scale datasets with ground truth body and garment geometries as well as paired color images. Compared with single mesh or non-parametric representations, our method can achieve more flexible control with separate meshes, making applications like re-posing, garment transfer, and garment texture mapping possible.
Tasks
Published	2020-04-01
URL	https://arxiv.org/abs/2004.00214v1
PDF	https://arxiv.org/pdf/2004.00214v1.pdf
PWC	https://paperswithcode.com/paper/bcnet-learning-body-and-cloth-shape-from-a
Repo
Framework

ARDA: Automatic Relational Data Augmentation for Machine Learning


Title	ARDA: Automatic Relational Data Augmentation for Machine Learning
Authors	Nadiia Chepurko, Ryan Marcus, Emanuel Zgraggen, Raul Castro Fernandez, Tim Kraska, David Karger
Abstract	Automatic machine learning (AML) is a family of techniques to automate the process of training predictive models, aiming to both improve performance and make machine learning more accessible. While many recent works have focused on aspects of the machine learning pipeline like model selection, hyperparameter tuning, and feature selection, relatively few works have focused on automatic data augmentation. Automatic data augmentation involves finding new features relevant to the user's predictive task with minimal "human-in-the-loop" involvement. We present ARDA, an end-to-end system that takes as input a dataset and a data repository, and outputs an augmented data set such that training a predictive model on this augmented dataset results in improved performance. Our system has two distinct components: (1) a framework to search and join data with the input data, based on various attributes of the input, and (2) an efficient feature selection algorithm that prunes out noisy or irrelevant features from the resulting join. We perform an extensive empirical evaluation of different system components and benchmark our feature selection algorithm on real-world datasets.
Tasks	Data Augmentation, Feature Selection, Model Selection
Published	2020-03-21
URL	https://arxiv.org/abs/2003.09758v1
PDF	https://arxiv.org/pdf/2003.09758v1.pdf
PWC	https://paperswithcode.com/paper/arda-automatic-relational-data-augmentation
Repo
Framework

Computer Aided Diagnosis for Spitzoid lesions classification using Artificial Intelligence techniques


Title	Computer Aided Diagnosis for Spitzoid lesions classification using Artificial Intelligence techniques
Authors	Abir Belaala, Labib Sadek, Noureddine Zerhouni, Christine Devalland
Abstract	Spitzoid lesions may be largely categorized into Spitz Nevus, Atypical Spitz Tumors, and Spitz Melanomas. Classifying a lesion precisely as an Atypical Spitz Tumor (AST) is challenging and often requires the integration of clinical, histological, and immunohistochemical features to differentiate AST from regular Spitz nevus and malignant Spitz melanomas. Specifically, this paper aims to test several artificial intelligence techniques so as to build a computer aided diagnosis system. A three-phase approach is proposed. In Phase I, collected data are preprocessed with a Synthetic Minority Oversampling TEchnique (SMOTE)-based method to treat the imbalanced data problem. Then, a feature selection mechanism using a genetic algorithm (GA) is applied in Phase II. Finally, in Phase III, a ten-fold cross-validation method is used to compare the performance of seven machine-learning algorithms for classification. Results obtained with SMOTE-Multilayer Perceptron with 14 GA-selected features show the highest classification accuracy (0.98), a sensitivity of 0.99, and a specificity of 0.98, outperforming other Spitzoid lesions classification algorithms.
Tasks	Feature Selection
Published	2020-03-10
URL	https://arxiv.org/abs/2003.04745v1
PDF	https://arxiv.org/pdf/2003.04745v1.pdf
PWC	https://paperswithcode.com/paper/computer-aided-diagnosis-for-spitzoid-lesions
Repo
Framework
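
The oversampling idea behind SMOTE, used in Phase I above, is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The following is a minimal sketch of that interpolation only; the function name `smote_sample`, the default `k`, and the toy data are assumptions, and the paper's exact SMOTE-based variant is not described in the abstract.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a minority sample and one of its
    k nearest minority neighbours (squared Euclidean distance)."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    t = rng.random()  # interpolation factor in [0, 1)
    return [a + t * (b - a) for a, b in zip(base, neighbour)]
```

Repeating the call until the minority class matches the majority class in size gives a balanced training set; in practice this would be applied inside each cross-validation fold, to training data only.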

Short-Term Forecasting of CO2 Emission Intensity in Power Grids by Machine Learning


Title	Short-Term Forecasting of CO2 Emission Intensity in Power Grids by Machine Learning
Authors	Kenneth Leerbeck, Peder Bacher, Rune Junker, Goran Goranović, Olivier Corradi, Razgar Ebrahimy, Anna Tveit, Henrik Madsen
Abstract	A machine learning algorithm is developed to forecast the CO2 emission intensities in electrical power grids in the Danish bidding zone DK2, distinguishing between average and marginal emissions. The analysis was done on a data set comprising a large number (473) of explanatory variables, such as power production, demand, import, and weather conditions, collected from selected neighboring zones. The number was reduced to fewer than 50 using both LASSO (a penalized linear regression analysis) and a forward feature selection algorithm. Three linear regression models that capture different aspects of the data (non-linearities, coupling of variables, etc.) were created and combined into a final model using a Softmax weighted average. Cross-validation is performed for debiasing, and an autoregressive integrated moving average (ARIMA) model is implemented to correct the residuals, making the final model the variant with exogenous inputs (ARIMAX). The forecasts with the corresponding uncertainties are given for two time horizons, below and above six hours. Marginal emissions turned out to be independent of any conditions in the DK2 zone, suggesting that the marginal generators are located in the neighbouring zones. The developed methodology can be applied to any bidding zone in the European electricity network without requiring detailed knowledge about the zone.
Tasks	Feature Selection
Published	2020-03-10
URL	https://arxiv.org/abs/2003.05740v1
PDF	https://arxiv.org/pdf/2003.05740v1.pdf
PWC	https://paperswithcode.com/paper/short-term-forecasting-of-co2-emission
Repo
Framework
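
The abstract above combines three regression models with a Softmax weighted average but does not say what the weights are computed from. The sketch below assumes, purely for illustration, that each model is scored by its negative validation error, so that more accurate models receive larger weights; `combine_forecasts`, `errors`, and `temperature` are hypothetical names.

```python
import math


def softmax(scores):
    """Numerically stable softmax: non-negative weights that sum to one."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_forecasts(forecasts, errors, temperature=1.0):
    """Blend model forecasts with softmax weights on negative validation errors.

    The scoring rule (negative error divided by a temperature) is an assumption.
    """
    weights = softmax([-e / temperature for e in errors])
    return sum(w * f for w, f in zip(weights, forecasts))
```

With equal errors the result reduces to the plain mean of the forecasts; a smaller `temperature` pushes the blend toward the single best model.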

Quantum Bandits


Title	Quantum Bandits
Authors	Balthazar Casalé, Giuseppe Di Molfetta, Hachem Kadri, Liva Ralaivola
Abstract	We consider the quantum version of the bandit problem known as best arm identification (BAI). We first propose a quantum modeling of the BAI problem, which assumes that both the learning agent and the environment are quantum; we then propose an algorithm based on quantum amplitude amplification to solve BAI. We formally analyze the behavior of the algorithm on all instances of the problem and we show, in particular, that it is able to get the optimal solution quadratically faster than what is known to hold in the classical case.
Tasks
Published	2020-02-15
URL	https://arxiv.org/abs/2002.06395v1
PDF	https://arxiv.org/pdf/2002.06395v1.pdf
PWC	https://paperswithcode.com/paper/quantum-bandits
Repo
Framework
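
The quadratic speed-up mentioned in the abstract above comes from amplitude amplification. As a rough illustration of the textbook technique only (not the paper's BAI algorithm), the sketch below uses the standard result that, if a single measurement succeeds with probability p, then after r amplification rounds the success probability is sin^2((2r + 1) * theta) with theta = asin(sqrt(p)). The function names are hypothetical.

```python
import math


def success_probability(p, rounds):
    """Probability of measuring the marked state after `rounds` amplification rounds."""
    theta = math.asin(math.sqrt(p))
    return math.sin((2 * rounds + 1) * theta) ** 2


def optimal_rounds(p):
    """Number of rounds that brings the rotation angle close to pi / 2."""
    theta = math.asin(math.sqrt(p))
    return math.floor(math.pi / (4 * theta))
```

For small p the number of rounds grows like 1 / sqrt(p), whereas repeating a classical trial needs on the order of 1 / p attempts, which is where the quadratic gap comes from.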

Memory-Loss is Fundamental for Stability and Distinguishes the Echo State Property Threshold in Reservoir Computing & Beyond


Title	Memory-Loss is Fundamental for Stability and Distinguishes the Echo State Property Threshold in Reservoir Computing & Beyond
Authors	G Manjunath
Abstract	Reservoir computing, a highly successful neuromorphic computing scheme used to filter, predict, and classify temporal inputs, has entered an era of microchips for several other engineering and biological applications. A basis for reservoir computing is memory-loss or the echo state property. It is an open problem how the design parameters of the reservoir can be optimized to maximize the reservoir's freedom to map an input robustly and yet have its close-by variants represented in the reservoir differently. We present a framework to analyze stability due to input and parameter perturbations and make a surprising fundamental conclusion, that the echo state property is equivalent to robustness to input in any nonlinear recurrent neural network that may or may not be in the ambit of reservoir computing. Further, backed by theoretical conclusions, we define and find the difficult-to-describe input-specific edge-of-criticality or the echo state property threshold, which defines the boundary between parameter-related stability and instability.
Tasks
Published	2020-01-03
URL	https://arxiv.org/abs/2001.00766v1
PDF	https://arxiv.org/pdf/2001.00766v1.pdf
PWC	https://paperswithcode.com/paper/memory-loss-is-fundamental-for-stability-and
Repo
Framework
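
Memory-loss (the echo state property) means that the reservoir state eventually depends only on the input history, not on where the state started. A hedged toy illustration, assuming a one-neuron reservoir x <- tanh(w * x + u) that is not taken from the paper: two copies started from different states and driven by the same input are compared after many steps.

```python
import math


def run_reservoir(x0, inputs, w):
    """Iterate the one-neuron reservoir x <- tanh(w * x + u) over an input sequence."""
    x = x0
    for u in inputs:
        x = math.tanh(w * x + u)
    return x


def state_gap(inputs, w, x_a=-0.9, x_b=0.9):
    """Distance between two trajectories driven by the same input from different starts."""
    return abs(run_reservoir(x_a, inputs, w) - run_reservoir(x_b, inputs, w))
```

For |w| < 1 the update is a contraction, so the gap shrinks to zero and the initial state is forgotten. For a large weight such as w = 3 and inputs bounded by 1 in magnitude, the two trajectories stay in separate basins and the gap persists, so the property fails for that input.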