April 2, 2020

3138 words 15 mins read

Paper Group ANR 140

Multi-factorial Optimization for Large-scale Virtual Machine Placement in Cloud Computing. Coreset-based Strategies for Robust Center-type Problems. Towards Federated Learning: Robustness Analytics to Data Heterogeneity. A Two-Stream Meticulous Processing Network for Retinal Vessel Segmentation. Evolutionary Approach to Collectible Card Game Arena …

Multi-factorial Optimization for Large-scale Virtual Machine Placement in Cloud Computing

Title Multi-factorial Optimization for Large-scale Virtual Machine Placement in Cloud Computing
Authors Zhengping Liang, Jian Zhang, Liang Feng, Zexuan Zhu
Abstract The placement scheme of virtual machines (VMs) onto physical servers (PSs) is crucial to lowering operational cost for cloud providers. Evolutionary algorithms (EAs) have shown promise in solving virtual machine placement (VMP) problems in the past. However, with the growing demand for cloud services, existing EAs do not scale to the large-scale virtual machine placement (LVMP) problem due to their high time complexity and poor scalability. Recently, multi-factorial optimization (MFO) has surfaced as a new search paradigm in evolutionary computing. It offers the ability to evolve multiple optimization tasks simultaneously during the evolutionary process. This paper applies MFO to the LVMP problem in heterogeneous environments. First, we formulate a deployment-cost-based VMP problem in the form of an MFO problem. Then, a multi-factorial evolutionary algorithm (MFEA) embedded with a greedy-based allocation operator is developed to address the established MFO problem. After that, a re-migration and merge operator is designed to assemble an integrated solution to the LVMP problem from the solutions of the MFO problem. To assess the effectiveness of the proposed method, simulation experiments are carried out on large-scale and extra-large-scale VM test data sets. The results show that, compared with various heuristic methods, our method significantly shortens optimization time and offers a competitive placement solution for the LVMP problem in heterogeneous environments.
Tasks
Published 2020-01-18
URL https://arxiv.org/abs/2001.06585v1
PDF https://arxiv.org/pdf/2001.06585v1.pdf
PWC https://paperswithcode.com/paper/multi-factorial-optimization-for-large-scale
Repo
Framework
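
The abstract above pairs an MFEA with a greedy-based allocation operator. As a rough, self-contained illustration of what such a decoding step can look like, here is a minimal first-fit sketch in Python; the synthetic CPU/RAM demands and the `first_fit` name are assumptions for this example, not details taken from the paper.

```python
# A minimal sketch of a greedy allocation operator for VM placement,
# assuming synthetic CPU/RAM demands and capacities (not from the paper).
import random

def first_fit(vm_order, vm_demand, ps_capacity):
    """Decode a VM priority order into a placement via first-fit."""
    residual = [list(c) for c in ps_capacity]  # remaining (cpu, ram) per server
    placement = {}
    for vm in vm_order:
        cpu, ram = vm_demand[vm]
        for ps, (rc, rr) in enumerate(residual):
            if cpu <= rc and ram <= rr:
                residual[ps][0] -= cpu
                residual[ps][1] -= ram
                placement[vm] = ps
                break
    return placement

vm_demand = [(random.randint(1, 8), random.randint(1, 16)) for _ in range(50)]
ps_capacity = [(32, 64)] * 10
order = sorted(range(50), key=lambda v: -sum(vm_demand[v]))  # big VMs first
plan = first_fit(order, vm_demand, ps_capacity)
print(f"placed {len(plan)}/50 VMs on {len(set(plan.values()))} servers")
```

In an MFEA, each genotype would encode such a priority order, and the greedy decoder turns it into a feasible placement whose deployment cost serves as the fitness.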

Coreset-based Strategies for Robust Center-type Problems

Title Coreset-based Strategies for Robust Center-type Problems
Authors Andrea Pietracaprina, Geppino Pucci, Federico Soldà
Abstract Given a dataset $V$ of points from some metric space, the popular $k$-center problem requires identifying a subset of $k$ points (centers) in $V$ minimizing the maximum distance of any point of $V$ from its closest center. The \emph{robust} formulation of the problem features a further parameter $z$ and allows up to $z$ points of $V$ (outliers) to be disregarded when computing the maximum distance from the centers. In this paper, we focus on two important constrained variants of the robust $k$-center problem, namely, the Robust Matroid Center (RMC) problem, where the set of returned centers is constrained to be an independent set of a matroid of rank $k$ built on $V$, and the Robust Knapsack Center (RKC) problem, where each element $i\in V$ is given a positive weight $w_i<1$ and the aggregate weight of the returned centers must be at most 1. We devise coreset-based strategies for the two problems which yield efficient sequential, MapReduce, and Streaming algorithms. More specifically, for any fixed $\epsilon>0$, the algorithms return solutions featuring a $(3+\epsilon)$-approximation ratio, which is a mere additive term $\epsilon$ away from the 3-approximations achievable by the best known polynomial-time sequential algorithms for the two problems. Moreover, the algorithms obliviously adapt to the intrinsic complexity of the dataset, captured by its doubling dimension $D$. For wide ranges of the parameters $k,z,\epsilon, D$, we obtain a sequential algorithm with running time linear in $|V|$, and MapReduce/Streaming algorithms with few rounds/passes and substantially sublinear local/working memory.
Tasks
Published 2020-02-18
URL https://arxiv.org/abs/2002.07463v1
PDF https://arxiv.org/pdf/2002.07463v1.pdf
PWC https://paperswithcode.com/paper/coreset-based-strategies-for-robust-center
Repo
Framework
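
To make the coreset idea above concrete, here is a minimal sketch of one common construction: a greedy farthest-first traversal selects a small set of representatives, each weighted by the number of points it stands in for, so that a robust center solver only has to run on the coreset. The sizes and the use of Euclidean distance are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def farthest_first(points, m, rng):
    """Greedy farthest-first traversal: returns indices of m coreset points."""
    centers = [rng.integers(len(points))]
    d = np.linalg.norm(points - points[centers[0]], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(d))
        centers.append(nxt)
        d = np.minimum(d, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(centers)

rng = np.random.default_rng(0)
V = rng.normal(size=(5_000, 3))
k, z = 5, 50
core_idx = farthest_first(V, m=8 * (k + z), rng=rng)  # coreset size >> k + z
# Weight of each coreset point = number of points it represents.
dist = np.linalg.norm(V[:, None, :] - V[core_idx][None, :, :], axis=2)
weights = np.bincount(dist.argmin(axis=1), minlength=len(core_idx))
print(len(core_idx), weights.sum())  # a robust solver now runs on the coreset only
```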

Towards Federated Learning: Robustness Analytics to Data Heterogeneity

Title Towards Federated Learning: Robustness Analytics to Data Heterogeneity
Authors Jia Qian, Xenofon Fafoutis, Lars Kai Hansen
Abstract Federated Learning allows a centralized server to train models remotely without accessing the data stored on distributed (edge) devices. Most work assumes that the data generated by edge devices are identically and independently sampled from a common population distribution. However, such ideal sampling may not be realistic in many contexts where edge devices correspond to units in varying contexts. Also, models based on intrinsic agency, such as active sampling schemes, may lead to highly biased sampling. So an imminent question is: how robust is Federated Learning to biased sampling? In this work, we investigate two such scenarios. First, we study Federated Learning of a classifier from data with class-distribution heterogeneity across edge devices. Second, we study Federated Learning of a classifier with active sampling at the edge. In both scenarios, we present evidence that Federated Learning is robust to data heterogeneity.
Tasks
Published 2020-02-12
URL https://arxiv.org/abs/2002.05038v1
PDF https://arxiv.org/pdf/2002.05038v1.pdf
PWC https://paperswithcode.com/paper/towards-federated-learning-robustness
Repo
Framework
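
As a small, self-contained illustration of the first scenario (class-distribution heterogeneity across edge devices), the sketch below runs FedAvg-style rounds over two clients with heavily skewed label mixes, using plain logistic regression in NumPy. The model, data, and skew pattern are assumptions for this example, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(w, X, y, lr=0.1, epochs=5):
    """A few epochs of full-batch logistic-regression updates on one client."""
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

# Synthetic binary data; each client sees a heavily skewed class mix.
X = rng.normal(size=(3000, 10)); w_true = rng.normal(size=10)
y = (X @ w_true + 0.5 * rng.normal(size=3000) > 0).astype(float)
idx0, idx1 = np.where(y == 0)[0], np.where(y == 1)[0]
n0, n1 = len(idx0), len(idx1)
clients = [np.r_[idx0[: 4 * n0 // 5], idx1[: n1 // 10]],
           np.r_[idx0[4 * n0 // 5 :], idx1[n1 // 10 :]]]

w = np.zeros(10)
for rnd in range(20):                                   # FedAvg rounds
    updates = [local_update(w.copy(), X[c], y[c]) for c in clients]
    sizes = np.array([len(c) for c in clients])
    w = np.average(updates, axis=0, weights=sizes)      # size-weighted average
acc = ((X @ w > 0) == y).mean()
print(f"global accuracy under label-skewed clients: {acc:.3f}")
```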

A Two-Stream Meticulous Processing Network for Retinal Vessel Segmentation

Title A Two-Stream Meticulous Processing Network for Retinal Vessel Segmentation
Authors Shaoming Zheng, Tianyang Zhang, Jiawei Zhuang, Hao Wang, Jiang Liu
Abstract Vessel segmentation in fundus images is a key diagnostic capability in ophthalmology, and various challenges remain in this essential task. Early approaches indicate that it is often difficult to obtain desirable segmentation performance on thin vessels and boundary areas due to the imbalance between vessel pixels of different thickness levels. In this paper, we propose a novel two-stream Meticulous-Processing Network (MP-Net) to tackle this problem. To pay more attention to thin vessels and boundary areas, we first propose an efficient hierarchical model that automatically stratifies the ground-truth masks into different thickness levels. Then a novel two-stream adversarial network is introduced to use the stratification results, together with a balanced loss function and an integration operation, to achieve better performance, especially in detecting thin vessels and boundary areas. Our model is shown to outperform state-of-the-art methods on the DRIVE, STARE, and CHASE_DB1 datasets.
Tasks Retinal Vessel Segmentation
Published 2020-01-15
URL https://arxiv.org/abs/2001.05829v1
PDF https://arxiv.org/pdf/2001.05829v1.pdf
PWC https://paperswithcode.com/paper/a-two-stream-meticulous-processing-network
Repo
Framework
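
The paper's hierarchical stratification model is not specified in the abstract, but one plausible way to stratify a ground-truth mask by vessel thickness is via the Euclidean distance transform, whose value at a vessel pixel approximates half the local vessel width. The sketch below shows this idea on a toy mask; the thresholds are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def stratify_by_thickness(mask, thresholds=(1.5, 3.0)):
    """Split a binary vessel mask into thin/medium/thick strata.

    The EDT value at a vessel pixel approximates half the local vessel
    width, so thresholding it yields thickness levels.
    """
    half_width = distance_transform_edt(mask)
    thin = mask & (half_width <= thresholds[0])
    medium = mask & (half_width > thresholds[0]) & (half_width <= thresholds[1])
    thick = mask & (half_width > thresholds[1])
    return thin, medium, thick

# Toy mask: a thin 2-px line crossing a thick 9-px band.
mask = np.zeros((64, 64), dtype=bool)
mask[31:33, :] = True      # thin vessel
mask[:, 28:37] = True      # thick vessel
thin, medium, thick = stratify_by_thickness(mask)
print(thin.sum(), medium.sum(), thick.sum())
```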

Evolutionary Approach to Collectible Card Game Arena Deckbuilding using Active Genes

Title Evolutionary Approach to Collectible Card Game Arena Deckbuilding using Active Genes
Authors Jakub Kowalski, Radosław Miernik
Abstract In this paper, we evolve a card-choice strategy for the arena mode of Legends of Code and Magic, a programming game inspired by popular collectible card games like Hearthstone or TES: Legends. In the arena game mode, before each match, a player has to construct a deck by choosing cards one at a time from previously unknown options. Such a scenario is difficult from the optimization point of view: not only is the fitness function non-deterministic, but its value, even for a given problem instance, cannot be calculated directly and can only be estimated with simulation-based approaches. We propose a variant of the evolutionary algorithm that uses the concept of an active gene to restrict the operators to generation-specific subsequences of the genotype. Thus, we batch the learning process and constrain evolutionary updates to the cards relevant for the particular draft, without forgetting the knowledge from previous tests. We developed and tested various implementations of this idea, investigating their performance while taking into account the computational cost of each variant. The experiments show that some of the introduced active-genes algorithms tend to learn faster and produce statistically better draft policies than the compared methods.
Tasks Card Games
Published 2020-01-05
URL https://arxiv.org/abs/2001.01326v1
PDF https://arxiv.org/pdf/2001.01326v1.pdf
PWC https://paperswithcode.com/paper/evolutionary-approach-to-collectible-card
Repo
Framework
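
To illustrate the active-gene idea, the sketch below runs a toy evolution loop in which mutation only touches a generation-specific window of the genotype, while elitist selection retains knowledge from earlier generations. The genome length, window size, and noisy toy fitness are assumptions standing in for simulation-based draft evaluation.

```python
import random

GENOME_LEN, WINDOW = 30, 5   # hypothetical sizes, not from the paper

def fitness(g):
    # Toy stand-in for noisy, simulation-based draft evaluation.
    return sum(g) + random.gauss(0, 0.5)

pop = [[random.random() for _ in range(GENOME_LEN)] for _ in range(20)]
for gen in range(60):
    start = (gen * WINDOW) % GENOME_LEN      # generation-specific active slice
    active = range(start, min(start + WINDOW, GENOME_LEN))
    scored = sorted(pop, key=fitness, reverse=True)
    parents = scored[:10]
    children = []
    for p in parents:
        c = p[:]
        for i in active:                     # mutate ONLY the active genes
            if random.random() < 0.3:
                c[i] = min(1.0, max(0.0, c[i] + random.gauss(0, 0.2)))
        children.append(c)
    pop = parents + children                 # elitism keeps earlier knowledge
print(max(sum(g) for g in pop))
```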

Imputer: Sequence Modelling via Imputation and Dynamic Programming

Title Imputer: Sequence Modelling via Imputation and Dynamic Programming
Authors William Chan, Chitwan Saharia, Geoffrey Hinton, Mohammad Norouzi, Navdeep Jaitly
Abstract This paper presents the Imputer, a neural sequence model that generates output sequences iteratively via imputations. The Imputer is an iterative generative model, requiring only a constant number of generation steps independent of the number of input or output tokens. The Imputer can be trained to approximately marginalize over all possible alignments between the input and output sequences, and all possible generation orders. We present a tractable dynamic programming training algorithm, which yields a lower bound on the log marginal likelihood. When applied to end-to-end speech recognition, the Imputer outperforms prior non-autoregressive models and achieves results competitive with autoregressive models. On LibriSpeech test-other, the Imputer achieves 11.1 WER, outperforming CTC at 13.0 WER and seq2seq at 12.5 WER.
Tasks End-To-End Speech Recognition, Imputation, Speech Recognition
Published 2020-02-20
URL https://arxiv.org/abs/2002.08926v1
PDF https://arxiv.org/pdf/2002.08926v1.pdf
PWC https://paperswithcode.com/paper/imputer-sequence-modelling-via-imputation-and
Repo
Framework
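
A minimal sketch of the constant-step decoding mechanism described above: the sequence starts fully masked, and each of a fixed number of steps commits the most confident remaining positions. The `dummy_model` returning random logits is a placeholder assumption; the real Imputer conditions on the input audio and the partial alignment.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, LENGTH, BLOCKS = 30, 24, 4       # illustrative sizes, not from the paper
MASK = -1

def dummy_model(tokens):
    """Stand-in for the Imputer network: returns per-position logits."""
    return rng.normal(size=(LENGTH, VOCAB))

seq = np.full(LENGTH, MASK)
for step in range(BLOCKS):              # constant number of steps, O(1) in length
    logits = dummy_model(seq)
    conf = logits.max(axis=1)
    conf[seq != MASK] = -np.inf         # already-committed positions stay fixed
    k = LENGTH // BLOCKS                # commit 1/BLOCKS of positions per step
    fill = np.argsort(conf)[-k:]
    seq[fill] = logits[fill].argmax(axis=1)
print(seq)
```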

Convex Geometry and Duality of Over-parameterized Neural Networks

Title Convex Geometry and Duality of Over-parameterized Neural Networks
Authors Tolga Ergen, Mert Pilanci
Abstract We develop a convex analytic framework for ReLU neural networks which elucidates the inner workings of hidden neurons and their function space characteristics. We show that neural networks with rectified linear units act as convex regularizers, where simple solutions are encouraged via extreme points of a certain convex set. For one dimensional regression and classification, as well as rank-one data matrices, we prove that finite two-layer ReLU networks with norm regularization yield linear spline interpolation. We characterize the classification decision regions in terms of a closed form kernel matrix and minimum L1 norm solutions. This is in contrast to the Neural Tangent Kernel, which is unable to explain neural network predictions with finitely many neurons. Our convex geometric description also provides intuitive explanations of hidden neurons as auto-encoders. In higher dimensions, we show that the training problem for two-layer networks can be cast as a convex optimization problem with infinitely many constraints. We then provide a family of convex relaxations to approximate the solution, and a cutting-plane algorithm to improve the relaxations. We derive conditions for the exactness of the relaxations and provide simple closed form formulas for the optimal neural network weights in certain cases. We also establish a connection to $\ell_0$-$\ell_1$ equivalence for neural networks analogous to the minimal cardinality solutions in compressed sensing. Extensive experimental results show that the proposed approach yields interpretable and accurate models.
Tasks
Published 2020-02-25
URL https://arxiv.org/abs/2002.11219v1
PDF https://arxiv.org/pdf/2002.11219v1.pdf
PWC https://paperswithcode.com/paper/convex-geometry-and-duality-of-over
Repo
Framework
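
The one-dimensional claim above is easy to probe numerically: train an over-parameterized two-layer ReLU network with weight decay on a few 1D points and compare its prediction to the linear spline through the data. The sketch below does this with plain NumPy gradient descent; the width, learning rate, and regularization strength are illustrative assumptions, and the theory predicts the gap to the spline shrinks as the norm regularization dominates.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 8); y = np.sin(3 * x)           # 1D regression data
m = 100                                                # over-parameterized width
W = rng.normal(size=m); b = rng.normal(size=m); v = rng.normal(size=m) / m

lam, lr = 1e-4, 0.05
for _ in range(20_000):                                # GD with weight decay
    pre = np.outer(x, W) + b                           # (8, m) pre-activations
    h = np.maximum(pre, 0)
    r = h @ v - y                                      # residuals
    gv = h.T @ r / len(x) + lam * v
    gh = np.outer(r, v) * (pre > 0) / len(x)
    W -= lr * (x @ gh + lam * W); b -= lr * gh.sum(0); v -= lr * gv

xt = np.linspace(-1, 1, 400)
pred = np.maximum(np.outer(xt, W) + b, 0) @ v
spline = np.interp(xt, x, y)                           # piecewise-linear interp
print(f"max gap to linear spline: {np.abs(pred - spline).max():.3f}")
```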

Fair Prediction with Endogenous Behavior

Title Fair Prediction with Endogenous Behavior
Authors Christopher Jung, Sampath Kannan, Changhwa Lee, Mallesh M. Pai, Aaron Roth, Rakesh Vohra
Abstract There is increasing regulatory interest in whether machine learning algorithms deployed in consequential domains (e.g. in criminal justice) treat different demographic groups “fairly.” However, there are several proposed notions of fairness, typically mutually incompatible. Using criminal justice as an example, we study a model in which society chooses an incarceration rule. Agents of different demographic groups differ in their outside options (e.g. opportunity for legal employment) and decide whether to commit crimes. We show that equalizing type I and type II errors across groups is consistent with the goal of minimizing the overall crime rate; other popular notions of fairness are not.
Tasks
Published 2020-02-18
URL https://arxiv.org/abs/2002.07147v1
PDF https://arxiv.org/pdf/2002.07147v1.pdf
PWC https://paperswithcode.com/paper/fair-prediction-with-endogenous-behavior
Repo
Framework
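
As a small companion to the fairness notion the abstract singles out, the sketch below computes type I (false positive) and type II (false negative) error rates per demographic group for a synthetic risk score; equalizing these across groups is the criterion the paper shows to be consistent with minimizing the overall crime rate. The data and threshold are illustrative assumptions.

```python
import numpy as np

def group_error_rates(y_true, y_pred, group):
    """Type I (FPR) and type II (FNR) error rates per demographic group."""
    rates = {}
    for g in np.unique(group):
        m = group == g
        fpr = np.mean(y_pred[m][y_true[m] == 0] == 1)
        fnr = np.mean(y_pred[m][y_true[m] == 1] == 0)
        rates[g] = (fpr, fnr)
    return rates

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)
group = rng.integers(0, 2, 1000)
score = y_true + rng.normal(0, 0.8, 1000)       # noisy risk score
y_pred = (score > 0.5).astype(int)              # a single shared threshold
print(group_error_rates(y_true, y_pred, group))
```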

The Effectiveness of Johnson-Lindenstrauss Transform for High Dimensional Optimization with Outliers

Title The Effectiveness of Johnson-Lindenstrauss Transform for High Dimensional Optimization with Outliers
Authors Hu Ding, Ruizhe Qin, Jiawei Huang
Abstract {\em Johnson-Lindenstrauss (JL) Transform} is one of the most popular methods for dimension reduction. In this paper, we study the effectiveness of JL transform for solving the high dimensional optimization problems with outliers. We focus on two fundamental optimization problems: {\em $k$-center clustering with outliers} and {\em SVM with outliers}. In general, the time complexity for dealing with outliers in high dimensional space could be very large. Based on some novel insights in geometry, we prove that the complexities of these two problems can be significantly reduced through JL transform. In the experiments, we compare JL transform with several other well known dimension reduction methods, and study their performances on synthetic and real datasets.
Tasks Dimensionality Reduction
Published 2020-02-27
URL https://arxiv.org/abs/2002.11923v1
PDF https://arxiv.org/pdf/2002.11923v1.pdf
PWC https://paperswithcode.com/paper/the-effectiveness-of-johnson-lindenstrauss
Repo
Framework
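
A quick, self-contained reminder of what the JL transform does in practice: project with a scaled Gaussian matrix and check that all pairwise distances are nearly preserved. The dimensions below are illustrative assumptions; the theory asks for a target dimension of roughly $O(\log n / \epsilon^2)$.

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n, d, m = 200, 5000, 200                     # m ~ O(log n / eps^2) in theory
X = rng.normal(size=(n, d))
P = rng.normal(size=(d, m)) / np.sqrt(m)     # Gaussian JL projection
Y = X @ P

# Compare all pairwise distances before and after projection.
ratio = pdist(Y) / pdist(X)
print(f"distance distortion in [{ratio.min():.3f}, {ratio.max():.3f}]")
```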

Dimensionality Reduction of Movement Primitives in Parameter Space

Title Dimensionality Reduction of Movement Primitives in Parameter Space
Authors Samuele Tosatto, Jonas Stadtmueller, Jan Peters
Abstract Movement primitives are an important policy class for real-world robotics. However, the high dimensionality of their parametrization makes policy optimization expensive both in terms of samples and computation. Enabling an efficient representation of movement primitives facilitates the application of machine learning techniques such as reinforcement learning to robotics. Motions, especially in highly redundant kinematic structures, exhibit high correlation in the configuration space. For this reason, prior work has mainly focused on the application of dimensionality reduction techniques in the configuration space. In this paper, we investigate the application of dimensionality reduction in the parameter space, identifying principal movements. The resulting approach is enriched with a probabilistic treatment of the parameters, inheriting all the properties of Probabilistic Movement Primitives. We test the proposed technique both on a real robotic task and on a database of complex human movements. The empirical analysis shows that dimensionality reduction in the parameter space is more effective than in the configuration space, as it enables the representation of movements with a significant reduction in the number of parameters.
Tasks Dimensionality Reduction
Published 2020-02-26
URL https://arxiv.org/abs/2003.02634v1
PDF https://arxiv.org/pdf/2003.02634v1.pdf
PWC https://paperswithcode.com/paper/dimensionality-reduction-of-movement
Repo
Framework
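
A minimal sketch of the core operation: treat each demonstration's primitive weight vector as one sample and run PCA in that parameter space to recover a few "principal movements". The synthetic low-rank weight matrix below is an assumption standing in for real ProMP weights fitted to demonstrations.

```python
import numpy as np

rng = np.random.default_rng(0)
n_demos, n_basis = 100, 50
# Synthetic ProMP-style weight vectors: demonstrations vary mostly along
# a handful of "principal movements".
latent = rng.normal(size=(n_demos, 3))
mixing = rng.normal(size=(3, n_basis))
W = latent @ mixing + 0.05 * rng.normal(size=(n_demos, n_basis))

Wc = W - W.mean(axis=0)
U, S, Vt = np.linalg.svd(Wc, full_matrices=False)      # PCA via SVD
explained = np.cumsum(S**2) / np.sum(S**2)
print(f"variance captured by 3 components: {explained[2]:.3f}")
Z = Wc @ Vt[:3].T   # each demonstration projected onto 3 principal movements
```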

Tight Regret Bounds for Noisy Optimization of a Brownian Motion

Title Tight Regret Bounds for Noisy Optimization of a Brownian Motion
Authors Zexin Wang, Vincent Y. F. Tan, Jonathan Scarlett
Abstract We consider the problem of Bayesian optimization of a one-dimensional Brownian motion in which the $T$ adaptively chosen observations are corrupted by Gaussian noise. We show that the smallest possible expected simple regret and the smallest possible expected cumulative regret scale as $\Omega(1 / \sqrt{T \log (T)}) \cap \mathcal{O}(\log T / \sqrt{T})$ and $\Omega(\sqrt{T / \log (T)}) \cap \mathcal{O}(\sqrt{T} \cdot \log T)$, respectively. Thus, our upper and lower bounds are tight up to a factor of $\mathcal{O}( (\log T)^{1.5} )$. The upper bound uses an algorithm based on confidence bounds and the Markov property of Brownian motion, and the lower bound is based on a reduction to binary hypothesis testing.
Tasks
Published 2020-01-25
URL https://arxiv.org/abs/2001.09327v1
PDF https://arxiv.org/pdf/2001.09327v1.pdf
PWC https://paperswithcode.com/paper/tight-regret-bounds-for-noisy-optimization-of
Repo
Framework
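
The sketch below simulates the setting (not the paper's algorithm, which additionally exploits the Markov property of Brownian motion): sample one Brownian path on a grid, query it $T$ times through Gaussian noise with a simple UCB-style rule, and report the simple regret of the final recommendation. Grid size, horizon, and noise level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, sigma = 512, 2000, 0.3
# One Brownian-motion sample path on a grid of N points over [0, 1].
f = np.cumsum(rng.normal(0, np.sqrt(1 / N), N))

means = np.zeros(N); counts = np.zeros(N)
for t in range(T):
    # Simple UCB rule: favor points with high mean or few observations.
    ucb = means + sigma * np.sqrt(2 * np.log(t + 2) / np.maximum(counts, 1))
    ucb[counts == 0] = np.inf                  # explore unseen points first
    i = int(np.argmax(ucb))
    obs = f[i] + sigma * rng.normal()          # noisy query of the path
    counts[i] += 1; means[i] += (obs - means[i]) / counts[i]

best = int(np.argmax(means))
print(f"simple regret: {f.max() - f[best]:.4f}")
```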

Concurrently Extrapolating and Interpolating Networks for Continuous Model Generation

Title Concurrently Extrapolating and Interpolating Networks for Continuous Model Generation
Authors Lijun Zhao, Jinjing Zhang, Fan Zhang, Anhong Wang, Huihui Bai, Yao Zhao
Abstract Most deep image smoothing operators must be trained repeatedly whenever different explicit structure-texture pairs are employed as label images for each algorithm configuration with different parameters. This kind of training strategy often takes a long time and consumes equipment resources in a costly manner. To address this challenging issue, we generalize continuous network interpolation as a more powerful model generation tool, and then propose a simple yet effective model generation strategy to form a sequence of models that requires only a set of specific-effect label images. To precisely learn image smoothing operators, we present a double-state aggregation (DSA) module, which can be easily inserted into most current network architectures. Based on this module, we design a double-state aggregation neural network structure with a local feature aggregation block and a nonlocal feature aggregation block to obtain operators with large expression capacity. Through the evaluation of many objective and visual experimental results, we show that the proposed method is capable of producing a series of continuous models and achieves better performance than several state-of-the-art methods for image smoothing.
Tasks
Published 2020-01-12
URL https://arxiv.org/abs/2001.03847v1
PDF https://arxiv.org/pdf/2001.03847v1.pdf
PWC https://paperswithcode.com/paper/concurrently-extrapolating-and-interpolating
Repo
Framework
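
Continuous network interpolation, the starting point the abstract generalizes, can be stated in a few lines: blend the parameters of two trained models with a coefficient $\alpha$, where values outside $[0, 1]$ extrapolate beyond either trained effect. The sketch below shows this on toy parameter dictionaries; shapes and names are illustrative assumptions.

```python
import numpy as np

def interpolate_params(theta_a, theta_b, alpha):
    """Blend two parameter sets; alpha in [0, 1] interpolates, alpha outside
    that range extrapolates beyond either trained effect."""
    return {k: (1 - alpha) * theta_a[k] + alpha * theta_b[k] for k in theta_a}

rng = np.random.default_rng(0)
theta_weak = {"w": rng.normal(size=(3, 3)), "b": np.zeros(3)}    # light smoothing
theta_strong = {"w": rng.normal(size=(3, 3)), "b": np.ones(3)}   # heavy smoothing
models = [interpolate_params(theta_weak, theta_strong, a)
          for a in np.linspace(-0.25, 1.25, 7)]   # includes extrapolation
print(len(models), models[0]["w"].shape)
```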

Machine Learning for Performance-Aware Virtual Network Function Placement

Title Machine Learning for Performance-Aware Virtual Network Function Placement
Authors Dimitrios Michael Manias, Manar Jammal, Hassan Hawilo, Abdallah Shami, Parisa Heidari, Adel Larabi, Richard Brunner
Abstract With the growing demand for data connectivity, network service providers are faced with the task of reducing their capital and operational expenses while simultaneously improving network performance and addressing the increased connectivity demand. Although Network Function Virtualization (NFV) has been identified as a solution, several challenges must be addressed to ensure its feasibility. In this paper, we address the Virtual Network Function (VNF) placement problem by developing a machine learning decision tree model that learns from the effective placement of the various VNF instances forming a Service Function Chain (SFC). The model takes several performance-related features from the network as an input and selects the placement of the various VNF instances on network servers with the objective of minimizing the delay between dependent VNF instances. The benefits of using machine learning are realized by moving away from a complex mathematical modelling of the system and towards a data-based understanding of the system. Using the Evolved Packet Core (EPC) as a use case, we evaluate our model on different data center networks and compare it to the BACON algorithm in terms of the delay between interconnected components and the total delay across the SFC. Furthermore, a time complexity analysis is performed to show the effectiveness of the model in NFV applications.
Tasks
Published 2020-01-13
URL https://arxiv.org/abs/2001.07787v1
PDF https://arxiv.org/pdf/2001.07787v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-for-performance-aware
Repo
Framework
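
As a toy version of the pipeline described above, the sketch below trains a scikit-learn decision tree on synthetic performance features for (VNF, candidate server) pairs and predicts whether a placement is good. The features and labeling rule are assumptions for illustration, not the paper's feature set or the BACON baseline.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 2000
# Hypothetical performance features per (VNF, candidate-server) pair.
X = np.column_stack([
    rng.uniform(0, 1, n),     # server CPU utilization
    rng.uniform(0, 1, n),     # server memory utilization
    rng.uniform(0, 10, n),    # latency to the previous VNF in the SFC (ms)
])
# Toy "good placement" label: low inter-VNF delay and spare capacity.
y = ((X[:, 2] < 3) & (X[:, 0] < 0.7)).astype(int)

clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X[:1500], y[:1500])
print(f"held-out accuracy: {clf.score(X[1500:], y[1500:]):.3f}")
```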

Distributional Sliced-Wasserstein and Applications to Generative Modeling

Title Distributional Sliced-Wasserstein and Applications to Generative Modeling
Authors Khai Nguyen, Nhat Ho, Tung Pham, Hung Bui
Abstract The Sliced-Wasserstein distance (SWD) and its variant, the Max Sliced-Wasserstein distance (Max-SWD), have been widely used in recent years due to their fast computation and scalability when the probability measures lie in very high dimension. However, these distances still have weaknesses: SWD requires many projection samples because it samples projection directions from the uniform distribution, while Max-SWD uses only one projection, causing it to lose a large amount of information. In this paper, we propose a novel distance that finds an optimal penalized probability measure over the slices, named the Distributional Sliced-Wasserstein distance (DSWD). We show that DSWD is a generalization of both SWD and Max-SWD, and that the proposed distance can be found by searching for a push-forward measure over a set of measures satisfying certain constraints. Moreover, similar to SWD, we extend the Generalized Sliced-Wasserstein distance (GSWD) to the Distributional Generalized Sliced-Wasserstein distance (DGSWD). Finally, we carry out extensive experiments to demonstrate the favorable generative modeling performance of our distances over previous sliced-based distances on large-scale real datasets.
Tasks
Published 2020-02-18
URL https://arxiv.org/abs/2002.07367v1
PDF https://arxiv.org/pdf/2002.07367v1.pdf
PWC https://paperswithcode.com/paper/distributional-sliced-wasserstein-and
Repo
Framework
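
For context, here is a minimal Monte-Carlo implementation of the base SWD the abstract builds on: project both samples onto random unit directions and average the closed-form 1D Wasserstein-2 distances. DSWD would replace the uniform direction sampling with an optimized, penalized distribution over directions; this sketch assumes equal sample sizes.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_proj=100, rng=None):
    """Monte-Carlo SWD: average 1D Wasserstein-2 over random directions."""
    rng = rng or np.random.default_rng(0)
    theta = rng.normal(size=(n_proj, X.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)   # uniform on sphere
    # For equal-size samples, 1D W2^2 is the mean squared sorted difference.
    px, py = np.sort(X @ theta.T, axis=0), np.sort(Y @ theta.T, axis=0)
    return np.sqrt(np.mean((px - py) ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
Y = rng.normal(size=(500, 10)) + 1.0
print(f"SWD(X, Y) = {sliced_wasserstein(X, Y):.3f}")
print(f"SWD(X, X) = {sliced_wasserstein(X, X):.3f}")
```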

Concept Whitening for Interpretable Image Recognition

Title Concept Whitening for Interpretable Image Recognition
Authors Zhi Chen, Yijie Bei, Cynthia Rudin
Abstract What does a neural network encode about a concept as we traverse through the layers? Interpretability in machine learning is undoubtedly important, but the calculations of neural networks are very challenging to understand. Attempts to see inside their hidden layers can be misleading or unusable, or can rely on the latent space to possess properties that it may not have. In this work, rather than attempting to analyze a neural network post hoc, we introduce a mechanism, called concept whitening (CW), to alter a given layer of the network to allow us to better understand the computation leading up to that layer. When a concept whitening module is added to a CNN, the axes of the latent space can be aligned with concepts of interest. By experiment, we show that CW can provide a much clearer understanding of how the network gradually learns concepts over layers, without hurting predictive performance.
Tasks
Published 2020-02-05
URL https://arxiv.org/abs/2002.01650v1
PDF https://arxiv.org/pdf/2002.01650v1.pdf
PWC https://paperswithcode.com/paper/concept-whitening-for-interpretable-image
Repo
Framework
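
Concept whitening builds on whitening the latent space so that its axes are decorrelated and unit-variance, then rotating those axes toward concept directions. The sketch below shows the ZCA-whitening half of that construction on synthetic activations; the rotation-to-concepts step, which requires labeled concept data, is omitted, and all sizes are illustrative assumptions.

```python
import numpy as np

def zca_whiten(A, eps=1e-5):
    """ZCA-whiten activations A (n_samples, n_features): zero mean,
    ~identity covariance. CW additionally rotates the whitened axes
    to align them with concept directions."""
    A = A - A.mean(axis=0)
    cov = A.T @ A / (len(A) - 1)
    vals, vecs = np.linalg.eigh(cov)
    W = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
    return A @ W

rng = np.random.default_rng(0)
acts = rng.normal(size=(1000, 16)) @ rng.normal(size=(16, 16))  # correlated
white = zca_whiten(acts)
print(np.round(np.cov(white.T)[:3, :3], 2))   # ~ identity after whitening
```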