Paper Group ANR 1299
Convex Set Disjointness, Distributed Learning of Halfspaces, and LP Feasibility
Title | Convex Set Disjointness, Distributed Learning of Halfspaces, and LP Feasibility |
Authors | Mark Braverman, Gillat Kol, Shay Moran, Raghuvansh R. Saxena |
Abstract | We study the Convex Set Disjointness (CSD) problem, where two players have input sets taken from an arbitrary fixed domain~$U\subseteq \mathbb{R}^d$ of size $\lvert U\rvert = n$. Their mutual goal is to decide using minimum communication whether the convex hulls of their sets intersect (equivalently, whether their sets can be separated by a hyperplane). Different forms of this problem naturally arise in distributed learning and optimization: it is equivalent to {\em Distributed Linear Program (LP) Feasibility} – a basic task in distributed optimization, and it is tightly linked to {\it Distributed Learning of Halfspaces in $\mathbb{R}^d$}. In {communication complexity theory}, CSD can be viewed as a geometric interpolation between the classical problems of {Set Disjointness} (when~$d\geq n-1$) and {Greater-Than} (when $d=1$). We establish a nearly tight bound of $\tilde \Theta(d\log n)$ on the communication complexity of learning halfspaces in $\mathbb{R}^d$. For Convex Set Disjointness (and the equivalent task of distributed LP feasibility) we derive upper and lower bounds of $\tilde O(d^2\log n)$ and~$\Omega(d\log n)$. These results improve upon several previous works in distributed learning and optimization. Unlike typical works in communication complexity, the main technical contribution of this work lies in the upper bounds. In particular, our protocols are based on a {\it Container Lemma for Halfspaces} and on two variants of {\it Carathéodory's Theorem}, which may be of independent interest. These geometric statements are used by our protocols to provide a compressed summary of the players' input. |
Tasks | Distributed Optimization |
Published | 2019-09-08 |
URL | https://arxiv.org/abs/1909.03547v1 |
https://arxiv.org/pdf/1909.03547v1.pdf | |
PWC | https://paperswithcode.com/paper/convex-set-disjointness-distributed-learning |
Repo | |
Framework | |
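The equivalence stated in the abstract between convex set disjointness and LP feasibility is easy to make concrete: two finite point sets have disjoint convex hulls exactly when a separating hyperplane exists, which is a linear feasibility problem. Below is a minimal single-machine sketch using scipy; it illustrates the reduction only and is nothing like the paper's communication protocol.

```python
import numpy as np
from scipy.optimize import linprog

def convex_hulls_disjoint(A, B):
    """Return True iff conv(A) and conv(B) are disjoint.

    Searches for (w, b) with w.x + b >= 1 on A and w.y + b <= -1 on B;
    such a margin-1 separator exists iff the (compact) hulls are disjoint.
    """
    A, B = np.asarray(A, float), np.asarray(B, float)
    d = A.shape[1]
    # Variables: w (d coordinates) and b, in A_ub x <= b_ub form.
    rows = [np.append(-a, -1.0) for a in A]    # -(w.a + b) <= -1
    rows += [np.append(y, 1.0) for y in B]     #   w.y + b  <= -1
    res = linprog(c=np.zeros(d + 1),
                  A_ub=np.array(rows),
                  b_ub=-np.ones(len(rows)),
                  bounds=[(None, None)] * (d + 1))
    return res.status == 0  # status 0: a feasible separator was found

# Two disjoint triangles, then two overlapping ones:
print(convex_hulls_disjoint([[0, 0], [1, 0], [0, 1]],
                            [[2, 2], [3, 2], [2, 3]]))   # True
print(convex_hulls_disjoint([[0, 0], [2, 0], [0, 2]],
                            [[1, 0.5], [3, 1], [1, 3]])) # False
```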
The Global Convergence Analysis of the Bat Algorithm Using a Markovian Framework and Dynamical System Theory
Title | The Global Convergence Analysis of the Bat Algorithm Using a Markovian Framework and Dynamical System Theory |
Authors | Si Chen, Guo-Hua Peng, Xing-Shi He, Xin-She Yang |
Abstract | The bat algorithm (BA) has been shown to be effective at solving a wide range of optimization problems. However, there is not much theoretical analysis concerning its convergence and stability. In order to prove the convergence of the bat algorithm, we build a Markov model for the algorithm and prove that the state sequence of the bat population forms a finite homogeneous Markov chain, satisfying the global convergence criteria. We then prove that the bat algorithm can achieve global convergence. In addition, to enhance the convergence performance of the algorithm, we design an updated model using dynamical system theory in terms of a dynamic matrix, and obtain the parameter ranges for algorithm stability. We then use some benchmark functions to demonstrate that BA can indeed achieve global optimality efficiently for these functions. |
Tasks | |
Published | 2019-03-27 |
URL | http://arxiv.org/abs/1903.11971v1 |
http://arxiv.org/pdf/1903.11971v1.pdf | |
PWC | https://paperswithcode.com/paper/the-global-convergence-analysis-of-the-bat |
Repo | |
Framework | |
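For reference, here is a compact NumPy sketch of the standard bat algorithm whose convergence the paper analyses. The loudness and pulse-rate schedules follow Yang's original formulation; the specific parameter values are illustrative assumptions, not the paper's.

```python
import numpy as np

def bat_algorithm(f, dim=2, n_bats=20, iters=200, fmin=0.0, fmax=2.0,
                  alpha=0.9, gamma=0.9, lo=-5.0, hi=5.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_bats, dim))      # bat positions
    v = np.zeros((n_bats, dim))                 # velocities
    A = np.ones(n_bats)                         # loudness, decays on success
    r0 = rng.uniform(0, 1, n_bats)              # base pulse emission rates
    fit = np.array([f(xi) for xi in x])
    b = fit.argmin()
    best, best_fit = x[b].copy(), fit[b]
    for t in range(1, iters + 1):
        r = r0 * (1 - np.exp(-gamma * t))       # pulse rate grows over time
        for i in range(n_bats):
            freq = fmin + (fmax - fmin) * rng.random()
            v[i] += (x[i] - best) * freq        # pull toward the global best
            cand = np.clip(x[i] + v[i], lo, hi)
            if rng.random() > r[i]:             # occasional local walk near best
                cand = np.clip(best + 0.01 * A.mean() * rng.normal(size=dim),
                               lo, hi)
            fc = f(cand)
            if fc < fit[i] and rng.random() < A[i]:
                x[i], fit[i] = cand, fc
                A[i] *= alpha                   # quieter after each success
            if fit[i] < best_fit:
                best, best_fit = x[i].copy(), fit[i]
    return best, best_fit

# Sanity check on the sphere function, a standard benchmark:
print(bat_algorithm(lambda z: float(np.sum(z ** 2))))
```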
Unsupervised Recalibration
Title | Unsupervised Recalibration |
Authors | Albert Ziegler, Paweł Czyż |
Abstract | Unsupervised recalibration (URC) is a general way to improve the accuracy of an already trained probabilistic classification or regression model upon encountering new data while deployed in the field. URC does not require any ground truth associated with the new field data. URC merely observes the model’s predictions and recognizes when the training set is not representative of field data, and then corrects to remove any introduced bias. URC can be particularly useful when applied separately to different subpopulations observed in the field that were not considered as features when training the machine learning model. This makes it possible to exploit subpopulation information without retraining the model or even having ground truth for some or all subpopulations available. Additionally, if these subpopulations are the object of study, URC serves to determine the correct ground truth distributions for them, where naive aggregation methods, like averaging the model’s predictions, systematically underestimate their differences. |
Tasks | |
Published | 2019-08-24 |
URL | https://arxiv.org/abs/1908.09157v2 |
https://arxiv.org/pdf/1908.09157v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-recalibration |
Repo | |
Framework | |
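The core idea, adjusting a deployed model's outputs using only its predictions on unlabeled field data, has a well-known special case under label shift: the EM prior correction of Saerens et al. (2002). The sketch below implements that classical variant as an illustration; it is not the paper's URC procedure.

```python
import numpy as np

def prior_shift_recalibrate(probs, train_prior, iters=100, tol=1e-8):
    """EM re-estimation of the field class prior from model outputs.

    probs: (n, k) predicted class probabilities on unlabeled field data.
    train_prior: (k,) class prior of the training set.
    Returns recalibrated probabilities and the estimated field prior.
    """
    field_prior = train_prior.copy()
    for _ in range(iters):
        # E-step: reweight each prediction by the prior ratio, renormalize.
        w = probs * (field_prior / train_prior)
        post = w / w.sum(axis=1, keepdims=True)
        # M-step: the new field prior is the mean corrected posterior.
        new_prior = post.mean(axis=0)
        if np.abs(new_prior - field_prior).max() < tol:
            break
        field_prior = new_prior
    return post, field_prior

# Usage idea: a classifier trained on a 50/50 class balance, deployed on
# a field subpopulation whose true balance is quite different.
```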
PINFER: Privacy-Preserving Inference for Machine Learning
Title | PINFER: Privacy-Preserving Inference for Machine Learning |
Authors | Marc Joye, Fabien A. P. Petitcolas |
Abstract | The foreseen growing role of outsourced machine learning services is raising concerns about the privacy of user data. Several technical solutions are being proposed to address the issue. Hardware security modules in cloud data centres appear limited to enterprise customers due to their complexity, while general multi-party computation techniques require a large number of message exchanges. This paper proposes a variety of protocols for privacy-preserving regression and classification that (i) only require additively homomorphic encryption algorithms, (ii) limit interactions to a mere request and response, and (iii) can be used directly for important machine-learning algorithms such as logistic regression and SVM classification. The basic protocols are then extended and applied to feed-forward neural networks. |
Tasks | |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.01865v1 |
https://arxiv.org/pdf/1910.01865v1.pdf | |
PWC | https://paperswithcode.com/paper/pinfer-privacy-preserving-inference-for |
Repo | |
Framework | |
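A toy sketch of the request/response shape in points (i) to (iii), using the third-party python-paillier package (`phe`) as the additively homomorphic scheme. The weights and features are made-up values, and this is a simplified illustration rather than the paper's exact protocol.

```python
import math
from phe import paillier

# Client side: generate keys and encrypt the feature vector (the request).
pub, priv = paillier.generate_paillier_keypair(n_length=2048)
x = [0.5, -1.2, 3.0]
enc_x = [pub.encrypt(v) for v in x]

# Server side: holds a plaintext linear model and works on ciphertexts
# only; ciphertext + ciphertext and scalar * ciphertext are exactly the
# operations an additively homomorphic scheme provides.
w, b = [0.8, 0.1, -0.4], 0.2
enc_score = enc_x[0] * w[0]
for ei, wi in zip(enc_x[1:], w[1:]):
    enc_score = enc_score + ei * wi
enc_score = enc_score + b          # adding a plaintext constant is allowed

# Client side: decrypt the response and apply the link function locally,
# e.g. a sigmoid for logistic regression.
score = priv.decrypt(enc_score)
print(1.0 / (1.0 + math.exp(-score)))
```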
Adversarial Learning with Multiscale Features and Kernel Factorization for Retinal Blood Vessel Segmentation
Title | Adversarial Learning with Multiscale Features and Kernel Factorization for Retinal Blood Vessel Segmentation |
Authors | Farhan Akram, Vivek Kumar Singh, Hatem A. Rashwan, Mohamed Abdel-Nasser, Md. Mostafa Kamal Sarker, Nidhi Pandey, Domenec Puig |
Abstract | In this paper, we propose an efficient blood vessel segmentation method for eye fundus images using adversarial learning with multiscale features and kernel factorization. In the generator network of the adversarial framework, spatial pyramid pooling, kernel factorization and a squeeze-excitation block are employed to enhance the feature representation in the spatial domain at different scales with reduced computational complexity. In turn, the discriminator network of the adversarial framework is formulated by combining convolutional layers with an additional squeeze-excitation block to differentiate the generated segmentation mask from its respective ground truth. Before feeding the images to the network, we pre-process them using edge sharpening and Gaussian regularization to reach an optimized solution for vessel segmentation. The output of the trained model is post-processed using morphological operations to remove small speckles of noise. The proposed method qualitatively and quantitatively outperforms state-of-the-art vessel segmentation methods on the DRIVE and STARE datasets. |
Tasks | |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.02742v1 |
https://arxiv.org/pdf/1907.02742v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-learning-with-multiscale-features |
Repo | |
Framework | |
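Two of the building blocks named in the abstract are easy to sketch in PyTorch: a squeeze-and-excitation block and a factorized convolution. The channel counts and reduction ratio below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Channel-wise recalibration: squeeze (global pool) then excite."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        s = self.pool(x).view(b, c)        # squeeze to (b, c)
        s = self.fc(s).view(b, c, 1, 1)    # excitation weights in (0, 1)
        return x * s                       # rescale each channel

# Kernel factorization replaces one k x k convolution by a 1 x k then
# k x 1 pair, cutting weights from k*k to 2k per input-output channel pair:
factorized = nn.Sequential(
    nn.Conv2d(32, 32, kernel_size=(1, 3), padding=(0, 1)),
    nn.Conv2d(32, 32, kernel_size=(3, 1), padding=(1, 0)),
)
```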
From perception to control: an autonomous driving system for a formula student driverless car
Title | From perception to control: an autonomous driving system for a formula student driverless car |
Authors | Tairan Chen, Zirui Li, Yiting He, Zewen Xu, Zhe Yan, Huiqian Li |
Abstract | This paper introduces the autonomous system of the “Smart Shark II”, which won the Formula Student Autonomous China (FSAC) Competition in 2018. In this competition, an autonomous racecar is required to autonomously complete two laps of an unknown track. This paper presents the self-driving software architecture of the racecar, which ensures high vehicle speed and safety. The key components ensure stable driving of the racecar: LiDAR-based and vision-based cone detection provide redundant perception; EKF-based localization offers high-accuracy, high-frequency state estimation; and perception results are accumulated in time and space in an occupancy grid map. After the trajectory is obtained, a model predictive control algorithm optimizes both the longitudinal and lateral control of the racecar. Finally, the performance of an experiment based on real-world data is shown. |
Tasks | Autonomous Driving |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00119v1 |
https://arxiv.org/pdf/1909.00119v1.pdf | |
PWC | https://paperswithcode.com/paper/from-perception-to-control-an-autonomous |
Repo | |
Framework | |
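One step of the pipeline, accumulating cone detections in time and space with an occupancy grid, can be sketched as simple log-odds updates. The grid size, cell size, and update constants below are assumptions for illustration, not the team's values.

```python
import numpy as np

class ConeGrid:
    """Accumulate cone detections over time in a 2D occupancy grid."""
    def __init__(self, size_m=100.0, cell_m=0.25):
        n = int(size_m / cell_m)
        self.log_odds = np.zeros((n, n))
        self.cell = cell_m
        self.origin = size_m / 2  # vehicle starts at the grid centre

    def update(self, detections_world, hit=0.9, miss=-0.05):
        """detections_world: iterable of (x, y) cone positions in metres."""
        self.log_odds += miss              # slowly decay stale evidence
        for x, y in detections_world:
            i = int((x + self.origin) / self.cell)
            j = int((y + self.origin) / self.cell)
            if 0 <= i < self.log_odds.shape[0] and 0 <= j < self.log_odds.shape[1]:
                self.log_odds[i, j] += hit  # reinforce repeated detections

    def cones(self, thresh=2.0):
        """Return cell centres whose accumulated evidence exceeds thresh."""
        i, j = np.where(self.log_odds > thresh)
        return np.stack([i, j], axis=1) * self.cell - self.origin
```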
Consistency-Based Semi-Supervised Active Learning: Towards Minimizing Labeling Cost
Title | Consistency-Based Semi-Supervised Active Learning: Towards Minimizing Labeling Cost |
Authors | Mingfei Gao, Zizhao Zhang, Guo Yu, Sercan O. Arik, Larry S. Davis, Tomas Pfister |
Abstract | Active learning (AL) integrates data labeling and model training to minimize the labeling cost by prioritizing the selection of high-value data that can best improve model performance. Readily available unlabeled data are used for selection mechanisms, but are not used for model training in most conventional pool-based AL methods. To minimize the labeling cost, we unify unlabeled sample selection and model training based on two principles. First, we exploit both labeled and unlabeled data using semi-supervised learning (SSL) to distill information from unlabeled data that improves representation learning and sample selection. Second, we propose a simple yet effective selection metric that is coherent with the training objective, such that the selected samples are effective at improving model performance. Experimental results demonstrate superior performance of our proposed principles for limited labeled data compared to alternative AL and SSL combinations. In addition, we study an important problem – “When can we start AL?” We propose a measure that is empirically correlated with the AL target loss and can be used to assist in determining the proper starting point. |
Tasks | Active Learning, Representation Learning |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07153v1 |
https://arxiv.org/pdf/1910.07153v1.pdf | |
PWC | https://paperswithcode.com/paper/consistency-based-semi-supervised-active-1 |
Repo | |
Framework | |
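The second principle, a selection metric coherent with the training objective, suggests scoring unlabeled samples by how inconsistent the model's predictions are under augmentation. The PyTorch sketch below is one plausible reading of such a metric, not necessarily the paper's exact formula; `model` and `augment` are placeholder callables.

```python
import torch

@torch.no_grad()
def consistency_scores(model, unlabeled_batch, augment, n_aug=5):
    """Higher score = less consistent = more informative to label."""
    model.eval()
    preds = []
    for _ in range(n_aug):
        logits = model(augment(unlabeled_batch))
        preds.append(torch.softmax(logits, dim=1))
    preds = torch.stack(preds)            # (n_aug, batch, classes)
    return preds.var(dim=0).sum(dim=1)    # per-sample inconsistency

# Selection: request labels for the top-k most inconsistent samples.
# scores = consistency_scores(model, xb, augment)
# idx = scores.topk(k).indices
```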
Two-block vs. Multi-block ADMM: An empirical evaluation of convergence
Title | Two-block vs. Multi-block ADMM: An empirical evaluation of convergence |
Authors | Andre Goncalves, Xiaoli Liu, Arindam Banerjee |
Abstract | The Alternating Direction Method of Multipliers (ADMM) has become a widely used optimization method for convex problems, particularly in the context of data mining, in which large optimization problems are often encountered. ADMM has several desirable properties, including the ability to decompose large problems into smaller tractable sub-problems and ease of parallelization, that are essential in these scenarios. The most common form of ADMM is the two-block variant, in which two sets of primal variables are updated alternately. Recent years have seen advances in multi-block ADMM, which updates more than two blocks of primal variables sequentially. In this paper, we study the empirical question: {\em Is two-block ADMM always comparable with sequential multi-block ADMM solving an equivalent problem?} In the context of optimization problems arising in multi-task learning, through a comprehensive set of experiments we show, surprisingly, that multi-block ADMM consistently outperformed two-block ADMM on optimization performance, and as a consequence on prediction performance, across all datasets and for the entire range of dual step sizes. Our results have an important practical implication: rather than simply using the popular two-block ADMM, one may considerably benefit from experimenting with multi-block ADMM applied to an equivalent problem. |
Tasks | Multi-Task Learning |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04524v1 |
https://arxiv.org/pdf/1907.04524v1.pdf | |
PWC | https://paperswithcode.com/paper/two-block-vs-multi-block-admm-an-empirical |
Repo | |
Framework | |
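For concreteness, here is the canonical two-block ADMM applied to the lasso, the kind of baseline the paper compares against multi-block variants. The penalty parameter, regularizer, and synthetic data are illustrative.

```python
import numpy as np

def lasso_admm(A, b, lam=0.1, rho=1.0, iters=200):
    """Two-block ADMM for min ||Ax - b||^2 + lam * ||x||_1."""
    n = A.shape[1]
    x = z = u = np.zeros(n)
    inv = np.linalg.inv(A.T @ A + rho * np.eye(n))  # cache the factor
    Atb = A.T @ b
    soft = lambda v, k: np.sign(v) * np.maximum(np.abs(v) - k, 0.0)
    for _ in range(iters):
        x = inv @ (Atb + rho * (z - u))   # block 1: quadratic solve
        z = soft(x + u, lam / rho)        # block 2: proximal (shrinkage) step
        u = u + x - z                     # dual (scaled multiplier) update
    return z

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 20))
b = A @ (rng.normal(size=20) * (rng.random(20) < 0.3))  # sparse ground truth
print(lasso_admm(A, b)[:5])
```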
Generating User-friendly Explanations for Loan Denials using GANs
Title | Generating User-friendly Explanations for Loan Denials using GANs |
Authors | Ramya Srinivasan, Ajay Chander, Pouya Pezeshkpour |
Abstract | Financial decisions impact our lives, and thus everyone from the regulator to the consumer is interested in fair, sound, and explainable decisions. There is increasing competitive desire and regulatory incentive to deploy AI mindfully within financial services. An important mechanism towards that end is to explain AI decisions to various stakeholders. State-of-the-art explainable AI systems mostly serve AI engineers and offer little to no value to business decision makers, customers, and other stakeholders. Towards addressing this gap, in this work we consider the scenario of explaining loan denials. We build the first-of-its-kind dataset that is representative of loan-applicant friendly explanations. We design a novel Generative Adversarial Network (GAN) that can accommodate smaller datasets, to generate user-friendly textual explanations. We demonstrate how our system can also generate explanations serving different purposes: those that help educate the loan applicants, or help them take appropriate action towards a future approval. |
Tasks | |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.10244v1 |
https://arxiv.org/pdf/1906.10244v1.pdf | |
PWC | https://paperswithcode.com/paper/generating-user-friendly-explanations-for |
Repo | |
Framework | |
LU-Net: An Efficient Network for 3D LiDAR Point Cloud Semantic Segmentation Based on End-to-End-Learned 3D Features and U-Net
Title | LU-Net: An Efficient Network for 3D LiDAR Point Cloud Semantic Segmentation Based on End-to-End-Learned 3D Features and U-Net |
Authors | Pierre Biasutti, Vincent Lepetit, Jean-François Aujol, Mathieu Brédif, Aurélie Bugeau |
Abstract | We propose LU-Net – for LiDAR U-Net – a new method for the semantic segmentation of a 3D LiDAR point cloud. Instead of applying a global 3D segmentation method such as PointNet, we propose an end-to-end architecture for LiDAR point cloud semantic segmentation that efficiently solves the problem as an image processing problem. We first extract high-level 3D features for each point given its 3D neighbors. Then, these features are projected into a 2D multichannel range image by considering the topology of the sensor. Thanks to these learned features and this projection, we can finally perform the segmentation using a simple U-Net segmentation network, which performs very well while being very efficient. In this way, we can exploit both the 3D nature of the data and the specificity of the LiDAR sensor. This approach outperforms the state-of-the-art by a large margin on the KITTI dataset, as our experiments show. Moreover, this approach operates at 24 fps on a single GPU, which is above the acquisition rate of common LiDAR sensors and makes it suitable for real-time applications. |
Tasks | Semantic Segmentation |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11656v1 |
https://arxiv.org/pdf/1908.11656v1.pdf | |
PWC | https://paperswithcode.com/paper/lu-net-an-efficient-network-for-3d-lidar |
Repo | |
Framework | |
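The projection step described in the abstract, mapping each LiDAR point into a 2D multichannel range image via the sensor topology, looks roughly like the sketch below. The 64x512 resolution and vertical field of view are typical rotating-LiDAR assumptions, not the paper's exact settings.

```python
import numpy as np

def points_to_range_image(pts, H=64, W=512, fov_up=3.0, fov_down=-25.0):
    """pts: (n, 3) xyz points. Returns an (H, W, 4) image whose
    channels are (range, x, y, z)."""
    r = np.linalg.norm(pts, axis=1)
    yaw = np.arctan2(pts[:, 1], pts[:, 0])               # azimuth angle
    pitch = np.arcsin(pts[:, 2] / np.maximum(r, 1e-8))   # elevation angle
    fu, fd = np.radians(fov_up), np.radians(fov_down)
    row = ((1.0 - (pitch - fd) / (fu - fd)) * (H - 1)).astype(int)
    col = ((0.5 * (1.0 - yaw / np.pi)) * (W - 1)).astype(int)
    img = np.zeros((H, W, 4), dtype=np.float32)
    for i in np.argsort(-r):       # write far points first, so near points win
        if 0 <= row[i] < H:
            img[row[i], col[i]] = (r[i], *pts[i])
    return img
```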
Mitigating Multi-Stage Cascading Failure by Reinforcement Learning
Title | Mitigating Multi-Stage Cascading Failure by Reinforcement Learning |
Authors | Yongli Zhu, Chengxi Liu |
Abstract | This paper proposes a cascading failure mitigation strategy based on a Reinforcement Learning (RL) method. First, the principles of RL are introduced. Then, the Multi-Stage Cascading Failure (MSCF) problem is presented and its challenges are investigated. The problem is then tackled by RL based on the DC Optimal Power Flow (DC-OPF). The designs of the key elements of the RL framework (rewards, states, etc.) are also discussed in detail. Experiments on the IEEE 118-bus system with both shallow and deep neural networks demonstrate promising results in terms of reduced system collapse rates. |
Tasks | |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.06599v1 |
https://arxiv.org/pdf/1908.06599v1.pdf | |
PWC | https://paperswithcode.com/paper/mitigating-multi-stage-cascading-failure-by |
Repo | |
Framework | |
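As a generic reference point for the RL machinery the paper builds on, a tabular Q-learning loop is sketched below. The environment interface and reward are placeholders, not the paper's DC-OPF-based design.

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """env.reset() -> state; env.step(a) -> (next_state, reward, done).
    In a grid-mitigation setting the reward could be, e.g., the negative
    amount of load shed at each cascade stage."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Epsilon-greedy action selection.
            a = rng.integers(n_actions) if rng.random() < eps else Q[s].argmax()
            s2, reward, done = env.step(a)
            # Bellman update toward the one-step bootstrapped target.
            target = reward + gamma * (0.0 if done else Q[s2].max())
            Q[s, a] += alpha * (target - Q[s, a])
            s = s2
    return Q
```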
Markov Chain-based Cost-Optimal Control Charts for Healthcare Data
Title | Markov Chain-based Cost-Optimal Control Charts for Healthcare Data |
Authors | Balázs Dobi, András Zempléni |
Abstract | Control charts have traditionally been used in industrial statistics, but are constantly finding new areas of application, especially in the age of Industry 4.0. This paper introduces a new method, which is suitable for applications in the healthcare sector, especially for monitoring a health characteristic of a patient. We adapt a Markov chain-based approach and develop a method in which not only the shift size (i.e. the degradation of the patient’s health) can be random, but also the effect of the repair (i.e. treatment) and the time between samplings (i.e. visits). This means we avoid many commonly made assumptions that are usually not applicable to medical treatments. The average cost of the protocol, which is determined by the time between samplings and the control limit, can be estimated using the stationary distribution of the Markov chain. Furthermore, we incorporate the standard deviation of the cost into the optimisation procedure, which is often very important from a process control viewpoint. The sensitivity of the optimal parameters, and of the resulting average cost and cost standard deviation, to different parameter values is investigated. We demonstrate the usefulness of the approach on real-life data of patients treated in Hungary: namely, the monitoring of the cholesterol level of patients at risk of cardiovascular events. The results show that the optimal parameters from our approach can be somewhat different from the original medical parameters. |
Tasks | |
Published | 2019-02-14 |
URL | http://arxiv.org/abs/1903.06675v1 |
http://arxiv.org/pdf/1903.06675v1.pdf | |
PWC | https://paperswithcode.com/paper/markov-chain-based-cost-optimal-control |
Repo | |
Framework | |
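The computational core described above, estimating the long-run average cost and its spread from the stationary distribution of the chain, reduces to an eigenvector computation. The 3-state chain and costs below are made up for illustration.

```python
import numpy as np

def stationary(P):
    """Stationary distribution: left eigenvector of P for eigenvalue 1."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return pi / pi.sum()

# Hypothetical 3-state patient chain: healthy / degraded / event.
P = np.array([[0.90, 0.08, 0.02],
              [0.30, 0.60, 0.10],   # treatment often repairs degradation
              [0.50, 0.30, 0.20]])
cost = np.array([1.0, 5.0, 50.0])   # per-visit cost in each state
pi = stationary(P)
mean = pi @ cost
std = np.sqrt(pi @ (cost - mean) ** 2)
print(pi, mean, std)                # long-run average cost and its spread
```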
Salient Building Outline Enhancement and Extraction Using Iterative L0 Smoothing and Line Enhancing
Title | Salient Building Outline Enhancement and Extraction Using Iterative L0 Smoothing and Line Enhancing |
Authors | Cho-Ying Wu, Ulrich Neumann |
Abstract | In this paper, our goal is salient building outline enhancement and extraction from images taken with consumer cameras, using L0 smoothing. We address the problems of weak outlines and over-smoothing. Weak outlines are often undetected by edge extractors or easily smoothed out. We propose an iterative method comprising a smoothing cell and a sharpening cell. In the smoothing cell, we iteratively increase the smoothing level of the L0 smoothing. In the sharpening cell, we use the Hough transform to extract lines, based on the assumption that salient building outlines are usually straight, and enhance those extracted lines. Our goal is to enhance line structures and perform the L0 smoothing simultaneously. We also propose to create building masks from semantic segmentation using an encoder-decoder network; the masks filter out irrelevant edges. We also provide an evaluation dataset for this task. |
Tasks | Semantic Segmentation |
Published | 2019-06-06 |
URL | https://arxiv.org/abs/1906.02426v1 |
https://arxiv.org/pdf/1906.02426v1.pdf | |
PWC | https://paperswithcode.com/paper/salient-building-outline-enhancement-and |
Repo | |
Framework | |
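The sharpening cell can be sketched with OpenCV's probabilistic Hough transform: extract straight segments and redraw them to reinforce outlines before the next smoothing pass. The thresholds below are illustrative, not the paper's.

```python
import cv2
import numpy as np

def enhance_lines(gray, min_len=40, max_gap=5):
    """gray: uint8 grayscale image. Returns the image with detected
    straight segments redrawn at full intensity."""
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                            threshold=60, minLineLength=min_len,
                            maxLineGap=max_gap)
    out = gray.copy()
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            cv2.line(out, (x1, y1), (x2, y2), color=255, thickness=1)
    return out  # feed back into the next L0-smoothing iteration
```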
Popt4jlib: A Parallel/Distributed Optimization Library for Java
Title | Popt4jlib: A Parallel/Distributed Optimization Library for Java |
Authors | Ioannis T. Christou |
Abstract | This paper describes the architectural design as well as key implementation details of the open-source popt4jlib library (https://github.com/ioannischristou/popt4jlib), which contains parallel/distributed Java implementations of a fairly large number of meta-heuristic and exact optimization algorithms. Although we report on speedup and efficiency issues for some of the algorithms in the library, our main concern is to detail the design decisions for the key parallel/distributed infrastructure built into the library, so as to make it easier for developers to write their own parallel implementations of the algorithms of their choice, rather than simply using it as an off-the-shelf application library. |
Tasks | Distributed Optimization |
Published | 2019-08-01 |
URL | https://arxiv.org/abs/1908.00338v1 |
https://arxiv.org/pdf/1908.00338v1.pdf | |
PWC | https://paperswithcode.com/paper/popt4jlib-a-paralleldistributed-optimization |
Repo | |
Framework | |
Data Encoding for Byzantine-Resilient Distributed Optimization
Title | Data Encoding for Byzantine-Resilient Distributed Optimization |
Authors | Deepesh Data, Linqi Song, Suhas Diggavi |
Abstract | We study distributed optimization in the presence of Byzantine adversaries, where both data and computation are distributed among $m$ worker machines, $t$ of which can be corrupt and collaboratively deviate arbitrarily from their pre-specified programs, and a designated (master) node iteratively computes the model/parameter vector for {\em generalized linear models}. In this work, we primarily focus on two iterative algorithms: {\em Proximal Gradient Descent} (PGD) and {\em Coordinate Descent} (CD); gradient descent (GD) is a special case of these algorithms. PGD is typically used in the data-parallel setting, where data is partitioned across samples, whereas CD is used in the model-parallel setting, where data is partitioned across the parameter space. In this paper, we propose a method based on data encoding and error correction over real numbers to combat adversarial attacks. We can tolerate up to $t\leq \lfloor\frac{m-1}{2}\rfloor$ corrupt worker nodes, which is information-theoretically optimal. We give deterministic guarantees, and our method does not assume any probability distribution on the data. We develop a {\em sparse} encoding scheme which enables computationally efficient data encoding and decoding. We demonstrate a trade-off between the corruption threshold and the resource requirements (storage and computational/communication complexity). As an example, for $t\leq\frac{m}{3}$, our scheme incurs only a {\em constant} overhead on these resources, over that required by plain distributed PGD/CD algorithms, which provide no adversarial protection. Our encoding scheme extends {\em efficiently} to (i) the data streaming model, where data samples arrive in an online fashion and are encoded as they arrive, and (ii) making {\em stochastic gradient descent} (SGD) Byzantine-resilient. Finally, we give experimental results to show the efficacy of our method. |
Tasks | Distributed Optimization |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.02664v1 |
https://arxiv.org/pdf/1907.02664v1.pdf | |
PWC | https://paperswithcode.com/paper/data-encoding-for-byzantine-resilient |
Repo | |
Framework | |
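To see why redundancy can defeat up to $t \leq \lfloor (m-1)/2 \rfloor$ corrupt workers, consider the crude replicate-and-median baseline below. The paper's sparse real-number coding scheme achieves the same tolerance with far lower storage overhead; this sketch illustrates only the principle, not the method.

```python
import numpy as np

def robust_gradient(grad_fn, data_parts, t, corrupt_ids, rng):
    """Send each data part to 2t+1 workers; up to t may return garbage.
    With at least t+1 honest replies, each coordinate of the median is
    bracketed by honest values, so the aggregate stays sane."""
    part_grads = []
    for part in data_parts:
        replies = []
        for worker in range(2 * t + 1):
            g = grad_fn(part)
            if worker in corrupt_ids:            # Byzantine worker lies
                g = rng.normal(scale=100.0, size=g.shape)
            replies.append(g)
        part_grads.append(np.median(np.stack(replies), axis=0))
    return np.mean(part_grads, axis=0)

# Example: least-squares gradient at w = 0, with workers 0 and 1 corrupt.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(30, 5)), rng.normal(size=30)
grad = lambda idx: -2 * X[idx].T @ y[idx] / len(idx)
parts = np.array_split(np.arange(30), 3)
print(robust_gradient(grad, parts, t=2, corrupt_ids={0, 1}, rng=rng))
```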