October 21, 2019

Paper Group AWR 136

Compressing physical properties of atomic species for improving predictive chemistry. Data-efficient Neuroevolution with Kernel-Based Surrogate Models. RoadTracer: Automatic Extraction of Road Networks from Aerial Images. Ensemble Learning Applied to Classify GPS Trajectories of Birds into Male or Female. Detail-Preserving Pooling in Deep Networks. …

Compressing physical properties of atomic species for improving predictive chemistry


Title	Compressing physical properties of atomic species for improving predictive chemistry
Authors	John E. Herr, Kevin Koh, Kun Yao, John Parkhill
Abstract	The answers to many unsolved problems lie in the intractable chemical space of molecules and materials. Machine learning techniques are rapidly growing in popularity as a way to compress and explore chemical space efficiently. One of the most important aspects of machine learning techniques is representation through the feature vector, which should contain the most important descriptors necessary to make accurate predictions, not least of which is the atomic species in the molecule or material. In this work we introduce a compressed representation of physical properties for atomic species we call the elemental modes. The elemental modes provide an excellent representation by capturing many of the nuances of the periodic table and the similarity of atomic species. We apply the elemental modes to several different tasks for machine learning algorithms and show that they enable us to make improvements to these tasks even beyond simply achieving higher accuracy predictions.
Tasks
Published	2018-10-31
URL	http://arxiv.org/abs/1811.00123v1
PDF	http://arxiv.org/pdf/1811.00123v1.pdf
PWC	https://paperswithcode.com/paper/compressing-physical-properties-of-atomic
Repo	https://github.com/jeherr/Elpasolite-Formation-Energy-Predictor
Framework	tf

Data-efficient Neuroevolution with Kernel-Based Surrogate Models


Title	Data-efficient Neuroevolution with Kernel-Based Surrogate Models
Authors	Adam Gaier, Alexander Asteroth, Jean-Baptiste Mouret
Abstract	Surrogate-assistance approaches have long been used in computationally expensive domains to improve the data-efficiency of optimization algorithms. Neuroevolution, however, has so far resisted the application of these techniques because it requires the surrogate model to make fitness predictions based on variable topologies, instead of a vector of parameters. Our main insight is that we can sidestep this problem by using kernel-based surrogate models, which require only the definition of a distance measure between individuals. Our second insight is that the well-established Neuroevolution of Augmenting Topologies (NEAT) algorithm provides a computationally efficient distance measure between dissimilar networks in the form of “compatibility distance”, initially designed to maintain topological diversity. Combining these two ideas, we introduce a surrogate-assisted neuroevolution algorithm that combines NEAT and a surrogate model built using a compatibility distance kernel. We demonstrate the data-efficiency of this new algorithm on the low dimensional cart-pole swing-up problem, as well as the higher dimensional half-cheetah running task. In both tasks the surrogate-assisted variant achieves the same or better results with several times fewer function evaluations than the original NEAT.
Tasks
Published	2018-04-15
URL	http://arxiv.org/abs/1804.05364v2
PDF	http://arxiv.org/pdf/1804.05364v2.pdf
PWC	https://paperswithcode.com/paper/data-efficient-neuroevolution-with-kernel
Repo	https://github.com/agaier/matNEAT
Framework	none
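
The idea of building a kernel from NEAT's compatibility distance can be sketched as follows. This is a minimal illustration, not the authors' implementation: the genome encoding (a dict from innovation number to connection weight), the coefficient defaults, and the squared-exponential form of the kernel are all assumptions made for the example. The distance follows the standard NEAT formula: excess and disjoint gene counts normalised by the size of the larger genome, plus the mean weight difference of matching genes.

```python
import math


def compatibility_distance(g1, g2, c1=1.0, c2=1.0, c3=0.4):
    """NEAT-style distance between genomes given as {innovation_id: weight}.

    The dict encoding and the coefficient defaults are illustrative assumptions.
    """
    if not g1 and not g2:
        return 0.0
    # Genes beyond the smaller genome's highest innovation number are "excess".
    cutoff = min(max(g1) if g1 else -1, max(g2) if g2 else -1)
    matching = g1.keys() & g2.keys()
    unmatched = g1.keys() ^ g2.keys()
    excess = sum(1 for i in unmatched if i > cutoff)
    disjoint = len(unmatched) - excess
    # Mean absolute weight difference over genes present in both genomes.
    if matching:
        mean_w = sum(abs(g1[i] - g2[i]) for i in matching) / len(matching)
    else:
        mean_w = 0.0
    n = max(len(g1), len(g2))
    return c1 * excess / n + c2 * disjoint / n + c3 * mean_w


def compatibility_kernel(g1, g2, length_scale=1.0):
    """Squared-exponential kernel on the distance (kernel form is an assumption)."""
    d = compatibility_distance(g1, g2)
    return math.exp(-d * d / (2.0 * length_scale ** 2))


# Two hypothetical genomes with different topologies.
genome_a = {1: 0.5, 2: -1.0, 4: 0.3}
genome_b = {1: 0.7, 3: 0.2, 4: 0.3, 5: 1.0, 6: -0.4}
similarity = compatibility_kernel(genome_a, genome_b)
```

Because the kernel needs only pairwise distances, a surrogate such as a Gaussian process can be fitted over networks of different sizes without forcing them into a fixed-length parameter vector.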

RoadTracer: Automatic Extraction of Road Networks from Aerial Images


Title	RoadTracer: Automatic Extraction of Road Networks from Aerial Images
Authors	Favyen Bastani, Songtao He, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden, David DeWitt
Abstract	Mapping road networks is currently both expensive and labor-intensive. High-resolution aerial imagery provides a promising avenue to automatically infer a road network. Prior work uses convolutional neural networks (CNNs) to detect which pixels belong to a road (segmentation), and then uses complex post-processing heuristics to infer graph connectivity. We show that these segmentation methods have high error rates because noisy CNN outputs are difficult to correct. We propose RoadTracer, a new method to automatically construct accurate road network maps from aerial images. RoadTracer uses an iterative search process guided by a CNN-based decision function to derive the road network graph directly from the output of the CNN. We compare our approach with a segmentation method on fifteen cities, and find that at a 5% error rate, RoadTracer correctly captures 45% more junctions across these cities.
Tasks
Published	2018-02-11
URL	http://arxiv.org/abs/1802.03680v2
PDF	http://arxiv.org/pdf/1802.03680v2.pdf
PWC	https://paperswithcode.com/paper/roadtracer-automatic-extraction-of-road
Repo	https://github.com/mitroadmaps/roadtracer
Framework	tf

Ensemble Learning Applied to Classify GPS Trajectories of Birds into Male or Female


Title	Ensemble Learning Applied to Classify GPS Trajectories of Birds into Male or Female
Authors	Dewan Fayzur
Abstract	We describe our first-place solution to the Animal Behavior Challenge (ABC 2018) on predicting the gender of a bird from its GPS trajectory. The task consisted of predicting the gender of shearwaters based on how they navigate across a large ocean. The trajectories are collected from GPS loggers attached to the shearwaters’ bodies and are represented as variable-length sequences of GPS points (latitude and longitude) with associated meta-information, such as the sun azimuth, the sun elevation, the daytime, the elapsed time at each GPS location after starting the trip, the local time (date is trimmed), and the indicator of the day starting from the trip. After extensive feature engineering, we used an ensemble of several variants of Gradient Boosting Classifier along with a Gaussian Process Classifier and a Support Vector Classifier, and we ranked first out of 74 registered teams. The variants of Gradient Boosting Classifier we tried are CatBoost (developed by Yandex), LightGBM (developed by Microsoft), and XGBoost (developed by the Distributed Machine Learning Community). Our approach could easily be adapted to other applications in which the goal is to predict a classification output from a variable-length sequence.
Tasks	Feature Engineering
Published	2018-08-26
URL	http://arxiv.org/abs/1808.08613v1
PDF	http://arxiv.org/pdf/1808.08613v1.pdf
PWC	https://paperswithcode.com/paper/ensemble-learning-applied-to-classify-gps
Repo	https://github.com/dfayzur/Animal-Behavior-Challenge-ABC2018
Framework	none
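
One common way to feed a variable-length sequence to classifiers such as gradient boosting is to summarise it as a fixed-length feature vector. The sketch below illustrates that general idea only; the abstract does not list the engineered features, so the function name `trajectory_features` and the choice of statistics are assumptions for the example.

```python
from statistics import mean, pstdev


def trajectory_features(points):
    """Summarise a variable-length list of (lat, lon) pairs as 9 numbers.

    The feature set (length plus mean, std, min, max per coordinate) is a
    hypothetical choice, not the one used in the challenge solution.
    """
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    feats = [float(len(points))]
    for series in (lats, lons):
        feats += [mean(series), pstdev(series), min(series), max(series)]
    return feats


# Two made-up trips of different lengths map to vectors of the same size.
short_trip = [(39.0, 141.0), (39.5, 142.0)]
long_trip = [(39.0, 141.0), (39.2, 141.5), (39.9, 143.0), (40.1, 143.2)]
short_vec = trajectory_features(short_trip)
long_vec = trajectory_features(long_trip)
```

Once every trajectory has the same number of features, any standard classifier or ensemble of classifiers can be trained on the resulting table.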

Detail-Preserving Pooling in Deep Networks


Title	Detail-Preserving Pooling in Deep Networks
Authors	Faraz Saeedan, Nicolas Weber, Michael Goesele, Stefan Roth
Abstract	Most convolutional neural networks use some method for gradually downscaling the size of the hidden layers. This is commonly referred to as pooling, and is applied to reduce the number of parameters, improve invariance to certain distortions, and increase the receptive field size. Since pooling by nature is a lossy process, it is crucial that each such layer maintains the portion of the activations that is most important for the network’s discriminability. Yet, simple maximization or averaging over blocks, max or average pooling, or plain downsampling in the form of strided convolutions are the standard. In this paper, we aim to leverage recent results on image downscaling for the purposes of deep learning. Inspired by the human visual system, which focuses on local spatial changes, we propose detail-preserving pooling (DPP), an adaptive pooling method that magnifies spatial changes and preserves important structural detail. Importantly, its parameters can be learned jointly with the rest of the network. We analyze some of its theoretical properties and show its empirical benefits on several datasets and networks, where DPP consistently outperforms previous pooling approaches.
Tasks
Published	2018-04-11
URL	http://arxiv.org/abs/1804.04076v1
PDF	http://arxiv.org/pdf/1804.04076v1.pdf
PWC	https://paperswithcode.com/paper/detail-preserving-pooling-in-deep-networks
Repo	https://github.com/visinf/dpp
Framework	pytorch

Decentralized learning with budgeted network load using Gaussian copulas and classifier ensembles


Title	Decentralized learning with budgeted network load using Gaussian copulas and classifier ensembles
Authors	John Klein, Mahmoud Albardan, Benjamin Guedj, Olivier Colot
Abstract	We examine a network of learners which address the same classification task but must learn from different data sets. The learners cannot share data but instead share their models. Models are shared only once so as to limit the network load. We introduce DELCO (standing for Decentralized Ensemble Learning with COpulas), a new approach that aggregates the predictions of the classifiers trained by each learner. The proposed method aggregates the base classifiers using a probabilistic model relying on Gaussian copulas. Experiments on logistic regressor ensembles demonstrate competitive accuracy and increased robustness in the case of dependent classifiers. A companion Python implementation can be downloaded at https://github.com/john-klein/DELCO
Tasks
Published	2018-04-26
URL	https://arxiv.org/abs/1804.10028v3
PDF	https://arxiv.org/pdf/1804.10028v3.pdf
PWC	https://paperswithcode.com/paper/decentralized-learning-with-budgeted-network
Repo	https://github.com/john-klein/DELCO
Framework	none

Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing


Title	Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing
Authors	Chen Liang, Mohammad Norouzi, Jonathan Berant, Quoc Le, Ni Lao
Abstract	We present Memory Augmented Policy Optimization (MAPO), a simple and novel way to leverage a memory buffer of promising trajectories to reduce the variance of the policy gradient estimate. MAPO is applicable to deterministic environments with discrete actions, such as structured prediction and combinatorial optimization tasks. We express the expected return objective as a weighted sum of two terms: an expectation over the high-reward trajectories inside the memory buffer, and a separate expectation over trajectories outside the buffer. To make MAPO an efficient algorithm, we propose: (1) memory weight clipping to accelerate and stabilize training; (2) systematic exploration to discover high-reward trajectories; (3) distributed sampling from inside and outside of the memory buffer to scale up training. MAPO improves the sample efficiency and robustness of policy gradient, especially on tasks with sparse rewards. We evaluate MAPO on weakly supervised program synthesis from natural language (semantic parsing). On the WikiTableQuestions benchmark, we improve the state-of-the-art by 2.6%, achieving an accuracy of 46.3%. On the WikiSQL benchmark, MAPO achieves an accuracy of 74.9% with only weak supervision, outperforming several strong baselines with full supervision. Our source code is available at https://github.com/crazydonkey200/neural-symbolic-machines
Tasks	Combinatorial Optimization, Program Synthesis, Semantic Parsing, Structured Prediction
Published	2018-07-06
URL	http://arxiv.org/abs/1807.02322v5
PDF	http://arxiv.org/pdf/1807.02322v5.pdf
PWC	https://paperswithcode.com/paper/memory-augmented-policy-optimization-for
Repo	https://github.com/theSparta/neural-symbolic-machines
Framework	tf
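
The decomposition of the expected return described in the abstract can be checked on a toy problem where every trajectory can be enumerated. The sketch below is illustrative only: the trajectory names, probabilities, and rewards are made up, and the function names are hypothetical. It writes the objective as the buffer's total probability times the expectation inside the buffer, plus the remaining probability times the expectation outside it, and leaves out weight clipping, exploration, and sampling.

```python
def expected_return(policy, reward):
    """Plain expectation of the reward over all trajectories."""
    return sum(policy[a] * reward[a] for a in policy)


def decomposed_return(policy, reward, buffer):
    """Same expectation, split into inside-buffer and outside-buffer terms."""
    pi_b = sum(policy[a] for a in buffer)  # total probability of the buffer
    outside_mass = 1.0 - pi_b
    # Expectations under the policy renormalised inside / outside the buffer.
    inside = sum(policy[a] / pi_b * reward[a] for a in buffer) if pi_b > 0 else 0.0
    outside = (
        sum(policy[a] / outside_mass * reward[a] for a in policy if a not in buffer)
        if outside_mass > 0
        else 0.0
    )
    return pi_b * inside + outside_mass * outside, pi_b


# Hypothetical toy problem: three trajectories, two of them high-reward.
policy = {"prog_a": 0.1, "prog_b": 0.2, "prog_c": 0.7}
reward = {"prog_a": 1.0, "prog_b": 1.0, "prog_c": 0.0}
memory_buffer = {"prog_a", "prog_b"}

full = expected_return(policy, reward)
split, buffer_weight = decomposed_return(policy, reward, memory_buffer)
```

In this exact form the split changes nothing about the value of the objective. The benefit claimed in the abstract comes from estimating the two terms separately, with the inside term computed over the stored high-reward trajectories rather than hoping to sample them.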

Towards Efficient and Secure Delivery of Data for Training and Inference with Privacy-Preserving


Title	Towards Efficient and Secure Delivery of Data for Training and Inference with Privacy-Preserving
Authors	Juncheng Shen, Juzheng Liu, Yiran Chen, Hai Li
Abstract	Privacy has recently emerged as a serious concern in deep learning: sensitive data must not be shared with third parties during deep neural network development. In this paper, we propose Morphed Learning (MoLe), an efficient and secure scheme to deliver deep learning data. MoLe has two main components: data morphing and the Augmented Convolutional (Aug-Conv) layer. Data morphing allows data providers to send morphed data without privacy information, while the Aug-Conv layer helps deep learning developers apply their networks to the morphed data without a performance penalty. MoLe provides stronger security while introducing lower overhead compared to GAZELLE (USENIX Security 2018), which is another method with no performance penalty on the neural network. When using MoLe for the VGG-16 network on the CIFAR dataset, the computational overhead is only 9% and the data transmission overhead is 5.12%. By comparison, GAZELLE has a computational overhead of 10,000 times and a data transmission overhead of 421,000 times. In this setting, the attack success rate of an adversary is 7.9 x 10^{-90} for MoLe and 2.9 x 10^{-30} for GAZELLE.
Tasks
Published	2018-09-20
URL	https://arxiv.org/abs/1809.09968v5
PDF	https://arxiv.org/pdf/1809.09968v5.pdf
PWC	https://paperswithcode.com/paper/morphed-learning-towards-privacy-preserving
Repo	https://github.com/NIPS2019-authors/MoLe_public
Framework	none

Accelerated Gossip in Networks of Given Dimension using Jacobi Polynomial Iterations


Title	Accelerated Gossip in Networks of Given Dimension using Jacobi Polynomial Iterations
Authors	Raphaël Berthier, Francis Bach, Pierre Gaillard
Abstract	Consider a network of agents connected by communication links, where each agent holds a real value. The gossip problem consists in estimating the average of the values diffused in the network in a distributed manner. We develop a method solving the gossip problem that depends only on the spectral dimension of the network, that is, in the communication network set-up, the dimension of the space in which the agents live. This contrasts with previous work that required the spectral gap of the network as a parameter, or suffered from slow mixing. Our method shows an important improvement over existing algorithms in the non-asymptotic regime, i.e., when the values are far from being fully mixed in the network. Our approach stems from a polynomial-based point of view on gossip algorithms, as well as an approximation of the spectral measure of the graphs with a Jacobi measure. We show the power of the approach with simulations on various graphs, and with performance guarantees on graphs of known spectral dimension, such as grids and random percolation bonds. An extension of this work to distributed Laplacian solvers is discussed. As a side result, we also use the polynomial-based point of view to show the convergence of the message passing algorithm for gossip of Moallemi & Van Roy on regular graphs. The explicit computation of the rate of the convergence shows that message passing has a slow rate of convergence on graphs with small spectral gap.
Tasks	Denoising
Published	2018-05-22
URL	https://arxiv.org/abs/1805.08531v4
PDF	https://arxiv.org/pdf/1805.08531v4.pdf
PWC	https://paperswithcode.com/paper/accelerated-gossip-in-networks-of-given
Repo	https://github.com/raphael-berthier/jacobi-polynomial-iterations
Framework	none
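
For context, the gossip problem itself can be illustrated with the simplest possible scheme: each agent repeatedly replaces its value with the average of its own and its neighbours' values. The sketch below is this plain, non-accelerated baseline on a ring, not the Jacobi polynomial method of the paper. The ring topology, the equal weights, and the function names are assumptions chosen for the example.

```python
def gossip_step(values):
    """One round of simple gossip on a ring: average with both neighbours."""
    n = len(values)
    return [
        (values[(i - 1) % n] + values[i] + values[(i + 1) % n]) / 3.0
        for i in range(n)
    ]


def run_gossip(values, steps):
    """Apply the averaging round repeatedly and return the final values."""
    for _ in range(steps):
        values = gossip_step(values)
    return values


# Eight agents holding the values 0..7; the true average is 3.5.
start = [float(i) for i in range(8)]
after_few = run_gossip(start, 5)
after_many = run_gossip(start, 500)
```

Each round preserves the sum of the values, so the only fixed point is the global average. Convergence of this baseline slows down as the ring grows, which is the slow-mixing behaviour that accelerated schemes aim to improve.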

Bayesian Model-Agnostic Meta-Learning


Title	Bayesian Model-Agnostic Meta-Learning
Authors	Taesup Kim, Jaesik Yoon, Ousmane Dia, Sungwoong Kim, Yoshua Bengio, Sungjin Ahn
Abstract	Learning to infer a Bayesian posterior from a few-shot dataset is an important step towards robust meta-learning due to the model uncertainty inherent in the problem. In this paper, we propose a novel Bayesian model-agnostic meta-learning method. The proposed method combines scalable gradient-based meta-learning with nonparametric variational inference in a principled probabilistic framework. During fast adaptation, the method is capable of learning complex uncertainty structure beyond a point estimate or a simple Gaussian approximation. In addition, a robust Bayesian meta-update mechanism with a new meta-loss prevents overfitting during meta-update. Remaining an efficient gradient-based meta-learner, the method is also model-agnostic and simple to implement. Experimental results show the accuracy and robustness of the proposed method in various tasks: sinusoidal regression, image classification, active learning, and reinforcement learning.
Tasks	Active Learning, Image Classification, Meta-Learning
Published	2018-06-11
URL	http://arxiv.org/abs/1806.03836v4
PDF	http://arxiv.org/pdf/1806.03836v4.pdf
PWC	https://paperswithcode.com/paper/bayesian-model-agnostic-meta-learning
Repo	https://github.com/jaesik817/bmaml
Framework	tf

Acceleration of RED via Vector Extrapolation


Title	Acceleration of RED via Vector Extrapolation
Authors	Tao Hong, Yaniv Romano, Michael Elad
Abstract	Models play an important role in inverse problems, serving as the prior for representing the original signal to be recovered. REgularization by Denoising (RED) is a recently introduced general framework for constructing such priors using state-of-the-art denoising algorithms. Using RED, solving inverse problems is shown to amount to an iterated denoising process. However, as the complexity of denoising algorithms is generally high, this might lead to an overall slow algorithm. In this paper, we suggest an acceleration technique based on vector extrapolation (VE) to speed up existing RED solvers. Numerical experiments validate the gain obtained by VE, leading to substantial savings in computation compared with the original fixed-point method.
Tasks	Denoising
Published	2018-05-06
URL	http://arxiv.org/abs/1805.02158v2
PDF	http://arxiv.org/pdf/1805.02158v2.pdf
PWC	https://paperswithcode.com/paper/acceleration-of-red-via-vector-extrapolation
Repo	https://github.com/happyhongt/Acceleration-of-RED-via-Vector-Extrapolation
Framework	none

Subgradient Descent Learns Orthogonal Dictionaries


Title	Subgradient Descent Learns Orthogonal Dictionaries
Authors	Yu Bai, Qijia Jiang, Ju Sun
Abstract	This paper concerns dictionary learning, i.e., sparse coding, a fundamental representation learning problem. We show that a subgradient descent algorithm, with random initialization, can provably recover orthogonal dictionaries on a natural nonsmooth, nonconvex $\ell_1$ minimization formulation of the problem, under mild statistical assumptions on the data. This is in contrast to previous provable methods that require either expensive computation or delicate initialization schemes. Our analysis develops several tools for characterizing landscapes of nonsmooth functions, which might be of independent interest for provable training of deep networks with nonsmooth activations (e.g., ReLU), among numerous other applications. Preliminary experiments corroborate our analysis and show that our algorithm works well empirically in recovering orthogonal dictionaries.
Tasks	Dictionary Learning, Representation Learning
Published	2018-10-25
URL	https://arxiv.org/abs/1810.10702v2
PDF	https://arxiv.org/pdf/1810.10702v2.pdf
PWC	https://paperswithcode.com/paper/subgradient-descent-learns-orthogonal
Repo	https://github.com/sunju/ODL_L1
Framework	none
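
The kind of algorithm the abstract refers to can be sketched as projected subgradient descent on the unit sphere for the objective f(q) = (1/m) Σ_i |qᵀ y_i|. This is a minimal illustration under stated assumptions, not the authors' code: the synthetic data (identity dictionary with sparse Gaussian coefficients), the step-size schedule, and all function names are choices made for the example.

```python
import math
import random


def l1_objective(q, data):
    """f(q) = average of |<q, y>| over the data points."""
    return sum(abs(sum(a * b for a, b in zip(q, y))) for y in data) / len(data)


def subgradient_step(q, data, step):
    """One subgradient step on the sphere followed by renormalisation."""
    m, n = len(data), len(q)
    g = [0.0] * n
    for y in data:
        inner = sum(a * b for a, b in zip(q, y))
        s = (inner > 0) - (inner < 0)  # sign, with sign(0) = 0
        for j in range(n):
            g[j] += s * y[j] / m
    # Remove the component along q so the step stays tangent to the sphere.
    along = sum(a * b for a, b in zip(q, g))
    g = [gj - along * qj for gj, qj in zip(g, q)]
    moved = [qj - step * gj for qj, gj in zip(q, g)]
    norm = math.sqrt(sum(v * v for v in moved))
    return [v / norm for v in moved]


# Synthetic data (assumption): identity dictionary, sparse Gaussian coefficients.
random.seed(0)
dim, samples = 4, 200
data = [
    [random.gauss(0.0, 1.0) if random.random() < 0.3 else 0.0 for _ in range(dim)]
    for _ in range(samples)
]

# Random initialisation on the sphere, then a decaying step size (assumption).
q = [random.gauss(0.0, 1.0) for _ in range(dim)]
q_norm = math.sqrt(sum(v * v for v in q))
q = [v / q_norm for v in q]
for k in range(200):
    q = subgradient_step(q, data, step=0.1 / math.sqrt(k + 1))
final_value = l1_objective(q, data)
```

Because the tangent step is orthogonal to `q`, the vector before renormalisation always has norm at least one, so the division is safe. With an orthogonal dictionary, a sparse minimiser of this objective corresponds to one dictionary element up to sign, which is why recovering all elements amounts to recovering the dictionary.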

FOTS: Fast Oriented Text Spotting with a Unified Network


Title	FOTS: Fast Oriented Text Spotting with a Unified Network
Authors	Xuebo Liu, Ding Liang, Shi Yan, Dagui Chen, Yu Qiao, Junjie Yan
Abstract	Incidental scene text spotting is considered one of the most difficult and valuable challenges in the document analysis community. Most existing methods treat text detection and recognition as separate tasks. In this work, we propose a unified end-to-end trainable Fast Oriented Text Spotting (FOTS) network for simultaneous detection and recognition, sharing computation and visual information between the two complementary tasks. Specifically, RoIRotate is introduced to share convolutional features between detection and recognition. Benefiting from the convolution sharing strategy, FOTS has little computation overhead compared to the baseline text detection network, and the joint training method learns more generic features, making our method perform better than these two-stage methods. Experiments on ICDAR 2015, ICDAR 2017 MLT, and ICDAR 2013 datasets demonstrate that the proposed method outperforms state-of-the-art methods significantly. This further allows us to develop the first real-time oriented text spotting system, which surpasses all previous state-of-the-art results by more than 5% on the ICDAR 2015 text spotting task while running at 22.6 fps.
Tasks	Scene Text Detection, Scene Text Recognition, Text Spotting
Published	2018-01-05
URL	http://arxiv.org/abs/1801.01671v2
PDF	http://arxiv.org/pdf/1801.01671v2.pdf
PWC	https://paperswithcode.com/paper/fots-fast-oriented-text-spotting-with-a
Repo	https://github.com/yu20103983/FOTS
Framework	tf

A comparative study of fairness-enhancing interventions in machine learning


Title	A comparative study of fairness-enhancing interventions in machine learning
Authors	Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian, Sonam Choudhary, Evan P. Hamilton, Derek Roth
Abstract	Computers are increasingly used to make decisions that have a significant impact on people’s lives. Often, these predictions can affect different population subgroups disproportionately. As a result, the issue of fairness has received much recent interest, and a number of fairness-enhanced classifiers and predictors have appeared in the literature. This paper seeks to study the following questions: how do these different techniques fundamentally compare to one another, and what accounts for the differences? Specifically, we seek to bring attention to many under-appreciated aspects of such fairness-enhancing interventions. Concretely, we present the results of an open benchmark we have developed that lets us compare a number of different algorithms under a variety of fairness measures and a large number of existing datasets. We find that although different algorithms tend to prefer specific formulations of fairness preservation, many of these measures strongly correlate with one another. In addition, we find that fairness-preserving algorithms tend to be sensitive to fluctuations in dataset composition (simulated in our benchmark by varying training-test splits), indicating that fairness interventions might be more brittle than previously thought.
Tasks
Published	2018-02-13
URL	http://arxiv.org/abs/1802.04422v1
PDF	http://arxiv.org/pdf/1802.04422v1.pdf
PWC	https://paperswithcode.com/paper/a-comparative-study-of-fairness-enhancing
Repo	https://github.com/algofairness/fairness-comparison
Framework	none

In Defense of the Classification Loss for Person Re-Identification


Title	In Defense of the Classification Loss for Person Re-Identification
Authors	Yao Zhai, Xun Guo, Yan Lu, Houqiang Li
Abstract	Recent research on person re-identification has focused on two trends. One is learning part-based local features to form more informative feature descriptors. The other is designing effective metric learning loss functions such as the triplet loss family. We argue that learning global features with a classification loss could achieve the same goal, even with a simple and cost-effective architecture design. In this paper, we first explain why the person re-id framework with a standard classification loss usually has inferior performance compared to metric learning. Based on that, we further propose a person re-id framework featuring channel grouping and a multi-branch strategy, which divides global features into multiple channel groups and learns discriminative channel group features with multi-branch classification layers. Extensive experiments show that our framework outperforms prior state-of-the-art methods in terms of both accuracy and inference speed.
Tasks	Metric Learning, Person Re-Identification
Published	2018-09-16
URL	http://arxiv.org/abs/1809.05864v2
PDF	http://arxiv.org/pdf/1809.05864v2.pdf
PWC	https://paperswithcode.com/paper/in-defense-of-the-classification-loss-for
Repo	https://github.com/MARMOTatZJU/ZSLPR-TIANCHI
Framework	pytorch