Paper Group AWR 136
Compressing physical properties of atomic species for improving predictive chemistry
Title | Compressing physical properties of atomic species for improving predictive chemistry |
Authors | John E. Herr, Kevin Koh, Kun Yao, John Parkhill |
Abstract | The answers to many unsolved problems lie in the intractable chemical space of molecules and materials. Machine learning techniques are rapidly growing in popularity as a way to compress and explore chemical space efficiently. One of the most important aspects of machine learning techniques is representation through the feature vector, which should contain the most important descriptors necessary to make accurate predictions, not least of which is the atomic species in the molecule or material. In this work we introduce a compressed representation of physical properties for atomic species we call the elemental modes. The elemental modes provide an excellent representation by capturing many of the nuances of the periodic table and the similarity of atomic species. We apply the elemental modes to several different tasks for machine learning algorithms and show that they enable us to make improvements to these tasks even beyond simply achieving higher accuracy predictions. |
Tasks | |
Published | 2018-10-31 |
URL | http://arxiv.org/abs/1811.00123v1, http://arxiv.org/pdf/1811.00123v1.pdf |
PWC | https://paperswithcode.com/paper/compressing-physical-properties-of-atomic |
Repo | https://github.com/jeherr/Elpasolite-Formation-Energy-Predictor |
Framework | tf |
Data-efficient Neuroevolution with Kernel-Based Surrogate Models
Title | Data-efficient Neuroevolution with Kernel-Based Surrogate Models |
Authors | Adam Gaier, Alexander Asteroth, Jean-Baptiste Mouret |
Abstract | Surrogate-assistance approaches have long been used in computationally expensive domains to improve the data-efficiency of optimization algorithms. Neuroevolution, however, has so far resisted the application of these techniques because it requires the surrogate model to make fitness predictions based on variable topologies, instead of a vector of parameters. Our main insight is that we can sidestep this problem by using kernel-based surrogate models, which require only the definition of a distance measure between individuals. Our second insight is that the well-established Neuroevolution of Augmenting Topologies (NEAT) algorithm provides a computationally efficient distance measure between dissimilar networks in the form of “compatibility distance”, initially designed to maintain topological diversity. Combining these two ideas, we introduce a surrogate-assisted neuroevolution algorithm that combines NEAT and a surrogate model built using a compatibility distance kernel. We demonstrate the data-efficiency of this new algorithm on the low dimensional cart-pole swing-up problem, as well as the higher dimensional half-cheetah running task. In both tasks the surrogate-assisted variant achieves the same or better results with several times fewer function evaluations than the original NEAT. |
Tasks | |
Published | 2018-04-15 |
URL | http://arxiv.org/abs/1804.05364v2, http://arxiv.org/pdf/1804.05364v2.pdf |
PWC | https://paperswithcode.com/paper/data-efficient-neuroevolution-with-kernel |
Repo | https://github.com/agaier/matNEAT |
Framework | none |
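The abstract above builds a surrogate model from a kernel over NEAT's compatibility distance. A minimal sketch of that idea follows, assuming each genome is a plain dict from innovation number to connection weight. The coefficients `c1`, `c2`, `c3`, the squared-exponential kernel form, and `length_scale` are illustrative assumptions, not values or choices confirmed by the paper or its repository.

```python
import math


def compatibility_distance(genome_a, genome_b, c1=1.0, c2=1.0, c3=0.4):
    """NEAT-style compatibility distance between two genomes.

    Each genome is a dict mapping innovation number -> connection weight
    (a simplified, hypothetical encoding used only for this sketch).
    """
    if not genome_a and not genome_b:
        return 0.0
    keys_a, keys_b = set(genome_a), set(genome_b)
    matching = keys_a & keys_b
    non_matching = keys_a ^ keys_b
    # Genes past the other genome's highest innovation number are "excess";
    # the remaining non-matching genes are "disjoint".
    cutoff = min(max(keys_a, default=0), max(keys_b, default=0))
    excess = sum(1 for k in non_matching if k > cutoff)
    disjoint = len(non_matching) - excess
    if matching:
        mean_weight_diff = sum(
            abs(genome_a[k] - genome_b[k]) for k in matching
        ) / len(matching)
    else:
        mean_weight_diff = 0.0
    n = max(len(genome_a), len(genome_b))
    return c1 * excess / n + c2 * disjoint / n + c3 * mean_weight_diff


def compatibility_kernel(genome_a, genome_b, length_scale=1.0):
    """Squared-exponential kernel on the compatibility distance (assumed form)."""
    d = compatibility_distance(genome_a, genome_b)
    return math.exp(-d * d / (2.0 * length_scale ** 2))


# Two small hypothetical genomes sharing innovations 1 and 2.
parent = {1: 0.5, 2: -1.0, 3: 0.2}
child = {1: 0.7, 2: -1.0, 4: 0.3, 5: 0.1}
distance = compatibility_distance(parent, child)
similarity = compatibility_kernel(parent, child)
```

A kernel of this kind only needs pairwise distances, which is why it can compare networks with different topologies without first flattening them into fixed-length parameter vectors.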
RoadTracer: Automatic Extraction of Road Networks from Aerial Images
Title | RoadTracer: Automatic Extraction of Road Networks from Aerial Images |
Authors | Favyen Bastani, Songtao He, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden, David DeWitt |
Abstract | Mapping road networks is currently both expensive and labor-intensive. High-resolution aerial imagery provides a promising avenue to automatically infer a road network. Prior work uses convolutional neural networks (CNNs) to detect which pixels belong to a road (segmentation), and then uses complex post-processing heuristics to infer graph connectivity. We show that these segmentation methods have high error rates because noisy CNN outputs are difficult to correct. We propose RoadTracer, a new method to automatically construct accurate road network maps from aerial images. RoadTracer uses an iterative search process guided by a CNN-based decision function to derive the road network graph directly from the output of the CNN. We compare our approach with a segmentation method on fifteen cities, and find that at a 5% error rate, RoadTracer correctly captures 45% more junctions across these cities. |
Tasks | |
Published | 2018-02-11 |
URL | http://arxiv.org/abs/1802.03680v2, http://arxiv.org/pdf/1802.03680v2.pdf |
PWC | https://paperswithcode.com/paper/roadtracer-automatic-extraction-of-road |
Repo | https://github.com/mitroadmaps/roadtracer |
Framework | tf |
Ensemble Learning Applied to Classify GPS Trajectories of Birds into Male or Female
Title | Ensemble Learning Applied to Classify GPS Trajectories of Birds into Male or Female |
Authors | Dewan Fayzur |
Abstract | We describe our first-place solution to the Animal Behavior Challenge (ABC 2018) on predicting the gender of a bird from its GPS trajectory. The task was to predict the gender of shearwaters based on how they navigate across the ocean. The trajectories are collected from GPS loggers attached to the shearwaters’ bodies and are represented as variable-length sequences of GPS points (latitude and longitude) with associated meta-information, such as the sun azimuth, the sun elevation, the daytime, the elapsed time at each GPS location since the start of the trip, the local time (with the date trimmed), and an indicator of the day counted from the start of the trip. After extensive feature engineering, we used an ensemble of several Gradient Boosting Classifier variants together with a Gaussian Process Classifier and a Support Vector Classifier, and we ranked first out of 74 registered teams. The Gradient Boosting Classifier variants we tried are CatBoost (developed by Yandex), LightGBM (developed by Microsoft), and XGBoost (developed by the Distributed Machine Learning Community). Our approach could easily be adapted to other applications in which the goal is to predict a class label from a variable-length sequence. |
Tasks | Feature Engineering |
Published | 2018-08-26 |
URL | http://arxiv.org/abs/1808.08613v1, http://arxiv.org/pdf/1808.08613v1.pdf |
PWC | https://paperswithcode.com/paper/ensemble-learning-applied-to-classify-gps |
Repo | https://github.com/dfayzur/Animal-Behavior-Challenge-ABC2018 |
Framework | none |
Detail-Preserving Pooling in Deep Networks
Title | Detail-Preserving Pooling in Deep Networks |
Authors | Faraz Saeedan, Nicolas Weber, Michael Goesele, Stefan Roth |
Abstract | Most convolutional neural networks use some method for gradually downscaling the size of the hidden layers. This is commonly referred to as pooling, and is applied to reduce the number of parameters, improve invariance to certain distortions, and increase the receptive field size. Since pooling by nature is a lossy process, it is crucial that each such layer maintains the portion of the activations that is most important for the network’s discriminability. Yet, simple maximization or averaging over blocks, max or average pooling, or plain downsampling in the form of strided convolutions are the standard. In this paper, we aim to leverage recent results on image downscaling for the purposes of deep learning. Inspired by the human visual system, which focuses on local spatial changes, we propose detail-preserving pooling (DPP), an adaptive pooling method that magnifies spatial changes and preserves important structural detail. Importantly, its parameters can be learned jointly with the rest of the network. We analyze some of its theoretical properties and show its empirical benefits on several datasets and networks, where DPP consistently outperforms previous pooling approaches. |
Tasks | |
Published | 2018-04-11 |
URL | http://arxiv.org/abs/1804.04076v1, http://arxiv.org/pdf/1804.04076v1.pdf |
PWC | https://paperswithcode.com/paper/detail-preserving-pooling-in-deep-networks |
Repo | https://github.com/visinf/dpp |
Framework | pytorch |
Decentralized learning with budgeted network load using Gaussian copulas and classifier ensembles
Title | Decentralized learning with budgeted network load using Gaussian copulas and classifier ensembles |
Authors | John Klein, Mahmoud Albardan, Benjamin Guedj, Olivier Colot |
Abstract | We examine a network of learners that address the same classification task but must learn from different data sets. The learners cannot share data; instead, they share their models. Models are shared only once in order to limit the network load. We introduce DELCO (standing for Decentralized Ensemble Learning with COpulas), a new approach for aggregating the predictions of the classifiers trained by each learner. The proposed method aggregates the base classifiers using a probabilistic model relying on Gaussian copulas. Experiments on logistic regressor ensembles demonstrate competitive accuracy and increased robustness in the case of dependent classifiers. A companion Python implementation can be downloaded at https://github.com/john-klein/DELCO |
Tasks | |
Published | 2018-04-26 |
URL | https://arxiv.org/abs/1804.10028v3, https://arxiv.org/pdf/1804.10028v3.pdf |
PWC | https://paperswithcode.com/paper/decentralized-learning-with-budgeted-network |
Repo | https://github.com/john-klein/DELCO |
Framework | none |
Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing
Title | Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing |
Authors | Chen Liang, Mohammad Norouzi, Jonathan Berant, Quoc Le, Ni Lao |
Abstract | We present Memory Augmented Policy Optimization (MAPO), a simple and novel way to leverage a memory buffer of promising trajectories to reduce the variance of the policy gradient estimate. MAPO is applicable to deterministic environments with discrete actions, such as structured prediction and combinatorial optimization tasks. We express the expected return objective as a weighted sum of two terms: an expectation over the high-reward trajectories inside the memory buffer, and a separate expectation over trajectories outside the buffer. To make MAPO efficient, we propose: (1) memory weight clipping to accelerate and stabilize training; (2) systematic exploration to discover high-reward trajectories; (3) distributed sampling from inside and outside of the memory buffer to scale up training. MAPO improves the sample efficiency and robustness of policy gradient, especially on tasks with sparse rewards. We evaluate MAPO on weakly supervised program synthesis from natural language (semantic parsing). On the WikiTableQuestions benchmark, we improve the state-of-the-art by 2.6%, achieving an accuracy of 46.3%. On the WikiSQL benchmark, MAPO achieves an accuracy of 74.9% with only weak supervision, outperforming several strong baselines with full supervision. Our source code is available at https://github.com/crazydonkey200/neural-symbolic-machines |
Tasks | Combinatorial Optimization, Program Synthesis, Semantic Parsing, Structured Prediction |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02322v5, http://arxiv.org/pdf/1807.02322v5.pdf |
PWC | https://paperswithcode.com/paper/memory-augmented-policy-optimization-for |
Repo | https://github.com/theSparta/neural-symbolic-machines |
Framework | tf |
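The MAPO abstract above writes the expected return as a weighted sum of an expectation inside the memory buffer and one outside it. The toy sketch below checks that identity numerically on a four-trajectory example. The trajectory names, probabilities, rewards, and function names are all hypothetical, and the sketch covers only the decomposition, not weight clipping, exploration, or distributed sampling.

```python
# Toy deterministic setting: every trajectory has a policy probability and a reward.
policy = {"a1": 0.05, "a2": 0.15, "a3": 0.50, "a4": 0.30}
reward = {"a1": 1.0, "a2": 1.0, "a3": 0.0, "a4": 0.0}
memory_buffer = {"a1", "a2"}  # high-reward trajectories discovered so far


def expected_return(policy, reward):
    """Plain expected return: sum over all trajectories of pi(a) * R(a)."""
    return sum(p * reward[a] for a, p in policy.items())


def decomposed_expected_return(policy, reward, buffer):
    """Expected return split into inside-buffer and outside-buffer terms."""
    inside_mass = sum(p for a, p in policy.items() if a in buffer)
    outside_mass = sum(p for a, p in policy.items() if a not in buffer)
    # Each expectation uses the policy renormalised over its own region.
    inside = 0.0
    if inside_mass > 0:
        inside = sum(
            p / inside_mass * reward[a] for a, p in policy.items() if a in buffer
        )
    outside = 0.0
    if outside_mass > 0:
        outside = sum(
            p / outside_mass * reward[a]
            for a, p in policy.items()
            if a not in buffer
        )
    return inside_mass * inside + outside_mass * outside


full = expected_return(policy, reward)
split = decomposed_expected_return(policy, reward, memory_buffer)
```

In this example the buffer holds all of the reward but only a fifth of the probability mass, which is the sparse-reward situation where handling the buffer term separately can plausibly reduce gradient variance.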
Towards Efficient and Secure Delivery of Data for Training and Inference with Privacy-Preserving
Title | Towards Efficient and Secure Delivery of Data for Training and Inference with Privacy-Preserving |
Authors | Juncheng Shen, Juzheng Liu, Yiran Chen, Hai Li |
Abstract | Privacy has recently emerged as a serious concern in deep learning: sensitive data must not be shared with third parties during deep neural network development. In this paper, we propose Morphed Learning (MoLe), an efficient and secure scheme to deliver deep learning data. MoLe has two main components: data morphing and the Augmented Convolutional (Aug-Conv) layer. Data morphing allows data providers to send morphed data without privacy information, while the Aug-Conv layer helps deep learning developers apply their networks to the morphed data without performance penalty. MoLe provides stronger security while introducing lower overhead compared to GAZELLE (USENIX Security 2018), which is another method with no performance penalty on the neural network. When using MoLe for the VGG-16 network on the CIFAR dataset, the computational overhead is only 9% and the data transmission overhead is 5.12%. By comparison, GAZELLE has a computational overhead of 10,000 times and a data transmission overhead of 421,000 times. In this setting, the attack success rate of an adversary is 7.9 x 10^{-90} for MoLe and 2.9 x 10^{-30} for GAZELLE. |
Tasks | |
Published | 2018-09-20 |
URL | https://arxiv.org/abs/1809.09968v5, https://arxiv.org/pdf/1809.09968v5.pdf |
PWC | https://paperswithcode.com/paper/morphed-learning-towards-privacy-preserving |
Repo | https://github.com/NIPS2019-authors/MoLe_public |
Framework | none |
Accelerated Gossip in Networks of Given Dimension using Jacobi Polynomial Iterations
Title | Accelerated Gossip in Networks of Given Dimension using Jacobi Polynomial Iterations |
Authors | Raphaël Berthier, Francis Bach, Pierre Gaillard |
Abstract | Consider a network of agents connected by communication links, where each agent holds a real value. The gossip problem consists in estimating the average of the values diffused in the network in a distributed manner. We develop a method solving the gossip problem that depends only on the spectral dimension of the network, that is, in the communication network set-up, the dimension of the space in which the agents live. This contrasts with previous work that required the spectral gap of the network as a parameter, or suffered from slow mixing. Our method shows an important improvement over existing algorithms in the non-asymptotic regime, i.e., when the values are far from being fully mixed in the network. Our approach stems from a polynomial-based point of view on gossip algorithms, as well as an approximation of the spectral measure of the graphs with a Jacobi measure. We show the power of the approach with simulations on various graphs, and with performance guarantees on graphs of known spectral dimension, such as grids and random percolation bonds. An extension of this work to distributed Laplacian solvers is discussed. As a side result, we also use the polynomial-based point of view to show the convergence of the message passing algorithm for gossip of Moallemi & Van Roy on regular graphs. The explicit computation of the rate of the convergence shows that message passing has a slow rate of convergence on graphs with small spectral gap. |
Tasks | Denoising |
Published | 2018-05-22 |
URL | https://arxiv.org/abs/1805.08531v4, https://arxiv.org/pdf/1805.08531v4.pdf |
PWC | https://paperswithcode.com/paper/accelerated-gossip-in-networks-of-given |
Repo | https://github.com/raphael-berthier/jacobi-polynomial-iterations |
Framework | none |
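For readers unfamiliar with the gossip problem described in the abstract above, the sketch below shows the plain, non-accelerated baseline on a ring of agents: each agent repeatedly replaces its value with a weighted average of its own and its two neighbours' values. This is not the Jacobi polynomial iteration from the paper. The ring topology, the weights, and the function names are assumptions chosen only to illustrate what "estimating the average in a distributed manner" means.

```python
def gossip_step(values):
    """One synchronous round of simple gossip on a ring graph.

    Each agent keeps half of its own value and takes a quarter from each
    of its two neighbours, so the network-wide sum is preserved.
    """
    n = len(values)
    return [
        0.5 * values[i]
        + 0.25 * values[(i - 1) % n]
        + 0.25 * values[(i + 1) % n]
        for i in range(n)
    ]


def simple_gossip(values, num_rounds):
    """Run a fixed number of gossip rounds and return the agents' values."""
    for _ in range(num_rounds):
        values = gossip_step(values)
    return values


initial = [4.0, 0.0, 8.0, 2.0, 6.0, 1.0, 3.0, 0.0]
target_average = sum(initial) / len(initial)
final = simple_gossip(initial, 200)
```

The convergence speed of this baseline is governed by the spectral gap of the averaging matrix, which shrinks as the ring grows. That slow mixing is the limitation the abstract contrasts its method against.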
Bayesian Model-Agnostic Meta-Learning
Title | Bayesian Model-Agnostic Meta-Learning |
Authors | Taesup Kim, Jaesik Yoon, Ousmane Dia, Sungwoong Kim, Yoshua Bengio, Sungjin Ahn |
Abstract | Learning to infer the Bayesian posterior from a few-shot dataset is an important step towards robust meta-learning due to the model uncertainty inherent in the problem. In this paper, we propose a novel Bayesian model-agnostic meta-learning method. The proposed method combines scalable gradient-based meta-learning with nonparametric variational inference in a principled probabilistic framework. During fast adaptation, the method is capable of learning complex uncertainty structure beyond a point estimate or a simple Gaussian approximation. In addition, a robust Bayesian meta-update mechanism with a new meta-loss prevents overfitting during meta-update. While remaining an efficient gradient-based meta-learner, the method is also model-agnostic and simple to implement. Experimental results show the accuracy and robustness of the proposed method in various tasks: sinusoidal regression, image classification, active learning, and reinforcement learning. |
Tasks | Active Learning, Image Classification, Meta-Learning |
Published | 2018-06-11 |
URL | http://arxiv.org/abs/1806.03836v4, http://arxiv.org/pdf/1806.03836v4.pdf |
PWC | https://paperswithcode.com/paper/bayesian-model-agnostic-meta-learning |
Repo | https://github.com/jaesik817/bmaml |
Framework | tf |
Acceleration of RED via Vector Extrapolation
Title | Acceleration of RED via Vector Extrapolation |
Authors | Tao Hong, Yaniv Romano, Michael Elad |
Abstract | Models play an important role in inverse problems, serving as the prior for representing the original signal to be recovered. REgularization by Denoising (RED) is a recently introduced general framework for constructing such priors using state-of-the-art denoising algorithms. Using RED, solving inverse problems is shown to amount to an iterated denoising process. However, as the complexity of denoising algorithms is generally high, this might lead to an overall slow algorithm. In this paper, we suggest an accelerated technique based on vector extrapolation (VE) to speed up existing RED solvers. Numerical experiments validate the gain obtained by VE, leading to substantial savings in computation compared with the original fixed-point method. |
Tasks | Denoising |
Published | 2018-05-06 |
URL | http://arxiv.org/abs/1805.02158v2, http://arxiv.org/pdf/1805.02158v2.pdf |
PWC | https://paperswithcode.com/paper/acceleration-of-red-via-vector-extrapolation |
Repo | https://github.com/happyhongt/Acceleration-of-RED-via-Vector-Extrapolation |
Framework | none |
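The abstract above describes RED as an iterated denoising process solved by a fixed-point method. The sketch below illustrates that baseline iteration for the simplest case, pure denoising with an identity degradation operator, where the update reduces to x ← (y/σ² + λ f(x)) / (1/σ² + λ). It does not implement vector extrapolation. The three-tap moving average stands in for a real denoiser and is not guaranteed to satisfy the conditions the RED framework assumes. `sigma`, `lam`, and all function names are illustrative.

```python
def moving_average_denoiser(x):
    """Three-tap moving average with clamped borders (stand-in denoiser)."""
    n = len(x)
    return [
        (x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]


def red_fixed_point(y, denoiser, sigma=1.0, lam=1.0, num_iters=100):
    """Fixed-point RED iteration for denoising (degradation operator = identity).

    Each pass applies the denoiser once, then blends its output with the
    observation y according to the data weight 1/sigma^2 and the prior weight lam.
    """
    data_weight = 1.0 / sigma ** 2
    x = list(y)
    for _ in range(num_iters):
        fx = denoiser(x)
        x = [
            (data_weight * yi + lam * fi) / (data_weight + lam)
            for yi, fi in zip(y, fx)
        ]
    return x


noisy = [0.0, 1.2, 0.9, 1.1, 5.0, 1.0, 0.8, 1.05]
restored = red_fixed_point(noisy, moving_average_denoiser)
```

Every iteration costs one denoiser call, so with an expensive denoiser the iteration count dominates the runtime. That count is what an acceleration scheme such as the paper's vector extrapolation aims to reduce.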
Subgradient Descent Learns Orthogonal Dictionaries
Title | Subgradient Descent Learns Orthogonal Dictionaries |
Authors | Yu Bai, Qijia Jiang, Ju Sun |
Abstract | This paper concerns dictionary learning, i.e., sparse coding, a fundamental representation learning problem. We show that a subgradient descent algorithm, with random initialization, can provably recover orthogonal dictionaries on a natural nonsmooth, nonconvex $\ell_1$ minimization formulation of the problem, under mild statistical assumptions on the data. This is in contrast to previous provable methods that require either expensive computation or delicate initialization schemes. Our analysis develops several tools for characterizing landscapes of nonsmooth functions, which might be of independent interest for provable training of deep networks with nonsmooth activations (e.g., ReLU), among numerous other applications. Preliminary experiments corroborate our analysis and show that our algorithm works well empirically in recovering orthogonal dictionaries. |
Tasks | Dictionary Learning, Representation Learning |
Published | 2018-10-25 |
URL | https://arxiv.org/abs/1810.10702v2, https://arxiv.org/pdf/1810.10702v2.pdf |
PWC | https://paperswithcode.com/paper/subgradient-descent-learns-orthogonal |
Repo | https://github.com/sunju/ODL_L1 |
Framework | none |
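The abstract above refers to subgradient descent with random initialization on a nonsmooth $\ell_1$ formulation. A minimal sketch of one plausible reading follows: minimize f(q) = (1/m) Σ_j |qᵀy_j| over unit vectors q, taking subgradient steps projected onto the sphere's tangent space and renormalizing. The synthetic data (identity dictionary, sparse Gaussian codes), the step-size schedule, and all names are assumptions for illustration and are not taken from the paper or its repository.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def l1_objective(q, samples):
    """f(q) = average of |q^T y| over the data samples."""
    return sum(abs(dot(q, y)) for y in samples) / len(samples)


def subgradient_descent_on_sphere(samples, num_iters=300, seed=0):
    """Projected (Riemannian) subgradient descent from a random unit vector."""
    rng = random.Random(seed)
    n, m = len(samples[0]), len(samples)
    q = normalize([rng.gauss(0.0, 1.0) for _ in range(n)])
    for k in range(1, num_iters + 1):
        step = 0.1 / math.sqrt(k)  # diminishing step size (illustrative choice)
        # Subgradient of the l1 objective: average of sign(q^T y) * y.
        g = [0.0] * n
        for y in samples:
            proj = dot(q, y)
            s = (proj > 0) - (proj < 0)
            for i in range(n):
                g[i] += s * y[i] / m
        # Remove the component along q, step, then return to the sphere.
        qg = dot(q, g)
        tangent = [gi - qg * qi for gi, qi in zip(g, q)]
        q = normalize([qi - step * ti for qi, ti in zip(q, tangent)])
    return q


# Synthetic data: identity dictionary, each code entry nonzero with probability 0.3.
data_rng = random.Random(1)
samples = [
    [data_rng.gauss(0.0, 1.0) if data_rng.random() < 0.3 else 0.0 for _ in range(4)]
    for _ in range(200)
]
q_hat = subgradient_descent_on_sphere(samples)
final_value = l1_objective(q_hat, samples)
```

With an identity dictionary, a successful run would be expected to align `q_hat` with one coordinate axis up to sign. The sketch does not verify this, and recovering the full dictionary would require repeated runs or further steps that are not shown here.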
FOTS: Fast Oriented Text Spotting with a Unified Network
Title | FOTS: Fast Oriented Text Spotting with a Unified Network |
Authors | Xuebo Liu, Ding Liang, Shi Yan, Dagui Chen, Yu Qiao, Junjie Yan |
Abstract | Incidental scene text spotting is considered one of the most difficult and valuable challenges in the document analysis community. Most existing methods treat text detection and recognition as separate tasks. In this work, we propose a unified end-to-end trainable Fast Oriented Text Spotting (FOTS) network for simultaneous detection and recognition, sharing computation and visual information between the two complementary tasks. Specifically, RoIRotate is introduced to share convolutional features between detection and recognition. Thanks to this convolution sharing strategy, FOTS adds little computational overhead compared to a baseline text detection network, and the joint training method learns more generic features, which makes our method perform better than two-stage methods. Experiments on the ICDAR 2015, ICDAR 2017 MLT, and ICDAR 2013 datasets demonstrate that the proposed method significantly outperforms state-of-the-art methods. This further allows us to develop the first real-time oriented text spotting system, which surpasses all previous state-of-the-art results by more than 5% on the ICDAR 2015 text spotting task while running at 22.6 fps. |
Tasks | Scene Text Detection, Scene Text Recognition, Text Spotting |
Published | 2018-01-05 |
URL | http://arxiv.org/abs/1801.01671v2, http://arxiv.org/pdf/1801.01671v2.pdf |
PWC | https://paperswithcode.com/paper/fots-fast-oriented-text-spotting-with-a |
Repo | https://github.com/yu20103983/FOTS |
Framework | tf |
A comparative study of fairness-enhancing interventions in machine learning
Title | A comparative study of fairness-enhancing interventions in machine learning |
Authors | Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian, Sonam Choudhary, Evan P. Hamilton, Derek Roth |
Abstract | Computers are increasingly used to make decisions that have a significant impact on people’s lives. Often, these predictions can affect different population subgroups disproportionately. As a result, the issue of fairness has received much recent interest, and a number of fairness-enhanced classifiers and predictors have appeared in the literature. This paper seeks to study the following questions: how do these different techniques fundamentally compare to one another, and what accounts for the differences? Specifically, we seek to bring attention to many under-appreciated aspects of such fairness-enhancing interventions. Concretely, we present the results of an open benchmark we have developed that lets us compare a number of different algorithms under a variety of fairness measures and on a large number of existing datasets. We find that although different algorithms tend to prefer specific formulations of fairness preservation, many of these measures strongly correlate with one another. In addition, we find that fairness-preserving algorithms tend to be sensitive to fluctuations in dataset composition (simulated in our benchmark by varying training-test splits), indicating that fairness interventions might be more brittle than previously thought. |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04422v1, http://arxiv.org/pdf/1802.04422v1.pdf |
PWC | https://paperswithcode.com/paper/a-comparative-study-of-fairness-enhancing |
Repo | https://github.com/algofairness/fairness-comparison |
Framework | none |
In Defense of the Classification Loss for Person Re-Identification
Title | In Defense of the Classification Loss for Person Re-Identification |
Authors | Yao Zhai, Xun Guo, Yan Lu, Houqiang Li |
Abstract | Recent research on person re-identification has focused on two trends. One is learning part-based local features to form more informative feature descriptors. The other is designing effective metric learning loss functions such as the triplet loss family. We argue that learning global features with a classification loss can achieve the same goal, even with a simple and cost-effective architecture design. In this paper, we first explain why a person re-id framework with a standard classification loss usually performs worse than metric learning. Based on that, we further propose a person re-id framework featuring channel grouping and a multi-branch strategy, which divides global features into multiple channel groups and learns discriminative channel group features with multi-branch classification layers. Extensive experiments show that our framework outperforms prior state-of-the-art methods in terms of both accuracy and inference speed. |
Tasks | Metric Learning, Person Re-Identification |
Published | 2018-09-16 |
URL | http://arxiv.org/abs/1809.05864v2, http://arxiv.org/pdf/1809.05864v2.pdf |
PWC | https://paperswithcode.com/paper/in-defense-of-the-classification-loss-for |
Repo | https://github.com/MARMOTatZJU/ZSLPR-TIANCHI |
Framework | pytorch |