October 17, 2019

2577 words 13 mins read

Paper Group ANR 838

Swarm Intelligence: Past, Present and Future. From Random Differential Equations to Structural Causal Models: the stochastic case. High Throughput Synchronous Distributed Stochastic Gradient Descent. Computing the Shattering Coefficient of Supervised Learning Algorithms. Neural Networks Trained to Solve Differential Equations Learn General Represen …

Swarm Intelligence: Past, Present and Future

Title Swarm Intelligence: Past, Present and Future
Authors Xin-She Yang, Suash Deb, Yuxin Zhao, Simon Fong, Xingshi He
Abstract Many optimization problems in science and engineering are challenging to solve, and the current trend is to use swarm intelligence (SI) and SI-based algorithms to tackle such challenging problems. Some significant developments have been made in recent years, though there are still many open problems in this area. This paper provides a short but timely analysis of SI-based algorithms and their links with self-organization. Different characteristics and properties are analyzed here from both mathematical and qualitative perspectives. Future research directions are outlined and open questions are also highlighted.
Tasks
Published 2018-04-21
URL http://arxiv.org/abs/1804.07999v1
PDF http://arxiv.org/pdf/1804.07999v1.pdf
PWC https://paperswithcode.com/paper/swarm-intelligence-past-present-and-future
Repo
Framework
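
The survey discusses SI-based algorithms as a class rather than any single method. As a concrete point of reference (not an example taken from the paper), here is a minimal particle swarm optimization loop, one of the canonical SI algorithms; the sphere objective, search bounds, and all parameter values below are illustrative assumptions.

```python
import numpy as np

def pso(objective, dim=2, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimization: each velocity blends inertia,
    attraction to the particle's own best point, and attraction to the swarm's best."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))            # positions
    v = np.zeros_like(x)                                   # velocities
    pbest = x.copy()                                       # per-particle best positions
    pbest_val = np.apply_along_axis(objective, 1, x)       # their objective values
    gbest = pbest[pbest_val.argmin()].copy()               # swarm-wide best position
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.apply_along_axis(objective, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

# Example: minimize the sphere function (assumed toy objective).
best_x, best_f = pso(lambda z: float(np.sum(z ** 2)))
print(best_x, best_f)
```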

From Random Differential Equations to Structural Causal Models: the stochastic case

Title From Random Differential Equations to Structural Causal Models: the stochastic case
Authors Stephan Bongers, Joris M. Mooij
Abstract Random Differential Equations provide a natural extension of Ordinary Differential Equations to the stochastic setting. We show how, and under which conditions, every equilibrium state of a Random Differential Equation (RDE) can be described by a Structural Causal Model (SCM), while preserving the causal semantics. This provides an SCM that captures the stochastic and causal behavior of the RDE, which can model both cycles and confounders. This enables the study of the equilibrium states of the RDE by applying the theory and statistical tools available for SCMs, for example, marginalizations and Markov properties, as we illustrate by means of an example. Our work thus provides a direct connection between two fields that so far have been developing in isolation.
Tasks
Published 2018-03-23
URL http://arxiv.org/abs/1803.08784v2
PDF http://arxiv.org/pdf/1803.08784v2.pdf
PWC https://paperswithcode.com/paper/from-random-differential-equations-to
Repo
Framework
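
A toy illustration of the kind of correspondence the paper studies (my own example, not one from the paper): a linear RDE with random coefficients whose equilibrium can be read off as a structural equation of an SCM.

```latex
% Toy example (not from the paper): a linear RDE and its equilibrium SCM.
% Let A > 0 and B be random variables (the random coefficients), and consider
\[
  \frac{dX_t}{dt} = -A\,X_t + B, \qquad t \ge 0 .
\]
% For each realisation of (A, B), the trajectory converges to the equilibrium
\[
  X^{\ast} = \frac{B}{A},
\]
% which can be read as a structural equation X := f(A, B) = B / A of an SCM with
% exogenous variables (A, B). The paper establishes conditions under which such a
% description exists in general, including for cyclic and confounded RDEs.
```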

High Throughput Synchronous Distributed Stochastic Gradient Descent

Title High Throughput Synchronous Distributed Stochastic Gradient Descent
Authors Michael Teng, Frank Wood
Abstract We introduce a new, high-throughput, synchronous, distributed, data-parallel, stochastic-gradient-descent learning algorithm. This algorithm uses amortized inference in a compute-cluster-specific, deep, generative, dynamical model to perform joint posterior predictive inference of the mini-batch gradient computation times of all worker-nodes in a parallel computing cluster. We show that a synchronous parameter server can, by utilizing such a model, choose an optimal cutoff time, beyond which mini-batch gradient messages from slow workers are ignored, that maximizes overall mini-batch gradient computations per second. In keeping with earlier findings we observe that, under realistic conditions, eagerly discarding the mini-batch gradient computations of stragglers not only increases throughput but actually increases the overall rate of convergence as a function of wall-clock time by virtue of eliminating idleness. The principal novel contribution and finding of this work goes beyond this by demonstrating that using the predicted run-times from a generative model of cluster worker performance to dynamically adjust the cutoff improves substantially over the static-cutoff prior art, leading to, among other things, significantly reduced deep neural net training times on large computer clusters.
Tasks
Published 2018-03-12
URL http://arxiv.org/abs/1803.04209v1
PDF http://arxiv.org/pdf/1803.04209v1.pdf
PWC https://paperswithcode.com/paper/high-throughput-synchronous-distributed
Repo
Framework
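
The scheduling idea in the abstract (drop gradients arriving after a cutoff, with the cutoff chosen to maximize gradient computations per second) can be sketched as below. The posterior predictive runtime samples are assumed to come from some model of worker performance; the function name, the candidate grid, and the log-normal toy distribution are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def choose_cutoff(predicted_runtimes, candidate_cutoffs):
    """Pick the cutoff that maximizes expected mini-batch gradients per second.

    predicted_runtimes: array of shape (n_samples, n_workers) holding posterior
        predictive samples of each worker's gradient computation time (seconds).
    candidate_cutoffs: iterable of cutoff times to evaluate.
    """
    best_cutoff, best_rate = None, -np.inf
    for t in candidate_cutoffs:
        # Expected number of workers whose gradient arrives before the cutoff.
        expected_arrivals = (predicted_runtimes <= t).sum(axis=1).mean()
        rate = expected_arrivals / t   # gradients contributed per second of wall clock
        if rate > best_rate:
            best_cutoff, best_rate = t, rate
    return best_cutoff, best_rate

# Toy usage with log-normal runtime samples for 8 workers (assumed distribution).
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=0.0, sigma=0.5, size=(1000, 8))
cutoff, rate = choose_cutoff(samples, candidate_cutoffs=np.linspace(0.5, 3.0, 26))
print(f"cutoff={cutoff:.2f}s, expected gradients/sec={rate:.2f}")
```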

Computing the Shattering Coefficient of Supervised Learning Algorithms

Title Computing the Shattering Coefficient of Supervised Learning Algorithms
Authors Rodrigo Fernandes de Mello, Moacir Antonelli Ponti, Carlos Henrique Grossi Ferreira
Abstract Statistical Learning Theory (SLT) provides the theoretical guarantees for supervised machine learning based on the Empirical Risk Minimization Principle (ERMP). This principle defines an upper bound to ensure the uniform convergence of the empirical risk Remp(f), i.e., the error measured on a given data sample, to the expected value of risk R(f) (a.k.a. actual risk), which depends on the Joint Probability Distribution P(X × Y) mapping input examples x in X to class labels y in Y. The uniform convergence is only ensured when the Shattering coefficient N(F,2n) grows polynomially. This paper proves the Shattering coefficient for any Hilbert space H containing the input space X and discusses its effects in terms of learning guarantees for supervised machine learning algorithms.
Tasks
Published 2018-05-07
URL http://arxiv.org/abs/1805.02627v4
PDF http://arxiv.org/pdf/1805.02627v4.pdf
PWC https://paperswithcode.com/paper/computing-the-shattering-coefficient-of
Repo
Framework
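
For context, the uniform-convergence statement the abstract refers to is the standard symmetrization bound from SLT. One common textbook form is stated below (the constants vary across presentations, and this is not quoted from the paper): polynomial growth of the shattering coefficient makes the right-hand side vanish as n grows.

```latex
% A common form of the SLT uniform-convergence bound (constants vary by textbook):
\[
  P\!\left( \sup_{f \in \mathcal{F}}
      \bigl| R_{\mathrm{emp}}(f) - R(f) \bigr| > \epsilon \right)
  \;\le\; 2\, \mathcal{N}(\mathcal{F}, 2n)\, \exp\!\left( -\, n \epsilon^{2} / 4 \right).
\]
% If the shattering coefficient N(F, 2n) grows only polynomially in n, the
% exponential factor dominates and the bound goes to zero, which is the
% learning guarantee the abstract alludes to.
```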

Neural Networks Trained to Solve Differential Equations Learn General Representations

Title Neural Networks Trained to Solve Differential Equations Learn General Representations
Authors Martin Magill, Faisal Qureshi, Hendrick W. de Haan
Abstract We introduce a technique based on the singular vector canonical correlation analysis (SVCCA) for measuring the generality of neural network layers across a continuously-parametrized set of tasks. We illustrate this method by studying generality in neural networks trained to solve parametrized boundary value problems based on the Poisson partial differential equation. We find that the first hidden layer is general, and that deeper layers are successively more specific. Next, we validate our method against an existing technique that measures layer generality using transfer learning experiments. We find excellent agreement between the two methods, and note that our method is much faster, particularly for continuously-parametrized problems. Finally, we visualize the general representations of the first layers, and interpret them as generalized coordinates over the input domain.
Tasks Transfer Learning
Published 2018-06-29
URL http://arxiv.org/abs/1807.00042v1
PDF http://arxiv.org/pdf/1807.00042v1.pdf
PWC https://paperswithcode.com/paper/neural-networks-trained-to-solve-differential
Repo
Framework
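
The measurement tool in the abstract is SVCCA applied to layer activations. Below is a generic numpy sketch of an SVCCA-style similarity score (center, SVD-reduce each layer, then CCA on the reduced sets via principal angles); the variance threshold, the toy shapes, and the function name are assumptions, and this is not the authors' code.

```python
import numpy as np

def svcca_similarity(acts1, acts2, var_kept=0.99):
    """Rough SVCCA similarity between two layers' activations (n_examples x n_units)."""
    def svd_reduce(a):
        a = a - a.mean(axis=0)                            # center each unit
        u, s, _ = np.linalg.svd(a, full_matrices=False)
        k = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), var_kept) + 1
        return u[:, :k] * s[:k]                           # projections onto top-k directions

    x, y = svd_reduce(acts1), svd_reduce(acts2)

    def orthonormal_basis(a, eps=1e-8):
        u, s, _ = np.linalg.svd(a, full_matrices=False)
        return u[:, s > eps]                              # basis of the column space

    qx, qy = orthonormal_basis(x), orthonormal_basis(y)
    corrs = np.linalg.svd(qx.T @ qy, compute_uv=False)    # canonical correlations
    return float(np.mean(corrs))

# Toy usage: two random "layers" recorded over the same 500 inputs (assumed shapes).
rng = np.random.default_rng(0)
h1, h2 = rng.normal(size=(500, 64)), rng.normal(size=(500, 32))
print(svcca_similarity(h1, h2))
```

In the paper's setting, a layer is judged more general the higher this similarity stays when the activations come from networks trained on different tasks in the parametrized family.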

Level-Based Analysis of the Population-Based Incremental Learning Algorithm

Title Level-Based Analysis of the Population-Based Incremental Learning Algorithm
Authors Per Kristian Lehre, Phan Trung Hai Nguyen
Abstract The Population-Based Incremental Learning (PBIL) algorithm uses a convex combination of the current model and the empirical model to construct the next model, which is then sampled to generate offspring. The Univariate Marginal Distribution Algorithm (UMDA) is a special case of the PBIL, where the current model is ignored. Dang and Lehre (GECCO 2015) showed that UMDA can optimise LeadingOnes efficiently. The question remained open whether the PBIL performs equally well. Here, by applying the level-based theorem in addition to the Dvoretzky–Kiefer–Wolfowitz inequality, we show that the PBIL optimises the function LeadingOnes in expected time $\mathcal{O}(n\lambda \log \lambda + n^2)$ for a population size $\lambda = \Omega(\log n)$, which matches the bound of the UMDA. Finally, we show that the result carries over to BinVal, giving the first runtime result for the PBIL on the BinVal problem.
Tasks
Published 2018-06-05
URL http://arxiv.org/abs/1806.01710v1
PDF http://arxiv.org/pdf/1806.01710v1.pdf
PWC https://paperswithcode.com/paper/level-based-analysis-of-the-population-based
Repo
Framework
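
A minimal PBIL loop on LeadingOnes, matching the model update described in the abstract (the next model is a convex combination of the current model and the empirical model of the selected offspring). The truncation-selection scheme, the margin clipping, and all parameter values are illustrative assumptions, not the analyzed parametrization.

```python
import numpy as np

def leading_ones(x):
    """Number of consecutive ones counted from the left."""
    zeros = np.flatnonzero(x == 0)
    return len(x) if zeros.size == 0 else int(zeros[0])

def pbil_leading_ones(n=50, lam=100, mu=25, rho=0.3, iters=2000, seed=0):
    """PBIL: sample lambda offspring from the product distribution p, select the
    mu best, and move p toward their empirical bit frequencies."""
    rng = np.random.default_rng(seed)
    p = np.full(n, 0.5)                                   # current model
    for _ in range(iters):
        pop = (rng.random((lam, n)) < p).astype(int)      # sample offspring
        fitness = np.array([leading_ones(ind) for ind in pop])
        best = pop[np.argsort(-fitness)[:mu]]             # truncation selection
        empirical = best.mean(axis=0)                     # empirical model
        p = (1 - rho) * p + rho * empirical               # convex combination update
        p = np.clip(p, 1.0 / n, 1.0 - 1.0 / n)            # keep margins (common practice)
        if leading_ones((p > 0.5).astype(int)) == n:
            break
    return p

p = pbil_leading_ones()
print((p > 0.5).astype(int))
```

Setting rho = 1 recovers the UMDA special case mentioned in the abstract, where the current model is ignored.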

Luminoso at SemEval-2018 Task 10: Distinguishing Attributes Using Text Corpora and Relational Knowledge

Title Luminoso at SemEval-2018 Task 10: Distinguishing Attributes Using Text Corpora and Relational Knowledge
Authors Robyn Speer, Joanna Lowry-Duda
Abstract Luminoso participated in the SemEval 2018 task on “Capturing Discriminative Attributes” with a system based on ConceptNet, an open knowledge graph focused on general knowledge. In this paper, we describe how we trained a linear classifier on a small number of semantically-informed features to achieve an $F_1$ score of 0.7368 on the task, close to the task’s high score of 0.75.
Tasks
Published 2018-06-05
URL http://arxiv.org/abs/1806.01733v2
PDF http://arxiv.org/pdf/1806.01733v2.pdf
PWC https://paperswithcode.com/paper/luminoso-at-semeval-2018-task-10
Repo
Framework

Learning of Tree-Structured Gaussian Graphical Models on Distributed Data under Communication Constraints

Title Learning of Tree-Structured Gaussian Graphical Models on Distributed Data under Communication Constraints
Authors Mostafa Tavassolipour, Seyed Abolfazl Motahari, Mohammad-Taghi Manzuri Shalmani
Abstract In this paper, learning of tree-structured Gaussian graphical models from distributed data is addressed. In our model, samples are stored in a set of distributed machines where each machine has access to only a subset of features. A central machine is then responsible for learning the structure based on received messages from the other nodes. We present a set of communication efficient strategies, which are theoretically proved to convey sufficient information for reliable learning of the structure. In particular, our analyses show that even if each machine sends only the signs of its local data samples to the central node, the tree structure can still be recovered with high accuracy. Our simulation results on both synthetic and real-world datasets show that our strategies achieve a desired accuracy in inferring the underlying structure, while spending a small budget on communication.
Tasks
Published 2018-09-21
URL http://arxiv.org/abs/1809.08067v1
PDF http://arxiv.org/pdf/1809.08067v1.pdf
PWC https://paperswithcode.com/paper/learning-of-tree-structured-gaussian
Repo
Framework
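
One of the strategies highlighted in the abstract sends only the signs of the local samples to the central node. The sketch below illustrates why that can suffice for zero-mean Gaussian data: sign-agreement frequencies identify the pairwise correlations through the arcsine identity, after which a maximum-weight spanning tree recovers the structure. The chain example, sample sizes, and use of scipy's MST solver are my assumptions, not the paper's protocol.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def tree_from_signs(signs):
    """Estimate a tree structure from sign-only messages (illustrative sketch).

    signs: array of shape (n_samples, n_features) with sign(x) in {-1, +1},
           assuming zero-mean Gaussian data at the machines.
    Returns the edge list of the estimated spanning tree.
    """
    # For zero-mean Gaussians, E[sign(X_i) sign(X_j)] = (2/pi) * arcsin(rho_ij).
    sign_corr = signs.T @ signs / signs.shape[0]
    rho = np.sin(np.pi / 2.0 * np.clip(sign_corr, -1.0, 1.0))
    # Chow-Liu for Gaussians: mutual information is monotone in |rho|, so take a
    # maximum spanning tree under |rho| (negate weights to reuse an MST solver).
    weights = -np.abs(rho)
    np.fill_diagonal(weights, 0.0)
    mst = minimum_spanning_tree(weights).tocoo()
    return sorted((min(i, j), max(i, j)) for i, j in zip(mst.row, mst.col))

# Toy usage: a chain 0-1-2-3 of unit-variance Gaussians (assumed data).
rng = np.random.default_rng(0)
x = np.zeros((5000, 4))
x[:, 0] = rng.normal(size=5000)
for k in range(1, 4):
    x[:, k] = 0.8 * x[:, k - 1] + 0.6 * rng.normal(size=5000)
print(tree_from_signs(np.sign(x)))   # expected: [(0, 1), (1, 2), (2, 3)]
```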

Image and Encoded Text Fusion for Multi-Modal Classification

Title Image and Encoded Text Fusion for Multi-Modal Classification
Authors Ignazio Gallo, Alessandro Calefati, Shah Nawaz, Muhammad Kamran Janjua
Abstract Multi-modal approaches employ data from multiple input streams such as textual and visual domains. Deep neural networks have been successfully employed for these approaches. In this paper, we present a novel multi-modal approach that fuses images and text descriptions to improve multi-modal classification performance in real-world scenarios. The proposed approach embeds an encoded text onto an image to obtain an information-enriched image. To learn feature representations of the resulting images, standard Convolutional Neural Networks (CNNs) are employed for the classification task. We demonstrate how a CNN-based pipeline can be used to learn representations of the novel fusion approach. We compare our approach with individual sources on two large-scale multi-modal classification datasets, obtaining encouraging results. Furthermore, we evaluate our approach against two well-known multi-modal strategies, namely early fusion and late fusion.
Tasks
Published 2018-10-03
URL http://arxiv.org/abs/1810.02001v1
PDF http://arxiv.org/pdf/1810.02001v1.pdf
PWC https://paperswithcode.com/paper/image-and-encoded-text-fusion-for-multi-modal
Repo
Framework
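
The fusion idea in the abstract is to write an encoded text onto the image itself, so a single standard CNN sees both modalities. The sketch below shows one way such an overlay could look; the strip layout, the rescaling, and the toy shapes are my assumptions, and the paper's actual encoding scheme may differ.

```python
import numpy as np

def embed_text_onto_image(image, text_embedding, strip_height=8):
    """Overlay an encoded text vector onto the top rows of an image.

    image: float array (H, W, C) with values in [0, 1].
    text_embedding: 1-D float vector (e.g. a sentence embedding).
    The embedding is rescaled to [0, 1], laid out as a pixel strip, and written
    into the image, producing one "information-enriched" image that a standard
    CNN classifier can consume.
    """
    h, w, c = image.shape
    emb = np.asarray(text_embedding, dtype=float)
    emb = (emb - emb.min()) / (emb.max() - emb.min() + 1e-8)   # rescale to [0, 1]
    strip = np.resize(emb, (strip_height, w))                   # lay out as pixels
    fused = image.copy()
    fused[:strip_height] = strip[:, :, None]                    # broadcast over channels
    return fused

# Toy usage (assumed shapes): a 64x64 RGB image and a 300-d text embedding.
rng = np.random.default_rng(0)
img = rng.random((64, 64, 3))
txt = rng.normal(size=300)
print(embed_text_onto_image(img, txt).shape)   # (64, 64, 3), ready for a CNN
```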

Predictor-Corrector Policy Optimization

Title Predictor-Corrector Policy Optimization
Authors Ching-An Cheng, Xinyan Yan, Nathan Ratliff, Byron Boots
Abstract We present a predictor-corrector framework, called PicCoLO, that can transform a first-order model-free reinforcement or imitation learning algorithm into a new hybrid method that leverages predictive models to accelerate policy learning. The new “PicCoLOed” algorithm optimizes a policy by recursively repeating two steps: In the Prediction Step, the learner uses a model to predict the unseen future gradient and then applies the predicted estimate to update the policy; in the Correction Step, the learner runs the updated policy in the environment, receives the true gradient, and then corrects the policy using the gradient error. Unlike previous algorithms, PicCoLO corrects for the mistakes of using imperfect predicted gradients and hence does not suffer from model bias. The development of PicCoLO is made possible by a novel reduction from predictable online learning to adversarial online learning, which provides a systematic way to modify existing first-order algorithms to achieve the optimal regret with respect to predictable information. We show, in both theory and simulation, that the convergence rate of several first-order model-free algorithms can be improved by PicCoLO.
Tasks Imitation Learning
Published 2018-10-15
URL https://arxiv.org/abs/1810.06509v2
PDF https://arxiv.org/pdf/1810.06509v2.pdf
PWC https://paperswithcode.com/paper/predictor-corrector-policy-optimization
Repo
Framework
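
The two-step structure described in the abstract can be sketched on a toy problem: a prediction step that applies a predicted gradient, followed by a correction step that applies the error between the true and predicted gradients. This is a schematic sketch in the spirit of the abstract (plain gradient descent as the base learner, a deliberately biased gradient model), not the paper's algorithm or its regret analysis.

```python
import numpy as np

def predictor_corrector_updates(grad_fn, grad_model, theta, lr=0.05, steps=100):
    """Predictor-corrector optimization loop.

    grad_fn(theta):    the "true" gradient, obtained by running the policy.
    grad_model(theta): a (possibly biased) model predicting that gradient.
    """
    for _ in range(steps):
        g_hat = grad_model(theta)
        theta = theta - lr * g_hat                 # Prediction Step: use the model
        g_true = grad_fn(theta)
        theta = theta - lr * (g_true - g_hat)      # Correction Step: fix the model's error
    return theta

# Toy usage: minimize 0.5 * ||theta||^2 with a biased gradient model (assumed).
true_grad = lambda th: th
biased_model = lambda th: th + 0.3                 # imperfect predictions
print(predictor_corrector_updates(true_grad, biased_model, theta=np.ones(3)))
```

On this toy problem the iterates settle close to the optimum despite the constant bias in the gradient model, which illustrates the abstract's claim that correcting with the gradient error avoids inheriting model bias.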

Adversarial Examples that Fool both Computer Vision and Time-Limited Humans

Title Adversarial Examples that Fool both Computer Vision and Time-Limited Humans
Authors Gamaleldin F. Elsayed, Shreya Shankar, Brian Cheung, Nicolas Papernot, Alex Kurakin, Ian Goodfellow, Jascha Sohl-Dickstein
Abstract Machine learning models are vulnerable to adversarial examples: small changes to images can cause computer vision models to make mistakes such as identifying a school bus as an ostrich. However, it is still an open question whether humans are prone to similar mistakes. Here, we address this question by leveraging recent techniques that transfer adversarial examples from computer vision models with known parameters and architecture to other models with unknown parameters and architecture, and by matching the initial processing of the human visual system. We find that adversarial examples that strongly transfer across computer vision models influence the classifications made by time-limited human observers.
Tasks
Published 2018-02-22
URL http://arxiv.org/abs/1802.08195v3
PDF http://arxiv.org/pdf/1802.08195v3.pdf
PWC https://paperswithcode.com/paper/adversarial-examples-that-fool-both-computer
Repo
Framework

Skeleton Driven Non-rigid Motion Tracking and 3D Reconstruction

Title Skeleton Driven Non-rigid Motion Tracking and 3D Reconstruction
Authors Shafeeq Elanattil, Peyman Moghadam, Simon Denman, Sridha Sridharan, Clinton Fookes
Abstract This paper presents a method which can track and 3D reconstruct the non-rigid surface motion of human performance using a moving RGB-D camera. 3D reconstruction of marker-less human performance is a challenging problem due to the large range of articulated motions and considerable non-rigid deformations. Current approaches use local optimization for tracking. These methods need many iterations to converge and may get stuck in local minima during sudden articulated movements. We propose a puppet model-based tracking approach using skeleton prior, which provides a better initialization for tracking articulated movements. The proposed approach uses an aligned puppet model to estimate correct correspondences for human performance capture. We also contribute a synthetic dataset which provides ground truth locations for frame-by-frame geometry and skeleton joints of human subjects. Experimental results show that our approach is more robust when faced with sudden articulated motions, and provides better 3D reconstruction compared to the existing state-of-the-art approaches.
Tasks 3D Reconstruction
Published 2018-10-09
URL http://arxiv.org/abs/1810.03774v1
PDF http://arxiv.org/pdf/1810.03774v1.pdf
PWC https://paperswithcode.com/paper/skeleton-driven-non-rigid-motion-tracking-and
Repo
Framework

Preventing Failures Due to Dataset Shift: Learning Predictive Models That Transport

Title Preventing Failures Due to Dataset Shift: Learning Predictive Models That Transport
Authors Adarsh Subbaswamy, Peter Schulam, Suchi Saria
Abstract Classical supervised learning produces unreliable models when training and target distributions differ, with most existing solutions requiring samples from the target domain. We propose a proactive approach which learns a relationship in the training domain that will generalize to the target domain by incorporating prior knowledge of aspects of the data generating process that are expected to differ as expressed in a causal selection diagram. Specifically, we remove variables generated by unstable mechanisms from the joint factorization to yield the Surgery Estimator—an interventional distribution that is invariant to the differences across environments. We prove that the surgery estimator finds stable relationships in strictly more scenarios than previous approaches which only consider conditional relationships, and demonstrate this in simulated experiments. We also evaluate on real world data for which the true causal diagram is unknown, performing competitively against entirely data-driven approaches.
Tasks
Published 2018-12-11
URL http://arxiv.org/abs/1812.04597v2
PDF http://arxiv.org/pdf/1812.04597v2.pdf
PWC https://paperswithcode.com/paper/preventing-failures-due-to-dataset-shift
Repo
Framework
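
A toy version of the "graph surgery" idea in the abstract (my illustration, not an example from the paper): delete from the joint factorization the factor of the mechanism expected to shift across environments, and work with the resulting interventional distribution.

```latex
% Toy illustration (not from the paper). Suppose the training distribution factorizes
% according to the causal graph Z -> X -> Y as
\[
  P(Y, X, Z) \;=\; P(Z)\, P(X \mid Z)\, P(Y \mid X),
\]
% and prior knowledge marks the mechanism generating Z as unstable across environments.
% Removing that factor yields the interventional (surgically altered) distribution
\[
  P(Y, X \mid do(Z = z)) \;=\; P(X \mid Z = z)\, P(Y \mid X),
\]
% which no longer depends on how Z is generated, so predictions derived from it are
% invariant to shifts in P(Z) between training and deployment.
```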

Light-weight pixel context encoders for image inpainting

Title Light-weight pixel context encoders for image inpainting
Authors Nanne van Noord, Eric Postma
Abstract In this work we propose Pixel Content Encoders (PCE), a light-weight image inpainting model, capable of generating novel content for large missing regions in images. Unlike previously presented convolutional neural network based models, our PCE model has an order of magnitude fewer trainable parameters. Moreover, by incorporating dilated convolutions we are able to preserve fine-grained spatial information, achieving state-of-the-art performance on benchmark datasets of natural images and paintings. Besides image inpainting, we show that without changing the architecture, PCE can be used for image extrapolation, generating novel content beyond existing image boundaries.
Tasks Image Inpainting
Published 2018-01-17
URL http://arxiv.org/abs/1801.05585v1
PDF http://arxiv.org/pdf/1801.05585v1.pdf
PWC https://paperswithcode.com/paper/light-weight-pixel-context-encoders-for-image
Repo
Framework
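
The abstract's key architectural choice is dilated convolutions, which grow the receptive field without pooling and so preserve fine-grained spatial information. The sketch below is a tiny dilated-convolution inpainting network in that spirit; the use of PyTorch, the layer widths, and the dilation rates are assumptions for illustration, not the paper's PCE architecture.

```python
import torch
import torch.nn as nn

class TinyPixelContextEncoder(nn.Module):
    """A small fully convolutional inpainting network built around dilated convolutions."""

    def __init__(self, channels=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            # Dilated convolutions enlarge the receptive field without pooling,
            # keeping the spatial resolution (and fine detail) intact.
            nn.Conv2d(channels, channels, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=4, dilation=4),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, kernel_size=3, padding=1),
            nn.Sigmoid(),                      # predict RGB values in [0, 1]
        )

    def forward(self, masked_image):
        # Input: image with the missing region zeroed out; output: a full image
        # from which the inpainted (or extrapolated) region can be read off.
        return self.net(masked_image)

model = TinyPixelContextEncoder()
out = model(torch.rand(1, 3, 64, 64))          # toy forward pass
print(out.shape)                               # torch.Size([1, 3, 64, 64])
```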

Unifying Identification and Context Learning for Person Recognition

Title Unifying Identification and Context Learning for Person Recognition
Authors Qingqiu Huang, Yu Xiong, Dahua Lin
Abstract Despite the great success of face recognition techniques, recognizing persons under unconstrained settings remains challenging. Issues like profile views, unfavorable lighting, and occlusions can cause substantial difficulties. Previous works have attempted to tackle this problem by exploiting the context, e.g. clothes and social relations. While showing promising improvement, they are usually limited in two important aspects: they rely on simple heuristics to combine different cues, and they separate the construction of context from the reasoning about people's identities. In this work, we aim to move beyond such limitations and propose a new framework to leverage context for person recognition. In particular, we propose a Region Attention Network, which is learned to adaptively combine visual cues with instance-dependent weights. We also develop a unified formulation, where the social contexts are learned along with the reasoning of people identities. These models substantially improve the robustness when working with the complex contextual relations in unconstrained environments. On two large datasets, PIPA and Cast In Movies (CIM), a new dataset proposed in this work, our method consistently achieves state-of-the-art performance under multiple evaluation policies.
Tasks Face Recognition, Person Recognition
Published 2018-06-08
URL http://arxiv.org/abs/1806.03084v1
PDF http://arxiv.org/pdf/1806.03084v1.pdf
PWC https://paperswithcode.com/paper/unifying-identification-and-context-learning
Repo
Framework