Paper Group ANR 838
Swarm Intelligence: Past, Present and Future
Title | Swarm Intelligence: Past, Present and Future |
Authors | Xin-She Yang, Suash Deb, Yuxin Zhao, Simon Fong, Xingshi He |
Abstract | Many optimization problems in science and engineering are challenging to solve, and the current trend is to use swarm intelligence (SI) and SI-based algorithms to tackle them. Some significant developments have been made in recent years, though there are still many open problems in this area. This paper provides a short but timely analysis of SI-based algorithms and their links with self-organization. Different characteristics and properties are analyzed here from both mathematical and qualitative perspectives. Future research directions are outlined and open questions are also highlighted. |
Tasks | |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.07999v1 |
http://arxiv.org/pdf/1804.07999v1.pdf | |
PWC | https://paperswithcode.com/paper/swarm-intelligence-past-present-and-future |
Repo | |
Framework | |
From Random Differential Equations to Structural Causal Models: the stochastic case
Title | From Random Differential Equations to Structural Causal Models: the stochastic case |
Authors | Stephan Bongers, Joris M. Mooij |
Abstract | Random Differential Equations provide a natural extension of Ordinary Differential Equations to the stochastic setting. We show how, and under which conditions, every equilibrium state of a Random Differential Equation (RDE) can be described by a Structural Causal Model (SCM), while preserving the causal semantics. This provides an SCM that captures the stochastic and causal behavior of the RDE, which can model both cycles and confounders. This enables the study of the equilibrium states of the RDE by applying the theory and statistical tools available for SCMs, for example, marginalizations and Markov properties, as we illustrate by means of an example. Our work thus provides a direct connection between two fields that so far have been developing in isolation. |
Tasks | |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08784v2 |
http://arxiv.org/pdf/1803.08784v2.pdf | |
PWC | https://paperswithcode.com/paper/from-random-differential-equations-to |
Repo | |
Framework | |
High Throughput Synchronous Distributed Stochastic Gradient Descent
Title | High Throughput Synchronous Distributed Stochastic Gradient Descent |
Authors | Michael Teng, Frank Wood |
Abstract | We introduce a new, high-throughput, synchronous, distributed, data-parallel, stochastic-gradient-descent learning algorithm. This algorithm uses amortized inference in a compute-cluster-specific, deep, generative, dynamical model to perform joint posterior predictive inference of the mini-batch gradient computation times of all worker-nodes in a parallel computing cluster. We show that a synchronous parameter server can, by utilizing such a model, choose an optimal cutoff time beyond which mini-batch gradient messages from slow workers are ignored that maximizes overall mini-batch gradient computations per second. In keeping with earlier findings we observe that, under realistic conditions, eagerly discarding the mini-batch gradient computations of stragglers not only increases throughput but actually increases the overall rate of convergence as a function of wall-clock time by virtue of eliminating idleness. The principal novel contribution and finding of this work goes beyond this by demonstrating that using the predicted run-times from a generative model of cluster worker performance to dynamically adjust the cutoff improves substantially over the static-cutoff prior art, leading to, among other things, significantly reduced deep neural net training times on large computer clusters. |
Tasks | |
Published | 2018-03-12 |
URL | http://arxiv.org/abs/1803.04209v1 |
http://arxiv.org/pdf/1803.04209v1.pdf | |
PWC | https://paperswithcode.com/paper/high-throughput-synchronous-distributed |
Repo | |
Framework | |
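The cutoff idea in the abstract above can be illustrated with a minimal sketch. It assumes one point prediction of the mini-batch run-time per worker, whereas the paper uses posterior predictive inference from a generative model, and it ignores communication overhead. The function name `best_cutoff` is hypothetical. A cutoff `c` collects every gradient whose predicted time is at most `c`, so throughput is `k / c` for the `k` fastest workers, and only the predicted times themselves need to be tried as cutoffs.

```python
def best_cutoff(predicted_times):
    """Pick the cutoff that maximizes gradients collected per second.

    predicted_times: positive predicted run-times, one per worker
    (assumed point estimates, not the paper's predictive distribution).
    Returns (cutoff, gradients_per_second).
    """
    if not predicted_times or min(predicted_times) <= 0:
        raise ValueError("need at least one positive predicted time")
    best_c, best_rate = None, 0.0
    # Waiting until the k-th fastest worker finishes yields k gradients.
    for k, t in enumerate(sorted(predicted_times), start=1):
        rate = k / t
        if rate > best_rate:
            best_c, best_rate = t, rate
    return best_c, best_rate


cutoff, rate = best_cutoff([1.0, 1.1, 1.2, 5.0])
print(cutoff, rate)
```

With three fast workers and one straggler, the sketch stops waiting after the third worker, because waiting for the fourth would lower the number of gradients per second.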
Computing the Shattering Coefficient of Supervised Learning Algorithms
Title | Computing the Shattering Coefficient of Supervised Learning Algorithms |
Authors | Rodrigo Fernandes de Mello, Moacir Antonelli Ponti, Carlos Henrique Grossi Ferreira |
Abstract | Statistical Learning Theory (SLT) provides the theoretical guarantees for supervised machine learning based on the Empirical Risk Minimization Principle (ERMP). This principle defines an upper bound to ensure the uniform convergence of the empirical risk $R_{\text{emp}}(f)$, i.e., the error measured on a given data sample, to the expected value of risk $R(f)$ (a.k.a. actual risk), which depends on the Joint Probability Distribution $P(X \times Y)$ mapping input examples $x \in X$ to class labels $y \in Y$. The uniform convergence is only ensured when the Shattering coefficient $N(F, 2n)$ has a polynomial growing behavior. This paper proves the Shattering coefficient for any Hilbert space $H$ containing the input space $X$ and discusses its effects in terms of learning guarantees for supervised machine learning algorithms. |
Tasks | |
Published | 2018-05-07 |
URL | http://arxiv.org/abs/1805.02627v4 |
http://arxiv.org/pdf/1805.02627v4.pdf | |
PWC | https://paperswithcode.com/paper/computing-the-shattering-coefficient-of |
Repo | |
Framework | |
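For context on why polynomial growth matters, a commonly stated form of the uniform convergence bound in SLT is shown below. It is not taken from this paper, and the symbols $n$ (sample size) and $\epsilon$ (tolerance) are assumptions of this note.

$$P\left(\sup_{f \in F} \left| R(f) - R_{\text{emp}}(f) \right| > \epsilon\right) \le 2\, N(F, 2n)\, e^{-n\epsilon^2/4}$$

The right-hand side vanishes as $n \to \infty$ whenever $\log N(F, 2n) / n \to 0$, which holds if $N(F, 2n)$ grows polynomially in $n$. If instead $N(F, 2n) = 2^{2n}$, the right-hand side behaves like $2\, e^{n(2 \ln 2 - \epsilon^2/4)}$, which diverges for small $\epsilon$ and gives no guarantee.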
Neural Networks Trained to Solve Differential Equations Learn General Representations
Title | Neural Networks Trained to Solve Differential Equations Learn General Representations |
Authors | Martin Magill, Faisal Qureshi, Hendrick W. de Haan |
Abstract | We introduce a technique based on the singular vector canonical correlation analysis (SVCCA) for measuring the generality of neural network layers across a continuously-parametrized set of tasks. We illustrate this method by studying generality in neural networks trained to solve parametrized boundary value problems based on the Poisson partial differential equation. We find that the first hidden layer is general, and that deeper layers are successively more specific. Next, we validate our method against an existing technique that measures layer generality using transfer learning experiments. We find excellent agreement between the two methods, and note that our method is much faster, particularly for continuously-parametrized problems. Finally, we visualize the general representations of the first layers, and interpret them as generalized coordinates over the input domain. |
Tasks | Transfer Learning |
Published | 2018-06-29 |
URL | http://arxiv.org/abs/1807.00042v1 |
http://arxiv.org/pdf/1807.00042v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-networks-trained-to-solve-differential |
Repo | |
Framework | |
Level-Based Analysis of the Population-Based Incremental Learning Algorithm
Title | Level-Based Analysis of the Population-Based Incremental Learning Algorithm |
Authors | Per Kristian Lehre, Phan Trung Hai Nguyen |
Abstract | The Population-Based Incremental Learning (PBIL) algorithm uses a convex combination of the current model and the empirical model to construct the next model, which is then sampled to generate offspring. The Univariate Marginal Distribution Algorithm (UMDA) is a special case of the PBIL, where the current model is ignored. Dang and Lehre (GECCO 2015) showed that UMDA can optimise LeadingOnes efficiently. Whether the PBIL performs equally well remained an open question. Here, by applying the level-based theorem in addition to the Dvoretzky–Kiefer–Wolfowitz inequality, we show that the PBIL optimises the LeadingOnes function in expected time $\mathcal{O}(n\lambda \log \lambda + n^2)$ for a population size $\lambda = \Omega(\log n)$, which matches the bound of the UMDA. Finally, we show that the result carries over to BinVal, giving the first runtime result for the PBIL on the BinVal problem. |
Tasks | |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01710v1 |
http://arxiv.org/pdf/1806.01710v1.pdf | |
PWC | https://paperswithcode.com/paper/level-based-analysis-of-the-population-based |
Repo | |
Framework | |
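The model update described in the abstract above can be sketched as follows. This is a minimal illustration rather than the authors' exact setup: the smoothing parameter `eta`, the truncation selection of the best `mu` of `lam` offspring, and the margins `[1/n, 1 - 1/n]` on the marginals are assumptions of this sketch. Setting `eta = 1` ignores the current model, which corresponds to the UMDA special case mentioned in the abstract.

```python
import random


def leading_ones(bits):
    """Number of consecutive 1-bits at the start of the bitstring."""
    count = 0
    for b in bits:
        if b != 1:
            break
        count += 1
    return count


def pbil(n, lam, mu, eta, max_generations, seed=0):
    """Run PBIL on LeadingOnes.

    Returns (generation at which the optimum was sampled, or None;
    final vector of marginal probabilities).
    """
    rng = random.Random(seed)
    p = [0.5] * n                       # current model: one marginal per bit
    low, high = 1.0 / n, 1.0 - 1.0 / n  # margins (assumption of this sketch)
    for gen in range(1, max_generations + 1):
        # Sample lam offspring from the current model.
        pop = [[1 if rng.random() < p[i] else 0 for i in range(n)]
               for _ in range(lam)]
        pop.sort(key=leading_ones, reverse=True)
        if leading_ones(pop[0]) == n:
            return gen, p
        selected = pop[:mu]
        for i in range(n):
            empirical = sum(x[i] for x in selected) / mu
            # Convex combination of the current and the empirical model.
            p[i] = (1.0 - eta) * p[i] + eta * empirical
            p[i] = min(high, max(low, p[i]))
    return None, p


generations, model = pbil(n=20, lam=100, mu=20, eta=0.5, max_generations=1000)
print(generations, [round(q, 2) for q in model])
```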
Luminoso at SemEval-2018 Task 10: Distinguishing Attributes Using Text Corpora and Relational Knowledge
Title | Luminoso at SemEval-2018 Task 10: Distinguishing Attributes Using Text Corpora and Relational Knowledge |
Authors | Robyn Speer, Joanna Lowry-Duda |
Abstract | Luminoso participated in the SemEval 2018 task on “Capturing Discriminative Attributes” with a system based on ConceptNet, an open knowledge graph focused on general knowledge. In this paper, we describe how we trained a linear classifier on a small number of semantically-informed features to achieve an $F_1$ score of 0.7368 on the task, close to the task’s high score of 0.75. |
Tasks | |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01733v2 |
http://arxiv.org/pdf/1806.01733v2.pdf | |
PWC | https://paperswithcode.com/paper/luminoso-at-semeval-2018-task-10 |
Repo | |
Framework | |
Learning of Tree-Structured Gaussian Graphical Models on Distributed Data under Communication Constraints
Title | Learning of Tree-Structured Gaussian Graphical Models on Distributed Data under Communication Constraints |
Authors | Mostafa Tavassolipour, Seyed Abolfazl Motahari, Mohammad-Taghi Manzuri Shalmani |
Abstract | In this paper, learning of tree-structured Gaussian graphical models from distributed data is addressed. In our model, samples are stored in a set of distributed machines where each machine has access to only a subset of features. A central machine is then responsible for learning the structure based on received messages from the other nodes. We present a set of communication efficient strategies, which are theoretically proved to convey sufficient information for reliable learning of the structure. In particular, our analyses show that even if each machine sends only the signs of its local data samples to the central node, the tree structure can still be recovered with high accuracy. Our simulation results on both synthetic and real-world datasets show that our strategies achieve a desired accuracy in inferring the underlying structure, while spending a small budget on communication. |
Tasks | |
Published | 2018-09-21 |
URL | http://arxiv.org/abs/1809.08067v1 |
http://arxiv.org/pdf/1809.08067v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-of-tree-structured-gaussian |
Repo | |
Framework | |
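One way to see why signs alone can suffice is sketched below. It is not necessarily the estimator used in the paper. The sketch assumes zero-mean jointly Gaussian features, for which the correlation satisfies $\rho = \sin\left(\frac{\pi}{2}\, E[\operatorname{sign}(x)\operatorname{sign}(y)]\right)$, and it builds the tree as a maximum-weight spanning tree over $|\rho|$, which matches the Chow-Liu tree because Gaussian mutual information is increasing in $|\rho|$. All function names are hypothetical.

```python
import math


def sign_correlation(signs_a, signs_b):
    """Estimate a Gaussian correlation from +/-1 signs (arcsine identity)."""
    mean_product = sum(a * b for a, b in zip(signs_a, signs_b)) / len(signs_a)
    return math.sin(math.pi / 2 * mean_product)


def max_spanning_tree(weights):
    """Kruskal's algorithm on a symmetric weight matrix; returns sorted edges."""
    d = len(weights)
    edges = sorted(((weights[i][j], i, j)
                    for i in range(d) for j in range(i + 1, d)), reverse=True)
    parent = list(range(d))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:        # adding the edge does not create a cycle
            parent[ri] = rj
            tree.append((i, j))
    return sorted(tree)


def learn_tree_from_signs(sign_columns):
    """sign_columns[i] holds the +/-1 signs sent by the machine with feature i."""
    d = len(sign_columns)
    weights = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1, d):
            rho = sign_correlation(sign_columns[i], sign_columns[j])
            weights[i][j] = weights[j][i] = abs(rho)
    return max_spanning_tree(weights)
```

Each machine sends one bit per sample, and the central node only needs pairwise sign agreements to rank candidate edges.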
Image and Encoded Text Fusion for Multi-Modal Classification
Title | Image and Encoded Text Fusion for Multi-Modal Classification |
Authors | Ignazio Gallo, Alessandro Calefati, Shah Nawaz, Muhammad Kamran Janjua |
Abstract | Multi-modal approaches employ data from multiple input streams such as textual and visual domains. Deep neural networks have been successfully employed for these approaches. In this paper, we present a novel multi-modal approach that fuses images and text descriptions to improve multi-modal classification performance in real-world scenarios. The proposed approach embeds an encoded text onto an image to obtain an information-enriched image. To learn feature representations of the resulting images, standard Convolutional Neural Networks (CNNs) are employed for the classification task. We demonstrate how a CNN-based pipeline can be used to learn representations of the novel fusion approach. We compare our approach with individual sources on two large-scale multi-modal classification datasets and obtain encouraging results. Furthermore, we evaluate our approach against two well-known multi-modal strategies, namely early fusion and late fusion. |
Tasks | |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.02001v1 |
http://arxiv.org/pdf/1810.02001v1.pdf | |
PWC | https://paperswithcode.com/paper/image-and-encoded-text-fusion-for-multi-modal |
Repo | |
Framework | |
Predictor-Corrector Policy Optimization
Title | Predictor-Corrector Policy Optimization |
Authors | Ching-An Cheng, Xinyan Yan, Nathan Ratliff, Byron Boots |
Abstract | We present a predictor-corrector framework, called PicCoLO, that can transform a first-order model-free reinforcement or imitation learning algorithm into a new hybrid method that leverages predictive models to accelerate policy learning. The new “PicCoLOed” algorithm optimizes a policy by recursively repeating two steps: In the Prediction Step, the learner uses a model to predict the unseen future gradient and then applies the predicted estimate to update the policy; in the Correction Step, the learner runs the updated policy in the environment, receives the true gradient, and then corrects the policy using the gradient error. Unlike previous algorithms, PicCoLO corrects for the mistakes of using imperfect predicted gradients and hence does not suffer from model bias. The development of PicCoLO is made possible by a novel reduction from predictable online learning to adversarial online learning, which provides a systematic way to modify existing first-order algorithms to achieve the optimal regret with respect to predictable information. We show, in both theory and simulation, that the convergence rate of several first-order model-free algorithms can be improved by PicCoLO. |
Tasks | Imitation Learning |
Published | 2018-10-15 |
URL | https://arxiv.org/abs/1810.06509v2 |
https://arxiv.org/pdf/1810.06509v2.pdf | |
PWC | https://paperswithcode.com/paper/predictor-corrector-policy-optimization |
Repo | |
Framework | |
Adversarial Examples that Fool both Computer Vision and Time-Limited Humans
Title | Adversarial Examples that Fool both Computer Vision and Time-Limited Humans |
Authors | Gamaleldin F. Elsayed, Shreya Shankar, Brian Cheung, Nicolas Papernot, Alex Kurakin, Ian Goodfellow, Jascha Sohl-Dickstein |
Abstract | Machine learning models are vulnerable to adversarial examples: small changes to images can cause computer vision models to make mistakes such as identifying a school bus as an ostrich. However, it is still an open question whether humans are prone to similar mistakes. Here, we address this question by leveraging recent techniques that transfer adversarial examples from computer vision models with known parameters and architecture to other models with unknown parameters and architecture, and by matching the initial processing of the human visual system. We find that adversarial examples that strongly transfer across computer vision models influence the classifications made by time-limited human observers. |
Tasks | |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.08195v3 |
http://arxiv.org/pdf/1802.08195v3.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-examples-that-fool-both-computer |
Repo | |
Framework | |
Skeleton Driven Non-rigid Motion Tracking and 3D Reconstruction
Title | Skeleton Driven Non-rigid Motion Tracking and 3D Reconstruction |
Authors | Shafeeq Elanattil, Peyman Moghadam, Simon Denman, Sridha Sridharan, Clinton Fookes |
Abstract | This paper presents a method which can track and 3D reconstruct the non-rigid surface motion of human performance using a moving RGB-D camera. 3D reconstruction of marker-less human performance is a challenging problem due to the large range of articulated motions and considerable non-rigid deformations. Current approaches use local optimization for tracking. These methods need many iterations to converge and may get stuck in local minima during sudden articulated movements. We propose a puppet model-based tracking approach using skeleton prior, which provides a better initialization for tracking articulated movements. The proposed approach uses an aligned puppet model to estimate correct correspondences for human performance capture. We also contribute a synthetic dataset which provides ground truth locations for frame-by-frame geometry and skeleton joints of human subjects. Experimental results show that our approach is more robust when faced with sudden articulated motions, and provides better 3D reconstruction compared to the existing state-of-the-art approaches. |
Tasks | 3D Reconstruction |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.03774v1 |
http://arxiv.org/pdf/1810.03774v1.pdf | |
PWC | https://paperswithcode.com/paper/skeleton-driven-non-rigid-motion-tracking-and |
Repo | |
Framework | |
Preventing Failures Due to Dataset Shift: Learning Predictive Models That Transport
Title | Preventing Failures Due to Dataset Shift: Learning Predictive Models That Transport |
Authors | Adarsh Subbaswamy, Peter Schulam, Suchi Saria |
Abstract | Classical supervised learning produces unreliable models when training and target distributions differ, with most existing solutions requiring samples from the target domain. We propose a proactive approach which learns a relationship in the training domain that will generalize to the target domain by incorporating prior knowledge of aspects of the data generating process that are expected to differ as expressed in a causal selection diagram. Specifically, we remove variables generated by unstable mechanisms from the joint factorization to yield the Surgery Estimator—an interventional distribution that is invariant to the differences across environments. We prove that the surgery estimator finds stable relationships in strictly more scenarios than previous approaches which only consider conditional relationships, and demonstrate this in simulated experiments. We also evaluate on real world data for which the true causal diagram is unknown, performing competitively against entirely data-driven approaches. |
Tasks | |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04597v2 |
http://arxiv.org/pdf/1812.04597v2.pdf | |
PWC | https://paperswithcode.com/paper/preventing-failures-due-to-dataset-shift |
Repo | |
Framework | |
Light-weight pixel context encoders for image inpainting
Title | Light-weight pixel context encoders for image inpainting |
Authors | Nanne van Noord, Eric Postma |
Abstract | In this work we propose Pixel Content Encoders (PCE), a light-weight image inpainting model, capable of generating novel content for large missing regions in images. Unlike previously presented convolutional neural network based models, our PCE model has an order of magnitude fewer trainable parameters. Moreover, by incorporating dilated convolutions we are able to preserve fine-grained spatial information, achieving state-of-the-art performance on benchmark datasets of natural images and paintings. Besides image inpainting, we show that without changing the architecture, PCE can be used for image extrapolation, generating novel content beyond existing image boundaries. |
Tasks | Image Inpainting |
Published | 2018-01-17 |
URL | http://arxiv.org/abs/1801.05585v1 |
http://arxiv.org/pdf/1801.05585v1.pdf | |
PWC | https://paperswithcode.com/paper/light-weight-pixel-context-encoders-for-image |
Repo | |
Framework | |
Unifying Identification and Context Learning for Person Recognition
Title | Unifying Identification and Context Learning for Person Recognition |
Authors | Qingqiu Huang, Yu Xiong, Dahua Lin |
Abstract | Despite the great success of face recognition techniques, recognizing persons under unconstrained settings remains challenging. Issues like profile views, unfavorable lighting, and occlusions can cause substantial difficulties. Previous works have attempted to tackle this problem by exploiting the context, e.g. clothes and social relations. While showing promising improvement, they are usually limited in two important aspects, relying on simple heuristics to combine different cues and separating the construction of context from people identities. In this work, we aim to move beyond such limitations and propose a new framework to leverage context for person recognition. In particular, we propose a Region Attention Network, which is learned to adaptively combine visual cues with instance-dependent weights. We also develop a unified formulation, where the social contexts are learned along with the reasoning of people identities. These models substantially improve the robustness when working with the complex contextual relations in unconstrained environments. On two large datasets, PIPA and Cast In Movies (CIM), a new dataset proposed in this work, our method consistently achieves state-of-the-art performance under multiple evaluation policies. |
Tasks | Face Recognition, Person Recognition |
Published | 2018-06-08 |
URL | http://arxiv.org/abs/1806.03084v1 |
http://arxiv.org/pdf/1806.03084v1.pdf | |
PWC | https://paperswithcode.com/paper/unifying-identification-and-context-learning |
Repo | |
Framework | |