Paper Group ANR 581
Fast cosmic web simulations with generative adversarial networks
Title | Fast cosmic web simulations with generative adversarial networks |
Authors | Andres C. Rodriguez, Tomasz Kacprzak, Aurelien Lucchi, Adam Amara, Raphael Sgier, Janis Fluri, Thomas Hofmann, Alexandre Réfrégier |
Abstract | Dark matter in the universe evolves through gravity to form a complex network of halos, filaments, sheets and voids that is known as the cosmic web. Computational models of the underlying physical processes, such as classical N-body simulations, are extremely resource intensive, as they track the action of gravity in an expanding universe using billions of particles as tracers of the cosmic matter distribution. Therefore, upcoming cosmology experiments will face a computational bottleneck that may limit the exploitation of their full scientific potential. To address this challenge, we demonstrate the application of a machine learning technique called Generative Adversarial Networks (GAN) to learn models that can efficiently generate new, physically realistic realizations of the cosmic web. Our training set is a small, representative sample of 2D image snapshots from N-body simulations of size 500 and 100 Mpc. We show that the GAN-generated samples are qualitatively and quantitatively very similar to the originals. For the larger boxes of size 500 Mpc, it is very difficult to distinguish them visually. The agreement of the power spectrum $P_k$ is 1-2% for most of the range, between $k=0.06$ and $k=0.4$. An important advantage of generating cosmic web realizations with a GAN is the considerable gain in computation time: each new sample generated by a GAN takes a fraction of a second, compared to the many hours needed by traditional N-body techniques. We anticipate that generative models such as GANs will therefore play an important role in providing extremely fast and precise simulations of the cosmic web in the era of large cosmological surveys, such as Euclid and the Large Synoptic Survey Telescope (LSST). |
Tasks | |
Published | 2018-01-27 |
URL | http://arxiv.org/abs/1801.09070v4 |
http://arxiv.org/pdf/1801.09070v4.pdf | |
PWC | https://paperswithcode.com/paper/fast-cosmic-web-simulations-with-generative |
Repo | |
Framework | |
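A minimal numpy sketch of the kind of power-spectrum check the abstract describes: compute a radially averaged $P_k$ for a 2D overdensity slice and compare a GAN sample against its N-body counterpart. The normalization, binning, and variable names are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np

def power_spectrum_2d(delta, box_size, n_bins=50):
    """Radially averaged power spectrum P(k) of a square 2D overdensity field.

    `delta` is a 2D array and `box_size` its physical side length (e.g. in Mpc).
    Normalization and binning here are illustrative only.
    """
    n = delta.shape[0]
    power = np.abs(np.fft.fftn(delta)) ** 2 / n ** 4
    k_1d = 2 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky = np.meshgrid(k_1d, k_1d, indexing="ij")
    k_mag = np.sqrt(kx ** 2 + ky ** 2).ravel()
    bins = np.linspace(k_mag[k_mag > 0].min(), k_mag.max(), n_bins + 1)
    which = np.digitize(k_mag, bins)
    counts = np.bincount(which, minlength=n_bins + 2)[1:n_bins + 1]
    sums = np.bincount(which, weights=power.ravel(), minlength=n_bins + 2)[1:n_bins + 1]
    k_centers = 0.5 * (bins[1:] + bins[:-1])
    return k_centers, sums / np.maximum(counts, 1)

# Fractional agreement between an N-body slice and a GAN sample, e.g.:
# k, p_nbody = power_spectrum_2d(delta_nbody, box_size=500.0)
# _, p_gan = power_spectrum_2d(delta_gan, box_size=500.0)
# rel_diff = np.abs(p_gan - p_nbody) / p_nbody
```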
Bandana: Using Non-volatile Memory for Storing Deep Learning Models
Title | Bandana: Using Non-volatile Memory for Storing Deep Learning Models |
Authors | Assaf Eisenman, Maxim Naumov, Darryl Gardner, Misha Smelyanskiy, Sergey Pupyrev, Kim Hazelwood, Asaf Cidon, Sachin Katti |
Abstract | Typical large-scale recommender systems use deep learning models that are stored on a large amount of DRAM. These models often rely on embeddings, which consume most of the required memory. We present Bandana, a storage system that reduces the DRAM footprint of embeddings, by using Non-volatile Memory (NVM) as the primary storage medium, with a small amount of DRAM as cache. The main challenge in storing embeddings on NVM is its limited read bandwidth compared to DRAM. Bandana uses two primary techniques to address this limitation: first, it stores embedding vectors that are likely to be read together in the same physical location, using hypergraph partitioning, and second, it decides the number of embedding vectors to cache in DRAM by simulating dozens of small caches. These techniques allow Bandana to increase the effective read bandwidth of NVM by 2-3x and thereby significantly reduce the total cost of ownership. |
Tasks | hypergraph partitioning, Recommendation Systems |
Published | 2018-11-14 |
URL | http://arxiv.org/abs/1811.05922v2 |
http://arxiv.org/pdf/1811.05922v2.pdf | |
PWC | https://paperswithcode.com/paper/bandana-using-non-volatile-memory-for-storing |
Repo | |
Framework | |
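A toy sketch of Bandana's second technique: simulate several small caches over an access trace to estimate how much DRAM cache a given embedding table actually benefits from. The LRU policy, trace format, and candidate sizes are assumptions for illustration; the paper's simulator and admission logic are more elaborate.

```python
from collections import OrderedDict

def lru_hit_rate(trace, capacity):
    """Hit rate of an LRU cache holding `capacity` embedding vectors."""
    cache, hits = OrderedDict(), 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)
        else:
            cache[key] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used vector
    return hits / max(len(trace), 1)

def hit_rate_curve(trace, sizes):
    """Simulate several small caches over one trace (a miss-ratio-curve sketch)."""
    return {size: lru_hit_rate(trace, size) for size in sizes}

# Example: decide whether a table deserves 1k or 10k cached vectors.
# curve = hit_rate_curve(access_trace, sizes=(1_000, 2_000, 5_000, 10_000))
```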
Simple coarse graining and sampling strategies for image recognition
Title | Simple coarse graining and sampling strategies for image recognition |
Authors | Stephen Whitelam |
Abstract | A conceptually simple way to recognize images is to directly compare test-set data and training-set data. The accuracy of this approach is limited by the method of comparison used, and by the extent to which the training-set data covers the required configuration space. Here we show that this coverage can be substantially increased using simple strategies of coarse graining (replacing groups of images by their centroids) and sampling (using distinct sets of centroids in combination). We use the MNIST data set to show that coarse graining can be used to convert a subset of training images into about an order of magnitude fewer image centroids, with no loss of accuracy of classification of test-set images by direct (nearest-neighbor) classification. Distinct batches of centroids can be used in combination as a means of sampling configuration space, and can classify test-set data more accurately than can the unaltered training set. The approach works most naturally with multiple processors in parallel. |
Tasks | |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.02599v1 |
http://arxiv.org/pdf/1809.02599v1.pdf | |
PWC | https://paperswithcode.com/paper/simple-coarse-graining-and-sampling |
Repo | |
Framework | |
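A rough sketch of the coarse-graining-plus-direct-classification pipeline on MNIST-like data: replace each class's training images with a much smaller set of centroids, then classify test images by their nearest centroid. k-means (via scikit-learn) is used here as a stand-in grouping step; the paper's exact coarse-graining procedure may differ.

```python
import numpy as np
from sklearn.cluster import KMeans

def class_centroids(images, labels, centroids_per_class=50):
    """Coarse-grain a training set: replace each class's images by centroids.

    `images` is (N, D) with flattened pixels, `labels` is (N,). The result is
    roughly an order of magnitude fewer prototypes than training images.
    """
    protos, proto_labels = [], []
    for c in np.unique(labels):
        km = KMeans(n_clusters=centroids_per_class, n_init=4, random_state=0)
        km.fit(images[labels == c])
        protos.append(km.cluster_centers_)
        proto_labels.append(np.full(centroids_per_class, c))
    return np.vstack(protos), np.concatenate(proto_labels)

def nearest_prototype_predict(test_images, protos, proto_labels, batch=256):
    """Direct (nearest-neighbor) classification against the centroids."""
    preds = []
    for start in range(0, len(test_images), batch):
        chunk = test_images[start:start + batch]
        d = ((chunk[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
        preds.append(proto_labels[d.argmin(axis=1)])
    return np.concatenate(preds)
```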
Quantified Degrees of Group Responsibility (Extended Abstract)
Title | Quantified Degrees of Group Responsibility (Extended Abstract) |
Authors | Vahid Yazdanpanah, Mehdi Dastani |
Abstract | This paper builds on an existing notion of group responsibility and proposes two ways to define the degree of group responsibility: structural and functional degrees of responsibility. These notions measure the potential responsibilities of (agent) groups for avoiding a state of affairs. According to these notions, a degree of responsibility for a state of affairs can be assigned to a group of agents if, and to the extent that, the group has the potential to preclude the state of affairs. |
Tasks | |
Published | 2018-01-23 |
URL | http://arxiv.org/abs/1801.07747v1 |
http://arxiv.org/pdf/1801.07747v1.pdf | |
PWC | https://paperswithcode.com/paper/quantified-degrees-of-group-responsibility |
Repo | |
Framework | |
ColorUNet: A convolutional classification approach to colorization
Title | ColorUNet: A convolutional classification approach to colorization |
Authors | Vincent Billaut, Matthieu de Rochemonteix, Marc Thibault |
Abstract | This paper tackles the challenge of colorizing grayscale images. We take a deep convolutional neural network approach, and choose to take the angle of classification, working on a finite set of possible colors. Similarly to a recent paper, we implement a loss and a prediction function that favor realistic, colorful images rather than “true” ones. We show that a rather lightweight architecture inspired by the U-Net, and trained on a reasonable amount of pictures of landscapes, achieves satisfactory results on this specific subset of pictures. We show that data augmentation significantly improves the performance and robustness of the model, and provide visual analysis of the prediction confidence. We show an application of our model, extending the task to video colorization. We suggest a way to smooth color predictions across frames, without the need to train a recurrent network designed for sequential inputs. |
Tasks | Colorization, Data Augmentation |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.03120v1 |
http://arxiv.org/pdf/1811.03120v1.pdf | |
PWC | https://paperswithcode.com/paper/colorunet-a-convolutional-classification |
Repo | |
Framework | |
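The video extension smooths color predictions across frames without training a recurrent network. A minimal sketch of one such scheme, an exponential moving average over the per-frame color-class probabilities; the momentum value and array layout are assumptions, not the paper's recipe.

```python
import numpy as np

def smooth_video_predictions(frame_probs, momentum=0.7):
    """Temporally smooth per-frame color-class probabilities.

    `frame_probs` has shape (T, H, W, num_color_bins): softmax output of a
    single-frame colorization model for each video frame. The moving average
    reduces flicker between consecutive frames.
    """
    smoothed = np.empty_like(frame_probs)
    smoothed[0] = frame_probs[0]
    for t in range(1, len(frame_probs)):
        smoothed[t] = momentum * smoothed[t - 1] + (1 - momentum) * frame_probs[t]
    # Re-normalize and take the most likely color bin per pixel.
    smoothed /= smoothed.sum(axis=-1, keepdims=True)
    return smoothed.argmax(axis=-1)
```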
Learning Navigation Behaviors End-to-End with AutoRL
Title | Learning Navigation Behaviors End-to-End with AutoRL |
Authors | Hao-Tien Lewis Chiang, Aleksandra Faust, Marek Fiser, Anthony Francis |
Abstract | We learn end-to-end point-to-point and path-following navigation behaviors that avoid moving obstacles. These policies receive noisy lidar observations and output robot linear and angular velocities. The policies are trained in small, static environments with AutoRL, an evolutionary automation layer around Reinforcement Learning (RL) that searches for a deep RL reward and neural network architecture with large-scale hyper-parameter optimization. AutoRL first finds a reward that maximizes task completion, and then finds a neural network architecture that maximizes the cumulative value of that reward. Empirical evaluations, both in simulation and on-robot, show that AutoRL policies do not suffer from the catastrophic forgetfulness that plagues many other deep reinforcement learning algorithms, generalize to new environments and moving obstacles, are robust to sensor, actuator, and localization noise, and can serve as robust building blocks for larger navigation tasks. Our path-following and point-to-point policies are respectively 23% and 26% more successful than comparison methods across new environments. Video at: https://youtu.be/0UwkjpUEcbI |
Tasks | Motion Planning |
Published | 2018-09-26 |
URL | http://arxiv.org/abs/1809.10124v2 |
http://arxiv.org/pdf/1809.10124v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-navigation-behaviors-end-to-end-with |
Repo | |
Framework | |
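A schematic sketch of the two-stage search the abstract describes: first search reward parameterizations for task completion, then fix the best reward and search network architectures for cumulative reward. Random search and the callback names (`train_policy`, `task_completion`, `cumulative_reward`) are hypothetical stand-ins for AutoRL's large-scale evolutionary optimization.

```python
import random

def autorl_style_search(train_policy, task_completion, cumulative_reward,
                        reward_space, arch_space, trials=50):
    """Two-stage hyper-parameter search in the spirit of AutoRL (sketch only)."""
    # Stage 1: search reward parameterizations, scoring by task completion.
    best_reward, best_score = None, float("-inf")
    for _ in range(trials):
        candidate = random.choice(reward_space)
        policy = train_policy(reward=candidate, arch=arch_space[0])  # default arch
        score = task_completion(policy)
        if score > best_score:
            best_reward, best_score = candidate, score

    # Stage 2: fix the found reward, search architectures by cumulative reward.
    best_arch, best_return = None, float("-inf")
    for _ in range(trials):
        candidate = random.choice(arch_space)
        policy = train_policy(reward=best_reward, arch=candidate)
        ret = cumulative_reward(policy, reward=best_reward)
        if ret > best_return:
            best_arch, best_return = candidate, ret
    return best_reward, best_arch
```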
Convex Functions in ACL2(r)
Title | Convex Functions in ACL2(r) |
Authors | Carl Kwan, Mark R. Greenstreet |
Abstract | This paper builds upon our prior formalisation of R^n in ACL2(r) by presenting a set of theorems for reasoning about convex functions. This is a demonstration of the higher-dimensional analytical reasoning possible in our metric space formalisation of R^n. Among the introduced theorems is a set of equivalent conditions for convex functions with Lipschitz continuous gradients from Yurii Nesterov’s classic text on convex optimisation. To the best of our knowledge a full proof of the theorem has yet to be published in a single piece of literature. We also explore “proof engineering” issues, such as how to state Nesterov’s theorem in a manner that is both clear and useful. |
Tasks | |
Published | 2018-10-10 |
URL | http://arxiv.org/abs/1810.04316v1 |
http://arxiv.org/pdf/1810.04316v1.pdf | |
PWC | https://paperswithcode.com/paper/convex-functions-in-acl2r |
Repo | |
Framework | |
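For context, a few of the classical equivalent characterizations of a convex, differentiable $f$ with $L$-Lipschitz gradient, holding for all $x, y$; these are stated from the standard literature, so the exact list mechanized in ACL2(r) may differ.

```latex
% Equivalent conditions for convex, differentiable f with L-Lipschitz gradient
\begin{aligned}
  &0 \le f(y) - f(x) - \langle \nabla f(x),\, y - x\rangle \le \tfrac{L}{2}\,\lVert y - x\rVert^2,\\
  &f(y) \ge f(x) + \langle \nabla f(x),\, y - x\rangle + \tfrac{1}{2L}\,\lVert \nabla f(y) - \nabla f(x)\rVert^2,\\
  &\langle \nabla f(x) - \nabla f(y),\, x - y\rangle \ge \tfrac{1}{L}\,\lVert \nabla f(x) - \nabla f(y)\rVert^2,\\
  &\langle \nabla f(x) - \nabla f(y),\, x - y\rangle \le L\,\lVert x - y\rVert^2.
\end{aligned}
```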
Generating Goal-Directed Visuomotor Plans Based on Learning Using a Predictive Coding-type Deep Visuomotor Recurrent Neural Network Model
Title | Generating Goal-Directed Visuomotor Plans Based on Learning Using a Predictive Coding-type Deep Visuomotor Recurrent Neural Network Model |
Authors | Minkyu Choi, Takazumi Matsumoto, Minju Jung, Jun Tani |
Abstract | The current paper presents how a predictive-coding-type deep recurrent neural network can generate vision-based goal-directed plans based on prior learning experience, by examining experimental results using a real arm robot. The proposed deep recurrent neural network learns to predict visuo-proprioceptive sequences by extracting an adequate predictive model from various visuomotor experiences related to object-directed behaviors. The predictive model was developed in terms of a mapping from intention state space to the space of expected visuo-proprioceptive sequences through iterative learning. Our arm robot experiments, which adopted three different tasks with different levels of difficulty, showed that the error minimization principle in the predictive coding framework, applied to inference of the optimal intention states for given goal states, can generate goal-directed plans even for unlearned goal states with generalization. It was, however, shown that sufficient generalization requires a relatively large number of learning trajectories. The paper discusses possible countermeasures to overcome this problem. |
Tasks | |
Published | 2018-03-07 |
URL | http://arxiv.org/abs/1803.02578v2 |
http://arxiv.org/pdf/1803.02578v2.pdf | |
PWC | https://paperswithcode.com/paper/generating-goal-directed-visuomotor-plans |
Repo | |
Framework | |
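A minimal sketch of the error-minimization inference step described above: keep the learned forward model fixed and adjust a latent intention vector until the predicted outcome matches the goal. The finite-difference gradient and the `predict_outcome` callback are assumptions; the paper back-propagates the error through the trained RNN instead.

```python
import numpy as np

def infer_intention(predict_outcome, goal, dim, steps=200, lr=0.1, eps=1e-4):
    """Infer an intention state by minimizing prediction error to a goal."""
    intention = np.zeros(dim)
    for _ in range(steps):
        base_err = np.sum((predict_outcome(intention) - goal) ** 2)
        grad = np.zeros(dim)
        for i in range(dim):
            probe = intention.copy()
            probe[i] += eps
            # Forward-difference estimate of d(error)/d(intention_i).
            grad[i] = (np.sum((predict_outcome(probe) - goal) ** 2) - base_err) / eps
        intention -= lr * grad
    return intention
```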
Competitive Machine Learning: Best Theoretical Prediction vs Optimization
Title | Competitive Machine Learning: Best Theoretical Prediction vs Optimization |
Authors | Amin Khajehnejad, Shima Hajimirza |
Abstract | Machine learning is often used in competitive scenarios: participants learn and fit static models, and those models compete in a shared platform. The common assumption is that in order to win a competition one has to have the best predictive model, i.e., the model with the smallest out-of-sample error. Is that necessarily true? Does the best theoretical predictive model for a target always yield the best reward in a competition? If not, can one take the best model and purposefully change it into a theoretically inferior model which in practice results in a higher competitive edge? What does that modification look like? And finally, if all participants modify their prediction models towards the best practical performance, who benefits the most: players with inferior models, or those with theoretical superiority? The main theme of this paper is to raise these important questions and propose a theoretical model to answer them. We consider a case study where two linear predictive models compete over a shared target. The model with the closest estimate gets the whole reward, which is equal to the absolute value of the target. We characterize the reward function of each model and, using a basic game-theoretic approach, demonstrate that the inferior competitor can significantly improve its performance by choosing optimal model coefficients that differ from the best theoretical prediction. This is a preliminary study that emphasizes the fact that in many applications where predictive machine learning is at the service of competition, much can be gained from practical (back-testing) optimization of the model compared to static prediction improvement. |
Tasks | |
Published | 2018-03-09 |
URL | http://arxiv.org/abs/1803.03672v1 |
http://arxiv.org/pdf/1803.03672v1.pdf | |
PWC | https://paperswithcode.com/paper/competitive-machine-learning-best-theoretical |
Repo | |
Framework | |
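A Monte Carlo sketch of the toy winner-take-all game described in the abstract: whichever linear estimate is closer to the realized target collects the reward |target|. The data-generating process (scalar feature, additive Gaussian noise) and the specific coefficients are assumptions for illustration, not the paper's exact setup.

```python
import numpy as np

def expected_reward(coef_a, coef_b, n_samples=100_000, noise=1.0, seed=0):
    """Estimate each player's expected per-round reward by simulation."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_samples)
    y = 2.0 * x + noise * rng.normal(size=n_samples)  # assumed true relation y = 2x + e
    pred_a, pred_b = coef_a * x, coef_b * x
    a_wins = np.abs(pred_a - y) < np.abs(pred_b - y)
    reward = np.abs(y)
    return reward[a_wins].sum() / n_samples, reward[~a_wins].sum() / n_samples

# Player B holds the best least-squares coefficient (2.0); player A can probe
# deliberately biased coefficients and may find one with a higher payoff:
# for c in (1.5, 2.0, 2.5, 3.0):
#     print(c, expected_reward(coef_a=c, coef_b=2.0))
```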
Renewal Monte Carlo: Renewal theory based reinforcement learning
Title | Renewal Monte Carlo: Renewal theory based reinforcement learning |
Authors | Jayakumar Subramanian, Aditya Mahajan |
Abstract | In this paper, we present an online reinforcement learning algorithm, called Renewal Monte Carlo (RMC), for infinite horizon Markov decision processes with a designated start state. RMC is a Monte Carlo algorithm that retains the advantages of Monte Carlo methods, including low bias, simplicity, and ease of implementation, while at the same time circumventing their key drawbacks of high variance and delayed (end of episode) updates. The key ideas behind RMC are as follows. First, under any reasonable policy, the reward process is ergodic, so, by renewal theory, the performance of a policy is equal to the ratio of the expected discounted reward to the expected discounted time over a regenerative cycle. Second, by carefully examining the expression for the performance gradient, we propose a stochastic approximation algorithm that only requires estimates of the expected discounted reward and discounted time over a regenerative cycle, along with their gradients. We propose two unbiased estimators for evaluating performance gradients (a likelihood ratio based estimator and a simultaneous perturbation based estimator) and show that for both estimators, RMC converges to a locally optimal policy. We generalize the RMC algorithm to post-decision state models and also present a variant that converges faster to an approximately optimal policy. We conclude by presenting numerical experiments on a randomly generated MDP, event-triggered communication, and inventory management. |
Tasks | |
Published | 2018-04-03 |
URL | http://arxiv.org/abs/1804.01116v1 |
http://arxiv.org/pdf/1804.01116v1.pdf | |
PWC | https://paperswithcode.com/paper/renewal-monte-carlo-renewal-theory-based |
Repo | |
Framework | |
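A Monte Carlo sketch of the renewal identity behind the abstract: over regenerative cycles from the designated start state, $J = \mathbb{E}[R_{\text{cycle}}] / (1 - \mathbb{E}[\gamma^{\tau}])$, where the denominator equals $(1-\gamma)$ times the expected discounted cycle time. `sample_cycle` is a hypothetical helper returning one cycle's rewards; this illustrates policy evaluation only, not RMC's gradient updates.

```python
import numpy as np

def rmc_estimate(sample_cycle, gamma, num_cycles=10_000):
    """Estimate discounted performance from regenerative cycles.

    `sample_cycle()` is assumed to return the list of rewards collected during
    one regenerative cycle (from the start state back to it) under the policy.
    """
    disc_rewards, disc_factors = [], []
    for _ in range(num_cycles):
        rewards = np.asarray(sample_cycle(), dtype=float)
        t = np.arange(len(rewards))
        disc_rewards.append(np.sum(gamma ** t * rewards))   # discounted reward in cycle
        disc_factors.append(gamma ** len(rewards))          # gamma^tau for this cycle
    return np.mean(disc_rewards) / (1.0 - np.mean(disc_factors))
```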
Marrying Tracking with ELM: A Metric Constraint Guided Multiple Feature Fusion Method
Title | Marrying Tracking with ELM: A Metric Constraint Guided Multiple Feature Fusion Method |
Authors | Jing Zhang, Yonggong Ren |
Abstract | Object tracking is an important problem in computer vision and surveillance systems. Existing models mainly exploit a single-view feature (i.e., color, texture, or shape) to solve the problem, failing to describe objects comprehensively. In this paper, we approach the problem from a multi-view perspective by leveraging complementary and latent multi-view information, so as to be robust to partial occlusion and background clutter, especially when distractor objects are similar to the target, while also addressing tracking drift. However, a multi-view fusion strategy inevitably makes tracking less efficient. To this end, we propose to marry ELM (Extreme Learning Machine) to multi-view fusion, training the global hidden output weights so as to effectively exploit the local information from each view. Following this principle, we propose a novel method to obtain the optimal sample as the target object, which avoids tracking drift resulting from noisy samples. Our method is evaluated over 12 challenging image sequences with different attributes including illumination, occlusion, deformation, etc., and demonstrates better performance than several state-of-the-art methods in terms of effectiveness and robustness. |
Tasks | Object Tracking |
Published | 2018-09-30 |
URL | http://arxiv.org/abs/1810.01271v2 |
http://arxiv.org/pdf/1810.01271v2.pdf | |
PWC | https://paperswithcode.com/paper/marrying-tracking-with-elm-a-metric |
Repo | |
Framework | |
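A generic Extreme Learning Machine sketch (random hidden layer, closed-form output weights), the building block the paper marries to multi-view fusion. The single concatenated feature matrix and plain least-squares solve are simplifications; the paper's metric constraint and per-view treatment are not shown.

```python
import numpy as np

def train_elm(features, targets, hidden_units=256, seed=0):
    """Basic ELM: random input weights, least-squares output weights.

    `features` could be the concatenation of color/texture/shape descriptors
    for candidate samples; `targets` their training scores or labels.
    """
    rng = np.random.default_rng(seed)
    w_in = rng.normal(size=(features.shape[1], hidden_units))
    b = rng.normal(size=hidden_units)
    hidden = np.tanh(features @ w_in + b)
    w_out, *_ = np.linalg.lstsq(hidden, targets, rcond=None)
    return w_in, b, w_out

def elm_score(features, w_in, b, w_out):
    """Score candidate samples; a tracker would keep the highest-scoring one."""
    return np.tanh(features @ w_in + b) @ w_out
```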
A Boosting Framework of Factorization Machine
Title | A Boosting Framework of Factorization Machine |
Authors | Longfei Li, Peilin Zhao, Jun Zhou, Xiaolong Li |
Abstract | Recently, Factorization Machines (FM) have become more and more popular for recommendation systems, due to their effectiveness in finding informative interactions between features. Usually, the weights for the interactions are learnt as a low-rank weight matrix, formulated as an inner product of two low-rank matrices. This low rank helps improve the generalization ability of Factorization Machines. However, to choose the rank properly, one usually needs to run the algorithm many times with different ranks, which is clearly inefficient for large-scale datasets. To alleviate this issue, we propose an Adaptive Boosting framework of Factorization Machines (AdaFM), which can adaptively search for a proper rank for different datasets without re-training. Instead of using a fixed rank, the proposed algorithm gradually increases its rank according to its performance, using a boosting strategy, until the performance stops improving. To verify the performance of our proposed framework, we conduct an extensive set of experiments on many real-world datasets. Encouraging empirical results show that the proposed algorithms are generally more effective than other state-of-the-art Factorization Machines. |
Tasks | Recommendation Systems |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06027v1 |
http://arxiv.org/pdf/1804.06027v1.pdf | |
PWC | https://paperswithcode.com/paper/a-boosting-framework-of-factorization-machine |
Repo | |
Framework | |
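For reference, the standard Factorization Machine prediction that AdaFM builds on, using the usual $O(kn)$ rewrite of the pairwise-interaction term; growing the number of columns of `v` is, loosely, what the boosting loop does. Shapes and names are illustrative.

```python
import numpy as np

def fm_predict(x, w0, w, v):
    """Factorization Machine prediction.

    y(x) = w0 + <w, x> + 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 ]
    with x of shape (n_features,) or (batch, n_features) and v of shape
    (n_features, rank).
    """
    linear = w0 + x @ w
    interactions = 0.5 * np.sum((x @ v) ** 2 - (x ** 2) @ (v ** 2), axis=-1)
    return linear + interactions
```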
Functionally Modular and Interpretable Temporal Filtering for Robust Segmentation
Title | Functionally Modular and Interpretable Temporal Filtering for Robust Segmentation |
Authors | Jörg Wagner, Volker Fischer, Michael Herman, Sven Behnke |
Abstract | The performance of autonomous systems heavily relies on their ability to generate a robust representation of the environment. Deep neural networks have greatly improved vision-based perception systems but still fail in challenging situations, e.g. sensor outages or heavy weather. These failures are often introduced by data-inherent perturbations, which significantly reduce the information provided to the perception system. We propose a functionally modularized temporal filter, which stabilizes an abstract feature representation of a single-frame segmentation model using information of previous time steps. Our filter module splits the filter task into multiple less complex and more interpretable subtasks. The basic structure of the filter is inspired by a Bayes estimator consisting of a prediction and an update step. To make the prediction more transparent, we implement it using a geometric projection and estimate its parameters. This additionally enables the decomposition of the filter task into static representation filtering and low-dimensional motion filtering. Our model can cope with missing frames and is trainable in an end-to-end fashion. Using photorealistic, synthetic video data, we show the ability of the proposed architecture to overcome data-inherent perturbations. The experiments especially highlight advantages introduced by an interpretable and explicit filter module. |
Tasks | |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.03867v2 |
http://arxiv.org/pdf/1810.03867v2.pdf | |
PWC | https://paperswithcode.com/paper/functionally-modular-and-interpretable |
Repo | |
Framework | |
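A minimal predict/update sketch in the spirit of the Bayes-estimator structure described above: geometrically project the previous feature map into the current frame, then blend it with the single-frame observation. The scalar gain and the `warp` callback are assumptions; the paper learns these components and estimates the projection parameters.

```python
def filter_step(prev_features, observation, warp, gain=0.5):
    """One predict/update cycle of a Bayes-estimator-style feature filter.

    `warp` projects the previous feature map into the current frame (e.g. from
    estimated ego-motion); the update blends it with the current single-frame
    features. A missing frame simply falls back to the prediction.
    """
    predicted = warp(prev_features)
    if observation is None:
        return predicted
    return (1.0 - gain) * predicted + gain * observation
```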
Quantum computing and the brain: quantum nets, dessins d’enfants and neural networks
Title | Quantum computing and the brain: quantum nets, dessins d’enfants and neural networks |
Authors | Torsten Asselmeyer-Maluga |
Abstract | In this paper, we will discuss a formal link between neural networks and quantum computing. For that purpose we will present a simple model for the description of the neural network by forming sub-graphs of the whole network with the same or a similar state. We will describe the interaction between these areas by closed loops, the feedback loops. The change of the graph is given by the deformations of the loops. This fact can be mathematically formalized by the fundamental group of the graph. Furthermore, the neuron has two basic states $\lvert 0\rangle$ (ground state) and $\lvert 1\rangle$ (excited state). The whole state of an area of neurons is the linear combination of the two basic states with complex coefficients representing the signals (with three parameters: amplitude, frequency and phase) along the neurons. Then it can be shown that the set of all signals forms a manifold (character variety) and all properties of the network must be encoded in this manifold. In the paper, we will discuss how to interpret learning and intuition in this model. Using the Morgan-Shalen compactification, the limit for signals with large amplitude can be analyzed by using quasi-Fuchsian groups as represented by dessins d’enfants (graphs to analyze Riemannian surfaces). As shown by Planat and collaborators, these dessins d’enfants are a direct bridge to (topological) quantum computing with permutation groups. The normalization of the signal reduces to the group $SU(2)$ and the whole model to a quantum network. Then we have a direct connection to quantum circuits. This network can be transformed into operations on tensor networks. Formally, we will obtain a link between machine learning and quantum computing. |
Tasks | Tensor Networks |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.08338v1 |
http://arxiv.org/pdf/1812.08338v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-computing-and-the-brain-quantum-nets |
Repo | |
Framework | |
Explaining Aggregates for Exploratory Analytics
Title | Explaining Aggregates for Exploratory Analytics |
Authors | Fotis Savva, Christos Anagnostopoulos, Peter Triantafillou |
Abstract | Analysts wishing to explore multivariate data spaces typically pose queries involving selection operators, i.e., range or radius queries, which define data subspaces of possible interest, and then use aggregation functions whose results determine their exploratory analytics interests. However, such aggregate query (AQ) results are simple scalars and, as such, convey limited information about the queried subspaces for exploratory analysis. We address this shortcoming, aiding analysts to explore and understand data subspaces, by contributing a novel explanation mechanism coined XAXA: eXplaining Aggregates for eXploratory Analytics. XAXA’s novel AQ explanations are represented using functions obtained by a three-fold joint optimization problem. Explanations assume the form of a set of parametric piecewise-linear functions acquired through a statistical learning model. A key feature of the proposed solution is that model training is performed by only monitoring AQs and their answers on-line. In XAXA, explanations for future AQs can be computed without any database (DB) access and can be used to further explore the queried data subspaces without issuing any more queries to the DB. We evaluate the explanation accuracy and efficiency of XAXA through theoretically grounded metrics over real-world and synthetic datasets and query workloads. |
Tasks | |
Published | 2018-12-29 |
URL | https://arxiv.org/abs/1812.11346v2 |
https://arxiv.org/pdf/1812.11346v2.pdf | |
PWC | https://paperswithcode.com/paper/explaining-aggregates-for-exploratory |
Repo | |
Framework | |
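A crude sketch of the idea of learning piecewise-linear explanations from monitored (query, answer) pairs, so that future aggregate queries can be answered without touching the DB. Equal-width pieces over a 1-D query parameter and per-piece least squares are simplifying assumptions; XAXA fits the pieces jointly and online via a statistical learning model.

```python
import numpy as np

def piecewise_linear_explanation(query_centers, answers, num_pieces=4):
    """Fit a simple piecewise-linear explanation y(theta) from monitored AQs.

    `query_centers` holds a 1-D query parameter (e.g. a range center) and
    `answers` the corresponding aggregate results.
    """
    edges = np.linspace(query_centers.min(), query_centers.max(), num_pieces + 1)
    pieces = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (query_centers >= lo) & (query_centers <= hi)
        if mask.sum() >= 2:
            slope, intercept = np.polyfit(query_centers[mask], answers[mask], 1)
            pieces.append((lo, hi, slope, intercept))
    return pieces

def explain(theta, pieces):
    """Predict an aggregate answer for a future query without a DB access."""
    for lo, hi, slope, intercept in pieces:
        if lo <= theta <= hi:
            return slope * theta + intercept
    return None
```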