April 3, 2020

Paper Group ANR 2

Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects

Title Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects
Authors Kiana Ehsani, Shubham Tulsiani, Saurabh Gupta, Ali Farhadi, Abhinav Gupta
Abstract When we humans look at a video of human-object interaction, we can not only infer what is happening but we can even extract actionable information and imitate those interactions. On the other hand, current recognition or geometric approaches lack the physicality of action representation. In this paper, we take a step towards a more physical understanding of actions. We address the problem of inferring contact points and the physical forces from videos of humans interacting with objects. One of the main challenges in tackling this problem is obtaining ground-truth labels for forces. We sidestep this problem by instead using a physics simulator for supervision. Specifically, we use a simulator to predict effects and enforce that estimated forces must lead to the same effect as depicted in the video. Our quantitative and qualitative results show that (a) we can predict meaningful forces from videos whose effects lead to accurate imitation of the motions observed, (b) by jointly optimizing for contact point and force prediction, we can improve the performance on both tasks in comparison to independent training, and (c) we can learn a representation from this model that generalizes to novel objects using few shot examples.
Tasks Human-Object Interaction Detection
Published 2020-03-26
URL https://arxiv.org/abs/2003.12045v1
PDF https://arxiv.org/pdf/2003.12045v1.pdf
PWC https://paperswithcode.com/paper/use-the-force-luke-learning-to-predict

GID-Net: Detecting Human-Object Interaction with Global and Instance Dependency

Title GID-Net: Detecting Human-Object Interaction with Global and Instance Dependency
Authors Dongming Yang, YueXian Zou, Jian Zhang, Ge Li
Abstract Since detecting and recognizing individual humans or objects is not adequate to understand the visual world, learning how humans interact with surrounding objects becomes a core technology. However, convolution operations are weak in depicting visual interactions between instances since they only build blocks that process one local neighborhood at a time. To address this problem, we learn from human perception in observing HOIs to introduce a two-stage trainable reasoning mechanism, referred to as the GID block. The GID block breaks through the local neighborhoods and captures long-range dependency of pixels at both the global level and the instance level from the scene to help detect interactions between instances. Furthermore, we construct a multi-stream network called GID-Net, which is a human-object interaction detection framework consisting of a human branch, an object branch and an interaction branch. Semantic information at the global level and the local level is efficiently reasoned and aggregated in each of the branches. We have compared our proposed GID-Net with existing state-of-the-art methods on two public benchmarks, V-COCO and HICO-DET. The results show that GID-Net outperforms the existing best-performing methods on both benchmarks, validating its efficacy in detecting human-object interactions.
Tasks Human-Object Interaction Detection
Published 2020-03-11
URL https://arxiv.org/abs/2003.05242v1
PDF https://arxiv.org/pdf/2003.05242v1.pdf
PWC https://paperswithcode.com/paper/gid-net-detecting-human-object-interaction

TPPO: A Novel Trajectory Predictor with Pseudo Oracle

Title TPPO: A Novel Trajectory Predictor with Pseudo Oracle
Authors Biao Yang, Guocheng Yan, Pin Wang, Ching-yao Chan, Xiaofeng Liu, Yang Chen
Abstract Forecasting pedestrian trajectories in dynamic scenes remains a critical problem with various applications, such as autonomous driving and socially aware robots. Such forecasting is challenging due to human-human and human-object interactions and future uncertainties caused by human randomness. Generative model-based methods handle future uncertainties by sampling a latent variable. However, few previous studies carefully explored the generation of the latent variable. In this work, we propose the Trajectory Predictor with Pseudo Oracle (TPPO), which is a generative model-based trajectory predictor. The first pseudo oracle is pedestrians’ moving directions, and the second one is the latent variable estimated from observed trajectories. A social attention module is used to aggregate neighbors’ interactions on the basis of the correlation between pedestrians’ moving directions and their future trajectories. This correlation is inspired by the fact that a pedestrian’s future trajectory is often influenced by pedestrians in front. A latent variable predictor is proposed to estimate latent variable distributions from observed and ground-truth trajectories. Moreover, the gap between these two distributions is minimized during training. Therefore, the latent variable predictor can estimate the latent variable from observed trajectories to approximate that estimated from ground-truth trajectories. We compare the performance of TPPO with related methods on several public datasets. Results demonstrate that TPPO outperforms state-of-the-art methods with low average and final displacement errors. Besides, the ablation study shows that the prediction performance will not dramatically decrease as sampling times decline during tests.
Tasks Autonomous Driving, Human-Object Interaction Detection
Published 2020-02-04
URL https://arxiv.org/abs/2002.01852v1
PDF https://arxiv.org/pdf/2002.01852v1.pdf
PWC https://paperswithcode.com/paper/tppo-a-novel-trajectory-predictor-with-pseudo
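
The average and final displacement errors mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common definitions (mean Euclidean distance over all predicted time steps, and the distance at the last step); the function name `displacement_errors` and the toy trajectories are hypothetical and not taken from the paper.

```python
import math


def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) for two equally long lists of (x, y) points.

    ADE: mean Euclidean distance over all time steps.
    FDE: Euclidean distance at the final time step.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("trajectories must be non-empty and equally long")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    ade = sum(distances) / len(distances)
    fde = distances[-1]
    return ade, fde


# Toy example (made-up coordinates): the prediction drifts away from the truth.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
ground_truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(predicted, ground_truth)
print(ade, fde)  # 1.0 2.0
```

Generative predictors that sample several futures are often scored by the best of the sampled trajectories; that variant would take the minimum of these errors over the samples.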

Functional Error Correction for Robust Neural Networks

Title Functional Error Correction for Robust Neural Networks
Authors Kunping Huang, Paul Siegel, Anxiao Jiang
Abstract When neural networks (NeuralNets) are implemented in hardware, their weights need to be stored in memory devices. As noise accumulates in the stored weights, the NeuralNet’s performance will degrade. This paper studies how to use error correcting codes (ECCs) to protect the weights. Different from classic error correction in data storage, the optimization objective is to optimize the NeuralNet’s performance after error correction, instead of minimizing the Uncorrectable Bit Error Rate in the protected bits. That is, by seeing the NeuralNet as a function of its input, the error correction scheme is function-oriented. A main challenge is that a deep NeuralNet often has millions to hundreds of millions of weights, causing a large redundancy overhead for ECCs, and the relationship between the weights and its NeuralNet’s performance can be highly complex. To address the challenge, we propose a Selective Protection (SP) scheme, which chooses only a subset of important bits for ECC protection. To find such bits and achieve an optimized tradeoff between ECC’s redundancy and NeuralNet’s performance, we present an algorithm based on deep reinforcement learning. Experimental results verify that compared to the natural baseline scheme, the proposed algorithm achieves substantially better performance for the functional error correction task.
Published 2020-01-12
URL https://arxiv.org/abs/2001.03814v1
PDF https://arxiv.org/pdf/2001.03814v1.pdf
PWC https://paperswithcode.com/paper/functional-error-correction-for-robust-neural

Visual-Semantic Graph Attention Networks for Human-Object Interaction Detection

Title Visual-Semantic Graph Attention Networks for Human-Object Interaction Detection
Authors Zhijun Liang, Junfa Liu, Yisheng Guan, Juan Rojas
Abstract In scene understanding, machines benefit from not only detecting individual scene instances but also from learning their possible interactions. Human-Object Interaction (HOI) Detection infers the action predicate on a <subject,predicate,object> triplet. Contextual information has been found critical in inferring interactions. However, most works only use local features from a single subject-object pair for inference. Few works have studied the disambiguating contribution of subsidiary relations made available via graph networks and the impact attention mechanisms have in inference. Similarly, few have learned to effectively leverage visual cues along with the intrinsic semantic regularities contained in HOIs. We contribute a dual-graph attention network that effectively aggregates contextual visual, spatial, and semantic information dynamically from primary subject-object relations as well as subsidiary relations through attention mechanisms for strong disambiguating power. The network learns to use both primary and subsidiary relations to improve inference in challenging settings: encouraging the right interpretations and discouraging incorrect ones. We call our model: Visual-Semantic Graph Attention Networks (VS-GATs). We surpass state-of-the-art HOI detection mAPs in the challenging HICO-DET dataset, including in long-tail cases that are harder to interpret. Code, video, and supplementary information are available at http://www.juanrojas.net/VSGAT.
Tasks Human-Object Interaction Detection, Scene Understanding
Published 2020-01-07
URL https://arxiv.org/abs/2001.02302v3
PDF https://arxiv.org/pdf/2001.02302v3.pdf
PWC https://paperswithcode.com/paper/visual-semantic-graph-attention-network-for

Who Wins the Game of Thrones? How Sentiments Improve the Prediction of Candidate Choice

Title Who Wins the Game of Thrones? How Sentiments Improve the Prediction of Candidate Choice
Authors Chaehan So
Abstract This paper analyzes how candidate choice prediction improves by different psychological predictors. To investigate this question, it collected an original survey dataset featuring the popular TV series “Game of Thrones”. The respondents answered which character they anticipated to win in the final episode of the series, and explained their choice of the final candidate in free text from which sentiments were extracted. These sentiments were compared to feature sets derived from candidate likeability and candidate personality ratings. In our benchmarking of 10-fold cross-validation in 100 repetitions, all feature sets except the likeability ratings yielded a 10-11% improvement in accuracy on the holdout set over the base model. Treating the class imbalance with synthetic minority oversampling (SMOTE) increased holdout set performance by 20-34% but surprisingly not testing set performance. Taken together, our study provides a quantified estimation of the additional predictive value of psychological predictors. Likeability ratings were clearly outperformed by the feature sets based on personality, emotional valence, and basic emotions.
Published 2020-02-29
URL https://arxiv.org/abs/2003.07683v1
PDF https://arxiv.org/pdf/2003.07683v1.pdf
PWC https://paperswithcode.com/paper/who-wins-the-game-of-thrones-how-sentiments

Genetic Algorithmic Parameter Optimisation of a Recurrent Spiking Neural Network Model

Title Genetic Algorithmic Parameter Optimisation of a Recurrent Spiking Neural Network Model
Authors Ifeatu Ezenwe, Alok Joshi, KongFatt Wong-Lin
Abstract Neural networks are complex algorithms that loosely model the behaviour of the human brain. They play a significant role in computational neuroscience and artificial intelligence. The next generation of neural network models is based on the spike timing activity of neurons: spiking neural networks (SNNs). However, model parameters in SNNs are difficult to search and optimise. Previous studies using genetic algorithm (GA) optimisation of SNNs were focused mainly on simple, feedforward, or oscillatory networks, but not much work has been done on optimising cortex-like recurrent SNNs. In this work, we investigated the use of GAs to search for optimal parameters in recurrent SNNs to reach targeted neuronal population firing rates, e.g. as in experimental observations. We considered a cortical column based SNN comprising 1000 Izhikevich spiking neurons for computational efficiency and biological realism. The model parameters explored were the neuronal biased input currents. First, for this particular SNN, we found the optimal parameter values for targeted population-averaged firing activities, with the algorithm converging by ~100 generations. We then showed that the optimal GA population size was within ~16-20 while the crossover rate that returned the best fitness value was ~0.95. Overall, we have successfully demonstrated the feasibility of implementing GA to optimise model parameters in a recurrent cortical based SNN.
Published 2020-03-30
URL https://arxiv.org/abs/2003.13850v1
PDF https://arxiv.org/pdf/2003.13850v1.pdf
PWC https://paperswithcode.com/paper/genetic-algorithmic-parameter-optimisation-of
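
The following is a minimal sketch of a genetic algorithm searching for bias currents that reach target firing rates. It is not the authors' implementation: `toy_firing_rate` is a hypothetical threshold-linear stand-in for the spiking network simulation, and the selection, crossover, and mutation operators are common textbook choices assumed here. Only the population size (20), crossover rate (0.95), and generation count (100) echo the values reported in the abstract.

```python
import random


def toy_firing_rate(bias_currents):
    # Hypothetical stand-in for simulating the SNN: a threshold-linear
    # mapping from each population's bias current to its firing rate.
    return [max(0.0, 2.0 * (current - 1.0)) for current in bias_currents]


def fitness(bias_currents, targets):
    # Negative squared error between simulated and target rates (0 is best).
    rates = toy_firing_rate(bias_currents)
    return -sum((r - t) ** 2 for r, t in zip(rates, targets))


def evolve(targets, pop_size=20, generations=100, crossover_rate=0.95,
           mutation_sd=0.1, seed=0):
    rng = random.Random(seed)
    n = len(targets)
    population = [[rng.uniform(0.0, 10.0) for _ in range(n)]
                  for _ in range(pop_size)]

    def score(individual):
        return fitness(individual, targets)

    def tournament():
        # Pick the fitter of two randomly chosen individuals.
        return max(rng.sample(population, 2), key=score)

    history = []  # best fitness seen in each generation
    for _ in range(generations):
        best = max(population, key=score)
        history.append(score(best))
        next_gen = [best[:]]  # elitism: carry the best individual over
        while len(next_gen) < pop_size:
            a, b = tournament(), tournament()
            if rng.random() < crossover_rate:
                # Uniform crossover: each gene comes from either parent.
                child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            else:
                child = a[:]
            # Gaussian mutation on every gene.
            child = [gene + rng.gauss(0.0, mutation_sd) for gene in child]
            next_gen.append(child)
        population = next_gen

    best = max(population, key=score)
    history.append(score(best))
    return best, history


targets = [5.0, 10.0]  # made-up target rates for two populations
best, history = evolve(targets)
print(best, history[-1])
```

Because the best individual is copied unchanged into the next generation, the best fitness in `history` never decreases, which makes convergence easy to inspect.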

Diversity sampling is an implicit regularization for kernel methods

Title Diversity sampling is an implicit regularization for kernel methods
Authors Michaël Fanuel, Joachim Schreurs, Johan A. K. Suykens
Abstract Kernel methods have achieved very good performance on large scale regression and classification problems, by using the Nyström method and preconditioning techniques. The Nyström approximation – based on a subset of landmarks – gives a low rank approximation of the kernel matrix, and is known to provide a form of implicit regularization. We further elaborate on the impact of sampling diverse landmarks for constructing the Nyström approximation in supervised as well as unsupervised kernel methods. By using Determinantal Point Processes for sampling, we obtain additional theoretical results concerning the interplay between diversity and regularization. Empirically, we demonstrate the advantages of training kernel methods based on subsets made of diverse points. In particular, if the dataset has a dense bulk and a sparser tail, we show that Nyström kernel regression with diverse landmarks increases the accuracy of the regression in sparser regions of the dataset, with respect to a uniform landmark sampling. A greedy heuristic is also proposed to select diverse samples of significant size within large datasets when exact DPP sampling is not practically feasible.
Tasks Point Processes
Published 2020-02-20
URL https://arxiv.org/abs/2002.08616v1
PDF https://arxiv.org/pdf/2002.08616v1.pdf
PWC https://paperswithcode.com/paper/diversity-sampling-is-an-implicit

Using Simulated Data to Generate Images of Climate Change

Title Using Simulated Data to Generate Images of Climate Change
Authors Gautier Cosne, Adrien Juraver, Mélisande Teng, Victor Schmidt, Vahe Vardanyan, Alexandra Luccioni, Yoshua Bengio
Abstract Generative adversarial networks (GANs) used in domain adaptation tasks have the ability to generate images that are both realistic and personalized, transforming an input image while maintaining its identifiable characteristics. However, they often require a large quantity of training data to produce high-quality images in a robust way, which limits their usability in cases when access to data is limited. In our paper, we explore the potential of using images from a simulated 3D environment to improve a domain adaptation task carried out by the MUNIT architecture, aiming to use the resulting images to raise awareness of the potential future impacts of climate change.
Tasks Domain Adaptation
Published 2020-01-26
URL https://arxiv.org/abs/2001.09531v1
PDF https://arxiv.org/pdf/2001.09531v1.pdf
PWC https://paperswithcode.com/paper/using-simulated-data-to-generate-images-of

Differentiable Causal Backdoor Discovery

Title Differentiable Causal Backdoor Discovery
Authors Limor Gultchin, Matt J. Kusner, Varun Kanade, Ricardo Silva
Abstract Discovering the causal effect of a decision is critical to nearly all forms of decision-making. In particular, it is a key quantity in drug development, in crafting government policy, and when implementing a real-world machine learning system. Given only observational data, confounders often obscure the true causal effect. Luckily, in some cases, it is possible to recover the causal effect by using certain observed variables to adjust for the effects of confounders. However, without access to the true causal model, finding this adjustment requires brute-force search. In this work, we present an algorithm that exploits auxiliary variables, similar to instruments, in order to find an appropriate adjustment by a gradient-based optimization method. We demonstrate that it outperforms practical alternatives in estimating the true causal effect, without knowledge of the full causal graph.
Tasks Decision Making
Published 2020-03-03
URL https://arxiv.org/abs/2003.01461v1
PDF https://arxiv.org/pdf/2003.01461v1.pdf
PWC https://paperswithcode.com/paper/differentiable-causal-backdoor-discovery
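
For context, the kind of adjustment the abstract refers to can be written with the standard backdoor adjustment formula. This is the textbook identity, not the paper's algorithm, and it assumes a set of observed variables $Z$ that satisfies the backdoor criterion relative to treatment $X$ and outcome $Y$ (the symbols are introduced here for illustration):

```latex
P\bigl(y \mid \mathrm{do}(x)\bigr) \;=\; \sum_{z} P(y \mid x, z)\, P(z)
```

The difficulty described in the abstract is that, without the causal graph, it is not known which observed variables form a valid $Z$.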

Graph Deconvolutional Generation

Title Graph Deconvolutional Generation
Authors Daniel Flam-Shepherd, Tony Wu, Alan Aspuru-Guzik
Abstract Graph generation is an extremely important task, as graphs are found throughout different areas of science and engineering. In this work, we focus on the modern equivalent of the Erdos-Renyi random graph model: the graph variational autoencoder (GVAE). This model assumes edges and nodes are independent in order to generate entire graphs at a time using a multi-layer perceptron decoder. As a result of these assumptions, GVAE has difficulty matching the training distribution and relies on an expensive graph matching procedure. We improve this class of models by building a message passing neural network into GVAE’s encoder and decoder. We demonstrate our model on the specific task of generating small organic molecules.
Tasks Graph Generation, Graph Matching
Published 2020-02-14
URL https://arxiv.org/abs/2002.07087v1
PDF https://arxiv.org/pdf/2002.07087v1.pdf
PWC https://paperswithcode.com/paper/graph-deconvolutional-generation

Differentiable Graph Module (DGM) Graph Convolutional Networks

Title Differentiable Graph Module (DGM) Graph Convolutional Networks
Authors Anees Kazi, Luca Cosmo, Nassir Navab, Michael Bronstein
Abstract Graph deep learning has recently emerged as a powerful ML concept allowing successful deep neural architectures to be generalized to non-Euclidean structured data. Such methods have shown promising results on a broad spectrum of applications ranging from social science, biomedicine, and particle physics to computer vision, graphics, and chemistry. One of the limitations of the majority of the current graph neural network architectures is that they are often restricted to the transductive setting and rely on the assumption that the underlying graph is known and fixed. In many settings, such as those arising in medical and healthcare applications, this assumption is not necessarily true since the graph may be noisy, partially or even completely unknown, and one is thus interested in inferring it from the data. This is especially important in inductive settings when dealing with nodes not present in the graph at training time. Furthermore, sometimes such a graph itself may convey insights that are even more important than the downstream task. In this paper, we introduce Differentiable Graph Module (DGM), a learnable function predicting the edge probability in the graph relevant for the task, that can be combined with convolutional graph neural network layers and trained in an end-to-end fashion. We provide an extensive evaluation of applications from the domains of healthcare (disease prediction), brain imaging (gender and age prediction), computer graphics (3D point cloud segmentation), and computer vision (zero-shot learning). We show that our model provides a significant improvement over baselines both in transductive and inductive settings and achieves state-of-the-art results.
Tasks Disease Prediction, Zero-Shot Learning
Published 2020-02-11
URL https://arxiv.org/abs/2002.04999v2
PDF https://arxiv.org/pdf/2002.04999v2.pdf
PWC https://paperswithcode.com/paper/differentiable-graph-module-dgm-graph

A Metric for Evaluating Neural Input Representation in Supervised Learning Networks

Title A Metric for Evaluating Neural Input Representation in Supervised Learning Networks
Authors Richard R Carrillo, Francisco Naveros, Eduardo Ros, Niceto R Luque
Abstract Supervised learning has long been attributed to several feed-forward neural circuits within the brain, with attention being paid to the cerebellar granular layer. The focus of this study is to evaluate the input activity representation of these feed-forward neural networks. The activity of cerebellar granule cells is conveyed by parallel fibers and translated into Purkinje cell activity; the sole output of the cerebellar cortex. The learning process at this parallel-fiber-to-Purkinje-cell connection makes each Purkinje cell sensitive to a set of specific cerebellar states, determined by the granule-cell activity during a certain time window. A Purkinje cell becomes sensitive to each neural input state and, consequently, the network operates as a function able to generate a desired output for each provided input by means of supervised learning. However, not all sets of Purkinje cell responses can be assigned to any set of input states due to the network’s own limitations (inherent to the network neurobiological substrate), that is, not all input-output mapping can be learned. A limiting factor is the representation of the input states through granule-cell activity. The quality of this representation will determine the capacity of the network to learn a varied set of outputs. In this study we present an algorithm for evaluating quantitatively the level of compatibility/interference amongst a set of given cerebellar states according to their representation (granule-cell activation patterns) without the need for actually conducting simulations and network training. The algorithm input consists of a real-number matrix that codifies the activity level of every considered granule-cell in each state. The capability of this representation to generate a varied set of outputs is evaluated geometrically, thus resulting in a real number that assesses the goodness of the representation.
Published 2020-03-03
URL https://arxiv.org/abs/2003.01588v1
PDF https://arxiv.org/pdf/2003.01588v1.pdf
PWC https://paperswithcode.com/paper/a-metric-for-evaluating-neural-input

A Dynamic Reduction Network for Point Clouds

Title A Dynamic Reduction Network for Point Clouds
Authors Lindsey Gray, Thomas Klijnsma, Shamik Ghosh
Abstract Classifying whole images is a classic problem in machine learning, and graph neural networks are a powerful methodology to learn highly irregular geometries. It is often the case that certain parts of a point cloud are more important than others when determining overall classification. On graph structures this started by pooling information at the end of convolutional filters, and has evolved to a variety of staged pooling techniques on static graphs. In this paper, a dynamic graph formulation of pooling is introduced that removes the need for predetermined graph structure. It achieves this by dynamically learning the most important relationships between data via an intermediate clustering. The network architecture yields interesting results considering representation size and efficiency. It also adapts easily to a large number of tasks from image classification to energy regression in high energy particle physics.
Tasks Image Classification
Published 2020-03-18
URL https://arxiv.org/abs/2003.08013v1
PDF https://arxiv.org/pdf/2003.08013v1.pdf
PWC https://paperswithcode.com/paper/a-dynamic-reduction-network-for-point-clouds

A Study of Fitness Landscapes for Neuroevolution

Title A Study of Fitness Landscapes for Neuroevolution
Authors Nuno M. Rodrigues, Sara Silva, Leonardo Vanneschi
Abstract Fitness landscapes are a useful concept to study the dynamics of meta-heuristics. In the last two decades, they have been applied with success to estimate the optimization power of several types of evolutionary algorithms, including genetic algorithms and genetic programming. However, so far they have never been used to study the performance of machine learning algorithms on unseen data, and they have never been applied to neuroevolution. This paper aims at filling both these gaps, applying for the first time fitness landscapes to neuroevolution and using them to infer useful information about the predictive ability of the method. More specifically, we use a grammar-based approach to generate convolutional neural networks, and we study the dynamics of three different mutations to evolve them. To characterize fitness landscapes, we study autocorrelation and entropic measure of ruggedness. The results show that these measures are appropriate for estimating both the optimization power and the generalization ability of the considered neuroevolution configurations.
Published 2020-01-30
URL https://arxiv.org/abs/2001.11272v1
PDF https://arxiv.org/pdf/2001.11272v1.pdf
PWC https://paperswithcode.com/paper/a-study-of-fitness-landscapes-for
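
As a rough illustration of the autocorrelation measure of ruggedness named in the abstract, the sketch below computes the lag-k autocorrelation of a sequence of fitness values, such as those collected along a walk through the landscape. It assumes the usual sample-autocorrelation definition; the two fitness sequences are made up, and the paper's walks over mutated networks are not reproduced here.

```python
def autocorrelation(fitness_values, lag=1):
    """Sample autocorrelation of a fitness sequence at the given lag.

    Values near 1 suggest a smooth landscape (neighbouring solutions have
    similar fitness); values near 0 or below suggest a rugged one.
    """
    n = len(fitness_values)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(fitness_values)")
    mean = sum(fitness_values) / n
    variance_sum = sum((f - mean) ** 2 for f in fitness_values)
    if variance_sum == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    covariance_sum = sum(
        (fitness_values[i] - mean) * (fitness_values[i + lag] - mean)
        for i in range(n - lag)
    )
    return covariance_sum / variance_sum


# Made-up walks: fitness changes gradually in one and jumps around in the other.
smooth_walk = [1, 2, 3, 4, 5, 6, 7, 8]
rugged_walk = [1, 8, 1, 8, 1, 8, 1, 8]
smooth_rho = autocorrelation(smooth_walk)
rugged_rho = autocorrelation(rugged_walk)
print(smooth_rho, rugged_rho)  # 0.625 -0.875
```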