April 3, 2020

3519 words 17 mins read

Paper Group ANR 56

Learning Probabilistic Intersection Traffic Models for Trajectory Prediction. Deep compositional robotic planners that follow natural language commands. A mean-field theory of lazy training in two-layer neural nets: entropic regularization and controlled McKean-Vlasov dynamics. Exploiting Event Cameras for Spatio-Temporal Prediction of Fast-Changing …

Learning Probabilistic Intersection Traffic Models for Trajectory Prediction

Title Learning Probabilistic Intersection Traffic Models for Trajectory Prediction
Authors Andrew Patterson, Aditya Gahlawat, Naira Hovakimyan
Abstract Autonomous agents must be able to safely interact with other vehicles to integrate into urban environments. The safety of these agents is dependent on their ability to predict collisions with other vehicles’ future trajectories for replanning and collision avoidance. The information needed to predict collisions can be learned from previously observed vehicle trajectories in a specific environment, generating a traffic model. The learned traffic model can then be incorporated as prior knowledge into any trajectory estimation method being used in this environment. This work presents a Gaussian process based probabilistic traffic model that is used to quantify vehicle behaviors in an intersection. The Gaussian process model provides estimates for the average vehicle trajectory, while also capturing the variance between the different paths a vehicle may take in the intersection. The method is demonstrated on a set of time-series position trajectories. These trajectories are reconstructed by removing object recognition errors and missed frames that may occur due to data source processing. To create the intersection traffic model, the reconstructed trajectories are clustered based on their source and destination lanes. For each cluster, a Gaussian process model is created to capture the average behavior and the variance of the cluster. To show the applicability of the Gaussian model, the test trajectories are classified with only partial observations. Performance is quantified by the number of observations required to correctly classify the vehicle trajectory. Both the intersection traffic modeling computations and the classification procedure are timed. These times are presented as results and demonstrate that the model can be constructed in a reasonable amount of time and the classification procedure can be used for online applications.
Tasks Object Recognition, Time Series, Trajectory Prediction
Published 2020-02-05
URL https://arxiv.org/abs/2002.01965v1
PDF https://arxiv.org/pdf/2002.01965v1.pdf
PWC https://paperswithcode.com/paper/learning-probabilistic-intersection-traffic
Repo
Framework
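
The clustering-plus-GP idea described in the abstract can be sketched with scikit-learn: fit one Gaussian process per source/destination cluster on time-indexed positions, then classify a partial trajectory by which cluster's GP assigns it the highest likelihood. This is only a minimal illustration under assumed synthetic trajectories and kernel choices, not the authors' implementation.

```python
# Minimal sketch: per-cluster Gaussian process traffic models and
# partial-trajectory classification (illustrative, not the paper's code).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from scipy.stats import norm

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 30).reshape(-1, 1)

# Two synthetic "clusters": straight-through vs. left-turn x-coordinates.
straight = [t.ravel() * 10 + rng.normal(0, 0.2, 30) for _ in range(5)]
left_turn = [np.sin(t.ravel() * np.pi / 2) * 6 + rng.normal(0, 0.2, 30) for _ in range(5)]

def fit_cluster(trajs):
    X = np.vstack([t] * len(trajs))
    y = np.concatenate(trajs)
    kernel = RBF(length_scale=0.3) + WhiteKernel(noise_level=0.1)
    return GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

models = {"straight": fit_cluster(straight), "left_turn": fit_cluster(left_turn)}

def classify(partial_t, partial_x):
    """Pick the cluster whose GP gives the partial observation the highest log-likelihood."""
    scores = {}
    for name, gp in models.items():
        mu, sd = gp.predict(partial_t, return_std=True)
        scores[name] = norm.logpdf(partial_x, loc=mu, scale=sd).sum()
    return max(scores, key=scores.get), scores

obs_t = t[:8]                      # only the first few observations are available
obs_x = np.sin(obs_t.ravel() * np.pi / 2) * 6
print(classify(obs_t, obs_x)[0])   # expected: "left_turn"
```

The number of observations needed before the correct cluster wins the likelihood comparison mirrors the classification metric the abstract reports.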

Deep compositional robotic planners that follow natural language commands

Title Deep compositional robotic planners that follow natural language commands
Authors Yen-Ling Kuo, Boris Katz, Andrei Barbu
Abstract We demonstrate how a sampling-based robotic planner can be augmented to learn to understand a sequence of natural language commands in a continuous configuration space to move and manipulate objects. Our approach combines a deep network structured according to the parse of a complex command that includes objects, verbs, spatial relations, and attributes, with a sampling-based planner, RRT. A recurrent hierarchical deep network controls how the planner explores the environment, determines when a planned path is likely to achieve a goal, and estimates the confidence of each move to trade off exploitation and exploration between the network and the planner. Planners are designed to have near-optimal behavior when information about the task is missing, while networks learn to exploit observations which are available from the environment, making the two naturally complementary. Combining the two enables generalization to new maps, new kinds of obstacles, and more complex sentences that do not occur in the training set. Little data is required to train the model despite it jointly acquiring a CNN that extracts features from the environment as it learns the meanings of words. The model provides a level of interpretability through the use of attention maps allowing users to see its reasoning steps despite being an end-to-end model. This end-to-end model allows robots to learn to follow natural language commands in challenging continuous environments.
Tasks
Published 2020-02-12
URL https://arxiv.org/abs/2002.05201v2
PDF https://arxiv.org/pdf/2002.05201v2.pdf
PWC https://paperswithcode.com/paper/deep-compositional-robotic-planners-that
Repo
Framework
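
A toy sketch of the combination described above: a plain 2D RRT whose sampling is biased by a scoring function standing in for the recurrent network that, in the paper, decides where to explore and when a path is likely to achieve the goal. The scorer here is a hand-written placeholder and every constant is an assumption made purely for illustration.

```python
# Illustrative 2D RRT with a learned-scorer hook (placeholder scorer, not the paper's network).
import numpy as np

rng = np.random.default_rng(1)
GOAL, START, STEP = np.array([9.0, 9.0]), np.array([0.0, 0.0]), 0.5

def network_score(point):
    # Stand-in for the deep network's preference over where to explore next;
    # here it simply prefers samples closer to the goal region.
    return -np.linalg.norm(point - GOAL)

def rrt(n_iters=2000, n_candidates=8):
    nodes, parents = [START], [None]
    for _ in range(n_iters):
        # Sample several candidates and let the "network" pick the most promising one.
        cands = rng.uniform(0, 10, size=(n_candidates, 2))
        target = max(cands, key=network_score)
        near_i = int(np.argmin([np.linalg.norm(n - target) for n in nodes]))
        direction = target - nodes[near_i]
        new = nodes[near_i] + STEP * direction / (np.linalg.norm(direction) + 1e-9)
        nodes.append(new); parents.append(near_i)
        if np.linalg.norm(new - GOAL) < STEP:       # goal test
            path, i = [new], near_i
            while i is not None:
                path.append(nodes[i]); i = parents[i]
            return path[::-1]
    return None

path = rrt()
print("no path" if path is None else f"found path with {len(path)} waypoints")
```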

A mean-field theory of lazy training in two-layer neural nets: entropic regularization and controlled McKean-Vlasov dynamics

Title A mean-field theory of lazy training in two-layer neural nets: entropic regularization and controlled McKean-Vlasov dynamics
Authors Belinda Tzen, Maxim Raginsky
Abstract We consider the problem of universal approximation of functions by two-layer neural nets with random weights that are “nearly Gaussian” in the sense of Kullback-Leibler divergence. This problem is motivated by recent works on lazy training, where the weight updates generated by stochastic gradient descent do not move appreciably from the i.i.d. Gaussian initialization. We first consider the mean-field limit, where the finite population of neurons in the hidden layer is replaced by a continuous ensemble, and show that our problem can be phrased as global minimization of a free-energy functional on the space of probability measures over the weights. This functional trades off the $L^2$ approximation risk against the KL divergence with respect to a centered Gaussian prior. We characterize the unique global minimizer and then construct a controlled nonlinear dynamics in the space of probability measures over weights that solves a McKean–Vlasov optimal control problem. This control problem is closely related to the Schrödinger bridge (or entropic optimal transport) problem, and its value is proportional to the minimum of the free energy. Finally, we show that SGD in the lazy training regime (which can be ensured by jointly tuning the variance of the Gaussian prior and the entropic regularization parameter) serves as a greedy approximation to the optimal McKean–Vlasov distributional dynamics and provide quantitative guarantees on the $L^2$ approximation error.
Tasks
Published 2020-02-05
URL https://arxiv.org/abs/2002.01987v3
PDF https://arxiv.org/pdf/2002.01987v3.pdf
PWC https://paperswithcode.com/paper/a-mean-field-theory-of-lazy-training-in-two
Repo
Framework
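
In symbols, the free-energy objective sketched in the abstract plausibly takes the following form (the notation is ours, and the exact functional in the paper may differ): the $L^2$ risk of the mean-field predictor plus a KL penalty toward a centered Gaussian prior over the weights.

```latex
% Sketch of the free-energy functional (assumed form, our notation):
\[
F(\mu) \;=\; \underbrace{\tfrac{1}{2}\,\mathbb{E}_{x\sim P}\big[(f_\mu(x) - f^*(x))^2\big]}_{L^2 \text{ approximation risk}}
\;+\; \tfrac{1}{\beta}\, D_{\mathrm{KL}}\!\left(\mu \,\middle\|\, \pi\right),
\qquad
f_\mu(x) = \int \varphi(x, w)\,\mathrm{d}\mu(w), \quad \pi = \mathcal{N}(0, \sigma^2 I).
\]
```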

Exploiting Event Cameras for Spatio-Temporal Prediction of Fast-Changing Trajectories

Title Exploiting Event Cameras for Spatio-Temporal Prediction of Fast-Changing Trajectories
Authors Marco Monforte, Ander Arriandiaga, Arren Glover, Chiara Bartolozzi
Abstract This paper investigates trajectory prediction for robotics, to improve the interaction of robots with moving targets, such as catching a bouncing ball. Unexpected, highly non-linear trajectories cannot easily be predicted with regression-based fitting procedures; therefore, we apply state-of-the-art machine learning, specifically Long Short-Term Memory (LSTM) architectures. In addition, fast-moving targets are better sensed using event cameras, which produce an asynchronous output triggered by spatial change, rather than at fixed temporal intervals as with traditional cameras. We investigate how LSTM models can be adapted for event camera data, and in particular look at the benefit of using asynchronously sampled data.
Tasks Trajectory Prediction
Published 2020-01-05
URL https://arxiv.org/abs/2001.01248v2
PDF https://arxiv.org/pdf/2001.01248v2.pdf
PWC https://paperswithcode.com/paper/exploiting-event-driven-cameras-for-spatio
Repo
Framework
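
A minimal PyTorch sketch of the adaptation discussed above: feed the LSTM asynchronous samples as (Δt, x, y) triplets, so irregular inter-sample times enter the model explicitly, and regress the next position. Architecture sizes and the synthetic trajectory are assumptions, not the paper's setup.

```python
# Minimal sketch: LSTM over asynchronously sampled (dt, x, y) points -> next position.
import torch
import torch.nn as nn

class TrajLSTM(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=3, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)          # predict next (x, y)

    def forward(self, seq):                       # seq: (batch, time, 3) = (dt, x, y)
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])

torch.manual_seed(0)
model = TrajLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Synthetic bouncing-like trajectory with irregular timestamps.
t = torch.cumsum(torch.rand(1, 40) * 0.05, dim=1)           # irregular times
xy = torch.stack([torch.sin(3 * t), torch.abs(torch.cos(3 * t))], dim=-1)
dt = torch.cat([torch.zeros(1, 1), t[:, 1:] - t[:, :-1]], dim=1).unsqueeze(-1)
seq = torch.cat([dt, xy], dim=-1)                            # (1, 40, 3)

for _ in range(200):                                         # predict the last point from the prefix
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(seq[:, :-1]), xy[:, -1])
    loss.backward()
    opt.step()
print(float(loss))
```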

NAttack! Adversarial Attacks to bypass a GAN based classifier trained to detect Network intrusion

Title NAttack! Adversarial Attacks to bypass a GAN based classifier trained to detect Network intrusion
Authors Aritran Piplai, Sai Sree Laya Chukkapalli, Anupam Joshi
Abstract With the recent developments in artificial intelligence and machine learning, anomalies in network traffic can be detected using machine learning approaches. Before the rise of machine learning, network anomalies, which could imply an attack, were detected using well-crafted rules. An attacker who has knowledge in the field of cyber-defence could make educated guesses to sometimes accurately predict which particular features of network traffic data the cyber-defence mechanism is looking at. With this information, the attacker can circumvent a rule-based cyber-defence system. However, with the advancement of machine learning for network anomaly detection, it is not easy for a human to understand how to bypass a cyber-defence system. Recently, adversarial attacks have become increasingly common as a way to defeat machine learning algorithms. In this paper, we show that even if we build a classifier and train it with adversarial examples for network data, we can use adversarial attacks and successfully break the system. We propose a Generative Adversarial Network (GAN) based algorithm to generate data to train an efficient neural network based classifier, and we subsequently break the system using adversarial attacks.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.08527v1
PDF https://arxiv.org/pdf/2002.08527v1.pdf
PWC https://paperswithcode.com/paper/nattack-adversarial-attacks-to-bypass-a-gan
Repo
Framework
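
The attack side of such a pipeline can be illustrated with a standard fast gradient sign method (FGSM) perturbation of network-flow feature vectors against a small classifier. The classifier, feature count, and epsilon below are stand-ins, and FGSM is just one representative of the adversarial attacks the abstract refers to.

```python
# Illustrative FGSM attack on a toy network-traffic classifier (not the paper's GAN pipeline).
import torch
import torch.nn as nn

torch.manual_seed(0)
n_features = 20                                    # e.g. flow statistics (assumed)
clf = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 2))

# Pretend this sample is a correctly flagged "attack" flow (label 1).
x = torch.randn(1, n_features)
y = torch.tensor([1])

def fgsm(model, x, y, eps=0.1):
    """One-step FGSM: move features in the direction that increases the loss."""
    x_adv = x.clone().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

x_adv = fgsm(clf, x, y)
print("clean pred:", clf(x).argmax(1).item(), "adv pred:", clf(x_adv).argmax(1).item())
```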

What Information Does a ResNet Compress?

Title What Information Does a ResNet Compress?
Authors Luke Nicholas Darlow, Amos Storkey
Abstract The information bottleneck principle (Shwartz-Ziv & Tishby, 2017) suggests that SGD-based training of deep neural networks results in optimally compressed hidden layers, from an information theoretic perspective. However, this claim was established on toy data. The goal of the work we present here is to test whether the information bottleneck principle is applicable to a realistic setting using a larger and deeper convolutional architecture, a ResNet model. We trained PixelCNN++ models as inverse representation decoders to measure the mutual information between hidden layers of a ResNet and input image data, when trained for (1) classification and (2) autoencoding. We find that two stages of learning happen for both training regimes, and that compression does occur, even for an autoencoder. Sampling images by conditioning on hidden layers’ activations offers an intuitive visualisation to understand what a ResNet learns to forget.
Tasks
Published 2020-03-13
URL https://arxiv.org/abs/2003.06254v1
PDF https://arxiv.org/pdf/2003.06254v1.pdf
PWC https://paperswithcode.com/paper/what-information-does-a-resnet-compress-1
Repo
Framework
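
The measurement trick described above, using a trained decoder to bound mutual information, can be written as the standard variational (Barber–Agakov) lower bound; we assume the paper's estimator is close to this textbook form, with $q(x \mid h)$ being the trained PixelCNN++ inverse-representation decoder.

```latex
% Decoder-based lower bound on the mutual information between input x and hidden layer h:
\[
I(X; H) \;=\; H(X) - H(X \mid H) \;\ge\; H(X) + \mathbb{E}_{p(x,h)}\big[\log q(x \mid h)\big].
\]
```

In words: the decoder's negative log-likelihood upper-bounds $H(X \mid H)$, so a better-trained decoder gives a tighter lower bound on the mutual information.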

Compositional Embeddings for Multi-Label One-Shot Learning

Title Compositional Embeddings for Multi-Label One-Shot Learning
Authors Zeqian Li, Michael C. Mozer, Jacob Whitehill
Abstract We explore the idea of compositional set embeddings that can be used to infer not just a single class per input (e.g., image, video, audio signal), but a collection of classes, in the setting of one-shot learning. Class compositionality is useful in tasks such as multi-object detection in images and multi-speaker diarization in audio. Specifically, we devise and implement two novel models consisting of (1) an embedding function f trained jointly with a “composite” function g that computes set union operations between the classes encoded in two embedding vectors; and (2) embedding f trained jointly with a “query” function h that computes whether the classes encoded in one embedding subsume the classes encoded in another embedding. In contrast to previously developed methods, these models must both determine the classes associated with the input examples and encode the relationships between different class label sets. In experiments conducted on simulated data, OmniGlot, LibriSpeech and Open Images datasets, the proposed composite embedding models outperform baselines based on traditional embedding methods.
Tasks Object Detection, Omniglot, One-Shot Learning, Speaker Diarization
Published 2020-02-11
URL https://arxiv.org/abs/2002.04193v2
PDF https://arxiv.org/pdf/2002.04193v2.pdf
PWC https://paperswithcode.com/paper/compositional-embeddings-for-multi-label-one
Repo
Framework
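
A compact PyTorch sketch of the two learned functions: an embedding f and a "composite" g trained so that g(f(a), f(b)) lands near an embedding representing the union of the two label sets. Everything here (dimensions, loss, the way the union target is faked) is a toy assumption meant only to show the shape of the idea, not the paper's models or data.

```python
# Toy sketch of compositional set embeddings: g(f(a), f(b)) should match the union embedding.
import torch
import torch.nn as nn

torch.manual_seed(0)
d_in, d_emb, n_classes = 16, 8, 5

f = nn.Sequential(nn.Linear(d_in, 32), nn.ReLU(), nn.Linear(32, d_emb))       # example encoder
g = nn.Sequential(nn.Linear(2 * d_emb, 32), nn.ReLU(), nn.Linear(32, d_emb))  # set-union composer
class_emb = nn.Embedding(n_classes, d_in)                                      # stand-in "reference" inputs

params = list(f.parameters()) + list(g.parameters()) + list(class_emb.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

for step in range(500):
    a = torch.randint(0, n_classes, (32,))
    b = torch.randint(0, n_classes, (32,))
    ea, eb = f(class_emb(a)), f(class_emb(b))
    union = g(torch.cat([ea, eb], dim=-1))
    # Target: embedding of an example known to contain both classes; here we fake such an
    # example by averaging the two reference inputs (purely illustrative).
    target = f((class_emb(a) + class_emb(b)) / 2).detach()
    loss = nn.functional.mse_loss(union, target)
    opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```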

Machine Learning Techniques to Detect and Characterise Whistler Radio Waves

Title Machine Learning Techniques to Detect and Characterise Whistler Radio Waves
Authors Othniel J. E. Y. Konan, Amit Kumar Mishra, Stefan Lotz
Abstract Lightning strokes create powerful electromagnetic pulses that routinely cause very low frequency (VLF) waves to propagate across hemispheres along geomagnetic field lines. VLF antenna receivers can be used to detect the whistler waves generated by these lightning strokes. The particular time/frequency dependence of the received whistler wave enables the estimation of electron density in the plasmasphere region of the magnetosphere. Therefore, the identification and characterisation of whistlers are important tasks to monitor the plasmasphere in real-time and to build large databases of events to be used for statistical studies. The current state of the art in detecting whistlers is the Automatic Whistler Detection (AWD) method developed by Lichtenberger (2009). This method is based on image correlation in 2 dimensions and requires significant computing hardware situated at the VLF receiver antennas (e.g. in Antarctica). The aim of this work is to develop a machine learning-based model capable of automatically detecting whistlers in the data provided by the VLF receivers. The approach is to use a combination of image classification and localisation on the spectrogram data generated by the VLF receivers to identify and localise each whistler. The data at hand has around 2300 events identified by AWD at SANAE and Marion and will be used as training, validation, and testing data. Three detector designs have been proposed: the first uses a method similar to AWD, the second uses image classification on regions of interest extracted from a spectrogram, and the last uses YOLO, the current state of the art in object detection. It has been shown that these detectors can achieve a misdetection and false alarm rate of less than 15% on Marion’s dataset.
Tasks Image Classification, Object Detection
Published 2020-02-04
URL https://arxiv.org/abs/2002.01244v1
PDF https://arxiv.org/pdf/2002.01244v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-techniques-to-detect-and
Repo
Framework
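
The second detector design (image classification on spectrogram regions of interest) can be outlined as follows. The VLF-like signal, window sizes, sampling rate, and the tiny untrained CNN are all placeholders, since the real data comes from the SANAE and Marion receivers.

```python
# Sketch: spectrogram of a VLF-like signal + sliding-window CNN scoring of regions of interest.
import numpy as np
import torch
import torch.nn as nn
from scipy.signal import spectrogram

fs = 20_000                                        # assumed sampling rate
t = np.arange(0, 2.0, 1 / fs)
sig = np.sin(2 * np.pi * (5000 - 2000 * t) * t)    # crude falling-tone stand-in for a whistler
sig += 0.5 * np.random.default_rng(0).normal(size=sig.size)

f, times, Sxx = spectrogram(sig, fs=fs, nperseg=256, noverlap=128)
S = torch.tensor(np.log1p(Sxx), dtype=torch.float32)[None, None]   # (1, 1, F, T)

cnn = nn.Sequential(                               # untrained toy classifier: whistler vs. noise
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(8, 2),
)

win = 32                                           # width (time bins) of each region of interest
scores = []
for start in range(0, S.shape[-1] - win, win // 2):
    patch = S[..., start:start + win]
    scores.append(cnn(patch).softmax(-1)[0, 1].item())
print("highest-scoring window starts at time bin", int(np.argmax(scores)) * (win // 2))
```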

Multiple Discrimination and Pairwise CNN for View-based 3D Object Retrieval

Title Multiple Discrimination and Pairwise CNN for View-based 3D Object Retrieval
Authors Z. Gao, K. X. Xue, S. H. Wan
Abstract With the rapid development and wide application of computer, camera, network and hardware technology, 3D object (or model) retrieval has attracted widespread attention and has become a hot research topic in the computer vision domain. Deep learning features already available in 3D object retrieval have been proven to outperform hand-crafted features. However, most existing networks do not take into account the impact of multi-view image selection on network training, and using contrastive loss alone only forces same-class samples to be as close as possible. In this work, a novel solution named Multi-view Discrimination and Pairwise CNN (MDPCNN) for 3D object retrieval is proposed to tackle these issues. It can simultaneously take multiple batches and multiple views as input by adding a Slice layer and a Concat layer. Furthermore, a highly discriminative network is obtained by training on samples that are not easily classified by clustering. Lastly, we deploy the contrastive-center loss and contrastive loss as the optimization objectives, which yield better intra-class compactness and inter-class separability. Large-scale experiments show that the proposed MDPCNN achieves a significant performance improvement over state-of-the-art algorithms in 3D object retrieval.
Tasks 3D Object Retrieval
Published 2020-02-27
URL https://arxiv.org/abs/2002.11977v1
PDF https://arxiv.org/pdf/2002.11977v1.pdf
PWC https://paperswithcode.com/paper/multiple-discrimination-and-pairwise-cnn-for
Repo
Framework
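
The contrastive-center loss mentioned in the abstract (pulling features toward their own class center while pushing them away from other centers) can be sketched in PyTorch as below; the exact formulation in MDPCNN may differ, so treat this as the generic version of the loss.

```python
# Generic contrastive-center loss sketch (not necessarily MDPCNN's exact variant).
import torch
import torch.nn as nn

class ContrastiveCenterLoss(nn.Module):
    def __init__(self, num_classes, feat_dim, delta=1.0):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.delta = delta                                            # keeps the denominator positive

    def forward(self, feats, labels):
        # Squared distances from every feature to every class center: (B, C).
        d = torch.cdist(feats, self.centers).pow(2)
        own = d.gather(1, labels.unsqueeze(1)).squeeze(1)             # distance to own center
        others = d.sum(dim=1) - own                                   # summed distance to other centers
        return 0.5 * (own / (others + self.delta)).mean()

feats = torch.randn(16, 128, requires_grad=True)
labels = torch.randint(0, 10, (16,))
loss = ContrastiveCenterLoss(num_classes=10, feat_dim=128)(feats, labels)
loss.backward()
print(float(loss))
```

Minimizing this ratio simultaneously encourages intra-class compactness (small numerator) and inter-class separability (large denominator), which is the property the abstract highlights.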

Pseudo-Bidirectional Decoding for Local Sequence Transduction

Title Pseudo-Bidirectional Decoding for Local Sequence Transduction
Authors Wangchunshu Zhou, Tao Ge, Ke Xu
Abstract Local sequence transduction (LST) tasks are sequence transduction tasks where there exists massive overlapping between the source and target sequences, such as Grammatical Error Correction (GEC) and spell or OCR correction. Previous work generally tackles LST tasks with standard sequence-to-sequence (seq2seq) models that generate output tokens from left to right and suffer from the issue of unbalanced outputs. Motivated by the characteristic of LST tasks, in this paper, we propose a simple but versatile approach named Pseudo-Bidirectional Decoding (PBD) for LST tasks. PBD copies the corresponding representations of source tokens to the decoder as pseudo future context, enabling the decoder to attend to its bi-directional context. In addition, the bidirectional decoding scheme and the characteristic of LST tasks motivate us to share the encoder and the decoder of seq2seq models. The proposed PBD approach provides right-side context information for the decoder and models the inductive bias of LST tasks, reducing the number of parameters by half and providing good regularization effects. Experimental results on several benchmark datasets show that our approach consistently improves the performance of standard seq2seq models on LST tasks.
Tasks Grammatical Error Correction, Optical Character Recognition
Published 2020-01-31
URL https://arxiv.org/abs/2001.11694v2
PDF https://arxiv.org/pdf/2001.11694v2.pdf
PWC https://paperswithcode.com/paper/pseudo-bidirectional-decoding-for-local
Repo
Framework
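
The core trick (copying source-token representations into the not-yet-generated positions so the decoder can "see" pseudo future context) can be illustrated with a small tensor-manipulation sketch. This shows only the input construction under assumed shapes, not the full shared-encoder/decoder model.

```python
# Sketch of pseudo-future context for local sequence transduction (input construction only).
import torch

T, d = 6, 8                                   # sequence length and hidden size (toy values)
src_repr = torch.randn(T, d)                  # encoder representations of the source tokens
tgt_emb = torch.randn(T, d)                   # embeddings of target tokens generated so far

def decoder_input_at_step(t):
    """At decoding step t, positions < t hold generated-target embeddings and
    positions >= t are filled with the aligned source representations as pseudo future context."""
    return torch.cat([tgt_emb[:t], src_repr[t:]], dim=0)     # (T, d)

print(decoder_input_at_step(3).shape)         # torch.Size([6, 8])
```

Because LST source and target sequences overlap heavily, the copied source representations are usually a good proxy for the tokens the decoder has not produced yet.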

Bio-inspired Optimization: metaheuristic algorithms for optimization

Title Bio-inspired Optimization: metaheuristic algorithms for optimization
Authors Pravin S Game, Dr. Vinod Vaze, Dr. Emmanuel M
Abstract In today’s day and age, solving complex real-world problems has become a fundamentally vital and critical task. Many of these are combinatorial problems, where optimal solutions are sought rather than exact solutions. Traditional optimization methods are found to be effective for small-scale problems. However, for real-world large-scale problems, traditional methods either do not scale up, fail to obtain optimal solutions, or end up giving solutions only after a long running time. Even earlier artificial-intelligence-based techniques used to solve these problems could not give acceptable results. However, the last two decades have seen many new AI methods based on the characteristics and behaviors of living organisms in nature, which are categorized as bio-inspired or nature-inspired optimization algorithms. These methods, also termed meta-heuristic optimization methods, have been proven theoretically, implemented using simulation, and used to create many useful applications. They have been used extensively to solve many complex industrial and engineering problems because they are easy to understand, flexible, simple to adapt to the problem at hand and, most importantly, able to escape local optima traps. This local-optima avoidance property helps in finding globally optimal solutions. This paper is aimed at understanding how nature has inspired many optimization algorithms, presenting a basic categorization of them, and surveying the major bio-inspired optimization algorithms invented in recent times along with their applications.
Tasks
Published 2020-02-24
URL https://arxiv.org/abs/2003.11637v1
PDF https://arxiv.org/pdf/2003.11637v1.pdf
PWC https://paperswithcode.com/paper/bio-inspired-optimization-metaheuristic
Repo
Framework
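
As a concrete example of the local-optima-avoiding behaviour the survey describes, here is a minimal particle swarm optimization run on the multi-modal Rastrigin function. The coefficients are common textbook defaults, not values taken from the paper.

```python
# Minimal particle swarm optimization (a bio-inspired metaheuristic) on the Rastrigin function.
import numpy as np

rng = np.random.default_rng(0)

def rastrigin(x):                                    # many local minima, global minimum 0 at the origin
    return 10 * x.shape[-1] + np.sum(x**2 - 10 * np.cos(2 * np.pi * x), axis=-1)

n_particles, dim, w, c1, c2 = 30, 2, 0.7, 1.5, 1.5   # textbook-style coefficients (assumed)
pos = rng.uniform(-5.12, 5.12, (n_particles, dim))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), rastrigin(pos)
gbest = pbest[np.argmin(pbest_val)]

for _ in range(200):
    r1, r2 = rng.random((n_particles, dim)), rng.random((n_particles, dim))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    val = rastrigin(pos)
    improved = val < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], val[improved]
    gbest = pbest[np.argmin(pbest_val)]

print("best value found:", float(pbest_val.min()))   # should approach 0
```

The random velocity components keep particles jumping out of local basins, which is exactly the escape-from-local-optima property the abstract emphasizes.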

Evidence-based explanation to promote fairness in AI systems

Title Evidence-based explanation to promote fairness in AI systems
Authors Juliana Jansen Ferreira, Mateus de Souza Monteiro
Abstract As Artificial Intelligence (AI) technology gets more intertwined with every system, people are using AI to make decisions in their everyday activities. In simple contexts, such as Netflix recommendations, or in more complex contexts, such as judicial scenarios, AI is part of people’s decisions. People make decisions and, usually, they need to explain their decisions to others in some manner. This is particularly critical in contexts where human expertise is central to decision-making. In order to explain their decisions with AI support, people need to understand how AI is part of that decision. When considering the aspect of fairness, the role that AI plays in a decision-making process becomes even more sensitive, since it affects the fairness and the responsibility of those people making the ultimate decision. We have been exploring an evidence-based explanation design approach to ‘tell the story of a decision’. In this position paper, we discuss our approach for AI systems using fairness-sensitive cases from the literature.
Tasks Decision Making
Published 2020-03-03
URL https://arxiv.org/abs/2003.01525v1
PDF https://arxiv.org/pdf/2003.01525v1.pdf
PWC https://paperswithcode.com/paper/evidence-based-explanation-to-promote
Repo
Framework

Fawkes: Protecting Personal Privacy against Unauthorized Deep Learning Models

Title Fawkes: Protecting Personal Privacy against Unauthorized Deep Learning Models
Authors Shawn Shan, Emily Wenger, Jiayun Zhang, Huiying Li, Haitao Zheng, Ben Y. Zhao
Abstract Today’s proliferation of powerful facial recognition models poses a real threat to personal privacy. As Clearview.ai demonstrated, anyone can canvas the Internet for data and train highly accurate facial recognition models of us without our knowledge. We need tools to protect ourselves from unauthorized facial recognition systems and their numerous potential misuses. Unfortunately, work in related areas is limited in practicality and effectiveness. In this paper, we propose Fawkes, a system that allows individuals to inoculate themselves against unauthorized facial recognition models. Fawkes achieves this by helping users add imperceptible pixel-level changes (we call them “cloaks”) to their own photos before publishing them online. When collected by a third-party “tracker” and used to train facial recognition models, these “cloaked” images produce functional models that consistently misidentify the user. We experimentally prove that Fawkes provides 95+% protection against user recognition regardless of how trackers train their models. Even when clean, uncloaked images are “leaked” to the tracker and used for training, Fawkes can still maintain an 80+% protection success rate. In fact, we perform real experiments against today’s state-of-the-art facial recognition services and achieve 100% success. Finally, we show that Fawkes is robust against a variety of countermeasures that try to detect or disrupt cloaks.
Tasks
Published 2020-02-19
URL https://arxiv.org/abs/2002.08327v1
PDF https://arxiv.org/pdf/2002.08327v1.pdf
PWC https://paperswithcode.com/paper/fawkes-protecting-personal-privacy-against
Repo
Framework
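
The cloaking idea can be illustrated with a generic feature-space perturbation: take gradient steps that push an image's features toward a different identity's features while keeping the pixel change within a small budget. This uses an off-the-shelf, untrained feature extractor purely as a stand-in for a face-recognition feature space; it is not Fawkes' actual optimization, models, or budget.

```python
# Illustrative feature-space "cloaking" sketch (not Fawkes' actual method or models).
import torch
import torchvision

torch.manual_seed(0)
feat = torchvision.models.resnet18(weights=None)   # untrained stand-in for a face feature extractor
feat.fc = torch.nn.Identity()                      # use penultimate features
feat.eval()

user_img = torch.rand(1, 3, 224, 224)              # the photo to protect (random placeholder)
target_img = torch.rand(1, 3, 224, 224)            # a different identity's photo (placeholder)
eps = 0.03                                         # per-pixel perturbation budget

delta = torch.zeros_like(user_img, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.01)
with torch.no_grad():
    target_feat = feat(target_img)

for _ in range(50):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(feat(user_img + delta), target_feat)
    loss.backward()
    opt.step()
    with torch.no_grad():
        delta.clamp_(-eps, eps)                    # keep the cloak visually imperceptible
print("feature distance after cloaking:", float(loss))
```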

A Convolutional Neural Network into graph space

Title A Convolutional Neural Network into graph space
Authors Maxime Martineau, Romain Raveaux, Donatello Conte, Gilles Venturini
Abstract Convolutional neural networks (CNNs) have, in a few decades, outperformed existing state-of-the-art methods in classification contexts. However, in the way they were formalised, CNNs are bound to operate on Euclidean spaces. Indeed, convolution is a signal operation defined on Euclidean spaces. This has restricted deep learning’s main use to Euclidean-defined data such as sound or images. And yet, numerous computer application fields (among them network analysis, computational social science, chemo-informatics and computer graphics) give rise to non-Euclidean data such as graphs, networks or manifolds. In this paper we propose a new convolutional neural network architecture, defined directly in graph space. Convolution and pooling operators are defined in the graph domain. We show its usability in a back-propagation context. Experimental results show that our model’s performance is at the state-of-the-art level on simple tasks. It shows robustness with respect to graph-domain changes and improvement with respect to other Euclidean and non-Euclidean convolutional architectures.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.09285v2
PDF https://arxiv.org/pdf/2002.09285v2.pdf
PWC https://paperswithcode.com/paper/a-convolutional-neural-network-into-graph
Repo
Framework
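
For contrast with the paper's graph-space operators, here is the most common baseline form of a graph convolution, the GCN propagation rule H' = ReLU(D^-1/2 Â D^-1/2 H W). This is the standard operator, not the convolution proposed in the paper, and the tiny graph is made up for illustration.

```python
# Standard GCN-style graph convolution layer (baseline operator, not the paper's construction).
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, X, A):
        A_hat = A + torch.eye(A.shape[0])                     # add self-loops
        deg_inv_sqrt = A_hat.sum(dim=1).pow(-0.5)
        norm = deg_inv_sqrt.unsqueeze(1) * A_hat * deg_inv_sqrt.unsqueeze(0)
        return torch.relu(self.lin(norm @ X))                 # H' = ReLU(D^-1/2 Â D^-1/2 H W)

A = torch.tensor([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])  # 3-node path graph
X = torch.randn(3, 4)                                         # node features
print(GraphConv(4, 8)(X, A).shape)                            # torch.Size([3, 8])
```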

Self-Supervised Learning of Generative Spin-Glasses with Normalizing Flows

Title Self-Supervised Learning of Generative Spin-Glasses with Normalizing Flows
Authors Gavin S. Hartnett, Masoud Mohseni
Abstract Spin-glasses are universal models that can capture complex behavior of many-body systems at the interface of statistical physics and computer science, including discrete optimization, inference in graphical models, and automated reasoning. Computing the underlying structure and dynamics of such complex systems is extremely difficult due to the combinatorial explosion of their state space. Here, we develop deep generative continuous spin-glass distributions with normalizing flows to model correlations in generic discrete problems. We use a self-supervised learning paradigm by automatically generating the data from the spin-glass itself. We demonstrate that key physical and computational properties of the spin-glass phase can be successfully learned, including multi-modal steady-state distributions and topological structures among metastable states. Remarkably, we observe that the learning itself corresponds to a spin-glass phase transition within the layers of the trained normalizing flows. The inverse normalizing flow learns to perform reversible multi-scale coarse-graining operations, which are very different from the typical irreversible renormalization group techniques.
Tasks
Published 2020-01-02
URL https://arxiv.org/abs/2001.00585v2
PDF https://arxiv.org/pdf/2001.00585v2.pdf
PWC https://paperswithcode.com/paper/self-supervised-learning-of-generative-spin
Repo
Framework
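
A minimal affine-coupling (RealNVP-style) layer gives the flavour of the continuous-spin flows described above. The spin-glass energy, the self-supervised training loop, and the coupling topology are omitted, so this is only the reversible building block under assumed dimensions.

```python
# Minimal RealNVP-style affine coupling layer (flow building block only; spin-glass specifics omitted).
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(nn.Linear(self.half, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * (dim - self.half)))

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=-1)
        s = torch.tanh(s)                              # keep scales bounded for stability
        y2 = x2 * torch.exp(s) + t
        log_det = s.sum(dim=-1)                        # log|det J| of the transformation
        return torch.cat([x1, y2], dim=-1), log_det

    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self.net(y1).chunk(2, dim=-1)
        s = torch.tanh(s)
        x2 = (y2 - t) * torch.exp(-s)
        return torch.cat([y1, x2], dim=-1)

flow = AffineCoupling(dim=16)
z = torch.randn(8, 16)                                 # base samples (e.g. Gaussian "soft spins")
x, log_det = flow(z)
print(torch.allclose(flow.inverse(x), z, atol=1e-5))   # True: the map is exactly invertible
```

The exact invertibility and tractable log-determinant are what make the learned coarse-graining reversible, in contrast with the irreversible renormalization-group picture mentioned in the abstract.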