April 2, 2020

3064 words 15 mins read

Paper Group ANR 105


Autonomous discovery in the chemical sciences part I: Progress

Title Autonomous discovery in the chemical sciences part I: Progress
Authors Connor W. Coley, Natalie S. Eyke, Klavs F. Jensen
Abstract This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences. In this first part, we describe a classification for discoveries of physical matter (molecules, materials, devices), processes, and models and how they are unified as search problems. We then introduce a set of questions and considerations relevant to assessing the extent of autonomy. Finally, we describe many case studies of discoveries accelerated by or resulting from computer assistance and automation from the domains of synthetic chemistry, drug discovery, inorganic chemistry, and materials science. These illustrate how rapid advancements in hardware automation and machine learning continue to transform the nature of experimentation and modelling. Part two reflects on these case studies and identifies a set of open challenges for the field.
Tasks Drug Discovery
Published 2020-03-30
URL https://arxiv.org/abs/2003.13754v1
PDF https://arxiv.org/pdf/2003.13754v1.pdf
PWC https://paperswithcode.com/paper/autonomous-discovery-in-the-chemical-sciences-1

Deep Active Inference for Autonomous Robot Navigation

Title Deep Active Inference for Autonomous Robot Navigation
Authors Ozan Çatal, Samuel Wauthier, Tim Verbelen, Cedric De Boom, Bart Dhoedt
Abstract Active inference is a theory that underpins the way biological agents perceive and act in the real world. At its core, active inference is based on the principle that the brain is an approximate Bayesian inference engine, building an internal generative model to drive agents towards minimal surprise. Although this theory has shown interesting results with grounding in cognitive neuroscience, its application remains limited to simulations with small, predefined sensor and state spaces. In this paper, we leverage recent advances in deep learning to build more complex generative models that can work without a predefined state space. State representations are learned end-to-end from real-world, high-dimensional sensory data such as camera frames. We also show that these generative models can be used to engage in active inference. To the best of our knowledge, this is the first application of deep active inference to a real-world robot navigation task.
Tasks Bayesian Inference, Robot Navigation
Published 2020-03-06
URL https://arxiv.org/abs/2003.03220v1
PDF https://arxiv.org/pdf/2003.03220v1.pdf
PWC https://paperswithcode.com/paper/deep-active-inference-for-autonomous-robot

Context-Aware Design of Cyber-Physical Human Systems (CPHS)

Title Context-Aware Design of Cyber-Physical Human Systems (CPHS)
Authors Supratik Mukhopadhyay, Qun Liu, Edward Collier, Yimin Zhu, Ravindra Gudishala, Chanachok Chokwitthaya, Robert DiBiano, Alimire Nabijiang, Sanaz Saeidi, Subhajit Sidhanta, Arnab Ganguly
Abstract Recently, it has been widely accepted by the research community that interactions between humans and cyber-physical infrastructures play a significant role in determining the performance of the latter. The existing paradigm for designing cyber-physical systems for optimal performance focuses on developing models based on historical data. The context factors driving human-system interaction are difficult to capture and replicate in existing design models. As a result, many existing models address the context factors of a new design only partially, or not at all, owing to the lack of capabilities to capture them. This limitation often causes performance gaps between predicted and measured results. We envision a new design environment, a cyber-physical human system (CPHS), where decision-making processes for physical infrastructures under design are intelligently connected to distributed resources over cyberinfrastructure, such as experiments on design features and empirical evidence from operations of existing instances. The framework combines existing design models with context-aware, design-specific data involving human-infrastructure interactions in new designs, using a machine learning approach to create augmented design models with improved predictive power.
Tasks Decision Making
Published 2020-01-07
URL https://arxiv.org/abs/2001.01918v1
PDF https://arxiv.org/pdf/2001.01918v1.pdf
PWC https://paperswithcode.com/paper/context-aware-design-of-cyber-physical-human

Back-and-Forth prediction for deep tensor compression

Title Back-and-Forth prediction for deep tensor compression
Authors Hyomin Choi, Robert A. Cohen, Ivan V. Bajic
Abstract Recent AI applications such as Collaborative Intelligence with neural networks involve transferring deep feature tensors between various computing devices. This necessitates tensor compression in order to optimize the usage of bandwidth-constrained channels between devices. In this paper we present a prediction scheme called Back-and-Forth (BaF) prediction, developed for deep feature tensors, which allows us to dramatically reduce tensor size and improve its compressibility. Our experiments with a state-of-the-art object detector demonstrate that the proposed method allows us to significantly reduce the number of bits needed for compressing feature tensors extracted from deep within the model, with negligible degradation of the detection performance and without requiring any retraining of the network weights. We achieve a 62% and 75% reduction in tensor size while keeping the loss in accuracy of the network to less than 1% and 2%, respectively.
Published 2020-02-14
URL https://arxiv.org/abs/2002.07036v1
PDF https://arxiv.org/pdf/2002.07036v1.pdf
PWC https://paperswithcode.com/paper/back-and-forth-prediction-for-deep-tensor

A Study on Multimodal and Interactive Explanations for Visual Question Answering

Title A Study on Multimodal and Interactive Explanations for Visual Question Answering
Authors Kamran Alipour, Jurgen P. Schulze, Yi Yao, Avi Ziskind, Giedrius Burachas
Abstract Explainability and interpretability of AI models are essential factors affecting the safety of AI. While various explainable AI (XAI) approaches aim at mitigating the lack of transparency in deep networks, the evidence of the effectiveness of these approaches in improving usability, trust, and understanding of AI systems is still missing. We evaluate multimodal explanations in the setting of a Visual Question Answering (VQA) task, by asking users to predict the response accuracy of a VQA agent with and without explanations. We use between-subjects and within-subjects experiments to probe explanation effectiveness in terms of improving user prediction accuracy, confidence, and reliance, among other factors. The results indicate that the explanations help improve human prediction accuracy, especially in trials when the VQA system’s answer is inaccurate. Furthermore, we introduce active attention, a novel method for evaluating causal attentional effects through intervention by editing attention maps. User explanation ratings are strongly correlated with human prediction accuracy and suggest the efficacy of these explanations in human-machine AI collaboration tasks.
Tasks Question Answering, Visual Question Answering
Published 2020-03-01
URL https://arxiv.org/abs/2003.00431v1
PDF https://arxiv.org/pdf/2003.00431v1.pdf
PWC https://paperswithcode.com/paper/a-study-on-multimodal-and-interactive

Stochastic Frequency Masking to Improve Super-Resolution and Denoising Networks

Title Stochastic Frequency Masking to Improve Super-Resolution and Denoising Networks
Authors Majed El Helou, Ruofan Zhou, Sabine Süsstrunk
Abstract Super-resolution and denoising are ill-posed yet fundamental image restoration tasks. In blind settings, the degradation kernel or the noise level are unknown. This makes restoration even more challenging, notably for learning-based methods, as they tend to overfit to the degradation seen during training. We present an analysis, in the frequency domain, of degradation-kernel overfitting in super-resolution and introduce a conditional learning perspective that extends to both super-resolution and denoising. Building on our formulation, we propose a stochastic frequency masking of images used in training to regularize the networks and address the overfitting problem. Our technique improves state-of-the-art methods on blind super-resolution with different synthetic kernels, real super-resolution, blind Gaussian denoising, and real-image denoising.
Tasks Denoising, Image Denoising, Image Restoration, Super-Resolution
Published 2020-03-16
URL https://arxiv.org/abs/2003.07119v1
PDF https://arxiv.org/pdf/2003.07119v1.pdf
PWC https://paperswithcode.com/paper/stochastic-frequency-masking-to-improve-super
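
A minimal NumPy sketch of the core idea, assuming a radial band mask in the 2D Fourier domain; the paper's exact masking distribution and mask shape may differ.

```python
import numpy as np

def stochastic_frequency_mask(img, band_width=0.15, rng=None):
    """Zero a randomly located radial frequency band of a grayscale image.

    Illustrative training-time regularizer in the spirit of stochastic
    frequency masking; the sampled band is an assumption of this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))
    # Normalized radial frequency of every DFT coefficient
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot((yy - h / 2) / h, (xx - w / 2) / w)
    lo = rng.uniform(0.0, 0.7 - band_width)   # random band location
    keep = ~((r >= lo) & (r < lo + band_width))
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * keep)))
```

Applied on the fly to training images, this prevents a network from latching onto the specific frequency signature of one degradation kernel.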

LEEP: A New Measure to Evaluate Transferability of Learned Representations

Title LEEP: A New Measure to Evaluate Transferability of Learned Representations
Authors Cuong V. Nguyen, Tal Hassner, Cedric Archambeau, Matthias Seeger
Abstract We introduce a new measure to evaluate the transferability of representations learned by classifiers. Our measure, the Log Expected Empirical Prediction (LEEP), is simple and easy to compute: when given a classifier trained on a source data set, it only requires running the target data set through this classifier once. We analyze the properties of LEEP theoretically and demonstrate its effectiveness empirically. Our analysis shows that LEEP can predict the performance and convergence speed of both transfer and meta-transfer learning methods, even for small or imbalanced data. Moreover, LEEP outperforms recently proposed transferability measures such as negative conditional entropy and H scores. Notably, when transferring from ImageNet to CIFAR100, LEEP can achieve up to 30% improvement compared to the best competing method in terms of the correlations with actual transfer accuracy.
Tasks Transfer Learning
Published 2020-02-27
URL https://arxiv.org/abs/2002.12462v1
PDF https://arxiv.org/pdf/2002.12462v1.pdf
PWC https://paperswithcode.com/paper/leep-a-new-measure-to-evaluate
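
LEEP is simple enough to sketch directly from its definition. The NumPy version below assumes `theta` holds the source classifier's class probabilities on the target set; the small epsilon guards are implementation details of this sketch, not part of the paper.

```python
import numpy as np

def leep(theta, y, num_target_classes):
    """Log Expected Empirical Prediction (LEEP) transferability score.

    theta: (n, |Z|) source-model class probabilities on the target data.
    y:     (n,) integer target labels.
    Returns a value <= 0; higher (closer to 0) suggests better transfer.
    """
    n, num_z = theta.shape
    # Empirical joint distribution over (target label y, source label z)
    joint = np.zeros((num_target_classes, num_z))
    for i in range(n):
        joint[y[i]] += theta[i]
    joint /= n
    p_z = joint.sum(axis=0)                          # marginal over source labels
    p_y_given_z = joint / np.maximum(p_z, 1e-12)     # conditional P(y | z)
    # Expected empirical prediction of each sample's true target label
    eep = (p_y_given_z[y] * theta).sum(axis=1)
    return float(np.log(np.maximum(eep, 1e-12)).mean())
```

A single pass of the target data through the source classifier is all that is needed; when the source predictions already align perfectly with the target labels, the score attains its maximum of 0.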

Image Restoration for Under-Display Camera

Title Image Restoration for Under-Display Camera
Authors Yuqian Zhou, David Ren, Neil Emerton, Sehoon Lim, Timothy Large
Abstract The new trend of full-screen devices encourages us to position a camera behind a screen. Removing the bezel and centralizing the camera under the screen brings larger display-to-body ratio and enhances eye contact in video chat, but also causes image degradation. In this paper, we focus on a newly-defined Under-Display Camera (UDC), as a novel real-world single image restoration problem. First, we take a 4k Transparent OLED (T-OLED) and a phone Pentile OLED (P-OLED) and analyze their optical systems to understand the degradation. Second, we design a novel Monitor-Camera Imaging System (MCIS) for easier real pair data acquisition, and a model-based data synthesizing pipeline to generate UDC data only from display pattern and camera measurements. Finally, we resolve the complicated degradation using learning-based methods. Our model demonstrates a real-time high-quality restoration trained with either real or synthetic data. The presented results and methods provide good practice to apply image restoration to real-world applications.
Tasks Image Restoration
Published 2020-03-10
URL https://arxiv.org/abs/2003.04857v1
PDF https://arxiv.org/pdf/2003.04857v1.pdf
PWC https://paperswithcode.com/paper/image-restoration-for-under-display-camera

Better Depth-Width Trade-offs for Neural Networks through the lens of Dynamical Systems

Title Better Depth-Width Trade-offs for Neural Networks through the lens of Dynamical Systems
Authors Vaggos Chatziafratis, Sai Ganesh Nagarajan, Ioannis Panageas
Abstract The expressivity of neural networks as a function of their depth, width and type of activation units has been an important question in deep learning theory. Recently, depth separation results for ReLU networks were obtained via a new connection with dynamical systems, using a generalized notion of fixed points of a continuous map $f$, called periodic points. In this work, we strengthen the connection with dynamical systems and we improve the existing width lower bounds along several aspects. Our first main result is period-specific width lower bounds that hold under the stronger notion of $L^1$-approximation error, instead of the weaker classification error. Our second contribution is that we provide sharper width lower bounds, still yielding meaningful exponential depth-width separations, in regimes where previous results wouldn’t apply. A byproduct of our results is that there exists a universal constant characterizing the depth-width trade-offs, as long as $f$ has odd periods. Technically, our results follow by unveiling a tighter connection between the following three quantities of a given function: its period, its Lipschitz constant and the growth rate of the number of oscillations arising under compositions of the function $f$ with itself.
Published 2020-03-02
URL https://arxiv.org/abs/2003.00777v1
PDF https://arxiv.org/pdf/2003.00777v1.pdf
PWC https://paperswithcode.com/paper/better-depth-width-trade-offs-for-neural

Gradient Boosted Flows

Title Gradient Boosted Flows
Authors Robert Giaquinto, Arindam Banerjee
Abstract Normalizing flows (NF) are a powerful framework for approximating posteriors. By mapping a simple base density through invertible transformations, flows provide an exact method of density evaluation and sampling. The trend in normalizing flow literature has been to devise deeper, more complex transformations to achieve greater flexibility. We propose an alternative: Gradient Boosted Flows (GBF) model a variational posterior by successively adding new NF components by gradient boosting so that each new NF component is fit to the residuals of the previously trained components. The GBF formulation results in a variational posterior that is a mixture model, whose flexibility increases as more components are added. Moreover, GBFs offer a wider, not deeper, approach that can be incorporated to improve the results of many existing NFs. We demonstrate the effectiveness of this technique for density estimation and, by coupling GBF with a variational autoencoder, generative modeling of images.
Tasks Density Estimation
Published 2020-02-27
URL https://arxiv.org/abs/2002.11896v1
PDF https://arxiv.org/pdf/2002.11896v1.pdf
PWC https://paperswithcode.com/paper/gradient-boosted-flows
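
A toy 1D analogue of the boosting step, assuming Gaussian components rather than normalizing flows: the weights 1/f(x_i) are the functional gradient of the log-likelihood, so each new component is fit to the residuals of the current mixture.

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def boost_density(x, n_components=5, rho=0.3):
    """Greedily boost a 1D mixture density estimate.

    Toy analogue of the boosted-mixture idea with Gaussian components
    instead of normalizing flows: each new component is fit by weighted
    MLE, upweighting points the current mixture underfits, and the
    mixture is updated convexly.
    """
    f = np.full_like(x, 1.0 / (x.max() - x.min()))   # start from uniform
    components = []
    for _ in range(n_components):
        w = 1.0 / f                                   # residual weights
        w /= w.sum()
        mu = np.sum(w * x)                            # weighted Gaussian MLE
        sigma = np.sqrt(np.sum(w * (x - mu) ** 2)) + 1e-9
        f = (1.0 - rho) * f + rho * gauss_pdf(x, mu, sigma)
        components.append((mu, sigma))
    return f, components
```

The resulting estimate is a mixture whose flexibility grows with the number of components, i.e. the model gets wider rather than deeper.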

Human-robot co-manipulation of extended objects: Data-driven models and control from analysis of human-human dyads

Title Human-robot co-manipulation of extended objects: Data-driven models and control from analysis of human-human dyads
Authors Erich Mielke, Eric Townsend, David Wingate, Marc D. Killpack
Abstract Human teams are able to easily perform collaborative manipulation tasks. However, having a robot and a human simultaneously manipulate an extended object is difficult with existing methods from the literature. Our approach in this paper is to use data from human-human dyad experiments to determine motion intent, which we use for a physical human-robot co-manipulation task. We first present and analyze data from human-human dyads performing co-manipulation tasks. We show that our human-human dyad data has interesting trends, including that interaction forces are non-negligible compared to the force required to accelerate an object, and that the beginning of a lateral movement is characterized by distinct torque triggers from the leader of the dyad. We also examine different metrics to quantify the performance of different dyads. In addition, we develop a deep neural network based on motion data from human-human trials to predict human intent from past motion. We then show how force and motion data can be used as a basis for robot control in a human-robot dyad. Finally, we compare the performance of two controllers for human-robot co-manipulation to human-human dyad performance.
Published 2020-01-03
URL https://arxiv.org/abs/2001.00991v1
PDF https://arxiv.org/pdf/2001.00991v1.pdf
PWC https://paperswithcode.com/paper/human-robot-co-manipulation-of-extended

Analysing the Extent of Misinformation in Cancer Related Tweets

Title Analysing the Extent of Misinformation in Cancer Related Tweets
Authors Rakesh Bal, Sayan Sinha, Swastika Dutta, Risabh Joshi, Sayan Ghosh, Ritam Dutt
Abstract Twitter has become one of the most sought after places to discuss a wide variety of topics, including medically relevant issues such as cancer. This helps spread awareness regarding the various causes, cures and prevention methods of cancer. However, no proper analysis has been performed to assess the validity of such claims. In this work, we aim to tackle the misinformation spread in such platforms. We collect and present a dataset of tweets which talk specifically about cancer and propose an attention-based deep learning model for automated detection of misinformation along with its spread. We then do a comparative analysis of the linguistic variation in the text corresponding to misinformation and truth. This analysis helps us gather relevant insights on various social aspects related to misinformed tweets.
Published 2020-03-30
URL https://arxiv.org/abs/2003.13657v2
PDF https://arxiv.org/pdf/2003.13657v2.pdf
PWC https://paperswithcode.com/paper/analysing-the-extent-of-misinformation-in

On Simple Reactive Neural Networks for Behaviour-Based Reinforcement Learning

Title On Simple Reactive Neural Networks for Behaviour-Based Reinforcement Learning
Authors Ameya Pore, Gerardo Aragon-Camarasa
Abstract We present a behaviour-based reinforcement learning approach, inspired by Brooks’ subsumption architecture, in which simple fully connected networks are trained as reactive behaviours. Our working assumption is that a pick-and-place robotic task can be simplified by leveraging the domain knowledge of a robotics developer to decompose and train such reactive behaviours; namely, approach, grasp, and retract. The robot then autonomously learns how to combine them via an Actor-Critic architecture. The Actor-Critic policy determines the activation and inhibition mechanisms of the reactive behaviours in a particular temporal sequence. We validate our approach in a simulated robot environment where the task is picking a block and taking it to a target position while orienting the gripper from a top grasp. The latter represents an extra degree of freedom to which current end-to-end reinforcement learning approaches fail to generalise. Our findings suggest that robotic learning can be more effective if each behaviour is learnt in isolation and the behaviours are then combined to accomplish the task. Our approach learns the pick-and-place task in 8,000 episodes, a drastic reduction relative to the number of training episodes required by an end-to-end approach and the existing state-of-the-art algorithms.
Published 2020-01-22
URL https://arxiv.org/abs/2001.07973v1
PDF https://arxiv.org/pdf/2001.07973v1.pdf
PWC https://paperswithcode.com/paper/on-simple-reactive-neural-networks-for

Stochastic Approximate Gradient Descent via the Langevin Algorithm

Title Stochastic Approximate Gradient Descent via the Langevin Algorithm
Authors Yixuan Qiu, Xiao Wang
Abstract We introduce a novel and efficient algorithm called stochastic approximate gradient descent (SAGD), an alternative to stochastic gradient descent for cases where unbiased stochastic gradients cannot be trivially obtained. Traditional methods for such problems rely on general-purpose sampling techniques such as Markov chain Monte Carlo, which typically require manual parameter tuning and often do not work efficiently in practice. Instead, SAGD makes use of the Langevin algorithm to construct stochastic gradients that are biased in finite steps but accurate asymptotically, enabling us to theoretically establish the convergence guarantee for SAGD. Inspired by our theoretical analysis, we also provide useful guidelines for its practical implementation. Finally, we show that SAGD performs well experimentally in popular statistical and machine learning problems such as the expectation-maximization algorithm and the variational autoencoders.
Published 2020-02-13
URL https://arxiv.org/abs/2002.05519v1
PDF https://arxiv.org/pdf/2002.05519v1.pdf
PWC https://paperswithcode.com/paper/stochastic-approximate-gradient-descent-via
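
The Langevin building block can be sketched as the unadjusted Langevin algorithm; the step size and target below are illustrative, not the paper's tuned procedure.

```python
import numpy as np

def langevin_samples(grad_log_p, x0, step=0.05, n_steps=20_000, rng=None):
    """Unadjusted Langevin algorithm: x <- x + (step/2) grad log p(x) + sqrt(step) xi.

    Samples are biased at any finite step size but accurate asymptotically,
    which is the property SAGD exploits to build its stochastic gradients.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = float(x0)
    xs = np.empty(n_steps)
    for t in range(n_steps):
        x = x + 0.5 * step * grad_log_p(x) + np.sqrt(step) * rng.standard_normal()
        xs[t] = x
    return xs
```

For example, with grad_log_p(x) = -(x - 2), the score of a unit Gaussian centred at 2, the post-burn-in sample mean approaches 2; such samples can then stand in for intractable expectations inside a gradient.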

Data-driven super-parameterization using deep learning: Experimentation with multi-scale Lorenz 96 systems and transfer-learning

Title Data-driven super-parameterization using deep learning: Experimentation with multi-scale Lorenz 96 systems and transfer-learning
Authors Ashesh Chattopadhyay, Adam Subel, Pedram Hassanzadeh
Abstract To make weather/climate modeling computationally affordable, small-scale processes are usually represented in terms of the large-scale, explicitly-resolved processes using physics-based or semi-empirical parameterization schemes. Another approach, computationally more demanding but often more accurate, is super-parameterization (SP), which involves integrating the equations of small-scale processes on high-resolution grids embedded within the low-resolution grids of large-scale processes. Recently, studies have used machine learning (ML) to develop data-driven parameterization (DD-P) schemes. Here, we propose a new approach, data-driven SP (DD-SP), in which the equations of the small-scale processes are integrated data-drivenly using ML methods such as recurrent neural networks. Employing multi-scale Lorenz 96 systems as testbed, we compare the cost and accuracy (in terms of both short-term prediction and long-term statistics) of parameterized low-resolution (LR), SP, DD-P, and DD-SP models. We show that with the same computational cost, DD-SP substantially outperforms LR, and is better than DD-P, particularly when scale separation is lacking. DD-SP is much cheaper than SP, yet its accuracy is the same in reproducing long-term statistics and often comparable in short-term forecasting. We also investigate generalization, finding that when models trained on data from one system are applied to a system with different forcing (e.g., more chaotic), the models often do not generalize, particularly when the short-term prediction accuracy is examined. But we show that transfer-learning, which involves re-training the data-driven model with a small amount of data from the new system, significantly improves generalization. Potential applications of DD-SP and transfer-learning in climate/weather modeling and the expected challenges are discussed.
Tasks Transfer Learning
Published 2020-02-25
URL https://arxiv.org/abs/2002.11167v1
PDF https://arxiv.org/pdf/2002.11167v1.pdf
PWC https://paperswithcode.com/paper/data-driven-super-parameterization-using-deep
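
The multi-scale Lorenz 96 testbed has a standard two-scale form; the sketch below uses common default parameter values (F, h, b, c), which are assumptions of this sketch and not necessarily the paper's configuration.

```python
import numpy as np

def l96_two_scale_rhs(X, Y, F=10.0, h=1.0, b=10.0, c=10.0):
    """Right-hand side of the two-scale Lorenz 96 system.

    K slow variables X, each coupled to J = Y.size // K fast variables Y.
    """
    K = X.size
    J = Y.size // K
    # Coupling: each slow variable feels the sum of its fast variables
    coupling = (h * c / b) * Y.reshape(K, J).sum(axis=1)
    # dX_k/dt = X_{k-1}(X_{k+1} - X_{k-2}) - X_k + F - coupling_k
    dX = np.roll(X, 1) * (np.roll(X, -1) - np.roll(X, 2)) - X + F - coupling
    # dY_j/dt = c b Y_{j+1}(Y_{j-1} - Y_{j+2}) - c Y_j + (hc/b) X_{k(j)}
    dY = (c * b * np.roll(Y, -1) * (np.roll(Y, 1) - np.roll(Y, -2))
          - c * Y + (h * c / b) * np.repeat(X, J))
    return dX, dY
```

In the SP/DD-SP setting, the fast equations (dY) are what get integrated on the embedded high-resolution grid or replaced by a learned model, while the slow equations (dX) play the role of the resolved large-scale dynamics.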