January 31, 2020

Paper Group ANR 39

A Review of Machine Learning Applications in Fuzzing. In Hindsight: A Smooth Reward for Steady Exploration. Complementary reinforcement learning towards explainable agents. What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization?. One-Class Classification by Ensembles of Regression models – a detailed study. FisheyeMultiNet: Real-tim …

A Review of Machine Learning Applications in Fuzzing

Title A Review of Machine Learning Applications in Fuzzing
Authors Gary J Saavedra, Kathryn N Rodhouse, Daniel M Dunlavy, Philip W Kegelmeyer
Abstract Fuzzing has played an important role in improving software development and testing over the course of several decades. Recent research in fuzzing has focused on applications of machine learning (ML), offering useful tools to overcome challenges in the fuzzing process. This review surveys the current research in applying ML to fuzzing. Specifically, this review discusses successful applications of ML to fuzzing, briefly explores challenges encountered, and motivates future research to address fuzzing bottlenecks.
Tasks
Published 2019-06-13
URL https://arxiv.org/abs/1906.11133v2
PDF https://arxiv.org/pdf/1906.11133v2.pdf
PWC https://paperswithcode.com/paper/a-review-of-machine-learning-applications-in
Repo
Framework

In Hindsight: A Smooth Reward for Steady Exploration

Title In Hindsight: A Smooth Reward for Steady Exploration
Authors Hadi S. Jomaa, Josif Grabocka, Lars Schmidt-Thieme
Abstract In classical Q-learning, the objective is to maximize the sum of discounted rewards through iteratively using the Bellman equation as an update, in an attempt to estimate the action value function of the optimal policy. Conventionally, the loss function is defined as the temporal difference between the action value and the expected (discounted) reward; however, it focuses solely on the future, leading to overestimation errors. We extend the well-established Q-learning techniques by introducing the hindsight factor, an additional loss term that takes into account how the model progresses, by integrating the historic temporal difference as part of the reward. The effect of this modification is examined in a deterministic continuous-state space function estimation problem, where the overestimation phenomenon is significantly reduced and results in improved stability. The underlying effect of the hindsight factor is modeled as an adaptive learning rate, which, unlike existing adaptive optimizers, takes into account the previously estimated action value. The proposed method outperforms variations of Q-learning, with an overall higher average reward and lower action values, which supports the deterministic evaluation, and proves that the hindsight factor contributes to lower overestimation errors. The mean average score of 100 episodes obtained after training for 10 million frames shows that the hindsight factor outperforms deep Q-networks, double deep Q-networks and dueling networks for a variety of ATARI games.
Tasks Atari Games, Q-Learning
Published 2019-06-24
URL https://arxiv.org/abs/1906.09781v1
PDF https://arxiv.org/pdf/1906.09781v1.pdf
PWC https://paperswithcode.com/paper/in-hindsight-a-smooth-reward-for-steady
Repo
Framework
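
The temporal-difference update that this abstract takes as its starting point can be sketched as follows. This is a minimal tabular illustration of standard Q-learning only; the hindsight factor itself is not shown, because the abstract does not give its formula. The function names, the list-of-lists Q-table, and the values of `alpha` and `gamma` are assumptions made for the example.

```python
def td_error(q, state, action, reward, next_state, gamma=0.99):
    """Temporal difference: bootstrapped target minus the current estimate."""
    target = reward + gamma * max(q[next_state])
    return target - q[state][action]


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Move Q(state, action) a fraction alpha of the way towards the target."""
    delta = td_error(q, state, action, reward, next_state, gamma)
    q[state][action] += alpha * delta
    return delta


# Hypothetical two-state, two-action table used only for illustration.
q = [[0.0, 0.0], [0.0, 1.0]]
delta = q_update(q, state=0, action=1, reward=0.5, next_state=1,
                 alpha=0.5, gamma=0.9)
```

The `max` over next-state action values in the target is the usual source of the overestimation the abstract refers to.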

Complementary reinforcement learning towards explainable agents

Title Complementary reinforcement learning towards explainable agents
Authors Jung Hoon Lee
Abstract Reinforcement learning (RL) algorithms allow agents to learn skills and strategies to perform complex tasks without detailed instructions or expensive labelled training examples. That is, RL agents can learn, as we learn. Given the importance of learning in our intelligence, RL has been thought to be one of the key components to general artificial intelligence, and recent breakthroughs in deep reinforcement learning suggest that neural networks (NN) are natural platforms for RL agents. However, despite the efficiency and versatility of NN-based RL agents, their decision-making remains incomprehensible, reducing their utility. To deploy RL into a wider range of applications, it is imperative to develop explainable NN-based RL agents. Here, we propose a method to derive a secondary comprehensible agent from a NN-based RL agent, whose decisions are based on simple rules. Our empirical evaluation of this secondary agent’s performance supports the possibility of building a comprehensible and transparent agent using a NN-based RL agent.
Tasks Decision Making
Published 2019-01-01
URL http://arxiv.org/abs/1901.00188v2
PDF http://arxiv.org/pdf/1901.00188v2.pdf
PWC https://paperswithcode.com/paper/complementary-reinforcement-learning-towards
Repo
Framework

What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization?

Title What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization?
Authors Chi Jin, Praneeth Netrapalli, Michael I. Jordan
Abstract Minimax optimization has found extensive applications in modern machine learning, in settings such as generative adversarial networks (GANs), adversarial training and multi-agent reinforcement learning. As most of these applications involve continuous nonconvex-nonconcave formulations, a very basic question arises: "what is a proper definition of local optima?" Most previous work answers this question using classical notions of equilibria from simultaneous games, where the min-player and the max-player act simultaneously. In contrast, most applications in machine learning, including GANs and adversarial training, correspond to sequential games, where the order of which player acts first is crucial (since minimax is in general not equal to maximin due to the nonconvex-nonconcave nature of the problems). The main contribution of this paper is to propose a proper mathematical definition of local optimality for this sequential setting, local minimax, as well as to present its properties and existence results. Finally, we establish a strong connection to a basic local search algorithm, gradient descent ascent (GDA): under mild conditions, all stable limit points of GDA are exactly local minimax points up to some degenerate points.
Tasks Multi-agent Reinforcement Learning
Published 2019-02-02
URL https://arxiv.org/abs/1902.00618v2
PDF https://arxiv.org/pdf/1902.00618v2.pdf
PWC https://paperswithcode.com/paper/minmax-optimization-stable-limit-points-of
Repo
Framework
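
Gradient descent ascent, the local search algorithm named in this abstract, can be sketched on a toy problem. The objective f(x, y) = x² + xy − y², the step size, and the step count are assumptions chosen for illustration. This objective is convex in x and concave in y, so it is far simpler than the nonconvex-nonconcave setting the paper studies; the sketch only shows the shape of the simultaneous update.

```python
def grad(x, y):
    """Gradients of the toy objective f(x, y) = x**2 + x*y - y**2."""
    return 2.0 * x + y, x - 2.0 * y


def gda(x, y, lr=0.1, steps=200):
    """Simultaneous gradient descent on x (min-player) and ascent on y (max-player)."""
    for _ in range(steps):
        gx, gy = grad(x, y)  # both gradients use the same current iterate
        x, y = x - lr * gx, y + lr * gy
    return x, y


x_star, y_star = gda(1.0, -1.0)
```

For this toy objective the iterates spiral into the saddle point at the origin.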

One-Class Classification by Ensembles of Regression models – a detailed study

Title One-Class Classification by Ensembles of Regression models – a detailed study
Authors Amir Ahmad, Srikanth Bezawada
Abstract One-class classification (OCC) deals with the classification problem in which the training data has data points belonging only to the target class. In this paper, we study a one-class classification algorithm, One-Class Classification by Ensembles of Regression models (OCCER), that uses regression methods to address OCC problems. OCCER converts an OCC problem into many regression problems in the original feature space so that each feature of the original feature space is used as the target variable in one of the regression problems. Other features are used as the variables on which the dependent variable depends. The errors of regression of a data point by all the regression models are used to compute the outlier score of the data point. An extensive comparison of the OCCER algorithm with state-of-the-art OCC algorithms on several datasets was conducted to show the effectiveness of this approach. We also demonstrate that the OCCER algorithm can work well with the latent feature space created by autoencoders for image datasets. The implementation of OCCER is available at https://github.com/srikanthBezawada/OCCER.
Tasks
Published 2019-12-26
URL https://arxiv.org/abs/1912.11475v3
PDF https://arxiv.org/pdf/1912.11475v3.pdf
PWC https://paperswithcode.com/paper/occer-one-class-classification-by-ensembles
Repo
Framework
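
The idea described in this abstract, one regression model per feature with the regression errors combined into an outlier score, can be sketched for two-dimensional data. This is not the authors' implementation (see their repository for that). The use of ordinary least-squares lines, the sum of absolute errors as the score, and all function names are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_models(points):
    """One regression per feature: each feature is predicted from the other one."""
    f0 = [p[0] for p in points]
    f1 = [p[1] for p in points]
    # models[0] predicts feature 0 from feature 1; models[1] does the reverse.
    return [fit_line(f1, f0), fit_line(f0, f1)]


def outlier_score(models, point):
    """Sum of absolute regression errors over all per-feature models (assumed aggregation)."""
    (a0, b0), (a1, b1) = models
    err0 = abs(point[0] - (a0 * point[1] + b0))
    err1 = abs(point[1] - (a1 * point[0] + b1))
    return err0 + err1


# Target-class training data lying on the line y = 2x + 1 (made-up values).
train = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]
models = fit_models(train)
inlier = outlier_score(models, (2.5, 6.0))
outlier = outlier_score(models, (2.5, 0.0))
```

A point that respects the relationship between features learned from the target class scores near zero, while a point that breaks it gets a large score.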

FisheyeMultiNet: Real-time Multi-task Learning Architecture for Surround-view Automated Parking System

Title FisheyeMultiNet: Real-time Multi-task Learning Architecture for Surround-view Automated Parking System
Authors Pullarao Maddu, Wayne Doherty, Ganesh Sistu, Isabelle Leang, Michal Uricar, Sumanth Chennupati, Hazem Rashed, Jonathan Horgan, Ciaran Hughes, Senthil Yogamani
Abstract Automated Parking is a low speed manoeuvring scenario which is quite unstructured and complex, requiring full 360° near-field sensing around the vehicle. In this paper, we discuss the design and implementation of an automated parking system from the perspective of camera based deep learning algorithms. We provide a holistic overview of an industrial system covering the embedded system, use cases and the deep learning architecture. We demonstrate a real-time multi-task deep learning network called FisheyeMultiNet, which detects all the necessary objects for parking on a low-power embedded system. FisheyeMultiNet runs at 15 fps for 4 cameras and it has three tasks namely object detection, semantic segmentation and soiling detection. To encourage further research, we release a partial dataset of 5,000 images containing semantic segmentation and bounding box detection ground truth via the WoodScape project (Yogamani et al., 2019).
Tasks Multi-Task Learning, Object Detection, Semantic Segmentation
Published 2019-12-23
URL https://arxiv.org/abs/1912.11066v1
PDF https://arxiv.org/pdf/1912.11066v1.pdf
PWC https://paperswithcode.com/paper/fisheyemultinet-real-time-multi-task-learning
Repo
Framework

A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron Superconducting Technology

Title A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron Superconducting Technology
Authors Ruizhe Cai, Ao Ren, Olivia Chen, Ning Liu, Caiwen Ding, Xuehai Qian, Jie Han, Wenhui Luo, Nobuyuki Yoshikawa, Yanzhi Wang
Abstract The Adiabatic Quantum-Flux-Parametron (AQFP) superconducting technology has been recently developed, which achieves the highest energy efficiency among superconducting logic families, potentially huge gain compared with state-of-the-art CMOS. In 2016, the successful fabrication and testing of AQFP-based circuits with the scale of 83,000 JJs have demonstrated the scalability and potential of implementing large-scale systems using AQFP. As a result, it will be promising for AQFP in high-performance computing and deep space applications, with Deep Neural Network (DNN) inference acceleration as an important example. Besides ultra-high energy efficiency, AQFP exhibits two unique characteristics: the deep pipelining nature since each AQFP logic gate is connected with an AC clock signal, which increases the difficulty to avoid RAW hazards; the second is the unique opportunity of true random number generation (RNG) using a single AQFP buffer, far more efficient than RNG in CMOS. We point out that these two characteristics make AQFP especially compatible with the stochastic computing (SC) technique, which uses a time-independent bit sequence for value representation, and is compatible with the deep pipelining nature. Further, the application of SC has been investigated in DNNs in prior work, and the suitability has been illustrated as SC is more compatible with approximate computations. This work is the first to develop an SC-based DNN acceleration framework using AQFP technology.
Tasks
Published 2019-07-22
URL https://arxiv.org/abs/1907.09077v1
PDF https://arxiv.org/pdf/1907.09077v1.pdf
PWC https://paperswithcode.com/paper/a-stochastic-computing-based-deep-learning
Repo
Framework
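
The bit-sequence value representation used in stochastic computing can be sketched in software. The example assumes the common unipolar encoding, where a value in [0, 1] is the fraction of 1s in a stream, and shows how a single AND gate per bit multiplies two independent streams. Python's seeded pseudorandom generator stands in for the hardware true RNG mentioned in the abstract; the stream length and the function names are assumptions.

```python
import random


def encode(value, length, rng):
    """Unipolar stream: each bit is 1 with probability `value` (0 <= value <= 1)."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def decode(stream):
    """The represented value is the fraction of 1s, regardless of bit order."""
    return sum(stream) / len(stream)


def multiply(stream_a, stream_b):
    """Bitwise AND of two independent streams encodes the product of their values."""
    return [a & b for a, b in zip(stream_a, stream_b)]


rng = random.Random(0)  # software stand-in for a hardware random source
a = encode(0.5, 20000, rng)
b = encode(0.4, 20000, rng)
product = decode(multiply(a, b))  # approximately 0.5 * 0.4
```

The result is only approximate, and its accuracy improves with stream length, which is why the abstract pairs SC with computations that tolerate approximation.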

Pixel-wise Segmentation of Right Ventricle of Heart

Title Pixel-wise Segmentation of Right Ventricle of Heart
Authors Yaman Dang, Deepak Anand, Amit Sethi
Abstract One of the first steps in the diagnosis of most cardiac diseases, such as pulmonary hypertension and coronary heart disease, is the segmentation of ventricles from cardiac magnetic resonance (MRI) images. Manual segmentation of the right ventricle requires diligence and time, while its automated segmentation is challenging due to shape variations and ill-defined borders. We propose a deep learning based method for the accurate segmentation of the right ventricle, which does not require post-processing and yet achieves the state-of-the-art performance of 0.86 Dice coefficient and 6.73 mm Hausdorff distance on the RVSC-MICCAI 2012 dataset. We use a novel adaptive cost function to counter extreme class-imbalance in the dataset. We present a comprehensive comparative study of loss functions, architectures, and ensembling techniques to build a principled approach for biomedical segmentation tasks.
Tasks
Published 2019-08-21
URL https://arxiv.org/abs/1908.08004v1
PDF https://arxiv.org/pdf/1908.08004v1.pdf
PWC https://paperswithcode.com/paper/pixel-wise-segmentation-of-right-ventricle-of
Repo
Framework
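
The Dice coefficient reported in this abstract measures the overlap between a predicted mask and the ground truth: twice the intersection divided by the sum of the two mask sizes. A minimal sketch on flattened binary masks follows; the mask values are made up for illustration, and returning 1.0 when both masks are empty is an assumed convention.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / total


# Hypothetical six-pixel masks: two pixels overlap, each mask has three pixels set.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
score = dice_coefficient(pred, truth)
```

Because background pixels never enter the formula, the score stays informative when the structure of interest covers only a small part of the image.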

Progressive Relation Learning for Group Activity Recognition

Title Progressive Relation Learning for Group Activity Recognition
Authors Guyue Hu, Bo Cui, Yuan He, Shan Yu
Abstract Group activities usually involve spatiotemporal dynamics among many interactive individuals, while only a few participants at several key frames essentially define the activity. Therefore, effectively modeling the group-relevant and suppressing the irrelevant actions (and interactions) are vital for group activity recognition. In this paper, we propose a novel method based on deep reinforcement learning to progressively refine the low-level features and high-level relations of group activities. Firstly, we construct a semantic relation graph (SRG) to explicitly model the relations among persons. Then, two agents adopting policy according to two Markov decision processes are applied to progressively refine the SRG. Specifically, one feature-distilling (FD) agent in the discrete action space refines the low-level spatio-temporal features by distilling the most informative frames. Another relation-gating (RG) agent in continuous action space adjusts the high-level semantic graph to pay more attention to group-relevant relations. The SRG, FD agent, and RG agent are optimized alternately to mutually boost the performance of each other. Extensive experiments on two widely used benchmarks demonstrate the effectiveness and superiority of the proposed approach.
Tasks Activity Recognition, Group Activity Recognition
Published 2019-08-08
URL https://arxiv.org/abs/1908.02948v2
PDF https://arxiv.org/pdf/1908.02948v2.pdf
PWC https://paperswithcode.com/paper/progressive-relation-learning-for-group
Repo
Framework

Combining Offline Models and Online Monte-Carlo Tree Search for Planning from Scratch

Title Combining Offline Models and Online Monte-Carlo Tree Search for Planning from Scratch
Authors Yunlong Liu, Jianyang Zheng
Abstract Planning in stochastic and partially observable environments is a central issue in artificial intelligence. One commonly used technique for solving such a problem is to first construct an accurate model. Although some recent approaches have been proposed for learning optimal behaviour under model uncertainty, prior knowledge about the environment is still needed to guarantee the performance of the proposed algorithms. With the benefits of the Predictive State Representations (PSRs) approach for state representation and model prediction, in this paper, we introduce an approach for planning from scratch, where an offline PSR model is first learned and then combined with online Monte-Carlo tree search for planning with model uncertainty. By comparing with the state-of-the-art approach of planning with model uncertainty, we demonstrated the effectiveness of the proposed approaches along with the proof of their convergence. The effectiveness and scalability of our proposed approach are also tested on the RockSample problem, which is infeasible for the state-of-the-art BA-POMDP based approaches.
Tasks
Published 2019-04-05
URL http://arxiv.org/abs/1904.03008v1
PDF http://arxiv.org/pdf/1904.03008v1.pdf
PWC https://paperswithcode.com/paper/combining-offline-models-and-online-monte
Repo
Framework

Probabilistic Atlases to Enforce Topological Constraints

Title Probabilistic Atlases to Enforce Topological Constraints
Authors Udaranga Wickramasinghe, Graham Knott, Pascal Fua
Abstract Probabilistic atlases (PAs) have long been used in standard segmentation approaches and, more recently, in conjunction with Convolutional Neural Networks (CNNs). However, their use has been restricted to relatively standardized structures such as the brain or heart which have limited or predictable range of deformations. Here we propose an encoding-decoding CNN architecture that can exploit rough atlases that encode only the topology of the target structures that can appear in any pose and have arbitrarily complex shapes to improve the segmentation results. It relies on the output of the encoder to compute both the pose parameters used to deform the atlas and the segmentation mask itself, which makes it effective and end-to-end trainable.
Tasks
Published 2019-09-18
URL https://arxiv.org/abs/1909.08330v1
PDF https://arxiv.org/pdf/1909.08330v1.pdf
PWC https://paperswithcode.com/paper/probabilistic-atlases-to-enforce-topological
Repo
Framework

Resonant Machine Learning Based on Complex Growth Transform Dynamical Systems

Title Resonant Machine Learning Based on Complex Growth Transform Dynamical Systems
Authors Oindrila Chatterjee, Shantanu Chakrabartty
Abstract Traditional energy-based learning models associate a single energy metric to each configuration of variables involved in the underlying optimization process. Such models associate the lowest energy state to the optimal configuration of variables under consideration, and are thus inherently dissipative in nature. In this paper we propose an energy-efficient learning framework that exploits structural and functional similarities between a machine learning network and a general electrical network satisfying Tellegen’s theorem. In contrast to the standard energy-based models, the proposed formulation associates two energy components, namely, active and reactive energy to the original network. This ensures that the network’s active-power is dissipated only during the process of learning, whereas the reactive-power is maintained to be zero at all times. As a result, in steady-state, the learned parameters are stored and self-sustained by electrical resonance determined by the network’s nodal inductances and capacitances. Based on this approach, this paper introduces three novel concepts: (a) A learning framework where the network’s active-power dissipation is used as a regularization for a learning objective function that is subjected to zero total reactive-power constraint; (b) A dynamical system based on complex-domain, continuous-time growth transforms which optimizes the learning objective function and drives the network towards electrical resonance under steady-state operation; and (c) An annealing procedure that controls the trade-off between active-power dissipation and the speed of convergence. As a representative example, we show how the proposed framework can be used for designing resonant support vector machines (SVMs), where we show that the support-vectors correspond to an LC network with self-sustained oscillations.
Tasks
Published 2019-08-15
URL https://arxiv.org/abs/1908.05377v2
PDF https://arxiv.org/pdf/1908.05377v2.pdf
PWC https://paperswithcode.com/paper/resonant-machine-learning-based-on-complex
Repo
Framework

Attributed Graph Clustering: A Deep Attentional Embedding Approach

Title Attributed Graph Clustering: A Deep Attentional Embedding Approach
Authors Chun Wang, Shirui Pan, Ruiqi Hu, Guodong Long, Jing Jiang, Chengqi Zhang
Abstract Graph clustering is a fundamental task which discovers communities or groups in networks. Recent studies have mostly focused on developing deep learning approaches to learn a compact graph embedding, upon which classic clustering methods like k-means or spectral clustering algorithms are applied. These two-step frameworks are difficult to manipulate and usually lead to suboptimal performance, mainly because the graph embedding is not goal-directed, i.e., designed for the specific clustering task. In this paper, we propose a goal-directed deep learning approach, Deep Attentional Embedded Graph Clustering (DAEGC for short). Our method focuses on attributed graphs to sufficiently explore the two sides of information in graphs. By employing an attention network to capture the importance of the neighboring nodes to a target node, our DAEGC algorithm encodes the topological structure and node content in a graph to a compact representation, on which an inner product decoder is trained to reconstruct the graph structure. Furthermore, soft labels from the graph embedding itself are generated to supervise a self-training graph clustering process, which iteratively refines the clustering results. The self-training process is jointly learned and optimized with the graph embedding in a unified framework, to mutually benefit both components. Experimental results compared with state-of-the-art algorithms demonstrate the superiority of our method.
Tasks Graph Clustering, Graph Embedding
Published 2019-06-15
URL https://arxiv.org/abs/1906.06532v1
PDF https://arxiv.org/pdf/1906.06532v1.pdf
PWC https://paperswithcode.com/paper/attributed-graph-clustering-a-deep
Repo
Framework
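
The inner product decoder mentioned in this abstract can be sketched as follows. The example assumes the form commonly used in graph autoencoders, where the predicted probability of an edge is the sigmoid of the inner product of the two node embeddings; the abstract does not state which exact form DAEGC uses. The embeddings are made-up values rather than learned ones, and the function names are hypothetical.

```python
import math


def edge_probability(z_i, z_j):
    """Sigmoid of the inner product of two node embeddings (assumed decoder form)."""
    score = sum(a * b for a, b in zip(z_i, z_j))
    return 1.0 / (1.0 + math.exp(-score))


def reconstruct(embeddings):
    """Dense matrix of predicted edge probabilities for every node pair."""
    return [[edge_probability(zi, zj) for zj in embeddings] for zi in embeddings]


# Hypothetical 2-D embeddings: nodes 0 and 1 point the same way, node 2 is orthogonal to node 0.
embeddings = [(2.0, 0.0), (1.5, 0.5), (0.0, 2.0)]
adjacency = reconstruct(embeddings)
```

Nodes whose embeddings align get a high predicted edge probability, so training the decoder to match the observed graph pushes connected nodes towards similar embeddings.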

Stereo-based terrain traversability analysis using normal-based segmentation and superpixel surface analysis

Title Stereo-based terrain traversability analysis using normal-based segmentation and superpixel surface analysis
Authors Aras R. Dargazany
Abstract In this paper, a stereo-based traversability analysis approach for all terrains in off-road mobile robotics, e.g. Unmanned Ground Vehicles (UGVs), is proposed. This approach reformulates the problem of terrain traversability analysis into two main problems: (1) 3D terrain reconstruction and (2) detection and analysis of all terrain surfaces. The proposed approach uses a stereo camera for perception and 3D reconstruction of the terrain. In order to detect all the existing surfaces in the 3D reconstructed terrain as superpixel surfaces (i.e. segments), an image segmentation technique is applied using geometry-based features (pixel-based surface normals). Having detected all the surfaces, the Superpixel Surface Traversability Analysis (SSTA) approach is applied to all of the detected surfaces (superpixel segments) in order to classify them based on their traversability index. The proposed SSTA approach is based on: (1) superpixel surface normal and plane estimation, and (2) traversability analysis using superpixel surface planes. Having analyzed all the superpixel surfaces based on their traversability, these surfaces are finally classified into five main categories as follows: traversable, semi-traversable, non-traversable, unknown and undecided.
Tasks 3D Reconstruction, Semantic Segmentation
Published 2019-07-16
URL https://arxiv.org/abs/1907.06823v1
PDF https://arxiv.org/pdf/1907.06823v1.pdf
PWC https://paperswithcode.com/paper/stereo-based-terrain-traversability-analysis
Repo
Framework
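
The step from a surface plane to a traversability label can be sketched roughly as follows. This is a simplified illustration, not the paper's SSTA procedure: the plane is taken from only three points instead of a whole superpixel, the z axis is assumed to point up, the angle thresholds are hypothetical, and only three of the five categories listed in the abstract are modelled.

```python
import math


def plane_normal(p0, p1, p2):
    """Normal of the plane through three 3D points (cross product of two edges)."""
    ux, uy, uz = (p1[i] - p0[i] for i in range(3))
    vx, vy, vz = (p2[i] - p0[i] for i in range(3))
    return (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)


def slope_degrees(normal):
    """Angle between the surface normal and the vertical axis (z assumed up)."""
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)  # zero for collinear points
    return math.degrees(math.acos(min(1.0, abs(nz) / length)))


def classify(normal, easy_deg=15.0, hard_deg=35.0):
    """Map slope to a label; both thresholds are hypothetical, not from the paper."""
    angle = slope_degrees(normal)
    if angle <= easy_deg:
        return "traversable"
    if angle <= hard_deg:
        return "semi-traversable"
    return "non-traversable"


flat_ground = plane_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))
wall = plane_normal((0, 0, 0), (1, 0, 0), (0, 0, 1))
labels = (classify(flat_ground), classify(wall))
```

A fuller version would fit the plane to all 3D points in a superpixel and would also need rules for the unknown and undecided categories, which the abstract names but does not define.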

Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery

Title Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery
Authors Kristian Hartikainen, Xinyang Geng, Tuomas Haarnoja, Sergey Levine
Abstract Reinforcement learning requires manual specification of a reward function to learn a task. While in principle this reward function only needs to specify the task goal, in practice reinforcement learning can be very time-consuming or even infeasible unless the reward function is shaped so as to provide a smooth gradient towards a successful outcome. This shaping is difficult to specify by hand, particularly when the task is learned from raw observations, such as images. In this paper, we study how we can automatically learn dynamical distances: a measure of the expected number of time steps to reach a given goal state from any other state. These dynamical distances can be used to provide well-shaped reward functions for reaching new goals, making it possible to learn complex tasks efficiently. We show that dynamical distances can be used in a semi-supervised regime, where unsupervised interaction with the environment is used to learn the dynamical distances, while a small amount of preference supervision is used to determine the task goal, without any manually engineered reward function or goal examples. We evaluate our method both on a real-world robot and in simulation. We show that our method can learn to turn a valve with a real-world 9-DoF hand, using raw image observations and just ten preference labels, without any other supervision. Videos of the learned skills can be found on the project website: https://sites.google.com/view/dynamical-distance-learning.
Tasks
Published 2019-07-18
URL https://arxiv.org/abs/1907.08225v4
PDF https://arxiv.org/pdf/1907.08225v4.pdf
PWC https://paperswithcode.com/paper/dynamical-distance-learning-for-unsupervised
Repo
Framework