January 25, 2020

3350 words 16 mins read

Paper Group ANR 1620

Topological based classification using graph convolutional networks. Max-Plus Matching Pursuit for Deterministic Markov Decision Processes. Large Scale Markov Decision Processes with Changing Rewards. Inverse Rendering Techniques for Physically Grounded Image Editing. DeepDrawing: A Deep Learning Approach to Graph Drawing. Differential Recurrent Neural Network and its Application for Human Activity Recognition …

Topological based classification using graph convolutional networks

Title Topological based classification using graph convolutional networks
Authors Roy Abel, Idan Benami, Yoram Louzoun
Abstract In colored graphs, node classes are often associated either with their neighbors’ classes or with per-node information not incorporated in the graph. We here propose that node classes are also associated with topological features of the nodes. We use this association to improve graph machine learning in general and Graph Convolutional Networks (GCNs) specifically. First, we show that even in the absence of any external information on nodes, good accuracy can be obtained on the prediction of the node class using either topological features or the neighbors’ classes as input to a GCN. This accuracy is slightly lower than the one obtained using a content-based GCN. Secondly, we show that explicitly adding the topology as an input to the GCN does not improve the accuracy when combined with external information on nodes. However, adding an additional adjacency matrix with edges between distant nodes with similar topology to the GCN does significantly improve its accuracy, leading to results better than all state-of-the-art methods on multiple datasets.
Tasks
Published 2019-10-26
URL https://arxiv.org/abs/1911.06892v1
PDF https://arxiv.org/pdf/1911.06892v1.pdf
PWC https://paperswithcode.com/paper/topological-based-classification-using-graph
Repo
Framework
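
To make the last point of the abstract concrete, here is a hedged PyTorch sketch of a dual-adjacency GCN layer. It is not the authors’ code: the topological features (degree and mean neighbor degree), the k-nearest-neighbor construction, and the summing of the two propagated messages are all illustrative choices.

```python
import numpy as np
import torch
import torch.nn as nn

def topology_adjacency(A: np.ndarray, k: int = 5) -> np.ndarray:
    """Link each node to the k nodes whose topological features are closest.
    Features here are degree and mean neighbor degree; the paper uses richer ones."""
    deg = A.sum(1)
    nbr_deg = (A @ deg) / np.maximum(deg, 1.0)
    feats = np.stack([deg, nbr_deg], axis=1)
    feats = (feats - feats.mean(0)) / (feats.std(0) + 1e-8)
    dist = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(dist, np.inf)
    T = np.zeros_like(A)
    rows = np.arange(len(A))[:, None]
    T[rows, np.argsort(dist, axis=1)[:, :k]] = 1.0   # k nearest topological twins
    return np.maximum(T, T.T)                        # symmetrize

def normalize(A: np.ndarray) -> torch.Tensor:
    A_hat = A + np.eye(len(A))                       # add self-loops
    d = 1.0 / np.sqrt(A_hat.sum(1))
    return torch.tensor(d[:, None] * A_hat * d[None, :], dtype=torch.float32)

class DualAdjGCNLayer(nn.Module):
    """Propagate features over both the original graph and the
    topology-similarity graph, then sum the two messages."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.w_graph = nn.Linear(in_dim, out_dim)
        self.w_topo = nn.Linear(in_dim, out_dim)

    def forward(self, x, A_norm, T_norm):
        return torch.relu(A_norm @ self.w_graph(x) + T_norm @ self.w_topo(x))
```

Stacking two such layers and training with cross-entropy on labeled nodes follows the usual GCN recipe, just with the extra propagation path.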

Max-Plus Matching Pursuit for Deterministic Markov Decision Processes

Title Max-Plus Matching Pursuit for Deterministic Markov Decision Processes
Authors Francis Bach
Abstract We consider deterministic Markov decision processes (MDPs) and apply max-plus algebra tools to approximate the value iteration algorithm by a smaller-dimensional iteration based on a representation on dictionaries of value functions. The setup naturally leads to novel theoretical results which are simply formulated due to the max-plus algebra structure. For example, when considering a fixed (non-adaptive) finite basis, the computational complexity of approximating the optimal value function is not directly related to the number of states, but to notions of covering numbers of the state space. In order to break the curse of dimensionality in factored state spaces, we consider adaptive bases that can adapt to particular problems, leading to an algorithm similar to matching pursuit from signal processing. These currently come with no theoretical guarantees but work well empirically on simple deterministic MDPs derived from low-dimensional continuous control problems. We focus primarily on deterministic MDPs but note that the framework can be applied to all MDPs by considering measure-based formulations.
Tasks Continuous Control
Published 2019-06-20
URL https://arxiv.org/abs/1906.08524v1
PDF https://arxiv.org/pdf/1906.08524v1.pdf
PWC https://paperswithcode.com/paper/max-plus-matching-pursuit-for-deterministic
Repo
Framework
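
Since the paper’s starting point is plain value iteration on a deterministic MDP, a minimal reference implementation helps fix ideas. The sketch below uses illustrative names and omits the max-plus dictionary compression; it simply applies the Bellman operator, which is max-plus linear, for a fixed number of sweeps.

```python
import numpy as np

def value_iteration(next_state, reward, gamma=0.95, sweeps=500):
    """next_state[s, a]: successor state index; reward[s, a]: immediate reward.
    Each sweep applies the max-plus-linear Bellman operator
    (T V)(s) = max_a [ r(s, a) + gamma * V(next_state[s, a]) ]."""
    n_states = reward.shape[0]
    V = np.zeros(n_states)
    for _ in range(sweeps):
        V = np.max(reward + gamma * V[next_state], axis=1)
    policy = np.argmax(reward + gamma * V[next_state], axis=1)
    return V, policy
```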

Large Scale Markov Decision Processes with Changing Rewards

Title Large Scale Markov Decision Processes with Changing Rewards
Authors Adrian Rivera Cardoso, He Wang, Huan Xu
Abstract We consider Markov Decision Processes (MDPs) where the rewards are unknown and may change in an adversarial manner. We provide an algorithm that achieves a state-of-the-art regret bound of $O(\sqrt{\tau (\ln S+\ln A)T}\ln(T))$, where $S$ is the state space, $A$ is the action space, $\tau$ is the mixing time of the MDP, and $T$ is the number of periods. The algorithm’s computational complexity is polynomial in $S$ and $A$ per period. We then consider a setting often encountered in practice, where the state space of the MDP is too large to allow for exact solutions. By approximating the state-action occupancy measures with a linear architecture of dimension $d\ll S$, we propose a modified algorithm with computational complexity polynomial in $d$. We also prove a regret bound for this modified algorithm, which, to the best of our knowledge, is the first $\tilde{O}(\sqrt{T})$ regret bound for large scale MDPs with changing rewards.
Tasks
Published 2019-05-25
URL https://arxiv.org/abs/1905.10649v1
PDF https://arxiv.org/pdf/1905.10649v1.pdf
PWC https://paperswithcode.com/paper/large-scale-markov-decision-processes-with
Repo
Framework
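
As background for the approximation step above, the occupancy-measure view the abstract relies on can be written as a per-period linear program (this is the standard textbook form, not quoted from the paper):

```latex
% Reward in period t is linear in the occupancy measure mu; the constraints
% force mu to be a stationary distribution over state-action pairs.
\begin{aligned}
\max_{\mu \ge 0}\quad & \sum_{s,a} \mu(s,a)\, r_t(s,a) \\
\text{s.t.}\quad & \sum_{a} \mu(s,a) = \sum_{s',a'} P(s \mid s',a')\, \mu(s',a') \quad \forall s,
\qquad \sum_{s,a} \mu(s,a) = 1 .
\end{aligned}
```

The large-scale variant then restricts $\mu$ to a $d$-dimensional linear family, $\mu \approx \Phi\theta$ with $d\ll S$, so each update is polynomial in $d$ rather than in $S$ and $A$.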

Inverse Rendering Techniques for Physically Grounded Image Editing

Title Inverse Rendering Techniques for Physically Grounded Image Editing
Authors Kevin Karsch
Abstract From a single picture of a scene, people can typically grasp the spatial layout immediately and even make good guesses at material properties and where light is coming from to illuminate the scene. For example, we can reliably tell which objects occlude others, what an object is made of and its rough shape, which regions are illuminated or in shadow, and so on. It is interesting how little is known about our ability to make these determinations; as such, we are still not able to robustly “teach” computers to make the same high-level observations as people. This document presents algorithms for understanding intrinsic scene properties from single images. The goal of these inverse rendering techniques is to estimate the configuration of scene elements (geometry, materials, luminaires, camera parameters, etc.) using only information visible in an image. Such algorithms have applications in robotics and computer graphics. One such application is physically grounded image editing: photo editing made easier by leveraging knowledge of the physical space. These applications allow sophisticated editing operations to be performed in a matter of seconds, enabling seamless addition, removal, or relocation of objects in images.
Tasks
Published 2019-12-25
URL https://arxiv.org/abs/2001.00986v1
PDF https://arxiv.org/pdf/2001.00986v1.pdf
PWC https://paperswithcode.com/paper/inverse-rendering-techniques-for-physically
Repo
Framework

DeepDrawing: A Deep Learning Approach to Graph Drawing

Title DeepDrawing: A Deep Learning Approach to Graph Drawing
Authors Yong Wang, Zhihua Jin, Qianwen Wang, Weiwei Cui, Tengfei Ma, Huamin Qu
Abstract Node-link diagrams are widely used to facilitate network explorations. However, when using a graph drawing technique to visualize networks, users often need to tune different algorithm-specific parameters iteratively by comparing the corresponding drawing results in order to achieve a desired visual effect. This trial-and-error process is often tedious and time-consuming, especially for non-expert users. Inspired by the powerful data modelling and prediction capabilities of deep learning techniques, we explore the possibility of applying deep learning techniques to graph drawing. Specifically, we propose using a graph-LSTM-based approach to directly map network structures to graph drawings. Given a set of layout examples as the training dataset, we train the proposed graph-LSTM-based model to capture their layout characteristics. Then, the trained model is used to generate graph drawings in a similar style for new networks. We evaluated the proposed approach on two special types of layouts (i.e., grid layouts and star layouts) and two general types of layouts (i.e., ForceAtlas2 and PivotMDS) in both qualitative and quantitative ways. The results provide support for the effectiveness of our approach. We also conducted a time cost assessment on the drawings of small graphs with 20 to 50 nodes. We further report the lessons we learned and discuss the limitations and future work.
Tasks
Published 2019-07-17
URL https://arxiv.org/abs/1907.11040v3
PDF https://arxiv.org/pdf/1907.11040v3.pdf
PWC https://paperswithcode.com/paper/deepdrawing-a-deep-learning-approach-to-graph
Repo
Framework
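
As a rough illustration of mapping graph structure to a drawing with a recurrent model, the sketch below feeds (padded) adjacency rows to a bidirectional LSTM in a fixed node order and regresses a 2-D position per node. This is a strong simplification: the paper’s graph-LSTM and its layout loss are more involved.

```python
import torch
import torch.nn as nn

class SeqLayoutModel(nn.Module):
    def __init__(self, max_nodes: int, hidden: int = 128):
        super().__init__()
        # Each time step sees one row of the (padded) adjacency matrix.
        self.lstm = nn.LSTM(max_nodes, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 2)      # (x, y) per node

    def forward(self, adj_rows):                  # adj_rows: (batch, n_nodes, max_nodes)
        h, _ = self.lstm(adj_rows)
        return self.head(h)                       # (batch, n_nodes, 2) coordinates

# Toy usage: imitate a random "example layout" for a 2-node graph.
model = SeqLayoutModel(max_nodes=50)
adj = torch.zeros(1, 50, 50)
adj[0, 0, 1] = adj[0, 1, 0] = 1.0                 # one edge
target = torch.rand(1, 50, 2)                     # layout example to learn from
loss = nn.functional.mse_loss(model(adj), target)
loss.backward()
```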

Differential Recurrent Neural Network and its Application for Human Activity Recognition

Title Differential Recurrent Neural Network and its Application for Human Activity Recognition
Authors Naifan Zhuang, Guo-Jun Qi, The Duc Kieu, Kien A. Hua
Abstract The Long Short-Term Memory (LSTM) recurrent neural network is capable of processing complex sequential information since it utilizes special gating schemes for learning representations from long input sequences. It has the potential to model any sequential time-series data, where the current hidden state has to be considered in the context of the past hidden states. This property makes LSTM an ideal choice to learn the complex dynamics present in long sequences. Unfortunately, conventional LSTMs do not consider the impact of spatio-temporal dynamics corresponding to the given salient motion patterns when they gate the information that ought to be memorized through time. To address this problem, we propose a differential gating scheme for the LSTM neural network, which emphasizes the change in information gain caused by the salient motions between successive video frames. This change in information gain is quantified by the Derivative of States (DoS), and the proposed LSTM model is thus termed the differential Recurrent Neural Network (dRNN). In addition, the original work used the hidden state at the last time-step to model the entire video sequence. Based on the energy profiling of DoS, we further propose to employ the State Energy Profile (SEP) to search for salient dRNN states and construct more informative representations. The effectiveness of the proposed model was demonstrated by automatically recognizing human actions from real-world 2D and 3D single-person action datasets. We point out that LSTM is a special form of dRNN. As a result, we have introduced a new family of LSTMs. Our study is one of the first works towards demonstrating the potential of learning complex time-series representations via high-order derivatives of states.
Tasks Activity Recognition, Human Activity Recognition, Time Series
Published 2019-05-09
URL https://arxiv.org/abs/1905.04293v1
PDF https://arxiv.org/pdf/1905.04293v1.pdf
PWC https://paperswithcode.com/paper/190504293
Repo
Framework
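
A hedged sketch of the differential gating idea: let the gates see the Derivative of States, approximated here as the difference of successive cell states. This illustrates the concept only; it is not the authors’ implementation.

```python
import torch
import torch.nn as nn

class DRNNCell(nn.Module):
    """LSTM-style cell whose gates also condition on the 1st-order DoS."""
    def __init__(self, in_dim: int, hid: int):
        super().__init__()
        self.gates = nn.Linear(in_dim + hid + hid, 3 * hid)   # input, forget, output
        self.cand = nn.Linear(in_dim + hid, hid)

    def forward(self, x, h, c, c_prev):
        dos = c - c_prev                           # change in cell state between steps
        z = torch.cat([x, h, dos], dim=-1)
        i, f, o = torch.sigmoid(self.gates(z)).chunk(3, dim=-1)
        c_new = f * c + i * torch.tanh(self.cand(torch.cat([x, h], dim=-1)))
        h_new = o * torch.tanh(c_new)
        return h_new, c_new
```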

Curiosity-Driven Experience Prioritization via Density Estimation

Title Curiosity-Driven Experience Prioritization via Density Estimation
Authors Rui Zhao, Volker Tresp
Abstract In Reinforcement Learning (RL), an agent explores the environment and collects trajectories into the memory buffer for later learning. However, the collected trajectories can easily be imbalanced with respect to the achieved goal states. The problem of learning from imbalanced data is well known in supervised learning, but has not yet been thoroughly researched in RL. To address this problem, we propose a novel Curiosity-Driven Prioritization (CDP) framework to encourage the agent to over-sample those trajectories that have rare achieved goal states. The CDP framework mimics the human learning process and focuses more on relatively uncommon events. We evaluate our methods using the robotic environment provided by OpenAI Gym. The environment contains six robot manipulation tasks. In our experiments, we combined CDP with Deep Deterministic Policy Gradient (DDPG), with and without Hindsight Experience Replay (HER). The experimental results show that CDP improves both the performance and sample efficiency of reinforcement learning agents, compared to state-of-the-art methods.
Tasks Density Estimation
Published 2019-02-20
URL http://arxiv.org/abs/1902.08039v2
PDF http://arxiv.org/pdf/1902.08039v2.pdf
PWC https://paperswithcode.com/paper/curiosity-driven-experience-prioritization
Repo
Framework
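
One plausible reading of CDP in code: fit a density model over achieved goals and over-sample trajectories whose goals are rare. The Gaussian-mixture density and the rank-based weighting below are assumptions made for this sketch.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def priority_weights(achieved_goals: np.ndarray) -> np.ndarray:
    """achieved_goals: (n_trajectories, goal_dim), final achieved goal per trajectory."""
    gm = GaussianMixture(n_components=3).fit(achieved_goals)
    log_density = gm.score_samples(achieved_goals)       # log-density per trajectory
    ranks = np.argsort(np.argsort(log_density))          # rank 0 = rarest goal
    w = (len(log_density) - ranks).astype(float)         # rarer -> larger weight
    return w / w.sum()

# Toy usage: prioritized sampling of a replay batch.
goals = np.random.randn(1000, 3)                         # stand-in achieved goals
p = priority_weights(goals)
batch_idx = np.random.choice(len(goals), size=64, p=p)
```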

Machine Learning for Scent: Learning Generalizable Perceptual Representations of Small Molecules

Title Machine Learning for Scent: Learning Generalizable Perceptual Representations of Small Molecules
Authors Benjamin Sanchez-Lengeling, Jennifer N. Wei, Brian K. Lee, Richard C. Gerkin, Alán Aspuru-Guzik, Alexander B. Wiltschko
Abstract Predicting the relationship between a molecule’s structure and its odor remains a difficult, decades-old task. This problem, termed quantitative structure-odor relationship (QSOR) modeling, is an important challenge in chemistry, impacting human nutrition, manufacture of synthetic fragrance, the environment, and sensory neuroscience. We propose the use of graph neural networks for QSOR, and show they significantly outperform prior methods on a novel data set labeled by olfactory experts. Additional analysis shows that the learned embeddings from graph neural networks capture a meaningful odor space representation of the underlying relationship between structure and odor, as demonstrated by strong performance on two challenging transfer learning tasks. Machine learning has already had a large impact on the senses of sight and sound. Based on these early results with graph neural networks for molecular properties, we hope machine learning can eventually do for olfaction what it has already done for vision and hearing.
Tasks Transfer Learning
Published 2019-10-23
URL https://arxiv.org/abs/1910.10685v2
PDF https://arxiv.org/pdf/1910.10685v2.pdf
PWC https://paperswithcode.com/paper/machine-learning-for-scent-learning
Repo
Framework
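
For readers who want the shape of a QSOR model, below is a generic message-passing network over a molecular graph (atoms as nodes, bonds as edges). It is purely illustrative; the authors’ architecture, featurization, and label set are not reproduced here.

```python
import torch
import torch.nn as nn

class MPNNLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.msg = nn.Linear(dim, dim)
        self.upd = nn.GRUCell(dim, dim)

    def forward(self, h, adj):            # h: (n_atoms, dim), adj: (n_atoms, n_atoms)
        m = adj @ self.msg(h)             # sum messages from bonded neighbours
        return self.upd(m, h)             # GRU-style node update

class OdorGNN(nn.Module):
    def __init__(self, atom_feat_dim: int, n_labels: int, dim: int = 64, n_layers: int = 3):
        super().__init__()
        self.embed = nn.Linear(atom_feat_dim, dim)
        self.layers = nn.ModuleList(MPNNLayer(dim) for _ in range(n_layers))
        self.readout = nn.Linear(dim, n_labels)   # one logit per odor descriptor

    def forward(self, atom_feats, adj):
        h = self.embed(atom_feats)
        for layer in self.layers:
            h = layer(h, adj)
        return self.readout(h.sum(0))     # sum-pool atoms into a molecule embedding
```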

A hybrid deep learning framework for integrated segmentation and registration: evaluation on longitudinal white matter tract changes

Title A hybrid deep learning framework for integrated segmentation and registration: evaluation on longitudinal white matter tract changes
Authors Bo Li, Wiro Niessen, Stefan Klein, Marius de Groot, Arfan Ikram, Meike Vernooij, Esther Bron
Abstract To accurately analyze changes of anatomical structures in longitudinal imaging studies, consistent segmentation across multiple time-points is required. Existing solutions often involve independent registration and segmentation components. Registration between time-points is used either as a prior for segmentation in a subsequent time point or to perform segmentation in a common space. In this work, we propose a novel hybrid convolutional neural network (CNN) that integrates segmentation and registration into a single procedure. We hypothesize that the joint optimization leads to increased performance on both tasks. The hybrid CNN is trained by minimizing an integrated loss function composed of four different terms, measuring segmentation accuracy, similarity between registered images, deformation field smoothness, and segmentation consistency. We applied this method to the segmentation of white matter tracts, describing functionally grouped axonal fibers, using N=8045 longitudinal brain MRI data of 3249 individuals. The proposed method was compared with two multistage pipelines using two existing segmentation methods combined with a conventional deformable registration algorithm. In addition, we assessed the added value of the joint optimization for segmentation and registration separately. The hybrid CNN yielded significantly higher accuracy, consistency and reproducibility of segmentation than the multistage pipelines, and was orders of magnitude faster. Therefore, we expect it can serve as a novel tool to support clinical and epidemiological analyses on understanding microstructural brain changes over time.
Tasks
Published 2019-08-26
URL https://arxiv.org/abs/1908.10221v1
PDF https://arxiv.org/pdf/1908.10221v1.pdf
PWC https://paperswithcode.com/paper/a-hybrid-deep-learning-framework-for
Repo
Framework
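
The four-term loss is the heart of the method, so a sketch may help. Everything below (the specific metrics, tensor layout, and weights `w`) is an assumption; the abstract only names the four terms.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(seg_pred, seg_gt, warped_img, fixed_img, flow,
                seg_warped, seg_next, w=(1.0, 1.0, 0.01, 1.0)):
    """seg_pred: (B, C, D, H, W) logits; seg_gt: (B, D, H, W) labels;
    flow: (B, 3, D, H, W) deformation field between time points."""
    l_seg = F.cross_entropy(seg_pred, seg_gt)         # segmentation accuracy
    l_sim = F.mse_loss(warped_img, fixed_img)         # similarity of registered images
    grads = torch.gradient(flow, dim=[2, 3, 4])       # spatial gradients of the field
    l_smooth = sum((g ** 2).mean() for g in grads)    # deformation-field smoothness
    l_cons = F.mse_loss(seg_warped, seg_next)         # segmentation consistency over time
    return w[0] * l_seg + w[1] * l_sim + w[2] * l_smooth + w[3] * l_cons
```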

Class Mean Vectors, Self Monitoring and Self Learning for Neural Classifiers

Title Class Mean Vectors, Self Monitoring and Self Learning for Neural Classifiers
Authors Eugene Wong
Abstract In this paper we explore the role of sample mean in building a neural network for classification. This role is surprisingly extensive and includes: direct computation of weights without training, performance monitoring for samples without known classification, and self-training for unlabeled data. Experimental computation on a CIFAR-10 data set provides promising empirical evidence on the efficacy of a simple and widely applicable approach to some difficult problems.
Tasks
Published 2019-10-22
URL https://arxiv.org/abs/1910.10122v1
PDF https://arxiv.org/pdf/1910.10122v1.pdf
PWC https://paperswithcode.com/paper/class-mean-vectors-self-monitoring-and-self
Repo
Framework
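
One simple reading of “direct computation of weights without training” is a nearest-class-mean classifier written as a linear layer. The sketch below shows that construction; it may not match the paper’s exact formulation.

```python
import numpy as np

def mean_classifier(X: np.ndarray, y: np.ndarray, n_classes: int):
    """Weights w_c = mu_c and biases b_c = -||mu_c||^2 / 2 make
    argmax_c (w_c . x + b_c) equivalent to picking the nearest class mean."""
    means = np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])
    W = means
    b = -0.5 * (means ** 2).sum(axis=1)
    return W, b

def predict(X: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    return np.argmax(X @ W.T + b, axis=1)
```

The same mean vectors can score unlabeled samples (distance to the nearest mean as a confidence), which is one route to the self-monitoring and self-training uses the abstract mentions.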

Learning Deep Conditional Target Densities for Accurate Regression

Title Learning Deep Conditional Target Densities for Accurate Regression
Authors Fredrik K. Gustafsson, Martin Danelljan, Goutam Bhat, Thomas B. Schön
Abstract While deep learning-based classification is generally addressed using standardized approaches, a wide variety of techniques are employed for regression. In computer vision, one particularly popular such technique is that of confidence-based regression, which entails predicting a confidence value for each input-target pair (x,y). While this approach has demonstrated impressive results, it requires important task-dependent design choices, and the predicted confidences lack a natural probabilistic meaning. We address these issues by proposing a general and conceptually simple regression method with a clear probabilistic interpretation. In our proposed approach, we create an energy-based model of the conditional target density p(y|x), using a deep neural network to predict the un-normalized density from (x,y). This model of p(y|x) is trained by directly minimizing the associated negative log-likelihood, approximated using Monte Carlo sampling. We perform comprehensive experiments on four computer vision regression tasks. Our approach outperforms direct regression, as well as other probabilistic and confidence-based methods. Notably, our model achieves a 1.9% AP improvement over Faster-RCNN for object detection on the COCO dataset, and sets a new state-of-the-art on visual tracking when applied for bounding box regression.
Tasks Object Detection, Visual Tracking
Published 2019-09-26
URL https://arxiv.org/abs/1909.12297v2
PDF https://arxiv.org/pdf/1909.12297v2.pdf
PWC https://paperswithcode.com/paper/dctd-deep-conditional-target-densities-for
Repo
Framework
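
The training objective lends itself to a compact sketch: model p(y|x) ∝ exp f_θ(x,y) and minimize the negative log-likelihood with the partition function estimated by importance sampling. The Gaussian proposal and sample count below are illustrative choices.

```python
import math
import torch

def ebm_nll(f, x, y, q_std=0.5, n_samples=128):
    """f(x, y) -> unnormalized log-density f_theta(x, y); x: (B, Dx), y: (B, Dy).
    Approximates E[-log p(y|x)] with importance samples from a Gaussian
    proposal centred on the target y (one common choice of proposal)."""
    y_s = y.unsqueeze(1) + q_std * torch.randn(y.size(0), n_samples, y.size(-1))
    log_q = torch.distributions.Normal(y.unsqueeze(1), q_std).log_prob(y_s).sum(-1)
    f_s = f(x.unsqueeze(1).expand(-1, n_samples, -1), y_s)    # (B, n_samples)
    log_Z = torch.logsumexp(f_s - log_q, dim=1) - math.log(n_samples)
    return (log_Z - f(x, y)).mean()
```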

Robust Semantic Segmentation in Adverse Weather Conditions by means of Sensor Data Fusion

Title Robust Semantic Segmentation in Adverse Weather Conditions by means of Sensor Data Fusion
Authors Andreas Pfeuffer, Klaus Dietmayer
Abstract A robust and reliable semantic segmentation in adverse weather conditions is very important for autonomous cars, but most state-of-the-art approaches only achieve high accuracy rates in optimal weather conditions. The reason is that they are only optimized for good weather conditions and given noise models. However, most of them fail if data with unknown disturbances occur, and their performance then decreases drastically. One possibility to still obtain reliable results is to observe the environment with different sensor types, such as camera and lidar, and to fuse the sensor data by means of neural networks, since different sensors behave differently in diverse weather conditions. Hence, the sensors can complement each other by means of an appropriate sensor data fusion. Nevertheless, fusion-based approaches are still susceptible to disturbances and fail to classify disturbed image areas correctly. This problem can be solved by means of a special training method, the so-called Robust Learning Method (RLM), by which the neural network learns to handle unknown noise. In this work, two different sensor fusion architectures for semantic segmentation are compared and evaluated on several datasets. Furthermore, it is shown that the RLM increases robustness in adverse weather conditions enormously and achieves good results even though no disturbance model has been learned by the neural network.
Tasks Semantic Segmentation, Sensor Fusion
Published 2019-05-24
URL https://arxiv.org/abs/1905.10117v1
PDF https://arxiv.org/pdf/1905.10117v1.pdf
PWC https://paperswithcode.com/paper/robust-semantic-segmentation-in-adverse
Repo
Framework
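
The abstract does not spell out the Robust Learning Method, so the following is only a guess at its spirit: during training, randomly corrupt regions of one sensor modality so the fusion network learns not to over-trust disturbed inputs. Every detail below is an assumption.

```python
import torch

def disturb_modality(camera, lidar, p_camera=0.5, patch=64):
    """Randomly zero a square patch in either the camera or the lidar tensor,
    each of shape (batch, C, H, W), as a stand-in for unknown disturbances."""
    x = camera if torch.rand(()) < p_camera else lidar
    B, _, H, W = x.shape
    for b in range(B):
        if torch.rand(()) < 0.5:                  # disturb roughly half the samples
            i = torch.randint(0, H - patch, ()).item()
            j = torch.randint(0, W - patch, ()).item()
            x[b, :, i:i + patch, j:j + patch] = 0.0
    return camera, lidar
```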

Visual Tracking by means of Deep Reinforcement Learning and an Expert Demonstrator

Title Visual Tracking by means of Deep Reinforcement Learning and an Expert Demonstrator
Authors Matteo Dunnhofer, Niki Martinel, Gian Luca Foresti, Christian Micheloni
Abstract In the last decade many different algorithms have been proposed to track a generic object in videos. Their execution on recent large-scale video datasets can produce a great variety of tracking behaviours. New trends in Reinforcement Learning have shown that demonstrations of an expert agent can be efficiently used to speed up the process of policy learning. Taking inspiration from such works and from the recent applications of Reinforcement Learning to visual tracking, we propose two novel trackers: A3CT, which exploits demonstrations of a state-of-the-art tracker to learn an effective tracking policy, and A3CTD, which takes advantage of the same expert tracker to correct its behaviour during tracking. Through an extensive experimental validation on the GOT-10k, OTB-100, LaSOT, UAV123 and VOT benchmarks, we show that the proposed trackers achieve state-of-the-art performance while running in real-time.
Tasks Visual Tracking
Published 2019-09-18
URL https://arxiv.org/abs/1909.08487v1
PDF https://arxiv.org/pdf/1909.08487v1.pdf
PWC https://paperswithcode.com/paper/visual-tracking-by-means-of-deep
Repo
Framework
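
A generic recipe for learning from an expert demonstrator, of the kind the abstract alludes to, is to mix an actor-critic objective with a behaviour-cloning term on the expert tracker’s output. The sketch below shows that mix; the actual A3CT/A3CTD objectives are not reproduced here.

```python
import torch
import torch.nn.functional as F

def demo_augmented_loss(policy_out, value_out, returns, log_prob, expert_action,
                        bc_weight=0.5):
    """policy_out/expert_action: continuous actions (e.g. box shifts), shape (B, A);
    value_out, returns, log_prob: shape (B,)."""
    advantage = (returns - value_out).detach()
    l_pg = -(log_prob * advantage).mean()             # policy-gradient term
    l_v = F.mse_loss(value_out, returns)              # critic regression
    l_bc = F.mse_loss(policy_out, expert_action)      # imitate the expert tracker
    return l_pg + 0.5 * l_v + bc_weight * l_bc
```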

Saliency based Semi-supervised Learning for Orbiting Satellite Tracking

Title Saliency based Semi-supervised Learning for Orbiting Satellite Tracking
Authors Peizhuo Li, Yunda Sun, Xue Wan
Abstract The trajectory and boundary of an orbiting satellite are fundamental information for on-orbit repairing and manipulation by space robots. This task, however, is challenging owing to the free and rapid motion of orbiting satellites, the quickly varying background, and sudden changes in illumination conditions. Traditional tracking usually provides only a single bounding box of the target object; however, visual tracking should provide more detailed information, such as a binary mask. In this paper, we propose SSLT (Saliency-based Semi-supervised Learning for Tracking), an algorithm that provides both the bounding box and the binary segmentation mask of target satellites at 12 frames per second, without requiring annotated data. Our method, SSLT, improves segmentation performance through a saliency-map-based, semi-supervised online learning approach within the initial bounding box estimated by tracking. Once a customized segmentation model has been trained, the bounding box and satellite trajectory are refined using the binary segmentation result. Experiments using real on-orbit rendezvous and docking video from NASA (National Aeronautics and Space Administration), a simulated satellite animation sequence from ESA (European Space Agency), and image sequences of a 3D-printed satellite model taken in our laboratory demonstrate the robustness, versatility and speed of our method compared to state-of-the-art tracking and segmentation methods. Our dataset will be released for academic use in the future.
Tasks Visual Tracking
Published 2019-09-09
URL https://arxiv.org/abs/1909.03656v1
PDF https://arxiv.org/pdf/1909.03656v1.pdf
PWC https://paperswithcode.com/paper/saliency-based-semi-supervised-learning-for
Repo
Framework
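
An illustrative rendering of the SSLT loop: compute a saliency map inside the tracked bounding box, threshold it into foreground/background pseudo-labels for training a segmentation model, then tighten the box around the predicted mask. The saliency measure and thresholds below are stand-ins, not the paper’s.

```python
import numpy as np

def saliency_pseudo_labels(frame: np.ndarray, box):
    """frame: (H, W) grayscale image; box: (x0, y0, x1, y1) from the tracker.
    A crude saliency stand-in: absolute deviation from the local mean."""
    x0, y0, x1, y1 = box
    crop = frame[y0:y1, x0:x1].astype(float)
    saliency = np.abs(crop - crop.mean())
    saliency /= saliency.max() + 1e-8
    fg = saliency > 0.6          # confident foreground pseudo-labels
    bg = saliency < 0.2          # confident background pseudo-labels
    return fg, bg                # used to train the segmentation model

def refine_box(mask: np.ndarray, box):
    """Tighten the tracker's box around the predicted binary mask."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return box
    return (box[0] + xs.min(), box[1] + ys.min(),
            box[0] + xs.max(), box[1] + ys.max())
```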

Hierarchical Sequence to Sequence Voice Conversion with Limited Data

Title Hierarchical Sequence to Sequence Voice Conversion with Limited Data
Authors Praveen Narayanan, Punarjay Chakravarty, Francois Charette, Gint Puskorius
Abstract We present a voice conversion solution using recurrent sequence-to-sequence modeling for DNNs. Our solution takes advantage of recent advances in attention-based modeling in the fields of Neural Machine Translation (NMT), Text-to-Speech (TTS) and Automatic Speech Recognition (ASR). The problem consists of converting between voices in a parallel setting when <source, target> audio pairs are available. Our seq2seq architecture makes use of a hierarchical encoder to summarize input audio frames. On the decoder side, we use an attention-based architecture used in recent TTS works. Since there is a dearth of the large multispeaker voice conversion databases needed for training DNNs, we resort to training the network with a large single-speaker dataset as an autoencoder. This is then adapted to the smaller multispeaker datasets available for voice conversion. In contrast with other voice conversion works that use $F_0$, duration and linguistic features, our system uses mel spectrograms as the audio representation. Output mel frames are converted back to audio using a WaveNet vocoder.
Tasks Machine Translation, Speech Recognition, Voice Conversion
Published 2019-07-15
URL https://arxiv.org/abs/1907.07769v1
PDF https://arxiv.org/pdf/1907.07769v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-sequence-to-sequence-voice
Repo
Framework
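
As a sketch of the hierarchical encoder described above: stacked bidirectional LSTMs over mel-spectrogram frames, halving the time resolution between levels so higher layers summarize longer spans. Layer sizes and the factor-of-2 pooling are illustrative choices, not the paper’s exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalEncoder(nn.Module):
    def __init__(self, n_mels: int = 80, hidden: int = 256, levels: int = 3):
        super().__init__()
        dims = [n_mels] + [2 * hidden] * (levels - 1)
        self.rnns = nn.ModuleList(
            nn.LSTM(d, hidden, batch_first=True, bidirectional=True) for d in dims)

    def forward(self, mel):                    # mel: (batch, T, n_mels)
        h = mel
        for rnn in self.rnns:
            h, _ = rnn(h)
            if h.size(1) % 2:                  # pad odd-length sequences
                h = F.pad(h, (0, 0, 0, 1))
            # Halve the time resolution so higher levels summarize longer spans.
            h = h.reshape(h.size(0), h.size(1) // 2, 2, h.size(2)).mean(2)
        return h                               # condensed frames for the attention decoder
```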