July 28, 2019

Paper Group ANR 191

Formulation of Deep Reinforcement Learning Architecture Toward Autonomous Driving for On-Ramp Merge

Title Formulation of Deep Reinforcement Learning Architecture Toward Autonomous Driving for On-Ramp Merge
Authors Pin Wang, Ching-Yao Chan
Abstract Multiple automakers have in development or in production automated driving systems (ADS) that offer freeway-pilot functions. This type of ADS is typically limited to restricted-access freeways only, that is, the transition from manual to automated modes takes place only after the ramp merging process is completed manually. One major challenge to extend the automation to ramp merging is that the automated vehicle needs to incorporate and optimize long-term objectives (e.g. successful and smooth merge) when near-term actions must be safely executed. Moreover, the merging process involves interactions with other vehicles whose behaviors are sometimes hard to predict but may influence the merging vehicle's optimal actions. To tackle such a complicated control problem, we propose to apply Deep Reinforcement Learning (DRL) techniques for finding an optimal driving policy by maximizing the long-term reward in an interactive environment. Specifically, we apply a Long Short-Term Memory (LSTM) architecture to model the interactive environment, from which an internal state containing historical driving information is conveyed to a Deep Q-Network (DQN). The DQN is used to approximate the Q-function, which takes the internal state as input and generates Q-values as output for action selection. With this DRL architecture, the historical impact of the interactive environment on the long-term reward can be captured and taken into account for deciding the optimal control policy. The proposed architecture has the potential to be extended and applied to other autonomous driving scenarios such as driving through a complex intersection or changing lanes under varying traffic flow conditions.
Tasks Autonomous Driving
Published 2017-09-07
URL http://arxiv.org/abs/1709.02066v3
PDF http://arxiv.org/pdf/1709.02066v3.pdf
PWC https://paperswithcode.com/paper/formulation-of-deep-reinforcement-learning
Repo
Framework
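
The data flow described in the abstract above (observation history, then a recurrent internal state, then Q-values, then an action) can be sketched in a few lines. This is only an illustration under loud assumptions: a plain tanh recurrent cell stands in for the LSTM, a linear layer stands in for the DQN, the weights are random and untrained, and every name and observation value below is hypothetical rather than taken from the paper.

```python
import math
import random


def rnn_step(state, obs, w_state, w_obs):
    """One step of a plain tanh recurrent cell (an assumed stand-in for the LSTM)."""
    return [
        math.tanh(
            sum(ws * s for ws, s in zip(w_state[i], state))
            + sum(wo * o for wo, o in zip(w_obs[i], obs))
        )
        for i in range(len(state))
    ]


def q_values(state, w_q):
    """Linear Q head (an assumed stand-in for the DQN): one Q-value per action."""
    return [sum(w * s for w, s in zip(row, state)) for row in w_q]


def select_action(q, epsilon, rng):
    """Epsilon-greedy action selection over the Q-values."""
    if rng.random() < epsilon:
        return rng.randrange(len(q))
    return max(range(len(q)), key=lambda a: q[a])


rng = random.Random(0)
hidden, obs_dim, n_actions = 4, 3, 3

# Random, untrained weights: this shows the wiring only, not a learned policy.
w_state = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(hidden)]
w_obs = [[rng.uniform(-0.5, 0.5) for _ in range(obs_dim)] for _ in range(hidden)]
w_q = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(n_actions)]

# Hypothetical observation history (e.g. gap, relative speed, distance to merge point).
history = [[0.1, 0.0, 1.0], [0.2, 0.1, 0.9], [0.3, 0.1, 0.8]]

state = [0.0] * hidden
for obs in history:
    state = rnn_step(state, obs, w_state, w_obs)  # internal state summarises the history

q = q_values(state, w_q)
action = select_action(q, epsilon=0.0, rng=rng)  # epsilon=0 means purely greedy
print(q, action)
```

In a trained system the Q head would be fitted to maximise long-term reward; here the point is only that the action depends on the whole history through the internal state.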

Episodic memory for continual model learning

Title Episodic memory for continual model learning
Authors David G. Nagy, Gergő Orbán
Abstract Both the human brain and artificial learning agents operating in real-world or comparably complex environments are faced with the challenge of online model selection. In principle this challenge can be overcome: hierarchical Bayesian inference provides a principled method for model selection and it converges on the same posterior for both off-line (i.e. batch) and online learning. However, maintaining a parameter posterior for each model in parallel has in general an even higher memory cost than storing the entire data set and is consequently clearly unfeasible. Alternatively, maintaining only a limited set of models in memory could limit memory requirements. However, sufficient statistics for one model will usually be insufficient for fitting a different kind of model, meaning that the agent loses information with each model change. We propose that episodic memory can circumvent the challenge of limited memory-capacity online model selection by retaining a selected subset of data points. We design a method to compute the quantities necessary for model selection even when the data is discarded and only statistics of one (or few) learnt models are available. We demonstrate on a simple model that a limited-sized episodic memory buffer, when the content is optimised to retain data with statistics not matching the current representation, can resolve the fundamental challenge of online model selection.
Tasks Bayesian Inference, Model Selection
Published 2017-12-04
URL http://arxiv.org/abs/1712.01169v1
PDF http://arxiv.org/pdf/1712.01169v1.pdf
PWC https://paperswithcode.com/paper/episodic-memory-for-continual-model-learning
Repo
Framework

Self-supervised learning: When is fusion of the primary and secondary sensor cue useful?

Title Self-supervised learning: When is fusion of the primary and secondary sensor cue useful?
Authors G. C. H. E. de Croon
Abstract Self-supervised learning (SSL) is a reliable learning mechanism in which a robot enhances its perceptual capabilities. Typically, in SSL a trusted, primary sensor cue provides supervised training data to a secondary sensor cue. In this article, a theoretical analysis is performed on the fusion of the primary and secondary cue in a minimal model of SSL. A proof is provided that determines the specific conditions under which it is favorable to perform fusion. In short, it is favorable when (i) the prior on the target value is strong or (ii) the secondary cue is sufficiently accurate. The theoretical findings are validated with computational experiments. Subsequently, a real-world case study is performed to investigate whether fusion in SSL is also beneficial when assumptions of the minimal model are not met. In particular, a flying robot learns to map pressure measurements to sonar height measurements and then fuses the two, resulting in better height estimation. Fusion is also beneficial in the opposite case, when pressure is the primary cue. The analysis and results encourage the study of SSL fusion for other robots and sensors as well.
Tasks
Published 2017-09-23
URL http://arxiv.org/abs/1709.08126v1
PDF http://arxiv.org/pdf/1709.08126v1.pdf
PWC https://paperswithcode.com/paper/self-supervised-learning-when-is-fusion-of
Repo
Framework

FairJudge: Trustworthy User Prediction in Rating Platforms

Title FairJudge: Trustworthy User Prediction in Rating Platforms
Authors Srijan Kumar, Bryan Hooi, Disha Makhija, Mohit Kumar, Christos Faloutsos, V. S. Subrahmanian
Abstract Rating platforms enable large-scale collection of user opinion about items (products, other users, etc.). However, many untrustworthy users give fraudulent ratings for excessive monetary gains. In the paper, we present FairJudge, a system to identify such fraudulent users. We propose three metrics: (i) the fairness of a user that quantifies how trustworthy the user is in rating the products, (ii) the reliability of a rating that measures how reliable the rating is, and (iii) the goodness of a product that measures the quality of the product. Intuitively, a user is fair if it provides reliable ratings that are close to the goodness of the product. We formulate a mutually recursive definition of these metrics, and further address cold start problems and incorporate behavioral properties of users and products in the formulation. We propose an iterative algorithm, FairJudge, to predict the values of the three metrics. We prove that FairJudge is guaranteed to converge in a bounded number of iterations, with linear time complexity. By conducting five different experiments on five rating platforms, we show that FairJudge significantly outperforms nine existing algorithms in predicting fair and unfair users. We reported the 100 most unfair users in the Flipkart network to their review fraud investigators, and 80 users were correctly identified (80% accuracy). The FairJudge algorithm is already being deployed at Flipkart.
Tasks
Published 2017-03-30
URL http://arxiv.org/abs/1703.10545v1
PDF http://arxiv.org/pdf/1703.10545v1.pdf
PWC https://paperswithcode.com/paper/fairjudge-trustworthy-user-prediction-in
Repo
Framework
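
The mutually recursive definition in the abstract above (fairness of users, reliability of ratings, goodness of products) lends itself to a small fixed-point iteration. The sketch below is a simplified illustration, not the paper's exact formulation: it assumes scores scaled to [-1, 1], at most one rating per user-product pair, equal weighting of the two terms in the reliability update, and it omits the cold-start and behavioral components entirely. All function and variable names are hypothetical.

```python
def fairness_scores(ratings, iterations=50):
    """Iterate simplified fairness / reliability / goodness updates.

    ratings: list of (user, product, score) with score in [-1, 1].
    """
    users = {u for u, _, _ in ratings}
    products = {p for _, p, _ in ratings}
    fairness = {u: 1.0 for u in users}
    reliability = {(u, p): 1.0 for u, p, _ in ratings}
    goodness = {p: 0.0 for p in products}

    for _ in range(iterations):
        # Goodness: reliability-weighted mean score a product receives.
        for p in products:
            received = [(u, s) for u, q, s in ratings if q == p]
            goodness[p] = sum(reliability[(u, p)] * s for u, s in received) / len(received)

        # Reliability: average of the rater's fairness and the rating's
        # agreement with the product's goodness (agreement lies in [0, 1]).
        for u, p, s in ratings:
            agreement = 1.0 - abs(s - goodness[p]) / 2.0
            reliability[(u, p)] = 0.5 * (fairness[u] + agreement)

        # Fairness: mean reliability of the ratings a user gives.
        for u in users:
            given = [(v, p) for v, p, _ in ratings if v == u]
            fairness[u] = sum(reliability[k] for k in given) / len(given)

    return fairness, reliability, goodness


# Toy data: users a, b, c agree with each other; user x rates the opposite way.
ratings = [
    ("a", "p1", 1.0), ("b", "p1", 1.0), ("c", "p1", 1.0), ("x", "p1", -1.0),
    ("a", "p2", -1.0), ("b", "p2", -1.0), ("c", "p2", -1.0), ("x", "p2", 1.0),
]
fairness, reliability, goodness = fairness_scores(ratings)
print(sorted(fairness.items()))
```

On this toy data the dissenting user ends up with a lower fairness score than the three users who agree, which is the qualitative behavior the abstract describes.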

Review on Parameter Estimation in HMRF

Title Review on Parameter Estimation in HMRF
Authors Namjoon Suh
Abstract This technical report explores methods for estimating hyper-parameters in Markov Random Fields and Gaussian Hidden Markov Random Fields. In the first section, we briefly review the theoretical framework of the Metropolis-Hastings (MH) algorithm. Next, using the MH algorithm, we simulate data from the Ising model and study how hyper-parameters of the Ising model can be estimated through an MCMC algorithm using a pseudo-likelihood approximation. The following section addresses parameter estimation for the Gaussian Hidden Markov Random Field using MAP estimation and the EM algorithm, and discusses problems found through several experiments. In the final section, we extend this idea to estimating parameters in the Gaussian Hidden Markov Spatial-Temporal Random Field and present results from two experiments.
Tasks
Published 2017-11-20
URL http://arxiv.org/abs/1711.07561v1
PDF http://arxiv.org/pdf/1711.07561v1.pdf
PWC https://paperswithcode.com/paper/review-on-parameter-estimation-in-hmrf
Repo
Framework
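
The two ingredients named in the abstract above, simulating the Ising model with Metropolis-Hastings and scoring a parameter with the pseudo-likelihood, can be sketched as follows. This is a minimal illustration under assumptions not stated in the report: energy E = -sum of s_a * s_b over nearest-neighbour pairs, no external field, periodic boundaries, and a plain grid search in place of the report's MCMC-based estimation. Function names and parameter values are hypothetical.

```python
import math
import random


def neighbour_sum(grid, i, j):
    """Sum of the four nearest-neighbour spins with periodic boundaries."""
    size = len(grid)
    return (
        grid[(i - 1) % size][j] + grid[(i + 1) % size][j]
        + grid[i][(j - 1) % size] + grid[i][(j + 1) % size]
    )


def ising_metropolis(size, beta, sweeps, seed=0):
    """Single-site Metropolis-Hastings sampler for the Ising model."""
    rng = random.Random(seed)
    grid = [[rng.choice((-1, 1)) for _ in range(size)] for _ in range(size)]
    for _ in range(sweeps * size * size):
        i, j = rng.randrange(size), rng.randrange(size)
        # Energy change if spin (i, j) is flipped.
        delta_e = 2.0 * grid[i][j] * neighbour_sum(grid, i, j)
        # Accept with probability min(1, exp(-beta * delta_e)).
        if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
            grid[i][j] = -grid[i][j]
    return grid


def log_pseudo_likelihood(grid, beta):
    """Sum over sites of log p(s_ij | neighbours) under the Ising model."""
    total = 0.0
    for i in range(len(grid)):
        for j in range(len(grid)):
            n = neighbour_sum(grid, i, j)
            total += beta * grid[i][j] * n - math.log(2.0 * math.cosh(beta * n))
    return total


def estimate_beta(grid, candidates):
    """Pick the candidate beta with the highest log pseudo-likelihood."""
    return max(candidates, key=lambda b: log_pseudo_likelihood(grid, b))


sample = ising_metropolis(size=12, beta=0.3, sweeps=200, seed=1)
beta_hat = estimate_beta(sample, [k / 20 for k in range(21)])
print(beta_hat)
```

The pseudo-likelihood replaces the intractable joint likelihood with a product of per-site conditionals, each of which needs only the four neighbouring spins, so no partition function has to be computed.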

One-Shot Visual Imitation Learning via Meta-Learning

Title One-Shot Visual Imitation Learning via Meta-Learning
Authors Chelsea Finn, Tianhe Yu, Tianhao Zhang, Pieter Abbeel, Sergey Levine
Abstract In order for a robot to be a generalist that can perform a wide range of jobs, it must be able to acquire a wide variety of skills quickly and efficiently in complex unstructured environments. High-capacity models such as deep neural networks can enable a robot to represent complex skills, but learning each skill from scratch then becomes infeasible. In this work, we present a meta-imitation learning method that enables a robot to learn how to learn more efficiently, allowing it to acquire new skills from just a single demonstration. Unlike prior methods for one-shot imitation, our method can scale to raw pixel inputs and requires data from significantly fewer prior tasks for effective learning of new skills. Our experiments on both simulated and real robot platforms demonstrate the ability to learn new tasks, end-to-end, from a single visual demonstration.
Tasks Imitation Learning, Meta-Learning
Published 2017-09-14
URL http://arxiv.org/abs/1709.04905v1
PDF http://arxiv.org/pdf/1709.04905v1.pdf
PWC https://paperswithcode.com/paper/one-shot-visual-imitation-learning-via-meta
Repo
Framework

Reinforcement Learning-based Thermal Comfort Control for Vehicle Cabins

Title Reinforcement Learning-based Thermal Comfort Control for Vehicle Cabins
Authors James Brusey, Diana Hintea, Elena Gaura, Neil Beloe
Abstract Vehicle climate control systems aim to keep passengers thermally comfortable. However, current systems control temperature rather than thermal comfort and tend to be energy hungry, which is of particular concern when considering electric vehicles. This paper poses energy-efficient vehicle comfort control as a Markov Decision Process, which is then solved numerically using Sarsa(λ) and an empirically validated, single-zone, 1D thermal model of the cabin. The resulting controller was tested in simulation using 200 randomly selected scenarios and found to exceed the performance of bang-bang, proportional, simple fuzzy logic, and commercial controllers by 23%, 43%, 40%, and 56%, respectively. Compared to the next best performing controller, energy consumption is reduced by 13% while the proportion of time spent thermally comfortable is increased by 23%. These results indicate that this is a viable approach that promises to translate into substantial comfort and energy improvements in the car.
Tasks
Published 2017-04-25
URL http://arxiv.org/abs/1704.07899v2
PDF http://arxiv.org/pdf/1704.07899v2.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-based-thermal-comfort
Repo
Framework
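
For readers unfamiliar with Sarsa(λ), the algorithm named in the abstract above, here is a minimal tabular version with accumulating eligibility traces. It is a sketch only: the environment is a hypothetical five-state chain with a reward of 1 at the right end, not the paper's cabin thermal model, and the hyper-parameter values are arbitrary assumptions.

```python
import random


def sarsa_lambda(n_states=5, episodes=200, alpha=0.1, gamma=0.9,
                 lam=0.8, epsilon=0.1, seed=0):
    """Tabular Sarsa(lambda) on a toy chain; actions are 0 = left, 1 = right."""
    rng = random.Random(seed)
    terminal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    def policy(s):
        # Epsilon-greedy, with ties broken at random.
        if rng.random() < epsilon or q[s][0] == q[s][1]:
            return rng.randrange(2)
        return 0 if q[s][0] > q[s][1] else 1

    for _ in range(episodes):
        trace = [[0.0, 0.0] for _ in range(n_states)]  # eligibility traces
        s, a = 0, policy(0)
        while s != terminal:
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == terminal else 0.0
            a2 = policy(s2)
            # On-policy TD target uses the action actually chosen next.
            target = r if s2 == terminal else r + gamma * q[s2][a2]
            delta = target - q[s][a]
            trace[s][a] += 1.0  # accumulating trace
            for i in range(n_states):
                for b in range(2):
                    q[i][b] += alpha * delta * trace[i][b]
                    trace[i][b] *= gamma * lam  # decay all traces
            s, a = s2, a2
    return q


q_table = sarsa_lambda()
for state, values in enumerate(q_table):
    print(state, values)
```

The eligibility traces let a single reward update every recently visited state-action pair at once, which is what distinguishes Sarsa(λ) from one-step Sarsa.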

Theoretical Foundation of Co-Training and Disagreement-Based Algorithms

Title Theoretical Foundation of Co-Training and Disagreement-Based Algorithms
Authors Wei Wang, Zhi-Hua Zhou
Abstract Disagreement-based approaches generate multiple classifiers and exploit the disagreement among them with unlabeled data to improve learning performance. Co-training is a representative paradigm, which trains two classifiers separately on two sufficient and redundant views; for applications where there is only one view, several successful variants of co-training that use two different classifiers on single-view data instead of two views have been proposed. For these disagreement-based approaches, several important issues remain unsolved. In this article we present theoretical analyses to address these issues, which provide a theoretical foundation for co-training and disagreement-based approaches.
Tasks
Published 2017-08-15
URL http://arxiv.org/abs/1708.04403v1
PDF http://arxiv.org/pdf/1708.04403v1.pdf
PWC https://paperswithcode.com/paper/theoretical-foundation-of-co-training-and
Repo
Framework

Multi-Period Flexibility Forecast for Low Voltage Prosumers

Title Multi-Period Flexibility Forecast for Low Voltage Prosumers
Authors Rui Pinto, Ricardo Bessa, Manuel Matos
Abstract The operation of near-future electric distribution grids will have to rely on demand-side flexibility, both by implementing demand response strategies and by taking advantage of the intelligent management of increasingly common small-scale energy storage. The home energy management system (HEMS), installed at low voltage residential clients, will play a crucial role in providing flexibility to both system operators and market players such as aggregators. Modeling and forecasting multi-period flexibility from residential prosumers, such as battery storage and electric water heaters, while complying with internal constraints (comfort levels, data privacy) and uncertainty is a complex task. This paper describes a computational method that can efficiently learn and define the feasibility flexibility space from controllable resources connected to a HEMS. An Evolutionary Particle Swarm Optimization (EPSO) algorithm is adopted and reshaped to derive a set of feasible temporal trajectories for the residential net-load, considering storage, flexible appliances, and predefined customer preferences, as well as load and photovoltaic (PV) forecast uncertainty. A support vector data description (SVDD) algorithm is used to build models capable of classifying feasible and non-feasible HEMS operating trajectories upon request from an optimization/control algorithm operated by a DSO or market player.
Tasks
Published 2017-03-26
URL http://arxiv.org/abs/1703.08825v4
PDF http://arxiv.org/pdf/1703.08825v4.pdf
PWC https://paperswithcode.com/paper/multi-period-flexibility-forecast-for-low
Repo
Framework

Finding Robust Solutions to Stable Marriage

Title Finding Robust Solutions to Stable Marriage
Authors Begum Genc, Mohamed Siala, Barry O’Sullivan, Gilles Simonin
Abstract We study the notion of robustness in stable matching problems. We first define robustness by introducing $(a,b)$-supermatches. An $(a,b)$-supermatch is a stable matching in which if $a$ pairs break up it is possible to find another stable matching by changing the partners of those $a$ pairs and at most $b$ other pairs. In this context, we define the most robust stable matching as a $(1,b)$-supermatch where $b$ is minimum. We show that checking whether a given stable matching is a $(1,b)$-supermatch can be done in polynomial time. Next, we use this procedure to design a constraint programming model, a local search approach, and a genetic algorithm to find the most robust stable matching. Our empirical evaluation on large instances shows that local search outperforms the other approaches.
Tasks
Published 2017-05-24
URL http://arxiv.org/abs/1705.09218v3
PDF http://arxiv.org/pdf/1705.09218v3.pdf
PWC https://paperswithcode.com/paper/finding-robust-solutions-to-stable-marriage
Repo
Framework
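
The $(1,b)$-supermatch definition in the abstract above can be made concrete on a tiny instance. The sketch below enumerates every stable matching by brute force and computes the smallest $b$ for a given matching. It is exponential in the number of agents and is meant only to illustrate the definition; it is not the polynomial-time check the paper describes. The preference lists and function names are hypothetical.

```python
from itertools import permutations


def is_stable(match, men_pref, women_pref):
    """match[m] = w. Stable if no man and woman prefer each other to their partners."""
    husband = {w: m for m, w in enumerate(match)}
    for m in range(len(match)):
        for w in men_pref[m]:
            if w == match[m]:
                break  # every remaining woman ranks below m's current partner
            # m prefers w to his partner; check whether w prefers m to her partner.
            if women_pref[w].index(m) < women_pref[w].index(husband[w]):
                return False
    return True


def stable_matchings(men_pref, women_pref):
    """All stable matchings, found by checking every permutation."""
    n = len(men_pref)
    return [p for p in permutations(range(n)) if is_stable(p, men_pref, women_pref)]


def robustness_b(match, all_stable):
    """Smallest b for which `match` is a (1,b)-supermatch, or None if some
    pair appears in every stable matching and so cannot be repaired."""
    worst = 0
    for m, w in enumerate(match):
        # Cost of each repair: number of OTHER pairs that change.
        costs = [
            sum(1 for x in range(len(match)) if other[x] != match[x]) - 1
            for other in all_stable
            if other[m] != w
        ]
        if not costs:
            return None
        worst = max(worst, min(costs))
    return worst


# Cyclic preferences: every man's first choice is a different woman, and the
# women's preferences run in the opposite direction.
men_pref = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
women_pref = [[1, 2, 0], [2, 0, 1], [0, 1, 2]]

all_stable = stable_matchings(men_pref, women_pref)
for matching in all_stable:
    print(matching, robustness_b(matching, all_stable))
```

In this instance the three stable matchings share no pair, so breaking any single pair forces both other pairs to change, which gives $b = 2$ for each of them.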

Mimicking Ensemble Learning with Deep Branched Networks

Title Mimicking Ensemble Learning with Deep Branched Networks
Authors Byungju Kim, Youngsoo Kim, Yeakang Lee, Junmo Kim
Abstract This paper proposes a branched residual network for image classification. It is known that the high-level features of a deep neural network are more representative than its lower-level features. By sharing the low-level features, the network can allocate more memory to high-level features. The upper layers of our proposed network are branched, so that it mimics ensemble learning. By mimicking ensemble learning with a single network, we have achieved better performance on the ImageNet classification task.
Tasks Image Classification
Published 2017-02-21
URL http://arxiv.org/abs/1702.06376v1
PDF http://arxiv.org/pdf/1702.06376v1.pdf
PWC https://paperswithcode.com/paper/mimicking-ensemble-learning-with-deep
Repo
Framework

HashGAN: Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval

Title HashGAN: Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval
Authors Xi Zhang, Siyu Zhou, Jiashi Feng, Hanjiang Lai, Bo Li, Yan Pan, Jian Yin, Shuicheng Yan
Abstract With the rapid growth of multi-modal data, hashing methods for cross-modal retrieval have received considerable attention. Deep-networks-based cross-modal hashing methods are appealing as they can integrate feature learning and hash coding into end-to-end trainable frameworks. However, it is still challenging to find content similarities between different modalities of data due to the heterogeneity gap. To further address this problem, we propose an adversarial hashing network with an attention mechanism to enhance the measurement of content similarities by selectively focusing on informative parts of multi-modal data. The proposed new adversarial network, HashGAN, consists of three building blocks: 1) the feature learning module to obtain feature representations, 2) the generative attention module to generate an attention mask, which is used to obtain the attended (foreground) and the unattended (background) feature representations, 3) the discriminative hash coding module to learn hash functions that preserve the similarities between different modalities. In our framework, the generative module and the discriminative module are trained in an adversarial way: the generator is trained so that the discriminator cannot preserve the similarities of multi-modal data w.r.t. the background feature representations, while the discriminator aims to preserve the similarities of multi-modal data w.r.t. both the foreground and the background feature representations. Extensive evaluations on several benchmark datasets demonstrate that the proposed HashGAN brings substantial improvements over other state-of-the-art cross-modal hashing methods.
Tasks Cross-Modal Retrieval
Published 2017-11-26
URL http://arxiv.org/abs/1711.09347v1
PDF http://arxiv.org/pdf/1711.09347v1.pdf
PWC https://paperswithcode.com/paper/hashganattention-aware-deep-adversarial
Repo
Framework

Learning Independent Features with Adversarial Nets for Non-linear ICA

Title Learning Independent Features with Adversarial Nets for Non-linear ICA
Authors Philemon Brakel, Yoshua Bengio
Abstract Reliable measures of statistical dependence could be useful tools for learning independent features and performing tasks like source separation using Independent Component Analysis (ICA). Unfortunately, many such measures, like the mutual information, are hard to estimate and optimize directly. We propose to learn independent features with adversarial objectives which optimize such measures implicitly. These objectives compare samples from the joint distribution and the product of the marginals without the need to compute any probability densities. We also propose two methods for obtaining samples from the product of the marginals using either a simple resampling trick or a separate parametric distribution. Our experiments show that this strategy can easily be applied to different types of model architectures and solve both linear and non-linear ICA problems.
Tasks
Published 2017-10-13
URL http://arxiv.org/abs/1710.05050v1
PDF http://arxiv.org/pdf/1710.05050v1.pdf
PWC https://paperswithcode.com/paper/learning-independent-features-with
Repo
Framework
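
The abstract above does not spell out the resampling trick, so the following is one plausible reading rather than the authors' procedure: given a batch of feature vectors drawn from the joint distribution, shuffle each feature dimension independently across the batch. Each column keeps its marginal distribution, while dependence between columns is broken, giving approximate samples from the product of the marginals. The function name and the toy batch are hypothetical.

```python
import random


def resample_marginals(batch, rng):
    """Shuffle each feature column independently across the batch.

    batch: list of equal-length feature vectors (samples from the joint).
    Returns a new batch whose columns have the same values as the input
    columns, but paired up at random across dimensions.
    """
    n, dims = len(batch), len(batch[0])
    columns = []
    for d in range(dims):
        column = [row[d] for row in batch]
        rng.shuffle(column)  # independent permutation per dimension
        columns.append(column)
    return [[columns[d][i] for d in range(dims)] for i in range(n)]


rng = random.Random(0)

# Perfectly dependent toy features: the second coordinate equals the first.
joint_batch = [[float(i), float(i)] for i in range(8)]
product_batch = resample_marginals(joint_batch, rng)

for joint_row, product_row in zip(joint_batch, product_batch):
    print(joint_row, product_row)
```

A discriminator trained to tell `joint_batch` from `product_batch` would then provide the adversarial signal the abstract refers to; that training loop is outside the scope of this sketch.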

Progressive and Multi-Path Holistically Nested Neural Networks for Pathological Lung Segmentation from CT Images

Title Progressive and Multi-Path Holistically Nested Neural Networks for Pathological Lung Segmentation from CT Images
Authors Adam P. Harrison, Ziyue Xu, Kevin George, Le Lu, Ronald M. Summers, Daniel J. Mollura
Abstract Pathological lung segmentation (PLS) is an important, yet challenging, medical image application due to the wide variability of pathological lung appearance and shape. Because PLS is often a pre-requisite for other imaging analytics, methodological simplicity and generality are key factors in usability. Along those lines, we present a bottom-up deep-learning based approach that is expressive enough to handle variations in appearance, while remaining unaffected by any variations in shape. We incorporate the deeply supervised learning framework, but enhance it with a simple, yet effective, progressive multi-path scheme, which more reliably merges outputs from different network stages. The result is a deep model able to produce finer detailed masks, which we call progressive holistically-nested networks (P-HNNs). Using extensive cross-validation, our method is tested on multi-institutional datasets comprising 929 CT scans (848 publicly available) of pathological lungs, reporting mean Dice scores of 0.985 and demonstrating significant qualitative and quantitative improvements over state-of-the-art approaches.
Tasks
Published 2017-06-12
URL http://arxiv.org/abs/1706.03702v1
PDF http://arxiv.org/pdf/1706.03702v1.pdf
PWC https://paperswithcode.com/paper/progressive-and-multi-path-holistically
Repo
Framework
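
The Dice score reported in the abstract above is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between a predicted mask A and a reference mask B. A minimal sketch for binary masks given as flat 0/1 sequences follows; returning 1.0 when both masks are empty is an assumed convention, not something the abstract specifies.

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 0, 0]
print(dice_score(pred_mask, true_mask))  # 2 * 1 / (2 + 1)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no foreground pixel.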

Transferring a Semantic Representation for Person Re-Identification and Search

Title Transferring a Semantic Representation for Person Re-Identification and Search
Authors Zhiyuan Shi, Timothy M. Hospedales, Tao Xiang
Abstract Learning semantic attributes for person re-identification and description-based person search has gained increasing interest due to attributes’ great potential as a pose and view-invariant representation. However, existing attribute-centric approaches have thus far underperformed state-of-the-art conventional approaches. This is due to their non-scalable need for extensive domain (camera) specific annotation. In this paper we present a new semantic attribute learning approach for person re-identification and search. Our model is trained on existing fashion photography datasets – either weakly or strongly labelled. It can then be transferred and adapted to provide a powerful semantic description of surveillance person detections, without requiring any surveillance domain supervision. The resulting representation is useful for both unsupervised and supervised person re-identification, achieving state-of-the-art and near state-of-the-art performance respectively. Furthermore, as a semantic representation it allows description-based person search to be integrated within the same framework.
Tasks Person Re-Identification, Person Search
Published 2017-06-12
URL http://arxiv.org/abs/1706.03725v1
PDF http://arxiv.org/pdf/1706.03725v1.pdf
PWC https://paperswithcode.com/paper/transferring-a-semantic-representation-for
Repo
Framework