Paper Group ANR 191
Formulation of Deep Reinforcement Learning Architecture Toward Autonomous Driving for On-Ramp Merge
Title | Formulation of Deep Reinforcement Learning Architecture Toward Autonomous Driving for On-Ramp Merge |
Authors | Pin Wang, Ching-Yao Chan |
Abstract | Multiple automakers have in development or in production automated driving systems (ADS) that offer freeway-pilot functions. This type of ADS is typically limited to restricted-access freeways only; that is, the transition from manual to automated modes takes place only after the ramp merging process is completed manually. One major challenge in extending the automation to ramp merging is that the automated vehicle needs to incorporate and optimize long-term objectives (e.g. a successful and smooth merge) while near-term actions must be safely executed. Moreover, the merging process involves interactions with other vehicles whose behaviors are sometimes hard to predict but may influence the merging vehicle's optimal actions. To tackle such a complicated control problem, we propose to apply Deep Reinforcement Learning (DRL) techniques for finding an optimal driving policy by maximizing the long-term reward in an interactive environment. Specifically, we apply a Long Short-Term Memory (LSTM) architecture to model the interactive environment, from which an internal state containing historical driving information is conveyed to a Deep Q-Network (DQN). The DQN is used to approximate the Q-function, which takes the internal state as input and generates Q-values as output for action selection. With this DRL architecture, the historical impact of the interactive environment on the long-term reward can be captured and taken into account when deciding the optimal control policy. The proposed architecture has the potential to be extended and applied to other autonomous driving scenarios such as driving through a complex intersection or changing lanes under varying traffic flow conditions. |
Tasks | Autonomous Driving |
Published | 2017-09-07 |
URL | http://arxiv.org/abs/1709.02066v3 |
http://arxiv.org/pdf/1709.02066v3.pdf | |
PWC | https://paperswithcode.com/paper/formulation-of-deep-reinforcement-learning |
Repo | |
Framework | |
Episodic memory for continual model learning
Title | Episodic memory for continual model learning |
Authors | David G. Nagy, Gergő Orbán |
Abstract | Both the human brain and artificial learning agents operating in real-world or comparably complex environments are faced with the challenge of online model selection. In principle this challenge can be overcome: hierarchical Bayesian inference provides a principled method for model selection and it converges on the same posterior for both off-line (i.e. batch) and online learning. However, maintaining a parameter posterior for each model in parallel has in general an even higher memory cost than storing the entire data set and is consequently clearly unfeasible. Alternatively, maintaining only a limited set of models in memory could limit memory requirements. However, sufficient statistics for one model will usually be insufficient for fitting a different kind of model, meaning that the agent loses information with each model change. We propose that episodic memory can circumvent the challenge of limited memory-capacity online model selection by retaining a selected subset of data points. We design a method to compute the quantities necessary for model selection even when the data is discarded and only statistics of one (or few) learnt models are available. We demonstrate on a simple model that a limited-sized episodic memory buffer, when the content is optimised to retain data with statistics not matching the current representation, can resolve the fundamental challenge of online model selection. |
Tasks | Bayesian Inference, Model Selection |
Published | 2017-12-04 |
URL | http://arxiv.org/abs/1712.01169v1 |
http://arxiv.org/pdf/1712.01169v1.pdf | |
PWC | https://paperswithcode.com/paper/episodic-memory-for-continual-model-learning |
Repo | |
Framework | |
Self-supervised learning: When is fusion of the primary and secondary sensor cue useful?
Title | Self-supervised learning: When is fusion of the primary and secondary sensor cue useful? |
Authors | G. C. H. E. de Croon |
Abstract | Self-supervised learning (SSL) is a reliable learning mechanism in which a robot enhances its perceptual capabilities. Typically, in SSL a trusted, primary sensor cue provides supervised training data to a secondary sensor cue. In this article, a theoretical analysis is performed on the fusion of the primary and secondary cue in a minimal model of SSL. A proof is provided that determines the specific conditions under which it is favorable to perform fusion. In short, it is favorable when (i) the prior on the target value is strong or (ii) the secondary cue is sufficiently accurate. The theoretical findings are validated with computational experiments. Subsequently, a real-world case study is performed to investigate whether fusion in SSL is also beneficial when the assumptions of the minimal model are not met. In particular, a flying robot learns to map pressure measurements to sonar height measurements and then fuses the two, resulting in better height estimation. Fusion is also beneficial in the opposite case, when pressure is the primary cue. The analysis and results encourage the study of SSL fusion for other robots and sensors as well. |
Tasks | |
Published | 2017-09-23 |
URL | http://arxiv.org/abs/1709.08126v1 |
http://arxiv.org/pdf/1709.08126v1.pdf | |
PWC | https://paperswithcode.com/paper/self-supervised-learning-when-is-fusion-of |
Repo | |
Framework | |
FairJudge: Trustworthy User Prediction in Rating Platforms
Title | FairJudge: Trustworthy User Prediction in Rating Platforms |
Authors | Srijan Kumar, Bryan Hooi, Disha Makhija, Mohit Kumar, Christos Faloutsos, V. S. Subrahmanian |
Abstract | Rating platforms enable large-scale collection of user opinion about items (products, other users, etc.). However, many untrustworthy users give fraudulent ratings for excessive monetary gains. In the paper, we present FairJudge, a system to identify such fraudulent users. We propose three metrics: (i) the fairness of a user that quantifies how trustworthy the user is in rating the products, (ii) the reliability of a rating that measures how reliable the rating is, and (iii) the goodness of a product that measures the quality of the product. Intuitively, a user is fair if it provides reliable ratings that are close to the goodness of the product. We formulate a mutually recursive definition of these metrics, and further address cold start problems and incorporate behavioral properties of users and products in the formulation. We propose an iterative algorithm, FairJudge, to predict the values of the three metrics. We prove that FairJudge is guaranteed to converge in a bounded number of iterations, with linear time complexity. By conducting five different experiments on five rating platforms, we show that FairJudge significantly outperforms nine existing algorithms in predicting fair and unfair users. We reported the 100 most unfair users in the Flipkart network to their review fraud investigators, and 80 users were correctly identified (80% accuracy). The FairJudge algorithm is already being deployed at Flipkart. |
Tasks | |
Published | 2017-03-30 |
URL | http://arxiv.org/abs/1703.10545v1 |
http://arxiv.org/pdf/1703.10545v1.pdf | |
PWC | https://paperswithcode.com/paper/fairjudge-trustworthy-user-prediction-in |
Repo | |
Framework | |
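The mutually recursive definition of fairness, reliability, and goodness described in the FairJudge abstract can be illustrated with a small fixed-point iteration. The sketch below is only one plausible simplified form: the update rules, the assumption that scores lie in [-1, 1], and the name `fairjudge_sketch` are illustrative choices that do not come from the abstract, and the cold-start and behavioral terms it mentions are omitted.

```python
def fairjudge_sketch(ratings, iterations=100, tol=1e-9):
    """Simplified FairJudge-style iteration (illustrative, not the authors' code).

    ratings maps (user, product) -> score, with scores assumed to lie in [-1, 1].
    """
    users = {u for u, _ in ratings}
    products = {p for _, p in ratings}
    fairness = {u: 1.0 for u in users}
    reliability = {edge: 1.0 for edge in ratings}
    goodness = {p: 1.0 for p in products}

    for _ in range(iterations):
        # Goodness of a product: reliability-weighted mean of its scores.
        for p in products:
            edges = [e for e in ratings if e[1] == p]
            goodness[p] = sum(reliability[e] * ratings[e] for e in edges) / len(edges)

        # Reliability of a rating: average of the rater's fairness and the
        # rating's agreement with the product's goodness (both in [0, 1]).
        for (u, p), score in ratings.items():
            agreement = 1.0 - abs(score - goodness[p]) / 2.0
            reliability[(u, p)] = 0.5 * (fairness[u] + agreement)

        # Fairness of a user: mean reliability of the ratings they gave.
        change = 0.0
        for u in users:
            given = [reliability[e] for e in ratings if e[0] == u]
            updated = sum(given) / len(given)
            change = max(change, abs(updated - fairness[u]))
            fairness[u] = updated

        if change < tol:
            break
    return fairness, reliability, goodness


# Hypothetical toy data: three users praise a product, one user pans it.
toy_ratings = {
    ("alice", "gadget"): 1.0,
    ("bob", "gadget"): 1.0,
    ("carol", "gadget"): 1.0,
    ("mallory", "gadget"): -1.0,
}
fairness, reliability, goodness = fairjudge_sketch(toy_ratings)
```

On this toy input the dissenting rater ends up with a lower fairness score than the others, which is the qualitative behavior the abstract describes: a user is fair when their reliable ratings sit close to the product's goodness.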
Review on Parameter Estimation in HMRF
Title | Review on Parameter Estimation in HMRF |
Authors | Namjoon Suh |
Abstract | This technical report explores methods for estimating hyper-parameters in Markov Random Fields and Gaussian Hidden Markov Random Fields. In the first section, we briefly review the theoretical framework of the Metropolis-Hastings (MH) algorithm. Next, using the MH algorithm, we simulate data from the Ising model and study how hyper-parameters of the Ising model can be estimated through an MCMC algorithm using the pseudo-likelihood approximation. The following section deals with parameter estimation for the Gaussian Hidden Markov Random Field using MAP estimation and the EM algorithm, and also discusses problems found through several experiments. In the subsequent section, we extend this idea to estimating parameters in the Gaussian Hidden Markov Spatial-Temporal Random Field and present the results of two experiments. |
Tasks | |
Published | 2017-11-20 |
URL | http://arxiv.org/abs/1711.07561v1 |
http://arxiv.org/pdf/1711.07561v1.pdf | |
PWC | https://paperswithcode.com/paper/review-on-parameter-estimation-in-hmrf |
Repo | |
Framework | |
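As an illustration of the simulation step mentioned in the HMRF report's abstract, the following is a minimal sketch of a single-site Metropolis-Hastings sampler for a 2D Ising model. The lattice size, the periodic boundary, the inverse temperature `beta`, and the function name are assumptions made for this example; the report's own settings are not given in the abstract.

```python
import math
import random


def metropolis_ising(size=8, beta=0.4, sweeps=200, seed=0):
    """Sample a size x size Ising configuration with single-spin-flip proposals.

    Assumed energy: E = -sum over neighbouring pairs of s_i * s_j,
    on a lattice with periodic (wrap-around) boundaries.
    """
    rng = random.Random(seed)
    spins = [[rng.choice((-1, 1)) for _ in range(size)] for _ in range(size)]

    for _ in range(sweeps * size * size):
        i, j = rng.randrange(size), rng.randrange(size)
        neighbours = (
            spins[(i - 1) % size][j] + spins[(i + 1) % size][j]
            + spins[i][(j - 1) % size] + spins[i][(j + 1) % size]
        )
        # Energy change if spin (i, j) is flipped.
        delta_energy = 2 * spins[i][j] * neighbours
        # The proposal is symmetric, so the MH acceptance probability
        # reduces to min(1, exp(-beta * delta_energy)).
        if delta_energy <= 0 or rng.random() < math.exp(-beta * delta_energy):
            spins[i][j] = -spins[i][j]
    return spins


sample = metropolis_ising()
magnetisation = sum(map(sum, sample)) / (len(sample) ** 2)
```

A configuration drawn this way could then serve as synthetic data for a pseudo-likelihood estimate of `beta`, which is the kind of experiment the abstract outlines.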
One-Shot Visual Imitation Learning via Meta-Learning
Title | One-Shot Visual Imitation Learning via Meta-Learning |
Authors | Chelsea Finn, Tianhe Yu, Tianhao Zhang, Pieter Abbeel, Sergey Levine |
Abstract | In order for a robot to be a generalist that can perform a wide range of jobs, it must be able to acquire a wide variety of skills quickly and efficiently in complex unstructured environments. High-capacity models such as deep neural networks can enable a robot to represent complex skills, but learning each skill from scratch then becomes infeasible. In this work, we present a meta-imitation learning method that enables a robot to learn how to learn more efficiently, allowing it to acquire new skills from just a single demonstration. Unlike prior methods for one-shot imitation, our method can scale to raw pixel inputs and requires data from significantly fewer prior tasks for effective learning of new skills. Our experiments on both simulated and real robot platforms demonstrate the ability to learn new tasks, end-to-end, from a single visual demonstration. |
Tasks | Imitation Learning, Meta-Learning |
Published | 2017-09-14 |
URL | http://arxiv.org/abs/1709.04905v1 |
http://arxiv.org/pdf/1709.04905v1.pdf | |
PWC | https://paperswithcode.com/paper/one-shot-visual-imitation-learning-via-meta |
Repo | |
Framework | |
Reinforcement Learning-based Thermal Comfort Control for Vehicle Cabins
Title | Reinforcement Learning-based Thermal Comfort Control for Vehicle Cabins |
Authors | James Brusey, Diana Hintea, Elena Gaura, Neil Beloe |
Abstract | Vehicle climate control systems aim to keep passengers thermally comfortable. However, current systems control temperature rather than thermal comfort and tend to be energy hungry, which is of particular concern when considering electric vehicles. This paper poses energy-efficient vehicle comfort control as a Markov Decision Process, which is then solved numerically using Sarsa(λ) and an empirically validated, single-zone, 1D thermal model of the cabin. The resulting controller was tested in simulation using 200 randomly selected scenarios and found to exceed the performance of bang-bang, proportional, simple fuzzy logic, and commercial controllers, with increases of 23%, 43%, 40%, and 56%, respectively. Compared to the next best performing controller, energy consumption is reduced by 13% while the proportion of time spent thermally comfortable is increased by 23%. These results indicate that this is a viable approach that promises to translate into substantial comfort and energy improvements in the car. |
Tasks | |
Published | 2017-04-25 |
URL | http://arxiv.org/abs/1704.07899v2 |
http://arxiv.org/pdf/1704.07899v2.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-based-thermal-comfort |
Repo | |
Framework | |
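To make the Sarsa(λ) step in the thermal comfort abstract concrete, here is a minimal tabular sketch with accumulating eligibility traces. The environment is a hypothetical stand-in, not the paper's cabin model: states are discrete temperature levels, the actions are cool, off, and heat, and the reward penalises distance from an assumed comfort target plus a small energy cost.

```python
import random


def sarsa_lambda(n_states=11, target=5, episodes=300, steps=30,
                 alpha=0.1, gamma=0.95, lam=0.8, epsilon=0.1, seed=0):
    """Tabular Sarsa(lambda) on a toy 1D temperature chain (illustrative only)."""
    rng = random.Random(seed)
    actions = (-1, 0, 1)  # cool, off, heat (assumed action set)
    q = [[0.0] * len(actions) for _ in range(n_states)]

    def choose(state):
        # Epsilon-greedy action selection over the current Q-values.
        if rng.random() < epsilon:
            return rng.randrange(len(actions))
        return q[state].index(max(q[state]))

    for _ in range(episodes):
        traces = [[0.0] * len(actions) for _ in range(n_states)]
        state = rng.randrange(n_states)
        action = choose(state)
        for _ in range(steps):
            next_state = min(n_states - 1, max(0, state + actions[action]))
            # Discomfort penalty plus a small energy cost for acting.
            reward = -abs(next_state - target) - 0.1 * abs(actions[action])
            next_action = choose(next_state)
            td_error = (reward + gamma * q[next_state][next_action]
                        - q[state][action])
            traces[state][action] += 1.0  # accumulating trace
            for s in range(n_states):
                for a in range(len(actions)):
                    q[s][a] += alpha * td_error * traces[s][a]
                    traces[s][a] *= gamma * lam
            state, action = next_state, next_action
    return q


q_table = sarsa_lambda()
greedy_policy = [row.index(max(row)) for row in q_table]
```

The eligibility traces spread each temporal-difference error back over recently visited state-action pairs, which is what distinguishes Sarsa(λ) from one-step Sarsa. A learned policy on this toy chain would be expected to heat when too cold and cool when too hot.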
Theoretical Foundation of Co-Training and Disagreement-Based Algorithms
Title | Theoretical Foundation of Co-Training and Disagreement-Based Algorithms |
Authors | Wei Wang, Zhi-Hua Zhou |
Abstract | Disagreement-based approaches generate multiple classifiers and exploit the disagreement among them with unlabeled data to improve learning performance. Co-training is a representative paradigm among them, which trains two classifiers separately on two sufficient and redundant views; for applications where there is only one view, several successful variants of co-training that use two different classifiers on single-view data instead of two views have been proposed. For these disagreement-based approaches, several important issues remain unsolved. In this article we present theoretical analyses that address these issues, providing a theoretical foundation for co-training and disagreement-based approaches. |
Tasks | |
Published | 2017-08-15 |
URL | http://arxiv.org/abs/1708.04403v1 |
http://arxiv.org/pdf/1708.04403v1.pdf | |
PWC | https://paperswithcode.com/paper/theoretical-foundation-of-co-training-and |
Repo | |
Framework | |
Multi-Period Flexibility Forecast for Low Voltage Prosumers
Title | Multi-Period Flexibility Forecast for Low Voltage Prosumers |
Authors | Rui Pinto, Ricardo Bessa, Manuel Matos |
Abstract | The operation of near-future electric distribution grids will have to rely on demand-side flexibility, both through the implementation of demand response strategies and by taking advantage of the intelligent management of increasingly common small-scale energy storage. The home energy management system (HEMS), installed at low voltage residential clients, will play a crucial role in the provision of flexibility to both system operators and market players such as aggregators. Modeling and forecasting multi-period flexibility from residential prosumers, such as battery storage and electric water heaters, while complying with internal constraints (comfort levels, data privacy) and uncertainty is a complex task. This paper describes a computational method that is capable of efficiently learning and defining the feasible flexibility space from controllable resources connected to a HEMS. An Evolutionary Particle Swarm Optimization (EPSO) algorithm is adopted and reshaped to derive a set of feasible temporal trajectories for the residential net-load, considering storage, flexible appliances, and predefined customer preferences, as well as load and photovoltaic (PV) forecast uncertainty. A support vector data description (SVDD) algorithm is used to build models capable of classifying feasible and non-feasible HEMS operating trajectories upon request from an optimization/control algorithm operated by a DSO or market player. |
Tasks | |
Published | 2017-03-26 |
URL | http://arxiv.org/abs/1703.08825v4 |
http://arxiv.org/pdf/1703.08825v4.pdf | |
PWC | https://paperswithcode.com/paper/multi-period-flexibility-forecast-for-low |
Repo | |
Framework | |
Finding Robust Solutions to Stable Marriage
Title | Finding Robust Solutions to Stable Marriage |
Authors | Begum Genc, Mohamed Siala, Barry O’Sullivan, Gilles Simonin |
Abstract | We study the notion of robustness in stable matching problems. We first define robustness by introducing $(a,b)$-supermatches. An $(a,b)$-supermatch is a stable matching in which, if $a$ pairs break up, it is possible to find another stable matching by changing the partners of those $a$ pairs and at most $b$ other pairs. In this context, we define the most robust stable matching as a $(1,b)$-supermatch where $b$ is minimum. We show that checking whether a given stable matching is a $(1,b)$-supermatch can be done in polynomial time. Next, we use this procedure to design a constraint programming model, a local search approach, and a genetic algorithm to find the most robust stable matching. Our empirical evaluation on large instances shows that local search outperforms the other approaches. |
Tasks | |
Published | 2017-05-24 |
URL | http://arxiv.org/abs/1705.09218v3 |
http://arxiv.org/pdf/1705.09218v3.pdf | |
PWC | https://paperswithcode.com/paper/finding-robust-solutions-to-stable-marriage |
Repo | |
Framework | |
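The $(1,b)$-supermatch definition in the stable marriage abstract can be checked directly on tiny instances by brute force. The sketch below enumerates every stable matching and, for each pair that might break up, finds the closest alternative stable matching. It is exponential in the number of agents and is not the polynomial-time procedure the abstract refers to; the function names and the preference-list encoding are assumptions for this example.

```python
from itertools import permutations


def is_stable(matching, men_prefs, women_prefs):
    """matching[m] is the woman matched to man m; stable if no blocking pair exists."""
    husband = {w: m for m, w in enumerate(matching)}
    for m, w in enumerate(matching):
        for preferred in men_prefs[m]:
            if preferred == w:
                break  # remaining women are ranked below m's partner
            ranking = women_prefs[preferred]
            if ranking.index(m) < ranking.index(husband[preferred]):
                return False  # m and `preferred` would both rather be together
    return True


def all_stable_matchings(men_prefs, women_prefs):
    """Enumerate all stable matchings by testing every permutation."""
    n = len(men_prefs)
    return [p for p in permutations(range(n))
            if is_stable(p, men_prefs, women_prefs)]


def robustness_b(matching, stable):
    """Smallest b for which `matching` is a (1,b)-supermatch, or None if
    some pair cannot be repaired by any other stable matching."""
    worst = 0
    for m in range(len(matching)):
        # Stable matchings in which man m (and hence his partner) is re-matched,
        # scored by how many *other* men also change partner.
        repairs = [
            sum(1 for i in range(len(matching))
                if i != m and other[i] != matching[i])
            for other in stable if other[m] != matching[m]
        ]
        if not repairs:
            return None
        worst = max(worst, min(repairs))
    return worst


# Hypothetical 2x2 instance: each side's first choices conflict.
men = [[0, 1], [1, 0]]
women = [[1, 0], [0, 1]]
stable = all_stable_matchings(men, women)
b = robustness_b(stable[0], stable)
```

In this instance both the men-optimal and the women-optimal matchings are stable, and breaking either pair of one forces the single other pair to change as well, so `b` is 1.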
Mimicking Ensemble Learning with Deep Branched Networks
Title | Mimicking Ensemble Learning with Deep Branched Networks |
Authors | Byungju Kim, Youngsoo Kim, Yeakang Lee, Junmo Kim |
Abstract | This paper proposes a branched residual network for image classification. It is known that the high-level features of a deep neural network are more representative than its lower-level features. By sharing the low-level features, the network can allocate more memory to high-level features. The upper layers of our proposed network are branched so that it mimics ensemble learning. By mimicking ensemble learning with a single network, we achieve better performance on the ImageNet classification task. |
Tasks | Image Classification |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06376v1 |
http://arxiv.org/pdf/1702.06376v1.pdf | |
PWC | https://paperswithcode.com/paper/mimicking-ensemble-learning-with-deep |
Repo | |
Framework | |
HashGAN: Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval
Title | HashGAN: Attention-aware Deep Adversarial Hashing for Cross Modal Retrieval |
Authors | Xi Zhang, Siyu Zhou, Jiashi Feng, Hanjiang Lai, Bo Li, Yan Pan, Jian Yin, Shuicheng Yan |
Abstract | With the rapid growth of multi-modal data, hashing methods for cross-modal retrieval have received considerable attention. Deep-network-based cross-modal hashing methods are appealing as they can integrate feature learning and hash coding into end-to-end trainable frameworks. However, it is still challenging to find content similarities between different modalities of data due to the heterogeneity gap. To further address this problem, we propose an adversarial hashing network with an attention mechanism to enhance the measurement of content similarities by selectively focusing on informative parts of multi-modal data. The proposed new adversarial network, HashGAN, consists of three building blocks: 1) the feature learning module to obtain feature representations, 2) the generative attention module to generate an attention mask, which is used to obtain the attended (foreground) and the unattended (background) feature representations, and 3) the discriminative hash coding module to learn hash functions that preserve the similarities between different modalities. In our framework, the generative module and the discriminative module are trained in an adversarial way: the generator is trained so that the discriminator cannot preserve the similarities of multi-modal data w.r.t. the background feature representations, while the discriminator aims to preserve the similarities of multi-modal data w.r.t. both the foreground and the background feature representations. Extensive evaluations on several benchmark datasets demonstrate that the proposed HashGAN brings substantial improvements over other state-of-the-art cross-modal hashing methods. |
Tasks | Cross-Modal Retrieval |
Published | 2017-11-26 |
URL | http://arxiv.org/abs/1711.09347v1 |
http://arxiv.org/pdf/1711.09347v1.pdf | |
PWC | https://paperswithcode.com/paper/hashganattention-aware-deep-adversarial |
Repo | |
Framework | |
Learning Independent Features with Adversarial Nets for Non-linear ICA
Title | Learning Independent Features with Adversarial Nets for Non-linear ICA |
Authors | Philemon Brakel, Yoshua Bengio |
Abstract | Reliable measures of statistical dependence could be useful tools for learning independent features and performing tasks like source separation using Independent Component Analysis (ICA). Unfortunately, many such measures, like the mutual information, are hard to estimate and optimize directly. We propose to learn independent features with adversarial objectives which optimize such measures implicitly. These objectives compare samples from the joint distribution and the product of the marginals without the need to compute any probability densities. We also propose two methods for obtaining samples from the product of the marginals using either a simple resampling trick or a separate parametric distribution. Our experiments show that this strategy can easily be applied to different types of model architectures and solve both linear and non-linear ICA problems. |
Tasks | |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.05050v1 |
http://arxiv.org/pdf/1710.05050v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-independent-features-with |
Repo | |
Framework | |
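The non-linear ICA abstract mentions a simple resampling trick for drawing samples from the product of the marginals. One common way to do this, assumed here rather than taken from the abstract, is to shuffle each feature dimension of a batch independently: every column keeps its marginal distribution, while dependence between columns is destroyed.

```python
import random


def resample_product_of_marginals(batch, rng=None):
    """Return a batch whose columns are independently permuted copies of the
    input's columns (an approximate sample from the product of marginals)."""
    rng = rng or random.Random()
    columns = [list(column) for column in zip(*batch)]
    for column in columns:
        rng.shuffle(column)  # independent permutation per feature dimension
    return [list(row) for row in zip(*columns)]


# Hypothetical batch in which the second feature is fully determined by the first.
joint_batch = [[x, 10 * x] for x in range(1, 6)]
independent_batch = resample_product_of_marginals(joint_batch, random.Random(0))
```

A discriminator trained to tell `joint_batch` rows from `independent_batch` rows then gives an implicit signal about how dependent the features are, in the spirit of the adversarial objective the abstract describes.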
Progressive and Multi-Path Holistically Nested Neural Networks for Pathological Lung Segmentation from CT Images
Title | Progressive and Multi-Path Holistically Nested Neural Networks for Pathological Lung Segmentation from CT Images |
Authors | Adam P. Harrison, Ziyue Xu, Kevin George, Le Lu, Ronald M. Summers, Daniel J. Mollura |
Abstract | Pathological lung segmentation (PLS) is an important, yet challenging, medical image application due to the wide variability of pathological lung appearance and shape. Because PLS is often a pre-requisite for other imaging analytics, methodological simplicity and generality are key factors in usability. Along those lines, we present a bottom-up deep-learning-based approach that is expressive enough to handle variations in appearance, while remaining unaffected by any variations in shape. We incorporate the deeply supervised learning framework, but enhance it with a simple, yet effective, progressive multi-path scheme, which more reliably merges outputs from different network stages. The result is a deep model able to produce finer detailed masks, which we call progressive holistically-nested networks (P-HNNs). Using extensive cross-validation, our method is tested on multi-institutional datasets comprising 929 CT scans (848 publicly available) of pathological lungs, reporting a mean Dice score of 0.985 and demonstrating significant qualitative and quantitative improvements over state-of-the-art approaches. |
Tasks | |
Published | 2017-06-12 |
URL | http://arxiv.org/abs/1706.03702v1 |
http://arxiv.org/pdf/1706.03702v1.pdf | |
PWC | https://paperswithcode.com/paper/progressive-and-multi-path-holistically |
Repo | |
Framework | |
Transferring a Semantic Representation for Person Re-Identification and Search
Title | Transferring a Semantic Representation for Person Re-Identification and Search |
Authors | Zhiyuan Shi, Timothy M. Hospedales, Tao Xiang |
Abstract | Learning semantic attributes for person re-identification and description-based person search has gained increasing interest due to attributes’ great potential as a pose and view-invariant representation. However, existing attribute-centric approaches have thus far underperformed state-of-the-art conventional approaches. This is due to their non-scalable need for extensive domain (camera) specific annotation. In this paper we present a new semantic attribute learning approach for person re-identification and search. Our model is trained on existing fashion photography datasets – either weakly or strongly labelled. It can then be transferred and adapted to provide a powerful semantic description of surveillance person detections, without requiring any surveillance domain supervision. The resulting representation is useful for both unsupervised and supervised person re-identification, achieving state-of-the-art and near state-of-the-art performance respectively. Furthermore, as a semantic representation it allows description-based person search to be integrated within the same framework. |
Tasks | Person Re-Identification, Person Search |
Published | 2017-06-12 |
URL | http://arxiv.org/abs/1706.03725v1 |
http://arxiv.org/pdf/1706.03725v1.pdf | |
PWC | https://paperswithcode.com/paper/transferring-a-semantic-representation-for |
Repo | |
Framework | |