Paper Group ANR 1111
Learning Representations in Model-Free Hierarchical Reinforcement Learning
Title | Learning Representations in Model-Free Hierarchical Reinforcement Learning |
Authors | Jacob Rafati, David C. Noelle |
Abstract | Common approaches to Reinforcement Learning (RL) are seriously challenged by large-scale applications involving huge state spaces and sparse delayed reward feedback. Hierarchical Reinforcement Learning (HRL) methods attempt to address this scalability issue by learning action selection policies at multiple levels of temporal abstraction. Abstraction can be had by identifying a relatively small set of states that are likely to be useful as subgoals, in concert with the learning of corresponding skill policies to achieve those subgoals. Many approaches to subgoal discovery in HRL depend on the analysis of a model of the environment, but the need to learn such a model introduces its own problems of scale. Once subgoals are identified, skills may be learned through intrinsic motivation, introducing an internal reward signal marking subgoal attainment. In this paper, we present a novel model-free method for subgoal discovery using incremental unsupervised learning over a small memory of the most recent experiences (trajectories) of the agent. When combined with an intrinsic motivation learning mechanism, this method learns both subgoals and skills, based on experiences in the environment. Thus, we offer an original approach to HRL that does not require the acquisition of a model of the environment, suitable for large-scale applications. We demonstrate the efficiency of our method on two RL problems with sparse delayed feedback: a variant of the rooms environment and the first screen of the ATARI 2600 Montezuma’s Revenge game. |
Tasks | Hierarchical Reinforcement Learning, Montezuma’s Revenge |
Published | 2018-10-23 |
URL | http://arxiv.org/abs/1810.10096v3 |
PDF | http://arxiv.org/pdf/1810.10096v3.pdf |
PWC | https://paperswithcode.com/paper/learning-representations-in-model-free |
Repo | |
Framework | |
Electric Vehicle Driver Clustering using Statistical Model and Machine Learning
Title | Electric Vehicle Driver Clustering using Statistical Model and Machine Learning |
Authors | Yingqi Xiong, Bin Wang, Chi-Cheng Chu, Rajit Gadh |
Abstract | Electric vehicles (EVs) play a significant role in distribution energy management systems because their power consumption is much higher than that of regular home appliances. The randomness of EV driver behavior makes optimal charging or discharging scheduling even more difficult because the charging session parameters are uncertain. To minimize the impact of behavioral uncertainties, it is critical to develop effective methods to predict EV load for smart EV energy management. Using the EV smart charging infrastructures on the UCLA campus and in the city of Santa Monica as testbeds, we have collected real-world datasets of EV charging behaviors, based on which we propose an EV user modeling technique that combines statistical analysis and machine learning approaches. Specifically, an unsupervised clustering algorithm and a multilayer perceptron are applied to historical charging records to make day-ahead EV parking and load predictions. Experimental results with cross-validation show that our model can achieve good performance for charging control scheduling and online EV load forecasting. |
Tasks | Load Forecasting |
Published | 2018-02-12 |
URL | http://arxiv.org/abs/1802.04193v1 |
PDF | http://arxiv.org/pdf/1802.04193v1.pdf |
PWC | https://paperswithcode.com/paper/electric-vehicle-driver-clustering-using |
Repo | |
Framework | |
Deep Curiosity Search: Intra-Life Exploration Can Improve Performance on Challenging Deep Reinforcement Learning Problems
Title | Deep Curiosity Search: Intra-Life Exploration Can Improve Performance on Challenging Deep Reinforcement Learning Problems |
Authors | Christopher Stanton, Jeff Clune |
Abstract | Traditional exploration methods in RL require agents to perform random actions to find rewards. But these approaches struggle on sparse-reward domains like Montezuma’s Revenge where the probability that any random action sequence leads to reward is extremely low. Recent algorithms have performed well on such tasks by encouraging agents to visit new states or perform new actions in relation to all prior training episodes (which we call across-training novelty). But such algorithms do not consider whether an agent exhibits intra-life novelty: doing something new within the current episode, regardless of whether those behaviors have been performed in previous episodes. We hypothesize that across-training novelty might discourage agents from revisiting initially non-rewarding states that could become important stepping stones later in training. We introduce Deep Curiosity Search (DeepCS), which encourages intra-life exploration by rewarding agents for visiting as many different states as possible within each episode, and show that DeepCS matches the performance of current state-of-the-art methods on Montezuma’s Revenge. We further show that DeepCS improves exploration on Amidar, Freeway, Gravitar, and Tutankham (many of which are hard exploration games). Surprisingly, DeepCS doubles A2C performance on Seaquest, a game we would not have expected to benefit from intra-life exploration because the arena is small and already easily navigated by naive exploration techniques. In one run, DeepCS achieves a maximum training score of 80,000 points on Seaquest, higher than any methods other than Ape-X. The strong performance of DeepCS on these sparse- and dense-reward tasks suggests that encouraging intra-life novelty is an interesting, new approach for improving performance in Deep RL and motivates further research into hybridizing across-training and intra-life exploration methods. |
Tasks | Montezuma’s Revenge |
Published | 2018-06-01 |
URL | http://arxiv.org/abs/1806.00553v3 |
PDF | http://arxiv.org/pdf/1806.00553v3.pdf |
PWC | https://paperswithcode.com/paper/deep-curiosity-search-intra-life-exploration |
Repo | |
Framework | |
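The intra-life novelty idea described in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the grid discretization of agent positions, the bonus value, and the names `IntraLifeNovelty`, `cell_size`, and `bonus` are all assumptions made for illustration. The point it shows is that the visited-state memory is cleared at the start of every episode, so a state counts as new again in each life.

```python
class IntraLifeNovelty:
    """Per-episode novelty bonus: reward the first visit to each grid cell."""

    def __init__(self, cell_size=16, bonus=1.0):
        self.cell_size = cell_size  # hypothetical discretization of positions
        self.bonus = bonus          # hypothetical intrinsic reward per new cell
        self.visited = set()

    def reset(self):
        # Called at the start of every episode. The memory does not persist
        # across episodes, which is what makes the novelty "intra-life".
        self.visited.clear()

    def reward(self, x, y):
        # Map the position to a grid cell and pay the bonus only on first visit.
        cell = (x // self.cell_size, y // self.cell_size)
        if cell in self.visited:
            return 0.0
        self.visited.add(cell)
        return self.bonus


novelty = IntraLifeNovelty(cell_size=16)
episode_1 = [novelty.reward(x, y) for x, y in [(0, 0), (5, 5), (20, 0), (0, 0)]]
novelty.reset()
episode_2 = [novelty.reward(x, y) for x, y in [(0, 0), (20, 0)]]
print(episode_1, episode_2)
```

An across-training novelty method would instead keep `visited` (or visit counts) alive between episodes, so the cells rewarded in the first episode would earn nothing in the second.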
Observe and Look Further: Achieving Consistent Performance on Atari
Title | Observe and Look Further: Achieving Consistent Performance on Atari |
Authors | Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Večerík, Matteo Hessel, Rémi Munos, Olivier Pietquin |
Abstract | Despite significant advances in the field of deep Reinforcement Learning (RL), today’s algorithms still fail to learn human-level policies consistently over a set of diverse tasks such as Atari 2600 games. We identify three key challenges that any algorithm needs to master in order to perform well on all games: processing diverse reward distributions, reasoning over long time horizons, and exploring efficiently. In this paper, we propose an algorithm that addresses each of these challenges and is able to learn human-level policies on nearly all Atari games. A new transformed Bellman operator allows our algorithm to process rewards of varying densities and scales; an auxiliary temporal consistency loss allows us to train stably using a discount factor of $\gamma = 0.999$ (instead of $\gamma = 0.99$) extending the effective planning horizon by an order of magnitude; and we ease the exploration problem by using human demonstrations that guide the agent towards rewarding states. When tested on a set of 42 Atari games, our algorithm exceeds the performance of an average human on 40 games using a common set of hyper parameters. Furthermore, it is the first deep RL algorithm to solve the first level of Montezuma’s Revenge. |
Tasks | Atari Games, Montezuma’s Revenge |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11593v1 |
PDF | http://arxiv.org/pdf/1805.11593v1.pdf |
PWC | https://paperswithcode.com/paper/observe-and-look-further-achieving-consistent |
Repo | |
Framework | |
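The "order of magnitude" claim in the abstract above follows from the common rule of thumb that a discount factor $\gamma$ corresponds to an effective horizon of roughly $1/(1-\gamma)$ steps, which is the sum of the geometric series of discount weights. The short check below uses that rule as an assumption; the function name `effective_horizon` is illustrative and does not come from the paper.

```python
def effective_horizon(gamma):
    """Sum of the discount weights 1 + gamma + gamma^2 + ... = 1 / (1 - gamma)."""
    return 1.0 / (1.0 - gamma)


h_old = effective_horizon(0.99)   # about 100 steps
h_new = effective_horizon(0.999)  # about 1000 steps
ratio = h_new / h_old             # about 10, i.e. one order of magnitude
print(round(h_old), round(h_new), round(ratio))
```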
Hierarchical Imitation and Reinforcement Learning
Title | Hierarchical Imitation and Reinforcement Learning |
Authors | Hoang M. Le, Nan Jiang, Alekh Agarwal, Miroslav Dudík, Yisong Yue, Hal Daumé III |
Abstract | We study how to effectively leverage expert feedback to learn sequential decision-making policies. We focus on problems with sparse rewards and long time horizons, which typically pose significant challenges in reinforcement learning. We propose an algorithmic framework, called hierarchical guidance, that leverages the hierarchical structure of the underlying problem to integrate different modes of expert interaction. Our framework can incorporate different combinations of imitation learning (IL) and reinforcement learning (RL) at different levels, leading to dramatic reductions in both expert effort and cost of exploration. Using long-horizon benchmarks, including Montezuma’s Revenge, we demonstrate that our approach can learn significantly faster than hierarchical RL, and be significantly more label-efficient than standard IL. We also theoretically analyze labeling cost for certain instantiations of our framework. |
Tasks | Decision Making, Imitation Learning, Montezuma’s Revenge |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00590v2 |
PDF | http://arxiv.org/pdf/1803.00590v2.pdf |
PWC | https://paperswithcode.com/paper/hierarchical-imitation-and-reinforcement |
Repo | |
Framework | |
A Coarse-To-Fine Framework For Video Object Segmentation
Title | A Coarse-To-Fine Framework For Video Object Segmentation |
Authors | Chi Zhang, Alexander Loui |
Abstract | In this study, we develop an unsupervised coarse-to-fine video analysis framework and prototype system to extract a salient object in a video sequence. The framework starts by tracking grid-sampled points across frames, typically using the KLT tracking method. The tracked points can be divided into several groups according to their inconsistent movements. At the same time, the SLIC algorithm is extended into 3D space to generate supervoxels. Coarse segmentation is achieved by combining the categorized tracking points and the supervoxels of the corresponding frame in the video sequence. Finally, a graph-based fine segmentation algorithm is used to extract the moving object in the scene. Experimental results reveal that this method outperforms previous approaches in terms of accuracy and robustness. |
Tasks | Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2018-09-26 |
URL | http://arxiv.org/abs/1809.10260v1 |
PDF | http://arxiv.org/pdf/1809.10260v1.pdf |
PWC | https://paperswithcode.com/paper/a-coarse-to-fine-framework-for-video-object |
Repo | |
Framework | |
Best Vision Technologies Submission to ActivityNet Challenge 2018-Task: Dense-Captioning Events in Videos
Title | Best Vision Technologies Submission to ActivityNet Challenge 2018-Task: Dense-Captioning Events in Videos |
Authors | Yuan Liu, Moyini Yao |
Abstract | This note describes the details of our solution to the dense-captioning events in videos task of ActivityNet Challenge 2018. Specifically, we solve this problem in two stages: first temporal event proposal, then sentence generation. For temporal event proposal, we directly leverage the three-stage workflow in [13, 16]. For sentence generation, we capitalize on an LSTM-based captioning framework with a temporal attention mechanism (dubbed LSTM-T). Moreover, the input visual sequence to the LSTM-based video captioning model comprises RGB and optical flow images. At inference, we adopt a late fusion scheme to fuse the two LSTM-based captioning models for sentence generation. |
Tasks | Optical Flow Estimation, Video Captioning |
Published | 2018-06-25 |
URL | http://arxiv.org/abs/1806.09278v1 |
PDF | http://arxiv.org/pdf/1806.09278v1.pdf |
PWC | https://paperswithcode.com/paper/best-vision-technologies-submission-to |
Repo | |
Framework | |
Sequential Coordination of Deep Models for Learning Visual Arithmetic
Title | Sequential Coordination of Deep Models for Learning Visual Arithmetic |
Authors | Eric Crawford, Guillaume Rabusseau, Joelle Pineau |
Abstract | Achieving machine intelligence requires a smooth integration of perception and reasoning, yet models developed to date tend to specialize in one or the other; sophisticated manipulation of symbols acquired from rich perceptual spaces has so far proved elusive. Consider a visual arithmetic task, where the goal is to carry out simple arithmetical algorithms on digits presented under natural conditions (e.g. hand-written, placed randomly). We propose a two-tiered architecture for tackling this problem. The lower tier consists of a heterogeneous collection of information processing modules, which can include pre-trained deep neural networks for locating and extracting characters from the image, as well as modules performing symbolic transformations on the representations extracted by perception. The higher tier consists of a controller, trained using reinforcement learning, which coordinates the modules in order to solve the high-level task. For instance, the controller may learn in what contexts to execute the perceptual networks and what symbolic transformations to apply to their outputs. The resulting model is able to solve a variety of tasks in the visual arithmetic domain, and has several advantages over standard, architecturally homogeneous feedforward networks including improved sample efficiency. |
Tasks | |
Published | 2018-09-13 |
URL | http://arxiv.org/abs/1809.04988v1 |
PDF | http://arxiv.org/pdf/1809.04988v1.pdf |
PWC | https://paperswithcode.com/paper/sequential-coordination-of-deep-models-for |
Repo | |
Framework | |
Light Field Super-Resolution using a Low-Rank Prior and Deep Convolutional Neural Networks
Title | Light Field Super-Resolution using a Low-Rank Prior and Deep Convolutional Neural Networks |
Authors | Reuben A. Farrugia, Christine Guillemot |
Abstract | Light field imaging has recently seen renewed interest due to the availability of practical light field capturing systems that offer a wide range of applications in the field of computer vision. However, capturing high-resolution light fields remains technologically challenging since the increase in angular resolution is often accompanied by a significant reduction in spatial resolution. This paper describes a learning-based spatial light field super-resolution method that allows the restoration of the entire light field with consistency across all sub-aperture images. The algorithm first uses optical flow to align the light field and then reduces its angular dimension using low-rank approximation. We then consider the linearly independent columns of the resulting low-rank model as an embedding, which is restored using a deep convolutional neural network (DCNN). The super-resolved embedding is then used to reconstruct the remaining sub-aperture images. The original disparities are restored using inverse warping, where missing pixels are approximated using a novel light field inpainting algorithm. Experimental results show that the proposed method outperforms existing light field super-resolution algorithms, achieving PSNR gains of 0.23 dB over the second-best performing method. This performance can be further improved using iterative back-projection as a post-processing step. |
Tasks | Optical Flow Estimation, Super-Resolution |
Published | 2018-01-12 |
URL | http://arxiv.org/abs/1801.04314v1 |
PDF | http://arxiv.org/pdf/1801.04314v1.pdf |
PWC | https://paperswithcode.com/paper/light-field-super-resolution-using-a-low-rank |
Repo | |
Framework | |
Dual Latent Variable Model for Low-Resource Natural Language Generation in Dialogue Systems
Title | Dual Latent Variable Model for Low-Resource Natural Language Generation in Dialogue Systems |
Authors | Van-Khanh Tran, Le-Minh Nguyen |
Abstract | Recent deep learning models have shown improving results in natural language generation (NLG) when sufficient annotated data is provided. However, a modest amount of training data may harm such models' performance. Thus, how to build a generator that can utilize as much knowledge as possible from low-resource data is a crucial issue in NLG. This paper presents a variational neural-based generation model to tackle the NLG problem of having a limited labeled dataset, in which we integrate variational inference into an encoder-decoder generator and introduce a novel auxiliary autoencoding with an effective training procedure. Experiments showed that the proposed methods not only outperform the previous models when sufficient training data is available but also work acceptably well when the training data is scarce. |
Tasks | Text Generation |
Published | 2018-11-10 |
URL | http://arxiv.org/abs/1811.04164v1 |
PDF | http://arxiv.org/pdf/1811.04164v1.pdf |
PWC | https://paperswithcode.com/paper/dual-latent-variable-model-for-low-resource |
Repo | |
Framework | |
Design of optimal illumination patterns in single-pixel imaging using image dictionaries
Title | Design of optimal illumination patterns in single-pixel imaging using image dictionaries |
Authors | Jun Feng, Shuming Jiao, Yang Gao, Ting Lei, Xiaocong Yuan |
Abstract | Single-pixel imaging (SPI) has a major drawback: capturing a single image requires many sequential illuminations and therefore a long acquisition time. Basis illumination patterns such as Fourier patterns and Hadamard patterns can achieve much better imaging efficiency than random patterns, but the performance is still sub-optimal since the basis patterns are fixed and do not adapt to varying object images. This Letter proposes a novel scheme for designing and optimizing the illumination patterns adaptively from an image dictionary by extracting the common image features using principal component analysis (PCA). Simulation and experimental results reveal that our proposed scheme outperforms conventional Fourier SPI in terms of imaging efficiency. |
Tasks | |
Published | 2018-06-04 |
URL | https://arxiv.org/abs/1806.01340v2 |
PDF | https://arxiv.org/pdf/1806.01340v2.pdf |
PWC | https://paperswithcode.com/paper/design-of-optimal-illumination-patterns-in |
Repo | |
Framework | |
TreeGAN: Syntax-Aware Sequence Generation with Generative Adversarial Networks
Title | TreeGAN: Syntax-Aware Sequence Generation with Generative Adversarial Networks |
Authors | Xinyue Liu, Xiangnan Kong, Lei Liu, Kuorong Chiang |
Abstract | Generative Adversarial Networks (GANs) have shown great capacity on image generation, in which a discriminative model guides the training of a generative model to construct images that resemble real images. Recently, GANs have been extended from generating images to generating sequences (e.g., poems, music and codes). Existing GANs on sequence generation mainly focus on general sequences, which are grammar-free. In many real-world applications, however, we need to generate sequences in a formal language with the constraint of its corresponding grammar. For example, to test the performance of a database, one may want to generate a collection of SQL queries, which are not only similar to the queries of real users, but also follow the SQL syntax of the target database. Generating such sequences is highly challenging because both the generator and discriminator of GANs need to consider the structure of the sequences and the given grammar in the formal language. To address these issues, we study the problem of syntax-aware sequence generation with GANs, in which a collection of real sequences and a set of pre-defined grammatical rules are given to both discriminator and generator. We propose a novel GAN framework, namely TreeGAN, to incorporate a given Context-Free Grammar (CFG) into the sequence generation process. In TreeGAN, the generator employs a recurrent neural network (RNN) to construct a parse tree. Each generated parse tree can then be translated to a valid sequence of the given grammar. The discriminator uses a tree-structured RNN to distinguish the generated trees from real trees. We show that TreeGAN can generate sequences for any CFG and its generation fully conforms with the given syntax. Experiments on synthetic and real data sets demonstrated that TreeGAN significantly improves the quality of the sequence generation in context-free languages. |
Tasks | Image Generation |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07582v1 |
PDF | http://arxiv.org/pdf/1808.07582v1.pdf |
PWC | https://paperswithcode.com/paper/treegan-syntax-aware-sequence-generation-with |
Repo | |
Framework | |
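The step in the abstract above where a generated parse tree is translated into a valid sequence can be sketched with a toy context-free grammar. Everything here is an illustrative assumption rather than the TreeGAN implementation: the SQL-like `GRAMMAR`, the helper names `expand`, `leaves`, and `scripted`, and the `choose` callback, which stands in for the RNN generator that would pick a production at each nonterminal.

```python
# Toy context-free grammar: each nonterminal maps to its list of productions.
GRAMMAR = {
    "query": [["SELECT", "cols", "FROM", "table"]],
    "cols": [["col"], ["col", ",", "cols"]],
    "col": [["id"], ["name"]],
    "table": [["users"]],
}


def expand(symbol, choose):
    """Build a parse tree top-down; choose(symbol, n) returns a production index."""
    if symbol not in GRAMMAR:
        return symbol  # terminal symbols become leaves
    productions = GRAMMAR[symbol]
    rule = productions[choose(symbol, len(productions))]
    return (symbol, [expand(child, choose) for child in rule])


def leaves(tree):
    """Read the leaves left to right to obtain the generated token sequence."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [token for child in children for token in leaves(child)]


def scripted(choices):
    """Deterministic stand-in for a learned generator: replay fixed choices."""
    it = iter(choices)
    return lambda symbol, n: next(it)


# One choice per nonterminal, in expansion order:
# query, cols, col, cols, col, table.
tree = expand("query", scripted([0, 1, 0, 0, 1, 0]))
sentence = " ".join(leaves(tree))
print(sentence)
```

Because every expansion applies a production of the grammar, any tree built this way yields a sequence that the grammar accepts, whichever production the generator selects.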
2P-DNN : Privacy-Preserving Deep Neural Networks Based on Homomorphic Cryptosystem
Title | 2P-DNN : Privacy-Preserving Deep Neural Networks Based on Homomorphic Cryptosystem |
Authors | Qiang Zhu, Xixiang Lv |
Abstract | Machine Learning as a Service (MLaaS), offered by platforms such as Microsoft Azure and Amazon AWS, provides effective DNN models to complete machine learning tasks for small businesses and individuals who lack data and computing power. However, user privacy is exposed to the MLaaS server, since users need to upload their sensitive data to it. To preserve their privacy, users can encrypt their data before uploading it. This makes it difficult to run the DNN model, because it is not designed to run in the ciphertext domain. In this paper, using the Paillier homomorphic cryptosystem, we present a new privacy-preserving deep neural network model that we call 2P-DNN. This model can fulfill the machine learning task in the ciphertext domain. By using 2P-DNN, MLaaS is able to provide a privacy-preserving machine learning service for users. We build our 2P-DNN model based on LeNet-5 and test it with the encrypted MNIST dataset. The classification accuracy is more than 97%, which is close to the accuracy of LeNet-5 running with the MNIST dataset and higher than that of other existing privacy-preserving machine learning models. |
Tasks | |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.08459v1 |
PDF | http://arxiv.org/pdf/1807.08459v1.pdf |
PWC | https://paperswithcode.com/paper/2p-dnn-privacy-preserving-deep-neural |
Repo | |
Framework | |
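The Paillier cryptosystem named in the abstract above is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts, and raising a ciphertext to a plaintext power yields an encryption of the scaled plaintext. The toy sketch below demonstrates only that property. It is not the 2P-DNN construction, whose details the abstract does not give. The tiny primes, the choice g = n + 1, the use of Python's non-cryptographic `random` module, and the function names are all assumptions for illustration and are unsuitable for real use.

```python
import math
import random


def keygen(p, q):
    """Toy Paillier key generation from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1                                          # common simple choice of g
    mu = pow(lam, -1, n)                               # valid because g = n + 1
    return (n, g), (lam, mu)


def encrypt(pub, m, rng=random):
    n, g = pub
    n2 = n * n
    while True:
        r = rng.randrange(1, n)  # random blinding factor coprime to n
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n  # L(x) = (x - 1) / n, then multiply by mu


pub, priv = keygen(61, 53)
n2 = pub[0] ** 2
c1, c2 = encrypt(pub, 12), encrypt(pub, 30)

c_sum = (c1 * c2) % n2      # ciphertext product -> plaintext sum
c_scaled = pow(c1, 3, n2)   # ciphertext power   -> plaintext times constant

print(decrypt(pub, priv, c_sum), decrypt(pub, priv, c_scaled))
```

These two operations are enough for a server to evaluate a weighted sum with plaintext weights over encrypted inputs, which is the kind of linear computation found in neural network layers. Non-linear steps need additional handling that this sketch does not cover.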
Review on Optical Image Hiding and Watermarking Techniques
Title | Review on Optical Image Hiding and Watermarking Techniques |
Authors | Shuming Jiao, Changyuan Zhou, Yishi Shi, Wenbin Zou, Xia Li |
Abstract | Information security is a critical issue in modern society and image watermarking can effectively prevent unauthorized information access. Optical image watermarking techniques generally have advantages of parallel high-speed processing and multi-dimensional capabilities compared with digital approaches. This paper provides a comprehensive review on the research works related to optical image hiding and watermarking techniques conducted in the past decade. The past research works are focused on two major aspects, various optical systems for image hiding and the methods for embedding optical system output into a host image. A summary of the state-of-the-art works is made from these two perspectives. |
Tasks | |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05483v1 |
PDF | http://arxiv.org/pdf/1804.05483v1.pdf |
PWC | https://paperswithcode.com/paper/review-on-optical-image-hiding-and |
Repo | |
Framework | |
Fashion is Taking Shape: Understanding Clothing Preference Based on Body Shape From Online Sources
Title | Fashion is Taking Shape: Understanding Clothing Preference Based on Body Shape From Online Sources |
Authors | Hosnieh Sattar, Gerard Pons-Moll, Mario Fritz |
Abstract | To study the correlation between clothing garments and body shape, we collected a new dataset (Fashion Takes Shape), which includes images of users with clothing category annotations. We employ our multi-photo approach to estimate body shapes of each user and build a conditional model of clothing categories given body-shape. We demonstrate that in real-world data, clothing categories and body-shapes are correlated and show that our multi-photo approach leads to a better predictive model for clothing categories compared to models based on single-view shape estimates or manually annotated body types. We see our method as the first step towards the large-scale understanding of clothing preferences from body shape. |
Tasks | |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03235v2 |
PDF | http://arxiv.org/pdf/1807.03235v2.pdf |
PWC | https://paperswithcode.com/paper/fashion-is-taking-shape-understanding |
Repo | |
Framework | |