October 16, 2019

Paper Group ANR 1111

Learning Representations in Model-Free Hierarchical Reinforcement Learning. Electric Vehicle Driver Clustering using Statistical Model and Machine Learning. Deep Curiosity Search: Intra-Life Exploration Can Improve Performance on Challenging Deep Reinforcement Learning Problems. Observe and Look Further: Achieving Consistent Performance on Atari. H …

Learning Representations in Model-Free Hierarchical Reinforcement Learning


Title	Learning Representations in Model-Free Hierarchical Reinforcement Learning
Authors	Jacob Rafati, David C. Noelle
Abstract	Common approaches to Reinforcement Learning (RL) are seriously challenged by large-scale applications involving huge state spaces and sparse delayed reward feedback. Hierarchical Reinforcement Learning (HRL) methods attempt to address this scalability issue by learning action selection policies at multiple levels of temporal abstraction. Abstraction can be had by identifying a relatively small set of states that are likely to be useful as subgoals, in concert with the learning of corresponding skill policies to achieve those subgoals. Many approaches to subgoal discovery in HRL depend on the analysis of a model of the environment, but the need to learn such a model introduces its own problems of scale. Once subgoals are identified, skills may be learned through intrinsic motivation, introducing an internal reward signal marking subgoal attainment. In this paper, we present a novel model-free method for subgoal discovery using incremental unsupervised learning over a small memory of the most recent experiences (trajectories) of the agent. When combined with an intrinsic motivation learning mechanism, this method learns both subgoals and skills, based on experiences in the environment. Thus, we offer an original approach to HRL that does not require the acquisition of a model of the environment, suitable for large-scale applications. We demonstrate the efficiency of our method on two RL problems with sparse delayed feedback: a variant of the rooms environment and the first screen of the ATARI 2600 Montezuma’s Revenge game.
Tasks	Hierarchical Reinforcement Learning, Montezuma’s Revenge
Published	2018-10-23
URL	http://arxiv.org/abs/1810.10096v3
PDF	http://arxiv.org/pdf/1810.10096v3.pdf
PWC	https://paperswithcode.com/paper/learning-representations-in-model-free
Repo
Framework
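
The abstract describes subgoal discovery as incremental unsupervised learning over a small memory of recent experiences, paired with an internal reward for reaching a subgoal. The sketch below illustrates that idea only under stated assumptions: it uses online k-means as the unsupervised learner and treats centroids as candidate subgoals, which the abstract does not confirm. The function names, the 2-D toy states, and the `radius` threshold are all hypothetical.

```python
import numpy as np

def online_kmeans_update(centroids, counts, state):
    """One incremental k-means step: move the nearest centroid toward `state`."""
    distances = np.linalg.norm(centroids - state, axis=1)
    k = int(np.argmin(distances))
    counts[k] += 1
    # Running-mean update, so no past states need to be stored per centroid.
    centroids[k] += (state - centroids[k]) / counts[k]
    return k

def intrinsic_reward(state, subgoal, radius=0.5):
    """Hypothetical internal reward: 1.0 when the agent is within `radius` of the subgoal."""
    return 1.0 if np.linalg.norm(state - subgoal) <= radius else 0.0

# Toy memory of recent experiences: 2-D states gathered around two regions.
rng = np.random.default_rng(0)
memory = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.1, size=(50, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.1, size=(50, 2)),
])

centroids = np.array([[1.0, 1.0], [4.0, 4.0]])  # arbitrary initial guesses
counts = np.zeros(len(centroids))
for state in memory:
    online_kmeans_update(centroids, counts, state)
```

After the loop, each centroid sits near the centre of one region and can serve as a candidate subgoal; a lower-level skill policy would then be trained with `intrinsic_reward` as its learning signal.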

Electric Vehicle Driver Clustering using Statistical Model and Machine Learning


Title	Electric Vehicle Driver Clustering using Statistical Model and Machine Learning
Authors	Yingqi Xiong, Bin Wang, Chi-Cheng Chu, Rajit Gadh
Abstract	Electric vehicles (EVs) play a significant role in distribution energy management systems because their power consumption is much higher than that of regular home appliances. The randomness of EV driver behavior makes optimal charging or discharging scheduling even more difficult because the charging session parameters are uncertain. To minimize the impact of behavioral uncertainties, it is critical to develop effective methods to predict EV load for smart EV energy management. Using the EV smart charging infrastructures on the UCLA campus and in the city of Santa Monica as testbeds, we have collected real-world datasets of EV charging behaviors, based on which we propose an EV user modeling technique that combines statistical analysis and machine learning approaches. Specifically, an unsupervised clustering algorithm and a multilayer perceptron are applied to historical charging records to make day-ahead EV parking and load predictions. Experimental results with cross-validation show that our model can achieve good performance for charging control scheduling and online EV load forecasting.
Tasks	Load Forecasting
Published	2018-02-12
URL	http://arxiv.org/abs/1802.04193v1
PDF	http://arxiv.org/pdf/1802.04193v1.pdf
PWC	https://paperswithcode.com/paper/electric-vehicle-driver-clustering-using
Repo
Framework

Deep Curiosity Search: Intra-Life Exploration Can Improve Performance on Challenging Deep Reinforcement Learning Problems


Title	Deep Curiosity Search: Intra-Life Exploration Can Improve Performance on Challenging Deep Reinforcement Learning Problems
Authors	Christopher Stanton, Jeff Clune
Abstract	Traditional exploration methods in RL require agents to perform random actions to find rewards. But these approaches struggle on sparse-reward domains like Montezuma’s Revenge where the probability that any random action sequence leads to reward is extremely low. Recent algorithms have performed well on such tasks by encouraging agents to visit new states or perform new actions in relation to all prior training episodes (which we call across-training novelty). But such algorithms do not consider whether an agent exhibits intra-life novelty: doing something new within the current episode, regardless of whether those behaviors have been performed in previous episodes. We hypothesize that across-training novelty might discourage agents from revisiting initially non-rewarding states that could become important stepping stones later in training. We introduce Deep Curiosity Search (DeepCS), which encourages intra-life exploration by rewarding agents for visiting as many different states as possible within each episode, and show that DeepCS matches the performance of current state-of-the-art methods on Montezuma’s Revenge. We further show that DeepCS improves exploration on Amidar, Freeway, Gravitar, and Tutankham (many of which are hard exploration games). Surprisingly, DeepCS doubles A2C performance on Seaquest, a game we would not have expected to benefit from intra-life exploration because the arena is small and already easily navigated by naive exploration techniques. In one run, DeepCS achieves a maximum training score of 80,000 points on Seaquest, higher than any methods other than Ape-X. The strong performance of DeepCS on these sparse- and dense-reward tasks suggests that encouraging intra-life novelty is an interesting, new approach for improving performance in Deep RL and motivates further research into hybridizing across-training and intra-life exploration methods.
Tasks	Montezuma’s Revenge
Published	2018-06-01
URL	http://arxiv.org/abs/1806.00553v3
PDF	http://arxiv.org/pdf/1806.00553v3.pdf
PWC	https://paperswithcode.com/paper/deep-curiosity-search-intra-life-exploration
Repo
Framework
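
The intra-life novelty idea in the abstract (reward the agent for visiting as many different states as possible within each episode, regardless of earlier episodes) can be sketched with a per-episode visited set. This is a minimal illustration, not the DeepCS implementation: the grid discretisation of agent position, the `cell_size`, and the `bonus` value are assumptions made for the example.

```python
class IntraLifeNoveltyBonus:
    """Pay a bonus the first time each discretised cell is visited in the current episode."""

    def __init__(self, cell_size=16, bonus=1.0):
        self.cell_size = cell_size  # assumed grid resolution in pixels
        self.bonus = bonus          # assumed size of the intrinsic reward
        self.visited = set()

    def reset(self):
        # Forget everything at the start of each episode ("life"),
        # so states seen in earlier episodes are novel again.
        self.visited.clear()

    def reward(self, x, y):
        cell = (int(x) // self.cell_size, int(y) // self.cell_size)
        if cell in self.visited:
            return 0.0
        self.visited.add(cell)
        return self.bonus
```

The key design point is `reset`: because the memory is cleared every episode, the bonus never decays across training, which is what distinguishes intra-life novelty from across-training novelty.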

Observe and Look Further: Achieving Consistent Performance on Atari


Title	Observe and Look Further: Achieving Consistent Performance on Atari
Authors	Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Večerík, Matteo Hessel, Rémi Munos, Olivier Pietquin
Abstract	Despite significant advances in the field of deep Reinforcement Learning (RL), today’s algorithms still fail to learn human-level policies consistently over a set of diverse tasks such as Atari 2600 games. We identify three key challenges that any algorithm needs to master in order to perform well on all games: processing diverse reward distributions, reasoning over long time horizons, and exploring efficiently. In this paper, we propose an algorithm that addresses each of these challenges and is able to learn human-level policies on nearly all Atari games. A new transformed Bellman operator allows our algorithm to process rewards of varying densities and scales; an auxiliary temporal consistency loss allows us to train stably using a discount factor of $\gamma = 0.999$ (instead of $\gamma = 0.99$), extending the effective planning horizon by an order of magnitude; and we ease the exploration problem by using human demonstrations that guide the agent towards rewarding states. When tested on a set of 42 Atari games, our algorithm exceeds the performance of an average human on 40 games using a common set of hyperparameters. Furthermore, it is the first deep RL algorithm to solve the first level of Montezuma’s Revenge.
Tasks	Atari Games, Montezuma’s Revenge
Published	2018-05-29
URL	http://arxiv.org/abs/1805.11593v1
PDF	http://arxiv.org/pdf/1805.11593v1.pdf
PWC	https://paperswithcode.com/paper/observe-and-look-further-achieving-consistent
Repo
Framework
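
The claim that moving from $\gamma = 0.99$ to $\gamma = 0.999$ extends the planning horizon by an order of magnitude can be checked with the common rule of thumb that the effective horizon is roughly $1 / (1 - \gamma)$ steps. That rule is a standard heuristic and is not stated in the abstract; the function name below is hypothetical.

```python
def effective_horizon(gamma):
    """Rule-of-thumb horizon for a discount factor: 1 / (1 - gamma) steps."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must be in [0, 1)")
    return 1.0 / (1.0 - gamma)

for gamma in (0.99, 0.999):
    print(f"gamma={gamma}: horizon of about {effective_horizon(gamma):.0f} steps")
```

Under this heuristic the horizon grows from about 100 steps to about 1000 steps, a factor of ten.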

Hierarchical Imitation and Reinforcement Learning


Title	Hierarchical Imitation and Reinforcement Learning
Authors	Hoang M. Le, Nan Jiang, Alekh Agarwal, Miroslav Dudík, Yisong Yue, Hal Daumé III
Abstract	We study how to effectively leverage expert feedback to learn sequential decision-making policies. We focus on problems with sparse rewards and long time horizons, which typically pose significant challenges in reinforcement learning. We propose an algorithmic framework, called hierarchical guidance, that leverages the hierarchical structure of the underlying problem to integrate different modes of expert interaction. Our framework can incorporate different combinations of imitation learning (IL) and reinforcement learning (RL) at different levels, leading to dramatic reductions in both expert effort and cost of exploration. Using long-horizon benchmarks, including Montezuma’s Revenge, we demonstrate that our approach can learn significantly faster than hierarchical RL, and be significantly more label-efficient than standard IL. We also theoretically analyze labeling cost for certain instantiations of our framework.
Tasks	Decision Making, Imitation Learning, Montezuma’s Revenge
Published	2018-03-01
URL	http://arxiv.org/abs/1803.00590v2
PDF	http://arxiv.org/pdf/1803.00590v2.pdf
PWC	https://paperswithcode.com/paper/hierarchical-imitation-and-reinforcement
Repo
Framework

A Coarse-To-Fine Framework For Video Object Segmentation


Title	A Coarse-To-Fine Framework For Video Object Segmentation
Authors	Chi Zhang, Alexander Loui
Abstract	In this study, we develop an unsupervised coarse-to-fine video analysis framework and prototype system to extract a salient object in a video sequence. The framework starts by tracking grid-sampled points across frames, typically using the KLT tracking method. The tracked points can be divided into several groups according to their inconsistent movements. At the same time, the SLIC algorithm is extended into 3D space to generate supervoxels. Coarse segmentation is achieved by combining the categorized tracking points and supervoxels of the corresponding frame in the video sequence. Finally, a graph-based fine segmentation algorithm is used to extract the moving object in the scene. Experimental results reveal that this method outperforms the previous approaches in terms of accuracy and robustness.
Tasks	Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published	2018-09-26
URL	http://arxiv.org/abs/1809.10260v1
PDF	http://arxiv.org/pdf/1809.10260v1.pdf
PWC	https://paperswithcode.com/paper/a-coarse-to-fine-framework-for-video-object
Repo
Framework

Best Vision Technologies Submission to ActivityNet Challenge 2018-Task: Dense-Captioning Events in Videos


Title	Best Vision Technologies Submission to ActivityNet Challenge 2018-Task: Dense-Captioning Events in Videos
Authors	Yuan Liu, Moyini Yao
Abstract	This note describes the details of our solution to the dense-captioning events in videos task of ActivityNet Challenge 2018. Specifically, we solve this problem in two stages: temporal event proposal followed by sentence generation. For temporal event proposal, we directly leverage the three-stage workflow in [13, 16]. For sentence generation, we capitalize on an LSTM-based captioning framework with a temporal attention mechanism (dubbed LSTM-T). Moreover, the input visual sequence to the LSTM-based video captioning model comprises RGB and optical flow images. At inference, we adopt a late fusion scheme to fuse the two LSTM-based captioning models for sentence generation.
Tasks	Optical Flow Estimation, Video Captioning
Published	2018-06-25
URL	http://arxiv.org/abs/1806.09278v1
PDF	http://arxiv.org/pdf/1806.09278v1.pdf
PWC	https://paperswithcode.com/paper/best-vision-technologies-submission-to
Repo
Framework

Sequential Coordination of Deep Models for Learning Visual Arithmetic


Title	Sequential Coordination of Deep Models for Learning Visual Arithmetic
Authors	Eric Crawford, Guillaume Rabusseau, Joelle Pineau
Abstract	Achieving machine intelligence requires a smooth integration of perception and reasoning, yet models developed to date tend to specialize in one or the other; sophisticated manipulation of symbols acquired from rich perceptual spaces has so far proved elusive. Consider a visual arithmetic task, where the goal is to carry out simple arithmetical algorithms on digits presented under natural conditions (e.g. hand-written, placed randomly). We propose a two-tiered architecture for tackling this problem. The lower tier consists of a heterogeneous collection of information processing modules, which can include pre-trained deep neural networks for locating and extracting characters from the image, as well as modules performing symbolic transformations on the representations extracted by perception. The higher tier consists of a controller, trained using reinforcement learning, which coordinates the modules in order to solve the high-level task. For instance, the controller may learn in what contexts to execute the perceptual networks and what symbolic transformations to apply to their outputs. The resulting model is able to solve a variety of tasks in the visual arithmetic domain, and has several advantages over standard, architecturally homogeneous feedforward networks including improved sample efficiency.
Tasks
Published	2018-09-13
URL	http://arxiv.org/abs/1809.04988v1
PDF	http://arxiv.org/pdf/1809.04988v1.pdf
PWC	https://paperswithcode.com/paper/sequential-coordination-of-deep-models-for
Repo
Framework

Light Field Super-Resolution using a Low-Rank Prior and Deep Convolutional Neural Networks


Title	Light Field Super-Resolution using a Low-Rank Prior and Deep Convolutional Neural Networks
Authors	Reuben A. Farrugia, Christine Guillemot
Abstract	Light field imaging has recently seen renewed interest due to the availability of practical light field capturing systems that offer a wide range of applications in the field of computer vision. However, capturing high-resolution light fields remains technologically challenging since the increase in angular resolution is often accompanied by a significant reduction in spatial resolution. This paper describes a learning-based spatial light field super-resolution method that allows the restoration of the entire light field with consistency across all sub-aperture images. The algorithm first uses optical flow to align the light field and then reduces its angular dimension using low-rank approximation. We then consider the linearly independent columns of the resulting low-rank model as an embedding, which is restored using a deep convolutional neural network (DCNN). The super-resolved embedding is then used to reconstruct the remaining sub-aperture images. The original disparities are restored using inverse warping, where missing pixels are approximated using a novel light field inpainting algorithm. Experimental results show that the proposed method outperforms existing light field super-resolution algorithms, achieving PSNR gains of 0.23 dB over the second-best performing method. This performance can be further improved using iterative back-projection as a post-processing step.
Tasks	Optical Flow Estimation, Super-Resolution
Published	2018-01-12
URL	http://arxiv.org/abs/1801.04314v1
PDF	http://arxiv.org/pdf/1801.04314v1.pdf
PWC	https://paperswithcode.com/paper/light-field-super-resolution-using-a-low-rank
Repo
Framework
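
Two ingredients named in the abstract, low-rank approximation of the aligned light field and PSNR as the quality metric, can be illustrated on synthetic data. The sketch below is not the paper's pipeline (it has no optical flow alignment and no DCNN); it assumes a truncated SVD as the low-rank step and uses a made-up matrix whose columns stand in for nine nearly identical, vectorised sub-aperture views.

```python
import numpy as np

def low_rank_approx(matrix, rank):
    """Best rank-`rank` approximation in the least-squares sense, via truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for signals with maximum value `peak`."""
    mse = np.mean((reference - estimate) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# Synthetic stand-in: 9 views of 64 pixels each that differ only by small noise,
# mimicking views that have already been aligned.
rng = np.random.default_rng(0)
base = rng.random((64, 1))
views = base @ np.ones((1, 9)) + 0.01 * rng.standard_normal((64, 9))

approx = low_rank_approx(views, rank=1)
quality_db = psnr(views, approx)
```

Because the aligned views are almost identical, a single rank-1 component already reproduces all nine columns closely, which is the redundancy the angular dimension reduction relies on.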

Dual Latent Variable Model for Low-Resource Natural Language Generation in Dialogue Systems


Title	Dual Latent Variable Model for Low-Resource Natural Language Generation in Dialogue Systems
Authors	Van-Khanh Tran, Le-Minh Nguyen
Abstract	Recent deep learning models have shown improving results in natural language generation (NLG) irrespective of providing sufficient annotated data. However, a modest amount of training data may harm the performance of such models. Thus, how to build a generator that can utilize as much knowledge as possible from low-resource data is a crucial issue in NLG. This paper presents a variational neural-based generation model to tackle the NLG problem of having a limited labeled dataset, in which we integrate variational inference into an encoder-decoder generator and introduce a novel auxiliary autoencoding with an effective training procedure. Experiments showed that the proposed methods not only outperform the previous models when given a sufficient training dataset but also work acceptably well when the training data is scarce.
Tasks	Text Generation
Published	2018-11-10
URL	http://arxiv.org/abs/1811.04164v1
PDF	http://arxiv.org/pdf/1811.04164v1.pdf
PWC	https://paperswithcode.com/paper/dual-latent-variable-model-for-low-resource
Repo
Framework

Design of optimal illumination patterns in single-pixel imaging using image dictionaries


Title	Design of optimal illumination patterns in single-pixel imaging using image dictionaries
Authors	Jun Feng, Shuming Jiao, Yang Gao, Ting Lei, Xiaocong Yuan
Abstract	Single-pixel imaging (SPI) has a major drawback: capturing a single image requires many sequential illuminations and therefore a long acquisition time. Basis illumination patterns such as Fourier patterns and Hadamard patterns can achieve much better imaging efficiency than random patterns. But the performance is still sub-optimal since the basis patterns are fixed and non-adaptive for varying object images. This Letter proposes a novel scheme for designing and optimizing the illumination patterns adaptively from an image dictionary by extracting the common image features using principal component analysis (PCA). Simulation and experimental results reveal that our proposed scheme outperforms conventional Fourier SPI in terms of imaging efficiency.
Tasks
Published	2018-06-04
URL	https://arxiv.org/abs/1806.01340v2
PDF	https://arxiv.org/pdf/1806.01340v2.pdf
PWC	https://paperswithcode.com/paper/design-of-optimal-illumination-patterns-in
Repo
Framework
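
The scheme in the abstract derives illumination patterns from an image dictionary with PCA. A minimal sketch of that idea follows, assuming each principal component is used directly as one pattern and each single-pixel measurement is the inner product of the pattern with the mean-removed image. Both are simplifications: physical patterns must be non-negative, and the function names and synthetic dictionary are hypothetical.

```python
import numpy as np

def pca_patterns(dictionary, num_patterns):
    """dictionary: (num_images, num_pixels). Returns the top principal components and the mean image."""
    mean = dictionary.mean(axis=0)
    _, _, vt = np.linalg.svd(dictionary - mean, full_matrices=False)
    return vt[:num_patterns], mean

def measure_and_reconstruct(image, patterns, mean):
    """Simulate one single-pixel measurement per pattern, then invert the projection."""
    coeffs = patterns @ (image - mean)
    return mean + patterns.T @ coeffs

# Synthetic dictionary: 20 images of 64 pixels that all lie in a 3-dimensional subspace.
rng = np.random.default_rng(0)
basis = rng.random((3, 64))
dictionary = rng.random((20, 3)) @ basis

patterns, mean = pca_patterns(dictionary, num_patterns=3)
test_image = np.array([0.2, 0.5, 0.3]) @ basis   # unseen image from the same subspace
recon = measure_and_reconstruct(test_image, patterns, mean)
```

In this toy setting three measurements recover a 64-pixel image, because the patterns are adapted to the structure shared by the dictionary; a fixed basis would generally need more measurements for the same image class.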

TreeGAN: Syntax-Aware Sequence Generation with Generative Adversarial Networks


Title	TreeGAN: Syntax-Aware Sequence Generation with Generative Adversarial Networks
Authors	Xinyue Liu, Xiangnan Kong, Lei Liu, Kuorong Chiang
Abstract	Generative Adversarial Networks (GANs) have shown great capacity in image generation, in which a discriminative model guides the training of a generative model to construct images that resemble real images. Recently, GANs have been extended from generating images to generating sequences (e.g., poems, music and code). Existing GANs for sequence generation mainly focus on general sequences, which are grammar-free. In many real-world applications, however, we need to generate sequences in a formal language with the constraint of its corresponding grammar. For example, to test the performance of a database, one may want to generate a collection of SQL queries, which are not only similar to the queries of real users, but also follow the SQL syntax of the target database. Generating such sequences is highly challenging because both the generator and discriminator of GANs need to consider the structure of the sequences and the given grammar in the formal language. To address these issues, we study the problem of syntax-aware sequence generation with GANs, in which a collection of real sequences and a set of pre-defined grammatical rules are given to both discriminator and generator. We propose a novel GAN framework, namely TreeGAN, to incorporate a given Context-Free Grammar (CFG) into the sequence generation process. In TreeGAN, the generator employs a recurrent neural network (RNN) to construct a parse tree. Each generated parse tree can then be translated to a valid sequence of the given grammar. The discriminator uses a tree-structured RNN to distinguish the generated trees from real trees. We show that TreeGAN can generate sequences for any CFG and its generation fully conforms with the given syntax. Experiments on synthetic and real data sets demonstrated that TreeGAN significantly improves the quality of the sequence generation in context-free languages.
Tasks	Image Generation
Published	2018-08-22
URL	http://arxiv.org/abs/1808.07582v1
PDF	http://arxiv.org/pdf/1808.07582v1.pdf
PWC	https://paperswithcode.com/paper/treegan-syntax-aware-sequence-generation-with
Repo
Framework
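
The abstract's central mechanism is that the generator builds a parse tree by choosing grammar rules, and the tree is then flattened into a sequence that is valid by construction. The sketch below shows only that mechanism: rule choices are made by a random number generator standing in for the RNN, and the tiny SQL-like grammar, the depth cap, and all names are invented for illustration.

```python
import random

# Hypothetical toy grammar: keys are non-terminals, values are lists of production rules.
GRAMMAR = {
    "query": [["SELECT", "cols", "FROM", "table"]],
    "cols": [["col"], ["col", ",", "cols"]],
    "col": [["id"], ["name"]],
    "table": [["users"], ["orders"]],
}

def generate_tree(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` into a parse tree; a learned generator would pick the rule instead of `rng`."""
    if symbol not in GRAMMAR:
        return symbol  # terminal token
    rules = GRAMMAR[symbol]
    # Past max_depth, always take the first (non-recursive) rule so generation terminates.
    rule = rules[0] if depth >= max_depth else rng.choice(rules)
    return (symbol, [generate_tree(s, rng, depth + 1, max_depth) for s in rule])

def leaves(tree):
    """Flatten a parse tree into its left-to-right sequence of terminal tokens."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [token for child in children for token in leaves(child)]

tree = generate_tree("query", random.Random(0))
print(" ".join(leaves(tree)))
```

Because every node is expanded only through rules in `GRAMMAR`, any sequence produced this way conforms to the grammar, whatever policy selects the rules.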

2P-DNN : Privacy-Preserving Deep Neural Networks Based on Homomorphic Cryptosystem


Title	2P-DNN : Privacy-Preserving Deep Neural Networks Based on Homomorphic Cryptosystem
Authors	Qiang Zhu, Xixiang Lv
Abstract	Machine Learning as a Service (MLaaS), such as Microsoft Azure and Amazon AWS, offers an effective DNN model to complete machine learning tasks for small businesses and individuals who lack data and computing power. However, user privacy is exposed to the MLaaS server, since users need to upload their sensitive data to it. To preserve their privacy, users can encrypt their data before uploading it. This makes it difficult to run the DNN model, because it is not designed to run in the ciphertext domain. In this paper, using the Paillier homomorphic cryptosystem, we present a new privacy-preserving deep neural network model that we call 2P-DNN. This model can carry out the machine learning task in the ciphertext domain. By using 2P-DNN, MLaaS is able to provide a privacy-preserving machine learning service for users. We build our 2P-DNN model based on LeNet-5 and test it with the encrypted MNIST dataset. The classification accuracy is more than 97%, which is close to the accuracy of LeNet-5 running with the MNIST dataset and higher than that of other existing privacy-preserving machine learning models.
Tasks
Published	2018-07-23
URL	http://arxiv.org/abs/1807.08459v1
PDF	http://arxiv.org/pdf/1807.08459v1.pdf
PWC	https://paperswithcode.com/paper/2p-dnn-privacy-preserving-deep-neural
Repo
Framework
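
The Paillier cryptosystem mentioned in the abstract is additively homomorphic: multiplying two ciphertexts adds the plaintexts, and raising a ciphertext to a plaintext power scales it. That is enough to evaluate a weighted sum, the core of a linear layer, on encrypted inputs. The sketch below is a toy illustration of that property, not the 2P-DNN protocol: the primes are far too small to be secure, the weights are assumed to be non-negative plaintext integers, and all function names are hypothetical.

```python
import math
import random

def keygen(p=293, q=433):
    """Toy key generation with tiny primes; real keys use primes of 1024 bits or more."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1                                          # common simple choice of generator
    mu = pow(lam, -1, n)                               # modular inverse (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Ciphertext of m1 + m2, computed without decrypting."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def scale_encrypted(pub, c, k):
    """Ciphertext of k * m for a plaintext integer k >= 0."""
    n, _ = pub
    return pow(c, k, n * n)

# Encrypted dot product: the server sees only ciphertexts of the inputs.
pub, priv = keygen()
inputs, weights = [3, 5, 2], [4, 1, 7]
acc = encrypt(pub, 0)
for c, w in zip([encrypt(pub, x) for x in inputs], weights):
    acc = add_encrypted(pub, acc, scale_encrypted(pub, c, w))
result = decrypt(pub, priv, acc)
```

Here `result` equals the plaintext dot product of `inputs` and `weights`. Non-linear activations are not covered by these two operations and need separate handling, which this sketch does not attempt.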

Review on Optical Image Hiding and Watermarking Techniques


Title	Review on Optical Image Hiding and Watermarking Techniques
Authors	Shuming Jiao, Changyuan Zhou, Yishi Shi, Wenbin Zou, Xia Li
Abstract	Information security is a critical issue in modern society and image watermarking can effectively prevent unauthorized information access. Optical image watermarking techniques generally have advantages of parallel high-speed processing and multi-dimensional capabilities compared with digital approaches. This paper provides a comprehensive review on the research works related to optical image hiding and watermarking techniques conducted in the past decade. The past research works are focused on two major aspects, various optical systems for image hiding and the methods for embedding optical system output into a host image. A summary of the state-of-the-art works is made from these two perspectives.
Tasks
Published	2018-04-16
URL	http://arxiv.org/abs/1804.05483v1
PDF	http://arxiv.org/pdf/1804.05483v1.pdf
PWC	https://paperswithcode.com/paper/review-on-optical-image-hiding-and
Repo
Framework

Fashion is Taking Shape: Understanding Clothing Preference Based on Body Shape From Online Sources


Title	Fashion is Taking Shape: Understanding Clothing Preference Based on Body Shape From Online Sources
Authors	Hosnieh Sattar, Gerard Pons-Moll, Mario Fritz
Abstract	To study the correlation between clothing garments and body shape, we collected a new dataset (Fashion Takes Shape), which includes images of users with clothing category annotations. We employ our multi-photo approach to estimate body shapes of each user and build a conditional model of clothing categories given body-shape. We demonstrate that in real-world data, clothing categories and body-shapes are correlated and show that our multi-photo approach leads to a better predictive model for clothing categories compared to models based on single-view shape estimates or manually annotated body types. We see our method as the first step towards the large-scale understanding of clothing preferences from body shape.
Tasks
Published	2018-07-09
URL	http://arxiv.org/abs/1807.03235v2
PDF	http://arxiv.org/pdf/1807.03235v2.pdf
PWC	https://paperswithcode.com/paper/fashion-is-taking-shape-understanding
Repo
Framework