October 16, 2019


Paper Group ANR 1138



Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application

Title Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application
Authors Yujing Hu, Qing Da, Anxiang Zeng, Yang Yu, Yinghui Xu
Abstract In e-commerce platforms such as Amazon and TaoBao, ranking items in a search session is a typical multi-step decision-making problem. Learning to rank (LTR) methods have been widely applied to ranking problems. However, such methods often treat the different ranking steps in a session as independent, even though they may in fact be highly correlated. To better exploit the correlation between ranking steps, in this paper we propose to use reinforcement learning (RL) to learn an optimal ranking policy that maximizes the expected accumulative rewards in a search session. Firstly, we formally define the concept of a search session Markov decision process (SSMDP) to formulate the multi-step ranking problem. Secondly, we analyze the properties of SSMDPs and theoretically prove the necessity of maximizing accumulative rewards. Lastly, we propose a novel policy gradient algorithm for learning an optimal ranking policy, which is able to deal with the high reward variance and unbalanced reward distribution of an SSMDP. Experiments are conducted in simulation and in the TaoBao search engine. The results demonstrate that our algorithm performs much better than online LTR methods, with more than 40% and 30% growth of total transaction amount in the simulation and the real application, respectively.
Tasks Decision Making, Learning-To-Rank
Published 2018-03-02
URL http://arxiv.org/abs/1803.00710v3
PDF http://arxiv.org/pdf/1803.00710v3.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-to-rank-in-e-commerce
Repo
Framework
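The core idea of learning a ranking policy by policy gradient can be illustrated with a vanilla REINFORCE sketch on a toy one-step version of the problem (the paper's actual algorithm adds variance-reduction and reward-balancing machinery not shown here; all items, features, and rewards below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

n_items, n_features = 5, 3
theta = np.zeros(n_features)                     # linear scoring parameters
items = rng.normal(size=(n_items, n_features))   # hypothetical item features
true_value = items @ np.array([1.0, -0.5, 0.3])  # hypothetical purchase propensity

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sample_top_item(theta):
    """One ranking step: stochastically pick the item to show first."""
    probs = softmax(items @ theta)
    a = rng.choice(n_items, p=probs)
    return a, probs

# REINFORCE: increase log-probability of actions in proportion to reward
alpha = 0.1
for _ in range(2000):
    a, probs = sample_top_item(theta)
    reward = true_value[a]                 # stand-in for transaction reward
    grad_log = items[a] - probs @ items    # grad of log pi(a | theta) for softmax
    theta += alpha * reward * grad_log
```

After training, the policy's expected reward should exceed that of uniform random ranking; the paper's contribution is handling the variance and imbalance that raw rewards like this exhibit at scale.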

Learning via social awareness: Improving a deep generative sketching model with facial feedback

Title Learning via social awareness: Improving a deep generative sketching model with facial feedback
Authors Natasha Jaques, Jennifer McCleary, Jesse Engel, David Ha, Fred Bertsch, Rosalind Picard, Douglas Eck
Abstract In the quest towards general artificial intelligence (AI), researchers have explored developing loss functions that act as intrinsic motivators in the absence of external rewards. This paper argues that such research has overlooked an important and useful intrinsic motivator: social interaction. We posit that making an AI agent aware of implicit social feedback from humans can allow for faster learning of more generalizable and useful representations, and could potentially impact AI safety. We collect social feedback in the form of facial expression reactions to samples from Sketch RNN, an LSTM-based variational autoencoder (VAE) designed to produce sketch drawings. We use a Latent Constraints GAN (LC-GAN) to learn from the facial feedback of a small group of viewers, by optimizing the model to produce sketches that it predicts will lead to more positive facial expressions. We show in multiple independent evaluations that the model trained with facial feedback produced sketches that are more highly rated, and induce significantly more positive facial expressions. Thus, we establish that implicit social feedback can improve the output of a deep learning model.
Tasks
Published 2018-02-13
URL http://arxiv.org/abs/1802.04877v2
PDF http://arxiv.org/pdf/1802.04877v2.pdf
PWC https://paperswithcode.com/paper/learning-via-social-awareness-improving-a
Repo
Framework
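The LC-GAN step, at its core, nudges latent codes toward regions a feedback predictor scores highly. A minimal sketch of that latent-space gradient ascent, with a made-up quadratic stand-in for the facial-feedback predictor (the real model uses a critic trained on expression data):

```python
import numpy as np

# Hypothetical stand-in for a facial-feedback predictor: sketches whose
# latent code lies near z_star are assumed to elicit positive expressions.
z_star = np.array([0.8, -0.3, 0.5])

def predicted_positivity(z):
    return -np.sum((z - z_star) ** 2)

def grad_positivity(z):
    return -2.0 * (z - z_star)

# Gradient ascent in latent space: the analogue of optimizing the
# generator's latent toward codes the feedback predictor scores highly.
z = np.zeros(3)
lr = 0.1
for _ in range(100):
    z = z + lr * grad_positivity(z)
```

The latent code converges to the region of predicted-positive feedback; with a real critic the gradient would come from backpropagating through the trained predictor.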

Learning to guide task and motion planning using score-space representation

Title Learning to guide task and motion planning using score-space representation
Authors Beomjoon Kim, Zi Wang, Leslie Pack Kaelbling, Tomas Lozano-Perez
Abstract In this paper, we propose a learning algorithm that speeds up the search in task and motion planning problems. We address three challenges that arise in learning to improve planning efficiency: what to predict, how to represent a planning problem instance, and how to transfer knowledge from one problem instance to another. We propose a method that predicts constraints on the search space based on a generic representation of a planning problem instance, called score space, where we represent a problem instance in terms of the performance of a set of solutions attempted so far. Using this representation, we transfer knowledge, in the form of constraints, from previous problems based on similarity in score space. We design a sequential algorithm that efficiently predicts these constraints, and evaluate it on three challenging task and motion planning problems. Results indicate that our approach performs orders of magnitude faster than an unguided planner.
Tasks Motion Planning
Published 2018-07-26
URL http://arxiv.org/abs/1807.09962v1
PDF http://arxiv.org/pdf/1807.09962v1.pdf
PWC https://paperswithcode.com/paper/learning-to-guide-task-and-motion-planning
Repo
Framework
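The score-space representation can be sketched in a few lines: a problem instance is described by the scores of a fixed library of candidate solutions, and constraints transfer from the most similar past instance in that space. Everything below (scores, constraint labels) is illustrative, not from the paper:

```python
import numpy as np

# Score-space idea: describe a planning-problem instance by the scores
# (e.g. plan costs or success indicators) of a fixed library of candidate
# solutions tried on it, then transfer constraints from the most similar
# past instance. Values are illustrative.
past_scores = np.array([
    [0.9, 0.1, 0.4],   # instance A: scores of 3 candidate solutions
    [0.2, 0.8, 0.7],   # instance B
])
past_constraints = ["grasp-from-left", "grasp-from-right"]

def transfer_constraint(new_scores):
    # nearest neighbour in score space (Euclidean distance)
    d = np.linalg.norm(past_scores - new_scores, axis=1)
    return past_constraints[int(np.argmin(d))]
```

A new instance whose candidate solutions score similarly to instance A inherits A's constraint, without ever comparing raw geometric descriptions of the problems.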

VH-HFCN based Parking Slot and Lane Markings Segmentation on Panoramic Surround View

Title VH-HFCN based Parking Slot and Lane Markings Segmentation on Panoramic Surround View
Authors Yan Wu, Tao Yang, Junqiao Zhao, Linting Guan, Wei Jiang
Abstract Automatic parking is being actively developed by car manufacturers and suppliers. To date, two problems remain. First, there are no openly available segmentation labels for parking slots on a panoramic surround view (PSV) dataset. Second, parking slots and road structure must be detected robustly. In this paper, we therefore build a public PSV dataset and propose a highly fused convolutional network (HFCN) based segmentation method for parking slots and lane markings on it. A surround-view image is composed of four calibrated images captured by four fisheye cameras. We collect and label more than 4,200 surround-view images for this task, covering variously illuminated scenes with different types of parking slots. We propose a VH-HFCN network, which adopts an HFCN as the base and adds an efficient VH-stage for better segmenting various markings. The VH-stage consists of two independent linear convolution paths with vertical and horizontal convolution kernels, respectively. This modification enables the network to extract linear features robustly and precisely. We evaluated our model on the PSV dataset, and the results show outstanding performance in ground-markings segmentation. Based on the segmented markings, parking slots and lanes are obtained via skeletonization, Hough line transform, and line arrangement.
Tasks
Published 2018-04-19
URL http://arxiv.org/abs/1804.07027v2
PDF http://arxiv.org/pdf/1804.07027v2.pdf
PWC https://paperswithcode.com/paper/vh-hfcn-based-parking-slot-and-lane-markings
Repo
Framework
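The VH-stage's two linear paths can be illustrated with plain NumPy: a tall (k, 1) kernel responds strongly to vertical markings and a wide (1, k) kernel to horizontal ones. Kernel sizes and the test image below are illustrative:

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Naive 'valid' 2-D cross-correlation."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Two independent linear paths, as in the VH-stage: a tall vertical
# kernel and a wide horizontal kernel (sizes are illustrative).
v_kernel = np.ones((5, 1)) / 5.0   # responds strongly to vertical strokes
h_kernel = np.ones((1, 5)) / 5.0   # responds strongly to horizontal strokes

img = np.zeros((7, 7))
img[:, 3] = 1.0                    # a vertical line marking

v_resp = conv2d_valid(img, v_kernel)
h_resp = conv2d_valid(img, h_kernel)
```

On the vertical-line test image the vertical path's peak response is five times the horizontal path's, which is the property that lets each path specialize in one marking orientation.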

HeadOn: Real-time Reenactment of Human Portrait Videos

Title HeadOn: Real-time Reenactment of Human Portrait Videos
Authors Justus Thies, Michael Zollhöfer, Christian Theobalt, Marc Stamminger, Matthias Nießner
Abstract We propose HeadOn, the first real-time source-to-target reenactment approach for complete human portrait videos that enables transfer of torso and head motion, face expression, and eye gaze. Given a short RGB-D video of the target actor, we automatically construct a personalized geometry proxy that embeds a parametric head, eye, and kinematic torso model. A novel real-time reenactment algorithm employs this proxy to photo-realistically map the captured motion from the source actor to the target actor. On top of the coarse geometric proxy, we propose a video-based rendering technique that composites the modified target portrait video via view- and pose-dependent texturing, and creates photo-realistic imagery of the target actor under novel torso and head poses, facial expressions, and gaze directions. To this end, we propose a robust tracking of the face and torso of the source actor. We extensively evaluate our approach and show significant improvements in enabling much greater flexibility in creating realistic reenacted output videos.
Tasks
Published 2018-05-29
URL http://arxiv.org/abs/1805.11729v1
PDF http://arxiv.org/pdf/1805.11729v1.pdf
PWC https://paperswithcode.com/paper/headon-real-time-reenactment-of-human
Repo
Framework

Theory IIIb: Generalization in Deep Networks

Title Theory IIIb: Generalization in Deep Networks
Authors Tomaso Poggio, Qianli Liao, Brando Miranda, Andrzej Banburski, Xavier Boix, Jack Hidary
Abstract A main puzzle of deep neural networks (DNNs) revolves around the apparent absence of “overfitting”, defined in this paper as follows: the expected error does not get worse when increasing the number of neurons or of iterations of gradient descent. This is surprising because of the large capacity demonstrated by DNNs to fit randomly labeled data and the absence of explicit regularization. Recent results by Srebro et al. provide a satisfying solution of the puzzle for linear networks used in binary classification. They prove that minimization of loss functions such as the logistic, the cross-entropy and the exp-loss yields asymptotic, “slow” convergence to the maximum margin solution for linearly separable datasets, independently of the initial conditions. Here we prove a similar result for nonlinear multilayer DNNs near zero minima of the empirical loss. The result holds for exponential-type losses but not for the square loss. In particular, we prove that the weight matrix at each layer of a deep network converges to a minimum norm solution up to a scale factor (in the separable case). Our analysis of the dynamical system corresponding to gradient descent of a multilayer network suggests a simple criterion for ranking the generalization performance of different zero minimizers of the empirical loss.
Tasks
Published 2018-06-29
URL http://arxiv.org/abs/1806.11379v1
PDF http://arxiv.org/pdf/1806.11379v1.pdf
PWC https://paperswithcode.com/paper/theory-iiib-generalization-in-deep-networks
Repo
Framework
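The linear-network result cited from Srebro et al. has a compact standard statement worth recalling: for linearly separable data, gradient descent on exponential-type losses converges in direction to the maximum-margin separator. In standard notation:

```latex
\[
\lim_{t\to\infty}\frac{w(t)}{\lVert w(t)\rVert}
  = \frac{\hat{w}}{\lVert \hat{w}\rVert},
\qquad
\hat{w} = \arg\min_{w}\ \lVert w\rVert^{2}
  \ \text{ s.t. }\ y_{i}\, w^{\top} x_{i} \ge 1 \ \ \forall i .
\]
```

The paper's contribution is a result of this form for nonlinear multilayer networks: near zero minima of the empirical loss, each layer's weight matrix converges to a minimum-norm solution up to a scale factor.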

MagicVO: End-to-End Monocular Visual Odometry through Deep Bi-directional Recurrent Convolutional Neural Network

Title MagicVO: End-to-End Monocular Visual Odometry through Deep Bi-directional Recurrent Convolutional Neural Network
Authors Jian Jiao, Jichao Jiao, Yaokai Mo, Weilun Liu, Zhongliang Deng
Abstract This paper proposes a new framework, called MagicVO, to solve the problem of monocular visual odometry. Based on a Convolutional Neural Network (CNN) and a Bi-directional LSTM (Bi-LSTM), MagicVO outputs a 6-DoF absolute-scale pose at each camera position, taking a sequence of continuous monocular images as input. It not only exploits the strength of CNNs in image feature processing to fully extract rich features from image frames, but also learns, through the Bi-LSTM, the geometric relationship between preceding and subsequent frames in a sequence to obtain more accurate predictions. A pipeline of MagicVO is shown in Fig. 1. The MagicVO system is end-to-end, and the results of experiments on the KITTI dataset and the ETH-asl cla dataset show that MagicVO outperforms traditional visual odometry (VO) systems in pose accuracy and generalization ability.
Tasks Monocular Visual Odometry, Visual Odometry
Published 2018-11-27
URL http://arxiv.org/abs/1811.10964v2
PDF http://arxiv.org/pdf/1811.10964v2.pdf
PWC https://paperswithcode.com/paper/magicvo-end-to-end-monocular-visual-odometry
Repo
Framework
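The CNN-to-Bi-LSTM pipeline can be sketched at the shape level with a toy bidirectional recurrence in NumPy (random weights, tiny dimensions; a real implementation would use trained CNN and LSTM layers):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the MagicVO pipeline: per-frame CNN features go through
# a bidirectional recurrence, and a linear head maps each fused state to a
# 6-DoF pose (3 translation + 3 rotation components). Sizes are illustrative.
T, feat_dim, hid = 4, 8, 16
features = rng.normal(size=(T, feat_dim))               # pretend CNN outputs

Wf = rng.normal(scale=0.1, size=(hid, feat_dim + hid))  # forward cell
Wb = rng.normal(scale=0.1, size=(hid, feat_dim + hid))  # backward cell
Wp = rng.normal(scale=0.1, size=(6, 2 * hid))           # pose head

def run_direction(W, seq):
    h = np.zeros(hid)
    states = []
    for x in seq:
        h = np.tanh(W @ np.concatenate([x, h]))
        states.append(h)
    return states

fwd = run_direction(Wf, features)              # past -> future context
bwd = run_direction(Wb, features[::-1])[::-1]  # future -> past context

poses = np.stack([Wp @ np.concatenate([f, b]) for f, b in zip(fwd, bwd)])
```

Each pose estimate thus sees context from frames both before and after it, which is what the bidirectional design contributes over a forward-only recurrence.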

Learnable: Theory vs Applications

Title Learnable: Theory vs Applications
Authors Marina Sapir
Abstract Two different views on the machine learning problem, applied learning (machine learning with business applications) and agnostic PAC learning, are formalized and compared here. I show that, under some conditions, PAC learning theory provides a way to solve the applied learning problem. However, the theory requires training sets so large that learning would be practically useless. I suggest shedding some theoretical misconceptions about learning to make the theory more aligned with the needs and experience of practitioners.
Tasks
Published 2018-07-27
URL http://arxiv.org/abs/1807.10681v1
PDF http://arxiv.org/pdf/1807.10681v1.pdf
PWC https://paperswithcode.com/paper/learnable-theory-vs-applications
Repo
Framework

Guided Feature Selection for Deep Visual Odometry

Title Guided Feature Selection for Deep Visual Odometry
Authors Fei Xue, Qiuyuan Wang, Xin Wang, Wei Dong, Junqiu Wang, Hongbin Zha
Abstract We present a novel end-to-end visual odometry architecture with guided feature selection, based on deep convolutional recurrent neural networks. Unlike current monocular visual odometry methods, our approach is built on the intuition that features contribute discriminatively to different motion patterns. Specifically, we propose a dual-branch recurrent network that learns rotation and translation separately, leveraging a Convolutional Neural Network (CNN) for feature representation and a Recurrent Neural Network (RNN) for image sequence reasoning. To enhance feature selection, we further introduce an effective context-aware guidance mechanism that explicitly forces each branch to distill information relevant to its specific motion pattern. Experiments demonstrate that on the prevalent KITTI and ICL_NUIM benchmarks, our method outperforms current state-of-the-art model- and learning-based methods for both decoupled and joint camera pose recovery.
Tasks Feature Selection, Monocular Visual Odometry, Visual Odometry
Published 2018-11-25
URL http://arxiv.org/abs/1811.09935v1
PDF http://arxiv.org/pdf/1811.09935v1.pdf
PWC https://paperswithcode.com/paper/guided-feature-selection-for-deep-visual
Repo
Framework
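One simple way to realize a context-aware guidance mechanism is a learned sigmoid gate per branch over the shared features, so the rotation and translation branches each see a differently re-weighted view. This is an illustrative sketch, not the paper's exact mechanism:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative context-aware gating for a dual-branch pose network: each
# branch (rotation, translation) computes its own gate over the shared CNN
# features, conditioned on context such as the previous recurrent state.
# Sizes and weights are illustrative, not the paper's.
feat_dim, ctx_dim = 32, 16
features = rng.normal(size=feat_dim)          # shared CNN features
context = rng.normal(size=ctx_dim)            # e.g. previous hidden state

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

W_rot = rng.normal(scale=0.1, size=(feat_dim, ctx_dim))
W_trans = rng.normal(scale=0.1, size=(feat_dim, ctx_dim))

gate_rot = sigmoid(W_rot @ context)           # per-feature weight in (0, 1)
gate_trans = sigmoid(W_trans @ context)

rot_input = gate_rot * features               # what the rotation branch sees
trans_input = gate_trans * features           # what the translation branch sees
```

Because the gates are trained end-to-end with each branch's pose loss, each branch learns to suppress features irrelevant to its motion component.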

IMMIGRATE: A Margin-based Feature Selection Method with Interaction Terms

Title IMMIGRATE: A Margin-based Feature Selection Method with Interaction Terms
Authors Ruzhang Zhao, Pengyu Hong, Jun S Liu
Abstract Relief-based algorithms have often been claimed to uncover feature interactions. However, it is still unclear whether and how interaction terms are differentiated from marginal effects. In this paper, we propose the IMMIGRATE algorithm, which includes and trains weights for interaction terms. Besides applying the large-margin principle, we focus on the robustness of the contributors to the margin and consider local and global information simultaneously. Moreover, IMMIGRATE enjoys attractive properties such as robustness and compatibility with boosting. We evaluate the proposed method on several tasks, on which it achieves state-of-the-art results.
Tasks Feature Selection
Published 2018-10-05
URL https://arxiv.org/abs/1810.02658v3
PDF https://arxiv.org/pdf/1810.02658v3.pdf
PWC https://paperswithcode.com/paper/immigrate-a-margin-based-feature-selection
Repo
Framework

Unveiling the Power of Deep Tracking

Title Unveiling the Power of Deep Tracking
Authors Goutam Bhat, Joakim Johnander, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg
Abstract In the field of generic object tracking numerous attempts have been made to exploit deep features. Despite all expectations, deep trackers are yet to reach an outstanding level of performance compared to methods solely based on handcrafted features. In this paper, we investigate this key issue and propose an approach to unlock the true potential of deep features for tracking. We systematically study the characteristics of both deep and shallow features, and their relation to tracking accuracy and robustness. We identify the limited data and low spatial resolution as the main challenges, and propose strategies to counter these issues when integrating deep features for tracking. Furthermore, we propose a novel adaptive fusion approach that leverages the complementary properties of deep and shallow features to improve both robustness and accuracy. Extensive experiments are performed on four challenging datasets. On VOT2017, our approach significantly outperforms the top performing tracker from the challenge with a relative gain of 17% in EAO.
Tasks Object Tracking
Published 2018-04-18
URL http://arxiv.org/abs/1804.06833v1
PDF http://arxiv.org/pdf/1804.06833v1.pdf
PWC https://paperswithcode.com/paper/unveiling-the-power-of-deep-tracking
Repo
Framework
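An adaptive fusion of deep and shallow response maps can be sketched by weighting each map with a simple peak-to-mean confidence score: the sharp shallow response dominates localization while the broad deep response adds robustness. The quality measure and signals below are illustrative, not the paper's exact formulation:

```python
import numpy as np

# Illustrative fusion of a sharp shallow-feature response (accurate) with a
# broad deep-feature response (robust), weighting each map by a simple
# peak-to-mean quality score. This is a sketch, not the paper's exact rule.
def quality(r):
    return r.max() / (r.mean() + 1e-8)

def fuse(responses):
    w = np.array([quality(r) for r in responses])
    w = w / w.sum()
    return sum(wi * r for wi, r in zip(w, responses))

x = np.linspace(-1, 1, 101)
shallow = np.exp(-(x - 0.10) ** 2 / 0.002)   # sharp, precise peak
deep = np.exp(-(x - 0.12) ** 2 / 0.08)       # broad, robust peak

fused = fuse([shallow, deep])
peak = x[int(np.argmax(fused))]
```

The fused map inherits the shallow response's precise peak location while the deep response still contributes everywhere the shallow map is flat, which is the complementarity the paper exploits.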

How Do Classifiers Induce Agents To Invest Effort Strategically?

Title How Do Classifiers Induce Agents To Invest Effort Strategically?
Authors Jon Kleinberg, Manish Raghavan
Abstract Algorithms are often used to produce decision-making rules that classify or evaluate individuals. When these individuals have incentives to be classified a certain way, they may behave strategically to influence their outcomes. We develop a model for how strategic agents can invest effort in order to change the outcomes they receive, and we give a tight characterization of when such agents can be incentivized to invest specified forms of effort into improving their outcomes as opposed to “gaming” the classifier. We show that whenever any “reasonable” mechanism can do so, a simple linear mechanism suffices.
Tasks Decision Making
Published 2018-07-13
URL https://arxiv.org/abs/1807.05307v5
PDF https://arxiv.org/pdf/1807.05307v5.pdf
PWC https://paperswithcode.com/paper/how-do-classifiers-induce-agents-to-invest
Repo
Framework
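The effort-investment model can be illustrated with a toy best response: actions move observable features, the mechanism scores features linearly, and the agent puts effort where the score gain per unit cost is highest. All numbers below are made up; the point is that the mechanism's weights determine whether genuine improvement or gaming is the agent's best response:

```python
import numpy as np

# Toy version of the strategic-effort setting: each action moves observable
# features, the mechanism scores features linearly with weights w, and the
# agent best-responds by investing effort in the action with the highest
# score gain per unit cost. Numbers are illustrative, not from the paper.
w = np.array([0.7, 0.3])          # mechanism's linear weights on features
effect = np.array([
    [1.0, 0.0],                   # action 0: genuine improvement
    [0.2, 1.5],                   # action 1: "gaming" the second feature
])
cost = np.array([1.0, 1.0])       # unit cost per action

gain_per_cost = (effect @ w) / cost
best_action = int(np.argmax(gain_per_cost))
```

With these weights, genuine improvement (action 0) yields more score per unit cost, so it is the agent's best response; shifting weight onto the second feature would flip the incentive toward gaming, which is exactly the designer's lever in the paper's model.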

Deep Learning based Retinal OCT Segmentation

Title Deep Learning based Retinal OCT Segmentation
Authors Mike Pekala, Neil Joshi, David E. Freund, Neil M. Bressler, Delia Cabrera DeBuc, Philippe M Burlina
Abstract Our objective is to evaluate the efficacy of methods that use deep learning (DL) for the automatic fine-grained segmentation of optical coherence tomography (OCT) images of the retina. OCT images from 10 patients with mild non-proliferative diabetic retinopathy were used from a public (U. of Miami) dataset. For each patient, five images were available: one image of the fovea center, two images of the perifovea, and two images of the parafovea. For each image, two expert graders each manually annotated five retinal surfaces (i.e. boundaries between pairs of retinal layers). The first grader’s annotations were used as ground truth and the second grader’s annotations to compute inter-operator agreement. The proposed automated approach segments images using fully convolutional networks (FCNs) together with Gaussian process (GP)-based regression as a post-processing step to improve the quality of the estimates. Using 10-fold cross validation, the performance of the algorithms is determined by computing the per-pixel unsigned error (distance) between the automated estimates and the ground truth annotations generated by the first manual grader. We compare the proposed method against five state-of-the-art automatic segmentation techniques. The results show that the proposed methods compare favorably with state-of-the-art techniques, resulting in the smallest mean unsigned error values and associated standard deviations, and performance is comparable with human annotation of retinal layers from OCT when there is only mild retinopathy. The results suggest that semantic segmentation using FCNs, coupled with regression-based post-processing, can effectively solve the OCT segmentation problem on par with human capabilities with mild retinopathy.
Tasks Semantic Segmentation
Published 2018-01-29
URL http://arxiv.org/abs/1801.09749v1
PDF http://arxiv.org/pdf/1801.09749v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-based-retinal-oct-segmentation
Repo
Framework
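The GP-based post-processing idea, smoothing the network's noisy per-column boundary estimates with a Gaussian-process posterior mean, can be sketched in NumPy. The kernel form and hyperparameters are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sketch of the post-processing idea: treat per-column boundary estimates
# as noisy observations of a smooth curve and replace them with the
# posterior mean of a Gaussian process (RBF kernel, illustrative settings).
def rbf(a, b, var=9.0, length=5.0):
    return var * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

cols = np.arange(50, dtype=float)
true_boundary = 20 + 3 * np.sin(cols / 8.0)        # hypothetical smooth layer
noisy = true_boundary + rng.normal(scale=1.0, size=cols.size)

noise_var = 1.0
mean = noisy.mean()
K = rbf(cols, cols)
alpha = np.linalg.solve(K + noise_var * np.eye(cols.size), noisy - mean)
smoothed = mean + K @ alpha                        # GP posterior mean

err_raw = np.abs(noisy - true_boundary).mean()
err_gp = np.abs(smoothed - true_boundary).mean()
```

Because neighboring columns of a retinal surface are strongly correlated, the posterior mean averages out per-column noise and lowers the unsigned error relative to the raw estimates.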

An Enhanced BPSO based Approach for Service Placement in Hybrid Cloud

Title An Enhanced BPSO based Approach for Service Placement in Hybrid Cloud
Authors Wissem Abbes, Zied Kechaou, Adel M. Alimi
Abstract Facing intense competition and a rapidly evolving market, companies need to be innovative and agile, particularly with regard to the web applications their customers use. The hybrid cloud has become an attractive solution: organizations combine private and public cloud deployments to match their needs, making profitable use of available resources and execution speed. Deploying a new application then entails placing some components in the private cloud and others in the public cloud. Our primary goal in this paper is to minimize the extra costs incurred by using the public cloud, along with the costs of maintaining communication between the private and public clouds. Our second objective is to reduce the execution time of the decision process that selects the optimal service placement. For this purpose, we propose a novel Binary Particle Swarm Optimization (BPSO) based approach for effective service placement optimization in the hybrid cloud. On a real benchmark, the experimental results show that our approach outperforms the state of the art in both cost and time.
Tasks
Published 2018-06-10
URL http://arxiv.org/abs/1806.05971v1
PDF http://arxiv.org/pdf/1806.05971v1.pdf
PWC https://paperswithcode.com/paper/an-enhanced-bpso-based-approach-for-service
Repo
Framework
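A minimal BPSO loop captures the approach's core mechanics: real-valued velocities are squashed through a sigmoid to give per-bit probabilities, and each bit (here, public vs. private placement of a component) is resampled accordingly. The cost model and all constants below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy hybrid-cloud placement: bit i = 1 places component i in the public
# cloud. Cost = per-component public-cloud cost + a communication penalty
# for each dependent pair split across clouds. Numbers are illustrative.
public_cost = np.array([4.0, 1.0, 3.0, 1.0, 2.0])
deps = [(0, 1), (1, 2), (3, 4)]       # communicating component pairs
comm_penalty = 5.0

def cost(bits):
    c = float(bits @ public_cost)
    c += comm_penalty * sum(bits[i] != bits[j] for i, j in deps)
    return c

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

n_particles, n_bits, iters = 12, 5, 60
X = rng.integers(0, 2, size=(n_particles, n_bits))
V = np.zeros((n_particles, n_bits))
pbest = X.copy()
pbest_cost = np.array([cost(x) for x in X])
g = pbest[np.argmin(pbest_cost)].copy()          # global best placement

for _ in range(iters):
    r1, r2 = rng.random(V.shape), rng.random(V.shape)
    V = 0.7 * V + 1.5 * r1 * (pbest - X) + 1.5 * r2 * (g - X)
    X = (rng.random(V.shape) < sigmoid(V)).astype(int)   # binary resampling
    c = np.array([cost(x) for x in X])
    improved = c < pbest_cost
    pbest[improved], pbest_cost[improved] = X[improved], c[improved]
    g = pbest[np.argmin(pbest_cost)].copy()

best_cost = cost(g)
```

The swarm tends toward low-cost placements that keep tightly coupled components in the same cloud; the paper's contribution is an enhanced variant of this scheme evaluated on a real benchmark.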

Force Estimation from OCT Volumes using 3D CNNs

Title Force Estimation from OCT Volumes using 3D CNNs
Authors Nils Gessert, Jens Beringhoff, Christoph Otte, Alexander Schlaefer
Abstract Purpose: Estimating the interaction forces between instruments and tissue is of interest, particularly for providing haptic feedback during robot-assisted minimally invasive interventions. Different approaches based on external and integrated force sensors have been proposed; these are hampered by friction, sensor size, and sterilizability. We investigate a novel approach that estimates the force vector directly from optical coherence tomography image volumes. Methods: We introduce a novel Siamese 3D CNN architecture. The network takes an undeformed reference volume and a deformed sample volume as input and outputs the three components of the force vector. We employ a deep residual architecture with bottlenecks for increased efficiency, and compare the Siamese approach to methods using difference volumes and two-dimensional projections. Data was generated using a robotic setup to obtain ground-truth force vectors for silicone tissue phantoms as well as porcine tissue. Results: Our method achieves a mean average error of 7.7 ± 4.3 mN when estimating the force vector. Our novel Siamese 3D CNN architecture outperforms single-path methods, which achieve a mean average error of 11.59 ± 6.7 mN. Moreover, the use of volume data leads to significantly higher performance than processing only surface information, which achieves a mean average error of 24.38 ± 22.0 mN. On the tissue dataset, our method generalizes well across subjects. Conclusions: We propose a novel image-based force estimation method using optical coherence tomography. We illustrate that capturing the deformation of subsurface structures substantially improves force estimation. Our approach can provide accurate force estimates in surgical setups when using intraoperative optical coherence tomography.
Tasks
Published 2018-04-26
URL http://arxiv.org/abs/1804.10002v1
PDF http://arxiv.org/pdf/1804.10002v1.pdf
PWC https://paperswithcode.com/paper/force-estimation-from-oct-volumes-using-3d
Repo
Framework
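The Siamese idea can be sketched at the shape level: one shared encoder embeds both the undeformed reference volume and the deformed sample volume, and a head maps the concatenated embeddings to a 3-component force vector. The random linear encoder below stands in for the paper's 3D residual CNN:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shape-level sketch of the Siamese force-estimation idea: a single shared
# encoder embeds both input volumes, and a head maps the embedding pair to
# a force vector. A real model would use a 3D CNN; here the shared encoder
# is one random linear map for brevity. All sizes are illustrative.
vol_shape = (8, 8, 8)
emb_dim = 16

W_enc = rng.normal(scale=0.01, size=(emb_dim, np.prod(vol_shape)))
W_head = rng.normal(scale=0.1, size=(3, 2 * emb_dim))

def encode(volume):
    return np.tanh(W_enc @ volume.ravel())   # shared weights for both inputs

def predict_force(reference, deformed):
    z = np.concatenate([encode(reference), encode(deformed)])
    return W_head @ z                        # 3 force components

reference = rng.normal(size=vol_shape)
deformed = reference + 0.1 * rng.normal(size=vol_shape)
force = predict_force(reference, deformed)
```

Weight sharing is the key design choice: both volumes are embedded by the same function, so the head can focus on the deformation between them rather than on absolute appearance.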