Paper Group ANR 1138
Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application
Title | Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application |
Authors | Yujing Hu, Qing Da, Anxiang Zeng, Yang Yu, Yinghui Xu |
Abstract | In e-commerce platforms such as Amazon and TaoBao, ranking items in a search session is a typical multi-step decision-making problem. Learning to rank (LTR) methods have been widely applied to ranking problems. However, such methods often consider different ranking steps in a session to be independent, when in fact they may be highly correlated with each other. To better utilize the correlation between different ranking steps, in this paper we propose using reinforcement learning (RL) to learn an optimal ranking policy that maximizes the expected accumulative rewards in a search session. Firstly, we formally define the concept of a search session Markov decision process (SSMDP) to formulate the multi-step ranking problem. Secondly, we analyze the properties of SSMDPs and theoretically prove the necessity of maximizing accumulative rewards. Lastly, we propose a novel policy gradient algorithm for learning an optimal ranking policy, which is able to deal with the high reward variance and unbalanced reward distribution of an SSMDP. Experiments are conducted both in simulation and in the TaoBao search engine. The results demonstrate that our algorithm performs significantly better than online LTR methods, with more than 40% and 30% growth in total transaction amount in the simulation and the real application, respectively. |
Tasks | Decision Making, Learning-To-Rank |
Published | 2018-03-02 |
URL | http://arxiv.org/abs/1803.00710v3 |
http://arxiv.org/pdf/1803.00710v3.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-to-rank-in-e-commerce |
Repo | |
Framework | |
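The multi-step ranking formulation above lends itself to a standard policy-gradient treatment. Below is a minimal REINFORCE-style sketch in PyTorch; the session simulator (`env_step`), the state and action sizes, and the return normalization used to tame the high reward variance are all illustrative assumptions, not the paper's SSMDP-specific algorithm.

```python
import torch
import torch.nn as nn

class RankingPolicy(nn.Module):
    """Maps a session state to a distribution over candidate ranking actions."""
    def __init__(self, state_dim=32, n_actions=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))

    def forward(self, state):
        return torch.distributions.Categorical(logits=self.net(state))

def run_episode(policy, env_step, state, horizon=5):
    """Roll out one search session; env_step is an assumed session simulator."""
    log_probs, rewards = [], []
    for _ in range(horizon):
        dist = policy(state)
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        state, reward, done = env_step(state, action)
        rewards.append(reward)
        if done:
            break
    return log_probs, rewards

def reinforce_loss(log_probs, rewards, gamma=1.0):
    returns, g = [], 0.0
    for r in reversed(rewards):          # accumulate discounted returns backwards
        g = r + gamma * g
        returns.insert(0, g)
    returns = torch.tensor(returns)
    # Normalizing returns is one common remedy for the high reward
    # variance the abstract points to.
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    return -(torch.stack(log_probs) * returns).sum()
```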
Learning via social awareness: Improving a deep generative sketching model with facial feedback
Title | Learning via social awareness: Improving a deep generative sketching model with facial feedback |
Authors | Natasha Jaques, Jennifer McCleary, Jesse Engel, David Ha, Fred Bertsch, Rosalind Picard, Douglas Eck |
Abstract | In the quest towards general artificial intelligence (AI), researchers have explored developing loss functions that act as intrinsic motivators in the absence of external rewards. This paper argues that such research has overlooked an important and useful intrinsic motivator: social interaction. We posit that making an AI agent aware of implicit social feedback from humans can allow for faster learning of more generalizable and useful representations, and could potentially impact AI safety. We collect social feedback in the form of facial expression reactions to samples from Sketch RNN, an LSTM-based variational autoencoder (VAE) designed to produce sketch drawings. We use a Latent Constraints GAN (LC-GAN) to learn from the facial feedback of a small group of viewers, by optimizing the model to produce sketches that it predicts will lead to more positive facial expressions. We show in multiple independent evaluations that the model trained with facial feedback produces sketches that are more highly rated and that induce significantly more positive facial expressions. Thus, we establish that implicit social feedback can improve the output of a deep learning model. |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04877v2 |
http://arxiv.org/pdf/1802.04877v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-via-social-awareness-improving-a |
Repo | |
Framework | |
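The core mechanism here, optimizing a generative model so its samples score well under a learned facial-feedback predictor, can be illustrated with a simple latent-space ascent. The `decoder` and `reward_model` below stand in for Sketch RNN and the LC-GAN critic; the gradient-ascent form is an assumption for illustration, not the paper's exact training procedure.

```python
import torch

def optimize_latent(z0, decoder, reward_model, steps=50, lr=0.05):
    """Ascend the predicted positive-expression score in latent space.
    decoder and reward_model are assumed differentiable stand-ins."""
    z = z0.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        sketch = decoder(z)                  # decode latent to a sketch
        score = reward_model(sketch).mean()  # predicted positive reaction
        (-score).backward()                  # gradient ascent on the score
        opt.step()
        opt.zero_grad()
    return z.detach()
```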
Learning to guide task and motion planning using score-space representation
Title | Learning to guide task and motion planning using score-space representation |
Authors | Beomjoon Kim, Zi Wang, Leslie Pack Kaelbling, Tomas Lozano-Perez |
Abstract | In this paper, we propose a learning algorithm that speeds up the search in task and motion planning problems. Our algorithm proposes solutions to three different challenges that arise in learning to improve planning efficiency: what to predict, how to represent a planning problem instance, and how to transfer knowledge from one problem instance to another. We propose a method that predicts constraints on the search space based on a generic representation of a planning problem instance, called score-space, where we represent a problem instance in terms of the performance of a set of solutions attempted so far. Using this representation, we transfer knowledge, in the form of constraints, from previous problems based on the similarity in score space. We design a sequential algorithm that efficiently predicts these constraints, and evaluate it in three different challenging task and motion planning problems. Results indicate that our approach performs orders of magnitude faster than an unguided planner. |
Tasks | Motion Planning |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.09962v1 |
http://arxiv.org/pdf/1807.09962v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-guide-task-and-motion-planning |
Repo | |
Framework | |
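To make the score-space idea concrete, here is a hedged sketch: each past problem is represented by the score vector of a shared set of candidate solutions, and constraints transfer from the most similar past problem on the solutions evaluated so far. The scoring function and the constraint payloads are assumptions for illustration, not the paper's sequential prediction algorithm.

```python
import numpy as np

def most_similar_problem(partial_scores, library):
    """library: list of (full_score_vector, constraints) from past problems.
    partial_scores: dict {solution_index: score} observed on the new problem."""
    idx = list(partial_scores)
    observed = np.array([partial_scores[i] for i in idx])
    best, best_dist = None, np.inf
    for full_scores, constraints in library:
        # Compare only on the subset of solutions attempted so far.
        dist = np.linalg.norm(full_scores[idx] - observed)
        if dist < best_dist:
            best, best_dist = constraints, dist
    return best
```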
VH-HFCN based Parking Slot and Lane Markings Segmentation on Panoramic Surround View
Title | VH-HFCN based Parking Slot and Lane Markings Segmentation on Panoramic Surround View |
Authors | Yan Wu, Tao Yang, Junqiao Zhao, Linting Guan, Wei Jiang |
Abstract | Automatic parking is being developed intensively by car manufacturers and suppliers. To date, two problems have held it back. First, there are no openly available segmentation labels for parking slots on a panoramic surround view (PSV) dataset. Second, it remains difficult to detect parking slots and road structure robustly. Therefore, in this paper, we build a public PSV dataset and propose a highly fused convolutional network (HFCN) based segmentation method for parking slots and lane markings on it. A surround-view image is composed of four calibrated images captured by four fisheye cameras. We collect and label more than 4,200 surround-view images for this task, covering various illumination conditions and different types of parking slots. We propose a VH-HFCN network, which adopts an HFCN as its base and adds an efficient VH-stage for better segmenting various markings. The VH-stage consists of two independent linear convolution paths with vertical and horizontal convolution kernels, respectively. This modification enables the network to extract linear features robustly and precisely. We evaluated our model on the PSV dataset, and the results show outstanding performance in ground-marking segmentation. Based on the segmented markings, parking slots and lanes are obtained by skeletonization, the Hough line transform, and line arrangement. |
Tasks | |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.07027v2 |
http://arxiv.org/pdf/1804.07027v2.pdf | |
PWC | https://paperswithcode.com/paper/vh-hfcn-based-parking-slot-and-lane-markings |
Repo | |
Framework | |
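The VH-stage is described concretely enough to sketch. A minimal PyTorch version with two independent linear-convolution paths follows; the channel count, kernel length, and additive fusion are assumptions.

```python
import torch
import torch.nn as nn

class VHStage(nn.Module):
    """Two parallel linear-convolution paths: vertical (k x 1) and
    horizontal (1 x k) kernels for thin, line-like markings."""
    def __init__(self, channels=64, k=9):
        super().__init__()
        self.vertical = nn.Conv2d(channels, channels, kernel_size=(k, 1),
                                  padding=(k // 2, 0))
        self.horizontal = nn.Conv2d(channels, channels, kernel_size=(1, k),
                                    padding=(0, k // 2))

    def forward(self, x):
        # Each path responds strongly to lines of its own orientation.
        return torch.relu(self.vertical(x) + self.horizontal(x))

# feats = torch.randn(1, 64, 128, 128); out = VHStage()(feats)
```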
HeadOn: Real-time Reenactment of Human Portrait Videos
Title | HeadOn: Real-time Reenactment of Human Portrait Videos |
Authors | Justus Thies, Michael Zollhöfer, Christian Theobalt, Marc Stamminger, Matthias Nießner |
Abstract | We propose HeadOn, the first real-time source-to-target reenactment approach for complete human portrait videos that enables transfer of torso and head motion, face expression, and eye gaze. Given a short RGB-D video of the target actor, we automatically construct a personalized geometry proxy that embeds a parametric head, eye, and kinematic torso model. A novel real-time reenactment algorithm employs this proxy to photo-realistically map the captured motion from the source actor to the target actor. On top of the coarse geometric proxy, we propose a video-based rendering technique that composites the modified target portrait video via view- and pose-dependent texturing, and creates photo-realistic imagery of the target actor under novel torso and head poses, facial expressions, and gaze directions. To this end, we propose a robust tracking of the face and torso of the source actor. We extensively evaluate our approach and show significant improvements in enabling much greater flexibility in creating realistic reenacted output videos. |
Tasks | |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11729v1 |
http://arxiv.org/pdf/1805.11729v1.pdf | |
PWC | https://paperswithcode.com/paper/headon-real-time-reenactment-of-human |
Repo | |
Framework | |
Theory IIIb: Generalization in Deep Networks
Title | Theory IIIb: Generalization in Deep Networks |
Authors | Tomaso Poggio, Qianli Liao, Brando Miranda, Andrzej Banburski, Xavier Boix, Jack Hidary |
Abstract | A main puzzle of deep neural networks (DNNs) revolves around the apparent absence of “overfitting”, defined in this paper as follows: the expected error does not get worse when increasing the number of neurons or of iterations of gradient descent. This is surprising because of the large capacity demonstrated by DNNs to fit randomly labeled data and the absence of explicit regularization. Recent results by Srebro et al. provide a satisfying solution to the puzzle for linear networks used in binary classification. They prove that minimization of loss functions such as the logistic, the cross-entropy and the exp-loss yields asymptotic, “slow” convergence to the maximum margin solution for linearly separable datasets, independently of the initial conditions. Here we prove a similar result for nonlinear multilayer DNNs near zero minima of the empirical loss. The result holds for exponential-type losses but not for the square loss. In particular, we prove that the weight matrix at each layer of a deep network converges to a minimum norm solution up to a scale factor (in the separable case). Our analysis of the dynamical system corresponding to gradient descent of a multilayer network suggests a simple criterion for ranking the generalization performance of different zero minimizers of the empirical loss. |
Tasks | |
Published | 2018-06-29 |
URL | http://arxiv.org/abs/1806.11379v1 |
http://arxiv.org/pdf/1806.11379v1.pdf | |
PWC | https://paperswithcode.com/paper/theory-iiib-generalization-in-deep-networks |
Repo | |
Framework | |
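For reference, the separable linear case credited to Srebro et al. can be stated compactly (standard notation assumed for illustration; the paper's multilayer result is the layer-wise analogue, up to a scale factor):

```latex
% Gradient descent on an exponential-type loss over a linearly separable
% dataset \{(x_i, y_i)\}_{i=1}^n with y_i \in \{-1, +1\}:
L(w) = \sum_{i=1}^{n} e^{-y_i \, w^\top x_i},
\qquad
\frac{w(t)}{\lVert w(t) \rVert} \;\longrightarrow\;
\frac{w^\ast}{\lVert w^\ast \rVert}
\quad \text{as } t \to \infty,
```

where $w^\ast = \arg\min_w \lVert w \rVert^2$ subject to $y_i \, w^\top x_i \ge 1$ for all $i$, i.e. the maximum-margin solution.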
MagicVO: End-to-End Monocular Visual Odometry through Deep Bi-directional Recurrent Convolutional Neural Network
Title | MagicVO: End-to-End Monocular Visual Odometry through Deep Bi-directional Recurrent Convolutional Neural Network |
Authors | Jian Jiao, Jichao Jiao, Yaokai Mo, Weilun Liu, Zhongliang Deng |
Abstract | This paper proposes a new framework, called MagicVO, to solve the problem of monocular visual odometry. Based on a Convolutional Neural Network (CNN) and a bidirectional LSTM (Bi-LSTM), MagicVO outputs a 6-DoF absolute-scale pose at each camera position, taking a sequence of continuous monocular images as input. It not only leverages the outstanding performance of CNNs in image feature processing to fully extract rich features from image frames, but also learns the geometric relationships across preceding and succeeding frames through the Bi-LSTM to obtain more accurate predictions. A pipeline of MagicVO is shown in Fig. 1 of the paper. The MagicVO system is end-to-end, and experiments on the KITTI dataset and the ETH-asl cla dataset show that MagicVO outperforms traditional visual odometry (VO) systems in both pose accuracy and generalization ability. |
Tasks | Monocular Visual Odometry, Visual Odometry |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.10964v2 |
http://arxiv.org/pdf/1811.10964v2.pdf | |
PWC | https://paperswithcode.com/paper/magicvo-end-to-end-monocular-visual-odometry |
Repo | |
Framework | |
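A minimal PyTorch skeleton of the described CNN + Bi-LSTM pipeline follows; layer sizes and the pooling scheme are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class CnnBiLstmVO(nn.Module):
    def __init__(self, feat_dim=256, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 16, feat_dim))
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.pose_head = nn.Linear(2 * hidden, 6)  # translation + rotation

    def forward(self, frames):            # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        seq, _ = self.bilstm(feats)       # forward + backward context
        return self.pose_head(seq)        # (B, T, 6) per-step pose
```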
Learnable: Theory vs Applications
Title | Learnable: Theory vs Applications |
Authors | Marina Sapir |
Abstract | Two different views on the machine learning problem, applied learning (machine learning with business applications) and agnostic PAC learning, are formalized and compared here. I show that, under some conditions, the theory of PAC learnability provides a way to solve the applied learning problem. However, the theory requires training sets so large that they would make learning practically useless. I suggest shedding some theoretical misconceptions about learning to make the theory more aligned with the needs and experience of practitioners. |
Tasks | |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10681v1 |
http://arxiv.org/pdf/1807.10681v1.pdf | |
PWC | https://paperswithcode.com/paper/learnable-theory-vs-applications |
Repo | |
Framework | |
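The scale of training set the author objects to can be seen in a textbook bound. One standard agnostic-PAC sample-complexity bound for a finite hypothesis class $H$ (not the paper's specific conditions) is:

```latex
% To guarantee, with probability at least 1 - \delta, that every h in H has
% empirical risk within \epsilon of its true risk (Hoeffding + union bound),
% it suffices that
m \;\ge\; \frac{1}{2\epsilon^{2}} \ln\frac{2\lvert H \rvert}{\delta}.
```

For small $\epsilon$ the $1/\epsilon^{2}$ factor dominates, which is the practical blow-up in required training-set size that the abstract objects to.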
Guided Feature Selection for Deep Visual Odometry
Title | Guided Feature Selection for Deep Visual Odometry |
Authors | Fei Xue, Qiuyuan Wang, Xin Wang, Wei Dong, Junqiu Wang, Hongbin Zha |
Abstract | We present a novel end-to-end visual odometry architecture with guided feature selection based on deep convolutional recurrent neural networks. Unlike current monocular visual odometry methods, our approach is built on the intuition that features contribute discriminatively to different motion patterns. Specifically, we propose a dual-branch recurrent network to learn rotation and translation separately, leveraging a Convolutional Neural Network (CNN) for feature representation and a Recurrent Neural Network (RNN) for image-sequence reasoning. To enhance feature selection, we further introduce an effective context-aware guidance mechanism that forces each branch to explicitly distill the information relevant to its specific motion pattern. Experiments demonstrate that on the prevalent KITTI and ICL_NUIM benchmarks, our method outperforms current state-of-the-art model-based and learning-based methods for both decoupled and joint camera pose recovery. |
Tasks | Feature Selection, Monocular Visual Odometry, Visual Odometry |
Published | 2018-11-25 |
URL | http://arxiv.org/abs/1811.09935v1 |
http://arxiv.org/pdf/1811.09935v1.pdf | |
PWC | https://paperswithcode.com/paper/guided-feature-selection-for-deep-visual |
Repo | |
Framework | |
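The context-aware guidance mechanism suggests a simple gating sketch: each branch re-weights shared CNN features before its own recurrent network, so rotation and translation can attend to different cues. The sigmoid gate below is an assumed form, not the paper's exact mechanism.

```python
import torch
import torch.nn as nn

class GuidedBranch(nn.Module):
    def __init__(self, feat_dim=256, hidden=128, out_dim=3):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.Sigmoid())
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, feats):                # feats: (B, T, feat_dim)
        selected = feats * self.gate(feats)  # per-branch feature selection
        seq, _ = self.rnn(selected)
        return self.head(seq)

# rotation = GuidedBranch(out_dim=3); translation = GuidedBranch(out_dim=3)
```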
IMMIGRATE: A Margin-based Feature Selection Method with Interaction Terms
Title | IMMIGRATE: A Margin-based Feature Selection Method with Interaction Terms |
Authors | Ruzhang Zhao, Pengyu Hong, Jun S Liu |
Abstract | Relief-based algorithms have often been claimed to uncover feature interactions. However, it remains unclear whether and how interaction terms can be differentiated from marginal effects. In this paper, we propose the IMMIGRATE algorithm, which includes and trains weights for interaction terms. Besides applying the large-margin principle, we focus on the robustness of the contributors to the margin and consider local and global information simultaneously. Moreover, IMMIGRATE enjoys attractive properties, such as robustness and compatibility with boosting. We evaluate the proposed method on several tasks, on which it achieves state-of-the-art results. |
Tasks | Feature Selection |
Published | 2018-10-05 |
URL | https://arxiv.org/abs/1810.02658v3 |
https://arxiv.org/pdf/1810.02658v3.pdf | |
PWC | https://paperswithcode.com/paper/immigrate-a-margin-based-feature-selection |
Repo | |
Framework | |
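A hedged sketch of a margin with interaction terms: using a full symmetric weight matrix in the distance lets off-diagonal entries capture pairwise feature interactions, while the diagonal recovers Relief-style marginal weights. The quadratic-form distance and the near-hit/near-miss margin below are assumptions about the formulation, for illustration only.

```python
import numpy as np

def quad_dist(a, b, W):
    """Quadratic-form distance; W is a symmetric PSD weight matrix whose
    off-diagonal entries weight pairwise feature interactions."""
    d = a - b
    return d @ W @ d

def hypothesis_margin(x, near_hit, near_miss, W):
    # Larger margin: x sits closer to its near-hit (same class) than
    # to its near-miss (other class) under the learned weights.
    return quad_dist(x, near_miss, W) - quad_dist(x, near_hit, W)
```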
Unveiling the Power of Deep Tracking
Title | Unveiling the Power of Deep Tracking |
Authors | Goutam Bhat, Joakim Johnander, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg |
Abstract | In the field of generic object tracking numerous attempts have been made to exploit deep features. Despite all expectations, deep trackers are yet to reach an outstanding level of performance compared to methods solely based on handcrafted features. In this paper, we investigate this key issue and propose an approach to unlock the true potential of deep features for tracking. We systematically study the characteristics of both deep and shallow features, and their relation to tracking accuracy and robustness. We identify the limited data and low spatial resolution as the main challenges, and propose strategies to counter these issues when integrating deep features for tracking. Furthermore, we propose a novel adaptive fusion approach that leverages the complementary properties of deep and shallow features to improve both robustness and accuracy. Extensive experiments are performed on four challenging datasets. On VOT2017, our approach significantly outperforms the top performing tracker from the challenge with a relative gain of 17% in EAO. |
Tasks | Object Tracking |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.06833v1 |
http://arxiv.org/pdf/1804.06833v1.pdf | |
PWC | https://paperswithcode.com/paper/unveiling-the-power-of-deep-tracking |
Repo | |
Framework | |
How Do Classifiers Induce Agents To Invest Effort Strategically?
Title | How Do Classifiers Induce Agents To Invest Effort Strategically? |
Authors | Jon Kleinberg, Manish Raghavan |
Abstract | Algorithms are often used to produce decision-making rules that classify or evaluate individuals. When these individuals have incentives to be classified a certain way, they may behave strategically to influence their outcomes. We develop a model for how strategic agents can invest effort in order to change the outcomes they receive, and we give a tight characterization of when such agents can be incentivized to invest specified forms of effort into improving their outcomes as opposed to “gaming” the classifier. We show that whenever any “reasonable” mechanism can do so, a simple linear mechanism suffices. |
Tasks | Decision Making |
Published | 2018-07-13 |
URL | https://arxiv.org/abs/1807.05307v5 |
https://arxiv.org/pdf/1807.05307v5.pdf | |
PWC | https://paperswithcode.com/paper/how-do-classifiers-induce-agents-to-invest |
Repo | |
Framework | |
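The model's flavor can be shown with a toy best-response computation: effort moves features through a conversion matrix, a linear mechanism scores the result, and the agent pays a quadratic cost. All specifics below are illustrative assumptions, not the paper's characterization.

```python
import numpy as np

def best_response(w, A, cost=1.0):
    """w: (d,) linear mechanism; A: (d, k) effect of action j on feature i.
    With quadratic cost c(e) = cost/2 * ||e||^2, maximizing
    w @ (A @ e) - c(e) gives the closed form e* = (1/cost) * A.T @ w,
    projected onto non-negative effort."""
    e_star = (A.T @ w) / cost
    return np.maximum(e_star, 0.0)  # effort cannot be negative

w = np.array([1.0, 0.5])            # mechanism's feature weights
A = np.array([[1.0, 0.2],           # action 1 mostly "improvement",
              [0.1, 1.0]])          # action 2 mostly "gaming"
print(best_response(w, A))          # where the agent invests effort
```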
Deep Learning based Retinal OCT Segmentation
Title | Deep Learning based Retinal OCT Segmentation |
Authors | Mike Pekala, Neil Joshi, David E. Freund, Neil M. Bressler, Delia Cabrera DeBuc, Philippe M Burlina |
Abstract | Our objective is to evaluate the efficacy of methods that use deep learning (DL) for the automatic fine-grained segmentation of optical coherence tomography (OCT) images of the retina. OCT images from 10 patients with mild non-proliferative diabetic retinopathy were used from a public (U. of Miami) dataset. For each patient, five images were available: one image of the fovea center, two images of the perifovea, and two images of the parafovea. For each image, two expert graders each manually annotated five retinal surfaces (i.e. boundaries between pairs of retinal layers). The first grader’s annotations were used as ground truth and the second grader’s annotations to compute inter-operator agreement. The proposed automated approach segments images using fully convolutional networks (FCNs) together with Gaussian process (GP)-based regression as a post-processing step to improve the quality of the estimates. Using 10-fold cross validation, the performance of the algorithms is determined by computing the per-pixel unsigned error (distance) between the automated estimates and the ground truth annotations generated by the first manual grader. We compare the proposed method against five state of the art automatic segmentation techniques. The results show that the proposed methods compare favorably with state of the art techniques, resulting in the smallest mean unsigned error values and associated standard deviations, and performance is comparable with human annotation of retinal layers from OCT when there is only mild retinopathy. The results suggest that semantic segmentation using FCNs, coupled with regression-based post-processing, can effectively solve the OCT segmentation problem on par with human capabilities with mild retinopathy. |
Tasks | Semantic Segmentation |
Published | 2018-01-29 |
URL | http://arxiv.org/abs/1801.09749v1 |
http://arxiv.org/pdf/1801.09749v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-based-retinal-oct-segmentation |
Repo | |
Framework | |
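The GP-based post-processing step admits a short sketch with scikit-learn: smooth each retinal boundary's noisy per-column estimate along the image width. Kernel choice and noise level are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def smooth_boundary(columns, raw_depths):
    """columns: (N,) image-column indices; raw_depths: (N,) noisy per-column
    boundary rows taken from the FCN's segmentation output."""
    kernel = RBF(length_scale=20.0) + WhiteKernel(noise_level=1.0)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gp.fit(columns.reshape(-1, 1), raw_depths)
    return gp.predict(columns.reshape(-1, 1))  # smoothed boundary curve

# Synthetic demo: a gently curving boundary with per-column noise.
cols = np.arange(0, 512, dtype=float)
noisy = 100 + 5 * np.sin(cols / 50) + np.random.randn(512)
smooth = smooth_boundary(cols, noisy)
```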
An Enhanced BPSO based Approach for Service Placement in Hybrid Cloud
Title | An Enhanced BPSO based Approach for Service Placement in Hybrid Cloud |
Authors | Wissem Abbes, Zied Kechaou, Adel M. Alimi |
Abstract | Due to the challenges of competition and a rapidly evolving market, companies need to be innovative and agile, particularly with regard to the web applications used by their customers. Hybrid cloud now stands as an attractive solution, as organizations tend to use a combination of private and public cloud implementations according to their needs, applying the available resources and speed of execution profitably. In such a case, deploying a new application entails placing some components in the private cloud while reserving others for the public cloud. In this respect, our primary goal in this paper is to minimize the extra costs incurred by the public cloud options, along with the costs of maintaining communication between the private and public clouds. Our second objective is to reduce the execution time of the decision process for selecting the optimal service-placement solution. For this purpose, we propose a novel Binary Particle Swarm Optimization (BPSO) based approach for effective service-placement optimization within a hybrid cloud. Using a real benchmark, the experimental results reveal that our proposed approach outperforms those documented in the state of the art in terms of both cost and time. |
Tasks | |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.05971v1 |
http://arxiv.org/pdf/1806.05971v1.pdf | |
PWC | https://paperswithcode.com/paper/an-enhanced-bpso-based-approach-for-service |
Repo | |
Framework | |
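Standard binary PSO, the core of the proposed approach, is compact enough to sketch: velocities update as in continuous PSO, and a sigmoid of the velocity gives each bit's flip probability (the classic Kennedy-Eberhart rule). Each bit decides whether a component goes to the public (1) or private (0) cloud; the cost function is a placeholder for the paper's cost model.

```python
import numpy as np

def bpso(cost_fn, n_bits, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5):
    rng = np.random.default_rng(0)
    x = rng.integers(0, 2, (n_particles, n_bits)).astype(float)
    v = rng.uniform(-1, 1, (n_particles, n_bits))
    pbest, pbest_cost = x.copy(), np.array([cost_fn(p) for p in x])
    gbest = pbest[pbest_cost.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(v.shape), rng.random(v.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        # Sigmoid of velocity -> probability that each bit is set to 1.
        x = (rng.random(v.shape) < 1 / (1 + np.exp(-v))).astype(float)
        costs = np.array([cost_fn(p) for p in x])
        better = costs < pbest_cost
        pbest[better], pbest_cost[better] = x[better], costs[better]
        gbest = pbest[pbest_cost.argmin()].copy()
    return gbest, pbest_cost.min()

# Placeholder cost: e.g. a public-cloud fee per component plus a
# communication penalty whenever linked components are split across clouds.
```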
Force Estimation from OCT Volumes using 3D CNNs
Title | Force Estimation from OCT Volumes using 3D CNNs |
Authors | Nils Gessert, Jens Beringhoff, Christoph Otte, Alexander Schlaefer |
Abstract | Purpose: Estimating the interaction forces of instruments and tissue is of interest, particularly for providing haptic feedback during robot-assisted minimally invasive interventions. Different approaches based on external and integrated force sensors have been proposed; these are hampered by friction, sensor size, and sterilizability. We investigate a novel approach to estimate the force vector directly from optical coherence tomography image volumes. Methods: We introduce a novel Siamese 3D CNN architecture. The network takes an undeformed reference volume and a deformed sample volume as input and outputs the three components of the force vector. We employ a deep residual architecture with bottlenecks for increased efficiency. We compare the Siamese approach to methods using difference volumes and two-dimensional projections. Data were generated using a robotic setup to obtain ground-truth force vectors for silicone tissue phantoms as well as porcine tissue. Results: Our method achieves a mean average error of 7.7 ± 4.3 mN when estimating the force vector. Our novel Siamese 3D CNN architecture outperforms single-path methods, which achieve a mean average error of 11.59 ± 6.7 mN. Moreover, the use of volume data leads to significantly higher performance compared to processing only surface information, which achieves a mean average error of 24.38 ± 22.0 mN. On the tissue dataset, our method shows good generalization between different subjects. Conclusions: We propose a novel image-based force estimation method using optical coherence tomography. We illustrate that capturing the deformation of subsurface structures substantially improves force estimation. Our approach can provide accurate force estimates in surgical setups when using intraoperative optical coherence tomography. |
Tasks | |
Published | 2018-04-26 |
URL | http://arxiv.org/abs/1804.10002v1 |
http://arxiv.org/pdf/1804.10002v1.pdf | |
PWC | https://paperswithcode.com/paper/force-estimation-from-oct-volumes-using-3d |
Repo | |
Framework | |
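The Siamese structure is concrete enough for a skeleton: one shared 3D CNN encodes both the undeformed reference volume and the deformed sample volume, and a small head regresses the three force components. Layer sizes below omit the paper's residual bottlenecks and are assumptions.

```python
import torch
import torch.nn as nn

class SiameseForceNet(nn.Module):
    def __init__(self, emb=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(2), nn.Flatten(),
            nn.Linear(16 * 8, emb))
        self.head = nn.Linear(2 * emb, 3)  # (fx, fy, fz)

    def forward(self, reference, deformed):  # (B, 1, D, H, W) each
        z_ref = self.encoder(reference)       # shared weights: Siamese
        z_def = self.encoder(deformed)
        return self.head(torch.cat([z_ref, z_def], dim=1))
```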