Paper Group ANR 201
Content Aware Neural Style Transfer. Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient. Template Matching via Densities on the Roto-Translation Group. Dense Wide-Baseline Scene Flow From Two Handheld Video Cameras. Human Pose Estimation using Deep Consensus Voting. Defeating Image Obfuscation …
Content Aware Neural Style Transfer
Title | Content Aware Neural Style Transfer |
Authors | Rujie Yin |
Abstract | This paper presents a content-aware style transfer algorithm for paintings and photos of similar content using a pre-trained neural network, obtaining better results than previous work. In addition, the numerical experiments show that the style pattern and the content information are not completely separated by the neural network. |
Tasks | Style Transfer |
Published | 2016-01-18 |
URL | http://arxiv.org/abs/1601.04568v1 |
http://arxiv.org/pdf/1601.04568v1.pdf | |
PWC | https://paperswithcode.com/paper/content-aware-neural-style-transfer |
Repo | |
Framework | |
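Neural style transfer methods in this line of work (building on Gatys et al.) match feature statistics of a pre-trained network via Gram matrices. A minimal sketch of the standard Gram-matrix style loss, not the paper's specific content-aware variant:

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a feature map with shape (channels, height, width)."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

def style_loss(gen_features, style_features):
    """Squared Frobenius distance between Gram matrices of generated and
    style feature maps -- the style term common to Gatys-style transfer."""
    g1 = gram_matrix(gen_features)
    g2 = gram_matrix(style_features)
    return float(np.sum((g1 - g2) ** 2))
```

In a full pipeline this loss is summed over several network layers and minimized jointly with a content loss by backpropagating into the image pixels.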
Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient
Title | Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient |
Authors | Tianbao Yang, Lijun Zhang, Rong Jin, Jinfeng Yi |
Abstract | This work focuses on dynamic regret of online convex optimization that compares the performance of online learning to a clairvoyant who knows the sequence of loss functions in advance and hence selects the minimizer of the loss function at each step. By assuming that the clairvoyant moves slowly (i.e., the minimizers change slowly), we present several improved variation-based upper bounds of the dynamic regret under true and noisy gradient feedback, which are {\it optimal} in light of the presented lower bounds. The key to our analysis is to explore a regularity metric that measures the temporal changes in the clairvoyant’s minimizers, which we refer to as {\it path variation}. Firstly, we present a general lower bound in terms of the path variation, and then show that under full information or gradient feedback we are able to achieve an optimal dynamic regret. Secondly, we present a lower bound with noisy gradient feedback and then show that we can achieve optimal dynamic regrets under stochastic gradient feedback and two-point bandit feedback. Moreover, for a sequence of smooth loss functions that admit a small variation in the gradients, our dynamic regret under the two-point bandit feedback matches what is achieved with full information. |
Tasks | |
Published | 2016-05-16 |
URL | http://arxiv.org/abs/1605.04638v1 |
http://arxiv.org/pdf/1605.04638v1.pdf | |
PWC | https://paperswithcode.com/paper/tracking-slowly-moving-clairvoyant-optimal |
Repo | |
Framework | |
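The setting can be made concrete with projected online gradient descent on a toy loss sequence. A sketch under illustrative assumptions (quadratic losses f_t(x) = ||x - m_t||^2 whose minimizers m_t play the slowly moving clairvoyant; this is not the paper's general setting):

```python
import numpy as np

def dynamic_regret_ogd(minimizers, eta=0.5):
    """Run online gradient descent against quadratic losses
    f_t(x) = ||x - m_t||^2 and return (dynamic regret, path variation).
    Since f_t(m_t) = 0, the dynamic regret is just the sum of f_t(x_t);
    the path variation sums the distances between consecutive minimizers."""
    x = np.zeros_like(minimizers[0])
    regret = 0.0
    for m in minimizers:
        regret += float(np.sum((x - m) ** 2))  # f_t(x_t) - f_t(m_t)
        x = x - eta * 2.0 * (x - m)            # step along the true gradient
    path_var = sum(float(np.linalg.norm(a - b))
                   for a, b in zip(minimizers[1:], minimizers[:-1]))
    return regret, path_var
```

When the minimizers barely move (small path variation), the learner stays close to them and the accumulated regret stays small, which is the intuition the paper's variation-based bounds formalize.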
Template Matching via Densities on the Roto-Translation Group
Title | Template Matching via Densities on the Roto-Translation Group |
Authors | Erik J. Bekkers, Marco Loog, Bart M. ter Haar Romeny, Remco Duits |
Abstract | We propose a template matching method for the detection of 2D image objects that are characterized by orientation patterns. Our method is based on data representations via orientation scores, which are functions on the space of positions and orientations, and which are obtained via a wavelet-type transform. This new representation allows us to detect orientation patterns in an intuitive and direct way, namely via cross-correlations. Additionally, we propose a generalized linear regression framework for the construction of suitable templates using smoothing splines. Here, it is important to recognize a curved geometry on the position-orientation domain, which we identify with the Lie group SE(2): the roto-translation group. Templates are then optimized in a B-spline basis, and smoothness is defined with respect to the curved geometry. We achieve state-of-the-art results on three different applications: detection of the optic nerve head in the retina (99.83% success rate on 1737 images), of the fovea in the retina (99.32% success rate on 1616 images), and of the pupil in regular camera images (95.86% on 1521 images). The high performance is due to inclusion of both intensity and orientation features with effective geometric priors in the template matching. Moreover, our method is fast due to a cross-correlation based matching approach. |
Tasks | |
Published | 2016-03-10 |
URL | http://arxiv.org/abs/1603.03304v5 |
http://arxiv.org/pdf/1603.03304v5.pdf | |
PWC | https://paperswithcode.com/paper/template-matching-via-densities-on-the-roto |
Repo | |
Framework | |
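The matching step reduces to cross-correlation once a template is in hand. A minimal sketch of plain normalized cross-correlation over translations only (the paper additionally lifts images to the position-orientation domain SE(2), which is omitted here):

```python
import numpy as np

def match_template(image, template):
    """Translation-only normalized cross-correlation template matching.
    Returns the (row, col) of the best-scoring window."""
    th, tw = template.shape
    t = template - template.mean()
    t /= (np.linalg.norm(t) + 1e-12)
    best, best_pos = -np.inf, (0, 0)
    for i in range(image.shape[0] - th + 1):
        for j in range(image.shape[1] - tw + 1):
            patch = image[i:i+th, j:j+tw]
            p = patch - patch.mean()
            score = float(np.sum(p * t) / (np.linalg.norm(p) + 1e-12))
            if score > best:
                best, best_pos = score, (i, j)
    return best_pos
```

The brute-force double loop is for clarity; practical implementations compute the correlation in the Fourier domain, which is part of why the paper's cross-correlation-based matching is fast.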
Dense Wide-Baseline Scene Flow From Two Handheld Video Cameras
Title | Dense Wide-Baseline Scene Flow From Two Handheld Video Cameras |
Authors | Christian Richardt, Hyeongwoo Kim, Levi Valgaerts, Christian Theobalt |
Abstract | We propose a new technique for computing dense scene flow from two handheld videos with wide camera baselines and different photometric properties due to different sensors or camera settings like exposure and white balance. Our technique innovates in two ways over existing methods: (1) it supports independently moving cameras, and (2) it computes dense scene flow for wide-baseline scenarios. We achieve this by combining state-of-the-art wide-baseline correspondence finding with a variational scene flow formulation. First, we compute dense, wide-baseline correspondences using DAISY descriptors for matching between cameras and over time. We then detect and replace occluded pixels in the correspondence fields using a novel edge-preserving Laplacian correspondence completion technique. We finally refine the computed correspondence fields in a variational scene flow formulation. We show dense scene flow results computed from challenging datasets with independently moving, handheld cameras of varying camera settings. |
Tasks | |
Published | 2016-09-16 |
URL | http://arxiv.org/abs/1609.05115v1 |
http://arxiv.org/pdf/1609.05115v1.pdf | |
PWC | https://paperswithcode.com/paper/dense-wide-baseline-scene-flow-from-two |
Repo | |
Framework | |
Human Pose Estimation using Deep Consensus Voting
Title | Human Pose Estimation using Deep Consensus Voting |
Authors | Ita Lifshitz, Ethan Fetaya, Shimon Ullman |
Abstract | In this paper we consider the problem of human pose estimation from a single still image. We propose a novel approach where each location in the image votes for the position of each keypoint using a convolutional neural net. The voting scheme allows us to utilize information from the whole image, rather than rely on a sparse set of keypoint locations. Using dense, multi-target votes not only produces good keypoint predictions, but also enables us to compute image-dependent joint keypoint probabilities by looking at consensus voting. This differs from most previous methods, where joint probabilities are learned from relative keypoint locations and are independent of the image. We finally combine the keypoint votes and joint probabilities in order to identify the optimal pose configuration. We show our competitive performance on the MPII Human Pose and Leeds Sports Pose datasets. |
Tasks | Pose Estimation |
Published | 2016-03-27 |
URL | http://arxiv.org/abs/1603.08212v1 |
http://arxiv.org/pdf/1603.08212v1.pdf | |
PWC | https://paperswithcode.com/paper/human-pose-estimation-using-deep-consensus |
Repo | |
Framework | |
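The consensus step can be pictured as accumulating per-pixel votes into a heatmap and taking its peak. A sketch under stated assumptions (the votes here are given directly, whereas in the paper they are produced by the convolutional network):

```python
import numpy as np

def aggregate_votes(votes, weights, shape):
    """Accumulate weighted votes for one keypoint's location into a
    heatmap and return the consensus (row, col). votes[i] is the absolute
    position predicted by the i-th voting location, weights[i] its
    confidence."""
    heatmap = np.zeros(shape)
    for (r, c), w in zip(votes, weights):
        if 0 <= r < shape[0] and 0 <= c < shape[1]:
            heatmap[r, c] += w
    return np.unravel_index(np.argmax(heatmap), shape)
```

Because every image location contributes a vote, the peak reflects global image evidence rather than a sparse set of detections, which is the point of the dense voting scheme.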
Defeating Image Obfuscation with Deep Learning
Title | Defeating Image Obfuscation with Deep Learning |
Authors | Richard McPherson, Reza Shokri, Vitaly Shmatikov |
Abstract | We demonstrate that modern image recognition methods based on artificial neural networks can recover hidden information from images protected by various forms of obfuscation. The obfuscation techniques considered in this paper are mosaicing (also known as pixelation), blurring (as used by YouTube), and P3, a recently proposed system for privacy-preserving photo sharing that encrypts the significant JPEG coefficients to make images unrecognizable by humans. We empirically show how to train artificial neural networks to successfully identify faces and recognize objects and handwritten digits even if the images are protected using any of the above obfuscation techniques. |
Tasks | |
Published | 2016-09-01 |
URL | http://arxiv.org/abs/1609.00408v2 |
http://arxiv.org/pdf/1609.00408v2.pdf | |
PWC | https://paperswithcode.com/paper/defeating-image-obfuscation-with-deep |
Repo | |
Framework | |
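Mosaicing, the first obfuscation technique the paper attacks, simply replaces each block of pixels by its mean. A minimal sketch for a single-channel image:

```python
import numpy as np

def mosaic(image, block=8):
    """Mosaicing (pixelation): replace each block x block window with its
    mean value. Note that averaging only coarsens the information rather
    than destroying it, which is one reason a trained network can still
    classify the obfuscated result."""
    h, w = image.shape
    out = image.astype(float).copy()
    for i in range(0, h, block):
        for j in range(0, w, block):
            out[i:i+block, j:j+block] = out[i:i+block, j:j+block].mean()
    return out
```

The attack in the paper is then straightforward: apply the obfuscation to a labelled training set and train a standard classifier on the obfuscated images.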
Open Information Extraction
Title | Open Information Extraction |
Authors | Duc-Thuan Vo, Ebrahim Bagheri |
Abstract | Open Information Extraction (Open IE) systems aim to obtain relation tuples through highly scalable extraction that is portable across domains, by identifying a variety of relation phrases and their arguments in arbitrary sentences. The first generation of Open IE learns linear-chain models based on unlexicalized features, such as Part-of-Speech (POS) or shallow tags, to label the intermediate words between pairs of potential arguments for identifying extractable relations. The second generation of Open IE is able to extract instances of the most frequently observed relation types, such as Verb, Noun and Prep, Verb and Prep, and Infinitive, with deep linguistic analysis. These systems exploit simple yet principled ways in which verbs express relationships, such as verb-phrase-based extraction or clause-based extraction, and obtain significantly higher performance than first-generation systems. In this paper, we provide an overview of the two Open IE generations, including their strengths, weaknesses and application areas. |
Tasks | Open Information Extraction |
Published | 2016-07-10 |
URL | http://arxiv.org/abs/1607.02784v1 |
http://arxiv.org/pdf/1607.02784v1.pdf | |
PWC | https://paperswithcode.com/paper/open-information-extraction |
Repo | |
Framework | |
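The verb-based extraction idea can be illustrated with a toy extractor that splits a sentence at a known relation verb and treats the flanking spans as arguments. This is only a stand-in for the surveyed systems (e.g. ReVerb, OLLIE), which use POS patterns or clause structure rather than a hard-coded verb list:

```python
VERBS = {"founded", "acquired", "created", "wrote"}  # illustrative list

def extract_triples(sentence):
    """Toy verb-based Open IE: return an (arg1, relation, arg2) tuple if a
    known relation verb appears with text on both sides, else None."""
    tokens = sentence.rstrip(".").split()
    for i, tok in enumerate(tokens):
        if tok.lower() in VERBS and 0 < i < len(tokens) - 1:
            return (" ".join(tokens[:i]), tok, " ".join(tokens[i + 1:]))
    return None
```

Real second-generation systems generalize this pattern with deep linguistic analysis, so that multi-word relation phrases and nested clauses are handled rather than a fixed verb inventory.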
Indebted households profiling: a knowledge discovery from database approach
Title | Indebted households profiling: a knowledge discovery from database approach |
Authors | Rodrigo Scarpel, Alexandros Ladas, Uwe Aickelin |
Abstract | A major challenge in consumer credit risk portfolio management is to classify households according to their risk profile. Building such risk profiles requires an approach that analyses data systematically in order to detect important relationships, interactions, dependencies and associations amongst the available continuous and categorical variables, and that accurately generates profiles of the most interesting household segments according to their credit risk. The objective of this work is to employ a knowledge discovery from database process to identify groups of indebted households and describe their profiles, using a database collected by the Consumer Credit Counselling Service (CCCS) in the UK. Employing a framework that uses categorical and continuous data together to find hidden structures in unlabelled data, we establish the ideal number of clusters and describe them in order to identify the households who exhibit a high propensity for excessive debt levels. |
Tasks | |
Published | 2016-07-20 |
URL | http://arxiv.org/abs/1607.05869v1 |
http://arxiv.org/pdf/1607.05869v1.pdf | |
PWC | https://paperswithcode.com/paper/indebted-households-profiling-a-knowledge |
Repo | |
Framework | |
A diffusion and clustering-based approach for finding coherent motions and understanding crowd scenes
Title | A diffusion and clustering-based approach for finding coherent motions and understanding crowd scenes |
Authors | Weiyao Lin, Yang Mi, Weiyue Wang, Jianxin Wu, Jingdong Wang, Tao Mei |
Abstract | This paper addresses the problem of detecting coherent motions in crowd scenes and presents two applications in crowd scene understanding: semantic region detection and recurrent activity mining. It processes input motion fields (e.g., optical flow fields) and produces a coherent motion field, named the thermal energy field. The thermal energy field is able to capture both motion correlation among particles and the motion trends of individual particles, which are helpful for discovering coherency among them. We further introduce a two-step clustering process to construct stable semantic regions from the extracted time-varying coherent motions. These semantic regions can be used to recognize pre-defined activities in crowd scenes. Finally, we introduce a cluster-and-merge process which automatically discovers recurrent activities in crowd scenes by clustering and merging the extracted coherent motions. Experiments on various videos demonstrate the effectiveness of our approach. |
Tasks | Optical Flow Estimation, Scene Understanding |
Published | 2016-02-16 |
URL | http://arxiv.org/abs/1602.04921v1 |
http://arxiv.org/pdf/1602.04921v1.pdf | |
PWC | https://paperswithcode.com/paper/a-diffusion-and-clustering-based-approach-for |
Repo | |
Framework | |
Achieving non-discrimination in data release
Title | Achieving non-discrimination in data release |
Authors | Lu Zhang, Yongkai Wu, Xintao Wu |
Abstract | Discrimination discovery and prevention/removal are increasingly important tasks in data mining. Discrimination discovery aims to unveil discriminatory practices on the protected attribute (e.g., gender) by analyzing the dataset of historical decision records, and discrimination prevention aims to remove discrimination by modifying the biased data before conducting predictive analysis. In this paper, we show that the key to discrimination discovery and prevention is to find the meaningful partitions that can be used to provide quantitative evidence for the judgment of discrimination. With the support of the causal graph, we present a graphical condition for identifying a meaningful partition. Based on that, we develop a simple criterion for the claim of non-discrimination, and propose discrimination removal algorithms which accurately remove discrimination while retaining good data utility. Experiments using real datasets show the effectiveness of our approaches. |
Tasks | |
Published | 2016-11-22 |
URL | http://arxiv.org/abs/1611.07438v1 |
http://arxiv.org/pdf/1611.07438v1.pdf | |
PWC | https://paperswithcode.com/paper/achieving-non-discrimination-in-data-release |
Repo | |
Framework | |
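A basic quantitative evidence measure in discrimination discovery is the risk difference between protected and unprotected groups. A sketch of that statistic only; the paper refines such measures with causal-graph-based partitions, which are not reproduced here (the field names are illustrative):

```python
def risk_difference(records, protected, outcome):
    """Risk difference: P(positive outcome | unprotected group) minus
    P(positive outcome | protected group), computed from a list of dict
    records. A large positive value is prima facie evidence of
    discrimination against the protected group."""
    pos = {True: 0, False: 0}
    tot = {True: 0, False: 0}
    for rec in records:
        g = bool(rec[protected])
        tot[g] += 1
        pos[g] += int(bool(rec[outcome]))
    p_prot = pos[True] / tot[True] if tot[True] else 0.0
    p_unprot = pos[False] / tot[False] if tot[False] else 0.0
    return p_unprot - p_prot
```

The paper's point is that such statistics are only meaningful when computed within an appropriately chosen partition of the data, which is where the causal graph comes in.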
FlyCap: Markerless Motion Capture Using Multiple Autonomous Flying Cameras
Title | FlyCap: Markerless Motion Capture Using Multiple Autonomous Flying Cameras |
Authors | Lan Xu, Lu Fang, Wei Cheng, Kaiwen Guo, Guyue Zhou, Qionghai Dai, Yebin Liu |
Abstract | Aiming at automatic, convenient and non-intrusive motion capture, this paper presents a new-generation markerless motion capture technique, the FlyCap system, to capture surface motions of moving characters using multiple autonomous flying cameras (autonomous unmanned aerial vehicles (UAVs), each integrated with an RGBD video camera). During data capture, three cooperative flying cameras automatically track and follow the moving target, who performs large-scale motions in a wide space. We propose a novel non-rigid surface registration method to track and fuse the depth data from the three flying cameras for surface motion tracking of the moving target, and simultaneously calculate the pose of each flying camera. We leverage the visual-odometry information provided by the UAV platform, and formulate the surface tracking problem as a non-linear objective function that can be linearized and effectively minimized through a Gauss-Newton method. Quantitative and qualitative experimental results demonstrate competent and plausible surface and motion reconstruction results. |
Tasks | Markerless Motion Capture, Motion Capture, Visual Odometry |
Published | 2016-10-29 |
URL | http://arxiv.org/abs/1610.09534v3 |
http://arxiv.org/pdf/1610.09534v3.pdf | |
PWC | https://paperswithcode.com/paper/flycap-markerless-motion-capture-using |
Repo | |
Framework | |
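Gauss-Newton minimization of a non-linear least-squares objective is the generic solver class the paper applies to its linearized surface-tracking energy. A minimal sketch of the iteration itself, with the actual energy terms omitted:

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, iters=20):
    """Minimize 0.5 * ||r(x)||^2 by repeatedly linearizing r and solving
    the normal equations J^T J dx = -J^T r for the update dx."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residual(x)
        J = jacobian(x)
        dx = np.linalg.solve(J.T @ J, -J.T @ r)
        x = x + dx
    return x
```

For a residual that is already linear, one iteration recovers the least-squares solution; for the non-linear tracking energy in the paper, the iteration is repeated until convergence, typically with damping or a line search in robust implementations.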
General Automatic Human Shape and Motion Capture Using Volumetric Contour Cues
Title | General Automatic Human Shape and Motion Capture Using Volumetric Contour Cues |
Authors | Helge Rhodin, Nadia Robertini, Dan Casas, Christian Richardt, Hans-Peter Seidel, Christian Theobalt |
Abstract | Markerless motion capture algorithms require a 3D body with properly personalized skeleton dimension and/or body shape and appearance to successfully track a person. Unfortunately, many tracking methods consider model personalization a different problem and use manual or semi-automatic model initialization, which greatly reduces applicability. In this paper, we propose a fully automatic algorithm that jointly creates a rigged actor model commonly used for animation - skeleton, volumetric shape, appearance, and optionally a body surface - and estimates the actor’s motion from multi-view video input only. The approach is rigorously designed to work on footage of general outdoor scenes recorded with very few cameras and without background subtraction. Our method uses a new image formation model with analytic visibility and analytically differentiable alignment energy. For reconstruction, 3D body shape is approximated as a Gaussian density field. For pose and shape estimation, we minimize a new edge-based alignment energy inspired by volume raycasting in an absorbing medium. We further propose a new statistical human body model that represents the body surface, volumetric Gaussian density, as well as variability in skeleton shape. Given any multi-view sequence, our method jointly optimizes the pose and shape parameters of this model fully automatically in a spatiotemporal way. |
Tasks | Markerless Motion Capture, Motion Capture |
Published | 2016-07-28 |
URL | http://arxiv.org/abs/1607.08659v2 |
http://arxiv.org/pdf/1607.08659v2.pdf | |
PWC | https://paperswithcode.com/paper/general-automatic-human-shape-and-motion |
Repo | |
Framework | |
On the Sample Complexity of End-to-end Training vs. Semantic Abstraction Training
Title | On the Sample Complexity of End-to-end Training vs. Semantic Abstraction Training |
Authors | Shai Shalev-Shwartz, Amnon Shashua |
Abstract | We compare the end-to-end training approach to a modular approach in which a system is decomposed into semantically meaningful components. We focus on the sample complexity aspect, in the regime where an extremely high accuracy is necessary, as is the case in autonomous driving applications. We demonstrate cases in which the number of training examples required by the end-to-end approach is exponentially larger than the number of examples required by the semantic abstraction approach. |
Tasks | Autonomous Driving |
Published | 2016-04-23 |
URL | http://arxiv.org/abs/1604.06915v1 |
http://arxiv.org/pdf/1604.06915v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-sample-complexity-of-end-to-end |
Repo | |
Framework | |
A Comparative Study of Algorithms for Realtime Panoramic Video Blending
Title | A Comparative Study of Algorithms for Realtime Panoramic Video Blending |
Authors | Zhe Zhu, Jiaming Lu, Minxuan Wang, Songhai Zhang, Ralph Martin, Hantao Liu, Shimin Hu |
Abstract | Unlike image blending algorithms, video blending algorithms have been little studied. In this paper, we investigate 6 popular blending algorithms—feather blending, multi-band blending, modified Poisson blending, mean value coordinate blending, multi-spline blending and convolution pyramid blending. We consider in particular realtime panoramic video blending, a key problem in various virtual reality tasks. To evaluate the performance of the 6 algorithms on this problem, we have created a video benchmark of several videos captured under various conditions. We analyze the time and memory needed by the above 6 algorithms, for both CPU and GPU implementations (where readily parallelizable). The visual quality provided by these algorithms is also evaluated both objectively and subjectively. The video benchmark and algorithm implementations are publicly available. |
Tasks | |
Published | 2016-06-01 |
URL | http://arxiv.org/abs/1606.00103v2 |
http://arxiv.org/pdf/1606.00103v2.pdf | |
PWC | https://paperswithcode.com/paper/a-comparative-study-of-algorithms-for |
Repo | |
Framework | |
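Feather blending, the simplest of the six compared algorithms, weights each overlapping image by a per-pixel mask and normalizes. A minimal sketch for single-channel images, with the masks given directly (in practice they are usually distance-to-border maps):

```python
import numpy as np

def feather_blend(img_a, img_b, mask_a, mask_b):
    """Feather blending: per-pixel weighted average of two overlapping
    images, with the weights normalized to sum to one at every pixel."""
    w = mask_a + mask_b
    w = np.where(w == 0, 1.0, w)  # avoid division by zero outside both images
    return (img_a * mask_a + img_b * mask_b) / w
```

Its low cost is why feather blending is a common realtime baseline; the more expensive algorithms in the comparison (multi-band, Poisson-based, and so on) trade runtime for fewer visible seams.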
A probabilistic tour of visual attention and gaze shift computational models
Title | A probabilistic tour of visual attention and gaze shift computational models |
Authors | Giuseppe Boccignone |
Abstract | In this paper a number of problems are considered which are related to the modelling of eye guidance under visual attention in a natural setting. From a crude discussion of a variety of available models spelled out in probabilistic terms, it appears that current approaches in computational vision are hitherto far from achieving the goal of an active observer relying upon eye guidance to accomplish real-world tasks. We argue that this challenging goal requires not only embodying, in a principled way, the problem of eye guidance within the action/perception loop, but also facing the inextricable link tying together visual attention, emotion and executive control, insofar as recent neurobiological findings are weighed up. |
Tasks | |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.01232v1 |
http://arxiv.org/pdf/1607.01232v1.pdf | |
PWC | https://paperswithcode.com/paper/a-probabilistic-tour-of-visual-attention-and |
Repo | |
Framework | |