May 6, 2019


Paper Group ANR 201

Content Aware Neural Style Transfer. Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient. Template Matching via Densities on the Roto-Translation Group. Dense Wide-Baseline Scene Flow From Two Handheld Video Cameras. Human Pose Estimation using Deep Consensus Voting. Defeating Image Obfuscation …

Content Aware Neural Style Transfer


Title	Content Aware Neural Style Transfer
Authors	Rujie Yin
Abstract	This paper presents a content-aware style transfer algorithm for paintings and photos of similar content using a pre-trained neural network, obtaining better results than the previous work. In addition, the numerical experiments show that the style pattern and the content information are not completely separated by the neural network.
Tasks	Style Transfer
Published	2016-01-18
URL	http://arxiv.org/abs/1601.04568v1
PDF	http://arxiv.org/pdf/1601.04568v1.pdf
PWC	https://paperswithcode.com/paper/content-aware-neural-style-transfer
Repo
Framework

Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient


Title	Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient
Authors	Tianbao Yang, Lijun Zhang, Rong Jin, Jinfeng Yi
Abstract	This work focuses on dynamic regret of online convex optimization that compares the performance of online learning to a clairvoyant who knows the sequence of loss functions in advance and hence selects the minimizer of the loss function at each step. By assuming that the clairvoyant moves slowly (i.e., the minimizers change slowly), we present several improved variation-based upper bounds of the dynamic regret under the true and noisy gradient feedback, which are optimal in light of the presented lower bounds. The key to our analysis is to explore a regularity metric that measures the temporal changes in the clairvoyant’s minimizers, to which we refer as path variation. Firstly, we present a general lower bound in terms of the path variation, and then show that under full information or gradient feedback we are able to achieve an optimal dynamic regret. Secondly, we present a lower bound with noisy gradient feedback and then show that we can achieve optimal dynamic regrets under a stochastic gradient feedback and two-point bandit feedback. Moreover, for a sequence of smooth loss functions that admit a small variation in the gradients, our dynamic regret under the two-point bandit feedback matches what is achieved with full information.
Tasks
Published	2016-05-16
URL	http://arxiv.org/abs/1605.04638v1
PDF	http://arxiv.org/pdf/1605.04638v1.pdf
PWC	https://paperswithcode.com/paper/tracking-slowly-moving-clairvoyant-optimal
Repo
Framework
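
As an illustration of the quantities named in the abstract above, the following minimal sketch runs plain online gradient descent on the one-dimensional quadratic losses f_t(w) = (w - c_t)^2 and reports the dynamic regret against the per-round minimizers c_t together with their path variation. The loss family, the step size, and the function names are assumptions made for this example; it is not the algorithm or the analysis from the paper.

```python
def online_gradient_descent(minimizers, step_size=0.25, w0=0.0):
    """Run OGD on the hypothetical losses f_t(w) = (w - c_t)^2.

    Returns the dynamic regret: the learner's cumulative loss minus the
    clairvoyant's, whose loss is 0 each round because it plays w = c_t.
    """
    w = w0
    dynamic_regret = 0.0
    for c in minimizers:
        # Loss suffered by the learner before it sees this round's gradient.
        dynamic_regret += (w - c) ** 2
        # True-gradient feedback, then a gradient step.
        grad = 2.0 * (w - c)
        w -= step_size * grad
    return dynamic_regret


def path_variation(minimizers):
    """Sum of distances between consecutive per-round minimizers."""
    return sum(abs(b - a) for a, b in zip(minimizers, minimizers[1:]))


slow = [0.0, 0.1, 0.2, 0.3]   # clairvoyant drifts slowly
fast = [0.0, 1.0, 0.0, 1.0]   # clairvoyant jumps every round

for name, seq in (("slow", slow), ("fast", fast)):
    print(name, path_variation(seq), online_gradient_descent(seq))
```

In this toy setting the slowly drifting sequence has both the smaller path variation and the smaller dynamic regret, which is the qualitative relationship the abstract describes.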

Template Matching via Densities on the Roto-Translation Group


Title	Template Matching via Densities on the Roto-Translation Group
Authors	Erik J. Bekkers, Marco Loog, Bart M. ter Haar Romeny, Remco Duits
Abstract	We propose a template matching method for the detection of 2D image objects that are characterized by orientation patterns. Our method is based on data representations via orientation scores, which are functions on the space of positions and orientations, and which are obtained via a wavelet-type transform. This new representation allows us to detect orientation patterns in an intuitive and direct way, namely via cross-correlations. Additionally, we propose a generalized linear regression framework for the construction of suitable templates using smoothing splines. Here, it is important to recognize a curved geometry on the position-orientation domain, which we identify with the Lie group SE(2): the roto-translation group. Templates are then optimized in a B-spline basis, and smoothness is defined with respect to the curved geometry. We achieve state-of-the-art results on three different applications: detection of the optic nerve head in the retina (99.83% success rate on 1737 images), of the fovea in the retina (99.32% success rate on 1616 images), and of the pupil in regular camera images (95.86% on 1521 images). The high performance is due to inclusion of both intensity and orientation features with effective geometric priors in the template matching. Moreover, our method is fast due to a cross-correlation based matching approach.
Tasks
Published	2016-03-10
URL	http://arxiv.org/abs/1603.03304v5
PDF	http://arxiv.org/pdf/1603.03304v5.pdf
PWC	https://paperswithcode.com/paper/template-matching-via-densities-on-the-roto
Repo
Framework

Dense Wide-Baseline Scene Flow From Two Handheld Video Cameras


Title	Dense Wide-Baseline Scene Flow From Two Handheld Video Cameras
Authors	Christian Richardt, Hyeongwoo Kim, Levi Valgaerts, Christian Theobalt
Abstract	We propose a new technique for computing dense scene flow from two handheld videos with wide camera baselines and different photometric properties due to different sensors or camera settings like exposure and white balance. Our technique innovates in two ways over existing methods: (1) it supports independently moving cameras, and (2) it computes dense scene flow for wide-baseline scenarios. We achieve this by combining state-of-the-art wide-baseline correspondence finding with a variational scene flow formulation. First, we compute dense, wide-baseline correspondences using DAISY descriptors for matching between cameras and over time. We then detect and replace occluded pixels in the correspondence fields using a novel edge-preserving Laplacian correspondence completion technique. We finally refine the computed correspondence fields in a variational scene flow formulation. We show dense scene flow results computed from challenging datasets with independently moving, handheld cameras of varying camera settings.
Tasks
Published	2016-09-16
URL	http://arxiv.org/abs/1609.05115v1
PDF	http://arxiv.org/pdf/1609.05115v1.pdf
PWC	https://paperswithcode.com/paper/dense-wide-baseline-scene-flow-from-two
Repo
Framework

Human Pose Estimation using Deep Consensus Voting


Title	Human Pose Estimation using Deep Consensus Voting
Authors	Ita Lifshitz, Ethan Fetaya, Shimon Ullman
Abstract	In this paper we consider the problem of human pose estimation from a single still image. We propose a novel approach where each location in the image votes for the position of each keypoint using a convolutional neural net. The voting scheme allows us to utilize information from the whole image, rather than rely on a sparse set of keypoint locations. Using dense, multi-target votes not only produces good keypoint predictions, but also enables us to compute image-dependent joint keypoint probabilities by looking at consensus voting. This differs from most previous methods where joint probabilities are learned from relative keypoint locations and are independent of the image. We finally combine the keypoint votes and joint probabilities in order to identify the optimal pose configuration. We show our competitive performance on the MPII Human Pose and Leeds Sports Pose datasets.
Tasks	Pose Estimation
Published	2016-03-27
URL	http://arxiv.org/abs/1603.08212v1
PDF	http://arxiv.org/pdf/1603.08212v1.pdf
PWC	https://paperswithcode.com/paper/human-pose-estimation-using-deep-consensus
Repo
Framework
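
The voting idea in the abstract above can be pictured with a small sketch: each image location casts a vote, expressed as an offset, for where a keypoint lies, and the votes are tallied to find the consensus position. The hand-written offsets below stand in for the network's predictions, and the function name and the hard one-vote-per-location scheme are assumptions for illustration, not the authors' model.

```python
from collections import Counter


def aggregate_votes(votes):
    """Tally votes for one keypoint.

    `votes` is an iterable of ((x, y), (dx, dy)) pairs: the location (x, y)
    votes for the keypoint being at (x + dx, y + dy).
    Returns the winning position and the fraction of votes it received.
    """
    tally = Counter()
    for (x, y), (dx, dy) in votes:
        tally[(x + dx, y + dy)] += 1
    position, count = tally.most_common(1)[0]
    return position, count / sum(tally.values())


# Hypothetical votes: three locations agree on (2, 3), one disagrees.
votes = [
    ((0, 0), (2, 3)),
    ((1, 1), (1, 2)),
    ((4, 4), (-2, -1)),
    ((5, 5), (0, 0)),
]
print(aggregate_votes(votes))
```

The returned fraction is a crude confidence score: the more locations agree, the higher it is.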

Defeating Image Obfuscation with Deep Learning


Title	Defeating Image Obfuscation with Deep Learning
Authors	Richard McPherson, Reza Shokri, Vitaly Shmatikov
Abstract	We demonstrate that modern image recognition methods based on artificial neural networks can recover hidden information from images protected by various forms of obfuscation. The obfuscation techniques considered in this paper are mosaicing (also known as pixelation), blurring (as used by YouTube), and P3, a recently proposed system for privacy-preserving photo sharing that encrypts the significant JPEG coefficients to make images unrecognizable by humans. We empirically show how to train artificial neural networks to successfully identify faces and recognize objects and handwritten digits even if the images are protected using any of the above obfuscation techniques.
Tasks
Published	2016-09-01
URL	http://arxiv.org/abs/1609.00408v2
PDF	http://arxiv.org/pdf/1609.00408v2.pdf
PWC	https://paperswithcode.com/paper/defeating-image-obfuscation-with-deep
Repo
Framework

Open Information Extraction


Title	Open Information Extraction
Authors	Duc-Thuan Vo, Ebrahim Bagheri
Abstract	Open Information Extraction (Open IE) systems aim to obtain relation tuples through highly scalable extraction that is portable across domains, by identifying a variety of relation phrases and their arguments in arbitrary sentences. The first generation of Open IE learns linear chain models based on unlexicalized features such as Part-of-Speech (POS) or shallow tags to label the intermediate words between pairs of potential arguments for identifying extractable relations. Open IE is currently in its second generation, which is able to extract instances of the most frequently observed relation types such as Verb, Noun and Prep, Verb and Prep, and Infinitive with deep linguistic analysis. These systems expose simple yet principled ways in which verbs express relationships in linguistics such as verb phrase-based extraction or clause-based extraction. They obtain a significantly higher performance over previous systems in the first generation. In this paper, we describe an overview of two Open IE generations including strengths, weaknesses and application areas.
Tasks	Open Information Extraction
Published	2016-07-10
URL	http://arxiv.org/abs/1607.02784v1
PDF	http://arxiv.org/pdf/1607.02784v1.pdf
PWC	https://paperswithcode.com/paper/open-information-extraction
Repo
Framework

Indebted households profiling: a knowledge discovery from database approach


Title	Indebted households profiling: a knowledge discovery from database approach
Authors	Rodrigo Scarpel, Alexandros Ladas, Uwe Aickelin
Abstract	A major challenge in consumer credit risk portfolio management is to classify households according to their risk profile. In order to build such risk profiles it is necessary to employ an approach that analyses data systematically in order to detect important relationships, interactions, dependencies and associations amongst the available continuous and categorical variables altogether and accurately generate profiles of most interesting household segments according to their credit risk. The objective of this work is to employ a knowledge discovery from database process to identify groups of indebted households and describe their profiles using a database collected by the Consumer Credit Counselling Service (CCCS) in the UK. Employing a framework that allows the usage of both categorical and continuous data altogether to find hidden structures in unlabelled data, the ideal number of clusters was established, and these clusters were described in order to identify the households who exhibit a high propensity of excessive debt levels.
Tasks
Published	2016-07-20
URL	http://arxiv.org/abs/1607.05869v1
PDF	http://arxiv.org/pdf/1607.05869v1.pdf
PWC	https://paperswithcode.com/paper/indebted-households-profiling-a-knowledge
Repo
Framework

A diffusion and clustering-based approach for finding coherent motions and understanding crowd scenes


Title	A diffusion and clustering-based approach for finding coherent motions and understanding crowd scenes
Authors	Weiyao Lin, Yang Mi, Weiyue Wang, Jianxin Wu, Jingdong Wang, Tao Mei
Abstract	This paper addresses the problem of detecting coherent motions in crowd scenes and presents its two applications in crowd scene understanding: semantic region detection and recurrent activity mining. It processes input motion fields (e.g., optical flow fields) and produces a coherent motion field, named the thermal energy field. The thermal energy field is able to capture both motion correlation among particles and the motion trends of individual particles which are helpful to discover coherency among them. We further introduce a two-step clustering process to construct stable semantic regions from the extracted time-varying coherent motions. These semantic regions can be used to recognize pre-defined activities in crowd scenes. Finally, we introduce a cluster-and-merge process which automatically discovers recurrent activities in crowd scenes by clustering and merging the extracted coherent motions. Experiments on various videos demonstrate the effectiveness of our approach.
Tasks	Optical Flow Estimation, Scene Understanding
Published	2016-02-16
URL	http://arxiv.org/abs/1602.04921v1
PDF	http://arxiv.org/pdf/1602.04921v1.pdf
PWC	https://paperswithcode.com/paper/a-diffusion-and-clustering-based-approach-for
Repo
Framework

Achieving non-discrimination in data release


Title	Achieving non-discrimination in data release
Authors	Lu Zhang, Yongkai Wu, Xintao Wu
Abstract	Discrimination discovery and prevention/removal are increasingly important tasks in data mining. Discrimination discovery aims to unveil discriminatory practices on the protected attribute (e.g., gender) by analyzing the dataset of historical decision records, and discrimination prevention aims to remove discrimination by modifying the biased data before conducting predictive analysis. In this paper, we show that the key to discrimination discovery and prevention is to find the meaningful partitions that can be used to provide quantitative evidence for the judgment of discrimination. With the support of the causal graph, we present a graphical condition for identifying a meaningful partition. Based on that, we develop a simple criterion for the claim of non-discrimination, and propose discrimination removal algorithms which accurately remove discrimination while retaining good data utility. Experiments using real datasets show the effectiveness of our approaches.
Tasks
Published	2016-11-22
URL	http://arxiv.org/abs/1611.07438v1
PDF	http://arxiv.org/pdf/1611.07438v1.pdf
PWC	https://paperswithcode.com/paper/achieving-non-discrimination-in-data-release
Repo
Framework

FlyCap: Markerless Motion Capture Using Multiple Autonomous Flying Cameras


Title	FlyCap: Markerless Motion Capture Using Multiple Autonomous Flying Cameras
Authors	Lan Xu, Lu Fang, Wei Cheng, Kaiwen Guo, Guyue Zhou, Qionghai Dai, Yebin Liu
Abstract	Aiming at automatic, convenient and non-intrusive motion capture, this paper presents a new generation markerless motion capture technique, the FlyCap system, to capture surface motions of moving characters using multiple autonomous flying cameras (autonomous unmanned aerial vehicles (UAVs), each integrated with an RGBD video camera). During data capture, three cooperative flying cameras automatically track and follow the moving target who performs large scale motions in a wide space. We propose a novel non-rigid surface registration method to track and fuse the depth of the three flying cameras for surface motion tracking of the moving target, and simultaneously calculate the pose of each flying camera. We leverage the visual-odometry information provided by the UAV platform, and formulate the surface tracking problem in a non-linear objective function that can be linearized and effectively minimized through a Gauss-Newton method. Quantitative and qualitative experimental results demonstrate the competent and plausible surface and motion reconstruction results.
Tasks	Markerless Motion Capture, Motion Capture, Visual Odometry
Published	2016-10-29
URL	http://arxiv.org/abs/1610.09534v3
PDF	http://arxiv.org/pdf/1610.09534v3.pdf
PWC	https://paperswithcode.com/paper/flycap-markerless-motion-capture-using
Repo
Framework

General Automatic Human Shape and Motion Capture Using Volumetric Contour Cues


Title	General Automatic Human Shape and Motion Capture Using Volumetric Contour Cues
Authors	Helge Rhodin, Nadia Robertini, Dan Casas, Christian Richardt, Hans-Peter Seidel, Christian Theobalt
Abstract	Markerless motion capture algorithms require a 3D body with properly personalized skeleton dimension and/or body shape and appearance to successfully track a person. Unfortunately, many tracking methods consider model personalization a different problem and use manual or semi-automatic model initialization, which greatly reduces applicability. In this paper, we propose a fully automatic algorithm that jointly creates a rigged actor model commonly used for animation - skeleton, volumetric shape, appearance, and optionally a body surface - and estimates the actor’s motion from multi-view video input only. The approach is rigorously designed to work on footage of general outdoor scenes recorded with very few cameras and without background subtraction. Our method uses a new image formation model with analytic visibility and analytically differentiable alignment energy. For reconstruction, 3D body shape is approximated as a Gaussian density field. For pose and shape estimation, we minimize a new edge-based alignment energy inspired by volume raycasting in an absorbing medium. We further propose a new statistical human body model that represents the body surface, volumetric Gaussian density, as well as variability in skeleton shape. Given any multi-view sequence, our method jointly optimizes the pose and shape parameters of this model fully automatically in a spatiotemporal way.
Tasks	Markerless Motion Capture, Motion Capture
Published	2016-07-28
URL	http://arxiv.org/abs/1607.08659v2
PDF	http://arxiv.org/pdf/1607.08659v2.pdf
PWC	https://paperswithcode.com/paper/general-automatic-human-shape-and-motion
Repo
Framework

On the Sample Complexity of End-to-end Training vs. Semantic Abstraction Training


Title	On the Sample Complexity of End-to-end Training vs. Semantic Abstraction Training
Authors	Shai Shalev-Shwartz, Amnon Shashua
Abstract	We compare the end-to-end training approach to a modular approach in which a system is decomposed into semantically meaningful components. We focus on the sample complexity aspect, in the regime where an extremely high accuracy is necessary, as is the case in autonomous driving applications. We demonstrate cases in which the number of training examples required by the end-to-end approach is exponentially larger than the number of examples required by the semantic abstraction approach.
Tasks	Autonomous Driving
Published	2016-04-23
URL	http://arxiv.org/abs/1604.06915v1
PDF	http://arxiv.org/pdf/1604.06915v1.pdf
PWC	https://paperswithcode.com/paper/on-the-sample-complexity-of-end-to-end
Repo
Framework

A Comparative Study of Algorithms for Realtime Panoramic Video Blending


Title	A Comparative Study of Algorithms for Realtime Panoramic Video Blending
Authors	Zhe Zhu, Jiaming Lu, Minxuan Wang, Songhai Zhang, Ralph Martin, Hantao Liu, Shimin Hu
Abstract	Unlike image blending algorithms, video blending algorithms have been little studied. In this paper, we investigate 6 popular blending algorithms—feather blending, multi-band blending, modified Poisson blending, mean value coordinate blending, multi-spline blending and convolution pyramid blending. We consider in particular realtime panoramic video blending, a key problem in various virtual reality tasks. To evaluate the performance of the 6 algorithms on this problem, we have created a video benchmark of several videos captured under various conditions. We analyze the time and memory needed by the above 6 algorithms, for both CPU and GPU implementations (where readily parallelizable). The visual quality provided by these algorithms is also evaluated both objectively and subjectively. The video benchmark and algorithm implementations are publicly available.
Tasks
Published	2016-06-01
URL	http://arxiv.org/abs/1606.00103v2
PDF	http://arxiv.org/pdf/1606.00103v2.pdf
PWC	https://paperswithcode.com/paper/a-comparative-study-of-algorithms-for
Repo
Framework
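
Of the six algorithms listed in the abstract above, feather blending is the simplest to picture: inside the overlap between two images, each output pixel is a weighted average whose weight ramps linearly from one image to the other. The sketch below applies that idea to a single row of grayscale intensities; the function name, the one-dimensional setting, and the exact ramp are assumptions for illustration and are not taken from the paper's benchmark implementations.

```python
def feather_blend_row(left, right, overlap):
    """Blend two rows of pixel intensities that share `overlap` pixels.

    The last `overlap` pixels of `left` cover the same scene points as the
    first `overlap` pixels of `right`.
    """
    if not 0 < overlap <= min(len(left), len(right)):
        raise ValueError("overlap must be between 1 and the shorter row length")

    # Pixels seen only by the left image are copied unchanged.
    blended = list(left[:-overlap])
    start = len(left) - overlap
    for i in range(overlap):
        # Weight of the right image rises linearly across the overlap.
        alpha = (i + 1) / (overlap + 1)
        blended.append((1 - alpha) * left[start + i] + alpha * right[i])
    # Pixels seen only by the right image are copied unchanged.
    blended.extend(right[overlap:])
    return blended


row = feather_blend_row([10, 10, 10, 10], [20, 20, 20, 20], overlap=3)
print(row)
```

The linear ramp hides the exposure difference between the two rows by spreading it over the overlap instead of leaving a visible seam at a single pixel.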

A probabilistic tour of visual attention and gaze shift computational models


Title	A probabilistic tour of visual attention and gaze shift computational models
Authors	Giuseppe Boccignone
Abstract	In this paper a number of problems are considered which are related to the modelling of eye guidance under visual attention in a natural setting. From a crude discussion of a variety of available models spelled in probabilistic terms, it appears that current approaches in computational vision are hitherto far from achieving the goal of an active observer relying upon eye guidance to accomplish real-world tasks. We argue that this challenging goal not only requires to embody, in a principled way, the problem of eye guidance within the action/perception loop, but to face the inextricable link tying up visual attention, emotion and executive control, in so far as recent neurobiological findings are weighed up.
Tasks
Published	2016-07-05
URL	http://arxiv.org/abs/1607.01232v1
PDF	http://arxiv.org/pdf/1607.01232v1.pdf
PWC	https://paperswithcode.com/paper/a-probabilistic-tour-of-visual-attention-and
Repo
Framework