July 29, 2019

Paper Group ANR 117

Computer methods for 3D motion tracking in real-time


Title	Computer methods for 3D motion tracking in real-time
Authors	Bogusław Rymut
Abstract	This thesis is devoted to marker-less 3D human motion tracking in calibrated and synchronized multicamera systems. Pose estimation is based on a 3D model, which is transformed into the image plane and then rendered. With the techniques developed here, full-body tracking is achieved in real time via dynamic optimization or dynamic Bayesian filtering. The objective function of a particle swarm optimization algorithm and the observation model of a particle filter are both based on matching the 3D models rendered in the required poses against image features representing the extracted person. In such an approach, most of the computational overhead comes from rendering the 3D models in hypothetical poses and from evaluating the objective function. Effective methods for real-time rendering of 3D models with OpenGL support, as well as parallel methods for evaluating the objective function on the GPU, were developed. The resulting solutions permit 3D tracking of full-body motion in real time.
Tasks	Pose Estimation
Published	2017-07-05
URL	http://arxiv.org/abs/1707.01745v1
PDF	http://arxiv.org/pdf/1707.01745v1.pdf
PWC	https://paperswithcode.com/paper/computer-methods-for-3d-motion-tracking-in
Repo
Framework
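
As a rough illustration of the particle swarm optimization step mentioned in the abstract, here is a minimal sketch in plain Python. It is not the thesis's implementation: the function `pso_minimize`, its parameter values, and the toy `mismatch` objective (squared distance to a made-up target pose vector, standing in for the rendered-model-versus-image matching cost) are all assumptions chosen for illustration.

```python
import random

def pso_minimize(objective, dim, bounds, particles=30, iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Basic global-best particle swarm optimization (illustrative parameters)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]               # per-particle best positions
    best_val = [objective(p) for p in pos]
    g = min(range(particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]   # swarm-wide best
    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

# Hypothetical stand-in for the pose-matching cost: distance to a target pose.
TARGET_POSE = [1.0, -2.0, 0.5]

def mismatch(pose):
    return sum((p - t) ** 2 for p, t in zip(pose, TARGET_POSE))

pose, cost = pso_minimize(mismatch, dim=3, bounds=(-5.0, 5.0))
```

In a real tracker each call to the objective would involve rendering the model in the candidate pose, which is why the abstract stresses GPU evaluation; the inner loop over particles is the part that parallelizes naturally.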

Illuminant Spectra-based Source Separation Using Flash Photography


Title	Illuminant Spectra-based Source Separation Using Flash Photography
Authors	Zhuo Hui, Kalyan Sunkavalli, Sunil Hadap, Aswin C. Sankaranarayanan
Abstract	Real-world lighting often consists of multiple illuminants with different spectra. Separating and manipulating these illuminants in post-process is a challenging problem that requires either significant manual input or calibrated scene geometry and lighting. In this work, we leverage a flash/no-flash image pair to analyze and edit scene illuminants based on their spectral differences. We derive a novel physics-based relationship between color variations in the observed flash/no-flash intensities and the spectra and surface shading corresponding to individual scene illuminants. Our technique uses this constraint to automatically separate an image into constituent images lit by each illuminant. This separation can be used to support applications like white balancing, lighting editing, and RGB photometric stereo, where we demonstrate results that outperform state-of-the-art techniques on a wide range of images.
Tasks
Published	2017-04-19
URL	http://arxiv.org/abs/1704.05564v2
PDF	http://arxiv.org/pdf/1704.05564v2.pdf
PWC	https://paperswithcode.com/paper/illuminant-spectra-based-source-separation
Repo
Framework

ES Is More Than Just a Traditional Finite-Difference Approximator


Title	ES Is More Than Just a Traditional Finite-Difference Approximator
Authors	Joel Lehman, Jay Chen, Jeff Clune, Kenneth O. Stanley
Abstract	An evolution strategy (ES) variant based on a simplification of a natural evolution strategy recently attracted attention because it performs surprisingly well in challenging deep reinforcement learning domains. It searches for neural network parameters by generating perturbations to the current set of parameters, checking their performance, and moving in the aggregate direction of higher reward. Because it resembles a traditional finite-difference approximation of the reward gradient, it can naturally be confused with one. However, this ES optimizes for a different gradient than just reward: It optimizes for the average reward of the entire population, thereby seeking parameters that are robust to perturbation. This difference can channel ES into distinct areas of the search space relative to gradient descent, and also consequently to networks with distinct properties. This unique robustness-seeking property, and its consequences for optimization, are demonstrated in several domains. They include humanoid locomotion, where networks from policy gradient-based reinforcement learning are significantly less robust to parameter perturbation than ES-based policies solving the same task. While the implications of such robustness and robustness-seeking remain open to further study, this work’s main contribution is to highlight such differences and their potential importance.
Tasks
Published	2017-12-18
URL	http://arxiv.org/abs/1712.06568v3
PDF	http://arxiv.org/pdf/1712.06568v3.pdf
PWC	https://paperswithcode.com/paper/es-is-more-than-just-a-traditional-finite
Repo
Framework
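
The distinction drawn in the abstract can be made concrete with a small sketch. The code below estimates the gradient of the average reward over a Gaussian population around the current parameters, rather than the gradient of the reward at the parameters themselves. It is only an illustration under stated assumptions: the names `es_gradient` and `es_optimize`, the antithetic (mirrored) sampling, the hyperparameters, and the toy quadratic reward are not taken from the paper.

```python
import random

def es_gradient(theta, reward, sigma=0.1, pairs=50, rng=None):
    """Monte Carlo estimate of the gradient of E[reward(theta + sigma * eps)]."""
    rng = rng or random.Random(0)
    grad = [0.0] * len(theta)
    for _ in range(pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        # Antithetic pair: evaluate the perturbation and its mirror image.
        plus = reward([t + sigma * e for t, e in zip(theta, eps)])
        minus = reward([t - sigma * e for t, e in zip(theta, eps)])
        for i, e in enumerate(eps):
            grad[i] += (plus - minus) * e / (2.0 * sigma * pairs)
    return grad

def es_optimize(theta, reward, steps=200, lr=0.05, sigma=0.1, seed=0):
    """Gradient ascent on the population-average reward."""
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(steps):
        g = es_gradient(theta, reward, sigma=sigma, rng=rng)
        theta = [t + lr * gi for t, gi in zip(theta, g)]
    return theta

# Toy reward with its maximum at (3, -1); purely illustrative.
def toy_reward(th):
    return -((th[0] - 3.0) ** 2 + (th[1] + 1.0) ** 2)

theta = es_optimize([0.0, 0.0], toy_reward)
```

On a smooth quadratic like this one the smoothed and unsmoothed gradients point the same way, so the sketch only shows the mechanics. The robustness-seeking effect the abstract describes would show up on rewards whose landscape changes sharply within a distance of roughly `sigma`.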

Global optimization for low-dimensional switching linear regression and bounded-error estimation


Title	Global optimization for low-dimensional switching linear regression and bounded-error estimation
Authors	Fabien Lauer
Abstract	The paper provides global optimization algorithms for two particularly difficult nonconvex problems raised by hybrid system identification: switching linear regression and bounded-error estimation. While most works focus on local optimization heuristics without global optimality guarantees or with guarantees valid only under restrictive conditions, the proposed approach always yields a solution with a certificate of global optimality. This approach relies on a branch-and-bound strategy for which we devise lower bounds that can be efficiently computed. In order to obtain scalable algorithms with respect to the number of data, we directly optimize the model parameters in a continuous optimization setting without involving integer variables. Numerical experiments show that the proposed algorithms offer a higher accuracy than convex relaxations with a reasonable computational burden for hybrid system identification. In addition, we discuss how bounded-error estimation is related to robust estimation in the presence of outliers and exact recovery under sparse noise, for which we also obtain promising numerical results.
Tasks
Published	2017-07-18
URL	http://arxiv.org/abs/1707.05533v3
PDF	http://arxiv.org/pdf/1707.05533v3.pdf
PWC	https://paperswithcode.com/paper/global-optimization-for-low-dimensional
Repo
Framework
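
To make the two problems concrete, the sketch below writes down one common way to state them without integer assignment variables: the switching regression cost takes, for each data point, the smallest squared residual over all modes, and bounded-error estimation counts the points a single model fits within a tolerance. The exact objectives and the branch-and-bound bounds used in the paper are not reproduced here; the function names and the toy data are assumptions.

```python
def predict(theta, x):
    """Linear model output for regressor vector x."""
    return sum(w * xi for w, xi in zip(theta, x))

def switching_cost(params, data):
    """Sum over points of the smallest squared residual among all modes.

    Only the parameter vectors are optimization variables; the point-to-mode
    assignment is implied by the inner min.
    """
    return sum(min((y - predict(theta, x)) ** 2 for theta in params)
               for x, y in data)

def bounded_error_count(theta, data, eps):
    """Number of points that one linear model fits within error eps."""
    return sum(1 for x, y in data if abs(y - predict(theta, x)) <= eps)

# Toy data generated by two modes: y = 2x and y = -x.
data = [((1.0,), 2.0), ((2.0,), 4.0), ((1.0,), -1.0), ((3.0,), -3.0)]

print(switching_cost([[2.0], [-1.0]], data))   # true modes fit every point
print(bounded_error_count([2.0], data, 0.1))   # points explained by y = 2x
```

Both functions are nonconvex in the parameters, which is the difficulty the abstract refers to.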

Predicting Positive and Negative Links with Noisy Queries: Theory & Practice


Title	Predicting Positive and Negative Links with Noisy Queries: Theory & Practice
Authors	Charalampos E. Tsourakakis, Michael Mitzenmacher, Kasper Green Larsen, Jarosław Błasiok, Ben Lawson, Preetum Nakkiran, Vasileios Nakos
Abstract	Social networks involve both positive and negative relationships, which can be captured in signed graphs. The edge sign prediction problem aims to predict whether an interaction between a pair of nodes will be positive or negative. We provide theoretical results for this problem that motivate natural improvements to recent heuristics. The edge sign prediction problem is related to correlation clustering; a positive relationship means being in the same cluster. We consider the following model for two clusters: we are allowed to query any pair of nodes whether they belong to the same cluster or not, but the answer to the query is corrupted with some probability $0<q<\frac{1}{2}$. Let $\delta=1-2q$ be the bias. We provide an algorithm that recovers all signs correctly with high probability in the presence of noise with $O(\frac{n\log n}{\delta^2}+\frac{\log^2 n}{\delta^6})$ queries. This is the best known result for this problem for all but tiny $\delta$, improving on the recent work of Mazumdar and Saha. We also provide an algorithm that performs $O(\frac{n\log n}{\delta^4})$ queries, and uses breadth first search as its main algorithmic primitive. While both the running time and the number of queries for this algorithm are sub-optimal, our result relies on novel theoretical techniques, and naturally suggests the use of edge-disjoint paths as a feature for predicting signs in online social networks. Correspondingly, we experiment with using edge-disjoint $s$-$t$ paths of short length as a feature for predicting the sign of edge $(s,t)$ in real-world signed networks. Empirical findings suggest that the use of such paths improves the classification accuracy, especially for pairs of nodes with no common neighbors.
Tasks
Published	2017-09-19
URL	http://arxiv.org/abs/1709.07308v2
PDF	http://arxiv.org/pdf/1709.07308v2.pdf
PWC	https://paperswithcode.com/paper/predicting-positive-and-negative-links-with
Repo
Framework
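
The intuition behind using edge-disjoint paths can be sketched with a toy simulation. Under the two-cluster model, the product of the noisy answers along a path from $s$ to $t$ of length $\ell$ agrees with the true sign with probability $\frac{1+\delta^{\ell}}{2}$, so a majority vote over many edge-disjoint paths concentrates on the truth. The code below votes over the length-2 paths $s$-$u$-$t$, which are edge-disjoint for distinct $u$. This is not either of the paper's algorithms; the helper names, the persistent-noise oracle (the same pair always returns the same answer), and the parameters are assumptions made for illustration.

```python
import random

def make_noisy_oracle(labels, q, seed=0):
    """Return query(u, v) -> +1 (same cluster) or -1, flipped with probability q.

    Assumption: noise is persistent, so repeating a query gives the same answer.
    """
    rng = random.Random(seed)
    cache = {}

    def query(u, v):
        key = (min(u, v), max(u, v))
        if key not in cache:
            truth = 1 if labels[u] == labels[v] else -1
            cache[key] = -truth if rng.random() < q else truth
        return cache[key]

    return query

def predict_sign(s, t, nodes, query):
    """Majority vote over the edge-disjoint two-hop paths s-u-t."""
    vote = sum(query(s, u) * query(u, t) for u in nodes if u not in (s, t))
    return 1 if vote >= 0 else -1

labels = [i % 2 for i in range(201)]           # two hidden clusters
query = make_noisy_oracle(labels, q=0.1, seed=1)
print(predict_sign(0, 2, range(201), query))   # nodes 0 and 2 share a cluster
```

With $q = 0.1$ the bias is $\delta = 0.8$, so each two-hop path is correct with probability $0.82$, and a vote over 199 such paths is wrong only with negligible probability.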

On Sampling Strategies for Neural Network-based Collaborative Filtering


Title	On Sampling Strategies for Neural Network-based Collaborative Filtering
Authors	Ting Chen, Yizhou Sun, Yue Shi, Liangjie Hong
Abstract	Recent advances in neural networks have inspired people to design hybrid recommendation algorithms that can incorporate both (1) user-item interaction information and (2) content information including image, audio, and text. Despite their promising results, neural network-based recommendation algorithms pose extensive computational costs, making them challenging to scale and improve upon. In this paper, we propose a general neural network-based recommendation framework, which subsumes several existing state-of-the-art recommendation algorithms, and address the efficiency issue by investigating sampling strategies in the stochastic gradient descent training for the framework. We tackle this issue by first establishing a connection between the loss functions and the user-item interaction bipartite graph, where the loss function terms are defined on links while major computation burdens are located at nodes. We call these loss functions “graph-based” loss functions, for which varied mini-batch sampling strategies can have different computational costs. Based on this insight, three novel sampling strategies are proposed, which can significantly improve the training efficiency of the proposed framework (up to $30\times$ speedup in our experiments), as well as improving the recommendation performance. Theoretical analysis is also provided for both the computational cost and the convergence. We believe the study of sampling strategies has further implications for general graph-based loss functions, and would also enable more research under the neural network-based recommendation framework.
Tasks
Published	2017-06-23
URL	http://arxiv.org/abs/1706.07881v1
PDF	http://arxiv.org/pdf/1706.07881v1.pdf
PWC	https://paperswithcode.com/paper/on-sampling-strategies-for-neural-network
Repo
Framework

Building Graph Representations of Deep Vector Embeddings


Title	Building Graph Representations of Deep Vector Embeddings
Authors	Dario Garcia-Gasulla, Armand Vilalta, Ferran Parés, Jonatan Moreno, Eduard Ayguadé, Jesus Labarta, Ulises Cortés, Toyotaro Suzumura
Abstract	Patterns stored within pre-trained deep neural networks compose large and powerful descriptive languages that can be used for many different purposes. Typically, deep network representations are implemented within vector embedding spaces, which enables the use of traditional machine learning algorithms on top of them. In this short paper we propose the construction of a graph embedding space instead, introducing a methodology to transform the knowledge coded within a deep convolutional network into a topological space (i.e. a network). We outline how such a graph can hold data instances, data features, relations between instances and features, and relations among features. Finally, we introduce some preliminary experiments to illustrate how the resultant graph embedding space can be exploited through graph analytics algorithms.
Tasks	Graph Embedding
Published	2017-07-24
URL	http://arxiv.org/abs/1707.07465v2
PDF	http://arxiv.org/pdf/1707.07465v2.pdf
PWC	https://paperswithcode.com/paper/building-graph-representations-of-deep-vector
Repo
Framework

Deep Learning for Real-Time Crime Forecasting and its Ternarization


Title	Deep Learning for Real-Time Crime Forecasting and its Ternarization
Authors	Bao Wang, Penghang Yin, Andrea L. Bertozzi, P. Jeffrey Brantingham, Stanley J. Osher, Jack Xin
Abstract	Real-time crime forecasting is important. However, accurate prediction of when and where the next crime will happen is difficult. No known physical model provides a reasonable approximation to such a complex system. Historical crime data are sparse in both space and time and the signal of interest is weak. In this work, we first present a proper representation of crime data. We then adapt the spatial temporal residual network to the well-represented data to predict the distribution of crime in Los Angeles at the scale of hours in neighborhood-sized parcels. These experiments as well as comparisons with several existing approaches to prediction demonstrate the superiority of the proposed model in terms of accuracy. Finally, we present a ternarization technique to address the resource consumption issue for its deployment in the real world. This work is an extension of our short conference proceeding paper [Wang et al, arXiv 1707.03340].
Tasks
Published	2017-11-23
URL	http://arxiv.org/abs/1711.08833v1
PDF	http://arxiv.org/pdf/1711.08833v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-real-time-crime-forecasting
Repo
Framework

From knowledge-based to data-driven modeling of fuzzy rule-based systems: A critical reflection


Title	From knowledge-based to data-driven modeling of fuzzy rule-based systems: A critical reflection
Authors	Eyke Hüllermeier
Abstract	This paper briefly elaborates on a development in (applied) fuzzy logic that has taken place in the last couple of decades, namely, the complementation or even replacement of the traditional knowledge-based approach to fuzzy rule-based systems design by a data-driven one. It is argued that the classical rule-based modeling paradigm is actually more amenable to the knowledge-based approach, for which it was originally conceived, while being less suited to data-driven model design. An important reason that prevents fuzzy (rule-based) systems from being leveraged in large-scale applications is the flat structure of rule bases, along with the local nature of fuzzy rules and their limited ability to express complex dependencies between variables. This motivates alternative approaches to fuzzy systems modeling, in which functional dependencies can be represented more flexibly and more compactly in terms of hierarchical structures.
Tasks
Published	2017-12-02
URL	http://arxiv.org/abs/1712.00646v1
PDF	http://arxiv.org/pdf/1712.00646v1.pdf
PWC	https://paperswithcode.com/paper/from-knowledge-based-to-data-driven-modeling
Repo
Framework

Enhance Visual Recognition under Adverse Conditions via Deep Networks


Title	Enhance Visual Recognition under Adverse Conditions via Deep Networks
Authors	Ding Liu, Bowen Cheng, Zhangyang Wang, Haichao Zhang, Thomas S. Huang
Abstract	Visual recognition under adverse conditions is a very important and challenging problem of high practical value, due to the ubiquitous existence of quality distortions during image acquisition, transmission, or storage. While deep neural networks have been extensively exploited in the techniques of low-quality image restoration and high-quality image recognition tasks respectively, few studies have been done on the important problem of recognition from very low-quality images. This paper proposes a deep learning based framework for improving the performance of image and video recognition models under adverse conditions, using robust adverse pre-training or its aggressive variant. The robust adverse pre-training algorithms leverage the power of pre-training and generalize conventional unsupervised pre-training and data augmentation methods. We further develop a transfer learning approach to cope with real-world datasets of unknown adverse conditions. The proposed framework is comprehensively evaluated on a number of image and video recognition benchmarks, and obtains significant performance improvements under various single or mixed adverse conditions. Our visualization and analysis further add to the explainability of results.
Tasks	Data Augmentation, Image Restoration, Transfer Learning, Video Recognition
Published	2017-12-20
URL	http://arxiv.org/abs/1712.07732v2
PDF	http://arxiv.org/pdf/1712.07732v2.pdf
PWC	https://paperswithcode.com/paper/enhance-visual-recognition-under-adverse
Repo
Framework

Decoding visemes: improving machine lipreading


Title	Decoding visemes: improving machine lipreading
Authors	Helen L. Bear, Richard Harvey
Abstract	To undertake machine lip-reading, we try to recognise speech from a visual signal. Current work often uses viseme classification supported by language models with varying degrees of success. A few recent works suggest phoneme classification, in the right circumstances, can outperform viseme classification. In this work we present a novel two-pass method of training phoneme classifiers which uses previously trained visemes in the first pass. With our new training algorithm, we show classification performance which significantly improves on previous lip-reading results.
Tasks	Lipreading
Published	2017-10-03
URL	http://arxiv.org/abs/1710.01169v1
PDF	http://arxiv.org/pdf/1710.01169v1.pdf
PWC	https://paperswithcode.com/paper/decoding-visemes-improving-machine-lipreading-1
Repo
Framework

Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks


Title	Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks
Authors	Weilin Xu, David Evans, Yanjun Qi
Abstract	Although deep neural networks (DNNs) have achieved great success in many tasks, they can often be fooled by adversarial examples that are generated by adding small but purposeful distortions to natural examples. Previous studies to defend against adversarial examples mostly focused on refining the DNN models, but have either shown limited success or required expensive computation. We propose a new strategy, feature squeezing, that can be used to harden DNN models by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different feature vectors in the original space into a single sample. By comparing a DNN model’s prediction on the original input with that on squeezed inputs, feature squeezing detects adversarial examples with high accuracy and few false positives. This paper explores two feature squeezing methods: reducing the color bit depth of each pixel and spatial smoothing. These simple strategies are inexpensive and complementary to other defenses, and can be combined in a joint detection framework to achieve high detection rates against state-of-the-art attacks.
Tasks
Published	2017-04-04
URL	http://arxiv.org/abs/1704.01155v2
PDF	http://arxiv.org/pdf/1704.01155v2.pdf
PWC	https://paperswithcode.com/paper/feature-squeezing-detecting-adversarial
Repo
Framework
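
The detection idea in the abstract, comparing a model's prediction on the original input with its prediction on squeezed inputs, can be sketched in a few lines. The code below is a simplified illustration on grayscale images stored as nested lists with values in [0, 1]. The function names, the median filter as the smoothing choice, the L1 comparison with a maximum over squeezers, the threshold, and the toy model are assumptions for this sketch rather than details confirmed by the abstract.

```python
from statistics import median

def reduce_bit_depth(image, bits):
    """Quantize each pixel in [0, 1] to 2**bits evenly spaced levels."""
    levels = (1 << bits) - 1
    return [[round(p * levels) / levels for p in row] for row in image]

def median_smooth(image, size=3):
    """Median filter with windows clipped at the image border."""
    h, w, r = len(image), len(image[0]), size // 2
    return [[median(image[y][x]
                    for y in range(max(0, i - r), min(h, i + r + 1))
                    for x in range(max(0, j - r), min(w, j + r + 1)))
             for j in range(w)] for i in range(h)]

def squeeze_score(model, image, squeezers):
    """Largest L1 gap between predictions on the original and squeezed inputs."""
    base = model(image)
    return max(sum(abs(a - b) for a, b in zip(base, model(s(image))))
               for s in squeezers)

def looks_adversarial(model, image, squeezers, threshold):
    return squeeze_score(model, image, squeezers) > threshold

# Toy two-class "model" that reacts to the brightest pixel; illustrative only.
def toy_model(image):
    m = max(max(row) for row in image)
    return [m, 1.0 - m]

spike = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
print(looks_adversarial(toy_model, spike, [median_smooth], threshold=1.0))
```

The isolated bright pixel plays the role of a small purposeful distortion: smoothing removes it, the prediction changes sharply, and the input is flagged. In practice the threshold would be tuned on held-out legitimate inputs.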

An Asymptotically Optimal Algorithm for Communicating Multiplayer Multi-Armed Bandit Problems


Title	An Asymptotically Optimal Algorithm for Communicating Multiplayer Multi-Armed Bandit Problems
Authors	Noyan Evirgen, Alper Kose, Hakan Gokcesu
Abstract	We consider a decentralized stochastic multi-armed bandit problem with multiple players. Each player aims to maximize his/her own reward by pulling an arm. The arms give rewards based on i.i.d. stochastic Bernoulli distributions. Players are not aware of the probability distributions of the arms. At the end of each turn, each player informs his/her neighbors about the arm he/she pulled and the reward he/she got. Neighbors of players are determined according to an Erdős-Rényi graph with connectivity $\alpha$. This graph is reproduced at the beginning of every turn with the same connectivity. When more than one player chooses the same arm in a turn, we assume that only one of these players, chosen at random, gets the reward while the others get nothing. We first assume players are not aware of the collision model and offer an asymptotically optimal algorithm for the $\alpha = 1$ case. Then, we extend our prior work and offer an asymptotically optimal algorithm for any connectivity but zero, assuming players are aware of the collision model. We also study the effect of $\alpha$, the degree of communication between players, empirically on the cumulative regret by comparing them with traditional multi-armed bandit algorithms.
Tasks
Published	2017-12-02
URL	http://arxiv.org/abs/1712.00656v1
PDF	http://arxiv.org/pdf/1712.00656v1.pdf
PWC	https://paperswithcode.com/paper/an-asymptotically-optimal-algorithm-for
Repo
Framework
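
The environment described in the abstract (a fresh Erdős-Rényi communication graph every turn, Bernoulli arms, and a collision rule where one randomly chosen player keeps the reward) can be sketched as a small simulator. This only models the setting, not the authors' algorithm. The function names are hypothetical, and drawing a single Bernoulli outcome per pulled arm is an assumption about how collisions interact with the reward draw.

```python
import random

def sample_neighbors(n_players, alpha, rng):
    """Draw an Erdős-Rényi graph: each pair is linked with probability alpha."""
    neighbors = [set() for _ in range(n_players)]
    for i in range(n_players):
        for j in range(i + 1, n_players):
            if rng.random() < alpha:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors

def play_turn(choices, arm_probs, rng):
    """Resolve one turn; choices[p] is the arm pulled by player p."""
    rewards = [0] * len(choices)
    by_arm = {}
    for player, arm in enumerate(choices):
        by_arm.setdefault(arm, []).append(player)
    for arm, players in by_arm.items():
        winner = rng.choice(players)          # collision: one random winner
        rewards[winner] = 1 if rng.random() < arm_probs[arm] else 0
    return rewards

rng = random.Random(0)
neighbors = sample_neighbors(4, alpha=0.5, rng=rng)   # redrawn every turn
rewards = play_turn([0, 0, 1, 2], arm_probs=[0.9, 0.5, 0.2], rng=rng)
```

After each turn, player `p` would share its `(arm, reward)` pair with the players in `neighbors[p]`, and those shared observations are what a communicating bandit strategy can exploit.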

Deep Learning for Passive Synthetic Aperture Radar


Title	Deep Learning for Passive Synthetic Aperture Radar
Authors	Bariscan Yonel, Eric Mason, Birsen Yazıcı
Abstract	We introduce a deep learning (DL) framework for inverse problems in imaging, and demonstrate the advantages and applicability of this approach in passive synthetic aperture radar (SAR) image reconstruction. We interpret image reconstruction as a machine learning task and utilize deep networks as forward and inverse solvers for imaging. Specifically, we design a recurrent neural network (RNN) architecture as an inverse solver based on the iterations of proximal gradient descent optimization methods. We further adapt the RNN architecture to image reconstruction problems by transforming the network into a recurrent auto-encoder, thereby allowing for unsupervised training. Our DL based inverse solver is particularly suitable for a class of image formation problems in which the forward model is only partially known. The ability to learn forward models and hyperparameters, combined with the unsupervised training approach, makes our recurrent auto-encoder suitable for real-world applications. We demonstrate the performance of our method in passive SAR image reconstruction. In this regime a source of opportunity, with unknown location and transmitted waveform, is used to illuminate a scene of interest. We investigate a recurrent auto-encoder architecture based on the $\ell_1$ and $\ell_0$ constrained least-squares problem. We present a projected stochastic gradient descent based training scheme which incorporates constraints of the unknown model parameters. We demonstrate through extensive numerical simulations that our DL based approach outperforms conventional sparse coding methods in terms of computation and reconstructed image quality, specifically, when no information about the transmitter is available.
Tasks	Image Reconstruction
Published	2017-08-12
URL	http://arxiv.org/abs/1708.04682v1
PDF	http://arxiv.org/pdf/1708.04682v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-passive-synthetic-aperture
Repo
Framework
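
For readers unfamiliar with the proximal gradient iterations that the RNN is said to be built on, the sketch below shows the classic ISTA update for an $\ell_1$-regularized least-squares problem, $\min_x \frac{1}{2}\|Ax-y\|_2^2 + \lambda\|x\|_1$. Each loop iteration corresponds to what one unrolled recurrent layer would compute if the forward model `A` were known and fixed. The function names, the regularized (rather than constrained) form, and the tiny example are assumptions; the paper's learned forward model and training scheme are not reproduced here.

```python
import math

def soft_threshold(v, lam):
    """Proximal operator of lam * |v|: shrink toward zero by lam."""
    return math.copysign(max(abs(v) - lam, 0.0), v)

def ista(A, y, lam, step, iters=100):
    """Proximal gradient descent for 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    rows, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Gradient of the smooth term: A^T (A x - y).
        residual = [sum(a * xj for a, xj in zip(row, x)) - yi
                    for row, yi in zip(A, y)]
        grad = [sum(A[i][j] * residual[i] for i in range(rows))
                for j in range(n)]
        # Gradient step followed by the proximal (shrinkage) step.
        x = [soft_threshold(xj - step * gj, step * lam)
             for xj, gj in zip(x, grad)]
    return x

# With an identity forward model the solution is just the shrunken data.
x_hat = ista([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.2], lam=0.5, step=1.0, iters=10)
```

An $\ell_0$ variant would swap the soft threshold for a hard threshold that zeroes small entries and leaves the rest untouched.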

Dynamics Based 3D Skeletal Hand Tracking


Title	Dynamics Based 3D Skeletal Hand Tracking
Authors	Stan Melax, Leonid Keselman, Sterling Orsten
Abstract	Tracking the full skeletal pose of the hands and fingers is a challenging problem that has a plethora of applications for user interaction. Existing techniques either require wearable hardware, add restrictions to user pose, or require significant computation resources. This research explores a new approach to tracking hands, or any articulated model, by using an augmented rigid body simulation. This allows us to phrase 3D object tracking as a linear complementarity problem with a well-defined solution. Based on a depth sensor’s samples, the system generates constraints that limit motion orthogonal to the rigid body model’s surface. These constraints, along with prior motion, collision/contact constraints, and joint mechanics, are resolved with a projected Gauss-Seidel solver. Due to camera noise properties and attachment errors, the numerous surface constraints are impulse capped to avoid overpowering mechanical constraints. To improve tracking accuracy, multiple simulations are spawned at each frame and fed a variety of heuristics, constraints and poses. A 3D error metric selects the best-fit simulation, helping the system handle challenging hand motions. Such an approach enables real-time, robust, and accurate 3D skeletal tracking of a user’s hand on a variety of depth cameras, while only utilizing a single x86 CPU core for processing.
Tasks	Object Tracking
Published	2017-05-22
URL	http://arxiv.org/abs/1705.07640v1
PDF	http://arxiv.org/pdf/1705.07640v1.pdf
PWC	https://paperswithcode.com/paper/dynamics-based-3d-skeletal-hand-tracking
Repo
Framework
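
The solver named in the abstract can be illustrated on a generic linear complementarity problem: find $z \ge 0$ such that $w = Mz + q \ge 0$ and $z^\top w = 0$. Projected Gauss-Seidel sweeps over the unknowns, solves each row for its own variable, and clamps the result at zero. The sketch below is a textbook version under the assumption that `M` has positive diagonal entries; it does not include the impulse caps, joint constraints, or depth-sample constraints of the tracking system, and the function name is hypothetical.

```python
def projected_gauss_seidel(M, q, iters=100):
    """Approximately solve the LCP: z >= 0, M z + q >= 0, z . (M z + q) = 0."""
    n = len(q)
    z = [0.0] * n
    for _ in range(iters):
        for i in range(n):
            # Row residual using the most recent values of z (Gauss-Seidel).
            w_i = q[i] + sum(M[i][j] * z[j] for j in range(n))
            # Solve row i for z[i], then project onto z[i] >= 0.
            z[i] = max(0.0, z[i] - w_i / M[i][i])
    return z

M = [[2.0, 1.0], [1.0, 2.0]]
print(projected_gauss_seidel(M, [-1.0, -1.0]))  # both constraints active
print(projected_gauss_seidel(M, [1.0, -2.0]))   # first variable clamped at 0
```

Capping impulses, as the abstract describes for the surface constraints, would amount to clamping each `z[i]` from above as well as from below.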