April 2, 2020

2981 words 14 mins read

# Paper Group ANR 209

Online and Real-time Object Tracking Algorithm with Extremely Small Matrices. Fast Lower and Upper Estimates for the Price of Constrained Multiple Exercise American Options by Single Pass Lookahead Search and Nearest-Neighbor Martingale. DMV: Visual Object Tracking via Part-level Dense Memory and Voting-based Retrieval. Applying r-spatiogram in obj …

#### Online and Real-time Object Tracking Algorithm with Extremely Small Matrices

Title Online and Real-time Object Tracking Algorithm with Extremely Small Matrices
Authors Jesmin Jahan Tithi, Sriram Aananthakrishnan, Fabrizio Petrini
Abstract Online and Real-time Object Tracking is an interesting workload that can be used to track objects (e.g., car, human, animal) in a series of video sequences in real-time. For simple object tracking on edge devices, the output of object tracking could be as simple as drawing a bounding box around a detected object and in some cases, the input matrices used in such computation are quite small (e.g., 4x7, 3x3, 5x5, etc). As a result, the amount of actual work is low. Therefore, a typical multi-threading based parallelization technique cannot accelerate the tracking application; instead, a throughput based parallelization technique where each thread operates on independent video sequences is more rewarding. In this paper, we share our experience in parallelizing a Simple Online and Real-time Tracking (SORT) application on shared-memory multicores.
Tasks Object Tracking
Published 2020-03-26
URL https://arxiv.org/abs/2003.12091v1
PDF https://arxiv.org/pdf/2003.12091v1.pdf
PWC https://paperswithcode.com/paper/online-and-real-time-object-tracking
Repo
Framework
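The throughput-based parallelization the abstract describes, with each worker handling an independent video sequence, can be sketched roughly as follows. `track_sequence` is a hypothetical stand-in for the actual SORT per-frame work, which operates on tiny matrices and so is kept serial:

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def track_sequence(frames):
    """Hypothetical per-sequence tracker: the matrices involved are tiny
    (e.g. 4x7 assignment-cost matrices), so each sequence runs serially."""
    boxes = []
    for frame in frames:
        # Stand-in for detection + Hungarian assignment on a small matrix.
        boxes.append(frame.mean(axis=0))
    return boxes

def track_all(sequences, workers=4):
    # One worker per independent video sequence: throughput parallelism,
    # rather than multi-threading the tiny per-frame matrix operations.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(track_sequence, sequences))
```

The point is where the parallelism is placed: across sequences, where the work units are large and independent, not inside the per-frame linear algebra, where thread overhead would dominate.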

#### Fast Lower and Upper Estimates for the Price of Constrained Multiple Exercise American Options by Single Pass Lookahead Search and Nearest-Neighbor Martingale

Title Fast Lower and Upper Estimates for the Price of Constrained Multiple Exercise American Options by Single Pass Lookahead Search and Nearest-Neighbor Martingale
Authors Nicolas Essis-Breton, Patrice Gaillardetz
Abstract This article presents fast lower and upper estimates for a large class of options: the class of constrained multiple exercise American options. Typical options in this class are swing options with volume and timing constraints, and passport options with multiple lookback rights. The lower estimate algorithm uses the artificial intelligence method of lookahead search. The upper estimate algorithm uses the dual approach to option pricing on a nearest-neighbor basis for the martingale space. Probabilistic convergence guarantees are provided. Several numerical examples illustrate the approaches including a swing option with four constraints, and a passport option with 16 constraints.
Tasks
Published 2020-02-26
URL https://arxiv.org/abs/2002.11258v1
PDF https://arxiv.org/pdf/2002.11258v1.pdf
PWC https://paperswithcode.com/paper/fast-lower-and-upper-estimates-for-the-price
Repo
Framework

#### DMV: Visual Object Tracking via Part-level Dense Memory and Voting-based Retrieval

Title DMV: Visual Object Tracking via Part-level Dense Memory and Voting-based Retrieval
Authors Gunhee Nam, Seoung Wug Oh, Joon-Young Lee, Seon Joo Kim
Abstract We propose a novel memory-based tracker via part-level dense memory and voting-based retrieval, called DMV. Since deep learning techniques have been introduced to the tracking field, Siamese trackers have attracted many researchers due to the balance between speed and accuracy. However, most of them are based on a single template matching, which limits the performance as it restricts the accessible in-formation to the initial target features. In this paper, we relieve this limitation by maintaining an external memory that saves the tracking record. Part-level retrieval from the memory also liberates the information from the template and allows our tracker to better handle the challenges such as appearance changes and occlusions. By updating the memory during tracking, the representative power for the target object can be enhanced without online learning. We also propose a novel voting mechanism for the memory reading to filter out unreliable information in the memory. We comprehensively evaluate our tracker on OTB-100,TrackingNet, GOT-10k, LaSOT, and UAV123, which show that our method yields comparable results to the state-of-the-art methods.
Tasks Object Tracking, Visual Object Tracking
Published 2020-03-20
URL https://arxiv.org/abs/2003.09171v1
PDF https://arxiv.org/pdf/2003.09171v1.pdf
PWC https://paperswithcode.com/paper/dmv-visual-object-tracking-via-part-level
Repo
Framework

#### Applying r-spatiogram in object tracking for occlusion handling

Title Applying r-spatiogram in object tracking for occlusion handling
Authors Niloufar Salehi Dastjerdi, M. Omair Ahmad
Abstract Object tracking is one of the most important problems in computer vision. The aim of video tracking is to extract the trajectories of a target or object of interest, i.e. accurately locate a moving target in a video sequence and discriminate target from non-targets in the feature space of the sequence. So, feature descriptors can have significant effects on such discrimination. In this paper, we use the basic idea of many trackers which consists of three main components of the reference model, i.e., object modeling, object detection and localization, and model updating. However, there are major improvements in our system. Our fourth component, occlusion handling, utilizes the r-spatiogram to detect the best target candidate. While a spatiogram stores moments of the pixel coordinates for each bin, the r-spatiogram computes region-based compactness on the distribution of the given feature in the image, capturing richer features to represent the objects. The proposed research develops an efficient and robust way to keep tracking the object throughout video sequences in the presence of significant appearance variations and severe occlusions. The proposed method is evaluated on the Princeton RGBD tracking dataset considering sequences with different challenges, and the obtained results demonstrate the effectiveness of the proposed method.
Tasks Object Detection, Object Tracking
Published 2020-03-18
URL https://arxiv.org/abs/2003.08021v1
PDF https://arxiv.org/pdf/2003.08021v1.pdf
PWC https://paperswithcode.com/paper/applying-r-spatiogram-in-object-tracking-for
Repo
Framework
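For background, a second-order spatiogram augments each histogram bin with the spatial mean and covariance of the pixels that fall into it; the paper's r-spatiogram then adds region-based compactness on top. A minimal grayscale sketch of the plain spatiogram (assuming intensities in [0, 1]):

```python
import numpy as np

def spatiogram(image, bins=8):
    """Second-order spatiogram of a grayscale image: for each intensity
    bin, the pixel count plus the mean and covariance of the coordinates
    of the pixels falling into that bin."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    # Quantize intensities into bins (image assumed normalized to [0, 1]).
    idx = np.minimum((image.ravel() * bins).astype(int), bins - 1)
    counts = np.zeros(bins)
    means = np.zeros((bins, 2))
    covs = np.zeros((bins, 2, 2))
    for b in range(bins):
        pts = coords[idx == b]
        counts[b] = len(pts)
        if len(pts) > 1:
            means[b] = pts.mean(axis=0)
            covs[b] = np.cov(pts.T)
        elif len(pts) == 1:
            means[b] = pts[0]
    return counts, means, covs
```

Compared with a plain histogram, the per-bin spatial moments let the descriptor distinguish targets whose colors match but whose spatial layout differs, which is what makes it useful under occlusion.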

#### Learning-Based Link Scheduling in Millimeter-wave Multi-connectivity Scenarios

Title Learning-Based Link Scheduling in Millimeter-wave Multi-connectivity Scenarios
Authors Cristian Tatino, Nikolaos Pappas, Ilaria Malanchini, Lutz Ewe, Di Yuan
Abstract Multi-connectivity is emerging as a promising solution to provide reliable communications and seamless connectivity for the millimeter-wave frequency range. Due to the blockage sensitivity at such high frequencies, connectivity with multiple cells can drastically increase the network performance in terms of throughput and reliability. However, an inefficient link scheduling, i.e., over- and under-provisioning of connections, can lead either to high interference and energy consumption or to unsatisfied user quality of service (QoS) requirements. In this work, we present a learning-based solution that is able to learn and then to predict the optimal link scheduling to satisfy users' QoS requirements while avoiding communication interruptions. Moreover, we compare the proposed approach with two baseline methods and the genie-aided link scheduling that assumes perfect channel knowledge. We show that the learning-based solution approaches the optimum and outperforms the baseline methods.
Tasks
Published 2020-03-02
URL https://arxiv.org/abs/2003.02651v1
PDF https://arxiv.org/pdf/2003.02651v1.pdf
PWC https://paperswithcode.com/paper/learning-based-link-scheduling-in-millimeter
Repo
Framework

#### Scalable and Probabilistically Complete Planning for Robotic Spatial Extrusion

Title Scalable and Probabilistically Complete Planning for Robotic Spatial Extrusion
Authors Caelan Reed Garrett, Yijiang Huang, Tomás Lozano-Pérez, Caitlin Tobin Mueller
Abstract There is increasing demand for automated systems that can fabricate 3D structures. Robotic spatial extrusion has become an attractive alternative to traditional layer-based 3D printing due to a manipulator’s flexibility to print large, directionally-dependent structures. However, existing extrusion planning algorithms require a substantial amount of human input, do not scale to large instances, and lack theoretical guarantees. In this work, we present a rigorous formalization of robotic spatial extrusion planning and provide several efficient and probabilistically complete planning algorithms. The key planning challenge is, throughout the printing process, satisfying both stiffness constraints that limit the deformation of the structure and geometric constraints that ensure the robot does not collide with the structure. We show that, although these constraints often conflict with each other, a greedy backward state-space search guided by a stiffness-aware heuristic is able to successfully balance both constraints. We empirically compare our methods on a benchmark of over 40 simulated extrusion problems. Finally, we apply our approach to 3 real-world extrusion problems.
Tasks
Published 2020-02-06
URL https://arxiv.org/abs/2002.02360v1
PDF https://arxiv.org/pdf/2002.02360v1.pdf
PWC https://paperswithcode.com/paper/scalable-and-probabilistically-complete
Repo
Framework

#### HypoML: Visual Analysis for Hypothesis-based Evaluation of Machine Learning Models

Title HypoML: Visual Analysis for Hypothesis-based Evaluation of Machine Learning Models
Authors Qianwen Wang, William Alexander, Jack Pegg, Huamin Qu, Min Chen
Abstract In this paper, we present a visual analytics tool for enabling hypothesis-based evaluation of machine learning (ML) models. We describe a novel ML-testing framework that combines the traditional statistical hypothesis testing (commonly used in empirical research) with logical reasoning about the conclusions of multiple hypotheses. The framework defines a controlled configuration for testing a number of hypotheses as to whether and how some extra information about a “concept” or “feature” may benefit or hinder an ML model. Because reasoning about multiple hypotheses is not always straightforward, we provide HypoML as a visual analysis tool, with which the multi-thread testing data is transformed to a visual representation for rapid observation of the conclusions and the logical flow between the testing data and hypotheses. We have applied HypoML to a number of hypothesized concepts, demonstrating the intuitive and explainable nature of the visual analysis.
Tasks
Published 2020-02-12
URL https://arxiv.org/abs/2002.05271v1
PDF https://arxiv.org/pdf/2002.05271v1.pdf
PWC https://paperswithcode.com/paper/hypoml-visual-analysis-for-hypothesis-based
Repo
Framework

#### Distance in Latent Space as Novelty Measure

Title Distance in Latent Space as Novelty Measure
Authors Mark Philip Philipsen, Thomas Baltzer Moeslund
Abstract Deep Learning performs well when training data densely covers the experience space. For complex problems this makes data collection prohibitively expensive. We propose to intelligently select samples when constructing data sets in order to best utilize the available labeling budget. The selection methodology is based on the presumption that two dissimilar samples are worth more than two similar samples in a data set. Similarity is measured based on the Euclidean distance between samples in the latent space produced by a DNN. By using a self-supervised method to construct the latent space, it is ensured that the space fits the data well and that any upfront labeling effort can be avoided. The result is a more efficient, diverse, and balanced data set, which produces equal or superior results with fewer labeled examples.
Tasks
Published 2020-03-31
URL https://arxiv.org/abs/2003.14043v1
PDF https://arxiv.org/pdf/2003.14043v1.pdf
PWC https://paperswithcode.com/paper/distance-in-latent-space-as-novelty-measure
Repo
Framework
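The selection rule described above amounts to greedy farthest-point sampling in latent space. A minimal sketch, assuming the latent vectors have already been produced by the self-supervised encoder:

```python
import numpy as np

def select_diverse(latents, budget):
    """Greedily pick `budget` samples so that each new pick maximizes its
    Euclidean distance to the closest already-selected latent vector."""
    chosen = [0]  # seed with an arbitrary first sample
    # Distance from every sample to its nearest selected sample so far.
    dist = np.linalg.norm(latents - latents[0], axis=1)
    while len(chosen) < budget:
        nxt = int(np.argmax(dist))  # farthest from the current selection
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(latents - latents[nxt], axis=1))
    return chosen
```

Each iteration is O(n·d), so the whole selection is O(n·d·budget), which stays practical even for large unlabeled pools.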

#### SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm

Title SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm
Authors Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar
Abstract Sample- and computationally-efficient distribution estimation is a fundamental tenet in statistics and machine learning. We present $\mathrm{SURF}$, an algorithm for approximating distributions by piecewise polynomials. $\mathrm{SURF}$ is simple, replacing existing general-purpose optimization techniques by straightforward approximation of each potential polynomial piece by a simple empirical-probability interpolation, and using plain divide-and-conquer to merge the pieces. It is universal, as well-known low-degree polynomial-approximation results imply that it accurately approximates a large class of common distributions. $\mathrm{SURF}$ is robust to distribution mis-specification as for any degree $d\le 8$, it estimates any distribution to an $\ell_1$ distance $<3$ times that of the nearest degree-$d$ piecewise polynomial, improving known factor upper bounds of 3 for single polynomials and 15 for polynomials with arbitrarily many pieces. It is fast, using optimal sample complexity, and running in near sample-linear time. In experiments, $\mathrm{SURF}$ significantly outperforms state-of-the-art algorithms.
Tasks
Published 2020-02-22
URL https://arxiv.org/abs/2002.09589v1
PDF https://arxiv.org/pdf/2002.09589v1.pdf
PWC https://paperswithcode.com/paper/surf-a-simple-universal-robust-fast
Repo
Framework

#### Set2Graph: Learning Graphs From Sets

Title Set2Graph: Learning Graphs From Sets
Authors Hadar Serviansky, Nimrod Segol, Jonathan Shlomi, Kyle Cranmer, Eilam Gross, Haggai Maron, Yaron Lipman
Abstract Many problems in machine learning (ML) can be cast as learning functions from sets to graphs, or more generally to hypergraphs; in short, Set2Graph functions. Examples include clustering, learning vertex and edge features on graphs, and learning triplet data in a collection. Current neural network models that approximate Set2Graph functions come from two main ML sub-fields: equivariant learning, and similarity learning. Equivariant models are in general computationally challenging or even infeasible, while similarity learning models can be shown to have limited expressive power. In this paper we suggest a neural network model family for learning Set2Graph functions that is both practical and of maximal expressive power (universal), that is, can approximate arbitrary continuous Set2Graph functions over compact sets. Testing our models on different machine learning tasks, including an application to particle physics, we find them favorable to existing baselines.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.08772v1
PDF https://arxiv.org/pdf/2002.08772v1.pdf
PWC https://paperswithcode.com/paper/set2graph-learning-graphs-from-sets
Repo
Framework

#### Dimensionality reduction to maximize prediction generalization capability

Title Dimensionality reduction to maximize prediction generalization capability
Authors Takuya Isomura, Taro Toyoizumi
Abstract This work develops an analytically solvable unsupervised learning scheme that extracts the most informative components for predicting future inputs, termed predictive principal component analysis (PredPCA). Our scheme can effectively remove unpredictable observation noise and globally minimize the test prediction error. Mathematical analyses demonstrate that, with sufficiently high-dimensional observations that are generated by a linear or nonlinear system, PredPCA can identify the optimal hidden state representation, true system parameters, and true hidden state dimensionality, with a global convergence guarantee. We demonstrate the performance of PredPCA by using sequential visual inputs comprising hand-digits, rotating 3D objects, and natural scenes. It reliably and accurately estimates distinct hidden states and predicts future outcomes of previously unseen test input data, even in the presence of considerable observation noise. The simple model structure and low computational cost of PredPCA make it highly desirable as a learning scheme for biological neural networks and neuromorphic chips.
Tasks Dimensionality Reduction
Published 2020-03-01
URL https://arxiv.org/abs/2003.00470v1
PDF https://arxiv.org/pdf/2003.00470v1.pdf
PWC https://paperswithcode.com/paper/dimensionality-reduction-to-maximize
Repo
Framework
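Under strong simplifications (a single linear one-step predictor fit by ridge-regularized least squares), the two stages the abstract describes, predicting the next observation and then taking principal components of the predictions rather than of the raw noisy inputs, can be sketched as:

```python
import numpy as np

def pred_pca(X, n_components, reg=1e-6):
    """Simplified sketch of PredPCA's two linear steps on a series X of
    shape (T, d): (1) least-squares prediction of x_{t+1} from x_t;
    (2) PCA of the predictions. Returns a (d, n_components) basis."""
    past, future = X[:-1], X[1:]
    d = X.shape[1]
    # Ridge-regularized least squares: W maps x_t to a prediction of x_{t+1}.
    W = np.linalg.solve(past.T @ past + reg * np.eye(d), past.T @ future)
    preds = past @ W
    # Principal components of the predictions: unpredictable observation
    # noise is absent from preds, so it cannot dominate the components.
    preds = preds - preds.mean(axis=0)
    _, _, Vt = np.linalg.svd(preds, full_matrices=False)
    return Vt[:n_components].T
```

The paper's analysis covers far more (nonlinear generative systems, identifiability, convergence guarantees); this only shows why PCA on predictions filters out unpredictable noise that PCA on raw observations would keep.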

#### Supervised Dimensionality Reduction and Visualization using Centroid-encoder

Title Supervised Dimensionality Reduction and Visualization using Centroid-encoder
Authors Tomojit Ghosh, Michael Kirby
Abstract Visualizing high-dimensional data is an essential task in Data Science and Machine Learning. The Centroid-Encoder (CE) method is similar to the autoencoder but incorporates label information to keep objects of a class close together in the reduced visualization space. CE exploits nonlinearity and labels to encode high variance in low dimensions while capturing the global structure of the data. We present a detailed analysis of the method using a wide variety of data sets and compare it with other supervised dimension reduction techniques, including NCA, nonlinear NCA, t-distributed NCA, t-distributed MCML, supervised UMAP, supervised PCA, Colored Maximum Variance Unfolding, supervised Isomap, Parametric Embedding, supervised Neighbor Retrieval Visualizer, and Multiple Relational Embedding. We empirically show that centroid-encoder outperforms most of these techniques. We also show that when the data variance is spread across multiple modalities, centroid-encoder extracts a significant amount of information from the data in a low-dimensional space. This key feature establishes its value as a tool for data visualization.
Tasks Dimensionality Reduction
Published 2020-02-27
URL https://arxiv.org/abs/2002.11934v2
PDF https://arxiv.org/pdf/2002.11934v2.pdf
PWC https://paperswithcode.com/paper/supervised-dimensionality-reduction-and
Repo
Framework
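The label-aware objective can be illustrated with a simplified loss that pulls each embedded point toward the centroid of its own class. Note this is only an approximation of the idea: the actual Centroid-Encoder trains an autoencoder whose reconstruction target is the class centroid, not a direct penalty on the embeddings.

```python
import numpy as np

def centroid_loss(embeddings, labels):
    """Simplified centroid-style objective: mean squared distance of each
    embedded point to the centroid of its own class. Zero when every
    class collapses onto a single point in the embedding space."""
    loss = 0.0
    for c in np.unique(labels):
        pts = embeddings[labels == c]
        centroid = pts.mean(axis=0)
        loss += np.sum((pts - centroid) ** 2)
    return loss / len(embeddings)
```

Minimizing a term like this alongside reconstruction is what keeps objects of a class close together in the reduced visualization space.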

#### FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction

Title FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction
Authors Haotian Yang, Hao Zhu, Yanru Wang, Mingkai Huang, Qiu Shen, Ruigang Yang, Xun Cao
Abstract In this paper, we present a large-scale detailed 3D face dataset, FaceScape, and propose a novel algorithm that is able to predict elaborate riggable 3D face models from a single image input. The FaceScape dataset provides 18,760 textured 3D faces, captured from 938 subjects, each with 20 specific expressions. The 3D models contain pore-level facial geometry that is also processed to be topologically uniform. These fine 3D facial models can be represented as a 3D morphable model for rough shapes and displacement maps for detailed geometry. Taking advantage of the large-scale and high-accuracy dataset, a novel algorithm is further proposed to learn the expression-specific dynamic details using a deep neural network. The learned relationship serves as the foundation of our 3D face prediction system from a single image input. Different from previous methods, our predicted 3D models are riggable with highly detailed geometry under different expressions. The unprecedented dataset and code will be released to the public for research purposes.
Tasks
Published 2020-03-31
URL https://arxiv.org/abs/2003.13989v1
PDF https://arxiv.org/pdf/2003.13989v1.pdf
PWC https://paperswithcode.com/paper/facescape-a-large-scale-high-quality-3d-face
Repo
Framework

#### Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games

Title Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games
Authors Tianyi Lin, Zhengyuan Zhou, Panayotis Mertikopoulos, Michael I. Jordan
Abstract We consider multi-agent learning via online gradient descent (OGD) in a class of games called $\lambda$-cocoercive games, a broad class of games that admits many Nash equilibria and that properly includes strongly monotone games. We characterize the finite-time last-iterate convergence rate for joint OGD learning on $\lambda$-cocoercive games; further, building on this result, we develop a fully adaptive OGD learning algorithm that does not require any knowledge of the problem parameter (e.g., the cocoercive constant $\lambda$) and show, via a novel double-stopping-time technique, that this adaptive algorithm achieves the same finite-time last-iterate convergence rate as its non-adaptive counterpart. Subsequently, we extend OGD learning to the noisy gradient feedback case and establish last-iterate convergence results—first qualitative almost sure convergence, then quantitative finite-time convergence rates—all under non-decreasing step-sizes. These results fill in several gaps in the existing multi-agent online learning literature, where three aspects—finite-time convergence rates, non-decreasing step-sizes, and fully adaptive algorithms—have not been previously explored.
Tasks
Published 2020-02-23
URL https://arxiv.org/abs/2002.09806v2
PDF https://arxiv.org/pdf/2002.09806v2.pdf
PWC https://paperswithcode.com/paper/finite-time-last-iterate-convergence-for
Repo
Framework
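A single OGD step on a compact action set, here an l2 ball, looks as follows; the paper's adaptive step-size rule and noisy-feedback analysis are omitted:

```python
import numpy as np

def ogd_step(x, grad, step, radius=1.0):
    """One online gradient descent step followed by Euclidean projection
    onto an l2 ball of the given radius (a common compact action set)."""
    x = x - step * grad
    norm = np.linalg.norm(x)
    if norm > radius:
        x = x * (radius / norm)  # project back onto the ball
    return x
```

In the game setting each player runs this update on its own action using only its own observed gradient; the paper's contribution is characterizing how fast the joint iterates (not just their averages) converge to equilibrium.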

#### Modeling Cross-view Interaction Consistency for Paired Egocentric Interaction Recognition

Title Modeling Cross-view Interaction Consistency for Paired Egocentric Interaction Recognition
Authors Zhongguo Li, Fan Lyu, Wei Feng, Song Wang
Abstract With the development of Augmented Reality (AR), egocentric action recognition (EAR) plays an important role in accurately understanding demands from the user. However, EAR is designed to recognize human-machine interaction in a single egocentric view, making it difficult to capture interactions between two face-to-face AR users. Paired egocentric interaction recognition (PEIR) is the task of collaboratively recognizing the interactions between two persons using the videos from their corresponding views. Unfortunately, existing PEIR methods directly use a linear decision function to fuse the features extracted from the two corresponding egocentric videos, which ignores the consistency of the interaction across the paired videos: the interactions in the two views are consistent, and the features extracted from them are correlated with each other. On top of that, we propose to build the relevance between the two views using bilinear pooling, which captures their consistency at the feature level. Specifically, each neuron in the feature maps from one view connects to the neurons from the other view, which guarantees compact consistency between the two views. All possible neuron pairs are then used for PEIR to exploit the consistent information they carry. To be efficient, we use compact bilinear pooling with Count Sketch to avoid directly computing the outer product. Experimental results on the PEV dataset show the superiority of the proposed method on the PEIR task.
Tasks
Published 2020-03-24
URL https://arxiv.org/abs/2003.10663v1
PDF https://arxiv.org/pdf/2003.10663v1.pdf
PWC https://paperswithcode.com/paper/modeling-cross-view-interaction-consistency
Repo
Framework
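Compact bilinear pooling with Count Sketch, as referenced in the abstract, replaces the explicit outer product of two feature vectors with random sketching plus FFT-based circular convolution. A minimal NumPy sketch (in a real model the hash and sign vectors are drawn once and fixed):

```python
import numpy as np

def count_sketch(x, h, s, d):
    """Project feature vector x to d dims using hash indices h and signs s."""
    y = np.zeros(d)
    np.add.at(y, h, s * x)  # scatter-add signed entries into hashed slots
    return y

def compact_bilinear(x1, x2, d=64, seed=0):
    """Compact bilinear pooling of two feature vectors: Count Sketch each
    vector, then convolve the sketches via FFT. This approximates the
    flattened outer product x1 (x) x2 without ever forming it."""
    rng = np.random.default_rng(seed)
    h1 = rng.integers(0, d, len(x1))
    h2 = rng.integers(0, d, len(x2))
    s1 = rng.choice([-1.0, 1.0], len(x1))
    s2 = rng.choice([-1.0, 1.0], len(x2))
    y1 = count_sketch(x1, h1, s1, d)
    y2 = count_sketch(x2, h2, s2, d)
    # Convolution theorem: elementwise product in the frequency domain
    # equals circular convolution of the sketches.
    return np.real(np.fft.ifft(np.fft.fft(y1) * np.fft.fft(y2)))
```

This reduces the fused feature from len(x1)·len(x2) dimensions to d, which is what makes pairing every neuron across the two egocentric views tractable.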