October 17, 2019


Paper Group ANR 922

MV-YOLO: Motion Vector-aided Tracking by Semantic Object Detection. Estimation with Low-Rank Time-Frequency Synthesis Models. Evidence-based lean logic profiles for conceptual data modelling languages. Linear Convergence of the Primal-Dual Gradient Method for Convex-Concave Saddle Point Problems without Strong Convexity. Hierarchical Game-Theoretic …

MV-YOLO: Motion Vector-aided Tracking by Semantic Object Detection


Title	MV-YOLO: Motion Vector-aided Tracking by Semantic Object Detection
Authors	Saeed Ranjbar Alvar, Ivan V. Bajić
Abstract	Object tracking is the cornerstone of many visual analytics systems. While considerable progress has been made in this area in recent years, robust, efficient, and accurate tracking in real-world video remains a challenge. In this paper, we present a hybrid tracker that leverages motion information from the compressed video stream and a general-purpose semantic object detector acting on decoded frames to construct a fast and efficient tracking engine. The proposed approach is compared with several well-known recent trackers on the OTB tracking dataset. The results indicate advantages of the proposed method in terms of speed and/or accuracy. Other desirable features of the proposed method are its simplicity and deployment efficiency, which stem from the fact that it reuses resources and information that may already exist in the system for other reasons.
Tasks	Object Detection, Object Tracking
Published	2018-04-30
URL	http://arxiv.org/abs/1805.00107v2
PDF	http://arxiv.org/pdf/1805.00107v2.pdf
PWC	https://paperswithcode.com/paper/mv-yolo-motion-vector-aided-tracking-by
Repo
Framework

Estimation with Low-Rank Time-Frequency Synthesis Models


Title	Estimation with Low-Rank Time-Frequency Synthesis Models
Authors	Cédric Févotte, Matthieu Kowalski
Abstract	Many state-of-the-art signal decomposition techniques rely on a low-rank factorization of a time-frequency (t-f) transform. In particular, nonnegative matrix factorization (NMF) of the spectrogram has been considered in many audio applications. This is an analysis approach in the sense that the factorization is applied to the squared magnitude of the analysis coefficients returned by the t-f transform. In this paper we instead propose a synthesis approach, where low-rankness is imposed on the synthesis coefficients of the data signal over a given t-f dictionary (such as a Gabor frame). As such we offer a novel modeling paradigm that bridges t-f synthesis modeling and traditional analysis-based NMF approaches. The proposed generative model in turn allows the design of more sophisticated multi-layer representations that can efficiently capture diverse forms of structure. Additionally, the generative modeling makes it possible to exploit t-f low-rankness for compressive sensing. We present efficient iterative shrinkage algorithms to perform estimation in the proposed models and illustrate the capabilities of the new modeling paradigm on audio signal processing examples.
Tasks	Compressive Sensing
Published	2018-04-25
URL	http://arxiv.org/abs/1804.09497v2
PDF	http://arxiv.org/pdf/1804.09497v2.pdf
PWC	https://paperswithcode.com/paper/estimation-with-low-rank-time-frequency
Repo
Framework

Evidence-based lean logic profiles for conceptual data modelling languages


Title	Evidence-based lean logic profiles for conceptual data modelling languages
Authors	Pablo Rubén Fillottrani, C. Maria Keet
Abstract	Multiple logic-based reconstructions of conceptual data modelling languages such as EER, UML Class Diagrams, and ORM exist. They mainly cover various fragments of the languages and none are formalised such that the logic applies simultaneously for all three modelling language families as unifying mechanism. This hampers interchangeability, interoperability, and tooling support. In addition, due to the lack of a systematic design process of the logic used for the formalisation, hidden choices permeate the formalisations that have rendered them incompatible. We aim to address these problems, first, by structuring the logic design process in a methodological way. We generalise and extend the DSL design process to apply to logic language design more generally and, in particular, by incorporating an ontological analysis of language features in the process. Second, we specify minimal logic profiles availing of this extended process, including the ontological commitments embedded in the languages, of evidence gathered of language feature usage, and of computational complexity insights from Description Logics (DL). The profiles characterise the essential logic structure needed to handle the semantics of conceptual models, therewith enabling the development of interoperability tools. There is no known DL language that matches exactly the features of those profiles and the common core is small (in the tractable DL $\mathcal{ALNI}$). Although hardly any inconsistencies can be derived with the profiles, it is promising for scalable runtime use of conceptual data models.
Tasks
Published	2018-09-09
URL	https://arxiv.org/abs/1809.03001v2
PDF	https://arxiv.org/pdf/1809.03001v2.pdf
PWC	https://paperswithcode.com/paper/evidence-based-lean-logic-profiles-for
Repo
Framework

Linear Convergence of the Primal-Dual Gradient Method for Convex-Concave Saddle Point Problems without Strong Convexity


Title	Linear Convergence of the Primal-Dual Gradient Method for Convex-Concave Saddle Point Problems without Strong Convexity
Authors	Simon S. Du, Wei Hu
Abstract	We consider the convex-concave saddle point problem $\min_{x}\max_{y} f(x)+y^\top A x-g(y)$ where $f$ is smooth and convex and $g$ is smooth and strongly convex. We prove that if the coupling matrix $A$ has full column rank, the vanilla primal-dual gradient method can achieve linear convergence even if $f$ is not strongly convex. Our result generalizes previous work which either requires $f$ and $g$ to be quadratic functions or requires proximal mappings for both $f$ and $g$. We adopt a novel analysis technique that in each iteration uses a “ghost” update as a reference, and show that the iterates in the primal-dual gradient method converge to this “ghost” sequence. Using the same technique we further give an analysis for the primal-dual stochastic variance reduced gradient (SVRG) method for convex-concave saddle point problems with a finite-sum structure.
Tasks
Published	2018-02-05
URL	http://arxiv.org/abs/1802.01504v2
PDF	http://arxiv.org/pdf/1802.01504v2.pdf
PWC	https://paperswithcode.com/paper/linear-convergence-of-the-primal-dual
Repo
Framework
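
The update rule described in the abstract above can be sketched in a few lines. This is only a minimal illustration, assuming simultaneous gradient descent on $x$ and gradient ascent on $y$ for $L(x, y) = f(x) + y^\top A x - g(y)$; the function name `primal_dual_gradient`, the step sizes, and the toy instance ($f = 0$, $g(y) = \tfrac{1}{2}\|y\|^2$, $A = I$) are assumptions chosen for illustration and are not taken from the paper.

```python
def primal_dual_gradient(grad_f, grad_g, A, x0, y0, eta_x, eta_y, steps):
    """Simultaneous descent on x and ascent on y for
    L(x, y) = f(x) + y^T A x - g(y). A is a list of rows (m x n)."""
    x, y = list(x0), list(y0)
    m, n = len(A), len(A[0])
    for _ in range(steps):
        gf, gg = grad_f(x), grad_g(y)
        # dL/dx = grad f(x) + A^T y
        gx = [gf[j] + sum(A[i][j] * y[i] for i in range(m)) for j in range(n)]
        # dL/dy = A x - grad g(y)
        gy = [sum(A[i][j] * x[j] for j in range(n)) - gg[i] for i in range(m)]
        # Both updates use the iterates from the previous step.
        x = [x[j] - eta_x * gx[j] for j in range(n)]
        y = [y[i] + eta_y * gy[i] for i in range(m)]
    return x, y


# Toy instance (hypothetical): f = 0 is convex but not strongly convex,
# g(y) = 0.5 * ||y||^2 is strongly convex, and A = I has full column rank.
A = [[1.0, 0.0], [0.0, 1.0]]
x_final, y_final = primal_dual_gradient(
    grad_f=lambda x: [0.0] * len(x),
    grad_g=lambda y: list(y),
    A=A, x0=[1.0, -2.0], y0=[0.5, 0.5],
    eta_x=0.1, eta_y=0.1, steps=500,
)
print(x_final, y_final)
```

On this toy instance the unique saddle point is the origin, and the iterates approach it even though $f$ contributes no curvature, which is the kind of behaviour the abstract's full-column-rank condition is about.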

Hierarchical Game-Theoretic Planning for Autonomous Vehicles


Title	Hierarchical Game-Theoretic Planning for Autonomous Vehicles
Authors	Jaime F. Fisac, Eli Bronstein, Elis Stefansson, Dorsa Sadigh, S. Shankar Sastry, Anca D. Dragan
Abstract	The actions of an autonomous vehicle on the road affect and are affected by those of other drivers, whether overtaking, negotiating a merge, or avoiding an accident. This mutual dependence, best captured by dynamic game theory, creates a strong coupling between the vehicle’s planning and its predictions of other drivers’ behavior, and constitutes an open problem with direct implications on the safety and viability of autonomous driving technology. Unfortunately, dynamic games are too computationally demanding to meet the real-time constraints of autonomous driving in its continuous state and action space. In this paper, we introduce a novel game-theoretic trajectory planning algorithm for autonomous driving, that enables real-time performance by hierarchically decomposing the underlying dynamic game into a long-horizon “strategic” game with simplified dynamics and full information structure, and a short-horizon “tactical” game with full dynamics and a simplified information structure. The value of the strategic game is used to guide the tactical planning, implicitly extending the planning horizon, pushing the local trajectory optimization closer to global solutions, and, most importantly, quantitatively accounting for the autonomous vehicle and the human driver’s ability and incentives to influence each other. In addition, our approach admits non-deterministic models of human decision-making, rather than relying on perfectly rational predictions. Our results showcase richer, safer, and more effective autonomous behavior in comparison to existing techniques.
Tasks	Autonomous Driving, Autonomous Vehicles, Decision Making
Published	2018-10-13
URL	http://arxiv.org/abs/1810.05766v1
PDF	http://arxiv.org/pdf/1810.05766v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-game-theoretic-planning-for
Repo
Framework

Penetrating the Fog: the Path to Efficient CNN Models


Title	Penetrating the Fog: the Path to Efficient CNN Models
Authors	Kun Wan, Boyuan Feng, Shu Yang, Yufei Ding
Abstract	With the increasing demand to deploy convolutional neural networks (CNNs) on mobile platforms, the sparse kernel approach was proposed, which could save more parameters than the standard convolution while maintaining accuracy. However, despite the great potential, no prior research has pointed out how to craft a sparse kernel design with such potential (i.e., effective design), and all prior works just adopt simple combinations of existing sparse kernels such as group convolution. Meanwhile, due to the large design space, it is also impossible to try all combinations of existing sparse kernels. In this paper, we are the first in the field to consider how to craft an effective sparse kernel design by eliminating the large design space. Specifically, we present a sparse kernel scheme to illustrate how to reduce the space from three aspects. First, in terms of composition we remove designs composed of repeated layers. Second, to remove designs with large accuracy degradation, we find a unified property named information field behind various sparse kernel designs, which could directly indicate the final accuracy. Last, we remove designs in two cases where a better parameter efficiency could be achieved. Additionally, we provide detailed efficiency analysis on the final four designs in our scheme. Experimental results validate the idea of our scheme by showing that our scheme is able to find designs which are more efficient in using parameters and computation with similar or higher accuracy.
Tasks
Published	2018-10-09
URL	http://arxiv.org/abs/1810.04231v2
PDF	http://arxiv.org/pdf/1810.04231v2.pdf
PWC	https://paperswithcode.com/paper/penetrating-the-fog-the-path-to-efficient-cnn
Repo
Framework

Quantum Variational Autoencoder


Title	Quantum Variational Autoencoder
Authors	Amir Khoshaman, Walter Vinci, Brandon Denis, Evgeny Andriyash, Hossein Sadeghi, Mohammad H. Amin
Abstract	Variational autoencoders (VAEs) are powerful generative models with the salient ability to perform inference. Here, we introduce a quantum variational autoencoder (QVAE): a VAE whose latent generative process is implemented as a quantum Boltzmann machine (QBM). We show that our model can be trained end-to-end by maximizing a well-defined loss-function: a ‘quantum’ lower-bound to a variational approximation of the log-likelihood. We use quantum Monte Carlo (QMC) simulations to train and evaluate the performance of QVAEs. To achieve the best performance, we first create a VAE platform with discrete latent space generated by a restricted Boltzmann machine (RBM). Our model achieves state-of-the-art performance on the MNIST dataset when compared against similar approaches that only involve discrete variables in the generative process. We consider QVAEs with a smaller number of latent units to be able to perform QMC simulations, which are computationally expensive. We show that QVAEs can be trained effectively in regimes where quantum effects are relevant despite training via the quantum bound. Our findings open the way to the use of quantum computers to train QVAEs to achieve competitive performance for generative models. Placing a QBM in the latent space of a VAE leverages the full potential of current and next-generation quantum computers as sampling devices.
Tasks
Published	2018-02-15
URL	http://arxiv.org/abs/1802.05779v2
PDF	http://arxiv.org/pdf/1802.05779v2.pdf
PWC	https://paperswithcode.com/paper/quantum-variational-autoencoder
Repo
Framework

Recognition of Activities from Eye Gaze and Egocentric Video


Title	Recognition of Activities from Eye Gaze and Egocentric Video
Authors	Anjith George, Aurobinda Routray
Abstract	This paper presents a framework for recognition of human activity from egocentric video and eye tracking data obtained from a head-mounted eye tracker. Three channels of information (eye movement, ego-motion, and visual features) are combined for the classification of activities. Image features were extracted using a pre-trained convolutional neural network. Eye and ego-motion are quantized, and the windowed histograms are used as the features. The combination of features obtains better accuracy for activity classification as compared to individual features.
Tasks	Eye Tracking
Published	2018-05-18
URL	http://arxiv.org/abs/1805.07253v1
PDF	http://arxiv.org/pdf/1805.07253v1.pdf
PWC	https://paperswithcode.com/paper/recognition-of-activities-from-eye-gaze-and
Repo
Framework
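
The "quantize, then take windowed histograms" step mentioned in the abstract above can be illustrated with a small sketch. The bin edges, window length, step, and the helper names `quantize` and `windowed_histograms` are all hypothetical; the abstract does not specify them.

```python
def quantize(value, edges):
    """Return the index of the bin that `value` falls into.
    `edges` are ascending thresholds; len(edges) + 1 bins in total."""
    return sum(1 for e in edges if value >= e)


def windowed_histograms(symbols, num_bins, window, step):
    """Slide a window over quantized symbols and return one
    normalized histogram (a feature vector) per window."""
    feats = []
    for start in range(0, len(symbols) - window + 1, step):
        hist = [0] * num_bins
        for s in symbols[start:start + window]:
            hist[s] += 1
        feats.append([h / window for h in hist])
    return feats


# Hypothetical per-frame motion magnitudes (eye or ego-motion).
magnitudes = [0.1, 0.2, 1.5, 3.0, 0.3, 0.1, 2.5, 2.7]
edges = [1.0, 2.0]  # assumed thresholds: low / medium / high motion
symbols = [quantize(m, edges) for m in magnitudes]
features = windowed_histograms(symbols, num_bins=3, window=4, step=4)
print(features)
```

Each resulting vector could then be concatenated with image features before classification; how the channels are actually fused is not stated in the abstract.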

Deep Multiscale Model Learning


Title	Deep Multiscale Model Learning
Authors	Yating Wang, Siu Wun Cheung, Eric T. Chung, Yalchin Efendiev, Min Wang
Abstract	The objective of this paper is to design novel multi-layer neural network architectures for multiscale simulations of flows taking into account the observed data and physical modeling concepts. Our approaches use deep learning concepts combined with local multiscale model reduction methodologies to predict flow dynamics. Using reduced-order model concepts is important for constructing robust deep learning architectures since the reduced-order models provide fewer degrees of freedom. Flow dynamics can be thought of as multi-layer networks. More precisely, the solution (e.g., pressures and saturations) at the time instant $n+1$ depends on the solution at the time instant $n$ and input parameters, such as permeability fields, forcing terms, and initial conditions. One can regard the solution as a multi-layer network, where each layer, in general, is a nonlinear forward map and the number of layers relates to the internal time steps. We will rely on rigorous model reduction concepts to define unknowns and connections for each layer. In each layer, our reduced-order models will provide a forward map, which will be modified (“trained”) using available data. It is critical to use reduced-order models for this purpose, which will identify the regions of influence and the appropriate number of variables. Because of the lack of available data, the training will be supplemented with computational data as needed and the interpolation between data-rich and data-deficient models. We will also use deep learning algorithms to train the elements of the reduced model discrete system. We will present main ingredients of our approach and numerical results. Numerical results show that using deep learning and multiscale models, we can improve the forward models, which are conditioned to the available data.
Tasks
Published	2018-06-13
URL	http://arxiv.org/abs/1806.04830v1
PDF	http://arxiv.org/pdf/1806.04830v1.pdf
PWC	https://paperswithcode.com/paper/deep-multiscale-model-learning
Repo
Framework

Surface Type Estimation from GPS Tracked Bicycle Activities


Title	Surface Type Estimation from GPS Tracked Bicycle Activities
Authors	Nitish Nag, Vaibhav Pandey, Aishwarya Manjunath, Avinash Vaka, Ramesh Jain
Abstract	Road conditions affect both machine and human powered modes of transportation. In the case of human powered transportation, poor road conditions increase the work for the individual to travel. Previous estimates for these parameters have used computationally expensive analysis of satellite images. In this work, we use a computationally inexpensive and simple method by using only GPS data from a human powered cyclist. By estimating if the road taken by the user has high or low variations in their directional vector, we classify if the user is on a paved road or on an unpaved trail. In order to do this, three methods were adopted: changes in frequency of the direction of slope in a given path segment, fitting segments of the path, and finding the first derivative and the number of points of zero crossings of each segment. Machine learning models such as support vector machines, K-nearest neighbors, and decision trees were used for the classification of the path. Among our methods, decision trees performed best, with an accuracy of 86%. Estimation of the type of surface can be used for many applications such as understanding rolling resistance for power estimation or building exercise recommendation systems by user profiling as described in detail in the paper.
Tasks	Recommendation Systems
Published	2018-09-25
URL	http://arxiv.org/abs/1809.09745v1
PDF	http://arxiv.org/pdf/1809.09745v1.pdf
PWC	https://paperswithcode.com/paper/surface-type-estimation-from-gps-tracked
Repo
Framework
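
One of the features named in the abstract above, the number of zero crossings of the first derivative of a path segment, can be sketched as follows. This assumes the segment is already reduced to a one-dimensional series (for example one coordinate or a heading value per GPS fix); that reduction and the function names are assumptions, not details from the paper.

```python
def first_differences(values):
    """Discrete first derivative of a sequence of samples."""
    return [b - a for a, b in zip(values, values[1:])]


def count_zero_crossings(values):
    """Count sign changes in the first difference of `values`.
    Flat steps (difference exactly zero) are ignored."""
    signs = [1 if d > 0 else -1 for d in first_differences(values) if d != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


# Hypothetical segments: a steady path and one that keeps changing direction.
smooth = [0, 1, 2, 3, 4, 5]
wiggly = [0, 2, 1, 3, 2, 4]
print(count_zero_crossings(smooth), count_zero_crossings(wiggly))
```

A count like this could serve as one input feature to a classifier such as a decision tree; the threshold separating paved from unpaved segments would have to be learned from labelled data.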

Federated Meta-Learning with Fast Convergence and Efficient Communication


Title	Federated Meta-Learning with Fast Convergence and Efficient Communication
Authors	Fei Chen, Mi Luo, Zhenhua Dong, Zhenguo Li, Xiuqiang He
Abstract	Statistical and systematic challenges in collaboratively training machine learning models across distributed networks of mobile devices have been the bottlenecks in the real-world application of federated learning. In this work, we show that meta-learning is a natural choice to handle these issues, and propose a federated meta-learning framework FedMeta, where a parameterized algorithm (or meta-learner) is shared, instead of a global model in previous approaches. We conduct an extensive empirical evaluation on LEAF datasets and a real-world production dataset, and demonstrate that FedMeta achieves a reduction in required communication cost by 2.82-4.33 times with faster convergence, and an increase in accuracy by 3.23%-14.84% as compared to Federated Averaging (FedAvg) which is a leading optimization algorithm in federated learning. Moreover, FedMeta preserves user privacy since only the parameterized algorithm is transmitted between mobile devices and central servers, and no raw data is collected onto the servers.
Tasks	Meta-Learning, Recommendation Systems
Published	2018-02-22
URL	https://arxiv.org/abs/1802.07876v2
PDF	https://arxiv.org/pdf/1802.07876v2.pdf
PWC	https://paperswithcode.com/paper/federated-meta-learning-for-recommendation
Repo
Framework

Discriminant Patch Representation for RGB-D Face Recognition Using Convolutional Neural Networks


Title	Discriminant Patch Representation for RGB-D Face Recognition Using Convolutional Neural Networks
Authors	Nesrine Grati, Achraf Ben-Hamadou, Mohamed Hammami
Abstract	This paper focuses on designing data-driven models to learn a discriminant representation space for face recognition using RGB-D data. Unlike hand-crafted representations, learned models can extract and organize the discriminant information from the data, and can automatically adapt to build new computer vision applications faster. We proposed an effective way to train Convolutional Neural Networks to learn face patch discriminant features. The proposed solution was tested and validated on state-of-the-art RGB-D datasets and showed competitive and promising results relative to standard hand-crafted feature extractors.
Tasks	Face Recognition
Published	2018-12-17
URL	http://arxiv.org/abs/1812.06829v1
PDF	http://arxiv.org/pdf/1812.06829v1.pdf
PWC	https://paperswithcode.com/paper/discriminant-patch-representation-for-rgb-d
Repo
Framework

Multiple topic identification in telephone conversations


Title	Multiple topic identification in telephone conversations
Authors	Xavier Bost, Marc El Bèze, Renato De Mori
Abstract	This paper deals with the automatic analysis of conversations between a customer and an agent in a call centre of a customer care service. The purpose of the analysis is to hypothesize themes about problems and complaints discussed in the conversation. Themes are defined by the application documentation topics. A conversation may contain mentions that are irrelevant for the application purpose and multiple themes whose mentions may be interleaved portions of a conversation that cannot be well defined. Two methods are proposed for multiple theme hypothesization. One of them is based on a cosine similarity measure using a bag of features extracted from the entire conversation. The other method introduces the concept of thematic density distributed around specific word positions in a conversation. In addition to automatically selected words, word bi-grams with possible gaps between successive words are also considered and selected. Experimental results show that the results obtained with the proposed methods outperform the results obtained with support vector machines on the same data. Furthermore, using the theme skeleton of a conversation from which thematic densities are derived, it will be possible to extract components of an automatic conversation report to be used for improving the service performance. Index Terms: multi-topic audio document classification, human/human conversation analysis, speech analytics, distance bigrams
Tasks	Document Classification
Published	2018-12-21
URL	http://arxiv.org/abs/1812.09321v2
PDF	http://arxiv.org/pdf/1812.09321v2.pdf
PWC	https://paperswithcode.com/paper/multiple-topic-identification-in-telephone
Repo
Framework
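
The first method in the abstract above scores themes by cosine similarity over a bag of features. A minimal sketch of that scoring step is shown below, assuming plain word counts as features; the theme names, vocabularies, and the example sentence are invented for illustration, and the paper's actual feature selection is not reproduced here.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical conversation and theme vocabularies.
conversation = Counter("my card was lost and my card is blocked".split())
themes = {
    "lost_card": Counter(["card", "lost", "blocked"]),
    "timetable": Counter(["bus", "schedule", "time"]),
}
scores = {name: cosine_similarity(conversation, vec)
          for name, vec in themes.items()}
print(scores)
```

Because a conversation may carry several themes, one plausible design is to keep every theme whose score exceeds a threshold rather than only the top-scoring one; the threshold itself would need tuning on held-out data.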

Distilling Information from a Flood: A Possibility for the Use of Meta-Analysis and Systematic Review in Machine Learning Research


Title	Distilling Information from a Flood: A Possibility for the Use of Meta-Analysis and Systematic Review in Machine Learning Research
Authors	Peter Henderson, Emma Brunskill
Abstract	The current flood of information in all areas of machine learning research, from computer vision to reinforcement learning, has made it difficult to make aggregate scientific inferences. It can be challenging to distill a myriad of similar papers into a set of useful principles, to determine which new methodologies to use for a particular application, and to be confident that one has compared against all relevant related work when developing new ideas. However, such a rapidly growing body of research literature is a problem that other fields have already faced - in particular, medicine and epidemiology. In those fields, systematic reviews and meta-analyses have been used exactly for dealing with these issues and it is not uncommon for entire journals to be dedicated to such analyses. Here, we suggest the field of machine learning might similarly benefit from meta-analysis and systematic review, and we encourage further discussion and development along this direction.
Tasks	Epidemiology
Published	2018-12-03
URL	http://arxiv.org/abs/1812.01074v1
PDF	http://arxiv.org/pdf/1812.01074v1.pdf
PWC	https://paperswithcode.com/paper/distilling-information-from-a-flood-a
Repo
Framework

Deconvolving convolution neural network for cell detection


Title	Deconvolving convolution neural network for cell detection
Authors	Shan E Ahmed Raza, Khalid AbdulJabbar, Mariam Jamal-Hanjani, Selvaraju Veeriah, John Le Quesne, Charles Swanton, Yinyin Yuan
Abstract	Automatic cell detection in histology images is a challenging task due to varying size, shape and features of cells and stain variations across a large cohort. Conventional deep learning methods regress the probability of each pixel belonging to the centre of a cell followed by detection of local maxima. We present deconvolution as an alternate approach to local maxima detection. The ground truth points are convolved with a mapping filter to generate artificial labels. A convolutional neural network (CNN) is modified to convolve its output with the same mapping filter and is trained for the mapped labels. Output of the trained CNN is then deconvolved to generate points as cell detection. We compare our method with state-of-the-art deep learning approaches where the results show that the proposed approach detects cells with comparatively high precision and F1-score.
Tasks
Published	2018-06-18
URL	http://arxiv.org/abs/1806.06970v1
PDF	http://arxiv.org/pdf/1806.06970v1.pdf
PWC	https://paperswithcode.com/paper/deconvolving-convolution-neural-network-for
Repo
Framework
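
The label-generation step in the abstract above (convolving ground-truth points with a mapping filter) can be sketched as follows. The 3x3 kernel and the function name `convolve_points` are assumptions for illustration only; the paper's actual mapping filter and the deconvolution step are not shown.

```python
def convolve_points(points, height, width, kernel):
    """Convolve a set of (row, col) impulses with a square, odd-sized
    kernel: a copy of the kernel is summed in, centred on each point.
    Parts of the kernel that fall outside the image are clipped."""
    k = len(kernel) // 2
    label = [[0.0] * width for _ in range(height)]
    for r, c in points:
        for dr in range(-k, k + 1):
            for dc in range(-k, k + 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width:
                    label[rr][cc] += kernel[dr + k][dc + k]
    return label


# Hypothetical mapping filter and a single annotated cell centre.
kernel = [
    [0.25, 0.5, 0.25],
    [0.5, 1.0, 0.5],
    [0.25, 0.5, 0.25],
]
label = convolve_points([(2, 2)], height=5, width=5, kernel=kernel)
for row in label:
    print(row)
```

Nearby cell centres produce overlapping kernel copies that add up in the label map, which is one reason a deconvolution step, rather than plain local-maxima search, might be used to recover the individual points.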