May 6, 2019

Paper Group ANR 390

UMDFaces: An Annotated Face Dataset for Training Deep Networks


Title	UMDFaces: An Annotated Face Dataset for Training Deep Networks
Authors	Ankan Bansal, Anirudh Nanduri, Carlos Castillo, Rajeev Ranjan, Rama Chellappa
Abstract	Recent progress in face detection (including keypoint detection), and recognition is mainly being driven by (i) deeper convolutional neural network architectures, and (ii) larger datasets. However, most of the large datasets are maintained by private companies and are not publicly available. The academic computer vision community needs larger and more varied datasets to make further progress. In this paper we introduce a new face dataset, called UMDFaces, which has 367,888 annotated faces of 8,277 subjects. We also introduce a new face recognition evaluation protocol which will help advance the state-of-the-art in this area. We discuss how a large dataset can be collected and annotated using human annotators and deep networks. We provide human curated bounding boxes for faces. We also provide estimated pose (roll, pitch and yaw), locations of twenty-one key-points and gender information generated by a pre-trained neural network. In addition, the quality of keypoint annotations has been verified by humans for about 115,000 images. Finally, we compare the quality of the dataset with other publicly available face datasets at similar scales.
Tasks	Face Detection, Face Recognition, Keypoint Detection
Published	2016-11-04
URL	http://arxiv.org/abs/1611.01484v2
PDF	http://arxiv.org/pdf/1611.01484v2.pdf
PWC	https://paperswithcode.com/paper/umdfaces-an-annotated-face-dataset-for
Repo
Framework

Towards end-to-end optimisation of functional image analysis pipelines


Title	Towards end-to-end optimisation of functional image analysis pipelines
Authors	Albert Vilamala, Kristoffer Hougaard Madsen, Lars Kai Hansen
Abstract	The study of neurocognitive tasks requiring accurate localisation of activity often relies on functional Magnetic Resonance Imaging, a widely adopted technique that makes use of a pipeline of data processing modules, each involving a variety of parameters. These parameters are frequently set according to the local goal of each specific module, not accounting for the rest of the pipeline. Given the recent success of neural network research in many different domains, we propose to convert the whole data pipeline into a deep neural network, where the parameters involved are jointly optimised by the network to best serve a common global goal. As a proof of concept, we develop a module able to adaptively apply the most suitable spatial smoothing to every brain volume for each specific neuroimaging task, and we validate its results in a standard brain decoding experiment.
Tasks	Brain Decoding
Published	2016-10-13
URL	http://arxiv.org/abs/1610.04079v1
PDF	http://arxiv.org/pdf/1610.04079v1.pdf
PWC	https://paperswithcode.com/paper/towards-end-to-end-optimisation-of-functional
Repo
Framework

A Chain-Detection Algorithm for Two-Dimensional Grids


Title	A Chain-Detection Algorithm for Two-Dimensional Grids
Authors	Paul Bonham, Azlan Iqbal
Abstract	We describe a general method of detecting valid chains or links of pieces on a two-dimensional grid, using the example of the chess variant known as Switch-Side Chain-Chess (SSCC). Presently, no foolproof method of detecting such chains in any given chess position is known and existing graph theory, to our knowledge, is unable to fully address this problem either. We therefore propose a solution implemented and tested using the C++ programming language. We have been unable to find an incorrect result and therefore offer it as the most viable solution thus far to the chain-detection problem in this chess variant. The algorithm is also scalable, in principle, to areas beyond two-dimensional grids such as 3D analysis and molecular chemistry.
Tasks
Published	2016-10-12
URL	http://arxiv.org/abs/1610.03573v1
PDF	http://arxiv.org/pdf/1610.03573v1.pdf
PWC	https://paperswithcode.com/paper/a-chain-detection-algorithm-for-two
Repo
Framework

Defensive Player Classification in the National Basketball Association


Title	Defensive Player Classification in the National Basketball Association
Authors	Neil Seward
Abstract	The National Basketball Association (NBA) has expanded its data gathering and has invested heavily in new technologies to gather advanced performance metrics on players. This expanded data set allows analysts to use unique performance metrics in models to estimate and classify player performance. Instead of grouping players together based on physical attributes and positions played, analysts can group together players who play similarly to each other based on these tracked metrics. Existing methods for player classification have typically used offensive metrics for clustering [1]. There have been attempts to classify players using past defensive metrics, but the lack of quality metrics has not produced promising results. The classifications presented in the paper use newly introduced defensive metrics to find different defensive positions for each player. Without knowing the number of categories that players can be cast into, Gaussian Mixture Models (GMM) can be applied to find the optimal number of clusters. In the model presented, five different defensive player types can be identified.
Tasks
Published	2016-12-13
URL	http://arxiv.org/abs/1612.05502v2
PDF	http://arxiv.org/pdf/1612.05502v2.pdf
PWC	https://paperswithcode.com/paper/defensive-player-classification-in-the
Repo
Framework
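
The abstract above says only that Gaussian Mixture Models are used to find the number of clusters; it does not give the selection criterion. As a rough illustration, the sketch below fits one-dimensional mixtures with EM and compares them with the Bayesian Information Criterion (BIC). The BIC choice, the single toy metric, the quantile initialisation, and every name in the code are assumptions made for this example, not details taken from the paper.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, iters=200, var_floor=1e-2):
    """Fit a k-component 1-D Gaussian mixture with EM; return its log-likelihood."""
    n = len(data)
    xs = sorted(data)
    # Hypothetical initialisation: means at evenly spaced quantiles.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(data) / n
    total_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [max(total_var, var_floor)] * k
    weights = [1.0 / k] * k

    def mixture_density(x):
        dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        return dens, max(sum(dens), 1e-300)  # guard against underflow to zero

    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens, total = mixture_density(x)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)  # floor avoids collapsed components
            weights[j] = nj / n

    return sum(math.log(mixture_density(x)[1]) for x in data)


def bic(log_likelihood, k, n):
    """BIC for a 1-D mixture: k means, k variances, k - 1 free weights."""
    n_params = 3 * k - 1
    return -2.0 * log_likelihood + n_params * math.log(n)


# Toy data (made up): one defensive metric with two well-separated groups.
metric_values = [1.0 + 0.1 * i for i in range(10)] + [10.0 + 0.1 * i for i in range(10)]

scores = {k: bic(fit_gmm_1d(metric_values, k), k, len(metric_values)) for k in (1, 2, 3)}
best_k = min(scores, key=scores.get)  # lower BIC is better
```

With real tracking data the mixtures would be multivariate and a library implementation would be the usual choice; the sketch only shows the shape of the model-selection loop.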

Jansen-MIDAS: a multi-level photomicrograph segmentation software based on isotropic undecimated wavelets


Title	Jansen-MIDAS: a multi-level photomicrograph segmentation software based on isotropic undecimated wavelets
Authors	Alexandre Fioravante de Siqueira, Flávio Camargo Cabrera, Wagner Massayuki Nakasuga, Aylton Pagamisse, Aldo Eloizo Job
Abstract	Image segmentation, the process of separating the elements within an image, is frequently used for obtaining information from photomicrographs. However, segmentation methods should be used with reservations: incorrect segmentation can mislead when interpreting regions of interest (ROI), thus decreasing the success rate of additional procedures. Multi-Level Starlet Segmentation (MLSS) and Multi-Level Starlet Optimal Segmentation (MLSOS) were developed to address the photomicrograph segmentation deficiency in general tools. These methods gave rise to Jansen-MIDAS, open-source software that a scientist can use to obtain a multi-level threshold segmentation of his or her photomicrographs. This software is presented in two versions: a text-based version, for GNU Octave, and a graphical user interface (GUI) version, for MathWorks MATLAB. It can be used to process several types of images, becoming a reliable alternative for the scientist.
Tasks	Semantic Segmentation
Published	2016-04-20
URL	http://arxiv.org/abs/1604.05921v2
PDF	http://arxiv.org/pdf/1604.05921v2.pdf
PWC	https://paperswithcode.com/paper/jansen-midas-a-multi-level-photomicrograph
Repo
Framework

Deep Learning At Scale and At Ease


Title	Deep Learning At Scale and At Ease
Authors	Wei Wang, Gang Chen, Haibo Chen, Tien Tuan Anh Dinh, Jinyang Gao, Beng Chin Ooi, Kian-Lee Tan, Sheng Wang
Abstract	Recently, deep learning techniques have enjoyed success in various multimedia applications, such as image classification and multi-modal data analysis. Large deep learning models are developed for learning rich representations of complex data. There are two challenges to overcome before deep learning can be widely adopted in multimedia and other applications. One is usability: non-experts must be able to implement different models and training algorithms without much effort, especially when the model is large and complex. The other is scalability: the deep learning system must be able to provision for the huge demand for computing resources when training large models with massive datasets. To address these two challenges, in this paper, we design a distributed deep learning platform called SINGA which has an intuitive programming model based on the common layer abstraction of deep learning models. Good scalability is achieved through flexible distributed training architecture and specific optimization techniques. SINGA runs on GPUs as well as on CPUs, and we show that it outperforms many other state-of-the-art deep learning systems. Our experience with developing and training deep learning models for real-life multimedia applications in SINGA shows that the platform is both usable and scalable.
Tasks	Image Classification
Published	2016-03-25
URL	http://arxiv.org/abs/1603.07846v1
PDF	http://arxiv.org/pdf/1603.07846v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-at-scale-and-at-ease
Repo
Framework

Deep Multimodal Feature Analysis for Action Recognition in RGB+D Videos


Title	Deep Multimodal Feature Analysis for Action Recognition in RGB+D Videos
Authors	Amir Shahroudy, Tian-Tsong Ng, Yihong Gong, Gang Wang
Abstract	Single modality action recognition on RGB or depth sequences has been extensively explored recently. It is generally accepted that each of these two modalities has different strengths and limitations for the task of action recognition. Therefore, analysis of the RGB+D videos can help us to better study the complementary properties of these two types of modalities and achieve higher levels of performance. In this paper, we propose a new deep autoencoder based shared-specific feature factorization network to separate input multimodal signals into a hierarchy of components. Further, based on the structure of the features, a structured sparsity learning machine is proposed which utilizes mixed norms to apply regularization within components and group selection between them for better classification performance. Our experimental results show the effectiveness of our cross-modality feature analysis framework by achieving state-of-the-art accuracy for action classification on five challenging benchmark datasets.
Tasks	Action Classification, Action Recognition In Videos, Multimodal Activity Recognition, Temporal Action Localization
Published	2016-03-23
URL	http://arxiv.org/abs/1603.07120v2
PDF	http://arxiv.org/pdf/1603.07120v2.pdf
PWC	https://paperswithcode.com/paper/deep-multimodal-feature-analysis-for-action
Repo
Framework

Lazifying Conditional Gradient Algorithms


Title	Lazifying Conditional Gradient Algorithms
Authors	Gábor Braun, Sebastian Pokutta, Daniel Zink
Abstract	Conditional gradient algorithms (also often called Frank-Wolfe algorithms) are popular due to their simplicity of only requiring a linear optimization oracle, and more recently they have also gained significant traction for online learning. While simple in principle, in many cases the actual implementation of the linear optimization oracle is costly. We show a general method to lazify various conditional gradient algorithms, which in actual computations leads to several orders of magnitude of speedup in wall-clock time. This is achieved by using a faster separation oracle instead of a linear optimization oracle, relying on only a few linear optimization oracle calls.
Tasks
Published	2016-10-17
URL	http://arxiv.org/abs/1610.05120v4
PDF	http://arxiv.org/pdf/1610.05120v4.pdf
PWC	https://paperswithcode.com/paper/lazifying-conditional-gradient-algorithms
Repo
Framework
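
For readers unfamiliar with the baseline, here is a minimal sketch of the plain (non-lazy) conditional gradient method that the abstract above takes as its starting point. It is not the lazified variant proposed in the paper. The feasible set (the probability simplex), the quadratic objective, the step-size rule 2/(t+2), and all names are assumptions chosen to keep the example short.

```python
def frank_wolfe_simplex(grad, dim, steps=2000):
    """Minimise a smooth convex function over the probability simplex."""
    x = [1.0 / dim] * dim  # feasible starting point (uniform distribution)
    for t in range(steps):
        g = grad(x)
        # Linear optimisation oracle: over the simplex, <g, v> is minimised
        # by the vertex e_i whose gradient coordinate is smallest.
        i = min(range(dim), key=lambda j: g[j])
        gamma = 2.0 / (t + 2.0)  # standard open-loop step size
        # Convex combination of the iterate and the vertex keeps x feasible.
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
    return x


# Toy objective (made up): squared distance to a point inside the simplex.
target = [0.2, 0.3, 0.5]


def grad_sq_dist(x):
    return [2.0 * (xj - bj) for xj, bj in zip(x, target)]


x_hat = frank_wolfe_simplex(grad_sq_dist, dim=3)
objective = sum((xj - bj) ** 2 for xj, bj in zip(x_hat, target))
```

Each iteration makes exactly one oracle call, which is cheap here but can dominate the running time for harder feasible sets; that cost is what the paper aims to reduce.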

Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene


Title	Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene
Authors	Keyu Lu, Jian Li, Xiangjing An, Hangen He
Abstract	Vision-based object detection is one of the fundamental functions in numerous traffic scene applications such as self-driving vehicle systems and advanced driver assistance systems (ADAS). However, it is also a challenging task due to the diversity of traffic scenes and the storage, power and computing resource limitations of the platforms for traffic scene applications. This paper presents a generalized Haar filter based deep network which is suitable for object detection tasks in traffic scenes. In this approach, we first decompose an object detection task into several easier local regression tasks. Then, we handle the local regression tasks by using several tiny deep networks which simultaneously output the bounding boxes, categories and confidence scores of detected objects. To reduce the consumption of storage and computing resources, the weights of the deep networks are constrained to the form of generalized Haar filters in the training phase. Additionally, we introduce a sparse window generation strategy to improve the efficiency of the algorithm. Finally, we perform several experiments to validate the performance of our proposed approach. Experimental results demonstrate that the proposed approach is both efficient and effective in traffic scenes compared with the state-of-the-art.
Tasks	Object Detection, Real-Time Object Detection
Published	2016-10-30
URL	http://arxiv.org/abs/1610.09609v1
PDF	http://arxiv.org/pdf/1610.09609v1.pdf
PWC	https://paperswithcode.com/paper/generalized-haar-filter-based-deep-networks
Repo
Framework

Online Prediction of Dyadic Data with Heterogeneous Matrix Factorization


Title	Online Prediction of Dyadic Data with Heterogeneous Matrix Factorization
Authors	Guangyong Chen, Fengyuan Zhu, Pheng Ann Heng
Abstract	Dyadic Data Prediction (DDP) is an important problem in many research areas. This paper develops a novel fully Bayesian nonparametric framework which integrates two popular and complementary approaches, discrete mixed membership modeling and continuous latent factor modeling, into a unified Heterogeneous Matrix Factorization (HeMF) model, which can predict the unobserved dyadics accurately. The HeMF can determine the number of communities automatically and exploit the latent linear structure for each bicluster efficiently. We propose a Variational Bayesian method to estimate the parameters and missing data. We further develop a novel online learning approach for Variational inference and use it for the online learning of HeMF, which can efficiently cope with the important large-scale DDP problem. We evaluate the performance of our method on the EachMovie, MovieLens and Netflix Prize collaborative filtering datasets. The experiments show that our model outperforms state-of-the-art methods on all benchmarks. Compared with Stochastic Gradient Method (SGD), our online learning approach achieves significant improvement on the estimation accuracy and robustness.
Tasks
Published	2016-01-13
URL	http://arxiv.org/abs/1601.03124v1
PDF	http://arxiv.org/pdf/1601.03124v1.pdf
PWC	https://paperswithcode.com/paper/online-prediction-of-dyadic-data-with
Repo
Framework

Things Bayes can’t do


Title	Things Bayes can’t do
Authors	Daniil Ryabko
Abstract	The problem of forecasting conditional probabilities of the next event given the past is considered in a general probabilistic setting. Given an arbitrary (large, uncountable) set C of predictors, we would like to construct a single predictor that performs asymptotically as well as the best predictor in C, on any data. Here we show that there are sets C for which such predictors exist, but none of them is a Bayesian predictor with a prior concentrated on C. In other words, there is a predictor with sublinear regret, but every Bayesian predictor must have a linear regret. This negative finding is in sharp contrast with previous results that establish the opposite for the case when one of the predictors in C achieves asymptotically vanishing error. In such a case, if there is a predictor that achieves asymptotically vanishing error for any measure in C, then there is a Bayesian predictor that also has this property, and whose prior is concentrated on (a countable subset of) C.
Tasks
Published	2016-10-26
URL	http://arxiv.org/abs/1610.08239v2
PDF	http://arxiv.org/pdf/1610.08239v2.pdf
PWC	https://paperswithcode.com/paper/things-bayes-cant-do
Repo
Framework

“Show me the cup”: Reference with Continuous Representations


Title	“Show me the cup”: Reference with Continuous Representations
Authors	Gemma Boleda, Sebastian Padó, Marco Baroni
Abstract	One of the most basic functions of language is to refer to objects in a shared scene. Modeling reference with continuous representations is challenging because it requires individuation, i.e., tracking and distinguishing an arbitrary number of referents. We introduce a neural network model that, given a definite description and a set of objects represented by natural images, points to the intended object if the expression has a unique referent, or indicates a failure, if it does not. The model, directly trained on reference acts, is competitive with a pipeline manually engineered to perform the same task, both when referents are purely visual, and when they are characterized by a combination of visual and linguistic properties.
Tasks
Published	2016-06-28
URL	http://arxiv.org/abs/1606.08777v1
PDF	http://arxiv.org/pdf/1606.08777v1.pdf
PWC	https://paperswithcode.com/paper/show-me-the-cup-reference-with-continuous
Repo
Framework

Vision-based Traffic Flow Prediction using Dynamic Texture Model and Gaussian Process


Title	Vision-based Traffic Flow Prediction using Dynamic Texture Model and Gaussian Process
Authors	Bin Liu, Hao Ji, Yi Dai
Abstract	In this paper, we describe work in progress towards a real-time vision-based traffic flow prediction (TFP) system. The proposed method consists of three elemental operators: dynamic texture model based motion segmentation, feature extraction, and Gaussian process (GP) regression. The objective of motion segmentation is to recognize the target regions covering the moving vehicles in the sequence of visual processes. The feature extraction operator aims to extract useful features from the target regions. The extracted features are then mapped to the number of vehicles through the operator of GP regression. A training stage using historical visual data is required for determining the parameter values of the GP. Using a low-resolution visual data set, we performed preliminary evaluations on the performance of the proposed method. The results show that our method beats a benchmark solution based on a Gaussian mixture model, and has the potential to be developed into qualified and practical solutions to real-time TFP.
Tasks	Motion Segmentation
Published	2016-07-14
URL	http://arxiv.org/abs/1607.03991v2
PDF	http://arxiv.org/pdf/1607.03991v2.pdf
PWC	https://paperswithcode.com/paper/vision-based-traffic-flow-prediction-using
Repo
Framework
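
The last stage described in the abstract above, mapping extracted features to a vehicle count with Gaussian process regression, can be sketched in a few lines. The abstract does not specify the kernel, the features, or how hyperparameters are trained, so the RBF kernel, the fixed length scale and noise level, the one-dimensional feature, and the toy numbers below are all assumptions for illustration.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential (RBF) kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def gp_fit(xs, ys, noise=1e-6):
    """Return alpha = (K + noise * I)^-1 y for a zero-mean GP."""
    k = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    return solve(k, ys)


def gp_predict(xs, alpha, x_new):
    """Posterior mean at x_new: k(x_new, X) @ alpha."""
    return sum(a * rbf(x_new, xi) for a, xi in zip(alpha, xs))


# Toy training data (made up): a scalar image feature and the vehicle count.
features = [0.0, 1.0, 2.0, 3.0]
counts = [2.0, 5.0, 9.0, 14.0]

# Centre the targets so the zero-mean prior is a reasonable fit.
mean_count = sum(counts) / len(counts)
alpha = gp_fit(features, [c - mean_count for c in counts])


def predict_count(x_new):
    return mean_count + gp_predict(features, alpha, x_new)


fitted = [predict_count(x) for x in features]
```

With such a small noise term the posterior mean nearly interpolates the training counts; a practical system would learn the length scale and noise from historical data, as the abstract's training stage suggests.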

Robust Uncalibrated Stereo Rectification with Constrained Geometric Distortions (USR-CGD)


Title	Robust Uncalibrated Stereo Rectification with Constrained Geometric Distortions (USR-CGD)
Authors	Hyunsuk Ko, Han Suk Shim, Ouk Choi, C. -C. Jay Kuo
Abstract	A novel algorithm for uncalibrated stereo image-pair rectification under the constraint of geometric distortion, called USR-CGD, is presented in this work. Although it is straightforward to define a rectifying transformation (or homography) given the epipolar geometry, many existing algorithms have unwanted geometric distortions as a side effect. To obtain rectified images with reduced geometric distortions while maintaining a small rectification error, we parameterize the homography by considering the influence of various kinds of geometric distortions. Next, we define several geometric measures and incorporate them into a new cost function for parameter optimization. Finally, we propose a constrained adaptive optimization scheme to allow a balanced performance between the rectification error and the geometric error. Extensive experimental results are provided to demonstrate the superb performance of the proposed USR-CGD method, which outperforms existing algorithms by a significant margin.
Tasks
Published	2016-03-31
URL	http://arxiv.org/abs/1603.09462v1
PDF	http://arxiv.org/pdf/1603.09462v1.pdf
PWC	https://paperswithcode.com/paper/robust-uncalibrated-stereo-rectification-with
Repo
Framework

Game-Theoretic Modeling of Driver and Vehicle Interactions for Verification and Validation of Autonomous Vehicle Control Systems


Title	Game-Theoretic Modeling of Driver and Vehicle Interactions for Verification and Validation of Autonomous Vehicle Control Systems
Authors	Nan Li, Dave Oyler, Mengxuan Zhang, Yildiray Yildiz, Ilya Kolmanovsky, Anouck Girard
Abstract	Autonomous driving has been the subject of increased interest in recent years both in industry and in academia. Serious efforts are being pursued to address legal, technical and logistical problems and make autonomous cars a viable option for everyday transportation. One significant challenge is the time and effort required for the verification and validation of the decision and control algorithms employed in these vehicles to ensure a safe and comfortable driving experience. Hundreds of thousands of miles of driving tests are required to achieve a well calibrated control system that is capable of operating an autonomous vehicle in an uncertain traffic environment where multiple interactions between vehicles and drivers simultaneously occur. Traffic simulators where these interactions can be modeled and represented with reasonable fidelity can help decrease the time and effort necessary for the development of the autonomous driving control algorithms by providing a venue where acceptable initial control calibrations can be achieved quickly and safely before actual road tests. In this paper, we present a game theoretic traffic model that can be used to 1) test and compare various autonomous vehicle decision and control systems and 2) calibrate the parameters of an existing control system. We demonstrate two example case studies, where, in the first case, we test and quantitatively compare two autonomous vehicle control systems in terms of their safety and performance, and, in the second case, we optimize the parameters of an autonomous vehicle control system, utilizing the proposed traffic model and simulation environment.
Tasks	Autonomous Driving
Published	2016-08-30
URL	http://arxiv.org/abs/1608.08589v1
PDF	http://arxiv.org/pdf/1608.08589v1.pdf
PWC	https://paperswithcode.com/paper/game-theoretic-modeling-of-driver-and-vehicle
Repo
Framework