July 29, 2019

Paper Group ANR 112

3D Based Landmark Tracker Using Superpixels Based Segmentation for Neuroscience and Biomechanics Studies

Title 3D Based Landmark Tracker Using Superpixels Based Segmentation for Neuroscience and Biomechanics Studies
Authors Omid Haji Maghsoudi, Andrew Spence
Abstract Examining locomotion has improved our basic understanding of motor control and aided in treating motor impairment. Mice and rats are premier models of human disease and increasingly the model systems of choice for basic neuroscience. High frame rates (250 Hz) are needed to quantify the kinematics of these running rodents. Manual tracking, especially for multiple markers, is time-consuming and becomes infeasible for large sample sizes. Therefore, the need for automatic segmentation of these markers has grown in recent years. Here, we address this need by presenting a method to segment the markers using the SLIC superpixel method. The 2D coordinates on the image plane are projected to a 3D domain using the direct linear transform (DLT), and a 3D Kalman filter is used to predict the position of the markers based on their speed and position in the previous frames. Finally, a probabilistic function is used to find the best match among superpixels. The method is evaluated on tracking scenarios of varying difficulty and achieves 95% correct labeling of markers.
Tasks
Published 2017-11-23
URL http://arxiv.org/abs/1711.08785v1
PDF http://arxiv.org/pdf/1711.08785v1.pdf
PWC https://paperswithcode.com/paper/3d-based-landmark-tracker-using-superpixels
Repo
Framework
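
The abstract mentions a 3D Kalman filter that predicts marker positions from the speed and position in previous frames. The prediction step of such a filter can be sketched as below, assuming a constant-velocity motion model; the state layout, the function name `predict_constant_velocity`, and the isotropic process noise value are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def predict_constant_velocity(state, cov, dt, process_noise=1e-3):
    """One Kalman prediction step for a 3D constant-velocity model.

    state: length-6 vector [x, y, z, vx, vy, vz]
    cov:   6x6 state covariance
    dt:    time between frames in seconds (1/250 for 250 Hz video)
    process_noise: hypothetical isotropic noise level (an assumption)
    """
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)          # position += velocity * dt
    Q = process_noise * np.eye(6)       # assumed process noise covariance
    new_state = F @ state               # predicted mean
    new_cov = F @ cov @ F.T + Q         # predicted covariance
    return new_state, new_cov

# Example: a marker at the origin moving at (1, 2, -1) units per second.
state = np.array([0.0, 0.0, 0.0, 1.0, 2.0, -1.0])
cov = np.eye(6)
pred, pred_cov = predict_constant_velocity(state, cov, dt=1 / 250)
```

In a tracker of this kind, the predicted position would then be compared against candidate superpixel centroids; that matching step is not shown here.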

Distance weighted discrimination of face images for gender classification

Title Distance weighted discrimination of face images for gender classification
Authors Mónica Benito, Eduardo García-Portugués, J. S. Marron, Daniel Peña
Abstract We illustrate the advantages of distance weighted discrimination for classification and feature extraction in a High Dimension Low Sample Size (HDLSS) situation. The HDLSS context is a gender classification problem of face images in which the dimension of the data is several orders of magnitude larger than the sample size. We compare distance weighted discrimination with Fisher’s linear discriminant, support vector machines, and principal component analysis by exploring their classification interpretation through insightful visuanimations and by examining the classifiers’ discriminant errors. This analysis enables us to make new contributions to the understanding of the drivers of human discrimination between males and females.
Tasks
Published 2017-06-15
URL http://arxiv.org/abs/1706.05029v1
PDF http://arxiv.org/pdf/1706.05029v1.pdf
PWC https://paperswithcode.com/paper/distance-weighted-discrimination-of-face
Repo
Framework

Malaria Likelihood Prediction By Effectively Surveying Households Using Deep Reinforcement Learning

Title Malaria Likelihood Prediction By Effectively Surveying Households Using Deep Reinforcement Learning
Authors Pranav Rajpurkar, Vinaya Polamreddi, Anusha Balakrishnan
Abstract We build a deep reinforcement learning (RL) agent that can predict the likelihood of an individual testing positive for malaria by asking questions about their household. The RL agent learns to determine which survey question to ask next and when to stop to make a prediction about their likelihood of malaria based on their responses hitherto. The agent incurs a small penalty for each question asked, and a large reward/penalty for making the correct/wrong prediction; it thus has to learn to balance the length of the survey with the accuracy of its final predictions. Our RL agent is a Deep Q-network that learns a policy directly from the responses to the questions, with an action defined for each possible survey question and for each possible prediction class. We focus on Kenya, where malaria is a massive health burden, and train the RL agent on a dataset of 6481 households from the Kenya Malaria Indicator Survey 2015. To investigate the importance of having survey questions be adaptive to responses, we compare our RL agent to a supervised learning (SL) baseline that fixes its set of survey questions a priori. We evaluate on prediction accuracy and on the number of survey questions asked on a holdout set and find that the RL agent is able to predict with 80% accuracy, using only 2.5 questions on average. In addition, the RL agent learns to survey adaptively to responses and is able to match the SL baseline in prediction accuracy while significantly reducing survey length.
Tasks
Published 2017-11-25
URL http://arxiv.org/abs/1711.09223v1
PDF http://arxiv.org/pdf/1711.09223v1.pdf
PWC https://paperswithcode.com/paper/malaria-likelihood-prediction-by-effectively
Repo
Framework
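
The reward structure described in the abstract (a small penalty per question, a large reward or penalty for the final prediction) can be sketched as follows. The numeric values and the function names are hypothetical; the abstract does not state the magnitudes used.

```python
# Hypothetical magnitudes; the abstract only says "small" and "large".
QUESTION_PENALTY = -0.05
CORRECT_REWARD = 1.0
WRONG_PENALTY = -1.0

def step_reward(action_kind, predicted=None, truth=None):
    """Reward for a single action: either ask a question or predict."""
    if action_kind == "ask":
        return QUESTION_PENALTY
    if action_kind == "predict":
        return CORRECT_REWARD if predicted == truth else WRONG_PENALTY
    raise ValueError(f"unknown action kind: {action_kind!r}")

def episode_return(num_questions, predicted, truth):
    """Undiscounted return of a survey that asks num_questions, then predicts."""
    asked = sum(step_reward("ask") for _ in range(num_questions))
    return asked + step_reward("predict", predicted, truth)

short_correct = episode_return(2, predicted=1, truth=1)
long_correct = episode_return(10, predicted=1, truth=1)
short_wrong = episode_return(2, predicted=0, truth=1)
```

With these values a short correct survey scores higher than a long correct one, which is the trade-off between survey length and accuracy that the agent has to learn.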

An Outlyingness Matrix for Multivariate Functional Data Classification

Title An Outlyingness Matrix for Multivariate Functional Data Classification
Authors Wenlin Dai, Marc G. Genton
Abstract The classification of multivariate functional data is an important task in scientific research. Unlike point-wise data, functional data are usually classified by their shapes rather than by their scales. We define an outlyingness matrix by extending directional outlyingness, an effective measure of the shape variation of curves that combines the direction of outlyingness with conventional depth. We propose two classifiers based on directional outlyingness and the outlyingness matrix, respectively. Our classifiers provide better performance compared with existing depth-based classifiers when applied on both univariate and multivariate functional data from simulation studies. We also test our methods on two data problems: speech recognition and gesture classification, and obtain results that are consistent with the findings from the simulated data.
Tasks Speech Recognition
Published 2017-04-09
URL http://arxiv.org/abs/1704.02568v2
PDF http://arxiv.org/pdf/1704.02568v2.pdf
PWC https://paperswithcode.com/paper/an-outlyingness-matrix-for-multivariate
Repo
Framework

Sub-Gaussian estimators of the mean of a random vector

Title Sub-Gaussian estimators of the mean of a random vector
Authors Gábor Lugosi, Shahar Mendelson
Abstract We study the problem of estimating the mean of a random vector $X$ given a sample of $N$ independent, identically distributed points. We introduce a new estimator that achieves a purely sub-Gaussian performance under the only condition that the second moment of $X$ exists. The estimator is based on a novel concept of a multivariate median.
Tasks
Published 2017-02-01
URL http://arxiv.org/abs/1702.00482v1
PDF http://arxiv.org/pdf/1702.00482v1.pdf
PWC https://paperswithcode.com/paper/sub-gaussian-estimators-of-the-mean-of-a
Repo
Framework
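
For intuition about median-based mean estimation, here is a minimal sketch of a coordinate-wise median-of-means estimator. This is a simpler, well-known relative and is not the estimator introduced in the paper, which relies on its own notion of a multivariate median; the function name and the block-splitting scheme are assumptions made for illustration.

```python
import numpy as np

def median_of_means(samples, num_blocks, seed=0):
    """Coordinate-wise median-of-means estimate of the mean.

    samples:    array of shape (N, d)
    num_blocks: number of disjoint blocks to average over
    """
    samples = np.asarray(samples, dtype=float)
    n = samples.shape[0]
    if not 1 <= num_blocks <= n:
        raise ValueError("num_blocks must be between 1 and the sample size")
    rng = np.random.default_rng(seed)
    shuffled = samples[rng.permutation(n)]            # random block assignment
    blocks = np.array_split(shuffled, num_blocks)     # disjoint blocks
    block_means = np.stack([b.mean(axis=0) for b in blocks])
    return np.median(block_means, axis=0)             # median per coordinate

# Nine points at the origin plus one extreme outlier.
data = np.zeros((10, 2))
data[0] = [1e6, 1e6]
robust = median_of_means(data, num_blocks=5)
naive = data.mean(axis=0)
```

The single outlier contaminates only one block mean, so the median over blocks ignores it, whereas the plain sample mean is pulled far from the origin.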

Counterfactual Fairness

Title Counterfactual Fairness
Authors Matt J. Kusner, Joshua R. Loftus, Chris Russell, Ricardo Silva
Abstract Machine learning can impact people with legal or ethical consequences when it is used to automate decisions in areas such as insurance, lending, hiring, and predictive policing. In many of these scenarios, previous decisions have been made that are unfairly biased against certain subpopulations, for example those of a particular race, gender, or sexual orientation. Since this past data may be biased, machine learning predictors must account for this to avoid perpetuating or creating discriminatory practices. In this paper, we develop a framework for modeling fairness using tools from causal inference. Our definition of counterfactual fairness captures the intuition that a decision is fair towards an individual if it is the same in (a) the actual world and (b) a counterfactual world where the individual belonged to a different demographic group. We demonstrate our framework on a real-world problem of fair prediction of success in law school.
Tasks Causal Inference
Published 2017-03-20
URL http://arxiv.org/abs/1703.06856v3
PDF http://arxiv.org/pdf/1703.06856v3.pdf
PWC https://paperswithcode.com/paper/counterfactual-fairness
Repo
Framework
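
The intuition in the abstract (same decision in the actual world and in a counterfactual world with a different demographic group) can be written as a condition on the predictor. The notation below is an assumption introduced for illustration: $A$ denotes the protected attribute, $X$ the remaining observed features, $U$ latent background variables, and $\hat{Y}_{A \leftarrow a}$ the prediction under the intervention that sets $A$ to $a$.

```latex
% Assumed notation: A protected attribute, X observed features,
% U latent background variables, \hat{Y} the predictor.
\[
  P\bigl(\hat{Y}_{A \leftarrow a}(U) = y \mid X = x, A = a\bigr)
  = P\bigl(\hat{Y}_{A \leftarrow a'}(U) = y \mid X = x, A = a\bigr)
  \quad \text{for all } y \text{ and all attainable } a'.
\]
```

Both sides condition on the same observed individual ($X = x$, $A = a$); only the intervened value of $A$ differs.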

Multi-view (Joint) Probability Linear Discrimination Analysis for Multi-view Feature Verification

Title Multi-view (Joint) Probability Linear Discrimination Analysis for Multi-view Feature Verification
Authors Ziqiang Shi, Liu Liu, Mengjiao Wang, Rujie Liu
Abstract Multi-view features have proved to be very effective in many multimedia applications. However, current back-end classifiers cannot make full use of such features. In this paper, we propose a method to model the multi-faceted information in multi-view features explicitly and jointly. In our approach, a feature is modeled as the output of a generative multi-view (joint) Probability Linear Discriminant Analysis (PLDA) model, which contains multiple kinds of latent variables. The usual PLDA model only considers a single label. However, in practical use, when a multi-task learned network serves as the feature extractor, the extracted features are always attached to several labels. This type of feature is called a multi-view feature. With multi-view (joint) PLDA, we are able to explicitly build a model that can combine multiple heterogeneous information from the multi-view features. In the verification step, we calculate the likelihood describing whether two features have consistent labels or not. This likelihood is used in the subsequent decision-making. Experiments have been conducted on a large-scale verification task. On the public RSR2015 data corpus, the results show that our approach can achieve 0.02% EER and 0.09% EER for the impostor-wrong and impostor-correct cases, respectively.
Tasks Decision Making
Published 2017-04-20
URL http://arxiv.org/abs/1704.06061v4
PDF http://arxiv.org/pdf/1704.06061v4.pdf
PWC https://paperswithcode.com/paper/multi-view-joint-probability-linear
Repo
Framework

Midgar: Detection of people through computer vision in the Internet of Things scenarios to improve the security in Smart Cities, Smart Towns, and Smart Homes

Title Midgar: Detection of people through computer vision in the Internet of Things scenarios to improve the security in Smart Cities, Smart Towns, and Smart Homes
Authors Cristian González García, Daniel Meana-Llorián, B. Cristina Pelayo G-Bustelo, Juan Manuel Cueva Lovelle, Néstor Garcia-Fernandez
Abstract Could we use Computer Vision in the Internet of Things to treat pictures as sensors? This is the principal hypothesis that we want to test. Currently, people use IP cameras to make areas, cities, or homes safe. However, such a system needs people who watch the camera images, review the recording after something has occurred, or check when the camera notifies them of movement. These are its disadvantages. Furthermore, there are many Smart Cities and Smart Homes around the world. This is why we thought of using the idea of the Internet of Things to automate the use of IP cameras. In our case, we propose analysing pictures with Computer Vision to detect people in them. With this analysis, we can determine whether the pictures contain people and handle the pictures as if they were sensors with two possible states. However, Computer Vision is a very complicated field. This is why we needed a second hypothesis: could we work with Computer Vision in the Internet of Things with good enough accuracy to automate or semi-automate this kind of event? Demonstrating these hypotheses required testing our Computer Vision module to check whether it could be used in a realistic environment with good accuracy. Our proposal, as a possible solution, is to analyse entire sequences instead of isolated pictures when using pictures as sensors in the Internet of Things.
Tasks
Published 2017-01-10
URL http://arxiv.org/abs/1701.02632v3
PDF http://arxiv.org/pdf/1701.02632v3.pdf
PWC https://paperswithcode.com/paper/midgar-detection-of-people-through-computer
Repo
Framework

Modeling the Intra-class Variability for Liver Lesion Detection using a Multi-class Patch-based CNN

Title Modeling the Intra-class Variability for Liver Lesion Detection using a Multi-class Patch-based CNN
Authors Maayan Frid-Adar, Idit Diamant, Eyal Klang, Michal Amitai, Jacob Goldberger, Hayit Greenspan
Abstract Automatic detection of liver lesions in CT images poses a great challenge for researchers. In this work we present a deep learning approach that models explicitly the variability within the non-lesion class, based on prior knowledge of the data, to support an automated lesion detection system. A multi-class convolutional neural network (CNN) is proposed to categorize input image patches into sub-categories of boundary and interior patches, the decisions of which are fused to reach a binary lesion vs non-lesion decision. For validation of our system, we use CT images of 132 livers and 498 lesions. Our approach shows highly improved detection results that outperform the state-of-the-art fully convolutional network. Automated computerized tools, as shown in this work, have the potential in the future to support the radiologists towards improved detection.
Tasks
Published 2017-07-19
URL http://arxiv.org/abs/1707.06053v2
PDF http://arxiv.org/pdf/1707.06053v2.pdf
PWC https://paperswithcode.com/paper/modeling-the-intra-class-variability-for
Repo
Framework

Unifying Map and Landmark Based Representations for Visual Navigation

Title Unifying Map and Landmark Based Representations for Visual Navigation
Authors Saurabh Gupta, David Fouhey, Sergey Levine, Jitendra Malik
Abstract This work presents a formulation for visual navigation that unifies map based spatial reasoning and path planning, with landmark based robust plan execution in noisy environments. Our proposed formulation is learned from data and is thus able to leverage statistical regularities of the world. This allows it to efficiently navigate in novel environments given only a sparse set of registered images as input for building representations for space. Our formulation is based on three key ideas: a learned path planner that outputs path plans to reach the goal, a feature synthesis engine that predicts features for locations along the planned path, and a learned goal-driven closed loop controller that can follow plans given these synthesized features. We test our approach for goal-driven navigation in simulated real world environments and report performance gains over competitive baseline approaches.
Tasks Visual Navigation
Published 2017-12-21
URL http://arxiv.org/abs/1712.08125v1
PDF http://arxiv.org/pdf/1712.08125v1.pdf
PWC https://paperswithcode.com/paper/unifying-map-and-landmark-based
Repo
Framework

3D Facial Expression Reconstruction using Cascaded Regression

Title 3D Facial Expression Reconstruction using Cascaded Regression
Authors Fanzi Wu, Songnan Li, Tianhao Zhao, King Ngi Ngan, Lv Sheng
Abstract This paper proposes a novel model fitting algorithm for 3D facial expression reconstruction from a single image. Face expression reconstruction from a single image is a challenging task in computer vision. Most state-of-the-art methods fit the input image to a 3D Morphable Model (3DMM). These methods need to solve a stochastic problem and cannot deal with expression and pose variations. To solve this problem, we adopt a 3D face expression model and use a combined feature which is robust to scale, rotation and different lighting conditions. The proposed method applies a cascaded regression framework to estimate parameters for the 3DMM. 2D landmarks are detected and used to initialize the 3D shape and mapping matrices. In each iteration, residues between the current 3DMM parameters and the ground truth are estimated and then used to update the 3D shapes. The mapping matrices are also calculated based on the updated shapes and 2D landmarks. HOG features of the local patches and displacements between 3D landmark projections and 2D landmarks are exploited. Compared with existing methods, the proposed method is robust to expression and pose changes and can reconstruct higher fidelity 3D face shape.
Tasks
Published 2017-12-10
URL http://arxiv.org/abs/1712.03491v2
PDF http://arxiv.org/pdf/1712.03491v2.pdf
PWC https://paperswithcode.com/paper/3d-facial-expression-reconstruction-using
Repo
Framework

Normal Integration: A Survey

Title Normal Integration: A Survey
Authors Yvain Quéau, Jean-Denis Durou, Jean-François Aujol
Abstract The need for efficient normal integration methods is driven by several computer vision tasks such as shape-from-shading, photometric stereo, deflectometry, etc. In the first part of this survey, we select the most important properties that one may expect from a normal integration method, based on a thorough study of two pioneering works by Horn and Brooks [28] and by Frankot and Chellappa [19]. Apart from accuracy, an integration method should at least be fast and robust to a noisy normal field. In addition, it should be able to handle several types of boundary condition, including the case of a free boundary, and a reconstruction domain of any shape, i.e., one that is not necessarily rectangular. It is also desirable that only a minimal number of parameters need to be tuned, or even none at all. Finally, it should preserve the depth discontinuities. In the second part of this survey, we review most of the existing methods in view of this analysis, and conclude that none of them satisfies all of the required properties. This work is complemented by a companion paper entitled Variational Methods for Normal Integration, in which we focus on the problem of normal integration in the presence of depth discontinuities, a problem which occurs as soon as there are occlusions.
Tasks
Published 2017-09-18
URL http://arxiv.org/abs/1709.05940v1
PDF http://arxiv.org/pdf/1709.05940v1.pdf
PWC https://paperswithcode.com/paper/normal-integration-a-survey
Repo
Framework
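
For readers unfamiliar with the problem, the standard least-squares formulation of normal integration can be sketched as below. The setup is an assumption made for illustration and is not quoted from the survey: a depth map $z(x, y)$ under orthographic projection, a unit normal $\mathbf{n} = (n_1, n_2, n_3)$ with $n_3 > 0$, and a reconstruction domain $\Omega$.

```latex
% Assumed convention: orthographic projection, n_3 > 0.
\[
  \mathbf{n} = \frac{(-\partial_x z,\; -\partial_y z,\; 1)}{\sqrt{1 + \|\nabla z\|^2}}
  \quad\Longrightarrow\quad
  p := \partial_x z = -\frac{n_1}{n_3}, \qquad
  q := \partial_y z = -\frac{n_2}{n_3}.
\]
% Least-squares integration of the gradient field (p, q) over \Omega:
\[
  \min_{z} \iint_{\Omega} \bigl\| \nabla z(x, y) - (p, q)(x, y) \bigr\|^2 \, dx \, dy
  \quad\Longrightarrow\quad
  \Delta z = \partial_x p + \partial_y q \quad \text{in } \Omega .
\]
```

The Poisson equation on the right is the Euler-Lagrange equation of the quadratic functional; with a free boundary it comes with the natural condition $(\nabla z - (p, q)) \cdot \boldsymbol{\nu} = 0$ on $\partial\Omega$, where $\boldsymbol{\nu}$ is the outward normal to the boundary. The solution is determined only up to an additive constant.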

Quantifying the relation between performance and success in soccer

Title Quantifying the relation between performance and success in soccer
Authors Luca Pappalardo, Paolo Cintia
Abstract The availability of massive data about sports activities now offers the opportunity to quantify the relation between performance and success. In this study, we analyze more than 6,000 games and 10 million events in six European leagues and investigate this relation in soccer competitions. We discover that a team’s position in a competition’s final ranking is significantly related to its typical performance, as described by a set of technical features extracted from the soccer data. Moreover, we find that, while victories and defeats can be explained by the team’s performance during a game, it is difficult to detect draws by using a machine learning approach. We then simulate the outcomes of an entire season of each league relying only on technical data, i.e. excluding the goals scored, by exploiting a machine learning model trained on data from past seasons. The simulation produces a team ranking (the PC ranking) which is close to the actual ranking, suggesting that a complex systems view of soccer has the potential to reveal hidden patterns regarding the relation between performance and success.
Tasks
Published 2017-05-02
URL http://arxiv.org/abs/1705.00885v3
PDF http://arxiv.org/pdf/1705.00885v3.pdf
PWC https://paperswithcode.com/paper/quantifying-the-relation-between-performance
Repo
Framework

Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting

Title Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting
Authors Sercan O. Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Chris Fougner, Ryan Prenger, Adam Coates
Abstract Keyword spotting (KWS) constitutes a major component of human-technology interfaces. Maximizing the detection accuracy at a low false alarm (FA) rate, while minimizing the footprint size, latency and complexity are the goals for KWS. Towards achieving them, we study Convolutional Recurrent Neural Networks (CRNNs). Inspired by large-scale state-of-the-art speech recognition systems, we combine the strengths of convolutional layers and recurrent layers to exploit local structure and long-range context. We analyze the effect of architecture parameters, and propose training strategies to improve performance. With only ~230k parameters, our CRNN model yields acceptably low latency, and achieves 97.71% accuracy at 0.5 FA/hour for 5 dB signal-to-noise ratio.
Tasks Keyword Spotting, Small-Footprint Keyword Spotting, Speech Recognition
Published 2017-03-15
URL http://arxiv.org/abs/1703.05390v3
PDF http://arxiv.org/pdf/1703.05390v3.pdf
PWC https://paperswithcode.com/paper/convolutional-recurrent-neural-networks-for-2
Repo
Framework

Path-following based Point Matching using Similarity Transformation

Title Path-following based Point Matching using Similarity Transformation
Authors Wei Lian
Abstract To address the problem of 3D point matching where the poses of two point sets are unknown, we adapt a recently proposed path following based method to use similarity transformation instead of the original affine transformation. The reduced number of transformation parameters leads to more constrained and desirable matching results. Experimental results demonstrate better robustness of the proposed method over state-of-the-art methods.
Tasks
Published 2017-01-04
URL http://arxiv.org/abs/1701.01035v1
PDF http://arxiv.org/pdf/1701.01035v1.pdf
PWC https://paperswithcode.com/paper/path-following-based-point-matching-using
Repo
Framework
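
The abstract contrasts similarity and affine transformations by their number of parameters. In 3D, a similarity transform has 7 degrees of freedom (1 scale, 3 rotation, 3 translation), while a general affine transform has 12 (a 3x3 matrix plus a translation). The sketch below applies a similarity transform to a point set; the function names are hypothetical and the code illustrates the transformation class only, not the paper's matching algorithm.

```python
import numpy as np

SIMILARITY_PARAMS_3D = 1 + 3 + 3   # scale + rotation + translation
AFFINE_PARAMS_3D = 9 + 3           # full 3x3 matrix + translation

def rotation_about_z(theta):
    """3x3 rotation matrix for an angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def similarity_transform(points, scale, rotation, translation):
    """Apply y = scale * R @ x + t to each row x of an (N, 3) array."""
    points = np.asarray(points, dtype=float)
    return scale * points @ rotation.T + translation

R = rotation_about_z(np.pi / 2)
pts = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moved = similarity_transform(pts, 2.0, R, np.array([1.0, 1.0, 1.0]))
```

Because a similarity transform scales all distances by the same factor, the ratio of pairwise distances is preserved, which is one way to see why it is more constrained than an affine map.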