Paper Group ANR 300
Calibration of depth cameras using denoised depth images
Title | Calibration of depth cameras using denoised depth images |
Authors | Ramanpreet Singh Pahwa, Minh N. Do, Tian Tsong Ng, Binh-Son Hua |
Abstract | Depth sensing devices have created various new applications in scientific and commercial research with the advent of Microsoft Kinect and PMD (Photon Mixing Device) cameras. Most of these applications require the depth cameras to be pre-calibrated. However, traditional calibration methods using a checkerboard do not work very well for depth cameras due to the low image resolution. In this paper, we propose a depth calibration scheme which excels in estimating camera calibration parameters when only a handful of corners and calibration images are available. We exploit the noise properties of PMD devices to denoise depth measurements and perform camera calibration using the denoised depth as an additional set of measurements. Our synthetic and real experiments show that our depth denoising and depth based calibration scheme provides significantly better results than traditional calibration methods. |
Tasks | Calibration, Denoising |
Published | 2017-09-08 |
URL | http://arxiv.org/abs/1709.02635v1 |
http://arxiv.org/pdf/1709.02635v1.pdf | |
PWC | https://paperswithcode.com/paper/calibration-of-depth-cameras-using-denoised |
Repo | |
Framework | |
Deep-Learning Convolutional Neural Networks for scattered shrub detection with Google Earth Imagery
Title | Deep-Learning Convolutional Neural Networks for scattered shrub detection with Google Earth Imagery |
Authors | Emilio Guirado, Siham Tabik, Domingo Alcaraz-Segura, Javier Cabello, Francisco Herrera |
Abstract | There is a growing demand for accurate high-resolution land cover maps in many fields, e.g., in land-use planning and biodiversity conservation. Developing such maps has been performed using Object-Based Image Analysis (OBIA) methods, which usually reach good accuracies but require a high degree of human supervision, and the best configuration for one image can hardly be extrapolated to a different image. Recently, deep learning Convolutional Neural Networks (CNNs) have shown outstanding results in object recognition in the field of computer vision. However, they have not yet been fully explored in land cover mapping for detecting species of high biodiversity conservation interest. This paper analyzes the potential of CNN-based methods for plant species detection using free high-resolution Google Earth™ images and provides an objective comparison with the state-of-the-art OBIA methods. We consider as a case study the detection of Ziziphus lotus shrubs, which are protected as a priority habitat under the European Union Habitats Directive. According to our results, compared to OBIA-based methods, the proposed CNN-based detection model, in combination with data augmentation, transfer learning and pre-processing, achieves higher performance with less human intervention, and the knowledge it acquires in the first image can be transferred to other images, which makes the detection process very fast. The provided methodology can be systematically reproduced for other species detection. |
Tasks | Data Augmentation, Object Recognition, Transfer Learning |
Published | 2017-06-03 |
URL | http://arxiv.org/abs/1706.00917v1 |
http://arxiv.org/pdf/1706.00917v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-convolutional-neural-networks |
Repo | |
Framework | |
Gait Recognition from Motion Capture Data
Title | Gait Recognition from Motion Capture Data |
Authors | Michal Balazia, Petr Sojka |
Abstract | Gait recognition from motion capture data, as a pattern classification discipline, can be improved by the use of machine learning. This paper contributes to the state-of-the-art with a statistical approach for extracting robust gait features directly from raw data by a modification of Linear Discriminant Analysis with Maximum Margin Criterion. Experiments on the CMU MoCap database show that the suggested method outperforms thirteen relevant methods based on geometric features and a method to learn the features by a combination of Principal Component Analysis and Linear Discriminant Analysis. The methods are evaluated in terms of the distribution of biometric templates in respective feature spaces expressed in a number of class separability coefficients and classification metrics. Results also indicate a high portability of learned features; that is, we can learn in which aspects of walking people generally differ and extract those as general gait features. Recognizing people without needing group-specific features is convenient as particular people might not always provide annotated learning data. As a contribution to reproducible research, our evaluation framework and database have been made publicly available. This research makes motion capture technology directly applicable for human recognition. |
Tasks | Gait Recognition, Motion Capture |
Published | 2017-08-24 |
URL | http://arxiv.org/abs/1708.07755v1 |
http://arxiv.org/pdf/1708.07755v1.pdf | |
PWC | https://paperswithcode.com/paper/gait-recognition-from-motion-capture-data |
Repo | |
Framework | |
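For readers unfamiliar with the criterion named in the abstract above, the standard Maximum Margin Criterion can be written out as follows. This is the textbook form only; the symbols $S_B$, $S_W$ and $W$ are notation assumed here, and the paper's own modification may differ.

```latex
% S_B: between-class scatter matrix, S_W: within-class scatter matrix
% (both computed from the labelled training samples; notation assumed here).
\begin{align}
  J(W) &= \operatorname{tr}\!\left( W^\top (S_B - S_W)\, W \right),
  \qquad W^\top W = I, \\
  (S_B - S_W)\, w_k &= \lambda_k\, w_k,
  \qquad \lambda_1 \ge \lambda_2 \ge \dots
\end{align}
% J(W) is maximized by taking the columns of W to be the eigenvectors w_k
% of S_B - S_W with the largest eigenvalues.
```

Unlike the classical Fisher criterion, this form does not require inverting $S_W$, which is one commonly cited reason for preferring it when the within-class scatter is singular.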
GSLAM: Initialization-robust Monocular Visual SLAM via Global Structure-from-Motion
Title | GSLAM: Initialization-robust Monocular Visual SLAM via Global Structure-from-Motion |
Authors | Chengzhou Tang, Oliver Wang, Ping Tan |
Abstract | Many monocular visual SLAM algorithms are derived from incremental structure-from-motion (SfM) methods. This work proposes a novel monocular SLAM method which integrates recent advances made in global SfM. In particular, we present two main contributions to visual SLAM. First, we solve the visual odometry problem by a novel rank-1 matrix factorization technique which is more robust to the errors in map initialization. Second, we adopt a recent global SfM method for the pose-graph optimization, which leads to a multi-stage linear formulation and enables L1 optimization for better robustness to false loops. The combination of these two approaches generates more robust reconstruction and is significantly faster (4X) than recent state-of-the-art SLAM systems. We also present a new dataset recorded with ground truth camera motion in a Vicon motion capture room, and compare our method to prior systems on it and established benchmark datasets. |
Tasks | Motion Capture, Visual Odometry |
Published | 2017-08-16 |
URL | http://arxiv.org/abs/1708.04814v3 |
http://arxiv.org/pdf/1708.04814v3.pdf | |
PWC | https://paperswithcode.com/paper/gslam-initialization-robust-monocular-visual |
Repo | |
Framework | |
What Will I Do Next? The Intention from Motion Experiment
Title | What Will I Do Next? The Intention from Motion Experiment |
Authors | Andrea Zunino, Jacopo Cavazza, Atesh Koul, Andrea Cavallo, Cristina Becchio, Vittorio Murino |
Abstract | In computer vision, video-based approaches have been widely explored for the early classification and the prediction of actions or activities. However, it remains unclear whether this modality (as compared to 3D kinematics) can still be reliable for the prediction of human intentions, defined as the overarching goal embedded in an action sequence. Since the same action can be performed with different intentions, this problem is more challenging yet still feasible, as shown by quantitative cognitive studies which exploit the 3D kinematics acquired through motion capture systems. In this paper, we bridge cognitive and computer vision studies by demonstrating the effectiveness of video-based approaches for the prediction of human intentions. Precisely, we propose Intention from Motion, a new paradigm where, without using any contextual information, we consider instantaneous grasping motor acts involving a bottle in order to forecast why the bottle itself has been reached (to pass it or to place it in a box, or to pour or to drink the liquid inside). We process only the grasping onsets, casting intention prediction as a classification framework. Leveraging our multimodal acquisition (3D motion capture data and 2D optical videos), we compare the most commonly used 3D descriptors from cognitive studies with state-of-the-art video-based techniques. Since the two analyses achieve an equivalent performance, we demonstrate that computer vision tools are effective in capturing the kinematics and facing the cognitive problem of human intention prediction. |
Tasks | Motion Capture |
Published | 2017-08-03 |
URL | http://arxiv.org/abs/1708.01034v1 |
http://arxiv.org/pdf/1708.01034v1.pdf | |
PWC | https://paperswithcode.com/paper/what-will-i-do-next-the-intention-from-motion |
Repo | |
Framework | |
Latent Gaussian Process Regression
Title | Latent Gaussian Process Regression |
Authors | Erik Bodin, Neill D. F. Campbell, Carl Henrik Ek |
Abstract | We introduce Latent Gaussian Process Regression which is a latent variable extension allowing modelling of non-stationary multi-modal processes using GPs. The approach is built on extending the input space of a regression problem with a latent variable that is used to modulate the covariance function over the training data. We show how our approach can be used to model multi-modal and non-stationary processes. We exemplify the approach on a set of synthetic data and provide results on real data from motion capture and geostatistics. |
Tasks | Motion Capture |
Published | 2017-07-18 |
URL | http://arxiv.org/abs/1707.05534v2 |
http://arxiv.org/pdf/1707.05534v2.pdf | |
PWC | https://paperswithcode.com/paper/latent-gaussian-process-regression |
Repo | |
Framework | |
A deep learning model integrating FCNNs and CRFs for brain tumor segmentation
Title | A deep learning model integrating FCNNs and CRFs for brain tumor segmentation |
Authors | Xiaomei Zhao, Yihong Wu, Guidong Song, Zhenye Li, Yazhuo Zhang, Yong Fan |
Abstract | Accurate and reliable brain tumor segmentation is a critical component in cancer diagnosis, treatment planning, and treatment outcome evaluation. Building upon successful deep learning techniques, a novel brain tumor segmentation method is developed by integrating fully convolutional neural networks (FCNNs) and Conditional Random Fields (CRFs) in a unified framework to obtain segmentation results with appearance and spatial consistency. We train a deep learning based segmentation model using 2D image patches and image slices in the following steps: 1) training FCNNs using image patches; 2) training CRFs as Recurrent Neural Networks (CRF-RNN) using image slices with parameters of FCNNs fixed; and 3) fine-tuning the FCNNs and the CRF-RNN using image slices. In particular, we train 3 segmentation models using 2D image patches and slices obtained in axial, coronal and sagittal views respectively, and combine them to segment brain tumors using a voting based fusion strategy. Our method could segment brain images slice-by-slice, much faster than those based on image patches. We have evaluated our method based on imaging data provided by the Multimodal Brain Tumor Image Segmentation Challenge (BRATS) 2013, BRATS 2015 and BRATS 2016. The experimental results have demonstrated that our method could build a segmentation model with Flair, T1c, and T2 scans and achieve performance competitive with those built with Flair, T1, T1c, and T2 scans. |
Tasks | Brain Tumor Segmentation, Semantic Segmentation |
Published | 2017-02-15 |
URL | http://arxiv.org/abs/1702.04528v3 |
http://arxiv.org/pdf/1702.04528v3.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-learning-model-integrating-fcnns-and |
Repo | |
Framework | |
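The abstract above mentions combining the axial, coronal and sagittal models with a voting based fusion strategy. A minimal sketch of one such rule, per-voxel majority voting, is given below. The function name `fuse_by_vote`, the flat label lists, and the tie-breaking rule (fall back to background label 0 when all three views disagree) are assumptions made for illustration; the abstract does not specify the exact rule.

```python
from collections import Counter


def fuse_by_vote(axial, coronal, sagittal, background=0):
    """Fuse three per-voxel label sequences by majority vote.

    Each argument is a flat sequence of integer labels, one per voxel,
    produced by the model trained on that view. The tie-breaking rule
    (use `background` when no label gets at least two votes) is an
    assumption of this sketch, not taken from the paper.
    """
    if not (len(axial) == len(coronal) == len(sagittal)):
        raise ValueError("all three label sequences must have the same length")

    fused = []
    for votes in zip(axial, coronal, sagittal):
        # most_common(1) returns [(label, count)] for the top-voted label.
        label, count = Counter(votes).most_common(1)[0]
        fused.append(label if count >= 2 else background)
    return fused


if __name__ == "__main__":
    # Three voxels: clear majority, 2-of-3 majority, and full disagreement.
    print(fuse_by_vote([1, 0, 2], [1, 2, 3], [0, 2, 4]))
```

In practice the label volumes would be 3D arrays; flat lists are used here only to keep the sketch dependency-free.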
Generic Black-Box End-to-End Attack Against State of the Art API Call Based Malware Classifiers
Title | Generic Black-Box End-to-End Attack Against State of the Art API Call Based Malware Classifiers |
Authors | Ishai Rosenberg, Asaf Shabtai, Lior Rokach, Yuval Elovici |
Abstract | In this paper, we present a black-box attack against API call based machine learning malware classifiers, focusing on generating adversarial sequences combining API calls and static features (e.g., printable strings) that will be misclassified by the classifier without affecting the malware functionality. We show that this attack is effective against many classifiers due to the transferability principle between RNN variants, feed forward DNNs, and traditional machine learning classifiers such as SVM. We also implement GADGET, a software framework to convert any malware binary to a binary undetected by malware classifiers, using the proposed attack, without access to the malware source code. |
Tasks | |
Published | 2017-07-19 |
URL | http://arxiv.org/abs/1707.05970v5 |
http://arxiv.org/pdf/1707.05970v5.pdf | |
PWC | https://paperswithcode.com/paper/generic-black-box-end-to-end-attack-against |
Repo | |
Framework | |
Hot or not? Forecasting cellular network hot spots using sector performance indicators
Title | Hot or not? Forecasting cellular network hot spots using sector performance indicators |
Authors | Joan Serrà, Ilias Leontiadis, Alexandros Karatzoglou, Konstantina Papagiannaki |
Abstract | To manage and maintain large-scale cellular networks, operators need to know which sectors underperform at any given time. For this purpose, they use the so-called hot spot score, which is the result of a combination of multiple network measurements and reflects the instantaneous overall performance of individual sectors. While operators have a good understanding of the current performance of a network and its overall trend, forecasting the performance of each sector over time is a challenging task, as it is affected by both regular and non-regular events, triggered by human behavior and hardware failures. In this paper, we study the spatio-temporal patterns of the hot spot score and uncover its regularities. Based on our observations, we then explore the possibility of using the recent measurement history to predict future hot spots. To this end, we consider tree-based machine learning models, and study their performance as a function of time, amount of past data, and prediction horizon. Our results indicate that, compared to the best baseline, tree-based models can deliver up to 14% better forecasts for regular hot spots and 153% better forecasts for non-regular hot spots. The latter provides strong evidence that, for moderate horizons, forecasts can be made even for sectors exhibiting isolated, non-regular behavior. Overall, our work provides insight into the dynamics of cellular sectors and their predictability. It also paves the way for more proactive network operations with greater forecasting horizons. |
Tasks | |
Published | 2017-04-18 |
URL | http://arxiv.org/abs/1704.05249v1 |
http://arxiv.org/pdf/1704.05249v1.pdf | |
PWC | https://paperswithcode.com/paper/hot-or-not-forecasting-cellular-network-hot |
Repo | |
Framework | |
Actionable Email Intent Modeling with Reparametrized RNNs
Title | Actionable Email Intent Modeling with Reparametrized RNNs |
Authors | Chu-Cheng Lin, Dongyeop Kang, Michael Gamon, Madian Khabsa, Ahmed Hassan Awadallah, Patrick Pantel |
Abstract | Emails in the workplace are often intentional calls to action for their recipients. We propose to annotate these emails for the action the recipient will take. We argue that our approach of action-based annotation is more scalable and theory-agnostic than traditional speech-act-based email intent annotation, while still carrying important semantic and pragmatic information. We show that our action-based annotation scheme achieves good inter-annotator agreement. We also show that we can leverage threaded messages from other domains, which exhibit comparable intents in their conversation, with domain adaptive RAINBOW (Recurrently AttentIve Neural Bag-Of-Words). On a collection of datasets consisting of IRC, Reddit, and email, our reparametrized RNNs outperform common multitask/multidomain approaches on several speech act related tasks. We also experiment with a minimally supervised scenario of email recipient action classification, and find the reparametrized RNNs learn a useful representation. |
Tasks | Action Classification |
Published | 2017-12-26 |
URL | http://arxiv.org/abs/1712.09185v1 |
http://arxiv.org/pdf/1712.09185v1.pdf | |
PWC | https://paperswithcode.com/paper/actionable-email-intent-modeling-with |
Repo | |
Framework | |
“Let me convince you to buy my product …”: A Case Study of an Automated Persuasive System for Fashion Products
Title | “Let me convince you to buy my product …”: A Case Study of an Automated Persuasive System for Fashion Products |
Authors | Vitobha Munigala, Srikanth Tamilselvam, Anush Sankaran |
Abstract | Persuasiveness is a creative art aimed at making people believe in a certain set of beliefs. Many a time, such creativity is about adapting the richness of one domain into another to strike a chord with the target audience. In this research, we present PersuAIDE! - a persuasive system based on linguistic creativity that transforms a given sentence to generate various forms of persuading sentences. These various forms cover multiple foci of persuasion such as memorability and sentiment. For a given simple product line, the algorithm is composed of several steps including: (i) select an appropriate well-known expression for the target domain to add memorability, (ii) identify keywords and entities in the given sentence and expression and transform it to produce a creative persuading sentence, and (iii) add positive or negative sentiment for further persuasion. The persuasive conversions were manually verified using qualitative results and the effectiveness of the proposed approach is empirically discussed. |
Tasks | |
Published | 2017-09-25 |
URL | http://arxiv.org/abs/1709.08366v1 |
http://arxiv.org/pdf/1709.08366v1.pdf | |
PWC | https://paperswithcode.com/paper/let-me-convince-you-to-buy-my-product-a-case |
Repo | |
Framework | |
Machine Learning Models that Remember Too Much
Title | Machine Learning Models that Remember Too Much |
Authors | Congzheng Song, Thomas Ristenpart, Vitaly Shmatikov |
Abstract | Machine learning (ML) is becoming a commodity. Numerous ML frameworks and services are available to data holders who are not ML experts but want to train predictive models on their data. It is important that ML models trained on sensitive inputs (e.g., personal images or documents) not leak too much information about the training data. We consider a malicious ML provider who supplies model-training code to the data holder, does not observe the training, but then obtains white- or black-box access to the resulting model. In this setting, we design and implement practical algorithms, some of them very similar to standard ML techniques such as regularization and data augmentation, that “memorize” information about the training dataset in the model yet the model is as accurate and predictive as a conventionally trained model. We then explain how the adversary can extract memorized information from the model. We evaluate our techniques on standard ML tasks for image classification (CIFAR10), face recognition (LFW and FaceScrub), and text analysis (20 Newsgroups and IMDB). In all cases, we show how our algorithms create models that have high predictive power yet allow accurate extraction of subsets of their training data. |
Tasks | Data Augmentation, Face Recognition, Image Classification |
Published | 2017-09-22 |
URL | http://arxiv.org/abs/1709.07886v1 |
http://arxiv.org/pdf/1709.07886v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-models-that-remember-too |
Repo | |
Framework | |
Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
Title | Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations |
Authors | Yuanzhi Li, Tengyu Ma, Hongyang Zhang |
Abstract | We show that the gradient descent algorithm provides an implicit regularization effect in the learning of over-parameterized matrix factorization models and one-hidden-layer neural networks with quadratic activations. Concretely, we show that given $\tilde{O}(dr^{2})$ random linear measurements of a rank $r$ positive semidefinite matrix $X^{\star}$, we can recover $X^{\star}$ by parameterizing it by $UU^\top$ with $U\in \mathbb R^{d\times d}$ and minimizing the squared loss, even if $r \ll d$. We prove that starting from a small initialization, gradient descent recovers $X^{\star}$ in $\tilde{O}(\sqrt{r})$ iterations approximately. The results solve the conjecture of Gunasekar et al. '17 under the restricted isometry property. The technique can be applied to analyzing neural networks with one-hidden-layer quadratic activations with some technical modifications. |
Tasks | |
Published | 2017-12-26 |
URL | http://arxiv.org/abs/1712.09203v5 |
http://arxiv.org/pdf/1712.09203v5.pdf | |
PWC | https://paperswithcode.com/paper/algorithmic-regularization-in-over |
Repo | |
Framework | |
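The setup summarized in the abstract above can be written out explicitly. The following is a sketch under stated assumptions: the measurement matrices $A_i$, the number of measurements $m$, the step size $\eta$, and the $1/(2m)$ normalization of the loss are notation chosen here for illustration and are not taken from the abstract.

```latex
% Assumed notation: A_i are the random measurement matrices, m their number,
% \eta > 0 the step size; X^\star is the rank-r PSD matrix to recover.
\begin{align}
  y_i &= \langle A_i, X^{\star} \rangle, \qquad i = 1, \dots, m, \\
  f(U) &= \frac{1}{2m} \sum_{i=1}^{m}
          \left( \langle A_i, UU^\top \rangle - y_i \right)^2,
          \qquad U \in \mathbb{R}^{d \times d}, \\
  \nabla f(U) &= \frac{1}{m} \sum_{i=1}^{m}
          \left( \langle A_i, UU^\top \rangle - y_i \right)
          \left( A_i + A_i^\top \right) U, \\
  U_{t+1} &= U_t - \eta\, \nabla f(U_t),
          \qquad \| U_0 \| \text{ small.}
\end{align}
```

The gradient line follows from $\partial \langle A, UU^\top \rangle / \partial U = (A + A^\top) U$. The point of the result is that, although $U$ is a full $d \times d$ matrix and nothing in $f$ penalizes rank, the iterates $U_t U_t^\top$ started from a small $U_0$ approach the low-rank $X^{\star}$.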
Faster Monte-Carlo Algorithms for Fixation Probability of the Moran Process on Undirected Graphs
Title | Faster Monte-Carlo Algorithms for Fixation Probability of the Moran Process on Undirected Graphs |
Authors | Krishnendu Chatterjee, Rasmus Ibsen-Jensen, Martin A. Nowak |
Abstract | Evolutionary graph theory studies the evolutionary dynamics in a population structure given as a connected graph. Each node of the graph represents an individual of the population, and edges determine how offspring are placed. We consider the classical birth-death Moran process where there are two types of individuals, namely, the residents with fitness 1 and mutants with fitness r. The fitness indicates the reproductive strength. The evolutionary dynamics proceed as follows: in the initial step, a mutant is introduced into a population of all resident individuals, and then at each step, an individual is chosen proportional to the fitness of its type to reproduce, and the offspring replaces a neighbor uniformly at random. The process stops when all individuals are either residents or mutants. The probability that all individuals in the end are mutants is called the fixation probability. We present faster polynomial-time Monte-Carlo algorithms for finding the fixation probability on undirected graphs. Our algorithms are always at least a factor O(n^2/log n) faster than the previous algorithms, where n is the number of nodes, and are polynomial even if r is given in binary. We also present lower bounds showing that the upper bound on the expected number of effective steps we present is asymptotically tight for undirected graphs. |
Tasks | |
Published | 2017-06-21 |
URL | http://arxiv.org/abs/1706.06931v1 |
http://arxiv.org/pdf/1706.06931v1.pdf | |
PWC | https://paperswithcode.com/paper/faster-monte-carlo-algorithms-for-fixation |
Repo | |
Framework | |
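The birth-death Moran process described in the abstract above can be illustrated with a naive Monte-Carlo simulation. This is a minimal sketch of the baseline process only, not the faster algorithms the paper proposes. The function names, the adjacency-list representation, and the choice to place the initial mutant uniformly at random are assumptions made for illustration.

```python
import random


def simulate_moran(adj, r, rng):
    """Run one birth-death Moran process; return True if mutants fixate.

    adj: adjacency lists of a connected undirected graph (adj[v] lists the
         neighbours of node v).
    r:   mutant fitness (residents have fitness 1).
    The initial mutant is placed uniformly at random, which is an
    assumption of this sketch.
    """
    n = len(adj)
    mutant = [False] * n
    mutant[rng.randrange(n)] = True
    count = 1
    while 0 < count < n:
        # Pick the reproducing individual proportionally to its fitness.
        weights = [r if is_mutant else 1.0 for is_mutant in mutant]
        parent = rng.choices(range(n), weights=weights, k=1)[0]
        # The offspring replaces a uniformly random neighbour of the parent.
        child = rng.choice(adj[parent])
        if mutant[child] != mutant[parent]:
            count += 1 if mutant[parent] else -1
            mutant[child] = mutant[parent]
    return count == n


def estimate_fixation(adj, r, trials, seed=0):
    """Estimate the fixation probability as the fraction of fixating runs."""
    rng = random.Random(seed)
    wins = sum(simulate_moran(adj, r, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    # Complete graph on 4 nodes.
    complete4 = [[j for j in range(4) if j != i] for i in range(4)]
    print(estimate_fixation(complete4, r=1.0, trials=4000))
```

On a complete graph with neutral mutants (r = 1) the estimate should come out close to 1/n, which gives a quick sanity check. Note that this simulation spends many steps on moves that change nothing (a parent replacing a neighbour of its own type); the abstract's reference to "effective steps" concerns avoiding exactly that cost.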
Computationally Efficient Robust Estimation of Sparse Functionals
Title | Computationally Efficient Robust Estimation of Sparse Functionals |
Authors | Simon S. Du, Sivaraman Balakrishnan, Aarti Singh |
Abstract | Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions. This problem is exacerbated in modern high-dimensional settings, where the problem dimension can grow with and possibly exceed the sample size. We consider the problem of robust estimation of sparse functionals, and provide a computationally and statistically efficient algorithm in the high-dimensional setting. Our theory identifies a unified set of deterministic conditions under which our algorithm guarantees accurate recovery. By further establishing that these deterministic conditions hold with high probability for a wide range of statistical models, our theory applies to many problems of considerable interest including sparse mean and covariance estimation; sparse linear regression; and sparse generalized linear models. |
Tasks | |
Published | 2017-02-24 |
URL | http://arxiv.org/abs/1702.07709v1 |
http://arxiv.org/pdf/1702.07709v1.pdf | |
PWC | https://paperswithcode.com/paper/computationally-efficient-robust-estimation |
Repo | |
Framework | |