January 28, 2020

Paper Group ANR 845

Nearest-Neighbor Neural Networks for Geostatistics. Deep Face Recognition Model Compression via Knowledge Transfer and Distillation. Optimisation of Air-Ground Swarm Teaming for Target Search, using Differential Evolution. Projection pursuit based on Gaussian mixtures and evolutionary algorithms. Quadruply Stochastic Gradient Method for Large Scale …

Nearest-Neighbor Neural Networks for Geostatistics


Title	Nearest-Neighbor Neural Networks for Geostatistics
Authors	Haoyu Wang, Yawen Guan, Brian J Reich
Abstract	Kriging is the predominant method used for spatial prediction, but relies on the assumption that predictions are linear combinations of the observations. Kriging often also relies on additional assumptions such as normality and stationarity. We propose a more flexible spatial prediction method based on the Nearest-Neighbor Neural Network (4N) process that embeds deep learning into a geostatistical model. We show that the 4N process is a valid stochastic process and propose a series of new ways to construct features to be used as inputs to the deep learning model based on neighboring information. Our model framework outperforms some existing state-of-the-art geostatistical modelling methods for simulated non-Gaussian data and is applied to a massive forestry dataset.
Tasks
Published	2019-03-28
URL	http://arxiv.org/abs/1903.12125v1
PDF	http://arxiv.org/pdf/1903.12125v1.pdf
PWC	https://paperswithcode.com/paper/nearest-neighbor-neural-networks-for
Repo
Framework

Deep Face Recognition Model Compression via Knowledge Transfer and Distillation


Title	Deep Face Recognition Model Compression via Knowledge Transfer and Distillation
Authors	Jayashree Karlekar, Jiashi Feng, Zi Sian Wong, Sugiri Pranata
Abstract	Fully convolutional networks (FCNs) have become the de facto tool for achieving very high performance on many vision and non-vision tasks in general and face recognition in particular. Such high accuracies are normally obtained by very deep networks or their ensembles. However, deploying such high-performing models to resource-constrained devices or real-time applications is challenging. In this paper, we present a novel model compression approach based on the student-teacher paradigm for face recognition applications. The proposed approach consists of training a teacher FCN at a higher image resolution while student FCNs are trained at lower image resolutions than that of the teacher FCN. We explored three different approaches to train student FCNs: knowledge transfer (KT), knowledge distillation (KD) and their combination. Experimental evaluation on the LFW and IJB-C datasets demonstrates comparable improvements in accuracy with these approaches. Training low-resolution student FCNs from a higher-resolution teacher offers the fourfold advantage of accelerated training, accelerated inference, reduced memory requirements and improved accuracy. We evaluated all models on the IJB-C dataset and achieved state-of-the-art results on this benchmark. The teacher network and some student networks even achieved Top-1 performance on the IJB-C dataset. The proposed approach is simple and hardware friendly, and thus enables the deployment of high-performing deep face recognition models to resource-constrained devices.
Tasks	Face Recognition, Model Compression, Transfer Learning
Published	2019-06-03
URL	https://arxiv.org/abs/1906.00619v1
PDF	https://arxiv.org/pdf/1906.00619v1.pdf
PWC	https://paperswithcode.com/paper/190600619
Repo
Framework
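
The abstract above names knowledge distillation (KD) but does not give the loss. As a minimal sketch, assuming the common softened-softmax formulation (a KL divergence between temperature-scaled teacher and student outputs, scaled by the squared temperature), the idea can be illustrated as follows. The function names, the temperature value and the toy logits are illustrative assumptions, not details taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    Assumed formulation for illustration; the paper may use a different loss.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits for a three-class toy problem.
teacher = [5.0, 1.0, -2.0]
print(distillation_loss([5.0, 1.0, -2.0], teacher))  # student matches teacher
print(distillation_loss([0.0, 0.0, 0.0], teacher))   # uninformed student
```

In practice this term is usually combined with the ordinary supervised loss on the student; how the paper weights the two is not stated in the abstract.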

Optimisation of Air-Ground Swarm Teaming for Target Search, using Differential Evolution


Title	Optimisation of Air-Ground Swarm Teaming for Target Search, using Differential Evolution
Authors	Jiangjun Tang, George Leu, Yu-Bin Yang
Abstract	This paper presents a swarm teaming perspective that enhances the scope of classic investigations on survivable networks. A generic target-search context is considered as a test-bed, in which a swarm of ground agents and a swarm of UAVs cooperate so that the ground agents reach as many targets as possible in the field while also remaining connected as much as possible at all times. To optimise the system against both these objectives at the same time, we use an evolutionary computation approach in the form of a differential evolution algorithm. Results are encouraging, showing a good evolution of the fitness function used as part of the differential evolution, and a good performance of the evolved dual-swarm system, which exhibits an optimal trade-off between target reaching and connectivity.
Tasks
Published	2019-09-13
URL	https://arxiv.org/abs/1909.06037v1
PDF	https://arxiv.org/pdf/1909.06037v1.pdf
PWC	https://paperswithcode.com/paper/optimisation-of-air-ground-swarm-teaming-for
Repo
Framework
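
For readers unfamiliar with differential evolution, the following is a minimal sketch of the classic DE/rand/1/bin scheme on a toy objective. It is not the authors' implementation: the objective `toy_cost`, the population size and the control parameters `f` and `cr` are assumptions chosen for illustration, and the swarm fitness function from the paper is not reproduced here.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200, seed=0):
    """Minimise `objective` over the box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, none of them equal to i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate comes from the mutant
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            if trial_score <= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_cost(x):
    """Hypothetical stand-in objective with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best_x, best_cost = differential_evolution(toy_cost, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_x, best_cost)
```

A two-objective problem such as the one described in the abstract would need either a combined scalar fitness or a multi-objective variant; the abstract does not say which is used.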

Projection pursuit based on Gaussian mixtures and evolutionary algorithms


Title	Projection pursuit based on Gaussian mixtures and evolutionary algorithms
Authors	Luca Scrucca, Alessio Serafini
Abstract	We propose a projection pursuit (PP) algorithm based on Gaussian mixture models (GMMs). The negentropy obtained from a multivariate density estimated by GMMs is adopted as the PP index to be maximised. For a fixed dimension of the projection subspace, the GMM-based density estimation is projected onto that subspace, where an approximation of the negentropy for Gaussian mixtures is computed. Then, Genetic Algorithms (GAs) are used to find the optimal, orthogonal projection basis by maximising the former approximation. We show that this semi-parametric approach to PP is flexible and allows highly informative structures to be detected, by projecting multivariate datasets onto a subspace, where the data can be feasibly visualised. The performance of the proposed approach is shown on both artificial and real datasets.
Tasks	Density Estimation
Published	2019-12-27
URL	https://arxiv.org/abs/1912.12049v1
PDF	https://arxiv.org/pdf/1912.12049v1.pdf
PWC	https://paperswithcode.com/paper/projection-pursuit-based-on-gaussian-mixtures
Repo
Framework

Quadruply Stochastic Gradient Method for Large Scale Nonlinear Semi-Supervised Ordinal Regression AUC Optimization


Title	Quadruply Stochastic Gradient Method for Large Scale Nonlinear Semi-Supervised Ordinal Regression AUC Optimization
Authors	Wanli Shi, Bin Gu, Xiang Li, Heng Huang
Abstract	Semi-supervised ordinal regression (S$^2$OR) problems are ubiquitous in real-world applications, where only a few ordered instances are labeled and massive instances remain unlabeled. Recent research has shown that directly optimizing the concordance index or AUC can impose a better ranking on the data than optimizing the traditional error rate in ordinal regression (OR) problems. In this paper, we propose an unbiased objective function for S$^2$OR AUC optimization based on an ordinal binary decomposition approach. In addition, to handle large-scale kernelized learning problems, we propose a scalable algorithm called QS$^3$ORAO using the doubly stochastic gradients (DSG) framework for functional optimization. Theoretically, we prove that our method can converge to the optimal solution at the rate of $O(1/t)$, where $t$ is the number of iterations for stochastic data sampling. Extensive experimental results on various benchmark and real-world datasets also demonstrate that our method is efficient and effective while retaining similar generalization performance.
Tasks
Published	2019-12-24
URL	https://arxiv.org/abs/1912.11193v1
PDF	https://arxiv.org/pdf/1912.11193v1.pdf
PWC	https://paperswithcode.com/paper/quadruply-stochastic-gradient-method-for
Repo
Framework
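
As background for the AUC objective mentioned in the abstract above, here is a minimal sketch of the pairwise definition of AUC in the plain binary case: the fraction of (positive, negative) pairs that a scoring function ranks correctly. The ordinal and semi-supervised extensions from the paper are not shown, and the function name and toy data are illustrative assumptions.

```python
def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data: three of the four positive-negative pairs are ordered correctly.
print(pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```

The quadratic loop is fine for illustration; the pairwise structure is also what makes stochastic sampling of pairs attractive at large scale.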

Predicting ConceptNet Path Quality Using Crowdsourced Assessments of Naturalness


Title	Predicting ConceptNet Path Quality Using Crowdsourced Assessments of Naturalness
Authors	Yilun Zhou, Steven Schockaert, Julie A. Shah
Abstract	In many applications, it is important to characterize the way in which two concepts are semantically related. Knowledge graphs such as ConceptNet provide a rich source of information for such characterizations by encoding relations between concepts as edges in a graph. When two concepts are not directly connected by an edge, their relationship can still be described in terms of the paths that connect them. Unfortunately, many of these paths are uninformative and noisy, which means that the success of applications that use such path features crucially relies on their ability to select high-quality paths. In existing applications, this path selection process is based on relatively simple heuristics. In this paper we instead propose to learn to predict path quality from crowdsourced human assessments. Since we are interested in a generic task-independent notion of quality, we simply ask human participants to rank paths according to their subjective assessment of the paths’ naturalness, without attempting to define naturalness or steering the participants towards particular indicators of quality. We show that a neural network model trained on these assessments is able to predict human judgments on unseen paths with near optimal performance. Most notably, we find that the resulting path selection method is substantially better than the current heuristic approaches at identifying meaningful paths.
Tasks	Knowledge Graphs
Published	2019-02-21
URL	http://arxiv.org/abs/1902.07831v1
PDF	http://arxiv.org/pdf/1902.07831v1.pdf
PWC	https://paperswithcode.com/paper/predicting-conceptnet-path-quality-using
Repo
Framework

Column2Vec: Structural Understanding via Distributed Representations of Database Schemas


Title	Column2Vec: Structural Understanding via Distributed Representations of Database Schemas
Authors	Michael J. Mior, Alexander G. Ororbia II
Abstract	We present Column2Vec, a distributed representation of database columns based on column metadata. Our distributed representation has several applications. Using known names for groups of columns (i.e., a table name), we train a model to generate an appropriate name for columns in an unnamed table. We demonstrate the viability of our approach using schema information collected from open source applications on GitHub.
Tasks
Published	2019-03-20
URL	http://arxiv.org/abs/1903.08621v1
PDF	http://arxiv.org/pdf/1903.08621v1.pdf
PWC	https://paperswithcode.com/paper/column2vec-structural-understanding-via
Repo
Framework

Multifidelity Bayesian Optimization for Binomial Output


Title	Multifidelity Bayesian Optimization for Binomial Output
Authors	Leonid Matyushin, Alexey Zaytsev, Oleg Alenkin, Andrey Ustuzhanin
Abstract	The key idea of Bayesian optimization is replacing an expensive target function with a cheap surrogate model. By selection of an acquisition function for Bayesian optimization, we trade off between exploration and exploitation. The acquisition function typically depends on the mean and the variance of the surrogate model at a given point. The most common Gaussian process-based surrogate model assumes that the target with fixed parameters is a realization of a Gaussian process. However, often the target function doesn’t satisfy this approximation. Here we consider target functions that come from the binomial distribution with the parameter that depends on inputs. Typically we can vary how many Bernoulli samples we obtain during each evaluation. We propose a general Gaussian process model that takes into account Bernoulli outputs. To make things work we consider a simple acquisition function based on Expected Improvement and a heuristic strategy to choose the number of samples at each point thus taking into account precision of the obtained output.
Tasks
Published	2019-02-19
URL	http://arxiv.org/abs/1902.06937v1
PDF	http://arxiv.org/pdf/1902.06937v1.pdf
PWC	https://paperswithcode.com/paper/multifidelity-bayesian-optimization-for
Repo
Framework
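
The abstract above refers to an acquisition function based on Expected Improvement that depends on the surrogate's mean and variance. As a minimal sketch, assuming a Gaussian predictive distribution and a maximisation problem, the standard closed form can be computed as below. The paper's binomial-output model and its sample-size heuristic are not reproduced, and the function names are illustrative.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best_so_far):
    """Closed-form EI for maximisation under a Gaussian predictive distribution."""
    if std <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    return (mean - best_so_far) * normal_cdf(z) + std * normal_pdf(z)


# Same predicted mean, different uncertainty: the more uncertain point scores higher.
print(expected_improvement(mean=0.5, std=0.1, best_so_far=0.6))
print(expected_improvement(mean=0.5, std=0.5, best_so_far=0.6))
```

The two printed values show the exploration side of the trade-off: with the mean held below the incumbent, a larger standard deviation yields a larger expected improvement.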

Mapping high-performance RNNs to in-memory neuromorphic chips


Title	Mapping high-performance RNNs to in-memory neuromorphic chips
Authors	Manu V Nair, Giacomo Indiveri
Abstract	The increasing need for compact and low-power computing solutions for machine learning applications has triggered significant interest in energy-efficient neuromorphic systems. However, most of these architectures rely on spiking neural networks, which typically perform poorly compared to their non-spiking counterparts in terms of accuracy. In this paper, we propose a new adaptive spiking neuron model that can be abstracted as a low-pass filter. This abstraction enables faster and better training of spiking networks using back-propagation, without simulating spikes. We show that this model dramatically improves the inference performance of a recurrent neural network and validate it with three complex spatio-temporal learning tasks: the temporal addition task, the temporal copying task, and a spoken-phrase recognition task. We estimate at least 500x higher energy-efficiency using our models on compatible neuromorphic chips in comparison to Cortex-M4, a popular embedded microprocessor.
Tasks
Published	2019-05-25
URL	https://arxiv.org/abs/1905.10692v4
PDF	https://arxiv.org/pdf/1905.10692v4.pdf
PWC	https://paperswithcode.com/paper/a-neuromorphic-boost-to-rnns-using-low-pass
Repo
Framework
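
The abstract above describes a neuron model that can be abstracted as a low-pass filter. The exact model is not given there, so the following is only a sketch of the simplest such abstraction, a first-order discrete-time filter (exponential moving average). The coefficient `alpha` and the update rule are assumptions for illustration.

```python
def low_pass(inputs, alpha=0.9, state=0.0):
    """First-order low-pass filter: y[t] = alpha * y[t-1] + (1 - alpha) * x[t].

    Assumed form for illustration; the neuron model in the paper may differ.
    """
    outputs = []
    for x in inputs:
        state = alpha * state + (1.0 - alpha) * x
        outputs.append(state)
    return outputs


# Step response: the output rises smoothly towards the constant input of 1.0.
print(low_pass([1.0] * 5, alpha=0.5))
```

Because the update is a differentiable recurrence, it can be trained with back-propagation through time like any other recurrent unit, which is consistent with the abstract's claim that spikes need not be simulated during training.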

DL2: A Deep Learning-driven Scheduler for Deep Learning Clusters


Title	DL2: A Deep Learning-driven Scheduler for Deep Learning Clusters
Authors	Yanghua Peng, Yixin Bao, Yangrui Chen, Chuan Wu, Chen Meng, Wei Lin
Abstract	More and more companies have deployed machine learning (ML) clusters, where deep learning (DL) models are trained for providing various AI-driven services. Efficient resource scheduling is essential for maximal utilization of expensive DL clusters. Existing cluster schedulers either are agnostic to ML workload characteristics, or use scheduling heuristics based on operators’ understanding of particular ML frameworks and workloads, which are less efficient or not general enough. In this paper, we show that DL techniques can be adopted to design a generic and efficient scheduler. DL2 is a DL-driven scheduler for DL clusters, targeting global training job expedition by dynamically resizing resources allocated to jobs. DL2 advocates a joint supervised learning and reinforcement learning approach: a neural network is warmed up via offline supervised learning based on job traces produced by the existing cluster scheduler; then the neural network is plugged into the live DL cluster, fine-tuned by reinforcement learning carried out throughout the training progress of the DL jobs, and used for deciding job resource allocation in an online fashion. By applying past decisions made by the existing cluster scheduler in the preparatory supervised learning phase, our approach enables a smooth transition from the existing scheduler, and renders a high-quality scheduler in minimizing average training completion time. We implement DL2 on Kubernetes and enable dynamic resource scaling in DL jobs on MXNet. Extensive evaluation shows that DL2 outperforms a fairness scheduler (i.e., DRF) by 44.1% and an expert heuristic scheduler (i.e., Optimus) by 17.5% in terms of average job completion time.
Tasks
Published	2019-09-13
URL	https://arxiv.org/abs/1909.06040v1
PDF	https://arxiv.org/pdf/1909.06040v1.pdf
PWC	https://paperswithcode.com/paper/dl2-a-deep-learning-driven-scheduler-for-deep
Repo
Framework

Exploring Uncertainty in Conditional Multi-Modal Retrieval Systems


Title	Exploring Uncertainty in Conditional Multi-Modal Retrieval Systems
Authors	Ahmed Taha, Yi-Ting Chen, Xitong Yang, Teruhisa Misu, Larry Davis
Abstract	We cast visual retrieval as a regression problem by posing triplet loss as a regression loss. This enables epistemic uncertainty estimation using dropout as a Bayesian approximation framework in retrieval. Accordingly, Monte Carlo (MC) sampling is leveraged to boost retrieval performance. Our approach is evaluated on two applications: person re-identification and autonomous car driving. Comparable state-of-the-art results are achieved on multiple datasets for the former application. We leverage the Honda driving dataset (HDD) for the autonomous car driving application. It provides multiple modalities and similarity notions for ego-motion action understanding. Hence, we present a multi-modal conditional retrieval network. It disentangles embeddings into separate representations to encode different similarities. This form of joint learning eliminates the need to train multiple independent networks without any performance degradation. Quantitative evaluation highlights our approach's competence, achieving a 6% improvement in a highly uncertain environment.
Tasks	Person Re-Identification
Published	2019-01-23
URL	http://arxiv.org/abs/1901.07702v1
PDF	http://arxiv.org/pdf/1901.07702v1.pdf
PWC	https://paperswithcode.com/paper/exploring-uncertainty-in-conditional-multi
Repo
Framework

Mutual Information-driven Subject-invariant and Class-relevant Deep Representation Learning in BCI


Title	Mutual Information-driven Subject-invariant and Class-relevant Deep Representation Learning in BCI
Authors	Eunjin Jeon, Wonjun Ko, Jee Seok Yoon, Heung-Il Suk
Abstract	In recent years, deep learning-based feature representation methods have shown a promising impact in electroencephalography (EEG)-based brain-computer interface (BCI). Nonetheless, there still exist BCI-illiterate subjects who struggle to use BCI systems, showing high intra-subject variabilities. Several methods have been proposed to enhance their performance via transfer learning. In such a case, high inter- and intra-subject variabilities are the points to be considered. Transfer learning, especially as a domain adaptation technique, has drawn increasing attention in various fields. However, adapting these approaches to BCI faces two challenging limitations. (i) Most domain adaptation methods are designed for a labeled source domain and an unlabeled target domain, whereas BCI tasks generally have multiple annotated domains. (ii) Most of the existing methods do not consider that negative transfer can disrupt generalization ability. In this paper, we propose a novel network architecture to tackle these limitations by estimating mutual information in high- and low-level representations regardless of the domain, which is considered as a subject in this paper. Specifically, our proposed method extracts subject-invariant and class-relevant features, thereby enhancing generalizability in overall classification. It is also noteworthy that our method is applicable to a new subject with a small amount of data via fine-tuning, thus reducing calibration time for practical use. We validated our proposed method on two large motor imagery EEG datasets via comparisons with other competing methods.
Tasks	Calibration, Domain Adaptation, EEG, Representation Learning, Transfer Learning
Published	2019-10-17
URL	https://arxiv.org/abs/1910.07747v3
PDF	https://arxiv.org/pdf/1910.07747v3.pdf
PWC	https://paperswithcode.com/paper/toward-subject-invariant-and-class
Repo
Framework

Fourier-based Rotation-invariant Feature Boosting: An Efficient Framework for Geospatial Object Detection


Title	Fourier-based Rotation-invariant Feature Boosting: An Efficient Framework for Geospatial Object Detection
Authors	Xin Wu, Danfeng Hong, Jocelyn Chanussot, Yang Xu, Ran Tao, Yue Wang
Abstract	Geospatial object detection in remote sensing imagery has been attracting increasing interest in recent years, due to the rapid development of spaceborne imaging. Most previously proposed object detectors are very sensitive to object deformations, such as scaling and rotation. To this end, we propose a novel and efficient framework for geospatial object detection in this letter, called Fourier-based rotation-invariant feature boosting (FRIFB). A Fourier-based rotation-invariant feature is first generated in polar coordinates. Then, the extracted features can be further structurally refined using aggregate channel features. This leads to faster feature computation and a more robust feature representation, which is well suited to the subsequent boosting learning. Finally, in the test phase, we achieve fast pyramid feature extraction by estimating a scale factor instead of directly collecting all features from the image pyramid. Extensive experiments are conducted on two subsets of the NWPU VHR-10 dataset, demonstrating the superiority and effectiveness of FRIFB compared to previous state-of-the-art methods.
Tasks	Object Detection
Published	2019-05-27
URL	https://arxiv.org/abs/1905.11074v1
PDF	https://arxiv.org/pdf/1905.11074v1.pdf
PWC	https://paperswithcode.com/paper/fourier-based-rotation-invariant-feature
Repo
Framework

Moving Object Detection under Discontinuous Change in Illumination Using Tensor Low-Rank and Invariant Sparse Decomposition


Title	Moving Object Detection under Discontinuous Change in Illumination Using Tensor Low-Rank and Invariant Sparse Decomposition
Authors	Moein Shakeri, Hong Zhang
Abstract	Although low-rank and sparse decomposition based methods have been successfully applied to the problem of moving object detection using structured sparsity-inducing norms, they are still vulnerable to significant illumination changes that arise in certain applications. We are interested in moving object detection in applications involving time-lapse image sequences for which current methods mistakenly group moving objects and illumination changes into foreground. Our method relies on the multilinear (tensor) data low-rank and sparse decomposition framework to address the weaknesses of existing methods. The key to our proposed method is to first create a set of prior maps that can characterize the changes in the image sequence due to illumination. We show that they can be detected by a k-support norm. To deal with the two concurrent types of changes, we employ two regularization terms, one for detecting moving objects and the other for accounting for illumination changes, in the tensor low-rank and sparse decomposition formulation. Through comprehensive experiments using challenging datasets, we show that our method demonstrates a remarkable ability to detect moving objects under discontinuous change in illumination, and outperforms the state-of-the-art solutions to this challenging problem.
Tasks	Object Detection
Published	2019-04-05
URL	http://arxiv.org/abs/1904.03175v2
PDF	http://arxiv.org/pdf/1904.03175v2.pdf
PWC	https://paperswithcode.com/paper/moving-object-detection-under-discontinuous
Repo
Framework

Online Learning Using Only Peer Prediction


Title	Online Learning Using Only Peer Prediction
Authors	Yang Liu, David P. Helmbold
Abstract	This paper considers a variant of the classical online learning problem with expert predictions. Our model’s differences and challenges are due to lacking any direct feedback on the loss each expert incurs at each time step $t$. We propose an approach that uses peer prediction and identify conditions where it succeeds. Our techniques revolve around a carefully designed peer score function $s()$ that scores experts’ predictions based on the peer consensus. We show a sufficient condition, which we call peer calibration, under which standard online learning algorithms using loss feedback computed by the carefully crafted $s()$ have bounded regret with respect to the unrevealed ground truth values. We then demonstrate how suitable $s()$ functions can be derived for different assumptions and models.
Tasks	Calibration
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04382v2
PDF	https://arxiv.org/pdf/1910.04382v2.pdf
PWC	https://paperswithcode.com/paper/online-learning-using-only-peer-assessment
Repo
Framework
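
To make the setting in the abstract above concrete, the following sketch runs the standard exponential-weights update using a loss computed only from the experts' own predictions. The surrogate `peer_loss` (the fraction of other experts that disagree) is a hypothetical stand-in invented for this example; it is not the score function $s()$ designed in the paper, and it carries none of the paper's regret guarantees.

```python
import math


def peer_loss(predictions, i):
    """Hypothetical surrogate loss: fraction of the other experts that disagree with expert i."""
    others = [p for j, p in enumerate(predictions) if j != i]
    return sum(1 for p in others if p != predictions[i]) / len(others)


def exponential_weights(rounds, eta=0.5):
    """Exponential-weights update driven only by peer-based loss feedback."""
    n = len(rounds[0])
    weights = [1.0] * n
    for predictions in rounds:
        for i in range(n):
            weights[i] *= math.exp(-eta * peer_loss(predictions, i))
    total = sum(weights)
    return [w / total for w in weights]


# Toy binary predictions from three experts over four rounds; no ground truth is used.
rounds = [[1, 1, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]]
print(exponential_weights(rounds))
```

In this toy run the third expert, who disagrees with the other two in three of the four rounds, ends with the smallest weight.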