February 1, 2020

3116 words 15 mins read

Paper Group AWR 152

Bayesian nonparametric multiway regression for clustered binomial data

Title Bayesian nonparametric multiway regression for clustered binomial data
Authors Eric F. Lock, Dipankar Bandyopadhyay
Abstract We introduce a Bayesian nonparametric regression model for data with multiway (tensor) structure, motivated by an application to periodontal disease (PD) data. Our outcome is the number of diseased sites measured over four different tooth types for each subject, with subject-specific covariates available as predictors. The outcomes are not well-characterized by simple parametric models, so we use a nonparametric approach with a binomial likelihood wherein the latent probabilities are drawn from a mixture with an arbitrary number of components, analogous to a Dirichlet Process (DP). We use a flexible probit stick-breaking formulation for the component weights that allows for covariate dependence and clustering structure in the outcomes. The parameter space for this model is large and multiway: patients $\times$ tooth types $\times$ covariates $\times$ components. We reduce its effective dimensionality, and account for the multiway structure, via low-rank assumptions. We illustrate how this can improve performance, and simplify interpretation, while still providing sufficient flexibility. We describe a general and efficient Gibbs sampling algorithm for posterior computation. The resulting fit to the PD data outperforms competitors, and is interpretable and well-calibrated. An interactive visual of the predictive model is available at http://ericfrazerlock.com/toothdata/ToothDisplay.html , and the code is available at https://github.com/lockEF/NonparametricMultiway .
Tasks
Published 2019-01-31
URL http://arxiv.org/abs/1901.11172v1
PDF http://arxiv.org/pdf/1901.11172v1.pdf
PWC https://paperswithcode.com/paper/bayesian-nonparametric-multiway-regression
Repo https://github.com/lockEF/NonparametricMultiway
Framework none
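
As a rough illustration of the probit stick-breaking construction mentioned in the abstract, here is a minimal numpy sketch of covariate-dependent mixture weights. The low-rank multiway structure on the coefficients and the full Gibbs sampler live in the paper and the linked repo; the function name, the number of components, and the coefficient shapes below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def probit_stick_breaking_weights(x, beta):
    """Covariate-dependent mixture weights via a probit stick-breaking construction.

    x    : (p,) covariate vector for one subject
    beta : (K-1, p) stick-breaking coefficients, one row per stick
    Returns a length-K vector of non-negative weights that sum to 1.
    """
    sticks = norm.cdf(beta @ x)                                   # Phi(x' beta_k), k = 1..K-1
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - sticks)])
    return np.append(sticks, 1.0) * remaining                     # last component takes the remainder

# Example: 3 covariates, 5 mixture components; the weights always sum to 1.
rng = np.random.default_rng(0)
w = probit_stick_breaking_weights(rng.normal(size=3), rng.normal(size=(4, 3)))
print(w, w.sum())
```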

Empirical Risk Minimization under Random Censorship: Theory and Practice

Title Empirical Risk Minimization under Random Censorship: Theory and Practice
Authors Guillaume Ausset, Stéphan Clémençon, François Portier
Abstract We consider the classic supervised learning problem, where a continuous non-negative random label $Y$ (i.e. a random duration) is to be predicted based upon observing a random vector $X$ valued in $\mathbb{R}^d$ with $d\geq 1$ by means of a regression rule with minimum least square error. In various applications, ranging from industrial quality control to public health through credit risk analysis for instance, training observations can be right censored, meaning that, rather than on independent copies of $(X,Y)$, statistical learning relies on a collection of $n\geq 1$ independent realizations of the triplet $(X,\; \min\{Y,\, C\},\; \delta)$, where $C$ is a nonnegative r.v. with unknown distribution, modeling censorship, and $\delta=\mathbb{I}\{Y\leq C\}$ indicates whether the duration is right censored or not. As ignoring censorship in the risk computation may clearly lead to a severe underestimation of the target duration and jeopardize prediction, we propose to consider a plug-in estimate of the true risk based on a Kaplan-Meier estimator of the conditional survival function of the censorship $C$ given $X$, referred to as the Kaplan-Meier risk, in order to perform empirical risk minimization. It is established, under mild conditions, that the learning rate of minimizers of this biased/weighted empirical risk functional is of order $O_{\mathbb{P}}(\sqrt{\log(n)/n})$ when ignoring model bias issues inherent to plug-in estimation, as can be attained in the absence of censorship. Beyond these theoretical results, numerical experiments are presented in order to illustrate the relevance of the approach developed.
Tasks
Published 2019-06-05
URL https://arxiv.org/abs/1906.01908v1
PDF https://arxiv.org/pdf/1906.01908v1.pdf
PWC https://paperswithcode.com/paper/empirical-risk-minimization-under-random
Repo https://github.com/aussetg/ipcw
Framework none
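
A minimal sketch of the Kaplan-Meier-weighted (IPCW) risk described above, assuming an unconditional Kaplan-Meier estimator of the censoring survival function; the paper works with a conditional estimator of the survival function of $C$ given $X$, and ties and the left limit $G(T^-)$ are glossed over here.

```python
import numpy as np

def km_censoring_survival(times, delta):
    """Kaplan-Meier estimate of the censoring survival function G(t) = P(C > t).

    times : observed durations min(Y, C)
    delta : 1 if the event was observed (Y <= C), 0 if censored
    Returns G as a vectorized step function.
    """
    order = np.argsort(times)
    t, d = times[order], delta[order]
    at_risk = len(t) - np.arange(len(t))
    # Censored observations are the "events" of the censoring distribution.
    factors = np.where(d == 0, 1.0 - 1.0 / at_risk, 1.0)
    surv = np.cumprod(factors)
    return lambda s: np.where(s < t[0], 1.0, surv[np.searchsorted(t, s, side="right") - 1])

def kaplan_meier_risk(pred, times, delta):
    """IPCW-weighted empirical squared risk: uncensored points are reweighted by 1/G(T)."""
    G = km_censoring_survival(times, delta)
    weights = delta / np.clip(G(times), 1e-8, None)
    return np.mean(weights * (times - pred) ** 2)
```

Minimizing this weighted risk over a class of regression rules is then an ordinary (weighted) empirical risk minimization problem.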

RGBD-GAN: Unsupervised 3D Representation Learning From Natural Image Datasets via RGBD Image Synthesis

Title RGBD-GAN: Unsupervised 3D Representation Learning From Natural Image Datasets via RGBD Image Synthesis
Authors Atsuhiro Noguchi, Tatsuya Harada
Abstract Understanding three-dimensional (3D) geometries from two-dimensional (2D) images without any labeled information is promising for understanding the real world without incurring annotation cost. We herein propose a novel generative model, RGBD-GAN, which achieves unsupervised 3D representation learning from 2D images. The proposed method enables camera parameter conditional image generation and depth image generation without any 3D annotations such as camera poses or depth. We used an explicit 3D consistency loss for two RGBD images generated from different camera parameters in addition to the ordinal GAN objective. The loss is simple yet effective for any type of image generator such as the DCGAN and StyleGAN to be conditioned on camera parameters. We conducted experiments and demonstrated that the proposed method could learn 3D representations from 2D images with various generator architectures.
Tasks Conditional Image Generation, Image Generation, Representation Learning
Published 2019-09-27
URL https://arxiv.org/abs/1909.12573v1
PDF https://arxiv.org/pdf/1909.12573v1.pdf
PWC https://paperswithcode.com/paper/rgbd-gan-unsupervised-3d-representation-1
Repo https://github.com/nogu-atsu/RGBD-GAN
Framework none
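
The sketch below shows one plausible form of the 3D consistency loss between two generated RGBD views: back-project view 1 with its depth map, move the points into view 2 using the relative camera pose, re-project, and compare the warped view 2 against view 1. Shared intrinsics, the absence of occlusion handling, and the particular L1 photometric/depth terms are simplifying assumptions; the exact loss is defined in the paper and repo.

```python
import torch
import torch.nn.functional as F

def rgbd_consistency_loss(rgb1, depth1, rgb2, depth2, K, R_12, t_12):
    """Simplified 3D consistency loss between two generated RGBD views.

    rgb*   : (B, 3, H, W) images, depth* : (B, 1, H, W) depth maps
    K      : (3, 3) camera intrinsics shared by both views
    R_12, t_12 : rotation (3, 3) and translation (3,) taking view-1 camera
                 coordinates into view-2 camera coordinates
    """
    B, _, H, W = rgb1.shape
    device = rgb1.device

    # Pixel grid in homogeneous coordinates, shape (3, H*W)
    v, u = torch.meshgrid(torch.arange(H, device=device),
                          torch.arange(W, device=device), indexing="ij")
    pix = torch.stack([u, v, torch.ones_like(u)], dim=0).reshape(3, -1).float()

    # Back-project view-1 pixels to 3D and move them into view-2 coordinates
    cam1 = torch.linalg.inv(K) @ pix                       # rays in view 1
    cam1 = cam1.unsqueeze(0) * depth1.reshape(B, 1, -1)    # scale rays by depth
    cam2 = R_12 @ cam1 + t_12.reshape(1, 3, 1)

    # Project into view-2 pixel coordinates and normalize for grid_sample
    proj = K @ cam2
    z = proj[:, 2:3].clamp(min=1e-6)
    uv = proj[:, :2] / z
    grid = torch.stack([2 * uv[:, 0] / (W - 1) - 1,
                        2 * uv[:, 1] / (H - 1) - 1], dim=-1).reshape(B, H, W, 2)

    rgb2_warped = F.grid_sample(rgb2, grid, align_corners=True)
    depth2_warped = F.grid_sample(depth2, grid, align_corners=True)

    # Photometric + depth agreement (the transformed view-1 depth is view-2's z)
    return F.l1_loss(rgb2_warped, rgb1) + F.l1_loss(depth2_warped, z.reshape(B, 1, H, W))
```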

Semantic Drift in Multilingual Representations

Title Semantic Drift in Multilingual Representations
Authors Lisa Beinborn, Rochelle Choenni
Abstract Multilingual representations have mostly been evaluated based on their performance on specific tasks. In this article, we look beyond engineering goals and analyze the relations between languages in computational representations. We introduce a methodology for comparing languages based on their organization of semantic concepts. We propose to conduct an adapted version of representational similarity analysis of a selected set of concepts in computational multilingual representations. Using this analysis method, we can reconstruct a phylogenetic tree that closely resembles those assumed by linguistic experts. These results indicate that multilingual distributional representations which are only trained on monolingual text and bilingual dictionaries preserve relations between languages without the need for any etymological information. In addition, we propose a measure to identify semantic drift between language families. We perform experiments on word-based and sentence-based multilingual models and provide both quantitative results and qualitative examples. Analyses of semantic drift in multilingual representations can serve two purposes: they can indicate unwanted characteristics of the computational models and they provide a quantitative means to study linguistic phenomena across languages. The code is available at https://github.com/beinborn/SemanticDrift.
Tasks
Published 2019-04-24
URL https://arxiv.org/abs/1904.10820v3
PDF https://arxiv.org/pdf/1904.10820v3.pdf
PWC https://paperswithcode.com/paper/semantic-drift-in-multilingual
Repo https://github.com/beinborn/SemanticDrift
Framework pytorch
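
The representational similarity analysis step described above reduces to comparing concept-by-concept similarity structures across languages; a small sketch follows. Cosine distance and Spearman correlation are assumptions here, and the concept selection and tree construction used in the paper may differ.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rsa_score(emb_a, emb_b):
    """Representational similarity between two languages.

    emb_a, emb_b : (n_concepts, dim) embeddings of the *same* ordered concept list
    in language A and language B.  Returns the Spearman correlation between the
    two concept-by-concept dissimilarity structures.
    """
    rdm_a = pdist(emb_a, metric="cosine")   # condensed pairwise dissimilarities
    rdm_b = pdist(emb_b, metric="cosine")
    return spearmanr(rdm_a, rdm_b)[0]

# The language-by-language matrix of RSA scores can then be fed to hierarchical
# clustering (e.g. scipy.cluster.hierarchy.linkage) to reconstruct a language tree.
```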

Autoencoding sensory substitution

Title Autoencoding sensory substitution
Authors Viktor Tóth, Lauri Parkkonen
Abstract Tens of millions of people are blind, and their number is ever increasing. Visual-to-auditory sensory substitution (SS) encompasses a family of cheap, generic solutions to assist the visually impaired by conveying visual information through sound. The required SS training is lengthy: months of effort are necessary to reach a practical level of adaptation. There are two reasons for the tedious training process: the elongated substituting audio signal, and the disregard for the compressive characteristics of the human hearing system. To overcome these obstacles, we developed a novel class of SS methods by training deep recurrent autoencoders for image-to-sound conversion. We successfully trained deep learning models on different datasets to execute visual-to-auditory stimulus conversion. By constraining the visual space, we demonstrated the viability of shortened substituting audio signals, while proposing mechanisms, such as the integration of computational hearing models, to optimally convey visual features in the substituting stimulus as perceptually discernible auditory components. We tested our approach in two separate cases. In the first experiment, the author went blindfolded for 5 days while performing SS training on hand posture discrimination. The second experiment assessed the accuracy of reaching movements towards objects on a table. In both test cases, above-chance-level accuracy was attained after a few hours of training. Our novel SS architecture broadens the horizon of rehabilitation methods engineered for the visually impaired. Further improvements to the proposed model should yield faster rehabilitation of the blind and, as a consequence, wider adoption of SS devices.
Tasks
Published 2019-07-14
URL https://arxiv.org/abs/1907.06286v1
PDF https://arxiv.org/pdf/1907.06286v1.pdf
PWC https://paperswithcode.com/paper/autoencoding-sensory-substitution
Repo https://github.com/csiki/v2a
Framework tf

Robust Decision Trees Against Adversarial Examples

Title Robust Decision Trees Against Adversarial Examples
Authors Hongge Chen, Huan Zhang, Duane Boning, Cho-Jui Hsieh
Abstract Although adversarial examples and model robustness have been extensively studied in the context of linear models and neural networks, research on this issue in tree-based models and how to make tree-based models robust against adversarial examples is still limited. In this paper, we show that tree-based models are also vulnerable to adversarial examples and develop a novel algorithm to learn robust trees. At its core, our method aims to optimize the performance under the worst-case perturbation of input features, which leads to a max-min saddle point problem. Incorporating this saddle point objective into the decision tree building procedure is non-trivial due to the discrete nature of trees: a naive approach to finding the best split according to this saddle point objective would take exponential time. To make our approach practical and scalable, we propose efficient tree building algorithms by approximating the inner minimizer in this saddle point problem, and present efficient implementations for classical information gain based trees as well as state-of-the-art tree boosting models such as XGBoost. Experimental results on real-world datasets demonstrate that the proposed algorithms can substantially improve the robustness of tree-based models against adversarial examples.
Tasks Adversarial Attack, Adversarial Defense
Published 2019-02-27
URL https://arxiv.org/abs/1902.10660v2
PDF https://arxiv.org/pdf/1902.10660v2.pdf
PWC https://paperswithcode.com/paper/robust-decision-trees-against-adversarial
Repo https://github.com/chenhongge/RobustTrees
Framework none
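
To make the "worst-case split" idea concrete, the sketch below scores a single candidate threshold under an L-infinity perturbation of radius eps by letting points near the threshold fall on either side adversarially, evaluating only a handful of candidate assignments. This mirrors the spirit of the paper's approximation of the inner minimizer; the actual tree-building algorithms (including the XGBoost variant) are more involved.

```python
import numpy as np

def gini_gain(n_left, n_right):
    """Gini impurity decrease for per-class counts n_left, n_right (arrays of length 2)."""
    def gini(counts):
        tot = counts.sum()
        return 0.0 if tot == 0 else 1.0 - ((counts / tot) ** 2).sum()
    n_all = n_left + n_right
    tot = n_all.sum()
    return gini(n_all) - (n_left.sum() / tot) * gini(n_left) - (n_right.sum() / tot) * gini(n_right)

def robust_split_score(x, y, threshold, eps):
    """Worst-case Gini gain of splitting feature x at `threshold` under an
    L-infinity perturbation of radius eps (binary labels y in {0, 1}).

    Points within eps of the threshold are ambiguous: the adversary may push
    them to either side.  Only a few candidate assignments of the ambiguous
    set are evaluated, and the minimum gain is returned.
    """
    left = x < threshold - eps                  # certainly left
    right = x >= threshold + eps                # certainly right
    ambiguous = ~(left | right)

    n_left = np.array([np.sum(left & (y == c)) for c in (0, 1)], dtype=float)
    n_right = np.array([np.sum(right & (y == c)) for c in (0, 1)], dtype=float)
    n_amb = np.array([np.sum(ambiguous & (y == c)) for c in (0, 1)], dtype=float)

    candidates = [
        (n_amb, np.zeros(2)),                                     # all ambiguous points go left
        (np.zeros(2), n_amb),                                     # all go right
        (np.array([n_amb[0], 0.0]), np.array([0.0, n_amb[1]])),   # class 0 left, class 1 right
        (np.array([0.0, n_amb[1]]), np.array([n_amb[0], 0.0])),   # class 1 left, class 0 right
    ]
    return min(gini_gain(n_left + a, n_right + b) for a, b in candidates)
```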

Separable Layers Enable Structured Efficient Linear Substitutions

Title Separable Layers Enable Structured Efficient Linear Substitutions
Authors Gavin Gray, Elliot J. Crowley, Amos Storkey
Abstract In response to the development of recent efficient dense layers, this paper shows that something as simple as replacing linear components in pointwise convolutions with structured linear decompositions also produces substantial gains in the efficiency/accuracy tradeoff. Pointwise convolutions are fully connected layers and are thus prepared for replacement by structured transforms. Networks using such layers are able to learn the same tasks as those using standard convolutions, and provide Pareto-optimal benefits in efficiency/accuracy, both in terms of computation (mult-adds) and parameter count (and hence memory). Code is available at https://github.com/BayesWatch/deficient-efficient.
Tasks
Published 2019-06-03
URL https://arxiv.org/abs/1906.00859v1
PDF https://arxiv.org/pdf/1906.00859v1.pdf
PWC https://paperswithcode.com/paper/190600859
Repo https://github.com/BayesWatch/deficient-efficient
Framework pytorch
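
Since a pointwise (1x1) convolution is just a fully connected map across channels, it can be swapped for a structured linear decomposition. The sketch below uses a simple low-rank factorization as one such substitution; the paper evaluates several structured transforms, so treat this as an illustrative stand-in rather than the specific layers used in the repo.

```python
import torch.nn as nn

class LowRankPointwise(nn.Module):
    """Drop-in substitute for a 1x1 (pointwise) convolution.

    The (c_out x c_in) channel-mixing matrix is replaced by a rank-r factorization,
    implemented as two consecutive 1x1 convolutions.
    """
    def __init__(self, c_in, c_out, rank):
        super().__init__()
        self.reduce = nn.Conv2d(c_in, rank, kernel_size=1, bias=False)
        self.expand = nn.Conv2d(rank, c_out, kernel_size=1, bias=True)

    def forward(self, x):
        return self.expand(self.reduce(x))

# Parameter count drops from c_in*c_out + c_out to rank*(c_in + c_out) + c_out:
layer = LowRankPointwise(c_in=256, c_out=256, rank=32)
print(sum(p.numel() for p in layer.parameters()))   # 16640 vs 65792 for a dense 1x1 conv
```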

SimpleShot: Revisiting Nearest-Neighbor Classification for Few-Shot Learning

Title SimpleShot: Revisiting Nearest-Neighbor Classification for Few-Shot Learning
Authors Yan Wang, Wei-Lun Chao, Kilian Q. Weinberger, Laurens van der Maaten
Abstract Few-shot learners aim to recognize new object classes based on a small number of labeled training examples. To prevent overfitting, state-of-the-art few-shot learners use meta-learning on convolutional-network features and perform classification using a nearest-neighbor classifier. This paper studies the accuracy of nearest-neighbor baselines without meta-learning. Surprisingly, we find simple feature transformations suffice to obtain competitive few-shot learning accuracies. For example, we find that a nearest-neighbor classifier used in combination with mean-subtraction and L2-normalization outperforms prior results in three out of five settings on the miniImageNet dataset.
Tasks Few-Shot Image Classification, Few-Shot Learning, Meta-Learning
Published 2019-11-12
URL https://arxiv.org/abs/1911.04623v2
PDF https://arxiv.org/pdf/1911.04623v2.pdf
PWC https://paperswithcode.com/paper/simpleshot-revisiting-nearest-neighbor
Repo https://github.com/mileyan/simple_shot
Framework pytorch
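
The abstract's recipe is simple enough to write down directly: subtract the mean feature computed on the base classes, L2-normalize, and classify by nearest centroid. The sketch below assumes features have already been extracted by the convolutional network; the one-shot case reduces to plain nearest neighbor.

```python
import numpy as np

def simpleshot_predict(support_feats, support_labels, query_feats, base_mean):
    """Nearest-centroid few-shot classification with the CL2N transform
    (centering by the base-class mean followed by L2 normalization).

    support_feats : (n_support, d) features of the labeled support set
    query_feats   : (n_query, d) features to classify
    base_mean     : (d,) mean feature vector computed on the base classes
    """
    def cl2n(z):
        z = z - base_mean
        return z / np.linalg.norm(z, axis=1, keepdims=True)

    s, q = cl2n(support_feats), cl2n(query_feats)
    classes = np.unique(support_labels)
    centroids = np.stack([s[support_labels == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(q[:, None, :] - centroids[None, :, :], axis=-1)
    return classes[dists.argmin(axis=1)]
```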

Grammar Based Directed Testing of Machine Learning Systems

Title Grammar Based Directed Testing of Machine Learning Systems
Authors Sakshi Udeshi, Sudipta Chattopadhyay
Abstract The massive progress of machine learning has seen its application over a variety of domains in the past decade. But how do we develop a systematic, scalable and modular strategy to validate machine-learning systems? We present, to the best of our knowledge, the first approach that provides a systematic test framework for machine-learning systems accepting grammar-based inputs. Our OGMA approach automatically discovers erroneous behaviours in classifiers and leverages these erroneous behaviours to improve the respective models. OGMA exploits the inherent robustness properties present in any well-trained machine-learning model to direct test generation, thus implementing a scalable test-generation methodology. To evaluate our OGMA approach, we have tested it on three real-world natural language processing (NLP) classifiers. We have found thousands of erroneous behaviours in these systems. We also compare OGMA with a random test generation approach and observe that OGMA is more effective than such random test generation by up to 489%.
Tasks
Published 2019-02-26
URL https://arxiv.org/abs/1902.10027v3
PDF https://arxiv.org/pdf/1902.10027v3.pdf
PWC https://paperswithcode.com/paper/grammar-based-directed-testing-of-machine
Repo https://github.com/sakshiudeshi/Ogma
Framework none

SpaMHMM: Sparse Mixture of Hidden Markov Models for Graph Connected Entities

Title SpaMHMM: Sparse Mixture of Hidden Markov Models for Graph Connected Entities
Authors Diogo Pernes, Jaime S. Cardoso
Abstract We propose a framework to model the distribution of sequential data coming from a set of entities connected in a graph with a known topology. The method is based on a mixture of shared hidden Markov models (HMMs), which are jointly trained in order to exploit the knowledge of the graph structure and in such a way that the obtained mixtures tend to be sparse. Experiments in different application domains demonstrate the effectiveness and versatility of the method.
Tasks
Published 2019-03-31
URL http://arxiv.org/abs/1904.00442v1
PDF http://arxiv.org/pdf/1904.00442v1.pdf
PWC https://paperswithcode.com/paper/spamhmm-sparse-mixture-of-hidden-markov
Repo https://github.com/dpernes/spamhmm
Framework none

Meta-Learning with Differentiable Convex Optimization

Title Meta-Learning with Differentiable Convex Optimization
Authors Kwonjoon Lee, Subhransu Maji, Avinash Ravichandran, Stefano Soatto
Abstract Many meta-learning approaches for few-shot learning rely on simple base learners such as nearest-neighbor classifiers. However, even in the few-shot regime, discriminatively trained linear predictors can offer better generalization. We propose to use these predictors as base learners to learn representations for few-shot learning and show they offer better tradeoffs between feature size and performance across a range of few-shot recognition benchmarks. Our objective is to learn feature embeddings that generalize well under a linear classification rule for novel categories. To efficiently solve the objective, we exploit two properties of linear classifiers: implicit differentiation of the optimality conditions of the convex problem and the dual formulation of the optimization problem. This allows us to use high-dimensional embeddings with improved generalization at a modest increase in computational overhead. Our approach, named MetaOptNet, achieves state-of-the-art performance on miniImageNet, tieredImageNet, CIFAR-FS, and FC100 few-shot learning benchmarks. Our code is available at https://github.com/kjunelee/MetaOptNet.
Tasks Few-Shot Image Classification, Few-Shot Learning, Meta-Learning
Published 2019-04-07
URL http://arxiv.org/abs/1904.03758v2
PDF http://arxiv.org/pdf/1904.03758v2.pdf
PWC https://paperswithcode.com/paper/meta-learning-with-differentiable-convex
Repo https://github.com/cyvius96/few-shot-meta-baseline
Framework pytorch
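
A minimal sketch of a differentiable convex base learner in the style described above, using a ridge-regression head (the simpler of the convex learners discussed in the paper) solved in closed form so that gradients flow back into the embedding network. The headline MetaOptNet results use an SVM head solved through its dual with implicit differentiation, which is not reproduced here; solving in the dual is also what keeps the cost modest when embeddings are high-dimensional.

```python
import torch

def ridge_head(support, support_onehot, query, lam=1.0):
    """Differentiable ridge-regression base learner for one few-shot episode.

    support        : (n_s, d) support embeddings (output of the feature network)
    support_onehot : (n_s, n_way) one-hot labels as floats
    query          : (n_q, d) query embeddings
    The closed-form solve keeps the whole episode differentiable, so a loss on the
    query logits can be backpropagated into the embedding network.
    """
    d = support.shape[1]
    gram = support.t() @ support + lam * torch.eye(d, device=support.device)
    W = torch.linalg.solve(gram, support.t() @ support_onehot)   # (d, n_way)
    return query @ W                                             # query logits

# Typical use: logits = ridge_head(f(x_support), y_onehot, f(x_query));
# loss = F.cross_entropy(logits, y_query); loss.backward() updates f.
```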

XLSor: A Robust and Accurate Lung Segmentor on Chest X-Rays Using Criss-Cross Attention and Customized Radiorealistic Abnormalities Generation

Title XLSor: A Robust and Accurate Lung Segmentor on Chest X-Rays Using Criss-Cross Attention and Customized Radiorealistic Abnormalities Generation
Authors Youbao Tang, Yuxing Tang, Jing Xiao, Ronald M. Summers
Abstract This paper proposes a novel framework for lung segmentation in chest X-rays. It consists of two key contributions, a criss-cross attention based segmentation network and radiorealistic chest X-ray image synthesis (i.e. a synthesized radiograph that appears anatomically realistic) for data augmentation. The criss-cross attention modules capture rich global contextual information in both horizontal and vertical directions for all the pixels, thus facilitating accurate lung segmentation. To reduce the manual annotation burden and to train a robust lung segmentor that can be adapted to pathological lungs with hazy lung boundaries, an image-to-image translation module is employed to synthesize radiorealistic abnormal CXRs from normal ones for data augmentation. The lung masks of synthetic abnormal CXRs are propagated from the segmentation results of their normal counterparts, and then serve as pseudo masks for robust segmentor training. In addition, we annotate 100 CXRs with lung masks on the more challenging NIH Chest X-ray dataset, which contains both posteroanterior and anteroposterior views, for evaluation. Extensive experiments validate the robustness and effectiveness of the proposed framework. The code and data can be found at https://github.com/rsummers11/CADLab/tree/master/Lung_Segmentation_XLSor .
Tasks Data Augmentation, Image Generation, Image-to-Image Translation
Published 2019-04-19
URL http://arxiv.org/abs/1904.09229v1
PDF http://arxiv.org/pdf/1904.09229v1.pdf
PWC https://paperswithcode.com/paper/xlsor-a-robust-and-accurate-lung-segmentor-on
Repo https://github.com/rsummers11/CADLab/tree/master/Lung_Segmentation_XLSor
Framework pytorch

Remove Cosine Window from Correlation Filter-based Visual Trackers: When and How

Title Remove Cosine Window from Correlation Filter-based Visual Trackers: When and How
Authors Feng Li, Xiaohe Wu, Wangmeng Zuo, David Zhang, Lei Zhang
Abstract Correlation filters (CFs) have been continuously advancing the state-of-the-art tracking performance and have been extensively studied in recent years. Most of the existing CF trackers adopt a cosine window to spatially reweight the base image to alleviate boundary discontinuity. However, the cosine window places more emphasis on the central region of the base image and risks contaminating negative training samples during model learning. On the other hand, the spatial regularization deployed in many recent CF trackers plays a similar role to the cosine window by enforcing a spatial penalty on CF coefficients. Therefore, in this paper we investigate the feasibility of removing the cosine window from CF trackers with spatial regularization. When the cosine window is simply removed, CF with spatial regularization still suffers from a small degree of boundary discontinuity. To tackle this issue, binary and Gaussian shaped mask functions are further introduced for eliminating boundary discontinuity while reweighting the estimation error of each training sample, and can be incorporated into multiple CF trackers with spatial regularization. In comparison to the counterparts with a cosine window, our methods are effective in handling boundary discontinuity and sample contamination, thereby benefiting tracking performance. Extensive experiments on three benchmarks show that our methods perform favorably against state-of-the-art trackers using either handcrafted or deep CNN features. The code is publicly available at https://github.com/lifeng9472/Removing_cosine_window_from_CF_trackers.
Tasks
Published 2019-05-16
URL https://arxiv.org/abs/1905.06648v1
PDF https://arxiv.org/pdf/1905.06648v1.pdf
PWC https://paperswithcode.com/paper/190506648
Repo https://github.com/lifeng9472/Removing_cosine_window_from_CF_trackers
Framework none
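
For readers unfamiliar with the windows being discussed, the snippet below constructs the standard cosine (Hann) window and a Gaussian-shaped mask of the kind the paper considers. Note that in the paper the Gaussian/binary masks reweight the estimation error of each training sample inside the CF objective rather than the image itself, and the sigma_ratio parameter here is an arbitrary illustrative choice.

```python
import numpy as np

def cosine_window(h, w):
    """Standard cosine (Hann) window used to reweight the base image in CF trackers."""
    return np.outer(np.hanning(h), np.hanning(w))

def gaussian_mask(h, w, sigma_ratio=0.3):
    """Gaussian-shaped mask, one of the replacements considered in the paper."""
    ys = np.arange(h) - (h - 1) / 2.0
    xs = np.arange(w) - (w - 1) / 2.0
    Y, X = np.meshgrid(ys, xs, indexing="ij")
    return np.exp(-0.5 * ((Y / (sigma_ratio * h)) ** 2 + (X / (sigma_ratio * w)) ** 2))

# Reweighting a training patch with the cosine window nearly zeroes out its borders,
# which is what contaminates shifted (negative) samples during model learning.
patch = np.random.rand(64, 64)
weighted = patch * cosine_window(64, 64)
```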

Few-Shot Learning with Global Class Representations

Title Few-Shot Learning with Global Class Representations
Authors Tiange Luo, Aoxue Li, Tao Xiang, Weiran Huang, Liwei Wang
Abstract In this paper, we propose to tackle the challenging few-shot learning (FSL) problem by learning global class representations using both base and novel class training samples. In each training episode, an episodic class mean computed from a support set is registered with the global representation via a registration module. This produces a registered global class representation for computing the classification loss using a query set. Though following a similar episodic training pipeline as existing meta learning based approaches, our method differs significantly in that novel class training samples are involved in the training from the beginning. To compensate for the lack of novel class training samples, an effective sample synthesis strategy is developed to avoid overfitting. Importantly, by joint base-novel class training, our approach can be easily extended to a more practical yet challenging FSL setting, i.e., generalized FSL, where the label space of test data is extended to both base and novel classes. Extensive experiments show that our approach is effective for both of the two FSL settings.
Tasks Few-Shot Image Classification, Few-Shot Learning, Generalized Few-Shot Classification, Meta-Learning
Published 2019-08-14
URL https://arxiv.org/abs/1908.05257v1
PDF https://arxiv.org/pdf/1908.05257v1.pdf
PWC https://paperswithcode.com/paper/few-shot-learning-with-global-class
Repo https://github.com/tiangeluo/fsl-global
Framework none

AirFace: Lightweight and Efficient Model for Face Recognition

Title AirFace: Lightweight and Efficient Model for Face Recognition
Authors Xianyang Li, Feng Wang, Qinghao Hu, Cong Leng
Abstract With the development of convolutional neural networks, significant progress has been made in computer vision tasks. However, the commonly used softmax loss and the highly efficient network architectures designed for common visual tasks are not as effective for face recognition. In this paper, we propose a novel loss function named Li-ArcFace based on ArcFace. Li-ArcFace takes the value of the angle through a linear function as the target logit rather than through the cosine function, which gives better convergence and performance on low-dimensional embedding feature learning for face recognition. In terms of network architecture, we improved the performance of MobileFaceNet by increasing the network depth and width and adding an attention module. Besides, we found some useful training tricks for face recognition. With all of the above, we won second place in the deepglint-light challenge of LFR2019.
Tasks Face Recognition
Published 2019-07-29
URL https://arxiv.org/abs/1907.12256v3
PDF https://arxiv.org/pdf/1907.12256v3.pdf
PWC https://paperswithcode.com/paper/airfacelightweight-and-efficient-model-for
Repo https://github.com/pshashk/seesaw-facenet
Framework pytorch
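
The abstract only says that Li-ArcFace maps the angle through a linear function to obtain the target logit instead of its cosine. The sketch below shows one way such an angle-linear margin logit could look in PyTorch; the specific linear map, margin, and scale are assumptions for illustration, not the paper's exact definition.

```python
import math
import torch
import torch.nn.functional as F

def li_arcface_logits(embeddings, weight, labels, margin=0.45, scale=64.0):
    """Sketch of an angle-linear margin logit in the spirit of Li-ArcFace.

    embeddings : (B, d) feature vectors, weight : (n_classes, d) class weights.
    The target class receives an additive angular margin, and all logits are a
    linear function of the angle instead of its cosine (illustrative constants).
    """
    cos = F.linear(F.normalize(embeddings), F.normalize(weight)).clamp(-1 + 1e-7, 1 - 1e-7)
    theta = torch.acos(cos)                                   # angles in [0, pi]
    target = F.one_hot(labels, weight.shape[0]).bool()
    theta = torch.where(target, theta + margin, theta)        # margin on the true class only
    logits = (math.pi - 2.0 * theta) / math.pi                # linear in the angle, roughly in [-1, 1]
    return scale * logits

# loss = F.cross_entropy(li_arcface_logits(emb, W, y), y)
```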