April 3, 2020

3275 words 16 mins read

Paper Group AWR 62

Robust binary classification with the 01 loss. CRVOS: Clue Refining Network for Video Object Segmentation. Mean shift cluster recognition method implementation in the nested sampling algorithm. The Synthinel-1 dataset: a collection of high resolution synthetic overhead imagery for building segmentation. Low Resource Neural Machine Translation: A Be …

Robust binary classification with the 01 loss

Title Robust binary classification with the 01 loss
Authors Yunzhe Xue, Meiyan Xie, Usman Roshan
Abstract The 01 loss is robust to outliers and tolerant to noisy data compared to convex loss functions. We conjecture that the 01 loss may also be more robust to adversarial attacks. To study this empirically we have developed a stochastic coordinate descent algorithm for a linear 01 loss classifier and a single hidden layer 01 loss neural network. Due to the absence of the gradient we iteratively update coordinates on random subsets of the data for a fixed number of epochs. We show our algorithms to be fast and comparable in accuracy to the linear support vector machine and the logistic loss single hidden layer network for binary classification on several image benchmarks, thus establishing that our method is on par in test accuracy with convex losses. We then subject them to accurately trained substitute-model black-box attacks on the same image benchmarks and find them to be more robust than their convex counterparts. On the CIFAR10 binary classification task between classes 0 and 1, with an adversarial perturbation of 0.0625, the MLP01 network loses 27% in accuracy whereas the MLP-logistic counterpart loses 83%. Similarly, on STL10 and ImageNet binary classification between classes 0 and 1, the MLP01 network loses 21% and 20% while MLP-logistic loses 67% and 45%, respectively. On MNIST, which is a well-separable dataset, we find MLP01 comparable to MLP-logistic and show under simulation how and why our 01 loss solver is less robust there. We then propose adversarial training for our linear 01 loss solver that significantly improves its robustness on MNIST and all other datasets while retaining clean test accuracy. Finally, we show practical applications of our method to deter traffic sign and facial recognition adversarial attacks. We discuss attacks with the 01 loss, substitute model accuracy, and several future avenues such as multiclass, 01 loss convolutions, and further adversarial training.
Tasks
Published 2020-02-09
URL https://arxiv.org/abs/2002.03444v1
PDF https://arxiv.org/pdf/2002.03444v1.pdf
PWC https://paperswithcode.com/paper/robust-binary-classification-with-the-01-loss
Repo https://github.com/zero-one-loss/01loss
Framework pytorch
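
As a rough illustration of the stochastic coordinate descent idea described in the abstract (a minimal sketch, not the authors' released solver; all function and parameter names are illustrative), a linear 01 loss classifier can be trained by perturbing one randomly chosen coordinate at a time and keeping the change only if the 0-1 error on a random data subset improves:

```python
import numpy as np

def zero_one_loss(w, b, X, y):
    """Fraction of misclassified points, with labels y in {-1, +1}."""
    return np.mean(np.sign(X @ w + b) != y)

def scd_01_loss(X, y, epochs=100, batch_frac=0.5, step=0.1, seed=0):
    """Toy stochastic coordinate descent for a linear 0-1 loss classifier.

    Each epoch: draw a random subset of the data, pick a random coordinate,
    try a small +/- perturbation, and keep the candidate with the lowest
    0-1 loss on the subset. No gradient is ever needed; the bias is kept
    fixed for brevity.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = rng.normal(size=d), 0.0
    for _ in range(epochs):
        idx = rng.choice(n, size=max(1, int(batch_frac * n)), replace=False)
        Xs, ys = X[idx], y[idx]
        j = rng.integers(d)
        best_w, best_loss = w, zero_one_loss(w, b, Xs, ys)
        for delta in (step, -step):
            cand = w.copy()
            cand[j] += delta
            loss = zero_one_loss(cand, b, Xs, ys)
            if loss < best_loss:
                best_w, best_loss = cand, loss
        w = best_w
    return w, b
```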

CRVOS: Clue Refining Network for Video Object Segmentation

Title CRVOS: Clue Refining Network for Video Object Segmentation
Authors Suhwan Cho, MyeongAh Cho, Tae-young Chung, Heansung Lee, Sangyoun Lee
Abstract The encoder-decoder based methods for semi-supervised video object segmentation (Semi-VOS) have received extensive attention due to their superior performance. However, most of them have complex intermediate networks that generate strong specifiers in order to be robust against challenging scenarios, which is quite inefficient when dealing with relatively simple scenarios. To solve this problem, we propose a real-time Clue Refining Network for Video Object Segmentation (CRVOS) which does not have a complex intermediate network. In this work, we propose a simple specifier, referred to as the Clue, which consists of the previous frame's coarse mask and coordinate information. We also propose a novel refine module which achieves higher performance than general ones by using a deconvolution layer instead of bilinear upsampling. Our proposed network, CRVOS, is the fastest method with competitive performance. On the DAVIS16 validation set, CRVOS achieves 61 FPS and a J&F score of 81.6%.
Tasks Semantic Segmentation, Semi-supervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking
Published 2020-02-10
URL https://arxiv.org/abs/2002.03651v1
PDF https://arxiv.org/pdf/2002.03651v1.pdf
PWC https://paperswithcode.com/paper/crvos-clue-refining-network-for-video-object
Repo https://github.com/suhwan-cho/CRVOS
Framework pytorch
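
The refine-module idea (a learned deconvolution in place of bilinear upsampling) could look roughly like the following PyTorch sketch; the layer sizes, fusion scheme, and names are assumptions rather than the authors' architecture:

```python
import torch
import torch.nn as nn

class RefineModule(nn.Module):
    """Toy refine block: fuse a coarse mask/feature with a skip feature,
    then upsample with a learned deconvolution instead of a bilinear resize."""

    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.fuse = nn.Conv2d(in_ch + skip_ch, out_ch, kernel_size=3, padding=1)
        # Learned 2x upsampling: stride-2 transposed convolution.
        self.up = nn.ConvTranspose2d(out_ch, out_ch, kernel_size=4, stride=2, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, coarse, skip):
        x = torch.cat([coarse, skip], dim=1)
        x = self.act(self.fuse(x))
        return self.act(self.up(x))

# Example: upsample an 8x8 coarse feature to 16x16 using an 8x8 skip feature.
coarse = torch.randn(1, 64, 8, 8)
skip = torch.randn(1, 32, 8, 8)
out = RefineModule(64, 32, 32)(coarse, skip)   # -> (1, 32, 16, 16)
```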

Mean shift cluster recognition method implementation in the nested sampling algorithm

Title Mean shift cluster recognition method implementation in the nested sampling algorithm
Authors M. Trassinelli, Pierre Ciccodicola
Abstract Nested sampling is an efficient algorithm for the calculation of the Bayesian evidence and posterior parameter probability distributions. It is based on the step-by-step exploration of the parameter space by Monte Carlo sampling with a set of values, called live points, that evolve towards the region of interest, i.e. where the likelihood function is maximal. In the presence of several local likelihood maxima, the algorithm converges with difficulty. Some systematic errors can also be introduced by unexplored regions of the parameter volume. To avoid this, different methods have been proposed in the literature for an efficient search for new live points, even in the presence of local maxima. Here we present a new solution based on the mean shift cluster recognition method implemented in a random walk search algorithm. The cluster recognition is integrated into the Bayesian analysis program NestedFit. It is tested on the analysis of some difficult cases. Compared to the analysis results without cluster recognition, the computation time is considerably reduced. At the same time, the entire parameter space is efficiently explored, which translates into a smaller uncertainty in the extracted value of the Bayesian evidence.
Tasks
Published 2020-01-31
URL https://arxiv.org/abs/2002.01431v1
PDF https://arxiv.org/pdf/2002.01431v1.pdf
PWC https://paperswithcode.com/paper/mean-shift-cluster-recognition-method
Repo https://github.com/martinit18/nested_fit
Framework none
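
A rough illustration of mean-shift cluster recognition applied to live points (using scikit-learn rather than the NestedFit implementation; the quantile choice is arbitrary):

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

def recognize_clusters(live_points, quantile=0.3):
    """Group the current live points into clusters with mean shift.

    `live_points` is an (n_points, n_params) array of parameter values.
    Returns an integer label per live point; new live points can then be
    proposed by random walks restricted to a single cluster.
    """
    bandwidth = estimate_bandwidth(live_points, quantile=quantile)
    return MeanShift(bandwidth=bandwidth).fit_predict(live_points)

# Two well-separated likelihood modes in a 2D parameter space.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(3, 0.1, (50, 2))])
print(recognize_clusters(pts))  # two distinct labels expected
```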

The Synthinel-1 dataset: a collection of high resolution synthetic overhead imagery for building segmentation

Title The Synthinel-1 dataset: a collection of high resolution synthetic overhead imagery for building segmentation
Authors Fanjie Kong, Bohao Huang, Kyle Bradbury, Jordan M. Malof
Abstract Recently, deep learning models - namely convolutional neural networks (CNNs) - have yielded impressive performance for the task of building segmentation on large overhead (e.g., satellite) imagery benchmarks. However, these benchmark datasets only capture a small fraction of the variability present in real-world overhead imagery, limiting the ability to properly train, or evaluate, models for real-world application. Unfortunately, developing a dataset that captures even a small fraction of real-world variability is typically infeasible due to the cost of imagery and of manual pixel-wise labeling of the imagery. In this work we develop an approach to rapidly and cheaply generate large and diverse virtual environments from which we can capture synthetic overhead imagery for training segmentation CNNs. Using this approach, we generate and publicly release a collection of synthetic overhead imagery - termed Synthinel-1 - with full pixel-wise building labels. We use several benchmark datasets to demonstrate that Synthinel-1 is consistently beneficial when used to augment real-world training imagery, especially when CNNs are tested on novel geographic locations or conditions.
Tasks
Published 2020-01-15
URL https://arxiv.org/abs/2001.05130v1
PDF https://arxiv.org/pdf/2001.05130v1.pdf
PWC https://paperswithcode.com/paper/the-synthinel-1-dataset-a-collection-of-high
Repo https://github.com/timqqt/Synthinel
Framework none
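
The augmentation recipe suggested by the abstract, training on real tiles plus synthetic Synthinel-1 tiles, boils down to concatenating the two datasets; the sketch below uses random tensors as stand-ins for the actual file-backed datasets:

```python
import torch
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader

# Stand-ins for real and synthetic (image, building-mask) tile datasets; in
# practice these would be file-backed Dataset classes sharing the same tile
# size and label convention.
real_ds = TensorDataset(torch.randn(100, 3, 64, 64), torch.zeros(100, 64, 64).long())
synthetic_ds = TensorDataset(torch.randn(200, 3, 64, 64), torch.zeros(200, 64, 64).long())

# Augment the real training imagery with synthetic tiles by concatenation.
train_ds = ConcatDataset([real_ds, synthetic_ds])
loader = DataLoader(train_ds, batch_size=8, shuffle=True)

images, masks = next(iter(loader))
print(images.shape, masks.shape)  # torch.Size([8, 3, 64, 64]) torch.Size([8, 64, 64])
```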

Low Resource Neural Machine Translation: A Benchmark for Five African Languages

Title Low Resource Neural Machine Translation: A Benchmark for Five African Languages
Authors Surafel M. Lakew, Matteo Negri, Marco Turchi
Abstract Recent advances in Neural Machine Translation (NMT) have shown improvements in low-resource language (LRL) translation tasks. In this work, we benchmark NMT between English and five African LRLs (Swahili, Amharic, Tigrigna, Oromo, Somali [SATOS]). We collected the available resources for the SATOS languages to evaluate the current state of NMT for LRLs. Our evaluation, comparing a baseline single-language-pair NMT model against semi-supervised learning, transfer learning, and multilingual modeling, shows significant performance improvements in both the En-LRL and LRL-En directions. In terms of averaged BLEU score, the multilingual approach shows the largest gains, up to +5 points, in six out of ten translation directions. To demonstrate the generalization capability of each model, we also report results on multi-domain test sets. We release the standardized experimental data and the test sets for future work addressing the challenges of NMT in under-resourced settings, in particular for the SATOS languages.
Tasks Low-Resource Neural Machine Translation, Machine Translation, Transfer Learning
Published 2020-03-31
URL https://arxiv.org/abs/2003.14402v1
PDF https://arxiv.org/pdf/2003.14402v1.pdf
PWC https://paperswithcode.com/paper/low-resource-neural-machine-translation-a
Repo https://github.com/surafelml/Afro-NMT
Framework none

Inferring Convolutional Neural Networks’ accuracies from their architectural characterizations

Title Inferring Convolutional Neural Networks’ accuracies from their architectural characterizations
Authors Duc Hoang, Jesse Hamer, Gabriel N. Perdue, Steven R. Young, Jonathan Miller, Anushree Ghosh
Abstract Convolutional Neural Networks (CNNs) have shown strong promise for analyzing scientific data from many domains, including particle imaging detectors. However, the challenge of choosing the appropriate network architecture (depth, kernel shapes, activation functions, etc.) for specific applications and different data sets is still poorly understood. In this paper, we study the relationships between a CNN's architecture and its performance by proposing a systematic language that is useful for comparing different CNN architectures before training time. We characterize a CNN's architecture by different attributes, and demonstrate that these attributes can be predictive of the network's performance in two specific computer vision-based physics problems – event vertex finding and hadron multiplicity classification in the MINERvA experiment at Fermi National Accelerator Laboratory. In doing so, we extract several architectural attributes from optimized network architectures for the physics problems, which are outputs of a model selection algorithm called Multi-node Evolutionary Neural Networks for Deep Learning (MENNDL). We use machine learning models to predict whether a network can perform better than a certain threshold accuracy before training. The models perform 16-20% better than random guessing. Additionally, we found a coefficient of determination of 0.966 for an Ordinary Least Squares model in a regression on accuracy over a large population of networks.
Tasks Model Selection
Published 2020-01-07
URL https://arxiv.org/abs/2001.02160v2
PDF https://arxiv.org/pdf/2001.02160v2.pdf
PWC https://paperswithcode.com/paper/inferring-convolutional-neural-networks
Repo https://github.com/Duchstf/CNN-Architectural-Analysis
Framework none
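
A minimal sketch of the analysis pattern described above, using synthetic stand-in data (the real study uses attributes extracted from MENNDL-optimized networks): classify whether a network beats a threshold accuracy, and fit an ordinary least squares regression directly on accuracy.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for architectural attributes (e.g. depth, filter counts,
# kernel widths) and the measured accuracy of each trained network.
attrs = rng.uniform(size=(500, 6))
accuracy = 0.5 + 0.4 * attrs[:, 0] - 0.2 * attrs[:, 3] + rng.normal(0, 0.02, 500)

X_tr, X_te, y_tr, y_te = train_test_split(attrs, accuracy, random_state=0)

# Task 1: predict whether a network will exceed a threshold accuracy.
threshold = 0.6
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr > threshold)
print("classification accuracy:", clf.score(X_te, y_te > threshold))

# Task 2: ordinary least squares regression on accuracy itself.
ols = LinearRegression().fit(X_tr, y_tr)
print("R^2:", ols.score(X_te, y_te))
```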

Discriminative Particle Filter Reinforcement Learning for Complex Partial Observations

Title Discriminative Particle Filter Reinforcement Learning for Complex Partial Observations
Authors Xiao Ma, Peter Karkus, David Hsu, Wee Sun Lee, Nan Ye
Abstract Deep reinforcement learning is successful in decision making for sophisticated games, such as Atari, Go, etc. However, real-world decision making often requires reasoning with partial information extracted from complex visual observations. This paper presents Discriminative Particle Filter Reinforcement Learning (DPFRL), a new reinforcement learning framework for complex partial observations. DPFRL encodes a differentiable particle filter in the neural network policy for explicit reasoning with partial observations over time. The particle filter maintains a belief using a learned discriminative update, which is trained end-to-end for decision making. We show that using the discriminative update instead of standard generative models results in significantly improved performance, especially for tasks with complex visual observations, because it circumvents the difficulty of modeling complex observations that are irrelevant to decision making. In addition, to extract features from the particle belief, we propose a new type of belief feature based on the moment generating function. DPFRL outperforms state-of-the-art POMDP RL models in Flickering Atari Games, an existing POMDP RL benchmark, and in Natural Flickering Atari Games, a new, more challenging POMDP RL benchmark introduced in this paper. Further, DPFRL performs well for visual navigation with real-world data in the Habitat environment.
Tasks Atari Games, Decision Making, Visual Navigation
Published 2020-02-23
URL https://arxiv.org/abs/2002.09884v1
PDF https://arxiv.org/pdf/2002.09884v1.pdf
PWC https://paperswithcode.com/paper/discriminative-particle-filter-reinforcement-1
Repo https://github.com/Yusufma03/DPFRL
Framework pytorch
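
The moment-generating-function belief feature admits a compact sketch (the evaluation vectors `vs` would typically be learned parameters; the shapes and names here are assumptions): evaluate the empirical MGF of the weighted particle belief at a small set of vectors and use the resulting values as the feature.

```python
import torch

def mgf_belief_features(particles, weights, vs):
    """Empirical moment generating function of a weighted particle belief.

    particles: (B, K, D) particle states, weights: (B, K) normalized weights,
    vs: (M, D) evaluation vectors (could be learned parameters).
    Returns (B, M) features M(v) = sum_k w_k * exp(<v, x_k>).
    """
    # (B, K, M) inner products between particles and evaluation vectors.
    proj = torch.einsum("bkd,md->bkm", particles, vs)
    return torch.einsum("bk,bkm->bm", weights, torch.exp(proj))

B, K, D, M = 4, 32, 8, 16
particles = torch.randn(B, K, D)
weights = torch.softmax(torch.randn(B, K), dim=-1)
vs = torch.randn(M, D) * 0.1          # small scale keeps exp() well-behaved
features = mgf_belief_features(particles, weights, vs)
print(features.shape)                 # torch.Size([4, 16])
```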

Binarized PMI Matrix: Bridging Word Embeddings and Hyperbolic Spaces

Title Binarized PMI Matrix: Bridging Word Embeddings and Hyperbolic Spaces
Authors Zhenisbek Assylbekov, Alibi Jangeldin
Abstract We show analytically that removing the sigmoid transformation in the SGNS objective does not significantly harm the quality of word vectors, and at the same time is related to factorizing a binarized PMI matrix which, in turn, can be treated as the adjacency matrix of a certain graph. Empirically, such a graph is a complex network, i.e. it has strong clustering and a scale-free degree distribution, and is tightly connected with hyperbolic spaces. In short, we show the connection between static word embeddings and hyperbolic spaces through the binarized PMI matrix using analytical and empirical methods.
Tasks Word Embeddings
Published 2020-02-27
URL https://arxiv.org/abs/2002.12005v1
PDF https://arxiv.org/pdf/2002.12005v1.pdf
PWC https://paperswithcode.com/paper/binarized-pmi-matrix-bridging-word-embeddings
Repo https://github.com/zh3nis/BPMI
Framework pytorch
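
A toy sketch of the binarized PMI construction (not the authors' pipeline): compute PMI from a word co-occurrence count matrix, threshold it at zero, and read the result as the adjacency matrix of a word graph.

```python
import numpy as np

def binarized_pmi(cooc):
    """Binarize the pointwise mutual information of a co-occurrence matrix.

    cooc[i, j] = count of word i co-occurring with word j.
    Returns a 0/1 matrix: 1 where PMI(i, j) > 0, which can be read as the
    adjacency matrix of an undirected word graph.
    """
    total = cooc.sum()
    p_ij = cooc / total
    p_i = p_ij.sum(axis=1, keepdims=True)
    p_j = p_ij.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_ij / (p_i * p_j))
    pmi[~np.isfinite(pmi)] = -np.inf     # zero counts never become edges
    return (pmi > 0).astype(int)

cooc = np.array([[0, 8, 1],
                 [8, 0, 2],
                 [1, 2, 0]], dtype=float)
print(binarized_pmi(cooc))
```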

SPACE: Unsupervised Object-Oriented Scene Representation via Spatial Attention and Decomposition

Title SPACE: Unsupervised Object-Oriented Scene Representation via Spatial Attention and Decomposition
Authors Zhixuan Lin, Yi-Fu Wu, Skand Vishwanath Peri, Weihao Sun, Gautam Singh, Fei Deng, Jindong Jiang, Sungjin Ahn
Abstract The ability to decompose complex multi-object scenes into meaningful abstractions like objects is fundamental to achieving higher-level cognition. Previous approaches to unsupervised object-oriented scene representation learning are based on either spatial-attention or scene-mixture approaches and are limited in scalability, which is a main obstacle to modeling real-world scenes. In this paper, we propose a generative latent variable model, called SPACE, that provides a unified probabilistic modeling framework combining the best of the spatial-attention and scene-mixture approaches. SPACE can explicitly provide factorized object representations for foreground objects while also decomposing background segments of complex morphology. Previous models are good at either of these, but not both. SPACE also resolves the scalability problems of previous methods by incorporating parallel spatial attention and is thus applicable to scenes with a large number of objects without performance degradation. We show through experiments on Atari and 3D-Rooms that SPACE achieves the above properties consistently in comparison to SPAIR, IODINE, and GENESIS. Results of our experiments can be found on our project website: https://sites.google.com/view/space-project-page
Tasks Representation Learning
Published 2020-01-08
URL https://arxiv.org/abs/2001.02407v3
PDF https://arxiv.org/pdf/2001.02407v3.pdf
PWC https://paperswithcode.com/paper/space-unsupervised-object-oriented-scene
Repo https://github.com/sebamenabar/SPACE-Pytorch-Implementation
Framework pytorch

A Branching and Merging Convolutional Network with Homogeneous Filter Capsules

Title A Branching and Merging Convolutional Network with Homogeneous Filter Capsules
Authors Adam Byerly, Tatiana Kalganova, Ian Dear
Abstract We present a convolutional neural network design with additional branches after certain convolutions so that we can extract features with differing effective receptive fields and levels of abstraction. From each branch, we transform each of the final filters into a pair of homogeneous vector capsules. As the capsules are formed from entire filters, we refer to them as filter capsules. We then compare three methods for merging the branches: merging with equal weight and merging with learned weights, the latter with two different weight initialization strategies. This design, in combination with a domain-specific set of randomly applied augmentation techniques, establishes a new state of the art for the MNIST dataset with an accuracy of 99.84% for an ensemble of these models, as well as a new state of the art for a single model (99.79% accurate). These accuracies were achieved with a 75% reduction in both the number of parameters and the number of training epochs relative to the previously best performing capsule network on MNIST. All training was performed using the Adam optimizer and experienced no overfitting.
Tasks Image Classification
Published 2020-01-24
URL https://arxiv.org/abs/2001.09136v3
PDF https://arxiv.org/pdf/2001.09136v3.pdf
PWC https://paperswithcode.com/paper/a-branching-and-merging-convolutional-network
Repo https://github.com/AdamByerly/BMCNNwHFCs
Framework tf
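
A toy PyTorch sketch of the branch-and-merge structure (the homogeneous filter capsules themselves are omitted, and the layer sizes are assumptions): two branches with different effective receptive fields whose class scores are combined with learned merge weights.

```python
import torch
import torch.nn as nn

class BranchingMergingNet(nn.Module):
    """Toy branching/merging CNN: branches taken after different conv depths
    give features with different effective receptive fields; their logits are
    combined with learned merge weights (capsule layers omitted)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.head1 = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, num_classes))
        self.head2 = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes))
        # One learned merge weight per branch (could also be fixed to 0.5/0.5).
        self.merge = nn.Parameter(torch.ones(2))

    def forward(self, x):
        b1 = self.conv1(x)          # shallow branch, smaller receptive field
        b2 = self.conv2(b1)         # deeper branch, larger receptive field
        w = torch.softmax(self.merge, dim=0)
        return w[0] * self.head1(b1) + w[1] * self.head2(b2)

logits = BranchingMergingNet()(torch.randn(2, 1, 28, 28))
print(logits.shape)  # torch.Size([2, 10])
```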

On Model Evaluation under Non-constant Class Imbalance

Title On Model Evaluation under Non-constant Class Imbalance
Authors Jan Brabec, Tomáš Komárek, Vojtěch Franc, Lukáš Machlica
Abstract Many real-world classification problems are significantly class-imbalanced, to the detriment of the class of interest. The standard set of proper evaluation metrics is well known, but the usual assumption is that the test dataset imbalance equals the real-world imbalance. In practice, this assumption is often broken for various reasons. The reported results are then often too optimistic and may lead to wrong conclusions about the industrial impact and suitability of proposed techniques. We introduce methods focusing on evaluation under non-constant class imbalance. We show that not only the absolute values of commonly used metrics, but even the order of classifiers with respect to the evaluation metric used, is affected by a change of the imbalance rate. Finally, we demonstrate that subsampling the test dataset to match the class imbalance observed in the wild is not necessary and can even lead to significant errors in the classifier's performance estimate.
Tasks
Published 2020-01-15
URL https://arxiv.org/abs/2001.05571v1
PDF https://arxiv.org/pdf/2001.05571v1.pdf
PWC https://paperswithcode.com/paper/on-model-evaluation-under-non-constant-class
Repo https://github.com/CiscoCTA/nci_eval
Framework none
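
One concrete consequence of the paper's observation can be sketched analytically (an illustrative formula, not the paper's evaluation library): once the true-positive and false-positive rates are estimated on the test set, precision at any positive-class prior can be computed directly instead of subsampling the test data.

```python
def precision_at_prior(tpr, fpr, pi):
    """Precision if the positive-class prevalence were `pi`,
    given the classifier's true-positive rate and false-positive rate.

    precision(pi) = pi * TPR / (pi * TPR + (1 - pi) * FPR)
    """
    return pi * tpr / (pi * tpr + (1 - pi) * fpr)

# Same classifier (TPR=0.9, FPR=0.01) evaluated at two imbalance levels:
print(precision_at_prior(0.9, 0.01, pi=0.5))    # balanced test set: ~0.989
print(precision_at_prior(0.9, 0.01, pi=0.001))  # 1:1000 in the wild: ~0.083
```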

Solving Portfolio Optimization Problems Using MOEA/D and Levy Flight

Title Solving Portfolio Optimization Problems Using MOEA/D and Levy Flight
Authors Yifan He, Claus Aranha
Abstract Portfolio optimization is a financial task which requires the allocation of capital across a set of financial assets to achieve a better trade-off between return and risk. To solve this problem, recent studies have applied multi-objective evolutionary algorithms (MOEAs) owing to its natural bi-objective structure. This paper presents a method injecting a distribution-based mutation method named Lévy Flight into a decomposition-based MOEA named MOEA/D. The proposed algorithm is compared with three MOEA/D-like algorithms, NSGA-II, and other distribution-based mutation methods on five unconstrained portfolio optimization benchmarks from the OR library, sized from 31 to 225 assets, assessed with six metrics. Numerical results and statistical tests indicate that this method can outperform the comparison methods in most cases. We analyze how Lévy Flight contributes to this improvement by promoting global search early in the optimization, and we explain the improvement by considering the interaction between the mutation method and the properties of the problem.
Tasks Portfolio Optimization
Published 2020-03-15
URL https://arxiv.org/abs/2003.06737v1
PDF https://arxiv.org/pdf/2003.06737v1.pdf
PWC https://paperswithcode.com/paper/solving-portfolio-optimization-problems-using
Repo https://github.com/Y1fanHE/po_with_moead-levy
Framework none
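
A hedged sketch of a Lévy-flight mutation operator using Mantegna's algorithm for Lévy-stable steps (the parameter names and the simple repair step are illustrative choices, not the paper's exact operator):

```python
import numpy as np
from math import gamma, sin, pi

def levy_flight_mutation(x, beta=1.5, scale=0.01, rng=None):
    """Mutate a solution vector with Lévy-distributed steps (Mantegna's
    algorithm), producing occasional long jumps that encourage global
    search early in the optimization."""
    rng = rng or np.random.default_rng()
    sigma_u = (gamma(1 + beta) * sin(pi * beta / 2)
               / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, size=x.shape)
    v = rng.normal(0.0, 1.0, size=x.shape)
    step = u / np.abs(v) ** (1 / beta)
    child = x + scale * step
    # Simple repair: portfolio weights stay non-negative and sum to one.
    child = np.clip(child, 0.0, None)
    return child / child.sum()

weights = np.full(5, 0.2)                       # equal-weight portfolio
print(levy_flight_mutation(weights, rng=np.random.default_rng(1)))
```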

DIBS: Diversity inducing Information Bottleneck in Model Ensembles

Title DIBS: Diversity inducing Information Bottleneck in Model Ensembles
Authors Samarth Sinha, Homanga Bharadhwaj, Anirudh Goyal, Hugo Larochelle, Animesh Garg, Florian Shkurti
Abstract Although deep learning models have achieved state-of-the-art performance on a number of vision tasks, generalization over high-dimensional multi-modal data and reliable predictive uncertainty estimation are still active areas of research. Bayesian approaches, including Bayesian Neural Nets (BNNs), do not scale well to modern computer vision tasks, as they are difficult to train and have poor generalization under dataset shift. This motivates the need for effective ensembles which can generalize and give reliable uncertainty estimates. In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity-inducing adversarial loss for learning the stochastic latent variables and thereby obtain the diversity in output predictions necessary for modeling multi-modal data. We evaluate our method on benchmark datasets: MNIST, CIFAR100, TinyImageNet and MIT Places 2, and, compared to the most competitive baselines, show significant improvements in classification accuracy, under a shift in the data distribution, and in out-of-distribution detection. Code will be released at https://github.com/rvl-lab-utoronto/dibs
Tasks Out-of-Distribution Detection
Published 2020-03-10
URL https://arxiv.org/abs/2003.04514v1
PDF https://arxiv.org/pdf/2003.04514v1.pdf
PWC https://paperswithcode.com/paper/dibs-diversity-inducing-information
Repo https://github.com/rvl-lab-utoronto/dibs
Framework pytorch

Unsupervised Discovery of Interpretable Directions in the GAN Latent Space

Title Unsupervised Discovery of Interpretable Directions in the GAN Latent Space
Authors Andrey Voynov, Artem Babenko
Abstract The latent spaces of typical GAN models often have semantically meaningful directions. Moving in these directions corresponds to human-interpretable image transformations, such as zooming or recoloring, enabling a more controllable generation process. However, the discovery of such directions is currently performed in a supervised manner, requiring human labels, pretrained models, or some form of self-supervision. These requirements can severely limit the range of directions existing approaches can discover. In this paper, we introduce an unsupervised method to identify interpretable directions in the latent space of a pretrained GAN model. By a simple model-agnostic procedure, we find directions corresponding to sensible semantic manipulations without any form of (self-)supervision. Furthermore, we reveal several non-trivial findings which would be difficult to obtain by existing methods, e.g., a direction corresponding to background removal. As an immediate practical benefit of our work, we show how to exploit this finding to achieve a new state of the art for the problem of saliency detection.
Tasks Saliency Detection
Published 2020-02-10
URL https://arxiv.org/abs/2002.03754v2
PDF https://arxiv.org/pdf/2002.03754v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-discovery-of-interpretable
Repo https://github.com/anvoynov/GANLatentDiscovery
Framework pytorch
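
The training signal of such a model-agnostic procedure can be sketched with a deliberately tiny toy (a frozen stand-in "generator" instead of a real pretrained GAN, and made-up dimensions): a matrix of candidate directions A and a reconstructor are optimized jointly so that, from G(z) and G(z + eps * A[:, k]), the reconstructor recovers both the direction index k and the shift magnitude eps.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

latent_dim, num_dirs = 32, 8
# Stand-in for a pretrained generator; its weights stay frozen.
G = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, 64))
for p in G.parameters():
    p.requires_grad_(False)

A = nn.Parameter(torch.randn(latent_dim, num_dirs) * 0.1)   # candidate directions
reconstructor = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, num_dirs + 1))
opt = torch.optim.Adam([A, *reconstructor.parameters()], lr=1e-3)

for step in range(200):
    z = torch.randn(16, latent_dim)
    k = torch.randint(num_dirs, (16,))                       # which direction to apply
    eps = torch.empty(16, 1).uniform_(-3.0, 3.0)             # how far to move
    shifted = z + eps * F.normalize(A, dim=0)[:, k].T        # move along direction k
    pair = torch.cat([G(z), G(shifted)], dim=1)
    out = reconstructor(pair)
    # Recover the applied direction (classification) and magnitude (regression).
    loss = F.cross_entropy(out[:, :num_dirs], k) + F.mse_loss(out[:, -1:], eps)
    opt.zero_grad()
    loss.backward()
    opt.step()
```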

Bayesian Semi-supervised learning under nonparanormality

Title Bayesian Semi-supervised learning under nonparanormality
Authors Rui Zhu, Subhashis Ghosal
Abstract Semi-supervised learning is a classification method which makes use of both labeled and unlabeled data for training. In this paper, we propose a semi-supervised learning algorithm using a Bayesian semi-supervised model. We make the general assumption that the observations follow two multivariate normal distributions, depending on their true labels, after the same unknown transformation. We use B-splines to put a prior on the transformation function for each component. To use unlabeled data in a semi-supervised setting, we assume the labels are missing at random. The posterior distributions can then be described using our assumptions, and we compute them by Gibbs sampling. The proposed method is then compared with several other available methods through an extensive simulation study. Finally, we apply the proposed method in real-data contexts for diagnosing breast cancer and classifying radar returns. We conclude that the proposed method has better prediction accuracy in a wide variety of cases.
Tasks
Published 2020-01-11
URL https://arxiv.org/abs/2001.03798v1
PDF https://arxiv.org/pdf/2001.03798v1.pdf
PWC https://paperswithcode.com/paper/bayesian-semi-supervised-learning-under
Repo https://github.com/RrZzZz/Bayesian-Semi-supervised-learning-under-nonparanormality
Framework none