April 3, 2020

Paper Group AWR 71

Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision. Unbiased and Efficient Log-Likelihood Estimation with Inverse Binomial Sampling. Single-exposure absorption imaging of ultracold atoms using deep learning. Cross-Iteration Batch Normalization. ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network. Gettin …

Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision


Title	Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision
Authors	Denis Gudovskiy, Alec Hodgkinson, Takuya Yamaguchi, Sotaro Tsukizawa
Abstract	Active learning (AL) aims to minimize labeling efforts for data-demanding deep neural networks (DNNs) by selecting the most representative data points for annotation. However, currently used methods are ill-equipped to deal with biased data. The main motivation of this paper is to consider a realistic setting for pool-based semi-supervised AL, where the unlabeled collection of train data is biased. We theoretically derive an optimal acquisition function for AL in this setting. It can be formulated as distribution shift minimization between the unlabeled train data and a weakly-labeled validation dataset. To implement such an acquisition function, we propose a low-complexity method for feature density matching using a self-supervised Fisher kernel (FK) as well as several novel pseudo-label estimators. Our FK-based method outperforms state-of-the-art methods on MNIST, SVHN, and ImageNet classification while requiring only 1/10th of the processing. The conducted experiments show at least a 40% drop in labeling efforts for the biased class-imbalanced data compared to existing methods.
Tasks	Active Learning
Published	2020-03-01
URL	https://arxiv.org/abs/2003.00393v1
PDF	https://arxiv.org/pdf/2003.00393v1.pdf
PWC	https://paperswithcode.com/paper/deep-active-learning-for-biased-datasets-via
Repo	https://github.com/gudovskiy/al-fk-self-supervision
Framework	pytorch

Unbiased and Efficient Log-Likelihood Estimation with Inverse Binomial Sampling


Title	Unbiased and Efficient Log-Likelihood Estimation with Inverse Binomial Sampling
Authors	Bas van Opheusden, Luigi Acerbi, Wei Ji Ma
Abstract	The fate of scientific hypotheses often relies on the ability of a computational model to explain the data, quantified in modern statistical approaches by the likelihood function. The log-likelihood is the key element for parameter estimation and model evaluation. However, the log-likelihood of complex models in fields such as computational biology and neuroscience is often intractable to compute analytically or numerically. In those cases, researchers can often only estimate the log-likelihood by comparing observed data with synthetic observations generated by model simulations. Standard techniques to approximate the likelihood via simulation either use summary statistics of the data or are at risk of producing severe biases in the estimate. Here, we explore another method, inverse binomial sampling (IBS), which can estimate the log-likelihood of an entire data set efficiently and without bias. For each observation, IBS draws samples from the simulator model until one matches the observation. The log-likelihood estimate is then a function of the number of samples drawn. The variance of this estimator is uniformly bounded, achieves the minimum variance for an unbiased estimator, and we can compute calibrated estimates of the variance. We provide theoretical arguments in favor of IBS and an empirical assessment of the method for maximum-likelihood estimation with simulation-based models. As case studies, we take three model-fitting problems of increasing complexity from computational and cognitive neuroscience. In all problems, IBS generally produces lower error in the estimated parameters and maximum log-likelihood values than alternative sampling methods with the same average number of samples. Our results demonstrate the potential of IBS as a practical, robust, and easy to implement method for log-likelihood evaluation when exact techniques are not available.
Tasks
Published	2020-01-12
URL	https://arxiv.org/abs/2001.03985v1
PDF	https://arxiv.org/pdf/2001.03985v1.pdf
PWC	https://paperswithcode.com/paper/unbiased-and-efficient-log-likelihood
Repo	https://github.com/basvanopheusden/ibs-development
Framework	none
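
The sampling loop described in the abstract above can be sketched roughly as follows. For one observation, the code draws from a simulator until a draw matches, then turns the number of draws K into the estimate -(1 + 1/2 + ... + 1/(K-1)). That estimator form is taken from the IBS paper rather than from this summary. The names `ibs_loglik`, `bernoulli_simulator`, and `mean_ibs_estimate`, the toy Bernoulli simulator, and the `max_draws` safety cap are all assumptions made for illustration, not part of the authors' code.

```python
import math
import random

def ibs_loglik(observation, simulate, rng, max_draws=100_000):
    """Single-trial IBS estimate of log p(observation).

    `simulate(rng)` is a hypothetical simulator returning one synthetic
    observation; `max_draws` is a safety cap added for this sketch only.
    """
    estimate = 0.0
    for k in range(1, max_draws + 1):
        if simulate(rng) == observation:
            # Matched on draw k: estimate equals -(1 + 1/2 + ... + 1/(k-1)).
            return estimate
        estimate -= 1.0 / k
    raise RuntimeError("no matching sample within max_draws")

def bernoulli_simulator(p):
    """Toy simulator (assumption): returns True with probability p."""
    return lambda rng: rng.random() < p

def mean_ibs_estimate(p, num_repeats, seed=0):
    """Average many independent IBS estimates for the observation True."""
    rng = random.Random(seed)
    sim = bernoulli_simulator(p)
    total = sum(ibs_loglik(True, sim, rng) for _ in range(num_repeats))
    return total / num_repeats
```

Because the estimator is unbiased, `mean_ibs_estimate(0.3, 20000)` should land close to `math.log(0.3)`. For a full data set, the per-observation estimates would simply be summed.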

Single-exposure absorption imaging of ultracold atoms using deep learning


Title	Single-exposure absorption imaging of ultracold atoms using deep learning
Authors	Gal Ness, Anastasiya Vainbaum, Constantine Shkedrov, Yanay Florshaim, Yoav Sagi
Abstract	Absorption imaging is the most common probing technique in experiments with ultracold atoms. The standard procedure involves the division of two frames acquired at successive exposures, one with the atomic absorption signal and one without. A well-known problem is the presence of residual structured noise in the final image, due to small differences between the imaging light in the two exposures. Here we solve this problem by performing absorption imaging with only a single exposure, where instead of a second exposure the reference frame is generated by an unsupervised image-completion autoencoder neural network. The network is trained on images without absorption signal such that it can infer the noise overlaying the atomic signal based only on the information in the region encircling the signal. We demonstrate our approach on data captured with a quantum degenerate Fermi gas. The average residual noise in the resulting images is below that of the standard double-shot technique. Our method simplifies the experimental sequence, reduces the hardware requirements, and can improve the accuracy of extracted physical observables. The trained network and its generating scripts are available as an open-source repository (http://absDL.github.io/).
Tasks	Image Denoising, Physical Attribute Prediction
Published	2020-03-03
URL	https://arxiv.org/abs/2003.01643v1
PDF	https://arxiv.org/pdf/2003.01643v1.pdf
PWC	https://paperswithcode.com/paper/single-exposure-absorption-imaging-of
Repo	https://github.com/absDL/absDL.github.io
Framework	none
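
As a minimal sketch of the frame-division step mentioned in the abstract above, the code below divides a signal frame by a reference frame pixel by pixel and takes the negative logarithm to get an optical-density image. The logarithm follows the usual Beer-Lambert convention, which is an assumption here since the abstract only mentions division. In the single-exposure method the reference frame would come from the trained network instead of a second exposure. The function name `optical_density` and the `eps` floor are hypothetical.

```python
import math

def optical_density(signal_frame, reference_frame, eps=1e-12):
    """Per-pixel optical density: -ln(signal / reference).

    Frames are lists of rows of non-negative intensities. `eps` is a small
    floor (an assumption of this sketch) that avoids log(0) and division by 0.
    """
    od = []
    for sig_row, ref_row in zip(signal_frame, reference_frame):
        od.append([
            -math.log(max(s, eps) / max(r, eps))
            for s, r in zip(sig_row, ref_row)
        ])
    return od
```

A pixel where the atoms absorb half of the light, for example signal 50 against reference 100, gives an optical density of ln 2. Dark-frame subtraction and saturation corrections are left out of this sketch.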

Cross-Iteration Batch Normalization


Title	Cross-Iteration Batch Normalization
Authors	Zhuliang Yao, Yue Cao, Shuxin Zheng, Gao Huang, Stephen Lin
Abstract	A well-known issue of Batch Normalization is its significantly reduced effectiveness in the case of small mini-batch sizes. When a mini-batch contains few examples, the statistics upon which the normalization is defined cannot be reliably estimated from it during a training iteration. To address this problem, we present Cross-Iteration Batch Normalization (CBN), in which examples from multiple recent iterations are jointly utilized to enhance estimation quality. A challenge of computing statistics over multiple iterations is that the network activations from different iterations are not comparable to each other due to changes in network weights. We thus compensate for the network weight changes via a proposed technique based on Taylor polynomials, so that the statistics can be accurately estimated and batch normalization can be effectively applied. On object detection and image classification with small mini-batch sizes, CBN is found to outperform the original batch normalization and a direct calculation of statistics over previous iterations without the proposed compensation technique.
Tasks	Image Classification, Object Detection
Published	2020-02-13
URL	https://arxiv.org/abs/2002.05712v2
PDF	https://arxiv.org/pdf/2002.05712v2.pdf
PWC	https://paperswithcode.com/paper/cross-iteration-batch-normalization
Repo	https://github.com/Howal/Cross-iterationBatchNorm
Framework	pytorch

ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network


Title	ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
Authors	Yuliang Liu, Hao Chen, Chunhua Shen, Tong He, Lianwen Jin, Liangwei Wang
Abstract	Scene text detection and recognition has received increasing research attention. Existing methods can be roughly categorized into two groups: character-based and segmentation-based. These methods either are costly for character annotation or need to maintain a complex pipeline, which is often not suitable for real-time applications. Here we address the problem by proposing the Adaptive Bezier-Curve Network (ABCNet). Our contributions are three-fold: 1) For the first time, we adaptively fit arbitrarily-shaped text by a parameterized Bezier curve. 2) We design a novel BezierAlign layer for extracting accurate convolution features of a text instance with arbitrary shapes, significantly improving the precision compared with previous methods. 3) Compared with standard bounding box detection, our Bezier curve detection introduces negligible computation overhead, resulting in superiority of our method in both efficiency and accuracy. Experiments on arbitrarily-shaped benchmark datasets, namely Total-Text and CTW1500, demonstrate that ABCNet achieves state-of-the-art accuracy, meanwhile significantly improving the speed. In particular, on Total-Text, our realtime version is over 10 times faster than recent state-of-the-art methods with a competitive recognition accuracy. Code is available at https://tinyurl.com/AdelaiDet
Tasks	Scene Text Detection, Text Spotting
Published	2020-02-24
URL	https://arxiv.org/abs/2002.10200v2
PDF	https://arxiv.org/pdf/2002.10200v2.pdf
PWC	https://paperswithcode.com/paper/abcnet-real-time-scene-text-spotting-with
Repo	https://github.com/aim-uofa/AdelaiDet
Framework	pytorch
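
To make the idea of a parameterized Bezier curve concrete, here is a small sketch that evaluates a Bezier curve from its control points using the Bernstein basis. It only illustrates the curve representation, not the ABCNet detection head or the BezierAlign layer. The function names and the choice of four control points (a cubic curve) in the usage note are assumptions, since the abstract does not state the degree.

```python
from math import comb

def bezier_point(control_points, t):
    """Evaluate a Bezier curve at t in [0, 1] with the Bernstein basis."""
    n = len(control_points) - 1  # curve degree
    x = y = 0.0
    for i, (px, py) in enumerate(control_points):
        weight = comb(n, i) * (t ** i) * ((1 - t) ** (n - i))
        x += weight * px
        y += weight * py
    return x, y

def sample_curve(control_points, num=5):
    """Sample `num` evenly spaced parameter values along the curve."""
    return [bezier_point(control_points, j / (num - 1)) for j in range(num)]
```

With the hypothetical control points `[(0, 0), (1, 2), (3, 2), (4, 0)]`, the curve starts at the first point, ends at the last, and passes through `(2.0, 1.5)` at `t = 0.5`. A curved text boundary could then be described by a handful of control points instead of a dense polygon.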

Getting to 99% Accuracy in Interactive Segmentation


Title	Getting to 99% Accuracy in Interactive Segmentation
Authors	Marco Forte, Brian Price, Scott Cohen, Ning Xu, François Pitié
Abstract	Interactive object cutout tools are the cornerstone of the image editing workflow. Recent deep-learning based interactive segmentation algorithms have made significant progress in handling complex images and rough binary selections can typically be obtained with just a few clicks. Yet, deep learning techniques tend to plateau once this rough selection has been reached. In this work, we interpret this plateau as the inability of current algorithms to sufficiently leverage each user interaction and also as the limitations of current training/testing datasets. We propose a novel interactive architecture and a novel training scheme that are both tailored to better exploit the user workflow. We also show that significant improvements can be further gained by introducing a synthetic training dataset that is specifically designed for complex object boundaries. Comprehensive experiments support our approach, and our network achieves state of the art performance.
Tasks	Interactive Segmentation
Published	2020-03-17
URL	https://arxiv.org/abs/2003.07932v1
PDF	https://arxiv.org/pdf/2003.07932v1.pdf
PWC	https://paperswithcode.com/paper/getting-to-99-accuracy-in-interactive
Repo	https://github.com/MarcoForte/DeepInteractiveSegmentation
Framework	pytorch

GFTE: Graph-based Financial Table Extraction


Title	GFTE: Graph-based Financial Table Extraction
Authors	Yiren Li, Zheng Huang, Junchi Yan, Yi Zhou, Fan Ye, Xianhui Liu
Abstract	Tabular data is a crucial form of information expression, which can organize data in a standard structure for easy information retrieval and comparison. However, in the financial industry and many other fields, tables are often disclosed in unstructured digital files, e.g. Portable Document Format (PDF) and images, from which they are difficult to extract directly. In this paper, to facilitate deep learning based table extraction from unstructured digital files, we publish a standard Chinese dataset named FinTab, which contains more than 1,600 financial tables of diverse kinds and their corresponding structure representation in JSON. In addition, we propose a novel graph-based convolutional neural network model named GFTE as a baseline for future comparison. GFTE integrates image, position, and textual features for precise edge prediction and reaches overall good results.
Tasks	Information Retrieval
Published	2020-03-17
URL	https://arxiv.org/abs/2003.07560v1
PDF	https://arxiv.org/pdf/2003.07560v1.pdf
PWC	https://paperswithcode.com/paper/gfte-graph-based-financial-table-extraction
Repo	https://github.com/Irene323/GFTE
Framework	pytorch

Black-box Smoothing: A Provable Defense for Pretrained Classifiers


Title	Black-box Smoothing: A Provable Defense for Pretrained Classifiers
Authors	Hadi Salman, Mingjie Sun, Greg Yang, Ashish Kapoor, J. Zico Kolter
Abstract	We present a method for provably defending any pretrained image classifier against $\ell_p$ adversarial attacks. By prepending a custom-trained denoiser to any off-the-shelf image classifier and using randomized smoothing, we effectively create a new classifier that is guaranteed to be $\ell_p$-robust to adversarial examples, without modifying the pretrained classifier. The approach applies both to the case where we have full access to the pretrained classifier as well as the case where we only have query access. We refer to this defense as black-box smoothing, and we demonstrate its effectiveness through extensive experimentation on ImageNet and CIFAR-10. Finally, we use our method to provably defend the Azure, Google, AWS, and ClarifAI image classification APIs. Our code replicating all the experiments in the paper can be found at https://github.com/microsoft/blackbox-smoothing .
Tasks	Image Classification
Published	2020-03-04
URL	https://arxiv.org/abs/2003.01908v1
PDF	https://arxiv.org/pdf/2003.01908v1.pdf
PWC	https://paperswithcode.com/paper/black-box-smoothing-a-provable-defense-for
Repo	https://github.com/microsoft/blackbox-smoothing
Framework	pytorch
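
The prediction step of the defense described above can be sketched as a majority vote over noisy copies of the input, each passed through a denoiser before reaching the untouched classifier. This is a simplified illustration under stated assumptions: Gaussian noise is assumed (the common choice for randomized smoothing), the certification of a robustness radius is omitted, and `clamp_denoiser` and `brightness_classifier` are toy stand-ins rather than the paper's trained models.

```python
import random
from collections import Counter

def smoothed_predict(x, denoiser, classifier, sigma, num_samples, rng):
    """Majority vote of classifier(denoiser(x + noise)) over noisy copies."""
    votes = Counter()
    for _ in range(num_samples):
        noisy = [v + rng.gauss(0.0, sigma) for v in x]
        votes[classifier(denoiser(noisy))] += 1
    return votes.most_common(1)[0][0]

def clamp_denoiser(pixels):
    """Toy denoiser (assumption): clip values back into [0, 1]."""
    return [min(1.0, max(0.0, v)) for v in pixels]

def brightness_classifier(pixels):
    """Toy pretrained classifier (assumption): 1 if mean brightness > 0.5."""
    return int(sum(pixels) / len(pixels) > 0.5)
```

The classifier is only ever called as a function, which matches the query-access setting mentioned in the abstract: nothing about its weights is needed.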

I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively


Title	I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively
Authors	Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma
Abstract	The learning of hierarchical representations for image classification has experienced an impressive series of successes due in part to the availability of large-scale labeled data for training. On the other hand, the trained classifiers have traditionally been evaluated on small and fixed sets of test images, which are deemed to be extremely sparsely distributed in the space of all natural images. It is thus questionable whether recent performance improvements on the excessively re-used test sets generalize to real-world natural images with much richer content variations. Inspired by efficient stimulus selection for testing perceptual models in psychophysical and physiological studies, we present an alternative framework for comparing image classifiers, which we name the MAximum Discrepancy (MAD) competition. Rather than comparing image classifiers using fixed test images, we adaptively sample a small test set from an arbitrarily large corpus of unlabeled images so as to maximize the discrepancies between the classifiers, measured by the distance over WordNet hierarchy. Human labeling on the resulting model-dependent image sets reveals the relative performance of the competing classifiers, and provides useful insights on potential ways to improve them. We report the MAD competition results of eleven ImageNet classifiers while noting that the framework is readily extensible and cost-effective to add future classifiers into the competition. Codes can be found at https://github.com/TAMU-VITA/MAD.
Tasks	Image Classification
Published	2020-02-25
URL	https://arxiv.org/abs/2002.10648v1
PDF	https://arxiv.org/pdf/2002.10648v1.pdf
PWC	https://paperswithcode.com/paper/i-am-going-mad-maximum-discrepancy-1
Repo	https://github.com/TAMU-VITA/MAD
Framework	none
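
The adaptive sampling step can be illustrated with a short sketch that scores every image in an unlabeled pool by how far apart two classifiers' predictions are and keeps the top k. The function name `mad_select` and the pluggable `distance` argument are assumptions; in the paper the distance is measured over the WordNet hierarchy, which is replaced in the usage note by a hypothetical lookup.

```python
import heapq

def mad_select(pool, clf_a, clf_b, distance, k):
    """Return indices of the k pool items on which two classifiers disagree most.

    `distance(label_a, label_b)` is a user-supplied label distance; ties are
    broken in favor of the larger index.
    """
    scored = (
        (distance(clf_a(item), clf_b(item)), idx)
        for idx, item in enumerate(pool)
    )
    return [idx for _, idx in heapq.nlargest(k, scored)]
```

For example, with pool items `0..3`, predictions `["cat", "cat", "dog", "car"]` from one model and `["cat", "dog", "car", "car"]` from the other, and a toy distance of 0 for equal labels, 1 between two animals, and 3 between an animal and a vehicle, `mad_select(..., k=2)` returns `[2, 1]`. Only the selected images would then be sent for human labeling.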

Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent


Title	Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent
Authors	Bao Wang, Tan M. Nguyen, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher
Abstract	Stochastic gradient descent (SGD) with constant momentum and its variants such as Adam are the optimization algorithms of choice for training deep neural networks (DNNs). Since DNN training is incredibly computationally expensive, there is great interest in speeding up convergence. Nesterov accelerated gradient (NAG) improves the convergence rate of gradient descent (GD) for convex optimization using a specially designed momentum; however, it accumulates error when an inexact gradient is used (such as in SGD), slowing convergence at best and diverging at worst. In this paper, we propose Scheduled Restart SGD (SRSGD), a new NAG-style scheme for training DNNs. SRSGD replaces the constant momentum in SGD by the increasing momentum in NAG but stabilizes the iterations by resetting the momentum to zero according to a schedule. Using a variety of models and benchmarks for image classification, we demonstrate that, in training DNNs, SRSGD significantly improves convergence and generalization; for instance in training ResNet200 for ImageNet classification, SRSGD achieves an error rate of 20.93% vs. the benchmark of 22.13%. These improvements become more significant as the network grows deeper. Furthermore, on both CIFAR and ImageNet, SRSGD reaches similar or even better error rates with fewer training epochs compared to the SGD baseline. We provide code for SRSGD at https://github.com/minhtannguyen/SRSGD.
Tasks	Image Classification
Published	2020-02-24
URL	https://arxiv.org/abs/2002.10583v1
PDF	https://arxiv.org/pdf/2002.10583v1.pdf
PWC	https://paperswithcode.com/paper/scheduled-restart-momentum-for-accelerated
Repo	https://github.com/minhtannguyen/SRSGD
Framework	pytorch
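
A minimal sketch of the scheduled-restart idea on a one-dimensional problem: the momentum coefficient grows as t/(t+3), where t counts iterations since the last restart, so it drops back to zero every `restart_every` steps. The t/(t+3) schedule is a standard NAG choice and is assumed here, as are the function name, the fixed restart interval, and the use of an exact gradient in place of a stochastic mini-batch gradient.

```python
def srsgd_minimize(grad, w0, lr, restart_every, num_steps):
    """NAG-style updates whose momentum is reset to zero on a fixed schedule."""
    w = w0
    v_prev = w0
    for k in range(num_steps):
        t = k % restart_every   # iterations since the last restart
        mu = t / (t + 3)        # increasing momentum; exactly 0 at a restart
        v = w - lr * grad(w)    # plain gradient step
        w = v + mu * (v - v_prev)  # extrapolate along the last step direction
        v_prev = v
    return w
```

On the toy objective f(w) = 0.5 * w**2, whose gradient is `lambda w: w`, running `srsgd_minimize(lambda w: w, 5.0, 0.1, 20, 200)` drives `w` toward the minimizer at 0. In the paper's setting the restart schedule would be tuned per training run rather than fixed as here.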

Fine-Grained Fashion Similarity Learning by Attribute-Specific Embedding Network


Title	Fine-Grained Fashion Similarity Learning by Attribute-Specific Embedding Network
Authors	Zhe Ma, Jianfeng Dong, Yao Zhang, Zhongzi Long, Yuan He, Hui Xue, Shouling Ji
Abstract	This paper strives to learn fine-grained fashion similarity. In this similarity paradigm, one should pay more attention to the similarity in terms of a specific design/attribute among fashion items, which has potential value in many fashion-related applications such as fashion copyright protection. To this end, we propose an Attribute-Specific Embedding Network (ASEN) to jointly learn multiple attribute-specific embeddings in an end-to-end manner and thus measure the fine-grained similarity in the corresponding space. With two attention modules, i.e., Attribute-aware Spatial Attention and Attribute-aware Channel Attention, ASEN is able to locate the related regions and capture the essential patterns under the guidance of the specified attribute, thus making the learned attribute-specific embeddings better reflect the fine-grained similarity. Extensive experiments on four fashion-related datasets show the effectiveness of ASEN for fine-grained fashion similarity learning and its potential for fashion reranking.
Tasks
Published	2020-02-07
URL	https://arxiv.org/abs/2002.02814v1
PDF	https://arxiv.org/pdf/2002.02814v1.pdf
PWC	https://paperswithcode.com/paper/fine-grained-fashion-similarity-learning-by
Repo	https://github.com/Maryeon/asen
Framework	pytorch

DISIR: Deep Image Segmentation with Interactive Refinement


Title	DISIR: Deep Image Segmentation with Interactive Refinement
Authors	Gaston Lenczner, Bertrand Le Saux, Nicola Luminari, Adrien Chan Hon Tong, Guy Le Besnerais
Abstract	This paper presents an interactive approach for multi-class segmentation of aerial images. Precisely, it is based on a deep neural network which exploits both RGB images and annotations. Starting from an initial output based on the image only, our network then interactively refines this segmentation map using a concatenation of the image and user annotations. Importantly, user annotations modify the inputs of the network - not its weights - enabling a fast and smooth process. Through experiments on two public aerial datasets, we show that user annotations are extremely rewarding: each click corrects roughly 5000 pixels. We analyze the impact of different aspects of our framework such as the representation of the annotations, the volume of training data or the network architecture. Code is available at https://github.com/delair-ai/DISIR.
Tasks	Semantic Segmentation
Published	2020-03-31
URL	https://arxiv.org/abs/2003.14200v1
PDF	https://arxiv.org/pdf/2003.14200v1.pdf
PWC	https://paperswithcode.com/paper/disir-deep-image-segmentation-with
Repo	https://github.com/delair-ai/DISIR
Framework	pytorch

Robust 6D Object Pose Estimation by Learning RGB-D Features


Title	Robust 6D Object Pose Estimation by Learning RGB-D Features
Authors	Meng Tian, Liang Pan, Marcelo H Ang Jr, Gim Hee Lee
Abstract	Accurate 6D object pose estimation is fundamental to robotic manipulation and grasping. Previous methods follow a local optimization approach which minimizes the distance between closest point pairs to handle the rotation ambiguity of symmetric objects. In this work, we propose a novel discrete-continuous formulation for rotation regression to resolve this local-optimum problem. We uniformly sample rotation anchors in SO(3), and predict a constrained deviation from each anchor to the target, as well as uncertainty scores for selecting the best prediction. Additionally, the object location is detected by aggregating point-wise vectors pointing to the 3D center. Experiments on two benchmarks: LINEMOD and YCB-Video, show that the proposed method outperforms state-of-the-art approaches. Our code is available at https://github.com/mentian/object-posenet.
Tasks	6D Pose Estimation using RGB, Pose Estimation
Published	2020-02-29
URL	https://arxiv.org/abs/2003.00188v2
PDF	https://arxiv.org/pdf/2003.00188v2.pdf
PWC	https://paperswithcode.com/paper/robust-6d-object-pose-estimation-by-learning
Repo	https://github.com/mentian/object-posenet
Framework	pytorch

Graph-Bert: Only Attention is Needed for Learning Graph Representations


Title	Graph-Bert: Only Attention is Needed for Learning Graph Representations
Authors	Jiawei Zhang, Haopeng Zhang, Congying Xia, Li Sun
Abstract	The dominant graph neural networks (GNNs) over-rely on graph links, which has already led to several serious performance problems, e.g., the suspended animation problem and the over-smoothing problem. Moreover, the inherently inter-connected nature precludes parallelization within the graph, which becomes critical for large graphs, as memory constraints limit batching across the nodes. In this paper, we will introduce a new graph neural network, namely GRAPH-BERT (Graph based BERT), solely based on the attention mechanism without any graph convolution or aggregation operators. Instead of feeding GRAPH-BERT with the complete large input graph, we propose to train GRAPH-BERT with sampled linkless subgraphs within their local contexts. GRAPH-BERT can be learned effectively in a standalone mode. Meanwhile, a pre-trained GRAPH-BERT can also be transferred to other application tasks directly or with necessary fine-tuning if any supervised label information or certain application oriented objective is available. We have tested the effectiveness of GRAPH-BERT on several graph benchmark datasets. Based on the GRAPH-BERT pre-trained with the node attribute reconstruction and structure recovery tasks, we further fine-tune GRAPH-BERT on node classification and graph clustering tasks specifically. The experimental results have demonstrated that GRAPH-BERT can outperform the existing GNNs in both learning effectiveness and efficiency.
Tasks	Graph Clustering, Node Classification
Published	2020-01-15
URL	https://arxiv.org/abs/2001.05140v2
PDF	https://arxiv.org/pdf/2001.05140v2.pdf
PWC	https://paperswithcode.com/paper/graph-bert-only-attention-is-needed-for
Repo	https://github.com/jwzhanggy/graph_bert_work
Framework	none

Towards Comparability in Non-Intrusive Load Monitoring: On Data and Performance Evaluation


Title	Towards Comparability in Non-Intrusive Load Monitoring: On Data and Performance Evaluation
Authors	Christoph Klemenjak, Stephen Makonin, Wilfried Elmenreich
Abstract	Non-Intrusive Load Monitoring (NILM) comprises a set of techniques that provide insights into the energy consumption of households and industrial facilities. Latest contributions show significant improvements in terms of accuracy and generalisation abilities. Despite all progress made concerning disaggregation techniques, performance evaluation and comparability remain an open research question. The lack of standardisation and consensus on evaluation procedures makes reproducibility and comparability extremely difficult. In this paper, we draw attention to comparability in NILM with a focus on highlighting the considerable differences amongst common energy datasets used to test the performance of algorithms. We divide the discussion on comparability into data aspects and performance metrics, and take a close look at evaluation processes. Detailed information on pre-processing as well as data cleaning methods, the importance of unified performance reporting, and the need for complexity measures in load disaggregation are found to be the most urgent issues in NILM-related research. In addition, our evaluation suggests that datasets should be chosen carefully. We conclude by formulating suggestions for future work to enhance comparability.
Tasks	Non-Intrusive Load Monitoring
Published	2020-01-20
URL	https://arxiv.org/abs/2001.07708v1
PDF	https://arxiv.org/pdf/2001.07708v1.pdf
PWC	https://paperswithcode.com/paper/towards-comparability-in-non-intrusive-load
Repo	https://github.com/klemenjak/comparability
Framework	none