April 3, 2020

3238 words 16 mins read

Paper Group AWR 71

Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision

Title Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision
Authors Denis Gudovskiy, Alec Hodgkinson, Takuya Yamaguchi, Sotaro Tsukizawa
Abstract Active learning (AL) aims to minimize the labeling effort for data-demanding deep neural networks (DNNs) by selecting the most representative data points for annotation. However, currently used methods are ill-equipped to deal with biased data. The main motivation of this paper is to consider a realistic setting for pool-based semi-supervised AL, where the unlabeled collection of train data is biased. We theoretically derive an optimal acquisition function for AL in this setting. It can be formulated as distribution shift minimization between unlabeled train data and a weakly-labeled validation dataset. To implement such an acquisition function, we propose a low-complexity method for feature density matching using a self-supervised Fisher kernel (FK) as well as several novel pseudo-label estimators. Our FK-based method outperforms state-of-the-art methods on MNIST, SVHN, and ImageNet classification while requiring only 1/10th of the processing. The conducted experiments show at least a 40% drop in labeling effort for biased, class-imbalanced data compared to existing methods.
Tasks Active Learning
Published 2020-03-01
URL https://arxiv.org/abs/2003.00393v1
PDF https://arxiv.org/pdf/2003.00393v1.pdf
PWC https://paperswithcode.com/paper/deep-active-learning-for-biased-datasets-via
Repo https://github.com/gudovskiy/al-fk-self-supervision
Framework pytorch
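The acquisition rule above reduces to matching the feature density of the selected unlabeled points to that of the validation set. As a rough, hypothetical illustration of that idea (not the authors' implementation, which relies on Fisher-kernel gradient embeddings and pseudo-label estimators; see the repo above), the greedy selection below picks points whose embeddings move the selected set's mean closest to the validation mean:

```python
import torch

def greedy_density_match(train_emb, val_emb, budget):
    """Greedy sketch of density matching for active learning.

    train_emb: (N, D) embeddings of the unlabeled pool (e.g. Fisher-kernel features)
    val_emb:   (M, D) embeddings of the weakly-labeled validation set
    Returns the indices of `budget` points whose inclusion keeps the selected
    set's mean embedding as close as possible to the validation mean.
    """
    target = val_emb.mean(dim=0)
    selected, running_sum = [], torch.zeros_like(target)
    remaining = list(range(train_emb.shape[0]))
    for _ in range(budget):
        # mean embedding of the selection if each remaining candidate were added
        candidate_means = (running_sum + train_emb[remaining]) / (len(selected) + 1)
        best = remaining[int(torch.norm(candidate_means - target, dim=1).argmin())]
        selected.append(best)
        running_sum += train_emb[best]
        remaining.remove(best)
    return selected
```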

Unbiased and Efficient Log-Likelihood Estimation with Inverse Binomial Sampling

Title Unbiased and Efficient Log-Likelihood Estimation with Inverse Binomial Sampling
Authors Bas van Opheusden, Luigi Acerbi, Wei Ji Ma
Abstract The fate of scientific hypotheses often relies on the ability of a computational model to explain the data, quantified in modern statistical approaches by the likelihood function. The log-likelihood is the key element for parameter estimation and model evaluation. However, the log-likelihood of complex models in fields such as computational biology and neuroscience is often intractable to compute analytically or numerically. In those cases, researchers can often only estimate the log-likelihood by comparing observed data with synthetic observations generated by model simulations. Standard techniques to approximate the likelihood via simulation either use summary statistics of the data or are at risk of producing severe biases in the estimate. Here, we explore another method, inverse binomial sampling (IBS), which can estimate the log-likelihood of an entire data set efficiently and without bias. For each observation, IBS draws samples from the simulator model until one matches the observation. The log-likelihood estimate is then a function of the number of samples drawn. The variance of this estimator is uniformly bounded and achieves the minimum possible for an unbiased estimator, and we can compute calibrated estimates of the variance. We provide theoretical arguments in favor of IBS and an empirical assessment of the method for maximum-likelihood estimation with simulation-based models. As case studies, we take three model-fitting problems of increasing complexity from computational and cognitive neuroscience. In all problems, IBS generally produces lower error in the estimated parameters and maximum log-likelihood values than alternative sampling methods with the same average number of samples. Our results demonstrate the potential of IBS as a practical, robust, and easy-to-implement method for log-likelihood evaluation when exact techniques are not available.
Tasks
Published 2020-01-12
URL https://arxiv.org/abs/2001.03985v1
PDF https://arxiv.org/pdf/2001.03985v1.pdf
PWC https://paperswithcode.com/paper/unbiased-and-efficient-log-likelihood
Repo https://github.com/basvanopheusden/ibs-development
Framework none
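For a single observation, IBS keeps simulating from the model until a synthetic outcome matches the observed one; if the match happens on the K-th draw, the unbiased estimate of that observation's log-likelihood is -(1 + 1/2 + ... + 1/(K-1)), i.e. zero when K = 1. Below is a minimal sketch of that estimator with a placeholder simulator interface, not the repository's actual API:

```python
import random

def ibs_loglik(trials, simulate, max_draws=10_000):
    """Inverse binomial sampling estimate of a data set's log-likelihood.

    trials:   iterable of (stimulus, response) pairs
    simulate: callable simulate(stimulus) -> synthetic response (placeholder interface)
    """
    total = 0.0
    for stimulus, response in trials:
        draws = 1
        while simulate(stimulus) != response:
            draws += 1
            if draws > max_draws:      # practical cap; the exact estimator has none
                break
        # unbiased per-trial estimate: -sum_{k=1}^{K-1} 1/k (an empty sum when K = 1)
        total += -sum(1.0 / k for k in range(1, draws))
    return total

# Toy usage: a Bernoulli model with p("heads") = 0.3; the true log-likelihood of one
# observed "heads" is log(0.3) ≈ -1.20, and the IBS estimate is unbiased around it.
estimate = ibs_loglik([(None, "heads")],
                      lambda stim: "heads" if random.random() < 0.3 else "tails")
print(estimate)
```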

Single-exposure absorption imaging of ultracold atoms using deep learning

Title Single-exposure absorption imaging of ultracold atoms using deep learning
Authors Gal Ness, Anastasiya Vainbaum, Constantine Shkedrov, Yanay Florshaim, Yoav Sagi
Abstract Absorption imaging is the most common probing technique in experiments with ultracold atoms. The standard procedure involves the division of two frames acquired at successive exposures, one with the atomic absorption signal and one without. A well-known problem is the presence of residual structured noise in the final image, due to small differences between the imaging light in the two exposures. Here we solve this problem by performing absorption imaging with only a single exposure, where instead of a second exposure the reference frame is generated by an unsupervised image-completion autoencoder neural network. The network is trained on images without absorption signal such that it can infer the noise overlaying the atomic signal based only on the information in the region encircling the signal. We demonstrate our approach on data captured with a quantum degenerate Fermi gas. The average residual noise in the resulting images is below that of the standard double-shot technique. Our method simplifies the experimental sequence, reduces the hardware requirements, and can improve the accuracy of extracted physical observables. The trained network and its generating scripts are available as an open-source repository (http://absDL.github.io/).
Tasks Image Denoising, Physical Attribute Prediction
Published 2020-03-03
URL https://arxiv.org/abs/2003.01643v1
PDF https://arxiv.org/pdf/2003.01643v1.pdf
PWC https://paperswithcode.com/paper/single-exposure-absorption-imaging-of
Repo https://github.com/absDL/absDL.github.io
Framework none
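The training objective behind the single-exposure scheme is image completion: the network sees only the pixels surrounding the region of interest and must reconstruct the reference frame inside it, using frames that contain no atomic signal. A hedged PyTorch sketch of that loss; `net`, `frame`, and `roi_mask` are illustrative placeholders rather than the repository's actual interfaces:

```python
import torch

def masked_completion_loss(net, frame, roi_mask):
    """Sketch of the image-completion objective used to learn the reference frame.

    frame:    (B, 1, H, W) imaging frames that contain no atomic absorption signal
    roi_mask: (B, 1, H, W) binary mask, 1 inside the region that would hold the atoms
    """
    masked_input = frame * (1.0 - roi_mask)   # the network never sees the ROI pixels
    prediction = net(masked_input)
    # penalize reconstruction error only inside the hidden region
    return (((prediction - frame) ** 2) * roi_mask).sum() / roi_mask.sum()
```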

Cross-Iteration Batch Normalization

Title Cross-Iteration Batch Normalization
Authors Zhuliang Yao, Yue Cao, Shuxin Zheng, Gao Huang, Stephen Lin
Abstract A well-known issue of Batch Normalization is its significantly reduced effectiveness in the case of small mini-batch sizes. When a mini-batch contains few examples, the statistics upon which the normalization is defined cannot be reliably estimated from it during a training iteration. To address this problem, we present Cross-Iteration Batch Normalization (CBN), in which examples from multiple recent iterations are jointly utilized to enhance estimation quality. A challenge of computing statistics over multiple iterations is that the network activations from different iterations are not comparable to each other due to changes in network weights. We thus compensate for the network weight changes via a proposed technique based on Taylor polynomials, so that the statistics can be accurately estimated and batch normalization can be effectively applied. On object detection and image classification with small mini-batch sizes, CBN is found to outperform the original batch normalization and a direct calculation of statistics over previous iterations without the proposed compensation technique.
Tasks Image Classification, Object Detection
Published 2020-02-13
URL https://arxiv.org/abs/2002.05712v2
PDF https://arxiv.org/pdf/2002.05712v2.pdf
PWC https://paperswithcode.com/paper/cross-iteration-batch-normalization
Repo https://github.com/Howal/Cross-iterationBatchNorm
Framework pytorch
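The compensation step is a first-order Taylor expansion that maps statistics recorded under old weights to the current weights, mu(theta_t) ≈ mu(theta_{t-tau}) + dmu/dtheta · (theta_t - theta_{t-tau}), after which the compensated statistics can be averaged across iterations. A simplified, hypothetical sketch of that idea (the paper restricts the derivative to the layer's own weights; shapes here are illustrative):

```python
import torch

def compensated_mean(mu_prev, grad_mu_prev, theta_prev, theta_now):
    """First-order compensation of a batch mean recorded under older weights.

    mu_prev:      (C,)    channel means recorded at a past iteration
    grad_mu_prev: (C, P)  d(mu)/d(theta) at that iteration, weights flattened to length P
    theta_prev, theta_now: (P,) flattened layer weights then and now
    """
    return mu_prev + grad_mu_prev @ (theta_now - theta_prev)

def cross_iteration_mean(history, theta_now):
    """Average the compensated means over a window of recent iterations.

    history: list of (mu, grad_mu, theta) tuples, the current iteration included
    """
    compensated = [compensated_mean(mu, g, th, theta_now) for mu, g, th in history]
    return torch.stack(compensated).mean(dim=0)
```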

ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network

Title ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
Authors Yuliang Liu, Hao Chen, Chunhua Shen, Tong He, Lianwen Jin, Liangwei Wang
Abstract Scene text detection and recognition have received increasing research attention. Existing methods can be roughly categorized into two groups: character-based and segmentation-based. These methods either require costly character-level annotation or need to maintain a complex pipeline, which is often not suitable for real-time applications. Here we address the problem by proposing the Adaptive Bezier-Curve Network (ABCNet). Our contributions are three-fold: 1) For the first time, we adaptively fit arbitrarily-shaped text with a parameterized Bezier curve. 2) We design a novel BezierAlign layer for extracting accurate convolution features of text instances with arbitrary shapes, significantly improving precision compared with previous methods. 3) Compared with standard bounding-box detection, our Bezier-curve detection introduces negligible computation overhead, making our method superior in both efficiency and accuracy. Experiments on the arbitrarily-shaped text benchmarks Total-Text and CTW1500 demonstrate that ABCNet achieves state-of-the-art accuracy while significantly improving speed. In particular, on Total-Text, our real-time version is over 10 times faster than recent state-of-the-art methods with competitive recognition accuracy. Code is available at https://tinyurl.com/AdelaiDet
Tasks Scene Text Detection, Text Spotting
Published 2020-02-24
URL https://arxiv.org/abs/2002.10200v2
PDF https://arxiv.org/pdf/2002.10200v2.pdf
PWC https://paperswithcode.com/paper/abcnet-real-time-scene-text-spotting-with
Repo https://github.com/aim-uofa/AdelaiDet
Framework pytorch
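Each text boundary in ABCNet is described by cubic Bezier curves (one for the top edge and one for the bottom), so an arbitrarily curved text box reduces to eight 2-D control points. The snippet below is only the standard Bernstein-polynomial evaluation of one such curve, not the BezierAlign layer itself:

```python
import numpy as np

def cubic_bezier(control_points, num_samples=20):
    """Sample points along a cubic Bezier curve given its four 2-D control points."""
    p = np.asarray(control_points, dtype=float)        # shape (4, 2)
    t = np.linspace(0.0, 1.0, num_samples)[:, None]    # shape (num_samples, 1)
    return (((1 - t) ** 3) * p[0] + 3 * ((1 - t) ** 2) * t * p[1]
            + 3 * (1 - t) * t ** 2 * p[2] + t ** 3 * p[3])   # shape (num_samples, 2)

# Example: a gently arched top edge of a curved text instance
top_edge = cubic_bezier([(0, 0), (30, -10), (70, -10), (100, 0)])
```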

Getting to 99% Accuracy in Interactive Segmentation

Title Getting to 99% Accuracy in Interactive Segmentation
Authors Marco Forte, Brian Price, Scott Cohen, Ning Xu, François Pitié
Abstract Interactive object cutout tools are the cornerstone of the image editing workflow. Recent deep-learning-based interactive segmentation algorithms have made significant progress in handling complex images, and rough binary selections can typically be obtained with just a few clicks. Yet, deep learning techniques tend to plateau once this rough selection has been reached. In this work, we interpret this plateau as the inability of current algorithms to sufficiently leverage each user interaction, and also as a limitation of current training/testing datasets. We propose a novel interactive architecture and a novel training scheme that are both tailored to better exploit the user workflow. We also show that significant improvements can be further gained by introducing a synthetic training dataset that is specifically designed for complex object boundaries. Comprehensive experiments support our approach, and our network achieves state-of-the-art performance.
Tasks Interactive Segmentation
Published 2020-03-17
URL https://arxiv.org/abs/2003.07932v1
PDF https://arxiv.org/pdf/2003.07932v1.pdf
PWC https://paperswithcode.com/paper/getting-to-99-accuracy-in-interactive
Repo https://github.com/MarcoForte/DeepInteractiveSegmentation
Framework pytorch

GFTE: Graph-based Financial Table Extraction

Title GFTE: Graph-based Financial Table Extraction
Authors Yiren Li, Zheng Huang, Junchi Yan, Yi Zhou, Fan Ye, Xianhui Liu
Abstract Tabular data is a crucial form of information expression, as it organizes data in a standard structure for easy information retrieval and comparison. However, in the financial industry and many other fields, tables are often disclosed in unstructured digital files, e.g. Portable Document Format (PDF) and images, from which they are difficult to extract directly. In this paper, to facilitate deep-learning-based table extraction from unstructured digital files, we publish a standard Chinese dataset named FinTab, which contains more than 1,600 financial tables of diverse kinds and their corresponding structure representation in JSON. In addition, we propose a novel graph-based convolutional neural network model named GFTE as a baseline for future comparison. GFTE integrates image, position, and textual features for precise edge prediction and achieves good overall results.
Tasks Information Retrieval
Published 2020-03-17
URL https://arxiv.org/abs/2003.07560v1
PDF https://arxiv.org/pdf/2003.07560v1.pdf
PWC https://paperswithcode.com/paper/gfte-graph-based-financial-table-extraction
Repo https://github.com/Irene323/GFTE
Framework pytorch
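GFTE casts table-structure recovery as edge prediction between cell nodes whose features fuse image, position, and text cues. As a toy illustration only (not the published architecture), a pairwise classifier over two fused node features might look like this:

```python
import torch
import torch.nn as nn

class EdgeClassifier(nn.Module):
    """Toy pairwise classifier: do two table cells stand in a row/column relation?"""

    def __init__(self, feat_dim, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, 2))

    def forward(self, feat_i, feat_j):
        # feat_i, feat_j: (B, feat_dim) fused image + position + text features of two cells
        return self.mlp(torch.cat([feat_i, feat_j], dim=-1))
```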

Black-box Smoothing: A Provable Defense for Pretrained Classifiers

Title Black-box Smoothing: A Provable Defense for Pretrained Classifiers
Authors Hadi Salman, Mingjie Sun, Greg Yang, Ashish Kapoor, J. Zico Kolter
Abstract We present a method for provably defending any pretrained image classifier against $\ell_p$ adversarial attacks. By prepending a custom-trained denoiser to any off-the-shelf image classifier and using randomized smoothing, we effectively create a new classifier that is guaranteed to be $\ell_p$-robust to adversarial examples, without modifying the pretrained classifier. The approach applies both to the case where we have full access to the pretrained classifier as well as the case where we only have query access. We refer to this defense as black-box smoothing, and we demonstrate its effectiveness through extensive experimentation on ImageNet and CIFAR-10. Finally, we use our method to provably defend the Azure, Google, AWS, and ClarifAI image classification APIs. Our code replicating all the experiments in the paper can be found at https://github.com/microsoft/blackbox-smoothing .
Tasks Image Classification
Published 2020-03-04
URL https://arxiv.org/abs/2003.01908v1
PDF https://arxiv.org/pdf/2003.01908v1.pdf
PWC https://paperswithcode.com/paper/black-box-smoothing-a-provable-defense-for
Repo https://github.com/microsoft/blackbox-smoothing
Framework pytorch
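Prediction with the smoothed classifier is a majority vote of classifier(denoiser(x + noise)) over Gaussian perturbations of the input. The sketch below shows only that vote; the certification step (statistical bounds on the vote counts) is omitted, and all names are placeholders:

```python
import torch

@torch.no_grad()
def smoothed_predict(x, denoiser, classifier, num_classes, sigma=0.25, n=100):
    """Majority-vote prediction of the denoiser + classifier pipeline under Gaussian noise.

    x: (1, C, H, W) input image; denoiser and classifier are arbitrary torch modules.
    """
    votes = torch.zeros(num_classes)
    for _ in range(n):
        noisy = x + sigma * torch.randn_like(x)
        label = classifier(denoiser(noisy)).argmax(dim=-1)
        votes[int(label)] += 1
    return int(votes.argmax())
```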

I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively

Title I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively
Authors Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma
Abstract The learning of hierarchical representations for image classification has experienced an impressive series of successes due in part to the availability of large-scale labeled data for training. On the other hand, the trained classifiers have traditionally been evaluated on small and fixed sets of test images, which are deemed to be extremely sparsely distributed in the space of all natural images. It is thus questionable whether recent performance improvements on the excessively re-used test sets generalize to real-world natural images with much richer content variations. Inspired by efficient stimulus selection for testing perceptual models in psychophysical and physiological studies, we present an alternative framework for comparing image classifiers, which we name the MAximum Discrepancy (MAD) competition. Rather than comparing image classifiers using fixed test images, we adaptively sample a small test set from an arbitrarily large corpus of unlabeled images so as to maximize the discrepancies between the classifiers, measured by the distance over the WordNet hierarchy. Human labeling on the resulting model-dependent image sets reveals the relative performance of the competing classifiers and provides useful insights on potential ways to improve them. We report the MAD competition results of eleven ImageNet classifiers while noting that the framework is readily extensible, making it cost-effective to add future classifiers to the competition. Code can be found at https://github.com/TAMU-VITA/MAD.
Tasks Image Classification
Published 2020-02-25
URL https://arxiv.org/abs/2002.10648v1
PDF https://arxiv.org/pdf/2002.10648v1.pdf
PWC https://paperswithcode.com/paper/i-am-going-mad-maximum-discrepancy-1
Repo https://github.com/TAMU-VITA/MAD
Framework none
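The MAD selection rule keeps the images on which two classifiers disagree most, with disagreement measured by distance over the WordNet hierarchy. A hypothetical sketch using NLTK's WordNet interface; mapping each classifier's output to a WordNet synset name is assumed to be handled elsewhere (for ImageNet this comes from the class synset IDs):

```python
from nltk.corpus import wordnet as wn  # requires the WordNet corpus: nltk.download("wordnet")

def wordnet_discrepancy(synset_a, synset_b):
    """Dissimilarity between two predicted classes via WordNet path similarity."""
    return 1.0 - wn.synset(synset_a).path_similarity(wn.synset(synset_b))

def mad_select(images, classifier_a, classifier_b, k):
    """Keep the k images that maximize the discrepancy between two classifiers."""
    scored = []
    for img in images:
        name_a, name_b = classifier_a(img), classifier_b(img)   # each returns a synset name
        scored.append((wordnet_discrepancy(name_a, name_b), img))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [img for _, img in scored[:k]]

# Example: discrepancy between two related cat synsets
print(wordnet_discrepancy("tabby.n.01", "tiger_cat.n.01"))
```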

Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent

Title Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent
Authors Bao Wang, Tan M. Nguyen, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher
Abstract Stochastic gradient descent (SGD) with constant momentum and its variants such as Adam are the optimization algorithms of choice for training deep neural networks (DNNs). Since DNN training is incredibly computationally expensive, there is great interest in speeding up convergence. Nesterov accelerated gradient (NAG) improves the convergence rate of gradient descent (GD) for convex optimization using a specially designed momentum; however, it accumulates error when an inexact gradient is used (such as in SGD), slowing convergence at best and diverging at worst. In this paper, we propose Scheduled Restart SGD (SRSGD), a new NAG-style scheme for training DNNs. SRSGD replaces the constant momentum in SGD by the increasing momentum in NAG but stabilizes the iterations by resetting the momentum to zero according to a schedule. Using a variety of models and benchmarks for image classification, we demonstrate that, in training DNNs, SRSGD significantly improves convergence and generalization; for instance in training ResNet200 for ImageNet classification, SRSGD achieves an error rate of 20.93% vs. the benchmark of 22.13%. These improvements become more significant as the network grows deeper. Furthermore, on both CIFAR and ImageNet, SRSGD reaches similar or even better error rates with fewer training epochs compared to the SGD baseline. We provide code for SRSGD at https://github.com/minhtannguyen/SRSGD.
Tasks Image Classification
Published 2020-02-24
URL https://arxiv.org/abs/2002.10583v1
PDF https://arxiv.org/pdf/2002.10583v1.pdf
PWC https://paperswithcode.com/paper/scheduled-restart-momentum-for-accelerated
Repo https://github.com/minhtannguyen/SRSGD
Framework pytorch
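SRSGD keeps the increasing Nesterov-style momentum but zeroes it on a schedule. A minimal sketch of one update written out explicitly (the repository packages this as a drop-in torch.optim optimizer; the k/(k+3) momentum weight is the standard NAG choice and the fixed restart frequency here is a simplification of the paper's schedule):

```python
import torch

@torch.no_grad()
def srsgd_step(params, prev_iterates, lr, iteration, restart_freq):
    """One NAG-style update with a scheduled momentum restart (sketch).

    params:        list of parameter tensors, with .grad already populated by backward()
    prev_iterates: list of tensors holding the previous gradient-step iterate x_{k-1}
    """
    k = iteration % restart_freq      # momentum resets to zero every restart_freq steps
    momentum = k / (k + 3.0)          # increasing Nesterov-style momentum weight
    for p, x_prev in zip(params, prev_iterates):
        x_new = p - lr * p.grad                          # gradient step from the lookahead point
        p.copy_(x_new + momentum * (x_new - x_prev))     # next lookahead point y_{k+1}
        x_prev.copy_(x_new)
```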

Fine-Grained Fashion Similarity Learning by Attribute-Specific Embedding Network

Title Fine-Grained Fashion Similarity Learning by Attribute-Specific Embedding Network
Authors Zhe Ma, Jianfeng Dong, Yao Zhang, Zhongzi Long, Yuan He, Hui Xue, Shouling Ji
Abstract This paper strives to learn fine-grained fashion similarity. In this similarity paradigm, one should pay more attention to similarity in terms of a specific design/attribute among fashion items, which has potential value in many fashion-related applications such as fashion copyright protection. To this end, we propose an Attribute-Specific Embedding Network (ASEN) that jointly learns multiple attribute-specific embeddings in an end-to-end manner and thus measures fine-grained similarity in the corresponding space. With two attention modules, i.e., Attribute-aware Spatial Attention and Attribute-aware Channel Attention, ASEN is able to locate the related regions and capture the essential patterns under the guidance of the specified attribute, thus making the learned attribute-specific embeddings better reflect the fine-grained similarity. Extensive experiments on four fashion-related datasets show the effectiveness of ASEN for fine-grained fashion similarity learning and its potential for fashion reranking.
Tasks
Published 2020-02-07
URL https://arxiv.org/abs/2002.02814v1
PDF https://arxiv.org/pdf/2002.02814v1.pdf
PWC https://paperswithcode.com/paper/fine-grained-fashion-similarity-learning-by
Repo https://github.com/Maryeon/asen
Framework pytorch
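Each attribute gets its own embedding space because the attention modules are conditioned on the attribute being queried. As a toy illustration (not the authors' module), an attribute-conditioned channel gate could look like this:

```python
import torch
import torch.nn as nn

class AttributeChannelGate(nn.Module):
    """Toy sketch: reweight pooled visual features conditioned on an attribute embedding."""

    def __init__(self, num_attributes, channels, attr_dim=64):
        super().__init__()
        self.attr_embed = nn.Embedding(num_attributes, attr_dim)
        self.gate = nn.Sequential(nn.Linear(attr_dim + channels, channels), nn.Sigmoid())

    def forward(self, feat, attr_id):
        # feat: (B, C) pooled visual features; attr_id: (B,) indices of the queried attribute
        gate = self.gate(torch.cat([feat, self.attr_embed(attr_id)], dim=-1))
        return feat * gate
```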

DISIR: Deep Image Segmentation with Interactive Refinement

Title DISIR: Deep Image Segmentation with Interactive Refinement
Authors Gaston Lenczner, Bertrand Le Saux, Nicola Luminari, Adrien Chan Hon Tong, Guy Le Besnerais
Abstract This paper presents an interactive approach for multi-class segmentation of aerial images. Precisely, it is based on a deep neural network which exploits both RGB images and annotations. Starting from an initial output based on the image only, our network then interactively refines this segmentation map using a concatenation of the image and user annotations. Importantly, user annotations modify the inputs of the network - not its weights - enabling a fast and smooth process. Through experiments on two public aerial datasets, we show that user annotations are extremely rewarding: each click corrects roughly 5000 pixels. We analyze the impact of different aspects of our framework such as the representation of the annotations, the volume of training data or the network architecture. Code is available at https://github.com/delair-ai/DISIR.
Tasks Semantic Segmentation
Published 2020-03-31
URL https://arxiv.org/abs/2003.14200v1
PDF https://arxiv.org/pdf/2003.14200v1.pdf
PWC https://paperswithcode.com/paper/disir-deep-image-segmentation-with
Repo https://github.com/delair-ai/DISIR
Framework pytorch
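Since user annotations modify the network's inputs rather than its weights, each refinement is just another forward pass over the image stacked with sparse click maps. A minimal sketch of how such an input could be assembled (shapes and the click encoding are illustrative choices, not necessarily the paper's):

```python
import torch

def build_refinement_input(rgb, clicks, num_classes):
    """Stack an RGB image with one sparse annotation map per class.

    rgb:    (3, H, W) image tensor
    clicks: list of (row, col, class_id) user corrections
    Returns a (3 + num_classes, H, W) tensor fed to the refinement network.
    """
    _, height, width = rgb.shape
    annotation_maps = torch.zeros(num_classes, height, width)
    for row, col, class_id in clicks:
        annotation_maps[class_id, row, col] = 1.0
    return torch.cat([rgb, annotation_maps], dim=0)
```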

Robust 6D Object Pose Estimation by Learning RGB-D Features

Title Robust 6D Object Pose Estimation by Learning RGB-D Features
Authors Meng Tian, Liang Pan, Marcelo H Ang Jr, Gim Hee Lee
Abstract Accurate 6D object pose estimation is fundamental to robotic manipulation and grasping. Previous methods follow a local optimization approach which minimizes the distance between closest point pairs to handle the rotation ambiguity of symmetric objects. In this work, we propose a novel discrete-continuous formulation for rotation regression to resolve this local-optimum problem. We uniformly sample rotation anchors in SO(3), and predict a constrained deviation from each anchor to the target, as well as uncertainty scores for selecting the best prediction. Additionally, the object location is detected by aggregating point-wise vectors pointing to the 3D center. Experiments on two benchmarks: LINEMOD and YCB-Video, show that the proposed method outperforms state-of-the-art approaches. Our code is available at https://github.com/mentian/object-posenet.
Tasks 6D Pose Estimation using RGB, Pose Estimation
Published 2020-02-29
URL https://arxiv.org/abs/2003.00188v2
PDF https://arxiv.org/pdf/2003.00188v2.pdf
PWC https://paperswithcode.com/paper/robust-6d-object-pose-estimation-by-learning
Repo https://github.com/mentian/object-posenet
Framework pytorch
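Rotation is handled in a discrete-continuous way: the network scores a fixed set of rotation anchors and regresses a constrained correction from each, and the final rotation combines the best-scoring anchor with its correction. A hedged sketch of that selection step (the composition order of correction and anchor is an assumption here):

```python
import torch

def select_rotation(anchors, corrections, scores):
    """Pick the final rotation from per-anchor predictions.

    anchors:     (N, 3, 3) rotation matrices uniformly sampled in SO(3)
    corrections: (N, 3, 3) predicted constrained deviations, one per anchor
    scores:      (N,)      predicted confidence for each anchor
    """
    best = int(scores.argmax())
    return corrections[best] @ anchors[best]   # assumed order: correction applied to anchor
```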

Graph-Bert: Only Attention is Needed for Learning Graph Representations

Title Graph-Bert: Only Attention is Needed for Learning Graph Representations
Authors Jiawei Zhang, Haopeng Zhang, Congying Xia, Li Sun
Abstract The dominant graph neural networks (GNNs) rely heavily on graph links, which has already led to several serious performance problems, e.g., the suspended-animation problem and the over-smoothing problem. Moreover, the inherently interconnected nature of graphs precludes parallelization within a graph, which becomes critical for large graphs, as memory constraints limit batching across the nodes. In this paper, we introduce a new graph neural network, GRAPH-BERT (Graph-based BERT), based solely on the attention mechanism without any graph convolution or aggregation operators. Instead of feeding GRAPH-BERT with the complete large input graph, we propose to train GRAPH-BERT with sampled linkless subgraphs within their local contexts. GRAPH-BERT can be learned effectively in a standalone mode. Meanwhile, a pre-trained GRAPH-BERT can also be transferred to other application tasks directly or with necessary fine-tuning if supervised label information or a certain application-oriented objective is available. We have tested the effectiveness of GRAPH-BERT on several graph benchmark datasets. Based on GRAPH-BERT pre-trained with node-attribute reconstruction and structure recovery tasks, we further fine-tune it on node classification and graph clustering. The experimental results demonstrate that GRAPH-BERT can outperform existing GNNs in both learning effectiveness and efficiency.
Tasks Graph Clustering, Node Classification
Published 2020-01-15
URL https://arxiv.org/abs/2001.05140v2
PDF https://arxiv.org/pdf/2001.05140v2.pdf
PWC https://paperswithcode.com/paper/graph-bert-only-attention-is-needed-for
Repo https://github.com/jwzhanggy/graph_bert_work
Framework none
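Instead of the whole graph, GRAPH-BERT consumes small linkless subgraphs: for each target node, a fixed-size context of its most "intimate" nodes (scored, e.g., by a personalized-PageRank-style intimacy matrix) is sampled and fed to the attention-only model with no edges attached. A simplified sketch of that sampling, assuming the intimacy matrix is precomputed:

```python
import numpy as np

def sample_linkless_contexts(intimacy, context_size):
    """For every node, keep its top-k most intimate other nodes as a linkless context.

    intimacy: (N, N) score matrix, e.g. personalized-PageRank scores (precomputed)
    Returns a dict mapping node index -> list of context node indices (no edges kept).
    """
    contexts = {}
    for i in range(intimacy.shape[0]):
        ranked = np.argsort(-intimacy[i])
        contexts[i] = [int(j) for j in ranked if j != i][:context_size]
    return contexts
```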

Towards Comparability in Non-Intrusive Load Monitoring: On Data and Performance Evaluation

Title Towards Comparability in Non-Intrusive Load Monitoring: On Data and Performance Evaluation
Authors Christoph Klemenjak, Stephen Makonin, Wilfried Elmenreich
Abstract Non-Intrusive Load Monitoring (NILM) comprises a set of techniques that provide insights into the energy consumption of households and industrial facilities. The latest contributions show significant improvements in terms of accuracy and generalisation abilities. Despite all the progress made on disaggregation techniques, performance evaluation and comparability remain open research questions. The lack of standardisation and consensus on evaluation procedures makes reproducibility and comparability extremely difficult. In this paper, we draw attention to comparability in NILM with a focus on highlighting the considerable differences amongst common energy datasets used to test the performance of algorithms. We divide the discussion of comparability into data aspects and performance metrics, and take a close look at evaluation processes. Detailed information on pre-processing and data-cleaning methods, the importance of unified performance reporting, and the need for complexity measures in load disaggregation are found to be the most urgent issues in NILM-related research. In addition, our evaluation suggests that datasets should be chosen carefully. We conclude by formulating suggestions for future work to enhance comparability.
Tasks Non-Intrusive Load Monitoring
Published 2020-01-20
URL https://arxiv.org/abs/2001.07708v1
PDF https://arxiv.org/pdf/2001.07708v1.pdf
PWC https://paperswithcode.com/paper/towards-comparability-in-non-intrusive-load
Repo https://github.com/klemenjak/comparability
Framework none