Paper Group AWR 71
Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision
Title | Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision |
Authors | Denis Gudovskiy, Alec Hodgkinson, Takuya Yamaguchi, Sotaro Tsukizawa |
Abstract | Active learning (AL) aims to minimize labeling efforts for data-demanding deep neural networks (DNNs) by selecting the most representative data points for annotation. However, currently used methods are ill-equipped to deal with biased data. The main motivation of this paper is to consider a realistic setting for pool-based semi-supervised AL, where the unlabeled collection of training data is biased. We theoretically derive an optimal acquisition function for AL in this setting. It can be formulated as distribution shift minimization between the unlabeled training data and a weakly-labeled validation dataset. To implement such an acquisition function, we propose a low-complexity method for feature density matching using a self-supervised Fisher kernel (FK) as well as several novel pseudo-label estimators. Our FK-based method outperforms state-of-the-art methods on MNIST, SVHN, and ImageNet classification while requiring only 1/10th of the processing. The conducted experiments show at least a 40% drop in labeling efforts for the biased class-imbalanced data compared to existing methods. |
Tasks | Active Learning |
Published | 2020-03-01 |
URL | https://arxiv.org/abs/2003.00393v1 |
PDF | https://arxiv.org/pdf/2003.00393v1.pdf |
PWC | https://paperswithcode.com/paper/deep-active-learning-for-biased-datasets-via |
Repo | https://github.com/gudovskiy/al-fk-self-supervision |
Framework | pytorch |
Unbiased and Efficient Log-Likelihood Estimation with Inverse Binomial Sampling
Title | Unbiased and Efficient Log-Likelihood Estimation with Inverse Binomial Sampling |
Authors | Bas van Opheusden, Luigi Acerbi, Wei Ji Ma |
Abstract | The fate of scientific hypotheses often relies on the ability of a computational model to explain the data, quantified in modern statistical approaches by the likelihood function. The log-likelihood is the key element for parameter estimation and model evaluation. However, the log-likelihood of complex models in fields such as computational biology and neuroscience is often intractable to compute analytically or numerically. In those cases, researchers can often only estimate the log-likelihood by comparing observed data with synthetic observations generated by model simulations. Standard techniques to approximate the likelihood via simulation either use summary statistics of the data or are at risk of producing severe biases in the estimate. Here, we explore another method, inverse binomial sampling (IBS), which can estimate the log-likelihood of an entire data set efficiently and without bias. For each observation, IBS draws samples from the simulator model until one matches the observation. The log-likelihood estimate is then a function of the number of samples drawn. The variance of this estimator is uniformly bounded, achieves the minimum variance for an unbiased estimator, and we can compute calibrated estimates of the variance. We provide theoretical arguments in favor of IBS and an empirical assessment of the method for maximum-likelihood estimation with simulation-based models. As case studies, we take three model-fitting problems of increasing complexity from computational and cognitive neuroscience. In all problems, IBS generally produces lower error in the estimated parameters and maximum log-likelihood values than alternative sampling methods with the same average number of samples. Our results demonstrate the potential of IBS as a practical, robust, and easy to implement method for log-likelihood evaluation when exact techniques are not available. |
Tasks | |
Published | 2020-01-12 |
URL | https://arxiv.org/abs/2001.03985v1 |
PDF | https://arxiv.org/pdf/2001.03985v1.pdf |
PWC | https://paperswithcode.com/paper/unbiased-and-efficient-log-likelihood |
Repo | https://github.com/basvanopheusden/ibs-development |
Framework | none |
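The sampling loop described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `ibs_log_likelihood`, the `simulate(stimulus, rng)` callback, and the `(stimulus, response)` data layout are all hypothetical. For each observation it draws from the simulator until a draw matches; if the first match occurs on draw K, the per-observation estimate is -(1 + 1/2 + ... + 1/(K-1)), whose expectation is the log of the match probability.

```python
import random


def ibs_log_likelihood(observations, simulate, rng):
    """Sketch of an inverse binomial sampling (IBS) log-likelihood estimate.

    observations: iterable of (stimulus, response) pairs (hypothetical layout).
    simulate:     callable (stimulus, rng) -> simulated response (hypothetical).
    rng:          a random.Random instance passed through to the simulator.
    """
    total = 0.0
    for stimulus, response in observations:
        k = 1  # index of the current draw
        # Keep drawing until the simulated response matches the observed one.
        # Each miss on draw k contributes -1/k to the estimate.
        while simulate(stimulus, rng) != response:
            total -= 1.0 / k
            k += 1
    return total


def coin_simulator(stimulus, rng):
    """Toy simulator: returns 1 with probability 0.5, ignoring the stimulus."""
    return 1 if rng.random() < 0.5 else 0
```

Note that the loop has no upper bound on the number of draws, so an observation the simulator can never reproduce would make it run forever; a practical version would need some cap or early-stopping rule, which is left out of this sketch.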
Single-exposure absorption imaging of ultracold atoms using deep learning
Title | Single-exposure absorption imaging of ultracold atoms using deep learning |
Authors | Gal Ness, Anastasiya Vainbaum, Constantine Shkedrov, Yanay Florshaim, Yoav Sagi |
Abstract | Absorption imaging is the most common probing technique in experiments with ultracold atoms. The standard procedure involves the division of two frames acquired at successive exposures, one with the atomic absorption signal and one without. A well-known problem is the presence of residual structured noise in the final image, due to small differences between the imaging light in the two exposures. Here we solve this problem by performing absorption imaging with only a single exposure, where instead of a second exposure the reference frame is generated by an unsupervised image-completion autoencoder neural network. The network is trained on images without absorption signal such that it can infer the noise overlaying the atomic signal based only on the information in the region encircling the signal. We demonstrate our approach on data captured with a quantum degenerate Fermi gas. The average residual noise in the resulting images is below that of the standard double-shot technique. Our method simplifies the experimental sequence, reduces the hardware requirements, and can improve the accuracy of extracted physical observables. The trained network and its generating scripts are available as an open-source repository (http://absDL.github.io/). |
Tasks | Image Denoising, Physical Attribute Prediction |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01643v1 |
PDF | https://arxiv.org/pdf/2003.01643v1.pdf |
PWC | https://paperswithcode.com/paper/single-exposure-absorption-imaging-of |
Repo | https://github.com/absDL/absDL.github.io |
Framework | none |
Cross-Iteration Batch Normalization
Title | Cross-Iteration Batch Normalization |
Authors | Zhuliang Yao, Yue Cao, Shuxin Zheng, Gao Huang, Stephen Lin |
Abstract | A well-known issue of Batch Normalization is its significantly reduced effectiveness in the case of small mini-batch sizes. When a mini-batch contains few examples, the statistics upon which the normalization is defined cannot be reliably estimated from it during a training iteration. To address this problem, we present Cross-Iteration Batch Normalization (CBN), in which examples from multiple recent iterations are jointly utilized to enhance estimation quality. A challenge of computing statistics over multiple iterations is that the network activations from different iterations are not comparable to each other due to changes in network weights. We thus compensate for the network weight changes via a proposed technique based on Taylor polynomials, so that the statistics can be accurately estimated and batch normalization can be effectively applied. On object detection and image classification with small mini-batch sizes, CBN is found to outperform the original batch normalization and a direct calculation of statistics over previous iterations without the proposed compensation technique. |
Tasks | Image Classification, Object Detection |
Published | 2020-02-13 |
URL | https://arxiv.org/abs/2002.05712v2 |
PDF | https://arxiv.org/pdf/2002.05712v2.pdf |
PWC | https://paperswithcode.com/paper/cross-iteration-batch-normalization |
Repo | https://github.com/Howal/Cross-iterationBatchNorm |
Framework | pytorch |
ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
Title | ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network |
Authors | Yuliang Liu, Hao Chen, Chunhua Shen, Tong He, Lianwen Jin, Liangwei Wang |
Abstract | Scene text detection and recognition has received increasing research attention. Existing methods can be roughly categorized into two groups: character-based and segmentation-based. These methods either are costly for character annotation or need to maintain a complex pipeline, which is often not suitable for real-time applications. Here we address the problem by proposing the Adaptive Bezier-Curve Network (ABCNet). Our contributions are three-fold: 1) For the first time, we adaptively fit arbitrarily-shaped text by a parameterized Bezier curve. 2) We design a novel BezierAlign layer for extracting accurate convolution features of a text instance with arbitrary shapes, significantly improving the precision compared with previous methods. 3) Compared with standard bounding box detection, our Bezier curve detection introduces negligible computation overhead, resulting in superiority of our method in both efficiency and accuracy. Experiments on arbitrarily-shaped benchmark datasets, namely Total-Text and CTW1500, demonstrate that ABCNet achieves state-of-the-art accuracy, meanwhile significantly improving the speed. In particular, on Total-Text, our realtime version is over 10 times faster than recent state-of-the-art methods with a competitive recognition accuracy. Code is available at https://tinyurl.com/AdelaiDet |
Tasks | Scene Text Detection, Text Spotting |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10200v2 |
PDF | https://arxiv.org/pdf/2002.10200v2.pdf |
PWC | https://paperswithcode.com/paper/abcnet-real-time-scene-text-spotting-with |
Repo | https://github.com/aim-uofa/AdelaiDet |
Framework | pytorch |
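To make the idea of describing a text boundary with a parameterized Bezier curve concrete, the sketch below evaluates a curve from its control points using the Bernstein basis. It is only an illustration of the curve parameterization: the cubic degree, the function names `cubic_bezier` and `sample_curve`, and the 2D tuple layout are assumptions made here, and nothing in it reproduces the network or the BezierAlign layer.

```python
def cubic_bezier(control_points, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1].

    control_points: four (x, y) tuples (the cubic degree is an assumption
    made for this sketch).
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = control_points
    # Cubic Bernstein basis polynomials.
    b0 = (1 - t) ** 3
    b1 = 3 * t * (1 - t) ** 2
    b2 = 3 * t ** 2 * (1 - t)
    b3 = t ** 3
    x = b0 * x0 + b1 * x1 + b2 * x2 + b3 * x3
    y = b0 * y0 + b1 * y1 + b2 * y2 + b3 * y3
    return (x, y)


def sample_curve(control_points, n_points):
    """Sample n_points (at least 2) evenly spaced in t along the curve."""
    return [cubic_bezier(control_points, i / (n_points - 1))
            for i in range(n_points)]
```

Two such curves, one for the upper edge and one for the lower edge of a text instance, would be enough to outline a curved region with only eight control points.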
Getting to 99% Accuracy in Interactive Segmentation
Title | Getting to 99% Accuracy in Interactive Segmentation |
Authors | Marco Forte, Brian Price, Scott Cohen, Ning Xu, François Pitié |
Abstract | Interactive object cutout tools are the cornerstone of the image editing workflow. Recent deep-learning based interactive segmentation algorithms have made significant progress in handling complex images and rough binary selections can typically be obtained with just a few clicks. Yet, deep learning techniques tend to plateau once this rough selection has been reached. In this work, we interpret this plateau as the inability of current algorithms to sufficiently leverage each user interaction and also as the limitations of current training/testing datasets. We propose a novel interactive architecture and a novel training scheme that are both tailored to better exploit the user workflow. We also show that significant improvements can be further gained by introducing a synthetic training dataset that is specifically designed for complex object boundaries. Comprehensive experiments support our approach, and our network achieves state of the art performance. |
Tasks | Interactive Segmentation |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07932v1 |
PDF | https://arxiv.org/pdf/2003.07932v1.pdf |
PWC | https://paperswithcode.com/paper/getting-to-99-accuracy-in-interactive |
Repo | https://github.com/MarcoForte/DeepInteractiveSegmentation |
Framework | pytorch |
GFTE: Graph-based Financial Table Extraction
Title | GFTE: Graph-based Financial Table Extraction |
Authors | Yiren Li, Zheng Huang, Junchi Yan, Yi Zhou, Fan Ye, Xianhui Liu |
Abstract | Tabular data is a crucial form of information expression, which can organize data in a standard structure for easy information retrieval and comparison. However, in the financial industry and many other fields, tables are often disclosed in unstructured digital files, e.g. Portable Document Format (PDF) and images, from which they are difficult to extract directly. In this paper, to facilitate deep learning based table extraction from unstructured digital files, we publish a standard Chinese dataset named FinTab, which contains more than 1,600 financial tables of diverse kinds and their corresponding structure representation in JSON. In addition, we propose a novel graph-based convolutional neural network model named GFTE as a baseline for future comparison. GFTE integrates image features, position features and textual features together for precise edge prediction and reaches overall good results. |
Tasks | Information Retrieval |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07560v1 |
PDF | https://arxiv.org/pdf/2003.07560v1.pdf |
PWC | https://paperswithcode.com/paper/gfte-graph-based-financial-table-extraction |
Repo | https://github.com/Irene323/GFTE |
Framework | pytorch |
Black-box Smoothing: A Provable Defense for Pretrained Classifiers
Title | Black-box Smoothing: A Provable Defense for Pretrained Classifiers |
Authors | Hadi Salman, Mingjie Sun, Greg Yang, Ashish Kapoor, J. Zico Kolter |
Abstract | We present a method for provably defending any pretrained image classifier against $\ell_p$ adversarial attacks. By prepending a custom-trained denoiser to any off-the-shelf image classifier and using randomized smoothing, we effectively create a new classifier that is guaranteed to be $\ell_p$-robust to adversarial examples, without modifying the pretrained classifier. The approach applies both to the case where we have full access to the pretrained classifier as well as the case where we only have query access. We refer to this defense as black-box smoothing, and we demonstrate its effectiveness through extensive experimentation on ImageNet and CIFAR-10. Finally, we use our method to provably defend the Azure, Google, AWS, and ClarifAI image classification APIs. Our code replicating all the experiments in the paper can be found at https://github.com/microsoft/blackbox-smoothing . |
Tasks | Image Classification |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.01908v1 |
PDF | https://arxiv.org/pdf/2003.01908v1.pdf |
PWC | https://paperswithcode.com/paper/black-box-smoothing-a-provable-defense-for |
Repo | https://github.com/microsoft/blackbox-smoothing |
Framework | pytorch |
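The prediction path described in the abstract above (add noise, denoise, then query the unmodified classifier) can be sketched as a majority vote. This is a simplified illustration under stated assumptions: the function `smoothed_predict`, the use of Gaussian noise with standard deviation `sigma`, and the plain-list input format are choices made here, and the sketch omits the statistical certification step that yields an actual robustness guarantee.

```python
import random
from collections import Counter


def smoothed_predict(classifier, denoiser, x, sigma, n_samples, rng):
    """Majority-vote prediction of a noise-smoothed, denoised classifier.

    classifier: callable list[float] -> label; treated as a black box.
    denoiser:   callable list[float] -> list[float]; prepended to the classifier.
    x:          input as a flat list of floats (hypothetical layout).
    sigma:      standard deviation of the added Gaussian noise (assumption).
    """
    votes = Counter()
    for _ in range(n_samples):
        # Perturb every input coordinate with independent Gaussian noise.
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        # Only query access to the classifier is needed here.
        votes[classifier(denoiser(noisy))] += 1
    # Return the label predicted most often across the noisy copies.
    return votes.most_common(1)[0][0]
```

Because the classifier is only ever called, never differentiated or retrained, the same loop would apply whether it is a local model or a remote API, which matches the query-access setting mentioned in the abstract.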
I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively
Title | I Am Going MAD: Maximum Discrepancy Competition for Comparing Classifiers Adaptively |
Authors | Haotao Wang, Tianlong Chen, Zhangyang Wang, Kede Ma |
Abstract | The learning of hierarchical representations for image classification has experienced an impressive series of successes due in part to the availability of large-scale labeled data for training. On the other hand, the trained classifiers have traditionally been evaluated on small and fixed sets of test images, which are deemed to be extremely sparsely distributed in the space of all natural images. It is thus questionable whether recent performance improvements on the excessively re-used test sets generalize to real-world natural images with much richer content variations. Inspired by efficient stimulus selection for testing perceptual models in psychophysical and physiological studies, we present an alternative framework for comparing image classifiers, which we name the MAximum Discrepancy (MAD) competition. Rather than comparing image classifiers using fixed test images, we adaptively sample a small test set from an arbitrarily large corpus of unlabeled images so as to maximize the discrepancies between the classifiers, measured by the distance over WordNet hierarchy. Human labeling on the resulting model-dependent image sets reveals the relative performance of the competing classifiers, and provides useful insights on potential ways to improve them. We report the MAD competition results of eleven ImageNet classifiers while noting that the framework is readily extensible and cost-effective to add future classifiers into the competition. Codes can be found at https://github.com/TAMU-VITA/MAD. |
Tasks | Image Classification |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10648v1 |
PDF | https://arxiv.org/pdf/2002.10648v1.pdf |
PWC | https://paperswithcode.com/paper/i-am-going-mad-maximum-discrepancy-1 |
Repo | https://github.com/TAMU-VITA/MAD |
Framework | none |
Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent
Title | Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent |
Authors | Bao Wang, Tan M. Nguyen, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher |
Abstract | Stochastic gradient descent (SGD) with constant momentum and its variants such as Adam are the optimization algorithms of choice for training deep neural networks (DNNs). Since DNN training is incredibly computationally expensive, there is great interest in speeding up convergence. Nesterov accelerated gradient (NAG) improves the convergence rate of gradient descent (GD) for convex optimization using a specially designed momentum; however, it accumulates error when an inexact gradient is used (such as in SGD), slowing convergence at best and diverging at worst. In this paper, we propose Scheduled Restart SGD (SRSGD), a new NAG-style scheme for training DNNs. SRSGD replaces the constant momentum in SGD by the increasing momentum in NAG but stabilizes the iterations by resetting the momentum to zero according to a schedule. Using a variety of models and benchmarks for image classification, we demonstrate that, in training DNNs, SRSGD significantly improves convergence and generalization; for instance in training ResNet200 for ImageNet classification, SRSGD achieves an error rate of 20.93% vs. the benchmark of 22.13%. These improvements become more significant as the network grows deeper. Furthermore, on both CIFAR and ImageNet, SRSGD reaches similar or even better error rates with fewer training epochs compared to the SGD baseline. We provide code for SRSGD at https://github.com/minhtannguyen/SRSGD. |
Tasks | Image Classification |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10583v1 |
PDF | https://arxiv.org/pdf/2002.10583v1.pdf |
PWC | https://paperswithcode.com/paper/scheduled-restart-momentum-for-accelerated |
Repo | https://github.com/minhtannguyen/SRSGD |
Framework | pytorch |
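The update rule summarized in the abstract above, an increasing NAG-style momentum that is reset to zero on a schedule, can be sketched on a toy problem. The details below are assumptions made for illustration: the momentum coefficient j / (j + 3), where j counts iterations since the last restart, is a common NAG schedule, and the fixed restart interval `restart_every`, the function name `srsgd`, and the list-based parameters are hypothetical. The sketch uses exact gradients rather than stochastic ones.

```python
def srsgd(grad, w0, lr, restart_every, n_steps):
    """Sketch of gradient descent with scheduled-restart NAG momentum.

    grad:          callable list[float] -> list[float], gradient of the loss.
    w0:            initial parameters as a list of floats.
    lr:            step size.
    restart_every: iterations between momentum resets (assumed fixed here).
    """
    w = list(w0)
    v_prev = list(w0)
    for k in range(n_steps):
        g = grad(w)
        # Plain gradient step from the current (extrapolated) point.
        v = [wi - lr * gi for wi, gi in zip(w, g)]
        # Momentum grows with j and drops back to 0 at every restart.
        j = k % restart_every
        mu = j / (j + 3)
        # Extrapolate along the direction of the last two gradient steps.
        w = [vi + mu * (vi - vpi) for vi, vpi in zip(v, v_prev)]
        v_prev = v
    return w
```

For example, `srsgd(lambda w: list(w), [5.0], 0.1, 20, 200)` minimizes f(w) = w²/2 and drives the parameter toward zero.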
Fine-Grained Fashion Similarity Learning by Attribute-Specific Embedding Network
Title | Fine-Grained Fashion Similarity Learning by Attribute-Specific Embedding Network |
Authors | Zhe Ma, Jianfeng Dong, Yao Zhang, Zhongzi Long, Yuan He, Hui Xue, Shouling Ji |
Abstract | This paper strives to learn fine-grained fashion similarity. In this similarity paradigm, one should pay more attention to the similarity in terms of a specific design/attribute among fashion items, which has potential value in many fashion-related applications such as fashion copyright protection. To this end, we propose an Attribute-Specific Embedding Network (ASEN) to jointly learn multiple attribute-specific embeddings in an end-to-end manner, and thus measure the fine-grained similarity in the corresponding space. With two attention modules, i.e., Attribute-aware Spatial Attention and Attribute-aware Channel Attention, ASEN is able to locate the related regions and capture the essential patterns under the guidance of the specified attribute, thus making the learned attribute-specific embeddings better reflect the fine-grained similarity. Extensive experiments on four fashion-related datasets show the effectiveness of ASEN for fine-grained fashion similarity learning and its potential for fashion reranking. |
Tasks | |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.02814v1 |
PDF | https://arxiv.org/pdf/2002.02814v1.pdf |
PWC | https://paperswithcode.com/paper/fine-grained-fashion-similarity-learning-by |
Repo | https://github.com/Maryeon/asen |
Framework | pytorch |
DISIR: Deep Image Segmentation with Interactive Refinement
Title | DISIR: Deep Image Segmentation with Interactive Refinement |
Authors | Gaston Lenczner, Bertrand Le Saux, Nicola Luminari, Adrien Chan Hon Tong, Guy Le Besnerais |
Abstract | This paper presents an interactive approach for multi-class segmentation of aerial images. Precisely, it is based on a deep neural network which exploits both RGB images and annotations. Starting from an initial output based on the image only, our network then interactively refines this segmentation map using a concatenation of the image and user annotations. Importantly, user annotations modify the inputs of the network - not its weights - enabling a fast and smooth process. Through experiments on two public aerial datasets, we show that user annotations are extremely rewarding: each click corrects roughly 5000 pixels. We analyze the impact of different aspects of our framework such as the representation of the annotations, the volume of training data or the network architecture. Code is available at https://github.com/delair-ai/DISIR. |
Tasks | Semantic Segmentation |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2003.14200v1 |
PDF | https://arxiv.org/pdf/2003.14200v1.pdf |
PWC | https://paperswithcode.com/paper/disir-deep-image-segmentation-with |
Repo | https://github.com/delair-ai/DISIR |
Framework | pytorch |
Robust 6D Object Pose Estimation by Learning RGB-D Features
Title | Robust 6D Object Pose Estimation by Learning RGB-D Features |
Authors | Meng Tian, Liang Pan, Marcelo H Ang Jr, Gim Hee Lee |
Abstract | Accurate 6D object pose estimation is fundamental to robotic manipulation and grasping. Previous methods follow a local optimization approach which minimizes the distance between closest point pairs to handle the rotation ambiguity of symmetric objects. In this work, we propose a novel discrete-continuous formulation for rotation regression to resolve this local-optimum problem. We uniformly sample rotation anchors in SO(3), and predict a constrained deviation from each anchor to the target, as well as uncertainty scores for selecting the best prediction. Additionally, the object location is detected by aggregating point-wise vectors pointing to the 3D center. Experiments on two benchmarks: LINEMOD and YCB-Video, show that the proposed method outperforms state-of-the-art approaches. Our code is available at https://github.com/mentian/object-posenet. |
Tasks | 6D Pose Estimation using RGB, Pose Estimation |
Published | 2020-02-29 |
URL | https://arxiv.org/abs/2003.00188v2 |
PDF | https://arxiv.org/pdf/2003.00188v2.pdf |
PWC | https://paperswithcode.com/paper/robust-6d-object-pose-estimation-by-learning |
Repo | https://github.com/mentian/object-posenet |
Framework | pytorch |
Graph-Bert: Only Attention is Needed for Learning Graph Representations
Title | Graph-Bert: Only Attention is Needed for Learning Graph Representations |
Authors | Jiawei Zhang, Haopeng Zhang, Congying Xia, Li Sun |
Abstract | The dominant graph neural networks (GNNs) over-rely on graph links, which has already led to several serious performance problems, e.g., the suspended animation problem and the over-smoothing problem. What's more, the inherently inter-connected nature precludes parallelization within the graph, which becomes critical for large-sized graphs, as memory constraints limit batching across the nodes. In this paper, we will introduce a new graph neural network, namely GRAPH-BERT (Graph based BERT), solely based on the attention mechanism without any graph convolution or aggregation operators. Instead of feeding GRAPH-BERT with the complete large input graph, we propose to train GRAPH-BERT with sampled linkless subgraphs within their local contexts. GRAPH-BERT can be learned effectively in a standalone mode. Meanwhile, a pre-trained GRAPH-BERT can also be transferred to other application tasks directly or with necessary fine-tuning if any supervised label information or certain application oriented objective is available. We have tested the effectiveness of GRAPH-BERT on several graph benchmark datasets. Based on the GRAPH-BERT pre-trained with the node attribute reconstruction and structure recovery tasks, we further fine-tune GRAPH-BERT on node classification and graph clustering tasks specifically. The experimental results have demonstrated that GRAPH-BERT can out-perform the existing GNNs in both learning effectiveness and efficiency. |
Tasks | Graph Clustering, Node Classification |
Published | 2020-01-15 |
URL | https://arxiv.org/abs/2001.05140v2 |
PDF | https://arxiv.org/pdf/2001.05140v2.pdf |
PWC | https://paperswithcode.com/paper/graph-bert-only-attention-is-needed-for |
Repo | https://github.com/jwzhanggy/graph_bert_work |
Framework | none |
Towards Comparability in Non-Intrusive Load Monitoring: On Data and Performance Evaluation
Title | Towards Comparability in Non-Intrusive Load Monitoring: On Data and Performance Evaluation |
Authors | Christoph Klemenjak, Stephen Makonin, Wilfried Elmenreich |
Abstract | Non-Intrusive Load Monitoring (NILM) comprises a set of techniques that provide insights into the energy consumption of households and industrial facilities. The latest contributions show significant improvements in terms of accuracy and generalisation abilities. Despite all progress made concerning disaggregation techniques, performance evaluation and comparability remain an open research question. The lack of standardisation and consensus on evaluation procedures makes reproducibility and comparability extremely difficult. In this paper, we draw attention to comparability in NILM with a focus on highlighting the considerable differences amongst common energy datasets used to test the performance of algorithms. We divide the discussion on comparability into data aspects and performance metrics, and give a close view on evaluation processes. Detailed information on pre-processing as well as data cleaning methods, the importance of unified performance reporting, and the need for complexity measures in load disaggregation are found to be the most urgent issues in NILM-related research. In addition, our evaluation suggests that datasets should be chosen carefully. We conclude by formulating suggestions for future work to enhance comparability. |
Tasks | Non-Intrusive Load Monitoring |
Published | 2020-01-20 |
URL | https://arxiv.org/abs/2001.07708v1 |
PDF | https://arxiv.org/pdf/2001.07708v1.pdf |
PWC | https://paperswithcode.com/paper/towards-comparability-in-non-intrusive-load |
Repo | https://github.com/klemenjak/comparability |
Framework | none |