February 2, 2020

Paper Group AWR 10

Preferences Implicit in the State of the World. Balanced Binary Neural Networks with Gated Residual. MNIST-C: A Robustness Benchmark for Computer Vision. An Active Approach for Model Interpretation. PyTorchPipe: a framework for rapid prototyping of pipelines combining language and vision. Unsupervised Learning of Probabilistic Diffeomorphic Registr …

Preferences Implicit in the State of the World

Title Preferences Implicit in the State of the World
Authors Rohin Shah, Dmitrii Krasheninnikov, Jordan Alexander, Pieter Abbeel, Anca Dragan
Abstract Reinforcement learning (RL) agents optimize only the features specified in a reward function and are indifferent to anything left out inadvertently. This means that we must not only specify what to do, but also the much larger space of what not to do. It is easy to forget these preferences, since these preferences are already satisfied in our environment. This motivates our key insight: when a robot is deployed in an environment that humans act in, the state of the environment is already optimized for what humans want. We can therefore use this implicit preference information from the state to fill in the blanks. We develop an algorithm based on Maximum Causal Entropy IRL and use it to evaluate the idea in a suite of proof-of-concept environments designed to show its properties. We find that information from the initial state can be used to infer both side effects that should be avoided as well as preferences for how the environment should be organized. Our code can be found at https://github.com/HumanCompatibleAI/rlsp.
Published 2019-02-12
URL http://arxiv.org/abs/1902.04198v2
PDF http://arxiv.org/pdf/1902.04198v2.pdf
PWC https://paperswithcode.com/paper/preferences-implicit-in-the-state-of-the
Repo https://github.com/HumanCompatibleAI/rlsp
Framework none

Balanced Binary Neural Networks with Gated Residual

Title Balanced Binary Neural Networks with Gated Residual
Authors Mingzhu Shen, Xianglong Liu, Ruihao Gong, Kai Han
Abstract Binary neural networks have attracted considerable attention in recent years. However, mainly due to the information loss stemming from biased binarization, preserving the accuracy of these networks remains a critical issue. In this paper, we attempt to maintain the information propagated in the forward process and propose Balanced Binary Neural Networks with Gated Residual (BBG for short). First, a weight-balanced binarization is introduced to maximize the information entropy of binary weights, so that the informative binary weights can capture more of the information contained in the activations. Second, for binary activations, a gated residual is appended to compensate for their information loss during the forward process, with a slight overhead. Both techniques can be wrapped as a generic network module that supports various network architectures for different tasks including classification and detection. We evaluate our BBG on image classification tasks over CIFAR-10/100 and ImageNet and on a detection task over Pascal VOC. The experimental results show that BBG-Net performs remarkably well across various network architectures such as VGG, ResNet and SSD, with superior performance over state-of-the-art methods in terms of memory consumption, inference speed and accuracy.
Tasks Image Classification
Published 2019-09-26
URL https://arxiv.org/abs/1909.12117v2
PDF https://arxiv.org/pdf/1909.12117v2.pdf
PWC https://paperswithcode.com/paper/balanced-binary-neural-networks-with-gated
Repo https://github.com/JDAI-CV/dabnn
Framework none
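The weight-balanced binarization mentioned in the abstract aims to maximize the entropy of the binary weights, which for a two-valued variable happens when +1 and -1 are equally frequent. The sketch below illustrates that idea only: it assumes thresholding at the median instead of at zero, which may differ from the paper's exact procedure, and the function names are hypothetical.

```python
import math
from statistics import median


def binary_entropy(bits):
    """Shannon entropy (in bits) of a sequence of +1/-1 values."""
    p = sum(1 for b in bits if b > 0) / len(bits)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def sign_binarize(weights):
    """Plain sign binarization: threshold at zero (can be heavily biased)."""
    return [1 if w >= 0 else -1 for w in weights]


def balanced_binarize(weights):
    """Assumed balancing step: threshold at the median so both signs are
    (nearly) equally frequent, which maximizes the entropy."""
    m = median(weights)
    return [1 if w > m else -1 for w in weights]


if __name__ == "__main__":
    w = [0.9, 0.7, 0.4, 0.2, 0.1, -0.3]  # illustrative weights, mostly positive
    print(binary_entropy(sign_binarize(w)))      # below 1 bit
    print(binary_entropy(balanced_binarize(w)))  # 1 bit for this example
```

With mostly positive weights, thresholding at zero yields a skewed code, while the median threshold splits the weights evenly.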

MNIST-C: A Robustness Benchmark for Computer Vision

Title MNIST-C: A Robustness Benchmark for Computer Vision
Authors Norman Mu, Justin Gilmer
Abstract We introduce the MNIST-C dataset, a comprehensive suite of 15 corruptions applied to the MNIST test set, for benchmarking out-of-distribution robustness in computer vision. Through several experiments and visualizations we demonstrate that our corruptions significantly degrade performance of state-of-the-art computer vision models while preserving the semantic content of the test images. In contrast to the popular notion of adversarial robustness, our model-agnostic corruptions do not seek worst-case performance but are instead designed to be broad and diverse, capturing multiple failure modes of modern models. In fact, we find that several previously published adversarial defenses significantly degrade robustness as measured by MNIST-C. We hope that our benchmark serves as a useful tool for future work in designing systems that are able to learn robust feature representations that capture the underlying semantics of the input.
Published 2019-06-05
URL https://arxiv.org/abs/1906.02337v1
PDF https://arxiv.org/pdf/1906.02337v1.pdf
PWC https://paperswithcode.com/paper/mnist-c-a-robustness-benchmark-for-computer
Repo https://github.com/google-research/mnist-c
Framework none

An Active Approach for Model Interpretation

Title An Active Approach for Model Interpretation
Authors Jialin Lu, Martin Ester
Abstract Model interpretation, or explanation of a machine learning classifier, aims to extract generalizable knowledge from a trained classifier into a human-understandable format, for various purposes such as model assessment, debugging and trust. From a computational viewpoint, it is formulated as approximating the target classifier using a simpler interpretable model, such as rule models like a decision set/list/tree. Often, this approximation is handled as standard supervised learning and the only difference is that the labels are provided by the target classifier instead of ground truth. This paradigm is particularly popular because there exists a variety of well-studied supervised algorithms for learning an interpretable classifier. However, we argue that this paradigm is suboptimal because it does not utilize the unique property of the model interpretation problem, that is, the ability to generate synthetic instances and query the target classifier for their labels. We call this the active-query property, suggesting that we should consider model interpretation from an active learning perspective. Following this insight, we argue that the active-query property should be employed when designing a model interpretation algorithm, and that the generation of synthetic instances should be integrated seamlessly with the algorithm that learns the model interpretation. In this paper, we demonstrate that by doing so, it is possible to achieve more faithful interpretation with simpler model complexity. As a technical contribution, we present an active algorithm Active Decision Set Induction (ADS) to learn a decision set, a set of if-else rules, for model interpretation. ADS performs a local search over the space of all decision sets. In every iteration, ADS computes confidence intervals for the value of the objective function of all local actions and utilizes active-query to determine the best one.
Tasks Active Learning
Published 2019-10-27
URL https://arxiv.org/abs/1910.12207v1
PDF https://arxiv.org/pdf/1910.12207v1.pdf
PWC https://paperswithcode.com/paper/an-active-approach-for-model-interpretation
Repo https://github.com/LuxxxLucy/Active-Decision-Set-Induction
Framework none
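The active-query property described above (generate synthetic instances, ask the target classifier for labels, and attach a confidence interval to the estimate) can be sketched as follows. This is not the ADS algorithm: the Hoeffding bound, the fidelity objective and all names here are assumptions chosen for illustration.

```python
import math
import random


def estimate_fidelity(rule, target, sample_instance,
                      n_queries=500, delta=0.05, seed=0):
    """Estimate how often `rule` agrees with the black-box `target`.

    Synthetic instances come from `sample_instance(rng)` and are labeled
    by querying `target`. Returns (mean, lower, upper), where the interval
    is a Hoeffding bound that holds with probability at least 1 - delta.
    """
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_queries):
        x = sample_instance(rng)          # synthetic instance
        agree += rule(x) == target(x)     # active query to the target
    mean = agree / n_queries
    half_width = math.sqrt(math.log(2 / delta) / (2 * n_queries))
    return mean, max(0.0, mean - half_width), min(1.0, mean + half_width)


if __name__ == "__main__":
    target = lambda x: x[0] > 0.5                 # stand-in black-box model
    candidate = lambda x: x[0] > 0.9              # stand-in if-else rule
    sampler = lambda rng: [rng.random()]          # uniform synthetic inputs
    print(estimate_fidelity(candidate, target, sampler))
```

A local search could compare such intervals across candidate rules and issue more queries only when the intervals overlap.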

PyTorchPipe: a framework for rapid prototyping of pipelines combining language and vision

Title PyTorchPipe: a framework for rapid prototyping of pipelines combining language and vision
Authors Tomasz Kornuta
Abstract Access to vast amounts of data along with affordable computational power stimulated the reincarnation of neural networks. The progress could not be achieved without adequate software tools, lowering the entry bar for the next generations of researchers and developers. The paper introduces PyTorchPipe (PTP), a framework built on top of PyTorch. Answering the recent needs and trends in machine learning, PTP facilitates building and training of complex, multi-modal models combining language and vision (but is not limited to those two modalities). At its core, PTP employs a component-oriented approach and relies on the concept of a pipeline, defined as a directed acyclic graph of loosely coupled components. A user defines a pipeline using yaml-based (thus human-readable) configuration files, whereas PTP provides generic workers for their loading, training, and testing using all the computational power (CPUs and GPUs) that is available to the user. The paper covers the main concepts of PyTorchPipe, discusses its key features and briefly presents the currently implemented tasks, models and components.
Published 2019-10-18
URL https://arxiv.org/abs/1910.08654v1
PDF https://arxiv.org/pdf/1910.08654v1.pdf
PWC https://paperswithcode.com/paper/pytorchpipe-a-framework-for-rapid-prototyping
Repo https://github.com/ibm/pytorchpipe
Framework pytorch
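The pipeline concept in the abstract, a directed acyclic graph of loosely coupled components, can be sketched with the standard library alone. This is not PyTorchPipe's API: the component names, the dictionary-based wiring and the run_pipeline helper are hypothetical, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def run_pipeline(components, dependencies, inputs):
    """Run components in an order that respects the DAG.

    components:   name -> callable taking a dict of upstream outputs
    dependencies: name -> set of upstream names (components or inputs)
    inputs:       name -> externally provided value
    """
    results = dict(inputs)
    # static_order() yields each node after all of its predecessors and
    # raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(dependencies).static_order():
        if name in results:
            continue  # externally provided input, nothing to compute
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = components[name](upstream)
    return results


if __name__ == "__main__":
    components = {
        "tokenizer": lambda up: up["question"].lower().split(),
        "image_encoder": lambda up: sum(up["image"]),
        "fusion": lambda up: (len(up["tokenizer"]), up["image_encoder"]),
    }
    dependencies = {
        "tokenizer": {"question"},
        "image_encoder": {"image"},
        "fusion": {"tokenizer", "image_encoder"},
    }
    inputs = {"question": "What color is it", "image": [1, 2, 3]}
    print(run_pipeline(components, dependencies, inputs)["fusion"])
```

Keeping the wiring in plain data mirrors the idea of describing a pipeline in a configuration file rather than in code.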

Unsupervised Learning of Probabilistic Diffeomorphic Registration for Images and Surfaces

Title Unsupervised Learning of Probabilistic Diffeomorphic Registration for Images and Surfaces
Authors Adrian V. Dalca, Guha Balakrishnan, John Guttag, Mert R. Sabuncu
Abstract Classical deformable registration techniques achieve impressive results and offer a rigorous theoretical treatment, but are computationally intensive since they solve an optimization problem for each image pair. Recently, learning-based methods have facilitated fast registration by learning spatial deformation functions. However, these approaches use restricted deformation models, require supervised labels, or do not guarantee a diffeomorphic (topology-preserving) registration. Furthermore, learning-based registration tools have not been derived from a probabilistic framework that can offer uncertainty estimates. In this paper, we build a connection between classical and learning-based methods. We present a probabilistic generative model and derive an unsupervised learning-based inference algorithm that uses insights from classical registration methods and makes use of recent developments in convolutional neural networks (CNNs). We demonstrate our method on a 3D brain registration task for both images and anatomical surfaces, and provide extensive empirical analyses. Our principled approach results in state of the art accuracy and very fast runtimes, while providing diffeomorphic guarantees. Our implementation is available at http://voxelmorph.csail.mit.edu.
Tasks Constrained Diffeomorphic Image Registration, Deformable Medical Image Registration, Diffeomorphic Medical Image Registration, Image Registration, Medical Image Registration
Published 2019-03-08
URL https://arxiv.org/abs/1903.03545v2
PDF https://arxiv.org/pdf/1903.03545v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-learning-of-probabilistic
Repo https://github.com/voxelmorph/voxelmorph
Framework tf

Evaluation of Neural Network Uncertainty Estimation with Application to Resource-Constrained Platforms

Title Evaluation of Neural Network Uncertainty Estimation with Application to Resource-Constrained Platforms
Authors Yukun Ding, Jinglan Liu, Jinjun Xiong, Yiyu Shi
Abstract The ability to accurately estimate uncertainties in neural network predictions is of great importance in many critical tasks. In this paper, we first analyze the intrinsic relation between two main use cases of uncertainty estimation, i.e., selective prediction and confidence calibration. We then reveal the potential issues with the existing quality metrics for uncertainty estimation and propose new metrics to mitigate them. Finally, we apply these new metrics to resource-constrained platforms such as autonomous driver assistance systems where the quality of uncertainty estimation is critical. By exploring the trade-off between the model size and the estimation quality, a missing piece in the literature, some interesting trends are observed.
Tasks Calibration
Published 2019-03-05
URL http://arxiv.org/abs/1903.02050v1
PDF http://arxiv.org/pdf/1903.02050v1.pdf
PWC https://paperswithcode.com/paper/evaluation-of-neural-network-uncertainty
Repo https://github.com/yding5/AdaptiveBinning
Framework none
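For context on confidence calibration, the sketch below computes the widely used expected calibration error (ECE) with equal-width bins. It is a standard metric of the kind the abstract examines, not one of the new metrics the paper proposes; the function name and the bin count are illustrative choices.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted average gap between confidence and accuracy per bin.

    confidences: predicted probabilities in [0, 1] for the chosen class
    correct:     booleans, True where the prediction was right
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Overconfident model: claims 95% confidence but is right 3 times out of 4.
    print(expected_calibration_error([0.95] * 4, [True, True, True, False]))
```

Because the result depends on how samples fall into fixed bins, the choice of binning scheme affects the estimate.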

IsoNN: Isomorphic Neural Network for Graph Representation Learning and Classification

Title IsoNN: Isomorphic Neural Network for Graph Representation Learning and Classification
Authors Lin Meng, Jiawei Zhang
Abstract Deep learning models have achieved huge success in numerous fields, such as computer vision and natural language processing. However, unlike such fields, it is hard to apply traditional deep learning models to graph data due to the ‘node-orderless’ property. Normally, adjacency matrices will cast an artificial and random node-order on the graphs, which renders the performance of deep models on graph classification tasks extremely erratic, and the representations learned by such models lack clear interpretability. To eliminate the unnecessary node-order constraint, we propose a novel model named Isomorphic Neural Network (IsoNN), which learns the graph representation by extracting its isomorphic features via the graph matching between input graph and templates. IsoNN has two main components: graph isomorphic feature extraction component and classification component. The graph isomorphic feature extraction component utilizes a set of subgraph templates as the kernel variables to learn the possible subgraph patterns existing in the input graph and then computes the isomorphic features. A set of permutation matrices is used in the component to break the node-order brought by the matrix representation. Three fully-connected layers are used as the classification component in IsoNN. Extensive experiments are conducted on benchmark datasets, and the experimental results demonstrate the effectiveness of IsoNN, especially compared with both classic and state-of-the-art graph classification methods.
Tasks Graph Classification, Graph Matching, Graph Representation Learning, Representation Learning
Published 2019-07-22
URL https://arxiv.org/abs/1907.09495v2
PDF https://arxiv.org/pdf/1907.09495v2.pdf
PWC https://paperswithcode.com/paper/isonn-isomorphic-neural-network-for-graph
Repo https://github.com/linmengsysu/IsoNN
Framework pytorch
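The role of the permutation matrices can be illustrated with a brute-force sketch: a template is compared with a subgraph's adjacency matrix under every node relabeling, and the smallest distance is kept, so the result no longer depends on node order. The squared-difference distance and the function names are assumptions; the paper's exact formulation may differ, and enumerating all k! permutations is only feasible for small templates.

```python
from itertools import permutations


def permute(matrix, perm):
    """Relabel nodes: apply the same permutation to rows and columns."""
    return [[matrix[i][j] for j in perm] for i in perm]


def isomorphic_feature(template, sub_adj):
    """Smallest squared distance between sub_adj and any relabeling of template."""
    k = len(template)
    best = float("inf")
    for perm in permutations(range(k)):
        permuted = permute(template, perm)
        dist = sum((permuted[i][j] - sub_adj[i][j]) ** 2
                   for i in range(k) for j in range(k))
        best = min(best, dist)
    return best


if __name__ == "__main__":
    path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]       # path with node 1 in the middle
    relabeled = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]  # same path, node 0 in the middle
    triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    print(isomorphic_feature(path, relabeled))  # isomorphic graphs
    print(isomorphic_feature(path, triangle))   # differ by one edge
```

A feature of zero indicates that the subgraph matches the template up to node relabeling.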

Deblurring Face Images using Uncertainty Guided Multi-Stream Semantic Networks

Title Deblurring Face Images using Uncertainty Guided Multi-Stream Semantic Networks
Authors Rajeev Yasarla, Federico Perazzi, Vishal M. Patel
Abstract We propose a novel multi-stream architecture and training methodology that exploits semantic labels for facial image deblurring. The proposed Uncertainty Guided Multi-Stream Semantic Network (UMSN) processes regions belonging to each semantic class independently and learns to combine their outputs into the final deblurred result. Pixel-wise semantic labels are obtained using a segmentation network. A predicted confidence measure is used during training to guide the network towards challenging regions of the human face such as the eyes and nose. The entire network is trained in an end-to-end fashion. Comprehensive experiments on three different face datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art face deblurring methods. Code is available at: https://github.com/rajeevyasarla/UMSN-Face-Deblurring
Tasks Deblurring
Published 2019-07-30
URL https://arxiv.org/abs/1907.13106v1
PDF https://arxiv.org/pdf/1907.13106v1.pdf
PWC https://paperswithcode.com/paper/deblurring-face-images-using-uncertainty
Repo https://github.com/rajeevyasarla/UMSN-Face-Deblurring
Framework pytorch

A cross-center smoothness prior for variational Bayesian brain tissue segmentation

Title A cross-center smoothness prior for variational Bayesian brain tissue segmentation
Authors Wouter M. Kouw, Silas N. Ørting, Jens Petersen, Kim S. Pedersen, Marleen de Bruijne
Abstract Suppose one is faced with the challenge of tissue segmentation in MR images, without annotators at their center to provide labeled training data. One option is to go to another medical center for a trained classifier. Sadly, tissue classifiers do not generalize well across centers due to voxel intensity shifts caused by center-specific acquisition protocols. However, certain aspects of segmentations, such as spatial smoothness, remain relatively consistent and can be learned separately. Here we present a smoothness prior that is fit to segmentations produced at another medical center. This informative prior is presented to an unsupervised Bayesian model. The model clusters the voxel intensities, such that it produces segmentations that are similarly smooth to those of the other medical center. In addition, the unsupervised Bayesian model is extended to a semi-supervised variant, which needs no visual interpretation of clusters into tissues.
Published 2019-03-11
URL http://arxiv.org/abs/1903.04191v1
PDF http://arxiv.org/pdf/1903.04191v1.pdf
PWC https://paperswithcode.com/paper/a-cross-center-smoothness-prior-for
Repo https://github.com/wmkouw/cc-smoothprior
Framework none

Multimodal Machine Translation with Embedding Prediction

Title Multimodal Machine Translation with Embedding Prediction
Authors Tosho Hirasawa, Hayahide Yamagishi, Yukio Matsumura, Mamoru Komachi
Abstract Multimodal machine translation is an attractive application of neural machine translation (NMT). It helps computers to deeply understand visual objects and their relations with natural languages. However, multimodal NMT systems suffer from a shortage of available training data, resulting in poor performance for translating rare words. In NMT, pretrained word embeddings have been shown to improve NMT of low-resource domains, and a search-based approach is proposed to address the rare word problem. In this study, we effectively combine these two approaches in the context of multimodal NMT and explore how we can take full advantage of pretrained word embeddings to better translate rare words. We report overall performance improvements of 1.24 METEOR and 2.49 BLEU and achieve an improvement of 7.67 F-score for rare word translation.
Tasks Machine Translation, Multimodal Machine Translation, Word Embeddings
Published 2019-04-01
URL http://arxiv.org/abs/1904.00639v1
PDF http://arxiv.org/pdf/1904.00639v1.pdf
PWC https://paperswithcode.com/paper/multimodal-machine-translation-with-embedding
Repo https://github.com/toshohirasawa/nmtpytorch-emb-pred
Framework pytorch
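The search-based idea referred to in the abstract can be sketched as a nearest-neighbor lookup: the decoder is assumed to output a vector in the pretrained embedding space, and the emitted word is the one whose embedding is most similar. The toy embeddings, the use of cosine similarity and the function names are illustrative assumptions, not details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_word(predicted, embeddings):
    """Return the vocabulary word whose embedding is closest to `predicted`."""
    return max(embeddings, key=lambda word: cosine(predicted, embeddings[word]))


if __name__ == "__main__":
    # Hypothetical 2-d pretrained embeddings.
    embeddings = {"dog": [1.0, 0.0], "cat": [0.0, 1.0]}
    print(nearest_word([0.9, 0.2], embeddings))
```

Because a rare word still has a pretrained vector, it can be retrieved this way even if it seldom appears in the translation training data.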

A Multi-Object Rectified Attention Network for Scene Text Recognition

Title A Multi-Object Rectified Attention Network for Scene Text Recognition
Authors Canjie Luo, Lianwen Jin, Zenghui Sun
Abstract Irregular text is widely used. However, it is considerably difficult to recognize because of its various shapes and distorted patterns. In this paper, we thus propose a multi-object rectified attention network (MORAN) for general scene text recognition. The MORAN consists of a multi-object rectification network and an attention-based sequence recognition network. The multi-object rectification network is designed for rectifying images that contain irregular text. It decreases the difficulty of recognition and enables the attention-based sequence recognition network to more easily read irregular text. It is trained in a weakly supervised way, thus requiring only images and corresponding text labels. The attention-based sequence recognition network focuses on target characters and sequentially outputs the predictions. Moreover, to improve the sensitivity of the attention-based sequence recognition network, a fractional pickup method is proposed for an attention-based decoder in the training phase. With the rectification mechanism, the MORAN can read both regular and irregular scene text. Extensive experiments on various benchmarks are conducted, which show that the MORAN achieves state-of-the-art performance. The source code is available.
Tasks Scene Text Recognition
Published 2019-01-10
URL http://arxiv.org/abs/1901.03003v1
PDF http://arxiv.org/pdf/1901.03003v1.pdf
PWC https://paperswithcode.com/paper/a-multi-object-rectified-attention-network
Repo https://github.com/dipu-bd/craft-moran-ocr
Framework pytorch

Camera Lens Super-Resolution

Title Camera Lens Super-Resolution
Authors Chang Chen, Zhiwei Xiong, Xinmei Tian, Zheng-Jun Zha, Feng Wu
Abstract Existing methods for single image super-resolution (SR) are typically evaluated with synthetic degradation models such as bicubic or Gaussian downsampling. In this paper, we investigate SR from the perspective of camera lenses, named as CameraSR, which aims to alleviate the intrinsic tradeoff between resolution (R) and field-of-view (V) in realistic imaging systems. Specifically, we view the R-V degradation as a latent model in the SR process and learn to reverse it with realistic low- and high-resolution image pairs. To obtain the paired images, we propose two novel data acquisition strategies for two representative imaging systems (i.e., DSLR and smartphone cameras), respectively. Based on the obtained City100 dataset, we quantitatively analyze the performance of commonly-used synthetic degradation models, and demonstrate the superiority of CameraSR as a practical solution to boost the performance of existing SR methods. Moreover, CameraSR can be readily generalized to different content and devices, which serves as an advanced digital zoom tool in realistic imaging systems. Codes and datasets are available at https://github.com/ngchc/CameraSR.
Tasks Image Super-Resolution, Super-Resolution
Published 2019-04-06
URL http://arxiv.org/abs/1904.03378v1
PDF http://arxiv.org/pdf/1904.03378v1.pdf
PWC https://paperswithcode.com/paper/camera-lens-super-resolution
Repo https://github.com/ngchc/CameraSR
Framework tf

AIM 2019 Challenge on Constrained Super-Resolution: Methods and Results

Title AIM 2019 Challenge on Constrained Super-Resolution: Methods and Results
Authors Kai Zhang, Shuhang Gu, Radu Timofte, Zheng Hui, Xiumei Wang, Xinbo Gao, Dongliang Xiong, Shuai Liu, Ruipeng Gang, Nan Nan, Chenghua Li, Xueyi Zou, Ning Kang, Zhan Wang, Hang Xu, Chaofeng Wang, Zheng Li, Linlin Wang, Jun Shi, Wenyu Sun, Zhiqiang Lang, Jiangtao Nie, Wei Wei, Lei Zhang, Yazhe Niu, Peijin Zhuo, Xiangzhen Kong, Long Sun, Wenhao Wang
Abstract This paper reviews the AIM 2019 challenge on constrained example-based single image super-resolution with focus on proposed solutions and results. The challenge had 3 tracks. Taking the three main aspects (i.e., number of parameters, inference/running time, fidelity (PSNR)) of MSRResNet as the baseline, Track 1 aims to reduce the number of parameters while being constrained to maintain or improve the running time and the PSNR result; Tracks 2 and 3 aim to optimize running time and PSNR result, respectively, under constraints on the other two aspects. Each track had an average of 64 registered participants, and 12 teams submitted the final results. They gauge the state-of-the-art in single image super-resolution.
Tasks Image Super-Resolution, Super-Resolution
Published 2019-11-04
URL https://arxiv.org/abs/1911.01249v1
PDF https://arxiv.org/pdf/1911.01249v1.pdf
PWC https://paperswithcode.com/paper/aim-2019-challenge-on-constrained-super
Repo https://github.com/Zheng222/IMDN
Framework pytorch
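Fidelity in the challenge is measured by PSNR. As a reminder of the standard definition, PSNR = 10 * log10(MAX^2 / MSE), here is a minimal sketch for images given as flat lists of pixel values. The 8-bit peak value of 255 and the function name are assumptions; the challenge's own evaluation protocol may differ in details such as border cropping or color handling.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Each pixel is off by 10 gray levels, so the MSE is 100.
    print(psnr([100, 100], [110, 90]))
```

Higher values mean the restored image is closer to the reference.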

Simultaneous Prediction Intervals for Patient-Specific Survival Curves

Title Simultaneous Prediction Intervals for Patient-Specific Survival Curves
Authors Samuel Sokota, Ryan D’Orazio, Khurram Javed, Humza Haider, Russell Greiner
Abstract Accurate models of patient survival probabilities provide important information to clinicians prescribing care for life-threatening and terminal ailments. A recently developed class of models - known as individual survival distributions (ISDs) - produces patient-specific survival functions that offer greater descriptive power of patient outcomes than was previously possible. Unfortunately, at the time of writing, ISD models almost universally lack uncertainty quantification. In this paper, we demonstrate that an existing method for estimating simultaneous prediction intervals from samples can easily be adapted for patient-specific survival curve analysis and yields accurate results. Furthermore, we introduce both a modification to the existing method and a novel method for estimating simultaneous prediction intervals and show that they offer competitive performance. It is worth emphasizing that these methods are not limited to survival analysis and can be applied in any context in which sampling the distribution of interest is tractable. Code is available at https://github.com/ssokota/spie.
Tasks Survival Analysis
Published 2019-06-25
URL https://arxiv.org/abs/1906.10780v1
PDF https://arxiv.org/pdf/1906.10780v1.pdf
PWC https://paperswithcode.com/paper/simultaneous-prediction-intervals-for-patient
Repo https://github.com/ssokota/spie
Framework none