October 21, 2019

Paper Group AWR 20

BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees

Title BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees
Authors Yongjoo Park, Jingyi Qing, Xiaoyang Shen, Barzan Mozafari
Abstract The rising volume of datasets has made training machine learning (ML) models a major computational cost in the enterprise. Given the iterative nature of model and parameter tuning, many analysts use a small sample of their entire data during their initial stage of analysis to make quick decisions (e.g., what features or hyperparameters to use) and use the entire dataset only in later stages (i.e., when they have converged to a specific model). This sampling, however, is performed in an ad-hoc fashion. Most practitioners cannot precisely capture the effect of sampling on the quality of their model, and eventually on their decision-making process during the tuning phase. Moreover, without systematic support for sampling operators, many optimizations and reuse opportunities are lost. In this paper, we introduce BlinkML, a system for fast, quality-guaranteed ML training. BlinkML allows users to make error-computation tradeoffs: instead of training a model on their full data (i.e., full model), BlinkML can quickly train an approximate model with quality guarantees using a sample. The quality guarantees ensure that, with high probability, the approximate model makes the same predictions as the full model. BlinkML currently supports any ML model that relies on maximum likelihood estimation (MLE), which includes Generalized Linear Models (e.g., linear regression, logistic regression, max entropy classifier, Poisson regression) as well as PPCA (Probabilistic Principal Component Analysis). Our experiments show that BlinkML can speed up the training of large-scale ML tasks by 6.26x-629x while guaranteeing the same predictions, with 95% probability, as the full model.
Tasks Decision Making
Published 2018-12-26
URL http://arxiv.org/abs/1812.10564v1
PDF http://arxiv.org/pdf/1812.10564v1.pdf
PWC https://paperswithcode.com/paper/blinkml-efficient-maximum-likelihood
Repo https://github.com/jinw18/Readings_MLDB
Framework none
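
The core mechanism is easy to sketch: train on a sample, use the asymptotic normality of the MLE to estimate how often the sample-trained model would disagree with the full-data model, and enlarge the sample until the estimated agreement clears the target. The sketch below is an illustrative reconstruction for logistic regression, not BlinkML's implementation; the `estimated_agreement` helper and the simplified covariance shortcut are assumptions.

```python
# Illustrative sketch of BlinkML's idea (not the authors' code): estimate the
# probability that a sample-trained logistic regression agrees with the model
# that would be trained on all n_full rows.
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

def estimated_agreement(X_sample, y_sample, X_test, n_full):
    # Large C approximates an unpenalized MLE; no intercept keeps the
    # covariance bookkeeping to the feature weights only (a simplification).
    model = LogisticRegression(C=1e6, fit_intercept=False).fit(X_sample, y_sample)
    w = model.coef_.ravel()
    p = model.predict_proba(X_sample)[:, 1]
    H = (X_sample * (p * (1 - p))[:, None]).T @ X_sample  # observed Fisher info
    # Covariance of (sample MLE - full MLE) shrinks like (1/n - 1/N) * I^{-1}.
    n = len(X_sample)
    cov = np.linalg.inv(H) * n * (1.0 / n - 1.0 / n_full)
    margins = X_test @ w
    stds = np.sqrt(np.einsum('ij,jk,ik->i', X_test, cov, X_test))
    # Probability each test margin keeps its sign under that parameter noise;
    # the mean is the estimated prediction agreement with the full model.
    return norm.cdf(np.abs(margins) / np.maximum(stds, 1e-12)).mean()
```

A BlinkML-style loop would grow the sample and retrain until `estimated_agreement(...) >= 0.95`.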

To Trust Or Not To Trust A Classifier

Title To Trust Or Not To Trust A Classifier
Authors Heinrich Jiang, Been Kim, Melody Y. Guan, Maya Gupta
Abstract Knowing when a classifier’s prediction can be trusted is useful in many applications and critical for safely using AI. While the bulk of the effort in machine learning research has been towards improving classifier performance, understanding when a classifier’s predictions should and should not be trusted has received far less attention. The standard approach is to use the classifier’s discriminant or confidence score; however, we show there exists an alternative that is more effective in many situations. We propose a new score, called the trust score, which measures the agreement between the classifier and a modified nearest-neighbor classifier on the testing example. We show empirically that high (low) trust scores produce surprisingly high precision at identifying correctly (incorrectly) classified examples, consistently outperforming the classifier’s confidence score as well as many other baselines. Further, under some mild distributional assumptions, we show that if the trust score for an example is high (low), the classifier will likely agree (disagree) with the Bayes-optimal classifier. Our guarantees consist of non-asymptotic rates of statistical consistency under various nonparametric settings and build on recent developments in topological data analysis.
Tasks Topological Data Analysis
Published 2018-05-30
URL http://arxiv.org/abs/1805.11783v2
PDF http://arxiv.org/pdf/1805.11783v2.pdf
PWC https://paperswithcode.com/paper/to-trust-or-not-to-trust-a-classifier
Repo https://github.com/SeldonIO/alibi
Framework tf
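
The trust score itself is compact enough to sketch: the ratio of the distance to the nearest training example of the closest *other* class to the distance to the nearest training example of the *predicted* class. The version below skips the paper's alpha-high-density filtering step, so treat it as a simplified reading of the abstract rather than the authors' method.

```python
import numpy as np
from sklearn.neighbors import KDTree

class TrustScore:
    """Simplified trust score: raw nearest-neighbor distances per class,
    without the paper's alpha-high-density filtering."""
    def fit(self, X, y):
        self.classes = np.sort(np.unique(y))
        self.trees = [KDTree(X[y == c]) for c in self.classes]
        return self

    def score(self, X_test, y_pred):
        # Distance from each test point to its nearest neighbor in each class.
        D = np.stack([t.query(X_test, k=1)[0].ravel() for t in self.trees],
                     axis=1)
        idx = np.searchsorted(self.classes, y_pred)
        d_pred = D[np.arange(len(X_test)), idx]
        D_other = D.copy()
        D_other[np.arange(len(X_test)), idx] = np.inf
        # High score: the predicted class is much closer than any other class.
        return D_other.min(axis=1) / np.maximum(d_pred, 1e-12)
```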

AI safety via debate

Title AI safety via debate
Authors Geoffrey Irving, Paul Christiano, Dario Amodei
Abstract To make AI systems broadly useful for challenging real-world tasks, we need them to learn complex human goals and preferences. One approach to specifying complex goals asks humans to judge during training which agent behaviors are safe and useful, but this approach can fail if the task is too complicated for a human to directly judge. To help address this concern, we propose training agents via self-play on a zero-sum debate game. Given a question or proposed action, two agents take turns making short statements up to a limit, then a human judges which of the agents gave the most true, useful information. In an analogy to complexity theory, debate with optimal play can answer any question in PSPACE given polynomial-time judges (direct judging answers only NP questions). In practice, whether debate works involves empirical questions about humans and the tasks we want AIs to perform, plus theoretical questions about the meaning of AI alignment. We report results on an initial MNIST experiment where agents compete to convince a sparse classifier, boosting the classifier’s accuracy from 59.4% to 88.9% given 6 pixels and from 48.2% to 85.2% given 4 pixels. Finally, we discuss theoretical and practical aspects of the debate model, focusing on potential weaknesses as the model scales up, and we propose future human and computer experiments to test these properties.
Tasks
Published 2018-05-02
URL http://arxiv.org/abs/1805.00899v2
PDF http://arxiv.org/pdf/1805.00899v2.pdf
PWC https://paperswithcode.com/paper/ai-safety-via-debate
Repo https://github.com/jvmancuso/safe-debates
Framework pytorch
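
The debate protocol in the MNIST experiment can be sketched in a few lines: agents alternately reveal pixels, then a judge trained on sparse pixel subsets classifies from only what was revealed. `judge`, `honest_agent`, and `liar_agent` below are hypothetical callables standing in for the trained components; this is the protocol shape, not the paper's training code.

```python
import numpy as np

def run_debate(image, judge, honest_agent, liar_agent, n_pixels=6):
    mask = np.zeros(image.shape, dtype=bool)
    agents = [honest_agent, liar_agent]
    for turn in range(n_pixels):
        # The agent whose turn it is reveals one new pixel supporting its claim.
        i, j = agents[turn % 2](image, mask)
        mask[i, j] = True
    # The judge sees only the revealed pixels (all others zeroed) plus the mask.
    return judge(np.where(mask, image, 0.0), mask)
```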

Neural Persistence: A Complexity Measure for Deep Neural Networks Using Algebraic Topology

Title Neural Persistence: A Complexity Measure for Deep Neural Networks Using Algebraic Topology
Authors Bastian Rieck, Matteo Togninalli, Christian Bock, Michael Moor, Max Horn, Thomas Gumbsch, Karsten Borgwardt
Abstract While many approaches to make neural networks more fathomable have been proposed, they are restricted to interrogating the network with input data. Measures for characterizing and monitoring structural properties, however, have not been developed. In this work, we propose neural persistence, a complexity measure for neural network architectures based on topological data analysis on weighted stratified graphs. To demonstrate the usefulness of our approach, we show that neural persistence reflects best practices developed in the deep learning community such as dropout and batch normalization. Moreover, we derive a neural persistence-based stopping criterion that shortens the training process while achieving comparable accuracies as early stopping based on validation loss.
Tasks Topological Data Analysis
Published 2018-12-23
URL https://arxiv.org/abs/1812.09764v3
PDF https://arxiv.org/pdf/1812.09764v3.pdf
PWC https://paperswithcode.com/paper/neural-persistence-a-complexity-measure-for
Repo https://github.com/BorgwardtLab/Neural-Persistence
Framework tf
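
For a single fully connected layer, the measure reduces to zero-dimensional persistent homology of a weighted bipartite graph, which a union-find over edges sorted by normalized absolute weight computes directly. The sketch below is one reading of that construction (persistence pairs of the form (1, w)); it is not the released implementation.

```python
import numpy as np

def neural_persistence(W, p=2):
    """W: (n_in, n_out) weight matrix of one fully connected layer."""
    n_in, n_out = W.shape
    w = np.abs(W) / np.abs(W).max()            # normalize weights to [0, 1]
    parent = list(range(n_in + n_out))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Superlevel-set filtration: insert edges from strongest to weakest and
    # record when connected components of the bipartite graph merge.
    edges = sorted(((w[i, j], i, n_in + j)
                    for i in range(n_in) for j in range(n_out)), reverse=True)
    persistence = []
    for weight, u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            persistence.append(1.0 - weight)   # component born at 1 dies here
    return float(np.linalg.norm(persistence, ord=p))
```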

SO-Net: Self-Organizing Network for Point Cloud Analysis

Title SO-Net: Self-Organizing Network for Point Cloud Analysis
Authors Jiaxin Li, Ben M. Chen, Gim Hee Lee
Abstract This paper presents SO-Net, a permutation-invariant architecture for deep learning with orderless point clouds. SO-Net models the spatial distribution of the point cloud by building a Self-Organizing Map (SOM). Based on the SOM, SO-Net performs hierarchical feature extraction on individual points and SOM nodes, and ultimately represents the input point cloud by a single feature vector. The receptive field of the network can be systematically adjusted by conducting point-to-node k-nearest-neighbor search. In tasks such as point cloud reconstruction, classification, object part segmentation and shape retrieval, our proposed network demonstrates performance similar to or better than state-of-the-art approaches. In addition, the training speed is significantly faster than existing point cloud recognition networks because of the parallelizability and simplicity of the proposed architecture. Our code is available at the project website: https://github.com/lijx10/SO-Net
Tasks
Published 2018-03-12
URL http://arxiv.org/abs/1803.04249v4
PDF http://arxiv.org/pdf/1803.04249v4.pdf
PWC https://paperswithcode.com/paper/so-net-self-organizing-network-for-point
Repo https://github.com/donnyruixu/pc-elm-ae
Framework pytorch
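
The adjustable receptive field comes from point-to-node kNN: each point is associated with its k nearest SOM nodes, and features are aggregated per node. A minimal sketch, assuming the SOM nodes have already been trained (SOM fitting itself is omitted):

```python
import numpy as np

def point_to_node_knn(points, som_nodes, k=3):
    """points: (N, 3) cloud; som_nodes: (M, 3) trained SOM node coordinates.
    Returns (N, k) indices of each point's k nearest SOM nodes; larger k
    widens the receptive field of the node-level features."""
    d = np.linalg.norm(points[:, None, :] - som_nodes[None, :, :], axis=-1)
    return np.argsort(d, axis=1)[:, :k]
```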

A Structured Variational Autoencoder for Contextual Morphological Inflection

Title A Structured Variational Autoencoder for Contextual Morphological Inflection
Authors Lawrence Wolf-Sonkin, Jason Naradowsky, Sabrina J. Mielke, Ryan Cotterell
Abstract Statistical morphological inflectors are typically trained on fully supervised, type-level data. One remaining open research question is the following: How can we effectively exploit raw, token-level data to improve their performance? To this end, we introduce a novel generative latent-variable model for the semi-supervised learning of inflection generation. To enable posterior inference over the latent variables, we derive an efficient variational inference procedure based on the wake-sleep algorithm. We experiment on 23 languages, using the Universal Dependencies corpora in a simulated low-resource setting, and find improvements of over 10% absolute accuracy in some cases.
Tasks Morphological Inflection
Published 2018-06-10
URL https://arxiv.org/abs/1806.03746v2
PDF https://arxiv.org/pdf/1806.03746v2.pdf
PWC https://paperswithcode.com/paper/a-structured-variational-autoencoder-for
Repo https://github.com/LeenaShekhar/NLP-Linguistics-ML-Resources
Framework tf
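
The wake-sleep procedure the abstract refers to alternates two phases: fit the generative model to data using latents from the inference network (wake), then fit the inference network on samples dreamed from the generative model (sleep). The PyTorch sketch below shows only that generic pattern; `infer_net` and `gen_net` are hypothetical modules returning torch distributions, and the standard-normal prior is an assumption, not the paper's structured model.

```python
import torch

def wake_sleep_step(x, infer_net, gen_net, opt_gen, opt_inf):
    # Wake phase: sample z ~ q(z|x) and fit the generative model to the data.
    z = infer_net(x).sample()
    wake_loss = -gen_net(z).log_prob(x).mean()
    opt_gen.zero_grad()
    wake_loss.backward()
    opt_gen.step()

    # Sleep phase: dream (z, x') from the generative model (standard-normal
    # prior assumed here) and fit the inference network to recover z from x'.
    with torch.no_grad():
        z_dream = torch.randn_like(z)
        x_dream = gen_net(z_dream).sample()
    sleep_loss = -infer_net(x_dream).log_prob(z_dream).mean()
    opt_inf.zero_grad()
    sleep_loss.backward()
    opt_inf.step()
    return wake_loss.item(), sleep_loss.item()
```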

AMNet: Memorability Estimation with Attention

Title AMNet: Memorability Estimation with Attention
Authors Jiri Fajtl, Vasileios Argyriou, Dorothy Monekosso, Paolo Remagnino
Abstract In this paper we present the design and evaluation of an end-to-end trainable deep neural network with a visual attention mechanism for memorability estimation in still images. We analyze the suitability of transferring deep models from image classification to the memorability task. We further study the impact of the attention mechanism on memorability estimation and evaluate our network on the SUN Memorability and LaMem datasets. Our network outperforms the existing state-of-the-art models on both datasets in terms of Spearman’s rank correlation as well as mean squared error, closely matching human consistency.
Tasks Image Classification, Transfer Learning
Published 2018-04-09
URL http://arxiv.org/abs/1804.03115v1
PDF http://arxiv.org/pdf/1804.03115v1.pdf
PWC https://paperswithcode.com/paper/amnet-memorability-estimation-with-attention
Repo https://github.com/ok1zjf/amnet
Framework pytorch
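
The attention mechanism can be sketched as a soft spatial map over pretrained CNN features that pools them into one vector for a regression head. This is a simplified single-step reading of the design (the full model applies attention over several recurrent steps); consider it illustrative, not the released AMNet.

```python
import torch
import torch.nn as nn

class AttentionRegressor(nn.Module):
    def __init__(self, channels=2048):
        super().__init__()
        self.attn = nn.Conv2d(channels, 1, kernel_size=1)
        self.head = nn.Linear(channels, 1)

    def forward(self, feats):                  # feats: (B, C, H, W)
        # Softmax over spatial positions gives a normalized attention map.
        a = torch.softmax(self.attn(feats).flatten(2), dim=-1)   # (B, 1, H*W)
        pooled = (feats.flatten(2) * a).sum(-1)                  # (B, C)
        return self.head(pooled).squeeze(-1)                     # (B,) scores
```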

Synaptic Plasticity Dynamics for Deep Continuous Local Learning (DECOLLE)

Title Synaptic Plasticity Dynamics for Deep Continuous Local Learning (DECOLLE)
Authors Jacques Kaiser, Hesham Mostafa, Emre Neftci
Abstract A growing body of work underlines striking similarities between biological neural networks and recurrent, binary neural networks. A smaller body of work, however, discusses similarities between the learning dynamics of deep artificial neural networks and synaptic plasticity in spiking neural networks. The main obstacle is the discrepancy between the dynamical properties of synaptic plasticity and the requirements of gradient backpropagation. Learning algorithms that approximate gradient backpropagation using locally synthesized gradients can overcome this challenge. Here, we show that synthetic gradients enable the derivation of Deep Continuous Local Learning (DECOLLE) in spiking neural networks. DECOLLE is capable of learning deep spatio-temporal representations from spikes, relying solely on local information. Synaptic plasticity rules are derived systematically from user-defined cost functions and neural dynamics by leveraging the existing autodifferentiation methods of machine learning frameworks. We benchmark our approach on MNIST and the event-based neuromorphic DvsGesture dataset, on which DECOLLE performs comparably to the state of the art. DECOLLE networks provide continuously learning machines that are relevant to biology and supportive of event-based, low-power computer vision architectures, matching the accuracies of conventional computers on tasks where temporal precision and speed are essential.
Tasks
Published 2018-11-27
URL https://arxiv.org/abs/1811.10766v3
PDF https://arxiv.org/pdf/1811.10766v3.pdf
PWC https://paperswithcode.com/paper/synaptic-plasticity-dynamics-for-deep
Repo https://github.com/nmi-lab/dcll
Framework pytorch
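
The local-learning recipe can be compressed into three ingredients: a surrogate derivative replaces the non-differentiable spike, each layer gets its own readout and local loss, and the input to every layer is detached so no gradient crosses layer boundaries. A minimal PyTorch sketch of that pattern; the fast-sigmoid surrogate and the plain trainable readout are illustrative choices, not the library's exact ones (DECOLLE's readouts are fixed and random).

```python
import torch
import torch.nn as nn

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike forward, fast-sigmoid derivative backward (one common
    surrogate gradient; other choices are possible)."""
    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 0).float()

    @staticmethod
    def backward(ctx, grad_out):
        v, = ctx.saved_tensors
        return grad_out / (10.0 * v.abs() + 1.0) ** 2

def local_losses(x, layers, readouts, target, loss_fn=nn.MSELoss()):
    losses = []
    for layer, readout in zip(layers, readouts):
        spikes = SurrogateSpike.apply(layer(x))
        # Each layer is trained only through its own readout and local loss.
        losses.append(loss_fn(readout(spikes), target))
        x = spikes.detach()    # no gradient crosses the layer boundary
    return losses
```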

Class-Distinct and Class-Mutual Image Generation with GANs

Title Class-Distinct and Class-Mutual Image Generation with GANs
Authors Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
Abstract Class-conditional extensions of generative adversarial networks (GANs), such as auxiliary classifier GAN (AC-GAN) and conditional GAN (cGAN), have garnered attention owing to their ability to decompose representations into class labels and other factors and to boost the training stability. However, a limitation is that they assume that each class is separable and ignore the relationship between classes even though class overlapping frequently occurs in a real-world scenario when data are collected on the basis of diverse or ambiguous criteria. To overcome this limitation, we address a novel problem called class-distinct and class-mutual image generation, in which the goal is to construct a generator that can capture between-class relationships and generate an image selectively conditioned on the class specificity. To solve this problem without additional supervision, we propose classifier’s posterior GAN (CP-GAN), in which we redesign the generator input and the objective function of AC-GAN for class-overlapping data. Specifically, we incorporate the classifier’s posterior into the generator input and optimize the generator so that the classifier’s posterior of generated data corresponds with that of real data. We demonstrate the effectiveness of CP-GAN using both controlled and real-world class-overlapping data with a model configuration analysis and comparative study. Our code is available at https://github.com/takuhirok/CP-GAN/.
Tasks Conditional Image Generation, Image Generation, Image-to-Image Translation
Published 2018-11-27
URL https://arxiv.org/abs/1811.11163v2
PDF https://arxiv.org/pdf/1811.11163v2.pdf
PWC https://paperswithcode.com/paper/class-distinct-and-class-mutual-image
Repo https://github.com/takuhirok/NR-GAN
Framework pytorch
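
The redesign is sketchable: the generator is conditioned on a classifier posterior rather than a one-hot label, and its class loss pushes the classifier's posterior on generated images toward the conditioning posterior. The concatenated-input generator signature and the KL form below are illustrative assumptions, not the released objective.

```python
import torch
import torch.nn.functional as F

def generator_class_loss(generator, classifier, posterior, z):
    """posterior: (B, n_classes) class probabilities used as the condition;
    z: (B, z_dim) noise. The concatenated input is an assumption."""
    fake = generator(torch.cat([z, posterior], dim=1))
    log_q = F.log_softmax(classifier(fake), dim=1)
    # Match the classifier's posterior on generated images to the condition.
    return F.kl_div(log_q, posterior, reduction='batchmean')
```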

Diverse Conditional Image Generation by Stochastic Regression with Latent Drop-Out Codes

Title Diverse Conditional Image Generation by Stochastic Regression with Latent Drop-Out Codes
Authors Yang He, Bernt Schiele, Mario Fritz
Abstract Recent advances in Deep Learning and probabilistic modeling have led to strong improvements in generative models for images. On the one hand, Generative Adversarial Networks (GANs) have contributed a highly effective adversarial learning procedure, but still suffer from stability issues. On the other hand, Conditional Variational Auto-Encoder (CVAE) models provide a sound way of conditional modeling but suffer from mode-mixing issues. Therefore, recent work has turned back to simple and stable regression models that are effective at generation but give up on the sampling mechanism and the latent code representation. We propose a novel and efficient stochastic regression approach with latent drop-out codes that combines the merits of both lines of research. In addition, a new training objective increases coverage of the training distribution, leading to improvements over the state of the art in terms of accuracy as well as diversity.
Tasks Conditional Image Generation, Image Generation
Published 2018-08-03
URL http://arxiv.org/abs/1808.01121v1
PDF http://arxiv.org/pdf/1808.01121v1.pdf
PWC https://paperswithcode.com/paper/diverse-conditional-image-generation-by
Repo https://github.com/SSAW14/Image_Generation_with_Latent_Code
Framework caffe2
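
One way to read the approach: a sampled binary dropout mask acts as the latent code, each mask yields one prediction, and penalizing only the best of K samples leaves the other codes free to cover other modes. The `model(x, mask)` signature and `code_dim` attribute below are hypothetical; this is a best-of-K reading of the coverage idea, not the paper's exact objective.

```python
import torch

def best_of_k_loss(model, x, y, k=8, p=0.5):
    """model(x, mask) and model.code_dim are hypothetical: the binary dropout
    mask plays the role of the latent code. y: (B, D) regression targets."""
    per_code = []
    for _ in range(k):
        mask = (torch.rand(x.size(0), model.code_dim,
                           device=x.device) > p).float()
        pred = model(x, mask)
        per_code.append(((pred - y) ** 2).flatten(1).mean(dim=1))
    # Penalizing only the best code per example leaves the remaining codes
    # free to specialize on other plausible outputs (a coverage objective).
    return torch.stack(per_code, dim=1).min(dim=1).values.mean()
```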

Unprocessing Images for Learned Raw Denoising

Title Unprocessing Images for Learned Raw Denoising
Authors Tim Brooks, Ben Mildenhall, Tianfan Xue, Jiawen Chen, Dillon Sharlet, Jonathan T. Barron
Abstract Machine learning techniques work best when the data used for training resembles the data used for evaluation. This holds true for learned single-image denoising algorithms, which are applied to real raw camera sensor readings but, due to practical constraints, are often trained on synthetic image data. Though it is understood that generalizing from synthetic to real data requires careful consideration of the noise properties of image sensors, the other aspects of a camera’s image processing pipeline (gain, color correction, tone mapping, etc.) are often overlooked, despite their significant effect on how raw measurements are transformed into finished images. To address this, we present a technique to “unprocess” images by inverting each step of an image processing pipeline, thereby allowing us to synthesize realistic raw sensor measurements from commonly available internet photos. We additionally model the relevant components of an image processing pipeline when evaluating our loss function, which allows training to be aware of all relevant photometric processing that will occur after denoising. By processing and unprocessing model outputs and training data in this way, we are able to train a simple convolutional neural network that has 14%-38% lower error rates and is 9x-18x faster than the previous state of the art on the Darmstadt Noise Dataset, and generalizes to sensors outside of that dataset as well.
Tasks Denoising, Image Denoising
Published 2018-11-27
URL http://arxiv.org/abs/1811.11127v1
PDF http://arxiv.org/pdf/1811.11127v1.pdf
PWC https://paperswithcode.com/paper/unprocessing-images-for-learned-raw-denoising
Repo https://github.com/google-research/google-research/tree/master/unprocessing
Framework tf
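
Unprocessing inverts each sRGB-producing stage in reverse order and then injects sensor noise. The sketch below follows the stages named in the abstract; the smoothstep tone curve, identity color matrix, gain, and noise levels are illustrative stand-ins rather than the paper's calibrated values, and mosaicing is omitted.

```python
import numpy as np

def unprocess(srgb, gain=1.25, ccm=np.eye(3), shot=0.01, read=0.0005):
    """srgb: (H, W, 3) image in [0, 1]. Returns a synthetic noisy raw image."""
    x = np.clip(srgb, 0.0, 1.0)
    x = 0.5 - np.sin(np.arcsin(1.0 - 2.0 * x) / 3.0)   # invert smoothstep tone map
    x = np.where(x <= 0.04045, x / 12.92,
                 ((x + 0.055) / 1.055) ** 2.4)          # sRGB gamma -> linear
    x = x @ np.linalg.inv(ccm).T                        # undo color correction
    x = np.clip(x / gain, 0.0, 1.0)                     # undo digital gain
    # Signal-dependent shot noise plus constant read noise, as in real sensors.
    noisy = x + np.random.normal(scale=np.sqrt(shot * x + read))
    return np.clip(noisy, 0.0, 1.0)
```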

Joint Deformable Registration of Large EM Image Volumes: A Matrix Solver Approach

Title Joint Deformable Registration of Large EM Image Volumes: A Matrix Solver Approach
Authors Khaled Khairy, Gennady Denisov, Stephan Saalfeld
Abstract Large electron microscopy image datasets for connectomics are typically composed of thousands to millions of partially overlapping two-dimensional images (tiles), which must be registered into a coherent volume prior to further analysis. A common registration strategy is to find matching features between neighboring and overlapping image pairs, followed by a numerical estimation of optimal image deformation using a so-called solver program. Existing solvers are inadequate for large data volumes, and inefficient for small-scale image registration. In this work, an efficient and accurate matrix-based solver method is presented. A linear system is constructed that combines minimization of feature-pair square distances with explicit constraints in a regularization term. In absence of reliable priors for regularization, we show how to construct a rigid-model approximation to use as prior. The linear system is solved using available computer programs, whose performance on typical registration tasks we briefly compare, and to which future scale-up is delegated. Our method is applied to the joint alignment of 2.67 million images, with more than 200 million point-pairs and has been used for successfully aligning the first full adult fruit fly brain.
Tasks Image Registration
Published 2018-04-26
URL http://arxiv.org/abs/1804.10019v1
PDF http://arxiv.org/pdf/1804.10019v1.pdf
PWC https://paperswithcode.com/paper/joint-deformable-registration-of-large-em
Repo https://github.com/khaledkhairy/EM_aligner
Framework none
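
The solver formulation can be shown on a toy translation-only model: stack one equation per matched feature pair into a sparse system, append a regularization block that pulls every tile toward a prior (the rigid-model approximation in the paper), and solve the whole volume jointly. A minimal sketch, far simpler than the paper's deformable models:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

def solve_translations(pairs, deltas, n_tiles, prior, lam=0.1):
    """pairs: (i, j) tile index pairs; deltas: observed 2-D offsets t_j - t_i
    implied by matched features; prior: (n_tiles, 2) rigid-model guess."""
    rows, cols, vals = [], [], []
    for r, (i, j) in enumerate(pairs):
        rows += [r, r]
        cols += [i, j]
        vals += [-1.0, 1.0]
    A = sp.csr_matrix((vals, (rows, cols)), shape=(len(pairs), n_tiles))
    # Append lam * I so tiles without matches stay near the prior and the
    # system stays full rank (the regularization term from the abstract).
    A_reg = sp.vstack([A, lam * sp.identity(n_tiles)])
    b = np.vstack([np.asarray(deltas, dtype=float), lam * np.asarray(prior)])
    # Solve x and y coordinates independently with a sparse least-squares solver.
    return np.column_stack([lsqr(A_reg, b[:, k])[0] for k in range(2)])
```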

Representation Learning with Contrastive Predictive Coding

Title Representation Learning with Contrastive Predictive Coding
Authors Aaron van den Oord, Yazhe Li, Oriol Vinyals
Abstract While supervised learning has enabled great progress in many applications, unsupervised learning has not seen such widespread adoption, and remains an important and challenging endeavor for artificial intelligence. In this work, we propose a universal unsupervised learning approach to extract useful representations from high-dimensional data, which we call Contrastive Predictive Coding. The key insight of our model is to learn such representations by predicting the future in latent space by using powerful autoregressive models. We use a probabilistic contrastive loss which induces the latent space to capture information that is maximally useful to predict future samples. It also makes the model tractable by using negative sampling. While most prior work has focused on evaluating representations for a particular modality, we demonstrate that our approach is able to learn useful representations achieving strong performance on four distinct domains: speech, images, text and reinforcement learning in 3D environments.
Tasks Representation Learning, Self-Supervised Image Classification, Semi-Supervised Image Classification
Published 2018-07-10
URL http://arxiv.org/abs/1807.03748v2
PDF http://arxiv.org/pdf/1807.03748v2.pdf
PWC https://paperswithcode.com/paper/representation-learning-with-contrastive
Repo https://github.com/jefflai108/Contrastive-Predictive-Coding-PyTorch
Framework pytorch
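
The probabilistic contrastive loss is the now-standard InfoNCE objective: score each predicted future latent against the true future and in-batch negatives with a softmax cross-entropy. A generic sketch, not DeepMind's implementation:

```python
import torch
import torch.nn.functional as F

def info_nce(pred, future):
    """pred, future: (B, D). Row i's positive is future[i]; the other rows
    of the batch serve as negatives."""
    logits = pred @ future.t()                       # (B, B) similarity scores
    labels = torch.arange(pred.size(0), device=pred.device)
    return F.cross_entropy(logits, labels)
```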

Revisiting Single Image Depth Estimation: Toward Higher Resolution Maps with Accurate Object Boundaries

Title Revisiting Single Image Depth Estimation: Toward Higher Resolution Maps with Accurate Object Boundaries
Authors Junjie Hu, Mete Ozay, Yan Zhang, Takayuki Okatani
Abstract This paper considers the problem of single image depth estimation. The employment of convolutional neural networks (CNNs) has recently brought about significant advancements in the research of this problem. However, most existing methods suffer from loss of spatial resolution in the estimated depth maps; a typical symptom is distorted and blurry reconstruction of object boundaries. In this paper, toward more accurate estimation with a focus on depth maps with higher spatial resolution, we propose two improvements to existing approaches. One is about the strategy of fusing features extracted at different scales, for which we propose an improved network architecture consisting of four modules: an encoder, decoder, multi-scale feature fusion module, and refinement module. The other is about loss functions for measuring inference errors used in training. We show that three loss terms, which measure errors in depth, gradients and surface normals, respectively, contribute to improvement of accuracy in a complementary fashion. Experimental results show that these two improvements enable us to attain higher accuracy than the current state of the art, reflected in finer-resolution reconstruction of, for example, small objects and object boundaries.
Tasks Depth Estimation, Monocular Depth Estimation
Published 2018-03-23
URL http://arxiv.org/abs/1803.08673v2
PDF http://arxiv.org/pdf/1803.08673v2.pdf
PWC https://paperswithcode.com/paper/revisiting-single-image-depth-estimation
Repo https://github.com/Xt-Chen/SARPN
Framework pytorch
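
The three complementary loss terms are straightforward to write down: an error on depth values, an error on depth gradients, and an angular error on the surface normals those gradients imply. A generic PyTorch sketch with equal weights; the paper's exact definitions and weighting may differ.

```python
import torch
import torch.nn.functional as F

def depth_losses(pred, gt):                    # pred, gt: (B, 1, H, W)
    l_depth = (pred - gt).abs().mean()
    dx_p = pred[..., :, 1:] - pred[..., :, :-1]   # horizontal gradients
    dy_p = pred[..., 1:, :] - pred[..., :-1, :]   # vertical gradients
    dx_g = gt[..., :, 1:] - gt[..., :, :-1]
    dy_g = gt[..., 1:, :] - gt[..., :-1, :]
    l_grad = (dx_p - dx_g).abs().mean() + (dy_p - dy_g).abs().mean()
    # Surface normal implied by the gradients: n = (-dx, -dy, 1), compared
    # by cosine similarity on the overlapping (H-1, W-1) region.
    n_p = torch.cat([-dx_p[..., :-1, :], -dy_p[..., :, :-1],
                     torch.ones_like(dx_p[..., :-1, :])], dim=1)
    n_g = torch.cat([-dx_g[..., :-1, :], -dy_g[..., :, :-1],
                     torch.ones_like(dx_g[..., :-1, :])], dim=1)
    l_normal = (1 - F.cosine_similarity(n_p, n_g, dim=1)).mean()
    return l_depth + l_grad + l_normal
```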

SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering

Title SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering
Authors Chenguang Zhu, Michael Zeng, Xuedong Huang
Abstract Conversational question answering (CQA) is a novel QA task that requires understanding of dialogue context. Different from traditional single-turn machine reading comprehension (MRC) tasks, CQA includes passage comprehension, coreference resolution, and contextual understanding. In this paper, we propose an innovative contextualized attention-based deep neural network, SDNet, to fuse context into traditional MRC models. Our model leverages both inter-attention and self-attention to comprehend the conversation context and extract relevant information from the passage. Furthermore, we demonstrate a novel method to integrate the latest BERT contextual model. Empirical results show the effectiveness of our model, which sets a new state-of-the-art result on the CoQA leaderboard, outperforming the previous best model by 1.6% F1. Our ensemble model further improves the result by 2.7% F1.
Tasks Coreference Resolution, Machine Reading Comprehension, Question Answering, Reading Comprehension
Published 2018-12-10
URL http://arxiv.org/abs/1812.03593v5
PDF http://arxiv.org/pdf/1812.03593v5.pdf
PWC https://paperswithcode.com/paper/sdnet-contextualized-attention-based-deep
Repo https://github.com/mpandeydev/SDnetmod
Framework pytorch
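
The inter-attention that fuses conversation context can be sketched as each passage position gathering question information through softmax-normalized dot-product attention. A generic single-head sketch, not the released SDNet layer:

```python
import torch

def inter_attention(passage, question):
    """passage: (B, Lp, D); question: (B, Lq, D). Each passage position
    attends over question positions and keeps its own representation."""
    scores = passage @ question.transpose(1, 2)               # (B, Lp, Lq)
    weights = torch.softmax(scores, dim=-1)
    return torch.cat([passage, weights @ question], dim=-1)   # (B, Lp, 2D)
```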