July 28, 2019

Paper Group ANR 371

Graph Fourier Transform with Negative Edges for Depth Image Coding. Semi-Supervised Learning via New Deep Network Inversion. Image to Image Translation for Domain Adaptation. Memory Matching Networks for Genomic Sequence Classification. Interactive Natural Language Acquisition in a Multi-modal Recurrent Neural Architecture. Neural Program Meta-Indu …

Graph Fourier Transform with Negative Edges for Depth Image Coding

Title Graph Fourier Transform with Negative Edges for Depth Image Coding
Authors Weng-Tai Su, Gene Cheung, Chia-Wen Lin
Abstract Recent advances in graph signal processing (GSP) have led to the development of new graph-based transforms and wavelets for image / video coding, where the underlying graph describes inter-pixel correlations. In this paper, we develop a new transform called signed graph Fourier transform (SGFT), where the underlying graph G contains negative edges that describe anti-correlations between pixel pairs. Specifically, we first construct a one-state Markov process that models both inter-pixel correlations and anti-correlations. We then derive the corresponding precision matrix, and show that the loopy graph Laplacian matrix Q of a graph G with a negative edge and two self-loops at its end nodes is approximately equivalent to it. This proves that the eigenvectors of Q - called SGFT - approximate the optimal Karhunen-Loève Transform (KLT). We show the importance of the self-loops in G to ensure Q is positive semi-definite. We prove that the first eigenvector of Q is piecewise constant (PWC), and thus can well approximate a piecewise smooth (PWS) signal like a depth image. Experimental results show that a block-based coding scheme based on SGFT outperforms a previous scheme using graph transforms with only positive edges for several depth images.
Tasks
Published 2017-02-10
URL http://arxiv.org/abs/1702.03105v2
PDF http://arxiv.org/pdf/1702.03105v2.pdf
PWC https://paperswithcode.com/paper/graph-fourier-transform-with-negative-edges
Repo
Framework
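
To make the role of the self-loops concrete, here is a minimal sketch on a four-pixel line graph. The edge weights, the self-loop weight of 2w, and the helper name loopy_laplacian are assumptions chosen for illustration; they are not taken from the paper. With these choices the quadratic form of the negative edge plus its two self-loops becomes w(x_i + x_j)^2, which is non-negative.

```python
import numpy as np

def loopy_laplacian(n, edges, self_loops):
    """Build Q = D - W + diag(self_loops) for a signed, undirected graph."""
    W = np.zeros((n, n))
    for i, j, weight in edges:
        W[i, j] = W[j, i] = weight
    D = np.diag(W.sum(axis=1))
    return D - W + np.diag(self_loops)

# Four pixels on a line: positive edges inside each region and one
# negative edge (weight -w) across an assumed depth discontinuity.
w = 0.5
edges = [(0, 1, 1.0), (1, 2, -w), (2, 3, 1.0)]

# Without self-loops the matrix is indefinite.
Q_no_loops = loopy_laplacian(4, edges, [0.0, 0.0, 0.0, 0.0])

# Assumption: self-loops of weight 2w at both end nodes of the negative edge.
Q = loopy_laplacian(4, edges, [0.0, 2 * w, 2 * w, 0.0])

eigvals, eigvecs = np.linalg.eigh(Q)  # eigenvalues in ascending order
first = eigvecs[:, 0]                 # basis vector with the smallest eigenvalue

print("min eigenvalue without self-loops:", np.linalg.eigvalsh(Q_no_loops)[0])
print("min eigenvalue with self-loops:   ", eigvals[0])
print("first eigenvector:", first)
```

In this toy graph the first eigenvector is constant on each side of the negative edge and changes sign across it, which is the piecewise constant behaviour the abstract describes.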

Semi-Supervised Learning via New Deep Network Inversion

Title Semi-Supervised Learning via New Deep Network Inversion
Authors Randall Balestriero, Vincent Roger, Herve G. Glotin, Richard G. Baraniuk
Abstract We exploit a recently derived inversion scheme for arbitrary deep neural networks to develop a new semi-supervised learning framework that applies to a wide range of systems and problems. The approach outperforms current state-of-the-art methods on MNIST reaching $99.14\%$ of test set accuracy while using $5$ labeled examples per class. Experiments with one-dimensional signals highlight the generality of the method. Importantly, our approach is simple, efficient, and requires no change in the deep network architecture.
Tasks
Published 2017-11-12
URL http://arxiv.org/abs/1711.04313v1
PDF http://arxiv.org/pdf/1711.04313v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-learning-via-new-deep-network-1
Repo
Framework

Image to Image Translation for Domain Adaptation

Title Image to Image Translation for Domain Adaptation
Authors Zak Murez, Soheil Kolouri, David Kriegman, Ravi Ramamoorthi, Kyungnam Kim
Abstract We propose a general framework for unsupervised domain adaptation, which allows deep neural networks trained on a source domain to be tested on a different target domain without requiring any training annotations in the target domain. This is achieved by adding extra networks and losses that help regularize the features extracted by the backbone encoder network. To this end we propose the novel use of the recently proposed unpaired image-to-image translation framework to constrain the features extracted by the encoder network. Specifically, we require that the features extracted are able to reconstruct the images in both domains. In addition we require that the distributions of features extracted from images in the two domains are indistinguishable. Many recent works can be seen as specific cases of our general framework. We apply our method for domain adaptation between MNIST, USPS, and SVHN datasets, and Amazon, Webcam and DSLR Office datasets in classification tasks, and also between GTA5 and Cityscapes datasets for a segmentation task. We demonstrate state-of-the-art performance on each of these datasets.
Tasks Domain Adaptation, Image-to-Image Translation, Unsupervised Domain Adaptation
Published 2017-12-01
URL http://arxiv.org/abs/1712.00479v1
PDF http://arxiv.org/pdf/1712.00479v1.pdf
PWC https://paperswithcode.com/paper/image-to-image-translation-for-domain
Repo
Framework

Memory Matching Networks for Genomic Sequence Classification

Title Memory Matching Networks for Genomic Sequence Classification
Authors Jack Lanchantin, Ritambhara Singh, Yanjun Qi
Abstract When analyzing the genome, researchers have discovered that proteins bind to DNA based on certain patterns of the DNA sequence known as “motifs”. However, it is difficult to manually construct motifs due to their complexity. Recently, externally learned memory models have proven to be effective methods for reasoning over inputs and supporting sets. In this work, we present memory matching networks (MMN) for classifying DNA sequences as protein binding sites. Our model learns a memory bank of encoded motifs, which are dynamic memory modules, and then matches a new test sequence to each of the motifs to classify the sequence as a binding or nonbinding site.
Tasks
Published 2017-02-22
URL http://arxiv.org/abs/1702.06760v1
PDF http://arxiv.org/pdf/1702.06760v1.pdf
PWC https://paperswithcode.com/paper/memory-matching-networks-for-genomic-sequence
Repo
Framework

Interactive Natural Language Acquisition in a Multi-modal Recurrent Neural Architecture

Title Interactive Natural Language Acquisition in a Multi-modal Recurrent Neural Architecture
Authors Stefan Heinrich, Stefan Wermter
Abstract For the complex human brain that enables us to communicate in natural language, we have gathered a good understanding of the principles underlying language acquisition and processing, knowledge about socio-cultural conditions, and insights into activity patterns in the brain. However, we are not yet able to understand the behavioural and mechanistic characteristics of natural language and how mechanisms in the brain allow us to acquire and process language. In bridging the insights from behavioural psychology and neuroscience, the goal of this paper is to contribute a computational understanding of appropriate characteristics that favour language acquisition. Accordingly, we provide concepts and refinements in cognitive modelling regarding principles and mechanisms in the brain and propose a neurocognitively plausible model for embodied language acquisition from real world interaction of a humanoid robot with its environment. In particular, the architecture consists of a continuous time recurrent neural network, where parts have different leakage characteristics and thus operate on multiple timescales for every modality and the association of the higher level nodes of all modalities into cell assemblies. The model is capable of learning language production grounded in both temporal dynamic somatosensation and vision, and features hierarchical concept abstraction, concept decomposition, multi-modal integration, and self-organisation of latent representations.
Tasks Language Acquisition
Published 2017-03-24
URL http://arxiv.org/abs/1703.08513v2
PDF http://arxiv.org/pdf/1703.08513v2.pdf
PWC https://paperswithcode.com/paper/interactive-natural-language-acquisition-in-a
Repo
Framework
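
The idea of units with different leakage characteristics can be illustrated with a generic leaky-integrator update for a discretised continuous time recurrent network. This is a sketch of the standard textbook form, not the architecture from the paper; the function name ctrnn_step, the tanh activation, the time constants, and the random weights are all assumptions.

```python
import numpy as np

def ctrnn_step(u, x, W, tau):
    """One Euler step of a leaky-integrator recurrent layer.

    u   : internal state of each unit
    x   : external input to each unit
    W   : recurrent weight matrix
    tau : per-unit time constant (larger means slower leakage)
    """
    return (1.0 - 1.0 / tau) * u + (1.0 / tau) * (W @ np.tanh(u) + x)

rng = np.random.default_rng(0)
tau = np.array([2.0, 2.0, 50.0, 50.0])   # two fast units, two slow units (assumed values)
W = 0.1 * rng.standard_normal((4, 4))    # hypothetical recurrent weights

u0 = np.zeros(4)
u1 = ctrnn_step(u0, np.ones(4), W, tau)  # same input drives every unit
print(u1)
```

Starting from a zero state, each unit moves toward the input by a fraction 1/tau, so the slow units respond far less to the same input than the fast ones. That difference in response speed is what lets separate parts of such a network track different timescales.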

Neural Program Meta-Induction

Title Neural Program Meta-Induction
Authors Jacob Devlin, Rudy Bunel, Rishabh Singh, Matthew Hausknecht, Pushmeet Kohli
Abstract Most recently proposed methods for Neural Program Induction work under the assumption of having a large set of input/output (I/O) examples for learning any underlying input-output mapping. This paper aims to address the problem of data and computation efficiency of program induction by leveraging information from related tasks. Specifically, we propose two approaches for cross-task knowledge transfer to improve program induction in limited-data scenarios. In our first proposal, portfolio adaptation, a set of induction models is pretrained on a set of related tasks, and the best model is adapted towards the new task using transfer learning. In our second approach, meta program induction, a $k$-shot learning approach is used to make a model generalize to new tasks without additional training. To test the efficacy of our methods, we constructed a new benchmark of programs written in the Karel programming language. Using an extensive experimental evaluation on the Karel benchmark, we demonstrate that our proposals dramatically outperform the baseline induction method that does not use knowledge transfer. We also analyze the relative performance of the two approaches and study conditions in which they perform best. In particular, meta induction outperforms all existing approaches under extreme data sparsity (when a very small number of examples is available), i.e., fewer than ten. As the number of available I/O examples increases (i.e., a thousand or more), portfolio adapted program induction becomes the best approach. For intermediate data sizes, we demonstrate that the combined method of adapted meta program induction has the strongest performance.
Tasks Transfer Learning
Published 2017-10-11
URL http://arxiv.org/abs/1710.04157v1
PDF http://arxiv.org/pdf/1710.04157v1.pdf
PWC https://paperswithcode.com/paper/neural-program-meta-induction
Repo
Framework

Data Innovation for International Development: An overview of natural language processing for qualitative data analysis

Title Data Innovation for International Development: An overview of natural language processing for qualitative data analysis
Authors Philipp Broniecki, Anna Hanchar, Slava J. Mikhaylov
Abstract Availability, collection and access to quantitative data, as well as its limitations, often make qualitative data the resource upon which development programs heavily rely. Both traditional interview data and social media analysis can provide rich contextual information and are essential for research, appraisal, monitoring and evaluation. These data may be difficult to process and analyze both systematically and at scale. This, in turn, limits the capacity for timely data-driven decision-making, which is essential in fast-evolving complex social systems. In this paper, we discuss the potential of using natural language processing to systematize analysis of qualitative data, and to inform quick decision-making in the development context. We illustrate this with interview data generated in a format of micro-narratives for the UNDP Fragments of Impact project.
Tasks Decision Making
Published 2017-09-16
URL http://arxiv.org/abs/1709.05563v1
PDF http://arxiv.org/pdf/1709.05563v1.pdf
PWC https://paperswithcode.com/paper/data-innovation-for-international-development
Repo
Framework

Multi-objective Bandits: Optimizing the Generalized Gini Index

Title Multi-objective Bandits: Optimizing the Generalized Gini Index
Authors Robert Busa-Fekete, Balazs Szorenyi, Paul Weng, Shie Mannor
Abstract We study the multi-armed bandit (MAB) problem where the agent receives a vectorial feedback that encodes many possibly competing objectives to be optimized. The goal of the agent is to find a policy, which can optimize these objectives simultaneously in a fair way. This multi-objective online optimization problem is formalized by using the Generalized Gini Index (GGI) aggregation function. We propose an online gradient descent algorithm which exploits the convexity of the GGI aggregation function, and controls the exploration in a careful way achieving a distribution-free regret $\tilde{O}(T^{-1/2})$ with high probability. We test our algorithm on synthetic data as well as on an electric battery control problem where the goal is to trade off the use of the different cells of a battery in order to balance their respective degradation rates.
Tasks
Published 2017-06-15
URL http://arxiv.org/abs/1706.04933v1
PDF http://arxiv.org/pdf/1706.04933v1.pdf
PWC https://paperswithcode.com/paper/multi-objective-bandits-optimizing-the
Repo
Framework
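
A small sketch can show how a GGI-style aggregation rewards balanced objective vectors. It assumes the cost-minimisation convention, in which the components are sorted in decreasing order and paired with non-increasing weights, so the largest cost gets the largest weight. The function name ggi and the weight values are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def ggi(costs, weights):
    """Generalized Gini Index of a cost vector (assumed convention).

    The costs are sorted in decreasing order and combined with
    non-increasing weights, so the worst objective counts the most.
    """
    costs = np.asarray(costs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.any(np.diff(weights) > 0):
        raise ValueError("weights must be non-increasing")
    return float(np.sort(costs)[::-1] @ weights)

weights = [1.0, 0.5]                   # hypothetical weights
balanced = ggi([1.0, 1.0], weights)    # both objectives share the cost
unbalanced = ggi([2.0, 0.0], weights)  # same total cost, one objective bears it all

print(balanced, unbalanced)
```

Both vectors have the same total cost, yet the balanced one gets the lower index. Because the sort makes the function symmetric in its arguments, swapping the two objectives leaves the value unchanged.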

Measuring Sample Quality with Kernels

Title Measuring Sample Quality with Kernels
Authors Jackson Gorham, Lester Mackey
Abstract Approximate Markov chain Monte Carlo (MCMC) offers the promise of more rapid sampling at the cost of more biased inference. Since standard MCMC diagnostics fail to detect these biases, researchers have developed computable Stein discrepancy measures that provably determine the convergence of a sample to its target distribution. This approach was recently combined with the theory of reproducing kernels to define a closed-form kernel Stein discrepancy (KSD) computable by summing kernel evaluations across pairs of sample points. We develop a theory of weak convergence for KSDs based on Stein’s method, demonstrate that commonly used KSDs fail to detect non-convergence even for Gaussian targets, and show that kernels with slowly decaying tails provably determine convergence for a large class of target distributions. The resulting convergence-determining KSDs are suitable for comparing biased, exact, and deterministic sample sequences and simpler to compute and parallelize than alternative Stein discrepancies. We use our tools to compare biased samplers, select sampler hyperparameters, and improve upon existing KSD approaches to one-sample hypothesis testing and sample quality improvement.
Tasks
Published 2017-03-06
URL http://arxiv.org/abs/1703.01717v8
PDF http://arxiv.org/pdf/1703.01717v8.pdf
PWC https://paperswithcode.com/paper/measuring-sample-quality-with-kernels
Repo
Framework
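
The phrase "summing kernel evaluations across pairs of sample points" can be made concrete in one dimension. The sketch below computes a V-statistic kernel Stein discrepancy with an inverse multiquadric kernel k(x, y) = (c^2 + (x - y)^2)^beta, a kernel with slowly decaying tails. The function name ksd_imq_1d, the defaults c = 1 and beta = -1/2, and the standard normal target are assumptions for illustration; the paper's general multivariate setting is not reproduced here.

```python
import numpy as np

def ksd_imq_1d(samples, score, c=1.0, beta=-0.5):
    """V-statistic kernel Stein discrepancy in 1D with an IMQ kernel.

    score(x) is the derivative of the log target density at x.
    """
    x = np.asarray(samples, dtype=float)
    r = x[:, None] - x[None, :]          # pairwise differences x_i - x_j
    u = c**2 + r**2
    k = u**beta                          # k(x, y)
    dk_dx = 2 * beta * r * u**(beta - 1) # derivative in the first argument
    dk_dy = -dk_dx                       # derivative in the second argument
    d2k = (-2 * beta * u**(beta - 1)
           - 4 * beta * (beta - 1) * r**2 * u**(beta - 2))  # mixed derivative
    s = score(x)
    # Stein kernel: s(x) s(y) k + s(x) dk/dy + s(y) dk/dx + d2k/dxdy
    k0 = (s[:, None] * s[None, :] * k
          + s[:, None] * dk_dy
          + s[None, :] * dk_dx
          + d2k)
    return float(np.sqrt(max(k0.mean(), 0.0)))

rng = np.random.default_rng(0)
standard_normal_score = lambda x: -x     # score of the N(0, 1) target

good = rng.standard_normal(300)          # sample from the target
bad = good + 1.0                         # same sample shifted off target

ksd_good = ksd_imq_1d(good, standard_normal_score)
ksd_bad = ksd_imq_1d(bad, standard_normal_score)
print(ksd_good, ksd_bad)
```

The shifted sample should score a clearly larger discrepancy than the on-target sample. Only the score function of the target is needed, so the normalising constant of the density never enters the computation.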

Prediction-Constrained Training for Semi-Supervised Mixture and Topic Models

Title Prediction-Constrained Training for Semi-Supervised Mixture and Topic Models
Authors Michael C. Hughes, Leah Weiner, Gabriel Hope, Thomas H. McCoy Jr., Roy H. Perlis, Erik B. Sudderth, Finale Doshi-Velez
Abstract Supervisory signals have the potential to make low-dimensional data representations, like those learned by mixture and topic models, more interpretable and useful. We propose a framework for training latent variable models that explicitly balances two goals: recovery of faithful generative explanations of high-dimensional data, and accurate prediction of associated semantic labels. Existing approaches fail to achieve these goals due to an incomplete treatment of a fundamental asymmetry: the intended application is always predicting labels from data, not data from labels. Our prediction-constrained objective for training generative models coherently integrates loss-based supervisory signals while enabling effective semi-supervised learning from partially labeled data. We derive learning algorithms for semi-supervised mixture and topic models using stochastic gradient descent with automatic differentiation. We demonstrate improved prediction quality compared to several previous supervised topic models, achieving predictions competitive with high-dimensional logistic regression on text sentiment analysis and electronic health records tasks while simultaneously learning interpretable topics.
Tasks Latent Variable Models, Sentiment Analysis, Topic Models
Published 2017-07-23
URL http://arxiv.org/abs/1707.07341v1
PDF http://arxiv.org/pdf/1707.07341v1.pdf
PWC https://paperswithcode.com/paper/prediction-constrained-training-for-semi
Repo
Framework

Accelerating Approximate Bayesian Computation with Quantile Regression: Application to Cosmological Redshift Distributions

Title Accelerating Approximate Bayesian Computation with Quantile Regression: Application to Cosmological Redshift Distributions
Authors Tomasz Kacprzak, Jörg Herbel, Adam Amara, Alexandre Réfrégier
Abstract Approximate Bayesian Computation (ABC) is a method to obtain a posterior distribution without a likelihood function, using simulations and a set of distance metrics. For that reason, it has recently been gaining popularity as an analysis tool in cosmology and astrophysics. Its drawback, however, is a slow convergence rate. We propose a novel method, which we call qABC, to accelerate ABC with Quantile Regression. In this method, we create a model of quantiles of distance measure as a function of input parameters. This model is trained on a small number of simulations and estimates which regions of the prior space are likely to be accepted into the posterior. Other regions are then immediately rejected. This procedure is then repeated as more simulations are available. We apply it to the practical problem of estimation of redshift distribution of cosmological samples, using forward modelling developed in previous work. The qABC method converges to nearly the same posterior as the basic ABC. It uses, however, only 20% of the number of simulations compared to basic ABC, achieving a fivefold gain in execution time for our problem. For other problems the acceleration rate may vary; it depends on how close the prior is to the final posterior. We discuss possible improvements and extensions to this method.
Tasks
Published 2017-07-24
URL http://arxiv.org/abs/1707.07498v2
PDF http://arxiv.org/pdf/1707.07498v2.pdf
PWC https://paperswithcode.com/paper/accelerating-approximate-bayesian-computation
Repo
Framework
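
For readers unfamiliar with the baseline, the following is a minimal sketch of basic rejection ABC, the slow procedure that qABC aims to accelerate. It does not implement the quantile regression step. The toy Gaussian model, the uniform prior, the distance on sample means, and the threshold eps are all assumptions chosen for illustration.

```python
import numpy as np

def abc_rejection(observed, simulate, sample_prior, distance, eps, n_draws, rng):
    """Basic rejection ABC: keep prior draws whose simulation lands near the data."""
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)                      # draw a parameter from the prior
        simulated = simulate(theta, rng)               # forward-simulate a data set
        if distance(simulated, observed) <= eps:       # accept if close enough
            accepted.append(theta)
    return np.array(accepted)

rng = np.random.default_rng(0)
true_mean = 2.0                                        # hypothetical ground truth
observed = rng.normal(true_mean, 1.0, size=100)

posterior = abc_rejection(
    observed,
    simulate=lambda theta, rng: rng.normal(theta, 1.0, size=100),
    sample_prior=lambda rng: rng.uniform(-5.0, 5.0),
    distance=lambda sim, obs: abs(sim.mean() - obs.mean()),
    eps=0.2,
    n_draws=5000,
    rng=rng,
)

print(len(posterior), "accepted out of 5000 draws")
print("approximate posterior mean:", posterior.mean())
```

Most prior draws are rejected here because the prior is much wider than the posterior. That waste is the cost the abstract refers to, and it is why the achievable speed-up depends on how close the prior is to the final posterior.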

Criticality & Deep Learning I: Generally Weighted Nets

Title Criticality & Deep Learning I: Generally Weighted Nets
Authors Dan Oprisa, Peter Toth
Abstract Motivated by the idea that criticality and universality of phase transitions might play a crucial role in achieving and sustaining learning and intelligent behaviour in biological and artificial networks, we analyse a theoretical and a pragmatic experimental set up for critical phenomena in deep learning. On the theoretical side, we use results from statistical physics to carry out critical point calculations in feed-forward/fully connected networks, while on the experimental side we set out to find traces of criticality in deep neural networks. This is our first step in a series of upcoming investigations to map out the relationship between criticality and learning in deep networks.
Tasks
Published 2017-02-26
URL http://arxiv.org/abs/1702.08039v2
PDF http://arxiv.org/pdf/1702.08039v2.pdf
PWC https://paperswithcode.com/paper/criticality-deep-learning-i-generally
Repo
Framework

Information Perspective to Probabilistic Modeling: Boltzmann Machines versus Born Machines

Title Information Perspective to Probabilistic Modeling: Boltzmann Machines versus Born Machines
Authors Song Cheng, Jing Chen, Lei Wang
Abstract We compare and contrast the statistical physics and quantum physics inspired approaches for unsupervised generative modeling of classical data. The two approaches represent probabilities of observed data using energy-based models and quantum states respectively. Classical and quantum information patterns of the target datasets therefore provide principled guidelines for structural design and learning in these two approaches. Taking the restricted Boltzmann machines (RBM) as an example, we analyze the information theoretical bounds of the two approaches. We verify our reasonings by comparing the performance of RBMs of various architectures on the standard MNIST datasets.
Tasks
Published 2017-12-12
URL http://arxiv.org/abs/1712.04144v1
PDF http://arxiv.org/pdf/1712.04144v1.pdf
PWC https://paperswithcode.com/paper/information-perspective-to-probabilistic
Repo
Framework

CASSL: Curriculum Accelerated Self-Supervised Learning

Title CASSL: Curriculum Accelerated Self-Supervised Learning
Authors Adithyavairavan Murali, Lerrel Pinto, Dhiraj Gandhi, Abhinav Gupta
Abstract Recent self-supervised learning approaches focus on using a few thousand data points to learn policies for high-level, low-dimensional action spaces. However, scaling this framework for high-dimensional control requires either scaling up the data collection efforts or using a clever sampling strategy for training. We present a novel approach - Curriculum Accelerated Self-Supervised Learning (CASSL) - to train policies that map visual information to high-level, higher-dimensional action spaces. CASSL orders the sampling of training data based on control dimensions: the learning and sampling are focused on a few control parameters before other parameters. The right curriculum for learning is suggested by variance-based global sensitivity analysis of the control space. We apply our CASSL framework to learning how to grasp using an adaptive, underactuated multi-fingered gripper, a challenging system to control. Our experimental results indicate that CASSL provides significant improvement and generalization compared to baseline methods such as staged curriculum learning (8% increase) and complete end-to-end learning with random exploration (14% improvement) tested on a set of novel objects.
Tasks
Published 2017-08-04
URL http://arxiv.org/abs/1708.01354v2
PDF http://arxiv.org/pdf/1708.01354v2.pdf
PWC https://paperswithcode.com/paper/cassl-curriculum-accelerated-self-supervised
Repo
Framework

A Capacity Scaling Law for Artificial Neural Networks

Title A Capacity Scaling Law for Artificial Neural Networks
Authors Gerald Friedland, Mario Krell
Abstract We derive the calculation of two critical numbers predicting the behavior of perceptron networks. First, we derive the calculation of what we call the lossless memory (LM) dimension. The LM dimension is a generalization of the Vapnik–Chervonenkis (VC) dimension that avoids structured data and therefore provides an upper bound for perfectly fitting almost any training data. Second, we derive what we call the MacKay (MK) dimension. This limit indicates a 50% chance of not being able to train a given function. Our derivations are performed by embedding a neural network into Shannon’s communication model, which allows us to interpret the two points as capacities measured in bits. We present a proof and practical experiments that validate our upper bounds with repeatable experiments using different network configurations, diverse implementations, varying activation functions, and several learning algorithms. The bottom line is that the two capacity points scale strictly linearly with the number of weights. Among other practical applications, our result allows us to compare and benchmark different neural network implementations independent of a concrete learning task. Our results provide insight into the capabilities and limits of neural networks and generate valuable know-how for experimental design decisions.
Tasks
Published 2017-08-20
URL http://arxiv.org/abs/1708.06019v3
PDF http://arxiv.org/pdf/1708.06019v3.pdf
PWC https://paperswithcode.com/paper/a-capacity-scaling-law-for-artificial-neural
Repo
Framework