October 21, 2019


Paper Group AWR 159



End-to-end music source separation: is it possible in the waveform domain?

Title End-to-end music source separation: is it possible in the waveform domain?
Authors Francesc Lluís, Jordi Pons, Xavier Serra
Abstract Most of the currently successful source separation techniques use the magnitude spectrogram as input, and are therefore by default omitting part of the signal: the phase. To avoid omitting potentially useful information, we study the viability of using end-to-end models for music source separation — which take into account all the information available in the raw audio signal, including the phase. Although during the last decades end-to-end music source separation has been considered almost unattainable, our results confirm that waveform-based models can perform similarly (if not better) than a spectrogram-based deep learning model. Namely: a Wavenet-based model we propose and Wave-U-Net can outperform DeepConvSep, a recent spectrogram-based deep learning model.
Tasks Music Source Separation
Published 2018-10-29
URL https://arxiv.org/abs/1810.12187v2
PDF https://arxiv.org/pdf/1810.12187v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-music-source-separation-is-it
Repo https://github.com/francesclluis/source-separation-wavenet
Framework tf

IntPhys: A Framework and Benchmark for Visual Intuitive Physics Reasoning

Title IntPhys: A Framework and Benchmark for Visual Intuitive Physics Reasoning
Authors Ronan Riochet, Mario Ynocente Castro, Mathieu Bernard, Adam Lerer, Rob Fergus, Véronique Izard, Emmanuel Dupoux
Abstract In order to reach human performance on complex visual tasks, artificial systems need to incorporate a significant amount of understanding of the world in terms of macroscopic objects, movements, forces, etc. Inspired by work on intuitive physics in infants, we propose an evaluation benchmark which diagnoses how much a given system understands about physics by testing whether it can tell apart well matched videos of possible versus impossible events constructed with a game engine. The test requires systems to compute a physical plausibility score over an entire video. It is free of bias and can test a range of basic physical reasoning concepts. We then describe two Deep Neural Networks systems aimed at learning intuitive physics in an unsupervised way, using only physically possible videos. The systems are trained with a future semantic mask prediction objective and tested on the possible versus impossible discrimination task. The analysis of their results compared to human data gives novel insights in the potentials and limitations of next frame prediction architectures.
Tasks
Published 2018-03-20
URL https://arxiv.org/abs/1803.07616v3
PDF https://arxiv.org/pdf/1803.07616v3.pdf
PWC https://paperswithcode.com/paper/intphys-a-framework-and-benchmark-for-visual
Repo https://github.com/rronan/IntPhys-Baselines
Framework pytorch
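
The abstract describes scoring whole videos for physical plausibility and telling matched possible and impossible videos apart. One simple way to turn such scores into a number is a pairwise accuracy over matched pairs; this is a minimal sketch under that assumption, not the benchmark's official metric. The function name `pairwise_accuracy` and the example scores are hypothetical.

```python
def pairwise_accuracy(pairs):
    """Fraction of matched pairs where the possible video scores higher.

    pairs: list of (score_possible, score_impossible) tuples, one per
    matched pair of videos. Ties count as errors (an assumption).
    """
    if not pairs:
        raise ValueError("need at least one pair of scores")
    correct = sum(1 for possible, impossible in pairs if possible > impossible)
    return correct / len(pairs)


# Hypothetical plausibility scores for four matched pairs.
example_pairs = [(0.9, 0.2), (0.4, 0.6), (0.7, 0.1), (0.5, 0.5)]
print(pairwise_accuracy(example_pairs))  # 0.5
```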

Multitask Parsing Across Semantic Representations

Title Multitask Parsing Across Semantic Representations
Authors Daniel Hershcovich, Omri Abend, Ari Rappoport
Abstract The ability to consolidate information of different types is at the core of intelligence, and has tremendous practical value in allowing learning for one task to benefit from generalizations learned for others. In this paper we tackle the challenging task of improving semantic parsing performance, taking UCCA parsing as a test case, and AMR, SDP and Universal Dependencies (UD) parsing as auxiliary tasks. We experiment on three languages, using a uniform transition-based system and learning architecture for all parsing tasks. Despite notable conceptual, formal and domain differences, we show that multitask learning significantly improves UCCA parsing in both in-domain and out-of-domain settings.
Tasks Semantic Parsing
Published 2018-05-01
URL http://arxiv.org/abs/1805.00287v1
PDF http://arxiv.org/pdf/1805.00287v1.pdf
PWC https://paperswithcode.com/paper/multitask-parsing-across-semantic
Repo https://github.com/danielhers/tupa
Framework none

Transfer Learning via Unsupervised Task Discovery for Visual Question Answering

Title Transfer Learning via Unsupervised Task Discovery for Visual Question Answering
Authors Hyeonwoo Noh, Taehoon Kim, Jonghwan Mun, Bohyung Han
Abstract We study how to leverage off-the-shelf visual and linguistic data to cope with out-of-vocabulary answers in visual question answering task. Existing large-scale visual datasets with annotations such as image class labels, bounding boxes and region descriptions are good sources for learning rich and diverse visual concepts. However, it is not straightforward how the visual concepts can be captured and transferred to visual question answering models due to missing link between question dependent answering models and visual data without question. We tackle this problem in two steps: 1) learning a task conditional visual classifier, which is capable of solving diverse question-specific visual recognition tasks, based on unsupervised task discovery and 2) transferring the task conditional visual classifier to visual question answering models. Specifically, we employ linguistic knowledge sources such as structured lexical database (e.g. WordNet) and visual descriptions for unsupervised task discovery, and transfer a learned task conditional visual classifier as an answering unit in a visual question answering model. We empirically show that the proposed algorithm generalizes to out-of-vocabulary answers successfully using the knowledge transferred from the visual dataset.
Tasks Question Answering, Transfer Learning, Visual Question Answering
Published 2018-10-03
URL http://arxiv.org/abs/1810.02358v2
PDF http://arxiv.org/pdf/1810.02358v2.pdf
PWC https://paperswithcode.com/paper/transfer-learning-via-unsupervised-task
Repo https://github.com/HyeonwooNoh/vqa_task_discovery
Framework tf

Spherical CNNs

Title Spherical CNNs
Authors Taco S. Cohen, Mario Geiger, Jonas Koehler, Max Welling
Abstract Convolutional Neural Networks (CNNs) have become the method of choice for learning problems involving 2D planar images. However, a number of problems of recent interest have created a demand for models that can analyze spherical images. Examples include omnidirectional vision for drones, robots, and autonomous cars, molecular regression problems, and global weather and climate modelling. A naive application of convolutional networks to a planar projection of the spherical signal is destined to fail, because the space-varying distortions introduced by such a projection will make translational weight sharing ineffective. In this paper we introduce the building blocks for constructing spherical CNNs. We propose a definition for the spherical cross-correlation that is both expressive and rotation-equivariant. The spherical correlation satisfies a generalized Fourier theorem, which allows us to compute it efficiently using a generalized (non-commutative) Fast Fourier Transform (FFT) algorithm. We demonstrate the computational efficiency, numerical accuracy, and effectiveness of spherical CNNs applied to 3D model recognition and atomization energy regression.
Tasks
Published 2018-01-30
URL http://arxiv.org/abs/1801.10130v3
PDF http://arxiv.org/pdf/1801.10130v3.pdf
PWC https://paperswithcode.com/paper/spherical-cnns
Repo https://github.com/jonas-koehler/s2cnn
Framework pytorch

Image Inpainting for Irregular Holes Using Partial Convolutions

Title Image Inpainting for Irregular Holes Using Partial Convolutions
Authors Guilin Liu, Fitsum A. Reda, Kevin J. Shih, Ting-Chun Wang, Andrew Tao, Bryan Catanzaro
Abstract Existing deep learning based image inpainting methods use a standard convolutional network over the corrupted image, using convolutional filter responses conditioned on both valid pixels as well as the substitute values in the masked holes (typically the mean value). This often leads to artifacts such as color discrepancy and blurriness. Post-processing is usually used to reduce such artifacts, but are expensive and may fail. We propose the use of partial convolutions, where the convolution is masked and renormalized to be conditioned on only valid pixels. We further include a mechanism to automatically generate an updated mask for the next layer as part of the forward pass. Our model outperforms other methods for irregular masks. We show qualitative and quantitative comparisons with other methods to validate our approach.
Tasks Image Inpainting
Published 2018-04-20
URL http://arxiv.org/abs/1804.07723v2
PDF http://arxiv.org/pdf/1804.07723v2.pdf
PWC https://paperswithcode.com/paper/image-inpainting-for-irregular-holes-using
Repo https://github.com/SimonDele/School-projects
Framework none
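
The partial convolution described in the abstract (masked, renormalized, with an updated mask passed to the next layer) can be sketched in plain Python. This is an illustration only: the renormalization factor used here (window size divided by the number of valid pixels) and the rule that an output position becomes valid when its window contains at least one valid pixel are assumptions, and `partial_conv2d` is a hypothetical name rather than code from the linked repository.

```python
def partial_conv2d(image, mask, kernel, bias=0.0):
    """Valid-mode partial convolution on nested lists.

    image, mask: 2D lists of the same shape; mask holds 1 for valid pixels
    and 0 for holes. Returns (output, new_mask).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    window_size = kh * kw
    output = [[0.0] * out_w for _ in range(out_h)]
    new_mask = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            valid = 0
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    m = mask[i + di][j + dj]
                    valid += m
                    # Hole pixels are multiplied by 0 and never contribute.
                    acc += kernel[di][dj] * image[i + di][j + dj] * m
            if valid > 0:
                # Rescale so the response does not shrink with fewer valid pixels.
                output[i][j] = acc * window_size / valid + bias
                new_mask[i][j] = 1
    return output, new_mask


# A 3x3 patch of value 2.0 with a corrupted centre pixel (99.0) marked as a hole.
image = [[2.0, 2.0, 2.0], [2.0, 99.0, 2.0], [2.0, 2.0, 2.0]]
mask = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
mean_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
out, updated = partial_conv2d(image, mask, mean_kernel)
print(round(out[0][0], 6), updated)  # 2.0 [[1]]
```

With an ordinary convolution the corrupted centre value would leak into the mean; here the output depends only on the eight valid neighbours.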

Interpretable Neuron Structuring with Graph Spectral Regularization

Title Interpretable Neuron Structuring with Graph Spectral Regularization
Authors Alexander Tong, David van Dijk, Jay S. Stanley III, Matthew Amodio, Kristina Yim, Rebecca Muhle, James Noonan, Guy Wolf, Smita Krishnaswamy
Abstract While neural networks are powerful approximators used to classify or embed data into lower dimensional spaces, they are often regarded as black boxes with uninterpretable features. Here we propose Graph Spectral Regularization for making hidden layers more interpretable without significantly impacting performance on the primary task. Taking inspiration from spatial organization and localization of neuron activations in biological networks, we use a graph Laplacian penalty to structure the activations within a layer. This penalty encourages activations to be smooth either on a predetermined graph or on a feature-space graph learned from the data via co-activations of a hidden layer of the neural network. We show numerous uses for this additional structure including cluster indication and visualization in biological and image data sets.
Tasks
Published 2018-09-30
URL https://arxiv.org/abs/1810.00424v5
PDF https://arxiv.org/pdf/1810.00424v5.pdf
PWC https://paperswithcode.com/paper/graph-spectral-regularization-for-neural
Repo https://github.com/KrishnaswamyLab/GraphSpectralRegularization
Framework tf
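
A graph Laplacian penalty of the kind the abstract mentions can be written as a sum of squared activation differences over the edges of a graph, which equals the quadratic form a^T L a for the Laplacian L. The sketch below shows that identity on a small chain graph; the function names, the edge-list format, and the example graph are illustrative assumptions, not the authors' implementation.

```python
def laplacian_penalty(activations, edges):
    """Sum of w * (a_i - a_j)^2 over weighted undirected edges (i, j, w)."""
    return sum(w * (activations[i] - activations[j]) ** 2 for i, j, w in edges)


def laplacian_matrix(n, edges):
    """Dense graph Laplacian L = D - A for n nodes."""
    lap = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        lap[i][i] += w
        lap[j][j] += w
        lap[i][j] -= w
        lap[j][i] -= w
    return lap


def quadratic_form(vector, matrix):
    """Compute v^T M v for a dense matrix given as nested lists."""
    n = len(vector)
    return sum(vector[i] * matrix[i][j] * vector[j]
               for i in range(n) for j in range(n))


# Four neurons arranged in a chain 0-1-2-3 with unit edge weights.
chain = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
smooth = [1.0, 1.0, 1.0, 1.0]
rough = [1.0, -1.0, 1.0, -1.0]
print(laplacian_penalty(smooth, chain))  # 0.0
print(laplacian_penalty(rough, chain))   # 12.0
print(quadratic_form(rough, laplacian_matrix(4, chain)))  # 12.0
```

Added to a training loss with a small weight, such a term would favour activations that vary smoothly across neighbouring neurons in the graph.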

CapsDeMM: Capsule network for Detection of Munro’s Microabscess in skin biopsy images

Title CapsDeMM: Capsule network for Detection of Munro’s Microabscess in skin biopsy images
Authors Anabik Pal, Akshay Chaturvedi, Utpal Garain, Aditi Chandra, Raghunath Chatterjee, Swapan Senapati
Abstract This paper presents an approach for automatic detection of Munro’s Microabscess in stratum corneum (SC) of human skin biopsy in order to realize a machine assisted diagnosis of Psoriasis. The challenge of detecting neutrophils in presence of nucleated cells is solved using the recent advances of deep learning algorithms. Separation of SC layer, extraction of patches from the layer followed by classification of patches with respect to presence or absence of neutrophils form the basis of the overall approach which is effected through an integration of a U-Net based segmentation network and a capsule network for classification. The novel design of the present capsule net leads to a drastic reduction in the number of parameters without any noticeable compromise in the overall performance. The research further addresses the challenge of dealing with Mega-pixel images (in 10X) vis-a-vis Giga-pixel ones (in 40X). The promising result coming out of an experiment on a dataset consisting of 273 real-life images shows that a practical system is possible based on the present research. The implementation of our system is available at https://github.com/Anabik/CapsDeMM.
Tasks
Published 2018-08-20
URL http://arxiv.org/abs/1808.06428v2
PDF http://arxiv.org/pdf/1808.06428v2.pdf
PWC https://paperswithcode.com/paper/capsdemm-capsule-network-for-detection-of
Repo https://github.com/Anabik/CapsDeMM
Framework tf

Adaptive Sampling for Coarse Ranking

Title Adaptive Sampling for Coarse Ranking
Authors Sumeet Katariya, Lalit Jain, Nandana Sengupta, James Evans, Robert Nowak
Abstract We consider the problem of active coarse ranking, where the goal is to sort items according to their means into clusters of pre-specified sizes, by adaptively sampling from their reward distributions. This setting is useful in many social science applications involving human raters and the approximate rank of every item is desired. Approximate or coarse ranking can significantly reduce the number of ratings required in comparison to the number needed to find an exact ranking. We propose a computationally efficient PAC algorithm LUCBRank for coarse ranking, and derive an upper bound on its sample complexity. We also derive a nearly matching distribution-dependent lower bound. Experiments on synthetic as well as real-world data show that LUCBRank performs better than state-of-the-art baseline methods, even when these methods have the advantage of knowing the underlying parametric model.
Tasks
Published 2018-02-20
URL http://arxiv.org/abs/1802.07176v1
PDF http://arxiv.org/pdf/1802.07176v1.pdf
PWC https://paperswithcode.com/paper/adaptive-sampling-for-coarse-ranking
Repo https://github.com/sumeetsk/coarse_ranking
Framework none

Markerless tracking of user-defined features with deep learning

Title Markerless tracking of user-defined features with deep learning
Authors Alexander Mathis, Pranav Mamidanna, Taiga Abe, Kevin M. Cury, Venkatesh N. Murthy, Mackenzie W. Mathis, Matthias Bethge
Abstract Quantifying behavior is crucial for many applications in neuroscience. Videography provides easy methods for the observation and recording of animal behavior in diverse settings, yet extracting particular aspects of a behavior for further analysis can be highly time consuming. In motor control studies, humans or other animals are often marked with reflective markers to assist with computer-based tracking, yet markers are intrusive (especially for smaller animals), and the number and location of the markers must be determined a priori. Here, we present a highly efficient method for markerless tracking based on transfer learning with deep neural networks that achieves excellent results with minimal training data. We demonstrate the versatility of this framework by tracking various body parts in a broad collection of experimental settings: mice odor trail-tracking, egg-laying behavior in drosophila, and mouse hand articulation in a skilled forelimb task. For example, during the skilled reaching behavior, individual joints can be automatically tracked (and a confidence score is reported). Remarkably, even when a small number of frames are labeled ($\approx 200$), the algorithm achieves excellent tracking performance on test frames that is comparable to human accuracy.
Tasks Animal Pose Estimation, Transfer Learning
Published 2018-04-09
URL http://arxiv.org/abs/1804.03142v1
PDF http://arxiv.org/pdf/1804.03142v1.pdf
PWC https://paperswithcode.com/paper/markerless-tracking-of-user-defined-features
Repo https://github.com/orkqueen/depplabseongil
Framework tf

RePr: Improved Training of Convolutional Filters

Title RePr: Improved Training of Convolutional Filters
Authors Aaditya Prakash, James Storer, Dinei Florencio, Cha Zhang
Abstract A well-trained Convolutional Neural Network can easily be pruned without significant loss of performance. This is because of unnecessary overlap in the features captured by the network’s filters. Innovations in network architecture such as skip/dense connections and Inception units have mitigated this problem to some extent, but these improvements come with increased computation and memory requirements at run-time. We attempt to address this problem from another angle - not by changing the network structure but by altering the training method. We show that by temporarily pruning and then restoring a subset of the model’s filters, and repeating this process cyclically, overlap in the learned features is reduced, producing improved generalization. We show that the existing model-pruning criteria are not optimal for selecting filters to prune in this context and introduce inter-filter orthogonality as the ranking criteria to determine under-expressive filters. Our method is applicable both to vanilla convolutional networks and more complex modern architectures, and improves the performance across a variety of tasks, especially when applied to smaller networks.
Tasks
Published 2018-11-18
URL http://arxiv.org/abs/1811.07275v3
PDF http://arxiv.org/pdf/1811.07275v3.pdf
PWC https://paperswithcode.com/paper/repr-improved-training-of-convolutional
Repo https://github.com/siahuat0727/RePr
Framework pytorch
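
The abstract names inter-filter orthogonality as the criterion for ranking filters but does not give a formula. The sketch below is one plausible reading, stated as an assumption: normalize each flattened filter, compare the matrix of pairwise dot products with the identity, and treat the filters with the largest average deviation as the most redundant. The function names and the example filters are hypothetical.

```python
import math


def orthogonality_scores(filters):
    """Per-filter overlap score; higher means less orthogonal to the rest.

    filters: list of flattened filter weight vectors of equal length.
    """
    normalized = []
    for f in filters:
        norm = math.sqrt(sum(x * x for x in f)) or 1.0  # guard all-zero filters
        normalized.append([x / norm for x in f])
    n = len(normalized)
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            dot = sum(a * b for a, b in zip(normalized[i], normalized[j]))
            target = 1.0 if i == j else 0.0  # identity matrix entry
            total += abs(dot - target)
        scores.append(total / n)
    return scores


def filters_to_prune(filters, count):
    """Indices of the `count` filters with the highest overlap scores."""
    scores = orthogonality_scores(filters)
    ranked = sorted(range(len(filters)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:count])


# Filter 2 is nearly parallel to filter 0, so it is ranked as most redundant.
example_filters = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]]
print(filters_to_prune(example_filters, 1))  # [2]
```

In a training loop following the abstract's description, the selected filters would be dropped for some iterations, then restored and trained again, with the cycle repeated.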

Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning

Title Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
Authors Charles H. Martin, Michael W. Mahoney
Abstract Random Matrix Theory (RMT) is applied to analyze weight matrices of Deep Neural Networks (DNNs), including both production quality, pre-trained models such as AlexNet and Inception, and smaller models trained from scratch, such as LeNet5 and a miniature-AlexNet. Empirical and theoretical results clearly indicate that the DNN training process itself implicitly implements a form of Self-Regularization. The empirical spectral density (ESD) of DNN layer matrices displays signatures of traditionally-regularized statistical models, even in the absence of exogenously specifying traditional forms of explicit regularization. Building on relatively recent results in RMT, most notably its extension to Universality classes of Heavy-Tailed matrices, we develop a theory to identify 5+1 Phases of Training, corresponding to increasing amounts of Implicit Self-Regularization. These phases can be observed during the training process as well as in the final learned DNNs. For smaller and/or older DNNs, this Implicit Self-Regularization is like traditional Tikhonov regularization, in that there is a “size scale” separating signal from noise. For state-of-the-art DNNs, however, we identify a novel form of Heavy-Tailed Self-Regularization, similar to the self-organization seen in the statistical physics of disordered systems. This results from correlations arising at all size scales, which arises implicitly due to the training process itself. This implicit Self-Regularization can depend strongly on the many knobs of the training process. By exploiting the generalization gap phenomena, we demonstrate that we can cause a small model to exhibit all 5+1 phases of training simply by changing the batch size. This demonstrates that—all else being equal—DNN optimization with larger batch sizes leads to less-well implicitly-regularized models, and it provides an explanation for the generalization gap phenomena.
Tasks
Published 2018-10-02
URL http://arxiv.org/abs/1810.01075v1
PDF http://arxiv.org/pdf/1810.01075v1.pdf
PWC https://paperswithcode.com/paper/implicit-self-regularization-in-deep-neural
Repo https://github.com/CalculatedContent/ImplicitSelfRegularization
Framework pytorch

Limited Evaluation Evolutionary Optimization of Large Neural Networks

Title Limited Evaluation Evolutionary Optimization of Large Neural Networks
Authors Jonas Prellberg, Oliver Kramer
Abstract Stochastic gradient descent is the most prevalent algorithm to train neural networks. However, other approaches such as evolutionary algorithms are also applicable to this task. Evolutionary algorithms bring unique trade-offs that are worth exploring, but computational demands have so far restricted exploration to small networks with few parameters. We implement an evolutionary algorithm that executes entirely on the GPU, which allows to efficiently batch-evaluate a whole population of networks. Within this framework, we explore the limited evaluation evolutionary algorithm for neural network training and find that its batch evaluation idea comes with a large accuracy trade-off. In further experiments, we explore crossover operators and find that unprincipled random uniform crossover performs extremely well. Finally, we train a network with 92k parameters on MNIST using an EA and achieve 97.6 % test accuracy compared to 98 % test accuracy on the same network trained with Adam. Code is available at https://github.com/jprellberg/gpuea.
Tasks
Published 2018-06-26
URL http://arxiv.org/abs/1806.09819v1
PDF http://arxiv.org/pdf/1806.09819v1.pdf
PWC https://paperswithcode.com/paper/limited-evaluation-evolutionary-optimization
Repo https://github.com/jprellberg/gpuea
Framework tf
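
Random uniform crossover, which the abstract reports as performing well, can be sketched in a few lines: each parameter of the child is copied from one of the two parents with equal probability. This is a minimal CPU illustration on flat parameter lists, assuming a 50/50 choice per parameter; it is not the GPU implementation from the linked repository, and `uniform_crossover` is a hypothetical name.

```python
import random


def uniform_crossover(parent_a, parent_b, rng):
    """Build a child by picking each gene from either parent with probability 0.5.

    parent_a, parent_b: flat lists of network parameters of equal length.
    rng: a random.Random instance, passed in so runs can be reproduced.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of parameters")
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


rng = random.Random(0)
child = uniform_crossover([0.0] * 8, [1.0] * 8, rng)
print(child)  # a mix of 0.0 and 1.0 entries, one per parameter
```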

A Convolutional Autoencoder Approach to Learn Volumetric Shape Representations for Brain Structures

Title A Convolutional Autoencoder Approach to Learn Volumetric Shape Representations for Brain Structures
Authors Evan M. Yu, Mert R. Sabuncu
Abstract We propose a novel machine learning strategy for studying neuroanatomical shape variation. Our model works with volumetric binary segmentation images, and requires no pre-processing such as the extraction of surface points or a mesh. The learned shape descriptor is invariant to affine transformations, including shifts, rotations and scaling. Thanks to the adopted autoencoder framework, inter-subject differences are automatically enhanced in the learned representation, while intra-subject variances are minimized. Our experimental results on a shape retrieval task showed that the proposed representation outperforms a state-of-the-art benchmark for brain structures extracted from MRI scans.
Tasks
Published 2018-10-17
URL http://arxiv.org/abs/1810.07746v1
PDF http://arxiv.org/pdf/1810.07746v1.pdf
PWC https://paperswithcode.com/paper/a-convolutional-autoencoder-approach-to-learn
Repo https://github.com/evanmy/voxel_shape_analysis
Framework pytorch

Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters

Title Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters
Authors Marton Havasi, Robert Peharz, José Miguel Hernández-Lobato
Abstract While deep neural networks are a highly successful model class, their large memory footprint puts considerable strain on energy consumption, communication bandwidth, and storage requirements. Consequently, model size reduction has become an utmost goal in deep learning. A typical approach is to train a set of deterministic weights, while applying certain techniques such as pruning and quantization, in order that the empirical weight distribution becomes amenable to Shannon-style coding schemes. However, as shown in this paper, relaxing weight determinism and using a full variational distribution over weights allows for more efficient coding schemes and consequently higher compression rates. In particular, following the classical bits-back argument, we encode the network weights using a random sample, requiring only a number of bits corresponding to the Kullback-Leibler divergence between the sampled variational distribution and the encoding distribution. By imposing a constraint on the Kullback-Leibler divergence, we are able to explicitly control the compression rate, while optimizing the expected loss on the training set. The employed encoding scheme can be shown to be close to the optimal information-theoretical lower bound, with respect to the employed variational family. Our method sets new state-of-the-art in neural network compression, as it strictly dominates previous approaches in a Pareto sense: On the benchmarks LeNet-5/MNIST and VGG-16/CIFAR-10, our approach yields the best test performance for a fixed memory budget, and vice versa, it achieves the highest compression rates for a fixed test performance.
Tasks Neural Network Compression, Quantization
Published 2018-09-30
URL http://arxiv.org/abs/1810.00440v1
PDF http://arxiv.org/pdf/1810.00440v1.pdf
PWC https://paperswithcode.com/paper/minimal-random-code-learning-getting-bits
Repo https://github.com/cambridge-mlg/variational-shannon-coding
Framework tf
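
The abstract states that the coding cost corresponds to the Kullback-Leibler divergence between the variational distribution and the encoding distribution. As a worked illustration, the sketch below computes that divergence in bits for independent one-dimensional Gaussians per weight. The Gaussian family, the function names, and the example parameters are assumptions made for the example; the abstract does not specify them.

```python
import math


def gaussian_kl_nats(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) in nats for two univariate Gaussians."""
    return (math.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
            - 0.5)


def coding_cost_bits(q_params, p_params):
    """Total KL over independent weights, converted from nats to bits.

    q_params, p_params: lists of (mean, standard deviation) per weight.
    """
    total_nats = sum(
        gaussian_kl_nats(mu_q, sigma_q, mu_p, sigma_p)
        for (mu_q, sigma_q), (mu_p, sigma_p) in zip(q_params, p_params)
    )
    return total_nats / math.log(2.0)


# Two weights: one matches the encoding distribution, one has a shifted mean.
q = [(0.0, 1.0), (1.0, 1.0)]
p = [(0.0, 1.0), (0.0, 1.0)]
print(round(coding_cost_bits(q, p), 4))  # 0.7213
```

A weight whose variational distribution matches the encoding distribution costs zero bits, which is why constraining the divergence acts as a direct handle on the compressed size.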