October 21, 2019


Paper Group AWR 159


End-to-end music source separation: is it possible in the waveform domain?


Title	End-to-end music source separation: is it possible in the waveform domain?
Authors	Francesc Lluís, Jordi Pons, Xavier Serra
Abstract	Most of the currently successful source separation techniques use the magnitude spectrogram as input, and are therefore by default omitting part of the signal: the phase. To avoid omitting potentially useful information, we study the viability of using end-to-end models for music source separation — which take into account all the information available in the raw audio signal, including the phase. Although during the last decades end-to-end music source separation has been considered almost unattainable, our results confirm that waveform-based models can perform similarly (if not better) than a spectrogram-based deep learning model. Namely: a Wavenet-based model we propose and Wave-U-Net can outperform DeepConvSep, a recent spectrogram-based deep learning model.
Tasks	Music Source Separation
Published	2018-10-29
URL	https://arxiv.org/abs/1810.12187v2
PDF	https://arxiv.org/pdf/1810.12187v2.pdf
PWC	https://paperswithcode.com/paper/end-to-end-music-source-separation-is-it
Repo	https://github.com/francesclluis/source-separation-wavenet
Framework	tf

IntPhys: A Framework and Benchmark for Visual Intuitive Physics Reasoning


Title	IntPhys: A Framework and Benchmark for Visual Intuitive Physics Reasoning
Authors	Ronan Riochet, Mario Ynocente Castro, Mathieu Bernard, Adam Lerer, Rob Fergus, Véronique Izard, Emmanuel Dupoux
Abstract	In order to reach human performance on complex visual tasks, artificial systems need to incorporate a significant amount of understanding of the world in terms of macroscopic objects, movements, forces, etc. Inspired by work on intuitive physics in infants, we propose an evaluation benchmark which diagnoses how much a given system understands about physics by testing whether it can tell apart well matched videos of possible versus impossible events constructed with a game engine. The test requires systems to compute a physical plausibility score over an entire video. It is free of bias and can test a range of basic physical reasoning concepts. We then describe two Deep Neural Networks systems aimed at learning intuitive physics in an unsupervised way, using only physically possible videos. The systems are trained with a future semantic mask prediction objective and tested on the possible versus impossible discrimination task. The analysis of their results compared to human data gives novel insights in the potentials and limitations of next frame prediction architectures.
Tasks
Published	2018-03-20
URL	https://arxiv.org/abs/1803.07616v3
PDF	https://arxiv.org/pdf/1803.07616v3.pdf
PWC	https://paperswithcode.com/paper/intphys-a-framework-and-benchmark-for-visual
Repo	https://github.com/rronan/IntPhys-Baselines
Framework	pytorch

Multitask Parsing Across Semantic Representations


Title	Multitask Parsing Across Semantic Representations
Authors	Daniel Hershcovich, Omri Abend, Ari Rappoport
Abstract	The ability to consolidate information of different types is at the core of intelligence, and has tremendous practical value in allowing learning for one task to benefit from generalizations learned for others. In this paper we tackle the challenging task of improving semantic parsing performance, taking UCCA parsing as a test case, and AMR, SDP and Universal Dependencies (UD) parsing as auxiliary tasks. We experiment on three languages, using a uniform transition-based system and learning architecture for all parsing tasks. Despite notable conceptual, formal and domain differences, we show that multitask learning significantly improves UCCA parsing in both in-domain and out-of-domain settings.
Tasks	Semantic Parsing
Published	2018-05-01
URL	http://arxiv.org/abs/1805.00287v1
PDF	http://arxiv.org/pdf/1805.00287v1.pdf
PWC	https://paperswithcode.com/paper/multitask-parsing-across-semantic
Repo	https://github.com/danielhers/tupa
Framework	none

Transfer Learning via Unsupervised Task Discovery for Visual Question Answering


Title	Transfer Learning via Unsupervised Task Discovery for Visual Question Answering
Authors	Hyeonwoo Noh, Taehoon Kim, Jonghwan Mun, Bohyung Han
Abstract	We study how to leverage off-the-shelf visual and linguistic data to cope with out-of-vocabulary answers in visual question answering task. Existing large-scale visual datasets with annotations such as image class labels, bounding boxes and region descriptions are good sources for learning rich and diverse visual concepts. However, it is not straightforward how the visual concepts can be captured and transferred to visual question answering models due to missing link between question dependent answering models and visual data without question. We tackle this problem in two steps: 1) learning a task conditional visual classifier, which is capable of solving diverse question-specific visual recognition tasks, based on unsupervised task discovery and 2) transferring the task conditional visual classifier to visual question answering models. Specifically, we employ linguistic knowledge sources such as structured lexical database (e.g. WordNet) and visual descriptions for unsupervised task discovery, and transfer a learned task conditional visual classifier as an answering unit in a visual question answering model. We empirically show that the proposed algorithm generalizes to out-of-vocabulary answers successfully using the knowledge transferred from the visual dataset.
Tasks	Question Answering, Transfer Learning, Visual Question Answering
Published	2018-10-03
URL	http://arxiv.org/abs/1810.02358v2
PDF	http://arxiv.org/pdf/1810.02358v2.pdf
PWC	https://paperswithcode.com/paper/transfer-learning-via-unsupervised-task
Repo	https://github.com/HyeonwooNoh/vqa_task_discovery
Framework	tf

Spherical CNNs


Title	Spherical CNNs
Authors	Taco S. Cohen, Mario Geiger, Jonas Koehler, Max Welling
Abstract	Convolutional Neural Networks (CNNs) have become the method of choice for learning problems involving 2D planar images. However, a number of problems of recent interest have created a demand for models that can analyze spherical images. Examples include omnidirectional vision for drones, robots, and autonomous cars, molecular regression problems, and global weather and climate modelling. A naive application of convolutional networks to a planar projection of the spherical signal is destined to fail, because the space-varying distortions introduced by such a projection will make translational weight sharing ineffective. In this paper we introduce the building blocks for constructing spherical CNNs. We propose a definition for the spherical cross-correlation that is both expressive and rotation-equivariant. The spherical correlation satisfies a generalized Fourier theorem, which allows us to compute it efficiently using a generalized (non-commutative) Fast Fourier Transform (FFT) algorithm. We demonstrate the computational efficiency, numerical accuracy, and effectiveness of spherical CNNs applied to 3D model recognition and atomization energy regression.
Tasks
Published	2018-01-30
URL	http://arxiv.org/abs/1801.10130v3
PDF	http://arxiv.org/pdf/1801.10130v3.pdf
PWC	https://paperswithcode.com/paper/spherical-cnns
Repo	https://github.com/jonas-koehler/s2cnn
Framework	pytorch

Image Inpainting for Irregular Holes Using Partial Convolutions


Title	Image Inpainting for Irregular Holes Using Partial Convolutions
Authors	Guilin Liu, Fitsum A. Reda, Kevin J. Shih, Ting-Chun Wang, Andrew Tao, Bryan Catanzaro
Abstract	Existing deep learning based image inpainting methods use a standard convolutional network over the corrupted image, using convolutional filter responses conditioned on both valid pixels as well as the substitute values in the masked holes (typically the mean value). This often leads to artifacts such as color discrepancy and blurriness. Post-processing is usually used to reduce such artifacts, but is expensive and may fail. We propose the use of partial convolutions, where the convolution is masked and renormalized to be conditioned on only valid pixels. We further include a mechanism to automatically generate an updated mask for the next layer as part of the forward pass. Our model outperforms other methods for irregular masks. We show qualitative and quantitative comparisons with other methods to validate our approach.
Tasks	Image Inpainting
Published	2018-04-20
URL	http://arxiv.org/abs/1804.07723v2
PDF	http://arxiv.org/pdf/1804.07723v2.pdf
PWC	https://paperswithcode.com/paper/image-inpainting-for-irregular-holes-using
Repo	https://github.com/SimonDele/School-projects
Framework	none
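
The abstract above describes a convolution that is masked, renormalized over valid pixels, and followed by a mask update. A minimal single-channel NumPy sketch of that idea is given here. It is not the authors' implementation: the function name `partial_conv2d`, the scaling factor (window size divided by the number of valid pixels), the omission of a bias term, and the valid-padding, stride-1 setup are all assumptions made for illustration.

```python
import numpy as np


def partial_conv2d(image, mask, kernel):
    """Single-channel partial convolution (valid padding, stride 1).

    image, mask: 2D arrays of the same shape; mask is 1 for valid pixels
    and 0 for holes. Returns (output, updated_mask).
    The renormalization factor used here is an assumption for illustration.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    new_mask = np.zeros((out_h, out_w))
    window_size = kh * kw

    for i in range(out_h):
        for j in range(out_w):
            m = mask[i:i + kh, j:j + kw]
            valid = m.sum()
            if valid > 0:
                x = image[i:i + kh, j:j + kw]
                # Use only valid pixels, then rescale to compensate for holes.
                output[i, j] = (kernel * x * m).sum() * (window_size / valid)
                # Any window that saw a valid pixel becomes valid downstream.
                new_mask[i, j] = 1.0
    return output, new_mask


image = np.full((4, 4), 2.0)
mask = np.ones((4, 4))
mask[1, 1] = 0.0                      # one missing pixel
mean_kernel = np.full((3, 3), 1.0 / 9.0)
out, updated = partial_conv2d(image, mask, mean_kernel)
```

With a mean filter over a constant image, the rescaling makes the output equal to the constant even where the window overlaps the hole, which is the behavior the renormalization is meant to provide.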

Interpretable Neuron Structuring with Graph Spectral Regularization


Title	Interpretable Neuron Structuring with Graph Spectral Regularization
Authors	Alexander Tong, David van Dijk, Jay S. Stanley III, Matthew Amodio, Kristina Yim, Rebecca Muhle, James Noonan, Guy Wolf, Smita Krishnaswamy
Abstract	While neural networks are powerful approximators used to classify or embed data into lower dimensional spaces, they are often regarded as black boxes with uninterpretable features. Here we propose Graph Spectral Regularization for making hidden layers more interpretable without significantly impacting performance on the primary task. Taking inspiration from spatial organization and localization of neuron activations in biological networks, we use a graph Laplacian penalty to structure the activations within a layer. This penalty encourages activations to be smooth either on a predetermined graph or on a feature-space graph learned from the data via co-activations of a hidden layer of the neural network. We show numerous uses for this additional structure including cluster indication and visualization in biological and image data sets.
Tasks
Published	2018-09-30
URL	https://arxiv.org/abs/1810.00424v5
PDF	https://arxiv.org/pdf/1810.00424v5.pdf
PWC	https://paperswithcode.com/paper/graph-spectral-regularization-for-neural
Repo	https://github.com/KrishnaswamyLab/GraphSpectralRegularization
Framework	tf
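
The graph Laplacian penalty mentioned in the abstract above can be illustrated with a short NumPy sketch. It assumes a fixed, symmetric adjacency matrix over the neurons of one layer and uses the standard quadratic form a^T L a with L = D - A, averaged over the batch. The function names and the choice of the unnormalized Laplacian are assumptions, not details confirmed by the abstract.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Unnormalized graph Laplacian L = D - A for a symmetric adjacency matrix."""
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def laplacian_penalty(activations, adjacency):
    """Mean of a^T L a over a batch of activation vectors (one row per example).

    The penalty is small when neurons that are neighbors on the graph
    have similar activations.
    """
    lap = graph_laplacian(adjacency)
    per_example = np.einsum("bi,ij,bj->b", activations, lap, activations)
    return float(per_example.mean())


# Hypothetical 3-neuron chain graph: 0 - 1 - 2
chain = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])
smooth = laplacian_penalty(np.array([[1.0, 1.0, 1.0]]), chain)
rough = laplacian_penalty(np.array([[1.0, 0.0, 1.0]]), chain)
```

In training, a term like this would be added to the task loss with a weighting coefficient. Here the constant activation pattern incurs no penalty, while the alternating one does.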

CapsDeMM: Capsule network for Detection of Munro’s Microabscess in skin biopsy images


Title	CapsDeMM: Capsule network for Detection of Munro’s Microabscess in skin biopsy images
Authors	Anabik Pal, Akshay Chaturvedi, Utpal Garain, Aditi Chandra, Raghunath Chatterjee, Swapan Senapati
Abstract	This paper presents an approach for automatic detection of Munro’s Microabscess in stratum corneum (SC) of human skin biopsy in order to realize a machine assisted diagnosis of Psoriasis. The challenge of detecting neutrophils in presence of nucleated cells is solved using the recent advances of deep learning algorithms. Separation of SC layer, extraction of patches from the layer followed by classification of patches with respect to presence or absence of neutrophils form the basis of the overall approach which is effected through an integration of a U-Net based segmentation network and a capsule network for classification. The novel design of the present capsule net leads to a drastic reduction in the number of parameters without any noticeable compromise in the overall performance. The research further addresses the challenge of dealing with Mega-pixel images (in 10X) vis-a-vis Giga-pixel ones (in 40X). The promising result coming out of an experiment on a dataset consisting of 273 real-life images shows that a practical system is possible based on the present research. The implementation of our system is available at https://github.com/Anabik/CapsDeMM.
Tasks
Published	2018-08-20
URL	http://arxiv.org/abs/1808.06428v2
PDF	http://arxiv.org/pdf/1808.06428v2.pdf
PWC	https://paperswithcode.com/paper/capsdemm-capsule-network-for-detection-of
Repo	https://github.com/Anabik/CapsDeMM
Framework	tf

Adaptive Sampling for Coarse Ranking


Title	Adaptive Sampling for Coarse Ranking
Authors	Sumeet Katariya, Lalit Jain, Nandana Sengupta, James Evans, Robert Nowak
Abstract	We consider the problem of active coarse ranking, where the goal is to sort items according to their means into clusters of pre-specified sizes, by adaptively sampling from their reward distributions. This setting is useful in many social science applications involving human raters and the approximate rank of every item is desired. Approximate or coarse ranking can significantly reduce the number of ratings required in comparison to the number needed to find an exact ranking. We propose a computationally efficient PAC algorithm LUCBRank for coarse ranking, and derive an upper bound on its sample complexity. We also derive a nearly matching distribution-dependent lower bound. Experiments on synthetic as well as real-world data show that LUCBRank performs better than state-of-the-art baseline methods, even when these methods have the advantage of knowing the underlying parametric model.
Tasks
Published	2018-02-20
URL	http://arxiv.org/abs/1802.07176v1
PDF	http://arxiv.org/pdf/1802.07176v1.pdf
PWC	https://paperswithcode.com/paper/adaptive-sampling-for-coarse-ranking
Repo	https://github.com/sumeetsk/coarse_ranking
Framework	none

Markerless tracking of user-defined features with deep learning


Title	Markerless tracking of user-defined features with deep learning
Authors	Alexander Mathis, Pranav Mamidanna, Taiga Abe, Kevin M. Cury, Venkatesh N. Murthy, Mackenzie W. Mathis, Matthias Bethge
Abstract	Quantifying behavior is crucial for many applications in neuroscience. Videography provides easy methods for the observation and recording of animal behavior in diverse settings, yet extracting particular aspects of a behavior for further analysis can be highly time consuming. In motor control studies, humans or other animals are often marked with reflective markers to assist with computer-based tracking, yet markers are intrusive (especially for smaller animals), and the number and location of the markers must be determined a priori. Here, we present a highly efficient method for markerless tracking based on transfer learning with deep neural networks that achieves excellent results with minimal training data. We demonstrate the versatility of this framework by tracking various body parts in a broad collection of experimental settings: mice odor trail-tracking, egg-laying behavior in drosophila, and mouse hand articulation in a skilled forelimb task. For example, during the skilled reaching behavior, individual joints can be automatically tracked (and a confidence score is reported). Remarkably, even when a small number of frames are labeled ($\approx 200$), the algorithm achieves excellent tracking performance on test frames that is comparable to human accuracy.
Tasks	Animal Pose Estimation, Transfer Learning
Published	2018-04-09
URL	http://arxiv.org/abs/1804.03142v1
PDF	http://arxiv.org/pdf/1804.03142v1.pdf
PWC	https://paperswithcode.com/paper/markerless-tracking-of-user-defined-features
Repo	https://github.com/orkqueen/depplabseongil
Framework	tf

RePr: Improved Training of Convolutional Filters


Title	RePr: Improved Training of Convolutional Filters
Authors	Aaditya Prakash, James Storer, Dinei Florencio, Cha Zhang
Abstract	A well-trained Convolutional Neural Network can easily be pruned without significant loss of performance. This is because of unnecessary overlap in the features captured by the network’s filters. Innovations in network architecture such as skip/dense connections and Inception units have mitigated this problem to some extent, but these improvements come with increased computation and memory requirements at run-time. We attempt to address this problem from another angle - not by changing the network structure but by altering the training method. We show that by temporarily pruning and then restoring a subset of the model’s filters, and repeating this process cyclically, overlap in the learned features is reduced, producing improved generalization. We show that the existing model-pruning criteria are not optimal for selecting filters to prune in this context and introduce inter-filter orthogonality as the ranking criteria to determine under-expressive filters. Our method is applicable both to vanilla convolutional networks and more complex modern architectures, and improves the performance across a variety of tasks, especially when applied to smaller networks.
Tasks
Published	2018-11-18
URL	http://arxiv.org/abs/1811.07275v3
PDF	http://arxiv.org/pdf/1811.07275v3.pdf
PWC	https://paperswithcode.com/paper/repr-improved-training-of-convolutional
Repo	https://github.com/siahuat0727/RePr
Framework	pytorch
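
The abstract above names inter-filter orthogonality as the ranking criterion for choosing filters to prune. One plausible way to compute such a score is sketched here in NumPy. The exact formula is an assumption for illustration: filters are flattened, scaled to unit norm, and each filter is scored by its mean absolute overlap with the others in the same layer.

```python
import numpy as np


def orthogonality_scores(filters):
    """Score each filter by its overlap with the other filters in the layer.

    filters: array of shape (num_filters, ...); trailing axes are flattened.
    Returns one score per filter; a higher score means less orthogonal.
    This specific formula is an illustrative assumption.
    """
    flat = filters.reshape(filters.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)        # avoid division by zero
    gram = unit @ unit.T                          # pairwise cosine similarity
    overlap = np.abs(gram - np.eye(unit.shape[0]))
    return overlap.sum(axis=1) / unit.shape[0]


# Three hypothetical filters: the first two are parallel, the third is orthogonal.
layer = np.array([[1.0, 0.0],
                  [2.0, 0.0],
                  [0.0, 1.0]]).reshape(3, 1, 1, 2)
scores = orthogonality_scores(layer)
prune_order = np.argsort(-scores)   # most redundant filters first
```

Under this scoring, the two parallel filters rank as the most redundant and the orthogonal one as the least.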

Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning


Title	Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
Authors	Charles H. Martin, Michael W. Mahoney
Abstract	Random Matrix Theory (RMT) is applied to analyze weight matrices of Deep Neural Networks (DNNs), including both production quality, pre-trained models such as AlexNet and Inception, and smaller models trained from scratch, such as LeNet5 and a miniature-AlexNet. Empirical and theoretical results clearly indicate that the DNN training process itself implicitly implements a form of Self-Regularization. The empirical spectral density (ESD) of DNN layer matrices displays signatures of traditionally-regularized statistical models, even in the absence of exogenously specifying traditional forms of explicit regularization. Building on relatively recent results in RMT, most notably its extension to Universality classes of Heavy-Tailed matrices, we develop a theory to identify 5+1 Phases of Training, corresponding to increasing amounts of Implicit Self-Regularization. These phases can be observed during the training process as well as in the final learned DNNs. For smaller and/or older DNNs, this Implicit Self-Regularization is like traditional Tikhonov regularization, in that there is a “size scale” separating signal from noise. For state-of-the-art DNNs, however, we identify a novel form of Heavy-Tailed Self-Regularization, similar to the self-organization seen in the statistical physics of disordered systems. This results from correlations arising at all size scales, which arises implicitly due to the training process itself. This implicit Self-Regularization can depend strongly on the many knobs of the training process. By exploiting the generalization gap phenomena, we demonstrate that we can cause a small model to exhibit all 5+1 phases of training simply by changing the batch size. This demonstrates that—all else being equal—DNN optimization with larger batch sizes leads to less-well implicitly-regularized models, and it provides an explanation for the generalization gap phenomena.
Tasks
Published	2018-10-02
URL	http://arxiv.org/abs/1810.01075v1
PDF	http://arxiv.org/pdf/1810.01075v1.pdf
PWC	https://paperswithcode.com/paper/implicit-self-regularization-in-deep-neural
Repo	https://github.com/CalculatedContent/ImplicitSelfRegularization
Framework	pytorch
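
The empirical spectral density (ESD) referred to in the abstract above can be computed for a single weight matrix as sketched here. The normalization (dividing W^T W by the number of rows) and the histogram-based density estimate are assumptions chosen for illustration, and the random matrix stands in for a trained layer.

```python
import numpy as np


def empirical_spectral_density(weights, bins=50):
    """Eigenvalues of W^T W / N and a histogram estimate of their density.

    weights: 2D array of shape (N, M). Returns (eigenvalues, density, bin_edges).
    The 1/N normalization is an assumption for this sketch.
    """
    n_rows = weights.shape[0]
    corr = weights.T @ weights / n_rows
    eigvals = np.linalg.eigvalsh(corr)            # symmetric matrix, ascending order
    density, edges = np.histogram(eigvals, bins=bins, density=True)
    return eigvals, density, edges


rng = np.random.default_rng(0)
random_layer = rng.standard_normal((200, 50))     # placeholder for a trained layer
eigvals, density, edges = empirical_spectral_density(random_layer)
```

Comparing the shape of this histogram for trained layers against the one obtained from a random matrix of the same size is the kind of inspection the abstract describes.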

Limited Evaluation Evolutionary Optimization of Large Neural Networks


Title	Limited Evaluation Evolutionary Optimization of Large Neural Networks
Authors	Jonas Prellberg, Oliver Kramer
Abstract	Stochastic gradient descent is the most prevalent algorithm to train neural networks. However, other approaches such as evolutionary algorithms are also applicable to this task. Evolutionary algorithms bring unique trade-offs that are worth exploring, but computational demands have so far restricted exploration to small networks with few parameters. We implement an evolutionary algorithm that executes entirely on the GPU, which allows to efficiently batch-evaluate a whole population of networks. Within this framework, we explore the limited evaluation evolutionary algorithm for neural network training and find that its batch evaluation idea comes with a large accuracy trade-off. In further experiments, we explore crossover operators and find that unprincipled random uniform crossover performs extremely well. Finally, we train a network with 92k parameters on MNIST using an EA and achieve 97.6 % test accuracy compared to 98 % test accuracy on the same network trained with Adam. Code is available at https://github.com/jprellberg/gpuea.
Tasks
Published	2018-06-26
URL	http://arxiv.org/abs/1806.09819v1
PDF	http://arxiv.org/pdf/1806.09819v1.pdf
PWC	https://paperswithcode.com/paper/limited-evaluation-evolutionary-optimization
Repo	https://github.com/jprellberg/gpuea
Framework	tf
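
The random uniform crossover mentioned in the abstract above can be sketched in a few lines of NumPy. This assumes each network is represented as a flat parameter vector and that each parameter is taken from either parent with equal probability. It is an illustration of the general operator, not the code from the linked repository.

```python
import numpy as np


def uniform_crossover(parent_a, parent_b, rng):
    """Build a child by picking each parameter from one parent at random."""
    take_a = rng.random(parent_a.shape) < 0.5
    return np.where(take_a, parent_a, parent_b)


rng = np.random.default_rng(0)
parent_a = np.zeros(10)          # hypothetical flattened parameters
parent_b = np.ones(10)
child = uniform_crossover(parent_a, parent_b, rng)
```

Because the operation is elementwise, the same function applies unchanged to a stacked array holding a whole population, which suits batched evaluation on a GPU.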

A Convolutional Autoencoder Approach to Learn Volumetric Shape Representations for Brain Structures


Title	A Convolutional Autoencoder Approach to Learn Volumetric Shape Representations for Brain Structures
Authors	Evan M. Yu, Mert R. Sabuncu
Abstract	We propose a novel machine learning strategy for studying neuroanatomical shape variation. Our model works with volumetric binary segmentation images, and requires no pre-processing such as the extraction of surface points or a mesh. The learned shape descriptor is invariant to affine transformations, including shifts, rotations and scaling. Thanks to the adopted autoencoder framework, inter-subject differences are automatically enhanced in the learned representation, while intra-subject variances are minimized. Our experimental results on a shape retrieval task showed that the proposed representation outperforms a state-of-the-art benchmark for brain structures extracted from MRI scans.
Tasks
Published	2018-10-17
URL	http://arxiv.org/abs/1810.07746v1
PDF	http://arxiv.org/pdf/1810.07746v1.pdf
PWC	https://paperswithcode.com/paper/a-convolutional-autoencoder-approach-to-learn
Repo	https://github.com/evanmy/voxel_shape_analysis
Framework	pytorch

Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters


Title	Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters
Authors	Marton Havasi, Robert Peharz, José Miguel Hernández-Lobato
Abstract	While deep neural networks are a highly successful model class, their large memory footprint puts considerable strain on energy consumption, communication bandwidth, and storage requirements. Consequently, model size reduction has become an utmost goal in deep learning. A typical approach is to train a set of deterministic weights, while applying certain techniques such as pruning and quantization, in order that the empirical weight distribution becomes amenable to Shannon-style coding schemes. However, as shown in this paper, relaxing weight determinism and using a full variational distribution over weights allows for more efficient coding schemes and consequently higher compression rates. In particular, following the classical bits-back argument, we encode the network weights using a random sample, requiring only a number of bits corresponding to the Kullback-Leibler divergence between the sampled variational distribution and the encoding distribution. By imposing a constraint on the Kullback-Leibler divergence, we are able to explicitly control the compression rate, while optimizing the expected loss on the training set. The employed encoding scheme can be shown to be close to the optimal information-theoretical lower bound, with respect to the employed variational family. Our method sets new state-of-the-art in neural network compression, as it strictly dominates previous approaches in a Pareto sense: On the benchmarks LeNet-5/MNIST and VGG-16/CIFAR-10, our approach yields the best test performance for a fixed memory budget, and vice versa, it achieves the highest compression rates for a fixed test performance.
Tasks	Neural Network Compression, Quantization
Published	2018-09-30
URL	http://arxiv.org/abs/1810.00440v1
PDF	http://arxiv.org/pdf/1810.00440v1.pdf
PWC	https://paperswithcode.com/paper/minimal-random-code-learning-getting-bits
Repo	https://github.com/cambridge-mlg/variational-shannon-coding
Framework	tf