February 2, 2020

3042 words 15 mins read

Paper Group AWR 73

Parameter elimination in particle Gibbs sampling. A Visually Plausible Grasping System for Object Manipulation and Interaction in Virtual Reality Environments. Recurrent Independent Mechanisms. Explicitly disentangling image content from translation and rotation with spatial-VAE. PHom-GeM: Persistent Homology for Generative Models. Hyperbolic Graph …

Parameter elimination in particle Gibbs sampling

Title Parameter elimination in particle Gibbs sampling
Authors Anna Wigren, Riccardo Sven Risuleo, Lawrence Murray, Fredrik Lindsten
Abstract Bayesian inference in state-space models is challenging due to high-dimensional state trajectories. A viable approach is particle Markov chain Monte Carlo, combining MCMC and sequential Monte Carlo to form “exact approximations” to otherwise intractable MCMC methods. The performance of the approximation is limited to that of the exact method. We focus on particle Gibbs and particle Gibbs with ancestor sampling, improving their performance beyond that of the underlying Gibbs sampler (which they approximate) by marginalizing out one or more parameters. This is possible when the parameter prior is conjugate to the complete data likelihood. Marginalization yields a non-Markovian model for inference, but we show that, in contrast to the general case, this method still scales linearly in time. While marginalization can be cumbersome to implement, recent advances in probabilistic programming have enabled its automation. We demonstrate how the marginalized methods are viable as efficient inference backends in probabilistic programming, and illustrate this with examples from ecology and epidemiology.
Tasks Bayesian Inference, Epidemiology, Probabilistic Programming
Published 2019-10-30
URL https://arxiv.org/abs/1910.14145v1
PDF https://arxiv.org/pdf/1910.14145v1.pdf
PWC https://paperswithcode.com/paper/parameter-elimination-in-particle-gibbs
Repo https://github.com/uu-sml/neurips2019-parameter-elimination
Framework none
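
The key mechanism above lends itself to a short illustration. Below is a minimal sketch, not the authors' code, of parameter marginalization inside a bootstrap particle filter: assuming a toy model with observations y_t ~ Poisson(theta · g(x_t)) and a conjugate Gamma(alpha, beta) prior on theta, each particle carries the sufficient statistics of the Gamma posterior along its own trajectory, so theta is never sampled and the cost stays linear in time. The model, link function, and dynamics are illustrative assumptions.

```python
import numpy as np
from scipy.special import gammaln

def predictive_logpdf(y, gx, alpha, beta):
    # Posterior predictive of y ~ Poisson(theta * gx) with theta ~ Gamma(alpha, beta):
    # a negative binomial, obtained from the Gamma-Poisson mixture identity.
    return (gammaln(alpha + y) - gammaln(alpha) - gammaln(y + 1)
            + alpha * np.log(beta / (beta + gx))
            + y * np.log(gx / (beta + gx)))

def marginalized_bootstrap_pf(ys, n=500, alpha0=1.0, beta0=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                 # particle states
    alpha = np.full(n, alpha0)             # per-particle Gamma posterior statistics
    beta = np.full(n, beta0)
    for y in ys:
        x = 0.9 * x + rng.normal(scale=0.5, size=n)    # toy linear-Gaussian dynamics
        gx = np.exp(x)                                 # positive link g(x) = exp(x)
        logw = predictive_logpdf(y, gx, alpha, beta)   # weight under the *marginal*
        alpha, beta = alpha + y, beta + gx             # conjugate statistic update
        w = np.exp(logw - logw.max()); w /= w.sum()
        idx = rng.choice(n, size=n, p=w)               # multinomial resampling
        x, alpha, beta = x[idx], alpha[idx], beta[idx]
    return x, alpha, beta

x, alpha, beta = marginalized_bootstrap_pf(ys=[3, 1, 4, 2, 5])
```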

A Visually Plausible Grasping System for Object Manipulation and Interaction in Virtual Reality Environments

Title A Visually Plausible Grasping System for Object Manipulation and Interaction in Virtual Reality Environments
Authors Sergiu Oprea, Pablo Martinez-Gonzalez, Alberto Garcia-Garcia, John Alejandro Castro-Vargas, Sergio Orts-Escolano, Jose Garcia-Rodriguez
Abstract Interaction in virtual reality (VR) environments is essential to achieving a pleasant and immersive experience. Most existing VR applications lack robust object grasping and manipulation, which are the cornerstone of interactive systems. We therefore propose a realistic, flexible and robust grasping system that enables rich, real-time interaction in virtual environments. It is visually realistic because it is completely user-controlled, flexible because it can be used for different hand configurations, and robust because it allows the manipulation of objects regardless of their geometry, i.e., the hand automatically fits the object shape. To validate our proposal, an exhaustive qualitative and quantitative performance analysis has been carried out. Qualitative evaluation was used to assess abstract aspects such as hand-movement realism, interaction realism and motor control, while for the quantitative evaluation a novel error metric is proposed to visually analyze the performed grips. This metric is based on computing the distance from the finger phalanges to the nearest contact point on the object surface. These contact points can be used for different application purposes, mainly in the field of robotics. In conclusion, the system evaluation reports similar performance between users with previous experience in virtual reality applications and inexperienced users, indicating that the system is quickly learned.
Tasks
Published 2019-03-12
URL http://arxiv.org/abs/1903.05238v1
PDF http://arxiv.org/pdf/1903.05238v1.pdf
PWC https://paperswithcode.com/paper/a-visually-plausible-grasping-system-for
Repo https://github.com/3dperceptionlab/unrealgrasp
Framework none
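
The proposed error metric is simple enough to sketch. The following illustration (names and shapes are assumptions, not the authors' API) computes, for each finger phalanx, the distance to the nearest contact point on the object surface, approximated here by a point cloud.

```python
import numpy as np

def grasp_error(phalanx_positions, object_points):
    """phalanx_positions: (P, 3) phalanx centers; object_points: (N, 3) surface samples.
    Returns the per-phalanx nearest-surface distance and its mean."""
    # Pairwise distances (P, N), then the closest contact point per phalanx.
    d = np.linalg.norm(phalanx_positions[:, None, :] - object_points[None, :, :], axis=-1)
    per_phalanx = d.min(axis=1)
    return per_phalanx, per_phalanx.mean()

# Toy usage: 5 phalanges against 1000 surface samples of a unit sphere.
rng = np.random.default_rng(0)
pts = rng.normal(size=(1000, 3)); pts /= np.linalg.norm(pts, axis=1, keepdims=True)
phal = rng.normal(size=(5, 3))
dists, mean_err = grasp_error(phal, pts)
```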

Recurrent Independent Mechanisms

Title Recurrent Independent Mechanisms
Authors Anirudh Goyal, Alex Lamb, Jordan Hoffmann, Shagun Sodhani, Sergey Levine, Yoshua Bengio, Bernhard Schölkopf
Abstract Learning modular structures which reflect the dynamics of the environment can lead to better generalization and robustness to changes which only affect a few of the underlying causes. We propose Recurrent Independent Mechanisms (RIMs), a new recurrent architecture in which multiple groups of recurrent cells operate with nearly independent transition dynamics, communicate only sparingly through the bottleneck of attention, and are only updated at time steps where they are most relevant. We show that this leads to specialization amongst the RIMs, which in turn allows for dramatically improved generalization on tasks where some factors of variation differ systematically between training and evaluation.
Tasks
Published 2019-09-24
URL https://arxiv.org/abs/1909.10893v2
PDF https://arxiv.org/pdf/1909.10893v2.pdf
PWC https://paperswithcode.com/paper/recurrent-independent-mechanisms
Repo https://github.com/maximecb/gym-minigrid
Framework pytorch
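
A condensed sketch of the update rule described above, under stated simplifications: GRU cells stand in for the paper's recurrent units and the inter-RIM communication attention is omitted, leaving only input attention and sparse top-k activation.

```python
import torch
import torch.nn as nn

class TinyRIMs(nn.Module):
    def __init__(self, n_rims=6, k_active=3, input_dim=16, hidden_dim=32):
        super().__init__()
        self.k = k_active
        self.cells = nn.ModuleList(nn.GRUCell(input_dim, hidden_dim) for _ in range(n_rims))
        self.queries = nn.Linear(hidden_dim, input_dim)  # per-RIM query over the input

    def forward(self, x, h):
        # x: (B, input_dim) current input; h: (B, n_rims, hidden_dim) RIM states.
        # Each RIM scores the input against its own state; only the top-k
        # highest-scoring ("most relevant") RIMs update, the rest keep their state.
        scores = (self.queries(h) * x.unsqueeze(1)).sum(-1)           # (B, n_rims)
        topk = scores.topk(self.k, dim=1).indices
        active = torch.zeros_like(scores).scatter_(1, topk, 1.0).bool()
        new_h = torch.stack([cell(x, h[:, i]) for i, cell in enumerate(self.cells)], dim=1)
        return torch.where(active.unsqueeze(-1), new_h, h)

rims = TinyRIMs()
h = torch.zeros(4, 6, 32)            # batch of 4 sequences, 6 RIMs
for t in range(10):                  # unroll 10 time steps
    h = rims(torch.randn(4, 16), h)
```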

Explicitly disentangling image content from translation and rotation with spatial-VAE

Title Explicitly disentangling image content from translation and rotation with spatial-VAE
Authors Tristan Bepler, Ellen D. Zhong, Kotaro Kelley, Edward Brignole, Bonnie Berger
Abstract Given an image dataset, we are often interested in finding data generative factors that encode semantic content independently from pose variables such as rotation and translation. However, current disentanglement approaches do not impose any specific structure on the learned latent representations. We propose a method for explicitly disentangling image rotation and translation from other unstructured latent factors in a variational autoencoder (VAE) framework. By formulating the generative model as a function of the spatial coordinate, we make the reconstruction error differentiable with respect to latent translation and rotation parameters. This formulation allows us to train a neural network to perform approximate inference on these latent variables while explicitly constraining them to only represent rotation and translation. We demonstrate that this framework, termed spatial-VAE, effectively learns latent representations that disentangle image rotation and translation from content and improves reconstruction over standard VAEs on several benchmark datasets, including applications to modeling continuous 2-D views of proteins from single particle electron microscopy and galaxies in astronomical images.
Tasks
Published 2019-09-25
URL https://arxiv.org/abs/1909.11663v1
PDF https://arxiv.org/pdf/1909.11663v1.pdf
PWC https://paperswithcode.com/paper/explicitly-disentangling-image-content-from
Repo https://github.com/tbepler/spatial-VAE
Framework pytorch
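
The coordinate-based decoder is the heart of the method and is easy to sketch. The code below is an illustrative reconstruction, not the tbepler/spatial-VAE implementation: the generator is an MLP over 2-D coordinates, so rotating and translating the coordinate grid by latent pose parameters keeps the reconstruction error differentiable in rotation and translation.

```python
import torch
import torch.nn as nn

class SpatialDecoder(nn.Module):
    def __init__(self, z_dim=8, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 + z_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 1))

    def forward(self, coords, z):
        # coords: (B, H*W, 2) pose-transformed grid; z: (B, z_dim) content code.
        zc = z.unsqueeze(1).expand(-1, coords.size(1), -1)
        return self.net(torch.cat([coords, zc], dim=-1)).squeeze(-1)  # (B, H*W)

def transform_grid(grid, theta, shift):
    # grid: (H*W, 2); theta: (B,) rotation angles; shift: (B, 2) translations.
    c, s = torch.cos(theta), torch.sin(theta)
    rot = torch.stack([torch.stack([c, -s], -1), torch.stack([s, c], -1)], -2)  # (B, 2, 2)
    return grid.unsqueeze(0) @ rot.transpose(-1, -2) + shift.unsqueeze(1)       # (B, H*W, 2)

# Usage: decode a batch of 4 images on a 28x28 grid from rotated coordinates.
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 28), torch.linspace(-1, 1, 28), indexing="ij")
grid = torch.stack([xs.reshape(-1), ys.reshape(-1)], dim=-1)      # (784, 2)
dec = SpatialDecoder()
out = dec(transform_grid(grid, torch.rand(4), torch.zeros(4, 2)), torch.randn(4, 8))
```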

PHom-GeM: Persistent Homology for Generative Models

Title PHom-GeM: Persistent Homology for Generative Models
Authors Jeremy Charlier, Radu State, Jean Hilger
Abstract Generative neural network models, including Generative Adversarial Networks (GANs) and Auto-Encoders (AEs), are among the most popular neural network models for generating adversarial data. A GAN is composed of a generator that produces synthetic data and a discriminator that discriminates between the generator’s output and the true data. AEs consist of an encoder that maps the model distribution to a latent manifold and a decoder that maps the latent manifold to a reconstructed distribution. However, generative models are known to produce chaotically scattered reconstructed distributions during training and, consequently, incomplete generated adversarial distributions. Current distance measures fail to address this problem because they cannot acknowledge the shape of the data manifold, i.e. its topological features, or the scale at which the manifold should be analyzed. We propose Persistent Homology for Generative Models (PHom-GeM), a new methodology to assess and measure the distribution of a generative model. PHom-GeM minimizes an objective function between the true and the reconstructed distributions and uses persistent homology, the study of the topological features of a space at different spatial resolutions, to compare the nature of the true and the generated distributions. Our experiments underline the potential of persistent homology for Wasserstein GANs in comparison to Wasserstein AEs and Variational AEs. The experiments are conducted on a real-world data set that is particularly challenging for traditional distance measures and generative neural network models. PHom-GeM is the first methodology to propose a topological distance measure, the bottleneck distance, for generative models, used to compare adversarial samples in the context of credit card transactions.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.09894v1
PDF https://arxiv.org/pdf/1905.09894v1.pdf
PWC https://paperswithcode.com/paper/phom-gem-persistent-homology-for-generative
Repo https://github.com/dagrate/phomgem
Framework none
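
The topological comparison can be illustrated with the gudhi library (an assumption; the authors' repository may use different tooling): build Vietoris-Rips persistence diagrams for true and generated samples and compare them with the bottleneck distance.

```python
import numpy as np
import gudhi

def h1_diagram(points, max_edge=2.0):
    # Persistence intervals of 1-dimensional features (loops) of a point cloud.
    rips = gudhi.RipsComplex(points=points, max_edge_length=max_edge)
    st = rips.create_simplex_tree(max_dimension=2)
    st.persistence()
    return st.persistence_intervals_in_dimension(1)

rng = np.random.default_rng(0)
true_pts = rng.normal(size=(200, 2))
fake_pts = rng.normal(size=(200, 2)) * 1.3   # stand-in for generator output
d = gudhi.bottleneck_distance(h1_diagram(true_pts), h1_diagram(fake_pts))
print(f"bottleneck distance between H1 diagrams: {d:.4f}")
```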

Hyperbolic Graph Neural Networks

Title Hyperbolic Graph Neural Networks
Authors Qi Liu, Maximilian Nickel, Douwe Kiela
Abstract Learning from graph-structured data is an important task in machine learning and artificial intelligence, for which Graph Neural Networks (GNNs) have shown great promise. Motivated by recent advances in geometric representation learning, we propose a novel GNN architecture for learning representations on Riemannian manifolds with differentiable exponential and logarithmic maps. We develop a scalable algorithm for modeling the structural properties of graphs, comparing Euclidean and hyperbolic geometry. In our experiments, we show that hyperbolic GNNs can lead to substantial improvements on various benchmark datasets.
Tasks Representation Learning
Published 2019-10-28
URL https://arxiv.org/abs/1910.12892v1
PDF https://arxiv.org/pdf/1910.12892v1.pdf
PWC https://paperswithcode.com/paper/hyperbolic-graph-neural-networks
Repo https://github.com/facebookresearch/hgnn
Framework pytorch
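
The core trick, message passing in the tangent space at the origin of the Poincaré ball, admits a compact sketch. This is an illustration under unit curvature, not the facebookresearch/hgnn code.

```python
import torch

def expmap0(v, eps=1e-6):
    # Exponential map at the origin of the unit-curvature Poincare ball.
    n = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.tanh(n) * v / n

def logmap0(x, eps=1e-6):
    # Logarithmic map at the origin (inverse of expmap0).
    n = x.norm(dim=-1, keepdim=True).clamp_min(eps).clamp_max(1 - eps)
    return torch.atanh(n) * x / n

def hyperbolic_gnn_layer(x, adj, weight):
    # x: (N, d) node embeddings on the ball; adj: (N, N) normalized adjacency.
    h = logmap0(x)                 # pull nodes to the tangent space (Euclidean)
    h = adj @ h @ weight           # ordinary GNN aggregation + linear transform
    return expmap0(torch.tanh(h))  # push the result back onto the ball

# Usage on a random 5-node graph.
adj = torch.softmax(torch.randn(5, 5), dim=-1)
x = expmap0(torch.randn(5, 16) * 0.1)
out = hyperbolic_gnn_layer(x, adj, torch.randn(16, 16) * 0.1)
```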

Audio tagging with noisy labels and minimal supervision

Title Audio tagging with noisy labels and minimal supervision
Authors Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Serra
Abstract This paper introduces Task 2 of the DCASE2019 Challenge, titled “Audio tagging with noisy labels and minimal supervision”. This task was hosted on the Kaggle platform as “Freesound Audio Tagging 2019”. The task evaluates systems for multi-label audio tagging using a large set of noisy-labeled data, and a much smaller set of manually-labeled data, under a large vocabulary setting of 80 everyday sound classes. In addition, the proposed dataset poses an acoustic mismatch problem between the noisy train set and the test set due to the fact that they come from different web audio sources. This can correspond to a realistic scenario given by the difficulty in gathering large amounts of manually labeled data. We present the task setup, the FSDKaggle2019 dataset prepared for this scientific evaluation, and a baseline system consisting of a convolutional neural network. All these resources are freely available.
Tasks Audio Tagging
Published 2019-06-07
URL https://arxiv.org/abs/1906.02975v4
PDF https://arxiv.org/pdf/1906.02975v4.pdf
PWC https://paperswithcode.com/paper/audio-tagging-with-noisy-labels-and-minimal
Repo https://github.com/ebouteillon/freesound-audio-tagging-2019
Framework pytorch
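
A generic multi-label tagging model in the spirit of the baseline described above can be sketched as follows; the actual DCASE baseline differs, and all shapes and layer sizes here are illustrative.

```python
import torch
import torch.nn as nn

n_classes = 80  # FSDKaggle2019 vocabulary size
model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(64, n_classes),
)
criterion = nn.BCEWithLogitsLoss()   # multi-label: independent sigmoid per class

x = torch.randn(8, 1, 64, 128)                 # (batch, 1, mel bins, frames)
y = (torch.rand(8, n_classes) > 0.95).float()  # sparse multi-hot labels
loss = criterion(model(x), y)
```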

Adversarial Training for Free!

Title Adversarial Training for Free!
Authors Ali Shafahi, Mahyar Najibi, Amin Ghiasi, Zheng Xu, John Dickerson, Christoph Studer, Larry S. Davis, Gavin Taylor, Tom Goldstein
Abstract Adversarial training, in which a network is trained on adversarial examples, is one of the few defenses against adversarial attacks that withstands strong attacks. Unfortunately, the high cost of generating strong adversarial examples makes standard adversarial training impractical on large-scale problems like ImageNet. We present an algorithm that eliminates the overhead cost of generating adversarial examples by recycling the gradient information computed when updating model parameters. Our “free” adversarial training algorithm achieves comparable robustness to PGD adversarial training on the CIFAR-10 and CIFAR-100 datasets at negligible additional cost compared to natural training, and can be 7 to 30 times faster than other strong adversarial training methods. Using a single workstation with 4 P100 GPUs and 2 days of runtime, we can train a robust model for the large-scale ImageNet classification task that maintains 40% accuracy against PGD attacks. The code is available at https://github.com/ashafahi/free_adv_train.
Tasks Adversarial Attack, Adversarial Defense
Published 2019-04-29
URL https://arxiv.org/abs/1904.12843v2
PDF https://arxiv.org/pdf/1904.12843v2.pdf
PWC https://paperswithcode.com/paper/adversarial-training-for-free
Repo https://github.com/locuslab/fast_adversarial
Framework pytorch
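
The training loop is simple enough to sketch. Below is a compact rendering of the “free” idea, simplified from the paper's algorithm with placeholder model/loader names: each minibatch is replayed m times, and the single backward pass per replay provides both the weight gradients and the input gradient used to refresh the perturbation.

```python
import torch

def free_adv_train(model, loader, optimizer, epsilon=8/255, m=4, device="cpu"):
    criterion = torch.nn.CrossEntropyLoss()
    delta = None  # perturbation, carried across batches as in the paper
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        if delta is None or delta.shape != x.shape:
            delta = torch.zeros_like(x)
        for _ in range(m):  # minibatch replay
            adv = (x + delta).clamp(0, 1).requires_grad_(True)
            loss = criterion(model(adv), y)
            optimizer.zero_grad()
            loss.backward()           # one backward: grads for weights AND input
            optimizer.step()
            # Recycle the input gradient as a free FGSM-style perturbation step.
            delta = (delta + epsilon * adv.grad.sign()).clamp(-epsilon, epsilon).detach()
```

Carrying `delta` across minibatches is what keeps the perturbation strong even though each replay spends only the one gradient computation it would spend on natural training anyway.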

DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion

Title DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion
Authors Mor Geva, Eric Malmi, Idan Szpektor, Jonathan Berant
Abstract Sentence fusion is the task of joining several independent sentences into a single coherent text. Current datasets for sentence fusion are small and insufficient for training modern neural models. In this paper, we propose a method for automatically generating fusion examples from raw text and present DiscoFuse, a large-scale dataset for discourse-based sentence fusion. We author a set of rules for identifying a diverse set of discourse phenomena in raw text and decomposing the text into two independent sentences. We apply our approach to two document collections, Wikipedia and sports articles, yielding 60 million fusion examples annotated with the discourse information required to reconstruct the fused text. We develop a sequence-to-sequence model on DiscoFuse and thoroughly analyze its strengths and weaknesses with respect to the various discourse phenomena, using both automatic and human evaluation. Finally, we conduct transfer learning experiments with WebSplit, a recent dataset for text simplification. We show that pretraining on DiscoFuse substantially improves performance on WebSplit when viewed as a sentence fusion task.
Tasks Text Simplification, Transfer Learning
Published 2019-02-27
URL http://arxiv.org/abs/1902.10526v3
PDF http://arxiv.org/pdf/1902.10526v3.pdf
PWC https://paperswithcode.com/paper/discofuse-a-large-scale-dataset-for-discourse
Repo https://github.com/google-research-datasets/discofuse
Framework none

On Compression of Unsupervised Neural Nets by Pruning Weak Connections

Title On Compression of Unsupervised Neural Nets by Pruning Weak Connections
Authors Zhiwen Zuo, Lei Zhao, Liwen Zuo, Feng Jiang, Wei Xing, Dongming Lu
Abstract Unsupervised neural nets such as Restricted Boltzmann Machines (RBMs) and Deep Belief Networks (DBNs) are powerful tools for automatic feature extraction, unsupervised weight initialization and density estimation. In this paper, we demonstrate that the parameters of these neural nets can be dramatically reduced without affecting their performance. We describe a method to reduce the parameters required by the RBM, which is the basic building block for deep architectures. Further, we propose an unsupervised sparse deep architecture selection algorithm to form sparse deep neural networks. Experimental results show that there is virtually no loss in either generative or discriminative performance.
Tasks Density Estimation
Published 2019-01-21
URL http://arxiv.org/abs/1901.07066v2
PDF http://arxiv.org/pdf/1901.07066v2.pdf
PWC https://paperswithcode.com/paper/on-compression-of-unsupervised-neural-nets-by
Repo https://github.com/ElternalEnVy/tensorflow_rbm
Framework tf
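
Magnitude-based pruning of weak connections, the core operation above, can be sketched in a few lines (illustrative, not the repository's code):

```python
import numpy as np

def prune_weak_connections(W, sparsity=0.9):
    """Zero the weakest `sparsity` fraction of weights; return pruned W and mask."""
    thresh = np.quantile(np.abs(W), sparsity)
    mask = np.abs(W) >= thresh
    return W * mask, mask

rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(784, 500))   # visible x hidden RBM weights
W_pruned, mask = prune_weak_connections(W)
# During further contrastive-divergence training, reapply `mask` after each
# gradient update so pruned weights stay at zero:  W += lr * grad; W *= mask
```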

All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification

Title All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification
Authors Weijie Chen, Di Xie, Yuan Zhang, Shiliang Pu
Abstract The shift operation is an efficient alternative to depthwise separable convolution. However, it is still bottlenecked by its implementation, namely memory movement. To push this direction forward, this paper introduces a novel basic component named the Sparse Shift Layer (SSL) for constructing efficient convolutional neural networks. In this family of architectures, the basic block is composed solely of 1x1 convolutional layers, with only a few shift operations applied to the intermediate feature maps. To make this idea feasible, we introduce a shift-operation penalty during optimization and further propose a quantization-aware shift learning method that makes the learned displacements more inference-friendly. Extensive ablation studies indicate that only a few shift operations are sufficient to provide spatial information communication. Furthermore, to maximize the role of SSL, we redesign an improved network architecture that Fully Exploits the limited capacity of the neural Network (FE-Net). Equipped with SSL, this network achieves 75.0% top-1 accuracy on ImageNet with only 563M multiply-adds. It surpasses other counterparts built on depthwise separable convolution, as well as networks found by neural architecture search, in terms of accuracy and practical speed.
Tasks Image Classification, Quantization
Published 2019-03-13
URL http://arxiv.org/abs/1903.05285v1
PDF http://arxiv.org/pdf/1903.05285v1.pdf
PWC https://paperswithcode.com/paper/all-you-need-is-a-few-shifts-designing
Repo https://github.com/DeadAt0m/ActiveSparseShifts-PyTorch
Framework pytorch
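
The shift idea can be sketched directly. In the snippet below the displacements are fixed rather than learned (the paper learns them with a penalty and quantization-aware training): channel groups are shifted spatially, and all mixing is done by 1x1 convolutions.

```python
import torch
import torch.nn as nn

def shift(x):
    # x: (B, C, H, W); move 4 channel groups one pixel in 4 directions,
    # leave the rest in place. torch.roll stands in for a zero-padded shift.
    out = x.clone()
    g = x.size(1) // 8
    out[:, 0*g:1*g] = torch.roll(x[:, 0*g:1*g], 1, dims=2)   # down
    out[:, 1*g:2*g] = torch.roll(x[:, 1*g:2*g], -1, dims=2)  # up
    out[:, 2*g:3*g] = torch.roll(x[:, 2*g:3*g], 1, dims=3)   # right
    out[:, 3*g:4*g] = torch.roll(x[:, 3*g:4*g], -1, dims=3)  # left
    return out

class ShiftBlock(nn.Module):
    def __init__(self, c_in, c_mid):
        super().__init__()
        self.pw1 = nn.Conv2d(c_in, c_mid, 1)   # 1x1 convs do all the mixing
        self.pw2 = nn.Conv2d(c_mid, c_in, 1)

    def forward(self, x):
        return x + self.pw2(shift(torch.relu(self.pw1(x))))  # residual

y = ShiftBlock(64, 128)(torch.randn(2, 64, 32, 32))
```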

Recurrent Registration Neural Networks for Deformable Image Registration

Title Recurrent Registration Neural Networks for Deformable Image Registration
Authors Robin Sandkühler, Simon Andermatt, Grzegorz Bauman, Sylvia Nyilas, Christoph Jud, Philippe C. Cattin
Abstract Parametric spatial transformation models have been successfully applied to image registration tasks. In such models, the transformation of interest is parameterized by a fixed set of basis functions, for example B-splines. Each basis function is located at a fixed regular grid position across the image domain, because the transformation of interest is not known in advance. As a consequence, not all basis functions will necessarily contribute to the final transformation, which results in a non-compact representation of the transformation. We reformulate the pairwise registration problem as a recursive sequence of successive alignments. For each element in the sequence, a local deformation defined by its position, shape, and weight is computed by our recurrent registration neural network. The sum of all local deformations yields the final spatial alignment of the two images. Formulating the registration problem in this way allows the network to detect non-aligned regions in the images and to learn how to locally refine the registration. In contrast to current non-sequence-based registration methods, our approach iteratively applies local spatial deformations to the images until the desired registration accuracy is achieved. We trained our network on 2D magnetic resonance images of the lung and compared our method to a standard parametric B-spline registration. The experiments show that our method performs on par in terms of accuracy while yielding a more compact representation of the transformation. Furthermore, we achieve a speedup of around 15x compared to the B-spline registration.
Tasks Image Registration
Published 2019-06-07
URL https://arxiv.org/abs/1906.09988v1
PDF https://arxiv.org/pdf/1906.09988v1.pdf
PWC https://paperswithcode.com/paper/recurrent-registration-neural-networks-for
Repo https://github.com/RobinSandkuehler/r2n2
Framework pytorch
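
The transformation model, a sum of local deformations each defined by a position, shape, and weight, can be illustrated as follows; the Gaussian parameterization is an assumption made for this sketch.

```python
import torch

def local_gaussian_deformation(grid, center, sigma, weight):
    # grid: (H, W, 2) pixel coordinates; center: (2,); sigma: scalar width;
    # weight: (2,) displacement vector carried by this local deformation.
    d2 = ((grid - center) ** 2).sum(-1)               # squared distance to center
    return torch.exp(-d2 / (2 * sigma**2)).unsqueeze(-1) * weight  # (H, W, 2)

H = W = 64
ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                        torch.arange(W, dtype=torch.float32), indexing="ij")
grid = torch.stack([xs, ys], dim=-1)

# Sum a sequence of local deformations, as predicted step-by-step by the RNN.
field = torch.zeros(H, W, 2)
for center, sigma, weight in [((20., 20.), 5.0, (1.5, 0.0)),
                              ((45., 40.), 8.0, (0.0, -2.0))]:
    field += local_gaussian_deformation(grid, torch.tensor(center), sigma,
                                        torch.tensor(weight))
```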

Semantic integration of disease-specific knowledge

Title Semantic integration of disease-specific knowledge
Authors Anastasios Nentidis, Konstantinos Bougiatiotis, Anastasia Krithara, Georgios Paliouras
Abstract Biomedical researchers working on a specific disease need up-to-date and unified access to knowledge relevant to the disease of their interest. Knowledge is continuously accumulated in scientific literature and other resources such as biomedical ontologies. Identifying the specific information needed is a challenging task and computational tools can be valuable. In this study, we propose a pipeline to automatically retrieve and integrate relevant knowledge based on a semantic graph representation, the iASiS Open Data Graph. Results: The disease-specific semantic graph can provide easy access to resources relevant to specific concepts and individual aspects of these concepts, in the form of concept relations and attributes. The proposed approach is applied to three different case studies: Two prevalent diseases, Lung Cancer and Dementia, for which a lot of knowledge is available, and one rare disease, Duchenne Muscular Dystrophy, for which knowledge is less abundant and difficult to locate. Results from exemplary queries are presented, investigating the potential of this approach in integrating and accessing knowledge as an automatically generated semantic graph.
Tasks
Published 2019-12-18
URL https://arxiv.org/abs/1912.08633v1
PDF https://arxiv.org/pdf/1912.08633v1.pdf
PWC https://paperswithcode.com/paper/semantic-integration-of-disease-specific
Repo https://github.com/tasosnent/Biomedical-Knowledge-Integration
Framework none

Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting

Title Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting
Authors Yanhong Zeng, Jianlong Fu, Hongyang Chao, Baining Guo
Abstract High-quality image inpainting requires filling missing regions of a damaged image with plausible content. Existing works either fill the regions by copying image patches or by generating semantically coherent patches from the region context, while neglecting the fact that both visual and semantic plausibility are highly demanded. In this paper, we propose a Pyramid-context ENcoder Network (PEN-Net) for image inpainting with deep generative models. PEN-Net is built upon a U-Net structure, which can restore an image by encoding contextual semantics from the full-resolution input and decoding the learned semantic features back into an image. Specifically, we propose a pyramid-context encoder, which progressively learns region affinity by attention from a high-level semantic feature map and transfers the learned attention to the previous low-level feature map. As the missing content can be filled by attention transfer from deep to shallow in a pyramid fashion, both visual and semantic coherence of the inpainting can be ensured. We further propose a multi-scale decoder with deeply supervised pyramid losses and an adversarial loss. This design not only yields fast convergence during training but also more realistic results at test time. Extensive experiments on various datasets show the superior performance of the proposed network.
Tasks Image Inpainting
Published 2019-04-16
URL https://arxiv.org/abs/1904.07475v4
PDF https://arxiv.org/pdf/1904.07475v4.pdf
PWC https://paperswithcode.com/paper/learning-pyramid-context-encoder-network-for
Repo https://github.com/researchmm/PEN-Net-for-Inpainting
Framework tf

Semantics- and Syntax-related Subvectors in the Skip-gram Embeddings

Title Semantics- and Syntax-related Subvectors in the Skip-gram Embeddings
Authors Maxat Tezekbayev, Zhenisbek Assylbekov, Rustem Takhanov
Abstract We show that the skip-gram embedding of any word can be decomposed into two subvectors which roughly correspond to semantic and syntactic roles of the word.
Tasks
Published 2019-12-23
URL https://arxiv.org/abs/1912.13413v1
PDF https://arxiv.org/pdf/1912.13413v1.pdf
PWC https://paperswithcode.com/paper/semantics-and-syntax-related-subvectors-in
Repo https://github.com/MaxatTezekbayev/Semantics--and-Syntax-related-Subvectors-in-the-Skip-gram-Embeddings
Framework tf