Paper Group AWR 73
Parameter elimination in particle Gibbs sampling. A Visually Plausible Grasping System for Object Manipulation and Interaction in Virtual Reality Environments. Recurrent Independent Mechanisms. Explicitly disentangling image content from translation and rotation with spatial-VAE. PHom-GeM: Persistent Homology for Generative Models. Hyperbolic Graph …
Parameter elimination in particle Gibbs sampling
Title | Parameter elimination in particle Gibbs sampling |
Authors | Anna Wigren, Riccardo Sven Risuleo, Lawrence Murray, Fredrik Lindsten |
Abstract | Bayesian inference in state-space models is challenging due to high-dimensional state trajectories. A viable approach is particle Markov chain Monte Carlo, combining MCMC and sequential Monte Carlo to form “exact approximations” to otherwise intractable MCMC methods. The performance of the approximation is limited to that of the exact method. We focus on particle Gibbs and particle Gibbs with ancestor sampling, improving their performance beyond that of the underlying Gibbs sampler (which they approximate) by marginalizing out one or more parameters. This is possible when the parameter prior is conjugate to the complete data likelihood. Marginalization yields a non-Markovian model for inference, but we show that, in contrast to the general case, this method still scales linearly in time. While marginalization can be cumbersome to implement, recent advances in probabilistic programming have enabled its automation. We demonstrate how the marginalized methods are viable as efficient inference backends in probabilistic programming, and illustrate this with examples in ecology and epidemiology. |
Tasks | Bayesian Inference, Epidemiology, Probabilistic Programming |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.14145v1 |
https://arxiv.org/pdf/1910.14145v1.pdf | |
PWC | https://paperswithcode.com/paper/parameter-elimination-in-particle-gibbs |
Repo | https://github.com/uu-sml/neurips2019-parameter-elimination |
Framework | none |
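The key mechanism here, marginalizing a parameter under a conjugate prior while keeping per-step cost constant, can be illustrated with a minimal sketch. The Beta-Bernoulli pair below is an assumption for illustration (the paper targets state-space models in ecology and epidemiology); only sufficient statistics are carried forward, which is why the non-Markovian marginalized model still scales linearly in time.

```python
import numpy as np

def marginal_predictive(successes, trials, alpha=1.0, beta=1.0):
    """Predictive probability of the next observation with the Bernoulli
    parameter marginalized out under a conjugate Beta(alpha, beta) prior."""
    return (alpha + successes) / (alpha + beta + trials)

rng = np.random.default_rng(0)
successes = trials = 0
draws = []
for t in range(10):
    p = marginal_predictive(successes, trials)  # O(1) per step: only the
    x = int(rng.random() < p)                   # sufficient statistics are kept
    successes += x
    trials += 1
    draws.append(x)
print(draws)
```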
A Visually Plausible Grasping System for Object Manipulation and Interaction in Virtual Reality Environments
Title | A Visually Plausible Grasping System for Object Manipulation and Interaction in Virtual Reality Environments |
Authors | Sergiu Oprea, Pablo Martinez-Gonzalez, Alberto Garcia-Garcia, John Alejandro Castro-Vargas, Sergio Orts-Escolano, Jose Garcia-Rodriguez |
Abstract | Interaction in virtual reality (VR) environments is essential to achieve a pleasant and immersive experience. Most existing VR applications lack robust object grasping and manipulation, which are the cornerstone of interactive systems. Therefore, we propose a realistic, flexible and robust grasping system that enables rich and real-time interactions in virtual environments. It is visually realistic because it is completely user-controlled, flexible because it can be used for different hand configurations, and robust because it allows the manipulation of objects regardless of their geometry, i.e. the hand is automatically fitted to the object shape. To validate our proposal, an exhaustive qualitative and quantitative performance analysis has been carried out. On the one hand, qualitative evaluation was used to assess abstract aspects such as hand movement realism, interaction realism and motor control. On the other hand, for the quantitative evaluation a novel error metric has been proposed to visually analyze the performed grips. This metric is based on the computation of the distance from the finger phalanges to the nearest contact point on the object surface. These contact points can be used for different application purposes, mainly in the field of robotics. In conclusion, the system evaluation reports similar performance between users with previous experience in virtual reality applications and inexperienced users, indicating a fast learning curve. |
Tasks | |
Published | 2019-03-12 |
URL | http://arxiv.org/abs/1903.05238v1 |
http://arxiv.org/pdf/1903.05238v1.pdf | |
PWC | https://paperswithcode.com/paper/a-visually-plausible-grasping-system-for |
Repo | https://github.com/3dperceptionlab/unrealgrasp |
Framework | none |
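The proposed error metric, the distance from each finger phalanx to the nearest contact point on the object surface, is straightforward to compute given sampled surface points. A minimal sketch (the array shapes and the aggregation by mean are assumptions; the paper defines the metric over the performed grips):

```python
import numpy as np

def grasp_error(phalanx_points, surface_points):
    """Mean distance from each phalanx point (N, 3) to its nearest
    point on the sampled object surface (M, 3)."""
    diffs = phalanx_points[:, None, :] - surface_points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)     # (N, M) pairwise distances
    return dists.min(axis=1).mean()            # nearest contact per phalanx

phalanges = np.array([[0.00, 0.00, 0.010],
                      [0.02, 0.00, 0.012]])
surface = np.random.default_rng(1).normal(size=(500, 3)) * 0.05
print(grasp_error(phalanges, surface))
```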
Recurrent Independent Mechanisms
Title | Recurrent Independent Mechanisms |
Authors | Anirudh Goyal, Alex Lamb, Jordan Hoffmann, Shagun Sodhani, Sergey Levine, Yoshua Bengio, Bernhard Schölkopf |
Abstract | Learning modular structures which reflect the dynamics of the environment can lead to better generalization and robustness to changes which only affect a few of the underlying causes. We propose Recurrent Independent Mechanisms (RIMs), a new recurrent architecture in which multiple groups of recurrent cells operate with nearly independent transition dynamics, communicate only sparingly through the bottleneck of attention, and are only updated at time steps where they are most relevant. We show that this leads to specialization amongst the RIMs, which in turn allows for dramatically improved generalization on tasks where some factors of variation differ systematically between training and evaluation. |
Tasks | |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.10893v2 |
https://arxiv.org/pdf/1909.10893v2.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-independent-mechanisms |
Repo | https://github.com/maximecb/gym-minigrid |
Framework | pytorch |
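A toy sketch of the central update rule, with shared weights and input attention only (the actual RIM architecture uses per-module parameters and an extra round of attention-based communication between modules): each module scores its relevance to the current input, and only the top-k modules update their state.

```python
import numpy as np

def rim_step(h, x, W_q, W_k, W_hh, W_xh, k_active=2):
    """One step of a toy RIM-style update: modules attend to the input
    and only the k most relevant ones change state."""
    scores = (h @ W_q) @ (x @ W_k)          # relevance of the input per module
    active = np.argsort(scores)[-k_active:] # sparse activation
    h_new = h.copy()                        # inactive modules keep their state
    h_new[active] = np.tanh(h[active] @ W_hh + x @ W_xh)
    return h_new

rng = np.random.default_rng(0)
n_modules, d_h, d_x, d_att = 4, 8, 5, 6
h = rng.normal(size=(n_modules, d_h))
x = rng.normal(size=d_x)
h = rim_step(h, x, rng.normal(size=(d_h, d_att)), rng.normal(size=(d_x, d_att)),
             rng.normal(size=(d_h, d_h)), rng.normal(size=(d_x, d_h)))
```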
Explicitly disentangling image content from translation and rotation with spatial-VAE
Title | Explicitly disentangling image content from translation and rotation with spatial-VAE |
Authors | Tristan Bepler, Ellen D. Zhong, Kotaro Kelley, Edward Brignole, Bonnie Berger |
Abstract | Given an image dataset, we are often interested in finding data generative factors that encode semantic content independently from pose variables such as rotation and translation. However, current disentanglement approaches do not impose any specific structure on the learned latent representations. We propose a method for explicitly disentangling image rotation and translation from other unstructured latent factors in a variational autoencoder (VAE) framework. By formulating the generative model as a function of the spatial coordinate, we make the reconstruction error differentiable with respect to latent translation and rotation parameters. This formulation allows us to train a neural network to perform approximate inference on these latent variables while explicitly constraining them to only represent rotation and translation. We demonstrate that this framework, termed spatial-VAE, effectively learns latent representations that disentangle image rotation and translation from content and improves reconstruction over standard VAEs on several benchmark datasets, including applications to modeling continuous 2-D views of proteins from single particle electron microscopy and galaxies in astronomical images. |
Tasks | |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11663v1 |
https://arxiv.org/pdf/1909.11663v1.pdf | |
PWC | https://paperswithcode.com/paper/explicitly-disentangling-image-content-from |
Repo | https://github.com/tbepler/spatial-VAE |
Framework | pytorch |
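The trick that makes rotation and translation explicit is to let the decoder act on spatial coordinates and transform the coordinate grid rather than the image. A minimal sketch with a stand-in decoder (the real spatial-VAE decoder is a neural network conditioned on the unstructured latent):

```python
import numpy as np

def transform_coords(coords, theta, dx, dy):
    """Rotate and translate an (N, 2) coordinate grid; because the decoder
    is evaluated on these coordinates, the reconstruction error is
    differentiable w.r.t. theta, dx and dy."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return coords @ R.T + np.array([dx, dy])

def decoder(coords, z):
    """Stand-in for the spatial generator network g(coordinate, z)."""
    return np.sin(coords @ z[:2] + z[2])

side = np.linspace(-1, 1, 28)
grid = np.stack(np.meshgrid(side, side), axis=-1).reshape(-1, 2)
z = np.array([1.5, -0.7, 0.3])                       # unstructured latent
image = decoder(transform_coords(grid, 0.4, 0.1, 0.0), z).reshape(28, 28)
```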
PHom-GeM: Persistent Homology for Generative Models
Title | PHom-GeM: Persistent Homology for Generative Models |
Authors | Jeremy Charlier, Radu State, Jean Hilger |
Abstract | Generative neural network models, including Generative Adversarial Networks (GANs) and Auto-Encoders (AEs), are among the most popular neural network models for generating adversarial data. The GAN model is composed of a generator that produces synthetic data and of a discriminator that discriminates between the generator’s output and the true data. AEs consist of an encoder which maps the model distribution to a latent manifold and of a decoder which maps the latent manifold to a reconstructed distribution. However, generative models are known to produce chaotically scattered reconstructed distributions during training and, consequently, incomplete generated adversarial distributions. Current distance measures fail to address this problem because they cannot acknowledge the shape of the data manifold, i.e. its topological features, and the scale at which the manifold should be analyzed. We propose Persistent Homology for Generative Models, PHom-GeM, a new methodology to assess and measure the distribution of a generative model. PHom-GeM minimizes an objective function between the true and the reconstructed distributions and uses persistent homology, the study of the topological features of a space at different spatial resolutions, to compare the nature of the true and the generated distributions. Our experiments underline the potential of persistent homology for Wasserstein GANs in comparison to Wasserstein AEs and Variational AEs. The experiments are conducted on a real-world data set that is particularly challenging for traditional distance measures and generative neural network models. PHom-GeM is the first methodology to propose a topological distance measure, the bottleneck distance, for generative models, used to compare adversarial samples in the context of credit card transactions. |
Tasks | |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09894v1 |
https://arxiv.org/pdf/1905.09894v1.pdf | |
PWC | https://paperswithcode.com/paper/phom-gem-persistent-homology-for-generative |
Repo | https://github.com/dagrate/phomgem |
Framework | none |
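The topological comparison at the heart of PHom-GeM can be sketched with off-the-shelf tools. The snippet below assumes the `ripser` and `persim` packages and uses synthetic point clouds in place of the true and generated samples; the paper's full pipeline also trains the generative models and works on credit card transaction data.

```python
import numpy as np
from ripser import ripser
from persim import bottleneck

rng = np.random.default_rng(0)
true_samples = rng.normal(size=(200, 2))       # stand-in for real data
generated = rng.normal(size=(200, 2)) + 0.1    # stand-in for model samples

# H1 persistence diagrams of each point cloud
dgm_true = ripser(true_samples)['dgms'][1]
dgm_gen = ripser(generated)['dgms'][1]

# Bottleneck distance between diagrams: a topological measure of how
# well the generated distribution matches the data
print(bottleneck(dgm_true, dgm_gen))
```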
Hyperbolic Graph Neural Networks
Title | Hyperbolic Graph Neural Networks |
Authors | Qi Liu, Maximilian Nickel, Douwe Kiela |
Abstract | Learning from graph-structured data is an important task in machine learning and artificial intelligence, for which Graph Neural Networks (GNNs) have shown great promise. Motivated by recent advances in geometric representation learning, we propose a novel GNN architecture for learning representations on Riemannian manifolds with differentiable exponential and logarithmic maps. We develop a scalable algorithm for modeling the structural properties of graphs, comparing Euclidean and hyperbolic geometry. In our experiments, we show that hyperbolic GNNs can lead to substantial improvements on various benchmark datasets. |
Tasks | Representation Learning |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12892v1 |
https://arxiv.org/pdf/1910.12892v1.pdf | |
PWC | https://paperswithcode.com/paper/hyperbolic-graph-neural-networks |
Repo | https://github.com/facebookresearch/hgnn |
Framework | pytorch |
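The differentiable exponential and logarithmic maps are concrete formulas on, for example, the Poincaré ball (curvature -1, taken at the origin here). A minimal sketch of the aggregate-in-tangent-space pattern (the paper's architecture is more general and handles other manifolds and attention-based aggregation):

```python
import numpy as np

def exp0(v, eps=1e-9):
    """Exponential map at the origin of the Poincare ball:
    tangent vector -> manifold point."""
    n = np.linalg.norm(v) + eps
    return np.tanh(n) * v / n

def log0(x, eps=1e-9):
    """Logarithmic map at the origin: manifold point -> tangent vector."""
    n = np.linalg.norm(x) + eps
    return np.arctanh(min(n, 1 - eps)) * x / n

# a hyperbolic GNN layer can aggregate neighbor messages in tangent
# space and map the result back onto the manifold
neighbors = np.array([[0.1, 0.2], [0.3, -0.1], [-0.2, 0.05]])
aggregated = exp0(np.mean([log0(p) for p in neighbors], axis=0))
print(aggregated)
```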
Audio tagging with noisy labels and minimal supervision
Title | Audio tagging with noisy labels and minimal supervision |
Authors | Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Serra |
Abstract | This paper introduces Task 2 of the DCASE2019 Challenge, titled “Audio tagging with noisy labels and minimal supervision”. This task was hosted on the Kaggle platform as “Freesound Audio Tagging 2019”. The task evaluates systems for multi-label audio tagging using a large set of noisy-labeled data and a much smaller set of manually-labeled data, under a large-vocabulary setting of 80 everyday sound classes. In addition, the proposed dataset poses an acoustic mismatch problem between the noisy train set and the test set because they come from different web audio sources. This corresponds to a realistic scenario, given the difficulty of gathering large amounts of manually labeled data. We present the task setup, the FSDKaggle2019 dataset prepared for this scientific evaluation, and a baseline system consisting of a convolutional neural network. All these resources are freely available. |
Tasks | Audio Tagging |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.02975v4 |
https://arxiv.org/pdf/1906.02975v4.pdf | |
PWC | https://paperswithcode.com/paper/audio-tagging-with-noisy-labels-and-minimal |
Repo | https://github.com/ebouteillon/freesound-audio-tagging-2019 |
Framework | pytorch |
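Multi-label tagging over 80 classes boils down to one sigmoid output per class trained with binary cross-entropy. A minimal PyTorch sketch (the architecture here is an assumption; the DCASE baseline CNN differs in its exact layers and input features):

```python
import torch
import torch.nn as nn

n_classes = 80  # FSDKaggle2019 vocabulary size

model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, n_classes),
)

mel = torch.randn(8, 1, 96, 64)                       # batch of log-mel patches
labels = torch.randint(0, 2, (8, n_classes)).float()  # multi-hot targets
loss = nn.BCEWithLogitsLoss()(model(mel), labels)     # multi-label objective
loss.backward()
```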
Adversarial Training for Free!
Title | Adversarial Training for Free! |
Authors | Ali Shafahi, Mahyar Najibi, Amin Ghiasi, Zheng Xu, John Dickerson, Christoph Studer, Larry S. Davis, Gavin Taylor, Tom Goldstein |
Abstract | Adversarial training, in which a network is trained on adversarial examples, is one of the few defenses against adversarial attacks that withstands strong attacks. Unfortunately, the high cost of generating strong adversarial examples makes standard adversarial training impractical on large-scale problems like ImageNet. We present an algorithm that eliminates the overhead cost of generating adversarial examples by recycling the gradient information computed when updating model parameters. Our “free” adversarial training algorithm achieves comparable robustness to PGD adversarial training on the CIFAR-10 and CIFAR-100 datasets at negligible additional cost compared to natural training, and can be 7 to 30 times faster than other strong adversarial training methods. Using a single workstation with 4 P100 GPUs and 2 days of runtime, we can train a robust model for the large-scale ImageNet classification task that maintains 40% accuracy against PGD attacks. The code is available at https://github.com/ashafahi/free_adv_train. |
Tasks | Adversarial Attack, Adversarial Defense |
Published | 2019-04-29 |
URL | https://arxiv.org/abs/1904.12843v2 |
https://arxiv.org/pdf/1904.12843v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-training-for-free |
Repo | https://github.com/locuslab/fast_adversarial |
Framework | pytorch |
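The core idea, reusing the gradient of one backward pass for both the weight update and the perturbation update while replaying each minibatch m times, fits in a few lines. A PyTorch sketch (function and variable names are illustrative; see the linked repository for the authors' implementation):

```python
import torch

def free_train_batch(model, loss_fn, opt, x, y, delta, eps=8 / 255, m=4):
    """One 'free' adversarial training step: m replays of the minibatch,
    each backward pass updating both weights and perturbation."""
    for _ in range(m):
        delta.requires_grad_(True)
        loss = loss_fn(model(x + delta), y)
        opt.zero_grad()
        loss.backward()          # one backward pass yields both gradients
        opt.step()               # weight update
        with torch.no_grad():    # perturbation update from the same pass
            delta = (delta + eps * delta.grad.sign()).clamp(-eps, eps)
    return delta.detach()
```

In the full algorithm the returned perturbation is carried over to the next minibatch, so no extra attack iterations are ever run on top of normal training.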
DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion
Title | DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion |
Authors | Mor Geva, Eric Malmi, Idan Szpektor, Jonathan Berant |
Abstract | Sentence fusion is the task of joining several independent sentences into a single coherent text. Current datasets for sentence fusion are small and insufficient for training modern neural models. In this paper, we propose a method for automatically generating fusion examples from raw text and present DiscoFuse, a large-scale dataset for discourse-based sentence fusion. We author a set of rules for identifying a diverse set of discourse phenomena in raw text and decomposing the text into two independent sentences. We apply our approach to two document collections, Wikipedia and Sports articles, yielding 60 million fusion examples annotated with the discourse information required to reconstruct the fused text. We develop a sequence-to-sequence model on DiscoFuse and thoroughly analyze its strengths and weaknesses with respect to the various discourse phenomena, using both automatic and human evaluation. Finally, we conduct transfer learning experiments with WebSplit, a recent dataset for text simplification. We show that pretraining on DiscoFuse substantially improves performance on WebSplit when viewed as a sentence fusion task. |
Tasks | Text Simplification, Transfer Learning |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10526v3 |
http://arxiv.org/pdf/1902.10526v3.pdf | |
PWC | https://paperswithcode.com/paper/discofuse-a-large-scale-dataset-for-discourse |
Repo | https://github.com/google-research-datasets/discofuse |
Framework | none |
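A toy version of the rule-based example generation (the connective list and splitting heuristic are illustrative; the paper's rules cover a much wider range of discourse phenomena): detect a discourse connective and decompose the fused text into two independent sentences plus a discourse annotation.

```python
CONNECTIVES = ("However, ", "Therefore, ", "Moreover, ")

def decompose(fused):
    """Turn 'A. <connective> B.' into the pair ('A.', 'B.') plus the
    connective that linked them."""
    first, second = fused.split(". ", 1)
    for conn in CONNECTIVES:
        if second.startswith(conn):
            plain = second[len(conn):]
            plain = plain[0].upper() + plain[1:]
            return (first + ".", plain), conn.strip(", ")
    return None

fused = "The match was delayed. However, the fans stayed in their seats."
pair, phenomenon = decompose(fused)
print(pair, "| connective:", phenomenon)
```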
On Compression of Unsupervised Neural Nets by Pruning Weak Connections
Title | On Compression of Unsupervised Neural Nets by Pruning Weak Connections |
Authors | Zhiwen Zuo, Lei Zhao, Liwen Zuo, Feng Jiang, Wei Xing, Dongming Lu |
Abstract | Unsupervised neural nets such as Restricted Boltzmann Machines (RBMs) and Deep Belief Networks (DBNs) are powerful in automatic feature extraction, unsupervised weight initialization and density estimation. In this paper, we demonstrate that the parameters of these neural nets can be dramatically reduced without affecting their performance. We describe a method to reduce the parameters required by an RBM, which is the basic building block for deep architectures. Further, we propose an unsupervised sparse deep architecture selection algorithm to form sparse deep neural networks. Experimental results show that there is virtually no loss in either generative or discriminative performance. |
Tasks | Density Estimation |
Published | 2019-01-21 |
URL | http://arxiv.org/abs/1901.07066v2 |
http://arxiv.org/pdf/1901.07066v2.pdf | |
PWC | https://paperswithcode.com/paper/on-compression-of-unsupervised-neural-nets-by |
Repo | https://github.com/ElternalEnVy/tensorflow_rbm |
Framework | tf |
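Magnitude-based pruning of weak connections reduces to thresholding the weight matrix. A minimal sketch for an RBM weight matrix (the percentile threshold is an assumption; the paper's selection algorithm operates over the whole deep architecture):

```python
import numpy as np

def prune_weak(W, q=80.0):
    """Zero out connections whose magnitude is below the q-th percentile
    of |W|; the mask is kept so pruned weights stay zero when retraining."""
    tau = np.percentile(np.abs(W), q)
    mask = np.abs(W) >= tau
    return W * mask, mask

rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(784, 500))   # visible x hidden weights
W_sparse, mask = prune_weak(W, q=80.0)
print(1.0 - mask.mean())                      # fraction pruned (~0.8)
```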
All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification
Title | All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification |
Authors | Weijie Chen, Di Xie, Yuan Zhang, Shiliang Pu |
Abstract | The shift operation is an efficient alternative to depthwise separable convolution. However, it is still bottlenecked by its implementation, namely memory movement. To push this direction forward, this paper introduces a new basic component, the Sparse Shift Layer (SSL), for constructing efficient convolutional neural networks. In this family of architectures, the basic block is composed only of 1x1 convolutional layers, with only a few shift operations applied to the intermediate feature maps. To make this idea feasible, we introduce a shift operation penalty during optimization and further propose a quantization-aware shift learning method that makes the learned displacements more inference-friendly. Extensive ablation studies indicate that only a few shift operations are sufficient to provide spatial information communication. Furthermore, to maximize the role of SSL, we redesign an improved network architecture to Fully Exploit the limited capacity of the neural Network (FE-Net). Equipped with SSL, this network achieves 75.0% top-1 accuracy on ImageNet with only 563M multiply-adds. It surpasses other counterparts constructed with depthwise separable convolution, as well as networks found by NAS, in terms of accuracy and practical speed. |
Tasks | Image Classification, Quantization |
Published | 2019-03-13 |
URL | http://arxiv.org/abs/1903.05285v1 |
http://arxiv.org/pdf/1903.05285v1.pdf | |
PWC | https://paperswithcode.com/paper/all-you-need-is-a-few-shifts-designing |
Repo | https://github.com/DeadAt0m/ActiveSparseShifts-PyTorch |
Framework | pytorch |
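The shift operation itself is just a per-channel spatial displacement, and sparsity means most channels do not move at all. A minimal sketch (np.roll wraps around at the borders, whereas real implementations zero-pad; learning which few channels shift, and by how much, is what SSL adds):

```python
import numpy as np

def sparse_shift(fmap, offsets):
    """Apply per-channel integer spatial shifts to a (C, H, W) feature map."""
    out = np.zeros_like(fmap)
    for c, (dy, dx) in enumerate(offsets):
        out[c] = np.roll(fmap[c], (dy, dx), axis=(0, 1))
    return out

# basic block pattern: 1x1 conv -> sparse shift -> 1x1 conv
fmap = np.random.default_rng(0).normal(size=(8, 16, 16))
offsets = [(0, 0)] * 8                       # most channels stay in place
offsets[2], offsets[5] = (1, 0), (0, -1)     # only a few channels shift
mixed = sparse_shift(fmap, offsets)
```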
Recurrent Registration Neural Networks for Deformable Image Registration
Title | Recurrent Registration Neural Networks for Deformable Image Registration |
Authors | Robin Sandkühler, Simon Andermatt, Grzegorz Bauman, Sylvia Nyilas, Christoph Jud, Philippe C. Cattin |
Abstract | Parametric spatial transformation models have been successfully applied to image registration tasks. In such models, the transformation of interest is parameterized by a fixed set of basis functions, for example B-splines. Each basis function is located at a fixed regular grid position across the image domain, because the transformation of interest is not known in advance. As a consequence, not all basis functions will necessarily contribute to the final transformation, which results in a non-compact representation of the transformation. We reformulate the pairwise registration problem as a recursive sequence of successive alignments. For each element in the sequence, a local deformation defined by its position, shape, and weight is computed by our recurrent registration neural network. The sum of all local deformations yields the final spatial alignment of both images. Formulating the registration problem in this way allows the network to detect non-aligned regions in the images and to learn how to locally refine the registration properly. In contrast to current non-sequence-based registration methods, our approach iteratively applies local spatial deformations to the images until the desired registration accuracy is achieved. We trained our network on 2D magnetic resonance images of the lung and compared our method to a standard parametric B-spline registration. The experiments show that our method performs on par in accuracy but yields a more compact representation of the transformation. Furthermore, we achieve a speedup of around 15x compared to the B-spline registration. |
Tasks | Image Registration |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.09988v1 |
https://arxiv.org/pdf/1906.09988v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-registration-neural-networks-for |
Repo | https://github.com/RobinSandkuehler/r2n2 |
Framework | pytorch |
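Each element of the registration sequence is a local deformation given by a position, a shape and a weight, and the final transformation is their sum. A minimal sketch with Gaussian-shaped local displacement fields (this parameterization is an assumption consistent with the abstract; the recurrent network that emits the parameters is omitted):

```python
import numpy as np

def local_deformation(grid, center, shape, weight):
    """A Gaussian-shaped local displacement field on a (H, W, 2) grid,
    defined by its position, shape (bandwidth) and weight (2-vector)."""
    d2 = ((grid - center) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * shape ** 2))[..., None] * weight

side = np.linspace(-1, 1, 64)
grid = np.stack(np.meshgrid(side, side), axis=-1)        # (64, 64, 2)

# one (center, shape, weight) triple per step of the sequence
steps = [((0.2, -0.1), 0.15, np.array([0.03, 0.00])),
         ((-0.4, 0.3), 0.20, np.array([0.00, -0.02]))]
field = sum(local_deformation(grid, np.array(c), s, w) for c, s, w in steps)
warped_coords = grid + field                             # final alignment
```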
Semantic integration of disease-specific knowledge
Title | Semantic integration of disease-specific knowledge |
Authors | Anastasios Nentidis, Konstantinos Bougiatiotis, Anastasia Krithara, Georgios Paliouras |
Abstract | Biomedical researchers working on a specific disease need up-to-date and unified access to knowledge relevant to the disease of their interest. Knowledge is continuously accumulated in scientific literature and other resources such as biomedical ontologies. Identifying the specific information needed is a challenging task and computational tools can be valuable. In this study, we propose a pipeline to automatically retrieve and integrate relevant knowledge based on a semantic graph representation, the iASiS Open Data Graph. Results: The disease-specific semantic graph can provide easy access to resources relevant to specific concepts and individual aspects of these concepts, in the form of concept relations and attributes. The proposed approach is applied to three different case studies: Two prevalent diseases, Lung Cancer and Dementia, for which a lot of knowledge is available, and one rare disease, Duchenne Muscular Dystrophy, for which knowledge is less abundant and difficult to locate. Results from exemplary queries are presented, investigating the potential of this approach in integrating and accessing knowledge as an automatically generated semantic graph. |
Tasks | |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08633v1 |
https://arxiv.org/pdf/1912.08633v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-integration-of-disease-specific |
Repo | https://github.com/tasosnent/Biomedical-Knowledge-Integration |
Framework | none |
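The disease-specific graph boils down to concepts as nodes and typed relations as edges, queried per concept. A toy sketch with networkx (the node and relation names below are illustrative, not taken from the actual iASiS Open Data Graph):

```python
import networkx as nx

G = nx.MultiDiGraph()
G.add_edge("Duchenne Muscular Dystrophy", "DMD gene", relation="associated_with")
G.add_edge("Duchenne Muscular Dystrophy", "Deflazacort", relation="treated_by")
G.add_edge("Deflazacort", "Corticosteroid", relation="is_a")

# exemplary query: all outgoing relations of one concept
concept = "Duchenne Muscular Dystrophy"
for _, target, data in G.out_edges(concept, data=True):
    print(f"{concept} --{data['relation']}--> {target}")
```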
Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting
Title | Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting |
Authors | Yanhong Zeng, Jianlong Fu, Hongyang Chao, Baining Guo |
Abstract | High-quality image inpainting requires filling missing regions in a damaged image with plausible content. Existing works either fill the regions by copying image patches or by generating semantically coherent patches from the region context, while neglecting the fact that both visual and semantic plausibility are highly demanded. In this paper, we propose a Pyramid-context ENcoder Network (PEN-Net) for image inpainting by deep generative models. The PEN-Net is built upon a U-Net structure, which can restore an image by encoding contextual semantics from the full-resolution input and decoding the learned semantic features back into images. Specifically, we propose a pyramid-context encoder, which progressively learns region affinity by attention from a high-level semantic feature map and transfers the learned attention to the previous low-level feature map. As the missing content can be filled by attention transfer from deep to shallow in a pyramid fashion, both visual and semantic coherence for image inpainting can be ensured. We further propose a multi-scale decoder with deeply-supervised pyramid losses and an adversarial loss. Such a design not only results in fast convergence in training, but also more realistic results in testing. Extensive experiments on various datasets show the superior performance of the proposed network. |
Tasks | Image Inpainting |
Published | 2019-04-16 |
URL | https://arxiv.org/abs/1904.07475v4 |
https://arxiv.org/pdf/1904.07475v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-pyramid-context-encoder-network-for |
Repo | https://github.com/researchmm/PEN-Net-for-Inpainting |
Framework | tf |
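The pyramid-context encoder's key step is computing region affinity on a high-level (semantic) feature map and reusing it to fill holes at a lower level. A simplified sketch on flattened feature maps (the real PEN-Net works on 2-D patches across several pyramid levels):

```python
import numpy as np

def attention_transfer(high, low, hole):
    """Fill masked positions of a low-level map using affinities computed
    on the high-level map. high: (N, C1), low: (N, C2), hole: (N,) bool."""
    sim = high[hole] @ high[~hole].T              # region affinity (holes x known)
    sim -= sim.max(axis=1, keepdims=True)         # stabilize the softmax
    att = np.exp(sim) / np.exp(sim).sum(axis=1, keepdims=True)
    filled = low.copy()
    filled[hole] = att @ low[~hole]               # transfer attention downward
    return filled

rng = np.random.default_rng(0)
high, low = rng.normal(size=(64, 16)), rng.normal(size=(64, 32))
hole = np.zeros(64, dtype=bool)
hole[20:28] = True                                # missing region
out = attention_transfer(high, low, hole)
```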
Semantics- and Syntax-related Subvectors in the Skip-gram Embeddings
Title | Semantics- and Syntax-related Subvectors in the Skip-gram Embeddings |
Authors | Maxat Tezekbayev, Zhenisbek Assylbekov, Rustem Takhanov |
Abstract | We show that the skip-gram embedding of any word can be decomposed into two subvectors which roughly correspond to semantic and syntactic roles of the word. |
Tasks | |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.13413v1 |
https://arxiv.org/pdf/1912.13413v1.pdf | |
PWC | https://paperswithcode.com/paper/semantics-and-syntax-related-subvectors-in |
Repo | https://github.com/MaxatTezekbayev/Semantics--and-Syntax-related-Subvectors-in-the-Skip-gram-Embeddings |
Framework | tf |