October 20, 2019

Paper Group AWR 249

jLDADMM: A Java package for the LDA and DMM topic models. Stochastic Gradient MCMC with Repulsive Forces. Informed MCMC with Bayesian Neural Networks for Facial Image Analysis. Synthetic Lung Nodule 3D Image Generation Using Autoencoders. Synthesizing Neural Network Controllers with Probabilistic Model based Reinforcement Learning. Dual Asymmetric …

jLDADMM: A Java package for the LDA and DMM topic models

Title jLDADMM: A Java package for the LDA and DMM topic models
Authors Dat Quoc Nguyen
Abstract In this technical report, we present jLDADMM—an easy-to-use Java toolkit for conventional topic models. jLDADMM is released to provide alternatives for topic modeling on normal or short texts. It provides implementations of the Latent Dirichlet Allocation topic model and the one-topic-per-document Dirichlet Multinomial Mixture model (i.e. mixture of unigrams), using collapsed Gibbs sampling. In addition, jLDADMM supplies a document clustering evaluation to compare topic models. jLDADMM is open-source and available to download at: https://github.com/datquocnguyen/jLDADMM
Tasks Topic Models
Published 2018-08-11
URL http://arxiv.org/abs/1808.03835v1
PDF http://arxiv.org/pdf/1808.03835v1.pdf
PWC https://paperswithcode.com/paper/jldadmm-a-java-package-for-the-lda-and-dmm
Repo https://github.com/datquocnguyen/jLDADMM
Framework none
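
jLDADMM's samplers are collapsed Gibbs samplers. As a rough illustration of the per-word LDA update it performs (this is not the toolkit's Java code; the hyperparameter values and variable names are made up), here is a minimal NumPy sketch:

```python
import numpy as np

def lda_collapsed_gibbs(docs, V, K, alpha=0.1, beta=0.01, iters=200):
    """Collapsed Gibbs sampling for LDA. docs: list of word-id lists."""
    D = len(docs)
    ndk = np.zeros((D, K))  # document-topic counts
    nkw = np.zeros((K, V))  # topic-word counts
    nk = np.zeros(K)        # per-topic totals
    z = [np.random.randint(K, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]  # remove the current assignment from the counts
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # p(z = k | rest) is proportional to (n_dk + alpha) * (n_kw + beta) / (n_k + V*beta)
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                k = np.random.choice(K, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return ndk, nkw  # unnormalized doc-topic and topic-word distributions
```

The DMM sampler the toolkit also provides differs mainly in that it resamples a single topic per document rather than per word.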

Stochastic Gradient MCMC with Repulsive Forces

Title Stochastic Gradient MCMC with Repulsive Forces
Authors Victor Gallego, David Rios Insua
Abstract We propose a unifying view of two different Bayesian inference algorithms, Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) and Stein Variational Gradient Descent (SVGD), leading to improved and efficient novel sampling schemes. We show that SVGD combined with a noise term can be framed as a multiple-chain SG-MCMC method. Instead of treating each parallel chain independently of the others, our proposed algorithm implements a repulsive force between particles, avoiding collapse and facilitating better exploration of the parameter space. We also show that the addition of this noise term is necessary to obtain a valid SG-MCMC sampler, a significant difference from SVGD. Experiments with both synthetic distributions and real datasets illustrate the benefits of the proposed scheme.
Tasks Bayesian Inference
Published 2018-11-30
URL https://arxiv.org/abs/1812.00071v2
PDF https://arxiv.org/pdf/1812.00071v2.pdf
PWC https://paperswithcode.com/paper/stochastic-gradient-mcmc-with-repulsive
Repo https://github.com/vicgalle/sgmcmc-force
Framework jax
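
A minimal NumPy sketch of the particle update the abstract describes, assuming an RBF kernel and a fixed step size (both choices are illustrative; the repo's actual implementation is written against JAX):

```python
import numpy as np

def repulsive_sgmcmc_step(theta, grad_logp, eps=1e-3, h=1.0):
    """One update of n parallel chains. theta: (n, d) particle positions;
    grad_logp: (n, d) stochastic gradients of the log posterior at each
    particle. The kernel-weighted drift adds an SVGD-style repulsion
    between chains; the injected Gaussian noise is what makes this a
    valid SG-MCMC sampler rather than plain SVGD."""
    n, d = theta.shape
    diff = theta[:, None, :] - theta[None, :, :]            # theta_i - theta_j
    K = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * h ** 2))  # RBF kernel matrix
    repulsion = np.einsum('ij,ijd->id', K, diff) / h ** 2   # pushes particles apart
    drift = (K @ grad_logp + repulsion) / n
    return theta + eps * drift + np.sqrt(2 * eps) * np.random.randn(n, d)
```

Dropping the noise term recovers plain SVGD; shrinking the kernel bandwidth toward zero decouples the particles into independent Langevin-style chains.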

Informed MCMC with Bayesian Neural Networks for Facial Image Analysis

Title Informed MCMC with Bayesian Neural Networks for Facial Image Analysis
Authors Adam Kortylewski, Mario Wieser, Andreas Morel-Forster, Aleksander Wieczorek, Sonali Parbhoo, Volker Roth, Thomas Vetter
Abstract Computer vision tasks are difficult because of the large variability in the data that is induced by changes in light, background, and partial occlusion, as well as the varying pose, texture, and shape of objects. Generative approaches to computer vision allow us to overcome this difficulty by explicitly modeling the physical image formation process. Using generative object models, the analysis of an observed image is performed via Bayesian inference of the posterior distribution. This conceptually simple approach tends to fail in practice because of several difficulties in sampling the posterior distribution: the high dimensionality and multi-modality of the posterior, as well as the expensive simulation of the rendering process. The main difficulty of sampling approaches in a computer vision context is choosing the proposal distribution well, so that maxima of the posterior are explored early and the algorithm quickly converges to a valid image interpretation. In this work, we propose to use a Bayesian Neural Network to estimate an image-dependent proposal distribution. Compared to a standard Gaussian random walk proposal, this accelerates the sampler in finding high-value regions of the posterior. In this way, we can significantly reduce the number of samples needed to perform facial image analysis.
Tasks Bayesian Inference
Published 2018-11-19
URL http://arxiv.org/abs/1811.07969v2
PDF http://arxiv.org/pdf/1811.07969v2.pdf
PWC https://paperswithcode.com/paper/informed-mcmc-with-bayesian-neural-networks
Repo https://github.com/unibas-gravis/bnn-informed-face-sampler
Framework none
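
The idea amounts to swapping the proposal distribution inside a standard Metropolis-Hastings loop. A hedged sketch, where `propose` stands in for sampling from the BNN's image-dependent proposal (the function names and interfaces here are hypothetical, not the authors' API):

```python
import numpy as np

def metropolis_hastings(log_post, propose, theta0, n_steps=1000):
    """Generic MH sampler. `propose(theta)` returns a candidate and
    log q(new|old) - log q(old|new); an informed, image-dependent
    proposal plugs in here where a Gaussian random walk would."""
    theta, lp = theta0, log_post(theta0)
    samples = []
    for _ in range(n_steps):
        theta_new, log_q_ratio = propose(theta)
        lp_new = log_post(theta_new)
        # accept with probability min(1, p(new)q(old|new) / (p(old)q(new|old)))
        if np.log(np.random.rand()) < lp_new - lp - log_q_ratio:
            theta, lp = theta_new, lp_new
        samples.append(theta)
    return np.array(samples)

def gaussian_walk(theta, sigma=0.05):
    # the baseline proposal: symmetric, so the log q-ratio is zero
    return theta + sigma * np.random.randn(*theta.shape), 0.0
```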

Synthetic Lung Nodule 3D Image Generation Using Autoencoders

Title Synthetic Lung Nodule 3D Image Generation Using Autoencoders
Authors Steve Kommrusch, Louis-Noël Pouchet
Abstract One of the challenges of using machine learning techniques with medical data is the frequent dearth of source image data on which to train. A representative example is automated lung cancer diagnosis, where nodule images need to be classified as suspicious or benign. In this work we propose an automatic synthetic lung nodule image generator. Our 3D shape generator is designed to augment the variety of 3D images. Our proposed system takes root in autoencoder techniques, and we provide extensive experimental characterization that demonstrates its ability to produce quality synthetic images.
Tasks Image Generation, Lung Cancer Diagnosis
Published 2018-11-19
URL https://arxiv.org/abs/1811.07999v3
PDF https://arxiv.org/pdf/1811.07999v3.pdf
PWC https://paperswithcode.com/paper/synthetic-lung-nodule-3d-image-generation
Repo https://github.com/SteveKommrusch/LuNG3D
Framework pytorch
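
A minimal PyTorch sketch of the kind of 3D convolutional autoencoder the abstract describes; the layer sizes, the 32x32x32 input, and the latent dimension are assumptions, not the paper's exact generator:

```python
import torch
import torch.nn as nn

class NoduleAE(nn.Module):
    """Tiny 3D conv autoencoder for 1-channel 32^3 voxel volumes."""
    def __init__(self, latent=64):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv3d(1, 8, 4, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
            nn.Conv3d(8, 16, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Flatten(), nn.Linear(16 * 8 ** 3, latent))
        self.dec = nn.Sequential(
            nn.Linear(latent, 16 * 8 ** 3), nn.ReLU(),
            nn.Unflatten(1, (16, 8, 8, 8)),
            nn.ConvTranspose3d(16, 8, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(8, 1, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

# Synthetic nodules: encode real nodules, perturb the codes, decode.
model = NoduleAE()
x = torch.rand(4, 1, 32, 32, 32)  # stand-in nodule batch
recon, z = model(x)
synthetic = model.dec(z + 0.1 * torch.randn_like(z))
```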

Synthesizing Neural Network Controllers with Probabilistic Model based Reinforcement Learning

Title Synthesizing Neural Network Controllers with Probabilistic Model based Reinforcement Learning
Authors Juan Camilo Gamboa Higuera, David Meger, Gregory Dudek
Abstract We present an algorithm for rapidly learning controllers for robotic systems. The algorithm follows the model-based reinforcement learning paradigm and improves upon existing algorithms, namely Probabilistic learning in Control (PILCO) and a sample-based version of PILCO with neural network dynamics (Deep-PILCO). We propose training a neural network dynamics model using variational dropout with truncated log-normal noise. This allows us to obtain a dynamics model with calibrated uncertainty, which can be used to simulate controller executions via rollouts. We also describe a set of techniques, inspired by viewing PILCO as a recurrent neural network model, that are crucial to improving the convergence of the method. We test our method on a variety of benchmark tasks, demonstrating data efficiency that is competitive with PILCO while being able to optimize complex neural network controllers. Finally, we assess the performance of the algorithm for learning motor controllers for a six-legged autonomous underwater vehicle. This demonstrates the algorithm's potential to scale to higher-dimensional, more complex control tasks with larger datasets.
Tasks
Published 2018-03-06
URL http://arxiv.org/abs/1803.02291v3
PDF http://arxiv.org/pdf/1803.02291v3.pdf
PWC https://paperswithcode.com/paper/synthesizing-neural-network-controllers-with
Repo https://github.com/mcgillmrl/robot_learning
Framework none
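
A condensed PyTorch sketch of the training loop the abstract implies: a dropout dynamics model (standing in for the paper's variational dropout with truncated log-normal noise) simulates rollouts, and the policy is optimized by backpropagating the accumulated rollout cost. All dimensions, the cost function, and the dynamics-model fitting (omitted here) are illustrative:

```python
import torch
import torch.nn as nn

S, A, H = 4, 1, 25  # state dim, action dim, rollout horizon (made up)
dynamics = nn.Sequential(nn.Linear(S + A, 200), nn.ReLU(),
                         nn.Dropout(0.1), nn.Linear(200, S))
policy = nn.Sequential(nn.Linear(S, 50), nn.Tanh(), nn.Linear(50, A), nn.Tanh())
for p in dynamics.parameters():
    p.requires_grad_(False)  # assume the dynamics model is already fit to data

def expected_cost(s0, cost, n_particles=10):
    """Simulate particles through the stochastic model; dropout stays
    active, so each particle sees a different sampled network."""
    s = s0.repeat(n_particles, 1)
    total = 0.0
    for _ in range(H):
        a = policy(s)
        s = s + dynamics(torch.cat([s, a], dim=-1))  # model predicts state deltas
        total = total + cost(s).mean()
    return total

opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
cost = lambda s: (s[:, 0] - 1.0) ** 2  # e.g. drive the first state toward 1
for _ in range(100):
    opt.zero_grad()
    loss = expected_cost(torch.zeros(1, S), cost)
    loss.backward()
    opt.step()
```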

Dual Asymmetric Deep Hashing Learning

Title Dual Asymmetric Deep Hashing Learning
Authors Jinxing Li, Bob Zhang, Guangming Lu, David Zhang
Abstract Owing to its impressive learning power, deep learning has achieved remarkable performance in supervised hash function learning. In this paper, we propose a novel asymmetric supervised deep hashing method that preserves the semantic structure among different categories and generates the binary codes simultaneously. Specifically, two asymmetric deep networks are constructed to reveal the similarity between each pair of images according to their semantic labels. The deep hash functions are then learned through the two networks by minimizing the gap between the learned features and the discrete codes. Furthermore, since the binary codes in the Hamming space should also preserve the semantic affinity of the original space, another asymmetric pairwise loss is introduced to capture the similarity between the binary codes and the real-valued features. This asymmetric loss not only improves the retrieval performance but also contributes to quick convergence during training. By taking advantage of the two-stream deep structures and the two types of asymmetric pairwise functions, an alternating algorithm is designed to efficiently optimize the deep features and high-quality binary codes. Experimental results on three real-world datasets substantiate the effectiveness and superiority of our approach compared with state-of-the-art methods.
Tasks
Published 2018-01-25
URL http://arxiv.org/abs/1801.08360v1
PDF http://arxiv.org/pdf/1801.08360v1.pdf
PWC https://paperswithcode.com/paper/dual-asymmetric-deep-hashing-learning
Repo https://github.com/deepakks1995/DeepHashing
Framework none
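
A hedged PyTorch sketch of the asymmetric pairwise losses the abstract describes; the weighting, the alternating update of the discrete codes, and all names here are illustrative rather than the paper's exact formulation:

```python
import torch

def asymmetric_pairwise_loss(f, g, B, S, gamma=1.0):
    """f, g: real-valued outputs of the two asymmetric networks, (n, k);
    B: current binary codes in {-1, +1}, (n, k);
    S: pairwise semantic similarity labels in {-1, +1}, (n, n)."""
    k = f.shape[1]
    # inner products between the two streams should reproduce the labels
    loss_real = ((f @ g.t()) / k - S).pow(2).mean()
    # asymmetric terms tying each stream's real features to the codes
    loss_code = (((f @ B.t()) / k - S).pow(2).mean()
                 + ((g @ B.t()) / k - S).pow(2).mean())
    return loss_real + gamma * loss_code
```

In an alternating scheme, the networks are updated by gradient descent on this loss while the codes are periodically refreshed from the current features; one simple illustrative choice is B = torch.sign(f + g).detach(), though the paper derives its own discrete update.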

Sparsified SGD with Memory

Title Sparsified SGD with Memory
Authors Sebastian U. Stich, Jean-Baptiste Cordonnier, Martin Jaggi
Abstract Huge-scale machine learning problems are nowadays tackled by distributed optimization algorithms, i.e. algorithms that leverage the compute power of many devices for training. The communication overhead is a key bottleneck that hinders perfect scalability. Various recent works proposed to use quantization or sparsification techniques to reduce the amount of data that needs to be communicated, for instance by only sending the most significant entries of the stochastic gradient (top-k sparsification). Whilst such schemes showed very promising performance in practice, they have eluded theoretical analysis so far. In this work we analyze Stochastic Gradient Descent (SGD) with k-sparsification or compression (for instance top-k or random-k) and show that this scheme converges at the same rate as vanilla SGD when equipped with error compensation (keeping track of accumulated errors in memory). That is, communication can be reduced by a factor of the dimension of the problem (sometimes even more) whilst still converging at the same rate. We present numerical experiments to illustrate the theoretical findings and the improved scalability for distributed applications.
Tasks Distributed Optimization, Quantization
Published 2018-09-20
URL http://arxiv.org/abs/1809.07599v2
PDF http://arxiv.org/pdf/1809.07599v2.pdf
PWC https://paperswithcode.com/paper/sparsified-sgd-with-memory
Repo https://github.com/epfml/sparsifiedSGD
Framework none
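
The scheme is compact enough to sketch in full. A minimal NumPy version of top-k sparsification with error compensation (the step size and loop structure are illustrative):

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def sparsified_sgd(grad_fn, x0, k, lr=0.1, n_steps=1000):
    """SGD where only k coordinates would be communicated per step;
    the discarded mass is kept in `memory` and re-added next round."""
    x, memory = x0.copy(), np.zeros_like(x0)
    for _ in range(n_steps):
        g = lr * grad_fn(x) + memory  # error compensation
        update = top_k(g, k)          # the part that gets communicated
        memory = g - update           # remember everything that was dropped
        x -= update
    return x
```

The memory term is the crux: every coordinate dropped in one round is re-injected in later rounds, which is what lets the sparsified scheme match vanilla SGD's convergence rate.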

Chinese Text in the Wild

Title Chinese Text in the Wild
Authors Tai-Ling Yuan, Zhe Zhu, Kun Xu, Cheng-Jun Li, Shi-Min Hu
Abstract [python3.6] Natural-scene text detection implemented with TensorFlow; CTPN+CRNN+CTC implemented with Keras/PyTorch for OCR of variable-length scene text. (This is the linked repository's description; the paper itself introduces the Chinese Text in the Wild dataset.)
Tasks Optical Character Recognition
Published 2018-02-28
URL http://arxiv.org/abs/1803.00085v1
PDF http://arxiv.org/pdf/1803.00085v1.pdf
PWC https://paperswithcode.com/paper/chinese-text-in-the-wild
Repo https://github.com/OzHsu23/chineseocr
Framework tf
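
The repository implements a CTPN+CRNN+CTC pipeline. As one concrete piece of it, a sketch of greedy (best-path) CTC decoding, the standard final step that turns the CRNN's per-frame class scores into a label sequence (this is generic CTC decoding, not the repo's own code):

```python
import numpy as np

def ctc_greedy_decode(frame_scores, blank=0):
    """frame_scores: (T, num_classes) per-frame scores. Best-path CTC:
    take the argmax label per frame, collapse consecutive repeats,
    then drop the blank symbol."""
    path = frame_scores.argmax(axis=1)
    out, prev = [], blank
    for p in path:
        if p != prev and p != blank:
            out.append(int(p))
        prev = p
    return out
```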

Tile2Vec: Unsupervised representation learning for spatially distributed data

Title Tile2Vec: Unsupervised representation learning for spatially distributed data
Authors Neal Jean, Sherrie Wang, Anshul Samar, George Azzari, David Lobell, Stefano Ermon
Abstract Geospatial analysis lacks methods like the word vector representations and pre-trained networks that significantly boost performance across a wide range of natural language and computer vision tasks. To fill this gap, we introduce Tile2Vec, an unsupervised representation learning algorithm that extends the distributional hypothesis from natural language (words appearing in similar contexts tend to have similar meanings) to spatially distributed data. We demonstrate empirically that Tile2Vec learns semantically meaningful representations on three datasets. Our learned representations significantly improve performance in downstream classification tasks and, similar to word vectors, visual analogies can be obtained via simple arithmetic in the latent space.
Tasks Representation Learning, Unsupervised Representation Learning
Published 2018-05-08
URL http://arxiv.org/abs/1805.02855v2
PDF http://arxiv.org/pdf/1805.02855v2.pdf
PWC https://paperswithcode.com/paper/tile2vec-unsupervised-representation-learning
Repo https://github.com/simongrest/farm-pin-crop-detection-challenge
Framework none
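
The core training signal is a triplet loss over tile embeddings: tiles that are spatial neighbors should embed close together, distant tiles far apart. A minimal PyTorch sketch (the margin value is illustrative):

```python
import torch

def tile_triplet_loss(anchor, neighbor, distant, margin=1.0):
    """Embeddings of shape (batch, dim); hinge on the squared-distance gap."""
    d_pos = (anchor - neighbor).pow(2).sum(dim=1)  # anchor vs nearby tile
    d_neg = (anchor - distant).pow(2).sum(dim=1)   # anchor vs far-away tile
    return torch.clamp(d_pos - d_neg + margin, min=0).mean()
```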

Roto-Translation Covariant Convolutional Networks for Medical Image Analysis

Title Roto-Translation Covariant Convolutional Networks for Medical Image Analysis
Authors Erik J Bekkers, Maxime W Lafarge, Mitko Veta, Koen AJ Eppenhof, Josien PW Pluim, Remco Duits
Abstract We propose a framework for rotation and translation covariant deep learning using $SE(2)$ group convolutions. The group product of the special Euclidean motion group $SE(2)$ describes how a concatenation of two roto-translations results in a net roto-translation. We encode this geometric structure into convolutional neural networks (CNNs) via $SE(2)$ group convolutional layers, which fit into the standard 2D CNN framework and allow us to deal generically with rotated input samples without the need for data augmentation. We introduce three layers: a lifting layer which lifts a 2D (vector valued) image to an $SE(2)$-image, i.e., 3D (vector valued) data whose domain is $SE(2)$; a group convolution layer from and to an $SE(2)$-image; and a projection layer from an $SE(2)$-image to a 2D image. The lifting and group convolution layers are $SE(2)$ covariant (the output roto-translates with the input). The final projection layer, a maximum intensity projection over rotations, makes the full CNN rotation invariant. On three different problems, in histopathology, retinal imaging, and electron microscopy, we show that the proposed group CNNs achieve state-of-the-art performance without rotation-based data augmentation, with increased performance compared to standard CNNs that do rely on such augmentation.
Tasks Data Augmentation
Published 2018-04-10
URL http://arxiv.org/abs/1804.03393v3
PDF http://arxiv.org/pdf/1804.03393v3.pdf
PWC https://paperswithcode.com/paper/roto-translation-covariant-convolutional
Repo https://github.com/tueimage/se2cnn
Framework tf
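
A NumPy/SciPy sketch of the lifting layer's essence: correlate the image with rotated copies of a single 2D kernel, producing an extra orientation axis. Rotating the kernel with `scipy.ndimage.rotate` is a stand-in for the paper's kernel interpolation scheme:

```python
import numpy as np
from scipy import ndimage, signal

def se2_lifting(image, kernel, n_theta=8):
    """Lift a 2D image to an SE(2)-image of shape (n_theta, H, W) by
    correlating it with n_theta rotated copies of one kernel."""
    responses = []
    for i in range(n_theta):
        k_rot = ndimage.rotate(kernel, 360.0 * i / n_theta, reshape=False)
        responses.append(signal.correlate2d(image, k_rot, mode='same'))
    # Rotating the input shifts the orientation axis: covariance.
    return np.stack(responses)

# Rotation invariance at the end of the network corresponds to the
# paper's maximum intensity projection over rotations: lifted.max(axis=0).
```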

Fast and Accurate Single Image Super-Resolution via Information Distillation Network

Title Fast and Accurate Single Image Super-Resolution via Information Distillation Network
Authors Zheng Hui, Xiumei Wang, Xinbo Gao
Abstract Recently, deep convolutional neural networks (CNNs) have demonstrated remarkable progress on single image super-resolution. However, as the depth and width of such networks increase, CNN-based super-resolution methods face challenges of computational complexity and memory consumption in practice. To address these issues, we propose a deep but compact convolutional network that directly reconstructs the high-resolution image from the original low-resolution image. The proposed model consists of three parts: a feature extraction block, stacked information distillation blocks, and a reconstruction block. By combining an enhancement unit with a compression unit into a distillation block, local long- and short-path features can be effectively extracted. Specifically, the proposed enhancement unit mixes together two different types of features, and the compression unit distills more useful information for the subsequent blocks. In addition, the proposed network executes quickly owing to the comparatively small number of filters per layer and the use of group convolution. Experimental results demonstrate that the proposed method is superior to state-of-the-art methods, especially in terms of time performance.
Tasks Image Super-Resolution, Super-Resolution
Published 2018-03-26
URL http://arxiv.org/abs/1803.09454v1
PDF http://arxiv.org/pdf/1803.09454v1.pdf
PWC https://paperswithcode.com/paper/fast-and-accurate-single-image-super
Repo https://github.com/Zheng222/IDN-Caffe
Framework tf
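
A loose PyTorch sketch of one information distillation block: the enhancement unit widens the features so a slice can be retained on a short path while the rest continues along a long path, and a 1x1 compression unit fuses the two. The channel counts, the group-convolution placement, and the residual connection are illustrative, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class DistillationBlock(nn.Module):
    def __init__(self, ch=64, split=16):
        super().__init__()
        self.split = split
        self.enhance1 = nn.Sequential(  # widens features: ch -> ch + split
            nn.Conv2d(ch, ch, 3, padding=1), nn.LeakyReLU(0.05),
            nn.Conv2d(ch, ch + split, 3, padding=1), nn.LeakyReLU(0.05))
        self.enhance2 = nn.Sequential(  # long path, with a group convolution
            nn.Conv2d(ch, ch, 3, padding=1, groups=4), nn.LeakyReLU(0.05),
            nn.Conv2d(ch, ch, 3, padding=1), nn.LeakyReLU(0.05))
        self.compress = nn.Conv2d(ch + split, ch, 1)  # 1x1 distils channels

    def forward(self, x):
        f = self.enhance1(x)
        short, long_path = f[:, :self.split], f[:, self.split:]
        out = self.compress(torch.cat([short, self.enhance2(long_path)], dim=1))
        return x + out  # local residual connection
```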

DLBI: Deep learning guided Bayesian inference for structure reconstruction of super-resolution fluorescence microscopy

Title DLBI: Deep learning guided Bayesian inference for structure reconstruction of super-resolution fluorescence microscopy
Authors Yu Li, Fan Xu, Fa Zhang, Pingyong Xu, Mingshu Zhang, Ming Fan, Lihua Li, Xin Gao, Renmin Han
Abstract Super-resolution fluorescence microscopy, with a resolution beyond the diffraction limit of light, has become an indispensable tool for directly visualizing biological structures in living cells at nanometer-scale resolution. Despite advances in high-density super-resolution fluorescent techniques, existing methods still have bottlenecks, including extremely long execution times, artificial thinning and thickening of structures, and an inability to capture latent structures. Here we propose a novel deep learning guided Bayesian inference approach, DLBI, for the time-series analysis of high-density fluorescent images. Our method combines the strengths of deep learning and statistical inference: deep learning captures the underlying distribution of the fluorophores that is consistent with the observed time-series fluorescent images, by exploring local features and correlation along the time axis, while statistical inference further refines the ultrastructure extracted by deep learning and endows the final image with physical meaning. Comprehensive experimental results on both real and simulated datasets demonstrate that our method provides more accurate and realistic local-patch and large-field reconstruction than the state-of-the-art method, 3B analysis, while being more than two orders of magnitude faster. The main program is available at https://github.com/lykaust15/DLBI
Tasks Bayesian Inference, Super-Resolution, Time Series, Time Series Analysis
Published 2018-05-20
URL http://arxiv.org/abs/1805.07777v3
PDF http://arxiv.org/pdf/1805.07777v3.pdf
PWC https://paperswithcode.com/paper/dlbi-deep-learning-guided-bayesian-inference
Repo https://github.com/lykaust15/DLBI
Framework tf

Sparse-to-Continuous: Enhancing Monocular Depth Estimation using Occupancy Maps

Title Sparse-to-Continuous: Enhancing Monocular Depth Estimation using Occupancy Maps
Authors Nícolas Rosa, Vitor Guizilini, Valdir Grassi Jr
Abstract This paper addresses the problem of single image depth estimation (SIDE), focusing on improving the quality of deep neural network predictions. In a supervised learning scenario, the quality of predictions is intrinsically related to the training labels, which guide the optimization process. For indoor scenes, structured-light-based depth sensors (e.g. Kinect) are able to provide dense, albeit short-range, depth maps. On the other hand, for outdoor scenes, LiDARs are considered the standard sensor, which comparatively provides much sparser measurements, especially in areas further away. Rather than modifying the neural network architecture to deal with sparse depth maps, this article introduces a novel densification method for depth maps, using the Hilbert Maps framework. A continuous occupancy map is produced based on 3D points from LiDAR scans, and the resulting reconstructed surface is projected into a 2D depth map with arbitrary resolution. Experiments conducted with various subsets of the KITTI dataset show a significant improvement produced by the proposed Sparse-to-Continuous technique, without the introduction of extra information into the training stage.
Tasks Depth Estimation, Monocular Depth Estimation
Published 2018-09-24
URL https://arxiv.org/abs/1809.09061v3
PDF https://arxiv.org/pdf/1809.09061v3.pdf
PWC https://paperswithcode.com/paper/sparse-to-continuous-enhancing-monocular
Repo https://github.com/nicolasrosa/Sparse-to-Continuous
Framework tf
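
The densification idea can be approximated with plain interpolation. A NumPy/SciPy sketch that substitutes simple linear interpolation for the paper's Hilbert Maps surface reconstruction, just to show the sparse-to-dense step (zeros mark missing LiDAR returns):

```python
import numpy as np
from scipy.interpolate import griddata

def densify_depth(sparse_depth):
    """Fill a sparse depth map by interpolating between valid pixels."""
    h, w = sparse_depth.shape
    ys, xs = np.nonzero(sparse_depth > 0)  # pixels with a LiDAR return
    grid_y, grid_x = np.mgrid[0:h, 0:w]
    dense = griddata((ys, xs), sparse_depth[ys, xs],
                     (grid_y, grid_x), method='linear')
    return np.nan_to_num(dense)  # outside the convex hull, fall back to 0
```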

Unsupervised Cipher Cracking Using Discrete GANs

Title Unsupervised Cipher Cracking Using Discrete GANs
Authors Aidan N. Gomez, Sicong Huang, Ivan Zhang, Bryan M. Li, Muhammad Osama, Lukasz Kaiser
Abstract This work details CipherGAN, an architecture inspired by CycleGAN, used for inferring the underlying cipher mapping given banks of unpaired ciphertext and plaintext. We demonstrate that CipherGAN is capable of cracking language data enciphered with shift and Vigenère ciphers to a high degree of fidelity, and for vocabularies much larger than previously achieved. We show how CycleGAN can be made compatible with discrete data and trained in a stable way. We then prove that the technique used in CipherGAN avoids the common problem of uninformative discrimination associated with GANs applied to discrete data.
Tasks
Published 2018-01-15
URL http://arxiv.org/abs/1801.04883v1
PDF http://arxiv.org/pdf/1801.04883v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-cipher-cracking-using-discrete
Repo https://github.com/for-ai/ciphergan
Framework tf
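
For context on the data setup, a small sketch of the Vigenère encipherment used to build the unpaired ciphertext bank, operating on integer token ids; a shift cipher is the length-1-key special case:

```python
def vigenere(tokens, key, vocab=26, decrypt=False):
    """Shift each token by the repeating key, modulo the vocab size."""
    sign = -1 if decrypt else 1
    return [(t + sign * key[i % len(key)]) % vocab
            for i, t in enumerate(tokens)]

text = [ord(c) - ord('a') for c in "attackatdawn"]
cipher = vigenere(text, key=[11, 4, 12, 14, 13])   # key "lemon"
print(''.join(chr(c + ord('a')) for c in cipher))  # -> lxfopvefrnhr
```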

AdVersarial: Perceptual Ad Blocking meets Adversarial Machine Learning

Title AdVersarial: Perceptual Ad Blocking meets Adversarial Machine Learning
Authors Florian Tramèr, Pascal Dupré, Gili Rusak, Giancarlo Pellegrino, Dan Boneh
Abstract Perceptual ad-blocking is a novel approach that detects online advertisements based on their visual content. Compared to traditional filter lists, the use of perceptual signals is believed to be less prone to an arms race with web publishers and ad networks. We demonstrate that this may not be the case. We describe attacks on multiple perceptual ad-blocking techniques and unveil a new arms race that likely disfavors ad-blockers. Unexpectedly, perceptual ad-blocking can also introduce new vulnerabilities that let an attacker bypass web security boundaries and mount DDoS attacks. We first analyze the design space of perceptual ad-blockers and present a unified architecture that incorporates prior academic and commercial work. We then explore a variety of attacks on the ad-blocker's detection pipeline that enable publishers or ad networks to evade or detect ad-blocking, and at times even abuse its high privilege level to bypass web security boundaries. On the one hand, we show that perceptual ad-blocking must visually classify rendered web content to escape an arms race centered on obfuscation of page markup. On the other, we present a concrete set of attacks on visual ad-blockers, constructing adversarial examples in a real web-page context. For seven ad detectors, we create perturbed ads, ad-disclosure logos, and native web content that mislead perceptual ad-blocking with 100% success rates. In one of our attacks, we demonstrate how a malicious user can upload adversarial content, such as a perturbed image in a Facebook post, that fools the ad-blocker into removing another user's non-ad content. Moving beyond the web and the visual domain, we also build adversarial examples for AdblockRadio, an open-source radio client that uses machine learning to detect ads in raw audio streams.
Tasks
Published 2018-11-08
URL https://arxiv.org/abs/1811.03194v3
PDF https://arxiv.org/pdf/1811.03194v3.pdf
PWC https://paperswithcode.com/paper/ad-versarial-perceptual-ad-blocking-meets
Repo https://github.com/BenjaminBush/ADverse
Framework tf
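
As one concrete instance of the attack family involved, a fast gradient sign method (FGSM) sketch in PyTorch; the paper crafts its adversarial examples with a range of attacks, of which FGSM is only the simplest:

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps=8 / 255):
    """Perturb image batch x so that `model` misclassifies it away
    from labels y, within an L-infinity budget of eps."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # step in the direction that increases the classifier's loss
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```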