February 1, 2020

Paper Group AWR 85

Synthetic dataset generation for object-to-model deep learning in industrial applications. OpenCL-based FPGA accelerator for disparity map generation with stereoscopic event cameras. Stochastic Gradient MCMC for Nonlinear State Space Models. Partially Exchangeable Networks and Architectures for Learning Summary Statistics in Approximate Bayesian Co …

Synthetic dataset generation for object-to-model deep learning in industrial applications


Title	Synthetic dataset generation for object-to-model deep learning in industrial applications
Authors	Matthew Z. Wong, Kiyohito Kunii, Max Baylis, Wai Hong Ong, Pavel Kroupa, Swen Koller
Abstract	The availability of large image data sets has been a crucial factor in the success of deep learning-based classification and detection methods. While data sets for everyday objects are widely available, data for specific industrial use-cases (e.g. identifying packaged products in a warehouse) remains scarce. In such cases, the data sets have to be created from scratch, placing a crucial bottleneck on the deployment of deep learning techniques in industrial applications. We present work carried out in collaboration with a leading UK online supermarket, with the aim of creating a computer vision system capable of detecting and identifying unique supermarket products in a warehouse setting. To this end, we demonstrate a framework for using synthetic data to create an end-to-end deep learning pipeline, beginning with real-world objects and culminating in a trained model. Our method is based on the generation of a synthetic dataset from 3D models obtained by applying photogrammetry techniques to real-world objects. Using 100k synthetic images generated from 60 real images per class, an InceptionV3 convolutional neural network (CNN) was trained, which achieved classification accuracy of 95.8% on a separately acquired test set of real supermarket product images. The image generation process supports automatic pixel annotation. This eliminates the prohibitively expensive manual annotation typically required for detection tasks. Based on this readily available data, a one-stage RetinaNet detector was trained on the synthetic, annotated images to produce a detector that can accurately localize and classify the specimen products in real-time.
Tasks	Image Generation
Published	2019-09-24
URL	https://arxiv.org/abs/1909.10976v1
PDF	https://arxiv.org/pdf/1909.10976v1.pdf
PWC	https://paperswithcode.com/paper/synthetic-dataset-generation-for-object-to
Repo	https://github.com/921kiyo/3d-dl
Framework	tf

OpenCL-based FPGA accelerator for disparity map generation with stereoscopic event cameras


Title	OpenCL-based FPGA accelerator for disparity map generation with stereoscopic event cameras
Authors	David Castells-Rufas, Jordi Carrabina
Abstract	Although event-based cameras are already commercially available, vision algorithms based on them are still not common. As a consequence, there are few hardware accelerators for them. In this work we present some experiments to create FPGA accelerators for a well-known vision algorithm using event-based cameras. We present a stereo matching algorithm that creates a disparity map as a stream of disparity events, and implement several accelerators using the Intel FPGA OpenCL tool-chain. The results show that multiple designs can be easily tested and that a performance speedup of more than 8x can be achieved with simple code transformations.
Tasks	Stereo Matching, Stereo Matching Hand
Published	2019-03-08
URL	http://arxiv.org/abs/1903.03509v1
PDF	http://arxiv.org/pdf/1903.03509v1.pdf
PWC	https://paperswithcode.com/paper/opencl-based-fpga-accelerator-for-disparity
Repo	https://github.com/davidcastells/DVSSimulator
Framework	none

Stochastic Gradient MCMC for Nonlinear State Space Models


Title	Stochastic Gradient MCMC for Nonlinear State Space Models
Authors	Christopher Aicher, Srshti Putcha, Christopher Nemeth, Paul Fearnhead, Emily B. Fox
Abstract	State space models (SSMs) provide a flexible framework for modeling complex time series via a latent stochastic process. Inference for nonlinear, non-Gaussian SSMs is often tackled with particle methods that do not scale well to long time series. The challenge is two-fold: not only do computations scale linearly with time, as in the linear case, but particle filters additionally suffer from increasing particle degeneracy with longer series. Stochastic gradient MCMC methods have been developed to scale inference for hidden Markov models (HMMs) and linear SSMs using buffered stochastic gradient estimates to account for temporal dependencies. We extend these stochastic gradient estimators to nonlinear SSMs using particle methods. We present error bounds that account for both buffering error and particle error in the case of nonlinear SSMs that are log-concave in the latent process. We evaluate our proposed particle buffered stochastic gradient using SGMCMC for inference on both long sequential synthetic and minute-resolution financial returns data, demonstrating the importance of this class of methods.
Tasks	Time Series
Published	2019-01-29
URL	https://arxiv.org/abs/1901.10568v2
PDF	https://arxiv.org/pdf/1901.10568v2.pdf
PWC	https://paperswithcode.com/paper/stochastic-gradient-mcmc-for-nonlinear-state
Repo	https://github.com/aicherc/sgmcmc_ssm_code
Framework	none

Partially Exchangeable Networks and Architectures for Learning Summary Statistics in Approximate Bayesian Computation


Title	Partially Exchangeable Networks and Architectures for Learning Summary Statistics in Approximate Bayesian Computation
Authors	Samuel Wiqvist, Pierre-Alexandre Mattei, Umberto Picchini, Jes Frellsen
Abstract	We present a novel family of deep neural architectures, named partially exchangeable networks (PENs), that leverage probabilistic symmetries. By design, PENs are invariant to block-switch transformations, which characterize the partial exchangeability properties of conditionally Markovian processes. Moreover, we show that any block-switch invariant function has a PEN-like representation. The DeepSets architecture is a special case of PEN and we can therefore also target fully exchangeable data. We employ PENs to learn summary statistics in approximate Bayesian computation (ABC). When comparing PENs to previous deep learning methods for learning summary statistics, our results are highly competitive, both for time series and for static models. Indeed, PENs provide more reliable posterior samples even when using less training data.
Tasks	Time Series
Published	2019-01-29
URL	https://arxiv.org/abs/1901.10230v2
PDF	https://arxiv.org/pdf/1901.10230v2.pdf
PWC	https://paperswithcode.com/paper/partially-exchangeable-networks-and
Repo	https://github.com/SamuelWiqvist/PENs-and-ABC
Framework	none
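
The block-switch invariance described in the abstract can be checked on a toy example. The following is a minimal sketch, assuming a PEN of order d has the form rho(x[0:d], sum over i of phi(x[i:i+d+1])); the functions `phi` and `rho` are fixed toy stand-ins for the learned networks, and all names are hypothetical rather than taken from the authors' repository.

```python
import math

def phi(window):
    """Toy inner function applied to each window of d + 1 consecutive values."""
    return math.tanh(0.1 * sum((k + 1) * v for k, v in enumerate(window)))

def rho(head, pooled):
    """Toy outer function combining the first d values with the pooled sum."""
    return sum(head) + 0.5 * pooled

def pen(sequence, d):
    """rho(first d values, sum of phi over all windows of length d + 1)."""
    pooled = sum(phi(sequence[i:i + d + 1]) for i in range(len(sequence) - d))
    return rho(sequence[:d], pooled)

# Two blocks, [0, 1, 2] and [0, 3, 2], start and end with the same value,
# so swapping them is a block-switch transformation for d = 1.
original = [0.0, 1.0, 2.0, 5.0, 0.0, 3.0, 2.0, 7.0]
switched = [0.0, 3.0, 2.0, 5.0, 0.0, 1.0, 2.0, 7.0]

print(math.isclose(pen(original, 1), pen(switched, 1)))
print(math.isclose(pen(original, 0), pen(list(reversed(original)), 0)))
```

With d = 0 the head is empty and every window holds a single value, so the sketch reduces to a DeepSets-style sum that is invariant to any reordering.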

Biadversarial Variational Autoencoder


Title	Biadversarial Variational Autoencoder
Authors	Arnaud Fickinger
Abstract	In the original version of the Variational Autoencoder, Kingma et al. assume Gaussian distributions for the approximate posterior during inference and for the output during the generative process. These assumptions are convenient for computational reasons, e.g. we can easily optimize the parameters of a neural network using the reparametrization trick, and the KL divergence between two Gaussians can be computed in closed form. However, they result in blurry images because Gaussians have difficulty representing multimodal distributions. We show that, using two adversarial networks, we can optimize the parameters without any Gaussian assumptions.
Tasks
Published	2019-02-09
URL	http://arxiv.org/abs/1902.03517v2
PDF	http://arxiv.org/pdf/1902.03517v2.pdf
PWC	https://paperswithcode.com/paper/biadversarial-variational-autoencoder
Repo	https://github.com/ArnaudFickinger/BAVAE
Framework	pytorch

Vector of Locally-Aggregated Word Embeddings (VLAWE): A Novel Document-level Representation


Title	Vector of Locally-Aggregated Word Embeddings (VLAWE): A Novel Document-level Representation
Authors	Radu Tudor Ionescu, Andrei M. Butnaru
Abstract	In this paper, we propose a novel representation for text documents based on aggregating word embedding vectors into document embeddings. Our approach is inspired by the Vector of Locally-Aggregated Descriptors used for image representation, and it works as follows. First, the word embeddings gathered from a collection of documents are clustered by k-means in order to learn a codebook of semantically related word embeddings. Each word embedding is then associated with its nearest cluster centroid (codeword). The Vector of Locally-Aggregated Word Embeddings (VLAWE) representation of a document is then computed by accumulating the differences between each codeword vector and each word vector (from the document) associated with the respective codeword. We plug the VLAWE representation, which is learned in an unsupervised manner, into a classifier and show that it is useful for a diverse set of text classification tasks. We compare our approach with a broad range of recent state-of-the-art methods, demonstrating the effectiveness of our approach. Furthermore, we obtain a considerable improvement on the Movie Review data set, reporting an accuracy of 93.3%, which represents an absolute gain of 10% over the state-of-the-art approach. Our code is available at https://github.com/raduionescu/vlawe-boswe/.
Tasks	Text Classification, Word Embeddings
Published	2019-02-23
URL	https://arxiv.org/abs/1902.08850v3
PDF	https://arxiv.org/pdf/1902.08850v3.pdf
PWC	https://paperswithcode.com/paper/vector-of-locally-aggregated-word-embeddings
Repo	https://github.com/raduionescu/vlawe-boswe
Framework	none
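
The aggregation step described in the abstract can be illustrated with a small sketch. It assumes the codebook has already been learned (here it is a hard-coded, hypothetical 2-D codebook instead of k-means output), takes each residual as word vector minus codeword, and omits any normalisation; the function names are illustrative and not taken from the authors' code.

```python
import math

def nearest_codeword(vector, codebook):
    """Index of the codeword closest to `vector` in Euclidean distance."""
    return min(range(len(codebook)), key=lambda j: math.dist(vector, codebook[j]))

def vlawe(doc_vectors, codebook):
    """Sum the residuals (word vector - codeword) per codeword, then concatenate."""
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    for vec in doc_vectors:
        j = nearest_codeword(vec, codebook)
        for d in range(dim):
            sums[j][d] += vec[d] - codebook[j][d]
    # Flatten the k x dim residual sums into one document vector of length k * dim.
    return [value for row in sums for value in row]

# Hypothetical codebook; in practice it would come from k-means over the
# word embeddings of a document collection.
codebook = [[0.0, 0.0], [10.0, 10.0]]
doc = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
representation = vlawe(doc, codebook)
print(representation)
```

The resulting vector has one slot of embedding size per codeword, so its length is fixed regardless of how many words the document contains.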

S4NN: temporal backpropagation for spiking neural networks with one spike per neuron


Title	S4NN: temporal backpropagation for spiking neural networks with one spike per neuron
Authors	Saeed Reza Kheradpisheh, Timothée Masquelier
Abstract	We propose a new supervised learning rule for multilayer spiking neural networks (SNNs) that use a form of temporal coding known as rank-order-coding. With this coding scheme, all neurons fire exactly one spike per stimulus, but the firing order carries information. In particular, in the readout layer, the first neuron to fire determines the class of the stimulus. We derive a new learning rule for this sort of network, named S4NN, akin to traditional error backpropagation, yet based on latencies. We show how approximated error gradients can be computed backward in a feedforward network with any number of layers. This approach reaches state-of-the-art performance with supervised multi fully-connected layer SNNs: test accuracy of 97.4% for the MNIST dataset, and 99.2% for the Caltech Face/Motorbike dataset. Yet, the neuron model that we use, non-leaky integrate-and-fire, is much simpler than the one used in all previous works. The source code of the proposed S4NN is publicly available at https://github.com/SRKH/S4NN.
Tasks
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09495v2
PDF	https://arxiv.org/pdf/1910.09495v2.pdf
PWC	https://paperswithcode.com/paper/s4nn-temporal-backpropagation-for-spiking
Repo	https://github.com/SRKH/S4NN
Framework	none
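
The readout rule in the abstract (non-leaky integrate-and-fire neurons, with the first neuron to fire deciding the class) can be sketched as a forward pass only. This is an illustrative simplification and not the S4NN learning rule: time is discrete, an input spike affects the potential in the same step it arrives, and a neuron that never reaches threshold is assigned an infinite firing time. All names and values are hypothetical.

```python
import math

def first_spike_time(input_times, weights, threshold, t_max):
    """Time step at which a non-leaky integrate-and-fire neuron first fires."""
    potential = 0.0
    for t in range(t_max + 1):
        # No leak: the potential only accumulates the weights of arriving spikes.
        potential += sum(w for t_in, w in zip(input_times, weights) if t_in == t)
        if potential >= threshold:
            return t
    return math.inf  # assumption: a silent neuron is treated as never firing

# Three input neurons, each firing once; two readout neurons (one per class).
input_times = [0, 1, 2]
readout_weights = [[0.5, 0.6, 0.0], [0.2, 0.2, 1.0]]
spike_times = [
    first_spike_time(input_times, w, threshold=1.0, t_max=5)
    for w in readout_weights
]
# The earliest readout spike determines the predicted class.
predicted_class = min(range(len(spike_times)), key=spike_times.__getitem__)
print(spike_times, predicted_class)
```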

On the choice of graph neural network architectures


Title	On the choice of graph neural network architectures
Authors	Clément Vignac, Guillermo Ortiz-Jiménez, Pascal Frossard
Abstract	Seminal works on graph neural networks have primarily targeted semi-supervised node classification problems with few observed labels and high-dimensional signals. With the development of graph networks, this setup has become a de facto benchmark for a significant body of research. Interestingly, several works have recently shown that in this particular setting, graph neural networks do not perform much better than predefined low-pass filters followed by a linear classifier. However, when learning from little data in a high-dimensional space, it is not surprising that simple and heavily regularized methods are near-optimal. In this paper, we show empirically that in settings with fewer features and more training data, more complex graph networks significantly outperform simple models, and propose a few insights towards the proper choice of graph network architectures. We finally outline the importance of using sufficiently diverse benchmarks (including lower dimensional signals as well) when designing and studying new types of graph neural networks.
Tasks	Node Classification
Published	2019-11-13
URL	https://arxiv.org/abs/1911.05384v2
PDF	https://arxiv.org/pdf/1911.05384v2.pdf
PWC	https://paperswithcode.com/paper/on-the-choice-of-graph-neural-network
Repo	https://github.com/cvignac/gnn_statistics
Framework	tf

EASSE: Easier Automatic Sentence Simplification Evaluation


Title	EASSE: Easier Automatic Sentence Simplification Evaluation
Authors	Fernando Alva-Manchego, Louis Martin, Carolina Scarton, Lucia Specia
Abstract	We introduce EASSE, a Python package aiming to facilitate and standardise automatic evaluation and comparison of Sentence Simplification (SS) systems. EASSE provides a single access point to a broad range of evaluation resources: standard automatic metrics for assessing SS outputs (e.g. SARI), word-level accuracy scores for certain simplification transformations, reference-independent quality estimation features (e.g. compression ratio), and standard test data for SS evaluation (e.g. TurkCorpus). Finally, EASSE generates easy-to-visualise reports on the various metrics and features above and on how a particular SS output fares against reference simplifications. Through experiments, we show that these functionalities allow for better comparison and understanding of the performance of SS systems.
Tasks
Published	2019-08-13
URL	https://arxiv.org/abs/1908.04567v2
PDF	https://arxiv.org/pdf/1908.04567v2.pdf
PWC	https://paperswithcode.com/paper/easse-easier-automatic-sentence
Repo	https://github.com/feralvam/easse
Framework	none
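
As an illustration of a reference-independent feature such as the compression ratio mentioned in the abstract, here is a minimal standalone sketch. It assumes a character-level definition (length of the system output divided by length of the source) and does not use or reproduce the EASSE API; the function name is hypothetical.

```python
def compression_ratio(source, simplified):
    """Length of the simplified sentence divided by the source length, in characters."""
    if not source:
        raise ValueError("source sentence must be non-empty")
    return len(simplified) / len(source)

source = "The cat, which was black, sat on the mat."
simplified = "The black cat sat on the mat."
print(round(compression_ratio(source, simplified), 2))
```

A value below 1 means the output is shorter than the source; no reference simplification is needed to compute it.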

Gated Multiple Feedback Network for Image Super-Resolution


Title	Gated Multiple Feedback Network for Image Super-Resolution
Authors	Qilei Li, Zhen Li, Lu Lu, Gwanggil Jeon, Kai Liu, Xiaomin Yang
Abstract	The rapid development of deep learning (DL) has driven single image super-resolution (SR) into a new era. However, in most existing DL based image SR networks, the information flows are solely feedforward, and the high-level features cannot be fully explored. In this paper, we propose the gated multiple feedback network (GMFN) for accurate image SR, in which the representation of low-level features is efficiently enriched by rerouting multiple high-level features. We cascade multiple residual dense blocks (RDBs) and recurrently unfold them across time. The multiple feedback connections between two adjacent time steps in the proposed GMFN exploit multiple high-level features captured under large receptive fields to refine the low-level features lacking enough contextual information. The elaborately designed gated feedback module (GFM) efficiently selects and further enhances useful information from multiple rerouted high-level features, and then refines the low-level features with the enhanced high-level information. Extensive experiments demonstrate the superiority of our proposed GMFN against state-of-the-art SR methods in terms of both quantitative metrics and visual quality. Code is available at https://github.com/liqilei/GMFN.
Tasks	Image Super-Resolution, Super-Resolution
Published	2019-07-09
URL	https://arxiv.org/abs/1907.04253v2
PDF	https://arxiv.org/pdf/1907.04253v2.pdf
PWC	https://paperswithcode.com/paper/gated-multiple-feedback-network-for-image
Repo	https://github.com/liqilei/GMFN
Framework	pytorch

Semi-supervisedly Co-embedding Attributed Networks


Title	Semi-supervisedly Co-embedding Attributed Networks
Authors	Zaiqiao Meng, Shangsong Liang, Jinyuan Fang, Teng Xiao
Abstract	Deep generative models (DGMs) have achieved remarkable advances. Semi-supervised variational auto-encoders (SVAE), a classical DGM, offer a principled framework to effectively generalize from small labelled data to large unlabelled ones, but it is difficult to incorporate rich unstructured relationships within the multiple heterogeneous entities. In this paper, to deal with the problem, we present a semi-supervised co-embedding model for attributed networks (SCAN) based on the generalized SVAE for heterogeneous data, which collaboratively learns low-dimensional vector representations of both nodes and attributes for partially labelled attributed networks in a semi-supervised manner. The node and attribute embeddings obtained in a unified manner by our SCAN help capture not only the proximities between nodes but also the affinities between nodes and attributes. Moreover, our model also trains a discriminative network to learn the label predictive distribution of nodes. Experimental results on real-world networks demonstrate that our model yields excellent performance in a number of applications such as attribute inference, user profiling and node classification compared to the state-of-the-art baselines.
Tasks	Node Classification
Published	2019-10-31
URL	https://arxiv.org/abs/1910.14491v1
PDF	https://arxiv.org/pdf/1910.14491v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervisedly-co-embedding-attributed
Repo	https://github.com/mengzaiqiao/SCAN
Framework	tf

LIBS2ML: A Library for Scalable Second Order Machine Learning Algorithms


Title	LIBS2ML: A Library for Scalable Second Order Machine Learning Algorithms
Authors	Vinod Kumar Chauhan, Anuj Sharma, Kalpana Dahiya
Abstract	LIBS2ML is a library based on scalable second order learning algorithms for solving large-scale problems, i.e., big data problems in machine learning. LIBS2ML has been developed using MEX files, i.e., C++ with a MATLAB/Octave interface, to take advantage of both worlds, i.e., faster learning using C++ and easy I/O using MATLAB. Most of the available libraries are either in MATLAB/Python/R, which are very slow and not suitable for large-scale learning, or in C/C++, which do not have easy ways to take input and display results. So LIBS2ML is unique due to its focus on scalable second order methods, a hot research topic, and its being based on MEX files. Thus it provides researchers a comprehensive environment to evaluate their ideas and it also provides machine learning practitioners an effective tool to deal with large-scale learning problems. LIBS2ML is an open-source, highly efficient, extensible, scalable, readable, portable and easy to use library. The library can be downloaded from https://github.com/jmdvinodjmd/LIBS2ML.
Tasks
Published	2019-04-20
URL	http://arxiv.org/abs/1904.09448v1
PDF	http://arxiv.org/pdf/1904.09448v1.pdf
PWC	https://paperswithcode.com/paper/190409448
Repo	https://github.com/jmdvinodjmd/LIBS2ML
Framework	none

Federated Learning over Wireless Networks: Convergence Analysis and Resource Allocation


Title	Federated Learning over Wireless Networks: Convergence Analysis and Resource Allocation
Authors	Canh Dinh, Nguyen H. Tran, Minh N. H. Nguyen, Choong Seon Hong, Wei Bao, Albert Y. Zomaya, Vincent Gramoli
Abstract	There is an increasing interest in a fast-growing machine learning technique called Federated Learning, in which the model training is distributed over mobile user equipments (UEs), exploiting UEs’ local computation and training data. Despite its advantages in preserving data privacy, Federated Learning (FL) still has challenges in heterogeneity across UEs’ data and physical resources. We first propose an FL algorithm which can handle the heterogeneous UEs’ data challenge without further assumptions except strongly convex and smooth loss functions. We provide the convergence rate characterizing the trade-off between local computation rounds of UE to update its local model and global communication rounds to update the FL global model. We then employ the proposed FL algorithm in wireless networks as a resource allocation optimization problem that captures the trade-off between the FL convergence wall clock time and energy consumption of UEs with heterogeneous computing and power resources. Even though the wireless resource allocation problem of FL is non-convex, we exploit this problem’s structure to decompose it into three sub-problems and analyze their closed-form solutions as well as insights to problem design. Finally, we illustrate the theoretical analysis for the new algorithm with TensorFlow experiments and extensive numerical results for the wireless resource allocation sub-problems. The experimental results not only verify the theoretical convergence but also show that our proposed algorithm outperforms the vanilla FedAvg algorithm in terms of convergence rate and testing accuracy.
Tasks
Published	2019-10-29
URL	https://arxiv.org/abs/1910.13067v3
PDF	https://arxiv.org/pdf/1910.13067v3.pdf
PWC	https://paperswithcode.com/paper/federated-learning-over-wireless-networks
Repo	https://github.com/nhatminh/FEDL
Framework	none
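
For context on the baseline named in the abstract, the aggregation step of vanilla FedAvg can be sketched as a sample-weighted average of the local models. This illustrates the baseline only, not the algorithm proposed in the paper; models are represented as flat lists of floats, and the function and variable names are hypothetical.

```python
def fedavg(local_models, num_samples):
    """Average local parameter vectors, weighting each UE by its sample count."""
    total = sum(num_samples)
    dim = len(local_models[0])
    return [
        sum(n * model[d] for model, n in zip(local_models, num_samples)) / total
        for d in range(dim)
    ]

# Two UEs with heterogeneous data sizes: the second holds three times more samples.
local_models = [[1.0, 2.0], [3.0, 4.0]]
num_samples = [1, 3]
global_model = fedavg(local_models, num_samples)
print(global_model)
```

In a full round, each UE would first run some local update steps from the current global model before this aggregation; the number of such local steps is the local-versus-global trade-off the abstract refers to.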

Graph Wavelet Neural Network


Title	Graph Wavelet Neural Network
Authors	Bingbing Xu, Huawei Shen, Qi Cao, Yunqi Qiu, Xueqi Cheng
Abstract	We present graph wavelet neural network (GWNN), a novel graph convolutional neural network (CNN), leveraging graph wavelet transform to address the shortcomings of previous spectral graph CNN methods that depend on graph Fourier transform. Unlike the graph Fourier transform, the graph wavelet transform can be obtained via a fast algorithm without requiring a computationally expensive matrix eigendecomposition. Moreover, graph wavelets are sparse and localized in the vertex domain, offering high efficiency and good interpretability for graph convolution. The proposed GWNN significantly outperforms previous spectral graph CNNs in the task of graph-based semi-supervised classification on three benchmark datasets: Cora, Citeseer and Pubmed.
Tasks
Published	2019-04-12
URL	http://arxiv.org/abs/1904.07785v1
PDF	http://arxiv.org/pdf/1904.07785v1.pdf
PWC	https://paperswithcode.com/paper/graph-wavelet-neural-network-1
Repo	https://github.com/benedekrozemberczki/GraphWaveletNeuralNetwork
Framework	pytorch

Training Generative Adversarial Networks from Incomplete Observations using Factorised Discriminators


Title	Training Generative Adversarial Networks from Incomplete Observations using Factorised Discriminators
Authors	Daniel Stoller, Sebastian Ewert, Simon Dixon
Abstract	Generative adversarial networks (GANs) have shown great success in applications such as image generation and inpainting. However, they typically require large datasets, which are often not available, especially in the context of prediction tasks such as image segmentation that require labels. Therefore, methods such as the CycleGAN use more easily available unlabelled data, but do not offer a way to leverage additional labelled data for improved performance. To address this shortcoming, we show how to factorise the joint data distribution into a set of lower-dimensional distributions along with their dependencies. This allows splitting the discriminator in a GAN into multiple “sub-discriminators” that can be independently trained from incomplete observations. Their outputs can be combined to estimate the density ratio between the joint real and the generator distribution, which enables training generators as in the original GAN framework. We apply our method to image generation, image segmentation and audio source separation, and obtain improved performance over a standard GAN when additional incomplete training examples are available. For the Cityscapes segmentation task in particular, our method also improves accuracy by an absolute 14.9% over CycleGAN while using only 25 additional paired examples.
Tasks	Image Generation, Semantic Segmentation
Published	2019-05-29
URL	https://arxiv.org/abs/1905.12660v2
PDF	https://arxiv.org/pdf/1905.12660v2.pdf
PWC	https://paperswithcode.com/paper/training-generative-adversarial-networks-from
Repo	https://github.com/f90/FactorGAN
Framework	pytorch
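
The factorisation described in the abstract can be written out for the simplest case of an observation split into two parts. The notation is an assumption made for illustration: p is the real data distribution, q the generator distribution, and (x, y) the two parts of a joint sample.

```latex
\frac{p(x, y)}{q(x, y)}
  = \underbrace{\frac{p(x)}{q(x)}}_{\text{marginal, } x}
    \cdot
    \underbrace{\frac{p(y)}{q(y)}}_{\text{marginal, } y}
    \cdot
    \underbrace{\frac{p(x, y)}{p(x)\, p(y)}}_{\text{real dependency}}
    \cdot
    \underbrace{\frac{q(x)\, q(y)}{q(x, y)}}_{\text{generator dependency}}
```

Each factor is a density ratio that a separate sub-discriminator can estimate. The two marginal ratios need only unpaired observations of x or of y, so incomplete samples suffice for them, while the real-dependency term is the only one that needs paired real examples. In log space the product becomes a sum, so the sub-discriminator outputs can simply be added when they are read as log-ratios.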