October 17, 2019

Paper Group ANR 710

Assessing Performance of Aerobic Routines using Background Subtraction and Intersected Image Region


Title	Assessing Performance of Aerobic Routines using Background Subtraction and Intersected Image Region
Authors	Faustine John, Irwandi Hipiny, Hamimah Ujir, Mohd Shahrizal Sunar
Abstract	It is recommended for a novice to engage a trained and experienced person, i.e., a coach, before starting an unfamiliar aerobic or weight routine. The coach’s task is to provide real-time feedback to ensure that the routine is performed in a correct manner. This greatly reduces the risk of injury and maximises physical gains. We present a simple image similarity measure based on intersected image region to assess a subject’s performance of an aerobic routine. The method is implemented inside an Augmented Reality (AR) desktop app that employs a single RGB camera to capture still images of the subject as he or she progresses through the routine. The background-subtracted body pose image is compared against the exemplar body pose image (i.e., AR template) at specific intervals. Based on a limited dataset, our pose matching function is reported to have an accuracy of 93.67%.
Tasks
Published	2018-10-03
URL	http://arxiv.org/abs/1810.01564v1
PDF	http://arxiv.org/pdf/1810.01564v1.pdf
PWC	https://paperswithcode.com/paper/assessing-performance-of-aerobic-routines
Repo
Framework
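
The abstract above does not spell out how the intersected-region measure is normalised. As a rough illustration only, the sketch below assumes binary foreground masks (nested lists of 0/1) and scores them by intersection over union; the function names, the normalisation, and the `threshold` value are hypothetical and not taken from the paper.

```python
def intersection_similarity(subject_mask, template_mask):
    """Share of foreground pixels common to both masks, relative to their union.

    Assumption: both masks have the same shape and use 1 for body pixels.
    """
    intersection = 0
    union = 0
    for row_s, row_t in zip(subject_mask, template_mask):
        for s, t in zip(row_s, row_t):
            intersection += 1 if (s and t) else 0
            union += 1 if (s or t) else 0
    # Two empty masks are treated as identical (an arbitrary convention).
    return intersection / union if union else 1.0


def pose_matches(subject_mask, template_mask, threshold=0.75):
    """Hypothetical decision rule: accept the pose if the overlap is high enough."""
    return intersection_similarity(subject_mask, template_mask) >= threshold


template = [[0, 1, 0],
            [1, 1, 1],
            [0, 1, 0]]
subject = [[0, 1, 0],
           [1, 1, 0],
           [0, 1, 0]]
score = intersection_similarity(subject, template)  # 4 shared pixels out of 5
```

In an application like the one described, such a score would be computed at each checkpoint of the routine, after background subtraction has produced the subject mask.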

An Evaluation of Trajectory Prediction Approaches and Notes on the TrajNet Benchmark


Title	An Evaluation of Trajectory Prediction Approaches and Notes on the TrajNet Benchmark
Authors	Stefan Becker, Ronny Hug, Wolfgang Hübner, Michael Arens
Abstract	In recent years, there has been a shift from modeling the tracking problem based on Bayesian formulation towards using deep neural networks. Towards this end, in this paper the effectiveness of various deep neural networks for predicting future pedestrian paths is evaluated. The analyzed deep networks solely rely, like in the traditional approaches, on observed tracklets without human-human interaction information. The evaluation is done on the publicly available TrajNet benchmark dataset, which builds up a repository of considerable and popular datasets for trajectory-based activity forecasting. We show that a Recurrent-Encoder with a Dense layer stacked on top, referred to as RED-predictor, is able to achieve sophisticated results compared to elaborated models in such scenarios. Further, we investigate failure cases, give explanations for observed phenomena, and give some recommendations for overcoming demonstrated shortcomings.
Tasks	Trajectory Prediction
Published	2018-05-19
URL	http://arxiv.org/abs/1805.07663v6
PDF	http://arxiv.org/pdf/1805.07663v6.pdf
PWC	https://paperswithcode.com/paper/an-evaluation-of-trajectory-prediction
Repo
Framework

Deep learning-based super-resolution in coherent imaging systems


Title	Deep learning-based super-resolution in coherent imaging systems
Authors	Tairan Liu, Kevin de Haan, Yair Rivenson, Zhensong Wei, Xin Zeng, Yibo Zhang, Aydogan Ozcan
Abstract	We present a deep learning framework based on a generative adversarial network (GAN) to perform super-resolution in coherent imaging systems. We demonstrate that this framework can enhance the resolution of both pixel size-limited and diffraction-limited coherent imaging systems. We experimentally validated the capabilities of this deep learning-based coherent imaging approach by super-resolving complex images acquired using a lensfree on-chip holographic microscope, the resolution of which was pixel size-limited. Using the same GAN-based approach, we also improved the resolution of a lens-based holographic imaging system that was limited in resolution by the numerical aperture of its objective lens. This deep learning-based super-resolution framework can be broadly applied to enhance the space-bandwidth product of coherent imaging systems using image data and convolutional neural networks, and provides a rapid, non-iterative method for solving inverse image reconstruction or enhancement problems in optics.
Tasks	Image Reconstruction, Super-Resolution
Published	2018-10-15
URL	http://arxiv.org/abs/1810.06611v1
PDF	http://arxiv.org/pdf/1810.06611v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-based-super-resolution-in
Repo
Framework

Land cover mapping at very high resolution with rotation equivariant CNNs: towards small yet accurate models


Title	Land cover mapping at very high resolution with rotation equivariant CNNs: towards small yet accurate models
Authors	Diego Marcos, Michele Volpi, Benjamin Kellenberger, Devis Tuia
Abstract	In remote sensing images, the absolute orientation of objects is arbitrary. Depending on an object’s orientation and on a sensor’s flight path, objects of the same semantic class can be observed in different orientations in the same image. Equivariance to rotation, in this context understood as responding with a rotated semantic label map when subject to a rotation of the input image, is therefore a very desirable feature, in particular for high capacity models, such as Convolutional Neural Networks (CNNs). If rotation equivariance is encoded in the network, the model is confronted with a simpler task and does not need to learn specific (and redundant) weights to address rotated versions of the same object class. In this work we propose a CNN architecture called Rotation Equivariant Vector Field Network (RotEqNet) to encode rotation equivariance in the network itself. By using rotating convolutions as building blocks and passing only the values corresponding to the maximally activating orientation throughout the network in the form of orientation encoding vector fields, RotEqNet treats rotated versions of the same object with the same filter bank and therefore achieves state-of-the-art performances even when using very small architectures trained from scratch. We test RotEqNet in two challenging sub-decimeter resolution semantic labeling problems, and show that we can perform better than a standard CNN while requiring one order of magnitude fewer parameters.
Tasks
Published	2018-03-16
URL	http://arxiv.org/abs/1803.06253v1
PDF	http://arxiv.org/pdf/1803.06253v1.pdf
PWC	https://paperswithcode.com/paper/land-cover-mapping-at-very-high-resolution
Repo
Framework
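
To make the idea of "passing only the maximally activating orientation" concrete, here is a heavily simplified sketch. It assumes a single square patch, a single filter, and rotations restricted to multiples of 90 degrees (so no interpolation is needed); the helper names are hypothetical and this is not the RotEqNet implementation.

```python
def rot90(matrix):
    """Rotate a square matrix 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*matrix)][::-1]


def response(patch, filt):
    """Correlation of a patch with a filter of the same size."""
    return sum(p * f
               for row_p, row_f in zip(patch, filt)
               for p, f in zip(row_p, row_f))


def max_orientation_response(patch, filt):
    """Try the filter at 0, 90, 180 and 270 degrees; keep the strongest response.

    Returns (magnitude, angle_in_degrees), i.e. one entry of a vector field.
    """
    best_value, best_k = None, None
    rotated = filt
    for k in range(4):
        value = response(patch, rotated)
        if best_value is None or value > best_value:
            best_value, best_k = value, k
        rotated = rot90(rotated)
    return best_value, best_k * 90


sobel_like = [[1, 0, -1],
              [2, 0, -2],
              [1, 0, -1]]
patch = [[3, 1, 0],
         [3, 1, 0],
         [3, 1, 0]]

original = max_orientation_response(patch, sobel_like)
rotated_input = max_orientation_response(rot90(patch), sobel_like)
```

Rotating the input leaves the magnitude unchanged and shifts the reported angle by the same rotation, which is the equivariance property the abstract describes, obtained here with one shared filter.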

ReSet: Learning Recurrent Dynamic Routing in ResNet-like Neural Networks


Title	ReSet: Learning Recurrent Dynamic Routing in ResNet-like Neural Networks
Authors	Iurii Kemaev, Daniil Polykovskiy, Dmitry Vetrov
Abstract	Neural Network is a powerful Machine Learning tool that shows outstanding performance in Computer Vision, Natural Language Processing, and Artificial Intelligence. In particular, recently proposed ResNet architecture and its modifications produce state-of-the-art results in image classification problems. ResNet and most of the previously proposed architectures have a fixed structure and apply the same transformation to all input images. In this work, we develop a ResNet-based model that dynamically selects Computational Units (CU) for each input object from a learned set of transformations. Dynamic selection allows the network to learn a sequence of useful transformations and apply only required units to predict the image label. We compare our model to ResNet-38 architecture and achieve better results than the original ResNet on CIFAR-10.1 test set. While examining the produced paths, we discovered that the network learned different routes for images from different classes and similar routes for similar images.
Tasks	Image Classification
Published	2018-11-11
URL	http://arxiv.org/abs/1811.04380v1
PDF	http://arxiv.org/pdf/1811.04380v1.pdf
PWC	https://paperswithcode.com/paper/reset-learning-recurrent-dynamic-routing-in
Repo
Framework

Meta Reinforcement Learning with Latent Variable Gaussian Processes


Title	Meta Reinforcement Learning with Latent Variable Gaussian Processes
Authors	Steindór Sæmundsson, Katja Hofmann, Marc Peter Deisenroth
Abstract	Learning from small data sets is critical in many practical applications where data collection is time consuming or expensive, e.g., robotics, animal experiments or drug design. Meta learning is one way to increase the data efficiency of learning algorithms by generalizing learned concepts from a set of training tasks to unseen, but related, tasks. Often, this relationship between tasks is hard coded or relies in some other way on human expertise. In this paper, we frame meta learning as a hierarchical latent variable model and infer the relationship between tasks automatically from data. We apply our framework in a model-based reinforcement learning setting and show that our meta-learning model effectively generalizes to novel tasks by identifying how new tasks relate to prior ones from minimal data. This results in up to a 60% reduction in the average interaction time needed to solve tasks compared to strong baselines.
Tasks	Gaussian Processes, Meta-Learning
Published	2018-03-20
URL	http://arxiv.org/abs/1803.07551v2
PDF	http://arxiv.org/pdf/1803.07551v2.pdf
PWC	https://paperswithcode.com/paper/meta-reinforcement-learning-with-latent
Repo
Framework

Human Perception of Surprise: A User Study


Title	Human Perception of Surprise: A User Study
Authors	Nalin Chhibber, Rohail Syed, Mengqiu Teng, Joslin Goh, Kevyn Collins-Thompson, Edith Law
Abstract	Understanding how to engage users is a critical question in many applications. Previous research has shown that unexpected or astonishing events can attract user attention, leading to positive outcomes such as engagement and learning. In this work, we investigate the similarity and differences in how people and algorithms rank the surprisingness of facts. Our crowdsourcing study, involving 106 participants, shows that computational models of surprise can be used to artificially induce surprise in humans.
Tasks
Published	2018-07-16
URL	http://arxiv.org/abs/1807.05906v1
PDF	http://arxiv.org/pdf/1807.05906v1.pdf
PWC	https://paperswithcode.com/paper/human-perception-of-surprise-a-user-study
Repo
Framework

Variance Reduction in Stochastic Particle-Optimization Sampling


Title	Variance Reduction in Stochastic Particle-Optimization Sampling
Authors	Jianyi Zhang, Yang Zhao, Changyou Chen
Abstract	Stochastic particle-optimization sampling (SPOS) is a recently-developed scalable Bayesian sampling framework that unifies stochastic gradient MCMC (SG-MCMC) and Stein variational gradient descent (SVGD) algorithms based on Wasserstein gradient flows. With a rigorous non-asymptotic convergence theory developed recently, SPOS avoids the particle-collapsing pitfall of SVGD. Nevertheless, variance reduction in SPOS has never been studied. In this paper, we bridge the gap by presenting several variance-reduction techniques for SPOS. Specifically, we propose three variants of variance-reduced SPOS, called SAGA particle-optimization sampling (SAGA-POS), SVRG particle-optimization sampling (SVRG-POS) and a variant of SVRG-POS which avoids full gradient computations, denoted as SVRG-POS$^+$. Importantly, we provide non-asymptotic convergence guarantees for these algorithms in terms of 2-Wasserstein metric and analyze their complexities. Remarkably, the results show our algorithms yield better convergence rates than existing variance-reduced variants of stochastic Langevin dynamics, even though more space is required to store the particles in training. Our theory well aligns with experimental results on both synthetic and real datasets.
Tasks
Published	2018-11-20
URL	http://arxiv.org/abs/1811.08052v1
PDF	http://arxiv.org/pdf/1811.08052v1.pdf
PWC	https://paperswithcode.com/paper/variance-reduction-in-stochastic-particle
Repo
Framework
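
For readers unfamiliar with the SVRG idea that SVRG-POS builds on, the sketch below shows only the generic SVRG gradient estimator on a toy one-dimensional problem. It deliberately omits the particle interaction and noise terms of SPOS; the per-sample loss, the data, and all names are assumptions made for illustration.

```python
def svrg_gradient(grad_i, x, snapshot, snapshot_full_grad, i):
    """Generic SVRG estimator: a per-sample gradient corrected by a snapshot term."""
    return grad_i(x, i) - grad_i(snapshot, i) + snapshot_full_grad


# Toy per-sample loss: 0.5 * a_i * (x - d_i)^2, so the gradient is a_i * (x - d_i).
curvatures = [1.0, 2.0, 4.0]
targets = [1.0, 2.0, 6.0]


def grad_i(x, i):
    return curvatures[i] * (x - targets[i])


def full_grad(x):
    return sum(grad_i(x, i) for i in range(len(targets))) / len(targets)


snapshot = 0.0
snapshot_full_grad = full_grad(snapshot)  # computed once per outer loop
x = 1.5

estimates = [svrg_gradient(grad_i, x, snapshot, snapshot_full_grad, i)
             for i in range(len(targets))]
mean_estimate = sum(estimates) / len(estimates)
```

Averaged over a uniformly chosen index, the estimator equals the full gradient at `x`, and its variance shrinks as `x` approaches the snapshot; this is the property that variance-reduced samplers exploit.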

VERAM: View-Enhanced Recurrent Attention Model for 3D Shape Classification


Title	VERAM: View-Enhanced Recurrent Attention Model for 3D Shape Classification
Authors	Songle Chen, Lintao Zheng, Yan Zhang, Zhixin Sun, Kai Xu
Abstract	Multi-view deep neural network is perhaps the most successful approach in 3D shape classification. However, the fusion of multi-view features based on max or average pooling lacks a view selection mechanism, limiting its application in, e.g., multi-view active object recognition by a robot. This paper presents VERAM, a recurrent attention model capable of actively selecting a sequence of views for highly accurate 3D shape classification. VERAM addresses an important issue commonly found in existing attention-based models, i.e., the unbalanced training of the subnetworks corresponding to next view estimation and shape classification. The classification subnetwork is easily overfitted while the view estimation one is usually poorly trained, leading to a suboptimal classification performance. This is surmounted by three essential view-enhancement strategies: 1) enhancing the information flow of gradient backpropagation for the view estimation subnetwork, 2) devising a highly informative reward function for the reinforcement training of view estimation and 3) formulating a novel loss function that explicitly circumvents view duplication. Taking grayscale image as input and AlexNet as CNN architecture, VERAM with 9 views achieves instance-level and class-level accuracy of 95.5% and 95.3% on ModelNet10, 93.7% and 92.1% on ModelNet40, both are the state-of-the-art performance under the same number of views.
Tasks	Object Recognition
Published	2018-08-20
URL	http://arxiv.org/abs/1808.06698v1
PDF	http://arxiv.org/pdf/1808.06698v1.pdf
PWC	https://paperswithcode.com/paper/veram-view-enhanced-recurrent-attention-model
Repo
Framework

Convolutional Neural Network for Trajectory Prediction


Title	Convolutional Neural Network for Trajectory Prediction
Authors	Nishant Nikhil, Brendan Tran Morris
Abstract	Predicting trajectories of pedestrians is quintessential for autonomous robots which share the same environment with humans. In order to effectively and safely interact with humans, trajectory prediction needs to be both precise and computationally efficient. In this work, we propose a convolutional neural network (CNN) based human trajectory prediction approach. Unlike more recent LSTM-based models which attend sequentially to each frame, our model supports increased parallelism and effective temporal representation. The proposed compact CNN model is faster than the current approaches yet still yields competitive results.
Tasks	Trajectory Prediction
Published	2018-09-03
URL	http://arxiv.org/abs/1809.00696v2
PDF	http://arxiv.org/pdf/1809.00696v2.pdf
PWC	https://paperswithcode.com/paper/convolutional-neural-network-for-trajectory
Repo
Framework

Removing Hidden Confounding by Experimental Grounding


Title	Removing Hidden Confounding by Experimental Grounding
Authors	Nathan Kallus, Aahlad Manas Puli, Uri Shalit
Abstract	Observational data is increasingly used as a means for making individual-level causal predictions and intervention recommendations. The foremost challenge of causal inference from observational data is hidden confounding, whose presence cannot be tested in data and can invalidate any causal conclusion. Experimental data does not suffer from confounding but is usually limited in both scope and scale. We introduce a novel method of using limited experimental data to correct the hidden confounding in causal effect models trained on larger observational data, even if the observational data does not fully overlap with the experimental data. Our method makes strictly weaker assumptions than existing approaches, and we prove conditions under which it yields a consistent estimator. We demonstrate our method’s efficacy using real-world data from a large educational experiment.
Tasks	Causal Inference
Published	2018-10-27
URL	http://arxiv.org/abs/1810.11646v1
PDF	http://arxiv.org/pdf/1810.11646v1.pdf
PWC	https://paperswithcode.com/paper/removing-hidden-confounding-by-experimental
Repo
Framework

EBIC: an open source software for high-dimensional and big data biclustering analyses


Title	EBIC: an open source software for high-dimensional and big data biclustering analyses
Authors	Patryk Orzechowski, Jason H. Moore
Abstract	Motivation: In this paper we present the latest release of EBIC, a next-generation biclustering algorithm for mining genetic data. The major contribution of this paper is adding support for big data, making it possible to efficiently run large genomic data mining analyses. Additional enhancements include integration with R and Bioconductor and an option to remove the influence of missing values on the final result. Results: EBIC was applied to datasets of different sizes, including a large DNA methylation dataset with 436,444 rows. For the largest dataset we observed over 6.6 fold speedup in computation time on a cluster of 8 GPUs compared to running the method on a single GPU. This proves high scalability of the algorithm. Availability: The latest version of EBIC can be downloaded from http://github.com/EpistasisLab/ebic . Installation and usage instructions are also available online.
Tasks
Published	2018-07-26
URL	http://arxiv.org/abs/1807.09932v1
PDF	http://arxiv.org/pdf/1807.09932v1.pdf
PWC	https://paperswithcode.com/paper/ebic-an-open-source-software-for-high
Repo
Framework
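
As a small worked reading of the scalability figure quoted above, the reported speedup can be turned into a parallel efficiency (speedup divided by device count). The function name is hypothetical, and since the abstract says "over 6.6 fold", the value computed here is a lower bound.

```python
def parallel_efficiency(speedup, n_devices):
    """Fraction of ideal linear scaling achieved: speedup / number of devices."""
    return speedup / n_devices


# Figures quoted in the abstract: a 6.6-fold speedup on 8 GPUs versus 1 GPU.
efficiency = parallel_efficiency(6.6, 8)  # about 0.825, i.e. roughly 82.5% of ideal
```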

Deeply Learning Derivatives


Title	Deeply Learning Derivatives
Authors	Ryan Ferguson, Andrew Green
Abstract	This paper uses deep learning to value derivatives. The approach is broadly applicable, and we use a call option on a basket of stocks as an example. We show that the deep learning model is accurate and very fast, capable of producing valuations a million times faster than traditional models. We develop a methodology to randomly generate appropriate training data and explore the impact of several parameters including layer width and depth, training data quality and quantity on model speed and accuracy.
Tasks
Published	2018-09-06
URL	http://arxiv.org/abs/1809.02233v4
PDF	http://arxiv.org/pdf/1809.02233v4.pdf
PWC	https://paperswithcode.com/paper/deeply-learning-derivatives
Repo
Framework
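
The abstract does not say which traditional model supplies the training labels. As one plausible illustration, the sketch below prices a European call on a basket with plain Monte Carlo, assuming uncorrelated assets following geometric Brownian motion; every parameter name and value is hypothetical. A network like the one described would be trained to reproduce such prices from the input parameters.

```python
import math
import random


def basket_call_mc(spots, vols, weights, strike, rate, maturity,
                   n_paths=10000, seed=0):
    """Monte Carlo price of a European call on a weighted basket.

    Assumptions: independent assets, geometric Brownian motion,
    constant interest rate and volatilities.
    """
    rng = random.Random(seed)
    total_payoff = 0.0
    for _ in range(n_paths):
        basket = 0.0
        for s0, vol, w in zip(spots, vols, weights):
            z = rng.gauss(0.0, 1.0)
            drift = (rate - 0.5 * vol * vol) * maturity
            terminal = s0 * math.exp(drift + vol * math.sqrt(maturity) * z)
            basket += w * terminal
        total_payoff += max(basket - strike, 0.0)
    # Discount the average payoff back to today.
    return math.exp(-rate * maturity) * total_payoff / n_paths


price = basket_call_mc(spots=[100.0, 95.0], vols=[0.2, 0.3],
                       weights=[0.5, 0.5], strike=100.0,
                       rate=0.01, maturity=1.0)
```

Each call simulates thousands of paths, which is the cost a trained network avoids at inference time; the speedup factor quoted in the abstract is the authors' own measurement and is not reproduced by this sketch.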

High Resolution Face Completion with Multiple Controllable Attributes via Fully End-to-End Progressive Generative Adversarial Networks


Title	High Resolution Face Completion with Multiple Controllable Attributes via Fully End-to-End Progressive Generative Adversarial Networks
Authors	Zeyuan Chen, Shaoliang Nie, Tianfu Wu, Christopher G. Healey
Abstract	We present a deep learning approach for high resolution face completion with multiple controllable attributes (e.g., male and smiling) under arbitrary masks. Face completion entails understanding both structural meaningfulness and appearance consistency locally and globally to fill in “holes” whose content does not appear elsewhere in an input image. It is a challenging task with the difficulty level increasing significantly with respect to high resolution, the complexity of “holes” and the controllable attributes of filled-in fragments. Our system addresses the challenges by learning a fully end-to-end framework that trains generative adversarial networks (GANs) progressively from low resolution to high resolution with conditional vectors encoding controllable attributes. We design novel network architectures to exploit information across multiple scales effectively and efficiently. We introduce new loss functions encouraging sharp completion. We show that our system can complete faces with large structural and appearance variations using a single feed-forward pass of computation with mean inference time of 0.007 seconds for images at 1024 x 1024 resolution. We also perform a pilot human study that shows our approach outperforms state-of-the-art face completion methods in terms of rank analysis. The code will be released upon publication.
Tasks	Facial Inpainting
Published	2018-01-23
URL	http://arxiv.org/abs/1801.07632v1
PDF	http://arxiv.org/pdf/1801.07632v1.pdf
PWC	https://paperswithcode.com/paper/high-resolution-face-completion-with-multiple
Repo
Framework

Quantization-Aware Phase Retrieval


Title	Quantization-Aware Phase Retrieval
Authors	Subhadip Mukherjee, Chandra Sekhar Seelamantula
Abstract	We address the problem of phase retrieval (PR) from quantized measurements. The goal is to reconstruct a signal from quadratic measurements encoded with a finite precision, which is indeed the case in many practical applications. We develop a rank-1 projection algorithm that recovers the signal subject to ensuring consistency with the measurement, that is, the recovered signal when encoded must yield the same set of measurements that one started with. The rank-1 projection stems from the idea of lifting, originally proposed in the context of PhaseLift. The consistency criterion is enforced using a one-sided quadratic cost. We also determine the probability with which different vectors lead to the same set of quantized measurements, which makes it impossible to resolve them. Naturally, this probability depends on how correlated such vectors are, and how coarsely/finely the measurements get quantized. The proposed algorithm is also capable of incorporating a sparsity constraint on the signal. An analysis of the cost function reveals that it is bounded, both above and below, by functions that are dependent on how well correlated the estimate is with the ground truth. We also derive the Cramér-Rao lower bound (CRB) on the achievable reconstruction accuracy. A comparison with the state-of-the-art algorithms shows that the proposed algorithm has a higher reconstruction accuracy and is about 2 to 3 dB away from the CRB. The edge, in terms of the reconstruction signal-to-noise ratio, over the competing algorithms is higher (about 5 to 6 dB) when the quantization is coarse.
Tasks	Quantization
Published	2018-10-02
URL	http://arxiv.org/abs/1810.01097v1
PDF	http://arxiv.org/pdf/1810.01097v1.pdf
PWC	https://paperswithcode.com/paper/quantization-aware-phase-retrieval
Repo
Framework
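
The consistency requirement described above can be illustrated with a toy check. The sketch assumes real-valued signals, measurements of the form (a_i · x)^2, and a uniform floor quantizer with step `step`; the quantizer, the sensing vectors, and the function names are assumptions, and the rank-1 projection algorithm itself is not reproduced here.

```python
import math


def quadratic_measurements(sensing_vectors, signal):
    """Phaseless measurements y_i = (a_i . x)^2 for real-valued vectors."""
    return [sum(a * x for a, x in zip(vec, signal)) ** 2
            for vec in sensing_vectors]


def quantize(values, step):
    """Uniform quantizer: index of the bin of width `step` containing each value."""
    return [math.floor(v / step) for v in values]


def is_consistent(sensing_vectors, estimate, quantized, step):
    """True if re-encoding the estimate reproduces the observed quantized data."""
    return quantize(quadratic_measurements(sensing_vectors, estimate), step) == quantized


A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x_true = [1.0, 2.0]
step = 0.5
observed = quantize(quadratic_measurements(A, x_true), step)

same_up_to_sign = is_consistent(A, [-1.0, -2.0], observed, step)  # global sign is lost
nearby_vector = is_consistent(A, [1.05, 2.0], observed, step)     # falls in the same bins
distant_vector = is_consistent(A, [2.0, 1.0], observed, step)
```

The nearby vector passing the check shows why distinct signals can become unresolvable after quantization, an effect that grows as `step` gets coarser.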