February 1, 2020

Paper Group AWR 136

Unified Sample-Optimal Property Estimation in Near-Linear Time


Title	Unified Sample-Optimal Property Estimation in Near-Linear Time
Authors	Yi Hao, Alon Orlitsky
Abstract	We consider the fundamental learning problem of estimating properties of distributions over large domains. Using a novel piecewise-polynomial approximation technique, we derive the first unified methodology for constructing sample- and time-efficient estimators for all sufficiently smooth, symmetric and non-symmetric, additive properties. This technique yields near-linear-time computable estimators whose approximation values are asymptotically optimal and highly concentrated, resulting in the first: 1) estimators achieving the $\mathcal{O}(k/(\varepsilon^2\log k))$ min-max $\varepsilon$-error sample complexity for all $k$-symbol Lipschitz properties; 2) unified near-optimal differentially private estimators for a variety of properties; 3) unified estimator achieving optimal bias and near-optimal variance for five important properties; 4) near-optimal sample-complexity estimators for several important symmetric properties over both domain sizes and confidence levels. In addition, we establish a McDiarmid’s inequality under Poisson sampling, which is of independent interest.
Tasks
Published	2019-11-08
URL	https://arxiv.org/abs/1911.03105v2
PDF	https://arxiv.org/pdf/1911.03105v2.pdf
PWC	https://paperswithcode.com/paper/unified-sample-optimal-property-estimation-in
Repo	https://github.com/ucsdyi/unified_polynomial_poster
Framework	none

Detecting and Tracking Small Moving Objects in Wide Area Motion Imagery (WAMI) Using Convolutional Neural Networks (CNNs)


Title	Detecting and Tracking Small Moving Objects in Wide Area Motion Imagery (WAMI) Using Convolutional Neural Networks (CNNs)
Authors	Yifan Zhou, Simon Maskell
Abstract	This paper proposes an approach to detect moving objects in Wide Area Motion Imagery (WAMI), in which the objects are both small and well separated. Identifying the objects only using foreground appearance is difficult since a 100-pixel vehicle is hard to distinguish from objects comprising the background. Our approach is based on background subtraction as an efficient and unsupervised method that is able to output the shape of objects. In order to reliably detect low contrast and small objects, we configure the background subtraction to extract foreground regions that might be objects of interest. While this dramatically increases the number of false alarms, a Convolutional Neural Network (CNN) considering both spatial and temporal information is then trained to reject the false alarms. In areas with heavy traffic, the background subtraction yields merged detections. To reduce the complexity of the multi-target tracker needed, we train another CNN to predict the positions of multiple moving objects in an area. Our approach shows competitive detection performance on smaller objects relative to the state-of-the-art. We adopt a GM-PHD filter to associate detections over time and analyse the resulting performance.
Tasks
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01727v2
PDF	https://arxiv.org/pdf/1911.01727v2.pdf
PWC	https://paperswithcode.com/paper/detecting-and-tracking-small-moving-objects
Repo	https://github.com/zhouyifan233/MovingObjDetector-WAMI.python
Framework	tf
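
The background-subtraction stage described in the abstract above can be illustrated with a minimal sketch. It assumes a per-pixel temporal median as the background model and a simple absolute-difference threshold; both are illustrative choices, not the authors' pipeline, and the function names are hypothetical. The CNN stages that reject false alarms and predict object positions are not shown.

```python
from statistics import median

def median_background(frames):
    """Estimate the background as the per-pixel median over a stack of frames.

    frames: list of 2D lists (rows of grayscale intensities), all the same size.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[r][c] for frame in frames) for c in range(width)]
        for r in range(height)
    ]

def foreground_mask(frame, background, threshold):
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(pixel - bg) > threshold for pixel, bg in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]

# Toy sequence: a bright single-pixel "object" moving left to right over a flat scene.
frames = [
    [[200, 10, 10, 10]],
    [[10, 200, 10, 10]],
    [[10, 10, 200, 10]],
]
background = median_background(frames)
mask = foreground_mask(frames[1], background, threshold=50)
```

Lowering `threshold` makes the mask more permissive, which mirrors the trade-off in the abstract: more low-contrast objects are kept, at the cost of more false alarms for a later classifier to reject.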

Reinforcement Learning for Channel Coding: Learned Bit-Flipping Decoding


Title	Reinforcement Learning for Channel Coding: Learned Bit-Flipping Decoding
Authors	Fabrizio Carpi, Christian Häger, Marco Martalò, Riccardo Raheli, Henry D. Pfister
Abstract	In this paper, we use reinforcement learning to find effective decoding strategies for binary linear codes. We start by reviewing several iterative decoding algorithms that involve a decision-making process at each step, including bit-flipping (BF) decoding, residual belief propagation, and anchor decoding. We then illustrate how such algorithms can be mapped to Markov decision processes allowing for data-driven learning of optimal decision strategies, rather than basing decisions on heuristics or intuition. As a case study, we consider BF decoding for both the binary symmetric and additive white Gaussian noise channel. Our results show that learned BF decoders can offer a range of performance-complexity trade-offs for the considered Reed-Muller and BCH codes, and achieve near-optimal performance in some cases. We also demonstrate learning convergence speed-ups when biasing the learning process towards correct decoding decisions, as opposed to relying only on random explorations and past knowledge.
Tasks	Decision Making
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04448v2
PDF	https://arxiv.org/pdf/1906.04448v2.pdf
PWC	https://paperswithcode.com/paper/reinforcement-learning-for-channel-coding
Repo	https://github.com/fabriziocarpi/RLdecoding
Framework	tf
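
For readers unfamiliar with bit-flipping (BF) decoding, the following is a minimal sketch of a hand-crafted hard-decision BF rule on the (7,4) Hamming code. It is not the learned decoder from the paper: the code choice, the scoring rule (unsatisfied minus satisfied parity checks), and all names are assumptions made for illustration. In the paper's framing, the choice of which bit to flip at each step is the decision that reinforcement learning would replace.

```python
# Parity-check matrix of the (7,4) Hamming code.
# Column j holds the binary expansion of j + 1 (least significant bit in row 0).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def syndrome(H, word):
    """Return the parity of each check; all zeros means word is a codeword."""
    return [sum(h * b for h, b in zip(row, word)) % 2 for row in H]

def bit_flip_decode(H, received, max_iters=10):
    """Greedy hard-decision bit flipping.

    At each step, flip the bit with the largest excess of unsatisfied over
    satisfied parity checks, and stop once every check is satisfied.
    """
    word = list(received)
    for _ in range(max_iters):
        s = syndrome(H, word)
        if not any(s):
            break  # valid codeword reached
        scores = []
        for j in range(len(word)):
            unsatisfied = sum(1 for i, row in enumerate(H) if row[j] and s[i])
            satisfied = sum(1 for i, row in enumerate(H) if row[j] and not s[i])
            scores.append(unsatisfied - satisfied)
        flip = max(range(len(word)), key=lambda j: scores[j])
        word[flip] ^= 1
    return word

codeword = [1, 1, 1, 0, 0, 0, 0]   # satisfies all three checks
received = list(codeword)
received[5] ^= 1                   # single bit error on a binary symmetric channel
decoded = bit_flip_decode(H, received)
```

Each loop iteration is one decision step, so the same structure maps naturally onto a Markov decision process whose state is the syndrome and whose action is the index of the bit to flip.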

Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces


Title	Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces
Authors	Sean R. Sinclair, Siddhartha Banerjee, Christina Lee Yu
Abstract	We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel $Q$-learning policy with adaptive data-driven discretization. The central idea is to maintain a finer partition of the state-action space in regions which are frequently visited in historical trajectories, and have higher payoff estimates. We demonstrate how our adaptive partitions take advantage of the shape of the optimal $Q$-function and the joint space, without sacrificing the worst-case performance. In particular, we recover the regret guarantees of prior algorithms for continuous state-action spaces, which additionally require either an optimal discretization as input, and/or access to a simulation oracle. Moreover, experiments demonstrate how our algorithm automatically adapts to the underlying structure of the problem, resulting in much better performance compared both to heuristics and $Q$-learning with uniform discretization.
Tasks	Q-Learning
Published	2019-10-17
URL	https://arxiv.org/abs/1910.08151v2
PDF	https://arxiv.org/pdf/1910.08151v2.pdf
PWC	https://paperswithcode.com/paper/adaptive-discretization-for-episodic
Repo	https://github.com/seanrsinclair/AdaptiveQLearning
Framework	none
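
The idea of keeping a finer partition where the agent visits often can be sketched in one dimension. This is only an illustration under simplifying assumptions: the space is the interval [0, 1), a cell splits in half after a fixed number of visits (the hypothetical `split_after` parameter), and payoff estimates play no role in the split decision, unlike the algorithm in the paper. The class and function names are invented for this sketch.

```python
class Cell:
    """One interval [lo, hi) of the partition, with a value estimate and a visit count."""

    def __init__(self, lo, hi, q):
        self.lo, self.hi, self.q, self.visits = lo, hi, q, 0

def find_cell(cells, x):
    """Return the cell whose interval contains x."""
    for cell in cells:
        if cell.lo <= x < cell.hi:
            return cell
    raise ValueError("x lies outside the partition")

def visit(cells, x, split_after=2):
    """Record a visit at x and split the containing cell once it has been visited enough."""
    cell = find_cell(cells, x)
    cell.visits += 1
    if cell.visits >= split_after:
        mid = (cell.lo + cell.hi) / 2.0
        i = cells.index(cell)
        # Children inherit the parent's estimate and start with fresh visit counts.
        cells[i:i + 1] = [Cell(cell.lo, mid, cell.q), Cell(mid, cell.hi, cell.q)]

cells = [Cell(0.0, 1.0, q=1.0)]
for _ in range(6):
    visit(cells, 0.1)   # repeated visits near 0.1 refine only that region
```

After the loop, the cell containing 0.1 is much narrower than the cell containing 0.9, which is the qualitative behaviour the abstract describes: resolution is spent where the trajectories actually go.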

Robust Point Cloud Based Reconstruction of Large-Scale Outdoor Scenes


Title	Robust Point Cloud Based Reconstruction of Large-Scale Outdoor Scenes
Authors	Ziquan Lan, Zi Jian Yew, Gim Hee Lee
Abstract	Outlier feature matches and loop-closures that survived front-end data association can lead to catastrophic failures in the back-end optimization of large-scale point cloud based 3D reconstruction. To alleviate this problem, we propose a probabilistic approach for robust back-end optimization in the presence of outliers. More specifically, we model the problem as a Bayesian network and solve it using the Expectation-Maximization algorithm. Our approach leverages on a long-tail Cauchy distribution to suppress outlier feature matches in the odometry constraints, and a Cauchy-Uniform mixture model with a set of binary latent variables to simultaneously suppress outlier loop-closure constraints and outlier feature matches in the inlier loop-closure constraints. Furthermore, we show that by using a Gaussian-Uniform mixture model, our approach degenerates to the formulation of a state-of-the-art approach for robust indoor reconstruction. Experimental results demonstrate that our approach has comparable performance with the state-of-the-art on a benchmark indoor dataset, and outperforms it on a large-scale outdoor dataset. Our source code can be found on the project website.
Tasks	3D Reconstruction
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09634v1
PDF	https://arxiv.org/pdf/1905.09634v1.pdf
PWC	https://paperswithcode.com/paper/190509634
Repo	https://github.com/ziquan111/RobustPCLReconstruction
Framework	none
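
To see why a long-tailed Cauchy model suppresses outliers, consider a minimal sketch of iteratively reweighted averaging with Cauchy-derived weights. It estimates a single scalar rather than poses in a reconstruction back-end, and it is not the authors' EM formulation; the function names and the `scale` parameter are assumptions for illustration. The weight 1 / (1 + (r / scale)^2) is the standard reweighting factor obtained from the Cauchy negative log-likelihood.

```python
from statistics import median

def cauchy_weight(residual, scale):
    """Weight implied by a Cauchy error model; large residuals get weights near zero."""
    return 1.0 / (1.0 + (residual / scale) ** 2)

def robust_mean(values, scale=1.0, iters=20):
    """Iteratively reweighted mean: re-estimate, recompute weights, repeat."""
    estimate = median(values)  # robust starting point
    for _ in range(iters):
        weights = [cauchy_weight(v - estimate, scale) for v in values]
        estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return estimate

measurements = [1.0, 1.1, 0.9, 1.0, 50.0]   # the last value plays the role of an outlier match
plain = sum(measurements) / len(measurements)
robust = robust_mean(measurements)
```

The plain average is dragged far from 1 by the single outlier, while the reweighted estimate stays close to the inlier cluster, because the outlier's weight shrinks roughly with the inverse square of its residual.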

Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Too Much Accuracy


Title	Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Too Much Accuracy
Authors	Alex Lamb, Vikas Verma, Juho Kannala, Yoshua Bengio
Abstract	Adversarial robustness has become a central goal in deep learning, in both theory and practice. However, successful methods to improve the adversarial robustness (such as adversarial training) greatly hurt generalization performance on the unperturbed data. This could have a major impact on how the adversarial robustness affects real world systems (i.e., many may opt to forgo robustness if it can improve accuracy on the unperturbed data). We propose Interpolated Adversarial Training, which employs recently proposed interpolation based training methods in the framework of adversarial training. On CIFAR-10, adversarial training increases the standard test error (when there is no adversary) from 4.43% to 12.32%, whereas with our Interpolated adversarial training we retain the adversarial robustness while achieving a standard test error of only 6.45%. With our technique, the relative increase in the standard error for the robust model is reduced from 178.1% to just 45.5%.
Tasks
Published	2019-06-16
URL	https://arxiv.org/abs/1906.06784v4
PDF	https://arxiv.org/pdf/1906.06784v4.pdf
PWC	https://paperswithcode.com/paper/interpolated-adversarial-training-achieving
Repo	https://github.com/shivamsaboo17/ManifoldMixup
Framework	pytorch
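
The relative-increase figures in the abstract above follow from the three quoted test errors. A short check, using only the rounded percentages given in the abstract (the helper name is an assumption):

```python
def relative_increase(baseline, value):
    """Percentage increase of value over baseline."""
    return 100.0 * (value - baseline) / baseline

standard_error = 4.43        # no adversarial training
adversarial_error = 12.32    # plain adversarial training
interpolated_error = 6.45    # Interpolated Adversarial Training

increase_adversarial = relative_increase(standard_error, adversarial_error)
increase_interpolated = relative_increase(standard_error, interpolated_error)
print(round(increase_adversarial, 1), round(increase_interpolated, 1))
```

From these rounded inputs the first figure comes out at 178.1% and the second at about 45.6%, which agrees with the quoted 45.5% up to rounding of the underlying error rates.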

BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks


Title	BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks
Authors	Yao Yao, Zixin Luo, Shiwei Li, Jingyang Zhang, Yufan Ren, Lei Zhou, Tian Fang, Long Quan
Abstract	While deep learning has recently achieved great success on multi-view stereo (MVS), limited training data makes the trained model hard to generalize to unseen scenarios. Compared with other computer vision tasks, it is rather difficult to collect a large-scale MVS dataset as it requires expensive active scanners and a labor-intensive process to obtain ground truth 3D structures. In this paper, we introduce BlendedMVS, a novel large-scale dataset, to provide sufficient training ground truth for learning-based MVS. To create the dataset, we apply a 3D reconstruction pipeline to recover high-quality textured meshes from images of well-selected scenes. Then, we render these mesh models to color images and depth maps. The rendered color images are further blended with the input images to generate photo-realistic blended images as the training input. Our dataset contains over 17k high-resolution images covering a variety of scenes, including cities, architectures, sculptures and small objects. Extensive experiments demonstrate that BlendedMVS endows the trained model with significantly better generalization ability compared with other MVS datasets. The entire dataset with pretrained models will be made publicly available at https://github.com/YoYo000/BlendedMVS.
Tasks	3D Reconstruction
Published	2019-11-22
URL	https://arxiv.org/abs/1911.10127v1
PDF	https://arxiv.org/pdf/1911.10127v1.pdf
PWC	https://paperswithcode.com/paper/blendedmvs-a-large-scale-dataset-for
Repo	https://github.com/YoYo000/BlendedMVS
Framework	tf

Unsupervised Learning of Anomaly Detection from Contaminated Image Data using Simultaneous Encoder Training


Title	Unsupervised Learning of Anomaly Detection from Contaminated Image Data using Simultaneous Encoder Training
Authors	Amanda Berg, Jörgen Ahlberg, Michael Felsberg
Abstract	Unsupervised learning of anomaly detection in high-dimensional data, such as images, is a challenging problem recently subject to intense research. Through careful modelling of the data distribution of normal samples, it is possible to detect deviant samples, so-called anomalies. Generative Adversarial Networks (GANs) can model the highly complex, high-dimensional data distribution of normal image samples, and have been shown to be a suitable approach to the problem. Previously published GAN-based anomaly detection methods often assume that anomaly-free data is available for training. However, this assumption is not valid in most real-life scenarios, i.e., in the wild. In this work, we evaluate the effects of anomaly contaminations in the training data on state-of-the-art GAN-based anomaly detection methods. As expected, detection performance deteriorates. To address this performance drop, we propose to add an additional encoder network already at training time and show that joint generator-encoder training stratifies the latent space, mitigating the problem with contaminated data. We show experimentally that the norm of a query image in this stratified latent space becomes a highly significant cue to discriminate anomalies from normal data. The proposed method achieves state-of-the-art performance on CIFAR-10 as well as on a large, previously untested dataset with cell images.
Tasks	Anomaly Detection
Published	2019-05-27
URL	https://arxiv.org/abs/1905.11034v2
PDF	https://arxiv.org/pdf/1905.11034v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-learning-of-anomaly-detection
Repo	https://github.com/amandaberg/GANanomalyDetection
Framework	tf

Federated PCA with Adaptive Rank Estimation


Title	Federated PCA with Adaptive Rank Estimation
Authors	Andreas Grammenos, Rodrigo Mendoza-Smith, Cecilia Mascolo, Jon Crowcroft
Abstract	In many online machine learning and data science tasks such as data summarisation and feature compression, $d$-dimensional vectors are usually distributed across a large number of clients in a decentralised network and collected in a streaming fashion. This is increasingly common in modern applications due to the sheer volume of data generated and the clients’ constrained resources. In this setting, some clients are required to compute an update to a centralised target model independently using local data while other clients aggregate these updates with a low-complexity merging algorithm. However, some clients with limited storage might not be able to store all of the data samples if $d$ is large, nor compute procedures requiring at least $\Omega(d^2)$ storage-complexity such as Principal Component Analysis, Subspace Tracking, or general Feature Correlation. In this work, we present a novel federated algorithm for PCA that is able to adaptively estimate the rank $r$ of the dataset and compute its $r$ leading principal components when only $O(dr)$ memory is available. This inherent adaptability implies that $r$ does not have to be supplied as a fixed hyper-parameter which is beneficial when the underlying data distribution is not known in advance, such as in a streaming setting. Numerical simulations show that, while using limited-memory, our algorithm exhibits state-of-the-art performance that closely matches or outperforms traditional non-federated algorithms, and in the absence of communication latency, it exhibits attractive horizontal scalability.
Tasks
Published	2019-07-18
URL	https://arxiv.org/abs/1907.08059v1
PDF	https://arxiv.org/pdf/1907.08059v1.pdf
PWC	https://paperswithcode.com/paper/federated-pca-with-adaptive-rank-estimation
Repo	https://github.com/andylamp/federated_pca
Framework	none

Morphological Irregularity Correlates with Frequency


Title	Morphological Irregularity Correlates with Frequency
Authors	Shijie Wu, Ryan Cotterell, Timothy J. O’Donnell
Abstract	We present a study of morphological irregularity. Following recent work, we define an information-theoretic measure of irregularity based on the predictability of forms in a language. Using a neural transduction model, we estimate this quantity for the forms in 28 languages. We first present several validatory and exploratory analyses of irregularity. We then show that our analyses provide evidence for a correlation between irregularity and frequency: higher frequency items are more likely to be irregular and irregular items are more likely to be highly frequent. To our knowledge, this result is the first of its breadth and confirms longstanding proposals from the linguistics literature. The correlation is more robust when aggregated at the level of whole paradigms, providing support for models of linguistic structure in which inflected forms are unified by abstract underlying stems or lexemes. Code is available at https://github.com/shijie-wu/neural-transducer.
Tasks
Published	2019-06-27
URL	https://arxiv.org/abs/1906.11483v1
PDF	https://arxiv.org/pdf/1906.11483v1.pdf
PWC	https://paperswithcode.com/paper/morphological-irregularity-correlates-with
Repo	https://github.com/shijie-wu/neural-transducer
Framework	pytorch

Self-Supervised Deep Depth Denoising


Title	Self-Supervised Deep Depth Denoising
Authors	Vladimiros Sterzentsenko, Leonidas Saroglou, Anargyros Chatzitofis, Spyridon Thermos, Nikolaos Zioulis, Alexandros Doumanoglou, Dimitrios Zarpalas, Petros Daras
Abstract	Depth perception is considered an invaluable source of information for various vision tasks. However, depth maps acquired using consumer-level sensors still suffer from non-negligible noise. This fact has recently motivated researchers to exploit traditional filters, as well as the deep learning paradigm, in order to suppress the aforementioned non-uniform noise, while preserving geometric details. Despite the effort, deep depth denoising is still an open challenge mainly due to the lack of clean data that could be used as ground truth. In this paper, we propose a fully convolutional deep autoencoder that learns to denoise depth maps, surpassing the lack of ground truth data. Specifically, the proposed autoencoder exploits multiple views of the same scene from different points of view in order to learn to suppress noise in a self-supervised end-to-end manner using depth and color information during training, yet only depth during inference. To enforce self-supervision, we leverage a differentiable rendering technique to exploit photometric supervision, which is further regularized using geometric and surface priors. As the proposed approach relies on raw data acquisition, a large RGB-D corpus is collected using Intel RealSense sensors. Complementary to a quantitative evaluation, we demonstrate the effectiveness of the proposed self-supervised denoising approach on established 3D reconstruction applications. Code is available at https://github.com/VCL3D/DeepDepthDenoising.
Tasks	3D Reconstruction, Denoising
Published	2019-09-03
URL	https://arxiv.org/abs/1909.01193v2
PDF	https://arxiv.org/pdf/1909.01193v2.pdf
PWC	https://paperswithcode.com/paper/self-supervised-deep-depth-denoising
Repo	https://github.com/VCL3D/DeepDepthDenoising
Framework	pytorch

iPhys: An Open Non-Contact Imaging-Based Physiological Measurement Toolbox


Title	iPhys: An Open Non-Contact Imaging-Based Physiological Measurement Toolbox
Authors	Daniel McDuff, Ethan Blackford
Abstract	Imaging-based, non-contact measurement of physiology (including imaging photoplethysmography and imaging ballistocardiography) is a growing field of research. There are several strengths of imaging methods that make them attractive. They remove the need for uncomfortable contact sensors and can enable spatial and concomitant measurement from a single sensor. Furthermore, cameras are ubiquitous and often low-cost solutions for sensing. Open source toolboxes help accelerate the progress of research by providing a means to compare new approaches against standard implementations of the state-of-the-art. We present an open source imaging-based physiological measurement toolbox with implementations of many of the most frequently employed computational methods. We hope that this toolbox will contribute to the advancement of non-contact physiological sensing methods.
Tasks
Published	2019-01-14
URL	http://arxiv.org/abs/1901.04366v1
PDF	http://arxiv.org/pdf/1901.04366v1.pdf
PWC	https://paperswithcode.com/paper/iphys-an-open-non-contact-imaging-based
Repo	https://github.com/danmcduff/iphys-toolbox
Framework	none

Transformable Bottleneck Networks


Title	Transformable Bottleneck Networks
Authors	Kyle Olszewski, Sergey Tulyakov, Oliver Woodford, Hao Li, Linjie Luo
Abstract	We propose a novel approach to performing fine-grained 3D manipulation of image content via a convolutional neural network, which we call the Transformable Bottleneck Network (TBN). It applies given spatial transformations directly to a volumetric bottleneck within our encoder-bottleneck-decoder architecture. Multi-view supervision encourages the network to learn to spatially disentangle the feature space within the bottleneck. The resulting spatial structure can be manipulated with arbitrary spatial transformations. We demonstrate the efficacy of TBNs for novel view synthesis, achieving state-of-the-art results on a challenging benchmark. We demonstrate that the bottlenecks produced by networks trained for this task contain meaningful spatial structure that allows us to intuitively perform a variety of image manipulations in 3D, well beyond the rigid transformations seen during training. These manipulations include non-uniform scaling, non-rigid warping, and combining content from different images. Finally, we extract explicit 3D structure from the bottleneck, performing impressive 3D reconstruction from a single input image.
Tasks	3D Reconstruction, Novel View Synthesis
Published	2019-04-13
URL	https://arxiv.org/abs/1904.06458v5
PDF	https://arxiv.org/pdf/1904.06458v5.pdf
PWC	https://paperswithcode.com/paper/transformable-bottleneck-networks
Repo	https://github.com/kyleolsz/TB-Networks
Framework	pytorch

A generalization of regularized dual averaging and its dynamics


Title	A generalization of regularized dual averaging and its dynamics
Authors	Shih-Kang Chao, Guang Cheng
Abstract	Excessive computational cost for learning large data and streaming data can be alleviated by using stochastic algorithms, such as stochastic gradient descent and its variants. Recent advances improve stochastic algorithms on convergence speed, adaptivity and structural awareness. However, distributional aspects of these new algorithms are poorly understood, especially for structured parameters. To develop statistical inference in this case, we propose a class of generalized regularized dual averaging (gRDA) algorithms with constant step size, which improves RDA (Xiao, 2010; Flammarion and Bach, 2017). Weak convergence of gRDA trajectories is studied, and as a consequence, for the first time in the literature, the asymptotic distributions for online $\ell_1$-penalized problems become available. These general results apply to both convex and non-convex differentiable loss functions, and in particular, recover the existing regret bound for convex losses (Nemirovski et al., 2009). As important applications, statistical inferential theory on online sparse linear regression and online sparse principal component analysis are developed, and are supported by extensive numerical analysis. Interestingly, when gRDA is properly tuned, support recovery and central limiting distribution (with mean zero) hold simultaneously in the online setting, which is in contrast with the biased central limiting distribution of batch Lasso (Knight and Fu, 2000). Technical devices, including weak convergence of stochastic mirror descent, are developed as by-products with independent interest. Preliminary empirical analysis of modern image data shows that learning very sparse deep neural networks by gRDA does not necessarily sacrifice testing accuracy.
Tasks
Published	2019-09-22
URL	https://arxiv.org/abs/1909.10072v1
PDF	https://arxiv.org/pdf/1909.10072v1.pdf
PWC	https://paperswithcode.com/paper/190910072
Repo	https://github.com/donlan2710/gRDA-Optimizer
Framework	tf
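
As background for how dual averaging with an l1 penalty produces exact zeros, here is a minimal sketch of a soft-thresholding step applied to an averaged gradient. It is a simplified illustration in the spirit of classical l1-regularized dual averaging, not the gRDA algorithm from the paper; the sqrt(t) / gamma scaling, the parameter names `lam` and `gamma`, and the function names are assumptions made for this example.

```python
import math

def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become exactly zero."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def l1_dual_averaging_step(avg_grad, t, lam, gamma):
    """Compute the next iterate from the running average of past gradients.

    Coordinates whose averaged gradient is at most lam in magnitude are set to
    zero, which is what makes the iterates sparse.
    """
    scale = math.sqrt(t) / gamma
    return [-scale * soft_threshold(g, lam) for g in avg_grad]

# Hypothetical averaged gradient after t = 4 steps.
weights = l1_dual_averaging_step([0.2, -2.0], t=4, lam=0.5, gamma=1.0)
```

Here the first coordinate is zeroed because its averaged gradient falls below the threshold, while the second moves against the sign of its gradient. Because the threshold acts on the averaged gradient rather than on a single noisy gradient, sparsity patterns tend to be more stable across iterations.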

Face Beautification: Beyond Makeup Transfer


Title	Face Beautification: Beyond Makeup Transfer
Authors	Xudong Liu, Ruizhe Wang, Chih-Fan Chen, Minglei Yin, Hao Peng, Shukhan Ng, Xin Li
Abstract	Facial appearance plays an important role in our social lives. Subjective perception of women’s beauty depends on various face-related (e.g., skin, shape, hair) and environmental (e.g., makeup, lighting, angle) factors. Similar to cosmetic surgery in the physical world, virtual face beautification is an emerging field with many open issues to be addressed. Inspired by the latest advances in style-based synthesis and face beauty prediction, we propose a novel framework of face beautification. For a given reference face with a high beauty score, our GAN-based architecture is capable of translating an inquiry face into a sequence of beautified face images with referenced beauty style and targeted beauty score values. To achieve this objective, we propose to integrate both style-based beauty representation (extracted from the reference face) and beauty score prediction (trained on the SCUT-FBP database) into the process of beautification. Unlike makeup transfer, our approach targets many-to-many (instead of one-to-one) translation where multiple outputs can be defined by either different references or varying beauty scores. Extensive experimental results are reported to demonstrate the effectiveness and flexibility of the proposed face beautification framework.
Tasks
Published	2019-12-08
URL	https://arxiv.org/abs/1912.03630v2
PDF	https://arxiv.org/pdf/1912.03630v2.pdf
PWC	https://paperswithcode.com/paper/face-beautification-beyond-makeup-transfer
Repo	https://github.com/YuchaoZheng/Google_camp
Framework	pytorch