February 1, 2020

Paper Group AWR 101

FreeLabel: A Publicly Available Annotation Tool based on Freehand Traces

Title FreeLabel: A Publicly Available Annotation Tool based on Freehand Traces
Authors Philipe A. Dias, Zhou Shen, Amy Tabb, Henry Medeiros
Abstract Large-scale annotation of image segmentation datasets is often prohibitively expensive, as it usually requires a huge number of worker hours to obtain high-quality results. Abundant and reliable data have, however, been crucial to the advances in image understanding achieved by deep learning models. In this paper, we introduce FreeLabel, an intuitive open-source web interface that allows users to obtain high-quality segmentation masks with just a few freehand scribbles, in a matter of seconds. The efficacy of FreeLabel is quantitatively demonstrated by experimental results on the PASCAL dataset as well as on a dataset from the agricultural domain. Designed to benefit the computer vision community, FreeLabel can be used for both crowdsourced and private annotation and has a modular structure that can be easily adapted to any image dataset.
Tasks Semantic Segmentation
Published 2019-02-18
URL http://arxiv.org/abs/1902.06806v2
PDF http://arxiv.org/pdf/1902.06806v2.pdf
PWC https://paperswithcode.com/paper/freelabel-a-publicly-available-annotation
Repo https://github.com/philadias/freelabel
Framework none
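
The scribble-to-mask idea behind FreeLabel is easy to picture with a toy region grower. Below is a minimal sketch, assuming an RGB image with values in [0, 1] and a seed map of class ids drawn by scribbles; the actual tool refines traces with the RGR algorithm, so this illustrates the concept rather than reproducing the tool's code.

```python
import numpy as np
from collections import deque

def grow_from_scribbles(image, seeds, tau=0.1):
    """Expand labeled scribble pixels to color-similar neighbors (4-connected).
    image: (H, W, 3) floats in [0, 1]; seeds: (H, W) ints, 0 = unlabeled."""
    h, w = seeds.shape
    labels = seeds.copy()
    queue = deque(zip(*np.nonzero(seeds)))
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] == 0:
                # Claim the neighbor if its color is close to the current pixel.
                if np.linalg.norm(image[ny, nx] - image[y, x]) < tau:
                    labels[ny, nx] = labels[y, x]
                    queue.append((ny, nx))
    return labels
```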

Interactive-Predictive Neural Machine Translation through Reinforcement and Imitation

Title Interactive-Predictive Neural Machine Translation through Reinforcement and Imitation
Authors Tsz Kin Lam, Shigehiko Schamoni, Stefan Riezler
Abstract We propose an interactive-predictive neural machine translation framework for easier model personalization using reinforcement and imitation learning. During the interactive translation process, the user is asked for feedback on uncertain locations identified by the system. Responses are weak feedback in the form of “keep” and “delete” edits, and expert demonstrations in the form of “substitute” edits. Conditioning on the collected feedback, the system creates alternative translations via constrained beam search. In simulation experiments on two language pairs our systems get close to the performance of supervised training with much less human effort.
Tasks Imitation Learning, Machine Translation
Published 2019-07-04
URL https://arxiv.org/abs/1907.02326v2
PDF https://arxiv.org/pdf/1907.02326v2.pdf
PWC https://paperswithcode.com/paper/interactive-predictive-neural-machine
Repo https://github.com/heidelkin/IPNMT_RL_IL
Framework none
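
The interaction loop in the abstract can be sketched in a few lines. Here `decode_constrained` and `ask_user` are hypothetical placeholders for a constrained beam-search decoder and a (possibly simulated) user; the real system additionally updates the model from this feedback via reinforcement and imitation learning, which the sketch omits.

```python
def interactive_round(src, decode_constrained, ask_user, constraints=None):
    """One round of interactive-predictive translation: surface the most
    uncertain token, then fold the user's keep/delete/substitute edit
    back into the constraint set for the next decoding pass."""
    constraints = dict(constraints or {})
    tokens, probs = decode_constrained(src, constraints)
    i = min(range(len(tokens)), key=lambda k: probs[k])  # most uncertain slot
    action, replacement = ask_user(tokens, i)  # 'keep' | 'delete' | 'substitute'
    if action == 'keep':
        constraints[i] = tokens[i]        # weak positive feedback
    elif action == 'delete':
        constraints[i] = None             # forbid this token at slot i
    else:                                 # 'substitute': expert demonstration
        constraints[i] = replacement
    return decode_constrained(src, constraints), constraints
```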

Spatially and Temporally Efficient Non-local Attention Network for Video-based Person Re-Identification

Title Spatially and Temporally Efficient Non-local Attention Network for Video-based Person Re-Identification
Authors Chih-Ting Liu, Chih-Wei Wu, Yu-Chiang Frank Wang, Shao-Yi Chien
Abstract Video-based person re-identification (Re-ID) aims at matching video sequences of pedestrians across non-overlapping cameras. How to embed the spatial and temporal information of a video into its feature representation is a practical yet challenging problem. While most existing methods learn video characteristics by aggregating image-wise features and designing attention mechanisms in neural networks, they only explore the correlation between frames at high-level features. In this work, we target refining both the intermediate and the high-level features with non-local attention operations, and make two contributions. (i) We propose a Non-local Video Attention Network (NVAN) to incorporate video characteristics into the representation at multiple feature levels. (ii) We further introduce a Spatially and Temporally Efficient Non-local Video Attention Network (STE-NVAN) that reduces the computational complexity by exploiting the spatial and temporal redundancy present in pedestrian videos. Extensive experiments show that our NVAN outperforms the state of the art by 3.8% in rank-1 accuracy on the MARS dataset, and that our STE-NVAN has a much smaller computational footprint than existing methods.
Tasks Person Re-Identification, Video-Based Person Re-Identification
Published 2019-08-05
URL https://arxiv.org/abs/1908.01683v1
PDF https://arxiv.org/pdf/1908.01683v1.pdf
PWC https://paperswithcode.com/paper/spatially-and-temporally-efficient-non-local
Repo https://github.com/jackie840129/STE-NVAN
Framework pytorch
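
The building block here is the non-local attention operation. Below is a minimal PyTorch sketch of a spatio-temporal non-local block; NVAN inserts blocks like this at multiple feature levels, while the STE variant additionally exploits spatial and temporal redundancy to cut cost, which this sketch does not show.

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    """Minimal residual non-local block over video features (B, C, T, H, W)."""
    def __init__(self, channels, reduction=2):
        super().__init__()
        inner = channels // reduction
        self.theta = nn.Conv3d(channels, inner, 1)  # queries
        self.phi = nn.Conv3d(channels, inner, 1)    # keys
        self.g = nn.Conv3d(channels, inner, 1)      # values
        self.out = nn.Conv3d(inner, channels, 1)

    def forward(self, x):
        b, c, t, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (B, THW, C')
        k = self.phi(x).flatten(2)                     # (B, C', THW)
        v = self.g(x).flatten(2).transpose(1, 2)       # (B, THW, C')
        attn = torch.softmax(q @ k, dim=-1)            # pairwise affinities
        y = (attn @ v).transpose(1, 2).reshape(b, -1, t, h, w)
        return x + self.out(y)                         # residual connection
```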

A neural network oracle for quantum nonlocality problems in networks

Title A neural network oracle for quantum nonlocality problems in networks
Authors Tamás Kriváchy, Yu Cai, Daniel Cavalcanti, Arash Tavakoli, Nicolas Gisin, Nicolas Brunner
Abstract Characterizing quantum nonlocality in networks is a challenging but important problem. Using quantum sources one can achieve distributions which are unattainable classically. A key point in investigations is to decide whether an observed probability distribution can be reproduced using only classical resources. This causal inference task is challenging even for simple networks, both analytically and using standard numerical techniques. We propose to use neural networks as numerical tools to overcome these challenges, by learning the classical strategies required to reproduce a distribution. As such, the neural network acts as an oracle, demonstrating that a behavior is classical if it can be learned. We apply our method to several examples in the triangle configuration. After demonstrating that the method is consistent with previously known results, we give solid evidence that the distribution presented in [N. Gisin, Entropy 21(3), 325 (2019)] is indeed nonlocal as conjectured. Finally, we examine the genuinely nonlocal distribution presented in [M.-O. Renou et al., PRL 123, 140401 (2019)], and, guided by the findings of the neural network, conjecture nonlocality in a new range of parameters in these distributions. The method also allows us to estimate the noise robustness of all examined distributions.
Tasks Causal Inference
Published 2019-07-24
URL https://arxiv.org/abs/1907.10552v2
PDF https://arxiv.org/pdf/1907.10552v2.pdf
PWC https://paperswithcode.com/paper/a-neural-network-oracle-for-quantum
Repo https://github.com/tkrivachy/neural-network-for-nonlocality-in-networks
Framework none
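
The oracle idea lends itself to a compact sketch: parameterize each party's classical response function with a small network, feed the three shared hidden variables pairwise as the triangle topology dictates, and fit the resulting distribution to the target. The outcome cardinality, layer sizes, and uniform placeholder target below are assumptions for illustration only.

```python
import torch
import torch.nn as nn

N_OUT = 4  # outcomes per party (assumption)

def party():
    return nn.Sequential(nn.Linear(2, 32), nn.ReLU(),
                         nn.Linear(32, N_OUT), nn.Softmax(dim=-1))

A, B, C = party(), party(), party()
opt = torch.optim.Adam([*A.parameters(), *B.parameters(), *C.parameters()],
                       lr=1e-3)

def model_distribution(batch=4096):
    a, b, g = (torch.rand(batch, 1) for _ in range(3))  # classical sources
    pa = A(torch.cat([b, g], 1))   # party A sees beta and gamma
    pb = B(torch.cat([g, a], 1))   # party B sees gamma and alpha
    pc = C(torch.cat([a, b], 1))   # party C sees alpha and beta
    # Average the product of local responses over the hidden variables.
    return torch.einsum('ni,nj,nk->ijk', pa, pb, pc) / batch

target = torch.full((N_OUT,) * 3, 1.0 / N_OUT ** 3)  # placeholder target
for step in range(1000):
    opt.zero_grad()
    p = model_distribution()
    kl = torch.sum(target * (torch.log(target) - torch.log(p + 1e-12)))
    kl.backward()
    opt.step()
# A small final KL is evidence the target behavior is classically reproducible.
```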

ROI Pooled Correlation Filters for Visual Tracking

Title ROI Pooled Correlation Filters for Visual Tracking
Authors Yuxuan Sun, Chong Sun, Dong Wang, You He, Huchuan Lu
Abstract The ROI (region-of-interest) based pooling method performs pooling operations on cropped ROI regions for various samples and has shown great success in object detection methods. It compresses the model size while preserving localization accuracy, which also makes it useful in the visual tracking field. Though effective, the ROI-based pooling operation had not yet been considered in the correlation filter formulation. In this paper, we propose a novel ROI pooled correlation filter (RPCF) algorithm for robust visual tracking. Through mathematical derivations, we show that ROI-based pooling can be equivalently achieved by enforcing additional constraints on the learned filter weights, which makes ROI-based pooling feasible on the virtual circular samples. Besides, we develop an efficient joint training formulation for the proposed correlation filter algorithm, and derive Fourier solvers for efficient model training. Finally, we evaluate our RPCF tracker on the OTB-2013, OTB-2015 and VOT-2017 benchmark datasets. Experimental results show that our tracker performs favourably against other state-of-the-art trackers.
Tasks Object Detection, Visual Tracking
Published 2019-11-05
URL https://arxiv.org/abs/1911.01668v1
PDF https://arxiv.org/pdf/1911.01668v1.pdf
PWC https://paperswithcode.com/paper/roi-pooled-correlation-filters-for-visual-1
Repo https://github.com/rumsyx/RPCF
Framework none
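
For readers new to Fourier-domain correlation filters, here is a minimal single-channel, MOSSE-style closed-form trainer in NumPy. RPCF's actual contribution, the ROI-pooling constraints on the filter weights and the corresponding joint training formulation, sits on top of this kind of solver and is not reproduced here.

```python
import numpy as np

def train_filter(patches, sigma=2.0, lam=1e-2):
    """Closed-form correlation filter in the Fourier domain.
    patches: list of (H, W) grayscale training patches."""
    h, w = patches[0].shape
    ys, xs = np.mgrid[:h, :w]
    g = np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * sigma ** 2))
    G = np.fft.fft2(np.fft.ifftshift(g))     # desired Gaussian response
    num = np.zeros((h, w), complex)
    den = np.full((h, w), lam, complex)      # regularizer avoids divide-by-zero
    for p in patches:
        F = np.fft.fft2(p)
        num += G * np.conj(F)
        den += F * np.conj(F)
    return num / den                         # filter, Fourier domain

def respond(H, patch):
    """Correlation response map; its peak locates the target."""
    return np.real(np.fft.ifft2(H * np.fft.fft2(patch)))
```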

Autonomous Learning for Face Recognition in the Wild via Ambient Wireless Cues

Title Autonomous Learning for Face Recognition in the Wild via Ambient Wireless Cues
Authors Chris Xiaoxuan Lu, Xuan Kan, Bowen Du, Changhao Chen, Hongkai Wen, Andrew Markham, Niki Trigoni, John Stankovic
Abstract Facial recognition is a key enabling component for emerging Internet of Things (IoT) services such as smart homes or responsive offices. Through the use of deep neural networks, facial recognition has achieved excellent performance. However, this is only possible when trained with hundreds of images of each user in different viewing and lighting conditions. Clearly, this level of effort in enrolment and labelling is impossible for widespread deployment and adoption. Inspired by the fact that most people carry smart wireless devices with them, e.g. smartphones, we propose to use this wireless identifier as a supervisory label. This allows us to curate a dataset of facial images that are unique to a certain domain, e.g. a set of people in a particular office. This custom corpus can then be used to finetune existing pre-trained models, e.g. FaceNet. However, due to the vagaries of wireless propagation in buildings, the supervisory labels are noisy and weak. We propose a novel technique, AutoTune, which learns and refines the association between a face and a wireless identifier over time, by increasing the inter-cluster separation and minimizing the intra-cluster distance. Through extensive experiments with multiple users on two sites, we demonstrate the ability of AutoTune to design an environment-specific, continually evolving facial recognition system with no user effort at all.
Tasks Face Recognition
Published 2019-08-14
URL https://arxiv.org/abs/1908.09002v1
PDF https://arxiv.org/pdf/1908.09002v1.pdf
PWC https://paperswithcode.com/paper/autonomous-learning-for-face-recognition-in
Repo https://github.com/Wayfear/Autotune
Framework tf
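
The clustering objective in the abstract, larger inter-cluster separation with smaller intra-cluster distance, can be sketched as a simple loss over face embeddings grouped by their (noisy) wireless identities. The hinge margin and the exact form below are illustrative assumptions, not the paper's equations.

```python
import torch

def autotune_style_loss(embeddings, wireless_ids, margin=1.0):
    """embeddings: (N, D) face features; wireless_ids: (N,) noisy labels.
    Pull members toward their identity centroid, push centroids apart."""
    ids = wireless_ids.unique()
    cents = torch.stack([embeddings[wireless_ids == i].mean(0) for i in ids])
    intra = torch.stack([
        (embeddings[wireless_ids == i] - cents[k]).norm(dim=1).mean()
        for k, i in enumerate(ids)]).mean()
    d = torch.cdist(cents, cents)
    off_diag = d[~torch.eye(len(ids), dtype=torch.bool)]
    inter = torch.relu(margin - off_diag).mean()  # hinge on separation
    return intra + inter  # assumes at least two identities in the batch
```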

Deformable Kernels: Adapting Effective Receptive Fields for Object Deformation

Title Deformable Kernels: Adapting Effective Receptive Fields for Object Deformation
Authors Hang Gao, Xizhou Zhu, Steve Lin, Jifeng Dai
Abstract Convolutional networks are not aware of an object’s geometric variations, which leads to inefficient utilization of model and data capacity. To overcome this issue, recent works on deformation modeling seek to spatially reconfigure the data towards a common arrangement such that semantic recognition suffers less from deformation. This is typically done by augmenting static operators with learned free-form sampling grids in the image space, dynamically tuned to the data and task for adapting the receptive field. Yet adapting the receptive field does not quite reach the actual goal: what really matters to the network is the “effective” receptive field (ERF), which reflects how much each pixel contributes. It is thus natural to design other approaches to adapt the ERF directly during runtime. In this work, we instantiate one possible solution as Deformable Kernels (DKs), a family of novel and generic convolutional operators for handling object deformations by directly adapting the ERF while leaving the receptive field untouched. At the heart of our method is the ability to resample the original kernel space towards recovering the deformation of objects. This approach is justified by the theoretical insight that the ERF is strictly determined by data sampling locations and kernel values. We implement DKs as generic drop-in replacements of rigid kernels and conduct a series of empirical studies whose results conform to our theory. Over several tasks and standard base models, our approach compares favorably against prior works that adapt during runtime. In addition, further experiments suggest a working mechanism orthogonal and complementary to previous works.
Tasks Image Classification, Object Detection
Published 2019-10-07
URL https://arxiv.org/abs/1910.02940v2
PDF https://arxiv.org/pdf/1910.02940v2.pdf
PWC https://paperswithcode.com/paper/deformable-kernels-adapting-effective
Repo https://github.com/hangg7/deformable-kernels
Framework pytorch
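
The kernel-space resampling at the heart of DKs can be prototyped with `grid_sample`: treat the kernel itself as a small image and bilinearly sample it at offset locations. The sketch below uses one global offset grid; the paper's operator predicts data-dependent offsets per output location, which this simplification omits.

```python
import torch
import torch.nn.functional as F

def resample_kernel(kernel, offsets):
    """kernel: (C_out, C_in, K, K); offsets: (K, K, 2) in kernel-cell units."""
    co, ci, k, _ = kernel.shape
    # Identity sampling grid over the kernel space, normalized to [-1, 1].
    base = torch.stack(torch.meshgrid(
        torch.linspace(-1, 1, k), torch.linspace(-1, 1, k),
        indexing='xy'), dim=-1)
    grid = (base + offsets * 2 / max(k - 1, 1)).clamp(-1, 1)
    grid = grid.unsqueeze(0).expand(co, k, k, 2)
    # Treat the weight tensor as a batch of C_in-channel "images" and resample.
    return F.grid_sample(kernel, grid, align_corners=True)

kernel = torch.randn(8, 3, 3, 3)
offsets = torch.zeros(3, 3, 2)     # zero offsets reproduce the rigid kernel
x = torch.randn(1, 3, 32, 32)
y = F.conv2d(x, resample_kernel(kernel, offsets), padding=1)
```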

When AWGN-based Denoiser Meets Real Noises

Title When AWGN-based Denoiser Meets Real Noises
Authors Yuqian Zhou, Jianbo Jiao, Haibin Huang, Yang Wang, Jue Wang, Honghui Shi, Thomas Huang
Abstract Discriminative learning-based image denoisers have achieved promising performance on synthetic noise such as Additive White Gaussian Noise (AWGN). However, the synthetic noise adopted in most previous work is pixel-independent, whereas real noise is mostly spatially/channel-correlated and spatially/channel-variant. This domain gap yields unsatisfactory performance on images with real noise if the model is trained only with AWGN. In this paper, we propose a novel approach to boost the performance of a real image denoiser that is trained only with synthetic pixel-independent noise data dominated by AWGN. First, we train a deep model that consists of a noise estimator and a denoiser with mixed AWGN and Random Value Impulse Noise (RVIN). We then investigate a Pixel-shuffle Down-sampling (PD) strategy to adapt the trained model to real noise. Extensive experiments demonstrate the effectiveness and generalization of the proposed approach. Notably, our method achieves state-of-the-art performance on real sRGB images in the DND benchmark among models trained with synthetic noise. Code is available at https://github.com/yzhouas/PD-Denoising-pytorch.
Tasks Denoising
Published 2019-04-06
URL https://arxiv.org/abs/1904.03485v2
PDF https://arxiv.org/pdf/1904.03485v2.pdf
PWC https://paperswithcode.com/paper/when-awgn-based-denoiser-meets-real-noises
Repo https://github.com/yzhouas/PD-Denoising-pytorch
Framework pytorch
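
The PD strategy itself is mechanical enough to show directly: pixel-unshuffle the noisy image into stride² mosaiced sub-images, in which spatially correlated noise looks closer to the pixel-independent noise the denoiser was trained on, denoise each sub-image, and invert the shuffle. A sketch assuming any image-to-image `denoiser` callable:

```python
import torch
import torch.nn.functional as F

def pd_denoise(x, denoiser, s=2):
    """x: (B, C, H, W) with H, W divisible by s."""
    b, c, h, w = x.shape
    subs = F.pixel_unshuffle(x, s)                  # (B, C*s*s, H/s, W/s)
    subs = subs.reshape(b, c, s * s, h // s, w // s)
    subs = subs.permute(0, 2, 1, 3, 4).reshape(b * s * s, c, h // s, w // s)
    den = denoiser(subs)                            # denoise each sub-image
    den = den.reshape(b, s * s, c, h // s, w // s).permute(0, 2, 1, 3, 4)
    return F.pixel_shuffle(den.reshape(b, c * s * s, h // s, w // s), s)

x = torch.randn(1, 3, 8, 8)
assert torch.allclose(pd_denoise(x, lambda t: t), x)  # identity round-trips
```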

Bayesian Subspace Hidden Markov Model for Acoustic Unit Discovery

Title Bayesian Subspace Hidden Markov Model for Acoustic Unit Discovery
Authors Lucas Ondel, Hari Krishna Vydana, Lukáš Burget, Jan Černocký
Abstract This work tackles the problem of learning a set of language-specific acoustic units from unlabeled speech recordings, given a set of labeled recordings in other languages. Our approach can be described as a two-step procedure: first the model learns the notion of acoustic units from the labeled data, and then it uses this knowledge to find new acoustic units in the target language. We implement this process with the Bayesian Subspace Hidden Markov Model (SHMM), a model akin to the Subspace Gaussian Mixture Model (SGMM) where each low-dimensional embedding represents an acoustic unit rather than just an HMM state. The subspace is trained on 3 languages from the GlobalPhone corpus (German, Polish and Spanish) and the acoustic units are discovered on the TIMIT corpus. Results, measured in equivalent Phone Error Rate, show that this approach significantly outperforms previous HMM-based acoustic unit discovery systems and compares favorably with the Variational Auto-Encoder HMM.
Tasks
Published 2019-04-08
URL https://arxiv.org/abs/1904.03876v2
PDF https://arxiv.org/pdf/1904.03876v2.pdf
PWC https://paperswithcode.com/paper/bayesian-subspace-hidden-markov-model-for
Repo https://github.com/beer-asr/beer
Framework pytorch
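
The subspace idea deserves a toy illustration: every acoustic unit is a low-dimensional embedding, and a basis shared across languages maps it to a full set of HMM parameters, so discovering a unit on a new language means inferring a short vector rather than thousands of parameters. The dimensions below are arbitrary, and the real SHMM is trained with Bayesian inference rather than fixed matrices.

```python
import numpy as np

D_EMBED, N_STATES, D_FEAT = 10, 3, 39     # illustrative sizes

rng = np.random.default_rng(0)
W = rng.normal(size=(N_STATES * D_FEAT, D_EMBED))  # shared subspace basis
b = rng.normal(size=N_STATES * D_FEAT)             # shared bias

def unit_means(h):
    """Map a unit embedding h to per-state Gaussian means."""
    return (W @ h + b).reshape(N_STATES, D_FEAT)

# A new acoustic unit is just a point in the 10-dimensional subspace.
means = unit_means(rng.normal(size=D_EMBED))
```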

Estimating Risk and Uncertainty in Deep Reinforcement Learning

Title Estimating Risk and Uncertainty in Deep Reinforcement Learning
Authors William R. Clements, Benoît-Marie Robaglia, Bastien Van Delft, Reda Bahi Slaoui, Sébastien Toth
Abstract We propose a method for disentangling epistemic and aleatoric uncertainties in deep reinforcement learning. Aleatoric uncertainty, or risk, which arises from inherently stochastic environments or agents, must be accounted for in the design of risk-sensitive algorithms. Epistemic uncertainty, which stems from limited data, is important both for risk-sensitivity and for efficient exploration. Our method combines elements from distributional reinforcement learning and approximate Bayesian inference techniques with neural networks, allowing us to disentangle both types of uncertainty on the expected return of a policy. Specifically, the learned return distribution provides the aleatoric uncertainty, and the Bayesian posterior yields the epistemic uncertainty. Although our approach in principle requires a large number of samples from the Bayesian posterior to estimate the epistemic uncertainty, we show that two networks already yield a useful approximation. We perform experiments that illustrate our method and some applications.
Tasks Bayesian Inference, Distributional Reinforcement Learning, Efficient Exploration
Published 2019-05-23
URL https://arxiv.org/abs/1905.09638v4
PDF https://arxiv.org/pdf/1905.09638v4.pdf
PWC https://paperswithcode.com/paper/estimating-risk-and-uncertainty-in-deep
Repo https://github.com/IndustAI/risk-and-uncertainty
Framework pytorch
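
The decomposition can be sketched in a few lines: each member of a small ensemble predicts return quantiles (the distributional part), the within-network quantile spread stands in for aleatoric uncertainty, and cross-network disagreement approximates the epistemic part. Layer sizes and the variance-based summaries are illustrative choices, not the paper's exact estimators.

```python
import torch
import torch.nn as nn

N_QUANTILES, STATE_DIM = 32, 8

def quantile_net():
    return nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                         nn.Linear(64, N_QUANTILES))

nets = [quantile_net(), quantile_net()]   # two-network posterior approximation

def uncertainties(state):
    qs = torch.stack([net(state) for net in nets])  # (2, N_QUANTILES)
    aleatoric = qs.var(dim=1).mean()   # spread of each return distribution
    epistemic = qs.mean(dim=1).var()   # disagreement between the networks
    return aleatoric, epistemic

a, e = uncertainties(torch.randn(STATE_DIM))
```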

Hyper-Parameter Tuning for the (1+(λ,λ)) GA

Title Hyper-Parameter Tuning for the (1+(λ,λ)) GA
Authors Nguyen Dang, Carola Doerr
Abstract It is known that the $(1+(\lambda,\lambda))$~Genetic Algorithm (GA) with self-adjusting parameter choices achieves a linear expected optimization time on OneMax if its hyper-parameters are suitably chosen. However, it is not very well understood how the hyper-parameter settings influence the overall performance of the $(1+(\lambda,\lambda))$~GA. Analyzing such multi-dimensional dependencies precisely is at the edge of what running time analysis can offer. To take a step forward on this question, we present an in-depth empirical study of the self-adjusting $(1+(\lambda,\lambda))$~GA and its hyper-parameters. We show, among many other results, that a 15% reduction of the average running time is possible with a slightly different setup that allows non-identical offspring population sizes in the mutation and crossover phases, and more flexibility in the choice of mutation rate and crossover bias, a generalization which may be of independent interest. We also find indications that the parametrization of mutation rate and crossover bias derived by theoretical means for the static variant of the $(1+(\lambda,\lambda))$~GA extends to the non-static case.
Tasks
Published 2019-04-09
URL http://arxiv.org/abs/1904.04608v1
PDF http://arxiv.org/pdf/1904.04608v1.pdf
PWC https://paperswithcode.com/paper/hyper-parameter-tuning-for-the-1-ga
Repo https://github.com/ndangtt/1LLGA
Framework none
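
For concreteness, here is a minimal static (1+(λ,λ)) GA on OneMax with the standard coupling p = λ/n and c = 1/λ, the baseline whose hyper-parameters the paper tunes; the self-adjusting variant additionally updates λ after each iteration, which this sketch omits.

```python
import random

def one_ll_ga(n, lam=8, max_evals=10**6):
    """Return the number of evaluations to optimize OneMax (all-ones)."""
    x = [random.randint(0, 1) for _ in range(n)]
    fx, evals = sum(x), 0
    p, c = lam / n, 1 / lam
    while fx < n and evals < max_evals:
        # Mutation phase: sample ell ~ Bin(n, p), create lam ell-bit mutants.
        ell = sum(random.random() < p for _ in range(n))
        mutants = []
        for _ in range(lam):
            y = x[:]
            for i in random.sample(range(n), ell):
                y[i] ^= 1
            mutants.append((sum(y), y))
            evals += 1
        fy, y = max(mutants)
        # Crossover phase: take each bit from the best mutant with prob c.
        best_fz, best_z = fy, y
        for _ in range(lam):
            z = [yi if random.random() < c else xi for xi, yi in zip(x, y)]
            evals += 1
            if sum(z) > best_fz:
                best_fz, best_z = sum(z), z
        if best_fz >= fx:
            x, fx = best_z, best_fz
    return evals
```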

At Stability’s Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?

Title At Stability’s Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?
Authors Niv Giladi, Mor Shpigel Nacson, Elad Hoffer, Daniel Soudry
Abstract Background: Recent developments have made it possible to accelerate neural network training significantly using large batch sizes and data parallelism. Training in an asynchronous fashion, where delay occurs, can make training even more scalable. However, asynchronous training has its pitfalls, chiefly a degradation in generalization even after convergence of the algorithm. This gap is not yet well understood, as theoretical analysis has so far mainly focused on the convergence rate of asynchronous methods. Contributions: We examine asynchronous training from the perspective of dynamical stability. We find that the degree of delay interacts with the learning rate to change the set of minima accessible by an asynchronous stochastic gradient descent algorithm. We derive closed-form rules for how the learning rate can be changed while keeping the accessible set the same. Specifically, for high delay values, we find that the learning rate should be kept inversely proportional to the delay. We then extend this analysis to include momentum, and find that momentum should either be turned off or modified to improve training stability. We provide empirical experiments to validate our theoretical findings.
Tasks
Published 2019-09-26
URL https://arxiv.org/abs/1909.12340v2
PDF https://arxiv.org/pdf/1909.12340v2.pdf
PWC https://paperswithcode.com/paper/at-stabilitys-edge-how-to-adjust-1
Repo https://github.com/paper-submissions/delay_stability
Framework pytorch
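
The headline rule is simple enough to state as code. This is a hedged reading of the abstract, applying the inverse proportionality uniformly for simplicity:

```python
def adjust_for_delay(lr0, momentum0, delay):
    """Keep the set of accessible (stable) minima roughly unchanged under
    asynchrony: scale the learning rate inversely with the delay, and turn
    momentum off rather than keep its synchronous value."""
    lr = lr0 / max(1, delay)
    momentum = 0.0 if delay > 0 else momentum0
    return lr, momentum
```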

SAFE ML: Surrogate Assisted Feature Extraction for Model Learning

Title SAFE ML: Surrogate Assisted Feature Extraction for Model Learning
Authors Alicja Gosiewska, Aleksandra Gacek, Piotr Lubon, Przemyslaw Biecek
Abstract Complex black-box predictive models may have high accuracy, but their opacity causes problems such as lack of trust, lack of stability, and sensitivity to concept drift. On the other hand, interpretable models require more work related to feature engineering, which is very time-consuming. Can we train interpretable yet accurate models without such time-consuming feature engineering? In this article, we show a method that uses elastic black boxes as surrogate models to create simpler, less opaque, yet still accurate and interpretable glass-box models. The new models are created on newly engineered features extracted/learned with the help of a surrogate model. We show applications of this method for model-level explanations and possible extensions for instance-level explanations. We also present an example implementation in Python and benchmark this method on a number of tabular datasets.
Tasks Feature Engineering
Published 2019-02-28
URL http://arxiv.org/abs/1902.11035v1
PDF http://arxiv.org/pdf/1902.11035v1.pdf
PWC https://paperswithcode.com/paper/safe-ml-surrogate-assisted-feature-extraction
Repo https://github.com/ModelOriented/rSAFE
Framework none
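
A toy end-to-end sketch of the pipeline: fit an elastic black box, sweep one feature to obtain a partial-dependence-style response curve, cut where the response jumps most, and train a glass-box model on the binarized feature. The single-cut heuristic and the scikit-learn models are simplifying assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.where(X[:, 0] > 0.5, 2.0, -1.0) + rng.normal(0, 0.1, 500)

surrogate = GradientBoostingRegressor().fit(X, y)        # elastic black box
grid = np.linspace(X.min(), X.max(), 200).reshape(-1, 1)
curve = surrogate.predict(grid)                          # response sweep
cut = grid[np.argmax(np.abs(np.diff(curve)))][0]         # biggest jump

X_new = (X[:, 0] > cut).astype(float).reshape(-1, 1)     # engineered feature
glass_box = LinearRegression().fit(X_new, y)             # interpretable model
print(f"cut at {cut:.2f}, R^2 = {glass_box.score(X_new, y):.3f}")
```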

Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders

Title Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders
Authors Emile Mathieu, Charline Le Lan, Chris J. Maddison, Ryota Tomioka, Yee Whye Teh
Abstract The variational auto-encoder (VAE) is a popular method for learning a generative model and embeddings of the data. Many real datasets are hierarchically structured. However, traditional VAEs map data into a Euclidean latent space, which cannot efficiently embed tree-like structures; hyperbolic spaces with negative curvature can. We therefore endow VAEs with a Poincaré ball model of hyperbolic geometry as a latent space and rigorously derive the necessary methods to work with two main Gaussian generalisations on that space. We empirically show better generalisation to unseen data than the Euclidean counterpart, and that our model can qualitatively and quantitatively better recover hierarchical structures.
Tasks
Published 2019-01-17
URL https://arxiv.org/abs/1901.06033v3
PDF https://arxiv.org/pdf/1901.06033v3.pdf
PWC https://paperswithcode.com/paper/hierarchical-representations-with-poincare
Repo https://github.com/omiethescientist/HyperbolicDeepLearning
Framework pytorch
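
One concrete piece of machinery such a model needs is the exponential map that carries the encoder's Euclidean output onto the Poincaré ball. A minimal sketch of the map at the origin, for curvature -c:

```python
import torch

def exp_map_zero(v, c=1.0):
    """exp_0(v) = tanh(sqrt(c)*||v||) * v / (sqrt(c)*||v||)."""
    norm = v.norm(dim=-1, keepdim=True).clamp_min(1e-9)
    return torch.tanh(c ** 0.5 * norm) * v / (c ** 0.5 * norm)

z = exp_map_zero(torch.randn(4, 2))
assert (z.norm(dim=-1) < 1).all()   # points land inside the unit ball
```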

VarGNet: Variable Group Convolutional Neural Network for Efficient Embedded Computing

Title VarGNet: Variable Group Convolutional Neural Network for Efficient Embedded Computing
Authors Qian Zhang, Jianjun Li, Meng Yao, Liangchen Song, Helong Zhou, Zhichao Li, Wenming Meng, Xuezhi Zhang, Guoli Wang
Abstract In this paper, we propose a novel network design mechanism for efficient embedded computing. Inspired by the limited computing patterns supported by embedded hardware, we propose to fix the number of channels in a group convolution, instead of the existing practice of fixing the total number of groups. The resulting network, named Variable Group Convolutional Network (VarGNet), is easier to optimize on the hardware side due to the more unified computing schemes among the layers. Extensive experiments on various vision tasks, including classification, detection, pixel-wise parsing and face recognition, have demonstrated the practical value of our VarGNet.
Tasks Face Recognition
Published 2019-07-12
URL https://arxiv.org/abs/1907.05653v1
PDF https://arxiv.org/pdf/1907.05653v1.pdf
PWC https://paperswithcode.com/paper/vargnet-variable-group-convolutional-neural
Repo https://github.com/zma-c-137/VarGFaceNet
Framework mxnet
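
The core design change fits in a few lines of PyTorch: derive the group count from a fixed per-group channel width instead of fixing the number of groups. The width of 8 below is an arbitrary illustration, not the paper's setting.

```python
import torch.nn as nn

def var_group_conv(in_ch, out_ch, k=3, channels_per_group=8):
    groups = in_ch // channels_per_group
    assert in_ch % channels_per_group == 0 and out_ch % groups == 0
    return nn.Conv2d(in_ch, out_ch, k, padding=k // 2, groups=groups)

layer = var_group_conv(64, 128)   # 8 groups, each with 8 input channels
```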