Paper Group AWR 101
FreeLabel: A Publicly Available Annotation Tool based on Freehand Traces
Title | FreeLabel: A Publicly Available Annotation Tool based on Freehand Traces |
Authors | Philipe A. Dias, Zhou Shen, Amy Tabb, Henry Medeiros |
Abstract | Large-scale annotation of image segmentation datasets is often prohibitively expensive, as it usually requires a huge number of worker hours to obtain high-quality results. Abundant and reliable data has, however, been crucial for the advances in image understanding tasks achieved by deep learning models. In this paper, we introduce FreeLabel, an intuitive open-source web interface that allows users to obtain high-quality segmentation masks with just a few freehand scribbles, in a matter of seconds. The efficacy of FreeLabel is quantitatively demonstrated by experimental results on the PASCAL dataset as well as on a dataset from the agricultural domain. Designed to benefit the computer vision community, FreeLabel can be used for either crowdsourced or private annotation and has a modular structure that can be easily adapted for any image dataset. |
Tasks | Semantic Segmentation |
Published | 2019-02-18 |
URL | http://arxiv.org/abs/1902.06806v2 |
http://arxiv.org/pdf/1902.06806v2.pdf | |
PWC | https://paperswithcode.com/paper/freelabel-a-publicly-available-annotation |
Repo | https://github.com/philadias/freelabel |
Framework | none |
Interactive-Predictive Neural Machine Translation through Reinforcement and Imitation
Title | Interactive-Predictive Neural Machine Translation through Reinforcement and Imitation |
Authors | Tsz Kin Lam, Shigehiko Schamoni, Stefan Riezler |
Abstract | We propose an interactive-predictive neural machine translation framework for easier model personalization using reinforcement and imitation learning. During the interactive translation process, the user is asked for feedback on uncertain locations identified by the system. Responses are weak feedback in the form of “keep” and “delete” edits, and expert demonstrations in the form of “substitute” edits. Conditioning on the collected feedback, the system creates alternative translations via constrained beam search. In simulation experiments on two language pairs our systems get close to the performance of supervised training with much less human effort. |
Tasks | Imitation Learning, Machine Translation |
Published | 2019-07-04 |
URL | https://arxiv.org/abs/1907.02326v2 |
https://arxiv.org/pdf/1907.02326v2.pdf | |
PWC | https://paperswithcode.com/paper/interactive-predictive-neural-machine |
Repo | https://github.com/heidelkin/IPNMT_RL_IL |
Framework | none |
Spatially and Temporally Efficient Non-local Attention Network for Video-based Person Re-Identification
Title | Spatially and Temporally Efficient Non-local Attention Network for Video-based Person Re-Identification |
Authors | Chih-Ting Liu, Chih-Wei Wu, Yu-Chiang Frank Wang, Shao-Yi Chien |
Abstract | Video-based person re-identification (Re-ID) aims at matching video sequences of pedestrians across non-overlapping cameras. Embedding the spatial and temporal information of a video into its feature representation is a practical yet challenging task. While most existing methods learn the video characteristics by aggregating image-wise features and designing attention mechanisms in neural networks, they only explore the correlation between frames at high-level features. In this work, we aim to refine the intermediate features as well as the high-level features with non-local attention operations, and we make two contributions. (i) We propose a Non-local Video Attention Network (NVAN) to incorporate video characteristics into the representation at multiple feature levels. (ii) We further introduce a Spatially and Temporally Efficient Non-local Video Attention Network (STE-NVAN) to reduce the computational complexity by exploiting the spatial and temporal redundancy present in pedestrian videos. Extensive experiments show that our NVAN outperforms state-of-the-art methods by 3.8% in rank-1 accuracy on the MARS dataset and confirm that our STE-NVAN has a far superior computational footprint compared to existing methods. |
Tasks | Person Re-Identification, Video-Based Person Re-Identification |
Published | 2019-08-05 |
URL | https://arxiv.org/abs/1908.01683v1 |
https://arxiv.org/pdf/1908.01683v1.pdf | |
PWC | https://paperswithcode.com/paper/spatially-and-temporally-efficient-non-local |
Repo | https://github.com/jackie840129/STE-NVAN |
Framework | pytorch |
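The non-local attention operation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes identity embeddings (the learned projections of a real non-local block are omitted), treats every spatial-temporal position as one feature vector in a flat list, and uses a hypothetical function name. It is not the NVAN or STE-NVAN implementation from the linked repository.

```python
import math


def non_local_attention(features):
    """Attend from every position to every other position, then add a residual.

    `features` is a list of equal-length feature vectors, one per
    spatial-temporal position. Identity embeddings are an assumption made
    for brevity; a real block would apply learned projections first.
    """
    outputs = []
    for query in features:
        # Pairwise similarity between this position and all positions.
        scores = [sum(q * k for q, k in zip(query, key)) for key in features]
        # Numerically stable softmax over all positions.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of all feature vectors.
        aggregated = [
            sum(w * value[d] for w, value in zip(weights, features))
            for d in range(len(query))
        ]
        # Residual connection keeps the original feature.
        outputs.append([x + a for x, a in zip(query, aggregated)])
    return outputs
```

The cost grows quadratically with the number of positions, which is the kind of overhead the abstract says STE-NVAN reduces by exploiting spatial and temporal redundancy.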
A neural network oracle for quantum nonlocality problems in networks
Title | A neural network oracle for quantum nonlocality problems in networks |
Authors | Tamás Kriváchy, Yu Cai, Daniel Cavalcanti, Arash Tavakoli, Nicolas Gisin, Nicolas Brunner |
Abstract | Characterizing quantum nonlocality in networks is a challenging, but important problem. Using quantum sources one can achieve distributions which are unattainable classically. A key point in investigations is to decide whether an observed probability distribution can be reproduced using only classical resources. This causal inference task is challenging even for simple networks, both analytically and using standard numerical techniques. We propose to use neural networks as numerical tools to overcome these challenges, by learning the classical strategies required to reproduce a distribution. As such, the neural network acts as an oracle, demonstrating that a behavior is classical if it can be learned. We apply our method to several examples in the triangle configuration. After demonstrating that the method is consistent with previously known results, we give solid evidence that the distribution presented in [N. Gisin, Entropy 21(3), 325 (2019)] is indeed nonlocal as conjectured. Finally we examine the genuinely nonlocal distribution presented in [M.-O. Renou et al., PRL 123, 140401 (2019)], and, guided by the findings of the neural network, conjecture nonlocality in a new range of parameters in these distributions. The method allows us to get an estimate on the noise robustness of all examined distributions. |
Tasks | Causal Inference |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10552v2 |
https://arxiv.org/pdf/1907.10552v2.pdf | |
PWC | https://paperswithcode.com/paper/a-neural-network-oracle-for-quantum |
Repo | https://github.com/tkrivachy/neural-network-for-nonlocality-in-networks |
Framework | none |
ROI Pooled Correlation Filters for Visual Tracking
Title | ROI Pooled Correlation Filters for Visual Tracking |
Authors | Yuxuan Sun, Chong Sun, Dong Wang, You He, Huchuan Lu |
Abstract | The ROI (region-of-interest) based pooling method performs pooling operations on the cropped ROI regions for various samples and has shown great success in object detection methods. It compresses the model size while preserving the localization accuracy, and is thus useful in the visual tracking field. Though effective, the ROI-based pooling operation has not yet been considered in the correlation filter formula. In this paper, we propose a novel ROI pooled correlation filter (RPCF) algorithm for robust visual tracking. Through mathematical derivations, we show that the ROI-based pooling can be equivalently achieved by enforcing additional constraints on the learned filter weights, which makes the ROI-based pooling feasible on the virtual circular samples. In addition, we develop an efficient joint training formula for the proposed correlation filter algorithm, and derive the Fourier solvers for efficient model training. Finally, we evaluate our RPCF tracker on the OTB-2013, OTB-2015 and VOT-2017 benchmark datasets. Experimental results show that our tracker performs favourably against other state-of-the-art trackers. |
Tasks | Object Detection, Visual Tracking |
Published | 2019-11-05 |
URL | https://arxiv.org/abs/1911.01668v1 |
https://arxiv.org/pdf/1911.01668v1.pdf | |
PWC | https://paperswithcode.com/paper/roi-pooled-correlation-filters-for-visual-1 |
Repo | https://github.com/rumsyx/RPCF |
Framework | none |
Autonomous Learning for Face Recognition in the Wild via Ambient Wireless Cues
Title | Autonomous Learning for Face Recognition in the Wild via Ambient Wireless Cues |
Authors | Chris Xiaoxuan Lu, Xuan Kan, Bowen Du, Changhao Chen, Hongkai Wen, Andrew Markham, Niki Trigoni, John Stankovic |
Abstract | Facial recognition is a key enabling component for emerging Internet of Things (IoT) services such as smart homes or responsive offices. Through the use of deep neural networks, facial recognition has achieved excellent performance. However, this is only possible when the networks are trained with hundreds of images of each user in different viewing and lighting conditions. Clearly, this level of effort in enrolment and labelling is impossible for wide-spread deployment and adoption. Inspired by the fact that most people carry smart wireless devices with them, e.g. smartphones, we propose to use this wireless identifier as a supervisory label. This allows us to curate a dataset of facial images that are unique to a certain domain, e.g. a set of people in a particular office. This custom corpus can then be used to finetune existing pre-trained models, e.g. FaceNet. However, due to the vagaries of wireless propagation in buildings, the supervisory labels are noisy and weak. We propose a novel technique, AutoTune, which learns and refines the association between a face and wireless identifier over time, by increasing the inter-cluster separation and minimizing the intra-cluster distance. Through extensive experiments with multiple users on two sites, we demonstrate the ability of AutoTune to design an environment-specific, continually evolving facial recognition system with no user effort at all. |
Tasks | Face Recognition |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.09002v1 |
https://arxiv.org/pdf/1908.09002v1.pdf | |
PWC | https://paperswithcode.com/paper/autonomous-learning-for-face-recognition-in |
Repo | https://github.com/Wayfear/Autotune |
Framework | tf |
Deformable Kernels: Adapting Effective Receptive Fields for Object Deformation
Title | Deformable Kernels: Adapting Effective Receptive Fields for Object Deformation |
Authors | Hang Gao, Xizhou Zhu, Steve Lin, Jifeng Dai |
Abstract | Convolutional networks are not aware of an object’s geometric variations, which leads to inefficient utilization of model and data capacity. To overcome this issue, recent works on deformation modeling seek to spatially reconfigure the data towards a common arrangement such that semantic recognition suffers less from deformation. This is typically done by augmenting static operators with learned free-form sampling grids in the image space, dynamically tuned to the data and task for adapting the receptive field. Yet adapting the receptive field does not quite reach the actual goal – what really matters to the network is the “effective” receptive field (ERF), which reflects how much each pixel contributes. It is thus natural to design other approaches to adapt the ERF directly during runtime. In this work, we instantiate one possible solution as Deformable Kernels (DKs), a family of novel and generic convolutional operators for handling object deformations by directly adapting the ERF while leaving the receptive field untouched. At the heart of our method is the ability to resample the original kernel space towards recovering the deformation of objects. This approach is justified with theoretical insights that the ERF is strictly determined by data sampling locations and kernel values. We implement DKs as generic drop-in replacements of rigid kernels and conduct a series of empirical studies whose results conform with our theories. Over several tasks and standard base models, our approach compares favorably against prior works that adapt during runtime. In addition, further experiments suggest a working mechanism orthogonal and complementary to previous works. |
Tasks | Image Classification, Object Detection |
Published | 2019-10-07 |
URL | https://arxiv.org/abs/1910.02940v2 |
https://arxiv.org/pdf/1910.02940v2.pdf | |
PWC | https://paperswithcode.com/paper/deformable-kernels-adapting-effective |
Repo | https://github.com/hangg7/deformable-kernels |
Framework | pytorch |
When AWGN-based Denoiser Meets Real Noises
Title | When AWGN-based Denoiser Meets Real Noises |
Authors | Yuqian Zhou, Jianbo Jiao, Haibin Huang, Yang Wang, Jue Wang, Honghui Shi, Thomas Huang |
Abstract | Discriminative learning-based image denoisers have achieved promising performance on synthetic noises such as Additive White Gaussian Noise (AWGN). The synthetic noises adopted in most previous work are pixel-independent, but real noises are mostly spatially/channel-correlated and spatially/channel-variant. This domain gap yields unsatisfactory performance on images with real noises if the model is only trained with AWGN. In this paper, we propose a novel approach to boost the performance of a real image denoiser which is trained only with synthetic pixel-independent noise data dominated by AWGN. First, we train a deep model that consists of a noise estimator and a denoiser with mixed AWGN and Random Value Impulse Noise (RVIN). We then investigate a Pixel-shuffle Down-sampling (PD) strategy to adapt the trained model to real noises. Extensive experiments demonstrate the effectiveness and generalization of the proposed approach. Notably, our method achieves state-of-the-art performance on real sRGB images in the DND benchmark among models trained with synthetic noises. Code is available at https://github.com/yzhouas/PD-Denoising-pytorch. |
Tasks | Denoising |
Published | 2019-04-06 |
URL | https://arxiv.org/abs/1904.03485v2 |
https://arxiv.org/pdf/1904.03485v2.pdf | |
PWC | https://paperswithcode.com/paper/when-awgn-based-denoiser-meets-real-noises |
Repo | https://github.com/yzhouas/PD-Denoising-pytorch |
Framework | pytorch |
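The pixel-shuffle down-sampling step named in the abstract above can be sketched as a plain rearrangement of pixels. This is a minimal illustration on a single-channel image stored as nested lists. The function names and the dictionary layout are assumptions for this sketch, and the rest of the paper's PD pipeline is not reproduced.

```python
def pd_downsample(image, stride):
    """Split an image into stride*stride sub-images by sampling every stride-th pixel.

    Neighbouring pixels of the input end up in different sub-images, so
    noise that is correlated between adjacent pixels is spread apart.
    Returns a dict keyed by the (row_offset, col_offset) of each sub-image.
    """
    height, width = len(image), len(image[0])
    if height % stride or width % stride:
        raise ValueError("image size must be divisible by the stride")
    return {
        (a, b): [row[b::stride] for row in image[a::stride]]
        for a in range(stride)
        for b in range(stride)
    }


def pd_upsample(subimages, stride):
    """Inverse of pd_downsample: interleave the sub-images back into one image."""
    sub_h = len(subimages[(0, 0)])
    sub_w = len(subimages[(0, 0)][0])
    image = [[None] * (sub_w * stride) for _ in range(sub_h * stride)]
    for (a, b), sub in subimages.items():
        for i, row in enumerate(sub):
            for j, value in enumerate(row):
                image[i * stride + a][j * stride + b] = value
    return image
```

Because the rearrangement is lossless, a denoiser could in principle be applied to each sub-image before `pd_upsample` restores the original layout.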
Bayesian Subspace Hidden Markov Model for Acoustic Unit Discovery
Title | Bayesian Subspace Hidden Markov Model for Acoustic Unit Discovery |
Authors | Lucas Ondel, Hari Krishna Vydana, Lukáš Burget, Jan Černocký |
Abstract | This work tackles the problem of learning a set of language-specific acoustic units from unlabeled speech recordings given a set of labeled recordings from other languages. Our approach may be described by the following two-step procedure: first the model learns the notion of acoustic units from the labeled data, and then the model uses its knowledge to find new acoustic units in the target language. We implement this process with the Bayesian Subspace Hidden Markov Model (SHMM), a model akin to the Subspace Gaussian Mixture Model (SGMM) where each low-dimensional embedding represents an acoustic unit rather than just an HMM’s state. The subspace is trained on 3 languages from the GlobalPhone corpus (German, Polish and Spanish) and the acoustic units (AUs) are discovered on the TIMIT corpus. Results, measured in equivalent Phone Error Rate, show that this approach significantly outperforms previous HMM-based acoustic unit discovery systems and compares favorably with the Variational Auto Encoder-HMM. |
Tasks | |
Published | 2019-04-08 |
URL | https://arxiv.org/abs/1904.03876v2 |
https://arxiv.org/pdf/1904.03876v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-subspace-hidden-markov-model-for |
Repo | https://github.com/beer-asr/beer |
Framework | pytorch |
Estimating Risk and Uncertainty in Deep Reinforcement Learning
Title | Estimating Risk and Uncertainty in Deep Reinforcement Learning |
Authors | William R. Clements, Benoît-Marie Robaglia, Bastien Van Delft, Reda Bahi Slaoui, Sébastien Toth |
Abstract | We propose a method for disentangling epistemic and aleatoric uncertainties in deep reinforcement learning. Aleatoric uncertainty, or risk, which arises from inherently stochastic environments or agents, must be accounted for in the design of risk-sensitive algorithms. Epistemic uncertainty, which stems from limited data, is important both for risk-sensitivity and for efficient exploration. Our method combines elements from distributional reinforcement learning and approximate Bayesian inference techniques with neural networks, allowing us to disentangle both types of uncertainty on the expected return of a policy. Specifically, the learned return distribution provides the aleatoric uncertainty, and the Bayesian posterior yields the epistemic uncertainty. Although our approach in principle requires a large number of samples from the Bayesian posterior to estimate the epistemic uncertainty, we show that two networks already yield a useful approximation. We perform experiments that illustrate our method and some applications. |
Tasks | Bayesian Inference, Distributional Reinforcement Learning, Efficient Exploration |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09638v4 |
https://arxiv.org/pdf/1905.09638v4.pdf | |
PWC | https://paperswithcode.com/paper/estimating-risk-and-uncertainty-in-deep |
Repo | https://github.com/IndustAI/risk-and-uncertainty |
Framework | pytorch |
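The split described in the abstract above, where the learned return distribution gives the aleatoric part and disagreement between posterior samples gives the epistemic part, can be illustrated with a variance decomposition. This is a minimal sketch assuming each network outputs a list of equally weighted return quantiles for one state-action pair. The function name and the use of variance as the uncertainty measure are assumptions for illustration, not the paper's exact estimator.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(quantiles_per_network):
    """Split return uncertainty into aleatoric and epistemic parts.

    `quantiles_per_network` holds one list of predicted return quantiles
    per network (each network plays the role of one posterior sample).
    Aleatoric: average spread of each network's return distribution.
    Epistemic: spread of the expected returns across networks.
    """
    expected_returns = [fmean(q) for q in quantiles_per_network]
    aleatoric = fmean([pvariance(q) for q in quantiles_per_network])
    epistemic = pvariance(expected_returns)
    return aleatoric, epistemic
```

With two networks, as the abstract suggests can already be useful, the epistemic term reduces to a quarter of the squared difference between the two expected returns.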
Hyper-Parameter Tuning for the (1+(λ,λ)) GA
Title | Hyper-Parameter Tuning for the (1+(λ,λ)) GA |
Authors | Nguyen Dang, Carola Doerr |
Abstract | It is known that the $(1+(\lambda,\lambda))$ Genetic Algorithm (GA) with self-adjusting parameter choices achieves a linear expected optimization time on OneMax if its hyper-parameters are suitably chosen. However, it is not very well understood how the hyper-parameter settings influence the overall performance of the $(1+(\lambda,\lambda))$ GA. Analyzing such multi-dimensional dependencies precisely is at the edge of what running time analysis can offer. To make a step forward on this question, we present an in-depth empirical study of the self-adjusting $(1+(\lambda,\lambda))$ GA and its hyper-parameters. We show, among many other results, that a 15% reduction of the average running time is possible by a slightly different setup, which allows non-identical offspring population sizes in the mutation and crossover phases, and more flexibility in the choice of mutation rate and crossover bias, a generalization which may be of independent interest. We also find indications that the parametrization of mutation rate and crossover bias derived by theoretical means for the static variant of the $(1+(\lambda,\lambda))$ GA extends to the non-static case. |
Tasks | |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04608v1 |
http://arxiv.org/pdf/1904.04608v1.pdf | |
PWC | https://paperswithcode.com/paper/hyper-parameter-tuning-for-the-1-ga |
Repo | https://github.com/ndangtt/1LLGA |
Framework | none |
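For readers unfamiliar with the algorithm studied in the abstract above, the following is a minimal sketch of a self-adjusting $(1+(\lambda,\lambda))$ GA on OneMax. It assumes the commonly used coupling of mutation rate $\lambda/n$ and crossover bias $1/\lambda$, identical offspring population sizes in both phases, and a one-fifth-rule style update with factor 1.5. These settings and all names are assumptions for illustration, and the generalized setup the paper tunes is not reproduced.

```python
import random


def onemax(bits):
    """Fitness: number of one-bits."""
    return sum(bits)


def one_plus_lambda_lambda_ga(n, rng, update_strength=1.5, max_iterations=100_000):
    """Return (final fitness, number of fitness evaluations) on OneMax of size n."""
    parent = [rng.randint(0, 1) for _ in range(n)]
    parent_fitness = onemax(parent)
    lam = 1.0
    evaluations = 1
    for _ in range(max_iterations):
        if parent_fitness == n:
            break
        k = max(1, round(lam))       # offspring population size
        mutation_rate = k / n        # assumed coupling p = lambda / n
        crossover_bias = 1 / k       # assumed coupling c = 1 / lambda

        # Mutation phase: all mutants flip the same number of random bits.
        flips = sum(rng.random() < mutation_rate for _ in range(n))
        best_mutant, best_mutant_fitness = None, -1
        for _ in range(k):
            mutant = parent[:]
            for i in rng.sample(range(n), flips):
                mutant[i] = 1 - mutant[i]
            fitness = onemax(mutant)
            if fitness > best_mutant_fitness:
                best_mutant, best_mutant_fitness = mutant, fitness
        evaluations += k

        # Crossover phase: take each bit from the best mutant with
        # probability crossover_bias, otherwise from the parent.
        best_child, best_child_fitness = None, -1
        for _ in range(k):
            child = [
                m if rng.random() < crossover_bias else p
                for p, m in zip(parent, best_mutant)
            ]
            fitness = onemax(child)
            if fitness > best_child_fitness:
                best_child, best_child_fitness = child, fitness
        evaluations += k

        # Self-adjustment: shrink lambda after an improvement, grow it otherwise.
        if best_child_fitness > parent_fitness:
            lam = max(lam / update_strength, 1.0)
        else:
            lam = min(lam * update_strength ** 0.25, float(n))

        # Elitist selection: accept the child if it is at least as good.
        if best_child_fitness >= parent_fitness:
            parent, parent_fitness = best_child, best_child_fitness
    return parent_fitness, evaluations
```

A call such as `one_plus_lambda_lambda_ga(50, random.Random(0))` runs one seeded trial. The hyper-parameters examined in the paper correspond here to `update_strength`, the exponent `0.25`, and the two couplings marked as assumed.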
At Stability’s Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?
Title | At Stability’s Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks? |
Authors | Niv Giladi, Mor Shpigel Nacson, Elad Hoffer, Daniel Soudry |
Abstract | Background: Recent developments have made it possible to accelerate neural network training significantly using large batch sizes and data parallelism. Training in an asynchronous fashion, where delay occurs, can make training even more scalable. However, asynchronous training has its pitfalls, mainly a degradation in generalization, even after convergence of the algorithm. This gap is still not well understood, as theoretical analysis has so far focused mainly on the convergence rate of asynchronous methods. Contributions: We examine asynchronous training from the perspective of dynamical stability. We find that the degree of delay interacts with the learning rate to change the set of minima accessible by an asynchronous stochastic gradient descent algorithm. We derive closed-form rules on how the learning rate could be changed while keeping the accessible set the same. Specifically, for high delay values, we find that the learning rate should be kept inversely proportional to the delay. We then extend this analysis to include momentum. We find that momentum should be either turned off or modified to improve training stability. We provide empirical experiments to validate our theoretical findings. |
Tasks | |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12340v2 |
https://arxiv.org/pdf/1909.12340v2.pdf | |
PWC | https://paperswithcode.com/paper/at-stabilitys-edge-how-to-adjust-1 |
Repo | https://github.com/paper-submissions/delay_stability |
Framework | pytorch |
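The high-delay rule stated in the abstract above, that the learning rate should be kept inversely proportional to the delay, can be written as a one-line scaling. This sketch assumes a reference configuration (a learning rate known to work at some reference delay) from which other delays are scaled. The function and parameter names are hypothetical, and the paper's exact closed-form rules, including the momentum case, are not reproduced.

```python
def delay_scaled_learning_rate(reference_lr, reference_delay, delay):
    """Scale the learning rate so that learning_rate * delay stays constant.

    Only meant for the high-delay regime described in the abstract; the
    reference pair (reference_lr, reference_delay) is an assumption of
    this sketch.
    """
    if reference_delay <= 0 or delay <= 0:
        raise ValueError("delays must be positive")
    return reference_lr * reference_delay / delay
```

For example, doubling the delay from 4 to 8 halves a learning rate of 0.1 to 0.05.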
SAFE ML: Surrogate Assisted Feature Extraction for Model Learning
Title | SAFE ML: Surrogate Assisted Feature Extraction for Model Learning |
Authors | Alicja Gosiewska, Aleksandra Gacek, Piotr Lubon, Przemyslaw Biecek |
Abstract | Complex black-box predictive models may have high accuracy, but their opacity causes problems such as lack of trust, lack of stability, and sensitivity to concept drift. On the other hand, interpretable models require more work related to feature engineering, which is very time consuming. Can we train interpretable and accurate models without time-consuming feature engineering? In this article, we show a method that uses elastic black-boxes as surrogate models to create simpler, less opaque, yet still accurate and interpretable glass-box models. New models are created on newly engineered features extracted/learned with the help of a surrogate model. We show applications of this method for model-level explanations and possible extensions for instance-level explanations. We also present an example implementation in Python and benchmark this method on a number of tabular data sets. |
Tasks | Feature Engineering |
Published | 2019-02-28 |
URL | http://arxiv.org/abs/1902.11035v1 |
http://arxiv.org/pdf/1902.11035v1.pdf | |
PWC | https://paperswithcode.com/paper/safe-ml-surrogate-assisted-feature-extraction |
Repo | https://github.com/ModelOriented/rSAFE |
Framework | none |
Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders
Title | Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders |
Authors | Emile Mathieu, Charline Le Lan, Chris J. Maddison, Ryota Tomioka, Yee Whye Teh |
Abstract | The variational auto-encoder (VAE) is a popular method for learning a generative model and embeddings of the data. Many real datasets are hierarchically structured. However, traditional VAEs map data into a Euclidean latent space, which cannot efficiently embed tree-like structures. Hyperbolic spaces with negative curvature can. We therefore endow VAEs with a Poincaré ball model of hyperbolic geometry as a latent space and rigorously derive the necessary methods to work with two main Gaussian generalisations on that space. We empirically show better generalisation to unseen data than the Euclidean counterpart, and we can recover hierarchical structures better, both qualitatively and quantitatively. |
Tasks | |
Published | 2019-01-17 |
URL | https://arxiv.org/abs/1901.06033v3 |
https://arxiv.org/pdf/1901.06033v3.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-representations-with-poincare |
Repo | https://github.com/omiethescientist/HyperbolicDeepLearning |
Framework | pytorch |
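To give a feel for the latent geometry named in the abstract above, here is a minimal sketch of the geodesic distance in the Poincaré ball. It assumes the unit ball with curvature -1 and points given as plain lists of coordinates. The function name is an assumption, and nothing here reproduces the paper's Gaussian generalisations or its handling of other curvatures.

```python
import math


def poincare_distance(x, y):
    """Geodesic distance between two points inside the unit Poincaré ball.

    Uses d(x, y) = arcosh(1 + 2 * |x - y|^2 / ((1 - |x|^2) * (1 - |y|^2))),
    which holds for curvature -1 (an assumption of this sketch).
    """
    sq_norm_x = sum(v * v for v in x)
    sq_norm_y = sum(v * v for v in y)
    if sq_norm_x >= 1.0 or sq_norm_y >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.acosh(1.0 + 2.0 * sq_dist / ((1.0 - sq_norm_x) * (1.0 - sq_norm_y)))
```

Distances blow up near the boundary of the ball, which is what leaves room for the exponentially growing number of nodes in a tree-like hierarchy.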
VarGNet: Variable Group Convolutional Neural Network for Efficient Embedded Computing
Title | VarGNet: Variable Group Convolutional Neural Network for Efficient Embedded Computing |
Authors | Qian Zhang, Jianjun Li, Meng Yao, Liangchen Song, Helong Zhou, Zhichao Li, Wenming Meng, Xuezhi Zhang, Guoli Wang |
Abstract | In this paper, we propose a novel network design mechanism for efficient embedded computing. Inspired by the limited computing patterns, we propose to fix the number of channels in a group convolution, instead of following the existing practice of fixing the total number of groups. Our resulting network, named Variable Group Convolutional Network (VarGNet), can be optimized more easily on the hardware side, due to the more unified computing schemes among the layers. Extensive experiments on various vision tasks, including classification, detection, pixel-wise parsing and face recognition, have demonstrated the practical value of our VarGNet. |
Tasks | Face Recognition |
Published | 2019-07-12 |
URL | https://arxiv.org/abs/1907.05653v1 |
https://arxiv.org/pdf/1907.05653v1.pdf | |
PWC | https://paperswithcode.com/paper/vargnet-variable-group-convolutional-neural |
Repo | https://github.com/zma-c-137/VarGFaceNet |
Framework | mxnet |
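The design choice in the abstract above, fixing the number of channels per group rather than the number of groups, can be made concrete with a little arithmetic. This is a minimal sketch with hypothetical helper names and an assumed group size of 8. It only counts groups and weights for a standard grouped convolution and does not reproduce the VarGNet architecture.

```python
def groups_for_fixed_group_size(channels, channels_per_group):
    """Number of groups when every group holds the same fixed number of channels."""
    if channels % channels_per_group:
        raise ValueError("channels must be divisible by channels_per_group")
    return channels // channels_per_group


def grouped_conv_weight_count(in_channels, out_channels, kernel_size, groups):
    """Weights in a grouped convolution: each output channel sees in_channels / groups inputs."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (in_channels // groups) * out_channels * kernel_size * kernel_size


# With 8 channels per group (an assumed value), the group count varies by layer
# while every group sees the same number of input channels.
layer_groups = {c: groups_for_fixed_group_size(c, 8) for c in (32, 64, 128)}
```

Here `layer_groups` maps 32, 64 and 128 channels to 4, 8 and 16 groups respectively, so each layer does the same per-group work, which is the uniformity the abstract credits for easier hardware optimization.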