October 18, 2019

2878 words 14 mins read

Paper Group ANR 615

Deep ChArUco: Dark ChArUco Marker Pose Estimation. q-Neurons: Neuron Activations based on Stochastic Jackson’s Derivative Operators. An Evolutionary Algorithm with Crossover and Mutation for Model-Based Clustering. Multilingual Neural Machine Translation with Task-Specific Attention. Online Learning with an Almost Perfect Expert. Matrix Recovery wi …

Deep ChArUco: Dark ChArUco Marker Pose Estimation

Title Deep ChArUco: Dark ChArUco Marker Pose Estimation
Authors Danying Hu, Daniel DeTone, Vikram Chauhan, Igor Spivak, Tomasz Malisiewicz
Abstract ChArUco boards are used for camera calibration, monocular pose estimation, and pose verification in both robotics and augmented reality. Such fiducials are detectable via traditional computer vision methods (as found in OpenCV) in well-lit environments, but classical methods fail when the lighting is poor or when the image undergoes extreme motion blur. We present Deep ChArUco, a real-time pose estimation system which combines two custom deep networks, ChArUcoNet and RefineNet, with the Perspective-n-Point (PnP) algorithm to estimate the marker’s 6DoF pose. ChArUcoNet is a two-headed marker-specific convolutional neural network (CNN) which jointly outputs ID-specific classifiers and 2D point locations. The 2D point locations are further refined into subpixel coordinates using RefineNet. Our networks are trained using a combination of auto-labeled videos of the target marker, synthetic subpixel corner data, and extreme data augmentation. We evaluate Deep ChArUco in challenging low-light, high-motion, high-blur scenarios and demonstrate that our approach is superior to a traditional OpenCV-based method for ChArUco marker detection and pose estimation.
Tasks Calibration, Data Augmentation, Pose Estimation
Published 2018-12-08
URL https://arxiv.org/abs/1812.03247v2
PDF https://arxiv.org/pdf/1812.03247v2.pdf
PWC https://paperswithcode.com/paper/deep-charuco-dark-charuco-marker-pose
Repo
Framework
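
For reference, here is a minimal sketch of the classical OpenCV ChArUco-plus-PnP baseline that Deep ChArUco is compared against (not the paper's networks). It assumes the pre-4.7 `cv2.aruco` API from opencv-contrib-python and that camera intrinsics are already known; the board geometry and dictionary are placeholder values.

```python
import cv2

# Placeholder board geometry: 5x7 squares, 4 cm squares, 2 cm markers.
DICTIONARY = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
BOARD = cv2.aruco.CharucoBoard_create(5, 7, 0.04, 0.02, DICTIONARY)

def charuco_pose(gray, camera_matrix, dist_coeffs):
    """Classical pipeline: detect markers, interpolate ChArUco corners, solve PnP."""
    corners, ids, _ = cv2.aruco.detectMarkers(gray, DICTIONARY)
    if ids is None or len(ids) == 0:
        return None  # this is where classical detection fails in low light / heavy blur
    n, ch_corners, ch_ids = cv2.aruco.interpolateCornersCharuco(corners, ids, gray, BOARD)
    if ch_corners is None or n < 4:
        return None
    ok, rvec, tvec = cv2.aruco.estimatePoseCharucoBoard(
        ch_corners, ch_ids, BOARD, camera_matrix, dist_coeffs, None, None)
    return (rvec, tvec) if ok else None
```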

q-Neurons: Neuron Activations based on Stochastic Jackson’s Derivative Operators

Title q-Neurons: Neuron Activations based on Stochastic Jackson’s Derivative Operators
Authors Frank Nielsen, Ke Sun
Abstract We propose a new generic type of stochastic neurons, called $q$-neurons, whose activation functions are based on Jackson’s $q$-derivatives with stochastic parameters $q$. Our generalization of neural network architectures with $q$-neurons is shown to be both scalable and very easy to implement. We experimentally demonstrate consistently improved performance over standard state-of-the-art activation functions, on both training and testing loss.
Tasks
Published 2018-06-01
URL http://arxiv.org/abs/1806.00149v2
PDF http://arxiv.org/pdf/1806.00149v2.pdf
PWC https://paperswithcode.com/paper/q-neurons-neuron-activations-based-on
Repo
Framework
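
The Jackson $q$-derivative underlying these activations is $D_q f(x) = \frac{f(qx) - f(x)}{(q-1)x}$, which recovers $f'(x)$ as $q \to 1$. The toy sketch below draws a stochastic $q$ near 1 and applies the operator to a base activation; the exact parameterization used in the paper may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def q_derivative(f, x, q, eps=1e-8):
    """Jackson's q-derivative D_q f(x) = (f(qx) - f(x)) / ((q - 1) x).
    Clamped near x = 0 for numerical safety in this toy version."""
    denom = (q - 1.0) * x
    denom = np.where(np.abs(denom) < eps, eps, denom)
    return (f(q * x) - f(x)) / denom

def q_neuron(x, f=np.tanh, q_scale=0.1):
    """Stochastic q-neuron sketch: draw q near 1 on each forward pass and apply
    the q-derivative of a base activation (tanh here, as an example)."""
    q = 1.0 + q_scale * rng.standard_normal()
    return q_derivative(f, x, q)

x = np.linspace(-3.0, 3.0, 7)
print(q_neuron(x))  # in expectation close to tanh'(x) when q_scale is small
```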

An Evolutionary Algorithm with Crossover and Mutation for Model-Based Clustering

Title An Evolutionary Algorithm with Crossover and Mutation for Model-Based Clustering
Authors Sharon M. McNicholas, Paul D. McNicholas, Daniel A. Ashlock
Abstract The expectation-maximization (EM) algorithm is almost ubiquitous for parameter estimation in model-based clustering problems; however, it can become stuck at local maxima, due to its single path, monotonic nature. Rather than using an EM algorithm, an evolutionary algorithm (EA) is developed. This EA facilitates a different search of the fitness landscape, i.e., the likelihood surface, utilizing both crossover and mutation. Furthermore, this EA represents an efficient approach to “hard” model-based clustering and so it can be viewed as a sort of generalization of the k-means algorithm, which is itself equivalent to a classification EM algorithm for a Gaussian mixture model with spherical component covariances. The EA is illustrated on several data sets, and its performance is compared to k-means clustering as well as model-based clustering with an EM algorithm.
Tasks
Published 2018-10-31
URL http://arxiv.org/abs/1811.00097v1
PDF http://arxiv.org/pdf/1811.00097v1.pdf
PWC https://paperswithcode.com/paper/an-evolutionary-algorithm-with-crossover-and
Repo
Framework
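
Below is a minimal illustration of searching over hard labelings with crossover and mutation, using the spherical-Gaussian (k-means) objective as fitness; the paper's EA operators and likelihood-based fitness are richer than this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(X, labels, k):
    """Negative within-cluster sum of squares (spherical-Gaussian / k-means objective)."""
    sse = 0.0
    for j in range(k):
        pts = X[labels == j]
        if len(pts):
            sse += ((pts - pts.mean(axis=0)) ** 2).sum()
    return -sse

def crossover(a, b):
    mask = rng.random(len(a)) < 0.5                 # uniform crossover on label vectors
    return np.where(mask, a, b)

def mutate(labels, k, rate=0.02):
    labels = labels.copy()
    flip = rng.random(len(labels)) < rate           # reassign a few points at random
    labels[flip] = rng.integers(0, k, flip.sum())
    return labels

def ea_cluster(X, k=3, pop=30, gens=200):
    population = [rng.integers(0, k, len(X)) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda lab: fitness(X, lab, k), reverse=True)
        elite = population[: pop // 2]              # keep the fitter half
        children = []
        while len(elite) + len(children) < pop:
            i, j = rng.integers(0, len(elite), 2)
            children.append(mutate(crossover(elite[i], elite[j]), k))
        population = elite + children
    return max(population, key=lambda lab: fitness(X, lab, k))

X = np.vstack([rng.normal(m, 0.3, (50, 2)) for m in (0.0, 3.0, 6.0)])
labels = ea_cluster(X)
```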

Multilingual Neural Machine Translation with Task-Specific Attention

Title Multilingual Neural Machine Translation with Task-Specific Attention
Authors Graeme Blackwood, Miguel Ballesteros, Todd Ward
Abstract Multilingual machine translation addresses the task of translating between multiple source and target languages. We propose task-specific attention models, a simple but effective technique for improving the quality of sequence-to-sequence neural multilingual translation. Our approach seeks to retain as much of the parameter sharing generalization of NMT models as possible, while still allowing for language-specific specialization of the attention model to a particular language-pair or task. Our experiments on four languages of the Europarl corpus show that using a target-specific model of attention provides consistent gains in translation quality for all possible translation directions, compared to a model in which all parameters are shared. We observe improved translation quality even in the (extreme) low-resource zero-shot translation directions for which the model never saw explicitly paired parallel data.
Tasks Machine Translation
Published 2018-06-08
URL http://arxiv.org/abs/1806.03280v1
PDF http://arxiv.org/pdf/1806.03280v1.pdf
PWC https://paperswithcode.com/paper/multilingual-neural-machine-translation-with
Repo
Framework
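
A schematic of the parameter-sharing pattern described here: everything is shared except a small attention projection per target language. The dimensions and the bilinear scoring form are illustrative assumptions, not the authors' exact architecture.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class TaskSpecificAttention:
    """Shared encoder/decoder states, but one small attention matrix per target language."""
    def __init__(self, dim, target_langs, rng=np.random.default_rng(0)):
        self.W = {lang: 0.1 * rng.standard_normal((dim, dim)) for lang in target_langs}

    def context(self, dec_state, enc_states, target_lang):
        """dec_state: (dim,); enc_states: (src_len, dim) from the shared encoder."""
        scores = enc_states @ self.W[target_lang] @ dec_state    # bilinear scores, (src_len,)
        weights = softmax(scores)
        return weights @ enc_states                              # attended context vector

attn = TaskSpecificAttention(dim=8, target_langs=["de", "fr", "es", "en"])
ctx = attn.context(np.ones(8), np.random.rand(5, 8), target_lang="de")
```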

Online Learning with an Almost Perfect Expert

Title Online Learning with an Almost Perfect Expert
Authors Simina Brânzei, Yuval Peres
Abstract We study the multiclass online learning problem where a forecaster makes a sequence of predictions using the advice of $n$ experts. Our main contribution is to analyze the regime where the best expert makes at most $b$ mistakes and to show that when $b = o(\log_4{n})$, the expected number of mistakes made by the optimal forecaster is at most $\log_4{n} + o(\log_4{n})$. We also describe an adversary strategy showing that this bound is tight and that the worst case is attained for binary prediction.
Tasks
Published 2018-07-30
URL http://arxiv.org/abs/1807.11169v2
PDF http://arxiv.org/pdf/1807.11169v2.pdf
PWC https://paperswithcode.com/paper/online-learning-with-an-almost-perfect-expert
Repo
Framework
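
To make the setting concrete, here is a simple mistake-driven forecaster: follow a plurality vote of the experts with the fewest mistakes so far. The paper's optimal forecaster, which attains the $\log_4{n} + o(\log_4{n})$ bound, uses a more careful scheme; this only illustrates the protocol.

```python
from collections import Counter

def forecast(expert_predictions, mistakes):
    """Predict the plurality label among the experts with the fewest mistakes so far."""
    best = min(mistakes)
    leaders = [p for p, m in zip(expert_predictions, mistakes) if m == best]
    return Counter(leaders).most_common(1)[0][0]

def run(rounds, n_experts):
    """rounds: iterable of (expert_predictions, true_label); returns forecaster mistakes."""
    mistakes = [0] * n_experts
    total = 0
    for expert_preds, truth in rounds:
        guess = forecast(expert_preds, mistakes)
        total += int(guess != truth)
        mistakes = [m + int(p != truth) for m, p in zip(mistakes, expert_preds)]
    return total
```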

Matrix Recovery with Implicitly Low-Rank Data

Title Matrix Recovery with Implicitly Low-Rank Data
Authors Xingyu Xie, Jianlong Wu, Guangcan Liu, Jun Wang
Abstract In this paper, we study the problem of matrix recovery, which aims to restore a target matrix of authentic samples from grossly corrupted observations. Most existing methods, such as the well-known Robust Principal Component Analysis (RPCA), assume that the target matrix we wish to recover is low-rank. However, the underlying data structure is often non-linear in practice, so the low-rankness assumption can be violated. To tackle this issue, we propose a novel method for matrix recovery that can handle the case where the target matrix is low-rank in an implicit feature space but high-rank or even full-rank in its original form. Namely, our method pursues the low-rank structure of the target matrix in an implicit feature space. By making use of the specifics of an accelerated proximal gradient based optimization algorithm, the proposed method can recover a target matrix with non-linear structure from its corrupted version. Comprehensive experiments on both synthetic and real datasets demonstrate the superiority of our method.
Tasks
Published 2018-11-09
URL http://arxiv.org/abs/1811.03945v1
PDF http://arxiv.org/pdf/1811.03945v1.pdf
PWC https://paperswithcode.com/paper/matrix-recovery-with-implicitly-low-rank-data
Repo
Framework
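
For contrast with the implicit-feature-space approach, a compact sketch of classical RPCA (low-rank plus sparse) via alternating singular-value and soft thresholding is shown below. It is a simplified baseline, not the paper's kernelized method or its accelerated proximal gradient solver.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(M, tau):
    """Entrywise soft thresholding: proximal operator of the l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def rpca(X, lam=None, mu=1.0, iters=200):
    """Alternately update a low-rank part L and a sparse corruption part S so that
    X ~ L + S. (A standard solver would add a dual variable; this is simplified.)"""
    lam = lam if lam is not None else 1.0 / np.sqrt(max(X.shape))
    L = np.zeros_like(X)
    S = np.zeros_like(X)
    for _ in range(iters):
        L = svt(X - S, 1.0 / mu)
        S = soft(X - L, lam / mu)
    return L, S
```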

Estimation of Dimensions Contributing to Detected Anomalies with Variational Autoencoders

Title Estimation of Dimensions Contributing to Detected Anomalies with Variational Autoencoders
Authors Yasuhiro Ikeda, Kengo Tajiri, Yuusuke Nakano, Keishiro Watanabe, Keisuke Ishibashi
Abstract Anomaly detection using dimensionality reduction has been an essential technique for monitoring multidimensional data. Although deep learning-based methods have been well studied for their remarkable detection performance, their interpretability is still a problem. In this paper, we propose a novel algorithm for estimating the dimensions contributing to the detected anomalies by using variational autoencoders (VAEs). Our algorithm is based on an approximative probabilistic model that considers the existence of anomalies in the data, and by maximizing the log-likelihood, we estimate which dimensions contribute to the data being identified as an anomaly. Experimental results with benchmark datasets show that our algorithm extracts the contributing dimensions more accurately than baseline methods.
Tasks Anomaly Detection, Dimensionality Reduction
Published 2018-11-12
URL http://arxiv.org/abs/1811.04576v2
PDF http://arxiv.org/pdf/1811.04576v2.pdf
PWC https://paperswithcode.com/paper/estimation-of-dimensions-contributing-to
Repo
Framework
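
A simpler attribution heuristic, shown below for context, scores each dimension by its reconstruction negative log-likelihood under the decoder's Gaussian output; the paper's estimator instead maximizes the likelihood of an approximate model that explicitly includes anomalies.

```python
import numpy as np

def per_dim_contribution(x, recon_mean, recon_logvar):
    """Gaussian negative log-likelihood of each dimension of x under the VAE decoder
    output; higher values indicate dimensions that contribute more to the anomaly."""
    var = np.exp(recon_logvar)
    return 0.5 * (np.log(2.0 * np.pi * var) + (x - recon_mean) ** 2 / var)

x = np.array([0.1, 5.0, -0.2])
scores = per_dim_contribution(x, recon_mean=np.zeros(3), recon_logvar=np.zeros(3))
print(scores.argsort()[::-1])  # dimensions ranked by estimated contribution
```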

Machine Learning for Spatiotemporal Sequence Forecasting: A Survey

Title Machine Learning for Spatiotemporal Sequence Forecasting: A Survey
Authors Xingjian Shi, Dit-Yan Yeung
Abstract Spatiotemporal systems are common in the real world. Forecasting the multi-step future of these spatiotemporal systems based on past observations, or Spatiotemporal Sequence Forecasting (STSF), is a significant and challenging problem. Although many real-world problems can be viewed as STSF and many research works have proposed machine learning based methods for them, no existing work has summarized and compared these methods from a unified perspective. This survey aims to provide a systematic review of machine learning for STSF. In this survey, we define the STSF problem and classify it into three subcategories: Trajectory Forecasting of Moving Point Cloud (TF-MPC), STSF on Regular Grid (STSF-RG), and STSF on Irregular Grid (STSF-IG). We then introduce the two major challenges of STSF: 1) how to learn a model for multi-step forecasting and 2) how to adequately model the spatial and temporal structures. After that, we review the existing works for solving these challenges, including the general learning strategies for multi-step forecasting, the classical machine learning based methods for STSF, and the deep learning based methods for STSF. We also compare these methods and point out some potential research directions.
Tasks
Published 2018-08-21
URL http://arxiv.org/abs/1808.06865v1
PDF http://arxiv.org/pdf/1808.06865v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-for-spatiotemporal-sequence
Repo
Framework
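
As a concrete example of the first challenge, the sketch below implements the iterative (recursive) multi-step strategy discussed in such surveys: train a one-step model and roll it forward by feeding predictions back in as inputs.

```python
import numpy as np

def rollout(one_step_model, history, horizon):
    """history: (t, d) array of past observations; returns a (horizon, d) forecast
    produced by repeatedly feeding predictions back in as the newest observation."""
    window = list(history)
    preds = []
    for _ in range(horizon):
        y = one_step_model(np.asarray(window))
        preds.append(y)
        window.append(y)      # recursive feedback
        window.pop(0)         # keep a fixed-length context
    return np.stack(preds)

# e.g. a trivial persistence "model", just to exercise the loop:
forecast = rollout(lambda w: w[-1], history=np.random.rand(10, 3), horizon=5)
```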

Human-in-the-Loop Synthesis for Partially Observable Markov Decision Processes

Title Human-in-the-Loop Synthesis for Partially Observable Markov Decision Processes
Authors Steven Carr, Nils Jansen, Ralf Wimmer, Jie Fu, Ufuk Topcu
Abstract We study planning problems where autonomous agents operate inside environments that are subject to uncertainties and not fully observable. Partially observable Markov decision processes (POMDPs) are a natural formal model to capture such problems. Because of the potentially huge or even infinite belief space in POMDPs, synthesis with safety guarantees is, in general, computationally intractable. We propose an approach that aims to circumvent this difficulty: in scenarios that can be partially or fully simulated in a virtual environment, we actively integrate a human user to control an agent. While the user repeatedly tries to safely guide the agent in the simulation, we collect data from the human input. Via behavior cloning, we translate the data into a strategy for the POMDP. The strategy resolves all nondeterminism and non-observability of the POMDP, resulting in a discrete-time Markov chain (MC). The efficient verification of this MC gives quantitative insights into the quality of the inferred human strategy by proving or disproving given system specifications. For the case that the quality of the strategy is not sufficient, we propose a refinement method using counterexamples presented to the human. Experiments show that by including humans into the POMDP verification loop we improve the state of the art by orders of magnitude in terms of scalability.
Tasks
Published 2018-02-27
URL http://arxiv.org/abs/1802.09810v1
PDF http://arxiv.org/pdf/1802.09810v1.pdf
PWC https://paperswithcode.com/paper/human-in-the-loop-synthesis-for-partially
Repo
Framework
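
The behavior-cloning step can be reduced to its bare bones as follows: logged human (observation, action) pairs are turned into a deterministic, observation-based strategy. Applying that strategy to the POMDP and model-checking the induced Markov chain (e.g. with a probabilistic model checker) is not shown here.

```python
from collections import Counter, defaultdict

def clone_strategy(demonstrations):
    """demonstrations: iterable of (observation, action) pairs logged from human play.
    Returns a deterministic observation-based strategy (most frequent human action)."""
    counts = defaultdict(Counter)
    for obs, act in demonstrations:
        counts[obs][act] += 1
    return {obs: c.most_common(1)[0][0] for obs, c in counts.items()}

strategy = clone_strategy([("near_wall", "turn"), ("near_wall", "turn"), ("clear", "forward")])
assert strategy["near_wall"] == "turn"
```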

Conditional Adversarial Synthesis of 3D Facial Action Units

Title Conditional Adversarial Synthesis of 3D Facial Action Units
Authors Zhilei Liu, Guoxian Song, Jianfei Cai, Tat-Jen Cham, Juyong Zhang
Abstract Employing deep learning-based approaches for fine-grained facial expression analysis, such as those involving the estimation of Action Unit (AU) intensities, is difficult due to the lack of a large-scale dataset of real faces with sufficiently diverse AU labels for training. In this paper, we consider how AU-level facial image synthesis can be used to substantially augment such a dataset. We propose an AU synthesis framework that combines the well-known 3D Morphable Model (3DMM), which intrinsically disentangles expression parameters from other face attributes, with models that adversarially generate 3DMM expression parameters conditioned on given target AU labels, in contrast to the more conventional approach of generating facial images directly. In this way, we are able to synthesize new combinations of expression parameters and facial images from desired AU labels. Extensive quantitative and qualitative results on the benchmark DISFA dataset demonstrate the effectiveness of our method on 3DMM facial expression parameter synthesis and data augmentation for deep learning-based AU intensity estimation.
Tasks Data Augmentation, Image Generation
Published 2018-02-21
URL http://arxiv.org/abs/1802.07421v2
PDF http://arxiv.org/pdf/1802.07421v2.pdf
PWC https://paperswithcode.com/paper/conditional-adversarial-synthesis-of-3d
Repo
Framework
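
A skeleton of a conditional generator/discriminator pair over 3DMM expression parameters conditioned on AU labels is sketched below in PyTorch; the layer sizes and dimensions are illustrative assumptions, and the training losses and 3DMM rendering step are omitted.

```python
import torch
import torch.nn as nn

NOISE, N_AUS, EXPR_DIM = 32, 12, 29   # assumed dimensions, for illustration only

G = nn.Sequential(                     # [noise, AU labels] -> 3DMM expression parameters
    nn.Linear(NOISE + N_AUS, 128), nn.ReLU(),
    nn.Linear(128, EXPR_DIM),
)
D = nn.Sequential(                     # [expression parameters, AU labels] -> real/fake score
    nn.Linear(EXPR_DIM + N_AUS, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1), nn.Sigmoid(),
)

z = torch.randn(8, NOISE)
au = torch.rand(8, N_AUS)              # target AU intensities in [0, 1]
fake_expr = G(torch.cat([z, au], dim=1))
score = D(torch.cat([fake_expr, au], dim=1))  # faces are then rendered from the 3DMM
```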

A Framework for Automated Cellular Network Tuning with Reinforcement Learning

Title A Framework for Automated Cellular Network Tuning with Reinforcement Learning
Authors Faris B. Mismar, Jinseok Choi, Brian L. Evans
Abstract Tuning cellular network performance against ever-present wireless impairments can dramatically improve reliability for end users. In this paper, we formulate cellular network performance tuning as a reinforcement learning (RL) problem and provide a solution to improve the performance for indoor and outdoor environments. By leveraging the ability of Q-learning to estimate future performance improvement rewards, we propose two algorithms: (1) closed loop power control (PC) for downlink voice over LTE (VoLTE) and (2) self-organizing network (SON) fault management. The VoLTE PC algorithm uses RL to adjust the indoor base station transmit power so that the signal to interference plus noise ratio (SINR) of a user equipment (UE) meets the target SINR. It does so without the UE having to send power control requests. The SON fault management algorithm uses RL to improve the performance of an outdoor base station cluster by resolving faults in the network through configuration management. Both algorithms exploit measurements from the connected users, wireless impairments, and relevant configuration parameters to solve a non-convex performance optimization problem using RL. Simulation results show that our proposed RL-based algorithms outperform today’s industry standards in realistic cellular communication environments.
Tasks Q-Learning
Published 2018-08-13
URL https://arxiv.org/abs/1808.05140v6
PDF https://arxiv.org/pdf/1808.05140v6.pdf
PWC https://paperswithcode.com/paper/a-framework-for-automated-cellular-network
Repo
Framework
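
The sketch below shows tabular Q-learning for closed-loop power control in the spirit of the first algorithm. The toy channel model, reward shaping, and power grid are stand-in assumptions, since a real evaluation needs an LTE/VoLTE simulator.

```python
import numpy as np

rng = np.random.default_rng(1)
POWERS = np.arange(10, 46, 5)        # candidate transmit powers (dBm), assumed grid
ACTIONS = [-1, 0, +1]                # lower / keep / raise the power index
TARGET_SINR = 15.0                   # dB

def step(p_idx, action):
    """Toy environment: move along the power grid and observe a noisy SINR."""
    p_idx = int(np.clip(p_idx + action, 0, len(POWERS) - 1))
    sinr = POWERS[p_idx] - 20.0 + rng.normal(0.0, 1.0)
    reward = -abs(sinr - TARGET_SINR)        # closer to the target SINR is better
    return p_idx, reward

Q = np.zeros((len(POWERS), len(ACTIONS)))
alpha, gamma, eps = 0.1, 0.9, 0.1
state = len(POWERS) // 2
for _ in range(5000):
    a = rng.integers(len(ACTIONS)) if rng.random() < eps else int(Q[state].argmax())
    nxt, r = step(state, ACTIONS[a])
    Q[state, a] += alpha * (r + gamma * Q[nxt].max() - Q[state, a])  # Q-learning update
    state = nxt
print("settled transmit power:", POWERS[state], "dBm")
```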

Multi-Level Sequence GAN for Group Activity Recognition

Title Multi-Level Sequence GAN for Group Activity Recognition
Authors Harshala Gammulle, Simon Denman, Sridha Sridharan, Clinton Fookes
Abstract We propose a novel semi-supervised, Multi-Level Sequential Generative Adversarial Network (MLS-GAN) architecture for group activity recognition. In contrast to previous works which utilise manually annotated individual human action predictions, we allow the model to learn its own internal representations to discover pertinent sub-activities that aid the final group activity recognition task. The generator is fed with person-level and scene-level features that are mapped temporally through LSTM networks. Action-based feature fusion is performed through novel gated fusion units that are able to consider long-term dependencies, exploring the relationships among all individual actions, to learn an intermediate representation or ‘action code’ for the current group activity. The network achieves its semi-supervised behaviour by allowing it to perform group action classification together with the adversarial real/fake validation. We perform extensive evaluations on different architectural variants to demonstrate the importance of the proposed architecture. Furthermore, we show that utilising both person-level and scene-level features facilitates group activity prediction better than using only person-level features. Our proposed architecture outperforms current state-of-the-art results for sports and pedestrian based classification tasks on the Volleyball and Collective Activity datasets, showing its flexible nature for effective learning of group activities.
Tasks Action Classification, Activity Prediction, Activity Recognition, Group Activity Recognition
Published 2018-12-18
URL http://arxiv.org/abs/1812.07124v1
PDF http://arxiv.org/pdf/1812.07124v1.pdf
PWC https://paperswithcode.com/paper/multi-level-sequence-gan-for-group-activity
Repo
Framework
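
One plausible form of a gated fusion unit combining person-level and scene-level features is sketched below (a sigmoid gate blending two projections); the paper's gated fusion units and the surrounding LSTM/GAN machinery are more involved.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Sigmoid gate blending projections of person-level and scene-level features."""
    def __init__(self, dim):
        super().__init__()
        self.proj_person = nn.Linear(dim, dim)
        self.proj_scene = nn.Linear(dim, dim)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, person_feat, scene_feat):
        g = torch.sigmoid(self.gate(torch.cat([person_feat, scene_feat], dim=-1)))
        return g * torch.tanh(self.proj_person(person_feat)) + \
               (1 - g) * torch.tanh(self.proj_scene(scene_feat))

fusion = GatedFusion(dim=64)
action_code = fusion(torch.randn(4, 64), torch.randn(4, 64))  # intermediate representation
```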

Towards ultra-high resolution 3D reconstruction of a whole rat brain from 3D-PLI data

Title Towards ultra-high resolution 3D reconstruction of a whole rat brain from 3D-PLI data
Authors Sharib Ali, Martin Schober, Philipp Schlöme, Katrin Amunts, Markus Axer, Karl Rohr
Abstract 3D reconstruction of the fiber connectivity of the rat brain at microscopic scale enables gaining detailed insight about the complex structural organization of the brain. We introduce a new method for registration and 3D reconstruction of high- and ultra-high resolution (64 $\mu$m and 1.3 $\mu$m pixel size) histological images of a Wistar rat brain acquired by 3D polarized light imaging (3D-PLI). Our method exploits multi-scale and multi-modal 3D-PLI data up to cellular resolution. We propose a new feature transform-based similarity measure and a weighted regularization scheme for accurate and robust non-rigid registration. To transform the 1.3 $\mu$m ultra-high resolution data to the reference blockface images a feature-based registration method followed by a non-rigid registration is proposed. Our approach has been successfully applied to 278 histological sections of a rat brain and the performance has been quantitatively evaluated using manually placed landmarks by an expert.
Tasks 3D Reconstruction
Published 2018-07-29
URL http://arxiv.org/abs/1807.11080v1
PDF http://arxiv.org/pdf/1807.11080v1.pdf
PWC https://paperswithcode.com/paper/towards-ultra-high-resolution-3d
Repo
Framework
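
The coarse, feature-based part of such a registration pipeline can be sketched with off-the-shelf OpenCV tools (ORB features plus a RANSAC partial-affine fit); the paper's contributions, the feature-transform-based similarity measure and the weighted non-rigid regularization, are not reproduced here.

```python
import cv2
import numpy as np

def coarse_register(moving, reference):
    """Estimate a partial affine transform aligning `moving` to `reference` (grayscale)."""
    orb = cv2.ORB_create(nfeatures=2000)
    k1, d1 = orb.detectAndCompute(moving, None)
    k2, d2 = orb.detectAndCompute(reference, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    matches = sorted(matches, key=lambda m: m.distance)[:200]   # keep the best matches
    src = np.float32([k1[m.queryIdx].pt for m in matches])
    dst = np.float32([k2[m.trainIdx].pt for m in matches])
    A, _inliers = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)
    h, w = reference.shape[:2]
    return cv2.warpAffine(moving, A, (w, h)), A
```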

EVA$^2$: Exploiting Temporal Redundancy in Live Computer Vision

Title EVA$^2$: Exploiting Temporal Redundancy in Live Computer Vision
Authors Mark Buckler, Philip Bedoukian, Suren Jayasuriya, Adrian Sampson
Abstract Hardware support for deep convolutional neural networks (CNNs) is critical to advanced computer vision in mobile and embedded devices. Current designs, however, accelerate generic CNNs; they do not exploit the unique characteristics of real-time vision. We propose to use the temporal redundancy in natural video to avoid unnecessary computation on most frames. A new algorithm, activation motion compensation, detects changes in the visual input and incrementally updates a previously-computed output. The technique takes inspiration from video compression and applies well-known motion estimation techniques to adapt to visual changes. We use an adaptive key frame rate to control the trade-off between efficiency and vision quality as the input changes. We implement the technique in hardware as an extension to existing state-of-the-art CNN accelerator designs. The new unit reduces the average energy per frame by 54.2%, 61.7%, and 87.6% for three CNNs with less than 1% loss in vision accuracy.
Tasks Motion Compensation, Motion Estimation, Video Compression
Published 2018-03-16
URL http://arxiv.org/abs/1803.06312v2
PDF http://arxiv.org/pdf/1803.06312v2.pdf
PWC https://paperswithcode.com/paper/eva2-exploiting-temporal-redundancy-in-live
Repo
Framework
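
In software terms, the key-frame/delta control flow looks roughly like the sketch below: run the expensive CNN prefix only on key frames, and otherwise warp the cached intermediate activations toward the current frame before running the cheap suffix. The motion estimation, warping, and adaptive key-frame policy in the actual hardware are far more involved; `prefix`, `suffix`, and `warp` here are user-supplied stand-ins.

```python
import numpy as np

class TemporalCNN:
    """prefix(frame) -> activations, suffix(activations) -> output, and
    warp(activations, key_frame, frame) -> compensated activations are all
    user-supplied callables (assumptions of this sketch)."""
    def __init__(self, prefix, suffix, warp, change_threshold=0.15):
        self.prefix, self.suffix, self.warp = prefix, suffix, warp
        self.threshold = change_threshold
        self.key_frame = None
        self.key_acts = None

    def __call__(self, frame):
        is_key = self.key_frame is None or \
            float(np.mean(np.abs(frame - self.key_frame))) > self.threshold
        if is_key:
            self.key_frame, self.key_acts = frame, self.prefix(frame)  # full prefix pass
            acts = self.key_acts
        else:
            acts = self.warp(self.key_acts, self.key_frame, frame)     # cheap delta update
        return self.suffix(acts)

# Trivial stand-ins: a linear "prefix", a mean-pool "suffix", and an exact warp for it.
model = TemporalCNN(prefix=lambda f: 2.0 * f,
                    suffix=lambda a: a.mean(),
                    warp=lambda a, kf, f: a + 2.0 * (f - kf))
out = model(np.random.rand(8, 8))
```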

From handcrafted to deep local features

Title From handcrafted to deep local features
Authors Gabriela Csurka, Christopher R. Dance, Martin Humenberger
Abstract This paper presents an overview of the evolution of local features from handcrafted to deep-learning-based methods, followed by a discussion of several benchmarks and papers evaluating such local features. Our investigations are motivated by 3D reconstruction problems, where the precise location of the features is important. As we describe these methods, we highlight and explain the challenges of feature extraction and potential ways to overcome them. We first present handcrafted methods, followed by methods based on classical machine learning and finally we discuss methods based on deep-learning. This largely chronologically-ordered presentation will help the reader to fully understand the topic of image and region description in order to make best use of it in modern computer vision applications. In particular, understanding handcrafted methods and their motivation can help to understand modern approaches and how machine learning is used to improve the results. We also provide references to most of the relevant literature and code.
Tasks 3D Reconstruction
Published 2018-07-26
URL https://arxiv.org/abs/1807.10254v3
PDF https://arxiv.org/pdf/1807.10254v3.pdf
PWC https://paperswithcode.com/paper/from-handcrafted-to-deep-local-invariant
Repo
Framework