April 1, 2020

Paper Group ANR 423

PnP-Net: A hybrid Perspective-n-Point Network


Title	PnP-Net: A hybrid Perspective-n-Point Network
Authors	Roy Sheffer, Ami Wiesel
Abstract	We consider the robust Perspective-n-Point (PnP) problem using a hybrid approach that combines deep learning with model-based algorithms. PnP is the problem of estimating the pose of a calibrated camera given a set of 3D points in the world and their corresponding 2D projections in the image. In its more challenging robust version, some of the correspondences may be mismatched and must be efficiently discarded. Classical solutions address PnP via iterative robust non-linear least squares methods that exploit the problem’s geometry but are either inaccurate or computationally intensive. In contrast, we propose to combine a deep learning initial phase followed by a model-based fine-tuning phase. This hybrid approach, denoted by PnP-Net, succeeds in estimating the unknown pose parameters under correspondence errors and noise, with low and fixed computational complexity requirements. We demonstrate its advantages on both synthetic data and real-world data.
Tasks
Published	2020-03-10
URL	https://arxiv.org/abs/2003.04626v1
PDF	https://arxiv.org/pdf/2003.04626v1.pdf
PWC	https://paperswithcode.com/paper/pnp-net-a-hybrid-perspective-n-point-network
Repo
Framework
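
The geometric core of the PnP problem described in the abstract can be illustrated with a pinhole reprojection check. The following is only a minimal sketch, not the PnP-Net method: it assumes a standard pinhole camera with intrinsics `K`, a known pose `(R, t)`, made-up point data, and an arbitrary 2-pixel inlier threshold, and it shows how a mismatched correspondence shows up as a large reprojection error.

```python
import numpy as np

def project(points_3d, R, t, K):
    """Project Nx3 world points into the image with a pinhole camera."""
    cam = points_3d @ R.T + t          # world -> camera coordinates
    uvw = cam @ K.T                    # apply the intrinsic matrix
    return uvw[:, :2] / uvw[:, 2:3]    # perspective division

def reprojection_errors(points_3d, points_2d, R, t, K):
    """Pixel distance between observed and predicted projections, per correspondence."""
    return np.linalg.norm(project(points_3d, R, t, K) - points_2d, axis=1)

# Hypothetical data: identity rotation, camera 5 units away from the points.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])
rng = np.random.default_rng(0)
points_3d = rng.uniform(-1.0, 1.0, size=(6, 3))
points_2d = project(points_3d, R, t, K)
points_2d[0] += 50.0                   # simulate one mismatched correspondence

errors = reprojection_errors(points_3d, points_2d, R, t, K)
inliers = errors < 2.0                 # assumed pixel threshold, not from the paper
```

In a robust solver the pose is unknown and the inlier set has to be estimated jointly with it; here the pose is fixed only to keep the example short.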

Using Deep Reinforcement Learning Methods for Autonomous Vessels in 2D Environments


Title	Using Deep Reinforcement Learning Methods for Autonomous Vessels in 2D Environments
Authors	Mohammad Etemad, Nader Zare, Mahtab Sarvmaili, Amilcar Soares, Bruno Brandoli Machado, Stan Matwin
Abstract	Unmanned Surface Vehicle (USV) technology is an exciting topic that essentially deploys an algorithm to safely and efficiently perform a mission. Although reinforcement learning is a well-known approach to modeling such a task, instability and divergence may occur when combining off-policy learning and function approximation. In this work, we used deep reinforcement learning combining Q-learning with a neural representation to avoid instability. Our methodology uses deep Q-learning and combines it with a rolling wave planning approach from agile methodology. Our method contains two critical parts in order to perform missions in an unknown environment. The first is a path planner that is responsible for generating a potential effective path to a destination without considering the details of the route. The second is a decision-making module that is responsible for short-term decisions on avoiding obstacles during the near future steps of USV exploitation within the context of the value function. Simulations were performed using two algorithms: a basic vanilla vessel navigator (VVN) as a baseline and an improved one for the vessel navigator with a planner and local view (VNPLV). Experimental results show that the proposed method enhanced the performance of VVN by 55.31 on average for long-distance missions. Our model successfully demonstrated obstacle avoidance by means of deep reinforcement learning using planning adaptive paths in unknown environments.
Tasks	Decision Making, Q-Learning
Published	2020-03-23
URL	https://arxiv.org/abs/2003.10249v1
PDF	https://arxiv.org/pdf/2003.10249v1.pdf
PWC	https://paperswithcode.com/paper/using-deep-reinforcement-learning-methods-for
Repo
Framework
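
The Q-learning update that the abstract builds on can be written in a few lines. This is a tabular simplification for illustration only: the paper uses a neural network in place of the table, and the state names, action set, learning rate `alpha`, and discount `gamma` below are all hypothetical.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One off-policy Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

actions = ["left", "right", "ahead"]   # hypothetical steering actions
q = defaultdict(float)                 # unseen (state, action) pairs start at 0.0
q[("s1", "ahead")] = 1.0               # pretend the next state already has some value

new_value = q_update(q, "s0", "ahead", reward=0.5, next_state="s1", actions=actions)
```

The update is off-policy because the target uses the best next action regardless of which action the agent actually takes next, which is the combination with function approximation that the abstract flags as a source of instability.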

Spike-Timing-Dependent Back Propagation in Deep Spiking Neural Networks


Title	Spike-Timing-Dependent Back Propagation in Deep Spiking Neural Networks
Authors	Malu Zhang, Jiadong Wang, Zhixuan Zhang, Ammar Belatreche, Jibin Wu, Yansong Chua, Hong Qu, Haizhou Li
Abstract	The success of Deep Neural Networks (DNNs) can be attributed to their deep structure, which learns invariant feature representations at multiple levels of abstraction. Brain-inspired Spiking Neural Networks (SNNs) use spatiotemporal spike patterns to encode and transmit information, which is biologically realistic and suitable for ultra-low-power event-driven neuromorphic implementation. Therefore, Deep Spiking Neural Networks (DSNNs) represent a promising direction in artificial intelligence, with the potential to benefit from the best of both worlds. However, the training of DSNNs is challenging because standard error back-propagation (BP) algorithms are not directly applicable. In this paper, we first establish an understanding of why error back-propagation does not work well in DSNNs. To address this problem, we propose a simple yet efficient Rectified Linear Postsynaptic Potential function (ReL-PSP) for spiking neurons and propose a Spike-Timing-Dependent Back-Propagation (STDBP) learning algorithm for DSNNs. In the proposed learning algorithm, the timing of individual spikes is used to carry information (temporal coding), and learning (back-propagation) is performed based on spike timing in an event-driven manner. Experimental results demonstrate that the proposed learning algorithm achieves state-of-the-art performance among spike-time-based learning algorithms for SNNs. This work investigates the contribution of dynamics in spike timing to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future DSNNs.
Tasks	Decision Making
Published	2020-03-26
URL	https://arxiv.org/abs/2003.11837v1
PDF	https://arxiv.org/pdf/2003.11837v1.pdf
PWC	https://paperswithcode.com/paper/spike-timing-dependent-back-propagation-in
Repo
Framework

Learning State-Dependent Losses for Inverse Dynamics Learning


Title	Learning State-Dependent Losses for Inverse Dynamics Learning
Authors	Kristen Morse, Neha Das, Yixin Lin, Austin Wang, Akshara Rai, Franziska Meier
Abstract	Being able to quickly adapt to changes in dynamics is paramount in model-based control for object manipulation tasks. In order to influence fast adaptation of the inverse dynamics model’s parameters, data efficiency is crucial. Given observed data, a key element to how an optimizer updates model parameters is the loss function. In this work, we propose to apply meta-learning to learn structured, state-dependent loss functions during a meta-training phase. We then replace standard losses with our learned losses during online adaptation tasks. We evaluate our proposed approach on inverse dynamics learning tasks, both in simulation and on real hardware data. In both settings, the structured learned losses improve online adaptation speed, when compared to standard, state-independent loss functions.
Tasks	Meta-Learning
Published	2020-03-10
URL	https://arxiv.org/abs/2003.04947v2
PDF	https://arxiv.org/pdf/2003.04947v2.pdf
PWC	https://paperswithcode.com/paper/learning-state-dependent-losses-for-inverse
Repo
Framework

Fake Generated Painting Detection via Frequency Analysis


Title	Fake Generated Painting Detection via Frequency Analysis
Authors	Yong Bai, Yuanfang Guo, Jinjie Wei, Lin Lu, Rui Wang, Yunhong Wang
Abstract	With the development of deep neural networks, digital fake paintings can be generated by various style transfer algorithms. To detect the fake generated paintings, we analyze the fake generated and real paintings in the Fourier frequency domain and observe statistical differences and artifacts. Based on our observations, we propose Fake Generated Painting Detection via Frequency Analysis (FGPD-FA) by extracting three types of features in the frequency domain. Besides, we also propose a digital fake painting detection database for assessing the proposed method. Experimental results demonstrate the excellence of the proposed method in different testing conditions.
Tasks	Style Transfer
Published	2020-03-05
URL	https://arxiv.org/abs/2003.02467v1
PDF	https://arxiv.org/pdf/2003.02467v1.pdf
PWC	https://paperswithcode.com/paper/fake-generated-painting-detection-via
Repo
Framework

G-Net: A Deep Learning Approach to G-computation for Counterfactual Outcome Prediction Under Dynamic Treatment Regimes


Title	G-Net: A Deep Learning Approach to G-computation for Counterfactual Outcome Prediction Under Dynamic Treatment Regimes
Authors	Rui Li, Zach Shahn, Jun Li, Mingyu Lu, Prithwish Chakraborty, Daby Sow, Mohamed Ghalwash, Li-wei H. Lehman
Abstract	Counterfactual prediction is a fundamental task in decision-making. G-computation is a method for estimating expected counterfactual outcomes under dynamic time-varying treatment strategies. Existing G-computation implementations have mostly employed classical regression models with limited capacity to capture complex temporal and nonlinear dependence structures. This paper introduces G-Net, a novel sequential deep learning framework for G-computation that can handle complex time series data while imposing minimal modeling assumptions and provide estimates of individual or population-level time varying treatment effects. We evaluate alternative G-Net implementations using realistically complex temporal simulated data obtained from CVSim, a mechanistic model of the cardiovascular system.
Tasks	Decision Making, Time Series
Published	2020-03-23
URL	https://arxiv.org/abs/2003.10551v1
PDF	https://arxiv.org/pdf/2003.10551v1.pdf
PWC	https://paperswithcode.com/paper/g-net-a-deep-learning-approach-to-g
Repo
Framework

Deep Learning Estimation of Multi-Tissue Constrained Spherical Deconvolution with Limited Single Shell DW-MRI


Title	Deep Learning Estimation of Multi-Tissue Constrained Spherical Deconvolution with Limited Single Shell DW-MRI
Authors	Vishwesh Nath, Sudhir K. Pathak, Kurt G. Schilling, Walt Schneider, Bennett A. Landman
Abstract	Diffusion-weighted magnetic resonance imaging (DW-MRI) is the only non-invasive approach for estimation of intra-voxel tissue microarchitecture and reconstruction of in vivo neural pathways for the human brain. With improvement in accelerated MRI acquisition technologies, DW-MRI protocols that make use of multiple levels of diffusion sensitization have gained popularity. A well-known advanced method for reconstruction of white matter microstructure that uses multi-shell data is multi-tissue constrained spherical deconvolution (MT-CSD). MT-CSD substantially improves the resolution of intra-voxel structure over the traditional single shell version, constrained spherical deconvolution (CSD). Herein, we explore the possibility of using deep learning on single shell data (using the b=1000 s/mm² shell from the Human Connectome Project (HCP)) to estimate the information content captured by 8th order MT-CSD using the full three shell data (b=1000, 2000, and 3000 s/mm² from HCP). Briefly, we examine two network architectures: (1) a sequential network of fully connected dense layers with a residual block in the middle (ResDNN), and (2) a patch-based convolutional neural network with a residual block (ResCNN). For both networks an additional output block for estimation of voxel fraction was used with a modified loss function. Each approach was compared against the baseline of using MT-CSD on all data on 15 subjects from the HCP divided into 5 training, 2 validation, and 8 testing subjects with a total of 6.7 million voxels. The fiber orientation distribution function (fODF) can be recovered with high correlation (0.77 vs 0.74 and 0.65) as compared to the ground truth of MT-CSD, which was derived from the multi-shell DW-MRI acquisitions. Source code and models have been made publicly available.
Tasks
Published	2020-02-20
URL	https://arxiv.org/abs/2002.08820v1
PDF	https://arxiv.org/pdf/2002.08820v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-estimation-of-multi-tissue
Repo
Framework

Accelerating the Registration of Image Sequences by Spatio-temporal Multilevel Strategies


Title	Accelerating the Registration of Image Sequences by Spatio-temporal Multilevel Strategies
Authors	Hari Om Aggrawal, Jan Modersitzki
Abstract	Multilevel strategies are an integral part of many image registration algorithms. These strategies are very well-known for avoiding undesirable local minima, providing an outstanding initial guess, and reducing overall computation time. State-of-the-art multilevel strategies build a hierarchy of discretization in the spatial dimensions. In this paper, we present a spatio-temporal strategy, where we introduce a hierarchical discretization in the temporal dimension at each spatial level. This strategy is suitable for a motion estimation problem where the motion is assumed smooth over time. Our strategy exploits the temporal smoothness among image frames by following a predictor-corrector approach. The strategy predicts the motion by a novel interpolation method and later corrects it by registration. The prediction step provides a good initial guess for the correction step, hence reduces the overall computational time for registration. The acceleration is achieved by a factor of 2.5 on average, over the state-of-the-art multilevel methods on three examined optical coherence tomography datasets.
Tasks	Image Registration, Motion Estimation
Published	2020-01-18
URL	https://arxiv.org/abs/2001.06613v1
PDF	https://arxiv.org/pdf/2001.06613v1.pdf
PWC	https://paperswithcode.com/paper/accelerating-the-registration-of-image
Repo
Framework

Supervised Segmentation of Retinal Vessel Structures Using ANN


Title	Supervised Segmentation of Retinal Vessel Structures Using ANN
Authors	Esra Kaya, İsmail Sarıtaş, Ilker Ali Ozkan
Abstract	In this study, a supervised retina blood vessel segmentation process was performed on the green channel of the RGB image using an artificial neural network (ANN). The green channel is preferred because the retinal vessel structures can be distinguished most clearly from the green channel of the RGB image. The study was performed using 20 images in the DRIVE data set, which is one of the most common retina data sets known. The images went through some preprocessing stages like contrast-limited adaptive histogram equalization (CLAHE), color intensity adjustment, morphological operations and median and Gaussian filtering to obtain a good segmentation. Retinal vessel structures were highlighted with top-hat and bot-hat morphological operations and converted to a binary image by using global thresholding. Then, the network was trained by the binary version of the images specified as training images in the dataset and the targets are the images segmented manually by a specialist. The average segmentation accuracy for 20 images was found as 0.9492.
Tasks
Published	2020-01-15
URL	https://arxiv.org/abs/2001.05549v1
PDF	https://arxiv.org/pdf/2001.05549v1.pdf
PWC	https://paperswithcode.com/paper/supervised-segmentation-of-retinal-vessel
Repo
Framework

A Comparative Study for Non-rigid Image Registration and Rigid Image Registration


Title	A Comparative Study for Non-rigid Image Registration and Rigid Image Registration
Authors	Xiaoran Zhang, Hexiang Dong, Di Gao, Xiao Zhao
Abstract	Image registration algorithms can be generally categorized into two groups: non-rigid and rigid. Recently, many deep learning-based algorithms employ a neural net to characterize the non-rigid image registration function. However, do they always perform better? In this study, we compare the state-of-the-art deep learning-based non-rigid registration approach with a rigid registration approach. The data is generated from the Kaggle Dog vs Cat Competition (https://www.kaggle.com/c/dogs-vs-cats/) and we test the algorithms’ performance on rigid transformation including translation, rotation, scaling, shearing and pixelwise non-rigid transformation. The Voxelmorph is trained on rigidset and nonrigidset separately for comparison and we also add a Gaussian blur layer to its original architecture to improve registration performance. The best quantitative results in both root-mean-square error (RMSE) and mean absolute error (MAE) metrics for rigid registration are produced by SimpleElastix and non-rigid registration by Voxelmorph. We select representative samples for visual assessment.
Tasks	Image Registration
Published	2020-01-12
URL	https://arxiv.org/abs/2001.03831v1
PDF	https://arxiv.org/pdf/2001.03831v1.pdf
PWC	https://paperswithcode.com/paper/a-comparative-study-for-non-rigid-image
Repo
Framework

An Investigation of Feature-based Nonrigid Image Registration using Gaussian Process


Title	An Investigation of Feature-based Nonrigid Image Registration using Gaussian Process
Authors	Siming Bayer, Ute Spiske, Jie Luo, Tobias Geimer, William M. Wells III, Martin Ostermeier, Rebecca Fahrig, Arya Nabavi, Christoph Bert, Ilker Eyupoglo, Andreas Maier
Abstract	For a wide range of clinical applications, such as adaptive treatment planning or intraoperative image update, feature-based deformable registration (FDR) approaches are widely employed because of their simplicity and low computational complexity. FDR algorithms estimate a dense displacement field by interpolating a sparse field, which is given by the established correspondence between selected features. In this paper, we consider the deformation field as a Gaussian Process (GP), whereas the selected features are regarded as prior information on the valid deformations. Using GP, we are able to estimate both the dense displacement field and a corresponding uncertainty map at once. Furthermore, we evaluated the performance of different hyperparameter settings for squared exponential kernels with synthetic, phantom and clinical data respectively. The quantitative comparison shows that GP-based interpolation has performance on par with state-of-the-art B-spline interpolation. The greatest clinical benefit of GP-based interpolation is that it gives a reliable estimate of the mathematical uncertainty of the calculated dense displacement map.
Tasks	Image Registration
Published	2020-01-12
URL	https://arxiv.org/abs/2001.05862v1
PDF	https://arxiv.org/pdf/2001.05862v1.pdf
PWC	https://paperswithcode.com/paper/an-investigation-of-feature-based-nonrigid
Repo
Framework
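
The idea of interpolating a sparse displacement field with a Gaussian Process, and getting an uncertainty map at the same time, can be sketched with textbook GP regression. This is an illustrative sketch rather than the authors' implementation: the feature positions, displacements, and the hyperparameters `length_scale`, `signal_var`, and `noise_var` are all assumed values.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared exponential kernel between two sets of points (rows are positions)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

def gp_interpolate(x_feat, disp, x_query, length_scale=1.0, signal_var=1.0, noise_var=1e-6):
    """Posterior mean displacement and variance at the query positions."""
    K = sq_exp_kernel(x_feat, x_feat, length_scale, signal_var)
    K += noise_var * np.eye(len(x_feat))          # small noise term for numerical stability
    K_s = sq_exp_kernel(x_feat, x_query, length_scale, signal_var)
    mean = K_s.T @ np.linalg.solve(K, disp)       # one column per displacement component
    var = signal_var - np.einsum("ij,ij->j", K_s, np.linalg.solve(K, K_s))
    return mean, np.clip(var, 0.0, None)

# Hypothetical sparse features (2D positions) and their displacement vectors.
x_feat = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
disp = np.array([[0.1, 0.0], [0.2, 0.0], [0.1, 0.1], [0.2, 0.1]])
x_query = np.array([[0.0, 0.0], [10.0, 10.0]])    # one on a feature, one far away

mean, var = gp_interpolate(x_feat, disp, x_query)
```

At the query that coincides with a feature the variance is close to zero, while far from any feature it returns to the prior variance; that behaviour is what makes the variance usable as an uncertainty map.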

Energy-efficient and Robust Cumulative Training with Net2Net Transformation


Title	Energy-efficient and Robust Cumulative Training with Net2Net Transformation
Authors	Aosong Feng, Priyadarshini Panda
Abstract	Deep learning has achieved state-of-the-art accuracies on several computer vision tasks. However, the computational and energy requirements associated with training such deep neural networks can be quite high. In this paper, we propose a cumulative training strategy with Net2Net transformation that achieves training computational efficiency without incurring large accuracy loss, in comparison to a model trained from scratch. We achieve this by first training a small network (with fewer parameters) on a small subset of the original dataset, and then gradually expanding the network using Net2Net transformation to train incrementally on larger subsets of the dataset. This incremental training strategy with Net2Net utilizes function-preserving transformations that transfer knowledge from each previous small network to the next larger network, thereby reducing the overall training complexity. Our experiments demonstrate that compared with training from scratch, cumulative training yields ~2x reduction in computational complexity for training TinyImageNet using VGG19 at iso-accuracy. Besides training efficiency, a key advantage of our cumulative training strategy is that we can perform pruning during Net2Net expansion to obtain a final network with optimal configuration (~0.4x lower inference compute complexity) compared to conventional training from scratch. We also demonstrate that the final network obtained from cumulative training yields better generalization performance and noise robustness. Further, we show that mutual inference from all the networks created with cumulative Net2Net expansion enables improved adversarial input detection.
Tasks
Published	2020-03-02
URL	https://arxiv.org/abs/2003.01204v1
PDF	https://arxiv.org/pdf/2003.01204v1.pdf
PWC	https://paperswithcode.com/paper/energy-efficient-and-robust-cumulative
Repo
Framework
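
The function-preserving widening that the abstract refers to can be demonstrated on a toy network. The sketch below assumes a two-layer ReLU MLP with made-up sizes and is not taken from the paper's code: it widens the hidden layer by copying existing units and dividing each copied unit's outgoing weights by its number of copies, so the wider network computes the same outputs.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Two-layer MLP with a ReLU hidden layer."""
    h = np.maximum(x @ W1 + b1, 0.0)
    return h @ W2 + b2

def net2wider(W1, b1, W2, new_width, rng):
    """Widen the hidden layer to new_width units without changing the function."""
    old_width = W1.shape[1]
    # Keep every original unit, then fill the extra slots with random copies.
    extra = rng.integers(0, old_width, new_width - old_width)
    mapping = np.concatenate([np.arange(old_width), extra])
    counts = np.bincount(mapping, minlength=old_width)   # copies per original unit
    W1_new = W1[:, mapping]
    b1_new = b1[mapping]
    # Split each unit's outgoing weights evenly across its copies.
    W2_new = W2[mapping, :] / counts[mapping][:, None]
    return W1_new, b1_new, W2_new

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=3)     # hypothetical small network
W2, b2 = rng.normal(size=(3, 2)), rng.normal(size=2)
x = rng.normal(size=(5, 4))

W1_w, b1_w, W2_w = net2wider(W1, b1, W2, new_width=6, rng=rng)
same_output = np.allclose(forward(x, W1, b1, W2, b2), forward(x, W1_w, b1_w, W2_w, b2))
```

Because the widened network starts from the same function, training can continue on a larger data subset without discarding what the smaller network has learned.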

Modeling Climate Change Impact on Wind Power Resources Using Adaptive Neuro-Fuzzy Inference System


Title	Modeling Climate Change Impact on Wind Power Resources Using Adaptive Neuro-Fuzzy Inference System
Authors	Narjes Nabipour, Amir Mosavi, Eva Hajnal, Laszlo Nadai, Shahab Shamshirband, Kwok-Wing Chau
Abstract	Climate change impacts and adaptations are ongoing issues that attract the attention of many researchers. Insight into the wind power potential in an area and its probable variation due to climate change impacts can provide useful information for energy policymakers and strategists for sustainable development and management of the energy. In this study, spatial variation of wind power density at the turbine hub-height and its variability under future climatic scenarios are taken into consideration. An ANFIS-based post-processing technique was employed to match the power outputs of the regional climate model with those obtained from the reference data. The near-surface wind data obtained from a regional climate model are employed to investigate climate change impacts on the wind power resources in the Caspian Sea. Subsequent to converting near-surface wind speed to turbine hub-height speed and computation of wind power density, the results have been investigated to reveal mean annual power, seasonal, and monthly variability for a 20-year period in the present (1981-2000) and in the future (2081-2100). The findings of this study indicated that the middle and northern parts of the Caspian Sea have the highest values of wind power. However, the results of the post-processing technique using the adaptive neuro-fuzzy inference system (ANFIS) model showed that the real potential of the wind power in the area is lower than that projected by the regional climate model.
Tasks
Published	2020-01-09
URL	https://arxiv.org/abs/2001.04279v1
PDF	https://arxiv.org/pdf/2001.04279v1.pdf
PWC	https://paperswithcode.com/paper/modeling-climate-change-impact-on-wind-power
Repo
Framework
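
The two computation steps named in the abstract, converting near-surface wind speed to hub-height speed and computing wind power density, can be sketched as follows. The abstract does not say which extrapolation law was used, so the power-law profile with exponent 1/7, the 10 m and 80 m heights, and the sea-level air density of 1.225 kg/m³ are all assumptions made for illustration.

```python
def hub_height_speed(v_ref, z_ref=10.0, z_hub=80.0, alpha=1.0 / 7.0):
    """Extrapolate wind speed (m/s) from height z_ref to z_hub with a power-law profile.

    The power law and its exponent are assumptions, not taken from the paper.
    """
    return v_ref * (z_hub / z_ref) ** alpha

def wind_power_density(speed, air_density=1.225):
    """Wind power density in W/m^2: 0.5 * rho * v^3."""
    return 0.5 * air_density * speed ** 3

v_hub = hub_height_speed(6.0)        # hypothetical 6 m/s measured at 10 m
wpd = wind_power_density(v_hub)
```

Because power density grows with the cube of wind speed, small biases in the modeled speed translate into large differences in estimated power, which is one reason a post-processing correction step matters.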

Face Verification Using 60 GHz 802.11 waveforms


Title	Face Verification Using 60 GHz 802.11 waveforms
Authors	Eran Hof, Amichai Sanderovich, Evyatar Hemo
Abstract	Verification of an identity based on the human face radar signature in mmWave is studied. A chipset for 802.11ad/y networking that is capable of operating in a radar mode is used. A dataset with faces of 200 different persons was collected for the testing. Our preliminary study shows promising results for the application of an autoencoder to the setup at hand.
Tasks	Face Verification
Published	2020-02-27
URL	https://arxiv.org/abs/2002.11965v1
PDF	https://arxiv.org/pdf/2002.11965v1.pdf
PWC	https://paperswithcode.com/paper/face-verification-using-60ghz-80211-waveforms
Repo
Framework

On the Integration of Linguistic Features into Statistical and Neural Machine Translation


Title	On the Integration of Linguistic Features into Statistical and Neural Machine Translation
Authors	Eva Vanmassenhove
Abstract	New machine translation (MT) technologies are emerging rapidly and with them, bold claims of achieving human parity such as: (i) the results produced approach “accuracy achieved by average bilingual human translators” (Wu et al., 2017b) or (ii) the “translation quality is at human parity when compared to professional human translators” (Hassan et al., 2018) have seen the light of day (Laubli et al., 2018). Aside from the fact that many of these papers craft their own definition of human parity, these sensational claims are often not supported by a complete analysis of all aspects involved in translation. Establishing the discrepancies between the strengths of statistical approaches to MT and the way humans translate has been the starting point of our research. By looking at MT output and linguistic theory, we were able to identify some remaining issues. The problems range from simple number and gender agreement errors to more complex phenomena such as the correct translation of aspectual values and tenses. Our experiments confirm, along with other studies (Bentivogli et al., 2016), that neural MT has surpassed statistical MT in many aspects. However, some problems remain and others have emerged. We cover a series of problems related to the integration of specific linguistic features into statistical and neural MT, aiming to analyse and provide a solution to some of them. Our work focuses on addressing three main research questions that revolve around the complex relationship between linguistics and MT in general. We identify linguistic information that is lacking in order for automatic translation systems to produce more accurate translations and integrate additional features into the existing pipelines. We identify overgeneralization or ‘algorithmic bias’ as a potential drawback of neural MT and link it to many of the remaining linguistic issues.
Tasks	Machine Translation
Published	2020-03-31
URL	https://arxiv.org/abs/2003.14324v1
PDF	https://arxiv.org/pdf/2003.14324v1.pdf
PWC	https://paperswithcode.com/paper/on-the-integration-of-linguisticfeatures-into
Repo
Framework