January 31, 2020

3277 words 16 mins read

Paper Group ANR 113

4D X-Ray CT Reconstruction using Multi-Slice Fusion. A blockchain-orchestrated Federated Learning architecture for healthcare consortia. Painting with baryons: augmenting N-body simulations with gas using deep generative models. Simultaneously Learning Architectures and Features of Deep Neural Networks. Gating Revisited: Deep Multi-layer RNNs That …

4D X-Ray CT Reconstruction using Multi-Slice Fusion

Title 4D X-Ray CT Reconstruction using Multi-Slice Fusion
Authors Soumendu Majee, Thilo Balke, Craig A. J. Kemp, Gregery T. Buzzard, Charles A. Bouman
Abstract There is an increasing need to reconstruct objects in four or more dimensions corresponding to space, time and other independent parameters. The best 4D reconstruction algorithms use regularized iterative reconstruction approaches such as model based iterative reconstruction (MBIR), which depends critically on the quality of the prior modeling. Recently, Plug-and-Play methods have been shown to be an effective way to incorporate advanced prior models using state-of-the-art denoising algorithms designed to remove additive white Gaussian noise (AWGN). However, state-of-the-art denoising algorithms such as BM4D and deep convolutional neural networks (CNNs) are primarily available for 2D and sometimes 3D images. In particular, CNNs are difficult and computationally expensive to implement in four or more dimensions, and training may be impossible if there is no associated high-dimensional training data. In this paper, we present Multi-Slice Fusion, a novel algorithm for 4D and higher-dimensional reconstruction, based on the fusion of multiple low-dimensional denoisers. Our approach uses multi-agent consensus equilibrium (MACE), an extension of Plug-and-Play, as a framework for integrating the multiple lower-dimensional prior models. We apply our method to the problem of 4D cone-beam X-ray CT reconstruction for Non Destructive Evaluation (NDE) of moving parts. This is done by solving the MACE equations using lower-dimensional CNN denoisers implemented in parallel on a heterogeneous cluster. Results on experimental CT data demonstrate that Multi-Slice Fusion can substantially improve the quality of reconstructions relative to traditional 4D priors, while also being practical to implement and train.
Tasks Denoising
Published 2019-06-15
URL https://arxiv.org/abs/1906.06601v1
PDF https://arxiv.org/pdf/1906.06601v1.pdf
PWC https://paperswithcode.com/paper/4d-x-ray-ct-reconstruction-using-multi-slice
Repo
Framework
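
The reconstruction above hinges on multi-agent consensus equilibrium (MACE). Below is a minimal, illustrative MACE loop in NumPy, assuming toy agents (axis-wise smoothers and a quadratic data-fit proximal map with an identity forward model) in place of the paper's CNN denoisers and CT forward model; it only shows the Mann-iteration structure, not the authors' method.

```python
import numpy as np

def smooth_along(x, axis):
    """Toy low-dimensional prior agent: 3-tap moving average along one axis."""
    return (np.roll(x, -1, axis) + x + np.roll(x, 1, axis)) / 3.0

def data_fit(v, y, sigma2=0.1):
    """Proximal map of 0.5*||x - y||^2 / sigma2 (identity forward model)."""
    return (v + y / sigma2) / (1.0 + 1.0 / sigma2)

def mace(y, n_iter=50, rho=0.5):
    agents = [lambda v: data_fit(v, y),
              lambda v: smooth_along(v, 0),    # "prior" along axis 0
              lambda v: smooth_along(v, 1)]    # "prior" along axis 1
    mu = np.full(len(agents), 1.0 / len(agents))           # agent weights
    W = np.stack([y.copy() for _ in agents])               # one state per agent
    for _ in range(n_iter):
        FW = np.stack([f(w) for f, w in zip(agents, W)])   # apply all agents
        Z = 2.0 * FW - W                                   # (2F - I) W
        zbar = np.tensordot(mu, Z, axes=1)                 # weighted consensus
        GZ = np.broadcast_to(zbar, W.shape)
        W = (1.0 - rho) * W + rho * (2.0 * GZ - Z)         # Mann update
    return np.tensordot(mu, np.stack([f(w) for f, w in zip(agents, W)]), axes=1)

recon = mace(np.random.rand(32, 32))
```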

A blockchain-orchestrated Federated Learning architecture for healthcare consortia

Title A blockchain-orchestrated Federated Learning architecture for healthcare consortia
Authors Jonathan Passerat-Palmbach, Tyler Farnan, Robert Miller, Marielle S. Gross, Heather Leigh Flannery, Bill Gleim
Abstract We propose a novel architecture for federated learning within healthcare consortia. At the heart of the solution is a unique integration of privacy preserving technologies, built upon native enterprise blockchain components available in the Ethereum ecosystem. We show how the specific characteristics and challenges of healthcare consortia informed our design choices, notably the conception of a new Secure Aggregation protocol assembled with a protected hardware component and an encryption toolkit native to Ethereum. Our architecture also brings in a privacy preserving audit trail that logs events in the network without revealing identities.
Tasks
Published 2019-10-12
URL https://arxiv.org/abs/1910.12603v1
PDF https://arxiv.org/pdf/1910.12603v1.pdf
PWC https://paperswithcode.com/paper/a-blockchain-orchestrated-federated-learning
Repo
Framework
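
For context on the Secure Aggregation component mentioned above, here is a toy NumPy sketch of the additive-masking idea behind secure aggregation: each client's update is hidden under pairwise masks that cancel in the sum, so the aggregator only learns the aggregate. This is a heavy simplification; the paper's protocol additionally relies on trusted hardware and Ethereum-native encryption, and real protocols derive the masks from pairwise shared secrets rather than a central generator.

```python
import numpy as np

def pairwise_masks(n_clients, dim, seed=0):
    """mask[i] = sum_j s_ij - sum_j s_ji, so that sum_i mask[i] == 0."""
    rng = np.random.default_rng(seed)
    s = rng.normal(size=(n_clients, n_clients, dim))
    return s.sum(axis=1) - s.sum(axis=0)

def secure_aggregate(updates):
    updates = np.asarray(updates)
    masks = pairwise_masks(*updates.shape)
    masked = updates + masks                      # what each client actually reveals
    return masked.sum(axis=0) / len(updates)      # masks cancel, mean is recovered

clients = [np.random.randn(10) for _ in range(5)]
assert np.allclose(secure_aggregate(clients), np.mean(clients, axis=0))
```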

Painting with baryons: augmenting N-body simulations with gas using deep generative models

Title Painting with baryons: augmenting N-body simulations with gas using deep generative models
Authors Tilman Tröster, Cameron Ferguson, Joachim Harnois-Déraps, Ian G. McCarthy
Abstract Running hydrodynamical simulations to produce mock data of large-scale structure and baryonic probes, such as the thermal Sunyaev-Zeldovich (tSZ) effect, at cosmological scales is computationally challenging. We propose to leverage the expressive power of deep generative models to find an effective description of the large-scale gas distribution and temperature. We train two deep generative models, a variational auto-encoder and a generative adversarial network, on pairs of matter density and pressure slices from the BAHAMAS hydrodynamical simulation. The trained models are able to successfully map matter density to the corresponding gas pressure. We then apply the trained models to 100 lines-of-sight from SLICS, a suite of N-body simulations optimised for weak lensing covariance estimation, to generate maps of the tSZ effect. The generated tSZ maps are found to be statistically consistent with those from BAHAMAS. We conclude by considering a specific observable, the angular cross-power spectrum between the weak lensing convergence and the tSZ effect and its variance, where we find excellent agreement between the predictions from BAHAMAS and SLICS, thus enabling the use of SLICS for tSZ covariance estimation.
Tasks
Published 2019-03-28
URL https://arxiv.org/abs/1903.12173v2
PDF https://arxiv.org/pdf/1903.12173v2.pdf
PWC https://paperswithcode.com/paper/painting-with-baryons-augmenting-n-body
Repo
Framework
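
A rough sketch of the final observable discussed in the abstract above: the angular cross-power spectrum between two flat-sky maps (e.g. weak-lensing convergence and a generated tSZ map), estimated with an FFT and radial binning. The map size, pixel scale and binning below are placeholder choices, not the paper's analysis settings.

```python
import numpy as np

def cross_power_spectrum(map_a, map_b, pixel_size_rad, n_bins=20):
    """Flat-sky cross-spectrum C_ell between two square maps, binned in ell."""
    n = map_a.shape[0]
    fa, fb = np.fft.fft2(map_a), np.fft.fft2(map_b)
    cross = (fa * np.conj(fb)).real * pixel_size_rad**2 / (n * n)   # per-mode power
    freq = np.fft.fftfreq(n)
    ell = 2.0 * np.pi * np.sqrt(np.add.outer(freq**2, freq**2)) / pixel_size_rad
    bins = np.linspace(ell[ell > 0].min(), ell.max(), n_bins + 1)
    idx = np.digitize(ell.ravel(), bins)
    cl = np.array([cross.ravel()[idx == i].mean() for i in range(1, n_bins + 1)])
    return 0.5 * (bins[1:] + bins[:-1]), cl

# Example: two random 128x128 maps with 1 arcmin pixels (placeholder inputs).
ells, cl = cross_power_spectrum(np.random.randn(128, 128), np.random.randn(128, 128),
                                pixel_size_rad=np.radians(1.0 / 60.0))
```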

Simultaneously Learning Architectures and Features of Deep Neural Networks

Title Simultaneously Learning Architectures and Features of Deep Neural Networks
Authors Tinghuai Wang, Lixin Fan, Huiling Wang
Abstract This paper presents a novel method which simultaneously learns the number of filters and network features repeatedly over multiple epochs. We propose a novel pruning loss that explicitly forces the optimizer to focus on promising candidate filters while suppressing the contributions of less relevant ones. In the meantime, we further propose to enforce diversity between filters; this diversity-based regularization term improves the trade-off between model sizes and accuracies. It turns out that the interplay between architecture and feature optimizations improves the final compressed models, and the proposed method compares favorably to existing methods in terms of both model sizes and accuracies for a wide range of applications including image classification, image compression and audio classification.
Tasks Audio Classification, Image Classification, Image Compression
Published 2019-06-11
URL https://arxiv.org/abs/1906.04505v1
PDF https://arxiv.org/pdf/1906.04505v1.pdf
PWC https://paperswithcode.com/paper/simultaneously-learning-architectures-and
Repo
Framework
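
A hypothetical PyTorch rendering of the two loss terms described above: a pruning term that drives per-filter gates toward a target budget while pushing them to 0/1, and a diversity term that penalises pairwise similarity between filters. The exact functional forms are assumptions for illustration, not the paper's definitions.

```python
import torch

def pruning_loss(gates, keep_ratio=0.5):
    """Push per-filter gates toward 0/1 while keeping roughly `keep_ratio` alive."""
    sparsity = gates.mean() - keep_ratio          # deviation from the target budget
    bimodal = (gates * (1.0 - gates)).mean()      # small when gates are near 0 or 1
    return sparsity.pow(2) + bimodal

def diversity_loss(filters):
    """Penalise pairwise cosine similarity between flattened conv filters."""
    f = filters.flatten(1)                                   # (n_filters, -1)
    f = torch.nn.functional.normalize(f, dim=1)
    gram = f @ f.t()
    off_diag = gram - torch.eye(len(f), device=f.device)
    return off_diag.pow(2).sum() / (len(f) * (len(f) - 1))

conv = torch.nn.Conv2d(3, 16, 3)
gates = torch.sigmoid(torch.zeros(16, requires_grad=True))   # per-filter gates
loss = pruning_loss(gates) + 0.1 * diversity_loss(conv.weight)
loss.backward()
```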

Gating Revisited: Deep Multi-layer RNNs That Can Be Trained

Title Gating Revisited: Deep Multi-layer RNNs That Can Be Trained
Authors Mehmet Ozgur Turkoglu, Stefano D’Aronco, Jan Dirk Wegner, Konrad Schindler
Abstract We propose a new stackable recurrent cell (STAR) for recurrent neural networks (RNNs) that has significantly fewer parameters than the widely used LSTM and GRU cells while being more robust against vanishing or exploding gradients. Stacking multiple layers of recurrent units has two major drawbacks: i) many recurrent cells (e.g., LSTM cells) are extremely demanding in terms of parameters and computational resources; ii) deep RNNs are prone to vanishing or exploding gradients during training. We investigate the training of multi-layer RNNs and examine the magnitude of the gradients as they propagate through the network in the “vertical” direction. We show that, depending on the structure of the basic recurrent unit, the gradients are systematically attenuated or amplified. Based on our analysis we design a new type of gated cell that better preserves gradient magnitude. We validate our design on a large number of sequence modelling tasks and demonstrate that the proposed STAR cell allows deeper recurrent architectures to be built and trained, ultimately leading to improved performance while being computationally efficient.
Tasks
Published 2019-11-25
URL https://arxiv.org/abs/1911.11033v1
PDF https://arxiv.org/pdf/1911.11033v1.pdf
PWC https://paperswithcode.com/paper/gating-revisited-deep-multi-layer-rnns-that-1
Repo
Framework
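
The abstract above motivates a lightweight single-gate cell. The sketch below is a plausible single-gate stackable cell in that spirit, written in PyTorch: one gate blends the previous hidden state with a candidate computed from the input, using far fewer parameters than an LSTM. The exact STAR equations may differ in detail, so treat this as an approximation rather than a reference implementation.

```python
import torch
import torch.nn as nn

class StarLikeCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.candidate = nn.Linear(input_size, hidden_size)           # z_t from x_t only
        self.gate = nn.Linear(input_size + hidden_size, hidden_size)  # k_t from x_t and h_{t-1}

    def forward(self, x, h):
        z = torch.tanh(self.candidate(x))
        k = torch.sigmoid(self.gate(torch.cat([x, h], dim=-1)))
        return torch.tanh((1.0 - k) * h + k * z)   # convex blend keeps gradient magnitudes tame

cell = StarLikeCell(input_size=32, hidden_size=64)
h = torch.zeros(8, 64)                             # batch of 8 sequences
for x in torch.randn(20, 8, 32):                   # 20 time steps
    h = cell(x, h)
```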

Revisiting EmbodiedQA: A Simple Baseline and Beyond

Title Revisiting EmbodiedQA: A Simple Baseline and Beyond
Authors Yu Wu, Lu Jiang, Yi Yang
Abstract In Embodied Question Answering (EmbodiedQA), an agent interacts with an environment to gather the information necessary for answering user questions. Existing works have laid a solid foundation towards solving this interesting problem. But the current performance, especially in navigation, suggests that EmbodiedQA might be too challenging for current approaches. In this paper, we empirically study this problem and introduce 1) a simple yet effective baseline that can be end-to-end optimized by SGD; 2) an easier and practical setting for EmbodiedQA where an agent has a chance to adapt the trained model to a new environment before it actually answers user questions. In the new setting, we randomly place a few objects in new environments and upgrade the agent policy with a distillation network to retain the generalization ability of the trained model. On the EmbodiedQA v1 benchmark, under the standard setting, our simple baseline achieves results very competitive with the state of the art; in the new setting, we find that this small change in the setting yields a notable gain in navigation.
Tasks Embodied Question Answering, Question Answering
Published 2019-04-08
URL http://arxiv.org/abs/1904.04166v1
PDF http://arxiv.org/pdf/1904.04166v1.pdf
PWC https://paperswithcode.com/paper/revisiting-embodiedqa-a-simple-baseline-and
Repo
Framework
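
A sketch of the adaptation-by-distillation idea in the abstract above: when fine-tuning the policy in a new environment, a frozen copy of the originally trained model acts as a teacher, and a KL term keeps the adapted policy close to it so generalization is retained. The shapes, stand-in policy head and loss weight are illustrative assumptions, not the paper's architecture.

```python
import copy
import torch
import torch.nn.functional as F

student = torch.nn.Linear(128, 6)            # stand-in policy head (6 actions)
teacher = copy.deepcopy(student).eval()      # frozen copy of the originally trained model
for p in teacher.parameters():
    p.requires_grad_(False)

def adaptation_loss(features, actions, distill_weight=1.0):
    logits = student(features)
    with torch.no_grad():
        teacher_logits = teacher(features)
    task = F.cross_entropy(logits, actions)                     # fit the new environment
    distill = F.kl_div(F.log_softmax(logits, dim=-1),           # stay close to the teacher
                       F.softmax(teacher_logits, dim=-1),
                       reduction="batchmean")
    return task + distill_weight * distill

loss = adaptation_loss(torch.randn(16, 128), torch.randint(0, 6, (16,)))
loss.backward()
```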

Embodied Question Answering in Photorealistic Environments with Point Cloud Perception

Title Embodied Question Answering in Photorealistic Environments with Point Cloud Perception
Authors Erik Wijmans, Samyak Datta, Oleksandr Maksymets, Abhishek Das, Georgia Gkioxari, Stefan Lee, Irfan Essa, Devi Parikh, Dhruv Batra
Abstract To help bridge the gap between internet vision-style problems and the goal of vision for embodied perception, we instantiate a large-scale navigation task – Embodied Question Answering [1] in photo-realistic environments (Matterport 3D). We thoroughly study navigation policies that utilize 3D point clouds, RGB images, or their combination. Our analysis of these models reveals several key findings. We find that two seemingly naive navigation baselines, forward-only and random, are strong navigators and challenging to outperform, due to the specific choice of the evaluation setting presented by [1]. We find a novel loss-weighting scheme we call Inflection Weighting to be important when training recurrent models for navigation with behavior cloning, and we are able to outperform the baselines with this technique. We find that point clouds provide a richer signal than RGB images for learning obstacle avoidance, motivating the use (and continued study) of 3D deep learning models for embodied navigation.
Tasks Embodied Question Answering, Question Answering
Published 2019-04-06
URL http://arxiv.org/abs/1904.03461v1
PDF http://arxiv.org/pdf/1904.03461v1.pdf
PWC https://paperswithcode.com/paper/embodied-question-answering-in-photorealistic
Repo
Framework
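
The loss below is a guess at how the Inflection Weighting scheme named above could look in code: in a behavior-cloning loss, time steps where the expert's action changes ("inflections") are up-weighted relative to steps that repeat the previous action. The weight value and tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def inflection_weighted_loss(logits, expert_actions, inflection_weight=10.0):
    """logits: (T, n_actions); expert_actions: (T,) ground-truth action indices."""
    per_step = F.cross_entropy(logits, expert_actions, reduction="none")
    changed = torch.ones_like(expert_actions, dtype=torch.bool)
    changed[1:] = expert_actions[1:] != expert_actions[:-1]     # inflection points
    weights = 1.0 + (inflection_weight - 1.0) * changed.float()
    return (weights * per_step).sum() / weights.sum()

loss = inflection_weighted_loss(torch.randn(50, 4, requires_grad=True),
                                torch.randint(0, 4, (50,)))
loss.backward()
```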

DeCaFA: Deep Convolutional Cascade for Face Alignment In The Wild

Title DeCaFA: Deep Convolutional Cascade for Face Alignment In The Wild
Authors Arnaud Dapogny, Kévin Bailly, Matthieu Cord
Abstract Face alignment is an active computer vision domain that consists of localizing a number of facial landmarks that vary across datasets. State-of-the-art face alignment methods either perform end-to-end regression, or refine the shape in a cascaded manner, starting from an initial guess. In this paper, we introduce DeCaFA, an end-to-end deep convolutional cascade architecture for face alignment. DeCaFA uses fully-convolutional stages to keep full spatial resolution throughout the cascade. Between consecutive cascade stages, DeCaFA uses multiple chained transfer layers with spatial softmax to produce landmark-wise attention maps for each of several landmark alignment tasks. Weighted intermediate supervision, as well as efficient feature fusion between the stages, allows the network to learn to progressively refine the attention maps in an end-to-end manner. We show experimentally that DeCaFA significantly outperforms existing approaches on the 300W, CelebA and WFLW databases. In addition, we show that DeCaFA can learn fine alignment with reasonable accuracy from very few images using coarsely annotated data.
Tasks Face Alignment
Published 2019-04-04
URL http://arxiv.org/abs/1904.02549v1
PDF http://arxiv.org/pdf/1904.02549v1.pdf
PWC https://paperswithcode.com/paper/decafa-deep-convolutional-cascade-for-face
Repo
Framework
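
A small sketch of the spatial-softmax step mentioned above: a per-landmark feature map is normalised into an attention map that sums to one, from which a landmark location can be read off as the attention-weighted ("soft-argmax") pixel coordinate. The tensor shapes and the soft-argmax read-out are assumptions for illustration, not DeCaFA's exact layers.

```python
import torch

def spatial_softmax(feature_maps):
    """feature_maps: (batch, n_landmarks, H, W) -> attention maps of the same shape."""
    b, k, h, w = feature_maps.shape
    flat = feature_maps.view(b, k, h * w)
    return torch.softmax(flat, dim=-1).view(b, k, h, w)

def soft_argmax(attention):
    b, k, h, w = attention.shape
    ys = torch.linspace(0, 1, h).view(1, 1, h, 1)
    xs = torch.linspace(0, 1, w).view(1, 1, 1, w)
    y = (attention * ys).sum(dim=(2, 3))      # expected row coordinate in [0, 1]
    x = (attention * xs).sum(dim=(2, 3))      # expected column coordinate in [0, 1]
    return torch.stack([x, y], dim=-1)        # (batch, n_landmarks, 2)

maps = torch.randn(2, 68, 64, 64)             # e.g. 68 landmarks on a 64x64 grid
landmarks = soft_argmax(spatial_softmax(maps))
```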

Colorectal cancer diagnosis from histology images: A comparative study

Title Colorectal cancer diagnosis from histology images: A comparative study
Authors Junaid Malik, Serkan Kiranyaz, Suchitra Kunhoth, Turker Ince, Somaya Al-Maadeed, Ridha Hamila, Moncef Gabbouj
Abstract Computer-aided diagnosis (CAD) based on histopathological imaging has progressed rapidly in recent years with the rise of machine learning based methodologies. Traditional approaches consist of training a classification model using features extracted from the images, based on textures or morphological properties. Recently, deep-learning based methods have been applied directly to the raw (unprocessed) data. However, their usability is impacted by the paucity of annotated data in the biomedical sector. In order to leverage the learning capabilities of deep Convolutional Neural Nets (CNNs) within the confines of limited labelled data, in this study we investigate transfer learning approaches that aim to apply the knowledge gained from solving a source (e.g., non-medical) problem to learn better predictive models for the target (e.g., biomedical) task. As an alternative, we further propose a new adaptive and compact CNN based architecture that can be trained from scratch even on scarce and low-resolution data. Moreover, we conduct quantitative comparative evaluations among the traditional methods, transfer learning-based methods and the proposed adaptive approach for the particular task of cancer detection and identification from scarce and low-resolution histology images. Over the largest benchmark dataset formed for this purpose, the proposed adaptive approach achieved a significantly higher cancer detection accuracy, whereas the deep CNNs with transfer learning achieved superior cancer identification.
Tasks Transfer Learning
Published 2019-03-27
URL http://arxiv.org/abs/1903.11210v2
PDF http://arxiv.org/pdf/1903.11210v2.pdf
PWC https://paperswithcode.com/paper/colorectal-cancer-diagnosis-from-histology
Repo
Framework
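
A generic transfer-learning baseline of the kind compared in the study above: take an ImageNet-pretrained CNN, freeze its convolutional backbone, and retrain only a new classification head on the (small) histology dataset. The model choice, class count and data are placeholders, not the paper's exact setup.

```python
import torch
import torchvision

n_classes = 2                                       # e.g. cancer vs. non-cancer (assumed)
model = torchvision.models.resnet18(weights="IMAGENET1K_V1")   # ImageNet-pretrained backbone
for param in model.parameters():
    param.requires_grad = False                     # freeze the transferred features
model.fc = torch.nn.Linear(model.fc.in_features, n_classes)    # new trainable head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = torch.nn.CrossEntropyLoss()

images = torch.randn(8, 3, 224, 224)                # stand-in for a histology batch
labels = torch.randint(0, n_classes, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```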

Safe Policies for Reinforcement Learning via Primal-Dual Methods

Title Safe Policies for Reinforcement Learning via Primal-Dual Methods
Authors Santiago Paternain, Miguel Calvo-Fullana, Luiz F. O. Chamon, Alejandro Ribeiro
Abstract In this paper, we study the learning of safe policies in the setting of reinforcement learning problems. That is, we aim to control a Markov Decision Process (MDP) of which we do not know the transition probabilities, but we have access to sample trajectories through experience. We define safety as the agent remaining in a desired safe set with high probability during the operation time. We therefore consider a constrained MDP where the constraints are probabilistic. Since there is no straightforward way to optimize the policy with respect to the probabilistic constraint in a reinforcement learning framework, we propose an ergodic relaxation of the problem. The advantages of the proposed relaxation are threefold. (i) The safety guarantees are maintained in the case of episodic tasks and they are kept up to a given time horizon for continuing tasks. (ii) The constrained optimization problem, despite its non-convexity, has an arbitrarily small duality gap if the parametrization of the policy is rich enough. (iii) The gradients of the Lagrangian associated with the safe-learning problem can be easily computed using standard policy gradient results and stochastic approximation tools. Leveraging these advantages, we establish that primal-dual algorithms are able to find policies that are safe and optimal. We test the proposed approach in a navigation task in a continuous domain. The numerical results show that our algorithm is capable of dynamically adapting the policy to the environment and the required safety levels.
Tasks
Published 2019-11-20
URL https://arxiv.org/abs/1911.09101v1
PDF https://arxiv.org/pdf/1911.09101v1.pdf
PWC https://paperswithcode.com/paper/safe-policies-for-reinforcement-learning-via
Repo
Framework
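
A bare-bones primal-dual loop for the constrained problem described above: maximise expected return subject to Prob(staying in the safe set) >= 1 - delta, alternating a policy-gradient ascent step on the Lagrangian with a projected descent step on the multiplier. The 1-D Gaussian-policy toy problem below is an illustration, not the paper's navigation task.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, lam = 2.0, 0.0                      # policy mean, dual variable
delta, lr_theta, lr_lam = 0.1, 0.05, 0.05

def rollout(theta, n=64):
    """Toy 'MDP': action a ~ N(theta, 1); reward favours large a, safe iff a <= 1."""
    a = rng.normal(theta, 1.0, size=n)
    reward = a
    safe = (a <= 1.0).astype(float)        # indicator that the safety set was respected
    score = a - theta                      # d/dtheta of log N(a; theta, 1)
    return reward, safe, score

for _ in range(500):
    r, s, g = rollout(theta)
    payoff = r + lam * s                                     # per-sample Lagrangian payoff
    theta += lr_theta * np.mean(payoff * g)                  # REINFORCE ascent on theta
    lam = max(0.0, lam - lr_lam * (np.mean(s) - (1.0 - delta)))   # projected dual descent

print(f"theta={theta:.2f}, lam={lam:.2f}, Prob(safe)~{rollout(theta, 4000)[1].mean():.2f}")
```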

Linear interpolation gives better gradients than Gaussian smoothing in derivative-free optimization

Title Linear interpolation gives better gradients than Gaussian smoothing in derivative-free optimization
Authors Albert S Berahas, Liyuan Cao, Krzysztof Choromanski, Katya Scheinberg
Abstract In this paper, we consider derivative free optimization problems, where the objective function is smooth but is computed with some amount of noise, the function evaluations are expensive and no derivative information is available. We are motivated by policy optimization problems in reinforcement learning that have recently become popular [Choromaski et al. 2018; Fazel et al. 2018; Salimans et al. 2016], and that can be formulated as derivative free optimization problems with the aforementioned characteristics. In each of these works some approximation of the gradient is constructed and a (stochastic) gradient method is applied. In [Salimans et al. 2016] the gradient information is aggregated along Gaussian directions, while in [Choromaski et al. 2018] it is computed along orthogonal directions. We provide a convergence rate analysis for a first-order line search method, similar to the ones used in the literature, and derive the conditions on the gradient approximations that ensure this convergence. We then demonstrate via rigorous analysis of the variance and by numerical comparisons on reinforcement learning tasks that the Gaussian sampling method used in [Salimans et al. 2016] is significantly inferior to the orthogonal sampling used in [Choromaski et al. 2018] as well as more general interpolation methods.
Tasks
Published 2019-05-29
URL https://arxiv.org/abs/1905.13043v2
PDF https://arxiv.org/pdf/1905.13043v2.pdf
PWC https://paperswithcode.com/paper/linear-interpolation-gives-better-gradients
Repo
Framework
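
A compact comparison of the two derivative-free gradient estimators discussed above: Gaussian smoothing averages forward differences along random Gaussian directions, while the interpolation approach solves a least-squares system so that the linear model matches the sampled function values along the chosen directions. The test function, sampling radius and number of directions are arbitrary choices for illustration.

```python
import numpy as np

def gaussian_smoothing_grad(f, x, n_dirs=50, sigma=1e-2, seed=0):
    rng = np.random.default_rng(seed)
    u = rng.normal(size=(n_dirs, x.size))
    fd = np.array([(f(x + sigma * ui) - f(x)) / sigma for ui in u])
    return (fd[:, None] * u).mean(axis=0)           # E[ (f(x+s*u)-f(x))/s * u ]

def interpolation_grad(f, x, n_dirs=50, sigma=1e-2, seed=0):
    rng = np.random.default_rng(seed)
    u = rng.normal(size=(n_dirs, x.size))
    fd = np.array([(f(x + sigma * ui) - f(x)) / sigma for ui in u])
    return np.linalg.lstsq(u, fd, rcond=None)[0]    # solve U g ~= fd for g

f = lambda x: np.sum(x ** 2)                        # simple smooth test function
x0 = np.ones(20)
true_grad = 2 * x0
for name, est in [("smoothing", gaussian_smoothing_grad), ("interpolation", interpolation_grad)]:
    g = est(f, x0)
    print(name, np.linalg.norm(g - true_grad) / np.linalg.norm(true_grad))
```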

Fast Dynamic Perfusion and Angiography Reconstruction using an end-to-end 3D Convolutional Neural Network

Title Fast Dynamic Perfusion and Angiography Reconstruction using an end-to-end 3D Convolutional Neural Network
Authors Sahar Yousefi, Lydiane Hirschler, Merlijn van der Plas, Mohamed S. Elmahdy, Hessam Sokooti, Matthias Van Osch, Marius Staring
Abstract Hadamard time-encoded pseudo-continuous arterial spin labeling (te-pCASL) is a signal-to-noise ratio (SNR)-efficient MRI technique for acquiring dynamic pCASL signals that encodes the temporal information into the labeling according to a Hadamard matrix. In the decoding step, the contribution of each sub-bolus can be isolated, resulting in dynamic perfusion scans. When acquiring te-ASL both with and without flow-crushing, the ASL signal in the arteries can be isolated, resulting in 4D angiographic information. However, obtaining multi-timepoint perfusion and angiographic data requires two acquisitions. In this study, we propose a 3D Dense-Unet convolutional neural network with a multi-level loss function for reconstructing multi-timepoint perfusion and angiographic information from interleaved 50%-sampled crushed and 50%-sampled non-crushed data, thereby negating the additional scan time. We present a framework to generate dynamic pCASL training and validation data, based on models of the intravascular and extravascular te-pCASL signals. The proposed network achieved SSIM values of $97.3 \pm 1.1$ and $96.2 \pm 11.1$ respectively for 4D perfusion and angiographic data reconstruction for 313 test data-sets.
Tasks
Published 2019-08-24
URL https://arxiv.org/abs/1908.08947v2
PDF https://arxiv.org/pdf/1908.08947v2.pdf
PWC https://paperswithcode.com/paper/fast-dynamic-perfusion-and-angiography
Repo
Framework
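
To make the Hadamard time-encoding concrete, the snippet below shows only the linear-algebra step: sub-bolus signals are mixed by Hadamard columns at acquisition time and recovered by applying the scaled transpose at decoding time. The image content and noise-free setting are synthetic assumptions; no MR physics is modelled.

```python
import numpy as np
from scipy.linalg import hadamard

n = 8                                            # Hadamard-8 encoding: 7 sub-boli
H = hadamard(n)
sub_boli = np.random.rand(n - 1, 64, 64)         # synthetic per-sub-bolus images
encoded = np.tensordot(H[:, 1:], sub_boli, axes=1)        # n acquired, mixed images
decoded = np.tensordot(H[:, 1:].T, encoded, axes=1) / n   # columns are orthogonal with norm^2 = n
print(np.allclose(decoded, sub_boli))            # True: decoding isolates each sub-bolus
```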

Deep learning predictions of sand dune migration

Title Deep learning predictions of sand dune migration
Authors Kelly Kochanski, Divya Mohan, Jenna Horrall, Barry Rountree, Ghaleb Abdulla
Abstract A dry decade in the Navajo Nation has killed vegetation, desiccated soils, and released once-stable sand into the wind. This sand now covers one-third of the Nation’s land, threatening roads, gardens and hundreds of homes. Many arid regions have similar problems: global warming has increased dune movement across farmland in Namibia and Angola, and the southwestern US. Current dune models, unfortunately, do not scale well enough to provide useful forecasts for the $\sim$5% of land surfaces covered by mobile sand. We test the ability of two deep learning algorithms, a GAN and a CNN, to model the motion of sand dunes. The models are trained on simulated data from a community-standard cellular automaton model of sand dunes. Preliminary results show the GAN producing reasonable forward predictions of dune migration at ten million times the speed of the existing model.
Tasks
Published 2019-12-13
URL https://arxiv.org/abs/1912.10798v1
PDF https://arxiv.org/pdf/1912.10798v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-predictions-of-sand-dune
Repo
Framework
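
As a rough illustration of the kind of training data the abstract describes, the snippet below steps a deliberately crude 1-D sand-slab cellular automaton and collects (frame, next frame) pairs as input/target examples for a learned emulator. This toy automaton and its rules are assumptions for illustration; it is not the community-standard model used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def ca_step(height, hops=200, jump=3):
    """Erode a sand slab from a random cell and deposit it a few cells downwind."""
    h = height.copy()
    n = len(h)
    for _ in range(hops):
        i = rng.integers(n)
        if h[i] > 0:
            h[i] -= 1
            j = (i + jump) % n
            while h[(j + 1) % n] < h[j]:     # let the slab roll to a local low point
                j = (j + 1) % n
            h[j] += 1
    return h

height = rng.integers(0, 10, size=256)
frames = [height]
for _ in range(50):
    frames.append(ca_step(frames[-1]))
pairs = list(zip(frames[:-1], frames[1:]))   # (input frame, target frame) training pairs
```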

Semantic Alignment: Finding Semantically Consistent Ground-truth for Facial Landmark Detection

Title Semantic Alignment: Finding Semantically Consistent Ground-truth for Facial Landmark Detection
Authors Zhiwei Liu, Xiangyu Zhu, Guosheng Hu, Haiyun Guo, Ming Tang, Zhen Lei, Neil M. Robertson, Jinqiao Wang
Abstract Recently, deep learning based facial landmark detection has achieved great success. Despite this, we notice that semantic ambiguity greatly degrades the detection performance. Specifically, semantic ambiguity means that some landmarks (e.g. those evenly distributed along the face contour) do not have a clear and accurate definition, causing inconsistent annotations by annotators. Accordingly, these inconsistent annotations, which are usually provided by public databases, commonly work as the ground-truth to supervise network training, leading to degraded accuracy. To our knowledge, little research has investigated this problem. In this paper, we propose a novel probabilistic model which introduces a latent variable, i.e., the semantically consistent ‘real’ ground-truth, to be optimized. This framework couples two parts: (1) training the landmark detection CNN and (2) searching for the ‘real’ ground-truth. These two parts are alternately optimized: the searched ‘real’ ground-truth supervises the CNN training, and the trained CNN assists the searching of the ‘real’ ground-truth. In addition, to recover the unconfidently predicted landmarks due to occlusion and low quality, we propose a global heatmap correction unit (GHCU) to correct outliers by considering the global face shape as a constraint. Extensive experiments on both image-based (300W and AFLW) and video-based (300-VW) databases demonstrate that our method effectively improves the landmark detection accuracy and achieves state-of-the-art performance.
Tasks Face Alignment, Facial Landmark Detection
Published 2019-03-26
URL http://arxiv.org/abs/1903.10661v1
PDF http://arxiv.org/pdf/1903.10661v1.pdf
PWC https://paperswithcode.com/paper/semantic-alignment-finding-semantically
Repo
Framework
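
A simplified numerical caricature of the alternating scheme above: the latent "real" ground-truth landmarks are re-estimated as a compromise between the (inconsistent) human annotations and the current model predictions, and the model is then refit against that latent ground truth. The closed-form update, the linear "detector" and the synthetic data are illustrative assumptions only, not the paper's probabilistic model.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                     # toy image features
W_true = rng.normal(size=(16, 10))
clean = X @ W_true                                 # idealised, semantically consistent landmarks
annotations = clean + rng.normal(scale=0.5, size=clean.shape)  # inconsistent human labels

W = np.zeros((16, 10))                             # parameters of a linear "detector"
latent_gt = annotations.copy()
for _ in range(20):
    # (1) fit the detector to the current latent ground truth (ridge regression)
    W = np.linalg.solve(X.T @ X + 1e-3 * np.eye(16), X.T @ latent_gt)
    # (2) re-estimate the latent ground truth between annotations and predictions
    alpha = 0.5                                    # relative trust in the detector
    latent_gt = (1.0 - alpha) * annotations + alpha * (X @ W)

print("prediction error vs consistent landmarks:", np.abs(X @ W - clean).mean())
print("annotation error vs consistent landmarks:", np.abs(annotations - clean).mean())
```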

Learning When-to-Treat Policies

Title Learning When-to-Treat Policies
Authors Xinkun Nie, Emma Brunskill, Stefan Wager
Abstract Many applied decision-making problems have a dynamic component: The policymaker needs not only to choose whom to treat, but also when to start which treatment. For example, a medical doctor may see a patient many times and, at each visit, need to choose between prescribing either an invasive or a non-invasive procedure and postponing the decision to the next visit. In this paper, we develop an “advantage doubly robust” estimator for learning such dynamic treatment rules using observational data under sequential ignorability. We prove welfare regret bounds that generalize results for doubly robust learning in the single-step setting, and show promising empirical performance in several different contexts. Our approach is practical for policy optimization, and does not need any structural (e.g., Markovian) assumptions.
Tasks Decision Making
Published 2019-05-23
URL https://arxiv.org/abs/1905.09751v2
PDF https://arxiv.org/pdf/1905.09751v2.pdf
PWC https://paperswithcode.com/paper/learning-when-to-treat-policies
Repo
Framework
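
A single-step doubly robust value estimator, shown here only to convey the flavour of the "advantage doubly robust" approach described above; the paper's estimator handles sequential (when-to-treat) decisions, which this static-treatment toy does not. All data, the policy and the outcome model below are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=n)                             # patient covariate
propensity = 1.0 / (1.0 + np.exp(-X))              # Prob(A=1 | X) under the logging policy
A = rng.binomial(1, propensity)                    # observed treatment
Y = X * A + rng.normal(scale=0.5, size=n)          # outcome: treatment helps when X > 0

policy = (X > 0).astype(int)                       # candidate rule: treat iff X > 0
mu = lambda x, a: x * a                            # outcome model (true mean, for clarity)

direct = mu(X, policy)                                             # model-based term
correction = ((A == policy) / np.where(policy == 1, propensity, 1.0 - propensity)
              * (Y - mu(X, A)))                                    # importance-weighted residual
print("doubly robust policy value estimate:", np.mean(direct + correction))
```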