January 29, 2020

3592 words 17 mins read

Paper Group ANR 579

Adversarial Camera Alignment Network for Unsupervised Cross-camera Person Re-identification

Title Adversarial Camera Alignment Network for Unsupervised Cross-camera Person Re-identification
Authors Lei Qi, Lei Wang, Jing Huo, Yinghuan Shi, Yang Gao
Abstract In person re-identification (Re-ID), supervised methods usually need a large amount of expensive label information, while unsupervised ones are still unable to deliver satisfactory identification performance. In this paper, we introduce a novel person Re-ID task called unsupervised cross-camera person Re-ID, which requires only within-camera (intra-camera) labels and not the more expensive cross-camera (inter-camera) labels. In real-world applications, intra-camera labels can be obtained easily by tracking algorithms or a few manual annotations. In this setting, the main challenge becomes the distribution discrepancy across camera views, caused by variations in body pose, occlusion, image resolution, illumination, and background across cameras. To address this challenge, we propose a novel Adversarial Camera Alignment Network (ACAN) for unsupervised cross-camera person Re-ID. It consists of a camera-alignment task and a supervised within-camera learning task. To achieve camera alignment, we develop a Multi-Camera Adversarial Learning (MCAL) scheme to map images from different cameras into a shared subspace. In particular, we investigate two schemes for the multi-camera adversarial task: the existing gradient reversal layer (GRL) and a proposed scheme called “other camera equiprobability” (OCE). Based on this shared subspace, we then leverage the within-camera labels to train the network. Extensive experiments on five large-scale datasets demonstrate the superiority of ACAN over multiple state-of-the-art unsupervised methods that take advantage of labeled source domains and images generated by GAN-based models. In particular, we verify that the proposed multi-camera adversarial task contributes significantly to this improvement. A minimal sketch of the gradient-reversal idea behind the GRL scheme follows this entry.
Tasks Person Re-Identification
Published 2019-08-02
URL https://arxiv.org/abs/1908.00862v1
PDF https://arxiv.org/pdf/1908.00862v1.pdf
PWC https://paperswithcode.com/paper/adversarial-camera-alignment-network-for
Repo
Framework
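
For readers unfamiliar with the GRL scheme mentioned above, here is a minimal PyTorch sketch of a gradient reversal layer feeding a camera classifier. The feature sizes, the number of cameras, and the loss weighting are illustrative assumptions rather than the ACAN architecture; the paper's alternative OCE scheme (not shown) instead trains toward equal probability on all other cameras.

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lambda in the backward pass."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

def grad_reverse(x, lamb=1.0):
    return GradReverse.apply(x, lamb)

# Hypothetical toy setup: a shared feature extractor and a camera classifier;
# the reversed gradient pushes features toward camera-indistinguishability.
feat_net = nn.Sequential(nn.Linear(128, 64), nn.ReLU())
camera_clf = nn.Linear(64, 6)            # assumed number of cameras
features_in = torch.randn(32, 128)       # stand-in for CNN features
camera_ids = torch.randint(0, 6, (32,))

feats = feat_net(features_in)
cam_logits = camera_clf(grad_reverse(feats, lamb=0.5))
adv_loss = nn.functional.cross_entropy(cam_logits, camera_ids)
adv_loss.backward()                      # gradients reaching feat_net are reversed
```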

Dynamic PET cardiac and parametric image reconstruction: a fixed-point proximity gradient approach using patch-based DCT and tensor SVD regularization

Title Dynamic PET cardiac and parametric image reconstruction: a fixed-point proximity gradient approach using patch-based DCT and tensor SVD regularization
Authors Ida Häggström, Yizun Lin, Si Li, Andrzej Krol, Yuesheng Xu, C. Ross Schmidtlein
Abstract Our aim was to enhance visual quality and quantitative accuracy of dynamic positron emission tomography (PET) uptake images by improved image reconstruction, using sophisticated sparse penalty models that incorporate both 2D spatial and 1D temporal (3DT) information. We developed two new 3DT PET reconstruction algorithms, incorporating different temporal and spatial penalties based on a patch-based discrete cosine transform (DCT) and a patch-based tensor nuclear norm (TNN), and compared them to frame-by-frame methods: conventional 2D ordered subsets expectation maximization (OSEM) with post-filtering, 2D-DCT, and 2D-TNN. A 3DT brain phantom with kinetic uptake (2-tissue model) and a moving 3DT cardiac/lung phantom were simulated and reconstructed. For the cardiac/lung phantom, an additional cardiac-gated 2D-OSEM set was reconstructed. The structural similarity index (SSIM) and relative root mean squared error (rRMSE) relative to ground truth were investigated. The image-derived left ventricular (LV) volume for the cardiac/lung images was found by region growing, and parametric images of the brain phantom were calculated. For the cardiac/lung phantom, 3DT-TNN yielded optimal images, and 3DT-DCT was best for the brain phantom. The optimal LV volume from the 3DT-TNN images was on average 11 and 55 percentage points closer to the true value than cardiac-gated 2D-OSEM and 2D-OSEM, respectively. Compared to 2D-OSEM, parametric images based on 3DT-DCT images generally had smaller bias and higher SSIM. Our novel methods that incorporate both 2D spatial and 1D temporal penalties produced dynamic PET images of higher quality than conventional 2D methods, without the need for post-filtering. Breathing and cardiac motion were captured simultaneously without respiratory or cardiac gating. LV volumes were better recovered, and the subsequently fitted parametric images were generally less biased and of higher quality. A toy sketch of a patch-based DCT sparsity step follows this entry.
Tasks Image Reconstruction
Published 2019-06-13
URL https://arxiv.org/abs/1906.05897v1
PDF https://arxiv.org/pdf/1906.05897v1.pdf
PWC https://paperswithcode.com/paper/dynamic-pet-cardiac-and-parametric-image
Repo
Framework
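
The fixed-point proximity gradient approach alternates data-fidelity steps with proximity steps for the sparse penalties. As a rough illustration (not the authors' algorithm), the prox of an L1 penalty on orthonormal 3D-DCT coefficients of a spatiotemporal patch is DCT, soft-threshold, inverse DCT; the patch size and threshold below are arbitrary.

```python
import numpy as np
from scipy.fft import dctn, idctn

def soft_threshold(c, t):
    """Elementwise soft-thresholding, the prox operator of t * ||.||_1."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def dct_l1_prox(patch, t):
    """Prox of t * ||DCT(patch)||_1 for an orthonormal 3D DCT (2D space + time)."""
    coeffs = dctn(patch, norm="ortho")
    return idctn(soft_threshold(coeffs, t), norm="ortho")

# Toy 3DT patch: 8x8 pixels over 6 time frames, plus noise.
rng = np.random.default_rng(0)
patch = np.outer(np.hanning(8), np.hanning(8))[..., None] * np.linspace(1, 2, 6)
noisy = patch + 0.1 * rng.standard_normal(patch.shape)
denoised = dct_l1_prox(noisy, t=0.05)
print(np.linalg.norm(noisy - patch), np.linalg.norm(denoised - patch))
```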

Deep Learning for Automated Medical Image Analysis

Title Deep Learning for Automated Medical Image Analysis
Authors Wentao Zhu
Abstract Medical imaging is an essential tool in many areas of medical applications, used for both diagnosis and treatment. However, reading medical images and making diagnosis or treatment recommendations requires specially trained medical specialists. The current practice of reading medical images is labor-intensive, time-consuming, costly, and error-prone. It would be more desirable to have a computer-aided system that can automatically make diagnosis and treatment recommendations. Recent advances in deep learning enable us to rethink the way clinicians make diagnoses based on medical images. In this thesis, we will introduce 1) mammograms for detecting breast cancer, the most frequently diagnosed solid cancer in U.S. women, 2) lung CT images for detecting lung cancer, the most frequently diagnosed malignant cancer, and 3) head and neck CT images for automated delineation of organs at risk in radiotherapy. First, we will show how to employ the adversarial concept to generate hard examples that improve mammogram mass segmentation. Second, we will demonstrate how to use weakly labeled data for mammogram breast cancer diagnosis by efficiently designing deep networks for multi-instance learning. Third, the thesis will walk through the DeepLung system, which combines deep 3D ConvNets and GBM for automated lung nodule detection and classification. Fourth, we will show how to use weakly labeled data to improve an existing lung nodule detection system by integrating deep learning with a probabilistic graphical model. Lastly, we will demonstrate AnatomyNet, which is thousands of times faster and more accurate than previous methods on automated anatomy segmentation.
Tasks Lung Nodule Detection
Published 2019-03-12
URL http://arxiv.org/abs/1903.04711v1
PDF http://arxiv.org/pdf/1903.04711v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-automated-medical-image
Repo
Framework

Defining AI in Policy versus Practice

Title Defining AI in Policy versus Practice
Authors P. M. Krafft, Meg Young, Michael Katell, Karen Huang, Ghislain Bugingo
Abstract Recent concern about harms of information technologies motivates consideration of regulatory action to forestall or constrain certain developments in the field of artificial intelligence (AI). However, definitional ambiguity hampers the possibility of conversation about this urgent topic of public concern. Legal and regulatory interventions require agreed-upon definitions, but consensus around a definition of AI has been elusive, especially in policy conversations. With an eye towards practical working definitions and a broader understanding of positions on these issues, we survey experts and review published policy documents to examine researcher and policy-maker conceptions of AI. We find that while AI researchers favor definitions of AI that emphasize technical functionality, policy-makers instead use definitions that compare systems to human thinking and behavior. We point out that definitions adhering closely to the functionality of AI systems are more inclusive of technologies in use today, whereas definitions that emphasize human-like capabilities are most applicable to hypothetical future technologies. As a result of this gap, ethical and regulatory efforts may overemphasize concern about future technologies at the expense of pressing issues with existing deployed technologies.
Tasks
Published 2019-12-23
URL https://arxiv.org/abs/1912.11095v1
PDF https://arxiv.org/pdf/1912.11095v1.pdf
PWC https://paperswithcode.com/paper/defining-ai-in-policy-versus-practice
Repo
Framework

Adaptive Pricing in Insurance: Generalized Linear Models and Gaussian Process Regression Approaches

Title Adaptive Pricing in Insurance: Generalized Linear Models and Gaussian Process Regression Approaches
Authors Yuqing Zhang, Neil Walton
Abstract We study the application of dynamic pricing to insurance. We view this as an online revenue management problem where the insurance company looks to set prices to optimize the long-run revenue from selling a new insurance product. We develop two pricing models: an adaptive Generalized Linear Model (GLM) and an adaptive Gaussian Process (GP) regression model. Both balance exploration, where we choose prices in order to learn the distribution of demand and claims for the insurance product, against exploitation, where we myopically choose the best price from the information gathered so far. The performance of the pricing policies is measured in terms of regret: the expected revenue loss caused by not using the optimal price. As is commonplace in insurance, we model demand and claims by GLMs. In our adaptive GLM design, we use maximum quasi-likelihood estimation (MQLE) to estimate the unknown parameters. We show that, if prices are chosen with suitably decreasing variability, the MQLE parameters eventually exist and converge to the correct values, which in turn implies that the sequence of chosen prices also converges to the optimal price. In the adaptive GP regression model, we sample demand and claims from Gaussian Processes and then choose selling prices by the upper confidence bound rule. We also analyze these GLM and GP pricing algorithms with delayed claims. Although similar results exist in other domains, this is among the first works to consider dynamic pricing problems in the field of insurance. We also believe this is the first work to consider Gaussian Process regression in the context of insurance pricing. These initial findings suggest that online machine learning algorithms could be a fruitful area of future investigation and application in insurance. A generic GP-UCB pricing sketch follows this entry.
Tasks Gaussian Processes
Published 2019-07-02
URL https://arxiv.org/abs/1907.05381v1
PDF https://arxiv.org/pdf/1907.05381v1.pdf
PWC https://paperswithcode.com/paper/adaptive-pricing-in-insurance-generalized
Repo
Framework
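
As a generic illustration of the GP-with-upper-confidence-bound idea (not the paper's exact model, which also involves claims and delays), the sketch below fits a GP to observed revenue at previously tried prices and picks the next price by mean plus a confidence-scaled standard deviation. The demand curve, kernel, and confidence schedule are all assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)

def observed_revenue(price):
    """Hypothetical environment: logistic purchase probability times price."""
    demand = 1.0 / (1.0 + np.exp(2.0 * (price - 3.0)))
    return price * (rng.random() < demand)

grid = np.linspace(0.5, 6.0, 100).reshape(-1, 1)
prices, revenues = [2.0], [observed_revenue(2.0)]   # one seed observation

for t in range(1, 50):
    gp = GaussianProcessRegressor(RBF(1.0) + WhiteKernel(0.1), normalize_y=True)
    gp.fit(np.array(prices).reshape(-1, 1), revenues)
    mean, std = gp.predict(grid, return_std=True)
    beta = 2.0 * np.log(t + 1.0)                    # assumed confidence schedule
    next_price = float(grid[np.argmax(mean + np.sqrt(beta) * std)])
    prices.append(next_price)
    revenues.append(observed_revenue(next_price))

print("last prices tried:", np.round(prices[-5:], 2))
```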

Kernels over Sets of Finite Sets using RKHS Embeddings, with Application to Bayesian (Combinatorial) Optimization

Title Kernels over Sets of Finite Sets using RKHS Embeddings, with Application to Bayesian (Combinatorial) Optimization
Authors Poompol Buathong, David Ginsbourger, Tipaluck Krityakierne
Abstract We focus on kernel methods for set-valued inputs and their application to Bayesian set optimization, notably combinatorial optimization. We investigate two classes of set kernels that both rely on Reproducing Kernel Hilbert Space embeddings, namely the “Double Sum” (DS) kernels recently considered in Bayesian set optimization, and a class introduced here called “Deep Embedding” (DE) kernels, which essentially consist in applying a radial kernel on Hilbert space on top of the canonical distance induced by another kernel such as a DS kernel. We establish in particular that while DS kernels typically suffer from a lack of strict positive definiteness, vast subclasses of DE kernels built upon DS kernels do possess this property, in turn enabling combinatorial optimization without the need to introduce a jitter parameter. Proofs of theoretical results about the considered kernels are complemented by a few practicalities regarding hyperparameter fitting. We furthermore demonstrate the applicability of our approach in prediction and optimization tasks, relying both on toy examples and on two test cases from mechanical engineering and hydrogeology, respectively. Experimental results highlight the applicability and compared merits of the considered approaches while opening new perspectives in prediction and sequential design with set inputs. A small numerical sketch of DS and DE kernels follows this entry.
Tasks Combinatorial Optimization
Published 2019-10-09
URL https://arxiv.org/abs/1910.04086v2
PDF https://arxiv.org/pdf/1910.04086v2.pdf
PWC https://paperswithcode.com/paper/kernels-over-sets-of-finite-sets-using-rkhs
Repo
Framework
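
The two kernel classes can be sketched in a few lines of numpy. Below, a DS kernel averages a base Gaussian kernel over all cross pairs of elements (the exact normalization used in the paper is an assumption here), and a DE kernel applies a Gaussian radial kernel to the RKHS distance that the DS kernel induces between sets.

```python
import numpy as np

def base_kernel(x, y, ell=1.0):
    """Gaussian kernel between single points (assumed base kernel)."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * ell ** 2))

def ds_kernel(A, B, ell=1.0):
    """Double Sum kernel: average of the base kernel over all cross pairs."""
    return np.mean([[base_kernel(a, b, ell) for b in B] for a in A])

def de_kernel(A, B, ell=1.0, theta=1.0):
    """Deep Embedding kernel: radial kernel on the distance induced by ds_kernel."""
    d2 = ds_kernel(A, A, ell) + ds_kernel(B, B, ell) - 2.0 * ds_kernel(A, B, ell)
    return np.exp(-max(d2, 0.0) / (2.0 * theta ** 2))

# Two small sets of 2-D points.
A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[0.1, 0.0], [1.1, 0.2], [2.0, 1.0]])
print(ds_kernel(A, B), de_kernel(A, B))
```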

On Adaptivity in Information-constrained Online Learning

Title On Adaptivity in Information-constrained Online Learning
Authors Siddharth Mitra, Aditya Gopalan
Abstract We study how to adapt to smoothly-varying (‘easy’) environments in well-known online learning problems where acquiring information is expensive. For the problem of label efficient prediction, which is a budgeted version of prediction with expert advice, we present an online algorithm whose regret depends optimally on the number of labels allowed and $Q^*$ (the quadratic variation of the losses of the best action in hindsight), along with a parameter-free counterpart whose regret depends optimally on $Q$ (the quadratic variation of the losses of all the actions). These quantities can be significantly smaller than $T$ (the total time horizon), yielding an improvement over existing, variation-independent results for the problem. We then extend our analysis to handle label efficient prediction with bandit feedback, i.e., label efficient bandits. Our work builds upon the framework of optimistic online mirror descent, and leverages second order corrections along with a carefully designed hybrid regularizer that encodes the constrained information structure of the problem. We then consider revealing-action partial monitoring games – a version of label efficient prediction with additive information costs, which in general are known to lie in the \textit{hard} class of games having minimax regret of order $T^{\frac{2}{3}}$. We provide a strategy with an $\mathcal{O}((Q^* T)^{\frac{1}{3}})$ bound for revealing-action games, along with one with an $\mathcal{O}((QT)^{\frac{1}{3}})$ bound for the full class of hard partial monitoring games, both being strict improvements over current bounds. For context, a basic variation-independent label-efficient forecaster is sketched after this entry.
Tasks
Published 2019-10-19
URL https://arxiv.org/abs/1910.08805v2
PDF https://arxiv.org/pdf/1910.08805v2.pdf
PWC https://paperswithcode.com/paper/on-adaptivity-in-information-constrained
Repo
Framework
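
The classical label-efficient forecaster queries each label with a fixed probability and feeds importance-weighted losses to exponential weights; the paper's contribution is an optimistic mirror-descent variant whose regret scales with the quadratic variation instead. The sketch below shows only the basic version, with an arbitrary loss sequence and an assumed learning rate.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 2000, 5                 # rounds, experts
eps = 0.1                      # label-query probability (budget roughly eps * T)
eta = np.sqrt(np.log(K) / (eps * T))   # assumed learning rate
weights = np.ones(K)
losses = rng.random((T, K))    # stand-in loss sequence in [0, 1]
total = 0.0

for t in range(T):
    p = weights / weights.sum()
    action = rng.choice(K, p=p)
    total += losses[t, action]
    if rng.random() < eps:                 # query the labels this round
        est = losses[t] / eps              # importance-weighted loss estimate
        weights *= np.exp(-eta * est)      # exponential-weights update

best = losses.sum(axis=0).min()
print("regret of the forecaster:", total - best)
```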

ChaLearn Looking at People: IsoGD and ConGD Large-scale RGB-D Gesture Recognition

Title ChaLearn Looking at People: IsoGD and ConGD Large-scale RGB-D Gesture Recognition
Authors Jun Wan, Chi Lin, Longyin Wen, Yunan Li, Qiguang Miao, Sergio Escalera, Gholamreza Anbarjafari, Isabelle Guyon, Guodong Guo, Stan Z. Li
Abstract The ChaLearn large-scale gesture recognition challenge has been run twice, in workshops held in conjunction with the International Conference on Pattern Recognition (ICPR) 2016 and the International Conference on Computer Vision (ICCV) 2017, attracting more than $200$ teams from around the world. The challenge has two tracks, focusing on isolated and continuous gesture recognition, respectively. This paper describes the creation of both benchmark datasets and analyzes the advances in large-scale gesture recognition based on these two datasets. We discuss the challenges of collecting large-scale ground-truth annotations for gesture recognition, and provide a detailed analysis of the current state-of-the-art methods for large-scale isolated and continuous gesture recognition based on RGB-D video sequences. In addition to the recognition rate and mean Jaccard index (MJI) used as evaluation metrics in our previous challenges, we also introduce the corrected segmentation rate (CSR) metric to evaluate the performance of temporal segmentation for continuous gesture recognition. Furthermore, we propose a bidirectional long short-term memory (Bi-LSTM) baseline method, determining the video division points based on the skeleton points extracted by a convolutional pose machine (CPM). Experiments demonstrate that the proposed Bi-LSTM outperforms the state-of-the-art methods with an absolute improvement of $8.1%$ in CSR (from $0.8917$ to $0.9639$). A toy mean Jaccard index computation is sketched after this entry.
Tasks Gesture Recognition
Published 2019-07-29
URL https://arxiv.org/abs/1907.12193v1
PDF https://arxiv.org/pdf/1907.12193v1.pdf
PWC https://paperswithcode.com/paper/chalearn-looking-at-people-isogd-and-congd
Repo
Framework
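
The mean Jaccard index used in these challenges measures frame-level overlap between predicted and ground-truth gesture segments. A simplified per-video computation is sketched below; the averaging convention of the official ChaLearn evaluation script may differ in detail.

```python
import numpy as np

def mean_jaccard(gt, pred, num_classes):
    """gt, pred: per-frame gesture labels (0 = no gesture); average Jaccard over
    the gesture classes that occur in either sequence."""
    gt, pred = np.asarray(gt), np.asarray(pred)
    scores = []
    for g in range(1, num_classes + 1):
        gt_mask, pred_mask = gt == g, pred == g
        union = np.logical_or(gt_mask, pred_mask).sum()
        if union == 0:
            continue
        inter = np.logical_and(gt_mask, pred_mask).sum()
        scores.append(inter / union)
    return float(np.mean(scores)) if scores else 0.0

gt   = [0, 0, 1, 1, 1, 0, 2, 2, 2, 2, 0]
pred = [0, 1, 1, 1, 0, 0, 2, 2, 0, 2, 2]
print(mean_jaccard(gt, pred, num_classes=2))
```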

Two Birds with One Stone: Investigating Invertible Neural Networks for Inverse Problems in Morphology

Title Two Birds with One Stone: Investigating Invertible Neural Networks for Inverse Problems in Morphology
Authors Gözde Gül Şahin, Iryna Gurevych
Abstract Most problems in natural language processing can be approximated as inverse problems, such as analysis and generation at a variety of levels from morphological (e.g., cat+Plural <-> cats) to semantic (e.g., (call + 1 2) <-> “Calculate one plus two.”). Although the tasks in both directions are closely related, the general approach in the field has been to design separate models specific to each task. However, having one shared model for both tasks would help researchers exploit the common knowledge among these problems with reduced time and memory requirements. We investigate a specific class of neural networks, called Invertible Neural Networks (INNs) (Ardizzone et al. 2019), that enable simultaneous optimization in both directions and hence allow inverse problems to be addressed via a single model. In this study, we investigate INNs on morphological problems cast as inverse problems. We apply INNs to various morphological tasks with varying ambiguity and show that they provide competitive performance in both directions. We show that they are able to recover the morphological input parameters, i.e., predicting the lemma (e.g., cat) or the morphological tags (e.g., Plural) when run in the reverse direction, without any significant performance drop in the forward direction, i.e., predicting the surface form (e.g., cats). A minimal affine coupling layer, the building block that makes such networks invertible, is sketched after this entry.
Tasks
Published 2019-12-11
URL https://arxiv.org/abs/1912.05274v1
PDF https://arxiv.org/pdf/1912.05274v1.pdf
PWC https://paperswithcode.com/paper/two-birds-with-one-stone-investigating
Repo
Framework
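
The INNs referenced above are built from coupling blocks that are invertible by construction. The numpy sketch below shows a single affine coupling layer with tiny fixed stand-ins for the scale and shift networks; real models stack many such blocks with learned networks and permutations, so this is only an illustration of the mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)
W_s, W_t = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))  # stand-in "networks"
s = lambda h: np.tanh(h @ W_s)   # scale network (bounded for stability)
t = lambda h: h @ W_t            # shift network

def forward(x):
    x1, x2 = x[:, :2], x[:, 2:]
    y2 = x2 * np.exp(s(x1)) + t(x1)     # transform one half conditioned on the other
    return np.concatenate([x1, y2], axis=1)

def inverse(y):
    y1, y2 = y[:, :2], y[:, 2:]
    x2 = (y2 - t(y1)) * np.exp(-s(y1))  # exact inverse, no matrix inversion needed
    return np.concatenate([y1, x2], axis=1)

x = rng.normal(size=(4, 4))
print(np.allclose(inverse(forward(x)), x))  # True: the layer is invertible
```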

Heterogeneous Parallel Genetic Algorithm Paradigm

Title Heterogeneous Parallel Genetic Algorithm Paradigm
Authors Menouar Boulif
Abstract The encoding representation of a genetic algorithm can boost or hinder its performance, despite the care one can devote to operator design. Unfortunately, a representation-theory foundation that helps to find the suitable encoding for any problem has not yet matured. Furthermore, we argue that the best-performing encoding scheme can differ even between instances of the same problem. In this contribution, we present the basic principles of the heterogeneous parallel genetic algorithm, which federates the efforts of many encoding representations in order to efficiently solve the problem at hand without prior knowledge of the best encoding. A toy island-style sketch with two encodings follows this entry.
Tasks
Published 2019-05-16
URL https://arxiv.org/abs/1905.06636v1
PDF https://arxiv.org/pdf/1905.06636v1.pdf
PWC https://paperswithcode.com/paper/heterogeneous-parallel-genetic-algorithm
Repo
Framework
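
As a toy illustration of federating encodings (the paper's own protocol is not specified in the abstract), the sketch below evolves two small populations of 8-bit genomes for the same integer maximization problem, one under standard binary encoding and one under Gray coding, and periodically migrates the best decoded solution between them, re-encoded for the destination island.

```python
import random

random.seed(0)
TARGET = 173
fitness = lambda v: -abs(v - TARGET)          # toy objective on integers 0..255

to_bin   = lambda v: [int(b) for b in format(v, "08b")]
from_bin = lambda g: int("".join(map(str, g)), 2)
to_gray  = lambda v: to_bin(v ^ (v >> 1))
def from_gray(g):
    v = from_bin(g)
    m = v >> 1
    while m:
        v ^= m
        m >>= 1
    return v

def step(pop, decode):
    """One generation: tournament selection, one-point crossover, bit-flip mutation."""
    def pick():
        return max(random.sample(pop, 3), key=lambda g: fitness(decode(g)))
    new = []
    for _ in range(len(pop)):
        a, b = pick(), pick()
        cut = random.randrange(1, 8)
        child = a[:cut] + b[cut:]
        if random.random() < 0.2:
            i = random.randrange(8)
            child[i] ^= 1
        new.append(child)
    return new

islands = [([[random.randint(0, 1) for _ in range(8)] for _ in range(20)], from_bin, to_bin),
           ([[random.randint(0, 1) for _ in range(8)] for _ in range(20)], from_gray, to_gray)]

for gen in range(30):
    islands = [(step(pop, dec), dec, enc) for pop, dec, enc in islands]
    if gen % 10 == 9:                         # migrate best decoded solutions both ways
        bests = [max(pop, key=lambda g: fitness(dec(g))) for pop, dec, _ in islands]
        islands[0][0][0] = islands[0][2](islands[1][1](bests[1]))
        islands[1][0][0] = islands[1][2](islands[0][1](bests[0]))

for pop, dec, _ in islands:
    print(dec(max(pop, key=lambda g: fitness(dec(g)))))
```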

A Time-Dependent TSP Formulation for the Design of an Active Debris Removal Mission using Simulated Annealing

Title A Time-Dependent TSP Formulation for the Design of an Active Debris Removal Mission using Simulated Annealing
Authors Lorenzo Federici, Alessandro Zavoli, Guido Colasurdo
Abstract This paper proposes a formulation of the Active Debris Removal (ADR) mission design problem as a modified Time-Dependent Traveling Salesman Problem (TDTSP). The TDTSP is a well-known combinatorial optimization problem, whose solution is the cheapest mono-cyclic tour connecting a number of non-stationary cities on a map. The problem is tackled with an optimization procedure based on Simulated Annealing that efficiently exploits a natural encoding and a careful choice of mutation operators. The developed algorithm is used to simultaneously optimize the target sequence and the rendezvous epochs of an impulsive ADR mission. Numerical results are presented for sets comprising up to 20 targets. A bare-bones simulated annealing loop for a tour-ordering problem is sketched after this entry.
Tasks Combinatorial Optimization
Published 2019-09-23
URL https://arxiv.org/abs/1909.10427v1
PDF https://arxiv.org/pdf/1909.10427v1.pdf
PWC https://paperswithcode.com/paper/190910427
Repo
Framework
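
The sketch below is a bare-bones simulated annealing loop for a time-dependent tour; the drifting-point cost function, swap neighborhood, and cooling schedule are placeholders, not the mission-design model (which couples targets with rendezvous epochs and impulsive transfer costs).

```python
import math
import random

random.seed(0)
N = 12                                    # number of debris targets (toy)
pos0 = [(random.random(), random.random()) for _ in range(N)]
drift = [(random.uniform(-0.01, 0.01), random.uniform(-0.01, 0.01)) for _ in range(N)]

def leg_cost(i, j, t):
    """Time-dependent 'transfer cost': distance between drifting targets at epoch t."""
    xi = (pos0[i][0] + drift[i][0] * t, pos0[i][1] + drift[i][1] * t)
    xj = (pos0[j][0] + drift[j][0] * t, pos0[j][1] + drift[j][1] * t)
    return math.dist(xi, xj)

def tour_cost(tour):
    return sum(leg_cost(tour[k], tour[(k + 1) % N], k) for k in range(N))

tour = list(range(N))
cost = tour_cost(tour)
T = 1.0
for it in range(20000):
    i, j = random.sample(range(N), 2)     # swap-two-targets neighborhood
    cand = tour[:]
    cand[i], cand[j] = cand[j], cand[i]
    c = tour_cost(cand)
    if c < cost or random.random() < math.exp((cost - c) / T):   # Metropolis acceptance
        tour, cost = cand, c
    T *= 0.9995                           # geometric cooling

print(round(cost, 3), tour)
```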

Joint Chromatic and Polarimetric Demosaicing via Sparse Coding

Title Joint Chromatic and Polarimetric Demosaicing via Sparse Coding
Authors Sijia Wen, Yinqiang Zheng, Feng Lu, Qinping Zhao
Abstract Thanks to the latest progress in image sensor manufacturing technology, the emergence of the single-chip polarized color sensor is likely to bring advantages to computer vision tasks. Despite the importance of the sensor, joint chromatic and polarimetric demosaicing is the key to obtaining high-quality RGB-Polarization images from it. Since the polarized color sensor is equipped with a new type of chip, the demosaicing problem cannot currently be well addressed by earlier methods. In this paper, we propose a joint chromatic and polarimetric demosaicing model to address this challenging problem. To solve this non-convex problem, we further present a sparse-representation-based optimization strategy that utilizes chromatic information and polarimetric information to jointly optimize the model. In addition, we build an optical data acquisition system to collect an RGB-Polarization dataset. Both qualitative and quantitative experiments show that our method is capable of faithfully recovering full 12-channel chromatic and polarimetric information for each pixel from a single mosaic input image. Moreover, we show that the proposed method performs well not only on synthetic data but also on real captured data. A short sketch relating the recovered polarization channels to Stokes parameters follows this entry.
Tasks Demosaicking
Published 2019-12-16
URL https://arxiv.org/abs/1912.07308v1
PDF https://arxiv.org/pdf/1912.07308v1.pdf
PWC https://paperswithcode.com/paper/joint-chromatic-and-polarimetric-demosaicing
Repo
Framework
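
For context on the 12 channels (this is not the paper's sparse-coding method): a polarized color sensor interleaves four linear-polarizer orientations per color, and once demosaicing recovers all four intensities per pixel and color, the Stokes parameters and degree/angle of linear polarization follow directly.

```python
import numpy as np

def polarization_maps(I0, I45, I90, I135):
    """Per-pixel Stokes parameters and degree/angle of linear polarization
    from the four polarizer-orientation intensities of one color channel."""
    S0 = 0.5 * (I0 + I45 + I90 + I135)   # total intensity
    S1 = I0 - I90
    S2 = I45 - I135
    dolp = np.sqrt(S1 ** 2 + S2 ** 2) / np.maximum(S0, 1e-8)
    aolp = 0.5 * np.arctan2(S2, S1)
    return S0, dolp, aolp

# Toy 4x4 images of one color channel at the four polarizer angles.
rng = np.random.default_rng(0)
I0, I45, I90, I135 = (rng.random((4, 4)) for _ in range(4))
S0, dolp, aolp = polarization_maps(I0, I45, I90, I135)
print(dolp.round(2))
```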

Bayes-Factor-VAE: Hierarchical Bayesian Deep Auto-Encoder Models for Factor Disentanglement

Title Bayes-Factor-VAE: Hierarchical Bayesian Deep Auto-Encoder Models for Factor Disentanglement
Authors Minyoung Kim, Yuting Wang, Pritish Sahu, Vladimir Pavlovic
Abstract We propose a family of novel hierarchical Bayesian deep auto-encoder models capable of identifying disentangled factors of variability in data. While many recent attempts at factor disentanglement have focused on sophisticated learning objectives within the VAE framework, their choice of a standard normal as the latent factor prior is both suboptimal and detrimental to performance. Our key observation is that the disentangled latent variables responsible for major sources of variability, the relevant factors, are more appropriately modeled using long-tailed distributions. The typical Gaussian priors are, on the other hand, better suited to modeling nuisance factors. Motivated by this, we extend the VAE to a hierarchical Bayesian model by introducing hyper-priors on the variances of the Gaussian latent priors, mimicking an infinite mixture, while maintaining the tractable learning and inference of traditional VAEs. This analysis highlights the importance of partitioning, and treating differently, the latent dimensions corresponding to relevant factors and to nuisances. Our proposed models, dubbed Bayes-Factor-VAEs, are shown to outperform existing methods both quantitatively and qualitatively in terms of latent disentanglement across several challenging benchmark tasks. A small numerical sketch of why a variance hyper-prior yields heavy-tailed marginals follows this entry.
Tasks
Published 2019-09-06
URL https://arxiv.org/abs/1909.02820v1
PDF https://arxiv.org/pdf/1909.02820v1.pdf
PWC https://paperswithcode.com/paper/bayes-factor-vae-hierarchical-bayesian-deep
Repo
Framework
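
The key modeling point, that placing a hyper-prior on the variance of a Gaussian prior produces a long-tailed marginal, can be checked numerically in a few lines. The inverse-gamma choice below is one standard option for a scale mixture and is used purely as an illustration, not as the paper's specific hyper-prior.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Plain Gaussian prior with fixed unit variance.
z_fixed = rng.standard_normal(n)

# Hierarchical prior: variance drawn from an inverse-gamma hyper-prior, then
# z ~ N(0, variance); the marginal is a scale mixture of Gaussians (a Student-t
# here), with much heavier tails than the fixed-variance Gaussian.
var = 1.0 / rng.gamma(shape=2.0, scale=1.0, size=n)
z_hier = rng.standard_normal(n) * np.sqrt(var)

for tail in (3.0, 5.0):
    print(tail, np.mean(np.abs(z_fixed) > tail), np.mean(np.abs(z_hier) > tail))
```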

Fully Convolutional Networks for Monocular Retinal Depth Estimation and Optic Disc-Cup Segmentation

Title Fully Convolutional Networks for Monocular Retinal Depth Estimation and Optic Disc-Cup Segmentation
Authors Sharath M Shankaranarayana, Keerthi Ram, Kaushik Mitra, Mohanasankar Sivaprakasam
Abstract Glaucoma is a serious ocular disorder for which the screening and diagnosis are carried out by the examination of the optic nerve head (ONH). The color fundus image (CFI) is the most common modality used for ocular screening. In CFI, the central r
Tasks Depth Estimation
Published 2019-02-04
URL http://arxiv.org/abs/1902.01040v1
PDF http://arxiv.org/pdf/1902.01040v1.pdf
PWC https://paperswithcode.com/paper/fully-convolutional-networks-for-monocular
Repo
Framework

Fast and Efficient Lenslet Image Compression

Title Fast and Efficient Lenslet Image Compression
Authors Hadi Amirpour, Antonio Pinheiro, Manuela Pereira, Mohammad Ghanbari
Abstract Light field imaging is characterized by capturing the brightness, color, and directional information of light rays in a scene. This leads to image representations with a huge amount of data that require efficient coding schemes. In this paper, lenslet images are rendered into sub-aperture images, which are organized as a pseudo-sequence input for the HEVC video codec. To better exploit redundancy among neighboring sub-aperture images, and consequently decrease the distances between a sub-aperture image and the references used for its prediction, the sub-aperture images are divided into four smaller groups that are scanned in a serpentine order. The most central sub-aperture image, which has the highest similarity to all the other images, is used as the initial reference image for each of the four regions. Furthermore, a structure is defined that selects spatially adjacent sub-aperture images as prediction references with the highest similarity to the current image. In this way, encoding efficiency increases, and it also leads to higher similarity among co-located Coding Tree Units (CTUs). The similarities among co-located CTUs are exploited to predict coding unit depths. Moreover, independent encoding of each group enables parallel processing, which, together with the proposed coding unit depth prediction, decreases the encoding execution time by almost 80% on average. Simulation results show that the proposed method achieves higher rate-distortion compression gain than other state-of-the-art lenslet compression methods, at lower computational complexity. A small sketch of the serpentine scan order over sub-aperture images follows this entry.
Tasks Depth Estimation, Image Compression
Published 2019-01-27
URL http://arxiv.org/abs/1901.11396v1
PDF http://arxiv.org/pdf/1901.11396v1.pdf
PWC https://paperswithcode.com/paper/fast-and-efficient-lenslet-image-compression
Repo
Framework
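
The serpentine scan referenced above can be illustrated for a small grid of sub-aperture views. The sketch below only produces a boustrophedon ordering of grid coordinates; the paper additionally splits the views into four groups around the central view, which is not reproduced here.

```python
def serpentine_order(rows, cols):
    """Boustrophedon scan of a rows x cols grid of sub-aperture images:
    left-to-right on even rows, right-to-left on odd rows."""
    order = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in cs)
    return order

# 5x5 block of sub-aperture views (toy size); consecutive entries are spatial
# neighbours, so each view's prediction reference stays close to it.
for r, c in serpentine_order(5, 5):
    print(r, c)
```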