April 1, 2020

Paper Group ANR 456

Public Authorities as Defendants: Using Bayesian Networks to determine the Likelihood of Success for Negligence claims in the wake of Oakden

Title Public Authorities as Defendants: Using Bayesian Networks to determine the Likelihood of Success for Negligence claims in the wake of Oakden
Authors Scott McLachlan, Evangelia Kyrimi, Norman Fenton
Abstract Several countries are currently investigating issues of neglect, poor quality care and abuse in the aged care sector. In most cases it is the State who license and monitor aged care providers, which frequently introduces a serious conflict of interest because the State also operate many of the facilities where our most vulnerable peoples are cared for. Where issues are raised with the standard of care being provided, the State are seen by many as a deep-pockets defendant and become the target of high-value lawsuits. This paper draws on cases and circumstances from one jurisdiction based on the English legal tradition, Australia, and proposes a Bayesian solution capable of determining probability for success for citizen plaintiffs who bring negligence claims against a public authority defendant. Use of a Bayesian network trained on case audit data shows that even when the plaintiff case meets all requirements for a successful negligence litigation, success is not often assured. Only in around one-fifth of these cases does the plaintiff succeed against a public authority as defendant.
Published 2020-02-01
URL https://arxiv.org/abs/2002.05664v1
PDF https://arxiv.org/pdf/2002.05664v1.pdf
PWC https://paperswithcode.com/paper/public-authorities-as-defendants-using

A Robust Pose Transformational GAN for Pose Guided Person Image Synthesis

Title A Robust Pose Transformational GAN for Pose Guided Person Image Synthesis
Authors Arnab Karmakar, Deepak Mishra
Abstract Generating photorealistic images of human subjects in any unseen pose has crucial applications in generating a complete appearance model of the subject. However, from a computer vision perspective, this task is significantly challenging because of the difficulty of modelling the data distribution conditioned on pose. Existing works use a complicated pose transformation model with various additional features, such as foreground segmentation and human body parsing, to achieve robustness, which leads to computational overhead. In this work, we propose a simple yet effective pose transformation GAN that uses the Residual Learning method, without any additional feature learning, to generate a given human image in any arbitrary pose. Using effective data augmentation techniques and cleverly tuning the model, we achieve robustness in terms of illumination, occlusion, distortion and scale. We present a detailed study, both qualitative and quantitative, to demonstrate the superiority of our model over the existing methods on two large datasets.
Tasks Data Augmentation, Image Generation
Published 2020-01-05
URL https://arxiv.org/abs/2001.01259v1
PDF https://arxiv.org/pdf/2001.01259v1.pdf
PWC https://paperswithcode.com/paper/a-robust-pose-transformational-gan-for-pose

A diffusion approach to Stein’s method on Riemannian manifolds

Title A diffusion approach to Stein’s method on Riemannian manifolds
Authors Huiling Le, Alexander Lewis, Karthik Bharath, Christopher Fallaize
Abstract We detail an approach to develop Stein’s method for bounding integral metrics on probability measures defined on a Riemannian manifold $\mathbf{M}$. Our approach exploits the relationship between the generator of a diffusion on $\mathbf{M}$ with target invariant measure and its characterising Stein operator. We consider a pair of such diffusions with different starting points, and investigate properties of solution to the Stein equation based on analysis of the distance process between the pair. Several examples elucidating the role of geometry of $\mathbf{M}$ in these developments are presented.
Published 2020-03-25
URL https://arxiv.org/abs/2003.11497v1
PDF https://arxiv.org/pdf/2003.11497v1.pdf
PWC https://paperswithcode.com/paper/a-diffusion-approach-to-stein-s-method-on
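
The link between a diffusion generator and a Stein operator mentioned in the abstract can be written out in a few lines. This is a sketch of the standard construction, not the paper's own statement: the potential $\phi$, the normalisation of the generator, and the regularity conditions are assumptions made here for illustration.

$$
\begin{aligned}
&\text{Target: } \mu(dx) \propto e^{-\phi(x)}\,\mathrm{vol}(dx) \text{ on } \mathbf{M}, \qquad
\mathcal{L} f = \tfrac{1}{2}\Delta f - \tfrac{1}{2}\langle \nabla \phi, \nabla f \rangle, \qquad
\int_{\mathbf{M}} \mathcal{L} f \, d\mu = 0, \\
&\text{Stein equation: } \mathcal{L} f_h = h - \mu(h), \qquad
f_h(x) = -\int_0^\infty \big( \mathbb{E}[h(X_t^x)] - \mu(h) \big)\, dt, \\
&\text{Two starting points: } |f_h(x) - f_h(y)| \le \int_0^\infty \mathbb{E}\big[ d(X_t^x, X_t^y) \big]\, dt \quad \text{for 1-Lipschitz } h .
\end{aligned}
$$

Here $X_t^x$ denotes the diffusion with generator $\mathcal{L}$ started at $x$, and $d$ is the geodesic distance. The last line is where the distance process between the pair of diffusions enters: how fast it contracts depends on the geometry of $\mathbf{M}$, and that in turn controls the smoothness of $f_h$.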

Smoothness and Stability in GANs

Title Smoothness and Stability in GANs
Authors Casey Chu, Kentaro Minami, Kenji Fukumizu
Abstract Generative adversarial networks, or GANs, commonly display unstable behavior during training. In this work, we develop a principled theoretical framework for understanding the stability of various types of GANs. In particular, we derive conditions that guarantee eventual stationarity of the generator when it is trained with gradient descent, conditions that must be satisfied by the divergence that is minimized by the GAN and the generator’s architecture. We find that existing GAN variants satisfy some, but not all, of these conditions. Using tools from convex analysis, optimal transport, and reproducing kernels, we construct a GAN that fulfills these conditions simultaneously. In the process, we explain and clarify the need for various existing GAN stabilization techniques, including Lipschitz constraints, gradient penalties, and smooth activation functions.
Published 2020-02-11
URL https://arxiv.org/abs/2002.04185v1
PDF https://arxiv.org/pdf/2002.04185v1.pdf
PWC https://paperswithcode.com/paper/smoothness-and-stability-in-gans-1

Best Principal Submatrix Selection for the Maximum Entropy Sampling Problem: Scalable Algorithms and Performance Guarantees

Title Best Principal Submatrix Selection for the Maximum Entropy Sampling Problem: Scalable Algorithms and Performance Guarantees
Authors Yongchun Li, Weijun Xie
Abstract This paper studies a classic maximum entropy sampling problem (MESP), which aims to select the most informative principal submatrix of a prespecified size from a covariance matrix. MESP has been widely applied to many areas, including healthcare, power system, manufacturing and data science. By investigating its Lagrangian dual and primal characterization, we derive a novel convex integer program for MESP and show that its continuous relaxation yields a near-optimal solution. The results motivate us to study an efficient sampling algorithm and develop its approximation bound for MESP, which improves the best-known bound in literature. We then provide an efficient deterministic implementation of the sampling algorithm with the same approximation bound. By developing new mathematical tools for the singular matrices and analyzing the Lagrangian dual of the proposed convex integer program, we investigate the widely-used local search algorithm and prove its first-known approximation bound for MESP. The proof techniques further inspire us with an efficient implementation of the local search algorithm. Our numerical experiments demonstrate that these approximation algorithms can efficiently solve medium-sized and large-scale instances to near-optimality. Our proposed algorithms are coded and released as open-source software. Finally, we extend the analyses to the A-Optimal MESP (A-MESP), where the objective is to minimize the trace of the inverse of the selected principal submatrix.
Published 2020-01-23
URL https://arxiv.org/abs/2001.08537v1
PDF https://arxiv.org/pdf/2001.08537v1.pdf
PWC https://paperswithcode.com/paper/best-principal-submatrix-selection-for-the
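
To make the local search idea concrete, here is a minimal swap-based sketch for MESP in plain Python. It is not the authors' released software: the function names, the starting subset, and the tolerance are assumptions chosen for illustration, and the objective is the log-determinant of the selected principal submatrix.

```python
import math


def log_det(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky.

    Returns -inf when the matrix is not positive definite.
    """
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return float("-inf")
                lower[i][j] = math.sqrt(s)
                total += 2.0 * math.log(lower[i][j])
            else:
                lower[i][j] = s / lower[j][j]
    return total


def submatrix(cov, indices):
    """Principal submatrix of cov restricted to the given row/column indices."""
    return [[cov[i][j] for j in indices] for i in indices]


def local_search_mesp(cov, size):
    """Swap one selected index for one unselected index while the log-det improves."""
    n = len(cov)
    selected = list(range(size))  # arbitrary starting subset (an assumption)
    best = log_det(submatrix(cov, selected))
    improved = True
    while improved:
        improved = False
        for pos in range(size):
            for cand in range(n):
                if cand in selected:
                    continue
                trial = selected[:pos] + [cand] + selected[pos + 1:]
                value = log_det(submatrix(cov, trial))
                if value > best + 1e-12:  # accept strict improvements only
                    selected, best = trial, value
                    improved = True
    return sorted(selected), best


if __name__ == "__main__":
    # Hypothetical 4x4 covariance matrix, used only to exercise the function.
    example = [
        [2.0, 0.5, 0.1, 0.0],
        [0.5, 1.5, 0.3, 0.2],
        [0.1, 0.3, 3.0, 0.4],
        [0.0, 0.2, 0.4, 1.0],
    ]
    print(local_search_mesp(example, 2))
```

Each accepted swap strictly increases the objective over a finite set of subsets, so the loop terminates. The sketch recomputes every determinant from scratch, which is fine for small matrices but is not the efficient implementation the abstract refers to.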

User Generated Data: Achilles’ heel of BERT

Title User Generated Data: Achilles’ heel of BERT
Authors Ankit Kumar, Piyush Makhija, Anuj Gupta
Abstract Pre-trained language models such as BERT are known to perform exceedingly well on various NLP tasks and have even established new State-Of-The-Art (SOTA) benchmarks for many of these tasks. Owing to its success on various tasks and benchmark datasets, industry practitioners have started to explore BERT to build applications solving industry use cases. These use cases are known to have much more noise in the data as compared to benchmark datasets. In this work we systematically show that when the data is noisy, there is a significant degradation in the performance of BERT. Specifically, we performed experiments using BERT on popular tasks such as sentiment analysis and textual similarity. For this we work with three well known datasets - IMDB movie reviews, SST-2 and STS-B to measure the performance. Further, we examine the reason behind this performance drop and identify the shortcomings in the BERT pipeline.
Tasks Sentiment Analysis
Published 2020-03-29
URL https://arxiv.org/abs/2003.12932v1
PDF https://arxiv.org/pdf/2003.12932v1.pdf
PWC https://paperswithcode.com/paper/user-generated-data-achilles-heel-of-bert

Derivation of Coupled PCA and SVD Learning Rules from a Newton Zero-Finding Framework

Title Derivation of Coupled PCA and SVD Learning Rules from a Newton Zero-Finding Framework
Authors Ralf Möller
Abstract In coupled learning rules for PCA (principal component analysis) and SVD (singular value decomposition), the update of the estimates of eigenvectors or singular vectors is influenced by the estimates of eigenvalues or singular values, respectively. This coupled update mitigates the speed-stability problem since the update equations converge from all directions with approximately the same speed. A method to derive coupled learning rules from information criteria by Newton optimization is known. However, these information criteria have to be designed, offer no explanatory value, and can only impose Euclidean constraints on the vector estimates. Here we describe an alternative approach where coupled PCA and SVD learning rules can systematically be derived from a Newton zero-finding framework. The derivation starts from an objective function, combines the equations for its extrema with arbitrary constraints on the vector estimates, and solves the resulting vector zero-point equation using Newton’s zero-finding method. To demonstrate the framework, we derive PCA and SVD learning rules with constant Euclidean length or constant sum of the vector estimates.
Published 2020-03-25
URL https://arxiv.org/abs/2003.11456v1
PDF https://arxiv.org/pdf/2003.11456v1.pdf
PWC https://paperswithcode.com/paper/derivation-of-coupled-pca-and-svd-learning

Review: deep learning on 3D point clouds

Title Review: deep learning on 3D point clouds
Authors Saifullahi Aminu Bello, Shangshu Yu, Cheng Wang
Abstract A point cloud is a set of points defined in a 3D metric space. Point clouds have become one of the most significant data formats for 3D representation. They are gaining popularity as a result of the increased availability of acquisition devices, such as LiDAR, as well as increased application in areas such as robotics, autonomous driving, and augmented and virtual reality. Deep learning is now the most powerful tool for data processing in computer vision, becoming the most preferred technique for tasks such as classification, segmentation, and detection. While deep learning techniques are mainly applied to data with a structured grid, point clouds, on the other hand, are unstructured. This lack of structure makes it very challenging to apply deep learning to point clouds directly. Earlier approaches overcome this challenge by preprocessing the point cloud into a structured grid format, at the cost of increased computation or loss of depth information. Recently, however, many state-of-the-art deep learning techniques that operate directly on point clouds have been developed. This paper surveys recent state-of-the-art deep learning techniques that focus mainly on point cloud data. We first briefly discuss the major challenges faced when using deep learning directly on point clouds, and the earlier approaches that overcome these challenges by preprocessing the point cloud into a structured grid. We then review the various state-of-the-art deep learning approaches that directly process point clouds in their unstructured form. We introduce the popular 3D point cloud benchmark datasets, and further discuss the application of deep learning to popular 3D vision tasks, including classification, segmentation, and detection.
Tasks Autonomous Driving
Published 2020-01-17
URL https://arxiv.org/abs/2001.06280v1
PDF https://arxiv.org/pdf/2001.06280v1.pdf
PWC https://paperswithcode.com/paper/review-deep-learning-on-3d-point-clouds

Bayesian Networks in Healthcare: Distribution by Medical Condition

Title Bayesian Networks in Healthcare: Distribution by Medical Condition
Authors Scott McLachlan, Kudakwashe Dube, Graham A Hitman, Norman E Fenton, Evangelia Kyrimi
Abstract Bayesian networks (BNs) have received increasing research attention that is not matched by adoption in practice and yet have potential to significantly benefit healthcare. Hitherto, research works have not investigated the types of medical conditions being modelled with BNs, nor whether any differences exist in how and why they are applied to different conditions. This research seeks to identify and quantify the range of medical conditions for which healthcare-related BN models have been proposed, and the differences in approach between the most common medical conditions to which they have been applied. We found that almost two-thirds of all healthcare BNs are focused on four conditions: cardiac, cancer, psychological and lung disorders. We believe that a lack of understanding regarding how BNs work and what they are capable of exists, and that it is only with greater understanding and promotion that we may ever realise the full potential of BNs to effect positive change in daily healthcare practice.
Published 2020-02-01
URL https://arxiv.org/abs/2002.00224v2
PDF https://arxiv.org/pdf/2002.00224v2.pdf
PWC https://paperswithcode.com/paper/bayesian-networks-in-healthcare-distribution

Cascade EF-GAN: Progressive Facial Expression Editing with Local Focuses

Title Cascade EF-GAN: Progressive Facial Expression Editing with Local Focuses
Authors Rongliang Wu, Gongjie Zhang, Shijian Lu, Tao Chen
Abstract Recent advances in Generative Adversarial Nets (GANs) have shown remarkable improvements for facial expression editing. However, current methods are still prone to generate artifacts and blurs around expression-intensive regions, and often introduce undesired overlapping artifacts while handling large-gap expression transformations such as transformation from furious to laughing. To address these limitations, we propose Cascade Expression Focal GAN (Cascade EF-GAN), a novel network that performs progressive facial expression editing with local expression focuses. The introduction of the local focus enables the Cascade EF-GAN to better preserve identity-related features and details around eyes, noses and mouths, which further helps reduce artifacts and blurs within the generated facial images. In addition, an innovative cascade transformation strategy is designed by dividing a large facial expression transformation into multiple small ones in cascade, which helps suppress overlapping artifacts and produce more realistic editing while dealing with large-gap expression transformations. Extensive experiments over two publicly available facial expression datasets show that our proposed Cascade EF-GAN achieves superior performance for facial expression editing.
Published 2020-03-12
URL https://arxiv.org/abs/2003.05905v2
PDF https://arxiv.org/pdf/2003.05905v2.pdf
PWC https://paperswithcode.com/paper/cascade-ef-gan-progressive-facial-expression

Change Point Models for Real-time Cyber Attack Detection in Connected Vehicle Environment

Title Change Point Models for Real-time Cyber Attack Detection in Connected Vehicle Environment
Authors Gurcan Comert, Mizanur Rahman, Mhafuzul Islam, Mashrur Chowdhury
Abstract Connected vehicle (CV) systems are cognizant of potential cyber attacks because of increasing connectivity between their different components, such as vehicles, roadside infrastructure, and traffic management centers. However, it is a challenge to detect security threats in real-time and develop appropriate or effective countermeasures for a CV system because of the dynamic behavior of such attacks, the high computational power requirement, and the historical data requirement for training detection models. To address these challenges, statistical models, especially change point models, have potential for real-time anomaly detection. Thus, the objective of this study is to investigate the efficacy of two change point models, Expectation Maximization (EM) and two forms of Cumulative Summation (CUSUM) algorithms (i.e., typical and adaptive), for real-time V2I cyber attack detection in a CV environment. To prove the efficacy of these models, we evaluated them on three different types of cyber attack, denial of service (DOS), impersonation, and false information, using basic safety messages (BSMs) generated from CVs through simulation. Results from numerical analysis revealed that EM, CUSUM, and adaptive CUSUM could detect these cyber attacks, DOS, impersonation, and false information, with an accuracy of (99%, 100%, 100%), (98%, 10%, 100%), and (100%, 98%, 100%) respectively.
Tasks Cyber Attack Detection
Published 2020-03-05
URL https://arxiv.org/abs/2003.04185v1
PDF https://arxiv.org/pdf/2003.04185v1.pdf
PWC https://paperswithcode.com/paper/change-point-models-for-real-time-cyber
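
As a rough illustration of how a CUSUM change point detector raises an alarm, here is a minimal two-sided tabular CUSUM in Python. It is the textbook form, not necessarily the "typical" or "adaptive" variant evaluated in the paper. The parameter names (`target_mean`, `slack`, `threshold`) and the sample values are assumptions made for this sketch.

```python
def cusum_detect(samples, target_mean, slack, threshold):
    """Return the index of the first CUSUM alarm, or None if no alarm is raised.

    upper accumulates deviations above target_mean + slack,
    lower accumulates deviations below target_mean - slack.
    Both sums are reset to zero whenever they would go negative.
    """
    upper = 0.0
    lower = 0.0
    for index, value in enumerate(samples):
        upper = max(0.0, upper + (value - target_mean - slack))
        lower = max(0.0, lower + (target_mean - value - slack))
        if upper > threshold or lower > threshold:
            return index
    return None


if __name__ == "__main__":
    # Hypothetical per-interval message counts: steady at 10, then jumping to 13.
    stream = [10.0] * 20 + [13.0] * 10
    print(cusum_detect(stream, target_mean=10.0, slack=0.5, threshold=5.0))
```

The detector keeps only two running sums, so it needs no historical training data and very little computation per message. The slack value trades sensitivity to small shifts against false alarms, and the threshold sets how much accumulated evidence is required before an alarm.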

Design-unbiased statistical learning in survey sampling

Title Design-unbiased statistical learning in survey sampling
Authors Luis Sanguiao Sande, Li-Chun Zhang
Abstract Design-consistent model-assisted estimation has become the standard practice in survey sampling. However, a general theory is lacking so far, which allows one to incorporate modern machine-learning techniques that can lead to potentially much more powerful assisting models. We propose a subsampling Rao-Blackwell method, and develop a statistical learning theory for exactly design-unbiased estimation with the help of linear or non-linear prediction models. Our approach makes use of classic ideas from Statistical Science as well as the rapidly growing field of Machine Learning. Provided rich auxiliary information, it can yield considerable efficiency gains over standard linear model-assisted methods, while ensuring valid estimation for the given target population, which is robust against potential mis-specifications of the assisting model at the individual level.
Published 2020-03-25
URL https://arxiv.org/abs/2003.11423v1
PDF https://arxiv.org/pdf/2003.11423v1.pdf
PWC https://paperswithcode.com/paper/design-unbiased-statistical-learning-in

Visual Question Answering on 360° Images

Title Visual Question Answering on 360° Images
Authors Shih-Han Chou, Wei-Lun Chao, Wei-Sheng Lai, Min Sun, Ming-Hsuan Yang
Abstract In this work, we introduce VQA 360, a novel task of visual question answering on 360 images. Unlike a normal field-of-view image, a 360 image captures the entire visual content around the optical center of a camera, demanding more sophisticated spatial understanding and reasoning. To address this problem, we collect the first VQA 360 dataset, containing around 17,000 real-world image-question-answer triplets for a variety of question types. We then study two different VQA models on VQA 360, including one conventional model that takes an equirectangular image (with intrinsic distortion) as input and one dedicated model that first projects a 360 image onto cubemaps and subsequently aggregates the information from multiple spatial resolutions. We demonstrate that the cubemap-based model with multi-level fusion and attention diffusion performs favorably against other variants and the equirectangular-based models. Nevertheless, the gap between the humans’ and machines’ performance reveals the need for more advanced VQA 360 algorithms. We, therefore, expect our dataset and studies to serve as the benchmark for future development in this challenging task. Dataset, code, and pre-trained models are available online.
Tasks Question Answering, Visual Question Answering
Published 2020-01-10
URL https://arxiv.org/abs/2001.03339v1
PDF https://arxiv.org/pdf/2001.03339v1.pdf
PWC https://paperswithcode.com/paper/visual-question-answering-on-360-images

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

Title Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey
Authors Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone
Abstract Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback. Despite many advances over the past three decades, learning in many domains still requires a large amount of interaction with the environment, which can be prohibitively expensive in realistic scenarios. To address this problem, transfer learning has been applied to reinforcement learning such that experience gained in one task can be leveraged when starting to learn the next, harder task. More recently, several lines of research have explored how tasks, or data samples themselves, can be sequenced into a curriculum for the purpose of learning a problem that may otherwise be too difficult to learn from scratch. In this article, we present a framework for curriculum learning (CL) in reinforcement learning, and use it to survey and classify existing CL methods in terms of their assumptions, capabilities, and goals. Finally, we use our framework to find open problems and suggest directions for future RL curriculum learning research.
Tasks Transfer Learning
Published 2020-03-10
URL https://arxiv.org/abs/2003.04960v1
PDF https://arxiv.org/pdf/2003.04960v1.pdf
PWC https://paperswithcode.com/paper/curriculum-learning-for-reinforcement

On the Convergence of Nesterov’s Accelerated Gradient Method in Stochastic Settings

Title On the Convergence of Nesterov’s Accelerated Gradient Method in Stochastic Settings
Authors Mahmoud Assran, Michael Rabbat
Abstract We study Nesterov’s accelerated gradient method in the stochastic approximation setting (unbiased gradients with bounded variance) and the finite-sum setting (where randomness is due to sampling mini-batches). To build better insight into the behavior of Nesterov’s method in stochastic settings, we focus throughout on objectives that are smooth, strongly-convex, and twice continuously differentiable. In the stochastic approximation setting, Nesterov’s method converges to a neighborhood of the optimal point at the same accelerated rate as in the deterministic setting. Perhaps surprisingly, in the finite-sum setting, we prove that Nesterov’s method may diverge with the usual choice of step-size and momentum, unless additional conditions on the problem related to conditioning and data coherence are satisfied. Our results shed light as to why Nesterov’s method may fail to converge or achieve acceleration in the finite-sum setting.
Published 2020-02-27
URL https://arxiv.org/abs/2002.12414v1
PDF https://arxiv.org/pdf/2002.12414v1.pdf
PWC https://paperswithcode.com/paper/on-the-convergence-of-nesterovs-accelerated
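
For reference, the deterministic baseline that the abstract compares against can be sketched as follows. This is a minimal Python example of Nesterov's method on a diagonal strongly convex quadratic, using step-size 1/L and momentum (sqrt(kappa) - 1) / (sqrt(kappa) + 1). Treating these as the "usual choice" the abstract mentions is an assumption, as are the function name and the test problem.

```python
def nesterov_quadratic(eigenvalues, start, steps):
    """Minimise f(x) = 0.5 * sum(lam_i * x_i ** 2) with constant-momentum Nesterov.

    eigenvalues: positive curvatures lam_i of the diagonal quadratic.
    start: initial point, same length as eigenvalues.
    """
    smooth = max(eigenvalues)   # smoothness constant L
    strong = min(eigenvalues)   # strong-convexity constant mu
    step = 1.0 / smooth
    root_kappa = (smooth / strong) ** 0.5
    momentum = (root_kappa - 1.0) / (root_kappa + 1.0)

    x = list(start)
    y = list(start)
    for _ in range(steps):
        # Exact gradient of the quadratic, evaluated at the look-ahead point y.
        grad = [lam * yi for lam, yi in zip(eigenvalues, y)]
        x_next = [yi - step * g for yi, g in zip(y, grad)]
        # Extrapolate along the most recent displacement.
        y = [xn + momentum * (xn - xo) for xn, xo in zip(x_next, x)]
        x = x_next
    return x


if __name__ == "__main__":
    print(nesterov_quadratic([1.0, 100.0], [1.0, 1.0], 200))
```

Replacing the exact gradient with a mini-batch estimate gives the finite-sum setting studied in the paper, where the abstract reports that the same step-size and momentum may lead to divergence unless additional conditions hold. This sketch does not reproduce that behaviour.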