January 29, 2020


Paper Group ANR 494


Sharper bounds for uniformly stable algorithms


Title	Sharper bounds for uniformly stable algorithms
Authors	Olivier Bousquet, Yegor Klochkov, Nikita Zhivotovskiy
Abstract	Generalization bounds for stable algorithms are a classical topic in learning theory, with roots in the early works of Vapnik and Chervonenkis and of Rogers and Wagner. In a series of recent breakthrough papers, Feldman and Vondrak have shown that the best known high probability upper bounds for uniformly stable learning algorithms due to Bousquet and Elisseeff are sub-optimal in some natural regimes. To do so, they proved two generalization bounds that significantly outperform the original generalization bound. Feldman and Vondrak also asked if it is possible to provide sharper bounds and prove corresponding high probability lower bounds. This paper is devoted to these questions: firstly, inspired by the original arguments, we provide a short proof of a moment bound that implies a generalization bound stronger than both recent results. Secondly, we prove general lower bounds, showing that our moment bound is sharp (up to a logarithmic factor) unless some additional properties of the corresponding random variables are used. Our main probabilistic result is a general concentration inequality for weakly correlated random variables, which may be of independent interest.
Tasks
Published	2019-10-17
URL	https://arxiv.org/abs/1910.07833v1
PDF	https://arxiv.org/pdf/1910.07833v1.pdf
PWC	https://paperswithcode.com/paper/sharper-bounds-for-uniformly-stable
Repo
Framework

Graph-based Discriminators: Sample Complexity and Expressiveness


Title	Graph-based Discriminators: Sample Complexity and Expressiveness
Authors	Roi Livni, Yishay Mansour
Abstract	A basic question in learning theory is to identify if two distributions are identical when we have access only to examples sampled from the distributions. This basic task is considered, for example, in the context of Generative Adversarial Networks (GANs), where a discriminator is trained to distinguish between a real-life distribution and a synthetic distribution. Classically, we use a hypothesis class $H$ and claim that the two distributions are distinct if for some $h\in H$ the expected value on the two distributions is (significantly) different. Our starting point is the following fundamental problem: “is having the hypothesis dependent on more than a single random example beneficial?” To address this challenge we define $k$-ary based discriminators, which have a family of Boolean $k$-ary functions $\mathcal{G}$. Each function $g\in \mathcal{G}$ naturally defines a hyper-graph, indicating whether a given hyper-edge exists. A function $g\in \mathcal{G}$ distinguishes between two distributions if the expected value of $g$, on a $k$-tuple of i.i.d. examples, on the two distributions is (significantly) different. We study the expressiveness of families of $k$-ary functions, compared to the classical hypothesis class $H$, which is $k=1$. We show a separation in expressiveness of $(k+1)$-ary versus $k$-ary functions. This demonstrates the great benefit of having $k\geq 2$ as distinguishers. For $k\geq 2$ we introduce a notion similar to the VC-dimension, and show that it controls the sample complexity. We proceed and provide upper and lower bounds as a function of our extended notion of VC-dimension.
Tasks
Published	2019-06-01
URL	https://arxiv.org/abs/1906.00264v1
PDF	https://arxiv.org/pdf/1906.00264v1.pdf
PWC	https://paperswithcode.com/paper/190600264
Repo
Framework

Domain specific cues improve robustness of deep learning based segmentation of ct volumes


Title	Domain specific cues improve robustness of deep learning based segmentation of ct volumes
Authors	Marie Kloenne, Sebastian Niehaus, Leonie Lampe, Alberto Merola, Janis Reinelt, Ingo Roeder, Nico Scherf
Abstract	Machine learning has considerably improved medical image analysis in the past years. Although data-driven approaches are intrinsically adaptive and thus generic, they often do not perform the same way on data from different imaging modalities. In particular, computed tomography (CT) data poses many challenges to medical image segmentation based on convolutional neural networks (CNNs), mostly due to the broad dynamic range of intensities and the varying number of recorded slices of CT volumes. In this paper, we address these issues with a framework that combines domain-specific data preprocessing and augmentation with state-of-the-art CNN architectures. The focus is not only on optimising the score but also on stabilising the prediction performance, since this is a mandatory requirement for use in automated and semi-automated workflows in the clinical environment. The framework is validated with an architecture comparison to show that its effects are independent of the CNN architecture. We compare a modified U-Net and a modified Mixed-Scale Dense Network (MS-D Net) to contrast dilated convolutions for parallel multi-scale processing with the U-Net approach based on traditional scaling operations. Finally, we propose an ensemble model combining the strengths of different individual methods. The framework performs well on a range of tasks such as liver and kidney segmentation, without significant differences in prediction performance on strongly differing volume sizes and varying slice thickness. Thus our framework is an essential step towards performing robust segmentation of unknown real-world samples.
Tasks	Computed Tomography (CT), Data Augmentation, Medical Image Segmentation, Semantic Segmentation
Published	2019-07-23
URL	https://arxiv.org/abs/1907.10132v3
PDF	https://arxiv.org/pdf/1907.10132v3.pdf
PWC	https://paperswithcode.com/paper/convolutional-neural-network-stacking-for
Repo
Framework
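
The abstract above names the broad dynamic range of CT intensities as a key difficulty but does not spell out the preprocessing. One common way to handle that range is intensity windowing: clip Hounsfield-unit values to a window and rescale to $[0, 1]$. The sketch below shows that generic technique only; whether the authors use it is an assumption, and the function name `window_ct` and the window centre and width are illustrative values, not taken from the paper.

```python
from typing import List, Sequence


def window_ct(hu_values: Sequence[float], center: float, width: float) -> List[float]:
    """Clip Hounsfield-unit values to [center - width/2, center + width/2]
    and rescale the result linearly to [0, 1]."""
    if width <= 0:
        raise ValueError("width must be positive")
    low = center - width / 2.0
    high = center + width / 2.0
    # Values below the window map to 0, values above it map to 1.
    return [(min(max(v, low), high) - low) / (high - low) for v in hu_values]


# Illustrative window (assumed values): centre 50 HU, width 400 HU.
scaled = window_ct([-1000.0, 50.0, 1000.0], center=50.0, width=400.0)
print(scaled)
```

Windowing discards contrast outside the chosen range, so the window has to be picked per target organ.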

Modelling Bushfire Evacuation Behaviours


Title	Modelling Bushfire Evacuation Behaviours
Authors	Joel Robertson
Abstract	Bushfires pose a significant threat to Australia’s regional areas. To minimise risk and increase resilience, communities need robust evacuation strategies that account for people’s likely behaviour both before and during a bushfire. Agent-based modelling (ABM) offers a practical way to simulate a range of bushfire evacuation scenarios. However, the ABM should reflect the diversity of possible human responses in a given community. The Belief-Desire-Intention (BDI) cognitive model captures behaviour in a compact representation that is understandable by domain experts. Within a BDI-ABM simulation, individual BDI agents can be assigned profiles that determine their likely behaviour. Over a population of agents their collective behaviour will characterise the community response. These profiles are drawn from existing human behaviour research and consultation with emergency services personnel and capture the expected behaviours of identified groups in the population, both prior to and during an evacuation. A realistic representation of each community can then be formed, and evacuation scenarios within the simulation can be used to explore the possible impact of population structure on outcomes. It is hoped that this will give an improved understanding of the risks associated with evacuation, and lead to tailored evacuation plans for each community to help them prepare for and respond to bushfire.
Tasks
Published	2019-09-03
URL	https://arxiv.org/abs/1909.00991v1
PDF	https://arxiv.org/pdf/1909.00991v1.pdf
PWC	https://paperswithcode.com/paper/modelling-bushfire-evacuation-behaviours
Repo
Framework

Tropical Polynomial Division and Neural Networks


Title	Tropical Polynomial Division and Neural Networks
Authors	Georgios Smyrnis, Petros Maragos
Abstract	In this work, we examine the process of Tropical Polynomial Division, a geometric method which seeks to emulate the division of regular polynomials, when applied to those of the max-plus semiring. This is done via the approximation of the Newton Polytope of the dividend polynomial by that of the divisor. This process is afterwards generalized and applied in the context of neural networks with ReLU activations. In particular, we make use of the intuition it provides, in order to minimize a two-layer fully connected network, trained for a binary classification problem. This method is later evaluated on a variety of experiments, demonstrating its capability to approximate a network, with minimal loss in performance.
Tasks
Published	2019-11-29
URL	https://arxiv.org/abs/1911.12922v1
PDF	https://arxiv.org/pdf/1911.12922v1.pdf
PWC	https://paperswithcode.com/paper/tropical-polynomial-division-and-neural
Repo
Framework
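
For readers unfamiliar with the max-plus semiring mentioned above: a tropical polynomial replaces addition with max and multiplication with ordinary addition, so it evaluates to $\max_i (c_i + \langle a_i, x \rangle)$, and its Newton polytope is the convex hull of the exponent vectors $a_i$. The sketch below only evaluates such a polynomial and lists its exponent vectors. It does not implement the paper's division algorithm, and the names `tropical_eval`, `exponent_points` and `relu_unit` are hypothetical.

```python
from typing import List, Sequence, Tuple

# A term is (coefficient c_i, exponent vector a_i).
Term = Tuple[float, Sequence[float]]


def tropical_eval(terms: List[Term], x: Sequence[float]) -> float:
    """Evaluate a max-plus polynomial: max over terms of c_i + <a_i, x>."""
    return max(c + sum(a_j * x_j for a_j, x_j in zip(a, x)) for c, a in terms)


def exponent_points(terms: List[Term]) -> List[Tuple[float, ...]]:
    """Exponent vectors; their convex hull is the Newton polytope."""
    return [tuple(a) for _, a in terms]


# A single ReLU unit relu(2*x1 + x2 - 1) written as a two-term tropical
# polynomial: max(0, -1 + 2*x1 + x2).
relu_unit: List[Term] = [(0.0, (0, 0)), (-1.0, (2, 1))]

print(tropical_eval(relu_unit, (1.0, 1.0)))
print(exponent_points(relu_unit))
```

The example uses non-negative integer weights so that the ReLU unit is a genuine tropical polynomial. With negative weights a unit is a difference of two such polynomials instead.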

State Estimation in Visual Inertial Autonomous Helicopter Landing Using Optimisation on Manifold


Title	State Estimation in Visual Inertial Autonomous Helicopter Landing Using Optimisation on Manifold
Authors	Thinh Hoang Dinh, Hieu Le Thi Hong, Tri Ngo Dinh
Abstract	Autonomous helicopter landing is a challenging task that requires precise information about the aircraft state, namely the helicopter's position and attitude, as well as the position of the helipad. To this end, we propose a solution that fuses data from an Inertial Measurement Unit (IMU) and a monocular camera capable of detecting the helipad's position in the image plane. The algorithm utilises manifold-based nonlinear optimisation over preintegrated IMU measurements and reprojection error in temporally uniformly distributed keyframes, exhibiting good accuracy while remaining computationally feasible. The contributions of this paper are a formal treatment of the landmark Jacobian expressions and the adaptation of the equality-constrained Gauss-Newton method to this specific problem. Numerical simulations in MATLAB/Simulink confirm the validity of these claims.
Tasks
Published	2019-07-14
URL	https://arxiv.org/abs/1907.06247v1
PDF	https://arxiv.org/pdf/1907.06247v1.pdf
PWC	https://paperswithcode.com/paper/state-estimation-in-visual-inertial
Repo
Framework

A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents


Title	A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents
Authors	Daniel Russo
Abstract	This note gives a short, self-contained proof of a sharp connection between Gittins indices and Bayesian upper confidence bound algorithms. I consider a Gaussian multi-armed bandit problem with discount factor $\gamma$. The Gittins index of an arm is shown to equal the $\gamma$-quantile of the posterior distribution of the arm’s mean plus an error term that vanishes as $\gamma\to 1$. In this sense, for sufficiently patient agents, a Gittins index measures the highest plausible mean-reward of an arm in a manner equivalent to an upper confidence bound.
Tasks
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04732v1
PDF	http://arxiv.org/pdf/1904.04732v1.pdf
PWC	https://paperswithcode.com/paper/a-note-on-the-equivalence-of-upper-confidence
Repo
Framework
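
The leading term in the result above, the $\gamma$-quantile of a Gaussian posterior over an arm's mean, is straightforward to compute. The sketch below returns only that quantile: it is not the Gittins index itself, since the vanishing error term from the note is omitted, and the function name and the example posterior parameters are assumptions.

```python
from statistics import NormalDist


def posterior_quantile_index(post_mean: float, post_std: float, gamma: float) -> float:
    """gamma-quantile of a Gaussian posterior N(post_mean, post_std**2).

    This is the Bayesian upper-confidence-bound term only; the error term
    separating it from the exact Gittins index is not modelled here.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    return NormalDist(mu=post_mean, sigma=post_std).inv_cdf(gamma)


# Hypothetical arm with posterior mean 1.0 and standard deviation 2.0.
for gamma in (0.5, 0.9, 0.99):
    print(gamma, posterior_quantile_index(1.0, 2.0, gamma))
```

As $\gamma$ approaches 1 the quantile moves further into the upper tail, which matches the reading that more patient agents value exploration more highly.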

Privacy Preserving Link Prediction with Latent Geometric Network Models


Title	Privacy Preserving Link Prediction with Latent Geometric Network Models
Authors	Abir De, Soumen Chakrabarti
Abstract	Link prediction is an important task in social network analysis, with a wide variety of applications ranging from graph search to recommendation. The usual paradigm is to propose to each node a ranked list of nodes that are currently non-neighbors, as the most likely candidates for future linkage. Owing to increasing concerns about privacy, users (nodes) may prefer to keep some or all their connections private. Most link prediction heuristics, such as common neighbor, Jaccard coefficient, and Adamic-Adar, can leak private link information in making predictions. We present DPLP, a generic framework to protect differential privacy for these popular heuristics under the ranking objective. Under a recently-introduced latent node embedding model, we also analyze the trade-off between privacy and link prediction utility. Extensive experiments with eight diverse real-life graphs and several link prediction heuristics show that DPLP can trade off between privacy and predictive performance more effectively than several alternatives.
Tasks	Link Prediction
Published	2019-07-20
URL	https://arxiv.org/abs/1908.04849v1
PDF	https://arxiv.org/pdf/1908.04849v1.pdf
PWC	https://paperswithcode.com/paper/privacy-preserving-link-prediction-with
Repo
Framework
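
The three heuristics named in the abstract above, and the "ranked list of non-neighbors" paradigm, can be sketched on a small undirected graph as follows. This is the plain, non-private version of the heuristics and contains nothing of the DPLP mechanism; the toy graph and the helper name `rank_candidates` are assumptions for illustration.

```python
import math
from typing import Callable, Dict, List, Set

Graph = Dict[str, Set[str]]  # undirected adjacency sets


def common_neighbors(g: Graph, u: str, v: str) -> float:
    """Number of nodes adjacent to both u and v."""
    return float(len(g[u] & g[v]))


def jaccard(g: Graph, u: str, v: str) -> float:
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0


def adamic_adar(g: Graph, u: str, v: str) -> float:
    """Sum of 1 / log(degree) over shared neighbors (rare neighbors count more)."""
    return sum(1.0 / math.log(len(g[w])) for w in g[u] & g[v] if len(g[w]) > 1)


def rank_candidates(g: Graph, u: str,
                    score: Callable[[Graph, str, str], float]) -> List[str]:
    """Rank current non-neighbors of u by descending score, ties broken by name."""
    candidates = [v for v in g if v != u and v not in g[u]]
    return sorted(candidates, key=lambda v: (-score(g, u, v), v))


# Toy graph with edges a-b, a-c, b-c, b-d, c-d, d-e.
g: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
print(rank_candidates(g, "a", adamic_adar))
```

Each score depends directly on the neighbor sets of both endpoints, which is why a returned ranking can reveal whether a particular edge exists.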

Data Efficient Unsupervised Domain Adaptation for Cross-Modality Image Segmentation


Title	Data Efficient Unsupervised Domain Adaptation for Cross-Modality Image Segmentation
Authors	Cheng Ouyang, Konstantinos Kamnitsas, Carlo Biffi, Jinming Duan, Daniel Rueckert
Abstract	Deep learning models trained on medical images from a source domain (e.g. imaging modality) often fail when deployed on images from a different target domain, despite imaging common anatomical structures. Deep unsupervised domain adaptation (UDA) aims to improve the performance of a deep neural network model on a target domain, using solely unlabelled target domain data and labelled source domain data. However, current state-of-the-art methods exhibit reduced performance when target data is scarce. In this work, we introduce a new data efficient UDA method for multi-domain medical image segmentation. The proposed method combines a novel VAE-based feature prior matching, which is data-efficient, and domain adversarial training to learn a shared domain-invariant latent space which is exploited during segmentation. Our method is evaluated on a public multi-modality cardiac image segmentation dataset by adapting from the labelled source domain (3D MRI) to the unlabelled target domain (3D CT). We show that by using only one single unlabelled 3D CT scan, the proposed architecture outperforms the state-of-the-art in the same setting. Finally, we perform ablation studies on prior matching and domain adversarial training to shed light on the theoretical grounding of the proposed method.
Tasks	Domain Adaptation, Medical Image Segmentation, Semantic Segmentation, Unsupervised Domain Adaptation
Published	2019-07-05
URL	https://arxiv.org/abs/1907.02766v2
PDF	https://arxiv.org/pdf/1907.02766v2.pdf
PWC	https://paperswithcode.com/paper/data-efficient-unsupervised-domain-adaptation
Repo
Framework

Learning to Infer and Execute 3D Shape Programs


Title	Learning to Infer and Execute 3D Shape Programs
Authors	Yonglong Tian, Andrew Luo, Xingyuan Sun, Kevin Ellis, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu
Abstract	Human perception of 3D shapes goes beyond reconstructing them as a set of points or a composition of geometric primitives: we also effortlessly understand higher-level shape structure such as the repetition and reflective symmetry of object parts. In contrast, recent advances in 3D shape sensing focus more on low-level geometry but less on these higher-level relationships. In this paper, we propose 3D shape programs, integrating bottom-up recognition systems with top-down, symbolic program structure to capture both low-level geometry and high-level structural priors for 3D shapes. Because there are no annotations of shape programs for real shapes, we develop neural modules that not only learn to infer 3D shape programs from raw, unannotated shapes, but also to execute these programs for shape reconstruction. After initial bootstrapping, our end-to-end differentiable model learns 3D shape programs by reconstructing shapes in a self-supervised manner. Experiments demonstrate that our model accurately infers and executes 3D shape programs for highly complex shapes from various categories. It can also be integrated with an image-to-shape module to infer 3D shape programs directly from an RGB image, leading to 3D shape reconstructions that are both more accurate and more physically plausible.
Tasks
Published	2019-01-09
URL	https://arxiv.org/abs/1901.02875v3
PDF	https://arxiv.org/pdf/1901.02875v3.pdf
PWC	https://paperswithcode.com/paper/learning-to-infer-and-execute-3d-shape
Repo
Framework

A sparse annotation strategy based on attention-guided active learning for 3D medical image segmentation


Title	A sparse annotation strategy based on attention-guided active learning for 3D medical image segmentation
Authors	Zhenxi Zhang, Jie Li, Zhusi Zhong, Zhicheng Jiao, Xinbo Gao
Abstract	3D image segmentation is one of the most important and ubiquitous problems in medical image processing. It provides detailed quantitative analysis for accurate disease diagnosis, abnormality detection, and classification. Deep learning algorithms are now widely used in medical image segmentation, and most of them train models on fully annotated datasets. However, obtaining medical image datasets is very difficult and expensive, and full annotation of 3D medical images is monotonous and time-consuming work. Partially labelling informative slices in 3D images would greatly reduce the manual annotation effort. Sample selection strategies based on active learning have been proposed for 2D images, but few strategies focus on 3D images. In this paper, we propose a sparse annotation strategy based on attention-guided active learning for 3D medical image segmentation. An attention mechanism is used to improve segmentation accuracy and to estimate the segmentation accuracy of each slice. Comparative experiments with three different strategies using datasets from the developing Human Connectome Project (dHCP) show that our strategy needs only 15% to 20% annotated slices in the brain extraction task and 30% to 35% annotated slices in the tissue segmentation task to achieve results comparable to full annotation.
Tasks	Active Learning, Medical Image Segmentation, Semantic Segmentation
Published	2019-06-18
URL	https://arxiv.org/abs/1906.07367v1
PDF	https://arxiv.org/pdf/1906.07367v1.pdf
PWC	https://paperswithcode.com/paper/a-sparse-annotation-strategy-based-on
Repo
Framework

Facial Behavior Analysis using 4D Curvature Statistics for Presentation Attack Detection


Title	Facial Behavior Analysis using 4D Curvature Statistics for Presentation Attack Detection
Authors	Martin Thümmel, Sven Sickert, Joachim Denzler
Abstract	The uniqueness, complexity, and diversity of facial shapes and expressions have led to the success of facial biometric systems. Regardless of the accuracy of current facial recognition methods, most of them are vulnerable to the presentation of sophisticated masks. In highly monitored application scenarios such as airports and banks, fraudsters probably do not wear masks. However, deception will become more probable due to the increase of unsupervised authentication using kiosks, eGates and mobile phones in self-service. To robustly detect elastic 3D masks, one of the ultimate goals is to automatically analyze the plausibility of the facial behavior based on a sequence of 3D face scans. Most importantly, such a method would also detect all less advanced presentation attacks using static 3D masks, bent photographs with eyeholes, and replay attacks using monitors. Our proposed method achieves this goal by comparing the temporal curvature change between presentation attacks and genuine faces. For evaluation purposes, we recorded a challenging database containing replay attacks and static and elastic 3D masks using a high-quality 3D sensor. Based on the proposed representation, we found a clear separation between the low facial expressiveness of presentation attacks and the plausible behavior of genuine faces.
Tasks
Published	2019-10-14
URL	https://arxiv.org/abs/1910.06056v3
PDF	https://arxiv.org/pdf/1910.06056v3.pdf
PWC	https://paperswithcode.com/paper/facial-behavior-analysis-using-4d-curvature
Repo
Framework

Activity Monitoring of Islamic Prayer (Salat) Postures using Deep Learning


Title	Activity Monitoring of Islamic Prayer (Salat) Postures using Deep Learning
Authors	Anis Koubaa, Adel Ammar, Bilel Benjdira, Abdullatif Al-Hadid, Belal Kawaf, Saleh Ali Al-Yahri, Abdelrahman Babiker, Koutaiba Assaf, Mohannad Ba Ras
Abstract	In the Muslim community, the prayer (i.e. Salat) is the second pillar of Islam, and it is the most essential and fundamental worshiping activity that believers have to perform five times a day. In terms of gestures, there are predefined human postures that must be performed in a precise manner. However, many people do not perform these postures correctly, either because they are new to Salat or because they learned the prayers incorrectly. Furthermore, the time spent in each posture has to be balanced. To address these issues, we propose to develop an artificial intelligence assistive framework that helps worshippers evaluate the correctness of their prayer postures. This paper represents the first step towards this objective and addresses the problem of recognising the basic gestures of Islamic prayer using Convolutional Neural Networks (CNN). The contribution of this paper lies in building a dataset for the basic Salat positions and training a YOLOv3 neural network to recognise the gestures. Experimental results demonstrate that the mean average precision attains 85% for a training dataset of 764 images of the different postures. To the best of our knowledge, this is the first work that addresses human activity recognition of Salat using deep learning.
Tasks	Activity Recognition, Human Activity Recognition
Published	2019-11-11
URL	https://arxiv.org/abs/1911.04102v1
PDF	https://arxiv.org/pdf/1911.04102v1.pdf
PWC	https://paperswithcode.com/paper/activity-monitoring-of-islamic-prayer-salat
Repo
Framework

Parametric Shape Modeling and Skeleton Extraction with Radial Basis Functions using Similarity Domains Network


Title	Parametric Shape Modeling and Skeleton Extraction with Radial Basis Functions using Similarity Domains Network
Authors	Sedat Ozer
Abstract	We demonstrate the use of similarity domains (SDs) for shape modeling and skeleton extraction. SDs were recently proposed, and they can be utilized in a neural network framework to help us analyze shapes. In Similarity Domains Networks (SDNs), SDs are modeled with radial basis functions with varying shape parameters. In this paper, we demonstrate how an SDN can first help us model a pixel-based image in terms of SDs, and then how those learned SDs can be used to extract the skeleton of a shape.
Tasks
Published	2019-06-01
URL	https://arxiv.org/abs/1906.00265v1
PDF	https://arxiv.org/pdf/1906.00265v1.pdf
PWC	https://paperswithcode.com/paper/190600265
Repo
Framework

Target Based Speech Act Classification in Political Campaign Text


Title	Target Based Speech Act Classification in Political Campaign Text
Authors	Shivashankar Subramanian, Trevor Cohn, Timothy Baldwin
Abstract	We study pragmatics in political campaign text, through analysis of speech acts and the target of each utterance. We propose a new annotation schema incorporating domain-specific speech acts, such as commissive-action, and present a novel annotated corpus of media releases and speech transcripts from the 2016 Australian election cycle. We show how speech acts and target referents can be modeled as sequential classification, and evaluate several techniques, exploiting contextualized word representations, semi-supervised learning, task dependencies and speaker meta-data.
Tasks
Published	2019-05-20
URL	https://arxiv.org/abs/1905.07856v1
PDF	https://arxiv.org/pdf/1905.07856v1.pdf
PWC	https://paperswithcode.com/paper/target-based-speech-act-classification-in
Repo
Framework