January 30, 2020

3091 words 15 mins read

Paper Group ANR 444

A General Scoring Rule for Randomized Kernel Approximation with Application to Canonical Correlation Analysis. A Review of Object Detection Models based on Convolutional Neural Network. Hypothesis Testing Interpretations and Renyi Differential Privacy. Cross Attention Network for Semantic Segmentation. DeepLofargram: A Deep Learning based Fluctuati …

A General Scoring Rule for Randomized Kernel Approximation with Application to Canonical Correlation Analysis

Title A General Scoring Rule for Randomized Kernel Approximation with Application to Canonical Correlation Analysis
Authors Yinsong Wang, Shahin Shahrampour
Abstract Random features have been widely used for kernel approximation in large-scale machine learning. A number of recent studies have explored data-dependent sampling of features, modifying the stochastic oracle from which random features are sampled. While the techniques proposed in this line of work improve the approximation, their application is limited to a specific learning task. In this paper, we propose a general scoring rule for sampling random features, which can be employed for various applications with some adjustments. We first observe that our method can recover a number of data-dependent sampling methods (e.g., leverage scores and energy-based sampling). Then, we restrict our attention to a ubiquitous problem in statistics and machine learning, namely Canonical Correlation Analysis (CCA). We provide a principled guide for finding the distribution maximizing the canonical correlations, resulting in a novel data-dependent method for sampling features. Numerical experiments verify that our algorithm consistently outperforms other sampling techniques in the CCA task.
Tasks
Published 2019-10-11
URL https://arxiv.org/abs/1910.05384v2
PDF https://arxiv.org/pdf/1910.05384v2.pdf
PWC https://paperswithcode.com/paper/a-general-scoring-rule-for-randomized-kernel
Repo
Framework
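
To make the data-dependent sampling idea concrete, here is a minimal NumPy sketch: draw a large pool of random Fourier frequencies from the base Gaussian spectral density, score each candidate feature on the data, and resample proportionally to the scores. The default energy-style score and all function names below are illustrative assumptions, not the paper's scoring rule.

```python
import numpy as np

def rff_features(X, W, b):
    """Standard random Fourier features for the Gaussian kernel: z(x) = sqrt(2/M) cos(Wx + b)."""
    return np.sqrt(2.0 / W.shape[0]) * np.cos(X @ W.T + b)

def resample_features(X, n_candidates=2000, n_keep=200, sigma=1.0,
                      score_fn=None, rng=np.random.default_rng(0)):
    """Generic data-dependent feature sampling (a sketch, not the paper's exact rule):
    score a pool of candidate features and keep a subset drawn proportionally."""
    d = X.shape[1]
    W = rng.normal(scale=1.0 / sigma, size=(n_candidates, d))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_candidates)
    Z = rff_features(X, W, b)                      # (n, n_candidates)
    if score_fn is None:
        scores = (Z ** 2).mean(axis=0)             # simple energy-style score on the data
    else:
        scores = score_fn(Z)                       # e.g. a leverage-score-style rule
    p = scores / scores.sum()
    idx = rng.choice(n_candidates, size=n_keep, replace=False, p=p)
    return W[idx], b[idx], p[idx]
```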

A Review of Object Detection Models based on Convolutional Neural Network

Title A Review of Object Detection Models based on Convolutional Neural Network
Authors F. Sultana, A. Sufian, P. Dutta
Abstract Convolutional Neural Networks (CNNs) have become the state of the art for object detection in images. In this chapter, we explain different state-of-the-art CNN-based object detection models, categorized according to two different approaches: the two-stage approach and the one-stage approach. The chapter traces advancements in object detection models from R-CNN to the latest RefineDet, discusses the model description and training details of each model, and draws a comparison among the models.
Tasks Object Detection
Published 2019-05-05
URL https://arxiv.org/abs/1905.01614v3
PDF https://arxiv.org/pdf/1905.01614v3.pdf
PWC https://paperswithcode.com/paper/a-review-of-object-detection-models-based-on
Repo
Framework

Hypothesis Testing Interpretations and Renyi Differential Privacy

Title Hypothesis Testing Interpretations and Renyi Differential Privacy
Authors Borja Balle, Gilles Barthe, Marco Gaboardi, Justin Hsu, Tetsuya Sato
Abstract Differential privacy is a de facto standard in data privacy, with applications in the public and private sectors. A way of explaining differential privacy that is particularly appealing to statisticians and social scientists is its statistical hypothesis testing interpretation. Informally, one cannot effectively test whether a specific individual has contributed her data by observing the output of a private mechanism: no test can have both high significance and high power. In this paper, we identify some conditions under which a privacy definition given in terms of a statistical divergence satisfies a similar interpretation. These conditions are useful for analyzing the distinguishability power of divergences, and we use them to study the hypothesis testing interpretation of some relaxations of differential privacy based on Renyi divergence. This analysis also results in an improved conversion rule between these definitions and differential privacy.
Tasks
Published 2019-05-24
URL https://arxiv.org/abs/1905.09982v2
PDF https://arxiv.org/pdf/1905.09982v2.pdf
PWC https://paperswithcode.com/paper/hypothesis-testing-interpretations-and-renyi
Repo
Framework
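
For reference, the Rényi divergence underlying these relaxations and the widely used baseline conversion from Rényi differential privacy (RDP) to (ε, δ)-differential privacy are shown below; the paper's contribution is a hypothesis-testing analysis that tightens this kind of conversion, which is not reproduced here.

```latex
% Renyi divergence of order \alpha > 1 between distributions P and Q
D_\alpha(P \,\|\, Q) = \frac{1}{\alpha - 1}
  \log \mathbb{E}_{x \sim Q}\!\left[\left(\frac{P(x)}{Q(x)}\right)^{\alpha}\right]

% Baseline conversion: if a mechanism satisfies (\alpha, \epsilon)-RDP, then for any
% \delta \in (0, 1) it satisfies
\left(\epsilon + \frac{\log(1/\delta)}{\alpha - 1},\ \delta\right)\text{-differential privacy.}
```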

Cross Attention Network for Semantic Segmentation

Title Cross Attention Network for Semantic Segmentation
Authors Mengyu Liu, Hujun Yin
Abstract In this paper, we address the semantic segmentation task with a deep network that combines contextual features and spatial information. The proposed Cross Attention Network is composed of two branches and a Feature Cross Attention (FCA) module. Specifically, a shallow branch is used to preserve low-level spatial information and a deep branch is employed to extract high-level contextual features. Then the FCA module is introduced to combine these two branches. Different from most existing attention mechanisms, the FCA module obtains a spatial attention map and a channel attention map from the two branches separately, and then fuses them. The contextual features are used to provide global contextual guidance in the fused feature maps, and the spatial features are used to refine localization. The proposed network outperforms other real-time methods with improved speed on the Cityscapes and CamVid datasets with lightweight backbones, and achieves state-of-the-art performance with a deep backbone.
Tasks Semantic Segmentation
Published 2019-07-25
URL https://arxiv.org/abs/1907.10958v1
PDF https://arxiv.org/pdf/1907.10958v1.pdf
PWC https://paperswithcode.com/paper/cross-attention-network-for-semantic
Repo
Framework
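
A minimal PyTorch sketch of a two-branch fusion in this spirit is shown below: channel attention derived from the deep (contextual) branch and spatial attention derived from the shallow (spatial) branch are applied to the fused features. The layer choices are assumptions for illustration, not the authors' exact FCA module.

```python
import torch.nn as nn

class FeatureCrossAttention(nn.Module):
    """Sketch of a feature cross-attention fusion (assumed layout, not the paper's code)."""
    def __init__(self, channels):
        super().__init__()
        self.channel_fc = nn.Sequential(               # channel attention from the deep branch
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.spatial_conv = nn.Sequential(             # spatial attention from the shallow branch
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, shallow, deep):
        fused = shallow + deep
        fused = fused * self.channel_fc(deep)          # global contextual guidance
        fused = fused * self.spatial_conv(shallow)     # refine localization
        return fused
```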

DeepLofargram: A Deep Learning based Fluctuating Dim Frequency Line Detection and Recovery

Title DeepLofargram: A Deep Learning based Fluctuating Dim Frequency Line Detection and Recovery
Authors Yina Han, Yuyan Li, Qingyu Liu, Yuanliang Ma
Abstract This paper investigates the problem of dim frequency line detection and recovery in the so-called lofargram. Theoretically, sufficiently long time integration can always enhance the detection characteristic, but this does not hold for irregularly fluctuating lines. Deep learning has been shown to perform very well on sophisticated visual inference tasks: by composing multiple processing layers, very complex high-level representations can be learned that amplify the important aspects of the input while suppressing irrelevant variations. Hence we propose DeepLofargram, composed of a deep convolutional neural network and its visualization counterpart. Plugged into a specifically designed multi-task loss, end-to-end training jointly learns to detect lines and recover their spatial locations. Leveraging this deep architecture, the performance boundary is -24 dB on average, and -26 dB in some cases. This is far beyond human visual perception and significantly improves on the state of the art.
Tasks
Published 2019-12-02
URL https://arxiv.org/abs/1912.00605v1
PDF https://arxiv.org/pdf/1912.00605v1.pdf
PWC https://paperswithcode.com/paper/deeplofargram-a-deep-learning-based
Repo
Framework
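
The "multi-task loss" mentioned above could, for example, combine a per-image detection term with a pixel-wise recovery term. The sketch below is a hedged illustration of such a joint objective; the weighting, heads, and names are assumptions, not the paper's loss.

```python
import torch.nn as nn

class DetectAndRecoverLoss(nn.Module):
    """Illustrative joint detection + recovery objective (not the paper's exact loss).

    det_logit:  (B,) scalar logits -- is there any frequency line in the lofargram?
    mask_logit: (B, 1, H, W) logits -- where does the line lie in time-frequency?
    """
    def __init__(self, recover_weight=1.0):
        super().__init__()
        self.det = nn.BCEWithLogitsLoss()
        self.rec = nn.BCEWithLogitsLoss()
        self.w = recover_weight

    def forward(self, det_logit, mask_logit, has_line, line_mask):
        loss = self.det(det_logit, has_line.float())
        pos = has_line.bool()
        if pos.any():
            # only supervise recovery on samples that actually contain a line
            loss = loss + self.w * self.rec(mask_logit[pos], line_mask[pos].float())
        return loss
```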

MRI Tissue Magnetism Quantification through Total Field Inversion with Deep Neural Networks

Title MRI Tissue Magnetism Quantification through Total Field Inversion with Deep Neural Networks
Authors Juan Liu, Kevin M. Koch
Abstract Quantitative susceptibility mapping (QSM) utilizes MRI signal phase to infer estimates of local tissue magnetism (magnetic susceptibility), which has been shown useful for providing novel image contrast and as a biomarker of abnormal tissue. QSM requires addressing a challenging post-processing problem: filtering of image phase estimates and inversion of the phase-to-susceptibility relationship. A wide variety of quantification errors, robustness limitations, and artifacts plague QSM algorithms. To overcome these limitations, a robust deep-learning-based single-step QSM reconstruction approach is proposed and demonstrated. This neural network was trained using magnetostatic physics simulations based on in-vivo data sources. Random perturbations were added to the physics simulations to provide sufficient input-label pairs for training purposes. The network was quantitatively tested using gold-standard in-silico labeled datasets against established QSM total field inversion approaches. In addition, the algorithm was applied to susceptibility-weighted imaging (SWI) data collected on a cohort of clinical subjects with brain hemorrhage. When quantitatively compared against gold-standard in-silico labels, the proposed algorithm outperformed the existing comparable approaches. High-quality QSM was consistently estimated from clinical susceptibility-weighted data on 100 subjects without any noticeable inversion failures. The proposed approach was able to robustly generate high-quality QSM with improved accuracy in in-silico gold-standard experiments. QSM produced by the proposed method can be generated in real-time on existing MRI scanner platforms and provides enhanced visualization and quantification of magnetism-based tissue contrasts.
Tasks
Published 2019-04-11
URL http://arxiv.org/abs/1904.07105v1
PDF http://arxiv.org/pdf/1904.07105v1.pdf
PWC https://paperswithcode.com/paper/mri-tissue-magnetism-quantification-through
Repo
Framework
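
The "magnetostatic physics simulations" used for training are typically built around the k-space dipole kernel that maps a susceptibility map to the field (phase) it induces. The NumPy sketch below generates such input-label pairs in normalized units; this is standard QSM forward physics rather than the authors' training pipeline, and the function names are illustrative.

```python
import numpy as np

def dipole_kernel(shape, voxel_size=(1.0, 1.0, 1.0), b0_dir=(0.0, 0.0, 1.0)):
    """k-space dipole kernel D(k) = 1/3 - (k . b0)^2 / |k|^2 (normalized units)."""
    axes = [np.fft.fftfreq(n, d=v) for n, v in zip(shape, voxel_size)]
    KX, KY, KZ = np.meshgrid(*axes, indexing="ij")
    k2 = KX**2 + KY**2 + KZ**2
    kdotb = KX * b0_dir[0] + KY * b0_dir[1] + KZ * b0_dir[2]
    with np.errstate(divide="ignore", invalid="ignore"):
        D = 1.0 / 3.0 - (kdotb**2) / k2
    D[k2 == 0] = 0.0
    return D

def susceptibility_to_field(chi, **kwargs):
    """Forward magnetostatic model: convolve susceptibility with the dipole kernel."""
    D = dipole_kernel(chi.shape, **kwargs)
    return np.real(np.fft.ifftn(D * np.fft.fftn(chi)))

# a synthetic input-label pair: random susceptibility perturbation -> total field
chi = np.random.default_rng(0).normal(scale=0.05, size=(64, 64, 64))
field = susceptibility_to_field(chi)   # network input; chi is the training label
```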

Privacy Risks of Explaining Machine Learning Models

Title Privacy Risks of Explaining Machine Learning Models
Authors Reza Shokri, Martin Strobel, Yair Zick
Abstract Can an adversary exploit model explanations to infer sensitive information about the models’ training set? To investigate this question, we first focus on membership inference attacks: given a data point and a model explanation, the attacker’s goal is to decide whether or not the point belongs to the training data. We study this problem for two popular transparency methods: gradient-based attribution methods and record-based influence measures. We develop membership inference attacks based on these model explanations, and extensively test them on a variety of datasets. For gradient-based methods, we show that the explanations can leak a significant amount of information about the individual data points in the training set, well beyond what is leaked through the predicted labels. We also show that record-based measures can be effectively, and even more significantly, exploited for membership inference attacks. More importantly, we design reconstruction attacks against this class of model explanations. We demonstrate that they can be exploited to recover significant parts of the training set. Finally, our results indicate that minorities and outliers are more vulnerable to these types of attacks than the rest of the population. Thus, there is a significant disparity in the privacy risks of model explanations across different groups.
Tasks
Published 2019-06-29
URL https://arxiv.org/abs/1907.00164v4
PDF https://arxiv.org/pdf/1907.00164v4.pdf
PWC https://paperswithcode.com/paper/privacy-risks-of-explaining-machine-learning
Repo
Framework
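
As a concrete (and deliberately simplified) illustration of membership inference from explanations, a threshold attack can be run on a per-example explanation statistic such as the norm of a gradient-based attribution. The sketch below is a stand-in for the attacks studied in the paper, not their implementation; it assumes a shadow set with known membership and that members tend to have smaller attribution norms near convergence.

```python
import numpy as np

def threshold_membership_attack(expl_scores_members, expl_scores_nonmembers):
    """Pick the threshold on an explanation statistic that best separates a
    shadow set of members from non-members (toy sketch, illustrative names)."""
    scores = np.concatenate([expl_scores_members, expl_scores_nonmembers])
    labels = np.concatenate([np.ones(len(expl_scores_members)),
                             np.zeros(len(expl_scores_nonmembers))])
    best_t, best_acc = None, 0.0
    for t in np.unique(scores):
        pred = (scores <= t).astype(float)        # predict "member" below the threshold
        acc = float((pred == labels).mean())
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc
```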

Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process

Title Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process
Authors Guy Blanc, Neha Gupta, Gregory Valiant, Paul Valiant
Abstract We consider deep networks, trained via stochastic gradient descent to minimize L2 loss, with the training labels perturbed by independent noise at each iteration. We characterize the behavior of the training dynamics near any parameter vector that achieves zero training error, in terms of an implicit regularization term corresponding to the sum over the data points, of the squared L2 norm of the gradient of the model with respect to the parameter vector, evaluated at each data point. We then leverage this general characterization, which holds for networks of any connectivity, width, depth, and choice of activation function, to show that for 2-layer ReLU networks of arbitrary width and L2 loss, when trained on one-dimensional labeled data $(x_1,y_1),\ldots,(x_n,y_n),$ the only stable solutions with zero training error correspond to functions that: 1) are linear over any set of three or more co-linear training points (i.e. the function has no extra “kinks”); and 2) change convexity the minimum number of times that is necessary to fit the training data. Additionally, for 2-layer networks of arbitrary width, with tanh or logistic activations, we show that when trained on a single $d$-dimensional point $(x,y)$ the only stable solutions correspond to networks where the activations of all hidden units at the datapoint, and all weights from the hidden units to the output, take at most two distinct values, or are zero. In this sense, we show that when trained on “simple” data, models corresponding to stable parameters are also “simple”; in short, despite fitting in an over-parameterized regime where the vast majority of expressible functions are complicated and badly behaved, stable parameters reached by training with noise express nearly the “simplest possible” hypothesis consistent with the data. These results shed light on the mystery of why deep networks generalize so well in practice.
Tasks
Published 2019-04-19
URL http://arxiv.org/abs/1904.09080v1
PDF http://arxiv.org/pdf/1904.09080v1.pdf
PWC https://paperswithcode.com/paper/implicit-regularization-for-deep-neural
Repo
Framework
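
The implicit regularizer described in the abstract is $R(\theta) = \sum_i \lVert \nabla_\theta f(x_i; \theta) \rVert_2^2$, the sum over training points of the squared L2 norm of the model's gradient with respect to its parameters. A small PyTorch helper to evaluate this quantity (a sketch assuming a scalar-output model, not code from the paper) might look like:

```python
import torch

def implicit_regularizer(model, xs):
    """Evaluate R(theta) = sum_i || d f(x_i; theta) / d theta ||_2^2 for a scalar-output model."""
    params = [p for p in model.parameters() if p.requires_grad]
    total = torch.zeros(())
    for x in xs:
        out = model(x.unsqueeze(0)).squeeze()      # assumes f(x; theta) is a scalar
        grads = torch.autograd.grad(out, params)
        total = total + sum(g.pow(2).sum() for g in grads)
    return total
```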

Adaptive Graphical Model Network for 2D Handpose Estimation

Title Adaptive Graphical Model Network for 2D Handpose Estimation
Authors Deying Kong, Yifei Chen, Haoyu Ma, Xiangyi Yan, Xiaohui Xie
Abstract In this paper, we propose a new architecture called Adaptive Graphical Model Network (AGMN) to tackle the task of 2D hand pose estimation from a monocular RGB image. The AGMN consists of two branches of deep convolutional neural networks for calculating unary and pairwise potential functions, followed by a graphical model inference module for integrating unary and pairwise potentials. Unlike existing architectures proposed to combine DCNNs with graphical models, our AGMN is novel in that the parameters of its graphical model are conditioned on and fully adaptive to individual input images. Experiments show that our approach outperforms the state-of-the-art method used in 2D hand keypoints estimation by a notable margin on two public datasets.
Tasks Hand Pose Estimation, Pose Estimation
Published 2019-09-18
URL https://arxiv.org/abs/1909.08205v1
PDF https://arxiv.org/pdf/1909.08205v1.pdf
PWC https://paperswithcode.com/paper/adaptive-graphical-model-network-for-2d
Repo
Framework
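
To give a feel for what "graphical model inference with image-conditioned parameters" can look like over keypoint heatmaps, here is a toy message-passing sketch in which the pairwise kernels are predicted per image. The shapes, update rule, and names are illustrative assumptions only, not the AGMN implementation.

```python
import torch
import torch.nn.functional as F

def adaptive_message_passing(unary, pairwise_kernels, edges, n_iter=3):
    """Toy inference over keypoint heatmaps with per-image ("adaptive") pairwise kernels.

    unary:            (B, K, H, W) heatmaps from the unary branch
    pairwise_kernels: dict mapping each directed edge (src, dst) to kernels of
                      shape (B, 1, h, w) from the pairwise branch (h, w odd)
    edges:            undirected (i, j) pairs describing the hand skeleton
    """
    beliefs = unary.clone()
    for _ in range(n_iter):
        msgs = torch.zeros_like(beliefs)
        for i, j in edges:
            for src, dst in ((i, j), (j, i)):
                k = pairwise_kernels[(src, dst)]
                b = beliefs[:, src:src + 1]
                # kernels differ per image, so convolve each batch element separately
                m = torch.cat([F.conv2d(b[n:n + 1], k[n:n + 1], padding=k.shape[-1] // 2)
                               for n in range(b.shape[0])])
                msgs[:, dst:dst + 1] += m
        beliefs = F.softmax((unary + msgs).flatten(2), dim=-1).view_as(unary)
    return beliefs
```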

Towards Markerless Grasp Capture

Title Towards Markerless Grasp Capture
Authors Samarth Brahmbhatt, Charles C. Kemp, James Hays
Abstract Humans excel at grasping objects and manipulating them. Capturing human grasps is important for understanding grasping behavior and reconstructing it realistically in Virtual Reality (VR). However, grasp capture - capturing the pose of a hand grasping an object, and orienting it w.r.t. the object - is difficult because of the complexity and diversity of the human hand, and occlusion. Reflective markers and magnetic trackers traditionally used to mitigate this difficulty introduce undesirable artifacts in images and can interfere with natural grasping behavior. We present preliminary work on a completely marker-less algorithm for grasp capture from a video depicting a grasp. We show how recent advances in 2D hand pose estimation can be used with well-established optimization techniques. Uniquely, our algorithm can also capture hand-object contact in detail and integrate it in the grasp capture process. This is work in progress; more details are available at https://contactdb.cc.gatech.edu/grasp_capture.html.
Tasks Hand Pose Estimation, Pose Estimation
Published 2019-07-17
URL https://arxiv.org/abs/1907.07388v1
PDF https://arxiv.org/pdf/1907.07388v1.pdf
PWC https://paperswithcode.com/paper/towards-markerless-grasp-capture
Repo
Framework

GumDrop at the DISRPT2019 Shared Task: A Model Stacking Approach to Discourse Unit Segmentation and Connective Detection

Title GumDrop at the DISRPT2019 Shared Task: A Model Stacking Approach to Discourse Unit Segmentation and Connective Detection
Authors Yue Yu, Yilun Zhu, Yang Liu, Yan Liu, Siyao Peng, Mackenzie Gong, Amir Zeldes
Abstract In this paper we present GumDrop, Georgetown University’s entry at the DISRPT 2019 Shared Task on automatic discourse unit segmentation and connective detection. Our approach relies on model stacking, creating a heterogeneous ensemble of classifiers, which feed into a metalearner for each final task. The system encompasses three trainable component stacks: one for sentence splitting, one for discourse unit segmentation and one for connective detection. The flexibility of each ensemble allows the system to generalize well to datasets of different sizes and with varying levels of homogeneity.
Tasks
Published 2019-04-23
URL https://arxiv.org/abs/1904.10419v2
PDF https://arxiv.org/pdf/1904.10419v2.pdf
PWC https://paperswithcode.com/paper/gumdrop-at-the-disrpt2019-shared-task-a-model
Repo
Framework
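
Model stacking of the kind described above (heterogeneous base classifiers feeding a metalearner via out-of-fold predictions) can be expressed compactly in scikit-learn. The base learners and metalearner below are generic placeholders for illustration, not GumDrop's actual components.

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Generic stacking setup: heterogeneous base learners + a metalearner.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("svm", LinearSVC(C=1.0)),
]
segmenter = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),  # the metalearner
    stack_method="auto",   # uses predict_proba / decision_function as meta-features
    cv=5,                  # out-of-fold predictions avoid leaking training labels
)
# segmenter.fit(X_token_features, y_boundary_labels)   # hypothetical feature/label names
```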

Learning Multi-Level Information for Dialogue Response Selection by Highway Recurrent Transformer

Title Learning Multi-Level Information for Dialogue Response Selection by Highway Recurrent Transformer
Authors Ting-Rui Chiang, Chao-Wei Huang, Shang-Yu Su, Yun-Nung Chen
Abstract With the increasing research interest in dialogue response generation, there is an emerging branch formulating this task as selecting next sentences, where given the partial dialogue contexts, the goal is to determine the most probable next sentence. Following the recent success of the Transformer model, this paper proposes (1) a new variant of attention mechanism based on multi-head attention, called highway attention, and (2) a recurrent model based on the Transformer and the proposed highway attention, the so-called Highway Recurrent Transformer. Experiments on the response selection task in the seventh Dialog System Technology Challenge (DSTC7) show that the proposed model is capable of modeling both utterance-level and dialogue-level information; the effectiveness of each module is further analyzed as well.
Tasks
Published 2019-03-21
URL http://arxiv.org/abs/1903.08953v1
PDF http://arxiv.org/pdf/1903.08953v1.pdf
PWC https://paperswithcode.com/paper/learning-multi-level-information-for-dialogue
Repo
Framework
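
One plausible reading of a "highway attention" layer is multi-head attention whose output is mixed with the untouched input through a learned carry gate, in the spirit of highway networks. The PyTorch sketch below reflects that reading only; the paper's exact formulation may differ.

```python
import torch
import torch.nn as nn

class HighwayAttention(nn.Module):
    """Highway-gated multi-head attention (assumed formulation, not the paper's code)."""
    def __init__(self, d_model, n_heads):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(d_model, d_model)

    def forward(self, query, context):
        attended, _ = self.attn(query, context, context)
        g = torch.sigmoid(self.gate(query))            # per-dimension carry gate
        return g * attended + (1.0 - g) * query        # highway-style mix with the input
```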

A Spectral Nonlocal Block for Neural Networks

Title A Spectral Nonlocal Block for Neural Networks
Authors Lei Zhu, Qi She, Lidan Zhang, Ping Guo
Abstract The nonlocal-based blocks are designed for capturing long-range spatial-temporal dependencies in computer vision tasks. Although they have shown excellent performance, they lack a mechanism to encode the rich, structured information among elements in an image. In this paper, to theoretically analyze the property of these nonlocal-based blocks, we provide a unified approach to interpreting them, where we view them as a graph filter generated on a fully-connected graph. When the graph filter is approximated by Chebyshev polynomials, a generalized formulation can be derived for explaining the existing nonlocal-based blocks ($\mathit{e.g.,}$ nonlocal block, nonlocal stage, double attention block). Furthermore, we propose an efficient and robust spectral nonlocal block, which can be flexibly inserted into deep neural networks to capture the long-range dependencies between spatial pixels or temporal frames. Experimental results demonstrate the clear-cut improvements and practical applicability of the spectral nonlocal block on image classification (Cifar-10/100, ImageNet), fine-grained image classification (CUB-200), action recognition (UCF-101), and person re-identification (ILID-SVID, Mars, Prid-2011) tasks.
Tasks Fine-Grained Image Classification, Image Classification, Person Re-Identification, Video Classification
Published 2019-11-04
URL https://arxiv.org/abs/1911.01059v4
PDF https://arxiv.org/pdf/1911.01059v4.pdf
PWC https://paperswithcode.com/paper/a-spectral-nonlocal-block-for-neural-networks
Repo
Framework
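
The "graph filter" view can be illustrated with a standard nonlocal block: pixels are nodes of a fully connected graph whose affinity matrix is computed from embedded features, and the output applies a polynomial filter in that matrix to the values. The sketch below uses only a first-order filter; the paper's spectral block uses a Chebyshev-polynomial filter, which is not reproduced here.

```python
import torch
import torch.nn as nn

class NonlocalAsGraphFilter(nn.Module):
    """Simplified first-order graph-filter reading of a nonlocal block (illustrative only)."""
    def __init__(self, channels, reduced):
        super().__init__()
        self.theta = nn.Conv2d(channels, reduced, 1)
        self.phi = nn.Conv2d(channels, reduced, 1)
        self.g = nn.Conv2d(channels, reduced, 1)
        self.out = nn.Conv2d(reduced, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        t = self.theta(x).flatten(2).transpose(1, 2)   # (B, N, r)
        p = self.phi(x).flatten(2)                     # (B, r, N)
        v = self.g(x).flatten(2).transpose(1, 2)       # (B, N, r)
        A = torch.softmax(t @ p, dim=-1)               # row-normalized affinity over the full graph
        y = (A @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                         # residual connection, as in nonlocal blocks
```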

Sepsis World Model: A MIMIC-based OpenAI Gym “World Model” Simulator for Sepsis Treatment

Title Sepsis World Model: A MIMIC-based OpenAI Gym “World Model” Simulator for Sepsis Treatment
Authors Amirhossein Kiani, Chris Wang, Angela Xu
Abstract Sepsis is a life-threatening condition caused by the body’s response to an infection. In order to treat patients with sepsis, physicians must control varying dosages of various antibiotics, fluids, and vasopressors based on a large number of variables in an emergency setting. In this project we employ a “world model” methodology to create a simulator that aims to predict the next state of a patient given a current state and treatment action. In doing so, we hope our simulator learns from a latent and less noisy representation of the EHR data. Using historical sepsis patient records from the MIMIC dataset, our method creates an OpenAI Gym simulator that leverages a Variational Auto-Encoder and a Mixture Density Network combined with an RNN (MDN-RNN) to model the trajectory of any sepsis patient in the hospital. To reduce the effects of noise, we sample from a generated distribution of next steps during simulation and have the option of introducing uncertainty into our simulator by controlling the “temperature” variable. It is worth noting that we do not have access to the ground truth for the best policy because we can only evaluate learned policies by real-world experimentation or expert feedback. Instead, we aim to study our simulator model’s performance by evaluating the similarity between our environment’s rollouts with the real EHR data and assessing its viability for learning a realistic policy for sepsis treatment using Deep Q-Learning.
Tasks Q-Learning
Published 2019-12-15
URL https://arxiv.org/abs/1912.07127v1
PDF https://arxiv.org/pdf/1912.07127v1.pdf
PWC https://paperswithcode.com/paper/sepsis-world-model-a-mimic-based-openai-gym
Repo
Framework
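
The "temperature" knob in MDN-RNN world-model rollouts is usually applied by flattening the mixture weights and widening each Gaussian component before sampling the next latent state. The NumPy sketch below illustrates that idea; it is a generic world-model-style sampler, not the authors' code.

```python
import numpy as np

def sample_mdn(pi_logits, mu, log_sigma, temperature=1.0, rng=np.random.default_rng()):
    """Sample a next latent state from MDN outputs with a temperature knob.

    pi_logits: (K,) mixture logits; mu, log_sigma: (K, D) component parameters.
    Higher temperature injects more uncertainty into the simulated trajectory.
    """
    tau = max(temperature, 1e-6)
    logits = pi_logits / tau                       # flatten the mixture weights
    pi = np.exp(logits - logits.max())
    pi /= pi.sum()
    k = rng.choice(len(pi), p=pi)                  # pick a mixture component
    sigma = np.exp(log_sigma[k]) * np.sqrt(tau)    # widen the chosen Gaussian
    return mu[k] + sigma * rng.standard_normal(mu.shape[1])
```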

Goodness-of-fit tests on manifolds

Title Goodness-of-fit tests on manifolds
Authors Alexander Shapiro, Yao Xie, Rui Zhang
Abstract We develop a general theory for the goodness-of-fit test to non-linear models. In particular, we assume that the observations are noisy samples of a sub-manifold defined by a non-linear map of some intrinsic structures. The observation noise is additive Gaussian. Our main result shows that the “residual” of the model fit, by solving a non-linear least-square problem, follows a (possibly non-central) $\chi^2$ distribution. The parameters of the $\chi^2$ distribution are related to the model order and dimension of the problem. The main result is established by making a novel connection between statistical test and differential geometry. We further present a method to select the model orders sequentially. We demonstrate the broad application of the general theory in a range of applications in machine learning and signal processing, including determining the rank of low-rank (possibly complex-valued) matrices and tensors, from noisy, partial, or indirect observations, determining the number of sources in signal demixing, and potential applications in determining the number of hidden nodes in neural networks.
Tasks
Published 2019-09-11
URL https://arxiv.org/abs/1909.05229v1
PDF https://arxiv.org/pdf/1909.05229v1.pdf
PWC https://paperswithcode.com/paper/goodness-of-fit-tests-on-manifolds
Repo
Framework
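
In the simplest special case of the result above (known unit noise variance, central limit), the residual test reduces to comparing the residual sum of squares after a nonlinear least-squares fit with a chi-square quantile whose degrees of freedom depend on the model order. The SciPy sketch below shows only this textbook case with illustrative names; the paper's theory covers the general manifold and non-central settings.

```python
import numpy as np
from scipy import optimize, stats

def chi2_model_order_test(residual_fn, theta0, n_obs, model_dim, alpha=0.05):
    """Fit by nonlinear least squares and test the residual against chi2(n_obs - model_dim)."""
    fit = optimize.least_squares(residual_fn, theta0)   # residual_fn(theta) -> residual vector
    rss = float(np.sum(fit.fun ** 2))
    dof = n_obs - model_dim
    threshold = stats.chi2.ppf(1.0 - alpha, df=dof)
    return rss, threshold, rss <= threshold             # True: model order not rejected
```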