April 2, 2020

3168 words 15 mins read

Paper Group ANR 275

A Comparative Evaluation of Temporal Pooling Methods for Blind Video Quality Assessment. Graph Domain Adaptation for Alignment-Invariant Brain Surface Segmentation. Personal Health Knowledge Graphs for Patients. AI-GAN: Attack-Inspired Generation of Adversarial Examples. HOPE-Net: A Graph-based Model for Hand-Object Pose Estimation. Graph matching …

A Comparative Evaluation of Temporal Pooling Methods for Blind Video Quality Assessment

Title A Comparative Evaluation of Temporal Pooling Methods for Blind Video Quality Assessment
Authors Zhengzhong Tu, Chia-Ju Chen, Li-Heng Chen, Neil Birkbeck, Balu Adsumilli, Alan C. Bovik
Abstract Many objective video quality assessment (VQA) algorithms include a key step of temporal pooling of frame-level quality scores. However, little attention has been paid to the relative effectiveness of different pooling methods for no-reference (blind) VQA. Here we conduct a large-scale comparative evaluation to assess the capabilities and limitations of multiple temporal pooling strategies for blind VQA of user-generated videos. The study yields insights and general guidance on the selection and application of temporal pooling models. We also propose an ensemble pooling model built on top of the high-performing temporal pooling models. Our experiments, conducted with several popular VQA algorithms on two recent large-scale natural video quality databases, demonstrate the relative efficacies of the evaluated temporal pooling models. In addition to the new ensemble model, we provide a general recipe for applying temporal pooling to frame-based quality predictions.
Tasks Video Quality Assessment
Published 2020-02-25
URL https://arxiv.org/abs/2002.10651v1
PDF https://arxiv.org/pdf/2002.10651v1.pdf
PWC https://paperswithcode.com/paper/a-comparative-evaluation-of-temporal-pooling
Repo
Framework
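
A minimal sketch of what temporal pooling of frame-level quality scores can look like, assuming NumPy and a generic set of pooling rules (mean, harmonic mean, worst-percentile) plus a fixed-weight ensemble; the paper's actual pooling models and its ensemble are learned and considerably more elaborate.

```python
import numpy as np

def mean_pool(scores):
    return float(np.mean(scores))

def harmonic_mean_pool(scores, eps=1e-8):
    # The harmonic mean emphasizes low-quality frames.
    s = np.asarray(scores, dtype=float)
    return float(len(s) / np.sum(1.0 / (s + eps)))

def worst_percentile_pool(scores, p=10):
    # Pool only the worst p% of frames, reflecting viewers' sensitivity to poor moments.
    s = np.sort(np.asarray(scores, dtype=float))
    k = max(1, int(len(s) * p / 100))
    return float(np.mean(s[:k]))

def ensemble_pool(scores, weights=(0.4, 0.3, 0.3)):
    # Fixed-weight ensemble of the pooled values above (the weights here are arbitrary).
    pooled = np.array([mean_pool(scores),
                       harmonic_mean_pool(scores),
                       worst_percentile_pool(scores)])
    return float(np.dot(weights, pooled))

frame_scores = np.random.uniform(30, 80, size=300)   # fake per-frame quality predictions
print(ensemble_pool(frame_scores))
```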

Graph Domain Adaptation for Alignment-Invariant Brain Surface Segmentation

Title Graph Domain Adaptation for Alignment-Invariant Brain Surface Segmentation
Authors Karthik Gopinath, Christian Desrosiers, Herve Lombaert
Abstract The varying cortical geometry of the brain creates numerous challenges for its analysis. Recent developments have enabled learning surface data directly across multiple brain surfaces via graph convolutions on cortical data. However, current graph learning algorithms fail when brain surface data are misaligned across subjects, limiting their ability to handle data from multiple domains. Adversarial training is widely used for domain adaptation to improve segmentation performance across domains. In this paper, adversarial training is exploited to learn surface data across inconsistent graph alignments. This novel approach comprises a segmentator, which uses a set of graph convolution layers to enable parcellation directly across brain surfaces in a source domain, and a discriminator, which predicts the graph domain from segmentations. More precisely, the proposed adversarial network learns to generalize a parcellation across both source and target domains. We demonstrate an 8% mean improvement in performance over a non-adversarial training strategy applied to multiple target domains extracted from MindBoggle, the largest publicly available manually-labeled brain surface dataset.
Tasks Domain Adaptation
Published 2020-03-31
URL https://arxiv.org/abs/2004.00074v1
PDF https://arxiv.org/pdf/2004.00074v1.pdf
PWC https://paperswithcode.com/paper/graph-domain-adaptation-for-alignment
Repo
Framework
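
To make the adversarial setup concrete, here is a heavily simplified PyTorch sketch, assuming random toy data, a plain dense-adjacency graph convolution, and arbitrary layer sizes; the paper's segmentator, discriminator, and surface graphs are substantially more involved.

```python
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)
    def forward(self, x, adj):
        # Simple neighborhood averaging followed by a linear map.
        return self.lin(adj @ x)

class Segmentator(nn.Module):
    def __init__(self, in_dim=6, hidden=32, n_parcels=32):
        super().__init__()
        self.gc1 = GraphConv(in_dim, hidden)
        self.gc2 = GraphConv(hidden, n_parcels)
    def forward(self, x, adj):
        return self.gc2(torch.relu(self.gc1(x, adj)), adj)   # per-node parcel logits

class Discriminator(nn.Module):
    def __init__(self, n_parcels=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_parcels, 64), nn.ReLU(), nn.Linear(64, 1))
    def forward(self, seg_logits):
        # Pool node-wise predictions to a single graph-level domain logit.
        return self.net(seg_logits.softmax(-1).mean(dim=0))

# Toy data: N surface nodes, row-normalized adjacency, features, source labels.
N = 200
adj = torch.eye(N) + torch.rand(N, N).round()
adj = adj / adj.sum(1, keepdim=True)
x_src, x_tgt = torch.randn(N, 6), torch.randn(N, 6)
y_src = torch.randint(0, 32, (N,))

seg, disc = Segmentator(), Discriminator()
opt_s = torch.optim.Adam(seg.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(100):
    # 1) Discriminator: source -> 1, target -> 0, on detached segmentations.
    d_loss = bce(disc(seg(x_src, adj).detach()), torch.ones(1)) + \
             bce(disc(seg(x_tgt, adj).detach()), torch.zeros(1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # 2) Segmentator: supervised loss on source + fool the discriminator on target.
    s_loss = nn.functional.cross_entropy(seg(x_src, adj), y_src) + \
             0.1 * bce(disc(seg(x_tgt, adj)), torch.ones(1))
    opt_s.zero_grad(); s_loss.backward(); opt_s.step()
```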

Personal Health Knowledge Graphs for Patients

Title Personal Health Knowledge Graphs for Patients
Authors Nidhi Rastogi, Mohammed J. Zaki
Abstract Existing patient data analytics platforms fail to incorporate information that is contextual, personal, and topical to patients. For a recommendation system to give a suitable response to a query or to derive meaningful insights from patient data, it should consider personal information about the patient’s health history, including but not limited to their preferences, locations, and life choices that currently apply to them. In this review paper, we critique the existing literature in this space and discuss the various research challenges that come with designing, building, and operationalizing a personal health knowledge graph (PHKG) for patients.
Tasks Knowledge Graphs
Published 2020-03-31
URL https://arxiv.org/abs/2004.00071v1
PDF https://arxiv.org/pdf/2004.00071v1.pdf
PWC https://paperswithcode.com/paper/personal-health-knowledge-graphs-for-patients
Repo
Framework
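
As a toy illustration of the kind of structure a PHKG holds, the following sketch represents personal health facts as subject-predicate-object triples with a trivial pattern query; the entities and relations are invented for illustration and are not taken from the paper.

```python
# Invented example facts; a real PHKG would link to clinical ontologies and
# continuously ingest data from EHRs, wearables, and patient-reported sources.
phkg = [
    ("patient:alice", "hasCondition", "condition:type2_diabetes"),
    ("patient:alice", "takesMedication", "drug:metformin"),
    ("patient:alice", "prefers", "diet:vegetarian"),
    ("patient:alice", "livesIn", "location:rural_area"),
    ("drug:metformin", "treats", "condition:type2_diabetes"),
]

def query(graph, subject=None, predicate=None, obj=None):
    """Return all triples matching the given (possibly partial) pattern."""
    return [(s, p, o) for (s, p, o) in graph
            if (subject is None or s == subject)
            and (predicate is None or p == predicate)
            and (obj is None or o == obj)]

print(query(phkg, subject="patient:alice", predicate="takesMedication"))
```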

AI-GAN: Attack-Inspired Generation of Adversarial Examples

Title AI-GAN: Attack-Inspired Generation of Adversarial Examples
Authors Tao Bai, Jun Zhao, Jinlin Zhu, Shoudong Han, Jiefeng Chen, Bo Li
Abstract Adversarial examples that can fool deep models are mainly crafted by adding small perturbations imperceptible to human eyes. There are various optimization-based methods in the literature for generating adversarial perturbations, most of which are time-consuming. AdvGAN, a method proposed by Xiao et al. at IJCAI 2018, employs Generative Adversarial Networks (GANs) to generate adversarial perturbations with original images as inputs, and is faster than optimization-based methods at inference time. However, AdvGAN fixes the target classes during training, and we find it difficult to train AdvGAN when it is modified to take both original images and target classes as inputs. In this paper, we propose Attack-Inspired GAN (AI-GAN) with a different training strategy to solve this problem. AI-GAN is a two-stage method: we use the projected gradient descent (PGD) attack to inspire the training of the GAN in the first stage and apply standard GAN training in the second stage. Once trained, the Generator can approximate the conditional distribution of adversarial instances and generate imperceptible adversarial perturbations given different target classes. We conduct experiments and evaluate the performance of AI-GAN on MNIST and CIFAR-10. Compared with AdvGAN, AI-GAN achieves higher attack success rates with similar perturbation magnitudes.
Tasks
Published 2020-02-06
URL https://arxiv.org/abs/2002.02196v1
PDF https://arxiv.org/pdf/2002.02196v1.pdf
PWC https://paperswithcode.com/paper/ai-gan-attack-inspired-generation-of
Repo
Framework
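
Since the first training stage is driven by the projected gradient descent (PGD) attack, here is a minimal targeted PGD sketch in PyTorch; the classifier, budget eps, step size alpha, and iteration count are placeholder choices, and the GAN itself is not shown.

```python
import torch
import torch.nn as nn

def pgd_attack(model, x, target, eps=0.3, alpha=0.01, steps=40):
    """Targeted PGD: push inputs toward `target` within an L-inf ball of radius eps."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Targeted attack: descend the loss toward the target class.
        x_adv = x_adv.detach() - alpha * grad.sign()
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0, 1)
    return x_adv

# Toy usage with an untrained classifier on MNIST-shaped inputs.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.rand(8, 1, 28, 28)
target = torch.randint(0, 10, (8,))
x_adv = pgd_attack(model, x, target)
print((x_adv - x).abs().max())   # perturbation stays within the eps ball
```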

HOPE-Net: A Graph-based Model for Hand-Object Pose Estimation

Title HOPE-Net: A Graph-based Model for Hand-Object Pose Estimation
Authors Bardia Doosti, Shujon Naha, Majid Mirbagheri, David Crandall
Abstract Hand-object pose estimation (HOPE) aims to jointly detect the poses of both a hand and a held object. In this paper, we propose a lightweight model called HOPE-Net which jointly estimates hand and object pose in 2D and 3D in real time. Our network uses a cascade of two adaptive graph convolutional neural networks, one to estimate the 2D coordinates of the hand joints and object corners, followed by another to convert the 2D coordinates to 3D. Our experiments show that through end-to-end training of the full network, we achieve better accuracy for both the 2D and 3D coordinate estimation problems. The proposed 2D-to-3D graph-convolution-based model could be applied to other 3D landmark detection problems where it is possible to first predict 2D keypoints and then transform them to 3D.
Tasks Pose Estimation
Published 2020-03-31
URL https://arxiv.org/abs/2004.00060v1
PDF https://arxiv.org/pdf/2004.00060v1.pdf
PWC https://paperswithcode.com/paper/hope-net-a-graph-based-model-for-hand-object
Repo
Framework
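
A rough sketch of the main ingredient of the 2D-to-3D stage: an adaptive graph convolution with a learnable adjacency over the hand-joint and object-corner nodes. The layer widths, node count, and two-layer cascade below are assumptions for illustration, not the published architecture.

```python
import torch
import torch.nn as nn

class AdaptiveGraphConv(nn.Module):
    """Graph convolution with a learnable (adaptive) adjacency over keypoint nodes."""
    def __init__(self, n_nodes, in_dim, out_dim):
        super().__init__()
        self.adj = nn.Parameter(torch.eye(n_nodes) + 0.01 * torch.randn(n_nodes, n_nodes))
        self.lin = nn.Linear(in_dim, out_dim)
    def forward(self, x):                        # x: (batch, n_nodes, in_dim)
        a = torch.softmax(self.adj, dim=-1)      # row-normalized learned adjacency
        return self.lin(a @ x)

class Lift2Dto3D(nn.Module):
    """Cascade of adaptive graph convs mapping 2D keypoints to 3D coordinates."""
    def __init__(self, n_nodes=29):              # e.g. 21 hand joints + 8 box corners
        super().__init__()
        self.gc1 = AdaptiveGraphConv(n_nodes, 2, 64)
        self.gc2 = AdaptiveGraphConv(n_nodes, 64, 3)
    def forward(self, kp2d):
        return self.gc2(torch.relu(self.gc1(kp2d)))

net = Lift2Dto3D()
kp2d = torch.rand(4, 29, 2)                      # fake 2D detections
print(net(kp2d).shape)                           # -> torch.Size([4, 29, 3])
```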

Graph matching between bipartite and unipartite networks: to collapse, or not to collapse, that is the question

Title Graph matching between bipartite and unipartite networks: to collapse, or not to collapse, that is the question
Authors Jesús Arroyo, Carey E. Priebe, Vince Lyzinski
Abstract Graph matching consists of aligning the vertices of two unlabeled graphs in order to maximize the shared structure across networks; when the graphs are unipartite, this is commonly formulated as minimizing their edge disagreements. In this paper, we address the common setting in which one of the graphs to match is a bipartite network and the other is unipartite. Commonly, the bipartite network is collapsed or projected into a unipartite graph, and graph matching proceeds as in the classical setting. This potentially leads to noisy edge estimates and loss of information. We formulate the graph matching problem between a bipartite and a unipartite graph using an undirected graphical model, and introduce methods to find the alignment with this model without collapsing. In simulations and real data examples, including a co-authorship-citation network pair and brain structural and functional data, we show that our methods can produce a more accurate matching than the naive approach of collapsing the bipartite network into a unipartite graph.
Tasks Graph Matching
Published 2020-02-05
URL https://arxiv.org/abs/2002.01648v1
PDF https://arxiv.org/pdf/2002.01648v1.pdf
PWC https://paperswithcode.com/paper/graph-matching-between-bipartite-and
Repo
Framework
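
A toy numerical illustration of the "collapse" baseline the paper argues against: the bipartite network is projected onto one side (B B^T) and then matched to the unipartite graph by minimizing edge disagreements, here by brute force over permutations of a tiny graph. This sketches only the classical formulation, not the paper's graphical-model approach.

```python
import itertools
import numpy as np

def match_by_brute_force(A, C):
    """Return the vertex permutation of C minimizing edge disagreements with A."""
    n = A.shape[0]
    best_perm, best_cost = None, np.inf
    for perm in itertools.permutations(range(n)):
        P = np.eye(n)[list(perm)]                       # permutation matrix
        cost = np.sum(np.abs(A - P @ C @ P.T))          # edge disagreements
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost

rng = np.random.default_rng(0)
n, k = 6, 10                        # n shared vertices, k vertices on the other side
A = rng.integers(0, 2, (n, n)); A = np.triu(A, 1); A = A + A.T     # unipartite graph
B = rng.integers(0, 2, (n, k))                                      # bipartite incidence
C = (B @ B.T > 0).astype(int); np.fill_diagonal(C, 0)               # collapsed projection

perm, cost = match_by_brute_force(A, C)
print("best permutation:", perm, "edge disagreements:", cost)
```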

A Note on Latency Variability of Deep Neural Networks for Mobile Inference

Title A Note on Latency Variability of Deep Neural Networks for Mobile Inference
Authors Luting Yang, Bingqian Lu, Shaolei Ren
Abstract Running deep neural network (DNN) inference on mobile devices, i.e., mobile inference, has become a growing trend, making inference less dependent on network connections and keeping private data local. Prior studies on optimizing DNNs for mobile inference typically focus on the metric of average inference latency, thus implicitly assuming that mobile inference exhibits little latency variability. In this note, we conduct a preliminary measurement study on the latency variability of DNNs for mobile inference. We show that inference latency variability can become quite significant in the presence of CPU resource contention. More interestingly, contrary to the common belief that the relative performance superiority of DNNs on one device carries over to another device and/or another level of resource contention, we highlight that a DNN model with better latency than another model can be outperformed by that model when resource contention becomes more severe or when running on another device. Thus, when optimizing DNN models for mobile inference, measuring only the average latency may not be adequate; instead, latency variability under various conditions should be accounted for, including but not limited to the different devices and different levels of CPU resource contention considered in this note.
Tasks
Published 2020-02-29
URL https://arxiv.org/abs/2003.00138v1
PDF https://arxiv.org/pdf/2003.00138v1.pdf
PWC https://paperswithcode.com/paper/a-note-on-latency-variability-of-deep-neural
Repo
Framework
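
A generic timing harness of the kind such a measurement study relies on (not the authors' benchmark): run a model repeatedly, record per-inference latency, and report the spread rather than just the mean.

```python
import time
import statistics
import torch
import torch.nn as nn

# Small stand-in CNN; any mobile model deployed on the device would take its place.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10)).eval()
x = torch.rand(1, 3, 224, 224)

latencies_ms = []
with torch.no_grad():
    for _ in range(10):                       # warm-up runs, discarded
        model(x)
    for _ in range(200):
        t0 = time.perf_counter()
        model(x)
        latencies_ms.append((time.perf_counter() - t0) * 1000)

lat = sorted(latencies_ms)
print(f"mean={statistics.mean(lat):.2f} ms  p50={lat[len(lat) // 2]:.2f} ms  "
      f"p99={lat[int(0.99 * len(lat))]:.2f} ms  stdev={statistics.stdev(lat):.2f} ms")
# Re-running this under CPU contention (e.g. alongside a busy background process)
# exposes the latency variability that average-latency reporting hides.
```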

Making Sense of Reinforcement Learning and Probabilistic Inference

Title Making Sense of Reinforcement Learning and Probabilistic Inference
Authors Brendan O’Donoghue, Ian Osband, Catalin Ionescu
Abstract Reinforcement learning (RL) combines a control problem with statistical estimation: the system dynamics are not known to the agent, but can be learned through experience. A recent line of research casts ‘RL as inference’ and suggests a particular framework to generalize the RL problem as probabilistic inference. Our paper surfaces a key shortcoming in that approach, and clarifies the sense in which RL can be coherently cast as an inference problem. In particular, an RL agent must consider the effects of its actions upon future rewards and observations: the exploration-exploitation tradeoff. In all but the simplest settings, the resulting inference is computationally intractable, so practical RL algorithms must resort to approximation. We demonstrate that the popular ‘RL as inference’ approximation can perform poorly even in very basic problems. However, we show that with a small modification the framework does yield algorithms that provably perform well, and we show that the resulting algorithm is equivalent to the recently proposed K-learning, which we further connect with Thompson sampling.
Tasks
Published 2020-01-03
URL https://arxiv.org/abs/2001.00805v2
PDF https://arxiv.org/pdf/2001.00805v2.pdf
PWC https://paperswithcode.com/paper/making-sense-of-reinforcement-learning-and-1
Repo
Framework
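
For readers unfamiliar with the baseline the paper connects K-learning to, here is Thompson sampling on a Bernoulli bandit, the textbook approach to the exploration-exploitation tradeoff; this illustrates Thompson sampling only, not the paper's K-learning algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
true_probs = [0.45, 0.55, 0.60]            # unknown arm reward probabilities
alpha = np.ones(3); beta = np.ones(3)      # Beta(1,1) posterior parameters per arm

total_reward = 0
for t in range(5000):
    theta = rng.beta(alpha, beta)          # sample a plausible world from the posterior
    arm = int(np.argmax(theta))            # act greedily in the sampled world
    reward = rng.random() < true_probs[arm]
    alpha[arm] += reward; beta[arm] += 1 - reward
    total_reward += reward

print("posterior means:", alpha / (alpha + beta), "total reward:", total_reward)
```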

The Edge of Depth: Explicit Constraints between Segmentation and Depth

Title The Edge of Depth: Explicit Constraints between Segmentation and Depth
Authors Shengjie Zhu, Garrick Brazil, Xiaoming Liu
Abstract In this work we study the mutual benefits of two common computer vision tasks, self-supervised depth estimation and semantic segmentation from images. For example, to help unsupervised monocular depth estimation, constraints from semantic segmentation have been explored implicitly, such as by sharing and transforming features. In contrast, we propose to explicitly measure the border consistency between segmentation and depth and to minimize it in a greedy manner by iteratively supervising the network towards a locally optimal solution. This is partially motivated by our observation that semantic segmentation, even trained with limited ground truth (200 images of KITTI), can offer more accurate borders than any (monocular or stereo) image-based depth estimation. Through extensive experiments, our proposed approach advances the state of the art in unsupervised monocular depth estimation on KITTI.
Tasks Depth Estimation, Monocular Depth Estimation, Semantic Segmentation
Published 2020-04-01
URL https://arxiv.org/abs/2004.00171v1
PDF https://arxiv.org/pdf/2004.00171v1.pdf
PWC https://paperswithcode.com/paper/the-edge-of-depth-explicit-constraints
Repo
Framework
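
One way to make "border consistency between segmentation and depth" concrete is a segmentation-guided smoothness penalty: depth gradients are discouraged except where the predicted labels change. This is a hedged proxy for the idea, not the paper's actual loss.

```python
import torch

def seg_guided_depth_smoothness(depth, seg_labels):
    """depth: (B,1,H,W) predicted depth; seg_labels: (B,H,W) integer class map."""
    # Horizontal / vertical depth gradients.
    d_dx = (depth[..., :, 1:] - depth[..., :, :-1]).abs()
    d_dy = (depth[..., 1:, :] - depth[..., :-1, :]).abs()
    # Segmentation border masks: 1 where the label changes between neighbors.
    s_dx = (seg_labels[..., :, 1:] != seg_labels[..., :, :-1]).float().unsqueeze(1)
    s_dy = (seg_labels[..., 1:, :] != seg_labels[..., :-1, :]).float().unsqueeze(1)
    # Penalize depth gradients that do NOT coincide with a segmentation border.
    return (d_dx * (1 - s_dx)).mean() + (d_dy * (1 - s_dy)).mean()

depth = torch.rand(2, 1, 64, 64, requires_grad=True)
seg = torch.randint(0, 19, (2, 64, 64))
loss = seg_guided_depth_smoothness(depth, seg)
loss.backward()
print(float(loss))
```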

Information Leakage in Embedding Models

Title Information Leakage in Embedding Models
Authors Congzheng Song, Ananth Raghunathan
Abstract Embeddings are functions that map raw input data to low-dimensional vector representations, while preserving important semantic information about the inputs. Pre-training embeddings on a large amount of unlabeled data and fine-tuning them for downstream tasks is now a de facto standard in achieving state-of-the-art learning in many domains. We demonstrate that embeddings, in addition to encoding generic semantics, often also leak sensitive information about the input data. We develop three classes of attacks to systematically study the information that might be leaked by embeddings. First, embedding vectors can be inverted to partially recover some of the input data. As an example, we show that our attacks on popular sentence embeddings recover 50%–70% of the input words (F1 scores of 0.5–0.7). Second, embeddings may reveal sensitive attributes inherent in the inputs and independent of the underlying semantic task at hand. Attributes such as the authorship of a text can be easily extracted by training an inference model on just a handful of labeled embedding vectors. Third, embedding models leak a moderate amount of membership information for infrequent training data inputs. We extensively evaluate our attacks on various state-of-the-art embedding models in the text domain. We also propose and evaluate defenses that can prevent the leakage to some extent at a minor cost in utility.
Tasks Sentence Embeddings
Published 2020-03-31
URL https://arxiv.org/abs/2004.00053v1
PDF https://arxiv.org/pdf/2004.00053v1.pdf
PWC https://paperswithcode.com/paper/information-leakage-in-embedding-models
Repo
Framework
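
The attribute-inference attack is the easiest of the three to sketch: train a small classifier on a handful of labeled embedding vectors and read off a sensitive attribute such as authorship. The embeddings below are synthetic stand-ins; the paper attacks real sentence-embedding models.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_authors, dim = 4, 256

# Pretend each author's texts yield embeddings clustered around an author-specific mean.
author_means = rng.normal(size=(n_authors, dim))
def fake_embed(author, n):
    return author_means[author] + 0.5 * rng.normal(size=(n, dim))

# "A handful of labeled embedding vectors" per author suffices to train the attacker.
X_train = np.vstack([fake_embed(a, 20) for a in range(n_authors)])
y_train = np.repeat(np.arange(n_authors), 20)
X_test = np.vstack([fake_embed(a, 100) for a in range(n_authors)])
y_test = np.repeat(np.arange(n_authors), 100)

attacker = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("authorship inference accuracy:", attacker.score(X_test, y_test))
```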

Learning Implicit Generative Models with Theoretical Guarantees

Title Learning Implicit Generative Models with Theoretical Guarantees
Authors Yuan Gao, Jian Huang, Yuling Jiao, Jin Liu
Abstract We propose a unified framework for implicit generative modeling (UnifiGem) with theoretical guarantees by integrating approaches from optimal transport, numerical ODEs, density-ratio (density-difference) estimation, and deep neural networks. First, the problem of implicit generative learning is formulated as that of finding the optimal transport map between the reference distribution and the target distribution, which is characterized by a totally nonlinear Monge-Ampère equation. Interpreting the infinitesimal linearization of the Monge-Ampère equation from the perspective of gradient flows in measure spaces leads to the continuity equation or the McKean-Vlasov equation. We then solve the McKean-Vlasov equation numerically using the forward Euler iteration, where the forward Euler map depends on the density ratio (density difference) between the distribution at the current iteration and the underlying target distribution. We further estimate the density ratio (density difference) via deep density-ratio (density-difference) fitting and derive explicit upper bounds on the estimation error. Experimental results on both synthetic datasets and real benchmark datasets support our theoretical findings and demonstrate the effectiveness of UnifiGem.
Tasks
Published 2020-02-07
URL https://arxiv.org/abs/2002.02862v2
PDF https://arxiv.org/pdf/2002.02862v2.pdf
PWC https://paperswithcode.com/paper/learning-implicit-generative-models-with
Repo
Framework
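
Reading only the abstract, the core update can be sketched roughly as follows, with notation chosen here (step size s, density ratio r_k, divergence-dependent map φ) rather than taken from the paper.

```latex
% Hedged reconstruction: particles X_k distributed as the current iterate p_k are
% pushed toward the target q by a forward Euler step driven by the density ratio.
\begin{align}
  r_k(x)  &\approx \frac{p_k(x)}{q(x)}
            && \text{(estimated with a deep density-ratio network)} \\
  X_{k+1} &= X_k - s\,\nabla_x\,\phi\!\big(r_k(X_k)\big)
            && \text{(forward Euler step with step size } s\text{)}
\end{align}
```

Here φ depends on the chosen divergence, e.g. φ(r) = log r for a KL-type gradient flow; this is a plausible reading of the abstract, not the paper's stated equations.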

A Multimodal Dialogue System for Conversational Image Editing

Title A Multimodal Dialogue System for Conversational Image Editing
Authors Tzu-Hsiang Lin, Trung Bui, Doo Soon Kim, Jean Oh
Abstract In this paper, we present a multimodal dialogue system for Conversational Image Editing. We formulate the multimodal dialogue system as a Partially Observable Markov Decision Process (POMDP) and train it with a Deep Q-Network (DQN) and a user simulator. Our evaluation shows that the DQN policy outperforms a rule-based baseline policy, achieving a 90% success rate under high error rates. We also conducted a real user study and analyzed real user behavior.
Tasks
Published 2020-02-16
URL https://arxiv.org/abs/2002.06484v1
PDF https://arxiv.org/pdf/2002.06484v1.pdf
PWC https://paperswithcode.com/paper/a-multimodal-dialogue-system-for
Repo
Framework
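
A bare-bones DQN update of the kind the dialogue policy is trained with, assuming a fake replay buffer in place of the user simulator and arbitrary state/action sizes; the multimodal state encoding and the dialogue-act action set are not modeled here.

```python
import random
import torch
import torch.nn as nn

state_dim, n_actions, gamma = 16, 8, 0.95
q_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net.load_state_dict(q_net.state_dict())
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# Fake replay buffer of (state, action, reward, next_state, done) transitions,
# as would be produced by interacting with a user simulator.
replay = [(torch.randn(state_dim), random.randrange(n_actions),
           random.random(), torch.randn(state_dim), random.random() < 0.1)
          for _ in range(1000)]

for step in range(200):
    batch = random.sample(replay, 32)
    s = torch.stack([b[0] for b in batch])
    a = torch.tensor([b[1] for b in batch])
    r = torch.tensor([b[2] for b in batch], dtype=torch.float32)
    s2 = torch.stack([b[3] for b in batch])
    done = torch.tensor([b[4] for b in batch], dtype=torch.float32)

    q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)           # Q(s, a)
    with torch.no_grad():
        td_target = r + gamma * (1 - done) * target_net(s2).max(dim=1).values
    loss = nn.functional.smooth_l1_loss(q_sa, td_target)
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 50 == 0:                                             # periodic target sync
        target_net.load_state_dict(q_net.state_dict())
```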

Locally Interpretable Predictions of Parkinson’s Disease Progression

Title Locally Interpretable Predictions of Parkinson’s Disease Progression
Authors Qiaomei Li, Rachel Cummings, Yonatan Mintz
Abstract In precision medicine, machine learning techniques have been commonly proposed to aid physicians in the early screening of chronic diseases such as Parkinson’s Disease. These automated screening procedures should be interpretable by a clinician, who must explain the decision-making process to patients for informed consent. However, the methods that typically achieve the highest level of accuracy given early screening data are complex black box models. In this paper, we provide a novel approach for explaining black box model predictions of Parkinson’s Disease progression that can give high fidelity explanations with lower model complexity. Specifically, we use the Parkinson’s Progression Markers Initiative (PPMI) data set to cluster patients based on the trajectory of their disease progression. This can be used to predict how a patient’s symptoms are likely to develop based on initial screening data. We then develop a black box (random forest) model for predicting which cluster a patient belongs to, along with a method for generating local explainers for these predictions. Our local explainer methodology uses a computationally efficient information filter to include only the most relevant features. We also develop a global explainer methodology and empirically validate its performance on the PPMI data set, showing that our approach may Pareto-dominate existing techniques on the trade-off between fidelity and coverage. Such explainer models, offering high fidelity with significantly less functional complexity, should prove useful for implementing medical screening tools in practice.
Tasks Decision Making
Published 2020-03-20
URL https://arxiv.org/abs/2003.09466v1
PDF https://arxiv.org/pdf/2003.09466v1.pdf
PWC https://paperswithcode.com/paper/locally-interpretable-predictions-of
Repo
Framework
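
The general shape of a local explainer with an information filter can be sketched as follows, using synthetic data, a random-forest black box, a mutual-information filter, and a sparse logistic surrogate; this is a generic LIME-style construction under those assumptions, not the authors' exact method.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 30))                        # synthetic screening features
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)         # stand-in for a progression cluster

black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def local_explainer(x0, n_samples=500, k=5, scale=0.3):
    """Fit a sparse linear surrogate around x0 using only the top-k informative features."""
    # Perturb around the patient of interest and label the samples with the black box.
    Xp = x0 + scale * rng.normal(size=(n_samples, X.shape[1]))
    yp = black_box.predict(Xp)
    if len(np.unique(yp)) < 2:                        # degenerate neighborhood
        return None
    # Information filter: keep only the k most relevant features locally.
    keep = np.argsort(mutual_info_classif(Xp, yp, random_state=0))[-k:]
    surrogate = LogisticRegression(max_iter=1000).fit(Xp[:, keep], yp)
    return dict(zip(keep.tolist(), surrogate.coef_[0].round(3).tolist()))

print(local_explainer(X[0]))    # feature index -> local weight
```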

Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data

Title Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data
Authors Yen-Chang Hsu, Yilin Shen, Hongxia Jin, Zsolt Kira
Abstract Deep neural networks attain remarkable performance when applied to data that come from the same distribution as the training set, but can degrade significantly otherwise. Therefore, detecting whether an example is out-of-distribution (OoD) is crucial for enabling a system that can reject such samples or alert users. Recent works have made significant progress on OoD benchmarks consisting of small image datasets. However, many recent methods based on neural networks rely on training or tuning with both in-distribution and out-of-distribution data. The latter is generally hard to define a priori, and its selection can easily bias the learning. We base our work on a popular method, ODIN, proposing two strategies that free it from the need to tune with OoD data while improving its OoD detection performance. Specifically, we propose a decomposed confidence scoring function as well as a modified input pre-processing method. We show that both of these significantly help detection performance. Our further analysis on a larger-scale image dataset shows that the two types of distribution shift, namely semantic shift and non-semantic shift, differ significantly in difficulty, providing an analysis of when ODIN-like strategies do or do not work.
Tasks
Published 2020-02-26
URL https://arxiv.org/abs/2002.11297v2
PDF https://arxiv.org/pdf/2002.11297v2.pdf
PWC https://paperswithcode.com/paper/generalized-odin-detecting-out-of
Repo
Framework
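
A rough sketch of the decomposed-confidence idea, where each logit is a ratio h_i(x)/g(x), together with a gradient-based input pre-processing step; the backbone, layer sizes, and epsilon are placeholders, and this should not be read as the paper's exact formulation.

```python
import torch
import torch.nn as nn

class DecomposedConfidenceHead(nn.Module):
    def __init__(self, feat_dim=128, n_classes=10):
        super().__init__()
        self.h = nn.Linear(feat_dim, n_classes)               # class-dependent numerator
        self.g = nn.Sequential(nn.Linear(feat_dim, 1), nn.BatchNorm1d(1), nn.Sigmoid())
    def forward(self, feat):
        h, g = self.h(feat), self.g(feat)
        return h / g, h, g                                     # logits = h / g

backbone = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 128), nn.ReLU())
head = DecomposedConfidenceHead()

def ood_score(x, eps=0.002):
    """Score with input pre-processing: perturb x to raise the score, then re-score."""
    x = x.clone().requires_grad_(True)
    _, h, _ = head(backbone(x))
    h.max(dim=1).values.sum().backward()
    x_pert = x + eps * x.grad.sign()                           # nudge toward in-distribution
    with torch.no_grad():
        _, h, _ = head(backbone(x_pert))
        return h.max(dim=1).values                             # higher => more in-distribution

x = torch.rand(16, 3, 32, 32)
print(ood_score(x))
```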

Mimicking Evolution with Reinforcement Learning

Title Mimicking Evolution with Reinforcement Learning
Authors João P. Abrantes, Arnaldo J. Abrantes, Frans A. Oliehoek
Abstract Evolution gave rise to human and animal intelligence here on Earth. We argue that the path to developing artificial human-like intelligence will pass through mimicking the evolutionary process in a nature-like simulation. In Nature, two processes drive the development of the brain: evolution and learning. Evolution acts slowly, across generations, and, among other things, defines what agents learn by changing their internal reward function. Learning acts fast, within a single lifetime, and quickly updates the agent’s policy to maximise pleasure and minimise pain. The reward function is slowly aligned with the fitness function by evolution; however, as agents evolve, the environment and its fitness function also change, increasing the misalignment between reward and fitness. It is extremely computationally expensive to replicate these two processes in simulation. This work proposes Evolution via Evolutionary Reward (EvER), which allows learning to single-handedly drive the search for policies of increasing evolutionary fitness by keeping the reward function aligned with the fitness function. In this search, EvER makes use of the whole state-action trajectories that agents experience over their lifetimes. In contrast, current evolutionary algorithms discard this information and consequently limit their potential efficiency at tackling sequential decision problems. We test our algorithm in two simple bio-inspired environments and show that it is superior to a state-of-the-art evolutionary algorithm at generating agents capable of surviving and reproducing their genes.
Tasks
Published 2020-03-31
URL https://arxiv.org/abs/2004.00048v1
PDF https://arxiv.org/pdf/2004.00048v1.pdf
PWC https://paperswithcode.com/paper/mimicking-evolution-with-reinforcement
Repo
Framework
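
As a very loose illustration only (the abstract does not spell out the algorithm), one plausible reading of "aligning reward with fitness using whole lifetime trajectories" is to credit every state-action pair in an agent's trajectory with that agent's realized reproductive fitness; everything below is an invented toy, not the paper's EvER.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy population: each agent has a lifetime trajectory of (state, action) pairs
# and a realized evolutionary fitness (e.g. number of surviving offspring).
population = [{"trajectory": [(rng.integers(5), rng.integers(2)) for _ in range(20)],
               "fitness": int(rng.poisson(2.0))} for _ in range(50)]

# Fitness-aligned reward table: average fitness of agents that took action a in state s.
reward_sum = np.zeros((5, 2))
visit_count = np.zeros((5, 2))
for agent in population:
    for s, a in agent["trajectory"]:
        reward_sum[s, a] += agent["fitness"]
        visit_count[s, a] += 1

aligned_reward = reward_sum / np.maximum(visit_count, 1)
print(aligned_reward)    # could now be handed to any standard RL algorithm
```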