January 30, 2020

2660 words 13 mins read

Paper Group ANR 208

Language-Conditioned Graph Networks for Relational Reasoning

Title Language-Conditioned Graph Networks for Relational Reasoning
Authors Ronghang Hu, Anna Rohrbach, Trevor Darrell, Kate Saenko
Abstract Solving grounded language tasks often requires reasoning about relationships between objects in the context of a given task. For example, to answer the question “What color is the mug on the plate?” we must check the color of the specific mug that satisfies the “on” relationship with respect to the plate. Recent work has proposed various methods capable of complex relational reasoning. However, most of their power is in the inference structure, while the scene is represented with simple local appearance features. In this paper, we take an alternate approach and build contextualized representations for objects in a visual scene to support relational reasoning. We propose a general framework of Language-Conditioned Graph Networks (LCGN), where each node represents an object, and is described by a context-aware representation from related objects through iterative message passing conditioned on the textual input. E.g., conditioning on the “on” relationship to the plate, the object “mug” gathers messages from the object “plate” to update its representation to “mug on the plate”, which can be easily consumed by a simple classifier for answer prediction. We experimentally show that our LCGN approach effectively supports relational reasoning and improves performance across several tasks and datasets. Our code is available at http://ronghanghu.com/lcgn.
Tasks Relational Reasoning
Published 2019-05-10
URL https://arxiv.org/abs/1905.04405v2
PDF https://arxiv.org/pdf/1905.04405v2.pdf
PWC https://paperswithcode.com/paper/language-conditioned-graph-networks-for
Repo
Framework
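
As a rough illustration of the message-passing idea described in the abstract above, here is a minimal sketch of a single language-conditioned graph layer: pairwise edge weights are computed from object features concatenated with a pooled text embedding, and each node aggregates messages from the others. The layer sizes, the single round of message passing, and the module name are illustrative assumptions, not the authors' implementation (their code is at the link above).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LanguageConditionedGraphLayer(nn.Module):
    """One round of text-conditioned message passing over object nodes (toy sketch)."""
    def __init__(self, obj_dim, txt_dim, hid_dim):
        super().__init__()
        self.query = nn.Linear(obj_dim + txt_dim, hid_dim)
        self.key = nn.Linear(obj_dim + txt_dim, hid_dim)
        self.message = nn.Linear(obj_dim, hid_dim)
        self.update = nn.Linear(obj_dim + hid_dim, obj_dim)

    def forward(self, obj_feats, txt_emb):
        # obj_feats: (N, obj_dim) local object features; txt_emb: (txt_dim,) pooled text.
        n = obj_feats.size(0)
        ctx = torch.cat([obj_feats, txt_emb.unsqueeze(0).expand(n, -1)], dim=-1)
        # Text-conditioned pairwise edge weights between all object pairs.
        attn = self.query(ctx) @ self.key(ctx).t() / self.key.out_features ** 0.5
        weights = F.softmax(attn, dim=-1)                  # (N, N)
        # Each node gathers messages from related nodes and updates its representation.
        msgs = weights @ self.message(obj_feats)           # (N, hid_dim)
        return obj_feats + self.update(torch.cat([obj_feats, msgs], dim=-1))
```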

Secure Federated Matrix Factorization

Title Secure Federated Matrix Factorization
Authors Di Chai, Leye Wang, Kai Chen, Qiang Yang
Abstract To protect user privacy and meet legal regulations, federated (machine) learning has attracted considerable interest in recent years. The key principle of federated learning is to train a machine learning model without needing to know each user’s personal raw private data. In this paper, we propose a secure matrix factorization framework under the federated learning setting, called FedMF. First, we design a user-level distributed matrix factorization framework in which the model can be learned when each user uploads only the gradient information (instead of the raw preference data) to the server. While gradient information seems secure, we prove that it can still leak users’ raw data. To this end, we enhance the distributed matrix factorization framework with homomorphic encryption. We implement a prototype of FedMF and test it on a real movie rating dataset. The results verify the feasibility of FedMF. We also discuss the challenges of applying FedMF in practice for future research.
Tasks
Published 2019-06-12
URL https://arxiv.org/abs/1906.05108v1
PDF https://arxiv.org/pdf/1906.05108v1.pdf
PWC https://paperswithcode.com/paper/secure-federated-matrix-factorization
Repo
Framework
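
The following toy, plaintext sketch illustrates the user-level split described in the abstract above: each client keeps its ratings and user factors private and uploads only gradients of the shared item factors, which the server aggregates. The homomorphic-encryption layer that FedMF adds on top of this is omitted, and all function names and hyperparameters are illustrative.

```python
import numpy as np

def client_update(user_vec, item_mat, local_ratings, lr=0.01, reg=0.1):
    """local_ratings: {item_id: rating}, held only by this client."""
    item_grads = {}
    for item_id, rating in local_ratings.items():
        err = rating - user_vec @ item_mat[item_id]
        # Gradient of the shared item factors: the only thing sent to the server.
        item_grads[item_id] = -(err * user_vec - reg * item_mat[item_id])
        # The private user factors are updated locally and never leave the client.
        user_vec = user_vec + lr * (err * item_mat[item_id] - reg * user_vec)
    return user_vec, item_grads

def server_aggregate(item_mat, all_item_grads, lr=0.01):
    """all_item_grads: one gradient dict per client (encrypted in FedMF proper)."""
    for grads in all_item_grads:
        for item_id, g in grads.items():
            item_mat[item_id] -= lr * g
    return item_mat
```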

Learning chordal extensions

Title Learning chordal extensions
Authors Defeng Liu, Andrea Lodi, Mathieu Tanneau
Abstract A highly influential ingredient of many techniques designed to exploit sparsity in numerical optimization is the so-called chordal extension of a graph representation of the optimization problem. The relation between a chordal extension and the performance of the optimization algorithm that uses it is, however, not yet mathematically well understood. For this reason, we follow the current research trend of looking at Combinatorial Optimization tasks through a Machine Learning lens, and we devise a framework for learning elimination rules that yield high-quality chordal extensions. As a first building block of the learning framework, we propose an on-policy imitation learning scheme that mimics the elimination ordering provided by the (classical) minimum degree rule. The results show that our on-policy imitation learning approach is effective in learning the minimum degree policy and, consequently, produces graphs with desirable fill-in characteristics.
Tasks Combinatorial Optimization, Imitation Learning
Published 2019-10-16
URL https://arxiv.org/abs/1910.07600v1
PDF https://arxiv.org/pdf/1910.07600v1.pdf
PWC https://paperswithcode.com/paper/learning-chordal-extensions
Repo
Framework
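
The (classical) minimum degree rule that the paper's imitation learner mimics can be sketched in a few lines: repeatedly eliminate a vertex of smallest degree, connect its remaining neighbours (these added edges are the fill-in of the chordal extension), and record the elimination order. The sketch below uses networkx for the graph structure and only illustrates the expert policy, not the learning framework itself.

```python
import itertools
import networkx as nx

def minimum_degree_ordering(graph):
    """Return an elimination order and the number of fill-in edges it creates."""
    g = graph.copy()
    order, fill_in = [], 0
    while g.number_of_nodes() > 0:
        v = min(g.nodes, key=g.degree)        # greedy minimum-degree choice
        neighbours = list(g.neighbors(v))
        for a, b in itertools.combinations(neighbours, 2):
            if not g.has_edge(a, b):
                g.add_edge(a, b)              # fill-in edge of the chordal extension
                fill_in += 1
        g.remove_node(v)
        order.append(v)
    return order, fill_in

order, fill = minimum_degree_ordering(nx.cycle_graph(6))
print(order, fill)                            # the 6-cycle needs 3 fill-in edges
```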

A Survey on Biomedical Image Captioning

Title A Survey on Biomedical Image Captioning
Authors Vasiliki Kougia, John Pavlopoulos, Ion Androutsopoulos
Abstract Image captioning applied to biomedical images can assist and accelerate the diagnosis process followed by clinicians. This article is the first survey of biomedical image captioning, discussing datasets, evaluation measures, and state of the art methods. Additionally, we suggest two baselines, a weak and a stronger one; the latter outperforms all current state of the art systems on one of the datasets.
Tasks Image Captioning
Published 2019-05-26
URL https://arxiv.org/abs/1905.13302v1
PDF https://arxiv.org/pdf/1905.13302v1.pdf
PWC https://paperswithcode.com/paper/190513302
Repo
Framework

Fault Tolerance of Neural Networks in Adversarial Settings

Title Fault Tolerance of Neural Networks in Adversarial Settings
Authors Vasisht Duddu, N. Rajesh Pillai, D. Vijay Rao, Valentina E. Balas
Abstract Artificial Intelligence systems require a thorough assessment of different pillars of trust, namely, fairness, interpretability, data and model privacy, reliability (safety) and robustness against adversarial attacks. While these research problems have been extensively studied in isolation, an understanding of the trade-offs between the different pillars of trust is lacking. To this end, the trade-off between fault tolerance, privacy and adversarial robustness is evaluated for the specific case of Deep Neural Networks, by considering two adversarial settings under a security and a privacy threat model. Specifically, this work studies the impact on the fault tolerance of the Neural Network of training the model by adding noise to the input (Adversarial Robustness) and noise to the gradients (Differential Privacy). While training models with noise on inputs, gradients or weights enhances fault tolerance, it is observed that adversarial robustness and fault tolerance are at odds with each other. On the other hand, ($\epsilon,\delta$)-Differentially Private models enhance fault tolerance, which, measured using the generalisation error, theoretically has an upper bound of $e^{\epsilon} - 1 + \delta$. This novel study of the trade-offs between different elements of trust is pivotal for training a model that satisfies the requirements of the different pillars of trust simultaneously.
Tasks
Published 2019-10-30
URL https://arxiv.org/abs/1910.13875v2
PDF https://arxiv.org/pdf/1910.13875v2.pdf
PWC https://paperswithcode.com/paper/fault-tolerance-of-neural-networks-in
Repo
Framework
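
For concreteness, here is a hedged sketch of the two noise injections compared in the abstract above, plus the quoted generalisation-error bound; the clip norm and noise scales are illustrative placeholders, not the paper's settings.

```python
import numpy as np

def noisy_input(x, sigma=0.1):
    """Gaussian input perturbation (adversarial-robustness-style noise)."""
    return x + np.random.normal(0.0, sigma, size=x.shape)

def dp_gradient_step(grad, clip_norm=1.0, noise_multiplier=1.1):
    """Clip and noise a gradient, in the style of DP-SGD."""
    scale = min(1.0, clip_norm / (np.linalg.norm(grad) + 1e-12))
    return grad * scale + np.random.normal(0.0, noise_multiplier * clip_norm, grad.shape)

def generalisation_gap_bound(epsilon, delta):
    """Upper bound on the generalisation error quoted in the abstract."""
    return np.exp(epsilon) - 1 + delta

print(generalisation_gap_bound(1.0, 1e-5))    # ~1.718
```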

Virtual Training for a Real Application: Accurate Object-Robot Relative Localization without Calibration

Title Virtual Training for a Real Application: Accurate Object-Robot Relative Localization without Calibration
Authors Vianney Loing, Renaud Marlet, Mathieu Aubry
Abstract Localizing an object accurately with respect to a robot is a key step for autonomous robotic manipulation. In this work, we propose to tackle this task knowing only 3D models of the robot and object in the particular case where the scene is viewed from uncalibrated cameras – a situation which would be typical in an uncontrolled environment, e.g., on a construction site. We demonstrate that this localization can be performed very accurately, with millimetric errors, without using a single real image for training, a strong advantage since acquiring representative training data is a long and expensive process. Our approach relies on a classification Convolutional Neural Network (CNN) trained using hundreds of thousands of synthetically rendered scenes with randomized parameters. To evaluate our approach quantitatively and make it comparable to alternative approaches, we build a new rich dataset of real robot images with accurately localized blocks.
Tasks Calibration
Published 2019-02-07
URL http://arxiv.org/abs/1902.02711v1
PDF http://arxiv.org/pdf/1902.02711v1.pdf
PWC https://paperswithcode.com/paper/virtual-training-for-a-real-application
Repo
Framework
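
As a loose illustration of the setup described above, the sketch below samples randomized rendering parameters for synthetic training scenes and casts relative localization as classification by discretizing the planar object-robot offset into bins. The parameter ranges, bin size, and discretization scheme are assumptions for illustration only, not the paper's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng()

def sample_render_params():
    """Randomized parameters for one synthetic training scene (toy ranges)."""
    return {
        "camera_pos": rng.uniform(-2.0, 2.0, size=3),        # uncalibrated viewpoint
        "light_intensity": rng.uniform(0.3, 1.5),
        "texture_id": int(rng.integers(0, 50)),
        "object_offset_mm": rng.uniform(-200, 200, size=2),  # ground truth to predict
    }

def offset_to_class(offset_mm, bin_mm=5, half_range_mm=200):
    """Discretize the planar offset so a classification CNN can predict it."""
    bins_per_axis = int(2 * half_range_mm / bin_mm)
    idx = ((np.asarray(offset_mm) + half_range_mm) // bin_mm).astype(int)
    ix, iy = np.clip(idx, 0, bins_per_axis - 1)
    return int(ix * bins_per_axis + iy)
```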

Global Saliency: Aggregating Saliency Maps to Assess Dataset Artefact Bias

Title Global Saliency: Aggregating Saliency Maps to Assess Dataset Artefact Bias
Authors Jacob Pfau, Albert T. Young, Maria L. Wei, Michael J. Keiser
Abstract In high-stakes applications of machine learning models, interpretability methods provide guarantees that models are right for the right reasons. In medical imaging, saliency maps have become the standard tool for determining whether a neural model has learned relevant robust features, rather than artefactual noise. However, saliency maps are limited to local model explanation because they interpret predictions on an image-by-image basis. We propose aggregating saliency globally, using semantic segmentation masks, to provide quantitative measures of model bias across a dataset. To evaluate global saliency methods, we propose two metrics for quantifying the validity of saliency explanations. We apply the global saliency method to skin lesion diagnosis to determine the effect of artefacts, such as ink, on model bias.
Tasks Semantic Segmentation
Published 2019-10-16
URL https://arxiv.org/abs/1910.07604v2
PDF https://arxiv.org/pdf/1910.07604v2.pdf
PWC https://paperswithcode.com/paper/global-saliency-aggregating-saliency-maps-to
Repo
Framework
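
A minimal sketch of the aggregation step described above, assuming per-image saliency maps and artefact segmentation masks are already computed: the fraction of saliency falling inside the artefact mask is averaged over the dataset, giving a single global bias score per artefact class. The normalisation choice is illustrative rather than necessarily the paper's metric.

```python
import numpy as np

def global_saliency(saliency_maps, artefact_masks):
    """saliency_maps, artefact_masks: iterables of (H, W) arrays, masks in {0, 1}."""
    scores = []
    for sal, mask in zip(saliency_maps, artefact_masks):
        sal = np.abs(sal)
        # Fraction of this image's total saliency that falls on the artefact region.
        scores.append((sal * mask).sum() / (sal.sum() + 1e-12))
    return float(np.mean(scores))     # dataset-level bias score for the artefact class
```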

Stacked dense optical flows and dropout layers to predict sperm motility and morphology

Title Stacked dense optical flows and dropout layers to predict sperm motility and morphology
Authors Vajira Thambawita, Pål Halvorsen, Hugo Hammer, Michael Riegler, Trine B. Haugen
Abstract In this paper, we analyse two deep learning methods to predict sperm motility and sperm morphology from sperm videos. We use two different inputs: stacked raw video frames and dense optical flows of video frames. To solve this regression task of predicting motility and morphology, stacked dense optical flows and extracted original frames from the sperm videos were fed to modified state-of-the-art convolutional neural networks. As a modification of the selected models, we introduced an additional multi-layer perceptron to overcome the problem of over-fitting. The method with an additional multi-layer perceptron and dropout layers shows the best results when the inputs consist of both dense optical flows and an original frame of the video.
Tasks
Published 2019-11-08
URL https://arxiv.org/abs/1911.03086v1
PDF https://arxiv.org/pdf/1911.03086v1.pdf
PWC https://paperswithcode.com/paper/stacked-dense-optical-flows-and-dropout
Repo
Framework
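
A sketch of the input construction described above, assuming OpenCV is used for dense optical flow: flows between consecutive frames are stacked along the channel axis, optionally together with a raw frame, before being fed to a CNN regressor. The Farneback parameters are common defaults rather than the authors' choices.

```python
import cv2
import numpy as np

def stacked_flow_input(frames, keep_raw_frame=True):
    """frames: list of grayscale uint8 arrays (H, W) from one sperm video clip."""
    flows = []
    for prev, nxt in zip(frames[:-1], frames[1:]):
        # Dense optical flow between consecutive frames, shape (H, W, 2).
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow)
    stacked = np.concatenate(flows, axis=-1)              # (H, W, 2 * (T - 1))
    if keep_raw_frame:
        frame = frames[0][..., None].astype(np.float32) / 255.0
        stacked = np.concatenate([stacked, frame], axis=-1)
    return stacked                                        # fed to a CNN regressor
```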

Robust Neural Networks using Randomized Adversarial Training

Title Robust Neural Networks using Randomized Adversarial Training
Authors Alexandre Araujo, Laurent Meunier, Rafael Pinot, Benjamin Negrevergne
Abstract This paper tackles the problem of defending a neural network against adversarial attacks crafted with different norms (in particular $\ell_\infty$ and $\ell_2$ bounded adversarial examples). It has been observed that defense mechanisms designed to protect against one type of attack often offer poor performance against the other. We show that $\ell_\infty$ defense mechanisms cannot offer good protection against $\ell_2$ attacks and vice versa, and we provide both theoretical and empirical insights on this phenomenon. Then, we discuss various ways of combining existing defense mechanisms in order to train neural networks robust against both types of attacks. Our experiments show that these new defense mechanisms offer better protection when attacked with both norms.
Tasks
Published 2019-03-25
URL https://arxiv.org/abs/1903.10219v3
PDF https://arxiv.org/pdf/1903.10219v3.pdf
PWC https://paperswithcode.com/paper/robust-neural-networks-using-randomized
Repo
Framework
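
One simple way to combine the two threat models discussed above is to sample an attack norm per batch and craft a single-step adversarial example accordingly; the sketch below does exactly that for image batches. It is a generic mixed-norm adversarial-training step, not the specific combination schemes analysed in the paper, and the step sizes are illustrative.

```python
import random
import torch
import torch.nn.functional as F

def mixed_norm_adv_batch(model, x, y, eps_inf=8 / 255, eps_l2=1.0):
    """Craft one-step adversarial examples, sampling the attack norm per batch."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    if random.random() < 0.5:
        delta = eps_inf * grad.sign()                          # l_inf (FGSM-like) step
    else:
        flat_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12)
        delta = eps_l2 * grad / flat_norm.view(-1, 1, 1, 1)    # l_2 normalized step
    return (x + delta).clamp(0, 1).detach()                    # train on these examples
```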

Negated LAMA: Birds cannot fly

Title Negated LAMA: Birds cannot fly
Authors Nora Kassner, Hinrich Schütze
Abstract Pretrained language models have achieved remarkable improvements in a broad range of natural language processing tasks, including question answering (QA). To analyze pretrained language model performance on QA, we extend the LAMA (Petroni et al., 2019) evaluation framework by a component that is focused on negation. We find that pretrained language models are equally prone to generate facts (“birds can fly”) and their negation (“birds cannot fly”). This casts doubt on the claim that pretrained language models have adequately learned factual knowledge.
Tasks Language Modelling, Question Answering
Published 2019-11-08
URL https://arxiv.org/abs/1911.03343v1
PDF https://arxiv.org/pdf/1911.03343v1.pdf
PWC https://paperswithcode.com/paper/negated-lama-birds-cannot-fly
Repo
Framework
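
The probing setup is easy to reproduce in spirit with a masked language model from the transformers library: query the same cloze statement in positive and negated form and compare the top completions. The model choice and prompts below are illustrative.

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

for prompt in ["Birds can [MASK].", "Birds cannot [MASK]."]:
    top = fill(prompt)[0]                 # highest-scoring completion
    print(f"{prompt!r:32} -> {top['token_str']} ({top['score']:.3f})")

# If both prompts yield the same completion (e.g. "fly"), the model is
# insensitive to the negation, which is the failure mode the paper reports.
```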

DiaNet: BERT and Hierarchical Attention Multi-Task Learning of Fine-Grained Dialect

Title DiaNet: BERT and Hierarchical Attention Multi-Task Learning of Fine-Grained Dialect
Authors Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Arun Rajendran, Lyle Ungar
Abstract Prediction of language varieties and dialects is an important language processing task, with a wide range of applications. For Arabic, the native tongue of ~300 million people, most varieties remain unsupported. To ease this bottleneck, we present a very large-scale dataset covering 319 cities from all 21 Arab countries. We introduce a hierarchical attention multi-task learning (HA-MTL) approach for dialect identification exploiting our data at the city, state, and country levels. We also evaluate the use of BERT on the three tasks, comparing it to the MTL approach. We benchmark and release our data and models.
Tasks Multi-Task Learning
Published 2019-10-31
URL https://arxiv.org/abs/1910.14243v1
PDF https://arxiv.org/pdf/1910.14243v1.pdf
PWC https://paperswithcode.com/paper/dianet-bert-and-hierarchical-attention-multi
Repo
Framework
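
A minimal sketch of the multi-task output structure implied by the abstract above: a shared text encoder feeds three classification heads for city-, state- and country-level labels, and the per-task cross-entropy losses are summed. The encoder is left abstract, the state count is a placeholder, and the hierarchical attention mechanism itself is not shown.

```python
import torch
import torch.nn as nn

class DialectMTLHeads(nn.Module):
    """Three task heads (city / state / country) on top of a shared encoder."""
    def __init__(self, enc_dim, n_city=319, n_state=100, n_country=21):
        # 319 cities and 21 countries come from the abstract; the state count
        # here is only a placeholder.
        super().__init__()
        self.city = nn.Linear(enc_dim, n_city)
        self.state = nn.Linear(enc_dim, n_state)
        self.country = nn.Linear(enc_dim, n_country)

    def forward(self, pooled):            # pooled: (B, enc_dim) encoder output
        return self.city(pooled), self.state(pooled), self.country(pooled)

def multitask_loss(logits, labels):
    """Sum of per-task cross-entropy losses; logits and labels are 3-tuples."""
    ce = nn.CrossEntropyLoss()
    return sum(ce(l, y) for l, y in zip(logits, labels))
```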

Consistency and Finite Sample Behavior of Binary Class Probability Estimation

Title Consistency and Finite Sample Behavior of Binary Class Probability Estimation
Authors Alexander Mey, Marco Loog
Abstract In this work we investigate to what extent one can recover class probabilities within the empirical risk minimization (ERM) paradigm. The main aim of our paper is to extend existing results and emphasize the tight relations between empirical risk minimization and class probability estimation. Based on existing literature on excess risk bounds and proper scoring rules, we derive a class probability estimator based on empirical risk minimization. We then derive fairly general conditions under which this estimator will converge, in the L1-norm and in probability, to the true class probabilities. Our main contribution is to present a way to derive finite sample L1-convergence rates of this estimator for different surrogate loss functions. We also study in detail which commonly used loss functions are suitable for this estimation problem and finally discuss the setting of model misspecification as well as a possible extension to asymmetric loss functions.
Tasks
Published 2019-08-30
URL https://arxiv.org/abs/1908.11823v2
PDF https://arxiv.org/pdf/1908.11823v2.pdf
PWC https://paperswithcode.com/paper/consistency-and-finite-sample-behavior-of
Repo
Framework
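
As a worked instance of the recipe described above, take the logistic loss $\ell(y, f(x)) = \log(1 + e^{-y f(x)})$ with $y \in \{-1, +1\}$, one of the commonly used surrogate losses: its conditional risk is minimised at the log-odds, so inverting the link turns an ERM solution into a class probability estimate. The derivation below is standard and only illustrative of the general route the paper takes.

```latex
% Conditional (pointwise) risk of the logistic loss and its minimiser:
\[
  \arg\min_{t \in \mathbb{R}}
    \Big[ \eta(x)\log\big(1+e^{-t}\big) + \big(1-\eta(x)\big)\log\big(1+e^{t}\big) \Big]
  \;=\; \log\frac{\eta(x)}{1-\eta(x)},
\]
% so an ERM solution \hat{f} yields a class probability estimate via the
% inverse (sigmoid) link,
\[
  \hat{\eta}(x) \;=\; \frac{1}{1+e^{-\hat{f}(x)}},
\]
% and excess surrogate-risk bounds on \hat{f} can then be converted into
% L1-type guarantees on \hat{\eta}.
```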

Gap Aware Mitigation of Gradient Staleness

Title Gap Aware Mitigation of Gradient Staleness
Authors Saar Barkai, Ido Hakimi, Assaf Schuster
Abstract Cloud computing is becoming increasingly popular as a platform for distributed training of deep neural networks. Synchronous stochastic gradient descent (SSGD) suffers from substantial slowdowns due to stragglers if the environment is non-dedicated, as is common in cloud computing. Asynchronous SGD (ASGD) methods are immune to these slowdowns but are scarcely used due to gradient staleness, which encumbers the convergence process. Recent techniques have had limited success mitigating the gradient staleness when scaling up to many workers (computing nodes). In this paper we define the Gap as a measure of gradient staleness and propose Gap-Aware (GA), a novel asynchronous-distributed method that penalizes stale gradients in proportion to the Gap and performs well even when scaling to large numbers of workers. Our evaluation on the CIFAR, ImageNet, and WikiText-103 datasets shows that GA outperforms the currently accepted gradient penalization method in final test accuracy. We also provide a convergence rate proof for GA. Contrary to prior beliefs, we show that if GA is applied, momentum becomes beneficial in asynchronous environments, even when the number of workers scales up.
Tasks
Published 2019-09-24
URL https://arxiv.org/abs/1909.10802v3
PDF https://arxiv.org/pdf/1909.10802v3.pdf
PWC https://paperswithcode.com/paper/gap-aware-mitigation-of-gradient-staleness
Repo
Framework
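
One simple reading of "penalizes stale gradients in proportion to the Gap" is to dampen each worker's gradient by its measured gap before the server applies it, as in the sketch below. The paper's exact definitions of the Gap and of the penalty may differ, so treat this purely as an illustration of the mechanism's shape.

```python
import numpy as np

def apply_stale_gradient(params, grad, gap, lr=0.1):
    """gap >= 1 measures how far the worker's parameter copy lags the server's."""
    penalty = max(float(gap), 1.0)
    # Stale gradients are dampened in proportion to their gap before being applied.
    return params - lr * (grad / penalty)
```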

Robustly Extracting Medical Knowledge from EHRs: A Case Study of Learning a Health Knowledge Graph

Title Robustly Extracting Medical Knowledge from EHRs: A Case Study of Learning a Health Knowledge Graph
Authors Irene Y. Chen, Monica Agrawal, Steven Horng, David Sontag
Abstract Increasingly large electronic health records (EHRs) provide an opportunity to algorithmically learn medical knowledge. In one prominent example, a causal health knowledge graph could learn relationships between diseases and symptoms and then serve as a diagnostic tool to be refined with additional clinical input. Prior research has demonstrated the ability to construct such a graph from over 270,000 emergency department patient visits. In this work, we describe methods to evaluate a health knowledge graph for robustness. Moving beyond precision and recall, we analyze for which diseases and for which patients the graph is most accurate. We identify sample size and unmeasured confounders as major sources of error in the health knowledge graph. We introduce a method to leverage non-linear functions in building the causal graph to better understand existing model assumptions. Finally, to assess model generalizability, we extend to a larger set of complete patient visits within a hospital system. We conclude with a discussion on how to robustly extract medical knowledge from EHRs.
Tasks
Published 2019-10-02
URL https://arxiv.org/abs/1910.01116v1
PDF https://arxiv.org/pdf/1910.01116v1.pdf
PWC https://paperswithcode.com/paper/robustly-extracting-medical-knowledge-from
Repo
Framework

RecSim: A Configurable Simulation Platform for Recommender Systems

Title RecSim: A Configurable Simulation Platform for Recommender Systems
Authors Eugene Ie, Chih-wei Hsu, Martin Mladenov, Vihan Jain, Sanmit Narvekar, Jing Wang, Rui Wu, Craig Boutilier
Abstract We propose RecSim, a configurable platform for authoring simulation environments for recommender systems (RSs) that naturally supports sequential interaction with users. RecSim allows the creation of new environments that reflect particular aspects of user behavior and item structure at a level of abstraction well-suited to pushing the limits of current reinforcement learning (RL) and RS techniques in sequential interactive recommendation problems. Environments can be easily configured that vary assumptions about: user preferences and item familiarity; user latent state and its dynamics; and choice models and other user response behavior. We outline how RecSim offers value to RL and RS researchers and practitioners, and how it can serve as a vehicle for academic-industrial collaboration.
Tasks Recommendation Systems
Published 2019-09-11
URL https://arxiv.org/abs/1909.04847v2
PDF https://arxiv.org/pdf/1909.04847v2.pdf
PWC https://paperswithcode.com/paper/recsim-a-configurable-simulation-platform-for
Repo
Framework
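
To make the abstractions in the abstract above concrete, the toy loop below wires together a latent user state, a document corpus, a multinomial-logit choice model over a recommended slate, and a simple state transition. This is explicitly not the RecSim API, only a conceptual sketch of the kind of environment it lets you configure.

```python
import numpy as np

rng = np.random.default_rng(0)

class ToyUserModel:
    """Latent-state user with a multinomial-logit choice model (toy dynamics)."""
    def __init__(self, n_topics=5):
        self.interest = rng.normal(size=n_topics)          # hidden user state

    def choose(self, slate):
        # slate: (k, n_topics) features of the recommended documents.
        scores = slate @ self.interest
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        return rng.choice(len(slate), p=probs)

    def update(self, clicked_doc):
        # Interests drift toward consumed content.
        self.interest = 0.9 * self.interest + 0.1 * clicked_doc

user = ToyUserModel()
corpus = rng.normal(size=(20, 5))
for step in range(3):
    slate = corpus[rng.choice(20, size=3, replace=False)]  # recommender's action
    choice = user.choose(slate)
    user.update(slate[choice])
```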