January 26, 2020


Paper Group ANR 1496



Decorrelated Adversarial Learning for Age-Invariant Face Recognition

Title Decorrelated Adversarial Learning for Age-Invariant Face Recognition
Authors Hao Wang, Dihong Gong, Zhifeng Li, Wei Liu
Abstract There has been an increasing research interest in age-invariant face recognition. However, matching faces with big age gaps remains a challenging problem, primarily due to the significant discrepancy of face appearances caused by aging. To reduce such a discrepancy, in this paper we propose a novel algorithm to remove age-related components from features mixed with both identity and age information. Specifically, we factorize a mixed face feature into two uncorrelated components: identity-dependent component and age-dependent component, where the identity-dependent component includes information that is useful for face recognition. To implement this idea, we propose the Decorrelated Adversarial Learning (DAL) algorithm, where a Canonical Mapping Module (CMM) is introduced to find the maximum correlation between the paired features generated by a backbone network, while the backbone network and the factorization module are trained to generate features reducing the correlation. Thus, the proposed model learns the decomposed features of age and identity whose correlation is significantly reduced. Simultaneously, the identity-dependent feature and the age-dependent feature are respectively supervised by ID and age preserving signals to ensure that they both contain the correct information. Extensive experiments are conducted on popular public-domain face aging datasets (FG-NET, MORPH Album 2, and CACD-VS) to demonstrate the effectiveness of the proposed approach.
Tasks Age-Invariant Face Recognition, Face Recognition
Published 2019-04-10
URL http://arxiv.org/abs/1904.04972v1
PDF http://arxiv.org/pdf/1904.04972v1.pdf
PWC https://paperswithcode.com/paper/decorrelated-adversarial-learning-for-age
Repo
Framework

Adversarial Attacks on Remote User Authentication Using Behavioural Mouse Dynamics

Title Adversarial Attacks on Remote User Authentication Using Behavioural Mouse Dynamics
Authors Yi Xiang Marcus Tan, Alfonso Iacovazzi, Ivan Homoliak, Yuval Elovici, Alexander Binder
Abstract Mouse dynamics is a potential means of authenticating users. Typically, the authentication process is based on classical machine learning techniques, but recently, deep learning techniques have been introduced for this purpose. Although prior research has demonstrated how machine learning and deep learning algorithms can be bypassed by carefully crafted adversarial samples, there has been very little research performed on the topic of behavioural biometrics in the adversarial domain. In an attempt to address this gap, we built a set of attacks, which are applications of several generative approaches, to construct adversarial mouse trajectories that bypass authentication models. These generated mouse sequences will serve as the adversarial samples in the context of our experiments. We also present an analysis of the attack approaches we explored, explaining their limitations. In contrast to previous work, we consider the attacks in a more realistic and challenging setting in which an attacker has access to recorded user data but does not have access to the authentication model or its outputs. We explore three different attack strategies: 1) statistics-based, 2) imitation-based, and 3) surrogate-based; we show that they are able to evade the functionality of the authentication models, thereby impacting their robustness adversely. We show that imitation-based attacks often perform better than surrogate-based attacks, unless, however, the attacker can guess the architecture of the authentication model. In such cases, we propose a potential detection mechanism against surrogate-based attacks.
Tasks
Published 2019-05-28
URL https://arxiv.org/abs/1905.11831v2
PDF https://arxiv.org/pdf/1905.11831v2.pdf
PWC https://paperswithcode.com/paper/adversarial-attacks-on-remote-user
Repo
Framework

Diverse Trajectory Forecasting with Determinantal Point Processes

Title Diverse Trajectory Forecasting with Determinantal Point Processes
Authors Ye Yuan, Kris Kitani
Abstract The ability to forecast a set of likely yet diverse possible future behaviors of an agent (e.g., future trajectories of a pedestrian) is essential for safety-critical perception systems (e.g., autonomous vehicles). In particular, a set of possible future behaviors generated by the system must be diverse to account for all possible outcomes in order to take necessary safety precautions. It is not sufficient to maintain a set of the most likely future outcomes because the set may only contain perturbations of a single outcome. While generative models such as variational autoencoders (VAEs) have been shown to be a powerful tool for learning a distribution over future trajectories, randomly drawn samples from the learned implicit likelihood model may not be diverse – the likelihood model is derived from the training data distribution and the samples will concentrate around the major mode that has most data. In this work, we propose to learn a diversity sampling function (DSF) that generates a diverse and likely set of future trajectories. The DSF maps forecasting context features to a set of latent codes which can be decoded by a generative model (e.g., VAE) into a set of diverse trajectory samples. Concretely, the process of identifying the diverse set of samples is posed as a parameter estimation of the DSF. To learn the parameters of the DSF, the diversity of the trajectory samples is evaluated by a diversity loss based on a determinantal point process (DPP). Gradient descent is performed over the DSF parameters, which in turn move the latent codes of the sample set to find an optimal diverse and likely set of trajectories. Our method is a novel application of DPPs to optimize a set of items (trajectories) in continuous space. We demonstrate the diversity of the trajectories produced by our approach on both low-dimensional 2D trajectory data and high-dimensional human motion data.
Tasks Autonomous Vehicles, Point Processes
Published 2019-07-11
URL https://arxiv.org/abs/1907.04967v2
PDF https://arxiv.org/pdf/1907.04967v2.pdf
PWC https://paperswithcode.com/paper/diverse-trajectory-forecasting-with
Repo
Framework
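
The abstract above evaluates sample diversity with a loss based on a determinantal point process (DPP). As a rough illustration of why a determinant measures diversity, the sketch below scores a set of 2D points by the determinant of a Gaussian similarity kernel, which is the unnormalized probability of that set under an L-ensemble DPP. The kernel choice, the `bandwidth` parameter, and all function names are assumptions made for this example; this is not the authors' loss or their DSF training procedure.

```python
import math

def rbf_kernel(points, bandwidth=1.0):
    """Similarity matrix with entries exp(-||p_i - p_j||^2 / (2 * bandwidth^2))."""
    n = len(points)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            kernel[i][j] = math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return kernel

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det

def dpp_diversity(points, bandwidth=1.0):
    """Hypothetical diversity score: det of the similarity kernel of the set."""
    return determinant(rbf_kernel(points, bandwidth))

# Nearly identical endpoints give an almost singular kernel (score near 0);
# well-separated endpoints give a kernel close to the identity (score near 1).
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
spread = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
print(dpp_diversity(clustered), dpp_diversity(spread))
```

Because the score is differentiable in the point coordinates, a loss built on it can in principle be minimized by gradient descent, which is the role the abstract assigns to the DPP-based diversity loss.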

Efficient Neural Architecture Search on Low-Dimensional Data for OCT Image Segmentation

Title Efficient Neural Architecture Search on Low-Dimensional Data for OCT Image Segmentation
Authors Nils Gessert, Alexander Schlaefer
Abstract Typically, deep learning architectures are handcrafted for their respective learning problem. As an alternative, neural architecture search (NAS) has been proposed where the architecture’s structure is learned in an additional optimization step. For the medical imaging domain, this approach is very promising as there are diverse problems and imaging modalities that require architecture design. However, NAS is very time-consuming and medical learning problems often involve high-dimensional data with high computational requirements. We propose an efficient approach for NAS in the context of medical, image-based deep learning problems by searching for architectures on low-dimensional data which are subsequently transferred to high-dimensional data. For OCT-based layer segmentation, we demonstrate that a search on 1D data reduces search time by 87.5% compared to a search on 2D data while the final 2D models achieve similar performance.
Tasks Neural Architecture Search, Semantic Segmentation
Published 2019-05-07
URL https://arxiv.org/abs/1905.02590v1
PDF https://arxiv.org/pdf/1905.02590v1.pdf
PWC https://paperswithcode.com/paper/efficient-neural-architecture-search-on-low
Repo
Framework

Privately Answering Classification Queries in the Agnostic PAC Model

Title Privately Answering Classification Queries in the Agnostic PAC Model
Authors Anupama Nandi, Raef Bassily
Abstract We revisit the problem of differentially private release of classification queries. In this problem, the goal is to design an algorithm that can accurately answer a sequence of classification queries based on a private training set while ensuring differential privacy. We formally study this problem in the agnostic PAC model and derive a new upper bound on the private sample complexity. Our results improve over those obtained in a recent work [BTT18] for the agnostic PAC setting. In particular, we give an improved construction that yields a tighter upper bound on the sample complexity. Moreover, unlike [BTT18], our accuracy guarantee does not involve any blow-up in the approximation error associated with the given hypothesis class. Given any hypothesis class with VC-dimension $d$, we show that our construction can privately answer up to $m$ classification queries with average excess error $\alpha$ using a private sample of size $\approx \frac{d}{\alpha^2}\,\max\left(1, \sqrt{m}\,\alpha^{3/2}\right)$. Using recent results on private learning with auxiliary public data, we extend our construction to show that one can privately answer any number of classification queries with average excess error $\alpha$ using a private sample of size $\approx \frac{d}{\alpha^2}\,\max\left(1, \sqrt{d}\,\alpha\right)$. When $\alpha=O\left(\frac{1}{\sqrt{d}}\right)$, our private sample complexity bound is essentially optimal.
Tasks
Published 2019-07-31
URL https://arxiv.org/abs/1907.13553v3
PDF https://arxiv.org/pdf/1907.13553v3.pdf
PWC https://paperswithcode.com/paper/privately-answering-classification-queries-in
Repo
Framework
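
To make the sample-size expressions in the abstract above concrete, the sketch below evaluates them numerically, reading the two bounds as (d/α²)·max(1, √m·α^{3/2}) and (d/α²)·max(1, √d·α). The "≈" in the abstract hides constants and possibly logarithmic factors, so the returned values indicate scaling only. The function names are hypothetical and chosen for this example.

```python
import math

def private_sample_size(d, alpha, m):
    """Scaling of the bound for answering up to m queries:
    (d / alpha^2) * max(1, sqrt(m) * alpha^(3/2)).
    Constants and log factors hidden by the abstract's "approx" are ignored."""
    return (d / alpha ** 2) * max(1.0, math.sqrt(m) * alpha ** 1.5)

def private_sample_size_with_public_data(d, alpha):
    """Scaling of the bound for any number of queries, given public data:
    (d / alpha^2) * max(1, sqrt(d) * alpha)."""
    return (d / alpha ** 2) * max(1.0, math.sqrt(d) * alpha)

# With d = 100 and alpha = 0.05 we have sqrt(d) * alpha = 0.5 <= 1, so the
# max term is 1 and the bound reduces to d / alpha^2.
print(private_sample_size_with_public_data(100, 0.05))
```

The second function shows why the regime α = O(1/√d) is singled out: there √d·α is bounded by a constant, so the bound scales as d/α².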

Deep Reinforcement Learning meets Graph Neural Networks: exploring a routing optimization use case

Title Deep Reinforcement Learning meets Graph Neural Networks: exploring a routing optimization use case
Authors Paul Almasan, José Suárez-Varela, Arnau Badia-Sampera, Krzysztof Rusek, Pere Barlet-Ros, Albert Cabellos-Aparicio
Abstract Recent advances in Deep Reinforcement Learning (DRL) have shown a significant improvement in decision-making problems. The networking community has started to investigate how DRL can provide a new breed of solutions to relevant optimization problems, such as routing. However, most of the state-of-the-art DRL-based networking techniques fail to generalize: they can only operate over network topologies seen during training, not over new topologies. The reason behind this important limitation is that existing DRL networking solutions use standard neural networks (e.g., fully connected), which are unable to learn graph-structured information. In this paper we propose to use Graph Neural Networks (GNNs) in combination with DRL. GNNs have recently been proposed to model graphs, and our novel DRL+GNN architecture is able to learn, operate and generalize over arbitrary network topologies. To showcase its generalization capabilities, we evaluate it on an Optical Transport Network (OTN) scenario, where the agent needs to allocate traffic demands efficiently. Our results show that our DRL+GNN agent is able to achieve outstanding performance in topologies unseen during training.
Tasks Decision Making
Published 2019-10-16
URL https://arxiv.org/abs/1910.07421v2
PDF https://arxiv.org/pdf/1910.07421v2.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-meets-graph
Repo
Framework

Volume Doubling Condition and a Local Poincaré Inequality on Unweighted Random Geometric Graphs

Title Volume Doubling Condition and a Local Poincaré Inequality on Unweighted Random Geometric Graphs
Authors Franziska Göbel, Gilles Blanchard
Abstract The aim of this paper is to establish two fundamental measure-metric properties of particular random geometric graphs. We consider $\varepsilon$-neighborhood graphs whose vertices are drawn independently and identically distributed from a common distribution defined on a regular submanifold of $\mathbb{R}^K$. We show that a volume doubling condition (VD) and local Poincaré inequality (LPI) hold for the random geometric graph (with high probability, and uniformly over all shortest path distance balls in a certain radius range) under suitable regularity conditions of the underlying submanifold and the sampling distribution.
Tasks
Published 2019-07-06
URL https://arxiv.org/abs/1907.03192v2
PDF https://arxiv.org/pdf/1907.03192v2.pdf
PWC https://paperswithcode.com/paper/volume-doubling-condition-and-a-local
Repo
Framework
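
The objects named in the abstract above can be illustrated with a small sketch: build an unweighted ε-neighborhood graph on a point set, measure balls in shortest-path (hop) distance, and compare the size of a ball of radius 2r with that of radius r. Measuring "volume" by vertex count is an assumption made here for simplicity, and the function names are hypothetical; the paper's precise measure and radius range are not reproduced.

```python
from collections import deque

def epsilon_graph(points, eps):
    """Unweighted graph joining every pair of points at Euclidean distance <= eps."""
    n = len(points)
    adjacency = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            if sq_dist <= eps ** 2:
                adjacency[i].append(j)
                adjacency[j].append(i)
    return adjacency

def ball_size(adjacency, center, radius):
    """Number of vertices within `radius` hops of `center` (breadth-first search)."""
    hops = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if hops[node] == radius:
            continue  # do not expand beyond the ball boundary
        for neighbor in adjacency[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return len(hops)

def doubling_ratio(adjacency, center, radius):
    """|B(center, 2 * radius)| / |B(center, radius)| in hop distance."""
    return ball_size(adjacency, center, 2 * radius) / ball_size(adjacency, center, radius)

# Eleven equally spaced points on a line with eps = 1 form a path graph.
line_points = [(float(i),) for i in range(11)]
graph = epsilon_graph(line_points, eps=1.0)
print(doubling_ratio(graph, center=5, radius=1))
```

A volume doubling condition asks that such ratios stay bounded by a constant uniformly over centers and over a range of radii.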

Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing – A Tale of Two Parsers Revisited

Title Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing – A Tale of Two Parsers Revisited
Authors Artur Kulmizev, Miryam de Lhoneux, Johannes Gontrum, Elena Fano, Joakim Nivre
Abstract Transition-based and graph-based dependency parsers have previously been shown to have complementary strengths and weaknesses: transition-based parsers exploit rich structural features but suffer from error propagation, while graph-based parsers benefit from global optimization but have restricted feature scope. In this paper, we show that, even though some details of the picture have changed after the switch to neural networks and continuous representations, the basic trade-off between rich features and global optimization remains essentially the same. Moreover, we show that deep contextualized word embeddings, which allow parsers to pack information about global sentence structure into local feature representations, benefit transition-based parsers more than graph-based parsers, making the two approaches virtually equivalent in terms of both accuracy and error profile. We argue that the reason is that these representations help prevent search errors and thereby allow transition-based parsers to better exploit their inherent strength of making accurate local decisions. We support this explanation by an error analysis of parsing experiments on 13 languages.
Tasks Dependency Parsing, Word Embeddings
Published 2019-08-20
URL https://arxiv.org/abs/1908.07397v2
PDF https://arxiv.org/pdf/1908.07397v2.pdf
PWC https://paperswithcode.com/paper/deep-contextualized-word-embeddings-in
Repo
Framework

The generalization error of max-margin linear classifiers: High-dimensional asymptotics in the overparametrized regime

Title The generalization error of max-margin linear classifiers: High-dimensional asymptotics in the overparametrized regime
Authors Andrea Montanari, Feng Ruan, Youngtak Sohn, Jun Yan
Abstract Modern machine learning models are often so complex that they achieve vanishing classification error on the training set. Max-margin linear classifiers are among the simplest classification methods that have zero training error (with linearly separable data). Despite this simplicity, their high-dimensional behavior is not yet completely understood. We assume to be given i.i.d. data $(y_i,{\boldsymbol x}_i)$, $i\le n$ with ${\boldsymbol x}_i\sim {\sf N}({\boldsymbol 0},{\boldsymbol \Sigma})$ a $p$-dimensional Gaussian feature vector, and $y_i \in\{+1,-1\}$ a label whose distribution depends on a linear combination of the covariates $\langle {\boldsymbol \theta}_*,{\boldsymbol x}_i\rangle$. We consider the proportional asymptotics $n,p\to\infty$ with $p/n\to \psi$, and derive exact expressions for the limiting prediction error. Our asymptotic results match simulations already when $n,p$ are of the order of a few hundreds. We explore several choices for the pair $({\boldsymbol \theta}_*,{\boldsymbol \Sigma})$, and show that the resulting generalization curve (test error as a function of the overparametrization ratio $\psi=p/n$) is qualitatively different, depending on this choice. In particular we consider a specific structure of $({\boldsymbol \theta}_*,{\boldsymbol \Sigma})$ that captures the behavior of nonlinear random feature models or, equivalently, two-layer neural networks with random first-layer weights. In this case, we observe that the test error is monotone decreasing in the number of parameters. This finding agrees with the recently developed ‘double descent’ phenomenology for overparametrized models.
Tasks
Published 2019-11-05
URL https://arxiv.org/abs/1911.01544v1
PDF https://arxiv.org/pdf/1911.01544v1.pdf
PWC https://paperswithcode.com/paper/the-generalization-error-of-max-margin-linear
Repo
Framework

Cross-modal Zero-shot Hashing

Title Cross-modal Zero-shot Hashing
Authors Xuanwu Liu, Zhao Li, Jun Wang, Guoxian Yu, Carlotta Domeniconi, Xiangliang Zhang
Abstract Hashing has been widely studied for big data retrieval due to its low storage cost and fast query speed. Zero-shot hashing (ZSH) aims to learn a hashing model that is trained using only samples from seen categories, but can generalize well to samples of unseen categories. ZSH generally uses category attributes to seek a semantic embedding space to transfer knowledge from seen categories to unseen ones. As a result, it may perform poorly when labeled data are insufficient. ZSH methods are mainly designed for single-modality data, which prevents their application to widespread multi-modal data. On the other hand, existing cross-modal hashing solutions assume that all the modalities share the same category labels, while in practice the labels of different data modalities may be different. To address these issues, we propose a general Cross-modal Zero-shot Hashing (CZHash) solution to effectively leverage unlabeled and labeled multi-modality data with different label spaces. CZHash first quantifies the composite similarity between instances using label and feature information. It then defines an objective function to achieve deep feature learning compatible with composite similarity preservation, category attribute space learning, and hash coding function learning. CZHash further introduces an alternative optimization procedure to jointly optimize these learning objectives. Experiments on benchmark multi-modal datasets show that CZHash significantly outperforms related representative hashing approaches in both effectiveness and adaptability.
Tasks
Published 2019-08-19
URL https://arxiv.org/abs/1908.07388v1
PDF https://arxiv.org/pdf/1908.07388v1.pdf
PWC https://paperswithcode.com/paper/cross-modal-zero-shot-hashing
Repo
Framework

Physicist’s Journeys Through the AI World - A Topical Review. There is no royal road to unsupervised learning

Title Physicist’s Journeys Through the AI World - A Topical Review. There is no royal road to unsupervised learning
Authors Imad Alhousseini, Wissam Chemissany, Fatima Kleit, Aly Nasrallah
Abstract Artificial Intelligence (AI), defined in its simplest form, is a technological tool that makes machines intelligent. Since learning is at the core of intelligence, machine learning is a core sub-field of AI. Deep learning, a subclass of machine learning, was then developed to address the limitations of its predecessors. AI has gained prominence over the past few years due to its considerable progress in various fields, and it has made deep inroads into research. This has led physicists to direct their research towards implementing AI tools, with the central aim of gaining better understanding and enriching their intuition. This review article is meant to supplement previous efforts to bridge the gap between AI and physics, and to take a serious step forward in filtering out the “Babelian” clashes brought about by such gaps. This first requires fundamental knowledge of common AI tools. To this end, the review’s primary focus is on deep learning models called artificial neural networks, which train themselves through different learning processes. It also discusses the concept of Markov decision processes. Finally, as a shortcut to the main goal, the review thoroughly examines how these neural networks are capable of constructing a physical theory describing some observations without applying any previous physical knowledge.
Tasks
Published 2019-05-02
URL http://arxiv.org/abs/1905.01023v1
PDF http://arxiv.org/pdf/1905.01023v1.pdf
PWC https://paperswithcode.com/paper/physicists-journeys-through-the-ai-world-a
Repo
Framework

A Real-Time Cross-modality Correlation Filtering Method for Referring Expression Comprehension

Title A Real-Time Cross-modality Correlation Filtering Method for Referring Expression Comprehension
Authors Yue Liao, Si Liu, Guanbin Li, Fei Wang, Yanjie Chen, Chen Qian, Bo Li
Abstract Referring expression comprehension aims to localize the object instance described by a natural language expression. Current referring expression methods perform well. However, none of them is able to achieve real-time inference without a drop in accuracy. The reason for the relatively slow inference speed is that these methods artificially split referring expression comprehension into two sequential stages: proposal generation and proposal ranking. This does not exactly conform to the habits of human cognition. To this end, we propose a novel Real-time Cross-modality Correlation Filtering method (RCCF). RCCF reformulates referring expression comprehension as a correlation filtering process. The expression is first mapped from the language domain to the visual domain and then treated as a template (kernel) to perform correlation filtering on the image feature map. The peak value in the correlation heatmap indicates the center point of the target box. In addition, RCCF also regresses a 2-D object size and a 2-D offset. The center point coordinates, object size and center point offset together form the target bounding box. Our method runs at 40 FPS while achieving leading performance on the RefClef, RefCOCO, RefCOCO+, and RefCOCOg benchmarks. On the challenging RefClef dataset, our method almost doubles the state-of-the-art performance (34.70% increased to 63.79%). We hope this work can draw more attention and studies to the new cross-modality correlation filtering framework as well as the one-stage framework for referring expression comprehension.
Tasks
Published 2019-09-16
URL https://arxiv.org/abs/1909.07072v3
PDF https://arxiv.org/pdf/1909.07072v3.pdf
PWC https://paperswithcode.com/paper/a-real-time-cross-modality-correlation
Repo
Framework
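
The decoding step described in the RCCF abstract above (correlate a language-derived kernel with the image feature map, take the heatmap peak as the box center, then apply the regressed offset and size) can be sketched as follows. This is a toy illustration under stated assumptions: the kernel is treated as a single per-location vector (a 1x1 filter), the `stride` parameter and the (width, height) and (dx, dy) layouts are hypothetical, and none of the names come from the authors' code.

```python
def correlation_heatmap(feature_map, kernel):
    """Dot product between the kernel vector and the feature vector at each cell.
    feature_map is a list of rows, each row a list of per-cell channel tuples."""
    return [[sum(f * k for f, k in zip(cell, kernel)) for cell in row]
            for row in feature_map]

def peak_location(heatmap):
    """(row, col) of the largest heatmap value; the first one wins on ties."""
    best_value, best_pos = None, None
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if best_value is None or value > best_value:
                best_value, best_pos = value, (r, c)
    return best_pos

def decode_box(heatmap, sizes, offsets, stride=1.0):
    """Combine the peak cell, its regressed (dx, dy) offset and (width, height)
    size into an (x_min, y_min, x_max, y_max) box in image coordinates."""
    row, col = peak_location(heatmap)
    width, height = sizes[row][col]
    dx, dy = offsets[row][col]
    center_x = (col + dx) * stride
    center_y = (row + dy) * stride
    return (center_x - width / 2, center_y - height / 2,
            center_x + width / 2, center_y + height / 2)

# Toy 2x3 feature map with two channels per cell.
feature_map = [
    [(0.1, 0.0), (0.2, 0.1), (0.0, 0.3)],
    [(0.0, 0.2), (0.9, 0.8), (0.1, 0.1)],
]
kernel = (1.0, 1.0)  # stands in for the expression mapped to the visual domain
sizes = [[(2.0, 1.0)] * 3 for _ in range(2)]
offsets = [[(0.5, 0.25)] * 3 for _ in range(2)]
heatmap = correlation_heatmap(feature_map, kernel)
print(decode_box(heatmap, sizes, offsets, stride=4.0))
```

Because the box is read off a single peak, no proposal generation or ranking stage is needed, which is the one-stage property the abstract emphasizes.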

testRNN: Coverage-guided Testing on Recurrent Neural Networks

Title testRNN: Coverage-guided Testing on Recurrent Neural Networks
Authors Wei Huang, Youcheng Sun, Xiaowei Huang, James Sharp
Abstract Recurrent neural networks (RNNs) have been widely applied to various sequential tasks such as text processing, video recognition, and molecular property prediction. We introduce the first coverage-guided testing tool, coined testRNN, for the verification and validation of a major class of RNNs, long short-term memory networks (LSTMs). The tool implements a generic mutation-based test case generation method, and it empirically evaluates the robustness of a network using three novel LSTM structural test coverage metrics. Moreover, it is able to help the model designer go through the internal data flow processing of the LSTM layer. The tool is available through: https://github.com/TrustAI/testRNN under the BSD 3-Clause licence.
Tasks Molecular Property Prediction, Video Recognition
Published 2019-06-20
URL https://arxiv.org/abs/1906.08557v1
PDF https://arxiv.org/pdf/1906.08557v1.pdf
PWC https://paperswithcode.com/paper/testrnn-coverage-guided-testing-on-recurrent
Repo
Framework

An empirical study of pretrained representations for few-shot classification

Title An empirical study of pretrained representations for few-shot classification
Authors Tiago Ramalho, Thierry Sousbie, Stefano Peluchetti
Abstract Recent algorithms with state-of-the-art few-shot classification results start their procedure by computing data features output by a large pretrained model. In this paper we systematically investigate which models provide the best representations for a few-shot image classification task when pretrained on the ImageNet dataset. We test their representations when used as the starting point for different few-shot classification algorithms. We observe that models trained on a supervised classification task have higher performance than models trained in an unsupervised manner, even when transferred to out-of-distribution datasets. Models trained with adversarial robustness transfer better, while having slightly lower accuracy than supervised models.
Tasks Few-Shot Image Classification, Image Classification
Published 2019-10-03
URL https://arxiv.org/abs/1910.01319v1
PDF https://arxiv.org/pdf/1910.01319v1.pdf
PWC https://paperswithcode.com/paper/an-empirical-study-of-pretrained
Repo
Framework

Automatic Tip Detection of Surgical Instruments in Biportal Endoscopic Spine Surgery

Title Automatic Tip Detection of Surgical Instruments in Biportal Endoscopic Spine Surgery
Authors Sue Min Cho, Young-Gon Kim, Jinhoon Jeong, Ho-jin Lee, Namkug Kim
Abstract Some endoscopic surgeries require a surgeon to hold the endoscope with one hand and the surgical instruments with the other hand to perform the actual surgery with correct vision. Recent technical advances in deep learning as well as in robotics can introduce robotics to these endoscopic surgeries. This can have numerous advantages by freeing one hand of the surgeon, which will allow the surgeon to use both hands and to use more intricate and sophisticated techniques. Recently, deep learning with convolutional neural networks has achieved state-of-the-art results in computer vision. Therefore, the aim of this study is to automatically detect the tip of the instrument, localize a point, and evaluate detection accuracy in biportal endoscopic spine surgery. The localized point could be used as controller input for robotic endoscopy in these types of endoscopic surgeries.
Tasks
Published 2019-11-07
URL https://arxiv.org/abs/1911.02755v2
PDF https://arxiv.org/pdf/1911.02755v2.pdf
PWC https://paperswithcode.com/paper/automatic-tip-detection-of-surgical
Repo
Framework