January 26, 2020

Paper Group ANR 1496

Decorrelated Adversarial Learning for Age-Invariant Face Recognition. Adversarial Attacks on Remote User Authentication Using Behavioural Mouse Dynamics. Diverse Trajectory Forecasting with Determinantal Point Processes. Efficient Neural Architecture Search on Low-Dimensional Data for OCT Image Segmentation. Privately Answering Classification Queri …

Decorrelated Adversarial Learning for Age-Invariant Face Recognition


Title	Decorrelated Adversarial Learning for Age-Invariant Face Recognition
Authors	Hao Wang, Dihong Gong, Zhifeng Li, Wei Liu
Abstract	There has been an increasing research interest in age-invariant face recognition. However, matching faces with big age gaps remains a challenging problem, primarily due to the significant discrepancy of face appearances caused by aging. To reduce such a discrepancy, in this paper we propose a novel algorithm to remove age-related components from features mixed with both identity and age information. Specifically, we factorize a mixed face feature into two uncorrelated components: identity-dependent component and age-dependent component, where the identity-dependent component includes information that is useful for face recognition. To implement this idea, we propose the Decorrelated Adversarial Learning (DAL) algorithm, where a Canonical Mapping Module (CMM) is introduced to find the maximum correlation between the paired features generated by a backbone network, while the backbone network and the factorization module are trained to generate features reducing the correlation. Thus, the proposed model learns the decomposed features of age and identity whose correlation is significantly reduced. Simultaneously, the identity-dependent feature and the age-dependent feature are respectively supervised by ID and age preserving signals to ensure that they both contain the correct information. Extensive experiments are conducted on popular public-domain face aging datasets (FG-NET, MORPH Album 2, and CACD-VS) to demonstrate the effectiveness of the proposed approach.
Tasks	Age-Invariant Face Recognition, Face Recognition
Published	2019-04-10
URL	http://arxiv.org/abs/1904.04972v1
PDF	http://arxiv.org/pdf/1904.04972v1.pdf
PWC	https://paperswithcode.com/paper/decorrelated-adversarial-learning-for-age
Repo
Framework
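
The abstract describes an adversarial game over the correlation between the identity feature and the age feature: one module searches for the projections that correlate most, and the backbone is trained to reduce that correlation. The abstract does not give the exact objective, so the following is only a minimal sketch of the quantity involved, assuming a Pearson correlation between linear projections of the two features. The names `project`, `correlation`, `w_id`, and `w_age` are hypothetical and not taken from the paper.

```python
import math


def project(features, weights):
    """Project each feature vector onto a weight vector (dot product)."""
    return [sum(f * w for f, w in zip(vec, weights)) for vec in features]


def correlation(u, v, eps=1e-12):
    """Pearson correlation between two equally long lists of scalars.

    eps keeps the division finite when one projection is constant.
    """
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v + eps)


# Toy batch: three samples with 2-D "identity" and "age" features (made-up numbers).
id_feats = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
age_feats = [[2.0, 1.0], [4.0, 1.0], [6.0, 1.0]]

# Hypothetical projection weights; in an adversarial setup one player would
# adjust these to raise |rho| while the feature extractor tries to lower it.
w_id, w_age = [1.0, 0.0], [1.0, 0.0]
rho = correlation(project(id_feats, w_id), project(age_feats, w_age))
```

In this toy batch the chosen projections are perfectly linearly related, so `rho` is close to 1; a feature extractor trained against this signal would aim to push it toward 0.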

Adversarial Attacks on Remote User Authentication Using Behavioural Mouse Dynamics


Title	Adversarial Attacks on Remote User Authentication Using Behavioural Mouse Dynamics
Authors	Yi Xiang Marcus Tan, Alfonso Iacovazzi, Ivan Homoliak, Yuval Elovici, Alexander Binder
Abstract	Mouse dynamics is a potential means of authenticating users. Typically, the authentication process is based on classical machine learning techniques, but recently, deep learning techniques have been introduced for this purpose. Although prior research has demonstrated how machine learning and deep learning algorithms can be bypassed by carefully crafted adversarial samples, there has been very little research performed on the topic of behavioural biometrics in the adversarial domain. In an attempt to address this gap, we built a set of attacks, which are applications of several generative approaches, to construct adversarial mouse trajectories that bypass authentication models. These generated mouse sequences will serve as the adversarial samples in the context of our experiments. We also present an analysis of the attack approaches we explored, explaining their limitations. In contrast to previous work, we consider the attacks in a more realistic and challenging setting in which an attacker has access to recorded user data but does not have access to the authentication model or its outputs. We explore three different attack strategies: 1) statistics-based, 2) imitation-based, and 3) surrogate-based; we show that they are able to evade the functionality of the authentication models, thereby impacting their robustness adversely. We show that imitation-based attacks often perform better than surrogate-based attacks, unless, however, the attacker can guess the architecture of the authentication model. In such cases, we propose a potential detection mechanism against surrogate-based attacks.
Tasks
Published	2019-05-28
URL	https://arxiv.org/abs/1905.11831v2
PDF	https://arxiv.org/pdf/1905.11831v2.pdf
PWC	https://paperswithcode.com/paper/adversarial-attacks-on-remote-user
Repo
Framework

Diverse Trajectory Forecasting with Determinantal Point Processes


Title	Diverse Trajectory Forecasting with Determinantal Point Processes
Authors	Ye Yuan, Kris Kitani
Abstract	The ability to forecast a set of likely yet diverse possible future behaviors of an agent (e.g., future trajectories of a pedestrian) is essential for safety-critical perception systems (e.g., autonomous vehicles). In particular, a set of possible future behaviors generated by the system must be diverse to account for all possible outcomes in order to take necessary safety precautions. It is not sufficient to maintain a set of the most likely future outcomes because the set may only contain perturbations of a single outcome. While generative models such as variational autoencoders (VAEs) have been shown to be a powerful tool for learning a distribution over future trajectories, randomly drawn samples from the learned implicit likelihood model may not be diverse – the likelihood model is derived from the training data distribution and the samples will concentrate around the major mode that has most data. In this work, we propose to learn a diversity sampling function (DSF) that generates a diverse and likely set of future trajectories. The DSF maps forecasting context features to a set of latent codes which can be decoded by a generative model (e.g., VAE) into a set of diverse trajectory samples. Concretely, the process of identifying the diverse set of samples is posed as a parameter estimation of the DSF. To learn the parameters of the DSF, the diversity of the trajectory samples is evaluated by a diversity loss based on a determinantal point process (DPP). Gradient descent is performed over the DSF parameters, which in turn move the latent codes of the sample set to find an optimal diverse and likely set of trajectories. Our method is a novel application of DPPs to optimize a set of items (trajectories) in continuous space. We demonstrate the diversity of the trajectories produced by our approach on both low-dimensional 2D trajectory data and high-dimensional human motion data.
Tasks	Autonomous Vehicles, Point Processes
Published	2019-07-11
URL	https://arxiv.org/abs/1907.04967v2
PDF	https://arxiv.org/pdf/1907.04967v2.pdf
PWC	https://paperswithcode.com/paper/diverse-trajectory-forecasting-with
Repo
Framework
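
To make the idea of a DPP-based diversity measure concrete, here is a minimal sketch that scores a set of 2D trajectories by the determinant of a similarity kernel: near-duplicate trajectories make the kernel nearly singular (determinant close to 0), while well-separated ones keep it close to the identity (determinant close to 1). This is a generic DPP-style score and not the loss defined in the paper; the Gaussian kernel, the `bandwidth` parameter, and all function names are assumptions made for illustration.

```python
import math


def rbf_kernel(trajs, bandwidth=1.0):
    """Gaussian similarity matrix between trajectories (lists of (x, y) points)."""
    def sqdist(a, b):
        return sum((ax - bx) ** 2 + (ay - by) ** 2
                   for (ax, ay), (bx, by) in zip(a, b))
    n = len(trajs)
    return [[math.exp(-sqdist(trajs[i], trajs[j]) / (2 * bandwidth ** 2))
             for j in range(n)] for i in range(n)]


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def dpp_diversity(trajs, bandwidth=1.0):
    """Higher values mean the set of trajectories is more spread out."""
    return determinant(rbf_kernel(trajs, bandwidth))


# Three nearly identical trajectories versus three that fan out.
clustered = [[(0, 0), (1, 0.00)], [(0, 0), (1, 0.01)], [(0, 0), (1, 0.02)]]
spread = [[(0, 0), (1, 0.0)], [(0, 0), (1, 3.0)], [(0, 0), (1, -3.0)]]
```

Calling `dpp_diversity(spread)` gives a value near 1, whereas `dpp_diversity(clustered)` is near 0, which is the kind of signal a diversity loss can use to push samples apart.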

Efficient Neural Architecture Search on Low-Dimensional Data for OCT Image Segmentation


Title	Efficient Neural Architecture Search on Low-Dimensional Data for OCT Image Segmentation
Authors	Nils Gessert, Alexander Schlaefer
Abstract	Typically, deep learning architectures are handcrafted for their respective learning problem. As an alternative, neural architecture search (NAS) has been proposed where the architecture’s structure is learned in an additional optimization step. For the medical imaging domain, this approach is very promising as there are diverse problems and imaging modalities that require architecture design. However, NAS is very time-consuming and medical learning problems often involve high-dimensional data with high computational requirements. We propose an efficient approach for NAS in the context of medical, image-based deep learning problems by searching for architectures on low-dimensional data which are subsequently transferred to high-dimensional data. For OCT-based layer segmentation, we demonstrate that a search on 1D data reduces search time by 87.5% compared to a search on 2D data while the final 2D models achieve similar performance.
Tasks	Neural Architecture Search, Semantic Segmentation
Published	2019-05-07
URL	https://arxiv.org/abs/1905.02590v1
PDF	https://arxiv.org/pdf/1905.02590v1.pdf
PWC	https://paperswithcode.com/paper/efficient-neural-architecture-search-on-low
Repo
Framework

Privately Answering Classification Queries in the Agnostic PAC Model


Title	Privately Answering Classification Queries in the Agnostic PAC Model
Authors	Anupama Nandi, Raef Bassily
Abstract	We revisit the problem of differentially private release of classification queries. In this problem, the goal is to design an algorithm that can accurately answer a sequence of classification queries based on a private training set while ensuring differential privacy. We formally study this problem in the agnostic PAC model and derive a new upper bound on the private sample complexity. Our results improve over those obtained in a recent work [BTT18] for the agnostic PAC setting. In particular, we give an improved construction that yields a tighter upper bound on the sample complexity. Moreover, unlike [BTT18], our accuracy guarantee does not involve any blow-up in the approximation error associated with the given hypothesis class. Given any hypothesis class with VC-dimension $d$, we show that our construction can privately answer up to $m$ classification queries with average excess error $\alpha$ using a private sample of size $\approx \frac{d}{\alpha^2}\,\max\left(1, \sqrt{m}\,\alpha^{3/2}\right)$. Using recent results on private learning with auxiliary public data, we extend our construction to show that one can privately answer any number of classification queries with average excess error $\alpha$ using a private sample of size $\approx \frac{d}{\alpha^2}\,\max\left(1, \sqrt{d}\,\alpha\right)$. When $\alpha=O\left(\frac{1}{\sqrt{d}}\right)$, our private sample complexity bound is essentially optimal.
Tasks
Published	2019-07-31
URL	https://arxiv.org/abs/1907.13553v3
PDF	https://arxiv.org/pdf/1907.13553v3.pdf
PWC	https://paperswithcode.com/paper/privately-answering-classification-queries-in
Repo
Framework
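
As a rough numerical illustration of the two sample-size bounds in the abstract, the sketch below evaluates them while ignoring constants and logarithmic factors. It reads the bounds as the products $\frac{d}{\alpha^2}\max(1, \sqrt{m}\,\alpha^{3/2})$ and $\frac{d}{\alpha^2}\max(1, \sqrt{d}\,\alpha)$, which is an interpretation of the stated formulas; the function names are hypothetical.

```python
import math


def private_sample_size(d, alpha, m):
    """Order of magnitude of (d / alpha^2) * max(1, sqrt(m) * alpha^(3/2)).

    d: VC-dimension, alpha: average excess error, m: number of queries.
    Constants and log factors are ignored.
    """
    return (d / alpha ** 2) * max(1.0, math.sqrt(m) * alpha ** 1.5)


def private_sample_size_with_public_data(d, alpha):
    """Order of magnitude of (d / alpha^2) * max(1, sqrt(d) * alpha).

    Applies to any number of queries, per the extension in the abstract.
    """
    return (d / alpha ** 2) * max(1.0, math.sqrt(d) * alpha)


# Made-up example values, not taken from the paper.
few_queries = private_sample_size(d=100, alpha=0.1, m=1)
many_queries = private_sample_size(d=100, alpha=0.1, m=10_000)
with_public = private_sample_size_with_public_data(d=100, alpha=0.05)
```

With $\alpha \le 1/\sqrt{d}$ the second maximum equals 1, so the bound reduces to $d/\alpha^2$, matching the regime the abstract calls essentially optimal.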

Deep Reinforcement Learning meets Graph Neural Networks: exploring a routing optimization use case


Title	Deep Reinforcement Learning meets Graph Neural Networks: exploring a routing optimization use case
Authors	Paul Almasan, José Suárez-Varela, Arnau Badia-Sampera, Krzysztof Rusek, Pere Barlet-Ros, Albert Cabellos-Aparicio
Abstract	Recent advances in Deep Reinforcement Learning (DRL) have shown a significant improvement in decision-making problems. The networking community has started to investigate how DRL can provide a new breed of solutions to relevant optimization problems, such as routing. However, most state-of-the-art DRL-based networking techniques fail to generalize: they can only operate over network topologies seen during training, not over new topologies. The reason behind this important limitation is that existing DRL networking solutions use standard neural networks (e.g., fully connected), which are unable to learn graph-structured information. In this paper we propose to use Graph Neural Networks (GNN) in combination with DRL. GNNs have recently been proposed to model graphs, and our novel DRL+GNN architecture is able to learn, operate and generalize over arbitrary network topologies. To showcase its generalization capabilities, we evaluate it on an Optical Transport Network (OTN) scenario, where the agent needs to allocate traffic demands efficiently. Our results show that our DRL+GNN agent is able to achieve outstanding performance in topologies unseen during training.
Tasks	Decision Making
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07421v2
PDF	https://arxiv.org/pdf/1910.07421v2.pdf
PWC	https://paperswithcode.com/paper/deep-reinforcement-learning-meets-graph
Repo
Framework

Volume Doubling Condition and a Local Poincaré Inequality on Unweighted Random Geometric Graphs


Title	Volume Doubling Condition and a Local Poincaré Inequality on Unweighted Random Geometric Graphs
Authors	Franziska Göbel, Gilles Blanchard
Abstract	The aim of this paper is to establish two fundamental measure-metric properties of particular random geometric graphs. We consider $\varepsilon$-neighborhood graphs whose vertices are drawn independently and identically distributed from a common distribution defined on a regular submanifold of $\mathbb{R}^K$. We show that a volume doubling condition (VD) and local Poincaré inequality (LPI) hold for the random geometric graph (with high probability, and uniformly over all shortest path distance balls in a certain radius range) under suitable regularity conditions of the underlying submanifold and the sampling distribution.
Tasks
Published	2019-07-06
URL	https://arxiv.org/abs/1907.03192v2
PDF	https://arxiv.org/pdf/1907.03192v2.pdf
PWC	https://paperswithcode.com/paper/volume-doubling-condition-and-a-local
Repo
Framework

Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing – A Tale of Two Parsers Revisited


Title	Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing – A Tale of Two Parsers Revisited
Authors	Artur Kulmizev, Miryam de Lhoneux, Johannes Gontrum, Elena Fano, Joakim Nivre
Abstract	Transition-based and graph-based dependency parsers have previously been shown to have complementary strengths and weaknesses: transition-based parsers exploit rich structural features but suffer from error propagation, while graph-based parsers benefit from global optimization but have restricted feature scope. In this paper, we show that, even though some details of the picture have changed after the switch to neural networks and continuous representations, the basic trade-off between rich features and global optimization remains essentially the same. Moreover, we show that deep contextualized word embeddings, which allow parsers to pack information about global sentence structure into local feature representations, benefit transition-based parsers more than graph-based parsers, making the two approaches virtually equivalent in terms of both accuracy and error profile. We argue that the reason is that these representations help prevent search errors and thereby allow transition-based parsers to better exploit their inherent strength of making accurate local decisions. We support this explanation by an error analysis of parsing experiments on 13 languages.
Tasks	Dependency Parsing, Word Embeddings
Published	2019-08-20
URL	https://arxiv.org/abs/1908.07397v2
PDF	https://arxiv.org/pdf/1908.07397v2.pdf
PWC	https://paperswithcode.com/paper/deep-contextualized-word-embeddings-in
Repo
Framework

The generalization error of max-margin linear classifiers: High-dimensional asymptotics in the overparametrized regime


Title	The generalization error of max-margin linear classifiers: High-dimensional asymptotics in the overparametrized regime
Authors	Andrea Montanari, Feng Ruan, Youngtak Sohn, Jun Yan
Abstract	Modern machine learning models are often so complex that they achieve vanishing classification error on the training set. Max-margin linear classifiers are among the simplest classification methods that have zero training error (with linearly separable data). Despite this simplicity, their high-dimensional behavior is not yet completely understood. We assume to be given i.i.d. data $(y_i,{\boldsymbol x}_i)$, $i\le n$ with ${\boldsymbol x}_i\sim {\sf N}({\boldsymbol 0},{\boldsymbol \Sigma})$ a $p$-dimensional Gaussian feature vector, and $y_i \in\{+1,-1\}$ a label whose distribution depends on a linear combination of the covariates $\langle {\boldsymbol \theta}_*,{\boldsymbol x}_i\rangle$. We consider the proportional asymptotics $n,p\to\infty$ with $p/n\to \psi$, and derive exact expressions for the limiting prediction error. Our asymptotic results match simulations already when $n,p$ are of the order of a few hundreds. We explore several choices for the pair $({\boldsymbol \theta}_*,{\boldsymbol \Sigma})$, and show that the resulting generalization curve (test error as a function of the overparametrization ratio $\psi=p/n$) is qualitatively different, depending on this choice. In particular we consider a specific structure of $({\boldsymbol \theta}_*,{\boldsymbol \Sigma})$ that captures the behavior of nonlinear random feature models or, equivalently, two-layer neural networks with random first layer weights. In this case, we observe that the test error is monotone decreasing in the number of parameters. This finding agrees with the recently developed “double descent” phenomenology for overparametrized models.
Tasks
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01544v1
PDF	https://arxiv.org/pdf/1911.01544v1.pdf
PWC	https://paperswithcode.com/paper/the-generalization-error-of-max-margin-linear
Repo
Framework

Cross-modal Zero-shot Hashing


Title	Cross-modal Zero-shot Hashing
Authors	Xuanwu Liu, Zhao Li, Jun Wang, Guoxian Yu, Carlotta Domeniconi, Xiangliang Zhang
Abstract	Hashing has been widely studied for big data retrieval due to its low storage cost and fast query speed. Zero-shot hashing (ZSH) aims to learn a hashing model that is trained using only samples from seen categories, but can generalize well to samples of unseen categories. ZSH generally uses category attributes to seek a semantic embedding space to transfer knowledge from seen categories to unseen ones. As a result, it may perform poorly when labeled data are insufficient. ZSH methods are mainly designed for single-modality data, which prevents their application to the widely spread multi-modal data. On the other hand, existing cross-modal hashing solutions assume that all the modalities share the same category labels, while in practice the labels of different data modalities may be different. To address these issues, we propose a general Cross-modal Zero-shot Hashing (CZHash) solution to effectively leverage unlabeled and labeled multi-modality data with different label spaces. CZHash first quantifies the composite similarity between instances using label and feature information. It then defines an objective function to achieve deep feature learning compatible with the composite similarity preserving, category attribute space learning, and hashing coding function learning. CZHash further introduces an alternative optimization procedure to jointly optimize these learning objectives. Experiments on benchmark multi-modal datasets show that CZHash significantly outperforms related representative hashing approaches both on effectiveness and adaptability.
Tasks
Published	2019-08-19
URL	https://arxiv.org/abs/1908.07388v1
PDF	https://arxiv.org/pdf/1908.07388v1.pdf
PWC	https://paperswithcode.com/paper/cross-modal-zero-shot-hashing
Repo
Framework

Physicist’s Journeys Through the AI World - A Topical Review. There is no royal road to unsupervised learning


Title	Physicist’s Journeys Through the AI World - A Topical Review. There is no royal road to unsupervised learning
Authors	Imad Alhousseini, Wissam Chemissany, Fatima Kleit, Aly Nasrallah
Abstract	Artificial Intelligence (AI), defined in its simplest form, is a technological tool that makes machines intelligent. Since learning is at the core of intelligence, machine learning poses itself as a core sub-field of AI. Then there comes a subclass of machine learning, known as deep learning, to address the limitations of its predecessors. AI has generally acquired its prominence over the past few years due to its considerable progress in various fields. AI has vastly invaded the realm of research. This has led physicists to attentively direct their research towards implementing AI tools. Their central aim has been to gain better understanding and enrich their intuition. This review article is meant to supplement the previously presented efforts to bridge the gap between AI and physics, and take a serious step forward to filter out the “Babelian” clashes brought about from such gaps. This necessitates first having fundamental knowledge about common AI tools. To this end, the review’s primary focus shall be on deep learning models called artificial neural networks. They are deep learning models which train themselves through different learning processes. It also discusses the concept of Markov decision processes. Finally, as a shortcut to the main goal, the review thoroughly examines how these neural networks are capable of constructing a physical theory describing some observations without applying any previous physical knowledge.
Tasks
Published	2019-05-02
URL	http://arxiv.org/abs/1905.01023v1
PDF	http://arxiv.org/pdf/1905.01023v1.pdf
PWC	https://paperswithcode.com/paper/physicists-journeys-through-the-ai-world-a
Repo
Framework

A Real-Time Cross-modality Correlation Filtering Method for Referring Expression Comprehension


Title	A Real-Time Cross-modality Correlation Filtering Method for Referring Expression Comprehension
Authors	Yue Liao, Si Liu, Guanbin Li, Fei Wang, Yanjie Chen, Chen Qian, Bo Li
Abstract	Referring expression comprehension aims to localize the object instance described by a natural language expression. Current referring expression methods have achieved good performance. However, none of them is able to achieve real-time inference without a drop in accuracy. The reason for the relatively slow inference speed is that these methods artificially split referring expression comprehension into two sequential stages: proposal generation and proposal ranking. This does not exactly conform to the habits of human cognition. To this end, we propose a novel Real-time Cross-modality Correlation Filtering method (RCCF). RCCF reformulates referring expression comprehension as a correlation filtering process. The expression is first mapped from the language domain to the visual domain and then treated as a template (kernel) to perform correlation filtering on the image feature map. The peak value in the correlation heatmap indicates the center point of the target box. In addition, RCCF also regresses a 2-D object size and 2-D offset. The center point coordinates, object size and center point offset together form the target bounding box. Our method runs at 40 FPS while achieving leading performance on the RefClef, RefCOCO, RefCOCO+, and RefCOCOg benchmarks. On the challenging RefClef dataset, our method almost doubles the state-of-the-art performance (34.70% increased to 63.79%). We hope this work can draw more attention and studies to the new cross-modality correlation filtering framework as well as the one-stage framework for referring expression comprehension.
Tasks
Published	2019-09-16
URL	https://arxiv.org/abs/1909.07072v3
PDF	https://arxiv.org/pdf/1909.07072v3.pdf
PWC	https://paperswithcode.com/paper/a-real-time-cross-modality-correlation
Repo
Framework
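
The correlation filtering step described in the abstract can be illustrated with a small sketch: a template (kernel) is slid over a feature map, and the location of the largest response is taken as the predicted center. This is a single-channel toy version with zero padding, written under the assumption that the kernel is already given; how the kernel is derived from the language expression, and the size and offset regression, are not shown. Function names are hypothetical.

```python
def correlate_same(feature_map, kernel):
    """Zero-padded 2D cross-correlation; output has the feature map's shape.

    Assumes an odd-sized kernel so each response is centered on a pixel.
    """
    height, width = len(feature_map), len(feature_map[0])
    kh, kw = len(kernel), len(kernel[0])
    pad_h, pad_w = kh // 2, kw // 2
    heatmap = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = 0.0
            for u in range(kh):
                for v in range(kw):
                    y, x = i + u - pad_h, j + v - pad_w
                    if 0 <= y < height and 0 <= x < width:
                        total += feature_map[y][x] * kernel[u][v]
            heatmap[i][j] = total
    return heatmap


def peak_location(heatmap):
    """Return (row, col) of the maximum response in the heatmap."""
    _, row, col = max((value, i, j)
                      for i, line in enumerate(heatmap)
                      for j, value in enumerate(line))
    return row, col


# Toy example: a plus-shaped pattern centered at row 3, column 2 of a 6x6 map.
template = [[0, 1, 0],
            [1, 1, 1],
            [0, 1, 0]]
feature_map = [[0] * 6 for _ in range(6)]
for y, x in [(3, 2), (2, 2), (4, 2), (3, 1), (3, 3)]:
    feature_map[y][x] = 1

center = peak_location(correlate_same(feature_map, template))
```

Here the strongest response falls exactly where the pattern is centered, so `center` is `(3, 2)`.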

testRNN: Coverage-guided Testing on Recurrent Neural Networks


Title	testRNN: Coverage-guided Testing on Recurrent Neural Networks
Authors	Wei Huang, Youcheng Sun, Xiaowei Huang, James Sharp
Abstract	Recurrent neural networks (RNNs) have been widely applied to various sequential tasks such as text processing, video recognition, and molecular property prediction. We introduce the first coverage-guided testing tool, coined testRNN, for the verification and validation of a major class of RNNs, long short-term memory networks (LSTMs). The tool implements a generic mutation-based test case generation method, and it empirically evaluates the robustness of a network using three novel LSTM structural test coverage metrics. Moreover, it is able to help the model designer go through the internal data flow processing of the LSTM layer. The tool is available through: https://github.com/TrustAI/testRNN under the BSD 3-Clause licence.
Tasks	Molecular Property Prediction, Video Recognition
Published	2019-06-20
URL	https://arxiv.org/abs/1906.08557v1
PDF	https://arxiv.org/pdf/1906.08557v1.pdf
PWC	https://paperswithcode.com/paper/testrnn-coverage-guided-testing-on-recurrent
Repo
Framework

An empirical study of pretrained representations for few-shot classification


Title	An empirical study of pretrained representations for few-shot classification
Authors	Tiago Ramalho, Thierry Sousbie, Stefano Peluchetti
Abstract	Recent algorithms with state-of-the-art few-shot classification results start their procedure by computing data features output by a large pretrained model. In this paper we systematically investigate which models provide the best representations for a few-shot image classification task when pretrained on the Imagenet dataset. We test their representations when used as the starting point for different few-shot classification algorithms. We observe that models trained on a supervised classification task have higher performance than models trained in an unsupervised manner even when transferred to out-of-distribution datasets. Models trained with adversarial robustness transfer better, while having slightly lower accuracy than supervised models.
Tasks	Few-Shot Image Classification, Image Classification
Published	2019-10-03
URL	https://arxiv.org/abs/1910.01319v1
PDF	https://arxiv.org/pdf/1910.01319v1.pdf
PWC	https://paperswithcode.com/paper/an-empirical-study-of-pretrained
Repo
Framework

Automatic Tip Detection of Surgical Instruments in Biportal Endoscopic Spine Surgery


Title	Automatic Tip Detection of Surgical Instruments in Biportal Endoscopic Spine Surgery
Authors	Sue Min Cho, Young-Gon Kim, Jinhoon Jeong, Ho-jin Lee, Namkug Kim
Abstract	Some endoscopic surgeries require a surgeon to hold the endoscope with one hand and the surgical instruments with the other hand to perform the actual surgery with correct vision. Recent technical advances in deep learning as well as in robotics can introduce robotics to these endoscopic surgeries. This can have numerous advantages by freeing one hand of the surgeon, which will allow the surgeon to use both hands and to use more intricate and sophisticated techniques. Recently, deep learning with convolutional neural networks has achieved state-of-the-art results in computer vision. Therefore, the aim of this study is to automatically detect the tip of the instrument, localize a point, and evaluate detection accuracy in biportal endoscopic spine surgery. The localized point could be used as the controller input for robotic endoscopy in these types of endoscopic surgeries.
Tasks
Published	2019-11-07
URL	https://arxiv.org/abs/1911.02755v2
PDF	https://arxiv.org/pdf/1911.02755v2.pdf
PWC	https://paperswithcode.com/paper/automatic-tip-detection-of-surgical
Repo
Framework