January 28, 2020

Paper Group ANR 858

Revealing Backdoors, Post-Training, in DNN Classifiers via Novel Inference on Optimized Perturbations Inducing Group Misclassification. Adversarial Feature Learning in Brain Interfacing: An Experimental Study on Eliminating Drowsiness Effects. Predicting passenger origin-destination in online taxi-hailing systems. An Empirical Study of Factors Affe …

Revealing Backdoors, Post-Training, in DNN Classifiers via Novel Inference on Optimized Perturbations Inducing Group Misclassification


Title	Revealing Backdoors, Post-Training, in DNN Classifiers via Novel Inference on Optimized Perturbations Inducing Group Misclassification
Authors	Zhen Xiang, David J. Miller, George Kesidis
Abstract	Recently, a special type of data poisoning (DP) attack targeting Deep Neural Network (DNN) classifiers, known as a backdoor, was proposed. These attacks do not seek to degrade classification accuracy, but rather to have the classifier learn to classify to a target class whenever the backdoor pattern is present in a test example. Launching backdoor attacks does not require knowledge of the classifier or its training process - it only needs the ability to poison the training set with (a sufficient number of) exemplars containing a sufficiently strong backdoor pattern (labeled with the target class). Here we address post-training detection of backdoor attacks in DNN image classifiers, seldom considered in existing works, wherein the defender does not have access to the poisoned training set, but only to the trained classifier itself, as well as to clean examples from the classification domain. This is an important scenario because a trained classifier may be the basis of e.g. a phone app that will be shared with many users. Detecting backdoors post-training may thus reveal a widespread attack. We propose a purely unsupervised anomaly detection (AD) defense against imperceptible backdoor attacks that: i) detects whether the trained DNN has been backdoor-attacked; ii) infers the source and target classes involved in a detected attack; iii) we even demonstrate it is possible to accurately estimate the backdoor pattern. We test our AD approach, in comparison with alternative defenses, for several backdoor patterns, data sets, and attack settings and demonstrate its favorability. Our defense essentially requires setting a single hyperparameter (the detection threshold), which can e.g. be chosen to fix the system’s false positive rate.
Tasks	Anomaly Detection, data poisoning, Unsupervised Anomaly Detection
Published	2019-08-27
URL	https://arxiv.org/abs/1908.10498v2
PDF	https://arxiv.org/pdf/1908.10498v2.pdf
PWC	https://paperswithcode.com/paper/revealing-backdoors-post-training-in-dnn
Repo
Framework

Adversarial Feature Learning in Brain Interfacing: An Experimental Study on Eliminating Drowsiness Effects


Title	Adversarial Feature Learning in Brain Interfacing: An Experimental Study on Eliminating Drowsiness Effects
Authors	Ozan Ozdenizci, Barry Oken, Tab Memmott, Melanie Fried-Oken, Deniz Erdogmus
Abstract	Across- and within-recording variability in electroencephalographic (EEG) activity is a major limitation in EEG-based brain-computer interfaces (BCIs). Specifically, gradual changes in fatigue and vigilance levels during long EEG recordings and BCI system usage bring significant fluctuations in BCI performance even when these systems are calibrated daily. We address this in an experimental offline study using EEG-based BCI speller usage data acquired over a one-hour duration. As the main part of our methodological approach, we propose the concept of adversarial invariant feature learning for BCIs as a regularization approach on recently expanding EEG deep learning architectures, to learn nuisance-invariant discriminative features. We empirically demonstrate the feasibility of adversarial feature learning for eliminating drowsiness effects from event-related EEG activity features, by using temporal recording block ordering as the source of drowsiness variability.
Tasks	EEG
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09540v1
PDF	https://arxiv.org/pdf/1907.09540v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-feature-learning-in-brain
Repo
Framework

Predicting passenger origin-destination in online taxi-hailing systems


Title	Predicting passenger origin-destination in online taxi-hailing systems
Authors	Pouria Golshanrad, Hamid Mahini, Behnam Bahrak
Abstract	Because of its importance for transportation planning, traffic management, and dispatch optimization, passenger origin-destination prediction has become one of the most important requirements for managing intelligent transportation systems. In this paper, we propose a model to predict the origins and destinations of trips that will occur in the next specified time window. To extract meaningful travel flows, we use K-means clustering in four-dimensional space with a maximum cluster size limit for origin and destination. Because of the large number of clusters, we use non-negative matrix factorization to reduce the number of travel clusters. We also use a stacked recurrent neural network model to predict the trip count in each cluster. A comparison with other existing models shows that our proposed model has 5-7% lower mean absolute percentage error (MAPE) for a 1-hour time window, and 14% lower MAPE for a 30-minute time window.
Tasks
Published	2019-10-17
URL	https://arxiv.org/abs/1910.08145v1
PDF	https://arxiv.org/pdf/1910.08145v1.pdf
PWC	https://paperswithcode.com/paper/predicting-passenger-origin-destination-in
Repo
Framework
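
The abstract above reports its gains as mean absolute percentage error (MAPE). As a reading aid, here is a minimal sketch of how MAPE is commonly computed. The function name `mape` and the choice to skip time windows whose true count is zero are assumptions made for this illustration; the paper's exact evaluation protocol is not given in the abstract.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption for this sketch: entries whose actual value is zero are
    skipped, since the percentage error is undefined for them.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to evaluate")
    total = sum(abs(a - p) / abs(a) for a, p in pairs)
    return 100.0 * total / len(pairs)


# Hypothetical trip counts per cluster for one time window.
true_counts = [100, 200]
predicted_counts = [110, 180]
print(mape(true_counts, predicted_counts))
```

Each of the two hypothetical predictions is off by 10% of the true count, so the average error is 10%.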

An Empirical Study of Factors Affecting Language-Independent Models


Title	An Empirical Study of Factors Affecting Language-Independent Models
Authors	Xiaotong Liu, Yingbei Tong, Anbang Xu, Rama Akkiraju
Abstract	Scaling existing applications and solutions to multiple human languages has traditionally proven to be difficult, mainly due to the language-dependent nature of preprocessing and feature engineering techniques employed in traditional approaches. In this work, we empirically investigate the factors affecting language-independent models built with multilingual representations, including task type, language set and data resource. On the two most representative NLP tasks – sentence classification and sequence labeling – we show that language-independent models can be comparable to or even outperform the models trained using monolingual data, and they are generally more effective on sentence classification. We experiment with language-independent models across many different languages and show that they are more suitable for typologically similar languages. We also explore the effects of different data sizes when training and testing language-independent models, and demonstrate that they are not only suitable for high-resource languages, but also very effective in low-resource languages.
Tasks	Feature Engineering, Sentence Classification
Published	2019-12-30
URL	https://arxiv.org/abs/1912.13106v1
PDF	https://arxiv.org/pdf/1912.13106v1.pdf
PWC	https://paperswithcode.com/paper/an-empirical-study-of-factors-affecting
Repo
Framework

Distributed Correlation-Based Feature Selection in Spark


Title	Distributed Correlation-Based Feature Selection in Spark
Authors	Raul-Jose Palma-Mendoza, Luis de-Marcos, Daniel Rodriguez, Amparo Alonso-Betanzos
Abstract	CFS (Correlation-Based Feature Selection) is an FS algorithm that has been successfully applied to classification problems in many domains. We describe Distributed CFS (DiCFS) as a completely redesigned, scalable, parallel and distributed version of the CFS algorithm, capable of dealing with the large volumes of data typical of big data applications. Two versions of the algorithm were implemented and compared using the Apache Spark cluster computing model, currently gaining popularity due to its much faster processing times than Hadoop’s MapReduce model. We tested our algorithms on four publicly available datasets, each consisting of a large number of instances and two also consisting of a large number of features. The results show that our algorithms were superior in terms of both time-efficiency and scalability. In leveraging a computer cluster, they were able to handle larger datasets than the non-distributed WEKA version while maintaining the quality of the results, i.e., exactly the same features were returned by our algorithms when compared to the original algorithm available in WEKA.
Tasks	Feature Selection
Published	2019-01-31
URL	http://arxiv.org/abs/1901.11286v1
PDF	http://arxiv.org/pdf/1901.11286v1.pdf
PWC	https://paperswithcode.com/paper/distributed-correlation-based-feature
Repo
Framework
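
The abstract above does not spell out how CFS scores a candidate feature subset. The sketch below shows the merit heuristic usually associated with the original (non-distributed) CFS algorithm: merit = k * r_cf / sqrt(k + k(k-1) * r_ff), where k is the subset size, r_cf the mean feature-class correlation and r_ff the mean feature-feature correlation. The function name and the assumption that correlation values are already computed are illustrative only; nothing here reproduces the distributed DiCFS implementations.

```python
import math


def cfs_merit(feature_class_corrs, feature_feature_corrs):
    """Merit of a feature subset under the standard CFS heuristic.

    feature_class_corrs: one correlation per selected feature (with the class).
    feature_feature_corrs: one correlation per pair of selected features.
    Both inputs are assumed to be precomputed for this sketch.
    """
    k = len(feature_class_corrs)
    if k == 0:
        return 0.0
    r_cf = sum(abs(c) for c in feature_class_corrs) / k
    if feature_feature_corrs:
        r_ff = sum(abs(c) for c in feature_feature_corrs) / len(feature_feature_corrs)
    else:
        r_ff = 0.0  # a single feature has no pairwise redundancy
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Two equally predictive features: uncorrelated versus fully redundant.
print(cfs_merit([0.6, 0.6], [0.0]))
print(cfs_merit([0.6, 0.6], [1.0]))
```

The redundant pair scores lower than the uncorrelated pair, which is the behaviour the heuristic is designed to produce: features should correlate with the class but not with each other.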

A joint 3D UNet-Graph Neural Network-based method for Airway Segmentation from chest CTs


Title	A joint 3D UNet-Graph Neural Network-based method for Airway Segmentation from chest CTs
Authors	Antonio Garcia-Uceda Juarez, Raghavendra Selvan, Zaigham Saghir, Marleen de Bruijne
Abstract	We present an end-to-end deep learning segmentation method by combining a 3D UNet architecture with a graph neural network (GNN) model. In this approach, the convolutional layers at the deepest level of the UNet are replaced by a GNN-based module with a series of graph convolutions. The dense feature maps at this level are transformed into a graph input to the GNN module. The incorporation of graph convolutions in the UNet provides nodes in the graph with information that is based on node connectivity, in addition to the local features learnt through the downsampled paths. This information can help improve segmentation decisions. By stacking several graph convolution layers, the nodes can access higher order neighbourhood information without substantial increase in computational expense. We propose two types of node connectivity in the graph adjacency: i) one predefined and based on a regular node neighbourhood, and ii) one dynamically computed during training and using the nearest neighbour nodes in the feature space. We have applied this method to the task of segmenting the airway tree from chest CT scans. Experiments have been performed on 32 CTs from the Danish Lung Cancer Screening Trial dataset. We evaluate the performance of the UNet-GNN models with two types of graph adjacency and compare it with the baseline UNet.
Tasks	3D Medical Imaging Segmentation
Published	2019-08-22
URL	https://arxiv.org/abs/1908.08588v1
PDF	https://arxiv.org/pdf/1908.08588v1.pdf
PWC	https://paperswithcode.com/paper/a-joint-3d-unet-graph-neural-network-based
Repo
Framework
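
The second adjacency type described in the abstract above connects each node to its nearest neighbours in feature space. The following is a minimal pure-Python sketch of that idea. The function name, the use of squared Euclidean distance, and the directed (non-symmetrised) adjacency are assumptions for this illustration; the paper's actual construction may differ.

```python
def knn_adjacency(features, k):
    """Binary adjacency matrix linking each node to its k nearest neighbours.

    features: list of equal-length feature vectors, one per graph node.
    Assumptions for this sketch: squared Euclidean distance, no self-loops,
    and a directed adjacency (row i lists the neighbours of node i).
    """
    n = len(features)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        distances = []
        for j in range(n):
            if j == i:
                continue
            d = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            distances.append((d, j))
        distances.sort()  # ties are broken by node index
        for _, j in distances[:k]:
            adjacency[i][j] = 1
    return adjacency


# Three nodes with one-dimensional hypothetical features.
for row in knn_adjacency([[0.0], [1.0], [10.0]], k=1):
    print(row)
```

Because the neighbours are recomputed from the current features, an adjacency of this kind changes as the features change during training, unlike the predefined regular neighbourhood.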

Lifelong Neural Predictive Coding: Sparsity Yields Less Forgetting when Learning Cumulatively


Title	Lifelong Neural Predictive Coding: Sparsity Yields Less Forgetting when Learning Cumulatively
Authors	Alexander Ororbia, Ankur Mali, Daniel Kifer, C. Lee Giles
Abstract	In lifelong learning systems, especially those based on artificial neural networks, one of the biggest obstacles is the severe inability to retain old knowledge as new information is encountered. This phenomenon is known as catastrophic forgetting. In this paper, we present a new connectionist model, the Sequential Neural Coding Network, and its learning procedure, grounded in the neurocognitive theory of predictive coding. The architecture experiences significantly less forgetting as compared to standard neural models and outperforms a variety of previously proposed remedies and methods when trained across multiple task datasets in a stream-like fashion. The promising performance demonstrated in our experiments offers motivation that directly incorporating mechanisms prominent in real neuronal systems, such as competition, sparse activation patterns, and iterative input processing, can create viable pathways for tackling the challenge of lifelong machine learning.
Tasks
Published	2019-05-25
URL	https://arxiv.org/abs/1905.10696v1
PDF	https://arxiv.org/pdf/1905.10696v1.pdf
PWC	https://paperswithcode.com/paper/lifelong-neural-predictive-coding-sparsity
Repo
Framework

Relationship Explainable Multi-objective Reinforcement Learning with Semantic Explainability Generation


Title	Relationship Explainable Multi-objective Reinforcement Learning with Semantic Explainability Generation
Authors	Huixin Zhan, Yongcan Cao
Abstract	Solving multi-objective optimization problems is important in various applications where users are interested in obtaining optimal policies subject to multiple, yet often conflicting objectives. A typical approach to obtain optimal policies is to first construct a loss function that is based on the scalarization of individual objectives, and then find the optimal policy that minimizes the loss. However, optimizing the scalarized (and weighted) loss does not necessarily provide a guarantee of high performance on each possibly conflicting objective because it is challenging to assign the right weights without knowing the relationship among these objectives. Moreover, the effectiveness of these gradient descent algorithms is limited by the agent’s ability to explain their decisions and actions to human users. The purpose of this study is two-fold. First, we propose a vector value function based multi-objective reinforcement learning (V2f-MORL) approach that seeks to quantify the inter-objective relationship via reinforcement learning (RL) when the impact of one objective on others is unknown a priori. In particular, we construct one actor and multiple critics that can co-learn the policy and inter-objective relationship matrix (IORM), quantifying the impact of objectives on each other, in an iterative way. Second, we provide a semantic representation that can uncover the trade-off of decision policies made by users to reconcile conflicting objectives based on the proposed V2f-MORL approach for the explainability of the generated behaviors subject to given optimization objectives. We demonstrate the effectiveness of the proposed approach via a MuJoCo based robotics case study.
Tasks
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12268v1
PDF	https://arxiv.org/pdf/1909.12268v1.pdf
PWC	https://paperswithcode.com/paper/relationship-explainable-multi-objective
Repo
Framework
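
The abstract above argues that a scalarized loss is sensitive to the choice of weights. The toy sketch below illustrates only that baseline, a weighted-sum scalarization, and not the V2f-MORL method itself. The policy names and loss values are hypothetical.

```python
def scalarize(objective_losses, weights):
    """Weighted-sum scalarization of per-objective losses."""
    if len(objective_losses) != len(weights):
        raise ValueError("need one weight per objective")
    return sum(w * loss for w, loss in zip(weights, objective_losses))


def best_policy(candidates, weights):
    """Name of the candidate with the lowest scalarized loss."""
    return min(candidates, key=lambda name: scalarize(candidates[name], weights))


# Hypothetical per-objective losses for two candidate policies.
candidates = {
    "policy_a": (1.0, 4.0),  # good on objective 1, poor on objective 2
    "policy_b": (3.0, 1.0),  # poor on objective 1, good on objective 2
}
print(best_policy(candidates, weights=(0.8, 0.2)))
print(best_policy(candidates, weights=(0.2, 0.8)))
```

Swapping the weights flips which policy is selected, which is why fixing weights without knowing how the objectives relate gives no guarantee on any individual objective.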

Dirac Delta Regression: Conditional Density Estimation with Clinical Trials


Title	Dirac Delta Regression: Conditional Density Estimation with Clinical Trials
Authors	Eric V. Strobl, Shyam Visweswaran
Abstract	Personalized medicine seeks to identify the causal effect of treatment for a particular patient as opposed to a clinical population at large. Most investigators estimate such personalized treatment effects by regressing the outcome of a randomized clinical trial (RCT) on patient covariates. The realized value of the outcome may however lie far from the conditional expectation. We therefore introduce a method called Dirac Delta Regression (DDR) that estimates the entire conditional density from RCT data in order to visualize the probabilities across all possible treatment outcomes. DDR transforms the outcome into a set of asymptotically Dirac delta distributions and then estimates the density using non-linear regression. The algorithm can identify significant patient-specific treatment effects even when no population level effect exists. Moreover, DDR outperforms state-of-the-art algorithms in conditional density estimation on average regardless of the need for causal inference.
Tasks	Causal Inference, Density Estimation
Published	2019-05-24
URL	https://arxiv.org/abs/1905.10330v1
PDF	https://arxiv.org/pdf/1905.10330v1.pdf
PWC	https://paperswithcode.com/paper/dirac-delta-regression-conditional-density
Repo
Framework

Routine Modeling with Time Series Metric Learning


Title	Routine Modeling with Time Series Metric Learning
Authors	Paul Compagnon, Grégoire Lefebvre, Stefan Duffner, Christophe Garcia
Abstract	Traditionally, the automatic recognition of human activities is performed with supervised learning algorithms on limited sets of specific activities. This work proposes to recognize recurrent activity patterns, called routines, instead of precisely defined activities. The modeling of routines is defined as a metric learning problem, and an architecture, called SS2S, based on sequence-to-sequence models is proposed to learn a distance between time series. This approach only relies on inertial data and is thus non intrusive and preserves privacy. Experimental results show that a clustering algorithm provided with the learned distance is able to recover daily routines.
Tasks	Metric Learning, Time Series
Published	2019-07-08
URL	https://arxiv.org/abs/1907.04666v1
PDF	https://arxiv.org/pdf/1907.04666v1.pdf
PWC	https://paperswithcode.com/paper/routine-modeling-with-time-series-metric
Repo
Framework

Multi-scale Aggregation R-CNN for 2D Multi-person Pose Estimation


Title	Multi-scale Aggregation R-CNN for 2D Multi-person Pose Estimation
Authors	Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee
Abstract	Multi-person pose estimation from a 2D image is challenging because it requires not only keypoint localization but also human detection. In state-of-the-art top-down methods, multi-scale information is a crucial factor for accurate pose estimation because it contains both local information around the keypoints and global information about the entire person. Although multi-scale information allows these methods to achieve state-of-the-art performance, the top-down methods still require a huge amount of computation because they need to use an additional human detector to feed the cropped human image to their pose estimation model. To effectively utilize multi-scale information with less computation, we propose a multi-scale aggregation R-CNN (MSA R-CNN). It consists of a multi-scale RoIAlign block (MS-RoIAlign) and a multi-scale keypoint head network (MS-KpsNet), which are designed to effectively utilize multi-scale information. Also, in contrast to previous top-down methods, the MSA R-CNN performs human detection and keypoint localization in a single model, which results in reduced computation. The proposed model achieved the best performance among single model-based methods, and its results are comparable to those of separated model-based methods with a smaller amount of computation on the publicly available 2D multi-person keypoint localization dataset.
Tasks	Human Detection, Multi-Person Pose Estimation, Pose Estimation
Published	2019-05-10
URL	https://arxiv.org/abs/1905.03912v1
PDF	https://arxiv.org/pdf/1905.03912v1.pdf
PWC	https://paperswithcode.com/paper/multi-scale-aggregation-r-cnn-for-2d-multi
Repo
Framework

The Choice Function Framework for Online Policy Improvement


Title	The Choice Function Framework for Online Policy Improvement
Authors	Murugeswari Issakkimuthu, Alan Fern, Prasad Tadepalli
Abstract	There are notable examples of online search improving over hand-coded or learned policies (e.g. AlphaZero) for sequential decision making. It is not clear, however, whether or not policy improvement is guaranteed for many of these approaches, even when given a perfect evaluation function and transition model. Indeed, simple counter examples show that seemingly reasonable online search procedures can hurt performance compared to the original policy. To address this issue, we introduce the choice function framework for analyzing online search procedures for policy improvement. A choice function specifies the actions to be considered at every node of a search tree, with all other actions being pruned. Our main contribution is to give sufficient conditions for stationary and non-stationary choice functions to guarantee that the value achieved by online search is no worse than the original policy. In addition, we describe a general parametric class of choice functions that satisfy those conditions and present an illustrative use case of the framework’s empirical utility.
Tasks	Decision Making
Published	2019-10-01
URL	https://arxiv.org/abs/1910.00614v2
PDF	https://arxiv.org/pdf/1910.00614v2.pdf
PWC	https://paperswithcode.com/paper/the-choice-function-framework-for-online
Repo
Framework

Random Function Priors for Correlation Modeling


Title	Random Function Priors for Correlation Modeling
Authors	Aonan Zhang, John Paisley
Abstract	The likelihood model of high dimensional data $X_n$ can often be expressed as $p(X_n \mid Z_n, \theta)$, where $\theta \mathrel{\mathop:}= (\theta_k)_{k\in[K]}$ is a collection of hidden features shared across objects, indexed by $n$, and $Z_n$ is a non-negative factor loading vector with $K$ entries where $Z_{nk}$ indicates the strength of $\theta_k$ used to express $X_n$. In this paper, we introduce random function priors for $Z_n$ for modeling correlations among its $K$ dimensions $Z_{n1}$ through $Z_{nK}$, which we call population random measure embedding (PRME). Our model can be viewed as a generalized paintbox model \cite{Broderick13} using random functions, and can be learned efficiently with neural networks via amortized variational inference. We derive our Bayesian nonparametric method by applying a representation theorem on separately exchangeable discrete random measures.
Tasks
Published	2019-05-09
URL	https://arxiv.org/abs/1905.03826v2
PDF	https://arxiv.org/pdf/1905.03826v2.pdf
PWC	https://paperswithcode.com/paper/random-function-priors-for-correlation
Repo
Framework

Center-Extraction-Based Three Dimensional Nuclei Instance Segmentation of Fluorescence Microscopy Images


Title	Center-Extraction-Based Three Dimensional Nuclei Instance Segmentation of Fluorescence Microscopy Images
Authors	David Joon Ho, Shuo Han, Chichen Fu, Paul Salama, Kenneth W. Dunn, Edward J. Delp
Abstract	Fluorescence microscopy is an essential tool for the analysis of 3D subcellular structures in tissue. An important step in the characterization of tissue involves nuclei segmentation. In this paper, a two-stage method for segmentation of nuclei using convolutional neural networks (CNNs) is described. In particular, since creating labeled volumes manually for training purposes is not practical due to the size and complexity of the 3D data sets, the paper describes a method for generating synthetic microscopy volumes based on a spatially constrained cycle-consistent adversarial network. The proposed method is tested on multiple real microscopy data sets and outperforms other commonly used segmentation techniques.
Tasks	Instance Segmentation, Semantic Segmentation
Published	2019-09-13
URL	https://arxiv.org/abs/1909.05992v1
PDF	https://arxiv.org/pdf/1909.05992v1.pdf
PWC	https://paperswithcode.com/paper/center-extraction-based-three-dimensional
Repo
Framework

Going Deep in Medical Image Analysis: Concepts, Methods, Challenges and Future Directions


Title	Going Deep in Medical Image Analysis: Concepts, Methods, Challenges and Future Directions
Authors	Fouzia Altaf, Syed M. S. Islam, Naveed Akhtar, Naeem K. Janjua
Abstract	Medical Image Analysis is currently experiencing a paradigm shift due to Deep Learning. This technology has recently attracted so much interest from the Medical Imaging community that it led to a specialized conference, 'Medical Imaging with Deep Learning', in the year 2018. This article surveys the recent developments in this direction, and provides a critical review of the related major aspects. We organize the reviewed literature according to the underlying Pattern Recognition tasks, and further sub-categorize it following a taxonomy based on human anatomy. This article does not assume prior knowledge of Deep Learning and makes a significant contribution in explaining the core Deep Learning concepts to the non-experts in the Medical community. Unique to this study is the Computer Vision/Machine Learning perspective taken on the advances of Deep Learning in Medical Imaging. This enables us to single out 'lack of appropriately annotated large-scale datasets' as the core challenge (among other challenges) in this research direction. We draw on insights from the sister research fields of Computer Vision, Pattern Recognition and Machine Learning, where the techniques for dealing with such challenges have already matured, to provide promising directions for the Medical Imaging community to fully harness Deep Learning in the future.
Tasks
Published	2019-02-15
URL	http://arxiv.org/abs/1902.05655v1
PDF	http://arxiv.org/pdf/1902.05655v1.pdf
PWC	https://paperswithcode.com/paper/going-deep-in-medical-image-analysis-concepts
Repo
Framework