May 7, 2019

Paper Group AWR 93

Business Process Deviance Mining: Review and Evaluation. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. Measuring Neural Net Robustness with Constraints. CliqueCNN: Deep Unsupervised Exemplar Learning. An Architecture for Deep, Hierarchical Generative Models. Deep Transfer Learning for Person Re-identification. B …

Business Process Deviance Mining: Review and Evaluation


Title	Business Process Deviance Mining: Review and Evaluation
Authors	Hoang Nguyen, Marlon Dumas, Marcello La Rosa, Fabrizio Maria Maggi, Suriadi Suriadi
Abstract	Business process deviance refers to the phenomenon whereby a subset of the executions of a business process deviate, in a negative or positive way, with respect to its expected or desirable outcomes. Deviant executions of a business process include those that violate compliance rules, or executions that undershoot or exceed performance targets. Deviance mining is concerned with uncovering the reasons for deviant executions by analyzing business process event logs. This article provides a systematic review and comparative evaluation of deviance mining approaches based on a family of data mining techniques known as sequence classification. Using real-life logs from multiple domains, we evaluate a range of feature types and classification methods in terms of their ability to accurately discriminate between normal and deviant executions of a process. We also analyze the interestingness of the rule sets extracted using different methods. We observe that feature sets extracted using pattern mining techniques only slightly outperform simpler feature sets based on counts of individual activity occurrences in a trace.
Tasks
Published	2016-08-29
URL	http://arxiv.org/abs/1608.08252v1
PDF	http://arxiv.org/pdf/1608.08252v1.pdf
PWC	https://paperswithcode.com/paper/business-process-deviance-mining-review-and
Repo	https://github.com/Abercus/devianceminingoverview
Framework	none
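
The abstract's baseline feature set, counts of individual activity occurrences in a trace, can be sketched as follows. This is a minimal illustration rather than the authors' code: the traces, the activity names and the helper `activity_count_features` are all hypothetical.

```python
from collections import Counter


def activity_count_features(trace, activities):
    """Return one occurrence count per activity, in the order given by `activities`."""
    counts = Counter(trace)
    return [counts[a] for a in activities]


# Hypothetical event-log traces: each trace is a sequence of activity names.
traces = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "check", "reject"],
]

# Fixed feature order: every activity seen in the log, sorted alphabetically.
activities = sorted({a for t in traces for a in t})

# One feature vector per trace, ready to feed into any standard classifier.
features = [activity_count_features(t, activities) for t in traces]
```

Each resulting vector can then be paired with a normal/deviant label and passed to a classifier; the pattern-mining feature sets the review compares against are not shown here.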

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings


Title	Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
Authors	Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, Adam Kalai
Abstract	The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors which has been used in many machine learning and natural language processing tasks. We show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. Geometrically, gender bias is first shown to be captured by a direction in the word embedding. Second, gender neutral words are shown to be linearly separable from gender definition words in the word embedding. Using these properties, we provide a methodology for modifying an embedding to remove gender stereotypes, such as the association between the words receptionist and female, while maintaining desired associations such as between the words queen and female. We define metrics to quantify both direct and indirect gender biases in embeddings, and develop algorithms to “debias” the embedding. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduce gender bias in embeddings while preserving their useful properties such as the ability to cluster related concepts and to solve analogy tasks. The resulting embeddings can be used in applications without amplifying gender bias.
Tasks	Word Embeddings
Published	2016-07-21
URL	http://arxiv.org/abs/1607.06520v1
PDF	http://arxiv.org/pdf/1607.06520v1.pdf
PWC	https://paperswithcode.com/paper/man-is-to-computer-programmer-as-woman-is-to
Repo	https://github.com/if1015-datascience-ufpe/2018-2-ex3-bash
Framework	none
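
Since the abstract states that gender bias is captured by a direction in the embedding, one way to picture the removal of a stereotype is to subtract a word vector's component along that direction. The sketch below is an illustration under stated assumptions, not the paper's algorithm: the 3-dimensional vectors are made up, the bias direction is assumed to be a single difference vector, and `neutralize` is a hypothetical helper name.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def neutralize(word_vec, bias_direction):
    """Remove the component of `word_vec` that lies along `bias_direction`."""
    g = normalize(bias_direction)
    projection = dot(word_vec, g)
    return [w - projection * gi for w, gi in zip(word_vec, g)]


# Toy vectors (assumed values, not taken from any trained embedding).
he = [0.8, 0.1, 0.3]
she = [-0.6, 0.1, 0.3]
gender_direction = [h - s for h, s in zip(he, she)]

receptionist = [-0.4, 0.5, 0.2]
debiased = neutralize(receptionist, gender_direction)
```

After the projection is removed, `debiased` is orthogonal to `gender_direction`, so the word no longer leans toward either end of that axis. Words whose gender association is definitional, such as queen, would be left untouched in this picture.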

Measuring Neural Net Robustness with Constraints


Title	Measuring Neural Net Robustness with Constraints
Authors	Osbert Bastani, Yani Ioannou, Leonidas Lampropoulos, Dimitrios Vytiniotis, Aditya Nori, Antonio Criminisi
Abstract	Despite having high accuracy, neural nets have been shown to be susceptible to adversarial examples, where a small perturbation to an input can cause it to become mislabeled. We propose metrics for measuring the robustness of a neural net and devise a novel algorithm for approximating these metrics based on an encoding of robustness as a linear program. We show how our metrics can be used to evaluate the robustness of deep neural nets with experiments on the MNIST and CIFAR-10 datasets. Our algorithm generates more informative estimates of robustness metrics compared to estimates based on existing algorithms. Furthermore, we show how existing approaches to improving robustness “overfit” to adversarial examples generated using a specific algorithm. Finally, we show that our techniques can be used to additionally improve neural net robustness, both according to the metrics that we propose and according to previously proposed metrics.
Tasks
Published	2016-05-24
URL	http://arxiv.org/abs/1605.07262v2
PDF	http://arxiv.org/pdf/1605.07262v2.pdf
PWC	https://paperswithcode.com/paper/measuring-neural-net-robustness-with
Repo	https://github.com/Microsoft/NeuralNetworkAnalysis
Framework	none

CliqueCNN: Deep Unsupervised Exemplar Learning


Title	CliqueCNN: Deep Unsupervised Exemplar Learning
Authors	Miguel A. Bautista, Artsiom Sanakoyeu, Ekaterina Sutter, Björn Ommer
Abstract	Exemplar learning is a powerful paradigm for discovering visual similarities in an unsupervised manner. In this context, however, the recent breakthrough in deep learning could not yet unfold its full potential. With only a single positive sample, a great imbalance between one positive and many negatives, and unreliable relationships between most samples, training of convolutional neural networks is impaired. Given weak estimates of local distance, we propose a single optimization problem to extract batches of samples with mutually consistent relations. Conflicting relations are distributed over different batches and similar samples are grouped into compact cliques. Learning exemplar similarities is framed as a sequence of clique categorization tasks. The CNN then consolidates transitivity relations within and between cliques and learns a single representation for all samples without the need for labels. The proposed unsupervised approach has shown competitive performance on detailed posture analysis and object classification.
Tasks	Object Classification
Published	2016-08-31
URL	http://arxiv.org/abs/1608.08792v1
PDF	http://arxiv.org/pdf/1608.08792v1.pdf
PWC	https://paperswithcode.com/paper/cliquecnn-deep-unsupervised-exemplar-learning
Repo	https://github.com/asanakoy/cliquecnn
Framework	none

An Architecture for Deep, Hierarchical Generative Models


Title	An Architecture for Deep, Hierarchical Generative Models
Authors	Philip Bachman
Abstract	We present an architecture which lets us train deep, directed generative models with many layers of latent variables. We include deterministic paths between all latent variables and the generated output, and provide a richer set of connections between computations for inference and generation, which enables more effective communication of information throughout the model during training. To improve performance on natural images, we incorporate a lightweight autoregressive model in the reconstruction distribution. These techniques permit end-to-end training of models with 10+ layers of latent variables. Experiments show that our approach achieves state-of-the-art performance on standard image modelling benchmarks, can expose latent class structure in the absence of label information, and can provide convincing imputations of occluded regions in natural images.
Tasks
Published	2016-12-08
URL	http://arxiv.org/abs/1612.04739v1
PDF	http://arxiv.org/pdf/1612.04739v1.pdf
PWC	https://paperswithcode.com/paper/an-architecture-for-deep-hierarchical
Repo	https://github.com/Philip-Bachman/MatNets-NIPS
Framework	none

Deep Transfer Learning for Person Re-identification


Title	Deep Transfer Learning for Person Re-identification
Authors	Mengyue Geng, Yaowei Wang, Tao Xiang, Yonghong Tian
Abstract	Person re-identification (Re-ID) poses a unique challenge to deep learning: how to learn a deep model with millions of parameters on a small training set of few or no labels. In this paper, a number of deep transfer learning models are proposed to address the data sparsity problem. First, a deep network architecture is designed which differs from existing deep Re-ID models in that (a) it is more suitable for transferring representations learned from large image classification datasets, and (b) classification loss and verification loss are combined, each of which adopts a different dropout strategy. Second, a two-stepped fine-tuning strategy is developed to transfer knowledge from auxiliary datasets. Third, given an unlabelled Re-ID dataset, a novel unsupervised deep transfer learning model is developed based on co-training. The proposed models outperform the state-of-the-art deep Re-ID models by large margins: we achieve Rank-1 accuracy of 85.4%, 83.7% and 56.3% on CUHK03, Market1501, and VIPeR respectively, whilst on VIPeR, our unsupervised model (45.1%) beats most supervised models.
Tasks	Image Classification, Person Re-Identification, Transfer Learning
Published	2016-11-16
URL	http://arxiv.org/abs/1611.05244v2
PDF	http://arxiv.org/pdf/1611.05244v2.pdf
PWC	https://paperswithcode.com/paper/deep-transfer-learning-for-person-re
Repo	https://github.com/KaiyangZhou/deep-person-reid
Framework	pytorch

Bayesian quantile additive regression trees


Title	Bayesian quantile additive regression trees
Authors	Bereket P. Kindo, Hao Wang, Timothy Hanson, Edsel A. Peña
Abstract	Ensembles of regression trees have become popular statistical tools for estimating the conditional mean given a set of predictors. However, quantile regression trees and their ensembles have not yet garnered much attention despite the increasing popularity of the linear quantile regression model. This work proposes a Bayesian quantile additive regression trees model that shows very good predictive performance, illustrated using simulation studies and real data applications. A further extension to tackle binary classification problems is also considered.
Tasks
Published	2016-07-10
URL	http://arxiv.org/abs/1607.02676v1
PDF	http://arxiv.org/pdf/1607.02676v1.pdf
PWC	https://paperswithcode.com/paper/bayesian-quantile-additive-regression-trees
Repo	https://github.com/bpkindo/bayesqart
Framework	none

Medical Concept Representation Learning from Electronic Health Records and its Application on Heart Failure Prediction


Title	Medical Concept Representation Learning from Electronic Health Records and its Application on Heart Failure Prediction
Authors	Edward Choi, Andy Schuetz, Walter F. Stewart, Jimeng Sun
Abstract	Objective: To transform heterogeneous clinical data from electronic health records into clinically meaningful constructed features using data-driven methods that rely, in part, on temporal relations among data. Materials and Methods: The clinically meaningful representations of medical concepts and patients are the key for health analytic applications. Most existing approaches directly construct features mapped to raw data (e.g., ICD or CPT codes), or utilize some ontology mapping such as SNOMED codes. However, none of the existing approaches leverage EHR data directly for learning such concept representation. We propose a new way to represent heterogeneous medical concepts (e.g., diagnoses, medications and procedures) based on co-occurrence patterns in longitudinal electronic health records. The intuition behind the method is to map medical concepts that co-occur closely in time to similar concept vectors so that their distance will be small. We also derive a simple method to construct patient vectors from the related medical concept vectors. Results: For qualitative evaluation, we study similar medical concepts across diagnosis, medication and procedure. In quantitative evaluation, our proposed representation significantly improves the predictive modeling performance for onset of heart failure (HF), where classification methods (e.g. logistic regression, neural network, support vector machine and K-nearest neighbors) achieve up to 23% improvement in area under the ROC curve (AUC) using this proposed representation. Conclusion: We proposed an effective method for patient and medical concept representation learning. The resulting representation can map relevant concepts together and also improves predictive modeling performance.
Tasks	Representation Learning
Published	2016-02-11
URL	http://arxiv.org/abs/1602.03686v2
PDF	http://arxiv.org/pdf/1602.03686v2.pdf
PWC	https://paperswithcode.com/paper/medical-concept-representation-learning-from
Repo	https://github.com/mp2893/retain
Framework	none

Optimizing Neural Network Hyperparameters with Gaussian Processes for Dialog Act Classification


Title	Optimizing Neural Network Hyperparameters with Gaussian Processes for Dialog Act Classification
Authors	Franck Dernoncourt, Ji Young Lee
Abstract	Systems based on artificial neural networks (ANNs) have achieved state-of-the-art results in many natural language processing tasks. Although ANNs do not require manually engineered features, ANNs have many hyperparameters to be optimized. The choice of hyperparameters significantly impacts models’ performance. However, the ANN hyperparameters are typically chosen by manual, grid, or random search, which either requires expert experience or is computationally expensive. Recent approaches based on Bayesian optimization using Gaussian processes (GPs) offer a more systematic way to automatically pinpoint optimal or near-optimal machine learning hyperparameters. Using a previously published ANN model yielding state-of-the-art results for dialog act classification, we demonstrate that optimizing hyperparameters using GP further improves the results, and reduces the computational time by a factor of 4 compared to a random search. Therefore it is a useful technique for tuning ANN models to yield the best performance for natural language processing tasks.
Tasks	Dialog Act Classification, Gaussian Processes
Published	2016-09-27
URL	http://arxiv.org/abs/1609.08703v1
PDF	http://arxiv.org/pdf/1609.08703v1.pdf
PWC	https://paperswithcode.com/paper/optimizing-neural-network-hyperparameters
Repo	https://github.com/Franck-Dernoncourt/slt2016
Framework	none

Discriminative Correlation Filter with Channel and Spatial Reliability


Title	Discriminative Correlation Filter with Channel and Spatial Reliability
Authors	Alan Lukežič, Tomáš Vojíř, Luka Čehovin, Jiří Matas, Matej Kristan
Abstract	Short-term tracking is an open and challenging problem for which discriminative correlation filters (DCF) have shown excellent performance. We introduce the channel and spatial reliability concepts to DCF tracking and provide a novel learning algorithm for their efficient and seamless integration in the filter update and the tracking process. The spatial reliability map adjusts the filter support to the part of the object suitable for tracking. This both allows the search region to be enlarged and improves tracking of non-rectangular objects. Reliability scores reflect channel-wise quality of the learned filters and are used as feature weighting coefficients in localization. Experimentally, with only two simple standard features, HoGs and Colornames, the novel CSR-DCF method – DCF with Channel and Spatial Reliability – achieves state-of-the-art results on VOT 2016, VOT 2015 and OTB100. The CSR-DCF runs in real-time on a CPU.
Tasks	Visual Object Tracking
Published	2016-11-25
URL	http://arxiv.org/abs/1611.08461v3
PDF	http://arxiv.org/pdf/1611.08461v3.pdf
PWC	https://paperswithcode.com/paper/discriminative-correlation-filter-with
Repo	https://github.com/alanlukezic/csr-dcf
Framework	none

Robust Named Entity Recognition in Idiosyncratic Domains


Title	Robust Named Entity Recognition in Idiosyncratic Domains
Authors	Sebastian Arnold, Felix A. Gers, Torsten Kilias, Alexander Löser
Abstract	Named entity recognition often fails in idiosyncratic domains. That causes a problem for dependent tasks, such as entity linking and relation extraction. We propose a generic and robust approach for high-recall named entity recognition. Our approach is easy to train and offers strong generalization over diverse domain-specific language, such as news documents (e.g. Reuters) or biomedical text (e.g. Medline). Our approach is based on deep contextual sequence learning and utilizes stacked bidirectional LSTM networks. Our model is trained with only a few hundred labeled sentences and does not rely on further external knowledge. We report F1 scores in the range of 84-94% on standard datasets.
Tasks	Entity Linking, Named Entity Recognition, Relation Extraction
Published	2016-08-24
URL	http://arxiv.org/abs/1608.06757v1
PDF	http://arxiv.org/pdf/1608.06757v1.pdf
PWC	https://paperswithcode.com/paper/robust-named-entity-recognition-in
Repo	https://github.com/sebastianarnold/TeXoo
Framework	none

Generalizing and Hybridizing Count-based and Neural Language Models


Title	Generalizing and Hybridizing Count-based and Neural Language Models
Authors	Graham Neubig, Chris Dyer
Abstract	Language models (LMs) are statistical models that calculate probabilities over sequences of words or other discrete symbols. Currently two major paradigms for language modeling exist: count-based n-gram models, which have advantages of scalability and test-time speed, and neural LMs, which often achieve superior modeling performance. We demonstrate how both varieties of models can be unified in a single modeling framework that defines a set of probability distributions over the vocabulary of words, and then dynamically calculates mixture weights over these distributions. This formulation allows us to create novel hybrid models that combine the desirable features of count-based and neural LMs, and experiments demonstrate the advantages of these approaches.
Tasks	Language Modelling
Published	2016-06-01
URL	http://arxiv.org/abs/1606.00499v2
PDF	http://arxiv.org/pdf/1606.00499v2.pdf
PWC	https://paperswithcode.com/paper/generalizing-and-hybridizing-count-based-and
Repo	https://github.com/neubig/modlm
Framework	none
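
The unifying idea in the abstract, a set of probability distributions over the vocabulary combined through dynamically calculated mixture weights, can be sketched in a few lines. Everything concrete here is an assumption for illustration: the three toy distributions, the use of a softmax to turn raw scores into weights, and the function names `softmax` and `mix`. In the paper's setting the scores would depend on the context; here they are fixed numbers.

```python
import math


def softmax(scores):
    """Convert raw scores into non-negative weights that sum to one."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix(distributions, weight_scores):
    """Interpolate several next-word distributions using softmax weights."""
    weights = softmax(weight_scores)
    vocab_size = len(distributions[0])
    return [
        sum(w * dist[i] for w, dist in zip(weights, distributions))
        for i in range(vocab_size)
    ]


# Toy next-word distributions over a 3-word vocabulary (assumed values).
unigram = [0.5, 0.3, 0.2]
bigram = [0.1, 0.7, 0.2]
neural = [0.2, 0.2, 0.6]

# Hypothetical context-dependent scores, one per component distribution.
p = mix([unigram, bigram, neural], [0.0, 1.0, 0.5])
```

Because each component is a valid distribution and the weights sum to one, the mixture `p` is itself a valid distribution, which is what lets count-based and neural components sit in the same model.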

Towards the Science of Security and Privacy in Machine Learning


Title	Towards the Science of Security and Privacy in Machine Learning
Authors	Nicolas Papernot, Patrick McDaniel, Arunesh Sinha, Michael Wellman
Abstract	Advances in machine learning (ML) in recent years have enabled a dizzying array of applications such as data analytics, autonomous systems, and security diagnostics. ML is now pervasive; new systems and models are being deployed in every domain imaginable, leading to rapid and widespread deployment of software-based inference and decision making. There is growing recognition that ML exposes new vulnerabilities in software systems, yet the technical community’s understanding of the nature and extent of these vulnerabilities remains limited. We systematize recent findings on ML security and privacy, focusing on attacks identified on these systems and defenses crafted to date. We articulate a comprehensive threat model for ML, and categorize attacks and defenses within an adversarial framework. Key insights resulting from works both in the ML and security communities are identified and the effectiveness of approaches is related to structural elements of ML algorithms and the data used to train them. We conclude by formally exploring the opposing relationship between model accuracy and resilience to adversarial manipulation. Through these explorations, we show that there are (possibly unavoidable) tensions between model complexity, accuracy, and resilience that must be calibrated for the environments in which they will be used.
Tasks	Decision Making
Published	2016-11-11
URL	http://arxiv.org/abs/1611.03814v1
PDF	http://arxiv.org/pdf/1611.03814v1.pdf
PWC	https://paperswithcode.com/paper/towards-the-science-of-security-and-privacy
Repo	https://github.com/Abukara/Udacity_Secure-and-Private-AI-Scholarship-Challenge-Nanodegree-Program
Framework	tf

Covariate Regularized Community Detection in Sparse Graphs


Title	Covariate Regularized Community Detection in Sparse Graphs
Authors	Bowei Yan, Purnamrita Sarkar
Abstract	In this paper, we investigate community detection in networks in the presence of node covariates. In many instances, covariates and networks individually only give a partial view of the cluster structure. One needs to jointly infer the full cluster structure by considering both. In statistics, an emerging body of work has been focused on combining information from both the edges in the network and the node covariates to infer community memberships. However, so far the theoretical guarantees have been established in the dense regime, where the network can lead to perfect clustering under a broad parameter regime, and hence the role of covariates is often not clear. In this paper, we examine sparse networks in conjunction with finite dimensional sub-gaussian mixtures as covariates under moderate separation conditions. In this setting each individual source can only cluster a non-vanishing fraction of nodes correctly. We propose a simple optimization framework which provably improves clustering accuracy when the two sources carry partial information about the cluster memberships, and hence perform poorly on their own. Our optimization problem can be solved using scalable convex optimization algorithms. Using a variety of simulated and real data examples, we show that the proposed method outperforms other existing methodology.
Tasks	Community Detection
Published	2016-07-10
URL	http://arxiv.org/abs/1607.02675v4
PDF	http://arxiv.org/pdf/1607.02675v4.pdf
PWC	https://paperswithcode.com/paper/covariate-regularized-community-detection-in
Repo	https://github.com/boweiYan/SDP_SBM_unbalanced_size
Framework	none

Density estimation using Real NVP


Title	Density estimation using Real NVP
Authors	Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio
Abstract	Unsupervised learning of probabilistic models is a central yet challenging problem in machine learning. Specifically, designing models with tractable learning, sampling, inference and evaluation is crucial in solving this task. We extend the space of such models using real-valued non-volume preserving (real NVP) transformations, a set of powerful invertible and learnable transformations, resulting in an unsupervised learning algorithm with exact log-likelihood computation, exact sampling, exact inference of latent variables, and an interpretable latent space. We demonstrate its ability to model natural images on four datasets through sampling, log-likelihood evaluation and latent variable manipulations.
Tasks	Density Estimation, Image Generation
Published	2016-05-27
URL	http://arxiv.org/abs/1605.08803v3
PDF	http://arxiv.org/pdf/1605.08803v3.pdf
PWC	https://paperswithcode.com/paper/density-estimation-using-real-nvp
Repo	https://github.com/ANLGBOY/RealNVP-with-PyTorch
Framework	pytorch
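
The abstract's claims of exact log-likelihood and exact inference rest on transformations that are invertible with a cheap Jacobian determinant. A common way to present this is the affine coupling layer: half of the input passes through unchanged and parameterizes a scale and shift applied to the other half. The sketch below is a toy version under stated assumptions: `scale` and `shift` are fixed hand-written functions standing in for learned networks, and the input length is assumed to be even.

```python
import math


def scale(x1):
    """Toy stand-in for a learned log-scale network."""
    return [math.tanh(v) for v in x1]


def shift(x1):
    """Toy stand-in for a learned translation network."""
    return [0.5 * v for v in x1]


def coupling_forward(x):
    """Map x to y and return the log-determinant of the Jacobian."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s, t = scale(x1), shift(x1)
    y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
    # The Jacobian is triangular, so its log-determinant is the sum of s.
    return x1 + y2, sum(s)


def coupling_inverse(y):
    """Recover x from y exactly, without inverting scale() or shift()."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    s, t = scale(y1), shift(y1)
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s, t)]
    return y1 + x2


x = [0.3, -1.2, 0.7, 2.0]
y, log_det = coupling_forward(x)
x_rec = coupling_inverse(y)
```

Because the first half is copied through, the inverse only needs to re-evaluate `scale` and `shift` on it, so those functions can be arbitrarily complex without breaking invertibility.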