May 7, 2019

Paper Group AWR 93

Business Process Deviance Mining: Review and Evaluation

Title Business Process Deviance Mining: Review and Evaluation
Authors Hoang Nguyen, Marlon Dumas, Marcello La Rosa, Fabrizio Maria Maggi, Suriadi Suriadi
Abstract Business process deviance refers to the phenomenon whereby a subset of the executions of a business process deviate, in a negative or positive way, with respect to its expected or desirable outcomes. Deviant executions of a business process include those that violate compliance rules, or executions that undershoot or exceed performance targets. Deviance mining is concerned with uncovering the reasons for deviant executions by analyzing business process event logs. This article provides a systematic review and comparative evaluation of deviance mining approaches based on a family of data mining techniques known as sequence classification. Using real-life logs from multiple domains, we evaluate a range of feature types and classification methods in terms of their ability to accurately discriminate between normal and deviant executions of a process. We also analyze the interestingness of the rule sets extracted using different methods. We observe that feature sets extracted using pattern mining techniques only slightly outperform simpler feature sets based on counts of individual activity occurrences in a trace.
Tasks
Published 2016-08-29
URL http://arxiv.org/abs/1608.08252v1
PDF http://arxiv.org/pdf/1608.08252v1.pdf
PWC https://paperswithcode.com/paper/business-process-deviance-mining-review-and
Repo https://github.com/Abercus/devianceminingoverview
Framework none
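
The simpler baseline mentioned in the abstract, counts of individual activity occurrences in a trace, can be sketched in a few lines. This is a minimal illustration only: the event log, the activity names, and the helper `activity_count_features` are hypothetical and are not taken from the paper or its repository.

```python
from collections import Counter


def activity_count_features(trace, activities):
    """Map one trace (a list of activity names) to a fixed-length count vector."""
    counts = Counter(trace)
    # Counter returns 0 for activities that never occur in this trace.
    return [counts[a] for a in activities]


# Hypothetical event log: each inner list is one execution of the process.
log = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "check", "reject"],
]

# Fix a column order so every trace maps to the same feature layout.
activities = sorted({a for trace in log for a in trace})
features = [activity_count_features(trace, activities) for trace in log]
```

Each row of `features` could then be paired with a normal/deviant label and passed to any standard classifier.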

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

Title Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
Authors Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, Adam Kalai
Abstract The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors which has been used in many machine learning and natural language processing tasks. We show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. Geometrically, gender bias is first shown to be captured by a direction in the word embedding. Second, gender neutral words are shown to be linearly separable from gender definition words in the word embedding. Using these properties, we provide a methodology for modifying an embedding to remove gender stereotypes, such as the association between the words receptionist and female, while maintaining desired associations such as between the words queen and female. We define metrics to quantify both direct and indirect gender biases in embeddings, and develop algorithms to “debias” the embedding. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduce gender bias in embeddings while preserving its useful properties such as the ability to cluster related concepts and to solve analogy tasks. The resulting embeddings can be used in applications without amplifying gender bias.
Tasks Word Embeddings
Published 2016-07-21
URL http://arxiv.org/abs/1607.06520v1
PDF http://arxiv.org/pdf/1607.06520v1.pdf
PWC https://paperswithcode.com/paper/man-is-to-computer-programmer-as-woman-is-to
Repo https://github.com/if1015-datascience-ufpe/2018-2-ex3-bash
Framework none
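
Since the abstract describes gender bias as a direction in the embedding space, the core geometric step can be sketched as removing a vector's component along that direction and renormalizing. The sketch below makes several assumptions: the three-dimensional vectors are toy values, the direction is taken from a single `he`/`she` difference, and the helper names are hypothetical. The paper's full procedure may differ in its details.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def neutralize(w, direction):
    """Remove the component of w along `direction`, then rescale to unit length."""
    g = normalize(direction)
    proj = dot(w, g)
    residual = [a - proj * b for a, b in zip(w, g)]
    return normalize(residual)


# Toy embedding vectors (assumed values, for illustration only).
he = [1.0, 0.2, 0.0]
she = [-1.0, 0.2, 0.0]
gender_direction = [a - b for a, b in zip(he, she)]

receptionist = [-0.6, 0.5, 0.3]
debiased = neutralize(receptionist, gender_direction)
```

After this step `debiased` has no component along the assumed gender direction, while its remaining coordinates keep their relative proportions.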

Measuring Neural Net Robustness with Constraints

Title Measuring Neural Net Robustness with Constraints
Authors Osbert Bastani, Yani Ioannou, Leonidas Lampropoulos, Dimitrios Vytiniotis, Aditya Nori, Antonio Criminisi
Abstract Despite having high accuracy, neural nets have been shown to be susceptible to adversarial examples, where a small perturbation to an input can cause it to become mislabeled. We propose metrics for measuring the robustness of a neural net and devise a novel algorithm for approximating these metrics based on an encoding of robustness as a linear program. We show how our metrics can be used to evaluate the robustness of deep neural nets with experiments on the MNIST and CIFAR-10 datasets. Our algorithm generates more informative estimates of robustness metrics compared to estimates based on existing algorithms. Furthermore, we show how existing approaches to improving robustness “overfit” to adversarial examples generated using a specific algorithm. Finally, we show that our techniques can be used to additionally improve neural net robustness, both according to the metrics that we propose and according to previously proposed metrics.
Tasks
Published 2016-05-24
URL http://arxiv.org/abs/1605.07262v2
PDF http://arxiv.org/pdf/1605.07262v2.pdf
PWC https://paperswithcode.com/paper/measuring-neural-net-robustness-with
Repo https://github.com/Microsoft/NeuralNetworkAnalysis
Framework none

CliqueCNN: Deep Unsupervised Exemplar Learning

Title CliqueCNN: Deep Unsupervised Exemplar Learning
Authors Miguel A. Bautista, Artsiom Sanakoyeu, Ekaterina Sutter, Björn Ommer
Abstract Exemplar learning is a powerful paradigm for discovering visual similarities in an unsupervised manner. In this context, however, the recent breakthrough in deep learning could not yet unfold its full potential. With only a single positive sample, a great imbalance between one positive and many negatives, and unreliable relationships between most samples, training of Convolutional Neural networks is impaired. Given weak estimates of local distance we propose a single optimization problem to extract batches of samples with mutually consistent relations. Conflicting relations are distributed over different batches and similar samples are grouped into compact cliques. Learning exemplar similarities is framed as a sequence of clique categorization tasks. The CNN then consolidates transitivity relations within and between cliques and learns a single representation for all samples without the need for labels. The proposed unsupervised approach has shown competitive performance on detailed posture analysis and object classification.
Tasks Object Classification
Published 2016-08-31
URL http://arxiv.org/abs/1608.08792v1
PDF http://arxiv.org/pdf/1608.08792v1.pdf
PWC https://paperswithcode.com/paper/cliquecnn-deep-unsupervised-exemplar-learning
Repo https://github.com/asanakoy/cliquecnn
Framework none

An Architecture for Deep, Hierarchical Generative Models

Title An Architecture for Deep, Hierarchical Generative Models
Authors Philip Bachman
Abstract We present an architecture which lets us train deep, directed generative models with many layers of latent variables. We include deterministic paths between all latent variables and the generated output, and provide a richer set of connections between computations for inference and generation, which enables more effective communication of information throughout the model during training. To improve performance on natural images, we incorporate a lightweight autoregressive model in the reconstruction distribution. These techniques permit end-to-end training of models with 10+ layers of latent variables. Experiments show that our approach achieves state-of-the-art performance on standard image modelling benchmarks, can expose latent class structure in the absence of label information, and can provide convincing imputations of occluded regions in natural images.
Tasks
Published 2016-12-08
URL http://arxiv.org/abs/1612.04739v1
PDF http://arxiv.org/pdf/1612.04739v1.pdf
PWC https://paperswithcode.com/paper/an-architecture-for-deep-hierarchical
Repo https://github.com/Philip-Bachman/MatNets-NIPS
Framework none

Deep Transfer Learning for Person Re-identification

Title Deep Transfer Learning for Person Re-identification
Authors Mengyue Geng, Yaowei Wang, Tao Xiang, Yonghong Tian
Abstract Person re-identification (Re-ID) poses a unique challenge to deep learning: how to learn a deep model with millions of parameters on a small training set of few or no labels. In this paper, a number of deep transfer learning models are proposed to address the data sparsity problem. First, a deep network architecture is designed which differs from existing deep Re-ID models in that (a) it is more suitable for transferring representations learned from large image classification datasets, and (b) classification loss and verification loss are combined, each of which adopts a different dropout strategy. Second, a two-stepped fine-tuning strategy is developed to transfer knowledge from auxiliary datasets. Third, given an unlabelled Re-ID dataset, a novel unsupervised deep transfer learning model is developed based on co-training. The proposed models outperform the state-of-the-art deep Re-ID models by large margins: we achieve Rank-1 accuracy of 85.4%, 83.7% and 56.3% on CUHK03, Market1501, and VIPeR respectively, whilst on VIPeR, our unsupervised model (45.1%) beats most supervised models.
Tasks Image Classification, Person Re-Identification, Transfer Learning
Published 2016-11-16
URL http://arxiv.org/abs/1611.05244v2
PDF http://arxiv.org/pdf/1611.05244v2.pdf
PWC https://paperswithcode.com/paper/deep-transfer-learning-for-person-re
Repo https://github.com/KaiyangZhou/deep-person-reid
Framework pytorch

Bayesian quantile additive regression trees

Title Bayesian quantile additive regression trees
Authors Bereket P. Kindo, Hao Wang, Timothy Hanson, Edsel A. Peña
Abstract Ensembles of regression trees have become popular statistical tools for the estimation of conditional mean given a set of predictors. However, quantile regression trees and their ensembles have not yet garnered much attention despite the increasing popularity of the linear quantile regression model. This work proposes a Bayesian quantile additive regression trees model that shows very good predictive performance illustrated using simulation studies and real data applications. Further extension to tackle binary classification problems is also considered.
Tasks
Published 2016-07-10
URL http://arxiv.org/abs/1607.02676v1
PDF http://arxiv.org/pdf/1607.02676v1.pdf
PWC https://paperswithcode.com/paper/bayesian-quantile-additive-regression-trees
Repo https://github.com/bpkindo/bayesqart
Framework none
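
As background for the contrast between conditional-mean and quantile estimation, the standard quantile (pinball) loss is sketched below. It is the usual objective of quantile regression in general and is not claimed to be the model proposed in the paper. The data values and the function name `pinball_loss` are illustrative assumptions.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average quantile loss at level tau, with 0 < tau < 1."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        u = y - q
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * u if u >= 0 else (tau - 1.0) * u
    return total / len(y_true)


# Illustrative data with one large value pulling the mean upward.
y = [1.0, 2.0, 3.0, 4.0, 10.0]
loss_at_median = pinball_loss(y, [3.0] * len(y), tau=0.5)  # median of y is 3
loss_at_mean = pinball_loss(y, [4.0] * len(y), tau=0.5)    # mean of y is 4
```

At `tau=0.5` the constant prediction equal to the median gives a lower loss than the one equal to the mean, which is why this loss targets quantiles rather than the conditional mean.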

Medical Concept Representation Learning from Electronic Health Records and its Application on Heart Failure Prediction

Title Medical Concept Representation Learning from Electronic Health Records and its Application on Heart Failure Prediction
Authors Edward Choi, Andy Schuetz, Walter F. Stewart, Jimeng Sun
Abstract Objective: To transform heterogeneous clinical data from electronic health records into clinically meaningful constructed features using data-driven methods that rely, in part, on temporal relations among data. Materials and Methods: The clinically meaningful representations of medical concepts and patients are the key for health analytic applications. Most existing approaches directly construct features mapped to raw data (e.g., ICD or CPT codes), or utilize some ontology mapping such as SNOMED codes. However, none of the existing approaches leverage EHR data directly for learning such concept representation. We propose a new way to represent heterogeneous medical concepts (e.g., diagnoses, medications and procedures) based on co-occurrence patterns in longitudinal electronic health records. The intuition behind the method is to map medical concepts that are co-occurring closely in time to similar concept vectors so that their distance will be small. We also derive a simple method to construct patient vectors from the related medical concept vectors. Results: For qualitative evaluation, we study similar medical concepts across diagnosis, medication and procedure. In quantitative evaluation, our proposed representation significantly improves the predictive modeling performance for onset of heart failure (HF), where classification methods (e.g. logistic regression, neural network, support vector machine and K-nearest neighbors) achieve up to 23% improvement in area under the ROC curve (AUC) using this proposed representation. Conclusion: We proposed an effective method for patient and medical concept representation learning. The resulting representation can map relevant concepts together and also improves predictive modeling performance.
Tasks Representation Learning
Published 2016-02-11
URL http://arxiv.org/abs/1602.03686v2
PDF http://arxiv.org/pdf/1602.03686v2.pdf
PWC https://paperswithcode.com/paper/medical-concept-representation-learning-from
Repo https://github.com/mp2893/retain
Framework none

Optimizing Neural Network Hyperparameters with Gaussian Processes for Dialog Act Classification

Title Optimizing Neural Network Hyperparameters with Gaussian Processes for Dialog Act Classification
Authors Franck Dernoncourt, Ji Young Lee
Abstract Systems based on artificial neural networks (ANNs) have achieved state-of-the-art results in many natural language processing tasks. Although ANNs do not require manually engineered features, ANNs have many hyperparameters to be optimized. The choice of hyperparameters significantly impacts models’ performances. However, the ANN hyperparameters are typically chosen by manual, grid, or random search, which either requires expert experience or is computationally expensive. Recent approaches based on Bayesian optimization using Gaussian processes (GPs) are a more systematic way to automatically pinpoint optimal or near-optimal machine learning hyperparameters. Using a previously published ANN model yielding state-of-the-art results for dialog act classification, we demonstrate that optimizing hyperparameters using GP further improves the results, and reduces the computational time by a factor of 4 compared to a random search. Therefore it is a useful technique for tuning ANN models to yield the best performances for natural language processing tasks.
Tasks Dialog Act Classification, Gaussian Processes
Published 2016-09-27
URL http://arxiv.org/abs/1609.08703v1
PDF http://arxiv.org/pdf/1609.08703v1.pdf
PWC https://paperswithcode.com/paper/optimizing-neural-network-hyperparameters
Repo https://github.com/Franck-Dernoncourt/slt2016
Framework none

Discriminative Correlation Filter with Channel and Spatial Reliability

Title Discriminative Correlation Filter with Channel and Spatial Reliability
Authors Alan Lukežič, Tomáš Vojíř, Luka Čehovin, Jiří Matas, Matej Kristan
Abstract Short-term tracking is an open and challenging problem for which discriminative correlation filters (DCF) have shown excellent performance. We introduce the channel and spatial reliability concepts to DCF tracking and provide a novel learning algorithm for its efficient and seamless integration in the filter update and the tracking process. The spatial reliability map adjusts the filter support to the part of the object suitable for tracking. This both allows the search region to be enlarged and improves tracking of non-rectangular objects. Reliability scores reflect channel-wise quality of the learned filters and are used as feature weighting coefficients in localization. Experimentally, with only two simple standard features, HoGs and Colornames, the novel CSR-DCF method – DCF with Channel and Spatial Reliability – achieves state-of-the-art results on VOT 2016, VOT 2015 and OTB100. The CSR-DCF runs in real-time on a CPU.
Tasks Visual Object Tracking
Published 2016-11-25
URL http://arxiv.org/abs/1611.08461v3
PDF http://arxiv.org/pdf/1611.08461v3.pdf
PWC https://paperswithcode.com/paper/discriminative-correlation-filter-with
Repo https://github.com/alanlukezic/csr-dcf
Framework none

Robust Named Entity Recognition in Idiosyncratic Domains

Title Robust Named Entity Recognition in Idiosyncratic Domains
Authors Sebastian Arnold, Felix A. Gers, Torsten Kilias, Alexander Löser
Abstract Named entity recognition often fails in idiosyncratic domains. That causes a problem for dependent tasks, such as entity linking and relation extraction. We propose a generic and robust approach for high-recall named entity recognition. Our approach is easy to train and offers strong generalization over diverse domain-specific language, such as news documents (e.g. Reuters) or biomedical text (e.g. Medline). Our approach is based on deep contextual sequence learning and utilizes stacked bidirectional LSTM networks. Our model is trained with only a few hundred labeled sentences and does not rely on further external knowledge. We report F1 scores in the range of 84-94% on standard datasets.
Tasks Entity Linking, Named Entity Recognition, Relation Extraction
Published 2016-08-24
URL http://arxiv.org/abs/1608.06757v1
PDF http://arxiv.org/pdf/1608.06757v1.pdf
PWC https://paperswithcode.com/paper/robust-named-entity-recognition-in
Repo https://github.com/sebastianarnold/TeXoo
Framework none

Generalizing and Hybridizing Count-based and Neural Language Models

Title Generalizing and Hybridizing Count-based and Neural Language Models
Authors Graham Neubig, Chris Dyer
Abstract Language models (LMs) are statistical models that calculate probabilities over sequences of words or other discrete symbols. Currently two major paradigms for language modeling exist: count-based n-gram models, which have advantages of scalability and test-time speed, and neural LMs, which often achieve superior modeling performance. We demonstrate how both varieties of models can be unified in a single modeling framework that defines a set of probability distributions over the vocabulary of words, and then dynamically calculates mixture weights over these distributions. This formulation allows us to create novel hybrid models that combine the desirable features of count-based and neural LMs, and experiments demonstrate the advantages of these approaches.
Tasks Language Modelling
Published 2016-06-01
URL http://arxiv.org/abs/1606.00499v2
PDF http://arxiv.org/pdf/1606.00499v2.pdf
PWC https://paperswithcode.com/paper/generalizing-and-hybridizing-count-based-and
Repo https://github.com/neubig/modlm
Framework none
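
The unifying idea in the abstract, a set of distributions over the vocabulary combined with dynamically calculated mixture weights, can be written as p(w | c) = Σ_k λ_k(c) · p_k(w | c). The sketch below illustrates only that formula. The two component distributions and the raw scores are hard-coded hypothetical values, and how the scores would be computed from the context is left out.

```python
import math


def softmax(scores):
    """Turn raw scores into non-negative mixture weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix(distributions, weights):
    """Weighted sum of distributions defined over the same vocabulary."""
    vocab = distributions[0].keys()
    return {
        w: sum(lam * dist[w] for lam, dist in zip(weights, distributions))
        for w in vocab
    }


# Hypothetical next-word distributions for one context.
unigram = {"cat": 0.5, "dog": 0.3, "fish": 0.2}
bigram = {"cat": 0.1, "dog": 0.8, "fish": 0.1}

# Hypothetical context-dependent scores; equal scores give equal weights.
weights = softmax([0.0, 0.0])
mixed = mix([unigram, bigram], weights)
```

Because the weights sum to 1 and each component is a valid distribution, `mixed` is also a valid distribution over the vocabulary.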

Towards the Science of Security and Privacy in Machine Learning

Title Towards the Science of Security and Privacy in Machine Learning
Authors Nicolas Papernot, Patrick McDaniel, Arunesh Sinha, Michael Wellman
Abstract Advances in machine learning (ML) in recent years have enabled a dizzying array of applications such as data analytics, autonomous systems, and security diagnostics. ML is now pervasive: new systems and models are being deployed in every domain imaginable, leading to rapid and widespread deployment of software based inference and decision making. There is growing recognition that ML exposes new vulnerabilities in software systems, yet the technical community’s understanding of the nature and extent of these vulnerabilities remains limited. We systematize recent findings on ML security and privacy, focusing on attacks identified on these systems and defenses crafted to date. We articulate a comprehensive threat model for ML, and categorize attacks and defenses within an adversarial framework. Key insights resulting from works both in the ML and security communities are identified and the effectiveness of approaches is related to structural elements of ML algorithms and the data used to train them. We conclude by formally exploring the opposing relationship between model accuracy and resilience to adversarial manipulation. Through these explorations, we show that there are (possibly unavoidable) tensions between model complexity, accuracy, and resilience that must be calibrated for the environments in which they will be used.
Tasks Decision Making
Published 2016-11-11
URL http://arxiv.org/abs/1611.03814v1
PDF http://arxiv.org/pdf/1611.03814v1.pdf
PWC https://paperswithcode.com/paper/towards-the-science-of-security-and-privacy
Repo https://github.com/Abukara/Udacity_Secure-and-Private-AI-Scholarship-Challenge-Nanodegree-Program
Framework tf

Covariate Regularized Community Detection in Sparse Graphs

Title Covariate Regularized Community Detection in Sparse Graphs
Authors Bowei Yan, Purnamrita Sarkar
Abstract In this paper, we investigate community detection in networks in the presence of node covariates. In many instances, covariates and networks individually only give a partial view of the cluster structure. One needs to jointly infer the full cluster structure by considering both. In statistics, an emerging body of work has been focused on combining information from both the edges in the network and the node covariates to infer community memberships. However, so far the theoretical guarantees have been established in the dense regime, where the network can lead to perfect clustering under a broad parameter regime, and hence the role of covariates is often not clear. In this paper, we examine sparse networks in conjunction with finite dimensional sub-gaussian mixtures as covariates under moderate separation conditions. In this setting each individual source can only cluster a non-vanishing fraction of nodes correctly. We propose a simple optimization framework which provably improves clustering accuracy when the two sources carry partial information about the cluster memberships, and hence perform poorly on their own. Our optimization problem can be solved using scalable convex optimization algorithms. Using a variety of simulated and real data examples, we show that the proposed method outperforms other existing methodology.
Tasks Community Detection
Published 2016-07-10
URL http://arxiv.org/abs/1607.02675v4
PDF http://arxiv.org/pdf/1607.02675v4.pdf
PWC https://paperswithcode.com/paper/covariate-regularized-community-detection-in
Repo https://github.com/boweiYan/SDP_SBM_unbalanced_size
Framework none

Density estimation using Real NVP

Title Density estimation using Real NVP
Authors Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio
Abstract Unsupervised learning of probabilistic models is a central yet challenging problem in machine learning. Specifically, designing models with tractable learning, sampling, inference and evaluation is crucial in solving this task. We extend the space of such models using real-valued non-volume preserving (real NVP) transformations, a set of powerful invertible and learnable transformations, resulting in an unsupervised learning algorithm with exact log-likelihood computation, exact sampling, exact inference of latent variables, and an interpretable latent space. We demonstrate its ability to model natural images on four datasets through sampling, log-likelihood evaluation and latent variable manipulations.
Tasks Density Estimation, Image Generation
Published 2016-05-27
URL http://arxiv.org/abs/1605.08803v3
PDF http://arxiv.org/pdf/1605.08803v3.pdf
PWC https://paperswithcode.com/paper/density-estimation-using-real-nvp
Repo https://github.com/ANLGBOY/RealNVP-with-PyTorch
Framework pytorch
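
The invertible transformations with exact log-likelihood described in the abstract are built from affine coupling layers: one half of the input passes through unchanged and parameterizes a scale and shift of the other half, so the log-determinant of the Jacobian is the sum of the log-scales. The sketch below is a minimal illustration in plain Python. The functions `s` and `t` are toy stand-ins for the neural networks used in practice, and the split into two equal halves is an assumption.

```python
import math


def s(x1):
    """Hypothetical log-scale function (a neural network in practice)."""
    return [0.5 * a for a in x1]


def t(x1):
    """Hypothetical translation function (a neural network in practice)."""
    return [a + 1.0 for a in x1]


def forward(x):
    """y1 = x1, y2 = x2 * exp(s(x1)) + t(x1); returns (y, log|det J|)."""
    d = len(x) // 2
    x1, x2 = x[:d], x[d:]
    log_scale, shift = s(x1), t(x1)
    y2 = [b * math.exp(ls) + sh for b, ls, sh in zip(x2, log_scale, shift)]
    return x1 + y2, sum(log_scale)


def inverse(y):
    """Exact inverse: x2 = (y2 - t(y1)) * exp(-s(y1))."""
    d = len(y) // 2
    y1, y2 = y[:d], y[d:]
    log_scale, shift = s(y1), t(y1)
    x2 = [(b - sh) * math.exp(-ls) for b, ls, sh in zip(y2, log_scale, shift)]
    return y1 + x2


x = [0.2, -1.0, 0.7, 1.5]  # even length so the two halves match
y, log_det = forward(x)
x_rec = inverse(y)
```

Inverting never requires inverting `s` or `t` themselves, which is why they can be arbitrarily complex functions of the untouched half.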