July 28, 2019

Paper Group ANR 431

Weighted Data Normalization Based on Eigenvalues for Artificial Neural Network Classification. Soft Label Memorization-Generalization for Natural Language Inference. Feature-Guided Black-Box Safety Testing of Deep Neural Networks. Modeling the Resource Requirements of Convolutional Neural Networks on Mobile Devices. Multi-task Neural Networks for P …

Weighted Data Normalization Based on Eigenvalues for Artificial Neural Network Classification

Title Weighted Data Normalization Based on Eigenvalues for Artificial Neural Network Classification
Authors Qingjiu Zhang, Shiliang Sun
Abstract Artificial neural networks (ANNs) are a very useful tool for solving learning problems. ANN performance can be improved mainly in two ways: optimizing the network architecture and normalizing the raw input data. In this paper, a novel method that improves ANN performance by preprocessing the raw data is proposed. It exploits the fact that different features should play different roles. The raw data set is first preprocessed by principal component analysis (PCA), and its principal components are then weighted by their corresponding eigenvalues. The method is analyzed from several aspects, covering its theory and the settings in which it applies. Three classification problems, run with an active learning algorithm, are used to verify the proposed method. The empirical results show that the proposed method can significantly improve the performance of ANNs.
Tasks Active Learning
Published 2017-12-24
URL http://arxiv.org/abs/1712.08885v1
PDF http://arxiv.org/pdf/1712.08885v1.pdf
PWC https://paperswithcode.com/paper/weighted-data-normalization-based-on
Repo
Framework
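
The preprocessing step described in the abstract above (PCA followed by eigenvalue weighting) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact weighting formula, so the normalization of the eigenvalues to sum to one and the function name `eigenvalue_weighted_pca` are assumptions.

```python
import numpy as np


def eigenvalue_weighted_pca(X):
    """Project X onto its principal components and scale each by its eigenvalue.

    Assumption: weights are the eigenvalues normalized to sum to 1; the
    abstract only says the components are "weighted by their eigenvalues".
    """
    Xc = X - X.mean(axis=0)                 # center every feature
    cov = np.cov(Xc, rowvar=False)          # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                   # principal component scores
    weights = eigvals / eigvals.sum()       # assumed normalization
    return scores * weights, weights


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) * np.array([5.0, 1.0, 0.1])
Z, w = eigenvalue_weighted_pca(X)  # Z would be fed to the network instead of X
```

The effect is that high-variance directions dominate the network input while low-variance directions are shrunk rather than discarded.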

Soft Label Memorization-Generalization for Natural Language Inference

Title Soft Label Memorization-Generalization for Natural Language Inference
Authors John P. Lalor, Hao Wu, Hong Yu
Abstract Often when multiple labels are obtained for a training example it is assumed that there is an element of noise that must be accounted for. It has been shown that this disagreement can be considered signal instead of noise. In this work we investigate using soft labels for training data to improve generalization in machine learning models. However, using soft labels for training Deep Neural Networks (DNNs) is not practical due to the costs involved in obtaining multiple labels for large data sets. We propose soft label memorization-generalization (SLMG), a fine-tuning approach to using soft labels for training DNNs. We assume that differences in labels provided by human annotators represent ambiguity about the true label instead of noise. Experiments with SLMG demonstrate improved generalization performance on the Natural Language Inference (NLI) task. Our experiments show that by injecting a small percentage of soft label training data (0.03% of training set size) we can improve generalization performance over several baselines.
Tasks Natural Language Inference
Published 2017-02-27
URL http://arxiv.org/abs/1702.08563v3
PDF http://arxiv.org/pdf/1702.08563v3.pdf
PWC https://paperswithcode.com/paper/soft-label-memorization-generalization-for
Repo
Framework
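
To make the notion of a soft label concrete, the sketch below turns several annotator labels into a probability distribution and scores a prediction against it. It does not reproduce the SLMG fine-tuning procedure; the label set, the helper names, and the choice of cross-entropy as the loss are assumptions made for illustration.

```python
import math
from collections import Counter

# Assumed three-way NLI label set, used only for this illustration.
LABELS = ("entailment", "neutral", "contradiction")


def soft_label(annotations):
    """Convert a list of annotator labels into a distribution over LABELS."""
    counts = Counter(annotations)
    total = len(annotations)
    return [counts[label] / total for label in LABELS]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a (soft) target and a predicted distribution."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


# Five annotators disagree: the disagreement is kept as signal.
target = soft_label(["entailment"] * 3 + ["neutral"] * 2)
loss_matching = cross_entropy(target, [0.6, 0.4, 0.0])
loss_overconfident = cross_entropy(target, [0.98, 0.01, 0.01])
```

With a soft target, an overconfident prediction is penalized more than one that mirrors the annotators' split.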

Feature-Guided Black-Box Safety Testing of Deep Neural Networks

Title Feature-Guided Black-Box Safety Testing of Deep Neural Networks
Authors Matthew Wicker, Xiaowei Huang, Marta Kwiatkowska
Abstract Despite the improved accuracy of deep neural networks, the discovery of adversarial examples has raised serious safety concerns. Most existing approaches for crafting adversarial examples necessitate some knowledge (architecture, parameters, etc.) of the network at hand. In this paper, we focus on image classifiers and propose a feature-guided black-box approach to test the safety of deep neural networks that requires no such knowledge. Our algorithm employs object detection techniques such as SIFT (Scale Invariant Feature Transform) to extract features from an image. These features are converted into a mutable saliency distribution, where high probability is assigned to pixels that affect the composition of the image with respect to the human visual system. We formulate the crafting of adversarial examples as a two-player turn-based stochastic game, where the first player’s objective is to minimise the distance to an adversarial example by manipulating the features, and the second player can be cooperative, adversarial, or random. We show that, theoretically, the two-player game can converge to the optimal strategy, and that the optimal strategy represents a globally minimal adversarial image. For Lipschitz networks, we also identify conditions that provide safety guarantees that no adversarial examples exist. Using Monte Carlo tree search we gradually explore the game state space to search for adversarial examples. Our experiments show that, despite the black-box setting, manipulations guided by a perception-based saliency distribution are competitive with state-of-the-art methods that rely on white-box saliency matrices or sophisticated optimization procedures. Finally, we show how our method can be used to evaluate robustness of neural networks in safety-critical applications such as traffic sign recognition in self-driving cars.
Tasks Object Detection, Self-Driving Cars, Traffic Sign Recognition
Published 2017-10-21
URL http://arxiv.org/abs/1710.07859v2
PDF http://arxiv.org/pdf/1710.07859v2.pdf
PWC https://paperswithcode.com/paper/feature-guided-black-box-safety-testing-of
Repo
Framework

Modeling the Resource Requirements of Convolutional Neural Networks on Mobile Devices

Title Modeling the Resource Requirements of Convolutional Neural Networks on Mobile Devices
Authors Zongqing Lu, Swati Rallapalli, Kevin Chan, Thomas La Porta
Abstract Convolutional Neural Networks (CNNs) have revolutionized the research in computer vision, due to their ability to capture complex patterns, resulting in high inference accuracies. However, the increasingly complex nature of these neural networks means that they are particularly suited for server computers with powerful GPUs. We envision that deep learning applications will be eventually and widely deployed on mobile devices, e.g., smartphones, self-driving cars, and drones. Therefore, in this paper, we aim to understand the resource requirements (time, memory) of CNNs on mobile devices. First, by deploying several popular CNNs on mobile CPUs and GPUs, we measure and analyze the performance and resource usage for every layer of the CNNs. Our findings point out the potential ways of optimizing the performance on mobile devices. Second, we model the resource requirements of the different CNN computations. Finally, based on the measurement, profiling, and modeling, we build and evaluate our modeling tool, Augur, which takes a CNN configuration (descriptor) as the input and estimates the compute time and resource usage of the CNN, to give insights about whether and how efficiently a CNN can be run on a given mobile platform. In doing so Augur tackles several challenges: (i) how to overcome profiling and measurement overhead; (ii) how to capture the variance in different mobile platforms with different processors, memory, and cache sizes; and (iii) how to account for the variance in the number, type and size of layers of the different CNN configurations.
Tasks Self-Driving Cars
Published 2017-09-27
URL http://arxiv.org/abs/1709.09503v1
PDF http://arxiv.org/pdf/1709.09503v1.pdf
PWC https://paperswithcode.com/paper/modeling-the-resource-requirements-of
Repo
Framework
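
As a rough illustration of what estimating resource usage from a CNN descriptor can look like, the sketch below counts multiply-accumulate operations and memory for a single convolutional layer using the standard output-size formula. It is not Augur's model, which is built from measurements on real devices; the `ConvLayer` descriptor, the `conv_cost` function, and the 4-byte value size are hypothetical choices for this example.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Hypothetical descriptor of one convolutional layer."""
    in_channels: int
    out_channels: int
    kernel: int        # side length of the square kernel
    stride: int = 1
    padding: int = 0


def conv_cost(layer, height, width, bytes_per_value=4):
    """Estimate MACs and memory for one layer on a height x width input."""
    out_h = (height + 2 * layer.padding - layer.kernel) // layer.stride + 1
    out_w = (width + 2 * layer.padding - layer.kernel) // layer.stride + 1
    weights_per_filter = layer.kernel * layer.kernel * layer.in_channels
    macs = out_h * out_w * layer.out_channels * weights_per_filter
    params = layer.out_channels * (weights_per_filter + 1)  # +1 for the bias
    return {
        "output_shape": (layer.out_channels, out_h, out_w),
        "macs": macs,
        "param_bytes": params * bytes_per_value,
        "activation_bytes": out_h * out_w * layer.out_channels * bytes_per_value,
    }


# Example: 96 filters of size 11x11, stride 4, on a 227x227 RGB image.
cost = conv_cost(ConvLayer(3, 96, 11, stride=4), 227, 227)
```

Operation counts like these are only a starting point: actual time on a device also depends on the processor, memory, and cache sizes the abstract mentions.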

Multi-task Neural Networks for Personalized Pain Recognition from Physiological Signals

Title Multi-task Neural Networks for Personalized Pain Recognition from Physiological Signals
Authors Daniel Lopez-Martinez, Rosalind Picard
Abstract Pain is a complex and subjective experience that poses a number of measurement challenges. While self-report by the patient is viewed as the gold standard of pain assessment, this approach fails when patients cannot verbally communicate pain intensity or lack normal mental abilities. Here, we present a pain intensity measurement method based on physiological signals. Specifically, we implement a multi-task learning approach based on neural networks that accounts for individual differences in pain responses while still leveraging data from across the population. We test our method on a dataset containing multi-modal physiological responses to nociceptive pain.
Tasks Multi-Task Learning
Published 2017-08-17
URL http://arxiv.org/abs/1708.08755v2
PDF http://arxiv.org/pdf/1708.08755v2.pdf
PWC https://paperswithcode.com/paper/multi-task-neural-networks-for-personalized
Repo
Framework

Profit Driven Decision Trees for Churn Prediction

Title Profit Driven Decision Trees for Churn Prediction
Authors Sebastiaan Höppner, Eugen Stripling, Bart Baesens, Seppe vanden Broucke, Tim Verdonck
Abstract Customer retention campaigns increasingly rely on predictive models to detect potential churners in a vast customer base. From the perspective of machine learning, the task of predicting customer churn can be presented as a binary classification problem. Using data on historic behavior, classification algorithms are built with the purpose of accurately predicting the probability of a customer defecting. The predictive churn models are then commonly selected based on accuracy related performance measures such as the area under the ROC curve (AUC). However, these models are often not well aligned with the core business requirement of profit maximization, in the sense that, the models fail to take into account not only misclassification costs, but also the benefits originating from a correct classification. Therefore, the aim is to construct churn prediction models that are profitable and preferably interpretable too. The recently developed expected maximum profit measure for customer churn (EMPC) has been proposed in order to select the most profitable churn model. We present a new classifier that integrates the EMPC metric directly into the model construction. Our technique, called ProfTree, uses an evolutionary algorithm for learning profit driven decision trees. In a benchmark study with real-life data sets from various telecommunication service providers, we show that ProfTree achieves significant profit improvements compared to classic accuracy driven tree-based methods.
Tasks
Published 2017-12-21
URL http://arxiv.org/abs/1712.08101v1
PDF http://arxiv.org/pdf/1712.08101v1.pdf
PWC https://paperswithcode.com/paper/profit-driven-decision-trees-for-churn
Repo
Framework

Deep Descriptor Transforming for Image Co-Localization

Title Deep Descriptor Transforming for Image Co-Localization
Authors Xiu-Shen Wei, Chen-Lin Zhang, Yao Li, Chen-Wei Xie, Jianxin Wu, Chunhua Shen, Zhi-Hua Zhou
Abstract Reusable model design becomes desirable with the rapid expansion of machine learning applications. In this paper, we focus on the reusability of pre-trained deep convolutional models. Specifically, different from treating pre-trained models as feature extractors, we reveal more treasures beneath convolutional layers, i.e., the convolutional activations could act as a detector for the common object in the image co-localization problem. We propose a simple but effective method, named Deep Descriptor Transforming (DDT), for evaluating the correlations of descriptors and then obtaining the category-consistent regions, which can accurately locate the common object in a set of images. Empirical studies validate the effectiveness of the proposed DDT method. On benchmark image co-localization datasets, DDT consistently outperforms existing state-of-the-art methods by a large margin. Moreover, DDT also demonstrates good generalization ability for unseen categories and robustness for dealing with noisy data.
Tasks
Published 2017-05-08
URL http://arxiv.org/abs/1705.02758v1
PDF http://arxiv.org/pdf/1705.02758v1.pdf
PWC https://paperswithcode.com/paper/deep-descriptor-transforming-for-image-co
Repo
Framework

Image-based Localization using Hourglass Networks

Title Image-based Localization using Hourglass Networks
Authors Iaroslav Melekhov, Juha Ylioinas, Juho Kannala, Esa Rahtu
Abstract In this paper, we propose an encoder-decoder convolutional neural network (CNN) architecture for estimating camera pose (orientation and location) from a single RGB image. The architecture has an hourglass shape consisting of a chain of convolution and up-convolution layers followed by a regression part. The up-convolution layers are introduced to preserve the fine-grained information of the input image. Following common practice, we train our model in an end-to-end manner utilizing transfer learning from large-scale classification data. The experiments demonstrate the performance of the approach on data exhibiting different lighting conditions, reflections, and motion blur. The results indicate a clear improvement over the previous state of the art, even when compared to methods that utilize a sequence of test frames instead of a single frame.
Tasks Image-Based Localization, Transfer Learning
Published 2017-03-23
URL http://arxiv.org/abs/1703.07971v3
PDF http://arxiv.org/pdf/1703.07971v3.pdf
PWC https://paperswithcode.com/paper/image-based-localization-using-hourglass
Repo
Framework

Positive semi-definite embedding for dimensionality reduction and out-of-sample extensions

Title Positive semi-definite embedding for dimensionality reduction and out-of-sample extensions
Authors Michaël Fanuel, Antoine Aspeel, Jean-Charles Delvenne, Johan A. K. Suykens
Abstract In machine learning or statistics, it is often desirable to reduce the dimensionality of high dimensional data. We propose to obtain the low dimensional embedding coordinates as the eigenvectors of a positive semi-definite kernel matrix. This kernel matrix is the solution of a semi-definite program promoting a low rank solution and defined with the help of a diffusion kernel. Besides, we also discuss an infinite dimensional analogue of the same semi-definite program. From a practical perspective, a main feature of our approach is the existence of a non-linear out-of-sample extension formula of the embedding coordinates that we call a projected Nyström approximation. This extension formula yields an extension of the kernel matrix to a data-dependent Mercer kernel function. Although the semi-definite program may be solved directly, we propose another strategy based on a rank constrained formulation solved thanks to a projected power method algorithm followed by a singular value decomposition. This strategy allows for a reduced computational time.
Tasks Dimensionality Reduction
Published 2017-11-20
URL http://arxiv.org/abs/1711.07271v2
PDF http://arxiv.org/pdf/1711.07271v2.pdf
PWC https://paperswithcode.com/paper/positive-semi-definite-embedding-for
Repo
Framework

Optimal Experiment Design for Causal Discovery from Fixed Number of Experiments

Title Optimal Experiment Design for Causal Discovery from Fixed Number of Experiments
Authors AmirEmad Ghassami, Saber Salehkaleybar, Negar Kiyavash
Abstract We study the problem of causal structure learning over a set of random variables when the experimenter is allowed to perform at most $M$ experiments in a non-adaptive manner. We consider the optimal learning strategy in terms of minimizing the portions of the structure that remain unknown given the limited number of experiments, in both the Bayesian and minimax settings. We characterize the theoretical optimal solution and propose an algorithm that designs the experiments efficiently in terms of time complexity. We show that for bounded degree graphs, in the minimax case and in the Bayesian case with uniform priors, our proposed algorithm is a $\rho$-approximation algorithm, where $\rho$ is independent of the order of the underlying graph. Simulations on both synthetic and real data show that the performance of our algorithm is very close to the optimal solution.
Tasks Causal Discovery
Published 2017-02-27
URL http://arxiv.org/abs/1702.08567v1
PDF http://arxiv.org/pdf/1702.08567v1.pdf
PWC https://paperswithcode.com/paper/optimal-experiment-design-for-causal
Repo
Framework

Estimation and Inference about Conditional Average Treatment Effect and Other Structural Functions

Title Estimation and Inference about Conditional Average Treatment Effect and Other Structural Functions
Authors Vira Semenova, Victor Chernozhukov
Abstract Our framework can be viewed as inference on low-dimensional nonparametric functions in the presence of high-dimensional nuisance function (where dimensionality refers to the number of covariates). Specifically, we consider the setting where we have a signal $Y=Y(\eta_0)$ that is an unbiased predictor of causal/structural objects like treatment effect, structural derivative, outcome given treatment, and others, conditional on a set of very high dimensional controls $Z$. We are interested in simpler lower-dimensional nonparametric summaries of $Y$, namely $g(x)=E[Y \mid X=x]$ conditional on a low-dimensional subset of covariates $X$. The signal $Y=Y(\eta)$ depends on an unknown nuisance function $\eta_0(Z)$. In the first stage, we need to learn the function $\eta_0(Z)$ using any machine learning method that is able to approximate $\eta$ accurately under very high dimensionality of $Z$. For example, under approximate sparsity with respect to a dictionary, $\ell_1$-penalized methods can be used; in others, tools such as deep neural networks can be used. To make the subsequent inference valid, we make the signal orthogonal to perturbations of $\eta$. As a result, the second-stage low-dimensional nonparametric inference enjoys the quasi-oracle properties, as if we knew $\eta_0$. In the second stage, we approximate the target function $g(x)$ by a linear form $p(x)'\beta_0$, where $\beta_0$ is the Best Linear Predictor parameter. We develop a complete set of results about estimation and approximately Gaussian inference on $x \mapsto p(x)'\beta$ and $x \mapsto g(x)$. If $p(x)$ is sufficiently rich and $g(x)$ admits a good approximation, then $g(x)$ gets automatically targeted by the inference; otherwise, the best linear approximation $p(x)'\beta$ to $g(x)$ gets targeted. When $p(x)$ is specified as a collection of group indicators, $p(x)'\beta$ describes group-average treatment effects (GATEs).
Tasks
Published 2017-02-21
URL https://arxiv.org/abs/1702.06240v3
PDF https://arxiv.org/pdf/1702.06240v3.pdf
PWC https://paperswithcode.com/paper/simultaneous-inference-for-best-linear
Repo
Framework
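
The closing remark of the abstract above, that group indicators in $p(x)$ make $p(x)'\beta$ a set of group averages, can be checked numerically. The sketch below covers only the second-stage least-squares fit and assumes the signal $Y$ has already been constructed; the first-stage estimation of $\eta_0$ and the orthogonalization are omitted, and the function names and toy numbers are hypothetical.

```python
import numpy as np


def group_indicators(groups, n_groups):
    """Build p(x) as one-hot group membership indicators."""
    P = np.zeros((len(groups), n_groups))
    P[np.arange(len(groups)), groups] = 1.0
    return P


def best_linear_predictor(P, y):
    """Least-squares coefficient beta of the signal y on the basis P."""
    beta, *_ = np.linalg.lstsq(P, y, rcond=None)
    return beta


# Toy signal values for five units in two groups (made-up numbers).
groups = np.array([0, 0, 1, 1, 1])
signal = np.array([1.0, 3.0, 2.0, 4.0, 6.0])
beta = best_linear_predictor(group_indicators(groups, 2), signal)
```

Here each entry of `beta` equals the mean of the signal within the corresponding group, which is the sense in which the linear form describes group-average effects.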

Compressed Factorization: Fast and Accurate Low-Rank Factorization of Compressively-Sensed Data

Title Compressed Factorization: Fast and Accurate Low-Rank Factorization of Compressively-Sensed Data
Authors Vatsal Sharan, Kai Sheng Tai, Peter Bailis, Gregory Valiant
Abstract What learning algorithms can be run directly on compressively-sensed data? In this work, we consider the question of accurately and efficiently computing low-rank matrix or tensor factorizations given data compressed via random projections. We examine the approach of first performing factorization in the compressed domain, and then reconstructing the original high-dimensional factors from the recovered (compressed) factors. In both the matrix and tensor settings, we establish conditions under which this natural approach will provably recover the original factors. While it is well-known that random projections preserve a number of geometric properties of a dataset, our work can be viewed as showing that they can also preserve certain solutions of non-convex, NP-Hard problems like non-negative matrix factorization. We support these theoretical results with experiments on synthetic data and demonstrate the practical applicability of compressed factorization on real-world gene expression and EEG time series datasets.
Tasks EEG, Time Series
Published 2017-06-25
URL https://arxiv.org/abs/1706.08146v3
PDF https://arxiv.org/pdf/1706.08146v3.pdf
PWC https://paperswithcode.com/paper/fast-and-accurate-low-rank-factorization-of
Repo
Framework

MirBot: A collaborative object recognition system for smartphones using convolutional neural networks

Title MirBot: A collaborative object recognition system for smartphones using convolutional neural networks
Authors Antonio Pertusa, Antonio-Javier Gallego, Marisa Bernabeu
Abstract MirBot is a collaborative application for smartphones that allows users to perform object recognition. This app can be used to take a photograph of an object, select the region of interest and obtain the most likely class (dog, chair, etc.) by means of similarity search using features extracted from a convolutional neural network (CNN). The answers provided by the system can be validated by the user so as to improve the results for future queries. All the images are stored together with a series of metadata, thus enabling a multimodal incremental dataset labeled with synset identifiers from the WordNet ontology. This dataset grows continuously thanks to the users’ feedback, and is publicly available for research. This work details the MirBot object recognition system, analyzes the statistics gathered after more than four years of usage, describes the image classification methodology, and performs an exhaustive evaluation using handcrafted features, convolutional neural codes and different transfer learning techniques. After comparing various models and transformation methods, the results show that the CNN features maintain the accuracy of MirBot constant over time, despite the increasing number of new classes. The app is freely available at the Apple and Google Play stores.
Tasks Image Classification, Object Recognition, Transfer Learning
Published 2017-06-09
URL http://arxiv.org/abs/1706.02889v3
PDF http://arxiv.org/pdf/1706.02889v3.pdf
PWC https://paperswithcode.com/paper/mirbot-a-collaborative-object-recognition
Repo
Framework
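
The similarity-search step mentioned in the abstract above can be illustrated with a small nearest-neighbor vote over feature vectors. This is a generic sketch, not MirBot's pipeline: the two-dimensional vectors stand in for CNN features, and the cosine measure, the value of `k`, and the function names are assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_likely_class(query, gallery, k=3):
    """Return the majority label among the k stored items most similar to query."""
    ranked = sorted(gallery, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Stand-in features; a real system would store CNN activations per image.
gallery = [
    ([1.0, 0.0], "dog"),
    ([0.9, 0.1], "dog"),
    ([0.0, 1.0], "chair"),
    ([0.1, 0.9], "chair"),
]
prediction = most_likely_class([1.0, 0.05], gallery)
```

In such a design, a user-validated answer can simply be appended to `gallery`, so new classes are supported without retraining the feature extractor.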

Efficient Preconditioning for Noisy Separable NMFs by Successive Projection Based Low-Rank Approximations

Title Efficient Preconditioning for Noisy Separable NMFs by Successive Projection Based Low-Rank Approximations
Authors Tomohiko Mizutani, Mirai Tanaka
Abstract The successive projection algorithm (SPA) can quickly solve a nonnegative matrix factorization problem under a separability assumption. Even if noise is added to the problem, SPA is robust as long as the perturbations caused by the noise are small. In particular, robustness against noise should be high when handling the problems arising from real applications. The preconditioner proposed by Gillis and Vavasis (2015) makes it possible to enhance the noise robustness of SPA. However, it incurs an additional computational cost. The construction of the preconditioner contains a step to compute the top-$k$ truncated singular value decomposition of an input matrix. It is known that the decomposition provides the best rank-$k$ approximation to the input matrix; in other words, a matrix with the smallest approximation error among all matrices of rank at most $k$. This step is an obstacle to an efficient implementation of the preconditioned SPA. To address the cost issue, we propose a modification of the algorithm for constructing the preconditioner. Whereas the original algorithm uses the best rank-$k$ approximation, our modification uses an alternative. Ideally, this alternative should have high approximation accuracy and low computational cost. To ensure this, our modification employs a rank-$k$ approximation produced by an SPA based algorithm. We analyze the accuracy of the approximation and evaluate the computational cost of the algorithm. We then present an empirical study revealing the actual performance of the SPA based rank-$k$ approximation algorithm and the modified preconditioned SPA.
Tasks
Published 2017-10-01
URL http://arxiv.org/abs/1710.00387v1
PDF http://arxiv.org/pdf/1710.00387v1.pdf
PWC https://paperswithcode.com/paper/efficient-preconditioning-for-noisy-separable
Repo
Framework
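
For readers unfamiliar with SPA, the basic (unpreconditioned) algorithm can be sketched in a few lines: repeatedly pick the column with the largest norm, then project all columns onto the orthogonal complement of the picked one. The sketch below shows only this basic step on a noiseless toy matrix; the preconditioner and the SPA based rank-$k$ approximation studied in the paper are not implemented, and the toy matrices are made up.

```python
import numpy as np


def spa(M, k):
    """Basic successive projection algorithm: return up to k column indices."""
    R = np.array(M, dtype=float)   # residual matrix, copied so M is untouched
    selected = []
    for _ in range(k):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))  # column with the largest residual norm
        if norms[j] == 0.0:        # nothing left to select
            break
        selected.append(j)
        u = R[:, j] / norms[j]
        R -= np.outer(u, u @ R)    # project onto the orthogonal complement of u
    return selected


# Separable toy problem: columns 0 and 1 of M are the basis columns of W,
# and the remaining columns are convex combinations of them.
W = np.array([[1.0, 0.0], [0.0, 1.0], [0.2, 0.3]])
H = np.array([[1.0, 0.0, 0.5, 0.3], [0.0, 1.0, 0.5, 0.7]])
M = W @ H
basis_columns = spa(M, 2)
```

On this input the algorithm recovers the two basis columns; with noisy data the selection can be perturbed, which is the situation the preconditioning is meant to address.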

Beliefs in Markov Trees - From Local Computations to Local Valuation

Title Beliefs in Markov Trees - From Local Computations to Local Valuation
Authors Mieczysław A. Kłopotek
Abstract This paper is devoted to the expressiveness of hypergraphs to which uncertainty propagation by local computations via the Shenoy/Shafer method applies. It is demonstrated that, for this propagation method and a given joint belief distribution, no valuation of the hyperedges of a hypergraph can yield a simpler hypergraph structure than valuation of the hyperedges by conditional distributions. This has the vital implication that methods recovering belief networks from data have no better alternative for finding the simplest hypergraph structure for belief propagation. A method for recovering tree-structured belief networks has been developed and specialized for Dempster-Shafer belief functions.
Tasks
Published 2017-04-12
URL http://arxiv.org/abs/1704.03723v1
PDF http://arxiv.org/pdf/1704.03723v1.pdf
PWC https://paperswithcode.com/paper/beliefs-in-markov-trees-from-local
Repo
Framework