Paper Group ANR 194
Approximation Trees: Statistical Stability in Model Distillation. Learning to Group and Label Fine-Grained Shape Components. Face De-Spoofing: Anti-Spoofing via Noise Modeling. Modelling hidden structure of signals in group data analysis with modified (Lr, 1) and block-term decompositions. Dual SVM Training on a Budget. SalientDSO: Bringing Attention to Direct Sparse Odometry. …
Approximation Trees: Statistical Stability in Model Distillation
Title | Approximation Trees: Statistical Stability in Model Distillation |
Authors | Yichen Zhou, Zhengze Zhou, Giles Hooker |
Abstract | This paper examines the stability of learned explanations for black-box predictions via model distillation with decision trees. One approach to intelligibility in machine learning is to use an understandable 'student' model to mimic the output of an accurate 'teacher'. Here, we consider the use of regression trees as a student model, in which nodes of the tree can be used as 'explanations' for particular predictions, and the whole structure of the tree can be used as a global representation of the resulting function. However, individual trees are sensitive to the particular data sets used to train them, and an interpretation of a student model may be suspect if small changes in the training data have a large effect on it. In this context, access to outcomes from a teacher helps to stabilize the greedy splitting strategy by generating a much larger corpus of training examples than was originally available. We develop tests to ensure that enough examples are generated at each split so that the same splitting rule would be chosen with high probability were the tree to be retrained. Further, we develop a stopping rule to indicate how deep the tree should be built based on recent results on the variability of Random Forests when these are used as the teacher. We provide concrete examples of these procedures on the CAD-MDD and COMPAS data sets. |
Tasks | |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07573v1 |
http://arxiv.org/pdf/1808.07573v1.pdf | |
PWC | https://paperswithcode.com/paper/approximation-trees-statistical-stability-in |
Repo | |
Framework | |
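The distillation loop described in the abstract is easy to prototype. Below is a minimal sketch, assuming a scikit-learn setup and a crude marginal-resampling scheme for generating the pseudo-data; the paper's formal split-stability tests and stopping rule are only approximated here by a naive refit check.

```python
# A minimal sketch of tree-based model distillation (not the authors' exact
# stability tests): a random-forest "teacher" labels a large synthetic corpus,
# which a shallow "student" regression tree is then fit to.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=1.0, random_state=0)
teacher = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Generate a much larger corpus than the original data by resampling each
# feature's empirical marginal, then labeling it with the teacher.
rng = np.random.default_rng(0)
X_big = np.column_stack([rng.choice(X[:, j], size=20000) for j in range(X.shape[1])])
y_big = teacher.predict(X_big)

student = DecisionTreeRegressor(max_depth=3).fit(X_big, y_big)

# Crude stability probe: does the root split stay the same across refits on
# fresh teacher-labeled corpora? (The paper develops formal tests for this.)
roots = set()
for seed in range(5):
    rng = np.random.default_rng(seed)
    Xb = np.column_stack([rng.choice(X[:, j], size=20000) for j in range(X.shape[1])])
    t = DecisionTreeRegressor(max_depth=3).fit(Xb, teacher.predict(Xb))
    roots.add(int(t.tree_.feature[0]))
print("root split features across refits:", roots)
```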
Learning to Group and Label Fine-Grained Shape Components
Title | Learning to Group and Label Fine-Grained Shape Components |
Authors | Xiaogang Wang, Bin Zhou, Haiyue Fang, Xiaowu Chen, Qinping Zhao, Kai Xu |
Abstract | A majority of stock 3D models in modern shape repositories are assembled with many fine-grained components. The main cause of such data form is the component-wise modeling process widely practiced by human modelers. These modeling components thus inherently reflect some function-based shape decomposition the artist had in mind during modeling. On the other hand, modeling components represent an over-segmentation since a functional part is usually modeled as a multi-component assembly. Based on these observations, we advocate that labeled segmentation of stock 3D models should not overlook the modeling components and propose a learning solution to grouping and labeling of the fine-grained components. However, directly characterizing the shape of individual components for the purpose of labeling is unreliable, since they can be arbitrarily tiny and semantically meaningless. We propose to generate part hypotheses from the components based on a hierarchical grouping strategy, and perform labeling on those part groups instead of directly on the components. Part hypotheses are mid-level elements which are more probable to carry semantic information. A multiscale 3D convolutional neural network is trained to extract context-aware features for the hypotheses. To accomplish a labeled segmentation of the whole shape, we formulate higher-order conditional random fields (CRFs) to infer an optimal label assignment for all components. Extensive experiments demonstrate that our method achieves significantly robust labeling results on raw 3D models from public shape repositories. Our work also contributes the first benchmark for component-wise labeling. |
Tasks | |
Published | 2018-09-13 |
URL | http://arxiv.org/abs/1809.05050v1 |
http://arxiv.org/pdf/1809.05050v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-group-and-label-fine-grained |
Repo | |
Framework | |
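The hierarchical grouping step can be illustrated in isolation. The sketch below uses hypothetical component centroids and plain agglomerative clustering in place of the paper's learned grouping strategy; multiscale dendrogram cuts stand in for part-hypothesis generation, and the 3D CNN features and CRF labeling are omitted.

```python
# A minimal sketch of the hierarchical grouping idea (hypothetical data and
# criteria; the paper uses learned, context-aware features): fine-grained
# components are merged bottom-up by centroid proximity, and every cut of the
# dendrogram contributes candidate "part hypotheses".
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
centroids = rng.normal(size=(12, 3))  # one 3D centroid per shape component

Z = linkage(centroids, method="average")

# Cut the dendrogram at several scales; the union of groups over all scales
# is the pool of part hypotheses to be labeled.
hypotheses = set()
for k in (2, 4, 8):
    labels = fcluster(Z, t=k, criterion="maxclust")
    for g in np.unique(labels):
        hypotheses.add(frozenset(np.flatnonzero(labels == g)))
print(f"{len(hypotheses)} distinct part hypotheses from 12 components")
```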
Face De-Spoofing: Anti-Spoofing via Noise Modeling
Title | Face De-Spoofing: Anti-Spoofing via Noise Modeling |
Authors | Amin Jourabloo, Yaojie Liu, Xiaoming Liu |
Abstract | Many prior face anti-spoofing works develop discriminative models for recognizing the subtle differences between live and spoof faces. Those approaches often regard the image as an indivisible unit, and process it holistically, without explicit modeling of the spoofing process. In this work, motivated by noise modeling and denoising algorithms, we identify a new problem of face de-spoofing, for the purpose of anti-spoofing: inversely decomposing a spoof face into a spoof noise and a live face, and then utilizing the spoof noise for classification. A CNN architecture with proper constraints and supervisions is proposed to overcome the problem of having no ground truth for the decomposition. We evaluate the proposed method on multiple face anti-spoofing databases. The results show promising improvements due to our spoof noise modeling. Moreover, the estimated spoof noise provides a visualization which helps in understanding the spoof noise added by each spoof medium. |
Tasks | Denoising, Face Anti-Spoofing |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.09968v1 |
http://arxiv.org/pdf/1807.09968v1.pdf | |
PWC | https://paperswithcode.com/paper/face-de-spoofing-anti-spoofing-via-noise |
Repo | |
Framework | |
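The core decomposition is a spoof image being split into a live face and an additive spoof noise. The PyTorch sketch below is a toy stand-in for the paper's CNN: a small network estimates the noise N(x), the live face is recovered as x - N(x), and a zero-noise penalty hints at the kind of supervision used for live inputs.

```python
# A minimal sketch of the de-spoofing decomposition (architecture and loss are
# placeholders, not the paper's exact CNN and constraints).
import torch
import torch.nn as nn

class NoiseEstimator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1),
        )

    def forward(self, x):
        noise = self.net(x)  # estimated spoof-noise pattern
        live = x - noise     # reconstructed live face
        return noise, live

model = NoiseEstimator()
x = torch.rand(4, 3, 64, 64)
noise, live = model(x)
# For live inputs the noise should vanish; a magnitude penalty encodes that,
# standing in for the paper's full set of constraints and supervisions.
zero_noise_loss = noise.abs().mean()
print(noise.shape, live.shape, float(zero_noise_loss))
```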
Modelling hidden structure of signals in group data analysis with modified (Lr, 1) and block-term decompositions
Title | Modelling hidden structure of signals in group data analysis with modified (Lr, 1) and block-term decompositions |
Authors | Pavel Kharyuk, Ivan Oseledets |
Abstract | This work elaborates on the idea of using block-term decomposition for group data analysis and raises the possibility of modelling group activity with (Lr, 1) and Tucker blocks. A new generalization of block tensor decomposition is considered in application to group data analysis. The suggested approach is evaluated on a multilabel classification task for a set of images. This contribution also reports the results of an investigation of clustering with the proposed tensor models, in comparison with known matrix models, namely common orthogonal basis extraction and group independent component analysis. |
Tasks | |
Published | 2018-08-07 |
URL | http://arxiv.org/abs/1808.02316v1 |
http://arxiv.org/pdf/1808.02316v1.pdf | |
PWC | https://paperswithcode.com/paper/modelling-hidden-structure-of-signals-in |
Repo | |
Framework | |
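The (Lr, 1) block-term model itself is compact. The numpy snippet below only constructs a tensor with that structure (it does not fit one; an ALS-type solver would be needed for that), with all sizes chosen arbitrarily for illustration.

```python
# A minimal numpy illustration of the (Lr, 1) block-term model: a 3-way tensor
# is a sum of R blocks, each the outer product of a rank-Lr matrix A_r @ B_r.T
# with a vector c_r.
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R, Lr = 6, 5, 4, 3, 2

T = np.zeros((I, J, K))
for _ in range(R):
    A = rng.normal(size=(I, Lr))
    B = rng.normal(size=(J, Lr))
    c = rng.normal(size=K)
    T += np.einsum("il,jl,k->ijk", A, B, c)  # (A @ B.T) outer c

# In group data analysis, each block can model one source of shared structure,
# with the third mode indexing subjects or trials.
print(T.shape)
```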
Dual SVM Training on a Budget
Title | Dual SVM Training on a Budget |
Authors | Sahar Qaadan, Merlin Schüler, Tobias Glasmachers |
Abstract | We present a dual subspace ascent algorithm for support vector machine training that respects a budget constraint limiting the number of support vectors. Budget methods are effective for reducing the training time of kernel SVM while retaining high accuracy. To date, budget training is available only for primal (SGD-based) solvers. Dual subspace ascent methods like sequential minimal optimization are attractive for their good adaptation to the problem structure, their fast convergence rate, and their practical speed. By incorporating a budget constraint into a dual algorithm, our method enjoys the best of both worlds. We demonstrate considerable speed-ups over primal budget training methods. |
Tasks | |
Published | 2018-06-26 |
URL | http://arxiv.org/abs/1806.10182v1 |
http://arxiv.org/pdf/1806.10182v1.pdf | |
PWC | https://paperswithcode.com/paper/dual-svm-training-on-a-budget |
Repo | |
Framework | |
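A budgeted dual ascent can be sketched in a few lines. The code below implements plain dual coordinate ascent for a kernel SVM without a bias term and enforces the budget by dropping the smallest-coefficient support vector, which is a deliberate simplification of the paper's budget-maintenance strategy.

```python
# A minimal sketch of budgeted dual SVM training (the budget rule here is a
# simplification of the paper's method).
import numpy as np

def rbf(X, Z, gamma=0.5):
    d = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.sign(X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=200))

C, budget = 1.0, 20
K = rbf(X, X)
alpha = np.zeros(len(X))
for _ in range(10):                                # epochs of coordinate ascent
    for i in rng.permutation(len(X)):
        g = y[i] * (K[i] @ (alpha * y)) - 1.0      # dual gradient at coordinate i
        alpha[i] = np.clip(alpha[i] - g / K[i, i], 0.0, C)
        sv = np.flatnonzero(alpha > 0)
        if len(sv) > budget:                       # enforce the budget constraint
            alpha[sv[np.argmin(alpha[sv])]] = 0.0

pred = np.sign(K @ (alpha * y))
print("training accuracy:", (pred == y).mean())
```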
SalientDSO: Bringing Attention to Direct Sparse Odometry
Title | SalientDSO: Bringing Attention to Direct Sparse Odometry |
Authors | Huai-Jen Liang, Nitin J. Sanket, Cornelia Fermüller, Yiannis Aloimonos |
Abstract | Although cluttered indoor scenes have a lot of useful high-level semantic information which can be used for mapping and localization, most Visual Odometry (VO) algorithms rely on the usage of geometric features such as points, lines and planes. Lately, driven by this idea, the joint optimization of semantic labels and obtaining odometry has gained popularity in the robotics community. The joint optimization is good for accurate results but is generally very slow. At the same time, in the vision community, direct and sparse approaches for VO have struck the right balance between speed and accuracy. We merge the successes of these two communities and present a way to incorporate semantic information in the form of visual saliency into Direct Sparse Odometry - a highly successful direct sparse VO algorithm. We also present a framework to filter the visual saliency based on scene parsing. Our framework, SalientDSO, relies on the widely successful deep learning based approaches for visual saliency and scene parsing which drives the feature selection for obtaining highly-accurate and robust VO even in the presence of as few as 40 point features per frame. We provide extensive quantitative evaluation of SalientDSO on the ICL-NUIM and TUM monoVO datasets and show that we outperform DSO and ORB-SLAM - two very popular state-of-the-art approaches in the literature. We also collect and publicly release a CVL-UMD dataset which contains two indoor cluttered sequences on which we show qualitative evaluations. To our knowledge, this is the first paper to use visual saliency and scene parsing to drive the feature selection in direct VO. |
Tasks | Feature Selection, Scene Parsing, Visual Odometry |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1803.00127v1 |
http://arxiv.org/pdf/1803.00127v1.pdf | |
PWC | https://paperswithcode.com/paper/salientdso-bringing-attention-to-direct |
Repo | |
Framework | |
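The saliency-driven feature selection can be mimicked without the full VO pipeline. In the sketch below, a synthetic saliency map (standing in for the deep saliency network filtered by scene parsing) re-weights gradient-magnitude scores, and the top 40 pixels are kept, echoing the paper's low feature budget.

```python
# A minimal sketch of saliency-weighted point selection (synthetic saliency;
# SalientDSO uses a learned saliency network filtered by scene parsing).
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((120, 160))
saliency = rng.random((120, 160))    # stand-in for a learned saliency map

gy, gx = np.gradient(img)
score = np.hypot(gx, gy) * saliency  # saliency drives the feature selection

N = 40                               # as few as 40 points per frame
flat = np.argsort(score.ravel())[-N:]
rows, cols = np.unravel_index(flat, score.shape)
print(list(zip(rows[:5], cols[:5])))
```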
A Convolutional Neural Network for Aspect Sentiment Classification
Title | A Convolutional Neural Network for Aspect Sentiment Classification |
Authors | Yongping Xing, Chuangbai Xiao, Yifei Wu, Ziming Ding |
Abstract | With the development of the Internet, natural language processing (NLP), in which sentiment analysis is an important task, has become vital in information processing. Sentiment analysis includes aspect sentiment classification, which can provide complete and in-depth results with increased attention on the aspect level. Different context words in a sentence influence the sentiment polarity of the sentence variably, and polarity varies with the different aspects in a sentence. Take the sentence 'I bought a new camera. The picture quality is amazing but the battery life is too short.' as an example. If the aspect is picture quality, then the expected sentiment polarity is 'positive'; if the battery life aspect is considered, then the sentiment polarity should be 'negative'. Therefore, the aspect is important to consider when we explore aspect sentiment in a sentence. The recurrent neural network (RNN) is regarded as a good model for natural language processing, and RNNs have achieved good performance on aspect sentiment classification, including Target-Dependent LSTM (TD-LSTM), Target-Connection LSTM (TC-LSTM) (Tang, 2015a, b), AE-LSTM, AT-LSTM, and AEAT-LSTM (Wang et al., 2016). There is also an extensive literature on sentiment classification using convolutional neural networks, but little on aspect sentiment classification with convolutional neural networks. In this paper, we develop attention-based input layers in which aspect information is considered by the input layer. We then incorporate these attention-based input layers into a convolutional neural network (CNN) to introduce context-word information. In our experiments, incorporating aspect information into the CNN improves its aspect sentiment classification performance on a benchmark dataset from Twitter, without using a syntactic parser or external sentiment lexicons, and achieves better performance than other models. |
Tasks | Sentiment Analysis |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.01704v1 |
http://arxiv.org/pdf/1807.01704v1.pdf | |
PWC | https://paperswithcode.com/paper/a-convolutional-neural-network-for-aspect |
Repo | |
Framework | |
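One way to realize an attention-based input layer is to let an aspect vector re-weight the word embeddings before the convolution. The PyTorch sketch below follows that idea with illustrative dimensions and a dot-product attention; it is not the paper's exact architecture.

```python
# A minimal sketch of an aspect-aware CNN (dimensions and attention form are
# illustrative): aspect information re-weights the input word embeddings
# before a standard 1-D convolution and max-pooling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AspectCNN(nn.Module):
    def __init__(self, vocab=1000, dim=50, n_classes=3):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.conv = nn.Conv1d(dim, 64, kernel_size=3, padding=1)
        self.fc = nn.Linear(64, n_classes)

    def forward(self, words, aspect):
        w = self.emb(words)               # (B, T, D) context-word embeddings
        a = self.emb(aspect).mean(dim=1)  # (B, D) averaged aspect vector
        attn = torch.softmax((w * a.unsqueeze(1)).sum(-1), dim=1)   # (B, T)
        x = (w * attn.unsqueeze(-1)).transpose(1, 2)  # attention-scaled input
        h = F.relu(self.conv(x)).max(dim=2).values    # convolve and max-pool
        return self.fc(h)

model = AspectCNN()
logits = model(torch.randint(0, 1000, (2, 12)), torch.randint(0, 1000, (2, 2)))
print(logits.shape)  # (2, 3): e.g. positive / negative / neutral
```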
A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem
Title | A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem |
Authors | Sampath Kannan, Jamie Morgenstern, Aaron Roth, Bo Waggoner, Zhiwei Steven Wu |
Abstract | Bandit learning is characterized by the tension between long-term exploration and short-term exploitation. However, as has recently been noted, in settings in which the choices of the learning algorithm correspond to important decisions about individual people (such as criminal recidivism prediction, lending, and sequential drug trials), exploration corresponds to explicitly sacrificing the well-being of one individual for the potential future benefit of others. This raises a fairness concern. In such settings, one might like to run a “greedy” algorithm, which always makes the (myopically) optimal decision for the individuals at hand - but doing this can result in a catastrophic failure to learn. In this paper, we consider the linear contextual bandit problem and revisit the performance of the greedy algorithm. We give a smoothed analysis, showing that even when contexts may be chosen by an adversary, small perturbations of the adversary’s choices suffice for the algorithm to achieve “no regret”, perhaps (depending on the specifics of the setting) with a constant amount of initial training data. This suggests that “generically” (i.e. in slightly perturbed environments), exploration and exploitation need not be in conflict in the linear setting. |
Tasks | |
Published | 2018-01-10 |
URL | http://arxiv.org/abs/1801.03423v1 |
http://arxiv.org/pdf/1801.03423v1.pdf | |
PWC | https://paperswithcode.com/paper/a-smoothed-analysis-of-the-greedy-algorithm |
Repo | |
Framework | |
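The smoothed setting is straightforward to simulate. The sketch below perturbs fixed "adversarial" contexts with Gaussian noise, fits each arm by ridge regression, and always acts greedily; all constants are illustrative, and the point is only that perturbation alone can keep average regret small.

```python
# A minimal simulation of the greedy algorithm under smoothed contexts:
# no forced exploration, just myopically optimal choices on perturbed inputs.
import numpy as np

rng = np.random.default_rng(0)
d, K, T = 5, 3, 2000
theta = rng.normal(size=(K, d))                 # true arm parameters

A = [np.eye(d) for _ in range(K)]               # ridge statistics per arm
b = [np.zeros(d) for _ in range(K)]
regret = 0.0
for t in range(T):
    base = np.ones((K, d))                      # fixed "adversarial" contexts
    ctx = base + 0.3 * rng.normal(size=(K, d))  # smoothing perturbation
    est = np.array([c @ np.linalg.solve(A[k], b[k]) for k, c in enumerate(ctx)])
    k = int(np.argmax(est))                     # purely greedy choice
    r = ctx[k] @ theta[k] + 0.1 * rng.normal()
    A[k] += np.outer(ctx[k], ctx[k])
    b[k] += r * ctx[k]
    regret += np.max([c @ th for c, th in zip(ctx, theta)]) - ctx[k] @ theta[k]
print("average regret:", regret / T)
```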
Understanding AI Data Repositories with Automatic Query Generation
Title | Understanding AI Data Repositories with Automatic Query Generation |
Authors | Erik Altman |
Abstract | We describe a set of techniques to generate queries automatically based on one or more ingested input corpora. These queries require no a priori domain knowledge, and hence no human domain experts. Thus, these auto-generated queries help address the epistemological question of how we know what we know, or more precisely in this case, how an AI system with ingested data knows what it knows. These auto-generated queries can also be used to identify and remedy problem areas in ingested material – areas for which the knowledge of the AI system is incomplete or even erroneous. Similarly, the proposed techniques facilitate tests of AI capability – both in terms of coverage and accuracy. By removing humans from the main learning loop, our approach also allows more effective scaling of AI and cognitive capabilities to provide (1) broader coverage in a single domain such as health or geology; and (2) more rapid deployment to new domains. The proposed techniques also allow ingested knowledge to be extended naturally. Our investigations are early, and this paper provides a description of the techniques. Assessment of their efficacy is our next step for future work. |
Tasks | |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07819v1 |
http://arxiv.org/pdf/1804.07819v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-ai-data-repositories-with |
Repo | |
Framework | |
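A toy version of corpus-driven query generation can be put together from frequent-term extraction and question templates. The snippet below uses a two-sentence corpus and hand-written templates purely as stand-ins for the paper's techniques.

```python
# A minimal sketch of template-based query generation from an ingested corpus
# (toy extraction and templates; not the paper's actual techniques).
from collections import Counter
import re

corpus = [
    "Basalt is a volcanic rock formed from rapidly cooled lava.",
    "Granite is an intrusive igneous rock rich in quartz.",
]
tokens = Counter(t for doc in corpus for t in re.findall(r"[a-z]+", doc.lower()))
stop = {"is", "a", "an", "the", "from", "in", "of"}
terms = [t for t, _ in tokens.most_common(10) if t not in stop]

templates = ["What is {}?", "How does {} relate to the corpus?"]
queries = [tpl.format(term) for term in terms[:3] for tpl in templates]
print(queries)  # auto-generated probes of what the system "knows"
```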
Automatically Evolving CNN Architectures Based on Blocks
Title | Automatically Evolving CNN Architectures Based on Blocks |
Authors | Yanan Sun, Bing Xue, Mengjie Zhang, Gary G. Yen |
Abstract | The performance of Convolutional Neural Networks (CNNs) relies heavily on their architectures. In order to design a CNN with promising performance, extended expertise in both CNNs and the investigated problem is required, which is not necessarily held by every user interested in CNNs or the problem domain. In this paper, we propose to automatically evolve CNN architectures by using a genetic algorithm based on ResNet blocks and DenseNet blocks. The proposed algorithm is completely automatic in designing CNN architectures; in particular, neither pre-processing before it starts nor post-processing on the designed CNN is needed. Furthermore, the proposed algorithm does not require users to have domain knowledge of CNNs, the investigated problem, or even genetic algorithms. The proposed algorithm is evaluated on CIFAR10 and CIFAR100 against 18 state-of-the-art peer competitors. Experimental results show that it outperforms hand-crafted state-of-the-art CNNs and CNNs designed by automatic peer competitors in terms of classification accuracy, and achieves competitive classification accuracy against semi-automatic peer competitors. In addition, the proposed algorithm consumes much less time than most peer competitors in finding the best CNN architectures. |
Tasks | |
Published | 2018-10-28 |
URL | http://arxiv.org/abs/1810.11875v2 |
http://arxiv.org/pdf/1810.11875v2.pdf | |
PWC | https://paperswithcode.com/paper/automatically-evolving-cnn-architectures |
Repo | |
Framework | |
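The search itself is an ordinary genetic algorithm over block sequences. The sketch below evolves genomes of ResNet/DenseNet/pooling block identifiers with simple crossover and mutation; the fitness function is a stub, whereas the paper scores each candidate by actually training the decoded CNN on CIFAR.

```python
# A minimal sketch of block-based evolutionary architecture search
# (stub fitness; the paper trains each candidate CNN to score it).
import random

BLOCKS = ["resnet", "densenet", "pool"]

def random_genome():
    return [random.choice(BLOCKS) for _ in range(random.randint(3, 8))]

def fitness(genome):
    # Stub standing in for "train the decoded CNN and return its accuracy".
    return sum(b != "pool" for b in genome) / len(genome) + random.random() * 0.1

def mutate(genome):
    g = genome[:]
    g[random.randrange(len(g))] = random.choice(BLOCKS)
    return g

def crossover(a, b):
    cut_a, cut_b = random.randrange(1, len(a)), random.randrange(1, len(b))
    return a[:cut_a] + b[cut_b:]

random.seed(0)
pop = [random_genome() for _ in range(10)]
for gen in range(20):
    pop.sort(key=fitness, reverse=True)
    elite = pop[:4]                      # keep the fittest genomes
    pop = elite + [mutate(crossover(*random.sample(elite, 2))) for _ in range(6)]
print("best architecture:", max(pop, key=fitness))
```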
Multi-player Multi-armed Bandits for Stable Allocation in Heterogeneous Ad-Hoc Networks
Title | Multi-player Multi-armed Bandits for Stable Allocation in Heterogeneous Ad-Hoc Networks |
Authors | Sumit J Darak, Manjesh K. Hanawal |
Abstract | Next generation networks are expected to be ultra-dense and aim to explore a spectrum sharing paradigm that allows users to communicate in licensed, shared as well as unlicensed spectrum. Such ultra-dense networks will incur significant signaling load at base stations, leading to a negative effect on spectrum and energy efficiency. To minimize signaling overhead, an ad-hoc approach is being considered for users communicating in the unlicensed and shared spectrums. For such users, decisions need to be completely decentralized, as: 1) No communication between users and no signaling from the base station is possible, which necessitates independent channel selection at each user; a collision occurs when multiple users transmit simultaneously on the same channel. 2) Channel qualities may be heterogeneous, i.e., they are not the same across all users, and moreover, are unknown. 3) The network could be dynamic, where users can enter or leave anytime. We develop a multi-armed bandit based distributed algorithm for static networks and extend it for dynamic networks. The algorithms aim to achieve stable orthogonal allocation (SOC) in finite time and meet the above three constraints with two novel characteristics: 1) Low complexity narrowband radio compared to wideband radio in existing works, and 2) Epoch-less approach for dynamic networks. We establish convergence of our algorithms to SOC and validate them via extensive simulation experiments. |
Tasks | Multi-Armed Bandits |
Published | 2018-12-24 |
URL | https://arxiv.org/abs/1812.11651v2 |
https://arxiv.org/pdf/1812.11651v2.pdf | |
PWC | https://paperswithcode.com/paper/distributed-learning-and-stable |
Repo | |
Framework | |
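A simulation in the spirit of the model (though not the paper's algorithm) is shown below: each user runs an independent UCB over channels with heterogeneous, unknown qualities, collisions yield zero reward, and the users tend to spread across distinct channels.

```python
# A minimal multi-player bandit simulation with collisions (plain per-user
# UCB; the paper's SOC algorithm has stronger guarantees).
import numpy as np

rng = np.random.default_rng(0)
n_users, n_ch, T = 3, 5, 5000
mu = rng.uniform(0.2, 0.9, size=(n_users, n_ch))  # heterogeneous qualities

counts = np.zeros((n_users, n_ch))
means = np.zeros((n_users, n_ch))
for t in range(1, T + 1):
    with np.errstate(divide="ignore"):
        bonus = np.sqrt(2.0 * np.log(t + 1) / counts)  # inf for unexplored arms
    picks = (means + bonus).argmax(axis=1)
    for u, ch in enumerate(picks):
        collided = int((picks == ch).sum()) > 1
        r = 0.0 if collided else float(rng.random() < mu[u, ch])
        counts[u, ch] += 1
        means[u, ch] += (r - means[u, ch]) / counts[u, ch]
print("final picks:", picks, "| orthogonal:", len(set(picks)) == n_users)
```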
Efficient ConvNets for Analog Arrays
Title | Efficient ConvNets for Analog Arrays |
Authors | Malte J. Rasch, Tayfun Gokmen, Mattia Rigotti, Wilfried Haensch |
Abstract | Analog arrays are a promising upcoming hardware technology with the potential to drastically speed up deep learning. Their main advantage is that they compute matrix-vector products in constant time, irrespective of the size of the matrix. However, early convolution layers in ConvNets map very unfavorably onto analog arrays, because kernel matrices are typically small and the constant-time operation needs to be sequentially iterated a large number of times, reducing the speed-up advantage for ConvNets. Here, we propose to replicate the kernel matrix of a convolution layer on distinct analog arrays, and to randomly divide parts of the compute among them, so that multiple kernel matrices are trained in parallel. With this modification, analog arrays execute ConvNets with an acceleration factor that is proportional to the number of kernel matrices used per layer (here tested with 16-128). Despite having more free parameters, we show analytically and in numerical experiments that this convolution architecture is self-regularizing and implicitly learns similar filters across arrays. We also report superior performance on a number of datasets and increased robustness to adversarial attacks. Our investigation suggests revising the notion that mixed analog-digital hardware is not suitable for ConvNets. |
Tasks | |
Published | 2018-07-03 |
URL | http://arxiv.org/abs/1807.01356v1 |
http://arxiv.org/pdf/1807.01356v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-convnets-for-analog-arrays |
Repo | |
Framework | |
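The replication idea can be expressed as a convolution whose image patches are routed at random among k copies of the kernel matrix. The PyTorch sketch below is a simplified functional model of that tiling (one matmul per "array"); it ignores the analog hardware specifics.

```python
# A minimal sketch of the replicated-kernel convolution: the kernel matrix is
# copied onto k "arrays" and each patch is routed at random to one replica.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReplicatedConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, ksize, replicas=4):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(replicas, out_ch, in_ch * ksize * ksize) * 0.05)
        self.ksize, self.replicas = ksize, replicas

    def forward(self, x):
        B = x.shape[0]
        patches = F.unfold(x, self.ksize)             # (B, C*k*k, L)
        L = patches.shape[-1]
        route = torch.randint(self.replicas, (B, L))  # random patch routing
        out = torch.empty(B, self.weight.shape[1], L)
        for r in range(self.replicas):                # one matmul per "array"
            mask = route == r
            out.transpose(1, 2)[mask] = patches.transpose(1, 2)[mask] @ self.weight[r].T
        return out  # (B, out_ch, L); reshape to a spatial map as needed

x = torch.rand(2, 3, 8, 8)
y = ReplicatedConv2d(3, 16, 3)(x)
print(y.shape)  # (2, 16, 36)
```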
Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines
Title | Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines |
Authors | Martin Schmid, Neil Burch, Marc Lanctot, Matej Moravcik, Rudolf Kadlec, Michael Bowling |
Abstract | Learning strategies for imperfect information games from samples of interaction is a challenging problem. A common method for this setting, Monte Carlo Counterfactual Regret Minimization (MCCFR), can have slow long-term convergence rates due to high variance. In this paper, we introduce a variance reduction technique (VR-MCCFR) that applies to any sampling variant of MCCFR. Using this technique, per-iteration estimated values and updates are reformulated as a function of sampled values and state-action baselines, similar to their use in policy gradient reinforcement learning. The new formulation allows estimates to be bootstrapped from other estimates within the same episode, propagating the benefits of baselines along the sampled trajectory; the estimates remain unbiased even when bootstrapping from other estimates. Finally, we show that given a perfect baseline, the variance of the value estimates can be reduced to zero. Experimental evaluation shows that VR-MCCFR brings an order of magnitude speedup, while the empirical variance decreases by three orders of magnitude. The decreased variance allows, for the first time, CFR+ to be used with sampling, increasing the speedup to two orders of magnitude. |
Tasks | |
Published | 2018-09-09 |
URL | http://arxiv.org/abs/1809.03057v1 |
http://arxiv.org/pdf/1809.03057v1.pdf | |
PWC | https://paperswithcode.com/paper/variance-reduction-in-monte-carlo |
Repo | |
Framework | |
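The baseline mechanism can be demonstrated outside of game trees. The numpy sketch below samples one of several actions, uses a baseline as a control variate with an importance-weighted correction, and shows empirically that the estimate stays unbiased while its variance drops; the numbers are arbitrary.

```python
# A minimal illustration of the baseline trick (abstracted away from game
# trees): one sampled action plus a baseline control variate keeps the state-
# value estimate unbiased while shrinking its variance.
import numpy as np

rng = np.random.default_rng(0)
q = np.array([1.0, 3.0, 0.5])          # true action values
policy = np.array([0.5, 0.3, 0.2])
baseline = q + rng.normal(0, 0.2, 3)   # an imperfect learned baseline

def sampled_estimate(use_baseline):
    a = rng.choice(3, p=policy)
    v = q[a] + rng.normal(0, 1.0)      # noisy sampled return
    b = baseline if use_baseline else np.zeros(3)
    est = b.copy()                     # unsampled actions keep their baseline
    est[a] += (v - b[a]) / policy[a]   # importance-corrected sampled action
    return policy @ est                # estimated state value (true value 1.5)

for flag in (False, True):
    ests = [sampled_estimate(flag) for _ in range(20000)]
    print(f"baseline={flag}: mean={np.mean(ests):.3f}, var={np.var(ests):.3f}")
```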
Neural Network based classification of bone metastasis by primary carcinoma
Title | Neural Network based classification of bone metastasis by primary carcinoma |
Authors | Marija Prokopijević, Aleksandar Stančić, Jelena Vasiljević, Željko Stojković, Goran Dimić, Jelena Sopta, Dalibor Ristić, Dhinaharan Nagamalai |
Abstract | Neural networks have been known for a long time as a tool for different types of classification, but only in the last decade have they shown their full power. With the appearance of hardware capable of supporting demanding matrix operations and parallel algorithms, the neural network, as a universal function approximation framework, has turned out to be the most successful classification method, widely used in all fields of science. On the other hand, the multifractal (MF) approach is an efficient way to quantitatively describe complex structures [1] such as metastatic carcinoma, which recommends this method as an accurate tool for medical diagnostics. The only part that is missing is the classification method. The goal of this research is to describe and apply a feed-forward neural network as an auxiliary diagnostic method for classification of multifractal parameters in order to determine the primary cancer. |
Tasks | |
Published | 2018-10-08 |
URL | http://arxiv.org/abs/1810.05725v1 |
http://arxiv.org/pdf/1810.05725v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-based-classification-of-bone |
Repo | |
Framework | |
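The classification stage on its own is a standard feed-forward network over fixed-length feature vectors. The sketch below uses synthetic stand-ins for the multifractal parameters and hypothetical class labels, since the feature extraction from histopathology images is outside its scope.

```python
# A minimal sketch of the classification stage (synthetic stand-in features;
# the paper computes multifractal parameters from tissue images).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))       # 8 multifractal parameters per sample
y = rng.integers(0, 3, size=300)    # 3 hypothetical primary-cancer classes
X[y == 1] += 1.0                    # give the synthetic classes some separation
X[y == 2] -= 1.0

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```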
Training Discriminative Models to Evaluate Generative Ones
Title | Training Discriminative Models to Evaluate Generative Ones |
Authors | Timothée Lesort, Andrei Stoian, Jean-François Goudou, David Filliat |
Abstract | Generative models are known to be difficult to assess. Recent works, especially on generative adversarial networks (GANs), produce good visual samples of varied categories of images. However, the validation of their quality is still difficult to define and there is no existing agreement on the best evaluation process. This paper aims at making a step toward an objective evaluation process for generative models. It presents a new method to assess a trained generative model by evaluating the test accuracy of a classifier trained with generated data. The test set is composed of real images; therefore, the classifier accuracy is used as a proxy for how well the generative model fits the true data distribution. By comparing results across different generated datasets, we are able to classify and compare generative models. The motivation of this approach is also to evaluate whether generative models can help discriminative neural networks to learn, i.e., to measure whether training on generated data can make a model successful at testing on real settings. Our experiments compare different generators from the Variational Auto-Encoder (VAE) and Generative Adversarial Network (GAN) frameworks on the MNIST and Fashion-MNIST datasets. Our results show that none of the generative models is able to completely replace true data for training a discriminative model, but they also show that the initial GAN and WGAN are the best choices for generating the MNIST (Modified National Institute of Standards and Technology) and Fashion-MNIST databases. |
Tasks | |
Published | 2018-06-28 |
URL | https://arxiv.org/abs/1806.10840v2 |
https://arxiv.org/pdf/1806.10840v2.pdf | |
PWC | https://paperswithcode.com/paper/training-discriminative-models-to-evaluate |
Repo | |
Framework | |
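The evaluation protocol reduces to: train a classifier on generated data only, then test it on real data. The sketch below substitutes a noise-perturbation "generator" for a trained VAE/GAN so that it runs standalone; the real-test accuracy is the quality proxy the paper proposes.

```python
# A minimal sketch of the evaluation protocol (the "generator" here is a stub
# that perturbs real training images; the paper uses trained VAEs and GANs).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
X_gen = X_tr + rng.normal(0, 1.0, X_tr.shape)  # stand-in generative model

clf = LogisticRegression(max_iter=2000).fit(X_gen, y_tr)  # train on generated
score = clf.score(X_te, y_te)                             # test on real images
print("real-test accuracy of classifier trained on generated data:", score)
```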