Paper Group ANR 1021
Random 2.5D U-net for Fully 3D Segmentation
Title | Random 2.5D U-net for Fully 3D Segmentation |
Authors | Christoph Angermann, Markus Haltmeier |
Abstract | Convolutional neural networks are state-of-the-art for various segmentation tasks. While for 2D images these networks are also computationally efficient, 3D convolutions have huge storage requirements and therefore, end-to-end training is limited by GPU memory and data size. To overcome this issue, we introduce a network structure for volumetric data without 3D convolution layers. The main idea is to include projections from different directions to transform the volumetric data to a sequence of images, where each image contains information of the full data. We then apply 2D convolutions to these projection images and lift them again to volumetric data using a trainable reconstruction algorithm. The proposed architecture can be applied end-to-end to very large data volumes without cropping or sliding-window techniques. For a tested sparse binary segmentation task, it outperforms already known standard approaches and is more resistant to generation of artefacts. |
Tasks | |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10398v1 |
PDF | https://arxiv.org/pdf/1910.10398v1.pdf |
PWC | https://paperswithcode.com/paper/random-25d-u-net-for-fully-3d-segmentation |
Repo | |
Framework | |
Maintaining Discrimination and Fairness in Class Incremental Learning
Title | Maintaining Discrimination and Fairness in Class Incremental Learning |
Authors | Bowen Zhao, Xi Xiao, Guojun Gan, Bin Zhang, Shutao Xia |
Abstract | Deep neural networks (DNNs) have been applied in class incremental learning, which aims to solve common real-world problems of learning new classes continually. One drawback of standard DNNs is that they are prone to catastrophic forgetting. Knowledge distillation (KD) is a commonly used technique to alleviate this problem. In this paper, we demonstrate that it can indeed help the model to output more discriminative results within old classes. However, it cannot alleviate the problem that the model tends to classify objects into new classes, causing the positive effect of KD to be hidden and limited. We observed that an important factor causing catastrophic forgetting is that the weights in the last fully connected (FC) layer are highly biased in class incremental learning. In this paper, we propose a simple and effective solution motivated by the aforementioned observations to address catastrophic forgetting. Firstly, we utilize KD to maintain the discrimination within old classes. Then, to further maintain the fairness between old classes and new classes, we propose Weight Aligning (WA), which corrects the biased weights in the FC layer after the normal training process. Unlike previous work, WA does not require any extra parameters or a validation set in advance, as it utilizes the information provided by the biased weights themselves. The proposed method is evaluated on ImageNet-1000, ImageNet-100, and CIFAR-100 under various settings. Experimental results show that the proposed method can effectively alleviate catastrophic forgetting and significantly outperform state-of-the-art methods. |
Tasks | |
Published | 2019-11-16 |
URL | https://arxiv.org/abs/1911.07053v1 |
PDF | https://arxiv.org/pdf/1911.07053v1.pdf |
PWC | https://paperswithcode.com/paper/maintaining-discrimination-and-fairness-in |
Repo | |
Framework | |
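The abstract does not give the Weight Aligning formula. The following is a minimal sketch under one assumption: that the correction rescales the new-class weight vectors of the FC layer so their mean norm matches that of the old-class vectors. The function name, the row layout (old classes first) and the toy weights are hypothetical.

```python
import math

def weight_align(fc_weights, num_old):
    """Rescale new-class rows so their mean L2 norm equals the old-class mean.

    fc_weights: list of per-class weight vectors, old classes first.
    num_old: number of rows that belong to old classes.
    Assumption: this norm-matching rule is an illustration only; the
    abstract states just that WA uses the biased weights themselves.
    """
    norms = [math.sqrt(sum(w * w for w in row)) for row in fc_weights]
    mean_old = sum(norms[:num_old]) / num_old
    mean_new = sum(norms[num_old:]) / (len(norms) - num_old)
    gamma = mean_old / mean_new  # scaling factor applied to new-class rows
    aligned = [list(row) for row in fc_weights[:num_old]]
    aligned += [[gamma * w for w in row] for row in fc_weights[num_old:]]
    return aligned, gamma

# Toy FC layer: two old classes (norm 5) and two new classes (norm 10).
weights = [[3.0, 4.0], [0.0, 5.0], [6.0, 8.0], [0.0, 10.0]]
aligned, gamma = weight_align(weights, num_old=2)
```

Such a rule needs no extra parameters and no validation set, because the scaling factor is computed from the trained weights alone.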
Towards Understanding Adversarial Examples Systematically: Exploring Data Size, Task and Model Factors
Title | Towards Understanding Adversarial Examples Systematically: Exploring Data Size, Task and Model Factors |
Authors | Ke Sun, Zhanxing Zhu, Zhouchen Lin |
Abstract | Most previous work explains adversarial examples from a few specific perspectives and lacks an integrated understanding of the problem. In this paper, we present a systematic study of adversarial examples from three aspects: the amount of training data, task-dependent factors, and model-specific factors. In particular, we show that adversarial generalization (i.e. test accuracy on adversarial examples) for standard training requires more data than standard generalization (i.e. test accuracy on clean examples), and we uncover the global relationship between generalization and robustness with respect to the data size, especially when data is augmented by generative models. This reveals a trade-off between standard generalization and robustness in the limited-training-data regime, and their consistency when the data size is large enough. Furthermore, we explore how different task-dependent and model-specific factors influence the vulnerability of deep neural networks through extensive empirical analysis. Relevant recommendations on defense against adversarial attacks are provided as well. Our results outline a potential path towards a clear and systematic understanding of adversarial examples. |
Tasks | |
Published | 2019-02-28 |
URL | http://arxiv.org/abs/1902.11019v1 |
PDF | http://arxiv.org/pdf/1902.11019v1.pdf |
PWC | https://paperswithcode.com/paper/towards-understanding-adversarial-examples |
Repo | |
Framework | |
The Ridge Path Estimator for Linear Instrumental Variables
Title | The Ridge Path Estimator for Linear Instrumental Variables |
Authors | Nandana Sengupta, Fallaw Sowell |
Abstract | This paper presents the asymptotic behavior of a linear instrumental variables (IV) estimator that uses a ridge regression penalty. The regularization tuning parameter is selected empirically by splitting the observed data into training and test samples. Conditional on the tuning parameter, the training sample creates a path from the IV estimator to a prior. The optimal tuning parameter is the value along this path that minimizes the IV objective function for the test sample. The empirically selected regularization tuning parameter becomes an estimated parameter that jointly converges with the parameters of interest. The asymptotic distribution of the tuning parameter is a nonstandard mixture distribution. Monte Carlo simulations show the asymptotic distribution captures the characteristics of the sampling distributions and when this ridge estimator performs better than two-stage least squares. |
Tasks | |
Published | 2019-08-25 |
URL | https://arxiv.org/abs/1908.09237v1 |
PDF | https://arxiv.org/pdf/1908.09237v1.pdf |
PWC | https://paperswithcode.com/paper/the-ridge-path-estimator-for-linear |
Repo | |
Framework | |
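The train/test selection of the tuning parameter described in the abstract can be illustrated in a deliberately simplified setting. The sketch below assumes a single regressor and a single instrument, a squared-moment IV objective, and a fixed grid of tuning values. None of these choices come from the paper, and all names and data are hypothetical.

```python
def ridge_iv_path(z, x, y, prior, lambdas):
    """Scalar ridge-penalised IV estimates, one per tuning value.

    Minimises (mean(z*y) - mean(z*x)*beta)**2 + lam*(beta - prior)**2,
    whose closed form is (a*b + lam*prior) / (a*a + lam).
    lam = 0 gives the plain IV estimate b/a; large lam approaches the prior.
    Assumes a != 0 whenever lam == 0.
    """
    n = len(z)
    a = sum(zi * xi for zi, xi in zip(z, x)) / n
    b = sum(zi * yi for zi, yi in zip(z, y)) / n
    return [(a * b + lam * prior) / (a * a + lam) for lam in lambdas]

def iv_objective(z, x, y, beta):
    """Squared sample moment mean(z * (y - x*beta))**2."""
    n = len(z)
    g = sum(zi * (yi - xi * beta) for zi, xi, yi in zip(z, x, y)) / n
    return g * g

def select_lambda(train, test, prior, lambdas):
    """Pick the tuning value whose training estimate minimises the test objective."""
    path = ridge_iv_path(*train, prior, lambdas)
    scores = [iv_objective(*test, beta) for beta in path]
    best = min(range(len(lambdas)), key=scores.__getitem__)
    return lambdas[best], path[best]

# Hypothetical toy samples, each given as (z, x, y).
train = ([1.0, 2.0], [1.0, 1.0], [2.0, 2.0])
test = ([1.0, 1.0], [1.0, 1.0], [1.5, 1.5])
lam, beta = select_lambda(train, test, prior=0.0, lambdas=[0.0, 0.5, 1.0, 2.0])
```

In this toy case the training-sample IV estimate overshoots the value favoured by the test sample, so a strictly positive tuning value is selected and the estimate is pulled part of the way towards the prior.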
Low-dimensional statistical manifold embedding of directed graphs
Title | Low-dimensional statistical manifold embedding of directed graphs |
Authors | Thorben Funke, Tian Guo, Alen Lancic, Nino Antulov-Fantulin |
Abstract | We propose a novel node embedding of directed graphs to statistical manifolds, which is based on a global minimization of pairwise relative entropy and graph geodesics in a non-linear way. Each node is encoded with a probability density function over a measurable space. Furthermore, we analyze the connection between the geometrical properties of such embedding and their efficient learning procedure. Extensive experiments show that our proposed embedding is better in preserving the global geodesic information of graphs, as well as outperforming existing embedding models on directed graphs in a variety of evaluation metrics, in an unsupervised setting. |
Tasks | |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10227v3 |
PDF | https://arxiv.org/pdf/1905.10227v3.pdf |
PWC | https://paperswithcode.com/paper/statistical-embedding-for-directed-graphs |
Repo | |
Framework | |
DeLiO: Decoupled LiDAR Odometry
Title | DeLiO: Decoupled LiDAR Odometry |
Authors | Queens Maria Thomas, Oliver Wasenmüller, Didier Stricker |
Abstract | Most LiDAR odometry algorithms estimate the transformation between two consecutive frames by estimating the rotation and translation in an intervening fashion. In this paper, we propose our Decoupled LiDAR Odometry (DeLiO), which – for the first time – decouples the rotation estimation completely from the translation estimation. In particular, the rotation is estimated by extracting the surface normals from the input point clouds and tracking their characteristic pattern on a unit sphere. Using this rotation the point clouds are unrotated so that the underlying transformation is pure translation, which can be easily estimated using a line cloud approach. An evaluation is performed on the KITTI dataset and the results are compared against state-of-the-art algorithms. |
Tasks | |
Published | 2019-04-29 |
URL | http://arxiv.org/abs/1904.12667v1 |
PDF | http://arxiv.org/pdf/1904.12667v1.pdf |
PWC | https://paperswithcode.com/paper/delio-decoupled-lidar-odometry |
Repo | |
Framework | |
Dependency-aware Attention Control for Unconstrained Face Recognition with Image Sets
Title | Dependency-aware Attention Control for Unconstrained Face Recognition with Image Sets |
Authors | Xiaofeng Liu, B. V. K Vijaya Kumar, Chao Yang, Qingming Tang, Jane You |
Abstract | This paper targets the problem of image set-based face verification and identification. Unlike the traditional single-media (an image or a video) setting, we encounter a set of heterogeneous contents containing orderless images and videos. The importance of each image is usually considered either equal or based on an independent quality assessment. How to model the relationship of orderless images within a set remains a challenge. We address this problem by formulating it as a Markov Decision Process (MDP) in the latent space. Specifically, we first present a dependency-aware attention control (DAC) network, which resorts to actor-critic reinforcement learning for the sequential attention decision of each image embedding to fully exploit the rich correlation cues among the unordered images. Moreover, we introduce its sample-efficient variant with off-policy experience replay to speed up the learning process. The pose-guided representation scheme can further boost the performance at the extremes of the pose variation. |
Tasks | Face Recognition, Face Verification |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.03030v1 |
PDF | https://arxiv.org/pdf/1907.03030v1.pdf |
PWC | https://paperswithcode.com/paper/dependency-aware-attention-control-for-1 |
Repo | |
Framework | |
Uncheatable Machine Learning Inference
Title | Uncheatable Machine Learning Inference |
Authors | Mustafa Canim, Ashish Kundu, Josh Payne |
Abstract | Classification-as-a-Service (CaaS) is widely deployed today in machine intelligence stacks for a vastly diverse set of applications including anything from medical prognosis to computer vision tasks to natural language processing to identity fraud detection. The computing power required for training complex models on large datasets to perform inference to solve these problems can be very resource-intensive. A CaaS provider may cheat a customer by fraudulently bypassing expensive training procedures in favor of weaker, less computationally-intensive algorithms which yield results of reduced quality. Given a classification service supplier $S$, intermediary CaaS provider $P$ claiming to use $S$ as a classification backend, and customer $C$, our work addresses the following questions: (i) how can $P$'s claim to be using $S$ be verified by $C$? (ii) how might $S$ make performance guarantees that may be verified by $C$? and (iii) how might one design a decentralized system that incentivizes service proofing and accountability? To this end, we propose a variety of methods for $C$ to evaluate the service claims made by $P$ using probabilistic performance metrics, instance seeding, and steganography. We also propose a method of measuring the robustness of a model using a blackbox adversarial procedure, which may then be used as a benchmark or comparison to a claim made by $S$. Finally, we propose the design of a smart contract-based decentralized system that incentivizes service accountability to serve as a trusted Quality of Service (QoS) auditor. |
Tasks | Fraud Detection |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.03270v1 |
PDF | https://arxiv.org/pdf/1908.03270v1.pdf |
PWC | https://paperswithcode.com/paper/uncheatable-machine-learning-inference |
Repo | |
Framework | |
A Fast Sampling Gradient Tree Boosting Framework
Title | A Fast Sampling Gradient Tree Boosting Framework |
Authors | Daniel Chao Zhou, Zhongming Jin, Tong Zhang |
Abstract | As an adaptive, interpretable, robust, and accurate meta-algorithm for arbitrary differentiable loss functions, gradient tree boosting is one of the most popular machine learning techniques, though its computational cost severely limits its usage. Stochastic gradient boosting could be adopted to accelerate gradient boosting by uniformly sampling training instances, but its estimator could introduce a high variance. This motivates us to optimize gradient tree boosting. We combine gradient tree boosting with importance sampling, which achieves better performance by reducing the stochastic variance. Furthermore, we use a regularizer to improve the diagonal approximation in the Newton step of gradient boosting. The theoretical analysis supports the claim that our strategies achieve a linear convergence rate on logistic loss. Empirical results show that our algorithm achieves a 2.5x–18x acceleration on two different gradient boosting algorithms (LogitBoost and LambdaMART) without appreciable performance loss. |
Tasks | |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08820v1 |
PDF | https://arxiv.org/pdf/1911.08820v1.pdf |
PWC | https://paperswithcode.com/paper/a-fast-sampling-gradient-tree-boosting |
Repo | |
Framework | |
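The variance argument for importance sampling can be shown with a small sketch. The abstract does not state the sampling distribution; the version below assumes instances are drawn with probability proportional to the magnitude of their gradient and reweighted by the inverse probability, which is the textbook importance-sampling construction and not necessarily the authors' scheme. The names and toy gradients are hypothetical.

```python
import random

def importance_sample(gradients, k, rng):
    """Draw k instance indices with probability proportional to |gradient|.

    Returns (index, weight) pairs with weight = 1 / (k * p_i), so that
    sum(weight * gradient) is an unbiased estimate of the full gradient sum.
    Assumes at least one gradient is non-zero.
    """
    total = sum(abs(g) for g in gradients)
    probs = [abs(g) / total for g in gradients]
    idx = rng.choices(range(len(gradients)), weights=probs, k=k)
    return [(i, 1.0 / (k * probs[i])) for i in idx]

# Toy per-instance gradients (all positive) and a seeded generator.
grads = [0.5, 0.25, 0.125, 0.125]
sample = importance_sample(grads, k=2, rng=random.Random(0))
estimate = sum(w * grads[i] for i, w in sample)
```

When all gradients share a sign, every weighted draw contributes exactly `total / k`, so the estimate equals the full gradient sum regardless of which instances are drawn. Uniform sampling, in contrast, yields an estimate that varies with the draw.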
Towards Robust Voice Pathology Detection
Title | Towards Robust Voice Pathology Detection |
Authors | Pavol Harar, Zoltan Galaz, Jesus B. Alonso-Hernandez, Jiri Mekyska, Radim Burget, Zdenek Smekal |
Abstract | Automatic, objective, non-invasive detection of pathological voice based on computerized analysis of acoustic signals can play an important role in early diagnosis, progression tracking and even effective treatment of pathological voices. In search of such a robust voice pathology detection system, we investigated 3 distinct classifiers within supervised learning and anomaly detection paradigms. We conducted a set of experiments using a variety of input data such as raw waveforms, spectrograms, mel-frequency cepstral coefficients (MFCC) and conventional acoustic (dysphonic) features (AF). In comparison with previously published works, this article is the first to utilize a combination of 4 different databases comprising normophonic and pathological recordings of sustained phonation of the vowel /a/, unrestricted to a subset of vocal pathologies. Furthermore, to the best of our knowledge, this article is the first to explore gradient boosted trees and deep learning for this application. The following best classification performances, measured by F1 score on a dedicated test set, were achieved: XGBoost (0.733) using AF and MFCC, DenseNet (0.621) using MFCC, and Isolation Forest (0.610) using AF. Even though these results are of exploratory character, the conducted experiments show the promising potential of gradient boosting and deep learning methods to robustly detect voice pathologies. |
Tasks | Anomaly Detection |
Published | 2019-07-13 |
URL | https://arxiv.org/abs/1907.06129v1 |
PDF | https://arxiv.org/pdf/1907.06129v1.pdf |
PWC | https://paperswithcode.com/paper/towards-robust-voice-pathology-detection |
Repo | |
Framework | |
Online Multivariate Anomaly Detection and Localization for High-dimensional Settings
Title | Online Multivariate Anomaly Detection and Localization for High-dimensional Settings |
Authors | Mahsa Mozaffari, Yasin Yilmaz |
Abstract | This paper considers the real-time detection of anomalies in high-dimensional systems. The goal is to detect anomalies quickly and accurately so that the appropriate countermeasures can be taken in time, before the system is harmed. We propose a sequential and multivariate anomaly detection method that scales well to high-dimensional datasets. The proposed method follows a nonparametric, i.e., data-driven, and semi-supervised approach, i.e., it trains only on nominal data. Thus, it is applicable to a wide range of applications and data types. Thanks to its multivariate nature, it can quickly and accurately detect challenging anomalies, such as changes in the correlation structure and stealth low-rate cyberattacks. Its asymptotic optimality and computational complexity are comprehensively analyzed. In conjunction with the detection method, an effective technique for localizing the anomalous data dimensions is also proposed. We further extend the proposed detection and localization methods to a supervised setup where an additional anomaly dataset is available, and combine the proposed semi-supervised and supervised algorithms to obtain an online learning algorithm under the semi-supervised framework. The practical use of the proposed algorithms is demonstrated in DDoS attack mitigation, and their performance is evaluated using a real IoT-botnet dataset and simulations. |
Tasks | Anomaly Detection |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07107v1 |
PDF | https://arxiv.org/pdf/1905.07107v1.pdf |
PWC | https://paperswithcode.com/paper/online-multivariate-anomaly-detection-and |
Repo | |
Framework | |
DeepMDP: Learning Continuous Latent Space Models for Representation Learning
Title | DeepMDP: Learning Continuous Latent Space Models for Representation Learning |
Authors | Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, Marc G. Bellemare |
Abstract | Many reinforcement learning (RL) tasks provide the agent with high-dimensional observations that can be simplified into low-dimensional continuous states. To formalize this process, we introduce the concept of a DeepMDP, a parameterized latent space model that is trained via the minimization of two tractable losses: prediction of rewards and prediction of the distribution over next latent states. We show that the optimization of these objectives guarantees (1) the quality of the latent space as a representation of the state space and (2) the quality of the DeepMDP as a model of the environment. We connect these results to prior work in the bisimulation literature, and explore the use of a variety of metrics. Our theoretical findings are substantiated by the experimental result that a trained DeepMDP recovers the latent structure underlying high-dimensional observations on a synthetic environment. Finally, we show that learning a DeepMDP as an auxiliary task in the Atari 2600 domain leads to large performance improvements over model-free RL. |
Tasks | Representation Learning |
Published | 2019-06-06 |
URL | https://arxiv.org/abs/1906.02736v1 |
PDF | https://arxiv.org/pdf/1906.02736v1.pdf |
PWC | https://paperswithcode.com/paper/deepmdp-learning-continuous-latent-space |
Repo | |
Framework | |
Petri Net Machines for Human-Agent Interaction
Title | Petri Net Machines for Human-Agent Interaction |
Authors | Christian Dondrup, Ioannis Papaioannou, Oliver Lemon |
Abstract | Smart speakers and robots are becoming ever more prevalent in our daily lives. These agents are able to execute a wide range of tasks and actions and, therefore, need systems to control their execution. Current state-of-the-art approaches such as (deep) reinforcement learning, however, require vast amounts of training data, which is often hard to come by when interacting with humans. To overcome this issue, most systems still rely on Finite State Machines. We introduce Petri Net Machines, which present a formal definition for state machines based on Petri Nets that are able to execute concurrent actions reliably, execute and interleave several plans at the same time, and provide an easy-to-use modelling language. We show their workings based on the example of Human-Robot Interaction in a shopping mall. |
Tasks | |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06174v1 |
PDF | https://arxiv.org/pdf/1909.06174v1.pdf |
PWC | https://paperswithcode.com/paper/petri-net-machines-for-human-agent |
Repo | |
Framework | |
Subsumption-driven clause learning with DPLL+restarts
Title | Subsumption-driven clause learning with DPLL+restarts |
Authors | Olivier Bailleux |
Abstract | We propose to use a DPLL+restart to solve SAT instances by successive simplifications based on the production of clauses that subsume the initial clauses. We show that this approach allows the refutation of pebbling formulae in polynomial time and linear space, as effectively as with a CDCL solver. |
Tasks | |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07508v1 |
PDF | https://arxiv.org/pdf/1906.07508v1.pdf |
PWC | https://paperswithcode.com/paper/subsumption-driven-clause-learning-with |
Repo | |
Framework | |
Dirichlet uncertainty wrappers for actionable algorithm accuracy accountability and auditability
Title | Dirichlet uncertainty wrappers for actionable algorithm accuracy accountability and auditability |
Authors | José Mena, Oriol Pujol, Jordi Vitrià |
Abstract | Nowadays, the use of machine learning models is becoming a utility in many applications. Companies deliver pre-trained models encapsulated as application programming interfaces (APIs) that developers combine with third party components and their own models and data to create complex data products to solve specific problems. The complexity of such products and the lack of control and knowledge of the internals of each component used cause unavoidable effects, such as lack of transparency, difficulty in auditability, and emergence of potential uncontrolled risks. They are effectively black-boxes. Accountability of such solutions is a challenge for the auditors and the machine learning community. In this work, we propose a wrapper that given a black-box model enriches its output prediction with a measure of uncertainty. By using this wrapper, we make the black-box auditable for the accuracy risk (risk derived from low quality or uncertain decisions) and at the same time we provide an actionable mechanism to mitigate that risk in the form of decision rejection; we can choose not to issue a prediction when the risk or uncertainty in that decision is significant. Based on the resulting uncertainty measure, we advocate for a rejection system that selects the more confident predictions, discarding those more uncertain, leading to an improvement in the trustability of the resulting system. We showcase the proposed technique and methodology in a practical scenario where a simulated sentiment analysis API based on natural language processing is applied to different domains. Results demonstrate the effectiveness of the uncertainty computed by the wrapper and its high correlation to bad quality predictions and misclassifications. |
Tasks | Sentiment Analysis |
Published | 2019-12-29 |
URL | https://arxiv.org/abs/1912.12628v1 |
PDF | https://arxiv.org/pdf/1912.12628v1.pdf |
PWC | https://paperswithcode.com/paper/dirichlet-uncertainty-wrappers-for-actionable |
Repo | |
Framework | |
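The decision-rejection mechanism from the abstract can be sketched independently of how the uncertainty is obtained. The example below assumes each prediction already carries an uncertainty score in [0, 1] from some wrapper. The Dirichlet wrapper itself is not implemented here, and the records, names and threshold are hypothetical.

```python
def selective_accuracy(records, threshold):
    """Accuracy and coverage after rejecting uncertain predictions.

    records: list of (predicted_label, true_label, uncertainty) triples.
    Predictions with uncertainty above the threshold are rejected,
    i.e. no prediction is issued for them.
    Returns (accuracy on kept predictions, fraction of predictions kept);
    accuracy is None when everything is rejected.
    """
    kept = [(pred, truth) for pred, truth, unc in records if unc <= threshold]
    if not kept:
        return None, 0.0
    acc = sum(pred == truth for pred, truth in kept) / len(kept)
    coverage = len(kept) / len(records)
    return acc, coverage

# Hypothetical sentiment predictions with made-up uncertainty scores.
records = [
    ("pos", "pos", 0.1),
    ("neg", "pos", 0.8),
    ("neg", "neg", 0.2),
    ("pos", "neg", 0.9),
]
no_rejection = selective_accuracy(records, threshold=1.0)
with_rejection = selective_accuracy(records, threshold=0.5)
```

Lowering the threshold trades coverage for accuracy: fewer predictions are issued, but if uncertainty correlates with errors, as the abstract reports, those that remain are more trustworthy.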