Paper Group ANR 316
DWM: A Decomposable Winograd Method for Convolution Acceleration
Title | DWM: A Decomposable Winograd Method for Convolution Acceleration |
Authors | Di Huang, Xishan Zhang, Rui Zhang, Tian Zhi, Deyuan He, Jiaming Guo, Chang Liu, Qi Guo, Zidong Du, Shaoli Liu, Tianshi Chen, Yunji Chen |
Abstract | Winograd’s minimal filtering algorithm has been widely used in Convolutional Neural Networks (CNNs) to reduce the number of multiplications for faster processing. However, it is only effective on convolutions with a kernel size of 3x3 and a stride of 1: it suffers from significantly increased FLOPs and numerical accuracy problems for kernel sizes larger than 3x3, and it fails on convolutions with stride larger than 1. In this paper, we propose a novel Decomposable Winograd Method (DWM), which extends the original Winograd minimal filtering algorithm to a wide range of general convolutions. DWM decomposes kernels with large size or large stride into several small kernels with stride 1, to which the Winograd method can then be applied, so that DWM reduces the number of multiplications while preserving numerical accuracy. It enables fast exploration of larger kernel sizes and larger stride values in CNNs for high performance and accuracy, and potentially of new CNNs. Compared with the original Winograd algorithm, the proposed DWM supports all kinds of convolutions with a speedup of ~2, without affecting numerical accuracy. |
Tasks | |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00552v1 |
https://arxiv.org/pdf/2002.00552v1.pdf | |
PWC | https://paperswithcode.com/paper/dwm-a-decomposable-winograd-method-for |
Repo | |
Framework | |
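The decomposition idea in the DWM abstract can be illustrated in one dimension. The following is a minimal sketch, not the authors' implementation: the function names `corr1d`, `split_stride` and `split_kernel` are hypothetical, the Winograd transform itself is not shown, and the code only checks that a strided or large-kernel correlation equals a sum of small stride-1 correlations (the pieces to which a Winograd routine could then be applied).

```python
def corr1d(x, w, stride=1):
    """Direct 1-D cross-correlation (the CNN 'convolution'), valid padding."""
    n_out = (len(x) - len(w)) // stride + 1
    return [sum(x[i * stride + k] * w[k] for k in range(len(w)))
            for i in range(n_out)]


def split_stride(x, w, stride):
    """Rewrite a strided correlation as a sum of stride-1 correlations.

    Kernel tap k = stride * j + p meets input sample stride * (i + j) + p,
    so phase p of the kernel only ever touches phase p of the input.
    """
    n_out = (len(x) - len(w)) // stride + 1
    y = [0] * n_out
    for p in range(stride):
        w_p = w[p::stride]              # sub-kernel for this phase
        if not w_p:                     # stride larger than the kernel
            continue
        part = corr1d(x[p::stride], w_p)  # plain stride-1 correlation
        for i in range(n_out):
            y[i] += part[i]
    return y


def split_kernel(x, w, tile=3):
    """Rewrite a large-kernel, stride-1 correlation as a sum of
    correlations with kernels of at most `tile` taps."""
    n_out = len(x) - len(w) + 1
    y = [0] * n_out
    for off in range(0, len(w), tile):
        # Shift the input by the offset of this kernel chunk.
        part = corr1d(x[off:], w[off:off + tile])
        for i in range(n_out):
            y[i] += part[i]
    return y


if __name__ == "__main__":
    x = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, 5, -8]
    w = [1, -2, 3, 0, 5]
    print(split_stride(x, w, 2) == corr1d(x, w, 2))
    print(split_kernel(x, w, 3) == corr1d(x, w))
```

Integer inputs are used so the decomposed and direct results can be compared exactly; with floating-point data the two would agree only up to rounding.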
AVATAR – Machine Learning Pipeline Evaluation Using Surrogate Model
Title | AVATAR – Machine Learning Pipeline Evaluation Using Surrogate Model |
Authors | Tien-Dung Nguyen, Tomasz Maszczyk, Katarzyna Musial, Marc-Andre Zöller, Bogdan Gabrys |
Abstract | The evaluation of machine learning (ML) pipelines is essential during automatic ML pipeline composition and optimisation. Previous methods such as Bayesian-based and genetic-based optimisation, which are implemented in Auto-Weka, Auto-sklearn and TPOT, evaluate pipelines by executing them. Therefore, the pipeline composition and optimisation of these methods requires a tremendous amount of time, which prevents them from exploring complex pipelines to find better predictive models. To further explore this research challenge, we have conducted experiments showing that many of the generated pipelines are invalid, and it is unnecessary to execute them to find out whether they are good pipelines. To address this issue, we propose a novel method to evaluate the validity of ML pipelines using a surrogate model (AVATAR). AVATAR accelerates automatic ML pipeline composition and optimisation by quickly ignoring invalid pipelines. Our experiments show that AVATAR is more efficient in evaluating complex pipelines than traditional evaluation approaches that require their execution. |
Tasks | |
Published | 2020-01-30 |
URL | https://arxiv.org/abs/2001.11158v2 |
https://arxiv.org/pdf/2001.11158v2.pdf | |
PWC | https://paperswithcode.com/paper/avatar-machine-learning-pipeline-evaluation |
Repo | |
Framework | |
Distributed Sketching Methods for Privacy Preserving Regression
Title | Distributed Sketching Methods for Privacy Preserving Regression |
Authors | Burak Bartan, Mert Pilanci |
Abstract | In this work, we study distributed sketching methods for large scale regression problems. We leverage multiple randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems. We derive novel approximation guarantees for classical sketching methods and analyze the accuracy of parameter averaging for distributed sketches. We consider random matrices including Gaussian, randomized Hadamard, uniform sampling and leverage score sampling in the distributed setting. Moreover, we propose a hybrid approach combining sampling and fast random projections for better computational efficiency. We illustrate the performance of distributed sketches in a serverless computing platform with large scale experiments. |
Tasks | |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.06538v1 |
https://arxiv.org/pdf/2002.06538v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-sketching-methods-for-privacy |
Repo | |
Framework | |
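The parameter-averaging scheme mentioned in the sketching abstract can be illustrated on a toy problem. This is a minimal sketch under strong simplifying assumptions, not the authors' method: the regression has a single coefficient, every worker uses an independent Gaussian sketch, and the names `sketched_slope`, `averaged_sketch_slope` and `exact_slope` are hypothetical.

```python
import math
import random


def exact_slope(a, b):
    """Least-squares solution of min_x ||a * x - b|| for one coefficient."""
    return sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)


def sketched_slope(a, b, m, rng):
    """Solve the same problem after compressing n rows to m rows with a
    Gaussian sketch S (entries N(0, 1/m)); only S*a and S*b are used."""
    sa, sb = [], []
    for _ in range(m):
        row = [rng.gauss(0.0, 1.0) / math.sqrt(m) for _ in a]
        sa.append(sum(g * ai for g, ai in zip(row, a)))
        sb.append(sum(g * bi for g, bi in zip(row, b)))
    return sum(u * v for u, v in zip(sa, sb)) / sum(u * u for u in sa)


def averaged_sketch_slope(a, b, m, workers, seed=0):
    """Average the solutions of `workers` independent sketched problems."""
    rng = random.Random(seed)
    total = sum(sketched_slope(a, b, m, rng) for _ in range(workers))
    return total / workers


if __name__ == "__main__":
    # Toy data (an assumption for illustration): b is roughly 2 * a.
    a = [1 + (i % 5) for i in range(200)]
    b = [2 * ai + 0.1 * (-1) ** i for i, ai in enumerate(a)]
    print("exact   :", exact_slope(a, b))
    print("averaged:", averaged_sketch_slope(a, b, m=50, workers=8, seed=1))
```

In this toy setup each worker touches only the sketched vectors, which is the sense in which sketching can hide the raw rows; the privacy and straggler guarantees themselves are the subject of the paper and are not demonstrated here.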
CPR-GCN: Conditional Partial-Residual Graph Convolutional Network in Automated Anatomical Labeling of Coronary Arteries
Title | CPR-GCN: Conditional Partial-Residual Graph Convolutional Network in Automated Anatomical Labeling of Coronary Arteries |
Authors | Han Yang, Xingjian Zhen, Ying Chi, Lei Zhang, Xian-Sheng Hua |
Abstract | Automated anatomical labeling plays a vital role in the diagnosis of coronary artery disease. The main challenge in this problem is the large individual variability inherent in human anatomy. Existing methods usually rely on the position information and the prior knowledge of the topology of the coronary artery tree, which may lead to unsatisfactory performance when the main branches are confusing. Motivated by the wide application of graph neural networks to structured data, in this paper we propose a conditional partial-residual graph convolutional network (CPR-GCN), which takes both position and CT image into consideration, since the CT image contains abundant information such as branch size and spanning direction. Two major parts, a Partial-Residual GCN and a conditions extractor, are included in CPR-GCN. The conditions extractor is a hybrid model containing a 3D CNN and an LSTM, which can extract 3D spatial image features along the branches. On the technical side, the Partial-Residual GCN takes the position features of the branches, with the 3D spatial image features as conditions, to predict the label for each branch. On the mathematical side, our approach twists the partial differential equation (PDE) into the graph modeling. A dataset with 511 subjects is collected from the clinic and annotated by two experts with a two-phase annotation process. According to five-fold cross-validation, our CPR-GCN yields 95.8% meanRecall, 95.4% meanPrecision and 0.955 meanF1, which outperforms state-of-the-art approaches. |
Tasks | |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.08560v3 |
https://arxiv.org/pdf/2003.08560v3.pdf | |
PWC | https://paperswithcode.com/paper/cpr-gcn-conditional-partial-residual-graph |
Repo | |
Framework | |
On the rate of convergence of image classifiers based on convolutional neural networks
Title | On the rate of convergence of image classifiers based on convolutional neural networks |
Authors | M. Kohler, A. Krzyzak, B. Walter |
Abstract | Image classifiers based on convolutional neural networks are defined, and the rate of convergence of the misclassification risk of the estimates towards the optimal misclassification risk is analyzed. Under suitable assumptions on the smoothness and structure of the a posteriori probability, a rate of convergence is shown which is independent of the dimension of the image. This proves that in image classification it is possible to circumvent the curse of dimensionality by convolutional neural networks. |
Tasks | Image Classification |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01526v1 |
https://arxiv.org/pdf/2003.01526v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-rate-of-convergence-of-image |
Repo | |
Framework | |
Detecting and Recovering Adversarial Examples: An Input Sensitivity Guided Method
Title | Detecting and Recovering Adversarial Examples: An Input Sensitivity Guided Method |
Authors | Mingxuan Li, Jingyuan Wang, Yufan Wu, Shuchang Zhou |
Abstract | Deep neural networks have undergone rapid development and achieved notable success in various tasks, including many security-sensitive scenarios. However, a considerable number of works have demonstrated their vulnerability to adversaries. To address this problem, we propose a Guided Robust and Efficient Defensive Model (GRED) that integrates detection and recovery processes. Using the properties of the gradient distribution of adversarial examples, our model detects malicious inputs effectively and recovers the ground-truth label with high accuracy. Compared with commonly used adversarial training methods, our model is more efficient and outperforms state-of-the-art adversarially trained models by a large margin: up to 99% on MNIST, 89% on CIFAR-10 and 87% on ImageNet subsets. When compared exclusively with previous adversarial detection methods, the detector of GRED is robust under all threat settings, with a detection rate of over 95% against most of the attacks. Empirical assessment also demonstrates that our model could increase attacking cost significantly, resulting in either unacceptable time consumption or human-perceptible image distortions. |
Tasks | |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12527v1 |
https://arxiv.org/pdf/2002.12527v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-and-recovering-adversarial-examples |
Repo | |
Framework | |
OCGNN: One-class Classification with Graph Neural Networks
Title | OCGNN: One-class Classification with Graph Neural Networks |
Authors | Xuhong Wang, Ying Du, Ping Cui, Yupu Yang |
Abstract | Nowadays, graph-structured data are increasingly used to model complex systems. Meanwhile, detecting anomalies in graphs has become a vital research problem of pressing societal concern. Anomaly detection is an unsupervised learning task of identifying rare data that differ from the majority. As one of the dominant anomaly detection algorithms, One Class Support Vector Machine has been widely used to detect outliers. However, such traditional anomaly detection methods lose their effectiveness on graph data. Since traditional anomaly detection methods are stable, robust and easy to use, it is vitally important to generalize them to graph data. In this work, we propose One Class Graph Neural Network (OCGNN), a one-class classification framework for graph anomaly detection. OCGNN is designed to combine the powerful representation ability of Graph Neural Networks with the classical one-class objective. Compared with other baselines, OCGNN achieves significant improvements in extensive experiments. |
Tasks | Anomaly Detection |
Published | 2020-02-22 |
URL | https://arxiv.org/abs/2002.09594v1 |
https://arxiv.org/pdf/2002.09594v1.pdf | |
PWC | https://paperswithcode.com/paper/ocgnn-one-class-classification-with-graph |
Repo | |
Framework | |
Fast Predictive Uncertainty for Classification with Bayesian Deep Networks
Title | Fast Predictive Uncertainty for Classification with Bayesian Deep Networks |
Authors | Marius Hobbhahn, Agustinus Kristiadi, Philipp Hennig |
Abstract | In Bayesian Deep Learning, distributions over the output of classification neural networks are approximated by first constructing a Gaussian distribution over the weights, then sampling from it to receive a distribution over the categorical output distribution. This is costly. We reconsider old work to construct a Dirichlet approximation of this output distribution, which yields an analytic map between Gaussian distributions in logit space and Dirichlet distributions (the conjugate prior to the categorical) in the output space. We argue that the resulting Dirichlet distribution has theoretical and practical advantages, in particular more efficient computation of the uncertainty estimate, scaling to large datasets and networks like ImageNet and DenseNet. We demonstrate the use of this Dirichlet approximation by using it to construct a lightweight uncertainty-aware output ranking for the ImageNet setup. |
Tasks | |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.01227v1 |
https://arxiv.org/pdf/2003.01227v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-predictive-uncertainty-for |
Repo | |
Framework | |
First Order Optimization in Policy Space for Constrained Deep Reinforcement Learning
Title | First Order Optimization in Policy Space for Constrained Deep Reinforcement Learning |
Authors | Yiming Zhang, Quan Vuong, Keith W. Ross |
Abstract | In reinforcement learning, an agent attempts to learn high-performing behaviors through interacting with the environment; such behaviors are often quantified in the form of a reward function. However, some aspects of behavior, such as ones which are deemed unsafe and are to be avoided, are best captured through constraints. We propose a novel approach called First Order Constrained Optimization in Policy Space (FOCOPS) which maximizes an agent’s overall reward while ensuring the agent satisfies a set of cost constraints. Using data generated from the current policy, FOCOPS first finds the optimal update policy by solving a constrained optimization problem in the nonparameterized policy space. FOCOPS then projects the update policy back into the parametric policy space. Our approach provides a guarantee for constraint satisfaction throughout training and is first-order in nature, and therefore extremely simple to implement. We provide empirical evidence that our algorithm achieves better performance on a set of constrained robotics locomotive tasks compared to current state-of-the-art approaches. |
Tasks | |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.06506v1 |
https://arxiv.org/pdf/2002.06506v1.pdf | |
PWC | https://paperswithcode.com/paper/first-order-optimization-in-policy-space-for |
Repo | |
Framework | |
A flexible outlier detector based on a topology given by graph communities
Title | A flexible outlier detector based on a topology given by graph communities |
Authors | O. Ramos Terrades, A. Berenguel, D. Gil |
Abstract | Outlier, or anomaly, detection is essential for optimal performance of machine learning methods and statistical predictive models. It is not just a technical step in a data cleaning process but a key topic in many fields, such as fraudulent document detection, medical applications and assisted diagnosis systems, or the detection of security threats. In contrast to population-based methods, neighborhood-based local approaches are simple, flexible methods that have the potential to perform well in small-sample-size, unbalanced problems. However, a main concern of local approaches is the impact that the computation of each sample neighborhood has on the method performance. Most approaches use a distance in the feature space to define a single neighborhood that requires careful selection of several parameters. This work presents a local approach based on a local measure of the heterogeneity of sample labels in the feature space considered as a topological manifold. Topology is computed using the communities of a weighted graph codifying mutual nearest neighbors in the feature space. This way, we provide a set of multiple neighborhoods able to describe the structure of complex spaces without parameter fine-tuning. The extensive experiments on real-world data sets show that our approach overall outperforms both local and global strategies in multi- and single-view settings. |
Tasks | Anomaly Detection |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07791v1 |
https://arxiv.org/pdf/2002.07791v1.pdf | |
PWC | https://paperswithcode.com/paper/a-flexible-outlier-detector-based-on-a |
Repo | |
Framework | |
Correlation-aware Deep Generative Model for Unsupervised Anomaly Detection
Title | Correlation-aware Deep Generative Model for Unsupervised Anomaly Detection |
Authors | Haoyi Fan, Fengbin Zhang, Ruidong Wang, Liang Xi, Zuoyong Li |
Abstract | Unsupervised anomaly detection aims to identify anomalous samples from highly complex and unstructured data, which is pervasive in both fundamental research and industrial applications. However, most existing methods neglect the complex correlation among data samples, which is important for capturing normal patterns from which the abnormal ones deviate. In this paper, we propose a method of Correlation aware unsupervised Anomaly detection via Deep Gaussian Mixture Model (CADGMM), which captures the complex correlation among data points for high-quality low-dimensional representation learning. Specifically, the relations among data samples are first represented in the form of a graph structure, in which each node denotes a sample and each edge denotes the correlation between two samples in the feature space. Then, a dual-encoder that consists of a graph encoder and a feature encoder is employed to jointly encode both the feature and correlation information of samples into the low-dimensional latent space, followed by a decoder for data reconstruction. Finally, a separate estimation network acting as a Gaussian Mixture Model is used to estimate the density of the learned latent vector, and the anomalies can be detected by measuring the energy of the samples. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed method. |
Tasks | Anomaly Detection, Representation Learning, Unsupervised Anomaly Detection |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07349v1 |
https://arxiv.org/pdf/2002.07349v1.pdf | |
PWC | https://paperswithcode.com/paper/correlation-aware-deep-generative-model-for |
Repo | |
Framework | |
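The last step of the CADGMM abstract, scoring a sample by its energy under a Gaussian mixture, can be sketched as follows. This is an illustrative sketch only: it assumes the energy is the negative log-likelihood under a diagonal-covariance mixture, the function name `gmm_energy` is hypothetical, and the mixture parameters are fixed by hand rather than predicted by an estimation network.

```python
import math


def gmm_energy(z, weights, means, variances):
    """Negative log-likelihood of vector z under a diagonal-covariance GMM.

    weights:   mixture proportions, one per component (summing to 1)
    means:     list of per-component mean vectors
    variances: list of per-component variance vectors (all entries > 0)
    """
    log_terms = []
    for phi, mu, var in zip(weights, means, variances):
        log_pdf = 0.0
        for zi, mi, vi in zip(z, mu, var):
            log_pdf += -0.5 * (math.log(2 * math.pi * vi) + (zi - mi) ** 2 / vi)
        log_terms.append(math.log(phi) + log_pdf)
    # Log-sum-exp keeps the computation stable for samples far from all means.
    top = max(log_terms)
    log_lik = top + math.log(sum(math.exp(t - top) for t in log_terms))
    return -log_lik


if __name__ == "__main__":
    weights = [0.5, 0.5]
    means = [[0.0, 0.0], [3.0, 3.0]]
    variances = [[1.0, 1.0], [1.0, 1.0]]
    for z in ([0.1, -0.2], [3.2, 2.9], [10.0, 10.0]):
        print(z, gmm_energy(z, weights, means, variances))
```

A sample whose energy exceeds a chosen threshold would be flagged as anomalous; how that threshold is set is not specified in the abstract.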
Untrue.News: A New Search Engine For Fake Stories
Title | Untrue.News: A New Search Engine For Fake Stories |
Authors | Vinicius Woloszyn, Felipe Schaeffer, Beliza Boniatti, Eduardo Cortes, Salar Mohtaj, Sebastian Möller |
Abstract | In this paper, we demonstrate Untrue News, a new search engine for fake stories. Untrue News is easy to use and offers useful features such as: a) a multi-language option combining fake stories from different countries and languages around the same subject or person; b) a user privacy protector, avoiding the filter bubble by employing a bias-free ranking scheme; and c) a collaborative platform that fosters the development of new tools for fighting disinformation. Untrue News relies on Elasticsearch, a new scalable analytic search engine based on the Lucene library that provides near real-time results. We demonstrate two key scenarios: the first related to a politician, looking at how the categories are shown for different types of fake stories, and the second related to a refugee, showing the multilingual tool. A prototype of Untrue News is accessible via http://untrue.news |
Tasks | |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.06585v1 |
https://arxiv.org/pdf/2002.06585v1.pdf | |
PWC | https://paperswithcode.com/paper/untruenews-a-new-search-engine-for-fake |
Repo | |
Framework | |
Tactic Learning and Proving for the Coq Proof Assistant
Title | Tactic Learning and Proving for the Coq Proof Assistant |
Authors | Lasse Blaauwbroek, Josef Urban, Herman Geuvers |
Abstract | We present a system that utilizes machine learning for tactic proof search in the Coq Proof Assistant. In a similar vein as the TacticToe project for HOL4, our system predicts appropriate tactics and finds proofs in the form of tactic scripts. To do this, it learns from previous tactic scripts and how they are applied to proof states. The performance of the system is evaluated on the Coq Standard Library. Currently, our predictor can identify the correct tactic to be applied to a proof state 23.4% of the time. Our proof searcher can fully automatically prove 39.3% of the lemmas. When combined with the CoqHammer system, the two systems together prove 56.7% of the library’s lemmas. |
Tasks | |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09140v1 |
https://arxiv.org/pdf/2003.09140v1.pdf | |
PWC | https://paperswithcode.com/paper/tactic-learning-and-proving-for-the-coq-proof |
Repo | |
Framework | |
Compact Surjective Encoding Autoencoder for Unsupervised Novelty Detection
Title | Compact Surjective Encoding Autoencoder for Unsupervised Novelty Detection |
Authors | Jaewoo Park, Yoon Gyo Jung, Andrew Beng Jin Teoh |
Abstract | In unsupervised novelty detection, a model is trained solely on in-class data and, at inference, is used to single out out-class data. Autoencoder (AE) variants aim to compactly model the in-class data to reconstruct it exclusively, differentiating it from out-class data by the reconstruction error. However, imposing compactness improperly may damage in-class reconstruction and, therefore, detection performance. To solve this, we propose Compact Surjective Encoding AE (CSE-AE). In this model, the encoding of any input is constrained into a compact manifold by exploiting the deep neural net’s ignorance of the unknown. Concurrently, the in-class data is surjectively encoded to the compact manifold via the AE. The mechanism is realized by both a GAN and its ensembled discriminative layers, and results in reconstructing the in-class data exclusively. In inference, the reconstruction error of a query is measured using high-level semantics captured by the discriminator. Extensive experiments on image data show that the proposed model gives state-of-the-art performance. |
Tasks | |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01665v2 |
https://arxiv.org/pdf/2003.01665v2.pdf | |
PWC | https://paperswithcode.com/paper/compact-surjective-encoding-autoencoder-for |
Repo | |
Framework | |
PMIndia – A Collection of Parallel Corpora of Languages of India
Title | PMIndia – A Collection of Parallel Corpora of Languages of India |
Authors | Barry Haddow, Faheem Kirefu |
Abstract | Parallel text is required for building high-quality machine translation (MT) systems, as well as for other multilingual NLP applications. For many South Asian languages, such data is in short supply. In this paper, we describe a new publicly available corpus (PMIndia) consisting of parallel sentences which pair 13 major languages of India with English. The corpus includes up to 56000 sentences for each language pair. We explain how the corpus was constructed, including an assessment of two different automatic sentence alignment methods, and present some initial NMT results on the corpus. |
Tasks | Machine Translation |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.09907v1 |
https://arxiv.org/pdf/2001.09907v1.pdf | |
PWC | https://paperswithcode.com/paper/pmindia-a-collection-of-parallel-corpora-of |
Repo | |
Framework | |