February 2, 2020

3073 words 15 mins read

Paper Group AWR 47


Many Task Learning with Task Routing

Title Many Task Learning with Task Routing
Authors Gjorgji Strezoski, Nanne van Noord, Marcel Worring
Abstract Typical multi-task learning (MTL) methods rely on architectural adjustments and a large trainable parameter set to jointly optimize over several tasks. However, when the number of tasks increases, so do the complexity of the architectural adjustments and the resource requirements. In this paper, we introduce a method which applies a conditional feature-wise transformation over the convolutional activations that enables a model to successfully perform a large number of tasks. To distinguish from regular MTL, we introduce Many Task Learning (MaTL) as a special case of MTL where more than 20 tasks are performed by a single model. Our method, dubbed Task Routing (TR), is encapsulated in a layer we call the Task Routing Layer (TRL), which, applied in an MaTL scenario, successfully fits hundreds of classification tasks in one model. We evaluate our method on 5 datasets against strong baselines and state-of-the-art approaches.
Tasks Multi-Task Learning
Published 2019-03-28
URL http://arxiv.org/abs/1903.12117v1
PDF http://arxiv.org/pdf/1903.12117v1.pdf
PWC https://paperswithcode.com/paper/many-task-learning-with-task-routing
Repo https://github.com/gstrezoski/TaskRouting
Framework pytorch
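
The routing mechanism described above can be pictured as a fixed, per-task feature-wise mask applied to convolutional activations. Below is a minimal PyTorch sketch of that idea; the mask initialization, the `sharing_ratio` parameter, and the `active_task` switch are illustrative assumptions based on the abstract, not the authors' implementation (see the linked repo for that).

```python
# Minimal sketch (assumption-based) of a task-routing-style layer in PyTorch:
# each task gets a fixed binary channel mask applied to conv activations.
import torch
import torch.nn as nn

class TaskRoutingLayer(nn.Module):
    def __init__(self, num_channels: int, num_tasks: int, sharing_ratio: float = 0.5):
        super().__init__()
        # One fixed random binary mask per task; a channel is active for a task
        # with probability `sharing_ratio` (hypothetical parameter name).
        masks = (torch.rand(num_tasks, num_channels) < sharing_ratio).float()
        self.register_buffer("masks", masks)
        self.active_task = 0  # set externally before each forward pass

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, H, W); mask the channels for the active task.
        mask = self.masks[self.active_task].view(1, -1, 1, 1)
        return x * mask

# Usage: route a batch through the layer for task 3.
layer = TaskRoutingLayer(num_channels=64, num_tasks=100)
layer.active_task = 3
out = layer(torch.randn(8, 64, 32, 32))
```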

Rényi Differential Privacy of the Sampled Gaussian Mechanism

Title Rényi Differential Privacy of the Sampled Gaussian Mechanism
Authors Ilya Mironov, Kunal Talwar, Li Zhang
Abstract The Sampled Gaussian Mechanism (SGM)—a composition of subsampling and the additive Gaussian noise—has been successfully used in a number of machine learning applications. The mechanism’s unexpected power is derived from privacy amplification by sampling where the privacy cost of a single evaluation diminishes quadratically, rather than linearly, with the sampling rate. Characterizing the precise privacy properties of SGM motivated development of several relaxations of the notion of differential privacy. This work unifies and fills in gaps in published results on SGM. We describe a numerically stable procedure for precise computation of SGM’s Rényi Differential Privacy and prove a nearly tight (within a small constant factor) closed-form bound.
Tasks
Published 2019-08-28
URL https://arxiv.org/abs/1908.10530v1
PDF https://arxiv.org/pdf/1908.10530v1.pdf
PWC https://paperswithcode.com/paper/renyi-differential-privacy-of-the-sampled
Repo https://github.com/facebookresearch/pytorch-dp
Framework pytorch
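
For a concrete feel of what the paper computes, the sketch below evaluates the standard integer-order Rényi DP expression for the Sampled Gaussian Mechanism, the same binomial-expansion form used by common DP accountants. The paper's contribution includes a numerically stable procedure covering fractional orders and a closed-form bound, which this simple sketch does not reproduce.

```python
# Sketch: Renyi DP of the Sampled Gaussian Mechanism at integer orders alpha >= 2,
# with sampling rate 0 < q < 1 and noise multiplier sigma.
import math

def sgm_rdp_integer_order(q: float, sigma: float, alpha: int) -> float:
    """epsilon(alpha) via the binomial expansion of A_alpha (logsumexp for stability)."""
    log_terms = []
    for k in range(alpha + 1):
        log_binom = math.lgamma(alpha + 1) - math.lgamma(k + 1) - math.lgamma(alpha - k + 1)
        log_terms.append(
            log_binom
            + k * math.log(q)
            + (alpha - k) * math.log(1 - q)
            + (k * k - k) / (2 * sigma ** 2)
        )
    log_a = max(log_terms)
    log_a += math.log(sum(math.exp(t - log_a) for t in log_terms))  # logsumexp
    return log_a / (alpha - 1)

# Example: q = 0.01, sigma = 1.0, alpha = 32; RDP composes additively over steps,
# so 1000 steps cost 1000 times the per-step epsilon at this order.
eps_step = sgm_rdp_integer_order(0.01, 1.0, 32)
print(eps_step, 1000 * eps_step)
```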

AP-Perf: Incorporating Generic Performance Metrics in Differentiable Learning

Title AP-Perf: Incorporating Generic Performance Metrics in Differentiable Learning
Authors Rizal Fathony, J. Zico Kolter
Abstract We propose a method that enables practitioners to conveniently incorporate custom non-decomposable performance metrics into differentiable learning pipelines, notably those based upon neural network architectures. Our approach is based on the recently developed adversarial prediction framework, a distributionally robust approach that optimizes a metric in the worst case given the statistical summary of the empirical distribution. We formulate a marginal distribution technique to reduce the complexity of optimizing the adversarial prediction formulation over a vast range of non-decomposable metrics. We demonstrate how easy it is to write and incorporate complex custom metrics using our provided tool. Finally, we show the effectiveness of our approach on various classification tasks on tabular datasets from the UCI repository and benchmark datasets, as well as image classification tasks. The code for our proposed method is available at https://github.com/rizalzaf/AdversarialPrediction.jl.
Tasks Image Classification
Published 2019-12-02
URL https://arxiv.org/abs/1912.00965v2
PDF https://arxiv.org/pdf/1912.00965v2.pdf
PWC https://paperswithcode.com/paper/ap-perf-incorporating-generic-performance
Repo https://github.com/rizalzaf/AP-examples
Framework none
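
As a point of reference, a non-decomposable metric is one that depends on the whole confusion matrix rather than summing per-example losses, which is what makes it hard to plug into gradient-based training. The snippet below only illustrates that property with an F-beta score in NumPy; it is not the paper's Julia API (AdversarialPrediction.jl).

```python
# Illustration (not the paper's tool): a metric such as F1 is a function of the
# whole confusion matrix, so it cannot be written as an average of per-example
# losses -- the property AP-Perf targets.
import numpy as np

def f_beta(y_true: np.ndarray, y_pred: np.ndarray, beta: float = 1.0) -> float:
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom > 0 else 0.0

y_true = np.array([1, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1])
print(f_beta(y_true, y_pred))  # F1 over the whole sample
```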

Assessing Disparate Impacts of Personalized Interventions: Identifiability and Bounds

Title Assessing Disparate Impacts of Personalized Interventions: Identifiability and Bounds
Authors Nathan Kallus, Angela Zhou
Abstract Personalized interventions in social services, education, and healthcare leverage individual-level causal effect predictions in order to give the best treatment to each individual or to prioritize program interventions for the individuals most likely to benefit. While the sensitivity of these domains compels us to evaluate the fairness of such policies, we show that actually auditing their disparate impacts per standard observational metrics, such as true positive rates, is impossible since ground truths are unknown. Whether our data is experimental or observational, an individual’s actual outcome under an intervention different than that received can never be known, only predicted based on features. We prove how we can nonetheless point-identify these quantities under the additional assumption of monotone treatment response, which may be reasonable in many applications. We further provide a sensitivity analysis for this assumption by means of sharp partial-identification bounds under violations of monotonicity of varying strengths. We show how to use our results to audit personalized interventions using partially-identified ROC and xROC curves and demonstrate this in a case study of a French job training dataset.
Tasks
Published 2019-06-04
URL https://arxiv.org/abs/1906.01552v1
PDF https://arxiv.org/pdf/1906.01552v1.pdf
PWC https://paperswithcode.com/paper/assessing-disparate-impacts-of-personalized
Repo https://github.com/CausalML/interventions-disparate-impact-responders
Framework none

Multi-task Learning for Aggregated Data using Gaussian Processes

Title Multi-task Learning for Aggregated Data using Gaussian Processes
Authors Fariba Yousefi, Michael Thomas Smith, Mauricio A. Álvarez
Abstract Aggregated data is commonplace in areas such as epidemiology and demography. For example, census data for a population is usually given as averages defined over time periods or spatial resolutions (cities, regions or countries). In this paper, we present a novel multi-task learning model based on Gaussian processes for joint learning of variables that have been aggregated at different input scales. Our model represents each task as the linear combination of the realizations of latent processes that are integrated at a different scale per task. We are then able to compute the cross-covariance between the different tasks either analytically or numerically. We also allow each task to have a potentially different likelihood model and provide a variational lower bound that can be optimised in a stochastic fashion making our model suitable for larger datasets. We show examples of the model in a synthetic example, a fertility dataset, and an air pollution prediction application.
Tasks Air Pollution Prediction, Epidemiology, Gaussian Processes, Multi-Task Learning
Published 2019-06-22
URL https://arxiv.org/abs/1906.09412v4
PDF https://arxiv.org/pdf/1906.09412v4.pdf
PWC https://paperswithcode.com/paper/multi-task-learning-for-aggregated-data-using
Repo https://github.com/frb-yousefi/aggregated-multitask-gp
Framework none
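
The notion of "aggregation at different input scales" can be illustrated without the full model: draw a latent function from a GP and let each task observe only bin averages at its own resolution. The NumPy sketch below does exactly that; the bin counts and kernel settings are arbitrary choices for illustration, not the paper's experimental setup.

```python
# Sketch: what "aggregated observations" look like for a latent GP.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 500)                       # fine input grid

# Draw one latent sample from a GP with an RBF kernel.
lengthscale = 1.0
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / lengthscale ** 2)
f = rng.multivariate_normal(np.zeros_like(x), K + 1e-8 * np.eye(len(x)))

# Task 1 observes fine-scale averages (50 bins), task 2 coarse ones (10 bins).
def aggregate(f, num_bins):
    return np.array([chunk.mean() for chunk in np.array_split(f, num_bins)])

y_task1 = aggregate(f, 50)
y_task2 = aggregate(f, 10)
print(y_task1.shape, y_task2.shape)
```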

Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning

Title Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning
Authors Minlong Peng, Xiaoyu Xing, Qi Zhang, Jinlan Fu, Xuanjing Huang
Abstract In this work, we explore the way to perform named entity recognition (NER) using only unlabeled data and named entity dictionaries. To this end, we formulate the task as a positive-unlabeled (PU) learning problem and accordingly propose a novel PU learning algorithm to perform the task. We prove that the proposed algorithm can unbiasedly and consistently estimate the task loss as if fully labeled data were available. A key feature of the proposed method is that it does not require the dictionaries to label every entity within a sentence, and it does not even require the dictionaries to label all of the words constituting an entity. This greatly reduces the requirement on the quality of the dictionaries and makes our method generalize well with quite simple dictionaries. Empirical studies on four public NER datasets demonstrate the effectiveness of our proposed method. We have published the source code at https://github.com/v-mipeng/LexiconNER.
Tasks Named Entity Recognition
Published 2019-06-04
URL https://arxiv.org/abs/1906.01378v2
PDF https://arxiv.org/pdf/1906.01378v2.pdf
PWC https://paperswithcode.com/paper/distantly-supervised-named-entity-recognition
Repo https://github.com/v-mipeng/LexiconNER
Framework pytorch
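
The positive-unlabeled setup the paper builds on can be summarized by the (non-negative) PU risk estimator: the negative risk is estimated from unlabeled data after subtracting the contribution of positives. The PyTorch sketch below implements that generic estimator for a binary word-level classifier; it is in the spirit of standard nnPU-style losses and is not the paper's exact algorithm.

```python
# Sketch: a generic non-negative PU risk estimator for a binary classifier.
import torch
import torch.nn.functional as F

def nn_pu_risk(scores_pos: torch.Tensor, scores_unl: torch.Tensor, prior: float):
    """scores_*: raw logits; prior: assumed fraction of positives in the data."""
    # Logistic loss: l(z, +1) = softplus(-z), l(z, -1) = softplus(z).
    risk_pos = F.softplus(-scores_pos).mean()       # positives predicted positive
    risk_pos_as_neg = F.softplus(scores_pos).mean()
    risk_unl_as_neg = F.softplus(scores_unl).mean()
    # Negative risk estimated from unlabeled data, clipped at zero (nnPU-style).
    risk_neg = torch.clamp(risk_unl_as_neg - prior * risk_pos_as_neg, min=0.0)
    return prior * risk_pos + risk_neg

# Dummy logits for 32 dictionary-matched (positive) and 256 unlabeled tokens.
print(float(nn_pu_risk(torch.randn(32), torch.randn(256), prior=0.1)))
```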

Guarantees for Spectral Clustering with Fairness Constraints

Title Guarantees for Spectral Clustering with Fairness Constraints
Authors Matthäus Kleindessner, Samira Samadi, Pranjal Awasthi, Jamie Morgenstern
Abstract Given the widespread popularity of spectral clustering (SC) for partitioning graph data, we study a version of constrained SC in which we try to incorporate the fairness notion proposed by Chierichetti et al. (2017). According to this notion, a clustering is fair if every demographic group is approximately proportionally represented in each cluster. To this end, we develop variants of both normalized and unnormalized constrained SC and show that they help find fairer clusterings on both synthetic and real data. We also provide a rigorous theoretical analysis of our algorithms on a natural variant of the stochastic block model, where $h$ groups have strong inter-group connectivity, but also exhibit a “natural” clustering structure which is fair. We prove that our algorithms can recover this fair clustering with high probability.
Tasks
Published 2019-01-24
URL https://arxiv.org/abs/1901.08668v2
PDF https://arxiv.org/pdf/1901.08668v2.pdf
PWC https://paperswithcode.com/paper/guarantees-for-spectral-clustering-with
Repo https://github.com/matthklein/fair_spectral_clustering
Framework none
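
The fairness notion enters spectral clustering as a linear constraint on the spectral embedding: group membership must be (approximately) proportionally represented in each cluster, which can be enforced by restricting the embedding to the nullspace of a group-indicator matrix. The sketch below follows that recipe for the unnormalized variant; it is a compact reading of the approach, not the authors' code.

```python
# Sketch of unnormalized fair spectral clustering: enforce F^T H = 0 by working
# in the nullspace of F^T, then run k-means on the constrained embedding.
import numpy as np
from scipy.linalg import null_space, eigh
from sklearn.cluster import KMeans

def fair_spectral_clustering(W, groups, k):
    """W: (n, n) adjacency, groups: (n,) integer group labels, k: #clusters."""
    L = np.diag(W.sum(axis=1)) - W                       # unnormalized Laplacian
    labels = np.unique(groups)
    # Fairness constraint matrix: one column per group except the last.
    F = np.stack([(groups == g).astype(float) - np.mean(groups == g)
                  for g in labels[:-1]], axis=1)
    Z = null_space(F.T)                                  # basis of {H : F^T H = 0}
    _, Y = eigh(Z.T @ L @ Z)                             # eigenvectors, ascending
    H = Z @ Y[:, :k]                                     # fair spectral embedding
    return KMeans(n_clusters=k, n_init=10).fit_predict(H)

# Tiny usage example on a random symmetric graph with two demographic groups.
rng = np.random.default_rng(0)
A = rng.random((20, 20)); W = ((A + A.T) > 1.0).astype(float); np.fill_diagonal(W, 0)
print(fair_spectral_clustering(W, rng.integers(0, 2, 20), k=2))
```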

Deep learning in bioinformatics: introduction, application, and perspective in big data era

Title Deep learning in bioinformatics: introduction, application, and perspective in big data era
Authors Yu Li, Chao Huang, Lizhong Ding, Zhongxiao Li, Yijie Pan, Xin Gao
Abstract Deep learning, which is especially formidable in handling big data, has achieved great success in various fields, including bioinformatics. With the advances of the big data era in biology, it is foreseeable that deep learning will become increasingly important in the field and will be incorporated into the vast majority of analysis pipelines. In this review, we provide both an exoteric introduction to deep learning and concrete examples and implementations of its representative applications in bioinformatics. We start from the recent achievements of deep learning in the bioinformatics field, pointing out the problems which are suitable for deep learning. After that, we introduce deep learning in an easy-to-understand fashion, from shallow neural networks to convolutional neural networks, recurrent neural networks, graph neural networks, generative adversarial networks, variational autoencoders, and the most recent state-of-the-art architectures. After that, we provide eight examples, covering five bioinformatics research directions and all four kinds of data types, with the implementations written in Tensorflow and Keras. Finally, we discuss the common issues, such as overfitting and interpretability, that users will encounter when adopting deep learning methods and provide corresponding suggestions. The implementations are freely available at https://github.com/lykaust15/Deep_learning_examples.
Tasks
Published 2019-02-28
URL http://arxiv.org/abs/1903.00342v1
PDF http://arxiv.org/pdf/1903.00342v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-in-bioinformatics-introduction
Repo https://github.com/lykaust15/Deep_learning_examples
Framework tf

Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks

Title Differentiable Soft Quantization: Bridging Full-Precision and Low-Bit Neural Networks
Authors Ruihao Gong, Xianglong Liu, Shenghu Jiang, Tianxiang Li, Peng Hu, Jiazhen Lin, Fengwei Yu, Junjie Yan
Abstract Hardware-friendly network quantization (e.g., binary/uniform quantization) can efficiently accelerate inference and meanwhile reduce the memory consumption of deep neural networks, which is crucial for model deployment on resource-limited devices like mobile phones. However, due to the discreteness of low-bit quantization, existing quantization methods often face an unstable training process and severe performance degradation. To address this problem, in this paper we propose Differentiable Soft Quantization (DSQ) to bridge the gap between full-precision and low-bit networks. DSQ can automatically evolve during training to gradually approximate the standard quantization. Owing to its differentiable property, DSQ can help pursue accurate gradients in backward propagation and reduce the quantization loss in the forward process with an appropriate clipping range. Extensive experiments over several popular network structures show that training low-bit neural networks with DSQ can consistently outperform state-of-the-art quantization methods. Besides, our first efficient implementation for deploying 2 to 4-bit DSQ on devices with ARM architecture achieves up to a 1.7$\times$ speedup, compared with the open-source 8-bit high-performance inference framework NCNN.
Tasks Quantization
Published 2019-08-14
URL https://arxiv.org/abs/1908.05033v1
PDF https://arxiv.org/pdf/1908.05033v1.pdf
PWC https://paperswithcode.com/paper/differentiable-soft-quantization-bridging
Repo https://github.com/ricky40403/DSQ
Framework pytorch
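
The core of DSQ is a differentiable surrogate for rounding: within each quantization cell, a scaled tanh interpolates between the identity (small temperature) and the hard staircase (large temperature). The PyTorch sketch below shows only that soft-rounding step on a fixed clipping range; the full method additionally learns the clipping range and the temperature, which is omitted here.

```python
# Simplified sketch of tanh-based soft quantization (not the paper's full method).
import math
import torch

def dsq_soft_round(x: torch.Tensor, k: float = 5.0) -> torch.Tensor:
    lo = torch.floor(x)
    t = x - lo                                    # position within the unit cell
    phi = torch.tanh(k * (t - 0.5)) / math.tanh(k / 2)
    return lo + 0.5 + 0.5 * phi                   # ~x for small k, ~round(x) for large k

def dsq_quantize(x, bits=4, lower=-1.0, upper=1.0, k=10.0):
    # Clip, map onto the integer grid, soft-round, then map back.
    levels = 2 ** bits - 1
    x = torch.clamp(x, lower, upper)
    scaled = (x - lower) / (upper - lower) * levels
    return dsq_soft_round(scaled, k) / levels * (upper - lower) + lower

w = torch.randn(5, requires_grad=True)
dsq_quantize(w, bits=2).sum().backward()          # gradients are well-defined
print(w.grad)
```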

Deep Tree Learning for Zero-shot Face Anti-Spoofing

Title Deep Tree Learning for Zero-shot Face Anti-Spoofing
Authors Yaojie Liu, Joel Stehouwer, Amin Jourabloo, Xiaoming Liu
Abstract Face anti-spoofing is designed to keep face recognition systems from recognizing fake faces as genuine users. While advanced face anti-spoofing methods are being developed, new types of spoof attacks are also being created and becoming a threat to all existing systems. We define the detection of unknown spoof attacks as Zero-Shot Face Anti-spoofing (ZSFA). Previous works on ZSFA only study 1-2 types of spoof attacks, such as print/replay attacks, which limits insight into this problem. In this work, we expand the ZSFA problem to a wide range of 13 types of spoof attacks, including print attacks, replay attacks, 3D mask attacks, and so on. A novel Deep Tree Network (DTN) is proposed to tackle the ZSFA problem. The tree is learned to partition the spoof samples into semantic sub-groups in an unsupervised fashion. When a data sample arrives, whether a known or unknown attack, DTN routes it to the most similar spoof cluster and makes a binary decision. In addition, to enable the study of ZSFA, we introduce the first face anti-spoofing database that contains diverse types of spoof attacks. Experiments show that our proposed method achieves the state of the art on multiple testing protocols of ZSFA.
Tasks Face Anti-Spoofing, Face Recognition
Published 2019-04-05
URL http://arxiv.org/abs/1904.02860v2
PDF http://arxiv.org/pdf/1904.02860v2.pdf
PWC https://paperswithcode.com/paper/deep-tree-learning-for-zero-shot-face-anti
Repo https://github.com/yaojieliu/CVPR2019-DeepTreeLearningForZeroShotFaceAntispoofing
Framework tf

Quantized Reinforcement Learning (QUARL)

Title Quantized Reinforcement Learning (QUARL)
Authors Srivatsan Krishnan, Sharad Chitlangia, Maximilian Lam, Zishen Wan, Aleksandra Faust, Vijay Janapa Reddi
Abstract Recent work has shown that quantization can help reduce the memory, compute, and energy demands of deep neural networks without significantly harming their quality. However, whether these prior techniques, applied traditionally to image-based models, work with the same efficacy to the sequential decision making process in reinforcement learning remains an unanswered question. To address this void, we conduct the first comprehensive empirical study that quantifies the effects of quantization on various deep reinforcement learning policies with the intent to reduce their computational resource demands. We apply techniques such as post-training quantization and quantization aware training to a spectrum of reinforcement learning tasks (such as Pong, Breakout, BeamRider and more) and training algorithms (such as PPO, A2C, DDPG, and DQN). Across this spectrum of tasks and learning algorithms, we show that policies can be quantized to 6-8 bits of precision without loss of accuracy. We also show that certain tasks and reinforcement learning algorithms yield policies that are more difficult to quantize due to their effect of widening the models’ distribution of weights and that quantization aware training consistently improves results over post-training quantization and oftentimes even over the full precision baseline. Finally, we demonstrate real-world applications of quantization for reinforcement learning. We use half-precision training to train a Pong model 50% faster, and we deploy a quantized reinforcement learning based navigation policy to an embedded system, achieving an 18$\times$ speedup and a 4$\times$ reduction in memory usage over an unquantized policy.
Tasks Decision Making, Quantization
Published 2019-10-02
URL https://arxiv.org/abs/1910.01055v3
PDF https://arxiv.org/pdf/1910.01055v3.pdf
PWC https://paperswithcode.com/paper/quantized-reinforcement-learning-quarl
Repo https://github.com/harvard-edge/quarl
Framework tf
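
Post-training quantization, one of the generic techniques the study applies to RL policies, can be tried in a few lines with PyTorch's dynamic quantization. The sketch below uses a stand-in MLP policy rather than any of the paper's trained agents.

```python
# Sketch: post-training dynamic quantization of a small policy network.
import torch
import torch.nn as nn

policy = nn.Sequential(
    nn.Linear(4, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 2),                 # e.g. a CartPole-sized actor head
)

# Quantize Linear layers to int8 weights; activations stay in float at runtime.
quantized_policy = torch.quantization.quantize_dynamic(
    policy, {nn.Linear}, dtype=torch.qint8
)

obs = torch.randn(1, 4)
print(policy(obs), quantized_policy(obs))     # outputs should closely agree
```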

CAE-ADMM: Implicit Bitrate Optimization via ADMM-based Pruning in Compressive Autoencoders

Title CAE-ADMM: Implicit Bitrate Optimization via ADMM-based Pruning in Compressive Autoencoders
Authors Haimeng Zhao, Peiyuan Liao
Abstract We introduce the ADMM-pruned Compressive AutoEncoder (CAE-ADMM), which uses the Alternating Direction Method of Multipliers (ADMM) to optimize the trade-off between distortion and efficiency of lossy image compression. Specifically, ADMM in our method promotes sparsity to implicitly optimize the bitrate, in contrast to the entropy estimators used in previous research. Experiments on public datasets show that our method outperforms the original CAE and some traditional codecs in terms of SSIM/MS-SSIM metrics, at reasonable inference speed.
Tasks Image Compression, Neural Architecture Search
Published 2019-01-22
URL http://arxiv.org/abs/1901.07196v4
PDF http://arxiv.org/pdf/1901.07196v4.pdf
PWC https://paperswithcode.com/paper/cae-admm-implicit-bitrate-optimization-via
Repo https://github.com/JasonZHM/CAEP
Framework pytorch
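
ADMM-based pruning splits the weights into a copy trained by SGD with a quadratic penalty and a copy projected onto a sparse set, coupled through a dual variable. The sketch below shows that generic update loop on a toy objective; the hyperparameters and projection rule are illustrative, and the autoencoder itself is omitted.

```python
# Sketch of generic ADMM-based pruning (not the authors' code).
import torch

def project_sparse(w: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Keep the largest-magnitude entries, zero the rest."""
    k = max(1, int(keep_ratio * w.numel()))
    threshold = w.abs().flatten().kthvalue(w.numel() - k + 1).values
    return torch.where(w.abs() >= threshold, w, torch.zeros_like(w))

def admm_pruning_step(w, z, u, grad_loss, lr=1e-2, rho=1e-3, keep_ratio=0.2):
    # W-update: one SGD step on loss(W) + (rho/2) * ||W - Z + U||^2
    w = w - lr * (grad_loss + rho * (w - z + u))
    z = project_sparse(w + u, keep_ratio)     # Z-update: projection onto sparse set
    u = u + w - z                             # dual update
    return w, z, u

w = torch.randn(100)
z, u = project_sparse(w, 0.2), torch.zeros_like(w)
for _ in range(50):
    grad = 2 * (w - 1.0)                      # toy loss: ||w - 1||^2
    w, z, u = admm_pruning_step(w, z, u, grad)
print((z != 0).float().mean())                # ~0.2 of the entries survive
```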

Combating Label Noise in Deep Learning Using Abstention

Title Combating Label Noise in Deep Learning Using Abstention
Authors Sunil Thulasidasan, Tanmoy Bhattacharya, Jeff Bilmes, Gopinath Chennupati, Jamal Mohd-Yusof
Abstract We introduce a novel method to combat label noise when training deep neural networks for classification. We propose a loss function that permits abstention during training, thereby allowing the DNN to abstain on confusing samples while continuing to learn and improve classification performance on the non-abstained samples. We show how such a deep abstaining classifier (DAC) can be used for robust learning in the presence of different types of label noise. In the case of structured or systematic label noise, where noisy training labels or confusing examples are correlated with underlying features of the data, training with abstention enables representation learning for features that are associated with unreliable labels. In the case of unstructured (arbitrary) label noise, abstention during training enables the DAC to be used as an effective data cleaner by identifying samples that are likely to have label noise. We provide analytical results on the loss function behavior that enable dynamic adaptation of abstention rates based on learning progress during training. We demonstrate the utility of the deep abstaining classifier for various image classification tasks under different types of label noise; in the case of arbitrary label noise, we show significant improvements over previously published results on multiple image benchmarks. Source code is available at https://github.com/thulas/dac-label-noise
Tasks Image Classification, Representation Learning
Published 2019-05-27
URL https://arxiv.org/abs/1905.10964v2
PDF https://arxiv.org/pdf/1905.10964v2.pdf
PWC https://paperswithcode.com/paper/combating-label-noise-in-deep-learning-using
Repo https://github.com/thulas/dac-label-noise
Framework pytorch
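
One way to read the abstention idea is as a (k+1)-way softmax where the extra output absorbs probability mass on confusing samples: the usual cross-entropy is discounted by the non-abstention mass, and abstaining is charged a penalty weighted by alpha (adapted dynamically in the paper). The PyTorch sketch below implements that reading with a fixed alpha; consult the linked repo for the actual loss.

```python
# Sketch of an abstention-augmented cross-entropy in the spirit of the DAC.
import torch
import torch.nn.functional as F

def abstention_loss(logits: torch.Tensor, targets: torch.Tensor, alpha: float = 1.0):
    """logits: (batch, num_classes + 1); the last column is the abstention logit."""
    probs = F.softmax(logits, dim=1)
    p_abstain = probs[:, -1].clamp(max=1 - 1e-6)
    p_true = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    # Cross-entropy over the renormalized (non-abstain) distribution,
    # discounted by how much mass the model keeps on the real classes.
    ce = -(1 - p_abstain) * torch.log(p_true / (1 - p_abstain) + 1e-12)
    penalty = -alpha * torch.log(1 - p_abstain)
    return (ce + penalty).mean()

logits = torch.randn(8, 11, requires_grad=True)      # 10 classes + abstain
targets = torch.randint(0, 10, (8,))
abstention_loss(logits, targets).backward()
```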

Pretraining-Based Natural Language Generation for Text Summarization

Title Pretraining-Based Natural Language Generation for Text Summarization
Authors Haoyu Zhang, Jianjun Xu, Ji Wang
Abstract In this paper, we propose a novel pretraining-based encoder-decoder framework, which can generate the output sequence based on the input sequence in a two-stage manner. For the encoder of our model, we encode the input sequence into context representations using BERT. For the decoder, there are two stages in our model: in the first stage, we use a Transformer-based decoder to generate a draft output sequence. In the second stage, we mask each word of the draft sequence and feed it to BERT, then by combining the input sequence and the draft representation generated by BERT, we use a Transformer-based decoder to predict the refined word for each masked position. To the best of our knowledge, our approach is the first method to apply BERT to text generation tasks. As the first step in this direction, we evaluate our proposed method on the text summarization task. Experimental results show that our model achieves new state-of-the-art results on both the CNN/Daily Mail and New York Times datasets.
Tasks Text Generation, Text Summarization
Published 2019-02-25
URL http://arxiv.org/abs/1902.09243v2
PDF http://arxiv.org/pdf/1902.09243v2.pdf
PWC https://paperswithcode.com/paper/pretraining-based-natural-language-generation
Repo https://github.com/praveenjune17/BERT_text_summarisation
Framework tf

Beyond Cartesian Representations for Local Descriptors

Title Beyond Cartesian Representations for Local Descriptors
Authors Patrick Ebel, Anastasiia Mishchuk, Kwang Moo Yi, Pascal Fua, Eduard Trulls
Abstract The dominant approach for learning local patch descriptors relies on small image regions whose scale must be properly estimated a priori by a keypoint detector. In other words, if two patches are not in correspondence, their descriptors will not match. A strategy often used to alleviate this problem is to “pool” the pixel-wise features over log-polar regions, rather than regularly spaced ones. By contrast, we propose to extract the “support region” directly with a log-polar sampling scheme. We show that this provides us with a better representation by simultaneously oversampling the immediate neighbourhood of the point and undersampling regions far away from it. We demonstrate that this representation is particularly amenable to learning descriptors with deep networks. Our models can match descriptors across a much wider range of scales than was possible before, and also leverage much larger support regions without suffering from occlusions. We report state-of-the-art results on three different datasets.
Tasks
Published 2019-08-15
URL https://arxiv.org/abs/1908.05547v1
PDF https://arxiv.org/pdf/1908.05547v1.pdf
PWC https://paperswithcode.com/paper/beyond-cartesian-representations-for-local
Repo https://github.com/cvlab-epfl/log-polar-descriptors
Framework pytorch
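
The log-polar "support region" idea can be illustrated directly: sample the image on a grid that is exponentially spaced in radius and uniform in angle around the keypoint, so nearby pixels are oversampled and distant ones undersampled. The NumPy/SciPy sketch below does only that sampling step; it is not the paper's descriptor network or matching pipeline.

```python
# Sketch: extracting a log-polar patch around a keypoint.
import numpy as np
from scipy.ndimage import map_coordinates

def log_polar_patch(image, center, r_min=1.0, r_max=64.0, n_r=32, n_theta=32):
    cy, cx = center
    radii = np.exp(np.linspace(np.log(r_min), np.log(r_max), n_r))   # log-spaced
    thetas = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    r, t = np.meshgrid(radii, thetas, indexing="ij")
    rows = cy + r * np.sin(t)
    cols = cx + r * np.cos(t)
    # Bilinear sampling of the image at the (row, col) grid.
    return map_coordinates(image, [rows, cols], order=1, mode="nearest")

img = np.random.rand(256, 256).astype(np.float32)
patch = log_polar_patch(img, center=(128, 128))
print(patch.shape)                                    # (32, 32) log-polar grid
```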