January 31, 2020

3245 words 16 mins read

Paper Group AWR 385

An Alternative Surrogate Loss for PGD-based Adversarial Testing

Title An Alternative Surrogate Loss for PGD-based Adversarial Testing
Authors Sven Gowal, Jonathan Uesato, Chongli Qin, Po-Sen Huang, Timothy Mann, Pushmeet Kohli
Abstract Adversarial testing methods based on Projected Gradient Descent (PGD) are widely used for searching for norm-bounded perturbations that cause the inputs of neural networks to be misclassified. This paper takes a deeper look at these methods and explains the effect of different hyperparameters (i.e., optimizer, step size and surrogate loss). We introduce the concept of MultiTargeted testing, which makes clever use of alternative surrogate losses, and explain when and how MultiTargeted is guaranteed to find optimal perturbations. Finally, we demonstrate that MultiTargeted outperforms more sophisticated methods and often requires fewer iterative steps than other variants of PGD found in the literature. Notably, MultiTargeted ranks first on MadryLab’s white-box MNIST and CIFAR-10 leaderboards, reducing the accuracy of their MNIST model to 88.36% (with $\ell_\infty$ perturbations of $\epsilon = 0.3$) and the accuracy of their CIFAR-10 model to 44.03% (at $\epsilon = 8/255$). MultiTargeted also ranks first on the TRADES leaderboard, reducing the accuracy of their CIFAR-10 model to 53.07% (with $\ell_\infty$ perturbations of $\epsilon = 0.031$).
Tasks
Published 2019-10-21
URL https://arxiv.org/abs/1910.09338v1
PDF https://arxiv.org/pdf/1910.09338v1.pdf
PWC https://paperswithcode.com/paper/an-alternative-surrogate-loss-for-pgd-based
Repo https://github.com/yaodongyu/TRADES
Framework pytorch
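
As a rough illustration of the MultiTargeted idea, the sketch below (PyTorch; my own simplification, not the authors' implementation, whose optimizer, restarts, and step-size schedule differ) runs PGD on the per-target logit margin for each candidate target class and keeps the first perturbation that flips the prediction:

```python
import torch

def multitargeted_pgd(model, x, y, eps=0.3, step=0.01, iters=40, num_classes=10):
    """x: a single input of shape (1, ...); y: its true label (int)."""
    best_adv = x.clone()
    for t in range(num_classes):
        if t == y:
            continue
        x_adv = x.clone().requires_grad_(True)
        for _ in range(iters):
            logits = model(x_adv)
            loss = logits[0, t] - logits[0, y]            # margin surrogate for target t
            grad, = torch.autograd.grad(loss, x_adv)
            with torch.no_grad():
                x_adv = x_adv + step * grad.sign()        # l_inf ascent step
                x_adv = x + (x_adv - x).clamp(-eps, eps)  # project onto the eps-ball
                x_adv = x_adv.clamp(0.0, 1.0)             # keep a valid pixel range
            x_adv.requires_grad_(True)
        with torch.no_grad():
            if model(x_adv).argmax(dim=1).item() != y:
                return x_adv.detach()                     # stop at the first fooling target
    return best_adv                                       # no target succeeded
```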

Hetero-Modal Variational Encoder-Decoder for Joint Modality Completion and Segmentation

Title Hetero-Modal Variational Encoder-Decoder for Joint Modality Completion and Segmentation
Authors Reuben Dorent, Samuel Joutard, Marc Modat, Sébastien Ourselin, Tom Vercauteren
Abstract We propose a new deep learning method for tumour segmentation when dealing with missing imaging modalities. Instead of producing one network for each possible subset of observed modalities or using arithmetic operations to combine feature maps, our hetero-modal variational 3D encoder-decoder independently embeds all observed modalities into a shared latent representation. Missing data and the tumour segmentation can then be generated from this embedding. In our scenario, the input is a random subset of modalities. We demonstrate that the optimisation problem can be seen as a mixture sampling. In addition to this, we introduce a new network architecture building upon both the 3D U-Net and the Multi-Modal Variational Auto-Encoder (MVAE). Finally, we evaluate our method on BraTS2018 using subsets of the imaging modalities as input. Our model outperforms the current state-of-the-art method for dealing with missing modalities and achieves similar performance to the subset-specific equivalent networks.
Tasks
Published 2019-07-25
URL https://arxiv.org/abs/1907.11150v1
PDF https://arxiv.org/pdf/1907.11150v1.pdf
PWC https://paperswithcode.com/paper/hetero-modal-variational-encoder-decoder-for
Repo https://github.com/ReubenDo/U-HVED
Framework tf
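
The fusion of observed modalities into one shared latent can be pictured as a product of Gaussian experts, in the spirit of the MVAE the paper builds on; the precision-weighted combination below is a hedged sketch with illustrative shapes, not the paper's exact formulation:

```python
import torch

def fuse_modalities(mus, logvars, observed):
    """mus, logvars: lists of (B, D) tensors, one per modality;
    observed: list of bools saying which modalities are present."""
    precisions, weighted_mus = [], []
    for mu, logvar, obs in zip(mus, logvars, observed):
        if obs:
            prec = torch.exp(-logvar)            # 1 / sigma^2
            precisions.append(prec)
            weighted_mus.append(mu * prec)
    prec_sum = torch.stack(precisions).sum(0)
    mu_joint = torch.stack(weighted_mus).sum(0) / prec_sum
    var_joint = 1.0 / prec_sum
    # reparameterised sample from the fused posterior
    z = mu_joint + var_joint.sqrt() * torch.randn_like(mu_joint)
    return z, mu_joint, var_joint
```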

Noise-tolerant fair classification

Title Noise-tolerant fair classification
Authors Alexandre Louis Lamy, Ziyuan Zhong, Aditya Krishna Menon, Nakul Verma
Abstract Fairness-aware learning involves designing algorithms that do not discriminate with respect to some sensitive feature (e.g., race or gender). Existing work on the problem operates under the assumption that the sensitive feature available in one’s training sample is perfectly reliable. This assumption may be violated in many real-world cases: for example, respondents to a survey may choose to conceal or obfuscate their group identity out of fear of potential discrimination. This poses the question of whether one can still learn fair classifiers given noisy sensitive features. In this paper, we answer the question in the affirmative: we show that if one measures fairness using the mean-difference score, and sensitive features are subject to noise from the mutually contaminated learning model, then, owing to a simple identity, we only need to change the desired fairness tolerance. The requisite tolerance can be estimated by leveraging existing noise-rate estimators from the label noise literature. We finally show that our procedure is empirically effective on two case studies involving sensitive feature censoring.
Tasks
Published 2019-01-30
URL https://arxiv.org/abs/1901.10837v4
PDF https://arxiv.org/pdf/1901.10837v4.pdf
PWC https://paperswithcode.com/paper/noise-tolerant-fair-classification
Repo https://github.com/AIasd/noise_fairlearn
Framework none
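
The practical consequence of the paper's identity can be shown in a few lines; the function below is a sketch under my reading that the mean-difference score measured on noisy groups shrinks by a factor of (1 - alpha - beta):

```python
def adjusted_tolerance(tau, alpha, beta):
    """tau: desired fairness tolerance on the clean sensitive feature;
    alpha, beta: estimated group-noise rates (e.g. from label-noise estimators)."""
    assert alpha + beta < 1.0, "noise rates must leave some signal"
    return tau * (1.0 - alpha - beta)

# e.g. to enforce |mean-difference| <= 0.05 on the clean groups with
# 20%/10% contamination, constrain the noisy score to <= 0.035:
print(adjusted_tolerance(0.05, 0.2, 0.1))  # 0.035
```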

Learning Resolution Parameters for Graph Clustering

Title Learning Resolution Parameters for Graph Clustering
Authors Nate Veldt, David F. Gleich, Anthony Wirth
Abstract Finding clusters of well-connected nodes in a graph is an extensively studied problem in graph-based data analysis. Because of its many applications, a large number of distinct graph clustering objective functions and algorithms have already been proposed and analyzed. To aid practitioners in determining the best clustering approach to use in different applications, we present new techniques for automatically learning how to set clustering resolution parameters. These parameters control the size and structure of communities that are formed by optimizing a generalized objective function. We begin by formalizing the notion of a parameter fitness function, which measures how well a fixed input clustering approximately solves a generalized clustering objective for a specific resolution parameter value. Under reasonable assumptions, which suit two key graph clustering applications, such a parameter fitness function can be efficiently minimized using a bisection-like method, yielding a resolution parameter that fits well with the example clustering. We view our framework as a type of single-shot hyperparameter tuning, as we are able to learn a good resolution parameter with just a single example. Our general approach can be applied to learn resolution parameters for both local and global graph clustering objectives. We demonstrate its utility in several experiments on real-world data where it is helpful to learn resolution parameters from a given example clustering.
Tasks Graph Clustering
Published 2019-03-12
URL http://arxiv.org/abs/1903.05246v1
PDF http://arxiv.org/pdf/1903.05246v1.pdf
PWC https://paperswithcode.com/paper/learning-resolution-parameters-for-graph
Repo https://github.com/nveldt/LearnResParams
Framework none
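
A minimal sketch of the bisection-like search, assuming (as a simplification) a unimodal parameter fitness function `fitness(gamma)` whose minimizer is the resolution parameter that best fits the example clustering:

```python
def bisect_resolution(fitness, lo, hi, tol=1e-4):
    """Minimize a unimodal fitness function over [lo, hi] by bisecting
    on the sign of its numerical slope."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # compare fitness slightly left/right of mid to pick the half
        # containing the minimizer
        if fitness(mid - tol) < fitness(mid + tol):
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```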

STCN: Stochastic Temporal Convolutional Networks

Title STCN: Stochastic Temporal Convolutional Networks
Authors Emre Aksan, Otmar Hilliges
Abstract Convolutional architectures have recently been shown to be competitive on many sequence modelling tasks when compared to the de facto standard of recurrent neural networks (RNNs), while providing computational and modelling advantages due to inherent parallelism. However, there currently remains a performance gap relative to more expressive stochastic RNN variants, especially those with several layers of dependent random variables. In this work, we propose stochastic temporal convolutional networks (STCNs), a novel architecture that combines the computational advantages of temporal convolutional networks (TCNs) with the representational power and robustness of stochastic latent spaces. In particular, we propose a hierarchy of stochastic latent variables that captures temporal dependencies at different time-scales. The architecture is modular and flexible due to the decoupling of the deterministic and stochastic layers. We show that the proposed architecture achieves state-of-the-art log-likelihoods across several tasks. Finally, the model is capable of predicting high-quality synthetic samples over a long temporal horizon when modelling handwritten text.
Tasks
Published 2019-02-18
URL http://arxiv.org/abs/1902.06568v1
PDF http://arxiv.org/pdf/1902.06568v1.pdf
PWC https://paperswithcode.com/paper/stcn-stochastic-temporal-convolutional
Repo https://github.com/emreaksan/stcn
Framework tf
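
A toy sketch of the core architectural idea, not the authors' network: dilated causal convolutions whose layer outputs each parameterize a Gaussian latent, yielding latent variables at multiple time-scales (shapes and sizes are illustrative):

```python
import torch
import torch.nn as nn

class TinySTCN(nn.Module):
    def __init__(self, ch=32, levels=3, z_dim=8):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(ch, ch, kernel_size=2, dilation=2**i) for i in range(levels))
        self.heads = nn.ModuleList(nn.Conv1d(ch, 2 * z_dim, 1) for _ in range(levels))

    def forward(self, x):                       # x: (B, ch, T)
        zs, h = [], x
        for conv, head in zip(self.convs, self.heads):
            h = torch.relu(conv(nn.functional.pad(h, (conv.dilation[0], 0))))  # causal pad
            mu, logvar = head(h).chunk(2, dim=1)
            zs.append(mu + torch.randn_like(mu) * (0.5 * logvar).exp())  # reparameterize
        return zs                               # one latent sequence per time-scale
```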

Intensity-Free Learning of Temporal Point Processes

Title Intensity-Free Learning of Temporal Point Processes
Authors Oleksandr Shchur, Marin Biloš, Stephan Günnemann
Abstract Temporal point processes are the dominant paradigm for modeling sequences of events happening at irregular intervals. The standard way of learning in such models is by estimating the conditional intensity function. However, parameterizing the intensity function usually incurs several trade-offs. We show how to overcome the limitations of intensity-based approaches by directly modeling the conditional distribution of inter-event times. We draw on the literature on normalizing flows to design models that are flexible and efficient. We additionally propose a simple mixture model that matches the flexibility of flow-based models, but also permits sampling and computing moments in closed form. The proposed models achieve state-of-the-art performance in standard prediction tasks and are suitable for novel applications, such as learning sequence embeddings and imputing missing data.
Tasks Point Processes
Published 2019-09-26
URL https://arxiv.org/abs/1909.12127v2
PDF https://arxiv.org/pdf/1909.12127v2.pdf
PWC https://paperswithcode.com/paper/intensity-free-learning-of-temporal-point
Repo https://github.com/shchur/ifl-tpp
Framework pytorch
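
The closed-form mixture density is easy to write down; below is a hedged sketch of a log-normal mixture log-likelihood over inter-event times (parameter shapes are my assumption, not the authors' code):

```python
import math
import torch

def lognormal_mixture_logpdf(tau, weights, means, scales):
    """tau: (N,) positive inter-event times; weights: (K,) mixture weights
    summing to 1; means, scales: (K,) parameters of log(tau)."""
    log_tau = torch.log(tau).unsqueeze(-1)                        # (N, 1)
    comp = (-0.5 * ((log_tau - means) / scales) ** 2
            - torch.log(scales)
            - 0.5 * math.log(2 * math.pi)
            - log_tau)                                            # change of variables to tau
    return torch.logsumexp(torch.log(weights) + comp, dim=-1)     # (N,) mixture log-density
```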

Multilingual Neural Machine Translation With Soft Decoupled Encoding

Title Multilingual Neural Machine Translation With Soft Decoupled Encoding
Authors Xinyi Wang, Hieu Pham, Philip Arthur, Graham Neubig
Abstract Multilingual training of neural machine translation (NMT) systems has led to impressive accuracy improvements on low-resource languages. However, there are still significant challenges in efficiently learning word representations in the face of a paucity of data. In this paper, we propose Soft Decoupled Encoding (SDE), a multilingual lexicon encoding framework specifically designed to share lexical-level information intelligently without requiring heuristic preprocessing such as pre-segmenting the data. SDE represents a word by its spelling through a character encoding, and by its semantic meaning through a latent embedding space shared by all languages. Experiments on a standard dataset of four low-resource languages show consistent improvements over strong multilingual NMT baselines, with gains of up to 2 BLEU on one of the tested languages, setting a new state of the art on all four language pairs.
Tasks Machine Translation
Published 2019-02-09
URL http://arxiv.org/abs/1902.03499v1
PDF http://arxiv.org/pdf/1902.03499v1.pdf
PWC https://paperswithcode.com/paper/multilingual-neural-machine-translation-with-1
Repo https://github.com/cindyxinyiwang/SDE
Framework pytorch
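
A rough sketch of SDE's two-part word representation, with my own simplifications (the sizes, the residual combination, and the plain softmax attention are illustrative choices, not the paper's exact design):

```python
import torch
import torch.nn as nn

class SDEWordEmbed(nn.Module):
    def __init__(self, n_ngrams=10000, d=256, n_latent=512):
        super().__init__()
        self.ngram_embed = nn.EmbeddingBag(n_ngrams, d, mode="sum")  # spelling encoding
        self.latent = nn.Parameter(torch.randn(n_latent, d))         # shared semantic space

    def forward(self, ngram_ids, offsets):
        c = self.ngram_embed(ngram_ids, offsets)         # (W, d) character encoding
        attn = torch.softmax(c @ self.latent.t(), -1)    # attend over the latent space
        return c + attn @ self.latent                    # spelling + shared semantics
```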

Facial Expression Recognition Research Based on Deep Learning

Title Facial Expression Recognition Research Based on Deep Learning
Authors Yongpei Zhu, Hongwei Fan, Kehong Yuan
Abstract With the development of deep learning, convolutional neural networks have become increasingly complex and object recognition performance has improved accordingly. However, the classification mechanism of convolutional neural networks remains an unsolved core problem, mainly because their large number of parameters makes them difficult to analyze. In this paper, we design and train a convolutional neural network for expression recognition and explore its classification mechanism. Using deconvolution-based visualization, we project the network’s extremum points back into the pixel space of the original image and qualitatively verify that the trained expression recognition network forms detectors for specific facial action units. At the same time, we design a distance function that measures the distance between the presence of a facial feature unit and the maximal response on a feature map of the network; the greater the distance, the more sensitive the feature map is to that facial feature unit. By comparing the maximal distances of all facial feature elements across feature maps, we determine the mapping between facial feature elements and the feature maps of the network. We thereby verify that, during training, the convolutional neural network forms detectors for facial action units in order to perform expression recognition.
Tasks Facial Expression Recognition, Object Recognition
Published 2019-04-22
URL https://arxiv.org/abs/1904.09737v3
PDF https://arxiv.org/pdf/1904.09737v3.pdf
PWC https://paperswithcode.com/paper/facial-expression-recognition-research-based
Repo https://github.com/fulviomascara/pytorch-cv
Framework pytorch
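
One way to read the described distance measure, reconstructed under my own assumptions about coordinates and scaling:

```python
import numpy as np

def au_response_distance(feature_map, au_xy, stride):
    """feature_map: (H, W) activation map; au_xy: (x, y) facial action unit
    position in pixels; stride: feature-map-to-image scale factor."""
    iy, ix = np.unravel_index(np.argmax(feature_map), feature_map.shape)
    peak_xy = np.array([ix, iy]) * stride           # map peak back to pixel space
    return float(np.linalg.norm(peak_xy - np.asarray(au_xy)))
```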

SuperGlue: Learning Feature Matching with Graph Neural Networks

Title SuperGlue: Learning Feature Matching with Graph Neural Networks
Authors Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich
Abstract This paper introduces SuperGlue, a neural network that matches two sets of local features by jointly finding correspondences and rejecting non-matchable points. Assignments are estimated by solving a differentiable optimal transport problem, whose costs are predicted by a graph neural network. We introduce a flexible context aggregation mechanism based on attention, enabling SuperGlue to reason about the underlying 3D scene and feature assignments jointly. Compared to traditional, hand-designed heuristics, our technique learns priors over geometric transformations and regularities of the 3D world through end-to-end training from image pairs. SuperGlue outperforms other learned approaches and achieves state-of-the-art results on the task of pose estimation in challenging real-world indoor and outdoor environments. The proposed method performs matching in real-time on a modern GPU and can be readily integrated into modern SfM or SLAM systems. The code and trained weights are publicly available at https://github.com/magicleap/SuperGluePretrainedNetwork.
Tasks Pose Estimation
Published 2019-11-26
URL https://arxiv.org/abs/1911.11763v2
PDF https://arxiv.org/pdf/1911.11763v2.pdf
PWC https://paperswithcode.com/paper/superglue-learning-feature-matching-with
Repo https://github.com/magicleap/SuperGluePretrainedNetwork
Framework pytorch
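
The differentiable optimal-transport step can be sketched as log-domain Sinkhorn normalization of the predicted score matrix; SuperGlue additionally appends a "dustbin" row and column for unmatchable points, which is omitted in this sketch:

```python
import torch

def log_sinkhorn(scores, iters=20):
    """scores: (M, N) matching scores predicted by the graph neural network."""
    log_P = scores.clone()
    for _ in range(iters):
        log_P = log_P - torch.logsumexp(log_P, dim=1, keepdim=True)  # row normalize
        log_P = log_P - torch.logsumexp(log_P, dim=0, keepdim=True)  # column normalize
    return log_P.exp()   # soft assignment matrix; rows/columns approximately sum to 1
```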

Neural Style Transfer Improves 3D Cardiovascular MR Image Segmentation on Inconsistent Data

Title Neural Style Transfer Improves 3D Cardiovascular MR Image Segmentation on Inconsistent Data
Authors Chunwei Ma, Zhanghexuan Ji, Mingchen Gao
Abstract Three-dimensional medical image segmentation is one of the most important problems in medical image analysis and plays a key role in downstream diagnosis and treatment. In recent years, deep neural networks have achieved groundbreaking success on medical image segmentation problems. However, due to the high variance in instrumental parameters, experimental protocols, and subject appearances, the generalization of deep learning models is often hindered by the inconsistency of medical images generated by different machines and hospitals. In this work, we present StyleSegor, an efficient and easy-to-use strategy to alleviate this inconsistency issue. Specifically, a neural style transfer algorithm is applied to unlabeled data in order to minimize the differences in image properties, such as brightness, contrast, and texture, between the labeled and unlabeled data. We also apply a probabilistic adjustment to the network output and integrate multiple predictions through ensemble learning. On a publicly available whole-heart segmentation benchmark dataset from the MICCAI HVSMR 2016 challenge, we demonstrate an elevated Dice accuracy surpassing the current state-of-the-art method and, notably, an improvement of the total score by 29.91%. StyleSegor is thus shown to be an accurate tool for 3D whole-heart segmentation, especially on highly inconsistent data, and is available at https://github.com/horsepurve/StyleSegor.
Tasks Medical Image Segmentation, Semantic Segmentation, Style Transfer
Published 2019-09-20
URL https://arxiv.org/abs/1909.09716v1
PDF https://arxiv.org/pdf/1909.09716v1.pdf
PWC https://paperswithcode.com/paper/190909716
Repo https://github.com/horsepurve/StyleSegor
Framework mxnet
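
The ensembling step can be sketched as averaging per-class probability volumes across differently styled runs (the paper's probabilistic adjustment is not reproduced here):

```python
import numpy as np

def ensemble_segmentation(prob_maps):
    """prob_maps: list of (C, D, H, W) per-class probability volumes,
    one per styled/augmented prediction run."""
    mean_probs = np.mean(np.stack(prob_maps), axis=0)
    return np.argmax(mean_probs, axis=0)    # (D, H, W) label volume
```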

Conv-MCD: A Plug-and-Play Multi-task Module for Medical Image Segmentation

Title Conv-MCD: A Plug-and-Play Multi-task Module for Medical Image Segmentation
Authors Balamurali Murugesan, Kaushik Sarveswaran, Sharath M Shankaranarayana, Keerthi Ram, Jayaraj Joseph, Mohanasankar Sivaprakasam
Abstract For the task of medical image segmentation, fully convolutional network (FCN) based architectures have been extensively used with various modifications. A rising trend in these architectures is to employ joint learning of the target region with an auxiliary task, a method commonly known as multi-task learning. These approaches help impose smoothness and shape priors, which vanilla FCN approaches do not necessarily incorporate. In this paper, we propose a novel plug-and-play module, which we term Conv-MCD, that exploits structural information in two ways: (i) using the contour map and (ii) using the distance map, both of which can be obtained from ground-truth segmentation maps with no additional annotation costs. The key benefit of our module is the ease of its addition to any state-of-the-art architecture, resulting in a significant improvement in performance with a minimal increase in parameters. To substantiate this claim, we conduct extensive experiments using 4 state-of-the-art architectures across various evaluation metrics and report a significant increase in performance relative to the base networks. In addition to these experiments, we also perform ablation studies and visualize feature maps to further elucidate our approach.
Tasks Medical Image Segmentation, Multi-Task Learning, Semantic Segmentation
Published 2019-08-14
URL https://arxiv.org/abs/1908.05311v1
PDF https://arxiv.org/pdf/1908.05311v1.pdf
PWC https://paperswithcode.com/paper/conv-mcd-a-plug-and-play-multi-task-module
Repo https://github.com/Bala93/Multi-task-deep-network
Framework pytorch
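
Both auxiliary targets can be derived from a binary ground-truth mask at no extra annotation cost; the sketch below uses standard morphology and a Euclidean distance transform, and the exact definitions may differ from the paper's:

```python
import numpy as np
from scipy import ndimage

def contour_and_distance_maps(mask):
    """mask: (H, W) binary ground-truth segmentation map."""
    mask = mask.astype(bool)
    contour = np.logical_xor(mask, ndimage.binary_erosion(mask))  # boundary pixels
    dist = ndimage.distance_transform_edt(mask)   # Euclidean distance inside the object
    return contour.astype(np.uint8), dist
```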

CrossTrainer: Practical Domain Adaptation with Loss Reweighting

Title CrossTrainer: Practical Domain Adaptation with Loss Reweighting
Authors Justin Chen, Edward Gan, Kexin Rong, Sahaana Suri, Peter Bailis
Abstract Domain adaptation provides a powerful set of model training techniques given domain-specific training data and supplemental data with unknown relevance. The techniques are useful when users need to develop models with data from varying sources, of varying quality, or from different time ranges. We build CrossTrainer, a system for practical domain adaptation. CrossTrainer utilizes loss reweighting, which provides consistently high model accuracy across a variety of datasets in our empirical analysis. However, loss reweighting is sensitive to the choice of a weight hyperparameter that is expensive to tune. We develop optimizations leveraging unique properties of loss reweighting that allow CrossTrainer to output accurate models while improving training time compared to naive hyperparameter search.
Tasks Domain Adaptation
Published 2019-05-07
URL https://arxiv.org/abs/1905.02304v1
PDF https://arxiv.org/pdf/1905.02304v1.pdf
PWC https://paperswithcode.com/paper/crosstrainer-practical-domain-adaptation-with
Repo https://github.com/stanford-futuredata/crosstrainer
Framework none
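
The loss-reweighting objective itself is a one-liner; the sketch below shows the convex combination controlled by the weight hyperparameter w, without the system's tuning optimizations:

```python
import torch

def reweighted_loss(loss_target, loss_source, w):
    """loss_target, loss_source: mean losses on the target and supplemental
    (source) sets; w in [0, 1] is the hyperparameter CrossTrainer tunes."""
    return w * loss_target + (1.0 - w) * loss_source
```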

Separating the EoR Signal with a Convolutional Denoising Autoencoder: A Deep-learning-based Method

Title Separating the EoR Signal with a Convolutional Denoising Autoencoder: A Deep-learning-based Method
Authors Weitian Li, Haiguang Xu, Zhixian Ma, Ruimin Zhu, Dan Hu, Zhenghao Zhu, Junhua Gu, Chenxi Shan, Jie Zhu, Xiang-Ping Wu
Abstract When applying the foreground removal methods to uncover the faint cosmological signal from the epoch of reionization (EoR), the foreground spectra are assumed to be smooth. However, this assumption can be seriously violated in practice since the unresolved or mis-subtracted foreground sources, which are further complicated by the frequency-dependent beam effects of interferometers, will generate significant fluctuations along the frequency dimension. To address this issue, we propose a novel deep-learning-based method that uses a 9-layer convolutional denoising autoencoder (CDAE) to separate the EoR signal. After being trained on the SKA images simulated with realistic beam effects, the CDAE achieves excellent performance as the mean correlation coefficient ($\bar{\rho}$) between the reconstructed and input EoR signals reaches $0.929 \pm 0.045$. In comparison, the two representative traditional methods, namely the polynomial fitting method and the continuous wavelet transform method, both have difficulties in modelling and removing the foreground emission complicated by the beam effects, yielding only $\bar{\rho}_{\text{poly}} = 0.296 \pm 0.121$ and $\bar{\rho}_{\text{cwt}} = 0.198 \pm 0.160$, respectively. We conclude that, by hierarchically learning sophisticated features through multiple convolutional layers, the CDAE is a powerful tool that can be used to overcome the complicated beam effects and accurately separate the EoR signal. Our results also exhibit the great potential of deep-learning-based methods in future EoR experiments.
Tasks Denoising
Published 2019-02-25
URL http://arxiv.org/abs/1902.09278v2
PDF http://arxiv.org/pdf/1902.09278v2.pdf
PWC https://paperswithcode.com/paper/separating-the-eor-signal-with-a
Repo https://github.com/liweitianux/paper-eor-detection
Framework none
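
A toy 1D convolutional denoising autoencoder along the frequency axis, sketching the CDAE idea (the paper's network has 9 layers and operates on simulated SKA images; this is not it):

```python
import torch
import torch.nn as nn

class TinyCDAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv1d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 32, 3, padding=1), nn.ReLU())
        self.decode = nn.Sequential(
            nn.Conv1d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 1, 3, padding=1))

    def forward(self, x):                    # x: (B, 1, n_freq) total-emission spectrum
        return self.decode(self.encode(x))   # reconstructed EoR signal
```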

Invert and Defend: Model-based Approximate Inversion of Generative Adversarial Networks for Secure Inference

Title Invert and Defend: Model-based Approximate Inversion of Generative Adversarial Networks for Secure Inference
Authors Wei-An Lin, Yogesh Balaji, Pouya Samangouei, Rama Chellappa
Abstract Inferring the latent variable generating a given test sample is a challenging problem in Generative Adversarial Networks (GANs). In this paper, we propose InvGAN - a novel framework for solving the inference problem in GANs, which involves training an encoder network capable of inverting a pre-trained generator network without access to any training data. Under mild assumptions, we theoretically show that using InvGAN, we can approximately invert the generations of any latent code of a trained GAN model. Furthermore, we empirically demonstrate the superiority of our inference scheme by quantitative and qualitative comparisons with other methods that perform a similar task. We also show the effectiveness of our framework in the problem of adversarial defenses where InvGAN can successfully be used as a projection-based defense mechanism. Additionally, we show how InvGAN can be used to implement reparameterization white-box attacks on projection-based defense mechanisms. Experimental validation on several benchmark datasets demonstrates the efficacy of our method in achieving improved performance on several white-box and black-box attacks. Our code is available at https://github.com/yogeshbalaji/InvGAN.
Tasks
Published 2019-11-23
URL https://arxiv.org/abs/1911.10291v1
PDF https://arxiv.org/pdf/1911.10291v1.pdf
PWC https://paperswithcode.com/paper/invert-and-defend-model-based-approximate
Repo https://github.com/yogeshbalaji/InvGAN
Framework tf
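
The data-free inversion objective as described can be sketched as follows: sample latents, generate images with the frozen pre-trained generator, and train the encoder to recover the latents (the names and the plain MSE loss are my assumptions):

```python
import torch

def invgan_step(G, E, optimizer, batch=64, z_dim=128, device="cpu"):
    """One training step: G is the frozen pre-trained generator, E the encoder."""
    z = torch.randn(batch, z_dim, device=device)
    with torch.no_grad():
        x = G(z)                           # generate samples; no training data needed
    loss = torch.mean((E(x) - z) ** 2)     # encoder learns to invert the generator
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```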

PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment

Title PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment
Authors Kaixin Wang, Jun Hao Liew, Yingtian Zou, Daquan Zhou, Jiashi Feng
Abstract Despite the great progress made by deep CNNs in image semantic segmentation, they typically require a large number of densely-annotated images for training and are difficult to generalize to unseen object categories. Few-shot segmentation has thus been developed to learn to perform segmentation from only a few annotated examples. In this paper, we tackle the challenging few-shot segmentation problem from a metric learning perspective and present PANet, a novel prototype alignment network to better utilize the information of the support set. Our PANet learns class-specific prototype representations from a few support images within an embedding space and then performs segmentation over the query images through matching each pixel to the learned prototypes. With non-parametric metric learning, PANet offers high-quality prototypes that are representative of each semantic class and meanwhile discriminative for different classes. Moreover, PANet introduces a prototype alignment regularization between support and query. With this, PANet fully exploits knowledge from the support and provides better generalization on few-shot segmentation. Significantly, our model achieves mIoU scores of 48.1% and 55.7% on PASCAL-5i for the 1-shot and 5-shot settings respectively, surpassing the state-of-the-art method by 1.8% and 8.6%.
Tasks Metric Learning, Semantic Segmentation
Published 2019-08-18
URL https://arxiv.org/abs/1908.06391v2
PDF https://arxiv.org/pdf/1908.06391v2.pdf
PWC https://paperswithcode.com/paper/panet-few-shot-image-semantic-segmentation
Repo https://github.com/kaixin96/PANet
Framework pytorch
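
The two core steps, masked average pooling into class prototypes and cosine matching of query features, can be sketched as below (the scaled softmax follows my reading of the paper; the prototype alignment regularization is omitted):

```python
import torch
import torch.nn.functional as F

def prototype(feats, mask):
    """feats: (C, H, W) support features; mask: (H, W) binary class mask."""
    return (feats * mask).sum(dim=(1, 2)) / (mask.sum() + 1e-6)   # (C,) masked average pool

def segment(query_feats, prototypes, scale=20.0):
    """query_feats: (C, H, W); prototypes: (K, C), one per class incl. background."""
    q = F.normalize(query_feats, dim=0)            # cosine similarity via normalization
    p = F.normalize(prototypes, dim=1)
    sim = torch.einsum("kc,chw->khw", p, q)        # (K, H, W) similarity maps
    return (scale * sim).softmax(dim=0)            # per-pixel class probabilities
```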