Paper Group AWR 385
An Alternative Surrogate Loss for PGD-based Adversarial Testing. Hetero-Modal Variational Encoder-Decoder for Joint Modality Completion and Segmentation. Noise-tolerant fair classification. Learning Resolution Parameters for Graph Clustering. STCN: Stochastic Temporal Convolutional Networks. Intensity-Free Learning of Temporal Point Processes. Mult …
An Alternative Surrogate Loss for PGD-based Adversarial Testing
Title | An Alternative Surrogate Loss for PGD-based Adversarial Testing |
Authors | Sven Gowal, Jonathan Uesato, Chongli Qin, Po-Sen Huang, Timothy Mann, Pushmeet Kohli |
Abstract | Adversarial testing methods based on Projected Gradient Descent (PGD) are widely used for searching for norm-bounded perturbations that cause the inputs of neural networks to be misclassified. This paper takes a deeper look at these methods and explains the effect of different hyperparameters (i.e., optimizer, step size and surrogate loss). We introduce the concept of MultiTargeted testing, which makes clever use of alternative surrogate losses, and explain when and how MultiTargeted is guaranteed to find optimal perturbations. Finally, we demonstrate that MultiTargeted outperforms more sophisticated methods and often requires fewer iterative steps than other variants of PGD found in the literature. Notably, MultiTargeted ranks first on MadryLab’s white-box MNIST and CIFAR-10 leaderboards, reducing the accuracy of their MNIST model to 88.36% (with $\ell_\infty$ perturbations of $\epsilon = 0.3$) and the accuracy of their CIFAR-10 model to 44.03% (at $\epsilon = 8/255$). MultiTargeted also ranks first on the TRADES leaderboard, reducing the accuracy of their CIFAR-10 model to 53.07% (with $\ell_\infty$ perturbations of $\epsilon = 0.031$). |
Tasks | |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09338v1 |
PDF | https://arxiv.org/pdf/1910.09338v1.pdf |
PWC | https://paperswithcode.com/paper/an-alternative-surrogate-loss-for-pgd-based |
Repo | https://github.com/yaodongyu/TRADES |
Framework | pytorch |
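The core mechanism the abstract describes — PGD driven by an alternative, per-target surrogate loss — can be pictured with a short sketch. This is a minimal reading of the idea, not the authors' implementation: `model` is assumed to return raw logits, inputs are assumed to lie in [0, 1], and the step size, iteration count and restart scheme are placeholders.

```python
import torch

def multitargeted_pgd(model, x, y, eps=0.3, step_size=0.01, steps=100, num_classes=10):
    """One PGD restart per candidate target class, maximizing z_target - z_true."""
    x_adv_best = x.clone()
    for target in range(num_classes):
        # Random start inside the l_inf ball of radius eps.
        x_adv = (x + eps * (2 * torch.rand_like(x) - 1)).clamp(0.0, 1.0)
        for _ in range(steps):
            x_adv = x_adv.detach().requires_grad_(True)
            logits = model(x_adv)
            tgt = torch.full_like(y, target)
            # Alternative surrogate loss: push the target logit above the true-class logit.
            loss = (logits.gather(1, tgt[:, None]) - logits.gather(1, y[:, None])).sum()
            grad, = torch.autograd.grad(loss, x_adv)
            x_adv = x_adv.detach() + step_size * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
        with torch.no_grad():
            fooled = model(x_adv).argmax(dim=1) != y
        x_adv_best[fooled] = x_adv[fooled]  # keep perturbations that flip the prediction
    return x_adv_best
```

Running one restart per candidate target class is what makes the surrogate "multi-targeted"; the paper additionally analyses when this search is guaranteed to find the optimal perturbation.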
Hetero-Modal Variational Encoder-Decoder for Joint Modality Completion and Segmentation
Title | Hetero-Modal Variational Encoder-Decoder for Joint Modality Completion and Segmentation |
Authors | Reuben Dorent, Samuel Joutard, Marc Modat, Sébastien Ourselin, Tom Vercauteren |
Abstract | We propose a new deep learning method for tumour segmentation when dealing with missing imaging modalities. Instead of producing one network for each possible subset of observed modalities or using arithmetic operations to combine feature maps, our hetero-modal variational 3D encoder-decoder independently embeds all observed modalities into a shared latent representation. Missing data and the tumour segmentation can then be generated from this embedding. In our scenario, the input is a random subset of modalities. We demonstrate that the optimisation problem can be seen as a mixture sampling. In addition to this, we introduce a new network architecture building upon both the 3D U-Net and the Multi-Modal Variational Auto-Encoder (MVAE). Finally, we evaluate our method on BraTS2018 using subsets of the imaging modalities as input. Our model outperforms the current state-of-the-art method for dealing with missing modalities and achieves similar performance to the subset-specific equivalent networks. |
Tasks | |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11150v1 |
PDF | https://arxiv.org/pdf/1907.11150v1.pdf |
PWC | https://paperswithcode.com/paper/hetero-modal-variational-encoder-decoder-for |
Repo | https://github.com/ReubenDo/U-HVED |
Framework | tf |
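The abstract's "shared latent representation" over an arbitrary subset of observed modalities is, in the MVAE line of work the paper builds on, typically obtained with a product of Gaussian experts: each modality contributes a Gaussian posterior, and the fused posterior is their precision-weighted combination. The NumPy sketch below shows that fusion step only; the latent dimension, the prior expert, and the exact fusion used in U-HVED are assumptions.

```python
import numpy as np

def poe_fusion(mus, logvars, include_prior=True):
    """Fuse per-modality Gaussian posteriors N(mu_i, var_i) into one Gaussian.

    mus, logvars: lists of (latent_dim,) arrays, one entry per *observed* modality.
    """
    precisions = [np.exp(-lv) for lv in logvars]            # 1 / var_i
    weighted_mus = [m * p for m, p in zip(mus, precisions)]
    if include_prior:                                        # standard-normal prior expert
        precisions.append(np.ones_like(precisions[0]))
        weighted_mus.append(np.zeros_like(weighted_mus[0]))
    total_precision = np.sum(precisions, axis=0)
    mu = np.sum(weighted_mus, axis=0) / total_precision
    var = 1.0 / total_precision
    return mu, var

# Fusing only two of four modalities (e.g. only T1 and FLAIR observed):
mu, var = poe_fusion([np.zeros(8), np.ones(8)], [np.zeros(8), np.zeros(8)])
```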
Noise-tolerant fair classification
Title | Noise-tolerant fair classification |
Authors | Alexandre Louis Lamy, Ziyuan Zhong, Aditya Krishna Menon, Nakul Verma |
Abstract | Fairness-aware learning involves designing algorithms that do not discriminate with respect to some sensitive feature (e.g., race or gender). Existing work on the problem operates under the assumption that the sensitive feature available in one’s training sample is perfectly reliable. This assumption may be violated in many real-world cases: for example, respondents to a survey may choose to conceal or obfuscate their group identity out of fear of potential discrimination. This poses the question of whether one can still learn fair classifiers given noisy sensitive features. In this paper, we answer the question in the affirmative: we show that if one measures fairness using the mean-difference score, and sensitive features are subject to noise from the mutually contaminated learning model, then owing to a simple identity we only need to change the desired fairness tolerance. The requisite tolerance can be estimated by leveraging existing noise-rate estimators from the label noise literature. We finally show that our procedure is empirically effective on two case studies involving sensitive feature censoring. |
Tasks | |
Published | 2019-01-30 |
URL | https://arxiv.org/abs/1901.10837v4 |
PDF | https://arxiv.org/pdf/1901.10837v4.pdf |
PWC | https://paperswithcode.com/paper/noise-tolerant-fair-classification |
Repo | https://github.com/AIasd/noise_fairlearn |
Framework | none |
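Taking the abstract at face value — under the mutually contaminated noise model the observed mean-difference score differs from the clean one only by a constant depending on the noise rates — the practical change reduces to rescaling the tolerance handed to an off-the-shelf fair learner. The sketch below assumes that constant is $(1 - \alpha - \beta)$; the exact identity and the noise-rate estimators are in the paper, so treat this purely as an illustration of "only the tolerance changes".

```python
def adjusted_tolerance(tau, alpha, beta):
    """Illustrative only: rescale the fairness tolerance for noisy sensitive features.

    tau: desired mean-difference tolerance on the clean sensitive feature.
    alpha, beta: assumed/estimated contamination rates of the two observed groups.
    """
    assert 0.0 <= alpha + beta < 1.0, "noise must leave the groups distinguishable"
    return tau * (1.0 - alpha - beta)

# e.g. tolerate a mean-difference of 0.05 with 10% contamination in each group:
print(adjusted_tolerance(0.05, 0.1, 0.1))  # ~0.04
```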
Learning Resolution Parameters for Graph Clustering
Title | Learning Resolution Parameters for Graph Clustering |
Authors | Nate Veldt, David F. Gleich, Anthony Wirth |
Abstract | Finding clusters of well-connected nodes in a graph is an extensively studied problem in graph-based data analysis. Because of its many applications, a large number of distinct graph clustering objective functions and algorithms have already been proposed and analyzed. To aid practitioners in determining the best clustering approach to use in different applications, we present new techniques for automatically learning how to set clustering resolution parameters. These parameters control the size and structure of communities that are formed by optimizing a generalized objective function. We begin by formalizing the notion of a parameter fitness function, which measures how well a fixed input clustering approximately solves a generalized clustering objective for a specific resolution parameter value. Under reasonable assumptions, which suit two key graph clustering applications, such a parameter fitness function can be efficiently minimized using a bisection-like method, yielding a resolution parameter that fits well with the example clustering. We view our framework as a type of single-shot hyperparameter tuning, as we are able to learn a good resolution parameter with just a single example. Our general approach can be applied to learn resolution parameters for both local and global graph clustering objectives. We demonstrate its utility in several experiments on real-world data where it is helpful to learn resolution parameters from a given example clustering. |
Tasks | Graph Clustering |
Published | 2019-03-12 |
URL | http://arxiv.org/abs/1903.05246v1 |
PDF | http://arxiv.org/pdf/1903.05246v1.pdf |
PWC | https://paperswithcode.com/paper/learning-resolution-parameters-for-graph |
Repo | https://github.com/nveldt/LearnResParams |
Framework | none |
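The "bisection-like method" in the abstract can be pictured with a generic sketch: given a monotone indicator of whether a candidate resolution parameter yields clusters coarser than the example clustering, bisect on it. The indicator, bounds, and monotonicity direction below are assumptions for illustration; the paper's actual procedure minimizes a parameter fitness function defined from a fixed example clustering and a generalized objective.

```python
def fit_resolution(too_coarse, lo=1e-4, hi=1e4, iters=50):
    """Bisection over a resolution parameter.

    too_coarse(lmbda) -> True if optimizing the clustering objective at resolution
    `lmbda` yields clusters coarser than the example clustering (assumed monotone).
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if too_coarse(mid):
            lo = mid   # assumed convention: larger resolution -> finer clusters
        else:
            hi = mid
    return 0.5 * (lo + hi)
```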
STCN: Stochastic Temporal Convolutional Networks
Title | STCN: Stochastic Temporal Convolutional Networks |
Authors | Emre Aksan, Otmar Hilliges |
Abstract | Convolutional architectures have recently been shown to be competitive on many sequence modelling tasks when compared to the de facto standard of recurrent neural networks (RNNs), while providing computational and modelling advantages due to inherent parallelism. However, there currently remains a performance gap to more expressive stochastic RNN variants, especially those with several layers of dependent random variables. In this work, we propose stochastic temporal convolutional networks (STCNs), a novel architecture that combines the computational advantages of temporal convolutional networks (TCN) with the representational power and robustness of stochastic latent spaces. In particular, we propose a hierarchy of stochastic latent variables that captures temporal dependencies at different time-scales. The architecture is modular and flexible due to the decoupling of the deterministic and stochastic layers. We show that the proposed architecture achieves state-of-the-art log-likelihoods across several tasks. Finally, the model is capable of predicting high-quality synthetic samples over a long-range temporal horizon when modelling handwritten text. |
Tasks | |
Published | 2019-02-18 |
URL | http://arxiv.org/abs/1902.06568v1 |
PDF | http://arxiv.org/pdf/1902.06568v1.pdf |
PWC | https://paperswithcode.com/paper/stcn-stochastic-temporal-convolutional |
Repo | https://github.com/emreaksan/stcn |
Framework | tf |
Intensity-Free Learning of Temporal Point Processes
Title | Intensity-Free Learning of Temporal Point Processes |
Authors | Oleksandr Shchur, Marin Biloš, Stephan Günnemann |
Abstract | Temporal point processes are the dominant paradigm for modeling sequences of events happening at irregular intervals. The standard way of learning in such models is by estimating the conditional intensity function. However, parameterizing the intensity function usually incurs several trade-offs. We show how to overcome the limitations of intensity-based approaches by directly modeling the conditional distribution of inter-event times. We draw on the literature on normalizing flows to design models that are flexible and efficient. We additionally propose a simple mixture model that matches the flexibility of flow-based models, but also permits sampling and computing moments in closed form. The proposed models achieve state-of-the-art performance in standard prediction tasks and are suitable for novel applications, such as learning sequence embeddings and imputing missing data. |
Tasks | Point Processes |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12127v2 |
PDF | https://arxiv.org/pdf/1909.12127v2.pdf |
PWC | https://paperswithcode.com/paper/intensity-free-learning-of-temporal-point |
Repo | https://github.com/shchur/ifl-tpp |
Framework | pytorch |
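The "simple mixture model" with closed-form sampling and moments is, in this line of work, a mixture over inter-event times — here sketched as a log-normal mixture, whose log-likelihood is what training maximizes. Tensor shapes and the conditioning network that would produce the mixture parameters from the event history are placeholders, not the authors' exact setup.

```python
import math
import torch
import torch.nn.functional as F

def lognormal_mixture_log_prob(tau, logits, means, log_scales):
    """tau: inter-event times, shape (B,). Other args: mixture params, shape (B, K)."""
    log_tau = torch.log(tau).unsqueeze(-1)                     # (B, 1)
    scales = log_scales.exp()
    # log N(log tau; mean_k, scale_k) - log tau  (change of variables to tau)
    comp = (-0.5 * ((log_tau - means) / scales) ** 2
            - log_scales - 0.5 * math.log(2 * math.pi) - log_tau)
    log_weights = F.log_softmax(logits, dim=-1)
    return torch.logsumexp(log_weights + comp, dim=-1)         # (B,)

# Training would minimize -lognormal_mixture_log_prob(tau, *history_encoder(h)).mean()
```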
Multilingual Neural Machine Translation With Soft Decoupled Encoding
Title | Multilingual Neural Machine Translation With Soft Decoupled Encoding |
Authors | Xinyi Wang, Hieu Pham, Philip Arthur, Graham Neubig |
Abstract | Multilingual training of neural machine translation (NMT) systems has led to impressive accuracy improvements on low-resource languages. However, there are still significant challenges in efficiently learning word representations in the face of paucity of data. In this paper, we propose Soft Decoupled Encoding (SDE), a multilingual lexicon encoding framework specifically designed to share lexical-level information intelligently without requiring heuristic preprocessing such as pre-segmenting the data. SDE represents a word by its spelling through a character encoding, and its semantic meaning through a latent embedding space shared by all languages. Experiments on a standard dataset of four low-resource languages show consistent improvements over strong multilingual NMT baselines, with gains of up to 2 BLEU on one of the tested languages, achieving the new state-of-the-art on all four language pairs. |
Tasks | Machine Translation |
Published | 2019-02-09 |
URL | http://arxiv.org/abs/1902.03499v1 |
PDF | http://arxiv.org/pdf/1902.03499v1.pdf |
PWC | https://paperswithcode.com/paper/multilingual-neural-machine-translation-with-1 |
Repo | https://github.com/cindyxinyiwang/SDE |
Framework | pytorch |
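A rough sketch of the two ingredients named in the abstract: spelling captured by a character(-n-gram) encoding, and semantics captured by attending over a latent embedding space shared across languages. Module sizes, the n-gram featurizer, and the way the two parts are combined are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftDecoupledEmbedding(nn.Module):
    def __init__(self, num_char_ngrams, dim, num_latent=10000):
        super().__init__()
        self.char_emb = nn.EmbeddingBag(num_char_ngrams, dim, mode="sum")  # spelling
        self.latent = nn.Parameter(torch.randn(num_latent, dim))           # shared semantics

    def forward(self, ngram_ids, offsets):
        lexical = torch.tanh(self.char_emb(ngram_ids, offsets))   # (num_words, dim)
        attn = F.softmax(lexical @ self.latent.t(), dim=-1)        # attend over latent space
        semantic = attn @ self.latent                              # (num_words, dim)
        return lexical + semantic   # combination is an assumption; the paper may differ
```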
Facial Expression Recognition Research Based on Deep Learning
Title | Facial Expression Recognition Research Based on Deep Learning |
Authors | Yongpei Zhu, Hongwei Fan, Kehong Yuan |
Abstract | With the development of deep learning, the structure of convolutional neural networks has become more and more complex and the performance of object recognition keeps improving. However, the classification mechanism of convolutional neural networks remains an unsolved core problem, chiefly because these networks have too many parameters, which makes them difficult to analyze. In this paper, we design and train a convolutional neural network for expression recognition, and explore the classification mechanism of the network. Using the deconvolution visualization method, the extremum points of the convolutional neural network are projected back to the pixel space of the original image, and we qualitatively verify that the trained expression-recognition network forms detectors for specific facial action units. At the same time, we design a distance function to measure the distance between the presence of a facial feature unit and the maximal response on a feature map of the convolutional neural network. The greater the distance, the more sensitive the feature map is to that facial feature unit. By comparing the maximum distance of all facial feature elements across feature maps, the mapping relationship between facial feature elements and convolutional neural network feature maps is determined. We thereby verify that, during training, the convolutional neural network forms detectors for facial action units in order to realize expression recognition. |
Tasks | Facial Expression Recognition, Object Recognition |
Published | 2019-04-22 |
URL | https://arxiv.org/abs/1904.09737v3 |
PDF | https://arxiv.org/pdf/1904.09737v3.pdf |
PWC | https://paperswithcode.com/paper/facial-expression-recognition-research-based |
Repo | https://github.com/fulviomascara/pytorch-cv |
Framework | pytorch |
SuperGlue: Learning Feature Matching with Graph Neural Networks
Title | SuperGlue: Learning Feature Matching with Graph Neural Networks |
Authors | Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich |
Abstract | This paper introduces SuperGlue, a neural network that matches two sets of local features by jointly finding correspondences and rejecting non-matchable points. Assignments are estimated by solving a differentiable optimal transport problem, whose costs are predicted by a graph neural network. We introduce a flexible context aggregation mechanism based on attention, enabling SuperGlue to reason about the underlying 3D scene and feature assignments jointly. Compared to traditional, hand-designed heuristics, our technique learns priors over geometric transformations and regularities of the 3D world through end-to-end training from image pairs. SuperGlue outperforms other learned approaches and achieves state-of-the-art results on the task of pose estimation in challenging real-world indoor and outdoor environments. The proposed method performs matching in real-time on a modern GPU and can be readily integrated into modern SfM or SLAM systems. The code and trained weights are publicly available at https://github.com/magicleap/SuperGluePretrainedNetwork. |
Tasks | Pose Estimation |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11763v2 |
PDF | https://arxiv.org/pdf/1911.11763v2.pdf |
PWC | https://paperswithcode.com/paper/superglue-learning-feature-matching-with |
Repo | https://github.com/magicleap/SuperGluePretrainedNetwork |
Framework | pytorch |
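The "differentiable optimal transport problem" in the abstract is solved with Sinkhorn-style normalization of a pairwise score matrix. A minimal sketch follows, omitting SuperGlue's learned "dustbin" row/column for unmatchable points; the score matrix, iteration count and match threshold are illustrative.

```python
import torch

def sinkhorn(scores, iters=100):
    """scores: (M, N) pairwise matching scores. Returns a soft assignment matrix."""
    log_p = scores
    for _ in range(iters):
        log_p = log_p - torch.logsumexp(log_p, dim=1, keepdim=True)  # normalize rows
        log_p = log_p - torch.logsumexp(log_p, dim=0, keepdim=True)  # normalize columns
    return log_p.exp()

def extract_matches(P, threshold=0.2):
    """Mutual-nearest-neighbour extraction of hard matches from the soft assignment."""
    rows = P.argmax(dim=1)
    cols = P.argmax(dim=0)
    return [(i, j) for i, j in enumerate(rows.tolist())
            if cols[j].item() == i and P[i, j].item() > threshold]
```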
Neural Style Transfer Improves 3D Cardiovascular MR Image Segmentation on Inconsistent Data
Title | Neural Style Transfer Improves 3D Cardiovascular MR Image Segmentation on Inconsistent Data |
Authors | Chunwei Ma, Zhanghexuan Ji, Mingchen Gao |
Abstract | Three-dimensional medical image segmentation is one of the most important problems in medical image analysis and plays a key role in downstream diagnosis and treatment. In recent years, deep neural networks have achieved groundbreaking success in medical image segmentation. However, due to the high variance in instrumental parameters, experimental protocols, and subject appearances, the generalization of deep learning models is often hindered by the inconsistency in medical images generated by different machines and hospitals. In this work, we present StyleSegor, an efficient and easy-to-use strategy to alleviate this inconsistency issue. Specifically, a neural style transfer algorithm is applied to unlabeled data in order to minimize the differences in image properties, including brightness, contrast, and texture, between the labeled and unlabeled data. We also apply probabilistic adjustment to the network output and integrate multiple predictions through ensemble learning. On a publicly available whole heart segmentation benchmarking dataset from the MICCAI HVSMR 2016 challenge, we demonstrate an elevated Dice accuracy surpassing the current state-of-the-art method and, notably, an improvement of the total score by 29.91%. StyleSegor is thus corroborated as an accurate tool for 3D whole heart segmentation, especially on highly inconsistent data, and is available at https://github.com/horsepurve/StyleSegor. |
Tasks | Medical Image Segmentation, Semantic Segmentation, Style Transfer |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09716v1 |
PDF | https://arxiv.org/pdf/1909.09716v1.pdf |
PWC | https://paperswithcode.com/paper/190909716 |
Repo | https://github.com/horsepurve/StyleSegor |
Framework | mxnet |
Conv-MCD: A Plug-and-Play Multi-task Module for Medical Image Segmentation
Title | Conv-MCD: A Plug-and-Play Multi-task Module for Medical Image Segmentation |
Authors | Balamurali Murugesan, Kaushik Sarveswaran, Sharath M Shankaranarayana, Keerthi Ram, Jayaraj Joseph, Mohanasankar Sivaprakasam |
Abstract | For the task of medical image segmentation, fully convolutional network (FCN) based architectures have been extensively used with various modifications. A rising trend in these architectures is to employ joint learning of the target region with an auxiliary task, a method commonly known as multi-task learning. These approaches help impose smoothness and shape priors, which vanilla FCN approaches do not necessarily incorporate. In this paper, we propose a novel plug-and-play module, which we term Conv-MCD, that exploits structural information in two ways - i) using the contour map and ii) using the distance map, both of which can be obtained from ground truth segmentation maps with no additional annotation costs. The key benefit of our module is the ease of its addition to any state-of-the-art architecture, resulting in a significant improvement in performance with a minimal increase in parameters. To substantiate the above claim, we conduct extensive experiments using 4 state-of-the-art architectures across various evaluation metrics, and report a significant increase in performance in relation to the base networks. In addition to the aforementioned experiments, we also perform ablation studies and visualization of feature maps to further elucidate our approach. |
Tasks | Medical Image Segmentation, Multi-Task Learning, Semantic Segmentation |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.05311v1 |
PDF | https://arxiv.org/pdf/1908.05311v1.pdf |
PWC | https://paperswithcode.com/paper/conv-mcd-a-plug-and-play-multi-task-module |
Repo | https://github.com/Bala93/Multi-task-deep-network |
Framework | pytorch |
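Both auxiliary targets named in the abstract — the contour map and the distance map — can be derived from a ground-truth mask with standard image operations, which is why they add no annotation cost. A sketch using SciPy; the one-pixel boundary definition and the distance normalization are assumptions rather than the authors' exact preprocessing.

```python
import numpy as np
from scipy import ndimage

def contour_and_distance_maps(mask):
    """mask: (H, W) binary ground-truth segmentation map."""
    mask = mask.astype(bool)
    eroded = ndimage.binary_erosion(mask)
    contour = mask & ~eroded                              # one-pixel object boundary
    dist = ndimage.distance_transform_edt(mask)            # distance to the background
    if dist.max() > 0:
        dist = dist / dist.max()                           # normalize to [0, 1]
    return contour.astype(np.float32), dist.astype(np.float32)
```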
CrossTrainer: Practical Domain Adaptation with Loss Reweighting
Title | CrossTrainer: Practical Domain Adaptation with Loss Reweighting |
Authors | Justin Chen, Edward Gan, Kexin Rong, Sahaana Suri, Peter Bailis |
Abstract | Domain adaptation provides a powerful set of model training techniques given domain-specific training data and supplemental data with unknown relevance. The techniques are useful when users need to develop models with data from varying sources, of varying quality, or from different time ranges. We build CrossTrainer, a system for practical domain adaptation. CrossTrainer utilizes loss reweighting, which provides consistently high model accuracy across a variety of datasets in our empirical analysis. However, loss reweighting is sensitive to the choice of a weight hyperparameter that is expensive to tune. We develop optimizations leveraging unique properties of loss reweighting that allow CrossTrainer to output accurate models while improving training time compared to naive hyperparameter search. |
Tasks | Domain Adaptation |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02304v1 |
PDF | https://arxiv.org/pdf/1905.02304v1.pdf |
PWC | https://paperswithcode.com/paper/crosstrainer-practical-domain-adaptation-with |
Repo | https://github.com/stanford-futuredata/crosstrainer |
Framework | none |
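Loss reweighting as described amounts to a convex combination of the target-domain and source-domain losses controlled by a single weight; a one-function sketch follows (not CrossTrainer's API). The system's contribution is making the search over this weight cheap, since model quality is sensitive to it.

```python
def reweighted_loss(loss_target, loss_source, alpha):
    """alpha in [0, 1]: weight on the (small, directly relevant) target-domain loss.

    loss_target / loss_source: average losses over the target and source datasets.
    """
    return alpha * loss_target + (1.0 - alpha) * loss_source
```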
Separating the EoR Signal with a Convolutional Denoising Autoencoder: A Deep-learning-based Method
Title | Separating the EoR Signal with a Convolutional Denoising Autoencoder: A Deep-learning-based Method |
Authors | Weitian Li, Haiguang Xu, Zhixian Ma, Ruimin Zhu, Dan Hu, Zhenghao Zhu, Junhua Gu, Chenxi Shan, Jie Zhu, Xiang-Ping Wu |
Abstract | When applying the foreground removal methods to uncover the faint cosmological signal from the epoch of reionization (EoR), the foreground spectra are assumed to be smooth. However, this assumption can be seriously violated in practice since the unresolved or mis-subtracted foreground sources, which are further complicated by the frequency-dependent beam effects of interferometers, will generate significant fluctuations along the frequency dimension. To address this issue, we propose a novel deep-learning-based method that uses a 9-layer convolutional denoising autoencoder (CDAE) to separate the EoR signal. After being trained on the SKA images simulated with realistic beam effects, the CDAE achieves excellent performance as the mean correlation coefficient ($\bar{\rho}$) between the reconstructed and input EoR signals reaches $0.929 \pm 0.045$. In comparison, the two representative traditional methods, namely the polynomial fitting method and the continuous wavelet transform method, both have difficulties in modelling and removing the foreground emission complicated with the beam effects, yielding only $\bar{\rho}_{\text{poly}} = 0.296 \pm 0.121$ and $\bar{\rho}_{\text{cwt}} = 0.198 \pm 0.160$, respectively. We conclude that, by hierarchically learning sophisticated features through multiple convolutional layers, the CDAE is a powerful tool that can be used to overcome the complicated beam effects and accurately separate the EoR signal. Our results also exhibit the great potential of deep-learning-based methods in future EoR experiments. |
Tasks | Denoising |
Published | 2019-02-25 |
URL | http://arxiv.org/abs/1902.09278v2 |
PDF | http://arxiv.org/pdf/1902.09278v2.pdf |
PWC | https://paperswithcode.com/paper/separating-the-eor-signal-with-a |
Repo | https://github.com/liweitianux/paper-eor-detection |
Framework | none |
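Schematically, the separation network is a convolutional denoising autoencoder applied along the frequency dimension of each sky pixel's spectrum; a compact PyTorch stand-in is below. Layer count, widths, and activations are placeholders rather than the paper's 9-layer configuration.

```python
import torch.nn as nn

# Input: (batch, 1, n_frequencies) total-emission spectra; training target: the EoR
# signal only, so the network learns to strip the smooth-but-corrupted foreground.
cdae = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=3, padding=1), nn.BatchNorm1d(32), nn.ELU(),
    nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.BatchNorm1d(64), nn.ELU(),
    nn.Conv1d(64, 32, kernel_size=3, padding=1), nn.BatchNorm1d(32), nn.ELU(),
    nn.Conv1d(32, 1, kernel_size=3, padding=1),
)
```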
Invert and Defend: Model-based Approximate Inversion of Generative Adversarial Networks for Secure Inference
Title | Invert and Defend: Model-based Approximate Inversion of Generative Adversarial Networks for Secure Inference |
Authors | Wei-An Lin, Yogesh Balaji, Pouya Samangouei, Rama Chellappa |
Abstract | Inferring the latent variable generating a given test sample is a challenging problem in Generative Adversarial Networks (GANs). In this paper, we propose InvGAN - a novel framework for solving the inference problem in GANs, which involves training an encoder network capable of inverting a pre-trained generator network without access to any training data. Under mild assumptions, we theoretically show that using InvGAN, we can approximately invert the generations of any latent code of a trained GAN model. Furthermore, we empirically demonstrate the superiority of our inference scheme by quantitative and qualitative comparisons with other methods that perform a similar task. We also show the effectiveness of our framework in the problem of adversarial defenses where InvGAN can successfully be used as a projection-based defense mechanism. Additionally, we show how InvGAN can be used to implement reparameterization white-box attacks on projection-based defense mechanisms. Experimental validation on several benchmark datasets demonstrates the efficacy of our method in achieving improved performance on several white-box and black-box attacks. Our code is available at https://github.com/yogeshbalaji/InvGAN. |
Tasks | |
Published | 2019-11-23 |
URL | https://arxiv.org/abs/1911.10291v1 |
PDF | https://arxiv.org/pdf/1911.10291v1.pdf |
PWC | https://paperswithcode.com/paper/invert-and-defend-model-based-approximate |
Repo | https://github.com/yogeshbalaji/InvGAN |
Framework | tf |
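The "inversion without access to any training data" in the abstract can be sketched directly: draw latents, generate images from the frozen pre-trained generator, and regress the encoder's output back onto the latents. The loss, optimizer, and network interfaces below are placeholders, not the authors' training recipe.

```python
import torch

def train_inverter(generator, encoder, latent_dim, steps=10000, batch_size=64, lr=1e-4):
    generator.eval()                                   # generator stays frozen
    opt = torch.optim.Adam(encoder.parameters(), lr=lr)
    for _ in range(steps):
        z = torch.randn(batch_size, latent_dim)
        with torch.no_grad():
            x = generator(z)                           # no real training data needed
        loss = torch.mean((encoder(x) - z) ** 2)       # pull E(G(z)) back to z
        opt.zero_grad()
        loss.backward()
        opt.step()
    return encoder
```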
PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment
Title | PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment |
Authors | Kaixin Wang, Jun Hao Liew, Yingtian Zou, Daquan Zhou, Jiashi Feng |
Abstract | Despite the great progress made by deep CNNs in image semantic segmentation, they typically require a large number of densely-annotated images for training and are difficult to generalize to unseen object categories. Few-shot segmentation has thus been developed to learn to perform segmentation from only a few annotated examples. In this paper, we tackle the challenging few-shot segmentation problem from a metric learning perspective and present PANet, a novel prototype alignment network to better utilize the information of the support set. Our PANet learns class-specific prototype representations from a few support images within an embedding space and then performs segmentation over the query images through matching each pixel to the learned prototypes. With non-parametric metric learning, PANet offers high-quality prototypes that are representative for each semantic class and meanwhile discriminative for different classes. Moreover, PANet introduces a prototype alignment regularization between support and query. With this, PANet fully exploits knowledge from the support and provides better generalization on few-shot segmentation. Significantly, our model achieves mIoU scores of 48.1% and 55.7% on PASCAL-5i for the 1-shot and 5-shot settings respectively, surpassing the state-of-the-art method by 1.8% and 8.6%. |
Tasks | Metric Learning, Semantic Segmentation |
Published | 2019-08-18 |
URL | https://arxiv.org/abs/1908.06391v2 |
PDF | https://arxiv.org/pdf/1908.06391v2.pdf |
PWC | https://paperswithcode.com/paper/panet-few-shot-image-semantic-segmentation |
Repo | https://github.com/kaixin96/PANet |
Framework | pytorch |
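The metric-learning core described in the abstract — prototypes obtained by masked average pooling over support features, and query pixels labelled by similarity to those prototypes — is easy to sketch. The alignment regularization and the training loop are omitted, and the cosine-similarity scale is an assumption.

```python
import torch
import torch.nn.functional as F

def masked_average_prototype(feat, mask):
    """feat: (C, H, W) support features; mask: (H, W) binary mask for one class."""
    mask = mask.float()
    return (feat * mask).sum(dim=(1, 2)) / (mask.sum() + 1e-6)   # (C,)

def segment_query(query_feat, prototypes, scale=20.0):
    """query_feat: (C, H, W); prototypes: (K, C). Returns (H, W) predicted labels."""
    q = F.normalize(query_feat, dim=0).permute(1, 2, 0)          # (H, W, C)
    p = F.normalize(prototypes, dim=1)                           # (K, C)
    logits = scale * torch.einsum('hwc,kc->hwk', q, p)           # cosine similarities
    return logits.argmax(dim=-1)
```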