October 20, 2019

3422 words 17 mins read

Paper Group AWR 188


Triple consistency loss for pairing distributions in GAN-based face synthesis. Batch Active Preference-Based Learning of Reward Functions. Disease Prediction using Graph Convolutional Networks: Application to Autism Spectrum Disorder and Alzheimer’s Disease. Structured Content Preservation for Unsupervised Text Style Transfer. AutoZOOM: Autoencoder …

Triple consistency loss for pairing distributions in GAN-based face synthesis

Title Triple consistency loss for pairing distributions in GAN-based face synthesis
Authors Enrique Sanchez, Michel Valstar
Abstract Generative Adversarial Networks have shown impressive results for the task of object translation, including face-to-face translation. A key component behind the success of recent approaches is the self-consistency loss, which encourages a network to recover the original input image when the output generated for a desired attribute is itself passed through the same network, but with the target attribute inverted. While the self-consistency loss yields photo-realistic results, it can be shown that the input and target domains, which are supposed to be close, in fact differ substantially. This is found empirically by observing that a network recovers the input image even when attributes other than the inversion of the original goal are set as the target. This prevents one from combining networks trained for different tasks, or from using a network to perform progressive forward passes. In this paper, we show empirical evidence of this effect, and propose a new loss to bridge the gap between the distributions of the input and target domains. This “triple consistency loss” aims to minimise the distance between the outputs generated by the network for different routes to the target, independently of any intermediate steps. To show its effectiveness, we incorporate the triple consistency loss into the training of a new landmark-guided face-to-face synthesis network in which, contrary to previous works, the generated images can simultaneously undergo a large transformation in both expression and pose. To the best of our knowledge, we are the first to tackle the problem of mismatching distributions in self-domain synthesis, and to propose “in-the-wild” landmark-guided synthesis. Code will be available at https://github.com/ESanchezLozano/GANnotation
Tasks Face Generation, Face to Face Translation
Published 2018-11-08
URL http://arxiv.org/abs/1811.03492v1
PDF http://arxiv.org/pdf/1811.03492v1.pdf
PWC https://paperswithcode.com/paper/triple-consistency-loss-for-pairing
Repo https://github.com/ESanchezLozano/GANnotation
Framework pytorch
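
The paper is light on notation here, so below is a minimal PyTorch sketch of what a triple consistency term can look like, assuming a generator `G(image, landmarks)` that synthesises the face at the pose/expression encoded by a landmark map. The function and argument names are illustrative, not taken from the GANnotation repository.

```python
import torch.nn.functional as F

def triple_consistency_loss(G, x, mid_lmks, tgt_lmks):
    # Direct route: map the input straight to the target landmarks.
    direct = G(x, tgt_lmks)
    # Indirect route: pass through an intermediate landmark configuration first.
    indirect = G(G(x, mid_lmks), tgt_lmks)
    # Both routes should land on the same image, whatever the path taken.
    return F.l1_loss(direct, indirect)
```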

Batch Active Preference-Based Learning of Reward Functions

Title Batch Active Preference-Based Learning of Reward Functions
Authors Erdem Bıyık, Dorsa Sadigh
Abstract Data generation and labeling are usually expensive parts of learning for robotics. While active learning methods are commonly used to tackle the former problem, preference-based learning is a concept that attempts to solve the latter by querying users with preference questions. In this paper, we develop a new algorithm, batch active preference-based learning, that enables efficient learning of reward functions using as few data samples as possible while still keeping query generation times short. We introduce several approximations to the batch active learning problem, and provide theoretical guarantees for the convergence of our algorithms. Finally, we present our experimental results for a variety of robotics tasks in simulation. Our results suggest that our batch active learning algorithm requires only a few queries that are computed in a short amount of time. We then showcase our algorithm in a study to learn human users’ preferences.
Tasks Active Learning
Published 2018-10-10
URL http://arxiv.org/abs/1810.04303v1
PDF http://arxiv.org/pdf/1810.04303v1.pdf
PWC https://paperswithcode.com/paper/batch-active-preference-based-learning-of
Repo https://github.com/Stanford-ILIAD/batch-active-preference-based-learning
Framework none
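
As a rough illustration of the preference-learning half (not the batch query-generation machinery), here is a hedged NumPy sketch: a linear reward w·φ is maintained as a set of posterior samples, and each answered query reweights them through a logistic likelihood. All names are hypothetical.

```python
import numpy as np

def preference_update(w_samples, phi_a, phi_b, prefers_a):
    """Crude importance resampling of reward samples after one query."""
    sign = 1.0 if prefers_a else -1.0
    # Logistic likelihood of the answer under each sampled reward vector.
    lik = 1.0 / (1.0 + np.exp(-sign * w_samples @ (phi_a - phi_b)))
    probs = lik / lik.sum()
    idx = np.random.choice(len(w_samples), size=len(w_samples), p=probs)
    return w_samples[idx]
```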

Disease Prediction using Graph Convolutional Networks: Application to Autism Spectrum Disorder and Alzheimer’s Disease

Title Disease Prediction using Graph Convolutional Networks: Application to Autism Spectrum Disorder and Alzheimer’s Disease
Authors Sarah Parisot, Sofia Ira Ktena, Enzo Ferrante, Matthew Lee, Ricardo Guerrero, Ben Glocker, Daniel Rueckert
Abstract Graphs are widely used as a natural framework that captures interactions between individual elements represented as nodes in a graph. In medical applications, specifically, nodes can represent individuals within a potentially large population (patients or healthy controls) accompanied by a set of features, while the graph edges incorporate associations between subjects in an intuitive manner. This representation allows one to incorporate the wealth of imaging and non-imaging information, as well as individual subject features, simultaneously in disease classification tasks. Previous graph-based approaches for supervised or unsupervised learning in the context of disease prediction either focus solely on pairwise similarities between subjects, disregarding individual characteristics and features, or rely on subject-specific imaging feature vectors and fail to model the interactions between them. In this paper, we present a thorough evaluation of a generic framework that leverages both imaging and non-imaging information and can be used for brain analysis in large populations. This framework exploits Graph Convolutional Networks (GCNs) and involves representing populations as a sparse graph, whose nodes are associated with imaging-based feature vectors, while phenotypic information is integrated as edge weights. The extensive evaluation explores the effect of each individual component of this framework on disease prediction performance and further compares it to different baselines. The framework performance is tested on two large datasets with diverse underlying data, ABIDE and ADNI, for the prediction of Autism Spectrum Disorder and conversion to Alzheimer’s disease, respectively. Our analysis shows that our novel framework can improve over state-of-the-art results on both databases, with 70.4% classification accuracy for ABIDE and 80.0% for ADNI.
Tasks Disease Prediction
Published 2018-06-05
URL http://arxiv.org/abs/1806.01738v1
PDF http://arxiv.org/pdf/1806.01738v1.pdf
PWC https://paperswithcode.com/paper/disease-prediction-using-graph-convolutional
Repo https://github.com/SSinyu/p
Framework tf
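
The key data structure is the population graph itself. Here is a hedged NumPy sketch of how one might assemble it, combining imaging similarity with agreement on non-imaging phenotypes; the paper's exact similarity measures differ and all names below are illustrative.

```python
import numpy as np

def population_adjacency(features, phenotypes):
    """features: list of per-subject imaging feature vectors.
    phenotypes: list of per-attribute arrays (e.g. sex, site) over subjects."""
    n = len(features)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Imaging similarity: correlation of the two feature vectors.
            sim = np.corrcoef(features[i], features[j])[0, 1]
            # Phenotypic term: how many non-imaging attributes agree.
            match = sum(p[i] == p[j] for p in phenotypes)
            W[i, j] = W[j, i] = max(sim, 0.0) * match
    return W
```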

Structured Content Preservation for Unsupervised Text Style Transfer

Title Structured Content Preservation for Unsupervised Text Style Transfer
Authors Youzhi Tian, Zhiting Hu, Zhou Yu
Abstract Text style transfer aims to modify the style of a sentence while keeping its content unchanged. Recent style transfer systems often fail to faithfully preserve the content after changing the style. This paper proposes a structured content-preserving model that leverages linguistic information in structured fine-grained supervisions to better preserve the style-independent content during style transfer. In particular, we achieve this goal by devising rich model objectives based on both the sentence’s lexical information and a language model that conditions on content. The resulting model is therefore encouraged to retain the semantic meaning of the target sentences. We perform extensive experiments comparing our model to other existing approaches on the tasks of sentiment and political slant transfer. Our model achieves significant improvements in both content preservation and style transfer in automatic and human evaluation.
Tasks Language Modelling, Style Transfer, Text Style Transfer
Published 2018-10-15
URL http://arxiv.org/abs/1810.06526v2
PDF http://arxiv.org/pdf/1810.06526v2.pdf
PWC https://paperswithcode.com/paper/structured-content-preservation-for
Repo https://github.com/YouzhiTian/Structured-Content-Preservation-for-Unsupervised-Text-Style-Transfer
Framework pytorch
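
One of the fine-grained supervisions is lexical. A toy version of such a signal (hypothetical names, not the paper's actual objective) simply measures how many style-independent source words survive the transfer:

```python
def content_preservation(src_tokens, out_tokens, style_words):
    """Fraction of non-style source words that appear in the output."""
    content = [w for w in src_tokens if w not in style_words]
    if not content:
        return 1.0
    kept = sum(w in set(out_tokens) for w in content)
    return kept / len(content)
```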

AutoZOOM: Autoencoder-based Zeroth Order Optimization Method for Attacking Black-box Neural Networks

Title AutoZOOM: Autoencoder-based Zeroth Order Optimization Method for Attacking Black-box Neural Networks
Authors Chun-Chen Tu, Paishun Ting, Pin-Yu Chen, Sijia Liu, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh, Shin-Ming Cheng
Abstract Recent studies have shown that adversarial examples in state-of-the-art image classifiers trained by deep neural networks (DNN) can be easily generated when the target model is transparent to an attacker, known as the white-box setting. However, when attacking a deployed machine learning service, one can only acquire the input-output correspondences of the target model; this is the so-called black-box attack setting. The major drawback of existing black-box attacks is the need for excessive model queries, which may give a false sense of model robustness due to inefficient query designs. To bridge this gap, we propose a generic framework for query-efficient black-box attacks. Our framework, AutoZOOM, which is short for Autoencoder-based Zeroth Order Optimization Method, has two novel building blocks towards efficient black-box attacks: (i) an adaptive random gradient estimation strategy to balance query counts and distortion, and (ii) an autoencoder that is either trained offline with unlabeled data or a bilinear resizing operation for attack acceleration. Experimental results suggest that, by applying AutoZOOM to a state-of-the-art black-box attack (ZOO), a significant reduction in model queries can be achieved without sacrificing the attack success rate and the visual quality of the resulting adversarial examples. In particular, when compared to the standard ZOO method, AutoZOOM can consistently reduce the mean query counts in finding successful adversarial examples (or reaching the same distortion level) by at least 93% on MNIST, CIFAR-10 and ImageNet datasets, leading to novel insights on adversarial robustness.
Tasks
Published 2018-05-30
URL https://arxiv.org/abs/1805.11770v5
PDF https://arxiv.org/pdf/1805.11770v5.pdf
PWC https://paperswithcode.com/paper/autozoom-autoencoder-based-zeroth-order
Repo https://github.com/psting/IBM-UM_Projects
Framework none
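
The gradient-estimation block is easy to sketch. Below is a hedged NumPy version of an averaged random-direction estimator of the gradient of a query-only loss f; AutoZOOM additionally adapts the number of directions q on the fly and applies the estimator in an autoencoder's latent space, both of which this sketch omits.

```python
import numpy as np

def zo_gradient(f, x, q=8, beta=0.01):
    """Averaged random-direction finite-difference gradient estimate."""
    fx, g = f(x), np.zeros_like(x)
    for _ in range(q):
        u = np.random.randn(*x.shape)
        u /= np.linalg.norm(u)                  # unit random direction
        g += (f(x + beta * u) - fx) / beta * u  # directional slope times u
    return (x.size / q) * g                     # dimension factor of the estimator
```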

MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare

Title MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare
Authors Edward Choi, Cao Xiao, Walter F. Stewart, Jimeng Sun
Abstract Deep learning models exhibit state-of-the-art performance for many predictive healthcare tasks using electronic health records (EHR) data, but these models typically require training data volume that exceeds the capacity of most healthcare systems. External resources such as medical ontologies are used to bridge the data volume constraint, but this approach is often not directly applicable or useful because of inconsistencies with terminology. To solve the data insufficiency challenge, we leverage the inherent multilevel structure of EHR data and, in particular, the encoded relationships among medical codes. We propose Multilevel Medical Embedding (MiME), which learns a multilevel embedding of EHR data while jointly performing auxiliary prediction tasks that rely on this inherent EHR structure, without the need for external labels. We conducted two prediction tasks, heart failure prediction and sequential disease prediction, where MiME outperformed baseline methods in diverse evaluation settings. In particular, MiME consistently outperformed all baselines when predicting heart failure on datasets of different volumes, with the greatest improvement (a 15% relative gain in PR-AUC over the best baseline) on the smallest dataset, demonstrating its ability to effectively model the multilevel structure of EHR data.
Tasks Disease Prediction
Published 2018-10-22
URL http://arxiv.org/abs/1810.09593v1
PDF http://arxiv.org/pdf/1810.09593v1.pdf
PWC https://paperswithcode.com/paper/mime-multilevel-medical-embedding-of
Repo https://github.com/mp2893/mime
Framework tf
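
In MiME's spirit, but much simplified and with invented names, a visit embedding can be built bottom-up: each diagnosis embedding is modulated by the treatments ordered under it, and the diagnosis-level vectors are then pooled into a visit vector. A hedged PyTorch sketch:

```python
import torch
import torch.nn as nn

class ToyVisitEncoder(nn.Module):
    def __init__(self, n_codes, dim):
        super().__init__()
        self.emb = nn.Embedding(n_codes, dim)
        self.mix = nn.Linear(dim, dim)

    def forward(self, dx_codes, rx_per_dx):
        """dx_codes: 1-D tensor of diagnosis code ids for one visit.
        rx_per_dx: list of 1-D tensors of treatment ids, one per diagnosis."""
        out = []
        for dx, rx in zip(dx_codes, rx_per_dx):
            dx_vec = self.emb(dx)
            # Treatments attached to this diagnosis modulate its embedding.
            rx_vec = self.emb(rx).mean(dim=0) if len(rx) else torch.zeros_like(dx_vec)
            out.append(dx_vec + torch.tanh(self.mix(dx_vec * rx_vec)))
        return torch.stack(out).sum(dim=0)  # visit = pooled dx-level vectors
```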

Robust Multiple Kernel k-means Clustering using Min-Max Optimization

Title Robust Multiple Kernel k-means Clustering using Min-Max Optimization
Authors Seojin Bang, Yaoliang Yu, Wei Wu
Abstract Multiple kernel learning is a type of multiview learning that combines different data modalities by capturing view-specific patterns using kernels. Although supervised multiple kernel learning has been extensively studied, until recently only a few unsupervised approaches had been proposed. Meanwhile, adversarial learning has received much attention, and many defenses against adversarial examples have been proposed. However, little is known about the effect of adversarial perturbation in the context of multiview learning, and even less in the unsupervised case. In this study, we show that adversarial features added to a view can make existing approaches with the min-max formulation in multiple kernel clustering yield unfavorable clusters. To address this problem, and inspired by recent works in adversarial learning, we propose a multiple kernel clustering method with a min-max framework that aims to be robust to such adversarial perturbations. We evaluate the robustness of our method on simulated data under different types of adversarial perturbations and show that it outperforms several existing methods. We also demonstrate the utility of our method on a real-world problem.
Tasks Disease Prediction, Multiview Learning
Published 2018-03-06
URL https://arxiv.org/abs/1803.02458v2
PDF https://arxiv.org/pdf/1803.02458v2.pdf
PWC https://paperswithcode.com/paper/multiple-kernel-k-means-clustering-using-min
Repo https://github.com/SeojinBang/MKKC
Framework none
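
The two moving parts are the combined kernel and the kernel k-means objective that the min-max optimisation alternates over. A hedged NumPy sketch of both (not the authors' implementation):

```python
import numpy as np

def combined_kernel(kernels, w):
    """Convex combination of view-specific Gram matrices; w on the simplex."""
    return sum(wk * Kk for wk, Kk in zip(w, kernels))

def kernel_kmeans_objective(K, labels):
    """Within-cluster scatter in feature space from the Gram matrix alone:
    trace(K) minus the per-cluster averaged block sums (standard identity)."""
    obj = np.trace(K)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        obj -= K[np.ix_(idx, idx)].sum() / len(idx)
    return obj
```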

DeRPN: Taking a further step toward more general object detection

Title DeRPN: Taking a further step toward more general object detection
Authors Lele Xie, Yuliang Liu, Lianwen Jin, Zecheng Xie
Abstract Most current detection methods adopt anchor boxes as regression references. However, detection performance is sensitive to the anchor-box settings, and a proper setting may vary significantly across datasets, which severely limits the universality of the detectors. To improve the adaptivity of detectors, in this paper we present a novel dimension-decomposition region proposal network (DeRPN) that can serve as a drop-in replacement for the traditional Region Proposal Network (RPN). DeRPN utilizes an anchor string mechanism to independently match object widths and heights, which makes it better suited to handling objects of varying shape. In addition, a novel scale-sensitive loss is designed to address the imbalanced loss computations of differently scaled objects, preventing small objects from being overwhelmed by larger ones. Comprehensive experiments conducted on both general object detection datasets (Pascal VOC 2007, 2012, and MS COCO) and scene text detection datasets (ICDAR 2013 and COCO-Text) show that DeRPN significantly outperforms RPN. It is worth mentioning that the proposed DeRPN can be employed directly on different models, tasks, and datasets without any modification of hyperparameters or specialized optimization, which further demonstrates its adaptivity. The code will be released at https://github.com/HCIILAB/DeRPN.
Tasks Object Detection, Scene Text Detection
Published 2018-11-16
URL http://arxiv.org/abs/1811.06700v1
PDF http://arxiv.org/pdf/1811.06700v1.pdf
PWC https://paperswithcode.com/paper/derpn-taking-a-further-step-toward-more
Repo https://github.com/HCIILAB/DeRPN
Framework none
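
The anchor-string mechanism is the part that miniaturises well: widths and heights are matched against a shared 1-D set of segments independently, so N segments cover the N² width-height combinations that joint anchor boxes would otherwise need. A hedged sketch (the segment values are illustrative, not DeRPN's actual setting):

```python
import numpy as np

def match_anchor_string(box_w, box_h, segments=(16, 32, 64, 128, 256, 512)):
    """Each dimension picks its own closest segment on a log scale."""
    s = np.asarray(segments, dtype=float)
    wi = int(np.argmin(np.abs(np.log(s) - np.log(box_w))))
    hi = int(np.argmin(np.abs(np.log(s) - np.log(box_h))))
    return s[wi], s[hi]
```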

A System for Massively Parallel Hyperparameter Tuning

Title A System for Massively Parallel Hyperparameter Tuning
Authors Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, Ameet Talwalkar
Abstract Modern learning models are characterized by large hyperparameter spaces and long training times. These properties, coupled with the rise of parallel computing and the growing demand to productionize machine learning workloads, motivate the need to develop mature hyperparameter optimization functionality in distributed computing settings. We address this challenge by first introducing a simple and robust hyperparameter optimization algorithm called ASHA, which exploits parallelism and aggressive early-stopping to tackle large-scale hyperparameter optimization problems. Our extensive empirical results show that ASHA outperforms existing state-of-the-art hyperparameter optimization methods; scales linearly with the number of workers in distributed settings; and is suitable for massive parallelism, as demonstrated on a task with 500 workers. We then describe several design decisions we encountered, along with our associated solutions, when integrating ASHA in Determined AI’s end-to-end production-quality machine learning system that offers hyperparameter tuning as a service.
Tasks Hyperparameter Optimization
Published 2018-10-13
URL https://arxiv.org/abs/1810.05934v5
PDF https://arxiv.org/pdf/1810.05934v5.pdf
PWC https://paperswithcode.com/paper/massively-parallel-hyperparameter-tuning
Repo https://github.com/c-bata/goptuna
Framework none
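
The heart of ASHA fits in a few lines. Below is a hedged sketch of the asynchronous promotion rule, where `sample_random_config` and the `.score`/`.promoted` fields are hypothetical stand-ins: scan the rungs from the top; if some configuration sits in the top 1/η of its rung and has not yet been promoted, continue training it at the next rung; otherwise start a fresh configuration at the bottom.

```python
def next_job(rungs, sample_random_config, eta=3):
    """rungs: dict mapping rung index -> list of finished configs (sketch)."""
    for r in sorted(rungs, reverse=True):
        done = sorted(rungs[r], key=lambda c: c.score, reverse=True)
        top = done[: len(done) // eta]        # top 1/eta of this rung so far
        promotable = [c for c in top if not c.promoted]
        if promotable:
            promotable[0].promoted = True
            return promotable[0], r + 1       # continue it at the next rung
    return sample_random_config(), 0          # nothing promotable: new config
```

Note that no promotion happens at a rung until it has at least η results, which is exactly what makes the rule safe to apply asynchronously.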

A Twofold Siamese Network for Real-Time Object Tracking

Title A Twofold Siamese Network for Real-Time Object Tracking
Authors Anfeng He, Chong Luo, Xinmei Tian, Wenjun Zeng
Abstract Observing that Semantic features learned in an image classification task and Appearance features learned in a similarity matching task complement each other, we build a twofold Siamese network, named SA-Siam, for real-time object tracking. SA-Siam is composed of a semantic branch and an appearance branch. Each branch is a similarity-learning Siamese network. An important design choice in SA-Siam is to separately train the two branches to keep the heterogeneity of the two types of features. In addition, we propose a channel attention mechanism for the semantic branch. Channel-wise weights are computed according to the channel activations around the target position. While the inherited architecture from SiamFC \cite{SiamFC} allows our tracker to operate beyond real-time, the twofold design and the attention mechanism significantly improve the tracking performance. The proposed SA-Siam outperforms all other real-time trackers by a large margin on OTB-2013/50/100 benchmarks.
Tasks Image Classification, Object Tracking
Published 2018-02-24
URL http://arxiv.org/abs/1802.08817v1
PDF http://arxiv.org/pdf/1802.08817v1.pdf
PWC https://paperswithcode.com/paper/a-twofold-siamese-network-for-real-time
Repo https://github.com/Microsoft/SA-Siam
Framework tf
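
The channel attention module lends itself to a short sketch: weight each semantic channel by how strongly it fires near the target. The real model max-pools a 3x3 grid around the target and feeds it through a small MLP; the hedged PyTorch version below keeps only the spirit.

```python
import torch

def channel_attention(feat, crop=6):
    """feat: (B, C, H, W) semantic-branch features; assumes H, W >= crop."""
    b, c, h, w = feat.shape
    top, left = (h - crop) // 2, (w - crop) // 2
    centre = feat[:, :, top:top + crop, left:left + crop]
    weights = torch.sigmoid(centre.amax(dim=(2, 3)))  # per-channel peak near target
    return feat * weights[:, :, None, None]           # reweighted channels
```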

One-shot Texture Segmentation

Title One-shot Texture Segmentation
Authors Ivan Ustyuzhaninov, Claudio Michaelis, Wieland Brendel, Matthias Bethge
Abstract We introduce one-shot texture segmentation: the task of segmenting an input image containing multiple textures given a patch of a reference texture. This task is designed to turn the problem of texture-based perceptual grouping into an objective benchmark. We show that it is straightforward to generate large synthetic datasets for this task from a relatively small number of natural textures. In particular, the task can be cast as a self-supervised problem, thereby alleviating the need for the massive amounts of manually annotated data required by traditional segmentation tasks. In this paper we introduce and study two concrete datasets: a dense collage of textures (CollTex) and a cluttered texturized Omniglot dataset. We show that a baseline model trained on these synthesized data is able to generalize to natural images and videos without further fine-tuning, suggesting that the learned image representations are useful for higher-level vision tasks.
Tasks Omniglot
Published 2018-07-07
URL http://arxiv.org/abs/1807.02654v1
PDF http://arxiv.org/pdf/1807.02654v1.pdf
PWC https://paperswithcode.com/paper/one-shot-texture-segmentation
Repo https://github.com/atch841/one-shot-texture-segmentation
Framework tf
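
The data-generation recipe is simple enough to sketch: tile a canvas with random crops of random textures and keep the per-pixel texture index as a free segmentation label. Sizes and counts below are arbitrary, not the paper's settings.

```python
import numpy as np

def texture_collage(textures, side=256, n_patches=8):
    """textures: list of HxWx3 uint8 arrays, each at least side x side."""
    canvas = np.zeros((side, side, 3), dtype=np.uint8)
    label = np.zeros((side, side), dtype=np.int64)
    for _ in range(n_patches):
        t = np.random.randint(len(textures))
        h, w = np.random.randint(32, 129, size=2)
        y, x = np.random.randint(side - h), np.random.randint(side - w)
        canvas[y:y + h, x:x + w] = textures[t][:h, :w]
        label[y:y + h, x:x + w] = t     # the per-pixel label comes for free
    return canvas, label
```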

Deep Unfolding of a Proximal Interior Point Method for Image Restoration

Title Deep Unfolding of a Proximal Interior Point Method for Image Restoration
Authors Carla Bertocchi, Emilie Chouzenoux, Marie-Caroline Corbineau, Jean-Christophe Pesquet, Marco Prato
Abstract Variational methods are widely applied to ill-posed inverse problems because they can embed prior knowledge about the solution. However, the performance of these methods depends significantly on a set of parameters, which can be estimated through computationally expensive and time-consuming procedures. In contrast, deep learning offers very generic and efficient architectures, at the expense of explainability, since it is often used as a black box without fine control over its output. Deep unfolding provides a convenient way to combine variational and deep learning approaches. Starting from a variational formulation for image restoration, we develop iRestNet, a neural network architecture obtained by unfolding a proximal interior point algorithm. Hard constraints, encoding desirable properties of the restored image, are incorporated into the network through a logarithmic barrier, while the barrier parameter, the stepsize, and the penalization weight are learned by the network. We derive explicit expressions for the gradient of the proximity operator for various choices of constraints, which allows training iRestNet with gradient descent and backpropagation. In addition, we provide theoretical results regarding the stability of the network for a common inverse problem example. Numerical experiments on image deblurring problems show that the proposed approach compares favorably with both state-of-the-art variational and machine learning methods in terms of image quality.
Tasks Deblurring, Image Restoration
Published 2018-12-11
URL https://arxiv.org/abs/1812.04276v5
PDF https://arxiv.org/pdf/1812.04276v5.pdf
PWC https://paperswithcode.com/paper/deep-unfolding-of-a-proximal-interior-point
Repo https://github.com/mccorbineau/iRestNet
Framework pytorch
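
The unfolding is concrete enough to show. For a positivity constraint, the logarithmic barrier φ(x) = -μ log x has a closed-form proximity operator, and one unrolled layer is a gradient step on the data term followed by that prox, with the stepsize γ and barrier parameter μ learned per layer. A hedged sketch, not the authors' derivation for their specific constraint sets:

```python
import torch

def prox_positivity_barrier(y, gamma_mu):
    """prox of gamma * (-mu * log x): the positive root of x^2 - y*x - gamma*mu = 0."""
    return 0.5 * (y + torch.sqrt(y * y + 4.0 * gamma_mu))

def unfolded_layer(x, grad_data_term, gamma, mu):
    """One proximal interior-point iteration viewed as a network layer."""
    return prox_positivity_barrier(x - gamma * grad_data_term(x), gamma * mu)
```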

Interactive Classification for Deep Learning Interpretation

Title Interactive Classification for Deep Learning Interpretation
Authors Ángel Alexander Cabrera, Fred Hohman, Jason Lin, Duen Horng Chau
Abstract We present an interactive system enabling users to manipulate images to explore the robustness and sensitivity of deep learning image classifiers. Using modern web technologies to run in-browser inference, users can remove image features using inpainting algorithms and obtain new classifications in real time, which allows them to ask a variety of “what if” questions by experimentally modifying images and seeing how the model reacts. Our system allows users to compare and contrast which image regions humans and machine learning models use for classification, revealing a wide range of surprising results, from spectacular failures (e.g., a “water bottle” image becomes a “concert” when a person is removed) to impressive resilience (e.g., a “baseball player” image remains correctly classified even without a glove or base). We demonstrated our system live at the 2018 Conference on Computer Vision and Pattern Recognition (CVPR). Our system is open-sourced at https://github.com/poloclub/interactive-classification. A video demo is available at https://youtu.be/llub5GcOF6w.
Tasks
Published 2018-06-14
URL http://arxiv.org/abs/1806.05660v2
PDF http://arxiv.org/pdf/1806.05660v2.pdf
PWC https://paperswithcode.com/paper/interactive-classification-for-deep-learning
Repo https://github.com/poloclub/interactive-classification
Framework tf

Hierarchical Density Order Embeddings

Title Hierarchical Density Order Embeddings
Authors Ben Athiwaratkun, Andrew Gordon Wilson
Abstract By representing words with probability densities rather than point vectors, probabilistic word embeddings can capture rich and interpretable semantic information and uncertainty. The uncertainty information can be particularly meaningful in capturing entailment relationships – whereby general words such as “entity” correspond to broad distributions that encompass more specific words such as “animal” or “instrument”. We introduce density order embeddings, which learn hierarchical representations through encapsulation of probability densities. In particular, we propose simple yet effective loss functions and distance metrics, as well as graph-based schemes to select negative samples to better learn hierarchical density representations. Our approach provides state-of-the-art performance on the WordNet hypernym relationship prediction task and the challenging HyperLex lexical entailment dataset – while retaining a rich and interpretable density representation.
Tasks Word Embeddings
Published 2018-04-26
URL http://arxiv.org/abs/1804.09843v1
PDF http://arxiv.org/pdf/1804.09843v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-density-order-embeddings
Repo https://github.com/benathi/density-order-emb
Framework pytorch
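
The encapsulation idea has a concrete form for diagonal Gaussians, where KL divergence is closed-form and asymmetric: a specific child sitting inside a broad parent yields a small KL(child‖parent). Below is a hedged sketch of a hinge penalty in that family; the paper studies several divergences and loss variants.

```python
import torch

def diag_gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for diagonal Gaussians: small when p nests inside q."""
    return 0.5 * torch.sum(
        var_p / var_q + (mu_q - mu_p) ** 2 / var_q - 1.0 + torch.log(var_q / var_p)
    )

def encapsulation_penalty(child, parent, margin=1.0):
    """Hinge: only penalise child distributions not enclosed by the parent.
    child, parent: (mu, var) tuples of 1-D tensors."""
    return torch.clamp(diag_gaussian_kl(*child, *parent) - margin, min=0.0)
```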

A Hierarchical Latent Structure for Variational Conversation Modeling

Title A Hierarchical Latent Structure for Variational Conversation Modeling
Authors Yookoon Park, Jaemin Cho, Gunhee Kim
Abstract Variational autoencoders (VAEs) combined with hierarchical RNNs have emerged as a powerful framework for conversation modeling. However, they suffer from the notorious degeneration problem, in which the decoders learn to ignore latent variables and reduce to vanilla RNNs. We empirically show that this degeneracy occurs mostly due to two reasons. First, the expressive power of hierarchical RNN decoders is often high enough to model the data using only the decoding distributions, without relying on the latent variables. Second, the conditional VAE structure, whose generation process is conditioned on a context, makes the range of training targets very sparse; that is, the RNN decoders can easily overfit to the training data while ignoring the latent variables. To solve the degeneration problem, we propose a novel model named Variational Hierarchical Conversation RNNs (VHCR), built on two key ideas: (1) using a hierarchical structure of latent variables, and (2) exploiting an utterance drop regularization. With evaluations on two datasets, Cornell Movie Dialog and the Ubuntu Dialog Corpus, we show that VHCR successfully utilizes latent variables and outperforms state-of-the-art models for conversation generation. Moreover, it can perform several new utterance control tasks, thanks to its hierarchical latent structure.
Tasks
Published 2018-04-10
URL http://arxiv.org/abs/1804.03424v2
PDF http://arxiv.org/pdf/1804.03424v2.pdf
PWC https://paperswithcode.com/paper/a-hierarchical-latent-structure-for
Repo https://github.com/natashamjaques/neural_chat
Framework pytorch
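
Of the two ideas, utterance drop is the simpler to sketch: during training, whole utterance encodings are randomly replaced by a generic vector so that the context RNN becomes unreliable and the decoder must fall back on the latent variables. In the hedged PyTorch sketch below the generic vector is simply zero, and the shapes are hypothetical.

```python
import torch

def utterance_drop(utt_encodings, p=0.25):
    """utt_encodings: (num_utterances, dim) conversation-so-far encodings."""
    keep = (torch.rand(utt_encodings.size(0), 1) > p).float()
    return utt_encodings * keep   # dropped rows collapse to the zero vector
```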