Paper Group AWR 188
Triple consistency loss for pairing distributions in GAN-based face synthesis
Title | Triple consistency loss for pairing distributions in GAN-based face synthesis |
Authors | Enrique Sanchez, Michel Valstar |
Abstract | Generative Adversarial Networks have shown impressive results for the task of object translation, including face-to-face translation. A key component behind the success of recent approaches is the self-consistency loss, which encourages a network to recover the original input image when the output generated for a desired attribute is itself passed through the same network, but with the target attribute inverted. While the self-consistency loss yields photo-realistic results, it can be shown that the input and target domains, supposed to be close, differ substantially. This is empirically found by observing that a network recovers the input image even if attributes other than the inversion of the original goal are set as target. This prevents combining networks trained for different tasks, or using a single network for progressive forward passes. In this paper, we show empirical evidence of this effect, and propose a new loss to bridge the gap between the distributions of the input and target domains. This “triple consistency loss” aims to minimise the distance between the outputs generated by the network for different routes to the target, independent of any intermediate steps. To show its effectiveness, we incorporate the triple consistency loss into the training of a new landmark-guided face-to-face synthesis approach in which, contrary to previous works, the generated images can simultaneously undergo a large transformation in both expression and pose. To the best of our knowledge, we are the first to tackle the problem of mismatching distributions in self-domain synthesis, and to propose “in-the-wild” landmark-guided synthesis. Code will be available at https://github.com/ESanchezLozano/GANnotation |
Tasks | Face Generation, Face to Face Translation |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03492v1 |
PDF | http://arxiv.org/pdf/1811.03492v1.pdf |
PWC | https://paperswithcode.com/paper/triple-consistency-loss-for-pairing |
Repo | https://github.com/ESanchezLozano/GANnotation |
Framework | pytorch |
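The triple consistency idea is compact enough to express directly. Below is a minimal, hypothetical PyTorch sketch of such a loss, assuming a generator with interface G(image, target_landmarks) -> image; this is an illustration of the idea, not the repository's actual API.

```python
import torch.nn.functional as F

def triple_consistency_loss(G, x, h_mid, h_tgt):
    """L1 distance between the direct route x -> h_tgt and the
    two-step route x -> h_mid -> h_tgt through an intermediate pose."""
    direct = G(x, h_tgt)              # one-step synthesis to the target landmarks
    two_step = G(G(x, h_mid), h_tgt)  # same target, reached via an intermediate step
    return F.l1_loss(two_step, direct)
```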
Batch Active Preference-Based Learning of Reward Functions
Title | Batch Active Preference-Based Learning of Reward Functions |
Authors | Erdem Bıyık, Dorsa Sadigh |
Abstract | Data generation and labeling are usually an expensive part of learning for robotics. While active learning methods are commonly used to tackle the former problem, preference-based learning is a concept that attempts to solve the latter by querying users with preference questions. In this paper, we develop a new algorithm, batch active preference-based learning, that enables efficient learning of reward functions using as few data samples as possible while still keeping query generation times short. We introduce several approximations to the batch active learning problem, and provide theoretical guarantees for the convergence of our algorithms. Finally, we present our experimental results for a variety of robotics tasks in simulation. Our results suggest that our batch active learning algorithm requires only a few queries that are computed in a short amount of time. We then showcase our algorithm in a study to learn human users’ preferences. |
Tasks | Active Learning |
Published | 2018-10-10 |
URL | http://arxiv.org/abs/1810.04303v1 |
PDF | http://arxiv.org/pdf/1810.04303v1.pdf |
PWC | https://paperswithcode.com/paper/batch-active-preference-based-learning-of |
Repo | https://github.com/Stanford-ILIAD/batch-active-preference-based-learning |
Framework | none |
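To make the setup concrete, here is a hedged Python sketch of preference-based learning for a linear reward R(ξ) = w·φ(ξ): posterior samples over w are reweighted by a logistic likelihood of the user's answers, and the next batch greedily picks queries on which the samples disagree most. The likelihood form and the greedy disagreement criterion are simplifications standing in for the paper's batch approximations.

```python
import numpy as np

rng = np.random.default_rng(0)

def likelihood(w, phi_a, phi_b, answer):
    # answer = +1 if the user preferred trajectory a, else -1
    return 1.0 / (1.0 + np.exp(-answer * w @ (phi_a - phi_b)))

def resample_posterior(samples, queries, n=1000):
    # importance-style resampling of reward weights given answered queries;
    # samples: (n, d) array of candidate reward weight vectors
    weights = np.ones(len(samples))
    for phi_a, phi_b, ans in queries:
        weights *= np.array([likelihood(w, phi_a, phi_b, ans) for w in samples])
    weights /= weights.sum()
    idx = rng.choice(len(samples), size=n, p=weights)
    return samples[idx]

def pick_batch(candidates, samples, k=5):
    # greedy: prefer query pairs on which sampled rewards disagree the most
    def disagreement(pair):
        phi_a, phi_b = pair
        prefs = np.sign(samples @ (phi_a - phi_b))
        return 1.0 - abs(prefs.mean())   # ~1 when samples are split 50/50
    return sorted(candidates, key=disagreement, reverse=True)[:k]
```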
Disease Prediction using Graph Convolutional Networks: Application to Autism Spectrum Disorder and Alzheimer’s Disease
Title | Disease Prediction using Graph Convolutional Networks: Application to Autism Spectrum Disorder and Alzheimer’s Disease |
Authors | Sarah Parisot, Sofia Ira Ktena, Enzo Ferrante, Matthew Lee, Ricardo Guerrero, Ben Glocker, Daniel Rueckert |
Abstract | Graphs are widely used as a natural framework that captures interactions between individual elements represented as nodes in a graph. In medical applications, specifically, nodes can represent individuals within a potentially large population (patients or healthy controls) accompanied by a set of features, while the graph edges incorporate associations between subjects in an intuitive manner. This representation allows the wealth of imaging and non-imaging information, as well as individual subject features, to be incorporated simultaneously in disease classification tasks. Previous graph-based approaches for supervised or unsupervised learning in the context of disease prediction focus solely on pairwise similarities between subjects, disregarding individual characteristics and features, or rely on subject-specific imaging feature vectors while failing to model interactions between them. In this paper, we present a thorough evaluation of a generic framework that leverages both imaging and non-imaging information and can be used for brain analysis in large populations. This framework exploits Graph Convolutional Networks (GCNs) and involves representing populations as a sparse graph whose nodes are associated with imaging-based feature vectors, while phenotypic information is integrated as edge weights. The extensive evaluation explores the effect of each individual component of this framework on disease prediction performance and further compares it to different baselines. The framework’s performance is tested on two large datasets with diverse underlying data, ABIDE and ADNI, for the prediction of Autism Spectrum Disorder and conversion to Alzheimer’s disease, respectively. Our analysis shows that our novel framework can improve over state-of-the-art results on both databases, with 70.4% classification accuracy for ABIDE and 80.0% for ADNI. |
Tasks | Disease Prediction |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01738v1 |
PDF | http://arxiv.org/pdf/1806.01738v1.pdf |
PWC | https://paperswithcode.com/paper/disease-prediction-using-graph-convolutional |
Repo | https://github.com/SSinyu/p |
Framework | tf |
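A toy sketch of the population-graph construction and one graph-convolution step follows. The paper uses spectral (Chebyshev-polynomial) GCNs; for brevity this uses the simpler renormalized propagation rule H' = D^{-1/2}(A + I)D^{-1/2} H W, and the phenotypic similarity function `sim` is an assumed helper (e.g., the number of matching phenotypic fields such as sex or acquisition site).

```python
import numpy as np

def phenotypic_adjacency(phenotypes, sim):
    # subjects are nodes; phenotypic agreement weights the edges
    n = len(phenotypes)
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            A[i, j] = A[j, i] = sim(phenotypes[i], phenotypes[j])
    return A

def gcn_layer(A, H, W):
    # H: (n, d) imaging feature vectors, one row per subject; W: (d, d') weights
    A_hat = A + np.eye(len(A))                 # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)  # ReLU
```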
Structured Content Preservation for Unsupervised Text Style Transfer
Title | Structured Content Preservation for Unsupervised Text Style Transfer |
Authors | Youzhi Tian, Zhiting Hu, Zhou Yu |
Abstract | Text style transfer aims to modify the style of a sentence while keeping its content unchanged. Recent style transfer systems often fail to faithfully preserve the content after changing the style. This paper proposes a structured content-preserving model that leverages linguistic information from structured, fine-grained supervision to better preserve the style-independent content during style transfer. In particular, we achieve this goal by devising rich model objectives based on both the sentence’s lexical information and a language model that conditions on content. The resulting model is therefore encouraged to retain the semantic meaning of the target sentences. We perform extensive experiments comparing our model to other existing approaches on the tasks of sentiment and political slant transfer. Our model achieves significant improvements in both content preservation and style transfer under automatic and human evaluation. |
Tasks | Language Modelling, Style Transfer, Text Style Transfer |
Published | 2018-10-15 |
URL | http://arxiv.org/abs/1810.06526v2 |
PDF | http://arxiv.org/pdf/1810.06526v2.pdf |
PWC | https://paperswithcode.com/paper/structured-content-preservation-for |
Repo | https://github.com/YouzhiTian/Structured-Content-Preservation-for-Unsupervised-Text-Style-Transfer |
Framework | pytorch |
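As a rough illustration of the lexical half of the objective, the hypothetical PyTorch snippet below penalizes transferred sentences that drop source nouns. The POS-based filtering and the soft appearance probability are assumptions, and the paper's full objective also includes a content-conditioned language model term.

```python
import torch

def noun_retention_loss(src_noun_ids, out_token_probs):
    """src_noun_ids: vocabulary ids of nouns in the source sentence.
    out_token_probs: (T, V) softmax outputs for the transferred sentence.
    Encourages each source noun to appear at some output position."""
    losses = []
    for tok in src_noun_ids:
        # probability that the noun appears at least once in the output
        p_appear = 1.0 - torch.prod(1.0 - out_token_probs[:, tok])
        losses.append(-torch.log(p_appear + 1e-8))
    return torch.stack(losses).mean()
```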
AutoZOOM: Autoencoder-based Zeroth Order Optimization Method for Attacking Black-box Neural Networks
Title | AutoZOOM: Autoencoder-based Zeroth Order Optimization Method for Attacking Black-box Neural Networks |
Authors | Chun-Chen Tu, Paishun Ting, Pin-Yu Chen, Sijia Liu, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh, Shin-Ming Cheng |
Abstract | Recent studies have shown that adversarial examples in state-of-the-art image classifiers trained by deep neural networks (DNN) can be easily generated when the target model is transparent to an attacker, known as the white-box setting. However, when attacking a deployed machine learning service, one can only acquire the input-output correspondences of the target model; this is the so-called black-box attack setting. The major drawback of existing black-box attacks is the need for excessive model queries, which may give a false sense of model robustness due to inefficient query designs. To bridge this gap, we propose a generic framework for query-efficient black-box attacks. Our framework, AutoZOOM, which is short for Autoencoder-based Zeroth Order Optimization Method, has two novel building blocks towards efficient black-box attacks: (i) an adaptive random gradient estimation strategy to balance query counts and distortion, and (ii) an autoencoder that is either trained offline with unlabeled data or a bilinear resizing operation for attack acceleration. Experimental results suggest that, by applying AutoZOOM to a state-of-the-art black-box attack (ZOO), a significant reduction in model queries can be achieved without sacrificing the attack success rate and the visual quality of the resulting adversarial examples. In particular, when compared to the standard ZOO method, AutoZOOM can consistently reduce the mean query counts in finding successful adversarial examples (or reaching the same distortion level) by at least 93% on MNIST, CIFAR-10 and ImageNet datasets, leading to novel insights on adversarial robustness. |
Tasks | |
Published | 2018-05-30 |
URL | https://arxiv.org/abs/1805.11770v5 |
PDF | https://arxiv.org/pdf/1805.11770v5.pdf |
PWC | https://paperswithcode.com/paper/autozoom-autoencoder-based-zeroth-order |
Repo | https://github.com/psting/IBM-UM_Projects |
Framework | none |
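The core of AutoZOOM is a random-direction gradient estimate taken in a low-dimensional latent space and mapped back to image space by a decoder. A minimal NumPy sketch, with `decode` standing in for either the offline-trained autoencoder's decoder or bilinear upscaling, and assumed values for the smoothing parameter and query count:

```python
import numpy as np

def estimate_gradient(f, z, decode, beta=0.01, q=8):
    """Zeroth-order estimate of grad_z f(decode(z)) from q random directions.
    f is the black-box attack loss; only input-output queries are used."""
    d = z.size
    g = np.zeros_like(z)
    f0 = f(decode(z))
    for _ in range(q):
        u = np.random.randn(d)
        u /= np.linalg.norm(u)                      # unit direction in latent space
        g += (f(decode(z + beta * u)) - f0) / beta * u
    return d * g / q   # dimension factor from the random-direction estimator
```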
MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare
Title | MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare |
Authors | Edward Choi, Cao Xiao, Walter F. Stewart, Jimeng Sun |
Abstract | Deep learning models exhibit state-of-the-art performance for many predictive healthcare tasks using electronic health records (EHR) data, but these models typically require a volume of training data that exceeds the capacity of most healthcare systems. External resources such as medical ontologies are used to mitigate the data volume constraint, but this approach is often not directly applicable or useful because of inconsistencies in terminology. To address the data insufficiency challenge, we leverage the inherent multilevel structure of EHR data and, in particular, the encoded relationships among medical codes. We propose Multilevel Medical Embedding (MiME), which learns a multilevel embedding of EHR data while jointly performing auxiliary prediction tasks that rely on this inherent EHR structure, without the need for external labels. We conducted two prediction tasks, heart failure prediction and sequential disease prediction, where MiME outperformed baseline methods in diverse evaluation settings. In particular, MiME consistently outperformed all baselines when predicting heart failure on datasets of different volumes, with the greatest improvement (15% relative gain in PR-AUC over the best baseline) on the smallest dataset, demonstrating its ability to effectively model the multilevel structure of EHR data. |
Tasks | Disease Prediction |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09593v1 |
PDF | http://arxiv.org/pdf/1810.09593v1.pdf |
PWC | https://paperswithcode.com/paper/mime-multilevel-medical-embedding-of |
Repo | https://github.com/mp2893/mime |
Framework | tf |
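A much-simplified sketch of the multilevel embedding follows: each diagnosis vector is modulated by the treatments ordered for it, and a visit aggregates its diagnosis-level vectors. The embedding tables and the interaction function are assumptions, and the auxiliary prediction tasks described in the paper are omitted. (The repo is TensorFlow; this sketch uses PyTorch purely for brevity.)

```python
import torch
import torch.nn as nn

class VisitEncoder(nn.Module):
    def __init__(self, n_dx, n_rx, dim):
        super().__init__()
        self.dx_emb = nn.Embedding(n_dx, dim)   # diagnosis codes
        self.rx_emb = nn.Embedding(n_rx, dim)   # treatment/medication codes
        self.interact = nn.Linear(dim, dim)

    def forward(self, visit):
        # visit: list of (dx_id, [treatment_ids]) pairs for one encounter
        dx_vecs = []
        for dx_id, rx_ids in visit:
            dx = self.dx_emb(torch.tensor(dx_id))
            if rx_ids:
                rx = self.rx_emb(torch.tensor(rx_ids)).mean(dim=0)
                dx = dx + self.interact(dx) * rx   # dx-treatment interaction
            dx_vecs.append(dx)
        return torch.stack(dx_vecs).sum(dim=0)     # visit-level embedding
```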
Robust Multiple Kernel k-means Clustering using Min-Max Optimization
Title | Robust Multiple Kernel k-means Clustering using Min-Max Optimization |
Authors | Seojin Bang, Yaoliang Yu, Wei Wu |
Abstract | Multiple kernel learning is a type of multiview learning that combines different data modalities by capturing view-specific patterns using kernels. Although supervised multiple kernel learning has been extensively studied, until recently, only a few unsupervised approaches have been proposed. Meanwhile, adversarial learning has received much attention, and many works have been proposed to defend against adversarial examples. However, little is known about the effect of adversarial perturbation in the context of multiview learning, and even less in the unsupervised case. In this study, we show that adversarial features added to a view can make existing multiple kernel clustering approaches with the min-max formulation yield unfavorable clusters. To address this problem, and inspired by recent works in adversarial learning, we propose a multiple kernel clustering method with the min-max framework that aims to be robust to such adversarial perturbation. We evaluate the robustness of our method on simulation data under different types of adversarial perturbations and show that it outperforms several existing methods. We also demonstrate the utility of our method on a real-world problem. |
Tasks | Disease Prediction, Multiview Learning |
Published | 2018-03-06 |
URL | https://arxiv.org/abs/1803.02458v2 |
PDF | https://arxiv.org/pdf/1803.02458v2.pdf |
PWC | https://paperswithcode.com/paper/multiple-kernel-k-means-clustering-using-min |
Repo | https://github.com/SeojinBang/MKKC |
Framework | none |
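To illustrate only the alternating min-max structure (not the paper's exact robust objective or constraints), here is a coarse sketch: cluster on a weighted kernel combination via the spectral relaxation of kernel k-means, then re-weight kernels according to their alignment with the current partition.

```python
import numpy as np
from scipy.linalg import eigh

def kernel_kmeans_relaxed(K, n_clusters):
    # spectral relaxation: top eigenvectors of the combined kernel
    vals, vecs = eigh(K)
    return vecs[:, -n_clusters:]              # (n, k) relaxed partition indicator

def alternate(kernels, n_clusters, iters=10):
    w = np.ones(len(kernels)) / len(kernels)  # start from uniform kernel weights
    for _ in range(iters):
        K = sum(wi * Ki for wi, Ki in zip(w, kernels))
        H = kernel_kmeans_relaxed(K, n_clusters)
        # per-kernel alignment with the partition drives the re-weighting step
        align = np.array([np.trace(H.T @ Ki @ H) for Ki in kernels])
        w = align / align.sum()
    return H, w
```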
DeRPN: Taking a further step toward more general object detection
Title | DeRPN: Taking a further step toward more general object detection |
Authors | Lele Xie, Yuliang Liu, Lianwen Jin, Zecheng Xie |
Abstract | Most current detection methods have adopted anchor boxes as regression references. However, detection performance is sensitive to the setting of the anchor boxes, and a proper setting may vary significantly across datasets, which severely limits the universality of the detectors. To improve the adaptivity of detectors, in this paper we present a novel dimension-decomposition region proposal network (DeRPN) that can serve as a drop-in replacement for the traditional Region Proposal Network (RPN). DeRPN utilizes an anchor string mechanism to independently match object widths and heights, which helps it handle objects of varying shapes. In addition, a novel scale-sensitive loss is designed to address the imbalanced loss computations of differently scaled objects, preventing small objects from being overwhelmed by larger ones. Comprehensive experiments conducted on both general object detection datasets (Pascal VOC 2007, 2012 and MS COCO) and scene text detection datasets (ICDAR 2013 and COCO-Text) all show that our DeRPN significantly outperforms RPN. It is worth mentioning that the proposed DeRPN can be employed directly on different models, tasks, and datasets without any modification of hyperparameters or specialized optimization, which further demonstrates its adaptivity. The code will be released at https://github.com/HCIILAB/DeRPN. |
Tasks | Object Detection, Scene Text Detection |
Published | 2018-11-16 |
URL | http://arxiv.org/abs/1811.06700v1 |
PDF | http://arxiv.org/pdf/1811.06700v1.pdf |
PWC | https://paperswithcode.com/paper/derpn-taking-a-further-step-toward-more |
Repo | https://github.com/HCIILAB/DeRPN |
Framework | none |
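The anchor-string mechanism can be illustrated in a few lines: instead of matching a ground-truth box to 2-D anchor boxes, its width and height are matched independently against a 1-D string of scales, so elongated shapes (e.g., text lines) need no dedicated 2-D anchor shape. The scale values below are illustrative, not the paper's.

```python
import numpy as np

ANCHOR_STRING = np.array([16, 32, 64, 128, 256, 512], dtype=float)

def match_side(side_len, string=ANCHOR_STRING):
    """Index of the scale closest to one side of a box (compared in log space)."""
    return int(np.argmin(np.abs(np.log(string) - np.log(side_len))))

# width and height pick their scales independently
w_idx = match_side(480.0)   # wide box side
h_idx = match_side(24.0)    # short box side
```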
A System for Massively Parallel Hyperparameter Tuning
Title | A System for Massively Parallel Hyperparameter Tuning |
Authors | Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, Ameet Talwalkar |
Abstract | Modern learning models are characterized by large hyperparameter spaces and long training times. These properties, coupled with the rise of parallel computing and the growing demand to productionize machine learning workloads, motivate the need to develop mature hyperparameter optimization functionality in distributed computing settings. We address this challenge by first introducing a simple and robust hyperparameter optimization algorithm called ASHA, which exploits parallelism and aggressive early-stopping to tackle large-scale hyperparameter optimization problems. Our extensive empirical results show that ASHA outperforms existing state-of-the-art hyperparameter optimization methods; scales linearly with the number of workers in distributed settings; and is suitable for massive parallelism, as demonstrated on a task with 500 workers. We then describe several design decisions we encountered, along with our associated solutions, when integrating ASHA in Determined AI’s end-to-end production-quality machine learning system that offers hyperparameter tuning as a service. |
Tasks | Hyperparameter Optimization |
Published | 2018-10-13 |
URL | https://arxiv.org/abs/1810.05934v5 |
PDF | https://arxiv.org/pdf/1810.05934v5.pdf |
PWC | https://paperswithcode.com/paper/massively-parallel-hyperparameter-tuning |
Repo | https://github.com/c-bata/goptuna |
Framework | none |
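ASHA's key rule is simple enough to sketch: a configuration is promoted to the next rung as soon as its metric lands in the top 1/η of all results recorded at its current rung, so workers never block waiting for a rung to fill. A compact Python sketch (configs are assumed to be hashable ids; this is not the production system described in the paper):

```python
class ASHA:
    def __init__(self, eta=3, max_rung=4):
        self.eta = eta
        self.max_rung = max_rung
        self.rungs = {r: [] for r in range(max_rung + 1)}  # rung -> [(loss, cfg)]

    def report(self, cfg, rung, loss):
        # called asynchronously whenever a worker finishes a rung's budget
        self.rungs[rung].append((loss, cfg))

    def get_promotable(self, rung):
        """Configs in the top 1/eta of this rung not yet promoted upward."""
        results = sorted(self.rungs[rung], key=lambda t: t[0])
        k = len(results) // self.eta
        top = [cfg for _, cfg in results[:k]]
        promoted = {cfg for _, cfg in self.rungs.get(rung + 1, [])}
        return [cfg for cfg in top if cfg not in promoted]
```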
A Twofold Siamese Network for Real-Time Object Tracking
Title | A Twofold Siamese Network for Real-Time Object Tracking |
Authors | Anfeng He, Chong Luo, Xinmei Tian, Wenjun Zeng |
Abstract | Observing that Semantic features learned in an image classification task and Appearance features learned in a similarity matching task complement each other, we build a twofold Siamese network, named SA-Siam, for real-time object tracking. SA-Siam is composed of a semantic branch and an appearance branch. Each branch is a similarity-learning Siamese network. An important design choice in SA-Siam is to separately train the two branches to keep the heterogeneity of the two types of features. In addition, we propose a channel attention mechanism for the semantic branch. Channel-wise weights are computed according to the channel activations around the target position. While the inherited architecture from SiamFC \cite{SiamFC} allows our tracker to operate beyond real-time, the twofold design and the attention mechanism significantly improve the tracking performance. The proposed SA-Siam outperforms all other real-time trackers by a large margin on OTB-2013/50/100 benchmarks. |
Tasks | Image Classification, Object Tracking |
Published | 2018-02-24 |
URL | http://arxiv.org/abs/1802.08817v1 |
PDF | http://arxiv.org/pdf/1802.08817v1.pdf |
PWC | https://paperswithcode.com/paper/a-twofold-siamese-network-for-real-time |
Repo | https://github.com/Microsoft/SA-Siam |
Framework | tf |
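A rough PyTorch sketch of the channel attention used in the semantic branch: per-channel responses around the target are max-pooled onto a coarse grid and mapped to channel weights by a small MLP. The grid size and MLP shape are assumptions based on the paper's description (the released code is TensorFlow).

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, grid=3):
        super().__init__()
        self.pool = nn.AdaptiveMaxPool2d(grid)   # coarse grid of channel responses
        self.mlp = nn.Sequential(
            nn.Linear(grid * grid, 9), nn.ReLU(), nn.Linear(9, 1))

    def forward(self, feat):                      # feat: (B, C, H, W)
        pooled = self.pool(feat).flatten(2)       # (B, C, grid*grid)
        weights = torch.sigmoid(self.mlp(pooled)) # (B, C, 1) per-channel weights
        return feat * weights.unsqueeze(-1)       # re-weight channel activations
```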
One-shot Texture Segmentation
Title | One-shot Texture Segmentation |
Authors | Ivan Ustyuzhaninov, Claudio Michaelis, Wieland Brendel, Matthias Bethge |
Abstract | We introduce one-shot texture segmentation: the task of segmenting an input image containing multiple textures given a patch of a reference texture. This task is designed to turn the problem of texture-based perceptual grouping into an objective benchmark. We show that it is straightforward to generate large synthetic data sets for this task from a relatively small number of natural textures. In particular, this task can be cast as a self-supervised problem, thereby alleviating the need for the massive amounts of manually annotated data necessary for traditional segmentation tasks. In this paper we introduce and study two concrete data sets: a dense collage of textures (CollTex) and a cluttered texturized Omniglot data set. We show that a baseline model trained on these synthesized data is able to generalize to natural images and videos without further fine-tuning, suggesting that the learned image representations are useful for higher-level vision tasks. |
Tasks | Omniglot |
Published | 2018-07-07 |
URL | http://arxiv.org/abs/1807.02654v1 |
PDF | http://arxiv.org/pdf/1807.02654v1.pdf |
PWC | https://paperswithcode.com/paper/one-shot-texture-segmentation |
Repo | https://github.com/atch841/one-shot-texture-segmentation |
Framework | tf |
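The self-supervised data generation can be sketched directly: paste a few textures into a collage, pick one as the reference, and use the region it covers as the segmentation target. Canvas and patch sizes below are arbitrary choices, and the paper's generator is more elaborate (e.g., non-rectangular regions).

```python
import numpy as np

def make_collage(textures, size=256, rng=np.random.default_rng(0)):
    # textures: list of (h, w, 3) uint8 arrays
    canvas = np.zeros((size, size, 3), dtype=np.uint8)
    labels = np.zeros((size, size), dtype=np.int32)
    for k, tex in enumerate(textures, start=1):
        # random rectangular region filled with texture k (tiled to fit)
        x0, y0 = rng.integers(0, size // 2, 2)
        x1 = x0 + rng.integers(size // 4, size // 2)
        y1 = y0 + rng.integers(size // 4, size // 2)
        tile = np.tile(tex, (size // tex.shape[0] + 1, size // tex.shape[1] + 1, 1))
        canvas[y0:y1, x0:x1] = tile[: y1 - y0, : x1 - x0]
        labels[y0:y1, x0:x1] = k
    ref_id = int(rng.integers(1, len(textures) + 1))
    reference = textures[ref_id - 1][:64, :64]      # reference texture patch
    mask = (labels == ref_id).astype(np.float32)    # segmentation target
    return canvas, reference, mask
```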
Deep Unfolding of a Proximal Interior Point Method for Image Restoration
Title | Deep Unfolding of a Proximal Interior Point Method for Image Restoration |
Authors | Carla Bertocchi, Emilie Chouzenoux, Marie-Caroline Corbineau, Jean-Christophe Pesquet, Marco Prato |
Abstract | Variational methods are widely applied to ill-posed inverse problems because they can embed prior knowledge about the solution. However, the performance of these methods depends significantly on a set of parameters, which can be estimated through computationally expensive and time-consuming procedures. In contrast, deep learning offers very generic and efficient architectures, at the expense of explainability, since it is often used as a black box without any fine control over its output. Deep unfolding provides a convenient approach to combining variational and deep learning approaches. Starting from a variational formulation for image restoration, we develop iRestNet, a neural network architecture obtained by unfolding a proximal interior point algorithm. Hard constraints, encoding desirable properties for the restored image, are incorporated into the network through a logarithmic barrier, while the barrier parameter, the stepsize, and the penalization weight are learned by the network. We derive explicit expressions for the gradient of the proximity operator for various choices of constraints, which allows training iRestNet with gradient descent and backpropagation. In addition, we provide theoretical results regarding the stability of the network for a common inverse problem example. Numerical experiments on image deblurring problems show that the proposed approach compares favorably with both state-of-the-art variational and machine learning methods in terms of image quality. |
Tasks | Deblurring, Image Restoration |
Published | 2018-12-11 |
URL | https://arxiv.org/abs/1812.04276v5 |
PDF | https://arxiv.org/pdf/1812.04276v5.pdf |
PWC | https://paperswithcode.com/paper/deep-unfolding-of-a-proximal-interior-point |
Repo | https://github.com/mccorbineau/iRestNet |
Framework | pytorch |
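A simplified unrolled layer in the spirit of iRestNet is sketched below. The actual method applies the proximity operator of the logarithmic barrier (for which the paper derives closed-form gradients); here the barrier is handled by a plain gradient step and the learned scalars are ordinary parameters, so this is only a structural illustration under those assumptions.

```python
import torch
import torch.nn as nn

class UnfoldedLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.log_gamma = nn.Parameter(torch.tensor(0.0))  # stepsize (log scale)
        self.log_mu = nn.Parameter(torch.tensor(-4.0))    # barrier parameter

    def forward(self, x, blur, y):
        # one unrolled step for 0.5 * ||blur(x) - y||^2 with a log barrier on (0, 1)
        gamma, mu = self.log_gamma.exp(), self.log_mu.exp()
        residual = blur(x) - y
        grad_fid = blur(residual)            # assumes blur is self-adjoint
        x = x - gamma * grad_fid
        # gradient step on the barrier -mu * [log x + log(1 - x)]
        x = x + gamma * mu * (1.0 / (x + 1e-6) - 1.0 / (1.0 - x + 1e-6))
        return x.clamp(1e-4, 1.0 - 1e-4)     # stay strictly inside the box
```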
Interactive Classification for Deep Learning Interpretation
Title | Interactive Classification for Deep Learning Interpretation |
Authors | Ángel Alexander Cabrera, Fred Hohman, Jason Lin, Duen Horng Chau |
Abstract | We present an interactive system enabling users to manipulate images to explore the robustness and sensitivity of deep learning image classifiers. Using modern web technologies to run in-browser inference, users can remove image features using inpainting algorithms and obtain new classifications in real time, which allows them to ask a variety of “what if” questions by experimentally modifying images and seeing how the model reacts. Our system allows users to compare and contrast what image regions humans and machine learning models use for classification, revealing a wide range of surprising results ranging from spectacular failures (e.g., a “water bottle” image becomes a “concert” when removing a person) to impressive resilience (e.g., a “baseball player” image remains correctly classified even without a glove or base). We demonstrate our system at The 2018 Conference on Computer Vision and Pattern Recognition (CVPR) for the audience to try it live. Our system is open-sourced at https://github.com/poloclub/interactive-classification. A video demo is available at https://youtu.be/llub5GcOF6w. |
Tasks | |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05660v2 |
PDF | http://arxiv.org/pdf/1806.05660v2.pdf |
PWC | https://paperswithcode.com/paper/interactive-classification-for-deep-learning |
Repo | https://github.com/poloclub/interactive-classification |
Framework | tf |
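A desktop analogue of the demo's loop (the real system runs inference in-browser): remove a user-selected region with inpainting, then re-classify. This uses OpenCV's inpainting and assumes a `classify(img)` wrapper around some image classifier.

```python
import cv2
import numpy as np

def what_if(img_bgr, region_mask, classify):
    """region_mask: uint8 mask, 255 where the user erased a feature."""
    inpainted = cv2.inpaint(img_bgr, region_mask, 3, cv2.INPAINT_TELEA)
    return classify(inpainted), inpainted

# e.g., erase a rectangle and see whether the predicted label flips:
# mask = np.zeros(img.shape[:2], np.uint8); mask[50:120, 80:200] = 255
# new_label, edited = what_if(img, mask, classify)
```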
Hierarchical Density Order Embeddings
Title | Hierarchical Density Order Embeddings |
Authors | Ben Athiwaratkun, Andrew Gordon Wilson |
Abstract | By representing words with probability densities rather than point vectors, probabilistic word embeddings can capture rich and interpretable semantic information and uncertainty. The uncertainty information can be particularly meaningful in capturing entailment relationships – whereby general words such as “entity” correspond to broad distributions that encompass more specific words such as “animal” or “instrument”. We introduce density order embeddings, which learn hierarchical representations through encapsulation of probability densities. In particular, we propose simple yet effective loss functions and distance metrics, as well as graph-based schemes to select negative samples to better learn hierarchical density representations. Our approach provides state-of-the-art performance on the WordNet hypernym relationship prediction task and the challenging HyperLex lexical entailment dataset – while retaining a rich and interpretable density representation. |
Tasks | Word Embeddings |
Published | 2018-04-26 |
URL | http://arxiv.org/abs/1804.09843v1 |
PDF | http://arxiv.org/pdf/1804.09843v1.pdf |
PWC | https://paperswithcode.com/paper/hierarchical-density-order-embeddings |
Repo | https://github.com/benathi/density-order-emb |
Framework | pytorch |
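The encapsulation penalty is easy to make concrete for diagonal Gaussians: a specific word's density should sit "inside" its hypernym's, measured here by KL divergence with a threshold. The margin value below is an arbitrary choice, and the paper also explores other divergence-based distances.

```python
import torch

def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(N(mu_p, var_p) || N(mu_q, var_q)) for diagonal covariances."""
    return 0.5 * torch.sum(
        var_p / var_q + (mu_q - mu_p) ** 2 / var_q - 1.0 + torch.log(var_q / var_p))

def order_penalty(child, parent, margin=7.0):
    # child/parent: (mu, var) pairs; penalize only when the child's density
    # is not sufficiently contained in the parent's
    d = kl_diag_gaussians(*child, *parent)
    return torch.clamp(d - margin, min=0.0)
```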
A Hierarchical Latent Structure for Variational Conversation Modeling
Title | A Hierarchical Latent Structure for Variational Conversation Modeling |
Authors | Yookoon Park, Jaemin Cho, Gunhee Kim |
Abstract | Variational autoencoders (VAE) combined with hierarchical RNNs have emerged as a powerful framework for conversation modeling. However, they suffer from the notorious degeneration problem, where the decoders learn to ignore latent variables and reduce to vanilla RNNs. We empirically show that this degeneracy occurs mostly due to two reasons. First, the expressive power of hierarchical RNN decoders is often high enough to model the data using only their decoding distributions, without relying on the latent variables. Second, the conditional VAE structure, whose generation process is conditioned on a context, makes the range of training targets very sparse; that is, the RNN decoders can easily overfit to the training data while ignoring the latent variables. To solve the degeneration problem, we propose a novel model named Variational Hierarchical Conversation RNNs (VHCR), involving two key ideas: (1) using a hierarchical structure of latent variables, and (2) exploiting an utterance drop regularization. With evaluations on two datasets, Cornell Movie Dialog and the Ubuntu Dialog Corpus, we show that our VHCR successfully utilizes latent variables and outperforms state-of-the-art models for conversation generation. Moreover, it can perform several new utterance control tasks, thanks to its hierarchical latent structure. |
Tasks | |
Published | 2018-04-10 |
URL | http://arxiv.org/abs/1804.03424v2 |
PDF | http://arxiv.org/pdf/1804.03424v2.pdf |
PWC | https://paperswithcode.com/paper/a-hierarchical-latent-structure-for |
Repo | https://github.com/natashamjaques/neural_chat |
Framework | pytorch |
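The utterance-drop regularization can be sketched in a few lines of PyTorch: during training, each utterance encoding is replaced with a learned generic vector with some probability, forcing the decoder to lean on the latent variables instead. The drop probability here is an assumed value.

```python
import torch

def utterance_drop(utt_encodings, generic, p=0.25, training=True):
    """utt_encodings: (T, B, D) utterance encoder outputs for a conversation;
    generic: (D,) learned generic utterance vector."""
    if not training:
        return utt_encodings
    keep = (torch.rand(utt_encodings.shape[:2]) > p).unsqueeze(-1)  # (T, B, 1)
    return torch.where(keep, utt_encodings, generic.expand_as(utt_encodings))
```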