January 25, 2020

Paper Group ANR 1713

Multimodal Multitask Representation Learning for Pathology Biobank Metadata Prediction

Title Multimodal Multitask Representation Learning for Pathology Biobank Metadata Prediction
Authors Wei-Hung Weng, Yuannan Cai, Angela Lin, Fraser Tan, Po-Hsuan Cameron Chen
Abstract Metadata are general characteristics of the data in a well-curated and condensed format, and have proven useful for decision making, knowledge discovery, and the organization of heterogeneous biobank data. Among all data types in a biobank, pathology is the key component and also serves as the gold standard of diagnosis. To maximize the utility of biobanks and allow rapid progress in biomedical science, it is essential to organize the data with well-populated pathology metadata. However, manual annotation of such information is tedious and time-consuming. In this study, we develop a multimodal multitask learning framework to predict four major slide-level metadata of pathology images. The framework learns generalizable representations across tissue slides, pathology reports, and case-level structured data. We demonstrate improved performance across all four tasks with the proposed method compared to a single-modal, single-task baseline on two test sets, one external test set from a distinct data source (TCGA) and one internal held-out test set (TTH). In the test sets, the performance improvements on the averaged area under the receiver operating characteristic curve across the four tasks are 16.48% and 9.05% on TCGA and TTH, respectively. Such a pathology metadata prediction system may be adopted to reduce the effort of expert annotation and ultimately accelerate data-driven research through better utilization of the pathology biobank.
Tasks Decision Making, Representation Learning
Published 2019-09-17
URL https://arxiv.org/abs/1909.07846v1
PDF https://arxiv.org/pdf/1909.07846v1.pdf
PWC https://paperswithcode.com/paper/multimodal-multitask-representation-learning
Repo
Framework

Evolvability ES: Scalable and Direct Optimization of Evolvability

Title Evolvability ES: Scalable and Direct Optimization of Evolvability
Authors Alexander Gajewski, Jeff Clune, Kenneth O. Stanley, Joel Lehman
Abstract Designing evolutionary algorithms capable of uncovering highly evolvable representations is an open challenge; such evolvability is important because it accelerates evolution and enables fast adaptation to changing circumstances. This paper introduces evolvability ES, an evolutionary algorithm designed to explicitly and efficiently optimize for evolvability, i.e. the ability to further adapt. The insight is that it is possible to derive a novel objective in the spirit of natural evolution strategies that maximizes the diversity of behaviors exhibited when an individual is subject to random mutations, and that efficiently scales with computation. Experiments in 2-D and 3-D locomotion tasks highlight the potential of evolvability ES to generate solutions with tens of thousands of parameters that can quickly be adapted to solve different tasks and that can productively seed further evolution. We further highlight a connection between evolvability and a recent and popular gradient-based meta-learning algorithm called MAML; results show that evolvability ES can perform competitively with MAML and that it discovers solutions with distinct properties. The conclusion is that evolvability ES opens up novel research directions for studying and exploiting the potential of evolvable representations for deep neural networks.
Tasks Meta-Learning
Published 2019-07-13
URL https://arxiv.org/abs/1907.06077v1
PDF https://arxiv.org/pdf/1907.06077v1.pdf
PWC https://paperswithcode.com/paper/evolvability-es-scalable-and-direct
Repo
Framework

BUZz: BUffer Zones for defending adversarial examples in image classification

Title BUZz: BUffer Zones for defending adversarial examples in image classification
Authors Phuong Ha Nguyen, Kaleel Mahmood, Lam M. Nguyen, Thanh Nguyen, Marten van Dijk
Abstract We propose a novel defense against all existing gradient-based adversarial attacks on deep neural networks for image classification problems. Our defense is based on a combination of deep neural networks and simple image transformations. While straightforward in implementation, this defense yields a unique security property which we term buffer zones. In this paper, we formalize the concept of buffer zones. We argue that our defense based on buffer zones is secure against state-of-the-art black-box attacks. We are able to achieve this security even when the adversary has access to the entire original training data set and unlimited query access to the defense. We verify our security claims through experimentation using FashionMNIST, CIFAR-10 and CIFAR-100. We demonstrate a <10% attack success rate – significantly lower than what other well-known defenses offer – at only a price of a 15-20% drop in clean accuracy. By using a new intuitive metric we explain why this trade-off offers a significant improvement over prior work.
Tasks Image Classification
Published 2019-10-03
URL https://arxiv.org/abs/1910.02785v1
PDF https://arxiv.org/pdf/1910.02785v1.pdf
PWC https://paperswithcode.com/paper/buzz-buffer-zones-for-defending-adversarial
Repo
Framework

Computing Nonlinear Eigenfunctions via Gradient Flow Extinction

Title Computing Nonlinear Eigenfunctions via Gradient Flow Extinction
Authors Leon Bungert, Martin Burger, Daniel Tenbrinck
Abstract In this work we investigate the computation of nonlinear eigenfunctions via the extinction profiles of gradient flows. We analyze a scheme that recursively subtracts such eigenfunctions from given data and show that this procedure yields a decomposition of the data into eigenfunctions in some cases, for instance for the 1-dimensional total variation. We discuss results of numerical experiments in which we use extinction profiles and the gradient flow for the task of spectral graph clustering as used, e.g., in machine learning applications.
Tasks Graph Clustering, Spectral Graph Clustering
Published 2019-02-27
URL http://arxiv.org/abs/1902.10414v1
PDF http://arxiv.org/pdf/1902.10414v1.pdf
PWC https://paperswithcode.com/paper/computing-nonlinear-eigenfunctions-via
Repo
Framework

Compression and Localization in Reinforcement Learning for ATARI Games

Title Compression and Localization in Reinforcement Learning for ATARI Games
Authors Joel Ruben Antony Moniz, Barun Patra, Sarthak Garg
Abstract Deep neural networks have become commonplace in the domain of reinforcement learning, but are often expensive in terms of the number of parameters needed. While compressing deep neural networks has of late assumed great importance to overcome this drawback, little work has been done to address this problem in the context of reinforcement learning agents. This work aims at making first steps towards model compression in an RL agent. In particular, we compress networks to drastically reduce the number of parameters in them (to sizes less than 3% of their original size), further facilitated by applying a global max pool after the final convolution layer, and propose using Actor-Mimic in the context of compression. Finally, we show that this global max-pool allows for weakly supervised object localization, improving the ability to identify the agent’s points of focus.
Tasks Atari Games, Model Compression, Object Localization, Weakly-Supervised Object Localization
Published 2019-04-20
URL http://arxiv.org/abs/1904.09489v1
PDF http://arxiv.org/pdf/1904.09489v1.pdf
PWC https://paperswithcode.com/paper/compression-and-localization-in-reinforcement
Repo
Framework
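
The global max pool described in the abstract can be illustrated with a small sketch. This is not the authors' code: the nested-list feature maps, the function name `global_max_pool`, and the toy values are assumptions made for illustration. The idea shown is only that taking the maximum over each channel's spatial grid yields one number per channel, and that the position of that maximum gives a coarse location cue.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]  # one channel as an H x W grid (hypothetical layout)


def global_max_pool(
    feature_maps: List[FeatureMap],
) -> Tuple[List[float], List[Tuple[int, int]]]:
    """Return each channel's maximum and the (row, col) where it occurs."""
    values: List[float] = []
    locations: List[Tuple[int, int]] = []
    for channel in feature_maps:
        best_val, best_loc = channel[0][0], (0, 0)
        for r, row in enumerate(channel):
            for c, v in enumerate(row):
                if v > best_val:
                    best_val, best_loc = v, (r, c)
        values.append(best_val)
        locations.append(best_loc)
    return values, locations


# Two toy 2x2 channels standing in for the output of a final convolution layer.
maps = [
    [[0.1, 0.2], [0.9, 0.3]],
    [[0.5, 0.7], [0.0, 0.4]],
]
pooled, where = global_max_pool(maps)
```

Here `pooled` holds one value per channel regardless of the grid size, which is what shrinks the layers that follow, and `where` records the grid cell that produced each value.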

Multi-Objective Hyperparameter Tuning and Feature Selection using Filter Ensembles

Title Multi-Objective Hyperparameter Tuning and Feature Selection using Filter Ensembles
Authors Martin Binder, Julia Moosbauer, Janek Thomas, Bernd Bischl
Abstract Both feature selection and hyperparameter tuning are key tasks in machine learning. Hyperparameter tuning is often useful to increase model performance, while feature selection is undertaken to attain sparse models. Sparsity may yield better model interpretability and lower cost of data acquisition, data handling and model inference. While sparsity may have a beneficial or detrimental effect on predictive performance, a small drop in performance may be acceptable in return for a substantial gain in sparseness. We therefore treat feature selection as a multi-objective optimization task. We perform hyperparameter tuning and feature selection simultaneously because the choice of features of a model may influence what hyperparameters perform well. We present, benchmark, and compare two different approaches for multi-objective joint hyperparameter optimization and feature selection: The first uses multi-objective model-based optimization. The second is an evolutionary NSGA-II-based wrapper approach to feature selection which incorporates specialized sampling, mutation and recombination operators. Both methods make use of parameterized filter ensembles. While model-based optimization needs fewer objective evaluations to achieve good performance, it incurs computational overhead compared to the NSGA-II, so the preferred choice depends on the cost of evaluating a model on given data.
Tasks Feature Selection, Hyperparameter Optimization
Published 2019-12-30
URL https://arxiv.org/abs/1912.12912v2
PDF https://arxiv.org/pdf/1912.12912v2.pdf
PWC https://paperswithcode.com/paper/model-agnostic-approaches-to-multi-objective
Repo
Framework
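
Treating feature selection as a multi-objective task means comparing candidates by Pareto dominance rather than by a single score. A minimal sketch of that comparison follows, assuming each candidate is summarized as a hypothetical pair (prediction error, number of selected features), both to be minimized. It illustrates the non-dominated filter only, not the model-based or NSGA-II procedures benchmarked in the paper.

```python
from typing import List, Tuple

Point = Tuple[float, int]  # (error, number of selected features), both minimized


def dominates(a: Point, b: Point) -> bool:
    """a dominates b if it is no worse in both objectives and better in one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up candidates for illustration only.
candidates = [(0.10, 50), (0.12, 10), (0.11, 30), (0.15, 10), (0.10, 60)]
front = pareto_front(candidates)
```

In this toy set, `(0.15, 10)` is dropped because `(0.12, 10)` is more accurate with the same number of features, and `(0.10, 60)` is dropped because `(0.10, 50)` is equally accurate with fewer features. The remaining points are the trade-offs a practitioner would choose among.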

Pun Generation with Surprise

Title Pun Generation with Surprise
Authors He He, Nanyun Peng, Percy Liang
Abstract We tackle the problem of generating a pun sentence given a pair of homophones (e.g., “died” and “dyed”). Supervised text generation is inappropriate due to the lack of a large corpus of puns, and even if such a corpus existed, mimicry is at odds with generating novel content. In this paper, we propose an unsupervised approach to pun generation using a corpus of unhumorous text and what we call the local-global surprisal principle: we posit that in a pun sentence, there is a strong association between the pun word (e.g., “dyed”) and the distant context, as well as a strong association between the alternative word (e.g., “died”) and the immediate context. This contrast creates surprise and thus humor. We instantiate this principle for pun generation in two ways: (i) as a measure based on the ratio of probabilities under a language model, and (ii) a retrieve-and-edit approach based on words suggested by a skip-gram model. Human evaluation shows that our retrieve-and-edit approach generates puns successfully 31% of the time, tripling the success rate of a neural generation baseline.
Tasks Language Modelling, Text Generation
Published 2019-04-15
URL http://arxiv.org/abs/1904.06828v1
PDF http://arxiv.org/pdf/1904.06828v1.pdf
PWC https://paperswithcode.com/paper/pun-generation-with-surprise
Repo
Framework

Semantic Similarity To Improve Question Understanding in a Virtual Patient

Title Semantic Similarity To Improve Question Understanding in a Virtual Patient
Authors Fréjus A. A. Laleye, Antonia Blanié, Antoine Brouquet, Dan Behnamou, Gaël de Chalendar
Abstract In medicine, a communicating virtual patient or doctor allows students to train in medical diagnosis and develop skills to conduct a medical consultation. In this paper, we describe a conversational virtual standardized patient system to allow medical students to simulate a diagnosis strategy of an abdominal surgical emergency. We exploited the semantic properties captured by distributed word representations to search for similar questions in the virtual patient dialogue system. We created two dialogue systems that were evaluated on datasets collected during tests with students. The first system, based on hand-crafted rules, obtains an F1-score of 92.29% on the studied clinical case, while the second system, which combines rules and semantic similarity, achieves 94.88%. It represents an error reduction of 9.70% as compared to the rules-only-based system.
Tasks Medical Diagnosis, Semantic Similarity, Semantic Textual Similarity
Published 2019-12-16
URL https://arxiv.org/abs/1912.07421v1
PDF https://arxiv.org/pdf/1912.07421v1.pdf
PWC https://paperswithcode.com/paper/semantic-similarity-to-improve-question
Repo
Framework

Translating multispectral imagery to nighttime imagery via conditional generative adversarial networks

Title Translating multispectral imagery to nighttime imagery via conditional generative adversarial networks
Authors Xiao Huang, Dong Xu, Zhenlong Li, Cuizhen Wang
Abstract Nighttime satellite imagery has been applied in a wide range of fields. However, our limited understanding of how observed light intensity is formed and whether it can be simulated greatly hinders its further application. This study explores the potential of conditional Generative Adversarial Networks (cGAN) in translating multispectral imagery to nighttime imagery. A popular cGAN framework, pix2pix, was adopted and modified to facilitate this translation using gridded training image pairs derived from Landsat 8 and Visible Infrared Imaging Radiometer Suite (VIIRS). The results of this study prove the possibility of multispectral-to-nighttime translation and further indicate that, with the additional social media data, the generated nighttime imagery can be very similar to the ground-truth imagery. This study fills the gap in understanding the composition of satellite observed nighttime light and provides new paradigms to solve the emerging problems in nighttime remote sensing fields, including nighttime series construction, light desaturation, and multi-sensor calibration.
Tasks Calibration
Published 2019-12-28
URL https://arxiv.org/abs/2001.05848v1
PDF https://arxiv.org/pdf/2001.05848v1.pdf
PWC https://paperswithcode.com/paper/translating-multispectral-imagery-to
Repo
Framework

Learning Condensed and Aligned Features for Unsupervised Domain Adaptation Using Label Propagation

Title Learning Condensed and Aligned Features for Unsupervised Domain Adaptation Using Label Propagation
Authors Jaeyoon Yoo, Changhwa Park, Yongjun Hong, Sungroh Yoon
Abstract Unsupervised domain adaptation, which aims to learn a specific task for one domain using data from another domain, has emerged to address the labeling issue in supervised learning, especially because it is difficult to obtain massive amounts of labeled data in practice. The existing methods have succeeded by reducing the difference between the embedded features of both domains, but the performance is still unsatisfactory compared to the supervised learning scheme. This is attributable to embedded features that lie near each other but neither align perfectly nor form clearly separable clusters. We propose a novel domain adaptation method based on label propagation and cycle consistency to let the clusters of the features from the two domains overlap exactly and become clear for high accuracy. Specifically, we introduce cycle consistency to enforce the relationship between each cluster and exploit label propagation to achieve the association between the data from the perspective of the manifold structure instead of a one-to-one relation. Hence, we successfully form aligned and discriminative clusters. We present the empirical results of our method for various domain adaptation scenarios and visualize the embedded features to show that our method is critical for better domain adaptation.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2019-03-12
URL http://arxiv.org/abs/1903.04860v1
PDF http://arxiv.org/pdf/1903.04860v1.pdf
PWC https://paperswithcode.com/paper/learning-condensed-and-aligned-features-for
Repo
Framework
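
For readers unfamiliar with label propagation, the following is a generic sketch of the idea on a small graph. It is not the method proposed in the paper: the cycle-consistency term and the learned embeddings are omitted, and the function name `propagate_labels`, the weight matrix, and the chain graph are assumptions made for illustration. Labeled nodes are held fixed, and each unlabeled node repeatedly takes the weighted average of its neighbors' class scores.

```python
from typing import List, Optional


def propagate_labels(
    weights: List[List[float]],
    labels: List[Optional[int]],
    n_classes: int,
    n_iter: int = 50,
) -> List[int]:
    """Spread known labels over a weighted graph; labeled nodes stay clamped."""
    n = len(weights)
    # One-hot scores for labeled nodes, uniform scores for unlabeled ones.
    scores = [
        [1.0 if labels[i] == k else 0.0 for k in range(n_classes)]
        if labels[i] is not None
        else [1.0 / n_classes] * n_classes
        for i in range(n)
    ]
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            if labels[i] is not None:
                updated.append(scores[i])  # keep known labels fixed
                continue
            total = sum(weights[i])  # assumes every unlabeled node has a neighbor
            updated.append(
                [
                    sum(weights[i][j] * scores[j][k] for j in range(n)) / total
                    for k in range(n_classes)
                ]
            )
        scores = updated
    return [max(range(n_classes), key=lambda k: row[k]) for row in scores]


# Chain graph 0 - 1 - 2 - 3 with labels known only at the two ends.
chain = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
predicted = propagate_labels(chain, [0, None, None, 1], n_classes=2)
```

Each interior node ends up with the label of the nearer end of the chain, which is the sense in which propagation follows the structure of the data rather than a one-to-one match between samples.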

Controlling Risk of Web Question Answering

Title Controlling Risk of Web Question Answering
Authors Lixin Su, Jiafeng Guo, Yixing Fan, Yanyan Lan, Xueqi Cheng
Abstract Web question answering (QA) has become an indispensable component in modern search systems, which can significantly improve users’ search experience by providing a direct answer to users’ information need. This could be achieved by applying machine reading comprehension (MRC) models over the retrieved passages to extract answers with respect to the search query. With the development of deep learning techniques, state-of-the-art MRC performances have been achieved by recent deep methods. However, existing studies on MRC seldom address the predictive uncertainty issue, i.e., how likely the prediction of an MRC model is to be wrong, leading to uncontrollable risks in real-world Web QA applications. In this work, we first conduct an in-depth investigation into the risk of Web QA. We then introduce a novel risk control framework, which consists of a qualify model for uncertainty estimation using the probe idea, and a decision model for selective output. For evaluation, we introduce risk-related metrics, rather than the traditional EM and F1 in MRC, for the evaluation of risk-aware Web QA. The empirical results over both the real-world Web QA dataset and the academic MRC benchmark collection demonstrate the effectiveness of our approach.
Tasks Machine Reading Comprehension, Question Answering, Reading Comprehension
Published 2019-05-24
URL https://arxiv.org/abs/1905.10077v3
PDF https://arxiv.org/pdf/1905.10077v3.pdf
PWC https://paperswithcode.com/paper/controlling-risk-of-web-question-answering
Repo
Framework
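
Selective output can be pictured as answering only when an uncertainty estimate clears a threshold. The sketch below assumes a hypothetical scalar confidence per answer and a fixed threshold in place of the paper's qualify and decision models. The coverage and risk quantities are the ones commonly used in selective prediction and may differ from the metrics the paper defines.

```python
from typing import List, Optional, Tuple


def selective_output(answer: str, confidence: float, threshold: float) -> Optional[str]:
    """Return the answer only if the confidence clears the threshold."""
    return answer if confidence >= threshold else None


def coverage_and_risk(
    confidences: List[float], correct: List[bool], threshold: float
) -> Tuple[float, float]:
    """Coverage: share of questions answered. Risk: error rate among those answered."""
    answered = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
    coverage = len(answered) / len(confidences)
    risk = sum(1 for ok in answered if not ok) / len(answered) if answered else 0.0
    return coverage, risk


# Made-up confidences and correctness flags for four questions.
confs = [0.9, 0.8, 0.4, 0.3]
right = [True, False, True, False]
cov, risk = coverage_and_risk(confs, right, threshold=0.5)
withheld = selective_output("Paris", confidence=0.2, threshold=0.5)
```

Raising the threshold lowers coverage and, if the confidence estimate is informative, lowers risk as well. That trade-off is what a risk-aware evaluation tracks instead of a single EM or F1 number.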

Ink removal from histopathology whole slide images by combining classification, detection and image generation models

Title Ink removal from histopathology whole slide images by combining classification, detection and image generation models
Authors Sharib Ali, Nasullah Khalid Alham, Clare Verrill, Jens Rittscher
Abstract Histopathology slides are routinely marked by pathologists using permanent ink markers that should not be removed as they form part of the medical record. Often tumour regions are marked up for the purpose of highlighting features or other downstream processing such as gene sequencing. Once digitised, there is no established method for removing this information from the whole slide images, limiting their usability in research and study. Removal of marker ink from these high-resolution whole slide images is a non-trivial and complex problem, as the ink contaminates different regions in an inconsistent manner. We propose an efficient pipeline using convolutional neural networks that results in ink-free images without compromising information and image resolution. Our pipeline includes a sequential classical convolutional neural network for accurate classification of contaminated image tiles, a fast region detector and a domain adaptive cycle consistent adversarial generative model for restoration of foreground pixels. Both quantitative and qualitative results on four different whole slide images show that our approach yields visually coherent ink-free whole slide images.
Tasks Image Generation
Published 2019-05-10
URL https://arxiv.org/abs/1905.04385v1
PDF https://arxiv.org/pdf/1905.04385v1.pdf
PWC https://paperswithcode.com/paper/ink-removal-from-histopathology-whole-slide
Repo
Framework

Investigating Convolutional Neural Networks using Spatial Orderness

Title Investigating Convolutional Neural Networks using Spatial Orderness
Authors Rohan Ghosh, Anupam K. Gupta, Mehul Motani
Abstract Convolutional Neural Networks (CNN) have been pivotal to the success of many state-of-the-art classification problems, in a wide variety of domains (e.g. vision, speech, graphs and medical imaging). A commonality within those domains is the presence of hierarchical, spatially agglomerative local-to-global interactions within the data. For two-dimensional images, such interactions may induce an a priori relationship between the pixel data and the underlying spatial ordering of the pixels. For instance, in natural images, neighboring pixels are more likely to contain similar values than non-neighboring pixels which are further apart. To that end, we propose a statistical metric called spatial orderness, which quantifies the extent to which the input data (2D) obeys the underlying spatial ordering at various scales. In our experiments, we mainly find that adding convolutional layers to a CNN could be counterproductive for data bereft of spatial order at higher scales. We also observe, quite counter-intuitively, that the spatial orderness of CNN feature maps shows a synchronized increase during the initial stages of training, and validation performance only improves after the spatial orderness of feature maps starts decreasing. Lastly, we present a theoretical analysis (and empirical validation) of the spatial orderness of network weights, where we find that using smaller kernel sizes leads to kernels of greater spatial orderness and vice versa.
Tasks
Published 2019-08-18
URL https://arxiv.org/abs/1908.06416v2
PDF https://arxiv.org/pdf/1908.06416v2.pdf
PWC https://paperswithcode.com/paper/investigating-convolutional-neural-networks
Repo
Framework

Learning to Communicate in Multi-Agent Reinforcement Learning : A Review

Title Learning to Communicate in Multi-Agent Reinforcement Learning : A Review
Authors Mohamed Salah Zaïem, Etienne Bennequin
Abstract We consider the issue of multiple agents learning to communicate through reinforcement learning within partially observable environments, with a focus on information asymmetry in the second part of our work. We provide a review of the recent algorithms developed to improve the agents’ policy by allowing the sharing of information between agents and the learning of communication strategies, with a focus on Deep Recurrent Q-Network-based models. We also describe recent efforts to interpret the languages generated by these agents and study their properties in an attempt to generate human-language-like sentences. We discuss the metrics used to evaluate the generated communication strategies and propose a novel entropy-based evaluation metric. Finally, we address the issue of the cost of communication and introduce the idea of an experimental setup to expose this cost in a cooperative-competitive game.
Tasks Multi-agent Reinforcement Learning
Published 2019-11-13
URL https://arxiv.org/abs/1911.05438v1
PDF https://arxiv.org/pdf/1911.05438v1.pdf
PWC https://paperswithcode.com/paper/learning-to-communicate-in-multi-agent
Repo
Framework

Boosting CNN beyond Label in Inverse Problems

Title Boosting CNN beyond Label in Inverse Problems
Authors Eunju Cha, Jaeduck Jang, Junho Lee, Eunha Lee, Jong Chul Ye
Abstract Convolutional neural networks (CNN) have been extensively used for inverse problems. However, their prediction error for unseen test data is difficult to estimate a priori since the neural networks are trained using only selected data and their architectures are largely considered a black box. This poses a fundamental challenge to neural networks for unsupervised learning or improvement beyond the label. In this paper, we show that recent unsupervised learning methods such as Noise2Noise, the Stein’s unbiased risk estimator (SURE)-based denoiser, and Noise2Void are closely related to each other in their formulation of an unbiased estimator of the prediction error, but each of them has its own limitations. Based on these observations, we provide a novel boosting estimator for the prediction error. In particular, by employing a combinatorial convolutional frame representation of the encoder-decoder CNN and synergistically combining it with batch normalization, we provide a closed-form formulation for the unbiased estimator of the prediction error that can be minimized for neural network training beyond the label. Experimental results show that the resulting algorithm, which we call Noise2Boosting, provides consistent improvement in various inverse problems under both supervised and unsupervised learning settings.
Tasks
Published 2019-06-18
URL https://arxiv.org/abs/1906.07330v1
PDF https://arxiv.org/pdf/1906.07330v1.pdf
PWC https://paperswithcode.com/paper/boosting-cnn-beyond-label-in-inverse-problems
Repo
Framework