January 26, 2020

2699 words 13 mins read

Paper Group ANR 1457

Probabilistic Super-Resolution of Solar Magnetograms: Generating Many Explanations and Measuring Uncertainties

Title Probabilistic Super-Resolution of Solar Magnetograms: Generating Many Explanations and Measuring Uncertainties
Authors Xavier Gitiaux, Shane A. Maloney, Anna Jungbluth, Carl Shneider, Paul J. Wright, Atılım Güneş Baydin, Michel Deudon, Yarin Gal, Alfredo Kalaitzis, Andrés Muñoz-Jaramillo
Abstract Machine learning techniques have been successfully applied to super-resolution tasks on natural images, where visually pleasing results are sufficient. In many scientific domains, however, this is not adequate, and estimates of errors and uncertainties are crucial. To address this issue, we propose a Bayesian framework that decomposes predictive uncertainty into its epistemic and aleatoric components. We test the validity of our approach by super-resolving images of the Sun’s magnetic field and by generating maps measuring the range of possible high-resolution explanations compatible with a given low-resolution magnetogram.
Tasks Super-Resolution
Published 2019-11-04
URL https://arxiv.org/abs/1911.01486v1
PDF https://arxiv.org/pdf/1911.01486v1.pdf
PWC https://paperswithcode.com/paper/probabilistic-super-resolution-of-solar
Repo
Framework
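
A minimal sketch of the epistemic/aleatoric decomposition described above, using Monte Carlo dropout: the spread of predictions across stochastic forward passes estimates epistemic uncertainty, while a predicted per-pixel variance captures aleatoric noise. The toy network, dropout rate, and number of passes T are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch: decomposing predictive uncertainty with Monte Carlo dropout.
import torch
import torch.nn as nn

class SRNet(nn.Module):
    """Toy super-resolution head predicting a per-pixel mean and log-variance."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Dropout2d(0.2),                      # kept active at test time
            nn.Conv2d(32, 2, 3, padding=1),         # channels: [mean, log_var]
        )

    def forward(self, x):
        out = self.body(x)
        return out[:, :1], out[:, 1:]               # mean, log-variance

@torch.no_grad()
def predict_with_uncertainty(model, x, T=20):
    model.train()                                    # keep dropout stochastic
    means, alea = [], []
    for _ in range(T):
        mu, log_var = model(x)
        means.append(mu)
        alea.append(log_var.exp())
    means = torch.stack(means)                       # (T, B, 1, H, W)
    epistemic = means.var(dim=0)                     # spread of the T means
    aleatoric = torch.stack(alea).mean(dim=0)        # avg predicted noise variance
    return means.mean(dim=0), epistemic, aleatoric

model = SRNet()
mu, epi, alea = predict_with_uncertainty(model, torch.randn(1, 1, 64, 64))
```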

Perceptual Quality-preserving Black-Box Attack against Deep Learning Image Classifiers

Title Perceptual Quality-preserving Black-Box Attack against Deep Learning Image Classifiers
Authors Diego Gragnaniello, Francesco Marra, Giovanni Poggi, Luisa Verdoliva
Abstract Deep neural networks provide unprecedented performance in all image classification problems, taking advantage of the huge amounts of data available for training. Recent studies, however, have shown their vulnerability to adversarial attacks, spawning intense research in this field. With the aim of building better systems, new countermeasures and stronger attacks are proposed almost daily. On the attacker’s side, there is growing interest in the realistic black-box scenario, in which the attacker has no access to the neural network parameters. The problem is to design efficient attacks that mislead the neural network without compromising image quality. In this work, we propose to perform the black-box attack along a low-distortion path, so as to improve both the attack efficiency and the perceptual quality of the adversarial image. Numerical experiments on real-world systems prove the effectiveness of the proposed approach, both in benchmark classification tasks and in key applications in biometrics and forensics.
Tasks Face Recognition, Image Classification
Published 2019-02-20
URL https://arxiv.org/abs/1902.07776v2
PDF https://arxiv.org/pdf/1902.07776v2.pdf
PWC https://paperswithcode.com/paper/perceptual-quality-preserving-black-box
Repo
Framework
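
To make the setting concrete, here is a sketch of a generic score-based black-box attack under a distortion budget. The `query_prob` oracle, the coordinate-wise search, and the L2 budget are simplifying assumptions; the paper's contribution is the low-distortion path along which such queries are made, not this baseline.

```python
# Sketch: score-based black-box attack with an L2 distortion budget.
import numpy as np

def black_box_attack(x, true_label, query_prob, eps=0.05, budget=2.0, max_queries=1000):
    x_adv = x.copy()
    rng = np.random.default_rng(0)
    p_best = query_prob(x_adv)[true_label]
    for _ in range(max_queries):
        # pick one coordinate and try a small +/- step
        idx = tuple(rng.integers(s) for s in x.shape)
        for sign in (+eps, -eps):
            cand = x_adv.copy()
            cand[idx] = np.clip(cand[idx] + sign, 0.0, 1.0)
            if np.linalg.norm(cand - x) > budget:    # crude perceptual-quality proxy
                continue
            p = query_prob(cand)[true_label]
            if p < p_best:                           # lower true-class score
                x_adv, p_best = cand, p
                break
        if np.argmax(query_prob(x_adv)) != true_label:
            return x_adv                             # misclassified: success
    return x_adv

# Toy oracle for demonstration: a fixed linear-softmax "classifier".
rng0 = np.random.default_rng(1)
W = rng0.normal(size=(10, 3, 8, 8))
def query_prob(img):
    logits = (W * img).sum(axis=(1, 2, 3))
    e = np.exp(logits - logits.max())
    return e / e.sum()

x = rng0.random((3, 8, 8))
adv = black_box_attack(x, int(np.argmax(query_prob(x))), query_prob)
```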

Facial Expression Restoration Based on Improved Graph Convolutional Networks

Title Facial Expression Restoration Based on Improved Graph Convolutional Networks
Authors Zhilei Liu, Le Li, Yunpeng Wu, Cuicui Zhang
Abstract Facial expression analysis in the wild is challenging when the facial image has low resolution or partial occlusion. Exploiting the correlations among facial local regions under different facial expressions, this paper proposes a novel facial expression restoration method based on a generative adversarial network that integrates an improved graph convolutional network (IGCN) and a region relation modeling block (RRMB). Unlike conventional graph convolutional networks, which take vectors as input features, IGCN takes tensors of face patches as inputs, which better preserves the structural information of the patches. The proposed RRMB is designed to address facial generative tasks, including inpainting and super-resolution with facial action unit detection, with the aim of restoring the facial expression to match the ground truth. Extensive experiments conducted on the BP4D and DISFA benchmarks demonstrate the effectiveness of our proposed method through quantitative and qualitative evaluations.
Tasks Super-Resolution
Published 2019-10-23
URL https://arxiv.org/abs/1910.10344v1
PDF https://arxiv.org/pdf/1910.10344v1.pdf
PWC https://paperswithcode.com/paper/facial-expression-restoration-based-on
Repo
Framework
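
A minimal sketch of a graph convolution over patch tensors, the core of the IGCN idea: aggregate neighboring nodes through the adjacency matrix, then transform with a shared spatial convolution instead of a linear layer. Patch sizes, channel counts, and the adjacency here are illustrative assumptions.

```python
# Sketch: graph convolution whose node features are image patches (tensors).
import torch
import torch.nn as nn

class PatchGraphConv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)

    def forward(self, x, adj):
        # x: (N_nodes, C, H, W) face patches; adj: (N_nodes, N_nodes) row-normalized
        agg = torch.einsum('ij,jchw->ichw', adj, x)   # message passing over patches
        return torch.relu(self.conv(agg))             # shared spatial transform

n = 9                                                  # e.g. 9 facial regions
adj = torch.softmax(torch.randn(n, n), dim=1)          # toy normalized adjacency
patches = torch.randn(n, 3, 16, 16)
out = PatchGraphConv(3, 8)(patches, adj)               # (9, 8, 16, 16)
```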

Tutorial: Safe and Reliable Machine Learning

Title Tutorial: Safe and Reliable Machine Learning
Authors Suchi Saria, Adarsh Subbaswamy
Abstract This document serves as a brief overview of the “Safe and Reliable Machine Learning” tutorial given at the 2019 ACM Conference on Fairness, Accountability, and Transparency (FAT* 2019). The talk slides can be found here: https://bit.ly/2Gfsukp, a video of the talk is available here: https://youtu.be/FGLOCkC4KmE, and a complete list of references for the tutorial is here: https://bit.ly/2GdLPme.
Tasks
Published 2019-04-15
URL http://arxiv.org/abs/1904.07204v1
PDF http://arxiv.org/pdf/1904.07204v1.pdf
PWC https://paperswithcode.com/paper/tutorial-safe-and-reliable-machine-learning
Repo
Framework

Multimodal Image Super-resolution via Deep Unfolding with Side Information

Title Multimodal Image Super-resolution via Deep Unfolding with Side Information
Authors Iman Marivani, Evaggelia Tsiligianni, Bruno Cornelis, Nikos Deligiannis
Abstract Deep learning methods have been successfully applied to various computer vision tasks. However, existing neural network architectures do not per se incorporate domain knowledge about the addressed problem; thus, understanding what the model has learned is an open research topic. In this paper, we rely on the unfolding of an iterative algorithm for sparse approximation with side information, and design a deep learning architecture for multimodal image super-resolution that incorporates sparse priors and effectively utilizes information from another image modality. We develop two deep models performing reconstruction of a high-resolution image of a target image modality from its low-resolution variant with the aid of a high-resolution image from a second modality. We apply the proposed models to super-resolve near-infrared images using high-resolution RGB images as side information. Experimental results demonstrate the superior performance of the proposed models against state-of-the-art methods, including unimodal and multimodal approaches.
Tasks Image Super-Resolution, Super-Resolution
Published 2019-10-18
URL https://arxiv.org/abs/1910.08320v1
PDF https://arxiv.org/pdf/1910.08320v1.pdf
PWC https://paperswithcode.com/paper/multimodal-image-super-resolution-via-deep
Repo
Framework
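
A minimal sketch of one unfolded (LISTA-style) sparse-coding layer with side information, under the assumption that the side modality enters through a learned linear coupling; all dimensions and the coupling form are illustrative, not the paper's exact parameterization.

```python
# Sketch: deep unfolding of proximal-gradient sparse coding with side information.
import torch
import torch.nn as nn

class UnfoldedSRLayer(nn.Module):
    def __init__(self, m, n):
        super().__init__()
        self.W = nn.Linear(m, n, bias=False)    # maps measurements to code space
        self.S = nn.Linear(n, n, bias=False)    # learned "I - (1/L) D^T D" term
        self.G = nn.Linear(n, n, bias=False)    # coupling with side information
        self.theta = nn.Parameter(torch.full((n,), 0.1))

    def forward(self, z, y, z_side):
        pre = self.W(y) + self.S(z) + self.G(z_side)
        return torch.sign(pre) * torch.relu(pre.abs() - self.theta)  # soft threshold

layers = nn.ModuleList(UnfoldedSRLayer(64, 256) for _ in range(5))
y = torch.randn(8, 64)                 # low-res NIR measurements (toy)
z_side = torch.randn(8, 256)           # sparse code of the HR RGB guide (toy)
z = torch.zeros(8, 256)
for layer in layers:
    z = layer(z, y, z_side)            # refined sparse code after 5 unfolded steps
```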

Illumination Invariant Foreground Object Segmentation using ForeGANs

Title Illumination Invariant Foreground Object Segmentation using ForeGANs
Authors Maryam Sultana, Soon Ki Jung
Abstract Foreground segmentation algorithms suffer performance degradation in the presence of challenges such as dynamic backgrounds and varying illumination conditions. To handle these challenges, we present a foreground segmentation method based on a generative adversarial network (GAN), aiming to segment foreground objects in the presence of these two major challenges in real-world background scenes. Our GAN model is trained on background image samples with dynamic changes; at test time, it regenerates the background of each test sample under the same conditions via back-propagation. The generated background sample is then subtracted from the test sample to segment the foreground objects. A comparison of our proposed method with five state-of-the-art methods highlights the strength of our algorithm for foreground segmentation in the presence of challenging dynamic background scenarios.
Tasks Semantic Segmentation
Published 2019-02-07
URL https://arxiv.org/abs/1902.03120v3
PDF https://arxiv.org/pdf/1902.03120v3.pdf
PWC https://paperswithcode.com/paper/illumination-invariant-foreground-object
Repo
Framework
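
A sketch of the test-time back-propagation step described above: the trained generator G is frozen and only the latent code is optimized so that G(z) reproduces the background of the test frame, after which the difference is thresholded. G, the latent size, and the threshold are illustrative assumptions.

```python
# Sketch: background regeneration via latent-code optimization, then subtraction.
import torch

def segment_foreground(G, frame, z_dim=100, steps=200, lr=0.05, tau=0.1):
    z = torch.zeros(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((G(z) - frame) ** 2)   # match illumination/dynamics
        loss.backward()                          # gradients flow into z only
        opt.step()
    background = G(z).detach()
    mask = (frame - background).abs().mean(dim=1, keepdim=True) > tau
    return mask, background
```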

Variational Bayesian Optimal Experimental Design

Title Variational Bayesian Optimal Experimental Design
Authors Adam Foster, Martin Jankowiak, Eli Bingham, Paul Horsfall, Yee Whye Teh, Tom Rainforth, Noah Goodman
Abstract Bayesian optimal experimental design (BOED) is a principled framework for making efficient use of limited experimental resources. Unfortunately, its applicability is hampered by the difficulty of obtaining accurate estimates of the expected information gain (EIG) of an experiment. To address this, we introduce several classes of fast EIG estimators by building on ideas from amortized variational inference. We show theoretically and empirically that these estimators can provide significant gains in speed and accuracy over previous approaches. We further demonstrate the practicality of our approach on a number of end-to-end experiments.
Tasks
Published 2019-03-13
URL https://arxiv.org/abs/1903.05480v3
PDF https://arxiv.org/pdf/1903.05480v3.pdf
PWC https://paperswithcode.com/paper/variational-estimators-for-bayesian-optimal
Repo
Framework
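
A sketch of the nested Monte Carlo (NMC) baseline for the expected information gain, EIG(d) = E[log p(y|theta,d) - log p(y|d)], on a toy linear-Gaussian model; the paper's variational estimators aim to replace the expensive inner marginal estimate with a learned approximation. The model and sample sizes are illustrative.

```python
# Sketch: NMC estimate of EIG for theta ~ N(0,1), y ~ N(theta*d, 1).
import numpy as np

def eig_nmc(d, n_outer=500, n_inner=500, rng=np.random.default_rng(0)):
    theta = rng.normal(size=n_outer)
    y = theta * d + rng.normal(size=n_outer)
    log_lik = -0.5 * (y - theta * d) ** 2            # log N(y; theta*d, 1) + const
    theta_in = rng.normal(size=(n_outer, n_inner))   # fresh prior draws
    inner = -0.5 * (y[:, None] - theta_in * d) ** 2
    log_marg = np.log(np.exp(inner).mean(axis=1))    # Monte Carlo p(y|d); const cancels
    return np.mean(log_lik - log_marg)

print(eig_nmc(d=1.0))   # larger |d| should yield larger EIG in this toy model
```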

Towards Near-imperceptible Steganographic Text

Title Towards Near-imperceptible Steganographic Text
Authors Falcon Z. Dai, Zheng Cai
Abstract We show that the imperceptibility of several existing linguistic steganographic systems (Fang et al., 2017; Yang et al., 2018) relies on implicit assumptions on statistical behaviors of fluent text. We formally analyze them and empirically evaluate these assumptions. Furthermore, based on these observations, we propose an encoding algorithm called patient-Huffman with improved near-imperceptible guarantees.
Tasks
Published 2019-07-15
URL https://arxiv.org/abs/1907.06679v2
PDF https://arxiv.org/pdf/1907.06679v2.pdf
PWC https://paperswithcode.com/paper/towards-near-imperceptible-steganographic
Repo
Framework
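
A minimal sketch of the patient-Huffman idea: at each step, build a Huffman code over the model's next-token distribution and embed hidden bits only when the code's implied distribution is within a total-variation threshold of the true one, otherwise sample normally. The toy distribution and threshold are illustrative assumptions.

```python
# Sketch: the "patience" check of patient-Huffman encoding.
import heapq, itertools
import numpy as np

def huffman_lengths(probs):
    counter = itertools.count()
    heap = [(p, next(counter), (i,)) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = np.zeros(len(probs))
    while len(heap) > 1:
        p1, _, ids1 = heapq.heappop(heap)
        p2, _, ids2 = heapq.heappop(heap)
        for i in ids1 + ids2:
            lengths[i] += 1              # every merged symbol moves one level deeper
        heapq.heappush(heap, (p1 + p2, next(counter), ids1 + ids2))
    return lengths

def patient_step(probs, delta=0.1):
    lengths = huffman_lengths(probs)
    implied = 2.0 ** -lengths            # distribution the Huffman code induces
    tv = 0.5 * np.abs(probs - implied).sum()
    return ("embed" if tv < delta else "sample"), tv

probs = np.array([0.5, 0.25, 0.15, 0.1])
print(patient_step(probs))               # embed: code is near-imperceptible here
```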

KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment

Title KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment
Authors Vlad Hosu, Hanhe Lin, Tamas Sziranyi, Dietmar Saupe
Abstract Deep learning methods for image quality assessment (IQA) are limited by the small size of existing datasets. Extensive datasets require substantial resources both for generating publishable content and for annotating it accurately. We present a systematic and scalable approach to create KonIQ-10k, the largest IQA dataset to date, consisting of 10,073 quality-scored images. This is the first in-the-wild database aiming for ecological validity with regard to the authenticity of distortions, the diversity of content, and quality-related indicators. Through crowdsourcing, we obtained 1.2 million reliable quality ratings from 1,459 crowd workers, paving the way for more general IQA models. We propose a novel deep learning model (KonCept512) that shows excellent generalization beyond the test set (0.921 SROCC) to the current state-of-the-art database LIVE-in-the-Wild (0.825 SROCC). The model derives its core performance from the InceptionResNet architecture, trained at a higher resolution than previous models (512x384). A correlation analysis shows that KonCept512 performs similarly to having 9 subjective scores for each test image.
Tasks Blind Image Quality Assessment, Image Quality Assessment
Published 2019-10-14
URL https://arxiv.org/abs/1910.06180v1
PDF https://arxiv.org/pdf/1910.06180v1.pdf
PWC https://paperswithcode.com/paper/koniq-10k-an-ecologically-valid-database-for
Repo
Framework
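
A sketch of the evaluation protocol implied above: a CNN backbone with a single-output regression head predicts a mean opinion score per image, and performance is reported as Spearman rank correlation (SROCC) against subjective scores. The torchvision ResNet-50 here is an illustrative stand-in for the paper's InceptionResNet at 512x384.

```python
# Sketch: MOS regression head and SROCC evaluation for blind IQA.
import torch
import torch.nn as nn
from torchvision import models
from scipy.stats import spearmanr

backbone = models.resnet50(weights=None)
backbone.fc = nn.Linear(backbone.fc.in_features, 1)   # single MOS output

@torch.no_grad()
def srocc(model, images, mos):
    preds = model(images).squeeze(1)
    return spearmanr(preds.numpy(), mos.numpy()).correlation

images = torch.randn(4, 3, 384, 512)                  # landscape 512x384 inputs
mos = torch.rand(4) * 4 + 1                           # subjective scores in [1, 5]
print(srocc(backbone.eval(), images, mos))
```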

PAN: Path Integral Based Convolution for Deep Graph Neural Networks

Title PAN: Path Integral Based Convolution for Deep Graph Neural Networks
Authors Zheng Ma, Ming Li, Yuguang Wang
Abstract Convolution operations designed for graph-structured data usually utilize the graph Laplacian, which can be seen as message passing between adjacent neighbors through a generic random walk. In this paper, we propose PAN, a new graph convolution framework that involves every path linking the message sender and receiver, with learnable weights depending on the path length, which corresponds to the maximal entropy random walk. PAN generalizes the graph Laplacian to a new transition matrix that we call the maximal entropy transition (MET) matrix, derived from a path integral formalism. Most previous graph convolutional network architectures can be adapted to our framework, and many variations and derivatives based on the path integral idea can be developed. Experimental results show that path-integral-based graph neural networks learn well, converge quickly, and achieve state-of-the-art performance on benchmark tasks.
Tasks
Published 2019-04-24
URL http://arxiv.org/abs/1904.10996v1
PDF http://arxiv.org/pdf/1904.10996v1.pdf
PWC https://paperswithcode.com/paper/pan-path-integral-based-convolution-for-deep
Repo
Framework
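
A minimal sketch of the MET construction: powers of the adjacency matrix are combined with path-length-dependent weights and row-normalized into a transition matrix that replaces the Laplacian in message passing. The weights and cutoff here are illustrative (in PAN they follow the path-integral formalism and are learnable).

```python
# Sketch: maximal entropy transition (MET) matrix from weighted adjacency powers.
import numpy as np

def met_matrix(A, weights):
    n = A.shape[0]
    M = np.zeros_like(A, dtype=float)
    P = np.eye(n)
    for w in weights:                  # sum_n w_n * A^n over path lengths n
        M += w * P
        P = P @ A
    return M / M.sum(axis=1, keepdims=True)   # row-normalize to a transition matrix

A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], float)
M = met_matrix(A, weights=[1.0, 0.5, 0.25, 0.125])    # paths of length 0..3
X = np.random.default_rng(0).normal(size=(3, 4))      # node features
print(M @ X)                                          # one PAN-style propagation
```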

General Video Game Rule Generation

Title General Video Game Rule Generation
Authors Ahmed Khalifa, Michael Cerny Green, Diego Perez-Liebana, Julian Togelius
Abstract We introduce the General Video Game Rule Generation problem, and the eponymous software framework which will be used in a new track of the General Video Game AI (GVGAI) competition. The problem is, given a game level as input, to generate the rules of a game that fits that level. This can be seen as the inverse of the General Video Game Level Generation problem. Conceptualizing these two problems as separate helps break the very hard problem of generating complete games into smaller, more manageable subproblems. The proposed framework builds on the GVGAI software and thus asks the rule generator for rules defined in the Video Game Description Language. We describe the API and three different rule generators: a random, a constructive, and a search-based generator. Early results indicate that the constructive generator produces playable and somewhat interesting game rules but has a limited expressive range, whereas the search-based generator produces remarkably diverse rulesets, but of uneven quality.
Tasks
Published 2019-06-12
URL https://arxiv.org/abs/1906.05160v1
PDF https://arxiv.org/pdf/1906.05160v1.pdf
PWC https://paperswithcode.com/paper/general-video-game-rule-generation
Repo
Framework
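
A toy sketch of a random rule generator in the spirit of the framework: sample VGDL-like interaction and termination rules over the sprites found in a level. The sprite names and rule vocabulary here are hypothetical stand-ins, not the actual GVGAI/VGDL API.

```python
# Sketch: random VGDL-style ruleset generation (toy vocabulary).
import random

SPRITES = ["avatar", "wall", "gem", "enemy"]
EFFECTS = ["killSprite", "stepBack", "bounceForward", "transformTo"]

def random_ruleset(rng=random.Random(0), n_rules=4):
    interactions = [
        f"{rng.choice(SPRITES)} {rng.choice(SPRITES)} > {rng.choice(EFFECTS)}"
        for _ in range(n_rules)
    ]
    termination = [f"SpriteCounter stype={rng.choice(SPRITES)} limit=0 win=True"]
    return interactions, termination

rules, term = random_ruleset()
print("\n".join(rules + term))
```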

Do place cells dream of conditional probabilities? Learning Neural Nyström representations

Title Do place cells dream of conditional probabilities? Learning Neural Nyström representations
Authors Mariano Tepper
Abstract We posit that hippocampal place cells encode information about future locations under a transition distribution observed as an agent explores a given (physical or conceptual) space. The encoding of information about the current location, usually associated with place cells, then emerges as a necessary step to achieve this broader goal. We formally derive a biologically-inspired neural network from Nyström kernel approximations and empirically demonstrate that the network successfully approximates transition distributions. The proposed network yields representations that, just like place cells, soft-tile the input space with highly sparse and localized receptive fields. Additionally, we show that the proposed computational motif can be extended to handle supervised problems, creating class-specific place cells while exhibiting low sample complexity.
Tasks
Published 2019-06-03
URL https://arxiv.org/abs/1906.01102v2
PDF https://arxiv.org/pdf/1906.01102v2.pdf
PWC https://paperswithcode.com/paper/do-place-cells-dream-of-conditional
Repo
Framework
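
A minimal sketch of the Nyström kernel approximation at the heart of the derivation: a full kernel matrix is approximated from m landmark points as K_nm K_mm^+ K_mn. The Gaussian kernel and random landmark choice are illustrative assumptions.

```python
# Sketch: rank-m Nyström approximation of a Gaussian kernel matrix.
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                    # visited locations (toy)
landmarks = X[rng.choice(200, size=20, replace=False)]

K_nm = gaussian_kernel(X, landmarks)             # cross-similarities to landmarks
K_mm = gaussian_kernel(landmarks, landmarks)
K_approx = K_nm @ np.linalg.pinv(K_mm) @ K_nm.T  # rank-20 Nyström estimate

K_full = gaussian_kernel(X, X)
print("relative error:", np.linalg.norm(K_approx - K_full) / np.linalg.norm(K_full))
```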

3DSiameseNet to Analyze Brain MRI

Title 3DSiameseNet to Analyze Brain MRI
Authors Cecilia Ostertag, Marie Beurton-Aimar, Thierry Urruty
Abstract Predicting the cognitive evolution of a person susceptible to developing a neurodegenerative disorder is crucial for providing appropriate treatment as soon as possible. In this paper we propose a 3D siamese network designed to extract features from whole-brain 3D MRI images. We show that it is possible to extract meaningful features using convolutional layers, reducing the need for classical image processing operations such as segmentation, or for pre-computed features such as cortical thickness. For this study we used the Alzheimer’s Disease Neuroimaging Initiative (ADNI), a public database of 3D MRI brain images. We extracted a set of 247 subjects, each with two images taken within a 12-month interval, and compared these two images to measure the evolution of each patient’s state. Our work was initially inspired by Bhagwat et al. (2018), who proposed a siamese network to predict patient status, but without any convolutional layers and with the MRI images reduced to a vector of features extracted from predefined ROIs. We show that our network achieves an accuracy of 90% in classifying cognitively declining vs. stable patients. This result was obtained without the help of a cognitive score and with a small number of patients compared to the dataset sizes typically claimed in the deep learning domain.
Tasks
Published 2019-09-03
URL https://arxiv.org/abs/1909.01098v1
PDF https://arxiv.org/pdf/1909.01098v1.pdf
PWC https://paperswithcode.com/paper/3dsiamesenet-to-analyze-brain-mri
Repo
Framework
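
A minimal sketch of a siamese 3D CNN for two longitudinal scans: a shared Conv3d encoder embeds both visits, and the concatenated embeddings are classified as declining vs. stable. Layer sizes and input resolution are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch: siamese 3D CNN over two MRI scans of the same subject.
import torch
import torch.nn as nn

class Siamese3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 2)        # declining vs. stable

    def forward(self, scan_t0, scan_t12):
        e0 = self.encoder(scan_t0)          # shared weights for both visits
        e1 = self.encoder(scan_t12)
        return self.head(torch.cat([e0, e1], dim=1))

model = Siamese3D()
logits = model(torch.randn(2, 1, 32, 32, 32), torch.randn(2, 1, 32, 32, 32))
```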

Unsupervised pre-training for sequence to sequence speech recognition

Title Unsupervised pre-training for sequence to sequence speech recognition
Authors Zhiyun Fan, Shiyu Zhou, Bo Xu
Abstract This paper proposes a novel approach to pre-train an encoder-decoder sequence-to-sequence (seq2seq) model with unpaired speech and transcripts, respectively. Our pre-training method is divided into two stages, named acoustic pre-training and linguistic pre-training. In the acoustic pre-training stage, we use a large amount of speech to pre-train the encoder by predicting masked speech feature chunks from their context. In the linguistic pre-training stage, we generate synthesized speech from a large number of transcripts using a single-speaker text-to-speech (TTS) system, and use the synthesized paired data to pre-train the decoder. This two-stage pre-training method integrates rich acoustic and linguistic knowledge into the seq2seq model, which benefits downstream automatic speech recognition (ASR) tasks. The unsupervised pre-training is carried out on the AISHELL-2 dataset, and we apply the pre-trained model to multiple paired-data ratios of AISHELL-1 and HKUST. We obtain relative character error rate reductions (CERR) ranging from 38.24% to 7.88% on AISHELL-1 and from 12.00% to 1.20% on HKUST. Besides, we apply our pre-trained model to a cross-lingual case with the CALLHOME dataset. For all six languages in the CALLHOME dataset, our pre-training method makes the model consistently outperform the baseline.
Tasks Sequence-To-Sequence Speech Recognition, Speech Recognition
Published 2019-10-28
URL https://arxiv.org/abs/1910.12418v2
PDF https://arxiv.org/pdf/1910.12418v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-pre-traing-for-sequence-to
Repo
Framework
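
A minimal sketch of the acoustic pre-training objective: contiguous chunks of speech features are masked and the encoder is trained to reconstruct them from the surrounding context. The LSTM encoder, chunk size, and L1 loss are illustrative assumptions standing in for the paper's seq2seq encoder and exact objective.

```python
# Sketch: masked speech-feature-chunk prediction for encoder pre-training.
import torch
import torch.nn as nn

def mask_chunks(feats, chunk=10, n_chunks=2, rng=torch.Generator().manual_seed(0)):
    masked = feats.clone()
    positions = torch.randint(0, feats.size(1) - chunk, (n_chunks,), generator=rng)
    for p in positions:
        masked[:, p:p + chunk] = 0.0         # zero out each masked chunk
    return masked, positions

feats = torch.randn(4, 200, 80)              # (batch, frames, filterbank dims)
masked, positions = mask_chunks(feats)

encoder = nn.LSTM(80, 80, batch_first=True)  # stand-in for the seq2seq encoder
recon, _ = encoder(masked)
loss = sum(nn.functional.l1_loss(recon[:, p:p + 10], feats[:, p:p + 10])
           for p in positions)
loss.backward()                              # only masked regions drive learning
```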

End-to-End Learning of Geometric Deformations of Feature Maps for Virtual Try-On

Title End-to-End Learning of Geometric Deformations of Feature Maps for Virtual Try-On
Authors Thibaut Issenhuth, Jérémie Mary, Clément Calauzènes
Abstract The 2D virtual try-on task has recently attracted substantial interest from the research community, both for its direct potential applications in online shopping and for its inherent, unaddressed scientific challenges. The task requires fitting an in-shop cloth image onto the image of a person. It is highly challenging because it requires warping the cloth onto the target person while preserving its patterns and characteristics, and composing the item with the person in a realistic manner. Current state-of-the-art models generate images with visible artifacts, due either to a pixel-level composition step or to the geometric transformation. In this paper, we propose WUTON: a Warping U-net for a virtual Try-On system. It is a siamese U-net generator whose skip connections are geometrically transformed by a convolutional geometric matcher. The whole architecture is trained end-to-end with a multi-task loss including an adversarial term. This enables our network to generate and use realistic spatial transformations of the cloth to synthesize images of high visual quality. The proposed architecture can be trained end-to-end and allows us to advance towards a detail-preserving and photo-realistic 2D virtual try-on system. Our method outperforms the current state-of-the-art both visually and on the Learned Perceptual Image Patch Similarity (LPIPS) metric.
Tasks
Published 2019-06-04
URL https://arxiv.org/abs/1906.01347v2
PDF https://arxiv.org/pdf/1906.01347v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-learning-of-geometric-deformations
Repo
Framework
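
A minimal sketch of the core WUTON mechanism: a transform predicted by the geometric matcher is applied to the U-net's skip-connection feature maps with grid_sample, so cloth features are warped before being fused in the decoder. The affine parameterization (the paper uses a learned matcher) and feature shapes are illustrative assumptions.

```python
# Sketch: geometrically warping skip-connection feature maps.
import torch
import torch.nn.functional as F

def warp_skip(feat, theta):
    # feat: (B, C, H, W) cloth features; theta: (B, 2, 3) affine parameters
    grid = F.affine_grid(theta, feat.shape, align_corners=False)
    return F.grid_sample(feat, grid, align_corners=False)

feat = torch.randn(1, 64, 32, 32)
theta = torch.tensor([[[1.0, 0.1, 0.05],       # slight shear + shift (toy)
                       [0.0, 1.0, -0.05]]])
warped = warp_skip(feat, theta)                 # fed into the decoder stage
```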