Paper Group ANR 1457
Probabilistic Super-Resolution of Solar Magnetograms: Generating Many Explanations and Measuring Uncertainties. Perceptual Quality-preserving Black-Box Attack against Deep Learning Image Classifiers. Facial Expression Restoration Based on Improved Graph Convolutional Networks. Tutorial: Safe and Reliable Machine Learning. Multimodal Image Super-res …
Probabilistic Super-Resolution of Solar Magnetograms: Generating Many Explanations and Measuring Uncertainties
Title | Probabilistic Super-Resolution of Solar Magnetograms: Generating Many Explanations and Measuring Uncertainties |
Authors | Xavier Gitiaux, Shane A. Maloney, Anna Jungbluth, Carl Shneider, Paul J. Wright, Atılım Güneş Baydin, Michel Deudon, Yarin Gal, Alfredo Kalaitzis, Andrés Muñoz-Jaramillo |
Abstract | Machine learning techniques have been successfully applied to super-resolution tasks on natural images, where visually pleasing results are sufficient. In many scientific domains, however, this is not adequate, and estimates of errors and uncertainties are crucial. To address this issue, we propose a Bayesian framework that decomposes uncertainties into epistemic and aleatoric uncertainties. We test the validity of our approach by super-resolving images of the Sun's magnetic field and by generating maps measuring the range of possible high-resolution explanations compatible with a given low-resolution magnetogram. |
Tasks | Super-Resolution |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01486v1 |
PDF | https://arxiv.org/pdf/1911.01486v1.pdf |
PWC | https://paperswithcode.com/paper/probabilistic-super-resolution-of-solar |
Repo | |
Framework | |
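As a concrete illustration of the epistemic/aleatoric split this abstract describes, here is a minimal Monte Carlo dropout sketch in PyTorch. The network layout, the Gaussian likelihood head, and the sample count `T` are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProbSRNet(nn.Module):
    """Toy 2x super-resolution net: outputs a per-pixel mean and log-variance."""
    def __init__(self, hidden=32, p_drop=0.1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, hidden, 3, padding=1), nn.ReLU(),
            nn.Dropout2d(p_drop),                        # kept active at test time
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(hidden, 2, 3, padding=1)   # channels: mean, log sigma^2

    def forward(self, x):
        x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
        mu, log_var = self.head(self.body(x)).chunk(2, dim=1)
        return mu, log_var

@torch.no_grad()
def predict_with_uncertainty(model, lr_img, T=32):
    model.train()  # keep dropout stochastic so each pass is a posterior sample
    mus, variances = [], []
    for _ in range(T):
        mu, log_var = model(lr_img)
        mus.append(mu)
        variances.append(log_var.exp())
    mus = torch.stack(mus)                    # (T, B, 1, H, W)
    epistemic = mus.var(dim=0)                # spread of means across dropout masks
    aleatoric = torch.stack(variances).mean(0)  # average predicted noise variance
    return mus.mean(0), epistemic, aleatoric

mean, epi, alea = predict_with_uncertainty(ProbSRNet(), torch.randn(1, 1, 64, 64))
```

The two maps play different roles: `epistemic` shrinks with more training data, while `aleatoric` captures irreducible observation noise, which is the decomposition the abstract refers to.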
Perceptual Quality-preserving Black-Box Attack against Deep Learning Image Classifiers
Title | Perceptual Quality-preserving Black-Box Attack against Deep Learning Image Classifiers |
Authors | Diego Gragnaniello, Francesco Marra, Giovanni Poggi, Luisa Verdoliva |
Abstract | Deep neural networks provide unprecedented performance in all image classification problems, taking advantage of the huge amounts of data available for training. Recent studies, however, have shown their vulnerability to adversarial attacks, spawning an intense research effort in this field. With the aim of building better systems, new countermeasures and stronger attacks are proposed by the day. On the attacker's side, there is growing interest in the realistic black-box scenario, in which the user has no access to the neural network parameters. The problem is to design efficient attacks that mislead the neural network without compromising image quality. In this work, we propose to perform the black-box attack along a low-distortion path, so as to improve both the attack efficiency and the perceptual quality of the adversarial image. Numerical experiments on real-world systems prove the effectiveness of the proposed approach, both in benchmark classification tasks and in key applications in biometrics and forensics. |
Tasks | Face Recognition, Image Classification |
Published | 2019-02-20 |
URL | https://arxiv.org/abs/1902.07776v2 |
PDF | https://arxiv.org/pdf/1902.07776v2.pdf |
PWC | https://paperswithcode.com/paper/perceptual-quality-preserving-black-box |
Repo | |
Framework | |
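To make the "low-distortion path" idea concrete, below is a hedged score-based sketch: a greedy random-sign query attack (in the spirit of SimBA, not the authors' algorithm) that only accepts steps keeping the candidate's PSNR above a floor. The `query` callable, the step budget, and the 40 dB threshold are assumptions.

```python
import numpy as np

def psnr(ref, img, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images in [0, 1]."""
    mse = np.mean((ref - img) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def low_distortion_attack(query, x, label, steps=2000, eps=2 / 255,
                          min_psnr=40.0, seed=0):
    """Greedy black-box attack: accept a random-sign perturbation only if it
    lowers the true class's score AND keeps the image above a PSNR floor."""
    rng = np.random.default_rng(seed)
    adv = x.copy()
    best = query(adv)[label]          # black box: class probabilities only
    for _ in range(steps):
        cand = np.clip(adv + rng.choice([-eps, eps], size=x.shape), 0.0, 1.0)
        if psnr(x, cand) < min_psnr:  # stay on a low-distortion path
            continue
        score = query(cand)[label]
        if score < best:
            adv, best = cand, score
            if np.argmax(query(adv)) != label:
                break                 # misclassified: attack succeeded
    return adv
```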
Facial Expression Restoration Based on Improved Graph Convolutional Networks
Title | Facial Expression Restoration Based on Improved Graph Convolutional Networks |
Authors | Zhilei Liu, Le Li, Yunpeng Wu, Cuicui Zhang |
Abstract | Facial expression analysis in the wild is challenging when the facial image has low resolution or partial occlusion. Considering the correlations among different facial local regions under different facial expressions, this paper proposes a novel facial expression restoration method based on a generative adversarial network that integrates an improved graph convolutional network (IGCN) and a region relation modeling block (RRMB). Unlike conventional graph convolutional networks, which take vectors as input features, IGCN can take tensors of face patches as inputs, which better preserves the structural information of the face patches. The proposed RRMB is designed to address facial generative tasks, including inpainting and super-resolution, with facial action unit detection, aiming to restore the facial expression to match the ground truth. Extensive experiments conducted on the BP4D and DISFA benchmarks demonstrate the effectiveness of our proposed method through quantitative and qualitative evaluations. |
Tasks | Super-Resolution |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10344v1 |
PDF | https://arxiv.org/pdf/1910.10344v1.pdf |
PWC | https://paperswithcode.com/paper/facial-expression-restoration-based-on |
Repo | |
Framework | |
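One plausible reading of the IGCN idea — graph convolution whose node features are whole patch tensors rather than flattened vectors — is sketched below. Mixing patches with a row-normalized adjacency and transforming them with a shared 2D convolution is my assumption of the mechanism, not the authors' code.

```python
import torch
import torch.nn as nn

class IGCNLayer(nn.Module):
    """One graph-convolution step whose node features are image patches (tensors),
    propagated with a normalized adjacency and transformed by a shared 2D conv."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, padding=1)

    def forward(self, patches, adj):
        # patches: (N_nodes, C, h, w); adj: (N_nodes, N_nodes), self-loops included
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        norm_adj = adj / deg                            # row-normalized propagation
        mixed = torch.einsum("ij,jchw->ichw", norm_adj, patches)
        return torch.relu(self.conv(mixed))

# Nodes are face patches (e.g. eyes, mouth), edges encode region correlations.
layer = IGCNLayer(3, 16)
out = layer(torch.randn(5, 3, 16, 16), torch.eye(5))
```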
Tutorial: Safe and Reliable Machine Learning
Title | Tutorial: Safe and Reliable Machine Learning |
Authors | Suchi Saria, Adarsh Subbaswamy |
Abstract | This document serves as a brief overview of the “Safe and Reliable Machine Learning” tutorial given at the 2019 ACM Conference on Fairness, Accountability, and Transparency (FAT* 2019). The talk slides can be found here: https://bit.ly/2Gfsukp, while a video of the talk is available here: https://youtu.be/FGLOCkC4KmE, and a complete list of references for the tutorial here: https://bit.ly/2GdLPme. |
Tasks | |
Published | 2019-04-15 |
URL | http://arxiv.org/abs/1904.07204v1 |
PDF | http://arxiv.org/pdf/1904.07204v1.pdf |
PWC | https://paperswithcode.com/paper/tutorial-safe-and-reliable-machine-learning |
Repo | |
Framework | |
Multimodal Image Super-resolution via Deep Unfolding with Side Information
Title | Multimodal Image Super-resolution via Deep Unfolding with Side Information |
Authors | Iman Marivani, Evaggelia Tsiligianni, Bruno Cornelis, Nikos Deligiannis |
Abstract | Deep learning methods have been successfully applied to various computer vision tasks. However, existing neural network architectures do not per se incorporate domain knowledge about the addressed problem; thus, understanding what the model has learned is an open research topic. In this paper, we rely on the unfolding of an iterative algorithm for sparse approximation with side information, and design a deep learning architecture for multimodal image super-resolution that incorporates sparse priors and effectively utilizes information from another image modality. We develop two deep models performing reconstruction of a high-resolution image of a target image modality from its low-resolution variant with the aid of a high-resolution image from a second modality. We apply the proposed models to super-resolve near-infrared images using high-resolution RGB images as side information. Experimental results demonstrate the superior performance of the proposed models against state-of-the-art methods, including unimodal and multimodal approaches. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.08320v1 |
PDF | https://arxiv.org/pdf/1910.08320v1.pdf |
PWC | https://paperswithcode.com/paper/multimodal-image-super-resolution-via-deep |
Repo | |
Framework | |
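A minimal sketch of the deep-unfolding idea under stated assumptions: ISTA iterations for sparse coding are unrolled into network layers, with a learned injection of the side modality's features into the codes. Layer sizes, the per-iteration thresholds, and where the side information enters are illustrative choices, not the authors' exact model.

```python
import torch
import torch.nn as nn

def soft_threshold(v, t):
    """Proximal operator of the L1 norm (the sparsity-inducing step in ISTA)."""
    return torch.sign(v) * torch.relu(v.abs() - t)

class UnfoldedMultimodalSR(nn.Module):
    """K unfolded ISTA-style iterations over patch vectors: sparse codes for the
    target (e.g. NIR) modality, with a learned injection of the RGB guide."""
    def __init__(self, n_in=64, n_side=64, n_atoms=128, K=5):
        super().__init__()
        self.K = K
        self.We = nn.Linear(n_in, n_atoms)     # encodes the LR observation
        self.Ws = nn.Linear(n_side, n_atoms)   # encodes the HR side information
        self.S = nn.Linear(n_atoms, n_atoms)   # learned recurrence ("I - A^T A")
        self.theta = nn.Parameter(torch.full((K,), 0.1))  # per-iteration thresholds
        self.D = nn.Linear(n_atoms, 4 * n_in)  # decodes codes to a 2x-larger patch

    def forward(self, y_lr, x_side):
        b = self.We(y_lr) + self.Ws(x_side)
        z = soft_threshold(b, self.theta[0])
        for k in range(1, self.K):
            z = soft_threshold(b + self.S(z), self.theta[k])
        return self.D(z)
```

Because each layer is one iteration of a sparse-approximation algorithm, the learned weights retain an interpretation as dictionary-like operators, which is the interpretability argument the abstract makes.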
Illumination Invariant Foreground Object Segmentation using ForeGANs
Title | Illumination Invariant Foreground Object Segmentation using ForeGANs |
Authors | Maryam Sultana, Soon Ki Jung |
Abstract | Foreground segmentation algorithms suffer performance degradation in the presence of challenges such as dynamic backgrounds and varying illumination conditions. To handle these challenges, we present a foreground segmentation method based on a generative adversarial network (GAN). We aim to segment foreground objects in the presence of the two aforementioned major challenges in background scenes in real environments. To address this problem, our GAN model is trained on background image samples with dynamic changes; at test time, it must then generate a background sample with the same conditions as the test sample via a back-propagation technique. The generated background sample is subtracted from the given test sample to segment the foreground objects. A comparison of our proposed method with five state-of-the-art methods highlights the strength of our algorithm for foreground segmentation in the presence of challenging dynamic background scenarios. |
Tasks | Semantic Segmentation |
Published | 2019-02-07 |
URL | https://arxiv.org/abs/1902.03120v3 |
PDF | https://arxiv.org/pdf/1902.03120v3.pdf |
PWC | https://paperswithcode.com/paper/illumination-invariant-foreground-object |
Repo | |
Framework | |
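The "back-propagation technique" in the abstract is a latent-space search; a hedged sketch of that test-time step is below. The latent dimension, the L1 objective, and the residual threshold are assumptions; `G` stands for the trained background generator.

```python
import torch
import torch.nn.functional as F

def segment_foreground(G, test_frame, z_dim=100, steps=200, lr=0.05, thresh=0.1):
    """Optimize a latent code so the trained background generator G reproduces
    the test frame's background conditions, then subtract to get the foreground."""
    z = torch.zeros(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.l1_loss(G(z), test_frame)  # match illumination/dynamics of frame
        loss.backward()
        opt.step()
    background = G(z).detach()
    residual = (test_frame - background).abs().mean(dim=1, keepdim=True)
    return residual > thresh                # binary foreground mask
```

Because `G` can only synthesize backgrounds it has seen vary during training, whatever it cannot reproduce (the moving object) survives the subtraction.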
Variational Bayesian Optimal Experimental Design
Title | Variational Bayesian Optimal Experimental Design |
Authors | Adam Foster, Martin Jankowiak, Eli Bingham, Paul Horsfall, Yee Whye Teh, Tom Rainforth, Noah Goodman |
Abstract | Bayesian optimal experimental design (BOED) is a principled framework for making efficient use of limited experimental resources. Unfortunately, its applicability is hampered by the difficulty of obtaining accurate estimates of the expected information gain (EIG) of an experiment. To address this, we introduce several classes of fast EIG estimators by building on ideas from amortized variational inference. We show theoretically and empirically that these estimators can provide significant gains in speed and accuracy over previous approaches. We further demonstrate the practicality of our approach on a number of end-to-end experiments. |
Tasks | |
Published | 2019-03-13 |
URL | https://arxiv.org/abs/1903.05480v3 |
PDF | https://arxiv.org/pdf/1903.05480v3.pdf |
PWC | https://paperswithcode.com/paper/variational-estimators-for-bayesian-optimal |
Repo | |
Framework | |
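The expected information gain (EIG) is the quantity all of the paper's estimators target. The brute-force nested Monte Carlo baseline below shows it for a toy linear-Gaussian model (the model and sample sizes are assumptions); the paper's contribution is to replace the expensive inner marginal average with amortized variational approximations.

```python
import numpy as np

def eig_nmc(design, n_outer=500, n_inner=500, sigma=1.0,
            rng=np.random.default_rng(0)):
    """Nested Monte Carlo EIG for the model y = design * theta + noise,
    theta ~ N(0, 1): EIG(d) = E[log p(y|theta,d) - log p(y|d)]."""
    theta = rng.normal(size=n_outer)
    y = design * theta + sigma * rng.normal(size=n_outer)
    # log p(y|theta,d) up to a constant that cancels against the marginal term
    log_lik = -0.5 * ((y - design * theta) / sigma) ** 2
    # inner average over fresh prior draws approximates log p(y|d)
    theta_in = rng.normal(size=(n_inner, 1))
    log_marg = -0.5 * ((y[None, :] - design * theta_in) / sigma) ** 2
    log_marg = np.log(np.mean(np.exp(log_marg), axis=0))
    return np.mean(log_lik - log_marg)

# Larger |design| amplifies the signal, so the EIG should increase with it.
print(eig_nmc(0.5), eig_nmc(2.0))
```

The nested structure is exactly what makes naive EIG estimation expensive: the cost is the product of the outer and inner sample sizes, which motivates the fast estimators the abstract describes.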
Towards Near-imperceptible Steganographic Text
Title | Towards Near-imperceptible Steganographic Text |
Authors | Falcon Z. Dai, Zheng Cai |
Abstract | We show that the imperceptibility of several existing linguistic steganographic systems (Fang et al., 2017; Yang et al., 2018) relies on implicit assumptions on statistical behaviors of fluent text. We formally analyze them and empirically evaluate these assumptions. Furthermore, based on these observations, we propose an encoding algorithm called patient-Huffman with improved near-imperceptible guarantees. |
Tasks | |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06679v2 |
PDF | https://arxiv.org/pdf/1907.06679v2.pdf |
PWC | https://paperswithcode.com/paper/towards-near-imperceptible-steganographic |
Repo | |
Framework | |
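The patient-Huffman idea is concrete enough to sketch: embed hidden bits through a Huffman code over the language model's next-token distribution, but only at steps where the Huffman-implied distribution is within a total-variation budget delta of the true one; otherwise sample normally ("be patient"). The rendering below is a toy under that reading, not the authors' code.

```python
import heapq
import itertools

def huffman_code(probs):
    """Map token -> bitstring via a standard Huffman tree over next-token probs."""
    counter = itertools.count()  # tie-breaker so heapq never compares dicts
    heap = [(p, next(counter), {tok: ""}) for tok, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {t: "0" + b for t, b in c1.items()}
        merged.update({t: "1" + b for t, b in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

def patient_huffman_step(probs, bit_iter, delta, sample):
    """Embed bits only when the Huffman-implied distribution is within total
    variation delta of the true distribution; otherwise be patient and sample."""
    code = huffman_code(probs)
    implied = {t: 2.0 ** -len(b) for t, b in code.items()}
    tv = 0.5 * sum(abs(probs[t] - implied[t]) for t in probs)
    if tv > delta:
        return sample(probs), False   # embedding here would be too detectable
    inv = {b: t for t, b in code.items()}
    prefix = ""
    while prefix not in inv:          # walk secret bits down the prefix-free code
        prefix += next(bit_iter)
    return inv[prefix], True
```

Decoding reverses the walk: the receiver, sharing the language model and delta, rebuilds the same code at each step and reads the bits off the chosen token's codeword.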
KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment
Title | KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment |
Authors | Vlad Hosu, Hanhe Lin, Tamas Sziranyi, Dietmar Saupe |
Abstract | Deep learning methods for image quality assessment (IQA) are limited by the small size of existing datasets. Extensive datasets require substantial resources, both for generating publishable content and for annotating it accurately. We present a systematic and scalable approach to create KonIQ-10k, the largest IQA dataset to date, consisting of 10,073 quality-scored images. This is the first in-the-wild database aiming for ecological validity with regard to the authenticity of distortions, the diversity of content, and quality-related indicators. Through the use of crowdsourcing, we obtained 1.2 million reliable quality ratings from 1,459 crowd workers, paving the way for more general IQA models. We propose a novel deep learning model (KonCept512) that shows excellent generalization beyond the test set (0.921 SROCC) to the current state-of-the-art database LIVE-in-the-Wild (0.825 SROCC). The model derives its core performance from the InceptionResNet architecture, being trained at a higher resolution than previous models (512x384). A correlation analysis shows that KonCept512 performs similarly to having 9 subjective scores for each test image. |
Tasks | Blind Image Quality Assessment, Image Quality Assessment |
Published | 2019-10-14 |
URL | https://arxiv.org/abs/1910.06180v1 |
PDF | https://arxiv.org/pdf/1910.06180v1.pdf |
PWC | https://paperswithcode.com/paper/koniq-10k-an-ecologically-valid-database-for |
Repo | |
Framework | |
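A minimal sketch of the KonCept512-style setup: a CNN backbone regressing a mean opinion score (MOS), evaluated with the SROCC metric quoted in the abstract. The paper uses an InceptionResNet body at 512x384 input; `resnet18` below is only a lightweight stand-in, and the head sizes are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18
from scipy.stats import spearmanr

class IQARegressor(nn.Module):
    """Backbone + MOS-regression head in the spirit of KonCept512."""
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()          # expose 512-d pooled features
        self.backbone = backbone
        self.head = nn.Sequential(nn.Linear(512, 256), nn.ReLU(),
                                  nn.Linear(256, 1))

    def forward(self, x):                    # x: (B, 3, 384, 512) images
        return self.head(self.backbone(x)).squeeze(-1)

def srocc(pred, mos):
    """Spearman rank-order correlation, the generalization metric in the abstract."""
    return spearmanr(pred, mos).correlation

model = IQARegressor()
scores = model(torch.randn(4, 3, 384, 512))  # predicted MOS per image
```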
PAN: Path Integral Based Convolution for Deep Graph Neural Networks
Title | PAN: Path Integral Based Convolution for Deep Graph Neural Networks |
Authors | Zheng Ma, Ming Li, Yuguang Wang |
Abstract | Convolution operations designed for graph-structured data usually utilize the graph Laplacian, which can be seen as message passing between adjacent neighbors through a generic random walk. In this paper, we propose PAN, a new graph convolution framework that involves every path linking the message sender and receiver, with learnable weights depending on the path length, which corresponds to the maximal entropy random walk. PAN generalizes the graph Laplacian to a new transition matrix that we call the \emph{maximal entropy transition} (MET) matrix, derived from a path integral formalism. Most previous graph convolutional network architectures can be adapted to our framework, and many variations and derivatives based on the path integral idea can be developed. Experimental results show that path-integral-based graph neural networks exhibit strong learnability and fast convergence, and achieve state-of-the-art performance on benchmark tasks. |
Tasks | |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.10996v1 |
PDF | http://arxiv.org/pdf/1904.10996v1.pdf |
PWC | https://paperswithcode.com/paper/pan-path-integral-based-convolution-for-deep |
Repo | |
Framework | |
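Under one plausible reading of the abstract, the MET operator is a weighted sum of adjacency powers — one learnable weight per path length — normalized into a transition matrix. The sketch below renders that reading in NumPy; the row normalization is an assumption, not the authors' exact formula.

```python
import numpy as np

def met_matrix(A, weights):
    """MET-style operator: a weighted sum of adjacency powers (one weight per
    path length, A^0 = I giving self-loops), row-normalized into a transition
    matrix. The per-length weights are the learnable quantities PAN trains."""
    Z = sum(w * np.linalg.matrix_power(A, n) for n, w in enumerate(weights))
    return Z / Z.sum(axis=1, keepdims=True)

def pan_conv(A, X, W, weights):
    """One PAN-style convolution: propagate node features with the MET operator,
    then apply a shared linear transform (a sketch, not the authors' code)."""
    return met_matrix(A, weights) @ X @ W

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # path graph
out = pan_conv(A, np.random.randn(3, 4), np.random.randn(4, 2),
               weights=[1.0, 0.5, 0.25])   # paths up to length 2
```

Setting all weights beyond length one to zero recovers ordinary one-hop (Laplacian-style) message passing, which is the sense in which PAN generalizes it.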
General Video Game Rule Generation
Title | General Video Game Rule Generation |
Authors | Ahmed Khalifa, Michael Cerny Green, Diego Perez-Liebana, Julian Togelius |
Abstract | We introduce the General Video Game Rule Generation problem, and the eponymous software framework that will be used in a new track of the General Video Game AI (GVGAI) competition. The problem is, given a game level as input, to generate the rules of a game that fits that level. This can be seen as the inverse of the General Video Game Level Generation problem. Conceptualizing these two problems as separate helps break the very hard problem of generating complete games into smaller, more manageable subproblems. The proposed framework builds on the GVGAI software and thus asks the rule generator for rules defined in the Video Game Description Language. We describe the API and three different rule generators: a random, a constructive, and a search-based generator. Early results indicate that the constructive generator produces playable and somewhat interesting game rules but has a limited expressive range, whereas the search-based generator produces remarkably diverse rulesets, but of uneven quality. |
Tasks | |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05160v1 |
PDF | https://arxiv.org/pdf/1906.05160v1.pdf |
PWC | https://paperswithcode.com/paper/general-video-game-rule-generation |
Repo | |
Framework | |
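To make the generator interface concrete, here is a toy analogue of the random generator described above, emitting VGDL-flavored interaction and termination rules over a level's sprite set. The sprite list, the effect pool, and the output format are illustrative assumptions; the real framework works against the GVGAI Java API.

```python
import random

SPRITES = ["avatar", "wall", "enemy", "coin"]          # taken from the input level
EFFECTS = ["killSprite", "stepBack", "bounceForward",  # a few standard VGDL effects
           "transformTo stype=wall"]
TERMINATIONS = ["SpriteCounter stype=coin limit=0 win=True",
                "Timeout limit=1000 win=False"]

def random_ruleset(n_rules=4, rng=random.Random(0)):
    """Toy analogue of the paper's random generator: sample interaction rules
    and a termination condition over the sprites present in the input level."""
    rules = [f"{rng.choice(SPRITES)} {rng.choice(SPRITES)} > {rng.choice(EFFECTS)}"
             for _ in range(n_rules)]
    return {"InteractionSet": rules, "TerminationSet": [rng.choice(TERMINATIONS)]}

print(random_ruleset())
```

The constructive and search-based generators differ only in how this sampling is constrained or iterated: the constructive one applies heuristics tied to the level's sprites, while the search-based one evolves rulesets against a fitness measure.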
Do place cells dream of conditional probabilities? Learning Neural Nyström representations
Title | Do place cells dream of conditional probabilities? Learning Neural Nyström representations |
Authors | Mariano Tepper |
Abstract | We posit that hippocampal place cells encode information about future locations under a transition distribution observed as an agent explores a given (physical or conceptual) space. The encoding of information about the current location, usually associated with place cells, then emerges as a necessary step to achieve this broader goal. We formally derive a biologically inspired neural network from Nyström kernel approximations and empirically demonstrate that the network successfully approximates transition distributions. The proposed network yields representations that, just like place cells, soft-tile the input space with highly sparse and localized receptive fields. Additionally, we show that the proposed computational motif can be extended to handle supervised problems, creating class-specific place cells while exhibiting low sample complexity. |
Tasks | |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.01102v2 |
PDF | https://arxiv.org/pdf/1906.01102v2.pdf |
PWC | https://paperswithcode.com/paper/do-place-cells-dream-of-conditional |
Repo | |
Framework | |
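The classical Nyström step the network is derived from is easy to state: approximate a kernel matrix from m landmark points via K ≈ K_nm K_mm^{-1} K_mn. The sketch below computes the corresponding feature map; the RBF kernel and the landmark choice are illustrative, and the paper's neural parameterization of this step is not reproduced here.

```python
import numpy as np

def nystrom_features(X, landmarks, gamma=1.0):
    """Nyström feature map phi with phi @ phi.T ≈ K, built from m landmarks."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    K_nm = rbf(X, landmarks)                 # (n, m) cross-kernel
    K_mm = rbf(landmarks, landmarks)         # (m, m) landmark kernel
    vals, vecs = np.linalg.eigh(K_mm)        # K_mm^{-1/2} via eigendecomposition
    vals = np.clip(vals, 1e-12, None)        # guard against numerical negatives
    return K_nm @ vecs / np.sqrt(vals)       # phi = K_nm U diag(vals)^{-1/2}

phi = nystrom_features(np.random.randn(100, 5), np.random.randn(10, 5))
```

The place-cell analogy the abstract draws is that each landmark behaves like a localized receptive field: a point's feature vector is large only for the few landmarks near it, soft-tiling the input space.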
3DSiameseNet to Analyze Brain MRI
Title | 3DSiameseNet to Analyze Brain MRI |
Authors | Cecilia Ostertag, Marie Beurton-Aimar, Thierry Urruty |
Abstract | Predicting the cognitive evolution of a person susceptible to developing a neurodegenerative disorder is crucial to providing appropriate treatment as soon as possible. In this paper we propose a 3D siamese network designed to extract features from whole-brain 3D MRI images. We show that it is possible to extract meaningful features using convolutional layers, reducing the need for classical image processing operations such as segmentation, or for pre-computed features such as cortical thickness. For this study we used the Alzheimer's Disease Neuroimaging Initiative (ADNI), a public database of 3D MRI brain images. A set of 247 subjects was extracted, each subject having 2 images acquired within a 12-month span. To measure the evolution of the patients' states, we compared these 2 images. Our work was initially inspired by a 2018 article by Bhagwat et al., who proposed a siamese network to predict patient status, but without any convolutional layers and with the MRI images reduced to a vector of features extracted from predefined ROIs. We show that our network achieves an accuracy of 90% in classifying cognitively declining vs. stable patients. This result was obtained without the help of a cognitive score and with a small number of patients compared to the dataset sizes commonly claimed in the deep learning domain. |
Tasks | |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.01098v1 |
PDF | https://arxiv.org/pdf/1909.01098v1.pdf |
PWC | https://paperswithcode.com/paper/3dsiamesenet-to-analyze-brain-mri |
Repo | |
Framework | |
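A minimal sketch of the architecture class this abstract describes: a shared 3D-convolutional encoder applied to the two MRI volumes acquired 12 months apart, with the concatenated embeddings classified as declining vs. stable. Channel counts, depths, and the fusion-by-concatenation are assumptions.

```python
import torch
import torch.nn as nn

class Siamese3D(nn.Module):
    """Shared 3D-conv encoder for two MRI volumes of the same subject taken
    12 months apart; fused embeddings feed a declining-vs-stable classifier."""
    def __init__(self, c=8):
        super().__init__()
        self.encoder = nn.Sequential(           # shared weights = siamese
            nn.Conv3d(1, c, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(c, 2 * c, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
        )
        self.classifier = nn.Sequential(nn.Linear(4 * c, 32), nn.ReLU(),
                                        nn.Linear(32, 2))

    def forward(self, mri_t0, mri_t12):
        z0, z1 = self.encoder(mri_t0), self.encoder(mri_t12)
        return self.classifier(torch.cat([z0, z1], dim=1))

logits = Siamese3D()(torch.randn(1, 1, 64, 64, 64),
                     torch.randn(1, 1, 64, 64, 64))
```

Weight sharing is the essential design choice: both time points pass through the same encoder, so the classifier sees a representation of change rather than two unrelated embeddings.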
Unsupervised pre-training for sequence to sequence speech recognition
Title | Unsupervised pre-training for sequence to sequence speech recognition |
Authors | Zhiyun Fan, Shiyu Zhou, Bo Xu |
Abstract | This paper proposes a novel approach to pre-train an encoder-decoder sequence-to-sequence (seq2seq) model with unpaired speech and transcripts, respectively. Our pre-training method is divided into two stages, named acoustic pre-training and linguistic pre-training. In the acoustic pre-training stage, we use a large amount of speech to pre-train the encoder by predicting masked speech feature chunks from their context. In the linguistic pre-training stage, we generate synthesized speech from a large number of transcripts using a single-speaker text-to-speech (TTS) system, and use the synthesized paired data to pre-train the decoder. This two-stage pre-training method integrates rich acoustic and linguistic knowledge into the seq2seq model, which benefits downstream automatic speech recognition (ASR) tasks. The unsupervised pre-training is performed on the AISHELL-2 dataset, and we apply the pre-trained model to multiple paired-data ratios of AISHELL-1 and HKUST. We obtain relative character error rate reductions (CERR) ranging from 38.24% to 7.88% on AISHELL-1 and from 12.00% to 1.20% on HKUST. Besides, we apply our pre-trained model to a cross-lingual case with the CALLHOME dataset. For all six languages in the CALLHOME dataset, our pre-training method makes the model consistently outperform the baseline. |
Tasks | Sequence-To-Sequence Speech Recognition, Speech Recognition |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12418v2 |
PDF | https://arxiv.org/pdf/1910.12418v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-pre-traing-for-sequence-to |
Repo | |
Framework | |
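The acoustic pre-training stage reduces to a masked-prediction objective. The sketch below shows one such training step under assumed details — contiguous chunk masking, an L1 reconstruction loss, and an `encoder` that returns frame-level reconstructions — rather than the authors' exact recipe.

```python
import torch
import torch.nn.functional as F

def masked_chunk_loss(encoder, feats, chunk=10, mask_value=0.0):
    """One acoustic pre-training step: zero out a contiguous chunk of speech
    features and train the encoder to reconstruct it from the context."""
    B, T, D = feats.shape
    assert T > chunk, "utterance must be longer than the masked chunk"
    start = torch.randint(0, T - chunk, (1,)).item()
    masked = feats.clone()
    masked[:, start:start + chunk, :] = mask_value   # hide the chunk
    pred = encoder(masked)                           # (B, T, D) reconstruction
    return F.l1_loss(pred[:, start:start + chunk, :],
                     feats[:, start:start + chunk, :])

# Usage: loss = masked_chunk_loss(encoder, fbank_features); loss.backward()
```

The linguistic stage is complementary: TTS turns unpaired transcripts into synthetic (speech, text) pairs, and only the decoder is pre-trained on them, so each half of the seq2seq model gets knowledge from the data it lacks.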
End-to-End Learning of Geometric Deformations of Feature Maps for Virtual Try-On
Title | End-to-End Learning of Geometric Deformations of Feature Maps for Virtual Try-On |
Authors | Thibaut Issenhuth, Jérémie Mary, Clément Calauzènes |
Abstract | The 2D virtual try-on task has recently attracted a lot of interest from the research community, both for its direct potential applications in online shopping and for its inherent, unaddressed scientific challenges. The task requires fitting an in-shop cloth image onto the image of a person. It is highly challenging because it requires warping the cloth onto the target person while preserving its patterns and characteristics, and composing the item with the person in a realistic manner. Current state-of-the-art models generate images with visible artifacts, due either to a pixel-level composition step or to the geometric transformation. In this paper, we propose WUTON: a Warping U-net for a Virtual Try-On system. It is a siamese U-net generator whose skip connections are geometrically transformed by a convolutional geometric matcher. The whole architecture is trained end-to-end with a multi-task loss including an adversarial one. This enables our network to generate and use realistic spatial transformations of the cloth to synthesize images of high visual quality, advancing towards a detail-preserving and photo-realistic 2D virtual try-on system. Our method outperforms the current state-of-the-art with visual results as well as on the Learned Perceptual Image Patch Similarity (LPIPS) metric. |
Tasks | |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01347v2 |
PDF | https://arxiv.org/pdf/1906.01347v2.pdf |
PWC | https://paperswithcode.com/paper/end-to-end-learning-of-geometric-deformations |
Repo | |
Framework | |
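The distinctive piece of WUTON is that the geometric matcher transforms the U-net's skip connections rather than output pixels. Below is a hedged sketch of that step using an affine warp; the paper's matcher may predict a richer transformation, and `theta` stands for whatever parameters the convolutional matcher outputs.

```python
import torch
import torch.nn.functional as F

def warp_skip(features, theta):
    """Warp a U-net skip connection's feature map with parameters predicted by
    a geometric matcher (affine here for brevity)."""
    # features: (B, C, H, W); theta: (B, 2, 3) affine parameters
    grid = F.affine_grid(theta, features.size(), align_corners=False)
    return F.grid_sample(features, grid, align_corners=False)

feat = torch.randn(2, 64, 32, 32)
identity = torch.tensor([[[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0]]]).repeat(2, 1, 1)
warped = warp_skip(feat, identity)  # identity theta returns the input (up to sampling)
```

Warping features instead of pixels lets the decoder repair interpolation artifacts during synthesis, which is one way to read the abstract's claim of fewer visible artifacts than pixel-level composition.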