October 20, 2019

Paper Group AWR 239

Super-resolution of Sentinel-2 images: Learning a globally applicable deep neural network. Photo-Realistic Blocksworld Dataset. Where are the Blobs: Counting by Localization with Point Supervision. Improving Object Counting with Heatmap Regulation. Perceptual deep depth super-resolution. DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinf …

Super-resolution of Sentinel-2 images: Learning a globally applicable deep neural network

Title Super-resolution of Sentinel-2 images: Learning a globally applicable deep neural network
Authors Charis Lanaras, José Bioucas-Dias, Silvano Galliani, Emmanuel Baltsavias, Konrad Schindler
Abstract The Sentinel-2 satellite mission delivers multi-spectral imagery with 13 spectral bands, acquired at three different spatial resolutions. The aim of this research is to super-resolve the lower-resolution (20 m and 60 m Ground Sampling Distance - GSD) bands to 10 m GSD, so as to obtain a complete data cube at the maximal sensor resolution. We employ a state-of-the-art convolutional neural network (CNN) to perform end-to-end upsampling, which is trained with data at lower resolution, i.e., from 40->20 m, respectively 360->60 m GSD. In this way, one has access to a virtually infinite amount of training data, by downsampling real Sentinel-2 images. We use data sampled globally over a wide range of geographical locations, to obtain a network that generalises across different climate zones and land-cover types, and can super-resolve arbitrary Sentinel-2 images without the need of retraining. In quantitative evaluations (at lower scale, where ground truth is available), our network, which we call DSen2, outperforms the best competing approach by almost 50% in RMSE, while better preserving the spectral characteristics. It also delivers visually convincing results at the full 10 m GSD. The code is available at https://github.com/lanha/DSen2
Tasks Super-Resolution
Published 2018-03-12
URL http://arxiv.org/abs/1803.04271v2
PDF http://arxiv.org/pdf/1803.04271v2.pdf
PWC https://paperswithcode.com/paper/super-resolution-of-sentinel-2-images
Repo https://github.com/deephdc/satsr
Framework tf
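
The training setup described in the abstract above (downsample real images, then learn to undo the downsampling) can be sketched roughly as follows. This is a minimal illustration and not the DSen2 code: the box-average filter, the function names `downsample` and `make_training_pair`, and the omission of the high-resolution guide bands are all assumptions made for brevity.

```python
def downsample(band, factor):
    """Box-average a 2-D band (list of rows) by an integer factor.

    Assumption: a plain box filter stands in for whatever
    sensor-aware filter a real pipeline would use.
    """
    rows, cols = len(band), len(band[0])
    if rows % factor or cols % factor:
        raise ValueError("band dimensions must be divisible by factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [band[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def make_training_pair(band, factor):
    """Return (network input, target) for one band.

    factor=2 mimics the 40 m -> 20 m task, factor=6 the 360 m -> 60 m task.
    The original band serves as ground truth for its downsampled copy.
    """
    return downsample(band, factor), band


# Toy 4x4 "20 m" band; real bands are far larger.
band_20m = [[1, 2, 3, 4],
            [5, 6, 7, 8],
            [9, 10, 11, 12],
            [13, 14, 15, 16]]
low_res, target = make_training_pair(band_20m, 2)
```

Because the target is always the original band, any archived Sentinel-2 scene can yield training pairs without manual labelling, which is what makes the supply of training data effectively unlimited.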

Photo-Realistic Blocksworld Dataset

Title Photo-Realistic Blocksworld Dataset
Authors Masataro Asai
Abstract In this report, we introduce an artificial dataset generator for the Photo-realistic Blocksworld domain. Blocksworld is one of the oldest high-level task planning domains; it is well defined yet contains sufficient complexity, e.g., conflicting subgoals and decomposability into subproblems. We aim to make this dataset a benchmark for Neural-Symbolic integrated systems and accelerate the research in this area. The key advantage of such systems is the ability to obtain a symbolic model from the real-world input and perform a fast, systematic, complete algorithm for symbolic reasoning, without any supervision or reward signal from the environment.
Tasks
Published 2018-12-05
URL http://arxiv.org/abs/1812.01818v1
PDF http://arxiv.org/pdf/1812.01818v1.pdf
PWC https://paperswithcode.com/paper/photo-realistic-blocksworld-dataset
Repo https://github.com/ibm/photorealistic-blocksworld
Framework none

Where are the Blobs: Counting by Localization with Point Supervision

Title Where are the Blobs: Counting by Localization with Point Supervision
Authors Issam H. Laradji, Negar Rostamzadeh, Pedro O. Pinheiro, David Vazquez, Mark Schmidt
Abstract Object counting is an important task in computer vision due to its growing demand in applications such as surveillance, traffic monitoring, and counting everyday objects. State-of-the-art methods use regression-based optimization where they explicitly learn to count the objects of interest. These often perform better than detection-based methods that need to learn the more difficult task of predicting the location, size, and shape of each object. However, we propose a detection-based method that does not need to estimate the size and shape of the objects and that outperforms regression-based methods. Our contributions are three-fold: (1) we propose a novel loss function that encourages the network to output a single blob per object instance using point-level annotations only; (2) we design two methods for splitting large predicted blobs between object instances; and (3) we show that our method achieves new state-of-the-art results on several challenging datasets including the Pascal VOC and the Penguins dataset. Our method even outperforms those that use stronger supervision such as depth features, multi-point annotations, and bounding-box labels.
Tasks Object Counting
Published 2018-07-25
URL http://arxiv.org/abs/1807.09856v1
PDF http://arxiv.org/pdf/1807.09856v1.pdf
PWC https://paperswithcode.com/paper/where-are-the-blobs-counting-by-localization
Repo https://github.com/ElementAI/LCFCN
Framework pytorch
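
If a network outputs one blob per object, as the abstract above describes, the count can be read off by counting connected components in the predicted mask. The sketch below assumes a binary mask given as nested lists and 4-connectivity; both choices, and the name `count_blobs`, are illustrative assumptions rather than details taken from the paper or its repository.

```python
def count_blobs(mask):
    """Count 4-connected components of nonzero cells in a 2-D mask."""
    if not mask:
        return 0
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # New blob found: flood-fill it so it is counted only once.
            count += 1
            seen[r][c] = True
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count


# Toy prediction containing three separate blobs.
predicted_mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
object_count = count_blobs(predicted_mask)
```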

Improving Object Counting with Heatmap Regulation

Title Improving Object Counting with Heatmap Regulation
Authors Shubhra Aich, Ian Stavness
Abstract In this paper, we propose a simple and effective way to improve one-look regression models for object counting from images. We use class activation map visualizations to illustrate the drawbacks of learning a pure one-look regression model for a counting task. Based on these insights, we enhance one-look regression counting models by regulating activation maps from the final convolution layer of the network with coarse ground-truth activation maps generated from simple dot annotations. We call this strategy heatmap regulation (HR). We show that this simple enhancement effectively suppresses false detections generated by the corresponding one-look baseline model and also improves the performance in terms of false negatives. Evaluations are performed on four different counting datasets — two for car counting (CARPK, PUCPR+), one for crowd counting (WorldExpo) and another for biological cell counting (VGG-Cells). Adding HR to a simple VGG front-end improves performance on all these benchmarks compared to a simple one-look baseline model and results in state-of-the-art performance for car counting.
Tasks Crowd Counting, Object Counting
Published 2018-03-14
URL http://arxiv.org/abs/1803.05494v2
PDF http://arxiv.org/pdf/1803.05494v2.pdf
PWC https://paperswithcode.com/paper/improving-object-counting-with-heatmap
Repo https://github.com/littleaich/heatmap-regulation
Framework pytorch
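
One plausible way to turn dot annotations into a coarse target heatmap, as the abstract above describes at a high level, is to place a Gaussian bump at each dot. The Gaussian kernel, the `sigma` parameter, and the function name `dots_to_heatmap` are assumptions for illustration; the abstract does not specify how the ground-truth maps are built.

```python
import math


def dots_to_heatmap(dots, height, width, sigma=1.0):
    """Sum one unnormalised Gaussian per (row, col) dot annotation."""
    heatmap = [[0.0] * width for _ in range(height)]
    for r0, c0 in dots:
        for r in range(height):
            for c in range(width):
                dist_sq = (r - r0) ** 2 + (c - c0) ** 2
                heatmap[r][c] += math.exp(-dist_sq / (2.0 * sigma ** 2))
    return heatmap


# One annotated object in the centre of a 3x3 grid.
target_map = dots_to_heatmap([(1, 1)], height=3, width=3)
```

A regulation term could then penalise the distance between the network's final activation map and `target_map`, alongside the usual count regression loss.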

Perceptual deep depth super-resolution

Title Perceptual deep depth super-resolution
Authors Oleg Voynov, Alexey Artemov, Vage Egiazarian, Alexander Notchenko, Gleb Bobrovskikh, Denis Zorin, Evgeny Burnaev
Abstract RGBD images, combining high-resolution color and lower-resolution depth from various types of depth sensors, are increasingly common. One can significantly improve the resolution of depth maps by taking advantage of color information; deep learning methods make combining color and depth information particularly easy. However, fusing these two sources of data may lead to a variety of artifacts. If depth maps are used to reconstruct 3D shapes, e.g., for virtual reality applications, the visual quality of upsampled images is particularly important. The main idea of our approach is to measure the quality of depth map upsampling using renderings of resulting 3D surfaces. We demonstrate that a simple visual appearance-based loss, when used with either a trained CNN or simply a deep prior, yields significantly improved 3D shapes, as measured by a number of existing perceptual metrics. We compare this approach with a number of existing optimization and learning-based techniques.
Tasks Super-Resolution
Published 2018-12-24
URL https://arxiv.org/abs/1812.09874v3
PDF https://arxiv.org/pdf/1812.09874v3.pdf
PWC https://paperswithcode.com/paper/perceptually-based-single-image-depth-super
Repo https://github.com/twhui/MSG-Net
Framework none

DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation

Title DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation
Authors Lex Fridman, Jack Terwilliger, Benedikt Jenik
Abstract We present a traffic simulation named DeepTraffic where the planning systems for a subset of the vehicles are handled by a neural network as part of a model-free, off-policy reinforcement learning process. The primary goal of DeepTraffic is to make the hands-on study of deep reinforcement learning accessible to thousands of students, educators, and researchers in order to inspire and fuel the exploration and evaluation of deep Q-learning network variants and hyperparameter configurations through large-scale, open competition. This paper investigates the crowd-sourced hyperparameter tuning of the policy network that resulted from the first iteration of the DeepTraffic competition where thousands of participants actively searched through the hyperparameter space.
Tasks Autonomous Driving, Autonomous Navigation, Q-Learning
Published 2018-01-09
URL http://arxiv.org/abs/1801.02805v2
PDF http://arxiv.org/pdf/1801.02805v2.pdf
PWC https://paperswithcode.com/paper/deeptraffic-crowdsourced-hyperparameter
Repo https://github.com/asarav/MIT-Deep-Traffic-Solution
Framework none
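
The model-free, off-policy learning mentioned in the abstract above rests on the Q-learning update rule. The sketch below uses a lookup table instead of a neural network to keep it short; the state and action names and the `alpha` and `gamma` defaults are hypothetical, standing in for the kind of hyperparameters competitors would tune.

```python
def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply one Q-learning step to the table `q` and return the new value.

    Off-policy: the target uses the best next action, regardless of
    which action the agent actually takes next.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    target = reward + gamma * best_next
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]


# Hypothetical lane-change actions for a single simulated car.
actions = ["keep", "left", "right"]
q_table = {}
first_value = q_update(q_table, "s0", "left", 1.0, "s1", actions)
```

A deep Q-network replaces the table with a function approximator, but the learning rate and discount factor play the same roles.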

Augmenting Neural Response Generation with Context-Aware Topical Attention

Title Augmenting Neural Response Generation with Context-Aware Topical Attention
Authors Nouha Dziri, Ehsan Kamalloo, Kory W. Mathewson, Osmar Zaiane
Abstract Sequence-to-Sequence (Seq2Seq) models have achieved notable success in generating natural conversational exchanges. Although the responses generated by these neural network models are syntactically well formed, they tend to be acontextual, short and generic. In this work, we introduce a Topical Hierarchical Recurrent Encoder Decoder (THRED), a novel, fully data-driven, multi-turn response generation system intended to produce contextual and topic-aware responses. Our model is built upon the basic Seq2Seq model by augmenting it with a hierarchical joint attention mechanism that incorporates topical concepts and previous interactions into the response generation. To train our model, we provide a clean and high-quality conversational dataset mined from Reddit comments. We evaluate THRED on two novel automated metrics, dubbed Semantic Similarity and Response Echo Index, as well as with human evaluation. Our experiments demonstrate that the proposed model is able to generate more diverse and contextually relevant responses compared to the strong baselines.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2018-11-02
URL https://arxiv.org/abs/1811.01063v2
PDF https://arxiv.org/pdf/1811.01063v2.pdf
PWC https://paperswithcode.com/paper/augmenting-neural-response-generation-with
Repo https://github.com/nouhadziri/THRED
Framework tf

A high-bias, low-variance introduction to Machine Learning for physicists

Title A high-bias, low-variance introduction to Machine Learning for physicists
Authors Pankaj Mehta, Marin Bukov, Ching-Hao Wang, Alexandre G. R. Day, Clint Richardson, Charles K. Fisher, David J. Schwab
Abstract Machine Learning (ML) is one of the most exciting and dynamic areas of modern research and application. The purpose of this review is to provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists. The review begins by covering fundamental concepts in ML and modern statistics such as the bias-variance tradeoff, overfitting, regularization, generalization, and gradient descent before moving on to more advanced topics in both supervised and unsupervised learning. Topics covered in the review include ensemble models, deep learning and neural networks, clustering and data visualization, energy-based models (including MaxEnt models and Restricted Boltzmann Machines), and variational methods. Throughout, we emphasize the many natural connections between ML and statistical physics. A notable aspect of the review is the use of Python Jupyter notebooks to introduce modern ML/statistical packages to readers using physics-inspired datasets (the Ising Model and Monte-Carlo simulations of supersymmetric decays of proton-proton collisions). We conclude with an extended outlook discussing possible uses of machine learning for furthering our understanding of the physical world as well as open problems in ML where physicists may be able to contribute. (Notebooks are available at https://physics.bu.edu/~pankajm/MLnotebooks.html )
Tasks
Published 2018-03-23
URL https://arxiv.org/abs/1803.08823v3
PDF https://arxiv.org/pdf/1803.08823v3.pdf
PWC https://paperswithcode.com/paper/a-high-bias-low-variance-introduction-to
Repo https://github.com/vmartinezalvarez/Machine-Learning-for-physicists
Framework pytorch

A Tutorial on Modeling and Inference in Undirected Graphical Models for Hyperspectral Image Analysis

Title A Tutorial on Modeling and Inference in Undirected Graphical Models for Hyperspectral Image Analysis
Authors Utsav B. Gewali, Sildomar T. Monteiro
Abstract Undirected graphical models have been successfully used to jointly model the spatial and the spectral dependencies in earth observing hyperspectral images. They produce less noisy, smooth, and spatially coherent land cover maps and give top accuracies on many datasets. Moreover, they can easily be combined with other state-of-the-art approaches, such as deep learning. This has made them an essential tool for remote sensing researchers and practitioners. However, graphical models have not been easily accessible to the larger remote sensing community as they are not discussed in standard remote sensing textbooks and not included in the popular remote sensing software and toolboxes. In this tutorial, we provide a theoretical introduction to Markov random fields and conditional random fields based spatial-spectral classification for land cover mapping along with a detailed step-by-step practical guide on applying these methods using freely available software. Furthermore, the discussed methods are benchmarked on four public hyperspectral datasets for a fair comparison among themselves and easy comparison with the vast number of methods in literature which use the same datasets. The source code necessary to reproduce all the results in the paper is published on-line to make it easier for the readers to apply these techniques to different remote sensing problems.
Tasks
Published 2018-01-25
URL http://arxiv.org/abs/1801.08268v1
PDF http://arxiv.org/pdf/1801.08268v1.pdf
PWC https://paperswithcode.com/paper/a-tutorial-on-modeling-and-inference-in
Repo https://github.com/UBGewali/tutorial-UGM-hyperspectral
Framework none

Densely Connected Pyramid Dehazing Network

Title Densely Connected Pyramid Dehazing Network
Authors He Zhang, Vishal M. Patel
Abstract We propose a new end-to-end single image dehazing method, called Densely Connected Pyramid Dehazing Network (DCPDN), which can jointly learn the transmission map, atmospheric light and dehazing all together. The end-to-end learning is achieved by directly embedding the atmospheric scattering model into the network, thereby ensuring that the proposed method strictly follows the physics-driven scattering model for dehazing. Inspired by the dense network that can maximize the information flow along features from different levels, we propose a new edge-preserving densely connected encoder-decoder structure with multi-level pyramid pooling module for estimating the transmission map. This network is optimized using a newly introduced edge-preserving loss function. To further incorporate the mutual structural information between the estimated transmission map and the dehazed result, we propose a joint-discriminator based on generative adversarial network framework to decide whether the corresponding dehazed image and the estimated transmission map are real or fake. An ablation study is conducted to demonstrate the effectiveness of each module evaluated at both estimated transmission map and dehazed result. Extensive experiments demonstrate that the proposed method achieves significant improvements over the state-of-the-art methods. Code will be made available at: https://github.com/hezhangsprinter
Tasks Image Dehazing, Single Image Dehazing
Published 2018-03-22
URL http://arxiv.org/abs/1803.08396v1
PDF http://arxiv.org/pdf/1803.08396v1.pdf
PWC https://paperswithcode.com/paper/densely-connected-pyramid-dehazing-network
Repo https://github.com/hezhangsprinter/DCPDN
Framework pytorch
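
The atmospheric scattering model that the abstract above refers to is commonly written as I = J * t + A * (1 - t), where I is the hazy image, J the clean scene, t the transmission map and A the atmospheric light. The sketch below applies and inverts that formula on a flat list of pixel intensities; the function names and the `t_min` clamp are assumptions for illustration and are not taken from the DCPDN code.

```python
def apply_haze(clean, transmission, airlight):
    """Forward model: I = J * t + A * (1 - t), pixel by pixel."""
    return [j * t + airlight * (1.0 - t)
            for j, t in zip(clean, transmission)]


def remove_haze(hazy, transmission, airlight, t_min=0.1):
    """Inverse model: J = (I - A * (1 - t)) / t.

    Assumption: t is clamped to t_min to avoid dividing by values near zero.
    """
    restored = []
    for i, t in zip(hazy, transmission):
        t = max(t, t_min)
        restored.append((i - airlight * (1.0 - t)) / t)
    return restored


clean_pixels = [0.2, 0.5, 0.9]
transmission_map = [0.8, 0.5, 0.3]
hazy_pixels = apply_haze(clean_pixels, transmission_map, airlight=1.0)
restored_pixels = remove_haze(hazy_pixels, transmission_map, airlight=1.0)
```

Embedding this inversion in the network means the dehazed output is fully determined by the estimated transmission map and atmospheric light, so errors in either estimate show up directly in the result.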

Image Generation from Scene Graphs

Title Image Generation from Scene Graphs
Authors Justin Johnson, Agrim Gupta, Li Fei-Fei
Abstract To truly understand the visual world our models should be able not only to recognize images but also generate them. To this end, there has been exciting recent progress on generating images from natural language descriptions. These methods give stunning results on limited domains such as descriptions of birds or flowers, but struggle to faithfully reproduce complex sentences with many objects and relationships. To overcome this limitation we propose a method for generating images from scene graphs, enabling explicit reasoning about objects and their relationships. Our model uses graph convolution to process input graphs, computes a scene layout by predicting bounding boxes and segmentation masks for objects, and converts the layout to an image with a cascaded refinement network. The network is trained adversarially against a pair of discriminators to ensure realistic outputs. We validate our approach on Visual Genome and COCO-Stuff, where qualitative results, ablations, and user studies demonstrate our method’s ability to generate complex images with multiple objects.
Tasks Image Generation, Layout-to-Image Generation
Published 2018-04-04
URL http://arxiv.org/abs/1804.01622v1
PDF http://arxiv.org/pdf/1804.01622v1.pdf
PWC https://paperswithcode.com/paper/image-generation-from-scene-graphs
Repo https://github.com/google/sg2im
Framework pytorch

Cycle-Dehaze: Enhanced CycleGAN for Single Image Dehazing

Title Cycle-Dehaze: Enhanced CycleGAN for Single Image Dehazing
Authors Deniz Engin, Anıl Genç, Hazım Kemal Ekenel
Abstract In this paper, we present an end-to-end network, called Cycle-Dehaze, for the single image dehazing problem, which does not require pairs of hazy and corresponding ground truth images for training. That is, we train the network by feeding clean and hazy images in an unpaired manner. Moreover, the proposed approach does not rely on estimation of the atmospheric scattering model parameters. Our method enhances the CycleGAN formulation by combining cycle-consistency and perceptual losses in order to improve the quality of textural information recovery and generate visually better haze-free images. Typically, deep learning models for dehazing take low resolution images as input and produce low resolution outputs. However, in the NTIRE 2018 challenge on single image dehazing, high resolution images were provided. Therefore, we apply bicubic downscaling to the inputs. After obtaining low-resolution outputs from the network, we utilize the Laplacian pyramid to upscale the output images to the original resolution. We conduct experiments on NYU-Depth, I-HAZE, and O-HAZE datasets. Extensive experiments demonstrate that the proposed approach improves on the CycleGAN method both quantitatively and qualitatively.
Tasks Image Dehazing, Single Image Dehazing
Published 2018-05-14
URL http://arxiv.org/abs/1805.05308v1
PDF http://arxiv.org/pdf/1805.05308v1.pdf
PWC https://paperswithcode.com/paper/cycle-dehaze-enhanced-cyclegan-for-single
Repo https://github.com/engindeniz/Cycle-Dehaze
Framework tf
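
Training on unpaired images, as the abstract above describes, is possible because a cycle-consistency loss only compares an image with its own round trip through both mappings. The sketch below shows the L1 form used in CycleGAN on flat pixel lists; the toy mappings `g` and `f` are hypothetical stand-ins for the two generator networks, and the perceptual term is left out.

```python
def l1(a, b):
    """Mean absolute difference between two equal-length pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(hazy, clean, g, f):
    """g maps hazy -> clean, f maps clean -> hazy.

    Each image is compared only with its own reconstruction, so the
    hazy and clean inputs do not need to show the same scene.
    """
    return l1(f(g(hazy)), hazy) + l1(g(f(clean)), clean)


# Hypothetical generators: "dehazing" darkens, "hazing" brightens.
def g(img):
    return [p - 0.1 for p in img]


def f(img):
    return [p + 0.1 for p in img]


hazy_image = [0.5, 0.6]
clean_image = [0.3, 0.4]  # unrelated to hazy_image
loss = cycle_consistency_loss(hazy_image, clean_image, g, f)
```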

Analyzing Perception-Distortion Tradeoff using Enhanced Perceptual Super-resolution Network

Title Analyzing Perception-Distortion Tradeoff using Enhanced Perceptual Super-resolution Network
Authors Subeesh Vasu, Nimisha Thekke Madam, Rajagopalan A. N
Abstract Convolutional neural network (CNN) based methods have recently achieved great success for image super-resolution (SR). However, most deep CNN based SR models attempt to improve distortion measures (e.g. PSNR, SSIM, IFC, VIF) while resulting in poor quantified perceptual quality (e.g. human opinion score, no-reference quality measures such as NIQE). Few works have attempted to improve the perceptual quality at the cost of performance reduction in distortion measures. A very recent study has revealed that distortion and perceptual quality are at odds with each other and there is always a trade-off between the two. Often the restoration algorithms that are superior in terms of perceptual quality are inferior in terms of distortion measures. Our work attempts to analyze the trade-off between distortion and perceptual quality for the problem of single image SR. To this end, we use the well-known SR architecture, the enhanced deep super-resolution (EDSR) network, and show that it can be adapted to achieve better perceptual quality for a specific range of the distortion measure. While the original network of EDSR was trained to minimize the error defined based on per-pixel accuracy alone, we train our network using a generative adversarial network framework with EDSR as the generator module. Our proposed network, called enhanced perceptual super-resolution network (EPSR), is trained with a combination of mean squared error loss, perceptual loss, and adversarial loss. Our experiments reveal that EPSR achieves the state-of-the-art trade-off between distortion and perceptual quality while the existing methods perform well in either of these measures alone.
Tasks Image Super-Resolution, Super-Resolution
Published 2018-11-01
URL http://arxiv.org/abs/1811.00344v2
PDF http://arxiv.org/pdf/1811.00344v2.pdf
PWC https://paperswithcode.com/paper/analyzing-perception-distortion-tradeoff
Repo https://github.com/subeeshvasu/2018_subeesh_epsr_eccvw
Framework pytorch
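
The three-part training objective named in the abstract above can be sketched as a weighted sum. Everything specific here is an assumption for illustration: the weights, the use of mean squared error on feature vectors as the perceptual term, and the non-saturating form -log D(G(x)) for the adversarial term are common choices and not confirmed details of EPSR.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def generator_loss(sr, hr, sr_features, hr_features, disc_score_on_sr,
                   w_mse=1.0, w_perceptual=1.0, w_adversarial=1.0):
    """Combine pixel, perceptual and adversarial terms.

    disc_score_on_sr is the discriminator's probability, in (0, 1],
    that the super-resolved image is real.
    """
    pixel = mse(sr, hr)                          # distortion term
    perceptual = mse(sr_features, hr_features)   # feature-space term
    adversarial = -math.log(disc_score_on_sr)    # fool-the-discriminator term
    return (w_mse * pixel
            + w_perceptual * perceptual
            + w_adversarial * adversarial)
```

Shifting weight from `w_mse` towards the other two terms is one way to move along the perception-distortion trade-off that the paper analyzes.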

End-to-End Multi-Task Learning with Attention

Title End-to-End Multi-Task Learning with Attention
Authors Shikun Liu, Edward Johns, Andrew J. Davison
Abstract We propose a novel multi-task learning architecture, which allows learning of task-specific feature-level attention. Our design, the Multi-Task Attention Network (MTAN), consists of a single shared network containing a global feature pool, together with a soft-attention module for each task. These modules allow for learning of task-specific features from the global features, whilst simultaneously allowing for features to be shared across different tasks. The architecture can be trained end-to-end and can be built upon any feed-forward neural network, is simple to implement, and is parameter efficient. We evaluate our approach on a variety of datasets, across both image-to-image predictions and image classification tasks. We show that our architecture is state-of-the-art in multi-task learning compared to existing methods, and is also less sensitive to various weighting schemes in the multi-task loss function. Code is available at https://github.com/lorenmt/mtan.
Tasks Multi-Task Learning
Published 2018-03-28
URL http://arxiv.org/abs/1803.10704v2
PDF http://arxiv.org/pdf/1803.10704v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-multi-task-learning-with-attention
Repo https://github.com/lorenmt/mtan
Framework pytorch
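
The idea of task-specific soft attention over a shared feature pool, as described in the abstract above, can be illustrated with element-wise masks. In this sketch the mask logits are given directly, whereas MTAN learns them with attention modules; the task names and all values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def task_features(shared, mask_logits):
    """Scale each shared feature by a soft mask value in (0, 1)."""
    return [sigmoid(m) * s for s, m in zip(shared, mask_logits)]


# One shared feature vector, reused by every task.
shared = [2.0, -1.0, 4.0]

# Hypothetical per-task logits: large positive keeps a feature,
# large negative suppresses it, zero passes half of it through.
task_logits = {
    "segmentation": [0.0, 10.0, -10.0],
    "depth": [10.0, 0.0, 0.0],
}
per_task = {name: task_features(shared, logits)
            for name, logits in task_logits.items()}
```

Because only the masks differ between tasks, the shared features are computed once, which is consistent with the parameter efficiency the abstract claims.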

Conditional molecular design with deep generative models

Title Conditional molecular design with deep generative models
Authors Seokho Kang, Kyunghyun Cho
Abstract Although machine learning has been successfully used to propose novel molecules that satisfy desired properties, it is still challenging to explore a large chemical space efficiently. In this paper, we present a conditional molecular design method that facilitates generating new molecules with desired properties. The proposed model, which simultaneously performs both property prediction and molecule generation, is built as a semi-supervised variational autoencoder trained on a set of existing molecules with only a partial annotation. We generate new molecules with desired properties by sampling from the generative distribution estimated by the model. We demonstrate the effectiveness of the proposed model by evaluating it on drug-like molecules. The model improves the performance of property prediction by exploiting unlabeled molecules, and efficiently generates novel molecules fulfilling various target conditions.
Tasks
Published 2018-04-30
URL http://arxiv.org/abs/1805.00108v3
PDF http://arxiv.org/pdf/1805.00108v3.pdf
PWC https://paperswithcode.com/paper/conditional-molecular-design-with-deep
Repo https://github.com/gcolmenarejo/cmd
Framework tf