Paper Group AWR 239
Super-resolution of Sentinel-2 images: Learning a globally applicable deep neural network. Photo-Realistic Blocksworld Dataset. Where are the Blobs: Counting by Localization with Point Supervision. Improving Object Counting with Heatmap Regulation. Perceptual deep depth super-resolution. DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation. Augmenting Neural Response Generation with Context-Aware Topical Attention. A high-bias, low-variance introduction to Machine Learning for physicists. A Tutorial on Modeling and Inference in Undirected Graphical Models for Hyperspectral Image Analysis. Densely Connected Pyramid Dehazing Network. Image Generation from Scene Graphs. Cycle-Dehaze: Enhanced CycleGAN for Single Image Dehazing. Analyzing Perception-Distortion Tradeoff using Enhanced Perceptual Super-resolution Network. End-to-End Multi-Task Learning with Attention. Conditional molecular design with deep generative models.
Super-resolution of Sentinel-2 images: Learning a globally applicable deep neural network
Title | Super-resolution of Sentinel-2 images: Learning a globally applicable deep neural network |
Authors | Charis Lanaras, José Bioucas-Dias, Silvano Galliani, Emmanuel Baltsavias, Konrad Schindler |
Abstract | The Sentinel-2 satellite mission delivers multi-spectral imagery with 13 spectral bands, acquired at three different spatial resolutions. The aim of this research is to super-resolve the lower-resolution (20 m and 60 m Ground Sampling Distance - GSD) bands to 10 m GSD, so as to obtain a complete data cube at the maximal sensor resolution. We employ a state-of-the-art convolutional neural network (CNN) to perform end-to-end upsampling, which is trained with data at lower resolution, i.e., from 40 m to 20 m and from 360 m to 60 m GSD, respectively. In this way, one has access to a virtually infinite amount of training data, by downsampling real Sentinel-2 images. We use data sampled globally over a wide range of geographical locations, to obtain a network that generalises across different climate zones and land-cover types, and can super-resolve arbitrary Sentinel-2 images without the need for retraining. In quantitative evaluations (at lower scale, where ground truth is available), our network, which we call DSen2, outperforms the best competing approach by almost 50% in RMSE, while better preserving the spectral characteristics. It also delivers visually convincing results at the full 10 m GSD. The code is available at https://github.com/lanha/DSen2 |
Tasks | Super-Resolution |
Published | 2018-03-12 |
URL | http://arxiv.org/abs/1803.04271v2 |
PDF | http://arxiv.org/pdf/1803.04271v2.pdf |
PWC | https://paperswithcode.com/paper/super-resolution-of-sentinel-2-images |
Repo | https://github.com/deephdc/satsr |
Framework | tf |
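The reduced-scale training trick described in the abstract is easy to reproduce: ground truth comes for free by downsampling real bands. A minimal numpy sketch, assuming a simple block-average downsampler (the released DSen2 code uses its own, more careful resampling pipeline):

```python
import numpy as np

def block_average(band, factor):
    """Downsample a 2-D band by averaging non-overlapping factor x factor blocks."""
    h, w = band.shape
    h, w = h - h % factor, w - w % factor
    return band[:h, :w].reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# Reduced-scale training pair: the original 20 m band is the target and its
# 2x-downsampled (40 m) version is the network input, so supervision is free.
band_20m = np.random.rand(256, 256).astype(np.float32)  # stand-in for a real 20 m band
x_40m = block_average(band_20m, 2)   # network input, simulating 40 m GSD
y_20m = band_20m                     # supervision target at the true 20 m GSD
```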
Photo-Realistic Blocksworld Dataset
Title | Photo-Realistic Blocksworld Dataset |
Authors | Masataro Asai |
Abstract | In this report, we introduce an artificial dataset generator for the Photo-realistic Blocksworld domain. Blocksworld is one of the oldest high-level task planning domains; it is well defined yet contains sufficient complexity, e.g., conflicting subgoals and decomposability into subproblems. We aim to make this dataset a benchmark for neural-symbolic integrated systems and to accelerate research in this area. The key advantage of such systems is the ability to obtain a symbolic model from real-world input and to run a fast, systematic, complete algorithm for symbolic reasoning, without any supervision or reward signal from the environment. |
Tasks | |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.01818v1 |
PDF | http://arxiv.org/pdf/1812.01818v1.pdf |
PWC | https://paperswithcode.com/paper/photo-realistic-blocksworld-dataset |
Repo | https://github.com/ibm/photorealistic-blocksworld |
Framework | none |
Where are the Blobs: Counting by Localization with Point Supervision
Title | Where are the Blobs: Counting by Localization with Point Supervision |
Authors | Issam H. Laradji, Negar Rostamzadeh, Pedro O. Pinheiro, David Vazquez, Mark Schmidt |
Abstract | Object counting is an important task in computer vision due to its growing demand in applications such as surveillance, traffic monitoring, and counting everyday objects. State-of-the-art methods use regression-based optimization where they explicitly learn to count the objects of interest. These often perform better than detection-based methods that need to learn the more difficult task of predicting the location, size, and shape of each object. However, we propose a detection-based method that does not need to estimate the size and shape of the objects and that outperforms regression-based methods. Our contributions are three-fold: (1) we propose a novel loss function that encourages the network to output a single blob per object instance using point-level annotations only; (2) we design two methods for splitting large predicted blobs between object instances; and (3) we show that our method achieves new state-of-the-art results on several challenging datasets including the Pascal VOC and the Penguins dataset. Our method even outperforms those that use stronger supervision such as depth features, multi-point annotations, and bounding-box labels. |
Tasks | Object Counting |
Published | 2018-07-25 |
URL | http://arxiv.org/abs/1807.09856v1 |
PDF | http://arxiv.org/pdf/1807.09856v1.pdf |
PWC | https://paperswithcode.com/paper/where-are-the-blobs-counting-by-localization |
Repo | https://github.com/ElementAI/LCFCN |
Framework | pytorch |
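The full LCFCN objective combines four terms (image-level, point-level, split-level, and false-positive); the PyTorch sketch below implements only the first two to show the flavor of point supervision. The shapes and the -1 background convention are assumptions of this sketch, not the repository's API:

```python
import torch
import torch.nn.functional as F

def point_supervision_loss(logits, points):
    """Two of LCFCN's four loss terms. logits: (1, C, H, W) per-pixel class
    scores; points: (H, W) tensor holding a class id at each annotated pixel
    and -1 everywhere else."""
    probs = F.softmax(logits, dim=1)
    loss = logits.new_zeros(())
    # Image-level: every annotated class must appear somewhere in the image.
    for c in points[points >= 0].unique():
        loss = loss - torch.log(probs[0, int(c)].max())
    # Point-level: the annotated pixels themselves must be classified correctly.
    ys, xs = torch.nonzero(points >= 0, as_tuple=True)
    if len(ys) > 0:
        loss = loss + F.cross_entropy(logits[0, :, ys, xs].t(), points[ys, xs].long())
    return loss
```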
Improving Object Counting with Heatmap Regulation
Title | Improving Object Counting with Heatmap Regulation |
Authors | Shubhra Aich, Ian Stavness |
Abstract | In this paper, we propose a simple and effective way to improve one-look regression models for object counting from images. We use class activation map visualizations to illustrate the drawbacks of learning a pure one-look regression model for a counting task. Based on these insights, we enhance one-look regression counting models by regulating activation maps from the final convolution layer of the network with coarse ground-truth activation maps generated from simple dot annotations. We call this strategy heatmap regulation (HR). We show that this simple enhancement effectively suppresses false detections generated by the corresponding one-look baseline model and also improves the performance in terms of false negatives. Evaluations are performed on four different counting datasets — two for car counting (CARPK, PUCPR+), one for crowd counting (WorldExpo) and another for biological cell counting (VGG-Cells). Adding HR to a simple VGG front-end improves performance on all these benchmarks compared to a simple one-look baseline model and results in state-of-the-art performance for car counting. |
Tasks | Crowd Counting, Object Counting |
Published | 2018-03-14 |
URL | http://arxiv.org/abs/1803.05494v2 |
PDF | http://arxiv.org/pdf/1803.05494v2.pdf |
PWC | https://paperswithcode.com/paper/improving-object-counting-with-heatmap |
Repo | https://github.com/littleaich/heatmap-regulation |
Framework | pytorch |
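A hedged sketch of the heatmap-regulation idea: the usual count regression loss is augmented with an L1 term that pulls the final-layer activation map toward a coarse heatmap derived from dot annotations. The average-pool smoothing and the weight alpha below are illustrative stand-ins for the paper's exact choices:

```python
import torch
import torch.nn.functional as F

def heatmap_regulation_loss(count_pred, count_gt, activation_map, dot_map, alpha=1.0):
    """Count regression plus an L1 pull of the final conv layer's activation
    map toward a coarse ground-truth heatmap built from dot annotations.
    dot_map: (B, 1, H, W) with ones at annotated object centers."""
    heatmap_gt = F.avg_pool2d(dot_map, kernel_size=9, stride=1, padding=4)  # crude smoothing
    return F.l1_loss(count_pred, count_gt) + alpha * F.l1_loss(activation_map, heatmap_gt)
```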
Perceptual deep depth super-resolution
Title | Perceptual deep depth super-resolution |
Authors | Oleg Voynov, Alexey Artemov, Vage Egiazarian, Alexander Notchenko, Gleb Bobrovskikh, Denis Zorin, Evgeny Burnaev |
Abstract | RGBD images, combining high-resolution color and lower-resolution depth from various types of depth sensors, are increasingly common. One can significantly improve the resolution of depth maps by taking advantage of color information; deep learning methods make combining color and depth information particularly easy. However, fusing these two sources of data may lead to a variety of artifacts. If depth maps are used to reconstruct 3D shapes, e.g., for virtual reality applications, the visual quality of upsampled images is particularly important. The main idea of our approach is to measure the quality of depth map upsampling using renderings of resulting 3D surfaces. We demonstrate that a simple visual appearance-based loss, when used with either a trained CNN or simply a deep prior, yields significantly improved 3D shapes, as measured by a number of existing perceptual metrics. We compare this approach with a number of existing optimization and learning-based techniques. |
Tasks | Super-Resolution |
Published | 2018-12-24 |
URL | https://arxiv.org/abs/1812.09874v3 |
PDF | https://arxiv.org/pdf/1812.09874v3.pdf |
PWC | https://paperswithcode.com/paper/perceptually-based-single-image-depth-super |
Repo | https://github.com/twhui/MSG-Net |
Framework | none |
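The paper's key move is to compare renderings of the reconstructed surfaces rather than raw depth values. The sketch below is a much cruder stand-in, assuming finite-difference normals and a single fixed directional light instead of a full renderer, but it shows the shape of an appearance-based loss:

```python
import torch
import torch.nn.functional as F

def depth_to_shading(depth, light=(0.3, 0.3, 0.9)):
    """Finite-difference normals from a (B, 1, H, W) depth map, shaded with a
    fixed directional (Lambertian) light; a crude differentiable 'rendering'."""
    dzdx = F.pad(depth[..., :, 1:] - depth[..., :, :-1], (0, 1, 0, 0))
    dzdy = F.pad(depth[..., 1:, :] - depth[..., :-1, :], (0, 0, 0, 1))
    n = torch.cat([-dzdx, -dzdy, torch.ones_like(depth)], dim=1)  # (B, 3, H, W)
    n = n / n.norm(dim=1, keepdim=True)
    l = torch.tensor(light)
    l = l / l.norm()
    return (n * l.view(1, 3, 1, 1)).sum(dim=1)  # per-pixel shading, (B, H, W)

def appearance_loss(depth_pred, depth_gt):
    """Compare shaded appearances instead of depth values directly."""
    return F.l1_loss(depth_to_shading(depth_pred), depth_to_shading(depth_gt))
```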
DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation
Title | DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation |
Authors | Lex Fridman, Jack Terwilliger, Benedikt Jenik |
Abstract | We present a traffic simulation named DeepTraffic where the planning systems for a subset of the vehicles are handled by a neural network as part of a model-free, off-policy reinforcement learning process. The primary goal of DeepTraffic is to make the hands-on study of deep reinforcement learning accessible to thousands of students, educators, and researchers in order to inspire and fuel the exploration and evaluation of deep Q-learning network variants and hyperparameter configurations through large-scale, open competition. This paper investigates the crowd-sourced hyperparameter tuning of the policy network that resulted from the first iteration of the DeepTraffic competition where thousands of participants actively searched through the hyperparameter space. |
Tasks | Autonomous Driving, Autonomous Navigation, Q-Learning |
Published | 2018-01-09 |
URL | http://arxiv.org/abs/1801.02805v2 |
PDF | http://arxiv.org/pdf/1801.02805v2.pdf |
PWC | https://paperswithcode.com/paper/deeptraffic-crowdsourced-hyperparameter |
Repo | https://github.com/asarav/MIT-Deep-Traffic-Solution |
Framework | none |
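Plain random search over a hypothetical hyperparameter space gives a feel for what the crowd was doing by hand; the variable names and ranges below are illustrative, not the simulator's actual tunables:

```python
import random

# Hypothetical search space in the spirit of the competition's knobs
# (perception-grid extent, network size, exploration schedule).
SPACE = {
    "lanes_side": [1, 2, 3],        # lanes the agent perceives to each side
    "patches_ahead": [10, 20, 40],  # lookahead of the perception grid
    "num_hidden_layers": [1, 2, 3],
    "hidden_units": [16, 32, 64, 128],
    "gamma": [0.9, 0.95, 0.99],
    "epsilon_decay": [0.99, 0.995, 0.999],
}

def evaluate(cfg):
    """Stand-in for training a DQN with cfg and returning average speed."""
    return random.random()  # replace with an actual simulation rollout

def random_search(n=100):
    """Crowdsourcing replaces this loop with thousands of human tuners."""
    configs = [{k: random.choice(v) for k, v in SPACE.items()} for _ in range(n)]
    return max(configs, key=evaluate)

print(random_search())
```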
Augmenting Neural Response Generation with Context-Aware Topical Attention
Title | Augmenting Neural Response Generation with Context-Aware Topical Attention |
Authors | Nouha Dziri, Ehsan Kamalloo, Kory W. Mathewson, Osmar Zaiane |
Abstract | Sequence-to-Sequence (Seq2Seq) models have witnessed a notable success in generating natural conversational exchanges. Notwithstanding the syntactically well-formed responses generated by these neural network models, they are prone to be acontextual, short and generic. In this work, we introduce a Topical Hierarchical Recurrent Encoder Decoder (THRED), a novel, fully data-driven, multi-turn response generation system intended to produce contextual and topic-aware responses. Our model is built upon the basic Seq2Seq model by augmenting it with a hierarchical joint attention mechanism that incorporates topical concepts and previous interactions into the response generation. To train our model, we provide a clean and high-quality conversational dataset mined from Reddit comments. We evaluate THRED on two novel automated metrics, dubbed Semantic Similarity and Response Echo Index, as well as with human evaluation. Our experiments demonstrate that the proposed model is able to generate more diverse and contextually relevant responses compared to the strong baselines. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2018-11-02 |
URL | https://arxiv.org/abs/1811.01063v2 |
PDF | https://arxiv.org/pdf/1811.01063v2.pdf |
PWC | https://paperswithcode.com/paper/augmenting-neural-response-generation-with |
Repo | https://github.com/nouhadziri/THRED |
Framework | tf |
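A sketch of one joint-attention step under simplifying assumptions (single dot-product attention, pre-projected keys): the decoder state attends separately over utterance-context states and topic-word embeddings, and the two summaries are concatenated. The real THRED model (https://github.com/nouhadziri/THRED) is a full hierarchical Seq2Seq system:

```python
import torch
import torch.nn.functional as F

def joint_attention(dec_state, ctx_keys, topic_keys):
    """One joint-attention step. dec_state: (B, D); ctx_keys: (B, Tc, D)
    hierarchical context states; topic_keys: (B, Tk, D) topic-word embeddings."""
    a_ctx = F.softmax(torch.bmm(ctx_keys, dec_state.unsqueeze(2)).squeeze(2), dim=1)
    a_top = F.softmax(torch.bmm(topic_keys, dec_state.unsqueeze(2)).squeeze(2), dim=1)
    ctx_vec = torch.bmm(a_ctx.unsqueeze(1), ctx_keys).squeeze(1)    # context summary
    top_vec = torch.bmm(a_top.unsqueeze(1), topic_keys).squeeze(1)  # topic summary
    return torch.cat([ctx_vec, top_vec], dim=1)  # (B, 2D), fed to the decoder
```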
A high-bias, low-variance introduction to Machine Learning for physicists
Title | A high-bias, low-variance introduction to Machine Learning for physicists |
Authors | Pankaj Mehta, Marin Bukov, Ching-Hao Wang, Alexandre G. R. Day, Clint Richardson, Charles K. Fisher, David J. Schwab |
Abstract | Machine Learning (ML) is one of the most exciting and dynamic areas of modern research and application. The purpose of this review is to provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists. The review begins by covering fundamental concepts in ML and modern statistics such as the bias-variance tradeoff, overfitting, regularization, generalization, and gradient descent before moving on to more advanced topics in both supervised and unsupervised learning. Topics covered in the review include ensemble models, deep learning and neural networks, clustering and data visualization, energy-based models (including MaxEnt models and Restricted Boltzmann Machines), and variational methods. Throughout, we emphasize the many natural connections between ML and statistical physics. A notable aspect of the review is the use of Python Jupyter notebooks to introduce modern ML/statistical packages to readers using physics-inspired datasets (the Ising Model and Monte-Carlo simulations of supersymmetric decays of proton-proton collisions). We conclude with an extended outlook discussing possible uses of machine learning for furthering our understanding of the physical world as well as open problems in ML where physicists may be able to contribute. (Notebooks are available at https://physics.bu.edu/~pankajm/MLnotebooks.html ) |
Tasks | |
Published | 2018-03-23 |
URL | https://arxiv.org/abs/1803.08823v3 |
PDF | https://arxiv.org/pdf/1803.08823v3.pdf |
PWC | https://paperswithcode.com/paper/a-high-bias-low-variance-introduction-to |
Repo | https://github.com/vmartinezalvarez/Machine-Learning-for-physicists |
Framework | pytorch |
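For readers new to the review's opening topic, the bias-variance tradeoff fits in a few lines of numpy: under- and over-parameterized polynomial fits to noisy samples of a smooth function, scored on fresh data. The degrees and noise level here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(-1, 1, 20))
y_train = np.sin(np.pi * x_train) + 0.2 * rng.normal(size=20)
x_test = np.linspace(-1, 1, 200)
y_test = np.sin(np.pi * x_test)

# Degree 1 underfits (high bias); degree 15 fits the noise (high variance).
for degree in (1, 3, 15):
    coeffs = np.polyfit(x_train, y_train, degree)
    mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: test MSE = {mse:.3f}")
```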
A Tutorial on Modeling and Inference in Undirected Graphical Models for Hyperspectral Image Analysis
Title | A Tutorial on Modeling and Inference in Undirected Graphical Models for Hyperspectral Image Analysis |
Authors | Utsav B. Gewali, Sildomar T. Monteiro |
Abstract | Undirected graphical models have been successfully used to jointly model the spatial and the spectral dependencies in earth observing hyperspectral images. They produce less noisy, smooth, and spatially coherent land cover maps and give top accuracies on many datasets. Moreover, they can easily be combined with other state-of-the-art approaches, such as deep learning. This has made them an essential tool for remote sensing researchers and practitioners. However, graphical models have not been easily accessible to the larger remote sensing community as they are not discussed in standard remote sensing textbooks and not included in the popular remote sensing software and toolboxes. In this tutorial, we provide a theoretical introduction to Markov random field and conditional random field based spatial-spectral classification for land cover mapping, along with a detailed step-by-step practical guide on applying these methods using freely available software. Furthermore, the discussed methods are benchmarked on four public hyperspectral datasets for a fair comparison among themselves and easy comparison with the vast number of methods in the literature that use the same datasets. The source code necessary to reproduce all the results in the paper is published online to make it easier for the readers to apply these techniques to different remote sensing problems. |
Tasks | |
Published | 2018-01-25 |
URL | http://arxiv.org/abs/1801.08268v1 |
PDF | http://arxiv.org/pdf/1801.08268v1.pdf |
PWC | https://paperswithcode.com/paper/a-tutorial-on-modeling-and-inference-in |
Repo | https://github.com/UBGewali/tutorial-UGM-hyperspectral |
Framework | none |
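As a taste of the MRF machinery the tutorial walks through, here is iterated conditional modes (ICM) with a Potts smoothness prior over a per-pixel classification; it is written for clarity, not speed, and is not the tutorial's own code:

```python
import numpy as np

def icm_potts(unary, beta=1.0, n_iters=5):
    """Iterated conditional modes with a Potts smoothness prior.
    unary: (H, W, C) per-pixel class costs, e.g. negative log-probabilities
    from any spectral classifier."""
    h, w, c = unary.shape
    labels = unary.argmin(axis=2)  # start from the pixel-wise classification
    for _ in range(n_iters):
        for i in range(h):
            for j in range(w):
                nbrs = [labels[ii, jj]
                        for ii, jj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                        if 0 <= ii < h and 0 <= jj < w]
                cost = unary[i, j].copy()
                for k in range(c):
                    cost[k] += beta * sum(n != k for n in nbrs)  # Potts disagreement penalty
                labels[i, j] = cost.argmin()
    return labels

# Example: smooth a noisy 3-class map from random unary costs.
smooth_map = icm_potts(np.random.rand(32, 32, 3), beta=0.8)
```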
Densely Connected Pyramid Dehazing Network
Title | Densely Connected Pyramid Dehazing Network |
Authors | He Zhang, Vishal M. Patel |
Abstract | We propose a new end-to-end single image dehazing method, called Densely Connected Pyramid Dehazing Network (DCPDN), which can jointly learn the transmission map, atmospheric light and dehazing all together. The end-to-end learning is achieved by directly embedding the atmospheric scattering model into the network, thereby ensuring that the proposed method strictly follows the physics-driven scattering model for dehazing. Inspired by the dense network that can maximize the information flow along features from different levels, we propose a new edge-preserving densely connected encoder-decoder structure with multi-level pyramid pooling module for estimating the transmission map. This network is optimized using a newly introduced edge-preserving loss function. To further incorporate the mutual structural information between the estimated transmission map and the dehazed result, we propose a joint-discriminator based on generative adversarial network framework to decide whether the corresponding dehazed image and the estimated transmission map are real or fake. An ablation study is conducted to demonstrate the effectiveness of each module evaluated at both estimated transmission map and dehazed result. Extensive experiments demonstrate that the proposed method achieves significant improvements over the state-of-the-art methods. Code will be made available at: https://github.com/hezhangsprinter |
Tasks | Image Dehazing, Single Image Dehazing |
Published | 2018-03-22 |
URL | http://arxiv.org/abs/1803.08396v1 |
PDF | http://arxiv.org/pdf/1803.08396v1.pdf |
PWC | https://paperswithcode.com/paper/densely-connected-pyramid-dehazing-network |
Repo | https://github.com/hezhangsprinter/DCPDN |
Framework | pytorch |
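The physics the network embeds is the atmospheric scattering model I = J*t + A*(1 - t); once the transmission map t and atmospheric light A are estimated, the dehazed image follows in closed form. A minimal PyTorch sketch of that inversion (the clamp floor is an illustrative safeguard, not the paper's value):

```python
import torch

def dehaze(hazy, transmission, airlight, t_min=0.05):
    """Closed-form inversion of the scattering model I = J * t + A * (1 - t),
    given the network's estimates of t and A: J = (I - A * (1 - t)) / t."""
    t = transmission.clamp(min=t_min)  # floor t to avoid amplifying noise in dense haze
    return (hazy - airlight * (1.0 - t)) / t

# hazy: (B, 3, H, W); transmission: (B, 1, H, W) in (0, 1]; airlight: (B, 3, 1, 1)
J = dehaze(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64), torch.rand(1, 3, 1, 1))
```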
Image Generation from Scene Graphs
Title | Image Generation from Scene Graphs |
Authors | Justin Johnson, Agrim Gupta, Li Fei-Fei |
Abstract | To truly understand the visual world our models should be able not only to recognize images but also generate them. To this end, there has been exciting recent progress on generating images from natural language descriptions. These methods give stunning results on limited domains such as descriptions of birds or flowers, but struggle to faithfully reproduce complex sentences with many objects and relationships. To overcome this limitation we propose a method for generating images from scene graphs, enabling explicitly reasoning about objects and their relationships. Our model uses graph convolution to process input graphs, computes a scene layout by predicting bounding boxes and segmentation masks for objects, and converts the layout to an image with a cascaded refinement network. The network is trained adversarially against a pair of discriminators to ensure realistic outputs. We validate our approach on Visual Genome and COCO-Stuff, where qualitative results, ablations, and user studies demonstrate our method’s ability to generate complex images with multiple objects. |
Tasks | Image Generation, Layout-to-Image Generation |
Published | 2018-04-04 |
URL | http://arxiv.org/abs/1804.01622v1 |
PDF | http://arxiv.org/pdf/1804.01622v1.pdf |
PWC | https://paperswithcode.com/paper/image-generation-from-scene-graphs |
Repo | https://github.com/google/sg2im |
Framework | pytorch |
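A sketch of the graph-convolution step in the spirit of the paper: each (subject, predicate, object) triple passes through a shared MLP, and an object's update averages its candidate vectors over all triples it appears in. The layer sizes and pooling details are assumptions of this sketch, not the released sg2im code:

```python
import torch
import torch.nn as nn

class TripleGraphConv(nn.Module):
    """One scene-graph convolution layer over (subject, predicate, object) triples."""
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3 * dim, 3 * dim), nn.ReLU())

    def forward(self, obj_vecs, pred_vecs, edges):
        # obj_vecs: (N, D); pred_vecs: (E, D); edges: (E, 2) of (subj, obj) indices
        s, o = edges[:, 0], edges[:, 1]
        h_s, h_p, h_o = self.mlp(
            torch.cat([obj_vecs[s], pred_vecs, obj_vecs[o]], dim=1)).chunk(3, dim=1)
        # Average each object's candidate vectors across all triples it joins.
        pooled = torch.zeros_like(obj_vecs).index_add(0, s, h_s).index_add(0, o, h_o)
        ones = torch.ones(len(s), 1)
        counts = torch.zeros(obj_vecs.size(0), 1).index_add(0, s, ones).index_add(0, o, ones)
        return pooled / counts.clamp(min=1), h_p  # updated object and predicate vectors
```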
Cycle-Dehaze: Enhanced CycleGAN for Single Image Dehazing
Title | Cycle-Dehaze: Enhanced CycleGAN for Single Image Dehazing |
Authors | Deniz Engin, Anıl Genç, Hazım Kemal Ekenel |
Abstract | In this paper, we present an end-to-end network, called Cycle-Dehaze, for single image dehazing problem, which does not require pairs of hazy and corresponding ground truth images for training. That is, we train the network by feeding clean and hazy images in an unpaired manner. Moreover, the proposed approach does not rely on estimation of the atmospheric scattering model parameters. Our method enhances CycleGAN formulation by combining cycle-consistency and perceptual losses in order to improve the quality of textural information recovery and generate visually better haze-free images. Typically, deep learning models for dehazing take low resolution images as input and produce low resolution outputs. However, in the NTIRE 2018 challenge on single image dehazing, high resolution images were provided. Therefore, we apply bicubic downscaling. After obtaining low-resolution outputs from the network, we utilize the Laplacian pyramid to upscale the output images to the original resolution. We conduct experiments on NYU-Depth, I-HAZE, and O-HAZE datasets. Extensive experiments demonstrate that the proposed approach improves CycleGAN method both quantitatively and qualitatively. |
Tasks | Image Dehazing, Single Image Dehazing |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05308v1 |
PDF | http://arxiv.org/pdf/1805.05308v1.pdf |
PWC | https://paperswithcode.com/paper/cycle-dehaze-enhanced-cyclegan-for-single |
Repo | https://github.com/engindeniz/Cycle-Dehaze |
Framework | tf |
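The Laplacian-pyramid upscaling step can be sketched with OpenCV: detail layers are taken from the full-resolution hazy input and re-added while upsampling the low-resolution dehazed output. This assumes the network output matches the coarsest pyramid level; the paper's exact procedure may differ in detail:

```python
import cv2
import numpy as np

def laplacian_upscale(low_res_dehazed, hazy_full_res, levels=2):
    """Upscale a low-resolution dehazed output to full resolution, re-injecting
    high-frequency detail from a Laplacian pyramid of the hazy input."""
    g = hazy_full_res.astype(np.float32)
    details = []
    for _ in range(levels):  # build the detail (Laplacian) layers, fine to coarse
        down = cv2.pyrDown(g)
        details.append(g - cv2.pyrUp(down, dstsize=(g.shape[1], g.shape[0])))
        g = down
    out = low_res_dehazed.astype(np.float32)  # must match the coarsest level's size
    for detail in reversed(details):          # add detail back, coarse to fine
        out = cv2.pyrUp(out, dstsize=(detail.shape[1], detail.shape[0])) + detail
    return np.clip(out, 0, 255).astype(np.uint8)
```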
Analyzing Perception-Distortion Tradeoff using Enhanced Perceptual Super-resolution Network
Title | Analyzing Perception-Distortion Tradeoff using Enhanced Perceptual Super-resolution Network |
Authors | Subeesh Vasu, Nimisha Thekke Madam, A. N. Rajagopalan |
Abstract | Convolutional neural network (CNN) based methods have recently achieved great success for image super-resolution (SR). However, most deep CNN based SR models attempt to improve distortion measures (e.g. PSNR, SSIM, IFC, VIF) while resulting in poor quantified perceptual quality (e.g. human opinion score, no-reference quality measures such as NIQE). Few works have attempted to improve the perceptual quality at the cost of performance reduction in distortion measures. A very recent study has revealed that distortion and perceptual quality are at odds with each other and there is always a trade-off between the two. Often the restoration algorithms that are superior in terms of perceptual quality, are inferior in terms of distortion measures. Our work attempts to analyze the trade-off between distortion and perceptual quality for the problem of single image SR. To this end, we use the well-known SR architecture, the enhanced deep super-resolution (EDSR) network, and show that it can be adapted to achieve better perceptual quality for a specific range of the distortion measure. While the original network of EDSR was trained to minimize the error defined based on per-pixel accuracy alone, we train our network using a generative adversarial network framework with EDSR as the generator module. Our proposed network, called enhanced perceptual super-resolution network (EPSR), is trained with a combination of mean squared error loss, perceptual loss, and adversarial loss. Our experiments reveal that EPSR achieves the state-of-the-art trade-off between distortion and perceptual quality while the existing methods perform well in either of these measures alone. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00344v2 |
PDF | http://arxiv.org/pdf/1811.00344v2.pdf |
PWC | https://paperswithcode.com/paper/analyzing-perception-distortion-tradeoff |
Repo | https://github.com/subeeshvasu/2018_subeesh_epsr_eccvw |
Framework | pytorch |
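The three-term generator objective is straightforward to write down; the weights below are illustrative rather than the paper's, and vgg_features and disc stand for a pretrained feature extractor and the discriminator:

```python
import torch
import torch.nn.functional as F

def generator_loss(sr, hr, vgg_features, disc, w_percep=0.1, w_adv=1e-3):
    """EPSR-style generator objective: per-pixel MSE, a VGG-feature
    (perceptual) distance, and an adversarial term rewarding outputs that
    fool the discriminator."""
    mse = F.mse_loss(sr, hr)
    percep = F.mse_loss(vgg_features(sr), vgg_features(hr))
    logits = disc(sr)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    return mse + w_percep * percep + w_adv * adv
```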
End-to-End Multi-Task Learning with Attention
Title | End-to-End Multi-Task Learning with Attention |
Authors | Shikun Liu, Edward Johns, Andrew J. Davison |
Abstract | We propose a novel multi-task learning architecture, which allows learning of task-specific feature-level attention. Our design, the Multi-Task Attention Network (MTAN), consists of a single shared network containing a global feature pool, together with a soft-attention module for each task. These modules allow for learning of task-specific features from the global features, whilst simultaneously allowing for features to be shared across different tasks. The architecture can be trained end-to-end and can be built upon any feed-forward neural network, is simple to implement, and is parameter efficient. We evaluate our approach on a variety of datasets, across both image-to-image predictions and image classification tasks. We show that our architecture is state-of-the-art in multi-task learning compared to existing methods, and is also less sensitive to various weighting schemes in the multi-task loss function. Code is available at https://github.com/lorenmt/mtan. |
Tasks | Multi-Task Learning |
Published | 2018-03-28 |
URL | http://arxiv.org/abs/1803.10704v2 |
PDF | http://arxiv.org/pdf/1803.10704v2.pdf |
PWC | https://paperswithcode.com/paper/end-to-end-multi-task-learning-with-attention |
Repo | https://github.com/lorenmt/mtan |
Framework | pytorch |
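A sketch of one task-specific attention module, assuming 1x1 convolutions and a sigmoid mask (close in spirit to, but not copied from, the released mtan code): each task learns an element-wise soft mask over the shared feature pool:

```python
import torch
import torch.nn as nn

class TaskAttention(nn.Module):
    """One task-specific attention module: a small conv block produces an
    element-wise soft mask, so each task selects its own view of the shared
    global features."""
    def __init__(self, channels):
        super().__init__()
        self.mask = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),  # soft mask in (0, 1)
        )

    def forward(self, shared_features):
        return self.mask(shared_features) * shared_features

# One module per task over the same shared backbone features:
shared = torch.randn(2, 64, 32, 32)
seg_feats = TaskAttention(64)(shared)
depth_feats = TaskAttention(64)(shared)
```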
Conditional molecular design with deep generative models
Title | Conditional molecular design with deep generative models |
Authors | Seokho Kang, Kyunghyun Cho |
Abstract | Although machine learning has been successfully used to propose novel molecules that satisfy desired properties, it is still challenging to explore a large chemical space efficiently. In this paper, we present a conditional molecular design method that facilitates generating new molecules with desired properties. The proposed model, which simultaneously performs both property prediction and molecule generation, is built as a semi-supervised variational autoencoder trained on a set of existing molecules with only a partial annotation. We generate new molecules with desired properties by sampling from the generative distribution estimated by the model. We demonstrate the effectiveness of the proposed model by evaluating it on drug-like molecules. The model improves the performance of property prediction by exploiting unlabeled molecules, and efficiently generates novel molecules fulfilling various target conditions. |
Tasks | |
Published | 2018-04-30 |
URL | http://arxiv.org/abs/1805.00108v3 |
PDF | http://arxiv.org/pdf/1805.00108v3.pdf |
PWC | https://paperswithcode.com/paper/conditional-molecular-design-with-deep |
Repo | https://github.com/gcolmenarejo/cmd |
Framework | tf |
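The generation step is conditional sampling: draw z from the prior, fix the desired property vector y, and decode (z, y). The decoder below is a hypothetical dense stand-in for the paper's recurrent semi-supervised VAE, with made-up vocabulary and property dimensions:

```python
import torch
import torch.nn as nn

class ConditionalDecoder(nn.Module):
    """Hypothetical stand-in decoder: maps (z, y) to a token sequence over a
    SMILES-like vocabulary, just to show the conditional-generation step."""
    def __init__(self, z_dim=64, y_dim=3, vocab=32, seq_len=60):
        super().__init__()
        self.z_dim, self.vocab, self.seq_len = z_dim, vocab, seq_len
        self.net = nn.Sequential(
            nn.Linear(z_dim + y_dim, 256), nn.ReLU(),
            nn.Linear(256, seq_len * vocab))

    @torch.no_grad()
    def sample(self, y, n=10):
        z = torch.randn(n, self.z_dim)                    # sample latent from the prior
        logits = self.net(torch.cat([z, y.expand(n, -1)], dim=1))
        return logits.view(n, self.seq_len, self.vocab).argmax(-1)

decoder = ConditionalDecoder()
y_target = torch.tensor([[0.5, -1.0, 2.0]])  # hypothetical target property values
token_ids = decoder.sample(y_target)         # n candidate molecules as token ids
```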