February 1, 2020


Paper Group AWR 255


AeroRIT: A New Scene for Hyperspectral Image Analysis


Title	AeroRIT: A New Scene for Hyperspectral Image Analysis
Authors	Aneesh Rangnekar, Nilay Mokashi, Emmett Ientilucci, Christopher Kanan, Matthew J. Hoffman
Abstract	We investigate modifying convolutional neural network (CNN) architecture to facilitate aerial hyperspectral scene understanding and present a new hyperspectral dataset, AeroRIT, that is large enough for CNN training. To date, the majority of hyperspectral airborne scenes have been confined to various sub-categories of vegetation and roads; this scene introduces two new categories: buildings and cars. To the best of our knowledge, this is the first comprehensive large-scale hyperspectral scene with nearly seven million pixel annotations for identifying cars, roads, and buildings. We compare the performance of three popular architectures (SegNet, U-Net, and Res-U-Net) for scene understanding and object identification via the task of dense semantic segmentation to establish a benchmark for the scene. To further strengthen the network, we add squeeze and excitation blocks for better channel interactions and use self-supervised learning for better encoder initialization. Aerial hyperspectral image analysis has been restricted to small datasets with limited train/test split capabilities, and we believe that AeroRIT will help advance research in the field with a more complex object distribution to perform well on. The full dataset, with flight lines in radiance and reflectance domain, is available for download at https://github.com/aneesh3108/AeroRIT. This dataset is the first step towards developing robust algorithms for hyperspectral airborne sensing that can perform advanced tasks like vehicle tracking and occlusion handling.
Tasks	Image Super-Resolution, Scene Understanding, Semantic Segmentation, Super-Resolution
Published	2019-12-17
URL	https://arxiv.org/abs/1912.08178v2
PDF	https://arxiv.org/pdf/1912.08178v2.pdf
PWC	https://paperswithcode.com/paper/aerorit-a-new-scene-for-hyperspectral-image
Repo	https://github.com/aneesh3108/AeroRIT
Framework	none

REflex: Flexible Framework for Relation Extraction in Multiple Domains


Title	REflex: Flexible Framework for Relation Extraction in Multiple Domains
Authors	Geeticka Chauhan, Matthew B. A. McDermott, Peter Szolovits
Abstract	Systematic comparison of methods for relation extraction (RE) is difficult because many experiments in the field are not described precisely enough to be completely reproducible and many papers fail to report ablation studies that would highlight the relative contributions of their various combined techniques. In this work, we build a unifying framework for RE, applying it to three highly used datasets (from the general, biomedical and clinical domains) with the ability to be extended to new datasets. By performing a systematic exploration of modeling, pre-processing and training methodologies, we find that choices of pre-processing are a large contributor to performance and that omission of such information can further hinder fair comparison. Other insights from our exploration allow us to provide recommendations for future research in this area.
Tasks	Relation Extraction
Published	2019-06-19
URL	https://arxiv.org/abs/1906.08318v4
PDF	https://arxiv.org/pdf/1906.08318v4.pdf
PWC	https://paperswithcode.com/paper/reflex-flexible-framework-for-relation
Repo	https://github.com/geetickachauhan/relation-extraction
Framework	tf

Inferring Distributions Over Depth from a Single Image


Title	Inferring Distributions Over Depth from a Single Image
Authors	Gengshan Yang, Peiyun Hu, Deva Ramanan
Abstract	When building a geometric scene understanding system for autonomous vehicles, it is crucial to know when the system might fail. Most contemporary approaches cast the problem as depth regression, whose output is a depth value for each pixel. Such approaches cannot diagnose when failures might occur. One attractive alternative is a deep Bayesian network, which captures uncertainty in both model parameters and ambiguous sensor measurements. However, estimating uncertainties is often slow and the distributions are often limited to be uni-modal. In this paper, we recast the continuous problem of depth regression as discrete binary classification, whose output is an un-normalized distribution over possible depths for each pixel. Such output allows one to reliably and efficiently capture multi-modal depth distributions in ambiguous cases, such as depth discontinuities and reflective surfaces. Results on standard benchmarks show that our method produces accurate depth predictions and significantly better uncertainty estimations than prior art while running near real-time. Finally, by making use of uncertainties of the predicted distribution, we significantly reduce streak-like artifacts and improve accuracy as well as memory efficiency in 3D map reconstruction.
Tasks	Autonomous Vehicles, Scene Understanding
Published	2019-12-12
URL	https://arxiv.org/abs/1912.06268v1
PDF	https://arxiv.org/pdf/1912.06268v1.pdf
PWC	https://paperswithcode.com/paper/inferring-distributions-over-depth-from-a
Repo	https://github.com/gengshan-y/monodepth-uncertainty
Framework	tf
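
The recasting described in the abstract, from regressing one depth value to scoring a set of discrete depth bins, can be illustrated with a small sketch. The uniform bin layout, the per-bin sigmoid, the 0.5 threshold, and every function name below are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math

def bin_centers(d_min, d_max, n_bins):
    """Centers of n_bins uniform depth bins on [d_min, d_max] (assumed layout)."""
    width = (d_max - d_min) / n_bins
    return [d_min + (i + 0.5) * width for i in range(n_bins)]

def bin_scores(logits):
    """Independent sigmoid per bin: an un-normalized distribution (need not sum to 1)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def most_likely_depth(scores, centers):
    """Depth of the highest-scoring bin."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return centers[best]

def depth_modes(scores, centers, threshold=0.5):
    """Depths of all local maxima scoring above threshold; more than one signals ambiguity."""
    modes = []
    for i, s in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i + 1 < len(scores) else float("-inf")
        if s > threshold and s >= left and s >= right:
            modes.append(centers[i])
    return modes

# Hypothetical logits for one pixel on a depth discontinuity: two plausible depths.
centers = bin_centers(0.0, 80.0, 8)
logits = [-4.0, 3.0, -4.0, -4.0, -4.0, 2.5, -4.0, -4.0]
scores = bin_scores(logits)
print(most_likely_depth(scores, centers))
print(depth_modes(scores, centers))
```

A regression head would have to commit to a single value for this pixel, whereas the per-bin scores keep both candidate depths visible, which is the kind of multi-modal case the abstract mentions.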

Towards Ghost-free Shadow Removal via Dual Hierarchical Aggregation Network and Shadow Matting GAN


Title	Towards Ghost-free Shadow Removal via Dual Hierarchical Aggregation Network and Shadow Matting GAN
Authors	Xiaodong Cun, Chi-Man Pun, Cheng Shi
Abstract	Shadow removal is an essential task for scene understanding. Many studies consider only matching the image contents, which often causes two types of ghosts: color inconsistencies in shadow regions or artifacts on shadow boundaries. In this paper, we tackle these issues in two ways. First, to carefully learn the border artifacts-free image, we propose a novel network structure named the dual hierarchical aggregation network (DHAN). It contains a series of growth dilated convolutions as the backbone without any down-samplings, and we hierarchically aggregate multi-context features for attention and prediction, respectively. Second, we argue that training on a limited dataset restricts the textural understanding of the network, which leads to the shadow region color inconsistencies. Currently, the largest dataset contains 2k+ shadow/shadow-free image pairs. However, it has only 0.1k+ unique scenes since many samples share exactly the same background with different shadow positions. Thus, we design a shadow matting generative adversarial network (SMGAN) to synthesize realistic shadow mattings from a given shadow mask and shadow-free image. With the help of novel masks or scenes, we enhance the current datasets using synthesized shadow images. Experiments show that our DHAN can erase the shadows and produce high-quality ghost-free images. After training on the synthesized and real datasets, our network outperforms other state-of-the-art methods by a large margin. The code is available: http://github.com/vinthony/ghost-free-shadow-removal/
Tasks	Scene Understanding
Published	2019-11-20
URL	https://arxiv.org/abs/1911.08718v2
PDF	https://arxiv.org/pdf/1911.08718v2.pdf
PWC	https://paperswithcode.com/paper/towards-ghost-free-shadow-removal-via-dual
Repo	https://github.com/vinthony/ghost-free-shadow-removal
Framework	tf

DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames


Title	DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames
Authors	Erik Wijmans, Abhishek Kadian, Ari Morcos, Stefan Lee, Irfan Essa, Devi Parikh, Manolis Savva, Dhruv Batra
Abstract	We present Decentralized Distributed Proximal Policy Optimization (DD-PPO), a method for distributed reinforcement learning in resource-intensive simulated environments. DD-PPO is distributed (uses multiple machines), decentralized (lacks a centralized server), and synchronous (no computation is ever stale), making it conceptually simple and easy to implement. In our experiments on training virtual robots to navigate in Habitat-Sim, DD-PPO exhibits near-linear scaling – achieving a speedup of 107x on 128 GPUs over a serial implementation. We leverage this scaling to train an agent for 2.5 Billion steps of experience (the equivalent of 80 years of human experience) – over 6 months of GPU-time training in under 3 days of wall-clock time with 64 GPUs. This massive-scale training not only sets the state of the art on Habitat Autonomous Navigation Challenge 2019, but essentially solves the task – near-perfect autonomous navigation in an unseen environment without access to a map, directly from an RGB-D camera and a GPS+Compass sensor. Fortuitously, error vs computation exhibits a power-law-like distribution; thus, 90% of peak performance is obtained relatively early (at 100 million steps) and relatively cheaply (under 1 day with 8 GPUs). Finally, we show that the scene understanding and navigation policies learned can be transferred to other navigation tasks – the analog of ImageNet pre-training + task-specific fine-tuning for embodied AI. Our model outperforms ImageNet pre-trained CNNs on these transfer tasks and can serve as a universal resource (all models and code are publicly available).
Tasks	Autonomous Navigation, Scene Understanding
Published	2019-11-01
URL	https://arxiv.org/abs/1911.00357v2
PDF	https://arxiv.org/pdf/1911.00357v2.pdf
PWC	https://paperswithcode.com/paper/decentralized-distributed-ppo-solving
Repo	https://github.com/facebookresearch/habitat-api/tree/master/habitat_baselines/rl/ddppo
Framework	pytorch
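
The scaling figures quoted in the abstract can be checked with a little arithmetic. The sketch below is only a worked calculation on the numbers stated there; the helper names and the 30-day month are assumptions introduced here.

```python
def scaling_efficiency(speedup, n_gpus):
    """Fraction of ideal linear scaling achieved (1.0 would be perfectly linear)."""
    return speedup / n_gpus

def gpu_days(wall_clock_days, n_gpus):
    """Total GPU time consumed, in GPU-days."""
    return wall_clock_days * n_gpus

# 107x speedup on 128 GPUs, as quoted in the abstract.
eff = scaling_efficiency(107, 128)

# "Under 3 days" on 64 GPUs gives an upper bound on the GPU time spent.
budget = gpu_days(3, 64)

print(f"scaling efficiency: {eff:.3f}")
print(f"GPU-days: {budget}, GPU-months (30-day months): {budget / 30:.1f}")
```

An efficiency of roughly 0.84 is what "near-linear" amounts to here, and 192 GPU-days is a little over six 30-day months, in line with the "over 6 months of GPU-time" statement.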

Symmetric Cross Entropy for Robust Learning with Noisy Labels


Title	Symmetric Cross Entropy for Robust Learning with Noisy Labels
Authors	Yisen Wang, Xingjun Ma, Zaiyi Chen, Yuan Luo, Jinfeng Yi, James Bailey
Abstract	Training accurate deep neural networks (DNNs) in the presence of noisy labels is an important and challenging task. Though a number of approaches have been proposed for learning with noisy labels, many open issues remain. In this paper, we show that DNN learning with Cross Entropy (CE) exhibits overfitting to noisy labels on some classes (“easy” classes), but more surprisingly, it also suffers from significant under learning on some other classes (“hard” classes). Intuitively, CE requires an extra term to facilitate learning of hard classes, and more importantly, this term should be noise tolerant, so as to avoid overfitting to noisy labels. Inspired by the symmetric KL-divergence, we propose the approach of Symmetric cross entropy Learning (SL), boosting CE symmetrically with a noise robust counterpart Reverse Cross Entropy (RCE). Our proposed SL approach simultaneously addresses both the under learning and overfitting problem of CE in the presence of noisy labels. We provide a theoretical analysis of SL and also empirically show, on a range of benchmark and real-world datasets, that SL outperforms state-of-the-art methods. We also show that SL can be easily incorporated into existing methods in order to further enhance their performance.
Tasks
Published	2019-08-16
URL	https://arxiv.org/abs/1908.06112v1
PDF	https://arxiv.org/pdf/1908.06112v1.pdf
PWC	https://paperswithcode.com/paper/symmetric-cross-entropy-for-robust-learning
Repo	https://github.com/xingjunm/dimensionality-driven-learning
Framework	tf
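
To make the idea of pairing CE with a reverse term concrete, here is a minimal sketch for a single example. CE weights the log of the prediction by the label distribution, and the reverse term swaps the two roles. The weights `alpha` and `beta` and the constant substituted for log 0 are assumptions of this sketch, as the abstract does not state them; this is not the authors' code.

```python
import math

def cross_entropy(q, p, eps=1e-12):
    """CE = -sum_k q_k * log(p_k); q is the label distribution, p the prediction."""
    return -sum(qk * math.log(max(pk, eps)) for qk, pk in zip(q, p))

def reverse_cross_entropy(q, p, log_zero=-4.0):
    """RCE = -sum_k p_k * log(q_k); log(0) is replaced by the constant log_zero
    (an assumed value) so that one-hot labels give a finite loss."""
    return -sum(pk * (math.log(qk) if qk > 0 else log_zero) for qk, pk in zip(q, p))

def symmetric_cross_entropy(q, p, alpha=1.0, beta=1.0):
    """Weighted sum of the two terms; alpha and beta are hypothetical weights."""
    return alpha * cross_entropy(q, p) + beta * reverse_cross_entropy(q, p)

label = [1.0, 0.0, 0.0]          # one-hot (possibly noisy) label
prediction = [0.7, 0.2, 0.1]     # softmax output of a hypothetical model
print(cross_entropy(label, prediction))
print(reverse_cross_entropy(label, prediction))
print(symmetric_cross_entropy(label, prediction))
```

With a one-hot label the reverse term reduces to `-log_zero` times the probability mass placed on the non-labeled classes, so it stays bounded however wrong the label is. That boundedness is one intuition for why such a term can tolerate label noise.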

Natural Image Noise Dataset


Title	Natural Image Noise Dataset
Authors	Benoit Brummer, Christophe De Vleeschouwer
Abstract	Convolutional neural networks have been the focus of research aiming to solve image denoising problems, but their performance remains unsatisfactory for most applications. These networks are trained with synthetic noise distributions that do not accurately reflect the noise captured by image sensors. Some datasets of clean-noisy image pairs have been introduced but they are usually meant for benchmarking or specific applications. We introduce the Natural Image Noise Dataset (NIND), a dataset of DSLR-like images with varying levels of ISO noise which is large enough to train models for blind denoising over a wide range of noise. We demonstrate a denoising model trained with the NIND and show that it significantly outperforms BM3D on ISO noise from unseen images, even when generalizing to images from a different type of camera. The Natural Image Noise Dataset is published on Wikimedia Commons such that it remains open for curation and contributions. We expect that this dataset will prove useful for future image denoising applications.
Tasks	Denoising, Image Denoising
Published	2019-06-01
URL	https://arxiv.org/abs/1906.00270v1
PDF	https://arxiv.org/pdf/1906.00270v1.pdf
PWC	https://paperswithcode.com/paper/190600270
Repo	https://github.com/MatusPilnan/nsiete-project
Framework	tf

Real-time Scene Text Detection with Differentiable Binarization


Title	Real-time Scene Text Detection with Differentiable Binarization
Authors	Minghui Liao, Zhaoyi Wan, Cong Yao, Kai Chen, Xiang Bai
Abstract	Recently, segmentation-based methods have become quite popular in scene text detection, as the segmentation results can more accurately describe scene text of various shapes such as curved text. However, the post-processing of binarization is essential for segmentation-based detection, which converts probability maps produced by a segmentation method into bounding boxes/regions of text. In this paper, we propose a module named Differentiable Binarization (DB), which can perform the binarization process in a segmentation network. Optimized along with a DB module, a segmentation network can adaptively set the thresholds for binarization, which not only simplifies the post-processing but also enhances the performance of text detection. Based on a simple segmentation network, we validate the performance improvements of DB on five benchmark datasets, on which it consistently achieves state-of-the-art results in terms of both detection accuracy and speed. In particular, with a light-weight backbone, the performance improvements by DB are significant so that we can look for an ideal tradeoff between detection accuracy and efficiency. Specifically, with a backbone of ResNet-18, our detector achieves an F-measure of 82.8, running at 62 FPS, on the MSRA-TD500 dataset. Code is available at: https://github.com/MhLiao/DB
Tasks	Scene Text Detection
Published	2019-11-20
URL	https://arxiv.org/abs/1911.08947v2
PDF	https://arxiv.org/pdf/1911.08947v2.pdf
PWC	https://paperswithcode.com/paper/real-time-scene-text-detection-with
Repo	https://github.com/xuannianz/DifferentiableBinarization
Framework	tf
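
A hard threshold has zero gradient almost everywhere, so it cannot be trained together with the network. One common way to make the step differentiable is a steep sigmoid of the difference between the probability and the threshold. The sketch below uses that form; the function names and the steepness `k = 50` are assumptions, since the abstract gives no formula.

```python
import math

def hard_binarize(prob, threshold):
    """Standard post-processing step: not differentiable in prob or threshold."""
    return 1.0 if prob >= threshold else 0.0

def soft_binarize(prob, threshold, k=50.0):
    """Steep sigmoid of (prob - threshold): a smooth stand-in for the hard step."""
    return 1.0 / (1.0 + math.exp(-k * (prob - threshold)))

def soft_binarize_grad_threshold(prob, threshold, k=50.0):
    """d(soft_binarize)/d(threshold); non-zero, so the threshold can be learned."""
    b = soft_binarize(prob, threshold, k)
    return -k * b * (1.0 - b)

# Hypothetical per-pixel probabilities and per-pixel thresholds.
probs = [0.10, 0.45, 0.55, 0.90]
thresholds = [0.30, 0.50, 0.50, 0.30]
for p, t in zip(probs, thresholds):
    print(p, t, hard_binarize(p, t), round(soft_binarize(p, t), 4))
```

Far from the threshold the soft output agrees with the hard one, while near the threshold it varies smoothly and passes a gradient to both the probability map and the threshold map.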

Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control


Title	Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control
Authors	Sai Qian Zhang, Qi Zhang, Jieyu Lin
Abstract	Multi-agent reinforcement learning (MARL) has recently received considerable attention due to its applicability to a wide range of real-world applications. However, achieving efficient communication among agents has always been an overarching problem in MARL. In this work, we propose Variance Based Control (VBC), a simple yet efficient technique to improve communication efficiency in MARL. By limiting the variance of the exchanged messages between agents during the training phase, the noisy component in the messages can be eliminated effectively, while the useful part can be preserved and utilized by the agents for better performance. Our evaluation using a challenging set of StarCraft II benchmarks indicates that our method achieves $2$ to $10\times$ lower communication overhead than state-of-the-art MARL algorithms, while allowing agents to better collaborate by developing sophisticated strategies.
Tasks	Multi-agent Reinforcement Learning, Starcraft, Starcraft II
Published	2019-09-06
URL	https://arxiv.org/abs/1909.02682v2
PDF	https://arxiv.org/pdf/1909.02682v2.pdf
PWC	https://paperswithcode.com/paper/efficient-communication-in-multi-agent
Repo	https://github.com/saizhang0218/VBC
Framework	pytorch

Evolving Robust Neural Architectures to Defend from Adversarial Attacks


Title	Evolving Robust Neural Architectures to Defend from Adversarial Attacks
Authors	Danilo Vasconcellos Vargas, Shashank Kotyan
Abstract	Deep neural networks are prone to misclassify slightly modified input images. Recently, many defences have been proposed, but none have improved the robustness of neural networks consistently. Here, we propose to use adversarial attacks as a function evaluation to automatically search for neural architectures that can resist such attacks. Experiments on neural architecture search algorithms from the literature show that although accurate, they are not able to find robust architectures. A major reason for this lies in their limited search space. By creating a novel neural architecture search with options for dense layers to connect with convolution layers and vice-versa as well as the addition of concatenation layers in the search, we were able to evolve an architecture that is inherently accurate on adversarial samples. Interestingly, this inherent robustness of the evolved architecture rivals state-of-the-art defences such as adversarial training while being trained only on the non-adversarial samples. Moreover, the evolved architecture makes use of some peculiar traits which might be useful for developing even more robust ones. Thus, the results here demonstrate that more robust architectures exist and open up a new range of possibilities for the development and exploration of deep neural networks using automatic architecture search. Code available at http://bit.ly/RobustArchitectureSearch.
Tasks	Neural Architecture Search
Published	2019-06-27
URL	https://arxiv.org/abs/1906.11667v2
PDF	https://arxiv.org/pdf/1906.11667v2.pdf
PWC	https://paperswithcode.com/paper/evolving-robust-neural-architectures-to
Repo	https://github.com/shashankkotyan/RobustArchitectureSearch
Framework	tf

Defending Neural Backdoors via Generative Distribution Modeling


Title	Defending Neural Backdoors via Generative Distribution Modeling
Authors	Ximing Qiao, Yukun Yang, Hai Li
Abstract	Neural backdoor attack is emerging as a severe security threat to deep learning, while the capability of existing defense methods is limited, especially for complex backdoor triggers. In this work, we explore the space formed by the pixel values of all possible backdoor triggers. An original trigger used by an attacker to build the backdoored model represents only a point in the space. It will then be generalized into a distribution of valid triggers, all of which can influence the backdoored model. Thus, previous methods that model only one point of the trigger distribution are not sufficient. Getting the entire trigger distribution, e.g., via generative modeling, is a key to effective defense. However, existing generative modeling techniques for image generation are not applicable to the backdoor scenario as the trigger distribution is completely unknown. In this work, we propose max-entropy staircase approximator (MESA), an algorithm for high-dimensional sampling-free generative modeling, and use it to recover the trigger distribution. We also develop a defense technique to remove the triggers from the backdoored model. Our experiments on the Cifar10/100 datasets demonstrate the effectiveness of MESA in modeling the trigger distribution and the robustness of the proposed defense method.
Tasks	Image Generation
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04749v2
PDF	https://arxiv.org/pdf/1910.04749v2.pdf
PWC	https://paperswithcode.com/paper/defending-neural-backdoors-via-generative
Repo	https://github.com/superrrpotato/Defending-Neural-Backdoors-via-Generative-Distribution-Modeling
Framework	pytorch

On Recovering Latent Factors From Sampling And Firing Graph


Title	On Recovering Latent Factors From Sampling And Firing Graph
Authors	Pierre Gouedard
Abstract	Consider a set of latent factors whose observable effect of activation is caught on a measure space that appears as a grid of bits taking values in $\{0, 1\}$. This paper intends to deliver a theoretical and practical answer to the question: Given that we have access to a perfect indicator of the activation of latent factors that label a finite dataset of the grid’s activity, can we devise a procedure to build a generic identifier of the factors’ activations?
Tasks
Published	2019-09-20
URL	https://arxiv.org/abs/1909.09493v1
PDF	https://arxiv.org/pdf/1909.09493v1.pdf
PWC	https://paperswithcode.com/paper/on-recovering-latent-factors-from-sampling
Repo	https://github.com/pierreGouedard/deyep
Framework	none

Active embedding search via noisy paired comparisons


Title	Active embedding search via noisy paired comparisons
Authors	Gregory H. Canal, Andrew K. Massimino, Mark A. Davenport, Christopher J. Rozell
Abstract	Suppose that we wish to estimate a user’s preference vector $w$ from paired comparisons of the form “does user $w$ prefer item $p$ or item $q$?,” where both the user and items are embedded in a low-dimensional Euclidean space with distances that reflect user and item similarities. Such observations arise in numerous settings, including psychometrics and psychology experiments, search tasks, advertising, and recommender systems. In such tasks, queries can be extremely costly and subject to varying levels of response noise; thus, we aim to actively choose pairs that are most informative given the results of previous comparisons. We provide new theoretical insights into the benefits and challenges of greedy information maximization in this setting, and develop two novel strategies that maximize lower bounds on information gain and are simpler to analyze and compute respectively. We use simulated responses from a real-world dataset to validate our strategies through their similar performance to greedy information maximization, and their superior preference estimation over state-of-the-art selection methods as well as random queries.
Tasks	Recommendation Systems
Published	2019-05-10
URL	https://arxiv.org/abs/1905.04363v2
PDF	https://arxiv.org/pdf/1905.04363v2.pdf
PWC	https://paperswithcode.com/paper/active-embedding-search-via-noisy-paired
Repo	https://github.com/siplab-gt/pairsearch
Framework	none
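
The paired-comparison setting in the abstract can be made concrete with a small sketch: a user at point `w` prefers whichever item is closer, which is the same as asking on which side of a hyperplane `w` lies. The logistic noise model, the `noise_scale` parameter, and all function names are assumptions made for illustration; the abstract does not specify a noise model, and the query-selection strategies themselves are not shown.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def prefers_p(w, p, q):
    """Noiseless response: the user at w prefers the closer item."""
    return sq_dist(w, p) < sq_dist(w, q)

def prefers_p_hyperplane(w, p, q):
    """Same test written as a half-space: 2(q - p).w < |q|^2 - |p|^2."""
    normal = [2.0 * (qi - pi) for pi, qi in zip(p, q)]
    offset = sum(qi * qi for qi in q) - sum(pi * pi for pi in p)
    return sum(n * wi for n, wi in zip(normal, w)) < offset

def noisy_response(w, p, q, noise_scale=1.0, rng=random):
    """Assumed logistic noise: answers get less reliable as the margin shrinks."""
    margin = sq_dist(w, q) - sq_dist(w, p)
    z = max(-60.0, min(60.0, margin / noise_scale))  # clamp to avoid overflow
    prob_p = 1.0 / (1.0 + math.exp(-z))
    return rng.random() < prob_p

w = [0.0, 0.0]
p, q = [1.0, 0.0], [3.0, 0.0]
print(prefers_p(w, p, q), prefers_p_hyperplane(w, p, q))
print(noisy_response(w, p, q, noise_scale=2.0, rng=random.Random(0)))
```

Each answered query therefore restricts `w` to one side of a hyperplane, so some pairs narrow the estimate much more than others. That is why actively choosing the pairs matters.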

Testing Conditional Independence in Supervised Learning Algorithms


Title	Testing Conditional Independence in Supervised Learning Algorithms
Authors	David S. Watson, Marvin N. Wright
Abstract	We propose the conditional predictive impact (CPI), a consistent and unbiased estimator of the association between one or several features and a given outcome, conditional on a reduced feature set. Building on the knockoff framework of Candès et al. (2018), we develop a novel testing procedure that works in conjunction with any valid knockoff sampler, supervised learning algorithm, and loss function. The CPI can be efficiently computed for high-dimensional data without any sparsity constraints. We demonstrate convergence criteria for the CPI and develop statistical inference procedures for evaluating its magnitude, significance, and precision. These tests aid in feature and model selection, extending traditional frequentist and Bayesian techniques to general supervised learning tasks. The CPI may also be applied in causal discovery to identify underlying multivariate graph structures. We test our method using various algorithms, including linear regression, neural networks, random forests, and support vector machines. Empirical results show that the CPI compares favorably to alternative variable importance measures and other nonparametric tests of conditional independence on a diverse array of real and simulated datasets. Simulations confirm that our inference procedures successfully control Type I error and achieve nominal coverage probability. Our method has been implemented in an R package, cpi, which can be downloaded from https://github.com/dswatson/cpi.
Tasks	Causal Discovery, Model Selection
Published	2019-01-28
URL	https://arxiv.org/abs/1901.09917v4
PDF	https://arxiv.org/pdf/1901.09917v4.pdf
PWC	https://paperswithcode.com/paper/testing-conditional-predictive-independence
Repo	https://github.com/dswatson/cpi
Framework	none
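
As a rough illustration of the quantity described in the abstract, the sketch below compares per-sample losses of a fitted model on the original test data against losses obtained after replacing the feature of interest with a knockoff copy. It takes those two loss vectors as given. Summarizing the mean difference with a t-statistic is an assumption of this sketch, as are the function name and the toy numbers; it is written in Python for illustration and is not the `cpi` R package's interface.

```python
import math
import statistics

def cpi_summary(loss_original, loss_knockoff):
    """Mean loss increase caused by the knockoff substitution, with its standard
    error and a t-statistic for testing whether the increase exceeds zero."""
    deltas = [lk - lo for lo, lk in zip(loss_original, loss_knockoff)]
    cpi = statistics.mean(deltas)
    se = statistics.stdev(deltas) / math.sqrt(len(deltas))
    t_stat = cpi / se if se > 0 else float("nan")
    return cpi, se, t_stat

# Hypothetical per-sample losses on four held-out observations.
loss_original = [1.0, 2.0, 3.0, 4.0]
loss_knockoff = [2.0, 3.0, 5.0, 6.0]   # same model, feature swapped for its knockoff
cpi, se, t_stat = cpi_summary(loss_original, loss_knockoff)
print(cpi, se, t_stat)
```

If the feature carries no information beyond the remaining features, swapping in a knockoff should leave the losses roughly unchanged, so the mean difference should be close to zero.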

Episodic Memory in Lifelong Language Learning


Title	Episodic Memory in Lifelong Language Learning
Authors	Cyprien de Masson d’Autume, Sebastian Ruder, Lingpeng Kong, Dani Yogatama
Abstract	We introduce a lifelong language learning setup where a model needs to learn from a stream of text examples without any dataset identifier. We propose an episodic memory model that performs sparse experience replay and local adaptation to mitigate catastrophic forgetting in this setup. Experiments on text classification and question answering demonstrate the complementary benefits of sparse experience replay and local adaptation to allow the model to continuously learn from new datasets. We also show that the space complexity of the episodic memory module can be reduced significantly (~50-90%) by randomly choosing which examples to store in memory with a minimal decrease in performance. We consider an episodic memory component as a crucial building block of general linguistic intelligence and see our model as a first step in that direction.
Tasks	Continual Learning, Question Answering, Text Classification
Published	2019-06-03
URL	https://arxiv.org/abs/1906.01076v3
PDF	https://arxiv.org/pdf/1906.01076v3.pdf
PWC	https://paperswithcode.com/paper/episodic-memory-in-lifelong-language-learning
Repo	https://github.com/h3lio5/episodic-lifelong-learning
Framework	pytorch
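
Two of the ingredients named in the abstract, randomly choosing which examples to store and replaying them only occasionally, can be sketched as follows. The class and function names, the write probability, and the replay schedule are assumptions made for illustration; the model update and the local adaptation step are deliberately left out.

```python
import random

class EpisodicMemory:
    """Stores each incoming example with probability write_prob."""

    def __init__(self, write_prob, seed=0):
        self.write_prob = write_prob
        self.items = []
        self.rng = random.Random(seed)

    def maybe_write(self, example):
        if self.rng.random() < self.write_prob:
            self.items.append(example)

    def sample(self, k):
        """Uniformly sample up to k stored examples without replacement."""
        return self.rng.sample(self.items, min(k, len(self.items)))

def run_stream(stream, memory, replay_every=100, replay_size=4):
    """Walks the stream and returns the replay events as (step, batch) pairs."""
    replays = []
    for step, example in enumerate(stream, start=1):
        # A real system would take a gradient step on `example` here.
        memory.maybe_write(example)
        if step % replay_every == 0 and memory.items:
            # Sparse replay: revisit a few stored examples once in a while.
            replays.append((step, memory.sample(replay_size)))
    return replays

memory = EpisodicMemory(write_prob=0.5)
events = run_stream(range(1000), memory)
print(len(memory.items), len(events))
```

Lowering `write_prob` shrinks the memory roughly in proportion, which is the knob behind the space saving the abstract reports for random storage.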