Paper Group AWR 255
AeroRIT: A New Scene for Hyperspectral Image Analysis
Title | AeroRIT: A New Scene for Hyperspectral Image Analysis |
Authors | Aneesh Rangnekar, Nilay Mokashi, Emmett Ientilucci, Christopher Kanan, Matthew J. Hoffman |
Abstract | We investigate modifying convolutional neural network (CNN) architecture to facilitate aerial hyperspectral scene understanding and present a new hyperspectral dataset, AeroRIT, that is large enough for CNN training. To date, the majority of hyperspectral airborne scenes have been confined to various sub-categories of vegetation and roads, and this scene introduces two new categories: buildings and cars. To the best of our knowledge, this is the first comprehensive large-scale hyperspectral scene with nearly seven million pixel annotations for identifying cars, roads, and buildings. We compare the performance of three popular architectures (SegNet, U-Net, and Res-U-Net) for scene understanding and object identification via the task of dense semantic segmentation to establish a benchmark for the scene. To further strengthen the network, we add squeeze-and-excitation blocks for better channel interactions and use self-supervised learning for better encoder initialization. Aerial hyperspectral image analysis has been restricted to small datasets with limited train/test split capabilities, and we believe that AeroRIT will help advance research in the field with a more complex object distribution on which to perform well. The full dataset, with flight lines in the radiance and reflectance domains, is available for download at https://github.com/aneesh3108/AeroRIT. This dataset is a first step towards developing algorithms for hyperspectral airborne sensing that can robustly perform advanced tasks like vehicle tracking and occlusion handling. |
Tasks | Image Super-Resolution, Scene Understanding, Semantic Segmentation, Super-Resolution |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.08178v2 |
https://arxiv.org/pdf/1912.08178v2.pdf | |
PWC | https://paperswithcode.com/paper/aerorit-a-new-scene-for-hyperspectral-image |
Repo | https://github.com/aneesh3108/AeroRIT |
Framework | none |
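The squeeze-and-excitation blocks mentioned in the abstract re-weight feature channels using a gating signal computed from globally pooled activations. A minimal PyTorch sketch of such a block, with an illustrative reduction ratio rather than the paper's exact configuration:

```python
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Channel re-weighting via global pooling and a two-layer gate."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # squeeze: global spatial average
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                              # excitation weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.gate(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                   # re-scale each channel
```

Such a block can be dropped in after any convolutional stage of a SegNet/U-Net-style encoder.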
REflex: Flexible Framework for Relation Extraction in Multiple Domains
Title | REflex: Flexible Framework for Relation Extraction in Multiple Domains |
Authors | Geeticka Chauhan, Matthew B. A. McDermott, Peter Szolovits |
Abstract | Systematic comparison of methods for relation extraction (RE) is difficult because many experiments in the field are not described precisely enough to be completely reproducible, and many papers fail to report ablation studies that would highlight the relative contributions of their various combined techniques. In this work, we build a unifying framework for RE, applying it to three widely used datasets (from the general, biomedical and clinical domains) and designing it to be extensible to new datasets. By performing a systematic exploration of modeling, pre-processing and training methodologies, we find that choices of pre-processing are a large contributor to performance and that omission of such information can further hinder fair comparison. Other insights from our exploration allow us to provide recommendations for future research in this area. |
Tasks | Relation Extraction |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08318v4 |
https://arxiv.org/pdf/1906.08318v4.pdf | |
PWC | https://paperswithcode.com/paper/reflex-flexible-framework-for-relation |
Repo | https://github.com/geetickachauhan/relation-extraction |
Framework | tf |
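The abstract's systematic exploration of pre-processing choices amounts to an ablation grid over boolean switches. A hypothetical Python sketch — the switch names are illustrative, not the framework's actual flags, and train_and_eval is a stand-in for a full experiment run:

```python
from itertools import product

# Hypothetical grid over pre-processing choices of the kind the abstract highlights.
GRID = {
    "entity_blinding": [True, False],
    "punctuation_removal": [True, False],
    "digit_normalization": [True, False],
}

def ablation_runs(train_and_eval):
    """Run one experiment per pre-processing combination (sketch)."""
    results = {}
    for combo in product(*GRID.values()):
        config = dict(zip(GRID.keys(), combo))
        results[tuple(sorted(config.items()))] = train_and_eval(config)
    return results
```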
Inferring Distributions Over Depth from a Single Image
Title | Inferring Distributions Over Depth from a Single Image |
Authors | Gengshan Yang, Peiyun Hu, Deva Ramanan |
Abstract | When building a geometric scene understanding system for autonomous vehicles, it is crucial to know when the system might fail. Most contemporary approaches cast the problem as depth regression, whose output is a depth value for each pixel. Such approaches cannot diagnose when failures might occur. One attractive alternative is a deep Bayesian network, which captures uncertainty in both model parameters and ambiguous sensor measurements. However, estimating uncertainties is often slow, and the distributions are often limited to be uni-modal. In this paper, we recast the continuous problem of depth regression as discrete binary classification, whose output is an un-normalized distribution over possible depths for each pixel. Such output allows one to reliably and efficiently capture multi-modal depth distributions in ambiguous cases, such as depth discontinuities and reflective surfaces. Results on standard benchmarks show that our method produces accurate depth predictions and significantly better uncertainty estimations than prior art while running in near real time. Finally, by making use of uncertainties of the predicted distribution, we significantly reduce streak-like artifacts and improve accuracy as well as memory efficiency in 3D map reconstruction. |
Tasks | Autonomous Vehicles, Scene Understanding |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.06268v1 |
https://arxiv.org/pdf/1912.06268v1.pdf | |
PWC | https://paperswithcode.com/paper/inferring-distributions-over-depth-from-a |
Repo | https://github.com/gengshan-y/monodepth-uncertainty |
Framework | tf |
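The recasting of depth regression as per-pixel classification can be pictured as a head that scores a discrete set of depth bins, from which a (possibly multi-modal) distribution and an uncertainty measure are read off. A sketch under an assumed bin layout and head shape, not the paper's exact parameterization:

```python
import torch

# Hypothetical bin layout: K depth bins spanning the sensor's working range.
K, d_min, d_max = 64, 0.5, 80.0
bin_centers = torch.linspace(d_min, d_max, K)

def depth_stats(logits: torch.Tensor):
    """logits: (B, K, H, W) per-pixel, per-bin scores.

    The abstract's per-bin binary classification yields un-normalized
    scores; we normalize here only to read off summary statistics.
    """
    scores = torch.sigmoid(logits)                         # un-normalized distribution
    p = scores / scores.sum(dim=1, keepdim=True).clamp_min(1e-12)
    mean_depth = (p * bin_centers.view(1, K, 1, 1)).sum(dim=1)
    entropy = -(p * p.clamp_min(1e-12).log()).sum(dim=1)   # uncertainty proxy
    return mean_depth, entropy
```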
Towards Ghost-free Shadow Removal via Dual Hierarchical Aggregation Network and Shadow Matting GAN
Title | Towards Ghost-free Shadow Removal via Dual Hierarchical Aggregation Network and Shadow Matting GAN |
Authors | Xiaodong Cun, Chi-Man Pun, Cheng Shi |
Abstract | Shadow removal is an essential task for scene understanding. Many studies consider only matching the image contents, which often causes two types of ghosts: color inconsistencies in shadow regions or artifacts on shadow boundaries. In this paper, we tackle these issues in two ways. First, to carefully learn a border-artifact-free image, we propose a novel network structure named the dual hierarchical aggregation network (DHAN). It contains a series of dilated convolutions with growing dilation rates as the backbone, without any down-sampling, and we hierarchically aggregate multi-context features for attention and prediction, respectively. Second, we argue that training on a limited dataset restricts the textural understanding of the network, which leads to color inconsistencies in shadow regions. Currently, the largest dataset contains 2k+ shadow/shadow-free image pairs. However, it has only 0.1k+ unique scenes, since many samples share exactly the same background with different shadow positions. Thus, we design a shadow matting generative adversarial network (SMGAN) to synthesize realistic shadow mattes from a given shadow mask and shadow-free image. With the help of novel masks or scenes, we enhance the current datasets using synthesized shadow images. Experiments show that our DHAN can erase the shadows and produce high-quality ghost-free images. After training on the synthesized and real datasets, our network outperforms other state-of-the-art methods by a large margin. The code is available: http://github.com/vinthony/ghost-free-shadow-removal/ |
Tasks | Scene Understanding |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08718v2 |
https://arxiv.org/pdf/1911.08718v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-ghost-free-shadow-removal-via-dual |
Repo | https://github.com/vinthony/ghost-free-shadow-removal |
Framework | tf |
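The shadow-matting view of synthesis composes a shadow image from a shadow-free image and a per-pixel matte. A minimal sketch of such a composite, assuming a simple multiplicative darkening model rather than SMGAN's learned generator:

```python
import numpy as np

def composite_shadow(free_img: np.ndarray, matte: np.ndarray) -> np.ndarray:
    """Compose a shadow image from a shadow-free image and a matte.

    free_img: (H, W, 3) floats in [0, 1]; matte: (H, W, 1) in [0, 1],
    where 1 leaves a pixel fully lit and smaller values darken it.
    """
    return free_img * matte  # element-wise matting composite
```

In the paper's pipeline, the matte itself is predicted by the GAN from a shadow mask and the shadow-free image, which is what makes the synthesized shadows look realistic.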
DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames
Title | DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames |
Authors | Erik Wijmans, Abhishek Kadian, Ari Morcos, Stefan Lee, Irfan Essa, Devi Parikh, Manolis Savva, Dhruv Batra |
Abstract | We present Decentralized Distributed Proximal Policy Optimization (DD-PPO), a method for distributed reinforcement learning in resource-intensive simulated environments. DD-PPO is distributed (uses multiple machines), decentralized (lacks a centralized server), and synchronous (no computation is ever stale), making it conceptually simple and easy to implement. In our experiments on training virtual robots to navigate in Habitat-Sim, DD-PPO exhibits near-linear scaling, achieving a speedup of 107x on 128 GPUs over a serial implementation. We leverage this scaling to train an agent for 2.5 billion steps of experience (the equivalent of 80 years of human experience), over 6 months of GPU-time, in under 3 days of wall-clock time with 64 GPUs. This massive-scale training not only sets the state of the art on the Habitat Autonomous Navigation Challenge 2019, but essentially solves the task: near-perfect autonomous navigation in an unseen environment without access to a map, directly from an RGB-D camera and a GPS+Compass sensor. Fortuitously, error vs. computation exhibits a power-law-like distribution; thus, 90% of peak performance is obtained relatively early (at 100 million steps) and relatively cheaply (under 1 day with 8 GPUs). Finally, we show that the scene understanding and navigation policies learned can be transferred to other navigation tasks, the analog of ImageNet pre-training + task-specific fine-tuning for embodied AI. Our model outperforms ImageNet pre-trained CNNs on these transfer tasks and can serve as a universal resource (all models and code are publicly available). |
Tasks | Autonomous Navigation, Scene Understanding |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00357v2 |
https://arxiv.org/pdf/1911.00357v2.pdf | |
PWC | https://paperswithcode.com/paper/decentralized-distributed-ppo-solving |
Repo | https://github.com/facebookresearch/habitat-api/tree/master/habitat_baselines/rl/ddppo |
Framework | pytorch |
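The "decentralized, synchronous" recipe reduces to every worker computing gradients on its own rollouts and averaging them with an all-reduce; there is no parameter server. A simplified PyTorch sketch — it omits DD-PPO's preemption threshold for straggler workers and assumes the default process group is already initialized:

```python
import torch
import torch.distributed as dist

def synchronous_step(model: torch.nn.Module, loss: torch.Tensor,
                     optimizer: torch.optim.Optimizer, world_size: int):
    """One decentralized, synchronous gradient step (sketch)."""
    optimizer.zero_grad()
    loss.backward()
    for p in model.parameters():               # average gradients across workers
        if p.grad is not None:
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world_size
    optimizer.step()                           # every worker applies the same update
```

Because no gradient is ever stale, every worker holds identical parameters after each step, which is what makes the scheme conceptually simple.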
Symmetric Cross Entropy for Robust Learning with Noisy Labels
Title | Symmetric Cross Entropy for Robust Learning with Noisy Labels |
Authors | Yisen Wang, Xingjun Ma, Zaiyi Chen, Yuan Luo, Jinfeng Yi, James Bailey |
Abstract | Training accurate deep neural networks (DNNs) in the presence of noisy labels is an important and challenging task. Though a number of approaches have been proposed for learning with noisy labels, many open issues remain. In this paper, we show that DNN learning with Cross Entropy (CE) exhibits overfitting to noisy labels on some classes (“easy” classes), but more surprisingly, it also suffers from significant under-learning on some other classes (“hard” classes). Intuitively, CE requires an extra term to facilitate learning of hard classes, and more importantly, this term should be noise tolerant, so as to avoid overfitting to noisy labels. Inspired by the symmetric KL-divergence, we propose the approach of Symmetric cross entropy Learning (SL), boosting CE symmetrically with a noise-robust counterpart, Reverse Cross Entropy (RCE). Our proposed SL approach simultaneously addresses both the under-learning and overfitting problems of CE in the presence of noisy labels. We provide a theoretical analysis of SL and also empirically show, on a range of benchmark and real-world datasets, that SL outperforms state-of-the-art methods. We also show that SL can be easily incorporated into existing methods in order to further enhance their performance. |
Tasks | |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.06112v1 |
https://arxiv.org/pdf/1908.06112v1.pdf | |
PWC | https://paperswithcode.com/paper/symmetric-cross-entropy-for-robust-learning |
Repo | https://github.com/xingjunm/dimensionality-driven-learning |
Framework | tf |
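Symmetric cross entropy combines the usual CE term with a Reverse Cross Entropy term in which the undefined log of zero one-hot entries is made finite by clamping. A PyTorch sketch — the weighting and clamp constants here are illustrative defaults; the paper tunes them per dataset:

```python
import torch
import torch.nn.functional as F

def symmetric_cross_entropy(logits, targets, alpha=0.1, beta=1.0):
    """SL = alpha * CE + beta * RCE (sketch; constants are illustrative)."""
    ce = F.cross_entropy(logits, targets)
    pred = F.softmax(logits, dim=1).clamp(1e-7, 1.0)
    # Reverse CE swaps prediction and label; log(0) on the zero one-hot
    # entries is made finite by clamping the labels from below.
    onehot = F.one_hot(targets, logits.size(1)).float().clamp(1e-4, 1.0)
    rce = -(pred * onehot.log()).sum(dim=1).mean()
    return alpha * ce + beta * rce
```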
Natural Image Noise Dataset
Title | Natural Image Noise Dataset |
Authors | Benoit Brummer, Christophe De Vleeschouwer |
Abstract | Convolutional neural networks have been the focus of research aiming to solve image denoising problems, but their performance remains unsatisfactory for most applications. These networks are trained with synthetic noise distributions that do not accurately reflect the noise captured by image sensors. Some datasets of clean-noisy image pairs have been introduced, but they are usually meant for benchmarking or specific applications. We introduce the Natural Image Noise Dataset (NIND), a dataset of DSLR-like images with varying levels of ISO noise that is large enough to train models for blind denoising over a wide range of noise levels. We demonstrate a denoising model trained with the NIND and show that it significantly outperforms BM3D on ISO noise from unseen images, even when generalizing to images from a different type of camera. The Natural Image Noise Dataset is published on Wikimedia Commons such that it remains open for curation and contributions. We expect that this dataset will prove useful for future image denoising applications. |
Tasks | Denoising, Image Denoising |
Published | 2019-06-01 |
URL | https://arxiv.org/abs/1906.00270v1 |
https://arxiv.org/pdf/1906.00270v1.pdf | |
PWC | https://paperswithcode.com/paper/190600270 |
Repo | https://github.com/MatusPilnan/nsiete-project |
Framework | tf |
Real-time Scene Text Detection with Differentiable Binarization
Title | Real-time Scene Text Detection with Differentiable Binarization |
Authors | Minghui Liao, Zhaoyi Wan, Cong Yao, Kai Chen, Xiang Bai |
Abstract | Recently, segmentation-based methods have become quite popular in scene text detection, as the segmentation results can more accurately describe scene text of various shapes, such as curved text. However, the post-processing step of binarization is essential for segmentation-based detection, converting probability maps produced by a segmentation method into bounding boxes/regions of text. In this paper, we propose a module named Differentiable Binarization (DB), which can perform the binarization process inside a segmentation network. Optimized along with a DB module, a segmentation network can adaptively set the thresholds for binarization, which not only simplifies the post-processing but also enhances the performance of text detection. Based on a simple segmentation network, we validate the performance improvements of DB on five benchmark datasets, consistently achieving state-of-the-art results in terms of both detection accuracy and speed. In particular, with a light-weight backbone, the performance improvements from DB are significant, enabling an ideal tradeoff between detection accuracy and efficiency. Specifically, with a ResNet-18 backbone, our detector achieves an F-measure of 82.8 while running at 62 FPS on the MSRA-TD500 dataset. Code is available at: https://github.com/MhLiao/DB |
Tasks | Scene Text Detection |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08947v2 |
https://arxiv.org/pdf/1911.08947v2.pdf | |
PWC | https://paperswithcode.com/paper/real-time-scene-text-detection-with |
Repo | https://github.com/xuannianz/DifferentiableBinarization |
Framework | tf |
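The DB module itself is a soft step function, $\hat{B} = 1/(1 + e^{-k(P - T)})$, applied between the predicted probability map $P$ and a learned threshold map $T$; the paper uses an amplifying factor of $k = 50$. A direct PyTorch sketch:

```python
import torch

def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Approximate binarization B = 1 / (1 + exp(-k * (P - T))).

    prob_map, thresh_map: (B, 1, H, W) maps from the segmentation head;
    k is the amplifying factor (the paper uses k = 50). Unlike a hard
    threshold, gradients flow through both maps during training.
    """
    return torch.sigmoid(k * (prob_map - thresh_map))
```

At inference time the hard threshold can be restored; the soft version is only needed so that the threshold map is trainable end-to-end.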
Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control
Title | Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control |
Authors | Sai Qian Zhang, Qi Zhang, Jieyu Lin |
Abstract | Multi-agent reinforcement learning (MARL) has recently received considerable attention due to its applicability to a wide range of real-world applications. However, achieving efficient communication among agents has always been an overarching problem in MARL. In this work, we propose Variance Based Control (VBC), a simple yet efficient technique to improve communication efficiency in MARL. By limiting the variance of the exchanged messages between agents during the training phase, the noisy component of the messages can be eliminated effectively, while the useful part can be preserved and utilized by the agents for better performance. Our evaluation using a challenging set of StarCraft II benchmarks indicates that our method achieves $2-10\times$ lower communication overhead than state-of-the-art MARL algorithms, while allowing agents to better collaborate by developing sophisticated strategies. |
Tasks | Multi-agent Reinforcement Learning, Starcraft, Starcraft II |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.02682v2 |
https://arxiv.org/pdf/1909.02682v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-communication-in-multi-agent |
Repo | https://github.com/saizhang0218/VBC |
Framework | pytorch |
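One reading of the abstract is a training-time penalty on the variance of exchanged messages, so that low-information (noisy) components are suppressed. A simplified sketch of such a penalty — the full method also gates communication at execution time, and lam is an assumed coefficient:

```python
import torch

def vbc_loss(task_loss: torch.Tensor, messages: torch.Tensor, lam: float = 1e-3):
    """Add a variance penalty on inter-agent messages (sketch).

    messages: (n_agents, msg_dim) batch of messages exchanged this step.
    Penalizing their variance pushes uninformative dimensions toward a
    constant, which can then be skipped at execution time.
    """
    variance_penalty = messages.var(dim=0).sum()
    return task_loss + lam * variance_penalty
```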
Evolving Robust Neural Architectures to Defend from Adversarial Attacks
Title | Evolving Robust Neural Architectures to Defend from Adversarial Attacks |
Authors | Danilo Vasconcellos Vargas, Shashank Kotyan |
Abstract | Deep neural networks are prone to misclassifying slightly modified input images. Recently, many defences have been proposed, but none have improved the robustness of neural networks consistently. Here, we propose to use adversarial attacks as a function evaluation to automatically search for neural architectures that can resist such attacks. Experiments on neural architecture search algorithms from the literature show that, although accurate, they are not able to find robust architectures. A major reason for this lies in their limited search space. By creating a novel neural architecture search in which dense layers can connect to convolutional layers and vice versa, and in which concatenation layers can be added, we were able to evolve an architecture that is inherently accurate on adversarial samples. Interestingly, this inherent robustness of the evolved architecture rivals state-of-the-art defences such as adversarial training, while being trained only on non-adversarial samples. Moreover, the evolved architecture makes use of some peculiar traits which might be useful for developing even more robust ones. Thus, the results here demonstrate that more robust architectures exist, and open up a new range of possibilities for the development and exploration of deep neural networks using automatic architecture search. Code available at http://bit.ly/RobustArchitectureSearch. |
Tasks | Neural Architecture Search |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11667v2 |
https://arxiv.org/pdf/1906.11667v2.pdf | |
PWC | https://paperswithcode.com/paper/evolving-robust-neural-architectures-to |
Repo | https://github.com/shashankkotyan/RobustArchitectureSearch |
Framework | tf |
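The search described in the abstract can be pictured as a standard evolutionary loop whose fitness is accuracy under adversarial attack. A generic Python sketch, with evaluate_adv_accuracy and mutate as hypothetical user-supplied callables rather than the paper's exact operators:

```python
import random

def evolve(population, evaluate_adv_accuracy, mutate, generations=50, k=4):
    """Evolutionary search with adversarial accuracy as fitness (sketch).

    population: list of architecture encodings; evaluate_adv_accuracy
    scores an encoding on adversarial samples; mutate perturbs one.
    """
    for _ in range(generations):
        scored = sorted(population, key=evaluate_adv_accuracy, reverse=True)
        parents = scored[:k]                   # keep the most robust architectures
        children = [mutate(random.choice(parents))
                    for _ in range(len(population) - k)]
        population = parents + children
    return max(population, key=evaluate_adv_accuracy)
```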
Defending Neural Backdoors via Generative Distribution Modeling
Title | Defending Neural Backdoors via Generative Distribution Modeling |
Authors | Ximing Qiao, Yukun Yang, Hai Li |
Abstract | Neural backdoor attacks are emerging as a severe security threat to deep learning, while the capability of existing defense methods is limited, especially for complex backdoor triggers. In this work, we explore the space formed by the pixel values of all possible backdoor triggers. An original trigger used by an attacker to build the backdoored model represents only a point in this space. It can then be generalized into a distribution of valid triggers, all of which can influence the backdoored model. Thus, previous methods that model only one point of the trigger distribution are not sufficient. Modeling the entire trigger distribution, e.g., via generative modeling, is key to effective defense. However, existing generative modeling techniques for image generation are not applicable to the backdoor scenario, as the trigger distribution is completely unknown. We propose the max-entropy staircase approximator (MESA), an algorithm for high-dimensional, sampling-free generative modeling, and use it to recover the trigger distribution. We also develop a defense technique to remove the triggers from the backdoored model. Our experiments on the CIFAR-10/100 datasets demonstrate the effectiveness of MESA in modeling the trigger distribution and the robustness of the proposed defense method. |
Tasks | Image Generation |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04749v2 |
https://arxiv.org/pdf/1910.04749v2.pdf | |
PWC | https://paperswithcode.com/paper/defending-neural-backdoors-via-generative |
Repo | https://github.com/superrrpotato/Defending-Neural-Backdoors-via-Generative-Distribution-Modeling |
Framework | pytorch |
On Recovering Latent Factors From Sampling And Firing Graph
Title | On Recovering Latent Factors From Sampling And Firing Graph |
Authors | Pierre Gouedard |
Abstract | Consider a set of latent factors whose observable effect of activation is captured on a measure space that appears as a grid of bits taking values in $\{0, 1\}$. This paper intends to deliver a theoretical and practical answer to the question: given a perfect indicator of the activation of latent factors that labels a finite dataset of the grid’s activity, can we devise a procedure to build a generic identifier of factor activations? |
Tasks | |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09493v1 |
https://arxiv.org/pdf/1909.09493v1.pdf | |
PWC | https://paperswithcode.com/paper/on-recovering-latent-factors-from-sampling |
Repo | https://github.com/pierreGouedard/deyep |
Framework | none |
Active embedding search via noisy paired comparisons
Title | Active embedding search via noisy paired comparisons |
Authors | Gregory H. Canal, Andrew K. Massimino, Mark A. Davenport, Christopher J. Rozell |
Abstract | Suppose that we wish to estimate a user’s preference vector $w$ from paired comparisons of the form “does user $w$ prefer item $p$ or item $q$?,” where both the user and items are embedded in a low-dimensional Euclidean space with distances that reflect user and item similarities. Such observations arise in numerous settings, including psychometrics and psychology experiments, search tasks, advertising, and recommender systems. In such tasks, queries can be extremely costly and subject to varying levels of response noise; thus, we aim to actively choose pairs that are most informative given the results of previous comparisons. We provide new theoretical insights into the benefits and challenges of greedy information maximization in this setting, and develop two novel strategies that maximize lower bounds on information gain and are simpler to analyze and compute, respectively. We use simulated responses from a real-world dataset to validate our strategies: they perform similarly to greedy information maximization while estimating preferences better than state-of-the-art selection methods and random queries. |
Tasks | Recommendation Systems |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04363v2 |
https://arxiv.org/pdf/1905.04363v2.pdf | |
PWC | https://paperswithcode.com/paper/active-embedding-search-via-noisy-paired |
Repo | https://github.com/siplab-gt/pairsearch |
Framework | none |
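A cheap stand-in for the information-gain criteria in the abstract: given posterior samples of $w$, query the pair whose predicted response is closest to a coin flip, since that response carries the most entropy. A sketch of this approximation (not the paper's lower-bound strategies):

```python
import numpy as np

def select_query(samples, pairs):
    """Pick the candidate pair whose outcome is most uncertain (sketch).

    samples: (n, d) posterior samples of the user vector w; a pair
    (p, q) is answered 'p' when w lies closer to p than to q.
    """
    def prob_prefers_p(p, q):
        closer = (np.linalg.norm(samples - p, axis=1)
                  < np.linalg.norm(samples - q, axis=1))
        return closer.mean()
    # Response entropy peaks when the preference probability is near 0.5.
    return min(pairs, key=lambda pq: abs(prob_prefers_p(*pq) - 0.5))
```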
Testing Conditional Independence in Supervised Learning Algorithms
Title | Testing Conditional Independence in Supervised Learning Algorithms |
Authors | David S. Watson, Marvin N. Wright |
Abstract | We propose the conditional predictive impact (CPI), a consistent and unbiased estimator of the association between one or several features and a given outcome, conditional on a reduced feature set. Building on the knockoff framework of Candès et al. (2018), we develop a novel testing procedure that works in conjunction with any valid knockoff sampler, supervised learning algorithm, and loss function. The CPI can be efficiently computed for high-dimensional data without any sparsity constraints. We demonstrate convergence criteria for the CPI and develop statistical inference procedures for evaluating its magnitude, significance, and precision. These tests aid in feature and model selection, extending traditional frequentist and Bayesian techniques to general supervised learning tasks. The CPI may also be applied in causal discovery to identify underlying multivariate graph structures. We test our method using various algorithms, including linear regression, neural networks, random forests, and support vector machines. Empirical results show that the CPI compares favorably to alternative variable importance measures and other nonparametric tests of conditional independence on a diverse array of real and simulated datasets. Simulations confirm that our inference procedures successfully control Type I error and achieve nominal coverage probability. Our method has been implemented in an R package, cpi, which can be downloaded from https://github.com/dswatson/cpi. |
Tasks | Causal Discovery, Model Selection |
Published | 2019-01-28 |
URL | https://arxiv.org/abs/1901.09917v4 |
https://arxiv.org/pdf/1901.09917v4.pdf | |
PWC | https://paperswithcode.com/paper/testing-conditional-predictive-independence |
Repo | https://github.com/dswatson/cpi |
Framework | none |
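The CPI test compares the model's per-sample loss when a feature is replaced by its knockoff against the original loss; a one-sided paired test then assesses whether the feature carries conditional predictive information. A Python sketch assuming a scikit-learn-style fitted model, a per-sample loss function, and a precomputed knockoff matrix (the R package cpi is the reference implementation):

```python
import numpy as np
from scipy import stats

def conditional_predictive_impact(model, loss_fn, X, X_knockoff, y, j):
    """CPI for feature j: loss with a knockoff of X[:, j] minus original loss.

    model is already fit; loss_fn(y, y_hat) returns per-sample losses;
    X_knockoff is a valid knockoff matrix (assumed given).
    Returns the CPI estimate and a one-sided paired t-test p-value.
    """
    loss_orig = loss_fn(y, model.predict(X))
    X_tilde = X.copy()
    X_tilde[:, j] = X_knockoff[:, j]           # swap in the knockoff column
    loss_ko = loss_fn(y, model.predict(X_tilde))
    delta = loss_ko - loss_orig                # per-sample loss differences
    t, p = stats.ttest_1samp(delta, 0.0, alternative="greater")
    return delta.mean(), p
```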
Episodic Memory in Lifelong Language Learning
Title | Episodic Memory in Lifelong Language Learning |
Authors | Cyprien de Masson d’Autume, Sebastian Ruder, Lingpeng Kong, Dani Yogatama |
Abstract | We introduce a lifelong language learning setup where a model needs to learn from a stream of text examples without any dataset identifier. We propose an episodic memory model that performs sparse experience replay and local adaptation to mitigate catastrophic forgetting in this setup. Experiments on text classification and question answering demonstrate the complementary benefits of sparse experience replay and local adaptation to allow the model to continuously learn from new datasets. We also show that the space complexity of the episodic memory module can be reduced significantly (~50-90%) by randomly choosing which examples to store in memory with a minimal decrease in performance. We consider an episodic memory component as a crucial building block of general linguistic intelligence and see our model as a first step in that direction. |
Tasks | Continual Learning, Question Answering, Text Classification |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.01076v3 |
https://arxiv.org/pdf/1906.01076v3.pdf | |
PWC | https://paperswithcode.com/paper/episodic-memory-in-lifelong-language-learning |
Repo | https://github.com/h3lio5/episodic-lifelong-learning |
Framework | pytorch |
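The memory module described in the abstract has two levers: random sub-sampling of which examples get written (the source of the ~50-90% space saving) and sparse replay at a fixed cadence. A minimal sketch with illustrative rates:

```python
import random

class EpisodicMemory:
    """Store a random subset of seen examples; replay sparsely (sketch)."""
    def __init__(self, write_prob=0.5, replay_every=10_000, replay_batch=100):
        self.buffer = []
        self.write_prob = write_prob          # fraction of examples written
        self.replay_every = replay_every      # steps between replay rounds
        self.replay_batch = replay_batch
        self.seen = 0

    def write(self, example):
        self.seen += 1
        if random.random() < self.write_prob:  # random sub-sampling of writes
            self.buffer.append(example)

    def maybe_replay(self):
        """Return a batch of stored examples on a sparse schedule, else []."""
        if self.seen % self.replay_every == 0 and self.buffer:
            return random.sample(self.buffer,
                                 min(self.replay_batch, len(self.buffer)))
        return []
```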