April 3, 2020

3144 words 15 mins read

Paper Group AWR 14

Learning Enriched Features for Real Image Restoration and Enhancement. Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation. Coping With Simulators That Don’t Always Return. Scalable End-to-end Recurrent Neural Network for Variable star classification. Infrared and 3D skeleton feature fusion for RGB-D action recognition …

Learning Enriched Features for Real Image Restoration and Enhancement

Title Learning Enriched Features for Real Image Restoration and Enhancement
Authors Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, Ling Shao
Abstract With the goal of recovering high-quality image content from its degraded version, image restoration enjoys numerous applications, such as in surveillance, computational photography, medical imaging, and remote sensing. Recently, convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks. Existing CNN-based methods typically operate either on full-resolution or on progressively low-resolution representations. In the former case, spatially precise but contextually less robust results are achieved, while in the latter case, semantically reliable but spatially less accurate outputs are generated. In this paper, we present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network, and receiving strong contextual information from the low-resolution representations. The core of our approach is a multi-scale residual block containing several key elements: (a) parallel multi-resolution convolution streams for extracting multi-scale features, (b) information exchange across the multi-resolution streams, (c) spatial and channel attention mechanisms for capturing contextual information, and (d) attention-based multi-scale feature aggregation. In a nutshell, our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details. Extensive experiments on five real image benchmark datasets demonstrate that our method, named MIRNet, achieves state-of-the-art results for a variety of image processing tasks, including image denoising, super-resolution, and image enhancement.
Tasks Denoising, Image Denoising, Image Enhancement, Image Restoration, Super-Resolution
Published 2020-03-15
URL https://arxiv.org/abs/2003.06792v1
PDF https://arxiv.org/pdf/2003.06792v1.pdf
PWC https://paperswithcode.com/paper/learning-enriched-features-for-real-image
Repo https://github.com/swz30/MIRNet
Framework none
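
The core block combines channel and spatial attention over convolutional features. Below is a minimal sketch of that dual-attention flavor; the module names and sizes are illustrative assumptions, and the authors' actual multi-scale residual block (with parallel resolution streams and cross-stream exchange) lives in the repo above.

```python
# Illustrative sketch only -- not the authors' MIRNet code (see the repo above).
# Shows the flavor of combined channel + spatial attention over conv features.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style gating: reweight channels by global context."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # global spatial context
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.fc(self.pool(x))

class SpatialAttention(nn.Module):
    """Gate each spatial position using pooled channel statistics."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)              # (B,1,H,W)
        mx, _ = x.max(dim=1, keepdim=True)             # (B,1,H,W)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class DualAttentionBlock(nn.Module):
    """Residual block applying both attentions, as the abstract sketches."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1))
        self.ca, self.sa = ChannelAttention(channels), SpatialAttention()

    def forward(self, x):
        y = self.body(x)
        return x + self.sa(self.ca(y))                 # residual connection

x = torch.randn(1, 64, 32, 32)
print(DualAttentionBlock()(x).shape)                   # torch.Size([1, 64, 32, 32])
```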

Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation

Title Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation
Authors Liang Liu, Jiangning Zhang, Ruifei He, Yong Liu, Yabiao Wang, Ying Tai, Donghao Luo, Chengjie Wang, Jilin Li, Feiyue Huang
Abstract Unsupervised learning of optical flow, which leverages the supervision from view synthesis, has emerged as a promising alternative to supervised methods. However, the objective of unsupervised learning is likely to be unreliable in challenging scenes. In this work, we present a framework that uses more reliable supervision from transformations. It augments the general unsupervised learning pipeline with a second forward pass on transformed data from augmentation, using the correspondingly transformed predictions on the original data as the self-supervision signal. In addition, we introduce a lightweight network that handles multiple frames through a highly shared flow decoder. Our method achieves a consistent leap in performance on several benchmarks, with the best accuracy among deep unsupervised methods, and delivers results competitive with recent fully supervised methods while using far fewer parameters.
Tasks Optical Flow Estimation
Published 2020-03-29
URL https://arxiv.org/abs/2003.13045v1
PDF https://arxiv.org/pdf/2003.13045v1.pdf
PWC https://paperswithcode.com/paper/learning-by-analogy-reliable-supervision-from
Repo https://github.com/lliuz/ARFlow
Framework pytorch
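
The supervision trick is compact enough to sketch: predict flow on the original pair, transform that prediction, and use it as the target for a second pass on transformed inputs. The sketch below assumes a horizontal flip as the transformation and a generic flow_net; ARFlow's real pipeline (occlusion handling, combined appearance and spatial transforms) is richer.

```python
# Hedged sketch of transformation-based self-supervision for unsupervised flow.
# `flow_net` and the exact transform handling are placeholders, not ARFlow itself.
import torch
import torch.nn.functional as F

def self_supervised_loss(flow_net, img1, img2):
    # First pass: predict flow on the original pair (treated as the teacher).
    with torch.no_grad():
        flow_orig = flow_net(img1, img2)              # (B,2,H,W)

    # A simple spatial transform: horizontal flip. Flipping the images flips
    # the flow field spatially AND negates its x-component.
    t_img1, t_img2 = torch.flip(img1, [-1]), torch.flip(img2, [-1])
    t_flow_teacher = torch.flip(flow_orig, [-1])
    t_flow_teacher[:, 0] = -t_flow_teacher[:, 0]

    # Second pass: the student prediction on transformed inputs should match
    # the transformed teacher prediction.
    flow_student = flow_net(t_img1, t_img2)
    return F.l1_loss(flow_student, t_flow_teacher)

# Usage with any network mapping two images to a 2-channel flow field:
# loss = self_supervised_loss(flow_net, img1, img2); loss.backward()
```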

Coping With Simulators That Don’t Always Return

Title Coping With Simulators That Don’t Always Return
Authors Andrew Warrington, Saeid Naderiparizi, Frank Wood
Abstract Deterministic models are approximations of reality that are easy to interpret and often easier to build than stochastic alternatives. Unfortunately, as nature is capricious, observational data can never be fully explained by deterministic models in practice. Observation and process noise need to be added to adapt deterministic models to behave stochastically, such that they are capable of explaining and extrapolating from noisy data. We investigate and address computational inefficiencies that arise from adding process noise to deterministic simulators that fail to return for certain inputs, a property we describe as “brittle.” We show how to train a conditional normalizing flow to propose perturbations such that the simulator succeeds with high probability, increasing computational efficiency.
Tasks
Published 2020-03-28
URL https://arxiv.org/abs/2003.12908v1
PDF https://arxiv.org/pdf/2003.12908v1.pdf
PWC https://paperswithcode.com/paper/coping-with-simulators-that-don-t-always
Repo https://github.com/plai-group/stdr
Framework pytorch
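
One reading of the approach in miniature: rather than drawing process noise from a fixed prior and wasting simulator calls on inputs that fail, fit a proposal to perturbations that previously succeeded. In this toy sketch a Gaussian proposal stands in for the paper's conditional normalizing flow, and the brittle simulator is invented purely for illustration.

```python
# Toy sketch: learn a proposal over perturbations so that a "brittle" simulator
# (one that can fail to return) succeeds with high probability. A Gaussian
# proposal stands in for the paper's conditional normalizing flow.
import numpy as np

def brittle_simulator(state, noise):
    # Fails (returns None) when the perturbed state leaves a valid region.
    new_state = state + noise
    return new_state if np.all(np.abs(new_state) < 2.0) else None

rng = np.random.default_rng(0)
state = np.array([1.8])           # near the failure boundary
mu, sigma = 0.0, 1.0              # proposal parameters to adapt

for step in range(200):
    accepted = []
    for _ in range(64):
        noise = rng.normal(mu, sigma, size=state.shape)
        if brittle_simulator(state, noise) is not None:
            accepted.append(noise)
    # Fit the proposal to perturbations that succeeded (a crude moment match;
    # the paper instead trains a normalizing flow on such samples).
    if accepted:
        acc = np.stack(accepted)
        mu, sigma = float(acc.mean()), max(float(acc.std()), 0.1)

print(f"adapted proposal: mu={mu:.2f}, sigma={sigma:.2f}")
# The acceptance rate under the adapted proposal is far higher than under N(0,1).
```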

Scalable End-to-end Recurrent Neural Network for Variable star classification

Title Scalable End-to-end Recurrent Neural Network for Variable star classification
Authors Ignacio Becker, Karim Pichara, Márcio Catelan, Pavlos Protopapas, Carlos Aguirre, Fatemeh Nikzat
Abstract During the last decade, considerable effort has been made to perform automatic classification of variable stars using machine learning techniques. Traditionally, light curves are represented as a vector of descriptors or features used as input for many algorithms. Some features are computationally expensive and cannot be updated quickly, and hence cannot be applied to large datasets such as the LSST. Previous work has been done to develop alternative unsupervised feature extraction algorithms for light curves, but the cost of doing so still remains high. In this work, we propose an end-to-end algorithm that automatically learns the representation of light curves that allows accurate automatic classification. We study a series of deep learning architectures based on Recurrent Neural Networks and test them in automated classification scenarios. Our method uses minimal data preprocessing, can be updated with a low computational cost for new observations and light curves, and can scale up to massive datasets. We transform each light curve into an input matrix representation whose elements are the differences in time and magnitude, and the outputs are classification probabilities. We test our method on three surveys: OGLE-III, Gaia, and WISE. We obtain accuracies of about 95% in the main classes and 75% in the majority of subclasses. We compare our results with the Random Forest classifier and obtain competitive accuracies while being faster and scalable. The analysis shows that the computational complexity of our approach grows linearly with the light curve size, while the cost of the traditional approach grows as $N\log N$.
Tasks Classification Of Variable Stars
Published 2020-02-03
URL https://arxiv.org/abs/2002.00994v1
PDF https://arxiv.org/pdf/2002.00994v1.pdf
PWC https://paperswithcode.com/paper/scalable-end-to-end-recurrent-neural-network
Repo https://github.com/iebecker/Scalable_RNN
Framework tf
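
The input representation is straightforward to reproduce: each light curve becomes a sequence of (Δt, Δm) differences fed to a recurrent classifier, which is also why the per-curve cost is linear in its length. A minimal sketch, with the GRU choice and layer sizes as assumptions:

```python
# Minimal sketch of the paper's input representation and a recurrent classifier:
# each light curve becomes a sequence of (delta_time, delta_magnitude) pairs.
# Layer sizes and the GRU choice are illustrative assumptions.
import torch
import torch.nn as nn

def light_curve_to_deltas(times, mags):
    """Stack consecutive differences into a (T-1, 2) input matrix."""
    return torch.stack([times.diff(), mags.diff()], dim=-1)

class LightCurveRNN(nn.Module):
    def __init__(self, hidden=64, n_classes=8):
        super().__init__()
        self.rnn = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                  # x: (B, T, 2) delta matrix
        _, h = self.rnn(x)                 # final hidden state summarizes the curve
        return self.head(h[-1])            # class logits

times = torch.cumsum(torch.rand(100), 0)   # irregular observation times
mags = 15 + 0.3 * torch.sin(times)         # fake periodic light curve
x = light_curve_to_deltas(times, mags).unsqueeze(0)
print(LightCurveRNN()(x).shape)            # torch.Size([1, 8])
```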

Infrared and 3D skeleton feature fusion for RGB-D action recognition

Title Infrared and 3D skeleton feature fusion for RGB-D action recognition
Authors Alban Main de Boissiere, Rita Noumeir
Abstract A challenge of skeleton-based action recognition is the difficulty of classifying actions with similar motions and object-related actions. Visual cues from other streams help in that regard. RGB data are sensitive to illumination conditions and thus unusable in the dark. To alleviate this issue and still benefit from a visual stream, we propose a modular network (FUSION) combining skeleton and infrared data. A 2D convolutional neural network (CNN) is used as a pose module to extract features from skeleton data. A 3D CNN is used as an infrared module to extract visual cues from videos. Both feature vectors are then concatenated and exploited conjointly using a multilayer perceptron (MLP). Skeleton data also condition the infrared videos, providing a crop around the performing subjects and thus virtually focusing the attention of the infrared module. Ablation studies show that using networks pre-trained on other large-scale datasets as our modules, together with data augmentation, yields considerable improvements in action classification accuracy. The strong contribution of our cropping strategy is also demonstrated. We evaluate our method on the NTU RGB+D dataset, the largest dataset for human action recognition from depth cameras, and report state-of-the-art performance.
Tasks Action Classification, Action Recognition In Videos, Data Augmentation, Skeleton Based Action Recognition, Temporal Action Localization
Published 2020-02-28
URL https://arxiv.org/abs/2002.12886v1
PDF https://arxiv.org/pdf/2002.12886v1.pdf
PWC https://paperswithcode.com/paper/infrared-and-3d-skeleton-feature-fusion-for
Repo https://github.com/adeboissiere/FUSION-human-action-recognition
Framework pytorch
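
The fusion itself is a late concatenation of the two modules' feature vectors followed by an MLP. A sketch with placeholder encoders; the real model uses a 2D CNN on skeletons and a 3D CNN on skeleton-cropped infrared clips, and all sizes here are assumptions.

```python
# Sketch of FUSION-style late fusion. The pose and infrared encoders are
# placeholders standing in for the 2D CNN and 3D CNN modules.
import torch
import torch.nn as nn

class LateFusion(nn.Module):
    def __init__(self, pose_dim=256, ir_dim=512, n_classes=60):
        super().__init__()
        self.pose_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(pose_dim))
        self.ir_encoder = nn.Sequential(nn.Flatten(), nn.LazyLinear(ir_dim))
        self.mlp = nn.Sequential(                     # joint classifier head
            nn.Linear(pose_dim + ir_dim, 256), nn.ReLU(),
            nn.Linear(256, n_classes))

    def forward(self, skeleton, ir_clip):
        f = torch.cat([self.pose_encoder(skeleton),
                       self.ir_encoder(ir_clip)], dim=1)
        return self.mlp(f)

skeleton = torch.randn(2, 25, 3, 32)        # joints x coords x frames (illustrative)
ir_clip = torch.randn(2, 1, 16, 112, 112)   # cropped infrared clip (illustrative)
print(LateFusion()(skeleton, ir_clip).shape)   # torch.Size([2, 60])
```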

Get Rid of Suspended Animation Problem: Deep Diffusive Neural Network on Graph Semi-Supervised Classification

Title Get Rid of Suspended Animation Problem: Deep Diffusive Neural Network on Graph Semi-Supervised Classification
Authors Jiawei Zhang
Abstract Existing graph neural networks may suffer from the “suspended animation problem” when the model architecture goes deep. Meanwhile, for some graph learning scenarios, e.g., nodes with text/image attributes or graphs with long-distance node correlations, deep graph neural networks will be necessary for effective graph representation learning. In this paper, we propose a new graph neural network, namely DIFNET (Graph Diffusive Neural Network), for graph representation learning and node classification. DIFNET utilizes both neural gates and graph residual learning for node hidden state modeling, and includes an attention mechanism for node neighborhood information diffusion. Extensive experiments are conducted to compare DIFNET against several state-of-the-art graph neural network models. The experimental results illustrate both the learning performance advantages and effectiveness of DIFNET, especially in addressing the “suspended animation problem”.
Tasks Graph Representation Learning, Node Classification, Representation Learning
Published 2020-01-22
URL https://arxiv.org/abs/2001.07922v1
PDF https://arxiv.org/pdf/2001.07922v1.pdf
PWC https://paperswithcode.com/paper/get-rid-of-suspended-animation-problem-deep
Repo https://github.com/anonymous-sourcecode/DifNN
Framework pytorch
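
The combination of neural gates and graph residual learning can be sketched as a GRU-gated node update plus a residual connection. The aggregation below is a plain neighbor mean standing in for the model's attention-based diffusion; this is not the DIFNET code.

```python
# Rough sketch of a gated graph layer with a residual connection, in the spirit
# of "neural gates + graph residual learning". Simplified, not DIFNET itself.
import torch
import torch.nn as nn

class GatedResidualGraphLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.cell = nn.GRUCell(dim, dim)   # neural gate over diffused neighbor info

    def forward(self, h, adj):
        # Mean-aggregate neighbor states (attention-based in the real model).
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neighbor_msg = (adj @ h) / deg
        # Gated update plus a residual to the input -- the combination the
        # abstract credits with avoiding "suspended animation" in deep stacks.
        return h + self.cell(neighbor_msg, h)

h = torch.randn(5, 16)                     # 5 nodes, 16-dim hidden states
adj = (torch.rand(5, 5) > 0.5).float()
layer = GatedResidualGraphLayer(16)
for _ in range(10):                        # stacking deep without state collapse
    h = layer(h, adj)
print(h.shape)                             # torch.Size([5, 16])
```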

Semi-supervised Learning for Few-shot Image-to-Image Translation

Title Semi-supervised Learning for Few-shot Image-to-Image Translation
Authors Yaxing Wang, Salman Khan, Abel Gonzalez-Garcia, Joost van de Weijer, Fahad Shahbaz Khan
Abstract In the last few years, unpaired image-to-image translation has witnessed remarkable progress. Although the latest methods are able to generate realistic images, they crucially rely on a large number of labeled images. Recently, some methods have tackled the challenging setting of few-shot image-to-image translation, reducing the labeled data requirements for the target domain during inference. In this work, we go one step further and also reduce the amount of labeled data required from the source domain during training. To do so, we propose applying semi-supervised learning via a noise-tolerant pseudo-labeling procedure. We also apply a cycle consistency constraint to further exploit the information from unlabeled images, either from the same dataset or an external one. Additionally, we propose several structural modifications to facilitate the image translation task under these circumstances. Our semi-supervised method for few-shot image translation, called SEMIT, achieves excellent results on four different datasets using as little as 10% of the source labels, and matches the performance of the main fully-supervised competitor using only 20% of the labeled data. Our code and models are made public at: https://github.com/yaxingwang/SEMIT.
Tasks Image-to-Image Translation
Published 2020-03-30
URL https://arxiv.org/abs/2003.13853v1
PDF https://arxiv.org/pdf/2003.13853v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-learning-for-few-shot-image
Repo https://github.com/yaxingwang/SEMIT
Framework none
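
The semi-supervised ingredient can be sketched as confidence-filtered pseudo-labeling. The threshold and classifier below are assumptions; SEMIT's actual procedure is noise-tolerant in a more principled way (see the repo).

```python
# Sketch of confidence-filtered pseudo-labeling, the semi-supervised ingredient
# described above. Threshold and classifier are illustrative assumptions.
import torch

@torch.no_grad()
def pseudo_label(classifier, unlabeled_images, threshold=0.9):
    probs = classifier(unlabeled_images).softmax(dim=1)
    conf, labels = probs.max(dim=1)
    keep = conf > threshold                 # keep only high-confidence predictions
    return unlabeled_images[keep], labels[keep]

# Usage: grow the small labeled source set each epoch, then train the
# translation model on labeled + pseudo-labeled images together.
# imgs, labels = pseudo_label(source_classifier, unlabeled_batch)
```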

A Differentiable Color Filter for Generating Unrestricted Adversarial Images

Title A Differentiable Color Filter for Generating Unrestricted Adversarial Images
Authors Zhengyu Zhao, Zhuoran Liu, Martha Larson
Abstract We propose Adversarial Color Filtering (AdvCF), an approach that uses a differentiable color filter to create adversarial images. The color filter allows us to introduce large perturbations into images, while still maintaining or enhancing their photographic quality and appeal. AdvCF is motivated by properties that are necessary if adversarial images are to be used to protect the content of images shared online from unethical machine learning classifiers: First, perturbations must be imperceptible and adversarial images must look realistic to the human eye. Second, adversarial impact must be maintained in the face of classifiers unknown when the perturbations are generated (transferability). The paper presents evidence that AdvCF has these two properties, and also points out that AdvCF has the potential for further improvement if image semantics are taken into account.
Tasks
Published 2020-02-03
URL https://arxiv.org/abs/2002.01008v1
PDF https://arxiv.org/pdf/2002.01008v1.pdf
PWC https://paperswithcode.com/paper/a-differentiable-color-filter-for-generating
Repo https://github.com/ZhengyuZhao/AdvCF
Framework pytorch
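
A simple per-channel affine map stands in here for AdvCF's differentiable color filter: the filter parameters, not the pixels, are optimized to induce misclassification, which is what makes the perturbation large yet photographically plausible. The parameter form and optimizer settings are illustrative assumptions.

```python
# Sketch of a differentiable color transform optimized adversarially. A
# per-channel affine map stands in for AdvCF's actual filter.
import torch
import torch.nn.functional as F

def adversarial_color_filter(model, image, true_label, steps=50, lr=0.05):
    # Per-channel scale and shift -- a smooth, image-wide ("unrestricted")
    # perturbation rather than a small pixel-wise one.
    scale = torch.ones(1, 3, 1, 1, requires_grad=True)
    shift = torch.zeros(1, 3, 1, 1, requires_grad=True)
    opt = torch.optim.Adam([scale, shift], lr=lr)
    for _ in range(steps):
        filtered = (image * scale + shift).clamp(0, 1)
        # Maximize the loss on the true label to push the prediction away.
        loss = -F.cross_entropy(model(filtered), true_label)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (image * scale + shift).clamp(0, 1).detach()

# Usage: adv = adversarial_color_filter(classifier, img_batch, labels)
```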

Multi-level Context Gating of Embedded Collective Knowledge for Medical Image Segmentation

Title Multi-level Context Gating of Embedded Collective Knowledge for Medical Image Segmentation
Authors Maryam Asadi-Aghbolaghi, Reza Azad, Mahmood Fathy, Sergio Escalera
Abstract Medical image segmentation has been very challenging due to the large variation of anatomy across different cases. Recent advances in deep learning frameworks have exhibited faster and more accurate performance in image segmentation. Among existing networks, U-Net has been successfully applied to medical image segmentation. In this paper, we propose an extension of U-Net for medical image segmentation, in which we take full advantage of U-Net, the Squeeze-and-Excitation (SE) block, bi-directional ConvLSTM (BConvLSTM), and the mechanism of dense convolutions. (I) We improve the segmentation performance by utilizing SE modules within the U-Net, with a minor effect on model complexity. These blocks adaptively recalibrate the channel-wise feature responses by utilizing a self-gating mechanism over the global information embedding of the feature maps. (II) To strengthen feature propagation and encourage feature reuse, we use densely connected convolutions in the last convolutional layer of the encoding path. (III) Instead of a simple concatenation in the skip connection of U-Net, we employ BConvLSTM at all levels of the network to combine, in a non-linear way, the feature maps extracted from the corresponding encoding path and the previous decoding up-convolutional layer. The proposed model is evaluated on six datasets: DRIVE, ISIC 2017 and 2018, lung segmentation, $PH^2$, and cell nuclei segmentation, achieving state-of-the-art performance.
Tasks Lesion Segmentation, Medical Image Segmentation, Semantic Segmentation
Published 2020-03-10
URL https://arxiv.org/abs/2003.05056v1
PDF https://arxiv.org/pdf/2003.05056v1.pdf
PWC https://paperswithcode.com/paper/multi-level-context-gating-of-embedded
Repo https://github.com/rezazad68/BCDU-Net
Framework tf
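
Point (II) is easy to illustrate: in a densely connected block, each layer consumes the concatenation of all previous feature maps. The layer count and widths below are assumptions, not the paper's exact configuration.

```python
# Sketch of the densely connected convolution idea used at the end of the
# encoder: every layer receives the concatenation of all prior feature maps.
import torch
import torch.nn as nn

class DenseConvBlock(nn.Module):
    def __init__(self, in_ch=64, growth=32, layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(ch, growth, 3, padding=1), nn.ReLU()))
            ch += growth                      # next layer sees all prior features

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)        # feature reuse via concatenation

x = torch.randn(1, 64, 32, 32)
print(DenseConvBlock()(x).shape)              # torch.Size([1, 160, 32, 32])
```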

A Metric Learning Reality Check

Title A Metric Learning Reality Check
Authors Kevin Musgrave, Serge Belongie, Ser-Nam Lim
Abstract Deep metric learning papers from the past four years have consistently claimed great advances in accuracy, often more than doubling the performance of decade-old methods. In this paper, we take a closer look at the field to see if this is actually true. We find flaws in the experimental setup of these papers, and propose a new way to evaluate metric learning algorithms. Finally, we present experimental results that show that the improvements over time have been marginal at best.
Tasks Metric Learning
Published 2020-03-18
URL https://arxiv.org/abs/2003.08505v1
PDF https://arxiv.org/pdf/2003.08505v1.pdf
PWC https://paperswithcode.com/paper/a-metric-learning-reality-check
Repo https://github.com/KevinMusgrave/powerful-benchmarker
Framework pytorch

Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations

Title Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations
Authors Florian Tramèr, Jens Behrmann, Nicholas Carlini, Nicolas Papernot, Jörn-Henrik Jacobsen
Abstract Adversarial examples are malicious inputs crafted to induce misclassification. Commonly studied sensitivity-based adversarial examples introduce semantically-small changes to an input that result in a different model prediction. This paper studies a complementary failure mode, invariance-based adversarial examples, that introduce minimal semantic changes that modify an input’s true label yet preserve the model’s prediction. We demonstrate fundamental tradeoffs between these two types of adversarial examples. We show that defenses against sensitivity-based attacks actively harm a model’s accuracy on invariance-based attacks, and that new approaches are needed to resist both attack types. In particular, we break state-of-the-art adversarially-trained and certifiably-robust models by generating small perturbations that the models are (provably) robust to, yet that change an input’s class according to human labelers. Finally, we formally show that the existence of excessively invariant classifiers arises from the presence of overly-robust predictive features in standard datasets.
Tasks
Published 2020-02-11
URL https://arxiv.org/abs/2002.04599v1
PDF https://arxiv.org/pdf/2002.04599v1.pdf
PWC https://paperswithcode.com/paper/fundamental-tradeoffs-between-invariance-and
Repo https://github.com/ftramer/Excessive-Invariance
Framework tf

Hippocampus Segmentation on Epilepsy and Alzheimer’s Disease Studies with Multiple Convolutional Neural Networks

Title Hippocampus Segmentation on Epilepsy and Alzheimer’s Disease Studies with Multiple Convolutional Neural Networks
Authors Diedre Carmo, Bruna Silva, Clarissa Yasuda, Letícia Rittner, Roberto Lotufo
Abstract Hippocampus segmentation on magnetic resonance imaging (MRI) is of key importance for the diagnosis, treatment decisions, and investigation of neuropsychiatric disorders. Automatic segmentation is a very active research field, with many recent models involving deep learning for this task. However, deep learning requires a training phase, which can introduce bias from the specific domain of the training dataset. Current state-of-the-art methods are trained on scans of healthy subjects or Alzheimer’s disease patients from public datasets. This raises the question of whether these methods are capable of recognizing the hippocampus in a very different domain. In this paper we present a state-of-the-art, open source, ready-to-use hippocampus segmentation methodology, using deep learning. We analyze this methodology alongside other recent deep learning methods in two domains: the public HarP benchmark and an in-house dataset of epilepsy patients. Our internal dataset differs significantly from scans of Alzheimer’s and healthy subjects; some scans come from patients who have undergone hippocampal resection as surgical treatment for epilepsy. We show that our method surpasses others from the literature on both the Alzheimer’s and epilepsy test datasets.
Tasks
Published 2020-01-14
URL https://arxiv.org/abs/2001.05058v1
PDF https://arxiv.org/pdf/2001.05058v1.pdf
PWC https://paperswithcode.com/paper/hippocampus-segmentation-on-epilepsy-and
Repo https://github.com/dscarmo/e2dhipseg
Framework pytorch

FGN: Fusion Glyph Network for Chinese Named Entity Recognition

Title FGN: Fusion Glyph Network for Chinese Named Entity Recognition
Authors Zhenyu Xuan, Rui Bao, Chuyu Ma, Shengyi Jiang
Abstract Chinese NER is a challenging task. As pictographs, Chinese characters contain latent glyph information, which is often overlooked. In this paper, we propose FGN, the Fusion Glyph Network for Chinese NER. Beyond adding glyph information, this method also captures extra interactive information through its fusion mechanism. The major innovations of FGN include: (1) a novel CNN structure called CGS-CNN, proposed to capture both glyph information and interactive information between the glyphs of neighboring characters; and (2) a method with a sliding window and Slice-Attention to fuse the BERT representation and glyph representation of a character, which may capture potential interactive knowledge between context and glyph. Experiments are conducted on four NER datasets, showing that FGN with an LSTM-CRF tagger achieves new state-of-the-art performance for Chinese NER. Further experiments are conducted to investigate the influence of various components and settings in FGN.
Tasks Chinese Named Entity Recognition, Named Entity Recognition, Representation Learning
Published 2020-01-15
URL https://arxiv.org/abs/2001.05272v3
PDF https://arxiv.org/pdf/2001.05272v3.pdf
PWC https://paperswithcode.com/paper/fgn-fusion-glyph-network-for-chinese-named
Repo https://github.com/AidenHuen/FGN-NER
Framework none
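
A loose sketch of the slice-wise fusion idea: cut a character's BERT vector and glyph vector into aligned slices and gate each fused slice with an attention weight. The dimensions and the gating form here are guesses; FGN's sliding-window Slice-Attention is more involved (see the repo).

```python
# Loose sketch of slice-wise fusion of a BERT vector and a glyph vector for one
# character. Dimensions and the gating form are assumptions, not FGN's code.
import torch
import torch.nn as nn

class SliceFusion(nn.Module):
    def __init__(self, dim=768, slices=8):
        super().__init__()
        assert dim % slices == 0
        self.slices, self.slice_dim = slices, dim // slices
        self.att = nn.Linear(2 * self.slice_dim, 1)   # one weight per slice pair

    def forward(self, bert_vec, glyph_vec):          # both (B, dim)
        b = bert_vec.view(-1, self.slices, self.slice_dim)
        g = glyph_vec.view(-1, self.slices, self.slice_dim)
        pair = torch.cat([b, g], dim=-1)             # (B, S, 2*slice_dim)
        w = torch.sigmoid(self.att(pair))            # (B, S, 1) slice attention
        fused = w * b + (1 - w) * g                  # gated slice-wise mix
        return fused.flatten(1)                      # back to (B, dim)

f = SliceFusion()
print(f(torch.randn(2, 768), torch.randn(2, 768)).shape)  # torch.Size([2, 768])
```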

Set-Structured Latent Representations

Title Set-Structured Latent Representations
Authors Qian Huang, Horace He, Abhay Singh, Yan Zhang, Ser-Nam Lim, Austin Benson
Abstract Unstructured data often has latent component structure, such as the objects in an image of a scene. In these situations, the relevant latent structure is an unordered collection or set. However, learning such representations directly from data is difficult due to the discrete and unordered structure. Here, we develop a framework for differentiable learning of set-structured latent representations. We show how to use this framework to naturally decompose data such as images into sets of interpretable and meaningful components and demonstrate how existing techniques cannot properly disentangle relevant structure. We also show how to extend our methodology to downstream tasks such as set matching, which uses set-specific operations. Our code is available at https://github.com/CUVL/SSLR.
Tasks
Published 2020-03-09
URL https://arxiv.org/abs/2003.04448v1
PDF https://arxiv.org/pdf/2003.04448v1.pdf
PWC https://paperswithcode.com/paper/set-structured-latent-representations
Repo https://github.com/CUVL/SSLR
Framework pytorch
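
One concrete "set-specific operation" of the kind the abstract alludes to is a permutation-invariant matching loss between two collections of latent vectors. The Chamfer-style loss below is an illustration, not necessarily the loss used in the paper.

```python
# Sketch of a permutation-invariant set matching loss (Chamfer-style).
import torch

def chamfer_set_loss(pred_set, target_set):
    # pred_set, target_set: (N, D) and (M, D), order-free collections.
    dists = torch.cdist(pred_set, target_set)       # (N, M) pairwise distances
    # Each predicted element matches its nearest target, and vice versa, so
    # the loss is invariant to how either set happens to be ordered.
    return dists.min(dim=1).values.mean() + dists.min(dim=0).values.mean()

a, b = torch.randn(5, 16), torch.randn(7, 16)
perm = b[torch.randperm(7)]
assert torch.allclose(chamfer_set_loss(a, b), chamfer_set_loss(a, perm))
```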

Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes

Title Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes
Authors Sravanti Addepalli, Vivek B. S., Arya Baburaj, Gaurang Sriramanan, R. Venkatesh Babu
Abstract As humans, we inherently perceive images based on their predominant features, and ignore noise embedded within lower bit planes. In contrast, deep neural networks are known to confidently misclassify images corrupted with meticulously crafted perturbations that are nearly imperceptible to the human eye. In this work, we attempt to address this problem by training networks to form coarse impressions based on the information in higher bit planes, and to use the lower bit planes only to refine their prediction. We demonstrate that, by imposing consistency on the representations learned across differently quantized images, the adversarial robustness of networks improves significantly when compared to a normally trained model. Present state-of-the-art defenses against adversarial attacks require the networks to be explicitly trained using adversarial samples that are computationally expensive to generate. While such methods that use adversarial training continue to achieve the best results, this work paves the way towards achieving robustness without having to explicitly train on adversarial samples. The proposed approach is therefore faster, and also closer to the natural learning process in humans.
Tasks
Published 2020-04-01
URL https://arxiv.org/abs/2004.00306v1
PDF https://arxiv.org/pdf/2004.00306v1.pdf
PWC https://paperswithcode.com/paper/towards-achieving-adversarial-robustness-by
Repo https://github.com/val-iisc/BPFC
Framework none
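
The training signal can be sketched as a consistency penalty between the network's outputs on the full image and on a coarsely quantized copy that keeps only the higher bit planes. The quantization form, bit count, and loss weight below are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of the bit-plane idea: quantize an image to its higher bit planes and
# penalize disagreement between outputs on the coarse and full images.
import torch
import torch.nn.functional as F

def keep_high_bit_planes(img, bits=4):
    """Zero out the low bit planes of an 8-bit image scaled to [0, 1]."""
    step = 256 // (2 ** bits)
    return torch.floor(img * 255 / step) * step / 255

def consistency_loss(model, img, labels, weight=1.0):
    logits_full = model(img)                       # prediction on the full image
    logits_coarse = model(keep_high_bit_planes(img))
    # Classify from the full image, but force coarse/full outputs to agree so
    # the network cannot rely on low-bit-plane noise.
    return (F.cross_entropy(logits_full, labels)
            + weight * F.mse_loss(logits_full, logits_coarse))
```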