Paper Group AWR 14
Learning Enriched Features for Real Image Restoration and Enhancement. Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation. Coping With Simulators That Don’t Always Return. Scalable End-to-end Recurrent Neural Network for Variable star classification. Infrared and 3D skeleton feature fusion for RGB-D action recognition …
Learning Enriched Features for Real Image Restoration and Enhancement
Title | Learning Enriched Features for Real Image Restoration and Enhancement |
Authors | Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, Ling Shao |
Abstract | With the goal of recovering high-quality image content from its degraded version, image restoration enjoys numerous applications, such as in surveillance, computational photography, medical imaging, and remote sensing. Recently, convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for the image restoration task. Existing CNN-based methods typically operate either on full-resolution or on progressively low-resolution representations. In the former case, spatially precise but contextually less robust results are achieved, while in the latter case, semantically reliable but spatially less accurate outputs are generated. In this paper, we present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network, and receiving strong contextual information from the low-resolution representations. The core of our approach is a multi-scale residual block containing several key elements: (a) parallel multi-resolution convolution streams for extracting multi-scale features, (b) information exchange across the multi-resolution streams, (c) spatial and channel attention mechanisms for capturing contextual information, and (d) attention based multi-scale feature aggregation. In a nutshell, our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details. Extensive experiments on five real image benchmark datasets demonstrate that our method, named MIRNet, achieves state-of-the-art results for a variety of image processing tasks, including image denoising, super-resolution and image enhancement. |
Tasks | Denoising, Image Denoising, Image Enhancement, Image Restoration, Super-Resolution |
Published | 2020-03-15 |
URL | https://arxiv.org/abs/2003.06792v1 |
https://arxiv.org/pdf/2003.06792v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-enriched-features-for-real-image |
Repo | https://github.com/swz30/MIRNet |
Framework | none |
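
The multi-scale residual block is the heart of the method. Below is a minimal PyTorch sketch of the idea, reduced to two resolution streams with squeeze-and-excite channel attention; the official block (github.com/swz30/MIRNet) uses three streams, selective-kernel fusion, and dual attention, so treat the names and sizes here as illustrative.

```python
# Minimal sketch of a two-stream multi-scale residual block in the spirit of
# MIRNet (simplified; not the authors' exact block).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """Squeeze-and-excite style channel attention."""
    def __init__(self, ch, r=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // r, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // r, ch, 1), nn.Sigmoid(),
        )
    def forward(self, x):
        return x * self.fc(x)

class TwoStreamResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.hi = nn.Conv2d(ch, ch, 3, padding=1)   # full-resolution stream
        self.lo = nn.Conv2d(ch, ch, 3, padding=1)   # half-resolution stream
        self.att = ChannelAttention(ch)
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, x):
        hi = F.relu(self.hi(x))
        lo = F.relu(self.lo(F.avg_pool2d(x, 2)))
        # information exchange: bring the contextual stream back to full size
        lo_up = F.interpolate(lo, size=hi.shape[-2:], mode="bilinear",
                              align_corners=False)
        fused = self.att(self.fuse(torch.cat([hi, lo_up], dim=1)))
        return x + fused                            # residual connection

x = torch.randn(1, 16, 64, 64)
print(TwoStreamResidualBlock(16)(x).shape)  # torch.Size([1, 16, 64, 64])
```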
Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation
Title | Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation |
Authors | Liang Liu, Jiangning Zhang, Ruifei He, Yong Liu, Yabiao Wang, Ying Tai, Donghao Luo, Chengjie Wang, Jilin Li, Feiyue Huang |
Abstract | Unsupervised learning of optical flow, which leverages the supervision from view synthesis, has emerged as a promising alternative to supervised methods. However, the objective of unsupervised learning is likely to be unreliable in challenging scenes. In this work, we present a framework to use more reliable supervision from transformations. It simply twists the general unsupervised learning pipeline by running another forward pass with transformed data from augmentation, along with using transformed predictions of original data as the self-supervision signal. In addition, we introduce a lightweight multi-frame network built on a highly shared flow decoder. Our method consistently delivers a leap in performance on several benchmarks, with the best accuracy among deep unsupervised methods, and achieves results competitive with recent fully supervised methods while using far fewer parameters. |
Tasks | Optical Flow Estimation |
Published | 2020-03-29 |
URL | https://arxiv.org/abs/2003.13045v1 |
https://arxiv.org/pdf/2003.13045v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-by-analogy-reliable-supervision-from |
Repo | https://github.com/lliuz/ARFlow |
Framework | pytorch |
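
The core trick is easy to state: the network's predictions on the original pair, transformed consistently with the augmentation, supervise its predictions on the transformed pair. Below is a minimal sketch using horizontal flip as the augmentation, assuming `flow_net` is any flow network mapping an image pair to a (B, 2, H, W) flow field; the stop-gradient keeps the first pass as a fixed teacher, as in typical self-supervision setups.

```python
# Sketch of the self-supervision-from-transformation loss (ARFlow-style).
import torch

def flip_images(im):                      # flip the width axis
    return torch.flip(im, dims=[-1])

def flip_flow(flow):                      # flipping also negates the x component
    f = torch.flip(flow, dims=[-1]).clone()
    f[:, 0] = -f[:, 0]
    return f

def transformation_loss(flow_net, im1, im2):
    # First pass: predictions on the original pair act as the teacher.
    with torch.no_grad():
        flow_teacher = flow_net(im1, im2)
    # Second pass: same network on the transformed pair (student).
    flow_student = flow_net(flip_images(im1), flip_images(im2))
    # The teacher's flow, transformed consistently, supervises the student.
    return (flow_student - flip_flow(flow_teacher)).abs().mean()

# usage with a dummy flow network:
dummy = lambda a, b: torch.zeros(a.shape[0], 2, *a.shape[-2:])
im = torch.rand(1, 3, 64, 64)
print(transformation_loss(dummy, im, im))  # tensor(0.)
```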
Coping With Simulators That Don’t Always Return
Title | Coping With Simulators That Don’t Always Return |
Authors | Andrew Warrington, Saeid Naderiparizi, Frank Wood |
Abstract | Deterministic models are approximations of reality that are easy to interpret and often easier to build than stochastic alternatives. Unfortunately, as nature is capricious, observational data can never be fully explained by deterministic models in practice. Observation and process noise need to be added to adapt deterministic models to behave stochastically, such that they are capable of explaining and extrapolating from noisy data. We investigate and address computational inefficiencies that arise from adding process noise to deterministic simulators that fail to return for certain inputs; a property we describe as “brittle.” We show how to train a conditional normalizing flow to propose perturbations such that the simulator succeeds with high probability, increasing computational efficiency. |
Tasks | |
Published | 2020-03-28 |
URL | https://arxiv.org/abs/2003.12908v1 |
https://arxiv.org/pdf/2003.12908v1.pdf | |
PWC | https://paperswithcode.com/paper/coping-with-simulators-that-don-t-always |
Repo | https://github.com/plai-group/stdr |
Framework | pytorch |
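
A toy illustration of why brittleness is costly and how a learned proposal helps: a deliberately concentrated perturbation distribution succeeds in far fewer simulator calls than naive process noise. A diagonal Gaussian stands in for the paper's conditional normalizing flow, and the importance re-weighting that keeps the target distribution intact is omitted; all names and numbers are hypothetical.

```python
# Toy sketch of the brittle-simulator setting.
import torch

def simulator(x):
    """Brittle stand-in: returns None (crashes) when the input leaves [0, 1]."""
    if (x < 0).any() or (x > 1).any():
        return None
    return x ** 2   # some deterministic output

def run_until_success(x, propose, max_tries=1000):
    """Resample process noise from `propose` until the simulator returns."""
    for tries in range(1, max_tries + 1):
        eps = propose(x)
        out = simulator(x + eps)
        if out is not None:
            return out, tries
    raise RuntimeError("simulator never returned")

x = torch.full((5,), 0.95)                          # close to the failure boundary
wide = lambda x: 0.2 * torch.randn_like(x)          # naive process noise
narrow = lambda x: -0.05 * x + 0.01 * torch.randn_like(x)  # "learned" proposal
_, t_wide = run_until_success(x, wide)
_, t_narrow = run_until_success(x, narrow)
print(t_wide, t_narrow)  # the narrow proposal typically succeeds much sooner
```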
Scalable End-to-end Recurrent Neural Network for Variable star classification
Title | Scalable End-to-end Recurrent Neural Network for Variable star classification |
Authors | Ignacio Becker, Karim Pichara, Márcio Catelan, Pavlos Protopapas, Carlos Aguirre, Fatemeh Nikzat |
Abstract | During the last decade, considerable effort has been made to perform automatic classification of variable stars using machine learning techniques. Traditionally, light curves are represented as a vector of descriptors or features used as input for many algorithms. Some features are computationally expensive and cannot be updated quickly, and hence cannot be applied to large datasets such as the LSST. Previous work has developed alternative unsupervised feature extraction algorithms for light curves, but the cost of doing so still remains high. In this work, we propose an end-to-end algorithm that automatically learns a representation of light curves that allows accurate automatic classification. We study a series of deep learning architectures based on Recurrent Neural Networks and test them in automated classification scenarios. Our method uses minimal data preprocessing, can be updated at low computational cost for new observations and light curves, and can scale up to massive datasets. We transform each light curve into an input matrix representation whose elements are the differences in time and magnitude, and the outputs are classification probabilities. We test our method on three surveys: OGLE-III, Gaia and WISE. We obtain accuracies of about $95\%$ in the main classes and $75\%$ in the majority of subclasses. We compare our results with the Random Forest classifier and obtain competitive accuracies while being faster and scalable. The analysis shows that the computational complexity of our approach grows linearly with the light-curve size, while the cost of the traditional approach grows as $N\log(N)$. |
Tasks | Classification Of Variable Stars |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00994v1 |
https://arxiv.org/pdf/2002.00994v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-end-to-end-recurrent-neural-network |
Repo | https://github.com/iebecker/Scalable_RNN |
Framework | tf |
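
The input encoding is simple enough to sketch directly: each light curve becomes a sequence of (Δtime, Δmagnitude) pairs fed to a recurrent classifier. A minimal PyTorch version with illustrative layer sizes (the paper's own implementation is in TensorFlow and explores several architectures):

```python
# Sketch of the light-curve encoding and a GRU classifier: no hand-crafted
# features, just consecutive differences in time and magnitude.
import torch
import torch.nn as nn

def encode_light_curve(times, mags):
    """Stack consecutive differences into a (T-1, 2) input matrix."""
    return torch.stack([times.diff(), mags.diff()], dim=-1)

class LightCurveRNN(nn.Module):
    def __init__(self, n_classes, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):            # x: (batch, seq_len, 2)
        _, h = self.rnn(x)
        return self.head(h[-1])      # class logits

times = torch.sort(torch.rand(200)).values   # irregular observation times
mags = torch.randn(200)                      # magnitudes
x = encode_light_curve(times, mags).unsqueeze(0)   # (1, 199, 2)
print(LightCurveRNN(n_classes=5)(x).shape)          # torch.Size([1, 5])
```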
Infrared and 3D skeleton feature fusion for RGB-D action recognition
Title | Infrared and 3D skeleton feature fusion for RGB-D action recognition |
Authors | Alban Main de Boissiere, Rita Noumeir |
Abstract | A challenge of skeleton-based action recognition is the difficulty of classifying actions with similar motions and object-related actions. Visual cues from other streams help in that regard. RGB data are sensitive to illumination conditions and thus unusable in the dark. To alleviate this issue while still benefiting from a visual stream, we propose a modular network (FUSION) combining skeleton and infrared data. A 2D convolutional neural network (CNN) is used as a pose module to extract features from skeleton data. A 3D CNN is used as an infrared module to extract visual cues from videos. Both feature vectors are then concatenated and exploited conjointly using a multilayer perceptron (MLP). Skeleton data also condition the infrared videos, providing a crop around the performing subjects and thus virtually focusing the attention of the infrared module. Ablation studies show that using modules pre-trained on other large-scale datasets and data augmentation yield considerable improvements in action classification accuracy. The strong contribution of our cropping strategy is also demonstrated. We evaluate our method on the NTU RGB+D dataset, the largest dataset for human action recognition from depth cameras, and report state-of-the-art performance. |
Tasks | Action Classification, Action Recognition In Videos, Data Augmentation, Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12886v1 |
https://arxiv.org/pdf/2002.12886v1.pdf | |
PWC | https://paperswithcode.com/paper/infrared-and-3d-skeleton-feature-fusion-for |
Repo | https://github.com/adeboissiere/FUSION-human-action-recognition |
Framework | pytorch |
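
A minimal sketch of the FUSION wiring in PyTorch: a 2D CNN over the skeleton, a 3D CNN over the cropped infrared clip, concatenation, and an MLP. The real modules are large pre-trained networks; these tiny stand-ins only show how the two streams meet.

```python
# Skeleton + infrared fusion, reduced to its structure.
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        # pose module: skeleton as a (joints x frames) pseudo-image, xyz channels
        self.pose = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        # infrared module: skeleton-cropped IR clip (1 channel, T frames)
        self.ir = nn.Sequential(
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten())
        self.mlp = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                                 nn.Linear(64, n_classes))

    def forward(self, skeleton, ir_clip):
        feats = torch.cat([self.pose(skeleton), self.ir(ir_clip)], dim=1)
        return self.mlp(feats)

skeleton = torch.randn(2, 3, 25, 30)      # batch, xyz, joints, frames
ir_clip = torch.randn(2, 1, 8, 64, 64)    # batch, channel, T, H, W
print(FusionNet(60)(skeleton, ir_clip).shape)  # torch.Size([2, 60])
```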
Get Rid of Suspended Animation Problem: Deep Diffusive Neural Network on Graph Semi-Supervised Classification
Title | Get Rid of Suspended Animation Problem: Deep Diffusive Neural Network on Graph Semi-Supervised Classification |
Authors | Jiawei Zhang |
Abstract | Existing graph neural networks may suffer from the “suspended animation problem” when the model architecture goes deep. Meanwhile, for some graph learning scenarios, e.g., nodes with text/image attributes or graphs with long-distance node correlations, deep graph neural networks will be necessary for effective graph representation learning. In this paper, we propose a new graph neural network, namely DIFNET (Graph Diffusive Neural Network), for graph representation learning and node classification. DIFNET utilizes both neural gates and graph residual learning for node hidden state modeling, and includes an attention mechanism for node neighborhood information diffusion. Extensive experiments compare DIFNET against several state-of-the-art graph neural network models. The experimental results illustrate both the learning performance advantages and effectiveness of DIFNET, especially in addressing the “suspended animation problem”. |
Tasks | Graph Representation Learning, Node Classification, Representation Learning |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.07922v1 |
https://arxiv.org/pdf/2001.07922v1.pdf | |
PWC | https://paperswithcode.com/paper/get-rid-of-suspended-animation-problem-deep |
Repo | https://github.com/anonymous-sourcecode/DifNN |
Framework | pytorch |
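
A rough sketch of what one such layer could look like, combining attention-based neighborhood diffusion, a GRU-style neural gate, and a raw-input residual. This is an interpretation of the abstract, not the authors' exact DIFNET layer, and all sizes are illustrative.

```python
# One gated-diffusion layer: attention diffuses neighborhood information,
# a GRU cell gates the state update, and a residual carries the raw input.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedDiffusionLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.att = nn.Linear(2 * dim, 1)
        self.gate = nn.GRUCell(dim, dim)

    def forward(self, h, adj, x0):
        n = h.size(0)
        # attention score for every node pair, masked by the adjacency matrix
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        scores = self.att(pairs).squeeze(-1).masked_fill(adj == 0, -1e9)
        msg = F.softmax(scores, dim=-1) @ h      # diffused neighborhood info
        return self.gate(msg, h) + x0            # gated update + raw residual

h = torch.randn(4, 8)                             # 4 nodes, hidden dim 8
adj = torch.eye(4) + torch.ones(4, 4).tril(-1)    # edges plus self-loops
print(GatedDiffusionLayer(8)(h, adj, h).shape)    # torch.Size([4, 8])
```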
Semi-supervised Learning for Few-shot Image-to-Image Translation
Title | Semi-supervised Learning for Few-shot Image-to-Image Translation |
Authors | Yaxing Wang, Salman Khan, Abel Gonzalez-Garcia, Joost van de Weijer, Fahad Shahbaz Khan |
Abstract | In the last few years, unpaired image-to-image translation has witnessed remarkable progress. Although the latest methods are able to generate realistic images, they crucially rely on a large number of labeled images. Recently, some methods have tackled the challenging setting of few-shot image-to-image translation, reducing the labeled data requirements for the target domain during inference. In this work, we go one step further and reduce the amount of required labeled data also from the source domain during training. To do so, we propose applying semi-supervised learning via a noise-tolerant pseudo-labeling procedure. We also apply a cycle consistency constraint to further exploit the information from unlabeled images, either from the same dataset or external. Additionally, we propose several structural modifications to facilitate the image translation task under these circumstances. Our semi-supervised method for few-shot image translation, called \emph{SEMIT}, achieves excellent results on four different datasets using as little as 10% of the source labels, and matches the performance of the main fully-supervised competitor using only 20% labeled data. Our code and models are made public at: \url{https://github.com/yaxingwang/SEMIT}. |
Tasks | Image-to-Image Translation |
Published | 2020-03-30 |
URL | https://arxiv.org/abs/2003.13853v1 |
https://arxiv.org/pdf/2003.13853v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-learning-for-few-shot-image |
Repo | https://github.com/yaxingwang/SEMIT |
Framework | none |
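
The pseudo-labeling step is the part that cuts source-label requirements, and it is easy to sketch: a classifier trained on the small labeled subset labels the rest, and only confident predictions are kept. SEMIT's actual procedure is noise-tolerant and feeds these labels to the translation network; the threshold rule below is a simplified stand-in.

```python
# Confidence-thresholded pseudo-labeling for unlabeled source images.
import torch

def pseudo_label(classifier, unlabeled, threshold=0.9):
    with torch.no_grad():
        probs = torch.softmax(classifier(unlabeled), dim=1)
    conf, labels = probs.max(dim=1)
    keep = conf >= threshold             # discard low-confidence (noisy) labels
    return unlabeled[keep], labels[keep]

# usage with any image classifier:
clf = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
images = torch.randn(16, 3, 32, 32)
kept, labels = pseudo_label(clf, images)
print(kept.shape[0], "of 16 images kept")
```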
A Differentiable Color Filter for Generating Unrestricted Adversarial Images
Title | A Differentiable Color Filter for Generating Unrestricted Adversarial Images |
Authors | Zhengyu Zhao, Zhuoran Liu, Martha Larson |
Abstract | We propose Adversarial Color Filtering (AdvCF), an approach that uses a differentiable color filter to create adversarial images. The color filter allows us to introduce large perturbations into images, while still maintaining or enhancing their photographic quality and appeal. AdvCF is motivated by properties that are necessary if adversarial images are to be used to protect the content of images shared online from unethical machine learning classifiers: First, perturbations must be imperceptible and adversarial images must look realistic to the human eye. Second, adversarial impact must be maintained in the face of classifiers unknown when the perturbations are generated (transferability). The paper presents evidence that AdvCF has these two properties, and also points out that AdvCF has the potential for further improvement if image semantics are taken into account. |
Tasks | |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.01008v1 |
https://arxiv.org/pdf/2002.01008v1.pdf | |
PWC | https://paperswithcode.com/paper/a-differentiable-color-filter-for-generating |
Repo | https://github.com/ZhengyuZhao/AdvCF |
Framework | pytorch |
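
A sketch of the AdvCF idea with a deliberately simpler color transform: a per-channel scale and shift, optimized by gradient ascent on the classifier's loss. The paper's filter is more expressive, but the mechanics are the same: a handful of smooth, image-wide parameters instead of per-pixel noise. Names and step counts are illustrative.

```python
# Differentiable color transform optimized to be adversarial.
import torch
import torch.nn.functional as F

def adv_color_filter(model, image, label, steps=50, lr=0.05):
    scale = torch.ones(1, 3, 1, 1, requires_grad=True)
    shift = torch.zeros(1, 3, 1, 1, requires_grad=True)
    opt = torch.optim.Adam([scale, shift], lr=lr)
    for _ in range(steps):
        filtered = (image * scale + shift).clamp(0, 1)
        loss = -F.cross_entropy(model(filtered), label)  # ascend the loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (image * scale + shift).clamp(0, 1).detach()

# usage with any differentiable classifier:
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.rand(1, 3, 32, 32)
y = torch.tensor([3])
x_adv = adv_color_filter(model, x, y)
print((x_adv - x).abs().max())   # a large but smooth, image-wide color change
```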
Multi-level Context Gating of Embedded Collective Knowledge for Medical Image Segmentation
Title | Multi-level Context Gating of Embedded Collective Knowledge for Medical Image Segmentation |
Authors | Maryam Asadi-Aghbolaghi, Reza Azad, Mahmood Fathy, Sergio Escalera |
Abstract | Medical image segmentation has been very challenging due to the large variation of anatomy across different cases. Recent advances in deep learning frameworks have exhibited faster and more accurate performance in image segmentation. Among the existing networks, U-Net has been successfully applied to medical image segmentation. In this paper, we propose an extension of U-Net for medical image segmentation, in which we take full advantage of U-Net, the Squeeze and Excitation (SE) block, bi-directional ConvLSTM (BConvLSTM), and the mechanism of dense convolutions. (I) We improve the segmentation performance by utilizing SE modules within the U-Net, with a minor effect on model complexity. These blocks adaptively recalibrate the channel-wise feature responses by utilizing a self-gating mechanism over a global information embedding of the feature maps. (II) To strengthen feature propagation and encourage feature reuse, we use densely connected convolutions in the last convolutional layer of the encoding path. (III) Instead of a simple concatenation in the skip connection of U-Net, we employ BConvLSTM at all levels of the network to combine the feature maps extracted from the corresponding encoding path and the previous decoding up-convolutional layer in a non-linear way. The proposed model is evaluated on six datasets: DRIVE, ISIC 2017, ISIC 2018, lung segmentation, $PH^2$, and cell nuclei segmentation, achieving state-of-the-art performance. |
Tasks | Lesion Segmentation, Medical Image Segmentation, Semantic Segmentation |
Published | 2020-03-10 |
URL | https://arxiv.org/abs/2003.05056v1 |
https://arxiv.org/pdf/2003.05056v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-level-context-gating-of-embedded |
Repo | https://github.com/rezazad68/BCDU-Net |
Framework | tf |
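
Component (I) is the standard squeeze-and-excitation recalibration, which is compact enough to sketch: global average pooling embeds each channel, a two-layer gate produces per-channel weights, and the feature maps are rescaled. A minimal PyTorch version for illustration (the linked repo lists a TensorFlow implementation):

```python
# Squeeze-and-excitation block as inserted into U-Net stages.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(ch, ch // reduction), nn.ReLU(inplace=True),
            nn.Linear(ch // reduction, ch), nn.Sigmoid())

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.gate(x.mean(dim=(2, 3)))       # squeeze: global embedding
        return x * w.view(b, c, 1, 1)           # excite: channel recalibration

x = torch.randn(2, 64, 32, 32)
print(SEBlock(64)(x).shape)   # same shape, channels reweighted
```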
A Metric Learning Reality Check
Title | A Metric Learning Reality Check |
Authors | Kevin Musgrave, Serge Belongie, Ser-Nam Lim |
Abstract | Deep metric learning papers from the past four years have consistently claimed great advances in accuracy, often more than doubling the performance of decade-old methods. In this paper, we take a closer look at the field to see if this is actually true. We find flaws in the experimental setup of these papers, and propose a new way to evaluate metric learning algorithms. Finally, we present experimental results that show that the improvements over time have been marginal at best. |
Tasks | Metric Learning |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08505v1 |
https://arxiv.org/pdf/2003.08505v1.pdf | |
PWC | https://paperswithcode.com/paper/a-metric-learning-reality-check |
Repo | https://github.com/KevinMusgrave/powerful-benchmarker |
Framework | pytorch |
Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations
Title | Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations |
Authors | Florian Tramèr, Jens Behrmann, Nicholas Carlini, Nicolas Papernot, Jörn-Henrik Jacobsen |
Abstract | Adversarial examples are malicious inputs crafted to induce misclassification. Commonly studied sensitivity-based adversarial examples introduce semantically-small changes to an input that result in a different model prediction. This paper studies a complementary failure mode, invariance-based adversarial examples, that introduce minimal semantic changes that modify an input’s true label yet preserve the model’s prediction. We demonstrate fundamental tradeoffs between these two types of adversarial examples. We show that defenses against sensitivity-based attacks actively harm a model’s accuracy on invariance-based attacks, and that new approaches are needed to resist both attack types. In particular, we break state-of-the-art adversarially-trained and certifiably-robust models by generating small perturbations that the models are (provably) robust to, yet that change an input’s class according to human labelers. Finally, we formally show that the existence of excessively invariant classifiers arises from the presence of overly-robust predictive features in standard datasets. |
Tasks | |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.04599v1 |
https://arxiv.org/pdf/2002.04599v1.pdf | |
PWC | https://paperswithcode.com/paper/fundamental-tradeoffs-between-invariance-and |
Repo | https://github.com/ftramer/Excessive-Invariance |
Framework | tf |
Hippocampus Segmentation on Epilepsy and Alzheimer’s Disease Studies with Multiple Convolutional Neural Networks
Title | Hippocampus Segmentation on Epilepsy and Alzheimer’s Disease Studies with Multiple Convolutional Neural Networks |
Authors | Diedre Carmo, Bruna Silva, Clarissa Yasuda, Letícia Rittner, Roberto Lotufo |
Abstract | Hippocampus segmentation on magnetic resonance imaging (MRI) is of key importance for the diagnosis, treatment decisions and investigation of neuropsychiatric disorders. Automatic segmentation is a very active research field, with many recent models involving Deep Learning for this task. However, Deep Learning requires a training phase, which can introduce bias from the specific domain of the training dataset. Current state-of-the-art methods are trained on scans of healthy subjects or Alzheimer’s disease patients from public datasets. This raises the question of whether these methods are capable of recognizing the hippocampus in a very different domain. In this paper we present a state-of-the-art, open source, ready-to-use hippocampus segmentation methodology, using Deep Learning. We analyze this methodology alongside other recent Deep Learning methods, in two domains: the public HarP benchmark and an in-house dataset of Epilepsy patients. Our internal dataset differs significantly from scans of Alzheimer’s and healthy subjects. Some scans are from patients who have undergone hippocampal resection as surgical treatment for Epilepsy. We show that our method surpasses others from the literature on both the Alzheimer’s and Epilepsy test datasets. |
Tasks | |
Published | 2020-01-14 |
URL | https://arxiv.org/abs/2001.05058v1 |
https://arxiv.org/pdf/2001.05058v1.pdf | |
PWC | https://paperswithcode.com/paper/hippocampus-segmentation-on-epilepsy-and |
Repo | https://github.com/dscarmo/e2dhipseg |
Framework | pytorch |
FGN: Fusion Glyph Network for Chinese Named Entity Recognition
Title | FGN: Fusion Glyph Network for Chinese Named Entity Recognition |
Authors | Zhenyu Xuan, Rui Bao, Chuyu Ma, Shengyi Jiang |
Abstract | Chinese NER is a challenging task. As pictographs, Chinese characters contain latent glyph information, which is often overlooked. In this paper, we propose FGN, a Fusion Glyph Network for Chinese NER. Besides adding glyph information, this method also captures extra interactive information through its fusion mechanism. The major innovations of FGN include: (1) a novel CNN structure called CGS-CNN is proposed to capture both glyph information and interactive information between glyphs of neighboring characters. (2) We provide a method with a sliding window and Slice-Attention to fuse the BERT representation and glyph representation of a character, which may capture potential interactive knowledge between context and glyph. Experiments are conducted on four NER datasets, showing that FGN with an LSTM-CRF tagger achieves new state-of-the-art performance for Chinese NER. Further experiments are conducted to investigate the influence of various components and settings in FGN. |
Tasks | Chinese Named Entity Recognition, Named Entity Recognition, Representation Learning |
Published | 2020-01-15 |
URL | https://arxiv.org/abs/2001.05272v3 |
https://arxiv.org/pdf/2001.05272v3.pdf | |
PWC | https://paperswithcode.com/paper/fgn-fusion-glyph-network-for-chinese-named |
Repo | https://github.com/AidenHuen/FGN-NER |
Framework | none |
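
A much-simplified sketch of the slice-wise fusion of a character's BERT vector with its glyph vector: both are cut into equal slices, each slice pair is scored, and the weighted slices are recombined before the tagger. FGN's actual sliding-window Slice-Attention is more involved, so treat this as an interpretation with illustrative dimensions.

```python
# Slice-wise fusion of BERT and glyph representations for one character.
import torch
import torch.nn as nn

class SliceFusion(nn.Module):
    def __init__(self, dim=768, n_slices=8):
        super().__init__()
        self.n = n_slices
        self.score = nn.Linear(2 * dim // n_slices, 1)

    def forward(self, bert_vec, glyph_vec):      # both: (batch, dim)
        b = bert_vec.size(0)
        bs = bert_vec.view(b, self.n, -1)        # (batch, slices, dim/slices)
        gs = glyph_vec.view(b, self.n, -1)
        # score each slice pair, normalize across slices
        w = torch.softmax(self.score(torch.cat([bs, gs], -1)), dim=1)
        return (w * (bs + gs)).view(b, -1)       # feed to the LSTM-CRF tagger

bert_vec, glyph_vec = torch.randn(2, 768), torch.randn(2, 768)
print(SliceFusion()(bert_vec, glyph_vec).shape)  # torch.Size([2, 768])
```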
Set-Structured Latent Representations
Title | Set-Structured Latent Representations |
Authors | Qian Huang, Horace He, Abhay Singh, Yan Zhang, Ser-Nam Lim, Austin Benson |
Abstract | Unstructured data often has latent component structure, such as the objects in an image of a scene. In these situations, the relevant latent structure is an unordered collection or \emph{set}. However, learning such representations directly from data is difficult due to the discrete and unordered structure. Here, we develop a framework for differentiable learning of set-structured latent representations. We show how to use this framework to naturally decompose data such as images into sets of interpretable and meaningful components and demonstrate how existing techniques cannot properly disentangle relevant structure. We also show how to extend our methodology to downstream tasks such as set matching, which uses set-specific operations. Our code is available at https://github.com/CUVL/SSLR. |
Tasks | |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.04448v1 |
https://arxiv.org/pdf/2003.04448v1.pdf | |
PWC | https://paperswithcode.com/paper/set-structured-latent-representations |
Repo | https://github.com/CUVL/SSLR |
Framework | pytorch |
Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes
Title | Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes |
Authors | Sravanti Addepalli, Vivek B. S., Arya Baburaj, Gaurang Sriramanan, R. Venkatesh Babu |
Abstract | As humans, we inherently perceive images based on their predominant features, and ignore noise embedded within lower bit planes. On the contrary, Deep Neural Networks are known to confidently misclassify images corrupted with meticulously crafted perturbations that are nearly imperceptible to the human eye. In this work, we attempt to address this problem by training networks to form coarse impressions based on the information in higher bit planes, and use the lower bit planes only to refine their prediction. We demonstrate that, by imposing consistency on the representations learned across differently quantized images, the adversarial robustness of networks improves significantly when compared to a normally trained model. Present state-of-the-art defenses against adversarial attacks require the networks to be explicitly trained using adversarial samples that are computationally expensive to generate. While such methods that use adversarial training continue to achieve the best results, this work paves the way towards achieving robustness without having to explicitly train on adversarial samples. The proposed approach is therefore faster, and also closer to the natural learning process in humans. |
Tasks | |
Published | 2020-04-01 |
URL | https://arxiv.org/abs/2004.00306v1 |
https://arxiv.org/pdf/2004.00306v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-achieving-adversarial-robustness-by |
Repo | https://github.com/val-iisc/BPFC |
Framework | none |
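
The training objective is straightforward to sketch: quantize the image down to its higher bit planes, then penalize the distance between the network's outputs on the quantized and original images alongside the usual classification loss. A minimal PyTorch version; the bit count and loss weight are illustrative, not the paper's settings.

```python
# Bit-plane feature-consistency training objective.
import torch
import torch.nn.functional as F

def keep_high_bit_planes(img, bits=5):
    """Zero out the lowest (8 - bits) bit planes of an image in [0, 1]."""
    step = 2 ** (8 - bits)
    q = torch.floor(img * 255) // step * step
    return q / 255.0

def bpfc_loss(model, img, label, weight=1.0):
    logits_full = model(img)
    logits_quant = model(keep_high_bit_planes(img))
    # consistency between coarse (high bit planes) and full representations
    consistency = (logits_full - logits_quant).pow(2).sum(dim=1).mean()
    return F.cross_entropy(logits_full, label) + weight * consistency

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x, y = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))
print(bpfc_loss(model, x, y))
```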