April 3, 2020

Paper Group ANR 25

Lossless Compression of Deep Neural Networks

Title Lossless Compression of Deep Neural Networks
Authors Thiago Serra, Abhinav Kumar, Srikumar Ramalingam
Abstract Deep neural networks have been successful in many predictive modeling tasks, such as image and language recognition, where large neural networks are often used to obtain good accuracy. Consequently, it is challenging to deploy these networks under limited computational resources, such as in mobile devices. In this work, we introduce an algorithm that removes units and layers of a neural network while not changing the output that is produced, which thus implies a lossless compression. This algorithm, which we denote as LEO (Lossless Expressiveness Optimization), relies on Mixed-Integer Linear Programming (MILP) to identify Rectified Linear Units (ReLUs) with linear behavior over the input domain. By using L1 regularization to induce such behavior, we can benefit from training over a larger architecture than we would later use in the environment where the trained neural network is deployed.
Published 2020-01-01
URL https://arxiv.org/abs/2001.00218v3
PDF https://arxiv.org/pdf/2001.00218v3.pdf
PWC https://paperswithcode.com/paper/lossless-compression-of-deep-neural-networks
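The abstract's key idea is that a ReLU whose pre-activation never changes sign over the input domain behaves linearly there and can be folded away or removed without changing the output. The paper identifies such units with MILP; the sketch below swaps in plain interval arithmetic over a box-shaped input domain, which is cheaper but only a sufficient test (it may label a unit "unstable" that an exact MILP would prove stable). The function names, weights, and input bounds are all hypothetical and chosen for illustration only.

```python
from typing import List, Tuple


def preactivation_bounds(
    weights: List[List[float]],
    bias: List[float],
    lower: List[float],
    upper: List[float],
) -> List[Tuple[float, float]]:
    """Interval bounds of w.x + b for every unit, with x in the box [lower, upper]."""
    bounds = []
    for w_row, b in zip(weights, bias):
        # A positive weight attains its minimum at the lower input bound,
        # a negative weight at the upper input bound (and vice versa for the maximum).
        lo = b + sum(w * (l if w >= 0 else u) for w, l, u in zip(w_row, lower, upper))
        hi = b + sum(w * (u if w >= 0 else l) for w, l, u in zip(w_row, lower, upper))
        bounds.append((lo, hi))
    return bounds


def classify_units(bounds: List[Tuple[float, float]]) -> List[str]:
    """Label each ReLU by its behavior over the whole input box."""
    labels = []
    for lo, hi in bounds:
        if lo >= 0:
            labels.append("stably active")    # ReLU acts as the identity: linear
        elif hi <= 0:
            labels.append("stably inactive")  # ReLU always outputs 0: removable
        else:
            labels.append("unstable")         # sign can change: must be kept
    return labels


# Toy layer with three units over the input box [0, 1] x [0, 1].
weights = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]
bias = [0.5, -0.5, 0.0]
lower = [0.0, 0.0]
upper = [1.0, 1.0]

bounds = preactivation_bounds(weights, bias, lower, upper)
labels = classify_units(bounds)
for (lo, hi), label in zip(bounds, labels):
    print(f"[{lo:+.1f}, {hi:+.1f}] -> {label}")
```

In this toy layer only the third unit has to stay as a genuine ReLU; the first can be merged into the next layer as a linear map and the second can be dropped.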

Recalibrating 3D ConvNets with Project & Excite

Title Recalibrating 3D ConvNets with Project & Excite
Authors Anne-Marie Rickmann, Abhijit Guha Roy, Ignacio Sarasua, Christian Wachinger
Abstract Fully Convolutional Neural Networks (F-CNNs) achieve state-of-the-art performance for segmentation tasks in computer vision and medical imaging. Recently, computational blocks termed squeeze and excitation (SE) have been introduced to recalibrate F-CNN feature maps both channel- and spatial-wise, boosting segmentation performance while only minimally increasing the model complexity. So far, the development of SE blocks has focused on 2D architectures. For volumetric medical images, however, 3D F-CNNs are a natural choice. In this article, we extend existing 2D recalibration methods to 3D and propose a generic compress-process-recalibrate pipeline for easy comparison of such blocks. We further introduce Project & Excite (PE) modules, customized for 3D networks. In contrast to existing modules, Project & Excite does not perform global average pooling but compresses feature maps along different spatial dimensions of the tensor separately to retain more spatial information that is subsequently used in the excitation step. We evaluate the modules on two challenging tasks, whole-brain segmentation of MRI scans and whole-body segmentation of CT scans. We demonstrate that PE modules can be easily integrated into 3D F-CNNs, boosting performance up to 0.3 in Dice Score and outperforming 3D extensions of other recalibration blocks, while only marginally increasing the model complexity. Our code is publicly available on https://github.com/ai-med/squeeze_and_excitation .
Tasks Brain Segmentation
Published 2020-02-25
URL https://arxiv.org/abs/2002.10994v1
PDF https://arxiv.org/pdf/2002.10994v1.pdf
PWC https://paperswithcode.com/paper/recalibrating-3d-convnets-with-project-excite
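The difference between global average pooling and the projection described in the abstract can be illustrated on a single-channel toy volume. The sketch below averages over two spatial axes at a time, so that one vector per axis survives instead of a single scalar. Combining the three vectors by broadcast addition is an assumption made here for illustration, the learned excitation layers are omitted, and all names are hypothetical rather than taken from the authors' repository.

```python
from typing import List, Tuple

Volume = List[List[List[float]]]  # one channel, indexed as [depth][height][width]


def project(volume: Volume) -> Tuple[List[float], List[float], List[float]]:
    """Average over two spatial axes at a time, keeping one axis each."""
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    z_d = [sum(volume[d][h][w] for h in range(H) for w in range(W)) / (H * W)
           for d in range(D)]
    z_h = [sum(volume[d][h][w] for d in range(D) for w in range(W)) / (D * W)
           for h in range(H)]
    z_w = [sum(volume[d][h][w] for d in range(D) for h in range(H)) / (D * H)
           for w in range(W)]
    return z_d, z_h, z_w


def broadcast_sum(z_d: List[float], z_h: List[float], z_w: List[float]) -> Volume:
    """Expand the three projections back to D x H x W by broadcast addition."""
    return [[[a + b + c for c in z_w] for b in z_h] for a in z_d]


def global_average(volume: Volume) -> float:
    """Squeeze step of a standard SE block: one scalar per channel."""
    values = [v for plane in volume for row in plane for v in row]
    return sum(values) / len(values)


volume = [[[1.0, 2.0], [3.0, 4.0]],
          [[5.0, 6.0], [7.0, 8.0]]]

z_d, z_h, z_w = project(volume)
combined = broadcast_sum(z_d, z_h, z_w)

print("global average:", global_average(volume))
print("projections:", z_d, z_h, z_w)
print("combined descriptor:", combined)
```

The combined descriptor varies across voxels, whereas the global average assigns the same value to the whole channel; that retained spatial variation is what the abstract refers to.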

E2EET: From Pipeline to End-to-end Entity Typing via Transformer-Based Embeddings

Title E2EET: From Pipeline to End-to-end Entity Typing via Transformer-Based Embeddings
Authors Michael Stewart, Wei Liu
Abstract Entity Typing (ET) is the process of identifying the semantic types of every entity within a corpus. In contrast to Named Entity Recognition, where each token in a sentence is labelled with zero or one class label, ET involves labelling each entity mention with one or more class labels. Existing entity typing models, which operate at the mention level, are limited by two key factors: they do not make use of recently-proposed context-dependent embeddings, and are trained on fixed context windows. They are therefore sensitive to window size selection and are unable to incorporate the context of the entire document. In light of these drawbacks we propose to incorporate context using transformer-based embeddings for a mention-level model, and an end-to-end model using a Bi-GRU to remove the dependency on window size. An extensive ablative study demonstrates the effectiveness of contextualised embeddings for mention-level models and the competitiveness of our end-to-end model for entity typing.
Tasks Entity Typing, Named Entity Recognition
Published 2020-03-23
URL https://arxiv.org/abs/2003.10097v1
PDF https://arxiv.org/pdf/2003.10097v1.pdf
PWC https://paperswithcode.com/paper/e2eet-from-pipeline-to-end-to-end-entity

Exploring Chemical Space using Natural Language Processing Methodologies for Drug Discovery

Title Exploring Chemical Space using Natural Language Processing Methodologies for Drug Discovery
Authors Hakime Öztürk, Arzucan Özgür, Philippe Schwaller, Teodoro Laino, Elif Ozkirimli
Abstract Text-based representations of chemicals and proteins can be thought of as unstructured languages codified by humans to describe domain-specific knowledge. Advances in natural language processing (NLP) methodologies in the processing of spoken languages accelerated the application of NLP to elucidate hidden knowledge in textual representations of these biochemical entities and then use it to construct models to predict molecular properties or to design novel molecules. This review outlines the impact made by these advances on drug discovery and aims to further the dialogue between medicinal chemists and computer scientists.
Tasks Drug Discovery
Published 2020-02-10
URL https://arxiv.org/abs/2002.06053v1
PDF https://arxiv.org/pdf/2002.06053v1.pdf
PWC https://paperswithcode.com/paper/exploring-chemical-space-using-natural

Inducing Optimal Attribute Representations for Conditional GANs

Title Inducing Optimal Attribute Representations for Conditional GANs
Authors Binod Bhattarai, Tae-Kyun Kim
Abstract Conditional GANs are widely used in translating an image from one category to another. Meaningful conditions to GANs provide greater flexibility and control over the nature of the target domain synthetic data. Existing conditional GANs commonly encode target domain label information as hard-coded categorical vectors in the form of 0s and 1s. The major drawbacks of such representations are the inability to encode the high-order semantic information of target categories and their relative dependencies. We propose a novel end-to-end learning framework with Graph Convolutional Networks to learn the attribute representations to condition on the generator. The GAN losses, i.e. the discriminator and attribute classification losses, are fed back to the graph, resulting in synthetic images that are more natural and clearer in attributes. Moreover, prior art prioritizes conditioning on the generator side, not on the discriminator side, of GANs. We apply the conditions to the discriminator side as well via multi-task learning. We enhanced four state-of-the-art cGAN architectures: StarGAN, StarGAN-JNT, AttGAN and STGAN. Our extensive qualitative and quantitative evaluations on challenging face attribute manipulation data sets, CelebA, LFWA, and RaFD, show that the cGANs enhanced by our methods outperform their counterparts and other conditioning methods by a large margin, in terms of both target attribute recognition rates and quality measures such as PSNR and SSIM.
Tasks Multi-Task Learning
Published 2020-03-13
URL https://arxiv.org/abs/2003.06472v1
PDF https://arxiv.org/pdf/2003.06472v1.pdf
PWC https://paperswithcode.com/paper/inducing-optimal-attribute-representations

On Safety Assessment of Artificial Intelligence

Title On Safety Assessment of Artificial Intelligence
Authors Jens Braband, Hendrik Schäbe
Abstract In this paper we discuss how systems with Artificial Intelligence (AI) can undergo safety assessment. This is relevant if AI is used in safety-related applications. Taking a deeper look into AI models, we show that many models of artificial intelligence, in particular machine learning, are statistical models. Safety assessment would then have to concentrate on the model that is used in AI, besides the normal assessment procedure. Part of the budget of dangerous random failures for the relevant safety integrity level needs to be used for the probabilistic faulty behavior of the AI system. We demonstrate our thoughts with a simple example and propose a research challenge that may be decisive for the use of AI in safety-related systems.
Published 2020-02-29
URL https://arxiv.org/abs/2003.00260v1
PDF https://arxiv.org/pdf/2003.00260v1.pdf
PWC https://paperswithcode.com/paper/on-safety-assessment-of-artificial

Vanishing Point Detection with Direct and Transposed Fast Hough Transform inside the neural network

Title Vanishing Point Detection with Direct and Transposed Fast Hough Transform inside the neural network
Authors A. Sheshkus, A. Chirvonaya, D. Nikolaev, V. L. Arlazarov
Abstract In this paper, we suggest a new neural network architecture for vanishing point detection in images. The key element is the use of the direct and transposed Fast Hough Transforms separated by convolutional layer blocks with activation functions. It allows us to get the answer in the coordinates of the input image at the output of the network and thus to calculate the coordinates of the vanishing point by simply selecting the maximum. The use of integral operators enables the neural network to rely on global rectilinear features in the image, and so it is ideal for detecting vanishing points. To demonstrate the effectiveness of the proposed architecture, we use a set of images from a DVR and show its superiority over existing methods. Note, in addition, that the proposed neural network architecture essentially repeats the process of direct and back projection used, for example, in computed tomography.
Published 2020-02-04
URL https://arxiv.org/abs/2002.01176v1
PDF https://arxiv.org/pdf/2002.01176v1.pdf
PWC https://paperswithcode.com/paper/vanishing-point-detection-with-direct-and

Single-view 2D CNNs with Fully Automatic Non-nodule Categorization for False Positive Reduction in Pulmonary Nodule Detection

Title Single-view 2D CNNs with Fully Automatic Non-nodule Categorization for False Positive Reduction in Pulmonary Nodule Detection
Authors Hyunjun Eun, Daeyeong Kim, Chanho Jung, Changick Kim
Abstract Background and Objective: In pulmonary nodule detection, the first stage, candidate detection, aims to detect suspicious pulmonary nodules. However, detected candidates include many false positives and thus in the following stage, false positive reduction, such false positives are reliably reduced. Note that this task is challenging due to 1) the imbalance between the numbers of nodules and non-nodules and 2) the intra-class diversity of non-nodules. Although techniques using 3D convolutional neural networks (CNNs) have shown promising performance, they suffer from high computational complexity which hinders constructing deep networks. To efficiently address these problems, we propose a novel framework using the ensemble of 2D CNNs using single views, which outperforms existing 3D CNN-based methods. Methods: Our ensemble of 2D CNNs utilizes single-view 2D patches to improve both computational and memory efficiency compared to previous techniques exploiting 3D CNNs. We first categorize non-nodules on the basis of features encoded by an autoencoder. Then, all 2D CNNs are trained by using the same nodule samples, but with different types of non-nodules. By extending the learning capability, this training scheme resolves difficulties of extracting representative features from non-nodules with large appearance variations. Note that, instead of manual categorization requiring the heavy workload of radiologists, we propose to automatically categorize non-nodules based on the autoencoder and k-means clustering.
Published 2020-03-09
URL https://arxiv.org/abs/2003.04454v1
PDF https://arxiv.org/pdf/2003.04454v1.pdf
PWC https://paperswithcode.com/paper/single-view-2d-cnns-with-fully-automatic-non
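The training scheme in the abstract (cluster the non-nodules, then give each 2D CNN the same nodules but a different non-nodule category) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the two-dimensional "latent codes" stand in for autoencoder features and are made-up toy values, the k-means routine is a plain textbook version with fixed initial centroids, and every name is hypothetical.

```python
from typing import Dict, List, Tuple

Point = List[float]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], centroids: List[Point],
           iterations: int = 10) -> Tuple[List[Point], List[List[Point]]]:
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    clusters: List[List[Point]] = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: squared_distance(p, centroids[k]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [
            [sum(col) / len(cluster) for col in zip(*cluster)] if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Toy latent codes (assumed to come from an autoencoder; values are invented).
nodule_codes = [[2.0, 2.0], [2.1, 1.9]]
non_nodule_codes = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]

# Fixed initial centroids keep the example deterministic.
initial = [non_nodule_codes[0], non_nodule_codes[2]]
centroids, clusters = kmeans(non_nodule_codes, initial)

# One training set per ensemble member: same nodules, different non-nodule category.
training_sets: List[Dict[str, List[Point]]] = [
    {"nodules": nodule_codes, "non_nodules": cluster} for cluster in clusters
]

for i, ts in enumerate(training_sets):
    print(f"CNN {i}: {len(ts['nodules'])} nodules, {len(ts['non_nodules'])} non-nodules")
```

Each ensemble member then only has to separate nodules from one relatively homogeneous kind of non-nodule, which is the motivation the abstract gives for the categorization step.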

CAFENet: Class-Agnostic Few-Shot Edge Detection Network

Title CAFENet: Class-Agnostic Few-Shot Edge Detection Network
Authors Young-Hyun Park, Jun Seo, Jaekyun Moon
Abstract We tackle a novel few-shot learning challenge, which we call few-shot semantic edge detection, aiming to localize crisp boundaries of novel categories using only a few labeled samples. We also present a Class-Agnostic Few-shot Edge detection Network (CAFENet) based on a meta-learning strategy. CAFENet employs a small-scale semantic segmentation module to compensate for the lack of semantic information in edge labels. The predicted segmentation mask is used to generate an attention map that highlights the target object region and makes the decoder module concentrate on that region. We also propose a new regularization method based on multi-split matching. In meta-training, the metric-learning problem with high-dimensional vectors is divided into small subproblems with low-dimensional sub-vectors. Since there is no existing dataset for few-shot semantic edge detection, we construct two new datasets, FSE-1000 and SBD-$5^i$, and evaluate the performance of the proposed CAFENet on them. Extensive simulation results confirm the performance merits of the techniques adopted in CAFENet.
Tasks Edge Detection, Few-Shot Learning, Meta-Learning, Metric Learning, Semantic Segmentation
Published 2020-03-18
URL https://arxiv.org/abs/2003.08235v1
PDF https://arxiv.org/pdf/2003.08235v1.pdf
PWC https://paperswithcode.com/paper/cafenet-class-agnostic-few-shot-edge

Task-Adaptive Clustering for Semi-Supervised Few-Shot Classification

Title Task-Adaptive Clustering for Semi-Supervised Few-Shot Classification
Authors Jun Seo, Sung Whan Yoon, Jaekyun Moon
Abstract Few-shot learning aims to handle previously unseen tasks using only a small amount of new training data. In preparing (or meta-training) a few-shot learner, however, massive labeled data are necessary. In the real world, unfortunately, labeled data are expensive and/or scarce. In this work, we propose a few-shot learner that can work well under the semi-supervised setting where a large portion of training data is unlabeled. Our method employs explicit task-conditioning in which unlabeled sample clustering for the current task takes place in a new projection space different from the embedding feature space. The conditioned clustering space is linearly constructed so as to quickly close the gap between the class centroids for the current task and the independent per-class reference vectors meta-trained across tasks. In a more general setting, our method introduces a concept of controlling the degree of task-conditioning for meta-learning: the amount of task-conditioning varies with the number of repetitive updates for the clustering space. Extensive simulation results based on the miniImageNet and tieredImageNet datasets show state-of-the-art semi-supervised few-shot classification performance of the proposed method. Simulation results also indicate that the proposed task-adaptive clustering shows graceful degradation with a growing number of distractor samples, i.e., unlabeled sample images coming from outside the candidate classes.
Tasks Few-Shot Learning, Meta-Learning
Published 2020-03-18
URL https://arxiv.org/abs/2003.08221v1
PDF https://arxiv.org/pdf/2003.08221v1.pdf
PWC https://paperswithcode.com/paper/task-adaptive-clustering-for-semi-supervised

Semi-supervised few-shot learning for medical image segmentation

Title Semi-supervised few-shot learning for medical image segmentation
Authors Abdur R Feyjie, Reza Azad, Marco Pedersoli, Claude Kauffman, Ismail Ben Ayed, Jose Dolz
Abstract Recent years have witnessed the great progress of deep neural networks on semantic segmentation, particularly in medical imaging. Nevertheless, training high-performing models requires large amounts of pixel-level ground truth masks, which can be prohibitive to obtain in the medical domain. Furthermore, training such models in a low-data regime highly increases the risk of overfitting. Recent attempts to alleviate the need for large annotated datasets have developed training strategies under the few-shot learning paradigm, which addresses this shortcoming by learning a novel class from only a few labeled examples. In this context, a segmentation model is trained on episodes, which represent different segmentation problems, each of them trained with a very small labeled dataset. In this work, we propose a novel few-shot learning framework for semantic segmentation, where unlabeled images are also made available at each episode. To handle this new learning paradigm, we propose to include surrogate tasks that can leverage very powerful supervisory signals, derived from the data itself, for semantic feature learning. We show that including unlabeled surrogate tasks in the episodic training leads to more powerful feature representations, which ultimately results in better generalizability to unseen tasks. We demonstrate the efficiency of our method in the task of skin lesion segmentation in two publicly available datasets. Furthermore, our approach is general and model-agnostic, and can be combined with different deep architectures.
Tasks Few-Shot Learning, Lesion Segmentation, Medical Image Segmentation, Semantic Segmentation
Published 2020-03-18
URL https://arxiv.org/abs/2003.08462v1
PDF https://arxiv.org/pdf/2003.08462v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-few-shot-learning-for-medical

TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification

Title TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification
Authors Moshe Lichtenstein, Prasanna Sattigeri, Rogerio Feris, Raja Giryes, Leonid Karlinsky
Abstract The field of Few-Shot Learning (FSL), or learning from very few (typically 1 or 5) examples per novel class (unseen during training), has received a lot of attention and significant performance advances in the recent literature. While a number of techniques have been proposed for FSL, several factors have emerged as most important for FSL performance, awarding SOTA even to the simplest of techniques. These are: the backbone architecture (bigger is better), type of pre-training on the base classes (meta-training vs regular multi-class, currently regular wins), quantity and diversity of the base classes set (the more the merrier, resulting in richer and better adaptive features), and the use of self-supervised tasks during pre-training (serving as a proxy for increasing the diversity of the base set). In this paper we propose yet another simple technique that is important for few-shot learning performance: a search for a compact feature sub-space that is discriminative for a given few-shot test task. We show that the Task-Adaptive Feature Sub-Space Learning (TAFSSL) can significantly boost the performance in FSL scenarios when some additional unlabeled data accompanies the novel few-shot task, be it either the set of unlabeled queries (transductive FSL) or some additional set of unlabeled data samples (semi-supervised FSL). Specifically, we show that on the challenging miniImageNet and tieredImageNet benchmarks, TAFSSL can improve the current state-of-the-art in both transductive and semi-supervised FSL settings by more than 5%, while increasing the benefit of using unlabeled data in FSL to above 10% performance gain.
Tasks Few-Shot Learning
Published 2020-03-14
URL https://arxiv.org/abs/2003.06670v1
PDF https://arxiv.org/pdf/2003.06670v1.pdf
PWC https://paperswithcode.com/paper/tafssl-task-adaptive-feature-sub-space

Traffic Signs Detection and Recognition System using Deep Learning

Title Traffic Signs Detection and Recognition System using Deep Learning
Authors Pavly Salah Zaki, Marco Magdy William, Bolis Karam Soliman, Kerolos Gamal Alexsan, Keroles Khalil, Magdy El-Moursy
Abstract With the rapid development of technology, automobiles have become an essential asset in our day-to-day lives. One of the more important research areas is Traffic Signs Recognition (TSR) systems. This paper describes an approach for efficiently detecting and recognizing traffic signs in real-time, taking into account the various weather, illumination and visibility challenges through the means of transfer learning. We tackle the traffic sign detection problem using state-of-the-art multi-object detection systems such as Faster Recurrent Convolutional Neural Networks (F-RCNN) and Single Shot MultiBox Detector (SSD) combined with various feature extractors such as MobileNet v1 and Inception v2, and also Tiny-YOLOv2. However, the focus of this paper is going to be F-RCNN Inception v2 and Tiny YOLO v2 as they achieved the best results. The aforementioned models were fine-tuned on the German Traffic Signs Detection Benchmark (GTSDB) dataset. These models were tested on the host PC as well as Raspberry Pi 3 Model B+ and the TASS PreScan simulation. We will discuss the results of all the models in the conclusion section.
Tasks Object Detection, Transfer Learning
Published 2020-03-06
URL https://arxiv.org/abs/2003.03256v1
PDF https://arxiv.org/pdf/2003.03256v1.pdf
PWC https://paperswithcode.com/paper/traffic-signs-detection-and-recognition

Learning to Incorporate Structure Knowledge for Image Inpainting

Title Learning to Incorporate Structure Knowledge for Image Inpainting
Authors Jie Yang, Zhiquan Qi, Yong Shi
Abstract This paper develops a multi-task learning framework that attempts to incorporate the image structure knowledge to assist image inpainting, which is not well explored in previous works. The primary idea is to train a shared generator to simultaneously complete the corrupted image and corresponding structures (edge and gradient), thus implicitly encouraging the generator to exploit relevant structure knowledge while inpainting. In the meantime, we also introduce a structure embedding scheme to explicitly embed the learned structure features into the inpainting process, thus providing possible preconditions for image completion. Specifically, a novel pyramid structure loss is proposed to supervise structure learning and embedding. Moreover, an attention mechanism is developed to further exploit the recurrent structures and patterns in the image to refine the generated structures and contents. Through multi-task learning and structure embedding together with attention, our framework takes advantage of the structure knowledge and outperforms several state-of-the-art methods on benchmark datasets quantitatively and qualitatively.
Tasks Image Inpainting, Multi-Task Learning
Published 2020-02-11
URL https://arxiv.org/abs/2002.04170v2
PDF https://arxiv.org/pdf/2002.04170v2.pdf
PWC https://paperswithcode.com/paper/learning-to-incorporate-structure-knowledge

Interpretation and Simplification of Deep Forest

Title Interpretation and Simplification of Deep Forest
Authors Sangwon Kim, Mira Jeong, Byoung Chul Ko
Abstract This paper proposes a new method for interpreting and simplifying a black box model of a deep random forest (RF) using a proposed rule elimination. In deep RF, a large number of decision trees are connected to multiple layers, thereby making an analysis difficult. It has a high performance similar to that of a deep neural network (DNN), but achieves a better generalizability. Therefore, in this study, we consider quantifying the feature contributions and frequency of the fully trained deep RF in the form of a decision rule set. The feature contributions provide a basis for determining how features affect the decision process in a rule set. Model simplification is achieved by eliminating unnecessary rules by measuring the feature contributions. Consequently, the simplified model has fewer parameters and rules than before. Experiment results have shown that a feature contribution analysis allows a black box model to be decomposed for quantitatively interpreting a rule set. The proposed method was successfully applied to various deep RF models and benchmark datasets while maintaining a robust performance despite the elimination of a large number of rules.
Published 2020-01-14
URL https://arxiv.org/abs/2001.04721v2
PDF https://arxiv.org/pdf/2001.04721v2.pdf
PWC https://paperswithcode.com/paper/interpretation-and-simplification-of-deep