April 2, 2020

2997 words 15 mins read

Paper Group ANR 125

Towards Automatic Threat Detection: A Survey of Advances of Deep Learning within X-ray Security Imaging. IROF: a low resource evaluation metric for explanation methods. Accelerating Deep Learning Inference via Freezing. High-Order Paired-ASPP Networks for Semantic Segmentation. Asymmetric Rejection Loss for Fairer Face Recognition. An Approach for T …

Towards Automatic Threat Detection: A Survey of Advances of Deep Learning within X-ray Security Imaging

Title Towards Automatic Threat Detection: A Survey of Advances of Deep Learning within X-ray Security Imaging
Authors Samet Akcay, Toby Breckon
Abstract X-ray security screening is widely used to maintain aviation/transport security, and its importance has generated particular interest in automated screening systems. This paper reviews computerised X-ray security imaging algorithms by taxonomising the field into conventional machine learning and contemporary deep learning applications. The first part briefly discusses the classical machine learning approaches utilised within X-ray security imaging, while the latter part thoroughly investigates the use of modern deep learning algorithms. The proposed taxonomy sub-categorises the use of deep learning approaches into supervised, semi-supervised and unsupervised learning, with a particular focus on object classification, detection, segmentation and anomaly detection tasks. The paper further explores well-established X-ray datasets and provides a performance benchmark. Based on current and future trends in deep learning, the paper finally presents a discussion and future directions for X-ray security imagery.
Tasks Anomaly Detection, Object Classification
Published 2020-01-05
URL https://arxiv.org/abs/2001.01293v1
PDF https://arxiv.org/pdf/2001.01293v1.pdf
PWC https://paperswithcode.com/paper/towards-automatic-threat-detection-a-survey
Repo
Framework

IROF: a low resource evaluation metric for explanation methods

Title IROF: a low resource evaluation metric for explanation methods
Authors Laura Rieger, Lars Kai Hansen
Abstract The adoption of machine learning in health care hinges on the transparency of the algorithms used, necessitating explanation methods. However, despite a growing literature on explaining neural networks, no consensus has been reached on how to evaluate those explanation methods. We propose IROF, a new approach to evaluating explanation methods that circumvents the need for manual evaluation. Compared to other recent work, our approach requires several orders of magnitude less computational resources and no human input, making it accessible to lower-resource groups and robust to human bias.
Tasks
Published 2020-03-09
URL https://arxiv.org/abs/2003.08747v1
PDF https://arxiv.org/pdf/2003.08747v1.pdf
PWC https://paperswithcode.com/paper/irof-a-low-resource-evaluation-metric-for
Repo
Framework
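
IROF (iterative removal of features) is simple enough to sketch compactly: segment the image into superpixels, remove them in order of decreasing explanation relevance, and score the explanation by how quickly the class probability degrades. A minimal sketch follows, assuming a scikit-image SLIC segmentation and a `model` callable that returns class probabilities; the paper's exact imputation and normalisation choices may differ.

```python
# Minimal IROF sketch; `model` is a hypothetical callable returning class probabilities.
import numpy as np
from skimage.segmentation import slic

def irof_score(model, image, saliency, target_class, n_segments=100):
    """Area over the degradation curve when superpixels are removed
    in order of decreasing mean saliency."""
    segments = slic(image, n_segments=n_segments)          # superpixel map
    seg_ids = np.unique(segments)
    # Rank superpixels by the mean saliency assigned to their pixels.
    mean_sal = [saliency[segments == s].mean() for s in seg_ids]
    order = seg_ids[np.argsort(mean_sal)[::-1]]
    degraded = image.astype(float).copy()
    p0 = model(degraded)[target_class]                     # unperturbed score
    curve = []
    for s in order:
        degraded[segments == s] = image.mean(axis=(0, 1))  # mean-value imputation
        curve.append(model(degraded)[target_class] / p0)
    # Area over the curve is larger when the score drops quickly, i.e. when
    # the explanation found the truly important regions first.
    return 1.0 - np.mean(curve)
```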

Accelerating Deep Learning Inference via Freezing

Title Accelerating Deep Learning Inference via Freezing
Authors Adarsh Kumar, Arjun Balasubramanian, Shivaram Venkataraman, Aditya Akella
Abstract Over the last few years, Deep Neural Networks (DNNs) have become ubiquitous owing to their high accuracy on real-world tasks. However, this increase in accuracy comes at the cost of computationally expensive models leading to higher prediction latencies. Prior efforts to reduce this latency such as quantization, model distillation, and any-time prediction models typically trade-off accuracy for performance. In this work, we observe that caching intermediate layer outputs can help us avoid running all the layers of a DNN for a sizeable fraction of inference requests. We find that this can potentially reduce the number of effective layers by half for 91.58% of CIFAR-10 requests run on ResNet-18. We present Freeze Inference, a system that introduces approximate caching at each intermediate layer and we discuss techniques to reduce the cache size and improve the cache hit rate. Finally, we discuss some of the open research challenges in realizing such a design.
Tasks Quantization
Published 2020-02-07
URL https://arxiv.org/abs/2002.02645v1
PDF https://arxiv.org/pdf/2002.02645v1.pdf
PWC https://paperswithcode.com/paper/accelerating-deep-learning-inference-via
Repo
Framework
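
As a rough illustration of the approximate-caching idea in the abstract, the sketch below keys a small per-layer cache on intermediate activations and returns a cached label when a sufficiently close match exists, skipping the remaining layers. The cache structure, distance measure and threshold here are placeholder assumptions, not the paper's design.

```python
# Hypothetical approximate cache over intermediate activations (numpy arrays).
import numpy as np

class LayerCache:
    def __init__(self, threshold):
        self.keys, self.labels = [], []
        self.threshold = threshold

    def lookup(self, activation):
        if not self.keys:
            return None
        dists = np.linalg.norm(np.stack(self.keys) - activation, axis=1)
        i = int(np.argmin(dists))
        return self.labels[i] if dists[i] < self.threshold else None

    def insert(self, activation, label):
        self.keys.append(activation)
        self.labels.append(label)

def freeze_inference(layers, caches, x):
    """Run layers in order; return early on a cache hit at any layer."""
    h, acts = x, []
    for layer, cache in zip(layers, caches):
        h = layer(h)
        acts.append(h.ravel())
        hit = cache.lookup(acts[-1])
        if hit is not None:
            return hit                      # skip the remaining layers
    label = int(np.argmax(h))               # miss everywhere: full forward pass
    for cache, a in zip(caches, acts):      # populate caches for future requests
        cache.insert(a, label)
    return label
```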

High-Order Paired-ASPP Networks for Semantic Segmentation

Title High-Order Paired-ASPP Networks for Semantic Segmentation
Authors Yu Zhang, Xin Sun, Junyu Dong, Changrui Chen, Yue Shen
Abstract Current semantic segmentation models only exploit first-order statistics, while rarely exploring high-order statistics. However, common first-order statistics are insufficient to support a solid unanimous representation. In this paper, we propose the High-Order Paired-ASPP Network to exploit high-order statistics from various feature levels. The network first introduces a High-Order Representation module to extract contextual high-order information from all stages of the backbone. These representations provide more semantic clues and discriminative information than first-order ones. Besides, a Paired-ASPP module is proposed to embed high-order statistics of the early stages into the last stage. It further preserves the boundary-related and spatial context in the low-level features for the final prediction. Our experiments show that the high-order statistics significantly boost performance on confusing objects. Our method achieves competitive performance without bells and whistles on three benchmarks, i.e., Cityscapes, ADE20K and Pascal-Context, with mIoUs of 81.6%, 45.3% and 52.9%, respectively.
Tasks Semantic Segmentation
Published 2020-02-18
URL https://arxiv.org/abs/2002.07371v1
PDF https://arxiv.org/pdf/2002.07371v1.pdf
PWC https://paperswithcode.com/paper/high-order-paired-aspp-networks-for-semantic
Repo
Framework
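
One concrete instance of "high-order statistics" is second-order (covariance-style) pooling of a feature map. The PyTorch sketch below shows only that computation; the paper's High-Order Representation module and the Paired-ASPP wiring around it are more elaborate.

```python
# Second-order pooling sketch; channel counts are illustrative assumptions.
import torch
import torch.nn as nn

class SecondOrderRepresentation(nn.Module):
    def __init__(self, in_channels, reduced=64):
        super().__init__()
        self.reduce = nn.Conv2d(in_channels, reduced, kernel_size=1)

    def forward(self, x):                      # x: (B, C, H, W)
        f = self.reduce(x)                     # cut channels before the outer product
        b, c, h, w = f.shape
        f = f.flatten(2)                       # (B, c, H*W)
        f = f - f.mean(dim=2, keepdim=True)    # centre the descriptors
        cov = torch.bmm(f, f.transpose(1, 2)) / (h * w)  # (B, c, c) covariance
        return cov.flatten(1)                  # high-order descriptor per image
```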

Asymmetric Rejection Loss for Fairer Face Recognition

Title Asymmetric Rejection Loss for Fairer Face Recognition
Authors Haoyu Qin
Abstract Face recognition performance has seen a tremendous gain in recent years, mostly due to the availability of large-scale face image datasets that can be exploited by deep neural networks to learn powerful face representations. However, recent research has shown differences in face recognition performance across ethnic groups, mostly due to the racial imbalance in training datasets, where Caucasian identities largely dominate other ethnicities. This is symptomatic of the under-representation of non-Caucasian ethnic groups in the celebdom from which face datasets are usually gathered, rendering the acquisition of labeled data for the under-represented groups challenging. In this paper, we propose an Asymmetric Rejection Loss, which aims at making full use of unlabeled images of those under-represented groups to reduce the racial bias of face recognition models. We view each unlabeled image as a unique class; however, as we cannot guarantee that two unlabeled samples are from distinct classes, we exploit both labeled and unlabeled data in an asymmetric manner in our loss formalism. Extensive experiments show our method’s strength in mitigating racial bias, outperforming state-of-the-art semi-supervised methods. Performance on the under-represented ethnic groups increases while that on the well-represented group is nearly unchanged.
Tasks Face Recognition
Published 2020-02-09
URL https://arxiv.org/abs/2002.03276v1
PDF https://arxiv.org/pdf/2002.03276v1.pdf
PWC https://paperswithcode.com/paper/asymmetric-rejection-loss-for-fairer-face
Repo
Framework
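
The asymmetry in the loss can be sketched as follows: a labeled embedding is classified against the labeled class weights plus every unlabeled proxy, while an unlabeled embedding is rejected only against the labeled classes, never against other unlabeled samples (which may share its identity). The scale factor and proxy construction below are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative asymmetric loss sketch, not the paper's exact formalism.
import torch
import torch.nn.functional as F

def asymmetric_rejection_loss(emb_l, y, emb_u, W_l, scale=30.0):
    """emb_l: (Nl, d) labeled embeddings with labels y; emb_u: (Nu, d)
    unlabeled embeddings, each its own proxy; W_l: (Cl, d) class weights."""
    emb_l, emb_u = F.normalize(emb_l, dim=1), F.normalize(emb_u, dim=1)
    W_l = F.normalize(W_l, dim=1)
    # Labeled side: softmax over labeled classes plus all unlabeled proxies
    # (in practice the proxies would be detached / memory-banked).
    logits_l = scale * torch.cat([emb_l @ W_l.t(), emb_l @ emb_u.t()], dim=1)
    loss_l = F.cross_entropy(logits_l, y)
    # Unlabeled side (the asymmetric part): each sample is its own positive
    # and is rejected only against labeled classes, never other unlabeled.
    pos = scale * torch.ones(emb_u.size(0), 1, device=emb_u.device)
    logits_u = torch.cat([pos, scale * emb_u @ W_l.t()], dim=1)
    target_u = torch.zeros(emb_u.size(0), dtype=torch.long, device=emb_u.device)
    return loss_l + F.cross_entropy(logits_u, target_u)
```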

An Approach for Time-aware Domain-based Social Influence Prediction

Title An Approach for Time-aware Domain-based Social Influence Prediction
Authors Bilal Abu-Salih, Kit Yan Chan, Omar Al-Kadi, Marwan Al-Tawil, Pornpit Wongthongtham, Tomayess Issa, Heba Saadeh, Malak Al-Hassan, Bushra Bremie, Abdulaziz Albahlal
Abstract Online Social Networks (OSNs) have established virtual platforms enabling people to express their opinions, interests and thoughts in a variety of contexts and domains, allowing legitimate users as well as spammers and other untrustworthy users to publish and spread their content. Hence, the concept of social trust has attracted the attention of information processors/data scientists and information consumers/business firms. One of the main reasons for acquiring the value of Social Big Data (SBD) is to provide frameworks and methodologies with which the credibility of OSN users can be evaluated. These approaches should be scalable to accommodate large-scale social data, so there is a need for a thorough understanding of social trust to improve and expand the analysis process and the inference of SBD credibility. This paper therefore presents an approach that incorporates semantic analysis and machine learning modules to measure and predict users’ trustworthiness across numerous domains and different time periods. The evaluation of the conducted experiments validates the applicability of the incorporated machine learning techniques to predict highly trustworthy domain-based users.
Tasks
Published 2020-01-19
URL https://arxiv.org/abs/2001.07838v1
PDF https://arxiv.org/pdf/2001.07838v1.pdf
PWC https://paperswithcode.com/paper/an-approach-for-time-aware-domain-based
Repo
Framework

Improving Robustness of Deep-Learning-Based Image Reconstruction

Title Improving Robustness of Deep-Learning-Based Image Reconstruction
Authors Ankit Raj, Yoram Bresler, Bo Li
Abstract Deep-learning-based methods for different applications have been shown to be vulnerable to adversarial examples. These examples make the deployment of such models in safety-critical tasks questionable. The use of deep neural networks as inverse problem solvers has generated much excitement for medical imaging, including CT and MRI, but recently a similar vulnerability has also been demonstrated for these tasks. We show that for such inverse problem solvers, one should analyze and study the effect of adversaries in the measurement-space, instead of the signal-space as in previous work. In this paper, we propose to modify the training strategy of end-to-end deep-learning-based inverse problem solvers to improve robustness. We introduce an auxiliary network to generate adversarial examples, which is used in a min-max formulation to build robust image reconstruction networks. Theoretically, we show that for a linear reconstruction scheme the min-max formulation results in a singular-value filter regularized solution, which suppresses the effect of adversarial examples occurring because of ill-conditioning in the measurement matrix. We find that a linear network using the proposed min-max learning scheme indeed converges to the same solution. In addition, for non-linear Compressed Sensing (CS) reconstruction using deep networks, we show significant improvement in robustness using the proposed approach over other methods. We complement the theory with experiments for CS on two different datasets and evaluate the effect of increasing perturbations on trained networks. We find the behavior for ill-conditioned and well-conditioned measurement matrices to be qualitatively different.
Tasks Image Reconstruction
Published 2020-02-26
URL https://arxiv.org/abs/2002.11821v1
PDF https://arxiv.org/pdf/2002.11821v1.pdf
PWC https://paperswithcode.com/paper/improving-robustness-of-deep-learning-based-1
Repo
Framework
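
A schematic PyTorch training step for the min-max formulation described above: an auxiliary network `G` crafts measurement-space perturbations that maximise reconstruction error, while the solver `f` minimises it. The perturbation bound, optimisers and measurement operator `A` are placeholders, not the paper's exact setup.

```python
# One alternating min-max step; f, G, A and the optimisers are assumed given.
import torch

def minmax_step(f, G, opt_f, opt_G, x, A, eps=0.1):
    """x: ground-truth signals; A: measurement operator, y = A(x)."""
    y = A(x)
    # Inner maximisation: adversary proposes a bounded measurement perturbation.
    delta = eps * torch.tanh(G(y))                 # keep the perturbation bounded
    adv_loss = -torch.mean((f(y + delta) - x) ** 2)
    opt_G.zero_grad(); adv_loss.backward(); opt_G.step()
    # Outer minimisation: solver reconstructs from the perturbed measurements.
    delta = (eps * torch.tanh(G(y))).detach()
    rec_loss = torch.mean((f(y + delta) - x) ** 2)
    opt_f.zero_grad(); rec_loss.backward(); opt_f.step()
    return rec_loss.item()
```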

The Enron Corpus: Where the Email Bodies are Buried?

Title The Enron Corpus: Where the Email Bodies are Buried?
Authors David Noever
Abstract To probe the largest public-domain email database for indicators of fraud, we apply machine learning and accomplish four investigative tasks. First, we identify persons of interest (POI), using financial records and email, and report a peak accuracy of 95.7%. Secondly, we find any publicly exposed personally identifiable information (PII) and discover 50,000 previously unreported instances. Thirdly, we automatically flag legally responsive emails as scored by human experts in the California electricity blackout lawsuit, and find a peak 99% accuracy. Finally, we track three years of primary topics and sentiment across over 10,000 unique people before, during and after the onset of the corporate crisis. Where possible, we compare accuracy against execution times for 51 algorithms and report human-interpretable business rules that can scale to vast datasets.
Tasks
Published 2020-01-24
URL https://arxiv.org/abs/2001.10374v1
PDF https://arxiv.org/pdf/2001.10374v1.pdf
PWC https://paperswithcode.com/paper/the-enron-corpus-where-the-email-bodies-are
Repo
Framework
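
The first task (person-of-interest identification from financial and email-derived features) is a standard supervised classification problem; a toy scikit-learn version is sketched below. The feature columns and values are hypothetical, and the paper benchmarks 51 algorithms rather than this single one.

```python
# Toy POI classifier; rows and feature names are invented, not Enron data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Columns: salary, bonus, emails sent to known POIs, emails received from POIs.
X = np.array([[250_000, 600_000, 12, 30],
              [ 95_000,       0,  1,  2],
              [180_000, 300_000,  5, 11],
              [ 60_000,       0,  0,  1]], dtype=float)
y = np.array([1, 0, 1, 0])               # 1 = person of interest

# With the real dataset one would report cross-validated accuracy instead.
clf = GradientBoostingClassifier().fit(X, y)
print(clf.predict([[200_000, 450_000, 8, 20]]))
```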

Cell R-CNN V3: A Novel Panoptic Paradigm for Instance Segmentation in Biomedical Images

Title Cell R-CNN V3: A Novel Panoptic Paradigm for Instance Segmentation in Biomedical Images
Authors Dongnan Liu, Donghao Zhang, Yang Song, Heng Huang, Weidong Cai
Abstract Instance segmentation is an important task for biomedical image analysis. Due to complicated background components, high variability in object appearance, numerous overlapping objects, and ambiguous object boundaries, the task remains challenging. Recently, deep learning based methods have been widely employed to address these problems and can be categorized into proposal-free and proposal-based methods. However, both suffer from information loss, as they focus on either global-level semantic or local-level instance features. To tackle this issue, in this work we present a panoptic architecture that unifies the semantic and instance features. Specifically, our proposed method contains a residual attention feature fusion mechanism to incorporate the instance prediction with the semantic features, in order to facilitate semantic contextual learning in the instance branch. Then, a mask quality branch is designed to align the confidence score of each object with the quality of the mask prediction. Furthermore, a consistency regularization mechanism is designed between the semantic segmentation tasks in the semantic and instance branches, for the robust learning of both tasks. Extensive experiments demonstrate the effectiveness of our proposed method, which outperforms several state-of-the-art methods on various biomedical datasets.
Tasks Instance Segmentation, Medical Image Segmentation, Semantic Segmentation
Published 2020-02-15
URL https://arxiv.org/abs/2002.06345v1
PDF https://arxiv.org/pdf/2002.06345v1.pdf
PWC https://paperswithcode.com/paper/cell-r-cnn-v3-a-novel-panoptic-paradigm-for
Repo
Framework
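
A hedged PyTorch sketch of a residual-attention-style fusion in the spirit of the abstract: the instance-branch prediction gates the semantic features, which are then added back residually. The actual Cell R-CNN V3 module layout differs in its details.

```python
# Residual attention fusion sketch; channel count is an assumption.
import torch
import torch.nn as nn

class ResidualAttentionFusion(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, semantic_feat, instance_feat):
        attn = self.gate(instance_feat)               # attention from instance branch
        return semantic_feat + attn * semantic_feat   # residual, attention-weighted
```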

Tensor Decompositions in Deep Learning

Title Tensor Decompositions in Deep Learning
Authors Davide Bacciu, Danilo P. Mandic
Abstract The paper surveys the topic of tensor decompositions in modern machine learning applications. It focuses on three active research topics of significant relevance for the community. After a brief review of consolidated works on multi-way data analysis, we consider the use of tensor decompositions in compressing the parameter space of deep learning models. Lastly, we discuss how tensor methods can be leveraged to yield richer adaptive representations of complex data, including structured information. The paper concludes with a discussion on interesting open research challenges.
Tasks
Published 2020-02-26
URL https://arxiv.org/abs/2002.11835v1
PDF https://arxiv.org/pdf/2002.11835v1.pdf
PWC https://paperswithcode.com/paper/tensor-decompositions-in-deep-learning
Repo
Framework
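
The parameter-compression idea the survey covers can be illustrated with its simplest special case: factoring a dense weight matrix with a truncated SVD (the rank-R matrix analogue of the CP/Tucker decompositions discussed in the paper), replacing one large layer by two thin ones.

```python
# Low-rank compression of a dense layer; W is a stand-in for trained weights.
import numpy as np

W = np.random.randn(1024, 1024)          # stand-in for a trained weight matrix
R = 64                                   # chosen rank / compression level
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :R] * s[:R]                     # (1024, R)
B = Vt[:R, :]                            # (R, 1024)
# y = W x is approximated by y = A (B x): 2*1024*R vs 1024^2 parameters,
# an 8x reduction here, at the cost of approximation error governed by s[R:].
x = np.random.randn(1024)
err = np.linalg.norm(W @ x - A @ (B @ x)) / np.linalg.norm(W @ x)
print(f"relative error at rank {R}: {err:.3f}")
```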

Analysis of diversity-accuracy tradeoff in image captioning

Title Analysis of diversity-accuracy tradeoff in image captioning
Authors Ruotian Luo, Gregory Shakhnarovich
Abstract We investigate the effect of different model architectures, training objectives, hyperparameter settings and decoding procedures on the diversity of automatically generated image captions. Our results show that 1) simple decoding by naive sampling, coupled with a low temperature, is a competitive and fast method for producing diverse and accurate caption sets; 2) training with a CIDEr-based reward using reinforcement learning harms the diversity properties of the resulting generator, and this cannot be mitigated by manipulating decoding parameters. In addition, we propose a new metric, AllSPICE, for evaluating both the accuracy and the diversity of a set of captions with a single value.
Tasks Image Captioning
Published 2020-02-27
URL https://arxiv.org/abs/2002.11848v1
PDF https://arxiv.org/pdf/2002.11848v1.pdf
PWC https://paperswithcode.com/paper/analysis-of-diversity-accuracy-tradeoff-in
Repo
Framework
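
Finding 1) above is easy to make concrete: ancestral ("naive") sampling with a low softmax temperature. In the sketch below, `step` is a hypothetical function returning next-token logits for a prefix.

```python
# Temperature sampling sketch; `step`, `bos` and `eos` are assumed inputs.
import numpy as np

def sample_caption(step, bos, eos, temperature=0.5, max_len=20):
    tokens = [bos]
    for _ in range(max_len):
        logits = step(tokens) / temperature      # low T sharpens, high T diversifies
        probs = np.exp(logits - logits.max())    # numerically stable softmax
        probs /= probs.sum()
        nxt = int(np.random.choice(len(probs), p=probs))
        tokens.append(nxt)
        if nxt == eos:
            break
    return tokens
```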

Image to Language Understanding: Captioning approach

Title Image to Language Understanding: Captioning approach
Authors Madhavan Seshadri, Malavika Srikanth, Mikhail Belov
Abstract Extracting context from visual representations is of utmost importance in the advancement of computer science, and representing it in natural language has a huge variety of applications, such as aiding the visually impaired. Such an approach combines computer vision and natural language techniques, which makes it a hard problem to solve. This project compares different approaches to the image captioning problem. Specifically, the focus was on comparing two different types of models: an encoder-decoder approach and a multi-modal approach. Within the encoder-decoder approach, inject and merge architectures were compared against a multi-modal image captioning approach based primarily on object detection. These approaches have been compared on the basis of state-of-the-art sentence comparison metrics such as BLEU, GLEU, Meteor, and Rouge, on a subset of the Google Conceptual Captions dataset containing 100k images. On the basis of this comparison, we observed that the best model was the Inception-injected encoder model. This approach has been deployed as a web-based system: on uploading an image, the system outputs the best caption associated with it.
Tasks Image Captioning, Object Detection
Published 2020-02-21
URL https://arxiv.org/abs/2002.09536v1
PDF https://arxiv.org/pdf/2002.09536v1.pdf
PWC https://paperswithcode.com/paper/image-to-language-understanding-captioning
Repo
Framework
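
The inject-versus-merge distinction compared in the paper boils down to where the image feature enters the language model: inject feeds it into the RNN as if it were the first token, while merge keeps vision and language separate until a late fusion. A compact PyTorch sketch with placeholder sizes and no CNN encoder:

```python
# Inject vs merge captioners; dimensions and fusion choice are assumptions.
import torch
import torch.nn as nn

class InjectCaptioner(nn.Module):
    def __init__(self, vocab, d=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.rnn = nn.LSTM(d, d, batch_first=True)
        self.out = nn.Linear(d, vocab)

    def forward(self, img_feat, tokens):             # img_feat: (B, d)
        seq = torch.cat([img_feat.unsqueeze(1), self.embed(tokens)], dim=1)
        h, _ = self.rnn(seq)                         # image "word" starts the sequence
        return self.out(h)

class MergeCaptioner(InjectCaptioner):
    def forward(self, img_feat, tokens):
        h, _ = self.rnn(self.embed(tokens))          # language model runs alone
        fused = h + img_feat.unsqueeze(1)            # late fusion before the softmax
        return self.out(fused)
```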

A Combined Stochastic and Physical Framework for Modeling Indoor 5G Millimeter Wave Propagation

Title A Combined Stochastic and Physical Framework for Modeling Indoor 5G Millimeter Wave Propagation
Authors Georges Nassif, Catherine Gloaguen, Philippe Martins
Abstract Indoor coverage is a major challenge for 5G millimeter waves (mmWaves). In this paper, we address this problem through a novel theoretical framework that combines stochastic indoor environment modeling with advanced physical propagation simulation. This approach is particularly adapted to investigating indoor-to-indoor 5G mmWave propagation. Its system implementation, iGeoStat, generates parameterized typical environments that account for indoor spatial variations, then simulates radio propagation based on the physical interaction between electromagnetic waves and material properties. This framework is not dedicated to a particular environment, material, frequency or use case, and aims to statistically understand the influence of indoor environment parameters on mmWave propagation properties, especially coverage and path loss. Its implementation raises numerous computational challenges that we solve by formulating an adapted link budget and designing new memory optimization algorithms. The first simulation results for two major 5G applications are validated with measurement data and show the efficiency of iGeoStat in simulating multiple diffusion in realistic environments within a reasonable amount of time and memory. Generated output maps confirm that diffusion has a critical impact on indoor mmWave propagation and that proper physical modeling is of the utmost importance to generate relevant propagation models.
Tasks
Published 2020-02-12
URL https://arxiv.org/abs/2002.05162v2
PDF https://arxiv.org/pdf/2002.05162v2.pdf
PWC https://paperswithcode.com/paper/a-combined-stochastic-and-physical-framework
Repo
Framework
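
For intuition about why indoor mmWave coverage is hard, a generic free-space link budget already tells much of the story. The helper below is textbook free-space path loss, not iGeoStat's physical model, which additionally simulates diffusion and material interactions.

```python
# Generic link-budget helper for intuition only.
import math

def fspl_db(distance_m, freq_hz):
    """Free-space path loss: 20 log10(4*pi*d*f/c)."""
    c = 3e8
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / c)

def received_power_dbm(pt_dbm, gt_dbi, gr_dbi, distance_m, freq_hz):
    return pt_dbm + gt_dbi + gr_dbi - fspl_db(distance_m, freq_hz)

# At 10 m indoors, FSPL is ~88 dB at 60 GHz versus ~60 dB at 2.4 GHz --
# a 28 dB gap before walls and furniture are even considered.
print(fspl_db(10, 60e9), fspl_db(10, 2.4e9))
```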

Gaussian Smoothen Semantic Features (GSSF) – Exploring the Linguistic Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO Framework

Title Gaussian Smoothen Semantic Features (GSSF) – Exploring the Linguistic Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO Framework
Authors Chiranjib Sur
Abstract In this work, we introduce Gaussian Smoothen Semantic Features (GSSF) for better semantic selection in Indian regional-language image captioning, and introduce a procedure in which we use existing translations and English crowd-sourced sentences for training. We show that this architecture is a promising alternative where resources are scarce. The main contribution of this work is the development of deep learning architectures for Bengali (the fifth most widely spoken language in the world), which has completely different grammar and language attributes. We show that these architectures work well for complex applications like language generation from image contexts, and can diversify the representation by introducing constraints, more extensive features, and unique feature spaces. We also establish that we can achieve both precision and diversity when we use the smoothened semantic tensor with traditional LSTM and feature decomposition networks. With this learning architecture, we succeeded in establishing an automated algorithm and assessment procedure that can help evaluate applications without the requirement for expertise and human intervention.
Tasks Image Captioning, Text Generation
Published 2020-02-16
URL https://arxiv.org/abs/2002.06701v1
PDF https://arxiv.org/pdf/2002.06701v1.pdf
PWC https://paperswithcode.com/paper/gaussian-smoothen-semantic-features-gssf
Repo
Framework
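
A loose sketch of the smoothing idea in the name, assuming GSSF amounts to Gaussian-smoothing a semantic tag-confidence vector before it conditions the decoder; the paper's actual feature construction and decomposition networks go well beyond this.

```python
# Assumed interpretation of the smoothing step; not the paper's exact GSSF.
import numpy as np
from scipy.ndimage import gaussian_filter1d

def gaussian_smoothed_semantics(tag_probs, sigma=1.5):
    """tag_probs: (V,) detector confidences over a semantic tag vocabulary."""
    smoothed = gaussian_filter1d(tag_probs, sigma=sigma)  # spread mass to neighbours
    return smoothed / smoothed.sum()                      # renormalise
```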

Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings

Title Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings
Authors Shweta Mahajan, Iryna Gurevych, Stefan Roth
Abstract Learned joint representations of images and text form the backbone of several important cross-domain tasks such as image captioning. Prior work mostly maps both domains into a common latent representation in a purely supervised fashion. This is rather restrictive, however, as the two domains follow distinct generative processes. Therefore, we propose a novel semi-supervised framework, which models shared information between domains and domain-specific information separately. The information shared between the domains is aligned with an invertible neural network. Our model integrates normalizing flow-based priors for the domain-specific information, which allows us to learn diverse many-to-many mappings between the two domains. We demonstrate the effectiveness of our model on diverse tasks, including image captioning and text-to-image synthesis.
Tasks Image Captioning, Image Generation
Published 2020-02-16
URL https://arxiv.org/abs/2002.06661v1
PDF https://arxiv.org/pdf/2002.06661v1.pdf
PWC https://paperswithcode.com/paper/latent-normalizing-flows-for-many-to-many-1
Repo
Framework
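
A self-contained sketch of one affine coupling layer, the standard building block behind normalizing-flow priors like those the paper places over the domain-specific latents; the full model stacks many such layers, and its exact flow architecture may differ.

```python
# Standard affine coupling layer (RealNVP-style); dim must be even here.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, hidden), nn.ReLU(),
                                 nn.Linear(hidden, dim))   # predicts scale and shift

    def forward(self, z):
        z1, z2 = z.chunk(2, dim=1)
        s, t = self.net(z1).chunk(2, dim=1)
        s = torch.tanh(s)                       # stabilise the log-scale
        y2 = z2 * torch.exp(s) + t              # invertible transform given z1
        log_det = s.sum(dim=1)                  # log|det Jacobian| for the density
        return torch.cat([z1, y2], dim=1), log_det
```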