April 2, 2020

# Paper Group ANR 152

A Light Field Camera Calibration Method Using Sub-Aperture Related Bipartition Projection Model and 4D Corner Detection. Preventing Clean Label Poisoning using Gaussian Mixture Loss. Point Spread Function Modelling for Wide Field Small Aperture Telescopes with a Denoising Autoencoder. Generative Modeling with Denoising Auto-Encoders and Langevin Sa …

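#### A Light Field Camera Calibration Method Using Sub-Aperture Related Bipartition Projection Model and 4D Corner Detection

Title A Light Field Camera Calibration Method Using Sub-Aperture Related Bipartition Projection Model and 4D Corner Detection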
Authors Dongyang Jin, Saiping Zhang, Xiao Huo, Wei Zhang, Fuzheng Yang
Abstract Accurate calibration of the intrinsic parameters of a light field (LF) camera is a key issue in many applications, especially 3D reconstruction. In this paper, we propose the Sub-Aperture Related Bipartition (SARB) projection model to characterize the LF camera. This projection model is composed of two sets of parameters, one targeting the center-view sub-aperture and the other the relations between sub-apertures. Moreover, we also propose a corner point detection algorithm which fully utilizes the 4D LF information in the raw image. Experimental results have demonstrated the accuracy and robustness of the corner detection method. Both the 2D re-projection errors in the lateral direction and the errors in the depth direction are minimized because the two sets of parameters in the SARB projection model are solved separately.
Published 2020-01-11
URL https://arxiv.org/abs/2001.03734v1
PDF https://arxiv.org/pdf/2001.03734v1.pdf
PWC https://paperswithcode.com/paper/a-light-field-camera-calibration-method-using
Repo
Framework

#### Preventing Clean Label Poisoning using Gaussian Mixture Loss

Title Preventing Clean Label Poisoning using Gaussian Mixture Loss
Abstract Since 2014, when Szegedy et al. showed that carefully designed perturbations of the input can lead Deep Neural Networks (DNNs) to misclassify it, there has been ongoing research to make DNNs more robust to such malicious perturbations. In this work, we consider a poisoning attack called the Clean Labeling poisoning attack (CLPA). The goal of CLPA is to inject seemingly benign instances which can drastically change the decision boundary of the DNN, so that subsequent queries at test time can be misclassified. We argue that a strong defense against CLPA can be embedded into the model during training by constraining the features of the network to follow a Large Margin Gaussian Mixture (LGM) distribution in the penultimate layer. With such prior knowledge, we can systematically evaluate how unusual an example is, given the label it is claiming to be. We demonstrate our built-in defense via experiments on MNIST and CIFAR datasets. We train two models on each dataset: one trained via softmax, another via LGM. We show that using LGM can substantially reduce the effectiveness of CLPA while having no additional overhead of data sanitization. The code to reproduce our results is available online.
Published 2020-02-10
URL https://arxiv.org/abs/2003.00798v1
PDF https://arxiv.org/pdf/2003.00798v1.pdf
PWC https://paperswithcode.com/paper/preventing-clean-label-poisoning-using
Repo
Framework
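
The idea of scoring "how unusual the example is, given the label it is claiming to be" can be illustrated with a class-conditional Gaussian log-likelihood. The sketch below is only an illustration: the diagonal covariance, the fixed threshold, and the names `gaussian_log_likelihood` and `is_suspicious` are assumptions, not the detection rule from the paper.

```python
import math


def gaussian_log_likelihood(feature, mean, var):
    """Log-density of a diagonal-covariance Gaussian evaluated at `feature`."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (f - m) ** 2 / v)
        for f, m, v in zip(feature, mean, var)
    )


def is_suspicious(feature, claimed_label, class_means, class_vars, threshold):
    """Flag a sample whose features are unlikely under its claimed class component.

    `threshold` is a hypothetical cut-off on the log-likelihood (an assumption).
    """
    ll = gaussian_log_likelihood(
        feature, class_means[claimed_label], class_vars[claimed_label]
    )
    return ll < threshold


# Toy penultimate-layer features: two classes with well-separated means.
class_means = {0: [0.0, 0.0], 1: [4.0, 4.0]}
class_vars = {0: [1.0, 1.0], 1: [1.0, 1.0]}
feature = [3.8, 4.1]  # lies close to the class-1 mean

flag_as_class_0 = is_suspicious(feature, 0, class_means, class_vars, threshold=-10.0)
flag_as_class_1 = is_suspicious(feature, 1, class_means, class_vars, threshold=-10.0)
```

In this toy setup the same feature vector is flagged when it claims label 0 and accepted when it claims label 1, which is the kind of consistency check the abstract describes.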

#### Point Spread Function Modelling for Wide Field Small Aperture Telescopes with a Denoising Autoencoder

Title Point Spread Function Modelling for Wide Field Small Aperture Telescopes with a Denoising Autoencoder
Authors Peng Jia, Xiyu Li, Zhengyang Li, Weinan Wang, Dongmei Cai
Published 2020-01-31
URL https://arxiv.org/abs/2001.11716v2
PDF https://arxiv.org/pdf/2001.11716v2.pdf
Repo
Framework

#### Generative Modeling with Denoising Auto-Encoders and Langevin Sampling

Title Generative Modeling with Denoising Auto-Encoders and Langevin Sampling
Authors Adam Block, Youssef Mroueh, Alexander Rakhlin
Abstract We study convergence of a generative modeling method that first estimates the score function of the distribution using Denoising Auto-Encoders (DAE) or Denoising Score Matching (DSM) and then employs Langevin diffusion for sampling. We show that both DAE and DSM provide estimates of the score of the Gaussian smoothed population density, allowing us to apply the machinery of Empirical Processes. We overcome the challenge of relying only on $L^2$ bounds on the score estimation error and provide finite-sample bounds in the Wasserstein distance between the law of the population distribution and the law of this sampling scheme. We then apply our results to the homotopy method of arXiv:1907.05600 and provide theoretical justification for its empirical success.
Published 2020-01-31
URL https://arxiv.org/abs/2002.00107v2
PDF https://arxiv.org/pdf/2002.00107v2.pdf
PWC https://paperswithcode.com/paper/generative-modeling-with-denoising-auto
Repo
Framework
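
The sampling stage described in the abstract can be sketched with the unadjusted Langevin iteration, x ← x + η·s(x) + √(2η)·ξ, where s is the score and ξ is standard Gaussian noise. In the sketch below the exact score of a one-dimensional Gaussian stands in for a score learned by a DAE or DSM; the target parameters, step size, and chain counts are all assumptions chosen for illustration.

```python
import math
import random


def gaussian_score(x, mu=1.0, sigma=0.5):
    """Exact score (gradient of the log-density) of N(mu, sigma^2).

    Stand-in for a learned score estimate; an assumption of this sketch.
    """
    return -(x - mu) / sigma**2


def langevin_sample(score, x0, step, n_steps, rng):
    """Unadjusted Langevin iteration: x <- x + step * score(x) + sqrt(2 * step) * noise."""
    x = x0
    for _ in range(n_steps):
        x = x + step * score(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
    return x


def run_chains(n_chains=2000, step=0.01, n_steps=300, seed=0):
    """Run independent chains from x0 = 0 and return their final states."""
    rng = random.Random(seed)
    return [
        langevin_sample(gaussian_score, 0.0, step, n_steps, rng)
        for _ in range(n_chains)
    ]


samples = run_chains()
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

With a small step size the empirical mean and variance of the final states should land near those of the target (1.0 and 0.25 here), up to discretization bias and Monte Carlo error.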

#### Kernel Bi-Linear Modeling for Reconstructing Data on Manifolds: The Dynamic-MRI Case

Title Kernel Bi-Linear Modeling for Reconstructing Data on Manifolds: The Dynamic-MRI Case
Authors Gaurav N. Shetty, Konstantinos Slavakis, Ukash Nakarmi, Gesualdo Scutari, Leslie Ying
Abstract This paper establishes a kernel-based framework for reconstructing data on manifolds, tailored to fit the dynamic-(d)MRI-data recovery problem. The proposed methodology exploits simple tangent-space geometries of manifolds in reproducing kernel Hilbert spaces and follows classical kernel-approximation arguments to form the data-recovery task as a bi-linear inverse problem. Departing from mainstream approaches, the proposed methodology uses no training data, employs no graph Laplacian matrix to penalize the optimization task, uses no costly (kernel) pre-imaging step to map feature points back to the input space, and utilizes complex-valued kernel functions to account for k-space data. The framework is validated on synthetically generated dMRI data, where comparisons against state-of-the-art schemes highlight the rich potential of the proposed approach in data-recovery problems.
Published 2020-02-27
URL https://arxiv.org/abs/2002.11885v1
PDF https://arxiv.org/pdf/2002.11885v1.pdf
PWC https://paperswithcode.com/paper/kernel-bi-linear-modeling-for-reconstructing
Repo
Framework

#### Data-Driven Symbol Detection via Model-Based Machine Learning

Title Data-Driven Symbol Detection via Model-Based Machine Learning
Authors Nariman Farsad, Nir Shlezinger, Andrea J. Goldsmith, Yonina C. Eldar
Abstract The design of symbol detectors in digital communication systems has traditionally relied on statistical channel models that describe the relation between the transmitted symbols and the observed signal at the receiver. Here we review a data-driven framework for symbol detection design that combines machine learning (ML) and model-based algorithms. In this hybrid approach, well-known channel-model-based algorithms such as the Viterbi method, BCJR detection, and multiple-input multiple-output (MIMO) soft interference cancellation (SIC) are augmented with ML-based algorithms to remove their channel-model-dependence, allowing the receiver to learn to implement these algorithms solely from data. The resulting data-driven receivers are most suitable for systems where the underlying channel models are poorly understood, highly complex, or do not adequately capture the underlying physics. Our approach is unique in that it only replaces the channel-model-based computations with dedicated neural networks that can be trained from a small amount of data, while keeping the general algorithm intact. Our results demonstrate that these techniques can yield near-optimal performance of model-based algorithms without knowing the exact channel input-output statistical relationship and in the presence of channel state information uncertainty.
Published 2020-02-14
URL https://arxiv.org/abs/2002.07806v1
PDF https://arxiv.org/pdf/2002.07806v1.pdf
PWC https://paperswithcode.com/paper/data-driven-symbol-detection-via-model-based
Repo
Framework
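
The split between "general algorithm" and "channel-model-based computations" can be made concrete with a Viterbi detector whose branch metric is passed in as a function. This is a minimal sketch under stated assumptions: a two-tap channel with BPSK symbols, a known symbol preceding the block, and a Gaussian metric standing in for what would be a trained network in the data-driven setting. The names `viterbi_detect` and `gaussian_branch_metric` are hypothetical.

```python
def viterbi_detect(observations, symbols, branch_metric, initial_symbol):
    """Most likely symbol sequence for a channel with one symbol of memory.

    branch_metric(y, prev_symbol, cur_symbol) returns a log-likelihood-like score.
    It is the only channel-dependent piece; a learned model could be supplied here.
    """
    # Trellis state = most recent symbol; keep the best score and survivor path per state.
    scores = {initial_symbol: 0.0}
    paths = {initial_symbol: []}
    for y in observations:
        new_scores, new_paths = {}, {}
        for cur in symbols:
            best_prev = max(
                scores, key=lambda prev: scores[prev] + branch_metric(y, prev, cur)
            )
            new_scores[cur] = scores[best_prev] + branch_metric(y, best_prev, cur)
            new_paths[cur] = paths[best_prev] + [cur]
        scores, paths = new_scores, new_paths
    best_final = max(scores, key=scores.get)
    return paths[best_final]


def gaussian_branch_metric(y, prev, cur, taps=(1.0, 0.5)):
    """Model-based metric for y = taps[0]*cur + taps[1]*prev + Gaussian noise (assumed channel)."""
    expected = taps[0] * cur + taps[1] * prev
    return -((y - expected) ** 2)


# Simulate a short block through the assumed channel with small fixed perturbations.
transmitted = [1, -1, -1, 1, 1, -1]
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1]
received = []
prev = -1  # symbol assumed to precede the block
for s, n in zip(transmitted, noise):
    received.append(1.0 * s + 0.5 * prev + n)
    prev = s

detected = viterbi_detect(received, (-1, 1), gaussian_branch_metric, initial_symbol=-1)
```

Swapping `gaussian_branch_metric` for a function backed by a trained model leaves `viterbi_detect` unchanged, which mirrors the design choice the abstract emphasizes.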

#### Named Entities in Medical Case Reports: Corpus and Experiments

Title Named Entities in Medical Case Reports: Corpus and Experiments
Authors Sarah Schulz, Jurica Ševa, Samuel Rodriguez, Malte Ostendorff, Georg Rehm
Abstract We present a new corpus comprising annotations of medical entities in case reports, originating from PubMed Central’s open access library. In the case reports, we annotate cases, conditions, findings, factors and negation modifiers. Moreover, where applicable, we annotate relations between these entities. As such, this is the first corpus of this kind made available to the scientific community in English. It enables the initial investigation of automatic information extraction from case reports through tasks like Named Entity Recognition, Relation Extraction and (sentence/paragraph) relevance detection. Additionally, we present four strong baseline systems for the detection of medical entities made available through the annotated dataset.
Tasks Named Entity Recognition, Relation Extraction
Published 2020-03-29
URL https://arxiv.org/abs/2003.13032v1
PDF https://arxiv.org/pdf/2003.13032v1.pdf
PWC https://paperswithcode.com/paper/named-entities-in-medical-case-reports-corpus
Repo
Framework

#### Neural Relation Prediction for Simple Question Answering over Knowledge Graph

Title Neural Relation Prediction for Simple Question Answering over Knowledge Graph
Authors Amin Abolghasemi, Saeedeh Momtazi
Abstract Relation extraction from simple questions aims to capture the relation of a factoid question with one underlying relation from a set of predefined ones in a knowledge base. Most recent methods take advantage of neural networks for matching a question with all relations in order to find the best relation that is expressed by that question. In this paper, we propose an instance-based method to find questions similar to a new question, in the sense of their relations, to predict its mentioned relation. The motivation is rooted in the fact that a relation can be expressed with different forms of question and these forms mostly share similar terms or concepts. Our experiments on the SimpleQuestions dataset show that the proposed model achieved better accuracy compared to the state-of-the-art relation extraction models.
Published 2020-02-18
URL https://arxiv.org/abs/2002.07715v1
PDF https://arxiv.org/pdf/2002.07715v1.pdf
PWC https://paperswithcode.com/paper/neural-relation-prediction-for-simple
Repo
Framework

#### Learning-Aided Deep Path Prediction for Sphere Decoding in Large MIMO Systems

Title Learning-Aided Deep Path Prediction for Sphere Decoding in Large MIMO Systems
Authors Doyeon Weon, Kyungchun Lee
Abstract In this paper, we propose a novel learning-aided sphere decoding (SD) scheme for large multiple-input–multiple-output systems, namely, deep path prediction-based sphere decoding (DPP-SD). In this scheme, we employ a neural network (NN) to predict the minimum metrics of the “deep” paths in sub-trees before commencing the tree search in SD. To reduce the complexity of the NN, we employ the input vector with a reduced dimension rather than using the original received signals and full channel matrix. The outputs of the NN, i.e., the predicted minimum path metrics, are exploited to determine the search order between the sub-trees, as well as to optimize the initial search radius, which may reduce the computational complexity of SD. For further complexity reduction, an early termination scheme based on the predicted minimum path metrics is also proposed. Our simulation results show that the proposed DPP-SD scheme provides a significant reduction in computational complexity compared with the conventional SD algorithm, while achieving near-optimal performance.
Published 2020-01-02
URL https://arxiv.org/abs/2001.00342v1
PDF https://arxiv.org/pdf/2001.00342v1.pdf
PWC https://paperswithcode.com/paper/learning-aided-deep-path-prediction-for
Repo
Framework

#### Learning a Weakly-Supervised Video Actor-Action Segmentation Model with a Wise Selection

Title Learning a Weakly-Supervised Video Actor-Action Segmentation Model with a Wise Selection
Authors Jie Chen, Zhiheng Li, Jiebo Luo, Chenliang Xu
Abstract We address weakly-supervised video actor-action segmentation (VAAS), which extends general video object segmentation (VOS) to additionally consider action labels of the actors. The most successful methods on VOS synthesize a pool of pseudo-annotations (PAs) and then refine them iteratively. However, they face challenges as to how to select from a massive amount of PAs high-quality ones, how to set an appropriate stop condition for weakly-supervised training, and how to initialize PAs pertaining to VAAS. To overcome these challenges, we propose a general Weakly-Supervised framework with a Wise Selection of training samples and model evaluation criterion (WS^2). Instead of blindly trusting quality-inconsistent PAs, WS^2 employs a learning-based selection to select effective PAs and a novel region integrity criterion as a stopping condition for weakly-supervised training. In addition, a 3D-Conv GCAM is devised to adapt to the VAAS task. Extensive experiments show that WS^2 achieves state-of-the-art performance on both weakly-supervised VOS and VAAS tasks and is on par with the best fully-supervised method on VAAS.
Tasks action segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published 2020-03-29
URL https://arxiv.org/abs/2003.13141v1
PDF https://arxiv.org/pdf/2003.13141v1.pdf
PWC https://paperswithcode.com/paper/learning-a-weakly-supervised-video-actor
Repo
Framework

#### Semi-supervised Development of ASR Systems for Multilingual Code-switched Speech in Under-resourced Languages

Title Semi-supervised Development of ASR Systems for Multilingual Code-switched Speech in Under-resourced Languages
Authors Astik Biswas, Emre Yılmaz, Febe de Wet, Ewald van der Westhuizen, Thomas Niesler
Abstract This paper reports on the semi-supervised development of acoustic and language models for under-resourced, code-switched speech in five South African languages. Two approaches are considered. The first constructs four separate bilingual automatic speech recognisers (ASRs) corresponding to four different language pairs between which speakers switch frequently. The second uses a single, unified, five-lingual ASR system that represents all the languages (English, isiZulu, isiXhosa, Setswana and Sesotho). We evaluate the effectiveness of these two approaches when used to add additional data to our extremely sparse training sets. Results indicate that batch-wise semi-supervised training yields better results than a non-batch-wise approach. Furthermore, while the separate bilingual systems achieved better recognition performance than the unified system, they benefited more from pseudo-labels generated by the five-lingual system than from those generated by the bilingual systems.
Published 2020-03-06
URL https://arxiv.org/abs/2003.03135v1
PDF https://arxiv.org/pdf/2003.03135v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-development-of-asr-systems
Repo
Framework

#### Automated Methods for Detection and Classification Pneumonia based on X-Ray Images Using Deep Learning

Title Automated Methods for Detection and Classification Pneumonia based on X-Ray Images Using Deep Learning
Authors Khalid El Asnaoui, Youness Chawki, Ali Idri
Abstract Recently, researchers, specialists, and companies around the world have been rolling out deep learning and image processing-based systems that can quickly process hundreds of X-Ray and computed tomography (CT) images to accelerate the diagnosis of pneumonia, such as that seen in SARS or COVID-19, and aid in its containment. Medical image analysis is one of the most promising research areas; it supports diagnosis and decision-making for a number of diseases such as MERS and COVID-19. In this paper, we present a comparison of recent Deep Convolutional Neural Network (DCNN) architectures for automatic binary classification of pneumonia images based on fine-tuned versions of VGG16, VGG19, DenseNet201, Inception_ResNet_V2, Inception_V3, Resnet50, MobileNet_V2 and Xception. The proposed work has been tested using a chest X-Ray & CT dataset that contains 5856 images (4273 pneumonia and 1583 normal). As a result, we conclude that the fine-tuned versions of Resnet50, MobileNet_V2 and Inception_Resnet_V2 show highly satisfactory performance, with training and validation accuracy rising to more than 96%. Unlike CNN, Xception, VGG16, VGG19, Inception_V3 and DenseNet201 display lower performance (more than 84% accuracy).
Published 2020-03-31
URL https://arxiv.org/abs/2003.14363v1
PDF https://arxiv.org/pdf/2003.14363v1.pdf
PWC https://paperswithcode.com/paper/automated-methods-for-detection-and
Repo
Framework

#### RDAnet: A Deep Learning Based Approach for Synthetic Aperture Radar Image Formation

Title RDAnet: A Deep Learning Based Approach for Synthetic Aperture Radar Image Formation
Authors Andrew Rittenbach, John Paul Walters
Abstract Synthetic Aperture Radar (SAR) imaging systems operate by emitting radar signals from a moving object, such as a satellite, towards the target of interest. Reflected radar echoes are received and later used by image formation algorithms to form a SAR image. There is great interest in using SAR images in computer vision tasks such as automatic target recognition. Today, however, SAR applications consist of multiple operations: image formation followed by image processing. In this work, we show that deep learning can be used to train a neural network able to form SAR images from echo data. Results show that our neural network, RDAnet, can form SAR images comparable to images formed using a traditional algorithm. This approach opens the possibility to end-to-end SAR applications where image formation and image processing are integrated into a single task. We believe that this work is the first demonstration of deep learning based SAR image formation using real data.
Published 2020-01-22
URL https://arxiv.org/abs/2001.08202v1
PDF https://arxiv.org/pdf/2001.08202v1.pdf
PWC https://paperswithcode.com/paper/rdanet-a-deep-learning-based-approach-for
Repo
Framework

#### Fast Geometric Projections for Local Robustness Certification

Title Fast Geometric Projections for Local Robustness Certification
Authors Aymeric Fromherz, Klas Leino, Matt Fredrikson, Bryan Parno, Corina Păsăreanu
Abstract Local robustness ensures that a model classifies all inputs within an $\epsilon$-ball consistently, which precludes various forms of adversarial inputs. In this paper, we present a fast procedure for checking local robustness in feed-forward neural networks with piecewise linear activation functions. The key insight is that such networks partition the input space into a polyhedral complex such that the network is linear inside each polyhedral region; hence, a systematic search for decision boundaries within the regions around a given input is sufficient for assessing robustness. Crucially, we show how these regions can be analyzed using geometric projections instead of expensive constraint solving, thus admitting an efficient, highly-parallel GPU implementation at the price of incompleteness, which can be addressed by falling back on prior approaches. Empirically, we find that incompleteness is not often an issue, and that our method performs one to two orders of magnitude faster than existing robustness-certification techniques based on constraint solving.
Published 2020-02-12
URL https://arxiv.org/abs/2002.04742v1
PDF https://arxiv.org/pdf/2002.04742v1.pdf
PWC https://paperswithcode.com/paper/fast-geometric-projections-for-local
Repo
Framework
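
The projection primitive behind this approach can be sketched for a single affine region, where the network reduces to a linear classifier. The L2 distance from an input to the boundary between the predicted class c and another class j is then (z_c − z_j) / ‖w_c − w_j‖₂. The sketch assumes the L2 norm and a single region; the search over neighbouring polyhedral regions described in the abstract is not shown, and the function names are hypothetical.

```python
import math


def logits(weights, biases, x):
    """Affine scores z = W x + b, one per class."""
    return [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]


def boundary_distance(weights, biases, x):
    """L2 distance from x to the nearest decision boundary of an affine classifier."""
    z = logits(weights, biases, x)
    top = max(range(len(z)), key=z.__getitem__)
    best = math.inf
    for j in range(len(z)):
        if j == top:
            continue
        diff = [a - b for a, b in zip(weights[top], weights[j])]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm == 0.0:
            continue  # identical weight rows: the logit gap does not depend on x
        # Projecting x onto the hyperplane z_top == z_j gives this distance.
        best = min(best, (z[top] - z[j]) / norm)
    return best


def is_locally_robust(weights, biases, x, epsilon):
    """True if no decision boundary of this affine map lies within the epsilon-ball."""
    return boundary_distance(weights, biases, x) > epsilon


# Two-class toy example: the boundary is the line x0 == x1.
weights = [[1.0, 0.0], [0.0, 1.0]]
biases = [0.0, 0.0]
x = [3.0, 1.0]

distance = boundary_distance(weights, biases, x)
robust_small = is_locally_robust(weights, biases, x, epsilon=1.0)
robust_large = is_locally_robust(weights, biases, x, epsilon=1.5)
```

Each distance is a closed-form dot product and norm, which is what makes this kind of check cheap compared with calling a constraint solver.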

#### Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN

Title Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN
Authors Jingwen Ye, Yixin Ji, Xinchao Wang, Xin Gao, Mingli Song
Abstract Recent advances in deep learning have provided procedures for learning one network to amalgamate multiple streams of knowledge from pre-trained Convolutional Neural Network (CNN) models, thus reducing the annotation cost. However, almost all existing methods demand massive training data, which may be unavailable due to privacy or transmission issues. In this paper, we propose a data-free knowledge amalgamation strategy to craft a well-behaved multi-task student network from multiple single/multi-task teachers. The main idea is to construct group-stack generative adversarial networks (GANs) which have two dual generators. First, one generator is trained to collect the knowledge by reconstructing images approximating the original dataset utilized for pre-training the teachers. Then a dual generator is trained by taking the output from the former generator as input. Finally, we treat the dual-part generator as the target network and regroup it. As demonstrated on several benchmarks of multi-label classification, the proposed method, without any training data, achieves surprisingly competitive results, even compared with some fully-supervised methods.