Paper Group ANR 152
A Light Field Camera Calibration Method Using Sub-Aperture Related Bipartition Projection Model and 4D Corner Detection. Preventing Clean Label Poisoning using Gaussian Mixture Loss. Point Spread Function Modelling for Wide Field Small Aperture Telescopes with a Denoising Autoencoder. Generative Modeling with Denoising Auto-Encoders and Langevin Sa …
A Light Field Camera Calibration Method Using Sub-Aperture Related Bipartition Projection Model and 4D Corner Detection
Title | A Light Field Camera Calibration Method Using Sub-Aperture Related Bipartition Projection Model and 4D Corner Detection |
Authors | Dongyang Jin, Saiping Zhang, Xiao Huo, Wei Zhang, Fuzheng Yang |
Abstract | Accurate calibration of the intrinsic parameters of a light field (LF) camera is a key issue in many applications, especially 3D reconstruction. In this paper, we propose the Sub-Aperture Related Bipartition (SARB) projection model to characterize the LF camera. This projection model comprises two sets of parameters, targeting the center-view sub-aperture and the relations between sub-apertures, respectively. Moreover, we propose a corner point detection algorithm that fully utilizes the 4D LF information in the raw image. Experimental results demonstrate the accuracy and robustness of the corner detection method. Both the 2D re-projection errors in the lateral direction and the errors in the depth direction are minimized, because the two sets of parameters in the SARB projection model are solved separately. |
Tasks | 3D Reconstruction, Calibration |
Published | 2020-01-11 |
URL | https://arxiv.org/abs/2001.03734v1 |
https://arxiv.org/pdf/2001.03734v1.pdf | |
PWC | https://paperswithcode.com/paper/a-light-field-camera-calibration-method-using |
Repo | |
Framework | |
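The two-part structure the abstract describes can be sketched as a toy projection model: a plain pinhole projection for the center view (the first parameter set), plus a depth-dependent shift relating each sub-aperture to the center view (the second parameter set). This is an illustrative stand-in, not the paper's actual SARB model; all intrinsics and baseline values below are made-up numbers.

```python
# Sketch only: center-view pinhole projection plus a per-sub-aperture shift.
# The real SARB model is defined in the paper; numbers here are illustrative.
def project_center(point3d, fx, fy, cx, cy):
    """Pinhole projection of a 3D point with center-view intrinsics."""
    X, Y, Z = point3d
    return (fx * X / Z + cx, fy * Y / Z + cy)

def project_subaperture(point3d, fx, fy, cx, cy, baseline_uv):
    """Shift the center-view projection by a disparity proportional to the
    sub-aperture's baseline and inversely proportional to depth."""
    u, v = project_center(point3d, fx, fy, cx, cy)
    bu, bv = baseline_uv
    X, Y, Z = point3d
    return (u + fx * bu / Z, v + fy * bv / Z)

p = (0.1, -0.2, 2.0)
center = project_subaperture(p, 500, 500, 320, 240, (0.0, 0.0))
shifted = project_subaperture(p, 500, 500, 320, 240, (0.01, 0.0))
print(center)   # (345.0, 190.0)
print(shifted)  # shifted along u by fx * bu / Z = 2.5 pixels
```

Solving the center-view intrinsics and the inter-sub-aperture relations separately, as the abstract states, is what decouples lateral re-projection error from depth error.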
Preventing Clean Label Poisoning using Gaussian Mixture Loss
Title | Preventing Clean Label Poisoning using Gaussian Mixture Loss |
Authors | Muhammad Yaseen, Muneeb Aadil, Maria Sargsyan |
Abstract | Since 2014, when Szegedy et al. showed that carefully designed perturbations of the input can lead Deep Neural Networks (DNNs) to misclassify it, there has been ongoing research to make DNNs more robust to such malicious perturbations. In this work, we consider a poisoning attack called the Clean Label Poisoning Attack (CLPA). The goal of CLPA is to inject seemingly benign instances that can drastically change the decision boundary of a DNN, so that subsequent queries at test time are misclassified. We argue that a strong defense against CLPA can be embedded into the model during training by forcing the features of the network to follow a Large-Margin Gaussian Mixture (LGM) distribution in the penultimate layer. With such prior knowledge, we can systematically evaluate how unusual an example is, given the label it claims to be. We demonstrate our built-in defense via experiments on the MNIST and CIFAR datasets. We train two models on each dataset: one via softmax, the other via LGM. We show that using LGM can substantially reduce the effectiveness of CLPA with no additional overhead of data sanitization. The code to reproduce our results is available online. |
Tasks | |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2003.00798v1 |
https://arxiv.org/pdf/2003.00798v1.pdf | |
PWC | https://paperswithcode.com/paper/preventing-clean-label-poisoning-using |
Repo | |
Framework | |
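The defense idea above — score how "unusual" an example's features are under the Gaussian component of the label it claims — can be sketched with a Mahalanobis-style distance to per-class means. This is a toy illustration assuming identity covariance; the class means would come from the LGM-trained penultimate layer, and the feature vectors here are invented values, not MNIST/CIFAR features.

```python
# Sketch, assuming identity covariance: flag a training example whose
# penultimate-layer features are far from the Gaussian component of the
# label it claims. Class means and feature vectors are toy values.
def mahalanobis_sq(feat, mean):
    """Squared distance to a class mean (identity covariance assumed)."""
    return sum((f - m) ** 2 for f, m in zip(feat, mean))

def is_suspicious(feat, claimed_label, class_means, threshold):
    """True if the example is unusual for the label it claims to carry."""
    return mahalanobis_sq(feat, class_means[claimed_label]) > threshold

class_means = {0: [0.0, 0.0], 1: [4.0, 4.0]}
clean = [0.2, -0.1]    # near the class-0 mean
poisoned = [3.9, 4.1]  # looks like class 1 but is labelled 0

print(is_suspicious(clean, 0, class_means, threshold=1.0))     # False
print(is_suspicious(poisoned, 0, class_means, threshold=1.0))  # True
```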
Point Spread Function Modelling for Wide Field Small Aperture Telescopes with a Denoising Autoencoder
Title | Point Spread Function Modelling for Wide Field Small Aperture Telescopes with a Denoising Autoencoder |
Authors | Peng Jia, Xiyu Li, Zhengyang Li, Weinan Wang, Dongmei Cai |
Abstract | The point spread function reflects the state of an optical telescope and is important for the design of data post-processing methods. For wide field small aperture telescopes, the point spread function is hard to model because it is affected by many different effects and has strong temporal and spatial variations. In this paper, we propose to use the denoising autoencoder, a type of deep neural network, to model the point spread function of wide field small aperture telescopes. The denoising autoencoder is a purely data-based point spread function modelling method, which uses calibration data from real observations or numerically simulated results as point spread function templates. According to real observation conditions, different levels of random noise or aberrations are added to the templates, yielding realizations of the point spread function, i.e., simulated star images. We then train the denoising autoencoder on these realizations and templates. After training, the denoising autoencoder learns the manifold space of the point spread function and can map any star image obtained by a wide field small aperture telescope directly to its point spread function, which can be used to design data post-processing or optical system alignment methods. |
Tasks | Calibration, Denoising |
Published | 2020-01-31 |
URL | https://arxiv.org/abs/2001.11716v2 |
https://arxiv.org/pdf/2001.11716v2.pdf | |
PWC | https://paperswithcode.com/paper/point-spread-function-modelling-for-wide |
Repo | |
Framework | |
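The training-pair construction the abstract describes — corrupting PSF templates with random noise to yield "realizations" (simulated star images) on which the autoencoder is trained — can be sketched as follows. The Gaussian template and noise level are illustrative choices only, not the paper's calibration data.

```python
import math
import random

# Sketch of building (realization, template) pairs for DAE training:
# a PSF template is corrupted with random noise to simulate a star image.
# The Gaussian template and the noise level are illustrative assumptions.
def gaussian_psf(size, sigma):
    """Toy PSF template: a centered 2D Gaussian, normalised to unit flux."""
    c = (size - 1) / 2
    psf = [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
            for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in psf)
    return [[v / total for v in row] for row in psf]

def make_training_pair(template, noise_level, rng):
    """One (noisy realization, clean template) pair for DAE training."""
    noisy = [[v + rng.gauss(0.0, noise_level) for v in row] for row in template]
    return noisy, template

rng = random.Random(0)
template = gaussian_psf(size=15, sigma=2.0)
noisy, clean = make_training_pair(template, noise_level=0.001, rng=rng)
```

After training on many such pairs, the network maps an observed star image to the underlying PSF, as the abstract states.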
Generative Modeling with Denoising Auto-Encoders and Langevin Sampling
Title | Generative Modeling with Denoising Auto-Encoders and Langevin Sampling |
Authors | Adam Block, Youssef Mroueh, Alexander Rakhlin |
Abstract | We study convergence of a generative modeling method that first estimates the score function of the distribution using Denoising Auto-Encoders (DAE) or Denoising Score Matching (DSM) and then employs Langevin diffusion for sampling. We show that both DAE and DSM provide estimates of the score of the Gaussian smoothed population density, allowing us to apply the machinery of Empirical Processes. We overcome the challenge of relying only on $L^2$ bounds on the score estimation error and provide finite-sample bounds in the Wasserstein distance between the law of the population distribution and the law of this sampling scheme. We then apply our results to the homotopy method of arXiv:1907.05600 and provide theoretical justification for its empirical success. |
Tasks | Denoising |
Published | 2020-01-31 |
URL | https://arxiv.org/abs/2002.00107v2 |
https://arxiv.org/pdf/2002.00107v2.pdf | |
PWC | https://paperswithcode.com/paper/generative-modeling-with-denoising-auto |
Repo | |
Framework | |
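The sampling scheme analysed in the paper — Langevin diffusion driven by an estimated score — follows the update $x_{t+1} = x_t + \tfrac{\epsilon}{2}\,\hat{s}(x_t) + \sqrt{\epsilon}\, z_t$ with $z_t \sim \mathcal{N}(0,1)$. As a runnable toy, the sketch below plugs in the exact score of a 1-D Gaussian in place of a DAE/DSM estimate; step size and step count are illustrative.

```python
import math
import random

# Langevin diffusion sketch. The exact score of N(mu, sigma^2) stands in
# for the DAE/DSM score estimate analysed in the paper.
def score(x, mu=2.0, sigma=1.0):
    """d/dx log density of N(mu, sigma^2)."""
    return -(x - mu) / sigma ** 2

def langevin_sample(n_steps=5000, step=0.01, seed=0):
    """Run one Langevin chain: x <- x + (step/2)*score(x) + sqrt(step)*z."""
    rng = random.Random(seed)
    x = 0.0
    for _ in range(n_steps):
        x = x + 0.5 * step * score(x) + math.sqrt(step) * rng.gauss(0.0, 1.0)
    return x

samples = [langevin_sample(seed=s) for s in range(200)]
mean = sum(samples) / len(samples)
print(mean)  # close to mu = 2.0 once the chains have mixed
```

The paper's contribution is bounding, in Wasserstein distance, how far the law of such samples is from the population distribution when only $L^2$ bounds on the score error are available.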
Kernel Bi-Linear Modeling for Reconstructing Data on Manifolds: The Dynamic-MRI Case
Title | Kernel Bi-Linear Modeling for Reconstructing Data on Manifolds: The Dynamic-MRI Case |
Authors | Gaurav N. Shetty, Konstantinos Slavakis, Ukash Nakarmi, Gesualdo Scutari, Leslie Ying |
Abstract | This paper establishes a kernel-based framework for reconstructing data on manifolds, tailored to fit the dynamic-(d)MRI-data recovery problem. The proposed methodology exploits simple tangent-space geometries of manifolds in reproducing kernel Hilbert spaces and follows classical kernel-approximation arguments to form the data-recovery task as a bi-linear inverse problem. Departing from mainstream approaches, the proposed methodology uses no training data, employs no graph Laplacian matrix to penalize the optimization task, uses no costly (kernel) pre-imaging step to map feature points back to the input space, and utilizes complex-valued kernel functions to account for k-space data. The framework is validated on synthetically generated dMRI data, where comparisons against state-of-the-art schemes highlight the rich potential of the proposed approach in data-recovery problems. |
Tasks | |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.11885v1 |
https://arxiv.org/pdf/2002.11885v1.pdf | |
PWC | https://paperswithcode.com/paper/kernel-bi-linear-modeling-for-reconstructing |
Repo | |
Framework | |
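One ingredient the abstract highlights — complex-valued kernel functions so that kernel approximations act directly on complex k-space samples — can be illustrated with a toy Gaussian kernel on complex numbers. The kernel form and width below are our illustrative choices, not the paper's exact definition.

```python
import math

# Toy complex-valued-input kernel: a Gaussian kernel on the modulus of the
# complex difference. Illustrative only; the paper defines its own kernels.
def complex_gaussian_kernel(z1, z2, sigma=1.0):
    d = abs(z1 - z2)  # modulus of the complex difference
    return math.exp(-d * d / (2 * sigma ** 2))

def kernel_matrix(samples, sigma=1.0):
    """Gram matrix over complex k-space samples."""
    return [[complex_gaussian_kernel(a, b, sigma) for b in samples]
            for a in samples]

ks = [1 + 1j, 1 - 1j, 0 + 0j]  # toy k-space samples
K = kernel_matrix(ks)
print(K[0][0])  # 1.0: kernel of a point with itself
```

A Gram matrix like this is the raw material for the kernel-approximation arguments that turn data recovery into the bi-linear inverse problem the abstract mentions.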
Data-Driven Symbol Detection via Model-Based Machine Learning
Title | Data-Driven Symbol Detection via Model-Based Machine Learning |
Authors | Nariman Farsad, Nir Shlezinger, Andrea J. Goldsmith, Yonina C. Eldar |
Abstract | The design of symbol detectors in digital communication systems has traditionally relied on statistical channel models that describe the relation between the transmitted symbols and the observed signal at the receiver. Here we review a data-driven framework for symbol detection design which combines machine learning (ML) and model-based algorithms. In this hybrid approach, well-known channel-model-based algorithms such as the Viterbi method, BCJR detection, and multiple-input multiple-output (MIMO) soft interference cancellation (SIC) are augmented with ML-based algorithms to remove their channel-model dependence, allowing the receiver to learn to implement these algorithms solely from data. The resulting data-driven receivers are most suitable for systems where the underlying channel models are poorly understood, highly complex, or do not capture the underlying physics well. Our approach is unique in that it only replaces the channel-model-based computations with dedicated neural networks that can be trained from a small amount of data, while keeping the general algorithm intact. Our results demonstrate that these techniques can yield near-optimal performance of model-based algorithms without knowing the exact channel input-output statistical relationship and in the presence of channel state information uncertainty. |
Tasks | |
Published | 2020-02-14 |
URL | https://arxiv.org/abs/2002.07806v1 |
https://arxiv.org/pdf/2002.07806v1.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-symbol-detection-via-model-based |
Repo | |
Framework | |
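The hybrid idea — keep the general algorithm intact and replace only the channel-model-based computation with a learned one — can be sketched with the Viterbi recursion. Below, only `branch_metric` is the data-driven piece: in the reviewed framework it would be a small neural network trained from data, whereas this toy plugs in a Gaussian metric for a BPSK channel. The trellis and numbers are illustrative.

```python
# Viterbi sketch for the hybrid receiver: the recursion is standard; only
# branch_metric is pluggable (a trained NN in the paper's framework, a toy
# Gaussian metric here).
def viterbi(observations, states, transitions, branch_metric):
    """Minimum accumulated-metric path through the trellis."""
    cost = {s: 0.0 for s in states}   # best metric ending in state s
    path = {s: [] for s in states}
    for y in observations:
        new_cost, new_path = {}, {}
        for s in states:
            best_prev = min(transitions[s],
                            key=lambda p: cost[p] + branch_metric(y, p, s))
            new_cost[s] = cost[best_prev] + branch_metric(y, best_prev, s)
            new_path[s] = path[best_prev] + [s]
        cost, path = new_cost, new_path
    return path[min(states, key=lambda s: cost[s])]

# Toy memoryless BPSK channel: state == transmitted symbol, y = symbol + noise.
states = [-1, +1]
transitions = {s: states for s in states}        # fully connected trellis
metric = lambda y, prev, s: (y - s) ** 2         # stand-in for an NN output
obs = [0.9, -1.1, 1.2, -0.8]
print(viterbi(obs, states, transitions, metric))  # [1, -1, 1, -1]
```

Swapping `metric` for a network trained on (received, transmitted) pairs removes the channel-model dependence while the dynamic program itself is untouched, which is the point the abstract makes.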
Named Entities in Medical Case Reports: Corpus and Experiments
Title | Named Entities in Medical Case Reports: Corpus and Experiments |
Authors | Sarah Schulz, Jurica Ševa, Samuel Rodriguez, Malte Ostendorff, Georg Rehm |
Abstract | We present a new corpus comprising annotations of medical entities in case reports, originating from PubMed Central’s open access library. In the case reports, we annotate cases, conditions, findings, factors and negation modifiers. Moreover, where applicable, we annotate relations between these entities. As such, this is the first corpus of this kind made available to the scientific community in English. It enables the initial investigation of automatic information extraction from case reports through tasks like Named Entity Recognition, Relation Extraction and (sentence/paragraph) relevance detection. Additionally, we present four strong baseline systems for the detection of medical entities made available through the annotated dataset. |
Tasks | Named Entity Recognition, Relation Extraction |
Published | 2020-03-29 |
URL | https://arxiv.org/abs/2003.13032v1 |
https://arxiv.org/pdf/2003.13032v1.pdf | |
PWC | https://paperswithcode.com/paper/named-entities-in-medical-case-reports-corpus |
Repo | |
Framework | |
Neural Relation Prediction for Simple Question Answering over Knowledge Graph
Title | Neural Relation Prediction for Simple Question Answering over Knowledge Graph |
Authors | Amin Abolghasemi, Saeedeh Momtazi |
Abstract | Relation extraction from simple questions aims to capture the relation of a factoid question with one underlying relation from a set of predefined ones in a knowledge base. Most recent methods take advantage of neural networks to match a question against all relations in order to find the one best expressed by that question. In this paper, we propose an instance-based method that finds questions similar to a new question, in the sense of their relations, to predict its mentioned relation. The motivation is rooted in the fact that a relation can be expressed in different question forms, and these forms mostly share similar terms or concepts. Our experiments on the SimpleQuestions dataset show that the proposed model achieves better accuracy than state-of-the-art relation extraction models. |
Tasks | Question Answering, Relation Extraction |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07715v1 |
https://arxiv.org/pdf/2002.07715v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-relation-prediction-for-simple |
Repo | |
Framework | |
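The instance-based idea — predict the relation of a new question from the relations of its most similar training questions — can be sketched with a nearest-neighbour lookup. A toy bag-of-words cosine similarity stands in for the neural question encodings, and the two training lines are invented examples, not SimpleQuestions entries.

```python
import math
from collections import Counter

# Instance-based relation prediction sketch: bag-of-words cosine similarity
# stands in for the paper's neural matching; examples are illustrative.
def cosine(a, b):
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def predict_relation(question, train):
    """Return the relation of the most similar training question."""
    _, best_rel = max(train, key=lambda qr: cosine(question, qr[0]))
    return best_rel

train = [
    ("who wrote the book alice in wonderland", "book/author"),
    ("where was the president born", "person/place_of_birth"),
]
print(predict_relation("who wrote harry potter", train))  # book/author
```

The shared "who wrote" form is exactly the kind of surface regularity across questions of the same relation that motivates the approach.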
Learning-Aided Deep Path Prediction for Sphere Decoding in Large MIMO Systems
Title | Learning-Aided Deep Path Prediction for Sphere Decoding in Large MIMO Systems |
Authors | Doyeon Weon, Kyungchun Lee |
Abstract | In this paper, we propose a novel learning-aided sphere decoding (SD) scheme for large multiple-input multiple-output systems, namely, deep path prediction-based sphere decoding (DPP-SD). In this scheme, we employ a neural network (NN) to predict the minimum metrics of the "deep" paths in sub-trees before commencing the tree search in SD. To reduce the complexity of the NN, we employ an input vector with a reduced dimension rather than using the original received signals and full channel matrix. The outputs of the NN, i.e., the predicted minimum path metrics, are exploited to determine the search order between the sub-trees, as well as to optimize the initial search radius, which may reduce the computational complexity of SD. For further complexity reduction, an early termination scheme based on the predicted minimum path metrics is also proposed. Our simulation results show that the proposed DPP-SD scheme provides a significant reduction in computational complexity compared with the conventional SD algorithm, while achieving near-optimal performance. |
Tasks | |
Published | 2020-01-02 |
URL | https://arxiv.org/abs/2001.00342v1 |
https://arxiv.org/pdf/2001.00342v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-aided-deep-path-prediction-for |
Repo | |
Framework | |
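How the NN outputs are used can be sketched directly from the abstract: sub-trees are searched in ascending order of their predicted minimum path metrics, and the smallest prediction initialises the search radius. The predicted metrics and safety margin below are toy numbers standing in for NN outputs, not values from the paper.

```python
# Sketch of the scheduling step in DPP-SD: order sub-trees by predicted
# minimum path metric and derive an initial radius. Toy values only; the
# margin factor is our assumption, not the paper's radius rule.
def schedule_search(predicted_metrics, margin=1.1):
    """Return (sub-tree visit order, initial search radius)."""
    order = sorted(range(len(predicted_metrics)),
                   key=lambda i: predicted_metrics[i])
    initial_radius = margin * predicted_metrics[order[0]]
    return order, initial_radius

order, radius = schedule_search([4.2, 1.5, 3.0, 0.9])
print(order)   # [3, 1, 2, 0] — most promising sub-tree first
```

Visiting promising sub-trees first makes the radius shrink sooner, which is where the complexity savings over plain SD come from.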
Learning a Weakly-Supervised Video Actor-Action Segmentation Model with a Wise Selection
Title | Learning a Weakly-Supervised Video Actor-Action Segmentation Model with a Wise Selection |
Authors | Jie Chen, Zhiheng Li, Jiebo Luo, Chenliang Xu |
Abstract | We address weakly-supervised video actor-action segmentation (VAAS), which extends general video object segmentation (VOS) to additionally consider action labels of the actors. The most successful methods on VOS synthesize a pool of pseudo-annotations (PAs) and then refine them iteratively. However, they face challenges as to how to select high-quality PAs from a massive pool, how to set an appropriate stopping condition for weakly-supervised training, and how to initialize PAs pertaining to VAAS. To overcome these challenges, we propose a general Weakly-Supervised framework with a Wise Selection of training samples and model evaluation criterion (WS^2). Instead of blindly trusting quality-inconsistent PAs, WS^2 employs a learning-based selection to select effective PAs and a novel region integrity criterion as a stopping condition for weakly-supervised training. In addition, a 3D-Conv GCAM is devised to adapt to the VAAS task. Extensive experiments show that WS^2 achieves state-of-the-art performance on both weakly-supervised VOS and VAAS tasks and is on par with the best fully-supervised method on VAAS. |
Tasks | action segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2020-03-29 |
URL | https://arxiv.org/abs/2003.13141v1 |
https://arxiv.org/pdf/2003.13141v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-a-weakly-supervised-video-actor |
Repo | |
Framework | |
Semi-supervised Development of ASR Systems for Multilingual Code-switched Speech in Under-resourced Languages
Title | Semi-supervised Development of ASR Systems for Multilingual Code-switched Speech in Under-resourced Languages |
Authors | Astik Biswas, Emre Yılmaz, Febe de Wet, Ewald van der Westhuizen, Thomas Niesler |
Abstract | This paper reports on the semi-supervised development of acoustic and language models for under-resourced, code-switched speech in five South African languages. Two approaches are considered. The first constructs four separate bilingual automatic speech recognisers (ASRs) corresponding to four different language pairs between which speakers switch frequently. The second uses a single, unified, five-lingual ASR system that represents all the languages (English, isiZulu, isiXhosa, Setswana and Sesotho). We evaluate the effectiveness of these two approaches when used to add additional data to our extremely sparse training sets. Results indicate that batch-wise semi-supervised training yields better results than a non-batch-wise approach. Furthermore, while the separate bilingual systems achieved better recognition performance than the unified system, they benefited more from pseudo-labels generated by the five-lingual system than from those generated by the bilingual systems. |
Tasks | |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03135v1 |
https://arxiv.org/pdf/2003.03135v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-development-of-asr-systems |
Repo | |
Framework | |
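The batch-wise semi-supervised procedure the abstract compares favourably — pseudo-label one batch of untranscribed speech, add it to the training set, retrain, then move to the next batch — can be sketched as a generic loop. `train` and `transcribe` are placeholders for the real acoustic-model pipeline; the stand-ins below only demonstrate the control flow.

```python
# Batch-wise semi-supervised training sketch. `train` and `transcribe` are
# placeholders for the real ASR training and decoding steps.
def batchwise_semisupervised(labelled, unlabelled, batch_size, train, transcribe):
    model = train(labelled)
    data = list(labelled)
    for start in range(0, len(unlabelled), batch_size):
        batch = unlabelled[start:start + batch_size]
        # Pseudo-label this batch with the *current* model, then retrain
        # before touching the next batch (the batch-wise part).
        pseudo = [(utt, transcribe(model, utt)) for utt in batch]
        data.extend(pseudo)
        model = train(data)
    return model, data

# Toy stand-ins: the "model" is just the training-set size, and the
# "transcription" echoes the utterance id.
train = len
transcribe = lambda model, utt: f"hyp-{utt}"
model, data = batchwise_semisupervised([("u0", "ref")], ["u1", "u2", "u3"],
                                       batch_size=2, train=train,
                                       transcribe=transcribe)
print(model)  # 4: the final model has seen all four utterances
```

A non-batch-wise variant would pseudo-label the whole pool with the initial model in one pass; the abstract reports that the batch-wise schedule, which lets the model improve between batches, yields better results.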
Automated Methods for Detection and Classification Pneumonia based on X-Ray Images Using Deep Learning
Title | Automated Methods for Detection and Classification Pneumonia based on X-Ray Images Using Deep Learning |
Authors | Khalid El Asnaoui, Youness Chawki, Ali Idri |
Abstract | Recently, researchers, specialists, and companies around the world have been rolling out deep learning and image processing-based systems that can rapidly process hundreds of X-ray and computed tomography (CT) images to accelerate the diagnosis of pneumonia such as SARS and COVID-19 and aid in its containment. Medical image analysis is one of the most promising research areas; it provides facilities for the diagnosis of and decision-making about a number of diseases, such as MERS and COVID-19. In this paper, we present a comparison of recent Deep Convolutional Neural Network (DCNN) architectures for automatic binary classification of pneumonia images, based on fine-tuned versions of VGG16, VGG19, DenseNet201, Inception_ResNet_V2, Inception_V3, Resnet50, MobileNet_V2 and Xception. The proposed work has been tested using a chest X-ray & CT dataset containing 5856 images (4273 pneumonia and 1583 normal). We conclude that the fine-tuned versions of Resnet50, MobileNet_V2 and Inception_Resnet_V2 show highly satisfactory performance (more than 96% accuracy), whereas Xception, VGG16, VGG19, Inception_V3 and DenseNet201 display lower performance (above 84% accuracy). |
Tasks | Computed Tomography (CT) |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2003.14363v1 |
https://arxiv.org/pdf/2003.14363v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-methods-for-detection-and |
Repo | |
Framework | |
RDAnet: A Deep Learning Based Approach for Synthetic Aperture Radar Image Formation
Title | RDAnet: A Deep Learning Based Approach for Synthetic Aperture Radar Image Formation |
Authors | Andrew Rittenbach, John Paul Walters |
Abstract | Synthetic Aperture Radar (SAR) imaging systems operate by emitting radar signals from a moving object, such as a satellite, towards the target of interest. Reflected radar echoes are received and later used by image formation algorithms to form a SAR image. There is great interest in using SAR images in computer vision tasks such as automatic target recognition. Today, however, SAR applications consist of multiple operations: image formation followed by image processing. In this work, we show that deep learning can be used to train a neural network able to form SAR images from echo data. Results show that our neural network, RDAnet, can form SAR images comparable to images formed using a traditional algorithm. This approach opens the possibility to end-to-end SAR applications where image formation and image processing are integrated into a single task. We believe that this work is the first demonstration of deep learning based SAR image formation using real data. |
Tasks | |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.08202v1 |
https://arxiv.org/pdf/2001.08202v1.pdf | |
PWC | https://paperswithcode.com/paper/rdanet-a-deep-learning-based-approach-for |
Repo | |
Framework | |
Fast Geometric Projections for Local Robustness Certification
Title | Fast Geometric Projections for Local Robustness Certification |
Authors | Aymeric Fromherz, Klas Leino, Matt Fredrikson, Bryan Parno, Corina Păsăreanu |
Abstract | Local robustness ensures that a model classifies all inputs within an $\epsilon$-ball consistently, which precludes various forms of adversarial inputs. In this paper, we present a fast procedure for checking local robustness in feed-forward neural networks with piecewise linear activation functions. The key insight is that such networks partition the input space into a polyhedral complex such that the network is linear inside each polyhedral region; hence, a systematic search for decision boundaries within the regions around a given input is sufficient for assessing robustness. Crucially, we show how these regions can be analyzed using geometric projections instead of expensive constraint solving, thus admitting an efficient, highly-parallel GPU implementation at the price of incompleteness, which can be addressed by falling back on prior approaches. Empirically, we find that incompleteness is not often an issue, and that our method performs one to two orders of magnitude faster than existing robustness-certification techniques based on constraint solving. |
Tasks | |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.04742v1 |
https://arxiv.org/pdf/2002.04742v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-geometric-projections-for-local |
Repo | |
Framework | |
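The key primitive above has a closed form worth spelling out: inside one polyhedral region the network acts as an affine map $x \mapsto w^\top x + b$, so the distance from an input $x_0$ to the decision boundary $\{x : w^\top x + b = 0\}$ is the projection $|w^\top x_0 + b| / \lVert w \rVert$ — no constraint solver required. The 2-D weights below are toy numbers, and the check is only the single-region case, not the full region search of the paper.

```python
import math

# Closed-form distance from x0 to the hyperplane {x : w.x + b = 0} —
# the geometric projection that replaces constraint solving. Toy numbers.
def boundary_distance(w, b, x0):
    num = abs(sum(wi * xi for wi, xi in zip(w, x0)) + b)
    return num / math.sqrt(sum(wi * wi for wi in w))

def certify_region(w, b, x0, eps):
    """Robust within this linear region if the boundary is farther than eps."""
    return boundary_distance(w, b, x0) > eps

w, b, x0 = [3.0, 4.0], -5.0, [3.0, 4.0]
print(boundary_distance(w, b, x0))       # |9 + 16 - 5| / 5 = 4.0
print(certify_region(w, b, x0, eps=1.0))  # True
```

Because each region's check is an independent projection like this, the search parallelises naturally on a GPU, which is where the reported speedups come from.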
Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN
Title | Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN |
Authors | Jingwen Ye, Yixin Ji, Xinchao Wang, Xin Gao, Mingli Song |
Abstract | Recent advances in deep learning have provided procedures for learning one network that amalgamates multiple streams of knowledge from pre-trained Convolutional Neural Network (CNN) models, thus reducing the annotation cost. However, almost all existing methods demand massive training data, which may be unavailable due to privacy or transmission issues. In this paper, we propose a data-free knowledge amalgamation strategy to craft a well-behaved multi-task student network from multiple single/multi-task teachers. The main idea is to construct group-stack generative adversarial networks (GANs) with two dual generators. First, one generator is trained to collect knowledge by reconstructing images approximating the original dataset used to pre-train the teachers. Then, a dual generator is trained by taking the output of the former generator as input. Finally, we treat the dual-part generator as the target network and regroup it. As demonstrated on several multi-label classification benchmarks, the proposed method achieves surprisingly competitive results without any training data, even compared with some fully-supervised methods. |
Tasks | Multi-Label Classification |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09088v1 |
https://arxiv.org/pdf/2003.09088v1.pdf | |
PWC | https://paperswithcode.com/paper/data-free-knowledge-amalgamation-via-group |
Repo | |
Framework | |