January 28, 2020


Paper Group ANR 894


Softmax Dissection: Towards Understanding Intra- and Inter-class Objective for Embedding Learning


Title	Softmax Dissection: Towards Understanding Intra- and Inter-class Objective for Embedding Learning
Authors	Lanqing He, Zhongdao Wang, Yali Li, Shengjin Wang
Abstract	The softmax loss and its variants are widely used as objectives for embedding learning, especially in applications like face recognition. However, the intra- and inter-class objectives in the softmax loss are entangled, so a well-optimized inter-class objective leads to relaxation of the intra-class objective, and vice versa. In this paper, we propose to dissect the softmax loss into independent intra- and inter-class objectives (D-Softmax). With D-Softmax as the objective, we can gain a clear understanding of both the intra- and inter-class objectives, so it is straightforward to tune each part to its best state. Furthermore, we find that the computation of the inter-class objective is redundant and propose two sampling-based variants of D-Softmax to reduce the computation cost. When training with regular-scale data, experiments in face verification show that D-Softmax compares favorably with existing losses such as SphereFace and ArcFace. When training with massive-scale data, experiments show that the fast variants of D-Softmax significantly accelerate the training process (such as 64x) with only a minor sacrifice in performance, outperforming existing acceleration methods of softmax in terms of both performance and efficiency.
Tasks	Face Recognition, Face Verification
Published	2019-08-04
URL	https://arxiv.org/abs/1908.01281v2
PDF	https://arxiv.org/pdf/1908.01281v2.pdf
PWC	https://paperswithcode.com/paper/softmax-dissection-towards-understanding
Repo
Framework
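
The entanglement described in the abstract above can be seen in the standard softmax loss itself, which depends only on the gaps between the target logit and the other logits. The following is a minimal sketch of that standard identity, not the D-Softmax formulation from the paper; the function names and example logits are hypothetical.

```python
import math

def softmax_loss(logits, target):
    """Standard softmax cross-entropy for a single sample."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def softmax_loss_from_gaps(logits, target):
    """The same loss written as log(1 + sum_{k != y} exp(z_k - z_y))."""
    z_y = logits[target]
    return math.log1p(sum(math.exp(z - z_y)
                          for k, z in enumerate(logits) if k != target))

# Hypothetical logits: raising the target logit (intra-class) or lowering
# the other logits (inter-class) gives the same gaps, hence the same loss.
high_target = [2.0, 0.0, 0.0]
low_others = [1.0, -1.0, -1.0]
same = math.isclose(softmax_loss(high_target, 0), softmax_loss(low_others, 0))
```

Because only the gaps matter, the loss cannot tell which of the two objectives was actually improved, which is the coupling the abstract sets out to remove.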

Early Detection of Diabetic Retinopathy and Severity Scale Measurement: A Progressive Review & Scopes


Title	Early Detection of Diabetic Retinopathy and Severity Scale Measurement: A Progressive Review & Scopes
Authors	Asma Khatun, Sk. Golam Sarowar Hossain
Abstract	Early detection of diabetic retinopathy (DR) prevents visual loss and blindness. Based on the type of feature extraction method used, DR detection methods can be broadly classified as Deep Convolutional Neural Network (CNN) based and traditional feature extraction (machine learning) based. This paper presents a comprehensive survey of existing feature extraction methods based on Deep CNN and conventional feature extraction for DR detection. In addition, this paper focuses on the severity scale measurement of DR detection, and to the best of our knowledge this is the first survey paper that covers the severity grading scale. It is also necessary to mention that this is the first study that reviews the Deep CNN based methods proposed in the state of the art for DR detection. This study finds that recently proposed deep learning based DR detection methods provide higher accuracy than existing traditional feature extraction methods in the literature and are also useful on large-scale datasets. However, deep learning based methods require GPU implementation to get the desired output. Another major finding of this paper is that there are no obvious standard criteria for severity scale detection to measure the grading. Some studies used binary classes while many others used multi-stage classes.
Tasks
Published	2019-12-30
URL	https://arxiv.org/abs/1912.12829v1
PDF	https://arxiv.org/pdf/1912.12829v1.pdf
PWC	https://paperswithcode.com/paper/early-detection-of-diabetic-retinopathy-and
Repo
Framework

Adversarial Training is a Form of Data-dependent Operator Norm Regularization


Title	Adversarial Training is a Form of Data-dependent Operator Norm Regularization
Authors	Kevin Roth, Yannic Kilcher, Thomas Hofmann
Abstract	We establish a theoretical link between adversarial training and operator norm regularization for deep neural networks. Specifically, we prove that $\ell_p$-norm constrained projected gradient ascent based adversarial training with an $\ell_q$-norm loss on the logits of clean and perturbed inputs is equivalent to data-dependent (p, q) operator norm regularization. This fundamental connection confirms the long-standing argument that a network’s sensitivity to adversarial examples is tied to its spectral properties and hints at novel ways to robustify and defend against adversarial attacks. We provide extensive empirical evidence on state-of-the-art network architectures to support our theoretical results.
Tasks
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01527v4
PDF	https://arxiv.org/pdf/1906.01527v4.pdf
PWC	https://paperswithcode.com/paper/adversarial-training-generalizes-data
Repo
Framework
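
For the special case p = q = 2, the operator norm of a matrix is its largest singular value, which can be estimated by power iteration. The following is a generic sketch of that estimate, assuming a small dense matrix that stands in for a network Jacobian; it is not the authors' training procedure, and the helper names are hypothetical.

```python
import math

def matvec(a, v):
    """Multiply matrix `a` (list of rows) by vector `v`."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def spectral_norm(a, iters=100):
    """Estimate the (2, 2) operator norm of `a` by power iteration on a^T a."""
    at = transpose(a)
    # All-ones start; a start orthogonal to the top singular vector would fail.
    v = [1.0] * len(a[0])
    for _ in range(iters):
        w = matvec(at, matvec(a, v))
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    av = matvec(a, v)
    return math.sqrt(sum(x * x for x in av))

# Hypothetical stand-in for a Jacobian with singular values 3 and 1.
jacobian = [[3.0, 0.0], [0.0, 1.0]]
estimate = spectral_norm(jacobian)
```

A regularizer built on such an estimate penalizes the direction in input space to which the outputs are most sensitive, which is the kind of spectral quantity the abstract ties to adversarial training.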

Approximate Representer Theorems in Non-reflexive Banach Spaces


Title	Approximate Representer Theorems in Non-reflexive Banach Spaces
Authors	Kevin Schlegel
Abstract	The representer theorem is one of the most important mathematical foundations for regularised learning and kernel methods. Classical formulations of the theorem state sufficient conditions under which a regularisation problem on a Hilbert space admits a solution in the subspace spanned by the representers of the data points. This turns the problem into an equivalent optimisation problem in a finite dimensional space, making it computationally tractable. Moreover, Banach space methods for learning have been receiving more and more attention. Considering the representer theorem in Banach spaces is hence of increasing importance. Recently the question of the necessary condition for a representer theorem to hold in Hilbert spaces and certain Banach spaces has been considered. It has been shown that a classical representer theorem cannot exist in general in non-reflexive Banach spaces. In this paper we propose a notion of approximate solutions and approximate representer theorem to overcome this problem. We show that for these notions we can indeed extend the previous results to obtain a unified theory for the existence of representer theorems in any general Banach spaces, in particular including $l_1$-type spaces. We give a precise characterisation when a regulariser admits a classical representer theorem and when only an approximate representer theorem is possible.
Tasks
Published	2019-11-01
URL	https://arxiv.org/abs/1911.00433v1
PDF	https://arxiv.org/pdf/1911.00433v1.pdf
PWC	https://paperswithcode.com/paper/approximate-representer-theorems-in-non
Repo
Framework

ISP4ML: Understanding the Role of Image Signal Processing in Efficient Deep Learning Vision Systems


Title	ISP4ML: Understanding the Role of Image Signal Processing in Efficient Deep Learning Vision Systems
Authors	Patrick Hansen, Alexey Vilkin, Yury Khrustalev, James Imber, David Hanwell, Matthew Mattina, Paul N. Whatmough
Abstract	Convolutional neural networks (CNNs) are now predominant components in a variety of computer vision (CV) systems. These systems typically include an image signal processor (ISP), even though the ISP is traditionally designed to produce images that look appealing to humans. In CV systems, it is not clear what the role of the ISP is, or if it is even required at all for accurate prediction. In this work, we investigate the efficacy of the ISP in CNN classification tasks, and outline the system-level trade-offs between prediction accuracy and computational cost. To do so, we build software models of a configurable ISP and an imaging sensor in order to train CNNs on ImageNet with a range of different ISP settings and functionality. Results on ImageNet show that an ISP improves accuracy by 4.6%-12.2% on MobileNet architectures of different widths. Results using ResNets demonstrate that these trends also generalize to deeper networks. An ablation study of the various processing stages in a typical ISP reveals that the tone mapper is the most significant stage when operating on high dynamic range (HDR) images, providing a 5.8% average accuracy improvement alone. Overall, the ISP benefits system efficiency because the memory and computational costs of the ISP are minimal compared to the cost of using a larger CNN to achieve the same accuracy.
Tasks
Published	2019-11-18
URL	https://arxiv.org/abs/1911.07954v2
PDF	https://arxiv.org/pdf/1911.07954v2.pdf
PWC	https://paperswithcode.com/paper/isp4ml-understanding-the-role-of-image-signal
Repo
Framework

A Deep Image Compression Framework for Face Recognition


Title	A Deep Image Compression Framework for Face Recognition
Authors	Nai Bian, Feng Liang, Haisheng Fu, Bo Lei
Abstract	Face recognition technology has advanced rapidly and has been widely used in various applications. Due to the huge amount of face image data and the correspondingly large computing resources required in large-scale face recognition tasks, there is a need for a face image compression approach that is well suited to face recognition tasks. In this paper, we propose a deep convolutional autoencoder compression network for face recognition tasks. In the compression process, deep features are extracted from the original image by convolutional neural networks to produce a compact representation of the original image, which is then encoded and saved by an existing codec such as PNG. This compact representation is used by the reconstruction network to generate a reconstructed image of the original one. In order to improve face recognition accuracy when the compression framework is used in a face recognition system, we combine this compression framework with an existing face recognition network for joint optimization. We test the proposed scheme and find that after joint training, the Labeled Faces in the Wild (LFW) dataset compressed by our compression framework has higher face verification accuracy than that compressed by JPEG2000, and much higher than that compressed by JPEG.
Tasks	Face Recognition, Face Verification, Image Compression
Published	2019-07-03
URL	https://arxiv.org/abs/1907.01714v1
PDF	https://arxiv.org/pdf/1907.01714v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-image-compression-framework-for-face
Repo
Framework

Fast Rates for a kNN Classifier Robust to Unknown Asymmetric Label Noise


Title	Fast Rates for a kNN Classifier Robust to Unknown Asymmetric Label Noise
Authors	Henry W. J. Reeve, Ata Kaban
Abstract	We consider classification in the presence of class-dependent asymmetric label noise with unknown noise probabilities. In this setting, identifiability conditions are known, but additional assumptions were shown to be required for finite sample rates, and so far only the parametric rate has been obtained. Assuming these identifiability conditions, together with a measure-smoothness condition on the regression function and Tsybakov’s margin condition, we show that the Robust kNN classifier of Gao et al. attains the minimax optimal rates of the noise-free setting, up to a log factor, even when trained on data with unknown asymmetric label noise. Hence, our results provide a solid theoretical backing for this empirically successful algorithm. By contrast, the standard kNN is not even consistent in the setting of asymmetric label noise. A key idea in our analysis is a simple kNN based method for estimating the maximum of a function that requires far fewer assumptions than existing mode estimators do, and which may be of independent interest for noise proportion estimation and randomised optimisation problems.
Tasks
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04542v1
PDF	https://arxiv.org/pdf/1906.04542v1.pdf
PWC	https://paperswithcode.com/paper/fast-rates-for-a-knn-classifier-robust-to
Repo
Framework

SemanticAdv: Generating Adversarial Examples via Attribute-conditional Image Editing


Title	SemanticAdv: Generating Adversarial Examples via Attribute-conditional Image Editing
Authors	Haonan Qiu, Chaowei Xiao, Lei Yang, Xinchen Yan, Honglak Lee, Bo Li
Abstract	Deep neural networks (DNNs) have achieved great success in various applications due to their strong expressive power. However, recent studies have shown that DNNs are vulnerable to adversarial examples, which are manipulated instances that aim to mislead DNNs into making incorrect predictions. Currently, most such adversarial examples try to guarantee “subtle perturbation” by limiting the $L_p$ norm of the perturbation. In this paper, we aim to explore the impact of semantic manipulation on DNN predictions by manipulating the semantic attributes of images and generating “unrestricted adversarial examples”. In particular, we propose an algorithm, SemanticAdv, which leverages disentangled semantic factors to generate adversarial perturbation by altering controlled semantic attributes to fool the learner towards various “adversarial” targets. We conduct extensive experiments to show that the semantic based adversarial examples can not only fool different learning tasks such as face verification and landmark detection, but also achieve a high targeted attack success rate against real-world black-box services such as the Azure face verification service based on transferability. To further demonstrate the applicability of SemanticAdv beyond the face recognition domain, we also generate semantic perturbations on street-view images. Such adversarial examples with controlled semantic manipulation can shed light on further understanding of the vulnerabilities of DNNs as well as potential defensive approaches.
Tasks	Face Recognition, Face Verification
Published	2019-06-19
URL	https://arxiv.org/abs/1906.07927v2
PDF	https://arxiv.org/pdf/1906.07927v2.pdf
PWC	https://paperswithcode.com/paper/semanticadv-generating-adversarial-examples
Repo
Framework

Seismic data denoising and deblending using deep learning


Title	Seismic data denoising and deblending using deep learning
Authors	Alan Richardson, Caelen Feller
Abstract	An important step of seismic data processing is removing noise, including interference due to simultaneous and blended sources, from the recorded data. Traditional methods are time-consuming to apply as they often require manual choosing of parameters to obtain good results. We use deep learning, with a U-net model incorporating a ResNet architecture pretrained on ImageNet and further trained on synthetic seismic data, to perform this task. The method is applied to common offset gathers, with adjacent offset gathers of the gather being denoised provided as additional input channels. Here we show that this approach leads to a method that removes noise from several datasets recorded in different parts of the world with moderate success. We find that providing three adjacent offset gathers on either side of the gather being denoised is most effective. As this method does not require parameters to be chosen, it is more automated than traditional methods.
Tasks	Denoising
Published	2019-07-02
URL	https://arxiv.org/abs/1907.01497v1
PDF	https://arxiv.org/pdf/1907.01497v1.pdf
PWC	https://paperswithcode.com/paper/seismic-data-denoising-and-deblending-using
Repo
Framework

Incremental personalized E-mail spam filter using novel TFDCR feature selection with dynamic feature update


Title	Incremental personalized E-mail spam filter using novel TFDCR feature selection with dynamic feature update
Authors	Gopi Sanghani, Ketan Kotecha
Abstract	Communication through e-mail remains a highly formalized, conventional and indispensable method for the exchange of information over the Internet. The ever-increasing ratio and adversarial nature of spam e-mails have posed a great many challenges such as uneven class distribution, unequal error cost, frequent change of content and personalized context-sensitive discrimination. In this research, we propose a novel and distinctive approach to develop an incremental personalized e-mail spam filter. The proposed work is described using three significant contributions. First, we applied a novel term frequency difference and category ratio based feature selection function TFDCR to select the most discriminating features irrespective of the number of samples in each class. Second, an incremental learning model is used which enables the classifier to update the discriminant function dynamically. Third, a heuristic function called selectionRankWeight is introduced to upgrade the existing feature set that determines new features carrying strong discriminating ability from an incoming set of e-mails. Three public e-mail datasets possessing different characteristics are used to evaluate the filter performance. Experiments are conducted to compare the feature selection efficiency of TFDCR and to observe the filter performance under both the batch and the incremental learning mode. The results demonstrate the superiority of TFDCR as the most effective feature selection function. The incremental learning model incorporating the dynamic feature update function overcomes the problem of drifting concepts. The proposed filter validates its efficiency and feasibility by substantially improving the classification accuracy and reducing the false positive error of misclassifying legitimate e-mail as spam.
Tasks	Feature Selection
Published	2019-04-27
URL	http://arxiv.org/abs/1904.12118v1
PDF	http://arxiv.org/pdf/1904.12118v1.pdf
PWC	https://paperswithcode.com/paper/incremental-personalized-e-mail-spam-filter
Repo
Framework

Effect of Activation Functions on the Training of Overparametrized Neural Nets


Title	Effect of Activation Functions on the Training of Overparametrized Neural Nets
Authors	Abhishek Panigrahi, Abhishek Shetty, Navin Goyal
Abstract	It is well-known that overparametrized neural networks trained using gradient-based methods quickly achieve small training error with appropriate hyperparameter settings. Recent papers have proved this statement theoretically for highly overparametrized networks under reasonable assumptions. These results either assume that the activation function is ReLU or they crucially depend on the minimum eigenvalue of a certain Gram matrix depending on the data, random initialization and the activation function. In the latter case, existing works only prove that this minimum eigenvalue is non-zero and do not provide quantitative bounds. On the empirical side, a contemporary line of investigations has proposed a number of alternative activation functions which tend to perform better than ReLU at least in some settings, but no clear understanding has emerged. This state of affairs underscores the importance of theoretically understanding the impact of activation functions on training. In the present paper, we provide theoretical results about the effect of the activation function on the training of highly overparametrized 2-layer neural networks. A crucial property that governs the performance of an activation is whether or not it is smooth. For non-smooth activations such as ReLU, SELU and ELU, all eigenvalues of the associated Gram matrix are large under minimal assumptions on the data. For smooth activations such as tanh, swish and polynomials, the situation is more complex. If the subspace spanned by the data has small dimension then the minimum eigenvalue of the Gram matrix can be small, leading to slow training. But if the dimension is large and the data satisfies another mild condition, then the eigenvalues are large. If we allow deep networks, then the small data dimension is not a limitation provided that the depth is sufficient. We discuss a number of extensions and applications of these results.
Tasks
Published	2019-08-16
URL	https://arxiv.org/abs/1908.05660v3
PDF	https://arxiv.org/pdf/1908.05660v3.pdf
PWC	https://paperswithcode.com/paper/effect-of-activation-functions-on-the
Repo
Framework

A Practical Maximum Clique Algorithm for Matching with Pairwise Constraints


Title	A Practical Maximum Clique Algorithm for Matching with Pairwise Constraints
Authors	Álvaro Parra, Tat-Jun Chin, Frank Neumann, Tobias Friedrich, Maximilian Katzmann
Abstract	A popular paradigm for 3D point cloud registration is to extract 3D keypoint correspondences, then estimate the registration function from the correspondences using a robust algorithm. However, many existing 3D keypoint techniques tend to produce large proportions of erroneous correspondences or outliers, which significantly increases the cost of robust estimation. An alternative approach is to directly search for the subset of correspondences that are pairwise consistent, without optimising the registration function. This gives rise to the combinatorial problem of matching with pairwise constraints. In this paper, we propose a very efficient maximum clique algorithm to solve matching with pairwise constraints. Our technique combines tree searching with efficient bounding and pruning based on graph colouring. We demonstrate that, despite the theoretical intractability, many real problem instances can be solved exactly and quickly (seconds to minutes) with our algorithm, which makes our approach an excellent alternative to standard robust techniques for 3D registration.
Tasks	Point Cloud Registration
Published	2019-02-05
URL	https://arxiv.org/abs/1902.01534v2
PDF	https://arxiv.org/pdf/1902.01534v2.pdf
PWC	https://paperswithcode.com/paper/a-practical-maximum-clique-algorithm-for
Repo
Framework
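
The idea of matching with pairwise constraints in the abstract above can be sketched as a maximum clique search on a consistency graph. The sketch below makes two assumptions that are not taken from the paper: two correspondences count as consistent when they preserve the distance between their points within a tolerance `eps`, and the search uses a plain size bound rather than the colouring-based bound the authors describe. All names and data are hypothetical.

```python
import math

def consistency_graph(src, dst, eps):
    """Connect correspondences i, j whose pairwise distances agree within eps."""
    n = len(src)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            gap = abs(math.dist(src[i], src[j]) - math.dist(dst[i], dst[j]))
            if gap <= eps:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def max_clique(adj):
    """Exact maximum clique by tree search with a simple size bound."""
    best = []

    def expand(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = list(clique)
        for v in sorted(candidates):
            # Bound: even taking every remaining candidate cannot beat `best`.
            if len(clique) + len(candidates) <= len(best):
                return
            candidates = candidates - {v}
            expand(clique + [v], candidates & adj[v])

    expand([], set(adj))
    return sorted(best)

# Hypothetical data: three correspondences related by a translation, one outlier.
src = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
dst = [(5, 5, 5), (6, 5, 5), (5, 6, 5), (20, 0, 0)]
inliers = max_clique(consistency_graph(src, dst, eps=1e-6))
```

Here `inliers` holds the indices of the largest mutually consistent set of correspondences, found without estimating the registration function.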

Bipolar Morphological Neural Networks: Convolution Without Multiplication


Title	Bipolar Morphological Neural Networks: Convolution Without Multiplication
Authors	Elena Limonova, Daniil Matveev, Dmitry Nikolaev, Vladimir V. Arlazarov
Abstract	In this paper we introduce novel bipolar morphological neuron and bipolar morphological layer models. The models use only operations such as addition, subtraction and maximum inside the neuron, and exponent and logarithm as activation functions for the layer. Unlike previously introduced morphological neural networks, the proposed models approximate the classical computations and show better recognition results. We also propose a layer-by-layer approach to train the bipolar morphological networks, which can be further developed into an incremental approach for separate neurons to get higher accuracy. Neither approach requires special training algorithms, and both can use a variety of gradient descent methods. To demonstrate the efficiency of the proposed model we consider classical convolutional neural networks and convert the pre-trained convolutional layers to bipolar morphological layers. Since the experiments on recognition of MNIST and MRZ symbols show only a moderate decrease in accuracy after conversion and training, the bipolar neuron model can provide faster inference and be very useful in mobile and embedded systems.
Tasks
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01971v1
PDF	https://arxiv.org/pdf/1911.01971v1.pdf
PWC	https://paperswithcode.com/paper/bipolar-morphological-neural-networks
Repo
Framework

ViterbiNet: A Deep Learning Based Viterbi Algorithm for Symbol Detection


Title	ViterbiNet: A Deep Learning Based Viterbi Algorithm for Symbol Detection
Authors	Nir Shlezinger, Nariman Farsad, Yonina C. Eldar, Andrea J. Goldsmith
Abstract	Symbol detection plays an important role in the implementation of digital receivers. In this work, we propose ViterbiNet, which is a data-driven symbol detector that does not require channel state information (CSI). ViterbiNet is obtained by integrating deep neural networks (DNNs) into the Viterbi algorithm. We identify the specific parts of the Viterbi algorithm that are channel-model-based, and design a DNN to implement only those computations, leaving the rest of the algorithm structure intact. We then propose a meta-learning based approach to train ViterbiNet online based on recent decisions, allowing the receiver to track dynamic channel conditions without requiring new training samples for every coherence block. Our numerical evaluations demonstrate that the performance of ViterbiNet, which is ignorant of the CSI, approaches that of the CSI-based Viterbi algorithm, and is capable of tracking time-varying channels without needing instantaneous CSI or additional training data. Moreover, unlike conventional Viterbi detection, ViterbiNet is robust to CSI uncertainty, and it can be reliably implemented in complex channel models with constrained computational burden. More broadly, our results demonstrate the conceptual benefit of designing communication systems that integrate DNNs into established algorithms.
Tasks	Meta-Learning
Published	2019-05-26
URL	https://arxiv.org/abs/1905.10750v1
PDF	https://arxiv.org/pdf/1905.10750v1.pdf
PWC	https://paperswithcode.com/paper/viterbinet-a-deep-learning-based-viterbi
Repo
Framework
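
The separation described in the abstract above, where only the channel-dependent part of the Viterbi algorithm is swapped out, can be sketched with a pluggable likelihood callback. This is a generic Viterbi decoder over a simplified trellis, not the ViterbiNet implementation; the `log_lik` interface and the Gaussian stand-in are assumptions made for illustration, and a learned model would take the stand-in's place.

```python
import math

def viterbi(observations, n_states, log_trans, log_lik):
    """Most likely state path; `log_lik(y, s)` is the only model-dependent part.

    Assumes a non-empty observation list and a uniform prior on the first state.
    """
    score = [log_lik(observations[0], s) for s in range(n_states)]
    back = []
    for y in observations[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[prev] + log_trans[prev][s] + log_lik(y, s))
            pointers.append(prev)
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def gaussian_log_lik(y, s, means=(-1.0, 1.0), sigma=0.5):
    """Hand-written stand-in for a learned likelihood: y ~ N(means[s], sigma^2)."""
    return -((y - means[s]) ** 2) / (2 * sigma ** 2)

# Hypothetical two-state example with equally likely transitions.
log_trans = [[math.log(0.5)] * 2 for _ in range(2)]
received = [-0.9, 1.1, 0.8, -1.2]
decoded = viterbi(received, 2, log_trans, gaussian_log_lik)
```

The recursion and traceback never inspect the channel model directly, so replacing `gaussian_log_lik` with a trained estimator leaves the rest of the decoder unchanged.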

The Roadmap to 6G – AI Empowered Wireless Networks


Title	The Roadmap to 6G – AI Empowered Wireless Networks
Authors	Khaled B. Letaief, Wei Chen, Yuanming Shi, Jun Zhang, Ying-Jun Angela Zhang
Abstract	The recent upsurge of diversified mobile applications, especially those supported by Artificial Intelligence (AI), is spurring heated discussions on the future evolution of wireless communications. While 5G is being deployed around the world, efforts from industry and academia have started to look beyond 5G and conceptualize 6G. We envision 6G to undergo an unprecedented transformation that will make it substantially different from the previous generations of wireless cellular systems. In particular, 6G will go beyond mobile Internet and will be required to support ubiquitous AI services from the core to the end devices of the network. Meanwhile, AI will play a critical role in designing and optimizing 6G architectures, protocols, and operations. In this article, we discuss potential technologies for 6G to enable mobile AI applications, as well as AI-enabled methodologies for 6G network design and optimization. Key trends in the evolution to 6G will also be discussed.
Tasks
Published	2019-04-26
URL	https://arxiv.org/abs/1904.11686v2
PDF	https://arxiv.org/pdf/1904.11686v2.pdf
PWC	https://paperswithcode.com/paper/the-roadmap-to-6g-ai-empowered-wireless
Repo
Framework