April 1, 2020

Paper Group ANR 394

Differentially Private ERM Based on Data Perturbation

Title Differentially Private ERM Based on Data Perturbation
Authors Yilin Kang, Yong Liu, Lizhong Ding, Xinwang Liu, Xinyi Tong, Weiping Wang
Abstract In this paper, after observing that different training data instances affect the machine learning model to different extents, we attempt to improve the performance of differentially private empirical risk minimization (DP-ERM) from a new perspective. Specifically, we measure the contributions of various training data instances on the final machine learning model, and select some of them to add random noise. Considering that the key of our method is to measure each data instance separately, we propose a new 'Data perturbation' based (DB) paradigm for DP-ERM: adding random noise to the original training data and achieving ($\epsilon,\delta$)-differential privacy on the final machine learning model, along with the preservation on the original data. By introducing the Influence Function (IF), we quantitatively measure the impact of the training data on the final model. Theoretical and experimental results show that our proposed DBDP-ERM paradigm enhances the model performance significantly.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.08578v1
PDF https://arxiv.org/pdf/2002.08578v1.pdf
PWC https://paperswithcode.com/paper/differentially-private-erm-based-on-data
Repo
Framework

Phonetic Feedback for Speech Enhancement With and Without Parallel Speech Data

Title Phonetic Feedback for Speech Enhancement With and Without Parallel Speech Data
Authors Peter Plantinga, Deblin Bagchi, Eric Fosler-Lussier
Abstract While deep learning systems have gained significant ground in speech enhancement research, these systems have yet to make use of the full potential of deep learning systems to provide high-level feedback. In particular, phonetic feedback is rare in speech enhancement research even though it includes valuable top-down information. We use the technique of mimic loss to provide phonetic feedback to an off-the-shelf enhancement system, and find gains in objective intelligibility scores on CHiME-4 data. This technique takes a frozen acoustic model trained on clean speech to provide valuable feedback to the enhancement model, even in the case where no parallel speech data is available. Our work is one of the first to show intelligibility improvement for neural enhancement systems without parallel speech data, and we show phonetic feedback can improve a state-of-the-art neural enhancement system trained with parallel speech data.
Tasks Speech Enhancement
Published 2020-03-03
URL https://arxiv.org/abs/2003.01769v1
PDF https://arxiv.org/pdf/2003.01769v1.pdf
PWC https://paperswithcode.com/paper/phonetic-feedback-for-speech-enhancement-with
Repo
Framework

Vision-Dialog Navigation by Exploring Cross-modal Memory

Title Vision-Dialog Navigation by Exploring Cross-modal Memory
Authors Yi Zhu, Fengda Zhu, Zhaohuan Zhan, Bingqian Lin, Jianbin Jiao, Xiaojun Chang, Xiaodan Liang
Abstract Vision-dialog navigation, posed as a new holy-grail task in the vision-language field, aims to learn an agent that can hold a continuing natural-language conversation to ask for help and navigate according to human responses. Besides the common challenges faced in visual language navigation, vision-dialog navigation also requires handling the language intentions of a series of questions about the temporal context from the dialogue history, and co-reasoning over both dialogs and visual scenes. In this paper, we propose the Cross-modal Memory Network (CMN) for remembering and understanding the rich information relevant to historical navigation actions. Our CMN consists of two memory modules, the language memory module (L-mem) and the visual memory module (V-mem). Specifically, L-mem learns latent relationships between the current language interaction and the dialog history by employing a multi-head attention mechanism. V-mem learns to associate the current visual views with the cross-modal memory about the previous navigation actions. The cross-modal memory is generated via a vision-to-language attention and a language-to-vision attention. Benefiting from the collaborative learning of the L-mem and the V-mem, our CMN is able to explore the memory of the decision making behind historical navigation actions that is relevant to the current step. Experiments on the CVDN dataset show that our CMN outperforms the previous state-of-the-art model by a significant margin on both seen and unseen environments.
Tasks Decision Making
Published 2020-03-15
URL https://arxiv.org/abs/2003.06745v1
PDF https://arxiv.org/pdf/2003.06745v1.pdf
PWC https://paperswithcode.com/paper/vision-dialog-navigation-by-exploring-cross
Repo
Framework

Characterizing Speech Adversarial Examples Using Self-Attention U-Net Enhancement

Title Characterizing Speech Adversarial Examples Using Self-Attention U-Net Enhancement
Authors Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Xiaoli Ma, Chin-Hui Lee
Abstract Recent studies have highlighted adversarial examples as ubiquitous threats to the deep neural network (DNN) based speech recognition systems. In this work, we present a U-Net based attention model, U-Net$_{At}$, to enhance adversarial speech signals. Specifically, we evaluate the model performance by interpretable speech recognition metrics and discuss the model performance by the augmented adversarial training. Our experiments show that our proposed U-Net$_{At}$ improves the perceptual evaluation of speech quality (PESQ) from 1.13 to 2.78, speech transmission index (STI) from 0.65 to 0.75, short-term objective intelligibility (STOI) from 0.83 to 0.96 on the task of speech enhancement with adversarial speech examples. We conduct experiments on the automatic speech recognition (ASR) task with adversarial audio attacks. We find that (i) temporal features learned by the attention network are capable of enhancing the robustness of DNN based ASR models; (ii) the generalization power of DNN based ASR model could be enhanced by applying adversarial training with an additive adversarial data augmentation. The ASR metric on word-error-rates (WERs) shows that there is an absolute 2.22% decrease under gradient-based perturbation, and an absolute 2.03% decrease under evolutionary-optimized perturbation, which suggests that our enhancement models with adversarial training can further secure a resilient ASR system.
Tasks Data Augmentation, Speech Enhancement, Speech Recognition
Published 2020-03-31
URL https://arxiv.org/abs/2003.13917v1
PDF https://arxiv.org/pdf/2003.13917v1.pdf
PWC https://paperswithcode.com/paper/characterizing-speech-adversarial-examples
Repo
Framework

Deep learning approach for breast cancer diagnosis

Title Deep learning approach for breast cancer diagnosis
Authors Essam A. Rashed, M. Samir Abou El Seoud
Abstract Breast cancer is one of the leading fatal diseases worldwide, yet it can be well controlled if discovered early. The conventional method for breast screening is x-ray mammography, which is known to be challenging for early detection of cancer lesions. The dense breast structure produced by the compression process during imaging makes it difficult to recognize small abnormalities. Also, inter- and intra-variations of breast tissues make it difficult to achieve high diagnostic accuracy using hand-crafted features. Deep learning is an emerging machine learning technology that requires relatively high computation power. Yet, it has proved to be very effective in several difficult tasks that require decision making at the level of human intelligence. In this paper, we develop a new network architecture inspired by the U-net structure that can be used for effective and early detection of breast cancer. Results indicate a high rate of sensitivity and specificity, which suggests the potential usefulness of the proposed approach in clinical use.
Tasks Decision Making
Published 2020-03-10
URL https://arxiv.org/abs/2003.04480v1
PDF https://arxiv.org/pdf/2003.04480v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-approach-for-breast-cancer
Repo
Framework

Multifactorial Cellular Genetic Algorithm (MFCGA): Algorithmic Design, Performance Comparison and Genetic Transferability Analysis

Title Multifactorial Cellular Genetic Algorithm (MFCGA): Algorithmic Design, Performance Comparison and Genetic Transferability Analysis
Authors Eneko Osaba, Aritz D. Martinez, Jesus L. Lobo, Javier Del Ser, Francisco Herrera
Abstract Multitasking optimization is an incipient research area that has lately been gaining notable research momentum. Unlike the traditional optimization paradigm, which focuses on solving a single task at a time, multitasking addresses how multiple optimization problems can be tackled simultaneously by performing a single search process. The main objective to achieve this goal efficiently is to exploit synergies between the problems (tasks) to be optimized, helping each other via knowledge transfer (thereby being referred to as Transfer Optimization). Furthermore, the equally recent concept of Evolutionary Multitasking (EM) refers to multitasking environments adopting concepts from Evolutionary Computation as their inspiration for the simultaneous solving of the problems under consideration. As such, EM approaches such as the Multifactorial Evolutionary Algorithm (MFEA) have shown remarkable success when dealing with multiple discrete, continuous, single-, and/or multi-objective optimization problems. In this work we propose a novel algorithmic scheme for Multifactorial Optimization scenarios - the Multifactorial Cellular Genetic Algorithm (MFCGA) - that hinges on concepts from Cellular Automata to implement mechanisms for exchanging knowledge among problems. We conduct an extensive performance analysis of the proposed MFCGA and compare it to the canonical MFEA under the same algorithmic conditions and over 15 different multitasking setups (encompassing different reference instances of the discrete Traveling Salesman Problem). A further contribution of this analysis beyond performance benchmarking is a quantitative examination of the genetic transferability among the problem instances, eliciting an empirical demonstration of the synergies that emerge between the different optimization tasks along the MFCGA search process.
Tasks Transfer Learning
Published 2020-03-24
URL https://arxiv.org/abs/2003.10768v1
PDF https://arxiv.org/pdf/2003.10768v1.pdf
PWC https://paperswithcode.com/paper/multifactorial-cellular-genetic-algorithm
Repo
Framework

Flow descriptors of human mobility networks

Title Flow descriptors of human mobility networks
Authors David Pastor-Escuredo, Enrique Frias-Martinez
Abstract Mobile phone data has enabled the timely and fine-grained study of human mobility. Call Detail Records, generated at call events, allow building descriptions of mobility at different resolutions and with different spatial, temporal and social granularity. Individual trajectories are the basis for long-term observation of mobility patterns and for identifying factors of human dynamics. Here we propose a systematic analysis to characterize mobility network flows and topology and assess their impact on individual traces. Discrete flow-based descriptors are used to classify and understand human mobility patterns at multiple scales. This framework is suitable to assess urban planning, optimize transportation, measure the impact of external events and conditions, monitor internal dynamics and profile users according to their movement patterns.
Tasks Human Dynamics
Published 2020-03-16
URL https://arxiv.org/abs/2003.07279v1
PDF https://arxiv.org/pdf/2003.07279v1.pdf
PWC https://paperswithcode.com/paper/flow-descriptors-of-human-mobility-networks
Repo
Framework

EvoNet: A Neural Network for Predicting the Evolution of Dynamic Graphs

Title EvoNet: A Neural Network for Predicting the Evolution of Dynamic Graphs
Authors Changmin Wu, Giannis Nikolentzos, Michalis Vazirgiannis
Abstract Neural networks for structured data like graphs have been studied extensively in recent years. To date, the bulk of research activity has focused mainly on static graphs. However, most real-world networks are dynamic since their topology tends to change over time. Predicting the evolution of dynamic graphs is a task of high significance in the area of graph mining. Despite its practical importance, the task has not been explored in depth so far, mainly due to its challenging nature. In this paper, we propose a model that predicts the evolution of dynamic graphs. Specifically, we use a graph neural network along with a recurrent architecture to capture the temporal evolution patterns of dynamic graphs. Then, we employ a generative model which predicts the topology of the graph at the next time step and constructs a graph instance that corresponds to that topology. We evaluate the proposed model on several artificial datasets following common network evolving dynamics, as well as on real-world datasets. Results demonstrate the effectiveness of the proposed model.
Tasks
Published 2020-03-02
URL https://arxiv.org/abs/2003.00842v1
PDF https://arxiv.org/pdf/2003.00842v1.pdf
PWC https://paperswithcode.com/paper/evonet-a-neural-network-for-predicting-the-1
Repo
Framework

Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures

Title Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures
Authors Mohamed El Amine Seddik, Cosme Louart, Mohamed Tamaazousti, Romain Couillet
Abstract This paper shows that deep learning (DL) representations of data produced by generative adversarial nets (GANs) are random vectors which fall within the class of so-called concentrated random vectors. Further exploiting the fact that Gram matrices, of the type $G = X^T X$ with $X=[x_1,\ldots,x_n]\in \mathbb{R}^{p\times n}$ and $x_i$ independent concentrated random vectors from a mixture model, behave asymptotically (as $n,p\to \infty$) as if the $x_i$ were drawn from a Gaussian mixture, suggests that DL representations of GAN-data can be fully described by their first two statistical moments for a wide range of standard classifiers. Our theoretical findings are validated by generating images with the BigGAN model and across different popular deep representation networks.
Tasks
Published 2020-01-21
URL https://arxiv.org/abs/2001.08370v1
PDF https://arxiv.org/pdf/2001.08370v1.pdf
PWC https://paperswithcode.com/paper/random-matrix-theory-proves-that-deep-1
Repo
Framework

Tangent-Space Gradient Optimization of Tensor Network for Machine Learning

Title Tangent-Space Gradient Optimization of Tensor Network for Machine Learning
Authors Zheng-zhi Sun, Shi-ju Ran, Gang Su
Abstract The gradient-based optimization method for deep machine learning models suffers from gradient vanishing and exploding problems, particularly when the computational graph becomes deep. In this work, we propose the tangent-space gradient optimization (TSGO) for the probabilistic models to keep the gradients from vanishing or exploding. The central idea is to guarantee the orthogonality between the variational parameters and the gradients. The optimization is then implemented by rotating the parameter vector towards the direction of the gradient. We explain and test TSGO in tensor network (TN) machine learning, where the TN describes the joint probability distribution as a normalized state $|\psi\rangle$ in Hilbert space. We show that the gradient can be restricted to the tangent space of the $\langle \psi | \psi \rangle = 1$ hyper-sphere. Instead of additional adaptive methods to control the learning rate in deep learning, the learning rate of TSGO is naturally determined by the angle $\theta$ as $\eta = \tan \theta$. Our numerical results reveal better convergence of TSGO in comparison to the off-the-shelf Adam.
Tasks
Published 2020-01-10
URL https://arxiv.org/abs/2001.04029v1
PDF https://arxiv.org/pdf/2001.04029v1.pdf
PWC https://paperswithcode.com/paper/tangent-space-gradient-optimization-of-tensor
Repo
Framework
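
The update described in the TSGO abstract above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `tsgo_step`, the explicit projection of the gradient onto the tangent space, the use of a unit gradient direction, and the descent sign convention are all assumptions made for illustration. Only the ideas of keeping the gradient orthogonal to the parameters, rotating the parameter vector, and choosing $\eta = \tan \theta$ come from the abstract.

```python
import numpy as np

def tsgo_step(w, grad, theta):
    """One illustrative tangent-space step (hypothetical helper, not from the paper).

    w     : parameter vector, treated as a point on the unit hyper-sphere
    grad  : gradient of the loss with respect to w
    theta : rotation angle; the step size is eta = tan(theta)
    """
    w = w / np.linalg.norm(w)                 # enforce <w, w> = 1
    g_tan = grad - np.dot(grad, w) * w        # drop the component parallel to w
    g_norm = np.linalg.norm(g_tan)
    if g_norm == 0.0:
        return w                              # no tangent direction to move along
    g_hat = g_tan / g_norm                    # unit vector orthogonal to w
    w_new = w - np.tan(theta) * g_hat         # step with eta = tan(theta)
    return w_new / np.linalg.norm(w_new)      # renormalize back onto the sphere
```

Because `g_hat` is orthogonal to `w`, renormalizing `w - tan(theta) * g_hat` gives `cos(theta) * w - sin(theta) * g_hat`, so under these assumptions each step rotates the parameter vector by exactly `theta` regardless of the gradient's magnitude.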

Gravitational Wave Detection and Information Extraction via Neural Networks

Title Gravitational Wave Detection and Information Extraction via Neural Networks
Authors Gerson R. Santos, Marcela P. Figueiredo, Antonio de Pádua Santos, Pavlos Protopapas, Tiago A. E. Ferreira
Abstract The Laser Interferometer Gravitational-Wave Observatory (LIGO) was the first laboratory to measure gravitational waves. An exceptional experimental design was needed to measure distance changes much smaller than the radius of a proton. In the same way, the data analysis needed to confirm a detection and extract information is a tremendously hard task. Here, we show a computational procedure based on artificial neural networks to detect a gravitational wave event and extract its ring-down time from the LIGO data. With this proposal, it is possible to make a probabilistic thermometer for gravitational wave detection and obtain physical information about the astronomical body system that created the phenomenon. Here, the ring-down time is determined by a direct measurement on the data, without the need for numerical relativity techniques and high computational power.
Tasks Gravitational Wave Detection
Published 2020-03-22
URL https://arxiv.org/abs/2003.09995v1
PDF https://arxiv.org/pdf/2003.09995v1.pdf
PWC https://paperswithcode.com/paper/gravitational-wave-detection-and-information
Repo
Framework

Using Single-Step Adversarial Training to Defend Iterative Adversarial Examples

Title Using Single-Step Adversarial Training to Defend Iterative Adversarial Examples
Authors Guanxiong Liu, Issa Khalil, Abdallah Khreishah
Abstract Adversarial examples have become one of the largest challenges that machine learning models, especially neural network classifiers, face. These adversarial examples break the assumption of an attack-free scenario and fool state-of-the-art (SOTA) classifiers with perturbations that are insignificant to humans. So far, researchers have achieved great progress in utilizing adversarial training as a defense. However, the overwhelming computational cost degrades its applicability and little has been done to overcome this issue. Single-step adversarial training methods have been proposed as computationally viable solutions; however, they still fail to defend against iterative adversarial examples. In this work, we first experimentally analyze several different SOTA defense methods against adversarial examples. Then, based on observations from experiments, we propose a novel single-step adversarial training method which can defend against both single-step and iterative adversarial examples. Lastly, through extensive evaluations, we demonstrate that our proposed method outperforms the SOTA single-step and iterative adversarial training defense. Compared with ATDA (single-step method) on the CIFAR10 dataset, our proposed method achieves a 35.67% enhancement in test accuracy and a 19.14% reduction in training time. When compared with methods that use BIM or Madry examples (iterative methods) on the CIFAR10 dataset, it saves up to 76.03% in training time with less than 3.78% degeneration in test accuracy.
Tasks
Published 2020-02-22
URL https://arxiv.org/abs/2002.09632v2
PDF https://arxiv.org/pdf/2002.09632v2.pdf
PWC https://paperswithcode.com/paper/using-single-step-adversarial-training-to
Repo
Framework

Time-Frequency Analysis based Blind Modulation Classification for Multiple-Antenna Systems

Title Time-Frequency Analysis based Blind Modulation Classification for Multiple-Antenna Systems
Authors Weiheng Jiang, Xiaogang Wu, Bolin Chen, Wenjiang Feng, Yi Jin
Abstract Blind modulation classification is an important step to implement cognitive radio networks. The multiple-input multiple-output (MIMO) technique is widely used in military and civil communication systems. Due to the lack of prior information about channel parameters and the overlapping of signals in the MIMO systems, the traditional likelihood-based and feature-based approaches cannot be applied in these scenarios directly. Hence, in this paper, to resolve the problem of blind modulation classification in MIMO systems, the time-frequency analysis method based on the windowed short-time Fourier transform is used to analyse the time-frequency characteristics of time-domain modulated signals. Then the extracted time-frequency characteristics are converted into RGB spectrogram images, and the convolutional neural network based on transfer learning is applied to classify the modulation types according to the RGB spectrogram images. Finally, a decision fusion module is used to fuse the classification results of all the receive antennas. Through simulations, we analyse the classification performance at different signal-to-noise ratios (SNRs). The results indicate that, for the single-input single-output (SISO) network, our proposed scheme can achieve 92.37% and 99.12% average classification accuracy at SNRs of -4 dB and 10 dB, respectively. For the MIMO network, our scheme achieves 80.42% and 87.92% average classification accuracy at -4 dB and 10 dB, respectively. This outperforms the existing classification methods based on baseband signals.
Tasks Transfer Learning
Published 2020-04-01
URL https://arxiv.org/abs/2004.00378v1
PDF https://arxiv.org/pdf/2004.00378v1.pdf
PWC https://paperswithcode.com/paper/time-frequency-analysis-based-blind
Repo
Framework
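
The first and last stages of the pipeline in the abstract above (windowed short-time Fourier transform, then fusion of per-antenna decisions) can be sketched in plain NumPy. Everything here is a hedged illustration rather than the paper's code: the function names `stft_magnitude` and `fuse_votes`, the Hann window, the window and hop lengths, and the majority-vote fusion rule are assumptions, since the abstract does not specify them. The conversion to RGB spectrogram images and the transfer-learned CNN are deliberately left out.

```python
import numpy as np

def stft_magnitude(x, win_len=64, hop=16):
    """Magnitude of a windowed short-time Fourier transform (illustrative helper).

    Returns an array of shape (win_len, n_frames): frequency bins by time frames.
    """
    window = np.hanning(win_len)                      # assumed window choice
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([
        x[i * hop : i * hop + win_len] * window       # windowed segment i
        for i in range(n_frames)
    ])
    return np.abs(np.fft.fft(frames, axis=1)).T       # FFT per frame, then transpose

def fuse_votes(labels):
    """Majority vote over per-antenna labels (assumed fusion rule)."""
    values, counts = np.unique(labels, return_counts=True)
    return values[np.argmax(counts)]
```

In use, each receive antenna's time-domain signal would be passed through `stft_magnitude`, the resulting spectrogram classified by some model, and the per-antenna labels combined with `fuse_votes`. Majority voting is only one simple option; a fusion module could equally weight antennas by classifier confidence.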

Diagnosing COVID-19 Pneumonia from X-Ray and CT Images using Deep Learning and Transfer Learning Algorithms

Title Diagnosing COVID-19 Pneumonia from X-Ray and CT Images using Deep Learning and Transfer Learning Algorithms
Authors Halgurd S. Maghdid, Aras T. Asaad, Kayhan Zrar Ghafoor, Ali Safaa Sadiq, Muhammad Khurram Khan
Abstract COVID-19 (also known as 2019 Novel Coronavirus) first emerged in Wuhan, China and spread across the globe with unprecedented effect and has now become the greatest crisis of the modern era. COVID-19 has created far more pervasive demands for diagnosis, which have driven researchers to develop more intelligent, highly responsive and efficient detection methods. In this work, we focus on proposing AI tools that can be used by radiologists or healthcare professionals to diagnose COVID-19 cases in a quick and accurate manner. However, the lack of a publicly available dataset of X-ray and CT images makes the design of such AI tools a challenging task. To this end, this study aims to build a comprehensive dataset of X-ray and CT scan images from multiple sources and provides a simple but effective COVID-19 detection technique using deep learning and transfer learning algorithms. In this vein, a simple convolutional neural network (CNN) and a modified pre-trained AlexNet model are applied to the prepared dataset of X-ray and CT scan images. The results of the experiments show that the utilized models can provide accuracy of up to 98% via the pre-trained network and 94.1% by using the modified CNN.
Tasks COVID-19 Detection, Transfer Learning
Published 2020-03-31
URL https://arxiv.org/abs/2004.00038v1
PDF https://arxiv.org/pdf/2004.00038v1.pdf
PWC https://paperswithcode.com/paper/diagnosing-covid-19-pneumonia-from-x-ray-and
Repo
Framework

Transfer Learning of Photometric Phenotypes in Agriculture Using Metadata

Title Transfer Learning of Photometric Phenotypes in Agriculture Using Metadata
Authors Dan Halbersberg, Aharon Bar Hillel, Shon Mendelson, Daniel Koster, Lena Karol, Boaz Lerner
Abstract Estimation of photometric plant phenotypes (e.g., hue, shine, chroma) in field conditions is important for decisions on the expected yield quality, fruit ripeness, and need for further breeding. Estimating these from images is difficult due to large variances in lighting conditions, shadows, and sensor properties. We combine the image and metadata regarding capturing conditions embedded into a network, enabling more accurate estimation and transfer between different conditions. Compared to a state-of-the-art deep CNN and a human expert, metadata embedding improves the estimation of the tomato’s hue and chroma.
Tasks Transfer Learning
Published 2020-04-01
URL https://arxiv.org/abs/2004.00303v1
PDF https://arxiv.org/pdf/2004.00303v1.pdf
PWC https://paperswithcode.com/paper/transfer-learning-of-photometric-phenotypes
Repo
Framework