Paper Group ANR 394
Differentially Private ERM Based on Data Perturbation
Title | Differentially Private ERM Based on Data Perturbation |
Authors | Yilin Kang, Yong Liu, Lizhong Ding, Xinwang Liu, Xinyi Tong, Weiping Wang |
Abstract | In this paper, after observing that different training data instances affect the machine learning model to different extents, we attempt to improve the performance of differentially private empirical risk minimization (DP-ERM) from a new perspective. Specifically, we measure the contributions of various training data instances to the final machine learning model, and select some of them to add random noise. Considering that the key to our method is to measure each data instance separately, we propose a new 'Data perturbation' based (DB) paradigm for DP-ERM: adding random noise to the original training data and achieving ($\epsilon,\delta$)-differential privacy on the final machine learning model, along with the preservation on the original data. By introducing the Influence Function (IF), we quantitatively measure the impact of the training data on the final model. Theoretical and experimental results show that our proposed DBDP-ERM paradigm enhances the model performance significantly. |
Tasks | |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.08578v1 |
https://arxiv.org/pdf/2002.08578v1.pdf | |
PWC | https://paperswithcode.com/paper/differentially-private-erm-based-on-data |
Repo | |
Framework | |
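The data-perturbation idea in the abstract above can be illustrated with a rough sketch. Everything in it is an assumption made for illustration: the helper names `gaussian_sigma` and `perturb_selected`, the rule of perturbing the `k` highest-scoring rows, and the placeholder `scores` standing in for influence values. The noise scale uses the classical Gaussian-mechanism calibration for a query with L2 sensitivity `sensitivity` and $0 < \epsilon < 1$, which is not necessarily the calibration the paper derives.

```python
import math
import random


def gaussian_sigma(epsilon, delta, sensitivity):
    """Classical Gaussian-mechanism noise scale (valid for 0 < epsilon < 1)."""
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("this calibration assumes 0 < epsilon < 1 and 0 < delta < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def perturb_selected(rows, scores, k, sigma, rng):
    """Add Gaussian noise to the k rows with the largest score; copy the rest unchanged.

    `scores` is a hypothetical stand-in for per-instance influence values.
    """
    order = sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)
    chosen = set(order[:k])
    return [
        [v + rng.gauss(0.0, sigma) for v in row] if i in chosen else list(row)
        for i, row in enumerate(rows)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    sigma = gaussian_sigma(epsilon=0.5, delta=1e-5, sensitivity=1.0)
    print(perturb_selected(data, scores=[0.1, 0.9, 0.5], k=1, sigma=sigma, rng=rng))
```

Perturbing only a subset of rows is the point of the sketch; whether a given selection rule and noise scale actually yield ($\epsilon,\delta$)-differential privacy for the trained model is what the paper's analysis addresses, and this code does not establish it.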
Phonetic Feedback for Speech Enhancement With and Without Parallel Speech Data
Title | Phonetic Feedback for Speech Enhancement With and Without Parallel Speech Data |
Authors | Peter Plantinga, Deblin Bagchi, Eric Fosler-Lussier |
Abstract | While deep learning systems have gained significant ground in speech enhancement research, these systems have yet to make use of the full potential of deep learning systems to provide high-level feedback. In particular, phonetic feedback is rare in speech enhancement research even though it includes valuable top-down information. We use the technique of mimic loss to provide phonetic feedback to an off-the-shelf enhancement system, and find gains in objective intelligibility scores on CHiME-4 data. This technique takes a frozen acoustic model trained on clean speech to provide valuable feedback to the enhancement model, even in the case where no parallel speech data is available. Our work is one of the first to show intelligibility improvement for neural enhancement systems without parallel speech data, and we show phonetic feedback can improve a state-of-the-art neural enhancement system trained with parallel speech data. |
Tasks | Speech Enhancement |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01769v1 |
https://arxiv.org/pdf/2003.01769v1.pdf | |
PWC | https://paperswithcode.com/paper/phonetic-feedback-for-speech-enhancement-with |
Repo | |
Framework | |
Vision-Dialog Navigation by Exploring Cross-modal Memory
Title | Vision-Dialog Navigation by Exploring Cross-modal Memory |
Authors | Yi Zhu, Fengda Zhu, Zhaohuan Zhan, Bingqian Lin, Jianbin Jiao, Xiaojun Chang, Xiaodan Liang |
Abstract | Vision-dialog navigation, posed as a new holy-grail task in the vision-language discipline, aims at learning an agent endowed with the capability of constantly conversing in natural language to ask for help and navigating according to human responses. Besides the common challenges faced in visual language navigation, vision-dialog navigation also requires handling the language intentions of a series of questions about the temporal context from the dialogue history and co-reasoning over both dialogs and visual scenes. In this paper, we propose the Cross-modal Memory Network (CMN) for remembering and understanding the rich information relevant to historical navigation actions. Our CMN consists of two memory modules, the language memory module (L-mem) and the visual memory module (V-mem). Specifically, L-mem learns latent relationships between the current language interaction and a dialog history by employing a multi-head attention mechanism. V-mem learns to associate the current visual views and the cross-modal memory about the previous navigation actions. The cross-modal memory is generated via a vision-to-language attention and a language-to-vision attention. Benefiting from the collaborative learning of the L-mem and the V-mem, our CMN is able to explore the memory of the decisions made in historical navigation actions that is relevant to the current step. Experiments on the CVDN dataset show that our CMN outperforms the previous state-of-the-art model by a significant margin on both seen and unseen environments. |
Tasks | Decision Making |
Published | 2020-03-15 |
URL | https://arxiv.org/abs/2003.06745v1 |
https://arxiv.org/pdf/2003.06745v1.pdf | |
PWC | https://paperswithcode.com/paper/vision-dialog-navigation-by-exploring-cross |
Repo | |
Framework | |
Characterizing Speech Adversarial Examples Using Self-Attention U-Net Enhancement
Title | Characterizing Speech Adversarial Examples Using Self-Attention U-Net Enhancement |
Authors | Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Xiaoli Ma, Chin-Hui Lee |
Abstract | Recent studies have highlighted adversarial examples as ubiquitous threats to the deep neural network (DNN) based speech recognition systems. In this work, we present a U-Net based attention model, U-Net$_{At}$, to enhance adversarial speech signals. Specifically, we evaluate the model performance by interpretable speech recognition metrics and discuss the model performance by the augmented adversarial training. Our experiments show that our proposed U-Net$_{At}$ improves the perceptual evaluation of speech quality (PESQ) from 1.13 to 2.78, speech transmission index (STI) from 0.65 to 0.75, short-term objective intelligibility (STOI) from 0.83 to 0.96 on the task of speech enhancement with adversarial speech examples. We conduct experiments on the automatic speech recognition (ASR) task with adversarial audio attacks. We find that (i) temporal features learned by the attention network are capable of enhancing the robustness of DNN based ASR models; (ii) the generalization power of DNN based ASR model could be enhanced by applying adversarial training with an additive adversarial data augmentation. The ASR metric on word-error-rates (WERs) shows that there is an absolute 2.22% decrease under gradient-based perturbation, and an absolute 2.03% decrease under evolutionary-optimized perturbation, which suggests that our enhancement models with adversarial training can further secure a resilient ASR system. |
Tasks | Data Augmentation, Speech Enhancement, Speech Recognition |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2003.13917v1 |
https://arxiv.org/pdf/2003.13917v1.pdf | |
PWC | https://paperswithcode.com/paper/characterizing-speech-adversarial-examples |
Repo | |
Framework | |
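The word-error-rate figures quoted in the abstract above are absolute changes, that is, differences between two WER values in percentage points. As a reminder of how the metric itself is usually computed, here is a minimal sketch using word-level edit distance. The function name `word_error_rate` and the whitespace tokenisation are assumptions for illustration; the paper's own evaluation pipeline is not described in this listing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

In the usage line, one substitution and one deletion against six reference words give a WER of 2/6.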
Deep learning approach for breast cancer diagnosis
Title | Deep learning approach for breast cancer diagnosis |
Authors | Essam A. Rashed, M. Samir Abou El Seoud |
Abstract | Breast cancer is one of the leading fatal diseases worldwide, with a high chance of control if discovered early. The conventional method for breast screening is x-ray mammography, which is known to be challenging for early detection of cancer lesions. The dense breast structure produced by the compression process during imaging makes it difficult to recognize small abnormalities. Also, inter- and intra-variations of breast tissues make it difficult to achieve high diagnostic accuracy using hand-crafted features. Deep learning is an emerging machine learning technology that requires relatively high computation power. Yet, it has proved to be very effective in several difficult tasks that require decision making at the level of human intelligence. In this paper, we develop a new network architecture inspired by the U-net structure that can be used for effective and early detection of breast cancer. Results indicate a high rate of sensitivity and specificity, which indicates the potential usefulness of the proposed approach in clinical use. |
Tasks | Decision Making |
Published | 2020-03-10 |
URL | https://arxiv.org/abs/2003.04480v1 |
https://arxiv.org/pdf/2003.04480v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-approach-for-breast-cancer |
Repo | |
Framework | |
Multifactorial Cellular Genetic Algorithm (MFCGA): Algorithmic Design, Performance Comparison and Genetic Transferability Analysis
Title | Multifactorial Cellular Genetic Algorithm (MFCGA): Algorithmic Design, Performance Comparison and Genetic Transferability Analysis |
Authors | Eneko Osaba, Aritz D. Martinez, Jesus L. Lobo, Javier Del Ser, Francisco Herrera |
Abstract | Multitasking optimization is an incipient research area which is lately gaining notable research momentum. Unlike the traditional optimization paradigm that focuses on solving a single task at a time, multitasking addresses how multiple optimization problems can be tackled simultaneously by performing a single search process. The main objective to achieve this goal efficiently is to exploit synergies between the problems (tasks) to be optimized, helping each other via knowledge transfer (thereby being referred to as Transfer Optimization). Furthermore, the equally recent concept of Evolutionary Multitasking (EM) refers to multitasking environments adopting concepts from Evolutionary Computation as their inspiration for the simultaneous solving of the problems under consideration. As such, EM approaches such as the Multifactorial Evolutionary Algorithm (MFEA) have shown remarkable success when dealing with multiple discrete, continuous, single-, and/or multi-objective optimization problems. In this work we propose a novel algorithmic scheme for Multifactorial Optimization scenarios - the Multifactorial Cellular Genetic Algorithm (MFCGA) - that hinges on concepts from Cellular Automata to implement mechanisms for exchanging knowledge among problems. We conduct an extensive performance analysis of the proposed MFCGA and compare it to the canonical MFEA under the same algorithmic conditions and over 15 different multitasking setups (encompassing different reference instances of the discrete Traveling Salesman Problem). A further contribution of this analysis beyond performance benchmarking is a quantitative examination of the genetic transferability among the problem instances, eliciting an empirical demonstration of the synergies that emerged between the different optimization tasks along the MFCGA search process. |
Tasks | Transfer Learning |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.10768v1 |
https://arxiv.org/pdf/2003.10768v1.pdf | |
PWC | https://paperswithcode.com/paper/multifactorial-cellular-genetic-algorithm |
Repo | |
Framework | |
Flow descriptors of human mobility networks
Title | Flow descriptors of human mobility networks |
Authors | David Pastor-Escuredo, Enrique Frias-Martinez |
Abstract | Mobile phone data has enabled the timely and fine-grained study of human mobility. Call Detail Records, generated at call events, allow building descriptions of mobility at different resolutions and with different spatial, temporal and social granularity. Individual trajectories are the basis for long-term observation of mobility patterns and for identifying factors of human dynamics. Here we propose a systematic analysis to characterize mobility network flows and topology and assess their impact on individual traces. Discrete flow-based descriptors are used to classify and understand human mobility patterns at multiple scales. This framework is suitable to assess urban planning, optimize transportation, measure the impact of external events and conditions, monitor internal dynamics and profile users according to their movement patterns. |
Tasks | Human Dynamics |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.07279v1 |
https://arxiv.org/pdf/2003.07279v1.pdf | |
PWC | https://paperswithcode.com/paper/flow-descriptors-of-human-mobility-networks |
Repo | |
Framework | |
EvoNet: A Neural Network for Predicting the Evolution of Dynamic Graphs
Title | EvoNet: A Neural Network for Predicting the Evolution of Dynamic Graphs |
Authors | Changmin Wu, Giannis Nikolentzos, Michalis Vazirgiannis |
Abstract | Neural networks for structured data like graphs have been studied extensively in recent years. To date, the bulk of research activity has focused mainly on static graphs. However, most real-world networks are dynamic since their topology tends to change over time. Predicting the evolution of dynamic graphs is a task of high significance in the area of graph mining. Despite its practical importance, the task has not been explored in depth so far, mainly due to its challenging nature. In this paper, we propose a model that predicts the evolution of dynamic graphs. Specifically, we use a graph neural network along with a recurrent architecture to capture the temporal evolution patterns of dynamic graphs. Then, we employ a generative model which predicts the topology of the graph at the next time step and constructs a graph instance that corresponds to that topology. We evaluate the proposed model on several artificial datasets following common network evolving dynamics, as well as on real-world datasets. Results demonstrate the effectiveness of the proposed model. |
Tasks | |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00842v1 |
https://arxiv.org/pdf/2003.00842v1.pdf | |
PWC | https://paperswithcode.com/paper/evonet-a-neural-network-for-predicting-the-1 |
Repo | |
Framework | |
Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures
Title | Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures |
Authors | Mohamed El Amine Seddik, Cosme Louart, Mohamed Tamaazousti, Romain Couillet |
Abstract | This paper shows that deep learning (DL) representations of data produced by generative adversarial nets (GANs) are random vectors which fall within the class of so-called \textit{concentrated} random vectors. Further exploiting the fact that Gram matrices, of the type $G = X^T X$ with $X=[x_1,\ldots,x_n]\in \mathbb{R}^{p\times n}$ and $x_i$ independent concentrated random vectors from a mixture model, behave asymptotically (as $n,p\to \infty$) as if the $x_i$ were drawn from a Gaussian mixture, suggests that DL representations of GAN-data can be fully described by their first two statistical moments for a wide range of standard classifiers. Our theoretical findings are validated by generating images with the BigGAN model and across different popular deep representation networks. |
Tasks | |
Published | 2020-01-21 |
URL | https://arxiv.org/abs/2001.08370v1 |
https://arxiv.org/pdf/2001.08370v1.pdf | |
PWC | https://paperswithcode.com/paper/random-matrix-theory-proves-that-deep-1 |
Repo | |
Framework | |
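To make the object in the abstract above concrete, the following sketch builds a data matrix $X=[x_1,\ldots,x_n]$ from a two-component Gaussian mixture and forms the Gram matrix $G = X^T X$. The function names, the unit-variance components and the example means are all assumptions for illustration. The code only constructs $G$; it does not test the paper's asymptotic claim.

```python
import random


def sample_mixture(n, means, rng):
    """Draw n vectors x_i, each a unit-variance Gaussian around a randomly chosen mean."""
    columns = []
    for _ in range(n):
        mu = rng.choice(means)
        columns.append([m + rng.gauss(0.0, 1.0) for m in mu])
    return columns  # columns[i] is x_i, so X = [x_1, ..., x_n] has shape p x n


def gram_matrix(columns):
    """G = X^T X, i.e. G[i][j] is the inner product of x_i and x_j."""
    return [
        [sum(a * b for a, b in zip(x_i, x_j)) for x_j in columns]
        for x_i in columns
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    X_cols = sample_mixture(5, means=[[2.0, 0.0, 0.0], [-2.0, 0.0, 0.0]], rng=rng)
    G = gram_matrix(X_cols)
    print(len(G), len(G[0]))  # n x n
```

With $X \in \mathbb{R}^{p\times n}$, $G$ is $n \times n$ and symmetric, and its diagonal holds the squared norms of the samples.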
Tangent-Space Gradient Optimization of Tensor Network for Machine Learning
Title | Tangent-Space Gradient Optimization of Tensor Network for Machine Learning |
Authors | Zheng-zhi Sun, Shi-ju Ran, Gang Su |
Abstract | The gradient-based optimization method for deep machine learning models suffers from gradient vanishing and exploding problems, particularly when the computational graph becomes deep. In this work, we propose the tangent-space gradient optimization (TSGO) for the probabilistic models to keep the gradients from vanishing or exploding. The central idea is to guarantee the orthogonality between the variational parameters and the gradients. The optimization is then implemented by rotating the parameter vector towards the direction of the gradient. We explain and test TSGO in tensor network (TN) machine learning, where the TN describes the joint probability distribution as a normalized state $\left| \psi \right\rangle$ in Hilbert space. We show that the gradient can be restricted to the tangent space of the $\left\langle \psi | \psi \right\rangle = 1$ hyper-sphere. Instead of additional adaptive methods to control the learning rate in deep learning, the learning rate of TSGO is naturally determined by the angle $\theta$ as $\eta = \tan \theta$. Our numerical results reveal better convergence of TSGO in comparison to the off-the-shelf Adam. |
Tasks | |
Published | 2020-01-10 |
URL | https://arxiv.org/abs/2001.04029v1 |
https://arxiv.org/pdf/2001.04029v1.pdf | |
PWC | https://paperswithcode.com/paper/tangent-space-gradient-optimization-of-tensor |
Repo | |
Framework | |
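The relation $\eta = \tan\theta$ in the abstract above has a simple geometric reading, sketched below for a plain real vector rather than a tensor network. If $\psi$ is a unit vector and $g$ is a unit vector orthogonal to it, then $\psi - \tan\theta\, g$ has norm $1/\cos\theta$, so renormalising gives $\cos\theta\,\psi - \sin\theta\, g$, which is $\psi$ rotated by exactly $\theta$. The function name `tangent_space_step` and the choice of a descent step (minus sign) are assumptions for illustration, not the authors' implementation.

```python
import math


def tangent_space_step(psi, grad, theta):
    """Rotate the unit vector psi by the angle theta against the tangent gradient.

    Assumes 0 < theta < pi/2 and a descent step (minus sign).
    """
    # Normalise psi so that <psi|psi> = 1.
    norm = math.sqrt(sum(x * x for x in psi))
    psi = [x / norm for x in psi]
    # Project the gradient onto the tangent space by removing its component along psi.
    overlap = sum(g * x for g, x in zip(grad, psi))
    tangent = [g - overlap * x for g, x in zip(grad, psi)]
    t_norm = math.sqrt(sum(t * t for t in tangent))
    if t_norm == 0.0:
        return psi  # the gradient has no tangent component, so there is nothing to rotate
    tangent = [t / t_norm for t in tangent]
    # Step with learning rate eta = tan(theta), then renormalise back onto the sphere.
    eta = math.tan(theta)
    stepped = [x - eta * t for x, t in zip(psi, tangent)]
    s_norm = math.sqrt(sum(s * s for s in stepped))
    return [s / s_norm for s in stepped]


if __name__ == "__main__":
    new_psi = tangent_space_step([1.0, 0.0, 0.0], [0.5, 2.0, 0.0], theta=0.1)
    print(new_psi, math.acos(new_psi[0]))  # the angle to the old psi is theta
```

Because the step size is fixed by the rotation angle, the update cannot shrink to zero or blow up with the raw gradient magnitude, which is the property the abstract highlights.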
Gravitational Wave Detection and Information Extraction via Neural Networks
Title | Gravitational Wave Detection and Information Extraction via Neural Networks |
Authors | Gerson R. Santos, Marcela P. Figueiredo, Antonio de Pádua Santos, Pavlos Protopapas, Tiago A. E. Ferreira |
Abstract | The Laser Interferometer Gravitational-Wave Observatory (LIGO) was the first laboratory to measure gravitational waves. An exceptional experimental design was needed to measure distance changes much smaller than the radius of a proton. In the same way, the data analysis to confirm and extract information is a tremendously hard task. Here, we show a computational procedure based on artificial neural networks to detect a gravitational wave event and extract the knowledge of its ring-down time from the LIGO data. With this proposal, it is possible to make a probabilistic thermometer for gravitational wave detection and obtain physical information about the astronomical body system that created the phenomenon. Here, the ring-down time is determined with a direct data measurement, without the need to use numerical relativity techniques and high computational power. |
Tasks | Gravitational Wave Detection |
Published | 2020-03-22 |
URL | https://arxiv.org/abs/2003.09995v1 |
https://arxiv.org/pdf/2003.09995v1.pdf | |
PWC | https://paperswithcode.com/paper/gravitational-wave-detection-and-information |
Repo | |
Framework | |
Using Single-Step Adversarial Training to Defend Iterative Adversarial Examples
Title | Using Single-Step Adversarial Training to Defend Iterative Adversarial Examples |
Authors | Guanxiong Liu, Issa Khalil, Abdallah Khreishah |
Abstract | Adversarial examples have become one of the largest challenges that machine learning models, especially neural network classifiers, face. These adversarial examples break the assumption of an attack-free scenario and fool state-of-the-art (SOTA) classifiers with perturbations that are insignificant to humans. So far, researchers have achieved great progress in utilizing adversarial training as a defense. However, the overwhelming computational cost degrades its applicability and little has been done to overcome this issue. Single-step adversarial training methods have been proposed as computationally viable solutions; however, they still fail to defend against iterative adversarial examples. In this work, we first experimentally analyze several different SOTA defense methods against adversarial examples. Then, based on observations from experiments, we propose a novel single-step adversarial training method which can defend against both single-step and iterative adversarial examples. Lastly, through extensive evaluations, we demonstrate that our proposed method outperforms the SOTA single-step and iterative adversarial training defense. Compared with ATDA (single-step method) on the CIFAR10 dataset, our proposed method achieves 35.67% enhancement in test accuracy and 19.14% reduction in training time. When compared with methods that use BIM or Madry examples (iterative methods) on the CIFAR10 dataset, it saves up to 76.03% in training time with less than 3.78% degeneration in test accuracy. |
Tasks | |
Published | 2020-02-22 |
URL | https://arxiv.org/abs/2002.09632v2 |
https://arxiv.org/pdf/2002.09632v2.pdf | |
PWC | https://paperswithcode.com/paper/using-single-step-adversarial-training-to |
Repo | |
Framework | |
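The distinction the abstract above draws between single-step and iterative adversarial examples can be sketched on a toy logistic-regression classifier. The single-step attack shown is the widely used fast gradient sign method (FGSM), chosen here as an illustration; the iterative one follows the basic iterative method (BIM) named in the abstract. The model, function names and parameter values are assumptions, and this is not the training method the paper proposes.

```python
import math


def loss_gradient_wrt_input(w, b, x, y):
    """Gradient of the logistic loss with respect to the input x, for a label y in {0, 1}."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """Single-step attack: one signed-gradient step of size eps."""
    g = loss_gradient_wrt_input(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]


def bim(w, b, x, y, eps, alpha, steps):
    """Iterative attack: repeated signed steps of size alpha, clipped to the eps-ball around x."""
    adv = list(x)
    for _ in range(steps):
        g = loss_gradient_wrt_input(w, b, adv, y)
        adv = [ai + alpha * sign(gi) for ai, gi in zip(adv, g)]
        adv = [min(max(ai, xi - eps), xi + eps) for ai, xi in zip(adv, x)]
    return adv


if __name__ == "__main__":
    w, b, x, y = [1.0, -2.0], 0.0, [0.5, 0.5], 1
    print(fgsm(w, b, x, y, eps=0.1))
    print(bim(w, b, x, y, eps=0.1, alpha=0.05, steps=4))
```

On a linear model the two attacks reach the same corner of the eps-ball; on a deep network the iterative attack re-evaluates the gradient at each step and typically finds stronger perturbations, at several times the cost. That cost gap is what makes single-step adversarial training attractive.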
Time-Frequency Analysis based Blind Modulation Classification for Multiple-Antenna Systems
Title | Time-Frequency Analysis based Blind Modulation Classification for Multiple-Antenna Systems |
Authors | Weiheng Jiang, Xiaogang Wu, Bolin Chen, Wenjiang Feng, Yi Jin |
Abstract | Blind modulation classification is an important step to implement cognitive radio networks. The multiple-input multiple-output (MIMO) technique is widely used in military and civil communication systems. Due to the lack of prior information about channel parameters and the overlapping of signals in the MIMO systems, the traditional likelihood-based and feature-based approaches cannot be applied in these scenarios directly. Hence, in this paper, to resolve the problem of blind modulation classification in MIMO systems, the time-frequency analysis method based on the windowed short-time Fourier transform is used to analyse the time-frequency characteristics of time-domain modulated signals. Then the extracted time-frequency characteristics are converted into RGB spectrogram images, and the convolutional neural network based on transfer learning is applied to classify the modulation types according to the RGB spectrogram images. Finally, a decision fusion module is used to fuse the classification results of all the receive antennas. Through simulations, we analyse the classification performance at different signal-to-noise ratios (SNRs). The results indicate that, for the single-input single-output (SISO) network, our proposed scheme can achieve 92.37% and 99.12% average classification accuracy at SNRs of -4 dB and 10 dB, respectively. For the MIMO network, our scheme achieves 80.42% and 87.92% average classification accuracy at -4 dB and 10 dB, respectively. This outperforms the existing classification methods based on baseband signals. |
Tasks | Transfer Learning |
Published | 2020-04-01 |
URL | https://arxiv.org/abs/2004.00378v1 |
https://arxiv.org/pdf/2004.00378v1.pdf | |
PWC | https://paperswithcode.com/paper/time-frequency-analysis-based-blind |
Repo | |
Framework | |
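Two stages of the pipeline in the abstract above, the windowed short-time Fourier transform and the fusion of per-antenna decisions, can be sketched in a few lines. The sketch makes several assumptions for illustration: a real-valued input signal, a Hann window, a direct DFT instead of an FFT, and a majority vote as the fusion rule (the abstract does not say which rule the authors use). The spectrogram-to-RGB conversion and the CNN classifier are omitted.

```python
import cmath
import math
from collections import Counter


def stft_magnitude(signal, window_len, hop):
    """Magnitude of a Hann-windowed short-time Fourier transform (one row per frame)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_len) for n in range(window_len)]
    frames = []
    for start in range(0, len(signal) - window_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(window_len)]
        # Direct DFT of the windowed segment, keeping the non-negative frequency bins.
        row = []
        for k in range(window_len // 2 + 1):
            acc = sum(
                seg[n] * cmath.exp(-2j * math.pi * k * n / window_len)
                for n in range(window_len)
            )
            row.append(abs(acc))
        frames.append(row)
    return frames


def fuse_decisions(per_antenna_labels):
    """Majority vote over the labels predicted for each receive antenna."""
    return Counter(per_antenna_labels).most_common(1)[0][0]


if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(64)]
    spec = stft_magnitude(tone, window_len=32, hop=16)
    print(len(spec), len(spec[0]))
    print(fuse_decisions(["QPSK", "QPSK", "16QAM"]))
```

Each row of the returned list is one time frame and each column one frequency bin, which is the array that would be colour-mapped into a spectrogram image before classification.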
Diagnosing COVID-19 Pneumonia from X-Ray and CT Images using Deep Learning and Transfer Learning Algorithms
Title | Diagnosing COVID-19 Pneumonia from X-Ray and CT Images using Deep Learning and Transfer Learning Algorithms |
Authors | Halgurd S. Maghdid, Aras T. Asaad, Kayhan Zrar Ghafoor, Ali Safaa Sadiq, Muhammad Khurram Khan |
Abstract | COVID-19 (also known as 2019 Novel Coronavirus) first emerged in Wuhan, China and spread across the globe with unprecedented effect and has now become the greatest crisis of the modern era. COVID-19 has created much more pervasive demands for diagnosis, which have driven researchers to develop more intelligent, highly responsive and efficient detection methods. In this work, we focus on proposing AI tools that can be used by radiologists or healthcare professionals to diagnose COVID-19 cases in a quick and accurate manner. However, the lack of a publicly available dataset of X-ray and CT images makes the design of such AI tools a challenging task. To this end, this study aims to build a comprehensive dataset of X-ray and CT scan images from multiple sources as well as to provide a simple but effective COVID-19 detection technique using deep learning and transfer learning algorithms. In this vein, a simple convolutional neural network (CNN) and a modified pre-trained AlexNet model are applied to the prepared dataset of X-ray and CT scan images. The results of the experiments show that the utilized models can provide accuracy up to 98% via the pre-trained network and 94.1% accuracy by using the modified CNN. |
Tasks | COVID-19 Detection, Transfer Learning |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2004.00038v1 |
https://arxiv.org/pdf/2004.00038v1.pdf | |
PWC | https://paperswithcode.com/paper/diagnosing-covid-19-pneumonia-from-x-ray-and |
Repo | |
Framework | |
Transfer Learning of Photometric Phenotypes in Agriculture Using Metadata
Title | Transfer Learning of Photometric Phenotypes in Agriculture Using Metadata |
Authors | Dan Halbersberg, Aharon Bar Hillel, Shon Mendelson, Daniel Koster, Lena Karol, Boaz Lerner |
Abstract | Estimation of photometric plant phenotypes (e.g., hue, shine, chroma) in field conditions is important for decisions on the expected yield quality, fruit ripeness, and need for further breeding. Estimating these from images is difficult due to large variances in lighting conditions, shadows, and sensor properties. We combine the image and metadata regarding capturing conditions embedded into a network, enabling more accurate estimation and transfer between different conditions. Compared to a state-of-the-art deep CNN and a human expert, metadata embedding improves the estimation of the tomato’s hue and chroma. |
Tasks | Transfer Learning |
Published | 2020-04-01 |
URL | https://arxiv.org/abs/2004.00303v1 |
https://arxiv.org/pdf/2004.00303v1.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-of-photometric-phenotypes |
Repo | |
Framework | |