April 1, 2020

Paper Group ANR 470

Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning

Title Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning
Authors Zixin Wen
Abstract Unsupervised contrastive learning has gained increasing attention in the latest research and has proven to be a powerful method for learning representations from unlabeled data. However, little theoretical analysis is available for this framework. In this paper, we study the optimization of deep unsupervised contrastive learning. We prove that, by applying end-to-end training that simultaneously updates two deep over-parameterized neural networks, one can find an approximate stationary solution for the non-convex contrastive loss. This result is inherently different from the existing over-parameterized analysis in the supervised setting because, in contrast to learning a specific target function, unsupervised contrastive learning tries to encode the unlabeled data distribution into the neural networks, which generally has no optimal solution. Our analysis provides theoretical insights into the practical success of these unsupervised pretraining methods.
Tasks
Published 2020-02-17
URL https://arxiv.org/abs/2002.06979v2
PDF https://arxiv.org/pdf/2002.06979v2.pdf
PWC https://paperswithcode.com/paper/convergence-of-end-to-end-training-in-deep
Repo
Framework
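
The abstract does not state the exact form of the contrastive loss. As a rough illustration only, the sketch below uses an InfoNCE-style loss, which is a common choice in contrastive learning and is an assumption here, not necessarily the loss analyzed in the paper. The function name `info_nce_loss`, the `temperature` parameter, and the similarity scores are all hypothetical.

```python
import math


def info_nce_loss(similarities, positive_index, temperature=1.0):
    """InfoNCE-style loss for one anchor (illustrative assumption).

    similarities: similarity scores between the anchor and each candidate.
    positive_index: position of the matching (positive) candidate.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in similarities]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    log_partition = m + math.log(sum(math.exp(s - m) for s in scaled))
    # Negative log-probability of picking the positive candidate.
    return log_partition - scaled[positive_index]


# Hypothetical scores: four candidates, the first one is the positive.
uniform = info_nce_loss([0.0, 0.0, 0.0, 0.0], positive_index=0)
confident = info_nce_loss([5.0, 0.0, 0.0, 0.0], positive_index=0)
```

When every candidate scores the same, the loss equals the logarithm of the number of candidates; it shrinks as the positive score pulls ahead of the negatives.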

End-to-End Multi-speaker Speech Recognition with Transformer

Title End-to-End Multi-speaker Speech Recognition with Transformer
Authors Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe
Abstract Recently, fully recurrent neural network (RNN) based end-to-end models have been proven to be effective for multi-speaker speech recognition in both the single-channel and multi-channel scenarios. In this work, we explore the use of Transformer models for these tasks by focusing on two aspects. First, we replace the RNN-based encoder-decoder in the speech recognition model with a Transformer architecture. Second, in order to use the Transformer in the masking network of the neural beamformer in the multi-channel case, we modify the self-attention component to be restricted to a segment rather than the whole sequence in order to reduce computation. Besides the model architecture improvements, we also incorporate an external dereverberation preprocessing, the weighted prediction error (WPE), enabling our model to handle reverberated signals. Experiments on the spatialized wsj1-2mix corpus show that the Transformer-based models achieve 40.9% and 25.6% relative WER reduction, down to 12.1% and 6.4% WER, under the anechoic condition in single-channel and multi-channel tasks, respectively, while in the reverberant case, our methods achieve 41.5% and 13.8% relative WER reduction, down to 16.5% and 15.2% WER.
Tasks Speech Recognition
Published 2020-02-10
URL https://arxiv.org/abs/2002.03921v2
PDF https://arxiv.org/pdf/2002.03921v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-multi-speaker-speech-recognition-1
Repo
Framework
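
The abstract says self-attention is restricted to a segment rather than the whole sequence, but does not say how the segment is defined. A minimal sketch of one possible reading follows, assuming each frame attends only to frames within a fixed distance of itself. The names `segment_attention_mask` and `half_width` are hypothetical and do not come from the paper.

```python
def segment_attention_mask(seq_len, half_width):
    """Return a seq_len x seq_len boolean mask.

    mask[i][j] is True when query frame i may attend to key frame j,
    i.e. when j lies within half_width frames of i (an assumed segment shape).
    """
    if seq_len < 0 or half_width < 0:
        raise ValueError("seq_len and half_width must be non-negative")
    return [
        [abs(i - j) <= half_width for j in range(seq_len)]
        for i in range(seq_len)
    ]


def attended_pairs(mask):
    """Count the (query, key) pairs that the mask allows."""
    return sum(sum(row) for row in mask)


mask = segment_attention_mask(seq_len=6, half_width=1)
restricted = attended_pairs(mask)  # pairs kept by the segment restriction
full = 6 * 6                       # pairs in unrestricted self-attention
```

With a fixed window, the number of allowed pairs grows linearly with sequence length instead of quadratically, which is the kind of saving the abstract refers to.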

Damage-sensitive and domain-invariant feature extraction for vehicle-vibration-based bridge health monitoring

Title Damage-sensitive and domain-invariant feature extraction for vehicle-vibration-based bridge health monitoring
Authors Jingxiao Liu, Bingqing Chen, Siheng Chen, Mario Berges, Jacobo Bielak, HaeYoung Noh
Abstract We introduce a physics-guided signal processing approach to extract a damage-sensitive and domain-invariant (DS & DI) feature from acceleration response data of a vehicle traveling over a bridge to assess bridge health. Motivated by the benefits of indirect sensing methods, such as low cost and low maintenance, vehicle-vibration-based bridge health monitoring has been studied to efficiently monitor bridges in real time. Yet applying this approach is challenging because 1) physics-based features extracted manually are generally not damage-sensitive, and 2) features from machine learning techniques are often not applicable to different bridges. Thus, we formulate a vehicle bridge interaction system model and find a physics-guided DS & DI feature, which can be extracted using the synchrosqueezed wavelet transform representing non-stationary signals as intrinsic-mode-type components. We validate the effectiveness of the proposed feature with simulated experiments. Compared to conventional time- and frequency-domain features, our feature provides the best damage quantification and localization results across different bridges in five of six experiments.
Tasks
Published 2020-02-06
URL https://arxiv.org/abs/2002.02105v1
PDF https://arxiv.org/pdf/2002.02105v1.pdf
PWC https://paperswithcode.com/paper/damage-sensitive-and-domain-invariant-feature
Repo
Framework

Real-Time Optimal Guidance and Control for Interplanetary Transfers Using Deep Networks

Title Real-Time Optimal Guidance and Control for Interplanetary Transfers Using Deep Networks
Authors Dario Izzo, Ekin Öztürk
Abstract We consider the Earth-Venus mass-optimal interplanetary transfer of a low-thrust spacecraft and show how the optimal guidance can be represented by deep networks in a large portion of the state space and to a high degree of accuracy. Imitation (supervised) learning of optimal examples is used as a network training paradigm. The resulting models are suitable for an on-board, real-time implementation of the optimal guidance and control system of the spacecraft and are called G&CNETs. A new general methodology called Backward Generation of Optimal Examples is introduced and shown to be able to efficiently create all the optimal state-action pairs necessary to train G&CNETs without solving optimal control problems. Compared with previous work, we are able to produce datasets containing a few orders of magnitude more optimal trajectories and obtain network performance compatible with real mission requirements. Several schemes able to train representations of either the optimal policy (thrust profile) or the value function (optimal mass) are proposed and tested. We find that both policy learning and value function learning successfully and accurately learn the optimal thrust and that a spacecraft employing the learned thrust is able to reach the target orbit conditions spending only 2 per mille more propellant than in the corresponding mathematically optimal transfer. Moreover, the optimal propellant mass can be predicted (in the case of value function learning) with an error well within 1%. All G&CNETs produced are tested during simulations of interplanetary transfers with respect to their ability to reach the target conditions optimally starting from nominal and off-nominal conditions.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.09063v1
PDF https://arxiv.org/pdf/2002.09063v1.pdf
PWC https://paperswithcode.com/paper/real-time-optimal-guidance-and-control-for
Repo
Framework

Pre-processing Image using Brightening, CLAHE and RETINEX

Title Pre-processing Image using Brightening, CLAHE and RETINEX
Authors Thi Phuoc Hanh Nguyen, Zinan Cai, Khanh Nguyen, Sokuntheariddh Keth, Ningyuan Shen, Mira Park
Abstract This paper seeks the best pre-processing method among three common image enhancement algorithms: Brightening, CLAHE and Retinex. With image training in general as the intended use, the methods are combined to find the most effective enhancement pipeline. We study the different permutations of the three methods. The evaluation is based on Canny edge detection applied to all processed images; the sharpness of objects is then judged by comparing the number of true positive pixels between images. Across the different combinations of pre-processing functions, CLAHE proves to be the most effective at improving edges, Brightening shows little effect on edge enhancement, and Retinex even reduces the sharpness of images and contributes little to image enhancement.
Tasks Edge Detection, Image Enhancement
Published 2020-03-22
URL https://arxiv.org/abs/2003.10822v1
PDF https://arxiv.org/pdf/2003.10822v1.pdf
PWC https://paperswithcode.com/paper/pre-processing-image-using-brightening-clahe
Repo
Framework

No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting with Adversarial Attacks

Title No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting with Adversarial Attacks
Authors Siqi Liu, Arnaud Arindra Adiyoso Setio, Florin C. Ghesu, Eli Gibson, Sasa Grbic, Bogdan Georgescu, Dorin Comaniciu
Abstract Detecting malignant pulmonary nodules at an early stage can allow medical interventions that increase the survival rate of lung cancer patients. Using computer vision techniques to detect nodules can improve the sensitivity and the speed of interpreting chest CT for lung cancer screening. Many studies have used CNNs to detect nodule candidates. Though such approaches have been shown to outperform conventional image-processing-based methods in detection accuracy, CNNs are also known to generalize poorly to samples under-represented in the training set and to be vulnerable to imperceptible noise perturbations. Such limitations cannot be easily addressed by scaling up the dataset or the models. In this work, we propose to add adversarial synthetic nodules and adversarial attack samples to the training data to improve the generalization and the robustness of lung nodule detection systems. In order to generate hard examples of nodules from a differentiable nodule synthesizer, we use projected gradient descent (PGD) to search the latent code within a bounded neighbourhood for nodules that decrease the detector response. To make the network more robust to unanticipated noise perturbations, we use PGD to search for noise patterns that can trigger the network to make over-confident mistakes. By evaluating on two different benchmark datasets containing consensus annotations from three radiologists, we show that the proposed techniques can improve the detection performance on real CT data. To understand the limitations of both the conventional networks and the proposed augmented networks, we also perform stress tests on the false positive reduction networks by feeding them different types of artificially produced patches. We show that the augmented networks are more robust to under-represented nodules and more resistant to noise perturbations.
Tasks Adversarial Attack, Lung Nodule Detection
Published 2020-03-08
URL https://arxiv.org/abs/2003.03824v1
PDF https://arxiv.org/pdf/2003.03824v1.pdf
PWC https://paperswithcode.com/paper/no-surprises-training-robust-lung-nodule
Repo
Framework
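
The projected gradient descent (PGD) search mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the paper's setup: it assumes a hypothetical linear detector score, whose gradient is simply the weight vector, and an L-infinity ball of radius `eps` as the bounded neighbourhood. It perturbs the input directly rather than a synthesizer's latent code, and all names and values are made up for illustration.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def score(w, b, x):
    """Toy linear detector response (an assumption, not a real detector)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def pgd_decrease_score(w, x0, eps, step, n_steps):
    """Lower the linear score while staying within eps of x0 (L-infinity)."""
    x = list(x0)
    for _ in range(n_steps):
        grad = w  # gradient of a linear score with respect to x
        # Signed gradient step in the direction that lowers the score.
        x = [xi - step * sign(gi) for xi, gi in zip(x, grad)]
        # Projection: clip each coordinate back into [x0 - eps, x0 + eps].
        x = [min(max(xi, x0i - eps), x0i + eps) for xi, x0i in zip(x, x0)]
    return x


w, b = [1.0, -2.0, 0.0], 0.5
x0 = [0.0, 0.0, 0.0]
x_adv = pgd_decrease_score(w, x0, eps=0.1, step=0.05, n_steps=5)
```

The projection step is what keeps the search inside the bounded neighbourhood; without it, the score could be lowered indefinitely by moving arbitrarily far from the starting point.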

Double Backpropagation for Training Autoencoders against Adversarial Attack

Title Double Backpropagation for Training Autoencoders against Adversarial Attack
Authors Chengjin Sun, Sizhe Chen, Xiaolin Huang
Abstract Deep learning, as is widely known, is vulnerable to adversarial samples. This paper focuses on adversarial attacks on autoencoders. The safety of autoencoders (AEs) is important because they are widely used as a compression scheme for data storage and transmission. However, current autoencoders are easily attacked: a slight modification of the input can produce totally different codes. The vulnerability is rooted in the sensitivity of the autoencoders, and to enhance robustness we propose to adopt double backpropagation (DBP) to secure autoencoders such as VAE and DRAW. We restrict the gradient from the reconstruction image to the original one so that the autoencoder is not sensitive to the trivial perturbations produced by an adversarial attack. After smoothing the gradient with DBP, we further smooth the label with a Gaussian Mixture Model (GMM), aiming for accurate and robust classification. We demonstrate on MNIST, CelebA, and SVHN that our method leads to a robust autoencoder resistant to attack and, if combined with GMM, a robust classifier capable of image transition and immune to adversarial attack.
Tasks Adversarial Attack
Published 2020-03-04
URL https://arxiv.org/abs/2003.01895v1
PDF https://arxiv.org/pdf/2003.01895v1.pdf
PWC https://paperswithcode.com/paper/double-backpropagation-for-training
Repo
Framework

Real-time, Universal, and Robust Adversarial Attacks Against Speaker Recognition Systems

Title Real-time, Universal, and Robust Adversarial Attacks Against Speaker Recognition Systems
Authors Yi Xie, Cong Shi, Zhuohang Li, Jian Liu, Yingying Chen, Bo Yuan
Abstract As the popularity of voice user interfaces (VUIs) has exploded in recent years, speaker recognition systems have emerged as an important means of identifying a speaker in many security-critical applications and services. In this paper, we propose the first real-time, universal, and robust adversarial attack against state-of-the-art deep neural network (DNN) based speaker recognition systems. By adding an audio-agnostic universal perturbation to an arbitrary enrolled speaker’s voice input, the attack makes the DNN-based speaker recognition system identify the speaker as any target (i.e., adversary-desired) speaker label. In addition, we improve the robustness of our attack by modeling the sound distortions caused by physical over-the-air propagation through estimating the room impulse response (RIR). Experiments using a public dataset of 109 English speakers demonstrate the effectiveness and robustness of our proposed attack, with a high attack success rate of over 90%. The attack launching time also achieves a 100X speedup over contemporary non-universal attacks.
Tasks Adversarial Attack, Speaker Recognition
Published 2020-03-04
URL https://arxiv.org/abs/2003.02301v1
PDF https://arxiv.org/pdf/2003.02301v1.pdf
PWC https://paperswithcode.com/paper/real-time-universal-and-robust-adversarial
Repo
Framework
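
A common way to model the over-the-air distortion mentioned in the abstract is to convolve a signal with a room impulse response (RIR). The abstract does not give the authors' exact procedure, so the following is only a minimal sketch of that general idea. The function name `convolve` and the sample values are hypothetical.

```python
def convolve(signal, impulse_response):
    """Full discrete convolution of two finite sequences."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Hypothetical values: a short perturbation and a two-tap impulse response
# (direct path plus one echo at half amplitude, one sample later).
perturbation = [1.0, 2.0, 3.0]
rir = [1.0, 0.5]
received = convolve(perturbation, rir)
```

Simulating the received signal this way during optimization lets a perturbation be tuned against the distorted version rather than the clean one.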

Temporal Sparse Adversarial Attack on Gait Recognition

Title Temporal Sparse Adversarial Attack on Gait Recognition
Authors Ziwen He, Wei Wang, Jing Dong, Tieniu Tan
Abstract Gait recognition has broad application in social security due to its advantages in long-distance human identification. Despite the high accuracy of gait recognition systems, their adversarial robustness has not been explored. In this paper, we demonstrate that the state-of-the-art gait recognition model is vulnerable to adversarial attacks. A novel temporal sparse adversarial attack under a newly defined distortion measurement is proposed. A GAN-based architecture is employed to semantically generate high-quality adversarial gait silhouettes. By sparsely substituting or inserting a few adversarial gait silhouettes, our proposed method can achieve a high attack success rate. The imperceptibility and the attack success rate of the adversarial examples are well balanced. Experimental results show that even when only one-fortieth of the frames are attacked, the attack success rate still reaches 76.8%.
Tasks Adversarial Attack, Gait Recognition
Published 2020-02-22
URL https://arxiv.org/abs/2002.09674v1
PDF https://arxiv.org/pdf/2002.09674v1.pdf
PWC https://paperswithcode.com/paper/temporal-sparse-adversarial-attack-on-gait
Repo
Framework

Designing Fair AI for Managing Employees in Organizations: A Review, Critique, and Design Agenda

Title Designing Fair AI for Managing Employees in Organizations: A Review, Critique, and Design Agenda
Authors Lionel P. Robert, Casey Pierce, Liz Morris, Sangmi Kim, Rasha Alahmad
Abstract Organizations are rapidly deploying artificial intelligence (AI) systems to manage their workers. However, AI has been found at times to be unfair to workers. Unfairness toward workers has been associated with decreased worker effort and increased worker turnover. To avoid such problems, AI systems must be designed to support fairness and redress instances of unfairness. Despite the attention related to AI unfairness, there has not been a theoretical and systematic approach to developing a design agenda. This paper addresses the issue in three ways. First, we introduce the organizational justice theory, three different fairness types (distributive, procedural, interactional), and the frameworks for redressing instances of unfairness (retributive justice, restorative justice). Second, we review the design literature that specifically focuses on issues of AI fairness in organizations. Third, we propose a design agenda for AI fairness in organizations that applies each of the fairness types to organizational scenarios. Then, the paper concludes with implications for future research.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.09054v1
PDF https://arxiv.org/pdf/2002.09054v1.pdf
PWC https://paperswithcode.com/paper/designing-fair-ai-for-managing-employees-in
Repo
Framework

An Elementary Approach to Convergence Guarantees of Optimization Algorithms for Deep Networks

Title An Elementary Approach to Convergence Guarantees of Optimization Algorithms for Deep Networks
Authors Vincent Roulet, Zaid Harchaoui
Abstract We present an approach to obtain convergence guarantees of optimization algorithms for deep networks based on elementary arguments and computations. The convergence analysis revolves around the analytical and computational structures of optimization oracles central to the implementation of deep networks in machine learning software. We provide a systematic way to compute estimates of the smoothness constants that govern the convergence behavior of first-order optimization algorithms used to train deep networks. A diverse set of example components and architectures arising in modern deep networks intersperse the exposition to illustrate the approach.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.09051v1
PDF https://arxiv.org/pdf/2002.09051v1.pdf
PWC https://paperswithcode.com/paper/an-elementary-approach-to-convergence
Repo
Framework

Entity Profiling in Knowledge Graphs

Title Entity Profiling in Knowledge Graphs
Authors Xiang Zhang, Qingqing Yang, Jinru Ding, Ziyue Wang
Abstract Knowledge Graphs (KGs) are graph-structured knowledge bases storing factual information about real-world entities. Understanding the uniqueness of each entity is crucial to the analyzing, sharing, and reusing of KGs. Traditional profiling technologies encompass a vast array of methods to find distinctive features in various applications, which can help to differentiate entities in the process of human understanding of KGs. In this work, we present a novel profiling approach to identify distinctive entity features. The distinctiveness of features is carefully measured by a HAS model, which is a scalable representation learning model to produce a multi-pattern entity embedding. We fully evaluate the quality of entity profiles generated from real KGs. The results show that our approach facilitates human understanding of entities in KGs.
Tasks Knowledge Graphs, Representation Learning
Published 2020-02-29
URL https://arxiv.org/abs/2003.00172v1
PDF https://arxiv.org/pdf/2003.00172v1.pdf
PWC https://paperswithcode.com/paper/entity-profiling-in-knowledge-graphs
Repo
Framework

Learning Representations by Predicting Bags of Visual Words

Title Learning Representations by Predicting Bags of Visual Words
Authors Spyros Gidaris, Andrei Bursuc, Nikos Komodakis, Patrick Pérez, Matthieu Cord
Abstract Self-supervised representation learning aims to learn convnet-based image representations from unlabeled data. Inspired by the success of NLP methods in this area, in this work we propose a self-supervised approach based on spatially dense image descriptions that encode discrete visual concepts, here called visual words. To build such discrete representations, we quantize the feature maps of a first pre-trained self-supervised convnet, over a k-means based vocabulary. Then, as a self-supervised task, we train another convnet to predict the histogram of visual words of an image (i.e., its Bag-of-Words representation) given as input a perturbed version of that image. The proposed task forces the convnet to learn perturbation-invariant and context-aware image features, useful for downstream image understanding tasks. We extensively evaluate our method and demonstrate very strong empirical results, e.g., our pre-trained self-supervised representations transfer better on the detection task and similarly on classification over classes “unseen” during pre-training, when compared to the supervised case. This also shows that the process of image discretization into visual words can provide the basis for very powerful self-supervised approaches in the image domain, thus allowing further connections to be made to related methods from the NLP domain that have been extremely successful so far.
Tasks Representation Learning
Published 2020-02-27
URL https://arxiv.org/abs/2002.12247v1
PDF https://arxiv.org/pdf/2002.12247v1.pdf
PWC https://paperswithcode.com/paper/learning-representations-by-predicting-bags
Repo
Framework
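
The quantization step described in the abstract, assigning each local feature to its nearest vocabulary entry and building a histogram, can be sketched as follows. This is a simplified illustration under stated assumptions: the vocabulary is taken as given (in the abstract it comes from k-means), distances are squared Euclidean, and the function names and toy values are hypothetical.

```python
def nearest_word(feature, vocabulary):
    """Index of the vocabulary centroid closest to the feature vector."""
    best_index, best_dist = 0, float("inf")
    for index, centroid in enumerate(vocabulary):
        dist = sum((f - c) ** 2 for f, c in zip(feature, centroid))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(features, vocabulary):
    """Normalized histogram of visual words over all local features."""
    counts = [0] * len(vocabulary)
    for feature in features:
        counts[nearest_word(feature, vocabulary)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Toy 2-D example: three visual words and four local features.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
features = [[0.1, 0.0], [0.9, 1.2], [1.1, 0.8], [4.0, 6.0]]
histogram = bow_histogram(features, vocabulary)
```

In the approach the abstract describes, a histogram of this kind computed from the original image serves as the prediction target for a second network that sees only a perturbed version of that image.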

A Road Map to Strong Intelligence

Title A Road Map to Strong Intelligence
Authors Philip Paquette
Abstract I wrote this paper because technology can really improve people’s lives. With it, we can live longer in a healthy body, save time through increased efficiency and automation, and make better decisions. To get to the next level, we need to start looking at intelligence from a much broader perspective, and promote international interdisciplinary collaborations. Section 1 of this paper delves into sociology and social psychology to explain that the mechanisms underlying intelligence are inherently social. Section 2 proposes a method to classify intelligence, and describes the differences between weak and strong intelligence. Section 3 examines the Chinese Room argument from a different perspective. It demonstrates that a Turing-complete machine cannot have strong intelligence, and considers the modifications necessary for a computer to be intelligent and have understanding. Section 4 argues that the existential risk caused by the technological explosion of a single agent should not be of serious concern. Section 5 looks at the AI control problem and argues that it is impossible to build a super-intelligent machine that will do what its creators want. By using insights from biology, it also proposes a solution to the control problem. Section 6 discusses some of the implications of strong intelligence. Section 7 lists the main challenges with deep learning, and asserts that radical changes will be required to reach strong intelligence. Section 8 examines a neuroscience framework that could help explain how a cortical column works. Section 9 lays out the broad strokes of a road map towards strong intelligence. Finally, Section 10 analyzes the impacts and the challenges of greater intelligence.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.09044v1
PDF https://arxiv.org/pdf/2002.09044v1.pdf
PWC https://paperswithcode.com/paper/a-road-map-to-strong-intelligence
Repo
Framework

Questioning the AI: Informing Design Practices for Explainable AI User Experiences

Title Questioning the AI: Informing Design Practices for Explainable AI User Experiences
Authors Q. Vera Liao, Daniel Gruen, Sarah Miller
Abstract A surge of interest in explainable AI (XAI) has led to a vast collection of algorithmic work on the topic. While many recognize the necessity to incorporate explainability features in AI systems, how to address real-world user needs for understanding AI remains an open question. By interviewing 20 UX and design practitioners working on various AI products, we seek to identify gaps between the current XAI algorithmic work and practices to create explainable AI products. To do so, we develop an algorithm-informed XAI question bank in which user needs for explainability are represented as prototypical questions users might ask about the AI, and use it as a study probe. Our work contributes insights into the design space of XAI, informs efforts to support design practices in this space, and identifies opportunities for future XAI work. We also provide an extended XAI question bank and discuss how it can be used for creating user-centered XAI.
Tasks
Published 2020-01-08
URL https://arxiv.org/abs/2001.02478v2
PDF https://arxiv.org/pdf/2001.02478v2.pdf
PWC https://paperswithcode.com/paper/questioning-the-ai-informing-design-practices
Repo
Framework