Paper Group ANR 470
Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning. End-to-End Multi-speaker Speech Recognition with Transformer. Damage-sensitive and domain-invariant feature extraction for vehicle-vibration-based bridge health monitoring. Real-Time Optimal Guidance and Control for Interplanetary Transfers Using Deep Networks. Pre-processing Image using Brightening, CLAHE and RETINEX …
Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning
Title | Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning |
Authors | Zixin Wen |
Abstract | Unsupervised contrastive learning has gained increasing attention in the latest research and has proven to be a powerful method for learning representations from unlabeled data. However, little theoretical analysis is known for this framework. In this paper, we study the optimization of deep unsupervised contrastive learning. We prove that, by applying end-to-end training that simultaneously updates two deep over-parameterized neural networks, one can find an approximate stationary solution for the non-convex contrastive loss. This result is inherently different from the existing over-parameterized analysis in the supervised setting because, in contrast to learning a specific target function, unsupervised contrastive learning tries to encode the unlabeled data distribution into the neural networks, which generally has no optimal solution. Our analysis provides theoretical insights into the practical success of these unsupervised pretraining methods. |
Tasks | |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.06979v2 |
https://arxiv.org/pdf/2002.06979v2.pdf | |
PWC | https://paperswithcode.com/paper/convergence-of-end-to-end-training-in-deep |
Repo | |
Framework | |
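A minimal sketch of the end-to-end training setup the abstract describes: two over-parameterized encoders updated simultaneously on a non-convex contrastive loss. The InfoNCE-style loss, architectures, and dimensions below are illustrative assumptions, not the paper's exact construction.

```python
# Minimal sketch: two encoders updated jointly on an InfoNCE-style contrastive loss.
# Architectures, dimensions, and the specific loss are illustrative assumptions,
# not the exact setup analyzed in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, dim_in=128, dim_hidden=2048, dim_out=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_in, dim_hidden), nn.ReLU(),
            nn.Linear(dim_hidden, dim_out),
        )
    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

f, g = Encoder(), Encoder()
opt = torch.optim.SGD(list(f.parameters()) + list(g.parameters()), lr=0.1)

def info_nce(z1, z2, tau=0.5):
    # Positive pairs sit on the diagonal of the similarity matrix.
    logits = z1 @ z2.t() / tau
    labels = torch.arange(z1.size(0))
    return F.cross_entropy(logits, labels)

x = torch.randn(256, 128)                                              # unlabeled batch
x1, x2 = x + 0.1 * torch.randn_like(x), x + 0.1 * torch.randn_like(x)  # two "views"
loss = info_nce(f(x1), g(x2))
opt.zero_grad(); loss.backward(); opt.step()   # simultaneous update of both networks
```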
End-to-End Multi-speaker Speech Recognition with Transformer
Title | End-to-End Multi-speaker Speech Recognition with Transformer |
Authors | Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe |
Abstract | Recently, fully recurrent neural network (RNN) based end-to-end models have been proven to be effective for multi-speaker speech recognition in both the single-channel and multi-channel scenarios. In this work, we explore the use of Transformer models for these tasks by focusing on two aspects. First, we replace the RNN-based encoder-decoder in the speech recognition model with a Transformer architecture. Second, in order to use the Transformer in the masking network of the neural beamformer in the multi-channel case, we modify the self-attention component to be restricted to a segment rather than the whole sequence in order to reduce computation. Besides the model architecture improvements, we also incorporate an external dereverberation preprocessing, the weighted prediction error (WPE), enabling our model to handle reverberated signals. Experiments on the spatialized wsj1-2mix corpus show that the Transformer-based models achieve 40.9% and 25.6% relative WER reduction, down to 12.1% and 6.4% WER, under the anechoic condition in single-channel and multi-channel tasks, respectively, while in the reverberant case, our methods achieve 41.5% and 13.8% relative WER reduction, down to 16.5% and 15.2% WER. |
Tasks | Speech Recognition |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03921v2 |
https://arxiv.org/pdf/2002.03921v2.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-multi-speaker-speech-recognition-1 |
Repo | |
Framework | |
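The segment-restricted self-attention mentioned in the abstract can be sketched as a banded attention mask over the time axis; the segment width and single attention layer below are assumptions for illustration, not the paper's masking-network configuration.

```python
# Illustrative sketch of restricting self-attention to a local segment around each
# frame, reducing the effective context on long sequences.
import torch
import torch.nn as nn

def segment_mask(seq_len, width):
    # True = position may NOT be attended to (PyTorch boolean attn_mask convention).
    idx = torch.arange(seq_len)
    dist = (idx[:, None] - idx[None, :]).abs()
    return dist > width

T, D = 500, 256
x = torch.randn(1, T, D)                         # (batch, time, feature)
attn = nn.MultiheadAttention(embed_dim=D, num_heads=4, batch_first=True)
mask = segment_mask(T, width=16)                 # each frame sees +/-16 frames
out, _ = attn(x, x, x, attn_mask=mask)
print(out.shape)                                 # torch.Size([1, 500, 256])
```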
Damage-sensitive and domain-invariant feature extraction for vehicle-vibration-based bridge health monitoring
Title | Damage-sensitive and domain-invariant feature extraction for vehicle-vibration-based bridge health monitoring |
Authors | Jingxiao Liu, Bingqing Chen, Siheng Chen, Mario Berges, Jacobo Bielak, HaeYoung Noh |
Abstract | We introduce a physics-guided signal processing approach to extract a damage-sensitive and domain-invariant (DS & DI) feature from acceleration response data of a vehicle traveling over a bridge to assess bridge health. Motivated by indirect sensing methods’ benefits, such as low-cost and low-maintenance, vehicle-vibration-based bridge health monitoring has been studied to efficiently monitor bridges in real-time. Yet applying this approach is challenging because 1) physics-based features extracted manually are generally not damage-sensitive, and 2) features from machine learning techniques are often not applicable to different bridges. Thus, we formulate a vehicle bridge interaction system model and find a physics-guided DS & DI feature, which can be extracted using the synchrosqueezed wavelet transform representing non-stationary signals as intrinsic-mode-type components. We validate the effectiveness of the proposed feature with simulated experiments. Compared to conventional time- and frequency-domain features, our feature provides the best damage quantification and localization results across different bridges in five of six experiments. |
Tasks | |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02105v1 |
https://arxiv.org/pdf/2002.02105v1.pdf | |
PWC | https://paperswithcode.com/paper/damage-sensitive-and-domain-invariant-feature |
Repo | |
Framework | |
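The DS & DI feature relies on the synchrosqueezed wavelet transform to represent non-stationary signals as intrinsic-mode-type components. The sketch below substitutes a plain continuous wavelet transform (PyWavelets) as a stand-in to show the general time-frequency feature-extraction flow on a simulated acceleration signal; it is not the authors' actual feature.

```python
# Hedged stand-in: the paper uses the synchrosqueezed wavelet transform; here a plain
# CWT illustrates the general time-frequency feature extraction flow on a toy signal.
import numpy as np
import pywt

fs = 200.0                                        # sampling rate (Hz), assumed
t = np.arange(0, 10, 1 / fs)
accel = (np.sin(2 * np.pi * 3 * t) + 0.3 * np.sin(2 * np.pi * 12 * t)
         + 0.05 * np.random.randn(t.size))        # simulated vehicle acceleration

scales = np.arange(1, 128)
coeffs, freqs = pywt.cwt(accel, scales, "morl", sampling_period=1 / fs)

# A crude damage-indicator candidate: energy distribution across frequency bands.
band_energy = (np.abs(coeffs) ** 2).mean(axis=1)
feature = band_energy / band_energy.sum()         # normalized spectral energy profile
print(feature.shape)                              # (127,)
```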
Real-Time Optimal Guidance and Control for Interplanetary Transfers Using Deep Networks
Title | Real-Time Optimal Guidance and Control for Interplanetary Transfers Using Deep Networks |
Authors | Dario Izzo, Ekin Öztürk |
Abstract | We consider the Earth-Venus mass-optimal interplanetary transfer of a low-thrust spacecraft and show how the optimal guidance can be represented by deep networks in a large portion of the state space and to a high degree of accuracy. Imitation (supervised) learning of optimal examples is used as a network training paradigm. The resulting models are suitable for an on-board, real-time implementation of the optimal guidance and control system of the spacecraft and are called G&CNETs. A new general methodology called Backward Generation of Optimal Examples is introduced and shown to be able to efficiently create all the optimal state-action pairs necessary to train G&CNETs without solving optimal control problems. With respect to previous works, we are able to produce datasets containing a few orders of magnitude more optimal trajectories and obtain network performances compatible with real mission requirements. Several schemes able to train representations of either the optimal policy (thrust profile) or the value function (optimal mass) are proposed and tested. We find that both policy learning and value function learning successfully and accurately learn the optimal thrust, and that a spacecraft employing the learned thrust is able to reach the target orbital conditions spending only 2 permil more propellant than in the corresponding mathematically optimal transfer. Moreover, the optimal propellant mass can be predicted (in the case of value function learning) with an error well within 1%. All G&CNETs produced are tested during simulations of interplanetary transfers with respect to their ability to reach the target conditions optimally, starting from nominal and off-nominal conditions. |
Tasks | |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.09063v1 |
https://arxiv.org/pdf/2002.09063v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-optimal-guidance-and-control-for |
Repo | |
Framework | |
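A minimal imitation-learning sketch in the spirit of G&CNETs: a feedforward network regresses the optimal thrust from the spacecraft state. The state/action dimensions and random placeholder data are assumptions; real training pairs would come from the Backward Generation of Optimal Examples.

```python
# Minimal sketch of policy learning by imitation: regress optimal thrust from state.
# Dimensions and data are placeholders, not the paper's setup.
import torch
import torch.nn as nn

policy = nn.Sequential(                      # state -> thrust vector
    nn.Linear(7, 128), nn.Tanh(),
    nn.Linear(128, 128), nn.Tanh(),
    nn.Linear(128, 3),
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder dataset of (state, optimal thrust) pairs.
states = torch.randn(4096, 7)                # e.g. position, velocity, mass
thrusts = torch.randn(4096, 3)

for _ in range(10):                          # a few illustrative epochs
    pred = policy(states)
    loss = loss_fn(pred, thrusts)
    opt.zero_grad(); loss.backward(); opt.step()
```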
Pre-processing Image using Brightening, CLAHE and RETINEX
Title | Pre-processing Image using Brightening, CLAHE and RETINEX |
Authors | Thi Phuoc Hanh Nguyen, Zinan Cai, Khanh Nguyen, Sokuntheariddh Keth, Ningyuan Shen, Mira Park |
Abstract | This paper focuses on finding the optimal pre-processing method among three common image-enhancement algorithms: Brightening, CLAHE and Retinex. For the purpose of image training in general, these methods are combined to find the most effective pipeline for image enhancement. We carried out experiments on the different permutations of the three methods: Brightening, CLAHE and Retinex. The evaluation is based on Canny edge detection applied to all processed images; the sharpness of objects is then assessed by comparing the number of true-positive edge pixels between images. After applying the different combinations of pre-processing functions to the images, CLAHE proves to be the most effective for edge improvement, Brightening shows little effect on edge enhancement, and Retinex even reduces the sharpness of images and contributes little to image enhancement. |
Tasks | Edge Detection, Image Enhancement |
Published | 2020-03-22 |
URL | https://arxiv.org/abs/2003.10822v1 |
https://arxiv.org/pdf/2003.10822v1.pdf | |
PWC | https://paperswithcode.com/paper/pre-processing-image-using-brightening-clahe |
Repo | |
Framework | |
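One way to sketch the kind of pipeline compared in the paper is to chain the enhancement steps and count edge pixels after Canny. Parameter values and the single-scale Retinex variant below are assumptions, not the paper's exact settings.

```python
# Sketch of a pre-processing pipeline: brightening, CLAHE, a simple single-scale
# Retinex, then Canny edge detection; one permutation of the methods is shown.
import cv2
import numpy as np

img = cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE)   # path is a placeholder

def brighten(gray, alpha=1.2, beta=30):
    return cv2.convertScaleAbs(gray, alpha=alpha, beta=beta)

def clahe(gray):
    return cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(gray)

def single_scale_retinex(gray, sigma=30):
    blurred = cv2.GaussianBlur(gray.astype(np.float32) + 1.0, (0, 0), sigma)
    r = np.log(gray.astype(np.float32) + 1.0) - np.log(blurred)
    return cv2.normalize(r, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

processed = clahe(brighten(img))                       # one permutation of the methods
edges = cv2.Canny(processed, 100, 200)
print(int((edges > 0).sum()), "edge pixels")
```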
No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting with Adversarial Attacks
Title | No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting with Adversarial Attacks |
Authors | Siqi Liu, Arnaud Arindra Adiyoso Setio, Florin C. Ghesu, Eli Gibson, Sasa Grbic, Bogdan Georgescu, Dorin Comaniciu |
Abstract | Detecting malignant pulmonary nodules at an early stage can allow medical interventions which increase the survival rate of lung cancer patients. Using computer vision techniques to detect nodules can improve the sensitivity and the speed of interpreting chest CT for lung cancer screening. Many studies have used CNNs to detect nodule candidates. Though such approaches have been shown to outperform conventional image-processing-based methods in detection accuracy, CNNs are also known to generalize poorly on under-represented samples in the training set and to be prone to imperceptible noise perturbations. Such limitations cannot be easily addressed by scaling up the dataset or the models. In this work, we propose to add adversarial synthetic nodules and adversarial attack samples to the training data to improve the generalization and the robustness of lung nodule detection systems. In order to generate hard examples of nodules from a differentiable nodule synthesizer, we use projected gradient descent (PGD) to search the latent code within a bounded neighbourhood that would generate nodules to decrease the detector response. To make the network more robust to unanticipated noise perturbations, we use PGD to search for noise patterns that can trigger the network to give over-confident mistakes. By evaluating on two different benchmark datasets containing consensus annotations from three radiologists, we show that the proposed techniques can improve the detection performance on real CT data. To understand the limitations of both the conventional networks and the proposed augmented networks, we also perform stress-tests on the false positive reduction networks by feeding different types of artificially produced patches. We show that the augmented networks are more robust to under-represented nodules as well as resistant to noise perturbations. |
Tasks | Adversarial Attack, Lung Nodule Detection |
Published | 2020-03-08 |
URL | https://arxiv.org/abs/2003.03824v1 |
https://arxiv.org/pdf/2003.03824v1.pdf | |
PWC | https://paperswithcode.com/paper/no-surprises-training-robust-lung-nodule |
Repo | |
Framework | |
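A generic PGD sketch of the kind used to craft the adversarial samples described above; the detector is a toy placeholder, and the step size, epsilon, and iteration count are illustrative.

```python
# Generic projected gradient descent (PGD) on an input perturbation: ascend the loss
# for the "nodule" label to suppress the detector response, then project onto the
# epsilon-ball. The detector is a stand-in network, not the paper's model.
import torch
import torch.nn as nn

detector = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 2))   # stand-in nodule detector
patch = torch.rand(1, 1, 32, 32)                                 # candidate CT patch
target = torch.tensor([1])                                        # "nodule" label

eps, alpha, steps = 0.03, 0.01, 10
delta = torch.zeros_like(patch, requires_grad=True)
for _ in range(steps):
    loss = nn.functional.cross_entropy(detector(patch + delta), target)
    loss.backward()
    with torch.no_grad():
        delta += alpha * delta.grad.sign()   # gradient ascent on the loss
        delta.clamp_(-eps, eps)              # projection onto the eps-ball
    delta.grad.zero_()
adversarial_patch = (patch + delta).clamp(0, 1).detach()
```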
Double Backpropagation for Training Autoencoders against Adversarial Attack
Title | Double Backpropagation for Training Autoencoders against Adversarial Attack |
Authors | Chengjin Sun, Sizhe Chen, Xiaolin Huang |
Abstract | Deep learning, as widely known, is vulnerable to adversarial samples. This paper focuses on adversarial attacks on autoencoders. The safety of autoencoders (AEs) is important because they are widely used as a compression scheme for data storage and transmission; however, current autoencoders are easily attacked, i.e., one can slightly modify an input yet obtain totally different codes. The vulnerability is rooted in the sensitivity of the autoencoders, and to enhance the robustness we propose to adopt double backpropagation (DBP) to secure autoencoders such as VAE and DRAW. We restrict the gradient from the reconstructed image to the original one so that the autoencoder is not sensitive to the trivial perturbations produced by an adversarial attack. After smoothing the gradient by DBP, we further smooth the label by a Gaussian Mixture Model (GMM), aiming for accurate and robust classification. We demonstrate on MNIST, CelebA and SVHN that our method leads to a robust autoencoder resistant to attack and, when combined with GMM, a robust classifier capable of image transition and immune to adversarial attack. |
Tasks | Adversarial Attack |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.01895v1 |
https://arxiv.org/pdf/2003.01895v1.pdf | |
PWC | https://paperswithcode.com/paper/double-backpropagation-for-training |
Repo | |
Framework | |
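A minimal sketch of the double backpropagation idea, assuming a toy autoencoder and an arbitrary penalty weight: the gradient of the reconstruction loss with respect to the input is itself penalized, which requires a second backward pass through that gradient.

```python
# Double backpropagation (DBP) sketch: penalize the input-gradient of the reconstruction
# loss so small input perturbations cannot change the reconstruction much.
# The autoencoder and the penalty weight are toy stand-ins.
import torch
import torch.nn as nn

ae = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 784))
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)

x = torch.rand(32, 784, requires_grad=True)
recon = ae(x)
rec_loss = ((recon - x) ** 2).mean()

# Second backprop: gradient of the reconstruction loss w.r.t. the input.
grad_x, = torch.autograd.grad(rec_loss, x, create_graph=True)
dbp_penalty = (grad_x ** 2).sum(dim=1).mean()

loss = rec_loss + 0.1 * dbp_penalty            # penalty weight is an assumption
opt.zero_grad(); loss.backward(); opt.step()
```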
Real-time, Universal, and Robust Adversarial Attacks Against Speaker Recognition Systems
Title | Real-time, Universal, and Robust Adversarial Attacks Against Speaker Recognition Systems |
Authors | Yi Xie, Cong Shi, Zhuohang Li, Jian Liu, Yingying Chen, Bo Yuan |
Abstract | As the popularity of voice user interfaces (VUIs) has exploded in recent years, speaker recognition systems have emerged as an important means of identifying a speaker in many security-critical applications and services. In this paper, we propose the first real-time, universal, and robust adversarial attack against state-of-the-art deep neural network (DNN) based speaker recognition systems. By adding an audio-agnostic universal perturbation to an arbitrary enrolled speaker’s voice input, the DNN-based speaker recognition system will identify the speaker as any target (i.e., adversary-desired) speaker label. In addition, we improve the robustness of our attack by modeling the sound distortions caused by physical over-the-air propagation through estimating the room impulse response (RIR). Experiments using a public dataset of 109 English speakers demonstrate the effectiveness and robustness of our proposed attack with a high attack success rate of over 90%. The attack launching time also achieves a 100X speedup over contemporary non-universal attacks. |
Tasks | Adversarial Attack, Speaker Recognition |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.02301v1 |
https://arxiv.org/pdf/2003.02301v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-universal-and-robust-adversarial |
Repo | |
Framework | |
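A heavily hedged sketch of the universal targeted perturbation idea: a single waveform-level perturbation is optimized over many utterances so that a stand-in speaker classifier outputs a chosen target label, with a toy room-impulse-response convolution standing in for over-the-air propagation. Neither the model, the data, nor the RIR here reflect the paper's actual pipeline.

```python
# Universal targeted perturbation sketch with a toy RIR convolution.
import torch
import torch.nn as nn
import torch.nn.functional as F

speaker_model = nn.Sequential(nn.Flatten(), nn.Linear(16000, 109))  # placeholder classifier
utterances = torch.randn(64, 1, 16000)                              # 1-second clips, assumed
target = torch.full((64,), 7, dtype=torch.long)                     # adversary-chosen speaker

rir = torch.rand(1, 1, 256) * torch.exp(-torch.arange(256.0) / 50)  # toy room impulse response
delta = torch.zeros(1, 1, 16000, requires_grad=True)                # one universal perturbation
opt = torch.optim.Adam([delta], lr=1e-3)

for _ in range(20):
    perturbed = utterances + delta
    aired = F.conv1d(perturbed, rir, padding=128)[..., :16000]      # simulate propagation
    loss = F.cross_entropy(speaker_model(aired), target)
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        delta.clamp_(-0.01, 0.01)                                    # keep the perturbation small
```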
Temporal Sparse Adversarial Attack on Gait Recognition
Title | Temporal Sparse Adversarial Attack on Gait Recognition |
Authors | Ziwen He, Wei Wang, Jing Dong, Tieniu Tan |
Abstract | Gait recognition has broad applications in social security due to its advantages in long-distance human identification. Despite the high accuracy of gait recognition systems, their adversarial robustness has not been explored. In this paper, we demonstrate that state-of-the-art gait recognition models are vulnerable to adversarial attacks. A novel temporal sparse adversarial attack under a newly defined distortion measurement is proposed. A GAN-based architecture is employed to semantically generate high-quality adversarial gait silhouettes. By sparsely substituting or inserting a few adversarial gait silhouettes, our proposed method can achieve a high attack success rate. The imperceptibility and the attack success rate of the adversarial examples are well balanced. Experimental results show that even when only one-fortieth of the frames are attacked, the attack success rate still reaches 76.8%. |
Tasks | Adversarial Attack, Gait Recognition |
Published | 2020-02-22 |
URL | https://arxiv.org/abs/2002.09674v1 |
https://arxiv.org/pdf/2002.09674v1.pdf | |
PWC | https://paperswithcode.com/paper/temporal-sparse-adversarial-attack-on-gait |
Repo | |
Framework | |
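A simplified sketch of the temporal-sparse idea, assuming a placeholder gait recognizer: only a small subset of frames is altered, via a masked additive perturbation that stands in for the paper's GAN-generated silhouette substitution.

```python
# Temporal-sparse perturbation sketch: only the chosen frames of the silhouette
# sequence are modified; the recognizer and frame indices are toy stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

gait_model = nn.Sequential(nn.Flatten(), nn.Linear(40 * 64 * 64, 50))  # placeholder recognizer
sequence = torch.rand(1, 40, 64, 64)                                   # 40 silhouette frames
label = torch.tensor([3])

attacked_frames = [5, 20, 35]                                          # sparse frame subset
mask = torch.zeros(1, 40, 1, 1)
mask[:, attacked_frames] = 1.0

delta = torch.zeros_like(sequence, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.05)

for _ in range(30):
    perturbed = (sequence + mask * delta).clamp(0, 1)                  # only chosen frames change
    loss = -F.cross_entropy(gait_model(perturbed), label)              # push away from true label
    opt.zero_grad(); loss.backward(); opt.step()
```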
Designing Fair AI for Managing Employees in Organizations: A Review, Critique, and Design Agenda
Title | Designing Fair AI for Managing Employees in Organizations: A Review, Critique, and Design Agenda |
Authors | Lionel P. Robert, Casey Pierce, Liz Morris, Sangmi Kim, Rasha Alahmad |
Abstract | Organizations are rapidly deploying artificial intelligence (AI) systems to manage their workers. However, AI has been found at times to be unfair to workers. Unfairness toward workers has been associated with decreased worker effort and increased worker turnover. To avoid such problems, AI systems must be designed to support fairness and redress instances of unfairness. Despite the attention related to AI unfairness, there has not been a theoretical and systematic approach to developing a design agenda. This paper addresses the issue in three ways. First, we introduce organizational justice theory, three different fairness types (distributive, procedural, interactional), and frameworks for redressing instances of unfairness (retributive justice, restorative justice). Second, we review the design literature that specifically focuses on issues of AI fairness in organizations. Third, we propose a design agenda for AI fairness in organizations that applies each of the fairness types to organizational scenarios. The paper concludes with implications for future research. |
Tasks | |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.09054v1 |
https://arxiv.org/pdf/2002.09054v1.pdf | |
PWC | https://paperswithcode.com/paper/designing-fair-ai-for-managing-employees-in |
Repo | |
Framework | |
An Elementary Approach to Convergence Guarantees of Optimization Algorithms for Deep Networks
Title | An Elementary Approach to Convergence Guarantees of Optimization Algorithms for Deep Networks |
Authors | Vincent Roulet, Zaid Harchaoui |
Abstract | We present an approach to obtain convergence guarantees of optimization algorithms for deep networks based on elementary arguments and computations. The convergence analysis revolves around the analytical and computational structures of optimization oracles central to the implementation of deep networks in machine learning software. We provide a systematic way to compute estimates of the smoothness constants that govern the convergence behavior of first-order optimization algorithms used to train deep networks. A diverse set of example components and architectures arising in modern deep networks intersperse the exposition to illustrate the approach. |
Tasks | |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.09051v1 |
https://arxiv.org/pdf/2002.09051v1.pdf | |
PWC | https://paperswithcode.com/paper/an-elementary-approach-to-convergence |
Repo | |
Framework | |
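A standard composition rule of the kind such smoothness estimates build on (stated here as a generic fact, not as the paper's result): if $g$ is $L_g$-Lipschitz with an $\ell_g$-Lipschitz Jacobian and $f$ is $L_f$-Lipschitz with an $\ell_f$-Lipschitz gradient, then the composition $h = f \circ g$ satisfies

```latex
% Generic Lipschitz/smoothness composition bound for h = f \circ g.
\[
  \|h(x) - h(y)\| \le L_f L_g \,\|x - y\|,
  \qquad
  \|\nabla h(x) - \nabla h(y)\| \le \bigl(\ell_f L_g^2 + L_f \ell_g\bigr)\,\|x - y\|.
\]
```

Applying such a rule layer by layer yields estimates of the overall smoothness constant of a deep network from per-layer Lipschitz and smoothness constants.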
Entity Profiling in Knowledge Graphs
Title | Entity Profiling in Knowledge Graphs |
Authors | Xiang Zhang, Qingqing Yang, Jinru Ding, Ziyue Wang |
Abstract | Knowledge Graphs (KGs) are graph-structured knowledge bases storing factual information about real-world entities. Understanding the uniqueness of each entity is crucial to the analysis, sharing, and reuse of KGs. Traditional profiling technologies encompass a vast array of methods to find distinctive features in various applications, which can help to differentiate entities in the process of human understanding of KGs. In this work, we present a novel profiling approach to identify distinctive entity features. The distinctiveness of features is carefully measured by a HAS model, a scalable representation learning model that produces a multi-pattern entity embedding. We fully evaluate the quality of entity profiles generated from real KGs. The results show that our approach facilitates human understanding of entities in KGs. |
Tasks | Knowledge Graphs, Representation Learning |
Published | 2020-02-29 |
URL | https://arxiv.org/abs/2003.00172v1 |
https://arxiv.org/pdf/2003.00172v1.pdf | |
PWC | https://paperswithcode.com/paper/entity-profiling-in-knowledge-graphs |
Repo | |
Framework | |
Learning Representations by Predicting Bags of Visual Words
Title | Learning Representations by Predicting Bags of Visual Words |
Authors | Spyros Gidaris, Andrei Bursuc, Nikos Komodakis, Patrick Pérez, Matthieu Cord |
Abstract | Self-supervised representation learning aims to learn convnet-based image representations from unlabeled data. Inspired by the success of NLP methods in this area, in this work we propose a self-supervised approach based on spatially dense image descriptions that encode discrete visual concepts, here called visual words. To build such discrete representations, we quantize the feature maps of a first pre-trained self-supervised convnet over a k-means based vocabulary. Then, as a self-supervised task, we train another convnet to predict the histogram of visual words of an image (i.e., its Bag-of-Words representation) given as input a perturbed version of that image. The proposed task forces the convnet to learn perturbation-invariant and context-aware image features, useful for downstream image understanding tasks. We extensively evaluate our method and demonstrate very strong empirical results, e.g., our pre-trained self-supervised representations transfer better on the detection task and similarly on classification over classes “unseen” during pre-training, compared to the supervised case. This also shows that the process of image discretization into visual words can provide the basis for very powerful self-supervised approaches in the image domain, thus allowing further connections to be made to related methods from the NLP domain that have been extremely successful so far. |
Tasks | Representation Learning |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.12247v1 |
https://arxiv.org/pdf/2002.12247v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-representations-by-predicting-bags |
Repo | |
Framework | |
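A minimal sketch of the Bag-of-Words prediction task: dense descriptors from a (stand-in) pre-trained convnet are quantized with k-means into visual words, each image's normalized word histogram becomes the target, and a second (stand-in) network predicts that histogram from a perturbed view. Models, vocabulary size, and the perturbation are toy assumptions.

```python
# Bag-of-visual-words self-supervision sketch: quantize feature maps with k-means,
# build per-image word histograms, and train a predictor on a perturbed view.
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.cluster import KMeans

K = 16                                                 # vocabulary size (assumed)
feat_net = nn.Conv2d(3, 32, 3, padding=1)              # stand-in for the frozen pre-trained convnet
pred_net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, K))  # stand-in BoW predictor

images = torch.rand(8, 3, 32, 32)
with torch.no_grad():
    fmap = feat_net(images)                            # (B, C, H, W) dense descriptors
    descriptors = fmap.permute(0, 2, 3, 1).reshape(-1, 32).numpy()
vocab = KMeans(n_clusters=K, n_init=4).fit(descriptors)

words = torch.from_numpy(vocab.labels_).long().view(8, -1)   # visual word per spatial location
bow = torch.stack([torch.bincount(w, minlength=K).float() for w in words])
bow = bow / bow.sum(dim=1, keepdim=True)               # normalized Bag-of-Words target

perturbed = images + 0.05 * torch.randn_like(images)   # crude stand-in for augmentation
log_probs = F.log_softmax(pred_net(perturbed), dim=1)
loss = -(bow * log_probs).sum(dim=1).mean()            # cross-entropy against the soft histogram
loss.backward()
```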
A Road Map to Strong Intelligence
Title | A Road Map to Strong Intelligence |
Authors | Philip Paquette |
Abstract | I wrote this paper because technology can really improve people’s lives. With it, we can live longer in a healthy body, save time through increased efficiency and automation, and make better decisions. To get to the next level, we need to start looking at intelligence from a much broader perspective, and promote international interdisciplinary collaborations. Section 1 of this paper delves into sociology and social psychology to explain that the mechanisms underlying intelligence are inherently social. Section 2 proposes a method to classify intelligence, and describes the differences between weak and strong intelligence. Section 3 examines the Chinese Room argument from a different perspective. It demonstrates that a Turing-complete machine cannot have strong intelligence, and considers the modifications necessary for a computer to be intelligent and have understanding. Section 4 argues that the existential risk caused by the technological explosion of a single agent should not be of serious concern. Section 5 looks at the AI control problem and argues that it is impossible to build a super-intelligent machine that will do what its creators want. By using insights from biology, it also proposes a solution to the control problem. Section 6 discusses some of the implications of strong intelligence. Section 7 lists the main challenges with deep learning, and asserts that radical changes will be required to reach strong intelligence. Section 8 examines a neuroscience framework that could help explain how a cortical column works. Section 9 lays out the broad strokes of a road map towards strong intelligence. Finally, section 10 analyzes the impacts and the challenges of greater intelligence. |
Tasks | |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.09044v1 |
https://arxiv.org/pdf/2002.09044v1.pdf | |
PWC | https://paperswithcode.com/paper/a-road-map-to-strong-intelligence |
Repo | |
Framework | |
Questioning the AI: Informing Design Practices for Explainable AI User Experiences
Title | Questioning the AI: Informing Design Practices for Explainable AI User Experiences |
Authors | Q. Vera Liao, Daniel Gruen, Sarah Miller |
Abstract | A surge of interest in explainable AI (XAI) has led to a vast collection of algorithmic work on the topic. While many recognize the necessity to incorporate explainability features in AI systems, how to address real-world user needs for understanding AI remains an open question. By interviewing 20 UX and design practitioners working on various AI products, we seek to identify gaps between the current XAI algorithmic work and practices to create explainable AI products. To do so, we develop an algorithm-informed XAI question bank in which user needs for explainability are represented as prototypical questions users might ask about the AI, and use it as a study probe. Our work contributes insights into the design space of XAI, informs efforts to support design practices in this space, and identifies opportunities for future XAI work. We also provide an extended XAI question bank and discuss how it can be used for creating user-centered XAI. |
Tasks | |
Published | 2020-01-08 |
URL | https://arxiv.org/abs/2001.02478v2 |
https://arxiv.org/pdf/2001.02478v2.pdf | |
PWC | https://paperswithcode.com/paper/questioning-the-ai-informing-design-practices |
Repo | |
Framework | |