April 3, 2020

Paper Group ANR 73

Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition. Online LiDAR-SLAM for Legged Robots with Robust Registration and Deep-Learned Loop Closure. Decentralized SGD with Over-the-Air Computation. MaxUp: A Simple Way to Improve Generalization of Neural Network Training. Neural Kernels Without Tangents. On the Convergence o …

Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition


Title	Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition
Authors	Zhong Meng, Jinyu Li, Yashesh Gaur, Yifan Gong
Abstract	Teacher-student (T/S) learning has been shown to be effective for domain adaptation of deep neural network acoustic models in hybrid speech recognition systems. In this work, we extend T/S learning to large-scale unsupervised domain adaptation of an attention-based end-to-end (E2E) model through two levels of knowledge transfer: the teacher’s token posteriors as soft labels and its one-best predictions as decoder guidance. To further improve T/S learning with the help of ground-truth labels, we propose adaptive T/S (AT/S) learning. Instead of conditionally choosing between the teacher’s soft token posteriors and the one-hot ground-truth label, in AT/S the student always learns from both the teacher and the ground truth, with a pair of adaptive weights assigned to the soft and one-hot labels that quantify the confidence in each knowledge source. The confidence scores are dynamically estimated at each decoder step as a function of the soft and one-hot labels. With 3400 hours of parallel close-talk and far-field Microsoft Cortana data for domain adaptation, T/S and AT/S achieve 6.3% and 10.3% relative word error rate improvement, respectively, over a strong E2E model trained with the same amount of far-field data.
Tasks	Domain Adaptation, End-To-End Speech Recognition, Speech Recognition, Transfer Learning, Unsupervised Domain Adaptation
Published	2020-01-06
URL	https://arxiv.org/abs/2001.01798v1
PDF	https://arxiv.org/pdf/2001.01798v1.pdf
PWC	https://paperswithcode.com/paper/domain-adaptation-via-teacher-student
Repo
Framework
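
The adaptive weighting described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' formulation: the abstract says the weights are a function of the soft and one-hot labels but does not give that function, so the confidence rule used here (the teacher's posterior on the ground-truth token) and the names `adaptive_target` and `cross_entropy` are hypothetical.

```python
import math

def adaptive_target(teacher_probs, truth_index):
    """Mix the teacher's soft label with the one-hot label for one decoder step."""
    # Assumed confidence rule: trust the teacher as much as it agrees with the truth.
    w_teacher = teacher_probs[truth_index]
    w_truth = 1.0 - w_teacher
    return [
        w_teacher * p + (w_truth if i == truth_index else 0.0)
        for i, p in enumerate(teacher_probs)
    ]

def cross_entropy(target, student_probs):
    """Cross-entropy of the student's distribution against the mixed target."""
    return -sum(t * math.log(s) for t, s in zip(target, student_probs) if t > 0.0)

teacher = [0.7, 0.2, 0.1]   # teacher token posteriors (soft label)
student = [0.6, 0.3, 0.1]   # student token posteriors
target = adaptive_target(teacher, truth_index=0)
loss = cross_entropy(target, student)
```

Because the two weights sum to one, the mixed target is still a probability distribution, and it collapses to the plain one-hot label whenever the teacher gives the correct token zero probability.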

Online LiDAR-SLAM for Legged Robots with Robust Registration and Deep-Learned Loop Closure


Title	Online LiDAR-SLAM for Legged Robots with Robust Registration and Deep-Learned Loop Closure
Authors	Milad Ramezani, Georgi Tinchev, Egor Iuganov, Maurice Fallon
Abstract	In this paper, we present a factor-graph LiDAR-SLAM system which incorporates a state-of-the-art deeply learned feature-based loop closure detector to enable a legged robot to localize and map in industrial environments. These facilities can be badly lit and comprised of indistinct metallic structures, so our system uses only LiDAR sensing and was developed to run on the quadruped robot’s navigation PC. Point clouds are accumulated using an inertial-kinematic state estimator before being aligned using ICP registration. To close loops, we use a loop proposal mechanism which matches individual segments between clouds. We trained a descriptor offline to match these segments. The efficiency of our method comes from carefully designing the network architecture to minimize the number of parameters such that this deep learning method can be deployed in real-time using only the CPU of a legged robot, a major contribution of this work. The set of odometry and loop closure factors is updated using pose graph optimization. Finally, we present an efficient risk alignment prediction method which verifies the reliability of the registrations. Experimental results at an industrial facility demonstrated the robustness and flexibility of our system, including autonomously following paths derived from the SLAM map.
Tasks	Legged Robots
Published	2020-01-28
URL	https://arxiv.org/abs/2001.10249v1
PDF	https://arxiv.org/pdf/2001.10249v1.pdf
PWC	https://paperswithcode.com/paper/online-lidar-slam-for-legged-robots-with
Repo
Framework

Decentralized SGD with Over-the-Air Computation


Title	Decentralized SGD with Over-the-Air Computation
Authors	Emre Ozfatura, Stefano Rini, Deniz Gunduz
Abstract	We study the performance of decentralized stochastic gradient descent (DSGD) in a wireless network, where the nodes collaboratively optimize an objective function using their local datasets. Unlike the conventional setting, where the nodes communicate over error-free orthogonal communication links, we assume that transmissions are prone to additive noise and interference. We first consider a point-to-point (P2P) transmission strategy, termed the OAC-P2P scheme, in which the node pairs are scheduled in an orthogonal fashion to minimize interference. Since, in the DSGD framework, each node requires a linear combination of the neighboring models at the consensus step, we then propose the OAC-MAC scheme, which utilizes the signal superposition property of the wireless medium to achieve over-the-air computation (OAC). For both schemes, we cast the scheduling problem as a graph coloring problem. We numerically evaluate the performance of these two schemes for the MNIST image classification task under various network conditions. We show that the OAC-MAC scheme attains better convergence performance with fewer communication rounds.
Tasks	Image Classification
Published	2020-03-06
URL	https://arxiv.org/abs/2003.04216v1
PDF	https://arxiv.org/pdf/2003.04216v1.pdf
PWC	https://paperswithcode.com/paper/decentralized-sgd-with-over-the-air
Repo
Framework
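
The consensus step mentioned in the abstract above, where each node needs a linear combination of its neighbors' models, can be sketched as follows. This is a toy under stated assumptions: models are single scalars, the four-node ring and its mixing weights `RING` are made up for illustration, and the channel is modeled as one additive Gaussian term per node. It does not reproduce the OAC-P2P or OAC-MAC scheduling from the paper.

```python
import random

def consensus_step(models, weights, noise_std=0.0, rng=None):
    """Each node replaces its model with a weighted sum of all models, plus channel noise."""
    rng = rng or random.Random(0)
    mixed = []
    for row in weights:
        value = sum(w * m for w, m in zip(row, models))
        if noise_std > 0.0:
            value += rng.gauss(0.0, noise_std)  # assumed additive Gaussian channel noise
        mixed.append(value)
    return mixed

def dsgd_round(models, weights, grads, lr=0.1, noise_std=0.0, rng=None):
    """One DSGD round: consensus over neighbors, then a local gradient step."""
    mixed = consensus_step(models, weights, noise_std, rng)
    return [m - lr * g for m, g in zip(mixed, grads)]

# Hypothetical ring of four nodes; each row sums to 1 and only touches ring neighbors.
RING = [
    [0.5, 0.25, 0.0, 0.25],
    [0.25, 0.5, 0.25, 0.0],
    [0.0, 0.25, 0.5, 0.25],
    [0.25, 0.0, 0.25, 0.5],
]
models = [1.0, 2.0, 3.0, 4.0]
for _ in range(50):
    models = consensus_step(models, RING)  # noiseless: all nodes approach the average
```

With a noiseless channel and symmetric weights whose rows sum to one, repeated consensus steps drive every node toward the network average (2.5 here); a nonzero `noise_std` shows how channel noise perturbs that agreement.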

MaxUp: A Simple Way to Improve Generalization of Neural Network Training


Title	MaxUp: A Simple Way to Improve Generalization of Neural Network Training
Authors	Chengyue Gong, Tongzheng Ren, Mao Ye, Qiang Liu
Abstract	We propose MaxUp, an embarrassingly simple, highly effective technique for improving the generalization performance of machine learning models, especially deep neural networks. The idea is to generate a set of augmented data with some random perturbations or transforms and minimize the maximum, or worst-case, loss over the augmented data. By doing so, we implicitly introduce a smoothness or robustness regularization against the random perturbations, and hence improve the generalization performance. For example, in the case of Gaussian perturbation, MaxUp is asymptotically equivalent to using the gradient norm of the loss as a penalty to encourage smoothness. We test MaxUp on a range of tasks, including image classification, language modeling, and adversarial certification, on which MaxUp consistently outperforms the existing best baseline methods, without introducing substantial computational overhead. In particular, we improve ImageNet classification from the state-of-the-art top-1 accuracy of 85.5% without extra data to 85.8%. Code will be released soon.
Tasks	Image Classification, Language Modelling
Published	2020-02-20
URL	https://arxiv.org/abs/2002.09024v1
PDF	https://arxiv.org/pdf/2002.09024v1.pdf
PWC	https://paperswithcode.com/paper/maxup-a-simple-way-to-improve-generalization
Repo
Framework
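
The worst-case objective described in the abstract above can be sketched on a toy problem. This is a minimal illustration, assuming a one-parameter linear model with squared loss and Gaussian input perturbations; the names `squared_loss` and `maxup_loss` and the defaults `m` and `sigma` are hypothetical, not taken from the paper.

```python
import random

def squared_loss(w, x, y):
    """Loss of the toy model y_hat = w * x on one example."""
    return (w * x - y) ** 2

def maxup_loss(w, data, m=4, sigma=0.1, seed=0):
    """Average over examples of the worst loss among m randomly perturbed copies."""
    rng = random.Random(seed)
    total = 0.0
    for x, y in data:
        copies = [x + rng.gauss(0.0, sigma) for _ in range(m)]  # Gaussian augmentation
        total += max(squared_loss(w, xc, y) for xc in copies)   # keep only the worst copy
    return total / len(data)

data = [(1.0, 2.0), (2.0, 4.0)]
plain = maxup_loss(1.0, data, m=3, sigma=0.0)    # no perturbation: ordinary mean loss
robust = maxup_loss(1.0, data, m=8, sigma=0.5)   # worst case over 8 noisy copies
```

Setting `sigma=0.0` recovers the ordinary average loss, which makes it easy to see that the only change relative to standard training is the `max` over augmented copies.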

Neural Kernels Without Tangents


Title	Neural Kernels Without Tangents
Authors	Vaishaal Shankar, Alex Fang, Wenshuo Guo, Sara Fridovich-Keil, Ludwig Schmidt, Jonathan Ragan-Kelley, Benjamin Recht
Abstract	We investigate the connections between neural networks and simple building blocks in kernel space. In particular, using well-established feature space tools such as direct sum, averaging, and moment lifting, we present an algebra for creating “compositional” kernels from bags of features. We show that these operations correspond to many of the building blocks of “neural tangent kernels” (NTKs). Experimentally, we show that there is a correlation in test error between neural network architectures and the associated kernels. We construct a simple neural network architecture using only 3x3 convolutions, 2x2 average pooling, and ReLU, optimized with SGD and MSE loss, that achieves 96% accuracy on CIFAR10, and whose corresponding compositional kernel achieves 90% accuracy. We also use our constructions to investigate the relative performance of neural networks, NTKs, and compositional kernels in the small dataset regime. In particular, we find that compositional kernels outperform NTKs and neural networks outperform both kernel methods.
Tasks
Published	2020-03-04
URL	https://arxiv.org/abs/2003.02237v2
PDF	https://arxiv.org/pdf/2003.02237v2.pdf
PWC	https://paperswithcode.com/paper/neural-kernels-without-tangents
Repo
Framework
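
Two of the feature-space tools named in the abstract above, direct sum and averaging, have simple kernel-space counterparts that can be checked numerically. The sketch below uses textbook identities with a linear base kernel; it is not the paper's construction, the function names are hypothetical, and moment lifting is left out because the abstract does not define it.

```python
def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def direct_sum_kernel(k1, k2):
    """Concatenating two feature maps corresponds to adding their kernels."""
    return lambda x, y: k1(x, y) + k2(x, y)

def average_kernel(k, bag_x, bag_y):
    """Averaging features over a bag corresponds to the mean of all pairwise kernel values."""
    total = sum(k(x, y) for x in bag_x for y in bag_y)
    return total / (len(bag_x) * len(bag_y))

def mean_vector(bag):
    """Explicit feature-space average, used only to check the identity."""
    return [sum(col) / len(bag) for col in zip(*bag)]

bag_x = [[1.0, 0.0], [0.0, 1.0]]
bag_y = [[2.0, 2.0], [4.0, 0.0]]
via_kernel = average_kernel(linear_kernel, bag_x, bag_y)
via_features = linear_kernel(mean_vector(bag_x), mean_vector(bag_y))
doubled = direct_sum_kernel(linear_kernel, linear_kernel)([1.0, 2.0], [3.0, 4.0])
```

Here `via_kernel` and `via_features` agree, which is the sense in which an averaging (pooling) layer can be carried out purely on kernel values without forming the features.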

On the Convergence of the Dynamic Inner PCA Algorithm


Title	On the Convergence of the Dynamic Inner PCA Algorithm
Authors	Sungho Shin, Alex D. Smith, S. Joe Qin, Victor M. Zavala
Abstract	Dynamic inner principal component analysis (DiPCA) is a powerful method for the analysis of time-dependent multivariate data. DiPCA extracts dynamic latent variables that capture the most dominant temporal trends by solving a large-scale, dense, and nonconvex nonlinear program (NLP). A scalable decomposition algorithm has been recently proposed in the literature to solve these challenging NLPs. The decomposition algorithm performs well in practice but its convergence properties are not well understood. In this work, we show that this algorithm is a specialized variant of a coordinate maximization algorithm. This observation allows us to explain why the decomposition algorithm might work (or not) in practice and can guide improvements. We compare the performance of the decomposition strategies with that of the off-the-shelf solver Ipopt. The results show that decomposition is more scalable and, surprisingly, delivers higher quality solutions.
Tasks
Published	2020-03-12
URL	https://arxiv.org/abs/2003.05928v1
PDF	https://arxiv.org/pdf/2003.05928v1.pdf
PWC	https://paperswithcode.com/paper/on-the-convergence-of-the-dynamic-inner-pca
Repo
Framework

Review: Noise and artifact reduction for MRI using deep learning


Title	Review: Noise and artifact reduction for MRI using deep learning
Authors	Daiki Tamada
Abstract	For several years, numerous attempts have been made to reduce noise and artifacts in MRI. Although there have been many successful methods to address these problems, practical implementation for clinical images is still challenging because of its complicated mechanism. Recently, deep learning received considerable attention, emerging as a machine learning approach in delivering robust MR image processing. The purpose here is therefore to explore further and review noise and artifact reduction using deep learning for MRI.
Tasks
Published	2020-02-28
URL	https://arxiv.org/abs/2002.12889v1
PDF	https://arxiv.org/pdf/2002.12889v1.pdf
PWC	https://paperswithcode.com/paper/review-noise-and-artifact-reduction-for-mri
Repo
Framework

Single Image Depth Estimation Trained via Depth from Defocus Cues


Title	Single Image Depth Estimation Trained via Depth from Defocus Cues
Authors	Shir Gur, Lior Wolf
Abstract	Estimating depth from a single RGB image is a fundamental task in computer vision, which is most directly solved using supervised deep learning. In the field of unsupervised learning of depth from a single RGB image, depth is not given explicitly. Existing work in the field receives either a stereo pair, a monocular video, or multiple views, and, using losses that are based on structure-from-motion, trains a depth estimation network. In this work, we rely, instead of different views, on depth from focus cues. Learning is based on a novel Point Spread Function convolutional layer, which applies location-specific kernels that arise from the Circle-Of-Confusion in each image location. We evaluate our method on data derived from five common datasets for depth estimation and lightfield images, and present results that are on par with supervised methods on the KITTI and Make3D datasets and outperform unsupervised learning approaches. Since the phenomenon of depth from defocus is not dataset-specific, we hypothesize that learning based on it would overfit less to the specific content in each dataset. Our experiments show that this is indeed the case, and an estimator learned on one dataset using our method provides better results on other datasets than the directly supervised methods.
Tasks	Depth Estimation, Lightfield
Published	2020-01-14
URL	https://arxiv.org/abs/2001.05036v1
PDF	https://arxiv.org/pdf/2001.05036v1.pdf
PWC	https://paperswithcode.com/paper/single-image-depth-estimation-trained-via-1
Repo
Framework
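
The link between depth and blur that the abstract above relies on can be made concrete with the standard thin-lens circle-of-confusion relation. This is a sketch of that general optics formula, not the paper's Point Spread Function layer, whose parameterization the abstract does not give; the function name and the example camera values are assumptions.

```python
def circle_of_confusion(depth, focus_dist, focal_length, aperture):
    """Thin-lens blur-circle diameter for a point at `depth`.

    All lengths share one unit. `aperture` is the aperture diameter and
    `focus_dist` is the distance at which the lens is focused.
    """
    return (
        aperture
        * abs(depth - focus_dist) / depth
        * focal_length / (focus_dist - focal_length)
    )

# Hypothetical camera: 50 mm lens at f/2 (25 mm aperture), focused at 2 m.
FOCAL, APERTURE, FOCUS = 0.05, 0.025, 2.0
in_focus = circle_of_confusion(2.0, FOCUS, FOCAL, APERTURE)
at_3m = circle_of_confusion(3.0, FOCUS, FOCAL, APERTURE)
at_4m = circle_of_confusion(4.0, FOCUS, FOCAL, APERTURE)
```

A point at the focus distance has zero blur, and blur grows as the point moves away from it, which is why the amount of defocus at each pixel carries information about depth.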

Convex Fairness Constrained Model Using Causal Effect Estimators


Title	Convex Fairness Constrained Model Using Causal Effect Estimators
Authors	Hikaru Ogura, Akiko Takeda
Abstract	Recent years have seen much research on fairness in machine learning. Here, mean difference (MD) or demographic parity is one of the most popular measures of fairness. However, MD quantifies not only discrimination but also explanatory bias which is the difference of outcomes justified by explanatory features. In this paper, we devise novel models, called FairCEEs, which remove discrimination while keeping explanatory bias. The models are based on estimators of causal effect utilizing propensity score analysis. We prove that FairCEEs with the squared loss theoretically outperform a naive MD constraint model. We provide an efficient algorithm for solving FairCEEs in regression and binary classification tasks. In our experiment on synthetic and real-world data in these two tasks, FairCEEs outperformed an existing model that considers explanatory bias in specific cases.
Tasks
Published	2020-02-16
URL	https://arxiv.org/abs/2002.06501v1
PDF	https://arxiv.org/pdf/2002.06501v1.pdf
PWC	https://paperswithcode.com/paper/convex-fairness-constrained-model-using
Repo
Framework
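
The mean difference (MD) measure referred to in the abstract above is the gap between the average predictions for the two groups defined by a sensitive attribute. The sketch below computes only that quantity; the function name and toy data are hypothetical, and it does not implement FairCEEs or the propensity-score estimators.

```python
def mean_difference(predictions, sensitive):
    """Average prediction for the group with sensitive == 1 minus that for sensitive == 0."""
    group1 = [p for p, s in zip(predictions, sensitive) if s == 1]
    group0 = [p for p, s in zip(predictions, sensitive) if s == 0]
    return sum(group1) / len(group1) - sum(group0) / len(group0)

predictions = [1, 1, 0, 1, 0, 0]   # toy binary decisions
sensitive = [1, 1, 1, 0, 0, 0]     # toy group membership
md = mean_difference(predictions, sensitive)
```

An MD of zero corresponds to demographic parity. As the abstract notes, a nonzero MD can mix discrimination with differences that explanatory features justify, which this raw statistic cannot separate.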

Hold me tight! Influence of discriminative features on deep network boundaries


Title	Hold me tight! Influence of discriminative features on deep network boundaries
Authors	Guillermo Ortiz-Jimenez, Apostolos Modas, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard
Abstract	Important insights towards the explainability of neural networks and their properties reside in the formation of their decision boundaries. In this work, we borrow tools from the field of adversarial robustness and propose a new framework that makes it possible to relate the features of the dataset to the distance of data samples to the decision boundary along specific directions. We demonstrate that the inductive bias of deep learning has the tendency to generate classification functions that are invariant along non-discriminative directions of the dataset. More surprisingly, we further show that training on small perturbations of the data samples is sufficient to completely change the decision boundary. This is actually the characteristic exploited by the so-called adversarial training to produce robust classifiers. Our general framework can be used to reveal the effect of specific dataset features on the macroscopic properties of deep models and to develop a better understanding of the successes and limitations of deep learning.
Tasks
Published	2020-02-15
URL	https://arxiv.org/abs/2002.06349v1
PDF	https://arxiv.org/pdf/2002.06349v1.pdf
PWC	https://paperswithcode.com/paper/hold-me-tight-influence-of-discriminative
Repo
Framework

Circumventing Outliers of AutoAugment with Knowledge Distillation


Title	Circumventing Outliers of AutoAugment with Knowledge Distillation
Authors	Longhui Wei, An Xiao, Lingxi Xie, Xin Chen, Xiaopeng Zhang, Qi Tian
Abstract	AutoAugment has been a powerful algorithm that improves the accuracy of many vision tasks, yet it is sensitive to the operator space as well as hyper-parameters, and an improper setting may degenerate network optimization. This paper delves deep into the working mechanism, and reveals that AutoAugment may remove part of discriminative information from the training image and so insisting on the ground-truth label is no longer the best option. To relieve the inaccuracy of supervision, we make use of knowledge distillation that refers to the output of a teacher model to guide network training. Experiments are performed in standard image classification benchmarks, and demonstrate the effectiveness of our approach in suppressing noise of data augmentation and stabilizing training. Upon the cooperation of knowledge distillation and AutoAugment, we claim the new state-of-the-art on ImageNet classification with a top-1 accuracy of 85.8%.
Tasks	Data Augmentation, Image Classification
Published	2020-03-25
URL	https://arxiv.org/abs/2003.11342v1
PDF	https://arxiv.org/pdf/2003.11342v1.pdf
PWC	https://paperswithcode.com/paper/circumventing-outliers-of-autoaugment-with
Repo
Framework

The Effect of Data Ordering in Image Classification


Title	The Effect of Data Ordering in Image Classification
Authors	Ethem F. Can, Aysu Ezen-Can
Abstract	The success stories from deep learning models increase every day, spanning different tasks from image classification to natural language understanding. With the increasing popularity of these models, scientists spend more and more time finding the optimal parameters and best model architectures for their tasks. In this paper, we focus on the ingredient that feeds these machines: the data. We hypothesize that the data ordering affects how well a model performs. To that end, we conduct experiments on an image classification task using the ImageNet dataset and show that some data orderings are better than others in terms of obtaining higher classification accuracies. Experimental results show that, independent of model architecture, learning rate, and batch size, the ordering of the data significantly affects the outcome. We show these findings using different metrics: NDCG, accuracy @ 1, and accuracy @ 5. Our goal here is to show that not only parameters and model architectures but also the data ordering has a say in obtaining better results.
Tasks	Image Classification
Published	2020-01-08
URL	https://arxiv.org/abs/2001.05857v1
PDF	https://arxiv.org/pdf/2001.05857v1.pdf
PWC	https://paperswithcode.com/paper/the-effect-of-data-ordering-in-image
Repo
Framework
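
The metrics named in the abstract above have standard definitions that are short enough to write out. The sketch below uses the common form of NDCG (gain divided by the base-2 logarithm of rank plus one) and top-k accuracy; how the paper maps classifier outputs to relevance scores is not stated in the abstract, so the toy inputs are assumptions.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance scores."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def accuracy_at_k(score_rows, labels, k):
    """Fraction of examples whose true label is among the k highest-scored classes."""
    hits = 0
    for scores, label in zip(score_rows, labels):
        top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)

scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]   # toy class scores for two images
labels = [2, 0]                                # toy true classes
top1 = accuracy_at_k(scores, labels, k=1)
top2 = accuracy_at_k(scores, labels, k=2)
```

Accuracy @ 1 only rewards the top prediction, while NDCG and accuracy @ 5 also give credit when the correct class is ranked near the top.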

Inverse Learning of Symmetry Transformations


Title	Inverse Learning of Symmetry Transformations
Authors	Mario Wieser, Sonali Parbhoo, Aleksander Wieczorek, Volker Roth
Abstract	Symmetry transformations induce invariances and are a crucial building block of modern machine learning algorithms. Some transformations can be described analytically, e.g. geometric invariances. However, in many complex domains, such as the chemical space, invariances can be observed yet the corresponding symmetry transformation cannot be formulated analytically. Thus, the goal of our work is to learn the symmetry transformation that induced this invariance. To address this task, we propose learning two latent subspaces, where the first subspace captures the property and the second subspace the remaining invariant information. Our approach is based on the deep information bottleneck principle in combination with a mutual information regulariser. Unlike previous methods however, we focus on estimating mutual information in continuous rather than binary settings. This poses many challenges as mutual information cannot be meaningfully minimised in continuous domains. Therefore, we base the calculation of mutual information on correlation matrices in combination with a bijective variable transformation. Extensive experiments demonstrate that our model outperforms state-of-the-art methods on artificial and molecular datasets.
Tasks
Published	2020-02-07
URL	https://arxiv.org/abs/2002.02782v1
PDF	https://arxiv.org/pdf/2002.02782v1.pdf
PWC	https://paperswithcode.com/paper/inverse-learning-of-symmetry-transformations
Repo
Framework

Type I Attack for Generative Models


Title	Type I Attack for Generative Models
Authors	Chengjin Sun, Sizhe Chen, Jia Cai, Xiaolin Huang
Abstract	Generative models are popular tools with a wide range of applications. Nevertheless, they are as vulnerable to adversarial samples as classifiers. The existing attack methods mainly focus on generating adversarial examples by adding imperceptible perturbations to the input, which leads to wrong results. However, we focus on another aspect of attack, i.e., cheating models by significant changes. The former induces Type II error and the latter causes Type I error. In this paper, we propose a Type I attack on generative models such as VAE and GAN. One example given in VAE is that we can change an original image significantly to a meaningless one but their reconstruction results are similar. To implement the Type I attack, we destroy the original image by increasing the distance in input space while keeping the output similar, because different inputs may correspond to similar features owing to the properties of deep neural networks. Experimental results show that our attack method is effective in generating Type I adversarial examples for generative models on large-scale image datasets.
Tasks
Published	2020-03-04
URL	https://arxiv.org/abs/2003.01872v1
PDF	https://arxiv.org/pdf/2003.01872v1.pdf
PWC	https://paperswithcode.com/paper/type-i-attack-for-generative-models
Repo
Framework
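
The attack goal described in the abstract above, a large change in the input with a nearly unchanged output, can be written as a single objective. The sketch below is a toy under stated assumptions: `encoder` is a made-up stand-in that only sees the sum of its two inputs, and the objective form and weight `lam` are hypothetical rather than the paper's loss.

```python
def encoder(x):
    """Toy stand-in for a model: it only observes the sum of the two inputs."""
    return x[0] + x[1]

def type1_objective(x_adv, x_orig, lam=10.0):
    """Lower is a stronger Type I attack: far away in input space, close in output space."""
    input_dist = sum((a - b) ** 2 for a, b in zip(x_adv, x_orig))
    output_dist = (encoder(x_adv) - encoder(x_orig)) ** 2
    return -input_dist + lam * output_dist

x_orig = [1.0, 1.0]
hidden_change = [3.0, -1.0]   # moves along a direction the toy encoder cannot see
visible_change = [3.0, 1.0]   # changes the encoder output as well
good = type1_objective(hidden_change, x_orig)
bad = type1_objective(visible_change, x_orig)
```

The first candidate differs greatly from the original yet produces the same output, so it scores better than the second, which mirrors the abstract's point that different inputs may correspond to similar features.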

Is There Tradeoff between Spatial and Temporal in Video Super-Resolution?


Title	Is There Tradeoff between Spatial and Temporal in Video Super-Resolution?
Authors	Haochen Zhang, Dong Liu, Zhiwei Xiong
Abstract	Recent advances in deep learning have led to great success for image and video super-resolution (SR) methods that are based on convolutional neural networks (CNN). For video SR, advanced algorithms have been proposed to exploit the temporal correlation between low-resolution (LR) video frames, and/or to super-resolve a frame with multiple LR frames. These methods pursue higher quality of super-resolved frames, where the quality is usually measured frame by frame, e.g. in PSNR. However, frame-wise quality may not reveal the consistency between frames. If an algorithm is applied to each frame independently (which is the case for most previous methods), the algorithm may cause temporal inconsistency, which can be observed as flickering. It is a natural requirement to improve both frame-wise fidelity and between-frame consistency, which are termed spatial quality and temporal quality, respectively. Then we may ask, is a method optimized for spatial quality also optimized for temporal quality? Can we optimize the two quality metrics jointly?
Tasks	Super-Resolution, Video Super-Resolution
Published	2020-03-13
URL	https://arxiv.org/abs/2003.06141v1
PDF	https://arxiv.org/pdf/2003.06141v1.pdf
PWC	https://paperswithcode.com/paper/is-there-tradeoff-between-spatial-and
Repo
Framework