Paper Group ANR 536
Curriculum-based transfer learning for an effective end-to-end spoken language understanding and domain portability
Title | Curriculum-based transfer learning for an effective end-to-end spoken language understanding and domain portability |
Authors | Antoine Caubrière, Natalia Tomashenko, Antoine Laurent, Emmanuel Morin, Nathalie Camelin, Yannick Estève |
Abstract | We present an end-to-end approach to extract semantic concepts directly from the speech audio signal. To overcome the lack of data available for this spoken language understanding approach, we investigate the use of a transfer learning strategy based on the principles of curriculum learning. This approach allows us to exploit out-of-domain data that can help to prepare a fully neural architecture. Experiments are carried out on the French MEDIA and PORTMEDIA corpora and show that this end-to-end SLU approach reaches the best results ever published on this task. We compare our approach to a classical pipeline approach that uses ASR, POS tagging, lemmatizer, chunker… and other NLP tools that aim to enrich ASR outputs that feed an SLU text to concepts system. Last, we explore the promising capacity of our end-to-end SLU approach to address the problem of domain portability. |
Tasks | Spoken Language Understanding, Transfer Learning |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07601v1 |
https://arxiv.org/pdf/1906.07601v1.pdf | |
PWC | https://paperswithcode.com/paper/curriculum-based-transfer-learning-for-an |
Repo | |
Framework | |
An Enhanced Machine Learning-based Biometric Authentication System Using RR-Interval Framed Electrocardiograms
Title | An Enhanced Machine Learning-based Biometric Authentication System Using RR-Interval Framed Electrocardiograms |
Authors | Amang Song-Kyoo Kim, Chan Yeob Yeun, Paul D. Yoo |
Abstract | This paper targets biometric-data-enabled security systems based on machine learning for digital health. The disadvantages of traditional authentication systems include the risks of forgetfulness, loss, and theft. Biometric authentication is therefore rapidly replacing traditional authentication methods and is becoming an everyday part of life. The electrocardiogram (ECG) was recently introduced as a biometric authentication system suitable for security checks. The proposed authentication system helps investigators studying ECG-based biometric authentication techniques to reshape input data by slicing based on the RR-interval, and defines the Overall Performance (OP), which is the combined performance metric of multiple authentication measures. We evaluated the performance of the proposed system using a confusion matrix and achieved up to 95% accuracy by compact data analysis. We also used the Amang ECG (amgecg) toolbox in MATLAB to investigate the upper-range control limit (UCL) based on the mean square error, which directly affects three authentication performance metrics: the accuracy, the number of accepted samples, and the OP. Using this approach, we found that the OP can be optimized by using a UCL of 0.0028, which indicates 61 accepted samples out of 70 and ensures that the proposed authentication system achieves an accuracy of 95%. |
Tasks | |
Published | 2019-07-27 |
URL | https://arxiv.org/abs/1907.13517v3 |
https://arxiv.org/pdf/1907.13517v3.pdf | |
PWC | https://paperswithcode.com/paper/an-enhanced-machine-learning-based-biometric |
Repo | |
Framework | |
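The abstract above mentions reshaping ECG input by slicing on the RR-interval. A minimal sketch of that idea follows, assuming the R-peak sample indices are already known; the function name `slice_by_rr_interval` and the peak-to-peak framing are illustrative assumptions, not the authors' amgecg implementation.

```python
def slice_by_rr_interval(signal, r_peaks):
    """Split a sampled ECG signal into frames that each span one RR-interval.

    signal  : sequence of samples
    r_peaks : ascending sample indices of detected R-peaks (assumed given)
    Each frame runs from one R-peak up to, but not including, the next.
    """
    return [signal[start:end] for start, end in zip(r_peaks, r_peaks[1:])]


# Toy signal: ten samples with hypothetical R-peaks at indices 1, 4 and 8.
frames = slice_by_rr_interval(list(range(10)), [1, 4, 8])
print(frames)
```

Frames produced this way have different lengths when the heart rate varies, so a real system would typically resample or pad them before feeding a classifier.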
Semi-supervised Acoustic Event Detection based on tri-training
Title | Semi-supervised Acoustic Event Detection based on tri-training |
Authors | Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang |
Abstract | This paper presents our work on training acoustic event detection (AED) models using an unlabeled dataset. Recent acoustic event detectors are based on large-scale neural networks, which are typically trained with huge amounts of labeled data. Labels for acoustic events are expensive to obtain, and relevant acoustic event audio can be limited, especially for rare events. In this paper we leverage an Internet-scale unlabeled dataset with potential domain shift to improve the detection of acoustic events. Based on the classic tri-training approach, our proposed method shows accuracy improvement over both the supervised training baseline and the semi-supervised self-training set-up in all pre-defined acoustic event detection tasks. As our approach relies on ensemble models, we further show that the improvements can be distilled to a single model via knowledge distillation, with the resulting single student model maintaining the high accuracy of the teacher ensemble models. |
Tasks | |
Published | 2019-04-29 |
URL | http://arxiv.org/abs/1904.12926v1 |
http://arxiv.org/pdf/1904.12926v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-acoustic-event-detection |
Repo | |
Framework | |
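The core selection rule of classic tri-training, which the abstract above builds on, can be sketched as follows: an unlabeled example becomes a pseudo-labeled training example for one model when the other two models agree on its label. This is a simplified illustration; the function name is hypothetical, and the error-rate conditions of the classic algorithm as well as the paper's own modifications are omitted.

```python
def tri_training_pseudo_labels(preds_a, preds_b, preds_c):
    """Return, for each of three models, the pseudo-labeled examples it receives.

    preds_x : predicted labels of model x on the same unlabeled examples
    Result  : list of three lists of (example_index, label) pairs; entry i
              holds the examples on which the two models other than i agree.
    """
    preds = [preds_a, preds_b, preds_c]
    selected = [[], [], []]
    for target in range(3):
        # The two "teacher" models for this target model.
        first, second = [preds[j] for j in range(3) if j != target]
        for idx, (y1, y2) in enumerate(zip(first, second)):
            if y1 == y2:
                selected[target].append((idx, y1))
    return selected


a = ["dog", "glass", "dog"]
b = ["dog", "speech", "cat"]
c = ["cat", "speech", "dog"]
print(tri_training_pseudo_labels(a, b, c))
```

Each model is then retrained on its labeled data plus the pseudo-labeled examples selected for it, and the procedure repeats.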
Visual-Thermal Landmarks and Inertial Fusion for Navigation in Degraded Visual Environments
Title | Visual-Thermal Landmarks and Inertial Fusion for Navigation in Degraded Visual Environments |
Authors | Shehryar Khattak, Christos Papachristos, Kostas Alexis |
Abstract | With an ever-widening domain of aerial robotic applications, including many mission critical tasks such as disaster response operations, search and rescue missions and infrastructure inspections taking place in GPS-denied environments, the need for reliable autonomous operation of aerial robots has become crucial. Operating in GPS-denied areas, aerial robots rely on a multitude of sensors to localize and navigate. Visible spectrum cameras are the most commonly used sensors due to their low cost and weight. However, in environments that are visually-degraded such as in conditions of poor illumination, low texture, or presence of obscurants including fog, smoke and dust, the reliability of visible light cameras deteriorates significantly. Nevertheless, maintaining reliable robot navigation in such conditions is essential. In contrast to visible light cameras, thermal cameras offer visibility in the infrared spectrum and can be used in a complementary manner with visible spectrum cameras for robot localization and navigation tasks, without paying the significant weight and power penalty typically associated with carrying other sensors. Exploiting this fact, in this work we present a multi-sensor fusion algorithm for reliable odometry estimation in GPS-denied and degraded visual environments. The proposed method utilizes information from both the visible and thermal spectra for landmark selection and prioritizes feature extraction from informative image regions based on a metric over spatial entropy. Furthermore, inertial sensing cues are integrated to improve the robustness of the odometry estimation process. To verify our solution, a set of challenging experiments were conducted inside a) an obscurant-filled machine shop-like industrial environment, as well as b) a dark subterranean mine in the presence of heavy airborne dust. |
Tasks | Robot Navigation, Sensor Fusion |
Published | 2019-03-05 |
URL | http://arxiv.org/abs/1903.01656v1 |
http://arxiv.org/pdf/1903.01656v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-thermal-landmarks-and-inertial-fusion |
Repo | |
Framework | |
Lyapunov-based Safe Policy Optimization for Continuous Control
Title | Lyapunov-based Safe Policy Optimization for Continuous Control |
Authors | Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duenez-Guzman, Mohammad Ghavamzadeh |
Abstract | We study continuous action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through safe policies, i.e., policies that do not take the agent to undesirable situations. We formulate these problems as constrained Markov decision processes (CMDPs) and present safe policy optimization algorithms that are based on a Lyapunov approach to solve them. Our algorithms can use any standard policy gradient (PG) method, such as deep deterministic policy gradient (DDPG) or proximal policy optimization (PPO), to train a neural network policy, while guaranteeing near-constraint satisfaction for every policy update by projecting either the policy parameter or the action onto the set of feasible solutions induced by the state-dependent linearized Lyapunov constraints. Compared to the existing constrained PG algorithms, ours are more data efficient as they are able to utilize both on-policy and off-policy data. Moreover, our action-projection algorithm often leads to less conservative policy updates and allows for natural integration into an end-to-end PG training pipeline. We evaluate our algorithms and compare them with the state-of-the-art baselines on several simulated (MuJoCo) tasks, as well as a real-world indoor robot navigation problem, demonstrating their effectiveness in terms of balancing performance and constraint satisfaction. Videos of the experiments can be found at the following link: https://drive.google.com/file/d/1pzuzFqWIE710bE2U6DmS59AfRzqK2Kek/view?usp=sharing. |
Tasks | Continuous Control, Robot Navigation |
Published | 2019-01-28 |
URL | http://arxiv.org/abs/1901.10031v2 |
http://arxiv.org/pdf/1901.10031v2.pdf | |
PWC | https://paperswithcode.com/paper/lyapunov-based-safe-policy-optimization-for |
Repo | |
Framework | |
Lifelong Federated Reinforcement Learning: A Learning Architecture for Navigation in Cloud Robotic Systems
Title | Lifelong Federated Reinforcement Learning: A Learning Architecture for Navigation in Cloud Robotic Systems |
Authors | Boyi Liu, Lujia Wang, Ming Liu |
Abstract | This paper was motivated by the problem of how to make robots fuse and transfer their experience so that they can effectively use prior knowledge and quickly adapt to new environments. To address the problem, we present a learning architecture for navigation in cloud robotic systems: Lifelong Federated Reinforcement Learning (LFRL). In this work, we propose a knowledge fusion algorithm for upgrading a shared model deployed on the cloud. Then, effective transfer learning methods in LFRL are introduced. LFRL is consistent with human cognitive science and fits well in cloud robotic systems. Experiments show that LFRL greatly improves the efficiency of reinforcement learning for robot navigation. The cloud robotic system deployment also shows that LFRL is capable of fusing prior knowledge. In addition, we release a cloud robotic navigation-learning website based on LFRL. |
Tasks | Robot Navigation, Transfer Learning |
Published | 2019-01-19 |
URL | https://arxiv.org/abs/1901.06455v3 |
https://arxiv.org/pdf/1901.06455v3.pdf | |
PWC | https://paperswithcode.com/paper/lifelong-federated-reinforcement-learning-a |
Repo | |
Framework | |
YUVMultiNet: Real-time YUV multi-task CNN for autonomous driving
Title | YUVMultiNet: Real-time YUV multi-task CNN for autonomous driving |
Authors | Thomas Boulay, Said El-Hachimi, Mani Kumar Surisetti, Pullarao Maddu, Saranya Kandan |
Abstract | In this paper, we propose a multi-task convolutional neural network (CNN) architecture optimized for a low power automotive grade SoC. We introduce a network based on a unified architecture where the encoder is shared between the two tasks, namely detection and segmentation. The proposed network runs at 25 FPS for 1280x800 resolution. We briefly discuss the methods used to optimize the network architecture, such as using the native YUV image directly, optimization of layers & feature maps, and applying quantization. We also focus on memory bandwidth in our design, as convolutions are data intensive and most SoCs are bandwidth bottlenecked. We then demonstrate the efficiency of our proposed network for a dedicated CNN accelerator, presenting the key performance indicators (KPI) for the detection and segmentation tasks obtained from the hardware execution and the corresponding run-time. |
Tasks | Autonomous Driving, Quantization |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05673v1 |
http://arxiv.org/pdf/1904.05673v1.pdf | |
PWC | https://paperswithcode.com/paper/yuvmultinet-real-time-yuv-multi-task-cnn-for |
Repo | |
Framework | |
Multi-level Monte Carlo Variational Inference
Title | Multi-level Monte Carlo Variational Inference |
Authors | Masahiro Fujisawa, Issei Sato |
Abstract | We propose a framework for variance reduction using the multi-level Monte Carlo (MLMC) method. The framework is naturally compatible with reparameterized gradient estimators. We also propose a novel stochastic gradient estimation method and optimization algorithm on the MLMC method, which estimates sample size per level adaptively according to the ratio of the variance and computational cost in each iteration. Furthermore, we analyzed the convergence of the gradient in stochastic gradient descent and the quality of the gradient estimator in each optimization step on the basis of the signal-to-noise ratio. Finally, we evaluated our method by comparing it with sampling-based benchmark methods in several experiments and found that our method got closer to the optimal value and reduced gradient variance more than the other methods did. |
Tasks | Stochastic Optimization |
Published | 2019-02-01 |
URL | https://arxiv.org/abs/1902.00468v3 |
https://arxiv.org/pdf/1902.00468v3.pdf | |
PWC | https://paperswithcode.com/paper/multi-level-monte-carlo-variational-inference |
Repo | |
Framework | |
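The abstract above says the sample size per level is chosen from the ratio of variance to computational cost. A minimal sketch of the classic MLMC allocation rule that follows this principle is shown below: level l receives a number of samples proportional to sqrt(V_l / C_l), scaled so that the total estimator variance stays at or below a target eps squared. The function name and inputs are illustrative assumptions, and this is the textbook rule, not the adaptive per-iteration scheme proposed in the paper.

```python
import math


def allocate_samples(variances, costs, eps):
    """Classic MLMC sample allocation.

    variances : per-level variance V_l of the level correction term
    costs     : per-level cost C_l of drawing one sample
    eps       : target standard deviation of the combined estimator
    Returns N_l = ceil(eps^-2 * sqrt(V_l / C_l) * sum_k sqrt(V_k * C_k)),
    which gives sum_l V_l / N_l <= eps^2.
    """
    total = sum(math.sqrt(v * c) for v, c in zip(variances, costs))
    return [
        math.ceil(math.sqrt(v / c) * total / eps ** 2)
        for v, c in zip(variances, costs)
    ]


# Two levels: a cheap, high-variance level and an expensive, low-variance one.
n = allocate_samples([4.0, 1.0], [1.0, 4.0], eps=1.0)
print(n)
```

The rule puts most samples on cheap levels with high variance and few on expensive levels with low variance, which is the source of the cost saving.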
Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model
Title | Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model |
Authors | Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová |
Abstract | Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones. Here we present a quantitative theory explaining this behaviour in a spiked matrix-tensor model. Our framework is based on the Kac-Rice analysis of stationary points and a closed-form analysis of gradient-flow originating from statistical physics. We show that there is a well defined region of parameters where the gradient-flow algorithm finds a good global minimum despite the presence of exponentially many spurious local minima. We show that this is achieved by surfing on saddles that have strong negative direction towards the global minima, a phenomenon that is connected to a BBP-type threshold in the Hessian describing the critical points of the landscapes. |
Tasks | |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.08226v3 |
https://arxiv.org/pdf/1907.08226v3.pdf | |
PWC | https://paperswithcode.com/paper/who-is-afraid-of-big-bad-minima-analysis-of |
Repo | |
Framework | |
Code-Switched Language Models Using Neural Based Synthetic Data from Parallel Sentences
Title | Code-Switched Language Models Using Neural Based Synthetic Data from Parallel Sentences |
Authors | Genta Indra Winata, Andrea Madotto, Chien-Sheng Wu, Pascale Fung |
Abstract | Training code-switched language models is difficult due to lack of data and complexity in the grammatical structure. Linguistic constraint theories have been used for decades to generate artificial code-switching sentences to cope with this issue. However, this requires external word alignments or constituency parsers that create erroneous results on distant languages. We propose a sequence-to-sequence model using a copy mechanism to generate code-switching data by leveraging parallel monolingual translations from a limited source of code-switching data. The model learns how to combine words from parallel sentences and identifies when to switch one language to the other. Moreover, it captures code-switching constraints by attending and aligning the words in inputs, without requiring any external knowledge. Based on experimental results, the language model trained with the generated sentences achieves state-of-the-art performance and improves end-to-end automatic speech recognition. |
Tasks | Language Modelling, Speech Recognition |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08582v1 |
https://arxiv.org/pdf/1909.08582v1.pdf | |
PWC | https://paperswithcode.com/paper/code-switched-language-models-using-neural |
Repo | |
Framework | |
Efficient Relative Pose Estimation for Cameras and Generalized Cameras in Case of Known Relative Rotation Angle
Title | Efficient Relative Pose Estimation for Cameras and Generalized Cameras in Case of Known Relative Rotation Angle |
Authors | Evgeniy Martyushev, Bo Li |
Abstract | We propose two minimal solutions to the problem of relative pose estimation of (i) a calibrated camera from four points in two views and (ii) a calibrated generalized camera from five points in two views. In both cases, the relative rotation angle between the views is assumed to be known. In practice, such an angle can be derived from the readings of a 3D gyroscope. We represent the rotation part of the motion in terms of unit quaternions in order to construct polynomial equations encoding the epipolar constraints. The Gröbner basis technique is then used to efficiently derive the solutions. Our first solver for regular cameras significantly improves the existing state-of-the-art solution. The second solver for generalized cameras is novel. The presented minimal solvers can be used in a hypothesize-and-test architecture such as RANSAC for reliable pose estimation. Experiments on synthetic and real datasets confirm that our algorithms are numerically stable, fast and robust. |
Tasks | Pose Estimation |
Published | 2019-01-31 |
URL | http://arxiv.org/abs/1901.11357v1 |
http://arxiv.org/pdf/1901.11357v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-relative-pose-estimation-for |
Repo | |
Framework | |
Towards Real Scene Super-Resolution with Raw Images
Title | Towards Real Scene Super-Resolution with Raw Images |
Authors | Xiangyu Xu, Yongrui Ma, Wenxiu Sun |
Abstract | Most existing super-resolution methods do not perform well in real scenarios due to lack of realistic training data and information loss of the model input. To solve the first problem, we propose a new pipeline to generate realistic training data by simulating the imaging process of digital cameras. And to remedy the information loss of the input, we develop a dual convolutional neural network to exploit the originally captured radiance information in raw images. In addition, we propose to learn a spatially-variant color transformation which helps more effective color corrections. Extensive experiments demonstrate that super-resolution with raw data helps recover fine details and clear structures, and more importantly, the proposed network and data generation pipeline achieve superior results for single image super-resolution in real scenarios. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12156v1 |
https://arxiv.org/pdf/1905.12156v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-real-scene-super-resolution-with-raw-1 |
Repo | |
Framework | |
Gradient Coding Based on Block Designs for Mitigating Adversarial Stragglers
Title | Gradient Coding Based on Block Designs for Mitigating Adversarial Stragglers |
Authors | Swanand Kadhe, O. Ozan Koyluoglu, Kannan Ramchandran |
Abstract | Distributed implementations of gradient-based methods, wherein a server distributes gradient computations across worker machines, suffer from slow running machines, called ‘stragglers’. Gradient coding is a coding-theoretic framework to mitigate stragglers by enabling the server to recover the gradient sum in the presence of stragglers. ‘Approximate gradient codes’ are variants of gradient codes that reduce computation and storage overhead per worker by allowing the server to approximately reconstruct the gradient sum. In this work, our goal is to construct approximate gradient codes that are resilient to stragglers selected by a computationally unbounded adversary. Our motivation for constructing codes to mitigate adversarial stragglers stems from the challenge of tackling stragglers in massive-scale elastic and serverless systems, wherein it is difficult to statistically model stragglers. Towards this end, we propose a class of approximate gradient codes based on balanced incomplete block designs (BIBDs). We show that the approximation error for these codes depends only on the number of stragglers, and thus, adversarial straggler selection has no advantage over random selection. In addition, the proposed codes admit computationally efficient decoding at the server. Next, to characterize fundamental limits of adversarial straggling, we consider the notion of ‘adversarial threshold’ – the smallest number of workers that an adversary must straggle to inflict certain approximation error. We compute a lower bound on the adversarial threshold, and show that codes based on symmetric BIBDs maximize this lower bound among a wide class of codes, making them excellent candidates for mitigating adversarial stragglers. |
Tasks | |
Published | 2019-04-30 |
URL | http://arxiv.org/abs/1904.13373v1 |
http://arxiv.org/pdf/1904.13373v1.pdf | |
PWC | https://paperswithcode.com/paper/gradient-coding-based-on-block-designs-for |
Repo | |
Framework | |
Adversarial Robustness via Label-Smoothing
Title | Adversarial Robustness via Label-Smoothing |
Authors | Morgane Goibert, Elvis Dohmatob |
Abstract | We study Label-Smoothing as a means for improving adversarial robustness of supervised deep-learning models. After establishing a thorough and unified framework, we propose several variations to this general method: adversarial, Boltzmann and second-best Label-Smoothing methods, and we explain how to construct your own one. On various datasets (MNIST, CIFAR10, SVHN) and models (linear models, MLPs, LeNet, ResNet), we show that Label-Smoothing in general improves adversarial robustness against a variety of attacks (FGSM, BIM, DeepFool, Carlini-Wagner) by better taking account of the dataset geometry. The proposed Label-Smoothing methods have two main advantages: they can be implemented as a modified cross-entropy loss, thus do not require any modifications of the network architecture nor do they lead to increased training times, and they improve both standard and adversarial accuracy. |
Tasks | |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11567v2 |
https://arxiv.org/pdf/1906.11567v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-robustness-via-adversarial-label |
Repo | |
Framework | |
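The abstract above notes that Label-Smoothing can be implemented as a modified cross-entropy loss. A minimal sketch of the standard uniform variant follows: the one-hot target is mixed with a uniform distribution over the K classes using a weight alpha. The function name and the default alpha are illustrative assumptions, and the adversarial, Boltzmann and second-best variants proposed in the paper are not reproduced here.

```python
import math


def smoothed_cross_entropy(logits, target, alpha=0.1):
    """Cross-entropy of softmax(logits) against a uniformly smoothed target.

    logits : list of K raw class scores
    target : index of the true class
    alpha  : smoothing weight; alpha = 0 gives the ordinary cross-entropy
    """
    k = len(logits)
    # Numerically stable log-softmax.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    # Smoothed target: alpha / K on every class, plus (1 - alpha) on the true class.
    q = [alpha / k + (1.0 - alpha) * (1.0 if i == target else 0.0) for i in range(k)]
    return -sum(qi * lp for qi, lp in zip(q, log_probs))


# A very confident correct prediction is penalized once the target is smoothed.
print(smoothed_cross_entropy([10.0, 0.0], 0, alpha=0.0))
print(smoothed_cross_entropy([10.0, 0.0], 0, alpha=0.1))
```

Because only the target distribution changes, the network architecture and the cost of a training step stay the same, which matches the advantage claimed in the abstract.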
Reinforcing Medical Image Classifier to Improve Generalization on Small Datasets
Title | Reinforcing Medical Image Classifier to Improve Generalization on Small Datasets |
Authors | Walid Abdullah Al, Il Dong Yun |
Abstract | With the advent of deep learning, improved image classification with complex discriminative models has been made possible. However, such deep models with increased complexity require a huge set of labeled samples to generalize the training. Such classification models can easily overfit when applied to medical images because of limited training data, which is a common problem in the field of medical image analysis. This paper proposes and investigates a reinforced classifier for improving generalization when little training data is available. Partially following the idea of reinforcement learning, the proposed classifier uses a generalization-feedback from a subset of the training data to update its parameters instead of only using the conventional cross-entropy loss on the training data. We evaluate the improvement of the proposed classifier by applying it to three different classification problems against the standard deep classifiers equipped with existing overfitting-prevention techniques. Besides an overall improvement in classification performance, the proposed classifier showed remarkable characteristics of generalized learning, which can have great potential in medical classification tasks. |
Tasks | Image Classification |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.05630v2 |
https://arxiv.org/pdf/1909.05630v2.pdf | |
PWC | https://paperswithcode.com/paper/reinforcing-medical-image-classifier-to |
Repo | |
Framework | |