January 29, 2020

Paper Group ANR 491

Controlling Steering Angle for Cooperative Self-driving Vehicles utilizing CNN and LSTM-based Deep Networks

Title Controlling Steering Angle for Cooperative Self-driving Vehicles utilizing CNN and LSTM-based Deep Networks
Authors Rodolfo Valiente, Mahdi Zaman, Sedat Ozer, Yaser P. Fallah
Abstract A fundamental challenge in autonomous vehicles is adjusting the steering angle under different road conditions. Recent state-of-the-art solutions addressing this challenge include deep learning techniques, as they provide an end-to-end solution to predict steering angles directly from the raw input images with higher accuracy. Most of these works ignore the temporal dependencies between the image frames. In this paper, we tackle the problem of utilizing multiple sets of images shared between two autonomous vehicles to improve the accuracy of controlling the steering angle by considering the temporal dependencies between the image frames. This problem has not been widely studied in the literature. We present and study a new deep architecture to predict the steering angle automatically by using Long-Short-Term-Memory (LSTM) in our deep architecture. Our deep architecture is an end-to-end network that utilizes CNN, LSTM and fully connected (FC) layers and it uses both present and future images (shared by a vehicle ahead via Vehicle-to-Vehicle (V2V) communication) as input to control the steering angle. Our model demonstrates the lowest error when compared to the other existing approaches in the literature.
Tasks Autonomous Vehicles
Published 2019-04-08
URL https://arxiv.org/abs/1904.04375v3
PDF https://arxiv.org/pdf/1904.04375v3.pdf
PWC https://paperswithcode.com/paper/controlling-steering-angle-for-cooperative
Repo
Framework

Does deep learning always outperform simple linear regression in optical imaging?

Title Does deep learning always outperform simple linear regression in optical imaging?
Authors Shuming Jiao, Yang Gao, Jun Feng, Ting Lei, Xiaocong Yuan
Abstract Deep learning has been extensively applied in many optical imaging applications in recent years. Despite the success, the limitations and drawbacks of deep learning in optical imaging have seldom been investigated. In this work, we show that conventional linear-regression-based methods can, to some extent, outperform the previously proposed deep learning approaches for two black-box optical imaging problems. Deep learning demonstrates its weakness especially when the number of training samples is small. The advantages and disadvantages of linear-regression-based methods and deep learning are analyzed and compared. Since many optical systems are essentially linear, a deep learning network containing many nonlinearity functions sometimes may not be the most suitable option.
Tasks
Published 2019-10-31
URL https://arxiv.org/abs/1911.00353v2
PDF https://arxiv.org/pdf/1911.00353v2.pdf
PWC https://paperswithcode.com/paper/does-deep-learning-always-outperform-simple
Repo
Framework

Optimal mini-batch and step sizes for SAGA

Title Optimal mini-batch and step sizes for SAGA
Authors Nidham Gazagnadou, Robert M. Gower, Joseph Salmon
Abstract Recently it has been shown that the step sizes of a family of variance reduced gradient methods called the JacSketch methods depend on the expected smoothness constant. In particular, if this expected smoothness constant could be calculated a priori, then one could safely set much larger step sizes which would result in a much faster convergence rate. We fill in this gap, and provide simple closed form expressions for the expected smoothness constant and careful numerical experiments verifying these bounds. Using these bounds, and since the SAGA algorithm is part of this JacSketch family, we suggest a new standard practice for setting the step sizes and mini-batch size for SAGA that are competitive with a numerical grid search. Furthermore, we can now show that the total complexity of the SAGA algorithm decreases linearly in the mini-batch size up to a pre-defined value: the optimal mini-batch size. This is a rare result in the stochastic variance reduced literature, only previously shown for the Katyusha algorithm. Finally we conjecture that this is the case for many other stochastic variance reduced methods and that our bounds and analysis of the expected smoothness constant is key to extending these results.
Tasks
Published 2019-01-31
URL https://arxiv.org/abs/1902.00071v3
PDF https://arxiv.org/pdf/1902.00071v3.pdf
PWC https://paperswithcode.com/paper/optimal-mini-batch-and-step-sizes-for-saga
Repo
Framework
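
To make the mini-batch SAGA update mentioned in the abstract above concrete, here is a minimal sketch on a small least-squares problem. It is not the authors' code. The helper names (`dot`, `minibatch_saga`), the synthetic data, and the step size `1 / (4 * L_max)` are assumptions chosen for illustration; in particular, the step size is a deliberately conservative choice and not the closed-form expression derived in the paper.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))


def minibatch_saga(A, b, batch_size, step, iters, seed=0):
    """Mini-batch SAGA for f(w) = (1/n) * sum_i 0.5 * (a_i . w - b_i)^2."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    w = [0.0] * d
    table = [[0.0] * d for _ in range(n)]  # last stored gradient of each f_i
    avg = [0.0] * d                        # running mean of the table rows
    for _ in range(iters):
        batch = rng.sample(range(n), batch_size)  # sample without replacement
        direction = avg[:]
        fresh = {}
        for i in batch:
            residual = dot(A[i], w) - b[i]
            grad = [residual * a for a in A[i]]
            fresh[i] = grad
            # Variance-reduced estimate: fresh gradient minus stored one.
            for j in range(d):
                direction[j] += (grad[j] - table[i][j]) / batch_size
        w = [wj - step * dj for wj, dj in zip(w, direction)]
        # Refresh the table and its mean for the sampled indices only.
        for i, grad in fresh.items():
            for j in range(d):
                avg[j] += (grad[j] - table[i][j]) / n
            table[i] = grad
    return w


# Hypothetical noise-free data, so the minimizer is exactly w_true.
data_rng = random.Random(1)
w_true = [2.0, -1.0, 0.5]
A = [[data_rng.uniform(-1, 1) for _ in w_true] for _ in range(50)]
b = [dot(row, w_true) for row in A]
L_max = max(dot(row, row) for row in A)  # largest per-sample smoothness constant
w_hat = minibatch_saga(A, b, batch_size=5, step=1.0 / (4.0 * L_max), iters=3000)
print(w_hat)
```

The gradient table costs memory proportional to the number of samples times the dimension, which is the usual price SAGA pays for its variance reduction. Changing `batch_size` and `step` together is where the paper's analysis would come in.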

Neural Architecture Search in Embedding Space

Title Neural Architecture Search in Embedding Space
Authors Chun-Ting Liu
Abstract The neural architecture search (NAS) algorithm with reinforcement learning can be a powerful and novel framework for the automatic discovery of neural architectures. However, its application is restricted by noncontinuous and high-dimensional search spaces, which make optimization difficult. To resolve these problems, we proposed NAS in embedding space (NASES), which is a novel framework. Unlike other NAS with reinforcement learning approaches that search over a discrete and high-dimensional architecture space, this approach enables reinforcement learning to search in an embedding space by using architecture encoders and decoders. The current experiment demonstrated that the performance of the final architecture network using the NASES procedure is comparable with that of other popular NAS approaches for the image classification task on CIFAR-10. The experiment also indicated that NASES was highly efficient, discovering the final architecture in fewer than 3.5 GPU hours. The performance and effectiveness of NASES were impressive when the architecture-embedding searching and weight initialization were applied.
Tasks Image Classification, Neural Architecture Search
Published 2019-09-09
URL https://arxiv.org/abs/1909.03615v3
PDF https://arxiv.org/pdf/1909.03615v3.pdf
PWC https://paperswithcode.com/paper/neural-architecture-search-in-embedding-space
Repo
Framework

Nested Cavity Classifier: performance and remedy

Title Nested Cavity Classifier: performance and remedy
Authors Waleed A. Mustafa, Waleed A. Yousef
Abstract Nested Cavity Classifier (NCC) is a classification rule that pursues partitioning the feature space, in parallel coordinates, into convex hulls to build decision regions. It is claimed in some of the literature that this geometric-based classifier is superior to many others, particularly in higher dimensions. First, we give an example of how NCC can be inefficient, then motivate a remedy by combining the NCC with the Linear Discriminant Analysis (LDA) classifier. We coin the term Nested Cavity Discriminant Analysis (NCDA) for the resulting classifier. Second, a simulation study is conducted to compare both NCC and NCDA to two other basic classifiers, Linear and Quadratic Discriminant Analysis. NCC alone proves to be inferior to others, while NCDA always outperforms NCC and competes with LDA and QDA.
Tasks
Published 2019-06-23
URL https://arxiv.org/abs/1906.09669v3
PDF https://arxiv.org/pdf/1906.09669v3.pdf
PWC https://paperswithcode.com/paper/nested-cavity-classifier-performance-and
Repo
Framework

Proximal Distilled Evolutionary Reinforcement Learning

Title Proximal Distilled Evolutionary Reinforcement Learning
Authors Cristian Bodnar, Ben Day, Pietro Liò
Abstract Reinforcement Learning (RL) has achieved impressive performance in many complex environments due to the integration with Deep Neural Networks (DNNs). At the same time, Genetic Algorithms (GAs), often seen as a competing approach to RL, had limited success in scaling up to the DNNs required to solve challenging tasks. Contrary to this dichotomic view, in the physical world, evolution and learning are complementary processes that continuously interact. The recently proposed Evolutionary Reinforcement Learning (ERL) framework has demonstrated mutual benefits to performance when combining the two methods. However, ERL has not fully addressed the scalability problem of GAs. In this paper, we show that this problem is rooted in an unfortunate combination of a simple genetic encoding for DNNs and the use of traditional biologically-inspired variation operators. When applied to these encodings, the standard operators are destructive and cause catastrophic forgetting of the traits the networks acquired. We propose a novel algorithm called Proximal Distilled Evolutionary Reinforcement Learning (PDERL) that is characterised by a hierarchical integration between evolution and learning. The main innovation of PDERL is the use of learning-based variation operators that compensate for the simplicity of the genetic representation. Unlike traditional operators, our proposals meet the functional requirements of variation operators when applied on directly-encoded DNNs. We evaluate PDERL in five robot locomotion settings from the OpenAI gym. Our method outperforms ERL, as well as two state-of-the-art RL algorithms, PPO and TD3, in all tested environments.
Tasks
Published 2019-06-24
URL https://arxiv.org/abs/1906.09807v3
PDF https://arxiv.org/pdf/1906.09807v3.pdf
PWC https://paperswithcode.com/paper/proximal-distilled-evolutionary-reinforcement
Repo
Framework

Graph based Nearest Neighbor Search: Promises and Failures

Title Graph based Nearest Neighbor Search: Promises and Failures
Authors Peng-Cheng Lin, Wan-Lei Zhao
Abstract Recently, graph based nearest neighbor search has become increasingly popular on large-scale retrieval tasks. The attractiveness of this type of approach lies in its superior performance over most of the known nearest neighbor search approaches as well as its generality across various metrics. In this paper, the role of two strategies, namely hierarchical structure and graph diversification, that are adopted as the key steps in the graph based approaches is investigated. We find that the hierarchical structure could not achieve “much better logarithmic complexity scaling” as was claimed in the original paper, particularly in high dimensional cases. Moreover, we find that search speed similar to that of the hierarchical structure can be achieved with a flat k-NN graph after graph diversification. Finally, we point out that the difficulty faced by most of the graph based search approaches is directly linked to the “curse of dimensionality”.
Tasks
Published 2019-04-03
URL https://arxiv.org/abs/1904.02077v5
PDF https://arxiv.org/pdf/1904.02077v5.pdf
PWC https://paperswithcode.com/paper/a-comparative-study-on-hierarchical-navigable
Repo
Framework
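
As a rough illustration of search over a flat k-NN graph, the structure discussed in the abstract above, the sketch below runs a best-first search on a brute-force k-NN graph. It is not the authors' implementation: the function names, the parameter `ef` (size of the result pool), the fixed entry node, and the random 2-D data are all assumptions, and neither graph diversification nor any hierarchy is implemented here.

```python
import heapq
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def build_knn_graph(points, k):
    """Exact k-NN graph by brute force, O(n^2); for illustration only."""
    graph = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: sq_dist(p, points[j]))
        graph.append(others[:k])
    return graph


def graph_search(points, graph, query, entry, ef):
    """Best-first search that keeps the ef closest nodes seen so far."""
    d0 = sq_dist(query, points[entry])
    visited = {entry}
    candidates = [(d0, entry)]  # min-heap of nodes still to expand
    best = [(-d0, entry)]       # max-heap (negated distances) of results
    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the closest unexpanded node is worse than the worst result.
        if len(best) >= ef and d > -best[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = sq_dist(query, points[nb])
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the current worst result
    return sorted((-neg_d, idx) for neg_d, idx in best)


# Hypothetical data: 200 random points in the unit square.
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(200)]
graph = build_knn_graph(points, k=8)
query = (0.5, 0.5)
results = graph_search(points, graph, query, entry=0, ef=10)
print(results[:3])  # (squared distance, point index) pairs, closest first
```

The search is approximate: if the graph is poorly connected around the entry node, the true nearest neighbor may never be reached, which is one reason practical systems add steps such as graph diversification.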

Partial Fingerprint Detection Using Core Point Location

Title Partial Fingerprint Detection Using Core Point Location
Authors Wajih Ullah Baig, Adeel Ejaz, Umar Munir, Kashif Sardar
Abstract In biometric identification, fingerprint-based identification has been a widely accepted mechanism. Automated fingerprint identification/verification techniques are widely adopted in many civilian and forensic applications. In forensic applications, fingerprints are usually incomplete, broken, unclear or degraded; these are known as partial fingerprints. Fingerprint identification/verification largely suffers from the problem of handling partial fingerprints. In this paper, a novel and simple approach is presented for detecting partial fingerprints using core point location. Our technique is particularly useful during the acquisition stage to determine whether a user needs to re-align the finger to ensure a complete capture of the fingerprint area. This technique is tested on FVC-2002 DB1A. The results, presented in the Results section, are very accurate.
Tasks
Published 2019-02-04
URL http://arxiv.org/abs/1902.01400v1
PDF http://arxiv.org/pdf/1902.01400v1.pdf
PWC https://paperswithcode.com/paper/partial-fingerprint-detection-using-core
Repo
Framework

Model-based Convolutional De-Aliasing Network Learning for Parallel MR Imaging

Title Model-based Convolutional De-Aliasing Network Learning for Parallel MR Imaging
Authors Yanxia Chen, Taohui Xiao, Cheng Li, Qiegen Liu, Shanshan Wang
Abstract Parallel imaging has been an essential technique to accelerate MR imaging. Nevertheless, the acceleration rate is still limited due to the ill-conditioning and challenges associated with the undersampled reconstruction. In this paper, we propose a model-based convolutional de-aliasing network with adaptive parameter learning to achieve accurate reconstruction from multi-coil undersampled k-space data. Three main contributions have been made: a de-aliasing reconstruction model was proposed to accelerate parallel MR imaging with deep learning exploring both spatial redundancy and multi-coil correlations; a split Bregman iteration algorithm was developed to solve the model efficiently; and unlike most existing parallel imaging methods which rely on the accuracy of the estimated multi-coil sensitivity, the proposed method can perform parallel reconstruction from undersampled data without explicit sensitivity calculation. Evaluations were conducted on an in vivo brain dataset with a variety of undersampling patterns and different acceleration factors. Our results demonstrated that this method could achieve superior performance in both quantitative and qualitative analysis, compared to three state-of-the-art methods.
Tasks De-aliasing
Published 2019-08-06
URL https://arxiv.org/abs/1908.02054v1
PDF https://arxiv.org/pdf/1908.02054v1.pdf
PWC https://paperswithcode.com/paper/model-based-convolutional-de-aliasing-network
Repo
Framework

CLOTH3D: Clothed 3D Humans

Title CLOTH3D: Clothed 3D Humans
Authors Hugo Bertiche, Meysam Madadi, Sergio Escalera
Abstract This work presents CLOTH3D, the first large-scale synthetic dataset of 3D clothed human sequences. CLOTH3D contains a large variability in garment type, topology, shape, size, tightness and fabric. Clothes are simulated on top of thousands of different pose sequences and body shapes, generating realistic cloth dynamics. We provide the dataset with a generative model for cloth generation. We propose a Conditional Variational Auto-Encoder (CVAE) based on graph convolutions (GCVAE) to learn garment latent spaces. This allows for realistic generation of 3D garments on top of the SMPL model for any pose and shape.
Tasks
Published 2019-12-05
URL https://arxiv.org/abs/1912.02792v1
PDF https://arxiv.org/pdf/1912.02792v1.pdf
PWC https://paperswithcode.com/paper/cloth3d-clothed-3d-humans
Repo
Framework

SA-Text: Simple but Accurate Detector for Text of Arbitrary Shapes

Title SA-Text: Simple but Accurate Detector for Text of Arbitrary Shapes
Authors Qitong Wang, Yi Zheng, Margrit Betke
Abstract We introduce a new framework for text detection named SA-Text, meaning “Simple but Accurate,” which utilizes heatmaps to detect text regions in natural scene images effectively. SA-Text detects text that occurs in various fonts, shapes, and orientations in natural scene images with complicated backgrounds. Experiments on three challenging and public scene-text-detection datasets, Total-Text, SCUT-CTW1500, and MSRA-TD500, show the effectiveness and generalization ability of SA-Text in detecting not only oriented straight text but also curved text in scripts of multiple languages. To further show the practicality of SA-Text, we combine it with a powerful state-of-the-art text recognition model and thus propose a pipeline-based text spotting system called SAA (“text spotting” is used as the technical term for “detection and recognition of text”). Our experimental results of SAA on the Total-Text dataset show that SAA outperforms four state-of-the-art text spotting frameworks by at least 9 percentage points in the F-measure, which means that SA-Text can be used as a complete text detection and recognition system in real applications.
Tasks Scene Text Detection, Text Spotting
Published 2019-11-16
URL https://arxiv.org/abs/1911.07046v2
PDF https://arxiv.org/pdf/1911.07046v2.pdf
PWC https://paperswithcode.com/paper/sa-text-simple-but-accurate-detector-for-text
Repo
Framework

Defending Against Misclassification Attacks in Transfer Learning

Title Defending Against Misclassification Attacks in Transfer Learning
Authors Bang Wu, Xiangwen Yang, Shuo Wang, Xingliang Yuan, Cong Wang, Carsten Rudolph
Abstract Transfer learning accelerates the development of new models (Student Models). It applies relevant knowledge from a pre-trained model (Teacher Model) to the new ones with a small amount of training data, yet without affecting the model accuracy. However, these Teacher Models are normally open in order to facilitate sharing and reuse, which creates an attack plane in transfer learning systems. Among others, recent emerging attacks demonstrate that adversarial inputs can be built with negligible perturbations to the normal inputs. Such inputs can mimic the internal features of the student models directly based on the knowledge of the Teacher Models and cause misclassification in final predictions. In this paper, we propose an effective defence against the above misclassification attacks in transfer learning. First, we propose a distilled differentiator that can address the targeted attacks, where adversarial inputs are misclassified to a specific class. Specifically, this dedicated differentiator is designed with network activation pruning and retraining in a fine-tuned manner, so as to reach high defence rates and high model accuracy. To address the non-targeted attacks that misclassify adversarial inputs to randomly selected classes, we further employ an ensemble structure from the differentiators to cover all possible misclassification. Our evaluations over common image recognition tasks confirm that the student models applying our defence can reject most of the adversarial inputs with a marginal accuracy loss. We also show that our defence outperforms prior approaches in both targeted and non-targeted attacks.
Tasks Transfer Learning
Published 2019-08-29
URL https://arxiv.org/abs/1908.11230v2
PDF https://arxiv.org/pdf/1908.11230v2.pdf
PWC https://paperswithcode.com/paper/defending-against-misclassification-attacks
Repo
Framework

Understanding language-elicited EEG data by predicting it from a fine-tuned language model

Title Understanding language-elicited EEG data by predicting it from a fine-tuned language model
Authors Dan Schwartz, Tom Mitchell
Abstract Electroencephalography (EEG) recordings of brain activity taken while participants read or listen to language are widely used within the cognitive neuroscience and psycholinguistics communities as a tool to study language comprehension. Several time-locked stereotyped EEG responses to word-presentations – known collectively as event-related potentials (ERPs) – are thought to be markers for semantic or syntactic processes that take place during comprehension. However, the characterization of each individual ERP in terms of what features of a stream of language trigger the response remains controversial. Improving this characterization would make ERPs a more useful tool for studying language comprehension. We take a step towards better understanding the ERPs by fine-tuning a language model to predict them. This new approach to analysis shows for the first time that all of the ERPs are predictable from embeddings of a stream of language. Prior work has only found two of the ERPs to be predictable. In addition to this analysis, we examine which ERPs benefit from sharing parameters during joint training. We find that two pairs of ERPs previously identified in the literature as being related to each other benefit from joint training, while several other pairs of ERPs that benefit from joint training are suggestive of potential relationships. Extensions of this analysis that further examine what kinds of information in the model embeddings relate to each ERP have the potential to elucidate the processes involved in human language comprehension.
Tasks EEG, Language Modelling
Published 2019-04-02
URL http://arxiv.org/abs/1904.01548v1
PDF http://arxiv.org/pdf/1904.01548v1.pdf
PWC https://paperswithcode.com/paper/understanding-language-elicited-eeg-data-by
Repo
Framework

Predicting knee osteoarthritis severity: comparative modeling based on patient’s data and plain X-ray images

Title Predicting knee osteoarthritis severity: comparative modeling based on patient’s data and plain X-ray images
Authors Jaynal Abedin, Joseph Antony, Kevin McGuinness, Kieran Moran, Noel E O’Connor, Dietrich Rebholz-Schuhmann, John Newell
Abstract Knee osteoarthritis (KOA) is a disease that impairs knee function and causes pain. A radiologist reviews knee X-ray images and grades the severity level of the impairments according to the Kellgren and Lawrence grading scheme, a five-point ordinal scale (0–4). In this study, we used Elastic Net (EN) and Random Forests (RF) to build predictive models using patient assessment data (i.e. signs and symptoms of both knees and medication use) and a convolutional neural network (CNN) trained using X-ray images only. Linear mixed effect models (LMM) were used to model the within-subject correlation between the two knees. The root mean squared error for the CNN, EN, and RF models was 0.77, 0.97, and 0.94 respectively. The LMM shows overall prediction accuracy similar to that of the EN regression but correctly accounted for the hierarchical structure of the data, resulting in more reliable inference. Useful explanatory variables were identified that could be used for patient monitoring before X-ray imaging. Our analyses suggest that the models trained for predicting the KOA severity levels achieve comparable results when modeling X-ray images and patient data. The subjectivity in the KL grade is still a primary concern.
Tasks
Published 2019-08-23
URL https://arxiv.org/abs/1908.08873v1
PDF https://arxiv.org/pdf/1908.08873v1.pdf
PWC https://paperswithcode.com/paper/predicting-knee-osteoarthritis-severity
Repo
Framework
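
The abstract above compares models by root mean squared error (RMSE) on the 0–4 Kellgren and Lawrence scale. As a reminder of what that metric measures, here is a small sketch; the function name `rmse` and the grades and predictions below are made up for illustration and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical KL grades and model outputs, for illustration only.
kl_grades = [0, 1, 2, 3, 4]
predictions = [0.5, 1.0, 2.5, 2.0, 4.0]
print(rmse(kl_grades, predictions))
```

Because errors are squared before averaging, a single prediction that is off by a full grade weighs as much as four predictions that are each off by half a grade.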

Representation Learning for Electronic Health Records

Title Representation Learning for Electronic Health Records
Authors Wei-Hung Weng, Peter Szolovits
Abstract Information in electronic health records (EHR), such as clinical narratives, examination reports, lab measurements, demographics, and other patient encounter entries, can be transformed through representation learning into appropriate data representations that can be used for downstream clinical machine learning tasks. Learning better representations is critical to improve the performance of downstream tasks. Due to the advances in machine learning, we can now learn better and more meaningful representations from EHR through disentangling the underlying factors inside data and distilling large amounts of information and knowledge from heterogeneous EHR sources. In this chapter, we first introduce the background of learning representations and reasons why we need good EHR representations in machine learning for medicine and healthcare in Section 1. Next, we explain the commonly-used machine learning and evaluation methods for representation learning using a deep learning approach in Section 2. Following that, we review recent related studies of learning patient state representation from EHR for clinical machine learning tasks in Section 3. Finally, in Section 4 we discuss more techniques, studies, and challenges for learning natural language representations when free texts, such as clinical notes, examination reports, or biomedical literature, are used. We also discuss challenges and opportunities in these rapidly growing research fields.
Tasks Representation Learning
Published 2019-09-19
URL https://arxiv.org/abs/1909.09248v1
PDF https://arxiv.org/pdf/1909.09248v1.pdf
PWC https://paperswithcode.com/paper/representation-learning-for-electronic-health
Repo
Framework