April 3, 2020


Paper Group ANR 62


Performance of Statistical and Machine Learning Techniques for Physical Layer Authentication

Title Performance of Statistical and Machine Learning Techniques for Physical Layer Authentication
Authors Linda Senigagliesi, Marco Baldi, Ennio Gambi
Abstract In this paper we consider authentication at the physical layer, in which the authenticator aims at distinguishing a legitimate supplicant from an attacker on the basis of the characteristics of the communication channel. Authentication is performed over a set of parallel wireless channels affected by time-varying fading in the presence of a malicious attacker, whose channel has a spatial correlation with the supplicant’s one. We first propose the use of two different statistical decision methods, and we prove that using a large number of references (in the form of channel estimates) affected by different levels of time-varying fading is not beneficial from a security point of view. We then propose to exploit classification methods based on machine learning. In order to face the worst case of an authenticator provided with no forged messages during training, we consider one-class classifiers. When instead the training set includes some forged messages, we resort to more conventional binary classifiers, considering the cases in which such messages are either labelled or not. For the latter case, we exploit clustering algorithms to label the training set. The performance of both nearest neighbor (NN) and support vector machine (SVM) classification techniques is assessed. Through numerical examples, we show that under the same probability of false alarm, one-class classification (OCC) algorithms achieve the lowest probability of missed detection when a small spatial correlation exists between the main channel and the adversary one, while statistical methods are advantageous when the spatial correlation between the two channels is large.
Tasks
Published 2020-01-17
URL https://arxiv.org/abs/2001.06238v1
PDF https://arxiv.org/pdf/2001.06238v1.pdf
PWC https://paperswithcode.com/paper/performance-of-statistical-and-machine
Repo
Framework
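
As a rough illustration of the one-class setting described in the abstract above, the sketch below trains only on legitimate channel estimates and flags a new estimate as forged when it lies too far from its nearest training sample. The function names, the Euclidean distance, the toy two-dimensional "channel estimates", and the threshold rule (a quantile of leave-one-out nearest-neighbor distances aimed at a target false-alarm rate) are all assumptions made for illustration; they are not taken from the paper.

```python
import math

def nearest_distance(x, refs):
    """Euclidean distance from x to its nearest neighbor in refs."""
    return min(math.dist(x, r) for r in refs)

def fit_threshold(train, false_alarm=0.1):
    """Pick a threshold from leave-one-out nearest-neighbor distances.

    Hypothetical rule: roughly a fraction `false_alarm` of the legitimate
    training samples would be rejected at the returned threshold.
    """
    loo = sorted(
        nearest_distance(x, train[:i] + train[i + 1:])
        for i, x in enumerate(train)
    )
    idx = int(math.ceil((1.0 - false_alarm) * len(loo))) - 1
    return loo[min(max(idx, 0), len(loo) - 1)]

def is_legitimate(x, train, threshold):
    """Accept x if it is close enough to some legitimate reference."""
    return nearest_distance(x, train) <= threshold

# Toy legitimate channel estimates (assumed data, for illustration only).
legit = [(1.0, 0.0), (1.1, 0.1), (0.9, -0.1), (1.0, 0.2), (1.2, 0.0)]
thr = fit_threshold(legit, false_alarm=0.2)
```

With these toy values, an estimate near the legitimate cluster such as `(1.05, 0.05)` is accepted, while a distant one such as `(3.0, 2.0)` is rejected. No forged samples are needed at training time, which is the point of the one-class formulation.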

ProxEmo: Gait-based Emotion Learning and Multi-view Proxemic Fusion for Socially-Aware Robot Navigation

Title ProxEmo: Gait-based Emotion Learning and Multi-view Proxemic Fusion for Socially-Aware Robot Navigation
Authors Venkatraman Narayanan, Bala Murali Manoghar, Vishnu Sashank Dorbala, Dinesh Manocha, Aniket Bera
Abstract We present ProxEmo, a novel end-to-end emotion prediction algorithm for socially aware robot navigation among pedestrians. Our approach predicts the perceived emotions of a pedestrian from walking gaits, which are then used for emotion-guided navigation taking into account social and proxemic constraints. To classify emotions, we propose a multi-view skeleton graph convolution-based model that works on a commodity camera mounted onto a moving robot. Our emotion recognition is integrated into a mapless navigation scheme and makes no assumptions about the environment of pedestrian motion. It achieves a mean average emotion prediction precision of 82.47% on the Emotion-Gait benchmark dataset. We outperform current state-of-the-art algorithms for emotion recognition from 3D gaits. We highlight its benefits in terms of navigation in indoor scenes using a Clearpath Jackal robot.
Tasks Emotion Recognition, Robot Navigation
Published 2020-03-02
URL https://arxiv.org/abs/2003.01062v1
PDF https://arxiv.org/pdf/2003.01062v1.pdf
PWC https://paperswithcode.com/paper/proxemo-gait-based-emotion-learning-and-multi
Repo
Framework

A novel tree-structured point cloud dataset for skeletonization algorithm evaluation

Title A novel tree-structured point cloud dataset for skeletonization algorithm evaluation
Authors Yan Lin, Ji Liu, Jianlin Zhou
Abstract Curve skeleton extraction from unorganized point clouds is a fundamental task of computer vision and three-dimensional data preprocessing and visualization. A great amount of work has been done to extract skeletons from point clouds, but the lack of standard point cloud datasets with ground truth skeletons makes it difficult to evaluate these algorithms. In this paper, we construct a brand new tree-structured point cloud dataset, including ground truth skeletons and point cloud models. In addition, four types of point cloud are built on the clean point cloud: point clouds with noise, point clouds with missing data, point clouds with different density, and point clouds with uneven density distribution. We first use a tree editor to build the tree skeleton and corresponding mesh model. Since the implicit surface is sufficiently expressive to retain the edges and details of the complex branches model, we use the implicit surface to model the triangular mesh. With the implicit surface, a virtual scanner is applied to sample the point cloud. Finally, considering the challenges in skeleton extraction, we introduce different methods to build four different types of point cloud models. This dataset can be used as a standard dataset for skeleton extraction algorithms, and skeleton extraction algorithms can be evaluated by comparing the ground truth skeleton with the extracted skeleton.
Tasks
Published 2020-01-09
URL https://arxiv.org/abs/2001.02823v1
PDF https://arxiv.org/pdf/2001.02823v1.pdf
PWC https://paperswithcode.com/paper/a-novel-tree-structured-point-cloud-dataset
Repo
Framework

Deep Learning Stereo Vision at the edge

Title Deep Learning Stereo Vision at the edge
Authors Luca Puglia, Cormac Brick
Abstract We present an overview of the methodology used to build a new stereo vision solution that is suitable for System on Chip. This new solution was developed to bring computer vision capability to embedded devices that live in a power constrained environment. The solution is constructed as a hybrid between classical Stereo Vision techniques and deep learning approaches. The stereoscopic module is composed of two separate modules: one that accelerates the neural network we trained and one that accelerates the front-end part. The system is completely passive and does not require any structured light to obtain very compelling accuracy. With respect to the previous Stereo Vision solutions offered by the industry, we offer a major improvement in robustness to noise. This is mainly possible due to the deep learning part of the chosen architecture. We submitted our result to the Middlebury dataset challenge. It currently ranks as the best System on Chip solution. The system has been developed for low latency applications which require better than real time performance on high definition videos.
Tasks
Published 2020-01-13
URL https://arxiv.org/abs/2001.04552v1
PDF https://arxiv.org/pdf/2001.04552v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-stereo-vision-at-the-edge
Repo
Framework

An End-to-end Deep Learning Approach for Landmark Detection and Matching in Medical Images

Title An End-to-end Deep Learning Approach for Landmark Detection and Matching in Medical Images
Authors Monika Grewal, Timo M. Deist, Jan Wiersma, Peter A. N. Bosman, Tanja Alderliesten
Abstract Anatomical landmark correspondences in medical images can provide additional guidance information for the alignment of two images, which, in turn, is crucial for many medical applications. However, manual landmark annotation is labor-intensive. Therefore, we propose an end-to-end deep learning approach to automatically detect landmark correspondences in pairs of two-dimensional (2D) images. Our approach consists of a Siamese neural network, which is trained to identify salient locations in images as landmarks and predict matching probabilities for landmark pairs from two different images. We trained our approach on 2D transverse slices from 168 lower abdominal Computed Tomography (CT) scans. We tested the approach on 22,206 pairs of 2D slices with varying levels of intensity, affine, and elastic transformations. The proposed approach finds an average of 639, 466, and 370 landmark matches per image pair for intensity, affine, and elastic transformations, respectively, with spatial matching errors of at most 1 mm. Further, more than 99% of the landmark pairs are within a spatial matching error of 2 mm, 4 mm, and 8 mm for image pairs with intensity, affine, and elastic transformations, respectively. To investigate the utility of our developed approach in a clinical setting, we also tested our approach on pairs of transverse slices selected from follow-up CT scans of three patients. Visual inspection of the results revealed landmark matches in both bony anatomical regions as well as in soft tissues lacking prominent intensity gradients.
Tasks Computed Tomography (CT)
Published 2020-01-21
URL https://arxiv.org/abs/2001.07434v1
PDF https://arxiv.org/pdf/2001.07434v1.pdf
PWC https://paperswithcode.com/paper/an-end-to-end-deep-learning-approach-for
Repo
Framework

Synthetic vascular structure generation for unsupervised pre-training in CTA segmentation tasks

Title Synthetic vascular structure generation for unsupervised pre-training in CTA segmentation tasks
Authors Nil Stolt Ansó
Abstract Large enough computed tomography (CT) data sets to train supervised deep models are often hard to come by. One contributing issue is the amount of manual labor that goes into creating ground truth labels, especially for volumetric data. In this research, we train a U-net architecture on a vessel segmentation task that can be used to provide insights when treating stroke patients. We create a computational model that generates synthetic vascular structures which can be blended into unlabeled CT scans of the head. This unsupervised approach to labelling is used to pre-train deep segmentation models, which are later fine-tuned on real examples to achieve an increase in accuracy compared to models trained exclusively on a hand-labeled data set.
Tasks Computed Tomography (CT)
Published 2020-01-02
URL https://arxiv.org/abs/2001.00666v1
PDF https://arxiv.org/pdf/2001.00666v1.pdf
PWC https://paperswithcode.com/paper/synthetic-vascular-structure-generation-for
Repo
Framework

Sketchformer: Transformer-based Representation for Sketched Structure

Title Sketchformer: Transformer-based Representation for Sketched Structure
Authors Leo Sampaio Ferraz Ribeiro, Tu Bui, John Collomosse, Moacir Ponti
Abstract Sketchformer is a novel transformer-based representation for encoding free-hand sketches input in a vector form, i.e. as a sequence of strokes. Sketchformer effectively addresses multiple tasks: sketch classification, sketch based image retrieval (SBIR), and the reconstruction and interpolation of sketches. We report several variants exploring continuous and tokenized input representations, and contrast their performance. Our learned embedding, driven by a dictionary learning tokenization scheme, yields state of the art performance in classification and image retrieval tasks, when compared against baseline representations driven by LSTM sequence to sequence architectures: SketchRNN and derivatives. We show that sketch reconstruction and interpolation are improved significantly by the Sketchformer embedding for complex sketches with longer stroke sequences.
Tasks Dictionary Learning, Image Retrieval, Sketch-Based Image Retrieval, Tokenization
Published 2020-02-24
URL https://arxiv.org/abs/2002.10381v1
PDF https://arxiv.org/pdf/2002.10381v1.pdf
PWC https://paperswithcode.com/paper/sketchformer-transformer-based-representation
Repo
Framework

Modeling and solving the multimodal car- and ride-sharing problem

Title Modeling and solving the multimodal car- and ride-sharing problem
Authors Miriam Enzi, Sophie N. Parragh, David Pisinger, Matthias Prandtstetter
Abstract We introduce the multimodal car- and ride-sharing problem (MMCRP), in which a pool of cars is used to cover a set of ride requests, while uncovered requests are assigned to other modes of transport (MOT). A car’s route consists of one or more trips. Each trip must have a specific but non-predetermined driver, start in a depot and finish in a (possibly different) depot. Ride-sharing between users is allowed, even when two rides do not have the same origin and/or destination. A user always has the option of using other modes of transport according to an individual list of preferences. The problem can be formulated as a vehicle scheduling problem. In order to solve the problem, an auxiliary graph is constructed in which each trip starting and ending in a depot, and covering possible ride-shares, is modeled as an edge in a time-space graph. We propose a two-layer decomposition algorithm based on column generation, where the master problem ensures that each request is covered at most once, and the pricing problem generates new promising routes by solving a kind of shortest path problem in a time-space network. Computational experiments based on realistic instances are reported. The benchmark instances are based on demographic, spatial, and economic data of Vienna, Austria. We solve large instances with the column generation based approach to near optimality in reasonable time, and we further investigate various exact and heuristic pricing schemes.
Tasks
Published 2020-01-15
URL https://arxiv.org/abs/2001.05490v1
PDF https://arxiv.org/pdf/2001.05490v1.pdf
PWC https://paperswithcode.com/paper/modeling-and-solving-the-multimodal-car-and
Repo
Framework

Structured Domain Adaptation for Unsupervised Person Re-identification

Title Structured Domain Adaptation for Unsupervised Person Re-identification
Authors Yixiao Ge, Feng Zhu, Rui Zhao, Hongsheng Li
Abstract Unsupervised domain adaptation (UDA) aims at adapting the model trained on a labeled source-domain dataset to another target-domain dataset without any annotation. The task of UDA for the open-set person re-identification (re-ID) is even more challenging as the identities (classes) have no overlap between the two domains. Existing UDA methods for person re-ID have the following limitations. 1) Pseudo-label-based methods achieve state-of-the-art performances but ignore the complex relations between two domains’ images, along with the valuable source-domain annotations. 2) Domain translation-based methods cannot achieve competitive performances as the domain translation is not properly regularized to generate informative enough training samples that well maintain inter-sample relations. To tackle the above challenges, we propose an end-to-end structured domain adaptation framework that consists of a novel structured domain-translation network and two domain-specific person image encoders. The structured domain-translation network can effectively transform the source-domain images into the target domain while well preserving the original intra- and inter-identity relations. The target-domain encoder could then be trained using both source-to-target translated images with valuable ground-truth labels and target-domain images with pseudo labels. Importantly, the domain-translation network and target-domain encoder are jointly optimized, improving each other towards the overall objective, i.e. to achieve optimal re-ID performances on the target domain. Our proposed framework outperforms state-of-the-art methods on multiple UDA tasks of person re-ID.
Tasks Domain Adaptation, Person Re-Identification, Unsupervised Domain Adaptation, Unsupervised Person Re-Identification
Published 2020-03-14
URL https://arxiv.org/abs/2003.06650v1
PDF https://arxiv.org/pdf/2003.06650v1.pdf
PWC https://paperswithcode.com/paper/structured-domain-adaptation-for-unsupervised
Repo
Framework

Elastic Consistency: A General Consistency Model for Distributed Stochastic Gradient Descent

Title Elastic Consistency: A General Consistency Model for Distributed Stochastic Gradient Descent
Authors Dan Alistarh, Bapi Chatterjee, Vyacheslav Kungurtsev
Abstract Machine learning has made tremendous progress in recent years, with models matching or even surpassing humans on a series of specialized tasks. One key element behind the progress of machine learning in recent years has been the ability to train machine learning models in large-scale distributed shared-memory and message-passing environments. Many of these models are trained employing variants of stochastic gradient descent (SGD) based optimization. In this paper, we introduce a general consistency condition covering communication-reduced and asynchronous distributed SGD implementations. Our framework, called elastic consistency, enables us to derive convergence bounds for a variety of distributed SGD methods used in practice to train large-scale machine learning models. The proposed framework de-clutters the implementation-specific convergence analysis and provides an abstraction to derive convergence bounds. We utilize the framework to analyze a sparsification scheme for distributed SGD methods in an asynchronous setting for convex and non-convex objectives. We implement the distributed SGD variant to train deep CNN models in an asynchronous shared-memory setting. Empirical results show that error-feedback may not necessarily help in improving the convergence of sparsified asynchronous distributed SGD, which corroborates an insight suggested by our convergence analysis.
Tasks
Published 2020-01-16
URL https://arxiv.org/abs/2001.05918v1
PDF https://arxiv.org/pdf/2001.05918v1.pdf
PWC https://paperswithcode.com/paper/elastic-consistency-a-general-consistency
Repo
Framework
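
To make the terms "sparsification" and "error-feedback" in the abstract above concrete, here is a minimal single-worker sketch of SGD with top-k sparsified updates and an optional error-feedback residual. It is a generic textbook-style illustration, not the asynchronous distributed scheme analyzed in the paper; the function names, the top-k rule, the toy quadratic objective, and the step size are all assumptions.

```python
def top_k(vec, k):
    """Keep the k largest-magnitude entries of vec and zero the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]

def sparsified_step(params, grad, residual, lr, k, error_feedback=True):
    """One SGD step that applies only a top-k sparsified update.

    With error_feedback=True, the dropped part of the update is stored in
    the residual and added back on the next step; otherwise it is discarded.
    Returns (new_params, new_residual) without mutating the inputs.
    """
    update = [lr * g + r for g, r in zip(grad, residual)]
    sent = top_k(update, k)
    if error_feedback:
        new_residual = [u - s for u, s in zip(update, sent)]
    else:
        new_residual = [0.0] * len(update)
    new_params = [p - s for p, s in zip(params, sent)]
    return new_params, new_residual

# Toy objective f(x) = 0.5 * ||x||^2, whose gradient at x is x itself.
params, residual = [4.0, -2.0, 1.0], [0.0, 0.0, 0.0]
for _ in range(50):
    params, residual = sparsified_step(params, params, residual, lr=0.5, k=1)
```

Switching `error_feedback` to `False` gives the variant without a residual, so the two can be compared on the same objective. This toy run says nothing about which variant is better in the asynchronous setting the paper studies.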

Manifold for Machine Learning Assurance

Title Manifold for Machine Learning Assurance
Authors Taejoon Byun, Sanjai Rayadurgam
Abstract The increasing use of machine-learning (ML) enabled systems in critical tasks fuels the quest for novel verification and validation techniques yet grounded in accepted system assurance principles. In traditional system development, model-based techniques have been widely adopted, where the central premise is that abstract models of the required system provide a sound basis for judging its implementation. We posit an analogous approach for ML systems using an ML technique that extracts from the high-dimensional training data implicitly describing the required system, a low-dimensional underlying structure, a manifold. It is then harnessed for a range of quality assurance tasks such as test adequacy measurement, test input generation, and runtime monitoring of the target ML system. The approach is built on a variational autoencoder, an unsupervised method for learning a pair of mutually near-inverse functions between a given high-dimensional dataset and a low-dimensional representation. Preliminary experiments establish that the proposed manifold-based approach, for test adequacy drives diversity in test data, for test generation yields fault-revealing yet realistic test cases, and for runtime monitoring provides an independent means to assess trustability of the target system’s output.
Tasks
Published 2020-02-08
URL https://arxiv.org/abs/2002.03147v1
PDF https://arxiv.org/pdf/2002.03147v1.pdf
PWC https://paperswithcode.com/paper/manifold-for-machine-learning-assurance
Repo
Framework

Multi-task Reinforcement Learning with a Planning Quasi-Metric

Title Multi-task Reinforcement Learning with a Planning Quasi-Metric
Authors Vincent Micheli, Karthigan Sinnathamby, François Fleuret
Abstract We introduce a new reinforcement learning approach combining a planning quasi-metric (PQM) that estimates the number of actions required to go from one state to another, with task-specific planners that compute a target state to reach a given goal. The main advantage of this decomposition is to allow the sharing across tasks of a task-agnostic model of the quasi-metric that captures the environment’s dynamics and can be learned in a dense and unsupervised manner. We demonstrate the usefulness of this approach on the standard bit-flip problem and in the MuJoCo robotic arm simulator.
Tasks
Published 2020-02-08
URL https://arxiv.org/abs/2002.03240v1
PDF https://arxiv.org/pdf/2002.03240v1.pdf
PWC https://paperswithcode.com/paper/multi-task-reinforcement-learning-with-a
Repo
Framework
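
The idea of a quasi-metric that counts actions between states can be illustrated on the bit-flip problem, where each action flips one bit and the minimum number of actions between two states is therefore their Hamming distance. The sketch below uses that exact distance as a stand-in for a learned quasi-metric and acts greedily on it. The function names are hypothetical, the goal is treated directly as the target state, and nothing here reproduces the paper's learned model or its task-specific planners.

```python
def quasi_metric(state, target):
    """Number of single-bit flips needed to turn state into target.

    Stand-in for a learned planning quasi-metric; for bit-flip this is
    simply the Hamming distance.
    """
    return sum(s != t for s, t in zip(state, target))

def flip(state, i):
    """Environment step: flip bit i and return the new state tuple."""
    return state[:i] + (1 - state[i],) + state[i + 1:]

def greedy_plan(state, target):
    """Repeatedly take the action whose successor is closest to target."""
    actions = []
    while quasi_metric(state, target) > 0:
        best = min(
            range(len(state)),
            key=lambda i: quasi_metric(flip(state, i), target),
        )
        state = flip(state, best)
        actions.append(best)
    return actions

start, goal = (0, 1, 1, 0), (1, 1, 0, 0)
plan = greedy_plan(start, goal)  # flips bit 0, then bit 2
```

Because the distance function depends only on the environment's dynamics and not on any particular goal, the same function could be reused across tasks, which is the sharing the abstract describes.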

Extreme Multi-label Classification from Aggregated Labels

Title Extreme Multi-label Classification from Aggregated Labels
Authors Yanyao Shen, Hsiang-fu Yu, Sujay Sanghavi, Inderjit Dhillon
Abstract Extreme multi-label classification (XMC) is the problem of finding the relevant labels for an input, from a very large universe of possible labels. We consider XMC in the setting where labels are available only for groups of samples - but not for individual ones. Current XMC approaches are not built for such multi-instance multi-label (MIML) training data, and MIML approaches do not scale to XMC sizes. We develop a new and scalable algorithm to impute individual-sample labels from the group labels; this can be paired with any existing XMC method to solve the aggregated label problem. We characterize the statistical properties of our algorithm under mild assumptions, and provide a new end-to-end framework for MIML as an extension. Experiments on both aggregated label XMC and MIML tasks show the advantages over existing approaches.
Tasks Extreme Multi-Label Classification, Multi-Label Classification
Published 2020-04-01
URL https://arxiv.org/abs/2004.00198v1
PDF https://arxiv.org/pdf/2004.00198v1.pdf
PWC https://paperswithcode.com/paper/extreme-multi-label-classification-from
Repo
Framework

Beyond Clicks: Modeling Multi-Relational Item Graph for Session-Based Target Behavior Prediction

Title Beyond Clicks: Modeling Multi-Relational Item Graph for Session-Based Target Behavior Prediction
Authors Wen Wang, Wei Zhang, Shukai Liu, Qi Liu, Bo Zhang, Leyu Lin, Hongyuan Zha
Abstract Session-based target behavior prediction aims to predict the next item to be interacted with specific behavior types (e.g., clicking). Although existing methods for session-based behavior prediction leverage powerful representation learning approaches to encode items’ sequential relevance in a low-dimensional space, they suffer from several limitations. Firstly, they focus on only utilizing the same type of user behavior for prediction, but ignore the potential of taking other behavior data as auxiliary information. This is particularly crucial when the target behavior is sparse but important (e.g., buying or sharing an item). Secondly, item-to-item relations are modeled separately and locally in one behavior sequence, and they lack a principled way to globally encode these relations more effectively. To overcome these limitations, we propose a novel Multi-relational Graph Neural Network model for Session-based target behavior Prediction, MGNN-SPred for short. Specifically, we build a Multi-Relational Item Graph (MRIG) based on all behavior sequences from all sessions, involving target and auxiliary behavior types. Based on MRIG, MGNN-SPred learns global item-to-item relations and further obtains user preferences w.r.t. current target and auxiliary behavior sequences, respectively. In the end, MGNN-SPred leverages a gating mechanism to adaptively fuse user representations for predicting the next item interacted with under the target behavior. Extensive experiments on two real-world datasets demonstrate the superiority of MGNN-SPred over state-of-the-art session-based prediction methods, validating the benefits of leveraging auxiliary behavior and learning item-to-item relations over MRIG.
Tasks Representation Learning
Published 2020-02-19
URL https://arxiv.org/abs/2002.07993v1
PDF https://arxiv.org/pdf/2002.07993v1.pdf
PWC https://paperswithcode.com/paper/beyond-clicks-modeling-multi-relational-item
Repo
Framework
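
One plausible way to picture the Multi-Relational Item Graph mentioned in the abstract above is as a directed item graph with one edge type per behavior, built by linking consecutive items in every behavior sequence of every session. The sketch below does exactly that. The edge definition (consecutive items only), the edge counts, the `build_mrig` name, and the toy sessions are assumptions for illustration and may differ from the paper's construction.

```python
from collections import defaultdict

def build_mrig(sessions):
    """Build a typed, weighted, directed item graph.

    `sessions` is a list of dicts mapping a behavior name to the ordered
    list of item ids on which that behavior occurred in one session.
    Returns {behavior: {(src, dst): count}}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for behavior, items in session.items():
            # Link each item to the next one in the same behavior sequence.
            for src, dst in zip(items, items[1:]):
                graph[behavior][(src, dst)] += 1
    return {b: dict(edges) for b, edges in graph.items()}

# Toy data: "buy" plays the sparse target behavior, "click" the auxiliary one.
sessions = [
    {"click": ["a", "b", "c"], "buy": ["b", "c"]},
    {"click": ["a", "b"], "buy": ["b"]},
]
mrig = build_mrig(sessions)
```

Aggregating over all sessions is what makes the relations global: the `click` edge from `a` to `b` accumulates evidence from both toy sessions rather than from a single sequence.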

MRI Super-Resolution with GAN and 3D Multi-Level DenseNet: Smaller, Faster, and Better

Title MRI Super-Resolution with GAN and 3D Multi-Level DenseNet: Smaller, Faster, and Better
Authors Yuhua Chen, Anthony G. Christodoulou, Zhengwei Zhou, Feng Shi, Yibin Xie, Debiao Li
Abstract High-resolution (HR) magnetic resonance imaging (MRI) provides detailed anatomical information that is critical for diagnosis in the clinical application. However, HR MRI typically comes at the cost of long scan time, small spatial coverage, and low signal-to-noise ratio (SNR). Recent studies showed that with a deep convolutional neural network (CNN), HR generic images could be recovered from low-resolution (LR) inputs via single image super-resolution (SISR) approaches. Additionally, previous works have shown that a deep 3D CNN can generate high-quality SR MRIs by using learned image priors. However, 3D CNNs with deep structures have a large number of parameters and are computationally expensive. In this paper, we propose a novel 3D CNN architecture, namely a multi-level densely connected super-resolution network (mDCSRN), which is light-weight, fast and accurate. We also show that with the generative adversarial network (GAN)-guided training, the mDCSRN-GAN provides appealing sharp SR images with rich texture details that are highly comparable with the referenced HR images. Our results from experiments on a large public dataset with 1,113 subjects showed that this new architecture outperformed other popular deep learning methods in recovering 4x resolution-downgraded images in both quality and speed.
Tasks Image Super-Resolution, Super-Resolution
Published 2020-03-02
URL https://arxiv.org/abs/2003.01217v2
PDF https://arxiv.org/pdf/2003.01217v2.pdf
PWC https://paperswithcode.com/paper/mri-super-resolution-with-gan-and-3d-multi
Repo
Framework