April 3, 2020


Paper Group ANR 62


Performance of Statistical and Machine Learning Techniques for Physical Layer Authentication


Title	Performance of Statistical and Machine Learning Techniques for Physical Layer Authentication
Authors	Linda Senigagliesi, Marco Baldi, Ennio Gambi
Abstract	In this paper we consider authentication at the physical layer, in which the authenticator aims at distinguishing a legitimate supplicant from an attacker on the basis of the characteristics of the communication channel. Authentication is performed over a set of parallel wireless channels affected by time-varying fading in the presence of a malicious attacker, whose channel has a spatial correlation with the supplicant’s one. We first propose the use of two different statistical decision methods, and we prove that using a large number of references (in the form of channel estimates) affected by different levels of time-varying fading is not beneficial from a security point of view. We then propose to exploit classification methods based on machine learning. In order to face the worst case of an authenticator provided with no forged messages during training, we consider one-class classifiers. When instead the training set includes some forged messages, we resort to more conventional binary classifiers, considering the cases in which such messages are either labelled or not. For the latter case, we exploit clustering algorithms to label the training set. The performance of both nearest neighbor (NN) and support vector machine (SVM) classification techniques is assessed. Through numerical examples, we show that under the same probability of false alarm, one-class classification (OCC) algorithms achieve the lowest probability of missed detection when a small spatial correlation exists between the main channel and the adversary one, while statistical methods are advantageous when the spatial correlation between the two channels is large.
Tasks
Published	2020-01-17
URL	https://arxiv.org/abs/2001.06238v1
PDF	https://arxiv.org/pdf/2001.06238v1.pdf
PWC	https://paperswithcode.com/paper/performance-of-statistical-and-machine
Repo
Framework

ProxEmo: Gait-based Emotion Learning and Multi-view Proxemic Fusion for Socially-Aware Robot Navigation


Title	ProxEmo: Gait-based Emotion Learning and Multi-view Proxemic Fusion for Socially-Aware Robot Navigation
Authors	Venkatraman Narayanan, Bala Murali Manoghar, Vishnu Sashank Dorbala, Dinesh Manocha, Aniket Bera
Abstract	We present ProxEmo, a novel end-to-end emotion prediction algorithm for socially aware robot navigation among pedestrians. Our approach predicts the perceived emotions of a pedestrian from walking gaits, which is then used for emotion-guided navigation taking into account social and proxemic constraints. To classify emotions, we propose a multi-view skeleton graph convolution-based model that works on a commodity camera mounted onto a moving robot. Our emotion recognition is integrated into a mapless navigation scheme and makes no assumptions about the environment of pedestrian motion. It achieves a mean average emotion prediction precision of 82.47% on the Emotion-Gait benchmark dataset. We outperform current state-of-art algorithms for emotion recognition from 3D gaits. We highlight its benefits in terms of navigation in indoor scenes using a Clearpath Jackal robot.
Tasks	Emotion Recognition, Robot Navigation
Published	2020-03-02
URL	https://arxiv.org/abs/2003.01062v1
PDF	https://arxiv.org/pdf/2003.01062v1.pdf
PWC	https://paperswithcode.com/paper/proxemo-gait-based-emotion-learning-and-multi
Repo
Framework

A novel tree-structured point cloud dataset for skeletonization algorithm evaluation


Title	A novel tree-structured point cloud dataset for skeletonization algorithm evaluation
Authors	Yan Lin, Ji Liu, Jianlin Zhou
Abstract	Curve skeleton extraction from unorganized point clouds is a fundamental task in computer vision and three-dimensional data preprocessing and visualization. A great amount of work has been done to extract skeletons from point clouds, but the lack of standard point cloud datasets with ground truth skeletons makes it difficult to evaluate these algorithms. In this paper, we construct a brand new tree-structured point cloud dataset, including ground truth skeletons and point cloud models. In addition, four types of point cloud are built on the clean point cloud: point clouds with noise, point clouds with missing data, point clouds with different density, and point clouds with uneven density distribution. We first use a tree editor to build the tree skeleton and the corresponding mesh model. Since the implicit surface is sufficiently expressive to retain the edges and details of the complex branch model, we use the implicit surface to model the triangular mesh. With the implicit surface, a virtual scanner is applied to sample the point cloud. Finally, considering the challenges in skeleton extraction, we introduce different methods to build four different types of point cloud models. This dataset can be used as a standard dataset for skeleton extraction algorithms, and skeleton extraction algorithms can be evaluated by comparing the ground truth skeleton with the extracted skeleton.
Tasks
Published	2020-01-09
URL	https://arxiv.org/abs/2001.02823v1
PDF	https://arxiv.org/pdf/2001.02823v1.pdf
PWC	https://paperswithcode.com/paper/a-novel-tree-structured-point-cloud-dataset
Repo
Framework

Deep Learning Stereo Vision at the edge


Title	Deep Learning Stereo Vision at the edge
Authors	Luca Puglia, Cormac Brick
Abstract	We present an overview of the methodology used to build a new stereo vision solution that is suitable for System on Chip. This new solution was developed to bring computer vision capability to embedded devices that live in a power constrained environment. The solution is constructed as a hybrid between classical Stereo Vision techniques and deep learning approaches. The stereoscopic module is composed of two separate modules: one that accelerates the neural network we trained and one that accelerates the front-end part. The system is completely passive and does not require any structured light to obtain very compelling accuracy. With respect to the previous Stereo Vision solutions offered by the industry, we offer a major improvement in robustness to noise. This is mainly possible due to the deep learning part of the chosen architecture. We submitted our result to the Middlebury dataset challenge. It currently ranks as the best System on Chip solution. The system has been developed for low latency applications which require better than real time performance on high definition videos.
Tasks
Published	2020-01-13
URL	https://arxiv.org/abs/2001.04552v1
PDF	https://arxiv.org/pdf/2001.04552v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-stereo-vision-at-the-edge
Repo
Framework

An End-to-end Deep Learning Approach for Landmark Detection and Matching in Medical Images


Title	An End-to-end Deep Learning Approach for Landmark Detection and Matching in Medical Images
Authors	Monika Grewal, Timo M. Deist, Jan Wiersma, Peter A. N. Bosman, Tanja Alderliesten
Abstract	Anatomical landmark correspondences in medical images can provide additional guidance information for the alignment of two images, which, in turn, is crucial for many medical applications. However, manual landmark annotation is labor-intensive. Therefore, we propose an end-to-end deep learning approach to automatically detect landmark correspondences in pairs of two-dimensional (2D) images. Our approach consists of a Siamese neural network, which is trained to identify salient locations in images as landmarks and predict matching probabilities for landmark pairs from two different images. We trained our approach on 2D transverse slices from 168 lower abdominal Computed Tomography (CT) scans. We tested the approach on 22,206 pairs of 2D slices with varying levels of intensity, affine, and elastic transformations. The proposed approach finds an average of 639, 466, and 370 landmark matches per image pair for intensity, affine, and elastic transformations, respectively, with spatial matching errors of at most 1 mm. Further, more than 99% of the landmark pairs are within a spatial matching error of 2 mm, 4 mm, and 8 mm for image pairs with intensity, affine, and elastic transformations, respectively. To investigate the utility of our developed approach in a clinical setting, we also tested our approach on pairs of transverse slices selected from follow-up CT scans of three patients. Visual inspection of the results revealed landmark matches in both bony anatomical regions as well as in soft tissues lacking prominent intensity gradients.
Tasks	Computed Tomography (CT)
Published	2020-01-21
URL	https://arxiv.org/abs/2001.07434v1
PDF	https://arxiv.org/pdf/2001.07434v1.pdf
PWC	https://paperswithcode.com/paper/an-end-to-end-deep-learning-approach-for
Repo
Framework
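
The abstract above reports results in terms of spatial matching error, for example the share of landmark pairs that fall within 2 mm. As a rough illustration of how such a figure can be computed, here is a minimal sketch. It is not the authors' evaluation code: the function names, the isotropic `spacing_mm` pixel spacing, and the sample coordinates are all assumptions made for this example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def matching_errors_mm(predicted: List[Point],
                       expected: List[Point],
                       spacing_mm: float) -> List[float]:
    """Euclidean distance in mm between each predicted match and its expected location.

    Assumes pixel coordinates and an isotropic pixel spacing (hypothetical setup).
    """
    return [
        math.hypot((px - ex) * spacing_mm, (py - ey) * spacing_mm)
        for (px, py), (ex, ey) in zip(predicted, expected)
    ]


def fraction_within(errors: List[float], threshold_mm: float) -> float:
    """Share of matches whose error does not exceed the threshold."""
    if not errors:
        return 0.0
    return sum(e <= threshold_mm for e in errors) / len(errors)


# Made-up coordinates, purely for illustration.
predicted = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0), (40.0, 40.0)]
expected = [(10.0, 10.0), (20.0, 21.0), (33.0, 34.0), (40.0, 40.0)]
errors = matching_errors_mm(predicted, expected, spacing_mm=1.0)
share = fraction_within(errors, threshold_mm=2.0)
```

With a known synthetic transformation, the expected location of each landmark in the second image can be computed directly, which is what makes this kind of error measurable without manual annotation.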

Synthetic vascular structure generation for unsupervised pre-training in CTA segmentation tasks


Title	Synthetic vascular structure generation for unsupervised pre-training in CTA segmentation tasks
Authors	Nil Stolt Ansó
Abstract	Large enough computed tomography (CT) data sets to train supervised deep models are often hard to come by. One contributing issue is the amount of manual labor that goes into creating ground truth labels, especially for volumetric data. In this research, we train a U-net architecture at a vessel segmentation task that can be used to provide insights when treating stroke patients. We create a computational model that generates synthetic vascular structures which can be blended into unlabeled CT scans of the head. This unsupervised approach to labelling is used to pre-train deep segmentation models, which are later fine-tuned on real examples to achieve an increase in accuracy compared to models trained exclusively on a hand-labeled data set.
Tasks	Computed Tomography (CT)
Published	2020-01-02
URL	https://arxiv.org/abs/2001.00666v1
PDF	https://arxiv.org/pdf/2001.00666v1.pdf
PWC	https://paperswithcode.com/paper/synthetic-vascular-structure-generation-for
Repo
Framework

Sketchformer: Transformer-based Representation for Sketched Structure


Title	Sketchformer: Transformer-based Representation for Sketched Structure
Authors	Leo Sampaio Ferraz Ribeiro, Tu Bui, John Collomosse, Moacir Ponti
Abstract	Sketchformer is a novel transformer-based representation for encoding free-hand sketches input in a vector form, i.e. as a sequence of strokes. Sketchformer effectively addresses multiple tasks: sketch classification, sketch based image retrieval (SBIR), and the reconstruction and interpolation of sketches. We report several variants exploring continuous and tokenized input representations, and contrast their performance. Our learned embedding, driven by a dictionary learning tokenization scheme, yields state of the art performance in classification and image retrieval tasks, when compared against baseline representations driven by LSTM sequence to sequence architectures: SketchRNN and derivatives. We show that sketch reconstruction and interpolation are improved significantly by the Sketchformer embedding for complex sketches with longer stroke sequences.
Tasks	Dictionary Learning, Image Retrieval, Sketch-Based Image Retrieval, Tokenization
Published	2020-02-24
URL	https://arxiv.org/abs/2002.10381v1
PDF	https://arxiv.org/pdf/2002.10381v1.pdf
PWC	https://paperswithcode.com/paper/sketchformer-transformer-based-representation
Repo
Framework

Modeling and solving the multimodal car- and ride-sharing problem


Title	Modeling and solving the multimodal car- and ride-sharing problem
Authors	Miriam Enzi, Sophie N. Parragh, David Pisinger, Matthias Prandtstetter
Abstract	We introduce the multimodal car- and ride-sharing problem (MMCRP), in which a pool of cars is used to cover a set of ride requests, while uncovered requests are assigned to other modes of transport (MOT). A car’s route consists of one or more trips. Each trip must have a specific but non-predetermined driver, start in a depot and finish in a (possibly different) depot. Ride-sharing between users is allowed, even when two rides do not have the same origin and/or destination. A user always has the option of using other modes of transport according to an individual list of preferences. The problem can be formulated as a vehicle scheduling problem. In order to solve the problem, an auxiliary graph is constructed in which each trip starting and ending in a depot, and covering possible ride-shares, is modeled as an edge in a time-space graph. We propose a two-layer decomposition algorithm based on column generation, where the master problem ensures that each request can only be covered at most once, and the pricing problem generates new promising routes by solving a kind of shortest path problem in a time-space network. Computational experiments based on realistic instances are reported. The benchmark instances are based on demographic, spatial, and economic data of Vienna, Austria. We solve large instances with the column generation based approach to near optimality in reasonable time, and we further investigate various exact and heuristic pricing schemes.
Tasks
Published	2020-01-15
URL	https://arxiv.org/abs/2001.05490v1
PDF	https://arxiv.org/pdf/2001.05490v1.pdf
PWC	https://paperswithcode.com/paper/modeling-and-solving-the-multimodal-car-and
Repo
Framework

Structured Domain Adaptation for Unsupervised Person Re-identification


Title	Structured Domain Adaptation for Unsupervised Person Re-identification
Authors	Yixiao Ge, Feng Zhu, Rui Zhao, Hongsheng Li
Abstract	Unsupervised domain adaptation (UDA) aims at adapting the model trained on a labeled source-domain dataset to another target-domain dataset without any annotation. The task of UDA for the open-set person re-identification (re-ID) is even more challenging as the identities (classes) have no overlap between the two domains. Existing UDA methods for person re-ID have the following limitations. 1) Pseudo-label-based methods achieve state-of-the-art performances but ignore the complex relations between two domains’ images, along with the valuable source-domain annotations. 2) Domain translation-based methods cannot achieve competitive performances as the domain translation is not properly regularized to generate informative enough training samples that well maintain inter-sample relations. To tackle the above challenges, we propose an end-to-end structured domain adaptation framework that consists of a novel structured domain-translation network and two domain-specific person image encoders. The structured domain-translation network can effectively transform the source-domain images into the target domain while well preserving the original intra- and inter-identity relations. The target-domain encoder could then be trained using both source-to-target translated images with valuable ground-truth labels and target-domain images with pseudo labels. Importantly, the domain-translation network and target-domain encoder are jointly optimized, improving each other towards the overall objective, i.e. to achieve optimal re-ID performances on the target domain. Our proposed framework outperforms state-of-the-art methods on multiple UDA tasks of person re-ID.
Tasks	Domain Adaptation, Person Re-Identification, Unsupervised Domain Adaptation, Unsupervised Person Re-Identification
Published	2020-03-14
URL	https://arxiv.org/abs/2003.06650v1
PDF	https://arxiv.org/pdf/2003.06650v1.pdf
PWC	https://paperswithcode.com/paper/structured-domain-adaptation-for-unsupervised
Repo
Framework

Elastic Consistency: A General Consistency Model for Distributed Stochastic Gradient Descent


Title	Elastic Consistency: A General Consistency Model for Distributed Stochastic Gradient Descent
Authors	Dan Alistarh, Bapi Chatterjee, Vyacheslav Kungurtsev
Abstract	Machine learning has made tremendous progress in recent years, with models matching or even surpassing humans on a series of specialized tasks. One key element behind the progress of machine learning in recent years has been the ability to train machine learning models in large-scale distributed shared-memory and message-passing environments. Many of these models are trained employing variants of stochastic gradient descent (SGD) based optimization. In this paper, we introduce a general consistency condition covering communication-reduced and asynchronous distributed SGD implementations. Our framework, called elastic consistency, enables us to derive convergence bounds for a variety of distributed SGD methods used in practice to train large-scale machine learning models. The proposed framework de-clutters the implementation-specific convergence analysis and provides an abstraction to derive convergence bounds. We utilize the framework to analyze a sparsification scheme for distributed SGD methods in an asynchronous setting for convex and non-convex objectives. We implement the distributed SGD variant to train deep CNN models in an asynchronous shared-memory setting. Empirical results show that error-feedback may not necessarily help in improving the convergence of sparsified asynchronous distributed SGD, which corroborates an insight suggested by our convergence analysis.
Tasks
Published	2020-01-16
URL	https://arxiv.org/abs/2001.05918v1
PDF	https://arxiv.org/pdf/2001.05918v1.pdf
PWC	https://paperswithcode.com/paper/elastic-consistency-a-general-consistency
Repo
Framework
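
For readers unfamiliar with the terms used in the abstract above, the following is a minimal single-worker sketch of gradient sparsification with error feedback. It assumes generic top-k sparsification, which may differ from the scheme analyzed in the paper; the function names and the toy gradients are hypothetical.

```python
from typing import List, Tuple


def top_k_sparsify(grad: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries of grad and zero out the rest."""
    if k <= 0:
        return [0.0] * len(grad)
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


def sparsify_with_error_feedback(grad: List[float],
                                 residual: List[float],
                                 k: int) -> Tuple[List[float], List[float]]:
    """Add the carried-over residual, send the top-k entries, keep the rest as the new residual."""
    corrected = [g + r for g, r in zip(grad, residual)]
    sent = top_k_sparsify(corrected, k)
    new_residual = [c - s for c, s in zip(corrected, sent)]
    return sent, new_residual


# Toy gradients, purely for illustration.
residual = [0.0, 0.0, 0.0, 0.0]
sent1, residual = sparsify_with_error_feedback([0.5, -2.0, 0.25, 1.0], residual, k=2)
sent2, residual = sparsify_with_error_feedback([0.75, 0.0, 0.0, 0.0], residual, k=1)
```

In the second step the entry dropped earlier (0.5) is added back before sparsifying, so it is eventually transmitted. Whether this correction helps convergence in the asynchronous setting is exactly what the abstract calls into question.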

Manifold for Machine Learning Assurance


Title	Manifold for Machine Learning Assurance
Authors	Taejoon Byun, Sanjai Rayadurgam
Abstract	The increasing use of machine-learning (ML) enabled systems in critical tasks fuels the quest for novel verification and validation techniques yet grounded in accepted system assurance principles. In traditional system development, model-based techniques have been widely adopted, where the central premise is that abstract models of the required system provide a sound basis for judging its implementation. We posit an analogous approach for ML systems using an ML technique that extracts from the high-dimensional training data implicitly describing the required system, a low-dimensional underlying structure–a manifold. It is then harnessed for a range of quality assurance tasks such as test adequacy measurement, test input generation, and runtime monitoring of the target ML system. The approach is built on variational autoencoder, an unsupervised method for learning a pair of mutually near-inverse functions between a given high-dimensional dataset and a low-dimensional representation. Preliminary experiments establish that the proposed manifold-based approach, for test adequacy drives diversity in test data, for test generation yields fault-revealing yet realistic test cases, and for runtime monitoring provides an independent means to assess trustability of the target system’s output.
Tasks
Published	2020-02-08
URL	https://arxiv.org/abs/2002.03147v1
PDF	https://arxiv.org/pdf/2002.03147v1.pdf
PWC	https://paperswithcode.com/paper/manifold-for-machine-learning-assurance
Repo
Framework

Multi-task Reinforcement Learning with a Planning Quasi-Metric


Title	Multi-task Reinforcement Learning with a Planning Quasi-Metric
Authors	Vincent Micheli, Karthigan Sinnathamby, François Fleuret
Abstract	We introduce a new reinforcement learning approach combining a planning quasi-metric (PQM) that estimates the number of actions required to go from a state to another, with task-specific planners that compute a target state to reach a given goal. The main advantage of this decomposition is to allow the sharing across tasks of a task-agnostic model of the quasi-metric that captures the environment’s dynamics and can be learned in a dense and unsupervised manner. We demonstrate the usefulness of this approach on the standard bit-flip problem and in the MuJoCo robotic arm simulator.
Tasks
Published	2020-02-08
URL	https://arxiv.org/abs/2002.03240v1
PDF	https://arxiv.org/pdf/2002.03240v1.pdf
PWC	https://paperswithcode.com/paper/multi-task-reinforcement-learning-with-a
Repo
Framework
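
To make the quantity that the quasi-metric estimates concrete, here is a small sketch for a bit-flip environment. It assumes the common formulation in which each action flips exactly one bit, so the true number of actions between two states is their Hamming distance. This is ground truth computed by search, not the learned PQM from the paper, and all names are hypothetical.

```python
from collections import deque
from typing import Tuple

State = Tuple[int, ...]


def flip(state: State, i: int) -> State:
    """Return a copy of state with bit i flipped (one action in the assumed environment)."""
    return state[:i] + (1 - state[i],) + state[i + 1:]


def steps_by_search(start: State, goal: State) -> int:
    """Minimum number of actions from start to goal, found by breadth-first search."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, depth = queue.popleft()
        if state == goal:
            return depth
        for i in range(len(state)):
            nxt = flip(state, i)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    raise ValueError("goal is not reachable from start")


def hamming(a: State, b: State) -> int:
    """Closed-form answer for the one-bit-per-action bit-flip problem."""
    return sum(x != y for x, y in zip(a, b))


start, goal = (0, 0, 1, 0), (1, 0, 0, 1)
searched = steps_by_search(start, goal)
closed_form = hamming(start, goal)
```

In this toy environment the distance happens to be symmetric. The term quasi-metric matters in environments where going from one state to another can take a different number of actions than going back.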

Extreme Multi-label Classification from Aggregated Labels


Title	Extreme Multi-label Classification from Aggregated Labels
Authors	Yanyao Shen, Hsiang-fu Yu, Sujay Sanghavi, Inderjit Dhillon
Abstract	Extreme multi-label classification (XMC) is the problem of finding the relevant labels for an input, from a very large universe of possible labels. We consider XMC in the setting where labels are available only for groups of samples - but not for individual ones. Current XMC approaches are not built for such multi-instance multi-label (MIML) training data, and MIML approaches do not scale to XMC sizes. We develop a new and scalable algorithm to impute individual-sample labels from the group labels; this can be paired with any existing XMC method to solve the aggregated label problem. We characterize the statistical properties of our algorithm under mild assumptions, and provide a new end-to-end framework for MIML as an extension. Experiments on both aggregated label XMC and MIML tasks show the advantages over existing approaches.
Tasks	Extreme Multi-Label Classification, Multi-Label Classification
Published	2020-04-01
URL	https://arxiv.org/abs/2004.00198v1
PDF	https://arxiv.org/pdf/2004.00198v1.pdf
PWC	https://paperswithcode.com/paper/extreme-multi-label-classification-from
Repo
Framework

Beyond Clicks: Modeling Multi-Relational Item Graph for Session-Based Target Behavior Prediction


Title	Beyond Clicks: Modeling Multi-Relational Item Graph for Session-Based Target Behavior Prediction
Authors	Wen Wang, Wei Zhang, Shukai Liu, Qi Liu, Bo Zhang, Leyu Lin, Hongyuan Zha
Abstract	Session-based target behavior prediction aims to predict the next item to be interacted with under a specific behavior type (e.g., clicking). Although existing methods for session-based behavior prediction leverage powerful representation learning approaches to encode items’ sequential relevance in a low-dimensional space, they suffer from several limitations. Firstly, they focus on only utilizing the same type of user behavior for prediction, but ignore the potential of taking other behavior data as auxiliary information. This is particularly crucial when the target behavior is sparse but important (e.g., buying or sharing an item). Secondly, item-to-item relations are modeled separately and locally in one behavior sequence, and they lack a principled way to globally encode these relations more effectively. To overcome these limitations, we propose a novel Multi-relational Graph Neural Network model for Session-based target behavior Prediction, MGNN-SPred for short. Specifically, we build a Multi-Relational Item Graph (MRIG) based on all behavior sequences from all sessions, involving target and auxiliary behavior types. Based on MRIG, MGNN-SPred learns global item-to-item relations and further obtains user preferences w.r.t. current target and auxiliary behavior sequences, respectively. In the end, MGNN-SPred leverages a gating mechanism to adaptively fuse user representations for predicting the next item interacted with under the target behavior. The extensive experiments on two real-world datasets demonstrate the superiority of MGNN-SPred by comparing with state-of-the-art session-based prediction methods, validating the benefits of leveraging auxiliary behavior and learning item-to-item relations over MRIG.
Tasks	Representation Learning
Published	2020-02-19
URL	https://arxiv.org/abs/2002.07993v1
PDF	https://arxiv.org/pdf/2002.07993v1.pdf
PWC	https://paperswithcode.com/paper/beyond-clicks-modeling-multi-relational-item
Repo
Framework

MRI Super-Resolution with GAN and 3D Multi-Level DenseNet: Smaller, Faster, and Better


Title	MRI Super-Resolution with GAN and 3D Multi-Level DenseNet: Smaller, Faster, and Better
Authors	Yuhua Chen, Anthony G. Christodoulou, Zhengwei Zhou, Feng Shi, Yibin Xie, Debiao Li
Abstract	High-resolution (HR) magnetic resonance imaging (MRI) provides detailed anatomical information that is critical for diagnosis in the clinical application. However, HR MRI typically comes at the cost of long scan time, small spatial coverage, and low signal-to-noise ratio (SNR). Recent studies showed that with a deep convolutional neural network (CNN), HR generic images could be recovered from low-resolution (LR) inputs via single image super-resolution (SISR) approaches. Additionally, previous works have shown that a deep 3D CNN can generate high-quality SR MRIs by using learned image priors. However, 3D CNNs with deep structures have a large number of parameters and are computationally expensive. In this paper, we propose a novel 3D CNN architecture, namely a multi-level densely connected super-resolution network (mDCSRN), which is light-weight, fast and accurate. We also show that with the generative adversarial network (GAN)-guided training, the mDCSRN-GAN provides appealing sharp SR images with rich texture details that are highly comparable with the referenced HR images. Our results from experiments on a large public dataset with 1,113 subjects showed that this new architecture outperformed other popular deep learning methods in recovering 4x resolution-downgraded images in both quality and speed.
Tasks	Image Super-Resolution, Super-Resolution
Published	2020-03-02
URL	https://arxiv.org/abs/2003.01217v2
PDF	https://arxiv.org/pdf/2003.01217v2.pdf
PWC	https://paperswithcode.com/paper/mri-super-resolution-with-gan-and-3d-multi
Repo
Framework