October 19, 2019

Paper Group ANR 339

X2Face: A network for controlling face generation by using images, audio, and pose codes


Title	X2Face: A network for controlling face generation by using images, audio, and pose codes
Authors	Olivia Wiles, A. Sophia Koepke, Andrew Zisserman
Abstract	The objective of this paper is a neural network model that controls the pose and expression of a given face, using another face or modality (e.g. audio). This model can then be used for lightweight, sophisticated video and image editing. We make the following three contributions. First, we introduce a network, X2Face, that can control a source face (specified by one or more frames) using another face in a driving frame to produce a generated frame with the identity of the source frame but the pose and expression of the face in the driving frame. Second, we propose a method for training the network fully self-supervised using a large collection of video data. Third, we show that the generation process can be driven by other modalities, such as audio or pose codes, without any further training of the network. The generation results for driving a face with another face are compared to state-of-the-art self-supervised/supervised methods. We show that our approach is more robust than other methods, as it makes fewer assumptions about the input data. We also show examples of using our framework for video face editing.
Tasks	Face Generation
Published	2018-07-27
URL	http://arxiv.org/abs/1807.10550v1
PDF	http://arxiv.org/pdf/1807.10550v1.pdf
PWC	https://paperswithcode.com/paper/x2face-a-network-for-controlling-face
Repo
Framework

PointSeg: Real-Time Semantic Segmentation Based on 3D LiDAR Point Cloud


Title	PointSeg: Real-Time Semantic Segmentation Based on 3D LiDAR Point Cloud
Authors	Yuan Wang, Tianyue Shi, Peng Yun, Lei Tai, Ming Liu
Abstract	In this paper, we propose PointSeg, a real-time end-to-end semantic segmentation method for road objects based on spherical images. We take the spherical image, which is transformed from the 3D LiDAR point clouds, as input to the convolutional neural networks (CNNs) to predict the point-wise semantic map. To make PointSeg applicable on a mobile system, we build the model on the lightweight network SqueezeNet, with several improvements. It maintains a good balance between memory cost and prediction performance. Our model is trained on spherical images and label masks projected from the KITTI 3D object detection dataset. Experiments show that PointSeg can achieve competitive accuracy at 90 fps on a single 1080 Ti GPU, which makes it well suited to autonomous driving applications.
Tasks	3D Object Detection, Autonomous Driving, Object Detection, Real-Time Semantic Segmentation, Semantic Segmentation
Published	2018-07-17
URL	http://arxiv.org/abs/1807.06288v8
PDF	http://arxiv.org/pdf/1807.06288v8.pdf
PWC	https://paperswithcode.com/paper/pointseg-real-time-semantic-segmentation
Repo
Framework
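
The PointSeg abstract above describes transforming 3D LiDAR point clouds into a spherical image before feeding a CNN. A minimal Python sketch of such a projection follows. The 64 x 512 grid, the 90-degree frontal azimuth window, the vertical field of view of +2 to -24.9 degrees, and the function name `spherical_index` are all illustrative assumptions, not values taken from the abstract.

```python
import math


def spherical_index(x, y, z, rows=64, cols=512,
                    fov_up_deg=2.0, fov_down_deg=-24.9,
                    azimuth_span_deg=90.0):
    """Map one LiDAR point to a (row, col) cell of a spherical image.

    Returns None when the point lies outside the assumed field of view.
    All grid and field-of-view values are illustrative assumptions.
    """
    depth = math.sqrt(x * x + y * y + z * z)
    if depth == 0.0:
        return None

    # Azimuth: 0 degrees means straight ahead along +x.
    azimuth = math.degrees(math.atan2(y, x))
    # Elevation: 0 degrees means horizontal; clamp guards against rounding.
    ratio = max(-1.0, min(1.0, z / depth))
    elevation = math.degrees(math.asin(ratio))

    half_span = azimuth_span_deg / 2.0
    if not (-half_span <= azimuth < half_span):
        return None
    if not (fov_down_deg <= elevation <= fov_up_deg):
        return None

    # Positive azimuth (left of the sensor) maps to the left image columns.
    col = int((half_span - azimuth) / azimuth_span_deg * cols)
    # The top row corresponds to the highest elevation.
    row = int((fov_up_deg - elevation) / (fov_up_deg - fov_down_deg) * rows)
    return min(row, rows - 1), min(col, cols - 1)
```

Each cell of the resulting grid would then typically store per-point channels such as depth or intensity; which channels PointSeg uses is not stated in the abstract.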

A Comprehensive Study on the Applications of Machine Learning for the Medical Diagnosis and Prognosis of Asthma


Title	A Comprehensive Study on the Applications of Machine Learning for the Medical Diagnosis and Prognosis of Asthma
Authors	Saksham Kukreja
Abstract	An estimated 300 million people worldwide suffer from asthma, and this number is expected to increase to 400 million by 2025. Approximately 250,000 people die prematurely each year from asthma, and almost all of these deaths are avoidable. Most of these deaths occur because the patients are unaware of their asthmatic morbidity. If detected early, the asthmatic mortality rate can be reduced by 78%, provided that the patients carry appropriate medication and/or are in close vicinity to medical equipment like nebulizers. This study focuses on the development and evaluation of algorithms to diagnose asthma through symptom-intensive questionnaires, clinical data and medical reports. Machine learning algorithms such as the Back-propagation model, the Context Sensitive Auto-Associative Memory Neural Network Model, the C4.5 Algorithm, Bayesian Networks and Particle Swarm Optimization have been employed for the diagnosis of asthma, and a comparison is then made between their respective prospects. All algorithms achieved an accuracy of over 80%. However, the use of the Auto Associative Memory Model (on a layered Artificial Neural Network) displayed much better results. It reached an accuracy of over 90% and an inconclusive diagnosis rate of less than 1% when trained with adequate data. In the end, naive mobile-based applications were developed on Android and iOS that made use of the self-training auto associative memory model to achieve an accuracy of nearly 94.2%.
Tasks	Medical Diagnosis
Published	2018-04-07
URL	http://arxiv.org/abs/1804.04612v1
PDF	http://arxiv.org/pdf/1804.04612v1.pdf
PWC	https://paperswithcode.com/paper/a-comprehensive-study-on-the-applications-of-1
Repo
Framework

From Rank Estimation to Rank Approximation: Rank Residual Constraint for Image Restoration


Title	From Rank Estimation to Rank Approximation: Rank Residual Constraint for Image Restoration
Authors	Zhiyuan Zha, Xin Yuan, Bihan Wen, Jiantao Zhou, Jiachao Zhang, Ce Zhu
Abstract	In this paper, we propose a novel approach to the rank minimization problem, termed the rank residual constraint (RRC) model. Unlike existing low-rank based approaches, such as the well-known nuclear norm minimization (NNM) and the weighted nuclear norm minimization (WNNM), which estimate the underlying low-rank matrix directly from the corrupted observations, we progressively approximate the underlying low-rank matrix by minimizing the rank residual. By integrating the image nonlocal self-similarity (NSS) prior with the proposed RRC model, we apply it to image restoration tasks, including image denoising and image compression artifacts reduction. To this end, we first obtain a good reference of the original image groups by using the image NSS prior, and then the rank residual of the image groups between this reference and the degraded image is minimized to achieve a better estimate of the desired image. In this manner, both the reference and the estimated image are updated gradually and jointly in each iteration. Based on the group-based sparse representation model, we further provide a theoretical analysis of the feasibility of the proposed RRC model. Experimental results demonstrate that the proposed RRC model outperforms many state-of-the-art schemes in both objective and perceptual quality.
Tasks	Denoising, Image Compression, Image Denoising, Image Restoration
Published	2018-07-06
URL	https://arxiv.org/abs/1807.02504v9
PDF	https://arxiv.org/pdf/1807.02504v9.pdf
PWC	https://paperswithcode.com/paper/from-rank-estimation-to-rank-approximation
Repo
Framework

Convolutional herbal prescription building method from multi-scale facial features


Title	Convolutional herbal prescription building method from multi-scale facial features
Authors	Huiqiang Liao, Guihua Wen, Yang Hu, Changjun Wang
Abstract	In Traditional Chinese Medicine (TCM), facial features are an important basis for diagnosis and treatment. A TCM doctor can prescribe according to a patient’s physical indicators such as face, tongue, voice, symptoms, and pulse. Previous works analyze and generate prescriptions according to symptoms. However, no research mining the association between facial features and prescriptions has been found so far. In this work, we try to use deep learning methods to mine the relationship between the patient’s face and herbal prescriptions (TCM prescriptions), and propose to construct convolutional neural networks that generate TCM prescriptions according to the patient’s face image. It is a novel and challenging task. In order to mine features from different granularities of faces, we design a multi-scale convolutional neural network based on three-grained face, which mines the patient’s face information from the organs, local regions, and the entire face. Our experiments show that convolutional neural networks can learn relevant information from the face to prescribe, and that the multi-scale convolutional neural networks based on three-grained face perform better.
Tasks
Published	2018-12-17
URL	http://arxiv.org/abs/1812.06847v1
PDF	http://arxiv.org/pdf/1812.06847v1.pdf
PWC	https://paperswithcode.com/paper/convolutional-herbal-prescription-building
Repo
Framework

Some HCI Priorities for GDPR-Compliant Machine Learning


Title	Some HCI Priorities for GDPR-Compliant Machine Learning
Authors	Michael Veale, Reuben Binns, Max Van Kleek
Abstract	In this short paper, we consider the roles of HCI in enabling the better governance of consequential machine learning systems using the rights and obligations laid out in the recent 2016 EU General Data Protection Regulation (GDPR)—a law which involves heavy interaction with people and systems. Focussing on those areas that relate to algorithmic systems in society, we propose roles for HCI in legal contexts in relation to fairness, bias and discrimination; data protection by design; data protection impact assessments; transparency and explanations; the mitigation and understanding of automation bias; and the communication of envisaged consequences of processing.
Tasks
Published	2018-03-16
URL	http://arxiv.org/abs/1803.06174v1
PDF	http://arxiv.org/pdf/1803.06174v1.pdf
PWC	https://paperswithcode.com/paper/some-hci-priorities-for-gdpr-compliant
Repo
Framework

HBST: A Hamming Distance embedding Binary Search Tree for Visual Place Recognition


Title	HBST: A Hamming Distance embedding Binary Search Tree for Visual Place Recognition
Authors	Dominik Schlegel, Giorgio Grisetti
Abstract	Reliable and efficient Visual Place Recognition is a major building block of modern SLAM systems. Building on our prior work, in this paper we present a Hamming Distance embedding Binary Search Tree (HBST) approach for binary descriptor matching and image retrieval. HBST allows for descriptor search and insertion in logarithmic time by exploiting particular properties of binary feature descriptors. We support the idea behind our search structure with a thorough analysis of the exploited descriptor properties and their effects on the completeness and complexity of search and insertion. To validate our claims, we conducted comparative experiments for HBST and several state-of-the-art methods on a broad range of publicly available datasets. HBST is available as a compact open-source C++ header-only library.
Tasks	Image Retrieval, Visual Place Recognition
Published	2018-02-26
URL	http://arxiv.org/abs/1802.09261v2
PDF	http://arxiv.org/pdf/1802.09261v2.pdf
PWC	https://paperswithcode.com/paper/hbst-a-hamming-distance-embedding-binary
Repo
Framework
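
The HBST abstract above describes a binary search tree over binary descriptors that are compared by Hamming distance. The following Python sketch shows the general idea only. It is not the HBST library's API (the library is C++), and the splitting rule (choose the unused bit that most evenly divides a full leaf), the class names, and the `max_leaf_size` parameter are assumptions made for illustration. Descriptors are stored as plain integers.

```python
def hamming(a, b):
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")


class Node:
    def __init__(self, used_bits=frozenset()):
        self.descriptors = []       # filled only while this node is a leaf
        self.bit = None             # split bit index once the node is internal
        self.children = None        # (child for bit 0, child for bit 1)
        self.used_bits = used_bits  # bits already tested on the path from the root


class BitSplitTree:
    def __init__(self, num_bits, max_leaf_size=4):
        self.num_bits = num_bits
        self.max_leaf_size = max_leaf_size
        self.root = Node()

    def _leaf_for(self, d):
        # Descend by testing one bit per level until a leaf is reached.
        node = self.root
        while node.children is not None:
            node = node.children[(d >> node.bit) & 1]
        return node

    def insert(self, d):
        leaf = self._leaf_for(d)
        leaf.descriptors.append(d)
        if len(leaf.descriptors) > self.max_leaf_size:
            self._split(leaf)

    def _split(self, leaf):
        free = [b for b in range(self.num_bits) if b not in leaf.used_bits]
        if not free:
            return  # no bit left to split on; keep an oversized leaf
        n = len(leaf.descriptors)

        def imbalance(b):
            ones = sum((d >> b) & 1 for d in leaf.descriptors)
            return abs(ones - n / 2)

        # Assumed rule: pick the unused bit closest to an even 0/1 partition.
        bit = min(free, key=imbalance)
        used = leaf.used_bits | {bit}
        children = (Node(used), Node(used))
        for d in leaf.descriptors:
            children[(d >> bit) & 1].descriptors.append(d)
        leaf.bit, leaf.children, leaf.descriptors = bit, children, []

    def nearest(self, query):
        """Best match inside the single leaf the query descends to."""
        leaf = self._leaf_for(query)
        if not leaf.descriptors:
            return None
        return min(leaf.descriptors, key=lambda d: hamming(d, query))
```

Because the search visits only one leaf, a query that differs from its true nearest neighbour in one of the tested bits can miss it. The sketch therefore trades completeness for speed, which is the kind of trade-off the abstract says the authors analyze.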

A Dynamic Boosted Ensemble Learning Method Based on Random Forest


Title	A Dynamic Boosted Ensemble Learning Method Based on Random Forest
Authors	Xingzhang Ren, Chen Long, Leilei Zhang, Ye Wei, Dongdong Du, Jingxi Liang, Shikun Zhang, Weiping Li
Abstract	We propose a dynamic boosted ensemble learning method based on random forest (DBRF), a novel ensemble algorithm that incorporates the notion of hard example mining into Random Forest (RF) and thus combines the high accuracy of Boosting algorithms with the strong generalization of Bagging algorithms. Specifically, we propose to measure the quality of each leaf node of every decision tree in the random forest to determine hard examples. By iteratively training and then removing easy examples from the training data, we evolve the random forest to focus on hard examples dynamically so as to learn decision boundaries better. Data can be cascaded in sequence through the random forests learned in each iteration to generate predictions, thus making RF deep. We also propose to use an evolution mechanism and a smart iteration mechanism to improve the performance of the model. DBRF outperforms RF on three UCI datasets and achieves state-of-the-art results compared to other deep models. Moreover, we show that DBRF is also a new way of sampling and can be very useful when learning from imbalanced data.
Tasks
Published	2018-04-19
URL	http://arxiv.org/abs/1804.07270v3
PDF	http://arxiv.org/pdf/1804.07270v3.pdf
PWC	https://paperswithcode.com/paper/a-dynamic-boosted-ensemble-learning-method
Repo
Framework

Guide Me: Interacting with Deep Networks


Title	Guide Me: Interacting with Deep Networks
Authors	Christian Rupprecht, Iro Laina, Nassir Navab, Gregory D. Hager, Federico Tombari
Abstract	Interaction and collaboration between humans and intelligent machines have become increasingly important as machine learning methods move into real-world applications that involve end users. While much prior work lies at the intersection of natural language and vision, such as image captioning or image generation from text descriptions, less focus has been placed on the use of language to guide or improve the performance of a learned visual processing algorithm. In this paper, we explore methods to flexibly guide a trained convolutional neural network through user input to improve its performance during inference. We do so by inserting a layer that acts as a spatio-semantic guide into the network. This guide is trained to modify the network’s activations, either directly via an energy minimization scheme or indirectly through a recurrent model that translates human language queries to interaction weights. Learning the verbal interaction is fully automatic and does not require manual text annotations. We evaluate the method on two datasets, showing that guiding a pre-trained network can improve performance, and provide extensive insights into the interaction between the guide and the CNN.
Tasks	Image Captioning, Image Generation
Published	2018-03-30
URL	http://arxiv.org/abs/1803.11544v1
PDF	http://arxiv.org/pdf/1803.11544v1.pdf
PWC	https://paperswithcode.com/paper/guide-me-interacting-with-deep-networks
Repo
Framework

Generative Model for Heterogeneous Inference


Title	Generative Model for Heterogeneous Inference
Authors	Honggang Zhou, Yunchun Li, Hailong Yang, Wei Li, Jie Jia
Abstract	Generative models (GMs) such as the Generative Adversarial Network (GAN) and the Variational Auto-Encoder (VAE) have thrived in recent years and achieved high-quality results in generating new samples. Especially in Computer Vision, GMs have been used in image inpainting, denoising and completion, which can be treated as inference from observed pixels to corrupted pixels. However, images are hierarchically structured, which makes them quite different from many real-world inference scenarios with non-hierarchical features. These inference scenarios contain heterogeneous stochastic variables and irregular mutual dependences. Traditionally they are modeled by a Bayesian Network (BN). However, learning and inference in a BN model are NP-hard, so the number of stochastic variables in a BN is highly constrained. In this paper, we adapt typical GMs to enable heterogeneous learning and inference in polynomial time. We also propose an extended autoregressive (EAR) model and an EAR with adversary loss (EARA) model and give theoretical results on their effectiveness. Experiments on several BN datasets show that our proposed EAR model achieves the best performance in most cases compared to other GMs. Besides black-box analysis, we have also done a series of experiments on Markov border inference of GMs for white-box analysis and give theoretical results.
Tasks	Denoising, Image Inpainting
Published	2018-04-26
URL	http://arxiv.org/abs/1804.09858v1
PDF	http://arxiv.org/pdf/1804.09858v1.pdf
PWC	https://paperswithcode.com/paper/generative-model-for-heterogeneous-inference
Repo
Framework

Online Graph-Adaptive Learning with Scalability and Privacy


Title	Online Graph-Adaptive Learning with Scalability and Privacy
Authors	Yanning Shen, Geert Leus, Georgios B. Giannakis
Abstract	Graphs are widely adopted for modeling complex systems, including financial, biological, and social networks. Nodes in networks usually entail attributes, such as the age or gender of users in a social network. However, real-world networks can have very large size, and nodal attributes can be unavailable to a number of nodes, e.g., due to privacy concerns. Moreover, new nodes can emerge over time, which can necessitate real-time evaluation of their nodal attributes. In this context, the present paper deals with scalable learning of nodal attributes by estimating a nodal function based on noisy observations at a subset of nodes. A multikernel-based approach is developed which is scalable to large-size networks. Unlike most existing methods that re-solve the function estimation problem over all existing nodes whenever a new node joins the network, the novel method is capable of providing real-time evaluation of the function values on newly-joining nodes without resorting to a batch solver. Interestingly, the novel scheme only relies on an encrypted version of each node’s connectivity in order to learn the nodal attributes, which promotes privacy. Experiments on both synthetic and real datasets corroborate the effectiveness of the proposed methods.
Tasks
Published	2018-12-03
URL	http://arxiv.org/abs/1812.00974v1
PDF	http://arxiv.org/pdf/1812.00974v1.pdf
PWC	https://paperswithcode.com/paper/online-graph-adaptive-learning-with
Repo
Framework

Infinite Factorial Finite State Machine for Blind Multiuser Channel Estimation


Title	Infinite Factorial Finite State Machine for Blind Multiuser Channel Estimation
Authors	Francisco J. R. Ruiz, Isabel Valera, Lennart Svensson, Fernando Perez-Cruz
Abstract	New communication standards need to deal with machine-to-machine communications, in which users may start or stop transmitting at any time in an asynchronous manner. Thus, the number of users is an unknown and time-varying parameter that needs to be accurately estimated in order to properly recover the symbols transmitted by all users in the system. In this paper, we address the problem of joint channel parameter and data estimation in a multiuser communication channel in which the number of transmitters is not known. For that purpose, we develop the infinite factorial finite state machine model, a Bayesian nonparametric model based on the Markov Indian buffet that allows for an unbounded number of transmitters with arbitrary channel length. We propose an inference algorithm that makes use of slice sampling and particle Gibbs with ancestor sampling. Our approach is fully blind as it does not require a prior channel estimation step, prior knowledge of the number of transmitters, or any signaling information. Our experimental results, loosely based on the LTE random access channel, show that the proposed approach can effectively recover the data-generating process for a wide range of scenarios, with varying number of transmitters, number of receivers, constellation order, channel length, and signal-to-noise ratio.
Tasks
Published	2018-10-18
URL	http://arxiv.org/abs/1810.09261v1
PDF	http://arxiv.org/pdf/1810.09261v1.pdf
PWC	https://paperswithcode.com/paper/infinite-factorial-finite-state-machine-for
Repo
Framework

Can a Compact Neuronal Circuit Policy be Re-purposed to Learn Simple Robotic Control?


Title	Can a Compact Neuronal Circuit Policy be Re-purposed to Learn Simple Robotic Control?
Authors	Ramin Hasani, Mathias Lechner, Alexander Amini, Daniela Rus, Radu Grosu
Abstract	We propose a neural information processing system which is obtained by re-purposing the function of a biological neural circuit model to govern simulated and real-world control tasks. Inspired by the structure of the nervous system of the soil-worm, C. elegans, we introduce Neuronal Circuit Policies (NCPs), defined as the model of biological neural circuits reparameterized for the control of an alternative task. We learn instances of NCPs to control a series of robotic tasks, including the autonomous parking of a real-world rover robot. To reconfigure the purpose of the neural circuit, we adopt a search-based optimization algorithm. Neuronal circuit policies perform on par with, and in some cases surpass, contemporary deep learning models, with the advantage of leveraging significantly fewer learnable parameters and realizing interpretable dynamics at the cell level.
Tasks
Published	2018-09-11
URL	https://arxiv.org/abs/1809.04423v2
PDF	https://arxiv.org/pdf/1809.04423v2.pdf
PWC	https://paperswithcode.com/paper/re-purposing-compact-neuronal-circuit
Repo
Framework

Geocoding Without Geotags: A Text-based Approach for reddit


Title	Geocoding Without Geotags: A Text-based Approach for reddit
Authors	Keith Harrigian
Abstract	In this paper, we introduce the first geolocation inference approach for reddit, a social media platform where user pseudonymity has thus far made supervised demographic inference difficult to implement and validate. In particular, we design a text-based heuristic schema to generate ground truth location labels for reddit users in the absence of explicitly geotagged data. After evaluating the accuracy of our labeling procedure, we train and test several geolocation inference models across our reddit data set and three benchmark Twitter geolocation data sets. Ultimately, we show that geolocation models trained and applied on the same domain substantially outperform models attempting to transfer training data across domains, even more so on reddit where platform-specific interest-group metadata can be used to improve inferences.
Tasks
Published	2018-10-07
URL	http://arxiv.org/abs/1810.03067v1
PDF	http://arxiv.org/pdf/1810.03067v1.pdf
PWC	https://paperswithcode.com/paper/geocoding-without-geotags-a-text-based
Repo
Framework

Model-Free Linear Quadratic Control via Reduction to Expert Prediction


Title	Model-Free Linear Quadratic Control via Reduction to Expert Prediction
Authors	Yasin Abbasi-Yadkori, Nevena Lazic, Csaba Szepesvari
Abstract	Model-free approaches for reinforcement learning (RL) and continuous control find policies based only on past states and rewards, without fitting a model of the system dynamics. They are appealing as they are general purpose and easy to implement; however, they also come with fewer theoretical guarantees than model-based RL. In this work, we present a new model-free algorithm for controlling linear quadratic (LQ) systems, and show that its regret scales as $O(T^{\xi+2/3})$ for any small $\xi>0$ if the time horizon satisfies $T>C^{1/\xi}$ for a constant $C$. The algorithm is based on a reduction of control of Markov decision processes to an expert prediction problem. In practice, it corresponds to a variant of policy iteration with forced exploration, where the policy in each phase is greedy with respect to the average of all previous value functions. This is the first model-free algorithm for adaptive control of LQ systems that provably achieves sublinear regret and has a polynomial computation cost. Empirically, our algorithm dramatically outperforms standard policy iteration, but performs worse than a model-based approach.
Tasks	Continuous Control
Published	2018-04-17
URL	http://arxiv.org/abs/1804.06021v3
PDF	http://arxiv.org/pdf/1804.06021v3.pdf
PWC	https://paperswithcode.com/paper/model-free-linear-quadratic-control-via
Repo
Framework
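
The abstract above states a regret bound of $O(T^{\xi+2/3})$ and calls it sublinear. A short worked step shows why, writing $R_T$ for the cumulative regret up to horizon $T$ (a symbol introduced here for illustration, not taken from the abstract): dividing by $T$ gives an average regret that vanishes whenever $\xi < 1/3$.

```latex
R_T = O\!\left(T^{\xi + 2/3}\right)
\quad\Longrightarrow\quad
\frac{R_T}{T} = O\!\left(T^{\xi - 1/3}\right) \xrightarrow[T \to \infty]{} 0
\quad \text{for any } 0 < \xi < \tfrac{1}{3}.
```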