January 25, 2020

3231 words 16 mins read

Paper Group ANR 1686

Explicit Disentanglement of Appearance and Perspective in Generative Models. Few-Shot Point Cloud Region Annotation with Human in the Loop. Transfer Adaptation Learning: A Decade Survey. Using Bi-Directional Information Exchange to Improve Decentralized Schedule-Driven Traffic Control. Sparse Parallel Training of Hierarchical Dirichlet Process Topi …

Explicit Disentanglement of Appearance and Perspective in Generative Models

Title Explicit Disentanglement of Appearance and Perspective in Generative Models
Authors Nicki Skafte Detlefsen, Søren Hauberg
Abstract Disentangled representation learning finds compact, independent and easy-to-interpret factors of the data. Learning such has been shown to require an inductive bias, which we explicitly encode in a generative model of images. Specifically, we propose a model with two latent spaces: one that represents spatial transformations of the input data, and another that represents the transformed data. We find that the latter naturally captures the intrinsic appearance of the data. To realize the generative model, we propose a Variationally Inferred Transformational Autoencoder (VITAE) that incorporates a spatial transformer into a variational autoencoder. We show how to perform inference in the model efficiently by carefully designing the encoders and restricting the transformation class to be diffeomorphic. Empirically, our model separates the visual style from digit type on MNIST, separates shape and pose in images of human bodies and facial features from facial shape on CelebA.
Tasks Representation Learning
Published 2019-06-11
URL https://arxiv.org/abs/1906.11881v2
PDF https://arxiv.org/pdf/1906.11881v2.pdf
PWC https://paperswithcode.com/paper/explicit-disentanglement-of-appearance-and
Repo
Framework
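
A minimal sketch of the two-latent-space idea described in this entry: one encoder produces an appearance code that is decoded into a canonical image, and a second encoder produces a pose code that drives a spatial transformer. For brevity this uses a plain affine transformer rather than the paper's diffeomorphic transformation class, and all layer sizes are illustrative assumptions rather than the authors' architecture.

```python
import torch, torch.nn as nn, torch.nn.functional as F

class TwoSpaceVAE(nn.Module):
    def __init__(self, z_app=16, z_pose=6, img=28 * 28):
        super().__init__()
        self.enc_app = nn.Sequential(nn.Linear(img, 256), nn.ReLU(), nn.Linear(256, 2 * z_app))
        self.enc_pose = nn.Sequential(nn.Linear(img, 256), nn.ReLU(), nn.Linear(256, 2 * z_pose))
        self.dec_app = nn.Sequential(nn.Linear(z_app, 256), nn.ReLU(), nn.Linear(256, img))
        self.pose_to_theta = nn.Linear(z_pose, 6)        # parameters of a 2x3 affine map
        # start near the identity transformation (standard spatial-transformer init)
        self.pose_to_theta.weight.data.zero_()
        self.pose_to_theta.bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    @staticmethod
    def reparam(stats):
        mu, logvar = stats.chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp(), mu, logvar

    def forward(self, x):                                # x: (B, 1, 28, 28)
        flat = x.flatten(1)
        z_a, *_ = self.reparam(self.enc_app(flat))       # appearance latent
        z_p, *_ = self.reparam(self.enc_pose(flat))      # perspective/pose latent
        canonical = torch.sigmoid(self.dec_app(z_a)).view_as(x)   # transformed ("canonical") image
        theta = self.pose_to_theta(z_p).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.shape, align_corners=False)
        return F.grid_sample(canonical, grid, align_corners=False)

x_hat = TwoSpaceVAE()(torch.rand(4, 1, 28, 28))           # toy batch
```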

Few-Shot Point Cloud Region Annotation with Human in the Loop

Title Few-Shot Point Cloud Region Annotation with Human in the Loop
Authors Siddhant Jain, Sowmya Munukutla, David Held
Abstract We propose a point cloud annotation framework that employs human-in-the-loop learning to enable the creation of large point cloud datasets with per-point annotations. Sparse labels from a human annotator are iteratively propagated to generate a full segmentation of the point cloud by fine-tuning a model pre-trained on an allied task via a few-shot learning paradigm. We show that the proposed framework significantly reduces the amount of human interaction needed to annotate point clouds without sacrificing the quality of the annotations. Our experiments also suggest the suitability of the framework for annotating large datasets, as the amount of human interaction decreases as the number of full annotations completed by the system increases. Finally, we demonstrate the flexibility of the framework to support multiple different annotations of the same point cloud, enabling the creation of datasets with different granularities of annotation.
Tasks Few-Shot Learning
Published 2019-06-11
URL https://arxiv.org/abs/1906.04409v1
PDF https://arxiv.org/pdf/1906.04409v1.pdf
PWC https://paperswithcode.com/paper/few-shot-point-cloud-region-annotation-with
Repo
Framework
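
The interaction loop described above can be pictured with a toy sketch: fine-tune a pre-trained per-point classifier on the sparse labels an annotator provides, propagate to the full point cloud, and repeat. The tiny MLP, the `get_human_labels` stub, and all sizes are placeholder assumptions, not the authors' pipeline.

```python
import torch, torch.nn as nn

def get_human_labels(points, num_clicks=20):
    """Stand-in for the annotator: returns indices and labels for a few clicked points."""
    idx = torch.randperm(points.shape[0])[:num_clicks]
    return idx, torch.randint(0, 4, (num_clicks,))        # 4 region classes, fake labels

points = torch.randn(2048, 3)                             # one point cloud (x, y, z)
model = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 4))  # "pre-trained" head
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for round_ in range(3):                                    # a few interaction rounds
    idx, sparse_labels = get_human_labels(points)
    for _ in range(50):                                    # few-shot fine-tuning steps
        loss = nn.functional.cross_entropy(model(points[idx]), sparse_labels)
        opt.zero_grad(); loss.backward(); opt.step()
    full_labels = model(points).argmax(dim=1)              # propagated full segmentation
    print(f"round {round_}: labelled {points.shape[0]} points, loss {loss.item():.3f}")
```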

Transfer Adaptation Learning: A Decade Survey

Title Transfer Adaptation Learning: A Decade Survey
Authors Lei Zhang
Abstract The world we see is ever-changing; it changes with people, things, and the environment. A domain refers to the state of the world at a certain moment. A research problem is characterized as domain transfer adaptation when it requires knowledge correspondence between different moments. Conventional machine learning aims to find a model with minimum expected risk on test data by minimizing the regularized empirical risk on the training data, which, however, assumes that the training and test data share a similar joint probability distribution. Transfer adaptation learning aims to build models that can perform tasks in a target domain by learning knowledge from a semantically related but differently distributed source domain. It is an energetic research field of increasing influence and importance. This paper surveys recent advances in transfer adaptation learning methodology and potential benchmarks. Broader challenges faced by transfer adaptation learning researchers are identified, i.e., instance re-weighting adaptation, feature adaptation, classifier adaptation, deep network adaptation, and adversarial adaptation, which go beyond the early semi-supervised and unsupervised split. The survey provides researchers with a framework for better understanding and identifying the research status, challenges, and future directions of the field.
Tasks
Published 2019-03-12
URL http://arxiv.org/abs/1903.04687v1
PDF http://arxiv.org/pdf/1903.04687v1.pdf
PWC https://paperswithcode.com/paper/transfer-adaptation-learning-a-decade-survey
Repo
Framework
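
As a concrete instance of the "feature adaptation" category the survey identifies, a kernel Maximum Mean Discrepancy (MMD) penalty is often added to the task loss to pull source and target feature distributions together; the sketch below shows only that penalty, with an illustrative bandwidth and made-up feature batches.

```python
import torch

def mmd_rbf(x, y, sigma=1.0):
    """MMD^2 estimate between two feature batches under an RBF kernel."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

source_feats = torch.randn(64, 128)            # features from a shared encoder
target_feats = torch.randn(64, 128) + 0.5      # shifted target distribution
penalty = 0.1 * mmd_rbf(source_feats, target_feats)   # added to the usual task loss
print(float(penalty))
```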

Using Bi-Directional Information Exchange to Improve Decentralized Schedule-Driven Traffic Control

Title Using Bi-Directional Information Exchange to Improve Decentralized Schedule-Driven Traffic Control
Authors Hsu-Chieh Hu, Stephen F. Smith
Abstract Recent work in decentralized, schedule-driven traffic control has demonstrated the ability to improve the efficiency of traffic flow in complex urban road networks. In this approach, a scheduling agent is associated with each intersection. Each agent senses the traffic approaching its intersection and in real-time constructs a schedule that minimizes the cumulative wait time of vehicles approaching the intersection over the current look-ahead horizon. In order to achieve network level coordination in a scalable manner, scheduling agents communicate only with their direct neighbors. Each time an agent generates a new intersection schedule it communicates its expected outflows to its downstream neighbors as a prediction of future demand and these outflows are appended to the downstream agent’s locally perceived demand. In this paper, we extend this basic coordination algorithm to additionally incorporate the complementary flow of information reflective of an intersection’s current congestion level to its upstream neighbors. We present an asynchronous decentralized algorithm for updating intersection schedules and congestion level estimates based on these bi-directional information flows. By relating this algorithm to the self-optimized decision making of the basic operation, we are able to approach network-wide optimality and reduce inefficiency due to strictly self-interested intersection control decisions.
Tasks Decision Making
Published 2019-07-03
URL https://arxiv.org/abs/1907.01978v1
PDF https://arxiv.org/pdf/1907.01978v1.pdf
PWC https://paperswithcode.com/paper/using-bi-directional-information-exchange-to
Repo
Framework
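
A purely illustrative toy of the bi-directional exchange described above: each intersection serves its queue, announces its expected outflow to its downstream neighbour (future demand), and exposes a congestion estimate that throttles its upstream neighbour's outflow. The update rule, capacities, and arrival rates are invented for the sketch and are not from the paper.

```python
from dataclasses import dataclass

@dataclass
class Intersection:
    name: str
    queue: float = 0.0              # vehicles waiting
    capacity: float = 10.0          # vehicles servable per step
    predicted_inflow: float = 0.0   # demand announced by the upstream neighbour
    downstream: "Intersection | None" = None

    def congestion(self):
        # backward message: read by the upstream neighbour on its next step
        return min(1.0, (self.queue + self.predicted_inflow) / (2 * self.capacity))

    def step(self, arrivals):
        self.queue += arrivals
        # downstream congestion throttles this agent's outflow
        throttle = 1.0 - (self.downstream.congestion() if self.downstream else 0.0)
        served = min(self.queue, self.capacity * throttle)
        self.queue -= served
        if self.downstream:
            # forward message: expected outflow becomes downstream predicted demand
            self.downstream.predicted_inflow = served
        return served

a, b, c = Intersection("A"), Intersection("B", capacity=6.0), Intersection("C")
a.downstream, b.downstream = b, c
for t in range(10):
    out_a = a.step(arrivals=8.0)          # external demand enters at A
    out_b = b.step(arrivals=out_a)
    out_c = c.step(arrivals=out_b)
    print(f"t={t} queues A={a.queue:.1f} B={b.queue:.1f} C={c.queue:.1f}")
```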

Sparse Parallel Training of Hierarchical Dirichlet Process Topic Models

Title Sparse Parallel Training of Hierarchical Dirichlet Process Topic Models
Authors Alexander Terenin, Måns Magnusson, Leif Jonsson
Abstract Nonparametric extensions of topic models such as Latent Dirichlet Allocation, including the Hierarchical Dirichlet Process (HDP), are often studied in natural language processing. Training these models generally requires serial algorithms, which limits scalability to large data sets and complicates acceleration via parallel and distributed systems. Most current approaches to scalable training of such models either do not converge to the correct target or are not data-parallel. Moreover, these approaches generally do not exploit all available sources of sparsity found in natural language, an important way to make computation efficient. Based on a representation of certain conditional distributions within an HDP, we propose a doubly sparse data-parallel sampler for the HDP topic model that addresses these issues. We benchmark our method on a well-known corpus (PubMed) with 8m documents and 768m tokens, using a single multi-core machine, in under three days.
Tasks Topic Models
Published 2019-06-06
URL https://arxiv.org/abs/1906.02416v1
PDF https://arxiv.org/pdf/1906.02416v1.pdf
PWC https://paperswithcode.com/paper/sparse-parallel-training-of-hierarchical
Repo
Framework
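
To picture what "data-parallel" means here, the sketch below runs an approximate collapsed Gibbs sweep for plain LDA (not the HDP model of the paper) over disjoint document shards against a stale copy of the global word-topic counts, then merges the count deltas; the corpus, hyperparameters, and shard count are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
V, K, D, alpha, beta = 50, 5, 40, 0.1, 0.01          # vocab, topics, docs, priors
docs = [rng.integers(0, V, size=rng.integers(5, 20)) for _ in range(D)]

# Initialise topic assignments and count tables.
z = [rng.integers(0, K, size=len(d)) for d in docs]
n_dk = np.zeros((D, K)); n_kw = np.zeros((K, V)); n_k = np.zeros(K)
for d, (words, zs) in enumerate(zip(docs, z)):
    for w, k in zip(words, zs):
        n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1

def sweep_shard(doc_ids, n_kw_stale, n_k_stale):
    """Resample one document shard against a stale global table; return count deltas."""
    d_kw = np.zeros_like(n_kw_stale); d_k = np.zeros_like(n_k_stale)
    for d in doc_ids:
        for i, w in enumerate(docs[d]):
            k_old = z[d][i]
            n_dk[d, k_old] -= 1
            d_kw[k_old, w] -= 1; d_k[k_old] -= 1
            p = (n_dk[d] + alpha) * (n_kw_stale[:, w] + d_kw[:, w] + beta) \
                / (n_k_stale + d_k + V * beta)
            k_new = rng.choice(K, p=p / p.sum())
            z[d][i] = k_new
            n_dk[d, k_new] += 1
            d_kw[k_new, w] += 1; d_k[k_new] += 1
    return d_kw, d_k

for it in range(20):                                   # outer (synchronised) sweeps
    shards = np.array_split(np.arange(D), 4)           # 4 "workers", run serially here
    stale_kw, stale_k = n_kw.copy(), n_k.copy()
    for shard in shards:
        d_kw, d_k = sweep_shard(shard, stale_kw, stale_k)
        n_kw += d_kw; n_k += d_k                       # merge worker deltas
print("topic sizes:", n_k)
```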

Selective Transfer with Reinforced Transfer Network for Partial Domain Adaptation

Title Selective Transfer with Reinforced Transfer Network for Partial Domain Adaptation
Authors Zhihong Chen, Chao Chen, Zhaowei Cheng, Boyuan Jiang, Ke Fang, Xinyu Jin
Abstract One crucial aspect of partial domain adaptation (PDA) is how to select the relevant source samples in the shared classes for knowledge transfer. Previous PDA methods tackle this problem by re-weighting the source samples based on their high-level information (deep features). However, because of the domain shift between the source and target domains, using only deep features for sample selection is unreliable. We argue that it is more reasonable to additionally exploit pixel-level information for the PDA problem, as the appearance difference between outlier source classes and target classes is significant. In this paper, we propose a reinforced transfer network (RTNet), which utilizes both high-level and pixel-level information for the PDA problem. Our RTNet is composed of a reinforced data selector (RDS) based on reinforcement learning (RL), which filters out outlier source samples, and a domain adaptation model that minimizes the domain discrepancy in the shared label space. Specifically, in the RDS, we design a novel reward based on the reconstruction errors of selected source samples on the target generator, which introduces pixel-level information to guide the learning of the RDS. Besides, we develop a state representation containing high-level information, which is used by the RDS for sample selection. The proposed RDS is a general module that can be easily integrated into existing DA models to make them fit the PDA setting. Extensive experiments indicate that RTNet achieves state-of-the-art performance on PDA tasks across several benchmark datasets.
Tasks Domain Adaptation, Partial Domain Adaptation, Transfer Learning
Published 2019-05-26
URL https://arxiv.org/abs/1905.10756v4
PDF https://arxiv.org/pdf/1905.10756v4.pdf
PWC https://paperswithcode.com/paper/selective-transfer-with-reinforced-transfer
Repo
Framework
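
A heavily simplified sketch of the reward idea attributed to the RDS above: source samples that a target-domain autoencoder reconstructs poorly are likely outliers, so negative reconstruction error can reward a stochastic selector trained with REINFORCE. The selector, autoencoder, shapes, and the plain REINFORCE update are all assumptions for illustration, not the authors' implementation.

```python
import torch, torch.nn as nn

feat_dim = 128
target_ae = nn.Sequential(nn.Linear(feat_dim, 32), nn.ReLU(), nn.Linear(32, feat_dim))
selector = nn.Linear(feat_dim, 1)                           # logits for "keep" probability
opt = torch.optim.Adam(selector.parameters(), lr=1e-3)

def select_and_update(source_feats):
    """One REINFORCE step: sample a keep/drop mask, reward kept samples that the
    target autoencoder reconstructs well (low reconstruction error)."""
    logits = selector(source_feats).squeeze(-1)
    probs = torch.sigmoid(logits)
    mask = torch.bernoulli(probs)                           # stochastic selection
    with torch.no_grad():
        rec_err = ((target_ae(source_feats) - source_feats) ** 2).mean(dim=1)
        reward = torch.where(mask.bool(), -rec_err, torch.zeros_like(rec_err))
    log_prob = mask * torch.log(probs + 1e-8) + (1 - mask) * torch.log(1 - probs + 1e-8)
    loss = -(log_prob * (reward - reward.mean())).mean()    # baseline-subtracted REINFORCE
    opt.zero_grad(); loss.backward(); opt.step()
    return mask

kept = select_and_update(torch.randn(16, feat_dim))         # toy batch of source features
```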

Comparison of Patch-Based Conditional Generative Adversarial Neural Net Models with Emphasis on Model Robustness for Use in Head and Neck Cases for MR-Only planning

Title Comparison of Patch-Based Conditional Generative Adversarial Neural Net Models with Emphasis on Model Robustness for Use in Head and Neck Cases for MR-Only planning
Authors Peter Klages, Ilyes Benslimane, Sadegh Riyahi, Jue Jiang, Margie Hunt, Joe Deasy, Harini Veeraraghavan, Neelam Tyagi
Abstract A total of twenty paired CT and MR images were used in this study to investigate two conditional generative adversarial networks, Pix2Pix and Cycle GAN, for generating synthetic CT (sCT) images for head and neck cancer cases. Ten of the patient cases were used for training and included such common artifacts as dental implants; the remaining ten cases were used for testing and included a larger range of image features commonly found in clinical head and neck cases: strong metal artifacts from dental implants, one case with a metal implant, and one case with abnormal anatomy. The original CT images were deformably registered to the mDixon FFE MR images to minimize the effects of processing the MR images. The sCT generation accuracy and robustness were evaluated using the Mean Absolute Error (MAE) in Hounsfield Units (HU) for three regions (whole body, bone, and air within the body), the Mean Error (ME) to detect systematic average offsets in the sCT generation, and a dosimetric evaluation of all clinically relevant structures. For the test set, the MAE for the Pix2Pix and Cycle GAN models was 92.4 $\pm$ 13.5 HU and 100.7 $\pm$ 14.6 HU, respectively, for the body region, 166.3 $\pm$ 31.8 HU and 184 $\pm$ 31.9 HU, respectively, for the bone region, and 183.7 $\pm$ 41.3 HU and 185.4 $\pm$ 37.9 HU for the air regions. The ME for Pix2Pix and Cycle GAN was 21.0 $\pm$ 11.8 HU and 37.5 $\pm$ 14.9 HU, respectively. Absolute percent mean/max dose errors were less than 2% for the PTV and all critical structures for both models, and DRRs generated from these models looked qualitatively similar to CT-generated DRRs, showing that these methods are promising for MR-only planning.
Tasks
Published 2019-02-01
URL http://arxiv.org/abs/1902.00536v4
PDF http://arxiv.org/pdf/1902.00536v4.pdf
PWC https://paperswithcode.com/paper/comparison-of-patch-based-conditional
Repo
Framework
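
The MAE and ME figures quoted above are simple region-masked statistics in Hounsfield Units; a sketch of how they could be computed is below, with stand-in volumes and crude threshold masks in place of the study's registered data and contours.

```python
import numpy as np

def region_metrics(ct_hu, sct_hu, mask):
    """MAE and ME (in HU) of the synthetic CT against the reference CT within a mask."""
    diff = (sct_hu - ct_hu)[mask]
    return dict(MAE=float(np.abs(diff).mean()), ME=float(diff.mean()))

ct = np.random.uniform(-1000, 1500, size=(64, 64, 64))     # stand-in CT volume (HU)
sct = ct + np.random.normal(0, 80, size=ct.shape)           # stand-in synthetic CT
body = ct > -300                                            # crude body mask
bone = ct > 250                                             # crude bone mask
air = ct <= -300                                            # crude air mask
for name, m in [("body", body), ("bone", bone), ("air", air)]:
    print(name, region_metrics(ct, sct, m))
```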

Intrinsic Image Popularity Assessment

Title Intrinsic Image Popularity Assessment
Authors Keyan Ding, Kede Ma, Shiqi Wang
Abstract The goal of research in automatic image popularity assessment (IPA) is to develop computational models that can accurately predict the potential of a social image to go viral on the Internet. Here, we aim to single out the contribution of visual content to image popularity, i.e., intrinsic image popularity. Specifically, we first describe a probabilistic method to generate massive popularity-discriminable image pairs, based on which the first large-scale image database for intrinsic IPA (I$^2$PA) is established. We then develop computational models for I$^2$PA based on deep neural networks, optimizing for ranking consistency with millions of popularity-discriminable image pairs. Experiments on Instagram and other social platforms demonstrate that the optimized model performs favorably against existing methods, exhibits reasonable generalizability on different databases, and even surpasses human-level performance on Instagram. In addition, we conduct a psychophysical experiment to analyze various aspects of human behavior in I$^2$PA.
Tasks
Published 2019-07-03
URL https://arxiv.org/abs/1907.01985v2
PDF https://arxiv.org/pdf/1907.01985v2.pdf
PWC https://paperswithcode.com/paper/intrinsic-image-popularity-assessment
Repo
Framework
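
Learning from "popularity-discriminable image pairs" as described above can be sketched with a shared scorer and a pairwise logistic loss that pushes the more popular image's score higher; the tiny backbone and the random batch below are placeholders, not the authors' deep model.

```python
import torch, torch.nn as nn

# shared popularity scorer applied to both images of each pair
scorer = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128), nn.ReLU(), nn.Linear(128, 1))
opt = torch.optim.Adam(scorer.parameters(), lr=1e-4)

more_popular = torch.rand(8, 3, 64, 64)                     # image judged more popular
less_popular = torch.rand(8, 3, 64, 64)
margin = scorer(more_popular) - scorer(less_popular)         # should end up positive
loss = nn.functional.binary_cross_entropy_with_logits(margin, torch.ones_like(margin))
opt.zero_grad(); loss.backward(); opt.step()
```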

Semi-Automatic Labeling for Deep Learning in Robotics

Title Semi-Automatic Labeling for Deep Learning in Robotics
Authors Daniele De Gregorio, Alessio Tonioni, Gianluca Palli, Luigi Di Stefano
Abstract In this paper, we propose Augmented Reality Semi-automatic labeling (ARS), a semi-automatic method that leverages a 2D camera moved by a robot, which provides precise camera tracking, and an augmented reality pen to define the initial object bounding box, in order to create large labeled datasets with minimal human intervention. By removing from humans the burden of generating annotated data, we make deep learning techniques for computer vision, which typically require very large datasets, truly automated and reliable. With the ARS pipeline, we effortlessly created two novel datasets, one on electromechanical components (an industrial scenario) and one on fruits (a daily-living scenario), and robustly trained two state-of-the-art convolutional object detectors, YOLO and SSD. Whereas conventional manual annotation of 1000 frames took us slightly more than 10 hours, the proposed ARS-based approach allows annotating 9 sequences of about 35000 frames in less than one hour, a gain factor of about 450. Moreover, both the precision and recall of object detection are increased by about 15% with respect to manual labeling. All our software is available as a ROS package in a public repository alongside the novel annotated datasets.
Tasks Object Detection
Published 2019-08-05
URL https://arxiv.org/abs/1908.01862v1
PDF https://arxiv.org/pdf/1908.01862v1.pdf
PWC https://paperswithcode.com/paper/semi-automatic-labeling-for-deep-learning-in
Repo
Framework
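
The geometric core implied by the entry above: once the AR pen defines a 3D object box in the world frame, the robot's known camera poses let every frame's 2D label be obtained by projecting the box corners and taking their enclosing rectangle. Intrinsics, pose, and box size in the sketch are made-up values.

```python
import numpy as np

K = np.array([[600., 0., 320.], [0., 600., 240.], [0., 0., 1.]])   # camera intrinsics

def project_box(corners_w, T_cam_from_world):
    """Project 3D corners (N, 3) into pixels and return the enclosing 2D bbox."""
    pts_h = np.c_[corners_w, np.ones(len(corners_w))]                # homogeneous points
    pts_c = (T_cam_from_world @ pts_h.T).T[:, :3]                    # camera frame
    uv = (K @ pts_c.T).T
    uv = uv[:, :2] / uv[:, 2:3]                                      # perspective divide
    x0, y0 = uv.min(axis=0); x1, y1 = uv.max(axis=0)
    return x0, y0, x1, y1

# a 10 cm cube sitting about 0.6 m in front of the camera
c = np.array([[x, y, z] for x in (-.05, .05) for y in (-.05, .05) for z in (.55, .65)])
T = np.eye(4)                                                        # camera at world origin
print(project_box(c, T))
```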

Reconstructing neuronal anatomy from whole-brain images

Title Reconstructing neuronal anatomy from whole-brain images
Authors James Gornet, Kannan Umadevi Venkataraju, Arun Narasimhan, Nicholas Turner, Kisuk Lee, H. Sebastian Seung, Pavel Osten, Uygar Sümbül
Abstract Reconstructing multiple molecularly defined neurons from individual brains and across multiple brain regions can reveal organizational principles of the nervous system. However, high resolution imaging of the whole brain is a technically challenging and slow process. Recently, oblique light sheet microscopy has emerged as a rapid imaging method that can provide whole brain fluorescence microscopy at a voxel size of 0.4 by 0.4 by 2.5 cubic microns. On the other hand, complex image artifacts due to whole-brain coverage produce apparent discontinuities in neuronal arbors. Here, we present connectivity-preserving methods and data augmentation strategies for supervised learning of neuroanatomy from light microscopy using neural networks. We quantify the merit of our approach by implementing an end-to-end automated tracing pipeline. Lastly, we demonstrate a scalable, distributed implementation that can reconstruct the large datasets that sub-micron whole-brain images produce.
Tasks Data Augmentation
Published 2019-03-17
URL http://arxiv.org/abs/1903.07027v1
PDF http://arxiv.org/pdf/1903.07027v1.pdf
PWC https://paperswithcode.com/paper/reconstructing-neuronal-anatomy-from-whole
Repo
Framework
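
One generic augmentation in the spirit of the strategy described above is to simulate the apparent discontinuities that whole-brain imaging introduces by shifting a block of z-slices in a training volume (and its labels) by the same random offset; this is only an illustrative stand-in, not the authors' connectivity-preserving method.

```python
import numpy as np

def misalign_z_block(volume, labels, max_shift=4, rng=np.random.default_rng(0)):
    """Shift all slices below a random z-plane by the same (dy, dx) offset."""
    vol, lab = volume.copy(), labels.copy()
    z0 = rng.integers(1, vol.shape[0])                  # plane where the artifact starts
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    vol[z0:] = np.roll(vol[z0:], (dy, dx), axis=(1, 2)) # shift image and labels together
    lab[z0:] = np.roll(lab[z0:], (dy, dx), axis=(1, 2))
    return vol, lab

vol = np.random.rand(32, 64, 64).astype(np.float32)     # toy image volume (z, y, x)
lab = (vol > 0.95).astype(np.uint8)                     # toy segmentation labels
aug_vol, aug_lab = misalign_z_block(vol, lab)
```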

MnasFPN: Learning Latency-aware Pyramid Architecture for Object Detection on Mobile Devices

Title MnasFPN: Learning Latency-aware Pyramid Architecture for Object Detection on Mobile Devices
Authors Bo Chen, Golnaz Ghiasi, Hanxiao Liu, Tsung-Yi Lin, Dmitry Kalenichenko, Hartwig Adam, Quoc V. Le
Abstract Despite the blooming success of architecture search for vision tasks in resource-constrained environments, the design of on-device object detection architectures has mostly been manual. The few automated search efforts are either centered around non-mobile-friendly search spaces or not guided by on-device latency. We propose MnasFPN, a mobile-friendly search space for the detection head, and combine it with latency-aware architecture search to produce efficient object detection models. The learned MnasFPN head, when paired with a MobileNetV2 body, outperforms MobileNetV3+SSDLite by 1.8 mAP at similar latency on Pixel. It is also both 1.0 mAP more accurate and 10% faster than NAS-FPNLite. Ablation studies show that the majority of the performance gain comes from innovations in the search space. Further explorations reveal an interesting coupling between the search space design and the search algorithm, and that the complexity of the MnasFPN search space may be at a local optimum.
Tasks Object Detection
Published 2019-12-02
URL https://arxiv.org/abs/1912.01106v1
PDF https://arxiv.org/pdf/1912.01106v1.pdf
PWC https://paperswithcode.com/paper/mnasfpn-learning-latency-aware-pyramid
Repo
Framework
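
The search described above is guided by on-device latency; a common way to fold latency into the search objective in the MnasNet family that this work builds on is a soft constraint of the form accuracy * (latency / target) ** w. The exponent and target below are illustrative values, not the paper's settings.

```python
def latency_aware_reward(accuracy_map, latency_ms, target_ms=170.0, w=-0.07):
    """Higher is better; models slower than the target are smoothly penalised."""
    return accuracy_map * (latency_ms / target_ms) ** w

print(latency_aware_reward(0.260, 150.0))   # a bit faster than target: small bonus
print(latency_aware_reward(0.262, 210.0))   # more accurate but slower: penalised
```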

Neural Network Models for Stock Selection Based on Fundamental Analysis

Title Neural Network Models for Stock Selection Based on Fundamental Analysis
Authors Yuxuan Huang, Luiz Fernando Capretz, Danny Ho
Abstract Application of neural network architectures for financial prediction has been actively studied in recent years. This paper presents a comparative study that investigates and compares feed-forward neural network (FNN) and adaptive neural fuzzy inference system (ANFIS) on stock prediction using fundamental financial ratios. The study is designed to evaluate the performance of each architecture based on the relative return of the selected portfolios with respect to the benchmark stock index. The results show that both architectures possess the ability to separate winners and losers from a sample universe of stocks, and the selected portfolios outperform the benchmark. Our study argues that FNN shows superior performance over ANFIS.
Tasks Stock Prediction
Published 2019-06-12
URL https://arxiv.org/abs/1906.05327v1
PDF https://arxiv.org/pdf/1906.05327v1.pdf
PWC https://paperswithcode.com/paper/neural-network-models-for-stock-selection
Repo
Framework
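
The feed-forward branch of the comparison above can be sketched as an MLP mapping a vector of fundamental ratios to an expected relative return, with the top-ranked names forming the portfolio; the features, sizes, and data below are placeholders.

```python
import torch, torch.nn as nn

n_stocks, n_ratios = 200, 8                                  # e.g. P/E, P/B, ROE, D/E, ...
X = torch.randn(n_stocks, n_ratios)                          # standardised ratios
y = torch.randn(n_stocks, 1)                                 # next-period return vs. index

fnn = nn.Sequential(nn.Linear(n_ratios, 32), nn.ReLU(),
                    nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.Adam(fnn.parameters(), lr=1e-3)
for _ in range(200):                                         # fit expected relative return
    loss = nn.functional.mse_loss(fnn(X), y)
    opt.zero_grad(); loss.backward(); opt.step()

portfolio = torch.topk(fnn(X).squeeze(1), k=20).indices      # pick the 20 highest-scored stocks
```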

An Alternative Cross Entropy Loss for Learning-to-Rank

Title An Alternative Cross Entropy Loss for Learning-to-Rank
Authors Sebastian Bruch
Abstract Listwise learning-to-rank methods form a powerful class of ranking algorithms that are widely adopted in applications such as information retrieval. These algorithms learn to rank a set of items by optimizing a loss that is a function of the entire set, as a surrogate for a typically non-differentiable ranking metric. Despite their empirical success, existing listwise methods are based on heuristics and remain theoretically ill-understood. In particular, none of the empirically successful loss functions are related to ranking metrics. In this work, we propose a cross entropy-based learning-to-rank loss function that is theoretically sound and is a convex bound on NDCG, a popular ranking metric. Furthermore, empirical evaluation of an implementation of the proposed method with gradient boosting machines on benchmark learning-to-rank datasets demonstrates the superiority of our proposed formulation over existing algorithms in quality and robustness.
Tasks Information Retrieval, Learning-To-Rank
Published 2019-11-22
URL https://arxiv.org/abs/1911.09798v2
PDF https://arxiv.org/pdf/1911.09798v2.pdf
PWC https://paperswithcode.com/paper/an-alternative-cross-entropy-loss-for
Repo
Framework
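
A sketch of a listwise softmax cross-entropy loss of the kind discussed above: list scores are normalised with a softmax and compared against a target distribution derived from graded relevance (here, normalised 2^rel - 1 gains). Treat this as an illustrative variant rather than the paper's exact formulation.

```python
import torch

def listwise_xent(scores, relevance):
    """scores, relevance: (batch, list_size); higher relevance = more relevant."""
    gains = torch.pow(2.0, relevance) - 1.0
    targets = gains / gains.sum(dim=1, keepdim=True).clamp_min(1e-12)
    log_probs = torch.log_softmax(scores, dim=1)
    return -(targets * log_probs).sum(dim=1).mean()

scores = torch.randn(4, 10, requires_grad=True)             # one query = one list of 10 docs
relevance = torch.randint(0, 5, (4, 10)).float()             # graded labels in {0..4}
loss = listwise_xent(scores, relevance)
loss.backward()
```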

Object Detection with Convolutional Neural Networks

Title Object Detection with Convolutional Neural Networks
Authors Kaidong Li, Wenchi Ma, Usman Sajid, Yuanwei Wu, Guanghui Wang
Abstract In this chapter, we present a brief overview of the recent development in object detection using convolutional neural networks (CNN). Several classical CNN-based detectors are presented. Some developments are based on the detector architectures, while others are focused on solving certain problems, like model degradation and small-scale object detection. The chapter also presents some performance comparison results of different models on several benchmark datasets. Through the discussion of these models, we hope to give readers a general idea about the developments of CNN-based object detection.
Tasks Object Detection
Published 2019-12-04
URL https://arxiv.org/abs/1912.01844v1
PDF https://arxiv.org/pdf/1912.01844v1.pdf
PWC https://paperswithcode.com/paper/object-detection-with-convolutional-neural
Repo
Framework
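
As a usage note alongside the chapter, one of the classical CNN detectors it surveys can be run in a few lines via torchvision's pretrained Faster R-CNN; the image path is a placeholder, and older torchvision versions take `pretrained=True` instead of the `weights` argument.

```python
import torch, torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()
img = to_tensor(Image.open("example.jpg").convert("RGB"))    # placeholder image path
with torch.no_grad():
    out = model([img])[0]                                     # dict of boxes, labels, scores
keep = out["scores"] > 0.5                                    # simple confidence threshold
print(out["boxes"][keep], out["labels"][keep])
```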

Multi-View Large-Scale Bundle Adjustment Method for High-Resolution Satellite Images

Title Multi-View Large-Scale Bundle Adjustment Method for High-Resolution Satellite Images
Authors Xu Huang, Rongjun Qin
Abstract Given enough multi-view image correspondences (also called tie points) and ground control points (GCPs), bundle adjustment for high-resolution satellite images refines the orientations, most often expressed as the Rational Polynomial Coefficients (RPCs) of each satellite image, in a unified geodetic framework, which is critical in many photogrammetry and computer vision applications. However, the growing number of high-resolution spaceborne optical sensors has brought two challenges to bundle adjustment: 1) images coming from different satellite cameras may have different imaging dates, viewing angles, resolutions, etc., resulting in geometric and radiometric distortions in the bundle adjustment; 2) a large-scale mapping area corresponds to a vast number of bundle adjustment corrections (including RPC bias and object-space point coordinates), and due to the limitation of computer memory it is hard to refine all corrections at the same time. Hence, efficiently realizing bundle adjustment over large-scale regions is very important. This paper addresses the multi-view large-scale bundle adjustment problem in two steps: 1) to obtain robust tie points among different satellite images, we design a multi-view, multi-source tie-point matching algorithm based on plane rectification and epipolar constraints, which is able to compensate for geometric and local nonlinear radiometric distortions among satellite datasets; and 2) to solve for the tens of thousands or even millions of bundle adjustment corrections that arise at large scale, we use an efficient solution that requires only a small amount of computer memory. Experiments on in-track and off-track satellite datasets show that the proposed method is capable of computing sub-pixel accuracy bundle adjustment results.
Tasks
Published 2019-05-22
URL https://arxiv.org/abs/1905.09152v1
PDF https://arxiv.org/pdf/1905.09152v1.pdf
PWC https://paperswithcode.com/paper/multi-view-large-scale-bundle-adjustment
Repo
Framework
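
The "RPC bias" corrections mentioned above are commonly modelled as a small per-image affine correction in image space and solved by least squares against tie-point or GCP residuals; the sketch below fits such an affine bias for one image from synthetic data, as a generic illustration rather than the paper's block-partitioned large-scale solver.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
samp_line = rng.uniform(0, 30000, size=(n, 2))                    # observed (sample, line)
true_A = np.array([[1.0001, 2e-5, 3.5], [-1e-5, 0.9999, -2.1]])   # hidden affine bias
projected = (np.c_[samp_line, np.ones(n)] @ true_A.T) + rng.normal(0, 0.3, (n, 2))

# Solve projected ≈ [samp, line, 1] @ A.T for the 2x3 affine correction A.
design = np.c_[samp_line, np.ones(n)]
A_hat, *_ = np.linalg.lstsq(design, projected, rcond=None)
print("estimated affine bias:\n", A_hat.T)
```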