January 25, 2020


Paper Group ANR 1755

Shape Reconstruction by Learning Differentiable Surface Representations. Automated Detection System for Adversarial Examples with High-Frequency Noises Sieve. A Stochastic Composite Gradient Method with Incremental Variance Reduction. Learning When to Drive in Intersections by Combining Reinforcement Learning and Model Predictive Control. Rotated F …

Shape Reconstruction by Learning Differentiable Surface Representations


Title	Shape Reconstruction by Learning Differentiable Surface Representations
Authors	Jan Bednarik, Shaifali Parashar, Erhan Gundogdu, Mathieu Salzmann, Pascal Fua
Abstract	Generative models that produce point clouds have emerged as a powerful tool to represent 3D surfaces, and the best current ones rely on learning an ensemble of parametric representations. Unfortunately, they offer no control over the deformations of the surface patches that form the ensemble and thus fail to prevent them from either overlapping or collapsing into single points or lines. As a consequence, computing shape properties such as surface normals and curvatures becomes difficult and unreliable. In this paper, we show that we can exploit the inherent differentiability of deep networks to leverage differential surface properties during training so as to prevent patch collapse and strongly reduce patch overlap. Furthermore, this lets us reliably compute quantities such as surface normals and curvatures. We will demonstrate on several tasks that this yields more accurate surface reconstructions than the state-of-the-art methods in terms of normals estimation and amount of collapsed and overlapped patches.
Tasks
Published	2019-11-25
URL	https://arxiv.org/abs/1911.11227v1
PDF	https://arxiv.org/pdf/1911.11227v1.pdf
PWC	https://paperswithcode.com/paper/shape-reconstruction-by-learning
Repo
Framework
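
The abstract's point about computing surface normals from a differentiable patch can be illustrated with a small sketch. For a patch f(u, v) that maps 2D parameters to 3D points, the normal is the normalized cross product of the two partial derivatives, and the patch degenerates wherever that cross product vanishes. This is not the authors' implementation: `patch`, `partials`, and `surface_normal` are hypothetical names, the paraboloid stands in for a learned network, and central finite differences stand in for automatic differentiation.

```python
import math

def patch(u, v):
    """Hypothetical parametric patch: the paraboloid z = u^2 + v^2."""
    return (u, v, u * u + v * v)

def partials(f, u, v, h=1e-5):
    """Central finite differences for df/du and df/dv (stand-in for autodiff)."""
    fu = [(a - b) / (2 * h) for a, b in zip(f(u + h, v), f(u - h, v))]
    fv = [(a - b) / (2 * h) for a, b in zip(f(u, v + h), f(u, v - h))]
    return fu, fv

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def surface_normal(f, u, v, eps=1e-12):
    """Unit normal of f at (u, v); raises if the patch is degenerate there."""
    fu, fv = partials(f, u, v)
    n = cross(fu, fv)
    area = math.sqrt(sum(c * c for c in n))  # local area element |f_u x f_v|
    if area < eps:
        raise ValueError("degenerate patch: f_u x f_v vanishes")
    return [c / area for c in n]
```

A patch that collapses onto a line, such as `lambda u, v: (u, u, u)`, has a zero cross product everywhere, which is the failure mode the abstract describes.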

Automated Detection System for Adversarial Examples with High-Frequency Noises Sieve


Title	Automated Detection System for Adversarial Examples with High-Frequency Noises Sieve
Authors	Dang Duy Thang, Toshihiro Matsui
Abstract	Deep neural networks are being applied in many tasks with encouraging results, and have often reached human-level performance. However, deep neural networks are vulnerable to well-designed input samples called adversarial examples. In particular, neural networks tend to misclassify adversarial examples that are imperceptible to humans. This paper introduces a new detection system that automatically detects adversarial examples on deep neural networks. Our proposed system can mostly distinguish adversarial samples and benign images in an end-to-end manner without human intervention. We exploit the important role of the frequency domain in adversarial samples and propose a method that detects malicious samples in observations. When evaluated on two standard benchmark datasets (MNIST and ImageNet), our method achieved an out-detection rate of 99.7 - 100% in many settings.
Tasks
Published	2019-08-05
URL	https://arxiv.org/abs/1908.01469v1
PDF	https://arxiv.org/pdf/1908.01469v1.pdf
PWC	https://paperswithcode.com/paper/automated-detection-system-for-adversarial
Repo
Framework

A Stochastic Composite Gradient Method with Incremental Variance Reduction


Title	A Stochastic Composite Gradient Method with Incremental Variance Reduction
Authors	Junyu Zhang, Lin Xiao
Abstract	We consider the problem of minimizing the composition of a smooth (nonconvex) function and a smooth vector mapping, where the inner mapping is in the form of an expectation over some random variable or a finite sum. We propose a stochastic composite gradient method that employs an incremental variance-reduced estimator for both the inner vector mapping and its Jacobian. We show that this method achieves the same orders of complexity as the best known first-order methods for minimizing expected-value and finite-sum nonconvex functions, despite the additional outer composition which renders the composite gradient estimator biased. This finding enables a much broader range of applications in machine learning to benefit from the low complexity of incremental variance-reduction methods.
Tasks
Published	2019-06-24
URL	https://arxiv.org/abs/1906.10186v1
PDF	https://arxiv.org/pdf/1906.10186v1.pdf
PWC	https://paperswithcode.com/paper/a-stochastic-composite-gradient-method-with
Repo
Framework
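
The problem described in the abstract can be written out explicitly. The symbols below ($f$, $g$, $g_\xi$, $g_i$, $\tilde{g}$, $\tilde{J}$) are assumed notation chosen for illustration, not taken from the abstract.

```latex
\begin{aligned}
&\min_{x \in \mathbb{R}^d} \; F(x) = f\bigl(g(x)\bigr),
\qquad
g(x) = \mathbb{E}_{\xi}\bigl[g_{\xi}(x)\bigr]
\quad\text{or}\quad
g(x) = \frac{1}{n}\sum_{i=1}^{n} g_i(x), \\
&\nabla F(x) = \bigl[g'(x)\bigr]^{\top} \nabla f\bigl(g(x)\bigr), \\
&\widetilde{\nabla} F(x) = \tilde{J}^{\top} \nabla f(\tilde{g}),
\qquad
\mathbb{E}\bigl[\nabla f(\tilde{g})\bigr] \neq \nabla f\bigl(\mathbb{E}[\tilde{g}]\bigr)
\ \text{in general.}
\end{aligned}
```

The last line shows where the bias mentioned in the abstract comes from: even if $\tilde{g}$ and $\tilde{J}$ are unbiased estimates of $g(x)$ and its Jacobian $g'(x)$, passing $\tilde{g}$ through the nonlinear map $\nabla f$ generally yields a biased estimate of the composite gradient.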

Learning When to Drive in Intersections by Combining Reinforcement Learning and Model Predictive Control


Title	Learning When to Drive in Intersections by Combining Reinforcement Learning and Model Predictive Control
Authors	Tommy Tram, Ivo Batkovic, Mohammad Ali, Jonas Sjöberg
Abstract	In this paper, we propose a decision making algorithm intended for automated vehicles that negotiate with other possibly non-automated vehicles in intersections. The decision algorithm is separated into two parts: a high-level decision module based on reinforcement learning, and a low-level planning module based on model predictive control. Traffic is simulated with numerous predefined driver behaviors and intentions, and the performance of the proposed decision algorithm was evaluated against another controller. The results show that the proposed decision algorithm yields shorter training episodes and an increased performance in success rate compared to the other controller.
Tasks	Decision Making
Published	2019-08-01
URL	https://arxiv.org/abs/1908.00177v1
PDF	https://arxiv.org/pdf/1908.00177v1.pdf
PWC	https://paperswithcode.com/paper/learning-when-to-drive-in-intersections-by
Repo
Framework

Rotated Feature Network for multi-orientation object detection


Title	Rotated Feature Network for multi-orientation object detection
Authors	Zhixin Zhang, Xudong Chen, Jie Liu, Kaibo Zhou
Abstract	General detectors follow a pipeline in which feature maps extracted from ConvNets are shared between the classification and regression tasks. However, multi-orientation object detection imposes obviously conflicting requirements: classification is insensitive to orientation, while regression is quite sensitive to it. To address this issue, we provide an Encoder-Decoder architecture, called Rotated Feature Network (RFN), which produces rotation-sensitive feature maps (RS) for regression and rotation-invariant feature maps (RI) for classification. Specifically, the Encoder unit assigns weights for rotated feature maps. The Decoder unit extracts RS and RI by performing a resuming operator on the rotated and reweighted feature maps, respectively. To make the rotation-invariant characteristics more reliable, we adopt a metric to quantitatively evaluate the rotation-invariance by adding a constraint term to the loss, yielding promising detection performance. Compared with the state-of-the-art methods, our method can achieve significant improvement on the NWPU VHR-10 and RSOD datasets. We further evaluate the RFN on scene classification in remote sensing images and object detection in natural images, demonstrating its good generalization ability. The proposed RFN can be integrated into an existing framework, leading to great performance with only a slight increase in model complexity.
Tasks	Object Detection, Scene Classification
Published	2019-03-23
URL	http://arxiv.org/abs/1903.09839v2
PDF	http://arxiv.org/pdf/1903.09839v2.pdf
PWC	https://paperswithcode.com/paper/rotated-feature-network-for-multi-orientation
Repo
Framework

Learning to Adapt Invariance in Memory for Person Re-identification


Title	Learning to Adapt Invariance in Memory for Person Re-identification
Authors	Zhun Zhong, Liang Zheng, Zhiming Luo, Shaozi Li, Yi Yang
Abstract	This work considers the problem of unsupervised domain adaptation in person re-identification (re-ID), which aims to transfer knowledge from the source domain to the target domain. Existing methods primarily aim to reduce the inter-domain shift between the domains, but usually overlook the relations among target samples. This paper investigates the intra-domain variations of the target domain and proposes a novel adaptation framework w.r.t. three types of underlying invariance, i.e., Exemplar-Invariance, Camera-Invariance, and Neighborhood-Invariance. Specifically, an exemplar memory is introduced to store features of samples, which can effectively and efficiently enforce the invariance constraints over the global dataset. We further present the Graph-based Positive Prediction (GPP) method to explore reliable neighbors for the target domain, which is built upon the memory and is trained on the source samples. Experiments demonstrate that 1) the three invariance properties are indispensable for effective domain adaptation, 2) the memory plays a key role in implementing invariance learning and improves the performance with limited extra computation cost, 3) GPP facilitates the invariance learning and thus significantly improves the results, and 4) our approach produces new state-of-the-art adaptation accuracy on three large-scale re-ID benchmarks.
Tasks	Domain Adaptation, Person Re-Identification, Unsupervised Domain Adaptation
Published	2019-08-01
URL	https://arxiv.org/abs/1908.00485v1
PDF	https://arxiv.org/pdf/1908.00485v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-adapt-invariance-in-memory-for
Repo
Framework

An Efficient Approach for Super and Nested Term Indexing and Retrieval


Title	An Efficient Approach for Super and Nested Term Indexing and Retrieval
Authors	Md Faisal Mahbub Chowdhury, Robert Farrell
Abstract	This paper describes a new approach, called Terminological Bucket Indexing (TBI), for efficient indexing and retrieval of both nested and super terms using a single method. We propose a hybrid data structure to facilitate faster index building. We evaluate our approach against widely used existing approaches on several publicly available datasets. Compared to Trie-based approaches, TBI provides comparable performance on nested term retrieval and far superior performance on super term retrieval. Compared to a traditional hash table, TBI needs 80% less time for indexing.
Tasks
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09761v1
PDF	https://arxiv.org/pdf/1905.09761v1.pdf
PWC	https://paperswithcode.com/paper/an-efficient-approach-for-super-and-nested
Repo
Framework

Radar Classification of Contiguous Activities of Daily Living


Title	Radar Classification of Contiguous Activities of Daily Living
Authors	Ronny Gerhard Guendel
Abstract	We consider radar classifications of Activities of Daily Living (ADL) which can prove beneficial in fall detection, analysis of daily routines, and discerning physical and cognitive human conditions. We focus on contiguous motion classifications which follow and are commensurate with the human ethogram of possible motion sequences. Contiguous motions can be closely connected with no clear time gap separations. In the proposed motion classification approach, we utilize the Radon transform applied to the radar range-map to detect the translation motion, whereas an energy detector is used to provide the onset and offset times of in-place motions, such as sitting down and standing up. It is shown that motion classifications give different results when performed forward and backward in time. The number of classes considered by a classifier, and thereby the classification rates, is made variable depending on the current motion state and the possible transitioning activities in and out of the state. Motion examples are provided to delineate the performance of the proposed approach under typical sequences of human motions.
Tasks
Published	2019-12-17
URL	https://arxiv.org/abs/2001.01556v2
PDF	https://arxiv.org/pdf/2001.01556v2.pdf
PWC	https://paperswithcode.com/paper/radar-classification-of-contiguous-activities
Repo
Framework
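
The energy-detector step mentioned in the abstract can be sketched in a few lines. The abstract does not give the detector's details, so everything here is an assumption for illustration: the function names, the non-overlapping windows, and the fixed threshold are all hypothetical, and a real radar pipeline would operate on processed returns rather than a raw list of samples.

```python
def short_time_energy(signal, window):
    """Sum of squared samples in consecutive non-overlapping windows."""
    return [sum(s * s for s in signal[i:i + window])
            for i in range(0, len(signal) - window + 1, window)]

def onset_offset(signal, window, threshold):
    """Return (onset, offset) sample indices of the first burst whose
    windowed energy exceeds the threshold, or None if there is no burst."""
    energies = short_time_energy(signal, window)
    # First window whose energy is above the threshold marks the onset.
    start = next((k for k, e in enumerate(energies) if e > threshold), None)
    if start is None:
        return None
    # Extend the burst while the following windows stay above the threshold.
    end = start
    while end + 1 < len(energies) and energies[end + 1] > threshold:
        end += 1
    return start * window, (end + 1) * window
```

The returned indices are window-aligned, so the time resolution of the onset and offset estimates is one window length.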

TRB: A Novel Triplet Representation for Understanding 2D Human Body


Title	TRB: A Novel Triplet Representation for Understanding 2D Human Body
Authors	Haodong Duan, KwanYee Lin, Sheng Jin, Wentao Liu, Chen Qian, Wanli Ouyang
Abstract	Human pose and shape are two important components of the 2D human body. However, how to efficiently represent both of them in images is still an open question. In this paper, we propose the Triplet Representation for Body (TRB) – a compact 2D human body representation, with skeleton keypoints capturing human pose information and contour keypoints containing human shape information. TRB not only preserves the flexibility of skeleton keypoint representation, but also contains rich pose and human shape information. Therefore, it promises broader application areas, such as human shape editing and conditional image generation. We further introduce the challenging problem of TRB estimation, where joint learning of human pose and shape is required. We construct several large-scale TRB estimation datasets, based on popular 2D pose datasets: LSP, MPII, COCO. To effectively solve TRB estimation, we propose a two-branch network (TRB-net) with three novel techniques, namely X-structure (Xs), Directional Convolution (DC) and Pairwise Mapping (PM), to enforce multi-level message passing for joint feature learning. We evaluate our proposed TRB-net and several leading approaches on our proposed TRB datasets, and demonstrate the superiority of our method through extensive evaluations.
Tasks	Conditional Image Generation, Image Generation
Published	2019-10-25
URL	https://arxiv.org/abs/1910.11535v1
PDF	https://arxiv.org/pdf/1910.11535v1.pdf
PWC	https://paperswithcode.com/paper/trb-a-novel-triplet-representation-for-1
Repo
Framework

Disentangling the Spatial Structure and Style in Conditional VAE


Title	Disentangling the Spatial Structure and Style in Conditional VAE
Authors	Ziye Zhang, Li Sun, Zhilin Zheng, Qingli Li
Abstract	This paper aims to disentangle the latent space in cVAE into the spatial structure and the style code, which are complementary to each other, with one of them $z_s$ being label relevant and the other $z_u$ irrelevant. The generator is built by a connected encoder-decoder and a label condition mapping network. Depending on whether the label is related to the spatial structure, the output $z_s$ from the condition mapping network is used either as a style code or a spatial structure code. The encoder provides the label irrelevant posterior from which $z_u$ is sampled. The decoder employs $z_s$ and $z_u$ in each layer by adaptive normalization like SPADE or AdaIN. Extensive experiments on two datasets with different types of labels show the effectiveness of our method.
Tasks
Published	2019-10-29
URL	https://arxiv.org/abs/1910.13062v1
PDF	https://arxiv.org/pdf/1910.13062v1.pdf
PWC	https://paperswithcode.com/paper/disentangling-the-spatial-structure-and-style
Repo
Framework
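
The adaptive normalization the abstract refers to can be illustrated with the standard AdaIN operation: normalize a feature channel to zero mean and unit variance, then rescale and shift it with externally supplied parameters. This is a generic sketch, not the paper's decoder. The name `adaptive_instance_norm` is hypothetical, and how `scale` and `shift` would be derived from $z_s$ or $z_u$ is an assumption left outside the code.

```python
import math

def adaptive_instance_norm(channel, scale, shift, eps=1e-5):
    """AdaIN on one feature channel (a flat list of activations).

    scale and shift play the role of the style statistics; in a conditional
    model they would be predicted from a latent code (assumed, not shown).
    """
    n = len(channel)
    mean = sum(channel) / n
    var = sum((x - mean) ** 2 for x in channel) / n
    std = math.sqrt(var + eps)  # eps guards against division by zero
    return [scale * (x - mean) / std + shift for x in channel]
```

After the call, the channel's mean equals `shift` and its standard deviation is approximately `scale`, so the original statistics are replaced by those supplied by the code.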

Artificial Intelligence-Based Image Classification for Diagnosis of Skin Cancer: Challenges and Opportunities


Title	Artificial Intelligence-Based Image Classification for Diagnosis of Skin Cancer: Challenges and Opportunities
Authors	Manu Goyal, Thomas Knackstedt, Shaofeng Yan, Amanda Oakley, Saeed Hassanpour
Abstract	Recently, there has been great interest in developing Artificial Intelligence (AI) enabled computer-aided diagnostics solutions for the diagnosis of skin cancer. With the increasing incidence of skin cancers, low awareness among a growing population, and a lack of adequate clinical expertise and services, there is an immediate need for AI systems to assist clinicians in this domain. A large number of skin lesion datasets are available publicly, and researchers have developed AI-based image classification solutions, particularly deep learning algorithms, to distinguish malignant skin lesions from benign lesions in different image modalities such as dermoscopic, clinical, and histopathology images. Despite the various claims of AI systems achieving higher accuracy than dermatologists in the classification of different skin lesions, these AI systems are still in the very early stages of clinical application in terms of being ready to aid clinicians in the diagnosis of skin cancers. In this review, we discuss advancements in the digital image-based AI solutions for the diagnosis of skin cancer, along with some challenges and future opportunities to improve these AI systems to support dermatologists and enhance their ability to diagnose skin cancer.
Tasks	Image Classification
Published	2019-11-26
URL	https://arxiv.org/abs/1911.11872v2
PDF	https://arxiv.org/pdf/1911.11872v2.pdf
PWC	https://paperswithcode.com/paper/artificial-intelligence-for-diagnosis-of-skin
Repo
Framework

ARSM Gradient Estimator for Supervised Learning to Rank


Title	ARSM Gradient Estimator for Supervised Learning to Rank
Authors	Siamak Zamani Dadaneh, Shahin Boluki, Mingyuan Zhou, Xiaoning Qian
Abstract	We propose a new model for supervised learning to rank. In our model, the relevance labels are assumed to follow a categorical distribution whose probabilities are constructed based on a scoring function. We optimize the training objective with respect to the multivariate categorical variables with an unbiased and low-variance gradient estimator. Learning-to-rank methods can generally be categorized into pointwise, pairwise, and listwise approaches. Although our scoring function is pointwise, the proposed framework permits flexibility over the choice of the loss function. In our new model, the loss function need not be differentiable and can either be pointwise or listwise. Our proposed method achieves better or comparable results on two datasets compared with existing pairwise and listwise methods.
Tasks	Learning-To-Rank
Published	2019-11-01
URL	https://arxiv.org/abs/1911.00465v2
PDF	https://arxiv.org/pdf/1911.00465v2.pdf
PWC	https://paperswithcode.com/paper/arsm-gradient-estimator-for-supervised
Repo
Framework

Personalized Context-Aware Multi-Modal Transportation Recommendation


Title	Personalized Context-Aware Multi-Modal Transportation Recommendation
Authors	Meixin Zhu, Jingyun Hu, Hao Yang, Ziyuan Pu, Yinhai Wang
Abstract	This study proposes to find the most appropriate transport modes with awareness of user preferences (e.g., costs, times) and trip characteristics (e.g., purpose, distance). The work was based on real-life trips obtained from a map application. Several methods including gradient boosting tree, learning to rank, multinomial logit model, automated machine learning, random forest, and shallow neural network have been tried. For some methods, feature selection and over-sampling techniques were also tried. The results show that the best performing method is a gradient boosting tree model with synthetic minority over-sampling technique (SMOTE). Also, results of the multinomial logit model show that (1) an increase in travel cost would decrease the utility of all the transportation modes; (2) people are less sensitive to the travel distance for the metro mode or a multi-modal option that contains metro, i.e., compared to other modes, people would be more willing to tolerate long-distance metro trips. This indicates that metro lines might be a good candidate for large cities.
Tasks	Feature Selection, Learning-To-Rank
Published	2019-10-13
URL	https://arxiv.org/abs/1910.12601v1
PDF	https://arxiv.org/pdf/1910.12601v1.pdf
PWC	https://paperswithcode.com/paper/personalized-context-aware-multi-modal
Repo
Framework
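
The multinomial logit finding in the abstract can be made concrete with a short sketch. In a multinomial logit model, each mode gets a utility and the choice probability is the softmax of the utilities. The function names and the linear utility with two coefficients are assumptions for illustration. The paper's actual features and fitted coefficients are not given in the abstract, and no numbers here come from it.

```python
import math

def utility(cost, distance, beta_cost, beta_distance):
    """Hypothetical linear utility of one transport mode."""
    return beta_cost * cost + beta_distance * distance

def choice_probabilities(utilities):
    """Multinomial logit: P(mode) = exp(V_mode) / sum_k exp(V_k)."""
    m = max(utilities.values())  # subtract the max for numerical stability
    exps = {mode: math.exp(v - m) for mode, v in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}
```

With a negative cost coefficient, raising a mode's cost lowers its utility and therefore its choice probability, which matches finding (1). Finding (2) would correspond to a distance coefficient that is less negative for metro than for the other modes.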

Online Continuous Submodular Maximization: From Full-Information to Bandit Feedback


Title	Online Continuous Submodular Maximization: From Full-Information to Bandit Feedback
Authors	Mingrui Zhang, Lin Chen, Hamed Hassani, Amin Karbasi
Abstract	In this paper, we propose three online algorithms for submodular maximization. The first one, Mono-Frank-Wolfe, reduces the number of per-function gradient evaluations from $T^{1/2}$ [Chen2018Online] and $T^{3/2}$ [chen2018projection] to 1, and achieves a $(1-1/e)$-regret bound of $O(T^{4/5})$. The second one, Bandit-Frank-Wolfe, is the first bandit algorithm for continuous DR-submodular maximization, which achieves a $(1-1/e)$-regret bound of $O(T^{8/9})$. Finally, we extend Bandit-Frank-Wolfe to a bandit algorithm for discrete submodular maximization, Responsive-Frank-Wolfe, which attains a $(1-1/e)$-regret bound of $O(T^{8/9})$ in the responsive bandit setting.
Tasks
Published	2019-10-28
URL	https://arxiv.org/abs/1910.12424v1
PDF	https://arxiv.org/pdf/1910.12424v1.pdf
PWC	https://paperswithcode.com/paper/online-continuous-submodular-maximization-1
Repo
Framework
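
The $(1-1/e)$-regret quoted in the abstract is the usual $\alpha$-regret with $\alpha = 1 - 1/e$: the learner is compared against a fraction $\alpha$ of the best fixed decision in hindsight. The notation below ($f_t$ for the reward function at round $t$, $x_t$ for the learner's decision, $\mathcal{K}$ for the constraint set) is assumed for illustration.

```latex
\mathcal{R}_{\alpha}(T)
  = \alpha \max_{x \in \mathcal{K}} \sum_{t=1}^{T} f_t(x)
  - \sum_{t=1}^{T} f_t(x_t),
\qquad \alpha = 1 - \frac{1}{e}.
```

A bound such as $O(T^{4/5})$ is sublinear in $T$, so the average per-round gap to the $\alpha$-scaled benchmark vanishes as $T$ grows. In the bandit setting the decisions are randomized, and the bound is typically stated in expectation.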

Unsupervised Domain Adaptation via Calibrating Uncertainties


Title	Unsupervised Domain Adaptation via Calibrating Uncertainties
Authors	Ligong Han, Yang Zou, Ruijiang Gao, Lezi Wang, Dimitris Metaxas
Abstract	Unsupervised domain adaptation (UDA) aims at inferring class labels for an unlabeled target domain given a related labeled source dataset. Intuitively, a model trained on the source domain normally produces higher uncertainties for unseen data. In this work, we build on this assumption and propose to adapt from source to target domain via calibrating their predictive uncertainties. The uncertainty is quantified as the Renyi entropy, from which we propose a general Renyi entropy regularization (RER) framework. We further employ variational Bayes learning for reliable uncertainty estimation. In addition, calibrating the sample variance of network parameters serves as a plug-in regularizer for training. We discuss the theoretical properties of the proposed method and demonstrate its effectiveness on three domain-adaptation tasks.
Tasks	Domain Adaptation, Unsupervised Domain Adaptation
Published	2019-07-25
URL	https://arxiv.org/abs/1907.11202v1
PDF	https://arxiv.org/pdf/1907.11202v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-domain-adaptation-via
Repo
Framework
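
The uncertainty measure named in the abstract, the Renyi entropy of a predictive distribution, has a simple closed form: H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha) for alpha > 0 and alpha != 1, with the Shannon entropy as the limit alpha -> 1. The sketch below only computes that quantity; the function name is hypothetical, and how the paper turns it into a training regularizer is not shown here.

```python
import math

def renyi_entropy(probs, alpha):
    """Renyi entropy (in nats) of a discrete distribution.

    Falls back to the Shannon entropy when alpha is numerically 1.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if abs(alpha - 1.0) < 1e-9:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)
```

A uniform prediction over K classes gives log K for every alpha, while a one-hot prediction gives zero, so lower values correspond to more confident predictions.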