Paper Group ANR 1755
Shape Reconstruction by Learning Differentiable Surface Representations. Automated Detection System for Adversarial Examples with High-Frequency Noises Sieve. A Stochastic Composite Gradient Method with Incremental Variance Reduction. Learning When to Drive in Intersections by Combining Reinforcement Learning and Model Predictive Control. Rotated F …
Shape Reconstruction by Learning Differentiable Surface Representations
Title | Shape Reconstruction by Learning Differentiable Surface Representations |
Authors | Jan Bednarik, Shaifali Parashar, Erhan Gundogdu, Mathieu Salzmann, Pascal Fua |
Abstract | Generative models that produce point clouds have emerged as a powerful tool to represent 3D surfaces, and the best current ones rely on learning an ensemble of parametric representations. Unfortunately, they offer no control over the deformations of the surface patches that form the ensemble and thus fail to prevent them from either overlapping or collapsing into single points or lines. As a consequence, computing shape properties such as surface normals and curvatures becomes difficult and unreliable. In this paper, we show that we can exploit the inherent differentiability of deep networks to leverage differential surface properties during training so as to prevent patch collapse and strongly reduce patch overlap. Furthermore, this lets us reliably compute quantities such as surface normals and curvatures. We will demonstrate on several tasks that this yields more accurate surface reconstructions than the state-of-the-art methods in terms of normals estimation and amount of collapsed and overlapped patches. |
Tasks | |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.11227v1 |
https://arxiv.org/pdf/1911.11227v1.pdf | |
PWC | https://paperswithcode.com/paper/shape-reconstruction-by-learning |
Repo | |
Framework | |
Automated Detection System for Adversarial Examples with High-Frequency Noises Sieve
Title | Automated Detection System for Adversarial Examples with High-Frequency Noises Sieve |
Authors | Dang Duy Thang, Toshihiro Matsui |
Abstract | Deep neural networks are being applied in many tasks with encouraging results, and have often reached human-level performance. However, deep neural networks are vulnerable to well-designed input samples called adversarial examples. In particular, neural networks tend to misclassify adversarial examples that are imperceptible to humans. This paper introduces a new detection system that automatically detects adversarial examples on deep neural networks. Our proposed system can mostly distinguish adversarial samples and benign images in an end-to-end manner without human intervention. We exploit the important role of the frequency domain in adversarial samples and propose a method that detects malicious samples in observations. When evaluated on two standard benchmark datasets (MNIST and ImageNet), our method achieved an out-detection rate of 99.7% to 100% in many settings. |
Tasks | |
Published | 2019-08-05 |
URL | https://arxiv.org/abs/1908.01469v1 |
https://arxiv.org/pdf/1908.01469v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-detection-system-for-adversarial |
Repo | |
Framework | |
A Stochastic Composite Gradient Method with Incremental Variance Reduction
Title | A Stochastic Composite Gradient Method with Incremental Variance Reduction |
Authors | Junyu Zhang, Lin Xiao |
Abstract | We consider the problem of minimizing the composition of a smooth (nonconvex) function and a smooth vector mapping, where the inner mapping is in the form of an expectation over some random variable or a finite sum. We propose a stochastic composite gradient method that employs an incremental variance-reduced estimator for both the inner vector mapping and its Jacobian. We show that this method achieves the same orders of complexity as the best known first-order methods for minimizing expected-value and finite-sum nonconvex functions, despite the additional outer composition which renders the composite gradient estimator biased. This finding enables a much broader range of applications in machine learning to benefit from the low complexity of incremental variance-reduction methods. |
Tasks | |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.10186v1 |
https://arxiv.org/pdf/1906.10186v1.pdf | |
PWC | https://paperswithcode.com/paper/a-stochastic-composite-gradient-method-with |
Repo | |
Framework | |
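The problem setting in the abstract above can be written out as follows. This is a sketch in assumed notation ($f$, $g$, $g_\xi$, $g_i$, and $F$ are labels chosen here, not taken from the listing), intended only to show where the bias mentioned in the abstract comes from.

```latex
% Composite objective: smooth outer function f, smooth inner mapping g
\min_{x \in \mathbb{R}^d} \; F(x) = f\bigl(g(x)\bigr),
\qquad
g(x) = \mathbb{E}_{\xi}\bigl[g_{\xi}(x)\bigr]
\quad \text{or} \quad
g(x) = \frac{1}{n} \sum_{i=1}^{n} g_i(x).

% Chain rule: the gradient needs both the Jacobian g'(x) and the value g(x)
\nabla F(x) = \bigl[g'(x)\bigr]^{\top} \, \nabla f\bigl(g(x)\bigr).

% Plugging in sampled estimates \tilde{g} \approx g(x) and \tilde{J} \approx g'(x)
% is biased in general, because \nabla f is nonlinear:
\mathbb{E}\bigl[\tilde{J}^{\top} \nabla f(\tilde{g})\bigr]
\;\neq\;
\bigl[g'(x)\bigr]^{\top} \nabla f\bigl(g(x)\bigr).
```

Because the expectation sits inside $f$, an unbiased sample of $g(x)$ does not give an unbiased sample of $\nabla f(g(x))$, which is why both the mapping and its Jacobian need their own variance-reduced estimators.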
Learning When to Drive in Intersections by Combining Reinforcement Learning and Model Predictive Control
Title | Learning When to Drive in Intersections by Combining Reinforcement Learning and Model Predictive Control |
Authors | Tommy Tram, Ivo Batkovic, Mohammad Ali, Jonas Sjöberg |
Abstract | In this paper, we propose a decision-making algorithm intended for automated vehicles that negotiate with other, possibly non-automated, vehicles in intersections. The decision algorithm is separated into two parts: a high-level decision module based on reinforcement learning, and a low-level planning module based on model predictive control. Traffic is simulated with numerous predefined driver behaviors and intentions, and the performance of the proposed decision algorithm is evaluated against another controller. The results show that the proposed decision algorithm yields shorter training episodes and a higher success rate than the other controller. |
Tasks | Decision Making |
Published | 2019-08-01 |
URL | https://arxiv.org/abs/1908.00177v1 |
https://arxiv.org/pdf/1908.00177v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-when-to-drive-in-intersections-by |
Repo | |
Framework | |
Rotated Feature Network for multi-orientation object detection
Title | Rotated Feature Network for multi-orientation object detection |
Authors | Zhixin Zhang, Xudong Chen, Jie Liu, Kaibo Zhou |
Abstract | General detectors follow a pipeline in which feature maps extracted from ConvNets are shared between the classification and regression tasks. However, multi-orientation object detection imposes obviously conflicting requirements: classification is insensitive to orientation, while regression is quite sensitive to it. To address this issue, we provide an Encoder-Decoder architecture, called Rotated Feature Network (RFN), which produces rotation-sensitive feature maps (RS) for regression and rotation-invariant feature maps (RI) for classification. Specifically, the Encoder unit assigns weights for rotated feature maps. The Decoder unit extracts RS and RI by applying a resuming operator to the rotated and reweighted feature maps, respectively. To make the rotation-invariant characteristics more reliable, we adopt a metric to quantitatively evaluate the rotation-invariance by adding a constraint term to the loss, yielding a promising detection performance. Compared with the state-of-the-art methods, our method achieves significant improvement on the NWPU VHR-10 and RSOD datasets. We further evaluate the RFN on scene classification in remote sensing images and object detection in natural images, demonstrating its good generalization ability. The proposed RFN can be integrated into an existing framework, leading to great performance with only a slight increase in model complexity. |
Tasks | Object Detection, Scene Classification |
Published | 2019-03-23 |
URL | http://arxiv.org/abs/1903.09839v2 |
http://arxiv.org/pdf/1903.09839v2.pdf | |
PWC | https://paperswithcode.com/paper/rotated-feature-network-for-multi-orientation |
Repo | |
Framework | |
Learning to Adapt Invariance in Memory for Person Re-identification
Title | Learning to Adapt Invariance in Memory for Person Re-identification |
Authors | Zhun Zhong, Liang Zheng, Zhiming Luo, Shaozi Li, Yi Yang |
Abstract | This work considers the problem of unsupervised domain adaptation in person re-identification (re-ID), which aims to transfer knowledge from the source domain to the target domain. Existing methods primarily aim to reduce the inter-domain shift between the domains, but they usually overlook the relations among target samples. This paper investigates the intra-domain variations of the target domain and proposes a novel adaptation framework w.r.t. three types of underlying invariance, i.e., Exemplar-Invariance, Camera-Invariance, and Neighborhood-Invariance. Specifically, an exemplar memory is introduced to store features of samples, which can effectively and efficiently enforce the invariance constraints over the global dataset. We further present the Graph-based Positive Prediction (GPP) method to explore reliable neighbors for the target domain, which is built upon the memory and is trained on the source samples. Experiments demonstrate that 1) the three invariance properties are indispensable for effective domain adaptation, 2) the memory plays a key role in implementing invariance learning and improves the performance with limited extra computation cost, 3) GPP facilitates invariance learning and thus significantly improves the results, and 4) our approach produces new state-of-the-art adaptation accuracy on three large-scale re-ID benchmarks. |
Tasks | Domain Adaptation, Person Re-Identification, Unsupervised Domain Adaptation |
Published | 2019-08-01 |
URL | https://arxiv.org/abs/1908.00485v1 |
https://arxiv.org/pdf/1908.00485v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-adapt-invariance-in-memory-for |
Repo | |
Framework | |
An Efficient Approach for Super and Nested Term Indexing and Retrieval
Title | An Efficient Approach for Super and Nested Term Indexing and Retrieval |
Authors | Md Faisal Mahbub Chowdhury, Robert Farrell |
Abstract | This paper describes a new approach, called Terminological Bucket Indexing (TBI), for efficient indexing and retrieval of both nested and super terms using a single method. We propose a hybrid data structure that facilitates faster index building. We evaluate our approach against widely used existing approaches on several publicly available datasets. Compared to Trie-based approaches, TBI provides comparable performance on nested term retrieval and far superior performance on super term retrieval. Compared to a traditional hash table, TBI needs 80% less time for indexing. |
Tasks | |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09761v1 |
https://arxiv.org/pdf/1905.09761v1.pdf | |
PWC | https://paperswithcode.com/paper/an-efficient-approach-for-super-and-nested |
Repo | |
Framework | |
Radar Classification of Contiguous Activities of Daily Living
Title | Radar Classification of Contiguous Activities of Daily Living |
Authors | Ronny Gerhard Guendel |
Abstract | We consider radar classifications of Activities of Daily Living (ADL), which can prove beneficial in fall detection, analysis of daily routines, and discerning physical and cognitive human conditions. We focus on contiguous motion classifications which follow and are commensurate with the human ethogram of possible motion sequences. Contiguous motions can be closely connected with no clear time gap separations. In the proposed motion classification approach, we utilize the Radon transform applied to the radar range-map to detect the translation motion, whereas an energy detector is used to provide the onset and offset times of in-place motions, such as sitting down and standing up. It is shown that motion classifications give different results when performed forward and backward in time. The number of classes considered by a classifier, and thereby the classification rates, is made variable depending on the current motion state and the possible transitioning activities in and out of the state. Motion examples are provided to delineate the performance of the proposed approach under typical sequences of human motions. |
Tasks | |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/2001.01556v2 |
https://arxiv.org/pdf/2001.01556v2.pdf | |
PWC | https://paperswithcode.com/paper/radar-classification-of-contiguous-activities |
Repo | |
Framework | |
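The energy-detector step mentioned in the abstract above can be illustrated with a minimal sketch. The function name, the sliding-window size, and the fixed threshold are all assumptions made for illustration; the listing does not describe the detector the paper actually uses.

```python
def detect_motion_intervals(signal, window, threshold):
    """Return (onset, offset) index pairs where short-time energy exceeds threshold.

    Hypothetical helper: indices refer to the start of each sliding window.
    """
    if window <= 0:
        raise ValueError("window must be positive")

    # Short-time energy: mean of squared samples over each sliding window.
    energies = []
    for start in range(0, len(signal) - window + 1):
        frame = signal[start:start + window]
        energies.append(sum(s * s for s in frame) / window)

    intervals = []
    onset = None
    for i, energy in enumerate(energies):
        if energy > threshold and onset is None:
            onset = i                      # energy rises above threshold
        elif energy <= threshold and onset is not None:
            intervals.append((onset, i))   # energy falls back below threshold
            onset = None
    if onset is not None:
        # Motion still active at the end of the recording.
        intervals.append((onset, len(energies)))
    return intervals


# Toy signal: silence, a burst of activity, silence again.
toy = [0.0] * 10 + [1.0] * 10 + [0.0] * 10
print(detect_motion_intervals(toy, window=2, threshold=0.25))
```

A fixed threshold keeps the sketch short; a real radar pipeline would more plausibly adapt it to the noise floor.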
TRB: A Novel Triplet Representation for Understanding 2D Human Body
Title | TRB: A Novel Triplet Representation for Understanding 2D Human Body |
Authors | Haodong Duan, KwanYee Lin, Sheng Jin, Wentao Liu, Chen Qian, Wanli Ouyang |
Abstract | Human pose and shape are two important components of the 2D human body. However, how to efficiently represent both of them in images is still an open question. In this paper, we propose the Triplet Representation for Body (TRB) – a compact 2D human body representation, with skeleton keypoints capturing human pose information and contour keypoints containing human shape information. TRB not only preserves the flexibility of the skeleton keypoint representation, but also contains rich pose and human shape information. Therefore, it promises broader application areas, such as human shape editing and conditional image generation. We further introduce the challenging problem of TRB estimation, where joint learning of human pose and shape is required. We construct several large-scale TRB estimation datasets, based on popular 2D pose datasets: LSP, MPII, COCO. To effectively solve TRB estimation, we propose a two-branch network (TRB-net) with three novel techniques, namely X-structure (Xs), Directional Convolution (DC) and Pairwise Mapping (PM), to enforce multi-level message passing for joint feature learning. We evaluate our proposed TRB-net and several leading approaches on our proposed TRB datasets, and demonstrate the superiority of our method through extensive evaluations. |
Tasks | Conditional Image Generation, Image Generation |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11535v1 |
https://arxiv.org/pdf/1910.11535v1.pdf | |
PWC | https://paperswithcode.com/paper/trb-a-novel-triplet-representation-for-1 |
Repo | |
Framework | |
Disentangling the Spatial Structure and Style in Conditional VAE
Title | Disentangling the Spatial Structure and Style in Conditional VAE |
Authors | Ziye Zhang, Li Sun, Zhilin Zheng, Qingli Li |
Abstract | This paper aims to disentangle the latent space in cVAE into the spatial structure and the style code, which are complementary to each other, with one of them, $z_s$, being label-relevant and the other, $z_u$, label-irrelevant. The generator is built by a connected encoder-decoder and a label condition mapping network. Depending on whether the label is related to the spatial structure, the output $z_s$ from the condition mapping network is used either as a style code or a spatial structure code. The encoder provides the label-irrelevant posterior from which $z_u$ is sampled. The decoder employs $z_s$ and $z_u$ in each layer by adaptive normalization like SPADE or AdaIN. Extensive experiments on two datasets with different types of labels show the effectiveness of our method. |
Tasks | |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13062v1 |
https://arxiv.org/pdf/1910.13062v1.pdf | |
PWC | https://paperswithcode.com/paper/disentangling-the-spatial-structure-and-style |
Repo | |
Framework | |
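The abstract above names AdaIN as one of the adaptive normalization options. As a rough illustration of what that operation does to a single feature channel, here is a minimal sketch. It follows the standard AdaIN form, `style_std * (x - mean(x)) / std(x) + style_mean`; the function name is hypothetical, and how the scale and shift would be derived from $z_s$ or $z_u$ is not stated in the listing and is left out.

```python
import math


def adain_channel(content, style_mean, style_std, eps=1e-5):
    """Adaptive instance normalization for one channel (flat list of activations).

    The channel is normalized to zero mean and unit variance, then rescaled
    and shifted with externally supplied style statistics.
    """
    n = len(content)
    mean = sum(content) / n
    var = sum((x - mean) ** 2 for x in content) / n
    std = math.sqrt(var + eps)  # eps avoids division by zero on flat channels
    return [style_std * (x - mean) / std + style_mean for x in content]


# The output channel takes on the requested mean and (approximately) std.
out = adain_channel([1.0, 2.0, 3.0, 4.0], style_mean=10.0, style_std=2.0)
print(out)
```

SPADE differs mainly in that the scale and shift vary per spatial location rather than per channel, which is what makes it suitable for a spatial structure code.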
Artificial Intelligence-Based Image Classification for Diagnosis of Skin Cancer: Challenges and Opportunities
Title | Artificial Intelligence-Based Image Classification for Diagnosis of Skin Cancer: Challenges and Opportunities |
Authors | Manu Goyal, Thomas Knackstedt, Shaofeng Yan, Amanda Oakley, Saeed Hassanpour |
Abstract | Recently, there has been great interest in developing Artificial Intelligence (AI) enabled computer-aided diagnostics solutions for the diagnosis of skin cancer. With the increasing incidence of skin cancers, low awareness among a growing population, and a lack of adequate clinical expertise and services, there is an immediate need for AI systems to assist clinicians in this domain. A large number of skin lesion datasets are available publicly, and researchers have developed AI-based image classification solutions, particularly deep learning algorithms, to distinguish malignant skin lesions from benign lesions in different image modalities such as dermoscopic, clinical, and histopathology images. Despite the various claims of AI systems achieving higher accuracy than dermatologists in the classification of different skin lesions, these AI systems are still in the very early stages of clinical application in terms of being ready to aid clinicians in the diagnosis of skin cancers. In this review, we discuss advancements in the digital image-based AI solutions for the diagnosis of skin cancer, along with some challenges and future opportunities to improve these AI systems to support dermatologists and enhance their ability to diagnose skin cancer. |
Tasks | Image Classification |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11872v2 |
https://arxiv.org/pdf/1911.11872v2.pdf | |
PWC | https://paperswithcode.com/paper/artificial-intelligence-for-diagnosis-of-skin |
Repo | |
Framework | |
ARSM Gradient Estimator for Supervised Learning to Rank
Title | ARSM Gradient Estimator for Supervised Learning to Rank |
Authors | Siamak Zamani Dadaneh, Shahin Boluki, Mingyuan Zhou, Xiaoning Qian |
Abstract | We propose a new model for supervised learning to rank. In our model, the relevance labels are assumed to follow a categorical distribution whose probabilities are constructed based on a scoring function. We optimize the training objective with respect to the multivariate categorical variables with an unbiased and low-variance gradient estimator. Learning-to-rank methods can generally be categorized into pointwise, pairwise, and listwise approaches. Although our scoring function is pointwise, the proposed framework permits flexibility over the choice of the loss function. In our new model, the loss function need not be differentiable and can either be pointwise or listwise. Our proposed method achieves better or comparable results on two datasets compared with existing pairwise and listwise methods. |
Tasks | Learning-To-Rank |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00465v2 |
https://arxiv.org/pdf/1911.00465v2.pdf | |
PWC | https://paperswithcode.com/paper/arsm-gradient-estimator-for-supervised |
Repo | |
Framework | |
Personalized Context-Aware Multi-Modal Transportation Recommendation
Title | Personalized Context-Aware Multi-Modal Transportation Recommendation |
Authors | Meixin Zhu, Jingyun Hu, Hao Yang, Ziyuan Pu, Yinhai Wang |
Abstract | This study proposes to find the most appropriate transport modes with awareness of user preferences (e.g., costs, times) and trip characteristics (e.g., purpose, distance). The work was based on real-life trips obtained from a map application. Several methods were tried, including gradient boosting tree, learning to rank, multinomial logit model, automated machine learning, random forest, and shallow neural network. For some methods, feature selection and over-sampling techniques were also tried. The results show that the best-performing method is a gradient boosting tree model with the synthetic minority over-sampling technique (SMOTE). Also, results of the multinomial logit model show that (1) an increase in travel cost would decrease the utility of all the transportation modes; (2) people are less sensitive to the travel distance for the metro mode or a multi-modal option that contains metro, i.e., compared to other modes, people would be more willing to tolerate long-distance metro trips. This indicates that metro lines might be a good candidate for large cities. |
Tasks | Feature Selection, Learning-To-Rank |
Published | 2019-10-13 |
URL | https://arxiv.org/abs/1910.12601v1 |
https://arxiv.org/pdf/1910.12601v1.pdf | |
PWC | https://paperswithcode.com/paper/personalized-context-aware-multi-modal |
Repo | |
Framework | |
Online Continuous Submodular Maximization: From Full-Information to Bandit Feedback
Title | Online Continuous Submodular Maximization: From Full-Information to Bandit Feedback |
Authors | Mingrui Zhang, Lin Chen, Hamed Hassani, Amin Karbasi |
Abstract | In this paper, we propose three online algorithms for submodular maximization. The first one, Mono-Frank-Wolfe, reduces the number of per-function gradient evaluations from $T^{1/2}$ [Chen2018Online] and $T^{3/2}$ [chen2018projection] to 1, and achieves a $(1-1/e)$-regret bound of $O(T^{4/5})$. The second one, Bandit-Frank-Wolfe, is the first bandit algorithm for continuous DR-submodular maximization, which achieves a $(1-1/e)$-regret bound of $O(T^{8/9})$. Finally, we extend Bandit-Frank-Wolfe to a bandit algorithm for discrete submodular maximization, Responsive-Frank-Wolfe, which attains a $(1-1/e)$-regret bound of $O(T^{8/9})$ in the responsive bandit setting. |
Tasks | |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12424v1 |
https://arxiv.org/pdf/1910.12424v1.pdf | |
PWC | https://paperswithcode.com/paper/online-continuous-submodular-maximization-1 |
Repo | |
Framework | |
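For readers unfamiliar with the $(1-1/e)$-regret used in the abstract above, the usual definition of $\alpha$-regret with $\alpha = 1 - 1/e$ is sketched below. The symbols ($f_t$ for the reward function at round $t$, $x_t$ for the point played, $\mathcal{K}$ for the constraint set) are assumed notation, not taken from the listing.

```latex
% (1 - 1/e)-regret after T rounds: compare against a (1 - 1/e) fraction
% of the best fixed decision in hindsight
\mathcal{R}_T
  = \Bigl(1 - \frac{1}{e}\Bigr) \max_{x \in \mathcal{K}} \sum_{t=1}^{T} f_t(x)
    \;-\; \sum_{t=1}^{T} f_t(x_t).

% A bound such as O(T^{4/5}) is sublinear in T, so the average regret vanishes:
\frac{\mathcal{R}_T}{T} = O\bigl(T^{-1/5}\bigr) \;\longrightarrow\; 0
\quad \text{as } T \to \infty.
```

The benchmark is discounted by $1 - 1/e$ because that factor is the best approximation ratio achievable in polynomial time for monotone submodular maximization in general; in the bandit setting the regret is typically stated in expectation.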
Unsupervised Domain Adaptation via Calibrating Uncertainties
Title | Unsupervised Domain Adaptation via Calibrating Uncertainties |
Authors | Ligong Han, Yang Zou, Ruijiang Gao, Lezi Wang, Dimitris Metaxas |
Abstract | Unsupervised domain adaptation (UDA) aims at inferring class labels for an unlabeled target domain given a related labeled source dataset. Intuitively, a model trained on the source domain normally produces higher uncertainties for unseen data. In this work, we build on this assumption and propose to adapt from the source to the target domain by calibrating their predictive uncertainties. The uncertainty is quantified as the Renyi entropy, from which we propose a general Renyi entropy regularization (RER) framework. We further employ variational Bayes learning for reliable uncertainty estimation. In addition, calibrating the sample variance of network parameters serves as a plug-in regularizer for training. We discuss the theoretical properties of the proposed method and demonstrate its effectiveness on three domain-adaptation tasks. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11202v1 |
https://arxiv.org/pdf/1907.11202v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-domain-adaptation-via |
Repo | |
Framework | |
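The abstract above quantifies predictive uncertainty with the Renyi entropy. A minimal sketch of that quantity for a discrete predictive distribution follows, using the standard definition $H_\alpha(p) = \frac{1}{1-\alpha} \log \sum_i p_i^\alpha$, with the Shannon entropy as the $\alpha \to 1$ limit. The function name is hypothetical, the sketch restricts itself to $\alpha > 0$, and how the paper turns this into a regularizer is not described in the listing.

```python
import math


def renyi_entropy(probs, alpha):
    """Renyi entropy (in nats) of a discrete distribution, for alpha > 0."""
    if alpha <= 0:
        raise ValueError("this sketch only handles alpha > 0")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")

    # alpha -> 1 recovers the Shannon entropy.
    if abs(alpha - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)

    power_sum = sum(p ** alpha for p in probs if p > 0)
    return math.log(power_sum) / (1.0 - alpha)


# A confident prediction has lower entropy than an uncertain one.
confident = renyi_entropy([0.9, 0.1], alpha=2)
uncertain = renyi_entropy([0.5, 0.5], alpha=2)
print(confident, uncertain)
```

For a uniform distribution over `k` classes the value is `log(k)` for every `alpha`, which gives a convenient sanity check.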