October 17, 2019


Paper Group ANR 708

Robust 6D Object Pose Estimation with Stochastic Congruent Sets. Deep Learning for Single Image Super-Resolution: A Brief Review. Generalized Binary Search For Split-Neighborly Problems. A Deep Structure of Person Re-Identification using Multi-Level Gaussian Models. 6D Pose Estimation using an Improved Method based on Point Pair Features. An ADMM-B …

Robust 6D Object Pose Estimation with Stochastic Congruent Sets

Title Robust 6D Object Pose Estimation with Stochastic Congruent Sets
Authors Chaitanya Mitash, Abdeslam Boularias, Kostas Bekris
Abstract Object pose estimation is frequently achieved by first segmenting an RGB image and then, given depth data, registering the corresponding point cloud segment against the object’s 3D model. Despite the progress due to CNNs, semantic segmentation output can be noisy, especially when the CNN is only trained on synthetic data. This causes registration methods to fail in estimating a good object pose. This work proposes a novel stochastic optimization process that treats the segmentation output of CNNs as a confidence probability. The algorithm, called Stochastic Congruent Sets (StoCS), samples pointsets on the point cloud according to the soft segmentation distribution and so as to agree with the object’s known geometry. The pointsets are then matched to congruent sets on the 3D object model to generate pose estimates. StoCS is shown to be robust on an APC dataset, despite the fact the CNN is trained only on synthetic data. In the YCB dataset, StoCS outperforms a recent network for 6D pose estimation and alternative pointset matching techniques.
Tasks 6D Pose Estimation, 6D Pose Estimation using RGB, Pose Estimation, Semantic Segmentation, Stochastic Optimization
Published 2018-05-16
URL http://arxiv.org/abs/1805.06324v1
PDF http://arxiv.org/pdf/1805.06324v1.pdf
PWC https://paperswithcode.com/paper/robust-6d-object-pose-estimation-with
Repo
Framework

Deep Learning for Single Image Super-Resolution: A Brief Review

Title Deep Learning for Single Image Super-Resolution: A Brief Review
Authors Wenming Yang, Xuechen Zhang, Yapeng Tian, Wei Wang, Jing-Hao Xue
Abstract Single image super-resolution (SISR) is a notoriously challenging ill-posed problem, which aims to obtain a high-resolution (HR) output from one of its low-resolution (LR) versions. Powerful deep learning algorithms have recently been applied to the SISR problem and have achieved state-of-the-art performance. In this survey, we review representative deep learning-based SISR methods, and group them into two categories according to their major contributions to two essential aspects of SISR: the exploration of efficient neural network architectures for SISR, and the development of effective optimization objectives for deep SISR learning. For each category, a baseline is first established and several critical limitations of the baseline are summarized. Then representative works on overcoming these limitations are presented based on their original contents as well as our critical understandings and analyses, and relevant comparisons are conducted from a variety of perspectives. Finally, we conclude this review with some vital current challenges and future trends in SISR leveraging deep learning algorithms.
Tasks Image Super-Resolution, Super-Resolution
Published 2018-08-09
URL https://arxiv.org/abs/1808.03344v3
PDF https://arxiv.org/pdf/1808.03344v3.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-single-image-super
Repo
Framework

Generalized Binary Search For Split-Neighborly Problems

Title Generalized Binary Search For Split-Neighborly Problems
Authors Stephen Mussmann, Percy Liang
Abstract In sequential hypothesis testing, Generalized Binary Search (GBS) greedily chooses the test with the highest information gain at each step. It is known that GBS obtains the gold standard query cost of $O(\log n)$ for problems satisfying the $k$-neighborly condition, which requires any two tests to be connected by a sequence of tests where neighboring tests disagree on at most $k$ hypotheses. In this paper, we introduce a weaker condition, split-neighborly, which requires that for the set of hypotheses two neighbors disagree on, any subset is splittable by some test. For four problems that are not $k$-neighborly for any constant $k$, we prove that they are split-neighborly, which allows us to obtain the optimal $O(\log n)$ worst-case query cost.
Tasks
Published 2018-02-27
URL http://arxiv.org/abs/1802.09751v1
PDF http://arxiv.org/pdf/1802.09751v1.pdf
PWC https://paperswithcode.com/paper/generalized-binary-search-for-split
Repo
Framework
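
The greedy rule described in the abstract can be illustrated with a minimal sketch. It assumes a uniform prior over hypotheses and deterministic binary tests, in which case the test with the highest information gain is the one that splits the remaining hypotheses most evenly. The names `gbs_identify`, `tests` and `oracle` are hypothetical and chosen for illustration; this is not the authors' code.

```python
from typing import Callable, Dict, Hashable, List, Set, Tuple


def gbs_identify(
    hypotheses: List[Hashable],
    tests: Dict[Hashable, Dict[Hashable, bool]],
    oracle: Callable[[Hashable], bool],
) -> Tuple[Set[Hashable], List[Hashable]]:
    """Greedy Generalized Binary Search under a uniform prior.

    tests[t][h] is the outcome test t would produce if h were the true
    hypothesis; oracle(t) returns the outcome actually observed.
    """
    version_space = set(hypotheses)
    asked: List[Hashable] = []
    while len(version_space) > 1:
        def imbalance(t: Hashable) -> int:
            # 0 means a perfect half/half split of the version space.
            positives = sum(1 for h in version_space if tests[t][h])
            return abs(2 * positives - len(version_space))

        best = min(sorted(tests), key=imbalance)
        if imbalance(best) == len(version_space):
            break  # no test distinguishes the remaining hypotheses
        answer = oracle(best)
        version_space = {h for h in version_space if tests[best][h] == answer}
        asked.append(best)
    return version_space, asked


# Toy problem: hypotheses 0..7 and threshold tests "is h >= k?".
hypotheses = list(range(8))
tests = {k: {h: h >= k for h in hypotheses} for k in range(1, 8)}
truth = 5
remaining, asked = gbs_identify(hypotheses, tests, lambda k: truth >= k)
print(remaining, asked)
```

On this toy problem every query halves the version space, so eight hypotheses are resolved in three queries. The split-neighborly condition in the paper concerns problem classes where such even splits are not guaranteed at every step.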

A Deep Structure of Person Re-Identification using Multi-Level Gaussian Models

Title A Deep Structure of Person Re-Identification using Multi-Level Gaussian Models
Authors Dinesh Kumar Vishwakarma, Sakshi Upadhyay
Abstract Person re-identification is widely used in forensic, security, and surveillance systems, but it remains a challenging task in real-life scenarios. In this work, a new feature descriptor model is proposed using a multilayer framework of Gaussian distribution models on pixel features, which include color moments, color space values and Schmid filter responses. An image of a person usually consists of distinct body regions, usually with differentiable clothing followed by local colors and texture patterns. Thus, the image is evaluated locally by dividing it into overlapping regions. Each region is further fragmented into a set of local Gaussians on small patches. A global Gaussian encodes these local Gaussians for each region, creating a multi-level structure. Hence, the global picture of a person is described by the local-level information present in it, which is often ignored. We also analyze the efficiency of earlier metric learning methods on this descriptor. The performance of the descriptor is evaluated on four publicly available challenging datasets, and the highest accuracies achieved on these datasets are compared with similar state-of-the-art methods, demonstrating superior performance.
Tasks Metric Learning, Person Re-Identification
Published 2018-05-20
URL http://arxiv.org/abs/1805.07720v1
PDF http://arxiv.org/pdf/1805.07720v1.pdf
PWC https://paperswithcode.com/paper/a-deep-structure-of-person-re-identification
Repo
Framework

6D Pose Estimation using an Improved Method based on Point Pair Features

Title 6D Pose Estimation using an Improved Method based on Point Pair Features
Authors Joel Vidal, Chyi-Yeu Lin, Robert Martí
Abstract The Point Pair Feature (Drost et al. 2010) has been one of the most successful 6D pose estimation methods among model-based approaches, as an efficient, integrated and compromise alternative to the traditional local and global pipelines. In recent years, several variations of the algorithm have been proposed. Among these extensions, the solution introduced by Hinterstoisser et al. (2016) is a major contribution. This work presents a variation of this PPF method applied to the SIXD Challenge datasets presented at the 3rd International Workshop on Recovering 6D Object Pose held at ICCV 2017. We report an average recall of 0.77 for all datasets and overall recall of 0.82, 0.67, 0.85, 0.37, 0.97 and 0.96 for hinterstoisser, tless, tudlight, rutgers, tejani and doumanoglou datasets, respectively.
Tasks 6D Pose Estimation, 6D Pose Estimation using RGB, Pose Estimation
Published 2018-02-23
URL http://arxiv.org/abs/1802.08516v1
PDF http://arxiv.org/pdf/1802.08516v1.pdf
PWC https://paperswithcode.com/paper/6d-pose-estimation-using-an-improved-method
Repo
Framework
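
For readers unfamiliar with the underlying descriptor, the following sketch computes the basic point pair feature as it is commonly stated for Drost et al. (2010): for two oriented points, the distance between them and three angles involving their normals. It covers only the raw feature, not the quantization, hashing or voting stages, and not the improvements proposed in this paper. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec3 = Sequence[float]


def _angle(u: Vec3, v: Vec3) -> float:
    """Angle in radians between two 3D vectors, in [0, pi]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # degenerate input; the angle is undefined
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def point_pair_feature(
    p1: Vec3, n1: Vec3, p2: Vec3, n2: Vec3
) -> Tuple[float, float, float, float]:
    """Return (||d||, angle(n1, d), angle(n2, d), angle(n1, n2)), d = p2 - p1."""
    d = tuple(b - a for a, b in zip(p1, p2))
    distance = math.sqrt(sum(x * x for x in d))
    return (distance, _angle(n1, d), _angle(n2, d), _angle(n1, n2))


# Two points one unit apart on a plane whose normal is +z.
feature = point_pair_feature((0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1))
print(feature)
```

Because the feature depends only on relative distances and angles, it is unchanged by rigid motions of the pair, which is what allows scene pairs to be matched against model pairs.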

An ADMM-Based Universal Framework for Adversarial Attacks on Deep Neural Networks

Title An ADMM-Based Universal Framework for Adversarial Attacks on Deep Neural Networks
Authors Pu Zhao, Sijia Liu, Yanzhi Wang, Xue Lin
Abstract Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks. That is, adversarial examples, obtained by adding delicately crafted distortions onto original legal inputs, can mislead a DNN to classify them as any target labels. In a successful adversarial attack, the targeted misclassification should be achieved with the minimal distortion added. In the literature, the added distortions are usually measured by L0, L1, L2, and L infinity norms, giving L0, L1, L2, and L infinity attacks, respectively. However, a versatile framework for all types of adversarial attacks is lacking. This work for the first time unifies the methods of generating adversarial examples by leveraging ADMM (Alternating Direction Method of Multipliers), an operator splitting optimization approach, such that L0, L1, L2, and L infinity attacks can be effectively implemented by this general framework with few modifications. Compared with the state-of-the-art attacks in each category, our ADMM-based attacks are so far the strongest, achieving both the 100% attack success rate and the minimal distortion.
Tasks Adversarial Attack
Published 2018-04-09
URL http://arxiv.org/abs/1804.03193v1
PDF http://arxiv.org/pdf/1804.03193v1.pdf
PWC https://paperswithcode.com/paper/an-admm-based-universal-framework-for
Repo
Framework

An Attention-Based Approach for Single Image Super Resolution

Title An Attention-Based Approach for Single Image Super Resolution
Authors Yuan Liu, Yuancheng Wang, Nan Li, Xu Cheng, Yifeng Zhang, Yongming Huang, Guojun Lu
Abstract The main challenge of single image super resolution (SISR) is the recovery of high frequency details such as tiny textures. However, most of the state-of-the-art methods lack specific modules to identify high frequency areas, causing the output image to be blurred. We propose an attention-based approach to discriminate between texture areas and smooth areas. After the positions of high frequency details are located, high frequency compensation is carried out. This approach can be incorporated into previously proposed SISR networks. By providing high frequency enhancement, better performance and visual effect are achieved. We also propose our own SISR network composed of DenseRes blocks. The block provides an effective way to combine the low level features and high level features. Extensive benchmark evaluation shows that our proposed method achieves significant improvement over the state-of-the-art works in SISR.
Tasks Image Super-Resolution, Super-Resolution
Published 2018-07-18
URL http://arxiv.org/abs/1807.06779v1
PDF http://arxiv.org/pdf/1807.06779v1.pdf
PWC https://paperswithcode.com/paper/an-attention-based-approach-for-single-image
Repo
Framework

Soccer on Your Tabletop

Title Soccer on Your Tabletop
Authors Konstantinos Rematas, Ira Kemelmacher-Shlizerman, Brian Curless, Steve Seitz
Abstract We present a system that transforms a monocular video of a soccer game into a moving 3D reconstruction, in which the players and field can be rendered interactively with a 3D viewer or through an Augmented Reality device. At the heart of our paper is an approach to estimate the depth map of each player, using a CNN that is trained on 3D player data extracted from soccer video games. We compare with state of the art body pose and depth estimation techniques, and show results on both synthetic ground truth benchmarks, and real YouTube soccer footage.
Tasks 3D Reconstruction, Depth Estimation
Published 2018-06-03
URL http://arxiv.org/abs/1806.00890v1
PDF http://arxiv.org/pdf/1806.00890v1.pdf
PWC https://paperswithcode.com/paper/soccer-on-your-tabletop
Repo
Framework

Adaptive Importance Learning for Improving Lightweight Image Super-resolution Network

Title Adaptive Importance Learning for Improving Lightweight Image Super-resolution Network
Authors Lei Zhang, Peng Wang, Chunhua Shen, Lingqiao Liu, Wei Wei, Yanning Zhang, Anton van den Hengel
Abstract Deep neural networks have achieved remarkable success in single image super-resolution (SISR). The computing and memory requirements of these methods have hindered their application to broad classes of real devices with limited computing power, however. One approach to this problem has been lightweight network architectures that balance the super-resolution performance and the computation burden. In this study, we revisit this problem from an orthogonal view, and propose a novel learning strategy to maximize the pixel-wise fitting capacity of a given lightweight network architecture. Considering that the initial capacity of the lightweight network is very limited, we present an adaptive importance learning scheme for SISR that trains the network with an easy-to-complex paradigm by dynamically updating the importance of image pixels on the basis of the training loss. Specifically, we formulate the network training and the importance learning into a joint optimization problem. With a carefully designed importance penalty function, the importance of individual pixels can be gradually increased through solving a convex optimization problem. The training process thus begins with pixels that are easy to reconstruct, and gradually proceeds to more complex pixels as fitting improves.
Tasks Image Super-Resolution, Super-Resolution
Published 2018-06-05
URL http://arxiv.org/abs/1806.01576v1
PDF http://arxiv.org/pdf/1806.01576v1.pdf
PWC https://paperswithcode.com/paper/adaptive-importance-learning-for-improving
Repo
Framework

A Multimodal Recommender System for Large-scale Assortment Generation in E-commerce

Title A Multimodal Recommender System for Large-scale Assortment Generation in E-commerce
Authors Murium Iqbal, Adair Kovac, Kamelia Aryafar
Abstract E-commerce platforms surface interesting products largely through product recommendations that capture users’ styles and aesthetic preferences. Curating recommendations as a complete complementary set, or assortment, is critical for a successful e-commerce experience, especially for product categories such as furniture, where items are selected together with the overall theme, style or ambiance of a space in mind. In this paper, we propose two visually-aware recommender systems that can automatically curate an assortment of living room furniture around a couple of pre-selected seed pieces for the room. The first system aims to maximize the visual-based style compatibility of the entire selection by making use of transfer learning and topic modeling. The second system extends the first by incorporating text data and applying polylingual topic modeling to infer style over both modalities. We review the production pipeline for surfacing these visually-aware recommender systems and compare them through offline validations and large-scale online A/B tests on Overstock. Our experimental results show that complementary style is best discovered over product sets when both visual and textual data are incorporated.
Tasks Recommendation Systems, Transfer Learning
Published 2018-06-28
URL http://arxiv.org/abs/1806.11226v1
PDF http://arxiv.org/pdf/1806.11226v1.pdf
PWC https://paperswithcode.com/paper/a-multimodal-recommender-system-for-large
Repo
Framework

Learning multiple non-mutually-exclusive tasks for improved classification of inherently ordered labels

Title Learning multiple non-mutually-exclusive tasks for improved classification of inherently ordered labels
Authors Vadim Ratner, Yoel Shoshan, Tal Kachman
Abstract Medical image classification involves thresholding of labels that represent malignancy risk levels. Usually, a task defines a single threshold, and when developing computer-aided diagnosis tools, a single network is trained per such threshold, e.g. as screening out healthy (very low risk) patients to leave possibly sick ones for further analysis (low threshold), or trying to find malignant cases among those marked as non-risk by the radiologist (“second reading”, high threshold). We propose a way to rephrase the classification problem in a manner that yields several problems (corresponding to different thresholds) to be solved simultaneously. This allows the use of Multiple Task Learning (MTL) methods, significantly improving the performance of the original classifier, by facilitating effective extraction of information from existing data.
Tasks Image Classification
Published 2018-05-30
URL http://arxiv.org/abs/1805.11837v2
PDF http://arxiv.org/pdf/1805.11837v2.pdf
PWC https://paperswithcode.com/paper/learning-multiple-non-mutually-exclusive
Repo
Framework
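
One common way to turn an ordered label into several simultaneous threshold problems is a cumulative binary encoding, sketched below. This is an assumption made for illustration: the abstract does not specify the encoding the authors use, and the names `threshold_targets` and `decode_level` are hypothetical.

```python
from typing import List, Sequence


def threshold_targets(label: int, num_levels: int) -> List[int]:
    """Encode an ordered label in {0, ..., num_levels - 1} as one binary
    target per threshold k, where target k answers "is label > k?"."""
    if not 0 <= label < num_levels:
        raise ValueError("label out of range")
    return [1 if label > k else 0 for k in range(num_levels - 1)]


def decode_level(probabilities: Sequence[float], cutoff: float = 0.5) -> int:
    """Recover a risk level by counting the thresholds predicted as exceeded."""
    return sum(1 for p in probabilities if p >= cutoff)


# A four-level risk scale yields three binary tasks per image.
print(threshold_targets(2, 4))
print(decode_level([0.9, 0.7, 0.2]))
```

Each position in the encoded vector can then be treated as a separate task by a multi-task network with a shared feature extractor, so that every training image contributes a label to every threshold.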

Cost-Aware Learning for Improved Identifiability with Multiple Experiments

Title Cost-Aware Learning for Improved Identifiability with Multiple Experiments
Authors Longyun Guo, Jean Honorio, John Morgan
Abstract We analyze the sample complexity of learning from multiple experiments where the experimenter has a total budget for obtaining samples. In this problem, the learner should choose a hypothesis that performs well with respect to multiple experiments and their related data distributions. Each collected sample is associated with a cost that depends on the particular experiment. In our setup, a learner performs $m$ experiments, while incurring a total cost $C$. We first show that learning from multiple experiments improves identifiability. Additionally, by using a Rademacher complexity approach, we show that the gap between the training and generalization error is $O(C^{-1/2})$. We also provide some examples for linear prediction, two-layer neural networks and kernel methods.
Tasks
Published 2018-02-12
URL https://arxiv.org/abs/1802.04350v5
PDF https://arxiv.org/pdf/1802.04350v5.pdf
PWC https://paperswithcode.com/paper/cost-aware-learning-for-improved
Repo
Framework

SALSA-TEXT : self attentive latent space based adversarial text generation

Title SALSA-TEXT : self attentive latent space based adversarial text generation
Authors Jules Gagnon-Marchand, Hamed Sadeghi, Md. Akmal Haidar, Mehdi Rezagholizadeh
Abstract Inspired by the success of self attention mechanism and Transformer architecture in sequence transduction and image generation applications, we propose novel self attention-based architectures to improve the performance of adversarial latent code-based schemes in text generation. Adversarial latent code-based text generation has recently gained a lot of attention due to their promising results. In this paper, we take a step to fortify the architectures used in these setups, specifically AAE and ARAE. We benchmark two latent code-based methods (AAE and ARAE) designed based on adversarial setups. In our experiments, the Google sentence compression dataset is utilized to compare our method with these methods using various objective and subjective measures. The experiments demonstrate the proposed (self) attention-based models outperform the state-of-the-art in adversarial code-based text generation.
Tasks Adversarial Text, Image Generation, Sentence Compression, Text Generation
Published 2018-09-28
URL http://arxiv.org/abs/1809.11155v2
PDF http://arxiv.org/pdf/1809.11155v2.pdf
PWC https://paperswithcode.com/paper/salsa-text-self-attentive-latent-space-based
Repo
Framework

Semi-supervised learning for structured regression on partially observed attributed graphs

Title Semi-supervised learning for structured regression on partially observed attributed graphs
Authors Jelena Stojanovic, Milos Jovanovic, Djordje Gligorijevic, Zoran Obradovic
Abstract Conditional probabilistic graphical models provide a powerful framework for structured regression in spatio-temporal datasets with complex correlation patterns. However, in real-life applications a large fraction of observations is often missing, which can severely limit the representational power of these models. In this paper we propose a Marginalized Gaussian Conditional Random Fields (m-GCRF) structured regression model for dealing with missing labels in partially observed temporal attributed graphs. This method is aimed at learning with both labeled and unlabeled parts and effectively predicting future values in a graph. The method is even capable of learning from nodes for which the response variable is never observed in history, which poses problems for many state-of-the-art models that can handle missing data. The proposed model is characterized for various missingness mechanisms on 500 synthetic graphs. The benefits of the new method are also demonstrated on a challenging application for predicting precipitation based on partial observations of climate variables in a temporal graph that spans the entire continental US. We also show that the method can be useful for optimizing the costs of data collection in climate applications via active reduction of the number of weather stations to consider. In experiments on these real-world and synthetic datasets we show that the proposed model is consistently more accurate than alternative semi-supervised structured models, as well as models that either use imputation to deal with missing values or simply ignore them altogether.
Tasks Imputation
Published 2018-03-28
URL http://arxiv.org/abs/1803.10705v1
PDF http://arxiv.org/pdf/1803.10705v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-learning-for-structured
Repo
Framework

Uncertainty propagation in neural networks for sparse coding

Title Uncertainty propagation in neural networks for sparse coding
Authors Danil Kuzin, Olga Isupova, Lyudmila Mihaylova
Abstract A novel method to propagate uncertainty through the soft-thresholding nonlinearity is proposed in this paper. At every layer the current distribution of the target vector is represented as a spike and slab distribution, which represents the probabilities of each variable being zero, or Gaussian-distributed. Using the proposed method of uncertainty propagation, the gradients of the logarithms of normalisation constants are derived, that can be used to update a weight distribution. A novel Bayesian neural network for sparse coding is designed utilising both the proposed method of uncertainty propagation and Bayesian inference algorithm.
Tasks Bayesian Inference
Published 2018-11-29
URL http://arxiv.org/abs/1811.12465v1
PDF http://arxiv.org/pdf/1811.12465v1.pdf
PWC https://paperswithcode.com/paper/uncertainty-propagation-in-neural-networks
Repo
Framework
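
The paper derives the propagated distribution analytically. As a rough illustration of the quantities involved, the sketch below defines the standard soft-thresholding operator and uses Monte Carlo sampling (an assumption made here in place of the paper's closed-form derivation) to estimate how a spike and slab input is transformed. The function names and parameter values are hypothetical.

```python
import random


def soft_threshold(x: float, lam: float) -> float:
    """Standard soft-thresholding: shrink x toward zero by lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def sample_spike_and_slab(
    rng: random.Random, spike_prob: float, mean: float, std: float
) -> float:
    """Draw exactly zero with probability spike_prob, else a Gaussian value."""
    if rng.random() < spike_prob:
        return 0.0
    return rng.gauss(mean, std)


def estimate_zero_probability(
    spike_prob: float, mean: float, std: float, lam: float,
    num_samples: int = 10000, seed: int = 0,
) -> float:
    """Monte Carlo estimate of P(output == 0) after soft-thresholding."""
    rng = random.Random(seed)
    zeros = 0
    for _ in range(num_samples):
        x = sample_spike_and_slab(rng, spike_prob, mean, std)
        if soft_threshold(x, lam) == 0.0:
            zeros += 1
    return zeros / num_samples


estimate = estimate_zero_probability(spike_prob=0.3, mean=0.0, std=1.0, lam=1.0)
print(estimate)
```

The output is again a mixture of a point mass at zero and a continuous part, with the spike probability increased by the Gaussian mass that falls inside the interval from -lam to lam. For the parameters above that is roughly 0.3 + 0.7 × 0.683 ≈ 0.78, which the estimate should approximate.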