January 25, 2020

3035 words 15 mins read

Paper Group ANR 1622

Fingerspelling recognition in the wild with iterative visual attention. Robust Online Multi-target Visual Tracking using a HISP Filter with Discriminative Deep Appearance Learning. Action-Centered Information Retrieval. Multiscale Nakagami parametric imaging for improved liver tumor localization. General risk measures for robust machine learning. I …

Fingerspelling recognition in the wild with iterative visual attention

Title Fingerspelling recognition in the wild with iterative visual attention
Authors Bowen Shi, Aurora Martinez Del Rio, Jonathan Keane, Diane Brentari, Greg Shakhnarovich, Karen Livescu
Abstract Sign language recognition is a challenging gesture sequence recognition problem, characterized by quick and highly coarticulated motion. In this paper we focus on recognition of fingerspelling sequences in American Sign Language (ASL) videos collected in the wild, mainly from YouTube and Deaf social media. Most previous work on sign language recognition has focused on controlled settings where the data is recorded in a studio environment and the number of signers is limited. Our work aims to address the challenges of real-life data, reducing the need for detection or segmentation modules commonly used in this domain. We propose an end-to-end model based on an iterative attention mechanism, without explicit hand detection or segmentation. Our approach dynamically focuses on increasingly high-resolution regions of interest. It outperforms prior work by a large margin. We also introduce a newly collected data set of crowdsourced annotations of fingerspelling in the wild, and show that performance can be further improved with this additional data set.
Tasks Sign Language Recognition
Published 2019-08-28
URL https://arxiv.org/abs/1908.10546v1
PDF https://arxiv.org/pdf/1908.10546v1.pdf
PWC https://paperswithcode.com/paper/fingerspelling-recognition-in-the-wild-with
Repo
Framework

Robust Online Multi-target Visual Tracking using a HISP Filter with Discriminative Deep Appearance Learning

Title Robust Online Multi-target Visual Tracking using a HISP Filter with Discriminative Deep Appearance Learning
Authors Nathanael L. Baisa
Abstract We propose a novel online multi-target visual tracker based on the recently developed Hypothesized and Independent Stochastic Population (HISP) filter. The HISP filter combines advantages of traditional tracking approaches like multiple hypothesis tracking (MHT) and point-process-based approaches like the probability hypothesis density (PHD) filter, and it has linear complexity while maintaining track identities. We apply this filter to tracking multiple targets in video sequences acquired under varying environmental conditions and target densities using a tracking-by-detection approach. We also adopt a deep convolutional neural network (CNN) appearance representation by training a verification-identification network (VerIdNet) on large-scale person re-identification data sets. We construct an augmented likelihood in a principled manner using these deep CNN appearance features and spatio-temporal (motion) information, which can improve the tracker’s performance. In addition, we solve the problem of two or more targets having identical labels by taking into account the weight propagated with each confirmed hypothesis. Finally, we carry out extensive experiments on the Multiple Object Tracking 2016 (MOT16) and 2017 (MOT17) benchmark data sets and find that our tracker significantly outperforms several state-of-the-art trackers in terms of tracking accuracy.
Tasks Large-Scale Person Re-Identification, Multiple Object Tracking, Object Tracking, Person Re-Identification, Visual Tracking
Published 2019-08-11
URL https://arxiv.org/abs/1908.03945v4
PDF https://arxiv.org/pdf/1908.03945v4.pdf
PWC https://paperswithcode.com/paper/robust-online-multi-target-visual-tracking
Repo
Framework

Action-Centered Information Retrieval

Title Action-Centered Information Retrieval
Authors Marcello Balduccini, Emily LeBlanc
Abstract Information Retrieval (IR) aims at retrieving documents that are most relevant to a query provided by a user. Traditional techniques rely mostly on syntactic methods. In some cases, however, links at a deeper semantic level must be considered. In this paper, we explore a type of IR task in which documents describe sequences of events, and queries are about the state of the world after such events. In this context, successfully matching documents and queries requires considering the events’ possibly implicit, uncertain effects and side-effects. We begin by analyzing the problem, then propose an action language based formalization, and finally automate the corresponding IR task using Answer Set Programming.
Tasks Information Retrieval
Published 2019-03-23
URL http://arxiv.org/abs/1903.09850v1
PDF http://arxiv.org/pdf/1903.09850v1.pdf
PWC https://paperswithcode.com/paper/action-centered-information-retrieval
Repo
Framework

Multiscale Nakagami parametric imaging for improved liver tumor localization

Title Multiscale Nakagami parametric imaging for improved liver tumor localization
Authors Omar S. Al-Kadi
Abstract Effective ultrasound tissue characterization is usually hindered by complex tissue structures. The interlacing of speckle patterns complicates the correct estimation of backscatter distribution parameters. Nakagami parametric imaging based on localized shape parameter mapping can model different backscattering conditions. However, performance of the constructed Nakagami image depends on the sensitivity of the estimation method to the backscattered statistics and scale of analysis. Using a fixed focal region of interest in estimating the Nakagami parametric image would increase estimation variance. In this work, localized Nakagami parameters are estimated adaptively by means of maximum likelihood estimation on a multiscale basis. The varying size kernel integrates the goodness-of-fit of the backscattering distribution parameters at multiple scales for more stable parameter estimation. Results show improved quantitative visualization of changes in tissue specular reflections, suggesting a potential approach for improving tumor localization in low contrast ultrasound images.
Tasks
Published 2019-06-11
URL https://arxiv.org/abs/1906.04333v1
PDF https://arxiv.org/pdf/1906.04333v1.pdf
PWC https://paperswithcode.com/paper/multiscale-nakagami-parametric-imaging-for
Repo
Framework
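The localized shape-parameter mapping described in this abstract can be illustrated with a simplified sketch. Note the hedges: the paper uses maximum-likelihood estimation weighted by goodness-of-fit across scales, whereas the sketch below uses the simpler moment-based (inverse normalized variance) estimator of the Nakagami shape parameter and plain averaging over window sizes; the window sizes themselves are arbitrary choices.

```python
import numpy as np

def nakagami_m(envelope):
    """Moment-based (inverse normalized variance) estimate of the
    Nakagami shape parameter m from envelope samples:
    m = E[R^2]^2 / Var(R^2)."""
    r2 = envelope.astype(float) ** 2
    var = r2.var()
    return r2.mean() ** 2 / var if var > 0 else np.inf

def multiscale_m_map(img, scales=(5, 9, 13)):
    """Average local m estimates over several window sizes -- a crude
    stand-in for the paper's multiscale maximum-likelihood scheme."""
    h, w = img.shape
    out = np.zeros((h, w))
    for k in scales:
        half = k // 2
        for i in range(h):
            for j in range(w):
                win = img[max(0, i - half):i + half + 1,
                          max(0, j - half):j + half + 1]
                out[i, j] += nakagami_m(win) / len(scales)
    return out
```

For a Nakagami(m, Ω) envelope, R² is Gamma-distributed with shape m, so the estimator can be sanity-checked on synthetic Gamma draws.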

General risk measures for robust machine learning

Title General risk measures for robust machine learning
Authors Emilie Chouzenoux, Henri Gérard, Jean-Christophe Pesquet
Abstract A wide array of machine learning problems are formulated as the minimization of the expectation of a convex loss function on some parameter space. Since the probability distribution of the data of interest is usually unknown, it is often estimated from training sets, which may lead to poor out-of-sample performance. In this work, we bring new insights into this problem by using the framework developed in quantitative finance for risk measures. We show that the original min-max problem can be recast as a convex minimization problem under suitable assumptions. We discuss several important examples of robust formulations, in particular by defining ambiguity sets based on $\varphi$-divergences and the Wasserstein metric. We also propose an efficient algorithm for solving the corresponding convex optimization problems involving complex convex constraints. Through simulation examples, we demonstrate that this algorithm scales well on real data sets.
Tasks
Published 2019-04-26
URL https://arxiv.org/abs/1904.11707v2
PDF https://arxiv.org/pdf/1904.11707v2.pdf
PWC https://paperswithcode.com/paper/general-risk-measures-for-robust-machine
Repo
Framework
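As one concrete member of the $\varphi$-divergence family the abstract mentions, the worst-case expected loss over a KL-divergence ball around the empirical distribution has the standard dual representation $\min_{\lambda>0}\, \lambda\rho + \lambda \log \mathbb{E}[e^{\ell/\lambda}]$. The sketch below evaluates this dual numerically; it is a textbook illustration of one ambiguity set, not the paper’s proposed algorithm, and the radius values used are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def kl_robust_risk(losses, rho):
    """Worst-case expected loss over all distributions within KL
    radius rho of the empirical distribution, via the standard dual
        min_{lam > 0}  lam*rho + lam*log(mean(exp(losses/lam)))."""
    losses = np.asarray(losses, dtype=float)
    def dual(lam):
        z = losses / lam
        zmax = z.max()  # log-sum-exp shift for numerical stability
        return lam * rho + lam * (zmax + np.log(np.mean(np.exp(z - zmax))))
    res = minimize_scalar(dual, bounds=(1e-6, 1e6), method="bounded")
    return res.fun
```

With radius 0 the robust risk collapses to the empirical mean; as the radius grows it interpolates toward the maximum loss.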

Image Differential Invariants

Title Image Differential Invariants
Authors Hanlin Mo, Hua Li
Abstract Inspired by methods for the systematic derivation of image moment invariants, we design two fundamental differential operators to generate image differential invariants for the action of the 2D Euclidean, similarity and affine transformation groups. Each differential invariant obtained by the new method can be expressed as a homogeneous polynomial of image partial derivatives. When the degree of the polynomial and the order of the image partial derivatives are both at most 4, we generate all Euclidean differential invariants and discuss their independence in detail. In the experimental part, we find the relation between Euclidean differential invariants and Gaussian-Hermite moment invariants when derivatives of Gaussians are used to estimate the image partial derivatives. Texture classification and image patch verification are carried out on synthetic and popular real databases. We mainly evaluate the stability and discriminability of the Euclidean differential invariants and analyse how various factors affect their performance. The experimental results confirm that image Euclidean differential invariants perform better than some commonly used local image features in most cases.
Tasks Texture Classification
Published 2019-11-13
URL https://arxiv.org/abs/1911.05327v1
PDF https://arxiv.org/pdf/1911.05327v1.pdf
PWC https://paperswithcode.com/paper/image-differential-invariants
Repo
Framework
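Two of the lowest-order Euclidean (rotation-invariant) differential invariants, the squared gradient magnitude $I_x^2 + I_y^2$ and the Laplacian $I_{xx} + I_{yy}$, can be computed with Gaussian-derivative estimates of the image partials, which is the estimation scheme the abstract describes for its experiments. A minimal sketch (the scale $\sigma$ is an arbitrary choice, and these two invariants are only examples from the full set the paper derives):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def euclidean_invariants(img, sigma=2.0):
    """Two low-order Euclidean differential invariants computed from
    Gaussian-derivative estimates of the image partial derivatives:
    the squared gradient magnitude I_x^2 + I_y^2 and the Laplacian
    I_xx + I_yy.  Both are unchanged by image rotations."""
    ix = gaussian_filter(img, sigma, order=(0, 1))   # d/dx (axis 1)
    iy = gaussian_filter(img, sigma, order=(1, 0))   # d/dy (axis 0)
    ixx = gaussian_filter(img, sigma, order=(0, 2))
    iyy = gaussian_filter(img, sigma, order=(2, 0))
    return ix ** 2 + iy ** 2, ixx + iyy
```

On a rotationally symmetric test image, both maps should themselves be rotationally symmetric, which gives a quick sanity check of invariance.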

Improving the Accuracy of Principal Component Analysis by the Maximum Entropy Method

Title Improving the Accuracy of Principal Component Analysis by the Maximum Entropy Method
Authors Guihong Wan, Crystal Maung, Haim Schweitzer
Abstract Classical Principal Component Analysis (PCA) approximates data in terms of projections on a small number of orthogonal vectors. There are simple procedures to efficiently compute various functions of the data from the PCA approximation. The most important function is arguably the Euclidean distance between data items, which can be used, for example, to solve the approximate nearest neighbor problem. We use random variables to model the inherent uncertainty in such approximations, and apply the Maximum Entropy Method to infer the underlying probability distribution. We propose using the expected values of distances between these random variables as improved estimates of the distance. We show analytically and experimentally that in most cases results obtained by our method are more accurate than those obtained by the classical approach. This improves the accuracy of a classical technique that has been used with little change for over 100 years.
Tasks
Published 2019-07-24
URL https://arxiv.org/abs/1907.11094v1
PDF https://arxiv.org/pdf/1907.11094v1.pdf
PWC https://paperswithcode.com/paper/improving-the-accuracy-of-principal-component
Repo
Framework
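The gap the abstract targets can be seen in a toy sketch: the classical PCA distance uses only the projections and therefore systematically underestimates the true distance. A simple correction adds the squared residual norms of the two points. Caveat: the correction below assumes the residuals are independent and is only a rough stand-in for the paper’s maximum-entropy expected-distance construction.

```python
import numpy as np

def pca_fit(X, k):
    """Classical PCA: mean and top-k principal directions."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k]

def dist_estimates(x, y, mu, components):
    """Classical squared PCA distance (projections only), next to a
    corrected estimate that adds the squared residual norms of both
    points (a crude independent-residuals stand-in for the paper's
    maximum-entropy expected distance)."""
    px, py = components @ (x - mu), components @ (y - mu)
    classical = np.sum((px - py) ** 2)
    rx = np.sum((x - mu) ** 2) - np.sum(px ** 2)  # residual energy of x
    ry = np.sum((y - mu) ** 2) - np.sum(py ** 2)  # residual energy of y
    return classical, classical + rx + ry
```

On isotropic synthetic data the corrected estimate is, on average, closer to the true squared distance than the classical one, mirroring the kind of improvement the abstract claims.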

On Sharing Models Instead of Data using Mimic learning for Smart Health Applications

Title On Sharing Models Instead of Data using Mimic learning for Smart Health Applications
Authors Mohamed Baza, Andrew Salazar, Mohamed Mahmoud, Mohamed Abdallah, Kemal Akkaya
Abstract Electronic health record (EHR) systems contain vast amounts of medical information about patients. These data can be used to train machine learning models that can predict health status, as well as to help prevent future diseases or disabilities. However, obtaining patients’ medical data to train well-performing machine learning models is a challenging task, because sharing patients’ medical records is prohibited by law in most countries due to privacy concerns. In this paper, we tackle this problem by sharing models instead of the original sensitive data, using the mimic learning approach. The idea is to first train a model on the original sensitive data, called the teacher model. Then, using this model, we transfer its knowledge to another model, called the student model, without the student ever seeing the original data used to train the teacher. The student model is then shared with the public and can be used to make accurate predictions. To assess the mimic learning approach, we evaluate our scheme on different medical datasets. The results indicate that the student model mimics the teacher model’s prediction accuracy without needing access to the patients’ original data records.
Tasks
Published 2019-12-24
URL https://arxiv.org/abs/1912.11210v1
PDF https://arxiv.org/pdf/1912.11210v1.pdf
PWC https://paperswithcode.com/paper/on-sharing-models-instead-of-data-using-mimic
Repo
Framework
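The teacher-student pipeline the abstract outlines can be sketched on synthetic data. The model classes, the synthetic labeling rule, and the sizes below are arbitrary illustrative choices, not the paper’s setup (which uses real medical datasets):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for "private" patient data: labels follow the sign of feature 0.
X_priv = rng.normal(size=(1000, 5))
y_priv = (X_priv[:, 0] > 0).astype(int)

# 1. Teacher: trained on the sensitive records and never released.
teacher = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_priv, y_priv)

# 2. Student: trained only on the teacher's predictions over public,
#    unlabeled data -- it never touches the private records.
X_pub = rng.normal(size=(1000, 5))
student = LogisticRegression().fit(X_pub, teacher.predict(X_pub))

# The student model can now be shared without exposing the private data.
```

The point of the exercise is that the student’s accuracy on fresh data tracks the teacher’s, which is the property the abstract reports on medical datasets.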

Active Learning by Greedy Split and Label Exploration

Title Active Learning by Greedy Split and Label Exploration
Authors Alyssa Herbst, Bert Huang
Abstract Annotating large unlabeled datasets can be a major bottleneck for machine learning applications. We introduce a scheme for inferring labels of unlabeled data at a fraction of the cost of labeling the entire dataset. We refer to the scheme as greedy split and label exploration (GSAL). GSAL greedily queries an oracle (or human labeler) and partitions a dataset to find data subsets that have mostly the same label. GSAL can then infer labels by majority vote of the known labels in each subset. GSAL makes the decision to split or label from a subset by maximizing a lower bound on the expected number of correctly labeled examples. GSAL improves upon existing hierarchical labeling schemes by using supervised models to partition the data, therefore avoiding reliance on unsupervised clustering methods that may not accurately group data by label. We design GSAL with strategies to avoid bias that could be introduced through this adaptive partitioning. We evaluate GSAL on labeling of three datasets and find that it outperforms existing strategies for adaptive labeling.
Tasks Active Learning
Published 2019-06-17
URL https://arxiv.org/abs/1906.07046v1
PDF https://arxiv.org/pdf/1906.07046v1.pdf
PWC https://paperswithcode.com/paper/active-learning-by-greedy-split-and-label
Repo
Framework
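The split-or-label loop can be sketched in a heavily simplified form. The real GSAL decides by maximizing a lower bound on the expected number of correctly labeled examples and partitions with supervised models; the sketch below instead stops on unanimous agreement among a few queried labels and splits on the highest-variance feature at its median. Both simplifications are assumptions for illustration only.

```python
import numpy as np

def gsal_sketch(X, oracle, budget=5, min_size=8, depth=0, max_depth=6):
    """Toy split-and-label: query a few labels from the oracle; if they
    all agree (or the subset is small/deep), infer the majority label
    for the whole subset; otherwise split and recurse."""
    n = len(X)
    rng = np.random.default_rng(depth)
    queried = rng.choice(n, size=min(budget, n), replace=False)
    labels = np.array([oracle(X[i]) for i in queried])
    majority = int(labels.mean() >= 0.5)
    if np.all(labels == labels[0]) or n <= min_size or depth >= max_depth:
        return np.full(n, majority)
    f = int(np.argmax(X.var(axis=0)))          # most spread-out feature
    mask = X[:, f] <= np.median(X[:, f])
    if mask.all() or not mask.any():           # degenerate split: stop
        return np.full(n, majority)
    out = np.empty(n, dtype=int)
    out[mask] = gsal_sketch(X[mask], oracle, budget, min_size, depth + 1, max_depth)
    out[~mask] = gsal_sketch(X[~mask], oracle, budget, min_size, depth + 1, max_depth)
    return out
```

Even this crude version labels a dataset with far fewer oracle queries than exhaustive annotation when subsets quickly become label-pure.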

Enhancing Generic Segmentation with Learned Region Representations

Title Enhancing Generic Segmentation with Learned Region Representations
Authors Or Isaacs, Oran Shayer, Michael Lindenbaum
Abstract Current successful approaches for generic (non-semantic) segmentation rely mostly on edge detection and have leveraged the strengths of deep learning mainly by improving the edge detection stage in the algorithmic pipeline. This is in contrast to semantic and instance segmentation, where DNNs are applied directly to generate pixel-wise segment representations. We propose a new method for learning a pixel-wise representation that reflects segment relatedness. This representation is combined with an edge map to yield a new segmentation algorithm. We show that the representations themselves achieve state-of-the-art segment similarity scores. Moreover, the proposed combined segmentation algorithm provides results that are either state of the art or improve upon it, for most quality measures.
Tasks Edge Detection, Instance Segmentation, Semantic Segmentation
Published 2019-11-17
URL https://arxiv.org/abs/1911.08564v2
PDF https://arxiv.org/pdf/1911.08564v2.pdf
PWC https://paperswithcode.com/paper/enhancing-generic-segmentation-with-learned
Repo
Framework

An AI-Augmented Lesion Detection Framework For Liver Metastases With Model Interpretability

Title An AI-Augmented Lesion Detection Framework For Liver Metastases With Model Interpretability
Authors Xin J. Hunt, Ralph Abbey, Ricky Tharrington, Joost Huiskens, Nina Wesdorp
Abstract Colorectal cancer (CRC) is the third most common cancer and the second leading cause of cancer-related deaths worldwide. Most CRC deaths are the result of progression of metastases. The assessment of metastases is done using the RECIST criterion, which is time consuming and subjective, as clinicians need to manually measure anatomical tumor sizes. AI has many successes in image object detection, but often suffers because the models used are not interpretable, leading to issues in trust and implementation in the clinical setting. We propose a framework for an AI-augmented system in which an interactive AI system assists clinicians in the metastasis assessment. We include model interpretability to give explanations of the reasoning of the underlying models.
Tasks Object Detection
Published 2019-07-17
URL https://arxiv.org/abs/1907.07713v1
PDF https://arxiv.org/pdf/1907.07713v1.pdf
PWC https://paperswithcode.com/paper/an-ai-augmented-lesion-detection-framework
Repo
Framework

Depth Extraction from Video Using Non-parametric Sampling

Title Depth Extraction from Video Using Non-parametric Sampling
Authors Kevin Karsch, Ce Liu, Sing Bing Kang
Abstract We describe a technique that automatically generates plausible depth maps from videos using non-parametric depth sampling. We demonstrate our technique in cases where past methods fail (non-translating cameras and dynamic scenes). Our technique is applicable to single images as well as videos. For videos, we use local motion cues to improve the inferred depth maps, while optical flow is used to ensure temporal depth consistency. For training and evaluation, we use a Kinect-based system to collect a large dataset containing stereoscopic videos with known depths. We show that our depth estimation technique outperforms the state-of-the-art on benchmark databases. Our technique can be used to automatically convert a monoscopic video into stereo for 3D visualization, and we demonstrate this through a variety of visually pleasing results for indoor and outdoor scenes, including results from the feature film Charade.
Tasks Depth Estimation, Optical Flow Estimation
Published 2019-12-24
URL https://arxiv.org/abs/2002.04479v1
PDF https://arxiv.org/pdf/2002.04479v1.pdf
PWC https://paperswithcode.com/paper/depth-extraction-from-video-using-non
Repo
Framework

Deep Learning to Address Candidate Generation and Cold Start Challenges in Recommender Systems: A Research Survey

Title Deep Learning to Address Candidate Generation and Cold Start Challenges in Recommender Systems: A Research Survey
Authors Kiran Rama, Pradeep Kumar, Bharat Bhasker
Abstract Among the machine learning applications to business, recommender systems take one of the top places when it comes to success and adoption. They help the user accelerate the process of search while helping businesses maximize sales. Following phenomenal success in computer vision and speech recognition, deep learning methods are beginning to be applied to recommender systems. Current survey papers on deep learning in recommender systems provide a historical overview and a taxonomy of recommender systems based on type. Our paper fills a gap by providing a taxonomy of deep learning approaches to the problems of cold start and candidate generation in recommender systems. We organize the challenges in recommender systems into those related to the recommendations themselves (including relevance, speed, accuracy and scalability), those related to the nature of the data (the cold start problem, imbalance and sparsity), and candidate generation. We then provide a taxonomy of deep learning techniques to address these challenges, mapping techniques to the different challenges and giving an overview of how deep learning can be used to address them. Cold start is addressed through additional features (for audio, images, text) and by learning hidden user and item representations. Candidate generation has been addressed by separate networks, RNNs, autoencoders and hybrid methods. We also summarize the advantages and limitations of these techniques while outlining areas for future research.
Tasks Recommendation Systems, Speech Recognition
Published 2019-07-17
URL https://arxiv.org/abs/1907.08674v1
PDF https://arxiv.org/pdf/1907.08674v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-to-address-candidate-generation
Repo
Framework

Fast Learning of Temporal Action Proposal via Dense Boundary Generator

Title Fast Learning of Temporal Action Proposal via Dense Boundary Generator
Authors Chuming Lin, Jian Li, Yabiao Wang, Ying Tai, Donghao Luo, Zhipeng Cui, Chengjie Wang, Jilin Li, Feiyue Huang, Rongrong Ji
Abstract Generating temporal action proposals remains a very challenging problem, where the main issue lies in predicting precise temporal proposal boundaries and reliable action confidence in long and untrimmed real-world videos. In this paper, we propose an efficient and unified framework to generate temporal action proposals named Dense Boundary Generator (DBG), which draws inspiration from boundary-sensitive methods and implements boundary classification and action completeness regression for densely distributed proposals. In particular, the DBG consists of two modules: Temporal boundary classification (TBC) and Action-aware completeness regression (ACR). The TBC aims to provide two temporal boundary confidence maps from low-level two-stream features, while the ACR is designed to generate an action completeness score map from high-level action-aware features. Moreover, we introduce a dual stream BaseNet (DSB) to encode RGB and optical flow information, which helps to capture discriminative boundary and actionness features. Extensive experiments on the popular benchmarks ActivityNet-1.3 and THUMOS14 demonstrate the superiority of DBG over state-of-the-art proposal generators (e.g., MGG and BMN). Our code will be made available upon publication.
Tasks Optical Flow Estimation
Published 2019-11-11
URL https://arxiv.org/abs/1911.04127v1
PDF https://arxiv.org/pdf/1911.04127v1.pdf
PWC https://paperswithcode.com/paper/fast-learning-of-temporal-action-proposal-via
Repo
Framework

Transfer Learning-Based Label Proportions Method with Data of Uncertainty

Title Transfer Learning-Based Label Proportions Method with Data of Uncertainty
Authors Yanshan Xiao, HuaiPei Wang, Bo Liu
Abstract Learning with label proportions (LLP), a learning task in which the data come as unlabeled bags annotated only with each bag’s label proportion, has widespread successful applications in practice. However, most existing LLP methods do not consider knowledge transfer for uncertain data. This paper presents a transfer learning-based approach to learning with label proportions (TL-LLP) that transfers knowledge from a source task to a target task where both tasks contain uncertain data. Our approach first formulates an objective model for the uncertain data while handling transfer learning at the same time, and then proposes an iterative framework to build an accurate classifier for the target task. Extensive experiments show that the proposed TL-LLP method obtains better accuracy and is less sensitive to noise than existing LLP methods.
Tasks Transfer Learning
Published 2019-08-19
URL https://arxiv.org/abs/1908.06603v1
PDF https://arxiv.org/pdf/1908.06603v1.pdf
PWC https://paperswithcode.com/paper/transfer-learning-based-label-proportions
Repo
Framework
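The bag-level supervision in LLP can be illustrated with a generic proportion-matching objective: fit a model so that each bag’s mean predicted positive probability matches that bag’s given label proportion. This is a standard LLP baseline for intuition only; it is not the paper’s TL-LLP transfer method and ignores its uncertain-data handling. The learning rate and epoch count below are arbitrary.

```python
import numpy as np

def train_llp(bags, proportions, dim, lr=0.5, epochs=300):
    """Generic learning-with-label-proportions sketch: gradient descent
    on the squared gap between each bag's mean predicted positive
    probability (logistic model) and its given label proportion."""
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))
    w = np.zeros(dim)
    for _ in range(epochs):
        grad = np.zeros(dim)
        for X, p in zip(bags, proportions):
            q = sigmoid(X @ w)
            err = q.mean() - p            # bag-mean prediction vs proportion
            grad += 2 * err * (q * (1 - q)) @ X / len(X)
        w -= lr * grad / len(bags)
    return w
```

When the bags have diverse proportions, matching them forces the model to recover an instance-level decision rule even though no single instance label is ever observed.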