Paper Group ANR 189
Object Recognition and Identification Using ESM Data. N-ary Error Correcting Coding Scheme. 3D Gaze Estimation from 2D Pupil Positions on Monocular Head-Mounted Eye Trackers. Blind Analysis of CT Image Noise Using Residual Denoised Images. Parsimonious Online Learning with Kernels via Sparse Projections in Function Space. Incorporation of Speech Duration Information in Score Fusion of Speaker Recognition Systems. Collective Semi-Supervised Learning for User Profiling in Social Media. Victory Sign Biometric for Terrorists Identification. Towards Competitive Classifiers for Unbalanced Classification Problems: A Study on the Performance Scores. Neural versus Phrase-Based Machine Translation Quality: a Case Study. Language as a Latent Variable: Discrete Generative Models for Sentence Compression. An Unbiased Data Collection and Content Exploitation/Exploration Strategy for Personalization. A spectral-spatial fusion model for robust blood pulse waveform extraction in photoplethysmographic imaging. Data-Driven Online Decision Making with Costly Information Acquisition. Tracking with multi-level features.
Object Recognition and Identification Using ESM Data
Title | Object Recognition and Identification Using ESM Data |
Authors | E. Taghavi, D. Song, R. Tharmarasa, T. Kirubarajan, Anne-Claire Boury-Brisset, Bhashyam Balaji |
Abstract | Recognition and identification of unknown targets is a crucial task in surveillance and security systems. Electronic Support Measures (ESM) are one of the most effective sensors for identification, especially for maritime and air-to-ground applications. In typical surveillance systems, multiple ESM sensors are usually deployed along with kinematic sensors such as radar. Different ESM sensors may produce different types of reports ready to be sent to the fusion center. The focus of this paper is to develop a new architecture for target recognition and identification when non-homogeneous ESM and possibly kinematic reports are received at the fusion center. The new fusion architecture is evaluated using simulations to show the benefit of utilizing different ESM reports such as attributes and signal-level ESM data. |
Tasks | Object Recognition |
Published | 2016-03-22 |
URL | http://arxiv.org/abs/1607.01355v1 |
http://arxiv.org/pdf/1607.01355v1.pdf | |
PWC | https://paperswithcode.com/paper/object-recognition-and-identification-using |
Repo | |
Framework | |
N-ary Error Correcting Coding Scheme
Title | N-ary Error Correcting Coding Scheme |
Authors | Joey Tianyi Zhou, Ivor W. Tsang, Shen-Shyang Ho, Klaus-Robert Muller |
Abstract | The coding matrix design plays a fundamental role in the prediction performance of error correcting output codes (ECOC)-based multi-class tasks. In many-class classification problems, e.g., fine-grained categorization, it is difficult to distinguish subtle between-class differences under existing coding schemes due to the limited choice of coding values. In this paper, we investigate whether one can relax existing binary and ternary code designs to $N$-ary code design to achieve better classification performance. In particular, we present a novel $N$-ary coding scheme that decomposes the original multi-class problem into simpler multi-class subproblems, similar to applying a divide-and-conquer method. The two main advantages of such a coding scheme are as follows: (i) the ability to construct more discriminative codes and (ii) the flexibility for the user to select the best $N$ for ECOC-based classification. We show empirically that the optimal $N$ (based on classification performance) lies in $[3, 10]$ with some trade-off in computational cost. Moreover, we provide theoretical insights on the dependency of the generalization error bound of an $N$-ary ECOC on the average base classifier generalization error and the minimum distance between any two codes constructed. Extensive experimental results on benchmark multi-class datasets show that the proposed coding scheme achieves superior prediction performance over state-of-the-art coding methods. |
Tasks | |
Published | 2016-03-18 |
URL | http://arxiv.org/abs/1603.05850v1 |
http://arxiv.org/pdf/1603.05850v1.pdf | |
PWC | https://paperswithcode.com/paper/n-ary-error-correcting-coding-scheme |
Repo | |
Framework | |
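A minimal sketch of the $N$-ary ECOC idea described in the abstract above (not the authors' code; the base learner, matrix width, and decoding rule are assumptions): each column of a random $N$-ary coding matrix relabels the $K$ original classes into at most $N$ meta-classes, one base classifier is trained per column, and decoding returns the class whose codeword has minimum Hamming distance to the predicted code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_nary_ecoc(X, y, n_classes, N=4, n_cols=20, seed=0):
    rng = np.random.default_rng(seed)
    M = np.zeros((n_classes, n_cols), dtype=int)   # K x L coding matrix over {0..N-1}
    for j in range(n_cols):
        col = rng.integers(0, N, size=n_classes)
        while len(np.unique(col)) < 2:             # keep every subproblem non-trivial
            col = rng.integers(0, N, size=n_classes)
        M[:, j] = col
    # One multi-class base classifier per column, trained on relabeled targets
    clfs = [LogisticRegression(max_iter=1000).fit(X, M[y, j]) for j in range(n_cols)]
    return M, clfs

def predict_nary_ecoc(X, M, clfs):
    code = np.column_stack([c.predict(X) for c in clfs])     # (n, L) predicted code
    dists = (code[:, None, :] != M[None, :, :]).sum(axis=2)  # Hamming distance per codeword
    return dists.argmin(axis=1)
```

With $N$-ary columns, codewords can be pushed further apart than in the binary or ternary case, which is the discriminability advantage the paper quantifies.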
3D Gaze Estimation from 2D Pupil Positions on Monocular Head-Mounted Eye Trackers
Title | 3D Gaze Estimation from 2D Pupil Positions on Monocular Head-Mounted Eye Trackers |
Authors | Mohsen Mansouryar, Julian Steil, Yusuke Sugano, Andreas Bulling |
Abstract | 3D gaze information is important for scene-centric attention analysis but accurate estimation and analysis of 3D gaze in real-world environments remains challenging. We present a novel 3D gaze estimation method for monocular head-mounted eye trackers. In contrast to previous work, our method does not aim to infer 3D eyeball poses but directly maps 2D pupil positions to 3D gaze directions in scene camera coordinate space. We first provide a detailed discussion of the 3D gaze estimation task and summarize different methods, including our own. We then evaluate the performance of different 3D gaze estimation approaches using both simulated and real data. Through experimental validation, we demonstrate the effectiveness of our method in reducing parallax error, and we identify research challenges for the design of 3D calibration procedures. |
Tasks | Calibration, Gaze Estimation |
Published | 2016-01-11 |
URL | http://arxiv.org/abs/1601.02644v2 |
http://arxiv.org/pdf/1601.02644v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-gaze-estimation-from-2d-pupil-positions-on |
Repo | |
Framework | |
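The abstract above describes a direct regression from 2D pupil positions to 3D gaze directions. A hedged sketch of that mapping (the polynomial degree and least-squares fit are assumptions, not the authors' exact formulation): fit a polynomial model on calibration pairs of pupil coordinates and unit gaze vectors in the scene camera frame.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

def fit_gaze_mapper(pupil_2d, gaze_dir_3d, degree=2):
    # pupil_2d: (n, 2) pupil centers; gaze_dir_3d: (n, 3) unit gaze directions
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    return model.fit(pupil_2d, gaze_dir_3d)

def predict_gaze(model, pupil_2d):
    d = model.predict(np.atleast_2d(pupil_2d))
    return d / np.linalg.norm(d, axis=1, keepdims=True)  # re-normalize to unit length
```

Regressing to directions in the scene camera frame, rather than to fixed points on a calibration plane, is central to the parallax-error reduction reported in the abstract.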
Blind Analysis of CT Image Noise Using Residual Denoised Images
Title | Blind Analysis of CT Image Noise Using Residual Denoised Images |
Authors | Sohini Roychowdhury, Nathan Hollraft, Adam Alessio |
Abstract | CT protocol design and quality control would benefit from automated tools to estimate the quality of generated CT images. These tools could be used to identify erroneous CT acquisitions or refine protocols to achieve certain signal-to-noise characteristics. This paper investigates blind estimation methods to determine global signal strength and noise levels in chest CT images. Methods: We propose novel performance metrics corresponding to the accuracy of noise and signal estimation. We implement and evaluate the noise estimation performance of six spatial- and frequency-based methods, derived from conventional image filtering algorithms. Algorithms were tested on patient data sets from whole-body repeat CT acquisitions performed with a higher- and a lower-dose technique over the same scan region. Results: The proposed performance metrics can evaluate the relative tradeoff between filter parameters and noise estimation performance. The proposed automated methods tend to underestimate CT image noise at low-flux levels. Initial application of the methodology suggests that anisotropic-diffusion and wavelet-transform based filters provide optimal estimates of noise. Furthermore, the methodology does not provide accurate estimates of absolute noise levels, but can provide estimates of relative change and/or trends in noise levels. |
Tasks | |
Published | 2016-05-24 |
URL | http://arxiv.org/abs/1605.07650v1 |
http://arxiv.org/pdf/1605.07650v1.pdf | |
PWC | https://paperswithcode.com/paper/blind-analysis-of-ct-image-noise-using |
Repo | |
Framework | |
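A minimal sketch of residual-based blind noise estimation in the spirit of the paper (the filter choice and parameters here are assumptions; the paper evaluates six spatial- and frequency-based filters, with anisotropic diffusion and wavelet filters performing best): denoise the image, treat the difference as a noise residual, and summarize its spread robustly.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def estimate_noise_std(img, sigma=1.5):
    # Residual denoised image: original minus a smoothed version of itself
    residual = img.astype(float) - gaussian_filter(img.astype(float), sigma=sigma)
    # Median absolute deviation is less sensitive to anatomy leaking into the residual
    mad = np.median(np.abs(residual - np.median(residual)))
    return 1.4826 * mad  # MAD-to-std conversion factor for Gaussian noise
```

As the abstract notes, such estimates track relative changes and trends in noise more reliably than absolute noise levels.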
Parsimonious Online Learning with Kernels via Sparse Projections in Function Space
Title | Parsimonious Online Learning with Kernels via Sparse Projections in Function Space |
Authors | Alec Koppel, Garrett Warnell, Ethan Stump, Alejandro Ribeiro |
Abstract | Despite their attractiveness, popular perception is that techniques for nonparametric function approximation do not scale to streaming data due to an intractable growth in the amount of storage they require. To solve this problem in a memory-affordable way, we propose an online technique based on functional stochastic gradient descent in tandem with supervised sparsification based on greedy function subspace projections. The method, called parsimonious online learning with kernels (POLK), provides a controllable tradeoff between its solution accuracy and the amount of memory it requires. We derive conditions under which the generated function sequence converges almost surely to the optimal function, and we establish that the memory requirement remains finite. We evaluate POLK for kernel multi-class logistic regression and kernel hinge-loss classification on three canonical data sets: a synthetic Gaussian mixture model, the MNIST hand-written digits, and the Brodatz texture database. On all three tasks, we observe a favorable tradeoff between objective function evaluation, classification performance, and the complexity of the nonparametric regressor extracted by the proposed method. |
Tasks | |
Published | 2016-12-13 |
URL | http://arxiv.org/abs/1612.04111v1 |
http://arxiv.org/pdf/1612.04111v1.pdf | |
PWC | https://paperswithcode.com/paper/parsimonious-online-learning-with-kernels-via |
Repo | |
Framework | |
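A simplified sketch of the dictionary-plus-weights representation behind POLK, $f(x) = \sum_i w_i \kappa(x_i, x)$ (this is not the paper's algorithm: POLK sparsifies with greedy function subspace projections, i.e., kernel orthogonal matching pursuit, whereas this toy prunes atoms by weight magnitude):

```python
import numpy as np

def gaussian_k(D, x, bw=1.0):
    return np.exp(-np.sum((D - x) ** 2, axis=1) / (2 * bw ** 2))

class BudgetedKernelSGD:
    def __init__(self, eta=0.1, bw=1.0, lam=1e-3, tol=1e-3):
        self.eta, self.bw, self.lam, self.tol = eta, bw, lam, tol
        self.D = np.empty((0, 0))   # dictionary of kernel centers
        self.w = np.empty(0)        # their weights

    def predict(self, x):
        return 0.0 if self.w.size == 0 else float(self.w @ gaussian_k(self.D, x, self.bw))

    def step(self, x, y):
        g = self.predict(x) - y                 # functional gradient coeff. (squared loss)
        self.w *= (1.0 - self.eta * self.lam)   # shrinkage from the regularizer
        self.D = x[None, :] if self.w.size == 0 else np.vstack([self.D, x])
        self.w = np.append(self.w, -self.eta * g)   # new atom centered at x
        keep = np.abs(self.w) > self.tol            # crude stand-in for KOMP pruning
        self.D, self.w = self.D[keep], self.w[keep]
```

The point the paper makes is that with a principled projection step, the dictionary size stays finite while the function sequence still converges almost surely.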
Incorporation of Speech Duration Information in Score Fusion of Speaker Recognition Systems
Title | Incorporation of Speech Duration Information in Score Fusion of Speaker Recognition Systems |
Authors | Ali Khodabakhsh, Seyyed Saeed Sarfjoo, Umut Uludag, Osman Soyyigit, Cenk Demiroglu |
Abstract | In recent years, identity-vector (i-vector) based speaker verification (SV) systems have become very successful. Nevertheless, environmental noise and speech duration variability still significantly degrade the performance of these systems. In many real-life applications, the duration of recordings is very short; as a result, extracted i-vectors cannot reliably represent the attributes of the speaker. Here, we investigate the effect of speech duration on the performance of three state-of-the-art speaker recognition systems. In addition, using a variety of available score fusion methods, we investigate the effect of score fusion on those speaker verification techniques, to benefit from the performance differences of different methods under different enrollment and test speech duration conditions. Fusing scores with duration information performed significantly better than the baseline score fusion methods. |
Tasks | Speaker Recognition, Speaker Verification |
Published | 2016-08-07 |
URL | http://arxiv.org/abs/1608.02272v1 |
http://arxiv.org/pdf/1608.02272v1.pdf | |
PWC | https://paperswithcode.com/paper/incorporation-of-speech-duration-information |
Repo | |
Framework | |
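A hedged sketch of duration-informed score fusion (the combiner and features are assumptions; the paper compares several fusion methods): train a logistic-regression combiner over the scores of multiple SV systems, with enrollment/test durations as side information, on held-out target/impostor trials.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_fusion(scores, durations, labels):
    # scores: (n_trials, n_systems) raw system scores
    # durations: (n_trials, 2) enrollment and test durations in seconds
    # labels: 1 for target (same-speaker) trials, 0 for impostor trials
    feats = np.hstack([scores, np.log1p(durations)])
    return LogisticRegression(max_iter=1000).fit(feats, labels)

def fused_score(model, scores, durations):
    feats = np.hstack([scores, np.log1p(durations)])
    return model.decision_function(feats)   # higher means more likely same speaker
```

Conditioning the combiner on duration lets it shift weight toward whichever system degrades least on short recordings.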
Collective Semi-Supervised Learning for User Profiling in Social Media
Title | Collective Semi-Supervised Learning for User Profiling in Social Media |
Authors | Richard J. Oentaryo, Ee-Peng Lim, Freddy Chong Tat Chua, Jia-Wei Low, David Lo |
Abstract | The abundance of user-generated data in social media has incentivized the development of methods to infer the latent attributes of users, which are crucially useful for personalization, advertising and recommendation. However, the current user profiling approaches have limited success, due to the lack of a principled way to integrate different types of social relationships of a user, and the reliance on scarcely-available labeled data in building a prediction model. In this paper, we present a novel solution termed Collective Semi-Supervised Learning (CSL), which provides a principled means to integrate different types of social relationship and unlabeled data under a unified computational framework. The joint learning from multiple relationships and unlabeled data yields a computationally sound and accurate approach to model user attributes in social media. Extensive experiments using Twitter data have demonstrated the efficacy of our CSL approach in inferring user attributes such as account type and marital status. We also show how CSL can be used to determine important user features, and to make inference on a larger user population. |
Tasks | |
Published | 2016-06-24 |
URL | http://arxiv.org/abs/1606.07707v1 |
http://arxiv.org/pdf/1606.07707v1.pdf | |
PWC | https://paperswithcode.com/paper/collective-semi-supervised-learning-for-user |
Repo | |
Framework | |
Victory Sign Biometric for Terrorists Identification
Title | Victory Sign Biometric for Terrorists Identification |
Authors | Ahmad B. A. Hassanat, Mahmoud B. Alhasanat, Mohammad Ali Abbadi, Eman Btoush, Mouhammd Al-Awadi, Ahmad S. Tarawneh |
Abstract | When the face and all other body parts are covered, sometimes the only evidence for identifying a person is their hand geometry, and not even the whole hand: only two fingers (the index and the middle fingers) shown while making the victory sign, as seen in many terrorist videos. This paper investigates, for the first time, a new way to identify persons, particularly terrorists, from their victory sign. We have created a new database in this regard using a mobile phone camera, imaging the victory signs of 50 different persons over two sessions. Simple measurements of the fingers, in addition to Hu moments of the finger areas, were used to extract the geometric features of the visible part of the hand after segmentation. The experimental results using the KNN classifier were encouraging for most of the recorded persons, with about 40% to 93% total identification accuracy, depending on the features, distance metric, and K used. |
Tasks | |
Published | 2016-02-26 |
URL | http://arxiv.org/abs/1602.08325v1 |
http://arxiv.org/pdf/1602.08325v1.pdf | |
PWC | https://paperswithcode.com/paper/victory-sign-biometric-for-terrorists |
Repo | |
Framework | |
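A minimal sketch of the classification stage described above (the exact measurement set, K, and distance metric vary in the paper; the helper below is illustrative): Hu moments of the segmented two-finger region, log-scaled as usual, feed a KNN classifier alongside any simple finger measurements.

```python
import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def hu_features(mask):
    # mask: binary uint8 image of the segmented victory-sign region
    hu = cv2.HuMoments(cv2.moments(mask)).flatten()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)  # standard log-magnitude scaling

# Hypothetical usage with pre-segmented masks and person IDs:
# X = np.vstack([hu_features(m) for m in train_masks])
# knn = KNeighborsClassifier(n_neighbors=3, metric="manhattan").fit(X, train_ids)
```

The wide 40% to 93% accuracy range in the abstract reflects exactly these choices: which features, which distance metric, and which K.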
Towards Competitive Classifiers for Unbalanced Classification Problems: A Study on the Performance Scores
Title | Towards Competitive Classifiers for Unbalanced Classification Problems: A Study on the Performance Scores |
Authors | Jonathan Ortigosa-Hernández, Iñaki Inza, Jose A. Lozano |
Abstract | Although a great methodological effort has been invested in proposing competitive solutions to the class-imbalance problem, little effort has been made in pursuing a theoretical understanding of this matter. In order to shed some light on this topic, we perform, through a novel framework, an exhaustive analysis of the adequateness of the most commonly used performance scores to assess this complex scenario. We conclude that using unweighted Hölder means with exponent $p \leq 1$ to average the recalls of all the classes produces adequate scores which are capable of determining whether a classifier is competitive. Then, we review the major solutions presented in the class-imbalance literature. Since any learning task can be defined as an optimisation problem where a loss function, usually connected to a particular score, is minimised, our goal, here, is to find whether the learning tasks found in the literature are also oriented to maximise the previously detected adequate scores. We conclude that they usually maximise the unweighted Hölder mean with $p = 1$ (a-mean). Finally, we provide bounds on the values of the studied performance scores which guarantee a classifier with a higher recall than the random classifier in each and every class. |
Tasks | |
Published | 2016-08-31 |
URL | http://arxiv.org/abs/1608.08984v1 |
http://arxiv.org/pdf/1608.08984v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-competitive-classifiers-for |
Repo | |
Framework | |
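The score family the paper advocates is the unweighted Hölder (power) mean of the per-class recalls, $M_p(r_1,\dots,r_k) = (\frac{1}{k}\sum_i r_i^p)^{1/p}$, with exponent $p \leq 1$; $p = 1$ gives the arithmetic a-mean and $p \to 0$ the geometric mean. A short computation:

```python
import numpy as np

def holder_mean(recalls, p):
    r = np.asarray(recalls, dtype=float)   # per-class recalls, each > 0 when p <= 0
    if p == 0:
        return float(np.exp(np.mean(np.log(r))))  # limiting case: geometric mean
    return float(np.mean(r ** p) ** (1.0 / p))

# A classifier that sacrifices a minority class is punished harder as p decreases:
# holder_mean([0.95, 0.90, 0.10], p=1)  -> 0.65   (a-mean barely notices)
# holder_mean([0.95, 0.90, 0.10], p=-1) -> ~0.25  (harmonic mean collapses)
```

This is why lower exponents better reflect competitiveness on unbalanced problems: the mean is dragged toward the worst-served class.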
Neural versus Phrase-Based Machine Translation Quality: a Case Study
Title | Neural versus Phrase-Based Machine Translation Quality: a Case Study |
Authors | Luisa Bentivogli, Arianna Bisazza, Mauro Cettolo, Marcello Federico |
Abstract | Within the field of Statistical Machine Translation (SMT), the neural approach (NMT) has recently emerged as the first technology able to challenge the long-standing dominance of phrase-based approaches (PBMT). In particular, at the IWSLT 2015 evaluation campaign, NMT outperformed well established state-of-the-art PBMT systems on English-German, a language pair known to be particularly hard because of morphology and syntactic differences. To understand in what respects NMT provides better translation quality than PBMT, we perform a detailed analysis of neural versus phrase-based SMT outputs, leveraging high quality post-edits performed by professional translators on the IWSLT data. For the first time, our analysis provides useful insights on what linguistic phenomena are best modeled by neural models – such as the reordering of verbs – while pointing out other aspects that remain to be improved. |
Tasks | Machine Translation |
Published | 2016-08-16 |
URL | http://arxiv.org/abs/1608.04631v2 |
http://arxiv.org/pdf/1608.04631v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-versus-phrase-based-machine |
Repo | |
Framework | |
Language as a Latent Variable: Discrete Generative Models for Sentence Compression
Title | Language as a Latent Variable: Discrete Generative Models for Sentence Compression |
Authors | Yishu Miao, Phil Blunsom |
Abstract | In this work we explore deep generative models of text in which the latent representation of a document is itself drawn from a discrete language model distribution. We formulate a variational auto-encoder for inference in this model and apply it to the task of compressing sentences. In this application the generative model first draws a latent summary sentence from a background language model, and then subsequently draws the observed sentence conditioned on this latent summary. In our empirical evaluation we show that generative formulations of both abstractive and extractive compression yield state-of-the-art results when trained on a large amount of supervised data. Further, we explore semi-supervised compression scenarios where we show that it is possible to achieve performance competitive with previously proposed supervised models while training on a fraction of the supervised data. |
Tasks | Language Modelling, Sentence Compression |
Published | 2016-09-23 |
URL | http://arxiv.org/abs/1609.07317v2 |
http://arxiv.org/pdf/1609.07317v2.pdf | |
PWC | https://paperswithcode.com/paper/language-as-a-latent-variable-discrete |
Repo | |
Framework | |
An Unbiased Data Collection and Content Exploitation/Exploration Strategy for Personalization
Title | An Unbiased Data Collection and Content Exploitation/Exploration Strategy for Personalization |
Authors | Liangjie Hong, Adnan Boz |
Abstract | One of the missions of personalization systems and recommender systems is to show content items according to users' personal interests. In order to achieve this goal, these systems learn user interests over time and try to present content items tailored to user profiles. Recommending items according to users' preferences has been investigated extensively in the past few years, mainly thanks to the popularity of the Netflix competition. In a real setting, users may be attracted by a subset of those items and interact with them, leaving only partial feedback for the system to learn from in the next cycle, which introduces significant biases into the system and hence results in a situation where user engagement metrics cannot be improved over time. The problem is not limited to one component of the system. The data collected from users is usually used in many different tasks, including learning ranking functions, building user profiles, and constructing content classifiers. Once the data is biased, all these downstream use cases are impacted as well. Therefore, it would be beneficial to gather unbiased data through user interactions. Traditionally, unbiased data collection is done by showing items uniformly sampled from the content pool. However, this simple scheme is not feasible, as it risks user engagement metrics and takes a long time to gather user feedback. In this paper, we introduce a user-friendly unbiased data collection framework that utilizes methods developed in the exploitation-and-exploration literature. We discuss how the framework differs from normal multi-armed bandit problems and why such a method is needed. We lay out a novel Thompson sampling scheme for Bernoulli ranked lists to effectively balance user experience and data collection. The proposed method is validated in a real bucket test, and we show strong results compared to the old algorithms. |
Tasks | Recommendation Systems |
Published | 2016-04-12 |
URL | http://arxiv.org/abs/1604.03506v1 |
http://arxiv.org/pdf/1604.03506v1.pdf | |
PWC | https://paperswithcode.com/paper/an-unbiased-data-collection-and-content |
Repo | |
Framework | |
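A hedged sketch of Thompson sampling for a Bernoulli ranked list (the paper's variant adds machinery to protect user-experience metrics; the priors and slate construction here are assumptions): keep a Beta posterior per item, sample a plausible click rate for each, and show the top-k by sampled value, so exploration happens without uniform random slates.

```python
import numpy as np

class RankedListTS:
    def __init__(self, n_items, seed=0):
        self.a = np.ones(n_items)   # Beta(1, 1) priors: alpha = clicks + 1
        self.b = np.ones(n_items)   #                    beta  = skips  + 1
        self.rng = np.random.default_rng(seed)

    def rank(self, k):
        theta = self.rng.beta(self.a, self.b)   # one posterior sample per item
        return np.argsort(-theta)[:k]           # slate of the k highest samples

    def update(self, item, clicked):
        self.a[item] += clicked
        self.b[item] += 1 - clicked
```

Uncertain items occasionally sample high and get shown, which is the mechanism that collects nearly unbiased feedback without the engagement cost of uniform sampling.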
A spectral-spatial fusion model for robust blood pulse waveform extraction in photoplethysmographic imaging
Title | A spectral-spatial fusion model for robust blood pulse waveform extraction in photoplethysmographic imaging |
Authors | Robert Amelard, David A Clausi, Alexander Wong |
Abstract | Photoplethysmographic imaging is a camera-based solution for non-contact cardiovascular monitoring from a distance. This technology enables monitoring in situations where contact-based devices may be problematic or infeasible, such as ambulatory, sleep, and multi-individual monitoring. However, extracting the blood pulse waveform signal is challenging due to the unknown mixture of relevant (pulsatile) and irrelevant pixels in the scene. Here, we design and implement a signal fusion framework, FusionPPG, for extracting a blood pulse waveform signal with strong temporal fidelity from a scene without requiring anatomical priors (e.g., facial tracking). The extraction problem is posed as a Bayesian least squares fusion problem and solved using a novel probabilistic pulsatility model that incorporates both physiologically derived spectral and spatial waveform priors to identify pulsatility characteristics in the scene. Experimental results show statistically significant improvements compared to the FaceMeanPPG ($p<0.001$) and DistancePPG ($p<0.001$) methods. Heart rates predicted using FusionPPG correlated strongly with ground truth measurements ($r^2=0.9952$). FusionPPG was the only method able to assess cardiac arrhythmia via temporal analysis. |
Tasks | |
Published | 2016-06-29 |
URL | http://arxiv.org/abs/1606.09118v1 |
http://arxiv.org/pdf/1606.09118v1.pdf | |
PWC | https://paperswithcode.com/paper/a-spectral-spatial-fusion-model-for-robust |
Repo | |
Framework | |
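A simplified stand-in for the spectral prior in FusionPPG (not the full Bayesian least-squares model; the heart-rate band below is an assumption): weight each pixel's temporal trace by the fraction of its spectral power inside a plausible pulse band, then fuse by weighted averaging.

```python
import numpy as np

def fuse_pulse(signals, fps, band=(0.75, 4.0)):
    # signals: (n_pixels, n_frames) zero-mean temporal traces; band in Hz (~45-240 bpm)
    freqs = np.fft.rfftfreq(signals.shape[1], d=1.0 / fps)
    power = np.abs(np.fft.rfft(signals, axis=1)) ** 2
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    w = power[:, in_band].sum(axis=1) / (power.sum(axis=1) + 1e-12)  # pulsatility weight
    return (w[:, None] * signals).sum(axis=0) / (w.sum() + 1e-12)    # fused waveform
```

Weighting by in-band power suggests how pulsatile pixels can identify themselves spectrally, without anatomical priors such as face tracking.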
Data-Driven Online Decision Making with Costly Information Acquisition
Title | Data-Driven Online Decision Making with Costly Information Acquisition |
Authors | Onur Atan, Mihaela van der Schaar |
Abstract | In most real-world settings, such as recommender systems, finance, and healthcare, collecting useful information is costly and requires an active choice on the part of the decision-maker, who needs to learn simultaneously what observations to make and what actions to take. This paper incorporates the information acquisition decision into an online learning framework. We propose two different algorithms for this dual learning problem: Sim-OOS and Seq-OOS, where observations are made simultaneously and sequentially, respectively. We prove that both algorithms achieve a regret that is sublinear in time. The developed framework and algorithms can be used in many applications, including medical informatics, recommender systems, and actionable intelligence in transportation, finance, cyber-security, etc., in which collecting information prior to making decisions is costly. We validate our algorithms in a breast cancer example setting in which we show substantial performance gains for our proposed algorithms. |
Tasks | Decision Making, Recommendation Systems |
Published | 2016-02-11 |
URL | http://arxiv.org/abs/1602.03600v2 |
http://arxiv.org/pdf/1602.03600v2.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-online-decision-making-with |
Repo | |
Framework | |
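A toy sketch in the spirit of Sim-OOS (this is not the paper's algorithm, which comes with sublinear-regret guarantees; the arm construction and cost handling here are assumptions): treat each (observation subset, action) pair as a bandit arm and run UCB on net reward, i.e., payoff minus the acquisition cost of the chosen subset.

```python
import numpy as np

class CostAwareUCB:
    def __init__(self, n_arms, costs):
        # costs[i]: information-acquisition cost of arm i's observation subset
        self.costs = np.asarray(costs, dtype=float)
        self.n = np.zeros(n_arms)      # pull counts
        self.mean = np.zeros(n_arms)   # running mean of gross payoff
        self.t = 0

    def select(self):
        self.t += 1
        if np.any(self.n == 0):
            return int(np.argmin(self.n))   # play every arm once first
        bonus = np.sqrt(2.0 * np.log(self.t) / self.n)
        return int(np.argmax(self.mean - self.costs + bonus))

    def update(self, i, payoff):
        self.n[i] += 1
        self.mean[i] += (payoff - self.mean[i]) / self.n[i]
```

The combinatorial number of observation subsets is exactly why the paper's structured Sim-OOS/Seq-OOS algorithms are needed instead of a flat bandit like this one.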
Tracking with multi-level features
Title | Tracking with multi-level features |
Authors | Roberto Henschel, Laura Leal-Taixé, Bodo Rosenhahn, Konrad Schindler |
Abstract | We present a novel formulation of the multiple object tracking problem which integrates low and mid-level features. In particular, we formulate the tracking problem as a quadratic program coupling detections and dense point trajectories. Due to the computational complexity of the initial QP, we propose an approximation by two auxiliary problems, a temporal and spatial association, where the temporal subproblem can be efficiently solved by a linear program and the spatial association by a clustering algorithm. The objective function of the QP is used in order to find the optimal number of clusters, where each cluster ideally represents one person. Evaluation is provided for multiple scenarios, showing the superiority of our method with respect to classic tracking-by-detection methods and also other methods that greedily integrate low-level features. |
Tasks | Multiple Object Tracking, Object Tracking |
Published | 2016-07-25 |
URL | http://arxiv.org/abs/1607.07304v1 |
http://arxiv.org/pdf/1607.07304v1.pdf | |
PWC | https://paperswithcode.com/paper/tracking-with-multi-level-features |
Repo | |
Framework | |
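A small sketch of the temporal association subproblem described above (the paper solves a linear program; for one-to-one frame-to-frame matching, the assignment LP is solved exactly by the Hungarian algorithm, used here as a stand-in, and the cost definition is an assumption):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(cost, max_cost=50.0):
    # cost: (n_tracks, n_detections), e.g., distances between detections
    # and extrapolated trajectory heads
    rows, cols = linear_sum_assignment(cost)   # minimum-cost one-to-one matching
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_cost]

# Unmatched detections (or matches gated out by max_cost) would seed new tracks.
```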