Paper Group ANR 189
Object Recognition and Identification Using ESM Data. N-ary Error Correcting Coding Scheme. 3D Gaze Estimation from 2D Pupil Positions on Monocular Head-Mounted Eye Trackers. Blind Analysis of CT Image Noise Using Residual Denoised Images. Parsimonious Online Learning with Kernels via Sparse Projections in Function Space. Incorporation of Speech Duration Information in Score Fusion of Speaker Recognition Systems. Collective Semi-Supervised Learning for User Profiling in Social Media. Victory Sign Biometric for Terrorists Identification. Towards Competitive Classifiers for Unbalanced Classification Problems: A Study on the Performance Scores. Neural versus Phrase-Based Machine Translation Quality: a Case Study. Language as a Latent Variable: Discrete Generative Models for Sentence Compression. An Unbiased Data Collection and Content Exploitation/Exploration Strategy for Personalization. A spectral-spatial fusion model for robust blood pulse waveform extraction in photoplethysmographic imaging. Data-Driven Online Decision Making with Costly Information Acquisition. Tracking with multi-level features.
Object Recognition and Identification Using ESM Data
Title | Object Recognition and Identification Using ESM Data |
Authors | E. Taghavi, D. Song, R. Tharmarasa, T. Kirubarajan, Anne-Claire Boury-Brisset, Bhashyam Balaji |
Abstract | Recognition and identification of unknown targets is a crucial task in surveillance and security systems. Electronic Support Measures (ESM) are one of the most effective sensors for identification, especially for maritime and air-to-ground applications. In typical surveillance systems, multiple ESM sensors are usually deployed along with kinematic sensors such as radar. Different ESM sensors may produce different types of reports ready to be sent to the fusion center. The focus of this paper is to develop a new architecture for target recognition and identification when non-homogeneous ESM and possibly kinematic reports are received at the fusion center. The new fusion architecture is evaluated using simulations to show the benefit of utilizing different ESM reports such as attributes and signal-level ESM data. |
Tasks | Object Recognition |
Published | 2016-03-22 |
URL | http://arxiv.org/abs/1607.01355v1 |
http://arxiv.org/pdf/1607.01355v1.pdf | |
PWC | https://paperswithcode.com/paper/object-recognition-and-identification-using |
Repo | |
Framework | |
N-ary Error Correcting Coding Scheme
Title | N-ary Error Correcting Coding Scheme |
Authors | Joey Tianyi Zhou, Ivor W. Tsang, Shen-Shyang Ho, Klaus-Robert Muller |
Abstract | The coding matrix design plays a fundamental role in the prediction performance of error correcting output codes (ECOC)-based multi-class tasks. In many-class classification problems, e.g., fine-grained categorization, it is difficult to distinguish subtle between-class differences under existing coding schemes due to the limited choice of coding values. In this paper, we investigate whether one can relax existing binary and ternary code designs to $N$-ary code design to achieve better classification performance. In particular, we present a novel $N$-ary coding scheme that decomposes the original multi-class problem into simpler multi-class subproblems, similar to applying a divide-and-conquer method. The two main advantages of such a coding scheme are as follows: (i) the ability to construct more discriminative codes and (ii) the flexibility for the user to select the best $N$ for ECOC-based classification. We show empirically that the optimal $N$ (based on classification performance) lies in $[3, 10]$ with some trade-off in computational cost. Moreover, we provide theoretical insights on the dependency of the generalization error bound of an $N$-ary ECOC on the average base classifier generalization error and the minimum distance between any two codes constructed. Extensive experimental results on benchmark multi-class datasets show that the proposed coding scheme achieves superior prediction performance over state-of-the-art coding methods. |
Tasks | |
Published | 2016-03-18 |
URL | http://arxiv.org/abs/1603.05850v1 |
http://arxiv.org/pdf/1603.05850v1.pdf | |
PWC | https://paperswithcode.com/paper/n-ary-error-correcting-coding-scheme |
Repo | |
Framework | |
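A minimal sketch of the $N$-ary ECOC idea described in the abstract above (not the authors' code; the base learner, matrix width, and decoding rule are assumptions): each column of a random $N$-ary coding matrix relabels the $K$ original classes into at most $N$ meta-classes, one base classifier is trained per column, and decoding returns the class whose codeword has minimum Hamming distance to the predicted code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_nary_ecoc(X, y, n_classes, N=4, n_cols=20, seed=0):
    rng = np.random.default_rng(seed)
    M = np.zeros((n_classes, n_cols), dtype=int)   # K x L coding matrix over {0..N-1}
    for j in range(n_cols):
        col = rng.integers(0, N, size=n_classes)
        while len(np.unique(col)) < 2:             # keep every subproblem non-trivial
            col = rng.integers(0, N, size=n_classes)
        M[:, j] = col
    # One multi-class base classifier per column, trained on relabeled targets
    clfs = [LogisticRegression(max_iter=1000).fit(X, M[y, j]) for j in range(n_cols)]
    return M, clfs

def predict_nary_ecoc(X, M, clfs):
    code = np.column_stack([c.predict(X) for c in clfs])     # (n, L) predicted code
    dists = (code[:, None, :] != M[None, :, :]).sum(axis=2)  # Hamming distance per codeword
    return dists.argmin(axis=1)
```

With $N$-ary columns, codewords can be pushed further apart than in the binary or ternary case, which is the discriminability advantage the paper quantifies.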
3D Gaze Estimation from 2D Pupil Positions on Monocular Head-Mounted Eye Trackers
Title | 3D Gaze Estimation from 2D Pupil Positions on Monocular Head-Mounted Eye Trackers |
Authors | Mohsen Mansouryar, Julian Steil, Yusuke Sugano, Andreas Bulling |
Abstract | 3D gaze information is important for scene-centric attention analysis but accurate estimation and analysis of 3D gaze in real-world environments remains challenging. We present a novel 3D gaze estimation method for monocular head-mounted eye trackers. In contrast to previous work, our method does not aim to infer 3D eyeball poses but directly maps 2D pupil positions to 3D gaze directions in scene camera coordinate space. We first provide a detailed discussion of the 3D gaze estimation task and summarize different methods, including our own. We then evaluate the performance of different 3D gaze estimation approaches using both simulated and real data. Through experimental validation, we demonstrate the effectiveness of our method in reducing parallax error, and we identify research challenges for the design of 3D calibration procedures. |
Tasks | Calibration, Gaze Estimation |
Published | 2016-01-11 |
URL | http://arxiv.org/abs/1601.02644v2 |
http://arxiv.org/pdf/1601.02644v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-gaze-estimation-from-2d-pupil-positions-on |
Repo | |
Framework | |
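The abstract above describes a direct regression from 2D pupil positions to 3D gaze directions. A hedged sketch of that mapping (the polynomial degree and least-squares fit are assumptions, not the authors' exact formulation): fit a polynomial model on calibration pairs of pupil coordinates and unit gaze vectors in the scene camera frame.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

def fit_gaze_mapper(pupil_2d, gaze_dir_3d, degree=2):
    # pupil_2d: (n, 2) pupil centers; gaze_dir_3d: (n, 3) unit gaze directions
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    return model.fit(pupil_2d, gaze_dir_3d)

def predict_gaze(model, pupil_2d):
    d = model.predict(np.atleast_2d(pupil_2d))
    return d / np.linalg.norm(d, axis=1, keepdims=True)  # re-normalize to unit length
```

Regressing to directions in the scene camera frame, rather than to fixed points on a calibration plane, is central to the parallax-error reduction reported in the abstract.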
Blind Analysis of CT Image Noise Using Residual Denoised Images
Title | Blind Analysis of CT Image Noise Using Residual Denoised Images |
Authors | Sohini Roychowdhury, Nathan Hollraft, Adam Alessio |
Abstract | CT protocol design and quality control would benefit from automated tools to estimate the quality of generated CT images. These tools could be used to identify erroneous CT acquisitions or refine protocols to achieve certain signal-to-noise characteristics. This paper investigates blind estimation methods to determine global signal strength and noise levels in chest CT images. Methods: We propose novel performance metrics corresponding to the accuracy of noise and signal estimation. We implement and evaluate the noise estimation performance of six spatial- and frequency-based methods, derived from conventional image filtering algorithms. Algorithms were tested on patient data sets from whole-body repeat CT acquisitions performed with a higher- and a lower-dose technique over the same scan region. Results: The proposed performance metrics can evaluate the relative tradeoff between filter parameters and noise estimation performance. The proposed automated methods tend to underestimate CT image noise at low-flux levels. Initial application of the methodology suggests that anisotropic-diffusion and wavelet-transform based filters provide optimal estimates of noise. Furthermore, the methodology does not provide accurate estimates of absolute noise levels, but can provide estimates of relative change and/or trends in noise levels. |
Tasks | |
Published | 2016-05-24 |
URL | http://arxiv.org/abs/1605.07650v1 |
http://arxiv.org/pdf/1605.07650v1.pdf | |
PWC | https://paperswithcode.com/paper/blind-analysis-of-ct-image-noise-using |
Repo | |
Framework | |
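A minimal sketch of residual-based blind noise estimation in the spirit of the paper (the filter choice and parameters here are assumptions; the paper evaluates six spatial- and frequency-based filters, with anisotropic diffusion and wavelet filters performing best): denoise the image, treat the difference as a noise residual, and summarize its spread robustly.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def estimate_noise_std(img, sigma=1.5):
    # Residual denoised image: original minus a smoothed version of itself
    residual = img.astype(float) - gaussian_filter(img.astype(float), sigma=sigma)
    # Median absolute deviation is less sensitive to anatomy leaking into the residual
    mad = np.median(np.abs(residual - np.median(residual)))
    return 1.4826 * mad  # MAD-to-std conversion factor for Gaussian noise
```

As the abstract notes, such estimates track relative changes and trends in noise more reliably than absolute noise levels.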
Parsimonious Online Learning with Kernels via Sparse Projections in Function Space
Title | Parsimonious Online Learning with Kernels via Sparse Projections in Function Space |
Authors | Alec Koppel, Garrett Warnell, Ethan Stump, Alejandro Ribeiro |
Abstract | Despite their attractiveness, popular perception is that techniques for nonparametric function approximation do not scale to streaming data due to an intractable growth in the amount of storage they require. To solve this problem in a memory-affordable way, we propose an online technique based on functional stochastic gradient descent in tandem with supervised sparsification based on greedy function subspace projections. The method, called parsimonious online learning with kernels (POLK), provides a controllable tradeoff between its solution accuracy and the amount of memory it requires. We derive conditions under which the generated function sequence converges almost surely to the optimal function, and we establish that the memory requirement remains finite. We evaluate POLK for kernel multi-class logistic regression and kernel hinge-loss classification on three canonical data sets: a synthetic Gaussian mixture model, the MNIST hand-written digits, and the Brodatz texture database. On all three tasks, we observe a favorable tradeoff between objective function evaluation, classification performance, and the complexity of the nonparametric regressor extracted by the proposed method. |
Tasks | |
Published | 2016-12-13 |
URL | http://arxiv.org/abs/1612.04111v1 |
http://arxiv.org/pdf/1612.04111v1.pdf | |
PWC | https://paperswithcode.com/paper/parsimonious-online-learning-with-kernels-via |
Repo | |
Framework | |
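A simplified sketch of the dictionary-plus-weights representation behind POLK, $f(x) = \sum_i w_i \kappa(x_i, x)$ (this is not the paper's algorithm: POLK sparsifies with greedy function subspace projections, i.e., kernel orthogonal matching pursuit, whereas this toy prunes atoms by weight magnitude):

```python
import numpy as np

def gaussian_k(D, x, bw=1.0):
    return np.exp(-np.sum((D - x) ** 2, axis=1) / (2 * bw ** 2))

class BudgetedKernelSGD:
    def __init__(self, eta=0.1, bw=1.0, lam=1e-3, tol=1e-3):
        self.eta, self.bw, self.lam, self.tol = eta, bw, lam, tol
        self.D = np.empty((0, 0))   # dictionary of kernel centers
        self.w = np.empty(0)        # their weights

    def predict(self, x):
        return 0.0 if self.w.size == 0 else float(self.w @ gaussian_k(self.D, x, self.bw))

    def step(self, x, y):
        g = self.predict(x) - y                 # functional gradient coeff. (squared loss)
        self.w *= (1.0 - self.eta * self.lam)   # shrinkage from the regularizer
        self.D = x[None, :] if self.w.size == 0 else np.vstack([self.D, x])
        self.w = np.append(self.w, -self.eta * g)   # new atom centered at x
        keep = np.abs(self.w) > self.tol            # crude stand-in for KOMP pruning
        self.D, self.w = self.D[keep], self.w[keep]
```

The point the paper makes is that with a principled projection step, the dictionary size stays finite while the function sequence still converges almost surely.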
Incorporation of Speech Duration Information in Score Fusion of Speaker Recognition Systems
Title | Incorporation of Speech Duration Information in Score Fusion of Speaker Recognition Systems |
Authors | Ali Khodabakhsh, Seyyed Saeed Sarfjoo, Umut Uludag, Osman Soyyigit, Cenk Demiroglu |
Abstract | In recent years, identity-vector (i-vector) based speaker verification (SV) systems have become very successful. Nevertheless, environmental noise and speech duration variability still significantly degrade the performance of these systems. In many real-life applications, the duration of recordings is very short; as a result, extracted i-vectors cannot reliably represent the attributes of the speaker. Here, we investigate the effect of speech duration on the performance of three state-of-the-art speaker recognition systems. In addition, using a variety of available score fusion methods, we investigate the effect of score fusion on those speaker verification techniques, to benefit from the performance differences of different methods under different enrollment and test speech duration conditions. Fusing scores with duration information performed significantly better than the baseline score fusion methods. |
Tasks | Speaker Recognition, Speaker Verification |
Published | 2016-08-07 |
URL | http://arxiv.org/abs/1608.02272v1 |
http://arxiv.org/pdf/1608.02272v1.pdf | |
PWC | https://paperswithcode.com/paper/incorporation-of-speech-duration-information |
Repo | |
Framework | |
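A hedged sketch of duration-informed score fusion (the combiner and features are assumptions; the paper compares several fusion methods): train a logistic-regression combiner over the scores of multiple SV systems, with enrollment/test durations as side information, on held-out target/impostor trials.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_fusion(scores, durations, labels):
    # scores: (n_trials, n_systems) raw system scores
    # durations: (n_trials, 2) enrollment and test durations in seconds
    # labels: 1 for target (same-speaker) trials, 0 for impostor trials
    feats = np.hstack([scores, np.log1p(durations)])
    return LogisticRegression(max_iter=1000).fit(feats, labels)

def fused_score(model, scores, durations):
    feats = np.hstack([scores, np.log1p(durations)])
    return model.decision_function(feats)   # higher means more likely same speaker
```

Conditioning the combiner on duration lets it shift weight toward whichever system degrades least on short recordings.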
Collective Semi-Supervised Learning for User Profiling in Social Media
Title | Collective Semi-Supervised Learning for User Profiling in Social Media |
Authors | Richard J. Oentaryo, Ee-Peng Lim, Freddy Chong Tat Chua, Jia-Wei Low, David Lo |
Abstract | The abundance of user-generated data in social media has incentivized the development of methods to infer the latent attributes of users, which are crucially useful for personalization, advertising and recommendation. However, the current user profiling approaches have limited success, due to the lack of a principled way to integrate different types of social relationships of a user, and the reliance on scarcely-available labeled data in building a prediction model. In this paper, we present a novel solution termed Collective Semi-Supervised Learning (CSL), which provides a principled means to integrate different types of social relationship and unlabeled data under a unified computational framework. The joint learning from multiple relationships and unlabeled data yields a computationally sound and accurate approach to model user attributes in social media. Extensive experiments using Twitter data have demonstrated the efficacy of our CSL approach in inferring user attributes such as account type and marital status. We also show how CSL can be used to determine important user features, and to make inference on a larger user population. |
Tasks | |
Published | 2016-06-24 |
URL | http://arxiv.org/abs/1606.07707v1 |
http://arxiv.org/pdf/1606.07707v1.pdf | |
PWC | https://paperswithcode.com/paper/collective-semi-supervised-learning-for-user |
Repo | |
Framework | |
Victory Sign Biometric for Terrorists Identification
Title | Victory Sign Biometric for Terrorists Identification |
Authors | Ahmad B. A. Hassanat, Mahmoud B. Alhasanat, Mohammad Ali Abbadi, Eman Btoush, Mouhammd Al-Awadi, Ahmad S. Tarawneh |
Abstract | When the face and all other body parts are covered, sometimes the only evidence for identifying a person is their hand geometry, and not even the whole hand: only two fingers (the index and the middle fingers) shown while making the victory sign, as seen in many terrorist videos. This paper investigates, for the first time, a new way to identify persons, particularly terrorists, from their victory sign. We have created a new database in this regard using a mobile phone camera, imaging the victory signs of 50 different persons over two sessions. Simple measurements of the fingers, in addition to Hu moments of the finger areas, were used to extract the geometric features of the visible part of the hand after segmentation. The experimental results using the KNN classifier were encouraging for most of the recorded persons, with about 40% to 93% total identification accuracy, depending on the features, distance metric, and K used. |
Tasks | |
Published | 2016-02-26 |
URL | http://arxiv.org/abs/1602.08325v1 |
http://arxiv.org/pdf/1602.08325v1.pdf | |
PWC | https://paperswithcode.com/paper/victory-sign-biometric-for-terrorists |
Repo | |
Framework | |
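A minimal sketch of the classification stage described above (the exact measurement set, K, and distance metric vary in the paper; the helper below is illustrative): Hu moments of the segmented two-finger region, log-scaled as usual, feed a KNN classifier alongside any simple finger measurements.

```python
import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def hu_features(mask):
    # mask: binary uint8 image of the segmented victory-sign region
    hu = cv2.HuMoments(cv2.moments(mask)).flatten()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)  # standard log-magnitude scaling

# Hypothetical usage with pre-segmented masks and person IDs:
# X = np.vstack([hu_features(m) for m in train_masks])
# knn = KNeighborsClassifier(n_neighbors=3, metric="manhattan").fit(X, train_ids)
```

The wide 40% to 93% accuracy range in the abstract reflects exactly these choices: which features, which distance metric, and which K.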
Towards Competitive Classifiers for Unbalanced Classification Problems: A Study on the Performance Scores
Title | Towards Competitive Classifiers for Unbalanced Classification Problems: A Study on the Performance Scores |
Authors | Jonathan Ortigosa-Hernández, Iñaki Inza, Jose A. Lozano |
Abstract | Although a great methodological effort has been invested in proposing competitive solutions to the class-imbalance problem, little effort has been made in pursuing a theoretical understanding of this matter. In order to shed some light on this topic, we perform, through a novel framework, an exhaustive analysis of the adequateness of the most commonly used performance scores to assess this complex scenario. We conclude that using unweighted Hölder means with exponent $p \leq 1$ to average the recalls of all the classes produces adequate scores which are capable of determining whether a classifier is competitive. Then, we review the major solutions presented in the class-imbalance literature. Since any learning task can be defined as an optimisation problem where a loss function, usually connected to a particular score, is minimised, our goal, here, is to find whether the learning tasks found in the literature are also oriented to maximise the previously detected adequate scores. We conclude that they usually maximise the unweighted Hölder mean with $p = 1$ (a-mean). Finally, we provide bounds on the values of the studied performance scores which guarantee a classifier with a higher recall than the random classifier in each and every class. |
Tasks | |
Published | 2016-08-31 |
URL | http://arxiv.org/abs/1608.08984v1 |
http://arxiv.org/pdf/1608.08984v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-competitive-classifiers-for |
Repo | |
Framework | |
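The score family the paper advocates is the unweighted Hölder (power) mean of the per-class recalls, $M_p(r_1,\dots,r_k) = (\frac{1}{k}\sum_i r_i^p)^{1/p}$, with exponent $p \leq 1$; $p = 1$ gives the arithmetic a-mean and $p \to 0$ the geometric mean. A short computation:

```python
import numpy as np

def holder_mean(recalls, p):
    r = np.asarray(recalls, dtype=float)   # per-class recalls, each > 0 when p <= 0
    if p == 0:
        return float(np.exp(np.mean(np.log(r))))  # limiting case: geometric mean
    return float(np.mean(r ** p) ** (1.0 / p))

# A classifier that sacrifices a minority class is punished harder as p decreases:
# holder_mean([0.95, 0.90, 0.10], p=1)  -> 0.65   (a-mean barely notices)
# holder_mean([0.95, 0.90, 0.10], p=-1) -> ~0.25  (harmonic mean collapses)
```

This is why lower exponents better reflect competitiveness on unbalanced problems: the mean is dragged toward the worst-served class.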
Neural versus Phrase-Based Machine Translation Quality: a Case Study
Title | Neural versus Phrase-Based Machine Translation Quality: a Case Study |
Authors | Luisa Bentivogli, Arianna Bisazza, Mauro Cettolo, Marcello Federico |
Abstract | Within the field of Statistical Machine Translation (SMT), the neural approach (NMT) has recently emerged as the first technology able to challenge the long-standing dominance of phrase-based approaches (PBMT). In particular, at the IWSLT 2015 evaluation campaign, NMT outperformed well established state-of-the-art PBMT systems on English-German, a language pair known to be particularly hard because of morphology and syntactic differences. To understand in what respects NMT provides better translation quality than PBMT, we perform a detailed analysis of neural versus phrase-based SMT outputs, leveraging high quality post-edits performed by professional translators on the IWSLT data. For the first time, our analysis provides useful insights on what linguistic phenomena are best modeled by neural models – such as the reordering of verbs – while pointing out other aspects that remain to be improved. |
Tasks | Machine Translation |
Published | 2016-08-16 |
URL | http://arxiv.org/abs/1608.04631v2 |
http://arxiv.org/pdf/1608.04631v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-versus-phrase-based-machine |
Repo | |
Framework | |
Language as a Latent Variable: Discrete Generative Models for Sentence Compression
Title | Language as a Latent Variable: Discrete Generative Models for Sentence Compression |
Authors | Yishu Miao, Phil Blunsom |
Abstract | In this work we explore deep generative models of text in which the latent representation of a document is itself drawn from a discrete language model distribution. We formulate a variational auto-encoder for inference in this model and apply it to the task of compressing sentences. In this application the generative model first draws a latent summary sentence from a background language model, and then subsequently draws the observed sentence conditioned on this latent summary. In our empirical evaluation we show that generative formulations of both abstractive and extractive compression yield state-of-the-art results when trained on a large amount of supervised data. Further, we explore semi-supervised compression scenarios where we show that it is possible to achieve performance competitive with previously proposed supervised models while training on a fraction of the supervised data. |
Tasks | Language Modelling, Sentence Compression |
Published | 2016-09-23 |
URL | http://arxiv.org/abs/1609.07317v2 |
http://arxiv.org/pdf/1609.07317v2.pdf | |
PWC | https://paperswithcode.com/paper/language-as-a-latent-variable-discrete |
Repo | |
Framework | |
An Unbiased Data Collection and Content Exploitation/Exploration Strategy for Personalization
Title | An Unbiased Data Collection and Content Exploitation/Exploration Strategy for Personalization |
Authors | Liangjie Hong, Adnan Boz |
Abstract | One of the missions of personalization systems and recommender systems is to show content items according to users' personal interests. In order to achieve this goal, these systems learn user interests over time and try to present content items tailored to user profiles. Recommending items according to users' preferences has been investigated extensively in the past few years, mainly thanks to the popularity of the Netflix competition. In a real setting, users may be attracted by a subset of those items and interact with them, leaving only partial feedback for the system to learn from in the next cycle, which introduces significant biases into the system and hence results in a situation where user engagement metrics cannot be improved over time. The problem is not limited to one component of the system. The data collected from users is usually used in many different tasks, including learning ranking functions, building user profiles, and constructing content classifiers. Once the data is biased, all these downstream use cases are impacted as well. Therefore, it would be beneficial to gather unbiased data through user interactions. Traditionally, unbiased data collection is done by showing items uniformly sampled from the content pool. However, this simple scheme is not feasible, as it risks user engagement metrics and takes a long time to gather user feedback. In this paper, we introduce a user-friendly unbiased data collection framework that utilizes methods developed in the exploitation-and-exploration literature. We discuss how the framework differs from normal multi-armed bandit problems and why such a method is needed. We lay out a novel Thompson sampling scheme for Bernoulli ranked lists to effectively balance user experience and data collection. The proposed method is validated in a real bucket test, and we show strong results compared to the old algorithms. |
Tasks | Recommendation Systems |
Published | 2016-04-12 |
URL | http://arxiv.org/abs/1604.03506v1 |
http://arxiv.org/pdf/1604.03506v1.pdf | |
PWC | https://paperswithcode.com/paper/an-unbiased-data-collection-and-content |
Repo | |
Framework | |
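A hedged sketch of Thompson sampling for a Bernoulli ranked list (the paper's variant adds machinery to protect user-experience metrics; the priors and slate construction here are assumptions): keep a Beta posterior per item, sample a plausible click rate for each, and show the top-k by sampled value, so exploration happens without uniform random slates.

```python
import numpy as np

class RankedListTS:
    def __init__(self, n_items, seed=0):
        self.a = np.ones(n_items)   # Beta(1, 1) priors: alpha = clicks + 1
        self.b = np.ones(n_items)   #                    beta  = skips  + 1
        self.rng = np.random.default_rng(seed)

    def rank(self, k):
        theta = self.rng.beta(self.a, self.b)   # one posterior sample per item
        return np.argsort(-theta)[:k]           # slate of the k highest samples

    def update(self, item, clicked):
        self.a[item] += clicked
        self.b[item] += 1 - clicked
```

Uncertain items occasionally sample high and get shown, which is the mechanism that collects nearly unbiased feedback without the engagement cost of uniform sampling.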
A spectral-spatial fusion model for robust blood pulse waveform extraction in photoplethysmographic imaging
Title | A spectral-spatial fusion model for robust blood pulse waveform extraction in photoplethysmographic imaging |
Authors | Robert Amelard, David A Clausi, Alexander Wong |
Abstract | Photoplethysmographic imaging is a camera-based solution for non-contact cardiovascular monitoring from a distance. This technology enables monitoring in situations where contact-based devices may be problematic or infeasible, such as ambulatory, sleep, and multi-individual monitoring. However, extracting the blood pulse waveform signal is challenging due to the unknown mixture of relevant (pulsatile) and irrelevant pixels in the scene. Here, we design and implement a signal fusion framework, FusionPPG, for extracting a blood pulse waveform signal with strong temporal fidelity from a scene without requiring anatomical priors (e.g., facial tracking). The extraction problem is posed as a Bayesian least squares fusion problem and solved using a novel probabilistic pulsatility model that incorporates both physiologically derived spectral and spatial waveform priors to identify pulsatility characteristics in the scene. Experimental results show statistically significant improvements compared to the FaceMeanPPG ($p<0.001$) and DistancePPG ($p<0.001$) methods. Heart rates predicted using FusionPPG correlated strongly with ground truth measurements ($r^2=0.9952$). FusionPPG was the only method able to assess cardiac arrhythmia via temporal analysis. |
Tasks | |
Published | 2016-06-29 |
URL | http://arxiv.org/abs/1606.09118v1 |
http://arxiv.org/pdf/1606.09118v1.pdf | |
PWC | https://paperswithcode.com/paper/a-spectral-spatial-fusion-model-for-robust |
Repo | |
Framework | |
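A simplified stand-in for the spectral prior in FusionPPG (not the full Bayesian least-squares model; the heart-rate band below is an assumption): weight each pixel's temporal trace by the fraction of its spectral power inside a plausible pulse band, then fuse by weighted averaging.

```python
import numpy as np

def fuse_pulse(signals, fps, band=(0.75, 4.0)):
    # signals: (n_pixels, n_frames) zero-mean temporal traces; band in Hz (~45-240 bpm)
    freqs = np.fft.rfftfreq(signals.shape[1], d=1.0 / fps)
    power = np.abs(np.fft.rfft(signals, axis=1)) ** 2
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    w = power[:, in_band].sum(axis=1) / (power.sum(axis=1) + 1e-12)  # pulsatility weight
    return (w[:, None] * signals).sum(axis=0) / (w.sum() + 1e-12)    # fused waveform
```

Weighting by in-band power suggests how pulsatile pixels can identify themselves spectrally, without anatomical priors such as face tracking.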
Data-Driven Online Decision Making with Costly Information Acquisition
Title | Data-Driven Online Decision Making with Costly Information Acquisition |
Authors | Onur Atan, Mihaela van der Schaar |
Abstract | In most real-world settings, such as recommender systems, finance, and healthcare, collecting useful information is costly and requires an active choice on the part of the decision-maker, who needs to learn simultaneously what observations to make and what actions to take. This paper incorporates the information acquisition decision into an online learning framework. We propose two different algorithms for this dual learning problem: Sim-OOS and Seq-OOS, where observations are made simultaneously and sequentially, respectively. We prove that both algorithms achieve a regret that is sublinear in time. The developed framework and algorithms can be used in many applications, including medical informatics, recommender systems, and actionable intelligence in transportation, finance, cyber-security, etc., in which collecting information prior to making decisions is costly. We validate our algorithms in a breast cancer example setting in which we show substantial performance gains for our proposed algorithms. |
Tasks | Decision Making, Recommendation Systems |
Published | 2016-02-11 |
URL | http://arxiv.org/abs/1602.03600v2 |
http://arxiv.org/pdf/1602.03600v2.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-online-decision-making-with |
Repo | |
Framework | |
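A toy sketch in the spirit of Sim-OOS (this is not the paper's algorithm, which comes with sublinear-regret guarantees; the arm construction and cost handling here are assumptions): treat each (observation subset, action) pair as a bandit arm and run UCB on net reward, i.e., payoff minus the acquisition cost of the chosen subset.

```python
import numpy as np

class CostAwareUCB:
    def __init__(self, n_arms, costs):
        # costs[i]: information-acquisition cost of arm i's observation subset
        self.costs = np.asarray(costs, dtype=float)
        self.n = np.zeros(n_arms)      # pull counts
        self.mean = np.zeros(n_arms)   # running mean of gross payoff
        self.t = 0

    def select(self):
        self.t += 1
        if np.any(self.n == 0):
            return int(np.argmin(self.n))   # play every arm once first
        bonus = np.sqrt(2.0 * np.log(self.t) / self.n)
        return int(np.argmax(self.mean - self.costs + bonus))

    def update(self, i, payoff):
        self.n[i] += 1
        self.mean[i] += (payoff - self.mean[i]) / self.n[i]
```

The combinatorial number of observation subsets is exactly why the paper's structured Sim-OOS/Seq-OOS algorithms are needed instead of a flat bandit like this one.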
Tracking with multi-level features
Title | Tracking with multi-level features |
Authors | Roberto Henschel, Laura Leal-Taixé, Bodo Rosenhahn, Konrad Schindler |
Abstract | We present a novel formulation of the multiple object tracking problem which integrates low and mid-level features. In particular, we formulate the tracking problem as a quadratic program coupling detections and dense point trajectories. Due to the computational complexity of the initial QP, we propose an approximation by two auxiliary problems, a temporal and spatial association, where the temporal subproblem can be efficiently solved by a linear program and the spatial association by a clustering algorithm. The objective function of the QP is used in order to find the optimal number of clusters, where each cluster ideally represents one person. Evaluation is provided for multiple scenarios, showing the superiority of our method with respect to classic tracking-by-detection methods and also other methods that greedily integrate low-level features. |
Tasks | Multiple Object Tracking, Object Tracking |
Published | 2016-07-25 |
URL | http://arxiv.org/abs/1607.07304v1 |
http://arxiv.org/pdf/1607.07304v1.pdf | |
PWC | https://paperswithcode.com/paper/tracking-with-multi-level-features |
Repo | |
Framework | |
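A small sketch of the temporal association subproblem described above (the paper solves a linear program; for one-to-one frame-to-frame matching, the assignment LP is solved exactly by the Hungarian algorithm, used here as a stand-in, and the cost definition is an assumption):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(cost, max_cost=50.0):
    # cost: (n_tracks, n_detections), e.g., distances between detections
    # and extrapolated trajectory heads
    rows, cols = linear_sum_assignment(cost)   # minimum-cost one-to-one matching
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_cost]

# Unmatched detections (or matches gated out by max_cost) would seed new tracks.
```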