April 2, 2020

Paper Group ANR 207

A Neuromorphic Proto-Object Based Dynamic Visual Saliency Model with an FPGA Implementation. Differential Privacy for Eye Tracking with Temporal Correlations. Reinforcement learning for the manipulation of eye tracking data. Fully Convolutional Neural Networks for Raw Eye Tracking Data Segmentation, Generation, and Reconstruction. End-to-End Models …

A Neuromorphic Proto-Object Based Dynamic Visual Saliency Model with an FPGA Implementation

Title A Neuromorphic Proto-Object Based Dynamic Visual Saliency Model with an FPGA Implementation
Authors Jamal Lottier Molin, Chetan Singh Thakur, Ralph Etienne-Cummings, Ernst Niebur
Abstract The ability to attend to salient regions of a visual scene is an innate and necessary preprocessing step for both biological and engineered systems performing high-level visual tasks (e.g. object detection, tracking, and classification). Computational efficiency, in regard to processing bandwidth and speed, is improved by only devoting computational resources to salient regions of the visual stimuli. In this paper, we first present a biologically-plausible, bottom-up, dynamic visual saliency model based on the notion of proto-objects. This is achieved by incorporating the temporal characteristics of the visual stimulus into the model, similarly to the manner in which early stages of the human visual system extract temporal information. This model outperforms state-of-the-art dynamic visual saliency models in predicting human eye fixations on a commonly-used video dataset with associated eye tracking data. Secondly, for this model to have practical applications, it must be capable of performing its computations in real-time under low-power, small-size, and lightweight constraints. To address this, we introduce a Field-Programmable Gate Array implementation of the model on an Opal Kelly 7350 Kintex-7 board. This novel hardware implementation allows for processing of up to 23.35 frames per second running on a 100 MHz clock, better than a 26x speedup over the software implementation.
Tasks Eye Tracking, Object Detection
Published 2020-02-27
URL https://arxiv.org/abs/2002.11898v2
PDF https://arxiv.org/pdf/2002.11898v2.pdf
PWC https://paperswithcode.com/paper/a-proto-object-based-dynamic-visual-saliency
Repo
Framework
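
The throughput figures in the abstract above can be turned into a rough cycle budget. The following is only a back-of-the-envelope sketch: it assumes the 100 MHz clock drives the whole pipeline and treats "better than 26x" as a lower bound on the speedup. The function names are hypothetical and nothing here comes from the paper's implementation.

```python
# Numbers quoted in the abstract.
CLOCK_HZ = 100e6   # 100 MHz FPGA clock
FPGA_FPS = 23.35   # peak frames per second on the FPGA
SPEEDUP = 26.0     # "better than 26x", used here as a lower bound


def cycles_per_frame(clock_hz: float, fps: float) -> float:
    """Clock cycles available per frame at the given frame rate."""
    return clock_hz / fps


def implied_software_fps(fpga_fps: float, speedup: float) -> float:
    """Upper bound on the software frame rate implied by the speedup."""
    return fpga_fps / speedup


cycles = cycles_per_frame(CLOCK_HZ, FPGA_FPS)
sw_fps = implied_software_fps(FPGA_FPS, SPEEDUP)
print(f"about {cycles / 1e6:.2f} million cycles per frame")
print(f"software baseline at most about {sw_fps:.2f} frames per second")
```

Under these assumptions the FPGA spends a little over four million cycles per frame, and the software baseline would run at under one frame per second.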

Differential Privacy for Eye Tracking with Temporal Correlations

Title Differential Privacy for Eye Tracking with Temporal Correlations
Authors Efe Bozkir, Onur Günlü, Wolfgang Fuhl, Rafael F. Schaefer, Enkelejda Kasneci
Abstract Head mounted displays bring eye tracking into daily use and this raises privacy concerns for users. Privacy-preservation techniques such as differential privacy mechanisms have recently been applied to the eye tracking data obtained from such displays; however, standard differential privacy mechanisms are vulnerable to temporal correlations in the eye movement features. In this work, a transform coding based differential privacy mechanism is proposed for the first time in the eye tracking literature to further adapt it to statistics of eye movement feature data by comparing various low-complexity methods. The Fourier Perturbation Algorithm, which is a differential privacy mechanism, is extended and a scaling mistake in its proof is corrected. Significant reductions in correlations in addition to query sensitivities are illustrated, which provide the best utility-privacy trade-off in the literature for the eye tracking dataset used. The differentially private eye movement data are also evaluated for classification accuracies for gender and document-type predictions to show that higher privacy is obtained without a reduction in the classification accuracies by using the proposed methods.
Tasks Eye Tracking
Published 2020-02-20
URL https://arxiv.org/abs/2002.08972v1
PDF https://arxiv.org/pdf/2002.08972v1.pdf
PWC https://paperswithcode.com/paper/differential-privacy-for-eye-tracking-with
Repo
Framework
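
For orientation, the "standard" mechanism the abstract contrasts itself with can be sketched as the textbook Laplace mechanism, which adds independent noise of scale sensitivity/epsilon to each released value. This is not the paper's transform-coding method or its corrected Fourier Perturbation Algorithm. The feature values, the sensitivity and the epsilon below are hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise.

    The difference of two independent exponential draws with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(values, sensitivity, epsilon, rng):
    """Add independent Laplace noise of scale sensitivity/epsilon to each value."""
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


rng = random.Random(0)
# Hypothetical eye movement feature (e.g. fixation durations in seconds).
fixation_durations = [0.21, 0.35, 0.18, 0.40]
private = laplace_mechanism(fixation_durations, sensitivity=1.0, epsilon=0.5, rng=rng)
print(private)
```

Because the noise is drawn independently per sample, correlated consecutive samples can leak more than the nominal epsilon suggests, which is the weakness the abstract points to.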

Reinforcement learning for the manipulation of eye tracking data

Title Reinforcement learning for the manipulation of eye tracking data
Authors Wolfgang Fuhl
Abstract In this paper, we present an approach based on reinforcement learning for eye tracking data manipulation. It is based on two opposing agents, where one tries to classify the data correctly and the second agent looks for patterns in the data, which get manipulated to hide specific information. We show that our approach is successfully applicable to preserve the privacy of a subject. In addition, our approach makes it possible to evaluate the importance of temporal, as well as spatial, information of eye tracking data for specific classification goals. In general, this approach can also be used for stimuli manipulation, making it interesting for gaze guidance. For this purpose, this work provides the theoretical basis, which is why we have also integrated a section on how to apply this method for gaze guidance.
Tasks Eye Tracking
Published 2020-02-17
URL https://arxiv.org/abs/2002.06806v1
PDF https://arxiv.org/pdf/2002.06806v1.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-for-the-manipulation
Repo
Framework

Fully Convolutional Neural Networks for Raw Eye Tracking Data Segmentation, Generation, and Reconstruction

Title Fully Convolutional Neural Networks for Raw Eye Tracking Data Segmentation, Generation, and Reconstruction
Authors Wolfgang Fuhl
Abstract In this paper, we use fully convolutional neural networks for the semantic segmentation of eye tracking data. We also use these networks for reconstruction, and in conjunction with a variational auto-encoder to generate eye movement data. The first improvement of our approach is that no input window is necessary, due to the use of fully convolutional networks and therefore any input size can be processed directly. The second improvement is that the used and generated data is raw eye tracking data (position X, Y and time) without preprocessing. This is achieved by pre-initializing the filters in the first layer and by building the input tensor along the z axis. We evaluated our approach on three publicly available datasets and compare the results to the state of the art.
Tasks Eye Tracking, Semantic Segmentation
Published 2020-02-17
URL https://arxiv.org/abs/2002.10905v1
PDF https://arxiv.org/pdf/2002.10905v1.pdf
PWC https://paperswithcode.com/paper/fully-convolutional-neural-networks-for-raw
Repo
Framework

End-to-End Models for the Analysis of System 1 and System 2 Interactions based on Eye-Tracking Data

Title End-to-End Models for the Analysis of System 1 and System 2 Interactions based on Eye-Tracking Data
Authors Alessandro Rossi, Sara Ermini, Dario Bernabini, Dario Zanca, Marino Todisco, Alessandro Genovese, Antonio Rizzo
Abstract While theories postulating a dual cognitive system take hold, quantitative confirmations are still needed to understand and identify interactions between the two systems or conflict events. Eye movements are among the most direct markers of the individual attentive load and may serve as an important proxy of information. In this work we propose a computational method, within a modified visual version of the well-known Stroop test, for the identification of different tasks and potential conflict events between the two systems through the collection and processing of data related to eye movements. A statistical analysis shows that the selected variables can characterize the variation of attentive load within different scenarios. Moreover, we show that Machine Learning techniques make it possible to distinguish between different tasks with a good classification accuracy and to investigate the gaze dynamics in more depth.
Tasks Eye Tracking
Published 2020-02-03
URL https://arxiv.org/abs/2002.11192v1
PDF https://arxiv.org/pdf/2002.11192v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-models-for-the-analysis-of-system
Repo
Framework

Relevance Prediction from Eye-movements Using Semi-interpretable Convolutional Neural Networks

Title Relevance Prediction from Eye-movements Using Semi-interpretable Convolutional Neural Networks
Authors Nilavra Bhattacharya, Somnath Rakshit, Jacek Gwizdka, Paul Kogut
Abstract We propose an image-classification method to predict the perceived-relevance of text documents from eye-movements. An eye-tracking study was conducted where participants read short news articles, and rated them as relevant or irrelevant for answering a trigger question. We encode participants’ eye-movement scanpaths as images, and then train a convolutional neural network classifier using these scanpath images. The trained classifier is used to predict participants’ perceived-relevance of news articles from the corresponding scanpath images. This method is content-independent, as the classifier does not require knowledge of the screen-content, or the user’s information-task. Even with little data, the image classifier can predict perceived-relevance with up to 80% accuracy. When compared to similar eye-tracking studies from the literature, this scanpath image classification method outperforms previously reported metrics by appreciable margins. We also attempt to interpret how the image classifier differentiates between scanpaths on relevant and irrelevant documents.
Tasks Eye Tracking, Image Classification
Published 2020-01-15
URL https://arxiv.org/abs/2001.05152v1
PDF https://arxiv.org/pdf/2001.05152v1.pdf
PWC https://paperswithcode.com/paper/relevance-prediction-from-eye-movements-using
Repo
Framework

Multi-task Learning for Speaker Verification and Voice Trigger Detection

Title Multi-task Learning for Speaker Verification and Voice Trigger Detection
Authors Siddharth Sigtia, Erik Marchi, Sachin Kajarekar, Devang Naik, John Bridle
Abstract Automatic speech transcription and speaker recognition are usually treated as separate tasks even though they are interdependent. In this study, we investigate training a single network to perform both tasks jointly. We train the network in a supervised multi-task learning setup, where the speech transcription branch of the network is trained to minimise a phonetic connectionist temporal classification (CTC) loss while the speaker recognition branch of the network is trained to label the input sequence with the correct label for the speaker. We present a large-scale empirical study where the model is trained using several thousand hours of labelled training data for each task. We evaluate the speech transcription branch of the network on a voice trigger detection task while the speaker recognition branch is evaluated on a speaker verification task. Results demonstrate that the network is able to encode both phonetic and speaker information in its learnt representations while yielding accuracies at least as good as the baseline models for each task, with the same number of parameters as the independent models.
Tasks Multi-Task Learning, Speaker Recognition, Speaker Verification
Published 2020-01-26
URL https://arxiv.org/abs/2001.10816v1
PDF https://arxiv.org/pdf/2001.10816v1.pdf
PWC https://paperswithcode.com/paper/multi-task-learning-for-speaker-verification
Repo
Framework

A multi-site study of a breast density deep learning model for full-field digital mammography and digital breast tomosynthesis exams

Title A multi-site study of a breast density deep learning model for full-field digital mammography and digital breast tomosynthesis exams
Authors Thomas P. Matthews, Sadanand Singh, Brent Mombourquette, Jason Su, Meet P. Shah, Stefano Pedemonte, Aaron Long, David Maffit, Jenny Gurney, Rodrigo Morales Hoil, Nikita Ghare, Douglas Smith, Stephen M. Moore, Susan C. Marks, Richard L. Wahl
Abstract Purpose: To develop a Breast Imaging Reporting and Data System (BI-RADS) breast density DL model in a multi-site setting for synthetic 2D mammography (SM) images derived from 3D DBT exams using FFDM images and limited SM data. Materials and Methods: A DL model was trained to predict BI-RADS breast density using FFDM images acquired from 2008 to 2017 (Site 1: 57492 patients, 187627 exams, 750752 images) for this retrospective study. The FFDM model was evaluated using SM datasets from two institutions (Site 1: 3842 patients, 3866 exams, 14472 images, acquired from 2016 to 2017; Site 2: 7557 patients, 16283 exams, 63973 images, 2015 to 2019). Adaptation methods were investigated to improve performance on the SM datasets and the effect of dataset size on each adaptation method is considered. Statistical significance was assessed using confidence intervals (CI), estimated by bootstrapping. Results: Without adaptation, the model demonstrated close agreement with the original reporting radiologists for all three datasets (Site 1 FFDM: linearly-weighted $\kappa_w$ = 0.75, 95% CI: [0.74, 0.76]; Site 1 SM: $\kappa_w$ = 0.71, CI: [0.64, 0.78]; Site 2 SM: $\kappa_w$ = 0.72, CI: [0.70, 0.75]). With adaptation, performance improved for Site 2 (Site 1: $\kappa_w$ = 0.72, CI: [0.66, 0.79], Site 2: $\kappa_w$ = 0.79, CI: [0.76, 0.81]) using only 500 SM images from each site. Conclusion: A BI-RADS breast density DL model demonstrated strong performance on FFDM and SM images from two institutions without training on SM images and improved using few SM images.
Tasks
Published 2020-01-23
URL https://arxiv.org/abs/2001.08383v1
PDF https://arxiv.org/pdf/2001.08383v1.pdf
PWC https://paperswithcode.com/paper/a-multi-site-study-of-a-breast-density-deep
Repo
Framework
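
The agreement metric reported in the abstract above, linearly weighted Cohen's kappa, can be computed as one minus the ratio of observed to chance-expected weighted disagreement, with weight |i - j| / (k - 1) between categories i and j. The sketch below is a plain-Python illustration, not the authors' evaluation code. The ratings are hypothetical, and the four BI-RADS density categories are encoded as 0 to 3 by assumption.

```python
from collections import Counter


def linear_weighted_kappa(rater_a, rater_b, num_categories):
    """Linearly weighted Cohen's kappa for integer labels in [0, num_categories)."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("rating lists must be non-empty and of equal length")
    k = num_categories
    observed = Counter(zip(rater_a, rater_b))  # joint counts
    marg_a = Counter(rater_a)                  # marginal counts, rater A
    marg_b = Counter(rater_b)                  # marginal counts, rater B

    observed_dis = 0.0
    expected_dis = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)      # linear disagreement weight
            observed_dis += weight * observed[(i, j)] / n
            expected_dis += weight * marg_a[i] * marg_b[j] / (n * n)
    if expected_dis == 0.0:
        raise ValueError("kappa is undefined when no disagreement is expected by chance")
    return 1.0 - observed_dis / expected_dis


# Hypothetical density labels from a model and a radiologist.
model_labels = [0, 1, 1, 2, 3, 2, 1, 0]
radiologist_labels = [0, 1, 2, 2, 3, 3, 1, 1]
print(linear_weighted_kappa(model_labels, radiologist_labels, num_categories=4))
```

A confidence interval like the ones quoted could then be estimated by resampling the paired ratings with replacement and recomputing kappa on each resample.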

Coronary Artery Disease Diagnosis; Ranking the Significant Features Using Random Trees Model

Title Coronary Artery Disease Diagnosis; Ranking the Significant Features Using Random Trees Model
Authors Javad Hassannataj Joloudari, Edris Hassannataj Joloudari, Hamid Saadatfar, Mohammad GhasemiGol, Seyyed Mohammad Razavi, Amir Mosavi, Narjes Nabipour, Shahaboddin Shamshirband, Laszlo Nadai
Abstract Heart disease is one of the most common diseases in middle-aged citizens. Among the vast number of heart diseases, coronary artery disease (CAD) is considered a common cardiovascular disease with a high death rate. The most popular tool for diagnosing CAD is the use of medical imaging, e.g., angiography. However, angiography is known for being costly and also associated with a number of side effects. Hence, the purpose of this study is to increase the accuracy of coronary heart disease diagnosis through selecting significant predictive features in order of their ranking. In this study, we propose an integrated method using machine learning. The machine learning methods of random trees (RTs), decision tree of C5.0, support vector machine (SVM), and decision tree of Chi-squared automatic interaction detection (CHAID) are used in this study. The proposed method shows promising results and the study confirms that the RTs model outperforms the other models.
Tasks
Published 2020-01-16
URL https://arxiv.org/abs/2001.09841v1
PDF https://arxiv.org/pdf/2001.09841v1.pdf
PWC https://paperswithcode.com/paper/coronary-artery-disease-diagnosis-ranking-the
Repo
Framework

Reservoir Computing with Planar Nanomagnet Arrays

Title Reservoir Computing with Planar Nanomagnet Arrays
Authors Peng Zhou, Nathan R. McDonald, Alexander J. Edwards, Lisa Loomis, Clare D. Thiem, Joseph S. Friedman
Abstract Reservoir computing is an emerging methodology for neuromorphic computing that is especially well-suited for hardware implementations in size, weight, and power (SWaP) constrained environments. This work proposes a novel hardware implementation of a reservoir computer using a planar nanomagnet array. A small nanomagnet reservoir is demonstrated via micromagnetic simulations to be able to identify simple waveforms with 100% accuracy. Planar nanomagnet reservoirs are a promising new solution to the growing need for dedicated neuromorphic hardware.
Tasks
Published 2020-03-24
URL https://arxiv.org/abs/2003.10948v1
PDF https://arxiv.org/pdf/2003.10948v1.pdf
PWC https://paperswithcode.com/paper/reservoir-computing-with-planar-nanomagnet
Repo
Framework

Neural Operator: Graph Kernel Network for Partial Differential Equations

Title Neural Operator: Graph Kernel Network for Partial Differential Equations
Authors Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar
Abstract The classical development of neural networks has been primarily for mappings between a finite-dimensional Euclidean space and a set of classes, or between two finite-dimensional Euclidean spaces. The purpose of this work is to generalize neural networks so that they can learn mappings between infinite-dimensional spaces (operators). The key innovation in our work is that a single set of network parameters, within a carefully designed network architecture, may be used to describe mappings between infinite-dimensional spaces and between different finite-dimensional approximations of those spaces. We formulate approximation of the infinite-dimensional mapping by composing nonlinear activation functions and a class of integral operators. The kernel integration is computed by message passing on graph networks. This approach has substantial practical consequences which we will illustrate in the context of mappings between input data to partial differential equations (PDEs) and their solutions. In this context, such learned networks can generalize among different approximation methods for the PDE (such as finite difference or finite element methods) and among approximations corresponding to different underlying levels of resolution and discretization. Experiments confirm that the proposed graph kernel network does have the desired properties and show competitive performance compared to the state of the art solvers.
Tasks
Published 2020-03-07
URL https://arxiv.org/abs/2003.03485v1
PDF https://arxiv.org/pdf/2003.03485v1.pdf
PWC https://paperswithcode.com/paper/neural-operator-graph-kernel-network-for
Repo
Framework
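
The construction described in the abstract above, nonlinear activations composed with integral operators whose kernel integration is evaluated by message passing, can be written out roughly as follows. The notation is assumed for illustration rather than taken from the abstract: $v_t$ is the hidden representation at layer $t$, $\kappa_\phi$ a learned kernel, $N(x)$ the graph neighbourhood of node $x$, and $W$ a pointwise linear term that the abstract does not mention explicitly.

```latex
% One layer: pointwise linear term plus a kernel integral over the domain D
v_{t+1}(x) = \sigma\!\Big( W v_t(x) + \int_{D} \kappa_\phi(x, y)\, v_t(y)\, \mathrm{d}y \Big)

% Message-passing approximation: average over the sampled neighbours of x
v_{t+1}(x) \approx \sigma\!\Big( W v_t(x)
    + \frac{1}{|N(x)|} \sum_{y \in N(x)} \kappa_\phi(x, y)\, v_t(y) \Big)
```

Because $\kappa_\phi$ and $W$ do not depend on the particular set of sample points, the same parameters can in principle be applied to graphs built at different resolutions, which is the property the abstract emphasises.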

Deep Vectorization of Technical Drawings

Title Deep Vectorization of Technical Drawings
Authors Vage Egiazarian, Oleg Voynov, Alexey Artemov, Denis Volkhonskiy, Aleksandr Safin, Maria Taktasheva, Denis Zorin, Evgeny Burnaev
Abstract We present a new method for vectorization of technical line drawings, such as floor plans, architectural drawings, and 2D CAD images. Our method includes (1) a deep learning-based cleaning stage to eliminate the background and imperfections in the image and fill in missing parts, (2) a transformer-based network to estimate vector primitives, and (3) an optimization procedure to obtain the final primitive configurations. We train the networks on synthetic data, renderings of vector line drawings, and manually vectorized scans of line drawings. Our method quantitatively and qualitatively outperforms a number of existing techniques on a collection of representative technical drawings.
Tasks
Published 2020-03-11
URL https://arxiv.org/abs/2003.05471v2
PDF https://arxiv.org/pdf/2003.05471v2.pdf
PWC https://paperswithcode.com/paper/deep-vectorization-of-technical-drawings
Repo
Framework

Context-Aware Recommendations for Televisions Using Deep Embeddings with Relaxed N-Pairs Loss Objective

Title Context-Aware Recommendations for Televisions Using Deep Embeddings with Relaxed N-Pairs Loss Objective
Authors Miklas S. Kristoffersen, Sven E. Shepstone, Zheng-Hua Tan
Abstract This paper studies context-aware recommendations in the television domain by proposing a deep learning-based method for learning joint context-content embeddings (JCCE). The method builds on recent developments within recommendations using latent representations and deep metric learning, in order to effectively represent contextual settings of viewing situations as well as available content in a shared latent space. This embedding space is used for exploring relevant content in various viewing settings by applying an N-pairs loss objective as well as a relaxed variant introduced in this paper. Experiments on two datasets confirm the recommendation ability of JCCE, achieving improvements when compared to state-of-the-art methods. Further experiments display useful structures in the learned embeddings that can be used to gain valuable knowledge of underlying variables in the relationship between contextual settings and content properties.
Tasks Metric Learning
Published 2020-02-04
URL https://arxiv.org/abs/2002.01554v1
PDF https://arxiv.org/pdf/2002.01554v1.pdf
PWC https://paperswithcode.com/paper/context-aware-recommendations-for-televisions
Repo
Framework
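
The N-pairs objective mentioned in the abstract above can be sketched in its standard form: each context embedding is scored against every content embedding in the batch, and the loss is the softmax cross-entropy of picking its own paired content. This is only the standard formulation. The relaxed variant introduced in the paper is not described in the abstract and is not reproduced here, and the embeddings below are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def n_pairs_loss(contexts, contents):
    """Standard N-pairs loss over a batch of paired embeddings.

    contexts[i] is paired with contents[i]; all other contents in the
    batch act as negatives for contexts[i].
    """
    n = len(contexts)
    total = 0.0
    for i, ctx in enumerate(contexts):
        logits = [dot(ctx, c) for c in contents]
        m = max(logits)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        # Equals log(1 + sum_{j != i} exp(s_ij - s_ii))
        total += log_sum_exp - logits[i]
    return total / n


# Hypothetical 2-D embeddings for three context/content pairs.
contexts = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
contents = [[0.9, 0.1], [0.1, 0.9], [0.6, 0.8]]
print(n_pairs_loss(contexts, contents))
```

Using the rest of the batch as negatives avoids explicit negative sampling. The loss tends to zero as each context scores its own content far above the others.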

First Investigation Into the Use of Deep Learning for Continuous Assessment of Neonatal Postoperative Pain

Title First Investigation Into the Use of Deep Learning for Continuous Assessment of Neonatal Postoperative Pain
Authors Md Sirajus Salekin, Ghada Zamzmi, Dmitry Goldgof, Rangachar Kasturi, Thao Ho, Yu Sun
Abstract This paper presents the first investigation into the use of a fully automated deep learning framework for assessing neonatal postoperative pain. It specifically investigates the use of a Bilinear Convolutional Neural Network (B-CNN) to extract facial features during different levels of postoperative pain followed by modeling the temporal pattern using a Recurrent Neural Network (RNN). Although acute and postoperative pain have some common characteristics (e.g., visual action units), postoperative pain has a different dynamic, and it evolves in a unique pattern over time. Our experimental results indicate a clear difference between the pattern of acute and postoperative pain. They also suggest the efficiency of using a combination of bilinear CNN with RNN model for the continuous assessment of postoperative pain intensity.
Tasks
Published 2020-03-24
URL https://arxiv.org/abs/2003.10601v1
PDF https://arxiv.org/pdf/2003.10601v1.pdf
PWC https://paperswithcode.com/paper/first-investigation-into-the-use-of-deep
Repo
Framework

Syndrome-Enabled Unsupervised Learning for Channel Adaptive Blind Equalizer with Joint Optimization Mechanism

Title Syndrome-Enabled Unsupervised Learning for Channel Adaptive Blind Equalizer with Joint Optimization Mechanism
Authors Chieh-Fang Teng, Yen-Liang Chen
Abstract With the rapid growth of deep learning in many fields, machine learning-assisted communication systems have attracted considerable research with many eye-catching initial results. At the present stage, most of the methods still demand massive “labeled data” for supervised learning to overcome channel variation. However, obtaining labeled data in practical applications may result in severe transmission overheads, and thus degrade the spectral efficiency. To address this issue, syndrome loss has been proposed to penalize non-valid decoded codewords and to achieve unsupervised learning for neural network-based decoders. However, it has not been evaluated under varying channels and cannot be applied to polar codes directly. In this work, by exploiting the nature of polar codes and taking advantage of the standardized cyclic redundancy check (CRC) mechanism, we propose two kinds of modified syndrome loss to enable unsupervised learning for polar codes. In addition, two application scenarios that benefit from the syndrome loss are also proposed for the evaluation. From simulation results, the proposed syndrome loss can even outperform supervised learning for the training of a neural network-based polar decoder. Furthermore, the proposed syndrome-enabled blind equalizer can avoid the transmission of training sequences under time-varying fading channels and achieve the global optimum via a joint optimization mechanism, which has a 1.3 dB gain over the non-blind minimum mean square error (MMSE) equalizer.
Tasks
Published 2020-01-06
URL https://arxiv.org/abs/2001.01426v1
PDF https://arxiv.org/pdf/2001.01426v1.pdf
PWC https://paperswithcode.com/paper/syndrome-enabled-unsupervised-learning-for
Repo
Framework
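
The idea behind a syndrome loss, penalising decoder outputs that are not valid codewords without needing the transmitted bits, can be illustrated with a hard-decision syndrome check. The sketch below uses a Hamming(7,4) parity-check matrix purely as a stand-in. The paper targets polar codes with a CRC and a differentiable loss on soft outputs, neither of which is reproduced here, and the function names are hypothetical.

```python
# Parity-check matrix of a Hamming(7,4) code: column c (1-indexed)
# is the binary representation of c, least significant bit in row 0.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(word, parity_check=H):
    """Return H * word (mod 2); all zeros means `word` is a valid codeword."""
    return [sum(h * b for h, b in zip(row, word)) % 2 for row in parity_check]


def syndrome_penalty(word, parity_check=H):
    """Number of violated parity checks, usable as a label-free penalty."""
    return sum(syndrome(word, parity_check))


valid = [1, 1, 1, 0, 0, 0, 0]      # satisfies every parity check
corrupted = [1, 1, 1, 0, 1, 0, 0]  # same word with one bit flipped
print(syndrome(valid), syndrome_penalty(valid))
print(syndrome(corrupted), syndrome_penalty(corrupted))
```

The penalty depends only on the decoder output and the code structure, so it can be evaluated on received data without labels. That is the property that makes unsupervised training possible in principle.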