Paper Group ANR 207
A Neuromorphic Proto-Object Based Dynamic Visual Saliency Model with an FPGA Implementation. Differential Privacy for Eye Tracking with Temporal Correlations. Reinforcement learning for the manipulation of eye tracking data. Fully Convolutional Neural Networks for Raw Eye Tracking Data Segmentation, Generation, and Reconstruction. End-to-End Models …
A Neuromorphic Proto-Object Based Dynamic Visual Saliency Model with an FPGA Implementation
Title | A Neuromorphic Proto-Object Based Dynamic Visual Saliency Model with an FPGA Implementation |
Authors | Jamal Lottier Molin, Chetan Singh Thakur, Ralph Etienne-Cummings, Ernst Niebur |
Abstract | The ability to attend to salient regions of a visual scene is an innate and necessary preprocessing step for both biological and engineered systems performing high-level visual tasks (e.g. object detection, tracking, and classification). Computational efficiency, in regard to processing bandwidth and speed, is improved by only devoting computational resources to salient regions of the visual stimuli. In this paper, we first present a biologically-plausible, bottom-up, dynamic visual saliency model based on the notion of proto-objects. This is achieved by incorporating the temporal characteristics of the visual stimulus into the model, similarly to the manner in which the early stages of the human visual system extract temporal information. This model outperforms state-of-the-art dynamic visual saliency models in predicting human eye fixations on a commonly-used video dataset with associated eye tracking data. Secondly, for this model to have practical applications, it must be capable of performing its computations in real-time under low-power, small-size, and lightweight constraints. To address this, we introduce a Field-Programmable Gate Array implementation of the model on an Opal Kelly 7350 Kintex-7 board. This novel hardware implementation allows for processing of up to 23.35 frames per second running on a 100 MHz clock – more than a 26x speedup over the software implementation. |
Tasks | Eye Tracking, Object Detection |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.11898v2 |
https://arxiv.org/pdf/2002.11898v2.pdf | |
PWC | https://paperswithcode.com/paper/a-proto-object-based-dynamic-visual-saliency |
Repo | |
Framework | |
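The temporal component described above can be illustrated with a minimal sketch: one common way to inject temporal information into a bottom-up saliency pipeline is a frame-difference "motion channel" normalized for fusion with static channels. This is only an illustrative analogue, not the authors' proto-object model or FPGA design.

```python
def motion_channel(prev_frame, curr_frame):
    """Per-pixel absolute difference between consecutive grayscale frames:
    a crude motion/temporal-change map."""
    return [[abs(c - p) for p, c in zip(prow, crow)]
            for prow, crow in zip(prev_frame, curr_frame)]

def normalize(channel):
    """Scale a 2D map to [0, 1] so it can be fused with static saliency channels."""
    peak = max(max(row) for row in channel)
    if peak == 0:
        return channel
    return [[v / peak for v in row] for row in channel]
```

In a full model, maps like this are combined with static feature channels (intensity, orientation, color) before the proto-object grouping stage.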
Differential Privacy for Eye Tracking with Temporal Correlations
Title | Differential Privacy for Eye Tracking with Temporal Correlations |
Authors | Efe Bozkir, Onur Günlü, Wolfgang Fuhl, Rafael F. Schaefer, Enkelejda Kasneci |
Abstract | Head-mounted displays bring eye tracking into daily use, and this raises privacy concerns for users. Privacy-preservation techniques such as differential privacy mechanisms have recently been applied to the eye tracking data obtained from such displays; however, standard differential privacy mechanisms are vulnerable to temporal correlations in the eye movement features. In this work, a transform-coding-based differential privacy mechanism is proposed for the first time in the eye tracking literature and adapted to the statistics of eye movement feature data by comparing various low-complexity methods. The Fourier Perturbation Algorithm, a differential privacy mechanism, is extended, and a scaling mistake in its proof is corrected. Significant reductions in correlations, in addition to query sensitivities, are illustrated, providing the best utility-privacy trade-off in the literature for the eye tracking dataset used. The differentially private eye movement data are also evaluated for classification accuracy on gender and document-type predictions, showing that the proposed methods achieve higher privacy without a reduction in classification accuracy. |
Tasks | Eye Tracking |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.08972v1 |
https://arxiv.org/pdf/2002.08972v1.pdf | |
PWC | https://paperswithcode.com/paper/differential-privacy-for-eye-tracking-with |
Repo | |
Framework | |
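The Fourier Perturbation Algorithm extended in this paper can be sketched in a few lines: transform the query-answer sequence with a DFT, perturb only the first k coefficients with Laplace noise, drop the rest, and invert. The noise scale used here is a simplification (the paper itself corrects a scaling mistake in the original proof, which is not reproduced here).

```python
import cmath
import math
import random

def laplace_noise(scale):
    """Sample zero-mean Laplace noise via the inverse-CDF method."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def fourier_perturbation(series, k, epsilon, sensitivity):
    """Simplified Fourier Perturbation: DFT, perturb first k coefficients,
    zero the rest, inverse DFT back to a private time series."""
    n = len(series)
    coeffs = [sum(series[t] * cmath.exp(-2j * cmath.pi * f * t / n)
                  for t in range(n)) for f in range(n)]
    scale = math.sqrt(k) * sensitivity / epsilon  # simplified calibration
    noisy = [c + laplace_noise(scale) if f < k else 0.0
             for f, c in enumerate(coeffs)]
    return [sum(noisy[f] * cmath.exp(2j * cmath.pi * f * t / n)
                for f in range(n)).real / n for t in range(n)]
```

Truncating to the first k coefficients is what reduces the sensitivity of temporally correlated queries: most of the energy of smooth eye-movement feature sequences lives in the low frequencies.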
Reinforcement learning for the manipulation of eye tracking data
Title | Reinforcement learning for the manipulation of eye tracking data |
Authors | Wolfgang Fuhl |
Abstract | In this paper, we present an approach based on reinforcement learning for eye tracking data manipulation. It is based on two opposing agents, where one tries to classify the data correctly and the second agent looks for patterns in the data, which are manipulated to hide specific information. We show that our approach is successfully applicable to preserving the privacy of a subject. In addition, our approach allows us to evaluate the importance of temporal, as well as spatial, information in eye tracking data for specific classification goals. In general, this approach can also be used for stimulus manipulation, making it interesting for gaze guidance. This work provides the theoretical basis for that purpose, which is why we have also included a section on how to apply this method to gaze guidance. |
Tasks | Eye Tracking |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.06806v1 |
https://arxiv.org/pdf/2002.06806v1.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-for-the-manipulation |
Repo | |
Framework | |
Fully Convolutional Neural Networks for Raw Eye Tracking Data Segmentation, Generation, and Reconstruction
Title | Fully Convolutional Neural Networks for Raw Eye Tracking Data Segmentation, Generation, and Reconstruction |
Authors | Wolfgang Fuhl |
Abstract | In this paper, we use fully convolutional neural networks for the semantic segmentation of eye tracking data. We also use these networks for reconstruction and, in conjunction with a variational auto-encoder, to generate eye movement data. The first improvement of our approach is that no input window is necessary: thanks to the use of fully convolutional networks, any input size can be processed directly. The second improvement is that the data used and generated are raw eye tracking data (x position, y position, and time) without preprocessing. This is achieved by pre-initializing the filters in the first layer and by building the input tensor along the z axis. We evaluated our approach on three publicly available datasets and compared the results to the state of the art. |
Tasks | Eye Tracking, Semantic Segmentation |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.10905v1 |
https://arxiv.org/pdf/2002.10905v1.pdf | |
PWC | https://paperswithcode.com/paper/fully-convolutional-neural-networks-for-raw |
Repo | |
Framework | |
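The "no input window necessary" property of fully convolutional models comes down to the fact that a convolution is defined for any sequence length. A minimal 1-D sketch (a real model stacks many learned filter banks over the raw x/y/time channels; this is a single fixed kernel):

```python
def conv1d_valid(signal, kernel):
    """'Valid' 1-D convolution: output length = len(signal) - len(kernel) + 1.
    Nothing here depends on a fixed window size, so raw gaze sequences of
    any length can be processed directly."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]
```

Running the same kernel over a 5-sample and an 8-sample sequence works without padding, cropping, or windowing, which is exactly why window-based preprocessing can be dropped.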
End-to-End Models for the Analysis of System 1 and System 2 Interactions based on Eye-Tracking Data
Title | End-to-End Models for the Analysis of System 1 and System 2 Interactions based on Eye-Tracking Data |
Authors | Alessandro Rossi, Sara Ermini, Dario Bernabini, Dario Zanca, Marino Todisco, Alessandro Genovese, Antonio Rizzo |
Abstract | While theories postulating a dual cognitive system take hold, quantitative confirmations are still needed to understand and identify interactions between the two systems or conflict events. Eye movements are among the most direct markers of individual attentive load and may serve as an important proxy of information. In this work we propose a computational method, within a modified visual version of the well-known Stroop test, for the identification of different tasks and potential conflict events between the two systems through the collection and processing of data related to eye movements. A statistical analysis shows that the selected variables can characterize the variation of attentive load within different scenarios. Moreover, we show that machine learning techniques allow us to distinguish between different tasks with good classification accuracy and to investigate the gaze dynamics in more depth. |
Tasks | Eye Tracking |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.11192v1 |
https://arxiv.org/pdf/2002.11192v1.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-models-for-the-analysis-of-system |
Repo | |
Framework | |
Relevance Prediction from Eye-movements Using Semi-interpretable Convolutional Neural Networks
Title | Relevance Prediction from Eye-movements Using Semi-interpretable Convolutional Neural Networks |
Authors | Nilavra Bhattacharya, Somnath Rakshit, Jacek Gwizdka, Paul Kogut |
Abstract | We propose an image-classification method to predict the perceived-relevance of text documents from eye-movements. An eye-tracking study was conducted where participants read short news articles, and rated them as relevant or irrelevant for answering a trigger question. We encode participants’ eye-movement scanpaths as images, and then train a convolutional neural network classifier using these scanpath images. The trained classifier is used to predict participants’ perceived-relevance of news articles from the corresponding scanpath images. This method is content-independent, as the classifier does not require knowledge of the screen-content, or the user’s information-task. Even with little data, the image classifier can predict perceived-relevance with up to 80% accuracy. When compared to similar eye-tracking studies from the literature, this scanpath image classification method outperforms previously reported metrics by appreciable margins. We also attempt to interpret how the image classifier differentiates between scanpaths on relevant and irrelevant documents. |
Tasks | Eye Tracking, Image Classification |
Published | 2020-01-15 |
URL | https://arxiv.org/abs/2001.05152v1 |
https://arxiv.org/pdf/2001.05152v1.pdf | |
PWC | https://paperswithcode.com/paper/relevance-prediction-from-eye-movements-using |
Repo | |
Framework | |
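The scanpath-as-image encoding can be sketched minimally: rasterize the sequence of fixation coordinates onto a grid so that any off-the-shelf image classifier (here, a CNN) can consume it. The paper's actual encoding is richer than this; the count-based rasterization below is an illustrative assumption, not the authors' exact scheme.

```python
def scanpath_to_image(fixations, size):
    """Rasterize a scanpath of (x, y) fixations with coordinates in
    [0, 1) x [0, 1) onto a size x size grid by counting visits per cell."""
    img = [[0] * size for _ in range(size)]
    for x, y in fixations:
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        img[row][col] += 1
    return img
```

Because the image encodes only where and how often the eyes landed, the classifier needs no knowledge of the screen content, which is the content-independence the abstract highlights.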
Multi-task Learning for Speaker Verification and Voice Trigger Detection
Title | Multi-task Learning for Speaker Verification and Voice Trigger Detection |
Authors | Siddharth Sigtia, Erik Marchi, Sachin Kajarekar, Devang Naik, John Bridle |
Abstract | Automatic speech transcription and speaker recognition are usually treated as separate tasks even though they are interdependent. In this study, we investigate training a single network to perform both tasks jointly. We train the network in a supervised multi-task learning setup, where the speech transcription branch of the network is trained to minimise a phonetic connectionist temporal classification (CTC) loss while the speaker recognition branch of the network is trained to label the input sequence with the correct label for the speaker. We present a large-scale empirical study where the model is trained using several thousand hours of labelled training data for each task. We evaluate the speech transcription branch of the network on a voice trigger detection task while the speaker recognition branch is evaluated on a speaker verification task. Results demonstrate that the network is able to encode both phonetic and speaker information in its learnt representations while yielding accuracies at least as good as the baseline models for each task, with the same number of parameters as the independent models. |
Tasks | Multi-Task Learning, Speaker Recognition, Speaker Verification |
Published | 2020-01-26 |
URL | https://arxiv.org/abs/2001.10816v1 |
https://arxiv.org/pdf/2001.10816v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-learning-for-speaker-verification |
Repo | |
Framework | |
A multi-site study of a breast density deep learning model for full-field digital mammography and digital breast tomosynthesis exams
Title | A multi-site study of a breast density deep learning model for full-field digital mammography and digital breast tomosynthesis exams |
Authors | Thomas P. Matthews, Sadanand Singh, Brent Mombourquette, Jason Su, Meet P. Shah, Stefano Pedemonte, Aaron Long, David Maffit, Jenny Gurney, Rodrigo Morales Hoil, Nikita Ghare, Douglas Smith, Stephen M. Moore, Susan C. Marks, Richard L. Wahl |
Abstract | Purpose: To develop a Breast Imaging Reporting and Data System (BI-RADS) breast density deep learning (DL) model in a multi-site setting for synthetic 2D mammography (SM) images derived from 3D digital breast tomosynthesis (DBT) exams, using full-field digital mammography (FFDM) images and limited SM data. Materials and Methods: A DL model was trained to predict BI-RADS breast density using FFDM images acquired from 2008 to 2017 (Site 1: 57492 patients, 187627 exams, 750752 images) for this retrospective study. The FFDM model was evaluated using SM datasets from two institutions (Site 1: 3842 patients, 3866 exams, 14472 images, acquired from 2016 to 2017; Site 2: 7557 patients, 16283 exams, 63973 images, 2015 to 2019). Adaptation methods were investigated to improve performance on the SM datasets, and the effect of dataset size on each adaptation method was considered. Statistical significance was assessed using confidence intervals (CI), estimated by bootstrapping. Results: Without adaptation, the model demonstrated close agreement with the original reporting radiologists for all three datasets (Site 1 FFDM: linearly-weighted κ_w = 0.75, 95% CI: [0.74, 0.76]; Site 1 SM: κ_w = 0.71, CI: [0.64, 0.78]; Site 2 SM: κ_w = 0.72, CI: [0.70, 0.75]). With adaptation, performance improved for Site 2 (Site 1: κ_w = 0.72, CI: [0.66, 0.79]; Site 2: κ_w = 0.79, CI: [0.76, 0.81]) using only 500 SM images from each site. Conclusion: A BI-RADS breast density DL model demonstrated strong performance on FFDM and SM images from two institutions without training on SM images, and improved using few SM images. |
Tasks | |
Published | 2020-01-23 |
URL | https://arxiv.org/abs/2001.08383v1 |
https://arxiv.org/pdf/2001.08383v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-site-study-of-a-breast-density-deep |
Repo | |
Framework | |
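The study's headline metric, linearly-weighted Cohen's kappa (κ_w), measures agreement between the model and the reporting radiologists while penalizing disagreements in proportion to how far apart the two BI-RADS categories are. A minimal sketch of how that statistic is computed (illustrative, not the authors' code):

```python
def linear_weighted_kappa(a, b, n_classes):
    """Linearly-weighted Cohen's kappa between two raters' integer labels
    in 0..n_classes-1. Returns 1.0 for perfect agreement."""
    n = len(a)
    # Observed joint distribution over (rater A label, rater B label).
    obs = [[0.0] * n_classes for _ in range(n_classes)]
    for x, y in zip(a, b):
        obs[x][y] += 1.0 / n
    # Marginals used for the chance-agreement term.
    pa = [sum(obs[i][j] for j in range(n_classes)) for i in range(n_classes)]
    pb = [sum(obs[i][j] for i in range(n_classes)) for j in range(n_classes)]
    w = lambda i, j: abs(i - j) / (n_classes - 1)  # linear disagreement weight
    num = sum(w(i, j) * obs[i][j]
              for i in range(n_classes) for j in range(n_classes))
    den = sum(w(i, j) * pa[i] * pb[j]
              for i in range(n_classes) for j in range(n_classes))
    return 1.0 - num / den
```

With the four BI-RADS density categories, confusing "almost entirely fatty" with "extremely dense" costs three times as much as confusing adjacent categories, which is why κ_w is preferred over plain accuracy here.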
Coronary Artery Disease Diagnosis; Ranking the Significant Features Using Random Trees Model
Title | Coronary Artery Disease Diagnosis; Ranking the Significant Features Using Random Trees Model |
Authors | Javad Hassannataj Joloudari, Edris Hassannataj Joloudari, Hamid Saadatfar, Mohammad GhasemiGol, Seyyed Mohammad Razavi, Amir Mosavi, Narjes Nabipour, Shahaboddin Shamshirband, Laszlo Nadai |
Abstract | Heart disease is one of the most common diseases among middle-aged people. Among the vast number of heart diseases, coronary artery disease (CAD) is a common cardiovascular disease with a high death rate. The most popular tool for diagnosing CAD is medical imaging, e.g., angiography. However, angiography is known for being costly and is also associated with a number of side effects. Hence, the purpose of this study is to increase the accuracy of coronary heart disease diagnosis by selecting significant predictive features in order of their ranking. In this study, we propose an integrated method using machine learning. The machine learning methods used are random trees (RTs), the C5.0 decision tree, the support vector machine (SVM), and the Chi-squared automatic interaction detection (CHAID) decision tree. The proposed method shows promising results, and the study confirms that the RTs model outperforms the other models. |
Tasks | |
Published | 2020-01-16 |
URL | https://arxiv.org/abs/2001.09841v1 |
https://arxiv.org/pdf/2001.09841v1.pdf | |
PWC | https://paperswithcode.com/paper/coronary-artery-disease-diagnosis-ranking-the |
Repo | |
Framework | |
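The core idea behind tree-based feature ranking, as used above, is that features whose splits most reduce class impurity are the most predictive. A minimal sketch using single-split Gini gain per feature (illustrative of the principle only; the paper's RTs/C5.0/CHAID models use full ensembles, not this procedure):

```python
def gini(labels):
    """Gini impurity of a label list: 1 - sum of squared class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_gain(xs, ys, threshold):
    """Impurity reduction from splitting one feature at a threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    n = len(ys)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(ys) - weighted

def rank_features(X, y):
    """Rank feature indices by their best single-split Gini gain."""
    scores = []
    for f in range(len(X[0])):
        xs = [row[f] for row in X]
        best = max(split_gain(xs, y, t) for t in set(xs))
        scores.append((best, f))
    return [f for _, f in sorted(scores, reverse=True)]
```

A random-trees model effectively averages importance scores like this over many randomized trees, which is what produces the feature ranking the study reports.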
Reservoir Computing with Planar Nanomagnet Arrays
Title | Reservoir Computing with Planar Nanomagnet Arrays |
Authors | Peng Zhou, Nathan R. McDonald, Alexander J. Edwards, Lisa Loomis, Clare D. Thiem, Joseph S. Friedman |
Abstract | Reservoir computing is an emerging methodology for neuromorphic computing that is especially well-suited for hardware implementations in size, weight, and power (SWaP) constrained environments. This work proposes a novel hardware implementation of a reservoir computer using a planar nanomagnet array. A small nanomagnet reservoir is demonstrated via micromagnetic simulations to be able to identify simple waveforms with 100% accuracy. Planar nanomagnet reservoirs are a promising new solution to the growing need for dedicated neuromorphic hardware. |
Tasks | |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.10948v1 |
https://arxiv.org/pdf/2003.10948v1.pdf | |
PWC | https://paperswithcode.com/paper/reservoir-computing-with-planar-nanomagnet |
Repo | |
Framework | |
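The software analogue of a physical reservoir like this nanomagnet array is an echo state network: a fixed random nonlinear dynamical system driven by the input, with only a simple readout trained on the states. A minimal sketch of the reservoir dynamics (assumed structure for illustration; the paper's reservoir is the simulated micromagnetic physics itself, not random weights):

```python
import math
import random

def run_reservoir(inputs, n_states=20, spectral=0.9, seed=0):
    """Drive a small echo-state-style reservoir (fixed random weights,
    tanh nonlinearity) with a 1-D input; return the state trajectory.
    A linear readout trained on these states would do the classification."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-1, 1) for _ in range(n_states)]
    # Crude spectral scaling keeps the recurrent dynamics contractive.
    w = [[rng.uniform(-1, 1) * spectral / n_states for _ in range(n_states)]
         for _ in range(n_states)]
    x = [0.0] * n_states
    history = []
    for u in inputs:
        x = [math.tanh(w_in[i] * u
                       + sum(w[i][j] * x[j] for j in range(n_states)))
             for i in range(n_states)]
        history.append(x)
    return history
```

The SWaP advantage cited in the abstract comes from replacing this simulated recurrence with the intrinsic magnetization dynamics of the nanomagnet array: only the readout ever needs training.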
Neural Operator: Graph Kernel Network for Partial Differential Equations
Title | Neural Operator: Graph Kernel Network for Partial Differential Equations |
Authors | Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar |
Abstract | The classical development of neural networks has been primarily for mappings between a finite-dimensional Euclidean space and a set of classes, or between two finite-dimensional Euclidean spaces. The purpose of this work is to generalize neural networks so that they can learn mappings between infinite-dimensional spaces (operators). The key innovation in our work is that a single set of network parameters, within a carefully designed network architecture, may be used to describe mappings between infinite-dimensional spaces and between different finite-dimensional approximations of those spaces. We formulate approximation of the infinite-dimensional mapping by composing nonlinear activation functions and a class of integral operators. The kernel integration is computed by message passing on graph networks. This approach has substantial practical consequences, which we illustrate in the context of mappings from input data for partial differential equations (PDEs) to their solutions. In this context, such learned networks can generalize among different approximation methods for the PDE (such as finite difference or finite element methods) and among approximations corresponding to different underlying levels of resolution and discretization. Experiments confirm that the proposed graph kernel network has the desired properties and shows competitive performance compared to state-of-the-art solvers. |
Tasks | |
Published | 2020-03-07 |
URL | https://arxiv.org/abs/2003.03485v1 |
https://arxiv.org/pdf/2003.03485v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-operator-graph-kernel-network-for |
Repo | |
Framework | |
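The phrase "kernel integration computed by message passing" can be unpacked with a small sketch: discretize the integral operator v(x) ↦ ∫ k(x, y) v(y) dy on a set of graph nodes, so each node aggregates neighbour features weighted by the kernel. In the paper the kernel k is itself a learned network; here it is a fixed Gaussian on 1-D node coordinates, an assumption made purely for illustration.

```python
import math

def kernel_message_pass(coords, features, lengthscale=1.0):
    """One step of kernel-weighted message passing: each node aggregates all
    node features weighted by a Gaussian kernel on coordinates - a discrete,
    normalized analogue of applying an integral operator."""
    n = len(coords)
    out = []
    for i in range(n):
        weights = [math.exp(-abs(coords[i] - coords[j]) ** 2 / lengthscale ** 2)
                   for j in range(n)]
        total = sum(weights)
        out.append(sum(w * f for w, f in zip(weights, features)) / total)
    return out
```

Because the update is defined pointwise through the kernel rather than through a fixed grid stencil, the same parameters apply at any resolution, which is the discretization-invariance property the abstract claims.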
Deep Vectorization of Technical Drawings
Title | Deep Vectorization of Technical Drawings |
Authors | Vage Egiazarian, Oleg Voynov, Alexey Artemov, Denis Volkhonskiy, Aleksandr Safin, Maria Taktasheva, Denis Zorin, Evgeny Burnaev |
Abstract | We present a new method for vectorization of technical line drawings, such as floor plans, architectural drawings, and 2D CAD images. Our method includes (1) a deep learning-based cleaning stage to eliminate the background and imperfections in the image and fill in missing parts, (2) a transformer-based network to estimate vector primitives, and (3) an optimization procedure to obtain the final primitive configurations. We train the networks on synthetic data, renderings of vector line drawings, and manually vectorized scans of line drawings. Our method quantitatively and qualitatively outperforms a number of existing techniques on a collection of representative technical drawings. |
Tasks | |
Published | 2020-03-11 |
URL | https://arxiv.org/abs/2003.05471v2 |
https://arxiv.org/pdf/2003.05471v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-vectorization-of-technical-drawings |
Repo | |
Framework | |
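Stage (3), refining primitive configurations against the cleaned raster, boils down to fitting geometric primitives to pixel evidence. A minimal sketch for the simplest primitive, a least-squares line fit to 2-D points (an illustrative stand-in; the paper's optimization handles full primitive configurations, not single lines):

```python
def fit_line(points):
    """Closed-form least-squares fit of y = a*x + b to a list of (x, y)
    points. Degenerate (vertical) point sets would need an orthogonal fit
    instead; this sketch assumes varying x."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b
```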
Context-Aware Recommendations for Televisions Using Deep Embeddings with Relaxed N-Pairs Loss Objective
Title | Context-Aware Recommendations for Televisions Using Deep Embeddings with Relaxed N-Pairs Loss Objective |
Authors | Miklas S. Kristoffersen, Sven E. Shepstone, Zheng-Hua Tan |
Abstract | This paper studies context-aware recommendations in the television domain by proposing a deep learning-based method for learning joint context-content embeddings (JCCE). The method builds on recent developments within recommendations using latent representations and deep metric learning, in order to effectively represent contextual settings of viewing situations as well as available content in a shared latent space. This embedding space is used for exploring relevant content in various viewing settings by applying an N-pairs loss objective as well as a relaxed variant introduced in this paper. Experiments on two datasets confirm the recommendation ability of JCCE, achieving improvements when compared to state-of-the-art methods. Further experiments display useful structures in the learned embeddings that can be used to gain valuable knowledge of underlying variables in the relationship between contextual settings and content properties. |
Tasks | Metric Learning |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01554v1 |
https://arxiv.org/pdf/2002.01554v1.pdf | |
PWC | https://paperswithcode.com/paper/context-aware-recommendations-for-televisions |
Repo | |
Framework | |
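The standard N-pairs objective JCCE builds on can be sketched directly (the paper's relaxed variant is not reproduced here): for an anchor embedding, one positive, and N-1 negatives, the loss is log(1 + Σⱼ exp(a·nⱼ - a·p)), which pulls the positive's similarity above every negative's.

```python
import math

def n_pairs_loss(anchor, positive, negatives):
    """Standard N-pairs / (N+1)-tuplet loss for one anchor:
    log(1 + sum_j exp(anchor . neg_j - anchor . pos))."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    pos = dot(anchor, positive)
    return math.log(1.0 + sum(math.exp(dot(anchor, neg) - pos)
                              for neg in negatives))
```

In the JCCE setting the anchor would be a context embedding and positive/negatives would be content embeddings, so minimizing this loss co-locates viewing contexts with the content watched in them.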
First Investigation Into the Use of Deep Learning for Continuous Assessment of Neonatal Postoperative Pain
Title | First Investigation Into the Use of Deep Learning for Continuous Assessment of Neonatal Postoperative Pain |
Authors | Md Sirajus Salekin, Ghada Zamzmi, Dmitry Goldgof, Rangachar Kasturi, Thao Ho, Yu Sun |
Abstract | This paper presents the first investigation into the use of a fully automated deep learning framework for assessing neonatal postoperative pain. It specifically investigates the use of a Bilinear Convolutional Neural Network (B-CNN) to extract facial features during different levels of postoperative pain, followed by modeling the temporal pattern using a Recurrent Neural Network (RNN). Although acute and postoperative pain share some common characteristics (e.g., visual action units), postoperative pain has a different dynamic and evolves in a unique pattern over time. Our experimental results indicate a clear difference between the patterns of acute and postoperative pain. They also suggest the effectiveness of combining a bilinear CNN with an RNN for the continuous assessment of postoperative pain intensity. |
Tasks | |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.10601v1 |
https://arxiv.org/pdf/2003.10601v1.pdf | |
PWC | https://paperswithcode.com/paper/first-investigation-into-the-use-of-deep |
Repo | |
Framework | |
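The bilinear pooling at the heart of a B-CNN is simply the outer product of two feature vectors, flattened. A minimal sketch for one spatial location (in a full B-CNN the two vectors come from two convolutional streams and the products are summed over all locations before the RNN sees them):

```python
def bilinear_pool(feat_a, feat_b):
    """Bilinear pooling at a single location: flattened outer product of
    two feature vectors, capturing pairwise feature interactions."""
    return [a * b for a in feat_a for b in feat_b]
```

The pairwise products are what let the model capture fine-grained co-occurrences of facial features, which matters for distinguishing subtle pain expressions.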
Syndrome-Enabled Unsupervised Learning for Channel Adaptive Blind Equalizer with Joint Optimization Mechanism
Title | Syndrome-Enabled Unsupervised Learning for Channel Adaptive Blind Equalizer with Joint Optimization Mechanism |
Authors | Chieh-Fang Teng, Yen-Liang Chen |
Abstract | With the rapid growth of deep learning in many fields, machine learning-assisted communication systems have attracted much research, with many eye-catching initial results. At the present stage, most methods still require massive amounts of labeled data for supervised learning to overcome channel variation. However, obtaining labeled data in practical applications may incur severe transmission overheads and thus degrade spectral efficiency. To address this issue, syndrome loss has been proposed to penalize invalid decoded codewords and to achieve unsupervised learning for neural network-based decoders. However, it has not been evaluated under varying channels and cannot be applied to polar codes directly. In this work, by exploiting the nature of polar codes and taking advantage of the standardized cyclic redundancy check (CRC) mechanism, we propose two kinds of modified syndrome loss to enable unsupervised learning for polar codes. In addition, two application scenarios that benefit from the syndrome loss are proposed for evaluation. Simulation results show that the proposed syndrome loss can even outperform supervised learning for training a neural network-based polar decoder. Furthermore, the proposed syndrome-enabled blind equalizer can avoid the transmission of training sequences under a time-varying fading channel and achieve the global optimum via a joint optimization mechanism, with a 1.3 dB gain over a non-blind minimum mean square error (MMSE) equalizer. |
Tasks | |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.01426v1 |
https://arxiv.org/pdf/2001.01426v1.pdf | |
PWC | https://paperswithcode.com/paper/syndrome-enabled-unsupervised-learning-for |
Repo | |
Framework | |
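The CRC-based idea behind the modified syndrome loss can be sketched simply: a decoded word that fails the standardized CRC check cannot be a valid codeword, so CRC inconsistency is a training penalty that needs no labels. The hard bit-count below is only an illustration; for neural network training the paper's losses would use a differentiable soft version.

```python
def crc_remainder(bits, poly):
    """Polynomial (XOR) long division of a bit list by a generator
    polynomial; an all-zero remainder means the word passes the CRC check."""
    bits = list(bits)  # copy so the caller's list is untouched
    for i in range(len(bits) - len(poly) + 1):
        if bits[i]:
            for j, p in enumerate(poly):
                bits[i + j] ^= p
    return bits[-(len(poly) - 1):]

def syndrome_penalty(decoded_bits, poly):
    """Label-free training signal: number of nonzero remainder bits,
    zero exactly when the decoded word is CRC-consistent."""
    return sum(crc_remainder(decoded_bits, poly))
```

Minimizing a penalty like this over received (unlabeled) transmissions is what lets the decoder and the blind equalizer train without pilot or training sequences.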