Paper Group ANR 799
Neural ODEs for Image Segmentation with Level Sets
Title | Neural ODEs for Image Segmentation with Level Sets |
Authors | Rafael Valle, Fitsum Reda, Mohammad Shoeybi, Patrick Legresley, Andrew Tao, Bryan Catanzaro |
Abstract | We propose a novel approach for image segmentation that combines Neural Ordinary Differential Equations (NODEs) and the Level Set method. Our approach parametrizes the evolution of an initial contour with a NODE that implicitly learns from data a speed function describing the evolution. In addition, for cases where an initial contour is not available and to alleviate the need for careful choice or design of contour embedding functions, we propose a NODE-based method that evolves an image embedding into a dense per-pixel semantic label space. We evaluate our methods on kidney segmentation (KiTS19) and on salient object detection (PASCAL-S, ECSSD and HKU-IS). In addition to improving initial contours provided by deep learning models while using a fraction of their number of parameters, our approach achieves F scores that are higher than several state-of-the-art deep learning algorithms. |
Tasks | Object Detection, Salient Object Detection, Semantic Segmentation |
Published | 2019-12-25 |
URL | https://arxiv.org/abs/1912.11683v1 |
https://arxiv.org/pdf/1912.11683v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-odes-for-image-segmentation-with-level |
Repo | |
Framework | |
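The level-set evolution mentioned in the abstract can be illustrated with a minimal sketch. It integrates d(phi)/dt = -F * |grad(phi)| with explicit Euler steps. The constant speed field, grid size, and step size are assumptions made for illustration; in the paper the speed function is learned implicitly by a NODE, which this sketch does not attempt to reproduce.

```python
import numpy as np

def evolve_level_set(phi, speed, dt=0.1, steps=10):
    """Explicit Euler integration of d(phi)/dt = -speed * |grad(phi)|."""
    phi = phi.copy()
    for _ in range(steps):
        gy, gx = np.gradient(phi)               # finite-difference gradient
        grad_norm = np.sqrt(gx ** 2 + gy ** 2)
        phi = phi - dt * speed * grad_norm
    return phi

# Initial contour: circle of radius 10, embedded as a signed distance
# function (negative inside the contour, positive outside).
n = 64
yy, xx = np.mgrid[0:n, 0:n]
phi0 = np.sqrt((xx - n / 2) ** 2 + (yy - n / 2) ** 2) - 10.0

# Hypothetical constant speed field; a positive speed expands the contour.
speed = np.ones((n, n))
phi1 = evolve_level_set(phi0, speed, dt=0.5, steps=10)

area_before = int((phi0 < 0).sum())
area_after = int((phi1 < 0).sum())
print(area_before, area_after)
```

The segmentation is read off as the region where phi is negative, so a spatially varying speed field lets the contour expand in some places and stall in others.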
Q-Learning Based Aerial Base Station Placement for Fairness Enhancement in Mobile Networks
Title | Q-Learning Based Aerial Base Station Placement for Fairness Enhancement in Mobile Networks |
Authors | Rozhina Ghanavi, Maryam Sabbaghian, Halim Yanikomeroglu |
Abstract | In this paper, we use an aerial base station (aerial-BS) to enhance fairness in a dynamic environment with user mobility. The problem of optimally placing the aerial-BS is a non-deterministic polynomial-time hard (NP-hard) problem. Moreover, the network topology is subject to continuous changes due to the user mobility. These issues intensify the quest to develop an adaptive and fast algorithm for 3D placement of the aerial-BS. To this end, we propose a method based on reinforcement learning to achieve these goals. Simulation results show that our method increases fairness among users in a reasonable computing time, while the solution is comparatively close to the optimal solution obtained by exhaustive search. |
Tasks | Q-Learning |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.08093v1 |
https://arxiv.org/pdf/1909.08093v1.pdf | |
PWC | https://paperswithcode.com/paper/q-learning-based-aerial-base-station |
Repo | |
Framework | |
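For orientation, the tabular Q-learning update named in the title can be sketched as follows. The state and action encoding (discretized aerial-BS positions and moves) and the reward (some fairness measure) are hypothetical; the abstract does not specify them.

```python
import numpy as np

def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q[s, a] toward the TD target."""
    td_target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])
    return Q

# Hypothetical toy problem: 3 discretized positions, 2 possible moves.
Q = np.zeros((3, 2))
Q = q_update(Q, state=0, action=1, reward=1.0, next_state=1)
print(Q[0, 1])  # 0.5
```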
One-Class Convolutional Neural Network
Title | One-Class Convolutional Neural Network |
Authors | Poojan Oza, Vishal M. Patel |
Abstract | We present a novel Convolutional Neural Network (CNN) based approach for one class classification. The idea is to use a zero centered Gaussian noise in the latent space as the pseudo-negative class and train the network using the cross-entropy loss to learn a good representation as well as the decision boundary for the given class. A key feature of the proposed approach is that any pre-trained CNN can be used as the base network for one class classification. The proposed One Class CNN (OC-CNN) is evaluated on the UMDAA-02 Face, Abnormality-1001, FounderType-200 datasets. These datasets are related to a variety of one class application problems such as user authentication, abnormality detection and novelty detection. Extensive experiments demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art methods. The source code is available at: github.com/otkupjnoz/oc-cnn. |
Tasks | Anomaly Detection |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08688v1 |
http://arxiv.org/pdf/1901.08688v1.pdf | |
PWC | https://paperswithcode.com/paper/one-class-convolutional-neural-network |
Repo | |
Framework | |
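The pseudo-negative idea from the abstract can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the "features" are synthetic stand-ins for the output of a pre-trained CNN, the noise scale is an assumed value, and the classifier is a plain logistic regression trained with the cross-entropy gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n = 16, 200

# Stand-in for latent features of the target class (hypothetical values).
target_feats = rng.normal(loc=3.0, scale=1.0, size=(n, dim))
# Pseudo-negative class: zero-centered Gaussian noise in the same space.
noise_feats = rng.normal(loc=0.0, scale=0.1, size=(n, dim))

X = np.vstack([target_feats, noise_feats])
y = np.concatenate([np.ones(n), np.zeros(n)])

# Logistic regression trained by gradient descent on the cross-entropy loss.
w, b = np.zeros(dim), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = p - y                      # d(loss)/d(logit) for cross-entropy
    w -= 0.1 * (X.T @ grad) / len(y)
    b -= 0.1 * grad.mean()

pred = (X @ w + b) > 0
accuracy = float((pred == (y == 1)).mean())
print(accuracy)
```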
Representation Learning with Autoencoders for Electronic Health Records: A Comparative Study
Title | Representation Learning with Autoencoders for Electronic Health Records: A Comparative Study |
Authors | Najibesadat Sadati, Milad Zafar Nezhad, Ratna Babu Chinnam, Dongxiao Zhu |
Abstract | Increasing volume of Electronic Health Records (EHR) in recent years provides great opportunities for data scientists to collaborate on different aspects of healthcare research by applying advanced analytics to these EHR clinical data. A key requirement however is obtaining meaningful insights from high dimensional, sparse and complex clinical data. Data science approaches typically address this challenge by performing feature learning in order to build more reliable and informative feature representations from clinical data followed by supervised learning. In this paper, we propose a predictive modeling approach based on deep learning based feature representations and word embedding techniques. Our method uses different deep architectures (stacked sparse autoencoders, deep belief network, adversarial autoencoders and variational autoencoders) for feature representation in higher-level abstraction to obtain effective and robust features from EHRs, and then build prediction models on top of them. Our approach is particularly useful when the unlabeled data is abundant whereas labeled data is scarce. We investigate the performance of representation learning through a supervised learning approach. Our focus is to present a comparative study to evaluate the performance of different deep architectures through supervised learning and provide insights in the choice of deep feature representation techniques. Our experiments demonstrate that for small data sets, stacked sparse autoencoder demonstrates a superior generality performance in prediction due to sparsity regularization whereas variational autoencoders outperform the competing approaches for large data sets due to their capability of learning the representation distribution. |
Tasks | Representation Learning |
Published | 2019-08-24 |
URL | https://arxiv.org/abs/1908.09174v2 |
https://arxiv.org/pdf/1908.09174v2.pdf | |
PWC | https://paperswithcode.com/paper/representation-learning-with-autoencoders-for |
Repo | |
Framework | |
RespNet: A deep learning model for extraction of respiration from photoplethysmogram
Title | RespNet: A deep learning model for extraction of respiration from photoplethysmogram |
Authors | Vignesh Ravichandran, Balamurali Murugesan, Vaishali Balakarthikeyan, Sharath M Shankaranarayana, Keerthi Ram, Preejith S. P, Jayaraj Joseph, Mohanasankar Sivaprakasam |
Abstract | Respiratory ailments afflict a wide range of people and manifest themselves through conditions like asthma and sleep apnea. Continuous monitoring of chronic respiratory ailments is seldom used outside the intensive care ward due to the large size and cost of the monitoring system. While Electrocardiogram (ECG) based respiration extraction is a validated approach, its adoption is limited by access to a suitable continuous ECG monitor. Recently, due to the widespread adoption of wearable smartwatches with in-built Photoplethysmogram (PPG) sensor, it is being considered as a viable candidate for continuous and unobtrusive respiration monitoring. Research in this domain, however, has been predominantly focussed on estimating respiration rate from PPG. In this work, a novel end-to-end deep learning network called RespNet is proposed to perform the task of extracting the respiration signal from a given input PPG as opposed to extracting respiration rate. The proposed network was trained and tested on two different datasets utilizing different modalities of reference respiration signal recordings. Also, the similarity and performance of the proposed network against two conventional signal processing approaches for extracting respiration signal were studied. The proposed method was tested on two independent datasets with a Mean Squared Error of 0.262 and 0.145. The Cross-Correlation coefficients of the respective datasets were found to be 0.933 and 0.931. The reported errors and similarity were found to be better than conventional approaches. The proposed approach would aid clinicians to provide comprehensive evaluation of sleep-related respiratory conditions and chronic respiratory ailments while being comfortable and inexpensive for the patient. |
Tasks | |
Published | 2019-02-12 |
URL | http://arxiv.org/abs/1902.04236v2 |
http://arxiv.org/pdf/1902.04236v2.pdf | |
PWC | https://paperswithcode.com/paper/respnet-a-deep-learning-model-for-extraction |
Repo | |
Framework | |
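The two evaluation measures quoted in the abstract can be computed as in the sketch below. The abstract does not define its cross-correlation coefficient, so the zero-lag, mean-centered, normalized form used here is an assumption, and the signals are synthetic.

```python
import numpy as np

def mse(reference, predicted):
    """Mean squared error between two equally long signals."""
    return float(np.mean((reference - predicted) ** 2))

def cross_correlation(reference, predicted):
    """Zero-lag normalized cross-correlation (assumed definition)."""
    r = reference - reference.mean()
    p = predicted - predicted.mean()
    return float((r @ p) / (np.linalg.norm(r) * np.linalg.norm(p)))

# Synthetic example: a 0.25 Hz "respiration" wave and a scaled estimate of it.
t = np.linspace(0.0, 10.0, 500)
reference = np.sin(2 * np.pi * 0.25 * t)
predicted = 0.9 * reference

print(mse(reference, predicted), cross_correlation(reference, predicted))
```

The example also shows why both measures are reported: a prediction with the right shape but the wrong amplitude has a perfect correlation and a nonzero MSE.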
Diseño de un espacio semántico sobre la base de la Wikipedia. Una propuesta de análisis de la semántica latente para el idioma español
Title | Diseño de un espacio semántico sobre la base de la Wikipedia. Una propuesta de análisis de la semántica latente para el idioma español |
Authors | Dalina Aidee Villa, Igor Barahona, Luis Javier Álvarez |
Abstract | Latent Semantic Analysis (LSA) was initially conceived within cognitive psychology in the 1990s. Since its emergence, the LSA has been used to model cognitive processes, pointing out academic texts, compare literature works and analyse political speeches, among other applications. Taking a multivariate method for dimensionality reduction as its starting point, this paper proposes a semantic space for the Spanish language. Our results include a document text matrix with dimensions 1.3 x 10^6 and 5.9 x 10^6, which is later decomposed into singular values. Those singular values are used to semantically words or text. |
Tasks | Dimensionality Reduction |
Published | 2019-01-28 |
URL | http://arxiv.org/abs/1902.02173v1 |
http://arxiv.org/pdf/1902.02173v1.pdf | |
PWC | https://paperswithcode.com/paper/diseno-de-un-espacio-semantico-sobre-la-base |
Repo | |
Framework | |
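The decomposition step described in the abstract can be illustrated on a toy matrix. This is a generic LSA sketch with made-up counts and an arbitrary number of retained dimensions; the paper's matrix is far larger and its preprocessing is not described in the abstract.

```python
import numpy as np

# Toy term-document count matrix (rows: terms, columns: documents).
A = np.array([
    [2.0, 1.0, 0.0],
    [2.0, 1.0, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 1.0, 2.0],
])

# Singular value decomposition, keeping the k largest singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
term_vecs = U[:, :k] * s[:k]   # one k-dimensional vector per term

def cosine(u, v):
    """Cosine similarity between two term vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_same = cosine(term_vecs[0], term_vecs[1])   # terms with identical usage
sim_diff = cosine(term_vecs[0], term_vecs[2])   # terms in different documents
print(sim_same, sim_diff)
```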
Registration-free Face-SSD: Single shot analysis of smiles, facial attributes, and affect in the wild
Title | Registration-free Face-SSD: Single shot analysis of smiles, facial attributes, and affect in the wild |
Authors | Youngkyoon Jang, Hatice Gunes, Ioannis Patras |
Abstract | In this paper, we present a novel single shot face-related task analysis method, called Face-SSD, for detecting faces and for performing various face-related (classification/regression) tasks including smile recognition, face attribute prediction and valence-arousal estimation in the wild. Face-SSD uses a Fully Convolutional Neural Network (FCNN) to detect multiple faces of different sizes and recognise/regress one or more face-related classes. Face-SSD has two parallel branches that share the same low-level filters, one branch dealing with face detection and the other one with face analysis tasks. The outputs of both branches are spatially aligned heatmaps that are produced in parallel - therefore Face-SSD does not require that face detection, facial region extraction, size normalisation, and facial region processing are performed in subsequent steps. Our contributions are threefold: 1) Face-SSD is the first network to perform face analysis without relying on pre-processing such as face detection and registration in advance - Face-SSD is a simple and a single FCNN architecture simultaneously performing face detection and face-related task analysis - those are conventionally treated as separate consecutive tasks; 2) Face-SSD is a generalised architecture that is applicable for various face analysis tasks without modifying the network structure - this is in contrast to designing task-specific architectures; and 3) Face-SSD achieves real-time performance (21 FPS) even when detecting multiple faces and recognising multiple classes in a given image. Experimental results show that Face-SSD achieves state-of-the-art performance in various face analysis tasks by reaching a recognition accuracy of 95.76% for smile detection, 90.29% for attribute prediction, and Root Mean Square (RMS) error of 0.44 and 0.39 for valence and arousal estimation. |
Tasks | Face Detection, Smile Recognition |
Published | 2019-02-11 |
URL | http://arxiv.org/abs/1902.04042v1 |
http://arxiv.org/pdf/1902.04042v1.pdf | |
PWC | https://paperswithcode.com/paper/registration-free-face-ssd-single-shot |
Repo | |
Framework | |
Sensorimotor learning for artificial body perception
Title | Sensorimotor learning for artificial body perception |
Authors | German Diez-Valencia, Takuya Ohashi, Pablo Lanillos, Gordon Cheng |
Abstract | Artificial self-perception is a machine's ability to perceive its own body, i.e., the mastery of modal and intermodal contingencies of performing an action with a specific sensors/actuators body configuration. In other words, the spatio-temporal patterns that relate its sensors (e.g. visual, proprioceptive, tactile, etc.), its actions and its body latent variables are responsible for the distinction between its own body and the rest of the world. This paper describes some of the latest approaches for modelling artificial body self-perception: from Bayesian estimation to deep learning. Results show the potential of these model-free unsupervised or semi-supervised crossmodal/intermodal learning approaches. However, there are still challenges that should be overcome before we achieve artificial multisensory body perception. |
Tasks | |
Published | 2019-01-15 |
URL | http://arxiv.org/abs/1901.09792v1 |
http://arxiv.org/pdf/1901.09792v1.pdf | |
PWC | https://paperswithcode.com/paper/sensorimotor-learning-for-artificial-body |
Repo | |
Framework | |
FDDB-360: Face Detection in 360-degree Fisheye Images
Title | FDDB-360: Face Detection in 360-degree Fisheye Images |
Authors | Jianglin Fu, Saeed Ranjbar Alvar, Ivan V. Bajic, Rodney G. Vaughan |
Abstract | 360-degree cameras offer the possibility to cover a large area, for example an entire room, without using multiple distributed vision sensors. However, geometric distortions introduced by their lenses make computer vision problems more challenging. In this paper we address face detection in 360-degree fisheye images. We show how a face detector trained on regular images can be re-trained for this purpose, and we also provide a 360-degree fisheye-like version of the popular FDDB face detection dataset, which we call FDDB-360. |
Tasks | Face Detection |
Published | 2019-02-07 |
URL | http://arxiv.org/abs/1902.02777v1 |
http://arxiv.org/pdf/1902.02777v1.pdf | |
PWC | https://paperswithcode.com/paper/fddb-360-face-detection-in-360-degree-fisheye |
Repo | |
Framework | |
Principal Component Analysis for Multivariate Extremes
Title | Principal Component Analysis for Multivariate Extremes |
Authors | Holger Drees, Anne Sabourin |
Abstract | The first order behavior of multivariate heavy-tailed random vectors above large radial thresholds is ruled by a limit measure in a regular variation framework. For a high dimensional vector, a reasonable assumption is that the support of this measure is concentrated on a lower dimensional subspace, meaning that certain linear combinations of the components are much likelier to be large than others. Identifying this subspace and thus reducing the dimension will facilitate a refined statistical analysis. In this work we apply Principal Component Analysis (PCA) to a re-scaled version of radially thresholded observations. Within the statistical learning framework of empirical risk minimization, our main focus is to analyze the squared reconstruction error for the exceedances over large radial thresholds. We prove that the empirical risk converges to the true risk, uniformly over all projection subspaces. As a consequence, the best projection subspace is shown to converge in probability to the optimal one, in terms of the Hausdorff distance between their intersections with the unit sphere. In addition, if the exceedances are re-scaled to the unit ball, we obtain finite sample uniform guarantees to the reconstruction error pertaining to the estimated projection sub-space. Numerical experiments illustrate the relevance of the proposed framework for practical purposes. |
Tasks | |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.11043v1 |
https://arxiv.org/pdf/1906.11043v1.pdf | |
PWC | https://paperswithcode.com/paper/principal-component-analysis-for-multivariate |
Repo | |
Framework | |
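The pipeline in the abstract (radial thresholding, re-scaling, then PCA) can be sketched as follows. The synthetic data, the 95% radial quantile, the projection onto the unit sphere, and the use of the uncentered second-moment matrix are all assumptions made for illustration; the paper's exact estimator may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical heavy-tailed sample in R^3: Student-t variation along one
# direction plus small Gaussian noise in all coordinates.
n, d = 5000, 3
direction = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
X = np.outer(rng.standard_t(df=2, size=n), direction)
X += 0.1 * rng.standard_normal((n, d))

# Keep only observations whose norm exceeds a large radial threshold.
radii = np.linalg.norm(X, axis=1)
threshold = np.quantile(radii, 0.95)
extremes = X[radii > threshold]

# Re-scale the exceedances to the unit sphere (assumed re-scaling).
angles = extremes / np.linalg.norm(extremes, axis=1, keepdims=True)

# PCA on the uncentered second-moment matrix of the re-scaled extremes.
second_moment = angles.T @ angles / len(angles)
eigvals, eigvecs = np.linalg.eigh(second_moment)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # descending order
top_direction = eigvecs[:, 0]
print(eigvals, top_direction)
```

Because every re-scaled point has unit norm, the eigenvalues sum to one, so each eigenvalue reads directly as the share of the extremes captured by its direction.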
Improved Selective Refinement Network for Face Detection
Title | Improved Selective Refinement Network for Face Detection |
Authors | Shifeng Zhang, Rui Zhu, Xiaobo Wang, Hailin Shi, Tianyu Fu, Shuo Wang, Tao Mei, Stan Z. Li |
Abstract | As a long-standing problem in computer vision, face detection has attracted much attention in recent decades for its practical applications. With the availability of the face detection benchmark WIDER FACE dataset, much progress has been made by various algorithms in recent years. Among them, the Selective Refinement Network (SRN) face detector introduces the two-step classification and regression operations selectively into an anchor-based face detector to reduce false positives and improve location accuracy simultaneously. Moreover, it designs a receptive field enhancement block to provide more diverse receptive fields. In this report, to further improve the performance of SRN, we exploit some existing techniques via extensive experiments, including new data augmentation strategy, improved backbone network, MS COCO pretraining, decoupled classification module, segmentation branch and Squeeze-and-Excitation block. Some of these techniques bring performance improvements, while a few of them do not adapt well to our baseline. As a consequence, we present an improved SRN face detector by combining these useful techniques together and obtain the best performance on the widely used face detection benchmark WIDER FACE dataset. |
Tasks | Data Augmentation, Face Detection |
Published | 2019-01-20 |
URL | http://arxiv.org/abs/1901.06651v3 |
http://arxiv.org/pdf/1901.06651v3.pdf | |
PWC | https://paperswithcode.com/paper/improved-selective-refinement-network-for |
Repo | |
Framework | |
DAFE-FD: Density Aware Feature Enrichment for Face Detection
Title | DAFE-FD: Density Aware Feature Enrichment for Face Detection |
Authors | Vishwanath A. Sindagi, Vishal M. Patel |
Abstract | Recent research on face detection, which is focused primarily on improving accuracy of detecting smaller faces, attempts to develop new anchor design strategies to facilitate increased overlap between anchor boxes and ground truth faces of smaller sizes. In this work, we approach the problem of small face detection with the motivation of enriching the feature maps using a density map estimation module. This module, inspired by recent crowd counting/density estimation techniques, performs the task of estimating the per pixel density of people/faces present in the image. Output of this module is employed to accentuate the feature maps from the backbone network using a feature enrichment module before being used for detecting smaller faces. The proposed approach can be used to complement recent anchor-design based novel methods to further improve their results. Experiments conducted on different datasets such as WIDER, FDDB and Pascal-Faces demonstrate the effectiveness of the proposed approach. |
Tasks | Crowd Counting, Density Estimation, Face Detection |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1901.05375v1 |
http://arxiv.org/pdf/1901.05375v1.pdf | |
PWC | https://paperswithcode.com/paper/dafe-fd-density-aware-feature-enrichment-for |
Repo | |
Framework | |
Forbidden knowledge in machine learning – Reflections on the limits of research and publication
Title | Forbidden knowledge in machine learning – Reflections on the limits of research and publication |
Authors | Thilo Hagendorff |
Abstract | Certain research strands can yield “forbidden knowledge”. This term refers to knowledge that is considered too sensitive, dangerous or taboo to be produced or shared. Discourses about such publication restrictions are already entrenched in scientific fields like IT security, synthetic biology or nuclear physics research. This paper makes the case for transferring this discourse to machine learning research. Some machine learning applications can very easily be misused and unfold harmful consequences, for instance with regard to generative video or text synthesis, personality analysis, behavior manipulation, software vulnerability detection and the like. Up to now, the machine learning research community embraces the idea of open access. However, this is opposed to precautionary efforts to prevent the malicious use of machine learning applications. Information about or from such applications may, if improperly disclosed, cause harm to people, organizations or whole societies. Hence, the goal of this work is to outline norms that can help to decide whether and when the dissemination of such information should be prevented. It proposes review parameters for the machine learning community to establish an ethical framework on how to deal with forbidden knowledge and dual-use applications. |
Tasks | Vulnerability Detection |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08603v1 |
https://arxiv.org/pdf/1911.08603v1.pdf | |
PWC | https://paperswithcode.com/paper/forbidden-knowledge-in-machine-learning |
Repo | |
Framework | |
Straight to the point: reinforcement learning for user guidance in ultrasound
Title | Straight to the point: reinforcement learning for user guidance in ultrasound |
Authors | Fausto Milletari, Vighnesh Birodkar, Michal Sofka |
Abstract | Point of care ultrasound (POCUS) consists in the use of ultrasound imaging in critical or emergency situations to support clinical decisions by healthcare professionals and first responders. In this setting it is essential to be able to provide means to obtain diagnostic data to potentially inexperienced users who did not receive extensive medical training. Interpretation and acquisition of ultrasound images is not trivial. First, the user needs to find a suitable sound window which can be used to get a clear image, and then he needs to correctly interpret it to perform a diagnosis. Although many recent approaches focus on developing smart ultrasound devices that add interpretation capabilities to existing systems, our goal in this paper is to present a reinforcement learning (RL) strategy which is capable of guiding novice users to the correct sonic window and enabling them to obtain clinically relevant pictures of the anatomy of interest. We apply our approach to cardiac images acquired from the parasternal long axis (PLAx) view of the left ventricle of the heart. |
Tasks | |
Published | 2019-03-02 |
URL | http://arxiv.org/abs/1903.00586v1 |
http://arxiv.org/pdf/1903.00586v1.pdf | |
PWC | https://paperswithcode.com/paper/straight-to-the-point-reinforcement-learning |
Repo | |
Framework | |
The Good, the Bad and the Ugly: Evaluating Convolutional Neural Networks for Prohibited Item Detection Using Real and Synthetically Composited X-ray Imagery
Title | The Good, the Bad and the Ugly: Evaluating Convolutional Neural Networks for Prohibited Item Detection Using Real and Synthetically Composited X-ray Imagery |
Authors | Neelanjan Bhowmik, Qian Wang, Yona Falinie A. Gaus, Marcin Szarek, Toby P. Breckon |
Abstract | Detecting prohibited items in X-ray security imagery is pivotal in maintaining border and transport security against a wide range of threat profiles. Convolutional Neural Networks (CNN) with the support of a significant volume of data have brought advancement in such automated prohibited object detection and classification. However, collating such large volumes of X-ray security imagery remains a significant challenge. This work opens up the possibility of using synthetically composed imagery, avoiding the need to collate such large volumes of hand-annotated real-world imagery. Here we investigate the difference in detection performance achieved using real and synthetic X-ray training imagery for CNN architecture detecting three exemplar prohibited items, {Firearm, Firearm Parts, Knives}, within cluttered and complex X-ray security baggage imagery. We achieve 0.88 of mean average precision (mAP) with a Faster R-CNN and ResNet-101 CNN architecture for this 3-class object detection using real X-ray imagery. While the performance is comparable with synthetically composited X-ray imagery (0.78 mAP), our extended evaluation demonstrates both challenge and promise of using synthetically composed images to diversify the X-ray security training imagery for automated detection algorithm training. |
Tasks | Object Detection |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11508v1 |
https://arxiv.org/pdf/1909.11508v1.pdf | |
PWC | https://paperswithcode.com/paper/the-good-the-bad-and-the-ugly-evaluating |
Repo | |
Framework | |