January 27, 2020


Paper Group ANR 1304


A Curated Image Parameter Dataset from Solar Dynamics Observatory Mission


Title	A Curated Image Parameter Dataset from Solar Dynamics Observatory Mission
Authors	Azim Ahmadzadeh, Dustin J. Kempton, Rafal A. Angryk
Abstract	We provide a large image parameter dataset extracted from the Solar Dynamics Observatory (SDO) mission’s AIA instrument, for the period of January 2011 through the current date, with a cadence of six minutes, for nine wavelength channels. The volume of the dataset for each year is just short of 1 TiB. Towards achieving better results in the region classification of active regions and coronal holes, we improve upon the performance of a set of ten image parameters through an in-depth evaluation of various assumptions that are necessary for the calculation of these image parameters. Then, where possible, a method for finding appropriate settings for the parameter calculations was devised, as well as a validation task to show our improved results. In addition, we include comparisons of JP2 and FITS image formats using supervised classification models, by tuning the parameters specific to the format of the images from which they are extracted, and specific to each wavelength. The results of these comparisons show that utilizing JP2 images, which are significantly smaller files, is not detrimental to the region classification task that these parameters were originally intended for. Finally, we compute the tuned parameters on the AIA images and provide a public API (http://dmlab.cs.gsu.edu/dmlabapi) to access the dataset. This dataset can be used in a range of studies on AIA images, such as content-based image retrieval or tracking of solar events, where dimensionality reduction on the images is necessary for feasibility of the tasks.
Tasks	Content-Based Image Retrieval, Dimensionality Reduction, Image Retrieval
Published	2019-06-03
URL	https://arxiv.org/abs/1906.01062v1
PDF	https://arxiv.org/pdf/1906.01062v1.pdf
PWC	https://paperswithcode.com/paper/a-curated-image-parameter-dataset-from-solar
Repo
Framework

Object Recognition under Multifarious Conditions: A Reliability Analysis and A Feature Similarity-based Performance Estimation


Title	Object Recognition under Multifarious Conditions: A Reliability Analysis and A Feature Similarity-based Performance Estimation
Authors	Dogancan Temel, Jinsol Lee, Ghassan AlRegib
Abstract	In this paper, we investigate the reliability of online recognition platforms, Amazon Rekognition and Microsoft Azure, with respect to changes in background, acquisition device, and object orientation. We focus on platforms that are commonly used by the public to better understand their real-world performances. To assess the variation in recognition performance, we perform a controlled experiment by changing the acquisition conditions one at a time. We use three smartphones, one DSLR, and one webcam to capture side views and overhead views of objects in a living room, an office, and photo studio setups. Moreover, we introduce a framework to estimate the recognition performance with respect to backgrounds and orientations. In this framework, we utilize both handcrafted features based on color, texture, and shape characteristics and data-driven features obtained from deep neural networks. Experimental results show that deep learning-based image representations can estimate the recognition performance variation with a Spearman’s rank-order correlation of 0.94 under multifarious acquisition conditions.
Tasks	Object Recognition
Published	2019-02-18
URL	https://arxiv.org/abs/1902.06585v2
PDF	https://arxiv.org/pdf/1902.06585v2.pdf
PWC	https://paperswithcode.com/paper/object-recognition-under-multifarious
Repo
Framework
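
The abstract above reports agreement between estimated and measured recognition performance as a Spearman rank-order correlation. As a minimal sketch of that statistic only (not the authors' estimation framework), the following computes it as the Pearson correlation of average ranks. The function names and the sample scores are hypothetical, chosen purely for illustration.

```python
import numpy as np

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    return float(np.corrcoef(rx, ry)[0, 1])

# Hypothetical values: an estimated score and a measured accuracy
# for four acquisition conditions (not data from the paper).
estimated = [0.10, 0.40, 0.35, 0.80]
measured = [0.20, 0.50, 0.45, 0.90]
rho = spearman(estimated, measured)
```

Because only the ordering matters, any monotonic relationship between the two lists yields a correlation of 1 in magnitude, which is why the statistic suits comparing an estimator's ranking of conditions against the measured ranking.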

Exploring 3 R’s of Long-term Tracking: Re-detection, Recovery and Reliability


Title	Exploring 3 R’s of Long-term Tracking: Re-detection, Recovery and Reliability
Authors	Shyamgopal Karthik, Abhinav Moudgil, Vineet Gandhi
Abstract	Recent works have proposed several long-term tracking benchmarks and highlighted the importance of moving towards long-duration tracking to bridge the gap with application requirements. The current evaluation methodologies, however, do not focus on several aspects that are crucial from a long-term perspective, such as Re-detection, Recovery, and Reliability. In this paper, we propose novel evaluation strategies for a more in-depth analysis of trackers from a long-term perspective. More specifically, (a) we test the re-detection capability of trackers in the wild by simulating virtual cuts, (b) we investigate the role of chance in the recovery of a tracker after failure and (c) we propose a novel metric allowing visual inference on the ability of a tracker to track contiguously (without any failure) at a given accuracy. We present several original insights derived from an extensive set of quantitative and qualitative experiments.
Tasks
Published	2019-10-27
URL	https://arxiv.org/abs/1910.12273v1
PDF	https://arxiv.org/pdf/1910.12273v1.pdf
PWC	https://paperswithcode.com/paper/exploring-3-rs-of-long-term-tracking-re
Repo
Framework

A Semi-Automated Usability Evaluation Framework for Interactive Image Segmentation Systems


Title	A Semi-Automated Usability Evaluation Framework for Interactive Image Segmentation Systems
Authors	Mario Amrehn, Stefan Steidl, Reinier Kortekaas, Maddalena Strumia, Markus Weingarten, Markus Kowarschik, Andreas Maier
Abstract	For complex segmentation tasks, the achievable accuracy of fully automated systems is inherently limited. Specifically, when a precise segmentation result is desired for a small number of given data sets, semi-automatic methods exhibit a clear benefit for the user. The optimization of human-computer interaction (HCI) is an essential part of interactive image segmentation. Nevertheless, publications introducing novel interactive segmentation systems (ISS) often lack an objective comparison of HCI aspects. It is demonstrated that even when the underlying segmentation algorithm is the same throughout interactive prototypes, their user experience may vary substantially. The results indicate that users prefer simple interfaces as well as a considerable degree of freedom to control each iterative step of the segmentation. In this article, an objective method for the comparison of ISS is proposed, based on extensive user studies. A summative qualitative content analysis is conducted via abstraction of visual and verbal feedback given by the participants. A direct assessment of the segmentation system is executed by the users via the system usability scale (SUS) and AttrakDiff-2 questionnaires. Furthermore, an approximation of the findings regarding usability aspects in those studies is introduced, derived solely from the system-measurable user actions during their usage of interactive segmentation prototypes. The prediction of all questionnaire results has an average relative error of 8.9%, which is close to the expected precision of the questionnaire results themselves. This automated evaluation scheme may significantly reduce the resources necessary to investigate each variation of a prototype’s user interface (UI) features and segmentation methodologies.
Tasks	Interactive Segmentation, Semantic Segmentation
Published	2019-09-01
URL	https://arxiv.org/abs/1909.00482v1
PDF	https://arxiv.org/pdf/1909.00482v1.pdf
PWC	https://paperswithcode.com/paper/a-semi-automated-usability-evaluation
Repo
Framework
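
The abstract above mentions SUS questionnaire scores and an average relative error between predicted and observed results. The sketch below shows the standard SUS scoring rule (ten items on a 1 to 5 scale, odd items scored as response minus 1, even items as 5 minus response, sum scaled by 2.5) and one plausible reading of "average relative error". The function names are hypothetical, and the paper's exact error definition may differ.

```python
def sus_score(responses):
    """Standard SUS score in [0, 100] from ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS expects exactly ten item responses")
    total = 0
    for item, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("each response must be between 1 and 5")
        # Odd items are positively worded, even items negatively worded.
        total += (response - 1) if item % 2 == 1 else (5 - response)
    return total * 2.5

def mean_relative_error(predicted, observed):
    """Mean of |predicted - observed| / |observed| (assumed definition)."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) / abs(o) for p, o in zip(predicted, observed)) / len(observed)

# Hypothetical example: predicted vs. questionnaire-derived scores.
error = mean_relative_error([90.0, 55.0], [100.0, 50.0])
```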

Measurement Dependence Inducing Latent Causal Models


Title	Measurement Dependence Inducing Latent Causal Models
Authors	Alex Markham, Moritz Grosse-Wentrup
Abstract	We consider the task of causal structure learning over measurement dependence inducing latent (MeDIL) causal models. We show that this task can be framed in terms of the graph theoretical problem of finding edge clique covers, resulting in a simple algorithm for returning minimal MeDIL causal models (minMCMs). This algorithm is non-parametric, requiring no assumptions about linearity or Gaussianity. Furthermore, despite rather weak assumptions about the class of MeDIL causal models, we show that minimality in minMCMs implies three rather specific and interesting properties: first, minMCMs provide lower bounds on (i) the number of latent causal variables and (ii) the number of functional causal relations that are required to model a complex system at any level of granularity; second, a minMCM contains no causal links between the latent variables; and third, in contrast to factor analysis, a minMCM may require more latent than measurement variables.
Tasks
Published	2019-10-19
URL	https://arxiv.org/abs/1910.08778v1
PDF	https://arxiv.org/pdf/1910.08778v1.pdf
PWC	https://paperswithcode.com/paper/measurement-dependence-inducing-latent-causal
Repo
Framework
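
The abstract above frames structure learning as finding edge clique covers of a graph. To make that graph-theoretic object concrete, here is a minimal greedy sketch that covers every edge with cliques. It is a simple heuristic and is not guaranteed to return a minimum cover, so it should not be read as the paper's algorithm for minMCMs. The function name and the example graph are hypothetical, and the interpretation of each clique as one latent variable over the measurement variables it contains is an assumption.

```python
from itertools import combinations

def greedy_edge_clique_cover(edges):
    """Cover all edges of an undirected graph with cliques (greedy heuristic)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    uncovered = {frozenset((u, v)) for u, v in edges}
    cover = []
    # sorted() materialises the list first, so shrinking `uncovered` is safe.
    for u, v in sorted(tuple(sorted(edge)) for edge in uncovered):
        if frozenset((u, v)) not in uncovered:
            continue
        clique = {u, v}
        # Grow the clique with vertices adjacent to every current member.
        for w in sorted(adjacency[u] & adjacency[v]):
            if all(w in adjacency[member] for member in clique):
                clique.add(w)
        cover.append(clique)
        uncovered -= {frozenset(pair) for pair in combinations(clique, 2)}
    return cover

# Hypothetical dependence graph over four measurement variables:
# a triangle a-b-c plus a pendant edge c-d.
cover = greedy_edge_clique_cover([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
```

On this example the cover consists of the clique {a, b, c} and the clique {c, d}, so two cliques suffice to account for all four edges.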

FSD: Feature Skyscraper Detector for Stem End and Blossom End of Navel Orange


Title	FSD: Feature Skyscraper Detector for Stem End and Blossom End of Navel Orange
Authors	Xiaoye Sun, Gongyan Li, Shaoyun Xu
Abstract	To accurately and efficiently distinguish the stem end and the blossom end of a navel orange from its black spots, we propose a feature skyscraper detector (FSD) with low computational cost, compact architecture and high detection accuracy. The design is motivated by small-object detection: the stem (blossom) end is complex and black spots are densely distributed, so we design the feature skyscraper network (FSN) based on dense connectivity. In particular, FSN differs from regular feature pyramids in that it provides more intensive detection of high-level features. We then design the backbone of the FSD based on an attention mechanism and dense blocks to provide better feature extraction for the FSN. In addition, the Swish activation is added to the detector architecture to further improve accuracy. We also create a dataset in Pascal VOC format annotated with three types of detection targets: the stem end, the blossom end and the black spot. Experimental results on our orange data set confirm that FSD is competitive with state-of-the-art one-stage detectors such as SSD, DSOD, YOLOv2, YOLOv3, RFB and FSSD, and it achieves 87.479% mAP at 131 FPS with only 5.812M parameters.
Tasks
Published	2019-05-24
URL	https://arxiv.org/abs/1905.09994v2
PDF	https://arxiv.org/pdf/1905.09994v2.pdf
PWC	https://paperswithcode.com/paper/a-real-time-tiny-detection-model-for-stem-end
Repo
Framework

On the Exact Recovery Conditions of 3D Human Motion from 2D Landmark Motion with Sparse Articulated Motion


Title	On the Exact Recovery Conditions of 3D Human Motion from 2D Landmark Motion with Sparse Articulated Motion
Authors	Abed Malti
Abstract	In this paper, we address the problem of exact recovery conditions for retrieving 3D human motion from 2D landmark motion. We use a skeletal kinematic model to represent the 3D human motion as a vector of angular articulation motion. We address this problem based on the observation that at a high tracking rate, regardless of the global rigid motion, only a few angular articulations have non-zero motion. We propose a first ideal formulation with the $\ell_0$-norm to minimize the cardinality of non-zero angular articulation motion given an equality constraint on the time-differentiation of the reprojection error. The second relaxed formulation relies on an $\ell_1$-norm to minimize the sum of absolute values of the angular articulation motion. This formulation has the advantage of being able to provide 3D motion even in the under-determined case when twice the number of 2D landmarks is smaller than the number of angular articulations. We define a specific property, the Projective Kinematic Space Property (PKSP), that takes into account the reprojection constraint and the kinematic model. We prove that for the relaxed formulation we are able to recover the exact 3D human motion from 2D landmarks if and only if the PKSP property is verified. We further demonstrate that solving the relaxed formulation provides the same ground-truth solution as the ideal formulation if and only if the PKSP condition is fulfilled. Results with simulated sparse skeletal angular motion show the ability of the proposed method to recover the exact location of angular motion. We provide results on publicly available real data (HUMAN3.6M, PANOPTIC and MPI-I3DHP).
Tasks
Published	2019-07-09
URL	https://arxiv.org/abs/1907.03967v1
PDF	https://arxiv.org/pdf/1907.03967v1.pdf
PWC	https://paperswithcode.com/paper/on-the-exact-recovery-conditions-of-3d-human
Repo
Framework
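
The two formulations described in the abstract above can be written schematically as follows. The notation is an assumption made here for illustration and is not taken from the paper: $\dot{\theta} \in \mathbb{R}^{m}$ denotes the angular articulation motion, $\dot{p} \in \mathbb{R}^{2n}$ the stacked 2D motion of $n$ landmarks, and $J \in \mathbb{R}^{2n \times m}$ a linear map standing in for the time-differentiated reprojection constraint.

```latex
\begin{align}
\text{(ideal)}\quad
  & \min_{\dot{\theta} \in \mathbb{R}^{m}} \|\dot{\theta}\|_{0}
  \quad \text{s.t.} \quad J\,\dot{\theta} = \dot{p}, \\
\text{(relaxed)}\quad
  & \min_{\dot{\theta} \in \mathbb{R}^{m}} \|\dot{\theta}\|_{1}
  = \sum_{i=1}^{m} |\dot{\theta}_{i}|
  \quad \text{s.t.} \quad J\,\dot{\theta} = \dot{p}.
\end{align}
```

Under this notation, the under-determined case mentioned in the abstract corresponds to $2n < m$, where the constraint alone admits infinitely many solutions and the sparsity objective selects among them.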

Spatial images from temporal data


Title	Spatial images from temporal data
Authors	Alex Turpin, Gabriella Musarra, Valentin Kapitany, Francesco Tonolini, Ashley Lyons, Ilya Starshynov, Federica Villa, Enrico Conca, Francesco Fioranelli, Roderick Murray-Smith, Daniele Faccio
Abstract	Traditional paradigms for imaging rely on the use of spatial structure either in the detector (pixel arrays) or in the illumination (patterned light). Removal of spatial structure in the detector or illumination, i.e. imaging with just a single-point sensor, would require solving a very strongly ill-posed inverse retrieval problem that to date has not been solved. Here we demonstrate a data-driven approach in which full 3D information is obtained with just a single-point, single-photon avalanche diode that records the arrival time of photons reflected from a scene that is illuminated with short pulses of light. Imaging with single-point time-of-flight (temporal) data opens new routes in terms of speed, size, and functionality. As an example, we show how the training based on an optical time-of-flight camera enables a compact radio-frequency impulse RADAR transceiver to provide 3D images.
Tasks
Published	2019-12-02
URL	https://arxiv.org/abs/1912.01413v1
PDF	https://arxiv.org/pdf/1912.01413v1.pdf
PWC	https://paperswithcode.com/paper/spatial-images-from-temporal-data
Repo
Framework

High-Fidelity State-of-Charge Estimation of Li-Ion Batteries Using Machine Learning


Title	High-Fidelity State-of-Charge Estimation of Li-Ion Batteries Using Machine Learning
Authors	Weizhong Wang, Nicholas W. Brady, Chenyao Liao, Youssef A. Fahmy, Ephrem Chemali, Alan C. West, Matthias Preindl
Abstract	This paper proposes a way to augment the existing machine learning algorithm applied to state-of-charge estimation by introducing a form of pulse injection to the running battery cells. It is believed that the information contained in the pulse responses can be interpreted by a machine learning algorithm, whereas other techniques have difficulty decoding it due to the nonlinearity. The sensitivity analysis of the amplitude of the current pulse is given through simulation, allowing researchers to select the appropriate current level with respect to the desired accuracy improvement. A multi-layer feedforward neural network is trained to acquire the nonlinear relationship between the pulse train and the ground-truth SoC. The network is trained on experimental data and the results are shown to be promising, with less than 2% SoC estimation error using layer sizes in the range of 10 - 10,000 trained in 0 - 1 million epochs. The testing procedure specifically designed for the proposed technique is explained and provided. The implementation of the proposed strategy is also discussed. The detailed system layout to perform the augmented SoC estimation integrated in the existing active balancing hardware has also been given.
Tasks
Published	2019-08-30
URL	https://arxiv.org/abs/1909.02448v1
PDF	https://arxiv.org/pdf/1909.02448v1.pdf
PWC	https://paperswithcode.com/paper/high-fidelity-state-of-charge-estimation-of
Repo
Framework

Semi-supervised Learning for Quantification of Pulmonary Edema in Chest X-Ray Images


Title	Semi-supervised Learning for Quantification of Pulmonary Edema in Chest X-Ray Images
Authors	Ruizhi Liao, Jonathan Rubin, Grace Lam, Seth Berkowitz, Sandeep Dalal, William Wells, Steven Horng, Polina Golland
Abstract	We propose and demonstrate machine learning algorithms to assess the severity of pulmonary edema in chest x-ray images of congestive heart failure patients. Accurate assessment of pulmonary edema in heart failure is critical when making treatment and disposition decisions. Our work is grounded in a large-scale clinical dataset of over 300,000 x-ray images with associated radiology reports. While edema severity labels can be extracted unambiguously from a small fraction of the radiology reports, accurate annotation is challenging in most cases. To take advantage of the unlabeled images, we develop a Bayesian model that includes a variational auto-encoder for learning a latent representation from the entire image set trained jointly with a regressor that employs this representation for predicting pulmonary edema severity. Our experimental results suggest that modeling the distribution of images jointly with the limited labels improves the accuracy of pulmonary edema scoring compared to a strictly supervised approach. To the best of our knowledge, this is the first attempt to employ machine learning algorithms to automatically and quantitatively assess the severity of pulmonary edema in chest x-ray images.
Tasks
Published	2019-02-27
URL	http://arxiv.org/abs/1902.10785v3
PDF	http://arxiv.org/pdf/1902.10785v3.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-learning-for-quantification
Repo
Framework

Deep multi-class learning from label proportions


Title	Deep multi-class learning from label proportions
Authors	Gabriel Dulac-Arnold, Neil Zeghidour, Marco Cuturi, Lucas Beyer, Jean-Philippe Vert
Abstract	We propose a learning algorithm capable of learning from label proportions instead of direct data labels. In this scenario, our data are arranged into various bags of a certain size, and only the proportions of each label within a given bag are known. This is a common situation in cases where per-data labeling is lengthy, but a more general label is easily accessible. Several approaches have been proposed to learn in this setting with linear models in the multiclass setting, or with nonlinear models in the binary classification setting. Here we investigate the more general nonlinear multiclass setting, and compare two differentiable loss functions to train end-to-end deep neural networks from bags with label proportions. We illustrate the relevance of our methods on an image classification benchmark, and demonstrate the possibility to learn accurate image classifiers from bags of images.
Tasks	Image Classification
Published	2019-05-30
URL	https://arxiv.org/abs/1905.12909v2
PDF	https://arxiv.org/pdf/1905.12909v2.pdf
PWC	https://paperswithcode.com/paper/deep-multi-class-learning-from-label
Repo
Framework
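
The abstract above describes training from bags where only per-bag label proportions are known, using differentiable losses, but does not name the two losses it compares. As a minimal sketch of the general idea, the following computes one commonly used bag-level loss: the cross-entropy between the known bag proportions and the mean predicted class distribution over the bag. This choice and the function names are assumptions for illustration, not necessarily either of the paper's losses.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

def bag_proportion_loss(logits, bag_proportions, eps=1e-12):
    """Cross-entropy between known proportions and mean predicted probabilities.

    logits: array of shape (bag_size, n_classes), one row per instance.
    bag_proportions: array of shape (n_classes,), summing to 1.
    """
    probabilities = softmax(np.asarray(logits, dtype=float))
    predicted_proportions = probabilities.mean(axis=0)
    target = np.asarray(bag_proportions, dtype=float)
    return float(-(target * np.log(predicted_proportions + eps)).sum())

# Hypothetical bag of four instances and two classes, known to be 75% class 0.
logits = np.array([[4.0, 0.0], [4.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
loss = bag_proportion_loss(logits, [0.75, 0.25])
```

No individual instance label appears in the loss, only the bag-level proportions, which is what allows a network to be trained end-to-end in this setting.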

Semantically Interpretable and Controllable Filter Sets


Title	Semantically Interpretable and Controllable Filter Sets
Authors	Mohit Prabhushankar, Gukyeong Kwon, Dogancan Temel, Ghassan AlRegib
Abstract	In this paper, we generate and control semantically interpretable filters that are directly learned from natural images in an unsupervised fashion. Each semantic filter learns a visually interpretable local structure in conjunction with other filters. The significance of learning these interpretable filter sets is demonstrated on two contrasting applications. The first application is image recognition under progressive decolorization, in which recognition algorithms should be color-insensitive to achieve a robust performance. The second application is image quality assessment where objective methods should be sensitive to color degradations. In the proposed work, the sensitivity and lack thereof are controlled by weighing the semantic filters based on the local structures they represent. To validate the proposed approach, we utilize the CURE-TSR dataset for image recognition and the TID 2013 dataset for image quality assessment. We show that the proposed semantic filter set achieves state-of-the-art performances in both datasets while maintaining its robustness across progressive distortions.
Tasks	Image Quality Assessment
Published	2019-02-17
URL	http://arxiv.org/abs/1902.06334v1
PDF	http://arxiv.org/pdf/1902.06334v1.pdf
PWC	https://paperswithcode.com/paper/semantically-interpretable-and-controllable
Repo
Framework

Deep Iterative Reconstruction for Phase Retrieval


Title	Deep Iterative Reconstruction for Phase Retrieval
Authors	Çağatay Işıl, Figen S. Oktem, Aykut Koç
Abstract	The classical phase retrieval problem is the recovery of a constrained image from the magnitude of its Fourier transform. Although there are several well-known phase retrieval algorithms including the hybrid input-output (HIO) method, the reconstruction performance is generally sensitive to initialization and measurement noise. Recently, deep neural networks (DNNs) have been shown to provide state-of-the-art performance in solving several inverse problems such as denoising, deconvolution, and superresolution. In this work, we develop a phase retrieval algorithm that utilizes two DNNs together with the model-based HIO method. First, a DNN is trained to remove the HIO artifacts and is used iteratively with the HIO method to improve the reconstructions. After this iterative phase, a second DNN is trained to remove the remaining artifacts. Numerical results demonstrate the effectiveness of our approach, which has little additional computational cost compared to the HIO method. Our approach not only achieves state-of-the-art reconstruction performance but also is more robust to different initialization and noise levels.
Tasks	Denoising
Published	2019-04-25
URL	https://arxiv.org/abs/1904.11301v2
PDF	https://arxiv.org/pdf/1904.11301v2.pdf
PWC	https://paperswithcode.com/paper/deep-iterative-reconstruction-for-phase
Repo
Framework
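
The abstract above builds on the hybrid input-output (HIO) method. For orientation, here is a minimal sketch of the classical HIO iteration alone, without either of the paper's DNN stages, assuming a real non-negative image with a known support mask. The function signature, parameter defaults and the toy image are assumptions for illustration.

```python
import numpy as np

def hio(magnitude, support, n_iter=200, beta=0.9, x0=None, seed=0):
    """Classical hybrid input-output phase retrieval (sketch).

    magnitude: measured Fourier magnitude, same shape as the image.
    support:   boolean mask of pixels allowed to be non-zero.
    """
    rng = np.random.default_rng(seed)
    x = rng.random(magnitude.shape) if x0 is None else np.array(x0, dtype=float)
    for _ in range(n_iter):
        # Fourier-domain step: keep the current phase, impose the measured magnitude.
        spectrum = np.fft.fft2(x)
        constrained = magnitude * np.exp(1j * np.angle(spectrum))
        x_new = np.real(np.fft.ifft2(constrained))
        # Object-domain step: accept pixels that satisfy the constraints,
        # otherwise apply the HIO feedback update.
        satisfied = support & (x_new >= 0)
        x = np.where(satisfied, x_new, x - beta * x_new)
    return x

# Toy example: a strictly positive 4x4 patch inside an 8x8 zero-padded frame.
truth = np.zeros((8, 8))
truth[2:6, 2:6] = np.arange(1, 17).reshape(4, 4)
measured = np.abs(np.fft.fft2(truth))
estimate = hio(measured, truth > 0, n_iter=200)
```

The result depends on the random starting point, which reflects the sensitivity to initialization that the abstract mentions; convergence to the true image from a random start is not guaranteed by this sketch.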

UnsuperPoint: End-to-end Unsupervised Interest Point Detector and Descriptor


Title	UnsuperPoint: End-to-end Unsupervised Interest Point Detector and Descriptor
Authors	Peter Hviid Christiansen, Mikkel Fly Kragh, Yury Brodskiy, Henrik Karstoft
Abstract	It is hard to create consistent ground truth data for interest points in natural images, since interest points are hard to define clearly and consistently for a human annotator. This makes interest point detectors non-trivial to build. In this work, we introduce an unsupervised deep learning-based interest point detector and descriptor. Using a self-supervised approach, we utilize a siamese network and a novel loss function that enables interest point scores and positions to be learned automatically. The resulting interest point detector and descriptor is UnsuperPoint. We use regression of point positions to 1) make UnsuperPoint end-to-end trainable and 2) incorporate non-maximum suppression in the model. Unlike most trainable detectors, it requires no generation of pseudo ground truth points, no structure-from-motion-generated representations and the model is learned from only one round of training. Furthermore, we introduce a novel loss function to regularize network predictions to be uniformly distributed. UnsuperPoint runs in real-time with 323 frames per second (fps) at a resolution of $224\times320$ and 90 fps at $480\times640$. It is comparable to or better than state-of-the-art performance when measured for speed, repeatability, localization, matching score and homography estimation on the HPatch dataset.
Tasks	Homography Estimation
Published	2019-07-09
URL	https://arxiv.org/abs/1907.04011v1
PDF	https://arxiv.org/pdf/1907.04011v1.pdf
PWC	https://paperswithcode.com/paper/unsuperpoint-end-to-end-unsupervised-interest
Repo
Framework

STN-Homography: estimate homography parameters directly


Title	STN-Homography: estimate homography parameters directly
Authors	Qiang Zhou, Xin Li
Abstract	In this paper, we introduce the STN-Homography model to directly estimate the homography matrix between an image pair. Unlike most CNN-based homography estimation methods, which use an alternative 4-point homography parameterization, we prove that, after coordinate normalization, the variance of the elements of the coordinate-normalized $3\times3$ homography matrix is very small, making them suitable for regression with a CNN. Based on the proposed STN-Homography, we use a hierarchical architecture which stacks several STN-Homography models and successively reduces the estimation error. Effectiveness of the proposed method is shown through experiments on the MSCOCO dataset, in which it significantly outperforms the state-of-the-art. The average processing time of our hierarchical STN-Homography with 1 stage is only 4.87 ms on the GPU, and the processing time for hierarchical STN-Homography with 3 stages is 17.85 ms. The code will soon be open sourced.
Tasks	Homography Estimation
Published	2019-06-06
URL	https://arxiv.org/abs/1906.02539v1
PDF	https://arxiv.org/pdf/1906.02539v1.pdf
PWC	https://paperswithcode.com/paper/stn-homography-estimate-homography-parameters
Repo
Framework
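
The abstract above relies on coordinate normalization of the $3\times3$ homography matrix but does not spell out the normalization. As a minimal sketch under an assumed convention (pixel coordinates mapped linearly to $[-1, 1]$ on both axes, as is common for spatial transformer grids), a pixel-space homography $H$ becomes $N H N^{-1}$ in normalized coordinates. The function names and the choice of scaling are assumptions, not the paper's definition.

```python
import numpy as np

def normalization_matrix(width, height):
    """Map pixel coordinates [0, width-1] x [0, height-1] to [-1, 1] x [-1, 1]."""
    return np.array([
        [2.0 / (width - 1), 0.0, -1.0],
        [0.0, 2.0 / (height - 1), -1.0],
        [0.0, 0.0, 1.0],
    ])

def normalize_homography(H, width, height):
    """Express a pixel-space homography in normalized coordinates."""
    N = normalization_matrix(width, height)
    H_norm = N @ np.asarray(H, dtype=float) @ np.linalg.inv(N)
    # Fix the projective scale; assumes the bottom-right entry is non-zero.
    return H_norm / H_norm[2, 2]

# Hypothetical example: a 32-pixel horizontal shift on a 129x129 image
# becomes a shift of 0.5 in normalized units.
H_pixels = np.array([[1.0, 0.0, 32.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
H_normalized = normalize_homography(H_pixels, 129, 129)
```

In pixel units the translation entries can be orders of magnitude larger than the perspective entries, whereas after normalization all entries sit on a comparable scale, which is consistent with the abstract's argument that the normalized matrix is better suited to direct regression.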