January 30, 2020

3145 words 15 mins read

Paper Group ANR 295

Building a Benchmark Dataset and Classifiers for Sentence-Level Findings in AP Chest X-rays. Logistic principal component analysis via non-convex singular value thresholding. DaTscan SPECT Image Classification for Parkinson’s Disease. SRG: Snippet Relatedness-based Temporal Action Proposal Generator. Anomaly Detection in Traffic Scenes via Spatial- …

Building a Benchmark Dataset and Classifiers for Sentence-Level Findings in AP Chest X-rays

Title Building a Benchmark Dataset and Classifiers for Sentence-Level Findings in AP Chest X-rays
Authors Tanveer Syeda-Mahmood, Hassan M. Ahmad, Nadeem Ansari, Yaniv Gur, Satyananda Kashyap, Alexandros Karargyris, Mehdi Moradi, Anup Pillai, Karthik Sheshadri, Weiting Wang, Ken C. L. Wong, Joy T. Wu
Abstract Chest X-rays are the most common diagnostic exams in emergency rooms and hospitals. There has been a surge of work on automatic interpretation of chest X-rays using deep learning approaches following the release of a large open-source chest X-ray dataset from the NIH. However, its labels are not sufficiently rich and descriptive for training classification tools. Further, it does not adequately cover the findings seen in chest X-rays taken in the anterior-posterior (AP) view, which also depict the placement of devices such as central vascular lines and tubes. In this paper, we present a new chest X-ray benchmark database of 73 rich sentence-level descriptors of findings seen in AP chest X-rays. We describe our method of obtaining these findings through a semi-automated ground truth generation process from crowdsourcing of clinician annotations. We also present classification results showing that such higher-granularity labels can be learned within a deep learning framework. (A hedged multi-label classifier sketch appears at the end of this entry.)
Tasks
Published 2019-06-21
URL https://arxiv.org/abs/1906.09336v1
PDF https://arxiv.org/pdf/1906.09336v1.pdf
PWC https://paperswithcode.com/paper/building-a-benchmark-dataset-and-classifiers
Repo
Framework
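
The paper trains deep classifiers on 73 sentence-level finding labels. The authors’ exact architecture is not given here, so the following is only a minimal multi-label sketch in PyTorch: the 73-label output size comes from the abstract, while the DenseNet-121 backbone, the sigmoid/BCE setup, and the input size are assumptions.

```python
# Minimal multi-label classifier sketch (not the paper's model).
# Assumes 73 sentence-level finding labels (from the abstract); the
# DenseNet-121 backbone, BCE loss, and input size are assumptions.
import torch
import torch.nn as nn
from torchvision import models

NUM_FINDINGS = 73  # sentence-level descriptors reported in the abstract

class FindingsClassifier(nn.Module):
    def __init__(self, num_labels: int = NUM_FINDINGS):
        super().__init__()
        backbone = models.densenet121()  # pretrained weights could be loaded here
        in_features = backbone.classifier.in_features
        backbone.classifier = nn.Linear(in_features, num_labels)
        self.backbone = backbone

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Raw logits; apply a sigmoid at inference for per-finding probabilities.
        return self.backbone(x)

model = FindingsClassifier()
criterion = nn.BCEWithLogitsLoss()  # each finding is an independent binary label
x = torch.randn(2, 3, 224, 224)     # dummy batch standing in for AP chest X-rays
y = torch.randint(0, 2, (2, NUM_FINDINGS)).float()
loss = criterion(model(x), y)
loss.backward()
```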

Logistic principal component analysis via non-convex singular value thresholding

Title Logistic principal component analysis via non-convex singular value thresholding
Authors Yipeng Song, Johan A. Westerhuis, Age K. Smilde
Abstract Multivariate binary data is becoming abundant in current biological research. Logistic principal component analysis (PCA) is one of the commonly used tools to explore the relationships inside a multivariate binary data set by exploiting the underlying low rank structure. We re-expressed the logistic PCA model based on the latent variable interpretation of the generalized linear model on binary data. The multivariate binary data set is assumed to be the sign observation of an unobserved quantitative data set, on which a low rank structure is assumed to exist. However, the standard logistic PCA model (using an exact low rank constraint) is prone to overfitting, which can lead to divergence of some estimated parameters towards infinity. We propose to fit a logistic PCA model through non-convex singular value thresholding to alleviate the overfitting issue. An efficient Majorization-Minimization algorithm is implemented to fit the model, and a missing-value-based cross-validation (CV) procedure is introduced for model selection. Our experiments on realistic simulations of imbalanced binary data with low signal-to-noise ratio show that the CV-error-based model selection procedure is successful in selecting the proposed model. Furthermore, the selected model demonstrates superior performance in recovering the underlying low rank structure compared to models with a convex nuclear norm penalty or an exact low rank constraint. A binary copy number aberration data set is used to illustrate the proposed methodology in practice. (A minimal MM-with-singular-value-thresholding sketch appears at the end of this entry.)
Tasks Model Selection
Published 2019-02-25
URL http://arxiv.org/abs/1902.09486v1
PDF http://arxiv.org/pdf/1902.09486v1.pdf
PWC https://paperswithcode.com/paper/logistic-principal-component-analysis-via-non
Repo
Framework
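
As a rough sketch of the fitting idea under stated assumptions: for data coded as {-1, +1}, the entrywise logistic loss has a 1/4-Lipschitz gradient, so each Majorization-Minimization step can take a gradient step of size 4 on the natural-parameter matrix and then threshold its singular values. Soft-thresholding corresponds to the convex nuclear-norm penalty the paper compares against; the paper’s specific non-convex thresholding rule is not reproduced here, and hard thresholding is used only as a simple stand-in.

```python
# Sketch of logistic PCA fitted by Majorization-Minimization with singular
# value thresholding (NumPy). X has entries in {-1, +1}. The soft/hard
# thresholding rules below are illustrative stand-ins, not the paper's
# specific non-convex thresholding function.
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def svt(Z, lam, rule="soft"):
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    if rule == "soft":          # prox of the convex nuclear-norm penalty
        s = np.maximum(s - lam, 0.0)
    else:                       # hard thresholding, one simple non-convex rule
        s = np.where(s > lam, s, 0.0)
    return (U * s) @ Vt

def logistic_pca_mm(X, lam=1.0, rule="soft", n_iter=200):
    Theta = np.zeros_like(X, dtype=float)
    for _ in range(n_iter):
        grad = -X * sigmoid(-X * Theta)      # entrywise logistic-loss gradient
        # The gradient is 1/4-Lipschitz, so a step of size 4 majorizes the loss.
        Theta = svt(Theta - 4.0 * grad, 4.0 * lam, rule)
    return Theta

rng = np.random.default_rng(0)
low_rank = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 20))
X = np.sign(low_rank + rng.normal(scale=0.5, size=low_rank.shape))
Theta_hat = logistic_pca_mm(X, lam=2.0, rule="hard")
print(np.linalg.matrix_rank(np.round(Theta_hat, 6)))   # recovered rank
```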

DaTscan SPECT Image Classification for Parkinson’s Disease

Title DaTscan SPECT Image Classification for Parkinson’s Disease
Authors Justin Quan, Lin Xu, Rene Xu, Tyrael Tong, Jean Su
Abstract Parkinson’s Disease (PD) is a neurodegenerative disease that currently has no cure. In order to facilitate disease management and slow symptom progression, early diagnosis is essential. The current clinical diagnostic approach is for radiologists to perform visual analysis of the degeneration of dopaminergic neurons in the substantia nigra region of the brain. Clinically, dopamine levels are monitored by observing dopamine transporter (DaT) activity. One method of DaT activity analysis uses the injection of an Iodine-123 fluoropropyl (123I-FP-CIT) tracer combined with single photon emission computed tomography (SPECT) imaging. The tracer highlights the region of interest in the resulting DaTscan SPECT images. Human visual analysis is slow and vulnerable to subjectivity between radiologists, so the goal was to develop an introductory implementation of a deep convolutional neural network that can objectively and accurately classify DaTscan SPECT images as Parkinson’s Disease or normal. This study illustrates the approach of using a deep convolutional neural network and evaluates its performance on DaTscan SPECT image classification. The data used in this study were obtained from a database provided by the Parkinson’s Progression Markers Initiative (PPMI). The deep neural network uses the InceptionV3 architecture, first runner-up in the 2015 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), as a base model, with a custom binary classifier block added on top. To account for the small dataset size, ten-fold cross-validation was used to evaluate the model’s performance. (A hedged transfer-learning sketch appears at the end of this entry.)
Tasks Image Classification, Object Recognition
Published 2019-09-09
URL https://arxiv.org/abs/1909.04142v1
PDF https://arxiv.org/pdf/1909.04142v1.pdf
PWC https://paperswithcode.com/paper/datscan-spect-image-classification-for
Repo
Framework
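
A hedged transfer-learning sketch matching the description above: an InceptionV3 base plus a custom binary classifier block, evaluated with ten-fold cross-validation. The exact head layers, optimizer, preprocessing, and whether ImageNet weights were used are not stated here, so those choices are assumptions.

```python
# Sketch of an InceptionV3 base with a custom binary head, as the abstract
# describes. The head layers, optimizer, and preprocessing are assumptions.
import tensorflow as tf

def build_model(input_shape=(299, 299, 3)):
    base = tf.keras.applications.InceptionV3(
        include_top=False, weights="imagenet", input_shape=input_shape)
    base.trainable = False                      # start by training only the head
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    x = tf.keras.layers.Dense(256, activation="relu")(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    out = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # PD vs. normal
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Ten-fold cross-validation skeleton over image arrays X and labels y:
# from sklearn.model_selection import StratifiedKFold
# for train_idx, val_idx in StratifiedKFold(n_splits=10, shuffle=True).split(X, y):
#     model = build_model()
#     model.fit(X[train_idx], y[train_idx],
#               validation_data=(X[val_idx], y[val_idx]), epochs=10)
```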

SRG: Snippet Relatedness-based Temporal Action Proposal Generator

Title SRG: Snippet Relatedness-based Temporal Action Proposal Generator
Authors Hyunjun Eun, Sumin Lee, Jinyoung Moon, Jongyoul Park, Chanho Jung, Changick Kim
Abstract Recent temporal action proposal generation approaches have suggested integrating segment- and snippet score-based methodologies to produce proposals with high recall and accurate boundaries. In this paper, different from such a hybrid strategy, we focus on the potential of the snippet score-based approach. Specifically, we propose a new snippet score-based method, named Snippet Relatedness-based Generator (SRG), with a novel concept of “snippet relatedness”. Snippet relatedness represents which snippets are related to a specific action instance. To effectively learn this snippet relatedness, we present “pyramid non-local operations” for locally and globally capturing long-range dependencies among snippets. By employing these components, SRG first produces a 2D relatedness score map that enables the generation of various temporal intervals reliably covering most action instances with high overlap. Then, SRG evaluates the action confidence scores of these temporal intervals and refines their boundaries to obtain temporal action proposals. On the THUMOS-14 and ActivityNet-1.3 datasets, SRG outperforms state-of-the-art methods for temporal action proposal generation. Furthermore, compared to competing proposal generators, SRG leads to significant improvements in temporal action detection. (A generic non-local block sketch appears at the end of this entry.)
Tasks Action Detection, Temporal Action Proposal Generation
Published 2019-11-26
URL https://arxiv.org/abs/1911.11306v2
PDF https://arxiv.org/pdf/1911.11306v2.pdf
PWC https://paperswithcode.com/paper/srg-snippet-relatedness-based-temporal-action
Repo
Framework
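
The paper’s “pyramid non-local operations” build on the standard non-local block; the sketch below is only the generic embedded-Gaussian non-local block applied to a 1D sequence of snippet features in PyTorch, not the pyramid variant, and the channel sizes are assumptions.

```python
# A plain embedded-Gaussian non-local block over a 1D sequence of snippet
# features (PyTorch). This is the generic building block, not the paper's
# pyramid variant; channel sizes and shapes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLocal1d(nn.Module):
    def __init__(self, channels: int, bottleneck: int = None):
        super().__init__()
        bottleneck = bottleneck or channels // 2
        self.theta = nn.Conv1d(channels, bottleneck, 1)
        self.phi = nn.Conv1d(channels, bottleneck, 1)
        self.g = nn.Conv1d(channels, bottleneck, 1)
        self.out = nn.Conv1d(bottleneck, channels, 1)

    def forward(self, x):                      # x: (batch, channels, T snippets)
        q = self.theta(x).transpose(1, 2)      # (B, T, C')
        k = self.phi(x)                        # (B, C', T)
        v = self.g(x).transpose(1, 2)          # (B, T, C')
        attn = F.softmax(q @ k, dim=-1)        # pairwise snippet relatedness
        y = (attn @ v).transpose(1, 2)         # aggregate over all snippets
        return x + self.out(y)                 # residual connection

feats = torch.randn(2, 256, 100)               # 100 snippets, 256-dim features
print(NonLocal1d(256)(feats).shape)            # torch.Size([2, 256, 100])
```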

Anomaly Detection in Traffic Scenes via Spatial-aware Motion Reconstruction

Title Anomaly Detection in Traffic Scenes via Spatial-aware Motion Reconstruction
Authors Yuan Yuan, Dong Wang, Qi Wang
Abstract Anomaly detection from the driver’s perspective is important for autonomous vehicles. As part of Advanced Driver Assistance Systems (ADAS), it can warn the driver of dangers in a timely manner. Compared with traditionally studied scenes such as university campuses and market surveillance videos, detecting abnormal events from a driver’s perspective is difficult due to camera shake, a constantly moving background, drastic changes in vehicle velocity, etc. To tackle these specific problems, this paper proposes a spatial localization constrained sparse coding approach for anomaly detection in traffic scenes, which first measures the abnormality of motion orientation and magnitude separately and then fuses the two aspects to obtain a robust detection result. The main contributions are threefold: 1) This work describes the motion orientation and magnitude of objects in a new way, which is demonstrated to outperform traditional motion descriptors. 2) The spatial localization of objects is taken into account in the sparse reconstruction framework, which utilizes the scene’s structural information and outperforms conventional sparse coding methods. 3) The orientation and magnitude results are adaptively weighted and fused by a Bayesian model, which makes the proposed method more robust and able to handle more kinds of abnormal events. The efficiency and effectiveness of the proposed method are validated by testing on nine difficult video sequences that we captured ourselves. The experimental results show that the proposed method is more effective and efficient than popular competitors and yields higher performance. (A simplified score-fusion sketch appears at the end of this entry.)
Tasks Anomaly Detection, Autonomous Vehicles
Published 2019-04-30
URL http://arxiv.org/abs/1904.13079v1
PDF http://arxiv.org/pdf/1904.13079v1.pdf
PWC https://paperswithcode.com/paper/anomaly-detection-in-traffic-scenes-via
Repo
Framework
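
The recipe above scores the abnormality of motion orientation and magnitude separately via sparse reconstruction and then fuses the two scores. A simplified sketch follows, using scikit-learn dictionary learning and OMP sparse coding with a fixed fusion weight; the paper’s spatial-localization constraint and Bayesian weighting model are not reproduced, and the descriptor dimensions are made up.

```python
# Sketch: abnormality as sparse-reconstruction error over two motion
# descriptors (orientation and magnitude), then a weighted fusion of the two
# scores. The paper's spatial-localization constraint and Bayesian weighting
# are not reproduced; the fixed fusion weight below is an assumption.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning, SparseCoder

def reconstruction_error(train_feats, test_feats, n_atoms=32):
    dico = MiniBatchDictionaryLearning(n_components=n_atoms, random_state=0)
    D = dico.fit(train_feats).components_
    codes = SparseCoder(dictionary=D,
                        transform_algorithm="omp",
                        transform_n_nonzero_coefs=5).transform(test_feats)
    return np.linalg.norm(test_feats - codes @ D, axis=1)  # per-sample error

rng = np.random.default_rng(0)
train_orient, train_mag = rng.normal(size=(500, 64)), rng.normal(size=(500, 32))
test_orient, test_mag = rng.normal(size=(20, 64)), rng.normal(size=(20, 32))

err_orient = reconstruction_error(train_orient, test_orient)
err_mag = reconstruction_error(train_mag, test_mag)
w = 0.5                                   # stand-in for the adaptive fusion weight
anomaly_score = w * err_orient + (1 - w) * err_mag
print(anomaly_score.shape)                # one fused score per test sample
```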

Revisiting Feature Alignment for One-stage Object Detection

Title Revisiting Feature Alignment for One-stage Object Detection
Authors Yuntao Chen, Chenxia Han, Naiyan Wang, Zhaoxiang Zhang
Abstract Recently, one-stage object detectors have gained much attention due to their simplicity in practice. Their fully convolutional nature greatly reduces the difficulty of training and deployment compared with two-stage detectors, which require NMS and sorting for the proposal stage. However, a fundamental issue in all one-stage detectors is the misalignment between anchor boxes and convolutional features, which significantly hinders their performance. In this work, we first reveal the deep connection between the widely used im2col operator and the RoIAlign operator. Guided by this illuminating observation, we propose a RoIConv operator which aligns the features with their corresponding anchors in one-stage detection in a principled way. We then design a fully convolutional AlignDet architecture which combines the flexibility of learned anchors and the preciseness of aligned features. Specifically, our AlignDet achieves a state-of-the-art mAP of 44.1 on the COCO test-dev with a ResNeXt-101 backbone. (An illustration of the alignment idea via RoIAlign appears at the end of this entry.)
Tasks Object Detection
Published 2019-08-05
URL https://arxiv.org/abs/1908.01570v1
PDF https://arxiv.org/pdf/1908.01570v1.pdf
PWC https://paperswithcode.com/paper/revisiting-feature-alignment-for-one-stage
Repo
Framework
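
The RoIConv operator itself is not reproduced here; the sketch below only illustrates the underlying alignment idea with torchvision’s roi_align, sampling a small feature patch for each anchor box instead of the fixed im2col grid of a standard convolution. The boxes, strides, and sizes are made up.

```python
# Illustration of the alignment idea only: sample features for each anchor box
# with RoIAlign rather than the fixed im2col grid of a standard convolution.
# This is not the paper's RoIConv operator; boxes and sizes are made up.
import torch
from torchvision.ops import roi_align

features = torch.randn(1, 256, 50, 50)        # FPN-like feature map, stride 8
anchors = torch.tensor([[0, 40.0, 40.0, 120.0, 200.0],   # (batch_idx, x1, y1, x2, y2)
                        [0, 200.0, 80.0, 320.0, 180.0]]) # in image coordinates
aligned = roi_align(features, anchors, output_size=(3, 3),
                    spatial_scale=1.0 / 8, aligned=True)
print(aligned.shape)   # (num_anchors, 256, 3, 3): a 3x3 "kernel support"
                       # aligned to each anchor, ready for a 3x3 conv head
```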

What do Entity-Centric Models Learn? Insights from Entity Linking in Multi-Party Dialogue

Title What do Entity-Centric Models Learn? Insights from Entity Linking in Multi-Party Dialogue
Authors Laura Aina, Carina Silberer, Matthijs Westera, Ionut-Teodor Sorodoc, Gemma Boleda
Abstract Humans use language to refer to entities in the external world. Motivated by this, in recent years several models that incorporate a bias towards learning entity representations have been proposed. Such entity-centric models have shown empirical success, but we still know little about why. In this paper we analyze the behavior of two recently proposed entity-centric models in a referential task, Entity Linking in Multi-party Dialogue (SemEval 2018 Task 4). We show that these models outperform the state of the art on this task, and that they do better on lower frequency entities than a counterpart model that is not entity-centric, with the same model size. We argue that making models entity-centric naturally fosters good architectural decisions. However, we also show that these models do not really build entity representations and that they make poor use of linguistic context. These negative results underscore the need for model analysis, to test whether the motivations for particular architectures are borne out in how models behave when deployed.
Tasks Entity Linking
Published 2019-05-16
URL https://arxiv.org/abs/1905.06649v1
PDF https://arxiv.org/pdf/1905.06649v1.pdf
PWC https://paperswithcode.com/paper/what-do-entity-centric-models-learn-insights
Repo
Framework

Deep learning approach to description and classification of fungi microscopic images

Title Deep learning approach to description and classification of fungi microscopic images
Authors Bartosz Zieliński, Agnieszka Sroka-Oleksiak, Dawid Rymarczyk, Adam Piekarczyk, Monika Brzychczy-Włoch
Abstract Diagnosis of fungal infections can rely on microscopic examination; however, in many cases this does not allow unambiguous identification of the species due to their visual similarity. It is therefore usually necessary to use additional biochemical tests, which involve additional costs and extend the identification process by up to 10 days. Such a delay in the implementation of targeted treatment is grave in its consequences, as the mortality rate for immunosuppressed patients is high. In this paper, we apply a machine learning approach based on deep learning and bag-of-words to classify microscopic images of various fungi species. Our approach makes the last stage of biochemical identification redundant, shortening the identification process by 2-3 days and reducing the cost of the diagnostic examination. (A generic deep-features-plus-bag-of-words sketch appears at the end of this entry.)
Tasks
Published 2019-06-22
URL https://arxiv.org/abs/1906.09449v3
PDF https://arxiv.org/pdf/1906.09449v3.pdf
PWC https://paperswithcode.com/paper/deep-learning-approach-to-description-and
Repo
Framework
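
One common way to combine deep features with a bag-of-words, shown as a hedged sketch: quantize local deep descriptors into a learned codebook, represent each image as a codeword histogram, and classify with an SVM. The paper’s backbone, patch extraction, codebook size, and final classifier are not given here, so those choices are assumptions.

```python
# One common deep-features + bag-of-words pipeline (not necessarily the
# paper's exact configuration): local descriptors -> k-means codebook ->
# per-image codeword histogram -> SVM. Codebook size and classifier are assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def bow_histogram(local_descriptors, codebook):
    words = codebook.predict(local_descriptors)           # quantize descriptors
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)                    # normalized histogram

rng = np.random.default_rng(0)
# Stand-in for CNN patch descriptors: a list of (n_patches, feat_dim) arrays.
train_descs = [rng.normal(size=(100, 128)) for _ in range(40)]
train_labels = rng.integers(0, 5, size=40)                # e.g. 5 fungi species

codebook = KMeans(n_clusters=64, n_init=10, random_state=0)
codebook.fit(np.vstack(train_descs))

X_train = np.stack([bow_histogram(d, codebook) for d in train_descs])
clf = SVC(kernel="rbf").fit(X_train, train_labels)
print(clf.predict(X_train[:3]))
```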

Channel Normalization in Convolutional Neural Network avoids Vanishing Gradients

Title Channel Normalization in Convolutional Neural Network avoids Vanishing Gradients
Authors Zhenwei Dai, Reinhard Heckel
Abstract Normalization layers are widely used in deep neural networks to stabilize training. In this paper, we consider the training of convolutional neural networks with gradient descent on a single training example. This optimization problem arises in recent approaches for solving inverse problems such as the deep image prior or the deep decoder. We show that for this setup, channel normalization, which centers and normalizes each channel individually, avoids vanishing gradients, whereas without normalization, gradients vanish, which prevents efficient optimization. This effect prevails in deep single-channel linear convolutional networks, and we show that without channel normalization, gradient descent takes at least exponentially many steps to come close to an optimum. In contrast, with channel normalization the gradients remain bounded, thus avoiding exploding gradients. (A few-line channel normalization sketch appears at the end of this entry.)
Tasks
Published 2019-07-22
URL https://arxiv.org/abs/1907.09539v1
PDF https://arxiv.org/pdf/1907.09539v1.pdf
PWC https://paperswithcode.com/paper/channel-normalization-in-convolutional-neural
Repo
Framework
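
Channel normalization as described, centering and normalizing each channel individually, fits in a few lines; the sketch below normalizes each channel over its spatial extent, though the paper’s exact formulation (epsilon, affine parameters, the single-example setting) may differ.

```python
# Channel normalization as described in the abstract: center and normalize
# each channel individually over its spatial extent. The epsilon and the
# absence of learned affine parameters here are assumptions.
import torch

def channel_norm(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # x: (batch, channels, height, width); statistics per sample and channel
    mean = x.mean(dim=(2, 3), keepdim=True)
    var = x.var(dim=(2, 3), keepdim=True, unbiased=False)
    return (x - mean) / torch.sqrt(var + eps)

x = torch.randn(4, 16, 32, 32)
y = channel_norm(x)
print(y.mean(dim=(2, 3)).abs().max())   # per-channel means are ~0 after normalization
```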

Online Local Boosting: improving performance in online decision trees

Title Online Local Boosting: improving performance in online decision trees
Authors Victor G. Turrisi da Costa, Saulo Martiello Mastelini, André C. Ponce de Leon Ferreira de Carvalho, Sylvio Barbon Jr
Abstract As more data are produced each day, and at ever faster rates, data stream mining is growing in importance, making clear the need for algorithms able to process these data quickly. Data stream mining algorithms are meant to extract knowledge online and are specially tailored to continuous data problems. Many of the current algorithms for data stream mining have high processing and memory costs, and often, the higher the predictive performance, the higher these costs. To increase predictive performance without largely increasing memory and time costs, this paper introduces a novel algorithm, named Online Local Boosting (OLBoost), which can be combined with online decision tree algorithms to improve their predictive performance without modifying the structure of the induced decision trees. To do so, OLBoost applies boosting to small, separate regions of the instance space. Experimental results presented in this paper show that online decision tree algorithms using OLBoost can significantly improve their predictive performance. Additionally, OLBoost can make smaller trees perform as well as or better than larger trees.
Tasks
Published 2019-07-16
URL https://arxiv.org/abs/1907.07207v1
PDF https://arxiv.org/pdf/1907.07207v1.pdf
PWC https://paperswithcode.com/paper/online-local-boosting-improving-performance
Repo
Framework

New Graph-based Features For Shape Recognition

Title New Graph-based Features For Shape Recognition
Authors Narges Mirehi, Maryam Tahmasbi, Alireza Tavakoli Targhi
Abstract Shape recognition is a central and challenging problem in computer vision, and different approaches and tools are used to solve it. Most existing approaches to object recognition are based on pixels. Pixel-based methods depend on the geometry and nature of the pixels, so the corruption of pixels reduces their performance. In this paper, we study the use of graphs for shape recognition. We construct a graph that captures the topological and geometrical properties of the object. Then, using the coordinates and relations of its vertices, we extract features that are robust to noise, rotation, scale variation, and articulation. To evaluate our method, we provide comparisons with state-of-the-art results on various known benchmarks, including the Kimia, Tari56, Tetrapod, and Articulated datasets. We provide an analysis of our method against different variations. The results confirm the method’s performance, especially its robustness to noise.
Tasks Object Recognition
Published 2019-09-08
URL https://arxiv.org/abs/1909.03482v1
PDF https://arxiv.org/pdf/1909.03482v1.pdf
PWC https://paperswithcode.com/paper/new-graph-based-features-for-shape
Repo
Framework

ChoiceNet: CNN learning through choice of multiple feature map representations

Title ChoiceNet: CNN learning through choice of multiple feature map representations
Authors Farshid Rayhan, Aphrodite Galata, Timothy F. Cootes
Abstract We introduce a new architecture called ChoiceNet, in which each layer of the network is highly connected with skip connections and channelwise concatenations. This enables the network to alleviate the problem of vanishing gradients, reduces the number of parameters without sacrificing performance, and encourages feature reuse. We evaluate the proposed architecture on benchmark datasets for object recognition (ImageNet, CIFAR-10, CIFAR-100, SVHN) and on a semantic segmentation dataset (CamVid). (One plausible reading of such a block is sketched at the end of this entry.)
Tasks Object Recognition, Semantic Segmentation
Published 2019-04-20
URL https://arxiv.org/abs/1904.09472v3
PDF https://arxiv.org/pdf/1904.09472v3.pdf
PWC https://paperswithcode.com/paper/choicenet-cnn-learning-through-choice-of
Repo
Framework
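
The description of layers “highly connected with skip connections and channelwise concatenations” admits several realizations; the sketch below is one plausible reading that mixes a residual skip with DenseNet-style concatenation, not the published ChoiceNet block, and all layer sizes are assumptions.

```python
# One plausible reading of a block that combines a residual skip connection
# with channelwise concatenation (PyTorch); this is not the published
# ChoiceNet block, and all layer sizes are assumptions.
import torch
import torch.nn as nn

class ConcatSkipBlock(nn.Module):
    def __init__(self, in_ch: int, growth: int = 32):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(in_ch, growth, 3, padding=1), nn.BatchNorm2d(growth), nn.ReLU(),
            nn.Conv2d(growth, growth, 3, padding=1), nn.BatchNorm2d(growth), nn.ReLU())
        self.proj = nn.Conv2d(in_ch + growth, in_ch, 1)   # fuse concatenated maps

    def forward(self, x):
        cat = torch.cat([x, self.branch(x)], dim=1)       # channelwise concatenation
        return x + self.proj(cat)                         # residual skip connection

x = torch.randn(2, 64, 32, 32)
print(ConcatSkipBlock(64)(x).shape)                       # torch.Size([2, 64, 32, 32])
```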

Sparse tree search optimality guarantees in POMDPs with continuous observation spaces

Title Sparse tree search optimality guarantees in POMDPs with continuous observation spaces
Authors Michael H. Lim, Claire J. Tomlin, Zachary N. Sunberg
Abstract Partially observable Markov decision processes (POMDPs) with continuous state and observation spaces offer powerful flexibility for representing real-world decision and control problems but are notoriously difficult to solve. Recent online sampling-based algorithms that use observation likelihood weighting have shown unprecedented effectiveness in domains with continuous observation spaces. However, there has been no formal theoretical justification for this technique. This work offers such a justification, proving that a simplified algorithm, partially observable weighted sparse sampling (POWSS), will estimate Q-values accurately with high probability and can be made to perform arbitrarily close to the optimal solution by increasing computational power. (A minimal likelihood-weighting sketch appears at the end of this entry.)
Tasks
Published 2019-10-10
URL https://arxiv.org/abs/1910.04332v2
PDF https://arxiv.org/pdf/1910.04332v2.pdf
PWC https://paperswithcode.com/paper/sparse-tree-search-optimality-guarantees-in
Repo
Framework
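
The key ingredient highlighted above is observation likelihood weighting: sampled next states are weighted by how likely they are to have produced the received continuous observation, so that value estimates stay well defined without discretizing the observation space. The sketch below shows only that self-normalized weighting step on a toy model, not the full POWSS recursion or its guarantees.

```python
# Minimal sketch of observation-likelihood weighting: propagate sampled states
# (particles), weight each by the likelihood of the received continuous
# observation, and form a self-normalized estimate of the expected reward.
# The generative model below is a toy stand-in, not a benchmark POMDP.
import numpy as np

rng = np.random.default_rng(0)

def transition(s, a):                 # toy dynamics with Gaussian noise
    return s + a + rng.normal(scale=0.1, size=s.shape)

def obs_likelihood(o, s_next):        # continuous observation model: o ~ N(s', 0.5^2)
    return np.exp(-0.5 * ((o - s_next) / 0.5) ** 2) / (0.5 * np.sqrt(2 * np.pi))

def reward(s, a):
    return -np.abs(s)                 # prefer states near zero

def weighted_value_estimate(particles, action, observation):
    s_next = transition(particles, action)
    w = obs_likelihood(observation, s_next)
    w = w / w.sum()                               # self-normalized importance weights
    return np.sum(w * reward(s_next, action)), s_next, w

particles = rng.normal(size=100)                  # belief approximated by particles
value, s_next, w = weighted_value_estimate(particles, action=0.2, observation=0.1)
print(round(value, 3))
```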

Fixing Gaussian Mixture VAEs for Interpretable Text Generation

Title Fixing Gaussian Mixture VAEs for Interpretable Text Generation
Authors Wenxian Shi, Hao Zhou, Ning Miao, Shenjian Zhao, Lei Li
Abstract Variational auto-encoders (VAEs) with Gaussian priors are effective for text generation. To improve controllability and interpretability, we propose to use a Gaussian mixture distribution as the prior of the VAE (GMVAE), since it includes an extra discrete latent variable in addition to the continuous one. Unfortunately, training GMVAE with the standard variational approximation often leads to mode collapse. We theoretically analyze the root cause: maximizing the evidence lower bound of GMVAE implicitly aggregates the means of the multiple Gaussian priors. We propose Dispersed-GMVAE (DGMVAE), an improved model for text generation, which introduces two extra terms to alleviate mode collapse and induce a better structured latent space. Experimental results show that DGMVAE outperforms strong baselines on several language modeling and text generation benchmarks. (The standard GMVAE ELBO terms are sketched at the end of this entry for context.)
Tasks Language Modelling, Text Generation
Published 2019-06-16
URL https://arxiv.org/abs/1906.06719v1
PDF https://arxiv.org/pdf/1906.06719v1.pdf
PWC https://paperswithcode.com/paper/fixing-gaussian-mixture-vaes-for
Repo
Framework
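
For context only, the sketch below writes out the standard Gaussian-mixture-prior ELBO terms (a discrete component posterior, a Gaussian encoder, and per-component Gaussian priors). It is not the DGMVAE objective: the paper’s two extra dispersion terms are not included, and all shapes and parameters are made up.

```python
# Standard Gaussian-mixture-prior ELBO terms for context (PyTorch): a discrete
# component posterior q(c|x), a Gaussian q(z|x), and per-component Gaussian
# priors p(z|c). This is *not* the DGMVAE objective; the dispersion terms are
# omitted and all shapes/parameters here are made up.
import torch
import torch.nn.functional as F

def gmvae_kl_terms(mu, logvar, logits_c, prior_mu, prior_logvar):
    # mu, logvar: (B, D) for q(z|x); logits_c: (B, K) for q(c|x)
    # prior_mu, prior_logvar: (K, D) for the K Gaussian mixture components
    q_c = F.softmax(logits_c, dim=-1)                       # (B, K)
    # KL(q(z|x) || p(z|c)) for every component c, closed form for diagonal Gaussians
    kl_z_c = 0.5 * (prior_logvar.unsqueeze(0) - logvar.unsqueeze(1)
                    + (logvar.unsqueeze(1).exp()
                       + (mu.unsqueeze(1) - prior_mu.unsqueeze(0)) ** 2)
                    / prior_logvar.unsqueeze(0).exp() - 1).sum(-1)   # (B, K)
    kl_z = (q_c * kl_z_c).sum(-1).mean()                    # E_{q(c|x)} KL term
    # KL(q(c|x) || p(c)) with a uniform prior over the K components
    K = q_c.size(-1)
    kl_c = (q_c * (q_c.clamp_min(1e-8).log()
                   + torch.log(torch.tensor(float(K))))).sum(-1).mean()
    return kl_z, kl_c

B, D, K = 8, 16, 4
kl_z, kl_c = gmvae_kl_terms(torch.randn(B, D), torch.zeros(B, D),
                            torch.randn(B, K), torch.randn(K, D), torch.zeros(K, D))
print(kl_z.item(), kl_c.item())
```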

Utilising Low Complexity CNNs to Lift Non-Local Redundancies in Video Coding

Title Utilising Low Complexity CNNs to Lift Non-Local Redundancies in Video Coding
Authors Jan P. Klopp, Liang-Gee Chen, Shao-Yi Chien
Abstract Digital media is ubiquitous and produced in ever-growing quantities. This necessitates a constant evolution of compression techniques, especially for video, in order to maintain efficient storage and transmission. In this work, we aim to exploit non-local redundancies in video data that remain difficult for conventional video codecs to remove. We design convolutional neural networks with a particular emphasis on low memory and computational footprint. The parameters of those networks are trained on the fly, at encoding time, to predict the residual signal from the decoded video signal. After the training process has converged, the parameters are compressed and signalled as part of the code of the underlying video codec. The method can be applied to any existing video codec to increase coding gains, while its low computational footprint allows for application under resource-constrained conditions. Building on top of High Efficiency Video Coding, we achieve coding gains similar to those of pretrained denoising CNNs while requiring only about 1% of their computational complexity. Through extensive experiments, we provide insights into the effectiveness of our network design decisions. In addition, we demonstrate that our algorithm delivers stable performance under conditions met in practical video compression: it performs without significant loss on very long random access segments (up to 256 frames) and, with moderate performance drops, can even be applied to single frames in high-resolution low-delay settings. (A sketch of the on-the-fly fitting step appears at the end of this entry.)
Tasks Denoising, Video Compression
Published 2019-10-19
URL https://arxiv.org/abs/1910.08737v1
PDF https://arxiv.org/pdf/1910.08737v1.pdf
PWC https://paperswithcode.com/paper/utilising-low-complexity-cnns-to-lift-non
Repo
Framework
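
The scheme trains a small network at encoding time to predict the residual from the decoded signal and then signals its compressed parameters. The sketch below shows only the on-the-fly fitting step for a single decoded/original pair; the parameter compression, signalling, and HEVC integration are not reproduced, and the tiny architecture is an assumption.

```python
# Sketch of the on-the-fly fitting step only: overfit a tiny CNN, at encoding
# time, to predict the residual (original - decoded) from the decoded frame.
# The parameter compression/signalling and HEVC integration are not shown,
# and the small architecture below is an assumption.
import torch
import torch.nn as nn

tiny_cnn = nn.Sequential(                      # deliberately low complexity
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, 3, padding=1))

decoded = torch.rand(1, 1, 64, 64)                       # decoded (reconstructed) luma block
original = decoded + 0.05 * torch.randn_like(decoded)    # stand-in for the source
residual = original - decoded                            # target signal

opt = torch.optim.Adam(tiny_cnn.parameters(), lr=1e-3)
for _ in range(200):                           # overfit to this content on purpose
    opt.zero_grad()
    loss = nn.functional.mse_loss(tiny_cnn(decoded), residual)
    loss.backward()
    opt.step()

enhanced = decoded + tiny_cnn(decoded)         # decoder-side enhancement
print(float(nn.functional.mse_loss(enhanced, original)))
```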