Paper Group ANR 796
Deep Learning Predicts Hip Fracture using Confounding Patient and Healthcare Variables. Saliency Supervision: An Intuitive and Effective Approach for Pain Intensity Regression. Convolutional LSTMs for Cloud-Robust Segmentation of Remote Sensing Imagery. Deep Learning with Long Short-Term Memory for Time Series Prediction. Guiding Intelligent Survei …
Deep Learning Predicts Hip Fracture using Confounding Patient and Healthcare Variables
Title | Deep Learning Predicts Hip Fracture using Confounding Patient and Healthcare Variables |
Authors | Marcus A. Badgeley, John R. Zech, Luke Oakden-Rayner, Benjamin S. Glicksberg, Manway Liu, William Gale, Michael V. McConnell, Beth Percha, Thomas M. Snyder, Joel T. Dudley |
Abstract | Hip fractures are a leading cause of death and disability among older adults. Hip fractures are also the most commonly missed diagnosis on pelvic radiographs. Computer-Aided Diagnosis (CAD) algorithms have shown promise for helping radiologists detect fractures, but the image features underpinning their predictions are notoriously difficult to understand. In this study, we trained deep learning models on 17,587 radiographs to classify fracture, five patient traits, and 14 hospital process variables. All 20 variables could be predicted from a radiograph (p < 0.05), with the best performances on scanner model (AUC=1.00), scanner brand (AUC=0.98), and whether the order was marked “priority” (AUC=0.79). Fracture was predicted moderately well from the image (AUC=0.78) and better when combining image features with patient data (AUC=0.86, p=2e-9) or patient data plus hospital process features (AUC=0.91, p=1e-21). The model’s performance on a test set with matched patient variables was significantly lower than on a random test set (AUC=0.67, p=0.003); and when the test set was matched on patient and image acquisition variables, the model performed randomly (AUC=0.52, 95% CI 0.46-0.58), indicating that these variables were the main source of the model’s predictive ability overall. We also used Naive Bayes to combine evidence from the image models with patient and hospital data and found that their inclusion improved performance, but that this approach was nevertheless inferior to directly modeling all variables. If CAD algorithms are inexplicably leveraging patient and process variables in their predictions, it is unclear how radiologists should interpret their predictions in the context of other known patient data. Further research is needed to illuminate deep learning decision processes so that computers and clinicians can effectively cooperate. |
Tasks | |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03695v1 |
http://arxiv.org/pdf/1811.03695v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-predicts-hip-fracture-using |
Repo | |
Framework | |
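The abstract's Naive Bayes step combines evidence from the image model with patient and hospital variables. Below is a minimal sketch of that combination rule, assuming calibrated probabilities and precomputed per-variable log-likelihood ratios; both are assumptions, since the paper does not publish its exact procedure.

```python
import numpy as np

def naive_bayes_combine(prior_logodds, image_prob, feature_llrs):
    """Combine an image model's probability with per-variable evidence
    under a Naive Bayes (conditional independence) assumption.

    prior_logodds : log P(fracture) / P(no fracture) from base rates
    image_prob    : P(fracture | image) from the CNN
    feature_llrs  : iterable of log-likelihood ratios, one per patient or
                    hospital-process variable, e.g.
                    log P(scanner=X | fracture) / P(scanner=X | no fracture)
    """
    eps = 1e-12
    image_prob = np.clip(image_prob, eps, 1 - eps)
    # Convert the CNN's posterior back into a likelihood-ratio contribution
    # by subtracting the prior it already encodes.
    image_llr = np.log(image_prob / (1 - image_prob)) - prior_logodds
    logodds = prior_logodds + image_llr + sum(feature_llrs)
    return 1.0 / (1.0 + np.exp(-logodds))  # combined P(fracture | evidence)

# e.g. 3% base rate, CNN says 0.40, two weakly informative process variables
p = naive_bayes_combine(np.log(0.03 / 0.97), 0.40, [0.5, -0.2])
```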
Saliency Supervision: An Intuitive and Effective Approach for Pain Intensity Regression
Title | Saliency Supervision: An Intuitive and Effective Approach for Pain Intensity Regression |
Authors | Conghui Li, Zhaocheng Zhu, Yuming Zhao |
Abstract | Estimating pain intensity from face images is an important problem in autonomous nursing systems. However, due to limited data sources and the subjectivity of pain intensity labels, it is hard to adopt modern deep neural networks for this problem without domain-specific auxiliary design. Inspired by a human-vision prior, we propose a novel approach called saliency supervision, in which we directly regularize deep networks to focus on the facial area that is discriminative for pain regression. Through alternating training between saliency supervision and a global loss, our method learns sparse and robust features, which proves helpful for pain intensity regression. We verified saliency supervision with a face-verification network backbone on a widely used dataset and achieved state-of-the-art performance without bells and whistles. Our saliency supervision is intuitive in spirit, yet effective in performance. We believe such saliency supervision is essential in dealing with ill-posed datasets and has potential in a wide range of vision tasks. |
Tasks | Face Verification, Pain Intensity Regression |
Published | 2018-11-16 |
URL | http://arxiv.org/abs/1811.07987v1 |
http://arxiv.org/pdf/1811.07987v1.pdf | |
PWC | https://paperswithcode.com/paper/saliency-supervision-an-intuitive-and |
Repo | |
Framework | |
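One way to read "saliency supervision" is as a penalty on input-gradient saliency that falls outside the facial region. The PyTorch sketch below is an interpretation under that assumption: `face_mask` and the weighting factor are hypothetical, and the paper alternates this term with the plain global loss rather than always summing them.

```python
import torch

def saliency_supervised_loss(model, images, targets, face_mask, criterion, lam=0.1):
    """Sketch: penalize input-gradient saliency outside a facial-region mask.
    face_mask is (B, 1, H, W), 1 inside the discriminative facial area.
    targets must be shaped to match the model output. The paper alternates
    between the two terms; both are returned so the caller can alternate."""
    images = images.clone().requires_grad_(True)
    global_loss = criterion(model(images), targets)
    # Saliency = gradient of the task loss w.r.t. the input pixels.
    grads, = torch.autograd.grad(global_loss, images, create_graph=True)
    saliency = grads.abs().sum(dim=1, keepdim=True)   # collapse color channels
    saliency_loss = (saliency * (1 - face_mask)).mean()
    return global_loss, lam * saliency_loss
```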
Convolutional LSTMs for Cloud-Robust Segmentation of Remote Sensing Imagery
Title | Convolutional LSTMs for Cloud-Robust Segmentation of Remote Sensing Imagery |
Authors | Marc Rußwurm, Marco Körner |
Abstract | Clouds frequently cover the Earth’s surface and pose an omnipresent challenge to optical Earth observation methods. The vast majority of remote sensing approaches either selectively choose single cloud-free observations or employ a pre-classification strategy to identify and mask cloudy pixels. We follow a different strategy and treat cloud coverage as noise that is inherent to the observed satellite data. In prior work, we directly employed a straightforward convolutional long short-term memory network for vegetation classification without explicit cloud filtering and achieved state-of-the-art classification accuracies. In this work, we investigate this cloud-robustness further by visualizing internal cell activations and performing an ablation experiment on datasets of different cloud coverage. In the visualizations of network states, we identified some cells in which modulation and input gates closed on cloudy pixels. This indicates that the network has internalized a cloud-filtering mechanism without being specifically trained on cloud labels. Overall, our results question the necessity of sophisticated pre-processing pipelines for multi-temporal deep learning approaches. |
Tasks | Segmentation Of Remote Sensing Imagery |
Published | 2018-10-28 |
URL | http://arxiv.org/abs/1811.02471v2 |
http://arxiv.org/pdf/1811.02471v2.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-lstms-for-cloud-robust |
Repo | |
Framework | |
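For reference, here is a minimal convolutional LSTM cell that returns its gate maps alongside the state, which is what makes the gate-activation inspection described above possible. The cell itself is textbook; returning the gates as a dictionary is my choice, not the authors'.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell. Exposing the gate maps lets one
    check whether input/modulation gates close on cloudy pixels."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        gates = self.conv(torch.cat([x, h], dim=1))
        i, f, g, o = gates.chunk(4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)                     # modulation gate
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, c, {"input": i, "modulation": g}

# x: (batch, bands, H, W) for one acquisition date; iterate over the sequence.
```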
Deep Learning with Long Short-Term Memory for Time Series Prediction
Title | Deep Learning with Long Short-Term Memory for Time Series Prediction |
Authors | Yuxiu Hua, Zhifeng Zhao, Rongpeng Li, Xianfu Chen, Zhiming Liu, Honggang Zhang |
Abstract | Time series prediction can be generalized as a process that extracts useful information from historical records and then determines future values. Learning the long-range dependencies embedded in time series is often an obstacle for most algorithms, whereas Long Short-Term Memory (LSTM) solutions, as a specific kind of scheme in deep learning, promise to effectively overcome the problem. In this article, we first give a brief introduction to the structure and forward-propagation mechanism of the LSTM model. Then, aiming at reducing the considerable computing cost of LSTM, we put forward the Random Connectivity LSTM (RCLSTM) model and test it by predicting traffic and user mobility in telecommunication networks. In contrast to the conventional LSTM, the RCLSTM forms its neural connectivity stochastically, a notable departure from the way neural network architectures are usually constructed. The resulting RCLSTM model exhibits a certain level of sparsity, which leads to an appealing decrease in computational complexity and makes it more applicable to latency-stringent application scenarios. In the field of telecommunication networks, the prediction of traffic series and mobility traces could directly benefit from this improvement, as we further demonstrate that the prediction accuracy of the RCLSTM is comparable to that of the conventional LSTM regardless of the number of training samples or the length of the input sequences. |
Tasks | Time Series, Time Series Prediction |
Published | 2018-10-24 |
URL | http://arxiv.org/abs/1810.10161v1 |
http://arxiv.org/pdf/1810.10161v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-with-long-short-term-memory-for |
Repo | |
Framework | |
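The core RCLSTM idea, stochastic fixed connectivity between neurons, can be sketched as an ordinary LSTM cell whose weight matrices are elementwise-masked by a random binary pattern drawn once at construction. A NumPy sketch follows; the gate layout and initialization scale are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class RCLSTMCell:
    """Sketch of a Random Connectivity LSTM cell: standard LSTM equations,
    but the weight matrix is masked by a fixed random binary pattern so
    only a fraction p of the connections exist."""
    def __init__(self, n_in, n_hid, p=0.3, seed=0):
        rng = np.random.default_rng(seed)
        shape = (4 * n_hid, n_in + n_hid)
        self.W = rng.standard_normal(shape) * 0.1
        self.mask = rng.random(shape) < p   # drawn once; never updated
        self.b = np.zeros(4 * n_hid)

    def step(self, x, h, c):
        z = (self.W * self.mask) @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)
        return np.tanh(c) * o, c            # new hidden state, cell state
```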
Guiding Intelligent Surveillance System by learning-by-synthesis gaze estimation
Title | Guiding Intelligent Surveillance System by learning-by-synthesis gaze estimation |
Authors | Tongtong Zhao, Yuxiao Yan, Jinjia Peng, Zetian Mi, Xianping Fu |
Abstract | We describe a novel learning-by-synthesis method for estimating the gaze direction of an automated intelligent surveillance system. Recent progress in learning-by-synthesis has proposed training models on synthetic images, which can effectively reduce the cost of manpower and material resources. However, learning from synthetic images still falls short of the desired performance on naturalistic images because of the different distribution of synthetic images. A previous attempt to address this issue improves the realism of synthetic images by learning a model, but it does not reduce distortion, and its level of authenticity is unstable. To solve this problem, we put forward a new structure for improving synthetic images, drawing on the idea of style transfer, through which we can efficiently reduce image distortion and minimize the need for real-data annotation. This enables the generation of highly realistic images, which we demonstrate both qualitatively and with a user study. We quantitatively evaluate the generated images by training models for gaze estimation. We show a significant improvement over using synthetic images and achieve state-of-the-art results on various datasets, including the MPIIGaze dataset. |
Tasks | Gaze Estimation |
Published | 2018-10-08 |
URL | http://arxiv.org/abs/1810.03286v1 |
http://arxiv.org/pdf/1810.03286v1.pdf | |
PWC | https://paperswithcode.com/paper/guiding-intelligent-surveillance-system-by |
Repo | |
Framework | |
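The abstract describes refining synthetic images by drawing on style transfer. A plausible reading is an objective that preserves the synthetic content (the gaze geometry) while matching the feature statistics of real images; the sketch below assumes Gram-matrix style losses over features from a fixed pretrained encoder, which the paper may or may not use.

```python
import torch
import torch.nn.functional as F

def gram(feat):
    """Gram matrix of a (B, C, H, W) feature map, used as a style statistic."""
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def refinement_loss(feats_refined, feats_synthetic, feats_real, style_weight=10.0):
    """Sketch of a style-transfer-style refinement objective: keep the
    synthetic content while matching real-image style statistics.
    feats_* are lists of feature maps from a fixed pretrained encoder
    (an assumption; the paper's exact losses may differ)."""
    content = sum(F.mse_loss(r, s) for r, s in zip(feats_refined, feats_synthetic))
    style = sum(F.mse_loss(gram(r), gram(t)) for r, t in zip(feats_refined, feats_real))
    return content + style_weight * style
```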
English verb regularization in books and tweets
Title | English verb regularization in books and tweets |
Authors | Tyler J. Gray, Andrew J. Reagan, Peter Sheridan Dodds, Christopher M. Danforth |
Abstract | The English language has evolved dramatically throughout its lifespan, to the extent that a modern speaker of Old English would be incomprehensible without translation. One concrete indicator of this process is the movement from irregular to regular (-ed) forms for the past tense of verbs. In this study we quantify the extent of verb regularization using two vastly disparate datasets: (1) Six years of published books scanned by Google (2003–2008), and (2) A decade of social media messages posted to Twitter (2008–2017). We find that the extent of verb regularization is greater on Twitter, taken as a whole, than in English Fiction books. Regularization is also greater for tweets geotagged in the United States relative to American English books, but the opposite is true for tweets geotagged in the United Kingdom relative to British English books. We also find interesting regional variations in regularization across counties in the United States. However, once differences in population are accounted for, we do not identify strong correlations with socio-demographic variables such as education or income. |
Tasks | |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09745v2 |
http://arxiv.org/pdf/1803.09745v2.pdf | |
PWC | https://paperswithcode.com/paper/english-verb-regularization-in-books-and |
Repo | |
Framework | |
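The study's central quantity, the fraction of past-tense usages taking the regular (-ed) form, is simple to compute. A toy sketch with a hypothetical four-verb list follows; the paper uses a much larger curated set.

```python
from collections import Counter

# Hypothetical irregular/regular past-tense pairs; the study uses a curated list.
VERBS = {"burn": ("burnt", "burned"), "dream": ("dreamt", "dreamed"),
         "spell": ("spelt", "spelled"), "learn": ("learnt", "learned")}

def regularization_fraction(tokens):
    """Per verb, the fraction of past-tense usages taking the regular form."""
    counts = Counter(t.lower() for t in tokens)
    fractions = {}
    for verb, (irregular, regular) in VERBS.items():
        total = counts[irregular] + counts[regular]
        if total:
            fractions[verb] = counts[regular] / total
    return fractions

print(regularization_fraction("He dreamed big but burnt the toast".split()))
# -> {'burn': 0.0, 'dream': 1.0}
```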
Pairwise Relational Networks using Local Appearance Features for Face Recognition
Title | Pairwise Relational Networks using Local Appearance Features for Face Recognition |
Authors | Bong-Nam Kang, Yonghyun Kim, Daijin Kim |
Abstract | We propose a new face recognition method, called a pairwise relational network (PRN), which takes local appearance features around landmark points on the feature map and captures unique pairwise relations within the same identity and discriminative pairwise relations between different identities. The PRN aims to determine facial part-relational structure from local appearance feature pairs. Because meaningful pairwise relations should be identity dependent, we add a face identity state feature, obtained from a long short-term memory (LSTM) network run over the sequential local appearance features. To further improve accuracy, we combine the global appearance features with the pairwise relational features. Experimental results on LFW show that the PRN achieves 99.76% accuracy. On YTF, the PRN achieves state-of-the-art accuracy (96.3%). The PRN also achieves results comparable to the state of the art on both the face verification and face identification tasks on IJB-A and IJB-B. This work was published at ECCV 2018. |
Tasks | Face Identification, Face Recognition, Face Verification |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06405v1 |
http://arxiv.org/pdf/1811.06405v1.pdf | |
PWC | https://paperswithcode.com/paper/pairwise-relational-networks-using-local |
Repo | |
Framework | |
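A compact sketch of the PRN structure as described: an MLP scores every pair of landmark-local features, conditioned on the LSTM-derived identity state, and the pair outputs are aggregated. Layer sizes and mean-pooling aggregation are assumptions.

```python
import torch
import torch.nn as nn
from itertools import combinations

class PairwiseRelationalNetwork(nn.Module):
    """Sketch of the PRN idea: score every pair of local appearance features
    (one per facial landmark), conditioned on a face identity state vector,
    then aggregate pair outputs. Dimensions are assumptions."""
    def __init__(self, feat_dim=256, id_dim=128, hid=512, out=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim + id_dim, hid), nn.ReLU(),
            nn.Linear(hid, out))

    def forward(self, local_feats, id_state):
        # local_feats: (B, L, feat_dim) landmark features; id_state: (B, id_dim)
        B, L, _ = local_feats.shape
        rels = [self.mlp(torch.cat([local_feats[:, i], local_feats[:, j],
                                    id_state], dim=1))
                for i, j in combinations(range(L), 2)]
        return torch.stack(rels, dim=1).mean(dim=1)   # aggregate over pairs
```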
Face Verification and Forgery Detection for Ophthalmic Surgery Images
Title | Face Verification and Forgery Detection for Ophthalmic Surgery Images |
Authors | Kaushal Bhogale, Nishant Shankar, Adheesh Juvekar, Asutosh Padhi |
Abstract | Although modern face verification systems are accessible and accurate, they are not always robust to pose variance and occlusions. Moreover, accurate models require a large amount of data to train. We structure our experiments to operate on small amounts of data obtained from an NGO that funds ophthalmic surgeries. We set up our face verification task as that of verifying pre-operation and post-operation images of a patient who undergoes ophthalmic surgery, and as such the post-operation images have occlusions like an eye patch. In this paper, we present a system that performs the face verification task using one-shot learning. To this end, our paper uses deep convolutional networks and compares different model architectures and loss functions. Our best model achieves 85% test accuracy. At inference time, we also attempt to detect image forgeries in addition to performing face verification. To achieve this, we use Error Level Analysis. Finally, we propose an inference pipeline that demonstrates how these techniques can be used to implement an automated face verification and forgery detection system. |
Tasks | Face Verification, One-Shot Learning |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06194v1 |
http://arxiv.org/pdf/1811.06194v1.pdf | |
PWC | https://paperswithcode.com/paper/face-verification-and-forgery-detection-for |
Repo | |
Framework | |
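Error Level Analysis, the forgery-detection technique the paper applies at inference, re-saves the image as JPEG at a known quality and inspects the difference: regions edited after the last save tend to show a different error level. A short Pillow sketch, with illustrative quality and amplification settings:

```python
from io import BytesIO
from PIL import Image, ImageChops

def error_level_analysis(path, quality=90, scale=15):
    """Re-save the image as JPEG at a known quality and diff against the
    original; the amplified difference highlights potentially edited regions."""
    original = Image.open(path).convert("RGB")
    buf = BytesIO()
    original.save(buf, "JPEG", quality=quality)
    resaved = Image.open(buf)
    ela = ImageChops.difference(original, resaved)
    return ela.point(lambda px: min(255, px * scale))   # amplify for viewing
```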
Automatic Thresholding of SIFT Descriptors
Title | Automatic Thresholding of SIFT Descriptors |
Authors | Matthew R. Kirchner |
Abstract | We introduce a method to perform automatic thresholding of SIFT descriptors that improves matching performance by at least 15.9% on the Oxford image matching benchmark. The method uses an a contrario methodology to determine a unique bin-magnitude threshold. This is done by building a generative uniform background model for descriptors and determining when bin magnitudes have reached a sufficient level. The presented method, called meaningful clamping, contrasts with the current SIFT implementation by efficiently computing a clamping threshold that is unique for every descriptor. |
Tasks | |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.03173v1 |
http://arxiv.org/pdf/1811.03173v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-thresholding-of-sift-descriptors |
Repo | |
Framework | |
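For context, the conventional SIFT pipeline clamps descriptor bins at a fixed 0.2 of the descriptor's L2 norm and renormalizes; the paper's contribution is replacing that constant with a per-descriptor threshold derived from an a contrario background model. The sketch below shows only the clamping step that such a threshold plugs into, not the a contrario derivation itself.

```python
import numpy as np

def clamp_descriptor(desc, threshold):
    """Clamp SIFT histogram bins at `threshold` and re-normalize to unit norm.
    Standard SIFT uses a fixed threshold of 0.2; the paper instead derives a
    unique threshold for each descriptor."""
    d = desc / (np.linalg.norm(desc) + 1e-12)
    d = np.minimum(d, threshold)
    return d / (np.linalg.norm(d) + 1e-12)

desc = np.random.rand(128)            # a raw 128-bin SIFT descriptor
fixed = clamp_descriptor(desc, 0.2)   # conventional fixed clamping
```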
Neural Generative Models for 3D Faces with Application in 3D Texture Free Face Recognition
Title | Neural Generative Models for 3D Faces with Application in 3D Texture Free Face Recognition |
Authors | Ahmed ElSayed, Elif Kongar, Ausif Mahmood, Tarek Sobh, Terrance Boult |
Abstract | Using heterogeneous depth cameras and 3D scanners in 3D face verification causes variations in the resolution of the 3D point clouds. To solve this issue, previous studies use 3D registration techniques. Among these techniques, detecting points of correspondence has proven efficient, provided the data belong to the same individual. However, if the data belong to different persons, registration algorithms can convert the 3D point cloud of one person into another’s, destroying the distinguishing features between the two point clouds. Another issue concerns the storage size of the point clouds: if a captured depth image contains around 50 thousand points for a single pose of one individual, the storage size of the entire dataset will be on the order of gigabytes, if not terabytes. With these motivations, this work introduces a new technique for 3D point cloud generation using a neural modeling system, both to handle the differences caused by heterogeneous depth cameras and to generate a new canonical, compact face representation. The proposed system reduces the stored 3D dataset size and, if required, provides accurate dataset regeneration. Furthermore, the system generates neural models for all gallery point clouds and stores these models to represent the faces in the recognition or verification processes. For a probe cloud to be verified, a new model is generated specifically for that cloud and matched against the pre-stored gallery model representations to identify the query cloud. This work also introduces the use of a Siamese deep neural network for 3D face verification, with the generated model representations as raw input to the deep network, and shows that the accuracy of the trained network is comparable to all published results on the Bosphorus dataset. |
Tasks | Face Recognition, Face Verification |
Published | 2018-11-11 |
URL | http://arxiv.org/abs/1811.04358v1 |
http://arxiv.org/pdf/1811.04358v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-generative-models-for-3d-faces-with |
Repo | |
Framework | |
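One way to picture the "neural model per point cloud" idea: fit a small network to one scan so that its weights become a compact, regenerable face representation. The sketch below fits an MLP mapping (x, y) to depth z; the architecture, coordinate parameterization, and training loop are all assumptions rather than the paper's system.

```python
import torch
import torch.nn as nn

def fit_cloud_model(points, epochs=500, lr=1e-3):
    """Fit a small MLP that maps (x, y) to depth z for one face scan, so that
    only the MLP weights need to be stored instead of ~50k raw points.
    points: float tensor of shape (N, 3)."""
    xy, z = points[:, :2], points[:, 2:]
    net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                        nn.Linear(64, 64), nn.Tanh(),
                        nn.Linear(64, 1))
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(xy), z)
        loss.backward()
        opt.step()
    return net   # its state_dict is the stored face representation
```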
Structural Consistency and Controllability for Diverse Colorization
Title | Structural Consistency and Controllability for Diverse Colorization |
Authors | Safa Messaoud, David Forsyth, Alexander G. Schwing |
Abstract | Colorizing a given gray-level image is an important task in the media and advertising industry. Due to the ambiguity inherent to colorization (many shades are often plausible), recent approaches started to explicitly model diversity. However, one of the most obvious artifacts, structural inconsistency, is rarely considered by existing methods which predict chrominance independently for every pixel. To address this issue, we develop a conditional random field based variational auto-encoder formulation which is able to achieve diversity while taking into account structural consistency. Moreover, we introduce a controllability mechanism that can incorporate external constraints from diverse sources including a user interface. Compared to existing baselines, we demonstrate that our method obtains more diverse and globally consistent colorizations on the LFW, LSUN-Church and ILSVRC-2015 datasets. |
Tasks | Colorization |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.02129v1 |
http://arxiv.org/pdf/1809.02129v1.pdf | |
PWC | https://paperswithcode.com/paper/structural-consistency-and-controllability |
Repo | |
Framework | |
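A bare-bones conditional VAE over chrominance, the backbone such an approach builds on, can be sketched as follows. The CRF component that enforces structural consistency and the controllability mechanism are deliberately omitted, and all layer sizes assume 64x64 Lab images; these are illustration choices, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ChromaVAE(nn.Module):
    """Minimal conditional VAE for diverse colorization: encode the two
    chrominance (ab) channels conditioned on grayscale; decode samples of z,
    again conditioned on grayscale, back to chrominance."""
    def __init__(self, zdim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
                                 nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU())
        self.to_mu = nn.Linear(64 * 16 * 16, zdim)
        self.to_lv = nn.Linear(64 * 16 * 16, zdim)
        self.dec = nn.Sequential(nn.Linear(zdim + 64 * 64, 2 * 64 * 64), nn.Tanh())

    def forward(self, gray, ab):       # gray: (B,1,64,64), ab: (B,2,64,64)
        h = self.enc(torch.cat([gray, ab], 1)).flatten(1)
        mu, lv = self.to_mu(h), self.to_lv(h)
        z = mu + torch.randn_like(mu) * (0.5 * lv).exp()   # reparameterize
        out = self.dec(torch.cat([z, gray.flatten(1)], 1))
        return out.view(-1, 2, 64, 64), mu, lv
        # train with reconstruction loss on ab plus the usual KL term;
        # sampling different z at test time yields diverse colorizations
```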
t-PINE: Tensor-based Predictable and Interpretable Node Embeddings
Title | t-PINE: Tensor-based Predictable and Interpretable Node Embeddings |
Authors | Saba A. Al-Sayouri, Ekta Gujral, Danai Koutra, Evangelos E. Papalexakis, Sarah S. Lam |
Abstract | Graph representations have grown increasingly popular in recent years. Existing representation learning approaches explicitly encode network structure. Despite their good performance in downstream processes (e.g., node classification, link prediction), there is still room for improvement in different aspects, like efficacy, visualization, and interpretability. In this paper, we propose t-PINE, a method that addresses these limitations. Contrary to baseline methods, which generally learn explicit graph representations using only an adjacency matrix, t-PINE exploits a multi-view information graph: the adjacency matrix forms the first view, and a nearest-neighbor adjacency matrix computed over the node features forms the second. From these views it learns explicit and implicit node representations using the Canonical Polyadic (a.k.a. CP) decomposition. We argue that the implicit and explicit mapping from a higher-dimensional to a lower-dimensional vector space is the key to learning more useful, highly predictable, and gracefully interpretable representations. Good interpretable representations make it possible to understand how each view contributes to the representation learning process, and they help exclude unrelated dimensions. Extensive experiments show that t-PINE drastically outperforms baseline methods by up to 158.6% with respect to Micro-F1 on several multi-label classification problems, while offering high visualization and interpretability utility. |
Tasks | Link Prediction, Multi-Label Classification, Node Classification, Representation Learning |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01889v1 |
http://arxiv.org/pdf/1805.01889v1.pdf | |
PWC | https://paperswithcode.com/paper/t-pine-tensor-based-predictable-and |
Repo | |
Framework | |
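The two-view construction in the abstract maps directly to code: stack the graph adjacency and a feature-space k-NN adjacency into a third-order tensor and take the node factor of its CP decomposition. A sketch using TensorLy and scikit-learn; the rank and k values are hypothetical settings.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac
from sklearn.neighbors import kneighbors_graph

def tpine_embeddings(A, X, rank=16, k=10):
    """Sketch of the t-PINE construction: view 1 is the graph adjacency A
    (n x n), view 2 is a k-NN adjacency over node features X (n x d); the
    node factor of the CP decomposition serves as the embedding."""
    A_knn = kneighbors_graph(X, k).toarray()
    T = tl.tensor(np.stack([A, A_knn], axis=2))        # nodes x nodes x views
    weights, factors = parafac(T, rank=rank, n_iter_max=200)
    return factors[0]                                  # one row per node
```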
Understanding Fake Faces
Title | Understanding Fake Faces |
Authors | Ryota Natsume, Kazuki Inoue, Yoshihiro Fukuhara, Shintaro Yamamoto, Shigeo Morishima, Hirokatsu Kataoka |
Abstract | Face recognition research is one of the most active topics in computer vision (CV), and deep neural networks (DNN) are now closing the gap between human-level and computer-driven performance in face verification algorithms. However, although the performance gap appears to be narrowing in terms of accuracy, a curious question arises: is AI’s understanding of faces really close to that of humans? In the present study, to probe this question, we conduct image-based detection, classification, and generation using an in-house fake face database. This database has two configurations: (i) false-positive face detections produced by both the Viola-Jones (VJ) method and convolutional neural networks (CNN), and (ii) simulacra that have fundamental characteristics resembling faces but are completely artificial. The results indicate that a gap remains between the capabilities of recent vision-based face recognition algorithms and human-level performance. On a positive note, however, we have obtained insights that will advance the development of face-understanding models. |
Tasks | Face Recognition, Face Verification |
Published | 2018-09-22 |
URL | http://arxiv.org/abs/1809.08391v1 |
http://arxiv.org/pdf/1809.08391v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-fake-faces |
Repo | |
Framework | |
Feature-Distributed SVRG for High-Dimensional Linear Classification
Title | Feature-Distributed SVRG for High-Dimensional Linear Classification |
Authors | Gong-Duo Zhang, Shen-Yi Zhao, Hao Gao, Wu-Jun Li |
Abstract | Linear classification has been widely used in many high-dimensional applications like text classification. To perform linear classification for large-scale tasks, we often need to design distributed learning methods on a cluster of multiple machines. In this paper, we propose a new distributed learning method, called feature-distributed stochastic variance reduced gradient (FD-SVRG) for high-dimensional linear classification. Unlike most existing distributed learning methods which are instance-distributed, FD-SVRG is feature-distributed. FD-SVRG has lower communication cost than other instance-distributed methods when the data dimensionality is larger than the number of data instances. Experimental results on real data demonstrate that FD-SVRG can outperform other state-of-the-art distributed methods for high-dimensional linear classification in terms of both communication cost and wall-clock time, when the dimensionality is larger than the number of instances in training data. |
Tasks | Text Classification |
Published | 2018-02-10 |
URL | http://arxiv.org/abs/1802.03604v1 |
http://arxiv.org/pdf/1802.03604v1.pdf | |
PWC | https://paperswithcode.com/paper/feature-distributed-svrg-for-high-dimensional |
Repo | |
Framework | |
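The feature-distributed layout is what drives FD-SVRG's communication saving: each worker holds a column block of the data and the matching slice of the parameter vector, so an inner product w·x is assembled from n-dimensional partial sums instead of shipping d-dimensional gradients, which is cheap precisely when d >> n. A single-process simulation of that primitive:

```python
import numpy as np

def partial_inner_products(X_blocks, w_blocks):
    """Each worker holds a column block of X and the matching slice of w,
    computes partial inner products locally, and only these n-dimensional
    partial sums need to be communicated and added up."""
    return sum(Xb @ wb for Xb, wb in zip(X_blocks, w_blocks))  # shape (n,)

# Simulate two workers splitting a 10-dimensional problem by features.
rng = np.random.default_rng(0)
X, w = rng.standard_normal((100, 10)), rng.standard_normal(10)
X_blocks, w_blocks = np.hsplit(X, 2), np.split(w, 2)
assert np.allclose(partial_inner_products(X_blocks, w_blocks), X @ w)
```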
SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits
Title | SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits |
Authors | Etienne Boursier, Vianney Perchet |
Abstract | Motivated by cognitive radio networks, we consider the stochastic multiplayer multi-armed bandit problem, where several players pull arms simultaneously and a collision occurs when the same arm is pulled by several players at the same stage. We present a decentralized algorithm that achieves the same performance as a centralized one, contradicting the existing lower bounds for that problem. This is possible by “hacking” the standard model: we construct a communication protocol between players that deliberately enforces collisions, allowing them to share their information at a negligible cost. This motivates the introduction of a more appropriate dynamic setting without sensing, where similar communication protocols are no longer possible. However, we show that logarithmic growth of the regret is still achievable in this model with a new algorithm. |
Tasks | Multi-Armed Bandits |
Published | 2018-09-21 |
URL | https://arxiv.org/abs/1809.08151v4 |
https://arxiv.org/pdf/1809.08151v4.pdf | |
PWC | https://paperswithcode.com/paper/sic-mmab-synchronisation-involves |
Repo | |
Framework | |
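The "hack" in the abstract is that collisions, normally pure loss, can carry information: during a communication phase the receiver camps on one arm and senses whether the sender collided with it. A toy sketch of the bit encoding follows; the real protocol also handles synchronization and quantizes empirical means, and the arm indices here are arbitrary.

```python
def transmit(bits, receiver_arm, other_arm):
    """Collision-based signaling: the receiver keeps pulling receiver_arm;
    the sender encodes bit 1 by pulling that same arm (forcing a collision
    the receiver senses) and bit 0 by pulling any other arm."""
    observed = []
    for b in bits:
        sender_arm = receiver_arm if b == 1 else other_arm
        collision = (sender_arm == receiver_arm)   # what the receiver senses
        observed.append(1 if collision else 0)
    return observed

# A player could share an empirical mean quantized to a few bits this way.
assert transmit([1, 0, 1, 1], receiver_arm=2, other_arm=0) == [1, 0, 1, 1]
```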