Paper Group ANR 796
Deep Learning Predicts Hip Fracture using Confounding Patient and Healthcare Variables. Saliency Supervision: An Intuitive and Effective Approach for Pain Intensity Regression. Convolutional LSTMs for Cloud-Robust Segmentation of Remote Sensing Imagery. Deep Learning with Long Short-Term Memory for Time Series Prediction. Guiding Intelligent Survei …
Deep Learning Predicts Hip Fracture using Confounding Patient and Healthcare Variables
Title | Deep Learning Predicts Hip Fracture using Confounding Patient and Healthcare Variables |
Authors | Marcus A. Badgeley, John R. Zech, Luke Oakden-Rayner, Benjamin S. Glicksberg, Manway Liu, William Gale, Michael V. McConnell, Beth Percha, Thomas M. Snyder, Joel T. Dudley |
Abstract | Hip fractures are a leading cause of death and disability among older adults. Hip fractures are also the most commonly missed diagnosis on pelvic radiographs. Computer-Aided Diagnosis (CAD) algorithms have shown promise for helping radiologists detect fractures, but the image features underpinning their predictions are notoriously difficult to understand. In this study, we trained deep learning models on 17,587 radiographs to classify fracture, five patient traits, and 14 hospital process variables. All 20 variables could be predicted from a radiograph (p < 0.05), with the best performances on scanner model (AUC=1.00), scanner brand (AUC=0.98), and whether the order was marked “priority” (AUC=0.79). Fracture was predicted moderately well from the image (AUC=0.78) and better when combining image features with patient data (AUC=0.86, p=2e-9) or patient data plus hospital process features (AUC=0.91, p=1e-21). The model’s performance on a test set with matched patient variables was significantly lower than on a random test set (AUC=0.67, p=0.003); and when the test set was matched on patient and image acquisition variables, the model performed randomly (AUC=0.52, 95% CI 0.46-0.58), indicating that these variables were the main source of the model’s predictive ability overall. We also used Naive Bayes to combine evidence from the image models with patient and hospital data and found that their inclusion improved performance, but that this approach was nevertheless inferior to directly modeling all variables. If CAD algorithms are inexplicably leveraging patient and process variables in their predictions, it is unclear how radiologists should interpret their predictions in the context of other known patient data. Further research is needed to illuminate deep learning decision processes so that computers and clinicians can effectively cooperate. |
Tasks | |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03695v1 |
http://arxiv.org/pdf/1811.03695v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-predicts-hip-fracture-using |
Repo | |
Framework | |
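The abstract's Naive Bayes step combines evidence from the image model with patient and hospital variables. Below is a minimal sketch of that combination rule, assuming calibrated probabilities and precomputed per-variable log-likelihood ratios; both are assumptions, since the paper does not publish its exact procedure.

```python
import numpy as np

def naive_bayes_combine(prior_logodds, image_prob, feature_llrs):
    """Combine an image model's probability with per-variable evidence
    under a Naive Bayes (conditional independence) assumption.

    prior_logodds : log P(fracture) / P(no fracture) from base rates
    image_prob    : P(fracture | image) from the CNN
    feature_llrs  : iterable of log-likelihood ratios, one per patient or
                    hospital-process variable, e.g.
                    log P(scanner=X | fracture) / P(scanner=X | no fracture)
    """
    eps = 1e-12
    image_prob = np.clip(image_prob, eps, 1 - eps)
    # Convert the CNN's posterior back into a likelihood-ratio contribution
    # by subtracting the prior it already encodes.
    image_llr = np.log(image_prob / (1 - image_prob)) - prior_logodds
    logodds = prior_logodds + image_llr + sum(feature_llrs)
    return 1.0 / (1.0 + np.exp(-logodds))  # combined P(fracture | evidence)

# e.g. 3% base rate, CNN says 0.40, two weakly informative process variables
p = naive_bayes_combine(np.log(0.03 / 0.97), 0.40, [0.5, -0.2])
```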
Saliency Supervision: An Intuitive and Effective Approach for Pain Intensity Regression
Title | Saliency Supervision: An Intuitive and Effective Approach for Pain Intensity Regression |
Authors | Conghui Li, Zhaocheng Zhu, Yuming Zhao |
Abstract | Estimating pain intensity from face images is an important problem in autonomous nursing systems. However, due to limited data sources and the subjectivity of pain intensity labels, it is hard to adopt modern deep neural networks for this problem without domain-specific auxiliary design. Inspired by a human-vision prior, we propose a novel approach called saliency supervision, in which we directly regularize deep networks to focus on the facial area that is discriminative for pain regression. Through alternating training between saliency supervision and a global loss, our method learns sparse and robust features, which proves helpful for pain intensity regression. We verified saliency supervision with a face-verification network backbone on a widely used dataset and achieved state-of-the-art performance without bells and whistles. Our saliency supervision is intuitive in spirit, yet effective in performance. We believe such saliency supervision is essential in dealing with ill-posed datasets and has potential in a wide range of vision tasks. |
Tasks | Face Verification, Pain Intensity Regression |
Published | 2018-11-16 |
URL | http://arxiv.org/abs/1811.07987v1 |
http://arxiv.org/pdf/1811.07987v1.pdf | |
PWC | https://paperswithcode.com/paper/saliency-supervision-an-intuitive-and |
Repo | |
Framework | |
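One way to read "saliency supervision" is as a penalty on input-gradient saliency that falls outside the facial region. The PyTorch sketch below is an interpretation under that assumption: `face_mask` and the weighting factor are hypothetical, and the paper alternates this term with the plain global loss rather than always summing them.

```python
import torch

def saliency_supervised_loss(model, images, targets, face_mask, criterion, lam=0.1):
    """Sketch: penalize input-gradient saliency outside a facial-region mask.
    face_mask is (B, 1, H, W), 1 inside the discriminative facial area.
    targets must be shaped to match the model output. The paper alternates
    between the two terms; both are returned so the caller can alternate."""
    images = images.clone().requires_grad_(True)
    global_loss = criterion(model(images), targets)
    # Saliency = gradient of the task loss w.r.t. the input pixels.
    grads, = torch.autograd.grad(global_loss, images, create_graph=True)
    saliency = grads.abs().sum(dim=1, keepdim=True)   # collapse color channels
    saliency_loss = (saliency * (1 - face_mask)).mean()
    return global_loss, lam * saliency_loss
```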
Convolutional LSTMs for Cloud-Robust Segmentation of Remote Sensing Imagery
Title | Convolutional LSTMs for Cloud-Robust Segmentation of Remote Sensing Imagery |
Authors | Marc Rußwurm, Marco Körner |
Abstract | Clouds frequently cover the Earth’s surface and pose an omnipresent challenge to optical Earth observation methods. The vast majority of remote sensing approaches either selectively choose single cloud-free observations or employ a pre-classification strategy to identify and mask cloudy pixels. We follow a different strategy and treat cloud coverage as noise that is inherent to the observed satellite data. In prior work, we directly employed a straightforward convolutional long short-term memory network for vegetation classification without explicit cloud filtering and achieved state-of-the-art classification accuracies. In this work, we investigate this cloud-robustness further by visualizing internal cell activations and performing an ablation experiment on datasets of different cloud coverage. In the visualizations of network states, we identified some cells in which modulation and input gates closed on cloudy pixels. This indicates that the network has internalized a cloud-filtering mechanism without being specifically trained on cloud labels. Overall, our results question the necessity of sophisticated pre-processing pipelines for multi-temporal deep learning approaches. |
Tasks | Segmentation Of Remote Sensing Imagery |
Published | 2018-10-28 |
URL | http://arxiv.org/abs/1811.02471v2 |
http://arxiv.org/pdf/1811.02471v2.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-lstms-for-cloud-robust |
Repo | |
Framework | |
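For reference, here is a minimal convolutional LSTM cell that returns its gate maps alongside the state, which is what makes the gate-activation inspection described above possible. The cell itself is textbook; returning the gates as a dictionary is my choice, not the authors'.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell. Exposing the gate maps lets one
    check whether input/modulation gates close on cloudy pixels."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        gates = self.conv(torch.cat([x, h], dim=1))
        i, f, g, o = gates.chunk(4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)                     # modulation gate
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, c, {"input": i, "modulation": g}

# x: (batch, bands, H, W) for one acquisition date; iterate over the sequence.
```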
Deep Learning with Long Short-Term Memory for Time Series Prediction
Title | Deep Learning with Long Short-Term Memory for Time Series Prediction |
Authors | Yuxiu Hua, Zhifeng Zhao, Rongpeng Li, Xianfu Chen, Zhiming Liu, Honggang Zhang |
Abstract | Time series prediction can be generalized as a process that extracts useful information from historical records and then determines future values. Learning the long-range dependencies embedded in time series is often an obstacle for most algorithms, whereas Long Short-Term Memory (LSTM) solutions, as a specific kind of scheme in deep learning, promise to effectively overcome the problem. In this article, we first give a brief introduction to the structure and forward-propagation mechanism of the LSTM model. Then, aiming at reducing the considerable computing cost of LSTM, we put forward the Random Connectivity LSTM (RCLSTM) model and test it by predicting traffic and user mobility in telecommunication networks. In contrast to the conventional LSTM, the RCLSTM forms its neural connectivity stochastically, a notable departure from the way neural network architectures are usually constructed. The resulting RCLSTM model exhibits a certain level of sparsity, which leads to an appealing decrease in computational complexity and makes it more applicable to latency-stringent application scenarios. In the field of telecommunication networks, the prediction of traffic series and mobility traces could directly benefit from this improvement, as we further demonstrate that the prediction accuracy of the RCLSTM is comparable to that of the conventional LSTM regardless of the number of training samples or the length of the input sequences. |
Tasks | Time Series, Time Series Prediction |
Published | 2018-10-24 |
URL | http://arxiv.org/abs/1810.10161v1 |
http://arxiv.org/pdf/1810.10161v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-with-long-short-term-memory-for |
Repo | |
Framework | |
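The core RCLSTM idea, stochastic fixed connectivity between neurons, can be sketched as an ordinary LSTM cell whose weight matrices are elementwise-masked by a random binary pattern drawn once at construction. A NumPy sketch follows; the gate layout and initialization scale are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class RCLSTMCell:
    """Sketch of a Random Connectivity LSTM cell: standard LSTM equations,
    but the weight matrix is masked by a fixed random binary pattern so
    only a fraction p of the connections exist."""
    def __init__(self, n_in, n_hid, p=0.3, seed=0):
        rng = np.random.default_rng(seed)
        shape = (4 * n_hid, n_in + n_hid)
        self.W = rng.standard_normal(shape) * 0.1
        self.mask = rng.random(shape) < p   # drawn once; never updated
        self.b = np.zeros(4 * n_hid)

    def step(self, x, h, c):
        z = (self.W * self.mask) @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)
        return np.tanh(c) * o, c            # new hidden state, cell state
```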
Guiding Intelligent Surveillance System by learning-by-synthesis gaze estimation
Title | Guiding Intelligent Surveillance System by learning-by-synthesis gaze estimation |
Authors | Tongtong Zhao, Yuxiao Yan, Jinjia Peng, Zetian Mi, Xianping Fu |
Abstract | We describe a novel learning-by-synthesis method for estimating the gaze direction of an automated intelligent surveillance system. Recent progress in learning-by-synthesis has proposed training models on synthetic images, which can effectively reduce the cost of manpower and material resources. However, learning from synthetic images still falls short of the desired performance on naturalistic images because of the different distribution of synthetic images. A previous attempt to address this issue improves the realism of synthetic images by learning a model, but it does not reduce distortion, and its level of authenticity is unstable. To solve this problem, we put forward a new structure for improving synthetic images, drawing on the idea of style transfer, through which we can efficiently reduce image distortion and minimize the need for real-data annotation. This enables the generation of highly realistic images, which we demonstrate both qualitatively and with a user study. We quantitatively evaluate the generated images by training models for gaze estimation. We show a significant improvement over using synthetic images and achieve state-of-the-art results on various datasets, including the MPIIGaze dataset. |
Tasks | Gaze Estimation |
Published | 2018-10-08 |
URL | http://arxiv.org/abs/1810.03286v1 |
http://arxiv.org/pdf/1810.03286v1.pdf | |
PWC | https://paperswithcode.com/paper/guiding-intelligent-surveillance-system-by |
Repo | |
Framework | |
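The abstract describes refining synthetic images by drawing on style transfer. A plausible reading is an objective that preserves the synthetic content (the gaze geometry) while matching the feature statistics of real images; the sketch below assumes Gram-matrix style losses over features from a fixed pretrained encoder, which the paper may or may not use.

```python
import torch
import torch.nn.functional as F

def gram(feat):
    """Gram matrix of a (B, C, H, W) feature map, used as a style statistic."""
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def refinement_loss(feats_refined, feats_synthetic, feats_real, style_weight=10.0):
    """Sketch of a style-transfer-style refinement objective: keep the
    synthetic content while matching real-image style statistics.
    feats_* are lists of feature maps from a fixed pretrained encoder
    (an assumption; the paper's exact losses may differ)."""
    content = sum(F.mse_loss(r, s) for r, s in zip(feats_refined, feats_synthetic))
    style = sum(F.mse_loss(gram(r), gram(t)) for r, t in zip(feats_refined, feats_real))
    return content + style_weight * style
```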
English verb regularization in books and tweets
Title | English verb regularization in books and tweets |
Authors | Tyler J. Gray, Andrew J. Reagan, Peter Sheridan Dodds, Christopher M. Danforth |
Abstract | The English language has evolved dramatically throughout its lifespan, to the extent that a modern speaker of Old English would be incomprehensible without translation. One concrete indicator of this process is the movement from irregular to regular (-ed) forms for the past tense of verbs. In this study we quantify the extent of verb regularization using two vastly disparate datasets: (1) Six years of published books scanned by Google (2003–2008), and (2) A decade of social media messages posted to Twitter (2008–2017). We find that the extent of verb regularization is greater on Twitter, taken as a whole, than in English Fiction books. Regularization is also greater for tweets geotagged in the United States relative to American English books, but the opposite is true for tweets geotagged in the United Kingdom relative to British English books. We also find interesting regional variations in regularization across counties in the United States. However, once differences in population are accounted for, we do not identify strong correlations with socio-demographic variables such as education or income. |
Tasks | |
Published | 2018-03-26 |
URL | http://arxiv.org/abs/1803.09745v2 |
http://arxiv.org/pdf/1803.09745v2.pdf | |
PWC | https://paperswithcode.com/paper/english-verb-regularization-in-books-and |
Repo | |
Framework | |
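The study's central quantity, the fraction of past-tense usages taking the regular (-ed) form, is simple to compute. A toy sketch with a hypothetical four-verb list follows; the paper uses a much larger curated set.

```python
from collections import Counter

# Hypothetical irregular/regular past-tense pairs; the study uses a curated list.
VERBS = {"burn": ("burnt", "burned"), "dream": ("dreamt", "dreamed"),
         "spell": ("spelt", "spelled"), "learn": ("learnt", "learned")}

def regularization_fraction(tokens):
    """Per verb, the fraction of past-tense usages taking the regular form."""
    counts = Counter(t.lower() for t in tokens)
    fractions = {}
    for verb, (irregular, regular) in VERBS.items():
        total = counts[irregular] + counts[regular]
        if total:
            fractions[verb] = counts[regular] / total
    return fractions

print(regularization_fraction("He dreamed big but burnt the toast".split()))
# -> {'burn': 0.0, 'dream': 1.0}
```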
Pairwise Relational Networks using Local Appearance Features for Face Recognition
Title | Pairwise Relational Networks using Local Appearance Features for Face Recognition |
Authors | Bong-Nam Kang, Yonghyun Kim, Daijin Kim |
Abstract | We propose a new face recognition method, called a pairwise relational network (PRN), which takes local appearance features around landmark points on the feature map and captures unique pairwise relations within the same identity and discriminative pairwise relations between different identities. The PRN aims to determine facial part-relational structure from local appearance feature pairs. Because meaningful pairwise relations should be identity dependent, we add a face identity state feature, obtained from a long short-term memory (LSTM) network run over the sequential local appearance features. To further improve accuracy, we combine the global appearance features with the pairwise relational features. Experimental results on LFW show that the PRN achieves 99.76% accuracy. On YTF, the PRN achieves state-of-the-art accuracy (96.3%). The PRN also achieves results comparable to the state of the art on both the face verification and face identification tasks on IJB-A and IJB-B. This work was published at ECCV 2018. |
Tasks | Face Identification, Face Recognition, Face Verification |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06405v1 |
http://arxiv.org/pdf/1811.06405v1.pdf | |
PWC | https://paperswithcode.com/paper/pairwise-relational-networks-using-local |
Repo | |
Framework | |
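A compact sketch of the PRN structure as described: an MLP scores every pair of landmark-local features, conditioned on the LSTM-derived identity state, and the pair outputs are aggregated. Layer sizes and mean-pooling aggregation are assumptions.

```python
import torch
import torch.nn as nn
from itertools import combinations

class PairwiseRelationalNetwork(nn.Module):
    """Sketch of the PRN idea: score every pair of local appearance features
    (one per facial landmark), conditioned on a face identity state vector,
    then aggregate pair outputs. Dimensions are assumptions."""
    def __init__(self, feat_dim=256, id_dim=128, hid=512, out=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim + id_dim, hid), nn.ReLU(),
            nn.Linear(hid, out))

    def forward(self, local_feats, id_state):
        # local_feats: (B, L, feat_dim) landmark features; id_state: (B, id_dim)
        B, L, _ = local_feats.shape
        rels = [self.mlp(torch.cat([local_feats[:, i], local_feats[:, j],
                                    id_state], dim=1))
                for i, j in combinations(range(L), 2)]
        return torch.stack(rels, dim=1).mean(dim=1)   # aggregate over pairs
```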
Face Verification and Forgery Detection for Ophthalmic Surgery Images
Title | Face Verification and Forgery Detection for Ophthalmic Surgery Images |
Authors | Kaushal Bhogale, Nishant Shankar, Adheesh Juvekar, Asutosh Padhi |
Abstract | Although modern face verification systems are accessible and accurate, they are not always robust to pose variance and occlusions. Moreover, accurate models require a large amount of data to train. We structure our experiments to operate on small amounts of data obtained from an NGO that funds ophthalmic surgeries. We set up our face verification task as that of verifying pre-operation and post-operation images of a patient who undergoes ophthalmic surgery, and as such the post-operation images have occlusions like an eye patch. In this paper, we present a system that performs the face verification task using one-shot learning. To this end, our paper uses deep convolutional networks and compares different model architectures and loss functions. Our best model achieves 85% test accuracy. At inference time, we also attempt to detect image forgeries in addition to performing face verification. To achieve this, we use Error Level Analysis. Finally, we propose an inference pipeline that demonstrates how these techniques can be used to implement an automated face verification and forgery detection system. |
Tasks | Face Verification, One-Shot Learning |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06194v1 |
http://arxiv.org/pdf/1811.06194v1.pdf | |
PWC | https://paperswithcode.com/paper/face-verification-and-forgery-detection-for |
Repo | |
Framework | |
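Error Level Analysis, the forgery-detection technique the paper applies at inference, re-saves the image as JPEG at a known quality and inspects the difference: regions edited after the last save tend to show a different error level. A short Pillow sketch, with illustrative quality and amplification settings:

```python
from io import BytesIO
from PIL import Image, ImageChops

def error_level_analysis(path, quality=90, scale=15):
    """Re-save the image as JPEG at a known quality and diff against the
    original; the amplified difference highlights potentially edited regions."""
    original = Image.open(path).convert("RGB")
    buf = BytesIO()
    original.save(buf, "JPEG", quality=quality)
    resaved = Image.open(buf)
    ela = ImageChops.difference(original, resaved)
    return ela.point(lambda px: min(255, px * scale))   # amplify for viewing
```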
Automatic Thresholding of SIFT Descriptors
Title | Automatic Thresholding of SIFT Descriptors |
Authors | Matthew R. Kirchner |
Abstract | We introduce a method to perform automatic thresholding of SIFT descriptors that improves matching performance by at least 15.9% on the Oxford image matching benchmark. The method uses an a contrario methodology to determine a unique bin-magnitude threshold. This is done by building a generative uniform background model for descriptors and determining when bin magnitudes have reached a sufficient level. The presented method, called meaningful clamping, contrasts with the current SIFT implementation by efficiently computing a clamping threshold that is unique for every descriptor. |
Tasks | |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.03173v1 |
http://arxiv.org/pdf/1811.03173v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-thresholding-of-sift-descriptors |
Repo | |
Framework | |
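For context, the conventional SIFT pipeline clamps descriptor bins at a fixed 0.2 of the descriptor's L2 norm and renormalizes; the paper's contribution is replacing that constant with a per-descriptor threshold derived from an a contrario background model. The sketch below shows only the clamping step that such a threshold plugs into, not the a contrario derivation itself.

```python
import numpy as np

def clamp_descriptor(desc, threshold):
    """Clamp SIFT histogram bins at `threshold` and re-normalize to unit norm.
    Standard SIFT uses a fixed threshold of 0.2; the paper instead derives a
    unique threshold for each descriptor."""
    d = desc / (np.linalg.norm(desc) + 1e-12)
    d = np.minimum(d, threshold)
    return d / (np.linalg.norm(d) + 1e-12)

desc = np.random.rand(128)            # a raw 128-bin SIFT descriptor
fixed = clamp_descriptor(desc, 0.2)   # conventional fixed clamping
```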
Neural Generative Models for 3D Faces with Application in 3D Texture Free Face Recognition
Title | Neural Generative Models for 3D Faces with Application in 3D Texture Free Face Recognition |
Authors | Ahmed ElSayed, Elif Kongar, Ausif Mahmood, Tarek Sobh, Terrance Boult |
Abstract | Using heterogeneous depth cameras and 3D scanners in 3D face verification causes variations in the resolution of the 3D point clouds. To solve this issue, previous studies use 3D registration techniques. Among these techniques, detecting points of correspondence has proven efficient, provided the data belong to the same individual. However, if the data belong to different persons, registration algorithms can convert the 3D point cloud of one person into another’s, destroying the distinguishing features between the two point clouds. Another issue concerns the storage size of the point clouds: if a captured depth image contains around 50 thousand points for a single pose of one individual, the storage size of the entire dataset will be on the order of gigabytes, if not terabytes. With these motivations, this work introduces a new technique for 3D point cloud generation using a neural modeling system, both to handle the differences caused by heterogeneous depth cameras and to generate a new canonical, compact face representation. The proposed system reduces the stored 3D dataset size and, if required, provides accurate dataset regeneration. Furthermore, the system generates neural models for all gallery point clouds and stores these models to represent the faces in the recognition or verification processes. For a probe cloud to be verified, a new model is generated specifically for that cloud and matched against the pre-stored gallery model representations to identify the query cloud. This work also introduces the use of a Siamese deep neural network for 3D face verification, with the generated model representations as raw input to the deep network, and shows that the accuracy of the trained network is comparable to all published results on the Bosphorus dataset. |
Tasks | Face Recognition, Face Verification |
Published | 2018-11-11 |
URL | http://arxiv.org/abs/1811.04358v1 |
http://arxiv.org/pdf/1811.04358v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-generative-models-for-3d-faces-with |
Repo | |
Framework | |
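One way to picture the "neural model per point cloud" idea: fit a small network to one scan so that its weights become a compact, regenerable face representation. The sketch below fits an MLP mapping (x, y) to depth z; the architecture, coordinate parameterization, and training loop are all assumptions rather than the paper's system.

```python
import torch
import torch.nn as nn

def fit_cloud_model(points, epochs=500, lr=1e-3):
    """Fit a small MLP that maps (x, y) to depth z for one face scan, so that
    only the MLP weights need to be stored instead of ~50k raw points.
    points: float tensor of shape (N, 3)."""
    xy, z = points[:, :2], points[:, 2:]
    net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                        nn.Linear(64, 64), nn.Tanh(),
                        nn.Linear(64, 1))
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(xy), z)
        loss.backward()
        opt.step()
    return net   # its state_dict is the stored face representation
```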
Structural Consistency and Controllability for Diverse Colorization
Title | Structural Consistency and Controllability for Diverse Colorization |
Authors | Safa Messaoud, David Forsyth, Alexander G. Schwing |
Abstract | Colorizing a given gray-level image is an important task in the media and advertising industry. Due to the ambiguity inherent to colorization (many shades are often plausible), recent approaches started to explicitly model diversity. However, one of the most obvious artifacts, structural inconsistency, is rarely considered by existing methods which predict chrominance independently for every pixel. To address this issue, we develop a conditional random field based variational auto-encoder formulation which is able to achieve diversity while taking into account structural consistency. Moreover, we introduce a controllability mechanism that can incorporate external constraints from diverse sources including a user interface. Compared to existing baselines, we demonstrate that our method obtains more diverse and globally consistent colorizations on the LFW, LSUN-Church and ILSVRC-2015 datasets. |
Tasks | Colorization |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.02129v1 |
http://arxiv.org/pdf/1809.02129v1.pdf | |
PWC | https://paperswithcode.com/paper/structural-consistency-and-controllability |
Repo | |
Framework | |
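A bare-bones conditional VAE over chrominance, the backbone such an approach builds on, can be sketched as follows. The CRF component that enforces structural consistency and the controllability mechanism are deliberately omitted, and all layer sizes assume 64x64 Lab images; these are illustration choices, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ChromaVAE(nn.Module):
    """Minimal conditional VAE for diverse colorization: encode the two
    chrominance (ab) channels conditioned on grayscale; decode samples of z,
    again conditioned on grayscale, back to chrominance."""
    def __init__(self, zdim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
                                 nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU())
        self.to_mu = nn.Linear(64 * 16 * 16, zdim)
        self.to_lv = nn.Linear(64 * 16 * 16, zdim)
        self.dec = nn.Sequential(nn.Linear(zdim + 64 * 64, 2 * 64 * 64), nn.Tanh())

    def forward(self, gray, ab):       # gray: (B,1,64,64), ab: (B,2,64,64)
        h = self.enc(torch.cat([gray, ab], 1)).flatten(1)
        mu, lv = self.to_mu(h), self.to_lv(h)
        z = mu + torch.randn_like(mu) * (0.5 * lv).exp()   # reparameterize
        out = self.dec(torch.cat([z, gray.flatten(1)], 1))
        return out.view(-1, 2, 64, 64), mu, lv
        # train with reconstruction loss on ab plus the usual KL term;
        # sampling different z at test time yields diverse colorizations
```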
t-PINE: Tensor-based Predictable and Interpretable Node Embeddings
Title | t-PINE: Tensor-based Predictable and Interpretable Node Embeddings |
Authors | Saba A. Al-Sayouri, Ekta Gujral, Danai Koutra, Evangelos E. Papalexakis, Sarah S. Lam |
Abstract | Graph representations have grown increasingly popular in recent years. Existing representation learning approaches explicitly encode network structure. Despite their good performance in downstream processes (e.g., node classification, link prediction), there is still room for improvement in different aspects, like efficacy, visualization, and interpretability. In this paper, we propose t-PINE, a method that addresses these limitations. Contrary to baseline methods, which generally learn explicit graph representations using only an adjacency matrix, t-PINE exploits a multi-view information graph: the adjacency matrix forms the first view, and a nearest-neighbor adjacency matrix computed over the node features forms the second. From these views it learns explicit and implicit node representations using the Canonical Polyadic (a.k.a. CP) decomposition. We argue that the implicit and explicit mapping from a higher-dimensional to a lower-dimensional vector space is the key to learning more useful, highly predictable, and gracefully interpretable representations. Good interpretable representations make it possible to understand how each view contributes to the representation learning process, and they help exclude unrelated dimensions. Extensive experiments show that t-PINE drastically outperforms baseline methods by up to 158.6% with respect to Micro-F1 on several multi-label classification problems, while offering high visualization and interpretability utility. |
Tasks | Link Prediction, Multi-Label Classification, Node Classification, Representation Learning |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01889v1 |
http://arxiv.org/pdf/1805.01889v1.pdf | |
PWC | https://paperswithcode.com/paper/t-pine-tensor-based-predictable-and |
Repo | |
Framework | |
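The two-view construction in the abstract maps directly to code: stack the graph adjacency and a feature-space k-NN adjacency into a third-order tensor and take the node factor of its CP decomposition. A sketch using TensorLy and scikit-learn; the rank and k values are hypothetical settings.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac
from sklearn.neighbors import kneighbors_graph

def tpine_embeddings(A, X, rank=16, k=10):
    """Sketch of the t-PINE construction: view 1 is the graph adjacency A
    (n x n), view 2 is a k-NN adjacency over node features X (n x d); the
    node factor of the CP decomposition serves as the embedding."""
    A_knn = kneighbors_graph(X, k).toarray()
    T = tl.tensor(np.stack([A, A_knn], axis=2))        # nodes x nodes x views
    weights, factors = parafac(T, rank=rank, n_iter_max=200)
    return factors[0]                                  # one row per node
```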
Understanding Fake Faces
Title | Understanding Fake Faces |
Authors | Ryota Natsume, Kazuki Inoue, Yoshihiro Fukuhara, Shintaro Yamamoto, Shigeo Morishima, Hirokatsu Kataoka |
Abstract | Face recognition research is one of the most active topics in computer vision (CV), and deep neural networks (DNN) are now closing the gap between human-level and computer-driven performance in face verification algorithms. However, although the performance gap appears to be narrowing in terms of accuracy, a curious question arises: is AI’s understanding of faces really close to that of humans? In the present study, to probe this question, we conduct image-based detection, classification, and generation using an in-house fake face database. This database has two configurations: (i) false-positive face detections produced by both the Viola-Jones (VJ) method and convolutional neural networks (CNN), and (ii) simulacra that have fundamental characteristics resembling faces but are completely artificial. The results indicate that a gap remains between the capabilities of recent vision-based face recognition algorithms and human-level performance. On a positive note, however, we have obtained insights that will advance the development of face-understanding models. |
Tasks | Face Recognition, Face Verification |
Published | 2018-09-22 |
URL | http://arxiv.org/abs/1809.08391v1 |
http://arxiv.org/pdf/1809.08391v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-fake-faces |
Repo | |
Framework | |
Feature-Distributed SVRG for High-Dimensional Linear Classification
Title | Feature-Distributed SVRG for High-Dimensional Linear Classification |
Authors | Gong-Duo Zhang, Shen-Yi Zhao, Hao Gao, Wu-Jun Li |
Abstract | Linear classification has been widely used in many high-dimensional applications like text classification. To perform linear classification for large-scale tasks, we often need to design distributed learning methods on a cluster of multiple machines. In this paper, we propose a new distributed learning method, called feature-distributed stochastic variance reduced gradient (FD-SVRG) for high-dimensional linear classification. Unlike most existing distributed learning methods which are instance-distributed, FD-SVRG is feature-distributed. FD-SVRG has lower communication cost than other instance-distributed methods when the data dimensionality is larger than the number of data instances. Experimental results on real data demonstrate that FD-SVRG can outperform other state-of-the-art distributed methods for high-dimensional linear classification in terms of both communication cost and wall-clock time, when the dimensionality is larger than the number of instances in training data. |
Tasks | Text Classification |
Published | 2018-02-10 |
URL | http://arxiv.org/abs/1802.03604v1 |
http://arxiv.org/pdf/1802.03604v1.pdf | |
PWC | https://paperswithcode.com/paper/feature-distributed-svrg-for-high-dimensional |
Repo | |
Framework | |
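The feature-distributed layout is what drives FD-SVRG's communication saving: each worker holds a column block of the data and the matching slice of the parameter vector, so an inner product w·x is assembled from n-dimensional partial sums instead of shipping d-dimensional gradients, which is cheap precisely when d >> n. A single-process simulation of that primitive:

```python
import numpy as np

def partial_inner_products(X_blocks, w_blocks):
    """Each worker holds a column block of X and the matching slice of w,
    computes partial inner products locally, and only these n-dimensional
    partial sums need to be communicated and added up."""
    return sum(Xb @ wb for Xb, wb in zip(X_blocks, w_blocks))  # shape (n,)

# Simulate two workers splitting a 10-dimensional problem by features.
rng = np.random.default_rng(0)
X, w = rng.standard_normal((100, 10)), rng.standard_normal(10)
X_blocks, w_blocks = np.hsplit(X, 2), np.split(w, 2)
assert np.allclose(partial_inner_products(X_blocks, w_blocks), X @ w)
```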
SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits
Title | SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits |
Authors | Etienne Boursier, Vianney Perchet |
Abstract | Motivated by cognitive radio networks, we consider the stochastic multiplayer multi-armed bandit problem, where several players pull arms simultaneously and a collision occurs when the same arm is pulled by several players at the same stage. We present a decentralized algorithm that achieves the same performance as a centralized one, contradicting the existing lower bounds for that problem. This is possible by “hacking” the standard model: we construct a communication protocol between players that deliberately enforces collisions, allowing them to share their information at a negligible cost. This motivates the introduction of a more appropriate dynamic setting without sensing, where similar communication protocols are no longer possible. However, we show that logarithmic growth of the regret is still achievable in this model with a new algorithm. |
Tasks | Multi-Armed Bandits |
Published | 2018-09-21 |
URL | https://arxiv.org/abs/1809.08151v4 |
https://arxiv.org/pdf/1809.08151v4.pdf | |
PWC | https://paperswithcode.com/paper/sic-mmab-synchronisation-involves |
Repo | |
Framework | |
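The "hack" in the abstract is that collisions, normally pure loss, can carry information: during a communication phase the receiver camps on one arm and senses whether the sender collided with it. A toy sketch of the bit encoding follows; the real protocol also handles synchronization and quantizes empirical means, and the arm indices here are arbitrary.

```python
def transmit(bits, receiver_arm, other_arm):
    """Collision-based signaling: the receiver keeps pulling receiver_arm;
    the sender encodes bit 1 by pulling that same arm (forcing a collision
    the receiver senses) and bit 0 by pulling any other arm."""
    observed = []
    for b in bits:
        sender_arm = receiver_arm if b == 1 else other_arm
        collision = (sender_arm == receiver_arm)   # what the receiver senses
        observed.append(1 if collision else 0)
    return observed

# A player could share an empirical mean quantized to a few bits this way.
assert transmit([1, 0, 1, 1], receiver_arm=2, other_arm=0) == [1, 0, 1, 1]
```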