January 26, 2020

Paper Group ANR 1373

On the Difficulty of Warm-Starting Neural Network Training

Title On the Difficulty of Warm-Starting Neural Network Training
Authors Jordan T. Ash, Ryan P. Adams
Abstract In many real-world deployments of machine learning systems, data arrive piecemeal. These learning scenarios may be passive, where data arrive incrementally due to structural properties of the problem (e.g., daily financial data) or active, where samples are selected according to a measure of their quality (e.g., experimental design). In both of these cases, we are building a sequence of models that incorporate an increasing amount of data. We would like each of these models in the sequence to be performant and take advantage of all the data that are available to that point. Conventional intuition suggests that when solving a sequence of related optimization problems of this form, it should be possible to initialize using the solution of the previous iterate—to “warm start” the optimization rather than initialize from scratch—and see reductions in wall-clock time. However, in practice this warm-starting seems to yield poorer generalization performance than models that have fresh random initializations, even though the final training losses are similar. While it appears that some hyperparameter settings allow a practitioner to close this generalization gap, they seem to only do so in regimes that damage the wall-clock gains of the warm start. Nevertheless, it is highly desirable to be able to warm-start neural network training, as it would dramatically reduce the resource usage associated with the construction of performant deep learning systems. In this work, we take a closer look at this empirical phenomenon and try to understand when and how it occurs. Although the present investigation did not lead to a solution, we hope that a thorough articulation of the problem will spur new research that may lead to improved methods that consume fewer resources during training.
Tasks
Published 2019-10-18
URL https://arxiv.org/abs/1910.08475v1
PDF https://arxiv.org/pdf/1910.08475v1.pdf
PWC https://paperswithcode.com/paper/on-the-difficulty-of-warm-starting-neural
Repo
Framework
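
The warm-start versus fresh-initialization setup described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' experimental code: the toy data generator, the logistic-regression model, and the names `train_logreg` and `make_batch` are all assumptions made for this example. Because logistic regression is convex, this sketch shows only the mechanics of the two initialization strategies; it is not expected to reproduce the generalization gap the paper reports for neural networks.

```python
import math
import random


def train_logreg(data, w_init, steps=200, lr=0.5):
    """Plain gradient descent on the logistic loss, starting from w_init."""
    w = list(w_init)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for j, xj in enumerate(x):
                grad[j] += (p - y) * xj
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return w


def make_batch(rng, n):
    """Toy 2-D points plus a bias feature; the label is 1 when x0 + x1 > 0."""
    batch = []
    for _ in range(n):
        x0, x1 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        batch.append(([x0, x1, 1.0], 1 if x0 + x1 > 0 else 0))
    return batch


rng = random.Random(0)
seen = []
w_warm = [0.0, 0.0, 0.0]
for round_idx in range(3):
    seen += make_batch(rng, 50)  # data arrive piecemeal
    # Warm start: continue from the previous round's solution.
    w_warm = train_logreg(seen, w_warm, steps=50)
    # Fresh start: re-initialize and train on all data seen so far.
    w_cold = train_logreg(seen, [0.0, 0.0, 0.0], steps=50)
```

Comparing held-out accuracy of `w_warm` and `w_cold` after each round is the kind of measurement the abstract refers to when it contrasts the two strategies.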

Design Automation for Efficient Deep Learning Computing

Title Design Automation for Efficient Deep Learning Computing
Authors Song Han, Han Cai, Ligeng Zhu, Ji Lin, Kuan Wang, Zhijian Liu, Yujun Lin
Abstract Efficient deep learning computing requires algorithm and hardware co-design to enable specialization: we usually need to change the algorithm to reduce memory footprint and improve energy efficiency. However, the extra degree of freedom from the algorithm makes the design space much larger: it’s not only about designing the hardware but also about how to tweak the algorithm to best fit the hardware. Human engineers can hardly exhaust the design space by heuristics; doing so is labor-intensive and sub-optimal. We propose design automation techniques for efficient neural networks. We investigate automatically designing specialized fast models, auto channel pruning, and auto mixed-precision quantization. We demonstrate that such learning-based, automated design achieves better performance and efficiency than rule-based human design. Moreover, we shorten the design cycle by 200x compared with previous work, so that we can afford to design specialized neural network models for different hardware platforms.
Tasks Quantization
Published 2019-04-24
URL http://arxiv.org/abs/1904.10616v1
PDF http://arxiv.org/pdf/1904.10616v1.pdf
PWC https://paperswithcode.com/paper/design-automation-for-efficient-deep-learning
Repo
Framework

Boosting the rule-out accuracy of deep disease detection using class weight modifiers

Title Boosting the rule-out accuracy of deep disease detection using class weight modifiers
Authors Alexandros Karargyris, Ken C. L. Wong, Joy T. Wu, Mehdi Moradi, Tanveer Syeda-Mahmood
Abstract In many screening applications, the primary goal of a radiologist or assisting artificial intelligence is to rule out certain findings. The classifiers built for such applications are often trained on large datasets that derive labels from clinical notes written for patients. While the quality of the positive findings described in these notes is often reliable, the absence of a mention of a finding does not always rule out its presence. This happens because radiologists comment on the patient in the context of the exam, for example focusing on trauma as opposed to chronic disease at emergency rooms. However, this disease finding ambiguity can affect the performance of algorithms. Hence it is critical to model the ambiguity during training. We propose a scheme to apply reasonable class weight modifiers to our loss function for the no mention cases during training. We experiment with two different deep neural network architectures and show that the proposed method results in a large improvement in the performance of the classifiers, especially on negated findings. The baseline performance of a custom-made dilated block network proposed in this paper shows an improvement in comparison with baseline DenseNet-201, while both architectures benefit from the new proposed loss function weighting scheme. Over 200,000 chest X-ray images and three highly common diseases, along with their negated counterparts, are included in this study.
Tasks
Published 2019-06-21
URL https://arxiv.org/abs/1906.09354v1
PDF https://arxiv.org/pdf/1906.09354v1.pdf
PWC https://paperswithcode.com/paper/boosting-the-rule-out-accuracy-of-deep
Repo
Framework
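
One way to picture a class weight modifier for the "no mention" cases is a per-sample weight inside a binary cross-entropy loss. The sketch below is an assumption-laden illustration, not the loss used in the paper: the function name `weighted_bce`, the `no_mention` flags, and the modifier value 0.5 are all hypothetical.

```python
import math


def weighted_bce(probs, labels, no_mention, modifier=0.5):
    """Mean binary cross-entropy where 'no mention' samples are down-weighted.

    probs      -- predicted probabilities of the finding being present
    labels     -- 0/1 targets derived from the clinical notes
    no_mention -- True where the note did not mention the finding at all
    modifier   -- hypothetical weight applied to the ambiguous samples
    """
    total = 0.0
    for p, y, nm in zip(probs, labels, no_mention):
        p = min(max(p, 1e-7), 1 - 1e-7)  # avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        weight = modifier if nm else 1.0
        total += weight * loss
    return total / len(probs)


probs = [0.9, 0.2, 0.7]
labels = [1, 0, 0]
no_mention = [False, False, True]  # third label is negative only by omission

full = weighted_bce(probs, labels, no_mention, modifier=1.0)
down = weighted_bce(probs, labels, no_mention, modifier=0.5)
```

With `modifier=1.0` the function reduces to ordinary mean cross-entropy; a smaller modifier lets a confident positive prediction on an unmentioned finding cost less than it would on an explicitly negated one.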

Deep Transfer Learning for Single-Channel Automatic Sleep Staging with Channel Mismatch

Title Deep Transfer Learning for Single-Channel Automatic Sleep Staging with Channel Mismatch
Authors Huy Phan, Oliver Y. Chén, Philipp Koch, Alfred Mertins, Maarten De Vos
Abstract Many sleep studies lack sufficient data to fully utilize deep neural networks because different labs use different recording setups. Automated algorithms must therefore be trained on rather small databases, while large annotated databases exist but cannot be included directly in these studies for data compensation due to channel mismatch. This work presents a deep transfer learning approach to overcome the channel mismatch problem and transfer knowledge from a large dataset to a small cohort to study automatic sleep staging with single-channel input. We employ the state-of-the-art SeqSleepNet and train the network in the source domain, i.e. the large dataset. Afterwards, the pretrained network is finetuned in the target domain, i.e. the small cohort, to complete knowledge transfer. We study two transfer learning scenarios with slight and heavy channel mismatch between the source and target domains. We also investigate whether, and if so how, finetuning the pretrained network entirely or partially affects the performance of sleep staging on the target domain. Using the Montreal Archive of Sleep Studies (MASS) database consisting of 200 subjects as the source domain and the Sleep-EDF Expanded database consisting of 20 subjects as the target domain in this study, our experimental results show significant performance improvement on sleep staging achieved with the proposed deep transfer learning approach. Furthermore, these results also reveal that finetuning the feature-learning parts of the pretrained network is essential to bypass the channel mismatch problem.
Tasks Transfer Learning
Published 2019-04-11
URL https://arxiv.org/abs/1904.05945v2
PDF https://arxiv.org/pdf/1904.05945v2.pdf
PWC https://paperswithcode.com/paper/deep-transfer-learning-for-single-channel
Repo
Framework

Growing axons: greedy learning of neural networks with application to function approximation

Title Growing axons: greedy learning of neural networks with application to function approximation
Authors Daria Fokina, Ivan Oseledets
Abstract We propose a new method for learning deep neural network models that is based on a greedy learning approach: we add one basis function at a time, and a new basis function is generated as a non-linear activation function applied to a linear combination of the previous basis functions. Such a method (growing deep neural network by one neuron at a time) allows us to compute much more accurate approximants for several model problems in function approximation.
Tasks
Published 2019-10-28
URL https://arxiv.org/abs/1910.12686v2
PDF https://arxiv.org/pdf/1910.12686v2.pdf
PWC https://paperswithcode.com/paper/growing-axons-greedy-learning-of-neural
Repo
Framework
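
The greedy construction in the abstract, where each new basis function is a non-linear activation applied to a linear combination of the previous ones, can be sketched roughly as below. This is only an illustration under stated assumptions: random search stands in for whatever optimizer the authors use to pick the combination coefficients, the output weights are updated by a one-dimensional projection rather than a joint refit, and `grow_basis`, the `tanh` activation, and the starting basis {1, x} are choices made for this example.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def grow_basis(xs, target, n_new=5, n_candidates=200, seed=0):
    """Greedily add basis functions; return the residual norm after each step."""
    rng = random.Random(seed)
    # Start from two fixed basis functions sampled on the grid: 1 and x.
    basis = [[1.0] * len(xs), list(xs)]
    residual = list(target)
    # Project out the starting basis functions one at a time.
    for phi in basis:
        c = dot(residual, phi) / dot(phi, phi)
        residual = [r - c * p for r, p in zip(residual, phi)]
    history = [math.sqrt(dot(residual, residual))]
    for _ in range(n_new):
        best_phi, best_gain = None, -1.0
        for _ in range(n_candidates):
            coeffs = [rng.uniform(-3, 3) for _ in basis]
            # Candidate: activation of a linear combination of previous basis functions.
            phi = [math.tanh(sum(a * b[i] for a, b in zip(coeffs, basis)))
                   for i in range(len(xs))]
            norm2 = dot(phi, phi)
            if norm2 < 1e-12:
                continue
            gain = dot(residual, phi) ** 2 / norm2  # reduction in squared residual
            if gain > best_gain:
                best_phi, best_gain = phi, gain
        # Keep the best candidate and remove its component from the residual.
        c = dot(residual, best_phi) / dot(best_phi, best_phi)
        residual = [r - c * p for r, p in zip(residual, best_phi)]
        basis.append(best_phi)
        history.append(math.sqrt(dot(residual, residual)))
    return history


xs = [i / 50 - 1 for i in range(101)]   # grid on [-1, 1]
target = [abs(x) for x in xs]           # function to approximate
history = grow_basis(xs, target)
```

Each accepted basis function removes its projection from the residual, so the values in `history` cannot increase from one step to the next.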

Illegible Text to Readable Text: An Image-to-Image Transformation using Conditional Sliced Wasserstein Adversarial Networks

Title Illegible Text to Readable Text: An Image-to-Image Transformation using Conditional Sliced Wasserstein Adversarial Networks
Authors Mostafa Karimi, Gopalkrishna Veni, Yen-Yun Yu
Abstract Automatic text recognition from ancient handwritten record images is an important problem in the genealogy domain. However, critical challenges such as varying noise conditions, vanishing texts, and variations in handwriting make the recognition task difficult. We tackle this problem by developing a handwritten-to-machine-print conditional Generative Adversarial network (HW2MP-GAN) model that formulates handwritten recognition as a text-Image-to-text-Image translation problem where a given image, typically in an illegible form, is converted into another image, close to its machine-print form. The proposed model consists of three components: a generator and word-level and character-level discriminators. The model incorporates Sliced Wasserstein distance (SWD) and U-Net architectures in HW2MP-GAN for better quality image-to-image transformation. Our experiments reveal that HW2MP-GAN outperforms state-of-the-art baseline cGAN models by almost 30 in Frechet Handwritten Distance (FHD), 0.6 on average Levenshtein distance and 39% in word accuracy for image-to-image translation on the IAM database. Further, HW2MP-GAN improves handwritten recognition word accuracy by 1.3% compared to baseline handwritten recognition models on the IAM database.
Tasks Image-to-Image Translation
Published 2019-10-11
URL https://arxiv.org/abs/1910.05425v1
PDF https://arxiv.org/pdf/1910.05425v1.pdf
PWC https://paperswithcode.com/paper/illegible-text-to-readable-text-an-image-to
Repo
Framework

Adaptive Wasserstein Hourglass for Weakly Supervised Hand Pose Estimation from Monocular RGB

Title Adaptive Wasserstein Hourglass for Weakly Supervised Hand Pose Estimation from Monocular RGB
Authors Yumeng Zhang, Li Chen, Yufeng Liu, Junhai Yong, Wen Zheng
Abstract The shortage of labeled training datasets is one of the bottlenecks of 3D hand pose estimation from monocular RGB images. Synthetic datasets have a large number of images with precise annotations, but the obvious difference with real-world datasets impacts the generalization. Little work has been done to bridge the gap between two domains over their wide difference. In this paper, we propose a domain adaptation method called Adaptive Wasserstein Hourglass (AW Hourglass) for weakly-supervised 3D hand pose estimation, which aims to distinguish the difference and explore the common characteristics (e.g. hand structure) of synthetic and real-world datasets. Learning the common characteristics helps the network focus on pose-related information. The similarity of the characteristics makes it easier to enforce domain-invariant constraints. During training, based on the relation between these common characteristics and 3D pose learned from fully-annotated synthetic datasets, it is beneficial for the network to restore the 3D pose of weakly labeled real-world datasets with the aid of 2D annotations and depth images. At test time, the network predicts the 3D pose with RGB as input.
Tasks Domain Adaptation, Hand Pose Estimation, Pose Estimation
Published 2019-09-11
URL https://arxiv.org/abs/1909.05666v1
PDF https://arxiv.org/pdf/1909.05666v1.pdf
PWC https://paperswithcode.com/paper/adaptive-wasserstein-hourglass-for-weakly
Repo
Framework

Learning a Representation with the Block-Diagonal Structure for Pattern Classification

Title Learning a Representation with the Block-Diagonal Structure for Pattern Classification
Authors He-Feng Yin, Xiao-Jun Wu, Josef Kittler, Zhen-Hua Feng
Abstract Sparse-representation-based classification (SRC) has been widely studied and developed for various practical signal classification applications. However, the performance of an SRC-based method is degraded when both the training and test data are corrupted. To counteract this problem, we propose an approach that learns Representation with Block-Diagonal Structure (RBDS) for robust image recognition. To be more specific, we first introduce a regularization term that captures the block-diagonal structure of the target representation matrix of the training data. The resulting problem is then solved by an optimizer. Last, based on the learned representation, a simple yet effective linear classifier is used for the classification task. The experimental results obtained on several benchmarking datasets demonstrate the efficacy of the proposed RBDS method.
Tasks Sparse Representation-based Classification
Published 2019-11-23
URL https://arxiv.org/abs/1911.10301v1
PDF https://arxiv.org/pdf/1911.10301v1.pdf
PWC https://paperswithcode.com/paper/learning-a-representation-with-the-block
Repo
Framework

Human Motion Prediction via Pattern Completion in Latent Representation Space

Title Human Motion Prediction via Pattern Completion in Latent Representation Space
Authors Yi Tian Xu, Yaqiao Li, David Meger
Abstract Inspired by ideas in cognitive science, we propose a novel and general approach to solve human motion understanding via pattern completion on a learned latent representation space. Our model outperforms current state-of-the-art methods in human motion prediction across a number of tasks, with no customization. To construct a latent representation for time-series of various lengths, we propose a new and generic autoencoder based on sequence-to-sequence learning. While traditional inference strategies find a correlation between an input and an output, we use pattern completion, which views the input as a partial pattern and predicts the best corresponding complete pattern. Our results demonstrate that this approach has advantages when combined with our autoencoder in solving human motion prediction, motion generation and action classification.
Tasks Action Classification, motion prediction, Time Series
Published 2019-04-18
URL http://arxiv.org/abs/1904.09039v1
PDF http://arxiv.org/pdf/1904.09039v1.pdf
PWC https://paperswithcode.com/paper/human-motion-prediction-via-pattern
Repo
Framework

Word-based Domain Adaptation for Neural Machine Translation

Title Word-based Domain Adaptation for Neural Machine Translation
Authors Shen Yan, Leonard Dahlmann, Pavel Petrushkov, Sanjika Hewavitharana, Shahram Khadivi
Abstract In this paper, we empirically investigate applying word-level weights to adapt neural machine translation to e-commerce domains, where small e-commerce datasets and large out-of-domain datasets are available. In order to mine in-domain-like words in the out-of-domain datasets, we compute word weights by using a domain-specific and a non-domain-specific language model followed by smoothing and binary quantization. The baseline model is trained on mixed in-domain and out-of-domain datasets. Experimental results on English to Chinese e-commerce domain translation show that, compared to continuing training without word weights, continuing training with word weights improves MT quality by up to 2.11% BLEU absolute and 1.59% TER. We have also trained models using fine-tuning on the in-domain data. Pre-training a model with word weights improves fine-tuning by up to 1.24% BLEU absolute and 1.64% TER, respectively.
Tasks Domain Adaptation, Language Modelling, Machine Translation, Quantization
Published 2019-06-07
URL https://arxiv.org/abs/1906.03129v1
PDF https://arxiv.org/pdf/1906.03129v1.pdf
PWC https://paperswithcode.com/paper/word-based-domain-adaptation-for-neural
Repo
Framework
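
The word-weight pipeline named in the abstract (score each word with a domain-specific and a non-domain-specific language model, smooth, then quantize to binary) might look roughly like the sketch below. Everything concrete here is an assumption for illustration: unigram probability tables stand in for real language models, the smoothing is a simple moving average, and `word_weights`, `window`, `threshold`, and `floor` are hypothetical names and values not taken from the paper.

```python
import math


def word_weights(tokens, p_in, p_out, window=1, threshold=0.0, floor=1e-6):
    """Return a 0/1 weight per token: 1 where the token looks in-domain."""
    # Log-probability ratio between the in-domain and out-of-domain models.
    scores = [math.log(p_in.get(t, floor)) - math.log(p_out.get(t, floor))
              for t in tokens]
    # Smoothing: average each score with its neighbours inside the window.
    smoothed = []
    for i in range(len(scores)):
        lo, hi = max(0, i - window), min(len(scores), i + window + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    # Binary quantization against a fixed threshold.
    return [1 if s > threshold else 0 for s in smoothed]


tokens = ["free", "shipping", "on", "the", "moon"]
p_in = {"free": 0.02, "shipping": 0.03, "on": 0.05, "the": 0.06, "moon": 0.0001}
p_out = {"free": 0.005, "shipping": 0.001, "on": 0.05, "the": 0.06, "moon": 0.002}

weights = word_weights(tokens, p_in, p_out)  # -> [1, 1, 1, 0, 0]
```

In this toy run the smoothing step pulls the neutral word "on" to weight 1 because it sits next to strongly in-domain words, which is the kind of effect smoothing is meant to have before quantization.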
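
How Do You #relax When You’re #stressed? A Content Analysis and Infodemiology Study of Stress-Related Tweets

Title How Do You #relax When You’re #stressed? A Content Analysis and Infodemiology Study of Stress-Related Tweets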
Authors Son Doan, Amanda Ritchart, Nicholas Perry, Juan D Chaparro, Mike Conway
Abstract Background: Stress is a contributing factor to many major health problems in the United States, such as heart disease, depression, and autoimmune diseases. Relaxation is often recommended in mental health treatment as a frontline strategy to reduce stress, thereby improving health conditions. Objective: The objective of our study was to understand how people express their feelings of stress and relaxation through Twitter messages. Methods: We first performed a qualitative content analysis of 1326 and 781 tweets containing the keywords “stress” and “relax”, respectively. We then investigated the use of machine learning algorithms to automatically classify tweets as stress versus non stress and relaxation versus non relaxation. Finally, we applied these classifiers to sample datasets drawn from 4 cities with the goal of evaluating the extent of any correlation between our automatic classification of tweets and results from public stress surveys. Results: Content analysis showed that the most frequent topic of stress tweets was education, followed by work and social relationships. The most frequent topic of relaxation tweets was rest and vacation, followed by nature and water. When we applied the classifiers to the cities dataset, the proportion of stress tweets in New York and San Diego was substantially higher than that in Los Angeles and San Francisco. Conclusions: This content analysis and infodemiology study revealed that Twitter, when used in conjunction with natural language processing techniques, is a useful data source for understanding stress and stress management strategies, and can potentially supplement infrequently collected survey-based stress data.
Tasks
Published 2019-11-21
URL https://arxiv.org/abs/1911.09242v2
PDF https://arxiv.org/pdf/1911.09242v2.pdf
PWC https://paperswithcode.com/paper/how-do-you-relax-when-youre-stressed-a
Repo
Framework

A Closer Look at Domain Shift for Deep Learning in Histopathology

Title A Closer Look at Domain Shift for Deep Learning in Histopathology
Authors Karin Stacke, Gabriel Eilertsen, Jonas Unger, Claes Lundström
Abstract Domain shift is a significant problem in histopathology. There can be large differences in data characteristics of whole-slide images between medical centers and scanners, making generalization of deep learning to unseen data difficult. To gain a better understanding of the problem, we present a study on convolutional neural networks trained for tumor classification of H&E stained whole-slide images. We analyze how augmentation and normalization strategies affect performance and learned representations, and what features a trained model responds to. Most centrally, we present a novel measure for evaluating the distance between domains in the context of the learned representation of a particular model. This measure can reveal how sensitive a model is to domain variations, and can be used to detect new data that a model will have problems generalizing to. The results show how learning is heavily influenced by the preparation of training data, and that the latent representation used to do classification is sensitive to changes in data distribution, especially when training without augmentation or normalization.
Tasks
Published 2019-09-25
URL https://arxiv.org/abs/1909.11575v2
PDF https://arxiv.org/pdf/1909.11575v2.pdf
PWC https://paperswithcode.com/paper/a-closer-look-at-domain-shift-for-deep
Repo
Framework

Towards Learning Affine-Invariant Representations via Data-Efficient CNNs

Title Towards Learning Affine-Invariant Representations via Data-Efficient CNNs
Authors Xenju Xu, Guanghui Wang, Alan Sullivan, Ziming Zhang
Abstract In this paper we propose integrating a priori knowledge into both design and training of convolutional neural networks (CNNs) to learn object representations that are invariant to affine transformations (i.e., translation, scale, rotation). Accordingly we propose a novel multi-scale maxout CNN and train it end-to-end with a novel rotation-invariant regularizer. This regularizer aims to enforce the weights in each 2D spatial filter to approximate circular patterns. In this way, we manage to handle affine transformations in training using convolution, multi-scale maxout, and circular filters. Empirically we demonstrate that such knowledge can significantly improve the data-efficiency as well as generalization and robustness of learned models. For instance, on the Traffic Sign data set and trained with only 10 images per class, our method achieves a test accuracy of 84.15%, outperforming the state of the art by 29.80%.
Tasks
Published 2019-08-31
URL https://arxiv.org/abs/1909.00114v1
PDF https://arxiv.org/pdf/1909.00114v1.pdf
PWC https://paperswithcode.com/paper/towards-learning-affine-invariant
Repo
Framework

Motion Equivariance OF Event-based Camera Data with the Temporal Normalization Transform

Title Motion Equivariance OF Event-based Camera Data with the Temporal Normalization Transform
Authors Ziyun Wang
Abstract In this work, we focus on using convolutional neural networks (CNNs) to perform object recognition on event data. In object recognition, it is important for a neural network to be robust to the variations of the data during testing. For traditional cameras, translations are well handled because CNNs are naturally equivariant to translations. However, because event cameras record the change of light intensity of an image, the geometric shape of event volumes will not only depend on the objects but also on their relative motions with respect to the camera. The deformation of the events caused by motions causes the CNN to be less robust to unseen motions during inference. To address this problem, we would like to explore the equivariance property of CNNs, a well-studied property under which certain transformations of the input image produce predictable deformations of the features.
Tasks Object Recognition
Published 2019-11-28
URL https://arxiv.org/abs/1911.12801v1
PDF https://arxiv.org/pdf/1911.12801v1.pdf
PWC https://paperswithcode.com/paper/motion-equivariance-of-event-based-camera
Repo
Framework

Interactive Learning for Identifying Relevant Tweets to Support Real-time Situational Awareness

Title Interactive Learning for Identifying Relevant Tweets to Support Real-time Situational Awareness
Authors Luke S. Snyder, Yi-Shan Lin, Morteza Karimzadeh, Dan Goldwasser, David S. Ebert
Abstract Various domain users are increasingly leveraging real-time social media data to gain rapid situational awareness. However, due to the high noise in the deluge of data, effectively determining semantically relevant information can be difficult, further complicated by the changing definition of relevancy by each end user for different events. The majority of existing methods for short text relevance classification fail to incorporate users’ knowledge into the classification process. Existing methods that incorporate interactive user feedback focus on historical datasets. Therefore, classifiers cannot be interactively retrained for specific events or user-dependent needs in real-time. This limits real-time situational awareness, as streaming data that is incorrectly classified cannot be corrected immediately, permitting the possibility for important incoming data to be incorrectly classified as well. We present a novel interactive learning framework to improve the classification process in which the user iteratively corrects the relevancy of tweets in real-time to train the classification model on-the-fly for immediate predictive improvements. We computationally evaluate our classification model adapted to learn at interactive rates. Our results show that our approach outperforms state-of-the-art machine learning models. In addition, we integrate our framework with the extended Social Media Analytics and Reporting Toolkit (SMART) 2.0 system, allowing the use of our interactive learning framework within a visual analytics system tailored for real-time situational awareness. To demonstrate our framework’s effectiveness, we provide domain expert feedback from first responders who used the extended SMART 2.0 system.
Tasks
Published 2019-08-01
URL https://arxiv.org/abs/1908.02588v2
PDF https://arxiv.org/pdf/1908.02588v2.pdf
PWC https://paperswithcode.com/paper/interactive-learning-for-identifying-relevant
Repo
Framework