Paper Group ANR 1640
Leveraging Deep Learning to Improve the Performance Predictability of Cloud Microservices
Title | Leveraging Deep Learning to Improve the Performance Predictability of Cloud Microservices |
Authors | Yu Gan, Yanqi Zhang, Kelvin Hu, Dailun Cheng, Yuan He, Meghna Pancholi, Christina Delimitrou |
Abstract | Performance unpredictability is a major roadblock towards cloud adoption, and has performance, cost, and revenue ramifications. Predictable performance is even more critical as cloud services transition from monolithic designs to microservices. Detecting QoS violations after they occur in systems with microservices results in long recovery times, as hotspots propagate and amplify across dependent services. We present Seer, an online cloud performance debugging system that leverages deep learning and the massive amount of tracing data cloud systems collect to learn spatial and temporal patterns that translate to QoS violations. Seer combines lightweight distributed RPC-level tracing, with detailed low-level hardware monitoring to signal an upcoming QoS violation, and diagnose the source of unpredictable performance. Once an imminent QoS violation is detected, Seer notifies the cluster manager to take action to avoid performance degradation altogether. We evaluate Seer both in local clusters, and in large-scale deployments of end-to-end applications built with microservices with hundreds of users. We show that Seer correctly anticipates QoS violations 91% of the time, and avoids the QoS violation to begin with in 84% of cases. Finally, we show that Seer can identify application-level design bugs, and provide insights on how to better architect microservices to achieve predictable performance. |
Tasks | |
Published | 2019-05-02 |
URL | https://arxiv.org/abs/1905.00968v1 |
https://arxiv.org/pdf/1905.00968v1.pdf | |
PWC | https://paperswithcode.com/paper/leveraging-deep-learning-to-improve-the |
Repo | |
Framework | |
Regularized Estimation and Feature Selection in Mixtures of Gaussian-Gated Experts Models
Title | Regularized Estimation and Feature Selection in Mixtures of Gaussian-Gated Experts Models |
Authors | Faïcel Chamroukhi, Florian Lecocq, Hien D. Nguyen |
Abstract | Mixtures-of-Experts (MoE) models and their maximum likelihood estimation (MLE) via the EM algorithm have been thoroughly studied in the statistics and machine learning literature. They are the subject of growing investigation in the context of modeling with high-dimensional predictors using regularized MLE. We examine MoE with a Gaussian gating network, for clustering and regression, and propose an $\ell_1$-regularized MLE to encourage sparse models and deal with the high-dimensional setting. We develop an EM-Lasso algorithm to perform parameter estimation and use a BIC-like criterion to select the model parameters, including the sparsity tuning hyperparameters. Experiments conducted on simulated data show the good performance of the proposed regularized MLE compared to the standard MLE with the EM algorithm. |
Tasks | Feature Selection |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05494v1 |
https://arxiv.org/pdf/1909.05494v1.pdf | |
PWC | https://paperswithcode.com/paper/regularized-estimation-and-feature-selection |
Repo | |
Framework | |
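The $\ell_1$ penalty mentioned in the abstract is usually handled with a soft-thresholding step, which shrinks coefficients toward zero and sets small ones exactly to zero. The sketch below shows only that building block. Whether the paper's EM-Lasso M-step uses it in this form is an assumption, and `soft_threshold` is a hypothetical name chosen for illustration.

```python
import numpy as np


def soft_threshold(z, lam):
    """Soft-thresholding operator: sign(z) * max(|z| - lam, 0).

    This is the closed-form minimiser of 0.5 * (x - z)**2 + lam * |x|,
    the one-dimensional lasso problem. Illustrative only; not taken
    from the paper.
    """
    z = np.asarray(z, dtype=float)
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)
```

Coefficients whose magnitude is below `lam` become exactly zero, which is how an $\ell_1$ penalty performs feature selection.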
EcoLens: Visual Analysis of Urban Region Dynamics Using Traffic Data
Title | EcoLens: Visual Analysis of Urban Region Dynamics Using Traffic Data |
Authors | Zhuochen Jin, Nan Cao, Yang Shi, Hanghang Tong, Yingcai Wu |
Abstract | The rapid development of urbanization during the past decades has significantly improved people’s lives but also introduced new challenges on effective functional urban planning and transportation management. The functional regions defined based on a static boundary rarely reflect an individual’s daily experience of the space in which they live and visit for a variety of purposes. Fortunately, the increasing availability of spatiotemporal data provides unprecedented opportunities for understanding the structure of an urban area in terms of people’s activity pattern and how they form the latent regions over time. These ecological regions, where people temporarily share a similar moving behavior during a short period of time, could provide insights into urban planning and smart-city services. However, existing solutions are limited in their capacity of capturing the evolutionary patterns of dynamic latent regions within urban context. In this work, we introduce an interactive visual analysis approach, EcoLens, that allows analysts to progressively explore and analyze the complex dynamic segmentation patterns of a city using traffic data. We propose an extended non-negative Matrix Factorization based algorithm smoothed over both spatial and temporal dimensions to capture the spatiotemporal dynamics of the city. The algorithm also ensures the orthogonality of its result to facilitate the interpretation of different patterns. A suite of visualizations is designed to illustrate the dynamics of city segmentation and the corresponding interactions are added to support the exploration of the segmentation patterns over time. We evaluate the effectiveness of our system via case studies using a real-world dataset and a qualitative interview with the domain expert. |
Tasks | |
Published | 2019-07-29 |
URL | https://arxiv.org/abs/1908.00181v1 |
https://arxiv.org/pdf/1908.00181v1.pdf | |
PWC | https://paperswithcode.com/paper/ecolens-visual-analysis-of-urban-region |
Repo | |
Framework | |
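For orientation, the sketch below shows plain non-negative matrix factorization with multiplicative updates for the Frobenius objective. It is not the extended algorithm from the abstract: the spatial and temporal smoothing and the orthogonality constraint are omitted, and the function name `nmf` and its parameters are assumptions made for illustration.

```python
import numpy as np


def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factorise a non-negative matrix V (n x m) as W @ H with W, H >= 0.

    Uses the standard multiplicative updates for the Frobenius norm.
    Returns W, H and the reconstruction error after each iteration.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    errors = []
    for _ in range(iters):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(float(np.linalg.norm(V - W @ H)))
    return W, H, errors
```

In a traffic setting, the rows of `V` could be regions and the columns time slots, so that each column of `W` describes one latent region pattern; that mapping is an assumption, not a detail given in the abstract.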
Automatic Detection and Classification of Cognitive Distortions in Mental Health Text
Title | Automatic Detection and Classification of Cognitive Distortions in Mental Health Text |
Authors | Benjamin Shickel, Scott Siegel, Martin Heesacker, Sherry Benton, Parisa Rashidi |
Abstract | In cognitive psychology, automatic and self-reinforcing irrational thought patterns are known as cognitive distortions. Left unchecked, patients exhibiting these types of thoughts can become stuck in negative feedback loops of unhealthy thinking, leading to inaccurate perceptions of reality commonly associated with anxiety and depression. In this paper, we present a machine learning framework for the automatic detection and classification of 15 common cognitive distortions in two novel mental health free text datasets collected from both crowdsourcing and a real-world online therapy program. When differentiating between distorted and non-distorted passages, our model achieved a weighted F1 score of 0.88. For classifying distorted passages into one of 15 distortion categories, our model yielded weighted F1 scores of 0.68 in the larger crowdsourced dataset and 0.45 in the smaller online counseling dataset, both of which outperformed random baseline metrics by a large margin. For both tasks, we also identified the most discriminative words and phrases between classes to highlight common thematic elements for improving targeted and therapist-guided mental health treatment. Furthermore, we performed an exploratory analysis using unsupervised content-based clustering and topic modeling algorithms as first efforts towards a data-driven perspective on the thematic relationship between similar cognitive distortions traditionally deemed unique. Finally, we highlight the difficulties in applying mental health-based machine learning in a real-world setting and comment on the implications and benefits of our framework for improving automated delivery of therapeutic treatment in conjunction with traditional cognitive-behavioral therapy. |
Tasks | |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07502v2 |
https://arxiv.org/pdf/1909.07502v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-detection-and-classification-of |
Repo | |
Framework | |
Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets
Title | Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets |
Authors | Nelson F. Liu, Roy Schwartz, Noah A. Smith |
Abstract | Several datasets have recently been constructed to expose brittleness in models trained on existing benchmarks. While model performance on these challenge datasets is significantly lower compared to the original benchmark, it is unclear what particular weaknesses they reveal. For example, a challenge dataset may be difficult because it targets phenomena that current models cannot capture, or because it simply exploits blind spots in a model’s specific training set. We introduce inoculation by fine-tuning, a new analysis method for studying challenge datasets by exposing models (the metaphorical patient) to a small amount of data from the challenge dataset (a metaphorical pathogen) and assessing how well they can adapt. We apply our method to analyze the NLI “stress tests” (Naik et al., 2018) and the Adversarial SQuAD dataset (Jia and Liang, 2017). We show that after slight exposure, some of these datasets are no longer challenging, while others remain difficult. Our results indicate that failures on challenge datasets may lead to very different conclusions about models, training datasets, and the challenge datasets themselves. |
Tasks | |
Published | 2019-04-04 |
URL | http://arxiv.org/abs/1904.02668v4 |
http://arxiv.org/pdf/1904.02668v4.pdf | |
PWC | https://paperswithcode.com/paper/inoculation-by-fine-tuning-a-method-for |
Repo | |
Framework | |
Regression with Uncertainty Quantification in Large Scale Complex Data
Title | Regression with Uncertainty Quantification in Large Scale Complex Data |
Authors | Nicholas Wilkins, Michael Johnson, Ifeoma Nwogu |
Abstract | While several methods for predicting uncertainty on deep networks have been recently proposed, they do not readily translate to large and complex datasets. In this paper we utilize a simplified form of the Mixture Density Networks (MDNs) to produce a one-shot approach to quantify uncertainty in regression problems. We show that our uncertainty bounds are on-par or better than other reported existing methods. When applied to standard regression benchmark datasets, we show an improvement in predictive log-likelihood and root-mean-square-error when compared to existing state-of-the-art methods. We also demonstrate this method’s efficacy on stochastic, highly volatile time-series data where stock prices are predicted for the next time interval. The resulting uncertainty graph summarizes significant anomalies in the stock price chart. Furthermore, we apply this method to the task of age estimation from the challenging IMDb-Wiki dataset of half a million face images. We successfully predict the uncertainties associated with the prediction and empirically analyze the underlying causes of the uncertainties. This uncertainty quantification can be used to pre-process low quality datasets and further enable learning. |
Tasks | Age Estimation, Time Series |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.02163v1 |
https://arxiv.org/pdf/1912.02163v1.pdf | |
PWC | https://paperswithcode.com/paper/regression-with-uncertainty-quantification-in |
Repo | |
Framework | |
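A mixture density network with a single Gaussian component predicts a mean and a variance per input and is trained by minimising the Gaussian negative log-likelihood. The sketch below shows that loss only. Treating the paper's "simplified form" as a single Gaussian is an assumption, and `gaussian_nll` is a hypothetical helper name.

```python
import numpy as np


def gaussian_nll(y, mu, log_var):
    """Mean negative log-likelihood of y under N(mu, exp(log_var)).

    Predicting the log-variance keeps the variance positive without
    an explicit constraint. Illustrative only; not the paper's code.
    """
    y, mu, log_var = (np.asarray(a, dtype=float) for a in (y, mu, log_var))
    per_point = 0.5 * (np.log(2 * np.pi) + log_var + (y - mu) ** 2 / np.exp(log_var))
    return float(np.mean(per_point))
```

The predicted standard deviation `exp(0.5 * log_var)` then serves as the per-sample uncertainty estimate.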
Classification of Neurodevelopmental Age in Normal Infants Using 3D-CNN based on Brain MRI
Title | Classification of Neurodevelopmental Age in Normal Infants Using 3D-CNN based on Brain MRI |
Authors | Mahdieh Shabanian, Eugene C. Eckstein, Hao Chen, John P. DeVincenzo |
Abstract | Human brain development is rapid during infancy and early childhood. Many disease processes impair this development. Therefore, brain developmental age estimation (BDAE) is essential for all diseases affecting cognitive development. Brain magnetic resonance imaging (MRI) of infants shows brain growth and morphologic patterns during childhood. Therefore, we can estimate the developmental age from brain images. However, MRI analysis is time-consuming because each scan contains millions of data points (voxels). We investigated the three-dimensional convolutional neural network (3D CNN), a deep learning algorithm, to rapidly classify neurodevelopmental age with high accuracy based on MRIs. MRIs from normal newborns were obtained from the National Institute of Mental Health (NIMH) Data Archive. Age categories of pediatric MRIs were 3 wks + 1 wk, 1 yr + 2 wks, and 3 yrs + 4 wks. We trained a BDAE method using T1, T2, and proton density (PD) images from MRI scans of 112 individuals using 3D CNN. Compared with the known age, our method has a sensitivity of 99% and specificity of 98.3%. Moreover, our 3D CNN model has better performance in neurodevelopmental age estimation than does 2D CNN. |
Tasks | Age Estimation |
Published | 2019-10-27 |
URL | https://arxiv.org/abs/1910.12159v2 |
https://arxiv.org/pdf/1910.12159v2.pdf | |
PWC | https://paperswithcode.com/paper/classification-of-neurodevelopmental-age-in |
Repo | |
Framework | |
A single target tracking algorithm based on Generative Adversarial Networks
Title | A single target tracking algorithm based on Generative Adversarial Networks |
Authors | Zhaofu Diao |
Abstract | In single target tracking, losing the tracked target because of occlusion is a ubiquitous and arduous problem. To solve this problem, we propose a single target tracking algorithm with anti-occlusion capability. The core of our algorithm is to use the Region Proposal Network to obtain the tracked target and potential interferences, and to use the occlusion awareness module to judge whether an interfering object occludes the target. If no occlusion occurs, tracking continues. If occlusion occurs, the prediction module is started, and the motion trajectory of the target in subsequent frames is predicted from its motion trajectory before occlusion. The result obtained by the prediction module replaces the target position feature obtained by the original tracking algorithm. This addresses the problem of occlusion causing the tracking algorithm to lose the target. In practice, our algorithm can successfully track the target in the occluded dataset. On the VOT2018 dataset, our algorithm has an EAO of 0.421, an Accuracy of 0.67, and a Robustness of 0.186. Compared with SiamRPN++, these increased by 1.69%, 11.67% and 9.3%, respectively. |
Tasks | |
Published | 2019-12-27 |
URL | https://arxiv.org/abs/1912.11967v1 |
https://arxiv.org/pdf/1912.11967v1.pdf | |
PWC | https://paperswithcode.com/paper/a-single-target-tracking-algorithm-based-on |
Repo | |
Framework | |
Hyperspectral Image Classification Based on Adaptive Sparse Deep Network
Title | Hyperspectral Image Classification Based on Adaptive Sparse Deep Network |
Authors | Jingwen Yan, Zixin Xie, Jingyao Chen, Yinan Liu, Lei Liu |
Abstract | Sparse models are widely used in hyperspectral image classification. However, the choice of sparsity and regularization parameters has a great influence on the classification results. In this paper, a novel adaptive sparse deep network based on a deep architecture is proposed, which can construct the optimal sparse representation and regularization parameters with a deep network. First, a data flow graph is designed to represent each update iteration of the Alternating Direction Method of Multipliers (ADMM) algorithm. The forward network and the back-propagation network are then derived. All parameters are updated by gradient descent during back-propagation. We then propose an Adaptive Sparse Deep Network. Compared with several traditional classifiers and other algorithms for sparse models, experimental results indicate that our method achieves a large improvement in HSI classification. |
Tasks | Hyperspectral Image Classification, Image Classification |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09405v1 |
https://arxiv.org/pdf/1910.09405v1.pdf | |
PWC | https://paperswithcode.com/paper/hyperspectral-image-classification-based-on |
Repo | |
Framework | |
Reservoir-size dependent learning in analogue neural networks
Title | Reservoir-size dependent learning in analogue neural networks |
Authors | Xavier Porte, Louis Andreoli, Maxime Jacquot, Laurent Larger, Daniel Brunner |
Abstract | The implementation of artificial neural networks in hardware substrates is a major interdisciplinary enterprise. Well suited candidates for physical implementations must combine nonlinear neurons with dedicated and efficient hardware solutions for both connectivity and training. Reservoir computing addresses the problems related with the network connectivity and training in an elegant and efficient way. However, important questions regarding impact of reservoir size and learning routines on the convergence-speed during learning remain unaddressed. Here, we study in detail the learning process of a recently demonstrated photonic neural network based on a reservoir. We use a greedy algorithm to train our neural network for the task of chaotic signals prediction and analyze the learning-error landscape. Our results unveil fundamental properties of the system’s optimization hyperspace. Particularly, we determine the convergence speed of learning as a function of reservoir size and find exceptional, close to linear scaling. This linear dependence, together with our parallel diffractive coupling, represent optimal scaling conditions for our photonic neural network scheme. |
Tasks | |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1908.08021v1 |
https://arxiv.org/pdf/1908.08021v1.pdf | |
PWC | https://paperswithcode.com/paper/reservoir-size-dependent-learning-in-analogue |
Repo | |
Framework | |
Unifying Causal Models with Trek Rules
Title | Unifying Causal Models with Trek Rules |
Authors | Shuyan Wang |
Abstract | In many scientific contexts, different investigators experiment with or observe different variables with data from a domain in which the distinct variable sets might well be related. This sort of fragmentation sometimes occurs in molecular biology, whether in studies of RNA expression or studies of protein interaction, and it is common in the social sciences. Models are built on the diverse data sets, but combining them can provide a more unified account of the causal processes in the domain. On the other hand, this problem is made challenging by the fact that a variable in one data set may influence variables in another although neither data set contains all of the variables involved. Several authors have proposed using conditional independence properties of fragmentary (marginal) data collections to form unified causal explanations when it is assumed that the data have a common causal explanation but cannot be merged to form a unified dataset. These methods typically return a large number of alternative causal models. The first part of the thesis shows that marginal datasets contain extra information that can be used to reduce the number of possible models, in some cases yielding a unique model. |
Tasks | |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.01789v1 |
https://arxiv.org/pdf/1909.01789v1.pdf | |
PWC | https://paperswithcode.com/paper/unifying-causal-models-with-trek-rules |
Repo | |
Framework | |
Probabilistic Load Forecasting via Point Forecast Feature Integration
Title | Probabilistic Load Forecasting via Point Forecast Feature Integration |
Authors | Qicheng Chang, Yishen Wang, Xiao Lu, Di Shi, Haifeng Li, Jiajun Duan, Zhiwei Wang |
Abstract | Short-term load forecasting is a critical element of power systems energy management systems. In recent years, probabilistic load forecasting (PLF) has gained increased attention for its ability to provide uncertainty information that helps to improve the reliability and economics of system operation performances. This paper proposes a two-stage probabilistic load forecasting framework by integrating point forecast as a key probabilistic forecasting feature into PLF. In the first stage, all related features are utilized to train a point forecast model and also obtain the feature importance. In the second stage the forecasting model is trained, taking into consideration point forecast features, as well as selected feature subsets. During the testing period of the forecast model, the final probabilistic load forecast results are leveraged to obtain both point forecasting and probabilistic forecasting. Numerical results obtained from ISO New England demand data demonstrate the effectiveness of the proposed approach in the hour-ahead load forecasting, which uses the gradient boosting regression for the point forecasting and quantile regression neural networks for the probabilistic forecasting. |
Tasks | Feature Importance, Load Forecasting |
Published | 2019-03-26 |
URL | http://arxiv.org/abs/1903.10684v1 |
http://arxiv.org/pdf/1903.10684v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-load-forecasting-via-point |
Repo | |
Framework | |
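The two-stage idea in the abstract can be illustrated with two small helpers: one appends the stage-one point forecast to the feature matrix, and one computes the quantile (pinball) loss that quantile regression models minimise. This is a minimal sketch under stated assumptions. The function names are hypothetical, and the gradient boosting and neural network models named in the abstract are not reproduced here.

```python
import numpy as np


def augment_with_point_forecast(X, point_forecast):
    """Append the stage-one point forecast as an extra feature column."""
    X = np.asarray(X, dtype=float)
    pf = np.asarray(point_forecast, dtype=float).reshape(-1, 1)
    return np.hstack([X, pf])


def pinball_loss(y_true, y_pred, q):
    """Mean quantile loss for quantile level q in (0, 1).

    Under-prediction is weighted by q and over-prediction by 1 - q,
    so minimising it pushes y_pred toward the q-th quantile.
    """
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(q * diff, (q - 1) * diff)))
```

Training one stage-two model per quantile level (for example 0.1, 0.5 and 0.9) on the augmented features would give a set of quantile forecasts that together describe the predictive distribution.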
Predicting Treatment Initiation from Clinical Time Series Data via Graph-Augmented Time-Sensitive Model
Title | Predicting Treatment Initiation from Clinical Time Series Data via Graph-Augmented Time-Sensitive Model |
Authors | Fan Zhang, Tong Wu, Yunlong Wang, Yong Cai, Cao Xiao, Emily Zhao, Lucas Glass, Jimeng Sun |
Abstract | Many computational models have been proposed to extract temporal patterns from clinical time series for each patient and among patient groups for predictive healthcare. However, the common relations among patients (e.g., sharing the same doctor) have rarely been considered. In this paper, we represent patient-clinician relations by bipartite graphs that capture, for example, from whom a patient gets a diagnosis. We then solve for the top eigenvectors of the graph Laplacian, and include the eigenvectors as latent representations of the similarity between patient-clinician pairs in a time-sensitive prediction model. We conducted experiments using real-world data to predict the initiation of first-line treatment for Chronic Lymphocytic Leukemia (CLL) patients. Results show that relational similarity can improve prediction over multiple baselines, for example a 5% improvement over a long short-term memory baseline in terms of area under the precision-recall curve. |
Tasks | Time Series |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.01099v1 |
https://arxiv.org/pdf/1907.01099v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-treatment-initiation-from-clinical |
Repo | |
Framework | |
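The graph step described in the abstract can be sketched as follows: build the adjacency matrix of the patient-clinician bipartite graph, form its Laplacian, and keep a few eigenvectors as node embeddings. The abstract does not say which end of the spectrum "top" refers to. The sketch takes the eigenvectors with the smallest eigenvalues of the unnormalised Laplacian, as is common in spectral embedding, and both choices are assumptions. `bipartite_spectral_embedding` is a hypothetical name.

```python
import numpy as np


def bipartite_spectral_embedding(B, dim=2):
    """Embed the nodes of a bipartite graph via Laplacian eigenvectors.

    B is an (n_patients, n_clinicians) 0/1 matrix with B[i, j] = 1 when
    patient i is linked to clinician j. Returns all eigenvalues in
    ascending order and a (n_patients + n_clinicians, dim) embedding.
    """
    B = np.asarray(B, dtype=float)
    n_p, n_c = B.shape
    # Adjacency matrix with patients first, clinicians second.
    A = np.block([[np.zeros((n_p, n_p)), B],
                  [B.T, np.zeros((n_c, n_c))]])
    L = np.diag(A.sum(axis=1)) - A      # unnormalised graph Laplacian
    vals, vecs = np.linalg.eigh(L)      # eigenvalues in ascending order
    return vals, vecs[:, :dim]
```

The first `n_patients` rows of the embedding could then be concatenated to each patient's time-series features before they enter the prediction model; how the paper combines them is not specified in the abstract.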
Nearest Neighbor Median Shift Clustering for Binary Data
Title | Nearest Neighbor Median Shift Clustering for Binary Data |
Authors | Gaël Beck, Tarn Duong, Mustapha Lebbah, Hanane Azzag |
Abstract | In this paper we describe the theory and practice behind a new modal clustering method for binary data. Our approach (BinNNMS) is based on the nearest neighbor median shift. The median shift is an extension of the well-known mean shift, which was designed for continuous data, to handle binary data. We demonstrate, with theoretical and experimental analyses, that BinNNMS can accurately discover the location of clusters in binary data. |
Tasks | |
Published | 2019-02-11 |
URL | http://arxiv.org/abs/1902.04181v1 |
http://arxiv.org/pdf/1902.04181v1.pdf | |
PWC | https://paperswithcode.com/paper/nearest-neighbor-median-shift-clustering-for |
Repo | |
Framework | |
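One plausible reading of a nearest neighbor median shift for binary data is sketched below: each point is repeatedly replaced by the component-wise median (a majority vote) of its k nearest neighbours under the Hamming distance, until nothing changes. This is an illustration of the idea and not the paper's BinNNMS. The choice to search neighbours among the original points, the tie rule, and the name `binary_median_shift` are all assumptions.

```python
import numpy as np


def binary_median_shift(X, k=5, max_iter=20):
    """Shift each binary point toward a local mode by majority vote.

    X is an (n, d) array of 0/1 values. Points that end on the same
    row after convergence can be treated as one cluster.
    """
    X = np.asarray(X, dtype=np.int8)
    Y = X.copy()
    for _ in range(max_iter):
        # Hamming distance from each current point to every original point.
        dist = (Y[:, None, :] != X[None, :, :]).sum(axis=2)
        nn = np.argsort(dist, axis=1, kind="stable")[:, :k]
        # Median of binary values = majority vote (ties resolve to 1).
        Y_new = (2 * X[nn].sum(axis=1) >= k).astype(np.int8)
        if np.array_equal(Y_new, Y):
            break
        Y = Y_new
    return Y
```

Using an odd `k` avoids ties in the majority vote.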
Don’t Jump Through Hoops and Remove Those Loops: SVRG and Katyusha are Better Without the Outer Loop
Title | Don’t Jump Through Hoops and Remove Those Loops: SVRG and Katyusha are Better Without the Outer Loop |
Authors | Dmitry Kovalev, Samuel Horvath, Peter Richtarik |
Abstract | The stochastic variance-reduced gradient method (SVRG) and its accelerated variant (Katyusha) have attracted enormous attention in the machine learning community in the last few years due to their superior theoretical properties and empirical behaviour on training supervised machine learning models via the empirical risk minimization paradigm. A key structural element in both of these methods is the inclusion of an outer loop at the beginning of which a full pass over the training data is made in order to compute the exact gradient, which is then used to construct a variance-reduced estimator of the gradient. In this work we design loopless variants of both of these methods. In particular, we remove the outer loop and replace its function by a coin flip performed in each iteration designed to trigger, with a small probability, the computation of the gradient. We prove that the new methods enjoy the same superior theoretical convergence properties as the original methods. However, we demonstrate through numerical experiments that our methods have substantially superior practical behavior. |
Tasks | |
Published | 2019-01-24 |
URL | https://arxiv.org/abs/1901.08689v2 |
https://arxiv.org/pdf/1901.08689v2.pdf | |
PWC | https://paperswithcode.com/paper/dont-jump-through-hoops-and-remove-those |
Repo | |
Framework | |
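The coin-flip mechanism described in the abstract can be sketched for a least-squares problem as follows. Each iteration takes one variance-reduced stochastic step, then refreshes the reference point and its full gradient with a small probability `p` instead of at the start of an outer loop. This is a minimal sketch, not the authors' implementation. The objective, the default `p = 1/n`, the step size and the name `loopless_svrg` are assumptions chosen for illustration.

```python
import numpy as np


def loopless_svrg(A, b, step=0.01, p=None, iters=5000, seed=0):
    """Minimise (1 / 2n) * ||A x - b||^2 with a loopless SVRG-style update."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    p = 1.0 / n if p is None else p     # probability of refreshing the reference

    def full_grad(x):
        return A.T @ (A @ x - b) / n

    x = np.zeros(d)
    w = x.copy()                        # reference point
    g_w = full_grad(w)                  # exact gradient at the reference point
    for _ in range(iters):
        i = rng.integers(n)
        a_i = A[i]
        # Variance-reduced estimator: grad_i(x) - grad_i(w) + full_grad(w).
        g = a_i * (a_i @ x - b[i]) - a_i * (a_i @ w - b[i]) + g_w
        x = x - step * g
        # Coin flip replaces the outer loop: refresh with probability p.
        if rng.random() < p:
            w = x.copy()
            g_w = full_grad(w)
    return x
```

With `p` near `1/n`, a full gradient is computed about once per `n` stochastic steps on average, which matches the cost of one pass per outer loop in standard SVRG.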