Paper Group ANR 1290
Electricity Load Forecasting – An Evaluation of Simple 1D-CNN Network Structures. Learning to Map Nearly Anything. Spatial-Winograd Pruning Enabling Sparse Winograd Convolution. Capacity of the covariance perceptron. Data Selection for Short Term load forecasting. 3D Semantic Scene Completion from a Single Depth Image using Adversarial Training. F …
Electricity Load Forecasting – An Evaluation of Simple 1D-CNN Network Structures
Title | Electricity Load Forecasting – An Evaluation of Simple 1D-CNN Network Structures |
Authors | Christian Lang, Florian Steinborn, Oliver Steffens, Elmar W. Lang |
Abstract | This paper presents a convolutional neural network (CNN) which can be used for forecasting electricity load profiles 36 hours into the future. In contrast to well-established CNN architectures, the input data is one-dimensional. A parameter scanning of network parameters is conducted in order to gain information about the influence of the kernel size, number of filters, and dense size. The results show that a good forecast quality can already be achieved with basic CNN architectures. The method works not only for smooth sum loads of many hundred consumers, but also for the load of apartment buildings. |
Tasks | Load Forecasting |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11536v1 |
https://arxiv.org/pdf/1911.11536v1.pdf | |
PWC | https://paperswithcode.com/paper/electricity-load-forecasting-an-evaluation-of |
Repo | |
Framework | |
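Illustrative sketch (not from the paper): the one-dimensional convolution that such a network builds on can be written in a few lines. The function name `conv1d_valid`, the sample load values, and the averaging kernel are hypothetical; a trained network would learn its kernels and stack further layers on top.

```python
def conv1d_valid(series, kernel):
    """Cross-correlate a 1D series with a kernel (no padding, stride 1),
    which is what CNN layers conventionally call a convolution."""
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]

# Hypothetical hourly load values and a 3-tap averaging kernel.
load = [3.0, 6.0, 9.0, 6.0, 3.0]
smoothed = conv1d_valid(load, [1 / 3, 1 / 3, 1 / 3])
```

The output has `len(series) - len(kernel) + 1` entries, so a larger kernel size shortens the feature map unless padding is added.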
Learning to Map Nearly Anything
Title | Learning to Map Nearly Anything |
Authors | Tawfiq Salem, Connor Greenwell, Hunter Blanton, Nathan Jacobs |
Abstract | Looking at the world from above, it is possible to estimate many properties of a given location, including the type of land cover and the expected land use. Historically, such tasks have relied on relatively coarse-grained categories due to the difficulty of obtaining fine-grained annotations. In this work, we propose an easily extensible approach that makes it possible to estimate fine-grained properties from overhead imagery. In particular, we propose a cross-modal distillation strategy to learn to predict the distribution of fine-grained properties from overhead imagery, without requiring any manual annotation of overhead imagery. We show that our learned models can be used directly for applications in mapping and image localization. |
Tasks | |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.06928v1 |
https://arxiv.org/pdf/1909.06928v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-map-nearly-anything |
Repo | |
Framework | |
Spatial-Winograd Pruning Enabling Sparse Winograd Convolution
Title | Spatial-Winograd Pruning Enabling Sparse Winograd Convolution |
Authors | Jiecao Yu, Jongsoo Park, Maxim Naumov |
Abstract | Deep convolutional neural networks (CNNs) are deployed in various applications but demand immense computational requirements. Pruning techniques and Winograd convolution are two typical methods to reduce the CNN computation. However, they cannot be directly combined because Winograd transformation fills in the sparsity resulting from pruning. Li et al. (2017) propose sparse Winograd convolution in which weights are directly pruned in the Winograd domain, but this technique is not very practical because Winograd-domain retraining requires low learning rates and hence significantly longer training time. Besides, Liu et al. (2018) move the ReLU function into the Winograd domain, which can help increase the weight sparsity but requires changes in the network structure. To achieve a high Winograd-domain weight sparsity without changing network structures, we propose a new pruning method, spatial-Winograd pruning. As the first step, spatial-domain weights are pruned in a structured way, which efficiently transfers the spatial-domain sparsity into the Winograd domain and avoids Winograd-domain retraining. For the next step, we also perform pruning and retraining directly in the Winograd domain but propose to use an importance factor matrix to adjust weight importance and weight gradients. This adjustment makes it possible to effectively retrain the pruned Winograd-domain network without changing the network structure. For the three models on the datasets of CIFAR-10, CIFAR-100, and ImageNet, our proposed method can achieve Winograd-domain sparsities of 63%, 50%, and 74%, respectively. |
Tasks | |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02132v1 |
http://arxiv.org/pdf/1901.02132v1.pdf | |
PWC | https://paperswithcode.com/paper/spatial-winograd-pruning-enabling-sparse |
Repo | |
Framework | |
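Illustrative sketch (not from the paper): the statement that the Winograd transformation fills in pruning sparsity can be checked on a single kernel. The code uses the standard kernel transform matrix G for Winograd F(2x2, 3x3); the pruned kernel and the helper names are hypothetical.

```python
# Kernel transform matrix G for Winograd F(2x2, 3x3).
G = [
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5],
    [0.0, 0.0, 1.0],
]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def winograd_kernel(g):
    """Map a 3x3 spatial kernel to its 4x4 Winograd-domain form G g G^T."""
    return matmul(matmul(G, g), transpose(G))

def count_nonzero(m):
    return sum(1 for row in m for v in row if v != 0.0)

# Hypothetical pruned kernel: only one of nine weights survives.
pruned = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
u = winograd_kernel(pruned)
```

Here one nonzero spatial weight out of 9 becomes 9 nonzero Winograd-domain weights out of 16, which is why unstructured spatial pruning alone does not carry over.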
Capacity of the covariance perceptron
Title | Capacity of the covariance perceptron |
Authors | David Dahmen, Matthieu Gilson, Moritz Helias |
Abstract | The classical perceptron is a simple neural network that performs a binary classification by a linear mapping between static inputs and outputs and application of a threshold. For small inputs, neural networks in a stationary state also perform an effectively linear input-output transformation, but of an entire time series. Choosing the temporal mean of the time series as the feature for classification, the linear transformation of the network with subsequent thresholding is equivalent to the classical perceptron. Here we show that choosing covariances of time series as the feature for classification maps the neural network to what we call a ‘covariance perceptron’; a bilinear mapping between covariances. By extending Gardner’s theory of connections to this bilinear problem, using a replica symmetric mean-field theory, we compute the pattern and information capacities of the covariance perceptron in the infinite-size limit. Closed-form expressions reveal superior pattern capacity in the binary classification task compared to the classical perceptron in the case of a high-dimensional input and low-dimensional output. For less convergent networks, the mean perceptron classifies a larger number of stimuli. However, since covariances span a much larger input and output space than means, the amount of stored information in the covariance perceptron exceeds the classical counterpart. For strongly convergent connectivity it is superior by a factor equal to the number of input neurons. Theoretical calculations are validated numerically for finite size systems using a gradient-based optimization of a soft-margin, as well as numerical solvers for the NP hard quadratically constrained quadratic programming problem, to which training can be mapped. |
Tasks | Time Series |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00824v1 |
https://arxiv.org/pdf/1912.00824v1.pdf | |
PWC | https://paperswithcode.com/paper/capacity-of-the-covariance-perceptron |
Repo | |
Framework | |
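Illustrative note (notation assumed, not taken from the paper): for a linear network with weight matrix W, input time series x(t), and output y(t), the mean is mapped linearly while the covariance is mapped bilinearly, because W enters twice. This is the sense in which classifying on covariances leads to a bilinear problem.

```latex
y(t) = W\,x(t), \qquad
\bar{y} = W\,\bar{x}, \qquad
Q = \operatorname{Cov}[y] = W\,P\,W^{\top}, \quad P = \operatorname{Cov}[x]
```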
Data Selection for Short Term load forecasting
Title | Data Selection for Short Term load forecasting |
Authors | Nestor Pereira, Miguel Angel Hombrados Herrera, Vanessa Gómez-Verdejo, Andrea A. Mammoli, Manel Martínez-Ramón |
Abstract | Power load forecasting with machine learning is a fairly mature application of artificial intelligence, and it is indispensable in operation, control and planning. Data selection techniques have hardly been used in this application. However, such techniques could be beneficial, given that the assumption that the data are identically distributed is clearly not true in load forecasting; the data are instead cyclostationary. In this work we present a fully automatic methodology to determine which data are most adequate to train a predictor which is based on a full Bayesian probabilistic model. We assess the performance of the method with experiments based on real, publicly available data recorded over several years in the United States of America. |
Tasks | Load Forecasting |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.01759v2 |
https://arxiv.org/pdf/1909.01759v2.pdf | |
PWC | https://paperswithcode.com/paper/data-selection-for-short-term-load |
Repo | |
Framework | |
3D Semantic Scene Completion from a Single Depth Image using Adversarial Training
Title | 3D Semantic Scene Completion from a Single Depth Image using Adversarial Training |
Authors | Yueh-Tung Chen, Martin Garbade, Juergen Gall |
Abstract | We address the task of 3D semantic scene completion, i.e., given a single depth image, we predict the semantic labels and occupancy of voxels in a 3D grid representing the scene. In light of the recently introduced generative adversarial networks (GAN), our goal is to explore the potential of this model and the efficiency of various important design choices. Our results show that using conditional GANs outperforms the vanilla GAN setup. We evaluate these architecture designs on several datasets. Based on our experiments, we demonstrate that GANs are able to outperform a baseline 3D CNN in the case of clean annotations, but they suffer from poorly aligned annotations. |
Tasks | |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06231v1 |
https://arxiv.org/pdf/1905.06231v1.pdf | |
PWC | https://paperswithcode.com/paper/3d-semantic-scene-completion-from-a-single |
Repo | |
Framework | |
Feeling Anxious? Perceiving Anxiety in Tweets using Machine Learning
Title | Feeling Anxious? Perceiving Anxiety in Tweets using Machine Learning |
Authors | Dritjon Gruda, Souleiman Hasan |
Abstract | This study provides a predictive measurement tool to examine perceived anxiety from a longitudinal perspective, using a non-intrusive machine learning approach to scale human rating of anxiety in microblogs. Results suggest that our chosen machine learning approach depicts perceived user state-anxiety fluctuations over time, as well as mean trait anxiety. We further find a reverse relationship between perceived anxiety and outcomes such as social engagement and popularity. Implications on the individual, organizational, and societal levels are discussed. |
Tasks | |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06959v1 |
https://arxiv.org/pdf/1909.06959v1.pdf | |
PWC | https://paperswithcode.com/paper/feeling-anxious-perceiving-anxiety-in-tweets |
Repo | |
Framework | |
LoadCNN: A Low Training Cost Deep Learning Model for Day-Ahead Individual Residential Load Forecasting
Title | LoadCNN: A Low Training Cost Deep Learning Model for Day-Ahead Individual Residential Load Forecasting |
Authors | Yunyou Huang, Nana Wang, Wanling Gao, Xiaoxu Guo, Cheng Huang, Tianshu Hao, Jianfeng Zhan |
Abstract | Accurate day-ahead individual residential load forecasting is of great importance to various applications of the smart grid in the day-ahead market. Deep learning, as a powerful machine learning technology, has shown great advantages and promising application in load forecasting tasks. However, deep learning is a computationally hungry method and requires high costs (e.g., time, energy and CO2 emissions) to train a model, which aggravates the energy crisis and places a substantial burden on the environment. As a consequence, deep learning methods are difficult to popularize and apply in a real smart grid environment. In this paper, we propose a low training cost model based on a convolutional neural network, namely LoadCNN, for next-day load forecasting of individual residents. The experiments show that the training time of LoadCNN is only approximately 1/54 of that of other state-of-the-art models, and that its energy consumption and CO2 emissions are only approximately 1/45 of those of other state-of-the-art models based on the same indicators. Meanwhile, the prediction accuracy of our model is equal to that of current state-of-the-art models, making LoadCNN the first load forecasting model to simultaneously achieve high prediction accuracy and low training costs. LoadCNN is an efficient green model that can be deployed quickly, cost-effectively and in an environmentally friendly way in a realistic smart grid environment. |
Tasks | Load Forecasting |
Published | 2019-08-01 |
URL | https://arxiv.org/abs/1908.00298v3 |
https://arxiv.org/pdf/1908.00298v3.pdf | |
PWC | https://paperswithcode.com/paper/loadcnn-a-efficient-green-deep-learning-model |
Repo | |
Framework | |
Hierarchical Document Encoder for Parallel Corpus Mining
Title | Hierarchical Document Encoder for Parallel Corpus Mining |
Authors | Mandy Guo, Yinfei Yang, Keith Stevens, Daniel Cer, Heming Ge, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil |
Abstract | We explore using multilingual document embeddings for nearest neighbor mining of parallel data. Three document-level representations are investigated: (i) document embeddings generated by simply averaging multilingual sentence embeddings; (ii) a neural bag-of-words (BoW) document encoding model; (iii) a hierarchical multilingual document encoder (HiDE) that builds on our sentence-level model. The results show document embeddings derived from sentence-level averaging are surprisingly effective for clean datasets, but suggest models trained hierarchically at the document-level are more effective on noisy data. Analysis experiments demonstrate our hierarchical models are very robust to variations in the underlying sentence embedding quality. Using document embeddings trained with HiDE achieves state-of-the-art performance on United Nations (UN) parallel document mining, 94.9% P@1 for en-fr and 97.3% P@1 for en-es. |
Tasks | Parallel Corpus Mining, Sentence Embedding, Sentence Embeddings |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08401v2 |
https://arxiv.org/pdf/1906.08401v2.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-document-encoder-for-parallel |
Repo | |
Framework | |
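Illustrative sketch (not from the paper): the simplest of the three representations, averaging sentence embeddings and mining by nearest neighbor, together with the P@1 metric, might look as follows. All function names and the toy 2-dimensional embeddings are hypothetical; real sentence embeddings would come from a multilingual encoder.

```python
import math

def average(vectors):
    """Document embedding as the mean of its sentence embeddings."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def precision_at_1(src_docs, tgt_docs):
    """Fraction of source documents whose nearest target is the true pair.

    Assumes src_docs[i] and tgt_docs[i] are translations of each other and
    that each document is a list of sentence embedding vectors.
    """
    src = [average(d) for d in src_docs]
    tgt = [average(d) for d in tgt_docs]
    hits = 0
    for i, s in enumerate(src):
        best = max(range(len(tgt)), key=lambda j: cosine(s, tgt[j]))
        hits += int(best == i)
    return hits / len(src)

# Toy data: two document pairs with made-up 2-d sentence embeddings.
src_docs = [[[1.0, 0.0], [0.8, 0.2]], [[0.0, 1.0], [0.1, 0.9]]]
tgt_docs = [[[0.9, 0.1]], [[0.2, 0.8]]]
score = precision_at_1(src_docs, tgt_docs)
```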
Efficient Cross-Validation of Echo State Networks
Title | Efficient Cross-Validation of Echo State Networks |
Authors | Mantas Lukoševičius, Arnas Uselis |
Abstract | Echo State Networks (ESNs) are known for their fast and precise one-shot learning of time series. But they often need good hyper-parameter tuning for best performance. For this, good validation is key, but usually a single validation split is used. In this rather practical contribution we suggest several schemes for cross-validating ESNs and introduce an efficient algorithm for implementing them. The component that dominates the time complexity of the already quite fast ESN training remains constant (does not scale up with $k$) in our proposed method of doing $k$-fold cross-validation. The component that does scale linearly with $k$ starts dominating only in some not very common situations. Thus in many situations $k$-fold cross-validation of ESNs can be done for virtually the same time complexity as a simple single split validation. Space complexity can also remain the same. We also discuss when the proposed validation schemes for ESNs could be beneficial and empirically investigate them on several different real-world datasets. |
Tasks | One-Shot Learning, Time Series |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08450v1 |
https://arxiv.org/pdf/1908.08450v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-cross-validation-of-echo-state |
Repo | |
Framework | |
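Illustrative sketch (an assumption about how such a scheme can work, not the authors' code): with a ridge-regression readout, the expensive sums over the data can be computed once per fold and reused, so that each of the k splits only needs a subtraction and a cheap solve. To stay short, the sketch uses a scalar state in place of the reservoir state matrix; the function name and data are hypothetical.

```python
def kfold_ridge_weights(xs, ys, k, reg=1e-8):
    """Return one ridge weight per fold, each trained on the other k-1 folds.

    xs, ys: equal-length lists of scalar states and targets (a scalar
    stand-in for reservoir states, to keep the sketch short).
    """
    n = len(xs)
    bounds = [(i * n // k, (i + 1) * n // k) for i in range(k)]
    # Per-fold sufficient statistics, computed in a single pass over the data.
    xx = [sum(x * x for x in xs[a:b]) for a, b in bounds]
    xy = [sum(x * y for x, y in zip(xs[a:b], ys[a:b])) for a, b in bounds]
    total_xx, total_xy = sum(xx), sum(xy)
    # Training statistics for each split = total minus the held-out fold.
    return [(total_xy - xy[i]) / (total_xx - xx[i] + reg) for i in range(k)]

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # exactly y = 2x
weights = kfold_ridge_weights(xs, ys, k=2)
```

Only the final division (a linear solve in the matrix case) is repeated k times; the pass over the data is not.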
Weakly Supervised Universal Fracture Detection in Pelvic X-rays
Title | Weakly Supervised Universal Fracture Detection in Pelvic X-rays |
Authors | Yirui Wang, Le Lu, Chi-Tung Cheng, Dakai Jin, Adam P. Harrison, Jing Xiao, Chien-Hung Liao, Shun Miao |
Abstract | Hip and pelvic fractures are serious injuries with life-threatening complications. However, diagnostic errors of fractures in pelvic X-rays (PXRs) are very common, driving the demand for computer-aided diagnosis (CAD) solutions. A major challenge lies in the fact that fractures are localized patterns that require localized analyses. Unfortunately, the PXRs residing in hospital picture archiving and communication systems do not typically specify regions of interest (ROIs). In this paper, we propose a two-stage hip and pelvic fracture detection method that executes localized fracture classification using weakly supervised ROI mining. The first stage uses a large capacity fully-convolutional network, i.e., deep with high levels of abstraction, in a multiple instance learning setting to automatically mine probable true positive and definite hard negative ROIs from the whole PXR in the training data. The second stage trains a smaller capacity model, i.e., shallower and more generalizable, with the mined ROIs to perform localized analyses to classify fractures. During inference, our method detects hip and pelvic fractures in one pass by chaining the probability outputs of the two stages together. We evaluate our method on 4,410 PXRs, reporting an area under the ROC curve value of 0.975, the highest among state-of-the-art fracture detection methods. Moreover, we show that our two-stage approach can perform comparably to human physicians (even outperforming emergency physicians and surgeons) in a preliminary reader study of 23 readers. |
Tasks | Multiple Instance Learning |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.02077v1 |
https://arxiv.org/pdf/1909.02077v1.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-universal-fracture |
Repo | |
Framework | |
Medium-Term Load Forecasting Using Support Vector Regression, Feature Selection, and Symbiotic Organism Search Optimization
Title | Medium-Term Load Forecasting Using Support Vector Regression, Feature Selection, and Symbiotic Organism Search Optimization |
Authors | Arghavan Zare-Noghabi, Morteza Shabanzadeh, Hossein Sangrody |
Abstract | Accurate load forecasting has always been an indispensable part of the operation and planning of power systems. Among the different time horizons of forecasting, short-term load forecasting (STLF) and long-term load forecasting (LTLF) have respectively benefited from accurate predictors and probabilistic forecasting, while medium-term load forecasting (MTLF) demands more attention due to its vital role in power system operation and planning, such as optimal scheduling of generation units, robust planning programs for customer service, and economic supply. In this study, a hybrid method composed of Support Vector Regression (SVR) and the Symbiotic Organism Search Optimization (SOSO) method is proposed for MTLF. In the proposed forecasting model, SVR is the main part of the forecasting algorithm, while SOSO is embedded into it to optimize the parameters of SVR. In addition, a minimum redundancy-maximum relevance feature selection algorithm is used in the preprocessing of input data. The proposed method is tested on the EUNITE competition dataset to demonstrate its performance. Furthermore, it is compared with some previous works to show the suitability of our method. |
Tasks | Feature Selection, Load Forecasting |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04818v1 |
https://arxiv.org/pdf/1906.04818v1.pdf | |
PWC | https://paperswithcode.com/paper/medium-term-load-forecasting-using-support |
Repo | |
Framework | |
ASYNC: A Cloud Engine with Asynchrony and History for Distributed Machine Learning
Title | ASYNC: A Cloud Engine with Asynchrony and History for Distributed Machine Learning |
Authors | Saeed Soori, Bugra Can, Mert Gurbuzbalaban, Maryam Mehri Dehnavi |
Abstract | ASYNC is a framework that supports the implementation of asynchrony and history for optimization methods on distributed computing platforms. The popularity of asynchronous optimization methods has increased in distributed machine learning. However, their applicability and practical experimentation on distributed systems are limited because current bulk-processing cloud engines do not provide robust support for asynchrony and history. By introducing three main modules and bookkeeping system-specific and application parameters, ASYNC provides practitioners with a framework to implement asynchronous machine learning methods. To demonstrate ease of implementation in ASYNC, the synchronous and asynchronous variants of two well-known optimization methods, stochastic gradient descent and SAGA, are implemented in ASYNC. |
Tasks | |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08526v4 |
https://arxiv.org/pdf/1907.08526v4.pdf | |
PWC | https://paperswithcode.com/paper/async-asynchronous-machine-learning-on |
Repo | |
Framework | |
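Illustrative sketch (single-machine and serial, not ASYNC's API): SAGA is an example of a method that needs history, because each update combines a fresh gradient with the last gradient stored for the same sample. The toy objective, function name, and parameter values below are hypothetical.

```python
import random

def saga(b, step=1 / 3, iters=2000, seed=0):
    """Minimise the average of f_i(x) = 0.5 * (x - b[i])**2 with SAGA."""
    rng = random.Random(seed)
    n = len(b)
    x = 0.0
    table = [x - bi for bi in b]           # history: last gradient per sample
    table_mean = sum(table) / n
    for _ in range(iters):
        j = rng.randrange(n)
        g = x - b[j]                       # fresh gradient of the sampled term
        x -= step * (g - table[j] + table_mean)
        table_mean += (g - table[j]) / n   # keep the running mean in sync
        table[j] = g                       # overwrite the stored gradient
    return x

x_star = saga([1.0, 2.0, 3.0, 6.0])        # the minimiser is the mean, 3.0
```

In a distributed setting the gradient table is exactly the state that workers must be able to read and update out of step with one another.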
An introduction to decentralized stochastic optimization with gradient tracking
Title | An introduction to decentralized stochastic optimization with gradient tracking |
Authors | Ran Xin, Soummya Kar, Usman A. Khan |
Abstract | Decentralized solutions to finite-sum minimization are of significant importance in many signal processing, control, and machine learning applications. In such settings, the data is distributed over a network of arbitrarily-connected nodes and raw data sharing is prohibitive often due to communication or privacy constraints. In this article, we review decentralized stochastic first-order optimization methods and illustrate some recent improvements based on gradient tracking and variance reduction, focusing particularly on smooth and strongly-convex objective functions. We provide intuitive illustrations of the main technical ideas as well as applications of the algorithms in the context of decentralized training of machine learning models. |
Tasks | Stochastic Optimization |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09648v2 |
https://arxiv.org/pdf/1907.09648v2.pdf | |
PWC | https://paperswithcode.com/paper/decentralized-stochastic-first-order-methods |
Repo | |
Framework | |
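Illustrative sketch (not from the article): a basic gradient tracking iteration, written here with exact local gradients rather than stochastic ones to keep it deterministic. Each node mixes its estimate with its neighbours and maintains a tracker y of the network-average gradient. The quadratic local objectives, mixing matrix, and step size are hypothetical choices.

```python
def gradient_tracking(b, W, alpha=0.1, iters=200):
    """Each node i holds f_i(x) = 0.5 * (x - b[i])**2 and communicates only
    through the mixing matrix W (rows and columns each sum to 1)."""
    n = len(b)

    def grad(i, xi):
        return xi - b[i]                   # exact local gradient

    x = [0.0] * n
    y = [grad(i, x[i]) for i in range(n)]  # tracker starts at local gradient
    for _ in range(iters):
        x_new = [sum(W[i][j] * x[j] for j in range(n)) - alpha * y[i]
                 for i in range(n)]
        # Mix the trackers, then correct by the change in local gradient.
        y = [sum(W[i][j] * y[j] for j in range(n))
             + grad(i, x_new[i]) - grad(i, x[i]) for i in range(n)]
        x = x_new
    return x

W = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.25, 0.25, 0.5]]
b = [1.0, 2.0, 6.0]
x_final = gradient_tracking(b, W)          # every node should approach 3.0
```

No node ever sees another node's data b[j]; only estimates and trackers are exchanged, which matches the setting where raw data sharing is prohibitive.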
Unsupervised Temperature Scaling: An Unsupervised Post-Processing Calibration Method of Deep Networks
Title | Unsupervised Temperature Scaling: An Unsupervised Post-Processing Calibration Method of Deep Networks |
Authors | Azadeh Sadat Mozafari, Hugo Siqueira Gomes, Wilson Leão, Christian Gagné |
Abstract | The strong performance of deep learning is undeniable, with impressive results over a wide range of tasks. However, the output confidence of these models is usually not well-calibrated, which can be an issue for applications where confidence in the decisions is central to providing trust and reliability (e.g., autonomous driving or medical diagnosis). For models using softmax at the last layer, Temperature Scaling (TS) is a state-of-the-art calibration method, with low time and memory complexity as well as demonstrated effectiveness. TS relies on a T parameter to rescale and calibrate values of the softmax layer, whose value is computed from a labelled dataset. We propose an Unsupervised Temperature Scaling (UTS) approach, which does not depend on labelled samples to calibrate the model. This allows, for example, the use of part of the test samples to calibrate the pre-trained model before going into inference mode. We provide theoretical justifications for UTS and assess its effectiveness on a wide range of deep models and datasets. We also demonstrate calibration results of UTS on skin lesion detection, a problem where a well-calibrated output can play an important role for accurate decision-making. |
Tasks | Autonomous Driving, Calibration, Decision Making, Medical Diagnosis |
Published | 2019-05-01 |
URL | https://arxiv.org/abs/1905.00174v3 |
https://arxiv.org/pdf/1905.00174v3.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-temperature-scaling-post |
Repo | |
Framework | |
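Illustrative sketch (not from the paper): the rescaling step that temperature scaling applies to the softmax layer. It shows only how a given T changes the output confidence; how UTS chooses T without labels is not reproduced here. The function name and logits are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / T; T > 1 softens, T < 1 sharpens the output."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 1.0, 0.0]                   # hypothetical network outputs
p_raw = softmax_with_temperature(logits, 1.0)
p_cal = softmax_with_temperature(logits, 2.0)
```

Dividing by T does not change which class has the highest probability, so accuracy is unaffected; only the confidence attached to the prediction moves.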