January 27, 2020

2988 words 15 mins read

Paper Group ANR 1290

Electricity Load Forecasting – An Evaluation of Simple 1D-CNN Network Structures. Learning to Map Nearly Anything. Spatial-Winograd Pruning Enabling Sparse Winograd Convolution. Capacity of the covariance perceptron. Data Selection for Short Term load forecasting. 3D Semantic Scene Completion from a Single Depth Image using Adversarial Training. F …

Electricity Load Forecasting – An Evaluation of Simple 1D-CNN Network Structures

Title Electricity Load Forecasting – An Evaluation of Simple 1D-CNN Network Structures
Authors Christian Lang, Florian Steinborn, Oliver Steffens, Elmar W. Lang
Abstract This paper presents a convolutional neural network (CNN) which can be used for forecasting electricity load profiles 36 hours into the future. In contrast to well-established CNN architectures, the input data is one-dimensional. A scan of network parameters is conducted in order to gain information about the influence of the kernel size, the number of filters, and the dense-layer size. The results show that a good forecast quality can already be achieved with basic CNN architectures. The method works not only for smooth sum loads of many hundred consumers, but also for the load of apartment buildings.
Tasks Load Forecasting
Published 2019-11-26
URL https://arxiv.org/abs/1911.11536v1
PDF https://arxiv.org/pdf/1911.11536v1.pdf
PWC https://paperswithcode.com/paper/electricity-load-forecasting-an-evaluation-of
Repo
Framework
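
The paper's exact architecture and data pipeline are not given here, but a minimal 1D-CNN forecaster along these lines can be sketched in Keras. The history length, layer layout, and toy data are assumptions; `kernel_size`, `n_filters`, and `dense_size` correspond to the scanned parameters.

```python
# Minimal sketch of a 1D-CNN load forecaster (not the authors' exact architecture).
# Assumes hourly load history as input and a 36-hour-ahead profile as output.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

history_len = 7 * 24   # one week of hourly loads (assumption)
horizon = 36           # forecast 36 hours ahead, as in the paper

def build_model(kernel_size=3, n_filters=32, dense_size=64):
    """Basic 1D-CNN; kernel_size, n_filters and dense_size are the scanned parameters."""
    model = keras.Sequential([
        layers.Input(shape=(history_len, 1)),
        layers.Conv1D(n_filters, kernel_size, activation="relu"),
        layers.MaxPooling1D(2),
        layers.Conv1D(n_filters, kernel_size, activation="relu"),
        layers.Flatten(),
        layers.Dense(dense_size, activation="relu"),
        layers.Dense(horizon),          # one output value per forecast hour
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Toy data just to show the expected shapes.
X = np.random.rand(256, history_len, 1)
y = np.random.rand(256, horizon)
build_model().fit(X, y, epochs=2, batch_size=32, verbose=0)
```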

Learning to Map Nearly Anything

Title Learning to Map Nearly Anything
Authors Tawfiq Salem, Connor Greenwell, Hunter Blanton, Nathan Jacobs
Abstract Looking at the world from above, it is possible to estimate many properties of a given location, including the type of land cover and the expected land use. Historically, such tasks have relied on relatively coarse-grained categories due to the difficulty of obtaining fine-grained annotations. In this work, we propose an easily extensible approach that makes it possible to estimate fine-grained properties from overhead imagery. In particular, we propose a cross-modal distillation strategy to learn to predict the distribution of fine-grained properties from overhead imagery, without requiring any manual annotation of overhead imagery. We show that our learned models can be used directly for applications in mapping and image localization.
Tasks
Published 2019-09-16
URL https://arxiv.org/abs/1909.06928v1
PDF https://arxiv.org/pdf/1909.06928v1.pdf
PWC https://paperswithcode.com/paper/learning-to-map-nearly-anything
Repo
Framework
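
A cross-modal distillation objective of this kind can be sketched as follows: a frozen teacher predicts a label distribution from co-located ground-level imagery, and the overhead student is trained to match it with a KL-divergence loss. The models, image sizes, and number of fine-grained properties below are hypothetical stand-ins, not the authors' networks.

```python
# Cross-modal distillation sketch (hypothetical models and shapes, not the paper's code).
# A frozen "teacher" predicts a distribution over fine-grained properties from
# co-located ground-level imagery; the overhead "student" learns to match it.
import torch
import torch.nn.functional as F
from torch import nn

n_classes = 50  # number of fine-grained properties (assumption)

teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, n_classes)).eval()
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, n_classes))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

ground_img = torch.rand(8, 3, 64, 64)    # ground-level view (teacher input)
overhead_img = torch.rand(8, 3, 64, 64)  # co-located overhead view (student input)

with torch.no_grad():
    target = F.softmax(teacher(ground_img), dim=1)       # soft labels, no manual annotation

log_pred = F.log_softmax(student(overhead_img), dim=1)
loss = F.kl_div(log_pred, target, reduction="batchmean")  # distillation loss
loss.backward()
opt.step()
```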

Spatial-Winograd Pruning Enabling Sparse Winograd Convolution

Title Spatial-Winograd Pruning Enabling Sparse Winograd Convolution
Authors Jiecao Yu, Jongsoo Park, Maxim Naumov
Abstract Deep convolutional neural networks (CNNs) are deployed in various applications but have immense computational requirements. Pruning techniques and Winograd convolution are two typical methods to reduce CNN computation. However, they cannot be directly combined because the Winograd transformation fills in the sparsity resulting from pruning. Li et al. (2017) propose sparse Winograd convolution, in which weights are directly pruned in the Winograd domain, but this technique is not very practical because Winograd-domain retraining requires low learning rates and hence significantly longer training time. In addition, Liu et al. (2018) move the ReLU function into the Winograd domain, which can help increase the weight sparsity but requires changes in the network structure. To achieve a high Winograd-domain weight sparsity without changing network structures, we propose a new pruning method, spatial-Winograd pruning. As the first step, spatial-domain weights are pruned in a structured way, which efficiently transfers the spatial-domain sparsity into the Winograd domain and avoids Winograd-domain retraining. For the next step, we also perform pruning and retraining directly in the Winograd domain but propose to use an importance factor matrix to adjust weight importance and weight gradients. This adjustment makes it possible to effectively retrain the pruned Winograd-domain network without changing the network structure. For three models on the CIFAR-10, CIFAR-100, and ImageNet datasets, our proposed method can achieve Winograd-domain sparsities of 63%, 50%, and 74%, respectively.
Tasks
Published 2019-01-08
URL http://arxiv.org/abs/1901.02132v1
PDF http://arxiv.org/pdf/1901.02132v1.pdf
PWC https://paperswithcode.com/paper/spatial-winograd-pruning-enabling-sparse
Repo
Framework
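
To make the Winograd-domain pruning idea concrete, the sketch below maps a 3x3 kernel into the F(2x2, 3x3) Winograd domain via $G W G^T$ and zeroes the least important transformed weights, with a placeholder importance-factor matrix scaling the magnitudes. The threshold, importance values, and pruning criterion are illustrative assumptions rather than the paper's exact procedure.

```python
# Sketch of Winograd-domain weight pruning for a 3x3 kernel, F(2x2, 3x3) transform.
# The importance-factor adjustment is only indicated schematically; values are illustrative.
import numpy as np

# Winograd weight-transform matrix G for F(2x2, 3x3).
G = np.array([
    [1.0,  0.0, 0.0],
    [0.5,  0.5, 0.5],
    [0.5, -0.5, 0.5],
    [0.0,  0.0, 1.0],
])

def to_winograd(w):
    """Map a 3x3 spatial kernel to its 4x4 Winograd-domain representation."""
    return G @ w @ G.T

def prune_winograd(w_wino, importance, sparsity=0.6):
    """Zero the least important Winograd-domain weights (importance-scaled magnitude)."""
    score = np.abs(w_wino) * importance            # importance factors adjust the ranking
    thresh = np.quantile(score, sparsity)
    return np.where(score >= thresh, w_wino, 0.0)

w = np.random.randn(3, 3)                  # spatial-domain kernel
importance = np.ones((4, 4))               # placeholder importance-factor matrix
w_pruned = prune_winograd(to_winograd(w), importance)
print("Winograd-domain sparsity:", np.mean(w_pruned == 0.0))
```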

Capacity of the covariance perceptron

Title Capacity of the covariance perceptron
Authors David Dahmen, Matthieu Gilson, Moritz Helias
Abstract The classical perceptron is a simple neural network that performs a binary classification by a linear mapping between static inputs and outputs and application of a threshold. For small inputs, neural networks in a stationary state also perform an effectively linear input-output transformation, but of an entire time series. Choosing the temporal mean of the time series as the feature for classification, the linear transformation of the network with subsequent thresholding is equivalent to the classical perceptron. Here we show that choosing covariances of time series as the feature for classification maps the neural network to what we call a ‘covariance perceptron’: a bilinear mapping between covariances. By extending Gardner’s theory of connections to this bilinear problem, using a replica symmetric mean-field theory, we compute the pattern and information capacities of the covariance perceptron in the infinite-size limit. Closed-form expressions reveal superior pattern capacity in the binary classification task compared to the classical perceptron in the case of a high-dimensional input and low-dimensional output. For less convergent networks, the mean perceptron classifies a larger number of stimuli. However, since covariances span a much larger input and output space than means, the amount of stored information in the covariance perceptron exceeds the classical counterpart. For strongly convergent connectivity it is superior by a factor equal to the number of input neurons. Theoretical calculations are validated numerically for finite-size systems using a gradient-based optimization of a soft-margin, as well as numerical solvers for the NP-hard quadratically constrained quadratic programming problem, to which training can be mapped.
Tasks Time Series
Published 2019-12-02
URL https://arxiv.org/abs/1912.00824v1
PDF https://arxiv.org/pdf/1912.00824v1.pdf
PWC https://paperswithcode.com/paper/capacity-of-the-covariance-perceptron
Repo
Framework
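
The bilinear mapping at the heart of the covariance perceptron, $Q = W P W^T$ followed by thresholding, can be illustrated directly; the dimensions, readout weights, and decision rule below are arbitrary choices for demonstration only.

```python
# Covariance perceptron sketch: a bilinear map from input to output covariances,
# followed by thresholding for binary classification. Sizes and threshold are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 10, 2                      # high-dimensional input, low-dimensional output
W = rng.standard_normal((n_out, n_in))   # readout weights

# Input covariance pattern P (here estimated from a random time series of length T).
T = 500
x = rng.standard_normal((T, n_in))
P = np.cov(x, rowvar=False)

# Bilinear mapping between covariances: Q = W P W^T.
Q = W @ P @ W.T

# Classify by the sign of the off-diagonal output covariance (one feature per output pair).
label = 1 if Q[0, 1] > 0 else -1
print("predicted class:", label)
```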

Data Selection for Short Term load forecasting

Title Data Selection for Short Term load forecasting
Authors Nestor Pereira, Miguel Angel Hombrados Herrera, Vanesssa Gómez-Verdejo, Andrea A. Mammoli, Manel Martínez-Ramón
Abstract Power load forecasting with machine learning is a fairly mature application of artificial intelligence, and it is indispensable in operation, control, and planning. Data selection techniques have hardly been used in this application. However, such techniques could be beneficial, since the assumption that the data is identically distributed clearly does not hold in load forecasting, where the data is instead cyclostationary. In this work we present a fully automatic methodology, based on a full Bayesian probabilistic model, to determine which data are most adequate for training a predictor. We assess the performance of the method with experiments on real, publicly available data recorded over several years in the United States of America.
Tasks Load Forecasting
Published 2019-09-02
URL https://arxiv.org/abs/1909.01759v2
PDF https://arxiv.org/pdf/1909.01759v2.pdf
PWC https://paperswithcode.com/paper/data-selection-for-short-term-load
Repo
Framework

3D Semantic Scene Completion from a Single Depth Image using Adversarial Training

Title 3D Semantic Scene Completion from a Single Depth Image using Adversarial Training
Authors Yueh-Tung Chen, Martin Garbade, Juergen Gall
Abstract We address the task of 3D semantic scene completion, i.e., given a single depth image, we predict the semantic labels and occupancy of voxels in a 3D grid representing the scene. In light of the recently introduced generative adversarial networks (GAN), our goal is to explore the potential of this model and the efficiency of various important design choices. Our results show that using conditional GANs outperforms the vanilla GAN setup. We evaluate these architecture designs on several datasets. Based on our experiments, we demonstrate that GANs are able to outperform a baseline 3D CNN in case of clean annotations, but they suffer from poorly aligned annotations.
Tasks
Published 2019-05-15
URL https://arxiv.org/abs/1905.06231v1
PDF https://arxiv.org/pdf/1905.06231v1.pdf
PWC https://paperswithcode.com/paper/3d-semantic-scene-completion-from-a-single
Repo
Framework
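
A conditional-GAN setup of the kind evaluated here can be sketched with toy 3D convolutional networks: the discriminator is conditioned by concatenating the input volume with a real or predicted semantic voxel grid along the channel axis. The network sizes, class count, and grid resolution below are placeholder assumptions, not the paper's architectures.

```python
# Conditional-GAN sketch for voxel-wise scene completion (toy shapes, not the paper's nets).
# The discriminator sees the input volume concatenated with a (real or predicted) label volume.
import torch
from torch import nn

n_classes = 12                      # semantic classes incl. empty space (assumption)
D = 16                              # toy voxel grid resolution

generator = nn.Sequential(          # maps the input volume to per-voxel class scores
    nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv3d(16, n_classes, 3, padding=1),
)
discriminator = nn.Sequential(      # conditioned on the input volume via channel concat
    nn.Conv3d(1 + n_classes, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, 1),
)

x = torch.rand(2, 1, D, D, D)                     # volume derived from the depth image
fake = torch.softmax(generator(x), dim=1)         # predicted semantic occupancy
d_fake = discriminator(torch.cat([x, fake], dim=1))
adv_loss = nn.functional.binary_cross_entropy_with_logits(
    d_fake, torch.ones_like(d_fake))              # generator's adversarial term
print("adversarial loss:", adv_loss.item())
```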

Feeling Anxious? Perceiving Anxiety in Tweets using Machine Learning

Title Feeling Anxious? Perceiving Anxiety in Tweets using Machine Learning
Authors Dritjon Gruda, Souleiman Hasan
Abstract This study provides a predictive measurement tool to examine perceived anxiety from a longitudinal perspective, using a non-intrusive machine learning approach to scale human rating of anxiety in microblogs. Results suggest that our chosen machine learning approach depicts perceived user state-anxiety fluctuations over time, as well as mean trait anxiety. We further find a reverse relationship between perceived anxiety and outcomes such as social engagement and popularity. Implications on the individual, organizational, and societal levels are discussed.
Tasks
Published 2019-09-13
URL https://arxiv.org/abs/1909.06959v1
PDF https://arxiv.org/pdf/1909.06959v1.pdf
PWC https://paperswithcode.com/paper/feeling-anxious-perceiving-anxiety-in-tweets
Repo
Framework
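
The abstract does not specify the features or model, so the sketch below is only a generic illustration of scaling human anxiety ratings from tweet text, using TF-IDF features and ridge regression; the example tweets and ratings are invented.

```python
# Generic sketch of scaling human anxiety ratings from tweet text
# (TF-IDF + ridge regression; the paper's actual features and model may differ).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

tweets = ["so worried about tomorrow", "great day at the beach", "can't stop stressing"]
ratings = [0.8, 0.1, 0.9]     # human-rated perceived anxiety (toy values)

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
model.fit(tweets, ratings)
print(model.predict(["feeling anxious about the meeting"]))
```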

LoadCNN: A Low Training Cost Deep Learning Model for Day-Ahead Individual Residential Load Forecasting

Title LoadCNN: A Low Training Cost Deep Learning Model for Day-Ahead Individual Residential Load Forecasting
Authors Yunyou Huang, Nana Wang, Wanling Gao, Xiaoxu Guo, Cheng Huang, Tianshu Hao, Jianfeng Zhan
Abstract Accurate day-ahead individual residential load forecasting is of great importance to various applications of the smart grid on the day-ahead market. Deep learning, as a powerful machine learning technology, has shown great advantages and promising applications in load forecasting tasks. However, deep learning is a computationally hungry method and requires high costs (e.g., time, energy, and CO2 emissions) to train a model, which aggravates the energy crisis and places a substantial burden on the environment. As a consequence, deep learning methods are difficult to popularize and apply in real smart grid environments. In this paper, we propose a low-training-cost model based on a convolutional neural network, namely LoadCNN, for next-day load forecasting of individual residents. The experiments show that the training time of LoadCNN is only approximately 1/54 of that of other state-of-the-art models, and its energy consumption and CO2 emissions are only approximately 1/45 of those of other state-of-the-art models on the same indicators. Meanwhile, the prediction accuracy of our model is equal to that of current state-of-the-art models, making LoadCNN the first load forecasting model to simultaneously achieve high prediction accuracy and low training cost. LoadCNN is an efficient, green model that can be deployed quickly, cost-effectively, and in an environmentally friendly way in a realistic smart grid environment.
Tasks Load Forecasting
Published 2019-08-01
URL https://arxiv.org/abs/1908.00298v3
PDF https://arxiv.org/pdf/1908.00298v3.pdf
PWC https://paperswithcode.com/paper/loadcnn-a-efficient-green-deep-learning-model
Repo
Framework

Hierarchical Document Encoder for Parallel Corpus Mining

Title Hierarchical Document Encoder for Parallel Corpus Mining
Authors Mandy Guo, Yinfei Yang, Keith Stevens, Daniel Cer, Heming Ge, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil
Abstract We explore using multilingual document embeddings for nearest neighbor mining of parallel data. Three document-level representations are investigated: (i) document embeddings generated by simply averaging multilingual sentence embeddings; (ii) a neural bag-of-words (BoW) document encoding model; (iii) a hierarchical multilingual document encoder (HiDE) that builds on our sentence-level model. The results show document embeddings derived from sentence-level averaging are surprisingly effective for clean datasets, but suggest models trained hierarchically at the document level are more effective on noisy data. Analysis experiments demonstrate our hierarchical models are very robust to variations in the underlying sentence embedding quality. Using document embeddings trained with HiDE achieves state-of-the-art performance on United Nations (UN) parallel document mining, 94.9% P@1 for en-fr and 97.3% P@1 for en-es.
Tasks Parallel Corpus Mining, Sentence Embedding, Sentence Embeddings
Published 2019-06-20
URL https://arxiv.org/abs/1906.08401v2
PDF https://arxiv.org/pdf/1906.08401v2.pdf
PWC https://paperswithcode.com/paper/hierarchical-document-encoder-for-parallel
Repo
Framework
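
Variant (i) above, document embeddings obtained by averaging sentence embeddings and then mined by nearest-neighbor search, can be sketched with random vectors standing in for multilingual sentence embeddings; the corpus sizes and the P@1 check against an identity alignment are purely illustrative.

```python
# Sketch of parallel-document mining with averaged sentence embeddings (variant (i) above).
# Embeddings here are random stand-ins for multilingual sentence embeddings.
import numpy as np

rng = np.random.default_rng(0)
dim = 512

def doc_embedding(sentence_embeddings):
    """Average sentence embeddings, then L2-normalize to get a document vector."""
    v = np.mean(sentence_embeddings, axis=0)
    return v / np.linalg.norm(v)

# Toy corpora: 100 English docs and 100 French docs, 5-20 sentences each.
en_docs = [doc_embedding(rng.standard_normal((rng.integers(5, 20), dim))) for _ in range(100)]
fr_docs = [doc_embedding(rng.standard_normal((rng.integers(5, 20), dim))) for _ in range(100)]

# Nearest-neighbor mining by cosine similarity (dot product of unit vectors).
sim = np.stack(en_docs) @ np.stack(fr_docs).T
matches = sim.argmax(axis=1)          # for each English doc, its best French candidate
print("P@1 against an identity alignment:", np.mean(matches == np.arange(100)))
```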

Efficient Cross-Validation of Echo State Networks

Title Efficient Cross-Validation of Echo State Networks
Authors Mantas Lukoševičius, Arnas Uselis
Abstract Echo State Networks (ESNs) are known for their fast and precise one-shot learning of time series. But they often need good hyper-parameter tuning for best performance. For this, good validation is key, but usually a single validation split is used. In this rather practical contribution we suggest several schemes for cross-validating ESNs and introduce an efficient algorithm for implementing them. In our proposed method of doing $k$-fold cross-validation, the component that dominates the time complexity of the already quite fast ESN training remains constant (does not scale up with $k$). The component that does scale linearly with $k$ starts dominating only in some not very common situations. Thus, in many situations, $k$-fold cross-validation of ESNs can be done for virtually the same time complexity as a simple single-split validation. Space complexity can also remain the same. We also discuss when the proposed validation schemes for ESNs could be beneficial and empirically investigate them on several different real-world datasets.
Tasks One-Shot Learning, Time Series
Published 2019-08-22
URL https://arxiv.org/abs/1908.08450v1
PDF https://arxiv.org/pdf/1908.08450v1.pdf
PWC https://paperswithcode.com/paper/efficient-cross-validation-of-echo-state
Repo
Framework
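
The idea that the dominant cost need not scale with $k$ can be illustrated with a ridge-regression readout: collect the states and accumulate per-fold Gram matrices once, then obtain each fold's training matrices by subtraction. In the sketch below the reservoir is replaced by a random state matrix, and the sizes and ridge parameter are assumptions.

```python
# Sketch of k-fold cross-validation of a (ridge-regression) ESN readout where the
# expensive state collection and Gram-matrix accumulation are done only once.
# The reservoir itself is replaced by a fixed random state matrix for brevity.
import numpy as np

rng = np.random.default_rng(0)
T, n_states, n_out, k, ridge = 1000, 100, 1, 5, 1e-6

X = rng.standard_normal((n_states, T))    # collected reservoir states (dominant cost, done once)
Y = rng.standard_normal((n_out, T))       # teacher outputs

folds = np.array_split(np.arange(T), k)
# Per-fold contributions to the Gram matrices; their totals give the full-data matrices.
XXT = [X[:, f] @ X[:, f].T for f in folds]
XYT = [Y[:, f] @ X[:, f].T for f in folds]
XXT_all, XYT_all = sum(XXT), sum(XYT)

for i, f in enumerate(folds):
    # Training matrices for fold i come from subtraction, not from re-collecting states.
    A = XXT_all - XXT[i] + ridge * np.eye(n_states)
    B = XYT_all - XYT[i]
    W_out = B @ np.linalg.inv(A)
    val_mse = np.mean((W_out @ X[:, f] - Y[:, f]) ** 2)
    print(f"fold {i}: validation MSE = {val_mse:.4f}")
```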

Weakly Supervised Universal Fracture Detection in Pelvic X-rays

Title Weakly Supervised Universal Fracture Detection in Pelvic X-rays
Authors Yirui Wang, Le Lu, Chi-Tung Cheng, Dakai Jin, Adam P. Harrison, Jing Xiao, Chien-Hung Liao, Shun Miao
Abstract Hip and pelvic fractures are serious injuries with life-threatening complications. However, diagnostic errors of fractures in pelvic X-rays (PXRs) are very common, driving the demand for computer-aided diagnosis (CAD) solutions. A major challenge lies in the fact that fractures are localized patterns that require localized analyses. Unfortunately, the PXRs residing in hospital picture archiving and communication systems do not typically specify regions of interest. In this paper, we propose a two-stage hip and pelvic fracture detection method that executes localized fracture classification using weakly supervised ROI mining. The first stage uses a large-capacity fully-convolutional network, i.e., deep with high levels of abstraction, in a multiple instance learning setting to automatically mine probable true positive and definite hard negative ROIs from the whole PXR in the training data. The second stage trains a smaller-capacity model, i.e., shallower and more generalizable, with the mined ROIs to perform localized analyses to classify fractures. During inference, our method detects hip and pelvic fractures in one pass by chaining the probability outputs of the two stages together. We evaluate our method on 4,410 PXRs, reporting an area under the ROC curve of 0.975, the highest among state-of-the-art fracture detection methods. Moreover, we show that our two-stage approach can perform comparably to human physicians (even outperforming emergency physicians and surgeons) in a preliminary reader study with 23 readers.
Tasks Multiple Instance Learning
Published 2019-09-04
URL https://arxiv.org/abs/1909.02077v1
PDF https://arxiv.org/pdf/1909.02077v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-universal-fracture
Repo
Framework
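
The MIL-style ROI mining step can be sketched as follows: a fully convolutional scorer produces a map of ROI scores, the image-level prediction is the maximum score, and the top-scoring ROIs are mined as probable positives (in positive images) or hard negatives (in negative images). The scorer, image size, and top-k value below are toy assumptions, not the paper's networks.

```python
# Sketch of MIL-style ROI mining: score candidate ROIs with a fully convolutional model,
# take the max as the image prediction, and mine the top-scoring ROIs as probable
# positives (positive images) or hard negatives (negative images). Toy stand-ins only.
import torch
from torch import nn

scorer = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                       nn.Conv2d(8, 1, 1))          # fully convolutional score map

def mine_rois(image, image_label, top_k=3):
    score_map = scorer(image.unsqueeze(0)).squeeze()     # (H, W) grid of ROI scores
    flat = score_map.flatten()
    image_score = flat.max()                             # MIL: image prediction = max ROI score
    top = flat.topk(top_k).indices                       # candidate ROIs to mine
    kind = "probable positives" if image_label == 1 else "hard negatives"
    return image_score, top, kind

xray = torch.rand(1, 128, 128)
score, rois, kind = mine_rois(xray, image_label=1)
print(f"image score {score.item():.3f}; mined {kind} at flat indices {rois.tolist()}")
```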

Medium-Term Load Forecasting Using Support Vector Regression, Feature Selection, and Symbiotic Organism Search Optimization

Title Medium-Term Load Forecasting Using Support Vector Regression, Feature Selection, and Symbiotic Organism Search Optimization
Authors Arghavan Zare-Noghabi, Morteza Shabanzadeh, Hossein Sangrody
Abstract Accurate load forecasting has always been one of the indispensable parts of power system operation and planning. Among the different forecasting horizons, short-term load forecasting (STLF) and long-term load forecasting (LTLF) benefit respectively from accurate predictors and probabilistic forecasting, while medium-term load forecasting (MTLF) demands more attention due to its vital role in power system operation and planning, such as optimal scheduling of generation units, robust planning of customer service, and economic supply. In this study, a hybrid method composed of Support Vector Regression (SVR) and Symbiotic Organism Search Optimization (SOSO) is proposed for MTLF. In the proposed forecasting model, SVR is the main part of the forecasting algorithm, while SOSO is embedded into it to optimize the parameters of SVR. In addition, a minimum redundancy-maximum relevance feature selection algorithm is used in the preprocessing of the input data. The proposed method is tested on the EUNITE competition dataset to demonstrate its performance. Furthermore, it is compared with previous works to show the merit of our method.
Tasks Feature Selection, Load Forecasting
Published 2019-06-11
URL https://arxiv.org/abs/1906.04818v1
PDF https://arxiv.org/pdf/1906.04818v1.pdf
PWC https://paperswithcode.com/paper/medium-term-load-forecasting-using-support
Repo
Framework
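
A pipeline in the spirit of the proposed method can be sketched with scikit-learn, with mutual-information ranking standing in for minimum redundancy-maximum relevance and a plain grid search standing in for symbiotic organism search; the feature matrix and load target below are synthetic.

```python
# Sketch of an SVR-based MTLF pipeline: feature selection followed by SVR whose
# hyper-parameters are tuned by a search procedure. A plain grid search stands in
# for symbiotic organism search, and mutual-information ranking for mRMR.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 20))                    # candidate features (weather, calendar, lags)
y = X[:, 0] * 2 + X[:, 3] + rng.normal(0, 0.1, 300)   # toy load target

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(mutual_info_regression, k=5)),
    ("svr", SVR(kernel="rbf")),
])
search = GridSearchCV(pipe, {"svr__C": [1, 10, 100], "svr__gamma": ["scale", 0.01, 0.1]}, cv=5)
search.fit(X, y)
print("best params:", search.best_params_, "CV R^2:", round(search.best_score_, 3))
```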

ASYNC: A Cloud Engine with Asynchrony and History for Distributed Machine Learning

Title ASYNC: A Cloud Engine with Asynchrony and History for Distributed Machine Learning
Authors Saeed Soori, Bugra Can, Mert Gurbuzbalaban, Maryam Mehri Dehnavi
Abstract ASYNC is a framework that supports the implementation of asynchrony and history for optimization methods on distributed computing platforms. The popularity of asynchronous optimization methods has increased in distributed machine learning. However, their applicability and practical experimentation on distributed systems are limited because current bulk-processing cloud engines do not provide robust support for asynchrony and history. By introducing three main modules and bookkeeping of system-specific and application parameters, ASYNC provides practitioners with a framework to implement asynchronous machine learning methods. To demonstrate the ease of implementation in ASYNC, the synchronous and asynchronous variants of two well-known optimization methods, stochastic gradient descent and SAGA, are implemented in ASYNC.
Tasks
Published 2019-07-19
URL https://arxiv.org/abs/1907.08526v4
PDF https://arxiv.org/pdf/1907.08526v4.pdf
PWC https://paperswithcode.com/paper/async-asynchronous-machine-learning-on
Repo
Framework
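
ASYNC itself targets bulk-processing cloud engines, but the core idea of asynchronous updates with bookkeeping of their staleness ("history") can be illustrated with a toy threaded SGD loop; the model, data, and staleness log below are assumptions for demonstration only.

```python
# Toy sketch of asynchronous SGD with bookkeeping of gradient staleness ("history").
# ASYNC itself targets distributed cloud engines; threads stand in for workers here.
import threading
import numpy as np

w = np.zeros(10)                       # shared model parameters
version = 0                            # global update counter
staleness_log = []                     # history: how stale each applied gradient was
lock = threading.Lock()

data_rng = np.random.default_rng(0)
X, y = data_rng.standard_normal((1000, 10)), data_rng.standard_normal(1000)

def worker(seed, n_steps=200, lr=0.01):
    global w, version
    rng = np.random.default_rng(seed)
    for _ in range(n_steps):
        w_local, v_local = w.copy(), version      # read the current model without waiting
        i = rng.integers(len(X))
        grad = (X[i] @ w_local - y[i]) * X[i]     # stochastic gradient on a possibly stale copy
        with lock:                                # apply the update and record its staleness
            staleness_log.append(version - v_local)
            w -= lr * grad
            version += 1

threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
[t.start() for t in threads]
[t.join() for t in threads]
print("final loss:", np.mean((X @ w - y) ** 2), "mean staleness:", np.mean(staleness_log))
```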

An introduction to decentralized stochastic optimization with gradient tracking

Title An introduction to decentralized stochastic optimization with gradient tracking
Authors Ran Xin, Soummya Kar, Usman A. Khan
Abstract Decentralized solutions to finite-sum minimization are of significant importance in many signal processing, control, and machine learning applications. In such settings, the data is distributed over a network of arbitrarily-connected nodes and raw data sharing is prohibitive often due to communication or privacy constraints. In this article, we review decentralized stochastic first-order optimization methods and illustrate some recent improvements based on gradient tracking and variance reduction, focusing particularly on smooth and strongly-convex objective functions. We provide intuitive illustrations of the main technical ideas as well as applications of the algorithms in the context of decentralized training of machine learning models.
Tasks Stochastic Optimization
Published 2019-07-23
URL https://arxiv.org/abs/1907.09648v2
PDF https://arxiv.org/pdf/1907.09648v2.pdf
PWC https://paperswithcode.com/paper/decentralized-stochastic-first-order-methods
Repo
Framework
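
A standard gradient-tracking iteration, $x^{k+1} = W x^k - \alpha y^k$ and $y^{k+1} = W y^k + \nabla f(x^{k+1}) - \nabla f(x^k)$, can be sketched for a decentralized least-squares problem on a ring network; the problem sizes, mixing matrix, and step size below are illustrative choices.

```python
# Sketch of decentralized gradient tracking for a quadratic finite-sum problem.
# Each node i holds f_i(x) = 0.5*||A_i x - b_i||^2; W is a doubly-stochastic mixing matrix.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim, alpha, iters = 4, 5, 0.005, 2000

A = [rng.standard_normal((20, dim)) for _ in range(n_nodes)]
b = [rng.standard_normal(20) for _ in range(n_nodes)]

def grad(i, x):
    return A[i].T @ (A[i] @ x - b[i])

# Mixing matrix for a ring: each node averages itself and its two neighbours.
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    W[i, i] = W[i, (i - 1) % n_nodes] = W[i, (i + 1) % n_nodes] = 1 / 3

x = np.zeros((n_nodes, dim))
y = np.array([grad(i, x[i]) for i in range(n_nodes)])   # gradient trackers

for _ in range(iters):
    x_new = W @ x - alpha * y                           # consensus step + tracked-gradient step
    y = W @ y + np.array([grad(i, x_new[i]) - grad(i, x[i]) for i in range(n_nodes)])
    x = x_new

x_star = np.linalg.lstsq(np.vstack(A), np.concatenate(b), rcond=None)[0]
print("distance to the centralized solution:", np.linalg.norm(x.mean(axis=0) - x_star))
```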

Unsupervised Temperature Scaling: An Unsupervised Post-Processing Calibration Method of Deep Networks

Title Unsupervised Temperature Scaling: An Unsupervised Post-Processing Calibration Method of Deep Networks
Authors Azadeh Sadat Mozafari, Hugo Siqueira Gomes, Wilson Leão, Christian Gagné
Abstract The great performance of deep learning is undeniable, with impressive results over a wide range of tasks. However, the output confidence of these models is usually not well-calibrated, which can be an issue for applications where confidence in the decisions is central to providing trust and reliability (e.g., autonomous driving or medical diagnosis). For models using softmax at the last layer, Temperature Scaling (TS) is a state-of-the-art calibration method, with low time and memory complexity as well as demonstrated effectiveness. TS relies on a temperature parameter T to rescale and calibrate the values of the softmax layer, with the value of T computed from a labelled dataset. We propose an Unsupervised Temperature Scaling (UTS) approach, which does not depend on labelled samples to calibrate the model and therefore allows, for example, using part of the test samples to calibrate the pre-trained model before going into inference mode. We provide theoretical justifications for UTS and assess its effectiveness on a wide range of deep models and datasets. We also demonstrate calibration results of UTS on skin lesion detection, a problem where a well-calibrated output can play an important role for accurate decision-making.
Tasks Autonomous Driving, Calibration, Decision Making, Medical Diagnosis
Published 2019-05-01
URL https://arxiv.org/abs/1905.00174v3
PDF https://arxiv.org/pdf/1905.00174v3.pdf
PWC https://paperswithcode.com/paper/unsupervised-temperature-scaling-post
Repo
Framework
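
For reference, the supervised Temperature Scaling baseline that UTS builds on can be sketched as fitting a single scalar T by minimizing the negative log-likelihood on a labelled calibration set; UTS replaces this labelled set with an unsupervised surrogate objective, which is not reproduced here. The toy logits and labels below are invented.

```python
# Sketch of standard (supervised) temperature scaling: fit a single scalar T by
# minimizing NLL on a held-out labelled set. UTS replaces this labelled set with an
# unsupervised surrogate objective; that objective is not reproduced here.
import torch

def fit_temperature(logits, labels, steps=200, lr=0.01):
    """Return T minimizing the negative log-likelihood of softmax(logits / T)."""
    log_t = torch.zeros(1, requires_grad=True)          # optimize log T to keep T positive
    opt = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.cross_entropy(logits / log_t.exp(), labels)
        loss.backward()
        opt.step()
    return log_t.exp().item()

# Toy over-confident model: large margins, yet wrong on roughly 20% of the samples.
n, k = 500, 10
labels = torch.randint(0, k, (n,))
logits = torch.randn(n, k)
logits[torch.arange(n), labels] += 5.0
flip = torch.rand(n) < 0.2
labels[flip] = torch.randint(0, k, (int(flip.sum()),))
print("fitted temperature:", round(fit_temperature(logits, labels), 2))
```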