January 27, 2020

Paper Group ANR 1290

Electricity Load Forecasting – An Evaluation of Simple 1D-CNN Network Structures. Learning to Map Nearly Anything. Spatial-Winograd Pruning Enabling Sparse Winograd Convolution. Capacity of the covariance perceptron. Data Selection for Short Term load forecasting. 3D Semantic Scene Completion from a Single Depth Image using Adversarial Training. F …

Electricity Load Forecasting – An Evaluation of Simple 1D-CNN Network Structures


Title	Electricity Load Forecasting – An Evaluation of Simple 1D-CNN Network Structures
Authors	Christian Lang, Florian Steinborn, Oliver Steffens, Elmar W. Lang
Abstract	This paper presents a convolutional neural network (CNN) which can be used for forecasting electricity load profiles 36 hours into the future. In contrast to well-established CNN architectures, the input data is one-dimensional. A scan of the network parameters is conducted in order to gain information about the influence of the kernel size, number of filters, and dense size. The results show that a good forecast quality can already be achieved with basic CNN architectures. The method works not only for smooth sum loads of many hundred consumers, but also for the load of apartment buildings.
Tasks	Load Forecasting
Published	2019-11-26
URL	https://arxiv.org/abs/1911.11536v1
PDF	https://arxiv.org/pdf/1911.11536v1.pdf
PWC	https://paperswithcode.com/paper/electricity-load-forecasting-an-evaluation-of
Repo
Framework
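
To make the three scanned parameters concrete, here is a minimal sketch of a 1D-CNN forward pass in plain Python. It is not the authors' network: the layer layout (one convolution, one dense layer, a linear output), the one-week hourly input, and all function names are assumptions for illustration, and the weights are random rather than trained. It only shows how kernel size, number of filters, and dense size shape a model that emits 36 hourly values.

```python
import math
import random


def conv1d_relu(series, kernels):
    """Valid 1D convolution (cross-correlation) plus ReLU, one channel per kernel."""
    k = len(kernels[0])
    return [
        [max(0.0, sum(series[i + j] * kern[j] for j in range(k)))
         for i in range(len(series) - k + 1)]
        for kern in kernels
    ]


def dense(vector, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def forward(history, kernel_size, n_filters, dense_size, horizon, seed=0):
    """Conv -> flatten -> dense (ReLU) -> linear output of length `horizon`."""
    rng = random.Random(seed)
    feature_maps = conv1d_relu(history, random_matrix(n_filters, kernel_size, rng))
    flat = [v for channel in feature_maps for v in channel]
    hidden = dense(flat, random_matrix(dense_size, len(flat), rng), [0.0] * dense_size)
    hidden = [max(0.0, h) for h in hidden]
    return dense(hidden, random_matrix(horizon, dense_size, rng), [0.0] * horizon)


# Hypothetical input: one week of hourly load with a daily cycle.
history = [0.5 + 0.4 * math.sin(2 * math.pi * h / 24) for h in range(168)]
forecast = forward(history, kernel_size=5, n_filters=8, dense_size=32, horizon=36)
print(len(forecast))  # one value per forecast hour
```

A parameter scan in this setting would loop over candidate values of kernel_size, n_filters, and dense_size, train each variant, and compare forecast errors on held-out days.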

Learning to Map Nearly Anything


Title	Learning to Map Nearly Anything
Authors	Tawfiq Salem, Connor Greenwell, Hunter Blanton, Nathan Jacobs
Abstract	Looking at the world from above, it is possible to estimate many properties of a given location, including the type of land cover and the expected land use. Historically, such tasks have relied on relatively coarse-grained categories due to the difficulty of obtaining fine-grained annotations. In this work, we propose an easily extensible approach that makes it possible to estimate fine-grained properties from overhead imagery. In particular, we propose a cross-modal distillation strategy to learn to predict the distribution of fine-grained properties from overhead imagery, without requiring any manual annotation of overhead imagery. We show that our learned models can be used directly for applications in mapping and image localization.
Tasks
Published	2019-09-16
URL	https://arxiv.org/abs/1909.06928v1
PDF	https://arxiv.org/pdf/1909.06928v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-map-nearly-anything
Repo
Framework

Spatial-Winograd Pruning Enabling Sparse Winograd Convolution


Title	Spatial-Winograd Pruning Enabling Sparse Winograd Convolution
Authors	Jiecao Yu, Jongsoo Park, Maxim Naumov
Abstract	Deep convolutional neural networks (CNNs) are deployed in various applications but demand immense computational resources. Pruning techniques and Winograd convolution are two typical methods to reduce the CNN computation. However, they cannot be directly combined because Winograd transformation fills in the sparsity resulting from pruning. Li et al. (2017) propose sparse Winograd convolution in which weights are directly pruned in the Winograd domain, but this technique is not very practical because Winograd-domain retraining requires low learning rates and hence significantly longer training time. Besides, Liu et al. (2018) move the ReLU function into the Winograd domain, which can help increase the weight sparsity but requires changes in the network structure. To achieve a high Winograd-domain weight sparsity without changing network structures, we propose a new pruning method, spatial-Winograd pruning. As the first step, spatial-domain weights are pruned in a structured way, which efficiently transfers the spatial-domain sparsity into the Winograd domain and avoids Winograd-domain retraining. For the next step, we also perform pruning and retraining directly in the Winograd domain but propose to use an importance factor matrix to adjust weight importance and weight gradients. This adjustment makes it possible to effectively retrain the pruned Winograd-domain network without changing the network structure. For the three models on the datasets of CIFAR-10, CIFAR-100, and ImageNet, our proposed method can achieve Winograd-domain sparsities of 63%, 50%, and 74%, respectively.
Tasks
Published	2019-01-08
URL	http://arxiv.org/abs/1901.02132v1
PDF	http://arxiv.org/pdf/1901.02132v1.pdf
PWC	https://paperswithcode.com/paper/spatial-winograd-pruning-enabling-sparse
Repo
Framework
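
The claim that the Winograd transformation fills in pruned weights can be checked on the smallest case. The sketch below uses the standard 1D Winograd F(2, 3) transform matrices (two outputs of a 3-tap filter from a 4-element input tile). The example weights and the helper names are made up for illustration, and the paper's own pruning method is not implemented here.

```python
# Standard transform matrices for 1D Winograd F(2, 3).
G = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0.0, 0.0, 1.0]]
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]


def matvec(matrix, vector):
    """Matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def winograd_f23(tile, weights):
    """Compute two convolution outputs via element-wise products in the Winograd domain."""
    u = matvec(G, weights)   # Winograd-domain weights
    v = matvec(BT, tile)     # Winograd-domain input
    return matvec(AT, [a * b for a, b in zip(u, v)])


def direct_conv(tile, weights):
    """Reference: sliding-window correlation in the spatial domain."""
    return [sum(tile[i + j] * weights[j] for j in range(3)) for i in range(2)]


def sparsity(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for x in values if x == 0) / len(values)


pruned = [2.0, 0.0, 0.0]               # spatial weights with two of three pruned
tile = [1.0, 2.0, 3.0, 4.0]
winograd_weights = matvec(G, pruned)   # [2.0, 1.0, 1.0, 0.0]

print(sparsity(pruned), sparsity(winograd_weights))
print(winograd_f23(tile, pruned), direct_conv(tile, pruned))
```

Two of three spatial weights are zero, but only one of four Winograd-domain weights is, while both code paths produce the same outputs. This loss of sparsity is what spatial-Winograd pruning is designed to avoid.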

Capacity of the covariance perceptron


Title	Capacity of the covariance perceptron
Authors	David Dahmen, Matthieu Gilson, Moritz Helias
Abstract	The classical perceptron is a simple neural network that performs a binary classification by a linear mapping between static inputs and outputs and application of a threshold. For small inputs, neural networks in a stationary state also perform an effectively linear input-output transformation, but of an entire time series. Choosing the temporal mean of the time series as the feature for classification, the linear transformation of the network with subsequent thresholding is equivalent to the classical perceptron. Here we show that choosing covariances of time series as the feature for classification maps the neural network to what we call a ‘covariance perceptron’: a bilinear mapping between covariances. By extending Gardner’s theory of connections to this bilinear problem, using a replica symmetric mean-field theory, we compute the pattern and information capacities of the covariance perceptron in the infinite-size limit. Closed-form expressions reveal superior pattern capacity in the binary classification task compared to the classical perceptron in the case of a high-dimensional input and low-dimensional output. For less convergent networks, the mean perceptron classifies a larger number of stimuli. However, since covariances span a much larger input and output space than means, the amount of stored information in the covariance perceptron exceeds the classical counterpart. For strongly convergent connectivity it is superior by a factor equal to the number of input neurons. Theoretical calculations are validated numerically for finite-size systems using a gradient-based optimization of a soft-margin, as well as numerical solvers for the NP-hard quadratically constrained quadratic programming problem, to which training can be mapped.
Tasks	Time Series
Published	2019-12-02
URL	https://arxiv.org/abs/1912.00824v1
PDF	https://arxiv.org/pdf/1912.00824v1.pdf
PWC	https://paperswithcode.com/paper/capacity-of-the-covariance-perceptron
Repo
Framework

Data Selection for Short Term load forecasting


Title	Data Selection for Short Term load forecasting
Authors	Nestor Pereira, Miguel Angel Hombrados Herrera, Vanessa Gómez-Verdejo, Andrea A. Mammoli, Manel Martínez-Ramón
Abstract	Power load forecasting with Machine Learning is a fairly mature application of artificial intelligence and it is indispensable in operation, control and planning. Data selection techniques have hardly been used in this application. However, the use of such techniques could be beneficial, given that load data are clearly not identically distributed but cyclostationary. In this work we present a fully automatic methodology to determine which data are most adequate for training a predictor which is based on a full Bayesian probabilistic model. We assess the performance of the method with experiments based on real publicly available data recorded over several years in the United States of America.
Tasks	Load Forecasting
Published	2019-09-02
URL	https://arxiv.org/abs/1909.01759v2
PDF	https://arxiv.org/pdf/1909.01759v2.pdf
PWC	https://paperswithcode.com/paper/data-selection-for-short-term-load
Repo
Framework

3D Semantic Scene Completion from a Single Depth Image using Adversarial Training


Title	3D Semantic Scene Completion from a Single Depth Image using Adversarial Training
Authors	Yueh-Tung Chen, Martin Garbade, Juergen Gall
Abstract	We address the task of 3D semantic scene completion, i.e., given a single depth image, we predict the semantic labels and occupancy of voxels in a 3D grid representing the scene. In light of the recently introduced generative adversarial networks (GAN), our goal is to explore the potential of this model and the efficiency of various important design choices. Our results show that using conditional GANs outperforms the vanilla GAN setup. We evaluate these architecture designs on several datasets. Based on our experiments, we demonstrate that GANs are able to outperform a baseline 3D CNN in case of clean annotations, but they suffer from poorly aligned annotations.
Tasks
Published	2019-05-15
URL	https://arxiv.org/abs/1905.06231v1
PDF	https://arxiv.org/pdf/1905.06231v1.pdf
PWC	https://paperswithcode.com/paper/3d-semantic-scene-completion-from-a-single
Repo
Framework

Feeling Anxious? Perceiving Anxiety in Tweets using Machine Learning


Title	Feeling Anxious? Perceiving Anxiety in Tweets using Machine Learning
Authors	Dritjon Gruda, Souleiman Hasan
Abstract	This study provides a predictive measurement tool to examine perceived anxiety from a longitudinal perspective, using a non-intrusive machine learning approach to scale human rating of anxiety in microblogs. Results suggest that our chosen machine learning approach depicts perceived user state-anxiety fluctuations over time, as well as mean trait anxiety. We further find a reverse relationship between perceived anxiety and outcomes such as social engagement and popularity. Implications on the individual, organizational, and societal levels are discussed.
Tasks
Published	2019-09-13
URL	https://arxiv.org/abs/1909.06959v1
PDF	https://arxiv.org/pdf/1909.06959v1.pdf
PWC	https://paperswithcode.com/paper/feeling-anxious-perceiving-anxiety-in-tweets
Repo
Framework

LoadCNN: A Low Training Cost Deep Learning Model for Day-Ahead Individual Residential Load Forecasting


Title	LoadCNN: A Low Training Cost Deep Learning Model for Day-Ahead Individual Residential Load Forecasting
Authors	Yunyou Huang, Nana Wang, Wanling Gao, Xiaoxu Guo, Cheng Huang, Tianshu Hao, Jianfeng Zhan
Abstract	Accurate day-ahead individual residential load forecasting is of great importance to various applications of the smart grid on the day-ahead market. Deep learning, as a powerful machine learning technology, has shown great advantages and promising application in load forecasting tasks. However, deep learning is a computationally hungry method, and training a deep learning model incurs high costs (e.g., time, energy and CO2 emission), which aggravates the energy crisis and places a substantial burden on the environment. As a consequence, deep learning methods are difficult to popularize and apply in the real smart grid environment. In this paper, we propose a low training cost model based on a convolutional neural network, namely LoadCNN, for next-day load forecasting of individual residents with reduced training cost. The experiments show that the training time of LoadCNN is only approximately 1/54 of that of other state-of-the-art models, and energy consumption and CO2 emissions are only approximately 1/45 of those of other state-of-the-art models based on the same indicators. Meanwhile, the prediction accuracy of our model is equal to that of current state-of-the-art models, making LoadCNN the first load forecasting model simultaneously achieving high prediction accuracy and low training costs. LoadCNN is an efficient green model that can be deployed quickly, cost-effectively and in an environmentally friendly way in a realistic smart grid environment.
Tasks	Load Forecasting
Published	2019-08-01
URL	https://arxiv.org/abs/1908.00298v3
PDF	https://arxiv.org/pdf/1908.00298v3.pdf
PWC	https://paperswithcode.com/paper/loadcnn-a-efficient-green-deep-learning-model
Repo
Framework

Hierarchical Document Encoder for Parallel Corpus Mining


Title	Hierarchical Document Encoder for Parallel Corpus Mining
Authors	Mandy Guo, Yinfei Yang, Keith Stevens, Daniel Cer, Heming Ge, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil
Abstract	We explore using multilingual document embeddings for nearest neighbor mining of parallel data. Three document-level representations are investigated: (i) document embeddings generated by simply averaging multilingual sentence embeddings; (ii) a neural bag-of-words (BoW) document encoding model; (iii) a hierarchical multilingual document encoder (HiDE) that builds on our sentence-level model. The results show document embeddings derived from sentence-level averaging are surprisingly effective for clean datasets, but suggest models trained hierarchically at the document-level are more effective on noisy data. Analysis experiments demonstrate our hierarchical models are very robust to variations in the underlying sentence embedding quality. Using document embeddings trained with HiDE achieves state-of-the-art performance on United Nations (UN) parallel document mining, 94.9% P@1 for en-fr and 97.3% P@1 for en-es.
Tasks	Parallel Corpus Mining, Sentence Embedding, Sentence Embeddings
Published	2019-06-20
URL	https://arxiv.org/abs/1906.08401v2
PDF	https://arxiv.org/pdf/1906.08401v2.pdf
PWC	https://paperswithcode.com/paper/hierarchical-document-encoder-for-parallel
Repo
Framework
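
The simplest of the three representations, averaging sentence embeddings and mining by nearest neighbor, can be sketched in a few lines. The 3-dimensional vectors below are made-up toy data standing in for multilingual sentence embeddings, and the function names are hypothetical. HiDE itself is a trained hierarchical model and is not reproduced here.

```python
import math


def average_embedding(sentence_vectors):
    """Document embedding as the mean of its sentence embeddings."""
    n = len(sentence_vectors)
    dim = len(sentence_vectors[0])
    return [sum(vec[d] for vec in sentence_vectors) / n for d in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def precision_at_1(source_docs, target_docs):
    """Fraction of source documents whose nearest target shares their index."""
    hits = 0
    for i, src in enumerate(source_docs):
        nearest = max(range(len(target_docs)), key=lambda j: cosine(src, target_docs[j]))
        hits += int(nearest == i)
    return hits / len(source_docs)


# Toy "sentence embeddings": document i in one language is parallel to
# document i in the other. Documents may differ in sentence count.
en_sentences = [
    [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    [[0.0, 1.0, 0.0], [0.1, 0.9, 0.1]],
    [[0.0, 0.0, 1.0], [0.2, 0.0, 0.8]],
]
fr_sentences = [
    [[0.9, 0.1, 0.0]],
    [[0.0, 1.0, 0.1], [0.0, 0.8, 0.0]],
    [[0.1, 0.0, 1.0]],
]

en_docs = [average_embedding(doc) for doc in en_sentences]
fr_docs = [average_embedding(doc) for doc in fr_sentences]
p_at_1 = precision_at_1(en_docs, fr_docs)
print(p_at_1)
```

P@1 here is the same kind of metric as the en-fr and en-es figures quoted in the abstract: the share of source documents whose top-ranked candidate is the true translation.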

Efficient Cross-Validation of Echo State Networks


Title	Efficient Cross-Validation of Echo State Networks
Authors	Mantas Lukoševičius, Arnas Uselis
Abstract	Echo State Networks (ESNs) are known for their fast and precise one-shot learning of time series. But they often need good hyper-parameter tuning for best performance. For this, good validation is key, but usually a single validation split is used. In this rather practical contribution we suggest several schemes for cross-validating ESNs and introduce an efficient algorithm for implementing them. The component that dominates the time complexity of the already quite fast ESN training remains constant (does not scale up with $k$) in our proposed method of doing $k$-fold cross-validation. The component that does scale linearly with $k$ starts dominating only in some not very common situations. Thus in many situations $k$-fold cross-validation of ESNs can be done for virtually the same time complexity as a simple single split validation. Space complexity can also remain the same. We also discuss when the proposed validation schemes for ESNs could be beneficial and empirically investigate them on several different real-world datasets.
Tasks	One-Shot Learning, Time Series
Published	2019-08-22
URL	https://arxiv.org/abs/1908.08450v1
PDF	https://arxiv.org/pdf/1908.08450v1.pdf
PWC	https://paperswithcode.com/paper/efficient-cross-validation-of-echo-state
Repo
Framework
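
One way the dominant cost can stay independent of $k$ is to accumulate the ridge-regression statistics of each fold once and obtain every training set's statistics by subtraction. The sketch below illustrates that idea under loud assumptions: the "reservoir states" are a made-up 2-dimensional sequence rather than the output of a real reservoir, the readout is solved in closed form for the 2x2 case, and none of the names come from the paper. It is not claimed to be the authors' exact algorithm.

```python
import math


def fold_statistics(states, targets):
    """Accumulate X X^T (2x2) and X y (length 2) for one fold of 2-dimensional states."""
    xx = [[0.0, 0.0], [0.0, 0.0]]
    xy = [0.0, 0.0]
    for (a, b), y in zip(states, targets):
        xx[0][0] += a * a
        xx[0][1] += a * b
        xx[1][0] += b * a
        xx[1][1] += b * b
        xy[0] += a * y
        xy[1] += b * y
    return xx, xy


def ridge_readout(xx, xy, reg):
    """Solve (X X^T + reg * I) w = X y in closed form for the 2x2 case."""
    a, b = xx[0][0] + reg, xx[0][1]
    c, d = xx[1][0], xx[1][1] + reg
    det = a * d - b * c
    return [(d * xy[0] - b * xy[1]) / det, (a * xy[1] - c * xy[0]) / det]


def cross_validate(states, targets, k, reg):
    """Mean k-fold validation MSE, reusing per-fold statistics computed once."""
    size = len(states) // k
    folds = [(states[i * size:(i + 1) * size], targets[i * size:(i + 1) * size])
             for i in range(k)]
    stats = [fold_statistics(s, t) for s, t in folds]  # one pass over the data
    total_xx = [[sum(fxx[r][c] for fxx, _ in stats) for c in range(2)] for r in range(2)]
    total_xy = [sum(fxy[r] for _, fxy in stats) for r in range(2)]
    errors = []
    for (val_states, val_targets), (fxx, fxy) in zip(folds, stats):
        # Training statistics = totals minus the held-out fold's contribution.
        train_xx = [[total_xx[r][c] - fxx[r][c] for c in range(2)] for r in range(2)]
        train_xy = [total_xy[r] - fxy[r] for r in range(2)]
        w = ridge_readout(train_xx, train_xy, reg)
        squared = [(w[0] * a + w[1] * b - y) ** 2
                   for (a, b), y in zip(val_states, val_targets)]
        errors.append(sum(squared) / len(squared))
    return sum(errors) / k


# Hypothetical "reservoir states" and a target that is linear in them.
states = [(math.sin(0.3 * t), math.cos(0.2 * t)) for t in range(60)]
targets = [2.0 * a - b for a, b in states]
cv_error = cross_validate(states, targets, k=5, reg=1e-6)
print(cv_error)
```

The pass that builds the per-fold statistics touches every sample exactly once regardless of k; only the small solve is repeated per fold, which mirrors the split between the constant and the linearly scaling components described in the abstract.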

Weakly Supervised Universal Fracture Detection in Pelvic X-rays


Title	Weakly Supervised Universal Fracture Detection in Pelvic X-rays
Authors	Yirui Wang, Le Lu, Chi-Tung Cheng, Dakai Jin, Adam P. Harrison, Jing Xiao, Chien-Hung Liao, Shun Miao
Abstract	Hip and pelvic fractures are serious injuries with life-threatening complications. However, diagnostic errors of fractures in pelvic X-rays (PXRs) are very common, driving the demand for computer-aided diagnosis (CAD) solutions. A major challenge lies in the fact that fractures are localized patterns that require localized analyses. Unfortunately, the PXRs residing in hospital picture archiving and communication systems do not typically specify regions of interest. In this paper, we propose a two-stage hip and pelvic fracture detection method that executes localized fracture classification using weakly supervised ROI mining. The first stage uses a large capacity fully-convolutional network, i.e., deep with high levels of abstraction, in a multiple instance learning setting to automatically mine probable true positive and definite hard negative ROIs from the whole PXR in the training data. The second stage trains a smaller capacity model, i.e., shallower and more generalizable, with the mined ROIs to perform localized analyses to classify fractures. During inference, our method detects hip and pelvic fractures in one pass by chaining the probability outputs of the two stages together. We evaluate our method on 4,410 PXRs, reporting an area under the ROC curve value of 0.975, the highest among state-of-the-art fracture detection methods. Moreover, we show that our two-stage approach can perform comparably to human physicians (even outperforming emergency physicians and surgeons), in a preliminary reader study of 23 readers.
Tasks	Multiple Instance Learning
Published	2019-09-04
URL	https://arxiv.org/abs/1909.02077v1
PDF	https://arxiv.org/pdf/1909.02077v1.pdf
PWC	https://paperswithcode.com/paper/weakly-supervised-universal-fracture
Repo
Framework

Medium-Term Load Forecasting Using Support Vector Regression, Feature Selection, and Symbiotic Organism Search Optimization


Title	Medium-Term Load Forecasting Using Support Vector Regression, Feature Selection, and Symbiotic Organism Search Optimization
Authors	Arghavan Zare-Noghabi, Morteza Shabanzadeh, Hossein Sangrody
Abstract	Accurate load forecasting has always been one of the indispensable parts of the operation and planning of power systems. Among different time horizons of forecasting, while short-term load forecasting (STLF) and long-term load forecasting (LTLF) have respectively benefited from accurate predictors and probabilistic forecasting, medium-term load forecasting (MTLF) demands more attention due to its vital role in power system operation and planning such as optimal scheduling of generation units, robust planning program for customer service, and economic supply. In this study, a hybrid method, composed of Support Vector Regression (SVR) and Symbiotic Organism Search Optimization (SOSO) method, is proposed for MTLF. In the proposed forecasting model, SVR is the main part of the forecasting algorithm while SOSO is embedded into it to optimize the parameters of SVR. In addition, a minimum redundancy-maximum relevance feature selection algorithm is used in the preprocessing of input data. The proposed method is tested on the EUNITE competition dataset to demonstrate its proper performance. Furthermore, it is compared with some previous works to show the eligibility of our method.
Tasks	Feature Selection, Load Forecasting
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04818v1
PDF	https://arxiv.org/pdf/1906.04818v1.pdf
PWC	https://paperswithcode.com/paper/medium-term-load-forecasting-using-support
Repo
Framework

ASYNC: A Cloud Engine with Asynchrony and History for Distributed Machine Learning


Title	ASYNC: A Cloud Engine with Asynchrony and History for Distributed Machine Learning
Authors	Saeed Soori, Bugra Can, Mert Gurbuzbalaban, Maryam Mehri Dehnavi
Abstract	ASYNC is a framework that supports the implementation of asynchrony and history for optimization methods on distributed computing platforms. The popularity of asynchronous optimization methods has increased in distributed machine learning. However, their applicability and practical experimentation on distributed systems are limited because current bulk-processing cloud engines do not provide robust support for asynchrony and history. By introducing three main modules and bookkeeping system-specific and application parameters, ASYNC provides practitioners with a framework to implement asynchronous machine learning methods. To demonstrate ease-of-implementation in ASYNC, the synchronous and asynchronous variants of two well-known optimization methods, stochastic gradient descent and SAGA, are demonstrated in ASYNC.
Tasks
Published	2019-07-19
URL	https://arxiv.org/abs/1907.08526v4
PDF	https://arxiv.org/pdf/1907.08526v4.pdf
PWC	https://paperswithcode.com/paper/async-asynchronous-machine-learning-on
Repo
Framework

An introduction to decentralized stochastic optimization with gradient tracking


Title	An introduction to decentralized stochastic optimization with gradient tracking
Authors	Ran Xin, Soummya Kar, Usman A. Khan
Abstract	Decentralized solutions to finite-sum minimization are of significant importance in many signal processing, control, and machine learning applications. In such settings, the data is distributed over a network of arbitrarily-connected nodes and raw data sharing is prohibitive often due to communication or privacy constraints. In this article, we review decentralized stochastic first-order optimization methods and illustrate some recent improvements based on gradient tracking and variance reduction, focusing particularly on smooth and strongly-convex objective functions. We provide intuitive illustrations of the main technical ideas as well as applications of the algorithms in the context of decentralized training of machine learning models.
Tasks	Stochastic Optimization
Published	2019-07-23
URL	https://arxiv.org/abs/1907.09648v2
PDF	https://arxiv.org/pdf/1907.09648v2.pdf
PWC	https://paperswithcode.com/paper/decentralized-stochastic-first-order-methods
Repo
Framework
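
Gradient tracking is easiest to see on a toy problem. In the sketch below each node holds a private quadratic f_i(x) = 0.5 * (x - a_i)^2, talks only to its neighbors through a doubly stochastic mixing matrix, and keeps an auxiliary variable that tracks the network-wide average gradient. To keep the example short it uses exact local gradients rather than the stochastic gradients the article focuses on; the problem data, the mixing matrix, the step size, and the function names are all assumptions.

```python
def mix(weights, values):
    """One round of neighbor averaging: the matrix-vector product W @ values."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def local_gradients(x, targets):
    """Gradient of f_i(x_i) = 0.5 * (x_i - a_i)**2 at each node's own estimate."""
    return [xi - ai for xi, ai in zip(x, targets)]


def gradient_tracking(targets, weights, step, iterations):
    """Return each node's estimate of the minimizer of sum_i f_i."""
    x = [0.0] * len(targets)
    grads = local_gradients(x, targets)
    tracker = list(grads)  # tracker starts at the local gradients
    for _ in range(iterations):
        # Descent step: mix with neighbors, then move along the tracked gradient.
        x = [m - step * t for m, t in zip(mix(weights, x), tracker)]
        new_grads = local_gradients(x, targets)
        # Tracker update: mix, then add the change in the local gradient.
        tracker = [m + gn - go
                   for m, gn, go in zip(mix(weights, tracker), new_grads, grads)]
        grads = new_grads
    return x


# Three fully connected nodes; rows and columns of W each sum to one.
W = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]
private_targets = [1.0, 2.0, 6.0]  # the global minimizer is their mean, 3.0
estimates = gradient_tracking(private_targets, W, step=0.1, iterations=300)
print(estimates)
```

No node ever sees another node's a_i, yet all local estimates agree on the global minimizer. With a constant step size, plain decentralized gradient descent would instead leave each node biased toward its own target, and the tracker is what removes that bias.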

Unsupervised Temperature Scaling: An Unsupervised Post-Processing Calibration Method of Deep Networks


Title	Unsupervised Temperature Scaling: An Unsupervised Post-Processing Calibration Method of Deep Networks
Authors	Azadeh Sadat Mozafari, Hugo Siqueira Gomes, Wilson Leão, Christian Gagné
Abstract	The strong performance of deep learning is undeniable, with impressive results over a wide range of tasks. However, the output confidence of these models is usually not well-calibrated, which can be an issue for applications where confidence in the decisions is central to providing trust and reliability (e.g., autonomous driving or medical diagnosis). For models using softmax at the last layer, Temperature Scaling (TS) is a state-of-the-art calibration method, with low time and memory complexity as well as demonstrated effectiveness. TS relies on a T parameter to rescale and calibrate values of the softmax layer, whose parameter value is computed from a labelled dataset. We propose an Unsupervised Temperature Scaling (UTS) approach, which does not depend on labelled samples to calibrate the model. This allows, for example, the use of part of the test samples to calibrate the pre-trained model before going into inference mode. We provide theoretical justifications for UTS and assess its effectiveness on a wide range of deep models and datasets. We also demonstrate calibration results of UTS on skin lesion detection, a problem where a well-calibrated output can play an important role for accurate decision-making.
Tasks	Autonomous Driving, Calibration, Decision Making, Medical Diagnosis
Published	2019-05-01
URL	https://arxiv.org/abs/1905.00174v3
PDF	https://arxiv.org/pdf/1905.00174v3.pdf
PWC	https://paperswithcode.com/paper/unsupervised-temperature-scaling-post
Repo
Framework
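
For reference, the supervised baseline that UTS builds on can be sketched as follows: logits are divided by a single scalar T before the softmax, and T is chosen to minimize the negative log-likelihood on a labelled validation set. This is ordinary Temperature Scaling, not the unsupervised objective proposed in the paper. The toy logits, the grid search, and the function names are assumptions for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def negative_log_likelihood(logit_rows, labels, temperature):
    """Mean NLL of the true labels under the temperature-scaled softmax."""
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)


def fit_temperature(logit_rows, labels, grid):
    """Pick the grid value of T with the lowest validation NLL."""
    return min(grid, key=lambda t: negative_log_likelihood(logit_rows, labels, t))


# Toy validation set; the third sample is a confident mistake.
val_logits = [[4.0, 0.0, 0.0], [0.0, 5.0, 1.0], [3.0, 0.5, 0.0], [0.0, 0.2, 4.0]]
val_labels = [0, 1, 1, 2]
grid = [0.5 + 0.1 * i for i in range(46)]  # candidate temperatures 0.5 .. 5.0
best_t = fit_temperature(val_logits, val_labels, grid)

sample = [3.0, 1.0, 0.0]
print(best_t, softmax(sample), softmax(sample, best_t))
```

Dividing by T never changes which class has the highest probability, so accuracy is untouched and only the confidence values move. UTS keeps this single-parameter rescaling but estimates T without the labels used here.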