Paper Group ANR 575
Quantifying the Influences on Probabilistic Wind Power Forecasts
Title | Quantifying the Influences on Probabilistic Wind Power Forecasts |
Authors | Jens Schreiber, Bernhard Sick |
Abstract | In recent years, probabilistic forecasting techniques have been proposed in research as well as in applications to integrate volatile renewable energy resources into the electrical grid. These techniques allow decision makers to take the uncertainty of the prediction into account and, therefore, to devise optimal decisions, e.g., related to costs and risks in the electrical grid. However, it has not yet been studied in detail how the input, such as numerical weather predictions, affects the output of forecasting models. Therefore, we examine the potential influences with techniques from the field of sensitivity analysis on three different black-box models to obtain insights into differences and similarities of these probabilistic models. The analysis shows a considerable number of potential influences in those models depending on, e.g., the predicted probability and the type of model. These effects motivate the need to take various influences into account when models are tested, analyzed, or compared. Nevertheless, the results of the sensitivity analysis allow us to select a model with advantages for practical application. |
Tasks | |
Published | 2018-08-14 |
URL | http://arxiv.org/abs/1808.04750v1 |
http://arxiv.org/pdf/1808.04750v1.pdf | |
PWC | https://paperswithcode.com/paper/quantifying-the-influences-on-probabilistic |
Repo | |
Framework | |
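A minimal sketch of the kind of sensitivity analysis the abstract describes: a variance-based (Sobol-style) probe of a black-box model. Everything here is illustrative; the toy quantile model, its three weather inputs, and the Gaussian sampling are assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box_quantile_model(X):
    """Toy stand-in for a probabilistic wind power model: returns the
    predicted 90%-quantile of power for weather inputs X of shape (n, 3)."""
    wind, temp, pressure = X[:, 0], X[:, 1], X[:, 2]
    return np.tanh(1.5 * wind) + 0.1 * temp + 0.01 * pressure

def first_order_sobol(model, d=3, n=100_000):
    """Saltelli-style estimate of first-order Sobol sensitivity indices."""
    A, B = rng.normal(size=(n, d)), rng.normal(size=(n, d))
    yA, yB = model(A), model(B)
    var_y = np.concatenate([yA, yB]).var()
    indices = []
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                     # resample only input i
        # V_i ~ E[f(B) * (f(A_B^i) - f(A))]
        indices.append(np.mean(yB * (model(ABi) - yA)) / var_y)
    return indices

print(first_order_sobol(black_box_quantile_model))  # wind should dominate
```

Repeating such an analysis for each predicted quantile and each model is what surfaces the model- and probability-dependent influences the abstract reports.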
Learning Vine Copula Models For Synthetic Data Generation
Title | Learning Vine Copula Models For Synthetic Data Generation |
Authors | Yi Sun, Alfredo Cuesta-Infante, Kalyan Veeramachaneni |
Abstract | A vine copula model is a flexible high-dimensional dependence model which uses only bivariate building blocks. However, the number of possible configurations of a vine copula grows exponentially as the number of variables increases, making model selection a major challenge in development. In this work, we formulate the vine structure learning problem with both a vector representation and a reinforcement learning formulation. We use a neural network to find the embeddings for the best possible vine model and generate a structure. Through experiments on synthetic and real-world datasets, we show that our proposed approach fits the data better in terms of log-likelihood. Moreover, we demonstrate that the model is able to generate high-quality samples in a variety of applications, making it a good candidate for synthetic data generation. |
Tasks | Model Selection, Synthetic Data Generation |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01226v1 |
http://arxiv.org/pdf/1812.01226v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-vine-copula-models-for-synthetic |
Repo | |
Framework | |
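The bivariate building block mentioned in the abstract can be made concrete. The sketch below fits a single Gaussian pair copula by inverting Kendall's tau and scores its log-likelihood, the quantity the paper's structure learner optimizes over entire vines; the neural embedding and structure search themselves are not reproduced here.

```python
import numpy as np
from scipy import stats

def fit_gaussian_pair_copula(u, v):
    """Fit a bivariate Gaussian copula to pseudo-observations (u, v) in (0,1)
    by inverting Kendall's tau; return (rho, log-likelihood)."""
    tau, _ = stats.kendalltau(u, v)
    rho = np.sin(np.pi * tau / 2)            # tau -> rho for the Gaussian copula
    z1, z2 = stats.norm.ppf(u), stats.norm.ppf(v)
    loglik = np.sum(
        -0.5 * np.log(1 - rho**2)
        + (2 * rho * z1 * z2 - rho**2 * (z1**2 + z2**2)) / (2 * (1 - rho**2))
    )
    return rho, loglik

# toy usage: pseudo-observations from ranks of correlated Gaussian data
x = np.random.default_rng(1).multivariate_normal([0, 0], [[1, .7], [.7, 1]], 500)
u = stats.rankdata(x[:, 0]) / (len(x) + 1)
v = stats.rankdata(x[:, 1]) / (len(x) + 1)
print(fit_gaussian_pair_copula(u, v))        # rho close to 0.7
```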
Meta-learning autoencoders for few-shot prediction
Title | Meta-learning autoencoders for few-shot prediction |
Authors | Tailin Wu, John Peurifoy, Isaac L. Chuang, Max Tegmark |
Abstract | Compared to humans, machine learning models generally require significantly more training examples and fail to extrapolate from experience to solve previously unseen challenges. To help close this performance gap, we augment single-task neural networks with a meta-recognition model which learns a succinct model code via its autoencoder structure, using just a few informative examples. The model code is then employed by a meta-generative model to construct parameters for the task-specific model. We demonstrate that for previously unseen tasks, without additional training, this Meta-Learning Autoencoder (MeLA) framework can build models that closely match the true underlying models, with loss significantly lower than that of fine-tuned baseline networks, and performance that compares favorably with state-of-the-art meta-learning algorithms. MeLA also adds the ability to identify influential training examples and predict which additional data will be most valuable to acquire to improve model prediction. |
Tasks | Meta-Learning |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.09912v1 |
http://arxiv.org/pdf/1807.09912v1.pdf | |
PWC | https://paperswithcode.com/paper/meta-learning-autoencoders-for-few-shot |
Repo | |
Framework | |
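The two-network design in the abstract (a meta-recognition encoder that compresses a few (x, y) examples into a model code, and a meta-generative network that turns the code into task-model parameters) can be sketched as a small hypernetwork. All sizes and the linear task model below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MeLASketch(nn.Module):
    def __init__(self, x_dim=1, y_dim=1, code_dim=16, hidden=64):
        super().__init__()
        # meta-recognition: (x, y) pairs -> pooled model code
        self.encoder = nn.Sequential(
            nn.Linear(x_dim + y_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, code_dim))
        # meta-generative: code -> weights and bias of a linear task model
        self.generator = nn.Linear(code_dim, y_dim * x_dim + y_dim)
        self.x_dim, self.y_dim = x_dim, y_dim

    def forward(self, support_x, support_y, query_x):
        pairs = torch.cat([support_x, support_y], dim=-1)
        code = self.encoder(pairs).mean(dim=0)   # permutation-invariant pooling
        params = self.generator(code)
        W = params[: self.y_dim * self.x_dim].view(self.y_dim, self.x_dim)
        b = params[self.y_dim * self.x_dim:]
        return F.linear(query_x, W, b)           # task model built on the fly

# usage: predict on a new task (y = 3x - 1) with no gradient steps at test time
model = MeLASketch()
xs = torch.randn(10, 1); ys = 3 * xs - 1
pred = model(xs, ys, torch.tensor([[0.5]]))      # meaningful only after meta-training
```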
Deep Signal Recovery with One-Bit Quantization
Title | Deep Signal Recovery with One-Bit Quantization |
Authors | Shahin Khobahi, Naveed Naimipour, Mojtaba Soltanalian, Yonina C. Eldar |
Abstract | Machine learning, and more specifically deep learning, has shown remarkable performance in sensing, communications, and inference. In this paper, we consider the application of the deep unfolding technique to the problem of signal reconstruction from one-bit noisy measurements. Namely, we propose a model-based machine learning method and unfold the iterations of an inference optimization algorithm into the layers of a deep neural network for one-bit signal recovery. The resulting network, which we refer to as DeepRec, can efficiently handle the recovery of high-dimensional signals from acquired one-bit noisy measurements. The proposed method results in an improvement in accuracy and computational efficiency with respect to the original framework, as shown through numerical analysis. |
Tasks | Quantization |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1812.00797v1 |
http://arxiv.org/pdf/1812.00797v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-signal-recovery-with-one-bit |
Repo | |
Framework | |
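For intuition, here is the kind of inference iteration that deep unfolding turns into network layers: gradient ascent on the one-bit, Gaussian-noise log-likelihood $\sum_i \log\Phi(y_i\,a_i^\top x/\sigma)$. In DeepRec the per-layer parameters are learned; this sketch keeps a fixed step size and assumes a known noise level, so it is illustrative rather than the paper's exact model.

```python
import numpy as np
from scipy.stats import norm

def onebit_ml_iterations(y, A, sigma=0.1, steps=50, eta=0.05):
    """Gradient ascent on sum_i log Phi(y_i * a_i^T x / sigma); each iteration
    is the template that unfolding would map to one network layer."""
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        z = y * (A @ x) / sigma
        ratio = norm.pdf(z) / np.maximum(norm.cdf(z), 1e-12)
        x = x + eta * A.T @ (y * ratio) / sigma   # d/dx of the log-likelihood
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 20))
x_true = rng.normal(size=20)
y = np.sign(A @ x_true + 0.1 * rng.normal(size=200))
x_hat = onebit_ml_iterations(y, A)
# one-bit measurements fix the signal only up to scale, so check the direction
print(np.corrcoef(x_hat, x_true)[0, 1])
```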
Efficient and Effective Quantum Compiling for Entanglement-based Machine Learning on IBM Q Devices
Title | Efficient and Effective Quantum Compiling for Entanglement-based Machine Learning on IBM Q Devices |
Authors | Davide Ferrari, Michele Amoretti |
Abstract | Quantum compiling means fast, device-aware implementation of quantum algorithms (i.e., quantum circuits, in the quantum circuit model of computation). In this paper, we present a strategy for compiling IBM Q-aware, low-depth quantum circuits that generate Greenberger-Horne-Zeilinger (GHZ) entangled states. The resulting compiler can replace the QISKit compiler for the specific purpose of obtaining improved GHZ circuits. It is well known that GHZ states have several practical applications, including quantum machine learning. We illustrate our experience in implementing and querying a uniform quantum example oracle based on the GHZ circuit, for solving the classically hard problem of learning parity with noise. |
Tasks | Quantum Machine Learning |
Published | 2018-01-08 |
URL | https://arxiv.org/abs/1801.02363v3 |
https://arxiv.org/pdf/1801.02363v3.pdf | |
PWC | https://paperswithcode.com/paper/demonstration-of-envariance-and-parity |
Repo | |
Framework | |
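For context, a logarithmic-depth GHZ circuit, the kind of depth reduction such a compiler aims for, can be written directly with Qiskit's circuit API. This sketch ignores device connectivity, which the paper's IBM Q-aware compiler explicitly accounts for, so it is a starting point rather than a compiled circuit.

```python
from qiskit import QuantumCircuit

def ghz_log_depth(n):
    """Prepare an n-qubit GHZ state with a CNOT fan-out of depth O(log n):
    in each round, every already-entangled qubit copies onto a fresh one."""
    qc = QuantumCircuit(n)
    qc.h(0)
    entangled = 1
    while entangled < n:
        for src in range(min(entangled, n - entangled)):
            qc.cx(src, entangled + src)   # these CNOTs commute -> one layer
        entangled *= 2
    return qc

print(ghz_log_depth(8))   # H plus 3 CNOT layers instead of a depth-7 chain
```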
A Text Analysis of Federal Reserve meeting minutes
Title | A Text Analysis of Federal Reserve meeting minutes |
Authors | Harish Gandhi Ramachandran, Dan DeRose Jr |
Abstract | Recent developments in monetary policy by the Federal Reserve have created a need for an objective method of communication analysis. Using methods developed for text analysis, we present a novel technique which creates a semantic space defined by various policymakers' public comments and places the committee consensus in the appropriate location. It is then possible to determine which member of the committee is most closely aligned with the committee consensus over time, creating a foundation for further actionable research. |
Tasks | |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.07851v1 |
http://arxiv.org/pdf/1805.07851v1.pdf | |
PWC | https://paperswithcode.com/paper/a-text-analysis-of-federal-reserve-meeting |
Repo | |
Framework | |
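The core measurement, placing the committee consensus in a semantic space built from members' comments and finding the nearest member, can be approximated with TF-IDF vectors and cosine similarity. The texts below are invented stand-ins; the paper's corpus and space construction differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

member_comments = {   # hypothetical snippets, not actual Fed communications
    "member_a": "inflation pressures warrant gradual increases in the policy rate",
    "member_b": "labor market slack argues for continued monetary accommodation",
}
consensus = "the committee expects gradual increases in the target rate"

vec = TfidfVectorizer()
X = vec.fit_transform(list(member_comments.values()) + [consensus])
n = len(member_comments)
sims = cosine_similarity(X[n], X[:n]).ravel()   # consensus vs. each member
print(max(zip(member_comments, sims), key=lambda kv: kv[1]))
```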
Fast Convergence for Stochastic and Distributed Gradient Descent in the Interpolation Limit
Title | Fast Convergence for Stochastic and Distributed Gradient Descent in the Interpolation Limit |
Authors | Partha P Mitra |
Abstract | Modern supervised learning techniques, particularly those using deep nets, involve fitting high-dimensional labelled data sets with functions containing very large numbers of parameters. Much of this work is empirical. Interesting phenomena have been observed that require theoretical explanations; however, the non-convexity of the loss functions complicates the analysis. Recently it has been proposed that the success of these techniques rests partly on the effectiveness of the simple stochastic gradient descent algorithm in the so-called interpolation limit, in which all labels are fit perfectly. This analysis is made possible since the SGD algorithm reduces to a stochastic linear system near the interpolating minimum of the loss function. Here we exploit this insight by presenting and analyzing a new distributed algorithm for gradient descent, also in the interpolating limit. The distributed SGD algorithm presented in the paper corresponds to gradient descent applied to a simple penalized distributed loss function, $L({\bf w}_1,\ldots,{\bf w}_n) = \sum_i l_i({\bf w}_i) + \mu \sum_{\langle i,j \rangle} \|{\bf w}_i - {\bf w}_j\|^2$. Here each node holds only one sample and its own parameter vector, and $\langle i,j \rangle$ ranges over the edges of a connected graph defining the links between nodes. It is shown that this distributed algorithm converges linearly (i.e., the error reduces exponentially with iteration number), with a rate $1-\frac{\eta}{n}\lambda_{\min}(H) < R < 1$, where $\lambda_{\min}(H)$ is the smallest nonzero eigenvalue of the sample covariance or the Hessian $H$. In contrast with previous usage of similar penalty functions to enforce consensus between nodes, in the interpolating limit it is not required to take the penalty parameter to infinity for consensus to occur. The analysis further reinforces the utility of the interpolation limit in the theoretical treatment of modern machine learning algorithms. |
Tasks | |
Published | 2018-03-08 |
URL | http://arxiv.org/abs/1803.02922v3 |
http://arxiv.org/pdf/1803.02922v3.pdf | |
PWC | https://paperswithcode.com/paper/fast-convergence-for-stochastic-and |
Repo | |
Framework | |
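The penalized distributed loss in the abstract is concrete enough to simulate directly. A minimal sketch, assuming squared per-node losses and a ring graph: each node holds one sample and its own parameter vector, plain gradient descent is run on $\sum_i l_i({\bf w}_i) + \mu \sum_{\langle i,j \rangle} \|{\bf w}_i - {\bf w}_j\|^2$, and in the interpolation limit every node reaches the interpolating solution without taking $\mu \to \infty$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, mu, eta = 20, 5, 1.0, 0.01
X = rng.normal(size=(n, d))                    # one sample per node
w_true = rng.normal(size=d)
y = X @ w_true                                 # interpolation limit: zero label noise
edges = [(i, (i + 1) % n) for i in range(n)]   # ring communication graph
W = rng.normal(size=(n, d))                    # one parameter vector per node

for _ in range(20_000):
    # gradient of the per-node squared losses l_i(w_i) = (x_i . w_i - y_i)^2
    grad = 2 * (np.sum(X * W, axis=1) - y)[:, None] * X
    for i, j in edges:                         # gradient of mu * ||w_i - w_j||^2
        g = 2 * mu * (W[i] - W[j])
        grad[i] += g
        grad[j] -= g
    W -= eta * grad

print(np.abs(W - w_true).max())                # all nodes agree on the interpolant
```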
Protection against Cloning for Deep Learning
Title | Protection against Cloning for Deep Learning |
Authors | Richard Kenway |
Abstract | The susceptibility of deep learning to adversarial attack can be understood in the framework of the Renormalisation Group (RG) and the vulnerability of a specific network may be diagnosed provided the weights in each layer are known. An adversary with access to the inputs and outputs could train a second network to clone these weights and, having identified a weakness, use them to compute the perturbation of the input data which exploits it. However, the RG framework also provides a means to poison the outputs of the network imperceptibly, without affecting their legitimate use, so as to prevent such cloning of its weights and thereby foil the generation of adversarial data. |
Tasks | Adversarial Attack |
Published | 2018-03-29 |
URL | http://arxiv.org/abs/1803.10995v1 |
http://arxiv.org/pdf/1803.10995v1.pdf | |
PWC | https://paperswithcode.com/paper/protection-against-cloning-for-deep-learning |
Repo | |
Framework | |
Learning Deep Hidden Nonlinear Dynamics from Aggregate Data
Title | Learning Deep Hidden Nonlinear Dynamics from Aggregate Data |
Authors | Yisen Wang, Bo Dai, Lingkai Kong, Sarah Monazam Erfani, James Bailey, Hongyuan Zha |
Abstract | Learning nonlinear dynamics from diffusion data is a challenging problem, since the individuals observed may differ across time points while generally following an aggregate behaviour. Existing work cannot handle such tasks well, since it either models the dynamics directly on observations or requires the availability of complete longitudinal individual-level trajectories. However, in most practical applications these requirements are unrealistic: the evolving dynamics may be too complex to be modeled directly on observations, and individual-level trajectories may not be available due to technical limitations, experimental costs, and/or privacy issues. To address these challenges, we formulate a model of diffusion dynamics as the {\em hidden stochastic process} via the introduction of hidden variables for flexibility, and learn the hidden dynamics directly on {\em aggregate observations} without any requirement for individual-level trajectories. We propose a dynamic generative model with Wasserstein distance for LEarninG dEep hidden Nonlinear Dynamics (LEGEND) and prove its theoretical guarantees as well. Experiments on a range of synthetic and real-world datasets illustrate that LEGEND has very strong performance compared to state-of-the-art baselines. |
Tasks | |
Published | 2018-07-22 |
URL | http://arxiv.org/abs/1807.08237v2 |
http://arxiv.org/pdf/1807.08237v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-deep-hidden-nonlinear-dynamics-from |
Repo | |
Framework | |
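The Wasserstein objective on aggregate observations is easiest to see in one dimension, where the empirical distance between equal-size sample sets reduces to differences of sorted samples. The paper embeds this distance in a learned deep generative model, which is not reproduced here.

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical 1-Wasserstein distance between equal-size 1-D sample sets:
    the mean absolute difference of the sorted samples."""
    return np.mean(np.abs(np.sort(a) - np.sort(b)))

rng = np.random.default_rng(0)
obs_t1 = rng.normal(0.0, 1.0, 1000)    # aggregate snapshot of the population
sim_t1 = rng.normal(0.3, 1.0, 1000)    # a model's simulated population
print(wasserstein_1d(obs_t1, sim_t1))  # ~0.3, the mean shift between the two
```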
Towards Better UD Parsing: Deep Contextualized Word Embeddings, Ensemble, and Treebank Concatenation
Title | Towards Better UD Parsing: Deep Contextualized Word Embeddings, Ensemble, and Treebank Concatenation |
Authors | Wanxiang Che, Yijia Liu, Yuxuan Wang, Bo Zheng, Ting Liu |
Abstract | This paper describes our system (HIT-SCIR) submitted to the CoNLL 2018 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. We base our submission on Stanford’s winning system for the CoNLL 2017 shared task and make two effective extensions: 1) incorporating deep contextualized word embeddings into both the part-of-speech tagger and the parser; 2) ensembling parsers trained with different initializations. We also explore different ways of concatenating treebanks for further improvements. Experimental results on the development data show the effectiveness of our methods. In the final evaluation, our system was ranked first according to LAS (75.84%) and outperformed the other systems by a large margin. |
Tasks | Word Embeddings |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03121v3 |
http://arxiv.org/pdf/1807.03121v3.pdf | |
PWC | https://paperswithcode.com/paper/towards-better-ud-parsing-deep-contextualized |
Repo | |
Framework | |
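The second extension, ensembling parsers trained with different initializations, amounts to averaging the parsers' softmaxed attachment scores before decoding. A toy sketch with made-up score matrices and greedy head selection; the actual system ensembles full biaffine parsers with tree-constrained decoding.

```python
import numpy as np

def ensemble_heads(score_matrices):
    """Average softmaxed head-attachment scores (one (words x heads) matrix
    per ensemble member), then pick each word's head greedily."""
    probs = [np.exp(s) / np.exp(s).sum(axis=-1, keepdims=True)
             for s in score_matrices]
    return np.mean(probs, axis=0).argmax(axis=-1)

# three parsers differing only in their random seed (illustrative scores)
scores = [np.random.default_rng(k).normal(size=(6, 6)) for k in range(3)]
print(ensemble_heads(scores))   # predicted head index for each of 6 words
```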
Learning First-to-Spike Policies for Neuromorphic Control Using Policy Gradients
Title | Learning First-to-Spike Policies for Neuromorphic Control Using Policy Gradients |
Authors | Bleema Rosenfeld, Osvaldo Simeone, Bipin Rajendran |
Abstract | Artificial Neural Networks (ANNs) are currently being used as function approximators in many state-of-the-art Reinforcement Learning (RL) algorithms. Spiking Neural Networks (SNNs) have been shown to drastically reduce the energy consumption of ANNs by encoding information in sparse temporal binary spike streams, hence emulating the communication mechanism of biological neurons. Due to their low energy consumption, SNNs are considered to be important candidates as co-processors to be implemented in mobile devices. In this work, the use of SNNs as stochastic policies is explored under an energy-efficient first-to-spike action rule, whereby the action taken by the RL agent is determined by the occurrence of the first spike among the output neurons. A policy gradient-based algorithm is derived considering a Generalized Linear Model (GLM) for spiking neurons. Experimental results demonstrate the capability of online-trained SNNs, used as stochastic policies, to gracefully trade off energy consumption, as measured by the number of spikes, against control performance. Significant gains are shown as compared to the standard approach of converting an offline-trained ANN into an SNN. |
Tasks | |
Published | 2018-10-23 |
URL | http://arxiv.org/abs/1810.09977v3 |
http://arxiv.org/pdf/1810.09977v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-first-to-spike-policies-for |
Repo | |
Framework | |
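The first-to-spike action rule is simple to state in code: each output neuron spikes stochastically, and the agent takes the action of whichever neuron fires first. The sketch below replaces the paper's GLM (which includes spike-history filters) with a plain sigmoid rate, so it is a simplification.

```python
import numpy as np

def first_to_spike_action(x, W, T=20, rng=np.random.default_rng(0)):
    """Sample spikes for each action neuron over T steps; the action is the
    index of the first neuron to spike (ties broken uniformly)."""
    p = 1.0 / (1.0 + np.exp(-W @ x))        # per-step spike probabilities
    for t in range(T):
        spikes = rng.random(len(p)) < p
        if spikes.any():
            return rng.choice(np.flatnonzero(spikes)), t
    return rng.integers(len(p)), T          # fallback: nothing spiked

W = np.random.default_rng(1).normal(size=(4, 8))   # 4 actions, 8 input features
x = np.random.default_rng(2).normal(size=8)
print(first_to_spike_action(x, W))   # (action, decision latency in steps)
```

Fewer steps until the first spike means fewer total spikes, which is the energy/performance trade-off the policy gradient tunes.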
Towards Robust Deep Neural Networks
Title | Towards Robust Deep Neural Networks |
Authors | Timothy E. Wang, Yiming Gu, Dhagash Mehta, Xiaojun Zhao, Edgar A. Bernal |
Abstract | We investigate the topics of sensitivity and robustness in feedforward and convolutional neural networks. Combining energy landscape techniques developed in computational chemistry with tools drawn from formal methods, we produce empirical evidence indicating that networks corresponding to lower-lying minima in the optimization landscape of the learning objective tend to be more robust. The robustness estimate used is the inverse of a proposed sensitivity measure, which we define as the volume of an over-approximation of the reachable set of network outputs under all additive $l_{\infty}$-bounded perturbations on the input data. We present a novel loss function which includes a sensitivity term in addition to the traditional task-oriented and regularization terms. In our experiments on standard machine learning and computer vision datasets, we show that the proposed loss function leads to networks which reliably optimize the robustness measure as well as other related metrics of adversarial robustness without significant degradation in the classification error. Experimental results indicate that the proposed method outperforms state-of-the-art sensitivity-based learning approaches with regard to robustness to adversarial attacks. We also show that although the introduced framework does not explicitly enforce an adversarial loss, it achieves competitive overall performance relative to methods that do. |
Tasks | |
Published | 2018-10-27 |
URL | http://arxiv.org/abs/1810.11726v2 |
http://arxiv.org/pdf/1810.11726v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-robust-deep-neural-networks |
Repo | |
Framework | |
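One concrete way to realize the proposed sensitivity measure, the volume of an over-approximation of the reachable output set under $l_{\infty}$-bounded input perturbations, is interval arithmetic through the layers. The paper's over-approximation may differ; this sketch assumes a one-hidden-layer ReLU network.

```python
import numpy as np

def output_box_log_volume(W1, b1, W2, b2, x, eps):
    """Propagate the box ||delta||_inf <= eps through linear -> ReLU -> linear
    with center/radius interval arithmetic; return the log-volume of the
    resulting output bounding box (the sensitivity proxy)."""
    c1 = W1 @ x + b1
    r1 = np.abs(W1) @ np.full_like(x, eps)
    lo, hi = np.maximum(c1 - r1, 0), np.maximum(c1 + r1, 0)  # ReLU is monotone
    c2, r2 = (hi + lo) / 2, (hi - lo) / 2
    out_r = np.abs(W2) @ r2                  # output interval half-widths
    return np.sum(np.log(2 * out_r + 1e-12))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(32, 10)), np.zeros(32)
W2, b2 = rng.normal(size=(3, 32)), np.zeros(3)
print(output_box_log_volume(W1, b1, W2, b2, rng.normal(size=10), eps=0.1))
```

Adding such a term to the training loss penalizes networks whose outputs can move far under small input perturbations, which is the spirit of the proposed loss function.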
A Parametric Top-View Representation of Complex Road Scenes
Title | A Parametric Top-View Representation of Complex Road Scenes |
Authors | Ziyan Wang, Buyu Liu, Samuel Schulter, Manmohan Chandraker |
Abstract | In this paper, we address the problem of inferring the layout of complex road scenes given a single camera as input. To achieve that, we first propose a novel parameterized model of road layouts in a top-view representation, which is not only intuitive for human visualization but also provides an interpretable interface for higher-level decision making. Moreover, the design of our top-view scene model allows for efficient sampling and thus generation of large-scale simulated data, which we leverage to train a deep neural network to infer our scene model’s parameters. Specifically, our proposed training procedure uses supervised domain-adaptation techniques to incorporate both simulated as well as manually annotated data. Finally, we design a Conditional Random Field (CRF) that enforces coherent predictions for a single frame and encourages temporal smoothness among video frames. Experiments on two public data sets show that: (1) Our parametric top-view model is representative enough to describe complex road scenes; (2) The proposed method outperforms baselines trained on manually-annotated or simulated data only, thus getting the best of both; (3) Our CRF is able to generate temporally smooth yet semantically meaningful results. |
Tasks | Decision Making, Domain Adaptation |
Published | 2018-12-14 |
URL | http://arxiv.org/abs/1812.06152v2 |
http://arxiv.org/pdf/1812.06152v2.pdf | |
PWC | https://paperswithcode.com/paper/a-parametric-top-view-representation-of |
Repo | |
Framework | |
Pooling is neither necessary nor sufficient for appropriate deformation stability in CNNs
Title | Pooling is neither necessary nor sufficient for appropriate deformation stability in CNNs |
Authors | Avraham Ruderman, Neil C. Rabinowitz, Ari S. Morcos, Daniel Zoran |
Abstract | Many of our core assumptions about how neural networks operate remain empirically untested. One common assumption is that convolutional neural networks need to be stable to small translations and deformations to solve image recognition tasks. For many years, this stability was baked into CNN architectures by incorporating interleaved pooling layers. Recently, however, interleaved pooling has largely been abandoned. This raises a number of questions: Are our intuitions about deformation stability right at all? Is it important? Is pooling necessary for deformation invariance? If not, how is deformation invariance achieved in its absence? In this work, we rigorously test these questions, and find that deformation stability in convolutional networks is more nuanced than it first appears: (1) Deformation invariance is not a binary property; rather, different tasks require different degrees of deformation stability at different layers. (2) Deformation stability is not a fixed property of a network and is heavily adjusted over the course of training, largely through the smoothness of the convolutional filters. (3) Interleaved pooling layers are neither necessary nor sufficient for achieving the optimal form of deformation stability for natural image classification. (4) Pooling confers too much deformation stability for image classification at initialization, and during training, networks have to learn to counteract this inductive bias. Together, these findings provide new insights into the role of interleaved pooling and deformation invariance in CNNs, and demonstrate the importance of rigorous empirical testing of even our most basic assumptions about the working of neural networks. |
Tasks | Image Classification |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04438v2 |
http://arxiv.org/pdf/1804.04438v2.pdf | |
PWC | https://paperswithcode.com/paper/pooling-is-neither-necessary-nor-sufficient |
Repo | |
Framework | |
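Measuring deformation stability, as the paper does at scale, boils down to comparing a representation before and after small input deformations. A down-scaled probe using translations only (the paper uses richer smooth deformation fields and real networks):

```python
import numpy as np

def translation_sensitivity(features, image, max_shift=3, trials=20,
                            rng=np.random.default_rng(0)):
    """Median relative change of a feature map under random small shifts."""
    base = features(image)
    changes = []
    for _ in range(trials):
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        shifted = np.roll(image, (dy, dx), axis=(0, 1))
        changes.append(np.linalg.norm(features(shifted) - base)
                       / np.linalg.norm(base))
    return float(np.median(changes))

# toy "network": averaging over 8x8 patches, i.e. aggressive pooling
pooled_features = lambda img: img.reshape(4, 8, 4, 8).mean(axis=(1, 3)).ravel()
img = np.random.default_rng(1).normal(size=(32, 32))
print(translation_sensitivity(pooled_features, img))   # small = very stable
```

Tracking this quantity layer by layer over training is how one observes stability being adjusted through filter smoothness rather than fixed by pooling.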
Scalable Centralized Deep Multi-Agent Reinforcement Learning via Policy Gradients
Title | Scalable Centralized Deep Multi-Agent Reinforcement Learning via Policy Gradients |
Authors | Arbaaz Khan, Clark Zhang, Daniel D. Lee, Vijay Kumar, Alejandro Ribeiro |
Abstract | In this paper, we explore using deep reinforcement learning for problems with multiple agents. Most existing methods for deep multi-agent reinforcement learning consider only a small number of agents. When the number of agents increases, the dimensionality of the input and control spaces increases as well, and these methods do not scale well. To address this, we propose casting the multi-agent reinforcement learning problem as a distributed optimization problem. Our algorithm assumes that for multi-agent settings, policies of individual agents in a given population live close to each other in parameter space and can be approximated by a single policy. With this simple assumption, we show our algorithm to be extremely effective for reinforcement learning in multi-agent settings. We demonstrate its effectiveness against existing comparable approaches on co-operative and competitive tasks. |
Tasks | Distributed Optimization, Multi-agent Reinforcement Learning |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08776v1 |
http://arxiv.org/pdf/1805.08776v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-centralized-deep-multi-agent |
Repo | |
Framework | |
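The key assumption, that a population of agents with nearby tasks can be served by a single policy, can be illustrated on a toy multi-agent bandit: one shared softmax policy is updated with REINFORCE gradients averaged over agents whose reward profiles are small perturbations of a common base. This is an illustrative reduction, not the paper's continuous-control setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_actions, eta = 8, 4, 0.5
base = np.array([0.1, 0.2, 0.8, 0.3])                   # arm 2 is best
rewards = base + 0.05 * rng.normal(size=(n_agents, n_actions))

theta = np.zeros(n_actions)        # ONE shared policy for the whole population

for _ in range(500):
    probs = np.exp(theta - theta.max())
    probs /= probs.sum()
    grad = np.zeros(n_actions)
    for i in range(n_agents):
        a = rng.choice(n_actions, p=probs)
        r = rewards[i, a] + 0.1 * rng.normal()
        grad += r * (np.eye(n_actions)[a] - probs)      # r * grad log pi(a)
    theta += eta * grad / n_agents                      # average over agents

print(np.round(probs, 3))          # probability mass concentrates on arm 2
```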