January 30, 2020

Paper Group ANR 386

Graph Colouring Meets Deep Learning: Effective Graph Neural Network Models for Combinatorial Problems

Title Graph Colouring Meets Deep Learning: Effective Graph Neural Network Models for Combinatorial Problems
Authors Henrique Lemos, Marcelo Prates, Pedro Avelar, Luis Lamb
Abstract Deep learning has consistently defied state-of-the-art techniques in many fields over the last decade. However, we are just beginning to understand the capabilities of neural learning in symbolic domains. Deep learning architectures that employ parameter sharing over graphs can produce models which can be trained on complex properties of relational data. These include highly relevant NP-Complete problems, such as SAT and TSP. In this work, we showcase how Graph Neural Networks (GNN) can be engineered – with a very simple architecture – to solve the fundamental combinatorial problem of graph colouring. Our results show that the model, which achieves high accuracy upon training on random instances, is able to generalise to graph distributions different from those seen at training time. Further, it performs better than the NeuroSAT, Tabucol and greedy baselines for some distributions. In addition, we show how vertex embeddings can be clustered in multidimensional spaces to yield constructive solutions even though our model is only trained as a binary classifier. In summary, our results help narrow the gap in our understanding of the algorithms learned by GNNs, and add empirical evidence for their capability on hard combinatorial problems. Our results thus contribute to the standing challenge of integrating robust learning and symbolic reasoning in Deep Learning systems.
Tasks
Published 2019-03-11
URL https://arxiv.org/abs/1903.04598v2
PDF https://arxiv.org/pdf/1903.04598v2.pdf
PWC https://paperswithcode.com/paper/graph-colouring-meets-deep-learning-effective
Repo
Framework
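
For orientation, the kind of greedy baseline the abstract compares against can be sketched in a few lines. This is not the paper's GNN model, and the vertex ordering, the function names and the 5-cycle example are assumptions made purely for illustration.

```python
def greedy_colouring(adjacency):
    """Give each vertex the smallest colour unused by its coloured neighbours.

    `adjacency` maps a vertex to a list of its neighbours (undirected graph).
    Vertices are visited in sorted order, which is an arbitrary choice here.
    """
    colours = {}
    for v in sorted(adjacency):
        used = {colours[u] for u in adjacency[v] if u in colours}
        c = 0
        while c in used:
            c += 1
        colours[v] = c
    return colours


def is_proper(adjacency, colours):
    """Check that no edge joins two vertices of the same colour."""
    return all(colours[u] != colours[v] for u in adjacency for v in adjacency[u])


# Hypothetical example: a 5-cycle, whose chromatic number is 3.
cycle5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
colouring = greedy_colouring(cycle5)
num_colours = len(set(colouring.values()))
```

A greedy pass always returns a proper colouring but may use more colours than necessary on other graphs, which is why it serves only as a baseline.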

MDP Playground: Meta-Features in Reinforcement Learning

Title MDP Playground: Meta-Features in Reinforcement Learning
Authors Raghu Rajan, Frank Hutter
Abstract Reinforcement Learning (RL) algorithms usually do not try to identify specific features of environments which could help them perform better. Here, we present a few key meta-features of environments: delayed rewards, specific reward sequences, sparsity of rewards, and stochasticity of environments, adapting to which should help RL agents perform better. While it is very time-consuming to run RL algorithms on standard benchmarks, we define a parameterised collection of fast-to-run toy benchmarks in OpenAI Gym by varying these meta-features. Despite their toy nature and low compute requirements, we show that these benchmarks present substantial difficulties to current RL algorithms. Furthermore, since we can generate environments with a desired value for each of the meta-features, we have fine-grained control over the environments’ difficulty and also have the ground truth available for evaluating algorithms. We believe that devising algorithms that can detect such meta-features of environments and adapt to them will be key to creating robust RL algorithms that work in a variety of different real-world problems.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.07750v2
PDF https://arxiv.org/pdf/1909.07750v2.pdf
PWC https://paperswithcode.com/paper/mdp-playground-meta-features-in-reinforcement
Repo
Framework

Improving Safety in Reinforcement Learning Using Model-Based Architectures and Human Intervention

Title Improving Safety in Reinforcement Learning Using Model-Based Architectures and Human Intervention
Authors Bharat Prakash, Mohit Khatwani, Nicholas Waytowich, Tinoosh Mohsenin
Abstract Recent progress in AI and reinforcement learning has shown great success in solving complex problems with high-dimensional state spaces. However, most of these successes have been primarily in simulated environments where failure is of little or no consequence. Most real-world applications, however, require training solutions that are safe to operate, as catastrophic failures are inadmissible, especially when there is human interaction involved. Currently, Safe RL systems use human oversight during training and exploration in order to make sure the RL agent does not go into a catastrophic state. These methods require a large amount of human labor and are very difficult to scale up. We present a hybrid method for reducing the human intervention time by combining model-based approaches and training a supervised learner to improve sample efficiency while also ensuring safety. We evaluate these methods on various grid-world environments using both standard and visual representations and show that our approach achieves better performance in terms of sample efficiency, number of catastrophic states reached, and overall task performance compared to traditional model-free approaches.
Tasks
Published 2019-03-22
URL http://arxiv.org/abs/1903.09328v1
PDF http://arxiv.org/pdf/1903.09328v1.pdf
PWC https://paperswithcode.com/paper/improving-safety-in-reinforcement-learning
Repo
Framework

Domain Bridge for Unpaired Image-to-Image Translation and Unsupervised Domain Adaptation

Title Domain Bridge for Unpaired Image-to-Image Translation and Unsupervised Domain Adaptation
Authors Fabio Pizzati, Raoul de Charette, Michela Zaccaria, Pietro Cerri
Abstract Image-to-image translation architectures may have limited effectiveness in some circumstances. For example, while generating rainy scenarios, they may fail to model typical traits of rain such as water drops, and this ultimately impacts the realism of the synthetic images. With our method, called domain bridge, web-crawled data are exploited to reduce the domain gap, leading to the inclusion of previously ignored elements in the generated images. We make use of a network for clear-to-rain translation trained with the domain bridge to extend our work to Unsupervised Domain Adaptation (UDA). In that context, we introduce an online multimodal style-sampling strategy, where image translation multimodality is exploited at training time to improve performance. Finally, a novel approach for self-supervised learning is presented, and used to further align the domains. With our contributions, we increase the realism of the generated images while reaching performance on par with the UDA state of the art, with a simpler approach.
Tasks Domain Adaptation, Image-to-Image Translation, Unsupervised Domain Adaptation
Published 2019-10-23
URL https://arxiv.org/abs/1910.10563v3
PDF https://arxiv.org/pdf/1910.10563v3.pdf
PWC https://paperswithcode.com/paper/domain-bridge-for-unpaired-image-to-image
Repo
Framework

Lower Bounds for Learning Distributions under Communication Constraints via Fisher Information

Title Lower Bounds for Learning Distributions under Communication Constraints via Fisher Information
Authors Leighton Pate Barnes, Yanjun Han, Ayfer Ozgur
Abstract We consider the problem of learning high-dimensional, nonparametric and structured (e.g. Gaussian) distributions in distributed networks, where each node in the network observes an independent sample from the underlying distribution and can use $k$ bits to communicate its sample to a central processor. We consider three different models for communication. Under the independent model, each node communicates its sample to a central processor by independently encoding it into $k$ bits. Under the more general sequential or blackboard communication models, nodes can share information interactively but each node is restricted to write at most $k$ bits on the final transcript. We characterize the impact of the communication constraint $k$ on the minimax risk of estimating the underlying distribution under $\ell^2$ loss. We develop minimax lower bounds that apply in a unified way to many common statistical models and reveal that the impact of the communication constraint can be qualitatively different depending on the tail behavior of the score function associated with each model. A key ingredient in our proofs is a geometric characterization of Fisher information from quantized samples.
Tasks
Published 2019-02-07
URL https://arxiv.org/abs/1902.02890v2
PDF https://arxiv.org/pdf/1902.02890v2.pdf
PWC https://paperswithcode.com/paper/learning-distributions-from-their-samples
Repo
Framework

Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games

Title Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games
Authors Zuyue Fu, Zhuoran Yang, Yongxin Chen, Zhaoran Wang
Abstract We study discrete-time mean-field Markov games with infinite numbers of agents where each agent aims to minimize its ergodic cost. We consider the setting where the agents have identical linear state transitions and quadratic cost functions, while the aggregated effect of the agents is captured by the population mean of their states, namely, the mean-field state. For such a game, based on the Nash certainty equivalence principle, we provide sufficient conditions for the existence and uniqueness of its Nash equilibrium. Moreover, to find the Nash equilibrium, we propose a mean-field actor-critic algorithm with linear function approximation, which does not require knowing the model of dynamics. Specifically, at each iteration of our algorithm, we use the single-agent actor-critic algorithm to approximately obtain the optimal policy of each agent given the current mean-field state, and then update the mean-field state. In particular, we prove that our algorithm converges to the Nash equilibrium at a linear rate. To the best of our knowledge, this is the first success of applying model-free reinforcement learning with function approximation to discrete-time mean-field Markov games with provable non-asymptotic global convergence guarantees.
Tasks
Published 2019-10-16
URL https://arxiv.org/abs/1910.07498v1
PDF https://arxiv.org/pdf/1910.07498v1.pdf
PWC https://paperswithcode.com/paper/actor-critic-provably-finds-nash-equilibria
Repo
Framework

CloudLSTM: A Recurrent Neural Model for Spatiotemporal Point-cloud Stream Forecasting

Title CloudLSTM: A Recurrent Neural Model for Spatiotemporal Point-cloud Stream Forecasting
Authors Chaoyun Zhang, Marco Fiore, Iain Murray, Paul Patras
Abstract This paper introduces CloudLSTM, a new branch of recurrent neural models tailored to forecasting over data streams generated by geospatial point-cloud sources. We design a Dynamic Point-cloud Convolution (D-Conv) operator as the core component of CloudLSTMs, which performs convolution directly over point-clouds and extracts local spatial features from sets of neighboring points that surround different elements of the input. This operator maintains the permutation invariance of sequence-to-sequence learning frameworks, while representing neighboring correlations at each time step – an important aspect in spatiotemporal predictive learning. The D-Conv operator resolves the grid-structural data requirements of existing spatiotemporal forecasting models and can be easily plugged into traditional LSTM architectures with sequence-to-sequence learning and attention mechanisms. We apply our proposed architecture to two representative, practical use cases that involve point-cloud streams, i.e. mobile service traffic forecasting and air quality indicator forecasting. Our results, obtained with real-world datasets collected in diverse scenarios for each use case, show that CloudLSTM delivers accurate long-term predictions, outperforming a variety of neural network models.
Tasks
Published 2019-07-29
URL https://arxiv.org/abs/1907.12410v2
PDF https://arxiv.org/pdf/1907.12410v2.pdf
PWC https://paperswithcode.com/paper/cloudlstm-a-recurrent-neural-model-for
Repo
Framework

Dreaming machine learning: Lipschitz extensions for reinforcement learning on financial markets

Title Dreaming machine learning: Lipschitz extensions for reinforcement learning on financial markets
Authors J. M. Calabuig, H. Falciani, E. A. Sánchez-Pérez
Abstract We consider a quasi-metric topological structure for the construction of a new reinforcement learning model in the framework of financial markets. It is based on a Lipschitz type extension of reward functions defined in metric spaces. Specifically, the McShane and Whitney extensions are considered for a reward function which is defined by the total evaluation of the benefits produced by the investment decision at a given time. We define the metric as a linear combination of a Euclidean distance and an angular metric component. All information about the evolution of the system from the beginning of the time interval is used to support the extension of the reward function, but in addition this data set is enriched by adding some artificially produced states. Thus, the main novelty of our method is the way we produce more states – which we call “dreams” – to enrich learning. Using some known states of the dynamical system that represents the evolution of the financial market, we use our technique to simulate new states by interpolating real states and introducing some random variables. These new states are used to feed a learning algorithm designed to improve the investment strategy by following a typical reinforcement learning scheme.
Tasks
Published 2019-07-09
URL https://arxiv.org/abs/1907.05697v2
PDF https://arxiv.org/pdf/1907.05697v2.pdf
PWC https://paperswithcode.com/paper/dreaming-machine-learning-lipschitz
Repo
Framework
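
The Lipschitz-type extensions named in the abstract can be illustrated with a small sketch. It assumes a plain Euclidean metric rather than the paper's combination of Euclidean and angular components, and the sample states and rewards are invented for illustration. Because sources differ on which formula carries McShane's name and which Whitney's, the two functions are simply called lower and upper extensions here.

```python
import math
from itertools import combinations


def lipschitz_constant(points, values, dist=math.dist):
    """Smallest L with |f(s) - f(t)| <= L * d(s, t) over the known points."""
    return max(
        abs(values[i] - values[j]) / dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    )


def lower_extension(x, points, values, L, dist=math.dist):
    """sup over known s of f(s) - L * d(x, s)."""
    return max(f - L * dist(x, s) for s, f in zip(points, values))


def upper_extension(x, points, values, L, dist=math.dist):
    """inf over known s of f(s) + L * d(x, s)."""
    return min(f + L * dist(x, s) for s, f in zip(points, values))


# Hypothetical known states and their rewards.
states = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
rewards = [0.0, 1.0, 1.0]
L = lipschitz_constant(states, rewards)

new_state = (2.0, 2.0)
low = lower_extension(new_state, states, rewards, L)
high = upper_extension(new_state, states, rewards, L)
```

Both extensions reproduce the known rewards on the original states and keep the same Lipschitz constant; any value between `low` and `high` is a consistent estimate of the reward at a new state.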

Climate-driven statistical models as effective predictors of local dengue incidence in Costa Rica: A Generalized Additive Model and Random Forest approach

Title Climate-driven statistical models as effective predictors of local dengue incidence in Costa Rica: A Generalized Additive Model and Random Forest approach
Authors Paola Vásquez, Antonio Loría, Fabio Sanchez, Luis A. Barboza
Abstract Climate has been an important factor in shaping the distribution and incidence of dengue cases in tropical and subtropical countries. In Costa Rica, a tropical country with distinctive micro-climates, dengue has been endemic since its introduction in 1993, inflicting substantial economic, social, and public health repercussions. Using the number of reported dengue cases and climate data from 2007-2017, we fitted a prediction model applying a Generalized Additive Model (GAM) and Random Forest (RF) approach, which allowed us to retrospectively predict the relative risk of dengue in five climatologically diverse municipalities around the country.
Tasks
Published 2019-07-30
URL https://arxiv.org/abs/1907.13095v2
PDF https://arxiv.org/pdf/1907.13095v2.pdf
PWC https://paperswithcode.com/paper/climate-driven-statistical-models-as
Repo
Framework

Regulating AI: do we need new tools?

Title Regulating AI: do we need new tools?
Authors Otello Ardovino, Jacopo Arpetti, Marco Delmastro
Abstract The Artificial Intelligence paradigm (hereinafter referred to as “AI”) builds on the analysis of data able, among other things, to snap pictures of individuals’ behaviors and preferences. Such data represent the most valuable currency in the digital ecosystem, where their value derives from their being a fundamental asset for training machines with a view to developing AI applications. In this environment, online providers attract users by offering them services for free and getting in exchange the data generated through the usage of such services. This swap, characterized by an implicit nature, constitutes the focus of the present paper, in the light of the disequilibria, as well as market failures, that it may bring about. We use mobile apps and the related permission system as an ideal environment to explore those issues via econometric tools. The results, stemming from a dataset of over one million observations, show that both buyers and sellers are aware that access to digital services implicitly implies an exchange of data, although this has no considerable impact either on the level of downloads (demand) or on the level of prices (supply). In other words, the implicit nature of this exchange does not allow market indicators to work efficiently. We conclude that current policies (e.g. transparency rules) may be inherently biased and we put forward suggestions for a new approach.
Tasks
Published 2019-04-27
URL http://arxiv.org/abs/1904.12134v1
PDF http://arxiv.org/pdf/1904.12134v1.pdf
PWC https://paperswithcode.com/paper/regulating-ai-do-we-need-new-tools
Repo
Framework

Combination of linear classifiers using score function – analysis of possible combination strategies

Title Combination of linear classifiers using score function – analysis of possible combination strategies
Authors Pawel Trajdos, Robert Burduk
Abstract In this work, we addressed the issue of combining linear classifiers using their score functions. The value of the scoring function depends on the distance from the decision boundary. Two score functions have been tested and four different combination strategies were investigated. During the experimental study, the proposed approach was applied to the heterogeneous ensemble and it was compared to two reference methods: majority voting and model averaging. The comparison was made in terms of seven different quality criteria. The results show that the combination strategies based on the simple average and the trimmed average are the best of the geometrical combination strategies.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.09522v1
PDF https://arxiv.org/pdf/1905.09522v1.pdf
PWC https://paperswithcode.com/paper/combination-of-linear-classifiers-using-score
Repo
Framework
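
To make the two best-performing strategies concrete, here is a minimal sketch of averaging distance-based scores from several linear classifiers. The abstract does not specify the paper's two score functions, so the signed distance to the hyperplane used below is an assumption, as are the toy classifiers and the trimming of one score at each end.

```python
import math


def signed_distance(w, b, x):
    """Signed Euclidean distance from x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def simple_average(scores):
    """Arithmetic mean of all classifier scores."""
    return sum(scores) / len(scores)


def trimmed_average(scores, trim=1):
    """Mean after dropping the `trim` lowest and `trim` highest scores."""
    if len(scores) <= 2 * trim:
        raise ValueError("not enough scores to trim")
    kept = sorted(scores)[trim:len(scores) - trim]
    return sum(kept) / len(kept)


# Hypothetical ensemble of three linear classifiers given as (weights, bias).
ensemble = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, 1.0), -10.0)]
x = (3.0, 4.0)
scores = [signed_distance(w, b, x) for w, b in ensemble]

mean_score = simple_average(scores)
trimmed_score = trimmed_average(scores)
# The sign of the combined score gives the predicted class.
predicted_positive = trimmed_score > 0
```

The trimmed average discards the most extreme scores, so a single outlying classifier (the third one here) has less influence than under the simple average.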

Differential Privacy for Sparse Classification Learning

Title Differential Privacy for Sparse Classification Learning
Authors Puyu Wang, Hai Zhang
Abstract In this paper, we present a differentially private version of a convex and nonconvex sparse classification approach. Based on the alternating direction method of multipliers (ADMM) algorithm, we transform the solution of the sparse problem into a multistep iteration process. Then we add exponential noise to stable steps to achieve privacy protection. By the post-processing property of differential privacy, the proposed approach satisfies $\epsilon$-differential privacy even when the original problem is unstable. Furthermore, we present the theoretical privacy bound of the differential privacy classification algorithm. Specifically, the privacy bound of our algorithm is controlled by the algorithm iteration number, the privacy parameter, the parameter of the loss function, the ADMM pre-selected parameter, and the data size. Finally, we apply our framework to logistic regression with an $L_1$ regularizer and logistic regression with an $L_{1/2}$ regularizer. Numerical studies demonstrate that our method is both effective and efficient and performs well in sensitive data analysis.
Tasks
Published 2019-08-02
URL https://arxiv.org/abs/1908.00780v1
PDF https://arxiv.org/pdf/1908.00780v1.pdf
PWC https://paperswithcode.com/paper/differential-privacy-for-sparse
Repo
Framework

Sinkhorn Algorithm as a Special Case of Stochastic Mirror Descent

Title Sinkhorn Algorithm as a Special Case of Stochastic Mirror Descent
Authors Konstantin Mishchenko
Abstract We present a new perspective on the celebrated Sinkhorn algorithm by showing that it is a special case of incremental/stochastic mirror descent. In order to see this, one should simply plug the Kullback-Leibler divergence into both the mirror map and the objective function. Since the problem has an unbounded domain, the objective function is neither smooth nor has bounded gradients. However, one can still approach the problem using the notion of relative smoothness, obtaining that the stochastic objective is 1-relative smooth. The discovered equivalence allows us to propose 1) new methods for optimal transport, 2) an extension of the Sinkhorn algorithm beyond two constraints.
Tasks
Published 2019-09-16
URL https://arxiv.org/abs/1909.06918v1
PDF https://arxiv.org/pdf/1909.06918v1.pdf
PWC https://paperswithcode.com/paper/sinkhorn-algorithm-as-a-special-case-of
Repo
Framework
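
As background for the abstract above, the classical Sinkhorn iteration for entropy-regularised optimal transport can be sketched as alternating row and column rescaling. This shows the standard matrix-scaling form, not the paper's mirror-descent derivation; the regularisation strength `eps`, the iteration count and the toy marginals are assumptions chosen for illustration.

```python
import math


def sinkhorn(cost, a, b, eps=1.0, n_iters=200):
    """Return a transport plan whose row sums approach a and column sums match b.

    cost: n x m list of lists, a: n source weights, b: m target weights.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps).
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows to match a, then columns to match b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical 2 x 2 problem.
cost = [[0.0, 1.0], [1.0, 0.0]]
a = [0.3, 0.7]
b = [0.6, 0.4]
plan = sinkhorn(cost, a, b, eps=1.0, n_iters=200)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(2)) for j in range(2)]
```

Each half-step enforces one of the two marginal constraints exactly, which is the alternating structure that the abstract reinterprets as incremental mirror descent with a Kullback-Leibler mirror map.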

AI Aided Noise Processing of Spintronic Based IoT Sensor for Magnetocardiography Application

Title AI Aided Noise Processing of Spintronic Based IoT Sensor for Magnetocardiography Application
Authors Attayeb Mohsen, Muftah Al-Mahdawi, Mostafa M. Fouda, Mikihiko Oogane, Yasuo Ando, Zubair Md Fadlullah
Abstract As we are about to embark upon the highly hyped “Society 5.0”, powered by the Internet of Things (IoT), traditional ways to monitor human heart signals for tracking cardio-vascular conditions are challenging, particularly in remote healthcare settings. On the merits of low power consumption, portability, and non-intrusiveness, there are no suitable IoT solutions that can provide information comparable to the conventional Electrocardiography (ECG). In this paper, we propose an IoT device utilizing a spintronic ultra-sensitive sensor that measures the magnetic fields produced by cardio-vascular electrical activity, i.e. Magnetocardiography (MCG). After that, we treat the low-frequency noise generated by the sensors, which is also a challenge for most other sensors dealing with low-frequency bio-magnetic signals. Instead of relying on generic signal processing techniques such as averaging or filtering, we employ deep-learning training on bio-magnetic signals. Using an existing dataset of ECG records, MCG labels are synthetically constructed. A unique deep learning structure composed of a Convolutional Neural Network (CNN) combined with a Gated Recurrent Unit (GRU) is trained using the labeled data moving through a striding window, which is able to smartly capture and eliminate the noise features. Simulation results are reported to evaluate the effectiveness of the proposed method, which demonstrates encouraging performance.
Tasks Electrocardiography (ECG)
Published 2019-11-08
URL https://arxiv.org/abs/1911.03127v1
PDF https://arxiv.org/pdf/1911.03127v1.pdf
PWC https://paperswithcode.com/paper/ai-aided-noise-processing-of-spintronic-based
Repo
Framework

DEMN: Distilled-Exposition Enhanced Matching Network for Story Comprehension

Title DEMN: Distilled-Exposition Enhanced Matching Network for Story Comprehension
Authors Chunhua Liu, Haiou Zhang, Shan Jiang, Dong Yu
Abstract This paper proposes a Distilled-Exposition Enhanced Matching Network (DEMN) for the story-cloze test, which is still a challenging task in story comprehension. We divide a complete story into three narrative segments: an exposition, a climax, and an ending. The model consists of three modules: input module, matching module, and distillation module. The input module provides semantic representations for the three segments and then feeds them into the other two modules. The matching module collects interaction features between the ending and the climax. The distillation module distills the crucial semantic information in the exposition and infuses it into the matching module in two different ways. We evaluate our single and ensemble models on the ROCStories Corpus (Mostafazadeh et al., 2016), achieving an accuracy of 80.1% and 81.2% on the test set respectively. The experimental results demonstrate that our DEMN model achieves state-of-the-art performance.
Tasks
Published 2019-01-08
URL http://arxiv.org/abs/1901.02252v1
PDF http://arxiv.org/pdf/1901.02252v1.pdf
PWC https://paperswithcode.com/paper/demn-distilled-exposition-enhanced-matching
Repo
Framework