October 17, 2019


Paper Group ANR 875



Optimizing Market Making using Multi-Agent Reinforcement Learning

Title Optimizing Market Making using Multi-Agent Reinforcement Learning
Authors Yagna Patel
Abstract In this paper, reinforcement learning is applied to the problem of optimizing market making. A multi-agent reinforcement learning framework is used to optimally place limit orders that lead to successful trades. The framework consists of two agents. The macro-agent optimizes on making the decision to buy, sell, or hold an asset. The micro-agent optimizes on placing limit orders within the limit order book. For the context of this paper, the proposed framework is applied and studied on the Bitcoin cryptocurrency market. The goal of this paper is to show that reinforcement learning is a viable strategy that can be applied to complex problems (with complex environments) such as market making.
Tasks Multi-agent Reinforcement Learning
Published 2018-12-26
URL http://arxiv.org/abs/1812.10252v1
PDF http://arxiv.org/pdf/1812.10252v1.pdf
PWC https://paperswithcode.com/paper/optimizing-market-making-using-multi-agent
Repo
Framework

Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition

Title Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition
Authors Chaojian Yu, Xinyi Zhao, Qi Zheng, Peng Zhang, Xinge You
Abstract Fine-grained visual recognition is challenging because it highly relies on the modeling of various semantic parts and fine-grained feature learning. Bilinear pooling based models have been shown to be effective at fine-grained recognition, while most previous approaches neglect the fact that inter-layer part feature interaction and fine-grained feature learning are mutually correlated and can reinforce each other. In this paper, we present a novel model to address these issues. First, a cross-layer bilinear pooling approach is proposed to capture the inter-layer part feature relations, which results in superior performance compared with other bilinear pooling based approaches. Second, we propose a novel hierarchical bilinear pooling framework to integrate multiple cross-layer bilinear features to enhance their representation capability. Our formulation is intuitive, efficient and achieves state-of-the-art results on the widely used fine-grained recognition datasets.
Tasks Fine-Grained Visual Recognition
Published 2018-07-26
URL http://arxiv.org/abs/1807.09915v1
PDF http://arxiv.org/pdf/1807.09915v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-bilinear-pooling-for-fine
Repo
Framework
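
The cross-layer bilinear pooling idea in the abstract above can be illustrated with a minimal sketch. It assumes the two layers have been brought to the same spatial grid, and it uses a plain sum-pooled outer product; the function name and the list-of-vectors layout are hypothetical, and this is not the authors' implementation.

```python
def cross_layer_bilinear_pool(feat_a, feat_b):
    """Sum-pooled outer product of features taken from two different layers.

    feat_a, feat_b: lists of per-location feature vectors (one vector per
    spatial position); both layers must cover the same positions.
    Returns a len(feat_a[0]) x len(feat_b[0]) matrix as nested lists.
    """
    if len(feat_a) != len(feat_b):
        raise ValueError("both layers must have the same number of locations")
    dim_a, dim_b = len(feat_a[0]), len(feat_b[0])
    pooled = [[0.0] * dim_b for _ in range(dim_a)]
    # Accumulate the outer product x y^T over all spatial locations.
    for x, y in zip(feat_a, feat_b):
        for i, xi in enumerate(x):
            for j, yj in enumerate(y):
                pooled[i][j] += xi * yj
    return pooled


layer_a = [[1.0, 2.0], [3.0, 4.0]]            # 2 locations, 2 channels
layer_b = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # 2 locations, 3 channels
pooled = cross_layer_bilinear_pool(layer_a, layer_b)
```

Each entry of the pooled matrix measures how strongly one channel of the first layer co-activates with one channel of the second layer across the image, which is one way to read "inter-layer part feature relations".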

Character-Level Feature Extraction with Densely Connected Networks

Title Character-Level Feature Extraction with Densely Connected Networks
Authors Chanhee Lee, Young-Bum Kim, Dongyub Lee, HeuiSeok Lim
Abstract Generating character-level features is an important step for achieving good results in various natural language processing tasks. To alleviate the need for human labor in generating hand-crafted features, methods that utilize neural architectures such as Convolutional Neural Network (CNN) or Recurrent Neural Network (RNN) to automatically extract such features have been proposed and have shown great results. However, CNN generates position-independent features, and RNN is slow since it needs to process the characters sequentially. In this paper, we propose a novel method of using a densely connected network to automatically extract character-level features. The proposed method does not require any language or task specific assumptions, and shows robustness and effectiveness while being faster than CNN- or RNN-based methods. Evaluating this method on three sequence labeling tasks - slot tagging, Part-of-Speech (POS) tagging, and Named-Entity Recognition (NER) - we obtain state-of-the-art performance with a 96.62 F1-score and 97.73% accuracy on slot tagging and POS tagging, respectively, and comparable performance to the state-of-the-art 91.13 F1-score on NER.
Tasks Named Entity Recognition, Part-Of-Speech Tagging
Published 2018-06-24
URL http://arxiv.org/abs/1806.09089v2
PDF http://arxiv.org/pdf/1806.09089v2.pdf
PWC https://paperswithcode.com/paper/character-level-feature-extraction-with
Repo
Framework

Decentralized Likelihood Quantile Networks for Improving Performance in Deep Multi-Agent Reinforcement Learning

Title Decentralized Likelihood Quantile Networks for Improving Performance in Deep Multi-Agent Reinforcement Learning
Authors Xueguang Lu, Christopher Amato
Abstract Recent successes of value-based multi-agent deep reinforcement learning employ optimism by limiting underestimation updates of value function estimator, through carefully controlled learning rate (Omidshafiei et al., 2017) or reduced update probability (Palmer et al., 2018). To achieve full cooperation when learning independently, an agent must estimate the state values contingent on having optimal teammates; therefore, value overestimation is frequently injected to counteract negative effects caused by unobservable teammate sub-optimal policies and explorations. Aiming to solve this issue through automatic scheduling, this paper introduces a decentralized quantile estimator, which we found empirically to be more stable, sample efficient and more likely to converge to the joint optimal policy.
Tasks Multi-agent Reinforcement Learning
Published 2018-12-15
URL http://arxiv.org/abs/1812.06319v4
PDF http://arxiv.org/pdf/1812.06319v4.pdf
PWC https://paperswithcode.com/paper/decentralized-likelihood-quantile-networks
Repo
Framework
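
To make the notion of a quantile estimator in the abstract above concrete, here is a minimal sketch of the standard quantile-regression (pinball) loss and a single stochastic update for one quantile. It is a generic illustration, not the paper's algorithm; the function names, the learning rate, and the choice of quantile level are assumptions.

```python
def pinball_loss(target, estimate, tau):
    """Quantile-regression loss for quantile level tau in (0, 1)."""
    error = target - estimate
    # Underestimates (error >= 0) are weighted by tau, overestimates by 1 - tau.
    return tau * error if error >= 0 else (tau - 1.0) * error


def quantile_update(estimate, target, tau, lr):
    """One stochastic subgradient step that moves the estimate toward the tau-quantile."""
    step = tau if target >= estimate else tau - 1.0
    return estimate + lr * step


under = pinball_loss(target=1.0, estimate=0.0, tau=0.9)  # estimate too low
over = pinball_loss(target=0.0, estimate=1.0, tau=0.9)   # estimate too high
updated = quantile_update(estimate=0.0, target=1.0, tau=0.9, lr=0.1)
```

With a high quantile level, underestimation is penalized more heavily than overestimation, so the estimate leans optimistic, which is loosely related to the optimism the abstract describes.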

Intelligence Graph

Title Intelligence Graph
Authors Han Xiao
Abstract In fact, there exist three genres of intelligence architectures: logics (e.g. Random Forest, A* Searching), neurons (e.g. CNN, LSTM) and probabilities (e.g. Naive Bayes, HMM), all of which are incompatible with each other. However, to construct powerful intelligence systems with various methods, we propose the intelligence graph (iGraph for short), which is composed of both neural and probabilistic graphs, under the framework of forward-backward propagation. By the paradigm of iGraph, we design a recommendation model with semantic principle. First, the probabilistic distributions of categories are generated from the embedding representations of users/items, in the manner of neurons. Second, the probabilistic graph infers the distributions of features, in the manner of probabilities. Last, for the recommendation diversity, we perform an expectation computation then conduct a logic judgment, in the manner of logics. Experimentally, we beat the state-of-the-art baselines and verify our conclusions.
Tasks
Published 2018-01-05
URL http://arxiv.org/abs/1801.01604v1
PDF http://arxiv.org/pdf/1801.01604v1.pdf
PWC https://paperswithcode.com/paper/intelligence-graph
Repo
Framework

Communication-Efficient Distributed Reinforcement Learning

Title Communication-Efficient Distributed Reinforcement Learning
Authors Tianyi Chen, Kaiqing Zhang, Georgios B. Giannakis, Tamer Başar
Abstract This paper deals with distributed reinforcement learning (DRL), which involves a central controller and a group of learners. In particular, two DRL settings encountered in several applications are considered: multi-agent reinforcement learning (RL) and parallel RL, where frequent information exchanges between the learners and the controller are required. For many practical distributed systems, however, such as those involving parallel machines for training deep RL algorithms, and multi-robot systems for learning the optimal coordination strategies, the overhead caused by these frequent communication exchanges is considerable, and becomes the bottleneck of the overall performance. To address this challenge, a novel policy gradient method is developed here to cope with such communication-constrained DRL settings. The proposed approach reduces the communication overhead without degrading learning performance by adaptively skipping the policy gradient communication during iterations. It is established analytically that i) the novel algorithm has convergence rate identical to that of the plain-vanilla policy gradient for DRL; while ii) if the distributed computing units are heterogeneous in terms of their reward functions and initial state distributions, the number of communication rounds needed to achieve a desirable learning accuracy is markedly reduced. Numerical experiments on a popular multi-agent RL benchmark corroborate the significant communication reduction attained by the novel algorithm compared to alternatives.
Tasks Multi-agent Reinforcement Learning
Published 2018-12-07
URL http://arxiv.org/abs/1812.03239v2
PDF http://arxiv.org/pdf/1812.03239v2.pdf
PWC https://paperswithcode.com/paper/communication-efficient-distributed-4
Repo
Framework
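
The adaptive skipping of policy gradient communication described in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions: a learner uploads its gradient only when it differs enough from the last one it sent, and the controller otherwise reuses the stale copy. The fixed threshold rule, the class name, and the method names are hypothetical; the abstract does not state the paper's actual skipping condition.

```python
class Learner:
    """A learner that remembers the last gradient it communicated."""

    def __init__(self, dim):
        self.last_sent = [0.0] * dim

    def maybe_send(self, grad, threshold):
        """Upload grad only if it changed enough since the last upload."""
        change = sum((g - s) ** 2 for g, s in zip(grad, self.last_sent))
        if change > threshold:
            self.last_sent = list(grad)
            return True   # a communication round was used
        return False      # skipped: the controller keeps the stale gradient


def aggregate(learners):
    """Controller-side sum of the most recently received gradients."""
    dim = len(learners[0].last_sent)
    return [sum(l.last_sent[k] for l in learners) for k in range(dim)]


a, b = Learner(2), Learner(2)
sent_first = a.maybe_send([1.0, 0.0], threshold=0.01)    # large change: upload
sent_second = a.maybe_send([1.01, 0.0], threshold=0.01)  # tiny change: skip
sent_b = b.maybe_send([0.0, 2.0], threshold=0.01)
total = aggregate([a, b])
```

In this toy run, the second call for learner `a` is skipped, so three gradient computations cost only two uploads while the controller still has a usable aggregate.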

Are ResNets Provably Better than Linear Predictors?

Title Are ResNets Provably Better than Linear Predictors?
Authors Ohad Shamir
Abstract A residual network (or ResNet) is a standard deep neural net architecture, with state-of-the-art performance across numerous applications. The main premise of ResNets is that they allow the training of each layer to focus on fitting just the residual of the previous layer’s output and the target output. Thus, we should expect that the trained network is no worse than what we can obtain if we remove the residual layers and train a shallower network instead. However, due to the non-convexity of the optimization problem, it is not at all clear that ResNets indeed achieve this behavior, rather than getting stuck at some arbitrarily poor local minimum. In this paper, we rigorously prove that arbitrarily deep, nonlinear residual units indeed exhibit this behavior, in the sense that the optimization landscape contains no local minima with value above what can be obtained with a linear predictor (namely a 1-layer network). Notably, we show this under minimal or no assumptions on the precise network architecture, data distribution, or loss function used. We also provide a quantitative analysis of approximate stationary points for this problem. Finally, we show that with a certain tweak to the architecture, training the network with standard stochastic gradient descent achieves an objective value close to or better than that of any linear predictor.
Tasks
Published 2018-04-18
URL http://arxiv.org/abs/1804.06739v4
PDF http://arxiv.org/pdf/1804.06739v4.pdf
PWC https://paperswithcode.com/paper/are-resnets-provably-better-than-linear
Repo
Framework

Finite-Sample Analysis For Decentralized Batch Multi-Agent Reinforcement Learning With Networked Agents

Title Finite-Sample Analysis For Decentralized Batch Multi-Agent Reinforcement Learning With Networked Agents
Authors Kaiqing Zhang, Zhuoran Yang, Han Liu, Tong Zhang, Tamer Başar
Abstract Despite the increasing interest in multi-agent reinforcement learning (MARL) in multiple communities, understanding its theoretical foundation has long been recognized as a challenging problem. In this work, we address this problem by providing a finite-sample analysis for decentralized batch MARL with networked agents. Specifically, we consider two decentralized MARL settings, where teams of agents are connected by time-varying communication networks, and either collaborate or compete in a zero-sum game setting, without any central controller. These settings cover many conventional MARL settings in the literature. For both settings, we develop batch MARL algorithms that can be implemented in a decentralized fashion, and quantify the finite-sample errors of the estimated action-value functions. Our error analysis captures how the function class, the number of samples within each iteration, and the number of iterations determine the statistical accuracy of the proposed algorithms. Our results, compared to the finite-sample bounds for single-agent RL, involve additional error terms caused by decentralized computation, which is inherent in our decentralized MARL setting. This work appears to be the first finite-sample analysis for batch MARL, which sheds light on understanding both the sample and computational efficiency of MARL algorithms in general.
Tasks Multi-agent Reinforcement Learning
Published 2018-12-06
URL https://arxiv.org/abs/1812.02783v7
PDF https://arxiv.org/pdf/1812.02783v7.pdf
PWC https://paperswithcode.com/paper/finite-sample-analyses-for-fully
Repo
Framework

Augmenting Robot Knowledge Consultants with Distributed Short Term Memory

Title Augmenting Robot Knowledge Consultants with Distributed Short Term Memory
Authors Tom Williams, Ravenna Thielstrom, Evan Krause, Bradley Oosterveld, Matthias Scheutz
Abstract Human-robot communication in situated environments involves a complex interplay between knowledge representations across a wide variety of modalities. Crucially, linguistic information must be associated with representations of objects, locations, people, and goals, which may be represented in very different ways. In previous work, we developed a Consultant Framework that facilitates modality-agnostic access to information distributed across a set of heterogeneously represented knowledge sources. In this work, we draw inspiration from cognitive science to augment these distributed knowledge sources with Short Term Memory Buffers to create an STM-augmented algorithm for referring expression generation. We then discuss the potential performance benefits of this approach and insights from cognitive science that may inform future refinements in the design of our approach.
Tasks
Published 2018-11-26
URL http://arxiv.org/abs/1811.10229v1
PDF http://arxiv.org/pdf/1811.10229v1.pdf
PWC https://paperswithcode.com/paper/augmenting-robot-knowledge-consultants-with
Repo
Framework

Multimodal Polynomial Fusion for Detecting Driver Distraction

Title Multimodal Polynomial Fusion for Detecting Driver Distraction
Authors Yulun Du, Chirag Raman, Alan W Black, Louis-Philippe Morency, Maxine Eskenazi
Abstract Distracted driving is deadly, claiming 3,477 lives in the U.S. in 2015 alone. Although there has been a considerable amount of research on modeling the distracted behavior of drivers under various conditions, accurate automatic detection using multiple modalities and especially the contribution of using the speech modality to improve accuracy has received little attention. This paper introduces a new multimodal dataset for distracted driving behavior and discusses automatic distraction detection using features from three modalities: facial expression, speech and car signals. Detailed multimodal feature analysis shows that adding more modalities monotonically increases the predictive accuracy of the model. Finally, a simple and effective multimodal fusion technique using a polynomial fusion layer shows superior distraction detection results compared to the baseline SVM and neural network models.
Tasks
Published 2018-10-24
URL http://arxiv.org/abs/1810.10565v1
PDF http://arxiv.org/pdf/1810.10565v1.pdf
PWC https://paperswithcode.com/paper/multimodal-polynomial-fusion-for-detecting
Repo
Framework

Leabra7: a Python package for modeling recurrent, biologically-realistic neural networks

Title Leabra7: a Python package for modeling recurrent, biologically-realistic neural networks
Authors C. Daniel Greenidge, Noam Miller, Kenneth A. Norman
Abstract Emergent is a software package that uses the AdEx neural dynamics model and LEABRA learning algorithm to simulate and train arbitrary recurrent neural network architectures in a biologically-realistic manner. We present Leabra7, a complementary Python library that implements these same algorithms. Leabra7 is developed and distributed using modern software development principles, and integrates tightly with Python’s scientific stack. We demonstrate recurrent Leabra7 networks using traditional pattern-association tasks and a standard machine learning task, classifying the IRIS dataset.
Tasks
Published 2018-09-11
URL http://arxiv.org/abs/1809.04166v2
PDF http://arxiv.org/pdf/1809.04166v2.pdf
PWC https://paperswithcode.com/paper/leabra7-a-python-package-for-modeling
Repo
Framework

Emergence of linguistic conventions in multi-agent reinforcement learning

Title Emergence of linguistic conventions in multi-agent reinforcement learning
Authors Dorota Lipowska, Adam Lipowski
Abstract Recently, the emergence of signaling conventions, among which language is a prime example, has drawn considerable interdisciplinary interest ranging from game theory to robotics to evolutionary linguistics. Such a wide spectrum of research is based on very different assumptions and methodologies, but the complexity of the problem precludes formulation of a unifying and commonly accepted explanation. We examine the formation of signaling conventions in the framework of a multi-agent reinforcement learning model. When the network of interactions between agents is a complete graph or a sufficiently dense random graph, a global consensus is typically reached with the emerging language being a nearly unique object-word mapping or containing some synonyms and homonyms. On finite-dimensional lattices, the model gets trapped in disordered configurations with a local consensus only. Such trapping can be avoided by introducing population renewal, which in the presence of superlinear reinforcement restores an ordinary surface-tension driven coarsening and considerably enhances formation of efficient signaling.
Tasks Multi-agent Reinforcement Learning
Published 2018-11-17
URL http://arxiv.org/abs/1811.07208v1
PDF http://arxiv.org/pdf/1811.07208v1.pdf
PWC https://paperswithcode.com/paper/emergence-of-linguistic-conventions-in-multi
Repo
Framework

Reconstruction of partially sampled multi-band images - Application to STEM-EELS imaging

Title Reconstruction of partially sampled multi-band images - Application to STEM-EELS imaging
Authors Étienne Monier, Thomas Oberlin, Nathalie Brun, Marcel Tencé, Marta de Frutos, Nicolas Dobigeon
Abstract Electron microscopy has been shown to be a very powerful tool to map the chemical nature of samples at various scales down to atomic resolution. However, many samples cannot be analyzed with an acceptable signal-to-noise ratio because of the radiation damage induced by the electron beam. This is particularly crucial for electron energy loss spectroscopy (EELS) which acquires spectral-spatial data and requires high beam intensity. Since scanning transmission electron microscopes (STEM) are able to acquire data cubes by scanning the electron probe over the sample and recording a spectrum for each spatial position, it is possible to design the scan pattern and to sample only specific pixels. As a consequence, partial acquisition schemes are now conceivable, provided a reconstruction of the full data cube is conducted as a post-processing step. This paper proposes two reconstruction algorithms for multi-band images acquired by STEM-EELS which exploit the spectral structure and the spatial smoothness of the image. The performance of the proposed schemes is illustrated through experiments conducted on a realistic phantom dataset as well as real EELS spectrum-images.
Tasks
Published 2018-02-27
URL http://arxiv.org/abs/1802.10066v1
PDF http://arxiv.org/pdf/1802.10066v1.pdf
PWC https://paperswithcode.com/paper/reconstruction-of-partially-sampled-multi
Repo
Framework

Learning Segmentation Masks with the Independence Prior

Title Learning Segmentation Masks with the Independence Prior
Authors Songmin Dai, Xiaoqiang Li, Lu Wang, Pin Wu, Weiqin Tong, Yimin Chen
Abstract An instance with a bad mask might make a composite image that uses it look fake. This encourages us to learn segmentation by generating realistic composite images. To achieve this, we propose a novel framework that exploits a newly proposed prior called the independence prior based on Generative Adversarial Networks (GANs). The generator produces an image with multiple category-specific instance providers, a layout module and a composition module. Firstly, each provider independently outputs a category-specific instance image with a soft mask. Then the provided instances’ poses are corrected by the layout module. Lastly, the composition module combines these instances into a final image. Training with adversarial loss and penalty for mask area, each provider learns a mask that is as small as possible but enough to cover a complete category-specific instance. Weakly supervised semantic segmentation methods widely use grouping cues modeling the association between image parts, which are either artificially designed or learned with costly segmentation labels or only modeled on local pairs. Unlike them, our method automatically models the dependence between any parts and learns instance segmentation. We apply our framework in two cases: (1) Foreground segmentation on category-specific images with box-level annotation. (2) Unsupervised learning of instance appearances and masks with only one image of a homogeneous object cluster (HOC). We get appealing results in both tasks, which shows that the independence prior is useful for instance segmentation and that it is possible to learn instance masks without supervision from only one image.
Tasks Instance Segmentation, Semantic Segmentation, Weakly-Supervised Semantic Segmentation
Published 2018-11-12
URL http://arxiv.org/abs/1811.04682v2
PDF http://arxiv.org/pdf/1811.04682v2.pdf
PWC https://paperswithcode.com/paper/learning-segmentation-masks-with-the
Repo
Framework

Plithogeny, Plithogenic Set, Logic, Probability, and Statistics

Title Plithogeny, Plithogenic Set, Logic, Probability, and Statistics
Authors Florentin Smarandache
Abstract In this book we introduce the plithogenic set (as generalization of crisp, fuzzy, intuitionistic fuzzy, and neutrosophic sets), plithogenic logic (as generalization of classical, fuzzy, intuitionistic fuzzy, and neutrosophic logics), plithogenic probability (as generalization of classical, imprecise, and neutrosophic probabilities), and plithogenic statistics (as generalization of classical, and neutrosophic statistics). Plithogenic Set is a set whose elements are characterized by one or more attributes, and each attribute may have many values. An attribute value v has a corresponding (fuzzy, intuitionistic fuzzy, or neutrosophic) degree of appurtenance d(x,v) of the element x, to the set P, with respect to some given criteria. In order to obtain a better accuracy for the plithogenic aggregation operators in the plithogenic set, logic, probability and for a more exact inclusion (partial order), a (fuzzy, intuitionistic fuzzy, or neutrosophic) contradiction (dissimilarity) degree is defined between each attribute value and the dominant (most important) attribute value. The plithogenic intersection and union are linear combinations of the fuzzy operators tnorm and tconorm, while the plithogenic complement, inclusion, equality are influenced by the attribute values contradiction (dissimilarity) degrees. Formal definitions of plithogenic set, logic, probability, statistics are presented into the book, followed by plithogenic aggregation operators, various theorems related to them, and afterwards examples and applications of these new concepts in our everyday life.
Tasks
Published 2018-08-12
URL http://arxiv.org/abs/1808.03948v1
PDF http://arxiv.org/pdf/1808.03948v1.pdf
PWC https://paperswithcode.com/paper/plithogeny-plithogenic-set-logic-probability
Repo
Framework
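
The abstract above states that the plithogenic intersection and union are linear combinations of the fuzzy tnorm and tconorm. A minimal sketch of that idea follows, assuming the product tnorm, the probabilistic-sum tconorm, and a contradiction degree c in [0, 1] used directly as the mixing weight; these specific choices and all function names are assumptions made for illustration, not definitions taken from the abstract.

```python
def t_norm(a, b):
    """Product t-norm (fuzzy AND); one common choice among several."""
    return a * b


def t_conorm(a, b):
    """Probabilistic-sum t-conorm (fuzzy OR), dual to the product t-norm."""
    return a + b - a * b


def plithogenic_intersection(a, b, c):
    """Mix AND and OR: pure t-norm at c = 0, pure t-conorm at c = 1."""
    return (1.0 - c) * t_norm(a, b) + c * t_conorm(a, b)


def plithogenic_union(a, b, c):
    """Mix OR and AND: pure t-conorm at c = 0, pure t-norm at c = 1."""
    return (1.0 - c) * t_conorm(a, b) + c * t_norm(a, b)


low = plithogenic_intersection(0.5, 0.5, c=0.0)   # behaves like fuzzy AND
high = plithogenic_intersection(0.5, 0.5, c=1.0)  # behaves like fuzzy OR
mid_and = plithogenic_intersection(0.5, 0.5, c=0.5)
mid_or = plithogenic_union(0.5, 0.5, c=0.5)
```

Under these assumptions, a contradiction degree of 0.5 makes the intersection and the union coincide, which shows how the dissimilarity between attribute values blends the two operators.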