April 2, 2020

Paper Group ANR 205

Q-Learning for Mean-Field Controls. Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems. Learning Cost Functions for Optimal Transport. Cognitive Anthropomorphism of AI: How Humans and Computers Classify Images. An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generatio …

Q-Learning for Mean-Field Controls

Title Q-Learning for Mean-Field Controls
Authors Haotian Gu, Xin Guo, Xiaoli Wei, Renyuan Xu
Abstract Multi-agent reinforcement learning (MARL) has been applied to many challenging problems including two-team computer games, autonomous driving, and real-time bidding. Despite the empirical success, there is a conspicuous absence of theoretical study of different MARL algorithms: this is mainly due to the curse of dimensionality caused by the exponential growth of the joint state-action space as the number of agents increases. Mean-field controls (MFC) with infinitely many agents and deterministic flows, meanwhile, provide good approximations to $N$-agent collaborative games in terms of both game values and optimal strategies. In this paper, we study collaborative MARL under an MFC approximation framework: we develop a model-free kernel-based Q-learning algorithm (CDD-Q) and show that its convergence rate and sample complexity are independent of the number of agents. Our empirical studies on MFC examples demonstrate the strong performance of CDD-Q. Moreover, the CDD-Q algorithm can be applied to a general class of Markov decision problems (MDPs) with deterministic dynamics and continuous state-action space.
Tasks Multi-agent Reinforcement Learning, Q-Learning
Published 2020-02-10
URL https://arxiv.org/abs/2002.04131v1
PDF https://arxiv.org/pdf/2002.04131v1.pdf
PWC https://paperswithcode.com/paper/q-learning-for-mean-field-controls
Repo
Framework

Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems

Title Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems
Authors Seyed Mohammad Asghari, Yi Ouyang, Ashutosh Nayyar
Abstract Regret analysis is challenging in Multi-Agent Reinforcement Learning (MARL) primarily due to the dynamical environments and the decentralized information among agents. We attempt to solve this challenge in the context of decentralized learning in multi-agent linear-quadratic (LQ) dynamical systems. We begin with a simple setup consisting of two agents and two dynamically decoupled stochastic linear systems, each system controlled by an agent. The systems are coupled through a quadratic cost function. When both systems’ dynamics are unknown and there is no communication among the agents, we show that no learning policy can achieve regret that is sub-linear in $T$, where $T$ is the time horizon. When only one system’s dynamics are unknown and there is one-directional communication from the agent controlling the unknown system to the other agent, we propose a MARL algorithm based on the construction of an auxiliary single-agent LQ problem. The auxiliary single-agent problem in the proposed MARL algorithm serves as an implicit coordination mechanism between the two learning agents. This allows the agents to achieve a regret within $O(\sqrt{T})$ of the regret of the auxiliary single-agent problem. Consequently, using existing results for single-agent LQ regret, our algorithm provides a $\tilde{O}(\sqrt{T})$ regret bound. (Here $\tilde{O}(\cdot)$ hides constants and logarithmic factors.) Our numerical experiments indicate that this bound is matched in practice. From the two-agent problem, we extend our results to multi-agent LQ systems with certain communication patterns.
Tasks Multi-agent Reinforcement Learning
Published 2020-01-27
URL https://arxiv.org/abs/2001.10122v1
PDF https://arxiv.org/pdf/2001.10122v1.pdf
PWC https://paperswithcode.com/paper/regret-bounds-for-decentralized-learning-in
Repo
Framework

Learning Cost Functions for Optimal Transport

Title Learning Cost Functions for Optimal Transport
Authors Haodong Sun, Haomin Zhou, Hongyuan Zha, Xiaojing Ye
Abstract Learning the cost function for optimal transport from an observed transport plan or its samples has been cast as a bi-level optimization problem. In this paper, we derive an unconstrained convex optimization formulation for the problem which can be further augmented by any customizable regularization. This novel framework avoids repeatedly solving a forward optimal transport problem in each iteration, which has been a thorny computational bottleneck for the bi-level optimization approach. To validate the effectiveness of this framework, we develop two numerical algorithms: one is a fast matrix scaling method based on the Sinkhorn-Knopp algorithm for the discrete case, and the other is a supervised learning algorithm that realizes the cost function as a deep neural network in the continuous case. Numerical results demonstrate promising efficiency and accuracy advantages of the proposed algorithms over existing state-of-the-art methods.
Tasks
Published 2020-02-22
URL https://arxiv.org/abs/2002.09650v1
PDF https://arxiv.org/pdf/2002.09650v1.pdf
PWC https://paperswithcode.com/paper/learning-cost-functions-for-optimal-transport
Repo
Framework
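
The abstract above mentions matrix scaling based on the Sinkhorn-Knopp algorithm. As a rough illustration of that building block only (not the authors' cost-learning method), the following sketch runs classic Sinkhorn scaling for the forward, entropy-regularized transport problem. The function name `sinkhorn`, the regularization strength `eps`, and the toy cost matrix and marginals are all assumptions chosen for the example.

```python
import math


def sinkhorn(cost, mu, nu, eps=0.5, n_iters=500):
    """Entropy-regularized transport plan via Sinkhorn-Knopp matrix scaling.

    cost: n x m list of lists, mu: source marginal (length n),
    nu: target marginal (length m). All names here are illustrative.
    """
    n, m = len(mu), len(nu)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows so that the plan's row sums match mu
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so that the plan's column sums match nu
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Plan P = diag(u) K diag(v)
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy problem (assumed values): two sources, two targets
cost = [[0.0, 1.0], [1.0, 0.0]]
mu = [0.5, 0.5]
nu = [0.3, 0.7]
plan = sinkhorn(cost, mu, nu)
```

After convergence, the rows of `plan` sum to `mu` and its columns sum to `nu`. The bi-level approach criticized in the abstract would need a forward solve like this one inside every outer iteration.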

Cognitive Anthropomorphism of AI: How Humans and Computers Classify Images

Title Cognitive Anthropomorphism of AI: How Humans and Computers Classify Images
Authors Shane T. Mueller
Abstract Modern AI image classifiers have made impressive advances in recent years, but their performance often appears strange or violates expectations of users. This suggests humans engage in cognitive anthropomorphism: expecting AI to have the same nature as human intelligence. This mismatch presents an obstacle to appropriate human-AI interaction. To delineate this mismatch, I examine known properties of human classification, in comparison to image classifier systems. Based on this examination, I offer three strategies for system design that can address the mismatch between human and AI classification: explainable AI, novel methods for training users, and new algorithms that match human cognition.
Tasks
Published 2020-02-07
URL https://arxiv.org/abs/2002.03024v1
PDF https://arxiv.org/pdf/2002.03024v1.pdf
PWC https://paperswithcode.com/paper/cognitive-anthropomorphism-of-ai-how-humans
Repo
Framework

An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation

Title An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation
Authors Piji Li
Abstract We present an empirical investigation of pre-trained Transformer-based auto-regressive language models for the task of open-domain dialogue generation. The training paradigm of pre-training and fine-tuning is employed to conduct the parameter learning. Corpora of News and Wikipedia in Chinese and English are collected for the pre-training stage respectively. Dialogue context and response are concatenated into a single sequence utilized as the input of the models during the fine-tuning stage. A weighted joint prediction paradigm for both context and response is designed to evaluate the performance of models with or without the loss term for context prediction. Various decoding strategies such as greedy search, beam search, top-k sampling, etc. are employed to conduct the response text generation. Extensive experiments are conducted on the typical single-turn and multi-turn dialogue corpora such as Weibo, Douban, Reddit, DailyDialog, and Persona-Chat. Detailed numbers of automatic evaluation metrics on relevance and diversity of the generated results for the language models as well as the baseline approaches are reported.
Tasks Dialogue Generation, Text Generation
Published 2020-03-09
URL https://arxiv.org/abs/2003.04195v1
PDF https://arxiv.org/pdf/2003.04195v1.pdf
PWC https://paperswithcode.com/paper/an-empirical-investigation-of-pre-trained
Repo
Framework
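
Two of the decoding strategies named in the abstract above, greedy search and top-k sampling, can be sketched for a single decoding step as follows. This is a minimal illustration that assumes the model has already produced a list of next-token logits. The function names and the toy logits are hypothetical and do not come from the paper.

```python
import math
import random


def greedy(logits):
    """Greedy search step: pick the index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def top_k_sample(logits, k, rng):
    """Top-k sampling step: sample from the k highest-scoring tokens only."""
    # Indices of the k largest logits
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax weights over the kept tokens (shifted by the max for stability)
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


# Toy next-token logits over a four-token vocabulary (assumed values)
logits = [0.1, 2.0, -1.0, 1.5]
rng = random.Random(0)
greedy_token = greedy(logits)
sampled_tokens = [top_k_sample(logits, 2, rng) for _ in range(20)]
```

With `k=1`, top-k sampling reduces to greedy search. Larger `k` trades some relevance for diversity, which are the two axes the abstract says are evaluated.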

Depth Estimation by Learning Triangulation and Densification of Sparse Points for Multi-view Stereo

Title Depth Estimation by Learning Triangulation and Densification of Sparse Points for Multi-view Stereo
Authors Ayan Sinha, Zak Murez, James Bartolozzi, Vijay Badrinarayanan, Andrew Rabinovich
Abstract Multi-view stereo (MVS) is the golden mean between the accuracy of active depth sensing and the practicality of monocular depth estimation. Cost volume based approaches employing 3D convolutional neural networks (CNNs) have considerably improved the accuracy of MVS systems. However, this accuracy comes at a high computational cost which impedes practical adoption. Distinct from cost volume approaches, we propose an efficient depth estimation approach by first (a) detecting and evaluating descriptors for interest points, then (b) learning to match and triangulate a small set of interest points, and finally (c) densifying this sparse set of 3D points using CNNs. An end-to-end network efficiently performs all three steps within a deep learning framework and is trained with intermediate 2D image and 3D geometric supervision, along with depth supervision. Crucially, our first step complements pose estimation using interest point detection and descriptor learning. We demonstrate state-of-the-art results on depth estimation with lower compute for different scene lengths. Furthermore, our method generalizes to newer environments and the descriptors output by our network compare favorably to strong baselines.
Tasks Depth Estimation, Interest Point Detection, Monocular Depth Estimation, Pose Estimation
Published 2020-03-19
URL https://arxiv.org/abs/2003.08933v1
PDF https://arxiv.org/pdf/2003.08933v1.pdf
PWC https://paperswithcode.com/paper/depth-estimation-by-learning-triangulation
Repo
Framework

Active Depth Estimation: Stability Analysis and its Applications

Title Active Depth Estimation: Stability Analysis and its Applications
Authors Romulo T. Rodrigues, Pedro Miraldo, Dimos V. Dimarogonas, A. Pedro Aguiar
Abstract Recovering the 3D structure of the surrounding environment is an essential task in any vision-controlled Structure-from-Motion (SfM) scheme. This paper focuses on the theoretical properties of the SfM, known as the incremental active depth estimation. The term incremental stands for estimating the 3D structure of the scene over a chronological sequence of image frames. Active means that the camera actuation is such that it improves estimation performance. Starting from a known depth estimation filter, this paper presents the stability analysis of the filter in terms of the control inputs of the camera. By analyzing the convergence of the estimator using the Lyapunov theory, we relax the constraints on the projection of the 3D point in the image plane when compared to previous results. Nonetheless, our method is capable of dealing with the cameras’ limited field-of-view constraints. The main results are validated through experiments with simulated data.
Tasks Depth Estimation
Published 2020-03-16
URL https://arxiv.org/abs/2003.07137v1
PDF https://arxiv.org/pdf/2003.07137v1.pdf
PWC https://paperswithcode.com/paper/active-depth-estimation-stability-analysis
Repo
Framework

On the Uniqueness of Binary Quantizers for Maximizing Mutual Information

Title On the Uniqueness of Binary Quantizers for Maximizing Mutual Information
Authors Thuan Nguyen, Thinh Nguyen
Abstract We consider a channel with a binary input $X$ being corrupted by a continuous-valued noise that results in a continuous-valued output $Y$. An optimal binary quantizer is used to quantize the continuous-valued output $Y$ to the final binary output $Z$ to maximize the mutual information $I(X; Z)$. We show that when the ratio of the channel conditional density $r(y) = P(Y=y \mid X=0)/P(Y=y \mid X=1)$ is a strictly increasing/decreasing function of $y$, then a quantizer having a single threshold can maximize mutual information. Furthermore, we show that an optimal quantizer (possibly with multiple thresholds) is the one with the thresholding vector whose elements are all the solutions of $r(y) = r^*$ for some constant $r^* > 0$. Interestingly, the optimal constant $r^*$ is unique. This uniqueness property allows for fast algorithmic implementation such as a bisection algorithm to find the optimal quantizer. Our results also confirm some previous results using alternative elementary proofs. We show some numerical examples of applying our results to channels with additive Gaussian noises.
Tasks
Published 2020-01-07
URL https://arxiv.org/abs/2001.01836v1
PDF https://arxiv.org/pdf/2001.01836v1.pdf
PWC https://paperswithcode.com/paper/on-the-uniqueness-of-binary-quantizers-for
Repo
Framework
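
To make the single-threshold case concrete, here is a small sketch that evaluates the mutual information between the input and the quantized output for one threshold `t` on an additive Gaussian channel, then scans a grid of thresholds. The signal levels of -1 and +1, the noise level `sigma`, the prior `p0`, and the grid scan itself are assumptions for illustration. The abstract's bisection on the likelihood-ratio constant is not implemented here.

```python
import math


def gauss_cdf(x, mean, sigma):
    """CDF of a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def entropy_bits(p):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def mutual_information(t, p0=0.5, sigma=1.0):
    """I(X; Z) for Z = 1 if Y > t else 0, with Y = (-1 or +1) + Gaussian noise."""
    a = gauss_cdf(t, -1.0, sigma)  # P(Z=0 | X=0), assumed signal level -1
    b = gauss_cdf(t, +1.0, sigma)  # P(Z=0 | X=1), assumed signal level +1
    pz0 = p0 * a + (1.0 - p0) * b
    # I(X; Z) = H(Z) - H(Z | X)
    return entropy_bits(pz0) - p0 * entropy_bits(a) - (1.0 - p0) * entropy_bits(b)


# Brute-force scan over candidate thresholds in [-3, 3]
thresholds = [i / 100.0 for i in range(-300, 301)]
best_t = max(thresholds, key=mutual_information)
best_mi = mutual_information(best_t)
```

For this Gaussian setup the likelihood ratio is strictly monotone in the output, so by the abstract's result a single threshold suffices. With equiprobable inputs the scan should land at the symmetric threshold near zero.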

AL2: Progressive Activation Loss for Learning General Representations in Classification Neural Networks

Title AL2: Progressive Activation Loss for Learning General Representations in Classification Neural Networks
Authors Majed El Helou, Frederike Dümbgen, Sabine Süsstrunk
Abstract The large capacity of neural networks enables them to learn complex functions. To avoid overfitting, networks however require a lot of training data that can be expensive and time-consuming to collect. A common practical approach to attenuate overfitting is the use of network regularization techniques. We propose a novel regularization method that progressively penalizes the magnitude of activations during training. The combined activation signals produced by all neurons in a given layer form the representation of the input image in that feature space. We propose to regularize this representation in the last feature layer before classification layers. Our method’s effect on generalization is analyzed with label randomization tests and cumulative ablations. Experimental results show the advantages of our approach in comparison with commonly-used regularizers on standard benchmark datasets.
Tasks
Published 2020-03-07
URL https://arxiv.org/abs/2003.03633v1
PDF https://arxiv.org/pdf/2003.03633v1.pdf
PWC https://paperswithcode.com/paper/al2-progressive-activation-loss-for-learning
Repo
Framework
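
The idea of progressively penalizing activation magnitude can be sketched as a loss with a penalty term whose weight grows during training. The abstract does not give the exact norm or schedule, so the mean squared activation and the linear ramp below are assumptions, as are all function names and the `max_weight` value.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_z - logits[label]


def activation_penalty(features):
    """Mean squared magnitude of the last feature layer (assumed form)."""
    return sum(a * a for a in features) / len(features)


def penalty_weight(epoch, total_epochs, max_weight=0.01):
    """Linear ramp from 0 to max_weight over training (assumed schedule)."""
    return max_weight * min(1.0, epoch / total_epochs)


def regularized_loss(logits, label, features, epoch, total_epochs):
    """Classification loss plus the progressively weighted activation penalty."""
    weight = penalty_weight(epoch, total_epochs)
    return cross_entropy(logits, label) + weight * activation_penalty(features)


# Toy example: three classes, four features from the last feature layer
logits = [2.0, 0.5, -1.0]
features = [3.0, -2.0, 0.5, 1.0]
loss_start = regularized_loss(logits, 0, features, epoch=0, total_epochs=10)
loss_end = regularized_loss(logits, 0, features, epoch=10, total_epochs=10)
```

At epoch 0 the penalty vanishes and the loss equals plain cross-entropy. By the final epoch the same activations cost more, which pushes the network toward smaller representations in the last feature layer.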

Neural Network Approximation of Graph Fourier Transforms for Sparse Sampling of Networked Flow Dynamics

Title Neural Network Approximation of Graph Fourier Transforms for Sparse Sampling of Networked Flow Dynamics
Authors Alessio Pagani, Zhuangkun Wei, Ricardo Silva, Weisi Guo
Abstract Infrastructure monitoring is critical for safe operations and sustainability. Water distribution networks (WDNs) are large-scale networked critical systems with complex cascade dynamics which are difficult to predict. Ubiquitous monitoring is expensive and a key challenge is to infer the contaminant dynamics from partial sparse monitoring data. Existing approaches use multi-objective optimisation to find the minimum set of essential monitoring points, but lack performance guarantees and a theoretical framework. Here, we first develop Graph Fourier Transform (GFT) operators to compress networked contamination spreading dynamics to identify the essential principal data collection points with inference performance guarantees. We then build autoencoder (AE) inspired neural networks (NN) to generalize the GFT sampling process and under-sample further from the initial sampling set, allowing a very small set of data points to largely reconstruct the contamination dynamics over real and artificial WDNs. Various sources of the contamination are tested and we obtain high accuracy reconstruction using around 5-10% of the sample set. This general approach of compression and under-sampled recovery via neural networks can be applied to a wide range of networked infrastructures to enable digital twins.
Tasks
Published 2020-02-11
URL https://arxiv.org/abs/2002.05508v1
PDF https://arxiv.org/pdf/2002.05508v1.pdf
PWC https://paperswithcode.com/paper/neural-network-approximation-of-graph-fourier
Repo
Framework
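
For readers unfamiliar with the Graph Fourier Transform mentioned above, the following sketch shows the standard construction from the eigenvectors of the combinatorial graph Laplacian, applied to a toy four-node path graph. It assumes NumPy is available. The graph, the signal values, and the choice to keep two spectral coefficients are illustrative assumptions, not the paper's operators or data.

```python
import numpy as np


def graph_fourier_basis(adjacency):
    """Eigen-decomposition of the combinatorial Laplacian L = D - A."""
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    # eigh returns eigenvalues in ascending order for symmetric matrices
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvalues, eigenvectors


# Toy path graph 0 - 1 - 2 - 3 (assumed topology)
adjacency = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
signal = np.array([1.0, 2.0, 3.0, 4.0])  # e.g. a concentration per node

eigenvalues, basis = graph_fourier_basis(adjacency)
spectrum = basis.T @ signal        # forward GFT
reconstructed = basis @ spectrum   # inverse GFT recovers the signal

# Compression: keep only the two lowest graph frequencies
compressed = basis[:, :2] @ spectrum[:2]
```

A signal that varies smoothly over the graph concentrates its energy in the low-frequency coefficients, which is what makes reconstruction from a reduced set of measurements plausible.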

Adaptive Control of Embedding Strength in Image Watermarking using Neural Networks

Title Adaptive Control of Embedding Strength in Image Watermarking using Neural Networks
Authors Mahnoosh Bagheri, Majid Mohrekesh, Nader Karimi, Shadrokh Samavi
Abstract Digital image watermarking has been widely used in different applications such as copyright protection of digital media, including audio, image, and video files. Two opposing criteria of robustness and transparency are the goals of watermarking methods. In this paper, we propose a framework for determining the appropriate embedding strength factor. The framework can use most DWT and DCT based blind watermarking approaches. We use Mask R-CNN on the COCO dataset to find a good strength factor for each sub-block. Experiments show that this method is robust against different attacks and has good transparency.
Tasks
Published 2020-01-09
URL https://arxiv.org/abs/2001.03251v1
PDF https://arxiv.org/pdf/2001.03251v1.pdf
PWC https://paperswithcode.com/paper/adaptive-control-of-embedding-strength-in
Repo
Framework

Predicting population neural activity in the Algonauts challenge using end-to-end trained Siamese networks and group convolutions

Title Predicting population neural activity in the Algonauts challenge using end-to-end trained Siamese networks and group convolutions
Authors Georgin Jacob, Harish Katti
Abstract The Algonauts challenge is about predicting the object representations in the form of Representational Dissimilarity Matrices (RDMs) derived from visual brain regions. We used a customized deep learning model using the concept of Siamese networks and group convolutions to predict neural distances corresponding to a pair of images. Training data was best explained by distances computed over the last layer.
Tasks
Published 2020-01-13
URL https://arxiv.org/abs/2001.05841v1
PDF https://arxiv.org/pdf/2001.05841v1.pdf
PWC https://paperswithcode.com/paper/predicting-population-neural-activity-in-the
Repo
Framework
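
A Representational Dissimilarity Matrix holds one pairwise distance for every pair of stimuli. The sketch below builds one using 1 minus the Pearson correlation between response vectors, a common choice that is assumed here rather than taken from the abstract. The toy response vectors and function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length response vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def rdm(responses):
    """Pairwise dissimilarity (1 - correlation) between stimulus responses."""
    n = len(responses)
    return [
        [1.0 - pearson(responses[i], responses[j]) for j in range(n)]
        for i in range(n)
    ]


# Toy responses: one row per image, one column per unit or voxel (assumed values)
responses = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.5],
    [3.0, 1.0, 0.0],
]
matrix = rdm(responses)
```

In this toy case the first two images have nearly proportional responses and so a small dissimilarity, while the third image is anti-correlated with the first and gets a value above 1.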

Machine learning as a model for cultural learning: Teaching an algorithm what it means to be fat

Title Machine learning as a model for cultural learning: Teaching an algorithm what it means to be fat
Authors Alina Arseniev-Koehler, Jacob G. Foster
Abstract Overweight individuals, and especially women, are disparaged as immoral, unhealthy, and low class. These negative conceptions are not intrinsic to obesity; they are the tainted fruit of cultural learning. Scholars often cite media consumption as a key mechanism for learning cultural biases, but it remains unclear how this public culture becomes private culture. Here we provide a computational account of this learning mechanism, showing that cultural schemata can be learned from news reporting. We extract schemata about obesity from New York Times articles with word2vec, a neural language model inspired by human cognition. We identify several cultural schemata that link obesity to gender, immorality, poor health, and low socioeconomic class. Such schemata may be subtly but pervasively activated by our language; thus, language can chronically reproduce biases (e.g., about weight and health). Our findings also reinforce ongoing concerns that machine learning can encode, and reproduce, harmful human biases.
Tasks Language Modelling
Published 2020-03-24
URL https://arxiv.org/abs/2003.12133v1
PDF https://arxiv.org/pdf/2003.12133v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-as-a-model-for-cultural
Repo
Framework

Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with Reward Shaping

Title Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with Reward Shaping
Authors Daniel Zhang, Colleen P. Bailey
Abstract In this paper, we investigate the obstacle avoidance and navigation problem in the robotic control area. To solve this problem, we propose revised Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO) algorithms with an improved reward shaping technique. We compare the performance of the original DDPG and PPO with the revised versions of both in simulations with a real mobile robot and demonstrate that the proposed algorithms achieve better results.
Tasks
Published 2020-03-28
URL https://arxiv.org/abs/2003.12863v1
PDF https://arxiv.org/pdf/2003.12863v1.pdf
PWC https://paperswithcode.com/paper/obstacle-avoidance-and-navigation-utilizing
Repo
Framework
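
The abstract does not describe its reward shaping technique in detail. As a generic illustration of the concept, the sketch below uses classic potential-based shaping with the negative distance to the goal as the potential. That choice, along with the function names, the discount `gamma`, and the toy positions, is an assumption and should not be read as the paper's method.

```python
import math


def potential(position, goal):
    """Potential function: negative Euclidean distance to the goal (assumed)."""
    return -math.hypot(goal[0] - position[0], goal[1] - position[1])


def shaped_reward(reward, position, next_position, goal, gamma=0.99):
    """Add the shaping term gamma * phi(s') - phi(s) to the environment reward."""
    return reward + gamma * potential(next_position, goal) - potential(position, goal)


# Toy navigation step on a plane with a sparse (zero) environment reward
goal = (3.0, 4.0)
start = (0.0, 0.0)
toward_goal = shaped_reward(0.0, start, (3.0, 4.0), goal)
away_from_goal = shaped_reward(0.0, start, (-3.0, -4.0), goal)
```

Steps that reduce the distance to the goal earn a positive bonus and steps that increase it are penalized, which gives DDPG- or PPO-style learners a denser signal than a sparse goal reward alone.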

An Analysis of Object Representations in Deep Visual Trackers

Title An Analysis of Object Representations in Deep Visual Trackers
Authors Ross Goroshin, Jonathan Tompson, Debidatta Dwibedi
Abstract Fully convolutional deep correlation networks are integral components of state-of-the-art approaches to single object visual tracking. It is commonly assumed that these networks perform tracking by detection by matching features of the object instance with features of the entire frame. Strong architectural priors and conditioning on the object representation are thought to encourage this tracking strategy. Despite these strong priors, we show that deep trackers often default to tracking by saliency detection - without relying on the object instance representation. Our analysis shows that despite being a useful prior, saliency detection can prevent the emergence of more robust tracking strategies in deep networks. This leads us to introduce an auxiliary detection task that encourages more discriminative object representations that improve tracking performance.
Tasks Saliency Detection, Visual Tracking
Published 2020-01-08
URL https://arxiv.org/abs/2001.02593v1
PDF https://arxiv.org/pdf/2001.02593v1.pdf
PWC https://paperswithcode.com/paper/an-analysis-of-object-representations-in-deep
Repo
Framework