Paper Group ANR 205
Q-Learning for Mean-Field Controls. Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems. Learning Cost Functions for Optimal Transport. Cognitive Anthropomorphism of AI: How Humans and Computers Classify Images. An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation …
Q-Learning for Mean-Field Controls
Title | Q-Learning for Mean-Field Controls |
Authors | Haotian Gu, Xin Guo, Xiaoli Wei, Renyuan Xu |
Abstract | Multi-agent reinforcement learning (MARL) has been applied to many challenging problems including two-team computer games, autonomous driving, and real-time bidding. Despite the empirical success, there is a conspicuous absence of theoretical study of different MARL algorithms: this is mainly due to the curse of dimensionality caused by the exponential growth of the joint state-action space as the number of agents increases. Mean-field controls (MFC) with infinitely many agents and deterministic flows, meanwhile, provide good approximations to $N$-agent collaborative games in terms of both game values and optimal strategies. In this paper, we study collaborative MARL under an MFC approximation framework: we develop a model-free kernel-based Q-learning algorithm (CDD-Q) and show that its convergence rate and sample complexity are independent of the number of agents. Our empirical studies on MFC examples demonstrate strong performance of CDD-Q. Moreover, the CDD-Q algorithm can be applied to a general class of Markov decision problems (MDPs) with deterministic dynamics and continuous state-action space. |
Tasks | Multi-agent Reinforcement Learning, Q-Learning |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.04131v1 |
https://arxiv.org/pdf/2002.04131v1.pdf | |
PWC | https://paperswithcode.com/paper/q-learning-for-mean-field-controls |
Repo | |
Framework | |
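The kernel-based Q-learning idea behind CDD-Q can be illustrated with a small sketch: Q-values are stored at a grid of anchor states and evaluated at arbitrary continuous states by Gaussian-kernel interpolation, with kernel-weighted Bellman updates. The dynamics, cost, kernel bandwidth, and hyperparameters below are illustrative assumptions, not the paper's algorithm or setting.

```python
# Minimal sketch (not the authors' CDD-Q): kernel-based Q-learning on a toy
# deterministic MDP with a continuous state, storing Q-values at anchor states
# and interpolating with a Gaussian kernel.
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.9
anchors_s = np.linspace(0.0, 1.0, 21)          # anchor states
actions = np.linspace(-0.1, 0.1, 5)            # discretized actions
Q = np.zeros((len(anchors_s), len(actions)))   # Q-values stored at anchors

def kernel_weights(s, bandwidth=0.05):
    """Gaussian-kernel weights of a continuous state s over the anchor states."""
    w = np.exp(-((anchors_s - s) ** 2) / (2 * bandwidth ** 2))
    return w / w.sum()

def q_value(s, a_idx):
    return kernel_weights(s) @ Q[:, a_idx]

def step(s, a):
    """Toy deterministic dynamics and reward: drive the state toward 0.5."""
    s_next = np.clip(s + a, 0.0, 1.0)
    reward = -abs(s_next - 0.5)
    return s_next, reward

alpha, eps = 0.2, 0.2
for episode in range(500):
    s = rng.uniform(0.0, 1.0)
    for t in range(30):
        a_idx = rng.integers(len(actions)) if rng.random() < eps else \
                int(np.argmax([q_value(s, j) for j in range(len(actions))]))
        s_next, r = step(s, actions[a_idx])
        target = r + gamma * max(q_value(s_next, j) for j in range(len(actions)))
        td = target - q_value(s, a_idx)
        Q[:, a_idx] += alpha * kernel_weights(s) * td   # kernel-weighted update
        s = s_next
```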
Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems
Title | Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems |
Authors | Seyed Mohammad Asghari, Yi Ouyang, Ashutosh Nayyar |
Abstract | Regret analysis is challenging in Multi-Agent Reinforcement Learning (MARL) primarily due to the dynamical environments and the decentralized information among agents. We attempt to solve this challenge in the context of decentralized learning in multi-agent linear-quadratic (LQ) dynamical systems. We begin with a simple setup consisting of two agents and two dynamically decoupled stochastic linear systems, each system controlled by an agent. The systems are coupled through a quadratic cost function. When both systems’ dynamics are unknown and there is no communication among the agents, we show that no learning policy can achieve regret that is sub-linear in $T$, where $T$ is the time horizon. When only one system’s dynamics are unknown and there is one-directional communication from the agent controlling the unknown system to the other agent, we propose a MARL algorithm based on the construction of an auxiliary single-agent LQ problem. The auxiliary single-agent problem in the proposed MARL algorithm serves as an implicit coordination mechanism among the two learning agents. This allows the agents to achieve a regret within $O(\sqrt{T})$ of the regret of the auxiliary single-agent problem. Consequently, using existing results for single-agent LQ regret, our algorithm provides a $\tilde{O}(\sqrt{T})$ regret bound. (Here $\tilde{O}(\cdot)$ hides constants and logarithmic factors). Our numerical experiments indicate that this bound is matched in practice. From the two-agent problem, we extend our results to multi-agent LQ systems with certain communication patterns. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.10122v1 |
https://arxiv.org/pdf/2001.10122v1.pdf | |
PWC | https://paperswithcode.com/paper/regret-bounds-for-decentralized-learning-in |
Repo | |
Framework | |
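To make the regret notion concrete, here is a hedged single-agent sketch: a certainty-equivalence learner on a scalar LQ system accumulates cost and is compared against the optimal controller that knows the true dynamics. This is background for how LQ regret is measured, not the paper's decentralized two-agent algorithm; the system parameters, exploration noise, and update schedule are made up.

```python
# Certainty-equivalence learning on a scalar LQ system, measuring regret against
# the optimal controller with known dynamics (illustrative only).
import numpy as np

rng = np.random.default_rng(1)
a_true, b_true = 0.9, 0.5          # unknown to the learner
q_cost, r_cost = 1.0, 1.0

def lqr_gain(a, b, iters=200):
    """Scalar LQR gain via Riccati iteration."""
    p = q_cost
    for _ in range(iters):
        k = (b * p * a) / (r_cost + b * p * b)
        p = q_cost + a * p * a - a * p * b * k
    return k

k_opt = lqr_gain(a_true, b_true)

T = 5000
x, x_opt = 0.0, 0.0
cost_learn, cost_opt = 0.0, 0.0
data = []                           # (x_t, u_t, x_{t+1}) tuples for least squares
k_hat = 0.0
for t in range(T):
    w = 0.1 * rng.standard_normal()
    u = -k_hat * x + 0.1 * rng.standard_normal()        # small exploration noise
    x_next = a_true * x + b_true * u + w
    cost_learn += q_cost * x**2 + r_cost * u**2
    data.append((x, u, x_next))
    if t % 50 == 49:                # re-estimate (a, b) and update the gain
        Z = np.array([[d[0], d[1]] for d in data])
        y = np.array([d[2] for d in data])
        theta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        k_hat = lqr_gain(theta[0], theta[1])
    x = x_next
    u_opt = -k_opt * x_opt
    x_opt = a_true * x_opt + b_true * u_opt + w
    cost_opt += q_cost * x_opt**2 + r_cost * u_opt**2

print("regret:", cost_learn - cost_opt)
```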
Learning Cost Functions for Optimal Transport
Title | Learning Cost Functions for Optimal Transport |
Authors | Haodong Sun, Haomin Zhou, Hongyuan Zha, Xiaojing Ye |
Abstract | Learning the cost function for optimal transport from an observed transport plan or its samples has been cast as a bi-level optimization problem. In this paper, we derive an unconstrained convex optimization formulation for the problem which can be further augmented by any customizable regularization. This novel framework avoids repeatedly solving a forward optimal transport problem in each iteration, which has been a thorny computational bottleneck for the bi-level optimization approach. To validate the effectiveness of this framework, we develop two numerical algorithms: one is a fast matrix scaling method based on the Sinkhorn-Knopp algorithm for the discrete case, and the other is a supervised learning algorithm that realizes the cost function as a deep neural network in the continuous case. Numerical results demonstrate promising efficiency and accuracy advantages of the proposed algorithms over existing state-of-the-art methods. |
Tasks | |
Published | 2020-02-22 |
URL | https://arxiv.org/abs/2002.09650v1 |
https://arxiv.org/pdf/2002.09650v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-cost-functions-for-optimal-transport |
Repo | |
Framework | |
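For context, the discrete algorithm builds on Sinkhorn-Knopp matrix scaling. The sketch below is the standard Sinkhorn iteration for the forward entropic optimal transport problem, not the paper's inverse (cost-learning) method; the histograms, ground cost, and regularization value are illustrative.

```python
# Standard Sinkhorn-Knopp scaling for entropic optimal transport (the forward
# problem only, as background for the matrix scaling machinery).
import numpy as np

def sinkhorn(cost, mu, nu, eps=0.05, iters=500):
    """Return an approximate OT plan between histograms mu, nu for a cost matrix."""
    K = np.exp(-cost / eps)               # Gibbs kernel
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(iters):
        u = mu / (K @ v)
        v = nu / (K.T @ u)
    return u[:, None] * K * v[None, :]    # transport plan with marginals mu, nu

# Tiny usage example on made-up 1-D histograms.
x = np.linspace(0, 1, 50)
cost = (x[:, None] - x[None, :]) ** 2     # squared-distance ground cost
mu = np.ones(50) / 50
nu = np.exp(-((x - 0.7) ** 2) / 0.01); nu /= nu.sum()
plan = sinkhorn(cost, mu, nu)
print(plan.sum(axis=1)[:3], mu[:3])       # row marginals approximately equal mu
```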
Cognitive Anthropomorphism of AI: How Humans and Computers Classify Images
Title | Cognitive Anthropomorphism of AI: How Humans and Computers Classify Images |
Authors | Shane T. Mueller |
Abstract | Modern AI image classifiers have made impressive advances in recent years, but their performance often appears strange or violates expectations of users. This suggests humans engage in cognitive anthropomorphism: expecting AI to have the same nature as human intelligence. This mismatch presents an obstacle to appropriate human-AI interaction. To delineate this mismatch, I examine known properties of human classification, in comparison to image classifier systems. Based on this examination, I offer three strategies for system design that can address the mismatch between human and AI classification: explainable AI, novel methods for training users, and new algorithms that match human cognition. |
Tasks | |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.03024v1 |
https://arxiv.org/pdf/2002.03024v1.pdf | |
PWC | https://paperswithcode.com/paper/cognitive-anthropomorphism-of-ai-how-humans |
Repo | |
Framework | |
An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation
Title | An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation |
Authors | Piji Li |
Abstract | We present an empirical investigation of pre-trained Transformer-based auto-regressive language models for the task of open-domain dialogue generation. The pre-training and fine-tuning paradigm is employed for parameter learning. News and Wikipedia corpora in Chinese and English are collected for the pre-training stage. During the fine-tuning stage, the dialogue context and response are concatenated into a single sequence and used as model input. A weighted joint prediction paradigm for both context and response is designed to evaluate the performance of models with or without the loss term for context prediction. Various decoding strategies, such as greedy search, beam search, and top-k sampling, are employed to generate responses. Extensive experiments are conducted on typical single-turn and multi-turn dialogue corpora such as Weibo, Douban, Reddit, DailyDialog, and Persona-Chat. Detailed automatic evaluation metrics on the relevance and diversity of the generated results are reported for the language models as well as the baseline approaches. |
Tasks | Dialogue Generation, Text Generation |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.04195v1 |
https://arxiv.org/pdf/2003.04195v1.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-investigation-of-pre-trained |
Repo | |
Framework | |
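A minimal sketch of the two mechanics described above: concatenating context and response into one sequence, and computing a weighted joint prediction loss that down-weights context tokens. The token ids, separator handling, and the weight `lam` are placeholders assuming a generic auto-regressive LM, not the paper's exact setup.

```python
# Weighted joint context/response loss for a concatenated [context ; response]
# sequence (illustrative, assuming a generic auto-regressive LM).
import torch
import torch.nn.functional as F

def weighted_lm_loss(logits, input_ids, context_len, lam=0.5):
    """Cross-entropy over the whole sequence, down-weighting context positions.

    logits: (seq_len, vocab) next-token predictions at each position.
    input_ids: (seq_len,) concatenated [context ; response] token ids.
    """
    # Shift so position t predicts token t+1, as in standard LM training.
    pred, target = logits[:-1], input_ids[1:]
    token_loss = F.cross_entropy(pred, target, reduction="none")
    weights = torch.ones_like(token_loss)
    weights[: context_len - 1] = lam          # context positions get weight lam
    return (weights * token_loss).sum() / weights.sum()

# Toy usage with random "logits" standing in for a hypothetical model's output.
vocab, seq_len, context_len = 100, 12, 5
input_ids = torch.randint(0, vocab, (seq_len,))
logits = torch.randn(seq_len, vocab)
print(weighted_lm_loss(logits, input_ids, context_len).item())
```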
Depth Estimation by Learning Triangulation and Densification of Sparse Points for Multi-view Stereo
Title | Depth Estimation by Learning Triangulation and Densification of Sparse Points for Multi-view Stereo |
Authors | Ayan Sinha, Zak Murez, James Bartolozzi, Vijay Badrinarayanan, Andrew Rabinovich |
Abstract | Multi-view stereo (MVS) is the golden mean between the accuracy of active depth sensing and the practicality of monocular depth estimation. Cost volume based approaches employing 3D convolutional neural networks (CNNs) have considerably improved the accuracy of MVS systems. However, this accuracy comes at a high computational cost which impedes practical adoption. Distinct from cost volume approaches, we propose an efficient depth estimation approach by first (a) detecting and evaluating descriptors for interest points, then (b) learning to match and triangulate a small set of interest points, and finally (c) densifying this sparse set of 3D points using CNNs. An end-to-end network efficiently performs all three steps within a deep learning framework and is trained with intermediate 2D image and 3D geometric supervision, along with depth supervision. Crucially, our first step complements pose estimation using interest point detection and descriptor learning. We demonstrate state-of-the-art results on depth estimation with lower compute for different scene lengths. Furthermore, our method generalizes to newer environments and the descriptors output by our network compare favorably to strong baselines. |
Tasks | Depth Estimation, Interest Point Detection, Monocular Depth Estimation, Pose Estimation |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.08933v1 |
https://arxiv.org/pdf/2003.08933v1.pdf | |
PWC | https://paperswithcode.com/paper/depth-estimation-by-learning-triangulation |
Repo | |
Framework | |
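Step (b), triangulating matched interest points, can be sketched with classical linear (DLT) triangulation from two views; the camera matrices and the matched point below are synthetic, and the learned matching and densification networks are omitted.

```python
# Linear (DLT) triangulation of one 3D point from two pinhole cameras
# (background geometry only, not the paper's learned pipeline).
import numpy as np

def triangulate(P1, P2, x1, x2):
    """P1, P2: 3x4 projection matrices; x1, x2: matched pixel coordinates (u, v)."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]                       # inhomogeneous 3D point

# Toy usage: identity camera and a camera translated along x, observing (1, 1, 5).
K = np.eye(3)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([1.0, 1.0, 5.0, 1.0])
x1 = P1 @ X_true; x1 = x1[:2] / x1[2]
x2 = P2 @ X_true; x2 = x2[:2] / x2[2]
print(triangulate(P1, P2, x1, x2))            # approximately [1. 1. 5.]
```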
Active Depth Estimation: Stability Analysis and its Applications
Title | Active Depth Estimation: Stability Analysis and its Applications |
Authors | Romulo T. Rodrigues, Pedro Miraldo, Dimos V. Dimarogonas, A. Pedro Aguiar |
Abstract | Recovering the 3D structure of the surrounding environment is an essential task in any vision-controlled Structure-from-Motion (SfM) scheme. This paper focuses on the theoretical properties of an SfM scheme known as incremental active depth estimation. The term incremental stands for estimating the 3D structure of the scene over a chronological sequence of image frames. Active means that the camera actuation is such that it improves estimation performance. Starting from a known depth estimation filter, this paper presents the stability analysis of the filter in terms of the control inputs of the camera. By analyzing the convergence of the estimator using Lyapunov theory, we relax the constraints on the projection of the 3D point in the image plane when compared to previous results. Nonetheless, our method is capable of dealing with the cameras’ limited field-of-view constraints. The main results are validated through experiments with simulated data. |
Tasks | Depth Estimation |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.07137v1 |
https://arxiv.org/pdf/2003.07137v1.pdf | |
PWC | https://paperswithcode.com/paper/active-depth-estimation-stability-analysis |
Repo | |
Framework | |
On the Uniqueness of Binary Quantizers for Maximizing Mutual Information
Title | On the Uniqueness of Binary Quantizers for Maximizing Mutual Information |
Authors | Thuan Nguyen, Thinh Nguyen |
Abstract | We consider a channel with a binary input X being corrupted by a continuous-valued noise that results in a continuous-valued output Y. An optimal binary quantizer is used to quantize the continuous-valued output Y to the final binary output Z to maximize the mutual information $I(X; Z)$. We show that when the ratio of the channel conditional densities $r(y) = P(Y=y|X=0)/P(Y=y|X=1)$ is a strictly increasing/decreasing function of y, then a quantizer having a single threshold can maximize mutual information. Furthermore, we show that an optimal quantizer (possibly with multiple thresholds) is the one with the thresholding vector whose elements are all the solutions of $r(y) = r^*$ for some constant $r^* > 0$. Interestingly, the optimal constant $r^*$ is unique. This uniqueness property allows for fast algorithmic implementation, such as a bisection algorithm, to find the optimal quantizer. Our results also confirm some previous results using alternative elementary proofs. We show some numerical examples of applying our results to channels with additive Gaussian noise. |
Tasks | |
Published | 2020-01-07 |
URL | https://arxiv.org/abs/2001.01836v1 |
https://arxiv.org/pdf/2001.01836v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-uniqueness-of-binary-quantizers-for |
Repo | |
Framework | |
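A small numerical sketch of the single-threshold result: for a binary input mapped to ±1 plus Gaussian noise, the likelihood ratio r(y) is strictly monotone, so a one-threshold quantizer suffices, and we can simply sweep the threshold to maximize I(X; Z). The priors, noise level, and search grid are illustrative assumptions, and the sweep stands in for the paper's bisection procedure.

```python
# Sweep a single threshold t for Z = 1{Y > t} and report the one maximizing
# I(X; Z), for binary X mapped to means -1/+1 with Gaussian noise.
import math
import numpy as np

sigma, p0 = 1.0, 0.5                 # noise std and prior P(X=0)

def gauss_cdf(x, mean):
    return 0.5 * (1 + math.erf((x - mean) / (sigma * math.sqrt(2))))

def mutual_information(t):
    """I(X; Z) in bits for the one-threshold quantizer Z = 1{Y > t}."""
    p1_given0 = 1 - gauss_cdf(t, -1.0)   # P(Z=1 | X=0), X=0 has mean -1
    p1_given1 = 1 - gauss_cdf(t, +1.0)   # P(Z=1 | X=1), X=1 has mean +1
    pz1 = p0 * p1_given0 + (1 - p0) * p1_given1
    mi = 0.0
    for px, pz1_x in [(p0, p1_given0), (1 - p0, p1_given1)]:
        for pz_x, pz in [(pz1_x, pz1), (1 - pz1_x, 1 - pz1)]:
            if pz_x > 0 and pz > 0:
                mi += px * pz_x * math.log2(pz_x / pz)
    return mi

ts = np.linspace(-3, 3, 601)
best = max(ts, key=mutual_information)
print("best threshold:", round(float(best), 3))   # about 0 for symmetric priors
```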
AL2: Progressive Activation Loss for Learning General Representations in Classification Neural Networks
Title | AL2: Progressive Activation Loss for Learning General Representations in Classification Neural Networks |
Authors | Majed El Helou, Frederike Dümbgen, Sabine Süsstrunk |
Abstract | The large capacity of neural networks enables them to learn complex functions. To avoid overfitting, however, networks require a lot of training data, which can be expensive and time-consuming to collect. A common practical approach to attenuate overfitting is the use of network regularization techniques. We propose a novel regularization method that progressively penalizes the magnitude of activations during training. The combined activation signals produced by all neurons in a given layer form the representation of the input image in that feature space. We propose to regularize this representation in the last feature layer before the classification layers. Our method’s effect on generalization is analyzed with label randomization tests and cumulative ablations. Experimental results show the advantages of our approach in comparison with commonly used regularizers on standard benchmark datasets. |
Tasks | |
Published | 2020-03-07 |
URL | https://arxiv.org/abs/2003.03633v1 |
https://arxiv.org/pdf/2003.03633v1.pdf | |
PWC | https://paperswithcode.com/paper/al2-progressive-activation-loss-for-learning |
Repo | |
Framework | |
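A rough sketch of progressively penalizing activation magnitudes in the last feature layer before the classifier is shown below; the architecture, linear schedule, and weight `lam_max` are placeholder assumptions, not the paper's AL2 configuration.

```python
# Progressive activation penalty on the last feature layer (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallNet(nn.Module):
    def __init__(self, in_dim=784, feat_dim=128, n_classes=10):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                  nn.Linear(256, feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        feats = self.body(x)                  # representation to be regularized
        return self.head(feats), feats

def total_loss(logits, feats, labels, epoch, max_epochs, lam_max=0.01):
    lam = lam_max * epoch / max_epochs        # progressive schedule: 0 -> lam_max
    return F.cross_entropy(logits, labels) + lam * feats.pow(2).mean()

# Toy usage on random data.
model = SmallNet()
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
logits, feats = model(x)
loss = total_loss(logits, feats, y, epoch=5, max_epochs=20)
loss.backward()
```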
Neural Network Approximation of Graph Fourier Transforms for Sparse Sampling of Networked Flow Dynamics
Title | Neural Network Approximation of Graph Fourier Transforms for Sparse Sampling of Networked Flow Dynamics |
Authors | Alessio Pagani, Zhuangkun Wei, Ricardo Silva, Weisi Guo |
Abstract | Infrastructure monitoring is critical for safe operations and sustainability. Water distribution networks (WDNs) are large-scale networked critical systems with complex cascade dynamics which are difficult to predict. Ubiquitous monitoring is expensive and a key challenge is to infer the contaminant dynamics from partial sparse monitoring data. Existing approaches use multi-objective optimisation to find the minimum set of essential monitoring points, but lack performance guarantees and a theoretical framework. Here, we first develop Graph Fourier Transform (GFT) operators to compress networked contamination spreading dynamics and identify the essential principal data collection points with inference performance guarantees. We then build autoencoder (AE) inspired neural networks (NN) to generalize the GFT sampling process and under-sample further from the initial sampling set, allowing a very small set of data points to largely reconstruct the contamination dynamics over real and artificial WDNs. Various sources of contamination are tested and we obtain high-accuracy reconstruction using around 5-10% of the sample set. This general approach of compression and under-sampled recovery via neural networks can be applied to a wide range of networked infrastructures to enable digital twins. |
Tasks | |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.05508v1 |
https://arxiv.org/pdf/2002.05508v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-approximation-of-graph-fourier |
Repo | |
Framework | |
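The GFT compression idea can be sketched as follows: a signal spanned by a few low-frequency Laplacian eigenvectors can be reconstructed from a small set of sampled nodes by least squares. The graph, signal, and sample set below are synthetic; the paper's optimised sensor placement and the autoencoder-based under-sampling stage are not shown.

```python
# Reconstructing a bandlimited graph signal from a few sampled nodes using the
# Graph Fourier Transform (Laplacian eigenbasis); synthetic example.
import numpy as np

rng = np.random.default_rng(2)
n = 40
A = np.zeros((n, n))                           # simple ring graph so it is connected
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.diag(A.sum(axis=1)) - A                 # combinatorial Laplacian
eigval, U = np.linalg.eigh(L)                  # GFT basis (columns of U)

k = 6                                          # keep k low-frequency modes
signal = U[:, :k] @ rng.standard_normal(k)     # a bandlimited "flow" signal

sample_nodes = rng.choice(n, size=10, replace=False)
Uk = U[:, :k]
coeffs, *_ = np.linalg.lstsq(Uk[sample_nodes], signal[sample_nodes], rcond=None)
reconstruction = Uk @ coeffs
print("max error:", np.abs(reconstruction - signal).max())   # near machine precision
```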
Adaptive Control of Embedding Strength in Image Watermarking using Neural Networks
Title | Adaptive Control of Embedding Strength in Image Watermarking using Neural Networks |
Authors | Mahnoosh Bagheri, Majid Mohrekesh, Nader Karimi, Shadrokh Samavi |
Abstract | Digital image watermarking has been widely used in applications such as copyright protection of digital media, including audio, image, and video files. The two opposing criteria of robustness and transparency are the goals of watermarking methods. In this paper, we propose a framework for determining the appropriate embedding strength factor. The framework can be used with most DWT- and DCT-based blind watermarking approaches. We use Mask R-CNN on the COCO dataset to find a good strength factor for each sub-block. Experiments show that this method is robust against different attacks and has good transparency. |
Tasks | |
Published | 2020-01-09 |
URL | https://arxiv.org/abs/2001.03251v1 |
https://arxiv.org/pdf/2001.03251v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-control-of-embedding-strength-in |
Repo | |
Framework | |
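A toy sketch of additive DCT-domain embedding with a per-block strength factor alpha; here alpha is simply passed in, whereas the paper predicts it per sub-block with Mask R-CNN, and the extraction shown is non-blind for demonstration only. Block size, coefficient position, and values are illustrative.

```python
# Per-block DCT-domain embedding with a strength factor alpha (illustrative).
import numpy as np
from scipy.fft import dctn, idctn

def embed_block(block, bit, alpha, coef=(3, 4)):
    """Embed one watermark bit into an 8x8 image block with strength alpha."""
    d = dctn(block, norm="ortho")
    d[coef] += alpha if bit else -alpha        # shift a mid-frequency coefficient
    return idctn(d, norm="ortho")

def extract_block(block, original_block, coef=(3, 4)):
    d = dctn(block, norm="ortho")
    d0 = dctn(original_block, norm="ortho")
    return int(d[coef] - d0[coef] > 0)         # non-blind extraction, demo only

rng = np.random.default_rng(3)
block = rng.uniform(0, 255, size=(8, 8))
watermarked = embed_block(block, bit=1, alpha=5.0)
print("recovered bit:", extract_block(watermarked, block))
```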
Predicting population neural activity in the Algonauts challenge using end-to-end trained Siamese networks and group convolutions
Title | Predicting population neural activity in the Algonauts challenge using end-to-end trained Siamese networks and group convolutions |
Authors | Georgin Jacob, Harish Katti |
Abstract | The Algonauts challenge is about predicting object representations, in the form of Representational Dissimilarity Matrices (RDMs), derived from visual brain regions. We used a customized deep learning model based on Siamese networks and group convolutions to predict neural distances corresponding to a pair of images. Training data was best explained by distances computed over the last layer. |
Tasks | |
Published | 2020-01-13 |
URL | https://arxiv.org/abs/2001.05841v1 |
https://arxiv.org/pdf/2001.05841v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-population-neural-activity-in-the |
Repo | |
Framework | |
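A hedged sketch of a Siamese branch with grouped convolutions mapping an image pair to a scalar dissimilarity, in the spirit of the approach above; the layer sizes, group count, and distance head are placeholder assumptions and do not reproduce the authors' model.

```python
# Siamese network with grouped convolutions predicting a pairwise distance
# (illustrative architecture only).
import torch
import torch.nn as nn

class Branch(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1, groups=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )

    def forward(self, x):
        return self.net(x)

class SiameseDistance(nn.Module):
    def __init__(self):
        super().__init__()
        self.branch = Branch()                       # shared weights across the pair
        self.head = nn.Linear(64, 1)

    def forward(self, img_a, img_b):
        fa, fb = self.branch(img_a), self.branch(img_b)
        return self.head((fa - fb).abs()).squeeze(-1)    # predicted dissimilarity

model = SiameseDistance()
a, b = torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64)
print(model(a, b).shape)        # torch.Size([2])
```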
Machine learning as a model for cultural learning: Teaching an algorithm what it means to be fat
Title | Machine learning as a model for cultural learning: Teaching an algorithm what it means to be fat |
Authors | Alina Arseniev-Koehler, Jacob G. Foster |
Abstract | Overweight individuals, and especially women, are disparaged as immoral, unhealthy, and low class. These negative conceptions are not intrinsic to obesity; they are the tainted fruit of cultural learning. Scholars often cite media consumption as a key mechanism for learning cultural biases, but it remains unclear how this public culture becomes private culture. Here we provide a computational account of this learning mechanism, showing that cultural schemata can be learned from news reporting. We extract schemata about obesity from New York Times articles with word2vec, a neural language model inspired by human cognition. We identify several cultural schemata that link obesity to gender, immorality, poor health, and low socioeconomic class. Such schemata may be subtly but pervasively activated by our language; thus, language can chronically reproduce biases (e.g., about weight and health). Our findings also reinforce ongoing concerns that machine learning can encode, and reproduce, harmful human biases. |
Tasks | Language Modelling |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.12133v1 |
https://arxiv.org/pdf/2003.12133v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-as-a-model-for-cultural |
Repo | |
Framework | |
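The schema-extraction idea can be illustrated by projecting word vectors onto a cultural dimension defined by opposing pole words; the vectors and word lists below are random stand-ins, not the paper's New York Times word2vec model.

```python
# Probing an embedding space along a cultural dimension (e.g., morality),
# assuming pretrained word vectors are available as a dict; placeholders only.
import numpy as np

rng = np.random.default_rng(4)
vocab = ["obese", "thin", "healthy", "immoral", "moral", "poor", "wealthy"]
vectors = {w: rng.standard_normal(50) for w in vocab}   # stand-in embeddings

def dimension(pos_words, neg_words):
    """Semantic direction as the difference of the mean pole vectors."""
    pos = np.mean([vectors[w] for w in pos_words], axis=0)
    neg = np.mean([vectors[w] for w in neg_words], axis=0)
    d = pos - neg
    return d / np.linalg.norm(d)

def project(word, dim):
    v = vectors[word]
    return float(v @ dim / np.linalg.norm(v))            # cosine with the dimension

morality_dim = dimension(["moral"], ["immoral"])
print("obese on the morality dimension:", project("obese", morality_dim))
```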
Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with Reward Shaping
Title | Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with Reward Shaping |
Authors | Daniel Zhang, Colleen P. Bailey |
Abstract | In this paper, we investigate the obstacle avoidance and navigation problem in the area of robotic control. To solve this problem, we propose revised Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO) algorithms with an improved reward shaping technique. We compare the original DDPG and PPO with the revised versions of both in simulations with a real mobile robot and demonstrate that the proposed algorithms achieve better results. |
Tasks | |
Published | 2020-03-28 |
URL | https://arxiv.org/abs/2003.12863v1 |
https://arxiv.org/pdf/2003.12863v1.pdf | |
PWC | https://paperswithcode.com/paper/obstacle-avoidance-and-navigation-utilizing |
Repo | |
Framework | |
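A small sketch of potential-based reward shaping for goal-directed navigation with obstacle penalties; the potential function, constants, and obstacle layout are illustrative assumptions rather than the paper's exact shaping, and the DDPG/PPO learners themselves are omitted.

```python
# Potential-based reward shaping for navigation with obstacle penalties
# (illustrative shaping function only).
import numpy as np

GOAL = np.array([5.0, 5.0])
OBSTACLES = [np.array([2.0, 2.0]), np.array([3.5, 4.0])]
GAMMA = 0.99

def potential(pos):
    """Higher potential when closer to the goal."""
    return -np.linalg.norm(pos - GOAL)

def shaped_reward(pos, next_pos, collided):
    base = 100.0 if np.linalg.norm(next_pos - GOAL) < 0.3 else -1.0   # step cost
    if collided or any(np.linalg.norm(next_pos - o) < 0.5 for o in OBSTACLES):
        base -= 50.0                                                   # obstacle hit
    # Potential-based shaping term preserves the optimal policy.
    return base + GAMMA * potential(next_pos) - potential(pos)

print(shaped_reward(np.array([0.0, 0.0]), np.array([0.5, 0.5]), collided=False))
```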
An Analysis of Object Representations in Deep Visual Trackers
Title | An Analysis of Object Representations in Deep Visual Trackers |
Authors | Ross Goroshin, Jonathan Tompson, Debidatta Dwibedi |
Abstract | Fully convolutional deep correlation networks are integral components of state-of-the-art approaches to single object visual tracking. It is commonly assumed that these networks perform tracking by detection by matching features of the object instance with features of the entire frame. Strong architectural priors and conditioning on the object representation are thought to encourage this tracking strategy. Despite these strong priors, we show that deep trackers often default to tracking by saliency detection, without relying on the object instance representation. Our analysis shows that, despite being a useful prior, saliency detection can prevent the emergence of more robust tracking strategies in deep networks. This leads us to introduce an auxiliary detection task that encourages more discriminative object representations that improve tracking performance. |
Tasks | Saliency Detection, Visual Tracking |
Published | 2020-01-08 |
URL | https://arxiv.org/abs/2001.02593v1 |
https://arxiv.org/pdf/2001.02593v1.pdf | |
PWC | https://paperswithcode.com/paper/an-analysis-of-object-representations-in-deep |
Repo | |
Framework | |
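One way to read the auxiliary detection task is as an extra term in the tracker's training loss; the sketch below combines a correlation-response tracking loss with an auxiliary classification loss, using placeholder tensors and weights that do not correspond to the paper's architecture.

```python
# Joint tracking + auxiliary detection loss (illustrative placeholders only).
import torch
import torch.nn.functional as F

def joint_loss(response_map, response_target, det_logits, det_labels, w_det=0.5):
    """Tracking loss on the correlation response plus an auxiliary detection loss."""
    track_loss = F.binary_cross_entropy_with_logits(response_map, response_target)
    det_loss = F.cross_entropy(det_logits, det_labels)
    return track_loss + w_det * det_loss

# Toy usage with random tensors standing in for network outputs.
response_map = torch.randn(4, 1, 17, 17)             # cross-correlation scores
response_target = (torch.rand(4, 1, 17, 17) > 0.9).float()
det_logits = torch.randn(4, 20)                      # per-frame class logits
det_labels = torch.randint(0, 20, (4,))
print(joint_loss(response_map, response_target, det_logits, det_labels).item())
```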