Paper Group ANR 205
Q-Learning for Mean-Field Controls. Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems. Learning Cost Functions for Optimal Transport. Cognitive Anthropomorphism of AI: How Humans and Computers Classify Images. An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation …
Q-Learning for Mean-Field Controls
Title | Q-Learning for Mean-Field Controls |
Authors | Haotian Gu, Xin Guo, Xiaoli Wei, Renyuan Xu |
Abstract | Multi-agent reinforcement learning (MARL) has been applied to many challenging problems including two-team computer games, autonomous driving, and real-time bidding. Despite the empirical success, there is a conspicuous absence of theoretical study of different MARL algorithms: this is mainly due to the curse of dimensionality caused by the exponential growth of the joint state-action space as the number of agents increases. Mean-field controls (MFC) with infinitely many agents and deterministic flows, meanwhile, provide good approximations to $N$-agent collaborative games in terms of both game values and optimal strategies. In this paper, we study collaborative MARL under an MFC approximation framework: we develop a model-free kernel-based Q-learning algorithm (CDD-Q) and show that its convergence rate and sample complexity are independent of the number of agents. Our empirical studies on MFC examples demonstrate strong performance of CDD-Q. Moreover, the CDD-Q algorithm can be applied to a general class of Markov decision problems (MDPs) with deterministic dynamics and continuous state-action space. |
Tasks | Multi-agent Reinforcement Learning, Q-Learning |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.04131v1 |
https://arxiv.org/pdf/2002.04131v1.pdf | |
PWC | https://paperswithcode.com/paper/q-learning-for-mean-field-controls |
Repo | |
Framework | |
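The kernel-based Q-learning idea behind CDD-Q can be illustrated with a small sketch: Q-values are stored at a grid of anchor states and evaluated at arbitrary continuous states by Gaussian-kernel interpolation, with kernel-weighted Bellman updates. The dynamics, cost, kernel bandwidth, and hyperparameters below are illustrative assumptions, not the paper's algorithm or setting.

```python
# Minimal sketch (not the authors' CDD-Q): kernel-based Q-learning on a toy
# deterministic MDP with a continuous state, storing Q-values at anchor states
# and interpolating with a Gaussian kernel.
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.9
anchors_s = np.linspace(0.0, 1.0, 21)          # anchor states
actions = np.linspace(-0.1, 0.1, 5)            # discretized actions
Q = np.zeros((len(anchors_s), len(actions)))   # Q-values stored at anchors

def kernel_weights(s, bandwidth=0.05):
    """Gaussian-kernel weights of a continuous state s over the anchor states."""
    w = np.exp(-((anchors_s - s) ** 2) / (2 * bandwidth ** 2))
    return w / w.sum()

def q_value(s, a_idx):
    return kernel_weights(s) @ Q[:, a_idx]

def step(s, a):
    """Toy deterministic dynamics and reward: drive the state toward 0.5."""
    s_next = np.clip(s + a, 0.0, 1.0)
    reward = -abs(s_next - 0.5)
    return s_next, reward

alpha, eps = 0.2, 0.2
for episode in range(500):
    s = rng.uniform(0.0, 1.0)
    for t in range(30):
        a_idx = rng.integers(len(actions)) if rng.random() < eps else \
                int(np.argmax([q_value(s, j) for j in range(len(actions))]))
        s_next, r = step(s, actions[a_idx])
        target = r + gamma * max(q_value(s_next, j) for j in range(len(actions)))
        td = target - q_value(s, a_idx)
        Q[:, a_idx] += alpha * kernel_weights(s) * td   # kernel-weighted update
        s = s_next
```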
Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems
Title | Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems |
Authors | Seyed Mohammad Asghari, Yi Ouyang, Ashutosh Nayyar |
Abstract | Regret analysis is challenging in Multi-Agent Reinforcement Learning (MARL) primarily due to the dynamical environments and the decentralized information among agents. We attempt to solve this challenge in the context of decentralized learning in multi-agent linear-quadratic (LQ) dynamical systems. We begin with a simple setup consisting of two agents and two dynamically decoupled stochastic linear systems, each system controlled by an agent. The systems are coupled through a quadratic cost function. When both systems’ dynamics are unknown and there is no communication among the agents, we show that no learning policy can achieve regret that is sub-linear in $T$, where $T$ is the time horizon. When only one system’s dynamics are unknown and there is one-directional communication from the agent controlling the unknown system to the other agent, we propose a MARL algorithm based on the construction of an auxiliary single-agent LQ problem. The auxiliary single-agent problem in the proposed MARL algorithm serves as an implicit coordination mechanism among the two learning agents. This allows the agents to achieve a regret within $O(\sqrt{T})$ of the regret of the auxiliary single-agent problem. Consequently, using existing results for single-agent LQ regret, our algorithm provides a $\tilde{O}(\sqrt{T})$ regret bound. (Here $\tilde{O}(\cdot)$ hides constants and logarithmic factors). Our numerical experiments indicate that this bound is matched in practice. From the two-agent problem, we extend our results to multi-agent LQ systems with certain communication patterns. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.10122v1 |
https://arxiv.org/pdf/2001.10122v1.pdf | |
PWC | https://paperswithcode.com/paper/regret-bounds-for-decentralized-learning-in |
Repo | |
Framework | |
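To make the regret notion concrete, here is a hedged single-agent sketch: a certainty-equivalence learner on a scalar LQ system accumulates cost and is compared against the optimal controller that knows the true dynamics. This is background for how LQ regret is measured, not the paper's decentralized two-agent algorithm; the system parameters, exploration noise, and update schedule are made up.

```python
# Certainty-equivalence learning on a scalar LQ system, measuring regret against
# the optimal controller with known dynamics (illustrative only).
import numpy as np

rng = np.random.default_rng(1)
a_true, b_true = 0.9, 0.5          # unknown to the learner
q_cost, r_cost = 1.0, 1.0

def lqr_gain(a, b, iters=200):
    """Scalar LQR gain via Riccati iteration."""
    p = q_cost
    for _ in range(iters):
        k = (b * p * a) / (r_cost + b * p * b)
        p = q_cost + a * p * a - a * p * b * k
    return k

k_opt = lqr_gain(a_true, b_true)

T = 5000
x, x_opt = 0.0, 0.0
cost_learn, cost_opt = 0.0, 0.0
data = []                           # (x_t, u_t, x_{t+1}) tuples for least squares
k_hat = 0.0
for t in range(T):
    w = 0.1 * rng.standard_normal()
    u = -k_hat * x + 0.1 * rng.standard_normal()        # small exploration noise
    x_next = a_true * x + b_true * u + w
    cost_learn += q_cost * x**2 + r_cost * u**2
    data.append((x, u, x_next))
    if t % 50 == 49:                # re-estimate (a, b) and update the gain
        Z = np.array([[d[0], d[1]] for d in data])
        y = np.array([d[2] for d in data])
        theta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        k_hat = lqr_gain(theta[0], theta[1])
    x = x_next
    u_opt = -k_opt * x_opt
    x_opt = a_true * x_opt + b_true * u_opt + w
    cost_opt += q_cost * x_opt**2 + r_cost * u_opt**2

print("regret:", cost_learn - cost_opt)
```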
Learning Cost Functions for Optimal Transport
Title | Learning Cost Functions for Optimal Transport |
Authors | Haodong Sun, Haomin Zhou, Hongyuan Zha, Xiaojing Ye |
Abstract | Learning the cost function for optimal transport from an observed transport plan or its samples has been cast as a bi-level optimization problem. In this paper, we derive an unconstrained convex optimization formulation for the problem which can be further augmented by any customizable regularization. This novel framework avoids repeatedly solving a forward optimal transport problem in each iteration, which has been a thorny computational bottleneck for the bi-level optimization approach. To validate the effectiveness of this framework, we develop two numerical algorithms: one is a fast matrix scaling method based on the Sinkhorn-Knopp algorithm for the discrete case, and the other is a supervised learning algorithm that realizes the cost function as a deep neural network in the continuous case. Numerical results demonstrate promising efficiency and accuracy advantages of the proposed algorithms over existing state-of-the-art methods. |
Tasks | |
Published | 2020-02-22 |
URL | https://arxiv.org/abs/2002.09650v1 |
https://arxiv.org/pdf/2002.09650v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-cost-functions-for-optimal-transport |
Repo | |
Framework | |
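For context, the discrete algorithm builds on Sinkhorn-Knopp matrix scaling. The sketch below is the standard Sinkhorn iteration for the forward entropic optimal transport problem, not the paper's inverse (cost-learning) method; the histograms, ground cost, and regularization value are illustrative.

```python
# Standard Sinkhorn-Knopp scaling for entropic optimal transport (the forward
# problem only, as background for the matrix scaling machinery).
import numpy as np

def sinkhorn(cost, mu, nu, eps=0.05, iters=500):
    """Return an approximate OT plan between histograms mu, nu for a cost matrix."""
    K = np.exp(-cost / eps)               # Gibbs kernel
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(iters):
        u = mu / (K @ v)
        v = nu / (K.T @ u)
    return u[:, None] * K * v[None, :]    # transport plan with marginals mu, nu

# Tiny usage example on made-up 1-D histograms.
x = np.linspace(0, 1, 50)
cost = (x[:, None] - x[None, :]) ** 2     # squared-distance ground cost
mu = np.ones(50) / 50
nu = np.exp(-((x - 0.7) ** 2) / 0.01); nu /= nu.sum()
plan = sinkhorn(cost, mu, nu)
print(plan.sum(axis=1)[:3], mu[:3])       # row marginals approximately equal mu
```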
Cognitive Anthropomorphism of AI: How Humans and Computers Classify Images
Title | Cognitive Anthropomorphism of AI: How Humans and Computers Classify Images |
Authors | Shane T. Mueller |
Abstract | Modern AI image classifiers have made impressive advances in recent years, but their performance often appears strange or violates expectations of users. This suggests humans engage in cognitive anthropomorphism: expecting AI to have the same nature as human intelligence. This mismatch presents an obstacle to appropriate human-AI interaction. To delineate this mismatch, I examine known properties of human classification, in comparison to image classifier systems. Based on this examination, I offer three strategies for system design that can address the mismatch between human and AI classification: explainable AI, novel methods for training users, and new algorithms that match human cognition. |
Tasks | |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.03024v1 |
https://arxiv.org/pdf/2002.03024v1.pdf | |
PWC | https://paperswithcode.com/paper/cognitive-anthropomorphism-of-ai-how-humans |
Repo | |
Framework | |
An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation
Title | An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation |
Authors | Piji Li |
Abstract | We present an empirical investigation of pre-trained Transformer-based auto-regressive language models for the task of open-domain dialogue generation. The pre-training and fine-tuning paradigm is employed for parameter learning. News and Wikipedia corpora in Chinese and English are collected for the pre-training stage. During the fine-tuning stage, the dialogue context and response are concatenated into a single sequence and used as model input. A weighted joint prediction paradigm for both context and response is designed to evaluate the performance of models with or without the loss term for context prediction. Various decoding strategies, such as greedy search, beam search, and top-k sampling, are employed to generate responses. Extensive experiments are conducted on typical single-turn and multi-turn dialogue corpora such as Weibo, Douban, Reddit, DailyDialog, and Persona-Chat. Detailed automatic evaluation metrics on the relevance and diversity of the generated results are reported for the language models as well as the baseline approaches. |
Tasks | Dialogue Generation, Text Generation |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.04195v1 |
https://arxiv.org/pdf/2003.04195v1.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-investigation-of-pre-trained |
Repo | |
Framework | |
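A minimal sketch of the two mechanics described above: concatenating context and response into one sequence, and computing a weighted joint prediction loss that down-weights context tokens. The token ids, separator handling, and the weight `lam` are placeholders assuming a generic auto-regressive LM, not the paper's exact setup.

```python
# Weighted joint context/response loss for a concatenated [context ; response]
# sequence (illustrative, assuming a generic auto-regressive LM).
import torch
import torch.nn.functional as F

def weighted_lm_loss(logits, input_ids, context_len, lam=0.5):
    """Cross-entropy over the whole sequence, down-weighting context positions.

    logits: (seq_len, vocab) next-token predictions at each position.
    input_ids: (seq_len,) concatenated [context ; response] token ids.
    """
    # Shift so position t predicts token t+1, as in standard LM training.
    pred, target = logits[:-1], input_ids[1:]
    token_loss = F.cross_entropy(pred, target, reduction="none")
    weights = torch.ones_like(token_loss)
    weights[: context_len - 1] = lam          # context positions get weight lam
    return (weights * token_loss).sum() / weights.sum()

# Toy usage with random "logits" standing in for a hypothetical model's output.
vocab, seq_len, context_len = 100, 12, 5
input_ids = torch.randint(0, vocab, (seq_len,))
logits = torch.randn(seq_len, vocab)
print(weighted_lm_loss(logits, input_ids, context_len).item())
```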
Depth Estimation by Learning Triangulation and Densification of Sparse Points for Multi-view Stereo
Title | Depth Estimation by Learning Triangulation and Densification of Sparse Points for Multi-view Stereo |
Authors | Ayan Sinha, Zak Murez, James Bartolozzi, Vijay Badrinarayanan, Andrew Rabinovich |
Abstract | Multi-view stereo (MVS) is the golden mean between the accuracy of active depth sensing and the practicality of monocular depth estimation. Cost volume based approaches employing 3D convolutional neural networks (CNNs) have considerably improved the accuracy of MVS systems. However, this accuracy comes at a high computational cost which impedes practical adoption. Distinct from cost volume approaches, we propose an efficient depth estimation approach by first (a) detecting and evaluating descriptors for interest points, then (b) learning to match and triangulate a small set of interest points, and finally (c) densifying this sparse set of 3D points using CNNs. An end-to-end network efficiently performs all three steps within a deep learning framework and is trained with intermediate 2D image and 3D geometric supervision, along with depth supervision. Crucially, our first step complements pose estimation using interest point detection and descriptor learning. We demonstrate state-of-the-art results on depth estimation with lower compute for different scene lengths. Furthermore, our method generalizes to newer environments and the descriptors output by our network compare favorably to strong baselines. |
Tasks | Depth Estimation, Interest Point Detection, Monocular Depth Estimation, Pose Estimation |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.08933v1 |
https://arxiv.org/pdf/2003.08933v1.pdf | |
PWC | https://paperswithcode.com/paper/depth-estimation-by-learning-triangulation |
Repo | |
Framework | |
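Step (b), triangulating matched interest points, can be sketched with classical linear (DLT) triangulation from two views; the camera matrices and the matched point below are synthetic, and the learned matching and densification networks are omitted.

```python
# Linear (DLT) triangulation of one 3D point from two pinhole cameras
# (background geometry only, not the paper's learned pipeline).
import numpy as np

def triangulate(P1, P2, x1, x2):
    """P1, P2: 3x4 projection matrices; x1, x2: matched pixel coordinates (u, v)."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]                       # inhomogeneous 3D point

# Toy usage: identity camera and a camera translated along x, observing (1, 1, 5).
K = np.eye(3)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([1.0, 1.0, 5.0, 1.0])
x1 = P1 @ X_true; x1 = x1[:2] / x1[2]
x2 = P2 @ X_true; x2 = x2[:2] / x2[2]
print(triangulate(P1, P2, x1, x2))            # approximately [1. 1. 5.]
```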
Active Depth Estimation: Stability Analysis and its Applications
Title | Active Depth Estimation: Stability Analysis and its Applications |
Authors | Romulo T. Rodrigues, Pedro Miraldo, Dimos V. Dimarogonas, A. Pedro Aguiar |
Abstract | Recovering the 3D structure of the surrounding environment is an essential task in any vision-controlled Structure-from-Motion (SfM) scheme. This paper focuses on the theoretical properties of an SfM scheme known as incremental active depth estimation. The term incremental stands for estimating the 3D structure of the scene over a chronological sequence of image frames. Active means that the camera actuation is such that it improves estimation performance. Starting from a known depth estimation filter, this paper presents the stability analysis of the filter in terms of the control inputs of the camera. By analyzing the convergence of the estimator using Lyapunov theory, we relax the constraints on the projection of the 3D point in the image plane when compared to previous results. Nonetheless, our method is capable of dealing with the cameras’ limited field-of-view constraints. The main results are validated through experiments with simulated data. |
Tasks | Depth Estimation |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.07137v1 |
https://arxiv.org/pdf/2003.07137v1.pdf | |
PWC | https://paperswithcode.com/paper/active-depth-estimation-stability-analysis |
Repo | |
Framework | |
On the Uniqueness of Binary Quantizers for Maximizing Mutual Information
Title | On the Uniqueness of Binary Quantizers for Maximizing Mutual Information |
Authors | Thuan Nguyen, Thinh Nguyen |
Abstract | We consider a channel with a binary input X being corrupted by a continuous-valued noise that results in a continuous-valued output Y. An optimal binary quantizer is used to quantize the continuous-valued output Y to the final binary output Z to maximize the mutual information $I(X; Z)$. We show that when the ratio of the channel conditional densities $r(y) = P(Y=y|X=0)/P(Y=y|X=1)$ is a strictly increasing/decreasing function of y, then a quantizer having a single threshold can maximize mutual information. Furthermore, we show that an optimal quantizer (possibly with multiple thresholds) is the one with the thresholding vector whose elements are all the solutions of $r(y) = r^*$ for some constant $r^* > 0$. Interestingly, the optimal constant $r^*$ is unique. This uniqueness property allows for fast algorithmic implementation, such as a bisection algorithm, to find the optimal quantizer. Our results also confirm some previous results using alternative elementary proofs. We show some numerical examples of applying our results to channels with additive Gaussian noise. |
Tasks | |
Published | 2020-01-07 |
URL | https://arxiv.org/abs/2001.01836v1 |
https://arxiv.org/pdf/2001.01836v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-uniqueness-of-binary-quantizers-for |
Repo | |
Framework | |
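A small numerical sketch of the single-threshold result: for a binary input mapped to ±1 plus Gaussian noise, the likelihood ratio r(y) is strictly monotone, so a one-threshold quantizer suffices, and we can simply sweep the threshold to maximize I(X; Z). The priors, noise level, and search grid are illustrative assumptions, and the sweep stands in for the paper's bisection procedure.

```python
# Sweep a single threshold t for Z = 1{Y > t} and report the one maximizing
# I(X; Z), for binary X mapped to means -1/+1 with Gaussian noise.
import math
import numpy as np

sigma, p0 = 1.0, 0.5                 # noise std and prior P(X=0)

def gauss_cdf(x, mean):
    return 0.5 * (1 + math.erf((x - mean) / (sigma * math.sqrt(2))))

def mutual_information(t):
    """I(X; Z) in bits for the one-threshold quantizer Z = 1{Y > t}."""
    p1_given0 = 1 - gauss_cdf(t, -1.0)   # P(Z=1 | X=0), X=0 has mean -1
    p1_given1 = 1 - gauss_cdf(t, +1.0)   # P(Z=1 | X=1), X=1 has mean +1
    pz1 = p0 * p1_given0 + (1 - p0) * p1_given1
    mi = 0.0
    for px, pz1_x in [(p0, p1_given0), (1 - p0, p1_given1)]:
        for pz_x, pz in [(pz1_x, pz1), (1 - pz1_x, 1 - pz1)]:
            if pz_x > 0 and pz > 0:
                mi += px * pz_x * math.log2(pz_x / pz)
    return mi

ts = np.linspace(-3, 3, 601)
best = max(ts, key=mutual_information)
print("best threshold:", round(float(best), 3))   # about 0 for symmetric priors
```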
AL2: Progressive Activation Loss for Learning General Representations in Classification Neural Networks
Title | AL2: Progressive Activation Loss for Learning General Representations in Classification Neural Networks |
Authors | Majed El Helou, Frederike Dümbgen, Sabine Süsstrunk |
Abstract | The large capacity of neural networks enables them to learn complex functions. To avoid overfitting, however, networks require a lot of training data, which can be expensive and time-consuming to collect. A common practical approach to attenuate overfitting is the use of network regularization techniques. We propose a novel regularization method that progressively penalizes the magnitude of activations during training. The combined activation signals produced by all neurons in a given layer form the representation of the input image in that feature space. We propose to regularize this representation in the last feature layer before the classification layers. Our method’s effect on generalization is analyzed with label randomization tests and cumulative ablations. Experimental results show the advantages of our approach in comparison with commonly used regularizers on standard benchmark datasets. |
Tasks | |
Published | 2020-03-07 |
URL | https://arxiv.org/abs/2003.03633v1 |
https://arxiv.org/pdf/2003.03633v1.pdf | |
PWC | https://paperswithcode.com/paper/al2-progressive-activation-loss-for-learning |
Repo | |
Framework | |
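A rough sketch of progressively penalizing activation magnitudes in the last feature layer before the classifier is shown below; the architecture, linear schedule, and weight `lam_max` are placeholder assumptions, not the paper's AL2 configuration.

```python
# Progressive activation penalty on the last feature layer (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallNet(nn.Module):
    def __init__(self, in_dim=784, feat_dim=128, n_classes=10):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                  nn.Linear(256, feat_dim), nn.ReLU())
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, x):
        feats = self.body(x)                  # representation to be regularized
        return self.head(feats), feats

def total_loss(logits, feats, labels, epoch, max_epochs, lam_max=0.01):
    lam = lam_max * epoch / max_epochs        # progressive schedule: 0 -> lam_max
    return F.cross_entropy(logits, labels) + lam * feats.pow(2).mean()

# Toy usage on random data.
model = SmallNet()
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
logits, feats = model(x)
loss = total_loss(logits, feats, y, epoch=5, max_epochs=20)
loss.backward()
```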
Neural Network Approximation of Graph Fourier Transforms for Sparse Sampling of Networked Flow Dynamics
Title | Neural Network Approximation of Graph Fourier Transforms for Sparse Sampling of Networked Flow Dynamics |
Authors | Alessio Pagani, Zhuangkun Wei, Ricardo Silva, Weisi Guo |
Abstract | Infrastructure monitoring is critical for safe operations and sustainability. Water distribution networks (WDNs) are large-scale networked critical systems with complex cascade dynamics which are difficult to predict. Ubiquitous monitoring is expensive and a key challenge is to infer the contaminant dynamics from partial sparse monitoring data. Existing approaches use multi-objective optimisation to find the minimum set of essential monitoring points, but lack performance guarantees and a theoretical framework. Here, we first develop Graph Fourier Transform (GFT) operators to compress networked contamination spreading dynamics and identify the essential principal data collection points with inference performance guarantees. We then build autoencoder (AE) inspired neural networks (NN) to generalize the GFT sampling process and under-sample further from the initial sampling set, allowing a very small set of data points to largely reconstruct the contamination dynamics over real and artificial WDNs. Various sources of contamination are tested and we obtain high-accuracy reconstruction using around 5-10% of the sample set. This general approach of compression and under-sampled recovery via neural networks can be applied to a wide range of networked infrastructures to enable digital twins. |
Tasks | |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.05508v1 |
https://arxiv.org/pdf/2002.05508v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-approximation-of-graph-fourier |
Repo | |
Framework | |
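The GFT compression idea can be sketched as follows: a signal spanned by a few low-frequency Laplacian eigenvectors can be reconstructed from a small set of sampled nodes by least squares. The graph, signal, and sample set below are synthetic; the paper's optimised sensor placement and the autoencoder-based under-sampling stage are not shown.

```python
# Reconstructing a bandlimited graph signal from a few sampled nodes using the
# Graph Fourier Transform (Laplacian eigenbasis); synthetic example.
import numpy as np

rng = np.random.default_rng(2)
n = 40
A = np.zeros((n, n))                           # simple ring graph so it is connected
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.diag(A.sum(axis=1)) - A                 # combinatorial Laplacian
eigval, U = np.linalg.eigh(L)                  # GFT basis (columns of U)

k = 6                                          # keep k low-frequency modes
signal = U[:, :k] @ rng.standard_normal(k)     # a bandlimited "flow" signal

sample_nodes = rng.choice(n, size=10, replace=False)
Uk = U[:, :k]
coeffs, *_ = np.linalg.lstsq(Uk[sample_nodes], signal[sample_nodes], rcond=None)
reconstruction = Uk @ coeffs
print("max error:", np.abs(reconstruction - signal).max())   # near machine precision
```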
Adaptive Control of Embedding Strength in Image Watermarking using Neural Networks
Title | Adaptive Control of Embedding Strength in Image Watermarking using Neural Networks |
Authors | Mahnoosh Bagheri, Majid Mohrekesh, Nader Karimi, Shadrokh Samavi |
Abstract | Digital image watermarking has been widely used in applications such as copyright protection of digital media, including audio, image, and video files. The two opposing criteria of robustness and transparency are the goals of watermarking methods. In this paper, we propose a framework for determining the appropriate embedding strength factor. The framework can be used with most DWT- and DCT-based blind watermarking approaches. We use Mask R-CNN on the COCO dataset to find a good strength factor for each sub-block. Experiments show that this method is robust against different attacks and has good transparency. |
Tasks | |
Published | 2020-01-09 |
URL | https://arxiv.org/abs/2001.03251v1 |
https://arxiv.org/pdf/2001.03251v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-control-of-embedding-strength-in |
Repo | |
Framework | |
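A toy sketch of additive DCT-domain embedding with a per-block strength factor alpha; here alpha is simply passed in, whereas the paper predicts it per sub-block with Mask R-CNN, and the extraction shown is non-blind for demonstration only. Block size, coefficient position, and values are illustrative.

```python
# Per-block DCT-domain embedding with a strength factor alpha (illustrative).
import numpy as np
from scipy.fft import dctn, idctn

def embed_block(block, bit, alpha, coef=(3, 4)):
    """Embed one watermark bit into an 8x8 image block with strength alpha."""
    d = dctn(block, norm="ortho")
    d[coef] += alpha if bit else -alpha        # shift a mid-frequency coefficient
    return idctn(d, norm="ortho")

def extract_block(block, original_block, coef=(3, 4)):
    d = dctn(block, norm="ortho")
    d0 = dctn(original_block, norm="ortho")
    return int(d[coef] - d0[coef] > 0)         # non-blind extraction, demo only

rng = np.random.default_rng(3)
block = rng.uniform(0, 255, size=(8, 8))
watermarked = embed_block(block, bit=1, alpha=5.0)
print("recovered bit:", extract_block(watermarked, block))
```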
Predicting population neural activity in the Algonauts challenge using end-to-end trained Siamese networks and group convolutions
Title | Predicting population neural activity in the Algonauts challenge using end-to-end trained Siamese networks and group convolutions |
Authors | Georgin Jacob, Harish Katti |
Abstract | The Algonauts challenge is about predicting object representations, in the form of Representational Dissimilarity Matrices (RDMs), derived from visual brain regions. We used a customized deep learning model based on Siamese networks and group convolutions to predict neural distances corresponding to a pair of images. Training data was best explained by distances computed over the last layer. |
Tasks | |
Published | 2020-01-13 |
URL | https://arxiv.org/abs/2001.05841v1 |
https://arxiv.org/pdf/2001.05841v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-population-neural-activity-in-the |
Repo | |
Framework | |
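A hedged sketch of a Siamese branch with grouped convolutions mapping an image pair to a scalar dissimilarity, in the spirit of the approach above; the layer sizes, group count, and distance head are placeholder assumptions and do not reproduce the authors' model.

```python
# Siamese network with grouped convolutions predicting a pairwise distance
# (illustrative architecture only).
import torch
import torch.nn as nn

class Branch(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1, groups=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )

    def forward(self, x):
        return self.net(x)

class SiameseDistance(nn.Module):
    def __init__(self):
        super().__init__()
        self.branch = Branch()                       # shared weights across the pair
        self.head = nn.Linear(64, 1)

    def forward(self, img_a, img_b):
        fa, fb = self.branch(img_a), self.branch(img_b)
        return self.head((fa - fb).abs()).squeeze(-1)    # predicted dissimilarity

model = SiameseDistance()
a, b = torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64)
print(model(a, b).shape)        # torch.Size([2])
```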
Machine learning as a model for cultural learning: Teaching an algorithm what it means to be fat
Title | Machine learning as a model for cultural learning: Teaching an algorithm what it means to be fat |
Authors | Alina Arseniev-Koehler, Jacob G. Foster |
Abstract | Overweight individuals, and especially women, are disparaged as immoral, unhealthy, and low class. These negative conceptions are not intrinsic to obesity; they are the tainted fruit of cultural learning. Scholars often cite media consumption as a key mechanism for learning cultural biases, but it remains unclear how this public culture becomes private culture. Here we provide a computational account of this learning mechanism, showing that cultural schemata can be learned from news reporting. We extract schemata about obesity from New York Times articles with word2vec, a neural language model inspired by human cognition. We identify several cultural schemata that link obesity to gender, immorality, poor health, and low socioeconomic class. Such schemata may be subtly but pervasively activated by our language; thus, language can chronically reproduce biases (e.g., about weight and health). Our findings also reinforce ongoing concerns that machine learning can encode, and reproduce, harmful human biases. |
Tasks | Language Modelling |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.12133v1 |
https://arxiv.org/pdf/2003.12133v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-as-a-model-for-cultural |
Repo | |
Framework | |
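The schema-extraction idea can be illustrated by projecting word vectors onto a cultural dimension defined by opposing pole words; the vectors and word lists below are random stand-ins, not the paper's New York Times word2vec model.

```python
# Probing an embedding space along a cultural dimension (e.g., morality),
# assuming pretrained word vectors are available as a dict; placeholders only.
import numpy as np

rng = np.random.default_rng(4)
vocab = ["obese", "thin", "healthy", "immoral", "moral", "poor", "wealthy"]
vectors = {w: rng.standard_normal(50) for w in vocab}   # stand-in embeddings

def dimension(pos_words, neg_words):
    """Semantic direction as the difference of the mean pole vectors."""
    pos = np.mean([vectors[w] for w in pos_words], axis=0)
    neg = np.mean([vectors[w] for w in neg_words], axis=0)
    d = pos - neg
    return d / np.linalg.norm(d)

def project(word, dim):
    v = vectors[word]
    return float(v @ dim / np.linalg.norm(v))            # cosine with the dimension

morality_dim = dimension(["moral"], ["immoral"])
print("obese on the morality dimension:", project("obese", morality_dim))
```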
Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with Reward Shaping
Title | Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with Reward Shaping |
Authors | Daniel Zhang, Colleen P. Bailey |
Abstract | In this paper, we investigate the obstacle avoidance and navigation problem in the area of robotic control. To solve this problem, we propose revised Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO) algorithms with an improved reward shaping technique. We compare the original DDPG and PPO with the revised versions of both in simulations with a real mobile robot and demonstrate that the proposed algorithms achieve better results. |
Tasks | |
Published | 2020-03-28 |
URL | https://arxiv.org/abs/2003.12863v1 |
https://arxiv.org/pdf/2003.12863v1.pdf | |
PWC | https://paperswithcode.com/paper/obstacle-avoidance-and-navigation-utilizing |
Repo | |
Framework | |
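A small sketch of potential-based reward shaping for goal-directed navigation with obstacle penalties; the potential function, constants, and obstacle layout are illustrative assumptions rather than the paper's exact shaping, and the DDPG/PPO learners themselves are omitted.

```python
# Potential-based reward shaping for navigation with obstacle penalties
# (illustrative shaping function only).
import numpy as np

GOAL = np.array([5.0, 5.0])
OBSTACLES = [np.array([2.0, 2.0]), np.array([3.5, 4.0])]
GAMMA = 0.99

def potential(pos):
    """Higher potential when closer to the goal."""
    return -np.linalg.norm(pos - GOAL)

def shaped_reward(pos, next_pos, collided):
    base = 100.0 if np.linalg.norm(next_pos - GOAL) < 0.3 else -1.0   # step cost
    if collided or any(np.linalg.norm(next_pos - o) < 0.5 for o in OBSTACLES):
        base -= 50.0                                                   # obstacle hit
    # Potential-based shaping term preserves the optimal policy.
    return base + GAMMA * potential(next_pos) - potential(pos)

print(shaped_reward(np.array([0.0, 0.0]), np.array([0.5, 0.5]), collided=False))
```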
An Analysis of Object Representations in Deep Visual Trackers
Title | An Analysis of Object Representations in Deep Visual Trackers |
Authors | Ross Goroshin, Jonathan Tompson, Debidatta Dwibedi |
Abstract | Fully convolutional deep correlation networks are integral components of state-of-the-art approaches to single object visual tracking. It is commonly assumed that these networks perform tracking by detection by matching features of the object instance with features of the entire frame. Strong architectural priors and conditioning on the object representation are thought to encourage this tracking strategy. Despite these strong priors, we show that deep trackers often default to tracking by saliency detection, without relying on the object instance representation. Our analysis shows that, despite being a useful prior, saliency detection can prevent the emergence of more robust tracking strategies in deep networks. This leads us to introduce an auxiliary detection task that encourages more discriminative object representations that improve tracking performance. |
Tasks | Saliency Detection, Visual Tracking |
Published | 2020-01-08 |
URL | https://arxiv.org/abs/2001.02593v1 |
https://arxiv.org/pdf/2001.02593v1.pdf | |
PWC | https://paperswithcode.com/paper/an-analysis-of-object-representations-in-deep |
Repo | |
Framework | |
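One way to read the auxiliary detection task is as an extra term in the tracker's training loss; the sketch below combines a correlation-response tracking loss with an auxiliary classification loss, using placeholder tensors and weights that do not correspond to the paper's architecture.

```python
# Joint tracking + auxiliary detection loss (illustrative placeholders only).
import torch
import torch.nn.functional as F

def joint_loss(response_map, response_target, det_logits, det_labels, w_det=0.5):
    """Tracking loss on the correlation response plus an auxiliary detection loss."""
    track_loss = F.binary_cross_entropy_with_logits(response_map, response_target)
    det_loss = F.cross_entropy(det_logits, det_labels)
    return track_loss + w_det * det_loss

# Toy usage with random tensors standing in for network outputs.
response_map = torch.randn(4, 1, 17, 17)             # cross-correlation scores
response_target = (torch.rand(4, 1, 17, 17) > 0.9).float()
det_logits = torch.randn(4, 20)                      # per-frame class logits
det_labels = torch.randint(0, 20, (4,))
print(joint_loss(response_map, response_target, det_logits, det_labels).item())
```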