Paper Group ANR 154
ACCNet: Actor-Coordinator-Critic Net for “Learning-to-Communicate” with Deep Multi-agent Reinforcement Learning
Title | ACCNet: Actor-Coordinator-Critic Net for “Learning-to-Communicate” with Deep Multi-agent Reinforcement Learning |
Authors | Hangyu Mao, Zhibo Gong, Yan Ni, Zhen Xiao |
Abstract | Communication is a critical factor for the big multi-agent world to stay organized and productive. Typically, previous multi-agent “learning-to-communicate” studies try to predefine the communication protocols or use techniques such as tabular reinforcement learning and evolutionary algorithms, which cannot generalize to changing environments or large collections of agents. In this paper, we propose an Actor-Coordinator-Critic Net (ACCNet) framework for solving the “learning-to-communicate” problem. ACCNet naturally combines actor-critic reinforcement learning with deep learning, and can efficiently learn communication protocols, even from scratch, in partially observable environments. We demonstrate that ACCNet achieves better results than several baselines in both continuous and discrete action-space environments. We also analyse the learned protocols and discuss some design considerations. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2017-06-10 |
URL | http://arxiv.org/abs/1706.03235v3 |
http://arxiv.org/pdf/1706.03235v3.pdf | |
PWC | https://paperswithcode.com/paper/accnet-actor-coordinator-critic-net-for |
Repo | |
Framework | |
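The ACCNet abstract above describes an actor-coordinator-critic arrangement in which agents learn what to communicate. Below is a minimal PyTorch sketch of that wiring under stated assumptions: the module sizes, the mean-pooling coordinator, and the single shared critic are illustrative choices, not the architecture from the paper.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Maps a local observation (plus a coordination signal) to an action
    distribution and an outgoing message."""
    def __init__(self, obs_dim, act_dim, msg_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, act_dim)
        self.msg_head = nn.Linear(hidden, msg_dim)

    def forward(self, obs, coord_signal):
        h = self.body(obs) + coord_signal               # inject coordination signal
        return torch.softmax(self.policy_head(h), dim=-1), self.msg_head(h)

class Coordinator(nn.Module):
    """Aggregates all agents' messages into a shared coordination signal."""
    def __init__(self, msg_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(msg_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))

    def forward(self, messages):                        # messages: (n_agents, msg_dim)
        pooled = messages.mean(dim=0, keepdim=True)     # simple mean pooling
        return self.net(pooled).expand(messages.size(0), -1)

class Critic(nn.Module):
    """Scores the joint observation; used as the policy-gradient baseline."""
    def __init__(self, obs_dim, n_agents, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim * n_agents, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, all_obs):
        return self.net(all_obs.reshape(1, -1))

# One communication round for 3 agents (all sizes are arbitrary).
n_agents, obs_dim, act_dim, msg_dim, hidden = 3, 8, 4, 6, 64
actors = [Actor(obs_dim, act_dim, msg_dim, hidden) for _ in range(n_agents)]
coordinator, critic = Coordinator(msg_dim, hidden), Critic(obs_dim, n_agents, hidden)

obs = torch.randn(n_agents, obs_dim)
signal = torch.zeros(n_agents, hidden)                  # no coordination before round 1
msgs = torch.stack([a(obs[i:i + 1], signal[i:i + 1])[1].squeeze(0)
                    for i, a in enumerate(actors)])
signal = coordinator(msgs)                              # learned communication channel
policies = [a(obs[i:i + 1], signal[i:i + 1])[0] for i, a in enumerate(actors)]
print([p.detach().numpy().round(2) for p in policies], critic(obs).item())
```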
Exploiting Layerwise Convexity of Rectifier Networks with Sign Constrained Weights
Title | Exploiting Layerwise Convexity of Rectifier Networks with Sign Constrained Weights |
Authors | Senjian An, Farid Boussaid, Mohammed Bennamoun, Ferdous Sohel |
Abstract | By introducing sign constraints on the weights, this paper proposes sign constrained rectifier networks (SCRNs), whose training can be solved efficiently by well-known majorization-minimization (MM) algorithms. We prove that the proposed two-hidden-layer SCRNs, which exhibit negative weights in the second hidden layer and negative weights in the output layer, are capable of separating any two (or more) disjoint pattern sets. Furthermore, the proposed two-hidden-layer SCRNs can decompose the patterns of each class into several clusters so that each cluster is convexly separable from all the patterns of the other classes. This provides a means to learn the pattern structures and analyse the discriminant factors between different classes of patterns. |
Tasks | |
Published | 2017-11-14 |
URL | http://arxiv.org/abs/1711.05627v1 |
http://arxiv.org/pdf/1711.05627v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-layerwise-convexity-of-rectifier |
Repo | |
Framework | |
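The SCRN abstract constrains the signs of the weights in particular layers. As a rough illustration, the sketch below trains a two-hidden-layer rectifier network with projected SGD, clamping the constrained layers to non-positive weights after every update; the paper itself trains with majorization-minimization, and the layer sizes and synthetic data here are placeholders.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two-hidden-layer rectifier network; sign constraints are imposed on the
# second hidden layer and the output layer, following the abstract.
net = nn.Sequential(
    nn.Linear(10, 32), nn.ReLU(),   # first hidden layer: unconstrained
    nn.Linear(32, 16), nn.ReLU(),   # second hidden layer: sign-constrained weights
    nn.Linear(16, 1),               # output layer: sign-constrained weights
)
constrained = [net[2], net[4]]

x, y = torch.randn(128, 10), torch.randn(128, 1)
opt = torch.optim.SGD(net.parameters(), lr=1e-2)

for step in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(x), y)
    loss.backward()
    opt.step()
    # Projection step: keep the constrained weights on the feasible sign orthant.
    with torch.no_grad():
        for layer in constrained:
            layer.weight.clamp_(max=0.0)

print("final loss:", loss.item())
```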
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Title | Trust-PCL: An Off-Policy Trust Region Method for Continuous Control |
Authors | Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans |
Abstract | Trust region methods, such as TRPO, are often used to stabilize policy optimization algorithms in reinforcement learning (RL). While current trust region strategies are effective for continuous control, they typically require a prohibitively large amount of on-policy interaction with the environment. To address this problem, we propose an off-policy trust region method, Trust-PCL. The algorithm is the result of observing that the optimal policy and state values of a maximum reward objective with a relative-entropy regularizer satisfy a set of multi-step pathwise consistencies along any path. Thus, Trust-PCL is able to maintain optimization stability while exploiting off-policy data to improve sample efficiency. When evaluated on a number of continuous control tasks, Trust-PCL improves the solution quality and sample efficiency of TRPO. |
Tasks | Continuous Control |
Published | 2017-07-06 |
URL | http://arxiv.org/abs/1707.01891v3 |
http://arxiv.org/pdf/1707.01891v3.pdf | |
PWC | https://paperswithcode.com/paper/trust-pcl-an-off-policy-trust-region-method |
Repo | |
Framework | |
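Trust-PCL's core quantity is a multi-step pathwise consistency that couples the policy and the value function under entropy and relative-entropy regularization. The sketch below computes one commonly written form of that squared consistency error for a single sub-trajectory; the coefficients, the lagged-prior term, and all numbers are illustrative, and the paper should be consulted for the exact objective and how it is minimized off-policy.

```python
import numpy as np

def trust_pcl_consistency(rewards, log_pi, log_pi_prior, v_start, v_end,
                          gamma=0.99, tau=0.01, lam=0.01):
    """Squared multi-step path-consistency error for one sub-trajectory.

    rewards, log_pi, log_pi_prior: arrays of length d for steps t .. t+d-1.
    v_start, v_end: value estimates V(s_t) and V(s_{t+d}).
    tau : entropy-regularization strength.
    lam : relative-entropy (trust-region) strength against a lagged prior policy.
    """
    d = len(rewards)
    discounts = gamma ** np.arange(d)
    # Augmented per-step reward: entropy bonus plus divergence penalty to the prior.
    augmented = rewards - (tau + lam) * log_pi + lam * log_pi_prior
    consistency = -v_start + (gamma ** d) * v_end + np.sum(discounts * augmented)
    return consistency ** 2

# Toy sub-trajectory of length 3 with made-up numbers.
err = trust_pcl_consistency(
    rewards=np.array([1.0, 0.5, 0.2]),
    log_pi=np.array([-0.7, -1.1, -0.4]),
    log_pi_prior=np.array([-0.8, -1.0, -0.5]),
    v_start=2.1, v_end=1.4)
print(err)  # training would minimize this over both policy and value parameters
```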
Exploration–Exploitation in MDPs with Options
Title | Exploration–Exploitation in MDPs with Options |
Authors | Ronan Fruit, Alessandro Lazaric |
Abstract | While a large body of empirical results shows that temporally-extended actions and options may significantly affect the learning performance of an agent, the theoretical understanding of how and when options can be beneficial in online reinforcement learning is relatively limited. In this paper, we derive an upper and lower bound on the regret of a variant of UCRL using options. While we first analyze the algorithm in the general case of semi-Markov decision processes (SMDPs), we show how these results can be translated to the specific case of MDPs with options and we illustrate simple scenarios in which the regret of learning with options can be provably much smaller than the regret suffered when learning with primitive actions. |
Tasks | |
Published | 2017-03-25 |
URL | http://arxiv.org/abs/1703.08667v2 |
http://arxiv.org/pdf/1703.08667v2.pdf | |
PWC | https://paperswithcode.com/paper/exploration-exploitation-in-mdps-with-options |
Repo | |
Framework | |
Real-time Traffic Accident Risk Prediction based on Frequent Pattern Tree
Title | Real-time Traffic Accident Risk Prediction based on Frequent Pattern Tree |
Authors | Lei Lin, Qian Wang, Adel W. Sadek |
Abstract | Traffic accident data are usually noisy, heterogeneous, and contain missing values. How to select the most important variables to improve real-time traffic accident risk prediction has become a concern of many recent studies. This paper proposes a novel variable selection method based on the Frequent Pattern tree (FP tree) algorithm. First, all frequent patterns in the traffic accident dataset are discovered. Then, for each frequent pattern, a newly proposed criterion, the Relative Object Purity Ratio (ROPR), is calculated and added to the importance scores of the variables that differentiate one frequent pattern from the others. To test the proposed method, a dataset was compiled from the traffic accident records detected by a single detector on interstate highway I-64 in Virginia in 2005. This dataset was then linked to other variables such as real-time traffic information and weather conditions. Both the proposed FP-tree-based method and the widely used random forest method were then applied to identify the important variables for the Virginia dataset. The results indicate that there are some differences between the variables deemed important by the FP tree and those selected by the random forest method. Following this, two baseline models (a k-nearest neighbor (k-NN) model and a Bayesian network) were developed to predict accident risk based on the variables identified by both the FP tree method and the random forest method. The results show that the models based on FP-tree variable selection performed better than those based on the random forest method for several versions of the k-NN and Bayesian network models. The best results were obtained by a Bayesian network model using variables from the FP tree; that model could predict 61.11% of accidents accurately while having a false alarm rate of 38.16%. |
Tasks | |
Published | 2017-01-20 |
URL | http://arxiv.org/abs/1701.05691v2 |
http://arxiv.org/pdf/1701.05691v2.pdf | |
PWC | https://paperswithcode.com/paper/real-time-traffic-accident-risk-prediction |
Repo | |
Framework | |
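The variable-selection idea above is: mine frequent patterns, then credit the variables that distinguish one pattern from another. The sketch below uses a brute-force frequent-itemset search in place of an FP tree, and a simple support-weighted placeholder in place of the paper's ROPR criterion (whose exact formula is not given here); the toy records and the minimum-support threshold are invented for illustration.

```python
from itertools import combinations
from collections import Counter

# Each record is a set of (variable=value) items, e.g. discretized traffic/weather states.
records = [
    {"speed=low", "rain=yes", "peak=yes", "accident=yes"},
    {"speed=low", "rain=yes", "peak=no",  "accident=yes"},
    {"speed=high", "rain=no", "peak=no",  "accident=no"},
    {"speed=high", "rain=no", "peak=yes", "accident=no"},
    {"speed=low", "rain=no",  "peak=yes", "accident=no"},
]
min_support = 2

# Enumerate frequent itemsets by brute force (the paper builds an FP tree for efficiency).
items = sorted(set().union(*records))
frequent = []
for k in range(1, len(items) + 1):
    found = False
    for cand in combinations(items, k):
        support = sum(set(cand) <= r for r in records)
        if support >= min_support:
            frequent.append((set(cand), support))
            found = True
    if not found:
        break  # no frequent k-itemset implies no frequent (k+1)-itemset

# Placeholder importance scoring: credit each variable by the support of the frequent
# patterns it appears in. The paper instead accumulates its ROPR criterion here.
importance = Counter()
for pattern, support in frequent:
    for item in pattern:
        importance[item.split("=")[0]] += support

print(frequent)
print(importance.most_common())
```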
Training Feedforward Neural Networks with Standard Logistic Activations is Feasible
Title | Training Feedforward Neural Networks with Standard Logistic Activations is Feasible |
Authors | Emanuele Sansone, Francesco G. B. De Natale |
Abstract | Training feedforward neural networks with standard logistic activations is considered difficult because of the intrinsic properties of these sigmoidal functions. This work aims at showing that such networks can be trained to achieve generalization performance comparable to that of networks based on hyperbolic tangent activations. The solution consists of applying a set of conditions on parameter initialization, which have been derived from the study of the properties of a single neuron from an information-theoretic perspective. The proposed initialization is validated through an extensive experimental analysis. |
Tasks | |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01013v1 |
http://arxiv.org/pdf/1710.01013v1.pdf | |
PWC | https://paperswithcode.com/paper/training-feedforward-neural-networks-with |
Repo | |
Framework | |
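The paper's claim is that standard logistic activations train well once the parameters are initialized under suitable conditions. The sketch below shows a generic variance-scaled (Xavier-style) initialization that keeps logistic units out of saturation at the start of training; it is only a plausible illustration of the kind of condition involved, not the information-theoretic conditions derived in the paper, and all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def xavier_logistic_init(fan_in, fan_out):
    """Variance-scaled initialization so pre-activations land in the logistic's
    near-linear region around 0, where its derivative (at most 1/4) is largest."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    W = rng.uniform(-limit, limit, size=(fan_in, fan_out))
    b = np.zeros(fan_out)   # start at the sigmoid midpoint, output 0.5
    return W, b

# With centered unit-variance inputs, almost no unit starts saturated.
x = rng.standard_normal((1000, 256))
W, b = xavier_logistic_init(256, 256)
a = logistic(x @ W + b)
print("saturated fraction:", np.mean((a < 0.05) | (a > 0.95)))  # ~0.003
print("mean activation   :", a.mean())  # ~0.5: deeper stacks must correct this offset
```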
Multi-Scale Spatially Weighted Local Histograms in O(1)
Title | Multi-Scale Spatially Weighted Local Histograms in O(1) |
Authors | Mahdieh Poostchi, Ali Shafiekhani, Kannappan Palaniappan, Guna Seetharaman |
Abstract | Weighting pixel contributions according to their location is a key feature in many fundamental image processing tasks, including filtering, object modeling and distance matching. Several techniques have been proposed that incorporate spatial information to increase the accuracy and boost the performance of detection, tracking and recognition systems at the cost of speed. However, it is still not clear how to efficiently extract weighted local histograms in constant time using an integral histogram. This paper presents a novel algorithm to accurately compute multi-scale spatially weighted local histograms in constant time using a Spatially Weighted Integral Histogram (SWIH) for fast search. We applied our spatially weighted integral histogram approach to fast tracking and obtained more accurate and robust target localization results compared with using a plain histogram. |
Tasks | |
Published | 2017-05-09 |
URL | http://arxiv.org/abs/1705.03524v1 |
http://arxiv.org/pdf/1705.03524v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-scale-spatially-weighted-local |
Repo | |
Framework | |
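The constant-time claim rests on the integral-histogram idea: precompute per-bin cumulative counts once, then read any rectangular region's histogram with four lookups per bin. The NumPy sketch below shows that plain (unweighted) mechanism; the paper's SWIH additionally folds multi-scale spatial weighting into the accumulation, which is omitted here, and the image and bin count are arbitrary.

```python
import numpy as np

def integral_histogram(img, n_bins=16):
    """Per-bin integral images: ih[b, y, x] = count of bin-b pixels in img[:y, :x]."""
    bins = np.minimum((img.astype(np.int64) * n_bins) // 256, n_bins - 1)
    onehot = (bins[None, :, :] == np.arange(n_bins)[:, None, None]).astype(np.int64)
    ih = onehot.cumsum(axis=1).cumsum(axis=2)
    # Zero-pad so queries touching the image border need no special casing.
    return np.pad(ih, ((0, 0), (1, 0), (1, 0)))

def region_histogram(ih, top, left, bottom, right):
    """Histogram of img[top:bottom, left:right] from four lookups per bin: O(1) in region size."""
    return (ih[:, bottom, right] - ih[:, top, right]
            - ih[:, bottom, left] + ih[:, top, left])

img = np.random.randint(0, 256, size=(240, 320), dtype=np.uint8)
ih = integral_histogram(img)
h = region_histogram(ih, 50, 60, 120, 200)

# Sanity check against a directly computed histogram of the same window.
direct = np.bincount(np.minimum((img[50:120, 60:200].astype(np.int64) * 16) // 256, 15).ravel(),
                     minlength=16)
assert np.array_equal(h, direct)
print(h)
```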
Deep Submodular Functions
Title | Deep Submodular Functions |
Authors | Jeffrey Bilmes, Wenruo Bai |
Abstract | We start with an overview of a class of submodular functions called SCMMs (sums of concave functions composed with non-negative modular functions, plus a final arbitrary modular term). We then define a new class of submodular functions we call deep submodular functions, or DSFs. We show that DSFs are a flexible parametric family of submodular functions that share many of the properties and advantages of deep neural networks (DNNs). DSFs can be motivated by considering a hierarchy of descriptive concepts over ground elements, where one wishes to allow submodular interaction throughout this hierarchy. Results in this paper show that DSFs constitute a strictly larger class of submodular functions than SCMMs. We show that, for any integer $k>0$, there are $k$-layer DSFs that cannot be represented by a $k'$-layer DSF for any $k'<k$. This implies that, like DNNs, there is a utility to depth, but unlike DNNs, the family of DSFs strictly increases with depth. Despite this, we show (using a “backpropagation”-like method) that DSFs, even with arbitrarily large $k$, do not comprise all submodular functions. In offering the above results, we also define the notion of an antitone superdifferential of a concave function and show how this relates to submodular functions (in general), DSFs (in particular), negative second-order partial derivatives, continuous submodularity, and concave extensions. To further motivate our analysis, we provide various special-case results from matroid theory, comparing DSFs with forms of matroid rank, in particular the laminar matroid. Lastly, we discuss strategies to learn DSFs, and define the classes of deep supermodular functions, deep difference of submodular functions, and deep multivariate submodular functions, and discuss where these can be useful in applications. |
Tasks | |
Published | 2017-01-31 |
URL | http://arxiv.org/abs/1701.08939v1 |
http://arxiv.org/pdf/1701.08939v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-submodular-functions |
Repo | |
Framework | |
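A deep submodular function stacks layers of the form "concave, non-decreasing function of a non-negative weighted combination", starting from non-negative modular functions of the input set. The sketch below evaluates a small two-layer instance with square-root activations and random non-negative weights (all sizes and weights are arbitrary) and checks the diminishing-returns property on a chain of growing sets.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8                                # ground set {0, ..., 7}
W1 = rng.random((5, n))              # non-negative modular weights, layer 1 (5 "features")
W2 = rng.random((3, 5))              # non-negative mixing weights, layer 2
phi = np.sqrt                        # concave, non-decreasing activation

def dsf(A):
    """Two-layer deep submodular function value of a subset A of the ground set."""
    x = np.zeros(n)
    x[list(A)] = 1.0                 # indicator vector of A
    h1 = phi(W1 @ x)                 # layer 1: concave over non-negative modular
    h2 = phi(W2 @ h1)                # layer 2: concave over non-negative mixture
    return h2.sum()

# Diminishing returns: the gain of adding element 3 shrinks as the set grows.
for A in [set(), {0}, {0, 1}, {0, 1, 2}]:
    print(sorted(A), "gain of 3:", dsf(A | {3}) - dsf(A))
```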
Latent Human Traits in the Language of Social Media: An Open-Vocabulary Approach
Title | Latent Human Traits in the Language of Social Media: An Open-Vocabulary Approach |
Authors | Vivek Kulkarni, Margaret L. Kern, David Stillwell, Michal Kosinski, Sandra Matz, Lyle Ungar, Steven Skiena, H. Andrew Schwartz |
Abstract | Over the past century, personality theory and research have successfully identified core sets of characteristics that consistently describe and explain fundamental differences in the way people think, feel and behave. Such characteristics were derived through theory, dictionary analyses, and survey research using explicit self-reports. The availability of social media data spanning millions of users now makes it possible to automatically derive characteristics from language use – at large scale. Taking advantage of linguistic information available through Facebook, we study the process of inferring a new set of potential human traits based on unprompted language use. We subject these new traits to a comprehensive set of evaluations and compare them with a popular five-factor model of personality. We find that our language-based trait constructs are often more generalizable, in that they predict non-questionnaire-based outcomes (e.g. entities someone likes, income and intelligence quotient) better than questionnaire-based traits, while the factors remain nearly as stable as traditional factors. Our approach suggests the value of new personality constructs derived from everyday human language use. |
Tasks | |
Published | 2017-05-22 |
URL | http://arxiv.org/abs/1705.08038v1 |
http://arxiv.org/pdf/1705.08038v1.pdf | |
PWC | https://paperswithcode.com/paper/latent-human-traits-in-the-language-of-social |
Repo | |
Framework | |
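As a rough, hedged stand-in for the paper's open-vocabulary pipeline, the sketch below factor-analyzes a user-by-word relative-frequency matrix to obtain a handful of language-based latent dimensions; the data are synthetic, the preprocessing is simplistic, and the actual study works with Facebook language from millions of users and a considerably more careful procedure.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Stand-in data: rows are users, columns are relative frequencies of vocabulary
# words (the study uses open-vocabulary Facebook language at far larger scale).
n_users, n_words, n_traits = 500, 300, 5
counts = rng.poisson(1.0, size=(n_users, n_words)).astype(float)
X = counts / counts.sum(axis=1, keepdims=True)        # per-user relative frequencies
X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-9)     # standardize each word column

fa = FactorAnalysis(n_components=n_traits, random_state=0)
trait_scores = fa.fit_transform(X)                    # one score per user per latent trait

# Words loading most strongly on the first latent dimension would be inspected
# to interpret it, analogous to interpreting a questionnaire-based factor.
top = np.argsort(-np.abs(fa.components_[0]))[:10]
print("top word indices for trait 1:", top)
print("trait score matrix shape    :", trait_scores.shape)
```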
Enabling Massive Deep Neural Networks with the GraphBLAS
Title | Enabling Massive Deep Neural Networks with the GraphBLAS |
Authors | Jeremy Kepner, Manoj Kumar, José Moreira, Pratap Pattnaik, Mauricio Serrano, Henry Tufo |
Abstract | Deep Neural Networks (DNNs) have emerged as a core tool for machine learning. The computations performed during DNN training and inference are dominated by operations on the weight matrices describing the DNN. As DNNs incorporate more stages and more nodes per stage, these weight matrices may be required to be sparse because of memory limitations. The GraphBLAS.org math library standard was developed to provide high performance manipulation of sparse weight matrices and input/output vectors. For sufficiently sparse matrices, a sparse matrix library requires significantly less memory than the corresponding dense matrix implementation. This paper provides a brief description of the mathematics underlying the GraphBLAS. In addition, the equations of a typical DNN are rewritten in a form designed to use the GraphBLAS. An implementation of the DNN is given using a preliminary GraphBLAS C library. The performance of the GraphBLAS implementation is measured relative to a standard dense linear algebra library implementation. For various sizes of DNN weight matrices, it is shown that the GraphBLAS sparse implementation outperforms a BLAS dense implementation as the weight matrix becomes sparser. |
Tasks | |
Published | 2017-08-09 |
URL | http://arxiv.org/abs/1708.02937v1 |
http://arxiv.org/pdf/1708.02937v1.pdf | |
PWC | https://paperswithcode.com/paper/enabling-massive-deep-neural-networks-with |
Repo | |
Framework | |
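The paper rewrites DNN inference as operations on sparse weight matrices so that a sparse library can replace dense BLAS. The sketch below makes the same point with scipy.sparse standing in for a GraphBLAS implementation: a ReLU forward pass over CSR weight matrices, with a rough memory comparison; the layer width and density are arbitrary.

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)

def random_sparse(m, n, density, rng):
    """Random CSR weight matrix with roughly m*n*density non-zeros."""
    nnz = int(m * n * density)
    rows = rng.integers(0, m, size=nnz)
    cols = rng.integers(0, n, size=nnz)
    vals = rng.standard_normal(nnz)
    return sp.csr_matrix((vals, (rows, cols)), shape=(m, n))

def sparse_relu_forward(x, weights, biases):
    """Forward pass y <- relu(W y + b), layer by layer, over sparse W."""
    y = x
    for W, b in zip(weights, biases):
        y = W @ y + b                                  # sparse matrix-vector product
        np.maximum(y, 0.0, out=y)                      # ReLU
    return y

n_layers, width, density = 4, 4096, 0.01               # ~1% non-zeros per layer
weights = [random_sparse(width, width, density, rng) for _ in range(n_layers)]
biases = [np.zeros(width) for _ in range(n_layers)]

y = sparse_relu_forward(rng.random(width), weights, biases)
print("output non-zeros:", np.count_nonzero(y), "of", width)
print("weight values per layer: sparse ~%.1f MB vs dense %.1f MB"
      % (weights[0].data.nbytes / 1e6, width * width * 8 / 1e6))
```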
Unified Framework for Automated Person Re-identification and Camera Network Topology Inference in Camera Networks
Title | Unified Framework for Automated Person Re-identification and Camera Network Topology Inference in Camera Networks |
Authors | Yeong-Jun Cho, Jae-Han Park, Su-A Kim, Kyuewang Lee, Kuk-Jin Yoon |
Abstract | Person re-identification in large-scale multi-camera networks is a challenging task because of the spatio-temporal uncertainty and the high complexity arising from large numbers of cameras and people. To handle these difficulties, additional information such as the camera network topology should be provided, which is itself difficult to estimate automatically. In this paper, we propose a unified framework that jointly solves both the person re-id and the camera network topology inference problems. The proposed framework takes general multi-camera network environments into account. To effectively show the superiority of the proposed framework, we also provide a new, fully annotated person re-id dataset, named SLP, captured in a synchronized multi-camera network. Experimental results show that the proposed methods are promising for both person re-id and camera topology inference tasks. |
Tasks | Person Re-Identification |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07085v5 |
http://arxiv.org/pdf/1704.07085v5.pdf | |
PWC | https://paperswithcode.com/paper/unified-framework-for-automated-person-re |
Repo | |
Framework | |
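One ingredient of joint re-id and topology inference is estimating, from matched identities, how long people take to move between camera pairs. The sketch below collects transition times per ordered camera pair and links cameras whose transition times cluster tightly; the observation tuples, the use of ground-truth identities, and the 30-second spread threshold are all illustrative simplifications of the paper's procedure.

```python
import numpy as np
from collections import defaultdict

# Stand-in observations: (person_id, camera_id, timestamp in seconds). In the full
# framework these links come from the re-id matcher rather than ground-truth ids.
observations = [
    (1, "A", 10), (1, "B", 55), (2, "A", 30), (2, "B", 80),
    (3, "A", 40), (3, "B", 92), (4, "A", 12), (4, "C", 700),
]

# Collect transition times for each ordered camera pair.
per_person = defaultdict(list)
for pid, cam, t in observations:
    per_person[pid].append((t, cam))

transitions = defaultdict(list)
for pid, events in per_person.items():
    events.sort()
    for (t1, c1), (t2, c2) in zip(events, events[1:]):
        if c1 != c2:
            transitions[(c1, c2)].append(t2 - t1)

# Declare two cameras connected when their transition times cluster tightly;
# the spread threshold (30 s here) is an arbitrary illustration value.
for pair, dts in transitions.items():
    dts = np.asarray(dts, dtype=float)
    connected = len(dts) >= 2 and dts.std() < 30
    print(pair, "transition times:", dts, "-> linked" if connected else "-> unlinked")
```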
Generative Adversarial Residual Pairwise Networks for One Shot Learning
Title | Generative Adversarial Residual Pairwise Networks for One Shot Learning |
Authors | Akshay Mehrotra, Ambedkar Dukkipati |
Abstract | Deep neural networks achieve unprecedented performance levels over many tasks and scale well with large quantities of data, but performance in the low-data regime and tasks like one-shot learning still lags behind. While recent work suggests many hypotheses, from better optimization to more complicated network structures, in this work we hypothesize that having a learnable and more expressive similarity objective is an essential missing component. Towards overcoming this, we propose a network design inspired by deep residual networks that allows the efficient computation of this more expressive pairwise similarity objective. Further, we argue that regularization is key in learning with small amounts of data, and propose an additional generator network based on Generative Adversarial Networks in which the discriminator is our residual pairwise network. This provides a strong regularizer by leveraging the generated data samples. The proposed model can generate plausible variations of exemplars over unseen classes and outperforms strong discriminative baselines for few-shot classification tasks. Notably, our residual pairwise network design outperforms the previous state-of-the-art on the challenging mini-Imagenet dataset for one-shot learning, achieving over 55% accuracy on the 5-way classification task over unseen classes. |
Tasks | One-Shot Learning |
Published | 2017-03-23 |
URL | http://arxiv.org/abs/1703.08033v1 |
http://arxiv.org/pdf/1703.08033v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-residual-pairwise |
Repo | |
Framework | |
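The key component the abstract argues for is a learnable pairwise similarity computed by a residual network over embedding pairs. Below is a minimal PyTorch sketch of such a scorer used in a 5-way one-shot episode; the encoder, dimensions, and the way the pair is formed (concatenation plus one residual block) are assumptions, and the adversarial generator that regularizes training in the paper is omitted.

```python
import torch
import torch.nn as nn

class ResidualPairwise(nn.Module):
    """Scores the similarity of two embeddings with a small residual block,
    replacing a fixed metric (e.g. cosine) with a learned one."""
    def __init__(self, feat_dim=64, hidden=128):
        super().__init__()
        self.block = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * feat_dim),
        )
        self.score = nn.Linear(2 * feat_dim, 1)

    def forward(self, a, b):
        pair = torch.cat([a, b], dim=-1)
        h = pair + self.block(pair)            # residual connection
        return self.score(torch.relu(h))       # unnormalized similarity logit

# 5-way one-shot episode: pick the support class with the highest similarity.
feat_dim = 64
encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, feat_dim))
similarity = ResidualPairwise(feat_dim)

support = encoder(torch.randn(5, 784))         # one embedded example per class
query = encoder(torch.randn(1, 784))
logits = similarity(query.expand(5, -1), support).squeeze(-1)
print("predicted class:", logits.argmax().item())
```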
Revisiting IM2GPS in the Deep Learning Era
Title | Revisiting IM2GPS in the Deep Learning Era |
Authors | Nam Vo, Nathan Jacobs, James Hays |
Abstract | Image geolocalization, inferring the geographic location of an image, is a challenging computer vision problem with many potential applications. The recent state-of-the-art approach to this problem is a deep image classification approach in which the world is spatially divided into cells and a deep network is trained to predict the correct cell for a given image. We propose to combine this approach with the original Im2GPS approach, in which a query image is matched against a database of geotagged images and the location is inferred from the retrieved set. We estimate the geographic location of a query image by applying kernel density estimation to the locations of its nearest neighbors in the reference database. Interestingly, we find that the best features for our retrieval task are derived from networks trained with a classification loss even though we do not use a classification approach at test time. Training with classification loss outperforms several deep feature learning methods (e.g. Siamese networks with contrastive or triplet loss) more typical for retrieval applications. Our simple approach achieves state-of-the-art geolocalization accuracy while also requiring significantly less training data. |
Tasks | Density Estimation, Image Classification |
Published | 2017-05-13 |
URL | http://arxiv.org/abs/1705.04838v1 |
http://arxiv.org/pdf/1705.04838v1.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-im2gps-in-the-deep-learning-era |
Repo | |
Framework | |
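The localization step described above is simple: retrieve the query's nearest neighbours from the geotagged database, then place the estimate at the mode of a kernel density over their coordinates. The sketch below does this with a planar Gaussian KDE over synthetic neighbour locations; treating latitude/longitude as Euclidean and picking the densest neighbour are simplifications of whatever the paper does exactly.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Stand-in for the (lat, lon) coordinates of the query's k nearest neighbours
# retrieved from the geotagged reference database by deep-feature similarity.
neighbor_latlon = np.vstack([
    rng.normal([48.86, 2.35], 0.05, size=(15, 2)),   # a tight Paris-like cluster
    rng.normal([40.71, -74.0], 0.5, size=(5, 2)),    # a looser, smaller cluster
])

# Kernel density estimate over neighbour locations; the predicted location is the
# neighbour sitting at the density mode (evaluating a lat/lon grid also works).
kde = gaussian_kde(neighbor_latlon.T)
density_at_neighbors = kde(neighbor_latlon.T)
predicted = neighbor_latlon[np.argmax(density_at_neighbors)]
print("predicted (lat, lon):", predicted)            # lands in the dominant cluster
```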
Synthesis-based Robust Low Resolution Face Recognition
Title | Synthesis-based Robust Low Resolution Face Recognition |
Authors | Sumit Shekhar, Vishal M. Patel, Rama Chellappa |
Abstract | Recognition of low resolution face images is a challenging problem in many practical face recognition systems. Methods proposed in the face recognition literature for this problem assume that the probe is low resolution, but that a high resolution gallery is available for recognition. These attempts aim at modifying the probe image so that the resultant image provides better discrimination. We formulate the problem differently by leveraging the information available in the high resolution gallery image and propose a dictionary learning approach for classifying the low-resolution probe image. An important feature of our algorithm is that it can handle resolution changes along with illumination variations. Furthermore, we also kernelize the algorithm to handle non-linearity in the data and present a joint dictionary learning technique for robust recognition at low resolutions. The effectiveness of the proposed method is demonstrated using standard datasets and a challenging outdoor face dataset. It is shown that our method is efficient and can perform significantly better than many competitive low resolution face recognition algorithms. |
Tasks | Dictionary Learning, Face Recognition |
Published | 2017-07-10 |
URL | http://arxiv.org/abs/1707.02733v1 |
http://arxiv.org/pdf/1707.02733v1.pdf | |
PWC | https://paperswithcode.com/paper/synthesis-based-robust-low-resolution-face |
Repo | |
Framework | |
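To make the dictionary-learning idea concrete, the sketch below classifies a low-resolution probe by sparse-coding it against per-class dictionaries built from the gallery and picking the class with the smallest reconstruction residual (an SRC-style stand-in). The synthetic features, the naive average-pooling "downsampling", and the use of raw gallery vectors as atoms are all assumptions; the paper instead learns joint (and kernelized) dictionaries.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.default_rng(0)

def downsample(feats, factor=4):
    """Crude stand-in for the resolution gap: average-pool the feature vector."""
    return feats.reshape(feats.shape[0], -1, factor).mean(axis=2)

# Synthetic "gallery": 3 subjects x 20 high-resolution feature vectors each.
n_subjects, n_per_subject, dim = 3, 20, 256
centers = rng.normal(size=(n_subjects, dim))
gallery = np.concatenate([c + 0.1 * rng.normal(size=(n_per_subject, dim)) for c in centers])
labels = np.repeat(np.arange(n_subjects), n_per_subject)

# Per-class dictionaries, built in the low-resolution domain so probe and atoms match
# (the paper instead learns joint high/low-resolution dictionaries, with a kernel variant).
dictionaries = {c: downsample(gallery[labels == c]) for c in range(n_subjects)}

# Classify a low-resolution probe of subject 1 by smallest sparse reconstruction residual.
probe = downsample((centers[1] + 0.1 * rng.normal(size=dim))[None, :])
residuals = {}
for c, D in dictionaries.items():
    code = sparse_encode(probe, D, algorithm="omp", n_nonzero_coefs=5)
    residuals[c] = float(np.linalg.norm(probe - code @ D))
print(residuals, "-> predicted subject", min(residuals, key=residuals.get))
```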
Mental Sampling in Multimodal Representations
Title | Mental Sampling in Multimodal Representations |
Authors | Jian-Qiao Zhu, Adam N. Sanborn, Nick Chater |
Abstract | Both resources in the natural environment and concepts in a semantic space are distributed “patchily”, with large gaps in between the patches. To describe people’s internal and external foraging behavior, various random walk models have been proposed. In particular, internal foraging has been modeled as sampling: in order to gather relevant information for making a decision, people draw samples from a mental representation using random-walk algorithms such as Markov chain Monte Carlo (MCMC). However, two common empirical observations argue against simple sampling algorithms such as MCMC. First, the spatial structure is often best described by a Lévy flight distribution: the probability of the distance between two successive locations follows a power law in the distance. Second, the temporal structure of the sampling that humans and other animals produce has long-range, slowly decaying serial correlations characterized as $1/f$-like fluctuations. We propose that mental sampling is not done by simple MCMC, but is instead adapted to multimodal representations and is implemented by Metropolis-coupled Markov chain Monte Carlo (MC$^3$), one of the first algorithms developed for sampling from multimodal distributions. MC$^3$ involves running multiple Markov chains in parallel but with target distributions of different temperatures, and it swaps the states of the chains whenever a better location is found. Heated chains more readily traverse valleys in the probability landscape to propose moves to far-away peaks, while the colder chains make the local steps that explore the current peak or patch. We show that MC$^3$ generates distances between successive samples that follow a Lévy flight distribution and $1/f$-like serial correlations, providing a single mechanistic account of these two puzzling empirical phenomena. |
Tasks | |
Published | 2017-10-14 |
URL | http://arxiv.org/abs/1710.05219v1 |
http://arxiv.org/pdf/1710.05219v1.pdf | |
PWC | https://paperswithcode.com/paper/mental-sampling-in-multimodal-representations |
Repo | |
Framework | |
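Metropolis-coupled MCMC (MC$^3$) runs several chains on tempered copies of the target and occasionally proposes swapping their states, which is what lets samples hop between well-separated patches. The sketch below implements the textbook version on a bimodal 1-D target with the standard Metropolis swap rule (the abstract paraphrases this as swapping when a better location is found); the modes, temperatures, and step size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    """A 'patchy' 1-D target: two well-separated Gaussian modes at -3 and +3."""
    return np.logaddexp(-0.5 * ((x + 3) / 0.6) ** 2, -0.5 * ((x - 3) / 0.6) ** 2)

temps = np.array([1.0, 4.0, 16.0])   # chain 0 is the cold chain whose samples we keep
states = np.zeros(len(temps))
step_size, n_iter = 0.5, 20000
cold_samples = np.empty(n_iter)

for it in range(n_iter):
    # Within-chain Metropolis updates on the tempered targets log_target(x) / T.
    for i, T in enumerate(temps):
        prop = states[i] + step_size * rng.standard_normal()
        if np.log(rng.random()) < (log_target(prop) - log_target(states[i])) / T:
            states[i] = prop
    # Metropolis-coupled step: propose swapping a random adjacent pair of chains.
    i = rng.integers(len(temps) - 1)
    log_accept = (log_target(states[i + 1]) - log_target(states[i])) * \
                 (1.0 / temps[i] - 1.0 / temps[i + 1])
    if np.log(rng.random()) < log_accept:
        states[i], states[i + 1] = states[i + 1], states[i]
    cold_samples[it] = states[0]

# The cold chain alone could not cross the deep valley at 0, but swaps with the
# heated chains let it visit both patches.
print("fraction of cold samples in each mode:",
      (cold_samples < 0).mean(), (cold_samples > 0).mean())
```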