January 28, 2020

Paper Group ANR 1056

Safe Linear Thompson Sampling with Side Information. Structured and Deep Similarity Matching via Structured and Deep Hebbian Networks. Segmentation Guided Attention Network for Crowd Counting via Curriculum Learning. Comparing reliability of grid-based Quality-Diversity algorithms using artificial landscapes. Rearchitecting Classification Framework …

Safe Linear Thompson Sampling with Side Information


Title	Safe Linear Thompson Sampling with Side Information
Authors	Ahmadreza Moradipari, Sanae Amani, Mahnoosh Alizadeh, Christos Thrampoulidis
Abstract	The design and performance analysis of bandit algorithms in the presence of stage-wise safety or reliability constraints has recently garnered significant interest. In this work, we consider the linear stochastic bandit problem under additional linear safety constraints that need to be satisfied at each round. We provide a new safe algorithm based on linear Thompson Sampling (TS) for this problem and show a frequentist regret of order $\mathcal{O} (d^{3/2}\log^{1/2}d \cdot T^{1/2}\log^{3/2}T)$, which remarkably matches the results provided by Abeille et al. (2017) for the standard linear TS algorithm in the absence of safety constraints. We compare the performance of our algorithm with UCB-based safe algorithms and highlight how the inherently randomized nature of TS leads to a superior performance in expanding the set of safe actions the algorithm has access to at each round.
Tasks
Published	2019-11-06
URL	https://arxiv.org/abs/1911.02156v2
PDF	https://arxiv.org/pdf/1911.02156v2.pdf
PWC	https://paperswithcode.com/paper/safe-linear-thompson-sampling
Repo
Framework

Structured and Deep Similarity Matching via Structured and Deep Hebbian Networks


Title	Structured and Deep Similarity Matching via Structured and Deep Hebbian Networks
Authors	Dina Obeid, Hugo Ramambason, Cengiz Pehlevan
Abstract	Synaptic plasticity is widely accepted to be the mechanism behind learning in the brain’s neural networks. A central question is how synapses, with access to only local information about the network, can still organize collectively and perform circuit-wide learning in an efficient manner. In single-layered and all-to-all connected neural networks, local plasticity has been shown to implement gradient-based learning on a class of cost functions that contain a term that aligns the similarity of outputs to the similarity of inputs. Whether such cost functions exist for networks with other architectures is not known. In this paper, we introduce structured and deep similarity matching cost functions, and show how they can be optimized in a gradient-based manner by neural networks with local learning rules. These networks extend Földiák’s Hebbian/Anti-Hebbian network to deep architectures and structured feedforward, lateral and feedback connections. The credit assignment problem is solved elegantly by a factorization of the dual learning objective into synapse-specific local objectives. Simulations show that our networks learn meaningful features.
Tasks
Published	2019-10-11
URL	https://arxiv.org/abs/1910.04958v2
PDF	https://arxiv.org/pdf/1910.04958v2.pdf
PWC	https://paperswithcode.com/paper/structured-and-deep-similarity-matching-via
Repo
Framework

Segmentation Guided Attention Network for Crowd Counting via Curriculum Learning


Title	Segmentation Guided Attention Network for Crowd Counting via Curriculum Learning
Authors	Qian Wang, Toby P. Breckon
Abstract	Crowd counting using deep convolutional neural networks (CNN) has achieved encouraging progress in the last couple of years. Novel network architectures have been designed to handle the scale variance issue in crowd images. For this purpose, the ideas of using multi-column networks with different convolution kernel sizes and rich feature fusion have been prevalent in the literature. Recent works have shown the effectiveness of Inception modules in crowd counting due to their ability to capture multi-scale visual information via the fusion of features from multi-column networks. However, the existing crowd counting networks built with Inception modules usually have a small number of layers and only employ the basic type of Inception module. In this paper, we investigate the use of pre-trained Inception models for crowd counting. Specifically, we first benchmark the baseline Inception-v3 models on commonly used crowd counting datasets and show their superiority to other existing models. Subsequently, we present a Segmentation Guided Attention Network (SGANet) with Inception-v3 as the backbone for crowd counting. We also propose a novel curriculum learning strategy for more efficient training of crowd counting networks. Finally, we conduct thorough experiments to compare the performance of SGANet and other state-of-the-art models. The experimental results validate the effectiveness of the segmentation guided attention layer and the curriculum learning strategy in crowd counting.
Tasks	Crowd Counting
Published	2019-11-18
URL	https://arxiv.org/abs/1911.07990v1
PDF	https://arxiv.org/pdf/1911.07990v1.pdf
PWC	https://paperswithcode.com/paper/segmentation-guided-attention-network-for
Repo
Framework

Comparing reliability of grid-based Quality-Diversity algorithms using artificial landscapes


Title	Comparing reliability of grid-based Quality-Diversity algorithms using artificial landscapes
Authors	Leo Cazenille
Abstract	Quality-Diversity (QD) algorithms are a recent type of optimisation method that searches for a collection of both diverse and high-performing solutions. They can be used to effectively explore a target problem according to features defined by the user. However, the field of QD still does not possess extensive methodologies and reference benchmarks to compare these algorithms. We propose a simple benchmark to compare the reliability of QD algorithms by optimising the Rastrigin function, an artificial landscape function often used to test global optimisation methods.
Tasks
Published	2019-07-23
URL	https://arxiv.org/abs/1908.08020v1
PDF	https://arxiv.org/pdf/1908.08020v1.pdf
PWC	https://paperswithcode.com/paper/comparing-reliability-of-grid-based-quality
Repo
Framework
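
The benchmark function named in the abstract above can be written down directly. The following is a minimal sketch, not the paper's benchmark code: `rastrigin` is the standard definition with amplitude 10, while `grid_cell` is a hypothetical helper that shows how a grid-based QD archive might bin a feature vector, assuming the usual search domain of [-5.12, 5.12] per dimension and 8 bins per axis.

```python
import math


def rastrigin(x, a=10.0):
    """Standard Rastrigin function; its global minimum is 0 at the origin."""
    return a * len(x) + sum(xi * xi - a * math.cos(2.0 * math.pi * xi) for xi in x)


def grid_cell(features, bins=8, low=-5.12, high=5.12):
    """Map a feature vector to a grid cell index (hypothetical helper).

    Bounds and bin count are assumptions for illustration only.
    """
    cell = []
    for f in features:
        idx = int((f - low) / (high - low) * bins)
        cell.append(min(max(idx, 0), bins - 1))  # clamp to the grid
    return tuple(cell)


if __name__ == "__main__":
    point = [0.5, -1.25]
    print(rastrigin(point), grid_cell(point))
```

A grid-based archive would then keep, for each cell returned by `grid_cell`, the solution with the lowest `rastrigin` value seen so far.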

Rearchitecting Classification Frameworks For Increased Robustness


Title	Rearchitecting Classification Frameworks For Increased Robustness
Authors	Varun Chandrasekaran, Brian Tang, Nicolas Papernot, Kassem Fawaz, Somesh Jha, Xi Wu
Abstract	While generalizing well over natural inputs, neural networks are vulnerable to adversarial inputs. Existing defenses against adversarial inputs have largely been detached from the real world. These defenses also come at a cost to accuracy. Fortunately, there are invariances of an object that are its salient features; when we break them it will necessarily change the perception of the object. We find that applying invariants to the classification task makes robustness and accuracy feasible together. Two questions follow: how to extract and model these invariances, and how to design a classification paradigm that leverages these invariances to improve the robustness-accuracy trade-off? The remainder of the paper discusses solutions to the aforementioned questions.
Tasks	Autonomous Driving
Published	2019-05-26
URL	https://arxiv.org/abs/1905.10900v3
PDF	https://arxiv.org/pdf/1905.10900v3.pdf
PWC	https://paperswithcode.com/paper/enhancing-ml-robustness-using-physical-world
Repo
Framework

Batched Multi-armed Bandits Problem


Title	Batched Multi-armed Bandits Problem
Authors	Zijun Gao, Yanjun Han, Zhimei Ren, Zhengqing Zhou
Abstract	In this paper, we study the multi-armed bandit problem in the batched setting where the employed policy must split data into a small number of batches. While the minimax regret for the two-armed stochastic bandits has been completely characterized by Perchet et al. (2016), the effect of the number of arms on the regret for the multi-armed case is still open. Moreover, the question of whether adaptively chosen batch sizes will help to reduce the regret also remains underexplored. In this paper, we propose the BaSE (batched successive elimination) policy to achieve the rate-optimal regrets (within logarithmic factors) for batched multi-armed bandits, with matching lower bounds even if the batch sizes are determined in an adaptive manner.
Tasks	Multi-Armed Bandits
Published	2019-04-03
URL	https://arxiv.org/abs/1904.01763v3
PDF	https://arxiv.org/pdf/1904.01763v3.pdf
PWC	https://paperswithcode.com/paper/batched-multi-armed-bandits-problem
Repo
Framework
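
To make the idea of batched successive elimination concrete, here is a simplified sketch and not the paper's exact BaSE policy. The batch grid (`floor(T^(i/M))`), the elimination radius `sqrt(gamma * log(T * K) / n)`, and the names `pull`, `gamma`, and `batched_successive_elimination` are assumptions chosen for illustration. Within each batch every surviving arm is pulled equally often, arms are discarded only at batch boundaries, and the last batch commits to the empirically best survivor.

```python
import math


def batched_successive_elimination(pull, n_arms, horizon, n_batches, gamma=1.0):
    """Simplified batched successive elimination (illustrative sketch).

    pull(arm) returns one reward sample for the given arm index.
    Returns the committed arm and the per-arm pull counts.
    """
    # Assumed batch grid: end times grow geometrically up to the horizon.
    grid = [math.floor(horizon ** (i / n_batches)) for i in range(1, n_batches + 1)]
    grid[-1] = horizon
    active = list(range(n_arms))
    sums = [0.0] * n_arms
    counts = [0] * n_arms
    used = 0

    def mean(arm):
        return sums[arm] / counts[arm] if counts[arm] else 0.0

    for batch_end in grid[:-1]:
        per_arm = (batch_end - used) // len(active)
        if per_arm == 0:
            continue
        for arm in active:
            for _ in range(per_arm):
                sums[arm] += pull(arm)
                counts[arm] += 1
            used += per_arm
        # Eliminate only at the batch boundary, using an assumed confidence radius.
        best_mean = max(mean(arm) for arm in active)
        n = counts[active[0]]  # all surviving arms share the same pull count
        radius = math.sqrt(gamma * math.log(horizon * n_arms) / n)
        active = [arm for arm in active if best_mean - mean(arm) < radius]

    # Final batch: commit to the empirically best surviving arm.
    best = max(active, key=mean)
    for _ in range(horizon - used):
        sums[best] += pull(best)
        counts[best] += 1
    return best, counts


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    probs = [0.9, 0.5, 0.2]  # hypothetical Bernoulli arms
    print(batched_successive_elimination(
        lambda arm: 1.0 if rng.random() < probs[arm] else 0.0, 3, 1000, 3))
```

The policy only inspects rewards at the end of each batch, which is the constraint that distinguishes the batched setting from the fully sequential one.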

Arrow, Hausdorff, and Ambiguities in the Choice of Preferred States in Complex Systems


Title	Arrow, Hausdorff, and Ambiguities in the Choice of Preferred States in Complex Systems
Authors	T. Erber, M. J. Frank
Abstract	Arrow’s ‘impossibility’ theorem asserts that there are no satisfactory methods of aggregating individual preferences into collective preferences in many complex situations. This result has ramifications in economics, politics, i.e., the theory of voting, and the structure of tournaments. By identifying the objects of choice with mathematical sets, and preferences with Hausdorff measures of the distances between sets, it is possible to extend Arrow’s arguments from a sociological to a mathematical setting. One consequence is that notions of reversibility can be expressed in terms of the relative configurations of patterns of sets.
Tasks
Published	2019-09-10
URL	https://arxiv.org/abs/1909.07771v1
PDF	https://arxiv.org/pdf/1909.07771v1.pdf
PWC	https://paperswithcode.com/paper/arrow-hausdorff-and-ambiguities-in-the-choice
Repo
Framework
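
For readers unfamiliar with Hausdorff distances between sets, the following minimal sketch computes the classical Hausdorff distance between two finite point sets under the Euclidean metric. It illustrates the general notion only; the function names are hypothetical and the paper's own construction of preferences from such distances is not reproduced here.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite, non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    set_a = [(0.0, 0.0), (1.0, 0.0)]
    set_b = [(0.0, 0.0), (4.0, 0.0)]
    print(hausdorff(set_a, set_b))
```

The directed distance is not symmetric, which is why the two directions are combined with a maximum.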

Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment


Title	Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment
Authors	Jivitesh Sharma, Per-Arne Andersen, Ole-Christoffer Granmo, Morten Goodwin
Abstract	We focus on the important problem of emergency evacuation, which could clearly benefit from reinforcement learning but has been largely unaddressed. Emergency evacuation is a complex task that is difficult to solve with reinforcement learning, since an emergency situation is highly dynamic, with many changing variables and complex constraints that make it difficult to train on. In this paper, we propose the first fire evacuation environment to train reinforcement learning agents for evacuation planning. The environment is modelled as a graph capturing the building structure. It consists of realistic features like fire spread, uncertainty and bottlenecks. We have implemented the environment in the OpenAI gym format, to facilitate future research. We also propose a new reinforcement learning approach that entails pretraining the network weights of a DQN-based agent to incorporate information on the shortest path to the exit. We achieved this by using tabular Q-learning to learn the shortest path on the building model’s graph. This information is transferred to the network by deliberately overfitting it on the Q-matrix. Then, the pretrained DQN model is trained on the fire evacuation environment to generate the optimal evacuation path under time-varying conditions. We perform comparisons of the proposed approach with state-of-the-art reinforcement learning algorithms like PPO, VPG, SARSA, A2C and ACKTR. The results show that our method is able to outperform state-of-the-art models by a huge margin, including the original DQN-based models. Finally, we test our model on a large and complex real building consisting of 91 rooms, with the possibility to move to any other room, hence giving 8281 actions. We use an attention-based mechanism to deal with large action spaces. Our model achieves near-optimal performance on the real-world emergency environment.
Tasks	Q-Learning, Transfer Learning
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09673v2
PDF	https://arxiv.org/pdf/1905.09673v2.pdf
PWC	https://paperswithcode.com/paper/deep-q-learning-with-q-matrix-transfer
Repo
Framework
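
The pretraining step described above relies on tabular Q-learning recovering shortest paths on a building graph. The sketch below illustrates that step only, under stated assumptions: the toy graph, the reward of -1 per move, and the function names are hypothetical, and the Q-learning update is applied in deterministic sweeps over all (room, move) pairs rather than along sampled episodes, so that the result is reproducible. It is not the authors' environment or code.

```python
def q_learning_shortest_path(graph, goal, sweeps=50, alpha=1.0, gamma=1.0):
    """Fill a Q-table for reaching `goal` on a graph with a cost of 1 per move.

    graph maps each room to the list of rooms reachable in one move.
    """
    q = {room: {nxt: 0.0 for nxt in nbrs} for room, nbrs in graph.items()}
    for _ in range(sweeps):
        for room, nbrs in graph.items():
            if room == goal:
                continue  # the goal is terminal
            for nxt in nbrs:
                future = 0.0 if nxt == goal else max(q[nxt].values())
                target = -1.0 + gamma * future
                # Standard Q-learning update toward the bootstrapped target.
                q[room][nxt] += alpha * (target - q[room][nxt])
    return q


def greedy_path(q, start, goal, max_steps=100):
    """Follow the highest-valued move from each room until the goal is reached."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        room = path[-1]
        path.append(max(q[room], key=q[room].get))
    return path


if __name__ == "__main__":
    # Hypothetical six-room building; "exit" is the goal.
    building = {
        "lobby": ["hall", "office"],
        "hall": ["lobby", "stairs"],
        "office": ["lobby", "lab"],
        "lab": ["office", "stairs"],
        "stairs": ["hall", "lab", "exit"],
        "exit": ["stairs"],
    }
    table = q_learning_shortest_path(building, "exit")
    print(greedy_path(table, "lobby", "exit"))
```

With a cost of 1 per move and no discounting, each converged Q-value equals the negative number of moves needed to reach the exit after taking that move, which is the kind of Q-matrix a network could then be fitted to.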

Ten ways to fool the masses with machine learning


Title	Ten ways to fool the masses with machine learning
Authors	Fayyaz Minhas, Amina Asif, Asa Ben-Hur
Abstract	If you want to tell people the truth, make them laugh, otherwise they’ll kill you. (source unclear) Machine learning and deep learning are the technologies of the day for developing intelligent automatic systems. However, a key hurdle for progress in the field is the literature itself: we often encounter papers that report results that are difficult to reconstruct or reproduce, results that mis-represent the performance of the system, or contain other biases that limit their validity. In this semi-humorous article, we discuss issues that arise in running and reporting results of machine learning experiments. The purpose of the article is to provide a list of watch out points for researchers to be aware of when developing machine learning models or writing and reviewing machine learning papers.
Tasks
Published	2019-01-07
URL	http://arxiv.org/abs/1901.01686v1
PDF	http://arxiv.org/pdf/1901.01686v1.pdf
PWC	https://paperswithcode.com/paper/ten-ways-to-fool-the-masses-with-machine
Repo
Framework

Using Big Five Personality Model to Detect Cultural Aspects in Crowds


Title	Using Big Five Personality Model to Detect Cultural Aspects in Crowds
Authors	Rodolfo Migon Favaretto, Leandro Dihl, Soraia Raupp Musse, Felipe Vilanova, Angelo Brandelli Costa
Abstract	The use of information technology in the study of human behavior is a subject of great scientific interest. Cultural and personality aspects are factors that influence how people interact with one another in a crowd. This paper presents a methodology to detect cultural characteristics of crowds in video sequences. Based on filmed sequences, pedestrians are detected, tracked and characterized. Such information is then used to find out cultural differences in those videos, based on the Big-five personality model. Regarding cultural differences of each country, results indicate that this model generates coherent information when compared to data provided in literature.
Tasks
Published	2019-03-05
URL	http://arxiv.org/abs/1903.01688v1
PDF	http://arxiv.org/pdf/1903.01688v1.pdf
PWC	https://paperswithcode.com/paper/using-big-five-personality-model-to-detect
Repo
Framework

Distributed Computation for Marginal Likelihood based Model Choice


Title	Distributed Computation for Marginal Likelihood based Model Choice
Authors	Alexander Buchholz, Daniel Ahfock, Sylvia Richardson
Abstract	We propose a general method for distributed Bayesian model choice, using the marginal likelihood, where each worker has access only to non-overlapping subsets of the data. Our approach approximates the model evidence for the full data set through Monte Carlo sampling from the posterior on every subset generating a model evidence per subset. The model evidences per worker are then consistently combined using a novel approach which corrects for the splitting using summary statistics of the generated samples. This divide-and-conquer approach allows Bayesian model choice in the large data setting, exploiting all available information but limiting communication between workers. Our work thereby complements the work on consensus Monte Carlo (Scott et al., 2016) by explicitly enabling model choice. In addition, we show how the suggested approach can be extended to model choice within a reversible jump setting that explores multiple feature combinations within one run.
Tasks
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04672v2
PDF	https://arxiv.org/pdf/1910.04672v2.pdf
PWC	https://paperswithcode.com/paper/distributed-bayesian-computation-for-model
Repo
Framework

Teaching robots to perceive time – A reinforcement learning approach (Extended version)


Title	Teaching robots to perceive time – A reinforcement learning approach (Extended version)
Authors	Inês Lourenço, Bo Wahlberg, Rodrigo Ventura
Abstract	Time perception is the phenomenological experience of time by an individual. In this paper, we study how to replicate neural mechanisms involved in time perception, allowing robots to take a step towards temporal cognition. Our framework follows a twofold biologically inspired approach. The first step consists of estimating the passage of time from sensor measurements, since environmental stimuli influence the perception of time. Sensor data is modeled as Gaussian processes that represent the second-order statistics of the natural environment. The estimated elapsed time between two events is computed from the maximum likelihood estimate of the joint distribution of the data collected between them. Moreover, exactly how time is encoded in the brain remains unknown, but there is strong evidence of the involvement of dopaminergic neurons in timing mechanisms. Since their phasic activity has a similar behavior to the reward prediction error of temporal-difference learning models, the latter are used to replicate this behavior. The second step of this approach consists therefore of applying the agent’s estimate of the elapsed time in a reinforcement learning problem, where a feature representation called Microstimuli is used. We validate our framework by applying it to an experiment that was originally conducted with mice, and conclude that a robot using this framework is able to reproduce the timing mechanisms of the animal’s brain.
Tasks	Gaussian Processes
Published	2019-12-20
URL	https://arxiv.org/abs/1912.10113v1
PDF	https://arxiv.org/pdf/1912.10113v1.pdf
PWC	https://paperswithcode.com/paper/teaching-robots-to-perceive-time-a
Repo
Framework
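
The abstract above links dopaminergic activity to the reward prediction error of temporal-difference learning. As a minimal illustration of that quantity, the sketch below runs tabular TD(0) over one episode and returns the prediction error at each step. The two-state "cue"/"wait" episode, the step size, and the discount are hypothetical, and the Microstimuli feature representation and Gaussian process time estimate used in the paper are not modeled here.

```python
def td0_episode(values, transitions, alpha=0.1, gamma=0.9):
    """Run one TD(0) pass and return the reward prediction errors.

    transitions is a list of (state, reward, next_state) tuples;
    next_state is None when the episode terminates.
    values is updated in place.
    """
    errors = []
    for state, reward, next_state in transitions:
        next_value = 0.0 if next_state is None else values[next_state]
        delta = reward + gamma * next_value - values[state]  # prediction error
        values[state] += alpha * delta
        errors.append(delta)
    return errors


if __name__ == "__main__":
    values = {"cue": 0.0, "wait": 0.0}
    episode = [("cue", 0.0, "wait"), ("wait", 1.0, None)]
    for trial in range(3):
        print(trial, td0_episode(values, episode))
```

Over repeated trials the error at the rewarded step shrinks while a positive error appears at the earlier cue, which is the qualitative pattern usually compared with phasic dopamine responses.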

Point in, Box out: Beyond Counting Persons in Crowds


Title	Point in, Box out: Beyond Counting Persons in Crowds
Authors	Yuting Liu, Miaojing Shi, Qijun Zhao, Xiaofang Wang
Abstract	Modern crowd counting methods usually employ deep neural networks (DNN) to estimate crowd counts via density regression. Despite their significant improvements, the regression-based methods are incapable of providing the detection of individuals in crowds. The detection-based methods, on the other hand, have not been largely explored in recent trends of crowd counting due to the need for expensive bounding box annotations. In this work, we instead propose a new deep detection network with only point supervision required. It can simultaneously detect the size and location of human heads and count them in crowds. We first mine useful person size information from point-level annotations and initialize the pseudo ground truth bounding boxes. An online updating scheme is introduced to refine the pseudo ground truth during training, while a locally-constrained regression loss is designed to provide additional constraints on the size of the predicted boxes in a local neighborhood. In the end, we propose a curriculum learning strategy to train the network from images of relatively accurate and easy pseudo ground truth first. Extensive experiments are conducted in both detection and counting tasks on several standard benchmarks, e.g. ShanghaiTech, UCF_CC_50, WiderFace, and TRANCOS datasets, and the results show the superiority of our method over the state-of-the-art.
Tasks	Crowd Counting
Published	2019-04-02
URL	http://arxiv.org/abs/1904.01333v2
PDF	http://arxiv.org/pdf/1904.01333v2.pdf
PWC	https://paperswithcode.com/paper/point-in-box-out-beyond-counting-persons-in
Repo
Framework

Temporal Planning with Intermediate Conditions and Effects


Title	Temporal Planning with Intermediate Conditions and Effects
Authors	Alessandro Valentini, Andrea Micheli, Alessandro Cimatti
Abstract	Automated temporal planning is the technology of choice when controlling systems that can execute multiple actions in parallel and when temporal constraints, such as deadlines, are needed in the model. One limitation of several action-based planning systems is that actions are modeled as intervals having conditions and effects only at the extremes and as invariants, but no conditions or effects can be specified at arbitrary points or sub-intervals. In this paper, we address this limitation by providing an effective heuristic-search technique for temporal planning, allowing the definition of actions with conditions and effects at any arbitrary time within the action duration. We experimentally demonstrate that our approach is far better than standard encodings in PDDL 2.1 and is competitive with other approaches that can (directly or indirectly) represent intermediate action conditions or effects.
Tasks
Published	2019-09-25
URL	https://arxiv.org/abs/1909.11581v1
PDF	https://arxiv.org/pdf/1909.11581v1.pdf
PWC	https://paperswithcode.com/paper/temporal-planning-with-intermediate
Repo
Framework

Doc2Vec on the PubMed corpus: study of a new approach to generate related articles


Title	Doc2Vec on the PubMed corpus: study of a new approach to generate related articles
Authors	Emeric Dynomant, Stéfan J. Darmoni, Émeline Lejeune, Gaëtan Kerdelhué, Jean-Philippe Leroy, Vincent Lequertier, Stéphane Canu, Julien Grosjean
Abstract	PubMed is the biggest and most used bibliographic database worldwide, hosting more than 26M biomedical publications. One of its useful features is the “similar articles” section, allowing the end-user to find scientific articles linked to the consulted document in terms of context. The aim of this study is to analyze whether it is possible to replace the statistical model PubMed Related Articles (pmra) with a document embedding method. The Doc2Vec algorithm was used to train models that vectorize documents. Six of its parameters were optimised by following a grid-search strategy to train more than 1,900 models. The parameter combination leading to the best accuracy was used to train models on abstracts from the PubMed database. Four evaluation tasks were defined to determine what does or does not influence the proximity between documents for both Doc2Vec and pmra. The two different Doc2Vec architectures have different abilities to link documents about a common context. The terminological indexing, words and stems contents of linked documents are highly similar between pmra and the Doc2Vec PV-DBOW architecture. These algorithms are also more likely to bring together documents of similar size. In contrast, the manual evaluation shows much better results for the pmra algorithm. While the pmra algorithm links documents by explicitly using terminological indexing in its formula, Doc2Vec does not need prior indexing. It can infer relations between documents sharing a similar indexing, without any knowledge about them, particularly regarding the PV-DBOW architecture. In contrast, the human evaluation, without any clear agreement between evaluators, calls for future studies to better understand this difference between the PV-DBOW and pmra algorithms.
Tasks	Document Embedding
Published	2019-11-26
URL	https://arxiv.org/abs/1911.11698v1
PDF	https://arxiv.org/pdf/1911.11698v1.pdf
PWC	https://paperswithcode.com/paper/doc2vec-on-the-pubmed-corpus-study-of-a-new
Repo
Framework