January 31, 2020

3289 words 16 mins read

Paper Group ANR 89

Analysing Deep Reinforcement Learning Agents Trained with Domain Randomisation. Infinite-horizon Off-Policy Policy Evaluation with Multiple Behavior Policies. Unsupervised algorithm for disaggregating low-sampling-rate electricity consumption of households. Stigmergic Independent Reinforcement Learning for Multi-Agent Collaboration. Transmission Ma …

Analysing Deep Reinforcement Learning Agents Trained with Domain Randomisation

Title Analysing Deep Reinforcement Learning Agents Trained with Domain Randomisation
Authors Tianhong Dai, Kai Arulkumaran, Tamara Gerbert, Samyakh Tukra, Feryal Behbahani, Anil Anthony Bharath
Abstract Deep reinforcement learning has the potential to train robots to perform complex tasks in the real world without requiring accurate models of the robot or its environment. A practical approach is to train agents in simulation, and then transfer them to the real world. One popular method for achieving transferability is to use domain randomisation, which involves randomly perturbing various aspects of a simulated environment in order to make trained agents robust to the reality gap. However, less work has gone into understanding such agents - which are deployed in the real world - beyond task performance. In this work we examine such agents, through qualitative and quantitative comparisons between agents trained with and without visual domain randomisation. We train agents for Fetch and Jaco robots on a visuomotor control task and evaluate how well they generalise using different testing conditions. Finally, we investigate the internals of the trained agents by using a suite of interpretability techniques. Our results show that the primary outcome of domain randomisation is more robust, entangled representations, accompanied by larger weights with greater spatial structure; moreover, the types of changes are heavily influenced by the task setup and the presence of additional proprioceptive inputs. Additionally, we demonstrate that our domain-randomised agents require higher sample complexity, can overfit, and rely more heavily on recurrent processing. Furthermore, even with an improved saliency method introduced in this work, we show that qualitative studies may not always correspond with quantitative measures, necessitating the combination of inspection tools in order to provide sufficient insights into the behaviour of trained agents.
Tasks
Published 2019-12-18
URL https://arxiv.org/abs/1912.08324v2
PDF https://arxiv.org/pdf/1912.08324v2.pdf
PWC https://paperswithcode.com/paper/analysing-deep-reinforcement-learning-agents
Repo
Framework
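
As a rough illustration of what visual domain randomisation involves, the sketch below re-samples visual parameters of a hypothetical simulator at the start of each episode; the attribute names and ranges are illustrative assumptions, not the paper's actual setup.

```python
# Hypothetical sketch of visual domain randomisation: before each episode the
# simulator's visual parameters are re-sampled so the policy cannot rely on a
# fixed appearance. `sim` and its attributes are illustrative, not a real API.
import numpy as np

rng = np.random.default_rng(0)

def randomise_visuals(sim):
    """Perturb visual aspects of the simulated scene before an episode."""
    sim.table_rgb  = rng.uniform(0.0, 1.0, size=3)    # random table colour
    sim.light_pos  = rng.uniform(-1.0, 1.0, size=3)   # random light position
    sim.cam_jitter = rng.normal(0.0, 0.02, size=3)    # small camera offset
    sim.texture_id = int(rng.integers(0, 50))         # random surface texture
    return sim

class DummySim:    # stand-in for a Fetch/Jaco simulator
    pass

sim = randomise_visuals(DummySim())
print(sim.table_rgb, sim.texture_id)
```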

Infinite-horizon Off-Policy Policy Evaluation with Multiple Behavior Policies

Title Infinite-horizon Off-Policy Policy Evaluation with Multiple Behavior Policies
Authors Xinyun Chen, Lu Wang, Yizhe Hang, Heng Ge, Hongyuan Zha
Abstract We consider off-policy policy evaluation when the trajectory data are generated by multiple behavior policies. Recent work has shown the key role played by the state or state-action stationary distribution corrections in the infinite-horizon context for off-policy policy evaluation. We propose the estimated mixture policy (EMP), a novel class of partially policy-agnostic methods to accurately estimate those quantities. With careful analysis, we show that EMP gives rise to estimates with reduced variance for estimating the state stationary distribution correction, while it also offers a useful inductive bias for estimating the state-action stationary distribution correction. In extensive experiments with both continuous and discrete environments, we demonstrate that our algorithm offers significantly improved accuracy compared to the state-of-the-art methods.
Tasks
Published 2019-10-10
URL https://arxiv.org/abs/1910.04849v1
PDF https://arxiv.org/pdf/1910.04849v1.pdf
PWC https://paperswithcode.com/paper/infinite-horizon-off-policy-policy-evaluation-1
Repo
Framework
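
To make the mixture-behaviour-policy idea concrete, here is a minimal discrete-state sketch in which logged data from several behaviour policies is treated as coming from their mixture, and per-step importance ratios are taken against that mixture. This only fixes notation; it is not the paper's EMP estimator for stationary distribution corrections.

```python
# Discrete toy example: ratios of the evaluation policy against the mixture of
# the behaviour policies, weighted by each policy's share of the logged data.
import numpy as np

n_states, n_actions = 4, 3
rng = np.random.default_rng(1)

# Two behaviour policies and their fraction of the logged trajectories.
pi_b1 = rng.dirichlet(np.ones(n_actions), size=n_states)
pi_b2 = rng.dirichlet(np.ones(n_actions), size=n_states)
alphas = np.array([0.7, 0.3])

pi_mix = alphas[0] * pi_b1 + alphas[1] * pi_b2           # mixture behaviour policy
pi_e = rng.dirichlet(np.ones(n_actions), size=n_states)  # evaluation policy

def importance_ratio(s, a):
    """Per-step ratio pi_e / pi_mix, the quantity that stationary-distribution
    correction estimators build on."""
    return pi_e[s, a] / pi_mix[s, a]

print(importance_ratio(0, 1))
```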

Unsupervised algorithm for disaggregating low-sampling-rate electricity consumption of households

Title Unsupervised algorithm for disaggregating low-sampling-rate electricity consumption of households
Authors Jordan Holweger, Marina Dorokhova, Lionel Bloch, Christophe Ballif, Nicolas Wyrsch
Abstract Non-intrusive load monitoring (NILM) has been extensively researched over the last decade. The objective of NILM is to identify the power consumption of individual appliances and to detect when particular devices are on or off from measuring the power consumption of an entire house. This information allows households to receive customized advice on how to better manage their electrical consumption. In this paper, we present an alternative NILM method that breaks down the aggregated power signal into categories of appliances. The ultimate goal is to use this approach for demand-side management to estimate potential flexibility within the electricity consumption of households. Our method is implemented as an algorithm combining NILM and load profile simulation. This algorithm, based on a Markov model, allocates an activity chain to each inhabitant of the household, deduces the appliance usage from the whole-house power measurement and statistical data, generates the power profile accordingly, and finally returns the share of energy consumed by each appliance category over time. To analyze its performance, the algorithm was benchmarked against several state-of-the-art NILM algorithms and tested on three public datasets. The proposed algorithm is unsupervised; hence it does not require any labeled data, which are expensive to acquire. Although better performance is shown for the supervised algorithms, our proposed unsupervised algorithm achieves a similar range of uncertainty while saving on the cost of acquiring labeled data. Additionally, our method requires lower computational power compared to most of the tested NILM algorithms. It was designed for low-sampling-rate power measurements (every 15 min), which corresponds to the frequency range of most common smart meters.
Tasks Non-Intrusive Load Monitoring
Published 2019-08-20
URL https://arxiv.org/abs/1908.10713v1
PDF https://arxiv.org/pdf/1908.10713v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-algorithm-for-disaggregating-low
Repo
Framework
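
A toy sketch of the load-profile-simulation component, assuming a small hand-made Markov chain over occupant activities and fixed per-category power draws; the numbers and the mapping from activities to appliance categories are invented for illustration, not taken from the paper.

```python
# A small Markov chain over occupant activities; each activity implies a
# typical appliance-category power draw, producing a 15-minute load profile
# that could be matched against the smart-meter reading.
import numpy as np

rng = np.random.default_rng(2)
activities = ["sleep", "cook", "laundry", "idle"]
# Transition matrix between activities (rows sum to 1) -- made-up numbers.
P = np.array([[0.90, 0.02, 0.01, 0.07],
              [0.05, 0.60, 0.05, 0.30],
              [0.05, 0.05, 0.60, 0.30],
              [0.10, 0.10, 0.05, 0.75]])
# Typical power (W) for the appliance category triggered by each activity.
power = {"sleep": 50.0, "cook": 2000.0, "laundry": 500.0, "idle": 150.0}

state = 3                                   # start in "idle"
profile = []
for _ in range(96):                         # 96 x 15 min = one day
    state = rng.choice(4, p=P[state])
    profile.append(power[activities[state]])

print(sum(profile) * 0.25 / 1000, "kWh simulated for the day")
```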

Stigmergic Independent Reinforcement Learning for Multi-Agent Collaboration

Title Stigmergic Independent Reinforcement Learning for Multi-Agent Collaboration
Authors Xu Xing, Li Rongpeng, Zhao Zhifeng, Zhang Honggang
Abstract With the rapid evolution of wireless mobile devices, there emerges a stronger incentive to design proper collaboration mechanisms among intelligent agents. Following their individual observations, multiple intelligent agents could cooperate and gradually approach the final collective objective through continuously learning from the environment. In that regard, independent reinforcement learning (IRL) is often deployed within multi-agent collaboration to alleviate the dilemma of a non-stationary learning environment. However, behavioral strategies of the intelligent agents in IRL can only be formulated upon their local individual observations of the global environment, and appropriate communication mechanisms must be introduced to reduce their behavioral localities. In this paper, we tackle the communication problem among the intelligent agents in IRL by jointly adopting two mechanisms with different scales. For the large scale, we introduce the stigmergy mechanism as an indirect communication bridge among the independent learning agents and carefully design a mathematical representation to indicate the impact of the digital pheromone. For the small scale, we propose a conflict-avoidance mechanism between adjacent agents by implementing an additionally embedded neural network to provide more opportunities for participants with higher action priorities. Besides, we also present a federal training method to effectively optimize the neural networks within each agent in a decentralized manner. Finally, we establish a simulation scenario where a number of mobile agents in a certain area move automatically to form a specified target shape, and demonstrate the superiority of our proposed methods through extensive simulations.
Tasks
Published 2019-11-28
URL https://arxiv.org/abs/1911.12504v1
PDF https://arxiv.org/pdf/1911.12504v1.pdf
PWC https://paperswithcode.com/paper/stigmergic-independent-reinforcement-learning
Repo
Framework
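
The digital-pheromone mechanism can be illustrated with a shared grid that agents write to and read from, with evaporation applied each step. The grid size, deposit amount, and evaporation rate below are placeholder values, not the paper's mathematical representation.

```python
# Toy stigmergy sketch: agents deposit pheromone on a shared grid, the field
# evaporates over time, and each agent reads the local patch as an extra
# observation for its policy.
import numpy as np

GRID = 20
EVAPORATION = 0.05       # fraction of pheromone lost per step
DEPOSIT = 1.0            # amount left by an agent each step

pheromone = np.zeros((GRID, GRID))
agent_positions = [(3, 4), (10, 15), (7, 7)]

def step(pheromone, agent_positions):
    pheromone *= (1.0 - EVAPORATION)               # evaporation
    for (x, y) in agent_positions:
        pheromone[x, y] += DEPOSIT                 # deposition
    return pheromone

def local_observation(pheromone, pos, radius=1):
    """Pheromone patch around an agent, used to augment its observation."""
    x, y = pos
    return pheromone[max(0, x - radius):x + radius + 1,
                     max(0, y - radius):y + radius + 1]

pheromone = step(pheromone, agent_positions)
print(local_observation(pheromone, (3, 4)))
```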

Transmission Matrix Inference via Pseudolikelihood Decimation

Title Transmission Matrix Inference via Pseudolikelihood Decimation
Authors Daniele Ancora, Luca Leuzzi
Abstract One of the biggest challenges in the field of biomedical imaging is the comprehension and the exploitation of the photon scattering through disordered media. Many studies have pursued the solution to this puzzle, achieving light-focusing control or reconstructing images in complex media. In the present work, we investigate how statistical inference helps the calculation of the transmission matrix in a complex scrambling environment, enabling its usage like a normal optical element. We convert a linear input-output transmission problem into a statistical formulation based on pseudolikelihood maximization, learning the coupling matrix via random sampling of intensity realizations. Our aim is to uncover insights from the scattering problem, encouraging the development of novel imaging techniques for better medical investigations, borrowing a number of statistical tools from spin-glass theory.
Tasks
Published 2019-03-13
URL http://arxiv.org/abs/1903.05379v1
PDF http://arxiv.org/pdf/1903.05379v1.pdf
PWC https://paperswithcode.com/paper/transmission-matrix-inference-via
Repo
Framework
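
As a much simpler stand-in for the linear input-output formulation (and not the authors' pseudolikelihood-decimation scheme), the snippet below recovers a transmission matrix from many random input realizations by least squares, under the assumption that the complex output field were directly measurable; the intensity-only setting of the paper is what makes the statistical approach necessary.

```python
# If the complex output field y = T x were observable, T could be recovered
# from random input realizations by least squares; the paper's harder setting
# only observes intensities |y|^2.
import numpy as np

rng = np.random.default_rng(3)
n_in, n_out, n_samples = 8, 6, 200

T_true = rng.normal(size=(n_out, n_in)) + 1j * rng.normal(size=(n_out, n_in))
X = rng.normal(size=(n_in, n_samples)) + 1j * rng.normal(size=(n_in, n_samples))
Y = T_true @ X + 0.01 * rng.normal(size=(n_out, n_samples))

# Solve Y = T X for T in the least-squares sense via the pseudo-inverse of X.
T_est = Y @ np.linalg.pinv(X)
print(np.max(np.abs(T_est - T_true)))
```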

Societal Controversies in Wikipedia Articles

Title Societal Controversies in Wikipedia Articles
Authors Erik Borra, Andreas Kaltenbrunner, Michele Mauri, Esther Weltevrede, David Laniado, Richard Rogers, Paolo Ciuccarelli, Giovanni Magni, Tommaso Venturini
Abstract Collaborative content creation inevitably reaches situations where different points of view lead to conflict. We focus on Wikipedia, the free encyclopedia anyone may edit, where disputes about content in controversial articles often reflect larger societal debates. While Wikipedia has a public edit history and discussion section for every article, the substance of these sections is difficult to fathom for Wikipedia users interested in the development of an article and in locating which topics were most controversial. In this paper we present Contropedia, a tool that augments Wikipedia articles and gives insight into the development of controversial topics. Contropedia uses an efficient, language-agnostic measure based on the edit history that focuses on wiki links to easily identify which topics within a Wikipedia article have been most controversial and when.
Tasks
Published 2019-04-18
URL http://arxiv.org/abs/1904.08721v1
PDF http://arxiv.org/pdf/1904.08721v1.pdf
PWC https://paperswithcode.com/paper/societal-controversies-in-wikipedia-articles
Repo
Framework
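
The core intuition of an edit-history measure focused on wiki links can be sketched by scoring each linked concept by the amount of contested editing (for example, reverted edits) it attracts. The data structure and scoring below are hypothetical and are not Contropedia's actual metric.

```python
# Toy controversy scoring: count, per wiki link, how many edits touching that
# link were later reverted, then rank links by that count.
from collections import Counter

# Each edit records the wiki links it touched and whether it was reverted.
edits = [
    {"links": {"Climate change", "IPCC"}, "reverted": True},
    {"links": {"Climate change"},         "reverted": True},
    {"links": {"Kyoto Protocol"},         "reverted": False},
    {"links": {"IPCC"},                   "reverted": False},
]

controversy = Counter()
for edit in edits:
    if edit["reverted"]:
        controversy.update(edit["links"])

# Rank linked topics by the amount of contested editing they attracted.
print(controversy.most_common())
```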

Curriculum Learning Strategies for Hindi-English Codemixed Sentiment Analysis

Title Curriculum Learning Strategies for Hindi-English Codemixed Sentiment Analysis
Authors Anirudh Dahiya, Neeraj Battan, Manish Shrivastava, Dipti Mishra Sharma
Abstract Sentiment analysis and other semantic tasks are commonly used for social media textual analysis to gauge public opinion and make sense of the noise on social media. The language used on social media not only commonly diverges from formal language, but is compounded by codemixing between languages, especially in large multilingual societies like India. Traditional methods for learning semantic NLP tasks have long relied on end-to-end task-specific training, requiring an expensive data creation process, even more so for deep learning methods. This challenge is even more severe for resource-scarce texts like codemixed language pairs, which lack well-learnt representations as model priors, and whose task-specific datasets can be too few and too small to efficiently exploit recent deep learning approaches. To address the above challenges, we introduce curriculum learning strategies for semantic tasks in code-mixed Hindi-English (Hi-En) texts, and investigate various training strategies for enhancing model performance. Our method outperforms the state-of-the-art methods for Hi-En codemixed sentiment analysis by 3.31% accuracy, and also shows better model robustness in terms of convergence and variance in test performance.
Tasks Sentiment Analysis
Published 2019-06-18
URL https://arxiv.org/abs/1906.07382v1
PDF https://arxiv.org/pdf/1906.07382v1.pdf
PWC https://paperswithcode.com/paper/curriculum-learning-strategies-for-hindi
Repo
Framework
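
A generic curriculum-learning loop, assuming a simple difficulty proxy (here, sentence length) and a schedule that exposes an increasingly difficult fraction of the data each epoch; the paper's actual strategies for Hi-En codemixed data may differ.

```python
# Generic curriculum sketch: rank examples by a difficulty proxy and grow the
# training subset from easiest to full dataset across epochs.
def difficulty(example):
    # Hypothetical proxy: longer sentences are considered harder.
    return len(example["text"].split())

def curriculum_batches(dataset, n_epochs):
    ranked = sorted(dataset, key=difficulty)
    for epoch in range(1, n_epochs + 1):
        # Expose the easiest fraction first, the full set by the last epoch.
        cutoff = max(1, int(len(ranked) * epoch / n_epochs))
        yield epoch, ranked[:cutoff]

dataset = [{"text": "movie accha tha"},
           {"text": "yeh film bilkul bakwaas thi yaar totally boring"},
           {"text": "good"}]

for epoch, subset in curriculum_batches(dataset, n_epochs=3):
    print(epoch, [ex["text"] for ex in subset])
```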

Neural Architecture Search for Deep Face Recognition

Title Neural Architecture Search for Deep Face Recognition
Authors Ning Zhu
Abstract With the widespread popularity of electronic devices, the emergence of biometric technology has brought significant convenience to user authentication compared with traditional password and pattern unlocking. Among many biological characteristics, the face is a universal and irreplaceable feature that does not need too much cooperation and can significantly improve the user’s experience at the same time. Face recognition is one of the main selling points of electronic devices, and hence it is well worth researching in computer vision. Previous work in this field has focused on two directions: converting the loss function to improve recognition accuracy in traditional deep convolutional neural networks (ResNet); and combining the latest loss function with a lightweight system (MobileNet) to reduce network size at a minimal expense of accuracy. But none of these has changed the network structure. With the development of AutoML, neural architecture search (NAS) has shown excellent performance on image classification benchmarks. In this paper, we integrate NAS technology into face recognition to customize a more suitable network. We adopt the framework of neural architecture search, which trains the child and controller networks alternately. At the same time, we modify NAS by incorporating evaluation latency into the rewards of reinforcement learning and utilize the policy gradient algorithm to search the architecture automatically with the most classical cross-entropy loss. The network architectures we discovered achieve state-of-the-art accuracy on large-scale face datasets: 98.77% top-1 on MS-Celeb-1M and 99.89% on LFW, with a relatively small network size. To the best of our knowledge, this proposal is the first attempt to use NAS to solve the problem of deep face recognition and achieve the best results in this domain.
Tasks AutoML, Face Recognition, Image Classification, Neural Architecture Search
Published 2019-04-21
URL http://arxiv.org/abs/1904.09523v2
PDF http://arxiv.org/pdf/1904.09523v2.pdf
PWC https://paperswithcode.com/paper/neural-architecture-search-for-deep-face
Repo
Framework
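
The latency-aware reward can be sketched as validation accuracy minus a latency penalty, which the controller then maximises with a policy gradient. The weighting and the candidate numbers below are illustrative assumptions, not values from the paper.

```python
# Latency-aware reward sketch: the controller's signal trades off accuracy
# against measured evaluation latency of each sampled child network.
def reward(val_accuracy, latency_ms, lam=0.01):
    """Controller reward: accuracy minus a latency penalty."""
    return val_accuracy - lam * latency_ms

# Toy comparison of two sampled child architectures.
candidates = [
    {"name": "child_A", "acc": 0.987, "latency_ms": 12.0},
    {"name": "child_B", "acc": 0.990, "latency_ms": 55.0},
]
scores = {c["name"]: reward(c["acc"], c["latency_ms"]) for c in candidates}
print(max(scores, key=scores.get), scores)
```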

Supervised and Unsupervised Learning of Parameterized Color Enhancement

Title Supervised and Unsupervised Learning of Parameterized Color Enhancement
Authors Yoav Chai, Raja Giryes, Lior Wolf
Abstract We treat the problem of color enhancement as an image translation task, which we tackle using both supervised and unsupervised learning. Unlike traditional image to image generators, our translation is performed using a global parameterized color transformation instead of learning to directly map image information. In the supervised case, every training image is paired with a desired target image and a convolutional neural network (CNN) learns from the expert-retouched images the parameters of the transformation. In the unpaired case, we employ two-way generative adversarial networks (GANs) to learn these parameters and apply a circularity constraint. We achieve state-of-the-art results compared to both supervised (paired data) and unsupervised (unpaired data) image enhancement methods on the MIT-Adobe FiveK benchmark. Moreover, we show the generalization capability of our method by applying it to photos from the early 20th century and to dark video frames.
Tasks Image Enhancement
Published 2019-12-30
URL https://arxiv.org/abs/2001.05843v1
PDF https://arxiv.org/pdf/2001.05843v1.pdf
PWC https://paperswithcode.com/paper/supervised-and-unsupervised-learning-of
Repo
Framework
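
A global parameterized colour transform can be sketched as a small matrix, predicted once per image, that is applied identically to every pixel. The quadratic feature lift below is one common parameterization and is used here only as an assumed example, not necessarily the paper's exact transform.

```python
# Global colour transform sketch: lift each pixel to quadratic colour features
# and multiply by one small matrix shared by all pixels.
import numpy as np

def lift(rgb):
    """Per-pixel features: [r, g, b, r^2, g^2, b^2, rg, rb, gb, 1]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.stack([r, g, b, r * r, g * g, b * b, r * g, r * b, g * b,
                     np.ones_like(r)], axis=-1)

def apply_transform(image, theta):
    """theta has shape (10, 3); the same transform is applied to every pixel."""
    return np.clip(lift(image) @ theta, 0.0, 1.0)

rng = np.random.default_rng(4)
image = rng.uniform(0, 1, size=(64, 64, 3))                 # placeholder image
theta = np.zeros((10, 3))
theta[0, 0] = theta[1, 1] = theta[2, 2] = 1.0               # identity transform
print(np.allclose(apply_transform(image, theta), image))
```

With theta set to the identity mapping, the transform leaves the image unchanged, which is a convenient sanity check before letting a network predict theta.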

AutoIDS: Auto-encoder Based Method for Intrusion Detection System

Title AutoIDS: Auto-encoder Based Method for Intrusion Detection System
Authors Mohammed Gharib, Bahram Mohammadi, Shadi Hejareh Dastgerdi, Mohammad Sabokrou
Abstract An intrusion detection system (IDS) is one of the most effective solutions for providing primary security services. IDSs generally work based on attack signatures or by detecting anomalies. In this paper, we present AutoIDS, a novel yet efficient solution for intrusion detection based on a semi-supervised machine learning technique. AutoIDS can distinguish abnormal packet flows from normal ones by taking advantage of cascading two efficient detectors. These detectors are two encoder-decoder neural networks that are forced to provide a compressed and a sparse representation of the normal flows. In the test phase, if these neural networks fail to provide a compressed or sparse representation of an incoming packet flow, that flow does not comply with normal traffic and is thus considered an intrusion. To lower the computational cost while preserving accuracy, a large number of flows are processed only by the first detector. In fact, the second detector is only used for difficult samples about which the first detector is not confident. We evaluated AutoIDS on the NSL-KDD benchmark, a widely used and well-known dataset. The accuracy of AutoIDS is 90.17%, showing its superiority compared to other state-of-the-art methods.
Tasks Intrusion Detection
Published 2019-11-08
URL https://arxiv.org/abs/1911.03306v1
PDF https://arxiv.org/pdf/1911.03306v1.pdf
PWC https://paperswithcode.com/paper/autoids-auto-encoder-based-method-for
Repo
Framework
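
The two-stage cascade can be sketched as a decision rule over reconstruction errors: the first, cheaper detector handles confident cases, and only uncertain flows are passed to the second detector. The thresholds and the stand-in error functions below are placeholders, not the paper's trained auto-encoders.

```python
# Cascade decision sketch: confident decisions from the first detector,
# deferral to the second detector only in the uncertain band.
def cascade_decision(flow, err1, err2, low=0.1, high=0.5):
    """err1/err2: reconstruction-error functions of the two auto-encoders."""
    e1 = err1(flow)
    if e1 < low:
        return "normal"            # first detector is confident
    if e1 > high:
        return "intrusion"         # first detector is confident
    # Uncertain region: defer to the second, more expensive detector.
    return "intrusion" if err2(flow) > high else "normal"

# Toy stand-ins for the auto-encoders' reconstruction errors.
err1 = lambda flow: abs(sum(flow)) / (1 + len(flow))
err2 = lambda flow: max(flow)

print(cascade_decision([0.05, 0.02, 0.01], err1, err2))
print(cascade_decision([0.9, 0.8, 0.7], err1, err2))
```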

Sparse Popularity Adjusted Stochastic Block Model

Title Sparse Popularity Adjusted Stochastic Block Model
Authors Majid Noroozi, Ramchandra Rimal, Marianna Pensky
Abstract The objective of the present paper is to study the Popularity Adjusted Block Model (PABM) in the sparse setting. Unlike other block models, the flexibility of the PABM allows setting some of the connection probabilities to zero while keeping the rest of the probabilities non-negligible, leading to the Sparse Popularity Adjusted Block Model (SPABM). The latter reduces the size of the parameter set and leads to improved precision of estimation and clustering. The theory is complemented by a simulation study and real-data examples.
Tasks
Published 2019-10-03
URL https://arxiv.org/abs/1910.01931v1
PDF https://arxiv.org/pdf/1910.01931v1.pdf
PWC https://paperswithcode.com/paper/sparse-popularity-adjusted-stochastic-block
Repo
Framework
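
A toy generator for a popularity-adjusted block model, in which the probability of an edge between i and j is the product of i's popularity towards j's community and j's popularity towards i's community, and zeroing some popularity parameters gives the sparse variant. Sizes and parameter values are made up for illustration.

```python
# Popularity-adjusted block model sketch: P[i, j] = lam[i, z[j]] * lam[j, z[i]],
# with some popularity parameters set to zero for the sparse variant.
import numpy as np

rng = np.random.default_rng(5)
n, K = 12, 2
z = rng.integers(0, K, size=n)                    # community labels
lam = rng.uniform(0.2, 0.9, size=(n, K))          # popularity parameters
lam[rng.uniform(size=(n, K)) < 0.3] = 0.0         # sparsify: zero some entries

P = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            P[i, j] = lam[i, z[j]] * lam[j, z[i]]

A = (rng.uniform(size=(n, n)) < P).astype(int)    # sample an adjacency matrix
A = np.triu(A, 1); A = A + A.T                    # symmetric, no self-loops
print(A.sum() // 2, "edges")
```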

Fully Decentralized Joint Learning of Personalized Models and Collaboration Graphs

Title Fully Decentralized Joint Learning of Personalized Models and Collaboration Graphs
Authors Valentina Zantedeschi, Aurélien Bellet, Marc Tommasi
Abstract We consider the fully decentralized machine learning scenario where many users with personal datasets collaborate to learn models through local peer-to-peer exchanges, without a central coordinator. We propose to train personalized models that leverage a collaboration graph describing the relationships between user personal tasks, which we learn jointly with the models. Our fully decentralized optimization procedure alternates between training nonlinear models given the graph in a greedy boosting manner, and updating the collaboration graph (with controlled sparsity) given the models. Throughout the process, users exchange messages only with a small number of peers (their direct neighbors when updating the models, and a few random users when updating the graph), ensuring that the procedure naturally scales with the number of users. Overall, our approach is communication-efficient and avoids exchanging personal data. We provide an extensive analysis of the convergence rate, memory and communication complexity of our approach, and demonstrate its benefits compared to competing techniques on synthetic and real datasets.
Tasks
Published 2019-01-24
URL https://arxiv.org/abs/1901.08460v4
PDF https://arxiv.org/pdf/1901.08460v4.pdf
PWC https://paperswithcode.com/paper/communication-efficient-and-decentralized
Repo
Framework
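
The alternating structure of the procedure can be sketched as follows, with heavy simplifications: scalar linear models stand in for the nonlinear boosted models, a similarity rule with top-k sparsification stands in for the paper's graph update, and communication constraints are ignored.

```python
# Alternating sketch: (1) update each user's model given the graph,
# (2) update the collaboration graph given the models, with sparsity.
import numpy as np

rng = np.random.default_rng(6)
n_users, k_neighbours = 6, 2
targets = rng.normal(size=n_users)                 # each user's local "task"
models = np.zeros(n_users)                         # one scalar model per user
W = np.zeros((n_users, n_users))                   # collaboration graph weights

for _ in range(20):
    # (1) Model step: move towards the local target and neighbours' models.
    for u in range(n_users):
        neigh = W[u] @ models / (W[u].sum() + 1e-8)
        models[u] += 0.1 * (targets[u] - models[u]) \
                     + 0.1 * W[u].sum() * (neigh - models[u])
    # (2) Graph step: weight by model similarity, keep only k strongest links.
    sim = np.exp(-np.abs(models[:, None] - models[None, :]))
    np.fill_diagonal(sim, 0.0)
    W = np.zeros_like(sim)
    for u in range(n_users):
        top = np.argsort(sim[u])[-k_neighbours:]
        W[u, top] = sim[u, top]

print(np.round(models, 2))
```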

Accurate 3D Cell Segmentation using Deep Feature and CRF Refinement

Title Accurate 3D Cell Segmentation using Deep Feature and CRF Refinement
Authors Jiaxiang Jiang, Po-Yu Kao, Samuel A. Belteton, Daniel B. Szymanski, B. S. Manjunath
Abstract We consider the problem of accurately identifying cell boundaries and labeling individual cells in confocal microscopy images, specifically, 3D image stacks of cells with tagged cell membranes. Precise identification of cell boundaries, their shapes, and quantifying inter-cellular space leads to a better understanding of cell morphogenesis. Towards this, we outline a cell segmentation method that uses a deep neural network architecture to extract a confidence map of cell boundaries, followed by a 3D watershed algorithm and a final refinement using a conditional random field. In addition to improving the accuracy of segmentation compared to other state-of-the-art methods, the proposed approach also generalizes well to different datasets without the need to retrain the network for each dataset. Detailed experimental results are provided, and the source code is available on GitHub.
Tasks Cell Segmentation
Published 2019-02-13
URL http://arxiv.org/abs/1902.04729v1
PDF http://arxiv.org/pdf/1902.04729v1.pdf
PWC https://paperswithcode.com/paper/accurate-3d-cell-segmentation-using-deep
Repo
Framework
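
A sketch of the middle step of the pipeline only: starting from a (here synthetic) 3D boundary-confidence map, seeds are placed in low-boundary regions and grown with a 3D watershed. The CNN that produces the confidence map and the final CRF refinement are omitted, and scikit-image's watershed is assumed as the implementation.

```python
# Seeded 3D watershed on a synthetic boundary-confidence volume.
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

rng = np.random.default_rng(7)
confidence = rng.uniform(size=(32, 32, 32))        # stand-in for the CNN output
confidence = ndi.gaussian_filter(confidence, sigma=2)

seeds_mask = confidence < np.percentile(confidence, 10)   # likely cell interiors
markers, n_seeds = ndi.label(seeds_mask)

labels = watershed(confidence, markers)            # grow seeds along low boundaries
print(n_seeds, "seeds ->", labels.max(), "labelled regions")
```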

Attributed Graph Clustering via Adaptive Graph Convolution

Title Attributed Graph Clustering via Adaptive Graph Convolution
Authors Xiaotong Zhang, Han Liu, Qimai Li, Xiao-Ming Wu
Abstract Attributed graph clustering is challenging as it requires joint modelling of graph structures and node attributes. Recent progress on graph convolutional networks has proved that graph convolution is effective in combining structural and content information, and several recent methods based on it have achieved promising clustering performance on some real attributed networks. However, there is limited understanding of how graph convolution affects clustering performance and how to properly use it to optimize performance for different graphs. Existing methods essentially use graph convolution of a fixed and low order that only takes into account neighbours within a few hops of each node, which underutilizes node relations and ignores the diversity of graphs. In this paper, we propose an adaptive graph convolution method for attributed graph clustering that exploits high-order graph convolution to capture global cluster structure and adaptively selects the appropriate order for different graphs. We establish the validity of our method by theoretical analysis and extensive experiments on benchmark datasets. Empirical results show that our method compares favourably with state-of-the-art methods.
Tasks Graph Clustering
Published 2019-06-04
URL https://arxiv.org/abs/1906.01210v1
PDF https://arxiv.org/pdf/1906.01210v1.pdf
PWC https://paperswithcode.com/paper/attributed-graph-clustering-via-adaptive
Repo
Framework
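
High-order graph convolution can be read as repeated low-pass filtering of node attributes with (I - L/2), where L is the symmetrically normalised Laplacian and the order k controls how far information spreads. The sketch below applies a fixed k; the paper's adaptive selection of k is not reproduced.

```python
# k-order graph convolution as repeated low-pass filtering of node attributes.
import numpy as np

def smooth_features(A, X, k):
    """Apply the k-order filter ((I - L/2)^k) X to node attributes X."""
    n = A.shape[0]
    A_hat = A + np.eye(n)                              # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(n) - D_inv_sqrt @ A_hat @ D_inv_sqrt    # normalised Laplacian
    G = np.eye(n) - 0.5 * L                            # one filtering step
    for _ in range(k):
        X = G @ X
    return X

rng = np.random.default_rng(8)
A = (rng.uniform(size=(10, 10)) < 0.3).astype(float)
A = np.triu(A, 1); A = A + A.T                         # symmetric, no self-loops
X = rng.normal(size=(10, 5))
print(smooth_features(A, X, k=4).shape)
```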

Biometric Blockchain: A Better Solution for the Security and Trust of Food Logistics

Title Biometric Blockchain: A Better Solution for the Security and Trust of Food Logistics
Authors Bing Xu, Tobechukwu Agbele, Richard Jiang
Abstract Blockchain has been emerging as a promising technology that could totally change the landscape of data security in the coming years, particularly for data access over the Internet of Things and cloud servers. However, blockchain itself, though secured by its protocol, does not identify who owns the data and who uses the data. Rather than simply encrypting data into keys, in this paper we propose a protocol called Biometric Blockchain (BBC) that explicitly incorporates the biometric cues of individuals to unambiguously identify the creators and users in a blockchain-based system, particularly to address the increasing need to secure food logistics, following the recently widely reported incident of wrongly labelled food that caused the death of a customer on a flight. The advantage of using BBC in food logistics is clear: it can not only identify whether the data or labels are authentic, but also clearly record who is responsible for the secured data or labels. As a result, such a BBC-based solution can greatly ease the difficulty of controlling the risks accompanying food logistics, such as faked foods or wrong ingredient labels.
Tasks
Published 2019-07-21
URL https://arxiv.org/abs/1907.10589v2
PDF https://arxiv.org/pdf/1907.10589v2.pdf
PWC https://paperswithcode.com/paper/biometric-blockchain-a-better-solution-for
Repo
Framework
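
A toy illustration of binding a record to its creator's biometric cue: each block stores a digest of the creator's biometric template alongside the payload and the previous block's hash. This is a conceptual sketch under assumed field names, not the proposed protocol.

```python
# Conceptual BBC-style block: payload + biometric digest + chained hashes.
import hashlib, json, time

def make_block(prev_hash, payload, biometric_template):
    """Chain a record together with a digest of the creator's biometric cue."""
    bio_digest = hashlib.sha256(biometric_template).hexdigest()
    block = {
        "timestamp": time.time(),
        "payload": payload,                 # e.g. a food-batch label
        "creator_biometric": bio_digest,    # identifies who created the record
        "prev_hash": prev_hash,
    }
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

genesis = make_block("0" * 64, {"batch": "A17", "label": "peanut-free"},
                     biometric_template=b"\x01\x02\x03demo-template")
print(genesis["hash"][:16], genesis["creator_biometric"][:16])
```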