January 26, 2020

Paper Group ANR 1441

Single-step Options for Adversary Driving


Title	Single-step Options for Adversary Driving
Authors	Nazmus Sakib, Hengshuai Yao, Hong Zhang, Shangling Jui
Abstract	In this paper, we use reinforcement learning for safe driving in adversary settings. In our work, the knowledge in state-of-the-art planning methods is reused by single-step options whose action suggestions are compared in parallel with primitive actions. We show two advantages of doing so. First, training this reinforcement learning agent is easier and faster than training the primitive-action agent. Second, our new agent outperforms the primitive-action reinforcement learning agent, human testers, and the state-of-the-art planning methods that our agent queries as skill options.
Tasks
Published	2019-03-20
URL	https://arxiv.org/abs/1903.08606v2
PDF	https://arxiv.org/pdf/1903.08606v2.pdf
PWC	https://paperswithcode.com/paper/reinforcing-classical-planning-for-adversary
Repo
Framework
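
The idea of comparing option suggestions "in parallel with primitive actions" can be illustrated with a minimal sketch. It assumes a tabular set of action values and treats each planner as a function from state to a single primitive action; the names `select_action`, `q_values`, and the `opt:` prefix are hypothetical and not taken from the paper.

```python
from typing import Callable, Dict, Hashable, List, Tuple

State = Hashable
Planner = Callable[[State], str]  # hypothetical: maps a state to one primitive action


def select_action(
    state: State,
    q_values: Dict[str, float],
    primitives: List[str],
    planners: Dict[str, Planner],
) -> Tuple[str, str]:
    """Pick the highest-valued choice among primitive actions and single-step options.

    Returns (choice, primitive_action). If the choice is an option, the planner
    behind it is queried once and its suggested primitive action is executed.
    """
    # Primitive actions and options compete in one flat choice set.
    choices = list(primitives) + list(planners)
    best = max(choices, key=lambda c: q_values.get(c, 0.0))
    if best in planners:
        # Single-step option: follow the planner's suggestion for this step only.
        return best, planners[best](state)
    return best, best


primitives = ["left", "right", "brake"]
planners = {"opt:planner": lambda s: "brake"}
choice, action = select_action(
    "s0", {"left": 0.1, "right": 0.2, "opt:planner": 0.9}, primitives, planners
)
```

Because an option lasts only one step, control returns to the learned policy at every time step, so the agent can override a planner whenever a primitive action has a higher estimated value.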

Resolving Congestions in the Air Traffic Management Domain via Multiagent Reinforcement Learning Methods


Title	Resolving Congestions in the Air Traffic Management Domain via Multiagent Reinforcement Learning Methods
Authors	Theocharis Kravaris, Christos Spatharis, Alevizos Bastas, George A. Vouros, Konstantinos Blekas, Gennady Andrienko, Natalia Andrienko, Jose Manuel Cordero Garcia
Abstract	In this article, we report on the efficiency and effectiveness of multiagent reinforcement learning (MARL) methods for the computation of flight delays to resolve congestion problems in the Air Traffic Management (ATM) domain. Specifically, we aim to resolve cases where demand for airspace use exceeds capacity (demand-capacity problems) by imposing ground delays on flights at the pre-tactical stage of operations (i.e., a few days to a few hours before operation). Casting this into the multiagent domain, agents, representing flights, need to decide on their own delays with respect to their own preferences, having no information about others’ payoffs, preferences and constraints, while they plan to execute their trajectories jointly with others, adhering to operational constraints. Specifically, we formalize the problem as a multiagent Markov Decision Process (MA-MDP) and we show that it can be considered as a Markov game in which interacting agents need to reach an equilibrium. What makes the problem more interesting is the dynamic setting in which agents operate, which is also due to the unforeseen, emergent effects of their decisions in the whole system. We propose collaborative multiagent reinforcement learning methods to resolve demand-capacity imbalances. An extensive experimental study on real-world cases shows the potential of the proposed approaches in resolving problems, while advanced visualizations provide detailed views towards understanding the quality of solutions provided.
Tasks
Published	2019-12-14
URL	https://arxiv.org/abs/1912.06860v1
PDF	https://arxiv.org/pdf/1912.06860v1.pdf
PWC	https://paperswithcode.com/paper/resolving-congestions-in-the-air-traffic
Repo
Framework

Federated Learning of N-gram Language Models


Title	Federated Learning of N-gram Language Models
Authors	Mingqing Chen, Ananda Theertha Suresh, Rajiv Mathews, Adeline Wong, Cyril Allauzen, Françoise Beaufays, Michael Riley
Abstract	We propose algorithms to train production-quality n-gram language models using federated learning. Federated learning is a distributed computation platform that can be used to train global models for portable devices such as smart phones. Federated learning is especially relevant for applications handling privacy-sensitive data, such as virtual keyboards, because training is performed without the users’ data ever leaving their devices. While the principles of federated learning are fairly generic, its methodology assumes that the underlying models are neural networks. However, virtual keyboards are typically powered by n-gram language models for latency reasons. We propose to train a recurrent neural network language model using the decentralized FederatedAveraging algorithm and to approximate this federated model server-side with an n-gram model that can be deployed to devices for fast inference. Our technical contributions include ways of handling large vocabularies, algorithms to correct capitalization errors in user data, and efficient finite state transducer algorithms to convert word language models to word-piece language models and vice versa. The n-gram language models trained with federated learning are compared to n-grams trained with traditional server-based algorithms using A/B tests on tens of millions of users of a virtual keyboard. Results are presented for two languages, American English and Brazilian Portuguese. This work demonstrates that high-quality n-gram language models can be trained directly on client mobile devices without sensitive training data ever leaving the devices.
Tasks	Language Modelling
Published	2019-10-08
URL	https://arxiv.org/abs/1910.03432v1
PDF	https://arxiv.org/pdf/1910.03432v1.pdf
PWC	https://paperswithcode.com/paper/federated-learning-of-n-gram-language-models
Repo
Framework
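
The FederatedAveraging step named in the abstract can be sketched in a few lines. This is a toy illustration only: a two-parameter linear model stands in for the recurrent language model, and `local_update`, `federated_round`, the learning rate, and the client data are assumptions made for the example. The part that reflects the algorithm is the size-weighted average of locally updated parameters.

```python
from typing import List, Sequence, Tuple


def federated_average(
    client_params: Sequence[Sequence[float]], client_sizes: Sequence[int]
) -> List[float]:
    """Average client parameter vectors, weighting each by its example count."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


def local_update(
    params: Sequence[float], data: Sequence[Tuple[float, float]], lr: float
) -> List[float]:
    """One pass of SGD on squared error for y ~ w*x + b, using one client's data."""
    w, b = params
    for x, y in data:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return [w, b]


def federated_round(
    global_params: Sequence[float],
    clients: Sequence[Sequence[Tuple[float, float]]],
    lr: float = 0.05,
) -> List[float]:
    """Each client trains locally; only parameters (never raw data) are averaged."""
    updated = [local_update(global_params, data, lr) for data in clients]
    sizes = [len(data) for data in clients]
    return federated_average(updated, sizes)


clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
params = federated_round([0.0, 0.0], clients)
```

In a real deployment the server would repeat such rounds over sampled subsets of devices; the sketch runs a single round over all clients.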

Relation-Aware Graph Attention Network for Visual Question Answering


Title	Relation-Aware Graph Attention Network for Visual Question Answering
Authors	Linjie Li, Zhe Gan, Yu Cheng, Jingjing Liu
Abstract	In order to answer semantically-complicated questions about an image, a Visual Question Answering (VQA) model needs to fully understand the visual scene in the image, especially the interactive dynamics between different objects. We propose a Relation-aware Graph Attention Network (ReGAT), which encodes each image into a graph and models multi-type inter-object relations via a graph attention mechanism, to learn question-adaptive relation representations. Two types of visual object relations are explored: (i) Explicit Relations that represent geometric positions and semantic interactions between objects; and (ii) Implicit Relations that capture the hidden dynamics between image regions. Experiments demonstrate that ReGAT outperforms prior state-of-the-art approaches on both VQA 2.0 and VQA-CP v2 datasets. We further show that ReGAT is compatible with existing VQA architectures, and can be used as a generic relation encoder to boost the model performance for VQA.
Tasks	Question Answering, Visual Question Answering
Published	2019-03-29
URL	https://arxiv.org/abs/1903.12314v3
PDF	https://arxiv.org/pdf/1903.12314v3.pdf
PWC	https://paperswithcode.com/paper/relation-aware-graph-attention-network-for
Repo
Framework

Can You Explain That? Lucid Explanations Help Human-AI Collaborative Image Retrieval


Title	Can You Explain That? Lucid Explanations Help Human-AI Collaborative Image Retrieval
Authors	Arijit Ray, Yi Yao, Rakesh Kumar, Ajay Divakaran, Giedrius Burachas
Abstract	While there have been many proposals on making AI algorithms explainable, few have attempted to evaluate the impact of AI-generated explanations on human performance in conducting human-AI collaborative tasks. To bridge the gap, we propose a Twenty-Questions style collaborative image retrieval game, Explanation-assisted Guess Which (ExAG), as a method of evaluating the efficacy of explanations (visual evidence or textual justification) in the context of Visual Question Answering (VQA). In our proposed ExAG, a human user needs to guess a secret image picked by the VQA agent by asking natural language questions to it. We show that overall, when AI explains its answers, users succeed more often in guessing the secret image correctly. Notably, a few correct explanations can readily improve human performance when VQA answers are mostly incorrect as compared to no-explanation games. Furthermore, we also show that while explanations rated as “helpful” significantly improve human performance, “incorrect” and “unhelpful” explanations can degrade performance as compared to no-explanation games. Our experiments, therefore, demonstrate that ExAG is an effective means to evaluate the efficacy of AI-generated explanations on a human-AI collaborative task.
Tasks	Image Retrieval, Question Answering, Visual Question Answering
Published	2019-04-05
URL	https://arxiv.org/abs/1904.03285v4
PDF	https://arxiv.org/pdf/1904.03285v4.pdf
PWC	https://paperswithcode.com/paper/lucid-explanations-help-using-a-human-ai
Repo
Framework

A New Analysis of Differential Privacy’s Generalization Guarantees


Title	A New Analysis of Differential Privacy’s Generalization Guarantees
Authors	Christopher Jung, Katrina Ligett, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, Moshe Shenfeld
Abstract	We give a new proof of the “transfer theorem” underlying adaptive data analysis: that any mechanism for answering adaptively chosen statistical queries that is differentially private and sample-accurate is also accurate out-of-sample. Our new proof is elementary and gives structural insights that we expect will be useful elsewhere. We show: 1) that differential privacy ensures that the expectation of any query on the posterior distribution on datasets induced by the transcript of the interaction is close to its true value on the data distribution, and 2) sample accuracy on its own ensures that any query answer produced by the mechanism is close to its posterior expectation with high probability. This second claim follows from a thought experiment in which we imagine that the dataset is resampled from the posterior distribution after the mechanism has committed to its answers. The transfer theorem then follows by summing these two bounds, and in particular, avoids the “monitor argument” used to derive high probability bounds in prior work. An upshot of our new proof technique is that the concrete bounds we obtain are substantially better than the best previously known bounds, even though the improvements are in the constants, rather than the asymptotics (which are known to be tight). As we show, our new bounds outperform the naive “sample-splitting” baseline at dramatically smaller dataset sizes compared to the previous state of the art, bringing techniques from this literature closer to practicality.
Tasks
Published	2019-09-09
URL	https://arxiv.org/abs/1909.03577v1
PDF	https://arxiv.org/pdf/1909.03577v1.pdf
PWC	https://paperswithcode.com/paper/a-new-analysis-of-differential-privacys
Repo
Framework
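
The final step described in the abstract, summing the two bounds, amounts to a triangle inequality. The notation below is assumed for illustration and is not taken from the paper: $a$ is the mechanism's answer to a query $q$, $\mathcal{P}$ is the data distribution, $\mathcal{Q}_\pi$ is the posterior over datasets induced by the transcript $\pi$, and $\alpha$, $\beta$ are placeholder error terms for the two claims.

```latex
\[
\bigl| a - q(\mathcal{P}) \bigr|
\;\le\;
\underbrace{\bigl| a - \mathbb{E}_{S' \sim \mathcal{Q}_\pi}\!\left[ q(S') \right] \bigr|}_{\le\, \alpha \ \text{(claim 2: sample accuracy)}}
\;+\;
\underbrace{\bigl| \mathbb{E}_{S' \sim \mathcal{Q}_\pi}\!\left[ q(S') \right] - q(\mathcal{P}) \bigr|}_{\le\, \beta \ \text{(claim 1: differential privacy)}}
\;\le\; \alpha + \beta .
\]
```

The posterior expectation acts as the intermediate quantity: sample accuracy ties the answer to it, and differential privacy ties it to the true value on the distribution.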

Reversible Privacy Preservation using Multi-level Encryption and Compressive Sensing


Title	Reversible Privacy Preservation using Multi-level Encryption and Compressive Sensing
Authors	Mehmet Yamac, Mete Ahishali, Nikolaos Passalis, Jenni Raitoharju, Bulent Sankur, Moncef Gabbouj
Abstract	Security monitoring via ubiquitous cameras and their more extended use in intelligent buildings stand to gain from advances in signal processing and machine learning. While these innovative and ground-breaking applications can be considered a boon, at the same time they raise significant privacy concerns. In fact, recent GDPR (General Data Protection Regulation) legislation has highlighted, and become an incentive for, privacy-preserving solutions. Typical privacy-preserving video monitoring schemes address these concerns by anonymizing the sensitive data. However, these approaches suffer from some limitations, since they are usually non-reversible, do not provide multiple levels of decryption, and are computationally costly. In this paper, we provide a novel privacy-preserving method, which is reversible, supports de-identification at multiple privacy levels, and can efficiently perform data acquisition, encryption and data hiding by combining multi-level encryption with compressive sensing. The effectiveness of the proposed approach in protecting the identity of the users has been validated using the goodness of reconstruction quality and strong anonymization of the faces.
Tasks	Compressive Sensing
Published	2019-06-20
URL	https://arxiv.org/abs/1906.08713v1
PDF	https://arxiv.org/pdf/1906.08713v1.pdf
PWC	https://paperswithcode.com/paper/reversible-privacy-preservation-using-multi
Repo
Framework

Making sense of sensory input


Title	Making sense of sensory input
Authors	Richard Evans, Jose Hernandez-Orallo, Johannes Welbl, Pushmeet Kohli, Marek Sergot
Abstract	This paper attempts to answer a central question in unsupervised learning: what does it mean to “make sense” of a sensory sequence? In our formalization, making sense involves constructing a symbolic causal theory that explains the sensory sequence and satisfies a set of unity conditions. This model was inspired by Kant’s discussion of the synthetic unity of apperception in the Critique of Pure Reason. On our account, making sense of sensory input is a type of program synthesis, but it is unsupervised program synthesis. Our second contribution is a computer implementation, the Apperception Engine, that was designed to satisfy the above requirements. Our system is able to produce interpretable human-readable causal theories from very small amounts of data, because of the strong inductive bias provided by the Kantian unity constraints. A causal theory produced by our system is able to predict future sensor readings, as well as retrodict earlier readings, and “impute” (fill in the blanks of) missing sensory readings, in any combination. We tested the engine in a diverse variety of domains, including cellular automata, rhythms and simple nursery tunes, multi-modal binding problems, occlusion tasks, and sequence induction IQ tests. In each domain, we test our engine’s ability to predict future sensor values, retrodict earlier sensor values, and impute missing sensory data. The Apperception Engine performs well in all these domains, significantly out-performing neural net baselines. We note in particular that in the sequence induction IQ tasks, our system achieved human-level performance. This is notable because our system is not a bespoke system designed specifically to solve IQ tasks, but a general purpose apperception system that was designed to make sense of any sensory sequence.
Tasks	Program Synthesis
Published	2019-10-05
URL	https://arxiv.org/abs/1910.02227v1
PDF	https://arxiv.org/pdf/1910.02227v1.pdf
PWC	https://paperswithcode.com/paper/making-sense-of-sensory-input
Repo
Framework

Real-Time Privacy-Preserving Data Release for Smart Meters


Title	Real-Time Privacy-Preserving Data Release for Smart Meters
Authors	Mohammadhadi Shateri, Francisco Messina, Pablo Piantanida, Fabrice Labeau
Abstract	Smart Meters (SMs) are a fundamental component of smart grids, but they carry sensitive information about users, such as the occupancy status of houses, and have therefore raised serious concerns about leakage of consumers’ private information. In particular, we focus on real-time privacy threats, i.e., potential attackers that try to infer sensitive data from SM-reported data in an online fashion. We adopt an information-theoretic privacy measure and show that it effectively limits the performance of any real-time attacker. Using this privacy measure, we propose a general formulation to design a privatization mechanism that can provide a target level of privacy by adding a minimal amount of distortion to the SM measurements. On the other hand, to cope with different applications, a flexible distortion measure is considered. This formulation leads to a general loss function, which is optimized using a deep learning adversarial framework, where two neural networks, referred to as the releaser and the adversary, are trained with opposite goals. An exhaustive empirical study is then performed to validate the performance of the proposed approach for the occupancy detection privacy problem, assuming the attacker has either limited or full access to the training dataset.
Tasks	Time Series
Published	2019-06-14
URL	https://arxiv.org/abs/1906.06427v2
PDF	https://arxiv.org/pdf/1906.06427v2.pdf
PWC	https://paperswithcode.com/paper/deep-recurrent-adversarial-learning-for
Repo
Framework

Few-Shot Learning-Based Human Activity Recognition


Title	Few-Shot Learning-Based Human Activity Recognition
Authors	Siwei Feng, Marco F. Duarte
Abstract	Few-shot learning is a technique to learn a model with a very small amount of labeled training data by transferring knowledge from relevant tasks. In this paper, we propose a few-shot learning method for wearable sensor based human activity recognition, a technique that seeks high-level human activity knowledge from low-level sensor inputs. Due to the high costs to obtain human generated activity data and the ubiquitous similarities between activity modes, it can be more efficient to borrow information from existing activity recognition models than to collect more data to train a new model from scratch when only a few data are available for model training. The proposed few-shot human activity recognition method leverages a deep learning model for feature extraction and classification while knowledge transfer is performed in the manner of model parameter transfer. In order to alleviate negative transfer, we propose a metric to measure cross-domain class-wise relevance so that knowledge of higher relevance is assigned larger weights during knowledge transfer. Promising results in extensive experiments show the advantages of the proposed approach.
Tasks	Activity Recognition, Few-Shot Learning, Human Activity Recognition, Transfer Learning
Published	2019-03-25
URL	http://arxiv.org/abs/1903.10416v1
PDF	http://arxiv.org/pdf/1903.10416v1.pdf
PWC	https://paperswithcode.com/paper/few-shot-learning-based-human-activity
Repo
Framework
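
Relevance-weighted parameter transfer, as described in the abstract, can be sketched as follows. The paper's actual relevance metric is not given here, so this example substitutes cosine similarity between class-mean feature vectors followed by a softmax; `relevance_weights`, `transfer_init`, and the toy vectors are all hypothetical.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def relevance_weights(
    target_mean: Sequence[float], source_means: Sequence[Sequence[float]]
) -> List[float]:
    """Softmax over similarities: more relevant source classes get larger weights."""
    scores = [cosine(target_mean, m) for m in source_means]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift by max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def transfer_init(
    target_mean: Sequence[float],
    source_means: Sequence[Sequence[float]],
    source_params: Sequence[Sequence[float]],
) -> List[float]:
    """Initialize a target-class parameter vector as a weighted sum of source ones."""
    weights = relevance_weights(target_mean, source_means)
    dim = len(source_params[0])
    return [sum(w * p[i] for w, p in zip(weights, source_params)) for i in range(dim)]


weights = relevance_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
init = transfer_init([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[1.0], [0.0]])
```

Down-weighting dissimilar source classes is one simple way to limit negative transfer; the few labeled target samples would then be used to fine-tune from this initialization.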

Automatic text summarization: What has been done and what has to be done


Title	Automatic text summarization: What has been done and what has to be done
Authors	Abdelkrime Aries, Djamel eddine Zegour, Walid Khaled Hidouci
Abstract	Summaries are important when it comes to processing huge amounts of information. Their most important benefit is saving time, which is in short supply nowadays. Therefore, a summary must be short, representative and readable. Generating summaries automatically can be beneficial for humans, since it can save time and help in selecting relevant documents. Automatic summarization and, in particular, automatic text summarization (ATS) is not a new research field; it has been known since the 1950s. Since then, researchers have been actively seeking the perfect summarization method. In this article, we will discuss different works in automatic summarization, especially the recent ones. We will present some problems and limits which prevent work from moving forward. Most of these challenges are closely related to the nature of the processed languages. These challenges are interesting for academics and developers, as a path to follow in this field.
Tasks	Text Summarization
Published	2019-04-01
URL	http://arxiv.org/abs/1904.00688v1
PDF	http://arxiv.org/pdf/1904.00688v1.pdf
PWC	https://paperswithcode.com/paper/automatic-text-summarization-what-has-been
Repo
Framework

Information search in a professional context - exploring a collection of professional search tasks


Title	Information search in a professional context - exploring a collection of professional search tasks
Authors	Suzan Verberne, Jiyin He, Gineke Wiggers, Tony Russell-Rose, Udo Kruschwitz, Arjen P. de Vries
Abstract	Search conducted in a work context is an everyday activity that has been around since long before the Web was invented, yet we still seem to understand little about its general characteristics. With this paper we aim to contribute to a better understanding of this large but rather multi-faceted area of ‘professional search’. Unlike task-based studies that aim at measuring the effectiveness of search methods, we chose to take a step back by conducting a survey among professional searchers to understand their typical search tasks. By doing so we offer complementary insights into the subject area. We asked our respondents to provide actual search tasks they have worked on, information about how these were conducted and details on how successful they eventually were. We then manually coded the collection of 56 search tasks with task characteristics and relevance criteria, and used the coded dataset for exploration purposes. Despite the relatively small scale of this study, our data provides enough evidence that professional search is indeed very different from Web search in many key respects and that this is a field that offers many avenues for future research.
Tasks
Published	2019-05-11
URL	https://arxiv.org/abs/1905.04577v1
PDF	https://arxiv.org/pdf/1905.04577v1.pdf
PWC	https://paperswithcode.com/paper/information-search-in-a-professional-context
Repo
Framework

Image based Eye Gaze Tracking and its Applications


Title	Image based Eye Gaze Tracking and its Applications
Authors	Anjith George
Abstract	Eye movements play a vital role in perceiving the world. Eye gaze can give a direct indication of the user’s point of attention, which can be useful in improving human-computer interaction. Gaze estimation in a non-intrusive manner can make human-computer interaction more natural. Eye tracking can be used for several applications such as fatigue detection, biometric authentication, disease diagnosis, activity recognition, alertness level estimation, gaze-contingent display, human-computer interaction, etc. Even though eye-tracking technology has been around for many decades, it has not found much use in consumer applications. The main reasons are the high cost of eye tracking hardware and lack of consumer level applications. In this work, we attempt to address these two issues. In the first part of this work, image-based algorithms are developed for gaze tracking which includes a new two-stage iris center localization algorithm. We have developed a new algorithm which works in challenging conditions such as motion blur, glint, and varying illumination levels. A person independent gaze direction classification framework using a convolutional neural network is also developed which eliminates the requirement of user-specific calibration. In the second part of this work, we have developed two applications which can benefit from eye tracking data. A new framework for biometric identification based on eye movement parameters is developed. A framework for activity recognition, using gaze data from a head-mounted eye tracker is also developed. The information from gaze data, ego-motion, and visual features are integrated to classify the activities.
Tasks	Activity Recognition, Calibration, Eye Tracking, Gaze Estimation
Published	2019-07-09
URL	https://arxiv.org/abs/1907.04325v1
PDF	https://arxiv.org/pdf/1907.04325v1.pdf
PWC	https://paperswithcode.com/paper/image-based-eye-gaze-tracking-and-its
Repo
Framework

Linearized ADMM and Fast Nonlocal Denoising for Efficient Plug-and-Play Restoration


Title	Linearized ADMM and Fast Nonlocal Denoising for Efficient Plug-and-Play Restoration
Authors	Unni V. S., Sanjay Ghosh, Kunal N. Chaudhury
Abstract	In plug-and-play image restoration, the regularization is performed using powerful denoisers such as nonlocal means (NLM) or BM3D. This is done within the framework of alternating direction method of multipliers (ADMM), where the regularization step is formally replaced by an off-the-shelf denoiser. Each plug-and-play iteration involves the inversion of the forward model followed by a denoising step. In this paper, we present a couple of ideas for improving the efficiency of the inversion and denoising steps. First, we propose to use linearized ADMM, which generally allows us to perform the inversion at a lower cost than standard ADMM. Moreover, we can easily incorporate hard constraints into the optimization framework as a result. Second, we develop a fast algorithm for doubly stochastic NLM, originally proposed by Sreehari et al. (IEEE TCI, 2016), which is about 80x faster than brute-force computation. This particular denoiser can be expressed as the proximal map of a convex regularizer and, as a consequence, we can guarantee convergence for linearized plug-and-play ADMM. We demonstrate the effectiveness of our proposals for super-resolution and single-photon imaging.
Tasks	Denoising, Image Restoration, Super-Resolution
Published	2019-01-18
URL	http://arxiv.org/abs/1901.06110v1
PDF	http://arxiv.org/pdf/1901.06110v1.pdf
PWC	https://paperswithcode.com/paper/linearized-admm-and-fast-nonlocal-denoising
Repo
Framework
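
The structure of a plug-and-play iteration with a linearized inversion step can be sketched as below. This is one common textbook form of linearized ADMM, not necessarily the exact update used in the paper: the data-fidelity term is replaced by its gradient at the current iterate, so the x-update has a closed form and no linear system is solved. The function name, the parameters `rho` and `tau`, and the toy problem are assumptions; a useful `tau` would normally be chosen relative to the Lipschitz constant of the gradient.

```python
from typing import Callable, List

Vector = List[float]


def pnp_linearized_admm(
    grad_f: Callable[[Vector], Vector],
    denoise: Callable[[Vector], Vector],
    x0: Vector,
    rho: float = 1.0,
    tau: float = 1.0,
    iters: int = 50,
) -> Vector:
    """Plug-and-play ADMM sketch with a linearized data-fidelity update."""
    x = list(x0)
    z = list(x0)
    u = [0.0] * len(x0)  # scaled dual variable
    for _ in range(iters):
        g = grad_f(x)
        # Linearized x-update: minimizes <g, x> + (rho/2)|x - z + u|^2
        # + (1/(2*tau))|x - x_old|^2, which has the closed form below.
        x = [
            (xi / tau + rho * (zi - ui) - gi) / (1.0 / tau + rho)
            for xi, zi, ui, gi in zip(x, z, u, g)
        ]
        # Regularization step: an off-the-shelf denoiser stands in for the prox.
        z = denoise([xi + ui for xi, ui in zip(x, u)])
        # Dual update on the constraint x = z.
        u = [ui + xi - zi for ui, xi, zi in zip(u, x, z)]
    return x


# Toy check: f(x) = 0.5*|x - b|^2 with an identity "denoiser" should recover b.
b = [1.0, -2.0, 3.0]
x_hat = pnp_linearized_admm(
    grad_f=lambda x: [xi - bi for xi, bi in zip(x, b)],
    denoise=lambda v: list(v),
    x0=[0.0, 0.0, 0.0],
)
```

In practice `denoise` would be a real denoiser such as nonlocal means, and `grad_f` would involve the forward model of the restoration problem.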

On Automating Conversations


Title	On Automating Conversations
Authors	Ting-Hao ‘Kenneth’ Huang
Abstract	From 2016 to 2018, we developed and deployed Chorus, a system that blends real-time human computation with artificial intelligence (AI) and has real-world, open conversations with users. We took a top-down approach that started with a working crowd-powered system, Chorus, and then created a framework, Evorus, that enables Chorus to automate itself over time. Over our two-year deployment, more than 420 users talked with Chorus, having over 2,200 conversation sessions. This line of work demonstrated how a crowd-powered conversational assistant can be automated over time, and more importantly, how such a system can be deployed to talk with real users to help them with their everyday tasks. This position paper discusses two sets of challenges that we explored during the development and deployment of Chorus and Evorus: the challenges that come from being an “agent” and those that arise from the subset of conversations that are more difficult to automate.
Tasks
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09621v3
PDF	https://arxiv.org/pdf/1910.09621v3.pdf
PWC	https://paperswithcode.com/paper/on-automating-conversations
Repo
Framework