Paper Group ANR 1441
Single-step Options for Adversary Driving. Resolving Congestions in the Air Traffic Management Domain via Multiagent Reinforcement Learning Methods. Federated Learning of N-gram Language Models. Relation-Aware Graph Attention Network for Visual Question Answering. Can You Explain That? Lucid Explanations Help Human-AI Collaborative Image Retrieval. …
Single-step Options for Adversary Driving
Title | Single-step Options for Adversary Driving |
Authors | Nazmus Sakib, Hengshuai Yao, Hong Zhang, Shangling Jui |
Abstract | In this paper, we use reinforcement learning for safe driving in adversary settings. In our work, the knowledge in state-of-the-art planning methods is reused by single-step options whose action suggestions are compared in parallel with primitive actions. We show two advantages of doing so. First, training this reinforcement learning agent is easier and faster than training the primitive-action agent. Second, our new agent outperforms the primitive-action reinforcement learning agent, human testers, and the state-of-the-art planning methods that our agent queries as skill options. |
Tasks | |
Published | 2019-03-20 |
URL | https://arxiv.org/abs/1903.08606v2 |
https://arxiv.org/pdf/1903.08606v2.pdf | |
PWC | https://paperswithcode.com/paper/reinforcing-classical-planning-for-adversary |
Repo | |
Framework | |
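The option mechanism described above can be sketched with a few lines of hypothetical code (the names and Q-values are ours, not the authors' implementation): a classical planner's suggestion is wrapped as a single-step option and competes with the primitive actions on equal footing, ranked by the learned Q estimates.

```python
def select_action(q_values, primitive_actions, planner_suggestion):
    """Pick the greedy action among the primitive actions plus a
    single-step 'option' whose action is suggested by a planner.

    q_values: dict mapping each candidate action (primitives and the
    planner's suggestion) to its learned Q estimate.
    """
    candidates = primitive_actions + [planner_suggestion]
    return max(candidates, key=lambda a: q_values.get(a, float("-inf")))

# Hypothetical example: the planner suggests "slow_down", which the
# agent prefers here because its Q estimate is highest.
q = {"accelerate": 0.2, "brake": -0.1, "steer_left": 0.0, "slow_down": 0.5}
best = select_action(q, ["accelerate", "brake", "steer_left"], "slow_down")
```

Because the option is queried in parallel rather than always trusted, the agent can also outgrow the planner when its own Q estimates for primitive actions become higher.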
Resolving Congestions in the Air Traffic Management Domain via Multiagent Reinforcement Learning Methods
Title | Resolving Congestions in the Air Traffic Management Domain via Multiagent Reinforcement Learning Methods |
Authors | Theocharis Kravaris, Christos Spatharis, Alevizos Bastas, George A. Vouros, Konstantinos Blekas, Gennady Andrienko, Natalia Andrienko, Jose Manuel Cordero Garcia |
Abstract | In this article, we report on the efficiency and effectiveness of multiagent reinforcement learning methods (MARL) for the computation of flight delays to resolve congestion problems in the Air Traffic Management (ATM) domain. Specifically, we aim to resolve cases where demand for airspace use exceeds capacity (demand-capacity problems), via imposing ground delays to flights at the pre-tactical stage of operations (i.e. a few days to a few hours before operation). Casting this into the multiagent domain, agents, representing flights, need to decide on their own delays w.r.t. their own preferences, having no information about others’ payoffs, preferences and constraints, while they plan to execute their trajectories jointly with others, adhering to operational constraints. Specifically, we formalize the problem as a multiagent Markov Decision Process (MA-MDP) and we show that it can be considered as a Markov game in which interacting agents need to reach an equilibrium: What makes the problem more interesting is the dynamic setting in which agents operate, which is also due to the unforeseen, emergent effects of their decisions in the whole system. We propose collaborative multiagent reinforcement learning methods to resolve demand-capacity imbalances: An extensive experimental study on real-world cases shows the potential of the proposed approaches in resolving problems, while advanced visualizations provide detailed views towards understanding the quality of solutions provided. |
Tasks | |
Published | 2019-12-14 |
URL | https://arxiv.org/abs/1912.06860v1 |
https://arxiv.org/pdf/1912.06860v1.pdf | |
PWC | https://paperswithcode.com/paper/resolving-congestions-in-the-air-traffic |
Repo | |
Framework | |
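As a toy sketch of the collaborative setting described above (all names, reward shaping, and the congestion measure are our assumptions, not the paper's formulation), each flight can be modeled as an independent stateless learner choosing its own ground delay, sharing a penalty for demand-capacity violations while also paying a small cost for its own delay:

```python
import random
from collections import Counter

def congestion_cost(delays, capacity=2):
    """Toy demand-capacity count: flights assigned the same delay slot
    'arrive together'; each slot over capacity contributes its excess."""
    counts = Counter(delays)
    return sum(max(0, c - capacity) for c in counts.values())

def train(n_agents=4, n_delays=3, episodes=2000, eps=0.1, alpha=0.1, seed=0):
    rng = random.Random(seed)
    # One Q-table per flight (agent), over its own delay choices only.
    q = [[0.0] * n_delays for _ in range(n_agents)]
    for _ in range(episodes):
        delays = [rng.randrange(n_delays) if rng.random() < eps
                  else max(range(n_delays), key=lambda d: q[i][d])
                  for i in range(n_agents)]
        shared = -congestion_cost(delays)          # collaborative term
        for i, d in enumerate(delays):
            reward = shared - 0.1 * d              # own-delay preference
            q[i][d] += alpha * (reward - q[i][d])  # stateless bandit update
    return q
```

This deliberately ignores trajectories and operational constraints; it only illustrates the independent-learner structure in which each agent optimizes its delay with no access to others' preferences.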
Federated Learning of N-gram Language Models
Title | Federated Learning of N-gram Language Models |
Authors | Mingqing Chen, Ananda Theertha Suresh, Rajiv Mathews, Adeline Wong, Cyril Allauzen, Françoise Beaufays, Michael Riley |
Abstract | We propose algorithms to train production-quality n-gram language models using federated learning. Federated learning is a distributed computation platform that can be used to train global models for portable devices such as smart phones. Federated learning is especially relevant for applications handling privacy-sensitive data, such as virtual keyboards, because training is performed without the users’ data ever leaving their devices. While the principles of federated learning are fairly generic, its methodology assumes that the underlying models are neural networks. However, virtual keyboards are typically powered by n-gram language models for latency reasons. We propose to train a recurrent neural network language model using the decentralized FederatedAveraging algorithm and to approximate this federated model server-side with an n-gram model that can be deployed to devices for fast inference. Our technical contributions include ways of handling large vocabularies, algorithms to correct capitalization errors in user data, and efficient finite state transducer algorithms to convert word language models to word-piece language models and vice versa. The n-gram language models trained with federated learning are compared to n-grams trained with traditional server-based algorithms using A/B tests on tens of millions of users of a virtual keyboard. Results are presented for two languages, American English and Brazilian Portuguese. This work demonstrates that high-quality n-gram language models can be trained directly on client mobile devices without sensitive training data ever leaving the devices. |
Tasks | Language Modelling |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03432v1 |
https://arxiv.org/pdf/1910.03432v1.pdf | |
PWC | https://paperswithcode.com/paper/federated-learning-of-n-gram-language-models |
Repo | |
Framework | |
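The FederatedAveraging loop referenced above has a simple skeleton. This is a minimal sketch on a linear least-squares model (the model, step counts, and data are our stand-ins; the paper applies the algorithm to a recurrent language model): clients run a few local gradient steps on data that never leaves them, and the server averages the resulting weights by client dataset size.

```python
import numpy as np

def federated_averaging(global_w, client_data, rounds=5, lr=0.1):
    """Sketch of FederatedAveraging on a single parameter vector:
    each round, clients refine a copy of the global weights locally,
    then the server takes a size-weighted average of the copies."""
    w = np.asarray(global_w, dtype=float)
    for _ in range(rounds):
        updates, sizes = [], []
        for X, y in client_data:          # raw data stays on the client
            local = w.copy()
            for _ in range(3):            # a few local SGD epochs
                grad = 2 * X.T @ (X @ local - y) / len(y)  # least squares
                local -= lr * grad
            updates.append(local)
            sizes.append(len(y))
        w = np.average(updates, axis=0, weights=sizes)  # server step
    return w
```

Only the weight vectors travel to the server, which is what makes the scheme attractive for privacy-sensitive keyboard data; the paper's server-side n-gram approximation is a separate step applied to the converged federated model.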
Relation-Aware Graph Attention Network for Visual Question Answering
Title | Relation-Aware Graph Attention Network for Visual Question Answering |
Authors | Linjie Li, Zhe Gan, Yu Cheng, Jingjing Liu |
Abstract | In order to answer semantically-complicated questions about an image, a Visual Question Answering (VQA) model needs to fully understand the visual scene in the image, especially the interactive dynamics between different objects. We propose a Relation-aware Graph Attention Network (ReGAT), which encodes each image into a graph and models multi-type inter-object relations via a graph attention mechanism, to learn question-adaptive relation representations. Two types of visual object relations are explored: (i) Explicit Relations that represent geometric positions and semantic interactions between objects; and (ii) Implicit Relations that capture the hidden dynamics between image regions. Experiments demonstrate that ReGAT outperforms prior state-of-the-art approaches on both VQA 2.0 and VQA-CP v2 datasets. We further show that ReGAT is compatible with existing VQA architectures, and can be used as a generic relation encoder to boost the model performance for VQA. |
Tasks | Question Answering, Visual Question Answering |
Published | 2019-03-29 |
URL | https://arxiv.org/abs/1903.12314v3 |
https://arxiv.org/pdf/1903.12314v3.pdf | |
PWC | https://paperswithcode.com/paper/relation-aware-graph-attention-network-for |
Repo | |
Framework | |
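The core operation, attention over object features conditioned on pairwise relations, can be sketched as follows. This is a simplified stand-in for ReGAT's relation-conditioned attention (the additive relation bias and single projection are our simplifications, not the paper's exact layer):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def relation_aware_attention(features, relation_bias, W):
    """One attention pass over detected objects.

    features: (N, d) visual object features; relation_bias: (N, N)
    scores encoding pairwise relations (e.g. geometric position);
    W: (d, d) projection. The relation scores are added to the
    attention logits so related objects attend to each other more.
    """
    h = features @ W                                 # project objects
    logits = h @ h.T / np.sqrt(h.shape[1])           # pairwise affinity
    attn = softmax(logits + relation_bias, axis=1)   # relation-aware weights
    return attn @ h                                  # aggregate neighbours
```

In the full model the bias would itself be produced by explicit (geometric/semantic) or implicit relation encoders and modulated by the question embedding.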
Can You Explain That? Lucid Explanations Help Human-AI Collaborative Image Retrieval
Title | Can You Explain That? Lucid Explanations Help Human-AI Collaborative Image Retrieval |
Authors | Arijit Ray, Yi Yao, Rakesh Kumar, Ajay Divakaran, Giedrius Burachas |
Abstract | While there have been many proposals on making AI algorithms explainable, few have attempted to evaluate the impact of AI-generated explanations on human performance in conducting human-AI collaborative tasks. To bridge the gap, we propose a Twenty-Questions style collaborative image retrieval game, Explanation-assisted Guess Which (ExAG), as a method of evaluating the efficacy of explanations (visual evidence or textual justification) in the context of Visual Question Answering (VQA). In our proposed ExAG, a human user needs to guess a secret image picked by the VQA agent by asking it natural language questions. We show that overall, when the AI explains its answers, users succeed more often in guessing the secret image correctly. Notably, a few correct explanations can readily improve human performance when VQA answers are mostly incorrect as compared to no-explanation games. Furthermore, we also show that while explanations rated as “helpful” significantly improve human performance, “incorrect” and “unhelpful” explanations can degrade performance as compared to no-explanation games. Our experiments, therefore, demonstrate that ExAG is an effective means to evaluate the efficacy of AI-generated explanations on a human-AI collaborative task. |
Tasks | Image Retrieval, Question Answering, Visual Question Answering |
Published | 2019-04-05 |
URL | https://arxiv.org/abs/1904.03285v4 |
https://arxiv.org/pdf/1904.03285v4.pdf | |
PWC | https://paperswithcode.com/paper/lucid-explanations-help-using-a-human-ai |
Repo | |
Framework | |
A New Analysis of Differential Privacy’s Generalization Guarantees
Title | A New Analysis of Differential Privacy’s Generalization Guarantees |
Authors | Christopher Jung, Katrina Ligett, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, Moshe Shenfeld |
Abstract | We give a new proof of the “transfer theorem” underlying adaptive data analysis: that any mechanism for answering adaptively chosen statistical queries that is differentially private and sample-accurate is also accurate out-of-sample. Our new proof is elementary and gives structural insights that we expect will be useful elsewhere. We show: 1) that differential privacy ensures that the expectation of any query on the posterior distribution on datasets induced by the transcript of the interaction is close to its true value on the data distribution, and 2) sample accuracy on its own ensures that any query answer produced by the mechanism is close to its posterior expectation with high probability. This second claim follows from a thought experiment in which we imagine that the dataset is resampled from the posterior distribution after the mechanism has committed to its answers. The transfer theorem then follows by summing these two bounds, and in particular, avoids the “monitor argument” used to derive high probability bounds in prior work. An upshot of our new proof technique is that the concrete bounds we obtain are substantially better than the best previously known bounds, even though the improvements are in the constants, rather than the asymptotics (which are known to be tight). As we show, our new bounds outperform the naive “sample-splitting” baseline at dramatically smaller dataset sizes compared to the previous state of the art, bringing techniques from this literature closer to practicality. |
Tasks | |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.03577v1 |
https://arxiv.org/pdf/1909.03577v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-analysis-of-differential-privacys |
Repo | |
Framework | |
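The two claims in the abstract combine by a triangle inequality. In our own schematic notation (not the paper's), with $a$ the mechanism's answer to an adaptively chosen query $q$, $P$ the data distribution, and $Q$ the posterior over datasets induced by the transcript:

```latex
% Schematic reading of the transfer-theorem proof structure.
\[
  \underbrace{\lvert a - q(P) \rvert}_{\text{out-of-sample error}}
  \;\le\;
  \underbrace{\lvert a - \mathbb{E}_{S \sim Q}[q(S)] \rvert}_{\text{claim 2: sample accuracy}}
  \;+\;
  \underbrace{\lvert \mathbb{E}_{S \sim Q}[q(S)] - q(P) \rvert}_{\text{claim 1: differential privacy}}
\]
```

Summing the two bounds yields the transfer theorem directly, which is how the proof avoids the monitor argument of prior work.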
Reversible Privacy Preservation using Multi-level Encryption and Compressive Sensing
Title | Reversible Privacy Preservation using Multi-level Encryption and Compressive Sensing |
Authors | Mehmet Yamac, Mete Ahishali, Nikolaos Passalis, Jenni Raitoharju, Bulent Sankur, Moncef Gabbouj |
Abstract | Security monitoring via ubiquitous cameras, and its wider deployment in intelligent buildings, stands to gain from advances in signal processing and machine learning. While these innovative and ground-breaking applications can be considered a boon, at the same time they raise significant privacy concerns. In fact, recent GDPR (General Data Protection Regulation) legislation has highlighted these concerns and become an incentive for privacy-preserving solutions. Typical privacy-preserving video monitoring schemes address these concerns by anonymizing the sensitive data. However, these approaches suffer from some limitations, since they are usually non-reversible, do not provide multiple levels of decryption, and are computationally costly. In this paper, we provide a novel privacy-preserving method, which is reversible, supports de-identification at multiple privacy levels, and can efficiently perform data acquisition, encryption and data hiding by combining multi-level encryption with compressive sensing. The effectiveness of the proposed approach in protecting the identity of the users has been validated using the goodness of reconstruction quality and strong anonymization of the faces. |
Tasks | Compressive Sensing |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08713v1 |
https://arxiv.org/pdf/1906.08713v1.pdf | |
PWC | https://paperswithcode.com/paper/reversible-privacy-preservation-using-multi |
Repo | |
Framework | |
Making sense of sensory input
Title | Making sense of sensory input |
Authors | Richard Evans, Jose Hernandez-Orallo, Johannes Welbl, Pushmeet Kohli, Marek Sergot |
Abstract | This paper attempts to answer a central question in unsupervised learning: what does it mean to “make sense” of a sensory sequence? In our formalization, making sense involves constructing a symbolic causal theory that explains the sensory sequence and satisfies a set of unity conditions. This model was inspired by Kant’s discussion of the synthetic unity of apperception in the Critique of Pure Reason. On our account, making sense of sensory input is a type of program synthesis, but it is unsupervised program synthesis. Our second contribution is a computer implementation, the Apperception Engine, that was designed to satisfy the above requirements. Our system is able to produce interpretable human-readable causal theories from very small amounts of data, because of the strong inductive bias provided by the Kantian unity constraints. A causal theory produced by our system is able to predict future sensor readings, as well as retrodict earlier readings, and “impute” (fill in the blanks of) missing sensory readings, in any combination. We tested the engine in a variety of domains, including cellular automata, rhythms and simple nursery tunes, multi-modal binding problems, occlusion tasks, and sequence induction IQ tests. In each domain, we test our engine’s ability to predict future sensor values, retrodict earlier sensor values, and impute missing sensory data. The Apperception Engine performs well in all these domains, significantly outperforming neural net baselines. We note in particular that in the sequence induction IQ tasks, our system achieved human-level performance. This is notable because our system is not a bespoke system designed specifically to solve IQ tasks, but a general purpose apperception system that was designed to make sense of any sensory sequence. |
Tasks | Program Synthesis |
Published | 2019-10-05 |
URL | https://arxiv.org/abs/1910.02227v1 |
https://arxiv.org/pdf/1910.02227v1.pdf | |
PWC | https://paperswithcode.com/paper/making-sense-of-sensory-input |
Repo | |
Framework | |
Real-Time Privacy-Preserving Data Release for Smart Meters
Title | Real-Time Privacy-Preserving Data Release for Smart Meters |
Authors | Mohammadhadi Shateri, Francisco Messina, Pablo Piantanida, Fabrice Labeau |
Abstract | Smart Meters (SMs) are a fundamental component of smart grids, but they carry sensitive information about users such as occupancy status of houses and therefore, they have raised serious concerns about leakage of consumers’ private information. In particular, we focus on real-time privacy threats, i.e., potential attackers that try to infer sensitive data from SM-reported data in an online fashion. We adopt an information-theoretic privacy measure and show that it effectively limits the performance of any real-time attacker. Using this privacy measure, we propose a general formulation to design a privatization mechanism that can provide a target level of privacy by adding a minimal amount of distortion to the SM measurements. On the other hand, to cope with different applications, a flexible distortion measure is considered. This formulation leads to a general loss function, which is optimized using a deep learning adversarial framework, where two neural networks, referred to as the releaser and the adversary, are trained with opposite goals. An exhaustive empirical study is then performed to validate the performances of the proposed approach for the occupancy detection privacy problem, assuming the attacker has either limited or full access to the training dataset. |
Tasks | Time Series |
Published | 2019-06-14 |
URL | https://arxiv.org/abs/1906.06427v2 |
https://arxiv.org/pdf/1906.06427v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-recurrent-adversarial-learning-for |
Repo | |
Framework | |
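The opposing objectives of the releaser and the adversary can be made concrete with a small sketch (function names and the trade-off form are our assumptions; the paper's loss uses its own information-theoretic measure and distortion term): the adversary minimizes its prediction loss on the sensitive attribute, while the releaser trades distortion of the released measurements against inflating that same loss.

```python
import numpy as np

def adversary_nll(p_pred, y_true):
    """Cross-entropy of the adversary's occupancy predictions; the
    adversary is trained to minimize this quantity."""
    p = np.clip(p_pred, 1e-12, 1 - 1e-12)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

def releaser_loss(distortion, adv_nll, lam=1.0):
    """Loss minimized by the releaser: keep distortion of the released
    signal low while *maximizing* the adversary's loss (privacy term).
    lam trades utility against privacy."""
    return distortion - lam * adv_nll
```

Training alternates gradient steps on the two networks against these objectives, which is the standard adversarial-learning recipe the abstract describes.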
Few-Shot Learning-Based Human Activity Recognition
Title | Few-Shot Learning-Based Human Activity Recognition |
Authors | Siwei Feng, Marco F. Duarte |
Abstract | Few-shot learning is a technique to learn a model with a very small amount of labeled training data by transferring knowledge from relevant tasks. In this paper, we propose a few-shot learning method for wearable-sensor-based human activity recognition, a technique that seeks high-level human activity knowledge from low-level sensor inputs. Due to the high costs to obtain human generated activity data and the ubiquitous similarities between activity modes, it can be more efficient to borrow information from existing activity recognition models than to collect more data to train a new model from scratch when only a few samples are available for model training. The proposed few-shot human activity recognition method leverages a deep learning model for feature extraction and classification while knowledge transfer is performed in the manner of model parameter transfer. In order to alleviate negative transfer, we propose a metric to measure cross-domain class-wise relevance so that knowledge of higher relevance is assigned larger weights during knowledge transfer. Promising results in extensive experiments show the advantages of the proposed approach. |
Tasks | Activity Recognition, Few-Shot Learning, Human Activity Recognition, Transfer Learning |
Published | 2019-03-25 |
URL | http://arxiv.org/abs/1903.10416v1 |
http://arxiv.org/pdf/1903.10416v1.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-learning-based-human-activity |
Repo | |
Framework | |
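The relevance-weighted transfer step described above can be sketched as follows. This is a schematic reading, not the paper's exact formula: source classes judged more related to a target class (higher cross-domain class-wise relevance) contribute more to the transferred parameters, which is what limits negative transfer.

```python
import numpy as np

def transfer_weights(relevance_scores, source_params):
    """Combine source-model parameters for one target class as a
    relevance-weighted average.

    relevance_scores: (K,) cross-domain relevance of each source class.
    source_params: (K, ...) per-source-class parameter arrays.
    """
    w = np.exp(relevance_scores - np.max(relevance_scores))
    w = w / w.sum()                      # softmax over source classes
    return np.tensordot(w, np.asarray(source_params), axes=1)
```

With uniform relevance this reduces to a plain average; a strongly dominant relevance score makes the transfer copy that single source class almost exactly.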
Automatic text summarization: What has been done and what has to be done
Title | Automatic text summarization: What has been done and what has to be done |
Authors | Abdelkrime Aries, Djamel eddine Zegour, Walid Khaled Hidouci |
Abstract | Summaries are important when it comes to process huge amounts of information. Their most important benefit is saving time, which we do not have much nowadays. Therefore, a summary must be short, representative and readable. Generating summaries automatically can be beneficial for humans, since it can save time and help selecting relevant documents. Automatic summarization and, in particular, Automatic text summarization (ATS) is not a new research field; It was known since the 50s. Since then, researchers have been active to find the perfect summarization method. In this article, we will discuss different works in automatic summarization, especially the recent ones. We will present some problems and limits which prevent works to move forward. Most of these challenges are much more related to the nature of processed languages. These challenges are interesting for academics and developers, as a path to follow in this field. |
Tasks | Text Summarization |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.00688v1 |
http://arxiv.org/pdf/1904.00688v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-text-summarization-what-has-been |
Repo | |
Framework | |
Information search in a professional context - exploring a collection of professional search tasks
Title | Information search in a professional context - exploring a collection of professional search tasks |
Authors | Suzan Verberne, Jiyin He, Gineke Wiggers, Tony Russell-Rose, Udo Kruschwitz, Arjen P. de Vries |
Abstract | Search conducted in a work context is an everyday activity that has been around since long before the Web was invented, yet we still seem to understand little about its general characteristics. With this paper we aim to contribute to a better understanding of this large but rather multi-faceted area of 'professional search'. Unlike task-based studies that aim at measuring the effectiveness of search methods, we chose to take a step back by conducting a survey among professional searchers to understand their typical search tasks. By doing so we offer complementary insights into the subject area. We asked our respondents to provide actual search tasks they have worked on, information about how these were conducted and details on how successful they eventually were. We then manually coded the collection of 56 search tasks with task characteristics and relevance criteria, and used the coded dataset for exploration purposes. Despite the relatively small scale of this study, our data provides enough evidence that professional search is indeed very different from Web search in many key respects and that this is a field that offers many avenues for future research. |
Tasks | |
Published | 2019-05-11 |
URL | https://arxiv.org/abs/1905.04577v1 |
https://arxiv.org/pdf/1905.04577v1.pdf | |
PWC | https://paperswithcode.com/paper/information-search-in-a-professional-context |
Repo | |
Framework | |
Image based Eye Gaze Tracking and its Applications
Title | Image based Eye Gaze Tracking and its Applications |
Authors | Anjith George |
Abstract | Eye movements play a vital role in perceiving the world. Eye gaze can give a direct indication of the user's point of attention, which can be useful in improving human-computer interaction. Gaze estimation in a non-intrusive manner can make human-computer interaction more natural. Eye tracking can be used for several applications such as fatigue detection, biometric authentication, disease diagnosis, activity recognition, alertness level estimation, gaze-contingent display, human-computer interaction, etc. Even though eye-tracking technology has been around for many decades, it has not found much use in consumer applications. The main reasons are the high cost of eye tracking hardware and the lack of consumer-level applications. In this work, we attempt to address these two issues. In the first part of this work, image-based algorithms are developed for gaze tracking, which includes a new two-stage iris center localization algorithm. We have developed a new algorithm which works in challenging conditions such as motion blur, glint, and varying illumination levels. A person-independent gaze direction classification framework using a convolutional neural network is also developed, which eliminates the requirement of user-specific calibration. In the second part of this work, we have developed two applications which can benefit from eye tracking data. A new framework for biometric identification based on eye movement parameters is developed. A framework for activity recognition, using gaze data from a head-mounted eye tracker, is also developed. The information from gaze data, ego-motion, and visual features are integrated to classify the activities. |
Tasks | Activity Recognition, Calibration, Eye Tracking, Gaze Estimation |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04325v1 |
https://arxiv.org/pdf/1907.04325v1.pdf | |
PWC | https://paperswithcode.com/paper/image-based-eye-gaze-tracking-and-its |
Repo | |
Framework | |
Linearized ADMM and Fast Nonlocal Denoising for Efficient Plug-and-Play Restoration
Title | Linearized ADMM and Fast Nonlocal Denoising for Efficient Plug-and-Play Restoration |
Authors | Unni V. S., Sanjay Ghosh, Kunal N. Chaudhury |
Abstract | In plug-and-play image restoration, the regularization is performed using powerful denoisers such as nonlocal means (NLM) or BM3D. This is done within the framework of alternating direction method of multipliers (ADMM), where the regularization step is formally replaced by an off-the-shelf denoiser. Each plug-and-play iteration involves the inversion of the forward model followed by a denoising step. In this paper, we present a couple of ideas for improving the efficiency of the inversion and denoising steps. First, we propose to use linearized ADMM, which generally allows us to perform the inversion at a lower cost than standard ADMM. Moreover, we can easily incorporate hard constraints into the optimization framework as a result. Second, we develop a fast algorithm for doubly stochastic NLM, originally proposed by Sreehari et al. (IEEE TCI, 2016), which is about 80x faster than brute-force computation. This particular denoiser can be expressed as the proximal map of a convex regularizer and, as a consequence, we can guarantee convergence for linearized plug-and-play ADMM. We demonstrate the effectiveness of our proposals for super-resolution and single-photon imaging. |
Tasks | Denoising, Image Restoration, Super-Resolution |
Published | 2019-01-18 |
URL | http://arxiv.org/abs/1901.06110v1 |
http://arxiv.org/pdf/1901.06110v1.pdf | |
PWC | https://paperswithcode.com/paper/linearized-admm-and-fast-nonlocal-denoising |
Repo | |
Framework | |
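The structure of the linearized plug-and-play iteration can be sketched on a 1-D toy problem. This is a sketch of the scheme's shape under our own simplifications, not the paper's exact algorithm: the x-update takes a single gradient-style step instead of inverting the forward model, and a cheap smoother stands in for the NLM/BM3D denoiser in the z-update.

```python
import numpy as np

def box_blur(x, k=3):
    """Cheap separable smoothing, a stand-in for an off-the-shelf
    denoiser such as NLM or BM3D."""
    return np.convolve(x, np.ones(k) / k, mode="same")

def pnp_linearized_admm(y, H, n_iter=50, rho=1.0, mu=None):
    """Plug-and-play restoration of a 1-D signal from y = H x + noise.

    ADMM splitting of (1/2)||Hx - y||^2 + g(x) with g handled by the
    plugged-in denoiser; the x-update is linearized so no matrix
    inversion is needed.
    """
    n = H.shape[1]
    if mu is None:
        mu = np.linalg.norm(H, 2) ** 2 + rho   # step bound for stability
    x = H.T @ y
    z = x.copy()
    u = np.zeros(n)
    for _ in range(n_iter):
        grad = H.T @ (H @ x - y) + rho * (x - z + u)
        x = x - grad / mu                      # linearized x-update
        z = box_blur(x + u)                    # denoising z-update
        u = u + x - z                          # dual ascent
    return z
```

Replacing the blur with a symmetrized (doubly stochastic) NLM, as the paper proposes, makes the z-update a proximal map of a convex regularizer, which is what enables the convergence guarantee.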
On Automating Conversations
Title | On Automating Conversations |
Authors | Ting-Hao ‘Kenneth’ Huang |
Abstract | From 2016 to 2018, we developed and deployed Chorus, a system that blends real-time human computation with artificial intelligence (AI) and has real-world, open conversations with users. We took a top-down approach that started with a working crowd-powered system, Chorus, and then created a framework, Evorus, that enables Chorus to automate itself over time. Over our two-year deployment, more than 420 users talked with Chorus, having over 2,200 conversation sessions. This line of work demonstrated how a crowd-powered conversational assistant can be automated over time, and more importantly, how such a system can be deployed to talk with real users to help them with their everyday tasks. This position paper discusses two sets of challenges that we explored during the development and deployment of Chorus and Evorus: the challenges that come from being an “agent” and those that arise from the subset of conversations that are more difficult to automate. |
Tasks | |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09621v3 |
https://arxiv.org/pdf/1910.09621v3.pdf | |
PWC | https://paperswithcode.com/paper/on-automating-conversations |
Repo | |
Framework | |