Paper Group ANR 433
Neural Game Engine: Accurate learning of generalizable forward models from pixels. AdvMS: A Multi-source Multi-cost Defense Against Adversarial Attacks. Content Adaptive and Error Propagation Aware Deep Video Compression. Right for the Wrong Scientific Reasons: Revising Deep Networks by Interacting with their Explanations. A Unified End-to-End Framework for Efficient Deep Image Compression. Can’t Boil This Frog: Robustness of Online-Trained Autoencoder-Based Anomaly Detectors to Adversarial Poisoning Attacks. A Comprehensive Study on Temporal Modeling for Online Action Detection. Adversarial Deepfakes: Evaluating Vulnerability of Deepfake Detectors to Adversarial Examples. Dynamic Error-bounded Lossy Compression (EBLC) to Reduce the Bandwidth Requirement for Real-time Vision-based Pedestrian Safety Applications. A Neural Approach to Discourse Relation Signal Detection. AI safety: state of the field through quantitative lens. A Foreground-background Parallel Compression with Residual Encoding for Surveillance Video. Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach. Non-asymptotic and Accurate Learning of Nonlinear Dynamical Systems. Unsupervised Word Polysemy Quantification with Multiresolution Grids of Contextual Embeddings.
Neural Game Engine: Accurate learning of generalizable forward models from pixels
Title | Neural Game Engine: Accurate learning of generalizable forward models from pixels |
Authors | Chris Bamford, Simon Lucas |
Abstract | Access to a fast and easily copied forward model of a game is essential for model-based reinforcement learning and for algorithms such as Monte Carlo tree search, and is also beneficial as a source of unlimited experience data for model-free algorithms. Learning forward models is therefore an interesting and important challenge for problems where a model is not available. Building upon previous work on the Neural GPU, this paper introduces the Neural Game Engine as a way to learn models directly from pixels. The learned models are able to generalise to game levels of different sizes from the ones they were trained on, without loss of accuracy. Results on 10 deterministic General Video Game AI games demonstrate competitive performance, with many of the game models being learned perfectly in terms of both pixel predictions and reward predictions. The pre-trained models are exposed through the OpenAI Gym interface and are publicly available for future research at https://github.com/Bam4d/Neural-Game-Engine |
Tasks | |
Published | 2020-03-23 |
URL | https://arxiv.org/abs/2003.10520v2 |
PDF | https://arxiv.org/pdf/2003.10520v2.pdf |
PWC | https://paperswithcode.com/paper/neural-game-engine-accurate-learning |
Repo | |
Framework | |
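For orientation, here is a minimal usage sketch for the pre-trained models via the OpenAI Gym interface mentioned in the abstract. The environment id below is an assumption made for illustration; consult the linked repository (https://github.com/Bam4d/Neural-Game-Engine) for the actual registered names.

```python
# Hedged usage sketch: drive a learned forward model as a Gym environment.
# The id "NGE-sokoban-v0" is hypothetical; see the repo for the real
# registration names of the 10 GVGAI game models.
import gym

env = gym.make("NGE-sokoban-v0")   # assumed id for one learned game model
obs = env.reset()
done, total_reward = False, 0.0
while not done:
    action = env.action_space.sample()           # random policy, illustration only
    obs, reward, done, info = env.step(action)   # model predicts next pixels + reward
    total_reward += reward
print("episode reward:", total_reward)
```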
AdvMS: A Multi-source Multi-cost Defense Against Adversarial Attacks
Title | AdvMS: A Multi-source Multi-cost Defense Against Adversarial Attacks |
Authors | Xiao Wang, Siyue Wang, Pin-Yu Chen, Xue Lin, Peter Chin |
Abstract | Designing effective defenses against adversarial attacks is a crucial topic, as deep neural networks have proliferated rapidly in many security-critical domains such as malware detection and self-driving cars. Conventional defense methods, although promising, are largely limited by their single-source, single-cost nature: the robustness gain tends to plateau as the defenses are made increasingly stronger, while the cost continues to grow. In this paper, we study principles for designing multi-source, multi-cost schemes in which defense performance is boosted by multiple defending components. Based on this motivation, we propose a multi-source, multi-cost defense scheme, Adversarially Trained Model Switching (AdvMS), that inherits advantages from two leading schemes: adversarial training and random model switching. We show that the multi-source nature of AdvMS mitigates the performance-plateauing issue, and that its multi-cost nature enables improving robustness at a flexible, adjustable combination of costs over different factors, which can better suit specific restrictions and needs in practice. |
Tasks | Malware Detection, Self-Driving Cars |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.08439v1 |
PDF | https://arxiv.org/pdf/2002.08439v1.pdf |
PWC | https://paperswithcode.com/paper/advms-a-multi-source-multi-cost-defense |
Repo | |
Framework | |
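For intuition, the sketch below shows one way the two sources named in the abstract, adversarial training and random model switching, could be combined at inference time. It is a simplified reading of the scheme, not the authors' implementation; the model list and per-query switching granularity are assumptions.

```python
# Minimal sketch of multi-source defense by random model switching.
# Each member model is assumed to have been adversarially trained beforehand.
import random
import torch

class AdvMSEnsemble(torch.nn.Module):
    def __init__(self, models):
        super().__init__()
        self.models = torch.nn.ModuleList(models)  # adversarially trained members

    def forward(self, x):
        model = random.choice(self.models)  # random switching per query
        return model(x)
```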
Content Adaptive and Error Propagation Aware Deep Video Compression
Title | Content Adaptive and Error Propagation Aware Deep Video Compression |
Authors | Guo Lu, Chunlei Cai, Xiaoyun Zhang, Li Chen, Wanli Ouyang, Dong Xu, Zhiyong Gao |
Abstract | Recently, learning-based video compression methods have attracted increasing attention. However, previous works suffer from error propagation due to the accumulation of reconstruction error in inter-predictive coding, and previous learning-based video codecs are also not adaptive to different video contents. To address these two problems, we propose a content adaptive and error propagation aware video compression system. Specifically, our method employs a joint training strategy that considers the compression performance of multiple consecutive frames instead of a single frame. Based on the learned long-term temporal information, our approach effectively alleviates error propagation in reconstructed frames. More importantly, instead of using the hand-crafted coding modes of traditional compression systems, we design an online encoder-updating scheme: the encoder's parameters are updated according to the rate-distortion criterion, while the decoder is kept unchanged at inference. The encoder is therefore adaptive to different video contents and achieves better compression performance by reducing the domain gap between the training and testing data. Our method is simple yet effective and outperforms state-of-the-art learning-based video codecs on benchmark datasets without increasing the model size or decreasing the decoding speed. |
Tasks | Video Compression |
Published | 2020-03-25 |
URL | https://arxiv.org/abs/2003.11282v1 |
PDF | https://arxiv.org/pdf/2003.11282v1.pdf |
PWC | https://paperswithcode.com/paper/content-adaptive-and-error-propagation-aware |
Repo | |
Framework | |
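The online encoder-updating scheme lends itself to a short sketch: optimize only the encoder on the rate-distortion loss for the content at hand, leaving the decoder untouched so the bitstream remains decodable. The encoder's (latents, rate) interface, the loss weighting, and the step count below are assumptions, not the paper's exact settings.

```python
# Hedged sketch of online encoder updating at inference time.
import torch

def online_update(encoder, decoder, frames, lmbda=256.0, steps=10, lr=1e-5):
    for p in decoder.parameters():
        p.requires_grad_(False)                    # decoder stays unchanged
    opt = torch.optim.Adam(encoder.parameters(), lr=lr)
    for _ in range(steps):
        latents, rate = encoder(frames)            # rate: estimated bits (assumed API)
        recon = decoder(latents)
        distortion = torch.mean((recon - frames) ** 2)
        loss = rate + lmbda * distortion           # rate-distortion criterion
        opt.zero_grad(); loss.backward(); opt.step()
```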
Right for the Wrong Scientific Reasons: Revising Deep Networks by Interacting with their Explanations
Title | Right for the Wrong Scientific Reasons: Revising Deep Networks by Interacting with their Explanations |
Authors | Patrick Schramowski, Wolfgang Stammer, Stefano Teso, Anna Brugger, Xiaoting Shao, Hans-Georg Luigs, Anne-Katrin Mahlein, Kristian Kersting |
Abstract | Deep neural networks have shown excellent performance in many real-world applications. Unfortunately, they may show “Clever Hans”-like behavior, exploiting confounding factors within datasets to achieve high performance. In this work we introduce the novel learning setting of “explanatory interactive learning” (XIL) and illustrate its benefits on a plant phenotyping research task. XIL adds the scientist into the training loop so that she interactively revises the original model by providing feedback on its explanations. Our experimental results demonstrate that XIL can help avoid Clever Hans moments in machine learning and encourages (or discourages, if appropriate) trust in the underlying model. |
Tasks | |
Published | 2020-01-15 |
URL | https://arxiv.org/abs/2001.05371v2 |
PDF | https://arxiv.org/pdf/2001.05371v2.pdf |
PWC | https://paperswithcode.com/paper/right-for-the-wrong-scientific-reasons |
Repo | |
Framework | |
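One concrete way to "revise a model via feedback on its explanations" is a right-for-the-right-reasons style penalty on input gradients inside regions the scientist marks as confounding. The sketch below illustrates that idea under stated assumptions; the paper's actual feedback mechanism may differ.

```python
# Hedged sketch: cross-entropy plus an input-gradient penalty on pixels the
# user marked as irrelevant (mask = 1 on confounding regions).
import torch
import torch.nn.functional as F

def xil_loss(model, x, y, mask, lam=10.0):
    x = x.detach().clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)
    grads = torch.autograd.grad(logits.logsumexp(dim=1).sum(), x,
                                create_graph=True)[0]
    penalty = ((mask * grads) ** 2).sum()   # discourage relying on marked pixels
    return ce + lam * penalty
```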
A Unified End-to-End Framework for Efficient Deep Image Compression
Title | A Unified End-to-End Framework for Efficient Deep Image Compression |
Authors | Jiaheng Liu, Guo Lu, Zhihao Hu, Dong Xu |
Abstract | Image compression is a widely used technique to reduce the spatial redundancy in images. Recently, learning-based image compression has achieved significant progress by exploiting the powerful representation ability of neural networks. However, the current state-of-the-art learning-based methods suffer from high computational complexity, which limits their practical applicability. In this paper, we propose a unified framework called Efficient Deep Image Compression (EDIC) based on three new technologies: a channel attention module, a Gaussian mixture model and a decoder-side enhancement module. Specifically, we design an auto-encoder style network for learning-based image compression. To improve coding efficiency, we exploit the channel relationships between latent representations using the channel attention module. Besides, the Gaussian mixture model is introduced as the entropy model and improves the accuracy of bitrate estimation. Furthermore, we introduce the decoder-side enhancement module to further improve image compression performance. Our EDIC method can also be readily incorporated into the Deep Video Compression (DVC) framework to further improve video compression performance. EDIC boosts coding performance significantly while only slightly increasing computational complexity. More importantly, experimental results demonstrate that the proposed approach outperforms current state-of-the-art image compression methods and is up to 150 times faster in terms of decoding speed when compared with Minnen’s method. The proposed framework also successfully improves the performance of the recent deep video compression system DVC. |
Tasks | Image Compression, Video Compression |
Published | 2020-02-09 |
URL | https://arxiv.org/abs/2002.03370v2 |
PDF | https://arxiv.org/pdf/2002.03370v2.pdf |
PWC | https://paperswithcode.com/paper/a-unified-end-to-end-framework-for-efficient |
Repo | |
Framework | |
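Of the three components, the channel attention module is the easiest to sketch. Below is a standard squeeze-and-excitation style block as one plausible realization of "exploiting the channel relationships between latent representations"; the paper's exact architecture is not reproduced here and the reduction ratio is an assumption.

```python
# Hedged sketch of channel attention over a latent tensor.
import torch

class ChannelAttention(torch.nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = torch.nn.Sequential(
            torch.nn.Linear(channels, channels // reduction),
            torch.nn.ReLU(inplace=True),
            torch.nn.Linear(channels // reduction, channels),
            torch.nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (N, C, H, W) latents
        w = x.mean(dim=(2, 3))                 # squeeze: global average pool
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)
        return x * w                           # excite: reweight channels
```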
Can’t Boil This Frog: Robustness of Online-Trained Autoencoder-Based Anomaly Detectors to Adversarial Poisoning Attacks
Title | Can’t Boil This Frog: Robustness of Online-Trained Autoencoder-Based Anomaly Detectors to Adversarial Poisoning Attacks |
Authors | Moshe Kravchik, Asaf Shabtai |
Abstract | In recent years, a variety of effective neural network-based methods for anomaly and cyber attack detection in industrial control systems (ICSs) have been demonstrated in the literature. Given their successful implementation and widespread use, there is a need to study adversarial attacks on such detection methods to better protect the systems that depend upon them. The extensive research on adversarial attacks against image and malware classification has little relevance to the physical system state prediction domain, to which most ICS attack detection systems belong. Moreover, such detection systems are typically retrained using new data collected from the monitored system, so the threat of adversarial data poisoning is significant; however, this threat has not yet been addressed by the research community. In this paper, we present the first study focused on poisoning attacks on online-trained autoencoder-based attack detectors. We propose two algorithms for generating poison samples, an interpolation-based algorithm and a back-gradient optimization-based algorithm, which we evaluate on both synthetic and real-world ICS data. We demonstrate that the proposed algorithms can generate poison samples that cause the target attack to go undetected by the autoencoder detector; however, the ability to poison the detector is limited to a small set of attack types and magnitudes. When the poison-generating algorithms are applied to the popular SWaT dataset, we show that the autoencoder detector trained on the physical system state data is resilient to poisoning in the face of all ten of the relevant attacks in the dataset. This finding suggests that neural network-based attack detectors used in the cyber-physical domain are more robust to poisoning than those in other problem domains, such as malware detection and image processing. |
Tasks | Cyber Attack Detection, data poisoning, Malware Classification, Malware Detection |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.02741v1 |
PDF | https://arxiv.org/pdf/2002.02741v1.pdf |
PWC | https://paperswithcode.com/paper/cant-boil-this-frog-robustness-of-online |
Repo | |
Framework | |
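The interpolation-based algorithm can be pictured as a schedule of training batches that drift from benign data toward the attack signal, so the online-retrained autoencoder gradually learns to reconstruct, and therefore not flag, the attack. A minimal sketch, with the linear schedule and step count as assumptions:

```python
# Hedged sketch of interpolation-based poison generation.
import numpy as np

def interpolation_poison(benign, attack, n_steps=10):
    """Yield poison batches interpolated between benign and attack signals."""
    for alpha in np.linspace(0.0, 1.0, n_steps):
        yield (1.0 - alpha) * benign + alpha * attack  # drift toward attack
```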
A Comprehensive Study on Temporal Modeling for Online Action Detection
Title | A Comprehensive Study on Temporal Modeling for Online Action Detection |
Authors | Wen Wang, Xiaojiang Peng, Yu Qiao, Jian Cheng |
Abstract | Online action detection (OAD) is a practical yet challenging task that has attracted increasing attention in recent years. A typical OAD system consists of three modules: a frame-level feature extractor, usually based on pre-trained deep convolutional neural networks (CNNs); a temporal modeling module; and an action classifier. Among them, the temporal modeling module, which aggregates discriminative information from historical and current features, is crucial. Although many temporal modeling methods have been developed for OAD and other topics, their effects on OAD have not been investigated fairly. This paper provides a comprehensive study of temporal modeling for OAD covering four meta types of temporal modeling methods, i.e., temporal pooling, temporal convolution, recurrent neural networks, and temporal attention, and uncovers good practices for producing a state-of-the-art OAD system. Many of these methods are explored in OAD for the first time and are extensively evaluated with various hyperparameters. Furthermore, based on our comprehensive study, we present several hybrid temporal modeling methods that outperform recent state-of-the-art methods by sizable margins on THUMOS-14 and TVSeries. |
Tasks | Action Detection |
Published | 2020-01-21 |
URL | https://arxiv.org/abs/2001.07501v1 |
PDF | https://arxiv.org/pdf/2001.07501v1.pdf |
PWC | https://paperswithcode.com/paper/a-comprehensive-study-on-temporal-modeling |
Repo | |
Framework | |
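As an example of one of the four meta types, the sketch below implements a simple temporal attention module that aggregates a window of historical frame features into a single vector for the classifier; the dimensions and scoring function are illustrative, not the paper's exact design.

```python
# Hedged sketch of temporal attention over a feature window.
import torch

class TemporalAttention(torch.nn.Module):
    def __init__(self, feat_dim):
        super().__init__()
        self.score = torch.nn.Linear(feat_dim, 1)

    def forward(self, feats):                            # feats: (N, T, D)
        alpha = torch.softmax(self.score(feats), dim=1)  # (N, T, 1) weights
        return (alpha * feats).sum(dim=1)                # (N, D) aggregate
```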
Adversarial Deepfakes: Evaluating Vulnerability of Deepfake Detectors to Adversarial Examples
Title | Adversarial Deepfakes: Evaluating Vulnerability of Deepfake Detectors to Adversarial Examples |
Authors | Paarth Neekhara, Shehzeen Hussain, Malhar Jere, Farinaz Koushanfar, Julian McAuley |
Abstract | Recent advances in video manipulation techniques have made the generation of fake videos more accessible than ever before. Manipulated videos can fuel disinformation and reduce trust in media. Therefore, the detection of fake videos has garnered immense interest in academia and industry. Recently developed Deepfake detection methods rely on deep neural networks (DNNs) to distinguish AI-generated fake videos from real videos. In this work, we demonstrate that it is possible to bypass such detectors by adversarially modifying fake videos synthesized using existing Deepfake generation methods. We further demonstrate that our adversarial perturbations are robust to image and video compression codecs, making them a real-world threat. We present pipelines in both white-box and black-box attack scenarios that can fool DNN-based Deepfake detectors into classifying fake videos as real. |
Tasks | DeepFake Detection, Face Swapping, Video Compression |
Published | 2020-02-09 |
URL | https://arxiv.org/abs/2002.12749v2 |
PDF | https://arxiv.org/pdf/2002.12749v2.pdf |
PWC | https://paperswithcode.com/paper/adversarial-deepfakes-evaluating |
Repo | |
Framework | |
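In the white-box setting, the core step can be sketched as a gradient-based perturbation that pushes each fake frame toward the detector's "real" class, shown here as a single FGSM step for clarity. The paper's attack, including its robustness to compression, is more involved, and the detector's two-logit output is an assumption.

```python
# Hedged sketch: one FGSM step against a frame-level deepfake detector.
import torch
import torch.nn.functional as F

def fgsm_toward_real(detector, frame, eps=2.0 / 255):
    frame = frame.detach().clone().requires_grad_(True)  # (1, C, H, W) in [0, 1]
    logits = detector(frame)                             # assumed: [fake, real] logits
    loss = F.cross_entropy(logits, torch.tensor([1]))    # target class: "real"
    loss.backward()
    adv = frame - eps * frame.grad.sign()                # descend toward "real"
    return adv.clamp(0, 1).detach()
```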
Dynamic Error-bounded Lossy Compression (EBLC) to Reduce the Bandwidth Requirement for Real-time Vision-based Pedestrian Safety Applications
Title | Dynamic Error-bounded Lossy Compression (EBLC) to Reduce the Bandwidth Requirement for Real-time Vision-based Pedestrian Safety Applications |
Authors | Mizanur Rahman, Mhafuzul Islam, Jon C. Calhoun, Mashrur Chowdhury |
Abstract | As camera quality improves and deployments move to areas with limited bandwidth, communication bottlenecks can impair the real-time constraints of an intelligent transportation system (ITS) application such as video-based real-time pedestrian detection. Video compression reduces the bandwidth required to transmit the video but degrades video quality, and as quality decreases, so does the accuracy of the vision-based pedestrian detection model. Furthermore, environmental conditions (e.g., rain and darkness) alter the compression ratio and can make maintaining high pedestrian detection accuracy more difficult. The objective of this study is to develop a real-time error-bounded lossy compression (EBLC) strategy that dynamically changes the video compression level under different environmental conditions in order to maintain high pedestrian detection accuracy. We conduct a case study to show the efficacy of our dynamic EBLC strategy for real-time vision-based pedestrian detection under adverse environmental conditions. Our strategy dynamically selects the error tolerances for lossy compression that maintain high detection accuracy across a representative set of environmental conditions. Analyses reveal that, for adverse environmental conditions, our strategy increases pedestrian detection accuracy by up to 14% and reduces the communication bandwidth by up to 14x compared to the same conditions without the dynamic EBLC strategy. Our dynamic EBLC strategy is also independent of detection models and environmental conditions, allowing other detection models and environmental conditions to be easily incorporated. |
Tasks | Pedestrian Detection, Video Compression |
Published | 2020-01-29 |
URL | https://arxiv.org/abs/2002.03742v1 |
PDF | https://arxiv.org/pdf/2002.03742v1.pdf |
PWC | https://paperswithcode.com/paper/dynamic-error-bounded-lossy-compression-eblc |
Repo | |
Framework | |
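The "dynamic" part of the strategy reduces to choosing an error tolerance per observed condition so that detection accuracy stays high. A minimal sketch with placeholder values; the actual tolerances in the paper are determined empirically:

```python
# Hedged sketch: per-condition error-bound selection for lossy compression.
ERROR_BOUNDS = {          # condition -> absolute error tolerance (placeholders)
    "clear_day": 1e-1,
    "rain": 1e-2,
    "night": 1e-3,
}

def select_error_bound(condition: str) -> float:
    # Unknown conditions fall back to the tightest (most conservative) bound.
    return ERROR_BOUNDS.get(condition, min(ERROR_BOUNDS.values()))
```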
A Neural Approach to Discourse Relation Signal Detection
Title | A Neural Approach to Discourse Relation Signal Detection |
Authors | Amir Zeldes, Yang Liu |
Abstract | Previous data-driven work investigating the types and distributions of discourse relation signals, including discourse markers such as ‘however’ or phrases such as ‘as a result’, has focused on the relative frequencies of signal words within and outside text from each discourse relation. Such approaches do not allow us to quantify the signaling strength of individual instances of a signal on a scale (e.g. more or less discourse-relevant instances of ‘and’), to assess the distribution of ambiguity for signals, or to identify words that hinder discourse relation identification in context (‘anti-signals’ or ‘distractors’). In this paper we present a data-driven approach to signal detection using a distantly supervised neural network and develop a metric, Delta s (or ‘delta-softmax’), to quantify signaling strength. Ranging between -1 and 1 and relying on recent advances in contextualized word embeddings, the metric represents each word’s positive or negative contribution to the identifiability of a relation in specific instances in context. Based on an English corpus annotated for discourse relations using Rhetorical Structure Theory and signal type annotations anchored to specific tokens, our analysis examines the reliability of the metric, the places where it overlaps with and differs from human judgments, and the implications for identifying features that neural models may need in order to perform better on automatic discourse relation classification. |
Tasks | Relation Classification |
Published | 2020-01-08 |
URL | https://arxiv.org/abs/2001.02380v2 |
PDF | https://arxiv.org/pdf/2001.02380v2.pdf |
PWC | https://paperswithcode.com/paper/a-neural-approach-to-discourse-relation |
Repo | |
Framework | |
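A delta-softmax style score can be sketched as the drop in the classifier's probability for the gold relation when one token is hidden, yielding values in [-1, 1] where positive marks a signal and negative a distractor. The token-masking procedure below is an assumption about the paper's exact setup.

```python
# Hedged sketch of a delta-softmax style signaling score.
import torch

def delta_softmax(model, tokens, gold_rel, i, mask_id):
    """Drop in P(gold relation) when the i-th token is masked; in [-1, 1]."""
    with torch.no_grad():                      # model assumed: (1, T) -> (1, R) logits
        p_full = torch.softmax(model(tokens), dim=-1)[0, gold_rel]
        masked = tokens.clone()
        masked[0, i] = mask_id                 # hide the i-th token
        p_mask = torch.softmax(model(masked), dim=-1)[0, gold_rel]
    return (p_full - p_mask).item()
```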
AI safety: state of the field through quantitative lens
Title | AI safety: state of the field through quantitative lens |
Authors | Mislav Juric, Agneza Sandic, Mario Brcic |
Abstract | The last decade has seen major improvements in the performance of artificial intelligence, which have driven wide-spread applications. Unforeseen effects of such mass adoption have put the notion of AI safety into the public eye. AI safety is a relatively new field of research focused on techniques for building AI that is beneficial for humans. While survey papers for the field of AI safety exist, there is a lack of a quantitative look at the research being conducted. The quantitative aspect gives a data-driven insight into emerging trends, knowledge gaps and potential areas for future research. In this paper, a bibliometric analysis of the literature finds a significant increase in research activity since 2015. Also, the field is so new that most technical issues remain open, including explainability with its long-term utility, and value alignment, which we have identified as the most important long-term research topic. Equally, there is a severe lack of research into concrete policies regarding AI. As we expect AI to be one of the main driving forces of change in society, AI safety is the field in which we need to decide the direction of humanity’s future. |
Tasks | |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.05671v1 |
PDF | https://arxiv.org/pdf/2002.05671v1.pdf |
PWC | https://paperswithcode.com/paper/ai-safety-state-of-the-field-through |
Repo | |
Framework | |
A Foreground-background Parallel Compression with Residual Encoding for Surveillance Video
Title | A Foreground-background Parallel Compression with Residual Encoding for Surveillance Video |
Authors | Lirong Wu, Kejie Huang, Haibin Shen, Lianli Gao |
Abstract | Data storage has been one of the bottlenecks in surveillance systems. Conventional video compression algorithms such as H.264 and H.265 do not fully exploit the low information density characteristic of surveillance video. In this paper, we propose a video compression method that extracts and compresses the foreground and background of the video separately. The compression ratio is greatly improved by sharing background information among multiple adjacent frames through an adaptive background updating and interpolation module. Besides, we present two different schemes for compressing the foreground and compare their performance in an ablation study to show the importance of temporal information for video compression. At the decoding end, a coarse-to-fine two-stage module composes the foreground and background and enhances frame quality. Furthermore, an adaptive sampling method for surveillance cameras is proposed, and its effects are shown through software simulation. Experimental results show that our proposed method requires 69.5% less bpp (bits per pixel) than the conventional algorithm H.265 to achieve the same PSNR (36 dB) on the HEVC dataset. |
Tasks | Video Compression |
Published | 2020-01-18 |
URL | https://arxiv.org/abs/2001.06590v1 |
PDF | https://arxiv.org/pdf/2001.06590v1.pdf |
PWC | https://paperswithcode.com/paper/a-foreground-background-parallel-compression |
Repo | |
Framework | |
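The background-sharing idea can be sketched as a slowly updated running background plus a foreground residual, which is all the codec then needs to transmit per frame; the update rule and threshold below are illustrative assumptions, not the paper's learned modules.

```python
# Hedged sketch of adaptive background updating with foreground residuals.
import numpy as np

def update_background(bg, frame, alpha=0.05, thresh=0.1):
    moving = np.abs(frame - bg) > thresh          # crude foreground mask
    bg = np.where(moving, bg, (1 - alpha) * bg + alpha * frame)  # update static areas
    residual = np.where(moving, frame - bg, 0.0)  # foreground residual to encode
    return bg, residual
```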
Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach
Title | Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach |
Authors | Md. Shirajum Munir, Sarder Fakhrul Abedin, Nguyen H. Tran, Zhu Han, Eui Nam Huh, Choong Seon Hong |
Abstract | In recent years, multi-access edge computing (MEC) has become a key enabler for handling the massive expansion of Internet of Things (IoT) applications and services. However, the energy consumption of a MEC network depends on volatile tasks, which makes energy demand estimation risky. As an energy supplier, a microgrid can facilitate seamless energy supply, but the risk associated with energy supply is also increased due to unpredictable energy generation from renewable and non-renewable sources. In particular, the risk of energy shortfall involves uncertainties in both energy consumption and generation. In this paper, we study a risk-aware energy scheduling problem for a microgrid-powered MEC network. First, we formulate an optimization problem considering the conditional value-at-risk (CVaR) measure for both energy consumption and generation, where the objective is to minimize the loss due to energy shortfall of the MEC network, and we show that this problem is NP-hard. Second, we analyze the formulated problem using a multi-agent stochastic game that ensures a joint-policy Nash equilibrium, and we show the convergence of the proposed model. Third, we derive the solution by applying a multi-agent deep reinforcement learning (MADRL)-based asynchronous advantage actor-critic (A3C) algorithm with shared neural networks. This method mitigates the curse of dimensionality of the state space and chooses the best policy among the agents for the proposed problem. Finally, experimental results establish a significant performance gain from considering CVaR for high-accuracy energy scheduling, compared with both single-agent and random-agent models. |
Tasks | |
Published | 2020-02-21 |
URL | https://arxiv.org/abs/2003.02157v1 |
PDF | https://arxiv.org/pdf/2003.02157v1.pdf |
PWC | https://paperswithcode.com/paper/risk-aware-energy-scheduling-for-edge |
Repo | |
Framework | |
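The CVaR measure at the heart of the formulation has a compact definition: the expected loss in the worst (1 - beta) tail of the loss distribution. A small numerical sketch:

```python
# CVaR of a sample of losses: mean loss beyond the value-at-risk quantile.
import numpy as np

def cvar(losses: np.ndarray, beta: float = 0.95) -> float:
    var = np.quantile(losses, beta)       # value-at-risk at level beta
    tail = losses[losses >= var]          # worst (1 - beta) tail
    return float(tail.mean())             # expected loss beyond VaR
```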
Non-asymptotic and Accurate Learning of Nonlinear Dynamical Systems
Title | Non-asymptotic and Accurate Learning of Nonlinear Dynamical Systems |
Authors | Yahya Sattar, Samet Oymak |
Abstract | We consider the problem of learning stabilizable systems governed by the nonlinear state equation $h_{t+1}=\phi(h_t,u_t;\theta)+w_t$. Here $\theta$ parameterizes the unknown system dynamics, $h_t$ is the state, $u_t$ is the input and $w_t$ is the additive noise vector. We study gradient-based algorithms for learning the system dynamics $\theta$ from samples obtained from a single finite trajectory. If the system is run with a stabilizing input policy, we show via a truncation argument based on mixing times that temporally dependent samples can be approximated by i.i.d. samples. We then develop new guarantees for the uniform convergence of the gradients of the empirical loss. Unlike existing work, our bounds are noise-sensitive, which allows for learning the ground-truth dynamics with high accuracy and small sample complexity. Together, our results facilitate efficient learning of general nonlinear systems under a stabilizing policy. We specialize our guarantees to entry-wise nonlinear activations and verify our theory in various numerical experiments. |
Tasks | |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.08538v1 |
PDF | https://arxiv.org/pdf/2002.08538v1.pdf |
PWC | https://paperswithcode.com/paper/non-asymptotic-and-accurate-learning-of |
Repo | |
Framework | |
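The setting specializes nicely to entry-wise activations. The sketch below instantiates $\phi(h,u;\theta)=\tanh(Ah+Bu)$ with $\theta=(A,B)$, simulates a single finite trajectory under a small (stabilizing) input policy, and fits $\theta$ by gradient descent on the one-step empirical loss; all sizes, noise levels and learning rates are illustrative.

```python
# Hedged sketch: learn theta = (A, B) from one trajectory of
# h_{t+1} = tanh(A h_t + B u_t) + w_t by gradient descent.
import torch

n, m, T = 4, 2, 500
A_true = 0.4 * torch.randn(n, n)
B_true = torch.randn(n, m)
h, H, U, Hn = torch.zeros(n), [], [], []
for _ in range(T):                              # single finite trajectory
    u = 0.1 * torch.randn(m)                    # small stabilizing input policy
    h_next = torch.tanh(A_true @ h + B_true @ u) + 0.01 * torch.randn(n)
    H.append(h); U.append(u); Hn.append(h_next); h = h_next
H, U, Hn = torch.stack(H), torch.stack(U), torch.stack(Hn)

A = torch.zeros(n, n, requires_grad=True)
B = torch.zeros(n, m, requires_grad=True)
opt = torch.optim.SGD([A, B], lr=0.1)
for _ in range(200):
    pred = torch.tanh(H @ A.T + U @ B.T)        # one-step predictions
    loss = ((pred - Hn) ** 2).mean()            # empirical risk
    opt.zero_grad(); loss.backward(); opt.step()
```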
Unsupervised Word Polysemy Quantification with Multiresolution Grids of Contextual Embeddings
Title | Unsupervised Word Polysemy Quantification with Multiresolution Grids of Contextual Embeddings |
Authors | Christos Xypolopoulos, Antoine J. -P. Tixier, Michalis Vazirgiannis |
Abstract | The number of senses of a given word, or polysemy, is a very subjective notion, which varies widely across annotators and resources. We propose a novel method to estimate polysemy based on simple geometry in the contextual embedding space. Our approach is fully unsupervised and purely data-driven. We show through rigorous experiments that our rankings correlate well (with strong statistical significance) with 6 different rankings derived from well-known human-constructed resources such as WordNet, OntoNotes, Oxford, Wikipedia, etc., for 6 different standard metrics. We also visualize and analyze the correlation between the human rankings. A valuable by-product of our method is the ability to sample, at no extra cost, sentences containing different senses of a given word. Finally, the fully unsupervised nature of our method makes it applicable to any language. Code and data are publicly available at https://github.com/ksipos/polysemy-assessment. |
Tasks | |
Published | 2020-03-23 |
URL | https://arxiv.org/abs/2003.10224v1 |
PDF | https://arxiv.org/pdf/2003.10224v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-word-polysemy-quantification |
Repo | |
Framework | |
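One simplified reading of the grid idea: discretize the contextual embedding space at several resolutions and score a word's polysemy by the fraction of cells its occurrences cover, averaged over levels. The sketch below follows that reading (in practice the embeddings would first be reduced to a few dimensions); consult the linked code for the actual method.

```python
# Hedged sketch of a multiresolution-grid polysemy score.
import numpy as np

def polysemy_score(embeddings: np.ndarray, levels=(1, 2, 3, 4)) -> float:
    """embeddings: (num_occurrences, d) contextual vectors for one word."""
    lo, hi = embeddings.min(0), embeddings.max(0) + 1e-9
    scores = []
    for level in levels:
        bins = 2 ** level                        # cells per dimension at this level
        cells = np.floor((embeddings - lo) / (hi - lo) * bins).astype(int)
        occupied = {tuple(c) for c in cells}     # distinct cells covered
        scores.append(len(occupied) / len(embeddings))
    return float(np.mean(scores))                # higher = more spread = more senses
```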