Paper Group ANR 1399
Bias-Reduced Hindsight Experience Replay with Virtual Goal Prioritization
Title | Bias-Reduced Hindsight Experience Replay with Virtual Goal Prioritization |
Authors | Binyamin Manela, Armin Biess |
Abstract | Hindsight Experience Replay (HER) is a multi-goal reinforcement learning algorithm for sparse reward functions. The algorithm treats every failure as a success for an alternative (virtual) goal that has been achieved in the episode. Virtual goals are randomly selected, irrespective of which are most instructive for the agent. In this paper, we present two improvements over the existing HER algorithm. First, we prioritize virtual goals from which the agent will learn more valuable information. We call this property the instructiveness of the virtual goal and define it by a heuristic measure, which expresses how well the agent will be able to generalize from that virtual goal to actual goals. Secondly, we reduce existing bias in HER by the removal of misleading samples. To test our algorithms, we built two challenging environments with sparse reward functions. Our empirical results in both environments show vast improvement in the final success rate and sample efficiency when compared to the original HER algorithm. A video showing experimental results is available at https://youtu.be/3cZwfK8Nfps . |
Tasks | Multi-Goal Reinforcement Learning |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05498v4 |
https://arxiv.org/pdf/1905.05498v4.pdf | |
PWC | https://paperswithcode.com/paper/bias-reduced-hindsight-experience-replay-with |
Repo | |
Framework | |
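The paper's two changes to HER are (i) sampling virtual goals by an "instructiveness" score instead of uniformly and (ii) removing misleading relabeled samples. Below is a minimal Python sketch of a prioritized relabeling step; the `instructiveness` heuristic (negative distance to the actual goal set), the softmax sampling, and the restriction to transitions up to the achievement step are illustrative assumptions, not the authors' exact definitions.

```python
import numpy as np

def instructiveness(virtual_goal, actual_goals):
    # Placeholder heuristic (not the paper's exact measure): a virtual goal is
    # scored higher the closer it lies to the actual training goals, on the
    # assumption that nearby goals generalize better.
    return -min(np.linalg.norm(virtual_goal - g) for g in actual_goals)

def relabel_episode(episode, actual_goals, k=4, rng=None):
    """Illustrative prioritized HER relabeling.

    episode: list of (state, action, next_state, achieved_goal) tuples.
    Returns transitions relabeled with the k most instructive achieved goals,
    using a sparse reward of 0 on success and -1 otherwise.
    """
    rng = rng or np.random.default_rng()
    achieved = np.array([t[3] for t in episode])
    scores = np.array([instructiveness(g, actual_goals) for g in achieved])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    picks = rng.choice(len(episode), size=min(k, len(episode)), replace=False, p=probs)

    relabeled = []
    for i in picks:
        goal = achieved[i]
        for t, (s, a, s2, ag) in enumerate(episode):
            if t > i:
                break  # crude stand-in for dropping misleading samples past the achievement step
            reward = 0.0 if np.allclose(ag, goal, atol=1e-3) else -1.0
            relabeled.append((s, a, reward, s2, goal))
    return relabeled

# toy usage on a 2-D reaching task
episode = [(np.zeros(2), 0, np.ones(2) * t / 5, np.ones(2) * t / 5) for t in range(1, 6)]
buffer_additions = relabel_episode(episode, actual_goals=[np.ones(2)])
```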
A neural network based policy iteration algorithm with global $H^2$-superlinear convergence for stochastic games on domains
Title | A neural network based policy iteration algorithm with global $H^2$-superlinear convergence for stochastic games on domains |
Authors | Kazufumi Ito, Christoph Reisinger, Yufei Zhang |
Abstract | In this work, we propose a class of numerical schemes for solving semilinear Hamilton-Jacobi-Bellman-Isaacs (HJBI) boundary value problems which arise naturally from exit time problems of diffusion processes with controlled drift. We exploit policy iteration to reduce the semilinear problem into a sequence of linear Dirichlet problems, which are subsequently approximated by a multilayer feedforward neural network ansatz. We establish that the numerical solutions converge globally in the $H^2$-norm, and further demonstrate that this convergence is superlinear, by interpreting the algorithm as an inexact Newton iteration for the HJBI equation. Moreover, we construct the optimal feedback controls from the numerical value functions and deduce convergence. The numerical schemes and convergence results are then extended to HJBI boundary value problems corresponding to controlled diffusion processes with oblique boundary reflection. Numerical experiments on the stochastic Zermelo navigation problem are presented to illustrate the theoretical results and to demonstrate the effectiveness of the method. |
Tasks | |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.02304v3 |
https://arxiv.org/pdf/1906.02304v3.pdf | |
PWC | https://paperswithcode.com/paper/a-neural-network-based-policy-iteration |
Repo | |
Framework | |
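The central mechanism in the abstract is policy iteration, which reduces the semilinear HJBI problem to a sequence of linear Dirichlet problems, each approximated by a neural-network ansatz. One iteration is sketched below in generic notation; the operator, the Hamiltonian $H$, and the boundary data are placeholders rather than the paper's exact formulation.

```latex
% Schematic policy iteration for a semilinear HJBI Dirichlet problem
% (generic notation; operators and Hamiltonian are placeholders).
\begin{align*}
&\text{Given policies } (\alpha^{k}, \beta^{k}), \text{ solve the linear Dirichlet problem} \\
&\qquad -\mathcal{L}^{\alpha^{k},\beta^{k}} u^{k+1} + c\, u^{k+1} = f^{\alpha^{k},\beta^{k}}
  \ \text{ in } \Omega, \qquad u^{k+1} = g \ \text{ on } \partial\Omega, \\
&\text{then update the feedback controls pointwise:} \\
&\qquad \alpha^{k+1}(x) \in \operatorname*{arg\,max}_{a}\,
  H\!\left(x, u^{k+1}(x), \nabla u^{k+1}(x); a, \beta^{k}(x)\right), \\
&\qquad \beta^{k+1}(x) \in \operatorname*{arg\,min}_{b}\,
  H\!\left(x, u^{k+1}(x), \nabla u^{k+1}(x); \alpha^{k+1}(x), b\right).
\end{align*}
% Each linear solve is approximated by a feedforward network ansatz u_theta;
% reading the loop as an inexact Newton iteration gives the global
% H^2-superlinear convergence claimed in the abstract.
```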
Representing text as abstract images enables image classifiers to also simultaneously classify text
Title | Representing text as abstract images enables image classifiers to also simultaneously classify text |
Authors | Stephen M. Petrie, T’Mir D. Julius |
Abstract | We introduce a novel method for converting text data into abstract image representations, which allows image-based processing techniques (e.g. image classification networks) to be applied to text-based comparison problems. We apply the technique to entity disambiguation of inventor names in US patents. The method involves converting text from each pairwise comparison between two inventor name records into a 2D RGB (stacked) image representation. We then train an image classification neural network to discriminate between such pairwise comparison images, and use the trained network to label each pair of records as either matched (same inventor) or non-matched (different inventors), obtaining highly accurate results. Our new text-to-image representation method could also be used more broadly for other NLP comparison problems, such as disambiguation of academic publications, or for problems that require simultaneous classification of both text and image datasets. |
Tasks | Entity Disambiguation, Image Classification, Text Classification |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.07846v3 |
https://arxiv.org/pdf/1908.07846v3.pdf | |
PWC | https://paperswithcode.com/paper/190807846 |
Repo | |
Framework | |
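To make the idea concrete, here is one plausible way to turn a pairwise comparison of two name records into an image that a standard classifier can consume: a character-bigram co-occurrence map per record, stacked into RGB-like channels. This encoding is an assumption for illustration only and is not claimed to match the paper's exact representation.

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz "          # assumed character set
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

def bigram_image(text, size=len(ALPHABET)):
    """Rasterize a string as a 2D character-bigram co-occurrence map."""
    img = np.zeros((size, size), dtype=np.float32)
    chars = [c for c in text.lower() if c in INDEX]
    for a, b in zip(chars, chars[1:]):
        img[INDEX[a], INDEX[b]] += 1.0
    return img / max(img.max(), 1.0)               # normalise to [0, 1]

def pairwise_comparison_image(record_a, record_b):
    """Stack the two records (and their difference) into an RGB-like tensor,
    which an ordinary image classifier can label as matched / non-matched."""
    r = bigram_image(record_a)
    g = bigram_image(record_b)
    b = np.abs(r - g)
    return np.stack([r, g, b], axis=-1)            # shape (27, 27, 3)

x = pairwise_comparison_image("stephen m petrie", "petrie stephen")
print(x.shape)  # (27, 27, 3) -- ready for a standard CNN classifier
```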
Representation of Federated Learning via Worst-Case Robust Optimization Theory
Title | Representation of Federated Learning via Worst-Case Robust Optimization Theory |
Authors | Saeedeh Parsaeefard, Iman Tabrizian, Alberto Leon Garcia |
Abstract | Federated learning (FL) is a distributed learning approach in which a set of end-user devices participate in the learning process by acting on their isolated local data sets. We use worst-case optimization theory to reformulate the FL problem, treating the impact of each local data set in the training phase as an uncertain function bounded in a closed uncertainty region. This representation allows us to compare the performance of FL with its centralized counterpart, and to replace the uncertain function with a protection function, leading to a more tractable formulation. The latter supports applying a regularization factor in each user’s cost function in FL to reach better performance. We evaluate our model on the MNIST data set against the protection-function parameters, e.g., the regularization factors. |
Tasks | |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05571v1 |
https://arxiv.org/pdf/1912.05571v1.pdf | |
PWC | https://paperswithcode.com/paper/representation-of-federated-learning-via |
Repo | |
Framework | |
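The abstract's key step is to treat the effect of each user's local data as an uncertain function in a bounded region and then upper-bound it by a protection function that appears as a regularizer in the local cost. A hedged, generic sketch of that reformulation (the uncertainty set, $R(w)$ and $\gamma_n$ are assumed notation, not the paper's):

```latex
% Hedged sketch: user n's local cost with the uncertain impact of its data set
% bounded by a protection function acting as a regularizer (illustrative
% notation; the uncertainty set U_n, R(w) and gamma_n are assumptions).
\min_{w}\ \max_{u_n \in \mathcal{U}_n}
  \Big[ F_n(w; \mathcal{D}_n) + u_n(w) \Big]
\;\le\;
\min_{w}\ \Big[ F_n(w; \mathcal{D}_n)
  + \underbrace{\gamma_n\, R(w)}_{\text{protection function}} \Big],
\qquad \text{if } u_n(w) \le \gamma_n R(w)\ \ \forall\, u_n \in \mathcal{U}_n .
```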
Revisiting a single-stage method for face detection
Title | Revisiting a single-stage method for face detection |
Authors | Nguyen Van Quang, Hiromasa Fujihara |
Abstract | Although accurate, two-stage face detectors usually require more inference time than single-stage detectors do. This paper proposes a simple yet effective single-stage model for real-time face detection with prominently high accuracy. We build our single-stage model on top of a ResNet-101 backbone and analyze some problems with the baseline single-stage detector in order to design several strategies for reducing the false positive rate. The design leverages the context information from the deeper layers in order to increase the recall rate while maintaining a low false positive rate. In addition, we reduce the detection time with an improved inference procedure that decodes outputs faster. The inference time for a VGA ($640{\times}480$) image was only approximately 26 ms on a Titan X GPU. The effectiveness of our proposed method was evaluated on several face detection benchmarks (Wider Face, AFW, Pascal Face, and FDDB). The experiments show that our method achieved competitive results on these popular datasets with a faster runtime than the current best two-stage practices. |
Tasks | Face Detection |
Published | 2019-02-05 |
URL | http://arxiv.org/abs/1902.01559v1 |
http://arxiv.org/pdf/1902.01559v1.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-a-single-stage-method-for-face |
Repo | |
Framework | |
Uncertainty-Based Out-of-Distribution Detection in Deep Reinforcement Learning
Title | Uncertainty-Based Out-of-Distribution Detection in Deep Reinforcement Learning |
Authors | Andreas Sedlmeier, Thomas Gabor, Thomy Phan, Lenz Belzner, Claudia Linnhoff-Popien |
Abstract | We consider the problem of detecting out-of-distribution (OOD) samples in deep reinforcement learning. In a value based reinforcement learning setting, we propose to use uncertainty estimation techniques directly on the agent’s value estimating neural network to detect OOD samples. The focus of our work lies in analyzing the suitability of approximate Bayesian inference methods and related ensembling techniques that generate uncertainty estimates. Although prior work has shown that dropout-based variational inference techniques and bootstrap-based approaches can be used to model epistemic uncertainty, the suitability for detecting OOD samples in deep reinforcement learning remains an open question. Our results show that uncertainty estimation can be used to differentiate in- from out-of-distribution samples. Over the complete training process of the reinforcement learning agents, bootstrap-based approaches tend to produce more reliable epistemic uncertainty estimates, when compared to dropout-based approaches. |
Tasks | Bayesian Inference, Out-of-Distribution Detection |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02219v1 |
http://arxiv.org/pdf/1901.02219v1.pdf | |
PWC | https://paperswithcode.com/paper/uncertainty-based-out-of-distribution |
Repo | |
Framework | |
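The approach estimates epistemic uncertainty directly on the agent's value network, e.g. with bootstrapped heads, and treats high disagreement as evidence that a state is out-of-distribution. Below is a minimal PyTorch sketch of that idea; the architecture, head count, and threshold are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class BootstrappedQNet(nn.Module):
    """K independent Q-value heads over a shared torso (illustrative sketch;
    architecture details are assumptions, not the paper's exact setup)."""
    def __init__(self, obs_dim, n_actions, n_heads=10, hidden=128):
        super().__init__()
        self.torso = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList(
            nn.Linear(hidden, n_actions) for _ in range(n_heads)
        )

    def forward(self, obs):
        h = self.torso(obs)
        return torch.stack([head(h) for head in self.heads], dim=0)  # (K, B, A)

def epistemic_uncertainty(q_net, obs):
    """Disagreement of the heads' greedy value estimates -> epistemic score."""
    with torch.no_grad():
        q = q_net(obs)                         # (K, B, A)
        greedy_values = q.max(dim=-1).values   # (K, B)
        return greedy_values.std(dim=0)        # (B,) higher => more likely OOD

# usage: flag a state as out-of-distribution when its score exceeds a threshold
# calibrated on states seen during training (the threshold value is assumed).
net = BootstrappedQNet(obs_dim=8, n_actions=4)
score = epistemic_uncertainty(net, torch.randn(32, 8))
is_ood = score > 0.5
```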
LSTM Based Music Generation System
Title | LSTM Based Music Generation System |
Authors | Sanidhya Mangal, Rahul Modak, Poorva Joshi |
Abstract | Traditionally, music was treated as an analogue signal and was generated manually. In recent years, technology has made it possible to generate a suite of music automatically without any human intervention. To accomplish this task, we need to overcome some technical challenges, which are discussed in detail in this paper. A brief introduction to music and its components is provided, along with a citation and analysis of related work by different authors in this domain. The main objective of this paper is to propose an algorithm that can be used to generate musical notes using Recurrent Neural Networks (RNN), principally Long Short-Term Memory (LSTM) networks. A model is designed to execute this algorithm, where data is represented in the musical instrument digital interface (MIDI) file format for easier access and better understanding. Preprocessing of the data before feeding it into the model, including methods to read, process and prepare MIDI files for input, is also discussed. The model learns the sequences of polyphonic musical notes over a single-layered LSTM network and must have the potential to recall past details of a musical sequence and its structure for better learning. A description of the layered architecture used in the LSTM model and its intertwining connections to develop a neural network is presented in this work. The paper also gives a view of the distributions of weights and biases in every layer of the model, along with the losses and accuracy at each step and batch. When the model was thoroughly analyzed, it produced stellar results in composing new melodies. |
Tasks | Music Generation |
Published | 2019-08-02 |
URL | https://arxiv.org/abs/1908.01080v1 |
https://arxiv.org/pdf/1908.01080v1.pdf | |
PWC | https://paperswithcode.com/paper/lstm-based-music-generation-system |
Repo | |
Framework | |
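As a concrete reference point, a single-layer LSTM over MIDI note tokens with step-by-step sampling looks roughly like the PyTorch sketch below. The vocabulary size, embedding and hidden dimensions, and the sampling scheme are assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn

class MelodyLSTM(nn.Module):
    """Single-layer LSTM over sequences of MIDI note tokens (illustrative;
    the vocabulary and layer sizes are assumptions, not the paper's)."""
    def __init__(self, vocab_size=128, embed=64, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed)
        self.lstm = nn.LSTM(embed, hidden, num_layers=1, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, state=None):
        x = self.embed(tokens)              # (B, T, embed)
        h, state = self.lstm(x, state)      # (B, T, hidden)
        return self.out(h), state           # next-note logits per step

@torch.no_grad()
def generate(model, seed, steps=64):
    """Sample a continuation note-by-note from the model's softmax output."""
    model.eval()
    tokens, state = seed.clone(), None
    for _ in range(steps):
        logits, state = model(tokens[:, -1:], state)
        next_tok = torch.multinomial(torch.softmax(logits[:, -1], -1), 1)
        tokens = torch.cat([tokens, next_tok], dim=1)
    return tokens

model = MelodyLSTM()
seed = torch.randint(0, 128, (1, 16))       # a short seed of MIDI pitches
print(generate(model, seed).shape)          # (1, 80)
```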
ACTRCE: Augmenting Experience via Teacher’s Advice For Multi-Goal Reinforcement Learning
Title | ACTRCE: Augmenting Experience via Teacher’s Advice For Multi-Goal Reinforcement Learning |
Authors | Harris Chan, Yuhuai Wu, Jamie Kiros, Sanja Fidler, Jimmy Ba |
Abstract | Sparse reward is one of the most challenging problems in reinforcement learning (RL). Hindsight Experience Replay (HER) attempts to address this issue by converting a failed experience to a successful one by relabeling the goals. Despite its effectiveness, HER has limited applicability because it lacks a compact and universal goal representation. We present Augmenting experienCe via TeacheR’s adviCE (ACTRCE), an efficient reinforcement learning technique that extends the HER framework using natural language as the goal representation. We first analyze the differences among goal representations, and show that ACTRCE can efficiently solve difficult reinforcement learning problems in challenging 3D navigation tasks, whereas HER with non-language goal representations failed to learn. We also show that with language goal representations, the agent can generalize to unseen instructions, and even generalize to instructions with unseen lexicons. We further demonstrate that it is crucial to use hindsight advice to solve challenging tasks, and that even a small amount of advice is sufficient for the agent to achieve good performance. |
Tasks | Multi-Goal Reinforcement Learning |
Published | 2019-02-12 |
URL | http://arxiv.org/abs/1902.04546v1 |
http://arxiv.org/pdf/1902.04546v1.pdf | |
PWC | https://paperswithcode.com/paper/actrce-augmenting-experience-via-teachers |
Repo | |
Framework | |
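The core mechanism is HER-style relabeling, except that the achieved outcome is described by a teacher in natural language and that sentence becomes the new goal. The sketch below illustrates this loop with a stub teacher and a toy sentence encoder; all interfaces (`describe`, `satisfies`, the bag-of-words encoder) are hypothetical stand-ins, not the paper's implementation.

```python
class OracleTeacher:
    """Stub teacher for illustration: describes and checks goals by position."""
    def describe(self, state):
        return f"reach x>{state[0]:.1f}"
    def satisfies(self, state, advice):
        threshold = float(advice.split(">")[1])
        return state[0] >= threshold

def bag_of_words(sentence):
    # toy language-goal encoder; a real agent would use a learned sentence encoder
    return [hash(w) % 1000 for w in sentence.split()]

def hindsight_with_advice(episode, teacher, goal_encoder):
    """ACTRCE-style relabeling sketch: the teacher describes what was actually
    achieved in language, and that description becomes the new goal."""
    advice = teacher.describe(episode[-1]["next_state"])
    goal = goal_encoder(advice)
    return [
        {**step,
         "goal": goal,
         "reward": 1.0 if teacher.satisfies(step["next_state"], advice) else 0.0}
        for step in episode
    ]

episode = [{"state": [0.0], "action": 1, "next_state": [x / 10]} for x in range(1, 6)]
print(hindsight_with_advice(episode, OracleTeacher(), bag_of_words))
```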
Evaluation Mechanism of Collective Intelligence for Heterogeneous Agents Group
Title | Evaluation Mechanism of Collective Intelligence for Heterogeneous Agents Group |
Authors | Anna Dai, Zhifeng Zhao, Honggang Zhang, Rongpeng Li, Yugeng Zhou |
Abstract | Collective intelligence is manifested when multiple agents coherently work in observation, interaction, decision-making and action. In this paper, we define and quantify the intelligence level of heterogeneous agent groups with the improved Anytime Universal Intelligence Test (AUIT), based on an extension of the existing evaluation of homogeneous agent groups. The relationship of intelligence level with agent composition, group size, spatial complexity and testing time is analyzed. The intelligence level of heterogeneous agent groups is compared with that of homogeneous ones to analyze the effects of heterogeneity on collective intelligence. Our work will help to understand the essence of collective intelligence more deeply and reveal the effect of various key factors on group intelligence level. |
Tasks | Decision Making |
Published | 2019-03-01 |
URL | https://arxiv.org/abs/1903.00206v2 |
https://arxiv.org/pdf/1903.00206v2.pdf | |
PWC | https://paperswithcode.com/paper/evaluation-mechanism-of-collective |
Repo | |
Framework | |
Rough Contact in General Rough Mereology
Title | Rough Contact in General Rough Mereology |
Authors | A. Mani |
Abstract | Theories of rough mereology have originated from diverse semantic considerations from contexts relating to study of databases, to human reasoning. These ideas of origin, especially in the latter context, are intensely complex. In this research, concepts of rough contact relations are introduced and rough mereologies are situated in relation to general spatial mereology by the present author. These considerations are restricted to her rough mereologies that seek to avoid contamination. |
Tasks | |
Published | 2019-05-12 |
URL | https://arxiv.org/abs/1905.04689v1 |
https://arxiv.org/pdf/1905.04689v1.pdf | |
PWC | https://paperswithcode.com/paper/rough-contact-in-general-rough-mereology |
Repo | |
Framework | |
A Novel Loss Function Incorporating Imaging Acquisition Physics for PET Attenuation Map Generation using Deep Learning
Title | A Novel Loss Function Incorporating Imaging Acquisition Physics for PET Attenuation Map Generation using Deep Learning |
Authors | Luyao Shi, John A. Onofrey, Enette Mae Revilla, Takuya Toyonaga, David Menard, Joseph Ankrah, Richard E. Carson, Chi Liu, Yihuan Lu |
Abstract | In PET/CT imaging, CT is used for PET attenuation correction (AC). Mismatch between CT and PET due to patient body motion results in AC artifacts. In addition, artifact caused by metal, beam-hardening and count-starving in CT itself also introduces inaccurate AC for PET. Maximum likelihood reconstruction of activity and attenuation (MLAA) was proposed to solve those issues by simultaneously reconstructing tracer activity ($\lambda$-MLAA) and attenuation map ($\mu$-MLAA) based on the PET raw data only. However, $\mu$-MLAA suffers from high noise and $\lambda$-MLAA suffers from large bias as compared to the reconstruction using the CT-based attenuation map ($\mu$-CT). Recently, a convolutional neural network (CNN) was applied to predict the CT attenuation map ($\mu$-CNN) from $\lambda$-MLAA and $\mu$-MLAA, in which an image-domain loss (IM-loss) function between the $\mu$-CNN and the ground truth $\mu$-CT was used. However, IM-loss does not directly measure the AC errors according to the PET attenuation physics, where the line-integral projection of the attenuation map ($\mu$) along the path of the two annihilation events, instead of the $\mu$ itself, is used for AC. Therefore, a network trained with the IM-loss may yield suboptimal performance in the $\mu$ generation. Here, we propose a novel line-integral projection loss (LIP-loss) function that incorporates the PET attenuation physics for $\mu$ generation. Eighty training and twenty testing datasets of whole-body 18F-FDG PET and paired ground truth $\mu$-CT were used. Quantitative evaluations showed that the model trained with the additional LIP-loss was able to significantly outperform the model trained solely based on the IM-loss function. |
Tasks | |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.01394v1 |
https://arxiv.org/pdf/1909.01394v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-loss-function-incorporating-imaging |
Repo | |
Framework | |
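The proposed loss penalizes errors in the line-integral projections of the attenuation map, since attenuation correction uses the integral of $\mu$ along each line of response rather than $\mu$ itself. A hedged sketch of the idea in generic notation (the squared penalty, the averaging over lines, and the weight $\lambda$ are assumptions):

```latex
% Hedged sketch of a line-integral projection loss (generic notation; the
% squared penalty, the set of lines L, and the weight lambda are assumptions).
\mathcal{L}_{\mathrm{LIP}}
  = \frac{1}{|\mathcal{L}|} \sum_{L \in \mathcal{L}}
    \left( \int_{L} \mu_{\mathrm{CNN}}(x)\, \mathrm{d}\ell
         - \int_{L} \mu_{\mathrm{CT}}(x)\, \mathrm{d}\ell \right)^{2},
\qquad
\mathcal{L}_{\mathrm{total}} = \mathcal{L}_{\mathrm{IM}} + \lambda\, \mathcal{L}_{\mathrm{LIP}} .
```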
GETNET: A General End-to-end Two-dimensional CNN Framework for Hyperspectral Image Change Detection
Title | GETNET: A General End-to-end Two-dimensional CNN Framework for Hyperspectral Image Change Detection |
Authors | Qi Wang, Zhenghang Yuan, Qian Du, Xuelong Li |
Abstract | Change detection (CD) is an important application of remote sensing, which provides timely change information about the large-scale Earth surface. With the emergence of hyperspectral imagery, CD technology has been greatly promoted, as hyperspectral data with high spectral resolution are capable of detecting finer changes than traditional multispectral imagery. Nevertheless, the high dimension of hyperspectral data makes it difficult to implement traditional CD algorithms. Besides, endmember abundance information at the subpixel level is often not fully utilized. In order to better handle the high-dimension problem and explore abundance information, this paper presents a General End-to-end Two-dimensional CNN (GETNET) framework for hyperspectral image change detection (HSI-CD). The main contributions of this work are threefold: 1) a mixed-affinity matrix that integrates subpixel representation is introduced to mine more cross-channel gradient features and fuse multi-source information; 2) a 2-D CNN is designed to learn the discriminative features effectively from multi-source data at a higher level and enhance the generalization ability of the proposed CD algorithm; 3) a new HSI-CD data set is designed for the objective comparison of different methods. Experimental results on real hyperspectral data sets demonstrate that the proposed method outperforms most state-of-the-art approaches. |
Tasks | |
Published | 2019-05-05 |
URL | https://arxiv.org/abs/1905.01662v1 |
https://arxiv.org/pdf/1905.01662v1.pdf | |
PWC | https://paperswithcode.com/paper/getnet-a-general-end-to-end-two-dimensional |
Repo | |
Framework | |
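To illustrate the mixed-affinity idea, the sketch below builds, for a single pixel, an affinity matrix from the concatenation of the spectral vector and the sub-pixel abundance vector at the two acquisition dates, producing an image-like input for a 2-D CNN. The exact construction in GETNET may differ; this is only a hedged illustration of fusing spectral and abundance information into a 2-D input.

```python
import numpy as np

def mixed_affinity_matrix(spec_t1, spec_t2, abund_t1, abund_t2):
    """Hedged sketch: per-pixel affinity between the two acquisition dates,
    built from concatenated spectral and sub-pixel abundance vectors. Not
    claimed to be GETNET's exact construction."""
    v1 = np.concatenate([spec_t1, abund_t1])      # (C + E,)
    v2 = np.concatenate([spec_t2, abund_t2])      # (C + E,)
    return np.outer(v1, v2)                       # (C + E, C + E) image-like input

# one pixel with C=198 spectral bands and E=5 endmember abundances (assumed sizes)
A = mixed_affinity_matrix(np.random.rand(198), np.random.rand(198),
                          np.random.rand(5), np.random.rand(5))
print(A.shape)   # (203, 203) -> fed to a 2-D CNN as a single-channel image
```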
DEAN: Learning Dual Emotion for Fake News Detection on Social Media
Title | DEAN: Learning Dual Emotion for Fake News Detection on Social Media |
Authors | Chuan Guo, Juan Cao, Xueyao Zhang, Kai Shu, Huan Liu |
Abstract | Microblogging is a popular way for people to post, share, and seek information due to its convenience and low cost. However, it also facilitates the generation and propagation of fake news, which could cause detrimental societal consequences. Detecting fake news on microblogs is important for societal good. Emotion is considered a significant indicator in many fake news detection studies, and most of them utilize emotion mainly through users’ stances or simple statistical emotional features. In reality, publishers typically post either a piece of news with intense emotion, which can easily resonate with the crowd, or a controversial statement written unemotionally that aims to evoke intense emotion among users. However, existing studies have ignored exploiting the emotion information from both news content and user comments jointly. Therefore, in this paper, we study the novel problem of learning dual emotion for fake news detection. We propose a new Dual Emotion-based fAke News detection framework (DEAN), which can i) learn content- and comment-emotion representations for publishers and users respectively; and ii) exploit the dual emotion representations simultaneously for fake news detection. Experimental results on real-world datasets demonstrate the effectiveness of the proposed framework. |
Tasks | Fake News Detection |
Published | 2019-03-05 |
URL | https://arxiv.org/abs/1903.01728v2 |
https://arxiv.org/pdf/1903.01728v2.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-emotions-for-fake-news-detection |
Repo | |
Framework | |
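The framework's two ingredients are a publisher-side emotion representation from the news content and a user-side emotion representation aggregated from comments, fused for classification. The PyTorch sketch below shows that fusion in its simplest form; the feature dimensions, mean-pooling of comments, and the linear classifier are assumptions rather than the DEAN architecture itself.

```python
import torch
import torch.nn as nn

class DualEmotionClassifier(nn.Module):
    """Illustrative dual-emotion fusion (layer sizes and mean-pooled comment
    features are assumptions, not the DEAN architecture)."""
    def __init__(self, content_dim=768, comment_dim=768, emo_dim=64):
        super().__init__()
        self.content_emotion = nn.Sequential(nn.Linear(content_dim, emo_dim), nn.ReLU())
        self.comment_emotion = nn.Sequential(nn.Linear(comment_dim, emo_dim), nn.ReLU())
        self.classifier = nn.Linear(2 * emo_dim, 2)    # fake vs. real

    def forward(self, content_feat, comment_feats):
        pub = self.content_emotion(content_feat)            # publisher emotion
        usr = self.comment_emotion(comment_feats.mean(1))   # aggregated user emotion
        return self.classifier(torch.cat([pub, usr], dim=-1))

model = DualEmotionClassifier()
logits = model(torch.randn(4, 768), torch.randn(4, 10, 768))
print(logits.shape)   # (4, 2)
```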
Fake News Detection via NLP is Vulnerable to Adversarial Attacks
Title | Fake News Detection via NLP is Vulnerable to Adversarial Attacks |
Authors | Zhixuan Zhou, Huankang Guan, Meghana Moorthy Bhat, Justin Hsu |
Abstract | News plays a significant role in shaping people’s beliefs and opinions. Fake news has always been a problem, which wasn’t exposed to the mass public until the past election cycle for the 45th President of the United States. While quite a few detection methods have been proposed to combat fake news since 2015, they focus mainly on linguistic aspects of an article without any fact checking. In this paper, we argue that these models have the potential to misclassify fact-tampering fake news as well as under-written real news. Through experiments on Fakebox, a state-of-the-art fake news detector, we show that fact tampering attacks can be effective. To address these weaknesses, we argue that fact checking should be adopted in conjunction with linguistic characteristics analysis, so as to truly separate fake news from real news. A crowdsourced knowledge graph is proposed as a straw man solution to collecting timely facts about news events. |
Tasks | Fake News Detection |
Published | 2019-01-05 |
URL | http://arxiv.org/abs/1901.09657v1 |
http://arxiv.org/pdf/1901.09657v1.pdf | |
PWC | https://paperswithcode.com/paper/fake-news-detection-via-nlp-is-vulnerable-to |
Repo | |
Framework | |
Mobile Video Action Recognition
Title | Mobile Video Action Recognition |
Authors | Yuqi Huo, Xiaoli Xu, Yao Lu, Yulei Niu, Zhiwu Lu, Ji-Rong Wen |
Abstract | Video action recognition, which is topical in computer vision and video analysis, aims to assign a short video clip to a pre-defined category such as brushing hair or climbing stairs. Recent works focus on action recognition with deep neural networks that achieve state-of-the-art results but require high-performance platforms. Despite the fast development of mobile computing, video action recognition on mobile devices has not been fully discussed. In this paper, we focus on the novel mobile video action recognition task, where only the computational capabilities of mobile devices are accessible. Instead of raw videos with huge storage requirements, we choose to extract multiple modalities (including I-frames, motion vectors, and residuals) directly from compressed videos. By employing MobileNetV2 as the backbone, we propose a novel Temporal Trilinear Pooling (TTP) module to fuse the multiple modalities for mobile video action recognition. In addition to motion vectors, we also provide a temporal fusion method to explicitly induce the temporal context. The efficiency test on a mobile device indicates that our model can perform mobile video action recognition at about 40 FPS. The comparative results on two benchmarks show that our model outperforms existing action recognition methods in model size and running time, while achieving competitive accuracy. |
Tasks | Temporal Action Localization |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10155v1 |
https://arxiv.org/pdf/1908.10155v1.pdf | |
PWC | https://paperswithcode.com/paper/mobile-video-action-recognition |
Repo | |
Framework | |
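As a rough illustration of fusing the three compressed-video modalities (I-frames, motion vectors, residuals), the sketch below projects each modality and combines them with an element-wise product before temporal averaging, which is one generic way to realize a low-rank trilinear interaction. It is not the paper's exact TTP module, and the feature dimensions and class count are assumptions.

```python
import torch
import torch.nn as nn

class TrilinearFusion(nn.Module):
    """Generic trilinear pooling of three modality features (I-frames, motion
    vectors, residuals). A hedged sketch of the idea, not the exact TTP module."""
    def __init__(self, dims=(1280, 128, 128), fused=512, n_classes=101):
        super().__init__()
        self.proj = nn.ModuleList(nn.Linear(d, fused) for d in dims)
        self.head = nn.Linear(fused, n_classes)

    def forward(self, iframe, mv, residual):
        # element-wise product of the three projected modalities approximates a
        # rank-constrained trilinear interaction; mean over time gives the clip score
        z = self.proj[0](iframe) * self.proj[1](mv) * self.proj[2](residual)
        return self.head(z.mean(dim=1))     # (B, T, fused) -> (B, n_classes)

model = TrilinearFusion()
out = model(torch.randn(2, 8, 1280), torch.randn(2, 8, 128), torch.randn(2, 8, 128))
print(out.shape)   # (2, 101)
```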