October 21, 2019


Paper Group AWR 95

Learning Graph-Level Representations with Recurrent Neural Networks. Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++. PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition. Real-Time Nonparametric Anomaly Detection in High-Dimensional Settings. Detector monitoring with artificial neural networ …

Learning Graph-Level Representations with Recurrent Neural Networks


Title	Learning Graph-Level Representations with Recurrent Neural Networks
Authors	Yu Jin, Joseph F. JaJa
Abstract	Recently a variety of methods have been developed to encode graphs into low-dimensional vectors that can be easily exploited by machine learning algorithms. The majority of these methods start by embedding the graph nodes into a low-dimensional vector space, followed by using some scheme to aggregate the node embeddings. In this work, we develop a new approach to learn graph-level representations, which includes a combination of unsupervised and supervised learning components. We start by learning a set of node representations in an unsupervised fashion. Graph nodes are mapped into node sequences sampled from random walk approaches approximated by the Gumbel-Softmax distribution. Recurrent neural network (RNN) units are modified to accommodate both the node representations as well as their neighborhood information. Experiments on standard graph classification benchmarks demonstrate that our proposed approach achieves superior or comparable performance relative to the state-of-the-art algorithms in terms of convergence speed and classification accuracy. We further illustrate the effectiveness of the different components used by our approach.
Tasks	Graph Classification
Published	2018-05-20
URL	http://arxiv.org/abs/1805.07683v4
PDF	http://arxiv.org/pdf/1805.07683v4.pdf
PWC	https://paperswithcode.com/paper/learning-graph-level-representations-with
Repo	https://github.com/yuj-umd/graphRNN
Framework	pytorch
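
The abstract above mentions random-walk sampling approximated by the Gumbel-Softmax distribution. As a rough illustration of that relaxation only (not the authors' implementation), the sketch below draws one relaxed one-hot sample over a node's neighbors. The function name `gumbel_softmax`, the temperature `tau`, and the example scores are all assumptions made for this example.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw one relaxed one-hot sample over len(logits) categories."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise by inverse transform: g = -log(-log(u)), u in (0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled scores.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical unnormalized transition scores from the current node to three neighbors.
rng = random.Random(0)
sample = gumbel_softmax([2.0, 0.5, 0.1], tau=0.5, rng=rng)
# Index of the neighbor that the relaxed sample favors most.
next_neighbor = max(range(len(sample)), key=sample.__getitem__)
```

Lower values of `tau` push the sample closer to a hard one-hot choice, while higher values spread the mass more evenly across neighbors.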

Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++


Title	Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++
Authors	David Acuna, Huan Ling, Amlan Kar, Sanja Fidler
Abstract	Manually labeling datasets with object masks is extremely time consuming. In this work, we follow the idea of Polygon-RNN to produce polygonal annotations of objects interactively using humans-in-the-loop. We introduce several important improvements to the model: 1) we design a new CNN encoder architecture, 2) show how to effectively train the model with Reinforcement Learning, and 3) significantly increase the output resolution using a Graph Neural Network, allowing the model to accurately annotate high-resolution objects in images. Extensive evaluation on the Cityscapes dataset shows that our model, which we refer to as Polygon-RNN++, significantly outperforms the original model in both automatic (10% absolute and 16% relative improvement in mean IoU) and interactive modes (requiring 50% fewer clicks by annotators). We further analyze the cross-domain scenario in which our model is trained on one dataset, and used out of the box on datasets from varying domains. The results show that Polygon-RNN++ exhibits powerful generalization capabilities, achieving significant improvements over existing pixel-wise methods. Using simple online fine-tuning we further achieve a high reduction in annotation time for new datasets, moving a step closer towards an interactive annotation tool to be used in practice.
Tasks
Published	2018-03-26
URL	http://arxiv.org/abs/1803.09693v1
PDF	http://arxiv.org/pdf/1803.09693v1.pdf
PWC	https://paperswithcode.com/paper/efficient-interactive-annotation-of
Repo	https://github.com/fidler-lab/polyrnn-pp-pytorch
Framework	pytorch
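
The abstract above reports its gains in mean IoU. For readers unfamiliar with the metric, here is a minimal sketch of intersection over union for binary masks, averaged over several mask pairs. The helper names `iou` and `mean_iou` and the toy masks are assumptions for illustration; benchmark evaluation code may define edge cases differently.

```python
def iou(pred_mask, true_mask):
    """Intersection over union of two equally sized binary masks (lists of rows)."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Convention assumed here: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over (predicted, ground-truth) mask pairs."""
    return sum(iou(p, t) for p, t in pairs) / len(pairs)


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are covered by either mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
true = [[1, 0, 0],
        [0, 1, 1]]
score = iou(pred, true)
average = mean_iou([(pred, true), (true, true)])
```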

PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition


Title	PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition
Authors	Mikaela Angelina Uy, Gim Hee Lee
Abstract	Unlike its image-based counterpart, point cloud based retrieval for place recognition has remained an unexplored and unsolved problem. This is largely due to the difficulty of extracting local feature descriptors from a point cloud that can subsequently be encoded into a global descriptor for the retrieval task. In this paper, we propose PointNetVLAD, which leverages the recent success of deep networks to solve point cloud based retrieval for place recognition. Specifically, PointNetVLAD is a combination/modification of the existing PointNet and NetVLAD, which allows end-to-end training and inference to extract the global descriptor from a given 3D point cloud. Furthermore, we propose the “lazy triplet and quadruplet” loss functions that can achieve more discriminative and generalizable global descriptors to tackle the retrieval task. We create benchmark datasets for point cloud based retrieval for place recognition, and the experimental results on these datasets show the feasibility of PointNetVLAD. Our code and the links for the benchmark dataset downloads are available on our project website: http://github.com/mikacuy/pointnetvlad/
Tasks
Published	2018-04-10
URL	http://arxiv.org/abs/1804.03492v3
PDF	http://arxiv.org/pdf/1804.03492v3.pdf
PWC	https://paperswithcode.com/paper/pointnetvlad-deep-point-cloud-based-retrieval
Repo	https://github.com/mikacuy/pointnetvlad
Framework	tf
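
The abstract above names a "lazy triplet" loss without defining it. One plausible reading, sketched below, applies a hinge to the closest positive and keeps only the largest term across negatives (the hardest negative). This is an assumption for illustration: the choice of Euclidean distance, the use of the closest positive, the `margin` value, and the function names are not taken from the listing, and the paper should be consulted for the exact definition.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors given as equal-length lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lazy_triplet_loss(anchor, positives, negatives, margin=0.5):
    """Hinge loss against the hardest negative only (assumed 'lazy' variant)."""
    # Distance from the anchor to its closest positive (assumed choice).
    d_pos = min(euclidean(anchor, p) for p in positives)
    # One hinge term per negative; only the largest one contributes.
    return max(max(0.0, margin + d_pos - euclidean(anchor, n)) for n in negatives)


# Hypothetical 2-D global descriptors.
anchor = [0.0, 0.0]
positives = [[0.1, 0.0], [1.0, 1.0]]
negatives = [[0.3, 0.0], [2.0, 2.0]]
loss = lazy_triplet_loss(anchor, positives, negatives, margin=0.5)
```

In this toy case only the near negative at distance 0.3 violates the margin, so it alone determines the loss; the far negative contributes nothing.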

Real-Time Nonparametric Anomaly Detection in High-Dimensional Settings


Title	Real-Time Nonparametric Anomaly Detection in High-Dimensional Settings
Authors	Mehmet Necip Kurt, Yasin Yilmaz, Xiaodong Wang
Abstract	Timely detection of abrupt anomalies is crucial for real-time monitoring and security of modern systems producing high-dimensional data. With this goal, we propose effective and scalable algorithms. Proposed algorithms are nonparametric as both the nominal and anomalous multivariate data distributions are assumed unknown. We extract useful univariate summary statistics and perform anomaly detection in a single-dimensional space. We model anomalies as persistent outliers and propose to detect them via a cumulative sum-like algorithm. In case the observed data have a low intrinsic dimensionality, we learn a submanifold in which the nominal data are embedded and evaluate whether the sequentially acquired data persistently deviate from the nominal submanifold. Further, in the general case, we learn an acceptance region for nominal data via Geometric Entropy Minimization and evaluate whether the sequentially observed data persistently fall outside the acceptance region. We provide an asymptotic lower bound and an asymptotic approximation for the average false alarm period of the proposed algorithm. Moreover, we provide a sufficient condition to asymptotically guarantee that the decision statistic of the proposed algorithm does not diverge in the absence of anomalies. Experiments illustrate the effectiveness of the proposed schemes in quick and accurate anomaly detection in high-dimensional settings.
Tasks	Anomaly Detection
Published	2018-09-14
URL	https://arxiv.org/abs/1809.05250v2
PDF	https://arxiv.org/pdf/1809.05250v2.pdf
PWC	https://paperswithcode.com/paper/real-time-nonparametric-anomaly-detection-in
Repo	https://github.com/mnecipkurt/pami20
Framework	none
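
The abstract above models anomalies as persistent outliers detected with a cumulative sum-like algorithm on a univariate statistic. The sketch below shows a generic CUSUM-style recursion in that spirit. It is not the paper's decision statistic: the function name `cusum_alarm`, the `drift` and `threshold` parameters, and the example sequences are assumptions chosen for illustration.

```python
def cusum_alarm(stats, drift, threshold):
    """Return the first index where the cumulative statistic reaches threshold, else None."""
    s = 0.0
    for t, x in enumerate(stats):
        # Accumulate evidence above the drift level; reset to zero when it goes negative.
        s = max(0.0, s + x - drift)
        if s >= threshold:
            return t
    return None


# Hypothetical per-sample outlier scores: a persistent shift begins at index 3.
persistent = [0.25, 0.0, 0.25, 1.0, 1.25, 1.0, 1.5]
# A single isolated outlier that decays before reaching the threshold.
isolated = [0.25, 1.5, 0.25, 0.0]

alarm_persistent = cusum_alarm(persistent, drift=0.5, threshold=1.5)
alarm_isolated = cusum_alarm(isolated, drift=0.5, threshold=1.5)
```

With these settings the persistent shift raises an alarm a few samples after it starts, while the isolated outlier never accumulates enough evidence to trigger one.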

Detector monitoring with artificial neural networks at the CMS experiment at the CERN Large Hadron Collider


Title	Detector monitoring with artificial neural networks at the CMS experiment at the CERN Large Hadron Collider
Authors	Adrian Alan Pol, Gianluca Cerminara, Cecile Germain, Maurizio Pierini, Agrima Seth
Abstract	Reliable data quality monitoring is a key asset in delivering collision data suitable for physics analysis in any modern large-scale High Energy Physics experiment. This paper focuses on the use of artificial neural networks for supervised and semi-supervised problems related to the identification of anomalies in the data collected by the CMS muon detectors. We use deep neural networks to analyze LHC collision data, represented as images organized geographically. We train a classifier capable of detecting the known anomalous behaviors with unprecedented efficiency and explore the usage of convolutional autoencoders to extend anomaly detection capabilities to unforeseen failure modes. A generalization of this strategy could pave the way to the automation of the data quality assessment process for present and future high-energy physics experiments.
Tasks	Anomaly Detection
Published	2018-07-27
URL	http://arxiv.org/abs/1808.00911v1
PDF	http://arxiv.org/pdf/1808.00911v1.pdf
PWC	https://paperswithcode.com/paper/detector-monitoring-with-artificial-neural
Repo	https://github.com/MantasPtr/CERN-CMS-DQM-DT-visualization
Framework	tf

Utilizing a Transparency-driven Environment toward Trusted Automatic Genre Classification: A Case Study in Journalism History


Title	Utilizing a Transparency-driven Environment toward Trusted Automatic Genre Classification: A Case Study in Journalism History
Authors	Aysenur Bilgin, Laura Hollink, Jacco van Ossenbruggen, Erik Tjong Kim Sang, Kim Smeenk, Frank Harbers, Marcel Broersma
Abstract	With the growing abundance of unlabeled data in real-world tasks, researchers have to rely on the predictions given by black-boxed computational models. However, it is an often neglected fact that these models may be scoring high on accuracy for the wrong reasons. In this paper, we present a practical impact analysis of enabling model transparency by various presentation forms. For this purpose, we developed an environment that empowers non-computer scientists to become practicing data scientists in their own research field. We demonstrate the gradually increasing understanding of journalism historians through a real-world use case study on automatic genre classification of newspaper articles. This study is a first step towards trusted usage of machine learning pipelines in a responsible way.
Tasks
Published	2018-10-01
URL	http://arxiv.org/abs/1810.00968v1
PDF	http://arxiv.org/pdf/1810.00968v1.pdf
PWC	https://paperswithcode.com/paper/utilizing-a-transparency-driven-environment
Repo	https://github.com/newsgac/platform
Framework	none

Stein Neural Sampler


Title	Stein Neural Sampler
Authors	Tianyang Hu, Zixiang Chen, Hanxi Sun, Jincheng Bai, Mao Ye, Guang Cheng
Abstract	We propose two novel samplers to produce high-quality samples from a given (un-normalized) probability density. Sampling is achieved by transforming a reference distribution into the target distribution with neural networks, which are trained separately by minimizing two kinds of Stein discrepancies; hence we name our method the Stein neural sampler. Theoretical and empirical results suggest that, compared with traditional sampling schemes, our samplers share the following three advantages: 1. being asymptotically correct; 2. experiencing fewer convergence issues in practice; 3. generating samples instantaneously.
Tasks
Published	2018-10-08
URL	http://arxiv.org/abs/1810.03545v1
PDF	http://arxiv.org/pdf/1810.03545v1.pdf
PWC	https://paperswithcode.com/paper/stein-neural-sampler
Repo	https://github.com/HanxiSun/SteinNS
Framework	tf

Unsupervised Representation Adversarial Learning Network: from Reconstruction to Generation


Title	Unsupervised Representation Adversarial Learning Network: from Reconstruction to Generation
Authors	Yuqian Zhou, Kuangxiao Gu, Thomas Huang
Abstract	A good representation for arbitrarily complicated data should have the capability of semantic generation, clustering and reconstruction. Previous research has already achieved impressive performance on each of these individually. This paper aims at learning a disentangled representation effective for all of them in an unsupervised way. To achieve all three tasks together, we learn the forward and inverse mapping between data and representation on the basis of a symmetric adversarial process. In theory, we minimize the upper bound of the two conditional entropy losses between the latent variables and the observations together to achieve cycle consistency. The newly proposed RepGAN is tested on the MNIST, fashionMNIST, CelebA, and SVHN datasets to perform unsupervised classification, generation and reconstruction tasks. The results demonstrate that RepGAN is able to learn a useful and competitive representation. To the authors’ knowledge, our work is the first to achieve both a high unsupervised classification accuracy and a low reconstruction error on MNIST. Code is available at https://github.com/yzhouas/RepGAN-tensorflow.
Tasks
Published	2018-04-19
URL	http://arxiv.org/abs/1804.07353v2
PDF	http://arxiv.org/pdf/1804.07353v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-representation-adversarial
Repo	https://github.com/yzhouas/RepGAN-tensorflow
Framework	tf

IGLOO: Slicing the Features Space to Represent Long Sequences


Title	IGLOO: Slicing the Features Space to Represent Long Sequences
Authors	Vsevolod Sourkov
Abstract	We introduce a new neural network architecture, IGLOO, which aims at providing a representation for long sequences where RNNs fail to converge. The structure uses the relationships between random patches sliced out of the feature space of a backbone one-dimensional CNN to find a representation. This paper explains the implementation of the method, reports results on benchmarks commonly used for RNNs, and compares IGLOO to other recently published structures. It is found that IGLOO can deal with sequences of up to 25,000 time steps. For shorter sequences it is also found to be effective, and we find that it achieves the highest score in the literature for the permuted MNIST task. Benchmarks also show that IGLOO can run at the speed of the CuDNN-optimized GRU or LSTM, or faster, for most of the tasks presented.
Tasks
Published	2018-07-09
URL	http://arxiv.org/abs/1807.03402v2
PDF	http://arxiv.org/pdf/1807.03402v2.pdf
PWC	https://paperswithcode.com/paper/igloo-slicing-the-features-space-to-represent
Repo	https://github.com/redna11/igloo1D
Framework	tf

Multi-Resolution Multi-Modal Sensor Fusion For Remote Sensing Data With Label Uncertainty


Title	Multi-Resolution Multi-Modal Sensor Fusion For Remote Sensing Data With Label Uncertainty
Authors	Xiaoxiao Du, Alina Zare
Abstract	In remote sensing, each sensor can provide complementary or reinforcing information. It is valuable to fuse outputs from multiple sensors to boost overall performance. Previous supervised fusion methods often require accurate labels for each pixel in the training data. However, in many remote sensing applications, pixel-level labels are difficult or infeasible to obtain. In addition, outputs from multiple sensors often have different resolution or modalities. For example, rasterized hyperspectral imagery presents data in a pixel grid while airborne Light Detection and Ranging (LiDAR) generates dense three-dimensional (3D) point clouds. It is often difficult to directly fuse such multi-modal, multi-resolution data. To address these challenges, we present a novel Multiple Instance Multi-Resolution Fusion (MIMRF) framework that can fuse multi-resolution and multi-modal sensor outputs while learning from automatically-generated, imprecisely-labeled data. Experiments were conducted on the MUUFL Gulfport hyperspectral and LiDAR data set and a remotely-sensed soybean and weed data set. Results show improved, consistent performance on scene understanding and agricultural applications when compared to traditional fusion methods.
Tasks	Scene Understanding, Sensor Fusion
Published	2018-05-02
URL	https://arxiv.org/abs/1805.00930v2
PDF	https://arxiv.org/pdf/1805.00930v2.pdf
PWC	https://paperswithcode.com/paper/multi-resolution-multi-modal-sensor-fusion
Repo	https://github.com/GatorSense/MIMRF
Framework	none

The Fine Line between Linguistic Generalization and Failure in Seq2Seq-Attention Models


Title	The Fine Line between Linguistic Generalization and Failure in Seq2Seq-Attention Models
Authors	Noah Weber, Leena Shekhar, Niranjan Balasubramanian
Abstract	Seq2Seq-based neural architectures have become the go-to choice for sequence-to-sequence language tasks. Despite their excellent performance on these tasks, recent work has noted that these models usually do not fully capture the linguistic structure required to generalize beyond the dense sections of the data distribution \cite{ettinger2017towards}, and as such, are likely to fail on samples from the tail end of the distribution (such as inputs that are noisy \citep{belkinovnmtbreak} or of different lengths \citep{bentivoglinmtlength}). In this paper, we look at a model’s ability to generalize on a simple symbol rewriting task with a clearly defined structure. We find that the model’s ability to generalize this structure beyond the training distribution depends greatly on the chosen random seed, even when performance on the standard test set remains the same. This suggests that a model’s ability to capture generalizable structure is highly sensitive to the random seed. Moreover, this sensitivity may not be apparent when evaluating the model on standard test sets.
Tasks
Published	2018-05-03
URL	http://arxiv.org/abs/1805.01445v2
PDF	http://arxiv.org/pdf/1805.01445v2.pdf
PWC	https://paperswithcode.com/paper/the-fine-line-between-linguistic
Repo	https://github.com/LeenaShekhar/FailureAndGeneralizationDataset
Framework	none

TStarBots: Defeating the Cheating Level Builtin AI in StarCraft II in the Full Game


Title	TStarBots: Defeating the Cheating Level Builtin AI in StarCraft II in the Full Game
Authors	Peng Sun, Xinghai Sun, Lei Han, Jiechao Xiong, Qing Wang, Bo Li, Yang Zheng, Ji Liu, Yongsheng Liu, Han Liu, Tong Zhang
Abstract	StarCraft II (SC2) is widely considered the most challenging Real Time Strategy (RTS) game. The underlying challenges include a large observation space, a huge (continuous and infinite) action space, partial observations, simultaneous moves by all players, and long-horizon delayed rewards for local decisions. To push the frontier of AI research, DeepMind and Blizzard jointly developed the StarCraft II Learning Environment (SC2LE) as a testbench for complex decision making systems. SC2LE provides a few mini games such as MoveToBeacon, CollectMineralShards, and DefeatRoaches, where some AI agents have achieved the performance level of human professional players. However, for full games, the current AI agents are still far from achieving human professional level performance. To bridge this gap, we present two full game AI agents in this paper: TStarBot1, which is based on deep reinforcement learning over a flat action structure, and TStarBot2, which is based on hard-coded rules over a hierarchical action structure. Both TStarBot1 and TStarBot2 are able to defeat the built-in AI agents from level 1 to level 10 in a full game (1v1 Zerg-vs-Zerg game on the AbyssalReef map), noting that level 8, level 9, and level 10 are cheating agents with unfair advantages such as full vision of the whole map and resource harvest boosting. To the best of our knowledge, this is the first public work to investigate AI agents that can defeat the built-in AI in the StarCraft II full game.
Tasks	Decision Making, Real-Time Strategy Games, Starcraft, Starcraft II
Published	2018-09-19
URL	http://arxiv.org/abs/1809.07193v3
PDF	http://arxiv.org/pdf/1809.07193v3.pdf
PWC	https://paperswithcode.com/paper/tstarbots-defeating-the-cheating-level
Repo	https://github.com/LFhase/Research_Navigation
Framework	none

Attend and Rectify: a Gated Attention Mechanism for Fine-Grained Recovery


Title	Attend and Rectify: a Gated Attention Mechanism for Fine-Grained Recovery
Authors	Pau Rodríguez, Josep M. Gonfaus, Guillem Cucurull, F. Xavier Roca, Jordi Gonzàlez
Abstract	We propose a novel attention mechanism to enhance Convolutional Neural Networks for fine-grained recognition. It learns to attend to lower-level feature activations without requiring part annotations and uses these activations to update and rectify the output likelihood distribution. In contrast to other approaches, the proposed mechanism is modular, architecture-independent and efficient both in terms of parameters and computation required. Experiments show that networks augmented with our approach systematically improve their classification accuracy and become more robust to clutter. As a result, Wide Residual Networks augmented with our proposal surpass the state-of-the-art classification accuracies on CIFAR-10, the Adience gender recognition task, Stanford Dogs, and UEC Food-100.
Tasks
Published	2018-07-19
URL	http://arxiv.org/abs/1807.07320v2
PDF	http://arxiv.org/pdf/1807.07320v2.pdf
PWC	https://paperswithcode.com/paper/attend-and-rectify-a-gated-attention
Repo	https://github.com/prlz77/attend-and-rectify
Framework	pytorch

Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition


Title	Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition
Authors	Yifei Huang, Minjie Cai, Zhenqiang Li, Yoichi Sato
Abstract	We present a new computational model for gaze prediction in egocentric videos by exploring patterns in temporal shift of gaze fixations (attention transition) that are dependent on egocentric manipulation tasks. Our assumption is that the high-level context of how a task is completed in a certain way has a strong influence on attention transition and should be modeled for gaze prediction in natural dynamic scenes. Specifically, we propose a hybrid model based on deep neural networks which integrates task-dependent attention transition with bottom-up saliency prediction. In particular, the task-dependent attention transition is learned with a recurrent neural network to exploit the temporal context of gaze fixations, e.g. looking at a cup after moving gaze away from a grasped bottle. Experiments on public egocentric activity datasets show that our model significantly outperforms state-of-the-art gaze prediction methods and is able to learn meaningful transition of human attention.
Tasks	Gaze Prediction, Saliency Prediction
Published	2018-03-24
URL	http://arxiv.org/abs/1803.09125v3
PDF	http://arxiv.org/pdf/1803.09125v3.pdf
PWC	https://paperswithcode.com/paper/predicting-gaze-in-egocentric-video-by
Repo	https://github.com/hyf015/egocentric-gaze-prediction
Framework	pytorch

A General Path-Based Representation for Predicting Program Properties


Title	A General Path-Based Representation for Predicting Program Properties
Authors	Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav
Abstract	Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming and increase programmer productivity. A major challenge when learning from programs is how to represent programs in a way that facilitates effective learning. We present a general path-based representation for learning from programs. Our representation is purely syntactic and extracted automatically. The main idea is to represent a program using paths in its abstract syntax tree (AST). This allows a learning model to leverage the structured nature of code rather than treating it as a flat sequence of tokens. We show that this representation is general and can: (i) cover different prediction tasks, (ii) drive different learning algorithms (for both generative and discriminative models), and (iii) work across different programming languages. We evaluate our approach on the tasks of predicting variable names, method names, and full types. We use our representation to drive both CRF-based and word2vec-based learning, for programs of four languages: JavaScript, Java, Python and C#. Our evaluation shows that our approach obtains better results than task-specific handcrafted representations across different tasks and programming languages.
Tasks
Published	2018-03-26
URL	http://arxiv.org/abs/1803.09544v3
PDF	http://arxiv.org/pdf/1803.09544v3.pdf
PWC	https://paperswithcode.com/paper/a-general-path-based-representation-for
Repo	https://github.com/vovak/astminer
Framework	none
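
The abstract above describes representing a program by paths in its abstract syntax tree. As a small illustration of the idea (not the authors' extractor, and not the astminer tool), the sketch below uses Python's standard `ast` module to list the path between every pair of leaves. The function names, the choice of `Name` and `Constant` nodes as leaves, and the `^` (up) and `_` (down) separators are assumptions made for this example.

```python
import ast
from itertools import combinations


def leaf_trails(tree):
    """Return, for each Name/Constant leaf, the list of nodes from the root down to it."""
    leaves = []

    def visit(node, trail):
        trail = trail + [node]
        if isinstance(node, (ast.Name, ast.Constant)):
            leaves.append(trail)
        else:
            for child in ast.iter_child_nodes(node):
                visit(child, trail)

    visit(tree, [])
    return leaves


def label(node):
    """Leaf label: the identifier for a Name, the repr of the value for a Constant."""
    return node.id if isinstance(node, ast.Name) else repr(node.value)


def path_contexts(source):
    """Return (start_leaf, path, end_leaf) triples for every pair of leaves."""
    triples = []
    for a, b in combinations(leaf_trails(ast.parse(source)), 2):
        # Count shared ancestors; the last shared node is the lowest common ancestor.
        k = 0
        while k < min(len(a), len(b)) and a[k] is b[k]:
            k += 1
        up = [type(n).__name__ for n in reversed(a[k:])]
        top = type(a[k - 1]).__name__
        down = [type(n).__name__ for n in b[k:]]
        path = "^".join(up + [top]) + "_" + "_".join(down)
        triples.append((label(a[-1]), path, label(b[-1])))
    return triples


contexts = path_contexts("x = y + 1")
for start, path, end in contexts:
    print(start, path, end)
```

Each triple pairs two leaf tokens with the syntactic route between them, which is the kind of feature a downstream model could consume instead of a flat token sequence.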