October 21, 2019

3381 words 16 mins read

Paper Group AWR 12

High Quality Prediction of Protein Q8 Secondary Structure by Diverse Neural Network Architectures. The Text-Based Adventure AI Competition. Learning to Shoot in First Person Shooter Games by Stabilizing Actions and Clustering Rewards for Reinforcement Learning. Memorable Maps: A Framework for Re-defining Places in Visual Place Recognition. MVSNet: …

High Quality Prediction of Protein Q8 Secondary Structure by Diverse Neural Network Architectures

Title High Quality Prediction of Protein Q8 Secondary Structure by Diverse Neural Network Architectures
Authors Iddo Drori, Isht Dwivedi, Pranav Shrestha, Jeffrey Wan, Yueqi Wang, Yunchu He, Anthony Mazza, Hugh Krogh-Freeman, Dimitri Leggas, Kendal Sandridge, Linyong Nan, Kaveri Thakoor, Chinmay Joshi, Sonam Goenka, Chen Keasar, Itsik Pe’er
Abstract We tackle the problem of protein secondary structure prediction using a common task framework. This led to the introduction of multiple ideas for neural architectures based on state-of-the-art building blocks, used in this task for the first time. We take a principled machine learning approach, which provides genuine, unbiased performance measures, correcting longstanding errors in the application domain. We focus on the Q8 resolution of secondary structure, an active area of continuously improving methods. We use an ensemble of strong predictors to achieve an accuracy of 70.7% (on the CB513 test set using the CB6133filtered training set). These results are statistically indistinguishable from those of the top existing predictors. In the spirit of reproducible research we make our data, models and code available, aiming to set a gold standard for purity of training and testing sets. Such good practices lower entry barriers to this domain and facilitate reproducible, extendable research.
Tasks Protein Secondary Structure Prediction
Published 2018-11-17
URL http://arxiv.org/abs/1811.07143v1
PDF http://arxiv.org/pdf/1811.07143v1.pdf
PWC https://paperswithcode.com/paper/high-quality-prediction-of-protein-q8
Repo https://github.com/idrori/cu-ssp
Framework tf
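
As a loose illustration of this kind of per-residue Q8 classifier (not any of the paper's specific architectures), the sketch below builds a bidirectional GRU in tf.keras over one-hot amino-acid sequences. The sequence length, alphabet size, and layer widths are assumptions, chosen to match the common CB6133/CB513 conventions.

```python
# Minimal sketch, assuming the usual CB6133/CB513 setup: sequences padded to
# length 700, 21-dimensional one-hot residue features, 8 output classes.
# This is NOT the paper's exact model, just one plausible baseline.
import tensorflow as tf

MAX_LEN, N_AA, N_Q8 = 700, 21, 8  # assumed dataset conventions

inputs = tf.keras.Input(shape=(MAX_LEN, N_AA))
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.GRU(64, return_sequences=True))(inputs)   # assumed width
x = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(64, activation="relu"))(x)
outputs = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(N_Q8, activation="softmax"))(x)     # per-residue Q8 labels

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```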

The Text-Based Adventure AI Competition

Title The Text-Based Adventure AI Competition
Authors Timothy Atkinson, Hendrik Baier, Tara Copplestone, Sam Devlin, Jerry Swan
Abstract In 2016, 2017, and 2018 at the IEEE Conference on Computational Intelligence in Games, the authors of this paper ran a competition for agents that can play classic text-based adventure games. This competition fills a gap in existing game AI competitions, which have typically focussed on traditional card/board games or modern video games with graphical interfaces. By providing a platform for evaluating agents in text-based adventures, the competition offers a novel benchmark for game AI with unique challenges for natural language understanding and generation. This paper summarises the three competitions run in 2016, 2017, and 2018 (including details of open-source implementations of both the competition framework and our competitors) and presents the results of an improved evaluation of these competitors across 20 games.
Tasks Board Games
Published 2018-08-03
URL http://arxiv.org/abs/1808.01262v4
PDF http://arxiv.org/pdf/1808.01262v4.pdf
PWC https://paperswithcode.com/paper/the-text-based-adventure-ai-competition
Repo https://github.com/Microsoft/nail_agent
Framework none

Learning to Shoot in First Person Shooter Games by Stabilizing Actions and Clustering Rewards for Reinforcement Learning

Title Learning to Shoot in First Person Shooter Games by Stabilizing Actions and Clustering Rewards for Reinforcement Learning
Authors Frank G. Glavin, Michael G. Madden
Abstract While reinforcement learning (RL) has been applied to turn-based board games for many years, more complex games involving decision-making in real-time are beginning to receive more attention. A challenge in such environments is that the time that elapses between deciding to take an action and receiving a reward based on its outcome can be longer than the interval between successive decisions. We explore this in the context of a non-player character (NPC) in a modern first-person shooter game. Such games take place in 3D environments where players, both human and computer-controlled, compete by engaging in combat and completing task objectives. We investigate the use of RL to enable NPCs to gather experience from game-play and improve their shooting skill over time from a reward signal based on the damage caused to opponents. We propose a new method for RL updates and reward calculations, in which the updates are carried out periodically, after each shooting encounter has ended, and a new weighted-reward mechanism is used which increases the reward applied to actions that lead to damaging the opponent in successive hits in what we term “hit clusters”.
Tasks Board Games, Decision Making
Published 2018-06-13
URL http://arxiv.org/abs/1806.05117v1
PDF http://arxiv.org/pdf/1806.05117v1.pdf
PWC https://paperswithcode.com/paper/learning-to-shoot-in-first-person-shooter
Repo https://github.com/lucylow/b00m-h3adsh0t
Framework none
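
To make the "hit cluster" reward idea concrete, here is a hedged sketch of one way such a weighted reward could be computed after a shooting encounter ends: successive hits in a run earn escalating credit, and misses reset the run. The exact weighting used in the paper may differ; the bonus scheme below is purely illustrative.

```python
# Hedged sketch of cluster-weighted rewards; not the paper's exact formula.
def clustered_rewards(hits, base_reward=1.0, cluster_bonus=0.5):
    """hits: list of booleans, one per shot fired during the encounter."""
    rewards, streak = [], 0
    for hit in hits:
        if hit:
            streak += 1
            # each successive hit in a cluster earns a growing bonus
            rewards.append(base_reward + cluster_bonus * (streak - 1))
        else:
            streak = 0
            rewards.append(0.0)
    return rewards

print(clustered_rewards([True, True, False, True]))  # [1.0, 1.5, 0.0, 1.0]
```

The updates would then be applied periodically, once per encounter, using these per-action rewards rather than an immediate per-step signal.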

Memorable Maps: A Framework for Re-defining Places in Visual Place Recognition

Title Memorable Maps: A Framework for Re-defining Places in Visual Place Recognition
Authors Mubariz Zaffar, Shoaib Ehsan, Michael Milford, Klaus McDonald-Maier
Abstract This paper presents a cognition-inspired, agnostic framework for building a map for Visual Place Recognition. The framework draws inspiration from human memorability, utilizes the traditional image entropy concept, and computes the static content in an image, thereby presenting a three-fold criterion to assess the ‘memorability’ of an image for visual place recognition. A dataset named ‘ESSEX3IN1’ is created, composed of highly confusing images from indoor, outdoor and natural scenes for analysis. When used in conjunction with state-of-the-art visual place recognition methods, the proposed framework provides a significant performance boost to these techniques, as evidenced by results on ESSEX3IN1 and other public datasets.
Tasks Visual Place Recognition
Published 2018-11-08
URL http://arxiv.org/abs/1811.03529v2
PDF http://arxiv.org/pdf/1811.03529v2.pdf
PWC https://paperswithcode.com/paper/memorable-maps-a-framework-for-re-defining
Repo https://github.com/MubarizZaffar/ESSEX3IN1-Dataset
Framework none
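
One of the three criteria is classic image entropy. A minimal NumPy sketch of Shannon entropy over an 8-bit grayscale histogram follows; the memorability threshold and the other two criteria (human memorability and static-content estimation) are not reproduced here.

```python
# Minimal sketch: Shannon entropy of an 8-bit grayscale image.
import numpy as np

def image_entropy(gray):
    """gray: 2-D uint8 array."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                       # ignore empty bins
    return float(-(p * np.log2(p)).sum())

frame = (np.random.rand(240, 320) * 255).astype(np.uint8)
print(image_entropy(frame))            # higher -> more textured, informative frame
```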

MVSNet: Depth Inference for Unstructured Multi-view Stereo

Title MVSNet: Depth Inference for Unstructured Multi-view Stereo
Authors Yao Yao, Zixin Luo, Shiwei Li, Tian Fang, Long Quan
Abstract We present an end-to-end deep learning architecture for depth map inference from multi-view images. In the network, we first extract deep visual image features, and then build the 3D cost volume upon the reference camera frustum via differentiable homography warping. Next, we apply 3D convolutions to regularize and regress the initial depth map, which is then refined with the reference image to generate the final output. Our framework flexibly adapts to arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature. The proposed MVSNet is demonstrated on the large-scale indoor DTU dataset. With simple post-processing, our method not only significantly outperforms previous state-of-the-art methods, but is also several times faster at runtime. We also evaluate MVSNet on the complex outdoor Tanks and Temples dataset, where our method ranks first before April 18, 2018 without any fine-tuning, showing the strong generalization ability of MVSNet.
Tasks
Published 2018-04-07
URL http://arxiv.org/abs/1804.02505v2
PDF http://arxiv.org/pdf/1804.02505v2.pdf
PWC https://paperswithcode.com/paper/mvsnet-depth-inference-for-unstructured-multi
Repo https://github.com/YoYo000/MVSNet
Framework tf
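
The variance-based cost metric is easy to illustrate in isolation: N warped feature volumes, one per view, reduce to a single cost volume as the per-voxel variance across views. The sketch below shows only that reduction (homography warping and 3D regularization are omitted), with toy tensor sizes as assumptions.

```python
# Minimal sketch of the variance-based cost aggregation across views.
import numpy as np

def variance_cost_volume(feature_volumes):
    """feature_volumes: array of shape (N_views, D, H, W, C)."""
    mean = feature_volumes.mean(axis=0)
    return ((feature_volumes - mean) ** 2).mean(axis=0)   # (D, H, W, C)

views = np.random.rand(3, 48, 32, 40, 8).astype(np.float32)  # toy sizes
cost = variance_cost_volume(views)
print(cost.shape)                                              # (48, 32, 40, 8)
```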

Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications

Title Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications
Authors Haowen Xu, Wenxiao Chen, Nengwen Zhao, Zeyan Li, Jiahao Bu, Zhihan Li, Ying Liu, Youjian Zhao, Dan Pei, Yang Feng, Jie Chen, Zhaogang Wang, Honglin Qiao
Abstract To ensure undisrupted business, large Internet companies need to closely monitor various KPIs (e.g., Page Views, number of online users, and number of orders) of their Web applications, to accurately detect anomalies and trigger timely troubleshooting/mitigation. However, anomaly detection for these seasonal KPIs with various patterns and data quality has been a great challenge, especially without labels. In this paper, we propose Donut, an unsupervised anomaly detection algorithm based on the VAE. Thanks to a few of our key techniques, Donut greatly outperforms a state-of-the-art supervised ensemble approach and a baseline VAE approach, and its best F-scores range from 0.75 to 0.9 for the studied KPIs from a top global Internet company. We also develop a novel KDE interpretation of reconstruction for Donut, making it the first VAE-based anomaly detection algorithm with a solid theoretical explanation.
Tasks Anomaly Detection, Unsupervised Anomaly Detection
Published 2018-02-12
URL http://arxiv.org/abs/1802.03903v1
PDF http://arxiv.org/pdf/1802.03903v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-anomaly-detection-via
Repo https://github.com/nakumgaurav/Anomaly-Detection
Framework none
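
A hedged sketch of the generic scoring step in VAE-based detectors of this kind: draw latent samples from a trained encoder, decode each into a Gaussian over the input window, and average the log-likelihood of the observed window. Low scores flag anomalies. `encode` and `decode` below are placeholders for a trained model, not Donut's actual API, and the details of Donut's training techniques are not shown.

```python
# Hedged sketch of Monte Carlo reconstruction-probability scoring.
import numpy as np

def reconstruction_score(window, encode, decode, n_samples=32):
    """window: observed KPI window; encode/decode: placeholder callables
    returning (mean, std) arrays of a trained VAE."""
    mu_z, sigma_z = encode(window)
    scores = []
    for _ in range(n_samples):
        z = mu_z + sigma_z * np.random.randn(*mu_z.shape)  # sample latent code
        mu_x, sigma_x = decode(z)
        log_p = -0.5 * (((window - mu_x) / sigma_x) ** 2
                        + np.log(2 * np.pi * sigma_x ** 2))
        scores.append(log_p.sum())
    return float(np.mean(scores))   # lower -> more anomalous
```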

DOPING: Generative Data Augmentation for Unsupervised Anomaly Detection with GAN

Title DOPING: Generative Data Augmentation for Unsupervised Anomaly Detection with GAN
Authors Swee Kiat Lim, Yi Loo, Ngoc-Trung Tran, Ngai-Man Cheung, Gemma Roig, Yuval Elovici
Abstract Recently, the introduction of the generative adversarial network (GAN) and its variants has enabled the generation of realistic synthetic samples, which has been used for enlarging training sets. Previous work primarily focused on data augmentation for semi-supervised and supervised tasks. In this paper, we instead focus on unsupervised anomaly detection and propose a novel generative data augmentation framework optimized for this task. In particular, we propose to oversample infrequent normal samples - normal samples that occur with small probability, e.g., rare normal events. We show that these samples are responsible for false positives in anomaly detection. However, oversampling of infrequent normal samples is challenging for real-world high-dimensional data with multimodal distributions. To address this challenge, we propose to use a GAN variant known as the adversarial autoencoder (AAE) to transform the high-dimensional multimodal data distributions into low-dimensional unimodal latent distributions with well-defined tail probability. Then, we systematically oversample at the ‘edge’ of the latent distributions to increase the density of infrequent normal samples. We show that our oversampling pipeline is a unified one: it is generally applicable to datasets with different complex data distributions. To the best of our knowledge, our method is the first data augmentation technique focused on improving performance in unsupervised anomaly detection. We validate our method by demonstrating consistent improvements across several real-world datasets.
Tasks Anomaly Detection, Data Augmentation, Unsupervised Anomaly Detection
Published 2018-08-23
URL http://arxiv.org/abs/1808.07632v2
PDF http://arxiv.org/pdf/1808.07632v2.pdf
PWC https://paperswithcode.com/paper/doping-generative-data-augmentation-for
Repo https://github.com/greentfrapp/doping
Framework none
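
A hedged sketch of the oversampling step: assuming an adversarial autoencoder whose latent prior is a standard Gaussian, draw latent codes from the tail (the ‘edge’) of that prior and decode them into synthetic infrequent normal samples. The quantile cutoff is an assumption, and `decoder` is a placeholder for a trained AAE decoder.

```python
# Hedged sketch: rejection-sample latent codes whose norm lies in the tail
# of a standard Gaussian prior, then decode them to augment the training set.
import numpy as np

def sample_latent_edge(n, dim, tail_quantile=0.95):
    """Draw n latent vectors from the tail of an isotropic Gaussian prior."""
    norms_sq = np.sum(np.random.randn(100_000, dim) ** 2, axis=1)
    radius_threshold = np.sqrt(np.percentile(norms_sq, 100 * tail_quantile))
    samples = []
    while len(samples) < n:
        z = np.random.randn(dim)
        if np.linalg.norm(z) >= radius_threshold:   # keep only 'edge' codes
            samples.append(z)
    return np.stack(samples)

# synthetic = decoder.predict(sample_latent_edge(512, dim=8))  # placeholder decoder
```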

Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features

Title Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features
Authors Minh-Nghia Nguyen, Ngo Anh Vien
Abstract The one-class support vector machine (OC-SVM) has long been one of the most effective anomaly detection methods and has been extensively adopted in both research and industrial applications. The biggest issue for OC-SVM, however, remains its limited ability to operate on large and high-dimensional datasets due to optimization complexity. These problems might be mitigated via dimensionality reduction techniques such as manifold learning or autoencoders. However, previous work often treats representation learning and anomaly prediction separately. In this paper, we propose the autoencoder-based one-class support vector machine (AE-1SVM), which brings OC-SVM, with the aid of random Fourier features to approximate the radial basis kernel, into the deep learning context by combining it with a representation learning architecture and jointly exploiting stochastic gradient descent to obtain end-to-end training. Interestingly, this also opens up the possible use of gradient-based attribution methods to explain the decision making for anomaly detection, which has long been challenging as a result of the implicit mappings between the input space and the kernel space. To the best of our knowledge, this is the first work to study the interpretability of deep learning in anomaly detection. We evaluate our method on a wide range of unsupervised anomaly detection tasks, in which our end-to-end training architecture achieves performance significantly better than previous work using separate training.
Tasks Anomaly Detection, Decision Making, Dimensionality Reduction, Representation Learning, Unsupervised Anomaly Detection
Published 2018-04-13
URL http://arxiv.org/abs/1804.04888v2
PDF http://arxiv.org/pdf/1804.04888v2.pdf
PWC https://paperswithcode.com/paper/scalable-and-interpretable-one-class-svms
Repo https://github.com/minh-nghia/AE-1SVM
Framework tf
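
The random-Fourier-feature approximation is the piece that turns the kernelized OC-SVM objective into a linear model trainable with SGD. A minimal sketch follows; the feature count and kernel bandwidth gamma are assumptions, and the autoencoder and one-class objective are not shown.

```python
# Minimal sketch of random Fourier features approximating the RBF kernel
# k(x, y) = exp(-gamma * ||x - y||^2), so that k(x, y) ~= Z(x) @ Z(y).
import numpy as np

def random_fourier_features(X, n_features=256, gamma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = np.random.rand(1000, 32)
Z = random_fourier_features(X)   # explicit feature map replacing the kernel
print(Z.shape)                   # (1000, 256)
```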

Fast and Accurate Point Cloud Registration using Trees of Gaussian Mixtures

Title Fast and Accurate Point Cloud Registration using Trees of Gaussian Mixtures
Authors Ben Eckart, Kihwan Kim, Jan Kautz
Abstract Point cloud registration sits at the core of many important and challenging 3D perception problems including autonomous navigation, SLAM, object/scene recognition, and augmented reality. In this paper, we present a new registration algorithm that is able to achieve state-of-the-art speed and accuracy through its use of a hierarchical Gaussian Mixture Model (GMM) representation. Our method constructs a top-down multi-scale representation of point cloud data by recursively running many small-scale data likelihood segmentations in parallel on a GPU. We leverage the resulting representation using a novel PCA-based optimization criterion that adaptively finds the best scale to perform data association between spatial subsets of point cloud data. Compared to previous Iterative Closest Point and GMM-based techniques, our tree-based point association algorithm performs data association in logarithmic-time while dynamically adjusting the level of detail to best match the complexity and spatial distribution characteristics of local scene geometry. In addition, unlike other GMM methods that restrict covariances to be isotropic, our new PCA-based optimization criterion well-approximates the true MLE solution even when fully anisotropic Gaussian covariances are used. Efficient data association, multi-scale adaptability, and a robust MLE approximation produce an algorithm that is up to an order of magnitude both faster and more accurate than current state-of-the-art on a wide variety of 3D datasets captured from LiDAR to structured light.
Tasks Autonomous Navigation, Point Cloud Registration, Scene Recognition
Published 2018-07-06
URL http://arxiv.org/abs/1807.02587v1
PDF http://arxiv.org/pdf/1807.02587v1.pdf
PWC https://paperswithcode.com/paper/fast-and-accurate-point-cloud-registration
Repo https://github.com/neka-nat/probreg
Framework none
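
As a hedged illustration of the underlying idea (not the paper's hierarchical tree-of-mixtures construction or its PCA-based scale selection), the sketch below fits a mixture of fully anisotropic Gaussians to a target cloud and uses the per-point responsibilities as soft data association for a source cloud, via scikit-learn.

```python
# Hedged sketch: GMM representation of a point cloud and soft data association.
import numpy as np
from sklearn.mixture import GaussianMixture

target = np.random.rand(2000, 3)                      # toy target cloud
gmm = GaussianMixture(n_components=16, covariance_type="full").fit(target)

source = np.random.rand(500, 3)                       # cloud to be registered
resp = gmm.predict_proba(source)                      # responsibilities = soft association
print(resp.shape)                                     # (500, 16)
```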

DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds

Title DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds
Authors Li Ding, Chen Feng
Abstract We propose DeepMapping, a novel registration framework using deep neural networks (DNNs) as auxiliary functions to align multiple point clouds from scratch to a globally consistent frame. We use DNNs to model the highly non-convex mapping process that traditionally involves hand-crafted data association, sensor pose initialization, and global refinement. Our key novelty is that “training” these DNNs with properly defined unsupervised losses is equivalent to solving the underlying registration problem, but less sensitive to good initialization than ICP. Our framework contains two DNNs: a localization network that estimates the poses for input point clouds, and a map network that models the scene structure by estimating the occupancy status of global coordinates. This allows us to convert the registration problem to a binary occupancy classification, which can be solved efficiently using gradient-based optimization. We further show that DeepMapping can be readily extended to address the problem of Lidar SLAM by imposing geometric constraints between consecutive point clouds. Experiments are conducted on both simulated and real datasets. Qualitative and quantitative comparisons demonstrate that DeepMapping often enables more robust and accurate global registration of multiple point clouds than existing techniques. Our code is available at https://ai4ce.github.io/DeepMapping/.
Tasks Point Cloud Registration
Published 2018-11-28
URL http://arxiv.org/abs/1811.11397v2
PDF http://arxiv.org/pdf/1811.11397v2.pdf
PWC https://paperswithcode.com/paper/deepmapping-unsupervised-map-estimation-from
Repo https://github.com/ai4ce/DeepMapping
Framework pytorch
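
The conversion of registration into binary occupancy classification can be sketched compactly: points transformed into the global frame by the (not shown) localization network should be classified as occupied by the map network, while points sampled between the sensor and each return should be classified as free. The network sizes below are illustrative only, and this is a sketch of the idea rather than the released DeepMapping code.

```python
# Hedged sketch of the occupancy-classification loss (2-D case).
import torch
import torch.nn as nn

map_net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(),
                        nn.Linear(64, 64), nn.ReLU(),
                        nn.Linear(64, 1))             # logit of occupancy

def occupancy_loss(occupied_xy, free_xy):
    """occupied_xy, free_xy: (N, 2) global coordinates from the pose estimates."""
    bce = nn.BCEWithLogitsLoss()
    loss_occ = bce(map_net(occupied_xy), torch.ones(len(occupied_xy), 1))
    loss_free = bce(map_net(free_xy), torch.zeros(len(free_xy), 1))
    return loss_occ + loss_free

loss = occupancy_loss(torch.rand(256, 2), torch.rand(256, 2))
loss.backward()   # in the full method, gradients also flow into the pose network
```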

Pre-Synaptic Pool Modification (PSPM): A Supervised Learning Procedure for Spiking Neural Networks

Title Pre-Synaptic Pool Modification (PSPM): A Supervised Learning Procedure for Spiking Neural Networks
Authors Bryce Bagley, Blake Bordelon, Benjamin Moseley, Ralf Wessel
Abstract Learning synaptic weights of spiking neural network (SNN) models that can reproduce target spike trains from provided neural firing data is a central problem in computational neuroscience and spike-based computing. The discovery of the optimal weight values can be posed as a supervised learning task wherein the weights of the model network are chosen to maximize the similarity between the target spike trains and the model outputs. It is still largely unknown whether optimizing spike train similarity of highly recurrent SNNs produces weight matrices similar to those of the ground truth model. To this end, we propose flexible heuristic supervised learning rules, termed Pre-Synaptic Pool Modification (PSPM), that rely on stochastic weight updates in order to produce spikes within a short window of the desired times and eliminate spikes outside of this window. PSPM improves spike train similarity for all-to-all SNNs and makes no assumption about the post-synaptic potential of the neurons or the structure of the network since no gradients are required. We test whether optimizing for spike train similarity entails the discovery of accurate weights and explore the relative contributions of local and homeostatic weight updates. Although PSPM improves similarity between spike trains, the learned weights often differ from the weights of the ground truth model, implying that connectome inference from spike data may require additional constraints on connectivity statistics. We also find that spike train similarity is sensitive to local updates, but other measures of network activity, such as avalanche distributions, can be learned through synaptic homeostasis.
Tasks
Published 2018-10-07
URL https://arxiv.org/abs/1810.03199v3
PDF https://arxiv.org/pdf/1810.03199v3.pdf
PWC https://paperswithcode.com/paper/pre-synaptic-pool-modification-pspm-a
Repo https://github.com/blakebordelon/Spiking-Neural-Network-Optimization
Framework none
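
A hedged, heavily simplified sketch of the rule as described: compare model and target spike trains within a tolerance window; stochastically potentiate weights from the pre-synaptic pool of a neuron that missed a desired spike, and depress them when a spike occurred outside any desired window. The window size and update magnitudes are illustrative, not the paper's settings.

```python
# Hedged sketch of a PSPM-style stochastic, gradient-free weight update.
import numpy as np

def pspm_update(weights, model_spikes, target_spikes, window=3, delta=0.01):
    """weights: (N_pre, N_post); *_spikes: (T, N_post) binary arrays."""
    rng = np.random.default_rng()
    T, n_post = target_spikes.shape
    for j in range(n_post):
        for t in np.flatnonzero(target_spikes[:, j]):
            lo, hi = max(0, t - window), min(T, t + window + 1)
            if not model_spikes[lo:hi, j].any():          # missed a desired spike
                weights[:, j] += delta * rng.random(weights.shape[0])
        for t in np.flatnonzero(model_spikes[:, j]):
            lo, hi = max(0, t - window), min(T, t + window + 1)
            if not target_spikes[lo:hi, j].any():         # spurious spike
                weights[:, j] -= delta * rng.random(weights.shape[0])
    return weights
```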

AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks

Title AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks
Authors Weiping Song, Chence Shi, Zhiping Xiao, Zhijian Duan, Yewen Xu, Ming Zhang, Jian Tang
Abstract Click-through rate (CTR) prediction, which aims to predict the probability of a user clicking on an ad or an item, is critical to many online applications such as online advertising and recommender systems. The problem is very challenging since (1) the input features (e.g., the user id, user age, item id, item category) are usually sparse and high-dimensional, and (2) an effective prediction relies on high-order combinatorial features (a.k.a. cross features), which are very time-consuming to hand-craft by domain experts and are impossible to enumerate. Therefore, there have been efforts in finding low-dimensional representations of the sparse and high-dimensional raw features and their meaningful combinations. In this paper, we propose an effective and efficient method called AutoInt to automatically learn the high-order feature interactions of input features. Our proposed algorithm is very general and can be applied to both numerical and categorical input features. Specifically, we map both the numerical and categorical features into the same low-dimensional space. Afterwards, a multi-head self-attentive neural network with residual connections is proposed to explicitly model the feature interactions in the low-dimensional space. With different layers of the multi-head self-attentive neural networks, different orders of feature combinations of input features can be modeled. The whole model can be efficiently fit on large-scale raw data in an end-to-end fashion. Experimental results on four real-world datasets show that our proposed approach not only outperforms existing state-of-the-art approaches for prediction but also offers good explainability. Code is available at: https://github.com/DeepGraphLearning/RecommenderSystems.
Tasks Click-Through Rate Prediction, Recommendation Systems
Published 2018-10-29
URL https://arxiv.org/abs/1810.11921v2
PDF https://arxiv.org/pdf/1810.11921v2.pdf
PWC https://paperswithcode.com/paper/autoint-automatic-feature-interaction
Repo https://github.com/shichence/AutoInt
Framework tf
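
A minimal sketch of the interaction layer in this spirit: each field (categorical or numerical) is assumed already embedded into the same d-dimensional space; multi-head self-attention plus a residual connection then models interactions among fields, and stacking such layers captures higher-order combinations. Field count and sizes are illustrative, not the paper's configuration.

```python
# Minimal sketch of a self-attentive feature-interaction layer for CTR prediction.
import tensorflow as tf

n_fields, d_embed = 10, 16
fields = tf.keras.Input(shape=(n_fields, d_embed))         # pre-embedded fields

attn_out = tf.keras.layers.MultiHeadAttention(
    num_heads=2, key_dim=d_embed)(fields, fields)           # field-to-field attention
interacted = tf.keras.layers.ReLU()(
    tf.keras.layers.Add()([attn_out, fields]))               # residual connection

prob = tf.keras.layers.Dense(1, activation="sigmoid")(
    tf.keras.layers.Flatten()(interacted))                   # click probability

model = tf.keras.Model(fields, prob)
model.summary()
```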

How2: A Large-scale Dataset for Multimodal Language Understanding

Title How2: A Large-scale Dataset for Multimodal Language Understanding
Authors Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, Florian Metze
Abstract In this paper, we introduce How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations. We also present integrated sequence-to-sequence baselines for machine translation, automatic speech recognition, spoken language translation, and multimodal summarization. By making available data and code for several multimodal natural language tasks, we hope to stimulate more research on these and similar challenges, to obtain a deeper understanding of multimodality in language processing.
Tasks Machine Translation, Speech Recognition
Published 2018-11-01
URL http://arxiv.org/abs/1811.00347v2
PDF http://arxiv.org/pdf/1811.00347v2.pdf
PWC https://paperswithcode.com/paper/how2-a-large-scale-dataset-for-multimodal
Repo https://github.com/srvk/how2-dataset
Framework none


DTMT: A Novel Deep Transition Architecture for Neural Machine Translation

Title DTMT: A Novel Deep Transition Architecture for Neural Machine Translation
Authors Fandong Meng, Jinchao Zhang
Abstract Past years have witnessed rapid developments in Neural Machine Translation (NMT). Most recently, with advanced modeling and training techniques, RNN-based NMT (RNMT) has shown its potential strength, even compared with the well-known Transformer (self-attentional) model. Although the RNMT model can possess very deep architectures through stacking layers, the transition depth between consecutive hidden states along the sequential axis is still shallow. In this paper, we further enhance RNN-based NMT by increasing the transition depth between consecutive hidden states and build a novel Deep Transition RNN-based Architecture for Neural Machine Translation, named DTMT. This model enhances the hidden-to-hidden transition with multiple non-linear transformations, while maintaining a linear transformation path throughout this deep transition via a well-designed linear transformation mechanism to alleviate the vanishing gradient problem. Experiments show that with the specially designed deep transition modules, our DTMT can achieve remarkable improvements in translation quality. Experimental results on the Chinese->English translation task show that DTMT can outperform the Transformer model by +2.09 BLEU points and achieve the best results ever reported on the same dataset. On the WMT14 English->German and English->French translation tasks, DTMT shows superior quality to state-of-the-art NMT systems, including the Transformer and the RNMT+.
Tasks Machine Translation
Published 2018-12-19
URL https://arxiv.org/abs/1812.07807v2
PDF https://arxiv.org/pdf/1812.07807v2.pdf
PWC https://paperswithcode.com/paper/dtmt-a-novel-deep-transition-architecture-for
Repo https://github.com/fandongmeng/DTMT_InDec
Framework tf
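
The deep-transition idea can be sketched compactly: the first recurrent cell consumes the token embedding, and additional "transition" cells refine the hidden state with no new input before the next time step, deepening the hidden-to-hidden transition. The sketch below uses plain GRU cells fed a dummy zero input for the transitions; DTMT's specific cells with an explicit linear transformation path are not reproduced.

```python
# Hedged sketch of a deep-transition recurrent step (sizes are illustrative).
import torch
import torch.nn as nn

class DeepTransitionRNN(nn.Module):
    def __init__(self, d_emb=32, d_hid=64, transition_depth=3):
        super().__init__()
        self.first = nn.GRUCell(d_emb, d_hid)
        self.transitions = nn.ModuleList(
            nn.GRUCell(1, d_hid) for _ in range(transition_depth))

    def forward(self, embeddings):                     # (T, B, d_emb)
        T, B, _ = embeddings.shape
        h = embeddings.new_zeros(B, self.first.hidden_size)
        outputs = []
        for t in range(T):
            h = self.first(embeddings[t], h)           # consume the input token
            dummy = embeddings.new_zeros(B, 1)         # transition cells get no real input
            for cell in self.transitions:
                h = cell(dummy, h)                      # deepen the h->h transition
            outputs.append(h)
        return torch.stack(outputs)                     # (T, B, d_hid)

out = DeepTransitionRNN()(torch.randn(5, 2, 32))
print(out.shape)                                        # torch.Size([5, 2, 64])
```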

Partial Convolution based Padding

Title Partial Convolution based Padding
Authors Guilin Liu, Kevin J. Shih, Ting-Chun Wang, Fitsum A. Reda, Karan Sapra, Zhiding Yu, Andrew Tao, Bryan Catanzaro
Abstract In this paper, we present a simple yet effective padding scheme that can be used as a drop-in module for existing convolutional neural networks. We call it partial convolution based padding, with the intuition that the padded region can be treated as holes and the original input as non-holes. Specifically, during the convolution operation, the convolution results are re-weighted near image borders based on the ratios between the padded area and the convolution sliding window area. Extensive experiments with various deep network models on ImageNet classification and semantic segmentation demonstrate that the proposed padding scheme consistently outperforms standard zero padding with better accuracy.
Tasks Semantic Segmentation
Published 2018-11-28
URL http://arxiv.org/abs/1811.11718v1
PDF http://arxiv.org/pdf/1811.11718v1.pdf
PWC https://paperswithcode.com/paper/partial-convolution-based-padding
Repo https://github.com/lessw2020/auto-adaptive-ai
Framework pytorch
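
The re-weighting at the heart of the scheme is simple to sketch: treat padded pixels as holes, convolve a mask of ones with a same-sized window to count valid pixels, and scale the zero-padded convolution output by the ratio between the full window area and that valid count. This is a hedged sketch of the idea, not the released implementation.

```python
# Hedged sketch of partial-convolution-based padding via border re-weighting.
import torch
import torch.nn.functional as F

def partial_conv_padding(x, weight, bias=None, padding=1):
    """x: (B, C_in, H, W); weight: (C_out, C_in, k, k)."""
    k = weight.shape[-1]
    out = F.conv2d(x, weight, bias=None, padding=padding)      # plain zero padding
    mask = torch.ones(1, 1, x.shape[2], x.shape[3], device=x.device)
    ones_kernel = torch.ones(1, 1, k, k, device=x.device)
    valid = F.conv2d(mask, ones_kernel, padding=padding)       # valid pixels per window
    out = out * (k * k / valid)                                # boost results near borders
    if bias is not None:
        out = out + bias.view(1, -1, 1, 1)
    return out

y = partial_conv_padding(torch.randn(1, 3, 8, 8), torch.randn(4, 3, 3, 3))
print(y.shape)   # torch.Size([1, 4, 8, 8])
```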