Paper Group AWR 12
High Quality Prediction of Protein Q8 Secondary Structure by Diverse Neural Network Architectures. The Text-Based Adventure AI Competition. Learning to Shoot in First Person Shooter Games by Stabilizing Actions and Clustering Rewards for Reinforcement Learning. Memorable Maps: A Framework for Re-defining Places in Visual Place Recognition. MVSNet: …
High Quality Prediction of Protein Q8 Secondary Structure by Diverse Neural Network Architectures
Title | High Quality Prediction of Protein Q8 Secondary Structure by Diverse Neural Network Architectures |
Authors | Iddo Drori, Isht Dwivedi, Pranav Shrestha, Jeffrey Wan, Yueqi Wang, Yunchu He, Anthony Mazza, Hugh Krogh-Freeman, Dimitri Leggas, Kendal Sandridge, Linyong Nan, Kaveri Thakoor, Chinmay Joshi, Sonam Goenka, Chen Keasar, Itsik Pe’er |
Abstract | We tackle the problem of protein secondary structure prediction using a common task framework. This led to the introduction of multiple ideas for neural architectures based on state-of-the-art building blocks, used in this task for the first time. We take a principled machine learning approach, which provides genuine, unbiased performance measures, correcting longstanding errors in the application domain. We focus on the Q8 resolution of secondary structure, an active area for continuously improving methods. We use an ensemble of strong predictors to achieve an accuracy of 70.7% (on the CB513 test set using the CB6133filtered training set). These results are statistically indistinguishable from those of the top existing predictors. In the spirit of reproducible research we make our data, models and code available, aiming to set a gold standard for purity of training and testing sets. Such good practices lower entry barriers to this domain and facilitate reproducible, extendable research. |
Tasks | Protein Secondary Structure Prediction |
Published | 2018-11-17 |
URL | http://arxiv.org/abs/1811.07143v1 |
http://arxiv.org/pdf/1811.07143v1.pdf | |
PWC | https://paperswithcode.com/paper/high-quality-prediction-of-protein-q8 |
Repo | https://github.com/idrori/cu-ssp |
Framework | tf |
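To make the evaluation setting above concrete, here is a minimal, illustrative sketch (not the authors' code) of how an ensemble of Q8 predictors might be combined and scored: per-residue class probabilities from several hypothetical models are averaged, and Q8 accuracy is computed while ignoring padded positions.

```python
# Illustrative sketch only: averaging softmax outputs from several hypothetical
# Q8 predictors and scoring per-residue accuracy, ignoring padded positions.
import numpy as np

rng = np.random.default_rng(0)
n_residues, n_classes, n_models = 700, 8, 3          # Q8 has 8 secondary-structure classes

# Stand-ins for per-model softmax outputs and ground-truth labels.
model_probs = rng.dirichlet(np.ones(n_classes), size=(n_models, n_residues))
labels = rng.integers(0, n_classes, size=n_residues)
mask = np.ones(n_residues, dtype=bool)               # False where the sequence is padded

ensemble_probs = model_probs.mean(axis=0)            # simple probability averaging
predictions = ensemble_probs.argmax(axis=1)

q8_accuracy = (predictions[mask] == labels[mask]).mean()
print(f"Q8 accuracy: {q8_accuracy:.3f}")
```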
The Text-Based Adventure AI Competition
Title | The Text-Based Adventure AI Competition |
Authors | Timothy Atkinson, Hendrik Baier, Tara Copplestone, Sam Devlin, Jerry Swan |
Abstract | In 2016, 2017, and 2018 at the IEEE Conference on Computational Intelligence in Games, the authors of this paper ran a competition for agents that can play classic text-based adventure games. This competition fills a gap in existing game AI competitions that have typically focussed on traditional card/board games or modern video games with graphical interfaces. By providing a platform for evaluating agents in text-based adventures, the competition provides a novel benchmark for game AI with unique challenges for natural language understanding and generation. This paper summarises the three competitions run in 2016, 2017, and 2018 (including details of open-source implementations of both the competition framework and our competitors) and presents the results of an improved evaluation of these competitors across 20 games. |
Tasks | Board Games |
Published | 2018-08-03 |
URL | http://arxiv.org/abs/1808.01262v4 |
http://arxiv.org/pdf/1808.01262v4.pdf | |
PWC | https://paperswithcode.com/paper/the-text-based-adventure-ai-competition |
Repo | https://github.com/Microsoft/nail_agent |
Framework | none |
Learning to Shoot in First Person Shooter Games by Stabilizing Actions and Clustering Rewards for Reinforcement Learning
Title | Learning to Shoot in First Person Shooter Games by Stabilizing Actions and Clustering Rewards for Reinforcement Learning |
Authors | Frank G. Glavin, Michael G. Madden |
Abstract | While reinforcement learning (RL) has been applied to turn-based board games for many years, more complex games involving decision-making in real-time are beginning to receive more attention. A challenge in such environments is that the time that elapses between deciding to take an action and receiving a reward based on its outcome can be longer than the interval between successive decisions. We explore this in the context of a non-player character (NPC) in a modern first-person shooter game. Such games take place in 3D environments where players, both human and computer-controlled, compete by engaging in combat and completing task objectives. We investigate the use of RL to enable NPCs to gather experience from game-play and improve their shooting skill over time from a reward signal based on the damage caused to opponents. We propose a new method for RL updates and reward calculations, in which the updates are carried out periodically, after each shooting encounter has ended, and a new weighted-reward mechanism is used which increases the reward applied to actions that lead to damaging the opponent in successive hits in what we term “hit clusters”. |
Tasks | Board Games, Decision Making |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.05117v1 |
http://arxiv.org/pdf/1806.05117v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-shoot-in-first-person-shooter |
Repo | https://github.com/lucylow/b00m-h3adsh0t |
Framework | none |
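The abstract does not give the exact weighting rule, so the snippet below is only a hedged illustration of the "hit cluster" idea: rewards for consecutive damaging hits within an encounter grow with the length of the streak. The base reward and bonus factor are arbitrary placeholders.

```python
# Illustrative only: weight rewards so that consecutive damaging hits in a
# shooting encounter ("hit clusters") earn progressively larger rewards.
def clustered_rewards(hits, base_reward=1.0, cluster_bonus=0.5):
    """hits: list of booleans, one per shot in an encounter (True = damaged opponent)."""
    rewards, streak = [], 0
    for hit in hits:
        if hit:
            streak += 1
            # each extra hit in the current cluster adds a bonus on top of the base reward
            rewards.append(base_reward + cluster_bonus * (streak - 1))
        else:
            streak = 0
            rewards.append(0.0)
    return rewards

# Example encounter: one isolated hit followed by a three-hit cluster.
print(clustered_rewards([True, False, True, True, True, False]))
# -> [1.0, 0.0, 1.0, 1.5, 2.0, 0.0]
```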
Memorable Maps: A Framework for Re-defining Places in Visual Place Recognition
Title | Memorable Maps: A Framework for Re-defining Places in Visual Place Recognition |
Authors | Mubariz Zaffar, Shoaib Ehsan, Michael Milford, Klaus McDonald-Maier |
Abstract | This paper presents a cognition-inspired agnostic framework for building a map for Visual Place Recognition. This framework draws inspiration from human memorability, utilizes the traditional image entropy concept and computes the static content in an image; thereby presenting a tri-folded criterion to assess the ‘memorability’ of an image for visual place recognition. A dataset, namely ‘ESSEX3IN1’, is created, composed of highly confusing images from indoor, outdoor and natural scenes for analysis. When used in conjunction with state-of-the-art visual place recognition methods, the proposed framework provides a significant performance boost to these techniques, as evidenced by results on ESSEX3IN1 and other public datasets. |
Tasks | Visual Place Recognition |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03529v2 |
http://arxiv.org/pdf/1811.03529v2.pdf | |
PWC | https://paperswithcode.com/paper/memorable-maps-a-framework-for-re-defining |
Repo | https://github.com/MubarizZaffar/ESSEX3IN1-Dataset |
Framework | none |
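One of the three memorability criteria named in the abstract is the traditional image-entropy concept; a minimal sketch of Shannon entropy over a grayscale histogram is shown below (illustrative, not the authors' implementation).

```python
# Shannon entropy of a grayscale image histogram, one of the three criteria
# ("traditional image entropy") mentioned in the abstract. Illustrative sketch only.
import numpy as np

def image_entropy(gray, n_bins=256):
    """gray: 2D array of intensities in [0, 255]. Returns entropy in bits."""
    hist, _ = np.histogram(gray, bins=n_bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                          # ignore empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
flat_image = np.full((64, 64), 128)       # low entropy: a single intensity
noisy_image = rng.integers(0, 256, size=(64, 64))
print(image_entropy(flat_image), image_entropy(noisy_image))
```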
MVSNet: Depth Inference for Unstructured Multi-view Stereo
Title | MVSNet: Depth Inference for Unstructured Multi-view Stereo |
Authors | Yao Yao, Zixin Luo, Shiwei Li, Tian Fang, Long Quan |
Abstract | We present an end-to-end deep learning architecture for depth map inference from multi-view images. In the network, we first extract deep visual image features, and then build the 3D cost volume upon the reference camera frustum via the differentiable homography warping. Next, we apply 3D convolutions to regularize and regress the initial depth map, which is then refined with the reference image to generate the final output. Our framework flexibly adapts to arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature. The proposed MVSNet is demonstrated on the large-scale indoor DTU dataset. With simple post-processing, our method not only significantly outperforms previous state-of-the-art methods, but is also several times faster at runtime. We also evaluate MVSNet on the complex outdoor Tanks and Temples dataset, where our method ranks first before April 18, 2018 without any fine-tuning, showing the strong generalization ability of MVSNet. |
Tasks | |
Published | 2018-04-07 |
URL | http://arxiv.org/abs/1804.02505v2 |
http://arxiv.org/pdf/1804.02505v2.pdf | |
PWC | https://paperswithcode.com/paper/mvsnet-depth-inference-for-unstructured-multi |
Repo | https://github.com/YoYo000/MVSNet |
Framework | tf |
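The variance-based cost metric described in the abstract maps N warped feature volumes into a single cost volume; a minimal numpy sketch of that aggregation step follows, with illustrative shapes.

```python
# Variance-based cost aggregation over N warped feature volumes, as described in
# the abstract: cost = mean_i (V_i - V_mean)^2. Shapes here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_views, channels, depth, height, width = 3, 8, 16, 32, 32

# Stand-in for feature volumes already warped onto the reference camera frustum.
warped_features = rng.standard_normal((n_views, channels, depth, height, width))

mean_volume = warped_features.mean(axis=0)
cost_volume = ((warped_features - mean_volume) ** 2).mean(axis=0)

print(cost_volume.shape)   # (channels, depth, height, width): one cost feature volume
```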
Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications
Title | Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications |
Authors | Haowen Xu, Wenxiao Chen, Nengwen Zhao, Zeyan Li, Jiahao Bu, Zhihan Li, Ying Liu, Youjian Zhao, Dan Pei, Yang Feng, Jie Chen, Zhaogang Wang, Honglin Qiao |
Abstract | To ensure undisrupted business, large Internet companies need to closely monitor various KPIs (e.g., Page Views, number of online users, and number of orders) of their Web applications, to accurately detect anomalies and trigger timely troubleshooting/mitigation. However, anomaly detection for these seasonal KPIs with various patterns and data quality has been a great challenge, especially without labels. In this paper, we propose Donut, an unsupervised anomaly detection algorithm based on VAE. Thanks to a few of our key techniques, Donut greatly outperforms a state-of-the-art supervised ensemble approach and a baseline VAE approach, and its best F-scores range from 0.75 to 0.9 for the studied KPIs from a top global Internet company. We come up with a novel KDE interpretation of reconstruction for Donut, making it the first VAE-based anomaly detection algorithm with a solid theoretical explanation. |
Tasks | Anomaly Detection, Unsupervised Anomaly Detection |
Published | 2018-02-12 |
URL | http://arxiv.org/abs/1802.03903v1 |
http://arxiv.org/pdf/1802.03903v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-anomaly-detection-via |
Repo | https://github.com/nakumgaurav/Anomaly-Detection |
Framework | none |
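As a toy sketch of the reconstruction-probability idea behind VAE-based anomaly scoring, the snippet below estimates E_{z~q(z|x)}[log p(x|z)] by Monte Carlo. The linear encode/decode functions are hypothetical stand-ins, not the Donut networks, and the paper's sliding-window preprocessing and modified training techniques are omitted.

```python
# Toy sketch of VAE reconstruction-probability anomaly scoring; the linear
# encode/decode functions are hypothetical stand-ins, not the paper's networks.
import numpy as np

rng = np.random.default_rng(0)

def encode(x):
    # hypothetical Gaussian posterior q(z|x): returns (mean, std) of the latent z
    return 0.5 * x, np.full_like(x, 0.1)

def decode(z):
    # hypothetical Gaussian likelihood p(x|z): returns (mean, std) of the reconstruction
    return 2.0 * z, np.full_like(z, 0.2)

def reconstruction_score(x, n_samples=64):
    """Monte-Carlo estimate of E_{z ~ q(z|x)}[log p(x|z)]; lower means more anomalous."""
    mu_z, sigma_z = encode(x)
    total = 0.0
    for _ in range(n_samples):
        z = mu_z + sigma_z * rng.standard_normal(mu_z.shape)
        mu_x, sigma_x = decode(z)
        log_p = -0.5 * (((x - mu_x) / sigma_x) ** 2 + np.log(2 * np.pi * sigma_x ** 2))
        total += log_p.sum()
    return total / n_samples

print(reconstruction_score(np.array([1.0, 0.5, 0.8])))
```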
DOPING: Generative Data Augmentation for Unsupervised Anomaly Detection with GAN
Title | DOPING: Generative Data Augmentation for Unsupervised Anomaly Detection with GAN |
Authors | Swee Kiat Lim, Yi Loo, Ngoc-Trung Tran, Ngai-Man Cheung, Gemma Roig, Yuval Elovici |
Abstract | Recently, the introduction of the generative adversarial network (GAN) and its variants has enabled the generation of realistic synthetic samples, which has been used for enlarging training sets. Previous work primarily focused on data augmentation for semi-supervised and supervised tasks. In this paper, we instead focus on unsupervised anomaly detection and propose a novel generative data augmentation framework optimized for this task. In particular, we propose to oversample infrequent normal samples - normal samples that occur with small probability, e.g., rare normal events. We show that these samples are responsible for false positives in anomaly detection. However, oversampling of infrequent normal samples is challenging for real-world high-dimensional data with multimodal distributions. To address this challenge, we propose to use a GAN variant known as the adversarial autoencoder (AAE) to transform the high-dimensional multimodal data distributions into low-dimensional unimodal latent distributions with well-defined tail probability. Then, we systematically oversample at the ‘edge’ of the latent distributions to increase the density of infrequent normal samples. We show that our oversampling pipeline is a unified one: it is generally applicable to datasets with different complex data distributions. To the best of our knowledge, our method is the first data augmentation technique focused on improving performance in unsupervised anomaly detection. We validate our method by demonstrating consistent improvements across several real-world datasets. |
Tasks | Anomaly Detection, Data Augmentation, Unsupervised Anomaly Detection |
Published | 2018-08-23 |
URL | http://arxiv.org/abs/1808.07632v2 |
http://arxiv.org/pdf/1808.07632v2.pdf | |
PWC | https://paperswithcode.com/paper/doping-generative-data-augmentation-for |
Repo | https://github.com/greentfrapp/doping |
Framework | none |
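The core augmentation step described in the abstract is to sample at the edge (tail) of a unimodal latent distribution and decode those samples back into data space. A schematic numpy sketch follows; the decoder and the edge-band thresholds are hypothetical stand-ins, not the paper's trained AAE or its actual tail criterion.

```python
# Schematic sketch of the oversampling step: draw latent codes from the tail
# ("edge") of a unimodal latent prior, then decode them back to data space.
# The decoder weights and thresholds are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(0)
latent_dim, n_augment = 2, 100

W = rng.standard_normal((latent_dim, 8))    # hypothetical decoder weights

def decode(z):
    # stand-in for a trained AAE decoder mapping latent codes back to data space
    return np.tanh(z @ W)

# Sample latent codes whose norm falls in the tail of a standard Gaussian prior,
# i.e. infrequent-but-normal regions of the latent space.
samples = []
while len(samples) < n_augment:
    z = rng.standard_normal(latent_dim)
    if 2.0 < np.linalg.norm(z) < 3.0:       # "edge" band; thresholds are illustrative
        samples.append(z)

augmented = decode(np.stack(samples))
print(augmented.shape)                       # (100, 8) synthetic infrequent-normal samples
```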
Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features
Title | Scalable and Interpretable One-class SVMs with Deep Learning and Random Fourier features |
Authors | Minh-Nghia Nguyen, Ngo Anh Vien |
Abstract | The one-class support vector machine (OC-SVM) has long been one of the most effective anomaly detection methods and is extensively adopted in both research and industrial applications. The biggest remaining issue for OC-SVM is its limited capability to operate on large and high-dimensional datasets due to optimization complexity. Those problems might be mitigated via dimensionality reduction techniques such as manifold learning or autoencoders. However, previous work often treats representation learning and anomaly prediction separately. In this paper, we propose an autoencoder-based one-class support vector machine (AE-1SVM) that brings OC-SVM, with the aid of random Fourier features to approximate the radial basis kernel, into the deep learning context by combining it with a representation learning architecture and jointly exploiting stochastic gradient descent to obtain end-to-end training. Interestingly, this also opens up the possible use of gradient-based attribution methods to explain the decision making for anomaly detection, which has long been challenging as a result of the implicit mappings between the input space and the kernel space. To the best of our knowledge, this is the first work to study the interpretability of deep learning in anomaly detection. We evaluate our method on a wide range of unsupervised anomaly detection tasks, in which our end-to-end training architecture achieves performance significantly better than previous work using separate training. |
Tasks | Anomaly Detection, Decision Making, Dimensionality Reduction, Representation Learning, Unsupervised Anomaly Detection |
Published | 2018-04-13 |
URL | http://arxiv.org/abs/1804.04888v2 |
http://arxiv.org/pdf/1804.04888v2.pdf | |
PWC | https://paperswithcode.com/paper/scalable-and-interpretable-one-class-svms |
Repo | https://github.com/minh-nghia/AE-1SVM |
Framework | tf |
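The random-Fourier-feature approximation of the RBF kernel that the abstract relies on can be checked in a few lines; the sketch below (not the authors' code) compares inner products of the explicit feature map with the exact kernel value.

```python
# Random Fourier features approximating the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2),
# the approximation the abstract uses to make OC-SVM end-to-end trainable.
# Illustrative check only, not the authors' implementation.
import numpy as np

rng = np.random.default_rng(0)
d, D, gamma = 5, 2000, 0.5                 # input dim, number of random features, kernel width

W = rng.normal(0.0, np.sqrt(2 * gamma), size=(D, d))   # frequencies ~ N(0, 2*gamma*I)
b = rng.uniform(0.0, 2 * np.pi, size=D)

def rff(x):
    """Explicit feature map z(x) with z(x) . z(y) ~= k(x, y)."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.standard_normal(d), rng.standard_normal(d)
exact = np.exp(-gamma * np.sum((x - y) ** 2))
approx = rff(x) @ rff(y)
print(exact, approx)                        # the two values should be close
```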
Fast and Accurate Point Cloud Registration using Trees of Gaussian Mixtures
Title | Fast and Accurate Point Cloud Registration using Trees of Gaussian Mixtures |
Authors | Ben Eckart, Kihwan Kim, Jan Kautz |
Abstract | Point cloud registration sits at the core of many important and challenging 3D perception problems including autonomous navigation, SLAM, object/scene recognition, and augmented reality. In this paper, we present a new registration algorithm that is able to achieve state-of-the-art speed and accuracy through its use of a hierarchical Gaussian Mixture Model (GMM) representation. Our method constructs a top-down multi-scale representation of point cloud data by recursively running many small-scale data likelihood segmentations in parallel on a GPU. We leverage the resulting representation using a novel PCA-based optimization criterion that adaptively finds the best scale to perform data association between spatial subsets of point cloud data. Compared to previous Iterative Closest Point and GMM-based techniques, our tree-based point association algorithm performs data association in logarithmic-time while dynamically adjusting the level of detail to best match the complexity and spatial distribution characteristics of local scene geometry. In addition, unlike other GMM methods that restrict covariances to be isotropic, our new PCA-based optimization criterion well-approximates the true MLE solution even when fully anisotropic Gaussian covariances are used. Efficient data association, multi-scale adaptability, and a robust MLE approximation produce an algorithm that is up to an order of magnitude both faster and more accurate than current state-of-the-art on a wide variety of 3D datasets captured from LiDAR to structured light. |
Tasks | Autonomous Navigation, Point Cloud Registration, Scene Recognition |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02587v1 |
http://arxiv.org/pdf/1807.02587v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-accurate-point-cloud-registration |
Repo | https://github.com/neka-nat/probreg |
Framework | none |
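The data-association step that GMM-based registration builds on is the computation of component responsibilities for each point (the E-step). The sketch below shows only that flat, isotropic baseline; the paper's contribution is a hierarchical, GPU-parallel representation with anisotropic covariances and a PCA-based scale selection, none of which is shown here.

```python
# Flat (non-hierarchical) sketch of GMM point-to-component association: the E-step
# responsibilities underlying GMM-based registration. Isotropic covariances for brevity.
import numpy as np

rng = np.random.default_rng(0)
points = rng.standard_normal((100, 3))                   # a toy point cloud
means = rng.standard_normal((8, 3))                      # GMM component means
sigma2 = 0.5                                             # isotropic variance
weights = np.full(8, 1.0 / 8)

# log N(x | mu_k, sigma2 * I) up to a constant shared by all components
sq_dists = ((points[:, None, :] - means[None, :, :]) ** 2).sum(-1)   # (100, 8)
log_resp = np.log(weights)[None, :] - sq_dists / (2 * sigma2)
log_resp -= log_resp.max(axis=1, keepdims=True)                      # numerical stability
resp = np.exp(log_resp)
resp /= resp.sum(axis=1, keepdims=True)                              # rows sum to 1

print(resp.shape, resp.sum(axis=1)[:3])
```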
DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds
Title | DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds |
Authors | Li Ding, Chen Feng |
Abstract | We propose DeepMapping, a novel registration framework using deep neural networks (DNNs) as auxiliary functions to align multiple point clouds from scratch to a globally consistent frame. We use DNNs to model the highly non-convex mapping process that traditionally involves hand-crafted data association, sensor pose initialization, and global refinement. Our key novelty is that “training” these DNNs with properly defined unsupervised losses is equivalent to solving the underlying registration problem, but less sensitive to good initialization than ICP. Our framework contains two DNNs: a localization network that estimates the poses for input point clouds, and a map network that models the scene structure by estimating the occupancy status of global coordinates. This allows us to convert the registration problem to a binary occupancy classification, which can be solved efficiently using gradient-based optimization. We further show that DeepMapping can be readily extended to address the problem of Lidar SLAM by imposing geometric constraints between consecutive point clouds. Experiments are conducted on both simulated and real datasets. Qualitative and quantitative comparisons demonstrate that DeepMapping often enables more robust and accurate global registration of multiple point clouds than existing techniques. Our code is available at https://ai4ce.github.io/DeepMapping/. |
Tasks | Point Cloud Registration |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1811.11397v2 |
http://arxiv.org/pdf/1811.11397v2.pdf | |
PWC | https://paperswithcode.com/paper/deepmapping-unsupervised-map-estimation-from |
Repo | https://github.com/ai4ce/DeepMapping |
Framework | pytorch |
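One plausible reading of the binary-occupancy formulation in the abstract: scan points transformed by the estimated poses should be classified as occupied, while sampled free-space points should be classified as unoccupied. In the sketch below the pose and occupancy functions are hypothetical stand-ins for the localization and map networks, the free-space sampling is crude, and no gradient-based training is shown.

```python
# Hedged sketch of registration as binary occupancy classification; the pose and
# occupancy functions are stand-ins for the localization and map networks.
import numpy as np

rng = np.random.default_rng(0)

def estimate_pose(scan):
    # stand-in for the localization network: a 2D rotation angle and translation
    return 0.1, np.array([0.5, -0.2])

def occupancy(xy):
    # stand-in for the map network: probability that global coordinates xy are occupied
    return 1.0 / (1.0 + np.exp(-(1.0 - np.linalg.norm(xy, axis=-1))))

def bce(p, label, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)
    return -(label * np.log(p) + (1 - label) * np.log(1 - p)).mean()

scan = rng.standard_normal((50, 2))                     # one local 2D point cloud
theta, t = estimate_pose(scan)
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
global_points = scan @ R.T + t                          # transform into the global frame

free_samples = 0.5 * global_points                      # crude stand-in for free-space samples
loss = bce(occupancy(global_points), 1.0) + bce(occupancy(free_samples), 0.0)
print(loss)
```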
Pre-Synaptic Pool Modification (PSPM): A Supervised Learning Procedure for Spiking Neural Networks
Title | Pre-Synaptic Pool Modification (PSPM): A Supervised Learning Procedure for Spiking Neural Networks |
Authors | Bryce Bagley, Blake Bordelon, Benjamin Moseley, Ralf Wessel |
Abstract | Learning synaptic weights of spiking neural network (SNN) models that can reproduce target spike trains from provided neural firing data is a central problem in computational neuroscience and spike-based computing. The discovery of the optimal weight values can be posed as a supervised learning task wherein the weights of the model network are chosen to maximize the similarity between the target spike trains and the model outputs. It is still largely unknown whether optimizing spike train similarity of highly recurrent SNNs produces weight matrices similar to those of the ground truth model. To this end, we propose flexible heuristic supervised learning rules, termed Pre-Synaptic Pool Modification (PSPM), that rely on stochastic weight updates in order to produce spikes within a short window of the desired times and eliminate spikes outside of this window. PSPM improves spike train similarity for all-to-all SNNs and makes no assumption about the post-synaptic potential of the neurons or the structure of the network since no gradients are required. We test whether optimizing for spike train similarity entails the discovery of accurate weights and explore the relative contributions of local and homeostatic weight updates. Although PSPM improves similarity between spike trains, the learned weights often differ from the weights of the ground truth model, implying that connectome inference from spike data may require additional constraints on connectivity statistics. We also find that spike train similarity is sensitive to local updates, but other measures of network activity, such as avalanche distributions, can be learned through synaptic homeostasis. |
Tasks | |
Published | 2018-10-07 |
URL | https://arxiv.org/abs/1810.03199v3 |
https://arxiv.org/pdf/1810.03199v3.pdf | |
PWC | https://paperswithcode.com/paper/pre-synaptic-pool-modification-pspm-a |
Repo | https://github.com/blakebordelon/Spiking-Neural-Network-Optimization |
Framework | none |
AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks
Title | AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks |
Authors | Weiping Song, Chence Shi, Zhiping Xiao, Zhijian Duan, Yewen Xu, Ming Zhang, Jian Tang |
Abstract | Click-through rate (CTR) prediction, which aims to predict the probability of a user clicking on an ad or an item, is critical to many online applications such as online advertising and recommender systems. The problem is very challenging since (1) the input features (e.g., the user id, user age, item id, item category) are usually sparse and high-dimensional, and (2) an effective prediction relies on high-order combinatorial features (a.k.a. cross features), which are very time-consuming for domain experts to hand-craft and impossible to enumerate. Therefore, there have been efforts in finding low-dimensional representations of the sparse and high-dimensional raw features and their meaningful combinations. In this paper, we propose an effective and efficient method called AutoInt to automatically learn the high-order feature interactions of input features. Our proposed algorithm is very general and can be applied to both numerical and categorical input features. Specifically, we map both the numerical and categorical features into the same low-dimensional space. Afterwards, a multi-head self-attentive neural network with residual connections is proposed to explicitly model the feature interactions in the low-dimensional space. With different layers of the multi-head self-attentive neural networks, different orders of feature combinations of input features can be modeled. The whole model can be efficiently fit on large-scale raw data in an end-to-end fashion. Experimental results on four real-world datasets show that our proposed approach not only outperforms existing state-of-the-art approaches for prediction but also offers good explainability. Code is available at: https://github.com/DeepGraphLearning/RecommenderSystems. |
Tasks | Click-Through Rate Prediction, Recommendation Systems |
Published | 2018-10-29 |
URL | https://arxiv.org/abs/1810.11921v2 |
https://arxiv.org/pdf/1810.11921v2.pdf | |
PWC | https://paperswithcode.com/paper/autoint-automatic-feature-interaction |
Repo | https://github.com/shichence/AutoInt |
Framework | tf |
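A single-head version of the self-attentive interaction layer described in the abstract can be sketched in a few lines of numpy: embedded feature fields attend to one another and a residual connection is added. Multi-head attention, layer stacking, and the final prediction head are omitted, and all weights are random stand-ins.

```python
# Single-head sketch of a self-attentive feature-interaction layer: embedded feature
# fields attend to each other, with a residual connection. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n_fields, emb_dim, att_dim = 6, 16, 16

E = rng.standard_normal((n_fields, emb_dim))               # embeddings of one sample's feature fields
W_q, W_k, W_v = (rng.standard_normal((emb_dim, att_dim)) * 0.1 for _ in range(3))
W_res = rng.standard_normal((emb_dim, att_dim)) * 0.1      # residual projection

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

Q, K, V = E @ W_q, E @ W_k, E @ W_v
attention = softmax(Q @ K.T / np.sqrt(att_dim))             # (n_fields, n_fields) interaction weights
interacted = np.maximum(attention @ V + E @ W_res, 0.0)     # residual connection + ReLU

print(interacted.shape)                                      # (n_fields, att_dim)
```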
How2: A Large-scale Dataset for Multimodal Language Understanding
Title | How2: A Large-scale Dataset for Multimodal Language Understanding |
Authors | Ramon Sanabria, Ozan Caglayan, Shruti Palaskar, Desmond Elliott, Loïc Barrault, Lucia Specia, Florian Metze |
Abstract | In this paper, we introduce How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations. We also present integrated sequence-to-sequence baselines for machine translation, automatic speech recognition, spoken language translation, and multimodal summarization. By making available data and code for several multimodal natural language tasks, we hope to stimulate more research on these and similar challenges, to obtain a deeper understanding of multimodality in language processing. |
Tasks | Machine Translation, Speech Recognition |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00347v2 |
http://arxiv.org/pdf/1811.00347v2.pdf | |
PWC | https://paperswithcode.com/paper/how2-a-large-scale-dataset-for-multimodal |
Repo | https://github.com/srvk/how2-dataset |
Framework | none |
DTMT: A Novel Deep Transition Architecture for Neural Machine Translation
Title | DTMT: A Novel Deep Transition Architecture for Neural Machine Translation |
Authors | Fandong Meng, Jinchao Zhang |
Abstract | Past years have witnessed rapid developments in Neural Machine Translation (NMT). Most recently, with advanced modeling and training techniques, the RNN-based NMT (RNMT) has shown its potential strength, even compared with the well-known Transformer (self-attentional) model. Although the RNMT model can possess very deep architectures through stacking layers, the transition depth between consecutive hidden states along the sequential axis is still shallow. In this paper, we further enhance the RNN-based NMT by increasing the transition depth between consecutive hidden states and build a novel Deep Transition RNN-based Architecture for Neural Machine Translation, named DTMT. This model enhances the hidden-to-hidden transition with multiple non-linear transformations, and maintains a linear transformation path throughout this deep transition via a well-designed linear transformation mechanism to alleviate the gradient vanishing problem. Experiments show that with the specially designed deep transition modules, our DTMT can achieve remarkable improvements in translation quality. Experimental results on the Chinese->English translation task show that DTMT can outperform the Transformer model by +2.09 BLEU points and achieve the best results ever reported on the same dataset. On the WMT14 English->German and English->French translation tasks, DTMT shows superior quality to state-of-the-art NMT systems, including the Transformer and the RNMT+. |
Tasks | Machine Translation |
Published | 2018-12-19 |
URL | https://arxiv.org/abs/1812.07807v2 |
https://arxiv.org/pdf/1812.07807v2.pdf | |
PWC | https://paperswithcode.com/paper/dtmt-a-novel-deep-transition-architecture-for |
Repo | https://github.com/fandongmeng/DTMT_InDec |
Framework | tf |
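A schematic, heavily simplified sketch of the deep-transition idea: within a single time step, the hidden state passes through several gated nonlinear transformations while a gated linear path is kept open to ease gradient flow. The cell below is illustrative and not the paper's exact transition-cell formulation.

```python
# Schematic sketch of a deep transition between consecutive hidden states: several
# gated nonlinear transformations within one time step, with a linear path retained.
import numpy as np

rng = np.random.default_rng(0)
hidden, transition_depth = 8, 4

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# one hypothetical weight set per transition step
Ws = [rng.standard_normal((hidden, hidden)) * 0.1 for _ in range(transition_depth)]
Gs = [rng.standard_normal((hidden, hidden)) * 0.1 for _ in range(transition_depth)]

def deep_transition(h):
    for W, G in zip(Ws, Gs):
        gate = sigmoid(h @ G)                       # how much of the linear path to keep
        candidate = np.tanh(h @ W)                  # nonlinear transformation of the state
        h = gate * h + (1.0 - gate) * candidate     # gated mix of linear path and deep nonlinearity
    return h

h_t = rng.standard_normal(hidden)
print(deep_transition(h_t).shape)
```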
Partial Convolution based Padding
Title | Partial Convolution based Padding |
Authors | Guilin Liu, Kevin J. Shih, Ting-Chun Wang, Fitsum A. Reda, Karan Sapra, Zhiding Yu, Andrew Tao, Bryan Catanzaro |
Abstract | In this paper, we present a simple yet effective padding scheme that can be used as a drop-in module for existing convolutional neural networks. We call it partial convolution based padding, with the intuition that the padded region can be treated as holes and the original input as non-holes. Specifically, during the convolution operation, the convolution results are re-weighted near image borders based on the ratios between the padded area and the convolution sliding window area. Extensive experiments with various deep network models on ImageNet classification and semantic segmentation demonstrate that the proposed padding scheme consistently outperforms standard zero padding with better accuracy. |
Tasks | Semantic Segmentation |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1811.11718v1 |
http://arxiv.org/pdf/1811.11718v1.pdf | |
PWC | https://paperswithcode.com/paper/partial-convolution-based-padding |
Repo | https://github.com/lessw2020/auto-adaptive-ai |
Framework | pytorch |
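The re-weighting rule described in the abstract, where convolution results near borders are rescaled by the ratio between the full sliding-window area and the un-padded (valid) area inside the window, can be sketched directly in numpy for a single channel; the released implementation integrates this into convolution layers.

```python
# Single-channel sketch of partial-convolution-based padding: near borders, the
# convolution result is rescaled by (full window area) / (un-padded area in the window).
import numpy as np

def partial_conv_pad_2d(image, kernel):
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))               # zero padding, as usual
    mask = np.pad(np.ones_like(image), ((ph, ph), (pw, pw)))   # 1 = real pixel, 0 = padded
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + kh, j:j + kw]
            valid = mask[i:i + kh, j:j + kw].sum()             # un-padded pixels in the window
            out[i, j] = (window * kernel).sum() * (kh * kw) / valid
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0
print(partial_conv_pad_2d(image, kernel))
```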