July 29, 2019


Paper Group AWR 176



Abnormal Event Detection in Videos using Spatiotemporal Autoencoder

Title Abnormal Event Detection in Videos using Spatiotemporal Autoencoder
Authors Yong Shean Chong, Yong Haur Tay
Abstract We present an efficient method for detecting anomalies in videos. Recent applications of convolutional neural networks have shown promise for object detection and recognition, especially in images. However, convolutional neural networks are supervised and require labels as learning signals. We propose a spatiotemporal architecture for anomaly detection in videos, including crowded scenes. Our architecture includes two main components: one for spatial feature representation, and one for learning the temporal evolution of the spatial features. Experimental results on the Avenue, Subway and UCSD benchmarks confirm that the detection accuracy of our method is comparable to state-of-the-art methods at a considerable speed of up to 140 fps.
Tasks Anomaly Detection, Object Detection
Published 2017-01-06
URL http://arxiv.org/abs/1701.01546v1
PDF http://arxiv.org/pdf/1701.01546v1.pdf
PWC https://paperswithcode.com/paper/abnormal-event-detection-in-videos-using
Repo https://github.com/drsagitn/anomaly-detection-and-localization
Framework tf
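The method scores each video frame by its reconstruction error under the autoencoder and converts the errors into a regularity score. A minimal sketch of that normalization step (function names and the flagging threshold are ours, not from the paper's code):

```python
def regularity_scores(errors):
    """Map per-frame reconstruction errors to [0, 1] regularity scores:
    the most poorly reconstructed frame in the sequence gets score 0."""
    e_min, e_max = min(errors), max(errors)
    return [1.0 - (e - e_min) / (e_max - e_min) for e in errors]

def flag_anomalies(errors, threshold=0.5):
    """Frames whose regularity score falls below a threshold are flagged."""
    return [s < threshold for s in regularity_scores(errors)]
```

In practice the errors come from comparing input frame volumes with the spatiotemporal autoencoder's reconstructions; the sketch only shows how errors become anomaly decisions.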

Multi-timescale memory dynamics in a reinforcement learning network with attention-gated memory

Title Multi-timescale memory dynamics in a reinforcement learning network with attention-gated memory
Authors Marco Martinolli, Wulfram Gerstner, Aditya Gilra
Abstract Learning and memory are intertwined in our brain and their relationship is at the core of several recent neural network models. In particular, the Attention-Gated MEmory Tagging model (AuGMEnT) is a reinforcement learning network with an emphasis on biological plausibility of memory dynamics and learning. We find that the AuGMEnT network does not solve some hierarchical tasks, where higher-level stimuli have to be maintained over a long time, while lower-level stimuli need to be remembered and forgotten over a shorter timescale. To overcome this limitation, we introduce hybrid AuGMEnT, with leaky (short-timescale) and non-leaky (long-timescale) units in memory, which allow the network to exchange lower-level information while maintaining higher-level information, thus solving both hierarchical and distractor tasks.
Tasks
Published 2017-12-28
URL http://arxiv.org/abs/1712.10062v1
PDF http://arxiv.org/pdf/1712.10062v1.pdf
PWC https://paperswithcode.com/paper/multi-timescale-memory-dynamics-in-a
Repo https://github.com/martin592/hybrid_AuGMEnT
Framework none
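The hybrid memory idea reduces to two update rules on the same input trace: leaky units decay each step, non-leaky units integrate without decay. A minimal sketch (the leak factor is illustrative; the actual model learns these dynamics within the AuGMEnT architecture):

```python
class HybridMemory:
    """Sketch of hybrid AuGMEnT memory: leaky units forget on a short
    timescale, non-leaky units accumulate over the whole trial."""
    def __init__(self, n_units, leak=0.5):  # leak value is a placeholder
        self.leak = leak
        self.leaky = [0.0] * n_units
        self.nonleaky = [0.0] * n_units

    def step(self, trace):
        # leaky units decay before integrating the new input trace
        self.leaky = [self.leak * m + x for m, x in zip(self.leaky, trace)]
        # non-leaky units accumulate without decay
        self.nonleaky = [m + x for m, x in zip(self.nonleaky, trace)]
```

After a distractor-free gap, the leaky units have largely forgotten the lower-level stimulus while the non-leaky units still hold the higher-level one, which is the behaviour the hierarchical tasks require.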

Assumed Density Filtering Q-learning

Title Assumed Density Filtering Q-learning
Authors Heejin Jeong, Clark Zhang, George J. Pappas, Daniel D. Lee
Abstract While off-policy temporal difference (TD) methods have widely been used in reinforcement learning due to their efficiency and simple implementation, their Bayesian counterparts have not been utilized as frequently. One reason is that the non-linear max operation in the Bellman optimality equation makes it difficult to define conjugate distributions over the value functions. In this paper, we introduce a novel Bayesian approach to off-policy TD methods, called ADFQ, which updates beliefs on state-action values, Q, through an online Bayesian inference method known as Assumed Density Filtering. We formulate an efficient closed-form solution for the value update by approximately estimating analytic parameters of the posterior of the Q-beliefs. Uncertainty measures in the beliefs are not only used in exploration but also provide a natural regularization for the value update, considering all available next actions. ADFQ converges to Q-learning as the uncertainty measures of the Q-beliefs decrease, and mitigates common drawbacks of other Bayesian RL algorithms such as computational complexity. We extend ADFQ with a neural network. Our empirical results demonstrate that ADFQ outperforms comparable algorithms on various Atari 2600 games, with drastic improvements in highly stochastic domains or domains with a large action space.
Tasks Atari Games, Bayesian Inference, Q-Learning
Published 2017-12-09
URL https://arxiv.org/abs/1712.03333v4
PDF https://arxiv.org/pdf/1712.03333v4.pdf
PWC https://paperswithcode.com/paper/assumed-density-filtering-q-learning
Repo https://github.com/coco66/ADFQ
Framework tf
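A heavily simplified sketch of the core idea: keep a Gaussian belief (mean, variance) per state-action value and combine it with a TD target by precision weighting. The real ADFQ update soft-combines beliefs over all next actions in closed form; here we only use the greedy next action, and all parameter names are ours:

```python
def belief_update(mu, var, reward, next_beliefs, gamma=0.99, obs_var=1.0):
    """Precision-weighted update of a Gaussian Q-belief toward a TD
    target built from the best next-action belief (a simplification of
    the full ADFQ update, which integrates over all next actions)."""
    mu_next, var_next = max(next_beliefs, key=lambda b: b[0])
    target_mu = reward + gamma * mu_next
    target_var = obs_var + gamma ** 2 * var_next
    new_var = 1.0 / (1.0 / var + 1.0 / target_var)
    new_mu = new_var * (mu / var + target_mu / target_var)
    return new_mu, new_var
```

Note the two properties the abstract highlights: the variance shrinks with every update (so the method converges toward point-estimate Q-learning), and the remaining variance is directly usable as an exploration signal.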

Why We Need New Evaluation Metrics for NLG

Title Why We Need New Evaluation Metrics for NLG
Authors Jekaterina Novikova, Ondřej Dušek, Amanda Cercas Curry, Verena Rieser
Abstract The majority of NLG evaluation relies on automatic metrics, such as BLEU. In this paper, we motivate the need for novel, system- and data-independent automatic evaluation methods: We investigate a wide range of metrics, including state-of-the-art word-based and novel grammar-based ones, and demonstrate that they only weakly reflect human judgements of system outputs as generated by data-driven, end-to-end NLG. We also show that metric performance is data- and system-specific. Nevertheless, our results also suggest that automatic metrics perform reliably at the system level and can support system development by finding cases where a system performs poorly.
Tasks
Published 2017-07-21
URL http://arxiv.org/abs/1707.06875v1
PDF http://arxiv.org/pdf/1707.06875v1.pdf
PWC https://paperswithcode.com/paper/why-we-need-new-evaluation-metrics-for-nlg
Repo https://github.com/jeknov/EMNLP_17_submission
Framework none
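The paper's analysis rests on rank correlations between automatic metric scores and human judgements of the same outputs. A minimal Spearman correlation (ignoring tied ranks, which a full implementation would handle) can be sketched as:

```python
def spearman(xs, ys):
    """Spearman rank correlation between two score lists, ignoring ties."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0.0] * len(values)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Feeding in per-output BLEU scores as `xs` and human ratings as `ys` gives the sentence-level correlations the paper reports to be weak.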

End-to-End Interpretation of the French Street Name Signs Dataset

Title End-to-End Interpretation of the French Street Name Signs Dataset
Authors Raymond Smith, Chunhui Gu, Dar-Shyang Lee, Huiyi Hu, Ranjith Unnikrishnan, Julian Ibarz, Sacha Arnoud, Sophia Lin
Abstract We introduce the French Street Name Signs (FSNS) Dataset consisting of more than a million images of street name signs cropped from Google Street View images of France. Each image contains several views of the same street name sign. Every image has normalized, title case folded ground-truth text as it would appear on a map. We believe that the FSNS dataset is large and complex enough to train a deep network of significant complexity to solve the street name extraction problem “end-to-end” or to explore the design trade-offs between a single complex engineered network and multiple sub-networks designed and trained to solve sub-problems. We present such an “end-to-end” network/graph for TensorFlow and its results on the FSNS dataset.
Tasks Optical Character Recognition
Published 2017-02-13
URL http://arxiv.org/abs/1702.03970v1
PDF http://arxiv.org/pdf/1702.03970v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-interpretation-of-the-french
Repo https://github.com/OzHsu23/chineseocr
Framework tf

Connecting Look and Feel: Associating the visual and tactile properties of physical materials

Title Connecting Look and Feel: Associating the visual and tactile properties of physical materials
Authors Wenzhen Yuan, Shaoxiong Wang, Siyuan Dong, Edward Adelson
Abstract For machines to interact with the physical world, they must understand the physical properties of objects and materials they encounter. We use fabrics as an example of a deformable material with a rich set of mechanical properties. A thin flexible fabric, when draped, tends to look different from a heavy stiff fabric. It also feels different when touched. Using a collection of 118 fabric samples, we captured color and depth images of draped fabrics along with tactile data from a high-resolution touch sensor. We then sought to associate the information from vision and touch by jointly training CNNs across the three modalities. Through the CNN, each input, regardless of the modality, generates an embedding vector that records the fabric’s physical properties. By comparing the embeddings, our system is able to look at a fabric image and predict how it will feel, and vice versa. We also show that a system jointly trained on vision and touch data can outperform a similar system trained only on visual data when tested purely with visual inputs.
Tasks
Published 2017-04-12
URL http://arxiv.org/abs/1704.03822v1
PDF http://arxiv.org/pdf/1704.03822v1.pdf
PWC https://paperswithcode.com/paper/connecting-look-and-feel-associating-the
Repo https://github.com/wx405557858/FabricGel
Framework none
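At test time, cross-modal prediction reduces to a nearest-neighbour search in the shared embedding space: embed a fabric image, then find the closest tactile embedding. A sketch of that matching step (in the real system the embeddings come from the jointly trained CNNs):

```python
def nearest(query_embedding, candidate_embeddings):
    """Return the index of the candidate embedding closest to the
    query under Euclidean distance."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
    return min(range(len(candidate_embeddings)),
               key=lambda i: dist(query_embedding, candidate_embeddings[i]))
```

The joint training objective pulls embeddings of the same fabric together across modalities, which is what makes this simple lookup meaningful.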

QCBA: Postoptimization of Quantitative Attributes in Classifiers based on Association Rules

Title QCBA: Postoptimization of Quantitative Attributes in Classifiers based on Association Rules
Authors Tomas Kliegr
Abstract The need to prediscretize numeric attributes before they can be used in association rule learning is a source of inefficiencies in the resulting classifier. This paper describes several new rule tuning steps aiming to recover information lost in the discretization of numeric (quantitative) attributes, and a new rule pruning strategy, which further reduces the size of the classification models. We demonstrate the effectiveness of the proposed methods on postoptimization of models generated by three state-of-the-art association rule classification algorithms: Classification based on Associations (Liu, 1998), Interpretable Decision Sets (Lakkaraju et al, 2016), and Scalable Bayesian Rule Lists (Yang, 2017). Benchmarks on 22 datasets from the UCI repository show that the postoptimized models are consistently smaller – typically by about 50% – and have better classification performance on most datasets.
Tasks
Published 2017-11-28
URL https://arxiv.org/abs/1711.10166v2
PDF https://arxiv.org/pdf/1711.10166v2.pdf
PWC https://paperswithcode.com/paper/quantitative-cba-small-and-comprehensible
Repo https://github.com/jirifilip/pyARC
Framework none

Structured Embedding Models for Grouped Data

Title Structured Embedding Models for Grouped Data
Authors Maja Rudolph, Francisco Ruiz, Susan Athey, David Blei
Abstract Word embeddings are a powerful approach for analyzing language, and exponential family embeddings (EFE) extend them to other types of data. Here we develop structured exponential family embeddings (S-EFE), a method for discovering embeddings that vary across related groups of data. We study how the word usage of U.S. Congressional speeches varies across states and party affiliation, how words are used differently across sections of the ArXiv, and how the co-purchase patterns of groceries can vary across seasons. Key to the success of our method is that the groups share statistical information. We develop two sharing strategies: hierarchical modeling and amortization. We demonstrate the benefits of this approach in empirical studies of speeches, abstracts, and shopping baskets. We show how S-EFE enables group-specific interpretation of word usage, and outperforms EFE in predicting held-out data.
Tasks Word Embeddings
Published 2017-09-28
URL http://arxiv.org/abs/1709.10367v1
PDF http://arxiv.org/pdf/1709.10367v1.pdf
PWC https://paperswithcode.com/paper/structured-embedding-models-for-grouped-data
Repo https://github.com/mariru/structured_embeddings
Framework tf
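The two sharing strategies can be caricatured in a few lines (purely illustrative; the actual model fits these quantities with variational inference over exponential family embeddings):

```python
def hierarchical_embedding(global_vec, group_offset):
    """Hierarchical sharing: a group's embedding is the shared global
    embedding plus a group-specific offset drawn from a common prior."""
    return [g + o for g, o in zip(global_vec, group_offset)]

def amortized_embedding(global_vec, group_scale):
    """Amortized sharing: a group's embedding is a deterministic function
    of the global one (a per-dimension rescaling here, standing in for
    the paper's learned amortization network)."""
    return [s * g for s, g in zip(group_scale, global_vec)]
```

Either way, statistical strength flows through the shared global vector, which is why sparsely observed groups still get usable embeddings.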

Improving Consistency and Correctness of Sequence Inpainting using Semantically Guided Generative Adversarial Network

Title Improving Consistency and Correctness of Sequence Inpainting using Semantically Guided Generative Adversarial Network
Authors Avisek Lahiri, Arnav Jain, Prabir Kumar Biswas, Pabitra Mitra
Abstract Contemporary benchmark methods for image inpainting are based on deep generative models and specifically leverage adversarial loss for yielding realistic reconstructions. However, these models cannot be directly applied on image/video sequences because of an intrinsic drawback: the reconstructions might be independently realistic, but, when visualized as a sequence, often lack fidelity to the original uncorrupted sequence. The fundamental reason is that these methods try to find the best matching latent-space representation near the natural image manifold without any explicit distance-based loss. In this paper, we present a semantically conditioned Generative Adversarial Network (GAN) for sequence inpainting. The conditional information constrains the GAN to map a latent representation to a point on the image manifold respecting the underlying pose and semantics of the scene. To the best of our knowledge, this is the first work which simultaneously addresses consistency and correctness of generative model based inpainting. We show that our generative model learns to disentangle pose and appearance information; this independence is exploited by our model to generate highly consistent reconstructions. The conditional information also aids the generator network in the GAN to produce sharper images compared to the original GAN formulation. This helps in achieving more appealing inpainting performance. Though generic, our algorithm was targeted for inpainting on faces. When applied on the CelebA and YouTube Faces datasets, the proposed method results in a significant improvement over the current benchmark, both in terms of quantitative evaluation (Peak Signal to Noise Ratio) and human visual scoring over diversified combinations of resolutions and deformations.
Tasks Image Inpainting
Published 2017-11-16
URL http://arxiv.org/abs/1711.06106v2
PDF http://arxiv.org/pdf/1711.06106v2.pdf
PWC https://paperswithcode.com/paper/improving-consistency-and-correctness-of
Repo https://github.com/arnavkj1995/face_inpainting
Framework tf

Filtering Variational Objectives

Title Filtering Variational Objectives
Authors Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Andriy Mnih, Arnaud Doucet, Yee Whye Teh
Abstract When used as a surrogate objective for maximum likelihood estimation in latent variable models, the evidence lower bound (ELBO) produces state-of-the-art results. Inspired by this, we consider the extension of the ELBO to a family of lower bounds defined by a particle filter’s estimator of the marginal likelihood, the filtering variational objectives (FIVOs). FIVOs take the same arguments as the ELBO, but can exploit a model’s sequential structure to form tighter bounds. We present results that relate the tightness of FIVO’s bound to the variance of the particle filter’s estimator by considering the generic case of bounds defined as log-transformed likelihood estimators. Experimentally, we show that training with FIVO results in substantial improvements over training the same model architecture with the ELBO on sequential data.
Tasks Latent Variable Models
Published 2017-05-25
URL http://arxiv.org/abs/1705.09279v3
PDF http://arxiv.org/pdf/1705.09279v3.pdf
PWC https://paperswithcode.com/paper/filtering-variational-objectives
Repo https://github.com/tensorflow/models
Framework tf
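The FIVO bound is the log of a particle filter's marginal-likelihood estimate: a sum over time of the log mean particle weight, with resampling between steps. A model-agnostic sketch (the callables are placeholders standing in for a real state-space model and proposal):

```python
import math
import random

def fivo_bound(init_fn, propose_fn, log_weight_fn, observations, n_particles):
    """Sketch of the FIVO objective: sum over time of the log of the
    mean particle weight, with multinomial resampling after each step."""
    particles = [init_fn() for _ in range(n_particles)]
    bound = 0.0
    for y in observations:
        particles = [propose_fn(p) for p in particles]
        weights = [math.exp(log_weight_fn(p, y)) for p in particles]
        bound += math.log(sum(weights) / n_particles)
        # resample particles proportionally to their weights
        particles = random.choices(particles, weights=weights, k=n_particles)
    return bound
```

With one particle and no resampling this collapses to the ordinary ELBO (IWAE with a single sample per step); more particles and resampling exploit the sequential structure to tighten the bound, which is the paper's central point.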

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Title Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Authors Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko
Abstract The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.
Tasks Quantization
Published 2017-12-15
URL http://arxiv.org/abs/1712.05877v1
PDF http://arxiv.org/pdf/1712.05877v1.pdf
PWC https://paperswithcode.com/paper/quantization-and-training-of-neural-networks
Repo https://github.com/li-weihua/notes
Framework none
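The quantization scheme maps real values to 8-bit integers through a scale and a zero-point, so that all inference arithmetic can stay in integers. A sketch of the per-tensor affine quantize/dequantize pair (uint8 range assumed; the paper also covers how scales propagate through matmuls):

```python
def quantize(x, scale, zero_point):
    """Affine quantization: real value -> uint8, clamped to [0, 255]."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))

def dequantize(q, scale, zero_point):
    """Approximate inverse: uint8 -> real value."""
    return scale * (q - zero_point)
```

The co-designed training procedure simulates exactly this round-trip in the forward pass ("fake quantization"), so the weights learn to be accurate under the precision the deployed integer kernels will actually use.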

The Neural Network Pushdown Automaton: Model, Stack and Learning Simulations

Title The Neural Network Pushdown Automaton: Model, Stack and Learning Simulations
Authors G. Z. Sun, C. L. Giles, H. H. Chen, Y. C. Lee
Abstract In order for neural networks to learn complex languages or grammars, they must have sufficient computational power or resources to recognize or generate such languages. Though many approaches have been discussed, one obvious approach to enhancing the processing power of a recurrent neural network is to couple it with an external stack memory - in effect creating a neural network pushdown automaton (NNPDA). This paper discusses in detail this NNPDA - its construction, how it can be trained and how useful symbolic information can be extracted from the trained network. In order to couple the external stack to the neural network, an optimization method is developed which uses an error function that connects the learning of the state automaton of the neural network to the learning of the operation of the external stack. To minimize the error function using gradient descent learning, an analog stack is designed such that the action and storage of information in the stack are continuous. One interpretation of a continuous stack is the probabilistic storage of and action on data. After training on sample strings of an unknown source grammar, a quantization procedure extracts from the analog stack and neural network a discrete pushdown automaton (PDA). Simulations show that in learning deterministic context-free grammars - the balanced parenthesis language, 1^n 0^n, and the deterministic palindrome - the extracted PDA is correct in the sense that it can correctly recognize unseen strings of arbitrary length. In addition, the extracted PDAs can be shown to be identical or equivalent to the PDAs of the source grammars which were used to generate the training strings.
Tasks Quantization
Published 2017-11-15
URL http://arxiv.org/abs/1711.05738v1
PDF http://arxiv.org/pdf/1711.05738v1.pdf
PWC https://paperswithcode.com/paper/the-neural-network-pushdown-automaton-model
Repo https://github.com/eonu/arx
Framework none
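The analog stack can be sketched as a stack of symbols with continuous strengths, where a pop removes a continuous amount of strength from the top downward. This is a simplified illustration in the spirit of the paper's continuous stack, not a reproduction of its exact formulation:

```python
class ContinuousStack:
    """Sketch of an analog stack: entries carry continuous strengths,
    and pop removes a continuous amount of strength from the top."""
    def __init__(self):
        self.items = []  # list of [symbol, strength]

    def push(self, symbol, strength):
        self.items.append([symbol, strength])

    def pop(self, amount):
        """Remove `amount` of total strength from the top downward;
        returns the (symbol, strength) fractions that were removed."""
        removed = []
        while amount > 1e-9 and self.items:
            symbol, s = self.items[-1]
            take = min(s, amount)
            removed.append((symbol, take))
            amount -= take
            if take < s:
                self.items[-1][1] = s - take  # partially consumed entry
            else:
                self.items.pop()
        return removed
```

Because push and pop strengths are continuous, the stack operations are differentiable in the strength arguments, which is what makes gradient-descent training of the coupled network possible.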

DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications

Title DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications
Authors Wei He, Kai Liu, Jing Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yuan Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu, Haifeng Wang
Abstract This paper introduces DuReader, a new large-scale, open-domain Chinese machine reading comprehension (MRC) dataset, designed to address real-world MRC. DuReader has three advantages over previous MRC datasets: (1) data sources: questions and documents are based on Baidu Search and Baidu Zhidao; answers are manually generated. (2) question types: it provides rich annotations for more question types, especially yes-no and opinion questions, which leaves more opportunity for the research community. (3) scale: it contains 200K questions, 420K answers and 1M documents; it is the largest Chinese MRC dataset so far. Experiments show that human performance is well above current state-of-the-art baseline systems, leaving plenty of room for the community to make improvements. To help the community make these improvements, both DuReader and baseline systems have been posted online. We also organize a shared competition to encourage the exploration of more models. Since the release of the task, there have been significant improvements over the baselines.
Tasks Machine Reading Comprehension, Reading Comprehension
Published 2017-11-14
URL http://arxiv.org/abs/1711.05073v4
PDF http://arxiv.org/pdf/1711.05073v4.pdf
PWC https://paperswithcode.com/paper/dureader-a-chinese-machine-reading
Repo https://github.com/PaddlePaddle/models
Framework none

Improving Session Recommendation with Recurrent Neural Networks by Exploiting Dwell Time

Title Improving Session Recommendation with Recurrent Neural Networks by Exploiting Dwell Time
Authors Alexander Dallmann, Alexander Grimm, Christian Pölitz, Daniel Zoller, Andreas Hotho
Abstract Recently, Recurrent Neural Networks (RNNs) have been applied to the task of session-based recommendation. These approaches use RNNs to predict the next item in a user session based on the previously visited items. While some approaches consider additional item properties, we argue that item dwell time can be used as an implicit measure of user interest to improve session-based item recommendations. We propose an extension to existing RNN approaches that captures user dwell time in addition to the visited items and show that recommendation performance can be improved. Additionally, we investigate the usefulness of a single validation split for model selection in the case of minor improvements and find that in our case the best model is not selected and a fold-like study with different validation sets is necessary to ensure the selection of the best model.
Tasks Model Selection, Session-Based Recommendations
Published 2017-06-30
URL http://arxiv.org/abs/1706.10231v1
PDF http://arxiv.org/pdf/1706.10231v1.pdf
PWC https://paperswithcode.com/paper/improving-session-recommendation-with
Repo https://github.com/miniii222/Graduate-Paper
Framework tf
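Dwell time is implicit: it is the gap between consecutive clicks in a session. One way such a signal feeds into RNN training is by repeating long-dwell items in the input sequence; a sketch under that assumption (the bucket size and the repetition rule are hypothetical choices, not taken from the paper):

```python
import math

def dwell_times(timestamps):
    """Dwell time of each clicked item = time until the next click;
    the session's last item has no observed dwell time."""
    return [t2 - t1 for t1, t2 in zip(timestamps, timestamps[1:])]

def augment_session(items, timestamps, bucket_seconds=30.0):
    """Repeat each item once per started dwell-time bucket, so items the
    user lingered on contribute more training examples."""
    out = []
    for item, d in zip(items, dwell_times(timestamps)):
        out.extend([item] * max(1, math.ceil(d / bucket_seconds)))
    out.append(items[-1])  # last item kept once; its dwell is unobserved
    return out
```

The augmented sequences then train the session RNN exactly as before, so dwell time enters without any change to the model architecture.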

Fast Incremental SVDD Learning Algorithm with the Gaussian Kernel

Title Fast Incremental SVDD Learning Algorithm with the Gaussian Kernel
Authors Hansi Jiang, Haoyu Wang, Wenhao Hu, Deovrat Kakde, Arin Chaudhuri
Abstract Support vector data description (SVDD) is a machine learning technique that is used for single-class classification and outlier detection. The idea of SVDD is to find a set of support vectors that defines a boundary around data. When dealing with online or large data, existing batch SVDD methods have to be rerun in each iteration. We propose an incremental learning algorithm for SVDD that uses the Gaussian kernel. This algorithm builds on the observation that all support vectors on the boundary have the same distance to the center of the sphere in a higher-dimensional feature space as mapped by the Gaussian kernel function. Each iteration involves only the existing support vectors and the new data point. Moreover, the algorithm is based solely on matrix manipulations; the support vectors and their corresponding Lagrange multiplier $\alpha_i$'s are automatically selected and determined in each iteration. It can be seen that the complexity of our algorithm in each iteration is only $O(k^2)$, where $k$ is the number of support vectors. Experimental results on some real data sets indicate that FISVDD demonstrates significant gains in efficiency with almost no loss in either outlier detection accuracy or objective function value.
Tasks Outlier Detection
Published 2017-09-01
URL http://arxiv.org/abs/1709.00139v4
PDF http://arxiv.org/pdf/1709.00139v4.pdf
PWC https://paperswithcode.com/paper/fast-incremental-svdd-learning-algorithm-with
Repo https://github.com/hs-jiang/FISVDD
Framework none
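The scoring side of SVDD with a Gaussian kernel can be sketched directly: since $k(x, x) = 1$ for every point, the squared feature-space distance to the center depends only on kernel evaluations against the support vectors. A minimal version (the incremental update of the $\alpha_i$'s, FISVDD's actual contribution, is not shown):

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two points."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def distance_to_center_sq(x, support_vectors, alphas, sigma=1.0):
    """Squared feature-space distance from x to the SVDD center; with a
    Gaussian kernel k(x, x) = 1, which is what makes all boundary
    support vectors equidistant from the center."""
    cross = sum(a * gaussian_kernel(x, sv, sigma)
                for a, sv in zip(alphas, support_vectors))
    center_norm = sum(ai * aj * gaussian_kernel(si, sj, sigma)
                      for ai, si in zip(alphas, support_vectors)
                      for aj, sj in zip(alphas, support_vectors))
    return 1.0 - 2.0 * cross + center_norm
```

A new point is an outlier candidate when this distance exceeds the boundary radius, and FISVDD's per-iteration work is limited to updating the support set against exactly such kernel evaluations.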