May 6, 2019


Paper Group ANR 197

Multi-domain Neural Network Language Generation for Spoken Dialogue Systems

Title Multi-domain Neural Network Language Generation for Spoken Dialogue Systems
Authors Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, David Vandyke, Steve Young
Abstract Moving from limited-domain natural language generation (NLG) to open domain is difficult because the number of semantic input combinations grows exponentially with the number of domains. Therefore, it is important to leverage existing resources and exploit similarities between domains to facilitate domain adaptation. In this paper, we propose a procedure to train multi-domain, Recurrent Neural Network-based (RNN) language generators via multiple adaptation steps. In this procedure, a model is first trained on counterfeited data synthesised from an out-of-domain dataset, and then fine-tuned on a small set of in-domain utterances with a discriminative objective function. Corpus-based evaluation results show that the proposed procedure can achieve competitive performance in terms of BLEU score and slot error rate while significantly reducing the data needed to train generators in new, unseen domains. In subjective testing, human judges confirm that the procedure greatly improves generator performance when only a small amount of data is available in the domain.
Tasks Domain Adaptation, Spoken Dialogue Systems, Text Generation
Published 2016-03-03
URL http://arxiv.org/abs/1603.01232v1
PDF http://arxiv.org/pdf/1603.01232v1.pdf
PWC https://paperswithcode.com/paper/multi-domain-neural-network-language
Repo
Framework

Fine Hand Segmentation using Convolutional Neural Networks

Title Fine Hand Segmentation using Convolutional Neural Networks
Authors Tadej Vodopivec, Vincent Lepetit, Peter Peer
Abstract We propose a method for extracting very accurate masks of hands in egocentric views. Our method is based on a novel Deep Learning architecture: In contrast with current Deep Learning methods, we do not use upscaling layers applied to a low-dimensional representation of the input image. Instead, we extract features with convolutional layers and map them directly to a segmentation mask with a fully connected layer. We show that this approach, when applied in a multi-scale fashion, is both accurate and efficient enough for real-time. We demonstrate it on a new dataset made of images captured in various environments, from the outdoors to offices.
Tasks Hand Segmentation
Published 2016-08-26
URL http://arxiv.org/abs/1608.07454v1
PDF http://arxiv.org/pdf/1608.07454v1.pdf
PWC https://paperswithcode.com/paper/fine-hand-segmentation-using-convolutional
Repo
Framework

Going Out of Business: Auction House Behavior in the Massively Multi-Player Online Game

Title Going Out of Business: Auction House Behavior in the Massively Multi-Player Online Game
Authors Anders Drachen, Joseph Riley, Shawna Baskin, Diego Klabjan
Abstract The in-game economies of massively multi-player online games (MMOGs) are complex systems that have to be carefully designed and managed. This paper presents the results of an analysis of auction house data from the MMOG Glitch, across a 14 month time period, the entire lifetime of the game. The data comprise almost 3 million data points, over 20,000 unique players and more than 650 products. Furthermore, an interactive visualization, based on Sankey flow diagrams, is presented which shows the proportion of the different clusters across each time bin, as well as the flow of players between clusters. The diagram allows evaluation of migration of players between clusters as a function of time, as well as churn analysis. The presented work provides a template analysis and visualization model for progression-based or temporal-based analysis of player behavior broadly applicable to games.
Tasks
Published 2016-03-24
URL http://arxiv.org/abs/1603.07610v1
PDF http://arxiv.org/pdf/1603.07610v1.pdf
PWC https://paperswithcode.com/paper/going-out-of-business-auction-house-behavior
Repo
Framework

An Empirical Study of Adequate Vision Span for Attention-Based Neural Machine Translation

Title An Empirical Study of Adequate Vision Span for Attention-Based Neural Machine Translation
Authors Raphael Shu, Hideki Nakayama
Abstract Recently, the attention mechanism has come to play a key role in achieving high performance in Neural Machine Translation models. However, as it computes a score function for the encoder states in all positions at each decoding step, the attention model greatly increases the computational complexity. In this paper, we investigate the adequate vision span of attention models in the context of machine translation, by proposing a novel attention framework that is capable of reducing redundant score computation dynamically. The term “vision span” means a window of the encoder states considered by the attention model in one step. In our experiments, we found that the average window size of vision span can be reduced by over 50% with modest loss in accuracy on English-Japanese and German-English translation tasks. These results indicate that the conventional attention mechanism performs a significant amount of redundant computation.
Tasks Machine Translation
Published 2016-12-19
URL http://arxiv.org/abs/1612.06043v4
PDF http://arxiv.org/pdf/1612.06043v4.pdf
PWC https://paperswithcode.com/paper/an-empirical-study-of-adequate-vision-span
Repo
Framework
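The abstract above defines a vision span as the window of encoder states that the attention model considers in one step. The following is a minimal sketch of attention restricted to such a window. It assumes dot-product scoring and a caller-supplied fixed `center` and `half_width`; the paper's framework chooses the window dynamically, which this sketch does not attempt to reproduce. The function and parameter names are hypothetical.

```python
import math

def windowed_attention(query, encoder_states, center, half_width):
    """Attend only to encoder states inside a window around `center`.

    Assumes `center` is a valid index into `encoder_states` and that the
    query and every state have the same dimension (illustrative choices).
    """
    lo = max(0, center - half_width)
    hi = min(len(encoder_states), center + half_width + 1)
    window = encoder_states[lo:hi]

    # Dot-product score, computed only for states inside the window.
    scores = [sum(q * h for q, h in zip(query, state)) for state in window]

    # Numerically stable softmax over the window.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: weighted sum of the states in the window.
    dim = len(window[0])
    context = [sum(w * state[d] for w, state in zip(weights, window))
               for d in range(dim)]
    return context, weights, (lo, hi)
```

With a sentence of length n and a window of width w, each decoding step scores w states instead of n, which is where the saving described in the abstract would come from.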

On Deep Multi-View Representation Learning: Objectives and Optimization

Title On Deep Multi-View Representation Learning: Objectives and Optimization
Authors Weiran Wang, Raman Arora, Karen Livescu, Jeff Bilmes
Abstract We consider learning representations (features) in the setting in which we have access to multiple unlabeled views of the data for learning while only one view is available for downstream tasks. Previous work on this problem has proposed several techniques based on deep neural networks, typically involving either autoencoder-like networks with a reconstruction objective or paired feedforward networks with a batch-style correlation-based objective. We analyze several techniques based on prior work, as well as new variants, and compare them empirically on image, speech, and text tasks. We find an advantage for correlation-based representation learning, while the best results on most tasks are obtained with our new variant, deep canonically correlated autoencoders (DCCAE). We also explore a stochastic optimization procedure for minibatch correlation-based objectives and discuss the time/performance trade-offs for kernel-based and neural network-based implementations.
Tasks Representation Learning, Stochastic Optimization
Published 2016-02-02
URL http://arxiv.org/abs/1602.01024v1
PDF http://arxiv.org/pdf/1602.01024v1.pdf
PWC https://paperswithcode.com/paper/on-deep-multi-view-representation-learning
Repo
Framework

Autonomous Grounding of Visual Field Experience through Sensorimotor Prediction

Title Autonomous Grounding of Visual Field Experience through Sensorimotor Prediction
Authors Alban Laflaquière
Abstract In a developmental framework, autonomous robots need to explore the world and learn how to interact with it. Without an a priori model of the system, this opens the challenging problem of having robots master their interface with the world: how to perceive their environment using their sensors, and how to act in it using their motors. The sensorimotor approach of perception claims that a naive agent can learn to master this interface by capturing regularities in the way its actions transform its sensory inputs. In this paper, we apply such an approach to the discovery and mastery of the visual field associated with a visual sensor. A computational model is formalized and applied to a simulated system to illustrate the approach.
Tasks
Published 2016-08-03
URL http://arxiv.org/abs/1608.01127v1
PDF http://arxiv.org/pdf/1608.01127v1.pdf
PWC https://paperswithcode.com/paper/autonomous-grounding-of-visual-field
Repo
Framework

On the Troll-Trust Model for Edge Sign Prediction in Social Networks

Title On the Troll-Trust Model for Edge Sign Prediction in Social Networks
Authors Géraud Le Falher, Nicolò Cesa-Bianchi, Claudio Gentile, Fabio Vitale
Abstract In the problem of edge sign prediction, we are given a directed graph (representing a social network), and our task is to predict the binary labels of the edges (i.e., the positive or negative nature of the social relationships). Many successful heuristics for this problem are based on the troll-trust features, estimating at each node the fraction of outgoing and incoming positive/negative edges. We show that these heuristics can be understood, and rigorously analyzed, as approximators to the Bayes optimal classifier for a simple probabilistic model of the edge labels. We then show that the maximum likelihood estimator for this model approximately corresponds to the predictions of a Label Propagation algorithm run on a transformed version of the original social graph. Extensive experiments on a number of real-world datasets show that this algorithm is competitive against state-of-the-art classifiers in terms of both accuracy and scalability. Finally, we show that troll-trust features can also be used to derive online learning algorithms which have theoretical guarantees even when edges are adversarially labeled.
Tasks
Published 2016-06-01
URL http://arxiv.org/abs/1606.00182v5
PDF http://arxiv.org/pdf/1606.00182v5.pdf
PWC https://paperswithcode.com/paper/on-the-troll-trust-model-for-edge-sign
Repo
Framework
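The abstract above describes troll-trust features as per-node estimates of the fraction of outgoing and incoming positive/negative edges. A minimal sketch of computing the negative fractions from a signed edge list follows. The edge format `(source, target, sign)` and the function name are assumptions made for illustration, and the sketch stops at the features: it does not implement the paper's classifier or its Label Propagation algorithm.

```python
from collections import defaultdict

def troll_trust_features(signed_edges):
    """Compute per-node negative-edge fractions from a signed directed graph.

    signed_edges: iterable of (source, target, sign), with sign in {+1, -1}.
    Returns (neg_out_frac, neg_in_frac):
      neg_out_frac[u] = fraction of u's outgoing edges that are negative
      neg_in_frac[v]  = fraction of v's incoming edges that are negative
    """
    out_total, out_neg = defaultdict(int), defaultdict(int)
    in_total, in_neg = defaultdict(int), defaultdict(int)

    for source, target, sign in signed_edges:
        out_total[source] += 1
        in_total[target] += 1
        if sign < 0:
            out_neg[source] += 1
            in_neg[target] += 1

    neg_out_frac = {u: out_neg[u] / n for u, n in out_total.items()}
    neg_in_frac = {v: in_neg[v] / n for v, n in in_total.items()}
    return neg_out_frac, neg_in_frac
```

Nodes with no outgoing (or incoming) edges are simply absent from the corresponding dictionary, so a caller has to decide how to treat them.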

End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering

Title End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Authors Youngjae Yu, Hyungjin Ko, Jongwook Choi, Gunhee Kim
Abstract We propose a high-level concept word detector that can be integrated with any video-to-language model. It takes a video as input and generates a list of concept words as useful semantic priors for language generation models. The proposed word detector has two important properties. First, it does not require any external knowledge sources for training. Second, the proposed word detector is trainable in an end-to-end manner jointly with any video-to-language model. To maximize the value of the detected words, we also develop a semantic attention mechanism that selectively focuses on the detected concept words and fuses them with the word encoding and decoding in the language model. In order to demonstrate that the proposed approach indeed improves the performance of multiple video-to-language tasks, we participate in four tasks of LSMDC 2016. Our approach achieves the best accuracies in three of them, including fill-in-the-blank, multiple-choice test, and movie retrieval. We also attain comparable performance for the other task, movie description.
Tasks Language Modelling, Question Answering, Text Generation, Video Captioning, Video Retrieval
Published 2016-10-10
URL http://arxiv.org/abs/1610.02947v3
PDF http://arxiv.org/pdf/1610.02947v3.pdf
PWC https://paperswithcode.com/paper/end-to-end-concept-word-detection-for-video
Repo
Framework

Text Network Exploration via Heterogeneous Web of Topics

Title Text Network Exploration via Heterogeneous Web of Topics
Authors Junxian He, Ying Huang, Changfeng Liu, Jiaming Shen, Yuting Jia, Xinbing Wang
Abstract A text network refers to a data type in which each vertex is associated with a text document and the relationship between documents is represented by edges. The proliferation of text networks such as hyperlinked webpages and academic citation networks has led to an increasing demand for quickly developing a general sense of a new text network, namely text network exploration. In this paper, we address the problem of text network exploration through constructing a heterogeneous web of topics, which allows people to investigate a text network associating word level with document level. To achieve this, a probabilistic generative model for text and links is proposed, where three different relationships in the heterogeneous topic web are quantified. We also develop a prototype demo system named TopicAtlas to exhibit such a heterogeneous topic web, and demonstrate how this system can facilitate the task of text network exploration. Extensive qualitative analyses are included to verify the effectiveness of this heterogeneous topic web. Besides, we validate our model on real-life text networks, showing that it preserves good performance on objective evaluation metrics.
Tasks
Published 2016-10-02
URL http://arxiv.org/abs/1610.00219v1
PDF http://arxiv.org/pdf/1610.00219v1.pdf
PWC https://paperswithcode.com/paper/text-network-exploration-via-heterogeneous
Repo
Framework

Synthetic Data for Text Localisation in Natural Images

Title Synthetic Data for Text Localisation in Natural Images
Authors Ankush Gupta, Andrea Vedaldi, Andrew Zisserman
Abstract In this paper we introduce a new method for text detection in natural images. The method comprises two contributions: First, a fast and scalable engine to generate synthetic images of text in clutter. This engine overlays synthetic text onto existing background images in a natural way, accounting for the local 3D scene geometry. Second, we use the synthetic images to train a Fully-Convolutional Regression Network (FCRN) which efficiently performs text detection and bounding-box regression at all locations and multiple scales in an image. We discuss the relation of FCRN to the recently-introduced YOLO detector, as well as other end-to-end object detection systems based on deep learning. The resulting detection network significantly outperforms current methods for text detection in natural images, achieving an F-measure of 84.2% on the standard ICDAR 2013 benchmark. Furthermore, it can process 15 images per second on a GPU.
Tasks Object Detection
Published 2016-04-22
URL http://arxiv.org/abs/1604.06646v1
PDF http://arxiv.org/pdf/1604.06646v1.pdf
PWC https://paperswithcode.com/paper/synthetic-data-for-text-localisation-in
Repo
Framework

Safe and Efficient Off-Policy Reinforcement Learning

Title Safe and Efficient Off-Policy Reinforcement Learning
Authors Rémi Munos, Tom Stepleton, Anna Harutyunyan, Marc G. Bellemare
Abstract In this work, we take a fresh look at some old and new algorithms for off-policy, return-based reinforcement learning. Expressing these in a common form, we derive a novel algorithm, Retrace($\lambda$), with three desired properties: (1) it has low variance; (2) it safely uses samples collected from any behaviour policy, whatever its degree of “off-policyness”; and (3) it is efficient as it makes the best use of samples collected from near on-policy behaviour policies. We analyze the contractive nature of the related operator under both off-policy policy evaluation and control settings and derive online sample-based algorithms. We believe this is the first return-based off-policy control algorithm converging a.s. to $Q^*$ without the GLIE assumption (Greedy in the Limit with Infinite Exploration). As a corollary, we prove the convergence of Watkins’ Q($\lambda$), which was an open problem since 1989. We illustrate the benefits of Retrace($\lambda$) on a standard suite of Atari 2600 games.
Tasks Atari Games
Published 2016-06-08
URL http://arxiv.org/abs/1606.02647v2
PDF http://arxiv.org/pdf/1606.02647v2.pdf
PWC https://paperswithcode.com/paper/safe-and-efficient-off-policy-reinforcement
Repo
Framework
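The abstract above names Retrace($\lambda$) without giving its form. As a rough sketch, and assuming the commonly cited definition in which each trace coefficient is $c_s = \lambda \min(1, \pi(a_s|x_s)/\mu(a_s|x_s))$, the correction applied to $Q(x_0, a_0)$ along one sampled trajectory can be computed as below. The function name, argument layout, and the requirement that the caller supplies $\mathbb{E}_\pi Q(x_{t+1}, \cdot)$ are illustrative assumptions, not the authors' implementation.

```python
def retrace_correction(rewards, q_taken, expected_next_q,
                       pi_probs, mu_probs, gamma=0.99, lam=1.0):
    """Return sum_t gamma^t * (prod_{s=1..t} c_s) * delta_t for one trajectory.

    rewards[t]          reward received after taking a_t in x_t
    q_taken[t]          current estimate Q(x_t, a_t)
    expected_next_q[t]  E_pi[Q(x_{t+1}, .)] under the target policy
    pi_probs[t]         target-policy probability pi(a_t | x_t)
    mu_probs[t]         behaviour-policy probability mu(a_t | x_t)
    """
    correction = 0.0
    trace = 1.0  # empty product for t = 0
    for t in range(len(rewards)):
        if t > 0:
            # Truncated importance weight: never larger than lam.
            trace *= lam * min(1.0, pi_probs[t] / mu_probs[t])
        td_error = rewards[t] + gamma * expected_next_q[t] - q_taken[t]
        correction += (gamma ** t) * trace * td_error
    return correction
```

Truncating the ratio at 1 keeps the product of coefficients bounded, which relates to the low-variance property listed in the abstract, while near on-policy data keeps the coefficients close to $\lambda$ so that traces are not cut prematurely.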

Morphological Priors for Probabilistic Neural Word Embeddings

Title Morphological Priors for Probabilistic Neural Word Embeddings
Authors Parminder Bhatia, Robert Guthrie, Jacob Eisenstein
Abstract Word embeddings allow natural language processing systems to share statistical information across related words. These embeddings are typically based on distributional statistics, making it difficult for them to generalize to rare or unseen words. We propose to improve word embeddings by incorporating morphological information, capturing shared sub-word features. Unlike previous work that constructs word embeddings directly from morphemes, we combine morphological and distributional information in a unified probabilistic framework, in which the word embedding is a latent variable. The morphological information provides a prior distribution on the latent word embeddings, which in turn condition a likelihood function over an observed corpus. This approach yields improvements on intrinsic word similarity evaluations, and also in the downstream task of part-of-speech tagging.
Tasks Part-Of-Speech Tagging, Word Embeddings
Published 2016-08-03
URL http://arxiv.org/abs/1608.01056v2
PDF http://arxiv.org/pdf/1608.01056v2.pdf
PWC https://paperswithcode.com/paper/morphological-priors-for-probabilistic-neural
Repo
Framework

Bayesian Neural Word Embedding

Title Bayesian Neural Word Embedding
Authors Oren Barkan
Abstract Recently, several works in the domain of natural language processing presented successful methods for word embedding. Among them, the Skip-Gram with negative sampling, known also as word2vec, advanced the state-of-the-art of various linguistic tasks. In this paper, we propose a scalable Bayesian neural word embedding algorithm. The algorithm relies on a Variational Bayes solution for the Skip-Gram objective and a detailed step-by-step description is provided. We present experimental results that demonstrate the performance of the proposed algorithm for word analogy and similarity tasks on six different datasets and show it is competitive with the original Skip-Gram method.
Tasks
Published 2016-03-21
URL http://arxiv.org/abs/1603.06571v3
PDF http://arxiv.org/pdf/1603.06571v3.pdf
PWC https://paperswithcode.com/paper/bayesian-neural-word-embedding
Repo
Framework

Computer-Aided Colorectal Tumor Classification in NBI Endoscopy Using CNN Features

Title Computer-Aided Colorectal Tumor Classification in NBI Endoscopy Using CNN Features
Authors Toru Tamaki, Shoji Sonoyama, Tsubasa Hirakawa, Bisser Raytchev, Kazufumi Kaneda, Tetsushi Koide, Shigeto Yoshida, Hiroshi Mieno, Shinji Tanaka
Abstract In this paper we report results for recognizing colorectal NBI endoscopic images by using features extracted from a convolutional neural network (CNN). In this comparative study, we extract features from different layers of different CNN models, and then train linear SVM classifiers. Experimental results with 10-fold cross validation show that features from the first few convolution layers are enough to achieve similar performance (i.e., recognition rate of 95%) to non-CNN local features such as Bag-of-Visual words, Fisher vector, and VLAD.
Tasks
Published 2016-08-24
URL http://arxiv.org/abs/1608.06709v1
PDF http://arxiv.org/pdf/1608.06709v1.pdf
PWC https://paperswithcode.com/paper/computer-aided-colorectal-tumor
Repo
Framework

CT-Mapper: Mapping Sparse Multimodal Cellular Trajectories using a Multilayer Transportation Network

Title CT-Mapper: Mapping Sparse Multimodal Cellular Trajectories using a Multilayer Transportation Network
Authors Fereshteh Asgari, Alexis Sultan, Haoyi Xiong, Vincent Gauthier, Mounim El-Yacoubi
Abstract Mobile phone data have recently become an attractive source of information about mobility behavior. Since cell phone data can be captured in a passive way for a large user population, they can be harnessed to collect well-sampled mobility information. In this paper, we propose CT-Mapper, an unsupervised algorithm that enables the mapping of mobile phone traces over a multimodal transport network. One of the main strengths of CT-Mapper is its capability to map noisy, sparse cellular multimodal trajectories over a multilayer transportation network where the layers have different physical properties, rather than only mapping trajectories associated with a single layer. Such a network is modeled by a large multilayer graph in which the nodes correspond to metro/train stations or road intersections and edges correspond to connections between them. The mapping problem is modeled by an unsupervised HMM where the observations correspond to sparse user mobile trajectories and the hidden states to the multilayer graph nodes. The HMM is unsupervised as the transition and emission probabilities are inferred using, respectively, the physical transportation properties and the information on the spatial coverage of antenna base stations. To evaluate CT-Mapper we collected cellular traces with their corresponding GPS trajectories for a group of volunteer users in Paris and vicinity (France). We show that CT-Mapper is able to accurately retrieve the real cell phone user paths despite the sparsity of the observed trace trajectories. Furthermore, our transition probability model is up to 20% more accurate than other naive models.
Tasks
Published 2016-04-22
URL http://arxiv.org/abs/1604.06577v1
PDF http://arxiv.org/pdf/1604.06577v1.pdf
PWC https://paperswithcode.com/paper/ct-mapper-mapping-sparse-multimodal-cellular
Repo
Framework
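The abstract above models map matching as an HMM whose hidden states are graph nodes and whose observations are sparse cellular traces. One standard way to recover the most likely node sequence from such a model is Viterbi decoding, sketched below. The abstract does not say which decoder CT-Mapper uses, so this is an assumption, and the argument names and nested-dictionary layout of the log-probability tables are hypothetical.

```python
def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (most likely state path, its log-probability).

    log_start[s]        log P(first state = s)
    log_trans[p][s]     log P(next state = s | current state = p)
    log_emit[s][o]      log P(observation = o | state = s)
    Assumes `observations` is non-empty.
    """
    # best[s]: log-probability of the best path that ends in state s.
    best = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []

    for obs in observations[1:]:
        new_best, pointers = {}, {}
        for s in states:
            # Best predecessor for state s at this step.
            prev = max(states, key=lambda p: best[p] + log_trans[p][s])
            new_best[s] = best[prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best = new_best
        backpointers.append(pointers)

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]
```

In a setting like the one described, `log_trans` would be derived from the transportation network and `log_emit` from antenna coverage, rather than learned from labeled data.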