May 6, 2019

Paper Group ANR 197

Multi-domain Neural Network Language Generation for Spoken Dialogue Systems


Title	Multi-domain Neural Network Language Generation for Spoken Dialogue Systems
Authors	Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, David Vandyke, Steve Young
Abstract	Moving from limited-domain natural language generation (NLG) to open domain is difficult because the number of semantic input combinations grows exponentially with the number of domains. Therefore, it is important to leverage existing resources and exploit similarities between domains to facilitate domain adaptation. In this paper, we propose a procedure to train multi-domain, Recurrent Neural Network-based (RNN) language generators via multiple adaptation steps. In this procedure, a model is first trained on counterfeited data synthesised from an out-of-domain dataset, and then fine tuned on a small set of in-domain utterances with a discriminative objective function. Corpus-based evaluation results show that the proposed procedure can achieve competitive performance in terms of BLEU score and slot error rate while significantly reducing the data needed to train generators in new, unseen domains. In subjective testing, human judges confirm that the procedure greatly improves generator performance when only a small amount of data is available in the domain.
Tasks	Domain Adaptation, Spoken Dialogue Systems, Text Generation
Published	2016-03-03
URL	http://arxiv.org/abs/1603.01232v1
PDF	http://arxiv.org/pdf/1603.01232v1.pdf
PWC	https://paperswithcode.com/paper/multi-domain-neural-network-language
Repo
Framework

Fine Hand Segmentation using Convolutional Neural Networks


Title	Fine Hand Segmentation using Convolutional Neural Networks
Authors	Tadej Vodopivec, Vincent Lepetit, Peter Peer
Abstract	We propose a method for extracting very accurate masks of hands in egocentric views. Our method is based on a novel Deep Learning architecture: In contrast with current Deep Learning methods, we do not use upscaling layers applied to a low-dimensional representation of the input image. Instead, we extract features with convolutional layers and map them directly to a segmentation mask with a fully connected layer. We show that this approach, when applied in a multi-scale fashion, is both accurate and efficient enough for real-time. We demonstrate it on a new dataset made of images captured in various environments, from the outdoors to offices.
Tasks	Hand Segmentation
Published	2016-08-26
URL	http://arxiv.org/abs/1608.07454v1
PDF	http://arxiv.org/pdf/1608.07454v1.pdf
PWC	https://paperswithcode.com/paper/fine-hand-segmentation-using-convolutional
Repo
Framework

Going Out of Business: Auction House Behavior in the Massively Multi-Player Online Game


Title	Going Out of Business: Auction House Behavior in the Massively Multi-Player Online Game
Authors	Anders Drachen, Joseph Riley, Shawna Baskin, Diego Klabjan
Abstract	The in-game economies of massively multi-player online games (MMOGs) are complex systems that have to be carefully designed and managed. This paper presents the results of an analysis of auction house data from the MMOG Glitch, across a 14 month time period, the entire lifetime of the game. The data comprise almost 3 million data points, over 20,000 unique players and more than 650 products. Furthermore, an interactive visualization, based on Sankey flow diagrams, is presented which shows the proportion of the different clusters across each time bin, as well as the flow of players between clusters. The diagram allows evaluation of migration of players between clusters as a function of time, as well as churn analysis. The presented work provides a template analysis and visualization model for progression-based or temporal-based analysis of player behavior broadly applicable to games.
Tasks
Published	2016-03-24
URL	http://arxiv.org/abs/1603.07610v1
PDF	http://arxiv.org/pdf/1603.07610v1.pdf
PWC	https://paperswithcode.com/paper/going-out-of-business-auction-house-behavior
Repo
Framework

An Empirical Study of Adequate Vision Span for Attention-Based Neural Machine Translation


Title	An Empirical Study of Adequate Vision Span for Attention-Based Neural Machine Translation
Authors	Raphael Shu, Hideki Nakayama
Abstract	Recently, the attention mechanism has come to play a key role in achieving high performance in Neural Machine Translation models. However, as it computes a score function for the encoder states in all positions at each decoding step, the attention model greatly increases the computational complexity. In this paper, we investigate the adequate vision span of attention models in the context of machine translation, by proposing a novel attention framework that is capable of reducing redundant score computation dynamically. The term “vision span” means a window of the encoder states considered by the attention model in one step. In our experiments, we found that the average window size of vision span can be reduced by over 50% with modest loss in accuracy on English-Japanese and German-English translation tasks. These results indicate that the conventional attention mechanism performs a significant amount of redundant computation.
Tasks	Machine Translation
Published	2016-12-19
URL	http://arxiv.org/abs/1612.06043v4
PDF	http://arxiv.org/pdf/1612.06043v4.pdf
PWC	https://paperswithcode.com/paper/an-empirical-study-of-adequate-vision-span
Repo
Framework
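
The "vision span" idea in the abstract above can be illustrated with a small sketch. This is not the paper's framework, which chooses the window dynamically; it is a simplified fixed-window variant, and the names `windowed_attention`, `score_fn`, `center`, and `half_width` are hypothetical. The point it shows is that the score function is only evaluated for encoder positions inside the window, so the number of score computations per decoding step drops from the source length to the window size.

```python
import math

def windowed_attention(score_fn, src_len, center, half_width):
    """Attention weights restricted to a window of encoder positions.

    Assumes 0 <= center < src_len. score_fn(j) returns the unnormalised
    attention score for encoder position j and is called only inside
    the window [center - half_width, center + half_width].
    """
    lo = max(0, center - half_width)
    hi = min(src_len, center + half_width + 1)
    window = [score_fn(j) for j in range(lo, hi)]
    # Numerically stable softmax over the window only.
    m = max(window)
    exps = [math.exp(s - m) for s in window]
    total = sum(exps)
    weights = [0.0] * src_len  # positions outside the window get zero weight
    for offset, e in enumerate(exps):
        weights[lo + offset] = e / total
    return weights

# Toy scores for a six-token source sentence (illustrative values only).
toy_scores = [0.1, 2.0, 0.5, 1.5, -1.0, 0.3]
calls = []

def toy_score_fn(j):
    calls.append(j)  # record which positions were actually scored
    return toy_scores[j]

weights = windowed_attention(toy_score_fn, len(toy_scores), center=2, half_width=1)
```

Here only positions 1 to 3 are scored, which is half of the six encoder states.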

On Deep Multi-View Representation Learning: Objectives and Optimization


Title	On Deep Multi-View Representation Learning: Objectives and Optimization
Authors	Weiran Wang, Raman Arora, Karen Livescu, Jeff Bilmes
Abstract	We consider learning representations (features) in the setting in which we have access to multiple unlabeled views of the data for learning while only one view is available for downstream tasks. Previous work on this problem has proposed several techniques based on deep neural networks, typically involving either autoencoder-like networks with a reconstruction objective or paired feedforward networks with a batch-style correlation-based objective. We analyze several techniques based on prior work, as well as new variants, and compare them empirically on image, speech, and text tasks. We find an advantage for correlation-based representation learning, while the best results on most tasks are obtained with our new variant, deep canonically correlated autoencoders (DCCAE). We also explore a stochastic optimization procedure for minibatch correlation-based objectives and discuss the time/performance trade-offs for kernel-based and neural network-based implementations.
Tasks	Representation Learning, Stochastic Optimization
Published	2016-02-02
URL	http://arxiv.org/abs/1602.01024v1
PDF	http://arxiv.org/pdf/1602.01024v1.pdf
PWC	https://paperswithcode.com/paper/on-deep-multi-view-representation-learning
Repo
Framework

Autonomous Grounding of Visual Field Experience through Sensorimotor Prediction


Title	Autonomous Grounding of Visual Field Experience through Sensorimotor Prediction
Authors	Alban Laflaquière
Abstract	In a developmental framework, autonomous robots need to explore the world and learn how to interact with it. Without an a priori model of the system, this opens the challenging problem of having robots master their interface with the world: how to perceive their environment using their sensors, and how to act in it using their motors. The sensorimotor approach of perception claims that a naive agent can learn to master this interface by capturing regularities in the way its actions transform its sensory inputs. In this paper, we apply such an approach to the discovery and mastery of the visual field associated with a visual sensor. A computational model is formalized and applied to a simulated system to illustrate the approach.
Tasks
Published	2016-08-03
URL	http://arxiv.org/abs/1608.01127v1
PDF	http://arxiv.org/pdf/1608.01127v1.pdf
PWC	https://paperswithcode.com/paper/autonomous-grounding-of-visual-field
Repo
Framework

On the Troll-Trust Model for Edge Sign Prediction in Social Networks


Title	On the Troll-Trust Model for Edge Sign Prediction in Social Networks
Authors	Géraud Le Falher, Nicolò Cesa-Bianchi, Claudio Gentile, Fabio Vitale
Abstract	In the problem of edge sign prediction, we are given a directed graph (representing a social network), and our task is to predict the binary labels of the edges (i.e., the positive or negative nature of the social relationships). Many successful heuristics for this problem are based on the troll-trust features, estimating at each node the fraction of outgoing and incoming positive/negative edges. We show that these heuristics can be understood, and rigorously analyzed, as approximators to the Bayes optimal classifier for a simple probabilistic model of the edge labels. We then show that the maximum likelihood estimator for this model approximately corresponds to the predictions of a Label Propagation algorithm run on a transformed version of the original social graph. Extensive experiments on a number of real-world datasets show that this algorithm is competitive against state-of-the-art classifiers in terms of both accuracy and scalability. Finally, we show that troll-trust features can also be used to derive online learning algorithms which have theoretical guarantees even when edges are adversarially labeled.
Tasks
Published	2016-06-01
URL	http://arxiv.org/abs/1606.00182v5
PDF	http://arxiv.org/pdf/1606.00182v5.pdf
PWC	https://paperswithcode.com/paper/on-the-troll-trust-model-for-edge-sign
Repo
Framework
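
The troll-trust features described in the abstract above (per-node fractions of negative outgoing and incoming edges) can be sketched as follows. The function names, the toy graph, and the averaging-and-threshold rule in `predict_sign` are assumptions made for illustration; the abstract does not specify the exact decision rule, so this should be read as one plausible heuristic rather than the authors' algorithm.

```python
from collections import defaultdict

def troll_trust_features(edges):
    """Compute per-node troll-trust features from signed directed edges.

    edges: iterable of (source, target, sign) with sign in {+1, -1}.
    Returns (trollness, untrustworthiness), where trollness[u] is the
    fraction of u's outgoing edges that are negative and
    untrustworthiness[v] is the fraction of v's incoming edges that are negative.
    """
    out_neg, out_tot = defaultdict(int), defaultdict(int)
    in_neg, in_tot = defaultdict(int), defaultdict(int)
    for u, v, sign in edges:
        out_tot[u] += 1
        in_tot[v] += 1
        if sign < 0:
            out_neg[u] += 1
            in_neg[v] += 1
    trollness = {u: out_neg[u] / out_tot[u] for u in out_tot}
    untrustworthiness = {v: in_neg[v] / in_tot[v] for v in in_tot}
    return trollness, untrustworthiness

def predict_sign(u, v, trollness, untrustworthiness, threshold=0.5):
    """Hypothetical rule: predict -1 when the average of the two features exceeds threshold."""
    score = (trollness.get(u, 0.0) + untrustworthiness.get(v, 0.0)) / 2
    return -1 if score > threshold else 1

# Toy signed graph (illustrative data only).
edges = [("a", "b", 1), ("a", "c", -1), ("b", "c", -1), ("d", "c", 1), ("d", "b", 1)]
trollness, untrustworthiness = troll_trust_features(edges)
```

In this toy graph, node `b` only sends negative edges and node `c` mostly receives them, so the rule predicts a negative sign for an edge from `b` to `c`.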

End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering


Title	End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Authors	Youngjae Yu, Hyungjin Ko, Jongwook Choi, Gunhee Kim
Abstract	We propose a high-level concept word detector that can be integrated with any video-to-language model. It takes a video as input and generates a list of concept words as useful semantic priors for language generation models. The proposed word detector has two important properties. First, it does not require any external knowledge sources for training. Second, the proposed word detector is trainable in an end-to-end manner jointly with any video-to-language model. To maximize the value of the detected words, we also develop a semantic attention mechanism that selectively focuses on the detected concept words and fuses them with the word encoding and decoding in the language model. In order to demonstrate that the proposed approach indeed improves the performance of multiple video-to-language tasks, we participate in four tasks of LSMDC 2016. Our approach achieves the best accuracies in three of them, including fill-in-the-blank, multiple-choice test, and movie retrieval. We also attain comparable performance for the other task, movie description.
Tasks	Language Modelling, Question Answering, Text Generation, Video Captioning, Video Retrieval
Published	2016-10-10
URL	http://arxiv.org/abs/1610.02947v3
PDF	http://arxiv.org/pdf/1610.02947v3.pdf
PWC	https://paperswithcode.com/paper/end-to-end-concept-word-detection-for-video
Repo
Framework

Text Network Exploration via Heterogeneous Web of Topics


Title	Text Network Exploration via Heterogeneous Web of Topics
Authors	Junxian He, Ying Huang, Changfeng Liu, Jiaming Shen, Yuting Jia, Xinbing Wang
Abstract	A text network refers to a data type in which each vertex is associated with a text document and the relationships between documents are represented by edges. The proliferation of text networks such as hyperlinked webpages and academic citation networks has led to an increasing demand for quickly developing a general sense of a new text network, namely text network exploration. In this paper, we address the problem of text network exploration by constructing a heterogeneous web of topics, which allows people to investigate a text network by associating the word level with the document level. To achieve this, a probabilistic generative model for text and links is proposed, in which three different relationships in the heterogeneous topic web are quantified. We also develop a prototype demo system named TopicAtlas to exhibit such a heterogeneous topic web, and demonstrate how this system can facilitate the task of text network exploration. Extensive qualitative analyses are included to verify the effectiveness of this heterogeneous topic web. In addition, we validate our model on real-life text networks, showing that it maintains good performance on objective evaluation metrics.
Tasks
Published	2016-10-02
URL	http://arxiv.org/abs/1610.00219v1
PDF	http://arxiv.org/pdf/1610.00219v1.pdf
PWC	https://paperswithcode.com/paper/text-network-exploration-via-heterogeneous
Repo
Framework

Synthetic Data for Text Localisation in Natural Images


Title	Synthetic Data for Text Localisation in Natural Images
Authors	Ankush Gupta, Andrea Vedaldi, Andrew Zisserman
Abstract	In this paper we introduce a new method for text detection in natural images. The method comprises two contributions: First, a fast and scalable engine to generate synthetic images of text in clutter. This engine overlays synthetic text onto existing background images in a natural way, accounting for the local 3D scene geometry. Second, we use the synthetic images to train a Fully-Convolutional Regression Network (FCRN) which efficiently performs text detection and bounding-box regression at all locations and multiple scales in an image. We discuss the relation of FCRN to the recently-introduced YOLO detector, as well as other end-to-end object detection systems based on deep learning. The resulting detection network significantly outperforms current methods for text detection in natural images, achieving an F-measure of 84.2% on the standard ICDAR 2013 benchmark. Furthermore, it can process 15 images per second on a GPU.
Tasks	Object Detection
Published	2016-04-22
URL	http://arxiv.org/abs/1604.06646v1
PDF	http://arxiv.org/pdf/1604.06646v1.pdf
PWC	https://paperswithcode.com/paper/synthetic-data-for-text-localisation-in
Repo
Framework

Safe and Efficient Off-Policy Reinforcement Learning


Title	Safe and Efficient Off-Policy Reinforcement Learning
Authors	Rémi Munos, Tom Stepleton, Anna Harutyunyan, Marc G. Bellemare
Abstract	In this work, we take a fresh look at some old and new algorithms for off-policy, return-based reinforcement learning. Expressing these in a common form, we derive a novel algorithm, Retrace($\lambda$), with three desired properties: (1) it has low variance; (2) it safely uses samples collected from any behaviour policy, whatever its degree of “off-policyness”; and (3) it is efficient as it makes the best use of samples collected from near on-policy behaviour policies. We analyze the contractive nature of the related operator under both off-policy policy evaluation and control settings and derive online sample-based algorithms. We believe this is the first return-based off-policy control algorithm converging a.s. to $Q^*$ without the GLIE assumption (Greedy in the Limit with Infinite Exploration). As a corollary, we prove the convergence of Watkins’ Q($\lambda$), which was an open problem since 1989. We illustrate the benefits of Retrace($\lambda$) on a standard suite of Atari 2600 games.
Tasks	Atari Games
Published	2016-06-08
URL	http://arxiv.org/abs/1606.02647v2
PDF	http://arxiv.org/pdf/1606.02647v2.pdf
PWC	https://paperswithcode.com/paper/safe-and-efficient-off-policy-reinforcement
Repo
Framework
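
As a rough illustration of how Retrace($\lambda$) can have low variance while remaining safe for any behaviour policy, the sketch below assumes the truncated importance-weight form of the trace coefficient, $c_s = \lambda \min(1, \pi(a_s|x_s)/\mu(a_s|x_s))$, where $\pi$ is the target policy and $\mu$ the behaviour policy. That formula is not stated in the abstract above, and the function names and toy numbers are hypothetical. The correction for the first state-action pair is then the sum over $t$ of $\gamma^t (c_1 \cdots c_t)\,\delta_t$, with $\delta_t$ the temporal-difference errors supplied by the caller.

```python
def retrace_coefficients(pi_probs, mu_probs, lam=1.0):
    """Trace coefficients c_s = lam * min(1, pi/mu), one per time step.

    pi_probs[s] and mu_probs[s] are the target- and behaviour-policy
    probabilities of the action actually taken at step s (mu_probs[s] > 0).
    Truncating the ratio at 1 keeps every coefficient in [0, lam].
    """
    return [lam * min(1.0, p / m) for p, m in zip(pi_probs, mu_probs)]

def retrace_correction(td_errors, pi_probs, mu_probs, gamma=0.99, lam=1.0):
    """Sum over t of gamma^t * (c_1 * ... * c_t) * td_errors[t].

    The coefficient at index 0 is not used: the product is empty for t = 0.
    """
    coeffs = retrace_coefficients(pi_probs, mu_probs, lam)
    total = 0.0
    trace = 1.0
    for t, delta in enumerate(td_errors):
        if t > 0:
            trace *= gamma * coeffs[t]
        total += trace * delta
    return total

# Toy two-step trajectory (illustrative numbers only).
coeffs = retrace_coefficients([0.5, 0.9], [0.25, 1.0])
correction = retrace_correction([1.0, 2.0], [0.5, 0.5], [0.5, 1.0], gamma=0.5)
```

Because the ratio is truncated at 1, the product of coefficients cannot blow up, and when the two policies nearly agree the coefficients stay close to $\lambda$, so the traces are not cut short.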

Morphological Priors for Probabilistic Neural Word Embeddings


Title	Morphological Priors for Probabilistic Neural Word Embeddings
Authors	Parminder Bhatia, Robert Guthrie, Jacob Eisenstein
Abstract	Word embeddings allow natural language processing systems to share statistical information across related words. These embeddings are typically based on distributional statistics, making it difficult for them to generalize to rare or unseen words. We propose to improve word embeddings by incorporating morphological information, capturing shared sub-word features. Unlike previous work that constructs word embeddings directly from morphemes, we combine morphological and distributional information in a unified probabilistic framework, in which the word embedding is a latent variable. The morphological information provides a prior distribution on the latent word embeddings, which in turn condition a likelihood function over an observed corpus. This approach yields improvements on intrinsic word similarity evaluations, and also in the downstream task of part-of-speech tagging.
Tasks	Part-Of-Speech Tagging, Word Embeddings
Published	2016-08-03
URL	http://arxiv.org/abs/1608.01056v2
PDF	http://arxiv.org/pdf/1608.01056v2.pdf
PWC	https://paperswithcode.com/paper/morphological-priors-for-probabilistic-neural
Repo
Framework

Bayesian Neural Word Embedding


Title	Bayesian Neural Word Embedding
Authors	Oren Barkan
Abstract	Recently, several works in the domain of natural language processing presented successful methods for word embedding. Among them, the Skip-Gram with negative sampling, known also as word2vec, advanced the state-of-the-art of various linguistics tasks. In this paper, we propose a scalable Bayesian neural word embedding algorithm. The algorithm relies on a Variational Bayes solution for the Skip-Gram objective and a detailed step by step description is provided. We present experimental results that demonstrate the performance of the proposed algorithm for word analogy and similarity tasks on six different datasets and show it is competitive with the original Skip-Gram method.
Tasks
Published	2016-03-21
URL	http://arxiv.org/abs/1603.06571v3
PDF	http://arxiv.org/pdf/1603.06571v3.pdf
PWC	https://paperswithcode.com/paper/bayesian-neural-word-embedding
Repo
Framework

Computer-Aided Colorectal Tumor Classification in NBI Endoscopy Using CNN Features


Title	Computer-Aided Colorectal Tumor Classification in NBI Endoscopy Using CNN Features
Authors	Toru Tamaki, Shoji Sonoyama, Tsubasa Hirakawa, Bisser Raytchev, Kazufumi Kaneda, Tetsushi Koide, Shigeto Yoshida, Hiroshi Mieno, Shinji Tanaka
Abstract	In this paper we report results for recognizing colorectal NBI endoscopic images using features extracted from convolutional neural networks (CNNs). In this comparative study, we extract features from different layers of different CNN models, and then train linear SVM classifiers. Experimental results with 10-fold cross-validation show that features from the first few convolution layers are enough to achieve performance similar (i.e., a recognition rate of 95%) to that of non-CNN local features such as Bag-of-Visual words, Fisher vector, and VLAD.
Tasks
Published	2016-08-24
URL	http://arxiv.org/abs/1608.06709v1
PDF	http://arxiv.org/pdf/1608.06709v1.pdf
PWC	https://paperswithcode.com/paper/computer-aided-colorectal-tumor
Repo
Framework

CT-Mapper: Mapping Sparse Multimodal Cellular Trajectories using a Multilayer Transportation Network


Title	CT-Mapper: Mapping Sparse Multimodal Cellular Trajectories using a Multilayer Transportation Network
Authors	Fereshteh Asgari, Alexis Sultan, Haoyi Xiong, Vincent Gauthier, Mounim El-Yacoubi
Abstract	Mobile phone data have recently become an attractive source of information about mobility behavior. Since cell phone data can be captured in a passive way for a large user population, they can be harnessed to collect well-sampled mobility information. In this paper, we propose CT-Mapper, an unsupervised algorithm that enables the mapping of mobile phone traces over a multimodal transport network. One of the main strengths of CT-Mapper is its capability to map noisy sparse cellular multimodal trajectories over a multilayer transportation network where the layers have different physical properties and not only to map trajectories associated with a single layer. Such a network is modeled by a large multilayer graph in which the nodes correspond to metro/train stations or road intersections and edges correspond to connections between them. The mapping problem is modeled by an unsupervised HMM where the observations correspond to sparse user mobile trajectories and the hidden states to the multilayer graph nodes. The HMM is unsupervised as the transition and emission probabilities are inferred using respectively the physical transportation properties and the information on the spatial coverage of antenna base stations. To evaluate CT-Mapper we collected cellular traces with their corresponding GPS trajectories for a group of volunteer users in Paris and vicinity (France). We show that CT-Mapper is able to accurately retrieve the real cell phone user paths despite the sparsity of the observed trace trajectories. Furthermore our transition probability model is up to 20% more accurate than other naive models.
Tasks
Published	2016-04-22
URL	http://arxiv.org/abs/1604.06577v1
PDF	http://arxiv.org/pdf/1604.06577v1.pdf
PWC	https://paperswithcode.com/paper/ct-mapper-mapping-sparse-multimodal-cellular
Repo
Framework
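
The HMM formulation in the abstract above (hidden states are graph nodes, observations are sparse cellular records) is typically decoded with the Viterbi algorithm. The sketch below is a generic Viterbi decoder, not CT-Mapper itself: the state names, cell identifiers, and all probabilities are made-up toy values, whereas the paper derives transition and emission probabilities from transport properties and antenna coverage.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for a sequence of observations."""
    # best[t][s]: probability of the best path that ends in state s at time t.
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p] * trans_p[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            scores[s] = score * emit_p[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the most likely final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Toy model: two network nodes and two cell towers (all values hypothetical).
states = ["station_A", "station_B"]
start_p = {"station_A": 0.5, "station_B": 0.5}
trans_p = {
    "station_A": {"station_A": 0.7, "station_B": 0.3},
    "station_B": {"station_A": 0.3, "station_B": 0.7},
}
emit_p = {
    "station_A": {"cell1": 0.9, "cell2": 0.1},
    "station_B": {"cell1": 0.2, "cell2": 0.8},
}
path = viterbi(["cell1", "cell1", "cell2"], states, start_p, trans_p, emit_p)
```

For long trajectories a log-probability version would be preferable to avoid numerical underflow; plain probabilities are used here only to keep the sketch short.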