Paper Group ANR 197
Multi-domain Neural Network Language Generation for Spoken Dialogue Systems
Title | Multi-domain Neural Network Language Generation for Spoken Dialogue Systems |
Authors | Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Lina M. Rojas-Barahona, Pei-Hao Su, David Vandyke, Steve Young |
Abstract | Moving from limited-domain natural language generation (NLG) to open domain is difficult because the number of semantic input combinations grows exponentially with the number of domains. Therefore, it is important to leverage existing resources and exploit similarities between domains to facilitate domain adaptation. In this paper, we propose a procedure to train multi-domain, Recurrent Neural Network-based (RNN) language generators via multiple adaptation steps. In this procedure, a model is first trained on counterfeited data synthesised from an out-of-domain dataset, and then fine tuned on a small set of in-domain utterances with a discriminative objective function. Corpus-based evaluation results show that the proposed procedure can achieve competitive performance in terms of BLEU score and slot error rate while significantly reducing the data needed to train generators in new, unseen domains. In subjective testing, human judges confirm that the procedure greatly improves generator performance when only a small amount of data is available in the domain. |
Tasks | Domain Adaptation, Spoken Dialogue Systems, Text Generation |
Published | 2016-03-03 |
URL | http://arxiv.org/abs/1603.01232v1 |
PDF | http://arxiv.org/pdf/1603.01232v1.pdf |
PWC | https://paperswithcode.com/paper/multi-domain-neural-network-language |
Repo | |
Framework | |
Fine Hand Segmentation using Convolutional Neural Networks
Title | Fine Hand Segmentation using Convolutional Neural Networks |
Authors | Tadej Vodopivec, Vincent Lepetit, Peter Peer |
Abstract | We propose a method for extracting very accurate masks of hands in egocentric views. Our method is based on a novel Deep Learning architecture: In contrast with current Deep Learning methods, we do not use upscaling layers applied to a low-dimensional representation of the input image. Instead, we extract features with convolutional layers and map them directly to a segmentation mask with a fully connected layer. We show that this approach, when applied in a multi-scale fashion, is both accurate and efficient enough for real-time. We demonstrate it on a new dataset made of images captured in various environments, from the outdoors to offices. |
Tasks | Hand Segmentation |
Published | 2016-08-26 |
URL | http://arxiv.org/abs/1608.07454v1 |
PDF | http://arxiv.org/pdf/1608.07454v1.pdf |
PWC | https://paperswithcode.com/paper/fine-hand-segmentation-using-convolutional |
Repo | |
Framework | |
Going Out of Business: Auction House Behavior in the Massively Multi-Player Online Game
Title | Going Out of Business: Auction House Behavior in the Massively Multi-Player Online Game |
Authors | Anders Drachen, Joseph Riley, Shawna Baskin, Diego Klabjan |
Abstract | The in-game economies of massively multi-player online games (MMOGs) are complex systems that have to be carefully designed and managed. This paper presents the results of an analysis of auction house data from the MMOG Glitch, across a 14 month time period, the entire lifetime of the game. The data comprise almost 3 million data points, over 20,000 unique players and more than 650 products. Furthermore, an interactive visualization, based on Sankey flow diagrams, is presented which shows the proportion of the different clusters across each time bin, as well as the flow of players between clusters. The diagram allows evaluation of migration of players between clusters as a function of time, as well as churn analysis. The presented work provides a template analysis and visualization model for progression-based or temporal-based analysis of player behavior broadly applicable to games. |
Tasks | |
Published | 2016-03-24 |
URL | http://arxiv.org/abs/1603.07610v1 |
PDF | http://arxiv.org/pdf/1603.07610v1.pdf |
PWC | https://paperswithcode.com/paper/going-out-of-business-auction-house-behavior |
Repo | |
Framework | |
An Empirical Study of Adequate Vision Span for Attention-Based Neural Machine Translation
Title | An Empirical Study of Adequate Vision Span for Attention-Based Neural Machine Translation |
Authors | Raphael Shu, Hideki Nakayama |
Abstract | The attention mechanism plays a key role in achieving high performance in Neural Machine Translation models. However, as it computes a score function for the encoder states in all positions at each decoding step, the attention model greatly increases the computational complexity. In this paper, we investigate the adequate vision span of attention models in the context of machine translation, by proposing a novel attention framework that is capable of reducing redundant score computation dynamically. The term “vision span” means a window of the encoder states considered by the attention model in one step. In our experiments, we found that the average window size of vision span can be reduced by over 50% with modest loss in accuracy on English-Japanese and German-English translation tasks. These results indicate that the conventional attention mechanism performs a significant amount of redundant computation. |
Tasks | Machine Translation |
Published | 2016-12-19 |
URL | http://arxiv.org/abs/1612.06043v4 |
PDF | http://arxiv.org/pdf/1612.06043v4.pdf |
PWC | https://paperswithcode.com/paper/an-empirical-study-of-adequate-vision-span |
Repo | |
Framework | |
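
The abstract above defines a "vision span" as the window of encoder states that the attention model considers in one step. A minimal sketch of that idea follows, assuming a fixed window around a given position. The function name, the `center` and `half_width` parameters, and the fixed-width choice are all hypothetical; the paper describes selecting the span dynamically, which this sketch does not attempt to reproduce.

```python
import math


def windowed_attention_weights(scores, center, half_width):
    """Softmax over the scores inside a window; zero weight elsewhere.

    Illustrative only: `center` and `half_width` are hypothetical
    parameters, not the paper's dynamic span selection.
    """
    if not 0 <= center < len(scores):
        raise ValueError("center must index into scores")
    if half_width < 0:
        raise ValueError("half_width must be non-negative")

    # Clip the window to the valid range of encoder positions.
    lo = max(0, center - half_width)
    hi = min(len(scores), center + half_width + 1)
    window = scores[lo:hi]

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(window)
    exps = [math.exp(s - peak) for s in window]
    total = sum(exps)

    # Positions outside the window are never scored and get zero weight.
    weights = [0.0] * len(scores)
    for offset, value in enumerate(exps):
        weights[lo + offset] = value / total
    return weights
```

With `half_width` much smaller than the source length, only `2 * half_width + 1` positions are normalised per decoding step instead of all of them, which is the kind of saving the abstract refers to.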
On Deep Multi-View Representation Learning: Objectives and Optimization
Title | On Deep Multi-View Representation Learning: Objectives and Optimization |
Authors | Weiran Wang, Raman Arora, Karen Livescu, Jeff Bilmes |
Abstract | We consider learning representations (features) in the setting in which we have access to multiple unlabeled views of the data for learning while only one view is available for downstream tasks. Previous work on this problem has proposed several techniques based on deep neural networks, typically involving either autoencoder-like networks with a reconstruction objective or paired feedforward networks with a batch-style correlation-based objective. We analyze several techniques based on prior work, as well as new variants, and compare them empirically on image, speech, and text tasks. We find an advantage for correlation-based representation learning, while the best results on most tasks are obtained with our new variant, deep canonically correlated autoencoders (DCCAE). We also explore a stochastic optimization procedure for minibatch correlation-based objectives and discuss the time/performance trade-offs for kernel-based and neural network-based implementations. |
Tasks | Representation Learning, Stochastic Optimization |
Published | 2016-02-02 |
URL | http://arxiv.org/abs/1602.01024v1 |
PDF | http://arxiv.org/pdf/1602.01024v1.pdf |
PWC | https://paperswithcode.com/paper/on-deep-multi-view-representation-learning |
Repo | |
Framework | |
Autonomous Grounding of Visual Field Experience through Sensorimotor Prediction
Title | Autonomous Grounding of Visual Field Experience through Sensorimotor Prediction |
Authors | Alban Laflaquière |
Abstract | In a developmental framework, autonomous robots need to explore the world and learn how to interact with it. Without an a priori model of the system, this opens the challenging problem of having robots master their interface with the world: how to perceive their environment using their sensors, and how to act in it using their motors. The sensorimotor approach of perception claims that a naive agent can learn to master this interface by capturing regularities in the way its actions transform its sensory inputs. In this paper, we apply such an approach to the discovery and mastery of the visual field associated with a visual sensor. A computational model is formalized and applied to a simulated system to illustrate the approach. |
Tasks | |
Published | 2016-08-03 |
URL | http://arxiv.org/abs/1608.01127v1 |
PDF | http://arxiv.org/pdf/1608.01127v1.pdf |
PWC | https://paperswithcode.com/paper/autonomous-grounding-of-visual-field |
Repo | |
Framework | |
On the Troll-Trust Model for Edge Sign Prediction in Social Networks
Title | On the Troll-Trust Model for Edge Sign Prediction in Social Networks |
Authors | Géraud Le Falher, Nicolò Cesa-Bianchi, Claudio Gentile, Fabio Vitale |
Abstract | In the problem of edge sign prediction, we are given a directed graph (representing a social network), and our task is to predict the binary labels of the edges (i.e., the positive or negative nature of the social relationships). Many successful heuristics for this problem are based on the troll-trust features, estimating at each node the fraction of outgoing and incoming positive/negative edges. We show that these heuristics can be understood, and rigorously analyzed, as approximators to the Bayes optimal classifier for a simple probabilistic model of the edge labels. We then show that the maximum likelihood estimator for this model approximately corresponds to the predictions of a Label Propagation algorithm run on a transformed version of the original social graph. Extensive experiments on a number of real-world datasets show that this algorithm is competitive against state-of-the-art classifiers in terms of both accuracy and scalability. Finally, we show that troll-trust features can also be used to derive online learning algorithms which have theoretical guarantees even when edges are adversarially labeled. |
Tasks | |
Published | 2016-06-01 |
URL | http://arxiv.org/abs/1606.00182v5 |
PDF | http://arxiv.org/pdf/1606.00182v5.pdf |
PWC | https://paperswithcode.com/paper/on-the-troll-trust-model-for-edge-sign |
Repo | |
Framework | |
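
The troll-trust features mentioned in the abstract above are per-node fractions of negative outgoing and incoming edges. A minimal sketch of how they could be computed from a signed edge list follows. The function name, the `(source, target, sign)` tuple layout, and the labels "trollness" and "untrustworthiness" are assumptions made for illustration, and the sketch covers only the feature extraction, not the classifier or the Label Propagation step.

```python
from collections import defaultdict


def troll_trust_features(edges):
    """Return {node: (trollness, untrustworthiness)} for a signed digraph.

    `edges` is an iterable of (source, target, sign) with sign +1 or -1.
    A feature is None when the node has no edges in that direction.
    """
    out_total = defaultdict(int)
    out_negative = defaultdict(int)
    in_total = defaultdict(int)
    in_negative = defaultdict(int)

    for source, target, sign in edges:
        out_total[source] += 1
        in_total[target] += 1
        if sign < 0:
            out_negative[source] += 1
            in_negative[target] += 1

    features = {}
    for node in set(out_total) | set(in_total):
        # Fraction of the node's outgoing edges that are negative.
        trollness = (
            out_negative[node] / out_total[node] if out_total[node] else None
        )
        # Fraction of the node's incoming edges that are negative.
        untrustworthiness = (
            in_negative[node] / in_total[node] if in_total[node] else None
        )
        features[node] = (trollness, untrustworthiness)
    return features
```

In a prediction setting these fractions would be estimated from the edges whose signs are already known, then used as inputs when labelling the remaining edges.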
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Title | End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering |
Authors | Youngjae Yu, Hyungjin Ko, Jongwook Choi, Gunhee Kim |
Abstract | We propose a high-level concept word detector that can be integrated with any video-to-language model. It takes a video as input and generates a list of concept words as useful semantic priors for language generation models. The proposed word detector has two important properties. First, it does not require any external knowledge sources for training. Second, the proposed word detector is trainable in an end-to-end manner jointly with any video-to-language model. To maximize the value of the detected words, we also develop a semantic attention mechanism that selectively focuses on the detected concept words and fuses them with the word encoding and decoding in the language model. In order to demonstrate that the proposed approach indeed improves the performance of multiple video-to-language tasks, we participate in four tasks of LSMDC 2016. Our approach achieves the best accuracies in three of them, including fill-in-the-blank, multiple-choice test, and movie retrieval. We also attain comparable performance for the other task, movie description. |
Tasks | Language Modelling, Question Answering, Text Generation, Video Captioning, Video Retrieval |
Published | 2016-10-10 |
URL | http://arxiv.org/abs/1610.02947v3 |
PDF | http://arxiv.org/pdf/1610.02947v3.pdf |
PWC | https://paperswithcode.com/paper/end-to-end-concept-word-detection-for-video |
Repo | |
Framework | |
Text Network Exploration via Heterogeneous Web of Topics
Title | Text Network Exploration via Heterogeneous Web of Topics |
Authors | Junxian He, Ying Huang, Changfeng Liu, Jiaming Shen, Yuting Jia, Xinbing Wang |
Abstract | A text network is a data type in which each vertex is associated with a text document and the relationships between documents are represented by edges. The proliferation of text networks such as hyperlinked webpages and academic citation networks has led to an increasing demand for quickly developing a general sense of a new text network, namely text network exploration. In this paper, we address the problem of text network exploration through constructing a heterogeneous web of topics, which allows people to investigate a text network associating word level with document level. To achieve this, a probabilistic generative model for text and links is proposed, where three different relationships in the heterogeneous topic web are quantified. We also develop a prototype demo system named TopicAtlas to exhibit such a heterogeneous topic web, and demonstrate how this system can facilitate the task of text network exploration. Extensive qualitative analyses are included to verify the effectiveness of this heterogeneous topic web. In addition, we validate our model on real-life text networks, showing that it maintains good performance on objective evaluation metrics. |
Tasks | |
Published | 2016-10-02 |
URL | http://arxiv.org/abs/1610.00219v1 |
PDF | http://arxiv.org/pdf/1610.00219v1.pdf |
PWC | https://paperswithcode.com/paper/text-network-exploration-via-heterogeneous |
Repo | |
Framework | |
Synthetic Data for Text Localisation in Natural Images
Title | Synthetic Data for Text Localisation in Natural Images |
Authors | Ankush Gupta, Andrea Vedaldi, Andrew Zisserman |
Abstract | In this paper we introduce a new method for text detection in natural images. The method comprises two contributions: First, a fast and scalable engine to generate synthetic images of text in clutter. This engine overlays synthetic text onto existing background images in a natural way, accounting for the local 3D scene geometry. Second, we use the synthetic images to train a Fully-Convolutional Regression Network (FCRN) which efficiently performs text detection and bounding-box regression at all locations and multiple scales in an image. We discuss the relation of FCRN to the recently-introduced YOLO detector, as well as other end-to-end object detection systems based on deep learning. The resulting detection network significantly outperforms current methods for text detection in natural images, achieving an F-measure of 84.2% on the standard ICDAR 2013 benchmark. Furthermore, it can process 15 images per second on a GPU. |
Tasks | Object Detection |
Published | 2016-04-22 |
URL | http://arxiv.org/abs/1604.06646v1 |
PDF | http://arxiv.org/pdf/1604.06646v1.pdf |
PWC | https://paperswithcode.com/paper/synthetic-data-for-text-localisation-in |
Repo | |
Framework | |
Safe and Efficient Off-Policy Reinforcement Learning
Title | Safe and Efficient Off-Policy Reinforcement Learning |
Authors | Rémi Munos, Tom Stepleton, Anna Harutyunyan, Marc G. Bellemare |
Abstract | In this work, we take a fresh look at some old and new algorithms for off-policy, return-based reinforcement learning. Expressing these in a common form, we derive a novel algorithm, Retrace($\lambda$), with three desired properties: (1) it has low variance; (2) it safely uses samples collected from any behaviour policy, whatever its degree of “off-policyness”; and (3) it is efficient as it makes the best use of samples collected from near on-policy behaviour policies. We analyze the contractive nature of the related operator under both off-policy policy evaluation and control settings and derive online sample-based algorithms. We believe this is the first return-based off-policy control algorithm converging a.s. to $Q^*$ without the GLIE assumption (Greedy in the Limit with Infinite Exploration). As a corollary, we prove the convergence of Watkins’ Q($\lambda$), which was an open problem since 1989. We illustrate the benefits of Retrace($\lambda$) on a standard suite of Atari 2600 games. |
Tasks | Atari Games |
Published | 2016-06-08 |
URL | http://arxiv.org/abs/1606.02647v2 |
PDF | http://arxiv.org/pdf/1606.02647v2.pdf |
PWC | https://paperswithcode.com/paper/safe-and-efficient-off-policy-reinforcement |
Repo | |
Framework | |
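
The abstract above does not spell out how Retrace($\lambda$) obtains its three properties. In the full paper they follow from truncating the importance-sampling ratio at 1, giving per-step trace coefficients $c_s = \lambda \min(1, \pi(a_s \mid x_s) / \mu(a_s \mid x_s))$. The sketch below computes only those coefficients, under the assumption that the target and behaviour probabilities of the actions actually taken are already available; the function and parameter names are hypothetical, and the full return-based update is not shown.

```python
def retrace_coefficients(target_probs, behaviour_probs, lam=1.0):
    """Truncated importance weights: lam * min(1, pi / mu) at each step.

    target_probs[i] is pi(a_i | x_i) and behaviour_probs[i] is
    mu(a_i | x_i) for the action taken at step i (hypothetical inputs).
    """
    if len(target_probs) != len(behaviour_probs):
        raise ValueError("probability sequences must have equal length")

    coefficients = []
    for pi, mu in zip(target_probs, behaviour_probs):
        if mu <= 0.0:
            raise ValueError("behaviour probability must be positive")
        # Capping the ratio at 1 bounds each coefficient, which keeps the
        # product of coefficients along a trajectory from blowing up.
        coefficients.append(lam * min(1.0, pi / mu))
    return coefficients
```

When the behaviour policy is close to the target policy the ratios sit near 1 and the traces are barely cut, which matches the efficiency property in the abstract; when an action is much less likely under the target policy, its coefficient shrinks towards 0 and the trace is cut at that step.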
Morphological Priors for Probabilistic Neural Word Embeddings
Title | Morphological Priors for Probabilistic Neural Word Embeddings |
Authors | Parminder Bhatia, Robert Guthrie, Jacob Eisenstein |
Abstract | Word embeddings allow natural language processing systems to share statistical information across related words. These embeddings are typically based on distributional statistics, making it difficult for them to generalize to rare or unseen words. We propose to improve word embeddings by incorporating morphological information, capturing shared sub-word features. Unlike previous work that constructs word embeddings directly from morphemes, we combine morphological and distributional information in a unified probabilistic framework, in which the word embedding is a latent variable. The morphological information provides a prior distribution on the latent word embeddings, which in turn condition a likelihood function over an observed corpus. This approach yields improvements on intrinsic word similarity evaluations, and also in the downstream task of part-of-speech tagging. |
Tasks | Part-Of-Speech Tagging, Word Embeddings |
Published | 2016-08-03 |
URL | http://arxiv.org/abs/1608.01056v2 |
PDF | http://arxiv.org/pdf/1608.01056v2.pdf |
PWC | https://paperswithcode.com/paper/morphological-priors-for-probabilistic-neural |
Repo | |
Framework | |
Bayesian Neural Word Embedding
Title | Bayesian Neural Word Embedding |
Authors | Oren Barkan |
Abstract | Recently, several works in the domain of natural language processing presented successful methods for word embedding. Among them, the Skip-Gram with negative sampling, known also as word2vec, advanced the state-of-the-art of various linguistics tasks. In this paper, we propose a scalable Bayesian neural word embedding algorithm. The algorithm relies on a Variational Bayes solution for the Skip-Gram objective and a detailed step by step description is provided. We present experimental results that demonstrate the performance of the proposed algorithm for word analogy and similarity tasks on six different datasets and show it is competitive with the original Skip-Gram method. |
Tasks | |
Published | 2016-03-21 |
URL | http://arxiv.org/abs/1603.06571v3 |
PDF | http://arxiv.org/pdf/1603.06571v3.pdf |
PWC | https://paperswithcode.com/paper/bayesian-neural-word-embedding |
Repo | |
Framework | |
Computer-Aided Colorectal Tumor Classification in NBI Endoscopy Using CNN Features
Title | Computer-Aided Colorectal Tumor Classification in NBI Endoscopy Using CNN Features |
Authors | Toru Tamaki, Shoji Sonoyama, Tsubasa Hirakawa, Bisser Raytchev, Kazufumi Kaneda, Tetsushi Koide, Shigeto Yoshida, Hiroshi Mieno, Shinji Tanaka |
Abstract | In this paper we report results for recognizing colorectal NBI endoscopic images by using features extracted from a convolutional neural network (CNN). In this comparative study, we extract features from different layers of different CNN models, and then train linear SVM classifiers. Experimental results with 10-fold cross-validation show that features from the first few convolution layers are enough to achieve performance similar (i.e., a recognition rate of 95%) to that of non-CNN local features such as Bag-of-Visual words, Fisher vector, and VLAD. |
Tasks | |
Published | 2016-08-24 |
URL | http://arxiv.org/abs/1608.06709v1 |
PDF | http://arxiv.org/pdf/1608.06709v1.pdf |
PWC | https://paperswithcode.com/paper/computer-aided-colorectal-tumor |
Repo | |
Framework | |
CT-Mapper: Mapping Sparse Multimodal Cellular Trajectories using a Multilayer Transportation Network
Title | CT-Mapper: Mapping Sparse Multimodal Cellular Trajectories using a Multilayer Transportation Network |
Authors | Fereshteh Asgari, Alexis Sultan, Haoyi Xiong, Vincent Gauthier, Mounim El-Yacoubi |
Abstract | Mobile phone data have recently become an attractive source of information about mobility behavior. Since cell phone data can be captured in a passive way for a large user population, they can be harnessed to collect well-sampled mobility information. In this paper, we propose CT-Mapper, an unsupervised algorithm that enables the mapping of mobile phone traces over a multimodal transport network. One of the main strengths of CT-Mapper is its capability to map noisy sparse cellular multimodal trajectories over a multilayer transportation network where the layers have different physical properties and not only to map trajectories associated with a single layer. Such a network is modeled by a large multilayer graph in which the nodes correspond to metro/train stations or road intersections and edges correspond to connections between them. The mapping problem is modeled by an unsupervised HMM where the observations correspond to sparse user mobile trajectories and the hidden states to the multilayer graph nodes. The HMM is unsupervised as the transition and emission probabilities are inferred using respectively the physical transportation properties and the information on the spatial coverage of antenna base stations. To evaluate CT-Mapper we collected cellular traces with their corresponding GPS trajectories for a group of volunteer users in Paris and vicinity (France). We show that CT-Mapper is able to accurately retrieve the real cell phone user paths despite the sparsity of the observed trace trajectories. Furthermore our transition probability model is up to 20% more accurate than other naive models. |
Tasks | |
Published | 2016-04-22 |
URL | http://arxiv.org/abs/1604.06577v1 |
PDF | http://arxiv.org/pdf/1604.06577v1.pdf |
PWC | https://paperswithcode.com/paper/ct-mapper-mapping-sparse-multimodal-cellular |
Repo | |
Framework | |