July 27, 2019

Paper Group ANR 585

A Convenient Category for Higher-Order Probability Theory. Learning Word Embeddings from the Portuguese Twitter Stream: A Study of some Practical Aspects. A Deep Incremental Boltzmann Machine for Modeling Context in Robots. Bollywood Movie Corpus for Text, Images and Videos. Developing the Path Signature Methodology and its Application to Landmark- …

A Convenient Category for Higher-Order Probability Theory

Title A Convenient Category for Higher-Order Probability Theory
Authors Chris Heunen, Ohad Kammar, Sam Staton, Hongseok Yang
Abstract Higher-order probabilistic programming languages allow programmers to write sophisticated models in machine learning and statistics in a succinct and structured way, but step outside the standard measure-theoretic formalization of probability theory. Programs may use both higher-order functions and continuous distributions, or even define a probability distribution on functions. But standard probability theory does not handle higher-order functions well: the category of measurable spaces is not cartesian closed. Here we introduce quasi-Borel spaces. We show that these spaces: form a new formalization of probability theory replacing measurable spaces; form a cartesian closed category and so support higher-order functions; form a well-pointed category and so support good proof principles for equational reasoning; and support continuous probability distributions. We demonstrate the use of quasi-Borel spaces for higher-order functions and probability by showing that a well-known construction of probability theory involving random functions gains a cleaner expression, and by generalizing de Finetti’s theorem, a crucial result in probability theory, to quasi-Borel spaces.
Tasks Probabilistic Programming
Published 2017-01-10
URL http://arxiv.org/abs/1701.02547v3
PDF http://arxiv.org/pdf/1701.02547v3.pdf
PWC https://paperswithcode.com/paper/a-convenient-category-for-higher-order
Repo
Framework
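
To make the notion of a “probability distribution on functions” concrete, here is a minimal Python sketch of the idea the paper formalizes (an illustration of the concept only, not code from the paper): a random function is sampled and then consumed by a higher-order expectation functional.

```python
import random

def random_linear_function():
    """Draw a function f(x) = a*x with a ~ Uniform(0, 1): a random function."""
    a = random.uniform(0.0, 1.0)
    return lambda x: a * x

def expected_value_at(point, n_samples=10_000):
    """Higher-order usage: an expectation functional over random functions."""
    return sum(random_linear_function()(point) for _ in range(n_samples)) / n_samples

print(expected_value_at(2.0))  # ~ 1.0, since E[a] = 0.5
```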

Learning Word Embeddings from the Portuguese Twitter Stream: A Study of some Practical Aspects

Title Learning Word Embeddings from the Portuguese Twitter Stream: A Study of some Practical Aspects
Authors Pedro Saleiro, Luís Sarmento, Eduarda Mendes Rodrigues, Carlos Soares, Eugénio Oliveira
Abstract This paper describes a preliminary study for producing and distributing a large-scale database of embeddings from the Portuguese Twitter stream. We start by experimenting with a relatively small sample, focusing on three challenges: volume of training data, vocabulary size, and intrinsic evaluation metrics. Using a single GPU, we were able to scale up from a vocabulary of 2048 embedded words and 500K training examples to 32768 words over 10M training examples while keeping a stable validation loss and an approximately linear trend in training time per epoch. We also observed that using less than 50% of the available training examples for each vocabulary size might result in overfitting. Results on intrinsic evaluation show promising performance for a vocabulary size of 32768 words. Nevertheless, intrinsic evaluation metrics suffer from over-sensitivity to their corresponding cosine similarity thresholds, indicating that a wider range of metrics needs to be developed to track progress.
Tasks Learning Word Embeddings, Word Embeddings
Published 2017-09-04
URL http://arxiv.org/abs/1709.00947v1
PDF http://arxiv.org/pdf/1709.00947v1.pdf
PWC https://paperswithcode.com/paper/learning-word-embeddings-from-the-portuguese
Repo
Framework
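
A minimal sketch of the kind of pipeline the paper studies, using gensim as a stand-in toolkit (the paper trained its own models on a GPU; the file name and hyperparameters below are illustrative assumptions):

```python
from gensim.models import Word2Vec

# Assume a file of pre-tokenized tweets, one tweet per line (hypothetical path).
with open("tweets_pt.txt", encoding="utf-8") as f:
    tweets = [line.lower().split() for line in f]

model = Word2Vec(
    sentences=tweets,
    vector_size=300,   # embedding dimensionality
    window=5,
    min_count=5,       # controls the effective vocabulary size
    workers=4,
)

# The paper's intrinsic evaluation relies on cosine similarity;
# most_similar ranks neighbours by exactly that metric.
print(model.wv.most_similar("bom", topn=5))
```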

A Deep Incremental Boltzmann Machine for Modeling Context in Robots

Title A Deep Incremental Boltzmann Machine for Modeling Context in Robots
Authors Fethiye Irmak Doğan, Hande Çelikkanat, Sinan Kalkan
Abstract Modeling context is an essential capability for robots that are to be as adaptive as possible in challenging environments. Although there are many context modeling efforts, they assume a fixed structure and number of contexts. In this paper, we propose an incremental deep model that extends Restricted Boltzmann Machines. Our model receives one scene at a time and gradually extends the contextual model when necessary, either by adding a new context or by adding a new context layer to form a hierarchy. We show on a scene classification benchmark that our method converges to a good estimate of the contexts of the scenes, and performs better than or on par with other incremental and non-incremental models on several tasks.
Tasks Scene Classification
Published 2017-10-13
URL http://arxiv.org/abs/1710.04975v3
PDF http://arxiv.org/pdf/1710.04975v3.pdf
PWC https://paperswithcode.com/paper/a-deep-incremental-boltzmann-machine-for
Repo
Framework
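
The growth rule can be sketched in a few lines of numpy. This toy model (not the authors' code; the threshold and learning rate are assumptions) grows its hidden layer whenever a new scene is reconstructed poorly, echoing the paper's idea of adding a new context unit on demand:

```python
import numpy as np

rng = np.random.default_rng(0)

class GrowingRBM:
    def __init__(self, n_visible, n_hidden=1, threshold=0.25):
        self.W = rng.normal(0, 0.1, (n_visible, n_hidden))
        self.threshold = threshold

    def _sigmoid(self, x):
        return 1.0 / (1.0 + np.exp(-x))

    def reconstruct(self, v):
        h = self._sigmoid(v @ self.W)
        return self._sigmoid(h @ self.W.T)

    def observe(self, v, lr=0.1):
        err = np.mean((v - self.reconstruct(v)) ** 2)
        if err > self.threshold:  # poorly explained scene: add a context unit
            new_col = rng.normal(0, 0.1, (self.W.shape[0], 1))
            self.W = np.hstack([self.W, new_col])
        # one crude CD-1-style weight update
        h = self._sigmoid(v @ self.W)
        v_rec = self._sigmoid(h @ self.W.T)
        h_rec = self._sigmoid(v_rec @ self.W)
        self.W += lr * (np.outer(v, h) - np.outer(v_rec, h_rec))

rbm = GrowingRBM(n_visible=8)
for _ in range(20):
    rbm.observe(rng.integers(0, 2, 8).astype(float))
print("hidden units grown to:", rbm.W.shape[1])
```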

Bollywood Movie Corpus for Text, Images and Videos

Title Bollywood Movie Corpus for Text, Images and Videos
Authors Nishtha Madaan, Sameep Mehta, Mayank Saxena, Aditi Aggarwal, Taneea S Agrawaal, Vrinda Malhotra
Abstract In the past few years, several datasets have been released for text and images. We present an approach to creating a dataset for use in detecting and removing gender bias from text, and include a set of challenges we faced while creating this corpus. In this work, we use movie data from Wikipedia plots and movie trailers from YouTube. Our Bollywood Movie corpus contains 4000 movies extracted from Wikipedia and 880 trailers extracted from YouTube, released between 1970 and 2017. The corpus contains CSV files with the following data about each movie: Wikipedia title of the movie, cast, plot text, co-referenced plot text, soundtrack information, link to the movie poster, caption of the movie poster, number of males in the poster, and number of females in the poster. In addition, the following data is available for each cast member: cast name, cast gender, cast verbs, cast adjectives, cast relations, cast centrality, and cast mentions. We present some preliminary results on the task of bias removal which suggest that the dataset is quite useful for performing such tasks.
Tasks
Published 2017-10-11
URL http://arxiv.org/abs/1710.04142v1
PDF http://arxiv.org/pdf/1710.04142v1.pdf
PWC https://paperswithcode.com/paper/bollywood-movie-corpus-for-text-images-and
Repo
Framework
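
Given the CSV schema described in the abstract, a simple bias statistic such as the female share of poster appearances is a few lines of pandas. The file name and exact column labels below are assumptions for illustration, not the released schema:

```python
import pandas as pd

# Hypothetical file and column names, inferred from the fields in the abstract.
movies = pd.read_csv("bollywood_movies.csv")

# One simple bias statistic: female share of people shown on each poster.
movies["poster_female_share"] = movies["females_in_poster"] / (
    movies["males_in_poster"] + movies["females_in_poster"]
)
print(movies.groupby("year")["poster_female_share"].mean())
```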

Developing the Path Signature Methodology and its Application to Landmark-based Human Action Recognition

Title Developing the Path Signature Methodology and its Application to Landmark-based Human Action Recognition
Authors Weixin Yang, Terry Lyons, Hao Ni, Cordelia Schmid, Lianwen Jin
Abstract Landmark-based human action recognition in videos is a challenging task in computer vision. One key step is to design a generic approach that generates discriminative features for the spatial structure and temporal dynamics. To this end, we regard the evolving landmark data as a high-dimensional path and apply non-linear path signature techniques to provide an expressive, robust, non-linear, and interpretable representation for the sequential events. We do not extract signature features from the raw path; rather, we propose path disintegrations and path transformations as preprocessing steps. Path disintegrations turn a high-dimensional path linearly into a collection of lower-dimensional paths; some of these paths are in pose space while others are defined over a multiscale collection of temporal intervals. Path transformations decorate the paths with additional coordinates in standard ways to allow the truncated signatures of transformed paths to expose additional features. For spatial representation, we apply the signature transform to vectorize the paths that arise out of pose disintegration, and for temporal representation, we apply it again to describe this evolving vectorization. Finally, all the features are collected together to constitute the input vector of a linear single-hidden-layer fully-connected network for classification. Experimental results on four datasets demonstrate that the proposed feature set, with only a shallow linear network and DropConnect, is effective, achieves results comparable to advanced state-of-the-art deep networks, and remains interpretable.
Tasks Action Classification, Action Recognition In Videos, Temporal Action Localization
Published 2017-07-13
URL https://arxiv.org/abs/1707.03993v2
PDF https://arxiv.org/pdf/1707.03993v2.pdf
PWC https://paperswithcode.com/paper/leveraging-the-path-signature-for-skeleton
Repo
Framework
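
For readers unfamiliar with signatures, the numpy sketch below computes the depth-2 signature of a landmark path (exact for piecewise-linear paths). Libraries such as iisignature handle higher truncation levels; this is purely illustrative:

```python
import numpy as np

def signature_level2(path):
    """Depth-2 path signature of a piecewise-linear path: (T, d) array."""
    increments = np.diff(path, axis=0)          # (T-1, d)
    s1 = increments.sum(axis=0)                 # level 1: total displacement
    # level-2 iterated integrals S^(i,j) = integral of X^i dX^j;
    # cumulative[k] is the path value (relative to the start) before step k
    cumulative = np.vstack([np.zeros(path.shape[1]),
                            np.cumsum(increments, axis=0)[:-1]])
    s2 = cumulative.T @ increments + 0.5 * increments.T @ increments
    return np.concatenate([s1, s2.ravel()])

path = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 0.0]])
print(signature_level2(path))   # d + d^2 = 6 features for a 2-D path
```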

Multi-focus Attention Network for Efficient Deep Reinforcement Learning

Title Multi-focus Attention Network for Efficient Deep Reinforcement Learning
Authors Jinyoung Choi, Beom-Jin Lee, Byoung-Tak Zhang
Abstract Deep reinforcement learning (DRL) has shown incredible performance in learning various tasks to the human level. However, unlike human perception, current DRL models connect the entire low-level sensory input to the state-action values rather than exploiting the relationships between and among the entities that constitute the sensory input. Because of this difference, DRL needs a vast amount of experience samples to learn. In this paper, we propose a Multi-focus Attention Network (MANet) which mimics the human ability to spatially abstract the low-level sensory input into multiple entities and attend to them simultaneously. The proposed method first divides the low-level input into several segments which we refer to as partial states. After this segmentation, parallel attention layers attend to the partial states relevant to solving the task. Our model estimates state-action values using these attended partial states. In our experiments, MANet attains the highest scores with significantly fewer experience samples and shows higher performance than the Deep Q-network and a single-attention baseline. Furthermore, we extend our model to an attentive communication model for performing multi-agent cooperative tasks. In multi-agent cooperative task experiments, our model shows 20% faster learning than the existing state-of-the-art model.
Tasks
Published 2017-12-13
URL http://arxiv.org/abs/1712.04603v1
PDF http://arxiv.org/pdf/1712.04603v1.pdf
PWC https://paperswithcode.com/paper/multi-focus-attention-network-for-efficient
Repo
Framework
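
The segment-then-attend structure can be sketched in PyTorch. Dimensions, head count, and action count below are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class MultiFocusAttention(nn.Module):
    def __init__(self, n_segments=8, seg_dim=32, n_heads=4, n_actions=6):
        super().__init__()
        # each head learns its own relevance scores over the partial states
        self.heads = nn.ModuleList([nn.Linear(seg_dim, 1) for _ in range(n_heads)])
        self.q = nn.Linear(n_heads * seg_dim, n_actions)

    def forward(self, segments):                    # (batch, n_segments, seg_dim)
        focused = []
        for head in self.heads:
            w = torch.softmax(head(segments), dim=1)    # attention over segments
            focused.append((w * segments).sum(dim=1))   # (batch, seg_dim)
        return self.q(torch.cat(focused, dim=-1))       # state-action values

net = MultiFocusAttention()
print(net(torch.randn(2, 8, 32)).shape)   # torch.Size([2, 6])
```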

Source localization in an ocean waveguide using supervised machine learning

Title Source localization in an ocean waveguide using supervised machine learning
Authors Haiqiang Niu, Emma Reeves, Peter Gerstoft
Abstract Source localization in ocean acoustics is posed as a machine learning problem in which data-driven methods learn source ranges directly from observed acoustic data. The pressure received by a vertical linear array is preprocessed by constructing a normalized sample covariance matrix (SCM) and used as the input. Three machine learning methods (feed-forward neural networks (FNN), support vector machines (SVM), and random forests (RF)) are investigated in this paper, with a focus on the FNN. The range estimation problem is solved both as a classification problem and as a regression problem by these three machine learning algorithms. The results of range estimation for the Noise09 experiment are compared for FNN, SVM, RF, and conventional matched-field processing, and demonstrate the potential of machine learning for underwater source localization.
Tasks
Published 2017-01-29
URL http://arxiv.org/abs/1701.08431v4
PDF http://arxiv.org/pdf/1701.08431v4.pdf
PWC https://paperswithcode.com/paper/source-localization-in-an-ocean-waveguide
Repo
Framework
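
The SCM preprocessing is straightforward to sketch in numpy; the snapshot count and array size below are illustrative:

```python
import numpy as np

def scm_features(snapshots):
    """snapshots: (L, N) complex pressure snapshots, one row per snapshot."""
    # normalize each snapshot to remove source-amplitude information
    p = snapshots / np.linalg.norm(snapshots, axis=1, keepdims=True)
    C = p.T @ p.conj() / len(p)           # (N, N) normalized SCM
    iu = np.triu_indices(C.shape[0])      # Hermitian: keep the upper triangle
    return np.concatenate([C[iu].real, C[iu].imag])  # flat input vector

rng = np.random.default_rng(1)
snaps = rng.normal(size=(10, 16)) + 1j * rng.normal(size=(10, 16))
print(scm_features(snaps).shape)          # (272,) = 2 * 16*17/2
```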

Structural Analysis of Hindi Phonetics and A Method for Extraction of Phonetically Rich Sentences from a Very Large Hindi Text Corpus

Title Structural Analysis of Hindi Phonetics and A Method for Extraction of Phonetically Rich Sentences from a Very Large Hindi Text Corpus
Authors Shrikant Malviya, Rohit Mishra, Uma Shanker Tiwary
Abstract Automatic speech recognition (ASR) and text-to-speech (TTS) are two prominent areas of research in human-computer interaction (HCI) today. A set of phonetically rich sentences is important for developing these two interactive modules of HCI. Essentially, the set of phonetically rich sentences has to cover all possible phone units, distributed uniformly. Selecting such a set from a big corpus while maintaining similarity of phonetic characteristics is still a challenging problem. The major objective of this paper is to devise a criterion for selecting a set of sentences encompassing all phonetic aspects of a corpus with as small a size as possible. First, this paper presents a statistical analysis of Hindi phonetics by observing its structural characteristics. Then, a two-stage algorithm is proposed to extract phonetically rich sentences with a high variety of triphones from the EMILLE Hindi corpus. The algorithm uses a distance-based criterion to select each sentence so as to improve the triphone distribution. Moreover, a special preprocessing method is proposed that scores each triphone by its inverse probability in order to speed up the algorithm. The results show that the approach efficiently builds a uniformly distributed, phonetically rich corpus with an optimal number of sentences.
Tasks Speech Recognition
Published 2017-01-30
URL http://arxiv.org/abs/1701.08655v2
PDF http://arxiv.org/pdf/1701.08655v2.pdf
PWC https://paperswithcode.com/paper/structural-analysis-of-hindi-phonetics-and-a
Repo
Framework
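
The selection idea, with inverse-probability triphone scoring, reduces to a greedy loop. This sketch stubs out phonetization and illustrates the criterion only; it is not the authors' two-stage implementation:

```python
from collections import Counter

def triphones(sentence):
    phones = sentence.split()               # assume pre-phonetized input
    return list(zip(phones, phones[1:], phones[2:]))

def select_rich_sentences(corpus, k):
    freq = Counter(t for s in corpus for t in triphones(s))
    covered, chosen = set(), []
    pool = list(corpus)
    for _ in range(min(k, len(pool))):
        # inverse-probability scoring: rare triphones contribute more
        best = max(pool, key=lambda s: sum(1.0 / freq[t]
                                           for t in set(triphones(s)) - covered))
        chosen.append(best)
        covered |= set(triphones(best))
        pool.remove(best)
    return chosen

corpus = ["k a m a l", "a m a r h ai", "s a r a l t a", "k a r a n a"]
print(select_rich_sentences(corpus, 2))
```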

Bayesian Models of Data Streams with Hierarchical Power Priors

Title Bayesian Models of Data Streams with Hierarchical Power Priors
Authors Andres Masegosa, Thomas D. Nielsen, Helge Langseth, Dario Ramos-Lopez, Antonio Salmeron, Anders L. Madsen
Abstract Making inferences from data streams is a pervasive problem in many modern data analysis applications. But it requires continuously updating the model and adapting to changes or drifts in the underlying data-generating distribution. In this paper, we approach these problems from a Bayesian perspective covering general conjugate exponential models. Our proposal makes use of non-conjugate hierarchical priors to explicitly model temporal changes of the model parameters. We also derive a novel variational inference scheme which overcomes the use of non-conjugate priors while maintaining the computational efficiency of variational methods over conjugate models. The approach is validated on three real data sets over three latent variable models.
Tasks Latent Variable Models
Published 2017-07-07
URL http://arxiv.org/abs/1707.02293v1
PDF http://arxiv.org/pdf/1707.02293v1.pdf
PWC https://paperswithcode.com/paper/bayesian-models-of-data-streams-with
Repo
Framework
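
The flavor of a power prior on a stream can be shown with the conjugate Beta-Bernoulli case: before each batch is absorbed, the previous posterior is flattened by an exponent rho in (0, 1], which lets the model track drift. The paper's hierarchical treatment learns rho; in this minimal sketch it is fixed:

```python
def power_prior_update(alpha, beta, batch, rho=0.9):
    """One streaming update of a Beta(alpha, beta) posterior on 0/1 data."""
    # flatten the old posterior toward the uniform Beta(1, 1)
    alpha = rho * (alpha - 1) + 1
    beta = rho * (beta - 1) + 1
    # conjugate update with the new batch of 0/1 observations
    return alpha + sum(batch), beta + len(batch) - sum(batch)

alpha, beta = 1.0, 1.0
for batch in [[1, 1, 1, 0], [1, 1, 0, 1], [0, 0, 0, 1]]:   # drifting stream
    alpha, beta = power_prior_update(alpha, beta, batch)
    print("posterior mean:", alpha / (alpha + beta))
```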

CT sinogram-consistency learning for metal-induced beam hardening correction

Title CT sinogram-consistency learning for metal-induced beam hardening correction
Authors Hyung Suk Park, Sung Min Lee, Hwa Pyung Kim, Jin Keun Seo
Abstract This paper proposes a sinogram-consistency learning method to deal with beam-hardening-related artifacts in polychromatic computerized tomography (CT). The presence of highly attenuating materials in the scan field causes an inconsistent sinogram that does not match the range space of the Radon transform. When the mismatched data are entered into the range space during CT reconstruction, streaking and shading artifacts are generated owing to the inherent nature of the inverse Radon transform. The proposed learning method aims to repair inconsistent sinograms by removing the primary metal-induced beam-hardening factors along the metal trace in the sinogram. Taking into account the fundamental difficulty of obtaining sufficient training data in a medical environment, the learning method is designed to use simulated training data, and a patient-type-specific learning model is used to simplify the learning process. The feasibility of the proposed method is investigated using a dataset consisting of real CT scans of pelvises containing hip prostheses. The anatomical areas in the training and test data are different, in order to demonstrate that the proposed method selectively extracts the beam-hardening features. The results show that our method successfully corrects sinogram inconsistency by extracting beam-hardening sources by means of deep learning. This paper is the first to propose a deep learning method of sinogram correction for beam-hardening reduction in CT. Conventional methods for beam-hardening reduction are based on regularizations and have the fundamental drawback of not easily being able to use manifold CT images, while a deep learning approach has the potential to do so.
Tasks
Published 2017-08-02
URL http://arxiv.org/abs/1708.00607v2
PDF http://arxiv.org/pdf/1708.00607v2.pdf
PWC https://paperswithcode.com/paper/ct-sinogram-consistency-learning-for-metal
Repo
Framework
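
A stand-in sketch of the learning setup (not the authors' network): a small CNN predicts a corrected sinogram, with the loss restricted to the metal trace where the beam-hardening inconsistency lives. Shapes, depth, and the simulated data are all assumptions:

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)

sino = torch.randn(4, 1, 180, 256)                   # (batch, 1, angles, detectors)
trace = (torch.rand(4, 1, 180, 256) > 0.9).float()   # simulated metal trace mask
target = torch.randn(4, 1, 180, 256)                 # simulated consistent sinogram

pred = net(sino)
loss = ((pred - target) ** 2 * trace).sum() / trace.sum()  # MSE on the trace only
loss.backward()
```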

The Marginal Value of Adaptive Gradient Methods in Machine Learning

Title The Marginal Value of Adaptive Gradient Methods in Machine Learning
Authors Ashia C. Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht
Abstract Adaptive optimization methods, which perform local optimization with a metric constructed from the history of iterates, are becoming increasingly popular for training deep neural networks. Examples include AdaGrad, RMSProp, and Adam. We show that for simple overparameterized problems, adaptive methods often find drastically different solutions than gradient descent (GD) or stochastic gradient descent (SGD). We construct an illustrative binary classification problem where the data is linearly separable, GD and SGD achieve zero test error, and AdaGrad, Adam, and RMSProp attain test errors arbitrarily close to half. We additionally study the empirical generalization capability of adaptive methods on several state-of-the-art deep learning models. We observe that the solutions found by adaptive methods generalize worse (often significantly worse) than SGD, even when these solutions have better training performance. These results suggest that practitioners should reconsider the use of adaptive methods to train neural networks.
Tasks
Published 2017-05-23
URL http://arxiv.org/abs/1705.08292v2
PDF http://arxiv.org/pdf/1705.08292v2.pdf
PWC https://paperswithcode.com/paper/the-marginal-value-of-adaptive-gradient
Repo
Framework
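
The paper's phenomenon is easy to probe on a toy problem. The sketch below (an illustration in the spirit of the paper's construction, not its exact counterexample) trains logistic regression on linearly separable data with SGD versus AdaGrad and compares how much weight mass lands off the single informative feature:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
w_true = np.zeros(50)
w_true[0] = 1.0                                   # only feature 0 matters
y = np.sign(X @ w_true)

def train(adaptive, steps=2000, lr=0.1, eps=1e-8):
    w, g2 = np.zeros(50), np.zeros(50)
    for _ in range(steps):
        i = rng.integers(len(X))
        margin = y[i] * (X[i] @ w)
        g = -y[i] * X[i] / (1 + np.exp(margin))   # logistic loss gradient
        g2 += g * g
        if adaptive:
            w -= lr * g / (np.sqrt(g2) + eps)     # AdaGrad step
        else:
            w -= lr * g                           # plain SGD step
    return w

for name, adaptive in [("sgd", False), ("adagrad", True)]:
    w = train(adaptive)
    print(name, "weight mass off the true feature:",
          np.abs(w[1:]).sum() / np.abs(w).sum())
```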

The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime

Title The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime
Authors Max Simchowitz, Kevin Jamieson, Benjamin Recht
Abstract We propose a novel technique for analyzing adaptive sampling called the Simulator. Our approach differs from existing methods by considering not how much information could be gathered by any fixed sampling strategy, but how difficult it is to distinguish a good sampling strategy from a bad one given the limited amount of data collected up to any given time. This change of perspective allows us to match the strength of both Fano and change-of-measure techniques, without succumbing to the limitations of either method. For concreteness, we apply our techniques to a structured multi-arm bandit problem in the fixed-confidence pure exploration setting, where we show that the constraints on the means imply a substantial gap between the moderate-confidence sample complexity and the asymptotic sample complexity as $\delta \to 0$ found in the literature. We also prove the first instance-based lower bounds for the top-k problem which incorporate the appropriate log-factors. Moreover, our lower bounds zero in on the number of times each individual arm needs to be pulled, uncovering new phenomena which are drowned out in the aggregate sample complexity. Our new analysis inspires a simple and near-optimal algorithm for best-arm and top-k identification, the first practical algorithm of its kind for the latter problem which removes extraneous log factors, and outperforms the state of the art in experiments.
Tasks
Published 2017-02-16
URL http://arxiv.org/abs/1702.05186v1
PDF http://arxiv.org/pdf/1702.05186v1.pdf
PWC https://paperswithcode.com/paper/the-simulator-understanding-adaptive-sampling
Repo
Framework
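
As context for the per-arm pull counts the analysis zeroes in on, here is a generic successive-elimination sketch for fixed-confidence best-arm identification (a textbook baseline, not the paper's algorithm; the arm means are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
means = np.array([0.9, 0.6, 0.5, 0.2])   # unknown to the algorithm

def successive_elimination(delta=0.05):
    active = list(range(len(means)))
    sums, pulls, t = np.zeros(len(means)), np.zeros(len(means)), 0
    while len(active) > 1:
        t += 1
        for a in active:                  # pull every surviving arm once
            sums[a] += rng.normal(means[a], 1.0)
            pulls[a] += 1
        mu = sums[active] / t
        radius = np.sqrt(2 * np.log(4 * len(means) * t**2 / delta) / t)
        active = [a for a in active
                  if sums[a] / t >= mu.max() - 2 * radius]
    return active[0], pulls

best, pulls = successive_elimination()
print("best arm:", best, "pulls per arm:", pulls)
```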

Synkhronos: a Multi-GPU Theano Extension for Data Parallelism

Title Synkhronos: a Multi-GPU Theano Extension for Data Parallelism
Authors Adam Stooke, Pieter Abbeel
Abstract We present Synkhronos, an extension to Theano for multi-GPU computations leveraging data parallelism. Our framework provides automated execution and synchronization across devices, allowing users to continue to write serial programs without risk of race conditions. The NVIDIA Collective Communication Library is used for high-bandwidth inter-GPU communication. Further enhancements to the Theano function interface include input slicing (with aggregation) and input indexing, which perform common data-parallel computation patterns efficiently. One example use case is synchronous SGD, which has recently been shown to scale well for a growing set of deep learning problems. When training ResNet-50, we achieve a near-linear speedup of 7.5x on an NVIDIA DGX-1 using 8 GPUs, relative to Theano-only code running a single GPU in isolation. Yet Synkhronos remains general to any data-parallel computation programmable in Theano. By implementing parallelism at the level of individual Theano functions, our framework uniquely addresses a niche between manual multi-device programming and prescribed multi-GPU training routines.
Tasks
Published 2017-10-11
URL http://arxiv.org/abs/1710.04162v1
PDF http://arxiv.org/pdf/1710.04162v1.pdf
PWC https://paperswithcode.com/paper/synkhronos-a-multi-gpu-theano-extension-for
Repo
Framework
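
Conceptually, the data parallelism Synkhronos automates looks like the following numpy sketch, where plain Python "workers" stand in for GPUs and an averaged gradient stands in for the NCCL all-reduce (this illustrates the pattern, not the library's API):

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(512, 10)), rng.normal(size=512)
w = np.zeros(10)

def grad(w, X_slice, y_slice):
    """Least-squares gradient on one worker's slice of the batch."""
    return 2 * X_slice.T @ (X_slice @ w - y_slice) / len(y_slice)

n_workers = 8
for step in range(100):
    slices = np.array_split(np.arange(len(X)), n_workers)    # input slicing
    grads = [grad(w, X[s], y[s]) for s in slices]            # one per "GPU"
    w -= 0.05 * np.mean(grads, axis=0)                       # all-reduce mean

print("loss:", np.mean((X @ w - y) ** 2))
```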

Adversarial Evaluation of Dialogue Models

Title Adversarial Evaluation of Dialogue Models
Authors Anjuli Kannan, Oriol Vinyals
Abstract The recent application of RNN encoder-decoder models has resulted in substantial progress in fully data-driven dialogue systems, but evaluation remains a challenge. An adversarial loss could be a way to directly evaluate the extent to which generated dialogue responses sound like they came from a human. This could reduce the need for human evaluation, while more directly evaluating on a generative task. In this work, we investigate this idea by training an RNN to discriminate a dialogue model’s samples from human-generated samples. Although we find some evidence this setup could be viable, we also note that many issues remain in its practical application. We discuss both aspects and conclude that future work is warranted.
Tasks
Published 2017-01-27
URL http://arxiv.org/abs/1701.08198v1
PDF http://arxiv.org/pdf/1701.08198v1.pdf
PWC https://paperswithcode.com/paper/adversarial-evaluation-of-dialogue-models
Repo
Framework
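
The setup reduces to training a binary classifier on response sequences. A PyTorch sketch (dimensions and the random stand-in data are placeholders): a GRU reads a response and predicts whether it is human-written (1) or model-generated (0):

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, 1)

    def forward(self, tokens):                 # (batch, seq_len) int64
        _, h = self.rnn(self.emb(tokens))
        return self.out(h[-1]).squeeze(-1)     # logit: human vs. model

disc = Discriminator()
human = torch.randint(0, 1000, (8, 20))        # stand-ins for real responses
generated = torch.randint(0, 1000, (8, 20))
logits = disc(torch.cat([human, generated]))
labels = torch.cat([torch.ones(8), torch.zeros(8)])
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
loss.backward()
```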

Conditional Accelerated Lazy Stochastic Gradient Descent

Title Conditional Accelerated Lazy Stochastic Gradient Descent
Authors Guanghui Lan, Sebastian Pokutta, Yi Zhou, Daniel Zink
Abstract In this work we introduce a conditional accelerated lazy stochastic gradient descent algorithm with an optimal number of calls to a stochastic first-order oracle and convergence rate $O\left(\frac{1}{\varepsilon^2}\right)$, improving over the projection-free, Online-Frank-Wolfe-based stochastic gradient descent of Hazan and Kale [2012], which has convergence rate $O\left(\frac{1}{\varepsilon^4}\right)$.
Tasks
Published 2017-03-16
URL http://arxiv.org/abs/1703.05840v5
PDF http://arxiv.org/pdf/1703.05840v5.pdf
PWC https://paperswithcode.com/paper/conditional-accelerated-lazy-stochastic
Repo
Framework
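
For background, the projection-free template the paper accelerates is stochastic Frank-Wolfe, where a linear minimization oracle (LMO) replaces projection. The sketch below runs the classic baseline over the probability simplex on a toy least-squares problem; it is not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
A, b = rng.normal(size=(100, 20)), rng.normal(size=100)
x = np.ones(20) / 20                         # start inside the simplex

for t in range(1, 500):
    i = rng.integers(100)                    # stochastic first-order oracle
    g = 2 * (A[i] @ x - b[i]) * A[i]         # gradient of one sample's loss
    v = np.zeros(20)
    v[np.argmin(g)] = 1.0                    # LMO: best vertex of the simplex
    x += 2.0 / (t + 2) * (v - x)             # classic step size, stays feasible

print("objective:", np.mean((A @ x - b) ** 2))
```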