October 19, 2019

3008 words 15 mins read

Paper Group ANR 213

Paper Group ANR 213

Improving Image Captioning with Conditional Generative Adversarial Nets. Cross-modal Embeddings for Video and Audio Retrieval. Quantifying the Burden of Exploration and the Unfairness of Free Riding. Configurable Markov Decision Processes. Speaker Selective Beamformer with Keyword Mask Estimation. Optimizing deep video representation to match brain …

Improving Image Captioning with Conditional Generative Adversarial Nets

Title Improving Image Captioning with Conditional Generative Adversarial Nets
Authors Chen Chen, Shuai Mu, Wanpeng Xiao, Zexiong Ye, Liesi Wu, Qi Ju
Abstract In this paper, we propose a novel conditional-generative-adversarial-nets-based image captioning framework as an extension of traditional reinforcement-learning (RL)-based encoder-decoder architecture. To deal with the inconsistent evaluation problem among different objective language metrics, we are motivated to design some “discriminator” networks to automatically and progressively determine whether generated caption is human described or machine generated. Two kinds of discriminator architectures (CNN and RNN-based structures) are introduced since each has its own advantages. The proposed algorithm is generic so that it can enhance any existing RL-based image captioning framework and we show that the conventional RL training method is just a special case of our approach. Empirically, we show consistent improvements over all language evaluation metrics for different state-of-the-art image captioning models. In addition, the well-trained discriminators can also be viewed as objective image captioning evaluators
Tasks Image Captioning
Published 2018-05-18
URL http://arxiv.org/abs/1805.07112v4
PDF http://arxiv.org/pdf/1805.07112v4.pdf
PWC https://paperswithcode.com/paper/improving-image-captioning-with-conditional
Repo
Framework

Cross-modal Embeddings for Video and Audio Retrieval

Title Cross-modal Embeddings for Video and Audio Retrieval
Authors Didac Surís, Amanda Duarte, Amaia Salvador, Jordi Torres, Xavier Giró-i-Nieto
Abstract The increasing amount of online videos brings several opportunities for training self-supervised neural networks. The creation of large scale datasets of videos such as the YouTube-8M allows us to deal with this large amount of data in manageable way. In this work, we find new ways of exploiting this dataset by taking advantage of the multi-modal information it provides. By means of a neural network, we are able to create links between audio and visual documents, by projecting them into a common region of the feature space, obtaining joint audio-visual embeddings. These links are used to retrieve audio samples that fit well to a given silent video, and also to retrieve images that match a given a query audio. The results in terms of Recall@K obtained over a subset of YouTube-8M videos show the potential of this unsupervised approach for cross-modal feature learning. We train embeddings for both scales and assess their quality in a retrieval problem, formulated as using the feature extracted from one modality to retrieve the most similar videos based on the features computed in the other modality.
Tasks
Published 2018-01-07
URL http://arxiv.org/abs/1801.02200v1
PDF http://arxiv.org/pdf/1801.02200v1.pdf
PWC https://paperswithcode.com/paper/cross-modal-embeddings-for-video-and-audio
Repo
Framework

Quantifying the Burden of Exploration and the Unfairness of Free Riding

Title Quantifying the Burden of Exploration and the Unfairness of Free Riding
Authors Christopher Jung, Sampath Kannan, Neil Lutz
Abstract We consider the multi-armed bandit setting with a twist. Rather than having just one decision maker deciding which arm to pull in each round, we have $n$ different decision makers (agents). In the simple stochastic setting, we show that a “free-riding” agent observing another “self-reliant” agent can achieve just $O(1)$ regret, as opposed to the regret lower bound of $\Omega (\log t)$ when one decision maker is playing in isolation. This result holds whenever the self-reliant agent’s strategy satisfies either one of two assumptions: (1) each arm is pulled at least $\gamma \ln t$ times in expectation for a constant $\gamma$ that we compute, or (2) the self-reliant agent achieves $o(t)$ realized regret with high probability. Both of these assumptions are satisfied by standard zero-regret algorithms. Under the second assumption, we further show that the free rider only needs to observe the number of times each arm is pulled by the self-reliant agent, and not the rewards realized. In the linear contextual setting, each arm has a distribution over parameter vectors, each agent has a context vector, and the reward realized when an agent pulls an arm is the inner product of that agent’s context vector with a parameter vector sampled from the pulled arm’s distribution. We show that the free rider can achieve $O(1)$ regret in this setting whenever the free rider’s context is a small (in $L_2$-norm) linear combination of other agents’ contexts and all other agents pull each arm $\Omega (\log t)$ times with high probability. Again, this condition on the self-reliant players is satisfied by standard zero-regret algorithms like UCB. We also prove a number of lower bounds.
Tasks
Published 2018-10-20
URL https://arxiv.org/abs/1810.08743v3
PDF https://arxiv.org/pdf/1810.08743v3.pdf
PWC https://paperswithcode.com/paper/quantifying-the-burden-of-exploration-and-the
Repo
Framework

Configurable Markov Decision Processes

Title Configurable Markov Decision Processes
Authors Alberto Maria Metelli, Mirco Mutti, Marcello Restelli
Abstract In many real-world problems, there is the possibility to configure, to a limited extent, some environmental parameters to improve the performance of a learning agent. In this paper, we propose a novel framework, Configurable Markov Decision Processes (Conf-MDPs), to model this new type of interaction with the environment. Furthermore, we provide a new learning algorithm, Safe Policy-Model Iteration (SPMI), to jointly and adaptively optimize the policy and the environment configuration. After having introduced our approach and derived some theoretical results, we present the experimental evaluation in two explicative problems to show the benefits of the environment configurability on the performance of the learned policy.
Tasks
Published 2018-06-14
URL http://arxiv.org/abs/1806.05415v1
PDF http://arxiv.org/pdf/1806.05415v1.pdf
PWC https://paperswithcode.com/paper/configurable-markov-decision-processes
Repo
Framework

Speaker Selective Beamformer with Keyword Mask Estimation

Title Speaker Selective Beamformer with Keyword Mask Estimation
Authors Yusuke Kida, Dung Tran, Motoi Omachi, Toru Taniguchi, Yuya Fujita
Abstract This paper addresses the problem of automatic speech recognition (ASR) of a target speaker in background speech. The novelty of our approach is that we focus on a wakeup keyword, which is usually used for activating ASR systems like smart speakers. The proposed method firstly utilizes a DNN-based mask estimator to separate the mixture signal into the keyword signal uttered by the target speaker and the remaining background speech. Then the separated signals are used for calculating a beamforming filter to enhance the subsequent utterances from the target speaker. Experimental evaluations show that the trained DNN-based mask can selectively separate the keyword and background speech from the mixture signal. The effectiveness of the proposed method is also verified with Japanese ASR experiments, and we confirm that the character error rates are significantly improved by the proposed method for both simulated and real recorded test sets.
Tasks Speech Recognition
Published 2018-10-25
URL http://arxiv.org/abs/1810.10727v2
PDF http://arxiv.org/pdf/1810.10727v2.pdf
PWC https://paperswithcode.com/paper/speaker-selective-beamformer-with-keyword
Repo
Framework

Optimizing deep video representation to match brain activity

Title Optimizing deep video representation to match brain activity
Authors Hugo Richard, Ana Pinho, Bertrand Thirion, Guillaume Charpiat
Abstract The comparison of observed brain activity with the statistics generated by artificial intelligence systems is useful to probe brain functional organization under ecological conditions. Here we study fMRI activity in ten subjects watching color natural movies and compute deep representations of these movies with an architecture that relies on optical flow and image content. The association of activity in visual areas with the different layers of the deep architecture displays complexity-related contrasts across visual areas and reveals a striking foveal/peripheral dichotomy.
Tasks Optical Flow Estimation
Published 2018-09-07
URL http://arxiv.org/abs/1809.02440v1
PDF http://arxiv.org/pdf/1809.02440v1.pdf
PWC https://paperswithcode.com/paper/optimizing-deep-video-representation-to-match
Repo
Framework

Mixed Membership Recurrent Neural Networks

Title Mixed Membership Recurrent Neural Networks
Authors Ghazal Fazelnia, Mark Ibrahim, Ceena Modarres, Kevin Wu, John Paisley
Abstract Models for sequential data such as the recurrent neural network (RNN) often implicitly model a sequence as having a fixed time interval between observations and do not account for group-level effects when multiple sequences are observed. We propose a model for grouped sequential data based on the RNN that accounts for varying time intervals between observations in a sequence by learning a group-level base parameter to which each sequence can revert. Our approach is motivated by the mixed membership framework, and we show how it can be used for dynamic topic modeling in which the distribution on topics (not the topics themselves) are evolving in time. We demonstrate our approach on a dataset of 3.4 million online grocery shopping orders made by 206K customers.
Tasks
Published 2018-12-23
URL http://arxiv.org/abs/1812.09645v1
PDF http://arxiv.org/pdf/1812.09645v1.pdf
PWC https://paperswithcode.com/paper/mixed-membership-recurrent-neural-networks
Repo
Framework

Statistical Windows in Testing for the Initial Distribution of a Reversible Markov Chain

Title Statistical Windows in Testing for the Initial Distribution of a Reversible Markov Chain
Authors Quentin Berthet, Varun Kanade
Abstract We study the problem of hypothesis testing between two discrete distributions, where we only have access to samples after the action of a known reversible Markov chain, playing the role of noise. We derive instance-dependent minimax rates for the sample complexity of this problem, and show how its dependence in time is related to the spectral properties of the Markov chain. We show that there exists a wide statistical window, in terms of sample complexity for hypothesis testing between different pairs of initial distributions. We illustrate these results in several concrete examples.
Tasks
Published 2018-08-06
URL http://arxiv.org/abs/1808.01857v1
PDF http://arxiv.org/pdf/1808.01857v1.pdf
PWC https://paperswithcode.com/paper/statistical-windows-in-testing-for-the
Repo
Framework

John’s Walk

Title John’s Walk
Authors Adam Gustafson, Hariharan Narayanan
Abstract We present an affine-invariant random walk for drawing uniform random samples from a convex body $\mathcal{K} \subset \mathbb{R}^n$ for which the maximum volume inscribed ellipsoid, known as John’s ellipsoid, may be computed. We consider a polytope $\mathcal{P} = {x \in \mathbb{R}^n \mid Ax \leq 1}$ where $A \in \mathbb{R}^{m \times n}$ as a special case. Our algorithm makes steps using uniform sampling from the John’s ellipsoid of the symmetrization of $\mathcal{K}$ at the current point. We show that from a warm start, the random walk mixes in $\widetilde{O}(n^7)$ steps where the log factors depend only on constants determined by the warm start and error parameters (and not on the dimension or number of constraints defining the body). This sampling algorithm thus offers improvement over the affine-invariant Dikin Walk for polytopes (which mixes in $\widetilde{O}(mn)$ steps from a warm start) for applications in which $m \gg n$. Furthermore, we describe an $\widetilde{O}(mn^{\omega+1} + n^{2\omega+2})$ algorithm for finding a suitably approximate John’s ellipsoid for a symmetric polytope based on Vaidya’s algorithm, and show the mixing time is retained using these approximate ellipsoids (where $\omega < 2.373$ is the current value of the fast matrix multiplication constant).
Tasks
Published 2018-03-06
URL http://arxiv.org/abs/1803.02032v1
PDF http://arxiv.org/pdf/1803.02032v1.pdf
PWC https://paperswithcode.com/paper/johns-walk
Repo
Framework

Finding the bandit in a graph: Sequential search-and-stop

Title Finding the bandit in a graph: Sequential search-and-stop
Authors Pierre Perrault, Vianney Perchet, Michal Valko
Abstract We consider the problem where an agent wants to find a hidden object that is randomly located in some vertex of a directed acyclic graph (DAG) according to a fixed but possibly unknown distribution. The agent can only examine vertices whose in-neighbors have already been examined. In this paper, we address a learning setting where we allow the agent to stop before having found the object and restart searching on a new independent instance of the same problem. Our goal is to maximize the total number of hidden objects found given a time budget. The agent can thus skip an instance after realizing that it would spend too much time on it. Our contributions are both to the search theory and multi-armed bandits. If the distribution is known, we provide a quasi-optimal and efficient stationary strategy. If the distribution is unknown, we additionally show how to sequentially approximate it and, at the same time, act near-optimally in order to collect as many hidden objects as possible.
Tasks Multi-Armed Bandits
Published 2018-06-06
URL http://arxiv.org/abs/1806.02282v3
PDF http://arxiv.org/pdf/1806.02282v3.pdf
PWC https://paperswithcode.com/paper/finding-the-bandit-in-a-graph-sequential
Repo
Framework

A General Approach to Multi-Armed Bandits Under Risk Criteria

Title A General Approach to Multi-Armed Bandits Under Risk Criteria
Authors Asaf Cassel, Shie Mannor, Assaf Zeevi
Abstract Different risk-related criteria have received recent interest in learning problems, where typically each case is treated in a customized manner. In this paper we provide a more systematic approach to analyzing such risk criteria within a stochastic multi-armed bandit (MAB) formulation. We identify a set of general conditions that yield a simple characterization of the oracle rule (which serves as the regret benchmark), and facilitate the design of upper confidence bound (UCB) learning policies. The conditions are derived from problem primitives, primarily focusing on the relation between the arm reward distributions and the (risk criteria) performance metric. Among other things, the work highlights some (possibly non-intuitive) subtleties that differentiate various criteria in conjunction with statistical properties of the arms. Our main findings are illustrated on several widely used objectives such as conditional value-at-risk, mean-variance, Sharpe-ratio, and more.
Tasks Multi-Armed Bandits
Published 2018-06-04
URL http://arxiv.org/abs/1806.01380v1
PDF http://arxiv.org/pdf/1806.01380v1.pdf
PWC https://paperswithcode.com/paper/a-general-approach-to-multi-armed-bandits
Repo
Framework

The Externalities of Exploration and How Data Diversity Helps Exploitation

Title The Externalities of Exploration and How Data Diversity Helps Exploitation
Authors Manish Raghavan, Aleksandrs Slivkins, Jennifer Wortman Vaughan, Zhiwei Steven Wu
Abstract Online learning algorithms, widely used to power search and content optimization on the web, must balance exploration and exploitation, potentially sacrificing the experience of current users for information that will lead to better decisions in the future. Recently, concerns have been raised about whether the process of exploration could be viewed as unfair, placing too much burden on certain individuals or groups. Motivated by these concerns, we initiate the study of the externalities of exploration - the undesirable side effects that the presence of one party may impose on another - under the linear contextual bandits model. We introduce the notion of a group externality, measuring the extent to which the presence of one population of users impacts the rewards of another. We show that this impact can in some cases be negative, and that, in a certain sense, no algorithm can avoid it. We then study externalities at the individual level, interpreting the act of exploration as an externality imposed on the current user of a system by future users. This drives us to ask under what conditions inherent diversity in the data makes explicit exploration unnecessary. We build on a recent line of work on the smoothed analysis of the greedy algorithm that always chooses the action that currently looks optimal, improving on prior results to show that a greedy approach almost matches the best possible Bayesian regret rate of any other algorithm on the same problem instance whenever the diversity conditions hold, and that this regret is at most $\tilde{O}(T^{1/3})$. Returning to group-level effects, we show that under the same conditions, negative group externalities essentially vanish under the greedy algorithm. Together, our results uncover a sharp contrast between the high externalities that exist in the worst case, and the ability to remove all externalities if the data is sufficiently diverse.
Tasks Multi-Armed Bandits
Published 2018-06-01
URL http://arxiv.org/abs/1806.00543v2
PDF http://arxiv.org/pdf/1806.00543v2.pdf
PWC https://paperswithcode.com/paper/the-externalities-of-exploration-and-how-data
Repo
Framework

An agent-based model of an endangered population of the Arctic fox from Mednyi Island

Title An agent-based model of an endangered population of the Arctic fox from Mednyi Island
Authors Angelina Brilliantova, Anton Pletenev, Liliya Doronina, Hadi Hosseini
Abstract Artificial Intelligence techniques such as agent-based modeling and probabilistic reasoning have shown promise in modeling complex biological systems and testing ecological hypotheses through simulation. We develop an agent-based model of Arctic foxes from Medniy Island while utilizing Probabilistic Graphical Models to capture the conditional dependencies between the random variables. Such models provide valuable insights in analyzing factors behind catastrophic degradation of this population and in revealing evolutionary mechanisms of its persistence in high-density environment. Using empirical data from studies in Medniy Island, we create a realistic model of Arctic foxes as agents, and study their survival and population dynamics under a variety of conditions.
Tasks
Published 2018-07-16
URL http://arxiv.org/abs/1807.06103v1
PDF http://arxiv.org/pdf/1807.06103v1.pdf
PWC https://paperswithcode.com/paper/an-agent-based-model-of-an-endangered
Repo
Framework

Predictive Maintenance for Industrial IoT of Vehicle Fleets using Hierarchical Modified Fuzzy Support Vector Machine

Title Predictive Maintenance for Industrial IoT of Vehicle Fleets using Hierarchical Modified Fuzzy Support Vector Machine
Authors Arindam Chaudhuri
Abstract Connected vehicle fleets are deployed worldwide in several industrial IoT scenarios. With the gradual increase of machines being controlled and managed through networked smart devices, the predictive maintenance potential grows rapidly. Predictive maintenance has the potential of optimizing uptime as well as performance such that time and labor associated with inspections and preventive maintenance are reduced. In order to understand the trends of vehicle faults with respect to important vehicle attributes viz mileage, age, vehicle type etc this problem is addressed through hierarchical modified fuzzy support vector machine (HMFSVM). The proposed method is compared with other commonly used approaches like logistic regression, random forests and support vector machines. This helps better implementation of telematics data to ensure preventative management as part of the desired solution. The superiority of the proposed method is highlighted through several experimental results.
Tasks
Published 2018-06-24
URL http://arxiv.org/abs/1806.09612v1
PDF http://arxiv.org/pdf/1806.09612v1.pdf
PWC https://paperswithcode.com/paper/predictive-maintenance-for-industrial-iot-of
Repo
Framework

Multi-Statistic Approximate Bayesian Computation with Multi-Armed Bandits

Title Multi-Statistic Approximate Bayesian Computation with Multi-Armed Bandits
Authors Prashant Singh, Andreas Hellander
Abstract Approximate Bayesian computation is an established and popular method for likelihood-free inference with applications in many disciplines. The effectiveness of the method depends critically on the availability of well performing summary statistics. Summary statistic selection relies heavily on domain knowledge and carefully engineered features, and can be a laborious time consuming process. Since the method is sensitive to data dimensionality, the process of selecting summary statistics must balance the need to include informative statistics and the dimensionality of the feature vector. This paper proposes to treat the problem of dynamically selecting an appropriate summary statistic from a given pool of candidate summary statistics as a multi-armed bandit problem. This allows approximate Bayesian computation rejection sampling to dynamically focus on a distribution over well performing summary statistics as opposed to a fixed set of statistics. The proposed method is unique in that it does not require any pre-processing and is scalable to a large number of candidate statistics. This enables efficient use of a large library of possible time series summary statistics without prior feature engineering. The proposed approach is compared to state-of-the-art methods for summary statistics selection using a challenging test problem from the systems biology literature.
Tasks Feature Engineering, Multi-Armed Bandits, Time Series
Published 2018-05-22
URL http://arxiv.org/abs/1805.08647v1
PDF http://arxiv.org/pdf/1805.08647v1.pdf
PWC https://paperswithcode.com/paper/multi-statistic-approximate-bayesian
Repo
Framework
comments powered by Disqus