Paper Group ANR 481
State Duration and Interval Modeling in Hidden Semi-Markov Model for Sequential Data Analysis. Fairness in Reinforcement Learning. Backprop KF: Learning Discriminative Deterministic State Estimators. Optimal and Adaptive Off-policy Evaluation in Contextual Bandits. Binomial Checkpointing for Arbitrary Programs with No User Annotation. A scaled Breg …
State Duration and Interval Modeling in Hidden Semi-Markov Model for Sequential Data Analysis
Title | State Duration and Interval Modeling in Hidden Semi-Markov Model for Sequential Data Analysis |
Authors | Hiromi Narimatsu, Hiroyuki Kasai |
Abstract | Sequential data modeling and analysis have become indispensable tools for analyzing sequential data, such as time-series data, because larger amounts of sensed event data have become available. These methods capture the sequential structure of data of interest, such as input-output relations and correlation among datasets. However, because most studies in this area are specialized or limited to their respective applications, rigorous requirement analysis of such models has not been undertaken from a general perspective. Therefore, we particularly examine the structure of sequential data, and extract the necessity of state duration' and state interval’ of events for efficient and rich representation of sequential data. Specifically addressing the hidden semi-Markov model (HSMM) that represents such state duration inside a model, we attempt to add representational capability of a state interval of events onto HSMM. To this end, we propose two extended models: an interval state hidden semi-Markov model (IS-HSMM) to express the length of a state interval with a special state node designated as “interval state node”; and an interval length probability hidden semi-Markov model (ILP-HSMM) which represents the length of the state interval with a new probabilistic parameter “interval length probability.” Exhaustive simulations have revealed superior performance of the proposed models in comparison with HSMM. These proposed models are the first reported extensions of HMM to support state interval representation as well as state duration representation. |
Tasks | Time Series |
Published | 2016-08-24 |
URL | http://arxiv.org/abs/1608.06954v2 |
http://arxiv.org/pdf/1608.06954v2.pdf | |
PWC | https://paperswithcode.com/paper/state-duration-and-interval-modeling-in |
Repo | |
Framework | |
Fairness in Reinforcement Learning
Title | Fairness in Reinforcement Learning |
Authors | Shahin Jabbari, Matthew Joseph, Michael Kearns, Jamie Morgenstern, Aaron Roth |
Abstract | We initiate the study of fairness in reinforcement learning, where the actions of a learning algorithm may affect its environment and future rewards. Our fairness constraint requires that an algorithm never prefers one action over another if the long-term (discounted) reward of choosing the latter action is higher. Our first result is negative: despite the fact that fairness is consistent with the optimal policy, any learning algorithm satisfying fairness must take time exponential in the number of states to achieve non-trivial approximation to the optimal policy. We then provide a provably fair polynomial time algorithm under an approximate notion of fairness, thus establishing an exponential gap between exact and approximate fairness |
Tasks | |
Published | 2016-11-09 |
URL | http://arxiv.org/abs/1611.03071v4 |
http://arxiv.org/pdf/1611.03071v4.pdf | |
PWC | https://paperswithcode.com/paper/fairness-in-reinforcement-learning |
Repo | |
Framework | |
Backprop KF: Learning Discriminative Deterministic State Estimators
Title | Backprop KF: Learning Discriminative Deterministic State Estimators |
Authors | Tuomas Haarnoja, Anurag Ajay, Sergey Levine, Pieter Abbeel |
Abstract | Generative state estimators based on probabilistic filters and smoothers are one of the most popular classes of state estimators for robots and autonomous vehicles. However, generative models have limited capacity to handle rich sensory observations, such as camera images, since they must model the entire distribution over sensor readings. Discriminative models do not suffer from this limitation, but are typically more complex to train as latent variable models for state estimation. We present an alternative approach where the parameters of the latent state distribution are directly optimized as a deterministic computation graph, resulting in a simple and effective gradient descent algorithm for training discriminative state estimators. We show that this procedure can be used to train state estimators that use complex input, such as raw camera images, which must be processed using expressive nonlinear function approximators such as convolutional neural networks. Our model can be viewed as a type of recurrent neural network, and the connection to probabilistic filtering allows us to design a network architecture that is particularly well suited for state estimation. We evaluate our approach on synthetic tracking task with raw image inputs and on the visual odometry task in the KITTI dataset. The results show significant improvement over both standard generative approaches and regular recurrent neural networks. |
Tasks | Autonomous Vehicles, Latent Variable Models, Visual Odometry |
Published | 2016-05-23 |
URL | http://arxiv.org/abs/1605.07148v4 |
http://arxiv.org/pdf/1605.07148v4.pdf | |
PWC | https://paperswithcode.com/paper/backprop-kf-learning-discriminative |
Repo | |
Framework | |
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Title | Optimal and Adaptive Off-policy Evaluation in Contextual Bandits |
Authors | Yu-Xiang Wang, Alekh Agarwal, Miroslav Dudik |
Abstract | We study the off-policy evaluation problem—estimating the value of a target policy using data collected by another policy—under the contextual bandit model. We consider the general (agnostic) setting without access to a consistent model of rewards and establish a minimax lower bound on the mean squared error (MSE). The bound is matched up to constants by the inverse propensity scoring (IPS) and doubly robust (DR) estimators. This highlights the difficulty of the agnostic contextual setting, in contrast with multi-armed bandits and contextual bandits with access to a consistent reward model, where IPS is suboptimal. We then propose the SWITCH estimator, which can use an existing reward model (not necessarily consistent) to achieve a better bias-variance tradeoff than IPS and DR. We prove an upper bound on its MSE and demonstrate its benefits empirically on a diverse collection of data sets, often outperforming prior work by orders of magnitude. |
Tasks | Multi-Armed Bandits |
Published | 2016-12-04 |
URL | http://arxiv.org/abs/1612.01205v2 |
http://arxiv.org/pdf/1612.01205v2.pdf | |
PWC | https://paperswithcode.com/paper/optimal-and-adaptive-off-policy-evaluation-in |
Repo | |
Framework | |
Binomial Checkpointing for Arbitrary Programs with No User Annotation
Title | Binomial Checkpointing for Arbitrary Programs with No User Annotation |
Authors | Jeffrey Mark Siskind, Barak A. Pearlmutter |
Abstract | Heretofore, automatic checkpointing at procedure-call boundaries, to reduce the space complexity of reverse mode, has been provided by systems like Tapenade. However, binomial checkpointing, or treeverse, has only been provided in Automatic Differentiation (AD) systems in special cases, e.g., through user-provided pragmas on DO loops in Tapenade, or as the nested taping mechanism in adol-c for time integration processes, which requires that user code be refactored. We present a framework for applying binomial checkpointing to arbitrary code with no special annotation or refactoring required. This is accomplished by applying binomial checkpointing directly to a program trace. This trace is produced by a general-purpose checkpointing mechanism that is orthogonal to AD. |
Tasks | |
Published | 2016-11-10 |
URL | http://arxiv.org/abs/1611.03410v1 |
http://arxiv.org/pdf/1611.03410v1.pdf | |
PWC | https://paperswithcode.com/paper/binomial-checkpointing-for-arbitrary-programs |
Repo | |
Framework | |
A scaled Bregman theorem with applications
Title | A scaled Bregman theorem with applications |
Authors | Richard Nock, Aditya Krishna Menon, Cheng Soon Ong |
Abstract | Bregman divergences play a central role in the design and analysis of a range of machine learning algorithms. This paper explores the use of Bregman divergences to establish reductions between such algorithms and their analyses. We present a new scaled isodistortion theorem involving Bregman divergences (scaled Bregman theorem for short) which shows that certain “Bregman distortions’” (employing a potentially non-convex generator) may be exactly re-written as a scaled Bregman divergence computed over transformed data. Admissible distortions include geodesic distances on curved manifolds and projections or gauge-normalisation, while admissible data include scalars, vectors and matrices. Our theorem allows one to leverage to the wealth and convenience of Bregman divergences when analysing algorithms relying on the aforementioned Bregman distortions. We illustrate this with three novel applications of our theorem: a reduction from multi-class density ratio to class-probability estimation, a new adaptive projection free yet norm-enforcing dual norm mirror descent algorithm, and a reduction from clustering on flat manifolds to clustering on curved manifolds. Experiments on each of these domains validate the analyses and suggest that the scaled Bregman theorem might be a worthy addition to the popular handful of Bregman divergence properties that have been pervasive in machine learning. |
Tasks | |
Published | 2016-07-01 |
URL | http://arxiv.org/abs/1607.00360v1 |
http://arxiv.org/pdf/1607.00360v1.pdf | |
PWC | https://paperswithcode.com/paper/a-scaled-bregman-theorem-with-applications |
Repo | |
Framework | |
Markov substitute processes : a new model for linguistics and beyond
Title | Markov substitute processes : a new model for linguistics and beyond |
Authors | Olivier Catoni, Thomas Mainguy |
Abstract | We introduce Markov substitute processes, a new model at the crossroad of statistics and formal grammars, and prove its main property : Markov substitute processes with a given support form an exponential family. |
Tasks | |
Published | 2016-03-25 |
URL | http://arxiv.org/abs/1603.07850v1 |
http://arxiv.org/pdf/1603.07850v1.pdf | |
PWC | https://paperswithcode.com/paper/markov-substitute-processes-a-new-model-for |
Repo | |
Framework | |
Trajectory probability hypothesis density filter
Title | Trajectory probability hypothesis density filter |
Authors | Ángel F. García-Fernández, Lennart Svensson |
Abstract | This paper presents the probability hypothesis density (PHD) filter for sets of trajectories: the trajectory probability density (TPHD) filter. The TPHD filter is capable of estimating trajectories in a principled way without requiring to evaluate all measurement-to-target association hypotheses. The TPHD filter is based on recursively obtaining the best Poisson approximation to the multitrajectory filtering density in the sense of minimising the Kullback-Leibler divergence. We also propose a Gaussian mixture implementation of the TPHD recursion. Finally, we include simulation results to show the performance of the proposed algorithm. |
Tasks | |
Published | 2016-05-24 |
URL | http://arxiv.org/abs/1605.07264v2 |
http://arxiv.org/pdf/1605.07264v2.pdf | |
PWC | https://paperswithcode.com/paper/trajectory-probability-hypothesis-density |
Repo | |
Framework | |
Investigation Into The Effectiveness Of Long Short Term Memory Networks For Stock Price Prediction
Title | Investigation Into The Effectiveness Of Long Short Term Memory Networks For Stock Price Prediction |
Authors | Hengjian Jia |
Abstract | The effectiveness of long short term memory networks trained by backpropagation through time for stock price prediction is explored in this paper. A range of different architecture LSTM networks are constructed trained and tested. |
Tasks | Stock Price Prediction |
Published | 2016-03-25 |
URL | http://arxiv.org/abs/1603.07893v3 |
http://arxiv.org/pdf/1603.07893v3.pdf | |
PWC | https://paperswithcode.com/paper/investigation-into-the-effectiveness-of-long |
Repo | |
Framework | |
A Statistical Test for Joint Distributions Equivalence
Title | A Statistical Test for Joint Distributions Equivalence |
Authors | Francesco Solera, Andrea Palazzi |
Abstract | We provide a distribution-free test that can be used to determine whether any two joint distributions $p$ and $q$ are statistically different by inspection of a large enough set of samples. Following recent efforts from Long et al. [1], we rely on joint kernel distribution embedding to extend the kernel two-sample test of Gretton et al. [2] to the case of joint probability distributions. Our main result can be directly applied to verify if a dataset-shift has occurred between training and test distributions in a learning framework, without further assuming the shift has occurred only in the input, in the target or in the conditional distribution. |
Tasks | |
Published | 2016-07-25 |
URL | http://arxiv.org/abs/1607.07270v1 |
http://arxiv.org/pdf/1607.07270v1.pdf | |
PWC | https://paperswithcode.com/paper/a-statistical-test-for-joint-distributions |
Repo | |
Framework | |
Question Retrieval for Community-based Question Answering via Heterogeneous Network Integration Learning
Title | Question Retrieval for Community-based Question Answering via Heterogeneous Network Integration Learning |
Authors | Zheqian Chen, Chi Zhang, Zhou Zhao, Deng Cai |
Abstract | Community based question answering platforms have attracted substantial users to share knowledge and learn from each other. As the rapid enlargement of CQA platforms, quantities of overlapped questions emerge, which makes users confounded to select a proper reference. It is urgent for us to take effective automated algorithms to reuse historical questions with corresponding answers. In this paper we focus on the problem with question retrieval, which aims to match historical questions that are relevant or semantically equivalent to resolve one s query directly. The challenges in this task are the lexical gaps between questions for the word ambiguity and word mismatch problem. Furthermore, limited words in queried sentences cause sparsity of word features. To alleviate these challenges, we propose a novel framework named HNIL which encodes not only the question contents but also the askers social interactions to enhance the question embedding performance. More specifically, we apply random walk based learning method with recurrent neural network to match the similarities between askers question and historical questions proposed by other users. Extensive experiments on a large scale dataset from a real world CQA site show that employing the heterogeneous social network information outperforms the other state of the art solutions in this task. |
Tasks | Question Answering |
Published | 2016-11-24 |
URL | http://arxiv.org/abs/1611.08135v1 |
http://arxiv.org/pdf/1611.08135v1.pdf | |
PWC | https://paperswithcode.com/paper/question-retrieval-for-community-based |
Repo | |
Framework | |
Deep Gate Recurrent Neural Network
Title | Deep Gate Recurrent Neural Network |
Authors | Yuan Gao, Dorota Glowacka |
Abstract | This paper introduces two recurrent neural network structures called Simple Gated Unit (SGU) and Deep Simple Gated Unit (DSGU), which are general structures for learning long term dependencies. Compared to traditional Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), both structures require fewer parameters and less computation time in sequence classification tasks. Unlike GRU and LSTM, which require more than one gates to control information flow in the network, SGU and DSGU only use one multiplicative gate to control the flow of information. We show that this difference can accelerate the learning speed in tasks that require long dependency information. We also show that DSGU is more numerically stable than SGU. In addition, we also propose a standard way of representing inner structure of RNN called RNN Conventional Graph (RCG), which helps analyzing the relationship between input units and hidden units of RNN. |
Tasks | |
Published | 2016-04-11 |
URL | http://arxiv.org/abs/1604.02910v3 |
http://arxiv.org/pdf/1604.02910v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-gate-recurrent-neural-network |
Repo | |
Framework | |
Vanishing point detection with convolutional neural networks
Title | Vanishing point detection with convolutional neural networks |
Authors | Ali Borji |
Abstract | Inspired by the finding that vanishing point (road tangent) guides driver’s gaze, in our previous work we showed that vanishing point attracts gaze during free viewing of natural scenes as well as in visual search (Borji et al., Journal of Vision 2016). We have also introduced improved saliency models using vanishing point detectors (Feng et al., WACV 2016). Here, we aim to predict vanishing points in naturalistic environments by training convolutional neural networks in an end-to-end manner over a large set of road images downloaded from Youtube with vanishing points annotated. Results demonstrate effectiveness of our approach compared to classic approaches of vanishing point detection in the literature. |
Tasks | |
Published | 2016-09-04 |
URL | http://arxiv.org/abs/1609.00967v1 |
http://arxiv.org/pdf/1609.00967v1.pdf | |
PWC | https://paperswithcode.com/paper/vanishing-point-detection-with-convolutional |
Repo | |
Framework | |
Credibilistic TOPSIS Model for Evaluation and Selection of Municipal Solid Waste Disposal Methods
Title | Credibilistic TOPSIS Model for Evaluation and Selection of Municipal Solid Waste Disposal Methods |
Authors | Jagannath Roy, Krishnendu Adhikary, Samarjit Kar |
Abstract | Municipal solid waste management (MSWM) is a challenging issue of urban development in developing countries. Each country having different socio-economic-environmental background, might not accept a particular disposal method as the optimal choice. Selection of suitable disposal method in MSWM, under vague and imprecise information can be considered as multi criteria decision making problem (MCDM). In the present paper, TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) methodology is extended based on credibility theory for evaluating the performances of MSW disposal methods under some criteria fixed by experts. The proposed model helps decision makers to choose a preferable alternative for their municipal area. A sensitivity analysis by our proposed model confirms this fact. |
Tasks | Decision Making |
Published | 2016-06-29 |
URL | http://arxiv.org/abs/1606.08965v3 |
http://arxiv.org/pdf/1606.08965v3.pdf | |
PWC | https://paperswithcode.com/paper/credibilistic-topsis-model-for-evaluation-and |
Repo | |
Framework | |
Reinforcement Learning of POMDPs using Spectral Methods
Title | Reinforcement Learning of POMDPs using Spectral Methods |
Authors | Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar |
Abstract | We propose a new reinforcement learning algorithm for partially observable Markov decision processes (POMDP) based on spectral decomposition methods. While spectral methods have been previously employed for consistent learning of (passive) latent variable models such as hidden Markov models, POMDPs are more challenging since the learner interacts with the environment and possibly changes the future observations in the process. We devise a learning algorithm running through episodes, in each episode we employ spectral techniques to learn the POMDP parameters from a trajectory generated by a fixed policy. At the end of the episode, an optimization oracle returns the optimal memoryless planning policy which maximizes the expected reward based on the estimated POMDP model. We prove an order-optimal regret bound with respect to the optimal memoryless policy and efficient scaling with respect to the dimensionality of observation and action spaces. |
Tasks | Latent Variable Models |
Published | 2016-02-25 |
URL | http://arxiv.org/abs/1602.07764v2 |
http://arxiv.org/pdf/1602.07764v2.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-of-pomdps-using |
Repo | |
Framework | |