Paper Group ANR 414
Towards Score Following in Sheet Music Images
Title | Towards Score Following in Sheet Music Images |
Authors | Matthias Dorfer, Andreas Arzt, Gerhard Widmer |
Abstract | This paper addresses the matching of short music audio snippets to the corresponding pixel location in images of sheet music. A system is presented that simultaneously learns to read notes, listens to music and matches the currently played music to its corresponding notes in the sheet. It consists of an end-to-end multi-modal convolutional neural network that takes as input images of sheet music and spectrograms of the respective audio snippets. It learns to predict, for a given unseen audio snippet (covering approximately one bar of music), the corresponding position in the respective score line. Our results suggest that with the use of (deep) neural networks – which have proven to be powerful image processing models – working with sheet music becomes feasible and a promising future research direction. |
Tasks | |
Published | 2016-12-15 |
URL | http://arxiv.org/abs/1612.05050v1 |
http://arxiv.org/pdf/1612.05050v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-score-following-in-sheet-music-images |
Repo | |
Framework | |
Sample-efficient Deep Reinforcement Learning for Dialog Control
Title | Sample-efficient Deep Reinforcement Learning for Dialog Control |
Authors | Kavosh Asadi, Jason D. Williams |
Abstract | Representing a dialog policy as a recurrent neural network (RNN) is attractive because it handles partial observability, infers a latent representation of state, and can be optimized with supervised learning (SL) or reinforcement learning (RL). For RL, a policy gradient approach is natural, but is sample inefficient. In this paper, we present 3 methods for reducing the number of dialogs required to optimize an RNN-based dialog policy with RL. The key idea is to maintain a second RNN which predicts the value of the current policy, and to apply experience replay to both networks. On two tasks, these methods reduce the number of dialogs/episodes required by about a third, vs. standard policy gradient methods. |
Tasks | Policy Gradient Methods |
Published | 2016-12-18 |
URL | http://arxiv.org/abs/1612.06000v1 |
http://arxiv.org/pdf/1612.06000v1.pdf | |
PWC | https://paperswithcode.com/paper/sample-efficient-deep-reinforcement-learning-1 |
Repo | |
Framework | |
Matching Handwritten Document Images
Title | Matching Handwritten Document Images |
Authors | Praveen Krishnan, C. V. Jawahar |
Abstract | We address the problem of predicting similarity between a pair of handwritten document images written by different individuals. This has applications related to matching and mining in image collections containing handwritten content. A similarity score is computed by detecting patterns of text re-usages between document images irrespective of the minor variations in word morphology, word ordering, layout and paraphrasing of the content. Our method does not depend on an accurate segmentation of words and lines. We formulate the document matching problem as a structured comparison of the word distributions across two document images. To match two word images, we propose a convolutional neural network (CNN) based feature descriptor. Performance of this representation surpasses the state-of-the-art on handwritten word spotting. Finally, we demonstrate the applicability of our method on a practical problem of matching handwritten assignments. |
Tasks | |
Published | 2016-05-19 |
URL | http://arxiv.org/abs/1605.05923v1 |
http://arxiv.org/pdf/1605.05923v1.pdf | |
PWC | https://paperswithcode.com/paper/matching-handwritten-document-images |
Repo | |
Framework | |
Risk-Sensitive Learning and Pricing for Demand Response
Title | Risk-Sensitive Learning and Pricing for Demand Response |
Authors | Kia Khezeli, Eilyan Bitar |
Abstract | We consider the setting in which an electric power utility seeks to curtail its peak electricity demand by offering a fixed group of customers a uniform price for reductions in consumption relative to their predetermined baselines. The underlying demand curve, which describes the aggregate reduction in consumption in response to the offered price, is assumed to be affine and subject to unobservable random shocks. Assuming that both the parameters of the demand curve and the distribution of the random shocks are initially unknown to the utility, we investigate the extent to which the utility might dynamically adjust its offered prices to maximize its cumulative risk-sensitive payoff over a finite number of $T$ days. In order to do so effectively, the utility must design its pricing policy to balance the tradeoff between the need to learn the unknown demand model (exploration) and maximize its payoff (exploitation) over time. In this paper, we propose such a pricing policy, which is shown to exhibit an expected payoff loss over $T$ days that is at most $O(\sqrt{T}\log(T))$, relative to an oracle pricing policy that knows the underlying demand model. Moreover, the proposed pricing policy is shown to yield a sequence of prices that converge to the oracle optimal prices in the mean square sense. |
Tasks | |
Published | 2016-11-21 |
URL | http://arxiv.org/abs/1611.07098v3 |
http://arxiv.org/pdf/1611.07098v3.pdf | |
PWC | https://paperswithcode.com/paper/risk-sensitive-learning-and-pricing-for |
Repo | |
Framework | |
The Symbolic Interior Point Method
Title | The Symbolic Interior Point Method |
Authors | Martin Mladenov, Vaishak Belle, Kristian Kersting |
Abstract | A recent trend in probabilistic inference emphasizes the codification of models in a formal syntax, with suitable high-level features such as individuals, relations, and connectives, enabling descriptive clarity, succinctness and circumventing the need for the modeler to engineer a custom solver. Unfortunately, bringing these linguistic and pragmatic benefits to numerical optimization has proven surprisingly challenging. In this paper, we turn to these challenges: we introduce a rich modeling language, for which an interior-point method computes approximate solutions in a generic way. While logical features easily complicate the underlying model, often yielding intricate dependencies, we exploit and cache local structure using algebraic decision diagrams (ADDs). Indeed, standard matrix-vector algebra is efficiently realizable in ADDs, but we argue and show that well-known optimization methods are not ideal for ADDs. Our engine, therefore, invokes a sophisticated matrix-free approach. We demonstrate the flexibility of the resulting symbolic-numeric optimizer on decision making and compressed sensing tasks with millions of non-zero entries. |
Tasks | Decision Making |
Published | 2016-05-26 |
URL | http://arxiv.org/abs/1605.08187v3 |
http://arxiv.org/pdf/1605.08187v3.pdf | |
PWC | https://paperswithcode.com/paper/the-symbolic-interior-point-method |
Repo | |
Framework | |
Technical Report: A Generalized Matching Pursuit Approach for Graph-Structured Sparsity
Title | Technical Report: A Generalized Matching Pursuit Approach for Graph-Structured Sparsity |
Authors | Feng Chen, Baojian Zhou |
Abstract | Sparsity-constrained optimization is an important and challenging problem that has wide applicability in data mining, machine learning, and statistics. In this paper, we focus on sparsity-constrained optimization in cases where the cost function is a general nonlinear function and, in particular, the sparsity constraint is defined by a graph-structured sparsity model. Existing methods explore this problem in the context of sparse estimation in linear models. To the best of our knowledge, this is the first work to present an efficient approximation algorithm, namely, Graph-structured Matching Pursuit (Graph-Mp), to optimize a general nonlinear function subject to graph-structured constraints. We prove that our algorithm enjoys strong guarantees analogous to those designed for linear models in terms of convergence rate and approximation accuracy. As a case study, we specialize Graph-Mp to optimize a number of well-known graph scan statistic models for the connected subgraph detection task, and empirical evidence demonstrates that our general algorithm outperforms state-of-the-art methods that are designed specifically for the task of connected subgraph detection. |
Tasks | |
Published | 2016-12-11 |
URL | http://arxiv.org/abs/1612.03364v1 |
http://arxiv.org/pdf/1612.03364v1.pdf | |
PWC | https://paperswithcode.com/paper/technical-report-a-generalized-matching |
Repo | |
Framework | |
Stationary time-vertex signal processing
Title | Stationary time-vertex signal processing |
Authors | Andreas Loukas, Nathanaël Perraudin |
Abstract | This paper considers regression tasks involving high-dimensional multivariate processes whose structure is dependent on some known graph topology. We put forth a new definition of time-vertex wide-sense stationarity, or joint stationarity for short, that goes beyond product graphs. Joint stationarity helps by reducing the estimation variance and recovery complexity. In particular, for any jointly stationary process (a) one reliably learns the covariance structure from as little as a single realization of the process, and (b) solves MMSE recovery problems, such as interpolation and denoising, in computational time nearly linear in the number of edges and timesteps. Experiments with three datasets suggest that joint stationarity can yield accuracy improvements in the recovery of high-dimensional processes evolving over a graph, even when the latter is only approximately known, or the process is not strictly stationary. |
Tasks | Denoising |
Published | 2016-11-01 |
URL | https://arxiv.org/abs/1611.00255v3 |
https://arxiv.org/pdf/1611.00255v3.pdf | |
PWC | https://paperswithcode.com/paper/stationary-time-vertex-signal-processing |
Repo | |
Framework | |
Mesh Denoising based on Normal Voting Tensor and Binary Optimization
Title | Mesh Denoising based on Normal Voting Tensor and Binary Optimization |
Authors | S. K. Yadav, U. Reitebuch, K. Polthier |
Abstract | This paper presents a tensor-multiplication-based smoothing algorithm that follows a two-step denoising method. Unlike traditional averaging approaches, our approach uses an element-based normal voting tensor to compute smooth surfaces. By introducing a binary optimization on the proposed tensor together with a local binary neighborhood concept, our algorithm better retains sharp features and produces smoother umbilical regions than previous approaches. On top of that, we provide a stochastic analysis of the different kinds of noise based on the average edge length. The quantitative and visual results demonstrate that our method performs better than state-of-the-art smoothing approaches. |
Tasks | Denoising |
Published | 2016-07-20 |
URL | http://arxiv.org/abs/1607.07427v2 |
http://arxiv.org/pdf/1607.07427v2.pdf | |
PWC | https://paperswithcode.com/paper/mesh-denoising-based-on-normal-voting-tensor |
Repo | |
Framework | |
Analysis and Implementation of an Asynchronous Optimization Algorithm for the Parameter Server
Title | Analysis and Implementation of an Asynchronous Optimization Algorithm for the Parameter Server |
Authors | Arda Aytekin, Hamid Reza Feyzmahdavian, Mikael Johansson |
Abstract | This paper presents an asynchronous incremental aggregated gradient algorithm and its implementation in a parameter server framework for solving regularized optimization problems. The algorithm can handle both general convex (possibly non-smooth) regularizers and general convex constraints. When the empirical data loss is strongly convex, we establish a linear convergence rate, give explicit expressions for step-size choices that guarantee convergence to the optimum, and bound the associated convergence factors. The expressions have an explicit dependence on the degree of asynchrony and recover classical results under synchronous operation. Simulations and implementations on commercial compute clouds validate our findings. |
Tasks | |
Published | 2016-10-18 |
URL | http://arxiv.org/abs/1610.05507v1 |
http://arxiv.org/pdf/1610.05507v1.pdf | |
PWC | https://paperswithcode.com/paper/analysis-and-implementation-of-an |
Repo | |
Framework | |
Automatic Discoveries of Physical and Semantic Concepts via Association Priors of Neuron Groups
Title | Automatic Discoveries of Physical and Semantic Concepts via Association Priors of Neuron Groups |
Authors | Shuai Li, Kui Jia, Xiaogang Wang |
Abstract | Recent successful deep neural networks are largely trained in a supervised manner. Such training associates complex patterns of input samples with neurons in the last layer, which form representations of concepts. In spite of these successes, the properties of the complex patterns associated with a learned concept remain elusive. In this work, by analyzing how neurons are associated with concepts in supervised networks, we hypothesize that with proper priors to regulate learning, neural networks can automatically associate neurons in the intermediate layers with concepts that are aligned with real world concepts, when trained only with labels that associate concepts with top level neurons, which is a plausible way for unsupervised learning. We develop a prior to verify the hypothesis and experimentally find that the proposed prior helps neural networks automatically learn both basic physical concepts at the lower layers, e.g., rotation of filters, and highly semantic concepts at the higher layers, e.g., fine-grained categories of an entry-level category. |
Tasks | |
Published | 2016-12-30 |
URL | http://arxiv.org/abs/1612.09438v2 |
http://arxiv.org/pdf/1612.09438v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-discoveries-of-physical-and |
Repo | |
Framework | |
Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices
Title | Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices |
Authors | Heiga Zen, Yannis Agiomyrgiannakis, Niels Egberts, Fergus Henderson, Przemysław Szczepaniak |
Abstract | Acoustic models based on long short-term memory recurrent neural networks (LSTM-RNNs) were applied to statistical parametric speech synthesis (SPSS) and showed significant improvements in naturalness and latency over those based on hidden Markov models (HMMs). This paper describes further optimizations of LSTM-RNN-based SPSS for deployment on mobile devices: weight quantization, multi-frame inference, and robust inference using an $\epsilon$-contaminated Gaussian loss function. Experimental results in subjective listening tests show that these optimizations can make LSTM-RNN-based SPSS comparable to HMM-based SPSS in runtime speed while maintaining naturalness. Evaluations between LSTM-RNN-based SPSS and HMM-driven unit selection speech synthesis are also presented. |
Tasks | Quantization, Speech Synthesis |
Published | 2016-06-20 |
URL | http://arxiv.org/abs/1606.06061v2 |
http://arxiv.org/pdf/1606.06061v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-compact-and-high-quality-lstm-rnn-based |
Repo | |
Framework | |
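The abstract above names weight quantization as one of its optimizations but does not spell out the scheme. As a rough illustration only, the sketch below shows generic uniform 8-bit quantization with a shared offset and step size; the function names `quantize_uint8` and `dequantize` and the sample weights are hypothetical, and this is not claimed to be the method used in the paper.

```python
def quantize_uint8(weights):
    """Map floats to integer codes in 0..255 using a shared offset and step."""
    lo, hi = min(weights), max(weights)
    # Fall back to a step of 1.0 when all weights are equal (range is zero).
    step = (hi - lo) / 255 or 1.0
    codes = [round((w - lo) / step) for w in weights]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Recover approximate floats from the integer codes."""
    return [lo + c * step for c in codes]


# Hypothetical weight values, chosen only for illustration.
weights = [-0.42, -0.07, 0.0, 0.13, 0.58, 0.91]
codes, lo, step = quantize_uint8(weights)
restored = dequantize(codes, lo, step)

# Rounding to the nearest code bounds the error by half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error, step / 2)
```

Each weight is stored in one byte instead of a 32-bit float, at the cost of a reconstruction error of at most half the step size.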
3D Human Pose Estimation Using Convolutional Neural Networks with 2D Pose Information
Title | 3D Human Pose Estimation Using Convolutional Neural Networks with 2D Pose Information |
Authors | Sungheon Park, Jihye Hwang, Nojun Kwak |
Abstract | While there has been success in 2D human pose estimation with convolutional neural networks (CNNs), 3D human pose estimation has not been thoroughly studied. In this paper, we tackle the 3D human pose estimation task with end-to-end learning using CNNs. Relative 3D positions between one joint and the other joints are learned via CNNs. The proposed method improves the performance of CNN with two novel ideas. First, we added 2D pose information to estimate a 3D pose from an image by concatenating 2D pose estimation result with the features from an image. Second, we have found that more accurate 3D poses are obtained by combining information on relative positions with respect to multiple joints, instead of just one root joint. Experimental results show that the proposed method achieves comparable performance to the state-of-the-art methods on the Human3.6M dataset. |
Tasks | 3D Human Pose Estimation, Pose Estimation |
Published | 2016-08-10 |
URL | http://arxiv.org/abs/1608.03075v2 |
http://arxiv.org/pdf/1608.03075v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-human-pose-estimation-using-convolutional |
Repo | |
Framework | |
Deep Feature Based Contextual Model for Object Detection
Title | Deep Feature Based Contextual Model for Object Detection |
Authors | Wenqing Chu, Deng Cai |
Abstract | Object detection is one of the most active areas in computer vision, which has made significant improvement in recent years. Current state-of-the-art object detection methods mostly adhere to the framework of regions with convolutional neural network (R-CNN) and only use local appearance features inside object bounding boxes. Since these approaches ignore the contextual information around the object proposals, these detectors may generate a semantically incoherent interpretation of the input image. In this paper, we propose an ensemble object detection system which incorporates the local appearance, the contextual information in terms of relationships among objects and the global scene based contextual feature generated by a convolutional neural network. The system is formulated as a fully connected conditional random field (CRF) defined on object proposals, and the contextual constraints among object proposals are modeled naturally as edges. Furthermore, a fast mean field approximation method is used to perform inference in this CRF model efficiently. The experimental results demonstrate that our approach achieves a higher mean average precision (mAP) on the PASCAL VOC 2007 dataset compared to the baseline algorithm Faster R-CNN. |
Tasks | Object Detection |
Published | 2016-04-14 |
URL | http://arxiv.org/abs/1604.04048v1 |
http://arxiv.org/pdf/1604.04048v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-feature-based-contextual-model-for |
Repo | |
Framework | |
Optimal Belief Approximation
Title | Optimal Belief Approximation |
Authors | Reimar H. Leike, Torsten A. Enßlin |
Abstract | In Bayesian statistics, probability distributions express beliefs. However, for many problems the beliefs cannot be computed analytically and approximations of beliefs are needed. We seek a loss function that quantifies how “embarrassing” it is to communicate a given approximation. We reproduce and discuss an old proof showing that there is only one ranking under the requirements that (1) the best ranked approximation is the non-approximated belief and (2) that the ranking judges approximations only by their predictions for actual outcomes. The loss function that is obtained in the derivation is equal to the Kullback-Leibler divergence when normalized. This loss function is frequently used in the literature. However, there seems to be confusion about the correct order in which its functional arguments, the approximated and non-approximated beliefs, should be used. The correct order ensures that the recipient of a communication is only deprived of the minimal amount of information. We hope that the elementary derivation settles the apparent confusion. For example, when approximating beliefs with Gaussian distributions, the optimal approximation is given by moment matching. This is in contrast to many suggested computational schemes. |
Tasks | |
Published | 2016-10-27 |
URL | http://arxiv.org/abs/1610.09018v6 |
http://arxiv.org/pdf/1610.09018v6.pdf | |
PWC | https://paperswithcode.com/paper/optimal-belief-approximation |
Repo | |
Framework | |
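The moment-matching claim in the abstract above can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' derivation: it takes a hypothetical two-component Gaussian mixture as the non-approximated belief p, discretizes it on a grid, and compares KL(p || q) for the moment-matched Gaussian q against a few perturbed Gaussians. The mixture parameters, grid, and all function names are made up for the example.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_logpdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


# Hypothetical belief p: a bimodal mixture discretized on a grid over [-10, 10].
STEP = 0.01
XS = [-10.0 + i * STEP for i in range(2001)]
raw = [0.7 * normal_pdf(x, -2.0, 0.8) + 0.3 * normal_pdf(x, 3.0, 1.2) for x in XS]
mass = sum(raw) * STEP
P = [r / mass for r in raw]  # normalized so that sum(P) * STEP == 1


def kl_to_gaussian(mu, sigma):
    """Riemann-sum estimate of KL(p || q) for q = N(mu, sigma^2)."""
    total = 0.0
    for x, p in zip(XS, P):
        if p > 0.0:
            total += p * (math.log(p) - normal_logpdf(x, mu, sigma))
    return total * STEP


# Moment matching: q takes the mean and variance of p.
mean = sum(x * p for x, p in zip(XS, P)) * STEP
var = sum((x - mean) ** 2 * p for x, p in zip(XS, P)) * STEP
std = math.sqrt(var)

kl_matched = kl_to_gaussian(mean, std)
kl_shifted = kl_to_gaussian(mean + 0.5, std)
kl_narrow = kl_to_gaussian(mean, 0.8 * std)
kl_wide = kl_to_gaussian(mean, 1.2 * std)
print(kl_matched, kl_shifted, kl_narrow, kl_wide)
```

With p as the first argument, the cross-entropy term depends on q only through the mean and variance of p, which is why the moment-matched Gaussian gives the smallest value among the candidates. Swapping the argument order generally leads to a different optimum.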
Processing Natural Language About Ongoing Actions
Title | Processing Natural Language About Ongoing Actions |
Authors | Steve Doubleday, Sean Trott, Jerome Feldman |
Abstract | Actions may not proceed as planned; they may be interrupted, resumed or overridden. This is a challenge to handle in a natural language understanding system. We describe extensions to an existing implementation for the control of autonomous systems by natural language, to enable such systems to handle incoming language requests regarding actions. Language Communication with Autonomous Systems (LCAS) has been extended with support for X-nets, parameterized executable schemas representing actions. X-nets enable the system to control actions at a desired level of granularity, while providing a mechanism for language requests to be processed asynchronously. Standard semantics supported include requests to stop, continue, or override the existing action. The specific domain demonstrated is the control of motion of a simulated robot, but the approach is general, and could be applied to other domains. |
Tasks | |
Published | 2016-07-23 |
URL | http://arxiv.org/abs/1607.06875v2 |
http://arxiv.org/pdf/1607.06875v2.pdf | |
PWC | https://paperswithcode.com/paper/processing-natural-language-about-ongoing |
Repo | |
Framework | |