Paper Group ANR 297
On Lower Bounds for Regret in Reinforcement Learning
Title | On Lower Bounds for Regret in Reinforcement Learning |
Authors | Ian Osband, Benjamin Van Roy |
Abstract | This is a brief technical note to clarify the state of lower bounds on regret for reinforcement learning. In particular, this paper: - Reproduces a lower bound on regret for reinforcement learning, similar to the result of Theorem 5 in the journal UCRL2 paper (Jaksch et al 2010). - Clarifies that the proposed proof of Theorem 6 in the REGAL paper (Bartlett and Tewari 2009) does not hold using the standard techniques without further work. We suggest that this result should instead be considered a conjecture as it has no rigorous proof. - Suggests that the conjectured lower bound given by (Bartlett and Tewari 2009) is incorrect and, in fact, it is possible to improve the scaling of the upper bound to match the weaker lower bounds presented in this paper. We hope that this note serves to clarify existing results in the field of reinforcement learning and provides interesting motivation for future work. |
Tasks | |
Published | 2016-08-09 |
URL | http://arxiv.org/abs/1608.02732v1 |
http://arxiv.org/pdf/1608.02732v1.pdf | |
PWC | https://paperswithcode.com/paper/on-lower-bounds-for-regret-in-reinforcement |
Repo | |
Framework | |
Spatio-Temporal Image Boundary Extrapolation
Title | Spatio-Temporal Image Boundary Extrapolation |
Authors | Apratim Bhattacharyya, Mateusz Malinowski, Mario Fritz |
Abstract | Boundary prediction in images as well as video has been a very active topic of research, and organizing visual information into boundaries and segments is believed to be a cornerstone of visual perception. While prior work has focused on predicting boundaries for observed frames, our work aims at predicting boundaries of future unobserved frames. This requires our model to learn about the fate of boundaries and extrapolate motion patterns. We experiment on an established real-world video segmentation dataset, which provides a testbed for this new task. We show for the first time spatio-temporal boundary extrapolation in this challenging scenario. Furthermore, we show long-term prediction of boundaries in situations where the motion is governed by the laws of physics. We successfully predict boundaries in a billiard scenario without any assumptions of a strong parametric model or any object notion. We argue that our model has, with minimalistic model assumptions, derived a notion of ‘intuitive physics’ that can be applied to novel scenes. |
Tasks | Video Semantic Segmentation |
Published | 2016-05-24 |
URL | http://arxiv.org/abs/1605.07363v1 |
http://arxiv.org/pdf/1605.07363v1.pdf | |
PWC | https://paperswithcode.com/paper/spatio-temporal-image-boundary-extrapolation |
Repo | |
Framework | |
Neural Network Support Vector Detection via a Soft-Label, Hybrid K-Means Classifier
Title | Neural Network Support Vector Detection via a Soft-Label, Hybrid K-Means Classifier |
Authors | Robert A. Murphy |
Abstract | We use random geometric graphs to describe clusters of higher dimensional data points which are bijectively mapped to a (possibly) lower dimensional space where an equivalent random cluster model is used to calculate the expected number of modes to be found when separating the data of a multi-modal data set into distinct clusters. Furthermore, as a function of the expected number of modes and the number of data points in the sample, an upper bound on a given distance measure is found such that data points have the greatest correlation if their mutual distances from a common center are less than or equal to the calculated bound. Anomalies are exposed, which lie outside of the union of all regularized clusters of data points. Similar to finding a hyperplane which can be shifted along its normal to expose the maximal distance between binary classes, it is shown that the union of regularized clusters can be used to define a hyperplane which can be shifted by a certain amount to separate the data into binary classes and that the shifted hyperplane defines the activation function for a two-class discriminating neural network. Lastly, this neural network is used to detect the set of support vectors which determines the maximally-separating region between the binary classes. |
Tasks | |
Published | 2016-02-11 |
URL | http://arxiv.org/abs/1602.03822v5 |
http://arxiv.org/pdf/1602.03822v5.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-support-vector-detection-via-a |
Repo | |
Framework | |
A Modular Theory of Feature Learning
Title | A Modular Theory of Feature Learning |
Authors | Daniel McNamara, Cheng Soon Ong, Robert C. Williamson |
Abstract | Learning representations of data, and in particular learning features for a subsequent prediction task, has been a fruitful area of research delivering impressive empirical results in recent years. However, relatively little is understood about what makes a representation ‘good’. We propose the idea of a risk gap induced by representation learning for a given prediction context, which measures the difference in the risk of some learner using the learned features as compared to the original inputs. We describe a set of sufficient conditions for unsupervised representation learning to provide a benefit, as measured by this risk gap. These conditions decompose the problem of when representation learning works into its constituent parts, which can be separately evaluated using an unlabeled sample, suitable domain-specific assumptions about the joint distribution, and analysis of the feature learner and subsequent supervised learner. We provide two examples of such conditions in the context of specific properties of the unlabeled distribution, namely when the data lies close to a low-dimensional manifold and when it forms clusters. We compare our approach to a recently proposed analysis of semi-supervised learning. |
Tasks | Representation Learning, Unsupervised Representation Learning |
Published | 2016-11-09 |
URL | http://arxiv.org/abs/1611.03125v1 |
http://arxiv.org/pdf/1611.03125v1.pdf | |
PWC | https://paperswithcode.com/paper/a-modular-theory-of-feature-learning |
Repo | |
Framework | |
Joint Stochastic Approximation learning of Helmholtz Machines
Title | Joint Stochastic Approximation learning of Helmholtz Machines |
Authors | Haotian Xu, Zhijian Ou |
Abstract | Despite progress, model learning and posterior inference remain a common challenge in using deep generative models, especially when handling discrete hidden variables. This paper is mainly concerned with algorithms for learning Helmholtz machines, which are characterized by pairing the generative model with an auxiliary inference model. A common drawback of previous learning algorithms is that they indirectly optimize some bounds of the targeted marginal log-likelihood. In contrast, we successfully develop a new class of algorithms, based on stochastic approximation (SA) theory of the Robbins-Monro type, to directly optimize the marginal log-likelihood and simultaneously minimize the inclusive KL-divergence. The resulting learning algorithm is thus called joint SA (JSA). Moreover, we construct an effective MCMC operator for JSA. Our results on the MNIST datasets demonstrate that JSA’s performance is consistently superior to that of competing algorithms like RWS, for learning a range of difficult models. |
Tasks | |
Published | 2016-03-20 |
URL | http://arxiv.org/abs/1603.06170v2 |
http://arxiv.org/pdf/1603.06170v2.pdf | |
PWC | https://paperswithcode.com/paper/joint-stochastic-approximation-learning-of |
Repo | |
Framework | |
Causes for Query Answers from Databases, Datalog Abduction and View-Updates: The Presence of Integrity Constraints
Title | Causes for Query Answers from Databases, Datalog Abduction and View-Updates: The Presence of Integrity Constraints |
Authors | Babak Salimi, Leopoldo Bertossi |
Abstract | Causality has been recently introduced in databases, to model, characterize and possibly compute causes for query results (answers). Connections between query-answer causality, consistency-based diagnosis, database repairs (wrt. integrity constraint violations), abductive diagnosis and the view-update problem have been established. In this work we further investigate connections between query-answer causality and abductive diagnosis and the view-update problem. In this context, we also define and investigate the notion of query-answer causality in the presence of integrity constraints. |
Tasks | |
Published | 2016-02-20 |
URL | http://arxiv.org/abs/1602.06458v1 |
http://arxiv.org/pdf/1602.06458v1.pdf | |
PWC | https://paperswithcode.com/paper/causes-for-query-answers-from-databases |
Repo | |
Framework | |
A Visual Representation for Editing Face Images
Title | A Visual Representation for Editing Face Images |
Authors | Jiajun Lu, Kalyan Sunkavalli, Nathan Carr, Sunil Hadap, David Forsyth |
Abstract | We propose a new approach for editing face images, which enables numerous exciting applications including face relighting, makeup transfer and face detail editing. Our face edits are based on a visual representation, which includes geometry, face segmentation, albedo, illumination and detail map. To recover our visual representation, we start by estimating geometry using a morphable face model, then decompose the face image to recover the albedo, and then shade the geometry with the albedo and illumination. The residual between our shaded geometry and the input image produces our detail map, which carries high frequency information that is either insufficiently or incorrectly captured by our shading process. By manipulating the detail map, we can edit face images with reality and identity preserved. Our representation allows various applications. First, it allows a user to directly manipulate various illumination. Second, it allows non-parametric makeup transfer with input face’s distinctive identity features preserved. Third, it allows non-parametric modifications to the face appearance by transferring details. For face relighting and detail editing, we evaluate via a user study and our method outperforms other methods. For makeup transfer, we evaluate via an online attractiveness evaluation system, and can reliably make people look younger and more attractive. We also show extensive qualitative comparisons to existing methods, and have significant improvements over previous techniques. |
Tasks | |
Published | 2016-12-02 |
URL | http://arxiv.org/abs/1612.00522v1 |
http://arxiv.org/pdf/1612.00522v1.pdf | |
PWC | https://paperswithcode.com/paper/a-visual-representation-for-editing-face |
Repo | |
Framework | |
A Delay-Tolerant Potential-Field-Based Network Implementation of an Integrated Navigation System
Title | A Delay-Tolerant Potential-Field-Based Network Implementation of an Integrated Navigation System |
Authors | Rachana Ashok Gupta, Ahmad A. Masoud, Mo-Yuen Chow |
Abstract | Network controllers (NCs) are devices that are capable of converting dynamic, spatially extended, and functionally specialized modules into a taskable goal-oriented group called a networked control system. This paper examines the practical aspects of designing and building an NC that uses the Internet as a communication medium. It focuses on finding compatible controller components that can be integrated via a host structure in a manner that makes it possible to network, in real-time, a webcam, an unmanned ground vehicle (UGV), and a remote computer server along with the necessary operator software interface. The aim is to deskill the UGV navigation process and yet maintain a robust performance. The structure of the suggested controller, its components, and the manner in which they are interfaced are described. Thorough experimental results along with performance assessment and comparisons to a previously implemented NC are provided. |
Tasks | |
Published | 2016-08-23 |
URL | http://arxiv.org/abs/1608.06440v1 |
http://arxiv.org/pdf/1608.06440v1.pdf | |
PWC | https://paperswithcode.com/paper/a-delay-tolerant-potential-field-based |
Repo | |
Framework | |
Algorithms for Fitting the Constrained Lasso
Title | Algorithms for Fitting the Constrained Lasso |
Authors | Brian R. Gaines, Hua Zhou |
Abstract | We compare alternative computing strategies for solving the constrained lasso problem. As its name suggests, the constrained lasso extends the widely-used lasso to handle linear constraints, which allow the user to incorporate prior information into the model. In addition to quadratic programming, we employ the alternating direction method of multipliers (ADMM) and also derive an efficient solution path algorithm. Through both simulations and real data examples, we compare the different algorithms and provide practical recommendations in terms of efficiency and accuracy for various sizes of data. We also show that, for an arbitrary penalty matrix, the generalized lasso can be transformed to a constrained lasso, while the converse is not true. Thus, our methods can also be used for estimating a generalized lasso, which has wide-ranging applications. Code for implementing the algorithms is freely available in the Matlab toolbox SparseReg. |
Tasks | |
Published | 2016-10-28 |
URL | http://arxiv.org/abs/1611.01511v1 |
http://arxiv.org/pdf/1611.01511v1.pdf | |
PWC | https://paperswithcode.com/paper/algorithms-for-fitting-the-constrained-lasso |
Repo | |
Framework | |
A fine-grained approach to scene text script identification
Title | A fine-grained approach to scene text script identification |
Authors | Lluis Gomez, Dimosthenis Karatzas |
Abstract | This paper focuses on the problem of script identification in unconstrained scenarios. Script identification is an important prerequisite to recognition, and an indispensable condition for automatic text understanding systems designed for multi-language environments. Although widely studied for document images and handwritten documents, it remains an almost unexplored territory for scene text images. We detail a novel method for script identification in natural images that combines convolutional features and the Naive-Bayes Nearest Neighbor classifier. The proposed framework efficiently exploits the discriminative power of small stroke-parts, in a fine-grained classification framework. In addition, we propose a new public benchmark dataset for the evaluation of joint text detection and script identification in natural scenes. Experiments on this new dataset demonstrate that the proposed method yields state-of-the-art results, while it generalizes well to different datasets and a variable number of scripts. The evidence provided shows that multi-lingual scene text recognition in the wild is a viable proposition. Source code of the proposed method is made available online. |
Tasks | Scene Text Recognition |
Published | 2016-02-24 |
URL | http://arxiv.org/abs/1602.07475v1 |
http://arxiv.org/pdf/1602.07475v1.pdf | |
PWC | https://paperswithcode.com/paper/a-fine-grained-approach-to-scene-text-script |
Repo | |
Framework | |
A Game-Theoretic Approach to Word Sense Disambiguation
Title | A Game-Theoretic Approach to Word Sense Disambiguation |
Authors | Rocco Tripodi, Marcello Pelillo |
Abstract | This paper presents a new model for word sense disambiguation formulated in terms of evolutionary game theory, where each word to be disambiguated is represented as a node on a graph whose edges represent word relations and senses are represented as classes. The words simultaneously update their class membership preferences according to the senses that neighboring words are likely to choose. We use distributional information to weigh the influence that each word has on the decisions of the others and semantic similarity information to measure the strength of compatibility among the choices. With this information we can formulate the word sense disambiguation problem as a constraint satisfaction problem and solve it using tools derived from game theory, maintaining the textual coherence. The model is based on two ideas: similar words should be assigned to similar classes and the meaning of a word does not depend on all the words in a text but just on some of them. The paper provides an in-depth motivation of the idea of modeling the word sense disambiguation problem in terms of game theory, which is illustrated by an example. The conclusion presents an extensive analysis on the combination of similarity measures to use in the framework and a comparison with state-of-the-art systems. The results show that our model outperforms state-of-the-art algorithms and can be applied to different tasks and in different scenarios. |
Tasks | Semantic Similarity, Semantic Textual Similarity, Word Sense Disambiguation |
Published | 2016-06-24 |
URL | http://arxiv.org/abs/1606.07711v4 |
http://arxiv.org/pdf/1606.07711v4.pdf | |
PWC | https://paperswithcode.com/paper/a-game-theoretic-approach-to-word-sense |
Repo | |
Framework | |
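The update described in the abstract above, where words simultaneously revise their sense preferences according to what neighboring words are likely to choose, can be illustrated with discrete-time replicator dynamics, a standard tool from evolutionary game theory. The abstract does not name the exact update rule, so treating it as replicator dynamics is an assumption, and the names `replicator_step`, `disambiguate`, `w` (word-to-word influence weights) and `z` (sense-to-sense compatibility matrices) are hypothetical. This is a toy sketch, not the authors' implementation.

```python
def replicator_step(x, w, z):
    """One simultaneous replicator-dynamics update of every word's sense distribution.

    x[i]       : probability distribution over the senses of word i
    w[i][j]    : influence weight of word j on word i (0 means no edge)
    z[i][j]    : matrix of compatibilities between senses of word i and word j
    """
    new_x = []
    for i, xi in enumerate(x):
        payoff = [0.0] * len(xi)
        for j, xj in enumerate(x):
            if i == j or w[i][j] == 0:
                continue
            for h in range(len(xi)):
                # Expected compatibility of sense h with neighbor j's current preferences
                expected = sum(z[i][j][h][k] * xj[k] for k in range(len(xj)))
                payoff[h] += w[i][j] * expected
        avg = sum(p * q for p, q in zip(xi, payoff))
        if avg > 0:
            # Senses with above-average payoff gain probability mass
            new_x.append([p * q / avg for p, q in zip(xi, payoff)])
        else:
            new_x.append(list(xi))
    return new_x


def disambiguate(x, w, z, steps=50):
    """Iterate the update a fixed number of times and return the final distributions."""
    for _ in range(steps):
        x = replicator_step(x, w, z)
    return x


# Toy instance: two words with two senses each; matching senses are most compatible.
similarity = [[1.0, 0.1], [0.1, 1.0]]
w = [[0, 1], [1, 0]]
z = [[None, similarity], [similarity, None]]
start = [[0.6, 0.4], [0.5, 0.5]]
result = disambiguate(start, w, z)
```

In this toy instance the first word starts with a slight preference for its first sense, and the second word, initially undecided, is pulled toward the compatible sense over the iterations.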
A Curriculum Learning Method for Improved Noise Robustness in Automatic Speech Recognition
Title | A Curriculum Learning Method for Improved Noise Robustness in Automatic Speech Recognition |
Authors | Stefan Braun, Daniel Neil, Shih-Chii Liu |
Abstract | The performance of automatic speech recognition systems under noisy environments still leaves room for improvement. Speech enhancement or feature enhancement techniques for increasing noise robustness of these systems usually add components to the recognition system that need careful optimization. In this work, we propose the use of a relatively simple curriculum training strategy called accordion annealing (ACCAN). It uses a multi-stage training schedule where samples at signal-to-noise ratio (SNR) values as low as 0dB are first added and samples at increasingly higher SNR values are gradually added up to an SNR value of 50dB. We also use a method called per-epoch noise mixing (PEM) that generates noisy training samples online during training and thus enables dynamically changing the SNR of our training data. Both the ACCAN and the PEM methods are evaluated on an end-to-end speech recognition pipeline on the Wall Street Journal corpus. ACCAN decreases the average word error rate (WER) on the 20dB to -10dB SNR range by up to 31.4% when compared to a conventional multi-condition training method. |
Tasks | End-To-End Speech Recognition, Speech Enhancement, Speech Recognition |
Published | 2016-06-22 |
URL | http://arxiv.org/abs/1606.06864v2 |
http://arxiv.org/pdf/1606.06864v2.pdf | |
PWC | https://paperswithcode.com/paper/a-curriculum-learning-method-for-improved |
Repo | |
Framework | |
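The two ideas in the abstract above, mixing noise into clean samples at a chosen SNR during training and widening the set of SNR levels stage by stage, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `mix_at_snr` and `accan_stages` and the 5 dB step between stages are assumptions; only the 0 dB starting point and the 50 dB end point come from the abstract.

```python
import math


def mix_at_snr(signal, noise, snr_db):
    """Add `noise` to `signal`, scaled so the mixture has the requested SNR in dB."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    # SNR_dB = 10 * log10(p_signal / (gain**2 * p_noise)), solved for gain
    gain = math.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(signal, noise)]


def accan_stages(low_db=0, high_db=50, step_db=5):
    """Yield the SNR levels available at each training stage (step size is assumed)."""
    levels = []
    for snr in range(low_db, high_db + 1, step_db):
        levels.append(snr)
        yield list(levels)


# Each stage trains on every SNR level introduced so far.
stages = list(accan_stages())

# Mixing example with short toy waveforms.
clean = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
noisy = mix_at_snr(clean, noise, snr_db=10)
```

Because the noise gain is recomputed from the target SNR on every call, the same clean sample can be remixed at a different SNR in each epoch.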
On the Identification and Mitigation of Weaknesses in the Knowledge Gradient Policy for Multi-Armed Bandits
Title | On the Identification and Mitigation of Weaknesses in the Knowledge Gradient Policy for Multi-Armed Bandits |
Authors | James Edwards, Paul Fearnhead, Kevin Glazebrook |
Abstract | The Knowledge Gradient (KG) policy was originally proposed for online ranking and selection problems but has recently been adapted for use in online decision making in general and multi-armed bandit problems (MABs) in particular. We study its use in a class of exponential family MABs and identify weaknesses, including a propensity to take actions which are dominated with respect to both exploitation and exploration. We propose variants of KG which avoid such errors. These new policies include an index heuristic which deploys a KG approach to develop an approximation to the Gittins index. A numerical study shows this policy to perform well over a range of MABs including those for which index policies are not optimal. While KG does not make dominated actions when bandits are Gaussian, it fails to be index consistent and appears not to enjoy a performance advantage over competitor policies when arms are correlated to compensate for its greater computational demands. |
Tasks | Decision Making, Multi-Armed Bandits |
Published | 2016-07-20 |
URL | http://arxiv.org/abs/1607.05970v2 |
http://arxiv.org/pdf/1607.05970v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-identification-and-mitigation-of |
Repo | |
Framework | |
Model-based Test Generation for Robotic Software: Automata versus Belief-Desire-Intention Agents
Title | Model-based Test Generation for Robotic Software: Automata versus Belief-Desire-Intention Agents |
Authors | Dejanira Araiza-Illan, Anthony G. Pipe, Kerstin Eder |
Abstract | Robotic code needs to be verified to ensure its safety and functional correctness, especially when the robot is interacting with people. Testing real code in simulation is a viable option. However, generating tests that cover rare scenarios, as well as exercising most of the code, is a challenge amplified by the complexity of the interactions between the environment and the software. Model-based test generation methods can automate otherwise manual processes and facilitate reaching rare scenarios during testing. In this paper, we compare using Belief-Desire-Intention (BDI) agents as models for test generation with more conventional automata-based techniques that exploit model checking, in terms of practicality, performance, transferability to different scenarios, and exploration (‘coverage’), through two case studies: a cooperative manufacturing task, and a home care scenario. The results highlight the advantages of using BDI agents for test generation. BDI agents naturally emulate the agency present in Human-Robot Interactions (HRIs), and are thus more expressive than automata. The performance of the BDI-based test generation is at least as high, and the achieved coverage is higher or equivalent, compared to test generation based on model checking automata. |
Tasks | |
Published | 2016-09-16 |
URL | http://arxiv.org/abs/1609.08439v2 |
http://arxiv.org/pdf/1609.08439v2.pdf | |
PWC | https://paperswithcode.com/paper/model-based-test-generation-for-robotic |
Repo | |
Framework | |
Trans-gram, Fast Cross-lingual Word-embeddings
Title | Trans-gram, Fast Cross-lingual Word-embeddings |
Authors | Jocelyn Coulmance, Jean-Marc Marty, Guillaume Wenzek, Amine Benhalloum |
Abstract | We introduce Trans-gram, a simple and computationally-efficient method to simultaneously learn and align word-embeddings for a variety of languages, using only monolingual data and a smaller set of sentence-aligned data. We use our new method to compute aligned word-embeddings for twenty-one languages using English as a pivot language. We show that some linguistic features are aligned across languages for which we do not have aligned data, even though those properties do not exist in the pivot language. We also achieve state of the art results on standard cross-lingual text classification and word translation tasks. |
Tasks | Text Classification, Word Embeddings |
Published | 2016-01-11 |
URL | http://arxiv.org/abs/1601.02502v1 |
http://arxiv.org/pdf/1601.02502v1.pdf | |
PWC | https://paperswithcode.com/paper/trans-gram-fast-cross-lingual-word-embeddings |
Repo | |
Framework | |