Paper Group ANR 499
Experimental results : Reinforcement Learning of POMDPs using Spectral Methods
Title | Experimental results : Reinforcement Learning of POMDPs using Spectral Methods |
Authors | Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar |
Abstract | We propose a new reinforcement learning algorithm for partially observable Markov decision processes (POMDPs) based on spectral decomposition methods. While spectral methods have been previously employed for consistent learning of (passive) latent variable models such as hidden Markov models, POMDPs are more challenging since the learner interacts with the environment and possibly changes the future observations in the process. We devise a learning algorithm that runs through epochs; in each epoch we employ spectral techniques to learn the POMDP parameters from a trajectory generated by a fixed policy. At the end of the epoch, an optimization oracle returns the optimal memoryless planning policy, which maximizes the expected reward based on the estimated POMDP model. We prove an order-optimal regret bound with respect to the optimal memoryless policy and efficient scaling with respect to the dimensionality of observation and action spaces. |
Tasks | Latent Variable Models |
Published | 2017-05-07 |
URL | http://arxiv.org/abs/1705.02553v1 |
http://arxiv.org/pdf/1705.02553v1.pdf | |
PWC | https://paperswithcode.com/paper/experimental-results-reinforcement-learning |
Repo | |
Framework | |
Open-World Visual Recognition Using Knowledge Graphs
Title | Open-World Visual Recognition Using Knowledge Graphs |
Authors | Vincent P. A. Lonij, Ambrish Rawat, Maria-Irina Nicolae |
Abstract | In a real-world setting, visual recognition systems may be asked to make predictions for images belonging to previously unknown class labels. In order to make semantically meaningful predictions for such inputs, we propose a two-step approach that utilizes information from knowledge graphs. First, a knowledge-graph representation is learned to embed a large set of entities into a semantic space. Second, an image representation is learned to embed images into the same space. Under this setup, we are able to predict structured properties in the form of relationship triples for any open-world image. This is true even when a set of labels has been omitted from the training protocols of both the knowledge graph and image embeddings. Furthermore, we augment this learning framework with appropriate smoothness constraints and show how prior knowledge can be incorporated into the model. Both these improvements combined increase performance for visual recognition by a factor of six compared to our baseline. Finally, we propose a new, extended dataset which we use for experiments. |
Tasks | Knowledge Graphs |
Published | 2017-08-28 |
URL | http://arxiv.org/abs/1708.08310v1 |
http://arxiv.org/pdf/1708.08310v1.pdf | |
PWC | https://paperswithcode.com/paper/open-world-visual-recognition-using-knowledge |
Repo | |
Framework | |
Building competitive direct acoustics-to-word models for English conversational speech recognition
Title | Building competitive direct acoustics-to-word models for English conversational speech recognition |
Authors | Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Michael Picheny |
Abstract | Direct acoustics-to-word (A2W) models in the end-to-end paradigm have received increasing attention compared to conventional sub-word based automatic speech recognition models using phones, characters, or context-dependent hidden Markov model states. This is because A2W models recognize words from speech without any decoder, pronunciation lexicon, or externally-trained language model, making training and decoding with such models simple. Prior work has shown that A2W models require orders of magnitude more training data in order to perform comparably to conventional models. Our work also showed this accuracy gap when using the English Switchboard-Fisher data set. This paper describes a recipe to train an A2W model that closes this gap and is at-par with state-of-the-art sub-word based models. We achieve a word error rate of 8.8%/13.9% on the Hub5-2000 Switchboard/CallHome test sets without any decoder or language model. We find that model initialization, training data order, and regularization have the most impact on the A2W model performance. Next, we present a joint word-character A2W model that learns to first spell the word and then recognize it. This model provides a rich output to the user instead of simple word hypotheses, making it especially useful in the case of words unseen or rarely-seen during training. |
Tasks | English Conversational Speech Recognition, Language Modelling, Speech Recognition |
Published | 2017-12-08 |
URL | http://arxiv.org/abs/1712.03133v1 |
http://arxiv.org/pdf/1712.03133v1.pdf | |
PWC | https://paperswithcode.com/paper/building-competitive-direct-acoustics-to-word |
Repo | |
Framework | |
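The word error rates quoted in the abstract above (8.8%/13.9%) follow the standard definition: the number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. A minimal sketch of that metric, assuming whitespace tokenization and no text normalization (the function name `word_error_rate` is illustrative and not taken from the paper):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Published evaluations typically apply scoring conventions (text normalization, handling of hesitations) before counting errors, so this sketch would not reproduce benchmark numbers as-is.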
Learning to Play Othello with Deep Neural Networks
Title | Learning to Play Othello with Deep Neural Networks |
Authors | Paweł Liskowski, Wojciech Jaśkowski, Krzysztof Krawiec |
Abstract | Achieving superhuman playing level by AlphaGo corroborated the capabilities of convolutional neural architectures (CNNs) for capturing complex spatial patterns. This result was to a great extent due to several analogies between Go board states and 2D images CNNs have been designed for, in particular translational invariance and a relatively large board. In this paper, we verify whether CNN-based move predictors prove effective for Othello, a game with significantly different characteristics, including a much smaller board size and complete lack of translational invariance. We compare several CNN architectures and board encodings, augment them with state-of-the-art extensions, train on an extensive database of experts’ moves, and examine them with respect to move prediction accuracy and playing strength. The empirical evaluation confirms high capabilities of neural move predictors and suggests a strong correlation between prediction accuracy and playing strength. The best CNNs not only surpass all other 1-ply Othello players proposed to date but defeat (2-ply) Edax, the best open-source Othello player. |
Tasks | |
Published | 2017-11-17 |
URL | http://arxiv.org/abs/1711.06583v1 |
http://arxiv.org/pdf/1711.06583v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-play-othello-with-deep-neural |
Repo | |
Framework | |
Direct Acoustics-to-Word Models for English Conversational Speech Recognition
Title | Direct Acoustics-to-Word Models for English Conversational Speech Recognition |
Authors | Kartik Audhkhasi, Bhuvana Ramabhadran, George Saon, Michael Picheny, David Nahamoo |
Abstract | Recent work on end-to-end automatic speech recognition (ASR) has shown that the connectionist temporal classification (CTC) loss can be used to convert acoustics to phone or character sequences. Such systems are used with a dictionary and separately-trained Language Model (LM) to produce word sequences. However, they are not truly end-to-end in the sense of mapping acoustics directly to words without an intermediate phone representation. In this paper, we present the first results employing direct acoustics-to-word CTC models on two well-known public benchmark tasks: Switchboard and CallHome. These models do not require an LM or even a decoder at run-time and hence recognize speech with minimal complexity. However, due to the large number of word output units, CTC word models require orders of magnitude more data to train reliably compared to traditional systems. We present some techniques to mitigate this issue. Our CTC word model achieves a word error rate of 13.0%/18.8% on the Hub5-2000 Switchboard/CallHome test sets without any LM or decoder compared with 9.6%/16.0% for phone-based CTC with a 4-gram LM. We also present rescoring results on CTC word model lattices to quantify the performance benefits of a LM, and contrast the performance of word and phone CTC models. |
Tasks | English Conversational Speech Recognition, Language Modelling, Speech Recognition |
Published | 2017-03-22 |
URL | http://arxiv.org/abs/1703.07754v1 |
http://arxiv.org/pdf/1703.07754v1.pdf | |
PWC | https://paperswithcode.com/paper/direct-acoustics-to-word-models-for-english |
Repo | |
Framework | |
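The abstract above states that the word-level CTC models need no LM or decoder at run-time. One common way to read out a CTC model without a decoder is best-path (greedy) decoding: take the most likely unit per frame, collapse consecutive repeats, and drop blanks. The sketch below illustrates that generic procedure; the `BLANK` token, the toy vocabulary layout, and the function name are assumptions for illustration, not details from the paper.

```python
from itertools import groupby

BLANK = "<blank>"  # hypothetical name for the CTC blank symbol


def ctc_greedy_decode(frame_posteriors, vocab):
    """Best-path CTC decoding over per-frame posteriors.

    frame_posteriors: list of per-frame score lists, one score per vocab entry.
    vocab: list of output units (words here), including the blank symbol.
    """
    # 1. Pick the highest-scoring unit index for every frame.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_posteriors
    ]
    # 2. Collapse runs of identical consecutive indices into one.
    collapsed = [index for index, _ in groupby(best_path)]
    # 3. Remove blanks; what remains is the word sequence.
    return [vocab[index] for index in collapsed if vocab[index] != BLANK]
```

Because repeats are collapsed before blanks are removed, a genuinely repeated word is only recovered when a blank frame separates its two occurrences.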
Approximate Bayes learning of stochastic differential equations
Title | Approximate Bayes learning of stochastic differential equations |
Authors | Philipp Batz, Andreas Ruttor, Manfred Opper |
Abstract | We introduce a nonparametric approach for estimating drift and diffusion functions in systems of stochastic differential equations from observations of the state vector. Gaussian processes are used as flexible models for these functions and estimates are calculated directly from dense data sets using Gaussian process regression. We also develop an approximate expectation maximization algorithm to deal with the unobserved, latent dynamics between sparse observations. The posterior over states is approximated by a piecewise linearized process of the Ornstein-Uhlenbeck type and the maximum a posteriori estimation of the drift is facilitated by a sparse Gaussian process approximation. |
Tasks | Gaussian Processes |
Published | 2017-02-17 |
URL | http://arxiv.org/abs/1702.05390v1 |
http://arxiv.org/pdf/1702.05390v1.pdf | |
PWC | https://paperswithcode.com/paper/approximate-bayes-learning-of-stochastic |
Repo | |
Framework | |
Transfer entropy-based feedback improves performance in artificial neural networks
Title | Transfer entropy-based feedback improves performance in artificial neural networks |
Authors | Sebastian Herzog, Christian Tetzlaff, Florentin Wörgötter |
Abstract | The structure of the majority of modern deep neural networks is characterized by unidirectional feed-forward connectivity across a very large number of layers. By contrast, the architecture of the cortex of vertebrates contains fewer hierarchical levels but many recurrent and feedback connections. Here we show that a small, few-layer artificial neural network that employs feedback will reach top level performance on a standard benchmark task, otherwise only obtained by large feed-forward structures. To achieve this we use feed-forward transfer entropy between neurons to structure feedback connectivity. Transfer entropy can here intuitively be understood as a measure for the relevance of certain pathways in the network, which are then amplified by feedback. Feedback may therefore be key for high network performance in small brain-like architectures. |
Tasks | |
Published | 2017-06-13 |
URL | http://arxiv.org/abs/1706.04265v2 |
http://arxiv.org/pdf/1706.04265v2.pdf | |
PWC | https://paperswithcode.com/paper/transfer-entropy-based-feedback-improves |
Repo | |
Framework | |
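Transfer entropy, which the abstract above uses to structure feedback connectivity, measures how much knowing a source signal reduces uncertainty about the next value of a target signal beyond what the target's own past already explains. With history length 1 it is TE(X→Y) = Σ p(y_{t+1}, y_t, x_t) · log2[ p(y_{t+1} | y_t, x_t) / p(y_{t+1} | y_t) ]. The sketch below is a plug-in estimate for discrete sequences under that history-length-1 assumption; how the paper discretizes neuron activations and chooses histories is not stated in the abstract, so those choices here are illustrative.

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1."""
    if len(source) != len(target) or len(source) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")

    # Count occurrences of (y_next, y_now, x_now) over the whole sequence.
    triples = Counter(
        (target[t + 1], target[t], source[t]) for t in range(len(target) - 1)
    )
    total = sum(triples.values())

    # Marginal counts needed for the two conditional probabilities.
    count_y_x = Counter()      # (y_now, x_now)
    count_ynext_y = Counter()  # (y_next, y_now)
    count_y = Counter()        # y_now
    for (y_next, y_now, x_now), count in triples.items():
        count_y_x[(y_now, x_now)] += count
        count_ynext_y[(y_next, y_now)] += count
        count_y[y_now] += count

    entropy = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / total
        p_given_both = count / count_y_x[(y_now, x_now)]
        p_given_self = count_ynext_y[(y_next, y_now)] / count_y[y_now]
        entropy += p_joint * log2(p_given_both / p_given_self)
    return entropy
```

Plug-in estimates of this kind are biased upward on short sequences, so values near zero should be read with care.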
Conversion Rate Optimization through Evolutionary Computation
Title | Conversion Rate Optimization through Evolutionary Computation |
Authors | Risto Miikkulainen, Neil Iscoe, Aaron Shagrin, Ron Cordell, Sam Nazari, Cory Schoolland, Myles Brundage, Jonathan Epstein, Randy Dean, Gurmeet Lamba |
Abstract | Conversion optimization means designing a web interface so that as many users as possible take a desired action on it, such as register or purchase. Such design is usually done by hand, testing one change at a time through A/B testing, or a limited number of combinations through multivariate testing, making it possible to evaluate only a small fraction of designs in a vast design space. This paper describes Sentient Ascend, an automatic conversion optimization system that uses evolutionary optimization to create effective web interface designs. Ascend makes it possible to discover and utilize interactions between the design elements that are difficult to identify otherwise. Moreover, evaluation of design candidates is done in parallel online, i.e. with a large number of real users interacting with the system. A case study on an existing media site shows that significant improvements (i.e. over 43%) are possible beyond human design. Ascend can therefore be seen as an approach to massively multivariate conversion optimization, based on a massively parallel interactive evolution. |
Tasks | |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00556v4 |
http://arxiv.org/pdf/1703.00556v4.pdf | |
PWC | https://paperswithcode.com/paper/conversion-rate-optimization-through |
Repo | |
Framework | |
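The evolutionary search over web designs described in the abstract above can be illustrated with a small genetic algorithm: each design is a tuple with one choice per page element, and selection, crossover, and mutation produce new candidates. This is a generic sketch, not Sentient Ascend's actual method. The `fitness` callable is a hypothetical stand-in for the conversion rate that the real system measures on live users, and every parameter value is an assumption.

```python
import random


def evolve_designs(options, fitness, generations=20, population_size=12,
                   mutation_rate=0.1, seed=0):
    """Evolve a design (one choice per element) that maximizes `fitness`.

    options: list of lists; options[i] holds the allowed values for element i.
    fitness: callable scoring a design tuple (stand-in for measured conversions).
    """
    rng = random.Random(seed)

    def random_design():
        return tuple(rng.choice(choices) for choices in options)

    population = [random_design() for _ in range(population_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            first, second = rng.sample(parents, 2)
            # Uniform crossover: each element comes from either parent.
            child = tuple(rng.choice(pair) for pair in zip(first, second))
            # Mutation: occasionally resample an element from its options.
            child = tuple(
                rng.choice(choices) if rng.random() < mutation_rate else gene
                for gene, choices in zip(child, options)
            )
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```

Because crossover mixes elements from two designs, combinations that only pay off together can spread through the population, which is the kind of interaction between design elements that one-change-at-a-time A/B testing cannot surface.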
High Dimensional Multi-Level Covariance Estimation and Kriging
Title | High Dimensional Multi-Level Covariance Estimation and Kriging |
Authors | Julio E. Castrillon-Candas |
Abstract | With the advent of big data sets much of the computational science and engineering communities have been moving toward data-driven approaches to regression and classification. However, they present a significant challenge due to the increasing size, complexity and dimensionality of the problems. In this paper a multi-level kriging method that scales well with dimensions is developed. A multi-level basis is constructed that is adapted to a random projection tree (or kD-tree) partitioning of the observations and a sparse grid approximation. This approach identifies the high dimensional underlying phenomena from the noise in an accurate and numerically stable manner. Furthermore, numerically unstable covariance matrices are transformed into well conditioned multi-level matrices without compromising accuracy. A-posteriori error estimates are derived, such as the sub-exponential decay of the coefficients of the multi-level covariance matrix. The multi-level method is tested on numerically unstable problems of up to 50 dimensions. Accurate solutions with feasible computational cost are obtained. |
Tasks | |
Published | 2017-01-01 |
URL | http://arxiv.org/abs/1701.00285v1 |
http://arxiv.org/pdf/1701.00285v1.pdf | |
PWC | https://paperswithcode.com/paper/high-dimensional-multi-level-covariance |
Repo | |
Framework | |
Towards Monetary Incentives in Social Q&A Services
Title | Towards Monetary Incentives in Social Q&A Services |
Authors | Steve T. K. Jan, Chun Wang, Qing Zhang, Gang Wang |
Abstract | Community-based question answering (CQA) services are facing key challenges to motivate domain experts to provide timely answers. Recently, CQA services are exploring new incentive models to engage experts and celebrities by allowing them to set a price on their answers. In this paper, we perform a data-driven analysis on two emerging payment-based CQA systems: Fenda (China) and Whale (US). By analyzing a large dataset of 220K questions (worth 1 million USD collectively), we examine how monetary incentives affect different players in the system. We find that, while monetary incentive enables quick answers from experts, it also drives certain users to aggressively game the system for profits. In addition, in this supplier-driven marketplace, users need to proactively adjust their price to make profits. Famous people are unwilling to lower their price, which in turn hurts their income and engagement over time. Finally, we discuss the key implications to future CQA design. |
Tasks | Question Answering |
Published | 2017-03-03 |
URL | http://arxiv.org/abs/1703.01333v2 |
http://arxiv.org/pdf/1703.01333v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-monetary-incentives-in-social-qa |
Repo | |
Framework | |
Deep Learning in Multiple Multistep Time Series Prediction
Title | Deep Learning in Multiple Multistep Time Series Prediction |
Authors | Chuanyun Zang |
Abstract | This project investigates combining deep learning, specifically Long Short-Term Memory (LSTM) networks, with basic statistics for multiple multistep time series prediction. The LSTM can draw on all pages to learn the general trends of variation at a large scale, while well-selected medians for each page preserve the particular seasonality of different pages, so that the predicted trend does not deviate too far from reality. A recent Kaggle competition on 145K Web Traffic Time Series Forecasting [1] is used to thoroughly illustrate and test this idea. |
Tasks | Time Series, Time Series Forecasting, Time Series Prediction |
Published | 2017-10-12 |
URL | http://arxiv.org/abs/1710.04373v1 |
http://arxiv.org/pdf/1710.04373v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-in-multiple-multistep-time |
Repo | |
Framework | |
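The per-page median component mentioned in the abstract above can be sketched as a simple baseline: forecast every future step of a page as the median of its most recent observations. The window length, the function names, and the dictionary layout are assumptions for illustration; the abstract does not say how the medians are selected or how they are combined with the LSTM output.

```python
from statistics import median


def median_forecast(history, horizon, window=7):
    """Forecast `horizon` steps as the median of the last `window` values."""
    if not history:
        raise ValueError("history must not be empty")
    level = median(history[-window:])
    return [level] * horizon


def forecast_all_pages(traffic_by_page, horizon, window=7):
    """Apply the median baseline independently to every page's series."""
    return {
        page: median_forecast(series, horizon, window)
        for page, series in traffic_by_page.items()
    }
```

A median is robust to isolated traffic spikes, which is one reason it makes a stable anchor for each page's forecast.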
Multimodal Affect Analysis for Product Feedback Assessment
Title | Multimodal Affect Analysis for Product Feedback Assessment |
Authors | Amol S Patwardhan, Gerald M Knapp |
Abstract | Consumers often react expressively to products such as food samples, perfume, jewelry, sunglasses, and clothing accessories. This research discusses a multimodal affect recognition system developed to classify whether a consumer likes or dislikes a product tested at a counter or kiosk, by analyzing the consumer’s facial expression, body posture, hand gestures, and voice after testing the product. A depth-capable camera and microphone system - Kinect for Windows - is utilized. An emotion identification engine has been developed to analyze the images and voice to determine affective state of the customer. The image is segmented using skin color and adaptive threshold. Face, body and hands are detected using the Haar cascade classifier. Canny edges are identified and the lip, body and hand contours are extracted using spatial filtering. Edge count and orientation around the mouth, cheeks, eyes, shoulders, fingers and the location of the edges are used as features. Classification is done by an emotion template mapping algorithm and training a classifier using support vector machines. The real-time performance, accuracy and feasibility for multimodal affect recognition in feedback assessment are evaluated. |
Tasks | |
Published | 2017-05-07 |
URL | http://arxiv.org/abs/1705.02694v1 |
http://arxiv.org/pdf/1705.02694v1.pdf | |
PWC | https://paperswithcode.com/paper/multimodal-affect-analysis-for-product |
Repo | |
Framework | |
Empirical Bayes Matrix Completion
Title | Empirical Bayes Matrix Completion |
Authors | Takeru Matsuda, Fumiyasu Komaki |
Abstract | We develop an empirical Bayes (EB) algorithm for the matrix completion problems. The EB algorithm is motivated from the singular value shrinkage estimator for matrix means by Efron and Morris (1972). Since the EB algorithm is essentially the EM algorithm applied to a simple model, it does not require heuristic parameter tuning other than tolerance. Numerical results demonstrated that the EB algorithm achieves a good trade-off between accuracy and efficiency compared to existing algorithms and that it works particularly well when the difference between the number of rows and columns is large. Application to real data also shows the practical utility of the EB algorithm. |
Tasks | Matrix Completion |
Published | 2017-06-05 |
URL | http://arxiv.org/abs/1706.01252v2 |
http://arxiv.org/pdf/1706.01252v2.pdf | |
PWC | https://paperswithcode.com/paper/empirical-bayes-matrix-completion |
Repo | |
Framework | |
Stochastic Newton and Quasi-Newton Methods for Large Linear Least-squares Problems
Title | Stochastic Newton and Quasi-Newton Methods for Large Linear Least-squares Problems |
Authors | Julianne Chung, Matthias Chung, J. Tanner Slagel, Luis Tenorio |
Abstract | We describe stochastic Newton and stochastic quasi-Newton approaches to efficiently solve large linear least-squares problems where the very large data sets present a significant computational burden (e.g., the size may exceed computer memory or data are collected in real-time). In our proposed framework, stochasticity is introduced in two different frameworks as a means to overcome these computational limitations, and probability distributions that can exploit structure and/or sparsity are considered. Theoretical results on consistency of the approximations for both the stochastic Newton and the stochastic quasi-Newton methods are provided. The results show, in particular, that stochastic Newton iterates, in contrast to stochastic quasi-Newton iterates, may not converge to the desired least-squares solution. Numerical examples, including an example from extreme learning machines, demonstrate the potential applications of these methods. |
Tasks | |
Published | 2017-02-23 |
URL | http://arxiv.org/abs/1702.07367v1 |
http://arxiv.org/pdf/1702.07367v1.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-newton-and-quasi-newton-methods |
Repo | |
Framework | |
Visualizing and Improving Scattering Networks
Title | Visualizing and Improving Scattering Networks |
Authors | Fergal Cotter, Nick Kingsbury |
Abstract | Scattering Transforms (or ScatterNets) introduced by Mallat are a promising start into creating a well-defined feature extractor to use for pattern recognition and image classification tasks. They are of particular interest due to their architectural similarity to Convolutional Neural Networks (CNNs), while requiring no parameter learning and still performing very well (particularly in constrained classification tasks). In this paper we visualize what the deeper layers of a ScatterNet are sensitive to using a ‘DeScatterNet’. We show that the higher orders of ScatterNets are sensitive to complex, edge-like patterns (checker-boards and rippled edges). These complex patterns may be useful for texture classification, but are quite dissimilar from the patterns visualized in second and third layers of Convolutional Neural Networks (CNNs) - the current state of the art Image Classifiers. We propose that this may be the source of the current gaps in performance between ScatterNets and CNNs (83% vs 93% on CIFAR-10 for ScatterNet+SVM vs ResNet). We then use these visualization tools to propose possible enhancements to the ScatterNet design, which show they have the power to extract features more closely resembling CNNs, while still being well-defined and having the invariance properties fundamental to ScatterNets. |
Tasks | Image Classification, Texture Classification |
Published | 2017-09-05 |
URL | http://arxiv.org/abs/1709.01355v1 |
http://arxiv.org/pdf/1709.01355v1.pdf | |
PWC | https://paperswithcode.com/paper/visualizing-and-improving-scattering-networks |
Repo | |
Framework | |