May 6, 2019

Paper Group ANR 257

Statistical Properties of European Languages and Voynich Manuscript Analysis

Title Statistical Properties of European Languages and Voynich Manuscript Analysis
Authors Andronik Arutyunov, Leonid Borisov, Sergey Fedorov, Anastasiya Ivchenko, Elizabeth Kirina-Lilinskaya, Yurii Orlov, Konstantin Osminin, Sergey Shilin, Dmitriy Zeniuk
Abstract The statistical properties of letter frequencies in European literary texts are investigated. The logarithmic dependence of letter sequences is examined for one-language and two-language texts. A pair of languages is suggested for the Voynich Manuscript. The internal structure of the Manuscript is considered. Spectral portraits of two-letter distributions are constructed.
Tasks
Published 2016-11-18
URL http://arxiv.org/abs/1611.09122v1
PDF http://arxiv.org/pdf/1611.09122v1.pdf
PWC https://paperswithcode.com/paper/statistical-properties-of-european-languages
Repo
Framework
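
The abstract is terse, so a small, hypothetical sketch may help: computing relative letter frequencies (whose decay with rank is the kind of logarithmic dependence the paper examines) and a normalized two-letter distribution matrix, the raw material for a “spectral portrait”. The corpus string and alphabet here are placeholders, not the authors’ setup.

```python
import numpy as np
from collections import Counter

def letter_stats(text, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Relative letter frequencies and a normalized bigram matrix."""
    chars = [c for c in text.lower() if c in alphabet]
    freqs = Counter(chars)
    total = sum(freqs.values())
    # Rank-frequency pairs; the decay of frequency with rank is the
    # kind of logarithmic dependence the paper examines.
    ranked = sorted(((n / total, c) for c, n in freqs.items()), reverse=True)
    idx = {c: i for i, c in enumerate(alphabet)}
    bigrams = np.zeros((len(alphabet), len(alphabet)))
    for a, b in zip(chars, chars[1:]):
        bigrams[idx[a], idx[b]] += 1
    return ranked, bigrams / max(bigrams.sum(), 1.0)

ranked, B = letter_stats("sample text for a frequency analysis " * 50)
print(ranked[:5])   # most frequent letters with relative frequencies
print(B.shape)      # 26 x 26 two-letter distribution
```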

The Computational Power of Dynamic Bayesian Networks

Title The Computational Power of Dynamic Bayesian Networks
Authors Joshua Brulé
Abstract This paper considers the computational power of constant size, dynamic Bayesian networks. Although discrete dynamic Bayesian networks are no more powerful than hidden Markov models, dynamic Bayesian networks with continuous random variables and discrete children of continuous parents are capable of performing Turing-complete computation. With modified versions of existing algorithms for belief propagation, such a simulation can be carried out in real time. This result suggests that dynamic Bayesian networks may be more powerful than previously considered. Relationships to causal models and recurrent neural networks are also discussed.
Tasks
Published 2016-03-19
URL http://arxiv.org/abs/1603.06125v1
PDF http://arxiv.org/pdf/1603.06125v1.pdf
PWC https://paperswithcode.com/paper/the-computational-power-of-dynamic-bayesian
Repo
Framework
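
For a concrete reference point on the first claim (discrete DBNs are no more powerful than hidden Markov models), here is the standard HMM forward-filtering update that discrete-state DBN inference reduces to; the 2-state matrices are purely illustrative.

```python
import numpy as np

# Illustrative 2-state HMM: transition T, emission E, prior pi.
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # T[i, j] = P(x_t = j | x_{t-1} = i)
E = np.array([[0.7, 0.3],
              [0.1, 0.9]])          # E[i, k] = P(obs = k | x_t = i)
pi = np.array([0.5, 0.5])

def forward_filter(obs):
    """P(x_t | obs_1..t): the inference a discrete DBN slice performs."""
    belief = pi.copy()
    for o in obs:
        belief = (belief @ T) * E[:, o]   # predict, then correct
        belief /= belief.sum()
    return belief

print(forward_filter([0, 0, 1, 1, 1]))
```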

On the Existence of a Projective Reconstruction

Title On the Existence of a Projective Reconstruction
Authors Hon-Leung Lee
Abstract In this note we study the connection between the existence of a projective reconstruction and the existence of a fundamental matrix satisfying the epipolar constraints.
Tasks
Published 2016-08-19
URL http://arxiv.org/abs/1608.05518v1
PDF http://arxiv.org/pdf/1608.05518v1.pdf
PWC https://paperswithcode.com/paper/on-the-existence-of-a-projective
Repo
Framework
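
The epipolar constraint in question is x2^T F x1 = 0 for corresponding image points x1, x2. A minimal sketch of checking it, with a toy rank-2 matrix standing in for a real fundamental matrix:

```python
import numpy as np

def epipolar_residuals(F, x1, x2):
    """Residuals of the epipolar constraint x2^T F x1 = 0.

    x1, x2: (n, 2) arrays of matched image points. A projective
    reconstruction can exist only if some fundamental matrix F
    makes these residuals (near) zero.
    """
    h1 = np.hstack([x1, np.ones((len(x1), 1))])  # homogeneous coords
    h2 = np.hstack([x2, np.ones((len(x2), 1))])
    return np.einsum("ni,ij,nj->n", h2, F, h1)

F = np.array([[0., -1., 2.], [1., 0., -3.], [-2., 3., 0.]])  # toy rank-2 F
x1 = np.random.rand(5, 2)
x2 = np.random.rand(5, 2)
print(epipolar_residuals(F, x1, x2))
```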

Primal-Dual Rates and Certificates

Title Primal-Dual Rates and Certificates
Authors Celestine Dünner, Simone Forte, Martin Takáč, Martin Jaggi
Abstract We propose an algorithm-independent framework to equip existing optimization methods with primal-dual certificates. Such certificates and corresponding rate of convergence guarantees are important for practitioners to diagnose progress, in particular in machine learning applications. We obtain new primal-dual convergence rates, e.g., for the Lasso as well as many L1, Elastic Net, group Lasso and TV-regularized problems. The theory applies to any norm-regularized generalized linear model. Our approach provides efficiently computable duality gaps which are globally defined, without modifying the original problems in the region of interest.
Tasks
Published 2016-02-16
URL http://arxiv.org/abs/1602.05205v2
PDF http://arxiv.org/pdf/1602.05205v2.pdf
PWC https://paperswithcode.com/paper/primal-dual-rates-and-certificates
Repo
Framework
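
As an illustration of a globally defined, efficiently computable duality gap, here is the standard construction for the Lasso: the residual is rescaled into a dual-feasible point, and the primal-dual difference certifies suboptimality at any iterate. This is a textbook instance, not necessarily the paper’s exact construction.

```python
import numpy as np

def lasso_duality_gap(A, b, x, lam):
    """Duality gap for min_x 0.5*||Ax - b||^2 + lam*||x||_1.

    The residual is rescaled to satisfy the dual constraint
    ||A^T w||_inf <= lam, giving a valid certificate at any x.
    """
    r = b - A @ x
    primal = 0.5 * r @ r + lam * np.abs(x).sum()
    scale = min(1.0, lam / max(np.abs(A.T @ r).max(), 1e-12))
    w = scale * r                      # dual-feasible point
    dual = w @ b - 0.5 * w @ w         # Fenchel dual objective
    return primal - dual               # upper-bounds the suboptimality

rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 20)), rng.standard_normal(50)
print(lasso_duality_gap(A, b, np.zeros(20), lam=0.1))
```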

Learning to predict where to look in interactive environments using deep recurrent q-learning

Title Learning to predict where to look in interactive environments using deep recurrent q-learning
Authors Sajad Mousavi, Michael Schukat, Enda Howley, Ali Borji, Nasser Mozayani
Abstract Bottom-Up (BU) saliency models do not perform well in complex interactive environments where humans are actively engaged in tasks (e.g., making a sandwich or playing a video game). In this paper, we leverage Reinforcement Learning (RL) to highlight task-relevant locations of input frames. We propose a soft attention mechanism combined with the Deep Q-Network (DQN) model to teach an RL agent how to play a game and where to look by focusing on the most pertinent parts of its visual input. Our evaluations on several Atari 2600 games show that the soft-attention-based model predicts fixation locations significantly better than bottom-up models such as the Itti-Koch saliency and Graph-Based Visual Saliency (GBVS) models.
Tasks Atari Games, Q-Learning
Published 2016-12-17
URL http://arxiv.org/abs/1612.05753v2
PDF http://arxiv.org/pdf/1612.05753v2.pdf
PWC https://paperswithcode.com/paper/learning-to-predict-where-to-look-in
Repo
Framework
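
A minimal numpy sketch of a soft-attention readout over a grid of convolutional features, the kind of mechanism described above; the query vector standing in for the recurrent Q-network state is a hypothetical choice, not the authors’ architecture.

```python
import numpy as np

def soft_attention_readout(features, query):
    """Soft attention over an (H*W, d) grid of visual features.

    Scores each location against a query vector, softmaxes into a
    fixation-like distribution, and returns the attention-weighted
    context vector together with the attention map itself.
    """
    scores = features @ query                  # (H*W,)
    scores -= scores.max()                     # numerical stability
    attn = np.exp(scores) / np.exp(scores).sum()
    context = attn @ features                  # (d,) weighted summary
    return context, attn                       # attn ~ predicted gaze map

feats = np.random.rand(7 * 7, 64)              # e.g., a 7x7 conv grid
ctx, attn = soft_attention_readout(feats, np.random.rand(64))
print(attn.reshape(7, 7).round(3))
```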

Nonsymbolic Text Representation

Title Nonsymbolic Text Representation
Authors Hinrich Schuetze, Heike Adel, Ehsaneddin Asgari
Abstract We introduce the first generic text representation model that is completely nonsymbolic, i.e., it does not require the availability of a segmentation or tokenization method that attempts to identify words or other symbolic units in text. This applies to training the parameters of the model on a training corpus as well as to applying it when computing the representation of a new text. We show that our model performs better than prior work on an information extraction and a text denoising task.
Tasks Denoising, Tokenization
Published 2016-10-03
URL http://arxiv.org/abs/1610.00479v3
PDF http://arxiv.org/pdf/1610.00479v3.pdf
PWC https://paperswithcode.com/paper/nonsymbolic-text-representation
Repo
Framework
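
One simple way to build a text representation with no segmentation or tokenization step is to hash raw character n-grams; this sketch is an illustrative stand-in for the idea, not the paper’s actual model.

```python
import numpy as np
from zlib import crc32

def char_ngram_vector(text, n_values=(3, 4), dim=2**12):
    """Tokenization-free text vector from hashed character n-grams.

    No word segmentation is assumed: the raw string, spaces included,
    is slid over directly, in the spirit of a nonsymbolic representation.
    """
    v = np.zeros(dim)
    for n in n_values:
        for i in range(len(text) - n + 1):
            v[crc32(text[i:i + n].encode()) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

a = char_ngram_vector("the quick brown fox")
b = char_ngram_vector("the quick brown foxes")
print(float(a @ b))  # high cosine similarity despite the spelling change
```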

Faster variational inducing input Gaussian process classification

Title Faster variational inducing input Gaussian process classification
Authors Pavel Izmailov, Dmitry Kropotov
Abstract Gaussian processes (GPs) provide a prior over functions and allow finding complex regularities in data. Gaussian processes are successfully used for classification/regression problems and dimensionality reduction. In this work we consider the classification problem only. The complexity of standard methods for GP classification scales cubically with the size of the training dataset, which makes them inapplicable to big data problems. Therefore, a variety of methods have been introduced to overcome this limitation. In this paper we focus on methods based on so-called inducing inputs. This approach relies on variational inference and a particular lower bound on the marginal likelihood (evidence). The bound is maximized w.r.t. the parameters of the kernel function of the Gaussian process, thus fitting the model to data. The computational complexity of this method is $O(nm^2)$, where $m$ is the number of inducing inputs used by the model and is assumed to be substantially smaller than the size of the dataset $n$. Recently, a new evidence lower bound for GP classification was introduced. It allows stochastic optimization, which makes it suitable for big data problems. However, the new lower bound depends on $O(m^2)$ variational parameters, which makes optimization challenging for large $m$. In this work we develop a new approach for training inducing-input GP models for classification problems. We use a quadratic approximation of several terms in the aforementioned evidence lower bound, obtaining analytical expressions for the optimal values of most of the parameters, thus substantially reducing the dimension of the optimization space. In our experiments we achieve results as good as or better than those of the existing method. Moreover, our method does not require the user to manually set the learning rate, making it more practical than the existing method.
Tasks Dimensionality Reduction, Gaussian Processes, Stochastic Optimization
Published 2016-11-18
URL http://arxiv.org/abs/1611.06132v1
PDF http://arxiv.org/pdf/1611.06132v1.pdf
PWC https://paperswithcode.com/paper/faster-variational-inducing-input-gaussian
Repo
Framework
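
A sketch of the kernel algebra behind the $O(nm^2)$ complexity mentioned above: with $m$ inducing inputs, the full $n \times n$ kernel matrix is never formed; only the $m \times m$ and $n \times m$ blocks are, and the Cholesky solve costs $O(nm^2)$. The RBF kernel and sizes are illustrative.

```python
import numpy as np

def rbf(X, Y, ell=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def nystrom_terms(X, Z):
    """Kernel blocks behind the O(n m^2) inducing-input bound.

    Z holds the m inducing inputs; Qnn = Knm Kmm^{-1} Kmn approximates
    the full n x n kernel matrix without ever forming it.
    """
    Kmm = rbf(Z, Z) + 1e-8 * np.eye(len(Z))   # jitter for stability
    Knm = rbf(X, Z)
    L = np.linalg.cholesky(Kmm)
    V = np.linalg.solve(L, Knm.T)             # m x n, costs O(n m^2)
    qnn_diag = (V * V).sum(0)                 # diag of the approximation
    return Kmm, Knm, qnn_diag

X = np.random.rand(100, 2)
Z = np.random.rand(10, 2)                     # m = 10 inducing inputs
_, _, q = nystrom_terms(X, Z)
print(q[:5])  # compare against the true prior variance of 1.0
```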

Deep Variational Canonical Correlation Analysis

Title Deep Variational Canonical Correlation Analysis
Authors Weiran Wang, Xinchen Yan, Honglak Lee, Karen Livescu
Abstract We present deep variational canonical correlation analysis (VCCA), a deep multi-view learning model that extends the latent variable model interpretation of linear CCA to nonlinear observation models parameterized by deep neural networks. We derive variational lower bounds of the data likelihood by parameterizing the posterior probability of the latent variables from the view that is available at test time. We also propose a variant of VCCA called VCCA-private that can, in addition to the “common variables” underlying both views, extract the “private variables” within each view, disentangling the shared and private information for multi-view data without hard supervision. Experimental results on real-world datasets show that our methods are competitive across domains.
Tasks Multi-View Learning
Published 2016-10-11
URL http://arxiv.org/abs/1610.03454v3
PDF http://arxiv.org/pdf/1610.03454v3.pdf
PWC https://paperswithcode.com/paper/deep-variational-canonical-correlation
Repo
Framework
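
VCCA’s bound is VAE-style: its regularizer is the closed-form KL divergence between a diagonal-Gaussian posterior and a standard-normal prior, with one reconstruction term per view added on top. A minimal sketch of that KL term (the reconstruction terms are omitted):

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), per data point.

    This is the regularizer in VAE-style bounds such as VCCA's;
    the full bound adds a reconstruction term for each view.
    """
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

mu = np.random.randn(4, 8) * 0.1   # batch of 4, latent dim 8
logvar = np.zeros((4, 8))          # unit posterior variances
print(gaussian_kl(mu, logvar))
```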

LSTM with Working Memory

Title LSTM with Working Memory
Authors Andrew Pulver, Siwei Lyu
Abstract Previous RNN architectures have largely been superseded by LSTM, or “Long Short-Term Memory”. Since its introduction, there have been many variations on this simple design. However, it is still widely used, and we are not aware of a gated-RNN architecture that outperforms LSTM in a broad sense while remaining as simple and efficient. In this paper we propose a modified LSTM-like architecture. Our architecture is still simple and achieves better performance on the tasks that we tested it on. We also introduce a new RNN performance benchmark that uses handwritten digits and stresses several important network capabilities.
Tasks
Published 2016-05-06
URL http://arxiv.org/abs/1605.01988v3
PDF http://arxiv.org/pdf/1605.01988v3.pdf
PWC https://paperswithcode.com/paper/lstm-with-working-memory
Repo
Framework
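
For context, this is the standard LSTM cell update that architectures like the one above modify; the gate stacking and sizes are an illustrative convention.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One step of the standard LSTM cell.

    W: (4d, input_dim), U: (4d, d), b: (4d,). Gates are stacked as
    input, forget, output, candidate.
    """
    z = W @ x + U @ h + b
    i, f, o, g = np.split(z, 4)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # memory update
    h_new = sigmoid(o) * np.tanh(c_new)                # gated output
    return h_new, c_new

d, n_in = 8, 4
W = np.random.randn(4 * d, n_in) * 0.1
U = np.random.randn(4 * d, d) * 0.1
b = np.zeros(4 * d)
h, c = lstm_step(np.random.randn(n_in), np.zeros(d), np.zeros(d), W, U, b)
print(h.round(3))
```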

Expectation Consistent Approximate Inference: Generalizations and Convergence

Title Expectation Consistent Approximate Inference: Generalizations and Convergence
Authors Alyson K. Fletcher, Mojtaba Sahraee-Ardakan, Sundeep Rangan, Philip Schniter
Abstract Approximations of loopy belief propagation, including expectation propagation and approximate message passing, have attracted considerable attention for probabilistic inference problems. This paper proposes and analyzes a generalization of Opper and Winther’s expectation consistent (EC) approximate inference method. The proposed method, called Generalized Expectation Consistency (GEC), can be applied to both maximum a posteriori (MAP) and minimum mean squared error (MMSE) estimation. Here we characterize its fixed points, convergence, and performance relative to the replica prediction of optimality.
Tasks
Published 2016-02-25
URL http://arxiv.org/abs/1602.07795v2
PDF http://arxiv.org/pdf/1602.07795v2.pdf
PWC https://paperswithcode.com/paper/expectation-consistent-approximate-inference
Repo
Framework
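
At the heart of expectation-consistent methods is a moment-matching projection: two tractable approximations are iterated until they agree on low-order moments at the fixed point. A toy scalar illustration of that projection (not the GEC algorithm itself):

```python
import numpy as np

def match_gaussian_moments(samples):
    """Project an arbitrary belief onto a Gaussian by moment matching.

    Expectation-consistent inference repeatedly applies updates of this
    form, forcing its approximating factors to share these moments.
    """
    return samples.mean(), samples.var()

# Toy: a bimodal belief, represented by samples, gets a single Gaussian.
s = np.concatenate([np.random.normal(-2, 0.5, 5000),
                    np.random.normal(+2, 0.5, 5000)])
print(match_gaussian_moments(s))  # mean near 0, inflated variance
```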

Keyphrase Extraction using Sequential Labeling

Title Keyphrase Extraction using Sequential Labeling
Authors Sujatha Das Gollapalli, Xiao-li Li
Abstract Keyphrases efficiently summarize a document’s content and are used in various document processing and retrieval tasks. Several unsupervised techniques and classifiers exist for extracting keyphrases from text documents. Most of these methods operate at a phrase-level and rely on part-of-speech (POS) filters for candidate phrase generation. In addition, they do not directly handle keyphrases of varying lengths. We overcome these modeling shortcomings by addressing keyphrase extraction as a sequential labeling task in this paper. We explore a basic set of features commonly used in NLP tasks as well as predictions from various unsupervised methods to train our taggers. In addition to a more natural modeling for the keyphrase extraction problem, we show that tagging models yield significant performance benefits over existing state-of-the-art extraction methods.
Tasks
Published 2016-08-01
URL http://arxiv.org/abs/1608.00329v2
PDF http://arxiv.org/pdf/1608.00329v2.pdf
PWC https://paperswithcode.com/paper/keyphrase-extraction-using-sequential
Repo
Framework
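
Casting keyphrase extraction as sequence labeling starts by encoding gold keyphrases as per-token tags. A minimal BIO-encoding sketch (the tag scheme is a common convention; the paper’s exact features and tagger are not reproduced here):

```python
def bio_tags(tokens, keyphrases):
    """Convert gold keyphrases into per-token B/I/O labels.

    Once tokens carry these labels, any sequence tagger can be
    trained on them, naturally handling keyphrases of varying length.
    """
    tags = ["O"] * len(tokens)
    for kp in keyphrases:
        kp_toks = kp.lower().split()
        for i in range(len(tokens) - len(kp_toks) + 1):
            if [t.lower() for t in tokens[i:i + len(kp_toks)]] == kp_toks:
                tags[i] = "B-KP"
                for j in range(1, len(kp_toks)):
                    tags[i + j] = "I-KP"
    return tags

toks = "we study keyphrase extraction with sequential labeling".split()
print(list(zip(toks, bio_tags(toks, ["keyphrase extraction"]))))
```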

Multi-Output Artificial Neural Network for Storm Surge Prediction in North Carolina

Title Multi-Output Artificial Neural Network for Storm Surge Prediction in North Carolina
Authors Anton Bezuglov, Brian Blanton, Reinaldo Santiago
Abstract During hurricane seasons, emergency managers and other decision makers need accurate and ‘on-time’ information on potential storm surge impacts. Fully dynamical computer models, such as the ADCIRC tide, storm surge, and wind-wave model, take several hours to complete a forecast when configured at high spatial resolution. Additionally, statistically meaningful ensembles of high-resolution models (needed for uncertainty estimation) cannot easily be computed in near real-time. This paper discusses an artificial neural network model for storm surge prediction in North Carolina. The network model provides fast, real-time storm surge estimates at coastal locations in North Carolina. The paper studies the performance of the neural network model vs. other models on synthetic and real hurricane data.
Tasks
Published 2016-09-23
URL http://arxiv.org/abs/1609.07378v1
PDF http://arxiv.org/pdf/1609.07378v1.pdf
PWC https://paperswithcode.com/paper/multi-output-artificial-neural-network-for
Repo
Framework
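
A minimal sketch of the multi-output idea: one forward pass predicts surge at several coastal stations jointly, which is what makes neural surrogates fast enough for real-time use. Inputs, sizes, and weights below are illustrative, not the paper’s configuration.

```python
import numpy as np

def multi_output_mlp(x, W1, b1, W2, b2):
    """One hidden layer, several outputs: surge at multiple stations.

    A multi-output network predicts all coastal locations jointly from
    a single forward pass over the storm parameters.
    """
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2                    # one surge value per station

n_in, n_hidden, n_stations = 6, 16, 5    # illustrative sizes
rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((n_hidden, n_in)) * 0.1, np.zeros(n_hidden)
W2, b2 = rng.standard_normal((n_stations, n_hidden)) * 0.1, np.zeros(n_stations)
print(multi_output_mlp(rng.standard_normal(n_in), W1, b1, W2, b2))
```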

Distributed Deep Learning Using Synchronous Stochastic Gradient Descent

Title Distributed Deep Learning Using Synchronous Stochastic Gradient Descent
Authors Dipankar Das, Sasikanth Avancha, Dheevatsa Mudigere, Karthikeyan Vaidynathan, Srinivas Sridharan, Dhiraj Kalamkar, Bharat Kaul, Pradeep Dubey
Abstract We design and implement a distributed multinode synchronous SGD algorithm, without altering hyperparameters, compressing data, or altering algorithmic behavior. We perform a detailed analysis of scaling and identify optimal design points for different networks. We demonstrate scaling of CNNs on hundreds of nodes and present what we believe to be record training throughputs. A 512-minibatch VGG-A CNN training run is scaled 90X on 128 nodes. The 256-minibatch VGG-A and OverFeat-FAST networks are scaled 53X and 42X, respectively, on a 64-node cluster. We also demonstrate the generality of our approach via best-in-class 6.5X scaling for a 7-layer DNN on 16 nodes. Thereafter we attempt to democratize deep learning by training on an Ethernet-based AWS cluster, showing ~14X scaling on 16 nodes.
Tasks
Published 2016-02-22
URL http://arxiv.org/abs/1602.06709v1
PDF http://arxiv.org/pdf/1602.06709v1.pdf
PWC https://paperswithcode.com/paper/distributed-deep-learning-using-synchronous
Repo
Framework
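
A single-process simulation of the core synchronous step: each worker computes a gradient on its data shard, an allreduce averages the gradients, and every node applies the identical update, leaving hyperparameters untouched. Least squares stands in here for the CNN loss.

```python
import numpy as np

def sync_sgd_step(w, X, y, n_workers, lr=0.1):
    """One synchronous data-parallel SGD step on least squares.

    Each worker computes a gradient on its shard; a (simulated)
    allreduce averages them, so every node applies the same update.
    """
    shards = zip(np.array_split(X, n_workers), np.array_split(y, n_workers))
    grads = [2 * Xi.T @ (Xi @ w - yi) / len(yi) for Xi, yi in shards]
    g = np.mean(grads, axis=0)            # the allreduce step
    return w - lr * g

rng = np.random.default_rng(0)
X, w_true = rng.standard_normal((256, 4)), np.array([1., -2., 3., 0.5])
y = X @ w_true
w = np.zeros(4)
for _ in range(100):
    w = sync_sgd_step(w, X, y, n_workers=8)
print(w.round(3))                         # recovers w_true
```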

Temporal Attention Model for Neural Machine Translation

Title Temporal Attention Model for Neural Machine Translation
Authors Baskaran Sankaran, Haitao Mi, Yaser Al-Onaizan, Abe Ittycheriah
Abstract Attention-based Neural Machine Translation (NMT) models suffer from attention deficiency issues, as has been observed in recent research. We propose a novel mechanism to address some of these limitations and improve NMT attention. Specifically, our approach memorizes the alignments temporally (within each sentence) and modulates the attention with the accumulated temporal memory as the decoder generates the candidate translation. We compare our approach against the baseline NMT model and two other related approaches that address this issue either explicitly or implicitly. Large-scale experiments on two language pairs show that our approach achieves robust gains over the baseline and related NMT approaches. Our model further outperforms strong SMT baselines in some settings, even without using ensembles.
Tasks Machine Translation
Published 2016-08-09
URL http://arxiv.org/abs/1608.02927v1
PDF http://arxiv.org/pdf/1608.02927v1.pdf
PWC https://paperswithcode.com/paper/temporal-attention-model-for-neural-machine
Repo
Framework
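
A hypothetical sketch in the spirit of the mechanism described: raw alignment scores are discounted by the attention each source position has already accumulated, so the decoder is discouraged from re-fixating. This is a coverage-style stand-in, not the paper’s exact formulation.

```python
import numpy as np

def temporal_attention(scores, past_attn):
    """Attention modulated by an accumulated temporal memory.

    Raw alignment scores are penalized by how much attention each
    source position has already received across decoder steps.
    """
    adjusted = scores - past_attn             # discount revisited positions
    adjusted -= adjusted.max()                # numerical stability
    attn = np.exp(adjusted) / np.exp(adjusted).sum()
    return attn, past_attn + attn             # updated temporal memory

src_len = 6
memory = np.zeros(src_len)
for t in range(3):                            # three decoder steps
    raw = np.random.rand(src_len)
    attn, memory = temporal_attention(raw, memory)
    print(f"step {t}:", attn.round(2))
```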

Distilling Information Reliability and Source Trustworthiness from Digital Traces

Title Distilling Information Reliability and Source Trustworthiness from Digital Traces
Authors Behzad Tabibian, Isabel Valera, Mehrdad Farajtabar, Le Song, Bernhard Schölkopf, Manuel Gomez-Rodriguez
Abstract Online knowledge repositories typically rely on their users or dedicated editors to evaluate the reliability of their content. These evaluations can be viewed as noisy measurements of both information reliability and information source trustworthiness. Can we leverage these noisy evaluations, often biased, to distill a robust, unbiased and interpretable measure of both notions? In this paper, we argue that the temporal traces left by these noisy evaluations give cues on the reliability of the information and the trustworthiness of the sources. Then, we propose a temporal point process modeling framework that links these temporal traces to robust, unbiased and interpretable notions of information reliability and source trustworthiness. Furthermore, we develop an efficient convex optimization procedure to learn the parameters of the model from historical traces. Experiments on real-world data gathered from Wikipedia and Stack Overflow show that our modeling framework accurately predicts evaluation events, provides an interpretable measure of information reliability and source trustworthiness, and yields interesting insights about real-world events.
Tasks
Published 2016-10-24
URL http://arxiv.org/abs/1610.07472v3
PDF http://arxiv.org/pdf/1610.07472v3.pdf
PWC https://paperswithcode.com/paper/distilling-information-reliability-and-source
Repo
Framework
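
The modeling framework is a temporal point process; a Hawkes process is the standard concrete instance, where past events (think evaluation or edit timestamps) raise the intensity of future ones. This sketch illustrates the intensity function only, not the paper’s model or its convex learning procedure.

```python
import numpy as np

def hawkes_intensity(t, event_times, mu=0.2, alpha=0.5, beta=1.0):
    """Hawkes process intensity, a common temporal point process:

        lambda(t) = mu + alpha * sum_i exp(-beta * (t - t_i))

    summed over past events t_i < t, so recent events excite new ones.
    """
    past = np.asarray([s for s in event_times if s < t])
    return mu + alpha * np.exp(-beta * (t - past)).sum()

events = [0.5, 1.1, 1.3, 4.0]                 # e.g., evaluation timestamps
for t in [1.0, 2.0, 5.0]:
    print(t, round(hawkes_intensity(t, events), 3))
```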