Paper Group ANR 257
Statistical Properties of European Languages and Voynich Manuscript Analysis. The Computational Power of Dynamic Bayesian Networks. On the Existence of a Projective Reconstruction. Primal-Dual Rates and Certificates. Learning to predict where to look in interactive environments using deep recurrent q-learning. Nonsymbolic Text Representation. Faster variational inducing input Gaussian process classification. Deep Variational Canonical Correlation Analysis. LSTM with Working Memory. Expectation Consistent Approximate Inference: Generalizations and Convergence. Keyphrase Extraction using Sequential Labeling. Multi-Output Artificial Neural Network for Storm Surge Prediction in North Carolina. Distributed Deep Learning Using Synchronous Stochastic Gradient Descent. Temporal Attention Model for Neural Machine Translation. Distilling Information Reliability and Source Trustworthiness from Digital Traces.
Statistical Properties of European Languages and Voynich Manuscript Analysis
Title | Statistical Properties of European Languages and Voynich Manuscript Analysis |
Authors | Andronik Arutyunov, Leonid Borisov, Sergey Fedorov, Anastasiya Ivchenko, Elizabeth Kirina-Lilinskaya, Yurii Orlov, Konstantin Osminin, Sergey Shilin, Dmitriy Zeniuk |
Abstract | The statistical properties of letter frequencies in European literary texts are investigated. The logarithmic dependence of letter sequences is examined for one-language and two-language texts. A pair of languages is suggested for the Voynich Manuscript. The internal structure of the Manuscript is considered. Spectral portraits of two-letter distributions are constructed. |
Tasks | |
Published | 2016-11-18 |
URL | http://arxiv.org/abs/1611.09122v1 |
http://arxiv.org/pdf/1611.09122v1.pdf | |
PWC | https://paperswithcode.com/paper/statistical-properties-of-european-languages |
Repo | |
Framework | |
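As a concrete illustration of the kind of analysis this paper performs, the sketch below computes ranked letter frequencies for a text and fits the cumulative frequency against the logarithm of rank. The fitting form and the toy sample are illustrative assumptions, not the authors' pipeline.

```python
# A minimal sketch (not the authors' code) of letter-frequency rank analysis:
# ranked relative letter frequencies and a least-squares fit of cumulative
# frequency against log(rank).
from collections import Counter
import math

def letter_ranks(text):
    """Return letters sorted by descending frequency with relative frequencies."""
    letters = [c.lower() for c in text if c.isalpha()]
    counts = Counter(letters)
    total = sum(counts.values())
    return [(ch, n / total) for ch, n in counts.most_common()]

def log_fit(ranked):
    """Least-squares fit: cumulative frequency ~ a * ln(rank) + b."""
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    cum, ys = 0.0, []
    for _, f in ranked:
        cum += f
        ys.append(cum)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

sample = "this is a tiny english sample used only to illustrate the interface"
ranked = letter_ranks(sample)
a, b = log_fit(ranked)
print(ranked[:5], (round(a, 3), round(b, 3)))
```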
The Computational Power of Dynamic Bayesian Networks
Title | The Computational Power of Dynamic Bayesian Networks |
Authors | Joshua Brulé |
Abstract | This paper considers the computational power of constant size, dynamic Bayesian networks. Although discrete dynamic Bayesian networks are no more powerful than hidden Markov models, dynamic Bayesian networks with continuous random variables and discrete children of continuous parents are capable of performing Turing-complete computation. With modified versions of existing algorithms for belief propagation, such a simulation can be carried out in real time. This result suggests that dynamic Bayesian networks may be more powerful than previously considered. Relationships to causal models and recurrent neural networks are also discussed. |
Tasks | |
Published | 2016-03-19 |
URL | http://arxiv.org/abs/1603.06125v1 |
http://arxiv.org/pdf/1603.06125v1.pdf | |
PWC | https://paperswithcode.com/paper/the-computational-power-of-dynamic-bayesian |
Repo | |
Framework | |
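The Turing-completeness construction is not spelled out in the abstract; the hedged sketch below illustrates the standard device behind such results for continuous-state models: an unbounded binary stack encoded in a single real number, so a constant-size network can carry unbounded state. The base-4 encoding is an illustrative assumption, not the paper's construction.

```python
# A hedged illustration (not from the paper) of encoding an unbounded binary
# stack in one real-valued state in (0, 1). Base-4 encoding keeps the
# push/pop maps contractive and exactly invertible.
def push(s: float, bit: int) -> float:
    """Push a bit: s' = (s + 2*bit + 1) / 4, keeping s' in (0, 1)."""
    return (s + 2 * bit + 1) / 4

def top(s: float) -> int:
    """Read the most recently pushed bit."""
    return 1 if s >= 0.5 else 0

def pop(s: float) -> float:
    """Invert push to recover the previous stack state."""
    return 4 * s - 2 * top(s) - 1

s = 0.0
for bit in [1, 0, 1]:   # push 1, then 0, then 1
    s = push(s, bit)
out = []
for _ in range(3):      # pop back in LIFO order
    out.append(top(s))
    s = pop(s)
print(out)  # [1, 0, 1]
```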
On the Existence of a Projective Reconstruction
Title | On the Existence of a Projective Reconstruction |
Authors | Hon-Leung Lee |
Abstract | In this note we study the connection between the existence of a projective reconstruction and the existence of a fundamental matrix satisfying the epipolar constraints. |
Tasks | |
Published | 2016-08-19 |
URL | http://arxiv.org/abs/1608.05518v1 |
http://arxiv.org/pdf/1608.05518v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-existence-of-a-projective |
Repo | |
Framework | |
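The epipolar constraint at the heart of the note can be checked numerically. The sketch below builds a consistent two-camera toy instance and verifies that the rank-2 fundamental matrix F = [t]_x R satisfies x2^T F x1 = 0 for all correspondences; the camera setup is an illustrative assumption.

```python
# A small numpy sketch of the epipolar constraint: a fundamental matrix F
# (rank 2) must satisfy x2^T F x1 = 0 for every corresponding pair of image
# points in homogeneous coordinates.
import numpy as np

def epipolar_residuals(F, pts1, pts2):
    """|x2^T F x1| for each correspondence; zero iff the constraints hold."""
    return np.abs(np.einsum('ni,ij,nj->n', pts2, F, pts1))

# Build a consistent toy instance: project 3D points with two cameras.
rng = np.random.default_rng(0)
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])            # camera 1: [I | 0]
R = np.eye(3); t = np.array([[1.0], [0.0], [0.0]])
P2 = np.hstack([R, t])                                    # camera 2: [R | t]
X = np.hstack([rng.normal(size=(8, 3)) + [0, 0, 5], np.ones((8, 1))])
x1 = (P1 @ X.T).T; x2 = (P2 @ X.T).T                      # homogeneous images

# For this pair, F = [t]_x R (up to scale), which has rank 2.
tx = np.array([[0, -t[2, 0], t[1, 0]],
               [t[2, 0], 0, -t[0, 0]],
               [-t[1, 0], t[0, 0], 0]])
F = tx @ R
print(np.linalg.matrix_rank(F), epipolar_residuals(F, x1, x2).max())
```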
Primal-Dual Rates and Certificates
Title | Primal-Dual Rates and Certificates |
Authors | Celestine Dünner, Simone Forte, Martin Takáč, Martin Jaggi |
Abstract | We propose an algorithm-independent framework to equip existing optimization methods with primal-dual certificates. Such certificates and corresponding convergence-rate guarantees are important for practitioners to diagnose progress, in particular in machine learning applications. We obtain new primal-dual convergence rates, e.g., for the Lasso as well as many L1, Elastic Net, group Lasso and TV-regularized problems. The theory applies to any norm-regularized generalized linear model. Our approach provides efficiently computable duality gaps which are globally defined, without modifying the original problems in the region of interest. |
Tasks | |
Published | 2016-02-16 |
URL | http://arxiv.org/abs/1602.05205v2 |
http://arxiv.org/pdf/1602.05205v2.pdf | |
PWC | https://paperswithcode.com/paper/primal-dual-rates-and-certificates |
Repo | |
Framework | |
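For the Lasso case mentioned in the abstract, a duality-gap certificate can be computed by rescaling the residual into a dual-feasible point. The sketch below is a minimal version of that standard construction, not the paper's general framework.

```python
# A minimal sketch (standard Lasso duality, not the paper's framework) of a
# globally defined, efficiently computable duality gap certificate. For
# P(w) = 0.5*||Aw - b||^2 + lam*||w||_1, a dual-feasible point is obtained
# by rescaling the residual so that ||A^T theta||_inf <= lam.
import numpy as np

def lasso_duality_gap(A, b, w, lam):
    r = A @ w - b
    primal = 0.5 * r @ r + lam * np.abs(w).sum()
    scale = min(1.0, lam / max(np.abs(A.T @ r).max(), 1e-12))
    theta = scale * r                        # dual-feasible point
    dual = -0.5 * theta @ theta - theta @ b  # dual objective D(theta)
    return primal - dual                     # >= 0; bounds suboptimality of w

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 20)); b = rng.normal(size=50)
w = np.zeros(20)
lam = 0.1 * np.abs(A.T @ b).max()
step = 1.0 / np.linalg.norm(A, 2) ** 2
for _ in range(200):                          # plain ISTA steps shrink the gap
    g = w - step * A.T @ (A @ w - b)
    w = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)
print(lasso_duality_gap(A, b, w, lam))
```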
Learning to predict where to look in interactive environments using deep recurrent q-learning
Title | Learning to predict where to look in interactive environments using deep recurrent q-learning |
Authors | Sajad Mousavi, Michael Schukat, Enda Howley, Ali Borji, Nasser Mozayani |
Abstract | Bottom-Up (BU) saliency models do not perform well in complex interactive environments where humans are actively engaged in tasks (e.g., sandwich making and playing video games). In this paper, we leverage Reinforcement Learning (RL) to highlight task-relevant locations of input frames. We propose a soft attention mechanism combined with the Deep Q-Network (DQN) model to teach an RL agent how to play a game and where to look by focusing on the most pertinent parts of its visual input. Our evaluations on several Atari 2600 games show that the soft-attention-based model predicts fixation locations significantly better than bottom-up models such as the Itti-Koch saliency and Graph-Based Visual Saliency (GBVS) models. |
Tasks | Atari Games, Q-Learning |
Published | 2016-12-17 |
URL | http://arxiv.org/abs/1612.05753v2 |
http://arxiv.org/pdf/1612.05753v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-predict-where-to-look-in |
Repo | |
Framework | |
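A hedged sketch of the mechanism described above: each location of a convolutional feature map is scored against the recurrent hidden state, a softmax turns scores into a fixation-like distribution, and the context vector is their weighted sum. The shapes and the additive scoring function are illustrative assumptions, not the paper's exact architecture.

```python
# Soft spatial attention over a conv feature map, conditioned on a recurrent
# state; alpha plays the role of the predicted "where to look" distribution.
import numpy as np

def soft_attention(features, h, Wf, Wh, v):
    """features: (L, C) conv-map locations; h: (D,) recurrent state."""
    scores = np.tanh(features @ Wf + h @ Wh) @ v        # (L,) alignment scores
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                                # attention weights
    context = alpha @ features                          # (C,) glimpse vector
    return context, alpha

rng = np.random.default_rng(0)
L, C, D, K = 49, 64, 32, 32            # e.g. a 7x7 map with 64 channels
features = rng.normal(size=(L, C))
h = rng.normal(size=D)
Wf = rng.normal(size=(C, K)) * 0.1
Wh = rng.normal(size=(D, K)) * 0.1
v = rng.normal(size=K) * 0.1
context, alpha = soft_attention(features, h, Wf, Wh, v)
print(context.shape, alpha.argmax())   # most attended location
```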
Nonsymbolic Text Representation
Title | Nonsymbolic Text Representation |
Authors | Hinrich Schuetze, Heike Adel, Ehsaneddin Asgari |
Abstract | We introduce the first generic text representation model that is completely nonsymbolic, i.e., it does not require the availability of a segmentation or tokenization method that attempts to identify words or other symbolic units in text. This applies to training the parameters of the model on a training corpus as well as to applying it when computing the representation of a new text. We show that our model performs better than prior work on an information extraction and a text denoising task. |
Tasks | Denoising, Tokenization |
Published | 2016-10-03 |
URL | http://arxiv.org/abs/1610.00479v3 |
http://arxiv.org/pdf/1610.00479v3.pdf | |
PWC | https://paperswithcode.com/paper/nonsymbolic-text-representation |
Repo | |
Framework | |
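One way to realize a nonsymbolic representation in the spirit of this paper is hashed character n-gram counts, which require no tokenizer or segmentation anywhere in the pipeline. The sketch below is such an illustration under assumed n-gram sizes and hashing dimension, not the authors' model.

```python
# Map text to a fixed-size vector of hashed character n-gram counts, with no
# word segmentation. Note: Python's str hash is salted per process, so the
# representation is consistent within a single run.
import numpy as np

def char_ngram_vector(text, ns=(2, 3, 4), dim=1024):
    v = np.zeros(dim)
    for n in ns:
        for i in range(len(text) - n + 1):
            v[hash(text[i:i + n]) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

a = char_ngram_vector("the cat sat on the mat")
b = char_ngram_vector("the cat sat on a mat")
c = char_ngram_vector("gradient descent converges")
print(round(float(a @ b), 3), round(float(a @ c), 3))  # similar >> dissimilar
```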
Faster variational inducing input Gaussian process classification
Title | Faster variational inducing input Gaussian process classification |
Authors | Pavel Izmailov, Dmitry Kropotov |
Abstract | Gaussian processes (GP) provide a prior over functions and allow finding complex regularities in data. Gaussian processes are successfully used for classification/regression problems and dimensionality reduction. In this work we consider the classification problem only. The complexity of standard methods for GP-classification scales cubically with the size of the training dataset, which makes them inapplicable to big data problems. Therefore, a variety of methods has been introduced to overcome this limitation. In this paper we focus on methods based on so-called inducing inputs. This approach is based on variational inference and proposes a particular lower bound on the marginal likelihood (evidence). The bound is then maximized w.r.t. the parameters of the kernel function of the Gaussian process, thus fitting the model to data. The computational complexity of this method is $O(nm^2)$, where $m$ is the number of inducing inputs used by the model and is assumed to be substantially smaller than the size of the dataset $n$. Recently, a new evidence lower bound for the GP-classification problem was introduced. It allows using stochastic optimization, which makes it suitable for big data problems. However, the new lower bound depends on $O(m^2)$ variational parameters, which makes optimization challenging for large $m$. In this work we develop a new approach for training inducing-input GP models for classification problems. We use a quadratic approximation of several terms in the aforementioned evidence lower bound, obtaining analytical expressions for the optimal values of most of the parameters, thus substantially reducing the dimension of the optimization space. In our experiments we achieve results as good as or better than those of the existing method. Moreover, our method does not require the user to manually set the learning rate, making it more practical than the existing method. |
Tasks | Dimensionality Reduction, Gaussian Processes, Stochastic Optimization |
Published | 2016-11-18 |
URL | http://arxiv.org/abs/1611.06132v1 |
http://arxiv.org/pdf/1611.06132v1.pdf | |
PWC | https://paperswithcode.com/paper/faster-variational-inducing-input-gaussian |
Repo | |
Framework | |
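The $O(nm^2)$ complexity quoted above comes from replacing the $n \times n$ kernel matrix with a low-rank inducing-point (Nystrom) approximation, so everything downstream touches only $n \times m$ and $m \times m$ matrices. The sketch below shows that substitution on a toy RBF kernel; it illustrates the cost structure, not the paper's training method.

```python
# Nystrom approximation K ~= Knm Kmm^{-1} Kmn via inducing inputs Z.
import numpy as np

def rbf(X, Z, ell=1.0):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

rng = np.random.default_rng(0)
n, m = 500, 20
X = rng.uniform(-3, 3, size=(n, 1))
Z = np.linspace(-3, 3, m)[:, None]            # inducing inputs

Knm = rbf(X, Z)                               # n x m
Kmm = rbf(Z, Z) + 1e-8 * np.eye(m)            # m x m (jittered)
L = np.linalg.cholesky(Kmm)
A = np.linalg.solve(L, Knm.T)                 # m x n solve: O(n m^2)
K_nystrom = A.T @ A                           # == Knm Kmm^{-1} Kmn

K_exact = rbf(X, X)
print(np.abs(K_exact - K_nystrom).max())      # small for well-placed Z
```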
Deep Variational Canonical Correlation Analysis
Title | Deep Variational Canonical Correlation Analysis |
Authors | Weiran Wang, Xinchen Yan, Honglak Lee, Karen Livescu |
Abstract | We present deep variational canonical correlation analysis (VCCA), a deep multi-view learning model that extends the latent variable model interpretation of linear CCA to nonlinear observation models parameterized by deep neural networks. We derive variational lower bounds of the data likelihood by parameterizing the posterior probability of the latent variables from the view that is available at test time. We also propose a variant of VCCA called VCCA-private that can, in addition to the “common variables” underlying both views, extract the “private variables” within each view, and disentangles the shared and private information for multi-view data without hard supervision. Experimental results on real-world datasets show that our methods are competitive across domains. |
Tasks | Multi-View Learning |
Published | 2016-10-11 |
URL | http://arxiv.org/abs/1610.03454v3 |
http://arxiv.org/pdf/1610.03454v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-variational-canonical-correlation |
Repo | |
Framework | |
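A hedged PyTorch sketch of the VCCA objective described above: encode the view available at test time into q(z|x1), decode both views from a sampled z, and minimize reconstruction error plus the KL to the prior (the negative variational lower bound). Layer sizes and Gaussian likelihoods are toy assumptions, not the paper's architectures.

```python
import torch
import torch.nn as nn

class VCCA(nn.Module):
    def __init__(self, d1=20, d2=15, dz=8, h=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d1, h), nn.ReLU(), nn.Linear(h, 2 * dz))
        self.dec1 = nn.Sequential(nn.Linear(dz, h), nn.ReLU(), nn.Linear(h, d1))
        self.dec2 = nn.Sequential(nn.Linear(dz, h), nn.ReLU(), nn.Linear(h, d2))

    def loss(self, x1, x2):
        mu, logvar = self.enc(x1).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        rec = ((self.dec1(z) - x1) ** 2).sum(-1) + ((self.dec2(z) - x2) ** 2).sum(-1)
        kl = 0.5 * (mu**2 + logvar.exp() - logvar - 1).sum(-1)
        return (rec + kl).mean()          # negative ELBO over both views

model = VCCA()
x1, x2 = torch.randn(32, 20), torch.randn(32, 15)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
opt.zero_grad(); model.loss(x1, x2).backward(); opt.step()
```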
LSTM with Working Memory
Title | LSTM with Working Memory |
Authors | Andrew Pulver, Siwei Lyu |
Abstract | Previous RNN architectures have largely been superseded by LSTM, or “Long Short-Term Memory”. Since its introduction, there have been many variations on this simple design. However, it is still widely used and we are not aware of a gated-RNN architecture that outperforms LSTM in a broad sense while remaining as simple and efficient. In this paper we propose a modified LSTM-like architecture. Our architecture is still simple and achieves better performance on the tasks that we tested it on. We also introduce a new RNN performance benchmark that uses handwritten digits and stresses several important network capabilities. |
Tasks | |
Published | 2016-05-06 |
URL | http://arxiv.org/abs/1605.01988v3 |
http://arxiv.org/pdf/1605.01988v3.pdf | |
PWC | https://paperswithcode.com/paper/lstm-with-working-memory |
Repo | |
Framework | |
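The abstract does not give the proposed gating, so as a reference point the sketch below implements one step of the standard LSTM cell that the paper modifies; it is the textbook baseline, not the proposed variant.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """W: (4H, D), U: (4H, H), b: (4H,); gates stacked as [i, f, o, g]."""
    zi, zf, zo, zg = np.split(W @ x + U @ h + b, 4)
    i, f, o, g = sigmoid(zi), sigmoid(zf), sigmoid(zo), np.tanh(zg)
    c_new = f * c + i * g          # gated update of the memory cell
    h_new = o * np.tanh(c_new)     # exposed hidden state
    return h_new, c_new

D, H = 8, 16
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * H, D)) * 0.1
U = rng.normal(size=(4 * H, H)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
h, c = lstm_step(rng.normal(size=D), h, c, W, U, b)
print(h.shape, c.shape)
```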
Expectation Consistent Approximate Inference: Generalizations and Convergence
Title | Expectation Consistent Approximate Inference: Generalizations and Convergence |
Authors | Alyson K. Fletcher, Mojtaba Sahraee-Ardakan, Sundeep Rangan, Philip Schniter |
Abstract | Approximations of loopy belief propagation, including expectation propagation and approximate message passing, have attracted considerable attention for probabilistic inference problems. This paper proposes and analyzes a generalization of Opper and Winther’s expectation consistent (EC) approximate inference method. The proposed method, called Generalized Expectation Consistency (GEC), can be applied to both maximum a posteriori (MAP) and minimum mean squared error (MMSE) estimation. Here we characterize its fixed points, convergence, and performance relative to the replica prediction of optimality. |
Tasks | |
Published | 2016-02-25 |
URL | http://arxiv.org/abs/1602.07795v2 |
http://arxiv.org/pdf/1602.07795v2.pdf | |
PWC | https://paperswithcode.com/paper/expectation-consistent-approximate-inference |
Repo | |
Framework | |
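A hedged 1-D toy (not the paper's GEC algorithm) showing the expectation-consistency idea: two exponential-family site approximations are iterated until both tilted distributions agree on their first and second moments, which is the fixed-point condition EC/EP-style methods converge to. Moments are computed on a grid; the factors are illustrative choices.

```python
import numpy as np

x = np.linspace(-8, 8, 4001)
f1 = np.exp(-0.5 * x**2)             # Gaussian prior factor
f2 = 1.0 / (1.0 + np.exp(-3 * x))    # step-like likelihood factor

def moments(p):
    p = p / np.trapz(p, x)
    m = np.trapz(x * p, x)
    v = np.trapz((x - m) ** 2 * p, x)
    return m, v

def natural(m, v):                    # (eta1, eta2) of exp(eta1*x + eta2*x^2)
    return m / v, -0.5 / v

g1 = np.zeros(2)                      # site approximating f2, paired with f1
g2 = np.zeros(2)                      # site approximating f1, paired with f2
for _ in range(50):
    t1 = f1 * np.exp(g1[0] * x + g1[1] * x**2)   # tilted distribution 1
    m, v = moments(t1)
    g2 = np.array(natural(m, v)) - g1            # moment match minus cavity
    t2 = f2 * np.exp(g2[0] * x + g2[1] * x**2)   # tilted distribution 2
    m2, v2 = moments(t2)
    g1 = np.array(natural(m2, v2)) - g2

print(moments(t1), moments(t2))       # consistent moments at the fixed point
```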
Keyphrase Extraction using Sequential Labeling
Title | Keyphrase Extraction using Sequential Labeling |
Authors | Sujatha Das Gollapalli, Xiao-li Li |
Abstract | Keyphrases efficiently summarize a document’s content and are used in various document processing and retrieval tasks. Several unsupervised techniques and classifiers exist for extracting keyphrases from text documents. Most of these methods operate at a phrase-level and rely on part-of-speech (POS) filters for candidate phrase generation. In addition, they do not directly handle keyphrases of varying lengths. We overcome these modeling shortcomings by addressing keyphrase extraction as a sequential labeling task in this paper. We explore a basic set of features commonly used in NLP tasks as well as predictions from various unsupervised methods to train our taggers. In addition to a more natural modeling for the keyphrase extraction problem, we show that tagging models yield significant performance benefits over existing state-of-the-art extraction methods. |
Tasks | |
Published | 2016-08-01 |
URL | http://arxiv.org/abs/1608.00329v2 |
http://arxiv.org/pdf/1608.00329v2.pdf | |
PWC | https://paperswithcode.com/paper/keyphrase-extraction-using-sequential |
Repo | |
Framework | |
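A small sketch of the paper's reformulation: keyphrase extraction becomes sequential labeling by converting gold keyphrases into per-token BIO tags, which a tagger is then trained on. The BIO scheme is the standard one; the paper's feature set is richer than anything shown here. Note how multi-word phrases get one B tag plus I tags, which is how the formulation handles keyphrases of varying lengths.

```python
def bio_tags(tokens, keyphrases):
    """Tag each token as B-KP (begins a keyphrase), I-KP (inside), or O."""
    tags = ['O'] * len(tokens)
    phrases = [kp.lower().split() for kp in keyphrases]
    i = 0
    while i < len(tokens):
        for p in phrases:
            if [t.lower() for t in tokens[i:i + len(p)]] == p:
                tags[i] = 'B-KP'
                for j in range(i + 1, i + len(p)):
                    tags[j] = 'I-KP'
                i += len(p) - 1
                break
        i += 1
    return tags

tokens = "we study keyphrase extraction using sequential labeling models".split()
print(list(zip(tokens, bio_tags(tokens, ["keyphrase extraction",
                                         "sequential labeling"]))))
```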
Multi-Output Artificial Neural Network for Storm Surge Prediction in North Carolina
Title | Multi-Output Artificial Neural Network for Storm Surge Prediction in North Carolina |
Authors | Anton Bezuglov, Brian Blanton, Reinaldo Santiago |
Abstract | During hurricane seasons, emergency managers and other decision makers need accurate and ‘on-time’ information on potential storm surge impacts. Fully dynamical computer models, such as the ADCIRC tide, storm surge, and wind-wave model, take several hours to complete a forecast when configured at high spatial resolution. Additionally, statistically meaningful ensembles of high-resolution models (needed for uncertainty estimation) cannot easily be computed in near real-time. This paper discusses an artificial neural network model for storm surge prediction in North Carolina. The network model provides fast, real-time storm surge estimates at coastal locations in North Carolina. The paper studies the performance of the neural network model vs. other models on synthetic and real hurricane data. |
Tasks | |
Published | 2016-09-23 |
URL | http://arxiv.org/abs/1609.07378v1 |
http://arxiv.org/pdf/1609.07378v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-output-artificial-neural-network-for |
Repo | |
Framework | |
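A hedged sketch of the model class: one feedforward network mapping storm parameters to surge estimates at several coastal stations at once, i.e. a multi-output regression. The input/output sizes and training data below are illustrative stand-ins, not the paper's configuration.

```python
import torch
import torch.nn as nn

n_inputs, n_stations = 6, 12          # storm features -> per-station surge
net = nn.Sequential(
    nn.Linear(n_inputs, 64), nn.Tanh(),
    nn.Linear(64, 64), nn.Tanh(),
    nn.Linear(64, n_stations),        # one surge estimate per station
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
x = torch.randn(256, n_inputs)        # synthetic storm parameters
y = torch.randn(256, n_stations)      # stand-in surge targets
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(x), y)
    loss.backward(); opt.step()
print(float(loss))
```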
Distributed Deep Learning Using Synchronous Stochastic Gradient Descent
Title | Distributed Deep Learning Using Synchronous Stochastic Gradient Descent |
Authors | Dipankar Das, Sasikanth Avancha, Dheevatsa Mudigere, Karthikeyan Vaidynathan, Srinivas Sridharan, Dhiraj Kalamkar, Bharat Kaul, Pradeep Dubey |
Abstract | We design and implement a distributed multinode synchronous SGD algorithm, without altering hyperparameters, compressing data, or altering algorithmic behavior. We perform a detailed analysis of scaling and identify optimal design points for different networks. We demonstrate scaling of CNNs on hundreds of nodes and present what we believe to be record training throughputs. A 512-minibatch VGG-A CNN training run is scaled 90X on 128 nodes. Also, 256-minibatch VGG-A and OverFeat-FAST networks are scaled 53X and 42X respectively on a 64-node cluster. We also demonstrate the generality of our approach via best-in-class 6.5X scaling for a 7-layer DNN on 16 nodes. Thereafter we attempt to democratize deep learning by training on an Ethernet-based AWS cluster and show ~14X scaling on 16 nodes. |
Tasks | |
Published | 2016-02-22 |
URL | http://arxiv.org/abs/1602.06709v1 |
http://arxiv.org/pdf/1602.06709v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-deep-learning-using-synchronous |
Repo | |
Framework | |
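A minimal single-process simulation (not the paper's multinode implementation) of synchronous data-parallel SGD: each "node" computes a gradient on its own shard, gradients are averaged (the all-reduce step), and every node applies the same update, so the result matches single-node SGD on the full batch, which is why no hyperparameters need altering.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 10)); y = X @ rng.normal(size=10)
w = np.zeros(10)
nodes = 8
shards = np.array_split(np.arange(len(X)), nodes)   # equal-size shards

for step in range(100):
    grads = []
    for s in shards:                        # would run in parallel per node
        r = X[s] @ w - y[s]
        grads.append(X[s].T @ r / len(s))   # local minibatch gradient
    g = np.mean(grads, axis=0)              # all-reduce: average gradients
    w -= 0.05 * g                           # identical update on every node
print(np.abs(X @ w - y).mean())
```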
Temporal Attention Model for Neural Machine Translation
Title | Temporal Attention Model for Neural Machine Translation |
Authors | Baskaran Sankaran, Haitao Mi, Yaser Al-Onaizan, Abe Ittycheriah |
Abstract | Attention-based Neural Machine Translation (NMT) models suffer from attention deficiency issues, as has been observed in recent research. We propose a novel mechanism to address some of these limitations and improve NMT attention. Specifically, our approach memorizes the alignments temporally (within each sentence) and modulates the attention with the accumulated temporal memory as the decoder generates the candidate translation. We compare our approach against the baseline NMT model and two other related approaches that address this issue either explicitly or implicitly. Large-scale experiments on two language pairs show that our approach achieves robust gains over both the baseline and the related NMT approaches. Our model further outperforms strong SMT baselines in some settings even without using ensembles. |
Tasks | Machine Translation |
Published | 2016-08-09 |
URL | http://arxiv.org/abs/1608.02927v1 |
http://arxiv.org/pdf/1608.02927v1.pdf | |
PWC | https://paperswithcode.com/paper/temporal-attention-model-for-neural-machine |
Repo | |
Framework | |
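A hedged sketch of the core mechanism: keep a running sum of past attention weights per source position and use it to modulate new scores, discouraging the decoder from re-attending to already-covered words. The exact modulation in the paper may differ; this only follows the abstract's description of accumulating temporal memory.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def temporal_attention(scores, history, eps=1e-6):
    """Downweight source positions in proportion to attention already spent."""
    modulated = softmax(scores) / (history + 1.0)   # temporal modulation
    alpha = modulated / (modulated.sum() + eps)     # renormalize
    return alpha, history + alpha                   # updated temporal memory

rng = np.random.default_rng(0)
src_len = 6
history = np.zeros(src_len)
for t in range(3):                     # three decoder steps, similar scores
    scores = rng.normal(size=src_len) * 0.1 + np.array([3, 0, 0, 0, 0, 0.0])
    alpha, history = temporal_attention(scores, history)
    print(t, np.round(alpha, 2))       # weight on position 0 decays per step
```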
Distilling Information Reliability and Source Trustworthiness from Digital Traces
Title | Distilling Information Reliability and Source Trustworthiness from Digital Traces |
Authors | Behzad Tabibian, Isabel Valera, Mehrdad Farajtabar, Le Song, Bernhard Schölkopf, Manuel Gomez-Rodriguez |
Abstract | Online knowledge repositories typically rely on their users or dedicated editors to evaluate the reliability of their content. These evaluations can be viewed as noisy measurements of both information reliability and information source trustworthiness. Can we leverage these noisy evaluations, often biased, to distill a robust, unbiased and interpretable measure of both notions? In this paper, we argue that the temporal traces left by these noisy evaluations give cues on the reliability of the information and the trustworthiness of the sources. Then, we propose a temporal point process modeling framework that links these temporal traces to robust, unbiased and interpretable notions of information reliability and source trustworthiness. Furthermore, we develop an efficient convex optimization procedure to learn the parameters of the model from historical traces. Experiments on real-world data gathered from Wikipedia and Stack Overflow show that our modeling framework accurately predicts evaluation events, provides an interpretable measure of information reliability and source trustworthiness, and yields interesting insights about real-world events. |
Tasks | |
Published | 2016-10-24 |
URL | http://arxiv.org/abs/1610.07472v3 |
http://arxiv.org/pdf/1610.07472v3.pdf | |
PWC | https://paperswithcode.com/paper/distilling-information-reliability-and-source |
Repo | |
Framework | |
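A hedged sketch of the modeling ingredient: a temporal point process over evaluation events, with an intensity combining a base rate and self-excitation from past events. The exponential-kernel Hawkes form below is a standard choice used only to illustrate the machinery; the paper's intensities additionally encode reliability and trustworthiness, and all parameter names here are illustrative.

```python
import numpy as np

def hawkes_intensity(t, events, mu, alpha, beta):
    """lambda(t) = mu + alpha * sum_{t_i < t} exp(-beta * (t - t_i))."""
    past = events[events < t]
    return mu + alpha * np.exp(-beta * (t - past)).sum()

def log_likelihood(events, T, mu, alpha, beta):
    """Point-process log-likelihood: sum_i log lambda(t_i) - int_0^T lambda."""
    ll = sum(np.log(hawkes_intensity(t, events, mu, alpha, beta))
             for t in events)
    compensator = mu * T + (alpha / beta) * (
        1 - np.exp(-beta * (T - events))).sum()
    return ll - compensator

events = np.array([0.5, 0.9, 2.3, 2.4, 2.5, 6.0])   # evaluation times
print(log_likelihood(events, T=10.0, mu=0.2, alpha=0.5, beta=1.0))
```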