January 28, 2020

3210 words 16 mins read

Paper Group ANR 854

Linguistic evaluation of German-English Machine Translation using a Test Suite. Grammatical Gender, Neo-Whorfianism, and Word Embeddings: A Data-Driven Approach to Linguistic Relativity. Combining Physics-Based Domain Knowledge and Machine Learning using Variational Gaussian Processes with Explicit Linear Prior. Approximate Inference Turns Deep Net …

Linguistic evaluation of German-English Machine Translation using a Test Suite

Title Linguistic evaluation of German-English Machine Translation using a Test Suite
Authors Eleftherios Avramidis, Vivien Macketanz, Ursula Strohriegel, Hans Uszkoreit
Abstract We present the results of the application of a grammatical test suite for German$\rightarrow$English MT on the systems submitted at WMT19, with a detailed analysis for 107 phenomena organized in 14 categories. On average, the systems still translate one out of four test items incorrectly. Low performance is indicated for idioms, modals, pseudo-clefts, multi-word expressions and verb valency. When compared to last year, there has been an improvement on function words, non-verbal agreement and punctuation. More detailed conclusions about particular systems and phenomena are also presented.
Tasks Machine Translation
Published 2019-10-16
URL https://arxiv.org/abs/1910.07457v1
PDF https://arxiv.org/pdf/1910.07457v1.pdf
PWC https://paperswithcode.com/paper/linguistic-evaluation-of-german-english-1
Repo
Framework

Grammatical Gender, Neo-Whorfianism, and Word Embeddings: A Data-Driven Approach to Linguistic Relativity

Title Grammatical Gender, Neo-Whorfianism, and Word Embeddings: A Data-Driven Approach to Linguistic Relativity
Authors Katharina Kann
Abstract The relation between language and thought has occupied linguists for at least a century. Neo-Whorfianism, a weak version of the controversial Sapir-Whorf hypothesis, holds that our thoughts are subtly influenced by the grammatical structures of our native language. One area of investigation in this vein focuses on how the grammatical gender of nouns affects the way we perceive the corresponding objects. For instance, does the fact that key is masculine in German (der Schlüssel), but feminine in Spanish (la llave), change the speakers’ views of those objects? Psycholinguistic evidence presented by Boroditsky et al. (2003, §4) suggested the answer might be yes: When asked to produce adjectives that best described a key, German and Spanish speakers named more stereotypically masculine and feminine ones, respectively. However, recent attempts to replicate those experiments have failed (Mickan et al., 2014). In this work, we offer a computational analogue of Boroditsky et al. (2003, §4)’s experimental design on 9 languages, finding evidence against neo-Whorfianism.
Tasks Word Embeddings
Published 2019-10-22
URL https://arxiv.org/abs/1910.09729v1
PDF https://arxiv.org/pdf/1910.09729v1.pdf
PWC https://paperswithcode.com/paper/grammatical-gender-neo-whorfianism-and-word
Repo
Framework
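
A toy illustration of the kind of computational analogue used here: compare a noun’s embedding with sets of stereotypically masculine and feminine adjectives. This is a deliberately simplified sketch of the idea, not the paper’s actual experimental design (which covers 9 languages and controls for confounds); the embeddings and adjective lists are assumed inputs.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def gender_association(noun_vec, masc_adj_vecs, fem_adj_vecs):
    """Positive score: the noun's embedding is closer to stereotypically masculine adjectives."""
    masc = np.mean([cosine(noun_vec, a) for a in masc_adj_vecs])
    fem = np.mean([cosine(noun_vec, a) for a in fem_adj_vecs])
    return masc - fem
```

Under the neo-Whorfian hypothesis, such a score would systematically track the noun’s grammatical gender across languages; the paper reports evidence against that effect.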

Combining Physics-Based Domain Knowledge and Machine Learning using Variational Gaussian Processes with Explicit Linear Prior

Title Combining Physics-Based Domain Knowledge and Machine Learning using Variational Gaussian Processes with Explicit Linear Prior
Authors Daniel L. Marino, Milos Manic
Abstract Centuries of development in natural sciences and mathematical modeling provide valuable domain expert knowledge that has yet to be explored for the development of machine learning models. When modeling complex physical systems, both domain knowledge and data contribute important information about the system. In this paper, we present a data-driven model that takes advantage of partial domain knowledge in order to improve generalization and interpretability. The presented model, which we call EVGP (Explicit Variational Gaussian Process), uses an explicit linear prior to incorporate partial domain knowledge while using data to fill in the gaps in knowledge. Variational inference was used to obtain a sparse approximation that scales well to large datasets. The advantages include: 1) using partial domain knowledge to improve inductive bias (assumptions of the model), 2) scalability to large datasets, 3) improved interpretability. We show how the EVGP model can be used to learn system dynamics using basic Newtonian mechanics as prior knowledge. We demonstrate that using simple priors from partially defined physics models considerably improves performance when compared to fully data-driven models.
Tasks Gaussian Processes
Published 2019-06-05
URL https://arxiv.org/abs/1906.02160v1
PDF https://arxiv.org/pdf/1906.02160v1.pdf
PWC https://paperswithcode.com/paper/combining-physics-based-domain-knowledge-and
Repo
Framework
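
A minimal way to see how an explicit linear prior enters GP regression, assuming physics-based features phi(x) and coefficients beta: use m(x) = phi(x) @ beta as the prior mean and let the GP explain the residuals. The sketch below uses exact inference with an RBF kernel, so it only illustrates the explicit-prior idea; the paper’s sparse variational approximation (and its treatment of beta) is not reproduced, and all names are illustrative.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_with_linear_prior(X, y, Xs, phi, beta, noise=1e-2):
    """GP posterior at test inputs Xs, with explicit prior mean m(x) = phi(x) @ beta."""
    m_X, m_Xs = phi(X) @ beta, phi(Xs) @ beta
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(Xs, X)
    alpha = np.linalg.solve(K, y - m_X)      # fit the GP to residuals of the physics-based mean
    mean = m_Xs + Ks @ alpha                 # prior mean plus learned correction
    var = np.diag(rbf(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T))
    return mean, var
```

The design choice this highlights is the split of responsibility: the linear prior encodes the known (e.g. Newtonian) part of the dynamics, while the kernel absorbs whatever the physics model leaves unexplained.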

Approximate Inference Turns Deep Networks into Gaussian Processes

Title Approximate Inference Turns Deep Networks into Gaussian Processes
Authors Mohammad Emtiyaz Khan, Alexander Immer, Ehsan Abedi, Maciej Korzepa
Abstract Deep neural networks (DNN) and Gaussian processes (GP) are two powerful models with several theoretical connections relating them, but the relationship between their training methods is not well understood. In this paper, we show that certain Gaussian posterior approximations for Bayesian DNNs are equivalent to GP posteriors. This enables us to relate solutions and iterations of a deep-learning algorithm to GP inference. As a result, we can obtain a GP kernel and a nonlinear feature map while training a DNN. Surprisingly, the resulting kernel is the neural tangent kernel. We show kernels obtained on real datasets and demonstrate the use of the GP marginal likelihood to tune hyperparameters of DNNs. Our work aims to facilitate further research on combining DNNs and GPs in practical settings.
Tasks Gaussian Processes
Published 2019-06-05
URL https://arxiv.org/abs/1906.01930v2
PDF https://arxiv.org/pdf/1906.01930v2.pdf
PWC https://paperswithcode.com/paper/approximate-inference-turns-deep-networks
Repo
Framework
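
The kernel in question can be formed empirically from Jacobians of the network output with respect to its parameters, $K(x, x') = J(x)\,J(x')^{\top}$. A small JAX sketch for a toy scalar-output MLP is below; it only shows where such a kernel comes from, not the paper’s argument that Gaussian posterior approximations for the DNN correspond to GP posteriors with this kernel. Shapes and names are illustrative.

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def mlp(params, x):
    w1, b1, w2, b2 = params
    h = jnp.tanh(x @ w1 + b1)
    return (h @ w2 + b2).squeeze()          # scalar output

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
params = (jax.random.normal(k1, (3, 16)) / jnp.sqrt(3.0), jnp.zeros(16),
          jax.random.normal(k2, (16, 1)) / jnp.sqrt(16.0), jnp.zeros(1))
flat, unravel = ravel_pytree(params)

def jac(x):
    # gradient of the scalar output w.r.t. the flattened parameter vector
    return jax.grad(lambda p: mlp(unravel(p), x))(flat)

X = jax.random.normal(k3, (5, 3))
J = jax.vmap(jac)(X)                        # (5, num_params) Jacobian rows
ntk = J @ J.T                               # empirical neural tangent kernel Gram matrix
```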

Modeling Engagement Dynamics of Online Discussions using Relativistic Gravitational Theory

Title Modeling Engagement Dynamics of Online Discussions using Relativistic Gravitational Theory
Authors Subhabrata Dutta, Dipankar Das, Tanmoy Chakraborty
Abstract Online discussions are valuable resources to study user behaviour on a diverse set of topics. Unlike previous studies which model a discussion in a static manner, in the present study, we model it as a time-varying process and solve two inter-related problems – predict which user groups will get engaged with an ongoing discussion, and forecast the growth rate of a discussion in terms of the number of comments. We propose RGNet (Relativistic Gravitational Network), a novel algorithm that uses Einstein Field Equations of gravity to model online discussions as a ‘cloud of dust’ hovering over a user spacetime manifold, attracting users of different groups at different rates over time. We also propose GUVec, a global user embedding method for an online discussion, which is used by RGNet to predict temporal user engagement. RGNet leverages different textual and network-based features to learn the dust distribution for discussions. We employ four baselines – the first two using an LSTM architecture, the third using a Newtonian model of gravity, and the fourth using a logistic regression adopted from previous work on engagement prediction. Experiments on a Reddit dataset show that RGNet achieves 0.72 Micro F1 score and 6.01% average error for temporal engagement prediction of user groups and growth rate forecasting, respectively, outperforming all the baselines significantly. We further employ RGNet to predict non-temporal engagement – whether users will comment on a given post or not. RGNet achieves 0.62 AUC for this task, outperforming the existing baseline by 8.77% AUC.
Tasks
Published 2019-08-10
URL https://arxiv.org/abs/1908.03770v1
PDF https://arxiv.org/pdf/1908.03770v1.pdf
PWC https://paperswithcode.com/paper/modeling-engagement-dynamics-of-online
Repo
Framework

Streaming Variational Monte Carlo

Title Streaming Variational Monte Carlo
Authors Yuan Zhao, Josue Nassar, Ian Jordan, Mónica Bugallo, Il Memming Park
Abstract Nonlinear state-space models are powerful tools to describe dynamical structures in complex time series. In a streaming setting where data are processed one sample at a time, simultaneous inference of the state and its nonlinear dynamics has posed significant challenges in practice. We develop a novel online learning framework, leveraging variational inference and sequential Monte Carlo, which enables flexible and accurate Bayesian joint filtering. Our method provides an approximation of the filtering posterior which can be made arbitrarily close to the true filtering distribution for a wide class of dynamics models and observation models. Specifically, the proposed framework can efficiently approximate a posterior over the dynamics using sparse Gaussian processes, allowing for an interpretable model of the latent dynamics. Constant time complexity per sample makes our approach amenable to online learning scenarios and suitable for real-time applications.
Tasks Gaussian Processes, Time Series
Published 2019-06-04
URL https://arxiv.org/abs/1906.01549v3
PDF https://arxiv.org/pdf/1906.01549v3.pdf
PWC https://paperswithcode.com/paper/streaming-variational-monte-carlo
Repo
Framework
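
For intuition about the sequential Monte Carlo half of the method, here is a plain bootstrap particle filter for a 1-D state-space model. This is only background: SVMC itself combines variational inference with SMC and learns the dynamics with sparse GPs, none of which this toy sketch attempts; the model functions f and g_loglik are assumed inputs.

```python
import numpy as np

def bootstrap_filter(ys, f, g_loglik, n_particles=500, init_std=1.0, proc_std=0.1, seed=0):
    """Bootstrap particle filter: x_t = f(x_{t-1}) + process noise, y_t ~ g(. | x_t)."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, init_std, n_particles)
    filtering_means = []
    for y in ys:
        particles = f(particles) + rng.normal(0.0, proc_std, n_particles)  # propagate
        logw = g_loglik(y, particles)                                      # observation log-likelihoods
        w = np.exp(logw - logw.max())
        w /= w.sum()
        filtering_means.append(float(w @ particles))                       # weighted filtering mean
        particles = particles[rng.choice(n_particles, size=n_particles, p=w)]  # resample
    return np.array(filtering_means)
```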

Posterior Variance Analysis of Gaussian Processes with Application to Average Learning Curves

Title Posterior Variance Analysis of Gaussian Processes with Application to Average Learning Curves
Authors Armin Lederer, Jonas Umlauft, Sandra Hirche
Abstract The posterior variance of Gaussian processes is a valuable measure of the learning error which is exploited in various applications such as safe reinforcement learning and control design. However, a suitable analysis of the posterior variance which captures its behavior for finite and infinite numbers of training data is missing. This paper derives a novel bound for the posterior variance function which requires only local information because it depends only on the number of training samples in the proximity of a considered test point. Furthermore, we prove sufficient conditions which ensure the convergence of the posterior variance to zero. Finally, we demonstrate that the extension of our bound to an average learning bound outperforms existing approaches.
Tasks Gaussian Processes
Published 2019-06-04
URL https://arxiv.org/abs/1906.01404v1
PDF https://arxiv.org/pdf/1906.01404v1.pdf
PWC https://paperswithcode.com/paper/posterior-variance-analysis-of-gaussian
Repo
Framework
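
For reference, the quantity being bounded is the standard GP posterior variance at a test point $x_*$, given training inputs $x_1, \dots, x_n$, kernel $k$ and observation noise $\sigma_n^2$:

$$\sigma_*^2(x_*) = k(x_*, x_*) - \mathbf{k}_*^{\top} \left(K + \sigma_n^2 I\right)^{-1} \mathbf{k}_*, \qquad [\mathbf{k}_*]_i = k(x_*, x_i), \quad [K]_{ij} = k(x_i, x_j).$$

The paper’s contribution is a bound on this quantity that depends only on the training points in a neighborhood of $x_*$; the formula above is simply the object it analyzes.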

Large-scale, real-time visual-inertial localization revisited

Title Large-scale, real-time visual-inertial localization revisited
Authors Simon Lynen, Bernhard Zeisl, Dror Aiger, Michael Bosse, Joel Hesch, Marc Pollefeys, Roland Siegwart, Torsten Sattler
Abstract The overarching goals in image-based localization are scale, robustness and speed. In recent years, approaches based on local features and sparse 3D point-cloud models have both dominated the benchmarks and seen successful real-world deployment. They enable applications ranging from robot navigation, autonomous driving, virtual and augmented reality to device geo-localization. Recently, end-to-end learned localization approaches have been proposed which show promising results on small-scale datasets. However, the positioning accuracy, scalability, latency and compute & storage requirements of these approaches remain open challenges. We aim to deploy localization at global scale, where one thus relies on methods using local features and sparse 3D models. Our approach spans from offline model building to real-time client-side pose fusion. The system compresses appearance and geometry of the scene for efficient model storage and lookup, leading to scalability beyond what has been previously demonstrated. It allows for low-latency localization queries and efficient fusion running in real time on mobile platforms by combining server-side localization with real-time visual-inertial-based camera pose tracking. In order to further improve efficiency, we leverage a combination of priors, nearest neighbor search, geometric match culling and a cascaded pose candidate refinement step. This combination outperforms previous approaches when working with large-scale models and allows deployment at unprecedented scale. We demonstrate the effectiveness of our approach on a proof-of-concept system localizing 2.5 million images against models from four cities in different regions of the world, achieving query latencies in the 200ms range.
Tasks Autonomous Driving, Image-Based Localization, Pose Tracking, Robot Navigation
Published 2019-06-30
URL https://arxiv.org/abs/1907.00338v1
PDF https://arxiv.org/pdf/1907.00338v1.pdf
PWC https://paperswithcode.com/paper/large-scale-real-time-visual-inertial
Repo
Framework

Conversational AI : Open Domain Question Answering and Commonsense Reasoning

Title Conversational AI : Open Domain Question Answering and Commonsense Reasoning
Authors Kinjal Basu
Abstract Our research is focused on making a human-like question answering system which can answer rationally. The distinguishing characteristic of our approach is that it will use automated common sense reasoning to truly “understand” dialogues, allowing it to converse like a human. Humans often make many assumptions during conversations. We infer facts not told explicitly by using our common sense. Incorporating commonsense knowledge in a question answering system will simply make it more robust.
Tasks Common Sense Reasoning, Open-Domain Question Answering, Question Answering
Published 2019-09-18
URL https://arxiv.org/abs/1909.08258v1
PDF https://arxiv.org/pdf/1909.08258v1.pdf
PWC https://paperswithcode.com/paper/conversational-ai-open-domain-question
Repo
Framework

Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training

Title Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training
Authors William Harvey, Michael Teng, Frank Wood
Abstract We introduce the use of Bayesian optimal experimental design techniques for generating glimpse sequences to use in semi-supervised training of hard attention networks. Hard attention holds the promise of greater energy efficiency and superior inference performance. Employing such networks for image classification usually involves choosing a sequence of glimpse locations from a stochastic policy. As the outputs of observations are typically non-differentiable with respect to their glimpse locations, unsupervised gradient learning of such a policy requires REINFORCE-style updates. Also, the only reward signal is the final classification accuracy. For these reasons hard attention networks, despite their promise, have not achieved the wide adoption that soft attention networks have and, in many practical settings, are difficult to train. We find that our method for semi-supervised training makes it easier and faster to train hard attention networks and correspondingly could make them practical to consider in situations where they were not before.
Tasks Image Classification
Published 2019-06-13
URL https://arxiv.org/abs/1906.05462v1
PDF https://arxiv.org/pdf/1906.05462v1.pdf
PWC https://paperswithcode.com/paper/near-optimal-glimpse-sequences-for-improved
Repo
Framework
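
The REINFORCE-style update mentioned in the abstract is the standard score-function gradient estimator: for a glimpse-location policy $\pi_\theta$ and reward $R$ (here the final classification accuracy),

$$\nabla_\theta \, \mathbb{E}_{l \sim \pi_\theta}[R(l)] = \mathbb{E}_{l \sim \pi_\theta}\left[ R(l) \, \nabla_\theta \log \pi_\theta(l) \right],$$

estimated by sampling glimpse sequences $l$. The contribution here is to use Bayesian optimal experimental design to supply near-optimal glimpse sequences as an additional supervision signal, reducing reliance on this high-variance, sparse-reward estimator.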

A New Family of Neural Networks Provably Resistant to Adversarial Attacks

Title A New Family of Neural Networks Provably Resistant to Adversarial Attacks
Authors Rakshit Agrawal, Luca de Alfaro, David Helmbold
Abstract Adversarial attacks add perturbations to the input features with the intent of changing the classification produced by a machine learning system. Small perturbations can yield adversarial examples which are misclassified despite being virtually indistinguishable from the unperturbed input. Classifiers trained with standard neural network techniques are highly susceptible to adversarial examples, allowing an adversary to create misclassifications of their choice. We introduce a new type of network unit, called MWD (max of weighted distance) units, that have built-in resistance to adversarial attacks. These units are highly non-linear, and we develop the techniques needed to effectively train them. We show that simple interval techniques for propagating perturbation effects through the network enable the efficient computation of robustness (i.e., accuracy guarantees) for MWD networks under any perturbations, including adversarial attacks. MWD networks are significantly more robust to input perturbations than ReLU networks. On permutation invariant MNIST, when test examples can be perturbed by 20% of the input range, MWD networks provably retain accuracy above 83%, while the accuracy of ReLU networks drops below 5%. The provable accuracy of MWD networks is superior even to the observed accuracy of ReLU networks trained with the help of adversarial examples. In the absence of adversarial attacks, MWD networks match the performance of sigmoid networks, and have accuracy only slightly below that of ReLU networks.
Tasks
Published 2019-02-01
URL http://arxiv.org/abs/1902.01208v1
PDF http://arxiv.org/pdf/1902.01208v1.pdf
PWC https://paperswithcode.com/paper/a-new-family-of-neural-networks-provably
Repo
Framework
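
The “simple interval techniques” referred to are interval bound propagation: push an axis-aligned box of admissible inputs through each layer and track worst-case lower and upper bounds. Below is a generic sketch for an affine layer only; the paper applies the same idea to its MWD units, whose exact functional form is not reproduced here.

```python
import numpy as np

def affine_interval(lower, upper, W, b):
    """Propagate the box [lower, upper] (element-wise) through x -> W @ x + b."""
    center = (upper + lower) / 2.0
    radius = (upper - lower) / 2.0
    out_center = W @ center + b
    out_radius = np.abs(W) @ radius      # worst-case deviation of the affine map over the box
    return out_center - out_radius, out_center + out_radius
```

Chaining such bounds through every layer gives a sound (if conservative) certificate: if the lower bound of the true class logit stays above the upper bounds of all other logits for every input in the perturbation box, the classification provably cannot be flipped.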

Efficient Sentence Embedding using Discrete Cosine Transform

Title Efficient Sentence Embedding using Discrete Cosine Transform
Authors Nada Almarwani, Hanan Aldarmaki, Mona Diab
Abstract Vector averaging remains one of the most popular sentence embedding methods in spite of its obvious disregard for syntactic structure. While more complex sequential or convolutional networks potentially yield superior classification performance, the improvements in classification accuracy are typically mediocre compared to the simple vector averaging. As an efficient alternative, we propose the use of discrete cosine transform (DCT) to compress word sequences in an order-preserving manner. The lower order DCT coefficients represent the overall feature patterns in sentences, which results in suitable embeddings for tasks that could benefit from syntactic features. Our results in semantic probing tasks demonstrate that DCT embeddings indeed preserve more syntactic information compared with vector averaging. With practically equivalent complexity, the model yields better overall performance in downstream classification tasks that correlate with syntactic features, which illustrates the capacity of DCT to preserve word order information.
Tasks Sentence Embedding
Published 2019-09-06
URL https://arxiv.org/abs/1909.03104v2
PDF https://arxiv.org/pdf/1909.03104v2.pdf
PWC https://paperswithcode.com/paper/efficient-sentence-embedding-using-discrete
Repo
Framework
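
A minimal version of the idea, assuming pre-computed word vectors: apply the DCT along the sequence axis and keep only the k lowest-order coefficients per embedding dimension, giving a fixed-size, order-aware sentence representation. Names and defaults below are illustrative; the paper’s exact coefficient selection and normalization may differ.

```python
import numpy as np
from scipy.fft import dct

def dct_sentence_embedding(word_vectors, k=4):
    """word_vectors: (seq_len, dim) array of word embeddings -> (k * dim,) sentence embedding."""
    coeffs = dct(word_vectors, type=2, axis=0, norm='ortho')   # DCT over the sequence axis
    if coeffs.shape[0] < k:                                    # zero-pad very short sentences
        pad = np.zeros((k - coeffs.shape[0], word_vectors.shape[1]))
        coeffs = np.vstack([coeffs, pad])
    return coeffs[:k].ravel()                                  # keep the k lowest-order coefficients
```

Note that with k=1 this essentially reduces to (scaled) vector averaging, since the zeroth DCT coefficient is proportional to the mean over positions; the higher-order coefficients are what add the word-order information.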

Truncated Gaussian-Mixture Variational AutoEncoder

Title Truncated Gaussian-Mixture Variational AutoEncoder
Authors Qingyu Zhao, Nicolas Honnorat, Ehsan Adeli, Kilian M. Pohl
Abstract Variational Autoencoder (VAE) has become a powerful tool in modeling the non-linear generative process of data from a low-dimensional latent space. Recently, several studies have proposed to use VAE for unsupervised clustering by using mixture models to capture the multi-modal structure of latent representations. This strategy, however, is ineffective when there are outlier data samples whose latent representations are meaningless, yet contaminating the estimation of key major clusters in the latent space. This exact problem arises in the context of resting-state fMRI (rs-fMRI) analysis, where clustering major functional connectivity patterns is often hindered by heavy noise of rs-fMRI and many minor clusters (rare connectivity patterns) of no interest to analysis. In this paper we propose a novel generative process, in which we use a Gaussian-mixture to model a few major clusters in the data, and use a non-informative uniform distribution to capture the remaining data. We embed this truncated Gaussian-Mixture model in a Variational AutoEncoder framework to obtain a general joint clustering and outlier detection approach, called tGM-VAE. We demonstrated the applicability of tGM-VAE on the MNIST dataset and further validated it in the context of rs-fMRI connectivity analysis.
Tasks Outlier Detection
Published 2019-02-11
URL http://arxiv.org/abs/1902.03717v2
PDF http://arxiv.org/pdf/1902.03717v2.pdf
PWC https://paperswithcode.com/paper/variational-autoencoder-with-truncated
Repo
Framework
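
Read from the abstract, the latent-space model is a mixture of a few Gaussian clusters plus a uniform “background” component that absorbs outliers. One way to write it (notation mine, not the paper’s) is

$$p(z) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(z \mid \mu_k, \Sigma_k) + \pi_0 \, \mathcal{U}(z \mid \Omega), \qquad \pi_0 + \sum_{k=1}^{K} \pi_k = 1,$$

where $\Omega$ is a bounded region over which the non-informative uniform component is defined. Samples assigned to the uniform component are treated as outliers, which is what turns the clustering model into a joint clustering and outlier-detection method.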

REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments

Title REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
Authors Yuankai Qi, Qi Wu, Peter Anderson, Xin Wang, William Yang Wang, Chunhua Shen, Anton van den Hengel
Abstract One of the long-term challenges of robotics is to enable robots to interact with humans in the visual world via natural language, as humans are visual animals that communicate through language. Overcoming this challenge requires the ability to perform a wide variety of complex tasks in response to multifarious instructions from humans. In the hope that it might drive progress towards more flexible and powerful human interactions with robots, we propose a dataset of varied and complex robot tasks, described in natural language, in terms of objects visible in a large set of real images. Given an instruction, success requires navigating through a previously-unseen environment to identify an object. This represents a practical challenge, but one that closely reflects one of the core visual problems in robotics. Several state-of-the-art vision-and-language navigation and referring-expression models are tested to verify the difficulty of this new task, but none of them show promising results because there are many fundamental differences between our task and previous ones. A novel Interactive Navigator-Pointer model is also proposed that provides a strong baseline on the task. The proposed model achieves the best performance on the unseen test split, but still leaves substantial room for improvement compared to human performance.
Tasks
Published 2019-04-23
URL https://arxiv.org/abs/1904.10151v2
PDF https://arxiv.org/pdf/1904.10151v2.pdf
PWC https://paperswithcode.com/paper/rerere-remote-embodied-referring-expressions
Repo
Framework

Outline Generation: Understanding the Inherent Content Structure of Documents

Title Outline Generation: Understanding the Inherent Content Structure of Documents
Authors Ruqing Zhang, Jiafeng Guo, Yixing Fan, Yanyan Lan, Xueqi Cheng
Abstract In this paper, we introduce and tackle the Outline Generation (OG) task, which aims to unveil the inherent content structure of a multi-paragraph document by identifying its potential sections and generating the corresponding section headings. Without loss of generality, the OG task can be viewed as a novel structured summarization task. To generate a sound outline, an ideal OG model should be able to capture three levels of coherence, namely the coherence between context paragraphs, that between a section and its heading, and that between context headings. The first one is the foundation for section identification, while the latter two are critical for consistent heading generation. In this work, we formulate the OG task as a hierarchical structured prediction problem, i.e., to first predict a sequence of section boundaries and then a sequence of section headings accordingly. We propose a novel hierarchical structured neural generation model, named HiStGen, for the task. Our model attempts to capture the three-level coherence via the following ways. First, we introduce a Markov paragraph dependency mechanism between context paragraphs for section identification. Second, we employ a section-aware attention mechanism to ensure the semantic coherence between a section and its heading. Finally, we leverage a Markov heading dependency mechanism and a review mechanism between context headings to improve the consistency and eliminate duplication between section headings. Besides, we build a novel WIKIOG dataset, a public collection which consists of over 1.75 million document-outline pairs for research on the OG task. Experimental results on our benchmark dataset demonstrate that our model can significantly outperform several state-of-the-art sequential generation models for the OG task.
Tasks Structured Prediction
Published 2019-05-24
URL https://arxiv.org/abs/1905.10039v1
PDF https://arxiv.org/pdf/1905.10039v1.pdf
PWC https://paperswithcode.com/paper/outline-generation-understanding-the-inherent
Repo
Framework