January 26, 2020

2921 words 14 mins read

Paper Group ANR 1351

Characterization of Overlap in Observational Studies

Title Characterization of Overlap in Observational Studies
Authors Fredrik D. Johansson, Dennis Wei, Michael Oberst, Tian Gao, Gabriel Brat, David Sontag, Kush R. Varshney
Abstract Overlap between treatment groups is required for non-parametric estimation of causal effects. If a subgroup of subjects always receives the same intervention, we cannot estimate the effect of intervention changes on that subgroup without further assumptions. When overlap does not hold globally, characterizing local regions of overlap can inform the relevance of causal conclusions for new subjects, and can help guide additional data collection. To have impact, these descriptions must be interpretable for downstream users who are not machine learning experts, such as policy makers. We formalize overlap estimation as a problem of finding minimum volume sets subject to coverage constraints and reduce this problem to binary classification with Boolean rule classifiers. We then generalize this method to estimate overlap in off-policy policy evaluation. In several real-world applications, we demonstrate that these rules have comparable accuracy to black-box estimators and provide intuitive and informative explanations that can inform policy making.
Tasks
Published 2019-07-09
URL https://arxiv.org/abs/1907.04138v2
PDF https://arxiv.org/pdf/1907.04138v2.pdf
PWC https://paperswithcode.com/paper/characterization-of-overlap-in-observational
Repo
Framework
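The abstract's key reduction, estimating the overlap region by turning it into a binary classification problem, can be illustrated with a toy sketch. This is a hypothetical simplification, not the paper's algorithm: units are labeled by whether a crude binned propensity estimate lies in [eps, 1 - eps], and an interpretable rule could then be fit to those labels.

```python
# Hypothetical sketch of reducing overlap estimation to classification labels.
# propensity_by_bin and the 1-D covariate setup are illustrative assumptions.

def propensity_by_bin(xs, treated, n_bins=4):
    """Estimate P(T=1 | x) by binning a 1-D covariate x in [0, 1]."""
    counts = [[0, 0] for _ in range(n_bins)]   # [n_control, n_treated] per bin
    for x, t in zip(xs, treated):
        b = min(int(x * n_bins), n_bins - 1)
        counts[b][t] += 1
    props = []
    for x in xs:
        b = min(int(x * n_bins), n_bins - 1)
        c, t = counts[b]
        props.append(t / (c + t))
    return props

def overlap_labels(props, eps=0.1):
    """1 if the unit lies in the overlap region, else 0."""
    return [1 if eps <= p <= 1 - eps else 0 for p in props]

# Toy data: treatment is deterministic for x > 0.75, so those units lack overlap.
xs      = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
treated = [0,   1,   0,   1,   0,   1,   1,   1]
labels  = overlap_labels(propensity_by_bin(xs, treated))
```

In the paper, a Boolean rule classifier would then be trained on such labels so that the overlap region is described by human-readable rules rather than a black-box boundary.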

Facilitating on-line opinion dynamics by mining expressions of causation. The case of climate change debates on The Guardian

Title Facilitating on-line opinion dynamics by mining expressions of causation. The case of climate change debates on The Guardian
Authors Tom Willaert, Sven Banisch, Paul Van Eecke, Katrien Beuls
Abstract News website comment sections are spaces where potentially conflicting opinions and beliefs are voiced. Addressing questions of how to study such cultural and societal conflicts through technological means, the present article critically examines possibilities and limitations of machine-guided exploration and potential facilitation of on-line opinion dynamics. These investigations are guided by a discussion of an experimental observatory for mining and analyzing opinions from climate change-related user comments on news articles from TheGuardian.com. This observatory combines causal mapping methods with computational text analysis in order to mine beliefs and visualize opinion landscapes based on expressions of causation. By (1) introducing digital methods and open infrastructures for data exploration and analysis and (2) engaging in debates about the implications of such methods and infrastructures, notably in terms of the leap from opinion observation to debate facilitation, the article aims to make a practical and theoretical contribution to the study of opinion dynamics and conflict in new media environments.
Tasks
Published 2019-12-03
URL https://arxiv.org/abs/1912.01252v1
PDF https://arxiv.org/pdf/1912.01252v1.pdf
PWC https://paperswithcode.com/paper/facilitating-on-line-opinion-dynamics-by
Repo
Framework

XRAI: Better Attributions Through Regions

Title XRAI: Better Attributions Through Regions
Authors Andrei Kapishnikov, Tolga Bolukbasi, Fernanda Viégas, Michael Terry
Abstract Saliency methods can aid understanding of deep neural networks. Recent years have witnessed many improvements to saliency methods, as well as new ways for evaluating them. In this paper, we 1) present a novel region-based attribution method, XRAI, that builds upon integrated gradients (Sundararajan et al. 2017), 2) introduce evaluation methods for empirically assessing the quality of image-based saliency maps (Performance Information Curves (PICs)), and 3) contribute an axiom-based sanity check for attribution methods. Through empirical experiments and example results, we show that XRAI produces better results than other saliency methods for common models and the ImageNet dataset.
Tasks
Published 2019-06-06
URL https://arxiv.org/abs/1906.02825v2
PDF https://arxiv.org/pdf/1906.02825v2.pdf
PWC https://paperswithcode.com/paper/segment-integrated-gradients-better
Repo
Framework
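XRAI builds on integrated gradients, which can be sketched in a few lines for a function with a hand-coded gradient (XRAI itself additionally aggregates these attributions over image regions, which is not reproduced here).

```python
# Minimal sketch of integrated gradients (Sundararajan et al. 2017):
# IG_i = (x_i - b_i) * average of df/dx_i along the straight path from b to x.

def integrated_gradients(f_grad, x, baseline, steps=100):
    attrs = [0.0] * len(x)
    for k in range(1, steps + 1):
        alpha = k / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = f_grad(point)
        for i in range(len(x)):
            attrs[i] += g[i]
    return [(xi - b) * a / steps for xi, b, a in zip(x, baseline, attrs)]

# Example: f(x) = sum_i x_i^2, so df/dx_i = 2 x_i; with a zero baseline the
# exact attribution for coordinate i is x_i^2, and attributions sum to f(x)
# (the completeness axiom).
grad = lambda p: [2.0 * v for v in p]
attrs = integrated_gradients(grad, x=[1.0, 2.0], baseline=[0.0, 0.0])
```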

Knowledge Enhanced Contextual Word Representations

Title Knowledge Enhanced Contextual Word Representations
Authors Matthew E. Peters, Mark Neumann, Robert L. Logan IV, Roy Schwartz, Vidur Joshi, Sameer Singh, Noah A. Smith
Abstract Contextual word representations, typically trained on unstructured, unlabeled text, do not contain any explicit grounding to real world entities and are often unable to remember facts about those entities. We propose a general method to embed multiple knowledge bases (KBs) into large scale models, and thereby enhance their representations with structured, human-curated knowledge. For each KB, we first use an integrated entity linker to retrieve relevant entity embeddings, then update contextual word representations via a form of word-to-entity attention. In contrast to previous approaches, the entity linkers and self-supervised language modeling objective are jointly trained end-to-end in a multitask setting that combines a small amount of entity linking supervision with a large amount of raw text. After integrating WordNet and a subset of Wikipedia into BERT, the knowledge enhanced BERT (KnowBert) demonstrates improved perplexity, ability to recall facts as measured in a probing task and downstream performance on relationship extraction, entity typing, and word sense disambiguation. KnowBert’s runtime is comparable to BERT’s and it scales to large KBs.
Tasks Entity Linking, Entity Typing, Language Modelling, Relation Extraction, Word Sense Disambiguation
Published 2019-09-09
URL https://arxiv.org/abs/1909.04164v2
PDF https://arxiv.org/pdf/1909.04164v2.pdf
PWC https://paperswithcode.com/paper/knowledge-enhanced-contextual-word
Repo
Framework
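The word-to-entity attention described in the abstract can be sketched in miniature. This is a toy illustration, not KnowBert's implementation (the real model uses learned projections, multiple KBs, and an integrated entity linker): a word representation attends over candidate entity embeddings and is updated with their attention-weighted average.

```python
import math

def attend(word, entities):
    """Softmax over word-entity dot products, then a residual update."""
    scores = [sum(w * e for w, e in zip(word, ent)) for ent in entities]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Attention-weighted average of the entity embeddings.
    pooled = [sum(w * ent[i] for w, ent in zip(weights, entities))
              for i in range(len(word))]
    # Residual update keeps the original contextual information.
    return [wi + pi for wi, pi in zip(word, pooled)], weights
```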

Putting words in context: LSTM language models and lexical ambiguity

Title Putting words in context: LSTM language models and lexical ambiguity
Authors Laura Aina, Kristina Gulordava, Gemma Boleda
Abstract In neural network models of language, words are commonly represented using context-invariant representations (word embeddings) which are then put in context in the hidden layers. Since words are often ambiguous, representing the contextually relevant information is not trivial. We investigate how an LSTM language model deals with lexical ambiguity in English, designing a method to probe its hidden representations for lexical and contextual information about words. We find that both types of information are represented to a large extent, but also that there is room for improvement for contextual information.
Tasks Language Modelling, Word Embeddings
Published 2019-06-12
URL https://arxiv.org/abs/1906.05149v1
PDF https://arxiv.org/pdf/1906.05149v1.pdf
PWC https://paperswithcode.com/paper/putting-words-in-context-lstm-language-models
Repo
Framework
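The probing idea, testing what a hidden state encodes by predicting a label from it with a simple classifier, can be sketched as follows. This is a hypothetical illustration using a nearest-centroid probe over toy "hidden states", not the paper's exact diagnostic classifier.

```python
def fit_centroids(states, labels):
    """Per-label mean of the hidden-state vectors."""
    sums, counts = {}, {}
    for s, y in zip(states, labels):
        acc = sums.setdefault(y, [0.0] * len(s))
        for i, v in enumerate(s):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def probe(state, centroids):
    """Predict the label whose centroid is closest to the hidden state."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda y: dist(state, centroids[y]))
```

If such a probe can recover, say, the contextually relevant sense of an ambiguous word from the hidden state, that information is (linearly or otherwise simply) represented there.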

Writing Across the World’s Languages: Deep Internationalization for Gboard, the Google Keyboard

Title Writing Across the World’s Languages: Deep Internationalization for Gboard, the Google Keyboard
Authors Daan van Esch, Elnaz Sarbar, Tamar Lucassen, Jeremy O’Brien, Theresa Breiner, Manasa Prasad, Evan Crew, Chieu Nguyen, Françoise Beaufays
Abstract This technical report describes our deep internationalization program for Gboard, the Google Keyboard. Today, Gboard supports 900+ language varieties across 70+ writing systems, and this report describes how and why we have been adding support for hundreds of language varieties from around the globe. Many languages of the world are increasingly used in writing on an everyday basis, and we describe the trends we see. We cover technological and logistical challenges in scaling up a language technology product like Gboard to hundreds of language varieties, and describe how we built systems and processes to operate at scale. Finally, we summarize the key take-aways from user studies we ran with speakers of hundreds of languages from around the world.
Tasks
Published 2019-12-03
URL https://arxiv.org/abs/1912.01218v1
PDF https://arxiv.org/pdf/1912.01218v1.pdf
PWC https://paperswithcode.com/paper/writing-across-the-worlds-languages-deep
Repo
Framework

DeepBbox: Accelerating Precise Ground Truth Generation for Autonomous Driving Datasets

Title DeepBbox: Accelerating Precise Ground Truth Generation for Autonomous Driving Datasets
Authors Govind Rathore, Wan-Yi Lin, Ji Eun Kim
Abstract Autonomous driving requires various computer vision algorithms, such as object detection and tracking. Precisely-labeled datasets (i.e., objects are fully contained in bounding boxes with only a few extra pixels) are preferred for training such algorithms, so that the algorithms can detect exact locations of the objects. However, it is very time-consuming and hence expensive to generate precise labels for image sequences at scale. In this paper, we propose DeepBbox, an algorithm that corrects loose object labels into tight bounding boxes to reduce human annotation efforts. We use the Cityscapes dataset to show annotation efficiency and accuracy improvement using DeepBbox. Experimental results show that, with DeepBbox, we can increase the number of object edges that are labeled automatically (within 1% error) by 50% to reduce manual annotation time.
Tasks Autonomous Driving, Object Detection
Published 2019-08-29
URL https://arxiv.org/abs/1909.05620v1
PDF https://arxiv.org/pdf/1909.05620v1.pdf
PWC https://paperswithcode.com/paper/deepbbox-accelerating-precise-ground-truth
Repo
Framework
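The paper corrects loose boxes with a learned model; as a hypothetical illustration of the evaluation side only, intersection-over-union (IoU) is the standard way to measure how close an annotated or corrected box is to the tight ground truth.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2, y1 < y2."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

loose = (0, 0, 12, 12)   # annotator's box with extra margin
tight = (1, 1, 11, 11)   # precise ground truth
```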

Towards improving the e-learning experience for deaf students: e-LUX

Title Towards improving the e-learning experience for deaf students: e-LUX
Authors Fabrizio Borgia, Claudia Bianchini, Maria de Marsico
Abstract Deaf people are more heavily affected by the digital divide than many would expect. Moreover, most accessibility guidelines addressing their needs just deal with captioning and audio-content transcription. However, this approach to the problem does not consider that deaf people have considerable difficulty with vocal languages, even in their written form. At present, only a few organizations, like W3C, produced guidelines dealing with one of their most distinctive expressions: Sign Language (SL). SL is, in fact, the visual-gestural language used by many deaf people to communicate with each other. The present work aims at supporting e-learning user experience (e-LUX) for these specific users by enhancing the accessibility of content and container services. In particular, we propose preliminary solutions to tailor activities which can be more fruitful when performed in one’s own “native” language, which for most deaf people, especially younger ones, is represented by national SL.
Tasks
Published 2019-11-27
URL https://arxiv.org/abs/1911.13231v1
PDF https://arxiv.org/pdf/1911.13231v1.pdf
PWC https://paperswithcode.com/paper/towards-improving-the-e-learning-experience
Repo
Framework

Computer-Aided Data Mining: Automating a Novel Knowledge Discovery and Data Mining Process Model for Metabolomics

Title Computer-Aided Data Mining: Automating a Novel Knowledge Discovery and Data Mining Process Model for Metabolomics
Authors Ahmed BaniMustafa, Nigel Hardy
Abstract This work presents MeKDDaM-SAGA, computer-aided automation software for implementing a novel knowledge discovery and data mining process model that was designed for performing justifiable, traceable and reproducible metabolomics data analysis. The process model focuses on achieving metabolomics analytical objectives and on considering the nature of its involved data. MeKDDaM-SAGA was successfully used for guiding the process model execution in a number of metabolomics applications. It satisfies the requirements of the proposed process model design and execution. The software realises the process model layout, structure and flow, and it enables its execution externally using various data mining and machine learning tools or internally using a number of embedded facilities that were built for performing automated activities such as data preprocessing, data exploration, data acclimatization, modelling, evaluation and visualization. MeKDDaM-SAGA was developed using object-oriented software engineering methodology and was constructed in Java. It consists of 241 design classes implementing 27 use-cases. The software uses an XML database to guarantee portability and a GUI to ensure user-friendliness. It implements an internal embedded version control system that is used to realise and manage the process flow, feedback and iterations and to enable undoing and redoing the execution of the process phases, activities, and the internal tasks within its phases.
Tasks
Published 2019-07-09
URL https://arxiv.org/abs/1907.04318v1
PDF https://arxiv.org/pdf/1907.04318v1.pdf
PWC https://paperswithcode.com/paper/computer-aided-data-mining-automating-a-novel
Repo
Framework

How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings

Title How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings
Authors Kawin Ethayarajh
Abstract Replacing static word embeddings with contextualized word representations has yielded significant improvements on many NLP tasks. However, just how contextual are the contextualized representations produced by models such as ELMo and BERT? Are there infinitely many context-specific representations for each word, or are words essentially assigned one of a finite number of word-sense representations? For one, we find that the contextualized representations of all words are not isotropic in any layer of the contextualizing model. While representations of the same word in different contexts still have a greater cosine similarity than those of two different words, this self-similarity is much lower in upper layers. This suggests that upper layers of contextualizing models produce more context-specific representations, much like how upper layers of LSTMs produce more task-specific representations. In all layers of ELMo, BERT, and GPT-2, on average, less than 5% of the variance in a word’s contextualized representations can be explained by a static embedding for that word, providing some justification for the success of contextualized representations.
Tasks Word Embeddings
Published 2019-09-02
URL https://arxiv.org/abs/1909.00512v1
PDF https://arxiv.org/pdf/1909.00512v1.pdf
PWC https://paperswithcode.com/paper/how-contextual-are-contextualized-word
Repo
Framework
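The self-similarity measure the abstract refers to, the average cosine similarity between a word's representations across different contexts, is simple to state in code (shown here on toy vectors rather than actual model activations):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def self_similarity(reps):
    """Mean pairwise cosine over one word's contextualized representations."""
    pairs = [(i, j) for i in range(len(reps)) for j in range(i + 1, len(reps))]
    return sum(cosine(reps[i], reps[j]) for i, j in pairs) / len(pairs)
```

Low self-similarity in a layer means that layer produces highly context-specific representations of the word; the paper additionally adjusts this measure for anisotropy, which is omitted here.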

TauRieL: Targeting Traveling Salesman Problem with a deep reinforcement learning inspired architecture

Title TauRieL: Targeting Traveling Salesman Problem with a deep reinforcement learning inspired architecture
Authors Gorker Alp Malazgirt, Osman S. Unsal, Adrian Cristal Kestelman
Abstract In this paper, we propose TauRieL and target the Traveling Salesman Problem (TSP), since it has broad applicability in theoretical and applied sciences. TauRieL utilizes an actor-critic inspired architecture that adopts ordinary feedforward nets to obtain a policy update vector $v$. Then, we use $v$ to improve the state transition matrix from which we generate the policy. Also, the state transition matrix allows the solver to initialize from precomputed solutions such as nearest neighbors. In an online learning setting, TauRieL unifies the training and the search, and it can generate near-optimal results in seconds. The inputs to the neural nets in the actor-critic architecture are raw 2-D coordinates; the design idea behind this decision is to keep the neural nets relatively smaller than architectures with wide embeddings, at the cost of omitting any distributed representations of the embeddings. Consequently, TauRieL generates TSP solutions two orders of magnitude faster per TSP instance as compared to state-of-the-art offline techniques, with a performance impact of 6.1% in the worst case.
Tasks
Published 2019-05-14
URL https://arxiv.org/abs/1905.05567v1
PDF https://arxiv.org/pdf/1905.05567v1.pdf
PWC https://paperswithcode.com/paper/tauriel-targeting-traveling-salesman-problem
Repo
Framework
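The abstract mentions initializing the state transition matrix from precomputed solutions such as nearest neighbors. The nearest-neighbour tour heuristic itself is a standard few-liner (the learned policy update vector $v$ is not reproduced here):

```python
import math

def nearest_neighbor_tour(points, start=0):
    """Greedy TSP tour: repeatedly visit the closest unvisited city."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour
```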

Drifting Reinforcement Learning: The Blessing of (More) Optimism in Face of Endogenous & Exogenous Dynamics

Title Drifting Reinforcement Learning: The Blessing of (More) Optimism in Face of Endogenous & Exogenous Dynamics
Authors Wang Chi Cheung, David Simchi-Levi, Ruihao Zhu
Abstract We consider undiscounted reinforcement learning (RL) in Markov decision processes (MDPs) under temporal drifts, i.e., both the reward and state transition distributions are allowed to evolve over time, as long as their respective total variations, quantified by suitable metrics, do not exceed certain variation budgets. This setting captures the endogenous and exogenous dynamics, uncertainty, and partial feedback in sequential decision-making scenarios, and finds applications in vehicle remarketing and real-time bidding. We first develop the Sliding Window Upper-Confidence bound for Reinforcement Learning with Confidence Widening (SWUCRL2-CW) algorithm, and establish its dynamic regret bound when the variation budgets are known. In addition, we propose the Bandit-over-Reinforcement Learning (BORL) algorithm to adaptively tune the SWUCRL2-CW algorithm to achieve the same dynamic regret bound, but in a parameter-free manner, i.e., without knowing the variation budgets. Finally, we conduct numerical experiments to show that our proposed algorithms achieve superior empirical performance compared to existing algorithms. Notably, the interplay between endogenous and exogenous dynamics presents a unique challenge, absent in existing (stationary and non-stationary) stochastic online learning settings, when we apply the conventional Optimism in Face of Uncertainty principle to design algorithms with provably low dynamic regret for RL in drifting MDPs. We overcome the challenge by a novel confidence widening technique that incorporates additional optimism into our learning algorithms to ensure low dynamic regret bounds. To extend our theoretical findings, we apply our framework to inventory control problems, and demonstrate how one can alternatively leverage special structures on the state transition distributions to bypass the difficulty in exploring time-varying environments.
Tasks Decision Making
Published 2019-06-07
URL https://arxiv.org/abs/1906.02922v3
PDF https://arxiv.org/pdf/1906.02922v3.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-under-drift
Repo
Framework
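The sliding-window idea behind SWUCRL2-CW can be illustrated with a hypothetical one-dimensional example: under drift, estimating a reward from only the last W observations bounds the bias introduced by older, stale data (the window length W is what the variation budget, or the BORL tuner, controls).

```python
def sliding_window_mean(rewards, window):
    """Average of only the most recent `window` observations."""
    recent = rewards[-window:]
    return sum(recent) / len(recent)

# The reward drifts from ~0 to ~1; a short window tracks the recent regime,
# while the all-history average lags far behind it.
rewards = [0.0] * 50 + [1.0] * 10
```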

Linear-Quadratic Mean-Field Reinforcement Learning: Convergence of Policy Gradient Methods

Title Linear-Quadratic Mean-Field Reinforcement Learning: Convergence of Policy Gradient Methods
Authors René Carmona, Mathieu Laurière, Zongjun Tan
Abstract We investigate reinforcement learning for mean field control problems in discrete time, which can be viewed as Markov decision processes for a large number of exchangeable agents interacting in a mean field manner. Such problems arise, for instance when a large number of robots communicate through a central unit dispatching the optimal policy computed by minimizing the overall social cost. An approximate solution is obtained by learning the optimal policy of a generic agent interacting with the statistical distribution of the states of the other agents. We prove rigorously the convergence of exact and model-free policy gradient methods in a mean-field linear-quadratic setting. We also provide graphical evidence of the convergence based on implementations of our algorithms.
Tasks Policy Gradient Methods
Published 2019-10-09
URL https://arxiv.org/abs/1910.04295v1
PDF https://arxiv.org/pdf/1910.04295v1.pdf
PWC https://paperswithcode.com/paper/linear-quadratic-mean-field-reinforcement
Repo
Framework

Non-Negative Kernel Sparse Coding for the Classification of Motion Data

Title Non-Negative Kernel Sparse Coding for the Classification of Motion Data
Authors Babak Hosseini, Felix Hülsmann, Mario Botsch, Barbara Hammer
Abstract We are interested in the decomposition of motion data into a sparse linear combination of base functions which enable efficient data processing. We combine two prominent frameworks: dynamic time warping (DTW), which offers particularly successful pairwise motion data comparison, and sparse coding (SC), which enables an automatic decomposition of vectorial data into a sparse linear combination of base vectors. We enhance SC as follows: an efficient kernelization which extends its application domain to general similarity data such as offered by DTW, and its restriction to non-negative linear representations of signals and base vectors in order to guarantee a meaningful dictionary. Empirical evaluations on motion capture benchmarks show the effectiveness of our framework regarding interpretation and discrimination concerns.
Tasks Motion Capture
Published 2019-03-10
URL http://arxiv.org/abs/1903.03891v2
PDF http://arxiv.org/pdf/1903.03891v2.pdf
PWC https://paperswithcode.com/paper/non-negative-kernel-sparse-coding-for-the
Repo
Framework
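The dynamic time warping (DTW) comparison the method kernelizes is the textbook recurrence below, shown for scalar sequences (the paper applies it to motion data and then feeds the resulting similarities into non-negative kernel sparse coding):

```python
def dtw(a, b):
    """DTW distance between two scalar sequences with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]
```

Because DTW aligns sequences of different lengths, `dtw([1, 2, 3], [1, 2, 2, 3])` is 0: the repeated 2 is warped onto the single 2 at no cost.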

Unsupervised Paraphrasing without Translation

Title Unsupervised Paraphrasing without Translation
Authors Aurko Roy, David Grangier
Abstract Paraphrasing exemplifies the ability to abstract semantic content from surface forms. Recent work on automatic paraphrasing is dominated by methods leveraging Machine Translation (MT) as an intermediate step. This contrasts with humans, who can paraphrase without being bilingual. This work proposes to learn paraphrasing models from an unlabeled monolingual corpus only. To that end, we propose a residual variant of vector-quantized variational auto-encoder. We compare with MT-based approaches on paraphrase identification, generation, and training augmentation. Monolingual paraphrasing outperforms unsupervised translation in all settings. Comparisons with supervised translation are more mixed: monolingual paraphrasing is interesting for identification and augmentation; supervised translation is superior for generation.
Tasks Machine Translation, Paraphrase Identification
Published 2019-05-29
URL https://arxiv.org/abs/1905.12752v1
PDF https://arxiv.org/pdf/1905.12752v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-paraphrasing-without-translation
Repo
Framework
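The vector-quantization step at the heart of the paper's residual VQ-VAE can be sketched on toy vectors. This is a hypothetical illustration, not the paper's model: each continuous encoder output is snapped to its nearest codebook entry, and the residual variant would then quantize the remaining error with further codebooks.

```python
def quantize(vec, codebook):
    """Return the index and value of the nearest codebook entry."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    idx = min(range(len(codebook)), key=lambda k: dist(vec, codebook[k]))
    return idx, codebook[idx]

# The discrete index stream (here just `idx`) is what the decoder paraphrases
# from; during training, gradients are passed through the quantizer with a
# straight-through estimator, which this sketch omits.
codebook = [[0.0, 0.0], [1.0, 1.0]]
idx, code = quantize([0.9, 0.8], codebook)
```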