February 1, 2020

2778 words 14 mins read

Paper Group AWR 77

Greedy Optimized Multileaving for Personalization

Title Greedy Optimized Multileaving for Personalization
Authors Kojiro Iizuka, Takeshi Yoneda, Yoshifumi Seki
Abstract Personalization plays an important role in many services. To evaluate personalized rankings, online evaluation methods such as A/B testing are widely used today. Recently, multileaving has been found to be an efficient method for evaluating rankings in information retrieval fields. This paper describes the first attempt to optimize the multileaving method for personalization settings. We clarify the challenges of applying this method to personalized rankings, and to solve them we propose greedy optimized multileaving (GOM) with a new credit feedback function. The empirical results show that GOM remains stable as ranking length and the number of rankers increase. We implemented GOM in our production news recommender systems and compared its online performance. The results show that GOM evaluated the personalized rankings precisely, with significantly smaller sample sizes (< 1/10) than A/B testing.
Tasks Information Retrieval, Recommendation Systems
Published 2019-07-19
URL https://arxiv.org/abs/1907.08346v1
PDF https://arxiv.org/pdf/1907.08346v1.pdf
PWC https://paperswithcode.com/paper/greedy-optimized-multileaving-for
Repo https://github.com/mathetake/intergo
Framework none
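
The paper's exact greedy optimization is not reproduced here, but the moving parts of multileaving with credit feedback are easy to sketch. Below is a simplified team-draft-style multileave with an inverse-rank credit function; both the interleaving policy and the credit function are illustrative stand-ins, not the GOM formulation itself.

```python
import random
from collections import defaultdict

def multileave(rankings, length):
    """Team-draft-style multileaving (a simplification of GOM): rankers
    take turns, in random order each round, contributing their
    highest-ranked item not yet in the combined list."""
    combined, seen = [], set()
    while len(combined) < length:
        added = False
        for r in random.sample(range(len(rankings)), len(rankings)):
            for item in rankings[r]:
                if item not in seen:
                    combined.append(item)
                    seen.add(item)
                    added = True
                    break
            if len(combined) == length:
                break
        if not added:  # every ranking is exhausted
            break
    return combined

def credit(rankings, clicked):
    """Inverse-rank credit feedback: each ranker is rewarded according
    to how highly it ranked the items the user clicked."""
    scores = defaultdict(float)
    for r, ranking in enumerate(rankings):
        for item in clicked:
            if item in ranking:
                scores[r] += 1.0 / (ranking.index(item) + 1)
    return dict(scores)
```

Aggregating `credit` scores over many impressions yields a preference estimate for each ranker, which is what the online evaluation compares.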

Verification of Neural Networks: Specifying Global Robustness using Generative Models

Title Verification of Neural Networks: Specifying Global Robustness using Generative Models
Authors Nathanaël Fijalkow, Mohit Kumar Gupta
Abstract The success of neural networks across most machine learning tasks and the persistence of adversarial examples have made the verification of such models an important quest. Several techniques have been successfully developed to verify robustness, and are now able to evaluate neural networks with thousands of nodes. The main weakness of this approach is in the specification: robustness is asserted on a validation set consisting of a finite set of examples, i.e. locally. We propose a notion of global robustness based on generative models, which asserts the robustness on a very large and representative set of examples. We show how this can be used for verifying neural networks. In this paper we experimentally explore the merits of this approach, and show how it can be used to construct realistic adversarial examples.
Tasks
Published 2019-10-11
URL https://arxiv.org/abs/1910.05018v1
PDF https://arxiv.org/pdf/1910.05018v1.pdf
PWC https://paperswithcode.com/paper/verification-of-neural-networks-specifying
Repo https://github.com/mohitiitb/NeuralNetworkVerification_GlobalRobustness
Framework tf
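
A minimal sketch of how a generative model turns local robustness checks into a distribution-level estimate: sample latent codes, perturb each one within a small latent ball, and measure how often the classifier's decision flips on the generated inputs. The Gaussian prior, the perturbation scheme, and `eps` are assumptions for illustration; the paper's formal criterion differs.

```python
import torch

@torch.no_grad()
def global_robustness(generator, classifier, latent_dim, n_samples=1000, eps=0.1):
    # Monte-Carlo estimate: fraction of generated examples whose
    # classification survives a small latent-space perturbation.
    stable = 0
    for _ in range(n_samples):
        z = torch.randn(1, latent_dim)
        delta = torch.randn(1, latent_dim)
        delta = eps * delta / delta.norm()         # project to the eps-sphere
        y = classifier(generator(z)).argmax(dim=1)
        y_pert = classifier(generator(z + delta)).argmax(dim=1)
        stable += int((y == y_pert).item())
    return stable / n_samples
```

Latent pairs where the decision flips are exactly the realistic adversarial examples the abstract mentions: both inputs lie on the generator's manifold.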

Residual Reactive Navigation: Combining Classical and Learned Navigation Strategies For Deployment in Unknown Environments

Title Residual Reactive Navigation: Combining Classical and Learned Navigation Strategies For Deployment in Unknown Environments
Authors Krishan Rana, Ben Talbot, Vibhavari Dasagi, Michael Milford, Niko Sünderhauf
Abstract In this work we focus on improving the efficiency and generalisation of learned navigation strategies when they are transferred from their training environment to previously unseen ones. We present an extension of the residual reinforcement learning framework from the robotic manipulation literature and adapt it to the vast and unstructured environments that mobile robots can operate in. The concept is based on learning a residual control effect to add to a typical sub-optimal classical controller in order to close the performance gap, whilst guiding the exploration process during training for improved data efficiency. We exploit this tight coupling and propose a novel deployment strategy, switching Residual Reactive Navigation (sRRN), which yields efficient trajectories whilst probabilistically switching to a classical controller in cases of high policy uncertainty. Our approach achieves improved performance over end-to-end alternatives and can be incorporated as part of a complete navigation stack for cluttered indoor navigation tasks in the real world. The code and training environment for this project are made publicly available at https://sites.google.com/view/srrn/home.
Tasks
Published 2019-09-24
URL https://arxiv.org/abs/1909.10972v2
PDF https://arxiv.org/pdf/1909.10972v2.pdf
PWC https://paperswithcode.com/paper/residual-reactive-navigation-combining
Repo https://github.com/krishanrana/2D_SRRN
Framework none
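
The switching strategy itself is compact enough to sketch: add a learned residual to the classical controller's command, but drop the residual whenever the policy reports high predictive uncertainty. The `residual_policy` interface (returning an action mean and standard deviation, as from a Gaussian policy head) and the threshold value are assumptions for illustration.

```python
import numpy as np

def srrn_action(obs, classical_controller, residual_policy, sigma_threshold=0.5):
    # Sketch of the switching deployment strategy: the classical
    # controller provides a prior command, and the learned residual
    # closes the performance gap only when the policy is confident.
    prior = classical_controller(obs)
    residual_mean, residual_std = residual_policy(obs)
    if np.mean(residual_std) > sigma_threshold:
        return prior                      # high uncertainty: classical only
    return prior + residual_mean          # confident: apply the residual
```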

Zero-Shot Open Entity Typing as Type-Compatible Grounding

Title Zero-Shot Open Entity Typing as Type-Compatible Grounding
Authors Ben Zhou, Daniel Khashabi, Chen-Tse Tsai, Dan Roth
Abstract The problem of entity-typing has been studied predominantly in supervised learning fashion, mostly with task-specific annotations (for coarse types) and sometimes with distant supervision (for fine types). While such approaches have strong performance within datasets, they often lack the flexibility to transfer across text genres and to generalize to new type taxonomies. In this work we propose a zero-shot entity typing approach that requires no annotated data and can flexibly identify newly defined types. Given a type taxonomy defined as Boolean functions of FREEBASE “types”, we ground a given mention to a set of type-compatible Wikipedia entries and then infer the target mention’s types using an inference algorithm that makes use of the types of these entries. We evaluate our system on a broad range of datasets, including standard fine-grained and coarse-grained entity typing datasets, and also a dataset in the biological domain. Our system is shown to be competitive with state-of-the-art supervised NER systems and outperforms them on out-of-domain datasets. We also show that our system significantly outperforms other zero-shot fine typing systems.
Tasks Entity Typing
Published 2019-07-07
URL https://arxiv.org/abs/1907.03228v1
PDF https://arxiv.org/pdf/1907.03228v1.pdf
PWC https://paperswithcode.com/paper/zero-shot-open-entity-typing-as-type-1
Repo https://github.com/CogComp/zoe
Framework tf
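
A sketch of the inference step, with grounding taken as given: assume the mention has already been grounded to scored Wikipedia entries, each carrying its FREEBASE types, and assign every target type (a Boolean function over FREEBASE types) by majority vote over the top entries. The taxonomy entries below are hypothetical examples, not the paper's.

```python
def infer_types(candidate_entries, taxonomy, top_k=3):
    """candidate_entries: list of (score, freebase_types) pairs for the
    Wikipedia entries grounded to the mention (retrieval not shown).
    taxonomy: target type -> Boolean predicate over FREEBASE types."""
    top = sorted(candidate_entries, key=lambda e: -e[0])[:top_k]
    votes = {
        target: sum(1 for _, fb_types in top if predicate(fb_types))
        for target, predicate in taxonomy.items()
    }
    return {t for t, v in votes.items() if v > top_k / 2}

# Hypothetical taxonomy for illustration only:
taxonomy = {
    "/person": lambda fb: "people.person" in fb,
    "/person/artist": lambda fb: "music.artist" in fb
                              or "visual_art.visual_artist" in fb,
}
```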

The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction

Title The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction
Authors Martin Gauch, Juliane Mai, Jimmy Lin
Abstract Accurate streamflow prediction largely relies on historical records of both meteorological data and streamflow measurements. For many regions around the world, however, such data are only scarcely or not at all available. To select an appropriate model for a region with a given amount of historical data, it is therefore indispensable to know a model’s sensitivity to limited training data, both in terms of geographic diversity and different spans of time. In this study, we provide decision support for tree- and LSTM-based models. We feed the models meteorological measurements from the CAMELS dataset, and individually restrict the training period length and the number of basins used in training. Our findings show that tree-based models provide more accurate predictions on small datasets, while LSTMs are superior given sufficient training data. This is perhaps not surprising, as neural networks are known to be data-hungry; however, we are able to characterize each model’s strengths under different conditions, including the “breakeven point” when LSTMs begin to overtake tree-based models.
Tasks
Published 2019-11-17
URL https://arxiv.org/abs/1911.07249v1
PDF https://arxiv.org/pdf/1911.07249v1.pdf
PWC https://paperswithcode.com/paper/the-proper-care-and-feeding-of-camels-how
Repo https://github.com/gauchm/ealstm_regional_modeling
Framework pytorch
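
The data-restriction protocol is straightforward to sketch: subsample basins and truncate each basin's training period, then fit whichever model family is under study. The layout of `data` and the scikit-learn forest standing in for the tree-based models are assumptions; the study's own models and CAMELS preprocessing live in the repo above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def restricted_fit(data, n_basins, n_years, model_factory):
    """data: dict mapping basin id -> (X, y) arrays of daily
    meteorological inputs and streamflow targets (assumed layout)."""
    X_parts, y_parts = [], []
    for basin in list(data)[:n_basins]:        # restrict geographic diversity
        X, y = data[basin]
        days = 365 * n_years                   # restrict the training period
        X_parts.append(X[:days])
        y_parts.append(y[:days])
    model = model_factory()
    model.fit(np.vstack(X_parts), np.concatenate(y_parts))
    return model

# e.g. the tree-based condition with 50 basins and 3 years of data:
# model = restricted_fit(data, 50, 3, RandomForestRegressor)
```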

Few-Shot Representation Learning for Out-Of-Vocabulary Words

Title Few-Shot Representation Learning for Out-Of-Vocabulary Words
Authors Ziniu Hu, Ting Chen, Kai-Wei Chang, Yizhou Sun
Abstract Existing approaches for learning word embeddings often assume there are sufficient occurrences for each word in the corpus, such that the representation of words can be accurately estimated from their contexts. However, in real-world scenarios, out-of-vocabulary (a.k.a. OOV) words that do not appear in the training corpus emerge frequently. It is challenging to learn accurate representations of these words with only a few observations. In this paper, we formulate the learning of OOV embeddings as a few-shot regression problem, and address it by training a representation function to predict the oracle embedding vector (defined as the embedding trained with abundant observations) based on limited observations. Specifically, we propose a novel hierarchical attention-based architecture to serve as the neural regression function, with which the context information of a word is encoded and aggregated from K observations. Furthermore, our approach can leverage Model-Agnostic Meta-Learning (MAML) to adapt the learned model to a new corpus quickly and robustly. Experiments show that the proposed approach significantly outperforms existing methods in constructing accurate embeddings for OOV words, and improves downstream tasks where these embeddings are utilized.
Tasks few-shot regression, Learning Word Embeddings, Meta-Learning, Representation Learning, Word Embeddings
Published 2019-07-01
URL https://arxiv.org/abs/1907.00505v1
PDF https://arxiv.org/pdf/1907.00505v1.pdf
PWC https://paperswithcode.com/paper/few-shot-representation-learning-for-out-of
Repo https://github.com/acbull/HiCE
Framework pytorch
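
A much-simplified stand-in for the hierarchical attention regressor: encode each of the K context representations, attend over them, and output a predicted oracle embedding, trained with a cosine loss. Single-level attention over pre-pooled context vectors is an assumption; the paper's architecture is hierarchical and MAML-adaptable.

```python
import torch
import torch.nn as nn

class ContextRegressor(nn.Module):
    """Predict an OOV word's oracle embedding from K context vectors
    (e.g. mean-pooled word vectors of each sentence containing it)."""
    def __init__(self, dim):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                    nn.Linear(dim, dim))
        self.attn = nn.Linear(dim, 1)

    def forward(self, contexts):                 # contexts: (K, dim)
        h = self.encode(contexts)                # per-context encoding
        w = torch.softmax(self.attn(h), dim=0)   # attention over contexts
        return (w * h).sum(dim=0)                # predicted oracle embedding

# Training objective, per word with oracle vector `oracle`:
# loss = 1 - torch.cosine_similarity(model(contexts), oracle, dim=0)
```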

Abstract Reasoning with Distracting Features

Title Abstract Reasoning with Distracting Features
Authors Kecheng Zheng, Zheng-jun Zha, Wei Wei
Abstract Abstract reasoning is a long-standing challenge in artificial intelligence. Recent studies suggest that many of the deep architectures that have triumphed in other domains fail to work well in abstract reasoning. In this paper, we first illustrate that one of the main challenges in such a reasoning task is the presence of distracting features, which requires the learning algorithm to leverage counterevidence and to reject false hypotheses in order to learn the true patterns. We then show that a carefully designed learning trajectory over different categories of training data can effectively boost learning performance by mitigating the impact of distracting features. Inspired by this, we propose the feature robust abstract reasoning (FRAR) model, which consists of a reinforcement learning based teacher network that determines the sequence of training and a student network that makes predictions. Experimental results demonstrate strong improvements over baseline algorithms, and we are able to beat the state-of-the-art models by 18.7% on the RAVEN dataset and 13.3% on the PGM dataset.
Tasks
Published 2019-12-02
URL https://arxiv.org/abs/1912.00569v1
PDF https://arxiv.org/pdf/1912.00569v1.pdf
PWC https://paperswithcode.com/paper/abstract-reasoning-with-distracting-features-1
Repo https://github.com/zkcys001/distracting_feature
Framework pytorch
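
The teacher-student loop can be illustrated with a bandit standing in for the paper's learned RL teacher: the teacher picks which category of training data the student sees next and is rewarded by the student's loss improvement. The epsilon-greedy value tracking below is an assumption for brevity.

```python
import random

class BanditTeacher:
    """Epsilon-greedy bandit over training-data categories, rewarded by
    how much the chosen category reduced the student's loss."""
    def __init__(self, n_categories, eps=0.1, lr=0.1):
        self.values = [0.0] * n_categories
        self.eps, self.lr = eps, lr

    def choose(self):
        if random.random() < self.eps:
            return random.randrange(len(self.values))    # explore
        return max(range(len(self.values)),
                   key=self.values.__getitem__)          # exploit

    def update(self, category, loss_before, loss_after):
        reward = loss_before - loss_after                # loss improvement
        self.values[category] += self.lr * (reward - self.values[category])
```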

Bayesian Batch Active Learning as Sparse Subset Approximation

Title Bayesian Batch Active Learning as Sparse Subset Approximation
Authors Robert Pinsler, Jonathan Gordon, Eric Nalisnick, José Miguel Hernández-Lobato
Abstract Leveraging the wealth of unlabeled data produced in recent years provides great potential for improving supervised models. When the cost of acquiring labels is high, probabilistic active learning methods can be used to greedily select the most informative data points to be labeled. However, for many large-scale problems standard greedy procedures become computationally infeasible and suffer from negligible model change. In this paper, we introduce a novel Bayesian batch active learning approach that mitigates these issues. Our approach is motivated by approximating the complete data posterior of the model parameters. While naive batch construction methods result in correlated queries, our algorithm produces diverse batches that enable efficient active learning at scale. We derive interpretable closed-form solutions akin to existing active learning procedures for linear models, and generalize to arbitrary models using random projections. We demonstrate the benefits of our approach on several large-scale regression and classification tasks.
Tasks Active Learning
Published 2019-08-06
URL https://arxiv.org/abs/1908.02144v3
PDF https://arxiv.org/pdf/1908.02144v3.pdf
PWC https://paperswithcode.com/paper/bayesian-batch-active-learning-as-sparse
Repo https://github.com/rpinsler/active-bayesian-coresets
Framework pytorch
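
In the spirit of the sparse-subset view, though not the paper's exact Frank-Wolfe construction, a greedy sketch: represent each pool point by a feature vector (e.g. a random projection of its log-likelihood gradient, per the paper's generalization to arbitrary models) and pick points whose summed features best track the sum over the whole unlabeled pool, which is what keeps the batch diverse rather than correlated.

```python
import numpy as np

def select_batch(features, batch_size):
    """features: (n_pool, d) array, one vector per candidate point.
    Greedily choose points whose sum approximates the pool-wide sum."""
    residual = features.sum(axis=0)            # what remains to be covered
    chosen = []
    for _ in range(batch_size):
        scores = features @ residual           # alignment with the residual
        scores[chosen] = -np.inf               # forbid repeats
        i = int(np.argmax(scores))
        chosen.append(i)
        residual = residual - features[i]
    return chosen
```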

BPE-Dropout: Simple and Effective Subword Regularization

Title BPE-Dropout: Simple and Effective Subword Regularization
Authors Ivan Provilkov, Dmitrii Emelianenko, Elena Voita
Abstract Subword segmentation is widely used to address the open-vocabulary problem in machine translation. The dominant approach to subword segmentation is Byte Pair Encoding (BPE), which keeps the most frequent words intact while splitting the rare ones into multiple tokens. While multiple segmentations are possible even with the same vocabulary, BPE splits words into unique sequences; this may prevent a model from better learning the compositionality of words and being robust to segmentation errors. So far, the only way to overcome this BPE imperfection, its deterministic nature, was to create another subword segmentation algorithm (Kudo, 2018). In contrast, we show that BPE itself incorporates the ability to produce multiple segmentations of the same word. We introduce BPE-dropout, a simple and effective subword regularization method based on, and compatible with, conventional BPE. It stochastically corrupts the segmentation procedure of BPE, producing multiple segmentations within the same fixed BPE framework. Using BPE-dropout during training and standard BPE during inference improves translation quality by up to 3 BLEU compared to BPE and up to 0.9 BLEU compared to the previous subword regularization.
Tasks Machine Translation
Published 2019-10-29
URL https://arxiv.org/abs/1910.13267v1
PDF https://arxiv.org/pdf/1910.13267v1.pdf
PWC https://paperswithcode.com/paper/191013267
Repo https://github.com/kh-mo/QA_wikisql
Framework none
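
The core of BPE-dropout fits in a few lines: run ordinary BPE segmentation, but at each merge step drop every eligible merge with probability p; setting p = 0 recovers deterministic BPE for inference. The merge-table representation below is a simplification of production BPE implementations.

```python
import random

def bpe_dropout_segment(word, merges, p=0.1):
    """merges: dict mapping a symbol pair to its merge priority
    (lower = learned earlier). Returns the list of subword symbols."""
    symbols = list(word)
    while True:
        # Collect applicable merges, dropping each with probability p.
        candidates = [
            (merges[(a, b)], i)
            for i, (a, b) in enumerate(zip(symbols, symbols[1:]))
            if (a, b) in merges and random.random() >= p
        ]
        if not candidates:
            return symbols
        _, i = min(candidates)  # apply the highest-priority surviving merge
        symbols = symbols[:i] + [symbols[i] + symbols[i + 1]] + symbols[i + 2:]
```

Because surviving merges are re-sampled on every call, the same word segments differently across training epochs, which is precisely the regularization effect.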

Knowledge-Embedded Routing Network for Scene Graph Generation

Title Knowledge-Embedded Routing Network for Scene Graph Generation
Authors Tianshui Chen, Weihao Yu, Riquan Chen, Liang Lin
Abstract Understanding a scene in depth involves not only locating and recognizing individual objects, but also inferring the relationships and interactions among them. However, since the distribution of real-world relationships is severely unbalanced, existing methods perform quite poorly on the less frequent relationships. In this work, we find that the statistical correlations between object pairs and their relationships can effectively regularize the semantic space and make predictions less ambiguous, thereby addressing the unbalanced distribution issue. To achieve this, we incorporate these statistical correlations into deep neural networks to facilitate scene graph generation, developing a Knowledge-Embedded Routing Network. More specifically, we show that the statistical correlations between objects appearing in images and their relationships can be explicitly represented by a structured knowledge graph, and a routing mechanism is learned to propagate messages through the graph to explore their interactions. Extensive experiments on the large-scale Visual Genome dataset demonstrate the superiority of the proposed method over current state-of-the-art competitors.
Tasks Graph Generation, Scene Graph Generation
Published 2019-03-08
URL http://arxiv.org/abs/1903.03326v1
PDF http://arxiv.org/pdf/1903.03326v1.pdf
PWC https://paperswithcode.com/paper/knowledge-embedded-routing-network-for-scene
Repo https://github.com/HCPLab-SYSU/KERN
Framework pytorch
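
A sketch of the routing idea under stated assumptions: freeze the edge weights between an image's object nodes using dataset co-occurrence statistics and propagate node features with a gated update. The paper learns the routing mechanism; here the statistics are a fixed buffer and a GRU cell stands in for the gating.

```python
import torch
import torch.nn as nn

class StatisticalRouting(nn.Module):
    """Propagate object-node features over a graph whose edge weights
    come from object-pair/relationship co-occurrence statistics."""
    def __init__(self, dim, cooccurrence):
        super().__init__()
        # Row-normalized statistics as fixed edge weights (rows assumed
        # nonzero); the paper instead learns the routing.
        self.register_buffer("A", cooccurrence / cooccurrence.sum(1, keepdim=True))
        self.update = nn.GRUCell(dim, dim)

    def forward(self, h, steps=3):              # h: (n_nodes, dim)
        for _ in range(steps):
            msg = self.A @ h                    # statistics-weighted messages
            h = self.update(msg, h)             # gated node update
        return h
```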

Good Secretaries, Bad Truck Drivers? Occupational Gender Stereotypes in Sentiment Analysis

Title Good Secretaries, Bad Truck Drivers? Occupational Gender Stereotypes in Sentiment Analysis
Authors Jayadev Bhaskaran, Isha Bhallamudi
Abstract In this work, we investigate the presence of occupational gender stereotypes in sentiment analysis models. Such a task has implications for reducing implicit biases in these models, which are being applied to an increasingly wide variety of downstream tasks. We release a new gender-balanced dataset of 800 sentences pertaining to specific professions and propose a methodology for using it as a test bench to evaluate sentiment analysis models. We evaluate the presence of occupational gender stereotypes in 3 different models using our approach, and explore their relationship with societal perceptions of occupations.
Tasks Sentiment Analysis
Published 2019-06-24
URL https://arxiv.org/abs/1906.10256v2
PDF https://arxiv.org/pdf/1906.10256v2.pdf
PWC https://paperswithcode.com/paper/good-secretaries-bad-truck-drivers
Repo https://github.com/jayadevbhaskaran/gendered-sentiment
Framework none
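
The test-bench methodology reduces to scoring minimally different sentence pairs: hold the profession fixed, vary only the gendered word, and record the sentiment gap. The single template below is illustrative; the released dataset contains 800 curated sentences.

```python
def stereotype_gap(model, professions):
    """model: callable mapping a sentence to a sentiment score in [0, 1]
    (assumed interface). Returns the per-profession sentiment gap."""
    template = "{} is a {}."
    gaps = {}
    for job in professions:
        gaps[job] = (model(template.format("She", job))
                     - model(template.format("He", job)))
    return gaps

# e.g. stereotype_gap(model, ["secretary", "truck driver", "nurse"])
# A gap far from zero signals an occupational gender stereotype.
```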

Variational Sequential Labelers for Semi-Supervised Learning

Title Variational Sequential Labelers for Semi-Supervised Learning
Authors Mingda Chen, Qingming Tang, Karen Livescu, Kevin Gimpel
Abstract We introduce a family of multitask variational methods for semi-supervised sequence labeling. Our model family consists of a latent-variable generative model and a discriminative labeler. The generative models use latent variables to define the conditional probability of a word given its context, drawing inspiration from word prediction objectives commonly used in learning word embeddings. The labeler helps inject discriminative information into the latent space. We explore several latent variable configurations, including ones with hierarchical structure, which enables the model to account for both label-specific and word-specific information. Our models consistently outperform standard sequential baselines on 8 sequence labeling datasets, and improve further with unlabeled data.
Tasks Learning Word Embeddings, Word Embeddings
Published 2019-06-23
URL https://arxiv.org/abs/1906.09535v1
PDF https://arxiv.org/pdf/1906.09535v1.pdf
PWC https://paperswithcode.com/paper/variational-sequential-labelers-for-semi-1
Repo https://github.com/mingdachen/vsl
Framework pytorch
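
A single-latent-variable member of the model family, sketched in PyTorch: one latent vector per token is asked both to reconstruct the word from its context (generative term with a KL penalty) and to predict its label (discriminative term, on labeled data only). The paper's hierarchical variants add more structure than shown here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VSL(nn.Module):
    def __init__(self, dim, vocab_size, n_labels):
        super().__init__()
        self.mu = nn.Linear(dim, dim)               # q(z | context) mean
        self.logvar = nn.Linear(dim, dim)           # q(z | context) log-variance
        self.decoder = nn.Linear(dim, vocab_size)   # p(word | z)
        self.labeler = nn.Linear(dim, n_labels)     # discriminative head

    def forward(self, ctx, word, label=None):       # ctx: (B, dim), word: (B,)
        mu, logvar = self.mu(ctx), self.logvar(ctx)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        loss = F.cross_entropy(self.decoder(z), word) + kl
        if label is not None:                       # inject discriminative signal
            loss = loss + F.cross_entropy(self.labeler(z), label)
        return loss
```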

Towards Robust Named Entity Recognition for Historic German

Title Towards Robust Named Entity Recognition for Historic German
Authors Stefan Schweter, Johannes Baiter
Abstract Recent advances in language modeling using deep neural networks have shown that these models learn representations that vary with network depth, from morphology up to semantic relationships such as co-reference. We apply pre-trained language models to low-resource named entity recognition for Historic German. We show in a series of experiments that character-based pre-trained language models do not run into trouble when faced with low-resource datasets. Our pre-trained character-based language models improve upon classical CRF-based methods and previous work on Bi-LSTMs, boosting F1 score by up to 6%. Our pre-trained language and NER models are publicly available at https://github.com/stefan-it/historic-ner .
Tasks Language Modelling, Named Entity Recognition
Published 2019-06-18
URL https://arxiv.org/abs/1906.07592v1
PDF https://arxiv.org/pdf/1906.07592v1.pdf
PWC https://paperswithcode.com/paper/towards-robust-named-entity-recognition-for
Repo https://github.com/stefan-it/historic-ner
Framework none
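
The released models build on the flair library; a usage sketch is below. The tagger name here is flair's stock German NER model, not the authors' historic-German models (those are linked from the repo), and the API shown is the 2019-era flair interface.

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Stock German NER tagger as a placeholder for the historic-German models.
tagger = SequenceTagger.load("de-ner")
sentence = Sentence("Wien, 1. Jänner 1908.")
tagger.predict(sentence)
print(sentence.to_tagged_string())
```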

Robust Evaluation of Language-Brain Encoding Experiments

Title Robust Evaluation of Language-Brain Encoding Experiments
Authors Lisa Beinborn, Samira Abnar, Rochelle Choenni
Abstract Language-brain encoding experiments evaluate the ability of language models to predict brain responses elicited by language stimuli. The evaluation scenarios for this task have not yet been standardized which makes it difficult to compare and interpret results. We perform a series of evaluation experiments with a consistent encoding setup and compute the results for multiple fMRI datasets. In addition, we test the sensitivity of the evaluation measures to randomized data and analyze the effect of voxel selection methods. Our experimental framework is publicly available to make modelling decisions more transparent and support reproducibility for future comparisons.
Tasks
Published 2019-04-04
URL http://arxiv.org/abs/1904.02547v1
PDF http://arxiv.org/pdf/1904.02547v1.pdf
PWC https://paperswithcode.com/paper/robust-evaluation-of-language-brain-encoding
Repo https://github.com/beinborn/brain-lang
Framework none
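
One evaluation measure common in this line of work is pairwise (2v2) accuracy, sketched below: for each pair of stimuli, the predicted scans should be closer to their matching observed scans than to the mismatched ones. Treating this as representative of the paper's measure set is an assumption; the paper compares several measures and voxel selection methods.

```python
import numpy as np
from scipy.spatial.distance import cosine

def pairwise_accuracy(pred, true):
    """pred, true: (n_stimuli, n_voxels) arrays of predicted and
    observed responses. Chance level is 0.5."""
    n, correct, total = len(pred), 0, 0
    for i in range(n):
        for j in range(i + 1, n):
            match = cosine(pred[i], true[i]) + cosine(pred[j], true[j])
            mismatch = cosine(pred[i], true[j]) + cosine(pred[j], true[i])
            correct += int(match < mismatch)
            total += 1
    return correct / total
```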

Quantifying the Carbon Emissions of Machine Learning

Title Quantifying the Carbon Emissions of Machine Learning
Authors Alexandre Lacoste, Alexandra Luccioni, Victor Schmidt, Thomas Dandres
Abstract From an environmental standpoint, there are a few crucial aspects of training a neural network that have a major impact on the quantity of carbon that it emits. These factors include: the location of the server used for training and the energy grid that it uses, the length of the training procedure, and even the make and model of hardware on which the training takes place. In order to approximate these emissions, we present our Machine Learning Emissions Calculator, a tool for our community to better understand the environmental impact of training ML models. We accompany this tool with an explanation of the factors cited above, as well as concrete actions that individual practitioners and organizations can take to mitigate their carbon emissions.
Tasks
Published 2019-10-21
URL https://arxiv.org/abs/1910.09700v2
PDF https://arxiv.org/pdf/1910.09700v2.pdf
PWC https://paperswithcode.com/paper/quantifying-the-carbon-emissions-of-machine
Repo https://github.com/mlco2/impact
Framework none
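
The calculation behind such a calculator is back-of-the-envelope arithmetic; a sketch with illustrative constants follows (the tool itself draws on a hardware database and regional emission factors rather than these defaults).

```python
def co2_emissions_kg(gpu_power_w, hours, grid_g_per_kwh, pue=1.58):
    """Estimate training emissions in kg CO2eq: hardware energy draw,
    scaled by datacenter overhead (PUE), times grid carbon intensity.
    The default PUE is an illustrative industry-average figure."""
    energy_kwh = gpu_power_w / 1000 * hours * pue
    return energy_kwh * grid_g_per_kwh / 1000

# e.g. a 250 W GPU for 100 hours on a 500 gCO2/kWh grid:
# co2_emissions_kg(250, 100, 500)   # ~19.75 kg CO2eq
```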