Paper Group AWR 77
Greedy Optimized Multileaving for Personalization
Title | Greedy Optimized Multileaving for Personalization |
Authors | Kojiro Iizuka, Takeshi Yoneda, Yoshifumi Seki |
Abstract | Personalization plays an important role in many services. To evaluate personalized rankings, online evaluation, such as A/B testing, is widely used today. Recently, multileaving has been found to be an efficient method for evaluating rankings in information retrieval fields. This paper describes the first attempt to optimize the multileaving method for personalization settings. We clarify the challenges of applying this method to personalized rankings. Then, to solve these challenges, we propose greedy optimized multileaving (GOM) with a new credit feedback function. The empirical results showed that GOM remained stable as the ranking length and the number of rankers increased. We implemented GOM in our production news recommender systems and compared its online performance. The results showed that GOM evaluated the personalized rankings precisely, with significantly smaller sample sizes (< 1/10) than A/B testing. |
Tasks | Information Retrieval, Recommendation Systems |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08346v1 |
PDF | https://arxiv.org/pdf/1907.08346v1.pdf |
PWC | https://paperswithcode.com/paper/greedy-optimized-multileaving-for |
Repo | https://github.com/mathetake/intergo |
Framework | none |
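GOM itself constructs the combined list greedily so that the output stays unbiased and sensitive as ranking length and the number of rankers grow, and its credit function is specific to personalization. As a rough illustration of the general multileave-then-credit loop, here is a sketch of the simpler team-draft variant; all names and the toy data are ours, not the authors' code:

```python
import random

def team_draft_multileave(rankings, length):
    """Combine several rankers' lists into one interleaved list.

    Rankers take turns in a random order each round, each contributing
    its highest-ranked item not yet placed. Returns the combined list
    and a map item -> index of the ranker that contributed it.
    """
    combined, credit_of = [], {}
    while len(combined) < length:
        placed_this_round = False
        order = list(range(len(rankings)))
        random.shuffle(order)
        for r in order:
            for item in rankings[r]:
                if item not in credit_of:
                    combined.append(item)
                    credit_of[item] = r
                    placed_this_round = True
                    break
            if len(combined) >= length:
                break
        if not placed_this_round:
            break  # every candidate item has been placed
    return combined, credit_of

def credit_from_clicks(clicks, credit_of, n_rankers):
    """One unit of credit to a ranker for each click on its item."""
    credit = [0.0] * n_rankers
    for item in clicks:
        if item in credit_of:
            credit[credit_of[item]] += 1.0
    return credit

# Toy example: two personalized rankers for the same user.
random.seed(0)
rankings = [["a", "b", "c", "d"], ["c", "a", "d", "b"]]
combined, credit_of = team_draft_multileave(rankings, length=4)
print(combined, credit_from_clicks(["c"], credit_of, n_rankers=2))
```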
Verification of Neural Networks: Specifying Global Robustness using Generative Models
Title | Verification of Neural Networks: Specifying Global Robustness using Generative Models |
Authors | Nathanaël Fijalkow, Mohit Kumar Gupta |
Abstract | The success of neural networks across most machine learning tasks and the persistence of adversarial examples have made the verification of such models an important quest. Several techniques have been successfully developed to verify robustness, and are now able to evaluate neural networks with thousands of nodes. The main weakness of this approach is in the specification: robustness is asserted on a validation set consisting of a finite set of examples, i.e. locally. We propose a notion of global robustness based on generative models, which asserts the robustness on a very large and representative set of examples. We show how this can be used for verifying neural networks. In this paper we experimentally explore the merits of this approach, and show how it can be used to construct realistic adversarial examples. |
Tasks | |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05018v1 |
PDF | https://arxiv.org/pdf/1910.05018v1.pdf |
PWC | https://paperswithcode.com/paper/verification-of-neural-networks-specifying |
Repo | https://github.com/mohitiitb/NeuralNetworkVerification_GlobalRobustness |
Framework | tf |
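The paper phrases global robustness as a property quantified over the generator's latent space and checks it with verification techniques; a Monte Carlo estimate of the same quantity conveys the idea. In the sketch below, the "generator" and "classifier" are trivial stand-ins, not trained networks:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))     # weights of the toy "generator"

def generator(z):
    """Stand-in for a trained generative model: latent code -> input."""
    return np.tanh(z @ W)

def classifier(x):
    """Stand-in for the network under verification."""
    return int(x.sum() > 0)

def estimate_global_robustness(n_samples=1000, eps=0.05):
    """Fraction of generated inputs whose predicted label is stable
    under latent-space perturbations of norm eps: a sampled proxy for
    robustness over a large, representative set of realistic inputs."""
    stable = 0
    for _ in range(n_samples):
        z = rng.normal(size=8)
        delta = rng.normal(size=8)
        delta *= eps / np.linalg.norm(delta)
        if classifier(generator(z)) == classifier(generator(z + delta)):
            stable += 1
    return stable / n_samples

print(estimate_global_robustness())
```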
Residual Reactive Navigation: Combining Classical and Learned Navigation Strategies For Deployment in Unknown Environments
Title | Residual Reactive Navigation: Combining Classical and Learned Navigation Strategies For Deployment in Unknown Environments |
Authors | Krishan Rana, Ben Talbot, Vibhavari Dasagi, Michael Milford, Niko Sünderhauf |
Abstract | In this work we focus on improving the efficiency and generalisation of learned navigation strategies when transferred from their training environment to previously unseen ones. We present an extension of the residual reinforcement learning framework from the robotic manipulation literature and adapt it to the vast and unstructured environments that mobile robots can operate in. The concept is based on learning a residual control effect to add to a typical sub-optimal classical controller in order to close the performance gap, whilst guiding the exploration process during training for improved data efficiency. We exploit this tight coupling and propose a novel deployment strategy, switching Residual Reactive Navigation (sRRN), which yields efficient trajectories whilst probabilistically switching to a classical controller in cases of high policy uncertainty. Our approach achieves improved performance over end-to-end alternatives and can be incorporated as part of a complete navigation stack for cluttered indoor navigation tasks in the real world. The code and training environment for this project are made publicly available at https://sites.google.com/view/srrn/home. |
Tasks | |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.10972v2 |
PDF | https://arxiv.org/pdf/1909.10972v2.pdf |
PWC | https://paperswithcode.com/paper/residual-reactive-navigation-combining |
Repo | https://github.com/krishanrana/2D_SRRN |
Framework | none |
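The deployment idea is compact enough to sketch: add a learned residual to the classical controller's command, and hand control back to the classical controller when the policy is uncertain. The uncertainty estimate and scaling below are illustrative stand-ins for the paper's policy-uncertainty machinery:

```python
import numpy as np

rng = np.random.default_rng(0)

def classical_controller(obs):
    """Stand-in for a reactive prior, e.g. a potential-field controller."""
    return -0.5 * obs[:2]

def residual_policy(obs):
    """Stand-in for a stochastic learned policy: residual action plus an
    uncertainty estimate (e.g. the predictive standard deviation)."""
    residual = 0.1 * rng.normal(size=2)
    uncertainty = float(np.abs(residual).mean())
    return residual, uncertainty

def srrn_action(obs, uncertainty_scale=0.1):
    """Residual deployment with probabilistic fallback: the higher the
    policy uncertainty, the more likely we ignore the residual and
    execute the classical command alone."""
    base = classical_controller(obs)
    residual, uncertainty = residual_policy(obs)
    p_fallback = min(1.0, uncertainty / uncertainty_scale)
    if rng.random() < p_fallback:
        return base              # high uncertainty: trust the prior
    return base + residual       # low uncertainty: apply the residual

print(srrn_action(np.array([1.0, -2.0, 0.3])))
```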
Zero-Shot Open Entity Typing as Type-Compatible Grounding
Title | Zero-Shot Open Entity Typing as Type-Compatible Grounding |
Authors | Ben Zhou, Daniel Khashabi, Chen-Tse Tsai, Dan Roth |
Abstract | The problem of entity-typing has been studied predominantly in supervised learning fashion, mostly with task-specific annotations (for coarse types) and sometimes with distant supervision (for fine types). While such approaches have strong performance within datasets, they often lack the flexibility to transfer across text genres and to generalize to new type taxonomies. In this work we propose a zero-shot entity typing approach that requires no annotated data and can flexibly identify newly defined types. Given a type taxonomy defined as Boolean functions of FREEBASE “types”, we ground a given mention to a set of type-compatible Wikipedia entries and then infer the target mention’s types using an inference algorithm that makes use of the types of these entries. We evaluate our system on a broad range of datasets, including standard fine-grained and coarse-grained entity typing datasets, and also a dataset in the biological domain. Our system is shown to be competitive with state-of-the-art supervised NER systems and outperforms them on out-of-domain datasets. We also show that our system significantly outperforms other zero-shot fine typing systems. |
Tasks | Entity Typing |
Published | 2019-07-07 |
URL | https://arxiv.org/abs/1907.03228v1 |
PDF | https://arxiv.org/pdf/1907.03228v1.pdf |
PWC | https://paperswithcode.com/paper/zero-shot-open-entity-typing-as-type-1 |
Repo | https://github.com/CogComp/zoe |
Framework | tf |
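The grounding-then-inference step can be illustrated with toy data: each candidate Wikipedia entry carries a set of FREEBASE-style types, target types are Boolean predicates over those sets (matching the paper's taxonomy definition), and a mention inherits the types that enough of its type-compatible candidates satisfy. All entries, types, and thresholds below are illustrative:

```python
# Toy grounded entries: each Wikipedia entry carries FREEBASE-style types.
ENTRY_TYPES = {
    "Chicago_Bulls":  {"/organization/organization", "/sports/sports_team"},
    "Chicago_(band)": {"/organization/organization", "/music/artist"},
    "Chicago":        {"/location/location", "/location/citytown"},
}

def infer_types(candidates, taxonomy, min_support=0.5):
    """Infer a mention's types from its type-compatible grounded entries.

    `taxonomy` maps each target type to a Boolean function over a
    FREEBASE type set; a target type fires when at least `min_support`
    of the candidate entries satisfy it.
    """
    inferred = set()
    for target, predicate in taxonomy.items():
        support = sum(predicate(ENTRY_TYPES[e]) for e in candidates)
        if support / len(candidates) >= min_support:
            inferred.add(target)
    return inferred

taxonomy = {
    "LOC": lambda types: "/location/location" in types,
    "ORG": lambda types: "/organization/organization" in types,
}
# A mention "Chicago" in a sports context grounds mostly to the team:
print(infer_types(["Chicago_Bulls", "Chicago_(band)"], taxonomy))  # {'ORG'}
```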
The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction
Title | The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction |
Authors | Martin Gauch, Juliane Mai, Jimmy Lin |
Abstract | Accurate streamflow prediction largely relies on historical records of both meteorological data and streamflow measurements. For many regions around the world, however, such data are only scarcely or not at all available. To select an appropriate model for a region with a given amount of historical data, it is therefore indispensable to know a model’s sensitivity to limited training data, both in terms of geographic diversity and different spans of time. In this study, we provide decision support for tree- and LSTM-based models. We feed the models meteorological measurements from the CAMELS dataset, and individually restrict the training period length and the number of basins used in training. Our findings show that tree-based models provide more accurate predictions on small datasets, while LSTMs are superior given sufficient training data. This is perhaps not surprising, as neural networks are known to be data-hungry; however, we are able to characterize each model’s strengths under different conditions, including the “breakeven point” when LSTMs begin to overtake tree-based models. |
Tasks | |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.07249v1 |
PDF | https://arxiv.org/pdf/1911.07249v1.pdf |
PWC | https://paperswithcode.com/paper/the-proper-care-and-feeding-of-camels-how |
Repo | https://github.com/gauchm/ealstm_regional_modeling |
Framework | pytorch |
Few-Shot Representation Learning for Out-Of-Vocabulary Words
Title | Few-Shot Representation Learning for Out-Of-Vocabulary Words |
Authors | Ziniu Hu, Ting Chen, Kai-Wei Chang, Yizhou Sun |
Abstract | Existing approaches for learning word embeddings often assume there are sufficient occurrences for each word in the corpus, such that the representation of words can be accurately estimated from their contexts. However, in real-world scenarios, out-of-vocabulary (a.k.a. OOV) words that do not appear in the training corpus emerge frequently. It is challenging to learn accurate representations of these words with only a few observations. In this paper, we formulate the learning of OOV embeddings as a few-shot regression problem, and address it by training a representation function to predict the oracle embedding vector (defined as the embedding trained with abundant observations) based on limited observations. Specifically, we propose a novel hierarchical attention-based architecture to serve as the neural regression function, with which the context information of a word is encoded and aggregated from K observations. Furthermore, our approach can leverage Model-Agnostic Meta-Learning (MAML) to adapt the learned model to a new corpus quickly and robustly. Experiments show that the proposed approach significantly outperforms existing methods in constructing accurate embeddings for OOV words, and improves downstream tasks where these embeddings are utilized. |
Tasks | few-shot regression, Learning Word Embeddings, Meta-Learning, Representation Learning, Word Embeddings |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00505v1 |
PDF | https://arxiv.org/pdf/1907.00505v1.pdf |
PWC | https://paperswithcode.com/paper/few-shot-representation-learning-for-out-of |
Repo | https://github.com/acbull/HiCE |
Framework | pytorch |
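At its core, the regression function encodes the K observed contexts of the OOV word and pools them with attention into one predicted embedding. A single-level attention pooling step in NumPy looks like this; the paper's encoder is hierarchical and its projections are learned, so the random vectors here are stand-ins:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_contexts(context_vecs, query):
    """Attention-weighted pooling of K context representations into one
    predicted embedding (a stand-in for the paper's hierarchical
    attention encoder; a trained model learns these projections)."""
    scores = context_vecs @ query          # (K,) attention logits
    weights = softmax(scores)
    return weights @ context_vecs          # convex combination, (d,)

rng = np.random.default_rng(0)
K, d = 4, 8                                # 4 observed contexts, dim 8
contexts = rng.normal(size=(K, d))         # encoded contexts of the OOV word
query = rng.normal(size=d)                 # learned query vector (stand-in)
oov_embedding = aggregate_contexts(contexts, query)
print(oov_embedding.shape)                 # (8,)
```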
Abstract Reasoning with Distracting Features
Title | Abstract Reasoning with Distracting Features |
Authors | Kecheng Zheng, Zheng-jun Zha, Wei Wei |
Abstract | Abstract reasoning is a long-standing challenge in artificial intelligence. Recent studies suggest that many of the deep architectures that have triumphed in other domains fail to work well in abstract reasoning. In this paper, we first illustrate that one of the main challenges in such a reasoning task is the presence of distracting features, which requires the learning algorithm to leverage counterevidence and to reject any of the false hypotheses in order to learn the true patterns. We then show that a carefully designed learning trajectory over different categories of training data can effectively boost learning performance by mitigating the impact of distracting features. Inspired by this fact, we propose the feature robust abstract reasoning (FRAR) model, which consists of a reinforcement learning based teacher network that determines the sequence of training data and a student network that makes the predictions. Experimental results demonstrate strong improvements over baseline algorithms; we beat the state-of-the-art models by 18.7% on the RAVEN dataset and 13.3% on the PGM dataset. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00569v1 |
PDF | https://arxiv.org/pdf/1912.00569v1.pdf |
PWC | https://paperswithcode.com/paper/abstract-reasoning-with-distracting-features-1 |
Repo | https://github.com/zkcys001/distracting_feature |
Framework | pytorch |
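The teacher in FRAR is a reinforcement-learning network; a minimal stand-in that conveys the curriculum idea is an epsilon-greedy bandit that keeps choosing the data category whose recent batches most improved the student. Everything below, including the category names and gains, is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def bandit_teacher(categories, student_gain, rounds=100, eps=0.1):
    """Epsilon-greedy stand-in for the paper's RL teacher: pick the data
    category whose batches most improved the student so far.

    student_gain(category) -> observed improvement from training the
    student on one batch of that category (supplied by the caller).
    """
    estimates = {c: 0.0 for c in categories}
    counts = {c: 0 for c in categories}
    schedule = []
    for _ in range(rounds):
        if rng.random() < eps:
            c = categories[rng.integers(len(categories))]
        else:
            c = max(categories, key=lambda k: estimates[k])
        gain = student_gain(c)
        counts[c] += 1
        estimates[c] += (gain - estimates[c]) / counts[c]  # running mean
        schedule.append(c)
    return schedule

# Toy gains: batches with few distractors currently help the student most.
gains = {"few_distractors": 0.3, "many_distractors": 0.1}
schedule = bandit_teacher(list(gains), lambda c: gains[c] + 0.05 * rng.normal())
print(schedule[:10])
```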
Bayesian Batch Active Learning as Sparse Subset Approximation
Title | Bayesian Batch Active Learning as Sparse Subset Approximation |
Authors | Robert Pinsler, Jonathan Gordon, Eric Nalisnick, José Miguel Hernández-Lobato |
Abstract | Leveraging the wealth of unlabeled data produced in recent years provides great potential for improving supervised models. When the cost of acquiring labels is high, probabilistic active learning methods can be used to greedily select the most informative data points to be labeled. However, for many large-scale problems standard greedy procedures become computationally infeasible and suffer from negligible model change. In this paper, we introduce a novel Bayesian batch active learning approach that mitigates these issues. Our approach is motivated by approximating the complete data posterior of the model parameters. While naive batch construction methods result in correlated queries, our algorithm produces diverse batches that enable efficient active learning at scale. We derive interpretable closed-form solutions akin to existing active learning procedures for linear models, and generalize to arbitrary models using random projections. We demonstrate the benefits of our approach on several large-scale regression and classification tasks. |
Tasks | Active Learning |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.02144v3 |
PDF | https://arxiv.org/pdf/1908.02144v3.pdf |
PWC | https://paperswithcode.com/paper/bayesian-batch-active-learning-as-sparse |
Repo | https://github.com/rpinsler/active-bayesian-coresets |
Framework | pytorch |
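The construction can be pictured as a sparse approximation problem: choose a small batch whose summed contributions approximate the pool's total. The greedy sketch below uses plain embeddings where the paper uses (randomly projected) posterior terms, and a residual-matching rule where the paper derives a Frank-Wolfe procedure; it is an analogy, not the authors' algorithm:

```python
import numpy as np

def greedy_batch(pool_embeddings, batch_size):
    """Greedy sparse approximation of the pool: repeatedly add the point
    whose embedding best matches the still-unexplained part of the
    pool's summed embedding, yielding a diverse rather than redundant
    batch."""
    target = pool_embeddings.sum(axis=0)   # "complete data" direction
    residual = target.copy()
    chosen = []
    for _ in range(batch_size):
        scores = pool_embeddings @ residual
        scores[chosen] = -np.inf           # select without replacement
        i = int(np.argmax(scores))
        chosen.append(i)
        residual = residual - pool_embeddings[i]
    return chosen

rng = np.random.default_rng(0)
pool = rng.normal(size=(200, 32))          # e.g. random projections, J=32
print(greedy_batch(pool, batch_size=5))
```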
BPE-Dropout: Simple and Effective Subword Regularization
Title | BPE-Dropout: Simple and Effective Subword Regularization |
Authors | Ivan Provilkov, Dmitrii Emelianenko, Elena Voita |
Abstract | Subword segmentation is widely used to address the open vocabulary problem in machine translation. The dominant approach to subword segmentation is Byte Pair Encoding (BPE), which keeps the most frequent words intact while splitting the rare ones into multiple tokens. While multiple segmentations are possible even with the same vocabulary, BPE splits words into unique sequences; this may prevent a model from better learning the compositionality of words and being robust to segmentation errors. So far, the only way to overcome this BPE imperfection, its deterministic nature, was to create another subword segmentation algorithm (Kudo, 2018). In contrast, we show that BPE itself incorporates the ability to produce multiple segmentations of the same word. We introduce BPE-dropout, a simple and effective subword regularization method based on and compatible with conventional BPE. It stochastically corrupts the segmentation procedure of BPE, which leads to producing multiple segmentations within the same fixed BPE framework. Using BPE-dropout during training and the standard BPE during inference improves translation quality up to 3 BLEU compared to BPE and up to 0.9 BLEU compared to the previous subword regularization. |
Tasks | Machine Translation |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13267v1 |
PDF | https://arxiv.org/pdf/1910.13267v1.pdf |
PWC | https://paperswithcode.com/paper/191013267 |
Repo | https://github.com/kh-mo/QA_wikisql |
Framework | none |
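The mechanism is small enough to show directly: run ordinary BPE, but at every merge step let each applicable merge be dropped with probability p. A self-contained sketch with a toy merge table follows; the paper's implementation differs in details such as how steps where all merges are dropped are handled:

```python
import random

def bpe_segment(word, merges, dropout=0.0, rng=random):
    """Segment a word with BPE, randomly dropping merges.

    merges: list of symbol pairs in priority order, e.g. [("u", "n"), ...].
    With dropout=0 this is standard deterministic BPE; with dropout>0
    each applicable merge is skipped with probability `dropout`,
    yielding multiple segmentations of the same word (BPE-dropout).
    """
    symbols = list(word)
    rank = {pair: i for i, pair in enumerate(merges)}
    while True:
        # Highest-priority adjacent pair that survives dropout this step.
        candidates = [
            (rank[(a, b)], i)
            for i, (a, b) in enumerate(zip(symbols, symbols[1:]))
            if (a, b) in rank and rng.random() >= dropout
        ]
        if not candidates:
            return symbols
        _, i = min(candidates)
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]

# Toy merge table (a real one is learned from corpus frequencies).
merges = [("u", "n"), ("a", "t"), ("e", "d"), ("at", "ed")]
random.seed(0)
print(bpe_segment("united", merges, dropout=0.0))  # deterministic split
print(bpe_segment("united", merges, dropout=0.5))  # stochastic split
```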
Knowledge-Embedded Routing Network for Scene Graph Generation
Title | Knowledge-Embedded Routing Network for Scene Graph Generation |
Authors | Tianshui Chen, Weihao Yu, Riquan Chen, Liang Lin |
Abstract | Understanding a scene in depth involves not only locating and recognizing individual objects, but also inferring the relationships and interactions among them. However, since the distribution of real-world relationships is seriously unbalanced, existing methods perform quite poorly for the less frequent relationships. In this work, we find that the statistical correlations between object pairs and their relationships can effectively regularize the semantic space and make prediction less ambiguous, thus addressing the unbalanced distribution issue. To achieve this, we incorporate these statistical correlations into deep neural networks to facilitate scene graph generation by developing a Knowledge-Embedded Routing Network. More specifically, we show that the statistical correlations between objects appearing in images and their relationships can be explicitly represented by a structured knowledge graph, and a routing mechanism is learned to propagate messages through the graph to explore their interactions. Extensive experiments on the large-scale Visual Genome dataset demonstrate the superiority of the proposed method over current state-of-the-art competitors. |
Tasks | Graph Generation, Scene Graph Generation |
Published | 2019-03-08 |
URL | http://arxiv.org/abs/1903.03326v1 |
PDF | http://arxiv.org/pdf/1903.03326v1.pdf |
PWC | https://paperswithcode.com/paper/knowledge-embedded-routing-network-for-scene |
Repo | https://github.com/HCPLab-SYSU/KERN |
Framework | pytorch |
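The statistical regularity that KERN exploits is easy to make concrete: given training triplets, the empirical distribution of predicates for each object pair is highly skewed, and the paper encodes exactly these correlations into a knowledge graph through which messages are routed. The sketch below computes only the statistic itself, on toy triplets:

```python
from collections import Counter, defaultdict

# Toy annotated triplets (subject, predicate, object); in the paper the
# correlations are read off the Visual Genome training set.
triplets = [
    ("man", "riding", "horse"), ("man", "on", "horse"),
    ("man", "riding", "horse"), ("cat", "on", "mat"),
]

counts = defaultdict(Counter)
for s, p, o in triplets:
    counts[(s, o)][p] += 1

def relationship_prior(subject, obj):
    """Empirical distribution over predicates for an object pair: the
    statistical correlation that KERN embeds into its routing graph."""
    c = counts[(subject, obj)]
    total = sum(c.values())
    return {p: n / total for p, n in c.items()} if total else {}

print(relationship_prior("man", "horse"))  # riding ~0.67, on ~0.33
```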
Good Secretaries, Bad Truck Drivers? Occupational Gender Stereotypes in Sentiment Analysis
Title | Good Secretaries, Bad Truck Drivers? Occupational Gender Stereotypes in Sentiment Analysis |
Authors | Jayadev Bhaskaran, Isha Bhallamudi |
Abstract | In this work, we investigate the presence of occupational gender stereotypes in sentiment analysis models. Such a task has implications for reducing implicit biases in these models, which are being applied to an increasingly wide variety of downstream tasks. We release a new gender-balanced dataset of 800 sentences pertaining to specific professions and propose a methodology for using it as a test bench to evaluate sentiment analysis models. We evaluate the presence of occupational gender stereotypes in 3 different models using our approach, and explore their relationship with societal perceptions of occupations. |
Tasks | Sentiment Analysis |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.10256v2 |
PDF | https://arxiv.org/pdf/1906.10256v2.pdf |
PWC | https://paperswithcode.com/paper/good-secretaries-bad-truck-drivers |
Repo | https://github.com/jayadevbhaskaran/gendered-sentiment |
Framework | none |
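The test-bench idea is to score minimally contrasting sentence pairs, identical except for the gendered word, for each profession, and inspect the sentiment difference. The released dataset contains 800 curated sentences; the single template and toy scorer below are only illustrative:

```python
# Sketch of the test bench: `sentiment` stands in for any model that
# maps a sentence to a score in [0, 1].
PROFESSIONS = ["secretary", "truck driver", "nurse", "engineer"]
TEMPLATE = "{person} is a {profession}."

def gender_gap(sentiment, professions=PROFESSIONS):
    """Sentiment difference (she - he) per profession; values far from
    zero indicate an occupational gender stereotype."""
    gaps = {}
    for prof in professions:
        s_f = sentiment(TEMPLATE.format(person="She", profession=prof))
        s_m = sentiment(TEMPLATE.format(person="He", profession=prof))
        gaps[prof] = s_f - s_m
    return gaps

# Toy scorer standing in for a real sentiment model:
toy = lambda text: 0.6 if "nurse" in text and "She" in text else 0.5
print(gender_gap(toy))  # nonzero gap only for "nurse"
```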
Variational Sequential Labelers for Semi-Supervised Learning
Title | Variational Sequential Labelers for Semi-Supervised Learning |
Authors | Mingda Chen, Qingming Tang, Karen Livescu, Kevin Gimpel |
Abstract | We introduce a family of multitask variational methods for semi-supervised sequence labeling. Our model family consists of a latent-variable generative model and a discriminative labeler. The generative models use latent variables to define the conditional probability of a word given its context, drawing inspiration from word prediction objectives commonly used in learning word embeddings. The labeler helps inject discriminative information into the latent space. We explore several latent variable configurations, including ones with hierarchical structure, which enables the model to account for both label-specific and word-specific information. Our models consistently outperform standard sequential baselines on 8 sequence labeling datasets, and improve further with unlabeled data. |
Tasks | Learning Word Embeddings, Word Embeddings |
Published | 2019-06-23 |
URL | https://arxiv.org/abs/1906.09535v1 |
PDF | https://arxiv.org/pdf/1906.09535v1.pdf |
PWC | https://paperswithcode.com/paper/variational-sequential-labelers-for-semi-1 |
Repo | https://github.com/mingdachen/vsl |
Framework | pytorch |
Towards Robust Named Entity Recognition for Historic German
Title | Towards Robust Named Entity Recognition for Historic German |
Authors | Stefan Schweter, Johannes Baiter |
Abstract | Recent advances in language modeling using deep neural networks have shown that these models learn representations that vary with network depth, from morphology up to semantic relationships such as co-reference. We apply pre-trained language models to low-resource named entity recognition for Historic German. We show in a series of experiments that character-based pre-trained language models do not run into trouble when faced with low-resource datasets. Our pre-trained character-based language models improve upon classical CRF-based methods and previous work on Bi-LSTMs by boosting F1 score performance by up to 6%. Our pre-trained language and NER models are publicly available at https://github.com/stefan-it/historic-ner . |
Tasks | Language Modelling, Named Entity Recognition |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07592v1 |
PDF | https://arxiv.org/pdf/1906.07592v1.pdf |
PWC | https://paperswithcode.com/paper/towards-robust-named-entity-recognition-for |
Repo | https://github.com/stefan-it/historic-ner |
Framework | none |
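The released models build on the flair library, so usage follows flair's standard tagging pattern. A minimal sketch, assuming a flair install; the model name below is flair's generic German NER tagger used as a placeholder, not the paper's historic checkpoints (see the repo above for those):

```python
# Minimal flair tagging sketch; "de-ner" is flair's generic German NER
# model, standing in for the historic German checkpoints.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("de-ner")
sentence = Sentence("Theodor Fontane wurde in Neuruppin geboren .")
tagger.predict(sentence)
print(sentence.to_tagged_string())
```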
Robust Evaluation of Language-Brain Encoding Experiments
Title | Robust Evaluation of Language-Brain Encoding Experiments |
Authors | Lisa Beinborn, Samira Abnar, Rochelle Choenni |
Abstract | Language-brain encoding experiments evaluate the ability of language models to predict brain responses elicited by language stimuli. The evaluation scenarios for this task have not yet been standardized which makes it difficult to compare and interpret results. We perform a series of evaluation experiments with a consistent encoding setup and compute the results for multiple fMRI datasets. In addition, we test the sensitivity of the evaluation measures to randomized data and analyze the effect of voxel selection methods. Our experimental framework is publicly available to make modelling decisions more transparent and support reproducibility for future comparisons. |
Tasks | |
Published | 2019-04-04 |
URL | http://arxiv.org/abs/1904.02547v1 |
PDF | http://arxiv.org/pdf/1904.02547v1.pdf |
PWC | https://paperswithcode.com/paper/robust-evaluation-of-language-brain-encoding |
Repo | https://github.com/beinborn/brain-lang |
Framework | none |
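One measure commonly used in this literature, and among those whose sensitivity the paper probes, is pairwise matching accuracy: for every pair of stimuli, the correct assignment of predicted to observed brain responses should correlate better than the swapped one. A NumPy sketch with synthetic data:

```python
import numpy as np

def pairwise_accuracy(pred, true):
    """Pairwise matching accuracy over all stimulus pairs: does the
    correct pairing of predictions and brain responses correlate
    better than the swapped pairing? Chance level is 0.5."""
    def corr(a, b):
        return np.corrcoef(a, b)[0, 1]
    n, hits, total = len(pred), 0, 0
    for i in range(n):
        for j in range(i + 1, n):
            correct = corr(pred[i], true[i]) + corr(pred[j], true[j])
            swapped = corr(pred[i], true[j]) + corr(pred[j], true[i])
            hits += correct > swapped
            total += 1
    return hits / total

rng = np.random.default_rng(0)
true = rng.normal(size=(20, 100))                     # 20 stimuli x 100 voxels
pred = true + rng.normal(scale=2.0, size=true.shape)  # noisy predictions
print(pairwise_accuracy(pred, true))                  # well above 0.5 chance
```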
Quantifying the Carbon Emissions of Machine Learning
Title | Quantifying the Carbon Emissions of Machine Learning |
Authors | Alexandre Lacoste, Alexandra Luccioni, Victor Schmidt, Thomas Dandres |
Abstract | From an environmental standpoint, there are a few crucial aspects of training a neural network that have a major impact on the quantity of carbon that it emits. These factors include: the location of the server used for training and the energy grid that it uses, the length of the training procedure, and even the make and model of hardware on which the training takes place. In order to approximate these emissions, we present our Machine Learning Emissions Calculator, a tool for our community to better understand the environmental impact of training ML models. We accompany this tool with an explanation of the factors cited above, as well as concrete actions that individual practitioners and organizations can take to mitigate their carbon emissions. |
Tasks | |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09700v2 |
PDF | https://arxiv.org/pdf/1910.09700v2.pdf |
PWC | https://paperswithcode.com/paper/quantifying-the-carbon-emissions-of-machine |
Repo | https://github.com/mlco2/impact |
Framework | none |
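The calculator's inputs reduce to a short formula: energy drawn (hardware power × training time), inflated by datacenter overhead (PUE), times the grid's carbon intensity. A back-of-the-envelope version, with an illustrative PUE default and example numbers rather than the calculator's internal figures:

```python
def training_emissions_kg(power_draw_watts, hours,
                          carbon_intensity_g_per_kwh, pue=1.58):
    """Back-of-the-envelope CO2eq of a training run, in kilograms:
    energy (kWh) x datacenter overhead (PUE) x grid carbon intensity.
    The PUE default and the example below are illustrative."""
    energy_kwh = power_draw_watts / 1000.0 * hours * pue
    return energy_kwh * carbon_intensity_g_per_kwh / 1000.0

# Example: a 300 W GPU running for 100 h on a 400 gCO2eq/kWh grid.
print(round(training_emissions_kg(300, 100, 400), 1), "kg CO2eq")
```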