Paper Group AWR 77
Greedy Optimized Multileaving for Personalization
Title | Greedy Optimized Multileaving for Personalization |
Authors | Kojiro Iizuka, Takeshi Yoneda, Yoshifumi Seki |
Abstract | Personalization plays an important role in many services. To evaluate personalized rankings, online evaluation, such as A/B testing, is widely used today. Recently, multileaving has been found to be an efficient method for evaluating rankings in information retrieval fields. This paper describes the first attempt to optimize the multileaving method for personalization settings. We clarify the challenges of applying this method to personalized rankings. Then, to solve these challenges, we propose greedy optimized multileaving (GOM) with a new credit feedback function. The empirical results showed that GOM remained stable as the ranking length and the number of rankers increased. We implemented GOM in our production news recommender systems and compared its online performance. The results showed that GOM evaluated the personalized rankings precisely, with significantly smaller sample sizes (< 1/10) than A/B testing. |
Tasks | Information Retrieval, Recommendation Systems |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08346v1 |
PDF | https://arxiv.org/pdf/1907.08346v1.pdf |
PWC | https://paperswithcode.com/paper/greedy-optimized-multileaving-for |
Repo | https://github.com/mathetake/intergo |
Framework | none |
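GOM itself constructs the combined list greedily so that the output stays unbiased and sensitive as ranking length and the number of rankers grow, and its credit function is specific to personalization. As a rough illustration of the general multileave-then-credit loop, here is a sketch of the simpler team-draft variant; all names and the toy data are ours, not the authors' code:

```python
import random

def team_draft_multileave(rankings, length):
    """Combine several rankers' lists into one interleaved list.

    Rankers take turns in a random order each round, each contributing
    its highest-ranked item not yet placed. Returns the combined list
    and a map item -> index of the ranker that contributed it.
    """
    combined, credit_of = [], {}
    while len(combined) < length:
        placed_this_round = False
        order = list(range(len(rankings)))
        random.shuffle(order)
        for r in order:
            for item in rankings[r]:
                if item not in credit_of:
                    combined.append(item)
                    credit_of[item] = r
                    placed_this_round = True
                    break
            if len(combined) >= length:
                break
        if not placed_this_round:
            break  # every candidate item has been placed
    return combined, credit_of

def credit_from_clicks(clicks, credit_of, n_rankers):
    """One unit of credit to a ranker for each click on its item."""
    credit = [0.0] * n_rankers
    for item in clicks:
        if item in credit_of:
            credit[credit_of[item]] += 1.0
    return credit

# Toy example: two personalized rankers for the same user.
random.seed(0)
rankings = [["a", "b", "c", "d"], ["c", "a", "d", "b"]]
combined, credit_of = team_draft_multileave(rankings, length=4)
print(combined, credit_from_clicks(["c"], credit_of, n_rankers=2))
```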
Verification of Neural Networks: Specifying Global Robustness using Generative Models
Title | Verification of Neural Networks: Specifying Global Robustness using Generative Models |
Authors | Nathanaël Fijalkow, Mohit Kumar Gupta |
Abstract | The success of neural networks across most machine learning tasks and the persistence of adversarial examples have made the verification of such models an important quest. Several techniques have been successfully developed to verify robustness, and are now able to evaluate neural networks with thousands of nodes. The main weakness of this approach is in the specification: robustness is asserted on a validation set consisting of a finite set of examples, i.e. locally. We propose a notion of global robustness based on generative models, which asserts the robustness on a very large and representative set of examples. We show how this can be used for verifying neural networks. In this paper we experimentally explore the merits of this approach, and show how it can be used to construct realistic adversarial examples. |
Tasks | |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05018v1 |
PDF | https://arxiv.org/pdf/1910.05018v1.pdf |
PWC | https://paperswithcode.com/paper/verification-of-neural-networks-specifying |
Repo | https://github.com/mohitiitb/NeuralNetworkVerification_GlobalRobustness |
Framework | tf |
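The paper phrases global robustness as a property quantified over the generator's latent space and checks it with verification techniques; a Monte Carlo estimate of the same quantity conveys the idea. In the sketch below, the "generator" and "classifier" are trivial stand-ins, not trained networks:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))     # weights of the toy "generator"

def generator(z):
    """Stand-in for a trained generative model: latent code -> input."""
    return np.tanh(z @ W)

def classifier(x):
    """Stand-in for the network under verification."""
    return int(x.sum() > 0)

def estimate_global_robustness(n_samples=1000, eps=0.05):
    """Fraction of generated inputs whose predicted label is stable
    under latent-space perturbations of norm eps: a sampled proxy for
    robustness over a large, representative set of realistic inputs."""
    stable = 0
    for _ in range(n_samples):
        z = rng.normal(size=8)
        delta = rng.normal(size=8)
        delta *= eps / np.linalg.norm(delta)
        if classifier(generator(z)) == classifier(generator(z + delta)):
            stable += 1
    return stable / n_samples

print(estimate_global_robustness())
```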
Residual Reactive Navigation: Combining Classical and Learned Navigation Strategies For Deployment in Unknown Environments
Title | Residual Reactive Navigation: Combining Classical and Learned Navigation Strategies For Deployment in Unknown Environments |
Authors | Krishan Rana, Ben Talbot, Vibhavari Dasagi, Michael Milford, Niko Sünderhauf |
Abstract | In this work we focus on improving the efficiency and generalisation of learned navigation strategies when transferred from their training environment to previously unseen ones. We present an extension of the residual reinforcement learning framework from the robotic manipulation literature and adapt it to the vast and unstructured environments that mobile robots can operate in. The concept is based on learning a residual control effect to add to a typical sub-optimal classical controller in order to close the performance gap, whilst guiding the exploration process during training for improved data efficiency. We exploit this tight coupling and propose a novel deployment strategy, switching Residual Reactive Navigation (sRRN), which yields efficient trajectories whilst probabilistically switching to a classical controller in cases of high policy uncertainty. Our approach achieves improved performance over end-to-end alternatives and can be incorporated as part of a complete navigation stack for cluttered indoor navigation tasks in the real world. The code and training environment for this project are made publicly available at https://sites.google.com/view/srrn/home. |
Tasks | |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.10972v2 |
PDF | https://arxiv.org/pdf/1909.10972v2.pdf |
PWC | https://paperswithcode.com/paper/residual-reactive-navigation-combining |
Repo | https://github.com/krishanrana/2D_SRRN |
Framework | none |
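The deployment idea is compact enough to sketch: add a learned residual to the classical controller's command, and hand control back to the classical controller when the policy is uncertain. The uncertainty estimate and scaling below are illustrative stand-ins for the paper's policy-uncertainty machinery:

```python
import numpy as np

rng = np.random.default_rng(0)

def classical_controller(obs):
    """Stand-in for a reactive prior, e.g. a potential-field controller."""
    return -0.5 * obs[:2]

def residual_policy(obs):
    """Stand-in for a stochastic learned policy: residual action plus an
    uncertainty estimate (e.g. the predictive standard deviation)."""
    residual = 0.1 * rng.normal(size=2)
    uncertainty = float(np.abs(residual).mean())
    return residual, uncertainty

def srrn_action(obs, uncertainty_scale=0.1):
    """Residual deployment with probabilistic fallback: the higher the
    policy uncertainty, the more likely we ignore the residual and
    execute the classical command alone."""
    base = classical_controller(obs)
    residual, uncertainty = residual_policy(obs)
    p_fallback = min(1.0, uncertainty / uncertainty_scale)
    if rng.random() < p_fallback:
        return base              # high uncertainty: trust the prior
    return base + residual       # low uncertainty: apply the residual

print(srrn_action(np.array([1.0, -2.0, 0.3])))
```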
Zero-Shot Open Entity Typing as Type-Compatible Grounding
Title | Zero-Shot Open Entity Typing as Type-Compatible Grounding |
Authors | Ben Zhou, Daniel Khashabi, Chen-Tse Tsai, Dan Roth |
Abstract | The problem of entity-typing has been studied predominantly in supervised learning fashion, mostly with task-specific annotations (for coarse types) and sometimes with distant supervision (for fine types). While such approaches have strong performance within datasets, they often lack the flexibility to transfer across text genres and to generalize to new type taxonomies. In this work we propose a zero-shot entity typing approach that requires no annotated data and can flexibly identify newly defined types. Given a type taxonomy defined as Boolean functions of FREEBASE “types”, we ground a given mention to a set of type-compatible Wikipedia entries and then infer the target mention’s types using an inference algorithm that makes use of the types of these entries. We evaluate our system on a broad range of datasets, including standard fine-grained and coarse-grained entity typing datasets, and also a dataset in the biological domain. Our system is shown to be competitive with state-of-the-art supervised NER systems and outperforms them on out-of-domain datasets. We also show that our system significantly outperforms other zero-shot fine typing systems. |
Tasks | Entity Typing |
Published | 2019-07-07 |
URL | https://arxiv.org/abs/1907.03228v1 |
PDF | https://arxiv.org/pdf/1907.03228v1.pdf |
PWC | https://paperswithcode.com/paper/zero-shot-open-entity-typing-as-type-1 |
Repo | https://github.com/CogComp/zoe |
Framework | tf |
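The grounding-then-inference step can be illustrated with toy data: each candidate Wikipedia entry carries a set of FREEBASE-style types, target types are Boolean predicates over those sets (matching the paper's taxonomy definition), and a mention inherits the types that enough of its type-compatible candidates satisfy. All entries, types, and thresholds below are illustrative:

```python
# Toy grounded entries: each Wikipedia entry carries FREEBASE-style types.
ENTRY_TYPES = {
    "Chicago_Bulls":  {"/organization/organization", "/sports/sports_team"},
    "Chicago_(band)": {"/organization/organization", "/music/artist"},
    "Chicago":        {"/location/location", "/location/citytown"},
}

def infer_types(candidates, taxonomy, min_support=0.5):
    """Infer a mention's types from its type-compatible grounded entries.

    `taxonomy` maps each target type to a Boolean function over a
    FREEBASE type set; a target type fires when at least `min_support`
    of the candidate entries satisfy it.
    """
    inferred = set()
    for target, predicate in taxonomy.items():
        support = sum(predicate(ENTRY_TYPES[e]) for e in candidates)
        if support / len(candidates) >= min_support:
            inferred.add(target)
    return inferred

taxonomy = {
    "LOC": lambda types: "/location/location" in types,
    "ORG": lambda types: "/organization/organization" in types,
}
# A mention "Chicago" in a sports context grounds mostly to the team:
print(infer_types(["Chicago_Bulls", "Chicago_(band)"], taxonomy))  # {'ORG'}
```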
The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction
Title | The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction |
Authors | Martin Gauch, Juliane Mai, Jimmy Lin |
Abstract | Accurate streamflow prediction largely relies on historical records of both meteorological data and streamflow measurements. For many regions around the world, however, such data are only scarcely or not at all available. To select an appropriate model for a region with a given amount of historical data, it is therefore indispensable to know a model’s sensitivity to limited training data, both in terms of geographic diversity and different spans of time. In this study, we provide decision support for tree- and LSTM-based models. We feed the models meteorological measurements from the CAMELS dataset, and individually restrict the training period length and the number of basins used in training. Our findings show that tree-based models provide more accurate predictions on small datasets, while LSTMs are superior given sufficient training data. This is perhaps not surprising, as neural networks are known to be data-hungry; however, we are able to characterize each model’s strengths under different conditions, including the “breakeven point” when LSTMs begin to overtake tree-based models. |
Tasks | |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.07249v1 |
PDF | https://arxiv.org/pdf/1911.07249v1.pdf |
PWC | https://paperswithcode.com/paper/the-proper-care-and-feeding-of-camels-how |
Repo | https://github.com/gauchm/ealstm_regional_modeling |
Framework | pytorch |
Few-Shot Representation Learning for Out-Of-Vocabulary Words
Title | Few-Shot Representation Learning for Out-Of-Vocabulary Words |
Authors | Ziniu Hu, Ting Chen, Kai-Wei Chang, Yizhou Sun |
Abstract | Existing approaches for learning word embeddings often assume there are sufficient occurrences for each word in the corpus, such that the representation of words can be accurately estimated from their contexts. However, in real-world scenarios, out-of-vocabulary (a.k.a. OOV) words that do not appear in the training corpus emerge frequently. It is challenging to learn accurate representations of these words with only a few observations. In this paper, we formulate the learning of OOV embeddings as a few-shot regression problem, and address it by training a representation function to predict the oracle embedding vector (defined as the embedding trained with abundant observations) based on limited observations. Specifically, we propose a novel hierarchical attention-based architecture to serve as the neural regression function, with which the context information of a word is encoded and aggregated from K observations. Furthermore, our approach can leverage Model-Agnostic Meta-Learning (MAML) to adapt the learned model to a new corpus quickly and robustly. Experiments show that the proposed approach significantly outperforms existing methods in constructing accurate embeddings for OOV words, and improves downstream tasks where these embeddings are utilized. |
Tasks | few-shot regression, Learning Word Embeddings, Meta-Learning, Representation Learning, Word Embeddings |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00505v1 |
PDF | https://arxiv.org/pdf/1907.00505v1.pdf |
PWC | https://paperswithcode.com/paper/few-shot-representation-learning-for-out-of |
Repo | https://github.com/acbull/HiCE |
Framework | pytorch |
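At its core, the regression function encodes the K observed contexts of the OOV word and pools them with attention into one predicted embedding. A single-level attention pooling step in NumPy looks like this; the paper's encoder is hierarchical and its projections are learned, so the random vectors here are stand-ins:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_contexts(context_vecs, query):
    """Attention-weighted pooling of K context representations into one
    predicted embedding (a stand-in for the paper's hierarchical
    attention encoder; a trained model learns these projections)."""
    scores = context_vecs @ query          # (K,) attention logits
    weights = softmax(scores)
    return weights @ context_vecs          # convex combination, (d,)

rng = np.random.default_rng(0)
K, d = 4, 8                                # 4 observed contexts, dim 8
contexts = rng.normal(size=(K, d))         # encoded contexts of the OOV word
query = rng.normal(size=d)                 # learned query vector (stand-in)
oov_embedding = aggregate_contexts(contexts, query)
print(oov_embedding.shape)                 # (8,)
```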
Abstract Reasoning with Distracting Features
Title | Abstract Reasoning with Distracting Features |
Authors | Kecheng Zheng, Zheng-jun Zha, Wei Wei |
Abstract | Abstract reasoning is a long-standing challenge in artificial intelligence. Recent studies suggest that many of the deep architectures that have triumphed in other domains fail to work well in abstract reasoning. In this paper, we first illustrate that one of the main challenges in such a reasoning task is the presence of distracting features, which requires the learning algorithm to leverage counterevidence and to reject any of the false hypotheses in order to learn the true patterns. We then show that a carefully designed learning trajectory over different categories of training data can effectively boost learning performance by mitigating the impact of distracting features. Inspired by this fact, we propose the feature robust abstract reasoning (FRAR) model, which consists of a reinforcement learning based teacher network that determines the sequence of training data and a student network that makes the predictions. Experimental results demonstrate strong improvements over baseline algorithms; we beat the state-of-the-art models by 18.7% on the RAVEN dataset and 13.3% on the PGM dataset. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00569v1 |
PDF | https://arxiv.org/pdf/1912.00569v1.pdf |
PWC | https://paperswithcode.com/paper/abstract-reasoning-with-distracting-features-1 |
Repo | https://github.com/zkcys001/distracting_feature |
Framework | pytorch |
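The teacher in FRAR is a reinforcement-learning network; a minimal stand-in that conveys the curriculum idea is an epsilon-greedy bandit that keeps choosing the data category whose recent batches most improved the student. Everything below, including the category names and gains, is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def bandit_teacher(categories, student_gain, rounds=100, eps=0.1):
    """Epsilon-greedy stand-in for the paper's RL teacher: pick the data
    category whose batches most improved the student so far.

    student_gain(category) -> observed improvement from training the
    student on one batch of that category (supplied by the caller).
    """
    estimates = {c: 0.0 for c in categories}
    counts = {c: 0 for c in categories}
    schedule = []
    for _ in range(rounds):
        if rng.random() < eps:
            c = categories[rng.integers(len(categories))]
        else:
            c = max(categories, key=lambda k: estimates[k])
        gain = student_gain(c)
        counts[c] += 1
        estimates[c] += (gain - estimates[c]) / counts[c]  # running mean
        schedule.append(c)
    return schedule

# Toy gains: batches with few distractors currently help the student most.
gains = {"few_distractors": 0.3, "many_distractors": 0.1}
schedule = bandit_teacher(list(gains), lambda c: gains[c] + 0.05 * rng.normal())
print(schedule[:10])
```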
Bayesian Batch Active Learning as Sparse Subset Approximation
Title | Bayesian Batch Active Learning as Sparse Subset Approximation |
Authors | Robert Pinsler, Jonathan Gordon, Eric Nalisnick, José Miguel Hernández-Lobato |
Abstract | Leveraging the wealth of unlabeled data produced in recent years provides great potential for improving supervised models. When the cost of acquiring labels is high, probabilistic active learning methods can be used to greedily select the most informative data points to be labeled. However, for many large-scale problems standard greedy procedures become computationally infeasible and suffer from negligible model change. In this paper, we introduce a novel Bayesian batch active learning approach that mitigates these issues. Our approach is motivated by approximating the complete data posterior of the model parameters. While naive batch construction methods result in correlated queries, our algorithm produces diverse batches that enable efficient active learning at scale. We derive interpretable closed-form solutions akin to existing active learning procedures for linear models, and generalize to arbitrary models using random projections. We demonstrate the benefits of our approach on several large-scale regression and classification tasks. |
Tasks | Active Learning |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.02144v3 |
PDF | https://arxiv.org/pdf/1908.02144v3.pdf |
PWC | https://paperswithcode.com/paper/bayesian-batch-active-learning-as-sparse |
Repo | https://github.com/rpinsler/active-bayesian-coresets |
Framework | pytorch |
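The construction can be pictured as a sparse approximation problem: choose a small batch whose summed contributions approximate the pool's total. The greedy sketch below uses plain embeddings where the paper uses (randomly projected) posterior terms, and a residual-matching rule where the paper derives a Frank-Wolfe procedure; it is an analogy, not the authors' algorithm:

```python
import numpy as np

def greedy_batch(pool_embeddings, batch_size):
    """Greedy sparse approximation of the pool: repeatedly add the point
    whose embedding best matches the still-unexplained part of the
    pool's summed embedding, yielding a diverse rather than redundant
    batch."""
    target = pool_embeddings.sum(axis=0)   # "complete data" direction
    residual = target.copy()
    chosen = []
    for _ in range(batch_size):
        scores = pool_embeddings @ residual
        scores[chosen] = -np.inf           # select without replacement
        i = int(np.argmax(scores))
        chosen.append(i)
        residual = residual - pool_embeddings[i]
    return chosen

rng = np.random.default_rng(0)
pool = rng.normal(size=(200, 32))          # e.g. random projections, J=32
print(greedy_batch(pool, batch_size=5))
```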
BPE-Dropout: Simple and Effective Subword Regularization
Title | BPE-Dropout: Simple and Effective Subword Regularization |
Authors | Ivan Provilkov, Dmitrii Emelianenko, Elena Voita |
Abstract | Subword segmentation is widely used to address the open vocabulary problem in machine translation. The dominant approach to subword segmentation is Byte Pair Encoding (BPE), which keeps the most frequent words intact while splitting the rare ones into multiple tokens. While multiple segmentations are possible even with the same vocabulary, BPE splits words into unique sequences; this may prevent a model from better learning the compositionality of words and being robust to segmentation errors. So far, the only way to overcome this BPE imperfection, its deterministic nature, was to create another subword segmentation algorithm (Kudo, 2018). In contrast, we show that BPE itself incorporates the ability to produce multiple segmentations of the same word. We introduce BPE-dropout, a simple and effective subword regularization method based on and compatible with conventional BPE. It stochastically corrupts the segmentation procedure of BPE, which leads to producing multiple segmentations within the same fixed BPE framework. Using BPE-dropout during training and the standard BPE during inference improves translation quality up to 3 BLEU compared to BPE and up to 0.9 BLEU compared to the previous subword regularization. |
Tasks | Machine Translation |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13267v1 |
PDF | https://arxiv.org/pdf/1910.13267v1.pdf |
PWC | https://paperswithcode.com/paper/191013267 |
Repo | https://github.com/kh-mo/QA_wikisql |
Framework | none |
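The mechanism is small enough to show directly: run ordinary BPE, but at every merge step let each applicable merge be dropped with probability p. A self-contained sketch with a toy merge table follows; the paper's implementation differs in details such as how steps where all merges are dropped are handled:

```python
import random

def bpe_segment(word, merges, dropout=0.0, rng=random):
    """Segment a word with BPE, randomly dropping merges.

    merges: list of symbol pairs in priority order, e.g. [("u", "n"), ...].
    With dropout=0 this is standard deterministic BPE; with dropout>0
    each applicable merge is skipped with probability `dropout`,
    yielding multiple segmentations of the same word (BPE-dropout).
    """
    symbols = list(word)
    rank = {pair: i for i, pair in enumerate(merges)}
    while True:
        # Highest-priority adjacent pair that survives dropout this step.
        candidates = [
            (rank[(a, b)], i)
            for i, (a, b) in enumerate(zip(symbols, symbols[1:]))
            if (a, b) in rank and rng.random() >= dropout
        ]
        if not candidates:
            return symbols
        _, i = min(candidates)
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]

# Toy merge table (a real one is learned from corpus frequencies).
merges = [("u", "n"), ("a", "t"), ("e", "d"), ("at", "ed")]
random.seed(0)
print(bpe_segment("united", merges, dropout=0.0))  # deterministic split
print(bpe_segment("united", merges, dropout=0.5))  # stochastic split
```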
Knowledge-Embedded Routing Network for Scene Graph Generation
Title | Knowledge-Embedded Routing Network for Scene Graph Generation |
Authors | Tianshui Chen, Weihao Yu, Riquan Chen, Liang Lin |
Abstract | Understanding a scene in depth involves not only locating and recognizing individual objects, but also inferring the relationships and interactions among them. However, since the distribution of real-world relationships is seriously unbalanced, existing methods perform quite poorly for the less frequent relationships. In this work, we find that the statistical correlations between object pairs and their relationships can effectively regularize the semantic space and make prediction less ambiguous, thus addressing the unbalanced distribution issue. To achieve this, we incorporate these statistical correlations into deep neural networks to facilitate scene graph generation by developing a Knowledge-Embedded Routing Network. More specifically, we show that the statistical correlations between objects appearing in images and their relationships can be explicitly represented by a structured knowledge graph, and a routing mechanism is learned to propagate messages through the graph to explore their interactions. Extensive experiments on the large-scale Visual Genome dataset demonstrate the superiority of the proposed method over current state-of-the-art competitors. |
Tasks | Graph Generation, Scene Graph Generation |
Published | 2019-03-08 |
URL | http://arxiv.org/abs/1903.03326v1 |
PDF | http://arxiv.org/pdf/1903.03326v1.pdf |
PWC | https://paperswithcode.com/paper/knowledge-embedded-routing-network-for-scene |
Repo | https://github.com/HCPLab-SYSU/KERN |
Framework | pytorch |
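The statistical regularity that KERN exploits is easy to make concrete: given training triplets, the empirical distribution of predicates for each object pair is highly skewed, and the paper encodes exactly these correlations into a knowledge graph through which messages are routed. The sketch below computes only the statistic itself, on toy triplets:

```python
from collections import Counter, defaultdict

# Toy annotated triplets (subject, predicate, object); in the paper the
# correlations are read off the Visual Genome training set.
triplets = [
    ("man", "riding", "horse"), ("man", "on", "horse"),
    ("man", "riding", "horse"), ("cat", "on", "mat"),
]

counts = defaultdict(Counter)
for s, p, o in triplets:
    counts[(s, o)][p] += 1

def relationship_prior(subject, obj):
    """Empirical distribution over predicates for an object pair: the
    statistical correlation that KERN embeds into its routing graph."""
    c = counts[(subject, obj)]
    total = sum(c.values())
    return {p: n / total for p, n in c.items()} if total else {}

print(relationship_prior("man", "horse"))  # riding ~0.67, on ~0.33
```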
Good Secretaries, Bad Truck Drivers? Occupational Gender Stereotypes in Sentiment Analysis
Title | Good Secretaries, Bad Truck Drivers? Occupational Gender Stereotypes in Sentiment Analysis |
Authors | Jayadev Bhaskaran, Isha Bhallamudi |
Abstract | In this work, we investigate the presence of occupational gender stereotypes in sentiment analysis models. Such a task has implications for reducing implicit biases in these models, which are being applied to an increasingly wide variety of downstream tasks. We release a new gender-balanced dataset of 800 sentences pertaining to specific professions and propose a methodology for using it as a test bench to evaluate sentiment analysis models. We evaluate the presence of occupational gender stereotypes in 3 different models using our approach, and explore their relationship with societal perceptions of occupations. |
Tasks | Sentiment Analysis |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.10256v2 |
PDF | https://arxiv.org/pdf/1906.10256v2.pdf |
PWC | https://paperswithcode.com/paper/good-secretaries-bad-truck-drivers |
Repo | https://github.com/jayadevbhaskaran/gendered-sentiment |
Framework | none |
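The test-bench idea is to score minimally contrasting sentence pairs, identical except for the gendered word, for each profession, and inspect the sentiment difference. The released dataset contains 800 curated sentences; the single template and toy scorer below are only illustrative:

```python
# Sketch of the test bench: `sentiment` stands in for any model that
# maps a sentence to a score in [0, 1].
PROFESSIONS = ["secretary", "truck driver", "nurse", "engineer"]
TEMPLATE = "{person} is a {profession}."

def gender_gap(sentiment, professions=PROFESSIONS):
    """Sentiment difference (she - he) per profession; values far from
    zero indicate an occupational gender stereotype."""
    gaps = {}
    for prof in professions:
        s_f = sentiment(TEMPLATE.format(person="She", profession=prof))
        s_m = sentiment(TEMPLATE.format(person="He", profession=prof))
        gaps[prof] = s_f - s_m
    return gaps

# Toy scorer standing in for a real sentiment model:
toy = lambda text: 0.6 if "nurse" in text and "She" in text else 0.5
print(gender_gap(toy))  # nonzero gap only for "nurse"
```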
Variational Sequential Labelers for Semi-Supervised Learning
Title | Variational Sequential Labelers for Semi-Supervised Learning |
Authors | Mingda Chen, Qingming Tang, Karen Livescu, Kevin Gimpel |
Abstract | We introduce a family of multitask variational methods for semi-supervised sequence labeling. Our model family consists of a latent-variable generative model and a discriminative labeler. The generative models use latent variables to define the conditional probability of a word given its context, drawing inspiration from word prediction objectives commonly used in learning word embeddings. The labeler helps inject discriminative information into the latent space. We explore several latent variable configurations, including ones with hierarchical structure, which enables the model to account for both label-specific and word-specific information. Our models consistently outperform standard sequential baselines on 8 sequence labeling datasets, and improve further with unlabeled data. |
Tasks | Learning Word Embeddings, Word Embeddings |
Published | 2019-06-23 |
URL | https://arxiv.org/abs/1906.09535v1 |
PDF | https://arxiv.org/pdf/1906.09535v1.pdf |
PWC | https://paperswithcode.com/paper/variational-sequential-labelers-for-semi-1 |
Repo | https://github.com/mingdachen/vsl |
Framework | pytorch |
Towards Robust Named Entity Recognition for Historic German
Title | Towards Robust Named Entity Recognition for Historic German |
Authors | Stefan Schweter, Johannes Baiter |
Abstract | Recent advances in language modeling using deep neural networks have shown that these models learn representations that vary with network depth, from morphology up to semantic relationships such as co-reference. We apply pre-trained language models to low-resource named entity recognition for Historic German. We show in a series of experiments that character-based pre-trained language models do not run into trouble when faced with low-resource datasets. Our pre-trained character-based language models improve upon classical CRF-based methods and previous work on Bi-LSTMs by boosting F1 score performance by up to 6%. Our pre-trained language and NER models are publicly available at https://github.com/stefan-it/historic-ner . |
Tasks | Language Modelling, Named Entity Recognition |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07592v1 |
PDF | https://arxiv.org/pdf/1906.07592v1.pdf |
PWC | https://paperswithcode.com/paper/towards-robust-named-entity-recognition-for |
Repo | https://github.com/stefan-it/historic-ner |
Framework | none |
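The released models build on the flair library, so usage follows flair's standard tagging pattern. A minimal sketch, assuming a flair install; the model name below is flair's generic German NER tagger used as a placeholder, not the paper's historic checkpoints (see the repo above for those):

```python
# Minimal flair tagging sketch; "de-ner" is flair's generic German NER
# model, standing in for the historic German checkpoints.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("de-ner")
sentence = Sentence("Theodor Fontane wurde in Neuruppin geboren .")
tagger.predict(sentence)
print(sentence.to_tagged_string())
```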
Robust Evaluation of Language-Brain Encoding Experiments
Title | Robust Evaluation of Language-Brain Encoding Experiments |
Authors | Lisa Beinborn, Samira Abnar, Rochelle Choenni |
Abstract | Language-brain encoding experiments evaluate the ability of language models to predict brain responses elicited by language stimuli. The evaluation scenarios for this task have not yet been standardized which makes it difficult to compare and interpret results. We perform a series of evaluation experiments with a consistent encoding setup and compute the results for multiple fMRI datasets. In addition, we test the sensitivity of the evaluation measures to randomized data and analyze the effect of voxel selection methods. Our experimental framework is publicly available to make modelling decisions more transparent and support reproducibility for future comparisons. |
Tasks | |
Published | 2019-04-04 |
URL | http://arxiv.org/abs/1904.02547v1 |
PDF | http://arxiv.org/pdf/1904.02547v1.pdf |
PWC | https://paperswithcode.com/paper/robust-evaluation-of-language-brain-encoding |
Repo | https://github.com/beinborn/brain-lang |
Framework | none |
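One measure commonly used in this literature, and among those whose sensitivity the paper probes, is pairwise matching accuracy: for every pair of stimuli, the correct assignment of predicted to observed brain responses should correlate better than the swapped one. A NumPy sketch with synthetic data:

```python
import numpy as np

def pairwise_accuracy(pred, true):
    """Pairwise matching accuracy over all stimulus pairs: does the
    correct pairing of predictions and brain responses correlate
    better than the swapped pairing? Chance level is 0.5."""
    def corr(a, b):
        return np.corrcoef(a, b)[0, 1]
    n, hits, total = len(pred), 0, 0
    for i in range(n):
        for j in range(i + 1, n):
            correct = corr(pred[i], true[i]) + corr(pred[j], true[j])
            swapped = corr(pred[i], true[j]) + corr(pred[j], true[i])
            hits += correct > swapped
            total += 1
    return hits / total

rng = np.random.default_rng(0)
true = rng.normal(size=(20, 100))                     # 20 stimuli x 100 voxels
pred = true + rng.normal(scale=2.0, size=true.shape)  # noisy predictions
print(pairwise_accuracy(pred, true))                  # well above 0.5 chance
```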
Quantifying the Carbon Emissions of Machine Learning
Title | Quantifying the Carbon Emissions of Machine Learning |
Authors | Alexandre Lacoste, Alexandra Luccioni, Victor Schmidt, Thomas Dandres |
Abstract | From an environmental standpoint, there are a few crucial aspects of training a neural network that have a major impact on the quantity of carbon that it emits. These factors include: the location of the server used for training and the energy grid that it uses, the length of the training procedure, and even the make and model of hardware on which the training takes place. In order to approximate these emissions, we present our Machine Learning Emissions Calculator, a tool for our community to better understand the environmental impact of training ML models. We accompany this tool with an explanation of the factors cited above, as well as concrete actions that individual practitioners and organizations can take to mitigate their carbon emissions. |
Tasks | |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09700v2 |
PDF | https://arxiv.org/pdf/1910.09700v2.pdf |
PWC | https://paperswithcode.com/paper/quantifying-the-carbon-emissions-of-machine |
Repo | https://github.com/mlco2/impact |
Framework | none |
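The calculator's inputs reduce to a short formula: energy drawn (hardware power × training time), inflated by datacenter overhead (PUE), times the grid's carbon intensity. A back-of-the-envelope version, with an illustrative PUE default and example numbers rather than the calculator's internal figures:

```python
def training_emissions_kg(power_draw_watts, hours,
                          carbon_intensity_g_per_kwh, pue=1.58):
    """Back-of-the-envelope CO2eq of a training run, in kilograms:
    energy (kWh) x datacenter overhead (PUE) x grid carbon intensity.
    The PUE default and the example below are illustrative."""
    energy_kwh = power_draw_watts / 1000.0 * hours * pue
    return energy_kwh * carbon_intensity_g_per_kwh / 1000.0

# Example: a 300 W GPU running for 100 h on a 400 gCO2eq/kWh grid.
print(round(training_emissions_kg(300, 100, 400), 1), "kg CO2eq")
```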