October 20, 2019

3152 words 15 mins read

Paper Group AWR 300

Adaptive Mechanism Design: Learning to Promote Cooperation. Stochastic Deep Networks. Split and Rephrase: Better Evaluation and a Stronger Baseline. MSplit LBI: Realizing Feature Selection and Dense Estimation Simultaneously in Few-shot and Zero-shot Learning. A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC. Ranking Paragraphs for Improving An …

Adaptive Mechanism Design: Learning to Promote Cooperation

Title Adaptive Mechanism Design: Learning to Promote Cooperation
Authors Tobias Baumann, Thore Graepel, John Shawe-Taylor
Abstract In the future, artificial learning agents are likely to become increasingly widespread in our society. They will interact with both other learning agents and humans in a variety of complex settings including social dilemmas. We consider the problem of how an external agent can promote cooperation between artificial learners by distributing additional rewards and punishments based on observing the learners’ actions. We propose a rule for automatically learning how to create the right incentives by considering the players’ anticipated parameter updates. Using this learning rule leads to cooperation with high social welfare in matrix games in which the agents would otherwise learn to defect with high probability. We show that the resulting cooperative outcome is stable in certain games even if the planning agent is turned off after a given number of episodes, while other games require ongoing intervention to maintain mutual cooperation. However, even in the latter case, the amount of necessary additional incentives decreases over time.
Tasks
Published 2018-06-11
URL https://arxiv.org/abs/1806.04067v2
PDF https://arxiv.org/pdf/1806.04067v2.pdf
PWC https://paperswithcode.com/paper/adaptive-mechanism-design-learning-to-promote
Repo https://github.com/tobiasbaumann1/Adaptive_Mechanism_Design
Framework tf
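
The core mechanism lends itself to a toy illustration. Below is a minimal sketch, not the authors’ implementation: two REINFORCE learners play an iterated Prisoner’s Dilemma, and a fixed hand-tuned penalty on defection stands in for the paper’s learned planning agent (which would instead adapt incentives by considering the players’ anticipated parameter updates). All names and constants are illustrative.

```python
# Toy sketch: a planner shapes rewards in an iterated Prisoner's Dilemma so
# that policy-gradient learners converge to cooperation. Hand-tuned incentive
# stands in for the paper's learned planning agent.
import numpy as np

rng = np.random.default_rng(0)
# Row player's payoff for (my action, opponent action); 0 = cooperate, 1 = defect.
payoff = np.array([[3.0, 0.0],
                   [4.0, 1.0]])

theta = np.zeros(2)  # per-player logit for P(defect)

def p_defect(t):
    return 1.0 / (1.0 + np.exp(-t))

def grad_log_pi(t, a):
    # d/dtheta of log pi(a) for pi(defect) = sigmoid(theta).
    return a - p_defect(t)

for step in range(2000):
    acts = (rng.random(2) < p_defect(theta)).astype(int)
    r = np.array([payoff[acts[0], acts[1]], payoff[acts[1], acts[0]]])
    # Planner intervention: penalize defection so cooperation becomes dominant.
    r = r + np.where(acts == 1, -2.0, 0.0)
    for i in range(2):  # REINFORCE update for each learner
        theta[i] += 0.05 * r[i] * grad_log_pi(theta[i], acts[i])

print("P(defect) per player:", p_defect(theta))  # both should be near 0
```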

Stochastic Deep Networks

Title Stochastic Deep Networks
Authors Gwendoline de Bie, Gabriel Peyré, Marco Cuturi
Abstract Machine learning is increasingly targeting areas where input data cannot be accurately described by a single vector, but can be modeled instead using the more flexible concept of random vectors, namely probability measures or more simply point clouds of varying cardinality. Using deep architectures on measures poses, however, many challenging issues. Indeed, deep architectures are originally designed to handle fixed-length vectors, or, using recursive mechanisms, ordered sequences thereof. In sharp contrast, measures describe a varying number of weighted observations with no particular order. We propose in this work a deep framework designed to handle crucial aspects of measures, namely permutation invariances, variations in weights and cardinality. Architectures derived from this pipeline can (i) map measures to measures - using the concept of push-forward operators; (ii) bridge the gap between measures and Euclidean spaces - through integration steps. This makes it possible to design discriminative networks (to classify or reduce the dimensionality of input measures), generative architectures (to synthesize measures) and recurrent pipelines (to predict measure dynamics). We provide a theoretical analysis of these building blocks, review our architectures’ approximation abilities and robustness w.r.t. perturbations, and try them on various discriminative and generative tasks.
Tasks
Published 2018-11-19
URL http://arxiv.org/abs/1811.07429v2
PDF http://arxiv.org/pdf/1811.07429v2.pdf
PWC https://paperswithcode.com/paper/stochastic-deep-networks
Repo https://github.com/gdebie/stochastic-deep-networks
Framework pytorch
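
A minimal sketch of the kind of building block the abstract describes: a permutation-invariant layer that maps a weighted point cloud (a discrete measure) to a fixed-size vector via an elementwise map followed by an integration step. Shapes and module names are assumptions, not the authors’ API.

```python
# Sketch: map a weighted point cloud (discrete measure) to a vector by an
# elementwise "push-forward" network followed by integration (weighted mean).
import torch

class MeasureToVector(torch.nn.Module):
    def __init__(self, dim_in, dim_hidden, dim_out):
        super().__init__()
        self.phi = torch.nn.Sequential(                  # applied per point
            torch.nn.Linear(dim_in, dim_hidden), torch.nn.ReLU())
        self.rho = torch.nn.Linear(dim_hidden, dim_out)  # after integration

    def forward(self, points, weights):
        # points: (n, dim_in); weights: (n,), summing to 1.
        h = self.phi(points)                    # transform each support point
        pooled = (weights[:, None] * h).sum(0)  # integrate against the measure
        return self.rho(pooled)

n = 7
pts, w = torch.randn(n, 3), torch.full((n,), 1.0 / n)
net = MeasureToVector(3, 16, 4)
print(net(pts, w))  # identical output under any permutation of the points
```

Because pooling is a weighted sum, the layer is invariant to point order and handles varying cardinality by construction.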

Split and Rephrase: Better Evaluation and a Stronger Baseline

Title Split and Rephrase: Better Evaluation and a Stronger Baseline
Authors Roee Aharoni, Yoav Goldberg
Abstract Splitting and rephrasing a complex sentence into several shorter sentences that convey the same meaning is a challenging problem in NLP. We show that while vanilla seq2seq models can reach high scores on the proposed benchmark (Narayan et al., 2017), they suffer from memorization of the training set, which contains more than 89% of the unique simple sentences from the validation and test sets. To address this, we present a new train-development-test data split and neural models augmented with a copy-mechanism, outperforming the best reported baseline by 8.68 BLEU and fostering further progress on the task.
Tasks
Published 2018-05-02
URL http://arxiv.org/abs/1805.01035v1
PDF http://arxiv.org/pdf/1805.01035v1.pdf
PWC https://paperswithcode.com/paper/split-and-rephrase-better-evaluation-and-a
Repo https://github.com/roeeaharoni/sprp-acl2018
Framework pytorch
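
The copy-mechanism the paper adds to its models can be illustrated with a pointer-style mixing step: the output distribution blends a generation distribution with attention-weighted copying from the source, so out-of-vocabulary source words stay reachable. The probabilities below are toy values, not the paper’s model.

```python
# Sketch: pointer/copy mixing. Final word probability combines the generator's
# vocabulary distribution with copying via attention, gated by p_gen.
import numpy as np

vocab = ["the", "cat", "sat", "<unk>"]
p_vocab = np.array([0.5, 0.2, 0.2, 0.1])   # generator's distribution
source = ["the", "cat", "meowed"]           # "meowed" is out-of-vocabulary
attention = np.array([0.1, 0.3, 0.6])       # attention over source tokens
p_gen = 0.4                                 # learned copy/generate gate

def final_prob(word):
    gen = p_gen * (p_vocab[vocab.index(word)] if word in vocab else 0.0)
    copy = (1 - p_gen) * sum(a for tok, a in zip(source, attention) if tok == word)
    return gen + copy

for w in ["the", "cat", "meowed"]:
    print(w, round(final_prob(w), 3))  # OOV "meowed" gets mass via copying
```

The design point is that copying lets a model reuse rare source tokens instead of memorizing them, which is exactly the failure mode the paper diagnoses in vanilla seq2seq.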

MSplit LBI: Realizing Feature Selection and Dense Estimation Simultaneously in Few-shot and Zero-shot Learning

Title MSplit LBI: Realizing Feature Selection and Dense Estimation Simultaneously in Few-shot and Zero-shot Learning
Authors Bo Zhao, Xinwei Sun, Yanwei Fu, Yuan Yao, Yizhou Wang
Abstract Efficiently learning the representation coefficients between two spaces/subspaces is a typical and general problem in learning a good embedding model. To solve this task, $L_{1}$ regularization is widely used for the pursuit of feature selection and avoiding overfitting, and yet the sparse estimation of features in $L_{1}$ regularization may cause the underfitting of training data. $L_{2}$ regularization is also frequently used, but it is a biased estimator. In this paper, we propose the idea that the features consist of three orthogonal parts, \emph{namely} sparse strong signals, dense weak signals and random noise, in which both strong and weak signals contribute to the fitting of data. To facilitate such a decomposition, we propose \emph{MSplit} LBI, which for the first time realizes feature selection and dense estimation simultaneously. We provide theoretical and simulation-based verification that our method outperforms $L_{1}$ and $L_{2}$ regularization, and extensive experimental results show that our method achieves state-of-the-art performance in few-shot and zero-shot learning.
Tasks Feature Selection, Zero-Shot Learning
Published 2018-06-12
URL http://arxiv.org/abs/1806.04360v1
PDF http://arxiv.org/pdf/1806.04360v1.pdf
PWC https://paperswithcode.com/paper/msplit-lbi-realizing-feature-selection-and
Repo https://github.com/PatrickZH/Zero-shot-Learning
Framework none
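
The decomposition the abstract describes can be pictured with a Split LBI-style iteration (Huang et al., 2016), which this work builds on: a dense parameter fits strong plus weak signals while a coupled sparse parameter performs feature selection. The numpy sketch below shows that base iteration on a toy regression; hyperparameters are illustrative, and early stopping acts as the regularizer.

```python
# Sketch of a Split LBI iteration: dense `beta` fits the data, sparse `gamma`
# (coupled via variable splitting) selects features along a regularization path.
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 10
X = rng.standard_normal((n, p))
true = np.array([5, -4, 0.5, -0.5, 0.3, 0, 0, 0, 0, 0])  # strong + weak signals
y = X @ true + 0.1 * rng.standard_normal(n)

nu, kappa, alpha = 1.0, 10.0, 0.001
beta, gamma, z = np.zeros(p), np.zeros(p), np.zeros(p)

def shrink(v):  # soft-thresholding at level 1
    return np.sign(v) * np.maximum(np.abs(v) - 1.0, 0.0)

for t in range(2000):  # iteration count = position on the path (early stopping)
    grad_beta = X.T @ (X @ beta - y) / n + (beta - gamma) / nu
    beta -= kappa * alpha * grad_beta
    z += alpha * (beta - gamma) / nu   # Bregman update for the sparse part
    gamma = kappa * shrink(z)

print("dense beta  :", beta.round(2))   # keeps weak signals (shrunken, nonzero)
print("sparse gamma:", gamma.round(2))  # selects only the strong signals
```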

A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC

Title A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC
Authors Mark Yatskar
Abstract We compare three new datasets for question answering: SQuAD 2.0, QuAC, and CoQA, along several of their new features: (1) unanswerable questions, (2) multi-turn interactions, and (3) abstractive answers. We show that the datasets provide complementary coverage of the first two aspects, but weak coverage of the third. Because of the datasets’ structural similarity, a single extractive model can be easily adapted to any of the datasets and we show improved baseline results on both SQuAD 2.0 and CoQA. Despite the similarity, models trained on one dataset are ineffective on another dataset, but we find moderate performance improvement through pretraining. To encourage cross-evaluation, we release code for conversion between datasets at https://github.com/my89/co-squac .
Tasks Question Answering
Published 2018-09-27
URL https://arxiv.org/abs/1809.10735v2
PDF https://arxiv.org/pdf/1809.10735v2.pdf
PWC https://paperswithcode.com/paper/a-qualitative-comparison-of-coqa-squad-20-and
Repo https://github.com/my89/co-squac
Framework none
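
A hypothetical sketch of the kind of conversion the released co-squac code supports: flattening a multi-turn dialog into SQuAD-style extractive examples by prepending recent dialog history to each question. The field names and history window below are assumptions for illustration, not the repo’s actual schema.

```python
# Sketch: flatten a conversational QA dialog into extractive (question,
# context, span) examples. Field names are illustrative, not co-squac's schema.
def flatten_dialog(context, turns, history_size=2):
    examples = []
    for i, turn in enumerate(turns):
        history = [t["question"] + " " + t["answer"]
                   for t in turns[max(0, i - history_size):i]]
        examples.append({
            "context": context,
            "question": " ".join(history + [turn["question"]]),
            "answer_text": turn["answer"],
            "answer_start": context.find(turn["answer"]),  # -1 if abstractive
        })
    return examples

turns = [{"question": "Who wrote it?", "answer": "Twain"},
         {"question": "When?", "answer": "1884"}]
print(flatten_dialog("Twain published the novel in 1884.", turns))
```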

Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering

Title Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering
Authors Jinhyuk Lee, Seongjun Yun, Hyunjae Kim, Miyoung Ko, Jaewoo Kang
Abstract Recently, open-domain question answering (QA) has been combined with machine comprehension models to find answers in a large knowledge source. As open-domain QA requires retrieving relevant documents from text corpora to answer questions, its performance largely depends on the performance of document retrievers. However, since traditional information retrieval systems are not effective in obtaining documents with a high probability of containing answers, they lower the performance of QA systems. Simply extracting more documents increases the number of irrelevant documents, which also degrades the performance of QA systems. In this paper, we introduce Paragraph Ranker, which ranks paragraphs of retrieved documents for higher answer recall with less noise. We show that ranking paragraphs and aggregating answers using Paragraph Ranker improves the performance of the open-domain QA pipeline on four open-domain QA datasets by 7.8% on average.
Tasks Information Retrieval, Open-Domain Question Answering, Question Answering, Reading Comprehension
Published 2018-10-01
URL http://arxiv.org/abs/1810.00494v1
PDF http://arxiv.org/pdf/1810.00494v1.pdf
PWC https://paperswithcode.com/paper/ranking-paragraphs-for-improving-answer
Repo https://github.com/yongqyu/ranking_paragraphs_pytorch
Framework pytorch
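
The ranking step can be pictured as a bi-encoder scoring pass: encode the question and each retrieved paragraph, score pairs by similarity, and keep the top-k paragraphs for the reader. In the sketch below, random vectors stand in for the paper’s learned encoders.

```python
# Sketch: rank retrieved paragraphs by similarity to the question encoding and
# keep the top-k for the reading-comprehension model. Encoders are stubbed out.
import numpy as np

rng = np.random.default_rng(0)
q = rng.standard_normal(64)        # question encoding (stand-in)
P = rng.standard_normal((20, 64))  # 20 paragraph encodings (stand-ins)

def rank_paragraphs(q, P, k=5):
    # Cosine similarity between the question and each paragraph.
    scores = (P @ q) / (np.linalg.norm(P, axis=1) * np.linalg.norm(q))
    top = np.argsort(-scores)[:k]
    return top, scores[top]

idx, s = rank_paragraphs(q, P)
print("top paragraphs:", idx, "scores:", s.round(3))
```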

Topic-Guided Attention for Image Captioning

Title Topic-Guided Attention for Image Captioning
Authors Zhihao Zhu, Zhan Xue, Zejian Yuan
Abstract Attention mechanisms have attracted considerable interest in image captioning because of their strong performance. Existing attention-based models use feedback information from the caption generator as guidance to determine which of the image features should be attended to. A common defect of these attention generation methods is that they lack higher-level guiding information from the image itself, which limits their ability to select the most informative image features. Therefore, in this paper, we propose a novel attention mechanism, called topic-guided attention, which integrates image topics into the attention model as guiding information to help select the most important image features. Moreover, we extract image features and image topics with separate networks, which can be fine-tuned jointly in an end-to-end manner during training. The experimental results on the benchmark Microsoft COCO dataset show that our method yields state-of-the-art performance on various quantitative metrics.
Tasks Image Captioning
Published 2018-07-10
URL http://arxiv.org/abs/1807.03514v1
PDF http://arxiv.org/pdf/1807.03514v1.pdf
PWC https://paperswithcode.com/paper/topic-guided-attention-for-image-captioning
Repo https://github.com/jsaikmr/Building-a-Topic-Modeling-for-Images-using-LDA-and-Transfer-Learning
Framework none
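
A toy sketch of topic-guided attention as the abstract describes it: the attention over image region features is conditioned on an image topic vector in addition to the decoder state. The additive scoring form and all dimensions are illustrative assumptions.

```python
# Sketch: attention over image regions guided by both the decoder state and an
# image-level topic vector. Weights are random stand-ins for learned parameters.
import numpy as np

rng = np.random.default_rng(0)
regions = rng.standard_normal((10, 32))  # 10 image region features
topic = rng.standard_normal(32)          # image topic vector (e.g., from a topic model)
state = rng.standard_normal(32)          # caption decoder hidden state
W_r, W_t, W_s = (rng.standard_normal((32, 32)) * 0.1 for _ in range(3))
v = rng.standard_normal(32) * 0.1

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Additive attention scores: the topic acts as high-level guidance.
scores = np.tanh(regions @ W_r + topic @ W_t + state @ W_s) @ v
alpha = softmax(scores)
context = alpha @ regions                # attended image feature for the decoder
print("attention weights:", alpha.round(3))
```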

Fast and Accurate Intrinsic Symmetry Detection

Title Fast and Accurate Intrinsic Symmetry Detection
Authors Rajendra Nagar, Shanmuganathan Raman
Abstract In computer vision and graphics, various types of symmetries are extensively studied, since symmetry present in objects is a fundamental cue for understanding the shape and the structure of objects. In this work, we detect the intrinsic reflective symmetry in triangle meshes, where we have to find the intrinsically symmetric point for each point of the shape. We establish correspondences between functions defined on the shapes by extending the functional map framework and then recover the point-to-point correspondences. Previous approaches using the functional map for this task find the functional correspondence matrix by solving a non-linear optimization problem, which makes them slow. In this work, we propose a closed-form solution for this matrix, which makes our approach faster. We derive the closed-form solution from the following results. If the given shape is intrinsically symmetric, then the shortest length geodesic between two intrinsically symmetric points is also intrinsically symmetric. If an eigenfunction of the Laplace-Beltrami operator for the given shape is an even (odd) function, then its restriction on the shortest length geodesic between two intrinsically symmetric points is also an even (odd) function. The sign of a low-frequency eigenfunction is the same on neighboring points. Our method is invariant to the ordering of the eigenfunctions and has the lowest time complexity. We achieve the best performance on the SCAPE dataset and comparable performance with the state-of-the-art methods on the TOSCA dataset.
Tasks
Published 2018-07-26
URL http://arxiv.org/abs/1807.10162v4
PDF http://arxiv.org/pdf/1807.10162v4.pdf
PWC https://paperswithcode.com/paper/fast-and-accurate-intrinsic-symmetry
Repo https://github.com/r03ert0/interesting
Framework tf
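
The even/odd eigenfunction property the method exploits can be checked on a 1-D toy analogue: a path graph whose intrinsic symmetry is the reflection i -> n-1-i. Laplacian eigenvectors of the path alternate between even and odd functions under this reflection, so flipping the signs of the odd ones and matching nearest neighbors in the spectral embedding recovers the symmetry map. This is a sketch of the principle only, not the paper’s mesh pipeline.

```python
# Sketch: recover the reflective symmetry of a path graph from the even/odd
# structure of its Laplacian eigenvectors (toy analogue of the mesh setting).
import numpy as np

n, k = 12, 6
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1               # path-graph Laplacian
w, Phi = np.linalg.eigh(L)
Phi = Phi[:, :k]                      # low-frequency eigenfunctions

# Even/odd test under the reflection i -> n-1-i: each column is +/- itself.
signs = np.array([1 if np.allclose(Phi[::-1, j], Phi[:, j], atol=1e-6) else -1
                  for j in range(k)])

# The symmetric point of i is the nearest neighbor of its sign-flipped embedding.
emb_sym = Phi * signs
pairs = [int(np.argmin(((Phi - emb_sym[i]) ** 2).sum(1))) for i in range(n)]
print(pairs)  # expected: [11, 10, 9, ..., 1, 0]
```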

Reference-less Measure of Faithfulness for Grammatical Error Correction

Title Reference-less Measure of Faithfulness for Grammatical Error Correction
Authors Leshem Choshen, Omri Abend
Abstract We propose USim, a semantic measure for Grammatical Error Correction (GEC) that measures the semantic faithfulness of the output to the source, thereby complementing existing reference-less measures (RLMs) for measuring the output’s grammaticality. USim operates by comparing the semantic symbolic structure of the source and the correction, without relying on manually-curated references. Our experiments establish the validity of USim, by showing that (1) semantic annotation can be consistently applied to ungrammatical text; (2) valid corrections obtain a high USim similarity score to the source; and (3) invalid corrections obtain a lower score.
Tasks Grammatical Error Correction
Published 2018-04-11
URL http://arxiv.org/abs/1804.03824v4
PDF http://arxiv.org/pdf/1804.03824v4.pdf
PWC https://paperswithcode.com/paper/reference-less-measure-of-faithfulness-for
Repo https://github.com/borgr/USim
Framework none
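
A hedged sketch of a reference-less faithfulness score in USim’s spirit: represent source and correction as sets of semantic relations and score their overlap with F1. USim compares UCCA graphs; plain hand-written triples stand in here, so this shows the shape of the measure rather than its actual semantic backbone.

```python
# Sketch: score semantic faithfulness of a correction to its source as F1
# overlap between relation sets. Hand-written triples stand in for UCCA graphs.
def relation_f1(src_rels, cor_rels):
    src, cor = set(src_rels), set(cor_rels)
    if not src or not cor:
        return 0.0
    p = len(src & cor) / len(cor)   # precision of the correction's relations
    r = len(src & cor) / len(src)   # recall of the source's relations
    return 2 * p * r / (p + r) if p + r else 0.0

source  = {("saw", "agent", "he"), ("saw", "patient", "dog")}
valid   = {("saw", "agent", "he"), ("saw", "patient", "dog")}   # grammar fixed, meaning kept
invalid = {("saw", "agent", "dog"), ("saw", "patient", "he")}   # meaning changed
print(relation_f1(source, valid), relation_f1(source, invalid))  # 1.0 vs 0.0
```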

Polyglot Semantic Parsing in APIs

Title Polyglot Semantic Parsing in APIs
Authors Kyle Richardson, Jonathan Berant, Jonas Kuhn
Abstract Traditional approaches to semantic parsing (SP) work by training individual models for each available parallel dataset of text-meaning pairs. In this paper, we explore the idea of polyglot semantic translation, or learning semantic parsing models that are trained on multiple datasets and natural languages. In particular, we focus on translating text to code signature representations using the software component datasets of Richardson and Kuhn (2017a,b). The advantage of such models is that they can be used for parsing a wide variety of input natural languages and output programming languages, or mixed input languages, using a single unified model. To facilitate modeling of this type, we develop a novel graph-based decoding framework that achieves state-of-the-art performance on the above datasets, and apply this method to two other benchmark SP tasks.
Tasks Semantic Parsing
Published 2018-03-19
URL http://arxiv.org/abs/1803.06966v2
PDF http://arxiv.org/pdf/1803.06966v2.pdf
PWC https://paperswithcode.com/paper/polyglot-semantic-parsing-in-apis
Repo https://github.com/yakazimir/Code-Datasets
Framework none
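
One standard way to pool multiple datasets and output languages under a single model, consistent with the polyglot setup described above, is to mark each source with a target-language tag. The sketch below shows that data-preparation step; the tagging scheme is a common multilingual trick assumed here for illustration, not the paper’s graph-based decoding framework.

```python
# Sketch: pool several (text, code-signature) datasets into one training set,
# tagging each example with its output language so one model serves them all.
def pool_datasets(datasets):
    pooled = []
    for lang, pairs in datasets.items():
        for text, signature in pairs:
            pooled.append((f"<2{lang}> {text}", signature))
    return pooled

datasets = {
    "java": [("open a file for reading", "FileReader(String fileName)")],
    "python": [("open a file for reading", "open(file, mode='r')")],
}
for src, tgt in pool_datasets(datasets):
    print(src, "->", tgt)
```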

Are you tough enough? Framework for Robustness Validation of Machine Comprehension Systems

Title Are you tough enough? Framework for Robustness Validation of Machine Comprehension Systems
Authors Barbara Rychalska, Dominika Basaj, Przemyslaw Biecek
Abstract The deep learning NLP domain lacks procedures for the analysis of model robustness. In this paper we propose a framework which validates the robustness of any Question Answering model through model explainers. We propose that a robust model should transgress the initial notion of semantic similarity induced by word embeddings to learn a more human-like understanding of meaning. We test this property by manipulating questions in two ways: swapping an important question word for (1) its semantically correct synonym and (2) a word vector that is close in embedding space. We estimate the importance of words in asked questions with the Local Interpretable Model-agnostic Explanations method (LIME). With these two steps we compare state-of-the-art Q&A models. We show that although the accuracy of state-of-the-art models is high, they are very fragile to changes in the input. Moreover, we propose two adversarial training scenarios which raise model sensitivity to true synonyms by up to 7% in accuracy. Our findings help to understand which models are more stable and how they can be improved. In addition, we have created and published a new dataset that may be used for validation of the robustness of a Q&A model.
Tasks Question Answering, Reading Comprehension, Semantic Similarity, Semantic Textual Similarity, Word Embeddings
Published 2018-12-05
URL http://arxiv.org/abs/1812.02205v1
PDF http://arxiv.org/pdf/1812.02205v1.pdf
PWC https://paperswithcode.com/paper/are-you-tough-enough-framework-for-robustness
Repo https://github.com/MI2DataLab/nlp_interpretability_framework
Framework none
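
The two perturbation modes are easy to sketch: swap an important question word (importance would come from LIME in the paper) for a curated synonym, and separately for its nearest neighbor in embedding space. Toy vectors stand in for real embeddings below.

```python
# Sketch: the framework's two question perturbations — synonym swap vs.
# nearest-embedding-neighbor swap. Random vectors stand in for real embeddings.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["city", "town", "village", "car"]
emb = {w: rng.standard_normal(8) for w in vocab}
synonyms = {"city": "town"}          # curated, semantically correct synonym

def nearest_neighbor(word):
    v = emb[word]
    others = [w for w in vocab if w != word]
    sims = [v @ emb[w] / (np.linalg.norm(v) * np.linalg.norm(emb[w]))
            for w in others]
    return others[int(np.argmax(sims))]

def perturb(question, important_word):
    syn = question.replace(important_word, synonyms.get(important_word, important_word))
    nbr = question.replace(important_word, nearest_neighbor(important_word))
    return syn, nbr   # a robust model should answer both consistently

print(perturb("Which city hosted the games?", "city"))
```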

Learning to Search with MCTSnets

Title Learning to Search with MCTSnets
Authors Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver
Abstract Planning problems are among the most important and well-studied problems in artificial intelligence. They are most typically solved by tree search algorithms that simulate ahead into the future, evaluate future states, and back-up those evaluations to the root of a search tree. Among these algorithms, Monte-Carlo tree search (MCTS) is one of the most general, powerful and widely used. A typical implementation of MCTS uses cleverly designed rules, optimized to the particular characteristics of the domain. These rules control where the simulation traverses, what to evaluate in the states that are reached, and how to back-up those evaluations. In this paper we instead learn where, what and how to search. Our architecture, which we call an MCTSnet, incorporates simulation-based search inside a neural network, by expanding, evaluating and backing-up a vector embedding. The parameters of the network are trained end-to-end using gradient-based optimisation. When applied to small searches in the well known planning problem Sokoban, the learned search algorithm significantly outperformed MCTS baselines.
Tasks
Published 2018-02-13
URL http://arxiv.org/abs/1802.04697v2
PDF http://arxiv.org/pdf/1802.04697v2.pdf
PWC https://paperswithcode.com/paper/learning-to-search-with-mctsnets
Repo https://github.com/dixantmittal/mctsnet
Framework pytorch
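
A schematic, heavily simplified sketch of the MCTSnet idea: hand-designed search statistics are replaced by learned vector embeddings that a simulation policy expands, an encoder evaluates, and a recurrent cell backs up along the search path. Module shapes, the environment stub, and the single-path memory are all simplifying assumptions, not the paper’s architecture.

```python
# Schematic sketch: simulation-based search over vector embeddings. A policy
# chooses where to simulate, an encoder evaluates states, a GRU cell backs up.
import torch

emb_dim, n_actions = 16, 4
embed = torch.nn.Linear(8, emb_dim)           # state -> embedding (evaluate)
policy = torch.nn.Linear(emb_dim, n_actions)  # where to simulate (expand)
backup = torch.nn.GRUCell(emb_dim, emb_dim)   # how to back up evaluations
readout = torch.nn.Linear(emb_dim, n_actions) # final action logits

def simulate(state, step_env, depth=3):
    """One simulation: descend, then back the leaf embedding up the path."""
    path, s = [], state
    h = embed(s)
    for _ in range(depth):
        path.append(h)
        a = torch.distributions.Categorical(logits=policy(h)).sample()
        s = step_env(s, a)                # environment transition (stubbed)
        h = embed(s)                      # evaluate the reached state
    for parent in reversed(path):         # back-up: child updates parent memory
        h = backup(h.unsqueeze(0), parent.unsqueeze(0)).squeeze(0)
    return readout(h)

step_env = lambda s, a: s + 0.1           # toy stand-in environment
print(simulate(torch.randn(8), step_env))
```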

Decoupled Parallel Backpropagation with Convergence Guarantee

Title Decoupled Parallel Backpropagation with Convergence Guarantee
Authors Zhouyuan Huo, Bin Gu, Qian Yang, Heng Huang
Abstract The backpropagation algorithm is indispensable for training feedforward neural networks. It requires propagating error gradients sequentially from the output layer all the way back to the input layer. The backward locking in the backpropagation algorithm prevents us from updating network layers in parallel and fully leveraging the computing resources. Recently, several algorithms have been proposed for breaking the backward locking. However, their performance degrades seriously when networks are deep. In this paper, we propose a decoupled parallel backpropagation algorithm for deep learning optimization with a convergence guarantee. Firstly, we decouple the backpropagation algorithm using delayed gradients, and show that the backward locking is removed when we split the networks into multiple modules. Then, we utilize decoupled parallel backpropagation in two stochastic methods and prove that our method guarantees convergence to critical points for the non-convex problem. Finally, we perform experiments for training deep convolutional neural networks on benchmark datasets. The experimental results not only confirm our theoretical analysis, but also demonstrate that the proposed method can achieve significant speedup without loss of accuracy.
Tasks
Published 2018-04-27
URL http://arxiv.org/abs/1804.10574v3
PDF http://arxiv.org/pdf/1804.10574v3.pdf
PWC https://paperswithcode.com/paper/decoupled-parallel-backpropagation-with
Repo https://github.com/unconst/MACH
Framework tf
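
The delayed-gradient idea can be shown on a two-module toy network: the lower module updates with the upstream gradient saved from the previous iteration, so in a pipelined implementation it need not wait for the upper module’s backward pass. The regression task and learning rate below are illustrative.

```python
# Sketch: break backward locking by letting module A update with the upstream
# gradient delayed by one iteration while module B works on the fresh one.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
y = X @ rng.standard_normal(5)

W1 = rng.standard_normal((5, 8)) * 0.1   # module A
W2 = rng.standard_normal((8, 1)) * 0.1   # module B
delayed = np.zeros((100, 8))             # stale dL/dh from the previous step
lr = 0.05

for t in range(2000):
    h = X @ W1                           # module A forward
    pred = h @ W2                        # module B forward
    err = pred - y[:, None]
    # Module A updates immediately with the *delayed* upstream gradient,
    # so it does not wait for module B's backward pass.
    W1 -= lr * X.T @ delayed / len(X)
    # Module B updates with the fresh gradient and emits the next upstream one.
    W2 -= lr * h.T @ err / len(X)
    delayed = err @ W2.T                 # becomes module A's gradient next step

print("final MSE:", float((err ** 2).mean()))  # drops despite the one-step delay
```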

Assessing Gender Bias in Machine Translation – A Case Study with Google Translate

Title Assessing Gender Bias in Machine Translation – A Case Study with Google Translate
Authors Marcelo O. R. Prates, Pedro H. C. Avelar, Luis Lamb
Abstract Recently there has been a growing concern about machine bias, where trained statistical models grow to reflect controversial societal asymmetries, such as gender or racial bias. A significant number of AI tools have recently been suggested to be harmfully biased towards some minority, with reports of racist criminal behavior predictors, the iPhone X failing to differentiate between two Asian people and Google Photos mistakenly classifying black people as gorillas. Although a systematic study of such biases can be difficult, we believe that automated translation tools can be exploited through gender neutral languages to yield a window into the phenomenon of gender bias in AI. In this paper, we start with a comprehensive list of job positions from the U.S. Bureau of Labor Statistics (BLS) and use it to build sentences in constructions like “He/She is an Engineer” in 12 different gender neutral languages such as Hungarian, Chinese, Yoruba, and several others. We translate these sentences into English using the Google Translate API, and collect statistics about the frequency of female, male and gender-neutral pronouns in the translated output. We show that GT exhibits a strong tendency towards male defaults, in particular for fields linked to unbalanced gender distribution such as STEM jobs. We run these statistics against BLS data for the frequency of female participation in each job position, showing that GT fails to reproduce a real-world distribution of female workers. We provide experimental evidence that even if one does not expect in principle a 50:50 pronominal gender distribution, GT yields male defaults much more frequently than what would be expected from demographic data alone. We are hopeful that this work will ignite a debate about the need to augment current statistical translation tools with debiasing techniques which can already be found in the scientific literature.
Tasks Machine Translation
Published 2018-09-06
URL http://arxiv.org/abs/1809.02208v4
PDF http://arxiv.org/pdf/1809.02208v4.pdf
PWC https://paperswithcode.com/paper/assessing-gender-bias-in-machine-translation
Repo https://github.com/marceloprates/Gender-Bias
Framework none
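
The counting step of the methodology is straightforward to sketch: translate a gender-neutral template per occupation and tally which pronoun the English output defaults to. The translate function below is a stand-in for Google Translate API calls, and the Hungarian template is a hypothetical example.

```python
# Sketch: tally the gendered pronoun each translated sentence defaults to.
# `translate` is a stub standing in for calls to the Google Translate API.
from collections import Counter

def pronoun_default(sentence_en):
    first = sentence_en.lower().split()[0]
    return {"he": "male", "she": "female",
            "it": "neutral", "they": "neutral"}.get(first, "other")

def tally(occupations, translate):
    counts = {job: Counter() for job in occupations}
    for job in occupations:
        # Hungarian "ő" is gender-neutral; the English output reveals the default.
        out = translate(f"ő egy {job}")  # hypothetical source template
        counts[job][pronoun_default(out)] += 1
    return counts

fake_translate = lambda s: "He is an engineer."  # stand-in for the API
print(tally(["engineer"], fake_translate))
```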

Task-Embedded Control Networks for Few-Shot Imitation Learning

Title Task-Embedded Control Networks for Few-Shot Imitation Learning
Authors Stephen James, Michael Bloesch, Andrew J. Davison
Abstract Much like humans, robots should have the ability to leverage knowledge from previously learned tasks in order to learn new tasks quickly in new and unfamiliar environments. Despite this, most robot learning approaches have focused on learning a single task, from scratch, with a limited notion of generalisation, and no way of leveraging the knowledge to learn other tasks more efficiently. One possible solution is meta-learning, but many of the related approaches are limited in their ability to scale to a large number of tasks and to learn further tasks without forgetting previously learned ones. With this in mind, we introduce Task-Embedded Control Networks, which employ ideas from metric learning in order to create a task embedding that can be used by a robot to learn new tasks from one or more demonstrations. In the area of visually-guided manipulation, we present simulation results in which we surpass the performance of a state-of-the-art method when using only visual information from each demonstration. Additionally, we demonstrate that our approach can also be used in conjunction with domain randomisation to train our few-shot learning ability in simulation and then deploy in the real world without any additional training. Once deployed, the robot can learn new tasks from a single real-world demonstration.
Tasks Few-Shot Imitation Learning, Few-Shot Learning, Imitation Learning, Meta-Learning, Metric Learning
Published 2018-10-08
URL http://arxiv.org/abs/1810.03237v1
PDF http://arxiv.org/pdf/1810.03237v1.pdf
PWC https://paperswithcode.com/paper/task-embedded-control-networks-for-few-shot
Repo https://github.com/stepjam/PyRep
Framework none
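
A schematic sketch of the task-embedding idea from the abstract: encode one or more demonstrations, average them into a normalized task embedding, and condition the control network on it; a metric-learning loss (not shown) would pull embeddings of the same task together. Layer sizes and observation/action dimensions are assumptions, not the paper’s architecture.

```python
# Sketch: condition a control network on a task embedding computed from a few
# demonstrations. Dimensions and modules are illustrative stand-ins.
import torch

obs_dim, act_dim, emb_dim = 10, 4, 8
task_encoder = torch.nn.Sequential(torch.nn.Linear(obs_dim, 32),
                                   torch.nn.ReLU(),
                                   torch.nn.Linear(32, emb_dim))
control = torch.nn.Linear(obs_dim + emb_dim, act_dim)

def embed_task(demos):
    # demos: (n_demos, horizon, obs_dim) -> one normalized task embedding.
    z = task_encoder(demos).mean(dim=(0, 1))
    return z / z.norm()

def act(obs, task_embedding):
    # The same controller serves any task, steered by the embedding.
    return control(torch.cat([obs, task_embedding]))

demos = torch.randn(2, 5, obs_dim)   # two demonstrations of a new task
print(act(torch.randn(obs_dim), embed_task(demos)))
```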