October 15, 2019

2253 words 11 mins read

Paper Group NANR 231


Lifelong Inverse Reinforcement Learning. Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text. Understanding Degeneracies and Ambiguities in Attribute Transfer. Keynote: Unveiling the Linguistic Weaknesses of Neural MT. Task-oriented Dialogue System for Automatic Diagnosis. Multicalibration: Calibration for th …

Lifelong Inverse Reinforcement Learning

Title Lifelong Inverse Reinforcement Learning
Authors Jorge Armando Mendez Mendez, Shashank Shivkumar, Eric Eaton
Abstract Methods for learning from demonstration (LfD) have shown success in acquiring behavior policies by imitating a user. However, even for a single task, LfD may require numerous demonstrations. For versatile agents that must learn many tasks via demonstration, this process would substantially burden the user if each task were learned in isolation. To address this challenge, we introduce the novel problem of lifelong learning from demonstration, which allows the agent to continually build upon knowledge learned from previously demonstrated tasks to accelerate the learning of new tasks, reducing the number of demonstrations required. As one solution to this problem, we propose the first lifelong learning approach to inverse reinforcement learning, which learns consecutive tasks via demonstration, continually transferring knowledge between tasks to improve performance.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7702-lifelong-inverse-reinforcement-learning
PDF http://papers.nips.cc/paper/7702-lifelong-inverse-reinforcement-learning.pdf
PWC https://paperswithcode.com/paper/lifelong-inverse-reinforcement-learning
Repo
Framework
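
The abstract above stays at a high level, so the following is only a minimal sketch of the kind of shared-knowledge factorization a lifelong IRL learner could use: linear per-task reward weights expressed through a shared basis `L` and task-specific codes `s_t`. The names and the ridge-regression encoding step are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

# Illustrative only: per-task linear reward weights theta_t are factored through
# a shared basis L (d x k) and a task-specific code s_t (k,), so knowledge from
# earlier demonstrated tasks is reused when a new task arrives.
rng = np.random.default_rng(0)
d, k = 8, 3                      # reward-feature dimension, latent basis size
L = rng.normal(size=(d, k))      # shared knowledge base (refined across tasks)

def reward(features, s_t):
    """Reward of a state under task t: phi(x) . (L @ s_t)."""
    return features @ (L @ s_t)

def encode_new_task(theta_hat, lam=0.1):
    """Encode a single-task reward estimate (e.g. from an IRL step on the new
    task's demonstrations) against the shared basis via ridge regression."""
    return np.linalg.solve(L.T @ L + lam * np.eye(k), L.T @ theta_hat)

theta_new = rng.normal(size=d)   # stand-in for an IRL estimate on a new task
s_new = encode_new_task(theta_new)
print("reconstruction error:", np.linalg.norm(L @ s_new - theta_new))
```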

Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text

Title Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text
Authors
Abstract
Tasks
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6100/
PDF https://www.aclweb.org/anthology/W18-6100
PWC https://paperswithcode.com/paper/proceedings-of-the-2018-emnlp-workshop-w-nut
Repo
Framework

Understanding Degeneracies and Ambiguities in Attribute Transfer

Title Understanding Degeneracies and Ambiguities in Attribute Transfer
Authors Attila Szabo, Qiyang Hu, Tiziano Portenier, Matthias Zwicker, Paolo Favaro
Abstract We study the problem of building models that can transfer selected attributes from one image to another without affecting the other attributes. Towards this goal, we develop an analysis and a training methodology for autoencoding models whose encoded features aim to disentangle attributes. These features are explicitly split into two components: one that should represent attributes in common between pairs of images, and another that should represent attributes that change between pairs of images. We show that achieving this objective faces two main challenges: one is that the model may learn degenerate mappings, which we call the shortcut problem, and the other is that the attribute representation for an image is not guaranteed to follow the same interpretation on another image, which we call the reference ambiguity. To address the shortcut problem, we introduce novel constraints on image pairs and triplets and show their effectiveness both analytically and experimentally. In the case of the reference ambiguity, we formally prove that a model that guarantees an ideal feature separation cannot be built. We validate our findings on several datasets and show that, surprisingly, trained neural networks often do not exhibit the reference ambiguity.
Tasks
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Attila_Szabo_Understanding_Degeneracies_and_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Attila_Szabo_Understanding_Degeneracies_and_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/understanding-degeneracies-and-ambiguities-in
Repo
Framework
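
The split into a "common" component and a "varying" component can be made concrete with a toy autoencoder; the module below is a hedged illustration (not the authors' architecture or constraints), where the decoder is fed the common code of one image and the varying code of another, which is exactly the operation on which the shortcut problem and the reference ambiguity arise.

```python
import torch
import torch.nn as nn

class SplitAutoencoder(nn.Module):
    """Toy encoder that splits the latent code into a 'common' part and a
    'varying' part; swapping the varying part between two images is the
    attribute-transfer operation discussed in the abstract."""
    def __init__(self, in_dim=784, common_dim=16, varying_dim=16):
        super().__init__()
        self.common_dim = common_dim
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, common_dim + varying_dim))
        self.decoder = nn.Sequential(nn.Linear(common_dim + varying_dim, 256),
                                     nn.ReLU(), nn.Linear(256, in_dim))

    def split(self, x):
        z = self.encoder(x)
        return z[:, :self.common_dim], z[:, self.common_dim:]

    def swap_decode(self, x_a, x_b):
        c_a, _ = self.split(x_a)          # keep the common attributes of x_a
        _, v_b = self.split(x_b)          # take the varying attributes of x_b
        return self.decoder(torch.cat([c_a, v_b], dim=1))

model = SplitAutoencoder()
x_a, x_b = torch.randn(4, 784), torch.randn(4, 784)
x_transfer = model.swap_decode(x_a, x_b)  # x_a with x_b's varying attributes
```

A degenerate (shortcut) solution could route all image information through one of the two components, which is why the paper introduces constraints on image pairs and triplets.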

Keynote: Unveiling the Linguistic Weaknesses of Neural MT

Title Keynote: Unveiling the Linguistic Weaknesses of Neural MT
Authors Arianna Bisazza
Abstract
Tasks Machine Translation
Published 2018-03-01
URL https://www.aclweb.org/anthology/W18-1801/
PDF https://www.aclweb.org/anthology/W18-1801
PWC https://paperswithcode.com/paper/keynote-unveiling-the-linguistic-weaknesses
Repo
Framework

Task-oriented Dialogue System for Automatic Diagnosis

Title Task-oriented Dialogue System for Automatic Diagnosis
Authors Zhongyu Wei, Qianlong Liu, Baolin Peng, Huaixiao Tou, Ting Chen, Xuanjing Huang, Kam-fai Wong, Xiangying Dai
Abstract In this paper, we make a move to build a dialogue system for automatic diagnosis. We first build a dataset collected from an online medical forum by extracting symptoms from both patients' self-reports and conversational data between patients and doctors. Then we propose a task-oriented dialogue system framework to make diagnoses for patients automatically, which can converse with patients to collect additional symptoms beyond their self-reports. Experimental results on our dataset show that additional symptoms extracted from conversation can greatly improve the accuracy of disease identification, and that our dialogue system is able to collect these symptoms automatically and make a better diagnosis.
Tasks
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2033/
PDF https://www.aclweb.org/anthology/P18-2033
PWC https://paperswithcode.com/paper/task-oriented-dialogue-system-for-automatic
Repo
Framework
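
As a hedged toy sketch of the loop described above (start from self-reported symptoms, ask for additional symptoms, then commit to a diagnosis), the snippet below uses an invented disease/symptom table and a greedy question-selection rule; neither is taken from the paper.

```python
# Toy illustration of "collect extra symptoms, then diagnose"; the disease
# profiles and the greedy inquiry rule are made up for this sketch.
DISEASES = {
    "flu":        {"fever", "cough", "fatigue"},
    "allergy":    {"sneezing", "itchy eyes"},
    "bronchitis": {"cough", "chest pain", "fatigue"},
}

def diagnose(self_report, max_questions=3, ask=lambda symptom: False):
    known = set(self_report)
    for _ in range(max_questions):
        plausible = [p for p in DISEASES.values() if p & known]
        candidates = set().union(*plausible) - known if plausible else set()
        if not candidates:
            break
        # Ask about the symptom shared by the most still-plausible diseases
        # (a stand-in for one dialogue turn with the patient).
        query = max(candidates, key=lambda s: sum(s in p for p in plausible))
        if ask(query):
            known.add(query)
    best = max(DISEASES, key=lambda d: len(DISEASES[d] & known))
    return best, known

print(diagnose({"cough"}, ask=lambda s: s == "fatigue"))
```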

Multicalibration: Calibration for the (Computationally-Identifiable) Masses

Title Multicalibration: Calibration for the (Computationally-Identifiable) Masses
Authors Ursula Hebert-Johnson, Michael Kim, Omer Reingold, Guy Rothblum
Abstract We develop and study multicalibration as a new measure of fairness in machine learning that aims to mitigate inadvertent or malicious discrimination that is introduced at training time (even from ground truth data). Multicalibration guarantees meaningful (calibrated) predictions for every subpopulation that can be identified within a specified class of computations. The specified class can be quite rich; in particular, it can contain many overlapping subgroups of a protected group. We demonstrate that in many settings this strong notion of protection from discrimination is provably attainable and aligned with the goal of obtaining accurate predictions. Along the way, we present algorithms for learning a multicalibrated predictor, study the computational complexity of this task, and illustrate tight connections to the agnostic learning model.
Tasks Calibration
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2448
PDF http://proceedings.mlr.press/v80/hebert-johnson18a/hebert-johnson18a.pdf
PWC https://paperswithcode.com/paper/multicalibration-calibration-for-the
Repo
Framework
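
A hedged sketch of the post-processing flavor of multicalibration: repeatedly look for a (subgroup, prediction-bin) cell on which the predictor is miscalibrated by more than a tolerance alpha and patch the predictions in that cell. The binning and the additive update below are simplifications for illustration, not the paper's exact algorithm or guarantees.

```python
import numpy as np

def multicalibrate(preds, labels, subgroups, alpha=0.01, bins=10, max_iters=100):
    """preds, labels: arrays with values in [0, 1]; subgroups: list of boolean
    masks (possibly overlapping) identifying the subpopulations to protect."""
    preds = preds.copy()
    edges = np.linspace(0.0, 1.0, bins + 1)
    for _ in range(max_iters):
        updated = False
        for mask in subgroups:
            bin_ids = np.clip(np.digitize(preds, edges) - 1, 0, bins - 1)
            for b in range(bins):
                cell = mask & (bin_ids == b)
                if not cell.any():
                    continue
                gap = labels[cell].mean() - preds[cell].mean()
                if abs(gap) > alpha:                  # miscalibrated on this cell
                    preds[cell] = np.clip(preds[cell] + gap, 0.0, 1.0)
                    updated = True
        if not updated:                               # calibrated on every cell checked
            break
    return preds

rng = np.random.default_rng(0)
labels = (rng.uniform(size=1000) < 0.4).astype(float)
preds = rng.uniform(size=1000)
groups = [rng.uniform(size=1000) < 0.3, rng.uniform(size=1000) < 0.5]
calibrated = multicalibrate(preds, labels, groups)
```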

SimpleNLG-ZH: a Linguistic Realisation Engine for Mandarin

Title SimpleNLG-ZH: a Linguistic Realisation Engine for Mandarin
Authors Guanyi Chen, Kees van Deemter, Chenghua Lin
Abstract We introduce SimpleNLG-ZH, a realisation engine for Mandarin that follows the software design paradigm of SimpleNLG (Gatt and Reiter, 2009). We explain the core grammar (morphology and syntax) and the lexicon of SimpleNLG-ZH, which is very different from English and other languages for which SimpleNLG engines have been built. The system was evaluated by regenerating expressions from a body of test sentences and a corpus of human-authored expressions. Human evaluation was conducted to estimate the quality of regenerated sentences.
Tasks Morphological Inflection, Text Generation
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6506/
PDF https://www.aclweb.org/anthology/W18-6506
PWC https://paperswithcode.com/paper/simplenlg-zh-a-linguistic-realisation-engine
Repo
Framework

How Many Samples are Needed to Estimate a Convolutional Neural Network?

Title How Many Samples are Needed to Estimate a Convolutional Neural Network?
Authors Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan R. Salakhutdinov, Aarti Singh
Abstract A widespread folklore for explaining the success of Convolutional Neural Networks (CNNs) is that CNNs use a more compact representation than the Fully-connected Neural Network (FNN) and thus require fewer training samples to accurately estimate their parameters. We initiate the study of rigorously characterizing the sample complexity of estimating CNNs. We show that for an $m$-dimensional convolutional filter with linear activation acting on a $d$-dimensional input, the sample complexity of achieving population prediction error of $\epsilon$ is $\widetilde{O}(m/\epsilon^2)$, whereas the sample complexity for its FNN counterpart is lower bounded by $\Omega(d/\epsilon^2)$ samples. Since in typical settings $m \ll d$, this result demonstrates the advantage of using a CNN. We further consider the sample complexity of estimating a one-hidden-layer CNN with linear activation where both the $m$-dimensional convolutional filter and the $r$-dimensional output weights are unknown. For this model, we show that the sample complexity is $\widetilde{O}\left((m+r)/\epsilon^2\right)$ when the ratio between the stride size and the filter size is a constant. For both models, we also present lower bounds showing our sample complexities are tight up to logarithmic factors. Our main tools for deriving these results are a localized empirical process analysis and a new lemma characterizing the convolutional structure. We believe that these tools may inspire further developments in understanding CNNs.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7320-how-many-samples-are-needed-to-estimate-a-convolutional-neural-network
PDF http://papers.nips.cc/paper/7320-how-many-samples-are-needed-to-estimate-a-convolutional-neural-network.pdf
PWC https://paperswithcode.com/paper/how-many-samples-are-needed-to-estimate-a-1
Repo
Framework
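
For reference, the sample-complexity statements quoted in the abstract, collected in one place ($m$ is the filter size, $d$ the input dimension, $r$ the output-weight dimension, $\epsilon$ the target population prediction error, and $n$ the number of samples):

```latex
\begin{align*}
\text{single convolutional filter, linear activation:} \quad
  & n = \widetilde{O}\!\left(m/\epsilon^{2}\right) \\
\text{fully-connected counterpart (lower bound):} \quad
  & n = \Omega\!\left(d/\epsilon^{2}\right) \\
\text{one-hidden-layer CNN (filter and output weights):} \quad
  & n = \widetilde{O}\!\left((m+r)/\epsilon^{2}\right)
\end{align*}
```

Since $m \ll d$ in typical settings (a small filter versus a full-dimensional input), the first bound is far smaller than the second, which is the claimed advantage of the convolutional structure.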

A Brief Introduction to Natural Language Generation within Computational Creativity

Title A Brief Introduction to Natural Language Generation within Computational Creativity
Authors Ben Burtenshaw
Abstract
Tasks Text Generation
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6601/
PDF https://www.aclweb.org/anthology/W18-6601
PWC https://paperswithcode.com/paper/a-brief-introduction-to-natural-language
Repo
Framework

A probabilistic population code based on neural samples

Title A probabilistic population code based on neural samples
Authors Sabyasachi Shivkumar, Richard Lange, Ankani Chattoraj, Ralf Haefner
Abstract Sensory processing is often characterized as implementing probabilistic inference: networks of neurons compute posterior beliefs over unobserved causes given the sensory inputs. How these beliefs are computed and represented by neural responses is much-debated (Fiser et al. 2010, Pouget et al. 2013). A central debate concerns the question of whether neural responses represent samples of latent variables (Hoyer & Hyvarinnen 2003) or parameters of their distributions (Ma et al. 2006) with efforts being made to distinguish between them (Grabska-Barwinska et al. 2013). A separate debate addresses the question of whether neural responses are proportionally related to the encoded probabilities (Barlow 1969), or proportional to the logarithm of those probabilities (Jazayeri & Movshon 2006, Ma et al. 2006, Beck et al. 2012). Here, we show that these alternatives – contrary to common assumptions – are not mutually exclusive and that the very same system can be compatible with all of them. As a central analytical result, we show that modeling neural responses in area V1 as samples from a posterior distribution over latents in a linear Gaussian model of the image implies that those neural responses form a linear Probabilistic Population Code (PPC, Ma et al. 2006). In particular, the posterior distribution over some experimenter-defined variable like “orientation” is part of the exponential family with sufficient statistics that are linear in the neural sampling-based firing rates.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7938-a-probabilistic-population-code-based-on-neural-samples
PDF http://papers.nips.cc/paper/7938-a-probabilistic-population-code-based-on-neural-samples.pdf
PWC https://paperswithcode.com/paper/a-probabilistic-population-code-based-on
Repo
Framework
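
A hedged reconstruction of the central claim in the form the abstract states it, assuming a linear Gaussian model of the image with latents $s$ and treating the firing rates $r$ as (scaled) counts of posterior samples:

```latex
% Linear Gaussian image model with latent intensities s:
%   x = A s + n,   n ~ N(0, \sigma^{2} I).
% Neural responses in V1 are modeled as samples from the posterior p(s | x).
% For an experimenter-defined variable such as orientation \theta, the abstract
% states that the implied posterior is exponential-family with sufficient
% statistics linear in the sampling-based firing rates r:
\[
  p(\theta \mid \mathbf{r}) \;\propto\; g(\theta)\,
  \exp\!\Big( \textstyle\sum_{i} h_{i}(\theta)\, r_{i} \Big),
\]
% which is precisely the form of a linear probabilistic population code (PPC).
```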

Learning Distributional Token Representations from Visual Features

Title Learning Distributional Token Representations from Visual Features
Authors Samuel Broscheit
Abstract In this study, we compare token representations constructed from visual features (i.e., pixels) with standard lookup-based embeddings. Our goal is to gain insight about the challenges of encoding a text representation from low-level features, e.g. from characters or pixels. We focus on Chinese, which, as a logographic language, has properties that make a representation via visual features challenging and interesting. To train and evaluate different models for the token representation, we chose the task of character-based neural machine translation (NMT) from Chinese to English. We found that a token representation computed only from visual features can achieve competitive results to lookup embeddings. However, we also show different strengths and weaknesses in the models' performance in a part-of-speech tagging task and also a semantic similarity task. In summary, we show that it is possible to achieve a text representation only from pixels. We hope that this is a useful stepping stone for future studies that exclusively rely on visual input, or aim at exploiting visual features of written language.
Tasks Machine Translation, Part-Of-Speech Tagging, Representation Learning, Semantic Similarity, Semantic Textual Similarity, Tokenization
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3025/
PDF https://www.aclweb.org/anthology/W18-3025
PWC https://paperswithcode.com/paper/learning-distributional-token-representations
Repo
Framework
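
A hedged sketch of the pixel-based token representation, assuming each character is already rendered as a small glyph bitmap; the CNN shape and embedding size below are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class GlyphEncoder(nn.Module):
    """Map a 1x32x32 glyph bitmap to a token embedding, replacing the usual
    lookup table; the architecture is illustrative only."""
    def __init__(self, emb_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # -> 32x16x16
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # -> 64x8x8
        )
        self.proj = nn.Linear(64 * 8 * 8, emb_dim)

    def forward(self, glyphs):               # glyphs: (batch, 1, 32, 32)
        h = self.conv(glyphs)
        return self.proj(h.flatten(start_dim=1))

# A batch of rendered character images stands in for a source sentence; the
# resulting vectors would feed the NMT encoder in place of lookup embeddings.
embeddings = GlyphEncoder()(torch.rand(10, 1, 32, 32))   # shape (10, 256)
```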

Learning Deep Generative Models With Discrete Latent Variables

Title Learning Deep Generative Models With Discrete Latent Variables
Authors Hengyuan Hu, Ruslan Salakhutdinov
Abstract There have been numerous recent advancements in learning deep generative models with latent variables, thanks to the reparameterization trick that allows deep directed models to be trained effectively. However, since the reparameterization trick only works on continuous variables, deep generative models with discrete latent variables still remain hard to train and perform considerably worse than their continuous counterparts. In this paper, we attempt to shrink this gap by introducing a new architecture and its learning procedure. We develop a hybrid generative model with binary latent variables that consists of an undirected graphical model and a deep neural network. We propose an efficient two-stage pretraining and training procedure that is crucial for learning these models. Experiments on binarized digits and images of natural scenes demonstrate that our model achieves close to state-of-the-art performance in terms of density estimation and is capable of generating coherent images of natural scenes.
Tasks Density Estimation
Published 2018-01-01
URL https://openreview.net/forum?id=SkZ-BnyCW
PDF https://openreview.net/pdf?id=SkZ-BnyCW
PWC https://paperswithcode.com/paper/learning-deep-generative-models-with-discrete
Repo
Framework
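
A hedged sketch of the generative direction only: binary latents drawn from an RBM-style undirected prior via block Gibbs sampling, then mapped through a deterministic neural decoder. The auxiliary units in the prior are an assumption of this sketch, and the two-stage pretraining the abstract emphasizes is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# RBM-style undirected prior over binary latents h, with auxiliary units v
# (the auxiliary layer is an assumption of this sketch, not the paper's model).
n_h, n_v, x_dim = 64, 32, 784
W = 0.01 * rng.normal(size=(n_v, n_h))
b_v, b_h = np.zeros(n_v), np.zeros(n_h)

def sample_binary_latents(gibbs_steps=50):
    h = (rng.uniform(size=n_h) < 0.5).astype(float)
    for _ in range(gibbs_steps):             # block Gibbs over v and h
        v = (rng.uniform(size=n_v) < sigmoid(W @ h + b_v)).astype(float)
        h = (rng.uniform(size=n_h) < sigmoid(W.T @ v + b_h)).astype(float)
    return h

# Deep deterministic decoder from binary latents to pixel means.
W1 = 0.01 * rng.normal(size=(256, n_h))
W2 = 0.01 * rng.normal(size=(x_dim, 256))
decode = lambda h: sigmoid(W2 @ np.tanh(W1 @ h))

x_sample = decode(sample_binary_latents())   # one generated sample (pixel means)
```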

Efficient Neural Architecture Search via Parameters Sharing

Title Efficient Neural Architecture Search via Parameters Sharing
Authors Hieu Pham, Melody Guan, Barret Zoph, Quoc Le, Jeff Dean
Abstract We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach for automatic model design. ENAS constructs a large computational graph, where each subgraph represents a neural network architecture, hence forcing all architectures to share their parameters. A controller is trained with policy gradient to search for a subgraph that maximizes the expected reward on a validation set. Meanwhile, a model corresponding to the selected subgraph is trained to minimize a canonical cross-entropy loss. Sharing parameters among child models allows ENAS to deliver strong empirical performance while using far fewer GPU-hours than existing automatic model design approaches, and notably being 1000x less expensive than standard Neural Architecture Search. On Penn Treebank, ENAS discovers a novel architecture that achieves a test perplexity of 56.3, on par with the existing state-of-the-art among all methods without post-training processing. On CIFAR-10, ENAS finds a novel architecture that achieves 2.89% test error, which is on par with the 2.65% test error of NASNet (Zoph et al., 2018).
Tasks Neural Architecture Search
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2247
PDF http://proceedings.mlr.press/v80/pham18a/pham18a.pdf
PWC https://paperswithcode.com/paper/efficient-neural-architecture-search-via
Repo
Framework
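
A hedged sketch of the controller's policy-gradient loop: sample discrete architecture decisions, evaluate the corresponding child model (which would reuse the shared parameters), and use the validation reward for a REINFORCE update. The decision space and the reward function below are placeholders, not the paper's search space.

```python
import torch

# Stand-in controller: independent categorical logits over a few architecture
# decisions (the paper uses an RNN controller; this is a simplification).
num_decisions, num_choices = 4, 5
logits = torch.zeros(num_decisions, num_choices, requires_grad=True)
optimizer = torch.optim.Adam([logits], lr=0.05)
baseline = 0.0

def evaluate_child(arch):
    """Placeholder for briefly training the sampled subgraph with shared
    weights and measuring validation accuracy; here a dummy reward that
    favors choice 2 at every position."""
    return sum(c == 2 for c in arch) / len(arch)

for step in range(200):
    dist = torch.distributions.Categorical(logits=logits)
    arch = dist.sample()                        # one decision per position
    reward = evaluate_child(arch.tolist())
    baseline = 0.9 * baseline + 0.1 * reward    # moving-average baseline
    loss = -(reward - baseline) * dist.log_prob(arch).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(logits.argmax(dim=1))                     # converges toward choice 2
```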

Proceedings of the 7th workshop on NLP for Computer Assisted Language Learning

Title Proceedings of the 7th workshop on NLP for Computer Assisted Language Learning
Authors
Abstract
Tasks
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-7100/
PDF https://www.aclweb.org/anthology/W18-7100
PWC https://paperswithcode.com/paper/proceedings-of-the-7th-workshop-on-nlp-for
Repo
Framework

A Mathematical Model For Optimal Decisions In A Representative Democracy

Title A Mathematical Model For Optimal Decisions In A Representative Democracy
Authors Malik Magdon-Ismail, Lirong Xia
Abstract Direct democracy, where each voter casts one vote, fails when the average voter competence falls below 50%. This happens in noisy settings when voters have limited information. Representative democracy, where voters choose representatives to vote, can be an elixir in both these situations. We introduce a mathematical model for studying representative democracy, in particular understanding the parameters of a representative democracy that gives maximum decision making capability. Our main result states that under general and natural conditions, 1. for fixed voting cost, the optimal number of representatives is linear; 2. for polynomial cost, the optimal number of representatives is logarithmic.
Tasks Decision Making
Published 2018-12-01
URL http://papers.nips.cc/paper/7720-a-mathematical-model-for-optimal-decisions-in-a-representative-democracy
PDF http://papers.nips.cc/paper/7720-a-mathematical-model-for-optimal-decisions-in-a-representative-democracy.pdf
PWC https://paperswithcode.com/paper/a-mathematical-model-for-optimal-decisions-in
Repo
Framework
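
As a small numerical illustration of the trade-off the model studies (not the paper's utility or cost functions): k independent representatives, each correct with probability p, decide by majority vote, and each additional representative incurs a fixed cost.

```python
from math import comb

def majority_correct(k, p):
    """Probability that a majority of k independent representatives, each
    correct with probability p, makes the right decision (k odd, so no ties)."""
    return sum(comb(k, i) * p**i * (1 - p)**(k - i)
               for i in range(k // 2 + 1, k + 1))

def best_committee_size(p=0.6, cost_per_rep=0.002, max_k=301):
    # Toy objective: decision quality minus a linear cost in committee size.
    odd_sizes = range(1, max_k + 1, 2)
    return max(odd_sizes, key=lambda k: majority_correct(k, p) - cost_per_rep * k)

print(best_committee_size(p=0.6))   # modest competence: a sizeable committee pays off
print(best_committee_size(p=0.9))   # high competence: a handful of representatives suffices
```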