October 15, 2019

2253 words 11 mins read

Paper Group NANR 231


Lifelong Inverse Reinforcement Learning. Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text. Understanding Degeneracies and Ambiguities in Attribute Transfer. Keynote: Unveiling the Linguistic Weaknesses of Neural MT. Task-oriented Dialogue System for Automatic Diagnosis. Multicalibration: Calibration for th …

Lifelong Inverse Reinforcement Learning

Title Lifelong Inverse Reinforcement Learning
Authors Jorge Armando Mendez Mendez, Shashank Shivkumar, Eric Eaton
Abstract Methods for learning from demonstration (LfD) have shown success in acquiring behavior policies by imitating a user. However, even for a single task, LfD may require numerous demonstrations. For versatile agents that must learn many tasks via demonstration, this process would substantially burden the user if each task were learned in isolation. To address this challenge, we introduce the novel problem of lifelong learning from demonstration, which allows the agent to continually build upon knowledge learned from previously demonstrated tasks to accelerate the learning of new tasks, reducing the number of demonstrations required. As one solution to this problem, we propose the first lifelong learning approach to inverse reinforcement learning, which learns consecutive tasks via demonstration, continually transferring knowledge between tasks to improve performance.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7702-lifelong-inverse-reinforcement-learning
PDF http://papers.nips.cc/paper/7702-lifelong-inverse-reinforcement-learning.pdf
PWC https://paperswithcode.com/paper/lifelong-inverse-reinforcement-learning
Repo
Framework
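
The abstract above stays at a high level, so the following is only a minimal sketch of the kind of shared-knowledge factorization a lifelong IRL learner could use: linear per-task reward weights expressed through a shared basis `L` and task-specific codes `s_t`. The names and the ridge-regression encoding step are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

# Illustrative only: per-task linear reward weights theta_t are factored through
# a shared basis L (d x k) and a task-specific code s_t (k,), so knowledge from
# earlier demonstrated tasks is reused when a new task arrives.
rng = np.random.default_rng(0)
d, k = 8, 3                      # reward-feature dimension, latent basis size
L = rng.normal(size=(d, k))      # shared knowledge base (refined across tasks)

def reward(features, s_t):
    """Reward of a state under task t: phi(x) . (L @ s_t)."""
    return features @ (L @ s_t)

def encode_new_task(theta_hat, lam=0.1):
    """Encode a single-task reward estimate (e.g. from an IRL step on the new
    task's demonstrations) against the shared basis via ridge regression."""
    return np.linalg.solve(L.T @ L + lam * np.eye(k), L.T @ theta_hat)

theta_new = rng.normal(size=d)   # stand-in for an IRL estimate on a new task
s_new = encode_new_task(theta_new)
print("reconstruction error:", np.linalg.norm(L @ s_new - theta_new))
```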

Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text

Title Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text
Authors
Abstract
Tasks
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6100/
PDF https://www.aclweb.org/anthology/W18-6100
PWC https://paperswithcode.com/paper/proceedings-of-the-2018-emnlp-workshop-w-nut
Repo
Framework

Understanding Degeneracies and Ambiguities in Attribute Transfer

Title Understanding Degeneracies and Ambiguities in Attribute Transfer
Authors Attila Szabo, Qiyang Hu, Tiziano Portenier, Matthias Zwicker, Paolo Favaro
Abstract We study the problem of building models that can transfer selected attributes from one image to another without affecting the other attributes. Towards this goal, we develop an analysis and a training methodology for autoencoding models whose encoded features aim to disentangle attributes. These features are explicitly split into two components: one that should represent attributes in common between pairs of images, and another that should represent attributes that change between pairs of images. We show that achieving this objective faces two main challenges: one is that the model may learn degenerate mappings, which we call the shortcut problem, and the other is that the attribute representation for an image is not guaranteed to follow the same interpretation on another image, which we call the reference ambiguity. To address the shortcut problem, we introduce novel constraints on image pairs and triplets and show their effectiveness both analytically and experimentally. In the case of the reference ambiguity, we formally prove that a model that guarantees an ideal feature separation cannot be built. We validate our findings on several datasets and show that, surprisingly, trained neural networks often do not exhibit the reference ambiguity.
Tasks
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Attila_Szabo_Understanding_Degeneracies_and_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Attila_Szabo_Understanding_Degeneracies_and_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/understanding-degeneracies-and-ambiguities-in
Repo
Framework
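
The split into a "common" component and a "varying" component can be made concrete with a toy autoencoder; the module below is a hedged illustration (not the authors' architecture or constraints), where the decoder is fed the common code of one image and the varying code of another, which is exactly the operation on which the shortcut problem and the reference ambiguity arise.

```python
import torch
import torch.nn as nn

class SplitAutoencoder(nn.Module):
    """Toy encoder that splits the latent code into a 'common' part and a
    'varying' part; swapping the varying part between two images is the
    attribute-transfer operation discussed in the abstract."""
    def __init__(self, in_dim=784, common_dim=16, varying_dim=16):
        super().__init__()
        self.common_dim = common_dim
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, common_dim + varying_dim))
        self.decoder = nn.Sequential(nn.Linear(common_dim + varying_dim, 256),
                                     nn.ReLU(), nn.Linear(256, in_dim))

    def split(self, x):
        z = self.encoder(x)
        return z[:, :self.common_dim], z[:, self.common_dim:]

    def swap_decode(self, x_a, x_b):
        c_a, _ = self.split(x_a)          # keep the common attributes of x_a
        _, v_b = self.split(x_b)          # take the varying attributes of x_b
        return self.decoder(torch.cat([c_a, v_b], dim=1))

model = SplitAutoencoder()
x_a, x_b = torch.randn(4, 784), torch.randn(4, 784)
x_transfer = model.swap_decode(x_a, x_b)  # x_a with x_b's varying attributes
```

A degenerate (shortcut) solution could route all image information through one of the two components, which is why the paper introduces constraints on image pairs and triplets.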

Keynote: Unveiling the Linguistic Weaknesses of Neural MT

Title Keynote: Unveiling the Linguistic Weaknesses of Neural MT
Authors Arianna Bisazza
Abstract
Tasks Machine Translation
Published 2018-03-01
URL https://www.aclweb.org/anthology/W18-1801/
PDF https://www.aclweb.org/anthology/W18-1801
PWC https://paperswithcode.com/paper/keynote-unveiling-the-linguistic-weaknesses
Repo
Framework

Task-oriented Dialogue System for Automatic Diagnosis

Title Task-oriented Dialogue System for Automatic Diagnosis
Authors Zhongyu Wei, Qianlong Liu, Baolin Peng, Huaixiao Tou, Ting Chen, Xuanjing Huang, Kam-fai Wong, Xiangying Dai
Abstract In this paper, we make a move to build a dialogue system for automatic diagnosis. We first build a dataset collected from an online medical forum by extracting symptoms from both patients' self-reports and conversational data between patients and doctors. Then we propose a task-oriented dialogue system framework to make diagnoses for patients automatically, which can converse with patients to collect additional symptoms beyond their self-reports. Experimental results on our dataset show that additional symptoms extracted from conversation can greatly improve the accuracy of disease identification, and that our dialogue system is able to collect these symptoms automatically and make a better diagnosis.
Tasks
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2033/
PDF https://www.aclweb.org/anthology/P18-2033
PWC https://paperswithcode.com/paper/task-oriented-dialogue-system-for-automatic
Repo
Framework
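
As a hedged toy sketch of the loop described above (start from self-reported symptoms, ask for additional symptoms, then commit to a diagnosis), the snippet below uses an invented disease/symptom table and a greedy question-selection rule; neither is taken from the paper.

```python
# Toy illustration of "collect extra symptoms, then diagnose"; the disease
# profiles and the greedy inquiry rule are made up for this sketch.
DISEASES = {
    "flu":        {"fever", "cough", "fatigue"},
    "allergy":    {"sneezing", "itchy eyes"},
    "bronchitis": {"cough", "chest pain", "fatigue"},
}

def diagnose(self_report, max_questions=3, ask=lambda symptom: False):
    known = set(self_report)
    for _ in range(max_questions):
        plausible = [p for p in DISEASES.values() if p & known]
        candidates = set().union(*plausible) - known if plausible else set()
        if not candidates:
            break
        # Ask about the symptom shared by the most still-plausible diseases
        # (a stand-in for one dialogue turn with the patient).
        query = max(candidates, key=lambda s: sum(s in p for p in plausible))
        if ask(query):
            known.add(query)
    best = max(DISEASES, key=lambda d: len(DISEASES[d] & known))
    return best, known

print(diagnose({"cough"}, ask=lambda s: s == "fatigue"))
```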

Multicalibration: Calibration for the (Computationally-Identifiable) Masses

Title Multicalibration: Calibration for the (Computationally-Identifiable) Masses
Authors Ursula Hebert-Johnson, Michael Kim, Omer Reingold, Guy Rothblum
Abstract We develop and study multicalibration as a new measure of fairness in machine learning that aims to mitigate inadvertent or malicious discrimination that is introduced at training time (even from ground truth data). Multicalibration guarantees meaningful (calibrated) predictions for every subpopulation that can be identified within a specified class of computations. The specified class can be quite rich; in particular, it can contain many overlapping subgroups of a protected group. We demonstrate that in many settings this strong notion of protection from discrimination is provably attainable and aligned with the goal of obtaining accurate predictions. Along the way, we present algorithms for learning a multicalibrated predictor, study the computational complexity of this task, and illustrate tight connections to the agnostic learning model.
Tasks Calibration
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2448
PDF http://proceedings.mlr.press/v80/hebert-johnson18a/hebert-johnson18a.pdf
PWC https://paperswithcode.com/paper/multicalibration-calibration-for-the
Repo
Framework
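
A hedged sketch of the post-processing flavor of multicalibration: repeatedly look for a (subgroup, prediction-bin) cell on which the predictor is miscalibrated by more than a tolerance alpha and patch the predictions in that cell. The binning and the additive update below are simplifications for illustration, not the paper's exact algorithm or guarantees.

```python
import numpy as np

def multicalibrate(preds, labels, subgroups, alpha=0.01, bins=10, max_iters=100):
    """preds, labels: arrays with values in [0, 1]; subgroups: list of boolean
    masks (possibly overlapping) identifying the subpopulations to protect."""
    preds = preds.copy()
    edges = np.linspace(0.0, 1.0, bins + 1)
    for _ in range(max_iters):
        updated = False
        for mask in subgroups:
            bin_ids = np.clip(np.digitize(preds, edges) - 1, 0, bins - 1)
            for b in range(bins):
                cell = mask & (bin_ids == b)
                if not cell.any():
                    continue
                gap = labels[cell].mean() - preds[cell].mean()
                if abs(gap) > alpha:                  # miscalibrated on this cell
                    preds[cell] = np.clip(preds[cell] + gap, 0.0, 1.0)
                    updated = True
        if not updated:                               # calibrated on every cell checked
            break
    return preds

rng = np.random.default_rng(0)
labels = (rng.uniform(size=1000) < 0.4).astype(float)
preds = rng.uniform(size=1000)
groups = [rng.uniform(size=1000) < 0.3, rng.uniform(size=1000) < 0.5]
calibrated = multicalibrate(preds, labels, groups)
```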

SimpleNLG-ZH: a Linguistic Realisation Engine for Mandarin

Title SimpleNLG-ZH: a Linguistic Realisation Engine for Mandarin
Authors Guanyi Chen, Kees van Deemter, Chenghua Lin
Abstract We introduce SimpleNLG-ZH, a realisation engine for Mandarin that follows the software design paradigm of SimpleNLG (Gatt and Reiter, 2009). We explain the core grammar (morphology and syntax) and the lexicon of SimpleNLG-ZH, which is very different from English and other languages for which SimpleNLG engines have been built. The system was evaluated by regenerating expressions from a body of test sentences and a corpus of human-authored expressions. Human evaluation was conducted to estimate the quality of regenerated sentences.
Tasks Morphological Inflection, Text Generation
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6506/
PDF https://www.aclweb.org/anthology/W18-6506
PWC https://paperswithcode.com/paper/simplenlg-zh-a-linguistic-realisation-engine
Repo
Framework

How Many Samples are Needed to Estimate a Convolutional Neural Network?

Title How Many Samples are Needed to Estimate a Convolutional Neural Network?
Authors Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan R. Salakhutdinov, Aarti Singh
Abstract A widespread folklore for explaining the success of Convolutional Neural Networks (CNNs) is that CNNs use a more compact representation than the Fully-connected Neural Network (FNN) and thus require fewer training samples to accurately estimate their parameters. We initiate the study of rigorously characterizing the sample complexity of estimating CNNs. We show that for an $m$-dimensional convolutional filter with linear activation acting on a $d$-dimensional input, the sample complexity of achieving population prediction error of $\epsilon$ is $\widetilde{O}(m/\epsilon^2)$, whereas the sample complexity for its FNN counterpart is lower bounded by $\Omega(d/\epsilon^2)$ samples. Since in typical settings $m \ll d$, this result demonstrates the advantage of using a CNN. We further consider the sample complexity of estimating a one-hidden-layer CNN with linear activation where both the $m$-dimensional convolutional filter and the $r$-dimensional output weights are unknown. For this model, we show that the sample complexity is $\widetilde{O}\left((m+r)/\epsilon^2\right)$ when the ratio between the stride size and the filter size is a constant. For both models, we also present lower bounds showing our sample complexities are tight up to logarithmic factors. Our main tools for deriving these results are a localized empirical process analysis and a new lemma characterizing the convolutional structure. We believe that these tools may inspire further developments in understanding CNNs.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7320-how-many-samples-are-needed-to-estimate-a-convolutional-neural-network
PDF http://papers.nips.cc/paper/7320-how-many-samples-are-needed-to-estimate-a-convolutional-neural-network.pdf
PWC https://paperswithcode.com/paper/how-many-samples-are-needed-to-estimate-a-1
Repo
Framework
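
For reference, the sample-complexity statements quoted in the abstract, collected in one place ($m$ is the filter size, $d$ the input dimension, $r$ the output-weight dimension, $\epsilon$ the target population prediction error, and $n$ the number of samples):

```latex
\begin{align*}
\text{single convolutional filter, linear activation:} \quad
  & n = \widetilde{O}\!\left(m/\epsilon^{2}\right) \\
\text{fully-connected counterpart (lower bound):} \quad
  & n = \Omega\!\left(d/\epsilon^{2}\right) \\
\text{one-hidden-layer CNN (filter and output weights):} \quad
  & n = \widetilde{O}\!\left((m+r)/\epsilon^{2}\right)
\end{align*}
```

Since $m \ll d$ in typical settings (a small filter versus a full-dimensional input), the first bound is far smaller than the second, which is the claimed advantage of the convolutional structure.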

A Brief Introduction to Natural Language Generation within Computational Creativity

Title A Brief Introduction to Natural Language Generation within Computational Creativity
Authors Ben Burtenshaw
Abstract
Tasks Text Generation
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6601/
PDF https://www.aclweb.org/anthology/W18-6601
PWC https://paperswithcode.com/paper/a-brief-introduction-to-natural-language
Repo
Framework

A probabilistic population code based on neural samples

Title A probabilistic population code based on neural samples
Authors Sabyasachi Shivkumar, Richard Lange, Ankani Chattoraj, Ralf Haefner
Abstract Sensory processing is often characterized as implementing probabilistic inference: networks of neurons compute posterior beliefs over unobserved causes given the sensory inputs. How these beliefs are computed and represented by neural responses is much-debated (Fiser et al. 2010, Pouget et al. 2013). A central debate concerns the question of whether neural responses represent samples of latent variables (Hoyer & Hyvarinnen 2003) or parameters of their distributions (Ma et al. 2006) with efforts being made to distinguish between them (Grabska-Barwinska et al. 2013). A separate debate addresses the question of whether neural responses are proportionally related to the encoded probabilities (Barlow 1969), or proportional to the logarithm of those probabilities (Jazayeri & Movshon 2006, Ma et al. 2006, Beck et al. 2012). Here, we show that these alternatives – contrary to common assumptions – are not mutually exclusive and that the very same system can be compatible with all of them. As a central analytical result, we show that modeling neural responses in area V1 as samples from a posterior distribution over latents in a linear Gaussian model of the image implies that those neural responses form a linear Probabilistic Population Code (PPC, Ma et al. 2006). In particular, the posterior distribution over some experimenter-defined variable like “orientation” is part of the exponential family with sufficient statistics that are linear in the neural sampling-based firing rates.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7938-a-probabilistic-population-code-based-on-neural-samples
PDF http://papers.nips.cc/paper/7938-a-probabilistic-population-code-based-on-neural-samples.pdf
PWC https://paperswithcode.com/paper/a-probabilistic-population-code-based-on
Repo
Framework
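
A hedged reconstruction of the central claim in the form the abstract states it, assuming a linear Gaussian model of the image with latents $s$ and treating the firing rates $r$ as (scaled) counts of posterior samples:

```latex
% Linear Gaussian image model with latent intensities s:
%   x = A s + n,   n ~ N(0, \sigma^{2} I).
% Neural responses in V1 are modeled as samples from the posterior p(s | x).
% For an experimenter-defined variable such as orientation \theta, the abstract
% states that the implied posterior is exponential-family with sufficient
% statistics linear in the sampling-based firing rates r:
\[
  p(\theta \mid \mathbf{r}) \;\propto\; g(\theta)\,
  \exp\!\Big( \textstyle\sum_{i} h_{i}(\theta)\, r_{i} \Big),
\]
% which is precisely the form of a linear probabilistic population code (PPC).
```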

Learning Distributional Token Representations from Visual Features

Title Learning Distributional Token Representations from Visual Features
Authors Samuel Broscheit
Abstract In this study, we compare token representations constructed from visual features (i.e., pixels) with standard lookup-based embeddings. Our goal is to gain insight about the challenges of encoding a text representation from low-level features, e.g. from characters or pixels. We focus on Chinese, which, as a logographic language, has properties that make a representation via visual features challenging and interesting. To train and evaluate different models for the token representation, we chose the task of character-based neural machine translation (NMT) from Chinese to English. We found that a token representation computed only from visual features can achieve competitive results to lookup embeddings. However, we also show different strengths and weaknesses in the models' performance in a part-of-speech tagging task and also a semantic similarity task. In summary, we show that it is possible to achieve a text representation only from pixels. We hope that this is a useful stepping stone for future studies that exclusively rely on visual input, or aim at exploiting visual features of written language.
Tasks Machine Translation, Part-Of-Speech Tagging, Representation Learning, Semantic Similarity, Semantic Textual Similarity, Tokenization
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3025/
PDF https://www.aclweb.org/anthology/W18-3025
PWC https://paperswithcode.com/paper/learning-distributional-token-representations
Repo
Framework
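
A hedged sketch of the pixel-based token representation, assuming each character is already rendered as a small glyph bitmap; the CNN shape and embedding size below are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class GlyphEncoder(nn.Module):
    """Map a 1x32x32 glyph bitmap to a token embedding, replacing the usual
    lookup table; the architecture is illustrative only."""
    def __init__(self, emb_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # -> 32x16x16
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # -> 64x8x8
        )
        self.proj = nn.Linear(64 * 8 * 8, emb_dim)

    def forward(self, glyphs):               # glyphs: (batch, 1, 32, 32)
        h = self.conv(glyphs)
        return self.proj(h.flatten(start_dim=1))

# A batch of rendered character images stands in for a source sentence; the
# resulting vectors would feed the NMT encoder in place of lookup embeddings.
embeddings = GlyphEncoder()(torch.rand(10, 1, 32, 32))   # shape (10, 256)
```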

Learning Deep Generative Models With Discrete Latent Variables

Title Learning Deep Generative Models With Discrete Latent Variables
Authors Hengyuan Hu, Ruslan Salakhutdinov
Abstract There have been numerous recent advancements in learning deep generative models with latent variables, thanks to the reparameterization trick that allows deep directed models to be trained effectively. However, since the reparameterization trick only works on continuous variables, deep generative models with discrete latent variables still remain hard to train and perform considerably worse than their continuous counterparts. In this paper, we attempt to shrink this gap by introducing a new architecture and its learning procedure. We develop a hybrid generative model with binary latent variables that consists of an undirected graphical model and a deep neural network. We propose an efficient two-stage pretraining and training procedure that is crucial for learning these models. Experiments on binarized digits and images of natural scenes demonstrate that our model achieves close to state-of-the-art performance in terms of density estimation and is capable of generating coherent images of natural scenes.
Tasks Density Estimation
Published 2018-01-01
URL https://openreview.net/forum?id=SkZ-BnyCW
PDF https://openreview.net/pdf?id=SkZ-BnyCW
PWC https://paperswithcode.com/paper/learning-deep-generative-models-with-discrete
Repo
Framework
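
A hedged sketch of the generative direction only: binary latents drawn from an RBM-style undirected prior via block Gibbs sampling, then mapped through a deterministic neural decoder. The auxiliary units in the prior are an assumption of this sketch, and the two-stage pretraining the abstract emphasizes is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# RBM-style undirected prior over binary latents h, with auxiliary units v
# (the auxiliary layer is an assumption of this sketch, not the paper's model).
n_h, n_v, x_dim = 64, 32, 784
W = 0.01 * rng.normal(size=(n_v, n_h))
b_v, b_h = np.zeros(n_v), np.zeros(n_h)

def sample_binary_latents(gibbs_steps=50):
    h = (rng.uniform(size=n_h) < 0.5).astype(float)
    for _ in range(gibbs_steps):             # block Gibbs over v and h
        v = (rng.uniform(size=n_v) < sigmoid(W @ h + b_v)).astype(float)
        h = (rng.uniform(size=n_h) < sigmoid(W.T @ v + b_h)).astype(float)
    return h

# Deep deterministic decoder from binary latents to pixel means.
W1 = 0.01 * rng.normal(size=(256, n_h))
W2 = 0.01 * rng.normal(size=(x_dim, 256))
decode = lambda h: sigmoid(W2 @ np.tanh(W1 @ h))

x_sample = decode(sample_binary_latents())   # one generated sample (pixel means)
```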

Efficient Neural Architecture Search via Parameters Sharing

Title Efficient Neural Architecture Search via Parameters Sharing
Authors Hieu Pham, Melody Guan, Barret Zoph, Quoc Le, Jeff Dean
Abstract We propose Efficient Neural Architecture Search (ENAS), a fast and inexpensive approach for automatic model design. ENAS constructs a large computational graph, where each subgraph represents a neural network architecture, hence forcing all architectures to share their parameters. A controller is trained with policy gradient to search for a subgraph that maximizes the expected reward on a validation set. Meanwhile, a model corresponding to the selected subgraph is trained to minimize a canonical cross-entropy loss. Sharing parameters among child models allows ENAS to deliver strong empirical performance while using far fewer GPU-hours than existing automatic model design approaches, and notably being 1000x less expensive than standard Neural Architecture Search. On Penn Treebank, ENAS discovers a novel architecture that achieves a test perplexity of 56.3, on par with the existing state-of-the-art among all methods without post-training processing. On CIFAR-10, ENAS finds a novel architecture that achieves 2.89% test error, which is on par with the 2.65% test error of NASNet (Zoph et al., 2018).
Tasks Neural Architecture Search
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2247
PDF http://proceedings.mlr.press/v80/pham18a/pham18a.pdf
PWC https://paperswithcode.com/paper/efficient-neural-architecture-search-via
Repo
Framework
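
A hedged sketch of the controller's policy-gradient loop: sample discrete architecture decisions, evaluate the corresponding child model (which would reuse the shared parameters), and use the validation reward for a REINFORCE update. The decision space and the reward function below are placeholders, not the paper's search space.

```python
import torch

# Stand-in controller: independent categorical logits over a few architecture
# decisions (the paper uses an RNN controller; this is a simplification).
num_decisions, num_choices = 4, 5
logits = torch.zeros(num_decisions, num_choices, requires_grad=True)
optimizer = torch.optim.Adam([logits], lr=0.05)
baseline = 0.0

def evaluate_child(arch):
    """Placeholder for briefly training the sampled subgraph with shared
    weights and measuring validation accuracy; here a dummy reward that
    favors choice 2 at every position."""
    return sum(c == 2 for c in arch) / len(arch)

for step in range(200):
    dist = torch.distributions.Categorical(logits=logits)
    arch = dist.sample()                        # one decision per position
    reward = evaluate_child(arch.tolist())
    baseline = 0.9 * baseline + 0.1 * reward    # moving-average baseline
    loss = -(reward - baseline) * dist.log_prob(arch).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(logits.argmax(dim=1))                     # converges toward choice 2
```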

Proceedings of the 7th workshop on NLP for Computer Assisted Language Learning

Title Proceedings of the 7th workshop on NLP for Computer Assisted Language Learning
Authors
Abstract
Tasks
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-7100/
PDF https://www.aclweb.org/anthology/W18-7100
PWC https://paperswithcode.com/paper/proceedings-of-the-7th-workshop-on-nlp-for
Repo
Framework

A Mathematical Model For Optimal Decisions In A Representative Democracy

Title A Mathematical Model For Optimal Decisions In A Representative Democracy
Authors Malik Magdon-Ismail, Lirong Xia
Abstract Direct democracy, where each voter casts one vote, fails when the average voter competence falls below 50%. This happens in noisy settings when voters have limited information. Representative democracy, where voters choose representatives to vote, can be an elixir in both these situations. We introduce a mathematical model for studying representative democracy, in particular understanding the parameters of a representative democracy that gives maximum decision making capability. Our main result states that under general and natural conditions, 1. for fixed voting cost, the optimal number of representatives is linear; 2. for polynomial cost, the optimal number of representatives is logarithmic.
Tasks Decision Making
Published 2018-12-01
URL http://papers.nips.cc/paper/7720-a-mathematical-model-for-optimal-decisions-in-a-representative-democracy
PDF http://papers.nips.cc/paper/7720-a-mathematical-model-for-optimal-decisions-in-a-representative-democracy.pdf
PWC https://paperswithcode.com/paper/a-mathematical-model-for-optimal-decisions-in
Repo
Framework
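
As a small numerical illustration of the trade-off the model studies (not the paper's utility or cost functions): k independent representatives, each correct with probability p, decide by majority vote, and each additional representative incurs a fixed cost.

```python
from math import comb

def majority_correct(k, p):
    """Probability that a majority of k independent representatives, each
    correct with probability p, makes the right decision (k odd, so no ties)."""
    return sum(comb(k, i) * p**i * (1 - p)**(k - i)
               for i in range(k // 2 + 1, k + 1))

def best_committee_size(p=0.6, cost_per_rep=0.002, max_k=301):
    # Toy objective: decision quality minus a linear cost in committee size.
    odd_sizes = range(1, max_k + 1, 2)
    return max(odd_sizes, key=lambda k: majority_correct(k, p) - cost_per_rep * k)

print(best_committee_size(p=0.6))   # modest competence: a sizeable committee pays off
print(best_committee_size(p=0.9))   # high competence: a handful of representatives suffices
```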