Paper Group NANR 256
Zero-shot Learning of Classifiers from Natural Language Quantification
Title | Zero-shot Learning of Classifiers from Natural Language Quantification |
Authors | Shashank Srivastava, Igor Labutov, Tom Mitchell |
Abstract | Humans can efficiently learn new concepts using language. We present a framework through which a set of explanations of a concept can be used to learn a classifier without access to any labeled examples. We use semantic parsing to map explanations to probabilistic assertions grounded in latent class labels and observed attributes of unlabeled data, and leverage the differential semantics of linguistic quantifiers (e.g., 'usually' vs. 'always') to drive model training. Experiments on three domains show that the learned classifiers outperform previous approaches for learning with limited data, and are comparable with fully supervised classifiers trained from a small number of labeled examples. |
Tasks | Semantic Parsing, Zero-Shot Learning |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1029/ |
https://www.aclweb.org/anthology/P18-1029 | |
PWC | https://paperswithcode.com/paper/zero-shot-learning-of-classifiers-from |
Repo | |
Framework | |
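The key idea in the abstract above is that different quantifiers carry different probabilistic strengths. A minimal sketch of that grounding step, with purely illustrative probability values (the paper's actual mapping and training procedure are not given here):

```python
# Hypothetical quantifier-to-probability table: each quantifier in an
# explanation is grounded as a soft constraint on P(attribute | class).
# The specific numbers are illustrative assumptions, not the paper's.
QUANTIFIER_PROB = {
    "always": 0.95,
    "usually": 0.80,
    "often": 0.65,
    "sometimes": 0.40,
    "rarely": 0.15,
    "never": 0.05,
}

def assertion_likelihood(quantifier: str, observed_frequency: float) -> float:
    """Score how well an observed attribute frequency matches the
    probability implied by the quantifier (closer -> higher score)."""
    target = QUANTIFIER_PROB[quantifier]
    return 1.0 - abs(target - observed_frequency)
```

Such per-assertion scores could then be aggregated across explanations to drive training without labels.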
Randomized Block Cubic Newton Method
Title | Randomized Block Cubic Newton Method |
Authors | Nikita Doikov, Peter Richtarik |
Abstract | We study the problem of minimizing the sum of three convex functions: a differentiable term, a twice-differentiable term, and a non-smooth term in a high dimensional setting. To this effect we propose and analyze a randomized block cubic Newton (RBCN) method, which in each iteration builds a model of the objective function formed as the sum of the natural models of its three components: a linear model with a quadratic regularizer for the differentiable term, a quadratic model with a cubic regularizer for the twice differentiable term, and a perfect (proximal) model for the nonsmooth term. Our method in each iteration minimizes the model over a random subset of blocks of the search variable. RBCN is the first algorithm with these properties, generalizing several existing methods, matching the best known bounds in all special cases. We establish ${\cal O}(1/\epsilon)$, ${\cal O}(1/\sqrt{\epsilon})$ and ${\cal O}(\log (1/\epsilon))$ rates under different assumptions on the component functions. Lastly, we show numerically that our method outperforms the state-of-the-art on a variety of machine learning problems, including cubically regularized least-squares, logistic regression with constraints, and Poisson regression. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2322 |
http://proceedings.mlr.press/v80/doikov18a/doikov18a.pdf | |
PWC | https://paperswithcode.com/paper/randomized-block-cubic-newton-method |
Repo | |
Framework | |
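The cubic-regularized model described above has a closed-form minimizer in the scalar case. As a one-dimensional sketch of the per-block subproblem (RBCN itself works over random blocks of a high-dimensional variable; this is only an illustration of the cubic model):

```python
import math

def cubic_newton_step(g: float, H: float, M: float) -> float:
    """Closed-form minimizer of the scalar cubic model
        m(h) = g*h + 0.5*H*h**2 + (M/6)*|h|**3,
    i.e. a quadratic model with a cubic regularizer, the building block
    of cubic-regularized Newton methods. g = gradient, H = curvature,
    M = cubic regularization parameter (M > 0)."""
    if g == 0.0:
        return 0.0
    if g < 0:   # minimizer lies at h > 0; solve g + H*h + (M/2)*h^2 = 0
        return (-H + math.sqrt(H * H - 2.0 * M * g)) / M
    else:       # minimizer lies at h < 0; solve g + H*h - (M/2)*h^2 = 0
        return (H - math.sqrt(H * H + 2.0 * M * g)) / M
```

The returned step satisfies the model's first-order optimality condition exactly, which is what each block update of such a method enforces.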
Convolutions Are All You Need (For Classifying Character Sequences)
Title | Convolutions Are All You Need (For Classifying Character Sequences) |
Authors | Zach Wood-Doughty, Nicholas Andrews, Mark Dredze |
Abstract | While recurrent neural networks (RNNs) are widely used for text classification, they demonstrate poor performance and slow convergence when trained on long sequences. When text is modeled as characters instead of words, the longer sequences make RNNs a poor choice. Convolutional neural networks (CNNs), although somewhat less ubiquitous than RNNs, have an internal structure more appropriate for long-distance character dependencies. To better understand how CNNs and RNNs differ in handling long sequences, we use them for text classification tasks in several character-level social media datasets. The CNN models vastly outperform the RNN models in our experiments, suggesting that CNNs are superior to RNNs at learning to classify character-level data. |
Tasks | Document Classification, Language Modelling, Machine Translation, Text Classification |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6127/ |
https://www.aclweb.org/anthology/W18-6127 | |
PWC | https://paperswithcode.com/paper/convolutions-are-all-you-need-for-classifying |
Repo | |
Framework | |
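The core operation of the character-level CNNs compared above is a filter slid over character embeddings followed by max-pooling. A dependency-free sketch of that single filter-and-pool step (the embedding and filter values here are stand-ins, not trained parameters):

```python
def conv1d_features(char_ids, emb, filt, width):
    """Slide one convolutional filter over a character-embedded sequence
    and max-pool, the core feature extractor of a character-level CNN.
    char_ids: list of int character ids; emb: dict id -> embedding vector;
    filt: list of `width` vectors matching the embedding dimension."""
    best = float("-inf")
    for i in range(len(char_ids) - width + 1):
        window = [emb[c] for c in char_ids[i:i + width]]
        # dot product of the flattened window with the flattened filter
        score = sum(w * f
                    for vec, fvec in zip(window, filt)
                    for w, f in zip(vec, fvec))
        best = max(best, score)
    return best  # max-pooled activation for this filter
```

A real classifier stacks many such filters of several widths and feeds the pooled activations to a softmax layer; unlike an RNN, no state is carried across the full sequence length.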
Orthographic Features for Bilingual Lexicon Induction
Title | Orthographic Features for Bilingual Lexicon Induction |
Authors | Parker Riley, Daniel Gildea |
Abstract | Recent embedding-based methods in bilingual lexicon induction show good results, but do not take advantage of orthographic features, such as edit distance, which can be helpful for pairs of related languages. This work extends embedding-based methods to incorporate these features, resulting in significant accuracy gains for related languages. |
Tasks | Machine Translation, Multilingual Word Embeddings, Unsupervised Machine Translation, Word Alignment, Word Embeddings |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2062/ |
https://www.aclweb.org/anthology/P18-2062 | |
PWC | https://paperswithcode.com/paper/orthographic-features-for-bilingual-lexicon |
Repo | |
Framework | |
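The orthographic feature the abstract mentions is based on edit distance. A minimal sketch of turning it into a similarity score that could be combined with embedding similarity for related-language pairs (the normalization choice here is an assumption, not necessarily the paper's):

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming (two-row version)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (ca != cb))) # substitution
        prev = cur
    return prev[-1]

def orthographic_similarity(a: str, b: str) -> float:
    """1 minus length-normalized edit distance, usable as an extra
    feature alongside cross-lingual embedding similarity."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))
```

For cognate-rich pairs (e.g. Spanish "noche" vs. Italian "notte"), this score is high exactly where embedding-only methods gain from the extra signal.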
Adversarial Multiple Source Domain Adaptation
Title | Adversarial Multiple Source Domain Adaptation |
Authors | Han Zhao, Shanghang Zhang, Guanhang Wu, José M. F. Moura, Joao P. Costeira, Geoffrey J. Gordon |
Abstract | While domain adaptation has been actively researched, most algorithms focus on the single-source-single-target adaptation setting. In this paper we propose new generalization bounds and algorithms under both classification and regression settings for unsupervised multiple source domain adaptation. Our theoretical analysis naturally leads to an efficient learning strategy using adversarial neural networks: we show how to interpret it as learning feature representations that are invariant to the multiple domain shifts while still being discriminative for the learning task. To this end, we propose multisource domain adversarial networks (MDAN) that approach domain adaptation by optimizing task-adaptive generalization bounds. To demonstrate the effectiveness of MDAN, we conduct extensive experiments showing superior adaptation performance on both classification and regression problems: sentiment analysis, digit classification, and vehicle counting. |
Tasks | Domain Adaptation, Sentiment Analysis |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8075-adversarial-multiple-source-domain-adaptation |
http://papers.nips.cc/paper/8075-adversarial-multiple-source-domain-adaptation.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-multiple-source-domain-adaptation |
Repo | |
Framework | |
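The MDAN objective combines per-source task losses with adversarial domain-discriminator terms, in a "hard" (worst-case) or "soft" (smoothed) form. A scalar sketch of that combination, assuming the per-source losses are already computed (the network and discriminator themselves are omitted):

```python
import math

def mdan_objective(task_losses, domain_losses, gamma=1.0, hard=True):
    """Sketch of a multisource adversarial objective: the 'hard' variant
    takes the worst (max) per-source cost, the 'soft' variant a
    log-sum-exp smoothing of the same costs. task_losses[i] and
    domain_losses[i] are the classification loss and adversarial
    domain-confusion term for source domain i; gamma controls smoothing."""
    costs = [t + d for t, d in zip(task_losses, domain_losses)]
    if hard:
        return max(costs)
    return math.log(sum(math.exp(gamma * c) for c in costs)) / gamma
```

Because log-sum-exp upper-bounds the max, minimizing the soft objective still controls the worst-case source, while giving smoother gradients.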
Binary Rating Estimation with Graph Side Information
Title | Binary Rating Estimation with Graph Side Information |
Authors | Kwangjun Ahn, Kangwook Lee, Hyunseung Cha, Changho Suh |
Abstract | Rich experimental evidence shows that one can better estimate users’ unknown ratings with the aid of graph side information such as social graphs. However, the gain is not theoretically quantified. In this work, we study the binary rating estimation problem to understand the fundamental value of graph side information. Considering a simple correlation model between a rating matrix and a graph, we characterize the sharp threshold on the number of observed entries required to recover the rating matrix (called the optimal sample complexity) as a function of the quality of graph side information (to be detailed). To the best of our knowledge, we are the first to reveal how much the graph side information reduces sample complexity. Further, we propose a computationally efficient algorithm that achieves the limit. Our experimental results demonstrate that the algorithm performs well even with real-world graphs. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7681-binary-rating-estimation-with-graph-side-information |
http://papers.nips.cc/paper/7681-binary-rating-estimation-with-graph-side-information.pdf | |
PWC | https://paperswithcode.com/paper/binary-rating-estimation-with-graph-side |
Repo | |
Framework | |
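The intuition behind using graph side information is that users in the same graph cluster share rating behavior, so few observations per item suffice once the cluster is known. A toy estimator in that spirit (a majority vote within given clusters; the paper's actual algorithm and correlation model are more refined):

```python
def cluster_majority_ratings(observed, clusters):
    """Toy estimator: users in the same graph cluster are assumed to
    share a binary rating vector, so each item's rating is recovered by
    a majority vote over the cluster's observed entries.
    observed: dict (user, item) -> rating in {+1, -1};
    clusters: list of sets of user ids. Returns (cluster_idx, item) -> rating."""
    est = {}
    for cid, members in enumerate(clusters):
        votes = {}
        for (u, item), r in observed.items():
            if u in members:
                votes.setdefault(item, []).append(r)
        for item, vs in votes.items():
            est[(cid, item)] = 1 if sum(vs) >= 0 else -1
    return est
```

The theoretical question the paper answers is how many observed entries such pooling needs, as a function of how reliably the graph reflects the true clusters.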
UZH at CoNLL–SIGMORPHON 2018 Shared Task on Universal Morphological Reinflection
Title | UZH at CoNLL–SIGMORPHON 2018 Shared Task on Universal Morphological Reinflection |
Authors | Peter Makarov, Simon Clematide |
Abstract | |
Tasks | Imitation Learning, Morphological Inflection |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-3008/ |
https://www.aclweb.org/anthology/K18-3008 | |
PWC | https://paperswithcode.com/paper/uzh-at-conll-sigmorphon-2018-shared-task-on |
Repo | |
Framework | |
Tübingen-Oslo system at SIGMORPHON shared task on morphological inflection. A multi-tasking multilingual sequence to sequence model.
Title | Tübingen-Oslo system at SIGMORPHON shared task on morphological inflection. A multi-tasking multilingual sequence to sequence model. |
Authors | Taraka Rama, Çağrı Çöltekin |
Abstract | |
Tasks | Data Augmentation, Morphological Inflection |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-3014/ |
https://www.aclweb.org/anthology/K18-3014 | |
PWC | https://paperswithcode.com/paper/ta14bingen-oslo-system-at-sigmorphon-shared |
Repo | |
Framework | |
Model-based imitation learning from state trajectories
Title | Model-based imitation learning from state trajectories |
Authors | Subhajit Chaudhury, Daiki Kimura, Tadanobu Inoue, Ryuki Tachibana |
Abstract | Imitation learning from demonstrations usually relies on learning a policy from trajectories of optimal states and actions. However, in real-life expert demonstrations, often the action information is missing and only state trajectories are available. We present a model-based imitation learning method that can learn environment-specific optimal actions only from expert state trajectories. Our proposed method starts with a model-free reinforcement learning algorithm with a heuristic reward signal to sample environment dynamics, which is then used to train the state-transition probability. Subsequently, we learn the optimal actions from expert state trajectories by supervised learning, while back-propagating the error gradients through the modeled environment dynamics. Experimental evaluations show that our proposed method successfully achieves performance similar to (state, action) trajectory-based traditional imitation learning methods even in the absence of action information, with much fewer iterations compared to conventional model-free reinforcement learning methods. We also demonstrate that our method can learn to act from only video demonstrations of an expert agent for simple games and can learn to achieve desired performance in fewer iterations. |
Tasks | Imitation Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=S1GDXzb0b |
https://openreview.net/pdf?id=S1GDXzb0b | |
PWC | https://paperswithcode.com/paper/model-based-imitation-learning-from-state |
Repo | |
Framework | |
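Once a forward-dynamics model has been learned, missing actions can be recovered from consecutive expert states. A discrete-action sketch of that recovery step (the paper back-propagates through a learned neural dynamics model; here `dynamics` is a given callable and the search is exhaustive):

```python
def infer_actions(states, dynamics, action_set):
    """Given an expert state trajectory and a learned forward-dynamics
    model `dynamics(s, a) -> s_next`, recover the action at each step as
    the one whose predicted next state is closest to the observed one.
    This illustrates learning to act from state-only demonstrations."""
    actions = []
    for s, s_next in zip(states, states[1:]):
        best = min(action_set,
                   key=lambda a: abs(dynamics(s, a) - s_next))
        actions.append(best)
    return actions
```

With the inferred actions in hand, a policy can then be trained by ordinary supervised learning on (state, inferred-action) pairs.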
Sequence Classification with Human Attention
Title | Sequence Classification with Human Attention |
Authors | Maria Barrett, Joachim Bingel, Nora Hollenstein, Marek Rei, Anders Søgaard |
Abstract | Learning attention functions requires large volumes of data, but many NLP tasks simulate human behavior, and in this paper, we show that human attention really does provide a good inductive bias on many attention functions in NLP. Specifically, we use estimated human attention derived from eye-tracking corpora to regularize attention functions in recurrent neural networks. We show substantial improvements across a range of tasks, including sentiment analysis, grammatical error detection, and detection of abusive language. |
Tasks | Eye Tracking, Grammatical Error Detection, Sentiment Analysis |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-1030/ |
https://www.aclweb.org/anthology/K18-1030 | |
PWC | https://paperswithcode.com/paper/sequence-classification-with-human-attention |
Repo | |
Framework | |
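The regularization described above can be written as a penalty pulling the model's attention weights toward human (eye-tracking-estimated) attention. A minimal sketch, where the squared-error penalty and the trade-off weight `lam` are simplifying assumptions:

```python
def attention_regularized_loss(task_loss, model_attn, human_attn, lam=0.1):
    """Augment a task loss with a penalty on the mismatch between the
    model's attention weights and human attention estimated from
    eye-tracking data. `lam` is a hypothetical trade-off weight; the
    paper's exact formulation may differ."""
    penalty = sum((m - h) ** 2 for m, h in zip(model_attn, human_attn))
    return task_loss + lam * penalty
```

Minimizing this joint loss biases the attention function toward words humans fixate on, which acts as the inductive bias the abstract describes.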
Language Codes
Title | Language Codes |
Authors | Jennifer DeCamp |
Abstract | |
Tasks | Machine Translation |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-2001/ |
https://www.aclweb.org/anthology/W18-2001 | |
PWC | https://paperswithcode.com/paper/language-codes |
Repo | |
Framework | |
Noise-Robust Morphological Disambiguation for Dialectal Arabic
Title | Noise-Robust Morphological Disambiguation for Dialectal Arabic |
Authors | Nasser Zalmout, Alexander Erdmann, Nizar Habash |
Abstract | User-generated text tends to be noisy with many lexical and orthographic inconsistencies, making natural language processing (NLP) tasks more challenging. The challenging nature of noisy text processing is exacerbated for dialectal content, where in addition to spelling and lexical differences, dialectal text is characterized with morpho-syntactic and phonetic variations. These issues increase sparsity in NLP models and reduce accuracy. We present a neural morphological tagging and disambiguation model for Egyptian Arabic, with various extensions to handle noisy and inconsistent content. Our models achieve about 5% relative error reduction (1.1% absolute improvement) for full morphological analysis, and around 22% relative error reduction (1.8% absolute improvement) for part-of-speech tagging, over a state-of-the-art baseline. |
Tasks | Lexical Normalization, Morphological Analysis, Morphological Tagging, Part-Of-Speech Tagging |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-1087/ |
https://www.aclweb.org/anthology/N18-1087 | |
PWC | https://paperswithcode.com/paper/noise-robust-morphological-disambiguation-for |
Repo | |
Framework | |
Neural Morphological Tagging of Lemma Sequences for Machine Translation
Title | Neural Morphological Tagging of Lemma Sequences for Machine Translation |
Authors | Costanza Conforti, Matthias Huck, Alexander Fraser |
Abstract | |
Tasks | Machine Translation, Morphological Tagging |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1805/ |
https://www.aclweb.org/anthology/W18-1805 | |
PWC | https://paperswithcode.com/paper/neural-morphological-tagging-of-lemma |
Repo | |
Framework | |
Phonologically Informed Edit Distance Algorithms for Word Alignment with Low-Resource Languages
Title | Phonologically Informed Edit Distance Algorithms for Word Alignment with Low-Resource Languages |
Authors | Richard T. McCoy, Robert Frank |
Abstract | |
Tasks | Machine Translation, Speech Recognition, Word Alignment |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/W18-0311/ |
https://www.aclweb.org/anthology/W18-0311 | |
PWC | https://paperswithcode.com/paper/phonologically-informed-edit-distance |
Repo | |
Framework | |
Alternating Randomized Block Coordinate Descent
Title | Alternating Randomized Block Coordinate Descent |
Authors | Jelena Diakonikolas, Lorenzo Orecchia |
Abstract | Block-coordinate descent algorithms and alternating minimization methods are fundamental optimization algorithms and an important primitive in large-scale optimization and machine learning. While various block-coordinate-descent-type methods have been studied extensively, only alternating minimization – which applies to the setting of only two blocks – is known to have convergence time that scales independently of the least smooth block. A natural question is then: is the setting of two blocks special? We show that the answer is “no” as long as the least smooth block can be optimized exactly – an assumption that is also needed in the setting of alternating minimization. We do so by introducing a novel algorithm AR-BCD, whose convergence time scales independently of the least smooth (possibly non-smooth) block. The basic algorithm generalizes both alternating minimization and randomized block coordinate (gradient) descent, and we also provide its accelerated version – AAR-BCD. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2445 |
http://proceedings.mlr.press/v80/diakonikolas18a/diakonikolas18a.pdf | |
PWC | https://paperswithcode.com/paper/alternating-randomized-block-coordinate |
Repo | |
Framework | |
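The AR-BCD scheme described above alternates a gradient step on one randomly chosen smooth block with exact minimization over the designated least-smooth block. A small sketch on a coupled quadratic, where the block structure, step size, and objective are illustrative choices:

```python
import random

def ar_bcd(grad_blocks, exact_block_solve, x, iters=200, lr=0.25, seed=0):
    """Sketch of alternating randomized block coordinate descent:
    each iteration takes a gradient step on one randomly chosen smooth
    block, then minimizes exactly over the special (least smooth) block.
    grad_blocks: list of functions g_i(x) -> partial derivative for
    smooth block i; exact_block_solve: x -> optimal value of the special
    block, stored as the last coordinate of x."""
    rng = random.Random(seed)
    x = list(x)
    for _ in range(iters):
        i = rng.randrange(len(grad_blocks))
        x[i] -= lr * grad_blocks[i](x)       # random smooth-block step
        x[-1] = exact_block_solve(x)         # exact solve on special block
    return x
```

On f(x0, x1, y) = (x0-1)^2 + (x1-2)^2 + (y - x0 - x1)^2 with y as the exactly-solved block, the iterates converge to (1, 2, 3), and the rate does not depend on the smoothness of the y-block, mirroring the paper's claim.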