October 15, 2019


Paper Group NANR 256


Zero-shot Learning of Classifiers from Natural Language Quantification. Randomized Block Cubic Newton Method. Convolutions Are All You Need (For Classifying Character Sequences). Orthographic Features for Bilingual Lexicon Induction. Adversarial Multiple Source Domain Adaptation. Binary Rating Estimation with Graph Side Information. UZH at CoNLL–S …

Zero-shot Learning of Classifiers from Natural Language Quantification

Title Zero-shot Learning of Classifiers from Natural Language Quantification
Authors Shashank Srivastava, Igor Labutov, Tom Mitchell
Abstract Humans can efficiently learn new concepts using language. We present a framework through which a set of explanations of a concept can be used to learn a classifier without access to any labeled examples. We use semantic parsing to map explanations to probabilistic assertions grounded in latent class labels and observed attributes of unlabeled data, and leverage the differential semantics of linguistic quantifiers (e.g., "usually" vs. "always") to drive model training. Experiments on three domains show that the learned classifiers outperform previous approaches for learning with limited data, and are comparable with fully supervised classifiers trained from a small number of labeled examples.
Tasks Semantic Parsing, Zero-Shot Learning
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-1029/
PDF https://www.aclweb.org/anthology/P18-1029
PWC https://paperswithcode.com/paper/zero-shot-learning-of-classifiers-from
Repo
Framework
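
The quantifier-to-probability idea at the heart of the paper can be sketched in a few lines. The numeric values below are invented for illustration; the paper grounds assertions via semantic parsing and learns from them, rather than using a fixed lookup table.

```python
# Linguistic quantifiers carry graded probabilities that can act as soft
# supervision for an unlabeled classifier. These values are illustrative
# assumptions, not the paper's calibrated semantics.
QUANTIFIER_PROB = {
    "always": 0.95,
    "usually": 0.80,
    "often": 0.65,
    "sometimes": 0.40,
    "rarely": 0.15,
    "never": 0.05,
}

def assertion_to_soft_label(quantifier, negated=False):
    """Map a quantified assertion like 'usually red' to an (assumed)
    probability P(attribute | class) usable as a soft training target."""
    p = QUANTIFIER_PROB[quantifier.lower()]
    return 1.0 - p if negated else p
```

Soft labels of this kind can then drive maximum-likelihood training over the latent class labels, which is roughly the role quantifier semantics play in the paper's objective.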

Randomized Block Cubic Newton Method

Title Randomized Block Cubic Newton Method
Authors Nikita Doikov, Peter Richtárik (University of Edinburgh)
Abstract We study the problem of minimizing the sum of three convex functions: a differentiable term, a twice-differentiable term, and a non-smooth term, in a high-dimensional setting. To this effect we propose and analyze a randomized block cubic Newton (RBCN) method, which in each iteration builds a model of the objective function formed as the sum of the natural models of its three components: a linear model with a quadratic regularizer for the differentiable term, a quadratic model with a cubic regularizer for the twice-differentiable term, and a perfect (proximal) model for the non-smooth term. In each iteration, our method minimizes this model over a random subset of blocks of the search variable. RBCN is the first algorithm with these properties, generalizing several existing methods and matching the best known bounds in all special cases. We establish ${\cal O}(1/\epsilon)$, ${\cal O}(1/\sqrt{\epsilon})$ and ${\cal O}(\log (1/\epsilon))$ rates under different assumptions on the component functions. Lastly, we show numerically that our method outperforms the state of the art on a variety of machine learning problems, including cubically regularized least-squares, logistic regression with constraints, and Poisson regression.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2322
PDF http://proceedings.mlr.press/v80/doikov18a/doikov18a.pdf
PWC https://paperswithcode.com/paper/randomized-block-cubic-newton-method
Repo
Framework
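
The cubic-model subproblem behind such methods has a closed form in one dimension, which makes the idea easy to see. This is a minimal 1-D sketch of a cubically regularized Newton step, not the paper's block algorithm (which handles random blocks and a proximal term); `M` is an assumed regularization constant.

```python
import math

def cubic_newton_step(g, H, M):
    """One cubically regularized Newton step in 1-D:
    minimize m(h) = g*h + 0.5*H*h**2 + (M/6)*abs(h)**3.
    Setting m'(h) = g + H*h + (M/2)*h*abs(h) = 0 with h = -sign(g)*t, t >= 0,
    gives (M/2)*t**2 + H*t - abs(g) = 0, solved in closed form below."""
    if g == 0:
        return 0.0
    t = (-H + math.sqrt(H * H + 2.0 * M * abs(g))) / M
    return -math.copysign(t, g)

def minimize(f_grad, f_hess, x0, M=10.0, iters=50):
    """Drive x toward a stationary point of f using the 1-D cubic model."""
    x = x0
    for _ in range(iters):
        x += cubic_newton_step(f_grad(x), f_hess(x), M)
    return x
```

Near the optimum the regularizer vanishes and the step approaches the plain Newton step, which is where the fast local rates come from.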

Convolutions Are All You Need (For Classifying Character Sequences)

Title Convolutions Are All You Need (For Classifying Character Sequences)
Authors Zach Wood-Doughty, Nicholas Andrews, Mark Dredze
Abstract While recurrent neural networks (RNNs) are widely used for text classification, they demonstrate poor performance and slow convergence when trained on long sequences. When text is modeled as characters instead of words, the longer sequences make RNNs a poor choice. Convolutional neural networks (CNNs), although somewhat less ubiquitous than RNNs, have an internal structure more appropriate for long-distance character dependencies. To better understand how CNNs and RNNs differ in handling long sequences, we use them for text classification tasks in several character-level social media datasets. The CNN models vastly outperform the RNN models in our experiments, suggesting that CNNs are superior to RNNs at learning to classify character-level data.
Tasks Document Classification, Language Modelling, Machine Translation, Text Classification
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6127/
PDF https://www.aclweb.org/anthology/W18-6127
PWC https://paperswithcode.com/paper/convolutions-are-all-you-need-for-classifying
Repo
Framework
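
The core operations of a character-level CNN — one-hot character encoding, a 1-D convolution, and max-over-time pooling — can be shown in miniature. This toy sketch uses a single hand-set filter; real models learn many filters over learned embeddings.

```python
VOCAB = "abcdefghijklmnopqrstuvwxyz "

def one_hot(text):
    """Encode each character as a one-hot vector over VOCAB
    (characters outside VOCAB become all-zero rows)."""
    return [[1.0 if c == v else 0.0 for v in VOCAB] for c in text]

def conv1d_maxpool(x, filt):
    """Slide a width-len(filt) filter over the character sequence,
    then take the max over all positions (max-over-time pooling)."""
    width = len(filt)
    scores = []
    for i in range(len(x) - width + 1):
        s = sum(filt[j][k] * x[i + j][k]
                for j in range(width) for k in range(len(VOCAB)))
        scores.append(s)
    return max(scores)
```

Because the filter sees a fixed-width window regardless of sequence length, a stack of such layers covers long-distance character dependencies without the sequential bottleneck of an RNN.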

Orthographic Features for Bilingual Lexicon Induction

Title Orthographic Features for Bilingual Lexicon Induction
Authors Parker Riley, Daniel Gildea
Abstract Recent embedding-based methods in bilingual lexicon induction show good results, but do not take advantage of orthographic features, such as edit distance, which can be helpful for pairs of related languages. This work extends embedding-based methods to incorporate these features, resulting in significant accuracy gains for related languages.
Tasks Machine Translation, Multilingual Word Embeddings, Unsupervised Machine Translation, Word Alignment, Word Embeddings
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2062/
PDF https://www.aclweb.org/anthology/P18-2062
PWC https://paperswithcode.com/paper/orthographic-features-for-bilingual-lexicon
Repo
Framework
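
The orthographic feature the abstract mentions is essentially edit distance. A standard Levenshtein implementation plus a length-normalized similarity gives the flavor; the paper combines such features with embedding similarity, and the normalization below is one common choice rather than the paper's exact formula.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def orthographic_similarity(a, b):
    """Normalized similarity in [0, 1]; higher for closer spellings."""
    return 1.0 - edit_distance(a, b) / max(len(a), len(b), 1)
```

For related language pairs (e.g. German "nacht" / English "night") this signal is strong exactly where embedding-only methods are weakest: rare words with transparent cognates.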

Adversarial Multiple Source Domain Adaptation

Title Adversarial Multiple Source Domain Adaptation
Authors Han Zhao, Shanghang Zhang, Guanhang Wu, José M. F. Moura, João P. Costeira, Geoffrey J. Gordon
Abstract While domain adaptation has been actively researched, most algorithms focus on the single-source-single-target adaptation setting. In this paper we propose new generalization bounds and algorithms under both classification and regression settings for unsupervised multiple source domain adaptation. Our theoretical analysis naturally leads to an efficient learning strategy using adversarial neural networks: we show how to interpret it as learning feature representations that are invariant to the multiple domain shifts while still being discriminative for the learning task. To this end, we propose multisource domain adversarial networks (MDAN) that approach domain adaptation by optimizing task-adaptive generalization bounds. To demonstrate the effectiveness of MDAN, we conduct extensive experiments showing superior adaptation performance on both classification and regression problems: sentiment analysis, digit classification, and vehicle counting.
Tasks Domain Adaptation, Sentiment Analysis
Published 2018-12-01
URL http://papers.nips.cc/paper/8075-adversarial-multiple-source-domain-adaptation
PDF http://papers.nips.cc/paper/8075-adversarial-multiple-source-domain-adaptation.pdf
PWC https://paperswithcode.com/paper/adversarial-multiple-source-domain-adaptation
Repo
Framework
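
A key ingredient in multi-source bounds of this kind is how per-source adaptation costs are combined: a hard maximum is pessimistic, a plain average ignores the worst source. A smoothed maximum interpolates between the two. This is a generic log-sum-exp sketch of that idea, not the paper's exact objective (its weighting and terms may differ).

```python
import math

def soft_max_objective(source_costs, gamma=1.0):
    """Smoothed maximum over per-source costs:
    (1/gamma) * log(mean_i exp(gamma * c_i)).
    As gamma -> infinity this approaches max(source_costs);
    as gamma -> 0 it approaches the mean."""
    m = max(source_costs)  # subtract max for numerical stability
    n = len(source_costs)
    return m + math.log(
        sum(math.exp(gamma * (c - m)) for c in source_costs) / n) / gamma
```

Optimizing such a combination focuses training on the hardest source domains while still using gradient signal from all of them.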

Binary Rating Estimation with Graph Side Information

Title Binary Rating Estimation with Graph Side Information
Authors Kwangjun Ahn, Kangwook Lee, Hyunseung Cha, Changho Suh
Abstract Rich experimental evidence shows that one can better estimate users’ unknown ratings with the aid of graph side information such as social graphs. However, this gain has not been theoretically quantified. In this work, we study the binary rating estimation problem to understand the fundamental value of graph side information. Considering a simple correlation model between a rating matrix and a graph, we characterize the sharp threshold on the number of observed entries required to recover the rating matrix (called the optimal sample complexity) as a function of the quality of the graph side information (to be detailed). To the best of our knowledge, we are the first to reveal how much graph side information reduces sample complexity. Further, we propose a computationally efficient algorithm that achieves this limit. Our experimental results demonstrate that the algorithm performs well even with real-world graphs.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7681-binary-rating-estimation-with-graph-side-information
PDF http://papers.nips.cc/paper/7681-binary-rating-estimation-with-graph-side-information.pdf
PWC https://paperswithcode.com/paper/binary-rating-estimation-with-graph-side
Repo
Framework
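
The intuition — users in the same social community tend to rate alike, so the graph lets observed ratings stand in for missing ones — can be sketched with a crude estimator. Connected components stand in for the community structure here, and majority voting for the recovery algorithm; the paper's algorithm and correlation model are more refined.

```python
from collections import defaultdict

def connected_components(n_users, edges):
    """Group users by graph connectivity (union-find with path halving),
    a crude stand-in for community detection."""
    parent = list(range(n_users))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    for u, v in edges:
        parent[find(u)] = find(v)
    return [find(u) for u in range(n_users)]

def estimate_ratings(n_users, n_items, observed, edges):
    """observed: dict (user, item) -> rating in {0, 1}.
    Fill missing ratings with the majority vote of the user's community."""
    comp = connected_components(n_users, edges)
    votes = defaultdict(list)
    for (u, i), r in observed.items():
        votes[(comp[u], i)].append(r)
    est = {}
    for u in range(n_users):
        for i in range(n_items):
            if (u, i) in observed:
                est[(u, i)] = observed[(u, i)]
            else:
                vs = votes.get((comp[u], i), [])
                est[(u, i)] = int(sum(vs) * 2 >= len(vs)) if vs else 0
    return est
```

The sample-complexity question the paper answers is precisely how few observed entries such graph-aided recovery needs as a function of community quality.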

UZH at CoNLL–SIGMORPHON 2018 Shared Task on Universal Morphological Reinflection

Title UZH at CoNLL–SIGMORPHON 2018 Shared Task on Universal Morphological Reinflection
Authors Peter Makarov, Simon Clematide
Abstract
Tasks Imitation Learning, Morphological Inflection
Published 2018-10-01
URL https://www.aclweb.org/anthology/K18-3008/
PDF https://www.aclweb.org/anthology/K18-3008
PWC https://paperswithcode.com/paper/uzh-at-conll-sigmorphon-2018-shared-task-on
Repo
Framework

Tübingen-Oslo system at SIGMORPHON shared task on morphological inflection. A multi-tasking multilingual sequence-to-sequence model.

Title Tübingen-Oslo system at SIGMORPHON shared task on morphological inflection. A multi-tasking multilingual sequence-to-sequence model.
Authors Taraka Rama, Çağrı Çöltekin
Abstract
Tasks Data Augmentation, Morphological Inflection
Published 2018-10-01
URL https://www.aclweb.org/anthology/K18-3014/
PDF https://www.aclweb.org/anthology/K18-3014
PWC https://paperswithcode.com/paper/ta14bingen-oslo-system-at-sigmorphon-shared
Repo
Framework

Model-based imitation learning from state trajectories

Title Model-based imitation learning from state trajectories
Authors Subhajit Chaudhury, Daiki Kimura, Tadanobu Inoue, Ryuki Tachibana
Abstract Imitation learning from demonstrations usually relies on learning a policy from trajectories of optimal states and actions. However, in real-life expert demonstrations the action information is often missing and only state trajectories are available. We present a model-based imitation learning method that can learn environment-specific optimal actions only from expert state trajectories. Our proposed method starts with a model-free reinforcement learning algorithm using a heuristic reward signal to sample environment dynamics, which are then used to train the state-transition probability. Subsequently, we learn the optimal actions from expert state trajectories by supervised learning, while back-propagating the error gradients through the modeled environment dynamics. Experimental evaluations show that our proposed method achieves performance similar to traditional (state, action) trajectory-based imitation learning methods even in the absence of action information, with far fewer iterations than conventional model-free reinforcement learning methods. We also demonstrate that our method can learn to act from only video demonstrations of an expert agent for simple games, and can reach the desired performance in fewer iterations.
Tasks Imitation Learning
Published 2018-01-01
URL https://openreview.net/forum?id=S1GDXzb0b
PDF https://openreview.net/pdf?id=S1GDXzb0b
PWC https://paperswithcode.com/paper/model-based-imitation-learning-from-state
Repo
Framework
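
The pipeline — explore to learn a dynamics model, then use it to recover the actions missing from expert state trajectories — can be shown on a toy 1-D environment. Everything here (the dynamics s' = s + a, the action set, the tabular "model") is an illustrative stand-in; the paper learns a neural dynamics model and back-propagates through it.

```python
def learn_dynamics(transitions):
    """Fit a tabular model of s' given (s, a) from (s, a, s') samples
    collected by exploration (the model-free phase in the paper)."""
    model = {}
    for s, a, s2 in transitions:
        model[(s, a)] = s2   # deterministic toy dynamics
    return model

def infer_actions(model, expert_states, actions=(-1, 1)):
    """Recover the missing actions: for each consecutive expert state
    pair (s, s'), pick the action whose modeled next state is closest
    to the observed s'."""
    inferred = []
    for s, s2 in zip(expert_states, expert_states[1:]):
        best = min(actions, key=lambda a: abs(model.get((s, a), s) - s2))
        inferred.append(best)
    return inferred
```

Once actions are inferred, ordinary behavior cloning on the relabeled (state, action) pairs completes the imitation step.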

Sequence Classification with Human Attention

Title Sequence Classification with Human Attention
Authors Maria Barrett, Joachim Bingel, Nora Hollenstein, Marek Rei, Anders S{\o}gaard
Abstract Learning attention functions requires large volumes of data, but many NLP tasks simulate human behavior, and in this paper, we show that human attention really does provide a good inductive bias on many attention functions in NLP. Specifically, we use estimated human attention derived from eye-tracking corpora to regularize attention functions in recurrent neural networks. We show substantial improvements across a range of tasks, including sentiment analysis, grammatical error detection, and detection of abusive language.
Tasks Eye Tracking, Grammatical Error Detection, Sentiment Analysis
Published 2018-10-01
URL https://www.aclweb.org/anthology/K18-1030/
PDF https://www.aclweb.org/anthology/K18-1030
PWC https://paperswithcode.com/paper/sequence-classification-with-human-attention
Repo
Framework
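
Regularizing an attention function toward human attention amounts to adding a divergence penalty to the task loss. This sketch uses mean squared error and an assumed weight `lam`; the paper trains on eye-tracking estimates in a multi-task setup, so treat the exact form as illustrative.

```python
def attention_regularized_loss(task_loss, model_attention, human_attention,
                               lam=0.1):
    """Joint objective: task loss plus a penalty (here MSE) for deviating
    from estimated human attention over the same tokens."""
    n = len(model_attention)
    mse = sum((m - h) ** 2
              for m, h in zip(model_attention, human_attention)) / n
    return task_loss + lam * mse
```

The human attention term acts as an inductive bias: with little task data, the attention weights are pulled toward where readers actually look.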

Language Codes

Title Language Codes
Authors Jennifer DeCamp
Abstract
Tasks Machine Translation
Published 2018-03-01
URL https://www.aclweb.org/anthology/W18-2001/
PDF https://www.aclweb.org/anthology/W18-2001
PWC https://paperswithcode.com/paper/language-codes
Repo
Framework

Noise-Robust Morphological Disambiguation for Dialectal Arabic

Title Noise-Robust Morphological Disambiguation for Dialectal Arabic
Authors Nasser Zalmout, Alexander Erdmann, Nizar Habash
Abstract User-generated text tends to be noisy, with many lexical and orthographic inconsistencies, making natural language processing (NLP) tasks more challenging. The challenging nature of noisy text processing is exacerbated for dialectal content, where in addition to spelling and lexical differences, dialectal text is characterized by morpho-syntactic and phonetic variations. These issues increase sparsity in NLP models and reduce accuracy. We present a neural morphological tagging and disambiguation model for Egyptian Arabic, with various extensions to handle noisy and inconsistent content. Our models achieve about 5% relative error reduction (1.1% absolute improvement) for full morphological analysis, and around 22% relative error reduction (1.8% absolute improvement) for part-of-speech tagging, over a state-of-the-art baseline.
Tasks Lexical Normalization, Morphological Analysis, Morphological Tagging, Part-Of-Speech Tagging
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-1087/
PDF https://www.aclweb.org/anthology/N18-1087
PWC https://paperswithcode.com/paper/noise-robust-morphological-disambiguation-for
Repo
Framework

Neural Morphological Tagging of Lemma Sequences for Machine Translation

Title Neural Morphological Tagging of Lemma Sequences for Machine Translation
Authors Costanza Conforti, Matthias Huck, Alexander Fraser
Abstract
Tasks Machine Translation, Morphological Tagging
Published 2018-03-01
URL https://www.aclweb.org/anthology/W18-1805/
PDF https://www.aclweb.org/anthology/W18-1805
PWC https://paperswithcode.com/paper/neural-morphological-tagging-of-lemma
Repo
Framework

Phonologically Informed Edit Distance Algorithms for Word Alignment with Low-Resource Languages

Title Phonologically Informed Edit Distance Algorithms for Word Alignment with Low-Resource Languages
Authors Richard T. McCoy, Robert Frank
Abstract
Tasks Machine Translation, Speech Recognition, Word Alignment
Published 2018-01-01
URL https://www.aclweb.org/anthology/W18-0311/
PDF https://www.aclweb.org/anthology/W18-0311
PWC https://paperswithcode.com/paper/phonologically-informed-edit-distance
Repo
Framework

Alternating Randomized Block Coordinate Descent

Title Alternating Randomized Block Coordinate Descent
Authors Jelena Diakonikolas, Lorenzo Orecchia
Abstract Block-coordinate descent algorithms and alternating minimization methods are fundamental optimization algorithms and an important primitive in large-scale optimization and machine learning. While various block-coordinate-descent-type methods have been studied extensively, only alternating minimization – which applies to the setting of only two blocks – is known to have convergence time that scales independently of the least smooth block. A natural question is then: is the setting of two blocks special? We show that the answer is “no” as long as the least smooth block can be optimized exactly – an assumption that is also needed in the setting of alternating minimization. We do so by introducing a novel algorithm AR-BCD, whose convergence time scales independently of the least smooth (possibly non-smooth) block. The basic algorithm generalizes both alternating minimization and randomized block coordinate (gradient) descent, and we also provide its accelerated version – AAR-BCD.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2445
PDF http://proceedings.mlr.press/v80/diakonikolas18a/diakonikolas18a.pdf
PWC https://paperswithcode.com/paper/alternating-randomized-block-coordinate
Repo
Framework
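
The AR-BCD recipe — random gradient steps on the smooth blocks, exact minimization over the least smooth block — can be demonstrated on a small quadratic. The objective, step size, and two-smooth-block setup below are illustrative choices, not the paper's general formulation.

```python
import random

def ar_bcd(steps=200, lr=0.5, seed=0):
    """AR-BCD sketch on
    f(x1, x2, y) = 0.5*x1**2 + 0.5*x2**2 + 0.5*(x1 + x2 + y - 3)**2.
    Each iteration: a gradient step on one random smooth block (x1 or x2),
    then exact minimization over y (playing the 'least smooth' block)."""
    rng = random.Random(seed)
    x = [1.0, 1.0]
    y = 0.0
    for _ in range(steps):
        i = rng.randrange(2)                       # random smooth block
        grad = x[i] + (x[0] + x[1] + y - 3.0)      # df/dx_i
        x[i] -= lr * grad
        y = 3.0 - x[0] - x[1]                      # exact min over y
    return x, y
```

Because y is re-solved exactly after every step, its (lack of) smoothness never enters the step-size choice for the x-blocks, which is the property the paper's convergence analysis formalizes.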