October 15, 2019

Paper Group NANR 249

Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming

Title Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming
Authors Fei Wang, James Decker, Xilun Wu, Gregory Essertel, Tiark Rompf
Abstract Training of deep learning models depends on gradient descent and end-to-end differentiation. Under the slogan of differentiable programming, there is an increasing demand for efficient automatic gradient computation for emerging network architectures that incorporate dynamic control flow, especially in NLP. In this paper we propose an implementation of backpropagation using functions with callbacks, where the forward pass is executed as a sequence of function calls, and the backward pass as a corresponding sequence of function returns. A key realization is that this technique of chaining callbacks is well known in the programming languages community as continuation-passing style (CPS). Any program can be converted to this form using standard techniques, and hence, any program can be mechanically converted to compute gradients. Our approach achieves the same flexibility as other reverse-mode automatic differentiation (AD) techniques, but it can be implemented without any auxiliary data structures besides the function call stack, and it can easily be combined with graph construction and native code generation techniques through forms of multi-stage programming, leading to a highly efficient implementation that combines the performance benefits of define-then-run software frameworks such as TensorFlow with the expressiveness of define-by-run frameworks such as PyTorch.
Tasks Code Generation, Graph Construction
Published 2018-12-01
URL http://papers.nips.cc/paper/8221-backpropagation-with-callbacks-foundations-for-efficient-and-expressive-differentiable-programming
PDF http://papers.nips.cc/paper/8221-backpropagation-with-callbacks-foundations-for-efficient-and-expressive-differentiable-programming.pdf
PWC https://paperswithcode.com/paper/backpropagation-with-callbacks-foundations
Repo
Framework
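
The callback idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Num`, `add`, `mul`, and `grad` are hypothetical, and the sketch handles scalars only. Each operator runs its forward computation, calls the continuation `k` with the result, and accumulates gradients only after `k` returns, so the call stack itself plays the role of the tape.

```python
class Num:
    """A scalar value paired with its accumulated gradient (hypothetical helper)."""

    def __init__(self, x):
        self.x = x      # forward value
        self.d = 0.0    # gradient, filled in during the backward pass


def add(a, b, k):
    y = Num(a.x + b.x)
    k(y)                # forward pass: continue with the rest of the program
    a.d += y.d          # backward pass: runs once the continuation has returned
    b.d += y.d


def mul(a, b, k):
    y = Num(a.x * b.x)
    k(y)
    a.d += b.x * y.d
    b.d += a.x * y.d


def grad(f, x0):
    """Differentiate a function f written in continuation-passing style."""
    x = Num(x0)

    def final(y):
        y.d = 1.0       # seed the output gradient at the end of the forward pass

    f(x, final)
    return x.d


def f(x, k):
    # f(x) = x * x + x, written in CPS by hand for this sketch
    mul(x, x, lambda t: add(t, x, k))


print(grad(f, 3.0))     # derivative 2x + 1 evaluated at x = 3
```

Here the CPS form of `f` is written by hand; the abstract's point is that this conversion can be done mechanically for any program.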

Tweety at SemEval-2018 Task 2: Predicting Emojis using Hierarchical Attention Neural Networks and Support Vector Machine

Title Tweety at SemEval-2018 Task 2: Predicting Emojis using Hierarchical Attention Neural Networks and Support Vector Machine
Authors Daniel Kopev, Atanas Atanasov, Dimitrina Zlatkova, Momchil Hardalov, Ivan Koychev, Ivelina Nikolova, Galia Angelova
Abstract We present the system built for SemEval-2018 Task 2 on Emoji Prediction. Although Twitter messages are very short, we managed to design a wide variety of features: textual, semantic, sentiment, emotion-, and color-related ones. We investigated different methods of text preprocessing, including replacing text emojis with respective tokens and splitting hashtags to capture more meaning. To represent text we used word n-grams and word embeddings. We experimented with a wide range of classifiers, and our best results were achieved using an SVM-based classifier and a Hierarchical Attention Neural Network.
Tasks Word Embeddings
Published 2018-06-01
URL https://www.aclweb.org/anthology/S18-1080/
PDF https://www.aclweb.org/anthology/S18-1080
PWC https://paperswithcode.com/paper/tweety-at-semeval-2018-task-2-predicting
Repo
Framework

GATED FAST WEIGHTS FOR ASSOCIATIVE RETRIEVAL

Title GATED FAST WEIGHTS FOR ASSOCIATIVE RETRIEVAL
Authors Imanol Schlag, Jürgen Schmidhuber
Abstract We improve previous end-to-end differentiable neural networks (NNs) with fast weight memories. A gate mechanism updates fast weights at every time step of a sequence through two separate outer-product-based matrices generated by slow parts of the net. The system is trained on a complex sequence-to-sequence variation of the Associative Retrieval Problem with roughly 70 times more temporal memory (i.e., time-varying variables) than similar-sized standard recurrent NNs (RNNs). In terms of accuracy and number of parameters, our architecture outperforms a variety of RNNs, including Long Short-Term Memory, Hypernetworks, and related fast weight architectures.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=HJ8W1Q-0Z
PDF https://openreview.net/pdf?id=HJ8W1Q-0Z
PWC https://paperswithcode.com/paper/gated-fast-weights-for-associative-retrieval
Repo
Framework

A Multi-task Approach to Learning Multilingual Representations

Title A Multi-task Approach to Learning Multilingual Representations
Authors Karan Singla, Dogan Can, Shrikanth Narayanan
Abstract We present a novel multi-task modeling approach to learning multilingual distributed representations of text. Our system learns word and sentence embeddings jointly by training a multilingual skip-gram model together with a cross-lingual sentence similarity model. Our architecture can transparently use both monolingual and sentence aligned bilingual corpora to learn multilingual embeddings, thus covering a vocabulary significantly larger than the vocabulary of the bilingual corpora alone. Our model shows competitive performance in a standard cross-lingual document classification task. We also show the effectiveness of our method in a limited resource scenario.
Tasks Cross-Lingual Document Classification, Document Classification, Sentence Embeddings, Word Embeddings
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2035/
PDF https://www.aclweb.org/anthology/P18-2035
PWC https://paperswithcode.com/paper/a-multi-task-approach-to-learning
Repo
Framework

Multilingual Seq2seq Training with Similarity Loss for Cross-Lingual Document Classification

Title Multilingual Seq2seq Training with Similarity Loss for Cross-Lingual Document Classification
Authors Katherine Yu, Haoran Li, Barlas Oguz
Abstract In this paper we continue experiments where neural machine translation training is used to produce joint cross-lingual fixed-dimensional sentence embeddings. In this framework we introduce a simple method of adding a loss to the learning objective which penalizes distance between representations of bilingually aligned sentences. We evaluate cross-lingual transfer using two approaches, cross-lingual similarity search on an aligned corpus (Europarl) and cross-lingual document classification on a recently published benchmark Reuters corpus, and we find the similarity loss significantly improves performance on both. Furthermore, we notice that while our Reuters results are very competitive, our English results are not as competitive, showing room for improvement in the current cross-lingual state-of-the-art. Our results are based on a set of 6 European languages.
Tasks Cross-Lingual Document Classification, Cross-Lingual Transfer, Document Classification, Machine Translation, Representation Learning, Sentence Embedding, Sentence Embeddings, Word Embeddings
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3023/
PDF https://www.aclweb.org/anthology/W18-3023
PWC https://paperswithcode.com/paper/multilingual-seq2seq-training-with-similarity
Repo
Framework
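
The similarity loss described in the abstract above can be sketched as an extra term added to the translation objective. The abstract does not say which distance is used, so the mean squared Euclidean distance below is an assumption, as are the function names and the `weight` parameter.

```python
def similarity_loss(src_embs, tgt_embs):
    """Mean squared Euclidean distance between aligned sentence embeddings.

    src_embs[i] and tgt_embs[i] are assumed to be fixed-dimensional embeddings
    of the same sentence in two languages. The choice of distance is an
    assumption of this sketch, not taken from the paper.
    """
    if len(src_embs) != len(tgt_embs):
        raise ValueError("embeddings must come in aligned pairs")
    total = 0.0
    for u, v in zip(src_embs, tgt_embs):
        total += sum((a - b) ** 2 for a, b in zip(u, v))
    return total / len(src_embs)


def combined_loss(translation_loss, src_embs, tgt_embs, weight=1.0):
    """Translation objective plus a weighted penalty on embedding distance."""
    return translation_loss + weight * similarity_loss(src_embs, tgt_embs)


# Two aligned sentence pairs with 2-dimensional embeddings (toy values).
english = [[1.0, 0.0], [0.0, 2.0]]
german = [[0.0, 0.0], [0.0, 2.0]]
print(combined_loss(2.0, english, german, weight=0.5))
```

In a real system the embeddings would come from the encoder and the loss would be minimized by gradient descent; the plain lists here only show how the two terms combine.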

Demo2Vec: Reasoning Object Affordances From Online Videos

Title Demo2Vec: Reasoning Object Affordances From Online Videos
Authors Kuan Fang, Te-Lin Wu, Daniel Yang, Silvio Savarese, Joseph J. Lim
Abstract Watching expert demonstrations is an important way for humans and robots to reason about affordances of unseen objects. In this paper, we consider the problem of reasoning object affordances through the feature embedding of demonstration videos. We design the Demo2Vec model which learns to extract embedded vectors of demonstration videos and predicts the interaction region and the action label on a target image of the same object. We introduce the Online Product Review dataset for Affordance (OPRA) by collecting and labeling diverse YouTube product review videos. Our Demo2Vec model outperforms various recurrent neural network baselines on the collected dataset.
Tasks
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Fang_Demo2Vec_Reasoning_Object_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Fang_Demo2Vec_Reasoning_Object_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/demo2vec-reasoning-object-affordances-from
Repo
Framework

Lyapunov Functions for First-Order Methods: Tight Automated Convergence Guarantees

Title Lyapunov Functions for First-Order Methods: Tight Automated Convergence Guarantees
Authors Adrien Taylor, Bryan Van Scoy, Laurent Lessard
Abstract We present a novel way of generating Lyapunov functions for proving linear convergence rates of first-order optimization methods. Our approach provably obtains the fastest linear convergence rate that can be verified by a quadratic Lyapunov function (with given states), and only relies on solving a small-sized semidefinite program. Our approach combines the advantages of performance estimation problems (PEP, due to Drori and Teboulle (2014)) and integral quadratic constraints (IQC, due to Lessard et al. (2016)), and relies on convex interpolation (due to Taylor et al. (2017c;b)).
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2168
PDF http://proceedings.mlr.press/v80/taylor18a/taylor18a.pdf
PWC https://paperswithcode.com/paper/lyapunov-functions-for-first-order-methods
Repo
Framework

Crowdsourced Multimodal Corpora Collection Tool

Title Crowdsourced Multimodal Corpora Collection Tool
Authors Patrik Jonell, Catharine Oertel, Dimosthenis Kontogiorgos, Jonas Beskow, Joakim Gustafson
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1117/
PDF https://www.aclweb.org/anthology/L18-1117
PWC https://paperswithcode.com/paper/crowdsourced-multimodal-corpora-collection
Repo
Framework

Nonparametric Regression with Comparisons: Escaping the Curse of Dimensionality with Ordinal Information

Title Nonparametric Regression with Comparisons: Escaping the Curse of Dimensionality with Ordinal Information
Authors Yichong Xu, Hariank Muthakana, Sivaraman Balakrishnan, Aarti Singh, Artur Dubrawski
Abstract In supervised learning, we leverage a labeled dataset to design methods for function estimation. In many practical situations, we are able to obtain alternative feedback, possibly at a low cost. A broad goal is to understand the usefulness of, and to design algorithms to exploit, this alternative feedback. We focus on a semi-supervised setting where we obtain additional ordinal (or comparison) information for potentially unlabeled samples. We consider ordinal feedback of varying qualities where we have either a perfect ordering of the samples, a noisy ordering of the samples, or noisy pairwise comparisons between the samples. We provide a precise quantification of the usefulness of these types of ordinal feedback in non-parametric regression, showing that in many cases it is possible to accurately estimate an underlying function with a very small labeled set, effectively escaping the curse of dimensionality. We develop an algorithm called Ranking-Regression (RR) and analyze its accuracy as a function of the size of the labeled and unlabeled datasets and various noise parameters. We also present lower bounds that establish fundamental limits for the task and show that RR is optimal in a variety of settings. Finally, we present experiments that show the efficacy of RR and investigate its robustness to various sources of noise and model misspecification.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2095
PDF http://proceedings.mlr.press/v80/xu18e/xu18e.pdf
PWC https://paperswithcode.com/paper/nonparametric-regression-with-comparisons-1
Repo
Framework

Streaming word similarity mining on the cheap

Title Streaming word similarity mining on the cheap
Authors Olof Görnerup, Daniel Gillblad
Abstract Accurately and efficiently estimating word similarities from text is fundamental in natural language processing. In this paper, we propose a fast and lightweight method for estimating similarities from streams by explicitly counting second-order co-occurrences. The method rests on the observation that words that are highly correlated with respect to such counts are also highly similar with respect to first-order co-occurrences. Using buffers of co-occurred words per word to count second-order co-occurrences, we can then estimate similarities in a single pass over data without having to do prohibitively expensive similarity calculations. We demonstrate that this approach is scalable, converges rapidly, behaves robustly under parameter changes, and that it captures word similarities on par with those given by state-of-the-art word embeddings.
Tasks Document Classification, Word Alignment, Word Embeddings
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1172/
PDF https://www.aclweb.org/anthology/D18-1172
PWC https://paperswithcode.com/paper/streaming-word-similarity-mining-on-the-cheap
Repo
Framework

A Named Entity Recognition Shootout for German

Title A Named Entity Recognition Shootout for German
Authors Martin Riedl, Sebastian Padó
Abstract We ask how to practically build a model for German named entity recognition (NER) that performs at the state of the art for both contemporary and historical texts, i.e., a big-data and a small-data scenario. The two best-performing model families (linear-chain CRFs and BiLSTMs) are pitted against each other to observe the trade-off between expressiveness and data requirements. The BiLSTM outperforms the CRF when large datasets are available and performs worse on the smallest dataset. BiLSTMs profit substantially from transfer learning, which enables them to be trained on multiple corpora, resulting in a new state-of-the-art model for German NER on two contemporary German corpora (CoNLL 2003 and GermEval 2014) and two historic corpora.
Tasks Entity Linking, Named Entity Recognition, Question Answering, Representation Learning, Transfer Learning
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2020/
PDF https://www.aclweb.org/anthology/P18-2020
PWC https://paperswithcode.com/paper/a-named-entity-recognition-shootout-for
Repo
Framework

Binary Partitions with Approximate Minimum Impurity

Title Binary Partitions with Approximate Minimum Impurity
Authors Eduardo Laber, Marco Molinaro, Felipe Mello Pereira
Abstract The problem of splitting attributes is one of the main steps in the construction of decision trees. In order to decide the best split, impurity measures such as Entropy and Gini are widely used. In practice, decision-tree inducers use heuristics for finding splits with small impurity when they consider nominal attributes with a large number of distinct values. However, there are no known guarantees for the quality of the splits obtained by these heuristics. To fill this gap, we propose two new splitting procedures that provably achieve near-optimal impurity. We also report experiments that provide evidence that the proposed methods are interesting candidates to be employed in splitting nominal attributes with many values during decision tree/random forest induction.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=1929
PDF http://proceedings.mlr.press/v80/laber18a/laber18a.pdf
PWC https://paperswithcode.com/paper/binary-partitions-with-approximate-minimum
Repo
Framework
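
To make the quantity in the abstract above concrete, the sketch below scores one binary partition of a nominal attribute by its weighted Gini impurity. It only evaluates a given split; it is not one of the paper's two splitting procedures, and the function names and data layout are assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_impurity(rows, left_values):
    """Weighted Gini impurity of a binary partition of a nominal attribute.

    rows: list of (attribute_value, class_label) pairs.
    left_values: set of attribute values sent to the left branch;
    all other values go to the right branch.
    """
    left = [label for value, label in rows if value in left_values]
    right = [label for value, label in rows if value not in left_values]
    n = len(rows)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)


# Toy data: attribute values "a", "b", "c" with class labels "x" and "y".
rows = [("a", "x"), ("a", "x"), ("b", "y"), ("c", "y")]
print(split_impurity(rows, {"a"}))       # pure branches on both sides
print(split_impurity(rows, {"a", "b"}))  # left branch mixes both classes
```

An attribute with many distinct values has exponentially many such partitions, which is why inducers fall back on heuristics and why guarantees on their quality are of interest.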

Annotating Chinese Light Verb Constructions according to PARSEME guidelines

Title Annotating Chinese Light Verb Constructions according to PARSEME guidelines
Authors Menghan Jiang, Natalia Klyueva, Hongzhi Xu, Chu-Ren Huang
Abstract
Tasks Machine Translation
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1394/
PDF https://www.aclweb.org/anthology/L18-1394
PWC https://paperswithcode.com/paper/annotating-chinese-light-verb-constructions
Repo
Framework

NORMA: Neighborhood Sensitive Maps for Multilingual Word Embeddings

Title NORMA: Neighborhood Sensitive Maps for Multilingual Word Embeddings
Authors Ndapa Nakashole
Abstract Inducing multilingual word embeddings by learning a linear map between embedding spaces of different languages achieves remarkable accuracy on related languages. However, accuracy drops substantially when translating between distant languages. Given that languages exhibit differences in vocabulary, grammar, written form, or syntax, one would expect that embedding spaces of different languages have different structures especially for distant languages. With the goal of capturing such differences, we propose a method for learning neighborhood sensitive maps, NORMA. Our experiments show that NORMA outperforms current state-of-the-art methods for word translation between distant languages.
Tasks Machine Translation, Multilingual Word Embeddings, Word Embeddings
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1047/
PDF https://www.aclweb.org/anthology/D18-1047
PWC https://paperswithcode.com/paper/norma-neighborhood-sensitive-maps-for
Repo
Framework

The Context-Aware Learner

Title The Context-Aware Learner
Authors Conor Durkan, Amos Storkey, Harrison Edwards
Abstract One important aspect of generalization in machine learning involves reasoning about previously seen data in new settings. Such reasoning requires learning disentangled representations of data which are interpretable in isolation, but can also be combined in a new, unseen scenario. To this end, we introduce the context-aware learner, a model based on the variational autoencoding framework, which can learn such representations across data sets exhibiting a number of distinct contexts. Moreover, it is successfully able to combine these representations to generate data not seen at training time. The model enjoys an exponential increase in representational ability for a linear increase in context count. We demonstrate that the theory readily extends to a meta-learning setting such as this, and describe a fully unsupervised model in complete generality. Finally, we validate our approach using an adaptation with weak supervision.
Tasks Meta-Learning
Published 2018-01-01
URL https://openreview.net/forum?id=BJRxfZbAW
PDF https://openreview.net/pdf?id=BJRxfZbAW
PWC https://paperswithcode.com/paper/the-context-aware-learner
Repo
Framework