January 25, 2020

Paper Group NANR 74

Regression or classification? Automated Essay Scoring for Norwegian


Title	Regression or classification? Automated Essay Scoring for Norwegian
Authors	Stig Johan Berggren, Taraka Rama, Lilja Øvrelid
Abstract	In this paper we present first results for the task of Automated Essay Scoring for Norwegian learner language. We analyze a number of properties of this task experimentally and assess (i) the formulation of the task as either regression or classification, (ii) the use of various non-neural and neural machine learning architectures with various types of input representations, and (iii) applying multi-task learning for joint prediction of essay scoring and native language identification. We find that a GRU-based attention model trained in a single-task setting performs best at the AES task.
Tasks	Language Identification, Multi-Task Learning, Native Language Identification
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4409/
PDF	https://www.aclweb.org/anthology/W19-4409
PWC	https://paperswithcode.com/paper/regression-or-classification-automated-essay
Repo
Framework

Meta-Learning with Domain Adaptation for Few-Shot Learning under Domain Shift


Title	Meta-Learning with Domain Adaptation for Few-Shot Learning under Domain Shift
Authors	Doyen Sahoo, Hung Le, Chenghao Liu, Steven C. H. Hoi
Abstract	Few-Shot Learning (learning with limited labeled data) aims to overcome the limitations of traditional machine learning approaches which require thousands of labeled examples to train an effective model. Few-shot learning is considered a hallmark of human intelligence, and the community has recently witnessed several contributions on this topic, in particular through meta-learning, where a model learns how to learn an effective model for few-shot learning. The main idea is to acquire prior knowledge from a set of training tasks, which is then used to perform (few-shot) test tasks. Most existing work assumes that both training and test tasks are drawn from the same distribution, and that a large amount of labeled data is available in the training tasks. This is a very strong assumption which restricts the usage of meta-learning strategies in the real world, where ample training tasks following the same distribution as test tasks may not be available. In this paper, we propose a novel meta-learning paradigm wherein a few-shot learning model is learnt, which simultaneously overcomes domain shift between the train and test tasks via adversarial domain adaptation. We demonstrate the efficacy of the proposed method through extensive experiments.
Tasks	Domain Adaptation, Few-Shot Learning, Meta-Learning
Published	2019-05-01
URL	https://openreview.net/forum?id=ByGOuo0cYm
PDF	https://openreview.net/pdf?id=ByGOuo0cYm
PWC	https://paperswithcode.com/paper/meta-learning-with-domain-adaptation-for-few
Repo
Framework

Sparse Victory – A Large Scale Systematic Comparison of count-based and prediction-based vectorizers for text classification


Title	Sparse Victory – A Large Scale Systematic Comparison of count-based and prediction-based vectorizers for text classification
Authors	Rupak Chakraborty, Ashima Elhence, Kapil Arora
Abstract	In this paper we study the performance of several text vectorization algorithms on a diverse collection of 73 publicly available datasets. Traditional sparse vectorizers like Tf-Idf and Feature Hashing have been systematically compared with the latest state-of-the-art neural word embeddings like Word2Vec, GloVe, FastText and character embeddings like ELMo, Flair. We have carried out an extensive analysis of the performance of these vectorizers across different dimensions like classification metrics (i.e., precision, recall, accuracy), dataset size, and imbalanced data (in terms of the distribution of the number of class labels). Our experiments reveal that the sparse vectorizers beat the neural word and character embedding models on 61 of the 73 datasets by an average margin of 3-5% (in terms of macro F1 score) and this performance is consistent across the different dimensions of comparison.
Tasks	Text Classification, Word Embeddings
Published	2019-09-01
URL	https://www.aclweb.org/anthology/R19-1022/
PDF	https://www.aclweb.org/anthology/R19-1022
PWC	https://paperswithcode.com/paper/sparse-victory-a-large-scale-systematic
Repo
Framework
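
The abstract above reports its margins in macro F1 and singles out imbalanced data as one dimension of comparison. As a minimal sketch of why that metric matters there, the snippet below computes macro F1 (the unweighted mean of per-class F1) next to accuracy. The toy labels are hypothetical and are not taken from the paper's 73 datasets.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label gets a false positive
            fn[t] += 1  # true label gets a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # A class with no true positives, false positives, or false
        # negatives is scored 0.0 here (a convention, not a standard).
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced toy data: 8 examples of "a", 2 of "b".
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 10  # a classifier that always predicts the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.3f} macro_f1={score:.3f}")
```

On this toy input the majority-class predictor keeps a high accuracy while macro F1 drops sharply, because the minority class contributes an F1 of zero with equal weight.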

HLT@SUDA at SemEval-2019 Task 1: UCCA Graph Parsing as Constituent Tree Parsing


Title	HLT@SUDA at SemEval-2019 Task 1: UCCA Graph Parsing as Constituent Tree Parsing
Authors	Wei Jiang, Zhenghua Li, Yu Zhang, Min Zhang
Abstract	This paper describes a simple UCCA semantic graph parsing approach. The key idea is to convert a UCCA semantic graph into a constituent tree, in which extra labels are deliberately designed to mark remote edges and discontinuous nodes for future recovery. In this way, we can make use of existing syntactic parsing techniques. Based on the data statistics, we recover discontinuous nodes directly according to the output labels of the constituent parser and use a biaffine classification model to recover the more complex remote edges. The classification model and the constituent parser are simultaneously trained under the multi-task learning framework. We use the multilingual BERT as extra features in the open tracks. Our system ranks first in the six English/German closed/open tracks among seven participating systems. For the seventh cross-lingual track, where there is little training data for French, we propose a language embedding approach to utilize English and German training data, and our result ranks second.
Tasks	Multi-Task Learning
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2002/
PDF	https://www.aclweb.org/anthology/S19-2002
PWC	https://paperswithcode.com/paper/hltsuda-at-semeval-2019-task-1-ucca-graph-1
Repo
Framework

Trajectory VAE for multi-modal imitation


Title	Trajectory VAE for multi-modal imitation
Authors	Xiaoyu Lu, Jan Stuehmer, Katja Hofmann
Abstract	We address the problem of imitating multi-modal expert demonstrations in sequential decision making problems. In many practical applications, for example video games, behavioural demonstrations are readily available that contain multi-modal structure not captured by typical existing imitation learning approaches. For example, differences in the observed players’ behaviours may be representative of different underlying playstyles. In this paper, we use a generative model to capture different emergent playstyles in an unsupervised manner, enabling the imitation of a diverse range of distinct behaviours. We utilise a variational autoencoder to learn an embedding of the different types of expert demonstrations on the trajectory level, and jointly learn a latent representation with a policy. In experiments on a range of 2D continuous control problems representative of Minecraft environments, we empirically demonstrate that our model can capture a multi-modal structured latent space from the demonstrated behavioural trajectories.
Tasks	Continuous Control, Decision Making, Imitation Learning
Published	2019-05-01
URL	https://openreview.net/forum?id=Byx1VnR9K7
PDF	https://openreview.net/pdf?id=Byx1VnR9K7
PWC	https://paperswithcode.com/paper/trajectory-vae-for-multi-modal-imitation
Repo
Framework

View Independent Generative Adversarial Network for Novel View Synthesis


Title	View Independent Generative Adversarial Network for Novel View Synthesis
Authors	Xiaogang Xu, Ying-Cong Chen, Jiaya Jia
Abstract	Synthesizing novel views from a 2D image requires inferring 3D structure and projecting it back to 2D from a new viewpoint. In this paper, we propose an encoder-decoder based generative adversarial network, VI-GAN, to tackle this problem. Our method is to let the network, after seeing many images of objects belonging to the same category in different views, obtain essential knowledge of intrinsic properties of the objects. To this end, an encoder is designed to extract a view-independent feature that characterizes intrinsic properties of the input image, which include 3D structure, color, texture, etc. We also make the decoder hallucinate the image of a novel view based on the extracted feature and an arbitrary user-specific camera pose. Extensive experiments demonstrate that our model can synthesize high-quality images in different views with continuous camera poses, and is general for various applications.
Tasks	Novel View Synthesis
Published	2019-10-01
URL	http://openaccess.thecvf.com/content_ICCV_2019/html/Xu_View_Independent_Generative_Adversarial_Network_for_Novel_View_Synthesis_ICCV_2019_paper.html
PDF	http://openaccess.thecvf.com/content_ICCV_2019/papers/Xu_View_Independent_Generative_Adversarial_Network_for_Novel_View_Synthesis_ICCV_2019_paper.pdf
PWC	https://paperswithcode.com/paper/view-independent-generative-adversarial
Repo
Framework

The BLCU System in the BEA 2019 Shared Task


Title	The BLCU System in the BEA 2019 Shared Task
Authors	Liner Yang, Chencheng Wang
Abstract	This paper describes the BLCU Group submissions to the Building Educational Applications (BEA) 2019 Shared Task on Grammatical Error Correction (GEC). The task is to detect and correct grammatical errors that occur in essays. We participate in two tracks, the Restricted Track and the Unrestricted Track. Our system is based on a Transformer model architecture. We integrate many effective methods proposed in recent years, such as Byte Pair Encoding, model ensembling, checkpoint averaging and a spell checker. We also corrupt the public monolingual data to further improve the performance of the model. On the test data of the BEA 2019 Shared Task, our system yields F0.5 = 58.62 and 59.50, ranking twelfth and fourth respectively.
Tasks	Grammatical Error Correction
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4421/
PDF	https://www.aclweb.org/anthology/W19-4421
PWC	https://paperswithcode.com/paper/the-blcu-system-in-the-bea-2019-shared-task
Repo
Framework
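
Checkpoint averaging is one of the techniques the abstract above lists. The abstract does not say how the BLCU system implements it, so the following is only a generic sketch of the idea: take several saved parameter sets and average them element-wise. The checkpoint layout (parameter name mapped to a flat list of floats) and the function name are assumptions made for illustration.

```python
def average_checkpoints(checkpoints):
    """Average parameters element-wise across checkpoints.

    Assumes every checkpoint has the same parameter names and shapes.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    averaged = {}
    for name in checkpoints[0]:
        vectors = [ckpt[name] for ckpt in checkpoints]
        # zip(*vectors) groups the i-th value of every checkpoint together.
        averaged[name] = [sum(vals) / len(vals) for vals in zip(*vectors)]
    return averaged


# Hypothetical checkpoints saved at two different training steps.
ckpts = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
avg = average_checkpoints(ckpts)
print(avg)
```

In practice the same averaging would be applied to framework tensors rather than Python lists, but the arithmetic is unchanged.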

Modulating transfer between tasks in gradient-based meta-learning


Title	Modulating transfer between tasks in gradient-based meta-learning
Authors	Erin Grant, Ghassen Jerfel, Katherine Heller, Thomas L. Griffiths
Abstract	Learning-to-learn or meta-learning leverages data-driven inductive bias to increase the efficiency of learning on a novel task. This approach encounters difficulty when transfer is not mutually beneficial, for instance, when tasks are sufficiently dissimilar or change over time. Here, we use the connection between gradient-based meta-learning and hierarchical Bayes to propose a mixture of hierarchical Bayesian models over the parameters of an arbitrary function approximator such as a neural network. Generalizing the model-agnostic meta-learning (MAML) algorithm, we present a stochastic expectation maximization procedure to jointly estimate parameter initializations for gradient descent as well as a latent assignment of tasks to initializations. This approach better captures the diversity of training tasks as opposed to consolidating inductive biases into a single set of hyperparameters. Our experiments demonstrate better generalization on the standard miniImageNet benchmark for 1-shot classification. We further derive a novel and scalable non-parametric variant of our method that captures the evolution of a task distribution over time as demonstrated on a set of few-shot regression tasks.
Tasks	few-shot regression, Meta-Learning
Published	2019-05-01
URL	https://openreview.net/forum?id=HyxpNnRcFX
PDF	https://openreview.net/pdf?id=HyxpNnRcFX
PWC	https://paperswithcode.com/paper/modulating-transfer-between-tasks-in-gradient
Repo
Framework
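
For orientation, the standard MAML update that the abstract above says it generalizes can be written as below. This is the original single-initialization formulation, not the mixture model proposed in the paper; the symbols ($\theta$ for the shared initialization, $\alpha$ and $\beta$ for the inner and outer step sizes, $\mathcal{L}_{\mathcal{T}_i}$ for the loss on task $\mathcal{T}_i$) follow common usage and are not taken from the abstract.

```latex
% Inner loop: adapt the shared initialization to task T_i with one gradient step.
\theta_i' = \theta - \alpha \, \nabla_{\theta} \mathcal{L}_{\mathcal{T}_i}(f_{\theta})

% Outer loop: update the initialization using the post-adaptation losses.
\theta \leftarrow \theta - \beta \, \nabla_{\theta} \sum_{\mathcal{T}_i} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})
```

According to the abstract, the proposed method replaces the single $\theta$ with several initializations and a latent assignment of tasks to them, estimated by stochastic expectation maximization. That extension is not shown here.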

Towards A Robust Morphological Analyzer for Kunwinjku


Title	Towards A Robust Morphological Analyzer for Kunwinjku
Authors	William Lane, Steven Bird
Abstract	Kunwinjku is an indigenous Australian language spoken in northern Australia which exhibits agglutinative and polysynthetic properties. Members of the community have expressed interest in co-developing language applications that promote their values and priorities. Modeling the morphology of the Kunwinjku language is an important step towards accomplishing the community's goals. Finite State Transducers have long been the go-to method for modeling morphologically rich languages, and in this paper we discuss some of the distinct modeling challenges present in the morphosyntax of verbs in Kunwinjku. We show that a fairly straightforward implementation using standard features of the foma toolkit can account for much of the verb structure. Continuing challenges include robustness in the face of variation and unseen vocabulary, as well as how to handle complex reduplicative processes. Our future work will build off the baseline and challenges presented here.
Tasks
Published	2019-04-01
URL	https://www.aclweb.org/anthology/U19-1001/
PDF	https://www.aclweb.org/anthology/U19-1001
PWC	https://paperswithcode.com/paper/towards-a-robust-morphological-analyzer-for
Repo
Framework

Activity Regularization for Continual Learning


Title	Activity Regularization for Continual Learning
Authors	Quang H. Pham, Steven C. H. Hoi
Abstract	While deep neural networks have achieved remarkable successes, they suffer from the well-known catastrophic forgetting issue when switching from existing tasks to tackle a new one. In this paper, we study continual learning with deep neural networks that learn from tasks arriving sequentially. We first propose an approximated multi-task learning framework that unifies a family of popular regularization-based continual learning methods. We then analyze the weakness of existing approaches, and propose a novel regularization method named “Activity Regularization” (AR), which alleviates forgetting while keeping the model’s plasticity to acquire new knowledge. Extensive experiments show that our method outperforms state-of-the-art methods and effectively overcomes catastrophic forgetting.
Tasks	Continual Learning, Multi-Task Learning
Published	2019-05-01
URL	https://openreview.net/forum?id=S1G_cj05YQ
PDF	https://openreview.net/pdf?id=S1G_cj05YQ
PWC	https://paperswithcode.com/paper/activity-regularization-for-continual
Repo
Framework

Investigating Cross-Lingual Alignment Methods for Contextualized Embeddings with Token-Level Evaluation


Title	Investigating Cross-Lingual Alignment Methods for Contextualized Embeddings with Token-Level Evaluation
Authors	Qianchu Liu, Diana McCarthy, Ivan Vulić, Anna Korhonen
Abstract	In this paper, we present a thorough investigation of methods that align pre-trained contextualized embeddings into a shared cross-lingual context-aware embedding space, providing strong reference benchmarks for future context-aware cross-lingual models. We propose a novel and challenging task, Bilingual Token-level Sense Retrieval (BTSR). It specifically evaluates the accurate alignment of words with the same meaning in cross-lingual non-parallel contexts, currently not evaluated by existing tasks such as Bilingual Contextual Word Similarity and Sentence Retrieval. We show how the proposed BTSR task highlights the merits of different alignment methods. In particular, we find that using context average type-level alignment is effective in transferring monolingual contextualized embeddings cross-lingually, especially in non-parallel contexts, and at the same time improves the monolingual space. Furthermore, aligning independently trained models yields better performance than aligning multilingual embeddings with shared vocabulary.
Tasks
Published	2019-11-01
URL	https://www.aclweb.org/anthology/K19-1004/
PDF	https://www.aclweb.org/anthology/K19-1004
PWC	https://paperswithcode.com/paper/investigating-cross-lingual-alignment-methods
Repo
Framework

Eliminating Exposure Bias and Metric Mismatch in Multiple Object Tracking


Title	Eliminating Exposure Bias and Metric Mismatch in Multiple Object Tracking
Authors	Andrii Maksai, Pascal Fua
Abstract	Identity Switching remains one of the main difficulties Multiple Object Tracking (MOT) algorithms have to deal with. Many state-of-the-art approaches now use sequence models to solve this problem, but their training can be affected by biases that decrease their efficiency. In this paper, we introduce a new training procedure that confronts the algorithm with its own mistakes while explicitly attempting to minimize the number of switches, which results in better training. We propose an iterative scheme of building a rich training set and using it to learn a scoring function that is an explicit proxy for the target tracking metric. Whether using only simple geometric features or more sophisticated ones that also take appearance into account, our approach outperforms the state-of-the-art on several MOT benchmarks.
Tasks	Multiple Object Tracking, Object Tracking
Published	2019-06-01
URL	http://openaccess.thecvf.com/content_CVPR_2019/html/Maksai_Eliminating_Exposure_Bias_and_Metric_Mismatch_in_Multiple_Object_Tracking_CVPR_2019_paper.html
PDF	http://openaccess.thecvf.com/content_CVPR_2019/papers/Maksai_Eliminating_Exposure_Bias_and_Metric_Mismatch_in_Multiple_Object_Tracking_CVPR_2019_paper.pdf
PWC	https://paperswithcode.com/paper/eliminating-exposure-bias-and-metric-mismatch
Repo
Framework

Neural and FST-based approaches to grammatical error correction


Title	Neural and FST-based approaches to grammatical error correction
Authors	Zheng Yuan, Felix Stahlberg, Marek Rei, Bill Byrne, Helen Yannakoudakis
Abstract	In this paper, we describe our submission to the BEA 2019 shared task on grammatical error correction. We present a system pipeline that utilises both error detection and correction models. The input text is first corrected by two complementary neural machine translation systems: one using convolutional networks and multi-task learning, and another using a neural Transformer-based system. Training is performed on publicly available data, along with artificial examples generated through back-translation. The n-best lists of these two machine translation systems are then combined and scored using a finite state transducer (FST). Finally, an unsupervised re-ranking system is applied to the n-best output of the FST. The re-ranker uses a number of error detection features to re-rank the FST n-best list and identify the final 1-best correction hypothesis. Our system achieves 66.75% F0.5 on error correction (ranking 4th), and 82.52% F0.5 on token-level error detection (ranking 2nd) in the restricted track of the shared task.
Tasks	Grammatical Error Correction, Machine Translation, Multi-Task Learning
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4424/
PDF	https://www.aclweb.org/anthology/W19-4424
PWC	https://paperswithcode.com/paper/neural-and-fst-based-approaches-to
Repo
Framework
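
The results above are reported as F0.5, the F-beta score with beta = 0.5, which weights precision more heavily than recall. As a small sketch of how that score follows from precision and recall, the function below implements the standard F-beta formula. The precision and recall values in the example are hypothetical and are not the shared-task numbers.

```python
def f_beta(precision, recall, beta=0.5):
    """Standard F-beta score; beta < 1 favours precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical system: high precision, low recall.
p, r = 0.8, 0.4
f05 = f_beta(p, r, beta=0.5)
f1 = f_beta(p, r, beta=1.0)
print(f"F0.5={f05:.4f} F1={f1:.4f}")
```

For a system whose precision exceeds its recall, F0.5 comes out higher than F1, which is why it suits a task where proposing a wrong correction is considered worse than missing an error.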

OpenDialKG: Explainable Conversational Reasoning with Attention-based Walks over Knowledge Graphs


Title	OpenDialKG: Explainable Conversational Reasoning with Attention-based Walks over Knowledge Graphs
Authors	Seungwhan Moon, Pararth Shah, Anuj Kumar, Rajen Subba
Abstract	We study a conversational reasoning model that strategically traverses through a large-scale common fact knowledge graph (KG) to introduce engaging and contextually diverse entities and attributes. For this study, we collect a new Open-ended Dialog <-> KG parallel corpus called OpenDialKG, where each utterance from 15K human-to-human role-playing dialogs is manually annotated with ground-truth reference to corresponding entities and paths from a large-scale KG with 1M+ facts. We then propose the DialKG Walker model that learns the symbolic transitions of dialog contexts as structured traversals over KG, and predicts natural entities to introduce given previous dialog contexts via a novel domain-agnostic, attention-based graph path decoder. Automatic and human evaluations show that our model can retrieve more natural and human-like responses than the state-of-the-art baselines or rule-based models, in both in-domain and cross-domain tasks. The proposed model also generates a KG walk path for each entity retrieved, providing a natural way to explain conversational reasoning.
Tasks	Knowledge Graphs
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1081/
PDF	https://www.aclweb.org/anthology/P19-1081
PWC	https://paperswithcode.com/paper/opendialkg-explainable-conversational
Repo
Framework

Subquadratic High-Dimensional Hierarchical Clustering


Title	Subquadratic High-Dimensional Hierarchical Clustering
Authors	Amir Abboud, Vincent Cohen-Addad, Hussein Houdrouge
Abstract	We consider the widely-used average-linkage, single-linkage, and Ward’s methods for computing hierarchical clusterings of high-dimensional Euclidean inputs. It is easy to show that there is no efficient implementation of these algorithms in high-dimensional Euclidean space, since they implicitly require solving the closest pair problem, a notoriously difficult problem. However, how fast can these algorithms be implemented if we allow approximation? More precisely: these algorithms successively merge the clusters that are at closest average distance (for average-linkage), at minimum distance (for single-linkage), or that induce the least sum-of-square error (for Ward’s). We ask whether one could obtain a significant running-time improvement if the algorithm can merge $\gamma$-approximate closest clusters (namely, clusters that are at distance (average, minimum, or sum-of-square error) at most $\gamma$ times the distance of the closest clusters). We show that one can indeed take advantage of the relaxation and compute the approximate hierarchical clustering tree using $\widetilde{O}(n)$ $\gamma$-approximate nearest neighbor queries. This leads to an algorithm running in time $\widetilde{O}(nd) + n^{1+O(1/\gamma)}$ for $d$-dimensional Euclidean space. We then provide experiments showing that these algorithms perform as well as the non-approximate version for classic classification tasks while achieving a significant speed-up.
Tasks
Published	2019-12-01
URL	http://papers.nips.cc/paper/9333-subquadratic-high-dimensional-hierarchical-clustering
PDF	http://papers.nips.cc/paper/9333-subquadratic-high-dimensional-hierarchical-clustering.pdf
PWC	https://paperswithcode.com/paper/subquadratic-high-dimensional-hierarchical
Repo
Framework
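
The relaxed merge rule in the abstract above can be illustrated with a brute-force sketch for single-linkage: at each step, any pair of clusters whose distance is at most $\gamma$ times the current closest-pair distance may be merged. This is not the paper's subquadratic algorithm, which relies on approximate nearest neighbor queries. The code below runs in roughly cubic time and only demonstrates the rule itself. The function names, the random choice among admissible pairs, and the sample points are assumptions made for illustration.

```python
import math
import random


def single_linkage_distance(a, b):
    """Minimum Euclidean distance between a point of a and a point of b."""
    return min(math.dist(p, q) for p in a for q in b)


def approx_single_linkage(points, gamma=1.0, seed=0):
    """Agglomerative clustering with a gamma-approximate merge rule.

    Returns one (merged_distance, closest_distance) pair per merge so the
    approximation guarantee can be checked afterwards.
    """
    rng = random.Random(seed)
    clusters = [[tuple(p)] for p in points]
    merges = []
    while len(clusters) > 1:
        pairs = [
            (single_linkage_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        best = min(d for d, _, _ in pairs)
        # Any pair within a factor gamma of the closest pair is admissible.
        admissible = [(d, i, j) for d, i, j in pairs if d <= gamma * best]
        d, i, j = rng.choice(admissible)
        merges.append((d, best))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical 2D points; the paper targets high-dimensional inputs.
pts = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
for d, best in approx_single_linkage(pts, gamma=2.0):
    print(f"merged at distance {d:.3f} (closest pair was {best:.3f})")
```

With `gamma=1.0` the procedure reduces to exact single-linkage, since only pairs at the minimum distance are admissible.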