January 24, 2020


Paper Group NANR 149

Towards Summarization for Social Media - Results of the TL;DR Challenge. Embedding Complementary Deep Networks for Image Classification. Scalable Knowledge Graph Construction from Text Collections. Surface Realisation Using Full Delexicalisation. Modeling Paths for Explainable Knowledge Base Completion. Probing Word and Sentence Embeddings for Long …


Title	Towards Summarization for Social Media - Results of the TL;DR Challenge
Authors	Shahbaz Syed, Michael Völske, Nedim Lipka, Benno Stein, Hinrich Schütze, Martin Potthast
Abstract	In this paper, we report on the results of the TL;DR challenge, discussing an extensive manual evaluation of the expected properties of a good summary based on analyzing the comments provided by human annotators.
Tasks
Published	2019-10-01
URL	https://www.aclweb.org/anthology/W19-8666/
PDF	https://www.aclweb.org/anthology/W19-8666
PWC	https://paperswithcode.com/paper/towards-summarization-for-social-media
Repo
Framework

Embedding Complementary Deep Networks for Image Classification


Title	Embedding Complementary Deep Networks for Image Classification
Authors	Qiuyu Chen, Wei Zhang, Jun Yu, Jianping Fan
Abstract	In this paper, a deep embedding algorithm is developed to achieve higher accuracy rates on large-scale image classification. By adapting the importance of the object classes to their error rates, our deep embedding algorithm can train multiple complementary deep networks sequentially, where each of them focuses on achieving higher accuracy rates for different subsets of object classes in an easy-to-hard way. By integrating such complementary deep networks to generate an ensemble network, our deep embedding algorithm can improve the accuracy rates for the hard object classes (which initially have higher error rates) at certain degrees while effectively preserving high accuracy rates for the easy object classes. Our deep embedding algorithm has achieved higher overall accuracy rates on large scale image classification.
Tasks	Image Classification
Published	2019-06-01
URL	http://openaccess.thecvf.com/content_CVPR_2019/html/Chen_Embedding_Complementary_Deep_Networks_for_Image_Classification_CVPR_2019_paper.html
PDF	http://openaccess.thecvf.com/content_CVPR_2019/papers/Chen_Embedding_Complementary_Deep_Networks_for_Image_Classification_CVPR_2019_paper.pdf
PWC	https://paperswithcode.com/paper/embedding-complementary-deep-networks-for
Repo
Framework

Scalable Knowledge Graph Construction from Text Collections


Title	Scalable Knowledge Graph Construction from Text Collections
Authors	Ryan Clancy, Ihab F. Ilyas, Jimmy Lin
Abstract	We present a scalable, open-source platform that "distills" a potentially large text collection into a knowledge graph. Our platform takes documents stored in Apache Solr and scales out the Stanford CoreNLP toolkit via Apache Spark integration to extract mentions and relations that are then ingested into the Neo4j graph database. The raw knowledge graph is then enriched with facts extracted from an external knowledge graph. The complete product can be manipulated by various applications using Neo4j's native Cypher query language: We present a subgraph-matching approach to align extracted relations with external facts and show that fact verification, locating textual support for asserted facts, detecting inconsistent and missing facts, and extracting distantly-supervised training data can all be performed within the same framework.
Tasks	graph construction
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-6607/
PDF	https://www.aclweb.org/anthology/D19-6607
PWC	https://paperswithcode.com/paper/scalable-knowledge-graph-construction-from-1
Repo
Framework

Surface Realisation Using Full Delexicalisation


Title	Surface Realisation Using Full Delexicalisation
Authors	Anastasia Shimorina, Claire Gardent
Abstract	Surface realisation (SR) maps a meaning representation to a sentence and can be viewed as consisting of three subtasks: word ordering, morphological inflection and contraction generation (e.g., clitic attachment in Portuguese or elision in French). We propose a modular approach to surface realisation which models each of these components separately, and evaluate our approach on the 10 languages covered by the SR'18 Surface Realisation Shared Task shallow track. We provide a detailed evaluation of how word order, morphological realisation and contractions are handled by the model and an analysis of the differences in word ordering performance across languages.
Tasks	Morphological Inflection
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-1305/
PDF	https://www.aclweb.org/anthology/D19-1305
PWC	https://paperswithcode.com/paper/surface-realisation-using-full
Repo
Framework

Modeling Paths for Explainable Knowledge Base Completion


Title	Modeling Paths for Explainable Knowledge Base Completion
Authors	Josua Stadelmaier, Sebastian Padó
Abstract	A common approach in knowledge base completion (KBC) is to learn representations for entities and relations in order to infer missing facts by generalizing existing ones. A shortcoming of standard models is that they do not explain their predictions to make them verifiable easily to human inspection. In this paper, we propose the Context Path Model (CPM) which generates explanations for new facts in KBC by providing sets of context paths as supporting evidence for these triples. For example, a new triple (Theresa May, nationality, Britain) may be explained by the path (Theresa May, born in, Eastbourne, contained in, Britain). The CPM is formulated as a wrapper that can be applied on top of various existing KBC models. We evaluate it for the well-established TransE model. We observe that its performance remains very close despite the added complexity, and that most of the paths proposed as explanations provide meaningful evidence to assess the correctness.
Tasks	Knowledge Base Completion
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4816/
PDF	https://www.aclweb.org/anthology/W19-4816
PWC	https://paperswithcode.com/paper/modeling-paths-for-explainable-knowledge-base
Repo
Framework
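
The idea of a context path in the abstract above can be illustrated with a toy knowledge base. The sketch below is only a plain graph search under assumed data: the function name `context_paths`, the `max_hops` limit and the extra "profession" triple are hypothetical, and it does not reproduce the CPM, which scores paths with a learned model on top of a KBC system such as TransE. It merely enumerates the paths that could serve as candidate explanations for a new triple.

```python
def context_paths(triples, head, tail, max_hops=2):
    """Enumerate relation paths of at most max_hops edges from head to tail."""
    # Index outgoing edges by head entity.
    outgoing = {}
    for h, r, t in triples:
        outgoing.setdefault(h, []).append((r, t))

    paths = []

    def walk(node, path, visited):
        # path alternates relation, entity; two items per hop.
        if len(path) // 2 >= max_hops:
            return
        for r, t in outgoing.get(node, []):
            if t in visited:
                continue  # avoid cycles
            new_path = path + [r, t]
            if t == tail:
                paths.append(tuple([head] + new_path))
            else:
                walk(t, new_path, visited | {t})

    walk(head, [], {head})
    return paths


# Toy knowledge base; the first two triples come from the abstract's example,
# the third is a made-up distractor.
kb = [
    ("Theresa May", "born in", "Eastbourne"),
    ("Eastbourne", "contained in", "Britain"),
    ("Theresa May", "profession", "Politician"),
]

# Candidate explanations for the new triple (Theresa May, nationality, Britain).
explanations = context_paths(kb, "Theresa May", "Britain")
print(explanations)
```

In this toy setting the only path found is the one quoted in the abstract; a real system would additionally have to rank many such candidates.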

Probing Word and Sentence Embeddings for Long-distance Dependencies Effects in French and English


Title	Probing Word and Sentence Embeddings for Long-distance Dependencies Effects in French and English
Authors	Paola Merlo
Abstract	The recent wide-spread and strong interest in RNNs has spurred detailed investigations of the distributed representations they generate and specifically if they exhibit properties similar to those characterising human languages. Results are at present inconclusive. In this paper, we extend previous work on long-distance dependencies in three ways. We manipulate word embeddings to translate them in a space that is attuned to the linguistic properties under study. We extend the work to sentence embeddings and to new languages. We confirm previous negative results: word embeddings and sentence embeddings do not unequivocally encode fine-grained linguistic properties of long-distance dependencies.
Tasks	Sentence Embeddings, Word Embeddings
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4817/
PDF	https://www.aclweb.org/anthology/W19-4817
PWC	https://paperswithcode.com/paper/probing-word-and-sentence-embeddings-for-long
Repo
Framework

Comparing MT Approaches for Text Normalization


Title	Comparing MT Approaches for Text Normalization
Authors	Claudia Matos Veliz, Orphee De Clercq, Veronique Hoste
Abstract	One of the main characteristics of social media data is the use of non-standard language. Since NLP tools have been trained on traditional text material their performance drops when applied to social media data. One way to overcome this is to first perform text normalization. In this work, we apply text normalization to noisy English and Dutch text coming from different social media genres: text messages, message board posts and tweets. We consider the normalization task as a Machine Translation problem and test the two leading paradigms: statistical and neural machine translation. For SMT we explore the added value of varying background corpora for training the language model. For NMT we have a look at data augmentation since the parallel datasets we are working with are limited in size. Our results reveal that when relying on SMT to perform the normalization it is beneficial to use a background corpus that is close to the genre you are normalizing. Regarding NMT, we find that the translations - or normalizations - coming out of this model are far from perfect and that for a low-resource language like Dutch adding additional training data works better than artificially augmenting the data.
Tasks	Data Augmentation, Language Modelling, Machine Translation
Published	2019-09-01
URL	https://www.aclweb.org/anthology/R19-1086/
PDF	https://www.aclweb.org/anthology/R19-1086
PWC	https://paperswithcode.com/paper/comparing-mt-approaches-for-text
Repo
Framework

Geolocation with Attention-Based Multitask Learning Models


Title	Geolocation with Attention-Based Multitask Learning Models
Authors	Tommaso Fornaciari, Dirk Hovy
Abstract	Geolocation, predicting the location of a post based on text and other information, has a huge potential for several social media applications. Typically, the problem is modeled as either multi-class classification or regression. In the first case, the classes are geographic areas previously identified; in the second, the models directly predict geographic coordinates. The former requires discretization of the coordinates, but yields better performance. The latter is potentially more precise and true to the nature of the problem, but often results in worse performance. We propose to combine the two approaches in an attention-based multitask convolutional neural network that jointly predicts both discrete locations and continuous geographic coordinates. We evaluate the multi-task (MTL) model against single-task models and prior work. We find that MTL significantly improves performance, reporting large gains on one data set, but also note that the correlation between labels and coordinates has a marked impact on the effectiveness of including a regression task.
Tasks
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-5528/
PDF	https://www.aclweb.org/anthology/D19-5528
PWC	https://paperswithcode.com/paper/geolocation-with-attention-based-multitask
Repo
Framework

BERT for Question Generation


Title	BERT for Question Generation
Authors	Ying-Hong Chan, Yao-Chung Fan
Abstract	In this study, we investigate the employment of the pre-trained BERT language model to tackle question generation tasks. We introduce two neural architectures built on top of BERT for question generation tasks. The first one is a straightforward BERT employment, which reveals the defects of directly using BERT for text generation. And, the second one remedies the first one by restructuring the BERT employment into a sequential manner for taking information from previous decoded results. Our models are trained and evaluated on the question-answering dataset SQuAD. Experiment results show that our best model yields state-of-the-art performance which advances the BLEU4 score of existing best models from 16.85 to 18.91.
Tasks	Language Modelling, Question Answering, Question Generation, Text Generation
Published	2019-10-01
URL	https://www.aclweb.org/anthology/W19-8624/
PDF	https://www.aclweb.org/anthology/W19-8624
PWC	https://paperswithcode.com/paper/bert-for-question-generation
Repo
Framework

Sigmorphon 2019 Task 2 system description paper: Morphological analysis in context for many languages, with supervision from only a few


Title	Sigmorphon 2019 Task 2 system description paper: Morphological analysis in context for many languages, with supervision from only a few
Authors	Brad Aiken, Jared Kelly, Alexis Palmer, Suleyman Olcay Polat, Taraka Rama, Rodney Nielsen
Abstract	This paper presents the UNT HiLT+Ling system for the Sigmorphon 2019 shared Task 2: Morphological Analysis and Lemmatization in Context. Our core approach focuses on the morphological tagging task; part-of-speech tagging and lemmatization are treated as secondary tasks. Given the highly multilingual nature of the task, we propose an approach which makes minimal use of the supplied training data, in order to be extensible to languages without labeled training data for the morphological inflection task. Specifically, we use a parallel Bible corpus to align contextual embeddings at the verse level. The aligned verses are used to build cross-language translation matrices, which in turn are used to map between embedding spaces for the various languages. Finally, we use sets of inflected forms, primarily from a high-resource language, to induce vector representations for individual UniMorph tags. Morphological analysis is performed by matching vector representations to embeddings for individual tokens. While our system results are dramatically below the average system submitted for the shared task evaluation campaign, our method is (we suspect) unique in its minimal reliance on labeled training data.
Tasks	Lemmatization, Morphological Analysis, Morphological Inflection, Morphological Tagging, Part-Of-Speech Tagging
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4211/
PDF	https://www.aclweb.org/anthology/W19-4211
PWC	https://paperswithcode.com/paper/sigmorphon-2019-task-2-system-description
Repo
Framework

Correlation clustering with local objectives


Title	Correlation clustering with local objectives
Authors	Sanchit Kalhan, Konstantin Makarychev, Timothy Zhou
Abstract	Correlation Clustering is a powerful graph partitioning model that aims to cluster items based on the notion of similarity between items. An instance of the Correlation Clustering problem consists of a graph G (not necessarily complete) whose edges are labeled by a binary classifier as similar and dissimilar. Classically, we are tasked with producing a clustering that minimizes the number of disagreements: an edge is in disagreement if it is a similar edge and is present across clusters or if it is a dissimilar edge and is present within a cluster. Define the disagreements vector to be an n-dimensional vector indexed by the vertices, where the v-th index is the number of disagreements at vertex v. Recently, Puleo and Milenkovic (ICML '16) initiated the study of the Correlation Clustering framework in which the objectives were more general functions of the disagreements vector. In this paper, we study algorithms for minimizing $\ell_q$ norms ($q \geq 1$) of the disagreements vector for both arbitrary and complete graphs. We present the first known algorithm for minimizing the $\ell_q$ norm of the disagreements vector on arbitrary graphs and also provide an improved algorithm for minimizing the $\ell_q$ norm ($q \geq 1$) of the disagreements vector on complete graphs. We also study an alternate cluster-wise local objective introduced by Ahmadi, Khuller and Saha (IPCO '19), which aims to minimize the maximum number of disagreements associated with a cluster. We present an improved $(2 + \varepsilon)$-approximation algorithm for this objective.
Tasks	graph partitioning
Published	2019-12-01
URL	http://papers.nips.cc/paper/9132-correlation-clustering-with-local-objectives
PDF	http://papers.nips.cc/paper/9132-correlation-clustering-with-local-objectives.pdf
PWC	https://paperswithcode.com/paper/correlation-clustering-with-local-objectives
Repo
Framework
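
To make the objective in the abstract above concrete, here is a minimal sketch that computes the disagreements vector of a given clustering and its $\ell_q$ norm. It only evaluates a fixed clustering and is not the approximation algorithm from the paper; the function names, the '+'/'-' edge labels and the small example graph are assumptions made for illustration.

```python
def disagreements_vector(n, edges, cluster_of):
    """Count disagreeing edges incident to each vertex.

    edges: list of (u, v, label), label '+' for similar, '-' for dissimilar.
    cluster_of: cluster_of[v] is the cluster id assigned to vertex v.
    """
    dis = [0] * n
    for u, v, label in edges:
        same = cluster_of[u] == cluster_of[v]
        # A similar edge across clusters or a dissimilar edge within a
        # cluster is in disagreement and counts at both endpoints.
        if (label == '+' and not same) or (label == '-' and same):
            dis[u] += 1
            dis[v] += 1
    return dis


def lq_norm(vec, q):
    """l_q norm for q >= 1; q = float('inf') gives the maximum entry."""
    if q == float('inf'):
        return max(vec)
    return sum(x ** q for x in vec) ** (1.0 / q)


# Hypothetical instance: 4 vertices, 4 labeled edges, two clusters.
edges = [(0, 1, '+'), (1, 2, '+'), (0, 2, '-'), (2, 3, '+')]
cluster_of = [0, 0, 0, 1]

dis = disagreements_vector(4, edges, cluster_of)
print(dis, lq_norm(dis, 1), lq_norm(dis, float('inf')))
```

With q = 1 the norm equals twice the classical number of disagreeing edges, while q = ∞ gives the worst-off vertex, which is the kind of local objective the abstract refers to.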

Towards Zero-shot Language Modeling


Title	Towards Zero-shot Language Modeling
Authors	Edoardo Maria Ponti, Ivan Vulić, Ryan Cotterell, Roi Reichart, Anna Korhonen
Abstract	Can we construct a neural language model which is inductively biased towards learning human language? Motivated by this question, we aim at constructing an informative prior for held-out languages on the task of character-level, open-vocabulary language modelling. We obtain this prior as the posterior over network weights conditioned on the data from a sample of training languages, which is approximated through Laplace's method. Based on a large and diverse sample of languages, the use of our prior outperforms baseline models with an uninformative prior in both zero-shot and few-shot settings, showing that the prior is imbued with universal linguistic knowledge. Moreover, we harness broad language-specific information available for most languages of the world, i.e., features from typological databases, as distant supervision for held-out languages. We explore several language modelling conditioning techniques, including concatenation and meta-networks for parameter generation. They appear beneficial in the few-shot setting, but ineffective in the zero-shot setting. Since the paucity of even plain digital text affects the majority of the world's languages, we hope that these insights will broaden the scope of applications for language technology.
Tasks	Language Modelling
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-1288/
PDF	https://www.aclweb.org/anthology/D19-1288
PWC	https://paperswithcode.com/paper/towards-zero-shot-language-modeling
Repo
Framework

Inverting and Modeling Morphological Inflection


Title	Inverting and Modeling Morphological Inflection
Authors	Yohei Oseki, Yasutada Sudo, Hiromu Sakai, Alec Marantz
Abstract	Previous "wug" tests (Berko, 1958) on Japanese verbal inflection have demonstrated that Japanese speakers, both adults and children, cannot inflect novel present tense forms to "correct" past tense forms predicted by rules of existent verbs (de Chene, 1982; Vance, 1987, 1991; Klafehn, 2003, 2013), indicating that Japanese verbs are merely stored in the mental lexicon. However, the implicit assumption that present tense forms are bases for verbal inflection should not be blindly extended to morphologically rich languages like Japanese in which both present and past tense forms are morphologically complex without inherent direction (Albright, 2002). Interestingly, there are also independent observations in the acquisition literature to suggest that past tense forms may be bases for verbal inflection in Japanese (Klafehn, 2003; Murasugi et al., 2010; Hirose, 2017; Tatsumi et al., 2018). In this paper, we computationally simulate two directions of verbal inflection in Japanese, Present → Past and Past → Present, with the rule-based computational model called Minimal Generalization Learner (MGL; Albright and Hayes, 2003) and experimentally evaluate the model with the bidirectional "wug" test where humans inflect novel verbs in two opposite directions. We conclude that Japanese verbs can be computed online via some generalizations and those generalizations do depend on the direction of morphological inflection.
Tasks	Morphological Inflection
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4220/
PDF	https://www.aclweb.org/anthology/W19-4220
PWC	https://paperswithcode.com/paper/inverting-and-modeling-morphological
Repo
Framework

Bayes Test of Precision, Recall, and F1 Measure for Comparison of Two Natural Language Processing Models


Title	Bayes Test of Precision, Recall, and F1 Measure for Comparison of Two Natural Language Processing Models
Authors	Ruibo Wang, Jihong Li
Abstract	Direct comparison on point estimation of the precision (P), recall (R), and F1 measure of two natural language processing (NLP) models on a common test corpus is unreasonable and results in less replicable conclusions due to a lack of a statistical test. However, the existing t-tests in cross-validation (CV) for model comparison are inappropriate because the distributions of P, R, F1 are skewed and an interval estimation of P, R, and F1 based on a t-test may exceed [0,1]. In this study, we propose to use a block-regularized 3×2 CV (3×2 BCV) in model comparison because it could regularize the difference in certain frequency distributions over linguistic units between training and validation sets and yield stable estimators of P, R, and F1. On the basis of the 3×2 BCV, we calibrate the posterior distributions of P, R, and F1 and derive an accurate interval estimation of P, R, and F1. Furthermore, we formulate the comparison into a hypothesis testing problem and propose a novel Bayes test. The test could directly compute the probabilities of the hypotheses on the basis of the posterior distributions and provide more informative decisions than the existing significance t-tests. Three experiments with regard to NLP chunking tasks are conducted, and the results illustrate the validity of the Bayes test.
Tasks	Chunking
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1405/
PDF	https://www.aclweb.org/anthology/P19-1405
PWC	https://paperswithcode.com/paper/bayes-test-of-precision-recall-and-f1-measure
Repo
Framework
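
The general idea of comparing two models through posterior distributions, rather than point estimates, can be sketched as follows. This is a deliberately simplified stand-in and not the paper's 3×2 BCV-calibrated test: it assumes a single set of true-positive and false-positive counts per model, a uniform Beta(1, 1) prior, and independent posteriors, and the function name and counts are hypothetical. It does show why a Beta posterior is attractive here: every draw of precision stays inside [0, 1].

```python
import random


def prob_a_better(tp_a, fp_a, tp_b, fp_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(precision_A > precision_B).

    Assumes precision ~ Beta(TP + 1, FP + 1), i.e. a uniform prior
    updated with the observed counts (a simplification, see lead-in).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_a = rng.betavariate(tp_a + 1, fp_a + 1)
        p_b = rng.betavariate(tp_b + 1, fp_b + 1)
        wins += p_a > p_b
    return wins / draws


# Hypothetical counts: model A has 90 TP / 10 FP, model B has 60 TP / 40 FP.
clear_gap = prob_a_better(90, 10, 60, 40)
# Identical counts should give a probability near one half.
no_gap = prob_a_better(80, 20, 80, 20)
print(clear_gap, no_gap)
```

A decision rule could then accept "A is better" only when this probability exceeds a chosen threshold; the threshold itself is a modelling choice not specified here.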

Submodular Function Minimization with Noisy Evaluation Oracle


Title	Submodular Function Minimization with Noisy Evaluation Oracle
Authors	Shinji Ito
Abstract	This paper considers submodular function minimization with noisy evaluation oracles that return the function value of a submodular objective with zero-mean additive noise. For this problem, we provide an algorithm that returns an $O(n^{3/2}/\sqrt{T})$-additive approximate solution in expectation, where $n$ and $T$ stand for the size of the problem and the number of oracle calls, respectively. There is no room for reducing this error bound by a factor smaller than $O(1/\sqrt{n})$. Indeed, we show that any algorithm will suffer additive errors of $\Omega(n/\sqrt{T})$ in the worst case. Further, we consider an extended problem setting with multiple-point feedback in which we can get the feedback of $k$ function values with each oracle call. Under the additional assumption that each noisy oracle is submodular and that $2 \leq k = O(1)$, we provide an algorithm with an $O(n/\sqrt{T})$-additive error bound as well as a worst-case analysis including a lower bound of $\Omega(n/\sqrt{T})$, which together imply that the algorithm achieves an optimal error bound up to a constant.
Tasks
Published	2019-12-01
URL	http://papers.nips.cc/paper/9378-submodular-function-minimization-with-noisy-evaluation-oracle
PDF	http://papers.nips.cc/paper/9378-submodular-function-minimization-with-noisy-evaluation-oracle.pdf
PWC	https://paperswithcode.com/paper/submodular-function-minimization-with-noisy
Repo
Framework