October 15, 2019

Paper Group NANR 80

Language Technology for Multilingual Europe: An Analysis of a Large-Scale Survey regarding Challenges, Demands, Gaps and Needs


Title	Language Technology for Multilingual Europe: An Analysis of a Large-Scale Survey regarding Challenges, Demands, Gaps and Needs
Authors	Georg Rehm, Stefanie Hegele
Abstract
Tasks	Machine Translation
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1519/
PDF	https://www.aclweb.org/anthology/L18-1519
PWC	https://paperswithcode.com/paper/language-technology-for-multilingual-europe
Repo
Framework

Shot Or Not: Comparison of NLP Approaches for Vaccination Behaviour Detection


Title	Shot Or Not: Comparison of NLP Approaches for Vaccination Behaviour Detection
Authors	Aditya Joshi, Xiang Dai, Sarvnaz Karimi, Ross Sparks, Cécile Paris, C Raina MacIntyre
Abstract	Vaccination behaviour detection deals with predicting whether or not a person received/was about to receive a vaccine. We present our submission for the vaccination behaviour detection shared task at the SMM4H workshop. Our findings are based on three prevalent text classification approaches: rule-based, statistical and deep learning-based. Our final submissions are: (1) an ensemble of statistical classifiers with task-specific features derived using lexicons, language processing tools and word embeddings; and, (2) an LSTM classifier with pre-trained language models.
Tasks	Text Classification, Word Embeddings
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-5911/
PDF	https://www.aclweb.org/anthology/W18-5911
PWC	https://paperswithcode.com/paper/shot-or-not-comparison-of-nlp-approaches-for
Repo
Framework

Efficient Convex Completion of Coupled Tensors using Coupled Nuclear Norms


Title	Efficient Convex Completion of Coupled Tensors using Coupled Nuclear Norms
Authors	Kishan Wimalawarne, Hiroshi Mamitsuka
Abstract	Coupled norms have emerged as a convex method to solve coupled tensor completion. A limitation with coupled norms is that they only induce low-rankness using the multilinear rank of coupled tensors. In this paper, we introduce a new set of coupled norms known as coupled nuclear norms by constraining the CP rank of coupled tensors. We propose new coupled completion models using the coupled nuclear norms as regularizers, which can be optimized using computationally efficient optimization methods. We derive excess risk bounds for proposed coupled completion models and show that proposed norms lead to better performance. Through simulation and real-data experiments, we demonstrate that proposed norms achieve better performance for coupled completion compared to existing coupled norms.
Tasks
Published	2018-12-01
URL	http://papers.nips.cc/paper/7922-efficient-convex-completion-of-coupled-tensors-using-coupled-nuclear-norms
PDF	http://papers.nips.cc/paper/7922-efficient-convex-completion-of-coupled-tensors-using-coupled-nuclear-norms.pdf
PWC	https://paperswithcode.com/paper/efficient-convex-completion-of-coupled
Repo
Framework

Automatic Identification of Drugs and Adverse Drug Reaction Related Tweets


Title	Automatic Identification of Drugs and Adverse Drug Reaction Related Tweets
Authors	Segun Taofeek Aroyehun, Alexander Gelbukh
Abstract	We describe our submissions to the Third Social Media Mining for Health Applications Shared Task. We participated in two tasks (tasks 1 and 3). For both tasks, we experimented with a traditional machine learning model (Naive Bayes Support Vector Machine (NBSVM)), deep learning models (Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and Bidirectional LSTM (BiLSTM)), and the combination of deep learning model with SVM. We observed that the NBSVM reaches superior performance on both tasks on our development split of the training data sets. Official result for task 1 based on the blind evaluation data shows that the predictions of the NBSVM achieved our team's best F-score of 0.910 which is above the average score received by all submissions to the task. On task 3, the combination of BiLSTM and SVM gives our best F-score for the positive class of 0.394.
Tasks
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-5915/
PDF	https://www.aclweb.org/anthology/W18-5915
PWC	https://paperswithcode.com/paper/automatic-identification-of-drugs-and-adverse
Repo
Framework

Towards a Conversation-Analytic Taxonomy of Speech Overlap


Title	Towards a Conversation-Analytic Taxonomy of Speech Overlap
Authors	Felix Gervits, Matthias Scheutz
Abstract
Tasks
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1727/
PDF	https://www.aclweb.org/anthology/L18-1727
PWC	https://paperswithcode.com/paper/towards-a-conversation-analytic-taxonomy-of
Repo
Framework

Model Transfer with Explicit Knowledge of the Relation between Class Definitions


Title	Model Transfer with Explicit Knowledge of the Relation between Class Definitions
Authors	Hiyori Yoshikawa, Tomoya Iwakura
Abstract	This paper investigates learning methods for multi-class classification using labeled data for the target classification scheme and another labeled data for a similar but different classification scheme (support scheme). We show that if we have prior knowledge about the relation between support and target classification schemes in the form of a class correspondence table, we can use it to improve the model performance further than the simple multi-task learning approach. Instead of learning the individual classification layers for the support and target schemes, the proposed method converts the class label of each example on the support scheme into a set of candidate class labels on the target scheme via the class correspondence table, and then uses the candidate labels to learn the classification layer for the target scheme. We evaluate the proposed method on two tasks in NLP. The experimental results show that our method effectively learns the target schemes especially for the classes that have a tight connection to certain support classes.
Tasks	Multi-Task Learning, Named Entity Recognition, Text Classification
Published	2018-10-01
URL	https://www.aclweb.org/anthology/K18-1052/
PDF	https://www.aclweb.org/anthology/K18-1052
PWC	https://paperswithcode.com/paper/model-transfer-with-explicit-knowledge-of-the
Repo
Framework
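
The label-conversion step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the correspondence table, class names and logits are hypothetical, and the objective (negative log of the probability mass on the candidate labels) is an assumed reading of "uses the candidate labels to learn the classification layer".

```python
import math

# Hypothetical correspondence table: support-scheme label -> candidate target labels.
CORRESPONDENCE = {
    "LOCATION": {"CITY", "COUNTRY"},
    "PERSON": {"PERSON"},
}
TARGET_CLASSES = ["CITY", "COUNTRY", "PERSON", "ORG"]

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def candidate_loss(logits, candidates):
    """Negative log of the total probability assigned to the candidate labels.

    Assumed objective; with a single candidate it reduces to cross-entropy.
    """
    probs = softmax(logits)
    mass = sum(p for c, p in zip(TARGET_CLASSES, probs) if c in candidates)
    return -math.log(mass)

# Made-up scores from a single target-scheme classification layer.
logits = [2.0, 1.0, 0.1, -1.0]
support_loss = candidate_loss(logits, CORRESPONDENCE["LOCATION"])  # support-scheme example
target_loss = candidate_loss(logits, {"CITY"})                     # target-scheme example
print(f"support example: {support_loss:.3f}, target example: {target_loss:.3f}")
```

Under this assumed objective, a support-scheme example only pushes probability toward its candidate set, so it constrains the target layer less tightly than a target-scheme example with a single gold label.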

Leveraging Web Based Evidence Gathering for Drug Information Identification from Tweets


Title	Leveraging Web Based Evidence Gathering for Drug Information Identification from Tweets
Authors	Rupsa Saha, Abir Naskar, Tirthankar Dasgupta, Lipika Dey
Abstract	In this paper, we have explored web-based evidence gathering and different linguistic features to automatically extract drug names from tweets and further classify such tweets into Adverse Drug Events or not. We have evaluated our proposed models with the dataset as released by the SMM4H workshop shared Task-1 and Task-3 respectively. Our evaluation results show that the proposed model achieved good results, with Precision, Recall and F-scores of 78.5%, 88% and 82.9% respectively for Task1 and 33.2%, 54.7% and 41.3% for Task3.
Tasks
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-5919/
PDF	https://www.aclweb.org/anthology/W18-5919
PWC	https://paperswithcode.com/paper/leveraging-web-based-evidence-gathering-for
Repo
Framework
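
As a quick check on how the three figures in the abstract above relate, the sketch below recomputes an F-score from precision and recall. It assumes the reported F-score is the balanced F1 (harmonic mean), which the abstract does not state explicitly; the function name is illustrative.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Precision and recall percentages as reported in the abstract.
task1 = f_score(78.5, 88.0)
task3 = f_score(33.2, 54.7)
print(f"Task 1 F1 = {task1:.2f}, Task 3 F1 = {task3:.2f}")
```

The recomputed values land close to the reported 82.9% and 41.3%; small differences are expected because the published precision and recall are themselves rounded.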

Challenges in Converting the Index Thomisticus Treebank into Universal Dependencies


Title	Challenges in Converting the Index Thomisticus Treebank into Universal Dependencies
Authors	Flavio Massimiliano Cecchini, Marco Passarotti, Paola Marongiu, Daniel Zeman
Abstract	This paper describes the changes applied to the original process used to convert the Index Thomisticus Treebank, a corpus including texts in Medieval Latin by Thomas Aquinas, into the annotation style of Universal Dependencies. The changes are made both to harmonise the Universal Dependencies version of the Index Thomisticus Treebank with the two other available Latin treebanks and to fix errors and inconsistencies resulting from the original process. The paper details the treatment of different issues in PoS tagging, lemmatisation and assignment of dependency relations. Finally, it assesses the quality of the new conversion process by providing an evaluation against a gold standard.
Tasks	Dependency Parsing
Published	2018-11-01
URL	https://www.aclweb.org/anthology/W18-6004/
PDF	https://www.aclweb.org/anthology/W18-6004
PWC	https://paperswithcode.com/paper/challenges-in-converting-the-index
Repo
Framework

Unsupervised Feature Learning via Non-Parametric Instance Discrimination


Title	Unsupervised Feature Learning via Non-Parametric Instance Discrimination
Authors	Zhirong Wu, Yuanjun Xiong, Stella X. Yu, Dahua Lin
Abstract	Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so. We study whether this observation can be extended beyond the conventional domain of supervised learning: Can we learn a good feature representation that captures apparent similarity among instances, instead of classes, by merely asking the feature to be discriminative of individual instances? We formulate this intuition as a non-parametric classification problem at the instance-level, and use noise-contrastive estimation to tackle the computational challenges imposed by the large number of instance classes. Our experimental results demonstrate that, under unsupervised learning settings, our method surpasses the state-of-the-art on ImageNet classification by a large margin. Our method is also remarkable for consistently improving test performance with more training data and better network architectures. By fine-tuning the learned feature, we further obtain competitive results for semi-supervised learning and object detection tasks. Our non-parametric model is highly compact: With 128 features per image, our method requires only 600MB storage for a million images, enabling fast nearest neighbour retrieval at the run time.
Tasks	Object Detection, Semi-Supervised Image Classification
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Wu_Unsupervised_Feature_Learning_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Wu_Unsupervised_Feature_Learning_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/unsupervised-feature-learning-via-non-1
Repo
Framework

Stylistic variation over 200 years of court proceedings according to gender and social class


Title	Stylistic variation over 200 years of court proceedings according to gender and social class
Authors	Stefania Degaetano-Ortlieb
Abstract	We present an approach to detect stylistic variation across social variables (here: gender and social class), considering also diachronic change in language use. For detection of stylistic variation, we use relative entropy, measuring the difference between probability distributions at different linguistic levels (here: lexis and grammar). In addition, by relative entropy, we can determine which linguistic units are related to stylistic variation.
Tasks
Published	2018-06-01
URL	https://www.aclweb.org/anthology/W18-1601/
PDF	https://www.aclweb.org/anthology/W18-1601
PWC	https://paperswithcode.com/paper/stylistic-variation-over-200-years-of-court
Repo
Framework
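
The relative-entropy measure mentioned in the abstract above can be illustrated with a minimal sketch. It assumes unigram distributions with additive smoothing over a shared vocabulary, and the two token samples are hypothetical toy data, not taken from the paper. The per-word terms of the sum show which units contribute most to the divergence.

```python
import math
from collections import Counter

def distribution(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_contributions(p, q):
    """Per-unit terms p(w) * log2(p(w) / q(w)); their sum is D(P || Q) in bits."""
    return {w: p[w] * math.log2(p[w] / q[w]) for w in p}

# Hypothetical toy samples standing in for two speaker groups.
group_a = "i did not see the prisoner take the watch".split()
group_b = "the prisoner took the watch from my pocket".split()
vocab = sorted(set(group_a) | set(group_b))

p = distribution(group_a, vocab)
q = distribution(group_b, vocab)
contrib = kl_contributions(p, q)
kl = sum(contrib.values())
top = sorted(contrib, key=contrib.get, reverse=True)[:3]
print(f"D(A || B) = {kl:.4f} bits; most distinctive for A: {top}")
```

Smoothing is one simple way to keep every probability non-zero so the logarithm is defined; other choices are possible and the abstract does not say which is used.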

Network of Graph Convolutional Networks Trained on Random Walks


Title	Network of Graph Convolutional Networks Trained on Random Walks
Authors	Sami Abu-El-Haija, Amol Kapoor, Bryan Perozzi, Joonseok Lee
Abstract	Graph Convolutional Networks (GCNs) are a recently proposed architecture which has had success in semi-supervised learning on graph-structured data. At the same time, unsupervised learning of graph embeddings has benefited from the information contained in random walks. In this paper we propose a model, Network of GCNs (N-GCN), which marries these two lines of work. At its core, N-GCN trains multiple instances of GCNs over node pairs discovered at different distances in random walks, and learns a combination of the instance outputs which optimizes the classification objective. Our experiments show that our proposed N-GCN model achieves state-of-the-art performance on all of the challenging node classification tasks we consider: Cora, Citeseer, Pubmed, and PPI. In addition, our proposed method has other desirable properties, including generalization to recently proposed semi-supervised learning methods such as GraphSAGE, allowing us to propose N-SAGE, and resilience to adversarial input perturbations.
Tasks	Node Classification
Published	2018-01-01
URL	https://openreview.net/forum?id=SkaPsfZ0W
PDF	https://openreview.net/pdf?id=SkaPsfZ0W
PWC	https://paperswithcode.com/paper/network-of-graph-convolutional-networks
Repo
Framework

Rethinking the Agreement in Human Evaluation Tasks


Title	Rethinking the Agreement in Human Evaluation Tasks
Authors	Jacopo Amidei, Paul Piwek, Alistair Willis
Abstract	Human evaluations are broadly thought to be more valuable the higher the inter-annotator agreement. In this paper we examine this idea. We will describe our experiments and analysis within the area of Automatic Question Generation. Our experiments show how annotators diverge in language annotation tasks due to a range of ineliminable factors. For this reason, we believe that annotation schemes for natural language generation tasks that are aimed at evaluating language quality need to be treated with great care. In particular, an unchecked focus on reduction of disagreement among annotators runs the danger of creating generation goals that reward output that is more distant from, rather than closer to, natural human-like language. We conclude the paper by suggesting a new approach to the use of the agreement metrics in natural language generation evaluation tasks.
Tasks	Dialogue Generation, Question Generation, Text Generation
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-1281/
PDF	https://www.aclweb.org/anthology/C18-1281
PWC	https://paperswithcode.com/paper/rethinking-the-agreement-in-human-evaluation
Repo
Framework

Input Combination Strategies for Multi-Source Transformer Decoder


Title	Input Combination Strategies for Multi-Source Transformer Decoder
Authors	Jindřich Libovický, Jindřich Helcl, David Mareček
Abstract	In multi-source sequence-to-sequence tasks, the attention mechanism can be modeled in several ways. This topic has been thoroughly studied on recurrent architectures. In this paper, we extend the previous work to the encoder-decoder attention in the Transformer architecture. We propose four different input combination strategies for the encoder-decoder attention: serial, parallel, flat, and hierarchical. We evaluate our methods on tasks of multimodal translation and translation with multiple source languages. The experiments show that the models are able to use multiple sources and improve over single source baselines.
Tasks	Image Captioning, Machine Translation, Multimodal Machine Translation, Text Summarization
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-6326/
PDF	https://www.aclweb.org/anthology/W18-6326
PWC	https://paperswithcode.com/paper/input-combination-strategies-for-multi-source
Repo
Framework
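
Three of the four strategies named in the abstract above (parallel, flat and hierarchical) can be sketched with plain dot-product attention. This is a simplified reading, not the paper's implementation: it uses a single head, no learned projections and no residual connections, and all vectors are made-up toy values. The serial strategy is omitted because it depends on stacking attention sub-layers.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def attend(query, states):
    """Single-head scaled dot-product attention; keys and values are the states."""
    scale = math.sqrt(len(query))
    weights = softmax([sum(q * s for q, s in zip(query, st)) / scale for st in states])
    dim = len(states[0])
    return [sum(w * st[i] for w, st in zip(weights, states)) for i in range(dim)]

def parallel(query, encoders):
    """Attend to each encoder independently and sum the contexts."""
    contexts = [attend(query, enc) for enc in encoders]
    return [sum(vals) for vals in zip(*contexts)]

def flat(query, encoders):
    """Concatenate all encoder states and attend once over the joint sequence."""
    return attend(query, [st for enc in encoders for st in enc])

def hierarchical(query, encoders):
    """Attend to each encoder, then attend over the resulting contexts."""
    return attend(query, [attend(query, enc) for enc in encoders])

# Made-up 2-dimensional states for two sources (e.g. text and image regions).
text_states = [[1.0, 0.0], [0.5, 0.5]]
image_states = [[0.0, 1.0], [0.2, 0.8], [0.9, 0.1]]
query = [1.0, 0.0]

for strategy in (parallel, flat, hierarchical):
    output = strategy(query, [text_states, image_states])
    print(strategy.__name__, [round(x, 3) for x in output])
```

In this toy setting, flat and hierarchical both return a convex combination of source states, while parallel returns a sum of one context per source, so its output grows with the number of sources.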

An Empirical Investigation of Error Types in Vietnamese Parsing


Title	An Empirical Investigation of Error Types in Vietnamese Parsing
Authors	Quy Nguyen, Yusuke Miyao, Hiroshi Noji, Nhung Nguyen
Abstract	Syntactic parsing plays a crucial role in improving the quality of natural language processing tasks. Although there have been several research projects on syntactic parsing in Vietnamese, the parsing quality has been far inferior to that reported in major languages, such as English and Chinese. In this work, we evaluated representative constituency parsing models on a Vietnamese Treebank to look for the most suitable parsing method for Vietnamese. We then combined the advantages of automatic and manual analysis to investigate errors produced by the experimented parsers and find the reasons for them. Our analysis focused on three possible sources of parsing errors, namely limited training data, part-of-speech (POS) tagging errors, and ambiguous constructions. As a result, we found that the last two sources, which frequently appear in Vietnamese text, contributed significantly to the poor performance of Vietnamese parsing.
Tasks	Constituency Parsing, Part-Of-Speech Tagging
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-1260/
PDF	https://www.aclweb.org/anthology/C18-1260
PWC	https://paperswithcode.com/paper/an-empirical-investigation-of-error-types-in
Repo
Framework

Document-Level Adaptation for Neural Machine Translation


Title	Document-Level Adaptation for Neural Machine Translation
Authors	Sachith Sri Ram Kothur, Rebecca Knowles, Philipp Koehn
Abstract	It is common practice to adapt machine translation systems to novel domains, but even a well-adapted system may be able to perform better on a particular document if it were to learn from a translator's corrections within the document itself. We focus on adaptation within a single document – appropriate for an interactive translation scenario where a model adapts to a human translator's input over the course of a document. We propose two methods: single-sentence adaptation (which performs online adaptation one sentence at a time) and dictionary adaptation (which specifically addresses the issue of translating novel words). Combining the two models results in improvements over both approaches individually, and over baseline systems, even on short documents. On WMT news test data, we observe an improvement of +1.8 BLEU points and +23.3% novel word translation accuracy and on EMEA data (descriptions of medications) we observe an improvement of +2.7 BLEU points and +49.2% novel word translation accuracy.
Tasks	Machine Translation
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-2708/
PDF	https://www.aclweb.org/anthology/W18-2708
PWC	https://paperswithcode.com/paper/document-level-adaptation-for-neural-machine
Repo
Framework