Paper Group NANR 204
Experiments with ad hoc ambiguous abbreviation expansion. English-Ethiopian Languages Statistical Machine Translation. Learning to Decompose Compound Questions with Reinforcement Learning. Word Clustering for Historical Newspapers Analysis. Variable beam search for generative neural parsing and its relevance for the analysis of neuro-imaging signal …
Experiments with ad hoc ambiguous abbreviation expansion
Title | Experiments with ad hoc ambiguous abbreviation expansion |
Authors | Agnieszka Mykowiecka, Malgorzata Marciniak |
Abstract | The paper addresses experiments to expand ad hoc ambiguous abbreviations in medical notes on the basis of morphologically annotated texts, without using additional domain resources. We work on Polish data but the described approaches can be used for other languages too. We test two methods to select candidates for word abbreviation expansions. The first one automatically selects all words in the text which might be an expansion of an abbreviation according to the language rules. The second method uses clustering of abbreviation occurrences to select representative elements which are manually annotated to determine lists of potential expansions. We then train a classifier to assign expansions to abbreviations based on three training sets: one obtained automatically, one consisting of manual annotations, and the concatenation of the two. The results obtained for the manually annotated training data significantly outperform those for the automatically obtained training data. Adding the automatically obtained training data to the manually annotated data improves the results, in particular for less frequent abbreviations. In this context the proposed a priori data driven selection of possible extensions turned out to be crucial. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-6207/ |
https://www.aclweb.org/anthology/D19-6207 | |
PWC | https://paperswithcode.com/paper/experiments-with-ad-hoc-ambiguous |
Repo | |
Framework | |
English-Ethiopian Languages Statistical Machine Translation
Title | English-Ethiopian Languages Statistical Machine Translation |
Authors | Solomon Teferra Abate, Michael Melese, Martha Yifiru Tachbelie, Million Meshesha, Solomon Atinafu, Wondwossen Mulugeta, Yaregal Assabie, Hafte Abera, Biniyam Ephrem, Tewodros Gebreselassie, Wondimagegnhue Tsegaye Tufa, Amanuel Lemma, Tsegaye Andargie, Seifedin Shifaw |
Abstract | In this paper, we describe an attempt towards the development of parallel corpora for English and Ethiopian Languages, such as Amharic, Tigrigna, Afan-Oromo, Wolaytta and Ge'ez. The corpora are used for conducting bi-directional SMT experiments. The BLEU scores of the bi-directional SMT systems show a promising result. The morphological richness of the Ethiopian languages has a great impact on the performance of SMT especially when the targets are Ethiopian languages. |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/papers/W/W19/W19-3611/ |
https://www.aclweb.org/anthology/W19-3611 | |
PWC | https://paperswithcode.com/paper/english-ethiopian-languages-statistical |
Repo | |
Framework | |
Learning to Decompose Compound Questions with Reinforcement Learning
Title | Learning to Decompose Compound Questions with Reinforcement Learning |
Authors | Haihong Yang, Han Wang, Shuang Guo, Wei Zhang, Huajun Chen |
Abstract | In knowledge-based question answering, a fundamental problem is to relax the assumption of answerable questions from simple questions to compound questions. Traditional approaches first detect the topic entity mentioned in a question, then traverse the knowledge graph to find relations as a multi-hop path to answers; we instead propose a novel approach that leverages simple-question answerers to answer compound questions. Our model consists of two parts: (i) a novel learning-to-decompose agent that learns a policy to decompose a compound question into simple questions and (ii) three independent simple-question answerers that classify the corresponding relations for each simple question. Experiments demonstrate that our model learns complex rules of compositionality as a stochastic policy, which helps simple neural networks achieve state-of-the-art results on WebQuestions and MetaQA. We analyze the interpretable decomposition process as well as the generated partitions. |
Tasks | Question Answering |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SJl2ps0qKQ |
https://openreview.net/pdf?id=SJl2ps0qKQ | |
PWC | https://paperswithcode.com/paper/learning-to-decompose-compound-questions-with |
Repo | |
Framework | |
Word Clustering for Historical Newspapers Analysis
Title | Word Clustering for Historical Newspapers Analysis |
Authors | Lidia Pivovarova, Elaine Zosa, Jani Marjanen |
Abstract | This paper is part of a collaboration between computer scientists and historians aimed at the development of novel tools and methods to improve the analysis of historical newspapers. We present a case study of ideological terms ending with the -ism suffix in nineteenth century Finnish newspapers. We propose a two-step procedure to trace differences in word usage over time: training diachronic embeddings on several time slices and then clustering the embeddings of selected words together with their neighbours to obtain historical context. The obtained clusters turn out to be useful for historical studies. The paper also discusses specific difficulties related to the development of historian-oriented tools. |
Tasks | |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/W19-9002/ |
https://www.aclweb.org/anthology/W19-9002 | |
PWC | https://paperswithcode.com/paper/word-clustering-for-historical-newspapers |
Repo | |
Framework | |
Variable beam search for generative neural parsing and its relevance for the analysis of neuro-imaging signal
Title | Variable beam search for generative neural parsing and its relevance for the analysis of neuro-imaging signal |
Authors | Benoit Crabbé, Murielle Fabre, Christophe Pallier |
Abstract | This paper describes a method of variable beam size inference for Recurrent Neural Network Grammar (RNNG) by drawing inspiration from sequential Monte-Carlo methods such as particle filtering. The paper studies the relevance of such methods for speeding up the computations of direct generative parsing for RNNG. It also studies the potential cognitive interpretation of the underlying representations built by the search method (beam activity) through analysis of neuro-imaging signal. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1106/ |
https://www.aclweb.org/anthology/D19-1106 | |
PWC | https://paperswithcode.com/paper/variable-beam-search-for-generative-neural |
Repo | |
Framework | |
A Case Study on Neural Headline Generation for Editing Support
Title | A Case Study on Neural Headline Generation for Editing Support |
Authors | Kazuma Murao, Ken Kobayashi, Hayato Kobayashi, Taichi Yatsuka, Takeshi Masuyama, Tatsuru Higurashi, Yoshimune Tabuchi |
Abstract | There have been many studies on neural headline generation models trained with a lot of (article, headline) pairs. However, there are few situations for putting such models into practical use in the real world since news articles typically already have corresponding headlines. In this paper, we describe a practical use case of neural headline generation in a news aggregator, where dozens of professional editors constantly select important news articles and manually create their headlines, which are much shorter than the original headlines. Specifically, we show how to deploy our model to an editing support tool and report the results of comparing the behavior of the editors before and after the release. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-2010/ |
https://www.aclweb.org/anthology/N19-2010 | |
PWC | https://paperswithcode.com/paper/a-case-study-on-neural-headline-generation |
Repo | |
Framework | |
What Makes a Good Counselor? Learning to Distinguish between High-quality and Low-quality Counseling Conversations
Title | What Makes a Good Counselor? Learning to Distinguish between High-quality and Low-quality Counseling Conversations |
Authors | Verónica Pérez-Rosas, Xinyi Wu, Kenneth Resnicow, Rada Mihalcea |
Abstract | The quality of a counseling intervention relies highly on the active collaboration between clients and counselors. In this paper, we explore several linguistic aspects of the collaboration process occurring during counseling conversations. Specifically, we address the differences between high-quality and low-quality counseling. Our approach examines participants' turn-by-turn interaction, their linguistic alignment, the sentiment expressed by speakers during the conversation, as well as the different topics being discussed. Our results suggest important language differences in low- and high-quality counseling, which we further use to derive linguistic features able to capture the differences between the two groups. These features are then used to build automatic classifiers that can predict counseling quality with accuracies of up to 88%. |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1088/ |
https://www.aclweb.org/anthology/P19-1088 | |
PWC | https://paperswithcode.com/paper/what-makes-a-good-counselor-learning-to |
Repo | |
Framework | |
Analyzing Bayesian Crosslingual Transfer in Topic Models
Title | Analyzing Bayesian Crosslingual Transfer in Topic Models |
Authors | Shudong Hao, Michael J. Paul |
Abstract | We introduce a theoretical analysis of crosslingual transfer in probabilistic topic models. By formulating posterior inference through Gibbs sampling as a process of language transfer, we propose a new measure that quantifies the loss of knowledge across languages during this process. This measure enables us to derive a PAC-Bayesian bound that elucidates the factors affecting model quality, both during training and in downstream applications. We provide experimental validation of the analysis on a diverse set of five languages, and discuss best practices for data collection and model design based on our analysis. |
Tasks | Topic Models |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1158/ |
https://www.aclweb.org/anthology/N19-1158 | |
PWC | https://paperswithcode.com/paper/analyzing-bayesian-crosslingual-transfer-in |
Repo | |
Framework | |
Leveraging SNOMED CT terms and relations for machine translation of clinical texts from Basque to Spanish
Title | Leveraging SNOMED CT terms and relations for machine translation of clinical texts from Basque to Spanish |
Authors | Xabier Soto, Olatz Perez-De-Viñaspre, Maite Oronoz, Gorka Labaka |
Abstract | |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-7102/ |
https://www.aclweb.org/anthology/W19-7102 | |
PWC | https://paperswithcode.com/paper/leveraging-snomed-ct-terms-and-relations-for |
Repo | |
Framework | |
Learning Internal Dense But External Sparse Structures of Deep Neural Network
Title | Learning Internal Dense But External Sparse Structures of Deep Neural Network |
Authors | Yiqun Duan |
Abstract | Recent years have witnessed two seemingly opposite developments of deep convolutional neural networks (CNNs). On one hand, increasing the density of CNNs by adding cross-layer connections achieves higher accuracy. On the other hand, creating sparse structures through regularization and pruning methods lowers computational costs. In this paper, we bridge these two by proposing a new network structure with locally dense yet externally sparse connections. This new structure uses dense modules as basic building blocks and then sparsely connects these modules via a novel algorithm during the training process. Experimental results demonstrate that the locally dense yet externally sparse structure can achieve competitive performance on benchmark tasks (CIFAR10, CIFAR100, and ImageNet) while keeping the network structure slim. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=BkNUFjR5KQ |
https://openreview.net/pdf?id=BkNUFjR5KQ | |
PWC | https://paperswithcode.com/paper/learning-internal-dense-but-external-sparse |
Repo | |
Framework | |
Improving Language Generation from Feature-Rich Tree-Structured Data with Relational Graph Convolutional Encoders
Title | Improving Language Generation from Feature-Rich Tree-Structured Data with Relational Graph Convolutional Encoders |
Authors | Xudong Hong, Ernie Chang, Vera Demberg |
Abstract | The Multilingual Surface Realization Shared Task 2019 focuses on generating sentences from lemmatized sets of universal dependency parses with rich features. This paper describes the results of our participation in the deep track. The core innovation in our approach is to use a graph convolutional network to encode the dependency trees given as input. Upon adding morphological features, our system achieves the third rank without using data augmentation techniques or additional components (such as a re-ranker). |
Tasks | Data Augmentation, Text Generation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-6310/ |
https://www.aclweb.org/anthology/D19-6310 | |
PWC | https://paperswithcode.com/paper/improving-language-generation-from-feature |
Repo | |
Framework | |
Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings
Title | Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings |
Authors | Jorge A. Mendez, Alborz Geramifard, Mohammad Ghavamzadeh, Bing Liu |
Abstract | Learning task-oriented dialog policies via reinforcement learning typically requires large amounts of interaction with users, which in practice renders such methods unusable for real-world applications. In order to reduce the data requirements, we propose to leverage data from across different dialog domains, thereby reducing the amount of data required from each given domain. In particular, we propose to learn domain-agnostic action embeddings, which capture general-purpose structure that informs the system how to act given the current dialog context, and are then specialized to a specific domain. We show how this approach is capable of learning with significantly less interaction with users, with a reduction of 35% in the number of dialogs required to learn, and to a higher level of proficiency than training separate policies for each domain on a set of simulated domains. |
Tasks | |
Published | 2019-12-09 |
URL | http://alborz-geramifard.com/workshops/neurips19-Conversational-AI/Papers/33.pdf |
http://alborz-geramifard.com/workshops/neurips19-Conversational-AI/Papers/33.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-of-multi-domain-dialog |
Repo | |
Framework | |
Modeling Inter and Intra-Class Relations in the Triplet Loss for Zero-Shot Learning
Title | Modeling Inter and Intra-Class Relations in the Triplet Loss for Zero-Shot Learning |
Authors | Yannick Le Cacheux, Herve Le Borgne, Michel Crucianu |
Abstract | Recognizing unseen visual classes, i.e. classes for which no training data is available, is known as Zero Shot Learning (ZSL). Some of the best performing methods apply the triplet loss to seen classes to learn a mapping between visual representations of images and attribute vectors that constitute class prototypes. They nevertheless make several implicit assumptions that limit their performance on real use cases, particularly with fine-grained datasets comprising a large number of classes. We identify three of these assumptions and put forward corresponding novel contributions to address them. Our approach takes into account both inter-class and intra-class relations, respectively by being more permissive with confusions between similar classes, and by penalizing visual samples which are atypical of their class. The approach is tested on four datasets, including the large-scale ImageNet, and performs significantly above recent methods, even generative methods based on more restrictive hypotheses. |
Tasks | Zero-Shot Learning |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Le_Cacheux_Modeling_Inter_and_Intra-Class_Relations_in_the_Triplet_Loss_for_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Le_Cacheux_Modeling_Inter_and_Intra-Class_Relations_in_the_Triplet_Loss_for_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/modeling-inter-and-intra-class-relations-in |
Repo | |
Framework | |
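
For readers unfamiliar with the triplet loss this abstract builds on, the following is a minimal sketch of the standard formulation only, not the modified loss the authors propose. It assumes squared Euclidean distance and a fixed margin; the function name `triplet_loss` and the toy vectors are hypothetical. In a zero-shot setting, the anchor would be a projected visual feature, the positive the prototype of the true class, and the negative the prototype of another class.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, margin + d(a, p) - d(a, n)).

    Assumption: d is the squared Euclidean distance. The paper's
    inter-class and intra-class modifications are NOT modeled here.
    """
    anchor, positive, negative = map(np.asarray, (anchor, positive, negative))
    d_pos = np.sum((anchor - positive) ** 2)  # distance to the true-class prototype
    d_neg = np.sum((anchor - negative) ** 2)  # distance to a wrong-class prototype
    return max(0.0, margin + d_pos - d_neg)

# Toy example: the visual sample coincides with its own class prototype.
visual = [1.0, 0.0]
proto_true = [1.0, 0.0]
far_other = [0.0, 1.0]    # dissimilar class: margin already satisfied
near_other = [1.0, 0.5]   # similar class: margin violated

loss_far = triplet_loss(visual, proto_true, far_other)
loss_near = triplet_loss(visual, proto_true, near_other)
```

With a fixed margin, a visually similar class is penalized as strongly as any other class that violates the margin, which is the kind of implicit assumption the abstract says the authors relax.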
Grammar and Meaning: Analysing the Topology of Diachronic Word Embeddings
Title | Grammar and Meaning: Analysing the Topology of Diachronic Word Embeddings |
Authors | Yuri Bizzoni, Stefania Degaetano-Ortlieb, Katrin Menzel, Pauline Krielke, Elke Teich |
Abstract | The paper showcases the application of word embeddings to change in language use in the domain of science, focusing on the Late Modern English period (17-19th century). Historically, this is the period in which many registers of English developed, including the language of science. Our overarching interest is the linguistic development of scientific writing to a distinctive (group of) register(s). A register is marked not only by the choice of lexical words (discourse domain) but crucially by grammatical choices which indicate style. The focus of the paper is on the latter, tracing words with primarily grammatical functions (function words and some selected, poly-functional word forms) diachronically. To this end, we combine diachronic word embeddings with appropriate visualization and exploratory techniques such as clustering and relative entropy for meaningful aggregation of data and diachronic comparison. |
Tasks | Word Embeddings |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4722/ |
https://www.aclweb.org/anthology/W19-4722 | |
PWC | https://paperswithcode.com/paper/grammar-and-meaning-analysing-the-topology-of |
Repo | |
Framework | |
The Expressive Power of Deep Neural Networks with Circulant Matrices
Title | The Expressive Power of Deep Neural Networks with Circulant Matrices |
Authors | Alexandre Araujo, Benjamin Negrevergne, Yann Chevaleyre, Jamal Atif |
Abstract | Recent results from linear algebra stating that any matrix can be decomposed into products of diagonal and circulant matrices have led to the design of compact deep neural network architectures that perform well in practice. In this paper, we bridge the gap between these good empirical results and the theoretical approximation capabilities of deep diagonal-circulant ReLU networks. More precisely, we first demonstrate that a deep diagonal-circulant ReLU network of bounded width and small depth can approximate a deep ReLU network in which the dense matrices are of low rank. Based on this result, we provide new bounds on the expressive power and universal approximativeness of this type of network. We support our experimental results with thorough experiments on a large, real world video classification problem. |
Tasks | Video Classification |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SkeUG30cFQ |
https://openreview.net/pdf?id=SkeUG30cFQ | |
PWC | https://paperswithcode.com/paper/the-expressive-power-of-deep-neural-networks |
Repo | |
Framework | |
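
As an illustration of why diagonal-circulant layers are compact, here is a minimal NumPy sketch of a single layer of the assumed form ReLU(D C x), where D is diagonal and C is circulant. The layer form, the omission of a bias term, and all function names are assumptions made for illustration, not the authors' implementation. A circulant matrix is defined by one vector of length n, and its product with a vector is a circular convolution, so it can be computed with the FFT instead of a dense n x n multiplication.

```python
import numpy as np

def circulant(c):
    """Dense n x n circulant matrix with first column c: C[i, j] = c[(i - j) % n]."""
    c = np.asarray(c)
    n = len(c)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return c[idx]

def circulant_matvec(c, x):
    """Compute circulant(c) @ x in O(n log n) as a circular convolution via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

def diagonal_circulant_layer(d, c, x):
    """Hypothetical layer ReLU(diag(d) @ circulant(c) @ x), using 2n parameters."""
    return np.maximum(0.0, d * circulant_matvec(c, x))

rng = np.random.default_rng(0)
n = 8
d, c, x = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)

# Reference computation with explicit dense matrices.
dense = np.maximum(0.0, np.diag(d) @ circulant(c) @ x)
fast = diagonal_circulant_layer(d, c, x)
agree = np.allclose(dense, fast)
```

The layer stores 2n parameters (the vectors `d` and `c`) rather than the n² entries of a dense weight matrix, which is the source of the compactness the abstract refers to.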