July 26, 2019

1840 words 9 mins read

Paper Group NANR 121

Paper Group NANR 121

URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors. A morphological analyser for Kven. Proceedings of the 2nd Workshop on Representation Learning for NLP. Raising to Object in Japanese: An HPSG Analysis. Understanding of unknown medical words. Combining Word-Level and Character-Level Representations fo …

URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors

Title URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors
Authors Patrick Littell, David R. Mortensen, Ke Lin, Katherine Kairis, Carlisle Turner, Lori Levin
Abstract We introduce the URIEL knowledge base for massively multilingual NLP and the lang2vec utility, which provides information-rich vector identifications of languages drawn from typological, geographical, and phylogenetic databases and normalized to have straightforward and consistent formats, naming, and semantics. The goal of URIEL and lang2vec is to enable multilingual NLP, especially on less-resourced languages, and to make possible types of experiments (especially but not exclusively related to NLP tasks) that are otherwise difficult or impossible due to the sparsity and incommensurability of the data sources. lang2vec vectors have been shown to reduce perplexity in multilingual language modeling when compared to one-hot language identification vectors.
Tasks Language Identification, Language Modelling
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2002/
PDF https://www.aclweb.org/anthology/E17-2002
PWC https://paperswithcode.com/paper/uriel-and-lang2vec-representing-languages-as
Repo
Framework
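A minimal illustration of the idea behind lang2vec (a toy sketch, not the real library's API or feature inventory): languages are represented by typological feature vectors rather than one-hot identity vectors, so typologically related languages end up close together. The binary features below are invented for illustration.

```python
# Hypothetical binary typological features (WALS-style, invented here).
TYPOLOGY = {
    "eng": [1, 0, 1, 0],   # e.g. SVO word order, no case marking, ...
    "deu": [1, 1, 1, 0],   # shares most features with English
    "jpn": [0, 1, 0, 1],   # typologically distant from both
}

def one_hot(lang, langs=("eng", "deu", "jpn")):
    """One-hot identity vector: every pair of languages is equally far apart."""
    return [1 if l == lang else 0 for l in langs]

def hamming(u, v):
    """Number of differing feature values between two vectors."""
    return sum(a != b for a, b in zip(u, v))

# Under one-hot encoding, all distinct languages are equidistant...
assert hamming(one_hot("eng"), one_hot("deu")) == hamming(one_hot("eng"), one_hot("jpn"))

# ...but under typological vectors, English is closer to German than to Japanese,
# which is the information a multilingual model can exploit.
assert hamming(TYPOLOGY["eng"], TYPOLOGY["deu"]) < hamming(TYPOLOGY["eng"], TYPOLOGY["jpn"])
```

This is the intuition behind the reported perplexity reduction: the vector tells the model which languages should share parameters.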

A morphological analyser for Kven

Title A morphological analyser for Kven
Authors Sindre Reino Trosterud, Trond Trosterud, Anna-Kaisa Räisänen, Leena Niiranen, Mervi Haavisto, Kaisa Maliniemi
Abstract
Tasks
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-0608/
PDF https://www.aclweb.org/anthology/W17-0608
PWC https://paperswithcode.com/paper/a-morphological-analyser-for-kven
Repo
Framework

Proceedings of the 2nd Workshop on Representation Learning for NLP

Title Proceedings of the 2nd Workshop on Representation Learning for NLP
Authors
Abstract
Tasks Representation Learning
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2600/
PDF https://www.aclweb.org/anthology/W17-2600
PWC https://paperswithcode.com/paper/proceedings-of-the-2nd-workshop-on-2
Repo
Framework

Raising to Object in Japanese: An HPSG Analysis

Title Raising to Object in Japanese: An HPSG Analysis
Authors Akira Ohtani
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1013/
PDF https://www.aclweb.org/anthology/Y17-1013
PWC https://paperswithcode.com/paper/raising-to-object-in-japanese-an-hpsg
Repo
Framework

Understanding of unknown medical words

Title Understanding of unknown medical words
Authors Natalia Grabar, Thierry Hamon
Abstract We assume that unknown words with internal structure (affixed words or compounds) can provide speakers with linguistic cues as to their meaning, and thus help their decoding and understanding. To verify this hypothesis, we propose to work with a set of French medical words. These words are annotated by five annotators. Then, two kinds of analysis are performed: analysis of the evolution of understandable and non-understandable words (globally and according to some suffixes) and analysis of clusters created with unsupervised algorithms on the basis of linguistic and extra-linguistic features of the studied words. Our results suggest that, depending on the linguistic sensitivity of annotators, technical words can be decoded and become understandable. As for the clusters, some of them distinguish between understandable and non-understandable words. Resources built in this work will be made freely available for research purposes.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-8005/
PDF https://doi.org/10.26615/978-954-452-044-1_005
PWC https://paperswithcode.com/paper/understanding-of-unknown-medical-words
Repo
Framework
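The paper's hypothesis, that affixes give readers cues to a word's meaning, can be sketched as a naive suffix-based decomposition. The suffix glosses below are illustrative examples, not the authors' actual annotated resource.

```python
# Toy French medical suffix glosses (invented for illustration).
SUFFIX_GLOSSES = {
    "ite": "inflammation",          # e.g. "bronchite" -> inflammation
    "ectomie": "surgical removal",  # e.g. "appendicectomie"
    "ologie": "study of",           # e.g. "dermatologie"
}

def decompose(word):
    """Return (stem, gloss) for the longest matching known suffix, else None."""
    for suffix in sorted(SUFFIX_GLOSSES, key=len, reverse=True):
        if word.endswith(suffix):
            return word[: -len(suffix)], SUFFIX_GLOSSES[suffix]
    return None

# A word with recognizable internal structure yields a partial meaning...
assert decompose("bronchite") == ("bronch", "inflammation")
assert decompose("appendicectomie") == ("appendic", "surgical removal")
# ...while an opaque word gives the reader no such cue.
assert decompose("patient") is None
```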

Combining Word-Level and Character-Level Representations for Relation Classification of Informal Text

Title Combining Word-Level and Character-Level Representations for Relation Classification of Informal Text
Authors Dongyun Liang, Weiran Xu, Yinge Zhao
Abstract Word representation models have achieved great success in natural language processing tasks, such as relation classification. However, they do not always work well on informal text, and the morphemes of some misspelled words may carry important short-distance semantic information. We propose a hybrid model, combining the merits of word-level and character-level representations to learn better representations on informal text. Experiments on two relation classification datasets, SemEval-2010 Task 8 and a large-scale one we compile from informal text, show that our model achieves a competitive result on the former and state-of-the-art performance on the latter.
Tasks Relation Classification, Representation Learning, Slot Filling
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2606/
PDF https://www.aclweb.org/anthology/W17-2606
PWC https://paperswithcode.com/paper/combining-word-level-and-character-level-1
Repo
Framework
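The benefit of the hybrid representation can be sketched with toy vectors (the paper's actual model learns both levels with neural networks; everything below is invented for illustration). A misspelled word is out-of-vocabulary at the word level, but its character evidence still resembles the correct form.

```python
WORD_VECS = {"good": [1.0, 0.0]}   # toy word-embedding table; OOV words get zeros

def word_vec(token):
    return WORD_VECS.get(token, [0.0, 0.0])

def char_vec(token, dim=4):
    """Crude character-level representation: hashed character-bigram counts."""
    v = [0.0] * dim
    for a, b in zip(token, token[1:]):
        v[hash(a + b) % dim] += 1.0
    return v

def hybrid(token):
    """Concatenate word-level and character-level vectors."""
    return word_vec(token) + char_vec(token)

# The misspelling "goood" is out-of-vocabulary, so its word vector is all zeros...
assert word_vec("goood") == [0.0, 0.0]
# ...but its character bigrams still overlap heavily with those of "good",
# so the hybrid representation preserves the short-distance semantics.
shared = sum(min(a, b) for a, b in zip(char_vec("good"), char_vec("goood")))
assert shared >= 2.0
```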

Modeling Large-Scale Structured Relationships with Shared Memory for Knowledge Base Completion

Title Modeling Large-Scale Structured Relationships with Shared Memory for Knowledge Base Completion
Authors Yelong Shen, Po-Sen Huang, Ming-Wei Chang, Jianfeng Gao
Abstract Recent studies on knowledge base completion, the task of recovering missing relationships based on recorded relations, demonstrate the importance of learning embeddings from multi-step relations. However, due to the size of knowledge bases, learning multi-step relations directly on top of observed triplets could be costly. Hence, a manually designed procedure is often used when training the models. In this paper, we propose Implicit ReasoNets (IRNs), which are designed to perform multi-step inference implicitly through a controller and shared memory. Without a human-designed inference procedure, IRNs use training data to learn to perform multi-step inference in an embedding neural space through the shared memory and controller. While the inference procedure does not explicitly operate on top of observed triplets, our proposed model outperforms all previous approaches on the popular FB15k benchmark by more than 5.7%.
Tasks Knowledge Base Completion, Question Answering, Representation Learning
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2608/
PDF https://www.aclweb.org/anthology/W17-2608
PWC https://paperswithcode.com/paper/modeling-large-scale-structured-relationships
Repo
Framework
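The controller/shared-memory loop can be sketched in miniature (a toy illustration, not the trained IRN: the vectors, the averaging update, and the stopping rule are all invented). The key idea is that the model takes several implicit refinement steps over a shared memory instead of following a hand-designed multi-hop path.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(state, memory):
    """Read from memory: softmax-weighted sum of slots by similarity to the state."""
    scores = [dot(state, slot) for slot in memory]
    exps = [2.718281828 ** s for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    return [sum(w * slot[i] for w, slot in zip(weights, memory))
            for i in range(len(state))]

def infer(state, memory, max_steps=5, tol=1e-3):
    """Controller loop: refine the state with memory reads until it stabilizes."""
    for step in range(max_steps):
        read = attend(state, memory)
        new_state = [(s + r) / 2 for s, r in zip(state, read)]  # toy update rule
        if max(abs(a - b) for a, b in zip(new_state, state)) < tol:
            return new_state, step + 1
        state = new_state
    return state, max_steps

memory = [[1.0, 0.0], [0.0, 1.0]]          # toy shared-memory slots
final, steps = infer([0.9, 0.1], memory)
assert steps >= 1 and len(final) == 2      # multiple implicit steps, no fixed path
```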

Single-Rooted DAGs in Regular DAG Languages: Parikh Image and Path Languages

Title Single-Rooted DAGs in Regular DAG Languages: Parikh Image and Path Languages
Authors Martin Berglund, Henrik Björklund, Frank Drewes
Abstract
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-6210/
PDF https://www.aclweb.org/anthology/W17-6210
PWC https://paperswithcode.com/paper/single-rooted-dags-in-regular-dag-languages
Repo
Framework

Incrementality all the way up

Title Incrementality all the way up
Authors Ellen Breitholtz, Christine Howes, Robin Cooper
Abstract
Tasks
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-7201/
PDF https://www.aclweb.org/anthology/W17-7201
PWC https://paperswithcode.com/paper/incrementality-all-the-way-up
Repo
Framework

Arabic Textual Entailment with Word Embeddings

Title Arabic Textual Entailment with Word Embeddings
Authors Nada Almarwani, Mona Diab
Abstract Determining the textual entailment between texts is important in many NLP tasks, such as summarization, question answering, and information extraction and retrieval. Various methods have been suggested based on external knowledge sources; however, such resources are not always available in all languages and their acquisition is typically laborious and very costly. Distributional word representations such as word embeddings learned over large corpora have been shown to capture syntactic and semantic word relationships. Such models have contributed to improving the performance of several NLP tasks. In this paper, we address the problem of textual entailment in Arabic. We employ both traditional features and distributional representations. Crucially, we do not depend on any external resources in the process. Our suggested approach yields state of the art performance on a standard data set, ArbTE, achieving an accuracy of 76.2 {%} compared to state of the art of 69.3 {%}.
Tasks Machine Translation, Natural Language Inference, Question Answering, Word Embeddings
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1322/
PDF https://www.aclweb.org/anthology/W17-1322
PWC https://paperswithcode.com/paper/arabic-textual-entailment-with-word
Repo
Framework
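One distributional feature of the kind the paper relies on, a similarity score computed from word embeddings alone with no external knowledge resources, can be sketched as follows. The tiny embedding table is invented for illustration; the paper's actual features and embeddings differ.

```python
import math

EMB = {                      # toy 2-d word embeddings (illustrative values)
    "cat":    [0.90, 0.10],
    "feline": [0.85, 0.20],
    "car":    [0.10, 0.90],
}

def sent_vec(tokens):
    """Average the word vectors of the in-vocabulary tokens."""
    vecs = [EMB[t] for t in tokens if t in EMB]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den

# A premise/hypothesis pair about the same concept scores higher than an
# unrelated pair -- one cheap, resource-free signal for entailment classification.
assert cosine(sent_vec(["cat"]), sent_vec(["feline"])) > \
       cosine(sent_vec(["cat"]), sent_vec(["car"]))
```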

Gaussian process based nonlinear latent structure discovery in multivariate spike train data

Title Gaussian process based nonlinear latent structure discovery in multivariate spike train data
Authors Anqi Wu, Nicholas G. Roy, Stephen Keeley, Jonathan W. Pillow
Abstract A large body of recent work focuses on methods for extracting low-dimensional latent structure from multi-neuron spike train data. Most such methods employ either linear latent dynamics or linear mappings from latent space to log spike rates. Here we propose a doubly nonlinear latent variable model that can identify low-dimensional structure underlying apparently high-dimensional spike train data. We introduce the Poisson Gaussian-Process Latent Variable Model (P-GPLVM), which consists of Poisson spiking observations and two underlying Gaussian processes—one governing a temporal latent variable and another governing a set of nonlinear tuning curves. The use of nonlinear tuning curves enables discovery of low-dimensional latent structure even when spike responses exhibit high linear dimensionality (e.g., as found in hippocampal place cell codes). To learn the model from data, we introduce the decoupled Laplace approximation, a fast approximate inference method that allows us to efficiently optimize the latent path while marginalizing over tuning curves. We show that this method outperforms previous Laplace-approximation-based inference methods in both the speed of convergence and accuracy. We apply the model to spike trains recorded from hippocampal place cells and show that it compares favorably to a variety of previous methods for latent structure discovery, including variational auto-encoder (VAE) based methods that parametrize the nonlinear mapping from latent space to spike rates with a deep neural network.
Tasks Gaussian Processes
Published 2017-12-01
URL http://papers.nips.cc/paper/6941-gaussian-process-based-nonlinear-latent-structure-discovery-in-multivariate-spike-train-data
PDF http://papers.nips.cc/paper/6941-gaussian-process-based-nonlinear-latent-structure-discovery-in-multivariate-spike-train-data.pdf
PWC https://paperswithcode.com/paper/gaussian-process-based-nonlinear-latent
Repo
Framework
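Reading the abstract, the P-GPLVM generative model can be written out as follows (a reconstruction from the description above, with generic kernel notation, not the authors' exact symbols):

    x(t)  ~ GP(0, k_t(t, t'))                       (temporal latent trajectory)
    f_i(x) ~ GP(0, k_x(x, x'))                      (nonlinear tuning curve of neuron i)
    y_{i,t} | x, f_i ~ Poisson(exp(f_i(x(t))))      (spike counts)

The "doubly nonlinear" structure is visible here: one Gaussian process governs the latent path x(t) and a second governs each neuron's tuning curve f_i, with Poisson observations linked through an exponential nonlinearity. The decoupled Laplace approximation then optimizes the latent path while marginalizing over the tuning curves.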

Representing Compositionality based on Multiple Timescales Gated Recurrent Neural Networks with Adaptive Temporal Hierarchy for Character-Level Language Models

Title Representing Compositionality based on Multiple Timescales Gated Recurrent Neural Networks with Adaptive Temporal Hierarchy for Character-Level Language Models
Authors Dennis Singh Moirangthem, Jegyung Son, Minho Lee
Abstract A novel character-level neural language model is proposed in this paper. The proposed model incorporates a biologically inspired temporal hierarchy in the architecture for representing multiple compositions of language in order to handle longer sequences for the character-level language model. The temporal hierarchy is introduced in the language model by utilizing a Gated Recurrent Neural Network with multiple timescales. The proposed model incorporates a timescale adaptation mechanism for enhancing the performance of the language model. We evaluate our proposed model using the popular Penn Treebank and Text8 corpora. The experiments show that the use of multiple timescales in a Neural Language Model (NLM) enables improved performance despite having fewer parameters and no additional computation requirements. Our experiments also demonstrate the ability of the adaptive temporal hierarchies to represent multiple compositionality without the help of complex hierarchical architectures, and show that better representation of the longer sequences leads to enhanced performance of the probabilistic language model.
Tasks Language Modelling, Representation Learning
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2616/
PDF https://www.aclweb.org/anthology/W17-2616
PWC https://paperswithcode.com/paper/representing-compositionality-based-on
Repo
Framework
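The multiple-timescales idea can be sketched with a leaky integrator (an illustrative simplification, not the paper's gated architecture): each group of recurrent units updates with its own time constant tau, so slow units retain context over many more characters than fast units.

```python
def leaky_update(h, x, tau):
    """One step of a leaky integrator: larger tau means slower forgetting."""
    return (1.0 - 1.0 / tau) * h + (1.0 / tau) * x

def run(inputs, tau):
    h = 0.0
    for x in inputs:
        h = leaky_update(h, x, tau)
    return h

# After a burst of input followed by silence, the slow unit (tau=8) still
# carries a much larger trace of the earlier signal than the fast unit,
# which is what lets a hierarchy of timescales span longer sequences.
signal = [1.0] * 5 + [0.0] * 10
assert run(signal, tau=8.0) > run(signal, tau=1.25)
```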

Prediction of Frame-to-Frame Relations in the FrameNet Hierarchy with Frame Embeddings

Title Prediction of Frame-to-Frame Relations in the FrameNet Hierarchy with Frame Embeddings
Authors Teresa Botschen, Hatem Mousselly-Sergieh, Iryna Gurevych
Abstract Automatic completion of frame-to-frame (F2F) relations in the FrameNet (FN) hierarchy has received little attention, although they incorporate meta-level commonsense knowledge and are used in downstream approaches. We address the problem of sparsely annotated F2F relations. First, we examine whether the manually defined F2F relations emerge from text by learning text-based frame embeddings. Our analysis reveals insights about the difficulty of reconstructing F2F relations purely from text. Second, we present different systems for predicting F2F relations; our best-performing one uses the FN hierarchy to train on and to ground embeddings in. A comparison of systems and embeddings exposes the crucial influence of knowledge-based embeddings on a system's performance in predicting F2F relations.
Tasks Natural Language Inference, Representation Learning, Semantic Role Labeling
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2618/
PDF https://www.aclweb.org/anthology/W17-2618
PWC https://paperswithcode.com/paper/prediction-of-frame-to-frame-relations-in-the
Repo
Framework
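Scoring a candidate F2F relation with embeddings can be sketched in the spirit of translation-based link prediction (a TransE-style score; the paper's actual systems may differ, and the vectors below are invented for illustration).

```python
FRAME_EMB = {                    # toy 2-d frame embeddings (invented)
    "Commerce_buy": [1.0, 0.0],
    "Getting":      [1.0, 1.0],
    "Motion":       [0.0, 0.2],
}
INHERITS = [0.0, 1.0]            # toy offset vector for the Inheritance relation

def score(child, parent):
    """Lower is better: L1 distance between child + relation and parent."""
    return sum(abs(c + r - p)
               for c, r, p in zip(FRAME_EMB[child], INHERITS, FRAME_EMB[parent]))

# The plausible relation (Commerce_buy inherits from Getting) scores better
# (lower) than an implausible one, so ranking candidates fills sparse relations.
assert score("Commerce_buy", "Getting") < score("Commerce_buy", "Motion")
```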

Lexicalized vs. Delexicalized Parsing in Low-Resource Scenarios

Title Lexicalized vs. Delexicalized Parsing in Low-Resource Scenarios
Authors Agnieszka Falenska, Özlem Çetinoğlu
Abstract We present a systematic analysis of lexicalized vs. delexicalized parsing in low-resource scenarios, and propose a methodology to choose one method over another under certain conditions. We create a set of simulation experiments on 41 languages and apply our findings to 9 low-resource languages. Experimental results show that our methodology chooses the best approach in 8 out of 9 cases.
Tasks Dependency Parsing, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-6303/
PDF https://www.aclweb.org/anthology/W17-6303
PWC https://paperswithcode.com/paper/lexicalized-vs-delexicalized-parsing-in-low
Repo
Framework
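Delexicalization itself is simple to sketch: word forms are replaced by their POS tags, so a parser trained on one language can be applied to another that shares the tag inventory. The tags here are Universal POS tags; the tagged sentences are toy examples.

```python
def delexicalize(tagged_sentence):
    """Keep only the POS tags, dropping language-specific word forms."""
    return [tag for _word, tag in tagged_sentence]

english = [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")]
german  = [("der", "DET"), ("Hund", "NOUN"), ("rennt", "VERB")]

# After delexicalization the two sentences look identical to the parser,
# which is what makes cross-lingual transfer possible in the first place --
# at the cost of the lexical cues the paper's lexicalized variant retains.
assert delexicalize(english) == delexicalize(german) == ["DET", "NOUN", "VERB"]
```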

Proceedings of the Biomedical NLP Workshop associated with RANLP 2017

Title Proceedings of the Biomedical NLP Workshop associated with RANLP 2017
Authors Svetla Boytcheva, Kevin Bretonnel Cohen, Guergana Savova, Galia Angelova
Abstract
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-8000/
PDF https://www.aclweb.org/anthology/W17-8000
PWC https://paperswithcode.com/paper/proceedings-of-the-biomedical-nlp-workshop
Repo
Framework