Paper Group NANR 121
URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors
Title | URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors |
Authors | Patrick Littell, David R. Mortensen, Ke Lin, Katherine Kairis, Carlisle Turner, Lori Levin |
Abstract | We introduce the URIEL knowledge base for massively multilingual NLP and the lang2vec utility, which provides information-rich vector identifications of languages drawn from typological, geographical, and phylogenetic databases and normalized to have straightforward and consistent formats, naming, and semantics. The goal of URIEL and lang2vec is to enable multilingual NLP, especially on less-resourced languages, and to make possible types of experiments (especially but not exclusively related to NLP tasks) that are otherwise difficult or impossible due to the sparsity and incommensurability of the data sources. lang2vec vectors have been shown to reduce perplexity in multilingual language modeling when compared to one-hot language identification vectors. |
Tasks | Language Identification, Language Modelling |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2002/ |
https://www.aclweb.org/anthology/E17-2002 | |
PWC | https://paperswithcode.com/paper/uriel-and-lang2vec-representing-languages-as |
Repo | |
Framework | |
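The core idea of the paper, representing each language as a dense typological feature vector so that languages can be compared numerically, can be sketched as follows. The binary vectors below are invented for illustration and are not actual URIEL data; real vectors come from the lang2vec utility.

```python
import math

# Hypothetical binary typological feature vectors (NOT actual URIEL
# features): each position marks presence/absence of one feature.
LANG_VECS = {
    "eng": [1, 0, 1, 1, 0, 1],
    "deu": [1, 0, 1, 0, 0, 1],
    "jpn": [0, 1, 0, 1, 1, 0],
}

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Typologically closer languages score higher.
print(cosine(LANG_VECS["eng"], LANG_VECS["deu"]))
print(cosine(LANG_VECS["eng"], LANG_VECS["jpn"]))
```

Such vectors can replace one-hot language IDs as conditioning input in a multilingual model, which is the use case the abstract reports perplexity gains for.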
A morphological analyser for Kven
Title | A morphological analyser for Kven |
Authors | Sindre Reino Trosterud, Trond Trosterud, Anna-Kaisa Räisänen, Leena Niiranen, Mervi Haavisto, Kaisa Maliniemi |
Abstract | |
Tasks | |
Published | 2017-01-01 |
URL | https://www.aclweb.org/anthology/W17-0608/ |
https://www.aclweb.org/anthology/W17-0608 | |
PWC | https://paperswithcode.com/paper/a-morphological-analyser-for-kven |
Repo | |
Framework | |
Proceedings of the 2nd Workshop on Representation Learning for NLP
Title | Proceedings of the 2nd Workshop on Representation Learning for NLP |
Authors | |
Abstract | |
Tasks | Representation Learning |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2600/ |
https://www.aclweb.org/anthology/W17-2600 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-2nd-workshop-on-2 |
Repo | |
Framework | |
Raising to Object in Japanese: An HPSG Analysis
Title | Raising to Object in Japanese: An HPSG Analysis |
Authors | Akira Ohtani |
Abstract | |
Tasks | |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/Y17-1013/ |
https://www.aclweb.org/anthology/Y17-1013 | |
PWC | https://paperswithcode.com/paper/raising-to-object-in-japanese-an-hpsg |
Repo | |
Framework | |
Understanding of unknown medical words
Title | Understanding of unknown medical words |
Authors | Natalia Grabar, Thierry Hamon |
Abstract | We assume that unknown words with internal structure (affixed words or compounds) can provide speakers with linguistic cues as to their meaning, and thus help their decoding and understanding. To verify this hypothesis, we propose to work with a set of French medical words. These words are annotated by five annotators. Then, two kinds of analysis are performed: analysis of the evolution of understandable and non-understandable words (globally and according to some suffixes) and analysis of clusters created with unsupervised algorithms on the basis of linguistic and extra-linguistic features of the studied words. Our results suggest that, according to the linguistic sensitivity of annotators, technical words can be decoded and become understandable. As for the clusters, some of them distinguish between understandable and non-understandable words. Resources built in this work will be made freely available for research purposes. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-8005/ |
https://doi.org/10.26615/978-954-452-044-1_005 | |
PWC | https://paperswithcode.com/paper/understanding-of-unknown-medical-words |
Repo | |
Framework | |
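The suffix-based analysis the abstract describes can be sketched as grouping words by a recognizable suffix, the kind of internal-structure cue the study examines. The word list and suffix inventory here are invented for illustration and are not the annotated French dataset.

```python
# Hypothetical suffix inventory; real medical suffixes carry meaning
# cues (e.g. -ite 'inflammation', -ologie 'study of').
SUFFIXES = ["ite", "ologie", "tomie"]

def suffix_key(word):
    """Return the first recognized suffix of `word`, or None."""
    for s in SUFFIXES:
        if word.endswith(s):
            return s
    return None

words = ["hepatite", "cardiologie", "lobotomie", "abdomen"]
groups = {}
for w in words:
    groups.setdefault(suffix_key(w), []).append(w)
print(groups)  # words sharing a suffix land in the same group
```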
Combining Word-Level and Character-Level Representations for Relation Classification of Informal Text
Title | Combining Word-Level and Character-Level Representations for Relation Classification of Informal Text |
Authors | Dongyun Liang, Weiran Xu, Yinge Zhao |
Abstract | Word representation models have achieved great success in natural language processing tasks, such as relation classification. However, they do not always work well on informal text, and the morphemes of some misspelled words may carry important short-distance semantic information. We propose a hybrid model, combining the merits of word-level and character-level representations to learn better representations on informal text. Experiments on two relation classification datasets, SemEval-2010 Task 8 and a large-scale one we compiled from informal text, show that our model achieves a competitive result on the former and state-of-the-art performance on the latter. |
Tasks | Relation Classification, Representation Learning, Slot Filling |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2606/ |
https://www.aclweb.org/anthology/W17-2606 | |
PWC | https://paperswithcode.com/paper/combining-word-level-and-character-level-1 |
Repo | |
Framework | |
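The combination idea can be sketched by concatenating a word-level embedding with a character-level one: when a misspelling misses the word vocabulary, the character-level part still yields a representation close to the correctly spelled form. The tiny embedding tables below are invented for illustration and are not the paper's learned model.

```python
# Hypothetical embedding tables (NOT learned): a 2-d word lookup with
# an <unk> fallback, and a 1-d character lookup.
WORD_EMB = {"cool": [0.9, 0.1], "<unk>": [0.0, 0.0]}
CHAR_EMB = {c: [(ord(c) % 7) / 7.0] for c in "abcdefghijklmnopqrstuvwxyz"}

def char_repr(word):
    """Average character embeddings — robust to misspellings like 'coool'."""
    vecs = [CHAR_EMB[c] for c in word if c in CHAR_EMB]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def hybrid_repr(word):
    """Concatenate word-level (with <unk> fallback) and char-level vectors."""
    return WORD_EMB.get(word, WORD_EMB["<unk>"]) + char_repr(word)

print(hybrid_repr("cool"))   # known word: informative word vector
print(hybrid_repr("coool"))  # misspelling: <unk> word vector, similar char part
```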
Modeling Large-Scale Structured Relationships with Shared Memory for Knowledge Base Completion
Title | Modeling Large-Scale Structured Relationships with Shared Memory for Knowledge Base Completion |
Authors | Yelong Shen, Po-Sen Huang, Ming-Wei Chang, Jianfeng Gao |
Abstract | Recent studies on knowledge base completion, the task of recovering missing relationships based on recorded relations, demonstrate the importance of learning embeddings from multi-step relations. However, due to the size of knowledge bases, learning multi-step relations directly on top of observed triplets could be costly. Hence, a manually designed procedure is often used when training the models. In this paper, we propose Implicit ReasoNets (IRNs), which are designed to perform multi-step inference implicitly through a controller and shared memory. Without a human-designed inference procedure, IRNs use training data to learn to perform multi-step inference in an embedding neural space through the shared memory and controller. While the inference procedure does not explicitly operate on top of observed triplets, our proposed model outperforms all previous approaches on the popular FB15k benchmark by more than 5.7%. |
Tasks | Knowledge Base Completion, Question Answering, Representation Learning |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2608/ |
https://www.aclweb.org/anthology/W17-2608 | |
PWC | https://paperswithcode.com/paper/modeling-large-scale-structured-relationships |
Repo | |
Framework | |
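The multi-step inference the abstract contrasts with single-triplet training can be illustrated with a much simpler, classical device: TransE-style embeddings where a relation is a vector translation and a two-step path composes by adding translations. This is explicitly not the authors' IRN architecture (no controller, no shared memory); the embeddings below are hand-picked toy values.

```python
# Toy hand-picked embeddings (NOT learned, NOT the IRN model).
ENTITY = {"A": [0.0, 0.0], "B": [1.0, 0.0], "C": [1.0, 1.0]}
REL = {"r1": [1.0, 0.0], "r2": [0.0, 1.0]}

def translate(vec, r):
    """Apply relation r as a vector translation (TransE-style)."""
    return [a + b for a, b in zip(vec, REL[r])]

def nearest_entity(vec):
    """Entity whose embedding is closest (squared L2) to vec."""
    return min(ENTITY, key=lambda e: sum((a - b) ** 2 for a, b in zip(ENTITY[e], vec)))

one_step = translate(ENTITY["A"], "r1")   # A --r1--> ?
two_step = translate(one_step, "r2")      # A --r1--> ? --r2--> ?
print(nearest_entity(one_step), nearest_entity(two_step))  # B C
```

The costly part the abstract refers to is that enumerating such multi-step paths explicitly grows combinatorially with the knowledge base; IRNs instead learn the stepping behavior implicitly.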
Single-Rooted DAGs in Regular DAG Languages: Parikh Image and Path Languages
Title | Single-Rooted DAGs in Regular DAG Languages: Parikh Image and Path Languages |
Authors | Martin Berglund, Henrik Björklund, Frank Drewes |
Abstract | |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6210/ |
https://www.aclweb.org/anthology/W17-6210 | |
PWC | https://paperswithcode.com/paper/single-rooted-dags-in-regular-dag-languages |
Repo | |
Framework | |
Incrementality all the way up
Title | Incrementality all the way up |
Authors | Ellen Breitholtz, Christine Howes, Robin Cooper |
Abstract | |
Tasks | |
Published | 2017-01-01 |
URL | https://www.aclweb.org/anthology/W17-7201/ |
https://www.aclweb.org/anthology/W17-7201 | |
PWC | https://paperswithcode.com/paper/incrementality-all-the-way-up |
Repo | |
Framework | |
Arabic Textual Entailment with Word Embeddings
Title | Arabic Textual Entailment with Word Embeddings |
Authors | Nada Almarwani, Mona Diab |
Abstract | Determining the textual entailment between texts is important in many NLP tasks, such as summarization, question answering, and information extraction and retrieval. Various methods have been suggested based on external knowledge sources; however, such resources are not always available in all languages and their acquisition is typically laborious and very costly. Distributional word representations such as word embeddings learned over large corpora have been shown to capture syntactic and semantic word relationships. Such models have contributed to improving the performance of several NLP tasks. In this paper, we address the problem of textual entailment in Arabic. We employ both traditional features and distributional representations. Crucially, we do not depend on any external resources in the process. Our suggested approach yields state-of-the-art performance on a standard data set, ArbTE, achieving an accuracy of 76.2% compared to the state of the art of 69.3%. |
Tasks | Machine Translation, Natural Language Inference, Question Answering, Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1322/ |
https://www.aclweb.org/anthology/W17-1322 | |
PWC | https://paperswithcode.com/paper/arabic-textual-entailment-with-word |
Repo | |
Framework | |
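The "traditional features" the abstract mentions alongside distributional representations can be sketched as simple surface statistics over a text/hypothesis pair. The feature choices below (hypothesis word coverage, length ratio) are illustrative assumptions, not the paper's exact feature set, and the example sentences are invented.

```python
def entailment_features(text, hypothesis):
    """Two classic surface features for a text/hypothesis pair."""
    t = set(text.lower().split())
    h = set(hypothesis.lower().split())
    overlap = len(t & h) / len(h)   # fraction of hypothesis words covered by the text
    length_ratio = len(h) / len(t)  # shorter hypotheses tend to be easier to entail
    return [overlap, length_ratio]

print(entailment_features("the cat sat on the mat", "a cat sat on a mat"))
```

In the paper such features are fed to a classifier together with embedding-based similarity scores; none of them requires external knowledge sources.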
Gaussian process based nonlinear latent structure discovery in multivariate spike train data
Title | Gaussian process based nonlinear latent structure discovery in multivariate spike train data |
Authors | Anqi Wu, Nicholas G. Roy, Stephen Keeley, Jonathan W. Pillow |
Abstract | A large body of recent work focuses on methods for extracting low-dimensional latent structure from multi-neuron spike train data. Most such methods employ either linear latent dynamics or linear mappings from latent space to log spike rates. Here we propose a doubly nonlinear latent variable model that can identify low-dimensional structure underlying apparently high-dimensional spike train data. We introduce the Poisson Gaussian-Process Latent Variable Model (P-GPLVM), which consists of Poisson spiking observations and two underlying Gaussian processes—one governing a temporal latent variable and another governing a set of nonlinear tuning curves. The use of nonlinear tuning curves enables discovery of low-dimensional latent structure even when spike responses exhibit high linear dimensionality (e.g., as found in hippocampal place cell codes). To learn the model from data, we introduce the decoupled Laplace approximation, a fast approximate inference method that allows us to efficiently optimize the latent path while marginalizing over tuning curves. We show that this method outperforms previous Laplace-approximation-based inference methods in both the speed of convergence and accuracy. We apply the model to spike trains recorded from hippocampal place cells and show that it compares favorably to a variety of previous methods for latent structure discovery, including variational auto-encoder (VAE) based methods that parametrize the nonlinear mapping from latent space to spike rates with a deep neural network. |
Tasks | Gaussian Processes |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6941-gaussian-process-based-nonlinear-latent-structure-discovery-in-multivariate-spike-train-data |
http://papers.nips.cc/paper/6941-gaussian-process-based-nonlinear-latent-structure-discovery-in-multivariate-spike-train-data.pdf | |
PWC | https://paperswithcode.com/paper/gaussian-process-based-nonlinear-latent |
Repo | |
Framework | |
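The generative idea behind the model, a low-dimensional latent path pushed through nonlinear tuning curves to produce Poisson firing rates, can be sketched with toy numbers. This shows only the forward model; it is not the P-GPLVM inference procedure (which places Gaussian-process priors on both the latent path and the tuning curves and fits them with a decoupled Laplace approximation). All values below are invented.

```python
import math

latent_path = [0.0, 0.5, 1.0, 1.5, 2.0]  # 1-D latent state over five time bins
centers = [0.5, 1.5]                      # preferred latent values of 2 neurons

def rate(x, c, width=0.5, peak=10.0):
    """Gaussian-bump tuning curve: firing rate peaks near the preferred value c."""
    return peak * math.exp(-((x - c) ** 2) / (2 * width ** 2))

# Expected spike counts per time bin: a 2-D-looking population response
# driven entirely by a single latent dimension (cf. place-cell codes).
rates = [[rate(x, c) for c in centers] for x in latent_path]
for row in rates:
    print([round(r, 2) for r in row])
```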
Representing Compositionality based on Multiple Timescales Gated Recurrent Neural Networks with Adaptive Temporal Hierarchy for Character-Level Language Models
Title | Representing Compositionality based on Multiple Timescales Gated Recurrent Neural Networks with Adaptive Temporal Hierarchy for Character-Level Language Models |
Authors | Dennis Singh Moirangthem, Jegyung Son, Minho Lee |
Abstract | A novel character-level neural language model is proposed in this paper. The proposed model incorporates a biologically inspired temporal hierarchy in the architecture for representing multiple compositions of language in order to handle longer sequences for the character-level language model. The temporal hierarchy is introduced in the language model by utilizing a Gated Recurrent Neural Network with multiple timescales. The proposed model incorporates a timescale adaptation mechanism for enhancing the performance of the language model. We evaluate our proposed model using the popular Penn Treebank and Text8 corpora. The experiments show that the use of multiple timescales in a Neural Language Model (NLM) enables improved performance despite having fewer parameters and with no additional computation requirements. Our experiments also demonstrate the ability of the adaptive temporal hierarchies to represent multiple compositionality without the help of complex hierarchical architectures, and show that better representation of longer sequences leads to enhanced performance of the probabilistic language model. |
Tasks | Language Modelling, Representation Learning |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2616/ |
https://www.aclweb.org/anthology/W17-2616 | |
PWC | https://paperswithcode.com/paper/representing-compositionality-based-on |
Repo | |
Framework | |
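The multiple-timescale idea can be sketched with two scalar recurrent states: a fast state that updates on every character and a slow state that updates only every `timescale` steps, giving it a longer effective memory. This illustrates only the timescale mechanism; it is not the paper's gated architecture or its adaptation scheme, and all coefficients below are arbitrary.

```python
def multiscale_states(chars, timescale=3):
    """Fast state updates every character; slow state every `timescale` steps."""
    fast, slow = 0.0, 0.0
    for t, ch in enumerate(chars):
        x = ord(ch) / 128.0                # crude per-character input in [0, 1)
        fast = 0.5 * fast + 0.5 * x        # short memory: tracks recent characters
        if t % timescale == 0:
            slow = 0.9 * slow + 0.1 * fast  # long memory: slowly integrates `fast`
    return fast, slow

fast, slow = multiscale_states("hello world")
print(round(fast, 3), round(slow, 3))
```

In the paper the layers are gated recurrent units rather than scalars, and the timescales themselves adapt during training.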
Prediction of Frame-to-Frame Relations in the FrameNet Hierarchy with Frame Embeddings
Title | Prediction of Frame-to-Frame Relations in the FrameNet Hierarchy with Frame Embeddings |
Authors | Teresa Botschen, Hatem Mousselly-Sergieh, Iryna Gurevych |
Abstract | Automatic completion of frame-to-frame (F2F) relations in the FrameNet (FN) hierarchy has received little attention, although they incorporate meta-level commonsense knowledge and are used in downstream approaches. We address the problem of sparsely annotated F2F relations. First, we examine whether the manually defined F2F relations emerge from text by learning text-based frame embeddings. Our analysis reveals insights about the difficulty of reconstructing F2F relations purely from text. Second, we present different systems for predicting F2F relations; our best-performing one uses the FN hierarchy to train on and to ground embeddings in. A comparison of systems and embeddings exposes the crucial influence of knowledge-based embeddings on a system's performance in predicting F2F relations. |
Tasks | Natural Language Inference, Representation Learning, Semantic Role Labeling |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2618/ |
https://www.aclweb.org/anthology/W17-2618 | |
PWC | https://paperswithcode.com/paper/prediction-of-frame-to-frame-relations-in-the |
Repo | |
Framework | |
Lexicalized vs. Delexicalized Parsing in Low-Resource Scenarios
Title | Lexicalized vs. Delexicalized Parsing in Low-Resource Scenarios |
Authors | Agnieszka Falenska, Özlem Çetinoğlu |
Abstract | We present a systematic analysis of lexicalized vs. delexicalized parsing in low-resource scenarios, and propose a methodology to choose one method over another under certain conditions. We create a set of simulation experiments on 41 languages and apply our findings to 9 low-resource languages. Experimental results show that our methodology chooses the best approach in 8 out of 9 cases. |
Tasks | Dependency Parsing, Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6303/ |
https://www.aclweb.org/anthology/W17-6303 | |
PWC | https://paperswithcode.com/paper/lexicalized-vs-delexicalized-parsing-in-low |
Repo | |
Framework | |
Proceedings of the Biomedical NLP Workshop associated with RANLP 2017
Title | Proceedings of the Biomedical NLP Workshop associated with RANLP 2017 |
Authors | Svetla Boytcheva, Kevin Bretonnel Cohen, Guergana Savova, Galia Angelova |
Abstract | |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/papers/W17-8000/w17-8000 |
https://www.aclweb.org/anthology/W17-8000 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-biomedical-nlp-workshop |
Repo | |
Framework | |