Paper Group NANR 121
URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors
Title | URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors |
Authors | Patrick Littell, David R. Mortensen, Ke Lin, Katherine Kairis, Carlisle Turner, Lori Levin |
Abstract | We introduce the URIEL knowledge base for massively multilingual NLP and the lang2vec utility, which provides information-rich vector identifications of languages drawn from typological, geographical, and phylogenetic databases and normalized to have straightforward and consistent formats, naming, and semantics. The goal of URIEL and lang2vec is to enable multilingual NLP, especially on less-resourced languages, and to make possible types of experiments (especially but not exclusively related to NLP tasks) that are otherwise difficult or impossible due to the sparsity and incommensurability of the data sources. lang2vec vectors have been shown to reduce perplexity in multilingual language modeling when compared to one-hot language identification vectors. |
Tasks | Language Identification, Language Modelling |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2002/ |
https://www.aclweb.org/anthology/E17-2002 | |
PWC | https://paperswithcode.com/paper/uriel-and-lang2vec-representing-languages-as |
Repo | |
Framework | |
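The core idea of the paper, representing each language as a dense typological feature vector so that languages can be compared numerically, can be sketched as follows. The binary vectors below are invented for illustration and are not actual URIEL data; real vectors come from the lang2vec utility.

```python
import math

# Hypothetical binary typological feature vectors (NOT actual URIEL
# features): each position marks presence/absence of one feature.
LANG_VECS = {
    "eng": [1, 0, 1, 1, 0, 1],
    "deu": [1, 0, 1, 0, 0, 1],
    "jpn": [0, 1, 0, 1, 1, 0],
}

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Typologically closer languages score higher.
print(cosine(LANG_VECS["eng"], LANG_VECS["deu"]))
print(cosine(LANG_VECS["eng"], LANG_VECS["jpn"]))
```

Such vectors can replace one-hot language IDs as conditioning input in a multilingual model, which is the use case the abstract reports perplexity gains for.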
A morphological analyser for Kven
Title | A morphological analyser for Kven |
Authors | Sindre Reino Trosterud, Trond Trosterud, Anna-Kaisa Räisänen, Leena Niiranen, Mervi Haavisto, Kaisa Maliniemi |
Abstract | |
Tasks | |
Published | 2017-01-01 |
URL | https://www.aclweb.org/anthology/W17-0608/ |
https://www.aclweb.org/anthology/W17-0608 | |
PWC | https://paperswithcode.com/paper/a-morphological-analyser-for-kven |
Repo | |
Framework | |
Proceedings of the 2nd Workshop on Representation Learning for NLP
Title | Proceedings of the 2nd Workshop on Representation Learning for NLP |
Authors | |
Abstract | |
Tasks | Representation Learning |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2600/ |
https://www.aclweb.org/anthology/W17-2600 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-2nd-workshop-on-2 |
Repo | |
Framework | |
Raising to Object in Japanese: An HPSG Analysis
Title | Raising to Object in Japanese: An HPSG Analysis |
Authors | Akira Ohtani |
Abstract | |
Tasks | |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/Y17-1013/ |
https://www.aclweb.org/anthology/Y17-1013 | |
PWC | https://paperswithcode.com/paper/raising-to-object-in-japanese-an-hpsg |
Repo | |
Framework | |
Understanding of unknown medical words
Title | Understanding of unknown medical words |
Authors | Natalia Grabar, Thierry Hamon |
Abstract | We assume that unknown words with internal structure (affixed words or compounds) can provide speakers with linguistic cues as to their meaning, and thus help their decoding and understanding. To verify this hypothesis, we propose to work with a set of French medical words. These words are annotated by five annotators. Then, two kinds of analysis are performed: analysis of the evolution of understandable and non-understandable words (globally and according to some suffixes) and analysis of clusters created with unsupervised algorithms on the basis of linguistic and extra-linguistic features of the studied words. Our results suggest that, according to the linguistic sensitivity of annotators, technical words can be decoded and become understandable. As for the clusters, some of them distinguish between understandable and non-understandable words. Resources built in this work will be made freely available for research purposes. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-8005/ |
https://doi.org/10.26615/978-954-452-044-1_005 | |
PWC | https://paperswithcode.com/paper/understanding-of-unknown-medical-words |
Repo | |
Framework | |
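The suffix-based analysis the abstract describes can be sketched as grouping words by a recognizable suffix, the kind of internal-structure cue the study examines. The word list and suffix inventory here are invented for illustration and are not the annotated French dataset.

```python
# Hypothetical suffix inventory; real medical suffixes carry meaning
# cues (e.g. -ite 'inflammation', -ologie 'study of').
SUFFIXES = ["ite", "ologie", "tomie"]

def suffix_key(word):
    """Return the first recognized suffix of `word`, or None."""
    for s in SUFFIXES:
        if word.endswith(s):
            return s
    return None

words = ["hepatite", "cardiologie", "lobotomie", "abdomen"]
groups = {}
for w in words:
    groups.setdefault(suffix_key(w), []).append(w)
print(groups)  # words sharing a suffix land in the same group
```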
Combining Word-Level and Character-Level Representations for Relation Classification of Informal Text
Title | Combining Word-Level and Character-Level Representations for Relation Classification of Informal Text |
Authors | Dongyun Liang, Weiran Xu, Yinge Zhao |
Abstract | Word representation models have achieved great success in natural language processing tasks, such as relation classification. However, they do not always work well on informal text, and the morphemes of some misspelled words may carry important short-distance semantic information. We propose a hybrid model, combining the merits of word-level and character-level representations to learn better representations on informal text. Experiments on two relation classification datasets, SemEval-2010 Task 8 and a large-scale one we compiled from informal text, show that our model achieves a competitive result on the former and state-of-the-art performance on the latter. |
Tasks | Relation Classification, Representation Learning, Slot Filling |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2606/ |
https://www.aclweb.org/anthology/W17-2606 | |
PWC | https://paperswithcode.com/paper/combining-word-level-and-character-level-1 |
Repo | |
Framework | |
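The combination idea can be sketched by concatenating a word-level embedding with a character-level one: when a misspelling misses the word vocabulary, the character-level part still yields a representation close to the correctly spelled form. The tiny embedding tables below are invented for illustration and are not the paper's learned model.

```python
# Hypothetical embedding tables (NOT learned): a 2-d word lookup with
# an <unk> fallback, and a 1-d character lookup.
WORD_EMB = {"cool": [0.9, 0.1], "<unk>": [0.0, 0.0]}
CHAR_EMB = {c: [(ord(c) % 7) / 7.0] for c in "abcdefghijklmnopqrstuvwxyz"}

def char_repr(word):
    """Average character embeddings — robust to misspellings like 'coool'."""
    vecs = [CHAR_EMB[c] for c in word if c in CHAR_EMB]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def hybrid_repr(word):
    """Concatenate word-level (with <unk> fallback) and char-level vectors."""
    return WORD_EMB.get(word, WORD_EMB["<unk>"]) + char_repr(word)

print(hybrid_repr("cool"))   # known word: informative word vector
print(hybrid_repr("coool"))  # misspelling: <unk> word vector, similar char part
```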
Modeling Large-Scale Structured Relationships with Shared Memory for Knowledge Base Completion
Title | Modeling Large-Scale Structured Relationships with Shared Memory for Knowledge Base Completion |
Authors | Yelong Shen, Po-Sen Huang, Ming-Wei Chang, Jianfeng Gao |
Abstract | Recent studies on knowledge base completion, the task of recovering missing relationships based on recorded relations, demonstrate the importance of learning embeddings from multi-step relations. However, due to the size of knowledge bases, learning multi-step relations directly on top of observed triplets could be costly. Hence, a manually designed procedure is often used when training the models. In this paper, we propose Implicit ReasoNets (IRNs), which are designed to perform multi-step inference implicitly through a controller and shared memory. Without a human-designed inference procedure, IRNs use training data to learn to perform multi-step inference in an embedding neural space through the shared memory and controller. While the inference procedure does not explicitly operate on top of observed triplets, our proposed model outperforms all previous approaches on the popular FB15k benchmark by more than 5.7%. |
Tasks | Knowledge Base Completion, Question Answering, Representation Learning |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2608/ |
https://www.aclweb.org/anthology/W17-2608 | |
PWC | https://paperswithcode.com/paper/modeling-large-scale-structured-relationships |
Repo | |
Framework | |
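The multi-step inference the abstract contrasts with single-triplet training can be illustrated with a much simpler, classical device: TransE-style embeddings where a relation is a vector translation and a two-step path composes by adding translations. This is explicitly not the authors' IRN architecture (no controller, no shared memory); the embeddings below are hand-picked toy values.

```python
# Toy hand-picked embeddings (NOT learned, NOT the IRN model).
ENTITY = {"A": [0.0, 0.0], "B": [1.0, 0.0], "C": [1.0, 1.0]}
REL = {"r1": [1.0, 0.0], "r2": [0.0, 1.0]}

def translate(vec, r):
    """Apply relation r as a vector translation (TransE-style)."""
    return [a + b for a, b in zip(vec, REL[r])]

def nearest_entity(vec):
    """Entity whose embedding is closest (squared L2) to vec."""
    return min(ENTITY, key=lambda e: sum((a - b) ** 2 for a, b in zip(ENTITY[e], vec)))

one_step = translate(ENTITY["A"], "r1")   # A --r1--> ?
two_step = translate(one_step, "r2")      # A --r1--> ? --r2--> ?
print(nearest_entity(one_step), nearest_entity(two_step))  # B C
```

The costly part the abstract refers to is that enumerating such multi-step paths explicitly grows combinatorially with the knowledge base; IRNs instead learn the stepping behavior implicitly.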
Single-Rooted DAGs in Regular DAG Languages: Parikh Image and Path Languages
Title | Single-Rooted DAGs in Regular DAG Languages: Parikh Image and Path Languages |
Authors | Martin Berglund, Henrik Björklund, Frank Drewes |
Abstract | |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6210/ |
https://www.aclweb.org/anthology/W17-6210 | |
PWC | https://paperswithcode.com/paper/single-rooted-dags-in-regular-dag-languages |
Repo | |
Framework | |
Incrementality all the way up
Title | Incrementality all the way up |
Authors | Ellen Breitholtz, Christine Howes, Robin Cooper |
Abstract | |
Tasks | |
Published | 2017-01-01 |
URL | https://www.aclweb.org/anthology/W17-7201/ |
https://www.aclweb.org/anthology/W17-7201 | |
PWC | https://paperswithcode.com/paper/incrementality-all-the-way-up |
Repo | |
Framework | |
Arabic Textual Entailment with Word Embeddings
Title | Arabic Textual Entailment with Word Embeddings |
Authors | Nada Almarwani, Mona Diab |
Abstract | Determining the textual entailment between texts is important in many NLP tasks, such as summarization, question answering, and information extraction and retrieval. Various methods have been suggested based on external knowledge sources; however, such resources are not always available in all languages and their acquisition is typically laborious and very costly. Distributional word representations such as word embeddings learned over large corpora have been shown to capture syntactic and semantic word relationships. Such models have contributed to improving the performance of several NLP tasks. In this paper, we address the problem of textual entailment in Arabic. We employ both traditional features and distributional representations. Crucially, we do not depend on any external resources in the process. Our suggested approach yields state-of-the-art performance on a standard data set, ArbTE, achieving an accuracy of 76.2% compared to the state of the art of 69.3%. |
Tasks | Machine Translation, Natural Language Inference, Question Answering, Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1322/ |
https://www.aclweb.org/anthology/W17-1322 | |
PWC | https://paperswithcode.com/paper/arabic-textual-entailment-with-word |
Repo | |
Framework | |
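The "traditional features" the abstract mentions alongside distributional representations can be sketched as simple surface statistics over a text/hypothesis pair. The feature choices below (hypothesis word coverage, length ratio) are illustrative assumptions, not the paper's exact feature set, and the example sentences are invented.

```python
def entailment_features(text, hypothesis):
    """Two classic surface features for a text/hypothesis pair."""
    t = set(text.lower().split())
    h = set(hypothesis.lower().split())
    overlap = len(t & h) / len(h)   # fraction of hypothesis words covered by the text
    length_ratio = len(h) / len(t)  # shorter hypotheses tend to be easier to entail
    return [overlap, length_ratio]

print(entailment_features("the cat sat on the mat", "a cat sat on a mat"))
```

In the paper such features are fed to a classifier together with embedding-based similarity scores; none of them requires external knowledge sources.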
Gaussian process based nonlinear latent structure discovery in multivariate spike train data
Title | Gaussian process based nonlinear latent structure discovery in multivariate spike train data |
Authors | Anqi Wu, Nicholas G. Roy, Stephen Keeley, Jonathan W. Pillow |
Abstract | A large body of recent work focuses on methods for extracting low-dimensional latent structure from multi-neuron spike train data. Most such methods employ either linear latent dynamics or linear mappings from latent space to log spike rates. Here we propose a doubly nonlinear latent variable model that can identify low-dimensional structure underlying apparently high-dimensional spike train data. We introduce the Poisson Gaussian-Process Latent Variable Model (P-GPLVM), which consists of Poisson spiking observations and two underlying Gaussian processes—one governing a temporal latent variable and another governing a set of nonlinear tuning curves. The use of nonlinear tuning curves enables discovery of low-dimensional latent structure even when spike responses exhibit high linear dimensionality (e.g., as found in hippocampal place cell codes). To learn the model from data, we introduce the decoupled Laplace approximation, a fast approximate inference method that allows us to efficiently optimize the latent path while marginalizing over tuning curves. We show that this method outperforms previous Laplace-approximation-based inference methods in both the speed of convergence and accuracy. We apply the model to spike trains recorded from hippocampal place cells and show that it compares favorably to a variety of previous methods for latent structure discovery, including variational auto-encoder (VAE) based methods that parametrize the nonlinear mapping from latent space to spike rates with a deep neural network. |
Tasks | Gaussian Processes |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6941-gaussian-process-based-nonlinear-latent-structure-discovery-in-multivariate-spike-train-data |
http://papers.nips.cc/paper/6941-gaussian-process-based-nonlinear-latent-structure-discovery-in-multivariate-spike-train-data.pdf | |
PWC | https://paperswithcode.com/paper/gaussian-process-based-nonlinear-latent |
Repo | |
Framework | |
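The generative idea behind the model, a low-dimensional latent path pushed through nonlinear tuning curves to produce Poisson firing rates, can be sketched with toy numbers. This shows only the forward model; it is not the P-GPLVM inference procedure (which places Gaussian-process priors on both the latent path and the tuning curves and fits them with a decoupled Laplace approximation). All values below are invented.

```python
import math

latent_path = [0.0, 0.5, 1.0, 1.5, 2.0]  # 1-D latent state over five time bins
centers = [0.5, 1.5]                      # preferred latent values of 2 neurons

def rate(x, c, width=0.5, peak=10.0):
    """Gaussian-bump tuning curve: firing rate peaks near the preferred value c."""
    return peak * math.exp(-((x - c) ** 2) / (2 * width ** 2))

# Expected spike counts per time bin: a 2-D-looking population response
# driven entirely by a single latent dimension (cf. place-cell codes).
rates = [[rate(x, c) for c in centers] for x in latent_path]
for row in rates:
    print([round(r, 2) for r in row])
```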
Representing Compositionality based on Multiple Timescales Gated Recurrent Neural Networks with Adaptive Temporal Hierarchy for Character-Level Language Models
Title | Representing Compositionality based on Multiple Timescales Gated Recurrent Neural Networks with Adaptive Temporal Hierarchy for Character-Level Language Models |
Authors | Dennis Singh Moirangthem, Jegyung Son, Minho Lee |
Abstract | A novel character-level neural language model is proposed in this paper. The proposed model incorporates a biologically inspired temporal hierarchy in the architecture for representing multiple compositions of language in order to handle longer sequences for the character-level language model. The temporal hierarchy is introduced in the language model by utilizing a Gated Recurrent Neural Network with multiple timescales. The proposed model incorporates a timescale adaptation mechanism for enhancing the performance of the language model. We evaluate our proposed model using the popular Penn Treebank and Text8 corpora. The experiments show that the use of multiple timescales in a Neural Language Model (NLM) enables improved performance despite having fewer parameters and with no additional computation requirements. Our experiments also demonstrate the ability of the adaptive temporal hierarchies to represent multiple compositionality without the help of complex hierarchical architectures, and show that better representation of longer sequences leads to enhanced performance of the probabilistic language model. |
Tasks | Language Modelling, Representation Learning |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2616/ |
https://www.aclweb.org/anthology/W17-2616 | |
PWC | https://paperswithcode.com/paper/representing-compositionality-based-on |
Repo | |
Framework | |
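The multiple-timescale idea can be sketched with two scalar recurrent states: a fast state that updates on every character and a slow state that updates only every `timescale` steps, giving it a longer effective memory. This illustrates only the timescale mechanism; it is not the paper's gated architecture or its adaptation scheme, and all coefficients below are arbitrary.

```python
def multiscale_states(chars, timescale=3):
    """Fast state updates every character; slow state every `timescale` steps."""
    fast, slow = 0.0, 0.0
    for t, ch in enumerate(chars):
        x = ord(ch) / 128.0                # crude per-character input in [0, 1)
        fast = 0.5 * fast + 0.5 * x        # short memory: tracks recent characters
        if t % timescale == 0:
            slow = 0.9 * slow + 0.1 * fast  # long memory: slowly integrates `fast`
    return fast, slow

fast, slow = multiscale_states("hello world")
print(round(fast, 3), round(slow, 3))
```

In the paper the layers are gated recurrent units rather than scalars, and the timescales themselves adapt during training.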
Prediction of Frame-to-Frame Relations in the FrameNet Hierarchy with Frame Embeddings
Title | Prediction of Frame-to-Frame Relations in the FrameNet Hierarchy with Frame Embeddings |
Authors | Teresa Botschen, Hatem Mousselly-Sergieh, Iryna Gurevych |
Abstract | Automatic completion of frame-to-frame (F2F) relations in the FrameNet (FN) hierarchy has received little attention, although they incorporate meta-level commonsense knowledge and are used in downstream approaches. We address the problem of sparsely annotated F2F relations. First, we examine whether the manually defined F2F relations emerge from text by learning text-based frame embeddings. Our analysis reveals insights about the difficulty of reconstructing F2F relations purely from text. Second, we present different systems for predicting F2F relations; our best-performing one uses the FN hierarchy to train on and to ground embeddings in. A comparison of systems and embeddings exposes the crucial influence of knowledge-based embeddings on a system's performance in predicting F2F relations. |
Tasks | Natural Language Inference, Representation Learning, Semantic Role Labeling |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2618/ |
https://www.aclweb.org/anthology/W17-2618 | |
PWC | https://paperswithcode.com/paper/prediction-of-frame-to-frame-relations-in-the |
Repo | |
Framework | |
Lexicalized vs. Delexicalized Parsing in Low-Resource Scenarios
Title | Lexicalized vs. Delexicalized Parsing in Low-Resource Scenarios |
Authors | Agnieszka Falenska, Özlem Çetinoğlu |
Abstract | We present a systematic analysis of lexicalized vs. delexicalized parsing in low-resource scenarios, and propose a methodology to choose one method over another under certain conditions. We create a set of simulation experiments on 41 languages and apply our findings to 9 low-resource languages. Experimental results show that our methodology chooses the best approach in 8 out of 9 cases. |
Tasks | Dependency Parsing, Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6303/ |
https://www.aclweb.org/anthology/W17-6303 | |
PWC | https://paperswithcode.com/paper/lexicalized-vs-delexicalized-parsing-in-low |
Repo | |
Framework | |
Proceedings of the Biomedical NLP Workshop associated with RANLP 2017
Title | Proceedings of the Biomedical NLP Workshop associated with RANLP 2017 |
Authors | Svetla Boytcheva, Kevin Bretonnel Cohen, Guergana Savova, Galia Angelova |
Abstract | |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/papers/W17-8000/w17-8000 |
https://www.aclweb.org/anthology/W17-8000 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-biomedical-nlp-workshop |
Repo | |
Framework | |