Paper Group NANR 174
A Parallel Recurrent Neural Network for Language Modeling with POS Tags. Learning Transferable Representation for Bilingual Relation Extraction via Convolutional Neural Networks. Evaluating Persuasion Strategies and Deep Reinforcement Learning methods for Negotiation Dialogue agents. SVD-Softmax: Fast Softmax Approximation on Large Vocabulary Neura …
A Parallel Recurrent Neural Network for Language Modeling with POS Tags
Title | A Parallel Recurrent Neural Network for Language Modeling with POS Tags |
Authors | Chao Su, Heyan Huang, Shumin Shi, Yuhang Guo, Hao Wu |
Abstract | |
Tasks | Language Modelling, Machine Translation, Speech Recognition, Text Generation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/Y17-1021/ |
https://www.aclweb.org/anthology/Y17-1021 | |
PWC | https://paperswithcode.com/paper/a-parallel-recurrent-neural-network-for |
Repo | |
Framework | |
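The abstract for this entry is missing, so only the title is available to go on. Read literally, it suggests a language model in which a word-level RNN and a POS-tag RNN run in parallel and their states are combined before predicting the next word. A minimal PyTorch sketch under that assumption follows; the module names, layer sizes, and the concatenation-based combination are illustrative guesses, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ParallelRNNLM(nn.Module):
    """Toy LM: a word LSTM and a POS-tag LSTM run in parallel; their
    hidden states are concatenated before predicting the next word.
    Purely illustrative -- not the architecture from the paper."""

    def __init__(self, vocab_size, pos_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.pos_emb = nn.Embedding(pos_size, emb_dim)
        self.word_rnn = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.pos_rnn = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(2 * hid_dim, vocab_size)

    def forward(self, words, tags):
        hw, _ = self.word_rnn(self.word_emb(words))   # (B, T, H)
        hp, _ = self.pos_rnn(self.pos_emb(tags))      # (B, T, H)
        return self.out(torch.cat([hw, hp], dim=-1))  # next-word logits

# Dummy batch: 2 sentences of length 5, vocab of 1000 words / 17 POS tags.
model = ParallelRNNLM(vocab_size=1000, pos_size=17)
words = torch.randint(0, 1000, (2, 5))
tags = torch.randint(0, 17, (2, 5))
print(model(words, tags).shape)  # torch.Size([2, 5, 1000])
```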
Learning Transferable Representation for Bilingual Relation Extraction via Convolutional Neural Networks
Title | Learning Transferable Representation for Bilingual Relation Extraction via Convolutional Neural Networks |
Authors | Bonan Min, Zhuolin Jiang, Marjorie Freedman, Ralph Weischedel |
Abstract | Typically, relation extraction models are trained to extract instances of a relation ontology using only training data from a single language. However, the concepts represented by the relation ontology (e.g. ResidesIn, EmployeeOf) are language independent. The numbers of annotated examples available for a given ontology vary between languages. For example, there are far fewer annotated examples in Spanish and Japanese than in English and Chinese. Furthermore, using only language-specific training data means that equivalently large amounts of training data must be annotated manually for each new language a system encounters. We propose a deep neural network to learn transferable, discriminative bilingual representations. Experiments on the ACE 2005 multilingual training corpus demonstrate that the joint training process results in significant improvement in relation classification performance over the monolingual counterparts. The learnt representation is discriminative and transferable between languages. When using 10% (25K English words, or 30K Chinese characters) of the training data, our approach doubles F1 compared to a monolingual baseline. With 50% of the training data, we achieve performance comparable to the monolingual system trained with 250K English words (or 300K Chinese characters). |
Tasks | Knowledge Base Population, Question Answering, Relation Classification, Relation Extraction, Word Embeddings |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1068/ |
https://www.aclweb.org/anthology/I17-1068 | |
PWC | https://paperswithcode.com/paper/learning-transferable-representation-for |
Repo | |
Framework | |
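A minimal sketch of the general idea in the abstract above: a convolutional sentence encoder whose parameters are shared across the two languages, so relation classification can be trained jointly on English and Chinese examples. The joint-vocabulary assumption, layer sizes, and training step below are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SharedCNNRelationClassifier(nn.Module):
    """One CNN encoder shared across languages; inputs from either
    language map into the same space and feed a common relation
    classifier. Illustrative sketch only."""

    def __init__(self, vocab_size, n_relations, emb_dim=100, n_filters=150):
        super().__init__()
        # Assumes a joint (bilingual) vocabulary / embedding table.
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.classifier = nn.Linear(n_filters, n_relations)

    def forward(self, token_ids):                       # (B, T)
        x = self.emb(token_ids).transpose(1, 2)         # (B, E, T)
        h = torch.relu(self.conv(x)).max(dim=2).values  # max-over-time pooling
        return self.classifier(h)                       # relation logits

model = SharedCNNRelationClassifier(vocab_size=5000, n_relations=7)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Joint training step on a mixed English/Chinese batch (dummy data).
batch = torch.randint(0, 5000, (8, 20))
labels = torch.randint(0, 7, (8,))
opt.zero_grad()
loss = loss_fn(model(batch), labels)
loss.backward()
opt.step()
print(float(loss))
```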
Evaluating Persuasion Strategies and Deep Reinforcement Learning methods for Negotiation Dialogue agents
Title | Evaluating Persuasion Strategies and Deep Reinforcement Learning methods for Negotiation Dialogue agents |
Authors | Simon Keizer, Markus Guhe, Heriberto Cuayáhuitl, Ioannis Efstathiou, Klaus-Peter Engelbrecht, Mihai Dobre, Alex Lascarides, Oliver Lemon |
Abstract | In this paper we present a comparative evaluation of various negotiation strategies within an online version of the game "Settlers of Catan". The comparison is based on human subjects playing games against artificial game-playing agents ('bots') which implement different negotiation dialogue strategies, using a chat dialogue interface to negotiate trades. Our results suggest that a negotiation strategy that uses persuasion, as well as a strategy that is trained from data using Deep Reinforcement Learning, both lead to an improved win rate against humans, compared to previous rule-based and supervised learning baseline dialogue negotiators. |
Tasks | |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2077/ |
https://www.aclweb.org/anthology/E17-2077 | |
PWC | https://paperswithcode.com/paper/evaluating-persuasion-strategies-and-deep |
Repo | |
Framework | |
SVD-Softmax: Fast Softmax Approximation on Large Vocabulary Neural Networks
Title | SVD-Softmax: Fast Softmax Approximation on Large Vocabulary Neural Networks |
Authors | Kyuhong Shim, Minjae Lee, Iksoo Choi, Yoonho Boo, Wonyong Sung |
Abstract | We propose a fast approximation method of a softmax function with a very large vocabulary using singular value decomposition (SVD). SVD-softmax targets fast and accurate probability estimation of the topmost probable words during inference of neural network language models. The proposed method transforms the weight matrix used in the calculation of the output vector by using SVD. The approximate probability of each word can be estimated with only a small part of the weight matrix by using a few large singular values and the corresponding elements for most of the words. We applied the technique to language modeling and neural machine translation and present a guideline for good approximation. The algorithm requires only approximately 20% of arithmetic operations for an 800K vocabulary case and shows more than a three-fold speedup on a GPU. |
Tasks | Language Modelling, Machine Translation |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7130-svd-softmax-fast-softmax-approximation-on-large-vocabulary-neural-networks |
http://papers.nips.cc/paper/7130-svd-softmax-fast-softmax-approximation-on-large-vocabulary-neural-networks.pdf | |
PWC | https://paperswithcode.com/paper/svd-softmax-fast-softmax-approximation-on |
Repo | |
Framework | |
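A small NumPy sketch of the SVD-softmax idea summarised above: factor the output weight matrix with SVD once, compute a cheap preview of all logits from only the leading singular dimensions, then recompute just the most promising words with the full matrix. The preview width and refinement budget here are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 10000, 256                 # vocabulary size, hidden size
W = rng.normal(size=(V, D))       # output weight matrix
b = rng.normal(size=V)
h = rng.normal(size=D)            # hidden state for one time step

# Offline: factor W = U @ diag(s) @ Vt once.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
B = U * s                         # (V, D): rows of U scaled by singular values

def svd_softmax(h, width=64, n_refine=500):
    """Approximate softmax: preview logits with the first `width`
    singular dimensions, then recompute the top `n_refine` candidates
    exactly."""
    t = Vt @ h                             # rotated hidden state, (D,)
    logits = B[:, :width] @ t[:width] + b  # cheap preview for all words
    top = np.argsort(-logits)[:n_refine]   # most promising candidates
    logits[top] = B[top] @ t + b[top]      # full-precision refinement
    e = np.exp(logits - logits.max())
    return e / e.sum()

approx = svd_softmax(h)
z = W @ h + b
exact = np.exp(z - z.max())
exact /= exact.sum()
print(np.argmax(approx) == np.argmax(exact))  # often True; agreement grows with width / n_refine
```

The saving comes from replacing the full V×D product with a V×width product plus a small exact pass over the shortlisted words.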
Evaluating the Reliability and Interaction of Recursively Used Feature Classes for Terminology Extraction
Title | Evaluating the Reliability and Interaction of Recursively Used Feature Classes for Terminology Extraction |
Authors | Anna Hätty, Michael Dorna, Sabine Schulte im Walde |
Abstract | Feature design and selection is a crucial aspect when treating terminology extraction as a machine learning classification problem. We designed feature classes which characterize different properties of terms based on distributions, and propose a new feature class for components of term candidates. By using random forests, we infer optimal features which are later used to build decision tree classifiers. We evaluate our method using the ACL RD-TEC dataset. We demonstrate the importance of the novel feature class, which exploits properties of term components, for downgrading termhood. Furthermore, our classification suggests that the identification of reliable term candidates should be performed successively, rather than just once. |
Tasks | Machine Translation, Text Generation |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-4012/ |
https://www.aclweb.org/anthology/E17-4012 | |
PWC | https://paperswithcode.com/paper/evaluating-the-reliability-and-interaction-of |
Repo | |
Framework | |
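A compact scikit-learn sketch of the two-stage pipeline the abstract describes: estimate feature importances with a random forest, keep the strongest features, and train a decision-tree classifier on them. The synthetic data and the number of retained features are placeholders standing in for the paper's term-candidate features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Placeholder data standing in for term-candidate feature vectors
# (e.g. distributional and component-based features) with a binary
# term / non-term label.
X, y = make_classification(n_samples=2000, n_features=30,
                           n_informative=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Step 1: random forest to estimate feature importances.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_tr, y_tr)
top = np.argsort(forest.feature_importances_)[::-1][:10]  # keep 10 best

# Step 2: decision tree trained only on the selected features.
tree = DecisionTreeClassifier(max_depth=5, random_state=0)
tree.fit(X_tr[:, top], y_tr)
print("selected features:", sorted(top.tolist()))
print("decision-tree accuracy:", tree.score(X_te[:, top], y_te))
```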
Word Vector Space Specialisation
Title | Word Vector Space Specialisation |
Authors | Ivan Vulić, Nikola Mrkšić, Mohammad Taher Pilehvar |
Abstract | Specialising vector spaces to maximise their content with respect to one key property of vector space models (e.g. semantic similarity vs. relatedness or lexical entailment) while mitigating others has become an active and attractive research topic in representation learning. Such specialised vector spaces support different classes of NLP problems. Proposed approaches fall into two broad categories: a) Unsupervised methods which learn from raw textual corpora in more sophisticated ways (e.g. using context selection, extracting co-occurrence information from word patterns, attending over contexts); and b) Knowledge-base driven approaches which exploit available resources to encode external information into distributional vector spaces, injecting knowledge from semantic lexicons (e.g., WordNet, FrameNet, PPDB). In this tutorial, we will introduce researchers to state-of-the-art methods for constructing vector spaces specialised for a broad range of downstream NLP applications. We will deliver a detailed survey of the proposed methods and discuss best practices for intrinsic and application-oriented evaluation of such vector spaces. Throughout the tutorial, we will provide running examples reaching beyond English as the only (and probably the easiest) use-case language, in order to demonstrate the applicability and modelling challenges of current representation learning architectures in other languages. |
Tasks | Representation Learning, Semantic Similarity, Semantic Textual Similarity |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-5004/ |
https://www.aclweb.org/anthology/E17-5004 | |
PWC | https://paperswithcode.com/paper/word-vector-space-specialisation |
Repo | |
Framework | |
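The tutorial above covers knowledge-base driven specialisation in general; one well-known instance (used here purely as an example, not as the tutorial's own method) is retrofitting-style post-processing, where each vector is pulled toward its lexicon neighbours. A tiny NumPy sketch with toy vectors and a toy synonymy lexicon:

```python
import numpy as np

# Toy distributional vectors and a toy synonymy lexicon.
vectors = {
    "cheap":       np.array([1.0, 0.2]),
    "inexpensive": np.array([0.1, 1.0]),
    "pricey":      np.array([0.9, 0.3]),
}
lexicon = {"cheap": ["inexpensive"], "inexpensive": ["cheap"], "pricey": []}

def specialise(vectors, lexicon, alpha=1.0, beta=1.0, n_iter=10):
    """Retrofitting-style update: each new vector is a weighted average
    of its original vector and its lexicon neighbours."""
    new = {w: v.copy() for w, v in vectors.items()}
    for _ in range(n_iter):
        for w, neighbours in lexicon.items():
            if not neighbours:
                continue
            neigh_sum = sum(new[n] for n in neighbours)
            new[w] = (alpha * vectors[w] + beta * neigh_sum) / \
                     (alpha + beta * len(neighbours))
    return new

spec = specialise(vectors, lexicon)
cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(cos(vectors["cheap"], vectors["inexpensive"]))  # before
print(cos(spec["cheap"], spec["inexpensive"]))        # after: higher
```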
Bootstrapping a Romanian Corpus for Medical Named Entity Recognition
Title | Bootstrapping a Romanian Corpus for Medical Named Entity Recognition |
Authors | Maria Mitrofan |
Abstract | Named Entity Recognition (NER) is an important component of natural language processing (NLP), with applicability in the biomedical domain, enabling knowledge discovery from medical texts. Because the Romanian language has only a few linguistic resources specific to the biomedical domain, a sub-corpus dedicated to this domain was created. In this paper we present this newly developed Romanian sub-corpus for medical-domain NER, which is a valuable asset for the field of biomedical text processing. We provide a description of the sub-corpus, report informative statistics about its composition, and evaluate an automatic NER tool on the newly created resource. |
Tasks | Medical Named Entity Recognition, Named Entity Recognition, Question Answering, Relation Extraction |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1066/ |
https://doi.org/10.26615/978-954-452-049-6_066 | |
PWC | https://paperswithcode.com/paper/bootstrapping-a-romanian-corpus-for-medical |
Repo | |
Framework | |
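The paper evaluates an automatic NER tool on the new sub-corpus; the usual way to score such a tool is entity-level (exact-span) precision, recall and F1. A small self-contained sketch of that metric, with made-up gold and predicted annotations:

```python
def entity_prf(gold, predicted):
    """Exact-match entity-level precision, recall and F1.
    Entities are (start, end, label) tuples for one sentence."""
    gold_set, pred_set = set(gold), set(predicted)
    tp = len(gold_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Made-up example: gold vs. system annotations for one sentence.
gold = [(0, 2, "DISEASE"), (5, 6, "DRUG")]
pred = [(0, 2, "DISEASE"), (4, 6, "DRUG")]   # wrong span for the drug
print(entity_prf(gold, pred))  # (0.5, 0.5, 0.5)
```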
Data Driven Computing with Noisy Material Data Sets
Title | Data Driven Computing with Noisy Material Data Sets |
Authors | T. Kirchdoerfer, M. Ortiz |
Abstract | We formulate a Data Driven Computing paradigm, termed max-ent Data Driven Computing, that generalizes distance-minimizing Data Driven Computing and is robust with respect to outliers. Robustness is achieved by means of clustering analysis. Specifically, we assign data points a variable relevance depending on distance to the solution and on maximum-entropy estimation. The resulting scheme consists of the minimization of a suitably-defined free energy over phase space subject to compatibility and equilibrium constraints. Distance-minimizing Data Driven schemes are recovered in the limit of zero temperature. We present selected numerical tests that establish the convergence properties of the max-ent Data Driven solvers and solutions. |
Tasks | Stress-Strain Relation |
Published | 2017-11-01 |
URL | https://www.sciencedirect.com/science/article/pii/S0045782517304012 |
https://www.sciencedirect.com/science/article/pii/S0045782517304012 | |
PWC | https://paperswithcode.com/paper/data-driven-computing-with-noisy-material |
Repo | |
Framework | |
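A toy NumPy illustration of the weighting idea in the abstract: rather than snapping to the single nearest material data point, each point receives a max-ent (Boltzmann-like) relevance weight that decays with its distance to the current solution, and the distance-minimizing scheme is recovered as the temperature goes to zero (beta grows). The 1-D data set and the plain Euclidean distance are invented for illustration and ignore the compatibility and equilibrium constraints of the full method.

```python
import numpy as np

# Toy 1-D material data set (strain, stress) with one outlier.
data = np.array([[0.0, 0.0], [0.1, 1.0], [0.2, 2.1], [0.3, 2.9],
                 [0.15, 9.0]])          # last point is an outlier
state = np.array([0.18, 1.9])           # current (strain, stress) iterate

def maxent_weights(state, data, beta):
    """Relevance weights w_i proportional to exp(-beta * d_i^2);
    beta -> infinity recovers nearest-point (distance-minimizing)
    assignment."""
    d2 = np.sum((data - state) ** 2, axis=1)
    w = np.exp(-beta * (d2 - d2.min()))  # shift for numerical stability
    return w / w.sum()

for beta in (0.1, 1.0, 100.0):
    w = maxent_weights(state, data, beta)
    target = w @ data                    # weighted "data point" to project onto
    print(f"beta={beta:6.1f}  outlier weight={w[-1]:.3f}  target={target.round(3)}")
```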
Optimal Sample Complexity of M-wise Data for Top-K Ranking
Title | Optimal Sample Complexity of M-wise Data for Top-K Ranking |
Authors | Minje Jang, Sunghyun Kim, Changho Suh, Sewoong Oh |
Abstract | We explore the top-K rank aggregation problem in which one aims to recover a consistent ordering that focuses on top-K ranked items based on partially revealed preference information. We examine an M-wise comparison model that builds on the Plackett-Luce (PL) model where for each sample, M items are ranked according to their perceived utilities modeled as noisy observations of their underlying true utilities. As our result, we characterize the minimax optimality on the sample size for top-K ranking. The optimal sample size turns out to be inversely proportional to M. We devise an algorithm that effectively converts M-wise samples into pairwise ones and employs a spectral method using the refined data. In demonstrating its optimality, we develop a novel technique for deriving tight $\ell_\infty$ estimation error bounds, which is key to accurately analyzing the performance of top-K ranking algorithms, but has been challenging. Recent work relied on an additional maximum-likelihood estimation (MLE) stage merged with a spectral method to attain good estimates in $\ell_\infty$ error to achieve the limit for the pairwise model. In contrast, although it is valid in slightly restricted regimes, our result demonstrates a spectral method alone to be sufficient for the general M-wise model. We run numerical experiments using synthetic data and confirm that the optimal sample size decreases at the rate of 1/M. Moreover, running our algorithm on real-world data, we find that its applicability extends to settings that may not fit the PL model. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6766-optimal-sample-complexity-of-m-wise-data-for-top-k-ranking |
http://papers.nips.cc/paper/6766-optimal-sample-complexity-of-m-wise-data-for-top-k-ranking.pdf | |
PWC | https://paperswithcode.com/paper/optimal-sample-complexity-of-m-wise-data-for |
Repo | |
Framework | |
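A rough NumPy sketch of the pipeline the abstract describes: draw M-wise Plackett-Luce samples, break each ranking into pairwise wins, and run a Rank-Centrality-style spectral step (the stationary distribution of a comparison-driven Markov chain) to score items. The sample generator, rank-breaking rule, and chain construction are simplified stand-ins, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, M, n_samples = 8, 4, 3000
true_scores = np.exp(rng.normal(size=n))        # Plackett-Luce utilities

def sample_pl_ranking(items, scores):
    """Draw a Plackett-Luce ranking of the given items (best first)."""
    items, out = list(items), []
    while items:
        p = np.array([scores[i] for i in items])
        p /= p.sum()
        out.append(items.pop(rng.choice(len(items), p=p)))
    return out

# 1) Collect M-wise samples and break them into pairwise wins.
wins = np.zeros((n, n))                          # wins[i, j] = #times i beat j
for _ in range(n_samples):
    subset = rng.choice(n, size=M, replace=False)
    ranking = sample_pl_ranking(subset, true_scores)
    for a in range(M):
        for b in range(a + 1, M):
            wins[ranking[a], ranking[b]] += 1

# 2) Spectral step: random walk that drifts toward winners; its
#    stationary distribution serves as the score estimate.
total = wins + wins.T
P = np.divide(wins.T, total, out=np.zeros_like(wins), where=total > 0) / n
P[np.diag_indices(n)] = 1.0 - P.sum(axis=1)
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = np.abs(pi) / np.abs(pi).sum()

K = 3
print("true top-K:     ", np.argsort(-true_scores)[:K])
print("estimated top-K:", np.argsort(-pi)[:K])   # typically agree at this sample size
```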
The WebNLG Challenge: Generating Text from RDF Data
Title | The WebNLG Challenge: Generating Text from RDF Data |
Authors | Claire Gardent, Anastasia Shimorina, Shashi Narayan, Laura Perez-Beltrachini |
Abstract | The WebNLG challenge consists in mapping sets of RDF triples to text. It provides a common benchmark on which to train, evaluate and compare "microplanners", i.e. generation systems that verbalise a given content by making a range of complex interacting choices including referring expression generation, aggregation, lexicalisation, surface realisation and sentence segmentation. In this paper, we introduce the microplanning task, describe data preparation, introduce our evaluation methodology, analyse participant results and provide a brief description of the participating systems. |
Tasks | Text Generation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-3518/ |
https://www.aclweb.org/anthology/W17-3518 | |
PWC | https://paperswithcode.com/paper/the-webnlg-challenge-generating-text-from-rdf |
Repo | |
Framework | |
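To make the microplanning task concrete, here is a deliberately naive template-based verbaliser for a handful of RDF triples. The property templates are invented; a real WebNLG system must additionally handle referring expressions, aggregation, and sentence segmentation, which this toy ignores.

```python
# Toy RDF-to-text verbaliser. Each property maps to a sentence template;
# a real microplanner would also aggregate triples sharing a subject,
# choose referring expressions, and order the sentences.
TEMPLATES = {
    "birthPlace": "{s} was born in {o}.",
    "occupation": "{s} works as {a_o}.",
    "employer":   "{s} is employed by {o}.",
}

def article(noun):
    return ("an " if noun[0].lower() in "aeiou" else "a ") + noun

def verbalise(triples):
    sentences = []
    for s, p, o in triples:
        s, o = s.replace("_", " "), o.replace("_", " ")
        template = TEMPLATES.get(p, "{s} has {p} {o}.")  # fallback template
        sentences.append(template.format(s=s, p=p, o=o, a_o=article(o)))
    return " ".join(sentences)

triples = [("Ada_Lovelace", "birthPlace", "London"),
           ("Ada_Lovelace", "occupation", "mathematician")]
print(verbalise(triples))
# Ada Lovelace was born in London. Ada Lovelace works as a mathematician.
```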
Label Efficient Learning of Transferable Representations across Domains and Tasks
Title | Label Efficient Learning of Transferable Representations across Domains and Tasks |
Authors | Zelun Luo, Yuliang Zou, Judy Hoffman, Li F. Fei-Fei |
Abstract | We propose a framework that learns a representation transferable across different domains and tasks in a data efficient manner. Our approach battles domain shift with a domain adversarial loss, and generalizes the embedding to novel tasks using a metric learning-based approach. Our model is simultaneously optimized on labeled source data and unlabeled or sparsely labeled data in the target domain. Our method shows compelling results on novel classes within a new domain even when only a few labeled examples per class are available, outperforming the prevalent fine-tuning approach. In addition, we demonstrate the effectiveness of our framework on the transfer learning task from image object recognition to video action recognition. |
Tasks | Metric Learning, Object Recognition, Temporal Action Localization, Transfer Learning |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6621-label-efficient-learning-of-transferable-representations-acrosss-domains-and-tasks |
http://papers.nips.cc/paper/6621-label-efficient-learning-of-transferable-representations-acrosss-domains-and-tasks.pdf | |
PWC | https://paperswithcode.com/paper/label-efficient-learning-of-transferable |
Repo | |
Framework | |
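A condensed PyTorch sketch of the two ingredients named in the abstract: a domain-adversarial loss implemented with a gradient-reversal layer on pooled source/target embeddings, plus a simple margin-based metric loss on the labeled source batch. Network sizes, the loss weighting, and the particular pairwise metric loss are illustrative simplifications rather than the paper's objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; multiplies the gradient by -lambda
    on the backward pass."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -ctx.lamb * grad, None

embed = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
domain_head = nn.Linear(16, 2)            # source vs. target
params = list(embed.parameters()) + list(domain_head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

# Dummy batches: labeled source data, unlabeled target data.
xs, ys = torch.randn(16, 32), torch.randint(0, 4, (16,))
xt = torch.randn(16, 32)
zs, zt = embed(xs), embed(xt)

# (1) Domain-adversarial loss: the domain head tries to tell source from
# target, while the reversed gradient pushes the embedding to fool it.
z_all = GradReverse.apply(torch.cat([zs, zt]), 1.0)
d_labels = torch.cat([torch.zeros(16, dtype=torch.long),
                      torch.ones(16, dtype=torch.long)])
adv_loss = F.cross_entropy(domain_head(z_all), d_labels)

# (2) Toy metric loss on labeled source pairs: pull same-class embeddings
# together, push different-class ones beyond a margin.
dist = torch.cdist(zs, zs)
same = (ys[:, None] == ys[None, :]).float()
metric_loss = (same * dist + (1 - same) * F.relu(2.0 - dist)).mean()

loss = metric_loss + 0.1 * adv_loss
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```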
Open Relation Extraction and Grounding
Title | Open Relation Extraction and Grounding |
Authors | Dian Yu, Lifu Huang, Heng Ji |
Abstract | Previous open Relation Extraction (open RE) approaches mainly rely on linguistic patterns and constraints to extract important relational triples from large-scale corpora. However, they lack the ability to cover diverse relation expressions or to measure the relative importance of candidate triples within a sentence. It is also challenging to name the relation type of a relational triple merely based on context words, which could limit the usefulness of open RE in downstream applications. We propose a novel importance-based open RE approach that exploits the global structure of a dependency tree to extract salient triples. We design an unsupervised relation type naming method by grounding relational triples to a large-scale Knowledge Base (KB) schema, leveraging KB triples and weighted context words associated with relational triples. Experiments on the English Slot Filling 2013 dataset demonstrate that our approach achieves an 8.1% higher F-score than state-of-the-art open RE methods. |
Tasks | Relation Extraction, Slot Filling |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1086/ |
https://www.aclweb.org/anthology/I17-1086 | |
PWC | https://paperswithcode.com/paper/open-relation-extraction-and-grounding |
Repo | |
Framework | |
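To illustrate the "global structure of a dependency tree" intuition at toy scale: build a graph from a hand-written dependency parse, score words with a graph centrality measure, and rank candidate subject-predicate-object triples by the scores of the words they contain. The parse, the centrality choice (PageRank), and the candidate triples are all invented for this example and do not reproduce the paper's method.

```python
import networkx as nx

# Hand-written dependency edges (head -> dependent) for:
# "Barack Obama visited the headquarters of Microsoft in Seattle."
edges = [("visited", "Obama"), ("Obama", "Barack"),
         ("visited", "headquarters"), ("headquarters", "the"),
         ("headquarters", "of"), ("of", "Microsoft"),
         ("visited", "in"), ("in", "Seattle")]
tree = nx.DiGraph(edges)

# Global node importance derived from the tree structure.
scores = nx.pagerank(tree.to_undirected())

# Candidate (subject, predicate, object) triples, ranked by the average
# importance of the participating words.
candidates = [("Obama", "visited", "headquarters"),
              ("Obama", "visited", "Seattle"),
              ("headquarters", "of", "Microsoft")]
ranked = sorted(candidates, key=lambda t: -sum(scores[w] for w in t) / 3)
for triple in ranked:
    print(triple)
```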
To Sing like a Mockingbird
Title | To Sing like a Mockingbird |
Authors | Lorenzo Gatti, Gözde Özbal, Oliviero Stock, Carlo Strapparava |
Abstract | Musical parody, i.e. the act of changing the lyrics of an existing and very well-known song, is a commonly used technique for creating catchy advertising tunes and for mocking people or events. Here we describe a system for automatically producing a musical parody, starting from a corpus of songs. The system can automatically identify characterizing words and concepts related to a novel text, which are taken from the daily news. These concepts are then used as seeds to appropriately replace part of the original lyrics of a song, using metrical, rhyming and lexical constraints. Finally, the parody can be sung with a singing speech synthesizer, with no intervention from the user. |
Tasks | |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2048/ |
https://www.aclweb.org/anthology/E17-2048 | |
PWC | https://paperswithcode.com/paper/to-sing-like-a-mockingbird |
Repo | |
Framework | |
Course Concept Extraction in MOOCs via Embedding-Based Graph Propagation
Title | Course Concept Extraction in MOOCs via Embedding-Based Graph Propagation |
Authors | Liangming Pan, Xiaochen Wang, Chengjiang Li, Juanzi Li, Jie Tang |
Abstract | Massive Open Online Courses (MOOCs), offering a new way to study online, are revolutionizing education. One challenging issue in MOOCs is how to design effective and fine-grained course concepts such that students with different backgrounds can grasp the essence of the course. In this paper, we conduct a systematic investigation of the problem of course concept extraction for MOOCs. We propose to learn latent representations for candidate concepts via an embedding-based method. Moreover, we develop a graph-based propagation algorithm to rank the candidate concepts based on the learned representations. We evaluate the proposed method using different courses from XuetangX and Coursera. Experimental results show that our method significantly outperforms all the alternative methods (+0.013-0.318 in terms of R-precision; p << 0.01, t-test). |
Tasks | |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1088/ |
https://www.aclweb.org/anthology/I17-1088 | |
PWC | https://paperswithcode.com/paper/course-concept-extraction-in-moocs-via |
Repo | |
Framework | |
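A small NumPy sketch of the generic recipe in the abstract: embed candidate concepts, connect them by embedding similarity, and propagate seed scores over the resulting graph with a personalized-PageRank-style iteration. The toy embeddings, edge weighting, and seed choice are placeholders.

```python
import numpy as np

candidates = ["neural network", "gradient descent", "syllabus",
              "backpropagation", "office hours"]
# Placeholder embeddings; a real system learns these from course text.
rng = np.random.default_rng(1)
E = rng.normal(size=(len(candidates), 16))
E[3] = 0.9 * E[0] + rng.normal(scale=0.1, size=16)   # make 3 similar to 0
E[1] = 0.7 * E[0] + rng.normal(scale=0.3, size=16)   # make 1 similar to 0

# Similarity graph: cosine similarities turned into positive,
# row-stochastic edge weights.
norms = np.linalg.norm(E, axis=1)
sim = (E @ E.T) / np.outer(norms, norms)
np.fill_diagonal(sim, -np.inf)          # no self-loops
W = np.exp(sim)
W = W / W.sum(axis=1, keepdims=True)

# Propagate an initial score vector (seeded on candidate 0) over the graph.
seed = np.zeros(len(candidates))
seed[0] = 1.0
scores, alpha = seed.copy(), 0.85
for _ in range(50):
    scores = alpha * W.T @ scores + (1 - alpha) * seed

for concept, score in sorted(zip(candidates, scores), key=lambda x: -x[1]):
    print(f"{score:.3f}  {concept}")
```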
Learning to Diagnose: Assimilating Clinical Narratives using Deep Reinforcement Learning
Title | Learning to Diagnose: Assimilating Clinical Narratives using Deep Reinforcement Learning |
Authors | Yuan Ling, Sadid A. Hasan, Vivek Datla, Ashequl Qadir, Kathy Lee, Joey Liu, Oladimeji Farri |
Abstract | Clinical diagnosis is a critical and non-trivial aspect of patient care which often requires significant medical research and investigation based on an underlying clinical scenario. This paper proposes a novel approach by formulating clinical diagnosis as a reinforcement learning problem. During training, the reinforcement learning agent mimics the clinician's cognitive process and learns the optimal policy to obtain the most appropriate diagnoses for a clinical narrative. This is achieved through an iterative search for candidate diagnoses from external knowledge sources via a sentence-by-sentence analysis of the inherent clinical context. A deep Q-network architecture is trained to optimize a reward function that measures the accuracy of the candidate diagnoses. Experiments on the TREC CDS datasets demonstrate the effectiveness of our system over various non-reinforcement learning-based systems. |
Tasks | Decision Making |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1090/ |
https://www.aclweb.org/anthology/I17-1090 | |
PWC | https://paperswithcode.com/paper/learning-to-diagnose-assimilating-clinical |
Repo | |
Framework | |
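A stripped-down PyTorch sketch of the deep Q-network update the abstract describes, with the clinical specifics abstracted away: states stand in for the encoded clinical context, actions for candidate diagnoses, and the reward for diagnosis accuracy. Dimensions, the dummy replay batch, and the target-network handling are simplified for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, N_ACTIONS, GAMMA = 64, 10, 0.99   # toy sizes

# Online and target Q-networks (state -> Q-value per candidate action).
q_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                      nn.Linear(128, N_ACTIONS))
target_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                           nn.Linear(128, N_ACTIONS))
target_net.load_state_dict(q_net.state_dict())
opt = torch.optim.Adam(q_net.parameters(), lr=1e-4)

# One dummy replay batch: (state, action, reward, next_state, done).
B = 32
s = torch.randn(B, STATE_DIM)
a = torch.randint(0, N_ACTIONS, (B,))
r = torch.rand(B)                     # e.g. accuracy-based reward
s2 = torch.randn(B, STATE_DIM)
done = torch.zeros(B)

# Standard DQN target: r + gamma * max_a' Q_target(s', a') for non-terminal steps.
q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
with torch.no_grad():
    target = r + GAMMA * (1 - done) * target_net(s2).max(dim=1).values
loss = F.smooth_l1_loss(q_sa, target)
opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```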