Paper Group NANR 69
Legal Text Interpretation: Identifying Hohfeldian Relations from Text. FastHybrid: A Hybrid Model for Efficient Answer Selection. ASM Kernel: Graph Kernel using Approximate Subgraph Matching for Relation Extraction. AppDialogue: Multi-App Dialogues for Intelligent Assistants. WikiCoref: An English Coreference-annotated Corpus of Wikipedia Articles. …
Legal Text Interpretation: Identifying Hohfeldian Relations from Text
Title | Legal Text Interpretation: Identifying Hohfeldian Relations from Text |
Authors | Wim Peters, Adam Wyner |
Abstract | The paper investigates the extent of the support semi-automatic analysis can provide for the specific task of assigning Hohfeldian relations of Duty, using the General Architecture for Text Engineering tool for the automated extraction of Duty instances and the bearers of associated roles. The outcome of the analysis supports scholars in identifying Hohfeldian structures in legal text when performing close reading of the texts. A cyclic workflow involving automated annotation and expert feedback will incrementally increase the quality and coverage of the automatic extraction process, and increasingly reduce the amount of manual work required of the scholar. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1059/ |
PDF | https://www.aclweb.org/anthology/L16-1059 |
PWC | https://paperswithcode.com/paper/legal-text-interpretation-identifying |
Repo | |
Framework | |
FastHybrid: A Hybrid Model for Efficient Answer Selection
Title | FastHybrid: A Hybrid Model for Efficient Answer Selection |
Authors | Lidan Wang, Ming Tan, Jiawei Han |
Abstract | Answer selection is a core component of any question-answering system. It aims to select correct answer sentences for a given question from a pool of candidate sentences. In recent years, many deep learning methods have been proposed and have shown excellent results on this task. However, these methods typically require extensive parameter (and hyper-parameter) tuning, which gives rise to efficiency issues on large-scale datasets and potentially makes them less portable across new datasets and domains (as re-tuning is usually required). In this paper, we propose an extremely efficient hybrid model (FastHybrid) that tackles the problem from both an accuracy and a scalability point of view. FastHybrid is a light-weight model that requires little tuning and adaptation across different domains. It combines a fast deep model (which will be introduced in the method section) with an initial information retrieval model to effectively and efficiently handle answer selection. We introduce a new efficient attention mechanism in the hybrid model and demonstrate its effectiveness on several QA datasets. Experimental results show that although the hybrid uses no training data, its accuracy is often on par with supervised deep learning techniques, while significantly reducing training and tuning costs across different domains. |
Tasks | Answer Selection, Information Retrieval, Open-Domain Question Answering, Question Answering |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1224/ |
PDF | https://www.aclweb.org/anthology/C16-1224 |
PWC | https://paperswithcode.com/paper/fasthybrid-a-hybrid-model-for-efficient |
Repo | |
Framework | |
ASM Kernel: Graph Kernel using Approximate Subgraph Matching for Relation Extraction
Title | ASM Kernel: Graph Kernel using Approximate Subgraph Matching for Relation Extraction |
Authors | Nagesh C. Panyam, Karin Verspoor, Trevor Cohn, Rao Kotagiri |
Abstract | |
Tasks | Feature Engineering, Relation Extraction, Sentence Classification |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/U16-1007/ |
PDF | https://www.aclweb.org/anthology/U16-1007 |
PWC | https://paperswithcode.com/paper/asm-kernel-graph-kernel-using-approximate |
Repo | |
Framework | |
AppDialogue: Multi-App Dialogues for Intelligent Assistants
Title | AppDialogue: Multi-App Dialogues for Intelligent Assistants |
Authors | Ming Sun, Yun-Nung Chen, Zhenhao Hua, Yulian Tamres-Rudnicky, Arnab Dash, Alexander Rudnicky |
Abstract | Users interact with an individual app on smart devices (e.g., phone, TV, car) to fulfill a specific goal (e.g., find a photographer), but users may also pursue more complex tasks that span multiple domains and apps (e.g., plan a wedding ceremony). Planning and executing such multi-app tasks are typically managed by users, considering the required global context awareness. To investigate how users arrange domains/apps to fulfill complex tasks in their daily life, we conducted a user study with 14 participants to collect such data from their Android smart phones. This document 1) summarizes the techniques used in the data collection and 2) provides a brief statistical description of the data. This data guides the future direction for researchers in fields such as conversational agents and personal assistants. This data is available at http://AppDialogue.com. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1499/ |
PDF | https://www.aclweb.org/anthology/L16-1499 |
PWC | https://paperswithcode.com/paper/appdialogue-multi-app-dialogues-for |
Repo | |
Framework | |
WikiCoref: An English Coreference-annotated Corpus of Wikipedia Articles
Title | WikiCoref: An English Coreference-annotated Corpus of Wikipedia Articles |
Authors | Abbas Ghaddar, Philippe Langlais |
Abstract | This paper presents WikiCoref, an English corpus annotated for anaphoric relations, where all documents are from the English version of Wikipedia. Our annotation scheme follows the one of OntoNotes with a few disparities. We annotated each markable with coreference type, mention type and the equivalent Freebase topic. Since most similar annotation efforts concentrate on very specific types of written text, mainly newswire, there is a lack of resources for otherwise over-used Wikipedia texts. The corpus described in this paper addresses this issue. We present a freely available resource we initially devised for improving coreference resolution algorithms dedicated to Wikipedia texts. Our corpus has no restriction on the topics of the documents being annotated, and documents of various sizes have been considered for annotation. |
Tasks | Coreference Resolution |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1021/ |
PDF | https://www.aclweb.org/anthology/L16-1021 |
PWC | https://paperswithcode.com/paper/wikicoref-an-english-coreference-annotated |
Repo | |
Framework | |
Attention-Based Convolutional Neural Network for Semantic Relation Extraction
Title | Attention-Based Convolutional Neural Network for Semantic Relation Extraction |
Authors | Yatian Shen, Xuanjing Huang |
Abstract | Nowadays, neural networks play an important role in the task of relation classification. In this paper, we propose a novel attention-based convolutional neural network architecture for this task. Our model makes full use of word embedding, part-of-speech tag embedding and position embedding information. Word level attention mechanism is able to better determine which parts of the sentence are most influential with respect to the two entities of interest. This architecture enables learning some important features from task-specific labeled data, forgoing the need for external knowledge such as explicit dependency structures. Experiments on the SemEval-2010 Task 8 benchmark dataset show that our model achieves better performances than several state-of-the-art neural network models and can achieve a competitive performance just with minimal feature engineering. |
Tasks | Feature Engineering, Relation Classification, Relation Extraction |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1238/ |
PDF | https://www.aclweb.org/anthology/C16-1238 |
PWC | https://paperswithcode.com/paper/attention-based-convolutional-neural-network-2 |
Repo | |
Framework | |
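The position embedding input mentioned in the abstract above is commonly built from each token's relative distance to the two entities of interest. The following is a minimal sketch under that assumption; the helper name `relative_positions`, the single-token entities, and the example sentence are illustrative and not taken from the paper.

```python
def relative_positions(tokens, e1_index, e2_index):
    """Return (offset to entity 1, offset to entity 2) for every token.

    Assumption: each entity is a single token identified by its index.
    These integer offsets would then be looked up in an embedding table.
    """
    return [(i - e1_index, i - e2_index) for i in range(len(tokens))]


# Hypothetical example sentence with entities "fire" (index 1)
# and "lightning" (index 5).
tokens = ["The", "fire", "was", "caused", "by", "lightning"]
positions = relative_positions(tokens, e1_index=1, e2_index=5)
print(positions[3])  # offsets for "caused": (2, -2)
```

In a full model these offsets would typically be embedded and concatenated with the word and part-of-speech embeddings before the convolution layer.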
A Sequence Model Approach to Relation Extraction in Portuguese
Title | A Sequence Model Approach to Relation Extraction in Portuguese |
Authors | Sandra Collovini, Gabriel Machado, Renata Vieira |
Abstract | The task of Relation Extraction from texts is one of the main challenges in the area of Information Extraction, considering the required linguistic knowledge and the sophistication of the language processing techniques employed. This task aims at identifying and classifying semantic relations that occur between entities recognized in a given text. In this paper, we evaluated a Conditional Random Fields classifier for the extraction of any relation descriptor occurring between named entities (Organisation, Person and Place categories), as well as pre-defined relation types between these entities in Portuguese texts. |
Tasks | Relation Extraction |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1301/ |
PDF | https://www.aclweb.org/anthology/L16-1301 |
PWC | https://paperswithcode.com/paper/a-sequence-model-approach-to-relation |
Repo | |
Framework | |
Encoding a syntactic dictionary into a super granular unification grammar
Title | Encoding a syntactic dictionary into a super granular unification grammar |
Authors | Sylvain Kahane, François Lareau |
Abstract | We show how to turn a large-scale syntactic dictionary into a dependency-based unification grammar where each piece of lexical information calls a separate rule, yielding a super granular grammar. Subcategorization, raising and control verbs, auxiliaries and copula, passivization, and tough-movement are discussed. We focus on the semantics-syntax interface and offer a new perspective on syntactic structure. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-3812/ |
PDF | https://www.aclweb.org/anthology/W16-3812 |
PWC | https://paperswithcode.com/paper/encoding-a-syntactic-dictionary-into-a-super |
Repo | |
Framework | |
“He Said She Said” ― a Male/Female Corpus of Polish
Title | “He Said She Said” ― a Male/Female Corpus of Polish |
Authors | Filip Graliński, Łukasz Borchmann, Piotr Wierzchoń |
Abstract | Gender differences in language use have long been of interest in linguistics. The task of automatic gender attribution has been considered in computational linguistics as well. Most research of this type is done using (usually English) texts with authorship metadata. In this paper, we propose a new method of male/female corpus creation based on gender-specific first-person expressions. The method was applied to the CommonCrawl Web corpus for Polish (a language in which gender-revealing first-person expressions are particularly frequent) to yield a large (780M words) and varied collection of men's and women's texts. The whole procedure for building the corpus and filtering out unwanted texts is described in the present paper. The quality check was done on a random sample of the corpus to make sure that the majority (84%) of texts are correctly attributed, natural texts. Some preliminary (socio)linguistic insights (websites and words frequently occurring in male/female fragments) are given as well. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1648/ |
PDF | https://www.aclweb.org/anthology/L16-1648 |
PWC | https://paperswithcode.com/paper/he-said-she-said-a-a-malefemale-corpus-of |
Repo | |
Framework | |
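To make the idea of gender-specific first-person expressions in the abstract above concrete: Polish first-person singular past-tense verbs end in "-łem" for masculine speakers and "-łam" for feminine speakers. The sketch below uses only that one cue. The patterns, the function name `guess_author_gender`, and the example sentences are assumptions for illustration; the abstract does not list the expressions or filters the authors actually used, and a pattern this simple will produce false matches.

```python
import re

# Assumption: past-tense first-person endings are the only cue used here.
MASCULINE = re.compile(r"\b\w+łem\b")
FEMININE = re.compile(r"\b\w+łam\b")


def guess_author_gender(text):
    """Return "male", "female", or "unknown" from first-person verb endings."""
    text = text.lower()
    male_hits = len(MASCULINE.findall(text))
    female_hits = len(FEMININE.findall(text))
    if male_hits > female_hits:
        return "male"
    if female_hits > male_hits:
        return "female"
    return "unknown"


print(guess_author_gender("Wczoraj kupiłam książkę."))          # female
print(guess_author_gender("Byłem w kinie i widziałem film."))   # male
```

A real pipeline would need additional filtering (quoted speech, fiction, homographs), which is presumably part of the filtering procedure the paper describes.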
VSoLSCSum: Building a Vietnamese Sentence-Comment Dataset for Social Context Summarization
Title | VSoLSCSum: Building a Vietnamese Sentence-Comment Dataset for Social Context Summarization |
Authors | Minh-Tien Nguyen, Dac Viet Lai, Phong-Khac Do, Duc-Vu Tran, Minh-Le Nguyen |
Abstract | This paper presents VSoLSCSum, a Vietnamese linked sentence-comment dataset, which was manually created to address the lack of standard corpora for social context summarization in Vietnamese. The dataset was collected through the keywords of 141 Web documents in 12 special events, which were mentioned on Vietnamese Web pages. Social users were asked to take part in creating standard summaries and labeling each sentence or comment. The inter-rater agreement calculated by Cohen's Kappa after validation is 0.685. To illustrate the potential use of our dataset, a learning-to-rank method was trained using a set of local and social features. Experimental results indicate that the summary model trained on our dataset outperforms state-of-the-art baselines in both ROUGE-1 and ROUGE-2 in social context summarization. |
Tasks | Learning-To-Rank |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5405/ |
PDF | https://www.aclweb.org/anthology/W16-5405 |
PWC | https://paperswithcode.com/paper/vsolscsum-building-a-vietnamese-sentence |
Repo | |
Framework | |
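The agreement figure reported in the abstract above is Cohen's Kappa, which corrects the observed agreement between two raters for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). A minimal sketch of the standard two-rater computation follows; the label lists are made-up toy data, not values from the dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Toy example: 1 = "sentence belongs in the summary", 0 = "it does not".
rater_1 = [1, 1, 0, 0]
rater_2 = [1, 0, 0, 0]
print(cohens_kappa(rater_1, rater_2))  # 0.5
```

Here the raters agree on 3 of 4 items (p_o = 0.75) while chance agreement is p_e = 0.5, giving kappa = 0.5.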
A Sparse Interactive Model for Matrix Completion with Side Information
Title | A Sparse Interactive Model for Matrix Completion with Side Information |
Authors | Jin Lu, Guannan Liang, Jiangwen Sun, Jinbo Bi |
Abstract | Matrix completion methods can benefit from side information besides the partially observed matrix. The use of side features describing the row and column entities of a matrix has been shown to reduce the sample complexity for completing the matrix. We propose a novel sparse formulation that explicitly models the interaction between the row and column side features to approximate the matrix entries. Unlike early methods, this model does not require the low-rank condition on the model parameter matrix. We prove that when the side features can span the latent feature space of the matrix to be recovered, the number of observed entries needed for an exact recovery is $O(\log N)$ where $N$ is the size of the matrix. When the side features are corrupted latent features of the matrix with a small perturbation, our method can achieve an $\epsilon$-recovery with $O(\log N)$ sample complexity, and maintains an $O(N^{3/2})$ rate similar to classic methods with no side information. An efficient linearized Lagrangian algorithm is developed with a strong guarantee of convergence. Empirical results show that our approach outperforms three state-of-the-art methods both in simulations and on real world datasets. |
Tasks | Matrix Completion |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6265-a-sparse-interactive-model-for-matrix-completion-with-side-information |
PDF | http://papers.nips.cc/paper/6265-a-sparse-interactive-model-for-matrix-completion-with-side-information.pdf |
PWC | https://paperswithcode.com/paper/a-sparse-interactive-model-for-matrix |
Repo | |
Framework | |
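One way to read "models the interaction between the row and column side features" in the abstract above is as a bilinear form: an entry for row i and column j is approximated by x_i^T G y_j, where x_i and y_j are the side-feature vectors and G is a sparse interaction matrix. The sketch below only illustrates that prediction step under this assumption. The function name `predict_entry`, the dictionary storage for G, and the numbers are hypothetical, and the paper's actual objective and its linearized Lagrangian solver are not reproduced here.

```python
def predict_entry(row_features, interaction, col_features):
    """Approximate one matrix entry as x^T G y.

    `interaction` stores only the non-zero entries of G as
    {(row_feature_index, col_feature_index): weight}, which mirrors the
    sparsity assumption on G.
    """
    return sum(
        weight * row_features[a] * col_features[b]
        for (a, b), weight in interaction.items()
    )


# Hypothetical side features and a single active feature interaction.
x_i = [1.0, 2.0]
y_j = [3.0, 4.0]
G = {(0, 1): 0.5}
print(predict_entry(x_i, G, y_j))  # 0.5 * 1.0 * 4.0 = 2.0
```

Storing G sparsely means the prediction cost scales with the number of active feature interactions rather than with the full size of G.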
From OpenCCG to AI Planning: Detecting Infeasible Edges in Sentence Generation
Title | From OpenCCG to AI Planning: Detecting Infeasible Edges in Sentence Generation |
Authors | Maximilian Schwenger, Álvaro Torralba, Joerg Hoffmann, David M. Howcroft, Vera Demberg |
Abstract | The search space in grammar-based natural language generation tasks can get very large, which is particularly problematic when generating long utterances or paragraphs. Using surface realization with OpenCCG as an example, we show that we can effectively detect partial solutions (edges) which cannot ultimately be part of a complete sentence because of their syntactic category. Formulating the completion of an edge into a sentence as finding a solution path in a large state-transition system, we demonstrate a connection to AI Planning which is concerned with this kind of problem. We design a compilation from OpenCCG into AI Planning allowing the detection of infeasible edges via AI Planning dead-end detection methods (proving the absence of a solution to the compilation). Our experiments show that this can filter out large fractions of infeasible edges in, and thus benefit the performance of, complex realization processes. |
Tasks | Text Generation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1144/ |
PDF | https://www.aclweb.org/anthology/C16-1144 |
PWC | https://paperswithcode.com/paper/from-openccg-to-ai-planning-detecting |
Repo | |
Framework | |
Siamese Convolutional Networks for Cognate Identification
Title | Siamese Convolutional Networks for Cognate Identification |
Authors | Taraka Rama |
Abstract | In this paper, we present phoneme-level Siamese convolutional networks for the task of pair-wise cognate identification. We represent a word as a two-dimensional matrix and employ a Siamese convolutional network for learning deep representations. We present Siamese architectures that jointly learn phoneme-level feature representations and language relatedness from raw words for cognate identification. Compared to previous work, we train and test on larger and more realistic datasets, and show that Siamese architectures consistently perform better than a traditional linear classifier approach. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1097/ |
PDF | https://www.aclweb.org/anthology/C16-1097 |
PWC | https://paperswithcode.com/paper/siamese-convolutional-networks-for-cognate |
Repo | |
Framework | |
Wow! What a Useful Extension! Introducing Non-Referential Concepts to Wordnet
Title | Wow! What a Useful Extension! Introducing Non-Referential Concepts to Wordnet |
Authors | Luis Morgado Da Costa, Francis Bond |
Abstract | In this paper we present the ongoing efforts to expand the depth and breadth of the Open Multilingual Wordnet coverage by introducing two new classes of non-referential concepts to wordnet hierarchies: interjections and numeral classifiers. The lexical semantic hierarchy pioneered by Princeton Wordnet has traditionally restricted its coverage to referential and contentful classes of words, such as nouns, verbs, adjectives and adverbs. Previous efforts have been made to enrich wordnet resources including, for example, the inclusion of pronouns, determiners and quantifiers within their hierarchies. Following similar efforts, and motivated by the ongoing semantic annotation of the NTU-Multilingual Corpus, we decided that the four traditional classes of words present in wordnets were too restrictive. Though non-referential, interjections and classifiers possess interesting semantic features that can be well captured by lexical resources like wordnets. In this paper, we will further motivate our decision to include non-referential concepts in wordnets and give an account of the current state of this expansion. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1685/ |
PDF | https://www.aclweb.org/anthology/L16-1685 |
PWC | https://paperswithcode.com/paper/wow-what-a-useful-extension-introducing-non |
Repo | |
Framework | |
Transfer of Corpus-Specific Dialogue Act Annotation to ISO Standard: Is it worth it?
Title | Transfer of Corpus-Specific Dialogue Act Annotation to ISO Standard: Is it worth it? |
Authors | Shammur Absar Chowdhury, Evgeny Stepanov, Giuseppe Riccardi |
Abstract | Spoken conversation corpora often adapt existing Dialogue Act (DA) annotation specifications, such as DAMSL, DIT++, etc., to task-specific needs, yielding incompatible annotations and thus limiting corpus re-usability. The recently accepted ISO standard for DA annotation, Dialogue Act Markup Language (DiAML), is designed to be domain- and application-independent. Moreover, the clear separation of dialogue dimensions and communicative functions, coupled with the hierarchical organization of the latter, allows for classification at different levels of granularity. However, re-annotating existing corpora with the new scheme might require significant effort. In this paper we test the utility of the ISO standard through comparative evaluation of the corpus-specific legacy and the semi-automatically transferred DiAML DA annotations on a supervised dialogue act classification task. To test the domain independence of the resulting annotations, we perform cross-domain and data aggregation evaluation. Compared to the legacy annotation scheme, on the Italian LUNA Human-Human corpus, the DiAML annotation scheme exhibits better cross-domain and data aggregation classification performance, while maintaining comparable in-domain performance. |
Tasks | Dialogue Act Classification |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1020/ |
PDF | https://www.aclweb.org/anthology/L16-1020 |
PWC | https://paperswithcode.com/paper/transfer-of-corpus-specific-dialogue-act |
Repo | |
Framework | |