May 5, 2019

Paper Group NANR 69

Legal Text Interpretation: Identifying Hohfeldian Relations from Text. FastHybrid: A Hybrid Model for Efficient Answer Selection. ASM Kernel: Graph Kernel using Approximate Subgraph Matching for Relation Extraction. AppDialogue: Multi-App Dialogues for Intelligent Assistants. WikiCoref: An English Coreference-annotated Corpus of Wikipedia Articles. …

Title Legal Text Interpretation: Identifying Hohfeldian Relations from Text
Authors Wim Peters, Adam Wyner
Abstract The paper investigates the extent of the support semi-automatic analysis can provide for the specific task of assigning Hohfeldian relations of Duty, using the General Architecture for Text Engineering tool for the automated extraction of Duty instances and the bearers of associated roles. The outcome of the analysis supports scholars in identifying Hohfeldian structures in legal text when performing close reading of the texts. A cyclic workflow involving automated annotation and expert feedback will incrementally increase the quality and coverage of the automatic extraction process, and increasingly reduce the amount of manual work required of the scholar.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1059/
PDF https://www.aclweb.org/anthology/L16-1059
PWC https://paperswithcode.com/paper/legal-text-interpretation-identifying
Repo
Framework

FastHybrid: A Hybrid Model for Efficient Answer Selection

Title FastHybrid: A Hybrid Model for Efficient Answer Selection
Authors Lidan Wang, Ming Tan, Jiawei Han
Abstract Answer selection is a core component in any question-answering system. It aims to select correct answer sentences for a given question from a pool of candidate sentences. In recent years, many deep learning methods have been proposed and shown excellent results for this task. However, these methods typically require extensive parameter (and hyper-parameter) tuning, which gives rise to efficiency issues for large-scale datasets and potentially makes them less portable across new datasets and domains (as re-tuning is usually required). In this paper, we propose an extremely efficient hybrid model (FastHybrid) that tackles the problem from both an accuracy and a scalability point of view. FastHybrid is a light-weight model that requires little tuning and adaptation across different domains. It combines a fast deep model (which will be introduced in the method section) with an initial information retrieval model to effectively and efficiently handle answer selection. We introduce a new efficient attention mechanism in the hybrid model and demonstrate its effectiveness on several QA datasets. Experimental results show that although the hybrid uses no training data, its accuracy is often on par with supervised deep learning techniques, while significantly reducing training and tuning costs across different domains.
Tasks Answer Selection, Information Retrieval, Open-Domain Question Answering, Question Answering
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1224/
PDF https://www.aclweb.org/anthology/C16-1224
PWC https://paperswithcode.com/paper/fasthybrid-a-hybrid-model-for-efficient
Repo
Framework

ASM Kernel: Graph Kernel using Approximate Subgraph Matching for Relation Extraction

Title ASM Kernel: Graph Kernel using Approximate Subgraph Matching for Relation Extraction
Authors Nagesh C. Panyam, Karin Verspoor, Trevor Cohn, Rao Kotagiri
Abstract
Tasks Feature Engineering, Relation Extraction, Sentence Classification
Published 2016-12-01
URL https://www.aclweb.org/anthology/U16-1007/
PDF https://www.aclweb.org/anthology/U16-1007
PWC https://paperswithcode.com/paper/asm-kernel-graph-kernel-using-approximate
Repo
Framework

AppDialogue: Multi-App Dialogues for Intelligent Assistants

Title AppDialogue: Multi-App Dialogues for Intelligent Assistants
Authors Ming Sun, Yun-Nung Chen, Zhenhao Hua, Yulian Tamres-Rudnicky, Arnab Dash, Alexander Rudnicky
Abstract Users will interact with an individual app on smart devices (e.g., phone, TV, car) to fulfill a specific goal (e.g., find a photographer), but users may also pursue more complex tasks that will span multiple domains and apps (e.g., plan a wedding ceremony). Planning and executing such multi-app tasks are typically managed by users, considering the required global context awareness. To investigate how users arrange domains/apps to fulfill complex tasks in their daily life, we conducted a user study with 14 participants to collect such data from their Android smart phones. This document 1) summarizes the techniques used in the data collection and 2) provides a brief statistical description of the data. This data guides the future direction for researchers in fields such as conversational agents and personal assistants. This data is available at http://AppDialogue.com.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1499/
PDF https://www.aclweb.org/anthology/L16-1499
PWC https://paperswithcode.com/paper/appdialogue-multi-app-dialogues-for
Repo
Framework

WikiCoref: An English Coreference-annotated Corpus of Wikipedia Articles

Title WikiCoref: An English Coreference-annotated Corpus of Wikipedia Articles
Authors Abbas Ghaddar, Phillippe Langlais
Abstract This paper presents WikiCoref, an English corpus annotated for anaphoric relations, where all documents are from the English version of Wikipedia. Our annotation scheme follows the one of OntoNotes with a few disparities. We annotated each markable with coreference type, mention type and the equivalent Freebase topic. Since most similar annotation efforts concentrate on very specific types of written text, mainly newswire, there is a lack of resources for otherwise over-used Wikipedia texts. The corpus described in this paper addresses this issue. We present a freely available resource we initially devised for improving coreference resolution algorithms dedicated to Wikipedia texts. Our corpus has no restriction on the topics of the documents being annotated, and documents of various sizes have been considered for annotation.
Tasks Coreference Resolution
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1021/
PDF https://www.aclweb.org/anthology/L16-1021
PWC https://paperswithcode.com/paper/wikicoref-an-english-coreference-annotated
Repo
Framework

Attention-Based Convolutional Neural Network for Semantic Relation Extraction

Title Attention-Based Convolutional Neural Network for Semantic Relation Extraction
Authors Yatian Shen, Xuanjing Huang
Abstract Nowadays, neural networks play an important role in the task of relation classification. In this paper, we propose a novel attention-based convolutional neural network architecture for this task. Our model makes full use of word embedding, part-of-speech tag embedding and position embedding information. A word-level attention mechanism is able to better determine which parts of the sentence are most influential with respect to the two entities of interest. This architecture enables learning some important features from task-specific labeled data, forgoing the need for external knowledge such as explicit dependency structures. Experiments on the SemEval-2010 Task 8 benchmark dataset show that our model achieves better performance than several state-of-the-art neural network models and can achieve competitive performance with only minimal feature engineering.
Tasks Feature Engineering, Relation Classification, Relation Extraction
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1238/
PDF https://www.aclweb.org/anthology/C16-1238
PWC https://paperswithcode.com/paper/attention-based-convolutional-neural-network-2
Repo
Framework

A Sequence Model Approach to Relation Extraction in Portuguese

Title A Sequence Model Approach to Relation Extraction in Portuguese
Authors Sandra Collovini, Gabriel Machado, Renata Vieira
Abstract The task of Relation Extraction from texts is one of the main challenges in the area of Information Extraction, considering the required linguistic knowledge and the sophistication of the language processing techniques employed. This task aims at identifying and classifying semantic relations that occur between entities recognized in a given text. In this paper, we evaluated a Conditional Random Fields classifier for the extraction of any relation descriptor occurring between named entities (Organisation, Person and Place categories), as well as pre-defined relation types between these entities in Portuguese texts.
Tasks Relation Extraction
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1301/
PDF https://www.aclweb.org/anthology/L16-1301
PWC https://paperswithcode.com/paper/a-sequence-model-approach-to-relation
Repo
Framework

Encoding a syntactic dictionary into a super granular unification grammar

Title Encoding a syntactic dictionary into a super granular unification grammar
Authors Sylvain Kahane, François Lareau
Abstract We show how to turn a large-scale syntactic dictionary into a dependency-based unification grammar where each piece of lexical information calls a separate rule, yielding a super granular grammar. Subcategorization, raising and control verbs, auxiliaries and copula, passivization, and tough-movement are discussed. We focus on the semantics-syntax interface and offer a new perspective on syntactic structure.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-3812/
PDF https://www.aclweb.org/anthology/W16-3812
PWC https://paperswithcode.com/paper/encoding-a-syntactic-dictionary-into-a-super
Repo
Framework

“He Said She Said” ― a Male/Female Corpus of Polish

Title “He Said She Said” ― a Male/Female Corpus of Polish
Authors Filip Graliński, Łukasz Borchmann, Piotr Wierzchoń
Abstract Gender differences in language use have long been of interest in linguistics. The task of automatic gender attribution has been considered in computational linguistics as well. Most research of this type is done using (usually English) texts with authorship metadata. In this paper, we propose a new method of male/female corpus creation based on gender-specific first-person expressions. The method was applied to the CommonCrawl Web corpus for Polish (a language in which gender-revealing first-person expressions are particularly frequent) to yield a large (780M words) and varied collection of men's and women's texts. The whole procedure for building the corpus and filtering out unwanted texts is described in the present paper. The quality check was done on a random sample of the corpus to make sure that the majority (84%) of texts are correctly attributed, natural texts. Some preliminary (socio)linguistic insights (websites and words frequently occurring in male/female fragments) are given as well.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1648/
PDF https://www.aclweb.org/anthology/L16-1648
PWC https://paperswithcode.com/paper/he-said-she-said-a-a-malefemale-corpus-of
Repo
Framework

VSoLSCSum: Building a Vietnamese Sentence-Comment Dataset for Social Context Summarization

Title VSoLSCSum: Building a Vietnamese Sentence-Comment Dataset for Social Context Summarization
Authors Minh-Tien Nguyen, Dac Viet Lai, Phong-Khac Do, Duc-Vu Tran, Minh-Le Nguyen
Abstract This paper presents VSoLSCSum, a Vietnamese linked sentence-comment dataset, which was manually created to address the lack of standard corpora for social context summarization in Vietnamese. The dataset was collected through the keywords of 141 Web documents in 12 special events, which were mentioned on Vietnamese Web pages. Social users were asked to take part in creating standard summaries and the label of each sentence or comment. The inter-rater agreement calculated by Cohen's Kappa among raters after validating is 0.685. To illustrate the potential use of our dataset, a learning to rank method was trained by using a set of local and social features. Experimental results indicate that the summary model trained on our dataset outperforms state-of-the-art baselines in both ROUGE-1 and ROUGE-2 in social context summarization.
Tasks Learning-To-Rank
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5405/
PDF https://www.aclweb.org/anthology/W16-5405
PWC https://paperswithcode.com/paper/vsolscsum-building-a-vietnamese-sentence
Repo
Framework
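
The VSoLSCSum abstract above reports inter-rater agreement as Cohen's Kappa. As a reminder of what that statistic measures, here is a minimal sketch of the standard two-rater formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name `cohens_kappa` and the toy labels are illustrative assumptions, not taken from the paper or its dataset, and the abstract does not say how agreement was aggregated across more than two raters.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items (hypothetical helper)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items both raters labelled identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over labels of the product of the raters' marginal frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one and the same label for every item
    return (p_o - p_e) / (1 - p_e)


# Toy example (made-up labels): 1 = "belongs in the summary", 0 = "does not".
rater_1 = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
rater_2 = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]
kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.3f}")
```

Because kappa discounts chance agreement, it is lower than raw percentage agreement: in the toy data the raters agree on 8 of 10 items, yet kappa is noticeably below 0.8.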

A Sparse Interactive Model for Matrix Completion with Side Information

Title A Sparse Interactive Model for Matrix Completion with Side Information
Authors Jin Lu, Guannan Liang, Jiangwen Sun, Jinbo Bi
Abstract Matrix completion methods can benefit from side information besides the partially observed matrix. The use of side features describing the row and column entities of a matrix has been shown to reduce the sample complexity for completing the matrix. We propose a novel sparse formulation that explicitly models the interaction between the row and column side features to approximate the matrix entries. Unlike early methods, this model does not require the low-rank condition on the model parameter matrix. We prove that when the side features can span the latent feature space of the matrix to be recovered, the number of observed entries needed for an exact recovery is $O(\log N)$ where $N$ is the size of the matrix. When the side features are corrupted latent features of the matrix with a small perturbation, our method can achieve an $\epsilon$-recovery with $O(\log N)$ sample complexity, and maintains a $O(N^{3/2})$ rate similar to classic methods with no side information. An efficient linearized Lagrangian algorithm is developed with a strong guarantee of convergence. Empirical results show that our approach outperforms three state-of-the-art methods both in simulations and on real world datasets.
Tasks Matrix Completion
Published 2016-12-01
URL http://papers.nips.cc/paper/6265-a-sparse-interactive-model-for-matrix-completion-with-side-information
PDF http://papers.nips.cc/paper/6265-a-sparse-interactive-model-for-matrix-completion-with-side-information.pdf
PWC https://paperswithcode.com/paper/a-sparse-interactive-model-for-matrix
Repo
Framework

From OpenCCG to AI Planning: Detecting Infeasible Edges in Sentence Generation

Title From OpenCCG to AI Planning: Detecting Infeasible Edges in Sentence Generation
Authors Maximilian Schwenger, Álvaro Torralba, Joerg Hoffmann, David M. Howcroft, Vera Demberg
Abstract The search space in grammar-based natural language generation tasks can get very large, which is particularly problematic when generating long utterances or paragraphs. Using surface realization with OpenCCG as an example, we show that we can effectively detect partial solutions (edges) which cannot ultimately be part of a complete sentence because of their syntactic category. Formulating the completion of an edge into a sentence as finding a solution path in a large state-transition system, we demonstrate a connection to AI Planning which is concerned with this kind of problem. We design a compilation from OpenCCG into AI Planning allowing the detection of infeasible edges via AI Planning dead-end detection methods (proving the absence of a solution to the compilation). Our experiments show that this can filter out large fractions of infeasible edges in, and thus benefit the performance of, complex realization processes.
Tasks Text Generation
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1144/
PDF https://www.aclweb.org/anthology/C16-1144
PWC https://paperswithcode.com/paper/from-openccg-to-ai-planning-detecting
Repo
Framework

Siamese Convolutional Networks for Cognate Identification

Title Siamese Convolutional Networks for Cognate Identification
Authors Taraka Rama
Abstract In this paper, we present phoneme level Siamese convolutional networks for the task of pair-wise cognate identification. We represent a word as a two-dimensional matrix and employ a siamese convolutional network for learning deep representations. We present siamese architectures that jointly learn phoneme level feature representations and language relatedness from raw words for cognate identification. Compared to previous works, we train and test on larger and more realistic datasets, and show that siamese architectures consistently perform better than a traditional linear classifier approach.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1097/
PDF https://www.aclweb.org/anthology/C16-1097
PWC https://paperswithcode.com/paper/siamese-convolutional-networks-for-cognate
Repo
Framework

Wow! What a Useful Extension! Introducing Non-Referential Concepts to Wordnet

Title Wow! What a Useful Extension! Introducing Non-Referential Concepts to Wordnet
Authors Luis Morgado Da Costa, Francis Bond
Abstract In this paper we present the ongoing efforts to expand the depth and breadth of the Open Multilingual Wordnet coverage by introducing two new classes of non-referential concepts to wordnet hierarchies: interjections and numeral classifiers. The lexical semantic hierarchy pioneered by Princeton Wordnet has traditionally restricted its coverage to referential and contentful classes of words, such as nouns, verbs, adjectives and adverbs. Previous efforts have been made to enrich wordnet resources including, for example, the inclusion of pronouns, determiners and quantifiers within their hierarchies. Following similar efforts, and motivated by the ongoing semantic annotation of the NTU-Multilingual Corpus, we decided that the four traditional classes of words present in wordnets were too restrictive. Though non-referential, interjections and classifiers possess interesting semantic features that can be well captured by lexical resources like wordnets. In this paper, we will further motivate our decision to include non-referential concepts in wordnets and give an account of the current state of this expansion.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1685/
PDF https://www.aclweb.org/anthology/L16-1685
PWC https://paperswithcode.com/paper/wow-what-a-useful-extension-introducing-non
Repo
Framework

Transfer of Corpus-Specific Dialogue Act Annotation to ISO Standard: Is it worth it?

Title Transfer of Corpus-Specific Dialogue Act Annotation to ISO Standard: Is it worth it?
Authors Shammur Absar Chowdhury, Evgeny Stepanov, Giuseppe Riccardi
Abstract Spoken conversation corpora often adapt existing Dialogue Act (DA) annotation specifications, such as DAMSL, DIT++, etc., to task-specific needs, yielding incompatible annotations and thus limiting corpora re-usability. The recently accepted ISO standard for DA annotation, Dialogue Act Markup Language (DiAML), is designed as domain and application independent. Moreover, the clear separation of dialogue dimensions and communicative functions, coupled with the hierarchical organization of the latter, allows for classification at different levels of granularity. However, re-annotating existing corpora with the new scheme might require significant effort. In this paper we test the utility of the ISO standard through comparative evaluation of the corpus-specific legacy and the semi-automatically transferred DiAML DA annotations on a supervised dialogue act classification task. To test the domain independence of the resulting annotations, we perform cross-domain and data aggregation evaluation. Compared to the legacy annotation scheme, on the Italian LUNA Human-Human corpus, the DiAML annotation scheme exhibits better cross-domain and data aggregation classification performance, while maintaining comparable in-domain performance.
Tasks Dialogue Act Classification
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1020/
PDF https://www.aclweb.org/anthology/L16-1020
PWC https://paperswithcode.com/paper/transfer-of-corpus-specific-dialogue-act
Repo
Framework