Paper Group NANR 70
Focus Annotation of Task-based Data: A Comparison of Expert and Crowd-Sourced Annotation in a Reading Comprehension Corpus
Title | Focus Annotation of Task-based Data: A Comparison of Expert and Crowd-Sourced Annotation in a Reading Comprehension Corpus |
Authors | Kordula De Kuthy, Ramon Ziai, Detmar Meurers |
Abstract | While the formal pragmatic concepts in information structure, such as the focus of an utterance, are precisely defined in theoretical linguistics and potentially very useful in conceptual and practical terms, it has turned out to be difficult to reliably annotate such notions in corpus data. We present a large-scale focus annotation effort designed to overcome this problem. Our annotation study is based on the task-based corpus CREG, which consists of answers to explicitly given reading comprehension questions. We compare focus annotation by trained annotators with a crowd-sourcing setup making use of untrained native speakers. Given the task context and an annotation process incrementally making the question form and answer type explicit, the trained annotators reach substantial agreement for focus annotation. Interestingly, the crowd-sourcing setup also supports high-quality annotation ― for specific subtypes of data. Finally, we turn to the question whether the relevance of focus annotation can be extrinsically evaluated. We show that automatic short-answer assessment significantly improves for focus annotated data. The focus annotated CREG corpus is freely available and constitutes the largest such resource for German. |
Tasks | Reading Comprehension |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1621/ |
https://www.aclweb.org/anthology/L16-1621 | |
PWC | https://paperswithcode.com/paper/focus-annotation-of-task-based-data-a |
Repo | |
Framework | |
Proceedings of the Fifth Workshop on Computational Linguistics for Literature
Title | Proceedings of the Fifth Workshop on Computational Linguistics for Literature |
Authors | |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0200/ |
https://www.aclweb.org/anthology/W16-0200 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-fifth-workshop-on |
Repo | |
Framework | |
PARC 3.0: A Corpus of Attribution Relations
Title | PARC 3.0: A Corpus of Attribution Relations |
Authors | Silvia Pareti |
Abstract | Quotation and opinion extraction, discourse and factuality have all partly addressed the annotation and identification of Attribution Relations. However, disjoint efforts have provided a partial and partly inaccurate picture of attribution and generated small or incomplete resources, thus limiting the applicability of machine learning approaches. This paper presents PARC 3.0, a large corpus fully annotated with Attribution Relations (ARs). The annotation scheme was tested with an inter-annotator agreement study showing satisfactory results for the identification of ARs and high agreement on the selection of the text spans corresponding to their constitutive elements: source, cue and content. The corpus, which comprises around 20k ARs, was used to investigate the range of structures that can express attribution. The results show a complex and varied relation of which the literature has addressed only a portion. PARC 3.0 is available for research use and can be used in a range of different studies to analyse attribution and validate assumptions as well as to develop supervised attribution extraction models. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1619/ |
https://www.aclweb.org/anthology/L16-1619 | |
PWC | https://paperswithcode.com/paper/parc-30-a-corpus-of-attribution-relations |
Repo | |
Framework | |
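The abstract describes each Attribution Relation as three text spans: source, cue, and content. A minimal sketch of that structure (field and class names are illustrative, not the actual PARC 3.0 schema):

```python
from dataclasses import dataclass

@dataclass
class Span:
    start: int  # token offset where the span begins
    end: int    # token offset one past the last token

@dataclass
class AttributionRelation:
    source: Span   # who the content is attributed to
    cue: Span      # the attribution predicate, e.g. "said"
    content: Span  # the attributed material

# Example: "Analysts said profits will rise ."
# tokens:    0:Analysts 1:said 2:profits 3:will 4:rise 5:.
ar = AttributionRelation(source=Span(0, 1), cue=Span(1, 2), content=Span(2, 5))
```

A supervised extraction model, as mentioned at the end of the abstract, would predict these three spans jointly for each attribution cue found in text.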
Replicability of Research in Biomedical Natural Language Processing: a pilot evaluation for a coding task
Title | Replicability of Research in Biomedical Natural Language Processing: a pilot evaluation for a coding task |
Authors | Aurélie Névéol, Kevin Cohen, Cyril Grouin, Aude Robert |
Abstract | |
Tasks | |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/W16-6110/ |
https://www.aclweb.org/anthology/W16-6110 | |
PWC | https://paperswithcode.com/paper/replicability-of-research-in-biomedical |
Repo | |
Framework | |
Composition-Preserving Deep Photo Aesthetics Assessment
Title | Composition-Preserving Deep Photo Aesthetics Assessment |
Authors | Long Mai, Hailin Jin, Feng Liu |
Abstract | Photo aesthetics assessment is challenging. Deep convolutional neural network (ConvNet) methods have recently shown promising results for aesthetics assessment. The performance of these deep ConvNet methods, however, is often compromised by the constraint that the neural network only takes a fixed-size input. To accommodate this requirement, input images need to be transformed via cropping, scaling, or padding, which often damages image composition, reduces image resolution, or causes image distortion, thus compromising the aesthetics of the original images. In this paper, we present a composition-preserving deep ConvNet method that directly learns aesthetics features from the original input images without any image transformations. Specifically, our method adds an adaptive spatial pooling layer upon the regular convolution and pooling layers to directly handle input images with original sizes and aspect ratios. To allow for multi-scale feature extraction, we develop the Multi-Net Adaptive Spatial Pooling ConvNet architecture, which consists of multiple sub-networks with different adaptive spatial pooling sizes and leverages a scene-based aggregation layer to effectively combine the predictions from multiple sub-networks. Our experiments on the large-scale aesthetics assessment benchmark (AVA) demonstrate that our method can significantly improve the state-of-the-art results in photo aesthetics assessment. |
Tasks | Aesthetics Quality Assessment |
Published | 2016-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2016/html/Mai_Composition-Preserving_Deep_Photo_CVPR_2016_paper.html |
http://openaccess.thecvf.com/content_cvpr_2016/papers/Mai_Composition-Preserving_Deep_Photo_CVPR_2016_paper.pdf | |
PWC | https://paperswithcode.com/paper/composition-preserving-deep-photo-aesthetics |
Repo | |
Framework | |
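The adaptive spatial pooling idea in the abstract can be sketched in a few lines: any H × W feature map is pooled down to a fixed output grid, so downstream layers always see a constant-size input regardless of the image's original size or aspect ratio. This is a generic illustration (not the authors' implementation), using the common floor/ceil bin-boundary convention:

```python
import numpy as np

def adaptive_avg_pool(fmap: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Average-pool an arbitrary 2D feature map to a fixed out_h x out_w grid."""
    h, w = fmap.shape
    pooled = np.empty((out_h, out_w))
    for i in range(out_h):
        # bin i covers rows [floor(i*h/out_h), ceil((i+1)*h/out_h))
        r0, r1 = (i * h) // out_h, ((i + 1) * h + out_h - 1) // out_h
        for j in range(out_w):
            c0, c1 = (j * w) // out_w, ((j + 1) * w + out_w - 1) // out_w
            pooled[i, j] = fmap[r0:r1, c0:c1].mean()
    return pooled

# Feature maps of very different shapes map to the same fixed output size.
a = adaptive_avg_pool(np.random.rand(17, 31), 3, 3)
b = adaptive_avg_pool(np.random.rand(64, 20), 3, 3)
assert a.shape == b.shape == (3, 3)
```

Because the bin boundaries scale with the input, no cropping, scaling, or padding is needed, which is exactly the composition-preserving property the paper motivates.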
Syntax Matters for Rhetorical Structure: The Case of Chiasmus
Title | Syntax Matters for Rhetorical Structure: The Case of Chiasmus |
Authors | Marie Dubremetz, Joakim Nivre |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0206/ |
https://www.aclweb.org/anthology/W16-0206 | |
PWC | https://paperswithcode.com/paper/syntax-matters-for-rhetorical-structure-the |
Repo | |
Framework | |
Bilingual Chronological Classification of Hafez’s Poems
Title | Bilingual Chronological Classification of Hafez’s Poems |
Authors | Arya Rahgozar, Diana Inkpen |
Abstract | |
Tasks | Text Classification |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0207/ |
https://www.aclweb.org/anthology/W16-0207 | |
PWC | https://paperswithcode.com/paper/bilingual-chronological-classification-of |
Repo | |
Framework | |
Data61-CSIRO systems at the CLPsych 2016 Shared Task
Title | Data61-CSIRO systems at the CLPsych 2016 Shared Task |
Authors | Sunghwan Mac Kim, Yufei Wang, Stephen Wan, Cécile Paris |
Abstract | |
Tasks | Text Classification |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0313/ |
https://www.aclweb.org/anthology/W16-0313 | |
PWC | https://paperswithcode.com/paper/data61-csiro-systems-at-the-clpsych-2016 |
Repo | |
Framework | |
Domainless Adaptation by Constrained Decoding on a Schema Lattice
Title | Domainless Adaptation by Constrained Decoding on a Schema Lattice |
Authors | Young-Bum Kim, Karl Stratos, Ruhi Sarikaya |
Abstract | In many applications such as personal digital assistants, there is a constant need for new domains to increase the system's coverage of user queries. A conventional approach is to learn a separate model every time a new domain is introduced. This approach is slow, inefficient, and a bottleneck for scaling to a large number of domains. In this paper, we introduce a framework that allows us to have a single model that can handle all domains, including unknown domains that may be created in the future, as long as they are covered in the master schema. The key idea is to remove the need for distinguishing domains by explicitly predicting the schema of queries. Given the permitted schema of a query, we perform constrained decoding on a lattice of slot sequences allowed under the schema. The proposed model achieves competitive and often superior performance over the conventional model trained separately per domain. |
Tasks | Multi-Label Classification, Spoken Language Understanding |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1193/ |
https://www.aclweb.org/anthology/C16-1193 | |
PWC | https://paperswithcode.com/paper/domainless-adaptation-by-constrained-decoding |
Repo | |
Framework | |
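The core mechanism in the abstract, decoding constrained to slot sequences permitted by the query's schema, can be illustrated with a toy brute-force decoder (slot names and scores are made up; the paper's actual model uses a lattice rather than enumeration):

```python
from itertools import product

def constrained_decode(token_scores, schema):
    """token_scores: list of {slot: score} dicts, one per token.
    schema: set of permitted slots. Returns the best permitted slot sequence."""
    slots = sorted(set().union(*[s.keys() for s in token_scores]))
    best_seq, best_score = None, float("-inf")
    for seq in product(slots, repeat=len(token_scores)):
        if not set(seq) <= schema:      # prune paths that leave the schema
            continue
        score = sum(token_scores[i][s] for i, s in enumerate(seq))
        if score > best_score:
            best_seq, best_score = list(seq), score
    return best_seq

scores = [{"O": 0.1, "city": 0.8, "artist": 0.7},
          {"O": 0.6, "city": 0.2, "artist": 0.9}]
# Under a travel-style schema, "artist" is never emitted even though it
# scores highest for the second token.
print(constrained_decode(scores, schema={"O", "city"}))  # ['city', 'O']
```

This shows why a single tagger can serve every domain: the domain distinction is carried entirely by the schema constraint at decoding time, not by separate per-domain models.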
Classification of comment helpfulness to improve knowledge sharing among medical practitioners.
Title | Classification of comment helpfulness to improve knowledge sharing among medical practitioners. |
Authors | Pierre André Ménard, Caroline Barrière |
Abstract | |
Tasks | Opinion Mining |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0414/ |
https://www.aclweb.org/anthology/W16-0414 | |
PWC | https://paperswithcode.com/paper/classification-of-comment-helpfulness-to |
Repo | |
Framework | |
The UMD CLPsych 2016 Shared Task System: Text Representation for Predicting Triage of Forum Posts about Mental Health
Title | The UMD CLPsych 2016 Shared Task System: Text Representation for Predicting Triage of Forum Posts about Mental Health |
Authors | Meir Friedenberg, Hadi Amiri, Hal Daumé III, Philip Resnik |
Abstract | |
Tasks | Representation Learning, Topic Models |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0319/ |
https://www.aclweb.org/anthology/W16-0319 | |
PWC | https://paperswithcode.com/paper/the-umd-clpsych-2016-shared-task-system-text |
Repo | |
Framework | |
Proceedings of the Second Workshop on Computational Approaches to Code Switching
Title | Proceedings of the Second Workshop on Computational Approaches to Code Switching |
Authors | |
Abstract | |
Tasks | |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/W16-5800/ |
https://www.aclweb.org/anthology/W16-5800 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-second-workshop-on-11 |
Repo | |
Framework | |
Optimal Architectures in a Solvable Model of Deep Networks
Title | Optimal Architectures in a Solvable Model of Deep Networks |
Authors | Jonathan Kadmon, Haim Sompolinsky |
Abstract | Deep neural networks have received considerable attention due to the success of their training for real world machine learning applications. They are also of great interest to the understanding of sensory processing in cortical sensory hierarchies. The purpose of this work is to advance our theoretical understanding of the computational benefits of these architectures. Using a simple model of clustered noisy inputs and a simple learning rule, we provide analytically derived recursion relations describing the propagation of the signals along the deep network. By analysis of these equations, and defining performance measures, we show that these model networks have optimal depths. We further explore the dependence of the optimal architecture on the system parameters. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6330-optimal-architectures-in-a-solvable-model-of-deep-networks |
http://papers.nips.cc/paper/6330-optimal-architectures-in-a-solvable-model-of-deep-networks.pdf | |
PWC | https://paperswithcode.com/paper/optimal-architectures-in-a-solvable-model-of |
Repo | |
Framework | |
Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation
Title | Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation |
Authors | Wajdi Zaghouani, Nizar Habash, Ossama Obeid, Behrang Mohit, Houda Bouamor, Kemal Oflazer |
Abstract | We present our guidelines and annotation procedure to create a human-corrected, machine-translated post-edited corpus for Modern Standard Arabic. Our overarching goal is to use the annotated corpus to develop automatic machine translation post-editing systems for Arabic that can be used to help accelerate the human revision process of translated texts. The creation of any manually annotated corpus usually presents many challenges. In order to address these challenges, we created comprehensive and simplified annotation guidelines which were used by a team of five annotators and one lead annotator. In order to ensure a high annotation agreement between the annotators, multiple training sessions were held and regular inter-annotator agreement measures were performed to check the annotation quality. The created corpus of manual post-edited translations of English to Arabic articles is the largest to date for this language pair. |
Tasks | Machine Translation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1295/ |
https://www.aclweb.org/anthology/L16-1295 | |
PWC | https://paperswithcode.com/paper/building-an-arabic-machine-translation-post |
Repo | |
Framework | |
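The abstract reports regular inter-annotator agreement checks during annotation. Cohen's kappa is one standard such measure, sketched here generically (the labels below are invented for illustration; the paper does not specify which metric it used):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators over the same items."""
    n = len(labels_a)
    po = sum(a == b for a, b in zip(labels_a, labels_b)) / n   # observed agreement
    ca, cb = Counter(labels_a), Counter(labels_b)
    pe = sum(ca[k] * cb[k] for k in ca) / (n * n)              # agreement by chance
    return (po - pe) / (1 - pe)

a = ["keep", "edit", "edit", "keep", "edit", "keep"]
b = ["keep", "edit", "keep", "keep", "edit", "keep"]
print(round(cohens_kappa(a, b), 3))  # 0.667
```

Tracking such a statistic across annotation rounds is what makes the "regular inter-annotator agreement measures" in the abstract actionable: a drop signals that guidelines or training sessions need revisiting.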
Learning to Weight Translations using Ordinal Linear Regression and Query-generated Training Data for Ad-hoc Retrieval with Long Queries
Title | Learning to Weight Translations using Ordinal Linear Regression and Query-generated Training Data for Ad-hoc Retrieval with Long Queries |
Authors | Javid Dadashkarimi, Masoud Jalili Sabet, Azadeh Shakery |
Abstract | Ordinal regression, also known as learning to rank, has long been used in information retrieval (IR). Learning to rank algorithms have been successfully tailored to document ranking, information filtering, and the construction of large aligned corpora. In this paper, we propose to use this algorithm for query modeling in cross-language environments. To this end, we first build query-generated training data using documents pseudo-relevant to the query and all translation candidates. The pseudo-relevant documents are the top-ranked documents retrieved in response to a translation of the original query. The class of each candidate in the training data is determined by the presence or absence of the candidate in the pseudo-relevant documents. We learn an ordinal regression model to score the candidates based on their relevance to the context of the query, and then construct a query-dependent translation model using a softmax function. Finally, we re-weight the query based on the obtained model. Experimental results on French, German, Spanish, and Italian CLEF collections demonstrate that the proposed method achieves better results than state-of-the-art cross-language information retrieval methods, particularly for long queries with large training data. |
Tasks | Document Ranking, Information Retrieval, Learning-To-Rank |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1162/ |
https://www.aclweb.org/anthology/C16-1162 | |
PWC | https://paperswithcode.com/paper/learning-to-weight-translations-using-ordinal |
Repo | |
Framework | |
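The final step in the abstract, turning regression scores for translation candidates into a query-dependent translation model via a softmax, is simple enough to show directly. Candidate translations and scores below are invented for illustration:

```python
import math

def translation_model(scored_candidates):
    """scored_candidates: {translation: relevance_score} for one query term.
    Returns normalized translation probabilities via a softmax."""
    m = max(scored_candidates.values())            # subtract max for stability
    exps = {t: math.exp(s - m) for t, s in scored_candidates.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

# Hypothetical ordinal-regression scores for French translations of "home".
probs = translation_model({"maison": 2.0, "domicile": 0.5, "foyer": -1.0})
assert abs(sum(probs.values()) - 1.0) < 1e-9
assert probs["maison"] > probs["domicile"] > probs["foyer"]
```

The resulting probabilities can then be used to re-weight the translated query terms, so candidates the regression model judged more relevant in context contribute more to retrieval.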