Paper Group NAWR 12
Reassessing the Goals of Grammatical Error Correction: Fluency Instead of Grammaticality
Title | Reassessing the Goals of Grammatical Error Correction: Fluency Instead of Grammaticality |
Authors | Keisuke Sakaguchi, Courtney Napoles, Matt Post, Joel Tetreault |
Abstract | The field of grammatical error correction (GEC) has grown substantially in recent years, with research directed at both evaluation metrics and improved system performance against those metrics. One unvisited assumption, however, is the reliance of GEC evaluation on error-coded corpora, which contain specific labeled corrections. We examine current practices and show that GEC's reliance on such corpora unnaturally constrains annotation and automatic evaluation, resulting in (a) sentences that do not sound acceptable to native speakers and (b) system rankings that do not correlate with human judgments. In light of this, we propose an alternate approach that jettisons costly error coding in favor of unannotated, whole-sentence rewrites. We compare the performance of existing metrics over different gold-standard annotations, and show that automatic evaluation with our new annotation scheme has very strong correlation with expert rankings (ρ = 0.82). As a result, we advocate for a fundamental and necessary shift in the goal of GEC, from correcting small, labeled error types, to producing text that has native fluency. |
Tasks | Grammatical Error Correction |
Published | 2016-01-01 |
URL | https://www.aclweb.org/anthology/Q16-1013/ |
https://www.aclweb.org/anthology/Q16-1013 | |
PWC | https://paperswithcode.com/paper/reassessing-the-goals-of-grammatical-error |
Repo | https://github.com/keisks/reassess-gec |
Framework | none |
Fast ε-free Inference of Simulation Models with Bayesian Conditional Density Estimation
Title | Fast ε-free Inference of Simulation Models with Bayesian Conditional Density Estimation |
Authors | George Papamakarios, Iain Murray |
Abstract | Many statistical models can be simulated forwards but have intractable likelihoods. Approximate Bayesian Computation (ABC) methods are used to infer properties of these models from data. Traditionally these methods approximate the posterior over parameters by conditioning on data being inside an ε-ball around the observed data, which is only correct in the limit ε→0. Monte Carlo methods can then draw samples from the approximate posterior to approximate predictions or error bars on parameters. These algorithms critically slow down as ε→0, and in practice draw samples from a broader distribution than the posterior. We propose a new approach to likelihood-free inference based on Bayesian conditional density estimation. Preliminary inferences based on limited simulation data are used to guide later simulations. In some cases, learning an accurate parametric representation of the entire true posterior distribution requires fewer model simulations than Monte Carlo ABC methods need to produce a single sample from an approximate posterior. |
Tasks | Density Estimation |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6084-fast-free-inference-of-simulation-models-with-bayesian-conditional-density-estimation |
http://papers.nips.cc/paper/6084-fast-free-inference-of-simulation-models-with-bayesian-conditional-density-estimation.pdf | |
PWC | https://paperswithcode.com/paper/fast-free-inference-of-simulation-models-with-1 |
Repo | https://github.com/gpapamak/epsilon_free_inference |
Framework | none |
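The ε-ball conditioning that the abstract above describes as the traditional ABC approach can be sketched on a toy problem. This is a minimal illustration of plain rejection ABC, not the conditional density estimation method the paper proposes; the Gaussian simulator, the uniform prior, and every function name here are assumptions made for the example.

```python
import random
import statistics


def simulate(theta, rng, n=20):
    """Hypothetical toy simulator: the mean of n Gaussian draws centred on theta."""
    return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(n))


def rejection_abc(observed, epsilon, n_trials, rng):
    """Keep prior draws whose simulated summary lies within epsilon of the observation."""
    accepted = []
    for _ in range(n_trials):
        theta = rng.uniform(-5.0, 5.0)  # assumed uniform prior over the parameter
        if abs(simulate(theta, rng) - observed) <= epsilon:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
wide = rejection_abc(observed=1.0, epsilon=1.0, n_trials=5000, rng=rng)
narrow = rejection_abc(observed=1.0, epsilon=0.1, n_trials=5000, rng=rng)

# A smaller epsilon accepts far fewer simulations but gives a tighter sample.
print(len(wide), statistics.stdev(wide))
print(len(narrow), statistics.stdev(narrow))
```

Shrinking `epsilon` tightens the accepted sample around the observation while the acceptance count drops, which is the slowdown as ε→0 that the abstract refers to.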
Addressee and Response Selection for Multi-Party Conversation
Title | Addressee and Response Selection for Multi-Party Conversation |
Authors | Hiroki Ouchi, Yuta Tsuboi |
Abstract | |
Tasks | Short-Text Conversation |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1231/ |
https://www.aclweb.org/anthology/D16-1231 | |
PWC | https://paperswithcode.com/paper/addressee-and-response-selection-for-multi |
Repo | https://github.com/hiroki13/response-ranking |
Framework | none |
Named Entity Recognition in Swedish Health Records with Character-Based Deep Bidirectional LSTMs
Title | Named Entity Recognition in Swedish Health Records with Character-Based Deep Bidirectional LSTMs |
Authors | Simon Almgren, Sean Pavlov, Olof Mogren |
Abstract | We propose an approach for named entity recognition in medical data, using a character-based deep bidirectional recurrent neural network. Such models can learn features and patterns based on the character sequence, and are not limited to a fixed vocabulary. This makes them very well suited for the NER task in the medical domain. Our experimental evaluation shows promising results, with a 60% improvement in F1 score over the baseline, and our system generalizes well between different datasets. |
Tasks | Boundary Detection, Feature Engineering, Named Entity Recognition |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5104/ |
https://www.aclweb.org/anthology/W16-5104 | |
PWC | https://paperswithcode.com/paper/named-entity-recognition-in-swedish-health |
Repo | https://github.com/olofmogren/biomedical-ner-data-swedish |
Framework | none |
Jigg: A Framework for an Easy Natural Language Processing Pipeline
Title | Jigg: A Framework for an Easy Natural Language Processing Pipeline |
Authors | Hiroshi Noji, Yusuke Miyao |
Abstract | |
Tasks | Coreference Resolution, Dependency Parsing |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-4018/ |
https://www.aclweb.org/anthology/P16-4018 | |
PWC | https://paperswithcode.com/paper/jigg-a-framework-for-an-easy-natural-language |
Repo | https://github.com/mynlp/jigg |
Framework | none |
Entity Linking with a Paraphrase Flavor
Title | Entity Linking with a Paraphrase Flavor |
Authors | Maria Pershina, Yifan He, Ralph Grishman |
Abstract | The task of Named Entity Linking is to link entity mentions in the document to their correct entries in a knowledge base and to cluster NIL mentions. Ambiguous, misspelled, and incomplete entity mention names are the main challenges in the linking process. We propose a novel approach that combines two state-of-the-art models, one for entity disambiguation and one for paraphrase detection, to overcome these challenges. We consider name variations as paraphrases of the same entity mention and adopt a paraphrase model for this task. Our approach utilizes a graph-based disambiguation model based on Personalized PageRank, and then refines and clusters its output using the paraphrase similarity between entity mention strings. It achieves a competitive performance of 80.5% in B3+F clustering score on diagnostic TAC EDL 2014 data. |
Tasks | Entity Disambiguation, Entity Linking |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1088/ |
https://www.aclweb.org/anthology/L16-1088 | |
PWC | https://paperswithcode.com/paper/entity-linking-with-a-paraphrase-flavor |
Repo | https://github.com/masha-p/paraphrase_flavor |
Framework | none |
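Personalized PageRank, which the abstract above names as the basis of its disambiguation model, can be illustrated with a generic power iteration. This is a sketch of the textbook algorithm only; the toy graph, the node names, the damping value, and the handling of dangling nodes are assumptions, and nothing here reflects how the authors build their candidate graph.

```python
def personalized_pagerank(graph, seed, damping=0.85, iterations=100):
    """Power iteration in which all teleport mass returns to a single seed node.

    `graph` maps every node to the list of nodes it links to; each linked
    node must also appear as a key.
    """
    rank = {node: 1.0 if node == seed else 0.0 for node in graph}
    for _ in range(iterations):
        nxt = {node: (1.0 - damping) if node == seed else 0.0 for node in graph}
        for node, targets in graph.items():
            if not targets:
                # Dangling node: send its mass back to the seed (an assumed convention).
                nxt[seed] += damping * rank[node]
                continue
            share = damping * rank[node] / len(targets)
            for target in targets:
                nxt[target] += share
        rank = nxt
    return rank


# Hypothetical three-node graph: A links to B and C, which both link back to A.
toy_graph = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
scores = personalized_pagerank(toy_graph, seed="A")
print(scores)
```

The scores form a probability distribution concentrated near the seed, so nodes that are well connected to the seed rank above nodes that are not.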
Generative Topic Embedding: a Continuous Representation of Documents
Title | Generative Topic Embedding: a Continuous Representation of Documents |
Authors | Shaohua Li, Tat-Seng Chua, Jun Zhu, Chunyan Miao |
Abstract | |
Tasks | Document Classification, Topic Models |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1063/ |
https://www.aclweb.org/anthology/P16-1063 | |
PWC | https://paperswithcode.com/paper/generative-topic-embedding-a-continuous |
Repo | https://github.com/askerlee/topicvec |
Framework | none |
Action Recognition Based on Optimal Joint Selection and Discriminative Depth Descriptor
Title | Action Recognition Based on Optimal Joint Selection and Discriminative Depth Descriptor |
Authors | Haomiao Ni, Hong Liu, Xiangdong Wang, Yueliang Qian |
Abstract | This paper proposes a novel human action recognition method using the decision-level fusion of both skeleton and depth sequences. Firstly, a state-of-the-art descriptor, RBPL (relative body part locations), is adopted to represent the skeleton. However, the original RBPL employs all the available joints, which may introduce redundancy or noise. This paper proposes an adaptive optimal joint selection model, applied before RBPL, that is based on the distance traveled by the joints for each action and can remove redundant joints. Then we use dynamic time warping to handle temporal misalignment and adopt KELM (kernel-based extreme learning machine) for action recognition. Secondly, an efficient feature descriptor, DMM-disLBP (depth motion maps-based discriminative local binary patterns), is constructed to describe depth sequences, and KELM is also used for classification. Finally, we present an effective decision fusion for action recognition based on the maximum sum of decision values from skeleton and depth maps. Compared with the baseline methods, we improve the performance using either skeleton or depth information, and achieve state-of-the-art average recognition accuracy on the public dataset MSR Action3D using the proposed fusion strategy. |
Tasks | Temporal Action Localization |
Published | 2016-11-27 |
URL | https://www.researchgate.net/publication/308721518_Action_Recognition_Based_on_Optimal_Joint_Selection_and_Discriminative_Depth_Descriptor |
https://www.researchgate.net/publication/308721518_Action_Recognition_Based_on_Optimal_Joint_Selection_and_Discriminative_Depth_Descriptor | |
PWC | https://paperswithcode.com/paper/action-recognition-based-on-optimal-joint |
Repo | https://github.com/nihaomiao/ACCV16_DFMSDV |
Framework | none |
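The final fusion step described in the abstract above, choosing the class with the maximum sum of decision values from the skeleton and depth classifiers, might look like the following. This is a sketch under assumptions: the function name and the score values are hypothetical, and the abstract does not say whether the decision values are normalized before they are summed.

```python
def fuse_by_max_sum(skeleton_scores, depth_scores):
    """Return the index of the class whose summed decision value is largest."""
    if len(skeleton_scores) != len(depth_scores):
        raise ValueError("both classifiers must score the same set of classes")
    summed = [s + d for s, d in zip(skeleton_scores, depth_scores)]
    return max(range(len(summed)), key=summed.__getitem__)


# Hypothetical per-class decision values for three action classes.
skeleton = [0.2, 0.5, 0.3]  # the skeleton classifier alone would pick class 1
depth = [0.6, 0.1, 0.3]     # the depth classifier alone would pick class 0
predicted = fuse_by_max_sum(skeleton, depth)
print(predicted)
```

Summing lets a confident prediction from one modality outweigh a weak preference from the other, which is why the fused decision can differ from either classifier on its own.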
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
Title | "Why Should I Trust You?": Explaining the Predictions of Any Classifier |
Authors | Marco Ribeiro, Sameer Singh, Carlos Guestrin |
Abstract | |
Tasks | Document Classification, Feature Engineering, Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-3020/ |
https://www.aclweb.org/anthology/N16-3020 | |
PWC | https://paperswithcode.com/paper/awhy-should-i-trust-youa-explaining-the |
Repo | https://github.com/UW-MODE/naacl16-demo |
Framework | none |
Sense-annotating a Lexical Substitution Data Set with Ubyline
Title | Sense-annotating a Lexical Substitution Data Set with Ubyline |
Authors | Tristan Miller, Mohamed Khemakhem, Richard Eckart de Castilho, Iryna Gurevych |
Abstract | We describe the construction of GLASS, a newly sense-annotated version of the German lexical substitution data set used at the GermEval 2015: LexSub shared task. Using the two annotation layers, we conduct the first known empirical study of the relationship between manually applied word senses and lexical substitutions. We find that synonymy and hypernymy/hyponymy are the only semantic relations directly linking targets to their substitutes, and that substitutes in the target's hypernymy/hyponymy taxonomy closely align with the synonyms of a single GermaNet synset. Despite this, these substitutes account for a minority of those provided by the annotators. The results of our analysis accord with those of a previous study on English-language data (albeit with automatically induced word senses), leading us to suspect that the sense-substitution relations we discovered may be of a universal nature. We also tentatively conclude that relatively cheap lexical substitution annotations can be used as a knowledge source for automatic WSD. Also introduced in this paper is Ubyline, the web application used to produce the sense annotations. Ubyline presents an intuitive user interface optimized for annotating lexical sample data, and is readily adaptable to sense inventories other than GermaNet. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1134/ |
https://www.aclweb.org/anthology/L16-1134 | |
PWC | https://paperswithcode.com/paper/sense-annotating-a-lexical-substitution-data |
Repo | https://github.com/UKPLab/lrec2016-ubyline |
Framework | none |
Can Topic Modelling benefit from Word Sense Information?
Title | Can Topic Modelling benefit from Word Sense Information? |
Authors | Adriana Ferrugento, Hugo Gonçalo Oliveira, Ana Alves, Filipe Rodrigues |
Abstract | This paper proposes a new topic model that exploits word sense information in order to discover less redundant and more informative topics. Word sense information is obtained from WordNet and the discovered topics are groups of synsets, instead of mere surface words. A key feature is that all the known senses of a word are considered, with their probabilities. Alternative configurations of the model are described and compared to each other and to LDA, the most popular topic model. However, the obtained results suggest that there are no benefits of enriching LDA with word sense information. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1540/ |
https://www.aclweb.org/anthology/L16-1540 | |
PWC | https://paperswithcode.com/paper/can-topic-modelling-benefit-from-word-sense |
Repo | https://github.com/aferrugento/SemLDA |
Framework | none |
Creating Linked Data Morphological Language Resources with MMoOn - The Hebrew Morpheme Inventory
Title | Creating Linked Data Morphological Language Resources with MMoOn - The Hebrew Morpheme Inventory |
Authors | Bettina Klimek, Natanael Arndt, Sebastian Krause, Timotheus Arndt |
Abstract | The development of standard models for describing general lexical resources has led to the emergence of numerous lexical datasets of various languages in the Semantic Web. However, equivalent models covering the linguistic domain of morphology do not exist. As a result, there are hardly any language resources of morphemic data available in RDF to date. This paper presents the creation of the Hebrew Morpheme Inventory from a manually compiled tabular dataset comprising around 52,000 entries. It is an ongoing effort to represent the lexemes, word-forms and morphological patterns together with their underlying relations based on the newly created Multilingual Morpheme Ontology (MMoOn). It will be shown how segmented Hebrew language data can be granularly described in a Linked Data format, thus serving as an exemplary case for creating morpheme inventories of any inflectional language with MMoOn. The resulting dataset is described a) according to the structure of the underlying data format, b) with respect to the Hebrew language characteristic of building word-forms directly from roots, c) by exemplifying how inflectional information is realized and d) with regard to its enrichment with external links to sense resources. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1143/ |
https://www.aclweb.org/anthology/L16-1143 | |
PWC | https://paperswithcode.com/paper/creating-linked-data-morphological-language |
Repo | https://github.com/aksw/mmoon |
Framework | none |
Semi-automatic Detection of Cross-lingual Marketing Blunders based on Pragmatic Label Propagation in Wiktionary
Title | Semi-automatic Detection of Cross-lingual Marketing Blunders based on Pragmatic Label Propagation in Wiktionary |
Authors | Christian M. Meyer, Judith Eckle-Kohler, Iryna Gurevych |
Abstract | We introduce the task of detecting cross-lingual marketing blunders, which occur if a trade name resembles an inappropriate or negatively connotated word in a target language. To this end, we suggest a formal task definition and a semi-automatic method based on the propagation of pragmatic labels from Wiktionary across sense-disambiguated translations. Our final tool assists users by providing clues for problematic names in any language, which we simulate in two experiments on detecting previously occurred marketing blunders and identifying relevant clues for established international brands. We conclude the paper with a suggested research roadmap for this new task. To initiate further research, we publish our online demo along with the source code and data at http://uby.ukp.informatik.tu-darmstadt.de/blunder/. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1195/ |
https://www.aclweb.org/anthology/C16-1195 | |
PWC | https://paperswithcode.com/paper/semi-automatic-detection-of-cross-lingual |
Repo | https://github.com/UKPLab/coling2016-marketing-blunders |
Framework | none |
Online SSVEP-based BCI using Riemannian geometry
Title | Online SSVEP-based BCI using Riemannian geometry |
Authors | Emmanuel Kalunga, Sylvain Chevallier, Quentin Barthélemy, Karim Djouani, Eric Monacelli, Yskandar Hamam |
Abstract | Challenges for the next generation of Brain Computer Interfaces (BCI) are to mitigate the common sources of variability (electronic, electrical, biological) and to develop online and adaptive systems following the evolution of the subject's brain waves. Studying electroencephalographic (EEG) signals from their associated covariance matrices allows the construction of a representation which is invariant to extrinsic perturbations. As covariance matrices must be estimated, this paper first presents a thorough study of all estimators conducted on real EEG recordings. Working in Euclidean space with covariance matrices is known to be error-prone; one might instead take advantage of algorithmic advances in Riemannian geometry and matrix manifolds to implement methods for Symmetric Positive-Definite (SPD) matrices. Nonetheless, existing classification algorithms in Riemannian spaces are designed for offline analysis. We propose a novel algorithm for online and asynchronous processing of brain signals, borrowing principles from semi-unsupervised approaches and following a dynamic stopping scheme to provide a prediction as soon as possible. The assessment is conducted on real EEG recordings: this is the first study on Steady-State Visually Evoked Potential (SSVEP) experiments to exploit online classification based on Riemannian geometry. The proposed online algorithm is evaluated and compared with state-of-the-art SSVEP methods, which are based on Canonical Correlation Analysis (CCA). It is shown to improve both the classification accuracy and the information transfer rate in the online and asynchronous setup. |
Tasks | EEG |
Published | 2016-05-26 |
URL | https://hal.archives-ouvertes.fr/hal-01351623 |
https://hal.archives-ouvertes.fr/hal-01351623/document | |
PWC | https://paperswithcode.com/paper/online-ssvep-based-bci-using-riemannian |
Repo | https://github.com/emmanuelkalunga/Online-SSVEP |
Framework | none |
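To make concrete what comparing covariance matrices on the SPD manifold involves, the sketch below computes the affine-invariant Riemannian distance for the 2x2 case using only the standard library. The choice of this particular metric is an assumption (the abstract above does not name the distance it uses), and the function name and example matrices are hypothetical.

```python
import math


def airm_distance_2x2(a, b):
    """Affine-invariant Riemannian distance between two 2x2 SPD matrices.

    The distance is sqrt(sum(log(lam_i) ** 2)), where lam_i are the
    eigenvalues of inverse(a) @ b.
    """
    (a11, a12), (_, a22) = a
    (b11, b12), (_, b22) = b
    det_a = a11 * a22 - a12 * a12
    det_b = b11 * b22 - b12 * b12
    if min(a11, b11, det_a, det_b) <= 0:
        raise ValueError("both matrices must be symmetric positive-definite")
    # Trace and determinant of inverse(a) @ b give its two eigenvalues directly.
    trace = (a22 * b11 - 2 * a12 * b12 + a11 * b22) / det_a
    det = det_b / det_a
    root = math.sqrt(max(trace * trace - 4 * det, 0.0))
    eigenvalues = ((trace + root) / 2, (trace - root) / 2)
    return math.sqrt(sum(math.log(v) ** 2 for v in eigenvalues))


identity = [[1.0, 0.0], [0.0, 1.0]]
stretched = [[math.e, 0.0], [0.0, 1.0]]
cov_a = [[2.0, 1.0], [1.0, 2.0]]
cov_b = [[1.0, 0.0], [0.0, 3.0]]

print(airm_distance_2x2(identity, stretched))
print(airm_distance_2x2(cov_a, cov_b), airm_distance_2x2(cov_b, cov_a))
```

Unlike the Euclidean difference between matrix entries, this distance is symmetric in its arguments and unchanged when both matrices undergo the same invertible linear transformation, one common motivation for working on the manifold.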
TMop: a Tool for Unsupervised Translation Memory Cleaning
Title | TMop: a Tool for Unsupervised Translation Memory Cleaning |
Authors | Masoud Jalili Sabet, Matteo Negri, Marco Turchi, José G. C. de Souza, Marcello Federico |
Abstract | |
Tasks | Machine Translation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-4009/ |
https://www.aclweb.org/anthology/P16-4009 | |
PWC | https://paperswithcode.com/paper/tmop-a-tool-for-unsupervised-translation |
Repo | https://github.com/hlt-mt/TMOP |
Framework | none |