May 5, 2019

Paper Group NAWR 12

Reassessing the Goals of Grammatical Error Correction: Fluency Instead of Grammaticality

Title Reassessing the Goals of Grammatical Error Correction: Fluency Instead of Grammaticality
Authors Keisuke Sakaguchi, Courtney Napoles, Matt Post, Joel Tetreault
Abstract The field of grammatical error correction (GEC) has grown substantially in recent years, with research directed at both evaluation metrics and improved system performance against those metrics. One unvisited assumption, however, is the reliance of GEC evaluation on error-coded corpora, which contain specific labeled corrections. We examine current practices and show that GEC's reliance on such corpora unnaturally constrains annotation and automatic evaluation, resulting in (a) sentences that do not sound acceptable to native speakers and (b) system rankings that do not correlate with human judgments. In light of this, we propose an alternate approach that jettisons costly error coding in favor of unannotated, whole-sentence rewrites. We compare the performance of existing metrics over different gold-standard annotations, and show that automatic evaluation with our new annotation scheme has very strong correlation with expert rankings (ρ = 0.82). As a result, we advocate for a fundamental and necessary shift in the goal of GEC, from correcting small, labeled error types, to producing text that has native fluency.
Tasks Grammatical Error Correction
Published 2016-01-01
URL https://www.aclweb.org/anthology/Q16-1013/
PDF https://www.aclweb.org/anthology/Q16-1013
PWC https://paperswithcode.com/paper/reassessing-the-goals-of-grammatical-error
Repo https://github.com/keisks/reassess-gec
Framework none

Fast ε-free Inference of Simulation Models with Bayesian Conditional Density Estimation

Title Fast ε-free Inference of Simulation Models with Bayesian Conditional Density Estimation
Authors George Papamakarios, Iain Murray
Abstract Many statistical models can be simulated forwards but have intractable likelihoods. Approximate Bayesian Computation (ABC) methods are used to infer properties of these models from data. Traditionally these methods approximate the posterior over parameters by conditioning on data being inside an ε-ball around the observed data, which is only correct in the limit ε→0. Monte Carlo methods can then draw samples from the approximate posterior to approximate predictions or error bars on parameters. These algorithms critically slow down as ε→0, and in practice draw samples from a broader distribution than the posterior. We propose a new approach to likelihood-free inference based on Bayesian conditional density estimation. Preliminary inferences based on limited simulation data are used to guide later simulations. In some cases, learning an accurate parametric representation of the entire true posterior distribution requires fewer model simulations than Monte Carlo ABC methods need to produce a single sample from an approximate posterior.
Tasks Density Estimation
Published 2016-12-01
URL http://papers.nips.cc/paper/6084-fast-free-inference-of-simulation-models-with-bayesian-conditional-density-estimation
PDF http://papers.nips.cc/paper/6084-fast-free-inference-of-simulation-models-with-bayesian-conditional-density-estimation.pdf
PWC https://paperswithcode.com/paper/fast-free-inference-of-simulation-models-with-1
Repo https://github.com/gpapamak/epsilon_free_inference
Framework none
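
The traditional scheme the abstract contrasts against (accepting parameters whose simulated data fall inside an ε-ball around the observation) can be illustrated with a minimal rejection-ABC sketch. This is not the paper's method or code from the linked repository: the Gaussian simulator, the uniform prior, and every function name below are assumptions chosen only to make the example runnable.

```python
import random


def rejection_abc(observed, simulate, sample_prior, eps, n_samples, rng):
    """Rejection ABC: keep prior draws whose simulated data land within eps
    of the observed data. Returns the accepted parameters and the number of
    simulator calls that were needed."""
    accepted = []
    n_simulations = 0
    while len(accepted) < n_samples:
        theta = sample_prior(rng)          # draw a candidate parameter
        x = simulate(theta, rng)           # run the forward model once
        n_simulations += 1
        if abs(x - observed) <= eps:       # inside the eps-ball: accept
            accepted.append(theta)
    return accepted, n_simulations


def sample_prior(rng):
    # Hypothetical prior: uniform over [-5, 5].
    return rng.uniform(-5.0, 5.0)


def simulate(theta, rng):
    # Hypothetical simulator: one noisy observation centred on theta.
    return rng.gauss(theta, 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    for eps in (1.0, 0.1):
        samples, cost = rejection_abc(2.0, simulate, sample_prior, eps, 200, rng)
        mean = sum(samples) / len(samples)
        print(f"eps={eps}: mean={mean:.2f}, simulations={cost}")
```

Shrinking `eps` makes the accepted set a closer approximation to the posterior but lowers the acceptance rate, so the simulation count grows; this is the slowdown the abstract describes as ε→0.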

Addressee and Response Selection for Multi-Party Conversation

Title Addressee and Response Selection for Multi-Party Conversation
Authors Hiroki Ouchi, Yuta Tsuboi
Abstract
Tasks Short-Text Conversation
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1231/
PDF https://www.aclweb.org/anthology/D16-1231
PWC https://paperswithcode.com/paper/addressee-and-response-selection-for-multi
Repo https://github.com/hiroki13/response-ranking
Framework none

Named Entity Recognition in Swedish Health Records with Character-Based Deep Bidirectional LSTMs

Title Named Entity Recognition in Swedish Health Records with Character-Based Deep Bidirectional LSTMs
Authors Simon Almgren, Sean Pavlov, Olof Mogren
Abstract We propose an approach for named entity recognition in medical data, using a character-based deep bidirectional recurrent neural network. Such models can learn features and patterns based on the character sequence, and are not limited to a fixed vocabulary. This makes them very well suited for the NER task in the medical domain. Our experimental evaluation shows promising results, with a 60% improvement in F1 score over the baseline, and our system generalizes well between different datasets.
Tasks Boundary Detection, Feature Engineering, Named Entity Recognition
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5104/
PDF https://www.aclweb.org/anthology/W16-5104
PWC https://paperswithcode.com/paper/named-entity-recognition-in-swedish-health
Repo https://github.com/olofmogren/biomedical-ner-data-swedish
Framework none

Jigg: A Framework for an Easy Natural Language Processing Pipeline

Title Jigg: A Framework for an Easy Natural Language Processing Pipeline
Authors Hiroshi Noji, Yusuke Miyao
Abstract
Tasks Coreference Resolution, Dependency Parsing
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-4018/
PDF https://www.aclweb.org/anthology/P16-4018
PWC https://paperswithcode.com/paper/jigg-a-framework-for-an-easy-natural-language
Repo https://github.com/mynlp/jigg
Framework none

Entity Linking with a Paraphrase Flavor

Title Entity Linking with a Paraphrase Flavor
Authors Maria Pershina, Yifan He, Ralph Grishman
Abstract The task of Named Entity Linking is to link entity mentions in the document to their correct entries in a knowledge base and to cluster NIL mentions. Ambiguous, misspelled, and incomplete entity mention names are the main challenges in the linking process. We propose a novel approach that combines two state-of-the-art models, one for entity disambiguation and one for paraphrase detection, to overcome these challenges. We consider name variations as paraphrases of the same entity mention and adopt a paraphrase model for this task. Our approach utilizes a graph-based disambiguation model based on Personalized Page Rank, and then refines and clusters its output using the paraphrase similarity between entity mention strings. It achieves a competitive performance of 80.5% in B3+F clustering score on diagnostic TAC EDL 2014 data.
Tasks Entity Disambiguation, Entity Linking
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1088/
PDF https://www.aclweb.org/anthology/L16-1088
PWC https://paperswithcode.com/paper/entity-linking-with-a-paraphrase-flavor
Repo https://github.com/masha-p/paraphrase_flavor
Framework none

Generative Topic Embedding: a Continuous Representation of Documents

Title Generative Topic Embedding: a Continuous Representation of Documents
Authors Shaohua Li, Tat-Seng Chua, Jun Zhu, Chunyan Miao
Abstract
Tasks Document Classification, Topic Models
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1063/
PDF https://www.aclweb.org/anthology/P16-1063
PWC https://paperswithcode.com/paper/generative-topic-embedding-a-continuous
Repo https://github.com/askerlee/topicvec
Framework none

Action Recognition Based on Optimal Joint Selection and Discriminative Depth Descriptor

Title Action Recognition Based on Optimal Joint Selection and Discriminative Depth Descriptor
Authors Haomiao Ni, Hong Liu, Xiangdong Wang, Yueliang Qian
Abstract This paper proposes a novel human action recognition method using the decision-level fusion of both skeleton and depth sequences. Firstly, a state-of-the-art descriptor, RBPL (relative body part locations), is adopted to represent the skeleton. However, the original RBPL employs all the available joints, which may introduce redundancy or noise. This paper proposes an adaptive optimal joint selection model, applied before RBPL and based on the distance traveled by joints for each action, which can reduce redundant joints. We then use dynamic time warping to handle temporal misalignment and adopt KELM (kernel-based extreme learning machine) for action recognition. Secondly, an efficient feature descriptor, DMM-disLBP (depth motion maps-based discriminative local binary patterns), is constructed to describe depth sequences, and KELM is also used for classification. Finally, we present an effective decision fusion for action recognition based on the maximum sum of decision values from skeleton and depth maps. Compared with the baseline methods, we improve the performance using either skeleton or depth information, and achieve state-of-the-art average recognition accuracy on the public dataset MSR Action3D using the proposed fusion strategy.
Tasks Temporal Action Localization
Published 2016-11-27
URL https://www.researchgate.net/publication/308721518_Action_Recognition_Based_on_Optimal_Joint_Selection_and_Discriminative_Depth_Descriptor
PDF https://www.researchgate.net/publication/308721518_Action_Recognition_Based_on_Optimal_Joint_Selection_and_Discriminative_Depth_Descriptor
PWC https://paperswithcode.com/paper/action-recognition-based-on-optimal-joint
Repo https://github.com/nihaomiao/ACCV16_DFMSDV
Framework none
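
The final step in the abstract, fusing the two classifiers by the maximum sum of their decision values, can be sketched as follows. This is only an illustration under assumptions: the function name is hypothetical, the scores are made-up numbers, and the abstract does not say whether the decision values are normalized before summing, so none is applied here.

```python
def fuse_by_max_sum(skeleton_scores, depth_scores):
    """Decision-level fusion: add the per-class decision values from the two
    classifiers and return the index of the class with the largest sum."""
    if len(skeleton_scores) != len(depth_scores):
        raise ValueError("both classifiers must score the same set of classes")
    summed = [s + d for s, d in zip(skeleton_scores, depth_scores)]
    # Index of the largest summed score; ties resolve to the lowest index.
    return max(range(len(summed)), key=summed.__getitem__)


if __name__ == "__main__":
    # Hypothetical decision values for three action classes.
    skeleton = [0.2, 0.5, 0.3]   # skeleton-based classifier prefers class 1
    depth = [0.6, 0.1, 0.3]      # depth-based classifier prefers class 0
    print(fuse_by_max_sum(skeleton, depth))
```

In this made-up case the two classifiers disagree, and the fused decision follows whichever class has the larger combined score rather than either classifier's individual top choice.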

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

Title "Why Should I Trust You?": Explaining the Predictions of Any Classifier
Authors Marco Ribeiro, Sameer Singh, Carlos Guestrin
Abstract
Tasks Document Classification, Feature Engineering, Sentiment Analysis
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-3020/
PDF https://www.aclweb.org/anthology/N16-3020
PWC https://paperswithcode.com/paper/awhy-should-i-trust-youa-explaining-the
Repo https://github.com/UW-MODE/naacl16-demo
Framework none

Sense-annotating a Lexical Substitution Data Set with Ubyline

Title Sense-annotating a Lexical Substitution Data Set with Ubyline
Authors Tristan Miller, Mohamed Khemakhem, Richard Eckart de Castilho, Iryna Gurevych
Abstract We describe the construction of GLASS, a newly sense-annotated version of the German lexical substitution data set used at the GermEval 2015: LexSub shared task. Using the two annotation layers, we conduct the first known empirical study of the relationship between manually applied word senses and lexical substitutions. We find that synonymy and hypernymy/hyponymy are the only semantic relations directly linking targets to their substitutes, and that substitutes in the target's hypernymy/hyponymy taxonomy closely align with the synonyms of a single GermaNet synset. Despite this, these substitutes account for a minority of those provided by the annotators. The results of our analysis accord with those of a previous study on English-language data (albeit with automatically induced word senses), leading us to suspect that the sense-substitution relations we discovered may be of a universal nature. We also tentatively conclude that relatively cheap lexical substitution annotations can be used as a knowledge source for automatic WSD. Also introduced in this paper is Ubyline, the web application used to produce the sense annotations. Ubyline presents an intuitive user interface optimized for annotating lexical sample data, and is readily adaptable to sense inventories other than GermaNet.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1134/
PDF https://www.aclweb.org/anthology/L16-1134
PWC https://paperswithcode.com/paper/sense-annotating-a-lexical-substitution-data
Repo https://github.com/UKPLab/lrec2016-ubyline
Framework none

Can Topic Modelling benefit from Word Sense Information?

Title Can Topic Modelling benefit from Word Sense Information?
Authors Adriana Ferrugento, Hugo Gonçalo Oliveira, Ana Alves, Filipe Rodrigues
Abstract This paper proposes a new topic model that exploits word sense information in order to discover less redundant and more informative topics. Word sense information is obtained from WordNet and the discovered topics are groups of synsets, instead of mere surface words. A key feature is that all the known senses of a word are considered, with their probabilities. Alternative configurations of the model are described and compared to each other and to LDA, the most popular topic model. However, the obtained results suggest that there are no benefits of enriching LDA with word sense information.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1540/
PDF https://www.aclweb.org/anthology/L16-1540
PWC https://paperswithcode.com/paper/can-topic-modelling-benefit-from-word-sense
Repo https://github.com/aferrugento/SemLDA
Framework none

Creating Linked Data Morphological Language Resources with MMoOn - The Hebrew Morpheme Inventory

Title Creating Linked Data Morphological Language Resources with MMoOn - The Hebrew Morpheme Inventory
Authors Bettina Klimek, Natanael Arndt, Sebastian Krause, Timotheus Arndt
Abstract The development of standard models for describing general lexical resources has led to the emergence of numerous lexical datasets of various languages in the Semantic Web. However, equivalent models covering the linguistic domain of morphology do not exist. As a result, there are hardly any language resources of morphemic data available in RDF to date. This paper presents the creation of the Hebrew Morpheme Inventory from a manually compiled tabular dataset comprising around 52,000 entries. It is an ongoing effort to represent the lexemes, word-forms and morphological patterns together with their underlying relations based on the newly created Multilingual Morpheme Ontology (MMoOn). It will be shown how segmented Hebrew language data can be granularly described in a Linked Data format, thus serving as an exemplary case for creating morpheme inventories of any inflectional language with MMoOn. The resulting dataset is described a) according to the structure of the underlying data format, b) with respect to the Hebrew language characteristic of building word-forms directly from roots, c) by exemplifying how inflectional information is realized and d) with regard to its enrichment with external links to sense resources.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1143/
PDF https://www.aclweb.org/anthology/L16-1143
PWC https://paperswithcode.com/paper/creating-linked-data-morphological-language
Repo https://github.com/aksw/mmoon
Framework none

Semi-automatic Detection of Cross-lingual Marketing Blunders based on Pragmatic Label Propagation in Wiktionary

Title Semi-automatic Detection of Cross-lingual Marketing Blunders based on Pragmatic Label Propagation in Wiktionary
Authors Christian M. Meyer, Judith Eckle-Kohler, Iryna Gurevych
Abstract We introduce the task of detecting cross-lingual marketing blunders, which occur if a trade name resembles an inappropriate or negatively connotated word in a target language. To this end, we suggest a formal task definition and a semi-automatic method based on the propagation of pragmatic labels from Wiktionary across sense-disambiguated translations. Our final tool assists users by providing clues for problematic names in any language, which we simulate in two experiments on detecting previously occurred marketing blunders and identifying relevant clues for established international brands. We conclude the paper with a suggested research roadmap for this new task. To initiate further research, we publish our online demo along with the source code and data at http://uby.ukp.informatik.tu-darmstadt.de/blunder/.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1195/
PDF https://www.aclweb.org/anthology/C16-1195
PWC https://paperswithcode.com/paper/semi-automatic-detection-of-cross-lingual
Repo https://github.com/UKPLab/coling2016-marketing-blunders
Framework none

Online SSVEP-based BCI using Riemannian geometry

Title Online SSVEP-based BCI using Riemannian geometry
Authors Emmanuel Kalunga, Sylvain Chevallier, Quentin Barthélemy, Karim Djouani, Eric Monacelli, Yskandar Hamam
Abstract Challenges for the next generation of Brain Computer Interfaces (BCI) are to mitigate the common sources of variability (electronic, electrical, biological) and to develop online and adaptive systems following the evolution of the subject's brain waves. Studying electroencephalographic (EEG) signals from their associated covariance matrices allows the construction of a representation which is invariant to extrinsic perturbations. As covariance matrices should be estimated, this paper first presents a thorough study of all estimators conducted on real EEG recordings. Since working in Euclidean space with covariance matrices is known to be error-prone, one might take advantage of algorithmic advances in Riemannian geometry and matrix manifolds to implement methods for Symmetric Positive-Definite (SPD) matrices. Nonetheless, existing classification algorithms in Riemannian spaces are designed for offline analysis. We propose a novel algorithm for online and asynchronous processing of brain signals, borrowing principles from semi-unsupervised approaches and following a dynamic stopping scheme to provide a prediction as soon as possible. The assessment is conducted on real EEG recordings: this is the first study on Steady-State Visually Evoked Potential (SSVEP) experimentations to exploit online classification based on Riemannian geometry. The proposed online algorithm is evaluated and compared with state-of-the-art SSVEP methods, which are based on Canonical Correlation Analysis (CCA). It is shown to improve both the classification accuracy and the information transfer rate in the online and asynchronous setup.
Tasks EEG
Published 2016-05-26
URL https://hal.archives-ouvertes.fr/hal-01351623
PDF https://hal.archives-ouvertes.fr/hal-01351623/document
PWC https://paperswithcode.com/paper/online-ssvep-based-bci-using-riemannian
Repo https://github.com/emmanuelkalunga/Online-SSVEP
Framework none

TMop: a Tool for Unsupervised Translation Memory Cleaning

Title TMop: a Tool for Unsupervised Translation Memory Cleaning
Authors Masoud Jalili Sabet, Matteo Negri, Marco Turchi, José G. C. de Souza, Marcello Federico
Abstract
Tasks Machine Translation
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-4009/
PDF https://www.aclweb.org/anthology/P16-4009
PWC https://paperswithcode.com/paper/tmop-a-tool-for-unsupervised-translation
Repo https://github.com/hlt-mt/TMOP
Framework none