July 26, 2019

Paper Group NANR 55

Proceedings of the Workshop Human-Informed Translation and Interpreting Technology. Word Re-Embedding via Manifold Dimensionality Retention. Improving the Character Ngram Model for the DSL Task with BM25 Weighting and Less Frequently Used Feature Sets. Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2 …

Proceedings of the Workshop Human-Informed Translation and Interpreting Technology

Title Proceedings of the Workshop Human-Informed Translation and Interpreting Technology
Authors Irina Temnikova, Constantin Orasan, Gloria Corpas Pastor, Stephan Vogel
Abstract
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/papers/W17-7900/w17-7900
PDF https://www.aclweb.org/anthology/W17-7900
PWC https://paperswithcode.com/paper/proceedings-of-the-workshop-human-informed
Repo
Framework

Word Re-Embedding via Manifold Dimensionality Retention

Title Word Re-Embedding via Manifold Dimensionality Retention
Authors Souleiman Hasan, Edward Curry
Abstract Word embeddings seek to recover a Euclidean metric space by mapping words into vectors, starting from word co-occurrences in a corpus. Word embeddings may underestimate the similarity between nearby words, and overestimate it between distant words in the Euclidean metric space. In this paper, we re-embed pre-trained word embeddings with a stage of manifold learning which retains dimensionality. We show that this approach is theoretically founded in the metric recovery paradigm, and empirically show that it can improve on state-of-the-art embeddings in word similarity tasks by 0.5 to 5.0 percentage points, depending on the original space.
Tasks Machine Translation, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1033/
PDF https://www.aclweb.org/anthology/D17-1033
PWC https://paperswithcode.com/paper/word-re-embedding-via-manifold-dimensionality
Repo
Framework

Improving the Character Ngram Model for the DSL Task with BM25 Weighting and Less Frequently Used Feature Sets

Title Improving the Character Ngram Model for the DSL Task with BM25 Weighting and Less Frequently Used Feature Sets
Authors Yves Bestgen
Abstract This paper describes the system developed by the Centre for English Corpus Linguistics (CECL) for discriminating between similar languages, language varieties and dialects. Based on an SVM with character and POS-tag n-grams as features and the BM25 weighting scheme, it achieved 92.7% accuracy in the Discriminating between Similar Languages (DSL) task, ranking first among eleven systems but with a lead over the next three teams of only 0.2%. A simpler version of the system ranked second in the German Dialect Identification (GDI) task thanks to several ad hoc postprocessing steps. Complementary analyses carried out by a cross-validation procedure suggest that the BM25 weighting scheme could be competitive in this type of task, at least in comparison with the sublinear TF-IDF. POS-tag n-grams also improved the system performance.
Tasks Language Identification
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1214/
PDF https://www.aclweb.org/anthology/W17-1214
PWC https://paperswithcode.com/paper/improving-the-character-ngram-model-for-the
Repo
Framework
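
The BM25 weighting mentioned in the abstract above can be illustrated with a small sketch. This is not the CECL system: it applies the textbook Okapi BM25 formula to character n-gram counts, and the function names (`char_ngrams`, `bm25_weights`), the n-gram length, and the `k1`/`b` defaults are all assumptions chosen for illustration. The resulting weights would then be fed to a classifier such as an SVM, which is omitted here.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def bm25_weights(docs, n=3, k1=1.2, b=0.75):
    """Weight each document's character n-grams with Okapi BM25.

    k1 and b are common textbook defaults, not values from the paper.
    Returns one dict (n-gram -> weight) per document.
    """
    counts = [Counter(char_ngrams(d, n)) for d in docs]
    lengths = [sum(c.values()) for c in counts]
    # Fall back to 1.0 if every document is shorter than n characters.
    avg_len = (sum(lengths) / len(docs)) or 1.0

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    n_docs = len(docs)
    weighted = []
    for c, length in zip(counts, lengths):
        # Length normalisation term shared by all n-grams of this document.
        norm = k1 * (1 - b + b * length / avg_len)
        w = {}
        for gram, tf in c.items():
            idf = math.log(1 + (n_docs - df[gram] + 0.5) / (df[gram] + 0.5))
            w[gram] = idf * tf * (k1 + 1) / (tf + norm)
        weighted.append(w)
    return weighted


weights = bm25_weights(["abcabc", "abcxyz"])
```

In this toy input, the trigram `abc` occurs in both documents, so it receives a lower weight than `bca`, which occurs in only one, even though `abc` is more frequent within the first document.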

Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Title Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Authors
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-2000/
PDF https://www.aclweb.org/anthology/I17-2000
PWC https://paperswithcode.com/paper/proceedings-of-the-eighth-international-joint-1
Repo
Framework

Towards Abstractive Multi-Document Summarization Using Submodular Function-Based Framework, Sentence Compression and Merging

Title Towards Abstractive Multi-Document Summarization Using Submodular Function-Based Framework, Sentence Compression and Merging
Authors Yllias Chali, Moin Tanvee, Mir Tafseer Nayeem
Abstract We propose a submodular function-based summarization system which integrates three important measures, namely importance, coverage, and non-redundancy, to detect the important sentences for the summary. We design monotone and submodular functions which allow us to apply an efficient and scalable greedy algorithm to obtain informative and well-covered summaries. In addition, we integrate two abstraction-based methods, namely sentence compression and merging, for generating an abstractive sentence set. We design our summarization models for both generic and query-focused summarization. Experimental results on the DUC-2004 and DUC-2007 datasets show that our generic and query-focused summarizers outperform state-of-the-art summarization systems in terms of ROUGE-1 and ROUGE-2 recall and F-measure.
Tasks Abstractive Text Summarization, Document Summarization, Multi-Document Summarization, Sentence Compression, Text Generation
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-2071/
PDF https://www.aclweb.org/anthology/I17-2071
PWC https://paperswithcode.com/paper/towards-abstractive-multi-document
Repo
Framework
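
The greedy selection over a monotone submodular objective described in the abstract above can be sketched as follows. The objective used here, the number of distinct words covered by the selected sentences, is a simple stand-in and not the authors' combination of importance, coverage, and non-redundancy; the function names and the sentence-count budget are likewise assumptions.

```python
def coverage(selected, word_sets):
    """Stand-in monotone submodular objective: distinct words covered."""
    covered = set()
    for i in selected:
        covered |= word_sets[i]
    return len(covered)


def greedy_summary(sentences, budget):
    """Pick up to `budget` sentence indices by largest marginal gain."""
    word_sets = [set(s.lower().split()) for s in sentences]
    selected = []
    while len(selected) < budget:
        current = coverage(selected, word_sets)
        best, best_gain = None, 0
        for i in range(len(word_sets)):
            if i in selected:
                continue
            # Marginal gain of adding sentence i to the current summary.
            gain = coverage(selected + [i], word_sets) - current
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            # No remaining sentence adds new words, so stop early.
            break
        selected.append(best)
    return selected


picked = greedy_summary(
    ["the cat sat", "the cat sat down", "dogs bark loudly"], budget=2
)
```

Because the marginal gain of a sentence shrinks once its words are already covered, the redundant first sentence is skipped in favour of the third one. For monotone submodular objectives under a cardinality constraint, this kind of greedy procedure is known to come within a factor of 1 - 1/e of the optimum.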

A corpus-based study on synesthesia in Korean ordinary language

Title A corpus-based study on synesthesia in Korean ordinary language
Authors Charmhun Jo
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1034/
PDF https://www.aclweb.org/anthology/Y17-1034
PWC https://paperswithcode.com/paper/a-corpus-based-study-on-synesthesia-in-korean
Repo
Framework

Intrusions of Masbate Lexicon in Local Bilingual Tabloid

Title Intrusions of Masbate Lexicon in Local Bilingual Tabloid
Authors Cecilia Genuino, Romualdo Mabuan
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1035/
PDF https://www.aclweb.org/anthology/Y17-1035
PWC https://paperswithcode.com/paper/intrusions-of-masbate-lexicon-in-local
Repo
Framework

Tweet Extraction for News Production Considering Unreality

Title Tweet Extraction for News Production Considering Unreality
Authors Yuka Takei, Taro Miyazaki, Ichiro Yamada, Jun Goto
Abstract
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/Y17-1049/
PDF https://www.aclweb.org/anthology/Y17-1049
PWC https://paperswithcode.com/paper/tweet-extraction-for-news-production
Repo
Framework

A Dataset and Classifier for Recognizing Social Media English

Title A Dataset and Classifier for Recognizing Social Media English
Authors Su Lin Blodgett, Johnny Wei, Brendan O'Connor
Abstract While language identification works well on standard texts, it performs much worse on social media language, in particular dialectal language, even for English. First, to support work on English language identification, we contribute a new dataset of tweets annotated for English versus non-English, with attention to ambiguity, code-switching, and automatic generation issues. It is randomly sampled from all public messages, avoiding biases towards pre-existing language classifiers. Second, we find that a demographic language model, which identifies messages with language similar to that used by several U.S. ethnic populations on Twitter, can be used to improve English language identification performance when combined with a traditional supervised language identifier. It increases recall with almost no loss of precision, including, surprisingly, for English messages written by non-U.S. authors. Our dataset and identifier ensemble are available online.
Tasks Language Identification, Language Modelling
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4408/
PDF https://www.aclweb.org/anthology/W17-4408
PWC https://paperswithcode.com/paper/a-dataset-and-classifier-for-recognizing
Repo
Framework

完全基於類神經網路之語音合成系統初步研究 (A Preliminary Study on Fully Neural Network-based Speech Synthesis System) [In Chinese]

Title 完全基於類神經網路之語音合成系統初步研究 (A Preliminary Study on Fully Neural Network-based Speech Synthesis System) [In Chinese]
Authors Shu-Han Liao, Ya-Bo Chai, Yuan-Fu Liao
Abstract
Tasks Speech Synthesis
Published 2017-11-01
URL https://www.aclweb.org/anthology/O17-1021/
PDF https://www.aclweb.org/anthology/O17-1021
PWC https://paperswithcode.com/paper/aa-ao14eccc2e-a1eae3ac3cac-c-a-preliminary
Repo
Framework

Automatic Morpheme Segmentation and Labeling in Universal Dependencies Resources

Title Automatic Morpheme Segmentation and Labeling in Universal Dependencies Resources
Authors Miikka Silfverberg, Mans Hulden
Abstract
Tasks Semantic Textual Similarity, Word Embeddings
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0418/
PDF https://www.aclweb.org/anthology/W17-0418
PWC https://paperswithcode.com/paper/automatic-morpheme-segmentation-and-labeling
Repo
Framework

Role-Preserving Redaction of Medical Records to Enable Ontology-Driven Processing

Title Role-Preserving Redaction of Medical Records to Enable Ontology-Driven Processing
Authors Seth Polsley, Atif Tahir, Muppala Raju, Akintayo Akinleye, Duane Steward
Abstract Electronic medical records (EMR) have largely replaced hand-written patient files in healthcare. The growing pool of EMR data presents a significant resource in medical research, but the U.S. Health Insurance Portability and Accountability Act (HIPAA) mandates redacting medical records before performing any analysis on the same. This process complicates obtaining medical data and can remove much useful information from the record. As part of a larger project involving ontology-driven medical processing, we employ a method of recognizing protected health information (PHI) that maps to ontological terms. We then use the relationships defined in the ontology to redact medical texts so that roles and semantics of terms are retained without compromising anonymity. The method is evaluated by clinical experts on several hundred medical documents, achieving up to a 98.8% F-score, and has already shown promise for retaining semantic information in later processing.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2324/
PDF https://www.aclweb.org/anthology/W17-2324
PWC https://paperswithcode.com/paper/role-preserving-redaction-of-medical-records
Repo
Framework
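
The idea of redacting text while keeping the role of each term, as described in the abstract above, can be illustrated with a toy sketch. It is not the authors' method: the `ROLE_OF` dictionary is a hypothetical stand-in for an ontology lookup, the names in it are invented, and PHI recognition is reduced to exact, case-insensitive string matching.

```python
import re

# Hypothetical mini "ontology": surface term -> role label (illustration only).
ROLE_OF = {
    "dr. smith": "PHYSICIAN",
    "john doe": "PATIENT",
    "mercy hospital": "FACILITY",
}


def redact(text, role_of=ROLE_OF):
    """Replace each known term with a numbered role placeholder."""
    if not role_of:
        return text
    placeholders = {}  # term -> placeholder, so repeated mentions stay linked
    counters = {}      # role -> number of distinct terms seen so far
    # Longest terms first so overlapping terms prefer the longer match.
    pattern = re.compile(
        "|".join(re.escape(t) for t in sorted(role_of, key=len, reverse=True)),
        re.IGNORECASE,
    )

    def substitute(match):
        term = match.group(0).lower()
        if term not in placeholders:
            role = role_of[term]
            counters[role] = counters.get(role, 0) + 1
            placeholders[term] = f"[{role}_{counters[role]}]"
        return placeholders[term]

    return pattern.sub(substitute, text)


redacted = redact("John Doe saw Dr. Smith. John Doe left.")
```

Both mentions of the same patient map to the same placeholder, so later processing can still tell who did what without seeing the name.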

Deep Neural Network based system for solving Arithmetic Word problems

Title Deep Neural Network based system for solving Arithmetic Word problems
Authors Purvanshi Mehta, Pruthwik Mishra, Vinayak Athavale, Manish Shrivastava, Dipti Sharma
Abstract This paper presents DILTON, a system which solves simple arithmetic word problems. DILTON uses a deep neural model to solve math word problems. DILTON divides the question into two parts: worldstate and query. The worldstate and the query are processed separately in two different networks and finally, the networks are merged to predict the final operation. We report the first deep learning approach for the prediction of operation between two numbers. DILTON learns to predict operations with 88.81% accuracy in a corpus of primary school questions.
Tasks
Published 2017-11-01
URL https://www.aclweb.org/anthology/I17-3017/
PDF https://www.aclweb.org/anthology/I17-3017
PWC https://paperswithcode.com/paper/deep-neural-network-based-system-for-solving
Repo
Framework

Non-parametric Structured Output Networks

Title Non-parametric Structured Output Networks
Authors Andreas Lehrmann, Leonid Sigal
Abstract Deep neural networks (DNNs) and probabilistic graphical models (PGMs) are the two main tools for statistical modeling. While DNNs provide the ability to model rich and complex relationships between input and output variables, PGMs provide the ability to encode dependencies among the output variables themselves. End-to-end training methods for models with structured graphical dependencies on top of neural predictions have recently emerged as a principled way of combining these two paradigms. While these models have proven to be powerful in discriminative settings with discrete outputs, extensions to structured continuous spaces, as well as performing efficient inference in these spaces, are lacking. We propose non-parametric structured output networks (NSON), a modular approach that cleanly separates a non-parametric, structured posterior representation from a discriminative inference scheme but allows joint end-to-end training of both components. Our experiments evaluate the ability of NSONs to capture structured posterior densities (modeling) and to compute complex statistics of those densities (inference). We compare our model to output spaces of varying expressiveness and popular variational and sampling-based inference algorithms.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/7009-non-parametric-structured-output-networks
PDF http://papers.nips.cc/paper/7009-non-parametric-structured-output-networks.pdf
PWC https://paperswithcode.com/paper/non-parametric-structured-output-networks
Repo
Framework

Author-aware Aspect Topic Sentiment Model to Retrieve Supporting Opinions from Reviews

Title Author-aware Aspect Topic Sentiment Model to Retrieve Supporting Opinions from Reviews
Authors Lahari Poddar, Wynne Hsu, Mong Li Lee
Abstract User-generated content about products and services in the form of reviews is often diverse and even contradictory. This makes it difficult for users to know if an opinion in a review is prevalent or biased. We study the problem of searching for supporting opinions in the context of reviews. We propose a framework called SURF that first identifies opinions expressed in a review, and then finds similar opinions from other reviews. We design a novel probabilistic graphical model that captures opinions as a combination of aspect, topic and sentiment dimensions, takes into account the preferences of individual authors, as well as the quality of the entity under review, and encodes the flow of thoughts in a review by constraining the aspect distribution dynamically among successive review segments. We derive a similarity measure that considers both lexical and semantic similarity to find supporting opinions. Experiments on TripAdvisor hotel reviews and Yelp restaurant reviews show that our model outperforms existing methods for modeling opinions, and the proposed framework is effective in finding supporting opinions.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1049/
PDF https://www.aclweb.org/anthology/D17-1049
PWC https://paperswithcode.com/paper/author-aware-aspect-topic-sentiment-model-to
Repo
Framework