July 26, 2019

Paper Group NANR 55

Proceedings of the Workshop Human-Informed Translation and Interpreting Technology


Title	Proceedings of the Workshop Human-Informed Translation and Interpreting Technology
Authors	Irina Temnikova, Constantin Orasan, Gloria Corpas Pastor, Stephan Vogel
Abstract
Tasks
Published	2017-09-01
URL	https://www.aclweb.org/anthology/papers/W17-7900/w17-7900
PDF	https://www.aclweb.org/anthology/W17-7900
PWC	https://paperswithcode.com/paper/proceedings-of-the-workshop-human-informed
Repo
Framework

Word Re-Embedding via Manifold Dimensionality Retention


Title	Word Re-Embedding via Manifold Dimensionality Retention
Authors	Souleiman Hasan, Edward Curry
Abstract	Word embeddings seek to recover a Euclidean metric space by mapping words into vectors, starting from word co-occurrences in a corpus. Word embeddings may underestimate the similarity between nearby words, and overestimate it between distant words in the Euclidean metric space. In this paper, we re-embed pre-trained word embeddings with a stage of manifold learning which retains dimensionality. We show that this approach is theoretically founded in the metric recovery paradigm, and empirically show that it can improve on state-of-the-art embeddings in word similarity tasks by 0.5 to 5.0 percentage points, depending on the original space.
Tasks	Machine Translation, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1033/
PDF	https://www.aclweb.org/anthology/D17-1033
PWC	https://paperswithcode.com/paper/word-re-embedding-via-manifold-dimensionality
Repo
Framework

Improving the Character Ngram Model for the DSL Task with BM25 Weighting and Less Frequently Used Feature Sets


Title	Improving the Character Ngram Model for the DSL Task with BM25 Weighting and Less Frequently Used Feature Sets
Authors	Yves Bestgen
Abstract	This paper describes the system developed by the Centre for English Corpus Linguistics (CECL) to discriminate between similar languages, language varieties and dialects. Based on an SVM with character and POStag n-grams as features and the BM25 weighting scheme, it achieved 92.7% accuracy in the Discriminating between Similar Languages (DSL) task, ranking first among eleven systems but with a lead over the next three teams of only 0.2%. A simpler version of the system ranked second in the German Dialect Identification (GDI) task thanks to several ad hoc postprocessing steps. Complementary analyses carried out by a cross-validation procedure suggest that the BM25 weighting scheme could be competitive in this type of task, at least in comparison with the sublinear TF-IDF. POStag n-grams also improved the system performance.
Tasks	Language Identification
Published	2017-04-01
URL	https://www.aclweb.org/anthology/W17-1214/
PDF	https://www.aclweb.org/anthology/W17-1214
PWC	https://paperswithcode.com/paper/improving-the-character-ngram-model-for-the
Repo
Framework
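
To make the weighting step in the abstract above concrete, here is a minimal sketch of Okapi BM25 weights computed over character n-grams. The abstract does not say which BM25 variant or parameters the CECL system used, so the non-negative IDF form, the defaults k1 = 1.2 and b = 0.75, and the function names char_ngrams and bm25_weights are all assumptions made for illustration. The resulting weights would be the feature values passed to a classifier such as an SVM.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def bm25_weights(docs, n=3, k1=1.2, b=0.75):
    """Return one {ngram: weight} dict per document (hypothetical helper).

    k1 controls term-frequency saturation and b controls length
    normalisation; both values are common defaults, not the paper's.
    """
    if not docs:
        return []
    tokenized = [char_ngrams(d, n) for d in docs]
    num_docs = len(tokenized)
    # Fall back to 1.0 so very short documents do not cause division by zero.
    avg_len = sum(len(t) for t in tokenized) / num_docs or 1.0
    # Document frequency: how many documents contain each n-gram.
    df = Counter(g for toks in tokenized for g in set(toks))

    weighted = []
    for toks in tokenized:
        tf = Counter(toks)
        length_norm = 1 - b + b * len(toks) / avg_len
        weights = {}
        for gram, freq in tf.items():
            # Non-negative IDF variant (assumed, not taken from the paper).
            idf = math.log(1 + (num_docs - df[gram] + 0.5) / (df[gram] + 0.5))
            weights[gram] = idf * freq * (k1 + 1) / (freq + k1 * length_norm)
        weighted.append(weights)
    return weighted


if __name__ == "__main__":
    for row in bm25_weights(["aaaa", "abab", "abcd"], n=2):
        print(row)
```

Unlike sublinear TF-IDF, the BM25 term-frequency component saturates at k1 + 1 and is normalised by document length, which may be one reason it is worth comparing on short texts.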

Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)


Title	Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Authors
Abstract
Tasks
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-2000/
PDF	https://www.aclweb.org/anthology/I17-2000
PWC	https://paperswithcode.com/paper/proceedings-of-the-eighth-international-joint-1
Repo
Framework

Towards Abstractive Multi-Document Summarization Using Submodular Function-Based Framework, Sentence Compression and Merging


Title	Towards Abstractive Multi-Document Summarization Using Submodular Function-Based Framework, Sentence Compression and Merging
Authors	Yllias Chali, Moin Tanvee, Mir Tafseer Nayeem
Abstract	We propose a submodular function-based summarization system which integrates three important measures, namely importance, coverage, and non-redundancy, to detect the important sentences for the summary. We design monotone and submodular functions which allow us to apply an efficient and scalable greedy algorithm to obtain informative and well-covered summaries. In addition, we integrate two abstraction-based methods, namely sentence compression and merging, for generating an abstractive sentence set. We design our summarization models for both generic and query-focused summarization. Experimental results on DUC-2004 and DUC-2007 datasets show that our generic and query-focused summarizers have outperformed the state-of-the-art summarization systems in terms of ROUGE-1 and ROUGE-2 recall and F-measure.
Tasks	Abstractive Text Summarization, Document Summarization, Multi-Document Summarization, Sentence Compression, Text Generation
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-2071/
PDF	https://www.aclweb.org/anthology/I17-2071
PWC	https://paperswithcode.com/paper/towards-abstractive-multi-document
Repo
Framework
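
The greedy selection mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' objective: the paper combines importance, coverage, and non-redundancy, whereas this example assumes a single word-coverage function (which is monotone and submodular) and a budget expressed as a number of sentences. The names coverage and greedy_summary are hypothetical.

```python
def coverage(selected, sentences):
    """Monotone submodular objective: number of distinct words covered."""
    covered = set()
    for idx in selected:
        covered.update(sentences[idx].lower().split())
    return len(covered)


def greedy_summary(sentences, budget):
    """Pick up to `budget` sentences, always adding the largest marginal gain."""
    selected = []
    remaining = set(range(len(sentences)))
    while remaining and len(selected) < budget:
        current = coverage(selected, sentences)
        # Ties are broken in favour of the lowest sentence index.
        best = max(
            sorted(remaining),
            key=lambda i: coverage(selected + [i], sentences) - current,
        )
        gain = coverage(selected + [best], sentences) - current
        if gain <= 0:
            break  # No remaining sentence adds new words.
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    docs = [
        "the cat sat",
        "the cat sat on the mat",
        "dogs bark loudly",
        "the cat",
    ]
    print(greedy_summary(docs, budget=2))
```

For a monotone submodular objective under a cardinality budget, this kind of greedy selection is known to reach at least a (1 - 1/e) fraction of the optimal value, which is what makes it attractive for scalable summarization.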

A corpus-based study on synesthesia in Korean ordinary language


Title	A corpus-based study on synesthesia in Korean ordinary language
Authors	Charmhun Jo
Abstract
Tasks
Published	2017-11-01
URL	https://www.aclweb.org/anthology/Y17-1034/
PDF	https://www.aclweb.org/anthology/Y17-1034
PWC	https://paperswithcode.com/paper/a-corpus-based-study-on-synesthesia-in-korean
Repo
Framework

Intrusions of Masbate Lexicon in Local Bilingual Tabloid


Title	Intrusions of Masbate Lexicon in Local Bilingual Tabloid
Authors	Cecilia Genuino, Romualdo Mabuan
Abstract
Tasks
Published	2017-11-01
URL	https://www.aclweb.org/anthology/Y17-1035/
PDF	https://www.aclweb.org/anthology/Y17-1035
PWC	https://paperswithcode.com/paper/intrusions-of-masbate-lexicon-in-local
Repo
Framework

Tweet Extraction for News Production Considering Unreality


Title	Tweet Extraction for News Production Considering Unreality
Authors	Yuka Takei, Taro Miyazaki, Ichiro Yamada, Jun Goto
Abstract
Tasks
Published	2017-11-01
URL	https://www.aclweb.org/anthology/Y17-1049/
PDF	https://www.aclweb.org/anthology/Y17-1049
PWC	https://paperswithcode.com/paper/tweet-extraction-for-news-production
Repo
Framework

A Dataset and Classifier for Recognizing Social Media English


Title	A Dataset and Classifier for Recognizing Social Media English
Authors	Su Lin Blodgett, Johnny Wei, Brendan O'Connor
Abstract	While language identification works well on standard texts, it performs much worse on social media language, in particular dialectal language, even for English. First, to support work on English language identification, we contribute a new dataset of tweets annotated for English versus non-English, with attention to ambiguity, code-switching, and automatic generation issues. It is randomly sampled from all public messages, avoiding biases towards pre-existing language classifiers. Second, we find that a demographic language model, which identifies messages with language similar to that used by several U.S. ethnic populations on Twitter, can be used to improve English language identification performance when combined with a traditional supervised language identifier. It increases recall with almost no loss of precision, including, surprisingly, for English messages written by non-U.S. authors. Our dataset and identifier ensemble are available online.
Tasks	Language Identification, Language Modelling
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4408/
PDF	https://www.aclweb.org/anthology/W17-4408
PWC	https://paperswithcode.com/paper/a-dataset-and-classifier-for-recognizing
Repo
Framework
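
One simple way a second model can raise recall when combined with an existing identifier is an "either component" rule, sketched below. The abstract above does not describe how the authors' ensemble is built, so the OR rule, the 0.8 threshold, the "en" label convention, and the function name ensemble_is_english are all assumptions for illustration only.

```python
def ensemble_is_english(base_label, demographic_score, threshold=0.8):
    """Label a message English if either component says so (assumed rule).

    base_label: language code from a conventional supervised identifier.
    demographic_score: score in [0, 1] from a hypothetical demographic
        language model indicating how English-like the message is.
    """
    return base_label == "en" or demographic_score >= threshold


if __name__ == "__main__":
    # A message the base identifier missed but the second model recovers.
    print(ensemble_is_english("es", 0.93))
```

An OR rule can only add positive predictions, so recall cannot drop; precision depends on how strict the threshold is.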

完全基於類神經網路之語音合成系統初步研究 (A Preliminary Study on Fully Neural Network-based Speech Synthesis System) [In Chinese]


Title	完全基於類神經網路之語音合成系統初步研究 (A Preliminary Study on Fully Neural Network-based Speech Synthesis System) [In Chinese]
Authors	Shu-Han Liao, Ya-Bo Chai, Yuan-Fu Liao
Abstract
Tasks	Speech Synthesis
Published	2017-11-01
URL	https://www.aclweb.org/anthology/O17-1021/
PDF	https://www.aclweb.org/anthology/O17-1021
PWC	https://paperswithcode.com/paper/aa-ao14eccc2e-a1eae3ac3cac-c-a-preliminary
Repo
Framework

Automatic Morpheme Segmentation and Labeling in Universal Dependencies Resources


Title	Automatic Morpheme Segmentation and Labeling in Universal Dependencies Resources
Authors	Miikka Silfverberg, Mans Hulden
Abstract
Tasks	Semantic Textual Similarity, Word Embeddings
Published	2017-05-01
URL	https://www.aclweb.org/anthology/W17-0418/
PDF	https://www.aclweb.org/anthology/W17-0418
PWC	https://paperswithcode.com/paper/automatic-morpheme-segmentation-and-labeling
Repo
Framework

Role-Preserving Redaction of Medical Records to Enable Ontology-Driven Processing


Title	Role-Preserving Redaction of Medical Records to Enable Ontology-Driven Processing
Authors	Seth Polsley, Atif Tahir, Muppala Raju, Akintayo Akinleye, Duane Steward
Abstract	Electronic medical records (EMR) have largely replaced hand-written patient files in healthcare. The growing pool of EMR data presents a significant resource in medical research, but the U.S. Health Insurance Portability and Accountability Act (HIPAA) mandates redacting medical records before any analysis is performed on them. This process complicates obtaining medical data and can remove much useful information from the record. As part of a larger project involving ontology-driven medical processing, we employ a method of recognizing protected health information (PHI) that maps to ontological terms. We then use the relationships defined in the ontology to redact medical texts so that roles and semantics of terms are retained without compromising anonymity. The method is evaluated by clinical experts on several hundred medical documents, achieving up to a 98.8% F-score, and has already shown promise for retaining semantic information in later processing.
Tasks
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2324/
PDF	https://www.aclweb.org/anthology/W17-2324
PWC	https://paperswithcode.com/paper/role-preserving-redaction-of-medical-records
Repo
Framework

Deep Neural Network based system for solving Arithmetic Word problems


Title	Deep Neural Network based system for solving Arithmetic Word problems
Authors	Purvanshi Mehta, Pruthwik Mishra, Vinayak Athavale, Manish Shrivastava, Dipti Sharma
Abstract	This paper presents DILTON, a system that solves simple arithmetic word problems. DILTON uses a deep neural network-based model to solve math word problems. DILTON divides the question into two parts: worldstate and query. The worldstate and the query are processed separately in two different networks and, finally, the networks are merged to predict the final operation. We report the first deep learning approach for the prediction of operation between two numbers. DILTON learns to predict operations with 88.81% accuracy in a corpus of primary school questions.
Tasks
Published	2017-11-01
URL	https://www.aclweb.org/anthology/I17-3017/
PDF	https://www.aclweb.org/anthology/I17-3017
PWC	https://paperswithcode.com/paper/deep-neural-network-based-system-for-solving
Repo
Framework

Non-parametric Structured Output Networks


Title	Non-parametric Structured Output Networks
Authors	Andreas Lehrmann, Leonid Sigal
Abstract	Deep neural networks (DNNs) and probabilistic graphical models (PGMs) are the two main tools for statistical modeling. While DNNs provide the ability to model rich and complex relationships between input and output variables, PGMs provide the ability to encode dependencies among the output variables themselves. End-to-end training methods for models with structured graphical dependencies on top of neural predictions have recently emerged as a principled way of combining these two paradigms. While these models have proven to be powerful in discriminative settings with discrete outputs, extensions to structured continuous spaces, as well as performing efficient inference in these spaces, are lacking. We propose non-parametric structured output networks (NSON), a modular approach that cleanly separates a non-parametric, structured posterior representation from a discriminative inference scheme but allows joint end-to-end training of both components. Our experiments evaluate the ability of NSONs to capture structured posterior densities (modeling) and to compute complex statistics of those densities (inference). We compare our model to output spaces of varying expressiveness and popular variational and sampling-based inference algorithms.
Tasks
Published	2017-12-01
URL	http://papers.nips.cc/paper/7009-non-parametric-structured-output-networks
PDF	http://papers.nips.cc/paper/7009-non-parametric-structured-output-networks.pdf
PWC	https://paperswithcode.com/paper/non-parametric-structured-output-networks
Repo
Framework

Author-aware Aspect Topic Sentiment Model to Retrieve Supporting Opinions from Reviews


Title	Author-aware Aspect Topic Sentiment Model to Retrieve Supporting Opinions from Reviews
Authors	Lahari Poddar, Wynne Hsu, Mong Li Lee
Abstract	User-generated content about products and services in the form of reviews is often diverse and even contradictory. This makes it difficult for users to know if an opinion in a review is prevalent or biased. We study the problem of searching for supporting opinions in the context of reviews. We propose a framework called SURF that first identifies opinions expressed in a review, and then finds similar opinions from other reviews. We design a novel probabilistic graphical model that captures opinions as a combination of aspect, topic and sentiment dimensions, takes into account the preferences of individual authors, as well as the quality of the entity under review, and encodes the flow of thoughts in a review by constraining the aspect distribution dynamically among successive review segments. We derive a similarity measure that considers both lexical and semantic similarity to find supporting opinions. Experiments on TripAdvisor hotel reviews and Yelp restaurant reviews show that our model outperforms existing methods for modeling opinions, and the proposed framework is effective in finding supporting opinions.
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1049/
PDF	https://www.aclweb.org/anthology/D17-1049
PWC	https://paperswithcode.com/paper/author-aware-aspect-topic-sentiment-model-to
Repo
Framework