October 15, 2019

Paper Group NANR 146


Simple Algorithms For Sentiment Analysis On Sentiment Rich, Data Poor Domains.

Title Simple Algorithms For Sentiment Analysis On Sentiment Rich, Data Poor Domains.
Authors Prathusha K Sarma, William Sethares
Abstract Standard word embedding algorithms learn vector representations from large corpora of text documents in an unsupervised fashion. However, the quality of word embeddings learned from these algorithms is affected by the size of training data sets. Thus, the application of these algorithms in domains with only moderate amounts of available data is limited. In this paper we introduce an algorithm that learns word embeddings jointly with a classifier. Our algorithm is called SWESA (Supervised Word Embeddings for Sentiment Analysis). SWESA leverages document label information to learn vector representations of words from a modest corpus of text documents by solving an optimization problem that minimizes a cost function with respect to both word embeddings and the weight vector used for classification. Experiments on several real-world data sets show that SWESA has superior performance on domains with limited data, when compared to previously suggested approaches to word embeddings and sentiment analysis tasks.
Tasks Sentiment Analysis, Text Classification, Word Embeddings
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1290/
PDF https://www.aclweb.org/anthology/C18-1290
PWC https://paperswithcode.com/paper/simple-algorithms-for-sentiment-analysis-on
Repo
Framework
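
The SWESA abstract above describes minimizing one cost function over both the word embeddings and the classifier weight vector. The following is a minimal sketch of that idea only, not the authors' algorithm: the abstract does not give the objective, constraints, or update rule, so the logistic loss, the averaged document vector, the plain gradient steps, and the names `train_joint` and `predict` are all assumptions made for illustration.

```python
import math
import random

def train_joint(docs, labels, vocab_size, dim=4, lr=0.1, epochs=500, seed=0):
    """Jointly fit word vectors and a logistic classifier (illustrative only)."""
    rng = random.Random(seed)
    # Word embeddings and classifier weights start as small random values.
    emb = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]
    theta = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    for _ in range(epochs):
        for doc, y in zip(docs, labels):
            # Document vector: mean of its word vectors (an assumed choice).
            d = [sum(emb[w][k] for w in doc) / len(doc) for k in range(dim)]
            score = sum(t * x for t, x in zip(theta, d))
            p = 1.0 / (1.0 + math.exp(-score))
            g = p - y  # gradient of the logistic loss with respect to the score
            # Step on the embeddings of the words in this document ...
            for w in doc:
                for k in range(dim):
                    emb[w][k] -= lr * g * theta[k] / len(doc)
            # ... and on the classifier weights, driven by the same loss.
            for k in range(dim):
                theta[k] -= lr * g * d[k]
    return emb, theta

def predict(doc, emb, theta):
    """Label a document (list of word ids) as 1 (positive) or 0 (negative)."""
    d = [sum(emb[w][k] for w in doc) / len(doc) for k in range(len(theta))]
    return 1 if sum(t * x for t, x in zip(theta, d)) > 0 else 0

# Toy corpus (hypothetical): word ids 0 and 1 appear in positive documents,
# word ids 2 and 3 in negative ones.
docs = [[0, 1], [0, 0, 1], [2, 3], [3, 2, 2]]
labels = [1, 1, 0, 0]
emb, theta = train_joint(docs, labels, vocab_size=4)
predictions = [predict(doc, emb, theta) for doc in docs]
```

Because the labels shape the embeddings directly, even a corpus this small pushes the vectors of positive and negative words apart, which is the intuition behind learning embeddings in a supervised way when data is scarce.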

A Hybrid System for Chinese Grammatical Error Diagnosis and Correction

Title A Hybrid System for Chinese Grammatical Error Diagnosis and Correction
Authors Chen Li, Junpei Zhou, Zuyi Bao, Hengyou Liu, Guangwei Xu, Linlin Li
Abstract This paper introduces the DM_NLP team's system for the NLPTEA 2018 shared task of Chinese Grammatical Error Diagnosis (CGED), which can be used to detect and correct grammatical errors in texts written by Chinese as a Foreign Language (CFL) learners. This task aims at not only detecting four types of grammatical errors including redundant words (R), missing words (M), bad word selection (S) and disordered words (W), but also recommending corrections for errors of M and S types. We proposed a hybrid system including four models for this task with two stages: the detection stage and the correction stage. In the detection stage, we first used a BiLSTM-CRF model to tag potential errors by sequence labeling, along with some handcrafted features. Then we designed three Grammatical Error Correction (GEC) models to generate corrections, which could help to tune the detection result. In the correction stage, candidates were generated by the three GEC models and then merged to output the final corrections for M and S types. Our system reached the highest precision in the correction subtask, which was the most challenging part of this shared task, and ranked in the top 3 on F1 scores for position detection of errors.
Tasks Grammatical Error Correction
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3708/
PDF https://www.aclweb.org/anthology/W18-3708
PWC https://paperswithcode.com/paper/a-hybrid-system-for-chinese-grammatical-error
Repo
Framework

Revisiting $(\epsilon, \gamma, \tau)$-similarity learning for domain adaptation

Title Revisiting $(\epsilon, \gamma, \tau)$-similarity learning for domain adaptation
Authors Sofiane Dhouib, Ievgen Redko
Abstract Similarity learning is an active research area in machine learning that tackles the problem of finding a similarity function tailored to an observable data sample in order to achieve efficient classification. This learning scenario has been generally formalized by means of an $(\epsilon, \gamma, \tau)$-good similarity learning framework in the context of supervised classification and has been shown to have strong theoretical guarantees. In this paper, we propose to extend the theoretical analysis of similarity learning to the domain adaptation setting, a particular situation occurring when the similarity is learned and then deployed on samples following different probability distributions. We give a new definition of an $(\epsilon, \gamma)$-good similarity for domain adaptation and prove several results quantifying the performance of a similarity function on a target domain after it has been trained on a source domain. We particularly show that if the source distribution dominates the target one, then principally new domain adaptation learning bounds can be proved.
Tasks Domain Adaptation
Published 2018-12-01
URL http://papers.nips.cc/paper/7969-revisiting-epsilon-gamma-tau-similarity-learning-for-domain-adaptation
PDF http://papers.nips.cc/paper/7969-revisiting-epsilon-gamma-tau-similarity-learning-for-domain-adaptation.pdf
PWC https://paperswithcode.com/paper/revisiting-epsilon-gamma-tau-similarity
Repo
Framework

Ling@CASS Solution to the NLP-TEA CGED Shared Task 2018

Title Ling@CASS Solution to the NLP-TEA CGED Shared Task 2018
Authors Qinan Hu, Yongwei Zhang, Fang Liu, Yueguo Gu
Abstract In this study, we employ sequence-to-sequence learning to model the task of grammar error correction. The system takes potentially erroneous sentences as inputs, and outputs correct sentences. To break through the bottleneck of the very limited size of manually labeled data, we adopt a semi-supervised approach. Specifically, we adapt correct sentences written by native Chinese speakers to generate pseudo grammatical errors made by learners of Chinese as a second language. We use the pseudo data to pre-train the model, and the CGED data to fine-tune it. Being aware of the significance of precision in a grammar error correction system in real scenarios, we use ensembles to boost precision. When using inputs as simple as Chinese characters, the ensembled system achieves a precision of 86.56% in the detection of erroneous sentences, and a precision of 51.53% in the correction of errors of Selection and Missing types.
Tasks
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3709/
PDF https://www.aclweb.org/anthology/W18-3709
PWC https://paperswithcode.com/paper/lingcass-solution-to-the-nlp-tea-cged-shared
Repo
Framework
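
The Ling@CASS abstract above mentions turning correct native-speaker sentences into pseudo learner errors for pre-training. The sketch below shows one simple way such corruption could work; it is not the authors' procedure, which the abstract does not describe. The function name `make_pseudo_error`, the one-error-per-sentence policy, and the replacement vocabulary are hypothetical, and the four operations simply mirror the CGED error types (R, M, S, W) named in other abstracts on this page.

```python
import random

def make_pseudo_error(tokens, error_type, vocab, rng):
    """Return a corrupted copy of `tokens` containing one synthetic error."""
    out = list(tokens)  # never modify the caller's sentence
    i = rng.randrange(len(out))
    if error_type == "R":      # redundant word: repeat a token
        out.insert(i, out[i])
    elif error_type == "M":    # missing word: drop a token
        del out[i]
    elif error_type == "S":    # bad selection: substitute a different token
        out[i] = rng.choice([v for v in vocab if v != out[i]])
    elif error_type == "W":    # disordered words: exchange two neighbours
        j = i + 1 if i + 1 < len(out) else i - 1
        out[i], out[j] = out[j], out[i]
    else:
        raise ValueError(f"unknown error type: {error_type}")
    return out

# Character-level input, as in the abstract; the sentence and the
# replacement vocabulary are made up for this example.
rng = random.Random(0)
source = list("我今天去学校")
vocab = ["他", "明", "来"]
pairs = {t: make_pseudo_error(source, t, vocab, rng) for t in "RMSW"}
```

Each (corrupted, original) pair can then serve as an input and target for a sequence-to-sequence model before fine-tuning on real annotated data.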

The Importance of Recommender and Feedback Features in a Pronunciation Learning Aid

Title The Importance of Recommender and Feedback Features in a Pronunciation Learning Aid
Authors Dzikri Fudholi, Hanna Suominen
Abstract Verbal communication, with pronunciation as a part of it, is a core skill that can be developed through guided learning. An artificial intelligence system can take a role in these guided learning approaches as an enabler of an application for pronunciation learning, with a recommender system to guide language learners through exercises and a feedback system to correct their pronunciation. In this paper, we report on a user study on language learners' perceived usefulness of the application. Sixteen international students who spoke non-native English and lived in Australia participated. Thirteen of them said they need to improve their pronunciation skills in English because of their foreign accent. The feedback system with features for pronunciation scoring, speech replay, and giving a pronunciation example was deemed essential by most of the respondents. In contrast, a clear dichotomy between the recommender system perceived as useful or useless existed; the system had features to prompt new common words or old poorly-scored words. These results can be used to target research and development from information retrieval and reinforcement learning for better and better recommendations to speech recognition and speech analytics for accent acquisition.
Tasks Information Retrieval, Recommendation Systems, Speech Recognition
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3711/
PDF https://www.aclweb.org/anthology/W18-3711
PWC https://paperswithcode.com/paper/the-importance-of-recommender-and-feedback
Repo
Framework

Augmenting Textual Qualitative Features in Deep Convolution Recurrent Neural Network for Automatic Essay Scoring

Title Augmenting Textual Qualitative Features in Deep Convolution Recurrent Neural Network for Automatic Essay Scoring
Authors Tirthankar Dasgupta, Abir Naskar, Lipika Dey, Rupsa Saha
Abstract In this paper we present a qualitatively enhanced deep convolution recurrent neural network for computing the quality of a text in an automatic essay scoring task. The novelty of the work lies in the fact that instead of considering only the word and sentence representation of a text, we try to augment the different complex linguistic, cognitive and psychological features associated with a text document along with a hierarchical convolution recurrent neural network framework. Our preliminary investigation shows that incorporation of such qualitative feature vectors along with standard word/sentence embeddings can give us better understanding about improving the overall evaluation of the input essays.
Tasks Sentence Embeddings
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3713/
PDF https://www.aclweb.org/anthology/W18-3713
PWC https://paperswithcode.com/paper/augmenting-textual-qualitative-features-in
Repo
Framework

A Web-based Framework for Collecting and Assessing Highlighted Sentences in a Document

Title A Web-based Framework for Collecting and Assessing Highlighted Sentences in a Document
Authors Sasha Spala, Franck Dernoncourt, Walter Chang, Carl Dockhorn
Abstract Automatically highlighting a text aims at identifying key portions that are the most important to a reader. In this paper, we present a web-based framework designed to efficiently and scalably crowdsource two independent but related tasks: collecting highlight annotations, and comparing the performance of automated highlighting systems. The first task is necessary to understand human preferences and train supervised automated highlighting systems. The second task yields a more accurate and fine-grained evaluation than existing automated performance metrics.
Tasks
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-2017/
PDF https://www.aclweb.org/anthology/C18-2017
PWC https://paperswithcode.com/paper/a-web-based-framework-for-collecting-and
Repo
Framework

Reflection Removal for Large-Scale 3D Point Clouds

Title Reflection Removal for Large-Scale 3D Point Clouds
Authors Jae-Seong Yun, Jae-Young Sim
Abstract Large-scale 3D point clouds (LS3DPCs) captured by terrestrial LiDAR scanners often exhibit reflection artifacts caused by glass, which degrade the performance of related computer vision techniques. In this paper, we propose an efficient reflection removal algorithm for LS3DPCs. We first partition the unit sphere into local surface patches which are then classified into the ordinary patches and the glass patches according to the number of echo pulses from emitted laser pulses. Then we estimate the glass region of dominant reflection artifacts by measuring the reliability. We also detect and remove the virtual points using the conditions of the reflection symmetry and the geometric similarity. We test the performance of the proposed algorithm on LS3DPCs capturing real-world outdoor scenes, and show that the proposed algorithm estimates valid glass regions faithfully and removes the virtual points caused by reflection artifacts successfully.
Tasks
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Yun_Reflection_Removal_for_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Yun_Reflection_Removal_for_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/reflection-removal-for-large-scale-3d-point
Repo
Framework

Measuring Beginner Friendliness of Japanese Web Pages explaining Academic Concepts by Integrating Neural Image Feature and Text Features

Title Measuring Beginner Friendliness of Japanese Web Pages explaining Academic Concepts by Integrating Neural Image Feature and Text Features
Authors Hayato Shiokawa, Kota Kawaguchi, Bingcai Han, Takehito Utsuro, Yasuhide Kawada, Masaharu Yoshioka, Noriko Kando
Abstract Search engines are an important tool for modern academic study, but their results lack any measure of beginner friendliness. To make search engines more efficient for academic study, it is necessary to develop a technique for measuring the beginner friendliness of a Web page explaining academic concepts and to build an automatic measurement system. This paper studies how to integrate heterogeneous features, such as a neural image feature generated from the image of the Web page by a variant of CNN (convolutional neural network) as well as text features extracted from the body text of the HTML file of the Web page. Integration is performed through the framework of SVM classifier learning. Evaluation results show that the integrated heterogeneous features perform better than each individual type of feature.
Tasks
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3721/
PDF https://www.aclweb.org/anthology/W18-3721
PWC https://paperswithcode.com/paper/measuring-beginner-friendliness-of-japanese
Repo
Framework

The First Multilingual Surface Realisation Shared Task (SR’18): Overview and Evaluation Results

Title The First Multilingual Surface Realisation Shared Task (SR’18): Overview and Evaluation Results
Authors Simon Mille, Anja Belz, Bernd Bohnet, Yvette Graham, Emily Pitler, Leo Wanner
Abstract We report results from the SR'18 Shared Task, a new multilingual surface realisation task organised as part of the ACL'18 Workshop on Multilingual Surface Realisation. As in its English-only predecessor task SR'11, the shared task comprised two tracks with different levels of complexity: (a) a shallow track where the inputs were full UD structures with word order information removed and tokens lemmatised; and (b) a deep track where additionally, functional words and morphological information were removed. The shallow track was offered in ten, and the deep track in three languages. Systems were evaluated (a) automatically, using a range of intrinsic metrics, and (b) by human judges in terms of readability and meaning similarity. This report presents the evaluation results, along with descriptions of the SR'18 tracks, data and evaluation methods. For full descriptions of the participating systems, please see the separate system reports elsewhere in this volume.
Tasks
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3601/
PDF https://www.aclweb.org/anthology/W18-3601
PWC https://paperswithcode.com/paper/the-first-multilingual-surface-realisation-1
Repo
Framework
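
The SR'18 abstract above describes shallow-track inputs as UD structures with word order removed and tokens lemmatised. As a rough illustration of what such an input could look like, the sketch below derives one from a small dependency tree. It is a guess at the general idea, not the shared task's actual data preparation; the dictionary layout, the function name `to_shallow_input`, and the example sentence are assumptions.

```python
import random

def to_shallow_input(tokens, rng):
    """Scramble token order and keep lemmas only, preserving the tree.

    `tokens` is a list of dicts with keys id (1-based), form, lemma,
    head (0 means root) and deprel.
    """
    order = list(tokens)
    rng.shuffle(order)  # discard the original word order
    # Renumber tokens by their new position and remap heads accordingly.
    new_id = {tok["id"]: k for k, tok in enumerate(order, start=1)}
    new_id[0] = 0  # the root marker stays 0
    return [
        {
            "id": new_id[tok["id"]],
            "lemma": tok["lemma"],  # the inflected form is dropped
            "head": new_id[tok["head"]],
            "deprel": tok["deprel"],
        }
        for tok in order
    ]

# "The cats sleep" as a tiny hand-written dependency tree.
sentence = [
    {"id": 1, "form": "The", "lemma": "the", "head": 2, "deprel": "det"},
    {"id": 2, "form": "cats", "lemma": "cat", "head": 3, "deprel": "nsubj"},
    {"id": 3, "form": "sleep", "lemma": "sleep", "head": 0, "deprel": "root"},
]
shallow = to_shallow_input(sentence, random.Random(1))
```

A surface realiser working on this kind of input has to recover both the word order and the inflected forms from the lemmas and dependency relations alone.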

Bridging resolution: Task definition, corpus resources and rule-based experiments

Title Bridging resolution: Task definition, corpus resources and rule-based experiments
Authors Ina Roesiger, Arndt Riester, Jonas Kuhn
Abstract Recent work on bridging resolution has so far been based on the corpus ISNotes (Markert et al. 2012), as this was the only corpus available with unrestricted bridging annotation. Hou et al. 2014's rule-based system currently achieves state-of-the-art performance on this corpus, as learning-based approaches suffer from the lack of available training data. Recently, a number of new corpora with bridging annotations have become available. To test the generalisability of the approach by Hou et al. 2014, we apply a slightly extended rule-based system to these corpora. Besides the expected out-of-domain effects, we also observe low performance on some of the in-domain corpora. Our analysis shows that this is the result of two very different phenomena being defined as bridging, namely referential and lexical bridging. We also report that filtering out gold or predicted coreferent anaphors before applying the bridging resolution system helps improve bridging resolution.
Tasks Coreference Resolution
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1298/
PDF https://www.aclweb.org/anthology/C18-1298
PWC https://paperswithcode.com/paper/bridging-resolution-task-definition-corpus
Repo
Framework

Chinese Grammatical Error Diagnosis Based on CRF and LSTM-CRF model

Title Chinese Grammatical Error Diagnosis Based on CRF and LSTM-CRF model
Authors Yujie Zhou, Yinan Shao, Yong Zhou
Abstract When learning Chinese as a foreign language, learners may make grammatical errors due to negative transfer from their native languages. However, few grammar checking applications have been developed to support these learners. The goal of this paper is to develop a tool to automatically diagnose four types of grammatical errors, namely redundant words (R), missing words (M), bad word selection (S) and disordered words (W), in Chinese sentences written by those foreign learners. In this paper, a conventional linear CRF model with specific feature engineering and an LSTM-CRF model are used to solve the CGED (Chinese Grammatical Error Diagnosis) task. We make some improvements to both models, and the submitted results have better performance on false positive rate and accuracy than the average of all runs from CGED2018 for all three evaluation levels.
Tasks Feature Engineering
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3724/
PDF https://www.aclweb.org/anthology/W18-3724
PWC https://paperswithcode.com/paper/chinese-grammatical-error-diagnosis-based-on
Repo
Framework
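
The abstract above frames error diagnosis as a tagging problem solved with CRF and LSTM-CRF models. Such models need one label per character, so annotated error spans are usually converted into a tag sequence first. The sketch below shows a BIO-style conversion as one plausible encoding; the abstract does not state which tag scheme the authors used, and the function name `spans_to_bio` and the 0-based, end-exclusive span convention are assumptions.

```python
def spans_to_bio(length, spans):
    """Convert error spans into per-character BIO tags.

    `spans` holds (start, end, error_type) triples with 0-based start
    and exclusive end; error_type is one of "R", "M", "S", "W".
    """
    tags = ["O"] * length  # "O" marks characters outside any error
    for start, end, error_type in spans:
        if not 0 <= start < end <= length:
            raise ValueError(f"span out of range: {(start, end)}")
        tags[start] = f"B-{error_type}"      # first character of the error
        for k in range(start + 1, end):
            tags[k] = f"I-{error_type}"      # remaining characters
    return tags

# A six-character sentence with a two-character selection error and a
# one-character redundant word (positions are made up for illustration).
tags = spans_to_bio(6, [(1, 3, "S"), (4, 5, "R")])
# tags == ["O", "B-S", "I-S", "O", "B-R", "O"]
```

A sequence labeller trained on such tags predicts the error type and position together, which matches the detection, identification and position levels that CGED evaluates.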

Contextualized Character Representation for Chinese Grammatical Error Diagnosis

Title Contextualized Character Representation for Chinese Grammatical Error Diagnosis
Authors Jianbo Zhao, Si Li, Zhiqing Lin
Abstract Nowadays, more and more people are learning Chinese as their second language. Establishing an automatic diagnosis system for Chinese grammatical errors has become an important challenge. In this paper, we propose a Chinese grammatical error diagnosis (CGED) model with contextualized character representation. Compared to the traditional model using LSTM (Long Short-Term Memory), our model has better performance and does not need many manually engineered features.
Tasks
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3725/
PDF https://www.aclweb.org/anthology/W18-3725
PWC https://paperswithcode.com/paper/contextualized-character-representation-for
Repo
Framework

CMMC-BDRC Solution to the NLP-TEA-2018 Chinese Grammatical Error Diagnosis Task

Title CMMC-BDRC Solution to the NLP-TEA-2018 Chinese Grammatical Error Diagnosis Task
Authors Yongwei Zhang, Qinan Hu, Fang Liu, Yueguo Gu
Abstract Chinese grammatical error diagnosis is an important natural language processing (NLP) task, and also an important application of artificial intelligence technology in language education. This paper introduces a system developed by the Chinese Multilingual & Multimodal Corpus and Big Data Research Center for the NLP-TEA shared task named Chinese Grammar Error Diagnosis (CGED). This system treats the error diagnosis task as a sequence tagging problem and the correction task as a text classification problem. Among the 12 teams, this system achieves the highest F1 score in the detection task and the second-highest mean F1 score across the identification, position and correction tasks.
Tasks Data Augmentation, Text Classification
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3726/
PDF https://www.aclweb.org/anthology/W18-3726
PWC https://paperswithcode.com/paper/cmmc-bdrc-solution-to-the-nlp-tea-2018
Repo
Framework

Source Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language

Title Source Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language
Authors He Bai, Yu Zhou, Jiajun Zhang, Liang Zhao, Mei-Yuh Hwang, Chengqing Zong
Abstract To deploy a spoken language understanding (SLU) model to a new language, language transferring is desired to avoid the trouble of acquiring and labeling a new big SLU corpus. An SLU corpus is a monolingual corpus with domain/intent/slot labels. Translating the original SLU corpus into the target language is an attractive strategy. However, SLU corpora consist of plenty of semantic labels (slots), which general-purpose translators cannot handle well, not to mention additional cultural differences. This paper focuses on the language transferring task given a small in-domain parallel SLU corpus. The in-domain parallel corpus can be used as the first adaptation on the general translator. But more importantly, we show how to use reinforcement learning (RL) to further adapt the adapted translator, where translated sentences with more proper slot tags receive higher rewards. Our reward is derived from the source input sentence exclusively, unlike reward via actor-critic methods or computing reward with a ground truth target sentence. Hence we can adapt the translator the second time, using the big monolingual SLU corpus from the source language. We evaluate our approach on Chinese to English language transferring for SLU systems. The experimental results show that the generated English SLU corpus via adaptation and reinforcement learning gives us over 97% in the slot F1 score and over 84% accuracy in domain classification. It demonstrates the effectiveness of the proposed language transferring method. Compared with naive translation, our proposed method improves domain classification accuracy by a relative 22%, and the slot filling F1 score by a relative 71% or more.
Tasks Machine Translation, Slot Filling, Spoken Language Understanding
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1305/
PDF https://www.aclweb.org/anthology/C18-1305
PWC https://paperswithcode.com/paper/source-critical-reinforcement-learning-for-1
Repo
Framework