January 24, 2020


Paper Group NANR 253


Tw-StAR at SemEval-2019 Task 5: N-gram embeddings for Hate Speech Detection in Multilingual Tweets


Title	Tw-StAR at SemEval-2019 Task 5: N-gram embeddings for Hate Speech Detection in Multilingual Tweets
Authors	Hala Mulki, Chedi Bechikh Ali, Hatem Haddad, Ismail Babaoğlu
Abstract	In this paper, we describe our contribution to SemEval-2019: subtask A of task 5, "Multilingual detection of hate speech against immigrants and women in Twitter (HatEval)". We developed two hate speech detection model variants through the Tw-StAR framework. While the first model adopted one-hot encoded n-grams to train an NB classifier, the second generated and learned n-gram embeddings within a feedforward neural network. For both models, specific terms, selected via MWT patterns, were tagged in the input data. With two feature types employed, we could investigate the ability of n-gram embeddings to rival one-hot n-grams. Our results showed that in English, n-gram embeddings outperformed one-hot n-grams. However, representing Spanish tweets by one-hot n-grams yielded slightly better performance than n-gram embeddings. The official ranking indicated that Tw-StAR ranked 9th for English and 20th for Spanish.
Tasks	Hate Speech Detection
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2090/
PDF	https://www.aclweb.org/anthology/S19-2090
PWC	https://paperswithcode.com/paper/tw-star-at-semeval-2019-task-5-n-gram
Repo
Framework

Vista.ue at SemEval-2019 Task 5: Single Multilingual Hate Speech Detection Model


Title	Vista.ue at SemEval-2019 Task 5: Single Multilingual Hate Speech Detection Model
Authors	Kashyap Raiyani, Teresa Gonçalves, Paulo Quaresma, Vitor Nogueira
Abstract	This paper shares insights from participating in SemEval-2019 Task 5. The main purpose of this system-description paper is to help the reader replicate the system and to provide an insightful analysis of it. In Vista.ue, we proposed a single multilingual hate speech detection model. This model was ranked 46/70 for English Task A and 31/43 for English Task B. Vista.ue ranked 38/41 for Spanish Task A and 22/25 for Spanish Task B.
Tasks	Hate Speech Detection
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2094/
PDF	https://www.aclweb.org/anthology/S19-2094
PWC	https://paperswithcode.com/paper/vistaue-at-semeval-2019-task-5-single
Repo
Framework

Proceedings of the 2nd Workshop on Machine Reading for Question Answering


Title	Proceedings of the 2nd Workshop on Machine Reading for Question Answering
Authors
Abstract
Tasks	Question Answering, Reading Comprehension
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-5800/
PDF	https://www.aclweb.org/anthology/D19-5800
PWC	https://paperswithcode.com/paper/proceedings-of-the-2nd-workshop-on-machine
Repo
Framework

Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Reading Comprehension


Title	Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Reading Comprehension
Authors	Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Lei Cui, Songhao Piao, Ming Zhou
Abstract	Most machine reading comprehension (MRC) models separately handle encoding and matching with different network architectures. In contrast, pretrained language models with Transformer layers, such as GPT (Radford et al., 2018) and BERT (Devlin et al., 2018), have achieved competitive performance on MRC. A research question that naturally arises is: apart from the benefits of pre-training, how much of the performance gain comes from the unified network architecture? In this work, we evaluate and analyze unifying encoding and matching components with Transformer for the MRC task. Experimental results on SQuAD show that the unified model outperforms previous networks that separately treat encoding and matching. We also introduce a metric to inspect whether a Transformer layer tends to perform encoding or matching. The analysis results show that the unified model learns different modeling strategies compared with previous manually-designed models.
Tasks	Machine Reading Comprehension, Reading Comprehension
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-5802/
PDF	https://www.aclweb.org/anthology/D19-5802
PWC	https://paperswithcode.com/paper/inspecting-unification-of-encoding-and
Repo
Framework

Multiple Text Style Transfer by using Word-level Conditional Generative Adversarial Network with Two-Phase Training


Title	Multiple Text Style Transfer by using Word-level Conditional Generative Adversarial Network with Two-Phase Training
Authors	Chih-Te Lai, Yi-Te Hong, Hong-You Chen, Chi-Jen Lu, Shou-De Lin
Abstract	The objective of non-parallel text style transfer, or controllable text generation, is to alter specific attributes (e.g. sentiment, mood, tense, politeness, etc.) of a given text while preserving its remaining attributes and content. The generative adversarial network (GAN) is a popular model for ensuring that the transferred sentences are realistic and have the desired target styles. However, GAN training often suffers from the mode collapse problem, which leaves the transferred text only loosely related to the original text. In this paper, we propose a new GAN model with a word-level conditional architecture and a two-phase training procedure. By using a style-related condition architecture before generating a word, our model is able to maintain style-unrelated words while changing the others. By separating the training procedure into reconstruction and transfer phases, our model is able to learn a proper text generation process, which further improves content preservation. We test our model on polarity sentiment transfer and multiple-attribute transfer tasks. The empirical results show that our model achieves comparable evaluation scores in both transfer accuracy and fluency but significantly outperforms other state-of-the-art models in content compatibility on three real-world datasets.
Tasks	Style Transfer, Text Generation, Text Style Transfer
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-1366/
PDF	https://www.aclweb.org/anthology/D19-1366
PWC	https://paperswithcode.com/paper/multiple-text-style-transfer-by-using-word
Repo
Framework

Answer-Supervised Question Reformulation for Enhancing Conversational Machine Comprehension


Title	Answer-Supervised Question Reformulation for Enhancing Conversational Machine Comprehension
Authors	Qian Li, Hui Su, Cheng Niu, Daling Wang, Zekang Li, Shi Feng, Yifei Zhang
Abstract	In conversational machine comprehension, integrating conversational history information through question reformulation to obtain better answers has become a research hotspot. However, existing question reformulation models are trained using only supervised question labels provided by annotators, without considering any feedback information from answers. In this paper, we propose a novel Answer-Supervised Question Reformulation (ASQR) model for enhancing conversational machine comprehension with reinforcement learning. ASQR uses a pointer-copy-based question reformulation model as an agent, takes an action to predict the next word, and observes a reward for the whole sentence state after generating the end-of-sequence token. The experimental results on the QuAC dataset show that our ASQR model is more effective in conversational machine comprehension. Moreover, pretraining is essential in reinforcement learning models, so we provide a high-quality annotated dataset for question reformulation by sampling a part of the QuAC dataset.
Tasks	Reading Comprehension
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-5805/
PDF	https://www.aclweb.org/anthology/D19-5805
PWC	https://paperswithcode.com/paper/answer-supervised-question-reformulation-for
Repo
Framework

Improving the Robustness of Deep Reading Comprehension Models by Leveraging Syntax Prior


Title	Improving the Robustness of Deep Reading Comprehension Models by Leveraging Syntax Prior
Authors	Bowen Wu, Haoyang Huang, Zongsheng Wang, Qihang Feng, Jingsong Yu, Baoxun Wang
Abstract	Despite the remarkable progress on Machine Reading Comprehension (MRC) with the help of open-source datasets, recent studies indicate that most current MRC systems unfortunately suffer from weak robustness against adversarial samples. To address this issue, we attempt to leverage sentence syntax in the answer prediction process, which previously only took phrase-level semantics into account. Furthermore, to better utilize the sentence syntax and improve the robustness, we propose a Syntactic Leveraging Network, which is designed to deal with adversarial samples by exploiting the syntactic elements of a question. The experimental results indicate that our method is promising for improving the generalization and robustness of MRC models against the influence of adversarial samples, with performance well-maintained.
Tasks	Machine Reading Comprehension, Reading Comprehension
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-5807/
PDF	https://www.aclweb.org/anthology/D19-5807
PWC	https://paperswithcode.com/paper/improving-the-robustness-of-deep-reading
Repo
Framework

Perturbation Analysis of the 8-Point Algorithm: A Case Study for Wide FoV Cameras


Title	Perturbation Analysis of the 8-Point Algorithm: A Case Study for Wide FoV Cameras
Authors	Thiago L. T. da Silveira, Claudio R. Jung
Abstract	This paper presents a perturbation analysis for the estimate of epipolar matrices using the 8-Point Algorithm (8-PA). Our approach explores existing bounds for singular subspaces and relates them to the 8-PA, without assuming any kind of error distribution for the matched features. In particular, if we use unit vectors as homogeneous image coordinates, we show that having a wide spatial distribution of matched features in both views tends to generate lower error bounds for the epipolar matrix error. Our experimental validation indicates that the bounds and the effective errors tend to decrease as the camera Field of View (FoV) increases, and that using the 8-PA for spherical images (that present 360°×180° FoV) leads to accurate essential matrices. As an additional contribution, we present bounds for the direction of the translation vector extracted from the essential matrix based on singular subspace analysis.
Tasks
Published	2019-06-01
URL	http://openaccess.thecvf.com/content_CVPR_2019/html/da_Silveira_Perturbation_Analysis_of_the_8-Point_Algorithm_A_Case_Study_for_CVPR_2019_paper.html
PDF	http://openaccess.thecvf.com/content_CVPR_2019/papers/da_Silveira_Perturbation_Analysis_of_the_8-Point_Algorithm_A_Case_Study_for_CVPR_2019_paper.pdf
PWC	https://paperswithcode.com/paper/perturbation-analysis-of-the-8-point
Repo
Framework

Document-Level Event Factuality Identification via Adversarial Neural Network


Title	Document-Level Event Factuality Identification via Adversarial Neural Network
Authors	Zhong Qian, Peifeng Li, Qiaoming Zhu, Guodong Zhou
Abstract	Document-level event factuality identification is an important subtask in event factuality and is crucial for discourse understanding in Natural Language Processing (NLP). Previous studies mainly suffer from the scarcity of suitable corpus and effective methods. To solve these two issues, we first construct a corpus annotated with both document- and sentence-level event factuality information on both English and Chinese texts. Then we present an LSTM neural network based on adversarial training with both intra- and inter-sequence attentions to identify document-level event factuality. Experimental results show that our neural network model can outperform various baselines on the constructed corpus.
Tasks
Published	2019-06-01
URL	https://www.aclweb.org/anthology/N19-1287/
PDF	https://www.aclweb.org/anthology/N19-1287
PWC	https://paperswithcode.com/paper/document-level-event-factuality
Repo
Framework

Song Lyrics Summarization Inspired by Audio Thumbnailing


Title	Song Lyrics Summarization Inspired by Audio Thumbnailing
Authors	Michael Fell, Elena Cabrio, Fabien Gandon, Alain Giboin
Abstract	Given the peculiar structure of songs, applying generic text summarization methods to lyrics can lead to the generation of highly redundant and incoherent text. In this paper, we propose to enhance state-of-the-art text summarization approaches with a method inspired by audio thumbnailing. Instead of searching for the thumbnail clues in the audio of the song, we identify equivalent clues in the lyrics. We then show how these summaries that take into account the audio nature of the lyrics outperform the generic methods according to both an automatic evaluation and human judgments.
Tasks	Text Summarization
Published	2019-09-01
URL	https://www.aclweb.org/anthology/R19-1038/
PDF	https://www.aclweb.org/anthology/R19-1038
PWC	https://paperswithcode.com/paper/song-lyrics-summarization-inspired-by-audio
Repo
Framework

Improving Natural Language Understanding by Reverse Mapping Bytepair Encoding


Title	Improving Natural Language Understanding by Reverse Mapping Bytepair Encoding
Authors	Chaodong Tong, Huailiang Peng, Qiong Dai, Lei Jiang, Jianghua Huang
Abstract	We propose a method called reverse mapping bytepair encoding, which maps named-entity information and other word-level linguistic features back to subwords during the encoding procedure of bytepair encoding (BPE). We apply this method to the Generative Pre-trained Transformer (OpenAI GPT) by adding a weighted linear layer after the embedding layer. We also propose a new model architecture, named the multi-channel separate transformer, to employ a training process without parameter-sharing. Evaluation on Stories Cloze, RTE, SciTail and SST-2 datasets demonstrates the effectiveness of our approach.
Tasks
Published	2019-11-01
URL	https://www.aclweb.org/anthology/K19-1016/
PDF	https://www.aclweb.org/anthology/K19-1016
PWC	https://paperswithcode.com/paper/improving-natural-language-understanding-by
Repo
Framework

CN-HIT-MI.T at SemEval-2019 Task 6: Offensive Language Identification Based on BiLSTM with Double Attention


Title	CN-HIT-MI.T at SemEval-2019 Task 6: Offensive Language Identification Based on BiLSTM with Double Attention
Authors	Yaojie Zhang, Bing Xu, Tiejun Zhao
Abstract	Offensive language has become pervasive in social media. In Offensive Language Identification tasks, it may be difficult to predict accurately from the surface words alone, so we try to extract deeper semantic information from the text. This paper presents an attention-based two-layer bidirectional long short-term memory neural network (BiLSTM) for semantic feature extraction. Additionally, a residual connection mechanism is used to synthesize two different deep features, and an emoji attention mechanism is used to extract semantic information of emojis in text. We participated in three sub-tasks of SemEval 2019 Task 6 as the CN-HIT-MI.T team. Our macro-averaged F1-score in sub-task A is 0.768, ranking 28/103. We got 0.638 in sub-task B, ranking 30/75. In sub-task C, we got 0.549, ranking 22/65. We also tried some other methods for which we did not submit results.
Tasks	Language Identification
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2101/
PDF	https://www.aclweb.org/anthology/S19-2101
PWC	https://paperswithcode.com/paper/cn-hit-mit-at-semeval-2019-task-6-offensive
Repo
Framework

Finite State Transducer based Morphology analysis for Malayalam Language


Title	Finite State Transducer based Morphology analysis for Malayalam Language
Authors	Santhosh Thottingal
Abstract
Tasks
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-6801/
PDF	https://www.aclweb.org/anthology/W19-6801
PWC	https://paperswithcode.com/paper/finite-state-transducer-based-morphology
Repo
Framework

Emad at SemEval-2019 Task 6: Offensive Language Identification using Traditional Machine Learning and Deep Learning approaches


Title	Emad at SemEval-2019 Task 6: Offensive Language Identification using Traditional Machine Learning and Deep Learning approaches
Authors	Emad Kebriaei, Samaneh Karimi, Nazanin Sabri, Azadeh Shakery
Abstract	In this paper, we present the methods used and the results obtained by our team, named Emad, on the OffensEval 2019 shared task organized at SemEval 2019. The OffensEval shared task includes three sub-tasks, namely offensive language identification, automatic categorization of offense types, and offense target identification. We participated in sub-task A and tried various methods, including traditional machine learning methods, deep learning methods, and a combination of the two. We also proposed a data augmentation method using word embeddings to improve the performance of our methods. The results show that the augmentation approach outperforms other methods in terms of macro-F1.
Tasks	Data Augmentation, Language Identification
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2107/
PDF	https://www.aclweb.org/anthology/S19-2107
PWC	https://paperswithcode.com/paper/emad-at-semeval-2019-task-6-offensive
Repo
Framework

Multi-Level Context Ultra-Aggregation for Stereo Matching


Title	Multi-Level Context Ultra-Aggregation for Stereo Matching
Authors	Guang-Yu Nie, Ming-Ming Cheng, Yun Liu, Zhengfa Liang, Deng-Ping Fan, Yue Liu, Yongtian Wang
Abstract	Applying multi-level context information to the cost volume can improve the performance of learning-based stereo matching methods. In recent years, 3-D Convolution Neural Networks (3-D CNNs) have shown advantages in regularizing the cost volume but are limited by unary feature learning in matching cost computation. However, existing methods only use features from plain convolution layers or a simple aggregation of multi-level features to calculate the cost volume, which is insufficient because stereo matching requires discriminative features to identify corresponding pixels in rectified stereo image pairs. In this paper, we propose a unary feature descriptor using multi-level context ultra-aggregation (MCUA), which encapsulates all convolutional features into a more discriminative representation by intra- and inter-level feature combination. Specifically, a child module that takes low-resolution images as input captures larger context information; the larger context information from each layer is densely connected to the main branch of the network. MCUA makes good use of multi-level features with richer context and performs the image-to-image prediction holistically. We introduce our MCUA scheme for cost volume calculation and test it on PSM-Net. We also evaluate our method on Scene Flow and KITTI 2012/2015 stereo datasets. Experimental results show that our method outperforms state-of-the-art methods by a notable margin and effectively improves the accuracy of stereo matching.
Tasks	Stereo Matching, Stereo Matching Hand
Published	2019-06-01
URL	http://openaccess.thecvf.com/content_CVPR_2019/html/Nie_Multi-Level_Context_Ultra-Aggregation_for_Stereo_Matching_CVPR_2019_paper.html
PDF	http://openaccess.thecvf.com/content_CVPR_2019/papers/Nie_Multi-Level_Context_Ultra-Aggregation_for_Stereo_Matching_CVPR_2019_paper.pdf
PWC	https://paperswithcode.com/paper/multi-level-context-ultra-aggregation-for
Repo
Framework