Paper Group NANR 253
Tw-StAR at SemEval-2019 Task 5: N-gram embeddings for Hate Speech Detection in Multilingual Tweets. Vista.ue at SemEval-2019 Task 5: Single Multilingual Hate Speech Detection Model. Proceedings of the 2nd Workshop on Machine Reading for Question Answering. Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Rea …
Tw-StAR at SemEval-2019 Task 5: N-gram embeddings for Hate Speech Detection in Multilingual Tweets
Title | Tw-StAR at SemEval-2019 Task 5: N-gram embeddings for Hate Speech Detection in Multilingual Tweets |
Authors | Hala Mulki, Chedi Bechikh Ali, Hatem Haddad, Ismail Babaoğlu |
Abstract | In this paper, we describe our contribution to SemEval-2019: subtask A of task 5, "Multilingual detection of hate speech against immigrants and women in Twitter (HatEval)". We developed two hate speech detection model variants through the Tw-StAR framework. While the first model adopted one-hot encoded n-grams to train an NB classifier, the second generated and learned n-gram embeddings within a feedforward neural network. For both models, specific terms, selected via MWT patterns, were tagged in the input data. With two feature types employed, we could investigate the ability of n-gram embeddings to rival one-hot n-grams. Our results showed that in English, n-gram embeddings outperformed one-hot n-grams. However, representing Spanish tweets by one-hot n-grams yielded slightly better performance than n-gram embeddings. The official ranking indicated that Tw-StAR ranked 9th for English and 20th for Spanish. |
Tasks | Hate Speech Detection |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2090/ |
https://www.aclweb.org/anthology/S19-2090 | |
PWC | https://paperswithcode.com/paper/tw-star-at-semeval-2019-task-5-n-gram |
Repo | |
Framework | |
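The first Tw-StAR variant described above (one-hot n-grams feeding an NB classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the Bernoulli-style model, the Laplace smoothing, the unigram-plus-bigram feature set, the names `ngrams` and `OneHotNgramNB`, and the toy texts and labels are all assumptions made for illustration, and the MWT-based term tagging is omitted.

```python
import math
from collections import defaultdict


def ngrams(text, n_max=2):
    """Return the set of word n-grams (n = 1..n_max) present in a text."""
    tokens = text.lower().split()
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    }


class OneHotNgramNB:
    """Naive Bayes over binary (present/absent) n-gram features."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = {c: 0 for c in self.classes}
        self.feat_counts = {c: defaultdict(int) for c in self.classes}
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngrams(text)
            self.vocab |= feats
            self.doc_counts[label] += 1
            for f in feats:
                self.feat_counts[label][f] += 1
        return self

    def predict(self, text):
        feats = ngrams(text) & self.vocab  # ignore n-grams unseen in training
        total = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for c in self.classes:
            score = math.log(self.doc_counts[c] / total)  # class prior
            for f in self.vocab:
                # Laplace-smoothed probability that n-gram f occurs in class c
                p = (self.feat_counts[c][f] + 1) / (self.doc_counts[c] + 2)
                score += math.log(p if f in feats else 1 - p)
            if score > best_score:
                best, best_score = c, score
        return best


# Toy, hypothetical training data with placeholder labels.
texts = ["you are awful people", "awful people go away",
         "have a nice day", "what a nice sunny day"]
labels = ["flagged", "flagged", "ok", "ok"]
model = OneHotNgramNB().fit(texts, labels)
print(model.predict("awful people"))
```

Each n-gram is treated as a 0/1 indicator, which is what distinguishes this representation from the dense n-gram embeddings used in the second variant.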
Vista.ue at SemEval-2019 Task 5: Single Multilingual Hate Speech Detection Model
Title | Vista.ue at SemEval-2019 Task 5: Single Multilingual Hate Speech Detection Model |
Authors | Kashyap Raiyani, Teresa Gonçalves, Paulo Quaresma, Vitor Nogueira |
Abstract | This paper shares insights from participating in SemEval-2019 Task 5. The main purpose of this system-description paper is to help the reader replicate the developed system and to provide an insightful analysis of it. In Vista.ue, we proposed a single multilingual hate speech detection model. This model was ranked 46/70 for English Task A and 31/43 for English Task B. Vista.ue ranked 38/41 for Spanish Task A and 22/25 for Spanish Task B. |
Tasks | Hate Speech Detection |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2094/ |
https://www.aclweb.org/anthology/S19-2094 | |
PWC | https://paperswithcode.com/paper/vistaue-at-semeval-2019-task-5-single |
Repo | |
Framework | |
Proceedings of the 2nd Workshop on Machine Reading for Question Answering
Title | Proceedings of the 2nd Workshop on Machine Reading for Question Answering |
Authors | |
Abstract | |
Tasks | Question Answering, Reading Comprehension |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5800/ |
https://www.aclweb.org/anthology/D19-5800 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-2nd-workshop-on-machine |
Repo | |
Framework | |
Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Reading Comprehension
Title | Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Reading Comprehension |
Authors | Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Lei Cui, Songhao Piao, Ming Zhou |
Abstract | Most machine reading comprehension (MRC) models separately handle encoding and matching with different network architectures. In contrast, pretrained language models with Transformer layers, such as GPT (Radford et al., 2018) and BERT (Devlin et al., 2018), have achieved competitive performance on MRC. A research question that naturally arises is: apart from the benefits of pre-training, how much of the performance gain comes from the unified network architecture? In this work, we evaluate and analyze unifying encoding and matching components with Transformer for the MRC task. Experimental results on SQuAD show that the unified model outperforms previous networks that separately treat encoding and matching. We also introduce a metric to inspect whether a Transformer layer tends to perform encoding or matching. The analysis results show that the unified model learns different modeling strategies compared with previous manually-designed models. |
Tasks | Machine Reading Comprehension, Reading Comprehension |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5802/ |
https://www.aclweb.org/anthology/D19-5802 | |
PWC | https://paperswithcode.com/paper/inspecting-unification-of-encoding-and |
Repo | |
Framework | |
Multiple Text Style Transfer by using Word-level Conditional Generative Adversarial Network with Two-Phase Training
Title | Multiple Text Style Transfer by using Word-level Conditional Generative Adversarial Network with Two-Phase Training |
Authors | Chih-Te Lai, Yi-Te Hong, Hong-You Chen, Chi-Jen Lu, Shou-De Lin |
Abstract | The objective of non-parallel text style transfer, or controllable text generation, is to alter specific attributes (e.g. sentiment, mood, tense, politeness, etc.) of a given text while preserving its remaining attributes and content. The generative adversarial network (GAN) is a popular model for ensuring that the transferred sentences are realistic and have the desired target styles. However, GAN training often suffers from the mode collapse problem, which leaves the transferred text only weakly related to the original text. In this paper, we propose a new GAN model with a word-level conditional architecture and a two-phase training procedure. By using a style-related condition architecture before generating a word, our model is able to maintain style-unrelated words while changing the others. By separating the training procedure into reconstruction and transfer phases, our model is able to learn a proper text generation process, which further improves the content preservation. We test our model on polarity sentiment transfer and multiple-attribute transfer tasks. The empirical results show that our model achieves comparable evaluation scores in both transfer accuracy and fluency but significantly outperforms other state-of-the-art models in content compatibility on three real-world datasets. |
Tasks | Style Transfer, Text Generation, Text Style Transfer |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1366/ |
https://www.aclweb.org/anthology/D19-1366 | |
PWC | https://paperswithcode.com/paper/multiple-text-style-transfer-by-using-word |
Repo | |
Framework | |
Answer-Supervised Question Reformulation for Enhancing Conversational Machine Comprehension
Title | Answer-Supervised Question Reformulation for Enhancing Conversational Machine Comprehension |
Authors | Qian Li, Hui Su, Cheng Niu, Daling Wang, Zekang Li, Shi Feng, Yifei Zhang |
Abstract | In conversational machine comprehension, integrating conversational history through question reformulation to obtain better answers has become a research hotspot. However, existing question reformulation models are trained using only supervised question labels provided by annotators, without considering any feedback from answers. In this paper, we propose a novel Answer-Supervised Question Reformulation (ASQR) model for enhancing conversational machine comprehension with reinforcement learning. ASQR utilizes a pointer-copy-based question reformulation model as an agent, takes an action to predict the next word, and observes a reward for the whole sentence state after generating the end-of-sequence token. The experimental results on the QuAC dataset show that our ASQR model is more effective in conversational machine comprehension. Moreover, pretraining is essential in reinforcement learning models, so we provide a high-quality annotated dataset for question reformulation by sampling a part of the QuAC dataset. |
Tasks | Reading Comprehension |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5805/ |
https://www.aclweb.org/anthology/D19-5805 | |
PWC | https://paperswithcode.com/paper/answer-supervised-question-reformulation-for |
Repo | |
Framework | |
Improving the Robustness of Deep Reading Comprehension Models by Leveraging Syntax Prior
Title | Improving the Robustness of Deep Reading Comprehension Models by Leveraging Syntax Prior |
Authors | Bowen Wu, Haoyang Huang, Zongsheng Wang, Qihang Feng, Jingsong Yu, Baoxun Wang |
Abstract | Despite the remarkable progress on Machine Reading Comprehension (MRC) with the help of open-source datasets, recent studies indicate that most current MRC systems unfortunately suffer from weak robustness against adversarial samples. To address this issue, we attempt to leverage sentence syntax in the answer prediction process, which previously took only phrase-level semantics into account. Furthermore, to better utilize the sentence syntax and improve the robustness, we propose a Syntactic Leveraging Network, which is designed to deal with adversarial samples by exploiting the syntactic elements of a question. The experimental results indicate that our method is promising for improving the generalization and robustness of MRC models against the influence of adversarial samples, with performance well-maintained. |
Tasks | Machine Reading Comprehension, Reading Comprehension |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5807/ |
https://www.aclweb.org/anthology/D19-5807 | |
PWC | https://paperswithcode.com/paper/improving-the-robustness-of-deep-reading |
Repo | |
Framework | |
Perturbation Analysis of the 8-Point Algorithm: A Case Study for Wide FoV Cameras
Title | Perturbation Analysis of the 8-Point Algorithm: A Case Study for Wide FoV Cameras |
Authors | Thiago L. T. da Silveira, Claudio R. Jung |
Abstract | This paper presents a perturbation analysis for the estimate of epipolar matrices using the 8-Point Algorithm (8-PA). Our approach explores existing bounds for singular subspaces and relates them to the 8-PA, without assuming any kind of error distribution for the matched features. In particular, if we use unit vectors as homogeneous image coordinates, we show that having a wide spatial distribution of matched features in both views tends to generate lower error bounds for the epipolar matrix error. Our experimental validation indicates that the bounds and the effective errors tend to decrease as the camera Field of View (FoV) increases, and that using the 8-PA for spherical images (that present 360°×180° FoV) leads to accurate essential matrices. As an additional contribution, we present bounds for the direction of the translation vector extracted from the essential matrix based on singular subspace analysis. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/da_Silveira_Perturbation_Analysis_of_the_8-Point_Algorithm_A_Case_Study_for_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/da_Silveira_Perturbation_Analysis_of_the_8-Point_Algorithm_A_Case_Study_for_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/perturbation-analysis-of-the-8-point |
Repo | |
Framework | |
Document-Level Event Factuality Identification via Adversarial Neural Network
Title | Document-Level Event Factuality Identification via Adversarial Neural Network |
Authors | Zhong Qian, Peifeng Li, Qiaoming Zhu, Guodong Zhou |
Abstract | Document-level event factuality identification is an important subtask in event factuality and is crucial for discourse understanding in Natural Language Processing (NLP). Previous studies mainly suffer from the scarcity of suitable corpora and effective methods. To solve these two issues, we first construct a corpus annotated with both document- and sentence-level event factuality information on both English and Chinese texts. Then we present an LSTM neural network based on adversarial training with both intra- and inter-sequence attentions to identify document-level event factuality. Experimental results show that our neural network model can outperform various baselines on the constructed corpus. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1287/ |
https://www.aclweb.org/anthology/N19-1287 | |
PWC | https://paperswithcode.com/paper/document-level-event-factuality |
Repo | |
Framework | |
Song Lyrics Summarization Inspired by Audio Thumbnailing
Title | Song Lyrics Summarization Inspired by Audio Thumbnailing |
Authors | Michael Fell, Elena Cabrio, Fabien Gandon, Alain Giboin |
Abstract | Given the peculiar structure of songs, applying generic text summarization methods to lyrics can lead to the generation of highly redundant and incoherent text. In this paper, we propose to enhance state-of-the-art text summarization approaches with a method inspired by audio thumbnailing. Instead of searching for the thumbnail clues in the audio of the song, we identify equivalent clues in the lyrics. We then show how these summaries that take into account the audio nature of the lyrics outperform the generic methods according to both an automatic evaluation and human judgments. |
Tasks | Text Summarization |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/R19-1038/ |
https://www.aclweb.org/anthology/R19-1038 | |
PWC | https://paperswithcode.com/paper/song-lyrics-summarization-inspired-by-audio |
Repo | |
Framework | |
Improving Natural Language Understanding by Reverse Mapping Bytepair Encoding
Title | Improving Natural Language Understanding by Reverse Mapping Bytepair Encoding |
Authors | Chaodong Tong, Huailiang Peng, Qiong Dai, Lei Jiang, Jianghua Huang |
Abstract | We propose a method called reverse mapping bytepair encoding, which maps named-entity information and other word-level linguistic features back to subwords during the encoding procedure of bytepair encoding (BPE). We apply this method to the Generative Pre-trained Transformer (OpenAI GPT) by adding a weighted linear layer after the embedding layer. We also propose a new model architecture, named the multi-channel separate transformer, to employ a training process without parameter-sharing. Evaluation on the Stories Cloze, RTE, SciTail and SST-2 datasets demonstrates the effectiveness of our approach. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/K19-1016/ |
https://www.aclweb.org/anthology/K19-1016 | |
PWC | https://paperswithcode.com/paper/improving-natural-language-understanding-by |
Repo | |
Framework | |
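The core idea in the abstract above, mapping word-level features back to subwords, can be sketched as follows. This is only an illustration under loud assumptions: a greedy longest-match splitter stands in for real BPE merges, and the toy vocabulary, the label set, and the function names `split_into_subwords` and `map_labels_to_subwords` are hypothetical rather than taken from the paper.

```python
def split_into_subwords(word, vocab):
    """Greedy longest-match split of a word into subwords.

    A stand-in for BPE segmentation (assumption): real BPE applies learned
    merge rules, but the label mapping below works the same way.
    """
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate until it is in the vocabulary or one character long.
        while end > start + 1 and word[start:end] not in vocab:
            end -= 1
        pieces.append(word[start:end])
        start = end
    return pieces


def map_labels_to_subwords(words, word_labels, vocab):
    """Copy each word-level label onto every subword of that word."""
    subwords, subword_labels = [], []
    for word, label in zip(words, word_labels):
        pieces = split_into_subwords(word, vocab)
        subwords.extend(pieces)
        subword_labels.extend([label] * len(pieces))
    return subwords, subword_labels


# Hypothetical toy vocabulary and named-entity labels.
vocab = {"Ber", "lin", "is", "love", "ly"}
words = ["Berlin", "is", "lovely"]
labels = ["LOC", "O", "O"]
print(map_labels_to_subwords(words, labels, vocab))
```

The resulting subword-aligned label sequence has the same length as the subword sequence, so it could be embedded and combined with the subword embeddings; how the paper weights that combination is not shown here.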
CN-HIT-MI.T at SemEval-2019 Task 6: Offensive Language Identification Based on BiLSTM with Double Attention
Title | CN-HIT-MI.T at SemEval-2019 Task 6: Offensive Language Identification Based on BiLSTM with Double Attention |
Authors | Yaojie Zhang, Bing Xu, Tiejun Zhao |
Abstract | Offensive language has become pervasive in social media. In Offensive Language Identification tasks, it may be difficult to predict accurately from surface words alone, so we try to extract deeper semantic information from the text. This paper presents an attention-based two-layer bidirectional long short-term memory neural network (BiLSTM) for semantic feature extraction. Additionally, a residual connection mechanism is used to synthesize two different deep features, and an emoji attention mechanism is used to extract semantic information of emojis in text. We participated in three sub-tasks of SemEval 2019 Task 6 as the CN-HIT-MI.T team. Our macro-averaged F1-score in sub-task A is 0.768, ranking 28/103. We got 0.638 in sub-task B, ranking 30/75. In sub-task C, we got 0.549, ranking 22/65. We also tried some other methods whose results were not submitted. |
Tasks | Language Identification |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2101/ |
https://www.aclweb.org/anthology/S19-2101 | |
PWC | https://paperswithcode.com/paper/cn-hit-mit-at-semeval-2019-task-6-offensive |
Repo | |
Framework | |
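The scores reported in the abstract above are macro-averaged F1 values. As a reminder of what that metric computes, here is a minimal sketch: per-class F1 is calculated for each label and the results are averaged with equal weight. The function name `macro_f1` and the toy gold and predicted labels are hypothetical and unrelated to the team's actual submissions.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(gold) | set(predicted))
    f1_scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example with placeholder labels.
gold = ["OFF", "OFF", "NOT", "NOT", "NOT", "NOT"]
pred = ["OFF", "NOT", "NOT", "NOT", "NOT", "OFF"]
print(macro_f1(gold, pred))
```

Because every class contributes equally, a system cannot score well by favoring the majority class, which matters when the label distribution is skewed.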
Finite State Transducer based Morphology analysis for Malayalam Language
Title | Finite State Transducer based Morphology analysis for Malayalam Language |
Authors | Santhosh Thottingal |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-6801/ |
https://www.aclweb.org/anthology/W19-6801 | |
PWC | https://paperswithcode.com/paper/finite-state-transducer-based-morphology |
Repo | |
Framework | |
Emad at SemEval-2019 Task 6: Offensive Language Identification using Traditional Machine Learning and Deep Learning approaches
Title | Emad at SemEval-2019 Task 6: Offensive Language Identification using Traditional Machine Learning and Deep Learning approaches |
Authors | Emad Kebriaei, Samaneh Karimi, Nazanin Sabri, Azadeh Shakery |
Abstract | In this paper, we present the methods used and the results obtained by our team, Emad, on the OffensEval 2019 shared task organized at SemEval 2019. The OffensEval shared task includes three sub-tasks, namely offensive language identification, automatic categorization of offense types, and offense target identification. We participated in sub-task A and tried various methods, including traditional machine learning methods, deep learning methods, and a combination of the two. We also proposed a data augmentation method using word embeddings to improve the performance of our methods. The results show that the augmentation approach outperforms the other methods in terms of macro-F1. |
Tasks | Data Augmentation, Language Identification |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2107/ |
https://www.aclweb.org/anthology/S19-2107 | |
PWC | https://paperswithcode.com/paper/emad-at-semeval-2019-task-6-offensive |
Repo | |
Framework | |
Multi-Level Context Ultra-Aggregation for Stereo Matching
Title | Multi-Level Context Ultra-Aggregation for Stereo Matching |
Authors | Guang-Yu Nie, Ming-Ming Cheng, Yun Liu, Zhengfa Liang, Deng-Ping Fan, Yue Liu, Yongtian Wang |
Abstract | Exploiting multi-level context information in the cost volume can improve the performance of learning-based stereo matching methods. In recent years, 3-D Convolutional Neural Networks (3-D CNNs) have shown advantages in regularizing the cost volume but are limited by unary feature learning in matching cost computation. However, existing methods only use features from plain convolution layers or a simple aggregation of multi-level features to calculate the cost volume, which is insufficient because stereo matching requires discriminative features to identify corresponding pixels in rectified stereo image pairs. In this paper, we propose a unary feature descriptor using multi-level context ultra-aggregation (MCUA), which encapsulates all convolutional features into a more discriminative representation by intra- and inter-level feature combination. Specifically, a child module that takes low-resolution images as input captures larger context information; the larger context information from each layer is densely connected to the main branch of the network. MCUA makes good use of multi-level features with richer context and performs the image-to-image prediction holistically. We introduce our MCUA scheme for cost volume calculation and test it on PSM-Net. We also evaluate our method on the Scene Flow and KITTI 2012/2015 stereo datasets. Experimental results show that our method outperforms state-of-the-art methods by a notable margin and effectively improves the accuracy of stereo matching. |
Tasks | Stereo Matching, Stereo Matching Hand |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Nie_Multi-Level_Context_Ultra-Aggregation_for_Stereo_Matching_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Nie_Multi-Level_Context_Ultra-Aggregation_for_Stereo_Matching_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/multi-level-context-ultra-aggregation-for |
Repo | |
Framework | |