Paper Group NANR 84
Fill the GAP: Exploiting BERT for Pronoun Resolution. An LSTM Adaptation Study of (Un)grammaticality. Stochastic Quantized Activation: To prevent Overfitting in Fast Adversarial Training. Self-Attention Networks for Intent Detection. Modelling Adaptive Presentations in Human-Robot Interaction using Behaviour Trees. From Explainability to Explanatio …
Fill the GAP: Exploiting BERT for Pronoun Resolution
Title | Fill the GAP: Exploiting BERT for Pronoun Resolution |
Authors | Kai-Chou Yang, Timothy Niven, Tzu Hsuan Chou, Hung-Yu Kao |
Abstract | In this paper, we describe our entry in the gendered pronoun resolution competition which achieved fourth place without data augmentation. Our method is an ensemble system of BERTs which resolves co-reference in an interaction space. We report four insights from our work: BERT's representations involve significant redundancy; modeling interaction effects similar to natural language inference models is useful for this task; there is an optimal BERT layer to extract representations for pronoun resolution; and the difference between the attention weights from the pronoun to the candidate entities was highly correlated with the correct label, with interesting implications for future work. |
Tasks | Data Augmentation, Natural Language Inference |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3815/ |
PWC | https://paperswithcode.com/paper/fill-the-gap-exploiting-bert-for-pronoun |
Repo | |
Framework | |
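The fourth insight above (the attention-weight difference from the pronoun to the candidate entities correlating with the correct label) can be illustrated with a toy sketch. The function name, the span convention, and the example attention map below are illustrative assumptions, not the authors' code:

```python
import numpy as np

def attention_preference(attn, pronoun_idx, span_a, span_b):
    """Compare the attention mass flowing from the pronoun token to each
    candidate entity's token span in one [seq_len, seq_len] attention map.

    A positive return value means the pronoun attends more to candidate A.
    """
    mass_a = attn[pronoun_idx, span_a[0]:span_a[1]].sum()
    mass_b = attn[pronoun_idx, span_b[0]:span_b[1]].sum()
    return mass_a - mass_b

# Toy attention map: the pronoun (token 4) attends mostly to tokens 0-1.
attn = np.full((5, 5), 0.04)
attn[4] = [0.4, 0.3, 0.1, 0.1, 0.1]
print(attention_preference(attn, pronoun_idx=4, span_a=(0, 2), span_b=(2, 4)))
```

In the paper's setting, a signal like this (aggregated over heads and layers) would be correlated with the gold antecedent label rather than used directly as a classifier.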
An LSTM Adaptation Study of (Un)grammaticality
Title | An LSTM Adaptation Study of (Un)grammaticality |
Authors | Shammur Absar Chowdhury, Roberto Zamparelli |
Abstract | We propose a novel approach to the study of how artificial neural networks perceive the distinction between grammatical and ungrammatical sentences, a crucial task in the growing field of synthetic linguistics. The method is based on performance measures of language models trained on corpora and fine-tuned with either grammatical or ungrammatical sentences, then applied to (different types of) grammatical or ungrammatical sentences. The results show that both in the difficult and highly symmetrical task of detecting subject islands and in the more open CoLA dataset, grammatical sentences give rise to better scores than ungrammatical ones, possibly because they can be better integrated within the body of linguistic structural knowledge that the language model has accumulated. |
Tasks | Language Modelling |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4821/ |
PWC | https://paperswithcode.com/paper/an-lstm-adaptation-study-of-ungrammaticality |
Repo | |
Framework | |
Stochastic Quantized Activation: To prevent Overfitting in Fast Adversarial Training
Title | Stochastic Quantized Activation: To prevent Overfitting in Fast Adversarial Training |
Authors | Wonjun Yoon, Jisuk Park, Daeshik Kim |
Abstract | Existing neural networks are vulnerable to “adversarial examples”, created by adding maliciously designed small perturbations to inputs to induce a misclassification by the networks. The most investigated defense strategy is adversarial training, which augments training data with adversarial examples. However, applying single-step adversaries in adversarial training does not improve the robustness of the networks; instead, it can even cause the networks to overfit. In contrast to single-step training, multi-step training achieves state-of-the-art performance on MNIST and CIFAR10, yet it requires a massive amount of time. Therefore, we propose a method, Stochastic Quantized Activation (SQA), that solves the overfitting problem in single-step adversarial training and quickly achieves robustness comparable to multi-step training. SQA attenuates adversarial effects by introducing random selectivity into activation functions and allows the network to learn robustness with only single-step training. Throughout our experiments, the method demonstrates state-of-the-art robustness against one of the strongest white-box attacks, PGD, but with much less computational cost than PGD training. Finally, we visualize how the network with SQA learns to handle strong adversaries, which differs from existing methods. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=ryxeB30cYX |
https://openreview.net/pdf?id=ryxeB30cYX | |
PWC | https://paperswithcode.com/paper/stochastic-quantized-activation-to-prevent |
Repo | |
Framework | |
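The abstract does not spell out the exact quantization rule, so the following is only a minimal sketch of one common stochastic quantization scheme (stochastic rounding between uniform levels), with a hypothetical function name and an assumed [0, 1] activation range:

```python
import numpy as np

def stochastic_quantized_activation(x, levels=4, rng=None):
    """Stochastically round activations to one of `levels` uniform bins.

    The probability of rounding up equals the fractional distance to the
    next level, so the quantization is unbiased in expectation while the
    randomness disrupts the gradient signal an attacker could exploit.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.clip(x, 0.0, 1.0)              # assume activations lie in [0, 1]
    scaled = x * (levels - 1)
    lower = np.floor(scaled)
    round_up = rng.random(x.shape) < (scaled - lower)
    return (lower + round_up) / (levels - 1)
```

Exact level values that fall on a bin boundary pass through unchanged, while intermediate values land on a neighboring level at random, which is the "random selectivity" flavor the abstract describes.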
Self-Attention Networks for Intent Detection
Title | Self-Attention Networks for Intent Detection |
Authors | Sevinj Yolchuyeva, Géza Németh, Bálint Gyires-Tóth |
Abstract | Self-attention networks (SANs) have shown promising performance in various Natural Language Processing (NLP) scenarios, especially in machine translation. One of the main strengths of SANs is their ability to capture long-range and multi-scale dependencies from the data. In this paper, we present a novel intent detection system which is based on a self-attention network and a Bi-LSTM. Our approach shows improvement over previous solutions by using a transformer model and a deep averaging network-based universal sentence encoder. We evaluate the system on the Snips, Smart Speaker, Smart Lights, and ATIS datasets with different evaluation metrics. The performance of the proposed model is compared with an LSTM on the same datasets. |
Tasks | Intent Detection, Machine Translation |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/R19-1157/ |
PWC | https://paperswithcode.com/paper/self-attention-networks-for-intent-detection |
Repo | |
Framework | |
Modelling Adaptive Presentations in Human-Robot Interaction using Behaviour Trees
Title | Modelling Adaptive Presentations in Human-Robot Interaction using Behaviour Trees |
Authors | Nils Axelsson, Gabriel Skantze |
Abstract | In dialogue, speakers continuously adapt their speech to accommodate the listener, based on the feedback they receive. In this paper, we explore the modelling of such behaviours in the context of a robot presenting a painting. A Behaviour Tree is used to organise the behaviour on different levels, and allow the robot to adapt its behaviour in real-time; the tree organises engagement, joint attention, turn-taking, feedback and incremental speech processing. An initial implementation of the model is presented, and the system is evaluated in a user study, where the adaptive robot presenter is compared to a non-adaptive version. The adaptive version is found to be more engaging by the users, although no effects are found on the retention of the presented material. |
Tasks | |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/W19-5940/ |
PWC | https://paperswithcode.com/paper/modelling-adaptive-presentations-in-human |
Repo | |
Framework | |
From Explainability to Explanation: Using a Dialogue Setting to Elicit Annotations with Justifications
Title | From Explainability to Explanation: Using a Dialogue Setting to Elicit Annotations with Justifications |
Authors | Nazia Attari, Martin Heckmann, David Schlangen |
Abstract | Despite recent attempts in the field of explainable AI to go beyond black-box prediction models, the training data for supervised machine learning is typically already collected in a manner that treats the annotator as a “black box”, the internal workings of which remain unobserved. We present an annotation method where a task is given to a pair of annotators who collaborate on finding the best response. With this we want to shed light on the questions of whether the collaboration increases the quality of the responses and whether this “thinking together” provides useful information in itself, as it at least partially reveals their reasoning steps. Furthermore, we expect that this setting puts the focus on explanation as a linguistic act, versus explainability as a property of models. In a crowd-sourcing experiment, we investigated three different annotation tasks, each in a collaborative dialogical (two annotators) and monological (one annotator) setting. Our results indicate that our experiment elicits collaboration and that this collaboration increases the response accuracy. We see large differences in the annotators' behavior depending on the task. Similarly, we also observe that the dialog patterns emerging from the collaboration vary significantly with the task. |
Tasks | |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/W19-5938/ |
PWC | https://paperswithcode.com/paper/from-explainability-to-explanation-using-a |
Repo | |
Framework | |
Neural News Recommendation with Topic-Aware News Representation
Title | Neural News Recommendation with Topic-Aware News Representation |
Authors | Chuhan Wu, Fangzhao Wu, Mingxiao An, Yongfeng Huang, Xing Xie |
Abstract | News recommendation can help users find news they are interested in and alleviate information overload. The topic information of news is critical for learning accurate news and user representations for news recommendation. However, it is not considered in many existing news recommendation methods. In this paper, we propose a neural news recommendation approach with topic-aware news representations. The core of our approach is a topic-aware news encoder and a user encoder. In the news encoder we learn representations of news from their titles via CNNs and apply attention networks to select important words. In addition, we propose to learn topic-aware news representations by jointly training the news encoder with an auxiliary topic classification task. In the user encoder we learn the representations of users from their browsed news and use attention networks to select informative news for user representation learning. Extensive experiments on a real-world dataset validate the effectiveness of our approach. |
Tasks | Representation Learning |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1110/ |
PWC | https://paperswithcode.com/paper/neural-news-recommendation-with-topic-aware |
Repo | |
Framework | |
Length of non-projective sentences: A pilot study using a Czech UD treebank
Title | Length of non-projective sentences: A pilot study using a Czech UD treebank |
Authors | Ján Mačutek, Radek Čech, Jiří Milička |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-7913/ |
PWC | https://paperswithcode.com/paper/length-of-non-projective-sentences-a-pilot |
Repo | |
Framework | |
Detecting Aggression and Toxicity using a Multi Dimension Capsule Network
Title | Detecting Aggression and Toxicity using a Multi Dimension Capsule Network |
Authors | Saurabh Srivastava, Prerna Khurana |
Abstract | In the era of social media, hate speech, trolling and verbal abuse have become a common issue. We present an approach to automatically classify such statements, using a new deep learning architecture. Our model comprises a Multi Dimension Capsule Network that generates the representation of sentences which we use for classification. We further provide an analysis of our model's interpretation of such statements. We compare the results of our model with state-of-the-art classification algorithms and demonstrate our model's ability. It also has the capability to handle comments that are written in both Hindi and English, which are provided in the TRAC dataset. We also compare results on Kaggle's Toxic Comment Classification dataset. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3517/ |
PWC | https://paperswithcode.com/paper/detecting-aggression-and-toxicity-using-a |
Repo | |
Framework | |
Prediction of User Emotion and Dialogue Success Using Audio Spectrograms and Convolutional Neural Networks
Title | Prediction of User Emotion and Dialogue Success Using Audio Spectrograms and Convolutional Neural Networks |
Authors | Athanasios Lykartsis, Margarita Kotti |
Abstract | In this paper we aim to predict dialogue success and user satisfaction, as well as emotion, on a turn level. To achieve this, we investigate the use of spectrogram representations, extracted from audio files, in combination with several types of convolutional neural networks. The experiments were performed on the Let's Go V2 database, comprising 5065 audio files and having labels for subjective and objective dialogue turn success, as well as the emotional state of the user. Results show that by using only audio, it is possible to predict turn success with very high accuracy for all three labels (90%). The best-performing input representation was 1 s-long mel-spectrograms in combination with a CNN with a bottleneck architecture. The resulting system has the potential to be used in real time. Our results significantly surpass the state of the art for dialogue success prediction based only on audio. |
Tasks | |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/W19-5939/ |
PWC | https://paperswithcode.com/paper/prediction-of-user-emotion-and-dialogue |
Repo | |
Framework | |
Context-specific Language Modeling for Human Trafficking Detection from Online Advertisements
Title | Context-specific Language Modeling for Human Trafficking Detection from Online Advertisements |
Authors | Saeideh Shahrokh Esfahani, Michael J. Cafarella, Maziyar Baran Pouyan, Gregory DeAngelo, Elena Eneva, Andy E. Fano |
Abstract | Human trafficking is a worldwide crisis. Traffickers exploit their victims by anonymously offering sexual services through online advertisements. These ads often contain clues that law enforcement can use to separate out potential trafficking cases from volunteer sex advertisements. The problem is that the sheer volume of ads is too overwhelming for manual processing. Ideally, a centralized semi-automated tool can be used to assist law enforcement agencies with this task. Here, we present an approach using natural language processing to identify trafficking ads on these websites. We propose a classifier by integrating multiple text feature sets, including the publicly available pre-trained textual language model Bidirectional Encoder Representations from Transformers (BERT). In this paper, we demonstrate that a classifier using this composite feature set has significantly better performance compared to any single feature set alone. |
Tasks | Language Modelling |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1114/ |
PWC | https://paperswithcode.com/paper/context-specific-language-modeling-for-human |
Repo | |
Framework | |
A Theory of Fermat Paths for Non-Line-Of-Sight Shape Reconstruction
Title | A Theory of Fermat Paths for Non-Line-Of-Sight Shape Reconstruction |
Authors | Shumian Xin, Sotiris Nousias, Kiriakos N. Kutulakos, Aswin C. Sankaranarayanan, Srinivasa G. Narasimhan, Ioannis Gkioulekas |
Abstract | We present a novel theory of Fermat paths of light between a known visible scene and an unknown object not in the line of sight of a transient camera. These light paths either obey specular reflection or are reflected by the object’s boundary, and hence encode the shape of the hidden object. We prove that Fermat paths correspond to discontinuities in the transient measurements. We then derive a novel constraint that relates the spatial derivatives of the path lengths at these discontinuities to the surface normal. Based on this theory, we present an algorithm, called Fermat Flow, to estimate the shape of the non-line-of-sight object. Our method allows, for the first time, accurate shape recovery of complex objects, ranging from diffuse to specular, that are hidden around the corner as well as hidden behind a diffuser. Finally, our approach is agnostic to the particular technology used for transient imaging. As such, we demonstrate mm-scale shape recovery from picosecond-scale transients using a SPAD and ultrafast laser, as well as micron-scale reconstruction from femtosecond-scale transients using interferometry. We believe our work is a significant advance over the state-of-the-art in non-line-of-sight imaging. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Xin_A_Theory_of_Fermat_Paths_for_Non-Line-Of-Sight_Shape_Reconstruction_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Xin_A_Theory_of_Fermat_Paths_for_Non-Line-Of-Sight_Shape_Reconstruction_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/a-theory-of-fermat-paths-for-non-line-of |
Repo | |
Framework | |
A Paraphrase Generation System for EHR Question Answering
Title | A Paraphrase Generation System for EHR Question Answering |
Authors | Sarvesh Soni, Kirk Roberts |
Abstract | This paper proposes a dataset and method for automatically generating paraphrases for clinical questions relating to patient-specific information in electronic health records (EHRs). Crowdsourcing is used to collect 10,578 unique questions across 946 semantically distinct paraphrase clusters. This corpus is then used with a deep learning-based question paraphrasing method utilizing a variational autoencoder and an LSTM encoder/decoder. The ultimate use of such a method is to improve the performance of automatic question answering methods for EHRs. |
Tasks | Paraphrase Generation, Question Answering |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5003/ |
PWC | https://paperswithcode.com/paper/a-paraphrase-generation-system-for-ehr |
Repo | |
Framework | |
Biomedical Event Extraction based on Knowledge-driven Tree-LSTM
Title | Biomedical Event Extraction based on Knowledge-driven Tree-LSTM |
Authors | Diya Li, Lifu Huang, Heng Ji, Jiawei Han |
Abstract | Event extraction for the biomedical domain is more challenging than that in the general news domain since it requires broader acquisition of domain-specific knowledge and deeper understanding of complex contexts. To better encode contextual information and external background knowledge, we propose a novel knowledge base (KB)-driven tree-structured long short-term memory network (Tree-LSTM) framework, incorporating two new types of features: (1) dependency structures to capture wide contexts; (2) entity properties (types and category descriptions) from external ontologies via entity linking. We evaluate our approach on the BioNLP shared task with the Genia dataset and achieve a new state-of-the-art result. In addition, both quantitative and qualitative studies demonstrate the advancement of the Tree-LSTM and the external knowledge representation for biomedical event extraction. |
Tasks | Entity Linking |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1145/ |
PWC | https://paperswithcode.com/paper/biomedical-event-extraction-based-on |
Repo | |
Framework | |
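The building block named in the entry above is the Child-Sum Tree-LSTM cell (Tai et al. 2015), which sums the children's hidden states for the input, output, and update gates but applies a separate forget gate to each child's cell state. A minimal numpy sketch follows; the parameter packing (dicts of per-gate matrices) is an illustrative assumption, and the paper's KB-driven features are not reproduced:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def child_sum_tree_lstm_cell(x, child_h, child_c, W, U, b):
    """One Child-Sum Tree-LSTM step over a node with k children.

    x: [d_in] input; child_h, child_c: [k, d] children hidden/cell states;
    W: {gate: [d, d_in]}, U: {gate: [d, d]}, b: {gate: [d]} for gates i, f, o, u.
    """
    h_sum = child_h.sum(axis=0)                      # aggregate children
    i = sigmoid(W["i"] @ x + U["i"] @ h_sum + b["i"])
    o = sigmoid(W["o"] @ x + U["o"] @ h_sum + b["o"])
    u = np.tanh(W["u"] @ x + U["u"] @ h_sum + b["u"])
    # one forget gate per child, conditioned on that child's hidden state
    f = sigmoid(W["f"] @ x + child_h @ U["f"].T + b["f"])
    c = i * u + (f * child_c).sum(axis=0)            # gated cell update
    h = o * np.tanh(c)
    return h, c
```

With `child_h`/`child_c` of shape `[0, d]` the same code handles leaf nodes, since the empty sums reduce to zero vectors.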
Wide-Area Crowd Counting via Ground-Plane Density Maps and Multi-View Fusion CNNs
Title | Wide-Area Crowd Counting via Ground-Plane Density Maps and Multi-View Fusion CNNs |
Authors | Qi Zhang, Antoni B. Chan |
Abstract | Crowd counting in single-view images has achieved outstanding performance on existing counting datasets. However, single-view counting is not applicable to large and wide scenes (e.g., public parks, long subway platforms, or event spaces) because a single camera cannot capture the whole scene in adequate detail for counting, e.g., when the scene is too large to fit into the field-of-view of the camera, too long so that the resolution is too low on faraway crowds, or when there are too many large objects that occlude large portions of the crowd. Therefore, solving the wide-area counting task requires multiple cameras with overlapping fields-of-view. In this paper, we propose a deep neural network framework for multi-view crowd counting, which fuses information from multiple camera views to predict a scene-level density map on the ground-plane of the 3D world. We consider 3 versions of the fusion framework: the late fusion model fuses camera-view density maps; the naive early fusion model fuses camera-view feature maps; and the multi-view multi-scale early fusion model encourages features aligned to the same ground-plane point to have consistent scales. We test our 3 fusion models on 3 multi-view counting datasets, PETS2009, DukeMTMC, and a newly collected multi-view counting dataset containing a crowded street intersection. Our methods achieve state-of-the-art results compared to other multi-view counting baselines. |
Tasks | Crowd Counting |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Zhang_Wide-Area_Crowd_Counting_via_Ground-Plane_Density_Maps_and_Multi-View_Fusion_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhang_Wide-Area_Crowd_Counting_via_Ground-Plane_Density_Maps_and_Multi-View_Fusion_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/wide-area-crowd-counting-via-ground-plane |
Repo | |
Framework | |
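Fusing camera views on a common ground plane, as in the entry above, rests on projecting per-camera coordinates through a camera-to-ground homography. A generic sketch of that projection step (not the authors' implementation; the function name is an assumption):

```python
import numpy as np

def project_to_ground_plane(points, H):
    """Map [N, 2] pixel coordinates onto the ground plane via a 3x3
    camera-to-ground homography H, using homogeneous coordinates."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])  # lift to homogeneous
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]                   # perspective divide

# With the identity homography, points map to themselves.
pts = np.array([[10.0, 20.0], [3.0, 4.0]])
print(project_to_ground_plane(pts, np.eye(3)))
```

In the multi-view setting, each camera contributes its own H, and density or feature maps warped this way land in a shared ground-plane frame where the fusion CNN can combine them.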