Paper Group NANR 169
Semi-Supervised Semantic Role Labeling with Cross-View Training
Title | Semi-Supervised Semantic Role Labeling with Cross-View Training |
Authors | Rui Cai, Mirella Lapata |
Abstract | The successful application of neural networks to a variety of NLP tasks has provided strong impetus to develop end-to-end models for semantic role labeling which forgo the need for extensive feature engineering. Recent approaches rely on high-quality annotations which are costly to obtain, and mostly unavailable in low resource scenarios (e.g., rare languages or domains). Our work aims to reduce the annotation effort involved via semi-supervised learning. We propose an end-to-end SRL model and demonstrate it can effectively leverage unlabeled data under the cross-view training modeling paradigm. Our LSTM-based semantic role labeler is jointly trained with a sentence learner, which performs POS tagging, dependency parsing, and predicate identification, which we argue are critical to learning directly from unlabeled data without recourse to external pre-processing tools. Experimental results on the CoNLL-2009 benchmark dataset show that our model outperforms the state of the art in English, and consistently improves performance in other languages, including Chinese, German, and Spanish. |
Tasks | Dependency Parsing, Feature Engineering, Semantic Role Labeling |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1094/ |
https://www.aclweb.org/anthology/D19-1094 | |
PWC | https://paperswithcode.com/paper/semi-supervised-semantic-role-labeling-with |
Repo | |
Framework | |
Weak Supervision for Learning Discourse Structure
Title | Weak Supervision for Learning Discourse Structure |
Authors | Sonia Badene, Kate Thompson, Jean-Pierre Lorré, Nicholas Asher |
Abstract | This paper provides a detailed comparison of a data programming approach with (i) off-the-shelf, state-of-the-art deep learning architectures that optimize their representations (BERT) and (ii) handcrafted-feature approaches previously used in the discourse analysis literature. We compare these approaches on the task of learning discourse structure for multi-party dialogue. The data programming paradigm offered by the Snorkel framework allows a user to label training data using expert-composed heuristics, which are then transformed via the "generative step" into probability distributions of the class labels given the data. We show that on our task the generative model outperforms both deep learning architectures as well as more traditional ML approaches when learning discourse structure; it even outperforms the combination of deep learning methods and hand-crafted features. We also implement several strategies for "decoding" our generative model output in order to improve our results. We conclude that weak supervision methods hold great promise as a means for creating and improving data sets for discourse structure. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1234/ |
https://www.aclweb.org/anthology/D19-1234 | |
PWC | https://paperswithcode.com/paper/weak-supervision-for-learning-discourse |
Repo | |
Framework | |
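The data programming idea described in the abstract above can be illustrated with a minimal sketch. The heuristics, label names, candidate fields, and the `weak_label` helper are hypothetical and not taken from the paper. The sketch also swaps Snorkel's generative step for a plain vote count, so it only shows how heuristic votes can be turned into a probability distribution over class labels.

```python
from collections import Counter

ABSTAIN = None  # a labeling function returns this when it has no opinion
LABELS = ["attached", "not_attached"]

# Hypothetical heuristics over a candidate pair of dialogue turns.
def lf_question_then_reply(c):
    # A question followed by a turn from a different speaker.
    if c["first"].endswith("?") and c["speaker_a"] != c["speaker_b"]:
        return "attached"
    return ABSTAIN

def lf_far_apart(c):
    # Turns many positions apart are assumed to be unrelated.
    return "not_attached" if c["distance"] > 5 else ABSTAIN

def lf_adjacent(c):
    return "attached" if c["distance"] == 1 else ABSTAIN

LABELING_FUNCTIONS = [lf_question_then_reply, lf_far_apart, lf_adjacent]

def weak_label(candidate):
    """Turn labeling-function votes into a distribution over LABELS."""
    votes = Counter(
        v for v in (lf(candidate) for lf in LABELING_FUNCTIONS) if v is not ABSTAIN
    )
    total = sum(votes.values())
    if total == 0:
        # No heuristic fired: fall back to a uniform distribution.
        return {label: 1.0 / len(LABELS) for label in LABELS}
    return {label: votes[label] / total for label in LABELS}

example = {"first": "Anyone have wheat?", "speaker_a": "A", "speaker_b": "B", "distance": 1}
probs = weak_label(example)
```

A learned generative model would additionally weight each heuristic by its estimated accuracy instead of counting every vote equally.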
Structure-Preserving Stereoscopic View Synthesis With Multi-Scale Adversarial Correlation Matching
Title | Structure-Preserving Stereoscopic View Synthesis With Multi-Scale Adversarial Correlation Matching |
Authors | Yu Zhang, Dongqing Zou, Jimmy S. Ren, Zhe Jiang, Xiaohao Chen |
Abstract | This paper addresses stereoscopic view synthesis from a single image. Various recent works solve this task by reorganizing pixels from the input view to reconstruct the target one in a stereo setup. However, purely depending on such photometric-based reconstruction process, the network may produce structurally inconsistent results. Regarding this issue, this work proposes Multi-Scale Adversarial Correlation Matching (MS-ACM), a novel learning framework for structure-aware view synthesis. The proposed framework does not assume any costly supervision signal of scene structures such as depth. Instead, it models structures as self-correlation coefficients extracted from multi-scale feature maps in transformed spaces. In training, the feature space attempts to push the correlation distances between the synthesized and target images far apart, thus amplifying inconsistent structures. At the same time, the view synthesis network minimizes such correlation distances by fixing mistakes it makes. With such adversarial training, structural errors of different scales and levels are iteratively discovered and reduced, preserving both global layouts and fine-grained details. Extensive experiments on the KITTI benchmark show that MS-ACM improves both visual quality and the metrics over existing methods when plugged into recent view synthesis architectures. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Zhang_Structure-Preserving_Stereoscopic_View_Synthesis_With_Multi-Scale_Adversarial_Correlation_Matching_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhang_Structure-Preserving_Stereoscopic_View_Synthesis_With_Multi-Scale_Adversarial_Correlation_Matching_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/structure-preserving-stereoscopic-view |
Repo | |
Framework | |
Hierarchy Response Learning for Neural Conversation Generation
Title | Hierarchy Response Learning for Neural Conversation Generation |
Authors | Bo Zhang, Xiaoming Zhang |
Abstract | The neural encoder-decoder models have shown great promise in neural conversation generation. However, they cannot perceive and express the intention effectively, and hence often generate dull and generic responses. Unlike past work that has focused on diversifying the output at word-level or discourse-level with a flat model to alleviate this problem, we propose a hierarchical generation model to capture the different levels of diversity using the conditional variational autoencoders. Specifically, a hierarchical response generation (HRG) framework is proposed to capture the conversation intention in a natural and coherent way. It has two modules, namely, an expression reconstruction model to capture the hierarchical correlation between expression and intention, and an expression attention model to effectively combine the expressions with contents. Finally, the training procedure of HRG is improved by introducing reconstruction loss. Experiment results show that our model can generate the responses with more appropriate content and expression. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1186/ |
https://www.aclweb.org/anthology/D19-1186 | |
PWC | https://paperswithcode.com/paper/hierarchy-response-learning-for-neural |
Repo | |
Framework | |
Predicting Malware Attributes from Cybersecurity Texts
Title | Predicting Malware Attributes from Cybersecurity Texts |
Authors | Arpita Roy, Youngja Park, Shimei Pan |
Abstract | Text analytics is a useful tool for studying malware behavior and tracking emerging threats. The task of automated malware attribute identification based on cybersecurity texts is very challenging due to a large number of malware attribute labels and a small number of training instances. In this paper, we propose a novel feature learning method to leverage diverse knowledge sources such as a small amount of human annotations, unlabeled text, and specifications about malware attribute labels. Our evaluation has demonstrated the effectiveness of our method over the state-of-the-art malware attribute prediction systems. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1293/ |
https://www.aclweb.org/anthology/N19-1293 | |
PWC | https://paperswithcode.com/paper/predicting-malware-attributes-from |
Repo | |
Framework | |
Characterization and Learning of Causal Graphs with Latent Variables from Soft Interventions
Title | Characterization and Learning of Causal Graphs with Latent Variables from Soft Interventions |
Authors | Murat Kocaoglu, Amin Jaber, Karthikeyan Shanmugam, Elias Bareinboim |
Abstract | The challenge of learning the causal structure underlying a certain phenomenon is undertaken by connecting the set of conditional independences (CIs) readable from the observational data, on the one side, with the set of corresponding constraints implied over the graphical structure, on the other, which are tied through a graphical criterion known as d-separation (Pearl, 1988). In this paper, we investigate the more general scenario where multiple observational and experimental distributions are available. We start with the simple observation that the invariances given by CIs/d-separation are just one special type of a broader set of constraints, which follow from the careful comparison of the different distributions available. Remarkably, these new constraints are intrinsically connected with do-calculus (Pearl, 1995) in the context of soft-interventions. We introduce a novel notion of interventional equivalence class of causal graphs with latent variables based on these invariances, which associates each graphical structure with a set of interventional distributions that respect the do-calculus rules. Given a collection of distributions, two causal graphs are called interventionally equivalent if they are associated with the same family of interventional distributions, where the elements of the family are indistinguishable using the invariances obtained from a direct application of the calculus rules. We introduce a graphical representation that can be used to determine if two causal graphs are interventionally equivalent. We provide a formal graphical characterization of this equivalence. Finally, we extend the FCI algorithm, which was originally designed to operate based on CIs, to combine observational and interventional datasets, including new orientation rules particular to this setting. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9581-characterization-and-learning-of-causal-graphs-with-latent-variables-from-soft-interventions |
http://papers.nips.cc/paper/9581-characterization-and-learning-of-causal-graphs-with-latent-variables-from-soft-interventions.pdf | |
PWC | https://paperswithcode.com/paper/characterization-and-learning-of-causal |
Repo | |
Framework | |
Incorporating Word Attention into Character-Based Word Segmentation
Title | Incorporating Word Attention into Character-Based Word Segmentation |
Authors | Shohei Higashiyama, Masao Utiyama, Eiichiro Sumita, Masao Ideuchi, Yoshiaki Oida, Yohei Sakamoto, Isaac Okada |
Abstract | Neural network models have been actively applied to word segmentation, especially Chinese, because of the ability to minimize the effort in feature engineering. Typical segmentation models are categorized as character-based, for conducting exact inference, or word-based, for utilizing word-level information. We propose a character-based model utilizing word information to leverage the advantages of both types of models. Our model learns the importance of multiple candidate words for a character on the basis of an attention mechanism, and makes use of it for segmentation decisions. The experimental results show that our model achieves better performance than the state-of-the-art models on both Japanese and Chinese benchmark datasets. |
Tasks | Feature Engineering |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1276/ |
https://www.aclweb.org/anthology/N19-1276 | |
PWC | https://paperswithcode.com/paper/incorporating-word-attention-into-character |
Repo | |
Framework | |
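The attention step mentioned in the abstract above, weighting several candidate words for one character, can be sketched as follows. This is generic dot-product attention under assumed inputs, not the paper's architecture: the `word_attention` helper, the two-dimensional vectors, and the candidate words are all made up for illustration.

```python
import math

def word_attention(char_vec, word_vecs):
    """Return softmax weights over candidate words and their weighted sum."""
    scores = [sum(c * w for c, w in zip(char_vec, vec)) for vec in word_vecs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    summary = [
        sum(weight * vec[k] for weight, vec in zip(weights, word_vecs))
        for k in range(len(char_vec))
    ]
    return weights, summary

# The character "京" in "東京都" belongs to two candidate words.
candidates = ["東京", "京都"]
char_vec = [1.0, 0.0]                  # hypothetical character representation
word_vecs = [[2.0, 0.0], [0.0, 2.0]]   # hypothetical word embeddings
weights, summary = word_attention(char_vec, word_vecs)
```

The weighted sum `summary` is what a character-level tagger could consume alongside the character representation when deciding where word boundaries fall.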
Unsupervised Learning of Cross-Lingual Symbol Embeddings Without Parallel Data
Title | Unsupervised Learning of Cross-Lingual Symbol Embeddings Without Parallel Data |
Authors | Mark Granroth-Wilding, Hannu Toivonen |
Abstract | |
Tasks | |
Published | 2019-01-01 |
URL | https://www.aclweb.org/anthology/W19-0103/ |
https://www.aclweb.org/anthology/W19-0103 | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-cross-lingual-symbol |
Repo | |
Framework | |
Mop Moire Patterns Using MopNet
Title | Mop Moire Patterns Using MopNet |
Authors | Bin He, Ce Wang, Boxin Shi, Ling-Yu Duan |
Abstract | Moire patterns are a common image quality degradation caused by frequency aliasing between monitors and cameras when taking screen-shot photos. The complex frequency distribution, imbalanced magnitude in colour channels, and diverse appearance attributes of moire patterns make their removal a challenging problem. In this paper, we propose a Moire pattern Removal Neural Network (MopNet) to solve this problem. All core components of MopNet are specially designed for unique properties of moire patterns, including the multi-scale feature aggregation addressing complex frequency, the channel-wise target edge predictor to exploit imbalanced magnitude among colour channels, and the attribute-aware classifier to characterize the diverse appearance for better modelling moire patterns. Quantitative and qualitative experimental comparisons validate the state-of-the-art performance of MopNet. |
Tasks | |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/He_Mop_Moire_Patterns_Using_MopNet_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/He_Mop_Moire_Patterns_Using_MopNet_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/mop-moire-patterns-using-mopnet |
Repo | |
Framework | |
Evaluation Order Effects in Dynamic Continuized CCG: From Negative Polarity Items to Balanced Punctuation
Title | Evaluation Order Effects in Dynamic Continuized CCG: From Negative Polarity Items to Balanced Punctuation |
Authors | Michael White |
Abstract | |
Tasks | |
Published | 2019-01-01 |
URL | https://www.aclweb.org/anthology/W19-0123/ |
https://www.aclweb.org/anthology/W19-0123 | |
PWC | https://paperswithcode.com/paper/evaluation-order-effects-in-dynamic |
Repo | |
Framework | |
Fact or Factitious? Contextualized Opinion Spam Detection
Title | Fact or Factitious? Contextualized Opinion Spam Detection |
Authors | Stefan Kennedy, Niall Walsh, Kirils Sloka, Andrew McCarren, Jennifer Foster |
Abstract | In this paper we perform an analytic comparison of a number of techniques used to detect fake and deceptive online reviews. We apply a number of machine learning approaches found to be effective, and introduce our own approach by fine-tuning state-of-the-art contextualised embeddings. The results we obtain show the potential of contextualised embeddings for fake review detection, and lay the groundwork for future research in this area. |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-2048/ |
https://www.aclweb.org/anthology/P19-2048 | |
PWC | https://paperswithcode.com/paper/fact-or-factitious-contextualized-opinion |
Repo | |
Framework | |
FAVAE: SEQUENCE DISENTANGLEMENT USING INFORMATION BOTTLENECK PRINCIPLE
Title | FAVAE: SEQUENCE DISENTANGLEMENT USING INFORMATION BOTTLENECK PRINCIPLE |
Authors | Masanori Yamada, Kim Heecheol, Kosuke Miyoshi, Hiroshi Yamakawa |
Abstract | A state-of-the-art generative model, a "factorized action variational autoencoder (FAVAE)," is presented for learning disentangled and interpretable representations from sequential data via the information bottleneck without supervision. The purpose of disentangled representation learning is to obtain interpretable and transferable representations from data. We focused on the disentangled representation of sequential data because there is a wide range of potential applications if disentangled representation is extended to sequential data such as video, speech, and stock price data. Sequential data is characterized by dynamic factors and static factors: dynamic factors are time-dependent, and static factors are independent of time. Previous works succeed in disentangling static factors and dynamic factors by explicitly modeling the priors of latent variables to distinguish between static and dynamic factors. However, these models cannot disentangle representations between dynamic factors, such as disentangling "picking" and "throwing" in robotic tasks. In this paper, we propose a new model that can disentangle multiple dynamic factors. Since our method does not require modeling priors, it is capable of disentangling "between" dynamic factors. In experiments, we show that FAVAE can extract the disentangled dynamic factors. |
Tasks | Representation Learning |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=Hygm8jC9FQ |
https://openreview.net/pdf?id=Hygm8jC9FQ | |
PWC | https://paperswithcode.com/paper/favae-sequence-disentanglement-using-in-1 |
Repo | |
Framework | |
Argument Component Classification by Relation Identification by Neural Network and TextRank
Title | Argument Component Classification by Relation Identification by Neural Network and TextRank |
Authors | Mamoru Deguchi, Kazunori Yamaguchi |
Abstract | In recent years, argumentation mining, which automatically extracts the structure of argumentation from unstructured documents such as essays and debates, has been gaining attention. For argumentation mining applications, argument-component classification is an important subtask. The existing methods can be classified into supervised methods and unsupervised methods. Supervised document classification performs classification using a single sentence without relying on the whole document. On the other hand, unsupervised document classification has the advantage of being able to use the whole document, but the accuracy of these methods is not so high. In this paper, we propose a method for argument-component classification that combines relation identification by neural networks and TextRank to integrate relation information (i.e., the strength of the relation). This method can use argumentation-specific knowledge by employing supervised learning on a corpus while maintaining the advantage of using the whole document. Experiments on two corpora, one consisting of student essays and the other of Wikipedia articles, show the effectiveness of this method. |
Tasks | Document Classification |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4510/ |
https://www.aclweb.org/anthology/W19-4510 | |
PWC | https://paperswithcode.com/paper/argument-component-classification-by-relation |
Repo | |
Framework | |
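The TextRank step named in the abstract above can be sketched as a weighted PageRank-style iteration over a graph of argument components. The relation strengths below are invented for illustration, standing in for the output of a relation-identification network. Using the scores to pick out the most central component is an assumption, not a description of the paper's pipeline.

```python
def textrank(weights, damping=0.85, iterations=50):
    """Score nodes of a weighted directed graph with the TextRank update.

    weights[i][j] is the strength of the relation from node i to node j.
    """
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            # Each source node i passes on its score in proportion to
            # the share of its outgoing weight that points at j.
            incoming = sum(
                weights[i][j] / out_sum[i] * scores[i]
                for i in range(n)
                if out_sum[i] > 0
            )
            new_scores.append((1 - damping) + damping * incoming)
        scores = new_scores
    return scores

# Hypothetical relation strengths: components 1 and 2 both support component 0.
relation = [
    [0.0, 0.0, 0.0],
    [0.9, 0.0, 0.0],
    [0.7, 0.0, 0.0],
]
scores = textrank(relation)
```

Because the scores depend on every edge in the graph, a component's rank reflects the whole document rather than a single sentence.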
Improving Sample-based Evaluation for Generative Adversarial Networks
Title | Improving Sample-based Evaluation for Generative Adversarial Networks |
Authors | Shaohui Liu, Yi Wei, Jiwen Lu, Jie Zhou |
Abstract | In this paper, we propose an improved quantitative evaluation framework for Generative Adversarial Networks (GANs) on generating domain-specific images, where we improve conventional evaluation methods on two levels: the feature representation and the evaluation metric. Unlike most existing evaluation frameworks which transfer the representation of ImageNet inception model to map images onto the feature space, our framework uses a specialized encoder to acquire fine-grained domain-specific representation. Moreover, for datasets with multiple classes, we propose Class-Aware Frechet Distance (CAFD), which employs a Gaussian mixture model on the feature space to better fit the multi-manifold feature distribution. Experiments and analysis on both the feature level and the image level were conducted to demonstrate improvements of our proposed framework over the recently proposed state-of-the-art FID method. To the best of our knowledge, we are the first to provide counterexamples where FID gives results inconsistent with human judgments. It is shown in the experiments that our framework is able to overcome the shortcomings of FID and improve robustness. Code will be made available. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=HJlY0jA5F7 |
https://openreview.net/pdf?id=HJlY0jA5F7 | |
PWC | https://paperswithcode.com/paper/improving-sample-based-evaluation-for |
Repo | |
Framework | |
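The motivation for a class-aware distance in the abstract above can be illustrated with a toy sketch. It uses the squared Frechet distance between one-dimensional Gaussians, (mu1 - mu2)^2 + (sigma1 - sigma2)^2, and averages it over classes. That per-class average is a simplification chosen for illustration, not the paper's CAFD definition. The feature values are made up, and real FID operates on multi-dimensional encoder features.

```python
from statistics import fmean, pstdev

def frechet_1d(xs, ys):
    """Squared Frechet distance between 1-D Gaussians fitted to xs and ys."""
    return (fmean(xs) - fmean(ys)) ** 2 + (pstdev(xs) - pstdev(ys)) ** 2

def per_class_frechet(real, fake):
    """Average the distance over classes (an illustrative simplification)."""
    return fmean([frechet_1d(real[c], fake[c]) for c in real])

# Hypothetical 1-D features; the generated samples have their classes swapped.
real = {"cat": [0.0, 1.0, 2.0], "dog": [10.0, 11.0, 12.0]}
fake = {"cat": [10.0, 11.0, 12.0], "dog": [0.0, 1.0, 2.0]}

pooled_real = [x for xs in real.values() for x in xs]
pooled_fake = [x for xs in fake.values() for x in xs]

pooled = frechet_1d(pooled_real, pooled_fake)   # single Gaussian over all classes
class_aware = per_class_frechet(real, fake)     # one Gaussian per class
```

In this toy case the pooled distance cannot see the class swap because both pooled sets share the same mean and spread, while the per-class distance is large.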
The importance of sharing patient-generated clinical speech and language data
Title | The importance of sharing patient-generated clinical speech and language data |
Authors | Kathleen C. Fraser, Nicklas Linz, Hali Lindsay, Alexandra König |
Abstract | Increased access to large datasets has driven progress in NLP. However, most computational studies of clinically-validated, patient-generated speech and language involve very few datapoints, as such data are difficult (and expensive) to collect. In this position paper, we argue that we must find ways to promote data sharing across research groups, in order to build datasets of a more appropriate size for NLP and machine learning analysis. We review the benefits and challenges of sharing clinical language data, and suggest several concrete actions by both clinical and NLP researchers to encourage multi-site and multi-disciplinary data sharing. We also propose the creation of a collaborative data sharing platform, to allow NLP researchers to take a more active responsibility for data transcription, annotation, and curation. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/W19-3007/ |
https://www.aclweb.org/anthology/W19-3007 | |
PWC | https://paperswithcode.com/paper/the-importance-of-sharing-patient-generated |
Repo | |
Framework | |