Paper Group NANR 28
Development and Evaluation of Three Named Entity Recognition Systems for Serbian - The Case of Personal Names
Title | Development and Evaluation of Three Named Entity Recognition Systems for Serbian - The Case of Personal Names |
Authors | Branislava Šandrih, Cvetana Krstev, Ranka Stankovic |
Abstract | In this paper we present a rule- and lexicon-based system for the recognition of Named Entities (NE) in Serbian newspaper texts that was used to prepare a gold standard annotated with personal names. It was further used to prepare training sets for four different levels of annotation, which were in turn used to train two Named Entity Recognition (NER) systems: Stanford and spaCy. All obtained models, together with the rule- and lexicon-based system, were evaluated on two sample texts: a part of the gold standard and an independent newspaper text of approximately the same size. The results show that the rule- and lexicon-based system outperforms the trained models in all four scenarios (measured by F1), while the Stanford models have the highest precision. All systems obtain their best results in recognizing full names, while the recognition of first names only is rather poor. The produced models are incorporated into a Web platform, NER&Beyond, that provides various NE-related functions. |
Tasks | Named Entity Recognition |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/R19-1122/ |
https://www.aclweb.org/anthology/R19-1122 | |
PWC | https://paperswithcode.com/paper/development-and-evaluation-of-three-named |
Repo | |
Framework | |
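The abstract above ranks the systems by F1 and precision. As a reminder of how those scores relate, here is a minimal sketch that assumes exact-match scoring over (start, end, label) spans. The paper's own evaluation protocol is not described in this listing, and the function name, the `PERS` label and the sample spans are hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Exact-match precision, recall and F1 over two collections of entity spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical spans: (start_token, end_token, label)
gold_spans = [(0, 2, "PERS"), (5, 6, "PERS"), (9, 11, "PERS"), (14, 15, "PERS")]
predicted_spans = [(0, 2, "PERS"), (5, 6, "PERS"), (20, 21, "PERS")]

p, r, f1 = precision_recall_f1(gold_spans, predicted_spans)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

A system that predicts few but correct spans scores high precision and low recall, which is one way a model can lead on precision while trailing on F1.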
Harnessing Pre-Trained Neural Networks with Rules for Formality Style Transfer
Title | Harnessing Pre-Trained Neural Networks with Rules for Formality Style Transfer |
Authors | Yunli Wang, Yu Wu, Lili Mou, Zhoujun Li, Wenhan Chao |
Abstract | Formality text style transfer plays an important role in various NLP applications, such as non-native speaker assistants and child education. Early studies normalized informal sentences with rules, before statistical and neural models became the prevailing methods in the field. While a rule-based system is still a common preprocessing step for formality style transfer in the neural era, it could introduce noise if we use the rules in a naive way such as data preprocessing. To mitigate this problem, we study how to harness rules into a state-of-the-art neural network that is typically pretrained on massive corpora. We propose three fine-tuning methods in this paper and achieve a new state-of-the-art on benchmark datasets. |
Tasks | Style Transfer, Text Style Transfer |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1365/ |
https://www.aclweb.org/anthology/D19-1365 | |
PWC | https://paperswithcode.com/paper/harnessing-pre-trained-neural-networks-with |
Repo | |
Framework | |
MulCode: A Multiplicative Multi-way Model for Compressing Neural Language Model
Title | MulCode: A Multiplicative Multi-way Model for Compressing Neural Language Model |
Authors | Yukun Ma, Patrick H. Chen, Cho-Jui Hsieh |
Abstract | It is challenging to deploy deep neural nets on memory-constrained devices due to the explosion in the number of parameters. In particular, the input embedding layer and Softmax layer usually dominate the memory usage in an RNN-based language model. For example, input embedding and Softmax matrices in the IWSLT-2014 German-to-English data set account for more than 80% of the total model parameters. To compress these embedding layers, we propose MulCode, a novel multi-way multiplicative neural compressor. MulCode learns an adaptively created matrix and its multiplicative compositions. Together with a prior weighted loss, MulCode is more effective than the state-of-the-art compression methods. On the IWSLT-2014 machine translation data set, MulCode achieved a 17 times compression rate for the embedding and Softmax matrices, and when combined with a quantization technique, our method can achieve a 41.38 times compression rate with very little loss in performance. |
Tasks | Language Modelling, Machine Translation, Quantization |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1529/ |
https://www.aclweb.org/anthology/D19-1529 | |
PWC | https://paperswithcode.com/paper/mulcode-a-multiplicative-multi-way-model-for |
Repo | |
Framework | |
Bias Also Matters: Bias Attribution for Deep Neural Network Explanation
Title | Bias Also Matters: Bias Attribution for Deep Neural Network Explanation |
Authors | Shengjie Wang, Tianyi Zhou, Jeff Bilmes |
Abstract | The gradient of a deep neural network (DNN) w.r.t. the input provides information that can be used to explain the output prediction in terms of the input features and has been widely studied to assist in interpreting DNNs. In a linear model (i.e., $g(x)=wx+b$), the gradient corresponds solely to the weights $w$. Such a model can reasonably locally linearly approximate a smooth nonlinear DNN, and hence the weights of this local model are the gradient. The other part, however, of a local linear model, i.e., the bias $b$, is usually overlooked in attribution methods since it is not part of the gradient. In this paper, we observe that since the bias in a DNN also has a non-negligible contribution to the correctness of predictions, it can also play a significant role in understanding DNN behaviors. In particular, we study how to attribute a DNN’s bias to its input features. We propose a backpropagation-type algorithm “bias back-propagation (BBp)” that starts at the output layer and iteratively attributes the bias of each layer to its input nodes as well as combining the resulting bias term of the previous layer. This process stops at the input layer, where summing up the attributions over all the input features exactly recovers $b$. Together with the backpropagation of the gradient generating $w$, we can fully recover the locally linear model $g(x)=wx+b$. Hence, the attribution of the DNN outputs to its inputs is decomposed into two parts, the gradient $w$ and the bias attribution, providing separate and complementary explanations. We study several possible attribution methods applied to the bias of each layer in BBp. In experiments, we show that BBp can generate complementary and highly interpretable explanations of DNNs in addition to gradient-based attributions. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=B1xeyhCctQ |
https://openreview.net/pdf?id=B1xeyhCctQ | |
PWC | https://paperswithcode.com/paper/bias-also-matters-bias-attribution-for-deep |
Repo | |
Framework | |
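The locally linear model $g(x)=wx+b$ mentioned in the abstract above can be made concrete on a toy network. The sketch below is not the paper's BBp algorithm: it does not attribute the bias to individual input features. It only shows, for a two-layer ReLU network (an assumption made here, under which the local linear model is exact inside one activation region), that the gradient $w$ alone does not reproduce the output and that the effective bias $b$ carries the remainder. All weights and function names are hypothetical.

```python
def relu_network(x, W1, b1, W2, b2):
    """Scalar-output two-layer ReLU network: W2 . relu(W1 x + b1) + b2."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(W2, hidden)) + b2


def local_linear_model(x, W1, b1, W2, b2):
    """Return (w, b) such that the network equals w . x' + b on the linear region containing x."""
    # Activation pattern at x: 1 where the hidden unit is active, else 0.
    gates = [1.0 if sum(w * xi for w, xi in zip(row, x)) + b > 0 else 0.0
             for row, b in zip(W1, b1)]
    # Gradient w.r.t. the input: sum over hidden units of W2[j] * gate[j] * W1[j].
    w = [sum(W2[j] * gates[j] * W1[j][i] for j in range(len(W1)))
         for i in range(len(x))]
    # Effective bias: gated hidden biases pushed through W2, plus the output bias.
    b = sum(W2[j] * gates[j] * b1[j] for j in range(len(W1))) + b2
    return w, b


# Hypothetical weights, chosen only for illustration.
W1 = [[1.0, -2.0], [0.5, 1.0], [1.0, 1.0]]
b1 = [0.5, -2.0, 0.25]
W2 = [2.0, -1.0, 3.0]
b2 = 0.1
x = [1.0, 0.5]

w, b = local_linear_model(x, W1, b1, W2, b2)
gradient_part = sum(wi * xi for wi, xi in zip(w, x))
output = relu_network(x, W1, b1, W2, b2)
print(w, b, gradient_part, output)
```

Here the gradient term accounts for only part of the output; the rest comes from the bias, which is the quantity a gradient-only attribution leaves unexplained.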
Explainability Methods for Graph Convolutional Neural Networks
Title | Explainability Methods for Graph Convolutional Neural Networks |
Authors | Phillip E. Pope, Soheil Kolouri, Mohammad Rostami, Charles E. Martin, Heiko Hoffmann |
Abstract | With the growing use of graph convolutional neural networks (GCNNs) comes the need for explainability. In this paper, we introduce explainability methods for GCNNs. We develop the graph analogues of three prominent explainability methods for convolutional neural networks: contrastive gradient-based (CG) saliency maps, Class Activation Mapping (CAM), and Excitation Back-Propagation (EB) and their variants, gradient-weighted CAM (Grad-CAM) and contrastive EB (c-EB). We show a proof-of-concept of these methods on classification problems in two application domains: visual scene graphs and molecular graphs. To compare the methods, we identify three desirable properties of explanations: (1) their importance to classification, as measured by the impact of occlusions, (2) their contrastivity with respect to different classes, and (3) their sparseness on a graph. We call the corresponding quantitative metrics fidelity, contrastivity, and sparsity and evaluate them for each method. Lastly, we analyze the salient subgraphs obtained from explanations and report frequently occurring patterns. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Pope_Explainability_Methods_for_Graph_Convolutional_Neural_Networks_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Pope_Explainability_Methods_for_Graph_Convolutional_Neural_Networks_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/explainability-methods-for-graph |
Repo | |
Framework | |
NTUA-ISLab at SemEval-2019 Task 9: Mining Suggestions in the wild
Title | NTUA-ISLab at SemEval-2019 Task 9: Mining Suggestions in the wild |
Authors | Rolandos Alexandros Potamias, Alexandros Neofytou, Georgios Siolas |
Abstract | As online customer forums and product comparison sites increase their societal influence, users are actively expressing their opinions and posting their recommendations to their fellow customers online. However, systems capable of recognizing suggestions still lack stability. Suggestion Mining, a novel and challenging field of Natural Language Processing, is increasingly gaining attention, aiming to track user advice on online forums. In this paper, a carefully designed methodology to identify customer-to-company and customer-to-customer suggestions is presented. The methodology implements a rule-based classifier using heuristic, lexical and syntactic patterns. The approach ranked 5th and 1st, achieving f1-scores of 0.749 and 0.858 for SemEval-2019/Suggestion Mining sub-tasks A and B, respectively. In addition, we were able to improve performance by combining the rule-based classifier with a recurrent convolutional neural network, which achieves an f1-score of 0.79 for subtask A. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2215/ |
https://www.aclweb.org/anthology/S19-2215 | |
PWC | https://paperswithcode.com/paper/ntua-islab-at-semeval-2019-task-9-mining |
Repo | |
Framework | |
Financial Text Data Analytics Framework for Business Confidence Indices and Inter-Industry Relations
Title | Financial Text Data Analytics Framework for Business Confidence Indices and Inter-Industry Relations |
Authors | Hiroki Sakaji, Ryota Kuramoto, Hiroyasu Matsushima, Kiyoshi Izumi, Takashi Shimada, Keita Sunakawa |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5507/ |
https://www.aclweb.org/anthology/W19-5507 | |
PWC | https://paperswithcode.com/paper/financial-text-data-analytics-framework-for |
Repo | |
Framework | |
From Creditworthiness to Trustworthiness with Alternative NLP/NLU Approaches
Title | From Creditworthiness to Trustworthiness with Alternative NLP/NLU Approaches |
Authors | Charles Crouspeyre, Eleonore Alesi, Karine Lespinasse |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5516/ |
https://www.aclweb.org/anthology/W19-5516 | |
PWC | https://paperswithcode.com/paper/from-creditworthiness-to-trustworthiness-with |
Repo | |
Framework | |
AIG Investments.AI at the FinSBD Task: Sentence Boundary Detection through Sequence Labelling and BERT Fine-tuning
Title | AIG Investments.AI at the FinSBD Task: Sentence Boundary Detection through Sequence Labelling and BERT Fine-tuning |
Authors | Jinhua Du, Yan Huang, Karo Moilanen |
Abstract | |
Tasks | Boundary Detection |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5513/ |
https://www.aclweb.org/anthology/W19-5513 | |
PWC | https://paperswithcode.com/paper/aig-investmentsai-at-the-finsbd-task-sentence |
Repo | |
Framework | |
Economic Causal-Chain Search using Text Mining Technology
Title | Economic Causal-Chain Search using Text Mining Technology |
Authors | Kiyoshi Izumi, Hiroki Sakaji |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5510/ |
https://www.aclweb.org/anthology/W19-5510 | |
PWC | https://paperswithcode.com/paper/economic-causal-chain-search-using-text |
Repo | |
Framework | |
mhirano at the FinSBD Task: Pointwise Prediction Based on Multi-layer Perceptron for Sentence Boundary Detection
Title | mhirano at the FinSBD Task: Pointwise Prediction Based on Multi-layer Perceptron for Sentence Boundary Detection |
Authors | Masanori Hirano, Hiroki Sakaji, Kiyoshi Izumi, Hiroyasu Matsushima |
Abstract | |
Tasks | Boundary Detection |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5518/ |
https://www.aclweb.org/anthology/W19-5518 | |
PWC | https://paperswithcode.com/paper/mhirano-at-the-finsbd-task-pointwise |
Repo | |
Framework | |
BIOfid Dataset: Publishing a German Gold Standard for Named Entity Recognition in Historical Biodiversity Literature
Title | BIOfid Dataset: Publishing a German Gold Standard for Named Entity Recognition in Historical Biodiversity Literature |
Authors | Sajawel Ahmed, Manuel Stoeckel, Christine Driller, Adrian Pachzelt, Alexander Mehler |
Abstract | The Specialized Information Service Biodiversity Research (BIOfid) has been launched to mobilize valuable biological data from printed literature hidden in German libraries for over the past 250 years. In this project, we annotate German texts converted by OCR from historical scientific literature on the biodiversity of plants, birds, moths and butterflies. Our work enables the automatic extraction of biological information previously buried in the mass of papers and volumes. For this purpose, we generated training data for the tasks of Named Entity Recognition (NER) and Taxa Recognition (TR) in biological documents. We use this data to train a number of leading machine learning tools and create a gold standard for TR in biodiversity literature. More specifically, we perform a practical analysis of our newly generated BIOfid dataset through various downstream-task evaluations and establish a new state of the art for TR with 80.23% F-score. In this sense, our paper lays the foundations for future work in the field of information extraction in biology texts. |
Tasks | Named Entity Recognition, Optical Character Recognition |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/K19-1081/ |
https://www.aclweb.org/anthology/K19-1081 | |
PWC | https://paperswithcode.com/paper/biofid-dataset-publishing-a-german-gold |
Repo | |
Framework | |
Nearly-Unsupervised Hashcode Representations for Biomedical Relation Extraction
Title | Nearly-Unsupervised Hashcode Representations for Biomedical Relation Extraction |
Authors | Sahil Garg, Aram Galstyan, Greg Ver Steeg, Guillermo Cecchi |
Abstract | Recently, kernelized locality sensitive hashcodes have been successfully employed as representations of natural language text, especially showing high relevance to biomedical relation extraction tasks. In this paper, we propose to optimize the hashcode representations in a nearly unsupervised manner, in which we only use data points, but not their class labels, for learning. The optimized hashcode representations are then fed to a supervised classifier following the prior work. This nearly unsupervised approach allows fine-grained optimization of each hash function, which is particularly suitable for building hashcode representations generalizing from a training set to a test set. We empirically evaluate the proposed approach for biomedical relation extraction tasks, obtaining significant accuracy improvements w.r.t. state-of-the-art supervised and semi-supervised approaches. |
Tasks | Relation Extraction |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1414/ |
https://www.aclweb.org/anthology/D19-1414 | |
PWC | https://paperswithcode.com/paper/nearly-unsupervised-hashcode-representations-1 |
Repo | |
Framework | |
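For readers unfamiliar with locality sensitive hashcodes, the sketch below shows the simplest variant, random-hyperplane hashing on plain numeric vectors. This is a swapped-in technique for illustration only: the paper above uses kernelized hash functions and optimizes them, neither of which is reproduced here. The function names, the 16-bit code length and the sample vectors are assumptions.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw one random Gaussian hyperplane per hashcode bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hashcode(vector, hyperplanes):
    """Binary code: one bit per hyperplane, set by the side the vector falls on."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(code_a, code_b):
    """Number of differing bits; small distances suggest a small angle between inputs."""
    return sum(x != y for x, y in zip(code_a, code_b))


planes = make_hyperplanes(num_bits=16, dim=3)
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]      # same direction as a
c = [-1.0, -2.0, -3.0]   # opposite direction to a

code_a, code_b, code_c = (hashcode(v, planes) for v in (a, b, c))
print(hamming(code_a, code_b), hamming(code_a, code_c))
```

The resulting bit vectors can then be used as features for a downstream classifier, which is the role hashcodes play in the abstract above.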
DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks
Title | DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks |
Authors | Sagnik Das, Ke Ma, Zhixin Shu, Dimitris Samaras, Roy Shilkrot |
Abstract | Capturing document images with hand-held devices in unstructured environments is a common practice nowadays. However, “casual” photos of documents are usually unsuitable for automatic information extraction, mainly due to physical distortion of the document paper, as well as various camera positions and illumination conditions. In this work, we propose DewarpNet, a deep-learning approach for document image unwarping from a single image. Our insight is that the 3D geometry of the document not only determines the warping of its texture but also causes the illumination effects. Therefore, our novelty resides in the explicit modeling of 3D shape for document paper in an end-to-end pipeline. Also, we contribute the largest and most comprehensive dataset for document image unwarping to date, Doc3D. This dataset features multiple ground-truth annotations, including 3D shape, surface normals, UV map, albedo image, etc. Training with Doc3D, we demonstrate state-of-the-art performance for DewarpNet with extensive qualitative and quantitative evaluations. Our network also significantly improves OCR performance on captured document images, decreasing character error rate by 42% on average. Both the code and the dataset are released. |
Tasks | Optical Character Recognition |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Das_DewarpNet_Single-Image_Document_Unwarping_With_Stacked_3D_and_2D_Regression_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Das_DewarpNet_Single-Image_Document_Unwarping_With_Stacked_3D_and_2D_Regression_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/dewarpnet-single-image-document-unwarping |
Repo | |
Framework | |
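The abstract above reports OCR quality as character error rate (CER). A minimal sketch of that metric follows, assuming the common definition of Levenshtein edit distance divided by the reference length; the paper's exact evaluation setup is not given in this listing, and the function names and sample strings are hypothetical.

```python
def levenshtein(reference, hypothesis):
    """Minimum number of character insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalised by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical ground-truth text and OCR output.
cer = character_error_rate("kitten", "sitting")
print(f"CER = {cer:.2f}")
```

Because the distance is normalised by the reference length only, CER can exceed 1.0 when the OCR output contains many spurious characters.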
ULSAna: Universal Language Semantic Analyzer
Title | ULSAna: Universal Language Semantic Analyzer |
Authors | Ondřej Pražák, Miloslav Konopik |
Abstract | We present a live cross-lingual system capable of producing shallow semantic annotations of natural language sentences for 51 languages at this time. The domain of the input sentences is in principle unconstrained. The system uses single training data (in English) for all the languages. The resulting semantic annotations are therefore consistent across different languages. We use CoNLL Semantic Role Labeling training data and Universal dependencies as the basis for the system. The system is publicly available and supports processing data in batches; therefore, it can be easily used by the community for the following research tasks. |
Tasks | Semantic Role Labeling |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/R19-1112/ |
https://www.aclweb.org/anthology/R19-1112 | |
PWC | https://paperswithcode.com/paper/ulsana-universal-language-semantic-analyzer |
Repo | |
Framework | |