Paper Group NANR 4
Combining Shallow and Deep Learning for Aggressive Text Detection
Title | Combining Shallow and Deep Learning for Aggressive Text Detection |
Authors | Viktor Golem, Mladen Karan, Jan Šnajder |
Abstract | We describe the participation of team TakeLab in the aggression detection shared task at the TRAC1 workshop for English. Aggression manifests in a variety of ways. Unlike some forms of aggression that are impossible to prevent in day-to-day life, aggressive speech abounding on social networks could in principle be prevented or at least reduced by simply disabling users that post aggressively worded messages. The first step in achieving this is to detect such messages. The task, however, is far from being trivial, as what is considered as aggressive speech can be quite subjective, and the task is further complicated by the noisy nature of user-generated text on social networks. Our system learns to distinguish between open aggression, covert aggression, and non-aggression in social media texts. We tried different machine learning approaches, including traditional (shallow) machine learning models, deep learning models, and a combination of both. We achieved respectable results, ranking 4th and 8th out of 31 submissions on the Facebook and Twitter test sets, respectively. |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4422/ |
https://www.aclweb.org/anthology/W18-4422 | |
PWC | https://paperswithcode.com/paper/combining-shallow-and-deep-learning-for |
Repo | |
Framework | |
Learned Shape-Tailored Descriptors for Segmentation
Title | Learned Shape-Tailored Descriptors for Segmentation |
Authors | Naeemullah Khan, Ganesh Sundaramoorthi |
Abstract | We address the problem of texture segmentation by grouping dense pixel-wise descriptors. We introduce and construct learned Shape-Tailored Descriptors that aggregate image statistics only within regions of interest to avoid mixing statistics of different textures, and that are invariant to complex nuisances (e.g., illumination, perspective and deformations). This is accomplished by training a neural network to discriminate base shape-tailored descriptors of oriented gradients at various scales. These descriptors are defined through partial differential equations to obtain data at various scales in arbitrarily shaped regions. We formulate and optimize a joint optimization problem in the segmentation and descriptors to discriminate these base descriptors using the learned metric, equivalent to grouping learned descriptors. We test the method on datasets to illustrate the effect of both the shape-tailored and learned properties of the descriptors. Experiments show that the descriptors learned on a small dataset of segmented images generalize well to unseen textures in other datasets, showing the generic nature of these descriptors. We show state-of-the-art results on texture segmentation benchmarks. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Khan_Learned_Shape-Tailored_Descriptors_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Khan_Learned_Shape-Tailored_Descriptors_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/learned-shape-tailored-descriptors-for |
Repo | |
Framework | |
The Spectrum of the Fisher Information Matrix of a Single-Hidden-Layer Neural Network
Title | The Spectrum of the Fisher Information Matrix of a Single-Hidden-Layer Neural Network |
Authors | Jeffrey Pennington, Pratik Worah |
Abstract | An important factor contributing to the success of deep learning has been the remarkable ability to optimize large neural networks using simple first-order optimization algorithms like stochastic gradient descent. While the efficiency of such methods depends crucially on the local curvature of the loss surface, very little is actually known about how this geometry depends on network architecture and hyperparameters. In this work, we extend a recently-developed framework for studying spectra of nonlinear random matrices to characterize an important measure of curvature, namely the eigenvalues of the Fisher information matrix. We focus on a single-hidden-layer neural network with Gaussian data and weights and provide an exact expression for the spectrum in the limit of infinite width. We find that linear networks suffer worse conditioning than nonlinear networks and that nonlinear networks are generically non-degenerate. We also predict and demonstrate empirically that by adjusting the nonlinearity, the spectrum can be tuned so as to improve the efficiency of first-order optimization methods. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7786-the-spectrum-of-the-fisher-information-matrix-of-a-single-hidden-layer-neural-network |
http://papers.nips.cc/paper/7786-the-spectrum-of-the-fisher-information-matrix-of-a-single-hidden-layer-neural-network.pdf | |
PWC | https://paperswithcode.com/paper/the-spectrum-of-the-fisher-information-matrix |
Repo | |
Framework | |
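
The abstract above studies the eigenvalues of the Fisher information matrix for a single-hidden-layer network with Gaussian data and weights. The following is a minimal numerical sketch of that idea, not the paper's derivation: it assumes a scalar-output network f(x) = aᵀσ(Wx) under squared loss, forms the empirical Fisher matrix only with respect to the first-layer weights W, and uses a finite width and sample count. The function name `fisher_spectrum`, its parameters, and the normalization are illustrative assumptions.

```python
import numpy as np

def fisher_spectrum(n_in=20, n_hidden=20, n_samples=2000, act_grad=None, seed=0):
    """Eigenvalues of an empirical Fisher matrix for f(x) = a^T sigma(W x).

    Only the first-layer weights W are treated as parameters (an assumption
    made to keep the sketch short). act_grad is the derivative of the
    activation; it defaults to the derivative of tanh.
    """
    rng = np.random.default_rng(seed)
    if act_grad is None:
        act_grad = lambda z: 1.0 - np.tanh(z) ** 2

    # Gaussian weights and Gaussian data, scaled so pre-activations are O(1).
    W = rng.standard_normal((n_hidden, n_in)) / np.sqrt(n_in)
    a = rng.standard_normal(n_hidden) / np.sqrt(n_hidden)
    X = rng.standard_normal((n_samples, n_in))

    Z = X @ W.T                      # pre-activations, shape (m, n_hidden)
    D = act_grad(Z) * a              # a_i * sigma'(z_i), shape (m, n_hidden)

    # Per-sample gradient: d f / d W[i, j] = a_i * sigma'(z_i) * x_j.
    J = (D[:, :, None] * X[:, None, :]).reshape(n_samples, -1)

    # Empirical Fisher (Gauss-Newton) matrix: average of outer products.
    F = J.T @ J / n_samples
    return np.linalg.eigvalsh(F)     # eigenvalues in ascending order

if __name__ == "__main__":
    eig_tanh = fisher_spectrum()
    eig_linear = fisher_spectrum(act_grad=lambda z: np.ones_like(z))
    print("non-negligible eigenvalues (tanh):  ", int((eig_tanh > 1e-8).sum()))
    print("non-negligible eigenvalues (linear):", int((eig_linear > 1e-8).sum()))
```

In this restricted setting the linear activation yields a Fisher matrix of rank at most `n_in`, since every per-sample gradient is the outer product of the fixed vector `a` with `x`, whereas a nonlinear activation generally spreads the spectrum over many more directions. This is only loosely analogous to the conditioning claims in the abstract, which concern the infinite-width limit.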
Interaction-Aware Topic Model for Microblog Conversations through Network Embedding and User Attention
Title | Interaction-Aware Topic Model for Microblog Conversations through Network Embedding and User Attention |
Authors | Ruifang He, Xuefei Zhang, Di Jin, Longbiao Wang, Jianwu Dang, Xiangang Li |
Abstract | Traditional topic models are insufficient for topic extraction in social media. The existing methods only consider text information or simultaneously model the posts and the static characteristics of social media. They ignore that one discusses diverse topics when dynamically interacting with different people. Moreover, people who talk about the same topic have different effects on the topic. In this paper, we propose an Interaction-Aware Topic Model (IATM) for microblog conversations by integrating network embedding and user attention. A conversation network linking users based on reposting and replying relationships is constructed to mine the dynamic user behaviours. We model dynamic interactions and user attention so as to learn interaction-aware edge embeddings with social context. These embeddings are then incorporated into neural variational inference to generate more consistent topics. The experiments on three real-world datasets show that our proposed model is effective. |
Tasks | Network Embedding, Representation Learning, Topic Models |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1118/ |
https://www.aclweb.org/anthology/C18-1118 | |
PWC | https://paperswithcode.com/paper/interaction-aware-topic-model-for-microblog |
Repo | |
Framework | |
Improve Neural Entity Recognition via Multi-Task Data Selection and Constrained Decoding
Title | Improve Neural Entity Recognition via Multi-Task Data Selection and Constrained Decoding |
Authors | Huasha Zhao, Yi Yang, Qiong Zhang, Luo Si |
Abstract | Entity recognition is a widely benchmarked task in natural language processing due to its massive applications. The state-of-the-art solution applies a neural architecture named BiLSTM-CRF to model the language sequences. In this paper, we propose an entity recognition system that improves this neural architecture with two novel techniques. The first technique is Multi-Task Data Selection, which ensures the consistency of data distribution and labeling guidelines between source and target datasets. The other one is constrained decoding using knowledge base. The decoder of the model operates at the document level, and leverages global and external information sources to further improve performance. Extensive experiments have been conducted to show the advantages of each technique. Our system achieves state-of-the-art results on the English entity recognition task in KBP 2017 official evaluation, and it also yields very strong results in other languages. |
Tasks | Domain Adaptation, Machine Reading Comprehension, Multi-Task Learning, Named Entity Recognition, Question Answering, Reading Comprehension, Transfer Learning |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-2056/ |
https://www.aclweb.org/anthology/N18-2056 | |
PWC | https://paperswithcode.com/paper/improve-neural-entity-recognition-via-multi |
Repo | |
Framework | |
Towards AMR-BR: A SemBank for Brazilian Portuguese Language
Title | Towards AMR-BR: A SemBank for Brazilian Portuguese Language |
Authors | Rafael Anchiêta, Thiago Pardo |
Abstract | |
Tasks | Entity Linking, Machine Reading Comprehension, Machine Translation, Named Entity Recognition, Natural Language Inference, Question Answering, Reading Comprehension, Semantic Parsing, Semantic Role Labeling, Text Generation, Word Sense Disambiguation |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1157/ |
https://www.aclweb.org/anthology/L18-1157 | |
PWC | https://paperswithcode.com/paper/towards-amr-br-a-sembank-for-brazilian |
Repo | |
Framework | |
Abstractive Unsupervised Multi-Document Summarization using Paraphrastic Sentence Fusion
Title | Abstractive Unsupervised Multi-Document Summarization using Paraphrastic Sentence Fusion |
Authors | Mir Tafseer Nayeem, Tanvir Ahmed Fuad, Yllias Chali |
Abstract | In this work, we aim at developing an unsupervised abstractive summarization system in the multi-document setting. We design a paraphrastic sentence fusion model which jointly performs sentence fusion and paraphrasing using skip-gram word embedding model at the sentence level. Our model improves the information coverage and at the same time abstractiveness of the generated sentences. We conduct our experiments on the human-generated multi-sentence compression datasets and evaluate our system on several newly proposed Machine Translation (MT) evaluation metrics. Furthermore, we apply our sentence level model to implement an abstractive multi-document summarization system where documents usually contain a related set of sentences. We also propose an optimal solution for the classical summary length limit problem which was not addressed in the past research. For the document level summary, we conduct experiments on the datasets of two different domains (e.g., news article and user reviews) which are well suited for multi-document abstractive summarization. Our experiments demonstrate that the methods bring significant improvements over the state-of-the-art methods. |
Tasks | Abstractive Text Summarization, Document Summarization, Machine Translation, Multi-Document Summarization, Sentence Compression, Text Generation |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1102/ |
https://www.aclweb.org/anthology/C18-1102 | |
PWC | https://paperswithcode.com/paper/abstractive-unsupervised-multi-document |
Repo | |
Framework | |
Prediction for the Newsroom: Which Articles Will Get the Most Comments?
Title | Prediction for the Newsroom: Which Articles Will Get the Most Comments? |
Authors | Carl Ambroselli, Julian Risch, Ralf Krestel, Andreas Loos |
Abstract | The overwhelming success of the Web and mobile technologies has enabled millions to share their opinions publicly at any time. But the same success also endangers this freedom of speech due to closing down of participatory sites misused by individuals or interest groups. We propose to support manual moderation by proactively drawing the attention of our moderators to article discussions that most likely need their intervention. To this end, we predict which articles will receive a high number of comments. In contrast to existing work, we enrich the article with metadata, extract semantic and linguistic features, and exploit annotated data from a foreign language corpus. Our logistic regression model improves F1-scores by over 80% in comparison to state-of-the-art approaches. |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-3024/ |
https://www.aclweb.org/anthology/N18-3024 | |
PWC | https://paperswithcode.com/paper/prediction-for-the-newsroom-which-articles |
Repo | |
Framework | |
Point Precisely: Towards Ensuring the Precision of Data in Generated Texts Using Delayed Copy Mechanism
Title | Point Precisely: Towards Ensuring the Precision of Data in Generated Texts Using Delayed Copy Mechanism |
Authors | Liunian Li, Xiaojun Wan |
Abstract | The task of data-to-text generation aims to generate descriptive texts conditioned on a number of database records, and recent neural models have shown significant progress on this task. The attention based encoder-decoder models with copy mechanism have achieved state-of-the-art results on a few data-to-text datasets. However, such models still face the problem of putting incorrect data records in the generated texts, especially on some more challenging datasets like RotoWire. In this paper, we propose a two-stage approach with a delayed copy mechanism to improve the precision of data records in the generated texts. Our approach first adopts an encoder-decoder model to generate a template text with data slots to be filled and then leverages a proposed delayed copy mechanism to fill in the slots with proper data records. Our delayed copy mechanism can take into account all the information of the input data records and the full generated template text by using double attention, position-aware attention and a pairwise ranking loss. The two models in the two stages are trained separately. Evaluation results on the RotoWire dataset verify the efficacy of our proposed approach to generate better templates and copy data records more precisely. |
Tasks | Data-to-Text Generation, Slot Filling, Text Generation |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1089/ |
https://www.aclweb.org/anthology/C18-1089 | |
PWC | https://paperswithcode.com/paper/point-precisely-towards-ensuring-the |
Repo | |
Framework | |
Deep Neural Networks for Coreference Resolution for Polish
Title | Deep Neural Networks for Coreference Resolution for Polish |
Authors | Bartłomiej Nitoń, Paweł Morawiecki, Maciej Ogrodniczuk |
Abstract | |
Tasks | Coreference Resolution |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1060/ |
https://www.aclweb.org/anthology/L18-1060 | |
PWC | https://paperswithcode.com/paper/deep-neural-networks-for-coreference |
Repo | |
Framework | |
SzegedKoref: A Hungarian Coreference Corpus
Title | SzegedKoref: A Hungarian Coreference Corpus |
Authors | Veronika Vincze, Klára Hegedűs, Alex Sliz-Nagy, Richárd Farkas |
Abstract | |
Tasks | Coreference Resolution |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1061/ |
https://www.aclweb.org/anthology/L18-1061 | |
PWC | https://paperswithcode.com/paper/szegedkoref-a-hungarian-coreference-corpus |
Repo | |
Framework | |
Discovery of Predictive Representations With a Network of General Value Functions
Title | Discovery of Predictive Representations With a Network of General Value Functions |
Authors | Matthew Schlegel, Andrew Patterson, Adam White, Martha White |
Abstract | The ability of an agent to discover its own learning objectives has long been considered a key ingredient for artificial general intelligence. Breakthroughs in autonomous decision making and reinforcement learning have primarily been in domains where the agent’s goal is outlined and clear: such as playing a game to win, or driving safely. Several studies have demonstrated that learning extramural sub-tasks and auxiliary predictions can improve (1) single human-specified task learning, (2) transfer of learning, and (3) the agent’s learned representation of the world. In all these examples, the agent was instructed what to learn about. We investigate a framework for discovery: curating a large collection of predictions, which are used to construct the agent’s representation of the world. Specifically, our system maintains a large collection of predictions, continually pruning and replacing predictions. We highlight the importance of considering stability rather than convergence for such a system, and develop an adaptive, regularized algorithm towards that aim. We provide several experiments in computational micro-worlds demonstrating that this simple approach can be effective for discovering useful predictions autonomously. |
Tasks | Decision Making |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=ryZElGZ0Z |
https://openreview.net/pdf?id=ryZElGZ0Z | |
PWC | https://paperswithcode.com/paper/discovery-of-predictive-representations-with |
Repo | |
Framework | |
Detecting Offensive Tweets in Hindi-English Code-Switched Language
Title | Detecting Offensive Tweets in Hindi-English Code-Switched Language |
Authors | Puneet Mathur, Rajiv Shah, Ramit Sawhney, Debanjan Mahata |
Abstract | The exponential rise of social media websites like Twitter, Facebook and Reddit in linguistically diverse geographical regions has led to hybridization of popular native languages with English in an effort to ease communication. The paper focuses on the classification of offensive tweets written in the Hinglish language, which is a portmanteau of the Indic language Hindi with the Roman script. The paper introduces a novel tweet dataset, titled Hindi-English Offensive Tweet (HEOT) dataset, consisting of tweets in Hindi-English code-switched language split into three classes: non-offensive, abusive and hate-speech. Further, we approach the problem of classification of the tweets in the HEOT dataset using transfer learning wherein the proposed model employing Convolutional Neural Networks is pre-trained on tweets in English followed by retraining on Hinglish tweets. |
Tasks | Hate Speech Detection, Transfer Learning |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3504/ |
https://www.aclweb.org/anthology/W18-3504 | |
PWC | https://paperswithcode.com/paper/detecting-offensive-tweets-in-hindi-english |
Repo | |
Framework | |
Predicting Stances from Social Media Posts using Factorization Machines
Title | Predicting Stances from Social Media Posts using Factorization Machines |
Authors | Akira Sasaki, Kazuaki Hanawa, Naoaki Okazaki, Kentaro Inui |
Abstract | Social media provide platforms to express, discuss, and shape opinions about events and issues in the real world. An important step to analyze the discussions on social media and to assist in healthy decision-making is stance detection. This paper presents an approach to detect the stance of a user toward a topic based on their stances toward other topics and the social media posts of the user. We apply factorization machines, a widely used method in item recommendation, to model user preferences toward topics from the social media data. The experimental results demonstrate that users' posts are useful to model topic preferences and therefore predict stances of silent users. |
Tasks | Decision Making, Stance Detection |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1286/ |
https://www.aclweb.org/anthology/C18-1286 | |
PWC | https://paperswithcode.com/paper/predicting-stances-from-social-media-posts |
Repo | |
Framework | |
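
The abstract above applies factorization machines to model user preferences toward topics. As a hedged illustration of the general technique (not the authors' feature design or training procedure), the sketch below computes a second-order factorization machine score, using the standard identity that evaluates all pairwise interactions in time linear in the number of features. The function name `fm_predict` and the toy feature layout (indicator features for a user, a topic, and a word from the user's posts) are assumptions for illustration only.

```python
def fm_predict(x, w0, w, V):
    """Second-order factorization machine score.

    x  : list of n feature values
    w0 : global bias
    w  : list of n linear weights
    V  : n x k list of latent factor vectors, one per feature

    score = w0 + sum_i w_i x_i + sum_{i<j} <V_i, V_j> x_i x_j
    The pairwise term is computed per factor f as
    0.5 * ((sum_i V_if x_i)^2 - sum_i (V_if x_i)^2).
    """
    n, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


if __name__ == "__main__":
    # Hypothetical layout: [user_A, user_B, topic_X, word_in_posts]
    x = [1.0, 0.0, 1.0, 1.0]
    w0 = 0.1
    w = [0.2, -0.1, 0.0, 0.3]
    V = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.2, 0.4]]
    print(fm_predict(x, w0, w, V))
```

Because each feature has its own latent vector, the model can score a user-topic pair it has never observed together, which is the property that makes this family of models a natural fit for predicting stances of users who have not posted about a topic.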
HashCount at SemEval-2018 Task 3: Concatenative Featurization of Tweet and Hashtags for Irony Detection
Title | HashCount at SemEval-2018 Task 3: Concatenative Featurization of Tweet and Hashtags for Irony Detection |
Authors | Won Ik Cho, Woo Hyun Kang, Nam Soo Kim |
Abstract | This paper proposes a novel feature extraction process for SemEval task 3: Irony detection in English tweets. The proposed system incorporates a concatenative featurization of tweet and hashtags, which helps distinguish between the irony-related and the other components. The system embeds tweets into a vector sequence with widely used pretrained word vectors, partially using a character embedding for the words that are out of vocabulary. Identification was performed with BiLSTM and CNN classifiers, achieving F1 scores of 0.5939 (23/42) and 0.3925 (10/28) for the binary and the multi-class case, respectively. The reliability of the proposed scheme was verified by analyzing the Gold test data, which demonstrates how hashtags can be taken into account when identifying various types of irony. |
Tasks | Feature Engineering, Hate Speech Detection, Opinion Mining, Sentiment Analysis |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1089/ |
https://www.aclweb.org/anthology/S18-1089 | |
PWC | https://paperswithcode.com/paper/hashcount-at-semeval-2018-task-3 |
Repo | |
Framework | |