Paper Group NANR 170
Predicting Helpful Posts in Open-Ended Discussion Forums: A Neural Architecture
Title | Predicting Helpful Posts in Open-Ended Discussion Forums: A Neural Architecture |
Authors | Kishaloy Halder, Min-Yen Kan, Kazunari Sugiyama |
Abstract | Users participate in online discussion forums to learn from others and share their knowledge with the community. They often start a thread with a question or by sharing their new findings on a certain topic. We find that, unlike Community Question Answering, where questions are mostly factoid based, the threads in a forum are often open-ended (e.g., asking for recommendations from others) without a single correct answer. In this paper, we address the task of identifying helpful posts in a forum thread to help users comprehend long running discussion threads, which often contain repetitive or irrelevant posts. We propose a recurrent neural network based architecture to model (i) the relevance of a post regarding the original post starting the thread and (ii) the novelty it brings to the discussion, compared to the previous posts in the thread. Experimental results on different types of online forum datasets show that our model significantly outperforms the state-of-the-art neural network models for text classification. |
Tasks | Community Question Answering, Question Answering, Text Classification |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1318/ |
PDF | https://www.aclweb.org/anthology/N19-1318 |
PWC | https://paperswithcode.com/paper/predicting-helpful-posts-in-open-ended |
Repo | |
Framework | |
SSN_NLP at SemEval-2019 Task 6: Offensive Language Identification in Social Media using Traditional and Deep Machine Learning Approaches
Title | SSN_NLP at SemEval-2019 Task 6: Offensive Language Identification in Social Media using Traditional and Deep Machine Learning Approaches |
Authors | Thenmozhi D., Senthil Kumar B., Srinethe Sharavanan, Aravindan Chandrabose |
Abstract | Offensive language identification (OLI) in user-generated text is the automatic detection of any profanity, insult, obscenity, racism or vulgarity that degrades an individual or a group. It is helpful for hate speech detection, flame detection and cyber bullying. Due to the immense growth in access to social media, OLI helps to avoid abuse and hurt. In this paper, we present deep and traditional machine learning approaches for OLI. In the deep learning approach, we use a bi-directional LSTM with different attention mechanisms to build the models; in the traditional machine learning approach, TF-IDF weighting schemes with classifiers, namely Multinomial Naive Bayes and Support Vector Machines with a Stochastic Gradient Descent optimizer, are used for model building. The approaches are evaluated on the OffensEval@SemEval2019 dataset, and our team SSN_NLP submitted runs for three tasks of the OffensEval shared task. The best runs of SSN_NLP obtained F1 scores of 0.53, 0.48 and 0.3 and accuracies of 0.63, 0.84 and 0.42 for Tasks A, B and C respectively. Our approaches improved the baseline F1 scores by 12%, 26% and 14% for Tasks A, B and C respectively. |
Tasks | Hate Speech Detection, Language Identification |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2130/ |
PDF | https://www.aclweb.org/anthology/S19-2130 |
PWC | https://paperswithcode.com/paper/ssn_nlp-at-semeval-2019-task-6-offensive |
Repo | |
Framework | |
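The abstract above mentions TF-IDF weighting as the feature representation for the traditional classifiers. The following is a minimal sketch of one common TF-IDF variant (raw term frequency times log(N / document frequency)); the abstract does not say which scheme the authors used, so the variant, the function name `tfidf_vectors`, and the toy sentences are all assumptions made for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Return one {term: weight} dict per tokenised document.

    Assumed variant: raw term frequency times log(N / document frequency).
    The abstract does not state which TF-IDF scheme the authors used.
    """
    n_docs = len(documents)
    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        term_freq = Counter(doc)
        vectors.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors

# Toy, invented sentences (not taken from the OffensEval data).
docs = [
    "you are awful".split(),
    "you are kind".split(),
    "have a kind day".split(),
]
vectors = tfidf_vectors(docs)
```

Terms that occur in fewer documents (here "awful") receive a larger weight than terms shared across documents (here "you"), which is the property a downstream classifier relies on.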
SSN_NLP at SemEval-2019 Task 3: Contextual Emotion Identification from Textual Conversation using Seq2Seq Deep Neural Network
Title | SSN_NLP at SemEval-2019 Task 3: Contextual Emotion Identification from Textual Conversation using Seq2Seq Deep Neural Network |
Authors | Senthil Kumar B., Thenmozhi D., Aravindan Chandrabose, Srinethe Sharavanan |
Abstract | Emotion identification is the process of automatically identifying emotions from text, speech or images. Emotion identification from textual conversations is a challenging problem due to the absence of gestures, vocal intonation and facial expressions. It enables conversational agents, chat bots and messengers to detect and report the emotions to the user instantly for a healthy conversation by avoiding emotional cues and miscommunications. We adopted a Seq2Seq deep neural network to identify the emotions present in the text sequences. Several layers, namely an embedding layer, an encoding-decoding layer, a softmax layer and a loss layer, are used to map the sequences from textual conversations to the emotions Angry, Happy, Sad and Others. We evaluated our approach on the EmoContext@SemEval2019 dataset and obtained micro-averaged F1 scores of 0.595 and 0.6568 for the pre-evaluation dataset and final evaluation test set respectively. Our approach improved the baseline score by 7% for the final evaluation test set. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2055/ |
PDF | https://www.aclweb.org/anthology/S19-2055 |
PWC | https://paperswithcode.com/paper/ssn_nlp-at-semeval-2019-task-3-contextual |
Repo | |
Framework | |
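The abstract above reports micro-averaged F1 scores. As a rough illustration of how that metric pools counts across classes, here is a minimal sketch. Restricting the average to the three emotion classes (leaving out Others) is an assumption made for this example, and the function name `micro_f1` and the toy labels are invented.

```python
def micro_f1(gold, predicted, classes):
    """Micro-averaged F1: pool TP, FP and FN over `classes`, then compute F1."""
    tp = fp = fn = 0
    for cls in classes:
        for g, p in zip(gold, predicted):
            if p == cls and g == cls:
                tp += 1          # correctly predicted this class
            elif p == cls:
                fp += 1          # predicted this class, gold is something else
            elif g == cls:
                fn += 1          # gold is this class, prediction missed it
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy, invented labels (not from the EmoContext data).
gold = ["Angry", "Happy", "Sad", "Others", "Happy", "Others"]
pred = ["Angry", "Sad", "Sad", "Happy", "Happy", "Others"]

# Assumption: only the three emotion classes enter the average.
score = micro_f1(gold, pred, classes=["Angry", "Happy", "Sad"])
```

Because counts are pooled before the ratio is taken, frequent classes influence the micro-averaged score more than rare ones.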
Self-Supervised Representation Learning From Multi-Domain Data
Title | Self-Supervised Representation Learning From Multi-Domain Data |
Authors | Zeyu Feng, Chang Xu, Dacheng Tao |
Abstract | We present an information-theoretically motivated constraint for self-supervised representation learning from multiple related domains. In contrast to previous self-supervised learning methods, our approach learns from multiple domains, which has the benefit of decreasing the built-in bias of individual domains, as well as leveraging information and allowing knowledge transfer across multiple domains. The proposed mutual information constraints encourage the neural network to extract common invariant information across domains and to preserve the peculiar information of each domain simultaneously. We adopt tractable upper and lower bounds of mutual information to make the proposed constraints solvable. The learned representation is less biased and more robust toward the input images. Extensive experimental results on both multi-domain and large-scale datasets demonstrate the necessity and advantage of multi-domain self-supervised learning with mutual information constraints. Representations learned in our framework on state-of-the-art methods achieve better performance than those learned on a single domain. |
Tasks | Representation Learning, Transfer Learning |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Feng_Self-Supervised_Representation_Learning_From_Multi-Domain_Data_ICCV_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_ICCV_2019/papers/Feng_Self-Supervised_Representation_Learning_From_Multi-Domain_Data_ICCV_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-representation-learning-from-1 |
Repo | |
Framework | |
Policy Optimization via Stochastic Recursive Gradient Algorithm
Title | Policy Optimization via Stochastic Recursive Gradient Algorithm |
Authors | Huizhuo Yuan, Chris Junchi Li, Yuhao Tang, Yuren Zhou |
Abstract | In this paper, we propose the StochAstic Recursive grAdient Policy Optimization (SARAPO) algorithm, a novel variance reduction method built on Trust Region Policy Optimization (TRPO). The algorithm incorporates the StochAstic Recursive grAdient algoritHm (SARAH) into the TRPO framework. Compared with the existing Stochastic Variance Reduced Policy Optimization (SVRPO), our algorithm is more stable in the variance. Furthermore, by theoretically analyzing the ordinary differential equation and the stochastic differential equation (ODE/SDE) of SARAH, we characterize its convergence property and stability. Our experiments demonstrate its performance on a variety of benchmark tasks. We show that our algorithm achieves a larger improvement in each iteration and matches or even outperforms SVRPO and TRPO. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=rJl3S2A9t7 |
PDF | https://openreview.net/pdf?id=rJl3S2A9t7 |
PWC | https://paperswithcode.com/paper/policy-optimization-via-stochastic-recursive |
Repo | |
Framework | |
Modeling MWEs in BTB-WN
Title | Modeling MWEs in BTB-WN |
Authors | Laska Laskova, Petya Osenova, Kiril Simov, Ivajlo Radev, Zara Kancheva |
Abstract | The paper presents the characteristics of the predominant types of MultiWord expressions (MWEs) in the BulTreeBank WordNet (BTB-WN). Their distribution in BTB-WN is discussed with respect to the overall hierarchical organization of the lexical resource. Also, a catena-based modeling is proposed for handling the issues of lexical semantics of MWEs. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5109/ |
PDF | https://www.aclweb.org/anthology/W19-5109 |
PWC | https://paperswithcode.com/paper/modeling-mwes-in-btb-wn |
Repo | |
Framework | |
Description-Based Zero-shot Fine-Grained Entity Typing
Title | Description-Based Zero-shot Fine-Grained Entity Typing |
Authors | Rasha Obeidat, Xiaoli Fern, Hamed Shahbazi, Prasad Tadepalli |
Abstract | Fine-grained entity typing (FGET) is the task of assigning a fine-grained type from a hierarchy to entity mentions in the text. As the taxonomy of types evolves continuously, it is desirable for an entity typing system to be able to recognize novel types without additional training. This work proposes a zero-shot entity typing approach that utilizes the type description available from Wikipedia to build a distributed semantic representation of the types. During training, our system learns to align the entity mentions and their corresponding type representations on the known types. At test time, any new type can be incorporated into the system given its Wikipedia description. We evaluate our approach on FIGER, a public benchmark entity typing dataset. Because the existing test set of FIGER covers only a small portion of the fine-grained types, we create a new test set by manually annotating a portion of the noisy training data. Our experiments demonstrate the effectiveness of the proposed method in recognizing novel types that are not present in the training data. |
Tasks | Entity Typing |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1087/ |
PDF | https://www.aclweb.org/anthology/N19-1087 |
PWC | https://paperswithcode.com/paper/description-based-zero-shot-fine-grained |
Repo | |
Framework | |
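The abstract above describes aligning mention representations with type representations so that a new type can be added at test time from its description alone. The sketch below illustrates only that inference step, under loud assumptions: the vectors are hand-written stand-ins for learned embeddings, cosine similarity is an assumed scoring function (the abstract does not name one), and the type names and the helper `rank_types` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_types(mention_vec, type_vecs):
    """Sort type names by similarity to the mention, best match first."""
    return sorted(
        type_vecs,
        key=lambda name: cosine(mention_vec, type_vecs[name]),
        reverse=True,
    )

# Hand-written stand-ins for description-derived type embeddings (hypothetical).
type_vecs = {
    "/person/athlete": [0.9, 0.1, 0.0],
    "/organization/company": [0.1, 0.9, 0.2],
}

# A type unseen in training is added at test time just by supplying its vector.
type_vecs["/person/chef"] = [0.5, 0.0, 0.85]

mention = [0.6, 0.05, 0.8]  # stand-in for an encoded entity mention
ranking = rank_types(mention, type_vecs)
```

No retraining is needed to score the new type: it only has to live in the same vector space as the mention representations.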
Joint Syntax Representation Learning and Visual Cue Translation for Video Captioning
Title | Joint Syntax Representation Learning and Visual Cue Translation for Video Captioning |
Authors | Jingyi Hou, Xinxiao Wu, Wentian Zhao, Jiebo Luo, Yunde Jia |
Abstract | Video captioning is a challenging task that involves not only visual perception but also syntax representation learning. Recent progress in video captioning has been achieved through visual perception, but syntax representation learning is still under-explored. We propose a novel video captioning approach that takes into account both visual perception and syntax representation learning to generate accurate descriptions of videos. Specifically, we use sentence templates composed of Part-of-Speech (POS) tags to represent the syntax structure of captions, and accordingly, syntax representation learning is performed by directly inferring POS tags from videos. The visual perception is implemented by a mixture model which translates visual cues into lexical words that are conditional on the learned syntactic structure of sentences. Thus, a video captioning task consists of two sub-tasks: video POS tagging and visual cue translation, which are jointly modeled and trained in an end-to-end fashion. Evaluations on three public benchmark datasets demonstrate that our proposed method achieves substantially better performance than the state-of-the-art methods, which validates the superiority of joint modeling of syntax representation learning and visual perception for video captioning. |
Tasks | Representation Learning, Video Captioning |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Hou_Joint_Syntax_Representation_Learning_and_Visual_Cue_Translation_for_Video_ICCV_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_ICCV_2019/papers/Hou_Joint_Syntax_Representation_Learning_and_Visual_Cue_Translation_for_Video_ICCV_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/joint-syntax-representation-learning-and |
Repo | |
Framework | |
Customizing Grapheme-to-Phoneme System for Non-Trivial Transcription Problems in Bangla Language
Title | Customizing Grapheme-to-Phoneme System for Non-Trivial Transcription Problems in Bangla Language |
Authors | Sudipta Saha Shubha, Nafis Sadeq, Shafayat Ahmed, Md. Nahidul Islam, Muhammad Abdullah Adnan, Md. Yasin Ali Khan, Mohammad Zuberul Islam |
Abstract | Grapheme-to-phoneme (G2P) conversion is an integral part of various text and speech processing systems, such as text-to-speech and speech recognition systems. The existing methodologies for G2P conversion in the Bangla language are mostly rule-based. However, data-driven approaches have proved their superiority over rule-based approaches for large-scale G2P conversion in other languages, such as English and German. As the performance of data-driven approaches for G2P conversion depends largely on the pronunciation lexicon on which the system is trained, in this paper we investigate developing an improved training lexicon by identifying and categorizing the critical cases in the Bangla language and including those critical cases in the training lexicon, in order to develop a robust G2P conversion system for Bangla. Additionally, we have incorporated nasal vowels in our proposed phoneme list. Our methodology outperforms other state-of-the-art approaches for G2P conversion in the Bangla language. |
Tasks | Speech Recognition |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1322/ |
PDF | https://www.aclweb.org/anthology/N19-1322 |
PWC | https://paperswithcode.com/paper/customizing-grapheme-to-phoneme-system-for |
Repo | |
Framework | |
Learn From Neighbour: A Curriculum That Train Low Weighted Samples By Imitating
Title | Learn From Neighbour: A Curriculum That Train Low Weighted Samples By Imitating |
Authors | Benyuan Sun, Yizhou Wang |
Abstract | Deep neural networks, which have achieved great success in a wide spectrum of applications, are often time, compute and storage hungry. Curriculum learning was proposed to speed up network training by following a syllabus from easy to hard. However, the relationship between data complexity and network training is unclear: why do hard examples harm performance at the beginning but help at the end? In this paper, we investigate this problem. Similar to internal covariate shift in the network forward pass, distribution changes in the weights of the top layers also affect the training of preceding layers during the backward pass. We call this phenomenon inverse “internal covariate shift”. Training on hard examples aggravates the distribution shift and damages training. To address this problem, we introduce a curriculum loss that consists of two parts: a) an adaptive weight that mitigates large early punishment; b) an additional representation loss for low-weighted samples. The intuition behind the loss is simple. We train the top layers on “good” samples to reduce large shifts, and encourage “bad” samples to learn from “good” samples. In detail, the adaptive weight assigns small values to hard examples, reducing the influence of noisy gradients. On the other hand, the less-weighted hard samples receive the proposed representation loss. Low-weighted data gets nearly no training signal and can get stuck in the embedding space for a long time. The proposed representation loss aims to encourage their training. This is done by letting them learn a better representation from their superior neighbours without participating in the learning of the top layers. In this way, the fluctuation of the top layers is reduced and hard samples also receive training signals. We find that curriculum learning needs random sampling between tasks for better training. Our curriculum loss is easy to combine with existing stochastic algorithms like SGD. Experimental results show a consistent improvement on several benchmark datasets. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=r1luCsCqFm |
PDF | https://openreview.net/pdf?id=r1luCsCqFm |
PWC | https://paperswithcode.com/paper/learn-from-neighbour-a-curriculum-that-train |
Repo | |
Framework | |
BIGSAGE: unsupervised inductive representation learning of graph via bi-attended sampling and global-biased aggregating
Title | BIGSAGE: unsupervised inductive representation learning of graph via bi-attended sampling and global-biased aggregating |
Authors | Xin Luo, Hankz Hankui Zhuo |
Abstract | Different kinds of representation learning techniques on graphs have shown significant effect in downstream machine learning tasks. Recently, in order to inductively learn representations for graph structures that are unobservable during training, a general framework with sampling and aggregating (GraphSAGE) was proposed by Hamilton and Ying and has proved more efficient than transductive methods in fields like transfer learning or evolving datasets. However, GraphSAGE is incapable of selective neighbor sampling and lacks memory of known nodes that have been trained. To address these problems, we present an unsupervised method that samples neighborhood information attended by co-occurring structures and optimizes a trainable global bias as a representation expectation for each node in the given graph. Experiments show that our approach outperforms the state-of-the-art inductive and unsupervised methods for representation learning on graphs. |
Tasks | Representation Learning, Transfer Learning |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SygxYoC5FX |
PDF | https://openreview.net/pdf?id=SygxYoC5FX |
PWC | https://paperswithcode.com/paper/bigsage-unsupervised-inductive-representation |
Repo | |
Framework | |
Complex event representation in a typed feature structure implementation of Role and Reference Grammar
Title | Complex event representation in a typed feature structure implementation of Role and Reference Grammar |
Authors | Erika Bellingham |
Abstract | |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/W19-1004/ |
PDF | https://www.aclweb.org/anthology/W19-1004 |
PWC | https://paperswithcode.com/paper/complex-event-representation-in-a-typed |
Repo | |
Framework | |
Fine-Grained Entity Typing via Hierarchical Multi Graph Convolutional Networks
Title | Fine-Grained Entity Typing via Hierarchical Multi Graph Convolutional Networks |
Authors | Hailong Jin, Lei Hou, Juanzi Li, Tiansi Dong |
Abstract | This paper addresses the problem of inferring the fine-grained type of an entity from a knowledge base. We convert this problem into the task of graph-based semi-supervised classification, and propose Hierarchical Multi Graph Convolutional Network (HMGCN), a novel Deep Learning architecture to tackle this problem. We construct three kinds of connectivity matrices to capture different kinds of semantic correlations between entities. A recursive regularization is proposed to model the subClassOf relations between types in given type hierarchy. Extensive experiments with two large-scale public datasets show that our proposed method significantly outperforms four state-of-the-art methods. |
Tasks | Entity Typing |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1502/ |
PDF | https://www.aclweb.org/anthology/D19-1502 |
PWC | https://paperswithcode.com/paper/fine-grained-entity-typing-via-hierarchical |
Repo | |
Framework | |
Overfitting Detection of Deep Neural Networks without a Hold Out Set
Title | Overfitting Detection of Deep Neural Networks without a Hold Out Set |
Authors | Konrad Groh |
Abstract | Overfitting is a ubiquitous problem in neural network training and is usually mitigated using a holdout data set. Here we challenge this rationale and investigate criteria for overfitting without using a holdout data set. Specifically, we train a model for a fixed number of epochs multiple times with varying fractions of randomized labels and for a range of regularization strengths. A properly trained model should not be able to attain an accuracy greater than the fraction of properly labeled data points. Otherwise the model overfits. We introduce two criteria for detecting overfitting and one to detect underfitting. We analyze early stopping, the regularization factor, and network depth. In safety-critical applications we are interested in models and parameter settings which perform well and are not likely to overfit. The methods of this paper allow characterizing and identifying such models. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=B1lKtjA9FQ |
PDF | https://openreview.net/pdf?id=B1lKtjA9FQ |
PWC | https://paperswithcode.com/paper/overfitting-detection-of-deep-neural-networks |
Repo | |
Framework | |
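The abstract above states that a properly trained model should not reach a training accuracy above the fraction of properly labeled points. A minimal sketch of that single check follows. It is not the paper's full set of criteria; the helper names `randomize_labels` and `exceeds_clean_fraction`, the uniform relabeling scheme, and the example accuracies are assumptions made for illustration.

```python
import random

def randomize_labels(labels, num_classes, fraction, seed=0):
    """Replace `fraction` of the labels with uniformly random classes.

    Returns the noisy labels and the fraction that still matches the
    original labels (a random draw can coincide with the true class).
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_random = round(fraction * len(labels))
    for idx in rng.sample(range(len(labels)), n_random):
        noisy[idx] = rng.randrange(num_classes)
    unchanged = sum(a == b for a, b in zip(labels, noisy))
    return noisy, unchanged / len(labels)

def exceeds_clean_fraction(train_accuracy, clean_fraction, tolerance=0.0):
    """Flag a run whose accuracy on the noisy labels exceeds the clean fraction."""
    return train_accuracy > clean_fraction + tolerance

# Toy setup: 100 binary labels, 40% of them re-drawn at random.
labels = [0, 1] * 50
noisy_labels, clean_fraction = randomize_labels(labels, num_classes=2, fraction=0.4)

# Hypothetical accuracies measured against `noisy_labels` after training.
memorising_run = exceeds_clean_fraction(0.99, clean_fraction)
underfit_run = exceeds_clean_fraction(0.55, clean_fraction)
```

A run that fits the noisy labels almost perfectly must be memorising randomized points, which is what the check flags; repeating it over several noise fractions and regularization strengths is what the abstract describes.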
Sketch Me if You Can: Towards Generating Detailed Descriptions of Object Shape by Grounding in Images and Drawings
Title | Sketch Me if You Can: Towards Generating Detailed Descriptions of Object Shape by Grounding in Images and Drawings |
Authors | Ting Han, Sina Zarrieß |
Abstract | A lot of recent work in Language & Vision has looked at generating descriptions or referring expressions for objects in scenes of real-world images, though focusing mostly on relatively simple language like object names, color and location attributes (e.g., brown chair on the left). This paper presents work on Draw-and-Tell, a dataset of detailed descriptions for common objects in images where annotators have produced fine-grained attribute-centric expressions distinguishing a target object from a range of similar objects. Additionally, the dataset comes with hand-drawn sketches for each object. As Draw-and-Tell is medium-sized and contains a rich vocabulary, it constitutes an interesting challenge for CNN-LSTM architectures used in state-of-the-art image captioning models. We explore whether the additional modality given through sketches can help such a model to learn to accurately ground detailed language referring expressions to object shapes. Our results are encouraging. |
Tasks | Image Captioning |
Published | 2019-10-01 |
URL | https://www.aclweb.org/anthology/W19-8618/ |
PDF | https://www.aclweb.org/anthology/W19-8618 |
PWC | https://paperswithcode.com/paper/sketch-me-if-you-can-towards-generating |
Repo | |
Framework | |