January 24, 2020

Paper Group NANR 170

Predicting Helpful Posts in Open-Ended Discussion Forums: A Neural Architecture

Title Predicting Helpful Posts in Open-Ended Discussion Forums: A Neural Architecture
Authors Kishaloy Halder, Min-Yen Kan, Kazunari Sugiyama
Abstract Users participate in online discussion forums to learn from others and share their knowledge with the community. They often start a thread with a question or by sharing their new findings on a certain topic. We find that, unlike Community Question Answering, where questions are mostly factoid based, the threads in a forum are often open-ended (e.g., asking for recommendations from others) without a single correct answer. In this paper, we address the task of identifying helpful posts in a forum thread to help users comprehend long running discussion threads, which often contain repetitive or irrelevant posts. We propose a recurrent neural network based architecture to model (i) the relevance of a post regarding the original post starting the thread and (ii) the novelty it brings to the discussion, compared to the previous posts in the thread. Experimental results on different types of online forum datasets show that our model significantly outperforms the state-of-the-art neural network models for text classification.
Tasks Community Question Answering, Question Answering, Text Classification
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1318/
PDF https://www.aclweb.org/anthology/N19-1318
PWC https://paperswithcode.com/paper/predicting-helpful-posts-in-open-ended
Repo
Framework

SSN_NLP at SemEval-2019 Task 6: Offensive Language Identification in Social Media using Traditional and Deep Machine Learning Approaches

Title SSN_NLP at SemEval-2019 Task 6: Offensive Language Identification in Social Media using Traditional and Deep Machine Learning Approaches
Authors Thenmozhi D., Senthil Kumar B., Srinethe Sharavanan, Aravindan Chandrabose
Abstract Offensive language identification (OLI) in user generated text is the automatic detection of any profanity, insult, obscenity, racism or vulgarity that degrades an individual or a group. It is helpful for hate speech detection, flame detection and cyber bullying. Due to the immense growth of accessibility to social media, OLI helps to avoid abuse and hurts. In this paper, we present deep and traditional machine learning approaches for OLI. In the deep learning approach, we have used bi-directional LSTM with different attention mechanisms to build the models, and in traditional machine learning, TF-IDF weighting schemes with classifiers, namely Multinomial Naive Bayes and Support Vector Machines with Stochastic Gradient Descent optimizer, are used for model building. The approaches are evaluated on the OffensEval@SemEval2019 dataset and our team SSN_NLP submitted runs for three tasks of the OffensEval shared task. The best runs of SSN_NLP obtained F1 scores of 0.53, 0.48 and 0.3 and accuracies of 0.63, 0.84 and 0.42 for tasks A, B and C respectively. Our approaches improved the baseline F1 scores by 12%, 26% and 14% for Task A, B and C respectively.
Tasks Hate Speech Detection, Language Identification
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2130/
PDF https://www.aclweb.org/anthology/S19-2130
PWC https://paperswithcode.com/paper/ssn_nlp-at-semeval-2019-task-6-offensive
Repo
Framework
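
The abstract above mentions TF-IDF weighting as the input to the traditional classifiers but does not say which variant was used. As a rough illustration only, the sketch below computes one common unsmoothed form, tf(t, d) · ln(N / df(t)), in plain Python. The function name `tfidf`, the toy sentences, and the choice of formula are assumptions for illustration, not the authors' setup, and the classifiers themselves are not shown.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting (not taken from the paper):
    relative term frequency times ln(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document

    weighted = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weighted.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted


# Toy, made-up sentences; "you" occurs in every document.
docs = [
    "you are awful".split(),
    "you are kind".split(),
    "you had an awful awful day".split(),
]
weights = tfidf(docs)
```

With this form, a term that occurs in every document gets weight zero, while a term confined to one document is weighted most heavily, which is why such vectors are a common input for linear text classifiers.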

SSN_NLP at SemEval-2019 Task 3: Contextual Emotion Identification from Textual Conversation using Seq2Seq Deep Neural Network

Title SSN_NLP at SemEval-2019 Task 3: Contextual Emotion Identification from Textual Conversation using Seq2Seq Deep Neural Network
Authors Senthil Kumar B., Thenmozhi D., Aravindan Chandrabose, Srinethe Sharavanan
Abstract Emotion identification is a process of identifying the emotions automatically from text, speech or images. Emotion identification from textual conversations is a challenging problem due to the absence of gestures, vocal intonation and facial expressions. It enables conversational agents, chat bots and messengers to detect and report the emotions to the user instantly for a healthy conversation by avoiding emotional cues and miscommunications. We have adopted a Seq2Seq deep neural network to identify the emotions present in the text sequences. Several layers, namely an embedding layer, an encoding-decoding layer, a softmax layer and a loss layer, are used to map the sequences from textual conversations to the emotions, namely Angry, Happy, Sad and Others. We have evaluated our approach on the EmoContext@SemEval2019 dataset and obtained micro-averaged F1 scores of 0.595 and 0.6568 for the pre-evaluation dataset and the final evaluation test set respectively. Our approach improved the baseline score by 7% for the final evaluation test set.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2055/
PDF https://www.aclweb.org/anthology/S19-2055
PWC https://paperswithcode.com/paper/ssn_nlp-at-semeval-2019-task-3-contextual
Repo
Framework

Self-Supervised Representation Learning From Multi-Domain Data

Title Self-Supervised Representation Learning From Multi-Domain Data
Authors Zeyu Feng, Chang Xu, Dacheng Tao
Abstract We present an information-theoretically motivated constraint for self-supervised representation learning from multiple related domains. In contrast to previous self-supervised learning methods, our approach learns from multiple domains, which has the benefit of decreasing the built-in bias of an individual domain, as well as leveraging information and allowing knowledge transfer across multiple domains. The proposed mutual information constraints encourage the neural network to extract common invariant information across domains and to preserve peculiar information of each domain simultaneously. We adopt tractable upper and lower bounds of mutual information to make the proposed constraints solvable. The learned representation is more unbiased and robust toward the input images. Extensive experimental results on both multi-domain and large-scale datasets demonstrate the necessity and advantage of multi-domain self-supervised learning with mutual information constraints. Representations learned in our framework on state-of-the-art methods achieve better performance than those learned on a single domain.
Tasks Representation Learning, Transfer Learning
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Feng_Self-Supervised_Representation_Learning_From_Multi-Domain_Data_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Feng_Self-Supervised_Representation_Learning_From_Multi-Domain_Data_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/self-supervised-representation-learning-from-1
Repo
Framework

Policy Optimization via Stochastic Recursive Gradient Algorithm

Title Policy Optimization via Stochastic Recursive Gradient Algorithm
Authors Huizhuo Yuan, Chris Junchi Li, Yuhao Tang, Yuren Zhou
Abstract In this paper, we propose the StochAstic Recursive grAdient Policy Optimization (SARAPO) algorithm, which is a novel variance reduction method on Trust Region Policy Optimization (TRPO). The algorithm incorporates the StochAstic Recursive grAdient algoritHm (SARAH) into the TRPO framework. Compared with the existing Stochastic Variance Reduced Policy Optimization (SVRPO), our algorithm is more stable in the variance. Furthermore, by theoretically analyzing the ordinary differential equation and the stochastic differential equation (ODE/SDE) of SARAH, we study its convergence property and stability. Our experiments demonstrate its performance on a variety of benchmark tasks. We show that our algorithm gets better improvement in each iteration and matches or even outperforms SVRPO and TRPO.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=rJl3S2A9t7
PDF https://openreview.net/pdf?id=rJl3S2A9t7
PWC https://paperswithcode.com/paper/policy-optimization-via-stochastic-recursive
Repo
Framework
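
The abstract above builds on SARAH, whose core is the recursive gradient estimate v_t = ∇f_i(w_t) − ∇f_i(w_{t−1}) + v_{t−1}, restarted from a full gradient at the start of each outer loop. The sketch below applies plain SARAH to a toy one-dimensional least-squares problem. It is not SARAPO and involves no TRPO machinery; the function names, the toy data, the step size, and the use of the last inner iterate as the next starting point are all illustrative assumptions.

```python
import random


def sarah(grad_i, n, w0, step, inner_steps, epochs, rng):
    """Plain SARAH on a finite sum of n scalar functions (toy sketch)."""
    w = w0
    for _ in range(epochs):
        # Outer loop: restart the estimator from the full gradient.
        v = sum(grad_i(i, w) for i in range(n)) / n
        w_prev, w = w, w - step * v
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Recursive update: correct the previous estimate with the
            # change in one sampled component gradient.
            v = grad_i(i, w) - grad_i(i, w_prev) + v
            w_prev, w = w, w - step * v
    return w


# Toy problem (made up): f_i(w) = 0.5 * (a_i * w - b_i)^2, all minimized at w = 2.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]


def grad_i(i, w):
    return a[i] * (a[i] * w - b[i])


w_hat = sarah(grad_i, n=3, w0=0.0, step=0.05, inner_steps=3, epochs=30,
              rng=random.Random(0))
```

Unlike SVRG-style estimators, which always compare against a fixed snapshot, the SARAH estimate is updated from its own previous value, so it keeps accumulating corrections between restarts.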

Modeling MWEs in BTB-WN

Title Modeling MWEs in BTB-WN
Authors Laska Laskova, Petya Osenova, Kiril Simov, Ivajlo Radev, Zara Kancheva
Abstract The paper presents the characteristics of the predominant types of MultiWord expressions (MWEs) in the BulTreeBank WordNet (BTB-WN). Their distribution in BTB-WN is discussed with respect to the overall hierarchical organization of the lexical resource. Also, a catena-based modeling is proposed for handling the issues of lexical semantics of MWEs.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5109/
PDF https://www.aclweb.org/anthology/W19-5109
PWC https://paperswithcode.com/paper/modeling-mwes-in-btb-wn
Repo
Framework

Description-Based Zero-shot Fine-Grained Entity Typing

Title Description-Based Zero-shot Fine-Grained Entity Typing
Authors Rasha Obeidat, Xiaoli Fern, Hamed Shahbazi, Prasad Tadepalli
Abstract Fine-grained Entity typing (FGET) is the task of assigning a fine-grained type from a hierarchy to entity mentions in the text. As the taxonomy of types evolves continuously, it is desirable for an entity typing system to be able to recognize novel types without additional training. This work proposes a zero-shot entity typing approach that utilizes the type description available from Wikipedia to build a distributed semantic representation of the types. During training, our system learns to align the entity mentions and their corresponding type representations on the known types. At test time, any new type can be incorporated into the system given its Wikipedia descriptions. We evaluate our approach on FIGER, a public benchmark entity typing dataset. Because the existing test set of FIGER covers only a small portion of the fine-grained types, we create a new test set by manually annotating a portion of the noisy training data. Our experiments demonstrate the effectiveness of the proposed method in recognizing novel types that are not present in the training data.
Tasks Entity Typing
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1087/
PDF https://www.aclweb.org/anthology/N19-1087
PWC https://paperswithcode.com/paper/description-based-zero-shot-fine-grained
Repo
Framework

Joint Syntax Representation Learning and Visual Cue Translation for Video Captioning

Title Joint Syntax Representation Learning and Visual Cue Translation for Video Captioning
Authors Jingyi Hou, Xinxiao Wu, Wentian Zhao, Jiebo Luo, Yunde Jia
Abstract Video captioning is a challenging task that involves not only visual perception but also syntax representation learning. Recent progress in video captioning has been achieved through visual perception, but syntax representation learning is still under-explored. We propose a novel video captioning approach that takes into account both visual perception and syntax representation learning to generate accurate descriptions of videos. Specifically, we use sentence templates composed of Part-of-Speech (POS) tags to represent the syntax structure of captions, and accordingly, syntax representation learning is performed by directly inferring POS tags from videos. The visual perception is implemented by a mixture model which translates visual cues into lexical words that are conditional on the learned syntactic structure of sentences. Thus, a video captioning task consists of two sub-tasks: video POS tagging and visual cue translation, which are jointly modeled and trained in an end-to-end fashion. Evaluations on three public benchmark datasets demonstrate that our proposed method achieves substantially better performance than the state-of-the-art methods, which validates the superiority of joint modeling of syntax representation learning and visual perception for video captioning.
Tasks Representation Learning, Video Captioning
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Hou_Joint_Syntax_Representation_Learning_and_Visual_Cue_Translation_for_Video_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Hou_Joint_Syntax_Representation_Learning_and_Visual_Cue_Translation_for_Video_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/joint-syntax-representation-learning-and
Repo
Framework

Customizing Grapheme-to-Phoneme System for Non-Trivial Transcription Problems in Bangla Language

Title Customizing Grapheme-to-Phoneme System for Non-Trivial Transcription Problems in Bangla Language
Authors Sudipta Saha Shubha, Nafis Sadeq, Shafayat Ahmed, Md. Nahidul Islam, Muhammad Abdullah Adnan, Md. Yasin Ali Khan, Mohammad Zuberul Islam
Abstract Grapheme to phoneme (G2P) conversion is an integral part of various text and speech processing systems, such as Text to Speech and Speech Recognition systems. The existing methodologies for G2P conversion in the Bangla language are mostly rule-based. However, data-driven approaches have proved their superiority over rule-based approaches for large-scale G2P conversion in other languages, such as English and German. As the performance of data-driven approaches for G2P conversion depends largely on the pronunciation lexicon on which the system is trained, in this paper we investigate developing an improved training lexicon by identifying and categorizing the critical cases in the Bangla language and including those critical cases in the training lexicon to develop a robust G2P conversion system for Bangla. Additionally, we have incorporated nasal vowels in our proposed phoneme list. Our methodology outperforms other state-of-the-art approaches for G2P conversion in the Bangla language.
Tasks Speech Recognition
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1322/
PDF https://www.aclweb.org/anthology/N19-1322
PWC https://paperswithcode.com/paper/customizing-grapheme-to-phoneme-system-for
Repo
Framework

Learn From Neighbour: A Curriculum That Train Low Weighted Samples By Imitating

Title Learn From Neighbour: A Curriculum That Train Low Weighted Samples By Imitating
Authors Benyuan Sun, Yizhou Wang
Abstract Deep neural networks, which have achieved great success in a wide spectrum of applications, are often time-, compute-, and storage-hungry. Curriculum learning was proposed to boost network training by following a syllabus from easy to hard. However, the relationship between data complexity and network training is unclear: why do hard examples harm performance at the beginning but help at the end? In this paper, we investigate this problem. Similar to internal covariate shift in the network forward pass, distribution changes in the weights of top layers also affect the training of preceding layers during the backward pass. We call this phenomenon inverse “internal covariate shift”. Training on hard examples aggravates the distribution shift and damages training. To address this problem, we introduce a curriculum loss that consists of two parts: a) an adaptive weight that mitigates large early punishment; b) an additional representation loss for low-weighted samples. The intuition behind the loss is simple. We train top layers on “good” samples to reduce large shifts, and encourage “bad” samples to learn from “good” samples. In detail, the adaptive weight assigns small values to hard examples, reducing the influence of noisy gradients. On the other hand, the low-weighted hard samples receive the proposed representation loss. Low-weighted data gets nearly no training signal and can get stuck in embedding space for a long time. The proposed representation loss aims to encourage their training. This is done by letting them learn a better representation from their superior neighbours without participating in the learning of top layers. In this way, the fluctuation of top layers is reduced and hard samples also receive training signals. We find that curriculum learning needs random sampling between tasks for better training. Our curriculum loss is easy to combine with existing stochastic algorithms like SGD. Experimental results show a consistent improvement over several benchmark datasets.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=r1luCsCqFm
PDF https://openreview.net/pdf?id=r1luCsCqFm
PWC https://paperswithcode.com/paper/learn-from-neighbour-a-curriculum-that-train
Repo
Framework

BIGSAGE: unsupervised inductive representation learning of graph via bi-attended sampling and global-biased aggregating

Title BIGSAGE: unsupervised inductive representation learning of graph via bi-attended sampling and global-biased aggregating
Authors Xin Luo, Hankz Hankui Zhuo
Abstract Different kinds of representation learning techniques on graphs have shown significant effect in downstream machine learning tasks. Recently, in order to inductively learn representations for graph structures that are unobservable during training, a general framework with sampling and aggregating (GraphSAGE) was proposed by Hamilton and Ying and has been shown to be more efficient than transductive methods in fields like transfer learning or evolving datasets. However, GraphSAGE is incapable of selective neighbor sampling and lacks memory of known nodes that have been trained. To address these problems, we present an unsupervised method that samples neighborhood information attended by co-occurring structures and optimizes a trainable global bias as a representation expectation for each node in the given graph. Experiments show that our approach outperforms the state-of-the-art inductive and unsupervised methods for representation learning on graphs.
Tasks Representation Learning, Transfer Learning
Published 2019-05-01
URL https://openreview.net/forum?id=SygxYoC5FX
PDF https://openreview.net/pdf?id=SygxYoC5FX
PWC https://paperswithcode.com/paper/bigsage-unsupervised-inductive-representation
Repo
Framework
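
The abstract above contrasts BIGSAGE with the sample-and-aggregate step of GraphSAGE. As a rough reference point, the sketch below shows that baseline step in its simplest form: sample a fixed number of neighbours uniformly, average their features, and concatenate the result with the node's own features. The learned weight matrix and nonlinearity are omitted, and the function name, graph, and features are made up for illustration. This is not the bi-attended sampling or global bias proposed in the paper.

```python
import random


def sample_and_aggregate(features, adjacency, node, num_samples, rng):
    """One uniform-sampling, mean-aggregation step (no learned weights)."""
    neighbours = adjacency[node]
    if not neighbours:
        sampled = []
    elif len(neighbours) >= num_samples:
        sampled = rng.sample(neighbours, num_samples)  # without replacement
    else:
        # Too few neighbours: sample with replacement to keep a fixed size.
        sampled = [rng.choice(neighbours) for _ in range(num_samples)]

    dim = len(features[node])
    if sampled:
        mean = [sum(features[n][d] for n in sampled) / len(sampled)
                for d in range(dim)]
    else:
        mean = [0.0] * dim
    # Concatenate the node's own features with the neighbourhood summary.
    return features[node] + mean


# Toy graph (made up): node 0 is connected to nodes 1 and 2.
features = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [4.0, 2.0]}
adjacency = {0: [1, 2], 1: [0], 2: [0]}
embedding = sample_and_aggregate(features, adjacency, 0, 2, random.Random(0))
```

Because the uniform sampler treats every neighbour alike, it cannot prefer structurally relevant neighbours, which is the limitation the abstract says selective sampling is meant to address.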

Complex event representation in a typed feature structure implementation of Role and Reference Grammar

Title Complex event representation in a typed feature structure implementation of Role and Reference Grammar
Authors Erika Bellingham
Abstract
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/W19-1004/
PDF https://www.aclweb.org/anthology/W19-1004
PWC https://paperswithcode.com/paper/complex-event-representation-in-a-typed
Repo
Framework

Fine-Grained Entity Typing via Hierarchical Multi Graph Convolutional Networks

Title Fine-Grained Entity Typing via Hierarchical Multi Graph Convolutional Networks
Authors Hailong Jin, Lei Hou, Juanzi Li, Tiansi Dong
Abstract This paper addresses the problem of inferring the fine-grained type of an entity from a knowledge base. We convert this problem into the task of graph-based semi-supervised classification, and propose Hierarchical Multi Graph Convolutional Network (HMGCN), a novel Deep Learning architecture to tackle this problem. We construct three kinds of connectivity matrices to capture different kinds of semantic correlations between entities. A recursive regularization is proposed to model the subClassOf relations between types in given type hierarchy. Extensive experiments with two large-scale public datasets show that our proposed method significantly outperforms four state-of-the-art methods.
Tasks Entity Typing
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1502/
PDF https://www.aclweb.org/anthology/D19-1502
PWC https://paperswithcode.com/paper/fine-grained-entity-typing-via-hierarchical
Repo
Framework

Overfitting Detection of Deep Neural Networks without a Hold Out Set

Title Overfitting Detection of Deep Neural Networks without a Hold Out Set
Authors Konrad Groh
Abstract Overfitting is a ubiquitous problem in neural network training and is usually mitigated using a holdout data set. Here we challenge this rationale and investigate criteria for overfitting without using a holdout data set. Specifically, we train a model for a fixed number of epochs multiple times with varying fractions of randomized labels and for a range of regularization strengths. A properly trained model should not be able to attain an accuracy greater than the fraction of properly labeled data points. Otherwise the model overfits. We introduce two criteria for detecting overfitting and one to detect underfitting. We analyze early stopping, the regularization factor, and network depth. In safety critical applications we are interested in models and parameter settings which perform well and are not likely to overfit. The methods of this paper allow characterizing and identifying such models.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=B1lKtjA9FQ
PDF https://openreview.net/pdf?id=B1lKtjA9FQ
PWC https://paperswithcode.com/paper/overfitting-detection-of-deep-neural-networks
Repo
Framework
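
The abstract above states a simple rule of thumb: on a training set with partly randomized labels, accuracy should not exceed the fraction of properly labeled points. The sketch below encodes only that rule as a check over a sweep of runs. It does not reproduce the paper's two overfitting criteria or its underfitting criterion, and the function names, the `tolerance` parameter, and the run values are hypothetical.

```python
def flags_overfitting(train_accuracy, clean_fraction, tolerance=0.0):
    """True if accuracy on the noisy training set exceeds the clean fraction."""
    if not 0.0 <= clean_fraction <= 1.0:
        raise ValueError("clean_fraction must lie in [0, 1]")
    return train_accuracy > clean_fraction + tolerance


def flagged_runs(runs, tolerance=0.0):
    """Return the runs whose training accuracy breaks the rule of thumb."""
    return [
        run for run in runs
        if flags_overfitting(run["accuracy"], run["clean_fraction"], tolerance)
    ]


# Made-up sweep over regularization strength and label-noise level.
runs = [
    {"weight_decay": 0.0, "clean_fraction": 0.8, "accuracy": 0.97},
    {"weight_decay": 1e-3, "clean_fraction": 0.8, "accuracy": 0.79},
    {"weight_decay": 1e-3, "clean_fraction": 0.5, "accuracy": 0.52},
]
suspect = flagged_runs(runs, tolerance=0.01)
```

The check needs only training-set quantities, which is the point of the approach: the known amount of label noise takes the place of a holdout set as the reference for what a non-memorizing model can achieve.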

Sketch Me if You Can: Towards Generating Detailed Descriptions of Object Shape by Grounding in Images and Drawings

Title Sketch Me if You Can: Towards Generating Detailed Descriptions of Object Shape by Grounding in Images and Drawings
Authors Ting Han, Sina Zarrieß
Abstract A lot of recent work in Language & Vision has looked at generating descriptions or referring expressions for objects in scenes of real-world images, though focusing mostly on relatively simple language like object names, color and location attributes (e.g., brown chair on the left). This paper presents work on Draw-and-Tell, a dataset of detailed descriptions for common objects in images where annotators have produced fine-grained attribute-centric expressions distinguishing a target object from a range of similar objects. Additionally, the dataset comes with hand-drawn sketches for each object. As Draw-and-Tell is medium-sized and contains a rich vocabulary, it constitutes an interesting challenge for CNN-LSTM architectures used in state-of-the-art image captioning models. We explore whether the additional modality given through sketches can help such a model to learn to accurately ground detailed language referring expressions to object shapes. Our results are encouraging.
Tasks Image Captioning
Published 2019-10-01
URL https://www.aclweb.org/anthology/W19-8618/
PDF https://www.aclweb.org/anthology/W19-8618
PWC https://paperswithcode.com/paper/sketch-me-if-you-can-towards-generating
Repo
Framework