Paper Group NANR 112
"Condescending, Rude, Assholes": Framing gender and hostility on Stack Overflow
Title | "Condescending, Rude, Assholes": Framing gender and hostility on Stack Overflow |
Authors | Sian Brooke |
Abstract | The disciplines of Gender Studies and Data Science are incompatible. This is conventional wisdom, supported by how many computational studies simplify gender into an immutable binary categorization that appears crude to the critical social researcher. I argue that the characterization of gender norms is context specific and may prove valuable in constructing useful models. I show how gender can be framed in computational studies as a stylized repetition of acts mediated by a social structure, and not a possessed biological category. By conducting a review of existing work, I show how gender should be explored in multiplicity in computational research through clustering techniques, and layout how this is being achieved in a study in progress on gender hostility on Stack Overflow. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3519/ |
https://www.aclweb.org/anthology/W19-3519 | |
PWC | https://paperswithcode.com/paper/condescending-rude-assholes-framing-gender |
Repo | |
Framework | |
Barack’s Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling
Title | Barack’s Wife Hillary: Using Knowledge Graphs for Fact-Aware Language Modeling |
Authors | Robert Logan, Nelson F. Liu, Matthew E. Peters, Matt Gardner, Sameer Singh |
Abstract | Modeling human language requires the ability to not only generate fluent text but also encode factual knowledge. However, traditional language models are only capable of remembering facts seen at training time, and often have difficulty recalling them. To address this, we introduce the knowledge graph language model (KGLM), a neural language model with mechanisms for selecting and copying facts from a knowledge graph that are relevant to the context. These mechanisms enable the model to render information it has never seen before, as well as generate out-of-vocabulary tokens. We also introduce the Linked WikiText-2 dataset, a corpus of annotated text aligned to the Wikidata knowledge graph whose contents (roughly) match the popular WikiText-2 benchmark. In experiments, we demonstrate that the KGLM achieves significantly better performance than a strong baseline language model. We additionally compare different language model's ability to complete sentences requiring factual knowledge, showing that the KGLM outperforms even very large language models in generating facts. |
Tasks | Knowledge Graphs, Language Modelling |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1598/ |
https://www.aclweb.org/anthology/P19-1598 | |
PWC | https://paperswithcode.com/paper/baracks-wife-hillary-using-knowledge-graphs-1 |
Repo | |
Framework | |
Selective Self-Training for semi-supervised Learning
Title | Selective Self-Training for semi-supervised Learning |
Authors | Jisoo Jeong, Seungeui Lee, Nojun Kwak |
Abstract | Semi-supervised learning (SSL) is a study that efficiently exploits a large amount of unlabeled data to improve performance in conditions of limited labeled data. Most of the conventional SSL methods assume that the classes of unlabeled data are included in the set of classes of labeled data. In addition, these methods do not sort out useless unlabeled samples and use all the unlabeled data for learning, which is not suitable for realistic situations. In this paper, we propose an SSL method called selective self-training (SST), which selectively decides whether to include each unlabeled sample in the training process. It is also designed to be applied to a more real situation where classes of unlabeled data are different from the ones of the labeled data. For the conventional SSL problems which deal with data where both the labeled and unlabeled samples share the same class categories, the proposed method not only performs comparable to other conventional SSL algorithms but also can be combined with other SSL algorithms. While the conventional methods cannot be applied to the new SSL problems where the separated data do not share the classes, our method does not show any performance degradation even if the classes of unlabeled data are different from those of the labeled data. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SyzrLjA5FQ |
https://openreview.net/pdf?id=SyzrLjA5FQ | |
PWC | https://paperswithcode.com/paper/selective-self-training-for-semi-supervised |
Repo | |
Framework | |
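The selection idea in the abstract above can be illustrated with a minimal sketch. Note the swap: the abstract does not say how SST decides which samples to keep, so this example uses plain confidence thresholding on predicted class probabilities as a stand-in. The function name, the threshold value, and the input format are assumptions for illustration only.

```python
def select_pseudo_labels(probabilities, threshold=0.9):
    """Pick unlabeled samples whose most likely class is confident enough.

    probabilities: list of per-sample class-probability lists (assumed input format).
    Returns a list of (sample_index, pseudo_label) pairs.
    Confidence thresholding is a stand-in here, not the SST selection rule itself.
    """
    selected = []
    for index, class_probs in enumerate(probabilities):
        # Index of the highest-probability class for this sample.
        label = max(range(len(class_probs)), key=class_probs.__getitem__)
        if class_probs[label] >= threshold:
            selected.append((index, label))
    return selected


if __name__ == "__main__":
    unlabeled_predictions = [[0.95, 0.05], [0.6, 0.4], [0.1, 0.9]]
    # The second sample is too uncertain and is left out of training.
    print(select_pseudo_labels(unlabeled_predictions, threshold=0.9))
```

Samples that fall below the threshold are simply skipped, which is one way to keep out-of-class or otherwise unhelpful unlabeled data from entering training.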
DUTH at SemEval-2019 Task 8: Part-Of-Speech Features for Question Classification
Title | DUTH at SemEval-2019 Task 8: Part-Of-Speech Features for Question Classification |
Authors | Anastasios Bairaktaris, Symeon Symeonidis, Avi Arampatzis |
Abstract | This report describes the methods employed by the Democritus University of Thrace (DUTH) team for participating in SemEval-2019 Task 8: Fact Checking in Community Question Answering Forums. Our team dealt only with Subtask A: Question Classification. Our approach was based on shallow natural language processing (NLP) pre-processing techniques to reduce the noise in data, feature selection methods, and supervised machine learning algorithms such as NearestCentroid, Perceptron, and LinearSVC. To determine the essential features, we were aided by exploratory data analysis and visualizations. In order to improve classification accuracy, we developed a customized list of stopwords, retaining some opinion- and fact-denoting common function words which would have been removed by standard stoplisting. Furthermore, we examined the usefulness of part-of-speech (POS) categories for the task; by trying to remove nouns and adjectives, we found some evidence that verbs are a valuable POS category for the opinion question class. |
Tasks | Community Question Answering, Feature Selection, Question Answering |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2202/ |
https://www.aclweb.org/anthology/S19-2202 | |
PWC | https://paperswithcode.com/paper/duth-at-semeval-2019-task-8-part-of-speech |
Repo | |
Framework | |
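The customized stoplist described in the abstract above can be sketched as a set difference: start from a standard stoplist and take back the function words worth keeping. The word lists below are hypothetical, since the abstract does not list the words the team retained.

```python
# Hypothetical word lists for illustration; the team's actual lists are not
# given in the abstract.
STANDARD_STOPWORDS = {"the", "a", "is", "i", "should", "what", "in", "of", "to", "do", "you"}
RETAINED_WORDS = {"i", "should", "what", "you"}


def build_custom_stoplist(standard, retained):
    """Remove the retained words from a standard stoplist."""
    return set(standard) - set(retained)


def remove_stopwords(tokens, stoplist):
    """Drop tokens whose lowercase form is in the stoplist."""
    return [token for token in tokens if token.lower() not in stoplist]


if __name__ == "__main__":
    stoplist = build_custom_stoplist(STANDARD_STOPWORDS, RETAINED_WORDS)
    print(remove_stopwords("What is the best visa to apply for".split(), stoplist))
```

Words such as "what" and "should" survive filtering here, so a downstream classifier can still use them as cues for the question class.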
VARIATIONAL SGD: DROPOUT, GENERALIZATION AND CRITICAL POINT AT THE END OF CONVEXITY
Title | VARIATIONAL SGD: DROPOUT, GENERALIZATION AND CRITICAL POINT AT THE END OF CONVEXITY |
Authors | Michael Tetelman |
Abstract | The goal of the paper is to propose an algorithm for learning the most generalizable solution from given training data. It is shown that Bayesian approach leads to a solution that dependent on statistics of training data and not on particular samples. The solution is stable under perturbations of training data because it is defined by an integral contribution of multiple maxima of the likelihood and not by a single global maximum. Specifically, the Bayesian probability distribution of parameters (weights) of a probabilistic model given by a neural network is estimated via recurrent variational approximations. Derived recurrent update rules correspond to SGD-type rules for finding a minimum of an effective loss that is an average of an original negative log-likelihood over the Gaussian distributions of weights, which makes it a function of means and variances. The effective loss is convex for large variances and non-convex in the limit of small variances. Among stationary solutions of the update rules there are trivial solutions with zero variances at local minima of the original loss and a single non-trivial solution with finite variances that is a critical point at the end of convexity of the effective loss in the mean-variance space. At the critical point both first- and second-order gradients of the effective loss w.r.t. means are zero. The empirical study confirms that the critical point represents the most generalizable solution. While the location of the critical point in the weight space depends on specifics of the used probabilistic model some properties at the critical point are universal and model independent. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=r1ztwiCcYQ |
https://openreview.net/pdf?id=r1ztwiCcYQ | |
PWC | https://paperswithcode.com/paper/variational-sgd-dropout-generalization-and |
Repo | |
Framework | |
Detecting Concealed Information in Text and Speech
Title | Detecting Concealed Information in Text and Speech |
Authors | Shengli Hu |
Abstract | Motivated by infamous cheating scandals in the media industry, the wine industry, and political campaigns, we address the problem of detecting concealed information in technical settings. In this work, we explore acoustic-prosodic and linguistic indicators of information concealment by collecting a unique corpus of professionals practicing for oral exams while concealing information. We reveal subtle signs of concealing information in speech and text, compare and contrast them with those in deception detection literature, uncovering the link between concealing information and deception. We then present a series of experiments that automatically detect concealed information from text and speech. We compare the use of acoustic-prosodic, linguistic, and individual feature sets, using different machine learning models. Finally, we present a multi-task learning framework with acoustic, linguistic, and individual features, that outperforms human performance by over 15%. |
Tasks | Deception Detection, Multi-Task Learning |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1039/ |
https://www.aclweb.org/anthology/P19-1039 | |
PWC | https://paperswithcode.com/paper/detecting-concealed-information-in-text-and |
Repo | |
Framework | |
Self-Attention Architectures for Answer-Agnostic Neural Question Generation
Title | Self-Attention Architectures for Answer-Agnostic Neural Question Generation |
Authors | Thomas Scialom, Benjamin Piwowarski, Jacopo Staiano |
Abstract | Neural architectures based on self-attention, such as Transformers, recently attracted interest from the research community, and obtained significant improvements over the state of the art in several tasks. We explore how Transformers can be adapted to the task of Neural Question Generation without constraining the model to focus on a specific answer passage. We study the effect of several strategies to deal with out-of-vocabulary words such as copy mechanisms, placeholders, and contextual word embeddings. We report improvements obtained over the state-of-the-art on the SQuAD dataset according to automated metrics (BLEU, ROUGE), as well as qualitative human assessments of the system outputs. |
Tasks | Question Generation, Word Embeddings |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1604/ |
https://www.aclweb.org/anthology/P19-1604 | |
PWC | https://paperswithcode.com/paper/self-attention-architectures-for-answer |
Repo | |
Framework | |
Negative Lexically Constrained Decoding for Paraphrase Generation
Title | Negative Lexically Constrained Decoding for Paraphrase Generation |
Authors | Tomoyuki Kajiwara |
Abstract | Paraphrase generation can be regarded as monolingual translation. Unlike bilingual machine translation, paraphrase generation rewrites only a limited portion of an input sentence. Hence, previous methods based on machine translation often perform conservatively to fail to make necessary rewrites. To solve this problem, we propose a neural model for paraphrase generation that first identifies words in the source sentence that should be paraphrased. Then, these words are paraphrased by the negative lexically constrained decoding that avoids outputting these words as they are. Experiments on text simplification and formality transfer show that our model improves the quality of paraphrasing by making necessary rewrites to an input sentence. |
Tasks | Machine Translation, Paraphrase Generation, Text Simplification |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1607/ |
https://www.aclweb.org/anthology/P19-1607 | |
PWC | https://paperswithcode.com/paper/negative-lexically-constrained-decoding-for |
Repo | |
Framework | |
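The core of negative lexical constraints, as described in the abstract above, is that the decoder may not emit certain source words as they are. A toy sketch follows, assuming greedy decoding over hand-written per-step score tables. A real system would apply the same masking inside a neural decoder, typically with beam search, and the function name and data layout here are hypothetical.

```python
def constrained_greedy_decode(step_scores, banned_tokens):
    """Greedy decoding that never emits a banned token.

    step_scores: list of dicts mapping candidate token -> score, one dict per
                 output position (a toy stand-in for decoder output scores).
    banned_tokens: words identified as needing a paraphrase.
    """
    banned = set(banned_tokens)
    output = []
    for scores in step_scores:
        # Mask out banned tokens before choosing the best candidate.
        allowed = {token: score for token, score in scores.items() if token not in banned}
        if not allowed:
            raise ValueError("every candidate at this step is banned")
        output.append(max(allowed, key=allowed.get))
    return output


if __name__ == "__main__":
    toy_scores = [
        {"the": 0.9, "a": 0.1},
        {"physician": 0.6, "doctor": 0.4},
    ]
    # Without the constraint the decoder would copy "physician" unchanged.
    print(constrained_greedy_decode(toy_scores, banned_tokens=["physician"]))
```

Because the highest-scoring candidate is often the source word itself, masking it forces the rewrite that a conservative model would otherwise skip.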
Large-Scale Transfer Learning for Natural Language Generation
Title | Large-Scale Transfer Learning for Natural Language Generation |
Authors | Sergey Golovanov, Rauf Kurbanov, Sergey Nikolenko, Kyryl Truskovskyi, Alexander Tselousov, Thomas Wolf |
Abstract | Large-scale pretrained language models define state of the art in natural language processing, achieving outstanding performance on a variety of tasks. We study how these architectures can be applied and adapted for natural language generation, comparing a number of architectural and training schemes. We focus in particular on open-domain dialog as a typical high entropy generation task, presenting and comparing different architectures for adapting pretrained models with state of the art results. |
Tasks | Text Generation, Transfer Learning |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1608/ |
https://www.aclweb.org/anthology/P19-1608 | |
PWC | https://paperswithcode.com/paper/large-scale-transfer-learning-for-natural |
Repo | |
Framework | |
Box of Lies: Multimodal Deception Detection in Dialogues
Title | Box of Lies: Multimodal Deception Detection in Dialogues |
Authors | Felix Soldner, Verónica Pérez-Rosas, Rada Mihalcea |
Abstract | Deception often takes place during everyday conversations, yet conversational dialogues remain largely unexplored by current work on automatic deception detection. In this paper, we address the task of detecting multimodal deceptive cues during conversational dialogues. We introduce a multimodal dataset containing deceptive conversations between participants playing the Box of Lies game from The Tonight Show Starring Jimmy Fallon, in which they try to guess whether an object description provided by their opponent is deceptive or not. We conduct annotations of multimodal communication behaviors, including facial and linguistic behaviors, and derive several learning features based on these annotations. Initial classification experiments show promising results, performing well above both a random and a human baseline, and reaching up to 69% accuracy in distinguishing deceptive and truthful behaviors. |
Tasks | Deception Detection |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1175/ |
https://www.aclweb.org/anthology/N19-1175 | |
PWC | https://paperswithcode.com/paper/box-of-lies-multimodal-deception-detection-in |
Repo | |
Framework | |
TENSOR RING NETS ADAPTED DEEP MULTI-TASK LEARNING
Title | TENSOR RING NETS ADAPTED DEEP MULTI-TASK LEARNING |
Authors | Xinqi Chen, Ming Hou, Guoxu Zhou, Qibin Zhao |
Abstract | Recent deep multi-task learning (MTL) has been witnessed its success in alleviating data scarcity of some task by utilizing domain-specific knowledge from related tasks. Nonetheless, several major issues of deep MTL, including the effectiveness of sharing mechanisms, the efficiency of model complexity and the flexibility of network architectures, still remain largely unaddressed. To this end, we propose a novel generalized latent-subspace based knowledge sharing mechanism for linking task-specific models, namely tensor ring multi-task learning (TRMTL). TRMTL has a highly compact representation, and it is very effective in transferring task-invariant knowledge while being super flexible in learning task-specific features, successfully mitigating the dilemma of both negative-transfer in lower layers and under-transfer in higher layers. Under our TRMTL, it is feasible for each task to have heterogenous input data dimensionality or distinct feature sizes at different hidden layers. Experiments on a variety of datasets demonstrate our model is capable of significantly improving each single task’s performance, particularly favourable in scenarios where some of the tasks have insufficient data. |
Tasks | Multi-Task Learning |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=BJxmXhRcK7 |
https://openreview.net/pdf?id=BJxmXhRcK7 | |
PWC | https://paperswithcode.com/paper/tensor-ring-nets-adapted-deep-multi-task |
Repo | |
Framework | |
Radial Distortion Triangulation
Title | Radial Distortion Triangulation |
Authors | Zuzana Kukelova, Viktor Larsson |
Abstract | This paper presents the first optimal, maximal likelihood, solution to the triangulation problem for radially distorted cameras. The proposed solution to the two-view triangulation problem minimizes the L2-norm of the reprojection error in the distorted image space. We cast the problem as the search for corrected distorted image points, and we use a Lagrange multiplier formulation to impose the epipolar constraint for undistorted points. For the one-parameter division model, this formulation leads to a system of five quartic polynomial equations in five unknowns, which can be exactly solved using the Groebner basis method. While the proposed Groebner basis solution is provably optimal; it is too slow for practical applications. Therefore, we developed a fast iterative solver to this problem. Extensive empirical tests show that the iterative algorithm delivers the optimal solution virtually every time, thus making it an L2-optimal algorithm de facto. It is iterative in nature, yet in practice, it converges in no more than five iterations. We thoroughly evaluate the proposed method on both synthetic and real-world data, and we show the benefits of performing the triangulation in the distorted space in the presence of radial distortion. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Kukelova_Radial_Distortion_Triangulation_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Kukelova_Radial_Distortion_Triangulation_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/radial-distortion-triangulation |
Repo | |
Framework | |
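For readers unfamiliar with the one-parameter division model mentioned in the abstract above, it maps a distorted image point to its undistorted position by dividing by 1 + λ·r², where r is the distance of the distorted point from the distortion center. The sketch below shows only that mapping, not the paper's triangulation solver, and it assumes coordinates are already expressed relative to the distortion center. The function name is hypothetical.

```python
def undistort_point(x_d, y_d, lam):
    """Undistort a point under the one-parameter division model.

    (x_d, y_d): distorted coordinates relative to the distortion center (assumed).
    lam: the single radial distortion parameter.
    """
    r_squared = x_d * x_d + y_d * y_d
    scale = 1.0 + lam * r_squared
    if scale == 0.0:
        raise ValueError("point lies on the singular radius of the model")
    return x_d / scale, y_d / scale


if __name__ == "__main__":
    # Negative lam models barrel distortion: undistorting pushes points outward.
    print(undistort_point(1.0, 0.0, lam=-0.5))
```

Triangulating "in the distorted space", as the abstract puts it, means measuring reprojection error before this mapping is applied, on the coordinates where the image noise actually lives.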
Constraint-based Learning of Phonological Processes
Title | Constraint-based Learning of Phonological Processes |
Authors | Shraddha Barke, Rose Kunkel, Nadia Polikarpova, Eric Meinhardt, Eric Bakovic, Leon Bergen |
Abstract | Phonological processes are context-dependent sound changes in natural languages. We present an unsupervised approach to learning human-readable descriptions of phonological processes from collections of related utterances. Our approach builds upon a technique from the programming languages community called constraint-based program synthesis. We contribute a novel encoding of the learning problem into Boolean Satisfiability constraints, which enables both data efficiency and fast inference. We evaluate our system on textbook phonology problems and datasets from the literature, and show that it achieves high accuracy at interactive speeds. |
Tasks | Program Synthesis |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1639/ |
https://www.aclweb.org/anthology/D19-1639 | |
PWC | https://paperswithcode.com/paper/constraint-based-learning-of-phonological |
Repo | |
Framework | |
Chameleon: A Language Model Adaptation Toolkit for Automatic Speech Recognition of Conversational Speech
Title | Chameleon: A Language Model Adaptation Toolkit for Automatic Speech Recognition of Conversational Speech |
Authors | Yuanfeng Song, Di Jiang, Weiwei Zhao, Qian Xu, Raymond Chi-Wing Wong, Qiang Yang |
Abstract | Language model is a vital component in modern automatic speech recognition (ASR) systems. Since "one-size-fits-all" language model works suboptimally for conversational speeches, language model adaptation (LMA) is considered as a promising solution for solving this problem. In order to compare the state-of-the-art LMA techniques and systematically demonstrate their effect in conversational speech recognition, we develop a novel toolkit named Chameleon, which includes the state-of-the-art cache-based and topic-based LMA techniques. This demonstration does not only vividly visualize underlying working mechanisms of a variety of the state-of-the-art LMA models but also provide an interface for the user to customize the hyperparameters of them. With this demonstration, the audience can experience the effect of LMA in an interactive and real-time fashion. We wish this demonstration would inspire more research on better language model techniques for ASR. |
Tasks | Language Modelling, Speech Recognition |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-3007/ |
https://www.aclweb.org/anthology/D19-3007 | |
PWC | https://paperswithcode.com/paper/chameleon-a-language-model-adaptation-toolkit |
Repo | |
Framework | |
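Cache-based adaptation, one of the two technique families named in the abstract above, is commonly formulated as a linear interpolation between a static language model and a unigram "cache" built from recent words in the conversation. The sketch below shows that generic formulation and is not taken from the Chameleon toolkit. The function name, the interpolation weight, and the input format are assumptions.

```python
from collections import Counter


def cache_adapted_probability(word, base_prob, history, cache_weight=0.2):
    """Interpolate a base LM probability with a unigram cache over recent words.

    base_prob: probability of `word` under the static language model (assumed given).
    history: list of recently recognized words in the conversation.
    cache_weight: interpolation weight for the cache component (hypothetical value).
    """
    if not history:
        # With no conversation history there is nothing to adapt to.
        return base_prob
    cache_prob = Counter(history)[word] / len(history)
    return (1.0 - cache_weight) * base_prob + cache_weight * cache_prob


if __name__ == "__main__":
    recent_words = ["refund", "order", "refund", "please"]
    # A word repeated in the conversation gets a boost over its base probability.
    print(cache_adapted_probability("refund", 0.01, recent_words, cache_weight=0.2))
```

The interpolation weight is the kind of hyperparameter the abstract says users can customize through the toolkit's interface.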
Interactive Image Segmentation via Backpropagating Refinement Scheme
Title | Interactive Image Segmentation via Backpropagating Refinement Scheme |
Authors | Won-Dong Jang, Chang-Su Kim |
Abstract | An interactive image segmentation algorithm, which accepts user-annotations about a target object and the background, is proposed in this work. We convert user-annotations into interaction maps by measuring distances of each pixel to the annotated locations. Then, we perform the forward pass in a convolutional neural network, which outputs an initial segmentation map. However, the user-annotated locations can be mislabeled in the initial result. Therefore, we develop the backpropagating refinement scheme (BRS), which corrects the mislabeled pixels. Experimental results demonstrate that the proposed algorithm outperforms the conventional algorithms on four challenging datasets. Furthermore, we demonstrate the generality and applicability of BRS in other computer vision tasks, by transforming existing convolutional neural networks into user-interactive ones. |
Tasks | Semantic Segmentation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Jang_Interactive_Image_Segmentation_via_Backpropagating_Refinement_Scheme_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Jang_Interactive_Image_Segmentation_via_Backpropagating_Refinement_Scheme_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/interactive-image-segmentation-via |
Repo | |
Framework | |
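The first step in the abstract above, turning user annotations into an interaction map by measuring each pixel's distance to the annotated locations, can be sketched as follows. This is a minimal pure-Python version that assumes Euclidean distance to the nearest click. The paper may use a different distance or normalization, and the function name is hypothetical.

```python
import math


def interaction_map(height, width, clicks):
    """Distance from every pixel to its nearest annotated location.

    clicks: list of (row, col) positions the user marked (assumed format).
    Returns a height x width nested list of Euclidean distances.
    """
    if not clicks:
        raise ValueError("at least one annotated location is required")
    return [
        [
            min(math.hypot(row - click_row, col - click_col) for click_row, click_col in clicks)
            for col in range(width)
        ]
        for row in range(height)
    ]


if __name__ == "__main__":
    # One map would be built for object clicks and another for background clicks.
    for row in interaction_map(2, 3, clicks=[(0, 0)]):
        print(row)
```

Maps like this are what the network receives alongside the image, one for the target object and one for the background.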