Paper Group NANR 82
Language and the Shifting Sands of Domain, Space and Time (Invited Talk). NILC at CWI 2018: Exploring Feature Engineering and Feature Learning. Combining Deep Learning and Argumentative Reasoning for the Analysis of Social Media Textual Content Using Small Data Sets. Lingmotif-lex: a Wide-coverage, State-of-the-art Lexicon for Sentiment Analysis. A …
Language and the Shifting Sands of Domain, Space and Time (Invited Talk)
Title | Language and the Shifting Sands of Domain, Space and Time (Invited Talk) |
Authors | Timothy Baldwin |
Abstract | In this talk, I will first present recent work on domain debiasing in the context of language identification, then discuss a new line of work on language variety analysis in the form of dialect map generation. Finally, I will reflect on the interplay between time and space on language variation, and speculate on how these can be captured in a single model. |
Tasks | Language Identification |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-3908/ |
PWC | https://paperswithcode.com/paper/language-and-the-shifting-sands-of-domain |
Repo | |
Framework | |
NILC at CWI 2018: Exploring Feature Engineering and Feature Learning
Title | NILC at CWI 2018: Exploring Feature Engineering and Feature Learning |
Authors | Nathan Hartmann, Leandro Borges dos Santos |
Abstract | This paper describes the results of the NILC team at CWI 2018. We developed solutions following three approaches: (i) a feature engineering method using lexical, n-gram and psycholinguistic features, (ii) a shallow neural network method using only word embeddings, and (iii) a Long Short-Term Memory (LSTM) language model, which is pre-trained on a large text corpus to produce contextualized word vectors. The feature engineering method obtained our best results for the classification task and the LSTM model achieved the best results for the probabilistic classification task. Our results show that deep neural networks are able to perform as well as traditional machine learning methods using manually engineered features for the task of complex word identification in English. |
Tasks | Complex Word Identification, Feature Engineering, Language Modelling, Lexical Simplification, Text Simplification, Word Embeddings |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-0540/ |
PWC | https://paperswithcode.com/paper/nilc-at-cwi-2018-exploring-feature |
Repo | |
Framework | |
Combining Deep Learning and Argumentative Reasoning for the Analysis of Social Media Textual Content Using Small Data Sets
Title | Combining Deep Learning and Argumentative Reasoning for the Analysis of Social Media Textual Content Using Small Data Sets |
Authors | Oana Cocarascu, Francesca Toni |
Abstract | The use of social media has become a regular habit for many and has changed the way people interact with each other. In this article, we focus on analyzing whether news headlines support tweets and whether reviews are deceptive by analyzing the interaction or the influence that these texts have on one another, thus exploiting contextual information. Concretely, we define a deep learning method for relation-based argument mining to extract argumentative relations of attack and support. We then use this method for determining whether news articles support tweets, a useful task in fact-checking settings, where determining agreement toward a statement is a useful step toward determining its truthfulness. Furthermore, we use our method for extracting bipolar argumentation frameworks from reviews to help detect whether they are deceptive. We show experimentally that our method performs well in both settings. In particular, in the case of deception detection, our method contributes a novel argumentative feature that, when used in combination with other features in standard supervised classifiers, outperforms the latter even on small data sets. |
Tasks | Argument Mining, Deception Detection |
Published | 2018-12-01 |
URL | https://www.aclweb.org/anthology/J18-4011/ |
PWC | https://paperswithcode.com/paper/combining-deep-learning-and-argumentative |
Repo | |
Framework | |
Lingmotif-lex: a Wide-coverage, State-of-the-art Lexicon for Sentiment Analysis
Title | Lingmotif-lex: a Wide-coverage, State-of-the-art Lexicon for Sentiment Analysis |
Authors | Antonio Moreno-Ortiz, Chantal Pérez-Hernández |
Abstract | |
Tasks | Emotion Classification, Opinion Mining, Sentiment Analysis |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1420/ |
PWC | https://paperswithcode.com/paper/lingmotif-lex-a-wide-coverage-state-of-the |
Repo | |
Framework | |
Annotating Attribution Relations in Arabic
Title | Annotating Attribution Relations in Arabic |
Authors | Amal Alsaif, Tasniem Alyahya, Madawi Alotaibi, Huda Almuzaini, Abeer Algahtani |
Abstract | |
Tasks | Information Retrieval, Opinion Mining, Question Answering |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1632/ |
PWC | https://paperswithcode.com/paper/annotating-attribution-relations-in-arabic |
Repo | |
Framework | |
The Collision of Quality and Technology with Reality
Title | The Collision of Quality and Technology with Reality |
Authors | Don DePalma |
Abstract | |
Tasks | Common Sense Reasoning |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1907/ |
PWC | https://paperswithcode.com/paper/the-collision-of-quality-and-technology-with |
Repo | |
Framework | |
U-PC: Unsupervised Planogram Compliance
Title | U-PC: Unsupervised Planogram Compliance |
Authors | Archan Ray, Nishant Kumar, Avishek Shaw, Dipti Prasad Mukherjee |
Abstract | We present an end-to-end solution for recognizing merchandise displayed in the shelves of a supermarket. Given images of individual products, which are taken under ideal illumination for product marketing, the challenge is to find these products automatically in the images of the shelves. Note that the images of shelves are taken using a hand-held camera under store-level illumination. We provide a two-layer hypotheses generation and verification model. In the first layer, the model predicts a set of candidate merchandise at a specific location of the shelf, while in the second layer, the hypothesis is verified by a novel graph-theoretic approach. The performance of the proposed approach on two publicly available datasets is better than the competing approaches by at least 10%. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Archan_Ray_U-PC_Unsupervised_Planogram_ECCV_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_ECCV_2018/papers/Archan_Ray_U-PC_Unsupervised_Planogram_ECCV_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/u-pc-unsupervised-planogram-compliance |
Repo | |
Framework | |
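The two-layer pipeline in the U-PC abstract begins by hypothesizing candidate products at each shelf location. A minimal sketch of that first (hypothesis-generation) layer, assuming cosine similarity over precomputed image feature vectors; the paper's graph-theoretic verification layer is not reproduced here:

```python
import numpy as np

def generate_hypotheses(patch_feats, product_feats, k=3):
    """First-layer sketch: for each shelf patch, hypothesize the k most
    similar catalog products by cosine similarity over feature vectors."""
    p = patch_feats / np.linalg.norm(patch_feats, axis=1, keepdims=True)
    q = product_feats / np.linalg.norm(product_feats, axis=1, keepdims=True)
    sims = p @ q.T                        # (patches, products) similarity matrix
    return np.argsort(-sims, axis=1)[:, :k]
```

In the paper these hypotheses are then verified jointly across neighboring shelf locations; the feature extractor and similarity choice above are illustrative assumptions.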
Directing Generative Networks with Weighted Maximum Mean Discrepancy
Title | Directing Generative Networks with Weighted Maximum Mean Discrepancy |
Authors | Maurice Diesendruck, Guy W. Cole, Sinead Williamson |
Abstract | The maximum mean discrepancy (MMD) between two probability measures P and Q is a metric that is zero if and only if all moments of the two measures are equal, making it an appealing statistic for two-sample tests. Given i.i.d. samples from P and Q, Gretton et al. (2012) show that we can construct an unbiased estimator for the square of the MMD between the two distributions. If P is a distribution of interest and Q is the distribution implied by a generative neural network with stochastic inputs, we can use this estimator to train our neural network. However, in practice we do not always have i.i.d. samples from our target of interest. Data sets often exhibit biases—for example, under-representation of certain demographics—and if we ignore this fact our machine learning algorithms will propagate these biases. Alternatively, it may be useful to assume our data has been gathered via a biased sample selection mechanism in order to manipulate properties of the estimating distribution Q. In this paper, we construct an estimator for the MMD between P and Q when we only have access to P via some biased sample selection mechanism, and suggest methods for estimating this sample selection mechanism when it is not already known. We show that this estimator can be used to train generative neural networks on a biased data sample, to give a simulator that reverses the effect of that bias. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SyuWNMZ0W |
PDF | https://openreview.net/pdf?id=SyuWNMZ0W |
PWC | https://paperswithcode.com/paper/directing-generative-networks-with-weighted |
Repo | |
Framework | |
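The abstract builds on the unbiased squared-MMD estimator of Gretton et al. (2012). A sketch of that estimator, plus a simple weighted plug-in variant for a biased sample; the weighting scheme shown here is illustrative, not the paper's exact estimator:

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-gamma * sq)

def mmd2_unbiased(x, y, gamma=1.0):
    """Unbiased estimator of squared MMD between samples x ~ P and y ~ Q."""
    m, n = len(x), len(y)
    kxx, kyy, kxy = (rbf_kernel(u, v, gamma) for u, v in [(x, x), (y, y), (x, y)])
    # Drop diagonal terms for unbiasedness.
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * kxy.mean()

def mmd2_weighted(x, y, w, gamma=1.0):
    """Plug-in squared MMD with importance weights w on the biased sample x
    (e.g., w_i proportional to the inverse selection probability). A biased
    V-statistic sketch of the idea, not the paper's exact construction."""
    w = np.asarray(w, float)
    w = w / w.sum()
    kxx, kyy, kxy = (rbf_kernel(u, v, gamma) for u, v in [(x, x), (y, y), (x, y)])
    return w @ kxx @ w + kyy.mean() - 2 * (w @ kxy).mean()
```

With equal weights the weighted variant reduces (up to diagonal terms) to the plug-in MMD, so the weights only matter when the selection mechanism is non-uniform.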
Semi-Supervised Disfluency Detection
Title | Semi-Supervised Disfluency Detection |
Authors | Feng Wang, Wei Chen, Zhen Yang, Qianqian Dong, Shuang Xu, Bo Xu |
Abstract | While disfluency detection has achieved notable success in recent years, it still suffers severely from data scarcity. To tackle this problem, we propose a novel semi-supervised approach which can utilize large amounts of unlabelled data. In this work, a light-weight neural net is proposed to extract hidden features based solely on self-attention, without any Recurrent Neural Network (RNN) or Convolutional Neural Network (CNN). In addition, we use the unlabelled corpus to enhance performance, and Generative Adversarial Network (GAN) training is applied to enforce similar distributions between the labelled and unlabelled data. The experimental results show that our approach achieves significant improvements over strong baselines. |
Tasks | Machine Translation, Question Answering |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1299/ |
PWC | https://paperswithcode.com/paper/semi-supervised-disfluency-detection |
Repo | |
Framework | |
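The light-weight extractor in the abstract relies solely on self-attention. A minimal single-head scaled dot-product self-attention sketch in NumPy; the paper's exact architecture, head count, and dimensions are assumptions not specified here:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a token sequence x of shape
    (T, d): returns contextualized features (T, d_v) computed from pairwise
    token affinities, with no recurrence or convolution."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (T, T) affinity matrix
    return softmax(scores, axis=-1) @ v
```

Each output row is a weighted mixture of all value vectors, which is what lets the model contextualize a token against the whole utterance in one step.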
Humor Recognition Using Deep Learning
Title | Humor Recognition Using Deep Learning |
Authors | Peng-Yu Chen, Von-Wun Soo |
Abstract | Humor is an essential and fascinating element of personal communication. How to build computational models to discover the structures of humor, recognize humor, and even generate humor remains a challenge, and there have as yet been few attempts at it. In this paper, we construct and collect four datasets with distinct joke types in both English and Chinese and conduct learning experiments on humor recognition. We implement a Convolutional Neural Network (CNN) with extensive filter sizes and numbers, and Highway Networks to increase the depth of the network. Results show that our model outperforms previous work on accuracy, precision, and recall in recognizing different types of humor on benchmarks collected in both English and Chinese. |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-2018/ |
PWC | https://paperswithcode.com/paper/humor-recognition-using-deep-learning |
Repo | |
Framework | |
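A sketch of the Highway layer the abstract pairs with the CNN: a learned transform gate mixes a nonlinear transform of the input with the untouched input, which is what allows such networks to be trained deep. The weights and the ReLU transform below are assumed for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_layer(x, wh, bh, wt, bt):
    """Highway layer: y = t * H(x) + (1 - t) * x, where the transform gate
    t = sigmoid(x @ wt + bt) decides how much of the nonlinear transform
    H(x) = relu(x @ wh + bh) to mix with the carried-through input."""
    h = np.maximum(0.0, x @ wh + bh)
    t = sigmoid(x @ wt + bt)
    return t * h + (1.0 - t) * x
```

When the gate saturates near zero the layer becomes an identity "carry" path, so gradients flow through many stacked layers unchanged.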
What is image captioning made of?
Title | What is image captioning made of? |
Authors | Pranava Madhyastha, Josiah Wang, Lucia Specia |
Abstract | We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn ‘distributional similarity’ in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space. To validate our hypothesis, we focus on the ‘image’ side of image captioning, and vary the input image representation but keep the RNN text generation model of a CNN-RNN constant. We propose a sparse bag-of-objects vector as an interpretable representation to investigate our distributional similarity hypothesis. We found that image captioning models (i) are capable of separating structure from noisy input representations; (ii) experience virtually no significant performance loss when a high dimensional representation is compressed to a lower dimensional space; (iii) cluster images with similar visual and linguistic information together; (iv) are heavily reliant on test sets with a similar distribution as the training set; (v) repeatedly generate the same captions by matching images and ‘retrieving’ a caption in the joint visual-textual space. Our experiments all point to one fact: that our distributional similarity hypothesis holds. We conclude that, regardless of the image representation, image captioning systems seem to match images and generate captions in a learned joint image-text semantic subspace. |
Tasks | Image Captioning, Text Generation |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HJNGGmZ0Z |
PDF | https://openreview.net/pdf?id=HJNGGmZ0Z |
PWC | https://paperswithcode.com/paper/what-is-image-captioning-made-of |
Repo | |
Framework | |
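The sparse bag-of-objects representation proposed in the abstract can be sketched as a count vector over a fixed object vocabulary; the vocabulary and the upstream object detector are assumptions here:

```python
from collections import Counter

def bag_of_objects(detections, vocab):
    """Sparse bag-of-objects vector: one count per object category in a fixed
    vocabulary. `detections` is the list of object labels detected in one
    image; labels outside the vocabulary are ignored."""
    counts = Counter(d for d in detections if d in vocab)
    return [counts[c] for c in vocab]
```

Because each dimension names a concrete object category, the vector stays interpretable, which is what makes it useful for probing the distributional-similarity hypothesis.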
TNT-NLG, System 1: Using a statistical NLG to massively augment crowd-sourced data for neural generation
Title | TNT-NLG, System 1: Using a statistical NLG to massively augment crowd-sourced data for neural generation |
Authors | Shereen Oraby, Lena Reed, Shubhangi Tandon, Stephanie Lukin, Marilyn A. Walker |
Abstract | Ever since the successful application of sequence to sequence learning for neural machine translation systems (Sutskever et al., 2014), interest has surged in its applicability towards language generation in other problem domains. In the area of natural language generation (NLG), there has been a great deal of interest in end-to-end (E2E) neural models that learn and generate natural language sentence realizations in one step. In this paper, we present TNT-NLG System 1, our first system submission to the E2E NLG Challenge, where we generate natural language (NL) realizations from meaning representations (MRs) in the restaurant domain by massively expanding the training dataset. We develop two models for this system, based on Dusek et al.’s (2016a) open source baseline model and context-aware neural language generator. Starting with the MR and NL pairs from the E2E generation challenge dataset, we explode the size of the training set using PERSONAGE (Mairesse and Walker, 2010), a statistical generator able to produce varied realizations from MRs, and use our expanded data as contextual input into our models. We present evaluation results using automated and human evaluation metrics, and describe directions for future work. |
Tasks | Data-to-Text Generation, Machine Translation, Text Generation |
Published | 2018-04-26 |
URL | http://www.macs.hw.ac.uk/InteractionLab/E2E/final_papers/E2E-TNT_NLG1.pdf |
PWC | https://paperswithcode.com/paper/tnt-nlg-system-1-using-a-statistical-nlg-to |
Repo | |
Framework | |
A Multi-answer Multi-task Framework for Real-world Machine Reading Comprehension
Title | A Multi-answer Multi-task Framework for Real-world Machine Reading Comprehension |
Authors | Jiahua Liu, Wan Wei, Maosong Sun, Hao Chen, Yantao Du, Dekang Lin |
Abstract | The task of machine reading comprehension (MRC) has evolved from answering simple questions from well-edited text to answering real questions from users out of web data. In the real-world setting, full-body text from multiple relevant documents in the top search results are provided as context for questions from user queries, including not only questions with a single, short, and factual answer, but also questions about reasons, procedures, and opinions. In this case, multiple answers could be equally valid for a single question and each answer may occur multiple times in the context, which should be taken into consideration when we build MRC system. We propose a multi-answer multi-task framework, in which different loss functions are used for multiple reference answers. Minimum Risk Training is applied to solve the multi-occurrence problem of a single answer. Combined with a simple heuristic passage extraction strategy for overlong documents, our model increases the ROUGE-L score on the DuReader dataset from 44.18, the previous state-of-the-art, to 51.09. |
Tasks | Information Retrieval, Machine Reading Comprehension, Question Answering, Reading Comprehension |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1235/ |
PWC | https://paperswithcode.com/paper/a-multi-answer-multi-task-framework-for-real |
Repo | |
Framework | |
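The multi-answer Minimum Risk Training idea in the abstract can be sketched as an expected-risk objective over sampled candidate answers, each scored against its best-matching reference. The token-overlap F1 reward below is an illustrative stand-in for the paper's ROUGE-based scoring:

```python
import numpy as np

def f1_overlap(pred_tokens, ref_tokens):
    """Token-overlap F1 between a predicted answer and one reference."""
    common = set(pred_tokens) & set(ref_tokens)
    if not common:
        return 0.0
    p = len(common) / len(pred_tokens)
    r = len(common) / len(ref_tokens)
    return 2 * p * r / (p + r)

def mrt_loss(candidates, probs, references):
    """Expected risk: sum_c p(c) * (1 - max_ref reward(c, ref)).
    Taking the max over references rewards a candidate that matches ANY
    reference, accommodating multiple equally valid answers."""
    probs = np.asarray(probs, float)
    probs = probs / probs.sum()   # renormalize over the sampled candidates
    risks = [1.0 - max(f1_overlap(c, r) for r in references) for c in candidates]
    return float(probs @ np.asarray(risks))
```

Minimizing this loss pushes probability mass toward candidates close to some reference, rather than forcing the model to reproduce one canonical answer span.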
LaSTUS/TALN at Complex Word Identification (CWI) 2018 Shared Task
Title | LaSTUS/TALN at Complex Word Identification (CWI) 2018 Shared Task |
Authors | Ahmed AbuRa'ed, Horacio Saggion |
Abstract | This paper presents the participation of the LaSTUS/TALN team in the Complex Word Identification (CWI) Shared Task 2018 in the English monolingual track. The purpose of the task was to determine whether a word in a given sentence can be judged as complex or not by a certain target audience. For the English track, task organizers provided training and development datasets of 27,299 and 3,328 words, respectively, together with the sentence in which each word occurs. The words were judged as complex or not by 20 human evaluators, ten of whom were native speakers. We submitted two systems: one system modeled each target word as a numeric vector populated with a set of lexical, semantic and contextual features, while the other system relies on a word-embedding representation and a distance metric. We trained two separate classifiers to automatically decide whether each word is complex or not. We submitted six runs, two for each of the three subsets of the English monolingual CWI track. |
Tasks | Complex Word Identification, Lexical Simplification, Text Simplification |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-0517/ |
PWC | https://paperswithcode.com/paper/lastustaln-at-complex-word-identification-cwi |
Repo | |
Framework | |
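The second submitted system pairs a word-embedding representation with a distance metric. A hedged sketch, assuming a cosine-distance-to-centroid scheme; the actual metric, reference point, and threshold are not specified in the abstract:

```python
import numpy as np

def cosine_distance(u, v):
    """Cosine distance between two embedding vectors."""
    return 1.0 - (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def is_complex(word_vec, simple_centroid, threshold=0.5):
    """Flag a word as complex when its embedding lies far (in cosine
    distance) from the centroid of embeddings of known-simple words.
    Hypothetical decision rule for illustration only."""
    return cosine_distance(word_vec, simple_centroid) > threshold
```

The threshold would in practice be tuned on the development set rather than fixed.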
R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting
Title | R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting |
Authors | Nicholas Rhinehart, Kris M. Kitani, Paul Vernaza |
Abstract | We propose a method to forecast a vehicle’s ego-motion as a distribution over spatiotemporal paths, conditioned on features (e.g., from LIDAR and images) embedded in an overhead map. The method learns a policy inducing a distribution over simulated trajectories that is both diverse (produces most paths likely under the data) and precise (mostly produces paths likely under the data). This balance is achieved through minimization of a symmetrized cross-entropy between the distribution and demonstration data. By viewing the simulated-outcome distribution as the pushforward of a simple distribution under a simulation operator, we obtain expressions for the cross-entropy metrics that can be efficiently evaluated and differentiated, enabling stochastic-gradient optimization. We propose concrete policy architectures for this model, discuss our evaluation metrics relative to previously-used metrics, and demonstrate the superiority of our method relative to state-of-the-art methods in both the KITTI dataset and a similar but novel and larger real-world dataset explicitly designed for the vehicle forecasting domain. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Nicholas_Rhinehart_R2P2_A_ReparameteRized_ECCV_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_ECCV_2018/papers/Nicholas_Rhinehart_R2P2_A_ReparameteRized_ECCV_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/r2p2-a-reparameterized-pushforward-policy-for |
Repo | |
Framework | |
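The symmetrized cross-entropy objective in the R2P2 abstract can be illustrated on discrete distributions: H(p, q) drives diversity (the model must cover all data modes), while H(q, p) drives precision (the model is penalized for mass far from the data). A toy sketch on finite supports, not the paper's continuous pushforward construction:

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_i p_i log q_i over a shared discrete support."""
    p = np.asarray(p, float)
    q = np.asarray(q, float)
    return float(-(p * np.log(q + eps)).sum())

def symmetrized_ce(p_data, q_model, beta=1.0):
    """Symmetrized objective H(p, q) + beta * H(q, p): the first term rewards
    covering the data (diversity), the second penalizes model probability
    mass where the data has none (precision)."""
    return cross_entropy(p_data, q_model) + beta * cross_entropy(q_model, p_data)
```

The objective attains its minimum when the model matches the data distribution, and beta trades off the diversity and precision terms.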