Paper Group NANR 212
TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling
Title | TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling |
Authors | Ben Verhoeven, Walter Daelemans, Barbara Plank |
Abstract | Personality profiling is the task of detecting personality traits of authors based on writing style. Several personality typologies exist; however, the Myers-Briggs Type Indicator (MBTI) is particularly popular in the non-scientific community, and many people use it to analyse their own personality and talk about the results online. Therefore, large amounts of self-assessed data on MBTI are readily available on social-media platforms such as Twitter. We present a novel corpus of tweets annotated with the MBTI personality type and gender of their author for six Western European languages (Dutch, German, French, Italian, Portuguese and Spanish). We outline the corpus creation and annotation, show statistics of the obtained data distributions and present first baselines on Myers-Briggs personality profiling and gender prediction for all six languages. |
Tasks | Gender Prediction |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1258/ |
https://www.aclweb.org/anthology/L16-1258 | |
PWC | https://paperswithcode.com/paper/twisty-a-multilingual-twitter-stylometry |
Repo | |
Framework | |
Cross-lingual Transfer of Correlations between Parts of Speech and Gaze Features
Title | Cross-lingual Transfer of Correlations between Parts of Speech and Gaze Features |
Authors | Maria Barrett, Frank Keller, Anders Søgaard |
Abstract | Several recent studies have shown that eye movements during reading provide information about grammatical and syntactic processing, which can assist the induction of NLP models. All these studies have been limited to English, however. This study shows that gaze and part of speech (PoS) correlations largely transfer across English and French. This means that we can replicate previous studies on gaze-based PoS tagging for French, but also that we can use English gaze data to assist the induction of French NLP models. |
Tasks | Cross-Lingual Transfer, Sentence Compression |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1126/ |
https://www.aclweb.org/anthology/C16-1126 | |
PWC | https://paperswithcode.com/paper/cross-lingual-transfer-of-correlations |
Repo | |
Framework | |
Joint Transition-based Dependency Parsing and Disfluency Detection for Automatic Speech Recognition Texts
Title | Joint Transition-based Dependency Parsing and Disfluency Detection for Automatic Speech Recognition Texts |
Authors | Masashi Yoshikawa, Hiroyuki Shindo, Yuji Matsumoto |
Abstract | |
Tasks | Dependency Parsing, Speech Recognition, Transition-Based Dependency Parsing |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1109/ |
https://www.aclweb.org/anthology/D16-1109 | |
PWC | https://paperswithcode.com/paper/joint-transition-based-dependency-parsing-and |
Repo | |
Framework | |
GhoSt-NN: A Representative Gold Standard of German Noun-Noun Compounds
Title | GhoSt-NN: A Representative Gold Standard of German Noun-Noun Compounds |
Authors | Sabine Schulte im Walde, Anna Hätty, Stefan Bott, Nana Khvtisavrishvili |
Abstract | This paper presents a novel gold standard of German noun-noun compounds (GhoSt-NN) including 868 compounds annotated with corpus frequencies of the compounds and their constituents, productivity and ambiguity of the constituents, semantic relations between the constituents, and compositionality ratings of compound-constituent pairs. Moreover, a subset of 180 compounds is balanced for the productivity of the modifiers (distinguishing low/mid/high productivity) and the ambiguity of the heads (distinguishing between heads with 1, 2 and >2 senses). |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1362/ |
https://www.aclweb.org/anthology/L16-1362 | |
PWC | https://paperswithcode.com/paper/ghost-nn-a-representative-gold-standard-of |
Repo | |
Framework | |
Learning brain regions via large-scale online structured sparse dictionary learning
Title | Learning brain regions via large-scale online structured sparse dictionary learning |
Authors | Elvis Dohmatob, Arthur Mensch, Gael Varoquaux, Bertrand Thirion |
Abstract | We propose a multivariate online dictionary-learning method for obtaining decompositions of brain images with structured and sparse components (aka atoms). Sparsity is to be understood in the usual sense: the dictionary atoms are constrained to contain mostly zeros. This is imposed via an $\ell_1$-norm constraint. By “structured”, we mean that the atoms are piece-wise smooth and compact, thus making up blobs, as opposed to scattered patterns of activation. We propose to use a Sobolev (Laplacian) penalty to impose this type of structure. Combining the two penalties, we obtain decompositions that properly delineate brain structures from functional images. This non-trivially extends the online dictionary-learning work of Mairal et al. (2010), at the price of only a factor of 2 or 3 on the overall running time. Just like the Mairal et al. (2010) reference method, the online nature of our proposed algorithm allows it to scale to arbitrarily sized datasets. Experiments on brain data show that our proposed method extracts structured and denoised dictionaries that are more interpretable and better capture inter-subject variability in small, medium, and large-scale regimes alike, compared to state-of-the-art models. |
Tasks | Dictionary Learning |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6352-learning-brain-regions-via-large-scale-online-structured-sparse-dictionary-learning |
http://papers.nips.cc/paper/6352-learning-brain-regions-via-large-scale-online-structured-sparse-dictionary-learning.pdf | |
PWC | https://paperswithcode.com/paper/learning-brain-regions-via-large-scale-online |
Repo | |
Framework | |
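
The two penalties named in the abstract above can be illustrated with a minimal sketch. It only evaluates an $\ell_1$ norm and a discrete Sobolev (squared-gradient) term on a toy 1-D "atom"; it is not the authors' online learning algorithm. The function names, the 1-D setting, and the factor of 0.5 are assumptions made for illustration.

```python
def l1_norm(atom):
    """Sum of absolute values; small when the atom is mostly zeros."""
    return sum(abs(v) for v in atom)


def sobolev_penalty(atom):
    """Half the sum of squared differences between neighbouring entries.

    This is one simple 1-D discretisation of a squared-gradient
    (Laplacian-type) smoothness term, assumed here for illustration.
    """
    return 0.5 * sum((b - a) ** 2 for a, b in zip(atom, atom[1:]))


# Two toy atoms with the same number of active entries (same l1 norm).
blob = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]       # compact, blob-like
scattered = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0]  # scattered activation

for name, atom in (("blob", blob), ("scattered", scattered)):
    print(name, l1_norm(atom), sobolev_penalty(atom))
```

Both toy atoms have the same $\ell_1$ norm, but the scattered one pays a larger smoothness penalty, which is the sense in which the combined penalties favour compact blobs.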
Shallow Semantic Reasoning from an Incomplete Gold Standard for Learner Language
Title | Shallow Semantic Reasoning from an Incomplete Gold Standard for Learner Language |
Authors | Levi King, Markus Dickinson |
Abstract | |
Tasks | Grammatical Error Detection |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0512/ |
https://www.aclweb.org/anthology/W16-0512 | |
PWC | https://paperswithcode.com/paper/shallow-semantic-reasoning-from-an-incomplete |
Repo | |
Framework | |
運用序列到序列生成架構於重寫式自動摘要(Exploiting Sequence-to-Sequence Generation Framework for Automatic Abstractive Summarization)[In Chinese]
Title | 運用序列到序列生成架構於重寫式自動摘要(Exploiting Sequence-to-Sequence Generation Framework for Automatic Abstractive Summarization)[In Chinese] |
Authors | Yu-Lun Hsieh, Shih-Hung Liu, Kuan-Yu Chen, Hsin-Min Wang, Wen-Lian Hsu, Berlin Chen |
Abstract | |
Tasks | Abstractive Text Summarization, Text Summarization |
Published | 2016-10-01 |
URL | https://www.aclweb.org/anthology/O16-1012/ |
https://www.aclweb.org/anthology/O16-1012 | |
PWC | https://paperswithcode.com/paper/ec-aoaaaoac14ea-a14eaaeexploiting-sequence-to |
Repo | |
Framework | |
What to Do with an Airport? Mining Arguments in the German Online Participation Project Tempelhofer Feld
Title | What to Do with an Airport? Mining Arguments in the German Online Participation Project Tempelhofer Feld |
Authors | Matthias Liebeck, Katharina Esau, Stefan Conrad |
Abstract | |
Tasks | Argument Mining, Decision Making |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2817/ |
https://www.aclweb.org/anthology/W16-2817 | |
PWC | https://paperswithcode.com/paper/what-to-do-with-an-airport-mining-arguments |
Repo | |
Framework | |
Rhetorical structure and argumentation structure in monologue text
Title | Rhetorical structure and argumentation structure in monologue text |
Authors | Andreas Peldszus, Manfred Stede |
Abstract | |
Tasks | Argument Mining |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2812/ |
https://www.aclweb.org/anthology/W16-2812 | |
PWC | https://paperswithcode.com/paper/rhetorical-structure-and-argumentation |
Repo | |
Framework | |
Assessing the Potential of Metaphoricity of verbs using corpus data
Title | Assessing the Potential of Metaphoricity of verbs using corpus data |
Authors | Marco Del Tredici, Núria Bel |
Abstract | The paper investigates the relation between metaphoricity and distributional characteristics of verbs, introducing POM, a corpus-derived index that can be used to define the upper bound of metaphoricity of any expression in which a given verb occurs. The work starts from the observation that while some verbs can be used to create highly metaphoric expressions, others cannot. We conjecture that this fact is related to the number of contexts in which a verb occurs and to the frequency of each context. This intuition is modelled by introducing a method in which each context of a verb in a corpus is assigned a vector representation, and a clustering algorithm is employed to identify similar contexts. Finally, the standard deviation of the relative frequency values of the clusters is computed and taken as the POM of the target verb. We tested POM in two experimental settings, obtaining accuracy values of 84% and 92%. Since we are convinced, along with (Shutoff, 2015), that metaphor detection systems should be concerned only with the identification of highly metaphoric expressions, we believe that POM could be profitably employed by these systems to a priori exclude expressions that, due to the verb they include, can only have low degrees of metaphoricity. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1724/ |
https://www.aclweb.org/anthology/L16-1724 | |
PWC | https://paperswithcode.com/paper/assessing-the-potential-of-metaphoricity-of |
Repo | |
Framework | |
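
The last step of the POM computation described in the abstract above (standard deviation of the clusters' relative frequencies) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the context vectors have already been built and clustered elsewhere, so the input is simply one cluster label per occurrence of the verb. The function name and the use of the population standard deviation are assumptions; the abstract does not say which variant is used.

```python
from collections import Counter
from statistics import pstdev


def pom_from_cluster_labels(labels):
    """Standard deviation of the relative frequencies of context clusters.

    `labels` holds one cluster label per corpus occurrence of the verb
    (hypothetical input; vectorisation and clustering are not shown).
    """
    if not labels:
        raise ValueError("at least one context is required")
    counts = Counter(labels)
    total = len(labels)
    relative_frequencies = [count / total for count in counts.values()]
    # Population standard deviation: an assumption, see the note above.
    return pstdev(relative_frequencies)


# Contexts spread evenly over two clusters versus concentrated in one.
even = pom_from_cluster_labels([0, 0, 1, 1])
skewed = pom_from_cluster_labels([0, 0, 0, 1])
print(even, skewed)
```

With evenly spread contexts the relative frequencies are identical and the index is zero; the more unevenly the contexts fall into clusters, the larger the value.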
TweetGeo - A Tool for Collecting, Processing and Analysing Geo-encoded Linguistic Data
Title | TweetGeo - A Tool for Collecting, Processing and Analysing Geo-encoded Linguistic Data |
Authors | Nikola Ljubešić, Tanja Samardžić, Curdin Derungs |
Abstract | In this paper we present a newly developed tool that enables researchers interested in spatial variation of language to define a geographic perimeter of interest, collect data from the Twitter streaming API published in that perimeter, filter the obtained data by language and country, define and extract variables of interest and analyse the extracted variables by one spatial statistic and two spatial visualisations. We showcase the tool on the area and a selection of languages spoken in former Yugoslavia. By defining the perimeter, languages and a series of linguistic variables of interest we demonstrate the data collection, processing and analysis capabilities of the tool. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1322/ |
https://www.aclweb.org/anthology/C16-1322 | |
PWC | https://paperswithcode.com/paper/tweetgeo-a-tool-for-collecting-processing-and |
Repo | |
Framework | |
DTSim at SemEval-2016 Task 1: Semantic Similarity Model Including Multi-Level Alignment and Vector-Based Compositional Semantics
Title | DTSim at SemEval-2016 Task 1: Semantic Similarity Model Including Multi-Level Alignment and Vector-Based Compositional Semantics |
Authors | Rajendra Banjade, Nabin Maharjan, Dipesh Gautam, Vasile Rus |
Abstract | |
Tasks | Chunking, Semantic Composition, Semantic Similarity, Semantic Textual Similarity |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1097/ |
https://www.aclweb.org/anthology/S16-1097 | |
PWC | https://paperswithcode.com/paper/dtsim-at-semeval-2016-task-1-semantic |
Repo | |
Framework | |
Tables as Semi-structured Knowledge for Question Answering
Title | Tables as Semi-structured Knowledge for Question Answering |
Authors | Sujay Kumar Jauhar, Peter Turney, Eduard Hovy |
Abstract | |
Tasks | Information Retrieval, Question Answering |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1045/ |
https://www.aclweb.org/anthology/P16-1045 | |
PWC | https://paperswithcode.com/paper/tables-as-semi-structured-knowledge-for |
Repo | |
Framework | |
Unsupervised Pre-training With Seq2Seq Reconstruction Loss for Deep Relation Extraction Models
Title | Unsupervised Pre-training With Seq2Seq Reconstruction Loss for Deep Relation Extraction Models |
Authors | Zhuang Li, Lizhen Qu, Qiongkai Xu, Mark Johnson |
Abstract | |
Tasks | Relation Extraction |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/U16-1006/ |
https://www.aclweb.org/anthology/U16-1006 | |
PWC | https://paperswithcode.com/paper/unsupervised-pre-training-with-seq2seq |
Repo | |
Framework | |
Two Decades of Terminology: European Framework Programmes Titles
Title | Two Decades of Terminology: European Framework Programmes Titles |
Authors | Gabriella Pardelli, Sara Goggi, Silvia Giannini, Stefania Biagioni |
Abstract | This work analyses a corpus made of the titles of research projects belonging to the last four European Commission Framework Programmes (FP4, FP5, FP6, FP7) during a time span of nearly two decades (1994-2012). The starting point is the idea of creating a corpus of titles which would constitute a terminological niche, a sort of “cluster map” offering an overall vision of the terms used and the links between them. Moreover, by performing a terminological comparison over a period of time it is possible to trace the presence of obsolete words in outdated research areas as well as of neologisms in the most recent fields. Within this scenario, the minimal purpose is to build a corpus of titles of European projects belonging to the several Framework Programmes in order to obtain a terminological mapping of relevant words in the various research areas: particularly significant would be those terms spread across different domains or those closely tied to a specific domain. A term could actually be found in many fields, and being able to acknowledge and retrieve this cross-presence means being able to link those different domains by means of a process of terminological mapping. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1058/ |
https://www.aclweb.org/anthology/L16-1058 | |
PWC | https://paperswithcode.com/paper/two-decades-of-terminology-european-framework |
Repo | |
Framework | |