May 4, 2019

Paper Group NANR 212

TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling


Title	TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling
Authors	Ben Verhoeven, Walter Daelemans, Barbara Plank
Abstract	Personality profiling is the task of detecting personality traits of authors based on writing style. Several personality typologies exist; however, the Myers-Briggs Type Indicator (MBTI) is particularly popular in the non-scientific community, and many people use it to analyse their own personality and talk about the results online. Therefore, large amounts of self-assessed data on MBTI are readily available on social-media platforms such as Twitter. We present a novel corpus of tweets annotated with the MBTI personality type and gender of their author for six Western European languages (Dutch, German, French, Italian, Portuguese and Spanish). We outline the corpus creation and annotation, show statistics of the obtained data distributions and present first baselines on Myers-Briggs personality profiling and gender prediction for all six languages.
Tasks	Gender Prediction
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1258/
PDF	https://www.aclweb.org/anthology/L16-1258
PWC	https://paperswithcode.com/paper/twisty-a-multilingual-twitter-stylometry
Repo
Framework

Cross-lingual Transfer of Correlations between Parts of Speech and Gaze Features


Title	Cross-lingual Transfer of Correlations between Parts of Speech and Gaze Features
Authors	Maria Barrett, Frank Keller, Anders Søgaard
Abstract	Several recent studies have shown that eye movements during reading provide information about grammatical and syntactic processing, which can assist the induction of NLP models. All these studies have been limited to English, however. This study shows that gaze and part of speech (PoS) correlations largely transfer across English and French. This means that we can replicate previous studies on gaze-based PoS tagging for French, but also that we can use English gaze data to assist the induction of French NLP models.
Tasks	Cross-Lingual Transfer, Sentence Compression
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1126/
PDF	https://www.aclweb.org/anthology/C16-1126
PWC	https://paperswithcode.com/paper/cross-lingual-transfer-of-correlations
Repo
Framework

Joint Transition-based Dependency Parsing and Disfluency Detection for Automatic Speech Recognition Texts


Title	Joint Transition-based Dependency Parsing and Disfluency Detection for Automatic Speech Recognition Texts
Authors	Masashi Yoshikawa, Hiroyuki Shindo, Yuji Matsumoto
Abstract
Tasks	Dependency Parsing, Speech Recognition, Transition-Based Dependency Parsing
Published	2016-11-01
URL	https://www.aclweb.org/anthology/D16-1109/
PDF	https://www.aclweb.org/anthology/D16-1109
PWC	https://paperswithcode.com/paper/joint-transition-based-dependency-parsing-and
Repo
Framework

GhoSt-NN: A Representative Gold Standard of German Noun-Noun Compounds


Title	GhoSt-NN: A Representative Gold Standard of German Noun-Noun Compounds
Authors	Sabine Schulte im Walde, Anna Hätty, Stefan Bott, Nana Khvtisavrishvili
Abstract	This paper presents a novel gold standard of German noun-noun compounds (GhoSt-NN) including 868 compounds annotated with corpus frequencies of the compounds and their constituents, productivity and ambiguity of the constituents, semantic relations between the constituents, and compositionality ratings of compound-constituent pairs. Moreover, a subset of 180 compounds is balanced for the productivity of the modifiers (distinguishing low/mid/high productivity) and the ambiguity of the heads (distinguishing between heads with 1, 2 and >2 senses).
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1362/
PDF	https://www.aclweb.org/anthology/L16-1362
PWC	https://paperswithcode.com/paper/ghost-nn-a-representative-gold-standard-of
Repo
Framework

Learning brain regions via large-scale online structured sparse dictionary learning


Title	Learning brain regions via large-scale online structured sparse dictionary learning
Authors	Elvis Dohmatob, Arthur Mensch, Gael Varoquaux, Bertrand Thirion
Abstract	We propose a multivariate online dictionary-learning method for obtaining decompositions of brain images with structured and sparse components (aka atoms). Sparsity is to be understood in the usual sense: the dictionary atoms are constrained to contain mostly zeros. This is imposed via an $\ell_1$-norm constraint. By “structured”, we mean that the atoms are piece-wise smooth and compact, thus making up blobs, as opposed to scattered patterns of activation. We propose to use a Sobolev (Laplacian) penalty to impose this type of structure. Combining the two penalties, we obtain decompositions that properly delineate brain structures from functional images. This non-trivially extends the online dictionary-learning work of Mairal et al. (2010), at the price of only a factor of 2 or 3 on the overall running time. Just like the Mairal et al. (2010) reference method, the online nature of our proposed algorithm allows it to scale to arbitrarily sized datasets. Experiments on brain data show that our proposed method extracts structured and denoised dictionaries that are more interpretable and better capture inter-subject variability in small, medium, and large-scale regimes alike, compared to state-of-the-art models.
Tasks	Dictionary Learning
Published	2016-12-01
URL	http://papers.nips.cc/paper/6352-learning-brain-regions-via-large-scale-online-structured-sparse-dictionary-learning
PDF	http://papers.nips.cc/paper/6352-learning-brain-regions-via-large-scale-online-structured-sparse-dictionary-learning.pdf
PWC	https://paperswithcode.com/paper/learning-brain-regions-via-large-scale-online
Repo
Framework
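
The two penalties named in the abstract above can be illustrated on a toy example. The sketch below is not the authors' implementation: it assumes a one-dimensional atom (the paper works on brain images), and the function names `l1_norm` and `laplacian_penalty` are hypothetical. It only shows that two atoms with the same $\ell_1$ norm can differ in the smoothness term, which is smaller for a compact blob than for a scattered pattern.

```python
def l1_norm(atom):
    """Sum of absolute values; small when the atom is mostly zeros."""
    return sum(abs(x) for x in atom)


def laplacian_penalty(atom):
    """Half the sum of squared differences between neighbouring entries.

    This is a 1-D stand-in (an assumption of this sketch) for a
    Sobolev/Laplacian smoothness penalty: it is small for piece-wise
    smooth, compact atoms and large for scattered ones.
    """
    return 0.5 * sum((b - a) ** 2 for a, b in zip(atom, atom[1:]))


# Two toy atoms with the same number of active entries.
blob = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0]
scattered = [1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0]

for name, atom in [("blob", blob), ("scattered", scattered)]:
    print(name, l1_norm(atom), laplacian_penalty(atom))
```

Both atoms are equally sparse, so the $\ell_1$ term alone cannot tell them apart; the smoothness term is what favours the blob.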

Shallow Semantic Reasoning from an Incomplete Gold Standard for Learner Language


Title	Shallow Semantic Reasoning from an Incomplete Gold Standard for Learner Language
Authors	Levi King, Markus Dickinson
Abstract
Tasks	Grammatical Error Detection
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-0512/
PDF	https://www.aclweb.org/anthology/W16-0512
PWC	https://paperswithcode.com/paper/shallow-semantic-reasoning-from-an-incomplete
Repo
Framework

運用序列到序列生成架構於重寫式自動摘要(Exploiting Sequence-to-Sequence Generation Framework for Automatic Abstractive Summarization)[In Chinese]


Title	運用序列到序列生成架構於重寫式自動摘要(Exploiting Sequence-to-Sequence Generation Framework for Automatic Abstractive Summarization)[In Chinese]
Authors	Yu-Lun Hsieh, Shih-Hung Liu, Kuan-Yu Chen, Hsin-Min Wang, Wen-Lian Hsu, Berlin Chen
Abstract
Tasks	Abstractive Text Summarization, Text Summarization
Published	2016-10-01
URL	https://www.aclweb.org/anthology/O16-1012/
PDF	https://www.aclweb.org/anthology/O16-1012
PWC	https://paperswithcode.com/paper/ec-aoaaaoac14ea-a14eaaeexploiting-sequence-to
Repo
Framework

What to Do with an Airport? Mining Arguments in the German Online Participation Project Tempelhofer Feld


Title	What to Do with an Airport? Mining Arguments in the German Online Participation Project Tempelhofer Feld
Authors	Matthias Liebeck, Katharina Esau, Stefan Conrad
Abstract
Tasks	Argument Mining, Decision Making
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2817/
PDF	https://www.aclweb.org/anthology/W16-2817
PWC	https://paperswithcode.com/paper/what-to-do-with-an-airport-mining-arguments
Repo
Framework

Rhetorical structure and argumentation structure in monologue text


Title	Rhetorical structure and argumentation structure in monologue text
Authors	Andreas Peldszus, Manfred Stede
Abstract
Tasks	Argument Mining
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2812/
PDF	https://www.aclweb.org/anthology/W16-2812
PWC	https://paperswithcode.com/paper/rhetorical-structure-and-argumentation
Repo
Framework

Assessing the Potential of Metaphoricity of verbs using corpus data


Title	Assessing the Potential of Metaphoricity of verbs using corpus data
Authors	Marco Del Tredici, Núria Bel
Abstract	The paper investigates the relation between metaphoricity and distributional characteristics of verbs, introducing POM, a corpus-derived index that can be used to define the upper bound of metaphoricity of any expression in which a given verb occurs. The work starts from the observation that while some verbs can be used to create highly metaphoric expressions, others cannot. We conjecture that this fact is related to the number of contexts in which a verb occurs and to the frequency of each context. This intuition is modelled by introducing a method in which each context of a verb in a corpus is assigned a vector representation, and a clustering algorithm is employed to identify similar contexts. Finally, the standard deviation of the relative frequency values of the clusters is computed and taken as the POM of the target verb. We tested POM in two experimental settings, obtaining accuracy values of 84% and 92%. Since we are convinced, along with (Shutoff, 2015), that metaphor detection systems should be concerned only with the identification of highly metaphoric expressions, we believe that POM could be profitably employed by these systems to a priori exclude expressions that, due to the verb they include, can only have low degrees of metaphoricity.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1724/
PDF	https://www.aclweb.org/anthology/L16-1724
PWC	https://paperswithcode.com/paper/assessing-the-potential-of-metaphoricity-of
Repo
Framework
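
The last step of the POM computation described in the abstract above (standard deviation of the clusters' relative frequencies) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the contexts of a verb have already been vectorised and clustered, takes the resulting cluster labels as input, and uses the population standard deviation, which the abstract does not specify. The function name `pom_from_cluster_labels` is hypothetical.

```python
from collections import Counter
from statistics import pstdev


def pom_from_cluster_labels(labels):
    """Standard deviation of the relative frequencies of context clusters.

    `labels` holds one cluster label per observed context of the verb.
    Population standard deviation is an assumption of this sketch.
    """
    if not labels:
        raise ValueError("at least one context is required")
    counts = Counter(labels)
    total = len(labels)
    relative_frequencies = [count / total for count in counts.values()]
    return pstdev(relative_frequencies)


# Toy input: four contexts, three in cluster "a" and one in cluster "b".
skewed = ["a", "a", "a", "b"]
# Toy input: contexts spread evenly over two clusters.
even = ["a", "b", "a", "b"]

print(pom_from_cluster_labels(skewed))
print(pom_from_cluster_labels(even))
```

With evenly sized clusters the relative frequencies are identical and the value is zero; it grows as a few clusters come to dominate the verb's contexts.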

TweetGeo - A Tool for Collecting, Processing and Analysing Geo-encoded Linguistic Data


Title	TweetGeo - A Tool for Collecting, Processing and Analysing Geo-encoded Linguistic Data
Authors	Nikola Ljubešić, Tanja Samardžić, Curdin Derungs
Abstract	In this paper we present a newly developed tool that enables researchers interested in spatial variation of language to define a geographic perimeter of interest, collect data from the Twitter streaming API published in that perimeter, filter the obtained data by language and country, define and extract variables of interest and analyse the extracted variables by one spatial statistic and two spatial visualisations. We showcase the tool on the area and a selection of languages spoken in former Yugoslavia. By defining the perimeter, languages and a series of linguistic variables of interest we demonstrate the data collection, processing and analysis capabilities of the tool.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1322/
PDF	https://www.aclweb.org/anthology/C16-1322
PWC	https://paperswithcode.com/paper/tweetgeo-a-tool-for-collecting-processing-and
Repo
Framework

DTSim at SemEval-2016 Task 1: Semantic Similarity Model Including Multi-Level Alignment and Vector-Based Compositional Semantics


Title	DTSim at SemEval-2016 Task 1: Semantic Similarity Model Including Multi-Level Alignment and Vector-Based Compositional Semantics
Authors	Rajendra Banjade, Nabin Maharjan, Dipesh Gautam, Vasile Rus
Abstract
Tasks	Chunking, Semantic Composition, Semantic Similarity, Semantic Textual Similarity
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1097/
PDF	https://www.aclweb.org/anthology/S16-1097
PWC	https://paperswithcode.com/paper/dtsim-at-semeval-2016-task-1-semantic
Repo
Framework

Tables as Semi-structured Knowledge for Question Answering


Title	Tables as Semi-structured Knowledge for Question Answering
Authors	Sujay Kumar Jauhar, Peter Turney, Eduard Hovy
Abstract
Tasks	Information Retrieval, Question Answering
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-1045/
PDF	https://www.aclweb.org/anthology/P16-1045
PWC	https://paperswithcode.com/paper/tables-as-semi-structured-knowledge-for
Repo
Framework

Unsupervised Pre-training With Seq2Seq Reconstruction Loss for Deep Relation Extraction Models


Title	Unsupervised Pre-training With Seq2Seq Reconstruction Loss for Deep Relation Extraction Models
Authors	Zhuang Li, Lizhen Qu, Qiongkai Xu, Mark Johnson
Abstract
Tasks	Relation Extraction
Published	2016-12-01
URL	https://www.aclweb.org/anthology/U16-1006/
PDF	https://www.aclweb.org/anthology/U16-1006
PWC	https://paperswithcode.com/paper/unsupervised-pre-training-with-seq2seq
Repo
Framework

Two Decades of Terminology: European Framework Programmes Titles


Title	Two Decades of Terminology: European Framework Programmes Titles
Authors	Gabriella Pardelli, Sara Goggi, Silvia Giannini, Stefania Biagioni
Abstract	This work analyses a corpus made of the titles of research projects belonging to the last four European Commission Framework Programmes (FP4, FP5, FP6, FP7) during a time span of nearly two decades (1994-2012). The starting point is the idea of creating a corpus of titles which would constitute a terminological niche, a sort of "cluster map" offering an overall vision of the terms used and the links between them. Moreover, by performing a terminological comparison over a period of time it is possible to trace the presence of obsolete words in outdated research areas as well as of neologisms in the most recent fields. Within this scenario, the minimal purpose is to build a corpus of titles of European projects belonging to the several Framework Programmes in order to obtain a terminological mapping of relevant words in the various research areas: particularly significant would be those terms spread across different domains or those closely tied to a specific domain. A term could actually be found in many fields, and being able to acknowledge and retrieve this cross-presence means being able to link those different domains by means of a process of terminological mapping.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1058/
PDF	https://www.aclweb.org/anthology/L16-1058
PWC	https://paperswithcode.com/paper/two-decades-of-terminology-european-framework
Repo
Framework