Paper Group NANR 23
KnowYourNyms? A Game of Semantic Relationships
Title | KnowYourNyms? A Game of Semantic Relationships |
Authors | Ross Mechanic, Dean Fulgoni, Hannah Cutler, Sneha Rajana, Zheyuan Liu, Bradley Jackson, Anne Cocos, Chris Callison-Burch, Marianna Apidianaki |
Abstract | Semantic relation knowledge is crucial for natural language understanding. We introduce "KnowYourNyms?", a web-based game for learning semantic relations. While providing users with an engaging experience, the application collects large amounts of data that can be used to improve semantic relation classifiers. The data also broadly informs us of how people perceive the relationships between words, providing useful insights for research in psychology and linguistics. |
Tasks | Text Classification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-2007/ |
PWC | https://paperswithcode.com/paper/knowyournyms-a-game-of-semantic-relationships |
Repo | |
Framework | |
Stacked Sentence-Document Classifier Approach for Improving Native Language Identification
Title | Stacked Sentence-Document Classifier Approach for Improving Native Language Identification |
Authors | Andrea Cimino, Felice Dell'Orletta
Abstract | In this paper, we describe the approach of the ItaliaNLP Lab team to native language identification and discuss the results we submitted as participants in the essay track of the NLI Shared Task 2017. We introduce for the first time a 2-stacked sentence-document architecture for native language identification that is able to exploit both local sentence information and a wide set of general-purpose features qualifying the lexical and grammatical structure of the whole document. When evaluated on the official test set, our sentence-document stacked architecture obtained the best result among all the participants of the essay track with an F1 score of 0.8818.
Tasks | Document Classification, Language Identification, Native Language Identification, Sentence Classification, Text Classification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5049/ |
PWC | https://paperswithcode.com/paper/stacked-sentence-document-classifier-approach |
Repo | |
Framework | |
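The 2-stacked architecture above combines sentence-level predictions with document-wide features. A minimal sketch of the stacking idea, with a toy keyword-based sentence classifier standing in for the paper's trained first-stage model (all names and features here are illustrative, not the authors' implementation):

```python
import numpy as np

def sentence_probs(sentences, keyword_weights):
    """Toy sentence-level classifier: score each class by weighted keyword
    hits, then softmax. Stands in for the trained first-stage classifier."""
    probs = []
    for sent in sentences:
        scores = np.array([sum(w for kw, w in weights.items() if kw in sent)
                           for weights in keyword_weights], dtype=float)
        e = np.exp(scores - scores.max())
        probs.append(e / e.sum())
    return np.array(probs)

def stacked_document_features(sent_probs, doc_feats):
    """Second stage of the stack: aggregate per-sentence class distributions
    (mean and max over sentences) and concatenate them with document-wide
    features, yielding the input to the document-level classifier."""
    agg = np.concatenate([sent_probs.mean(axis=0), sent_probs.max(axis=0)])
    return np.concatenate([agg, np.asarray(doc_feats, dtype=float)])
```

The document classifier then trains on these stacked vectors, so local sentence evidence and global document statistics are exploited jointly.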
CVBed: Structuring CVs using Word Embeddings
Title | CVBed: Structuring CVs using Word Embeddings |
Authors | Shweta Garg, Sudhanshu S Singh, Abhijit Mishra, Kuntal Dey |
Abstract | Automatic analysis of curriculum vitae (CVs) of applicants is of tremendous importance in recruitment scenarios. The semi-structuredness of CVs, however, makes CV processing a challenging task. We propose a solution towards transforming CVs to follow a unified structure, thereby paving the way for smoother CV analysis. The problem of restructuring is posed as a section relabeling problem, where each section of a given CV gets reassigned to a predefined label. Our relabeling method relies on semantic relatedness computed between section header, content and labels, based on phrase-embeddings learned from a large pool of CVs. We follow different heuristics to measure semantic relatedness. Our best heuristic achieves an F-score of 93.17% on a test dataset with gold-standard labels obtained using manual annotation.
Tasks | Word Embeddings |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2059/ |
PWC | https://paperswithcode.com/paper/cvbed-structuring-cvs-usingword-embeddings |
Repo | |
Framework | |
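The relabeling step reduces to picking the predefined label whose embedding is most related to the section's embedding. A minimal sketch of that core decision, assuming phrase embeddings are already available (the paper's heuristics additionally combine header and content relatedness):

```python
import numpy as np

def relabel_section(section_vec, label_vecs):
    """Reassign a CV section to the predefined label whose embedding has the
    highest cosine similarity to the section's phrase embedding."""
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(label_vecs, key=lambda lab: cos(section_vec, label_vecs[lab]))
```

Running this over every section of a CV yields the unified structure the paper targets.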
BuzzSaw at SemEval-2017 Task 7: Global vs. Local Context for Interpreting and Locating Homographic English Puns with Sense Embeddings
Title | BuzzSaw at SemEval-2017 Task 7: Global vs. Local Context for Interpreting and Locating Homographic English Puns with Sense Embeddings |
Authors | Dieke Oele, Kilian Evang |
Abstract | This paper describes our system participating in the SemEval-2017 Task 7, for the subtasks of homographic pun location and homographic pun interpretation. For pun interpretation, we use a knowledge-based Word Sense Disambiguation (WSD) method based on sense embeddings. Pun-based jokes can be divided into two parts, each containing information about the two distinct senses of the pun. To exploit this structure we split the context that is input to the WSD system into two local contexts and find the best sense for each of them. We use the output of pun interpretation for pun location. As we expect the two meanings of a pun to be very dissimilar, we compute sense embedding cosine distances for each sense-pair and select the word that has the highest distance. We describe experiments on different methods of splitting the context and compare our method to several baselines. We find evidence supporting our hypotheses and obtain competitive results for pun interpretation. |
Tasks | Word Sense Disambiguation |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2076/ |
PWC | https://paperswithcode.com/paper/buzzsaw-at-semeval-2017-task-7-global-vs |
Repo | |
Framework | |
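The pun-location rule described in the abstract is simple to sketch: after WSD assigns a best sense to each word in each local context, the pun is predicted to be the word whose two senses are most dissimilar. A minimal illustration, assuming sense embeddings are already computed (the `sense_vectors` mapping here is hypothetical, not the authors' data structure):

```python
import numpy as np

def locate_pun(words, sense_vectors):
    """For each candidate word, take the embeddings of its two best senses
    (one per local context) and compute their cosine distance; return the
    word whose senses are most dissimilar."""
    def cos_dist(a, b):
        return 1.0 - float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    best_word, best_dist = None, -1.0
    for w in words:
        senses = sense_vectors.get(w, [])
        if len(senses) < 2:
            continue  # monosemous words cannot be homographic puns
        d = cos_dist(senses[0], senses[1])
        if d > best_dist:
            best_word, best_dist = w, d
    return best_word
```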
Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language
Title | Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language |
Authors | |
Abstract | |
Tasks | |
Published | 2017-05-01 |
URL | https://www.aclweb.org/anthology/W17-0500/ |
PWC | https://paperswithcode.com/paper/proceedings-of-the-nodalida-2017-workshop-on-1 |
Repo | |
Framework | |
Language Generation from DB Query
Title | Language Generation from DB Query |
Authors | Kristina Kocijan, Božo Bekavac, Krešimir Šojat
Abstract | |
Tasks | Machine Translation, Text Generation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-3804/ |
PWC | https://paperswithcode.com/paper/language-generation-from-db-query |
Repo | |
Framework | |
TOTEMSS: Topic-based, Temporal Sentiment Summarisation for Twitter
Title | TOTEMSS: Topic-based, Temporal Sentiment Summarisation for Twitter |
Authors | Bo Wang, Maria Liakata, Adam Tsakalidis, Spiros Georgakopoulos Kolaitis, Symeon Papadopoulos, Lazaros Apostolidis, Arkaitz Zubiaga, Rob Procter, Yiannis Kompatsiaris |
Abstract | We present a system for time-sensitive, topic-based summarisation of the sentiment around target entities and topics in collections of tweets. We describe the main elements of the system and illustrate its functionality with two examples of sentiment analysis of topics related to the 2017 UK general election.
Tasks | Opinion Mining, Sentiment Analysis |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-3006/ |
PWC | https://paperswithcode.com/paper/totemss-topic-based-temporal-sentiment |
Repo | |
Framework | |
Detecting Good Arguments in a Non-Topic-Specific Way: An Oxymoron?
Title | Detecting Good Arguments in a Non-Topic-Specific Way: An Oxymoron? |
Authors | Beata Beigman Klebanov, Binod Gyawali, Yi Song |
Abstract | Automatic identification of good arguments on a controversial topic has applications in civics and education, to name a few. While in the civics context it might be acceptable to create separate models for each topic, in the context of scoring of students' writing there is a preference for a single model that applies to all responses. Given that good arguments for one topic are likely to be irrelevant for another, is a single model for detecting good arguments a contradiction in terms? We investigate the extent to which it is possible to close the performance gap between topic-specific and across-topics models for identification of good arguments.
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2038/ |
PWC | https://paperswithcode.com/paper/detecting-good-arguments-in-a-non-topic |
Repo | |
Framework | |
Using Stanford Part-of-Speech Tagger for the Morphologically-rich Filipino Language
Title | Using Stanford Part-of-Speech Tagger for the Morphologically-rich Filipino Language |
Authors | Matthew Phillip Go, Nicco Nocon |
Abstract | |
Tasks | Word Sense Disambiguation |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/Y17-1014/ |
PWC | https://paperswithcode.com/paper/using-stanford-part-of-speech-tagger-for-the |
Repo | |
Framework | |
Forecasting Consumer Spending from Purchase Intentions Expressed on Social Media
Title | Forecasting Consumer Spending from Purchase Intentions Expressed on Social Media |
Authors | Viktor Pekar, Jane Binner |
Abstract | Consumer spending is an important macroeconomic indicator that is used by policy-makers to judge the health of an economy. In this paper we present a novel method for predicting future consumer spending from social media data. In contrast to previous work that largely relied on sentiment analysis, the proposed method models consumer spending from purchase intentions found on social media. Our experiments with time series analysis models and machine-learning regression models reveal the utility of this data for making short-term forecasts of consumer spending: for three- and seven-day horizons, prediction variables derived from social media help to improve forecast accuracy by 11% to 18% for all three models, in comparison to models that used only autoregressive predictors.
Tasks | Sentiment Analysis, Time Series, Time Series Analysis |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5212/ |
PWC | https://paperswithcode.com/paper/forecasting-consumer-spending-from-purchase |
Repo | |
Framework | |
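The comparison the abstract describes, autoregressive predictors alone versus augmenting them with social-media signals, can be sketched as an ordinary least-squares regression on lagged spending plus a purchase-intention volume series. The variable names and the single exogenous series here are illustrative stand-ins for the paper's predictors:

```python
import numpy as np

def fit_spending_model(y, intents, lags=2):
    """Least-squares fit of spending on its own lags plus the previous
    period's purchase-intention volume; returns the coefficient vector
    (lag coefficients, intention coefficient, intercept)."""
    rows, targets = [], []
    for t in range(lags, len(y)):
        rows.append(np.concatenate([y[t - lags:t], [intents[t - 1], 1.0]]))
        targets.append(y[t])
    coef, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return coef

def forecast_next(y, intents, coef, lags=2):
    """One-step-ahead forecast from the most recent lags and intention count."""
    x = np.concatenate([y[-lags:], [intents[-1], 1.0]])
    return float(x @ coef)
```

Dropping the intention column from the design matrix gives the purely autoregressive baseline the paper compares against.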
Event extraction from Twitter using Non-Parametric Bayesian Mixture Model with Word Embeddings
Title | Event extraction from Twitter using Non-Parametric Bayesian Mixture Model with Word Embeddings |
Authors | Deyu Zhou, Xuan Zhang, Yulan He |
Abstract | To extract structured representations of newsworthy events from Twitter, unsupervised models typically assume that tweets involving the same named entities and expressed using similar words are likely to belong to the same event. Hence, they group tweets into clusters based on the co-occurrence patterns of named entities and topical keywords. However, there are two main limitations. First, they require the number of events to be known beforehand, which is not realistic in practical applications. Second, they don't recognise that the same named entity might be referred to by multiple mentions and tweets using different mentions would be wrongly assigned to different events. To overcome these limitations, we propose a non-parametric Bayesian mixture model with word embeddings for event extraction, in which the number of events can be inferred automatically and the issue of lexical variations for the same named entity can be dealt with properly. Our model has been evaluated on three datasets with sizes ranging between 2,499 and over 60 million tweets. Experimental results show that our model outperforms the baseline approach on all datasets by 5-8% in F-measure.
Tasks | Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1076/ |
PWC | https://paperswithcode.com/paper/event-extraction-from-twitter-using-non |
Repo | |
Framework | |
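The key property, inferring the number of events from the data rather than fixing it in advance, can be illustrated with DP-means, a hard-assignment simplification of non-parametric Bayesian mixtures (this is not the paper's model, which is a full Bayesian mixture over word embeddings):

```python
import numpy as np

def dp_means(X, lam, iters=10):
    """DP-means clustering: a point whose squared distance to every existing
    centre exceeds lam opens a new cluster, so the number of clusters
    (events) grows as the data demand."""
    centers = [X[0].copy()]
    for _ in range(iters):
        assign = []
        for x in X:
            d = [float(np.sum((x - c) ** 2)) for c in centers]
            j = int(np.argmin(d))
            if d[j] > lam:            # too far from all events: new event
                centers.append(x.copy())
                j = len(centers) - 1
            assign.append(j)
        assign = np.array(assign)
        centers = [X[assign == k].mean(axis=0)
                   for k in range(len(centers)) if np.any(assign == k)]
    # final assignment against the updated centres
    assign = [int(np.argmin([np.sum((x - c) ** 2) for c in centers]))
              for x in X]
    return centers, assign
```

Applied to tweet embeddings, points would be vector representations of tweets and each discovered cluster a candidate event.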
Indexicals and Compositionality: Inside-Out or Outside-In?
Title | Indexicals and Compositionality: Inside-Out or Outside-In? |
Authors | Johan Bos |
Abstract | |
Tasks | Semantic Parsing |
Published | 2017-01-01 |
URL | https://www.aclweb.org/anthology/W17-6905/ |
PWC | https://paperswithcode.com/paper/indexicals-and-compositionality-inside-out-or |
Repo | |
Framework | |
Polish evaluation dataset for compositional distributional semantics models
Title | Polish evaluation dataset for compositional distributional semantics models |
Authors | Alina Wróblewska, Katarzyna Krasnowska-Kieraś
Abstract | The paper presents a procedure for building an evaluation dataset for the validation of compositional distributional semantics models estimated for languages other than English. The procedure generally builds on steps designed to assemble the SICK corpus, which contains pairs of English sentences annotated for semantic relatedness and entailment, because we aim at building a comparable dataset. However, the implementation of particular building steps significantly differs from the original SICK design assumptions, which is caused by both the lack of necessary external resources for an investigated language and the need for language-specific transformation rules. The designed procedure is verified on Polish, a fusional language with a relatively free word order, and contributes to building a Polish evaluation dataset. The resource consists of 10K sentence pairs which are human-annotated for semantic relatedness and entailment. The dataset may be used for the evaluation of compositional distributional semantics models of Polish.
Tasks | Semantic Composition |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1073/ |
PWC | https://paperswithcode.com/paper/polish-evaluation-dataset-for-compositional |
Repo | |
Framework | |
Partial Hard Thresholding: Towards A Principled Analysis of Support Recovery
Title | Partial Hard Thresholding: Towards A Principled Analysis of Support Recovery |
Authors | Jie Shen, Ping Li |
Abstract | In machine learning and compressed sensing, it is of central importance to understand when a tractable algorithm recovers the support of a sparse signal from its compressed measurements. In this paper, we present a principled analysis of the support recovery performance for a family of hard thresholding algorithms. To this end, we appeal to the partial hard thresholding (PHT) operator proposed recently by Jain et al. [IEEE Trans. Information Theory, 2017]. We show that under proper conditions, PHT recovers an arbitrary s-sparse signal within O(sκ log κ) iterations, where κ is an appropriate condition number. Specifying the PHT operator, we obtain the best known result for hard thresholding pursuit and orthogonal matching pursuit with replacement. Experiments on simulated data complement our theoretical findings and also illustrate the effectiveness of PHT compared to other popular recovery methods.
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6905-partial-hard-thresholding-towards-a-principled-analysis-of-support-recovery |
PDF | http://papers.nips.cc/paper/6905-partial-hard-thresholding-towards-a-principled-analysis-of-support-recovery.pdf |
PWC | https://paperswithcode.com/paper/partial-hard-thresholding-towards-a |
Repo | |
Framework | |
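The hard thresholding family the paper analyses shares one core operation: a gradient step on the least-squares objective followed by a projection onto s-sparse vectors. A minimal sketch of iterative hard thresholding, the simplest member of the family (the PHT operator generalises this projection by freezing part of the support; that generalisation is not reproduced here):

```python
import numpy as np

def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x, zero out the rest."""
    out = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-s:]
    out[keep] = x[keep]
    return out

def iht(A, y, s, step, iters=50):
    """Iterative hard thresholding for y = Ax with x assumed s-sparse:
    gradient step on ||y - Ax||^2, then project onto s-sparse vectors."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = hard_threshold(x + step * A.T @ (y - A @ x), s)
    return x
```

Support recovery, the paper's subject, asks when the nonzero positions of the iterates stabilise on the true support.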
Adaptive stimulus selection for optimizing neural population responses
Title | Adaptive stimulus selection for optimizing neural population responses |
Authors | Benjamin Cowley, Ryan Williamson, Katerina Clemens, Matthew Smith, Byron M. Yu |
Abstract | Adaptive stimulus selection methods in neuroscience have primarily focused on maximizing the firing rate of a single recorded neuron. When recording from a population of neurons, it is usually not possible to find a single stimulus that maximizes the firing rates of all neurons. This motivates optimizing an objective function that takes into account the responses of all recorded neurons together. We propose “Adept,” an adaptive stimulus selection method that can optimize population objective functions. In simulations, we first confirmed that population objective functions elicited more diverse stimulus responses than single-neuron objective functions. Then, we tested Adept in a closed-loop electrophysiological experiment in which population activity was recorded from macaque V4, a cortical area known for mid-level visual processing. To predict neural responses, we used the outputs of a deep convolutional neural network model as feature embeddings. Images chosen by Adept elicited mean neural responses that were 20% larger than those for randomly-chosen natural images, and also evoked a larger diversity of neural responses. Such adaptive stimulus selection methods can facilitate experiments that involve neurons far from the sensory periphery, for which it is often unclear which stimuli to present. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6738-adaptive-stimulus-selection-for-optimizing-neural-population-responses |
PDF | http://papers.nips.cc/paper/6738-adaptive-stimulus-selection-for-optimizing-neural-population-responses.pdf |
PWC | https://paperswithcode.com/paper/adaptive-stimulus-selection-for-optimizing |
Repo | |
Framework | |
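The closed-loop selection step can be sketched as a greedy argmax over candidate stimuli, scoring each by its predicted population response. The scoring rule below (response magnitude plus distance to responses already evoked) is an illustrative stand-in for Adept's objective, and `predict` stands in for the CNN-feature-based response predictor:

```python
import numpy as np

def select_next_stimulus(candidate_feats, shown_responses, predict):
    """One greedy adaptive-selection step: score each candidate stimulus by
    its predicted population-response magnitude plus its novelty relative to
    responses already evoked, and return the index of the best candidate."""
    best_i, best_score = -1, -np.inf
    for i, f in enumerate(candidate_feats):
        r = predict(f)                          # predicted response vector
        magnitude = float(np.linalg.norm(r))    # favour large responses
        novelty = (min(float(np.linalg.norm(r - q)) for q in shown_responses)
                   if shown_responses else 0.0)  # and diverse ones
        score = magnitude + novelty
        if score > best_score:
            best_i, best_score = i, score
    return best_i
```

In the experiment, the chosen stimulus would be presented, the recorded population response appended to `shown_responses`, and the loop repeated.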