Paper Group NANR 93
Enriching Complex Networks with Word Embeddings for Detecting Mild Cognitive Impairment from Speech Transcripts. Temporal Orientation of Tweets for Predicting Income of Users. Adversarial Adaptation of Synthetic or Stale Data. An Approach to Extract Product Features from Chinese Consumer Reviews and Establish Product Feature Structure Tree. 語音文件檢索使 …
Enriching Complex Networks with Word Embeddings for Detecting Mild Cognitive Impairment from Speech Transcripts
Title | Enriching Complex Networks with Word Embeddings for Detecting Mild Cognitive Impairment from Speech Transcripts |
Authors | Leandro Santos, Edilson Anselmo Corrêa Júnior, Osvaldo Oliveira Jr, Diego Amancio, Letícia Mansur, Sandra Aluísio |
Abstract | Mild Cognitive Impairment (MCI) is a mental disorder difficult to diagnose. Linguistic features, mainly from parsers, have been used to detect MCI, but this is not suitable for large-scale assessments. MCI disfluencies produce non-grammatical speech that requires manual or high precision automatic correction of transcripts. In this paper, we modeled transcripts into complex networks and enriched them with word embedding (CNE) to better represent short texts produced in neuropsychological assessments. The network measurements were applied with well-known classifiers to automatically identify MCI in transcripts, in a binary classification task. A comparison was made with the performance of traditional approaches using Bag of Words (BoW) and linguistic features for three datasets: DementiaBank in English, and Cinderella and Arizona-Battery in Portuguese. Overall, CNE provided higher accuracy than using only complex networks, while Support Vector Machine was superior to other classifiers. CNE provided the highest accuracies for DementiaBank and Cinderella, but BoW was more efficient for the Arizona-Battery dataset probably owing to its short narratives. The approach using linguistic features yielded higher accuracy if the transcriptions of the Cinderella dataset were manually revised. Taken together, the results indicate that complex networks enriched with embedding is promising for detecting MCI in large-scale assessments. |
Tasks | Word Embeddings |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1118/ |
PWC | https://paperswithcode.com/paper/enriching-complex-networks-with-word-1 |
Repo | |
Framework | |
Temporal Orientation of Tweets for Predicting Income of Users
Title | Temporal Orientation of Tweets for Predicting Income of Users |
Authors | Mohammed Hasanuzzaman, Sabyasachi Kamila, Mandeep Kaur, Sriparna Saha, Asif Ekbal |
Abstract | Automatically estimating a user's socio-economic profile from their language use in social media can significantly help social science research and various downstream applications ranging from business to politics. The current paper presents the first study where user cognitive structure is used to build a predictive model of income. In particular, we first develop a classifier using a weakly supervised learning framework to automatically time-tag tweets as past, present, or future. We quantify a user's overall temporal orientation based on their distribution of tweets, and use it to build a predictive model of income. Our analysis uncovers a correlation between future temporal orientation and income. Finally, we measure the predictive power of future temporal orientation on income by performing regression. |
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2104/ |
PWC | https://paperswithcode.com/paper/temporal-orientation-of-tweets-for-predicting |
Repo | |
Framework | |
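
The abstract above describes quantifying a user's temporal orientation from the distribution of their time-tagged tweets and relating it to income. A minimal sketch of that idea follows, assuming the tweets have already been tagged; the user records, income figures, and function names are hypothetical and are not taken from the paper, and plain Pearson correlation stands in for the paper's regression analysis.

```python
from collections import Counter

TENSES = ("past", "present", "future")


def temporal_orientation(tags):
    """Return the share of a user's tweets tagged past, present, and future."""
    counts = Counter(tags)
    total = sum(counts[t] for t in TENSES)
    if total == 0:
        return {t: 0.0 for t in TENSES}
    return {t: counts[t] / total for t in TENSES}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical users: (time tags of their tweets, yearly income).
users = {
    "u1": (["past", "past", "present", "future"], 20000),
    "u2": (["future", "future", "present", "past"], 35000),
    "u3": (["future", "future", "future", "present"], 50000),
}

future_share = [temporal_orientation(tags)["future"] for tags, _ in users.values()]
incomes = [income for _, income in users.values()]
r = pearson(future_share, incomes)
print(future_share, r)
```

In this toy data the future share rises in step with income, so the correlation is strongly positive; real data would of course be far noisier.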
Adversarial Adaptation of Synthetic or Stale Data
Title | Adversarial Adaptation of Synthetic or Stale Data |
Authors | Young-Bum Kim, Karl Stratos, Dongchan Kim |
Abstract | Two types of data shift common in practice are 1. transferring from synthetic data to live user data (a deployment shift), and 2. transferring from stale data to current data (a temporal shift). Both cause a distribution mismatch between training and evaluation, leading to a model that overfits the flawed training data and performs poorly on the test data. We propose a solution to this mismatch problem by framing it as domain adaptation, treating the flawed training dataset as a source domain and the evaluation dataset as a target domain. To this end, we use and build on several recent advances in neural domain adaptation such as adversarial training (Ganin et al., 2016) and domain separation network (Bousmalis et al., 2016), proposing a new effective adversarial training scheme. In both supervised and unsupervised adaptation scenarios, our approach yields clear improvement over strong baselines. |
Tasks | Domain Adaptation, Spoken Language Understanding |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1119/ |
PWC | https://paperswithcode.com/paper/adversarial-adaptation-of-synthetic-or-stale |
Repo | |
Framework | |
An Approach to Extract Product Features from Chinese Consumer Reviews and Establish Product Feature Structure Tree
Title | An Approach to Extract Product Features from Chinese Consumer Reviews and Establish Product Feature Structure Tree |
Authors | Xinsheng Xu, Jing Lin, Ying Xiao, Jianzhe Yu |
Abstract | |
Tasks | |
Published | 2017-06-01 |
URL | https://www.aclweb.org/anthology/O17-2003/ |
PWC | https://paperswithcode.com/paper/an-approach-to-extract-product-features-from |
Repo | |
Framework | |
語音文件檢索使用類神經網路技術 (On the Use of Neural Network Modeling Techniques for Spoken Document Retrieval) [In Chinese]
Title | 語音文件檢索使用類神經網路技術 (On the Use of Neural Network Modeling Techniques for Spoken Document Retrieval) [In Chinese] |
Authors | Tien-Hong Lo, Ying-Wen Chen, Kuan-Yu Chen, Hsin-Min Wang, Berlin Chen |
Abstract | |
Tasks | |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/O17-3002/ |
PWC | https://paperswithcode.com/paper/eae3aac-a12c-eccc2e-e-on-the-use-of-neural |
Repo | |
Framework | |
Joint Unsupervised Learning of Semantic Representation of Words and Roles in Dependency Trees
Title | Joint Unsupervised Learning of Semantic Representation of Words and Roles in Dependency Trees |
Authors | Michal Konkol |
Abstract | In this paper, we introduce WoRel, a model that jointly learns word embeddings and a semantic representation of word relations. The model learns from plain text sentences and their dependency parse trees. The word embeddings produced by WoRel outperform Skip-Gram and GloVe in word similarity and syntactical word analogy tasks and have comparable results on word relatedness and semantic word analogy tasks. We show that the semantic representation of relations enables us to express the meaning of phrases and is a promising research direction for semantics at the sentence level. |
Tasks | Named Entity Recognition, Question Answering, Sentiment Analysis, Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1052/ |
https://doi.org/10.26615/978-954-452-049-6_052 | |
PWC | https://paperswithcode.com/paper/joint-unsupervised-learning-of-semantic |
Repo | |
Framework | |
A computationally-assisted procedure for discovering poetic organization within oral tradition
Title | A computationally-assisted procedure for discovering poetic organization within oral tradition |
Authors | David Meyer |
Abstract | |
Tasks | |
Published | 2017-03-01 |
URL | https://www.aclweb.org/anthology/W17-0113/ |
PWC | https://paperswithcode.com/paper/a-computationally-assisted-procedure-for |
Repo | |
Framework | |
On the Distribution of Lexical Features at Multiple Levels of Analysis
Title | On the Distribution of Lexical Features at Multiple Levels of Analysis |
Authors | Fatemeh Almodaresi, Lyle Ungar, Vivek Kulkarni, Mohsen Zakeri, Salvatore Giorgi, H. Andrew Schwartz |
Abstract | Natural language processing has increasingly moved from modeling documents and words toward studying the people behind the language. This move to working with data at the user or community level has presented the field with different characteristics of linguistic data. In this paper, we empirically characterize various lexical distributions at different levels of analysis, showing that, while most features are decidedly sparse and non-normal at the message-level (as with traditional NLP), they follow the central limit theorem to become much more Log-normal or even Normal at the user- and county-levels. Finally, we demonstrate that modeling lexical features for the correct level of analysis leads to marked improvements in common social scientific prediction tasks. |
Tasks | Document Classification, Sentiment Analysis |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2013/ |
PWC | https://paperswithcode.com/paper/on-the-distribution-of-lexical-features-at |
Repo | |
Framework | |
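
The abstract above notes that sparse, non-normal message-level features become closer to normal once aggregated at the user level. The following sketch illustrates only that general tendency on simulated data, using skewness as a rough normality indicator; the 5% word rate, the 200 messages per user, and all names are assumptions made for illustration, not values from the paper.

```python
import random
import statistics


def skewness(values):
    """Population skewness: 0 for a symmetric distribution such as the Normal."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the run is reproducible


def message_feature():
    """Hypothetical binary feature: 1.0 if a given word occurs in a message."""
    return 1.0 if rng.random() < 0.05 else 0.0


# Message level: one sparse 0/1 value per message.
messages = [message_feature() for _ in range(20000)]

# User level: each simulated user is the mean over 200 of their messages.
users = [
    statistics.fmean([message_feature() for _ in range(200)])
    for _ in range(1000)
]

print("message-level skewness:", skewness(messages))
print("user-level skewness:", skewness(users))
```

The message-level values are heavily right-skewed, while the user-level averages are far more symmetric, which is the behavior the central limit theorem predicts for averages of many independent draws.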
On the Challenges of Translating NLP Research into Commercial Products
Title | On the Challenges of Translating NLP Research into Commercial Products |
Authors | Daniel Dahlmeier |
Abstract | This paper highlights challenges in industrial research related to translating research in natural language processing into commercial products. While the interest in natural language processing from industry is significant, the transfer of research to commercial products is non-trivial and its challenges are often unknown to or underestimated by many researchers. I discuss current obstacles and provide suggestions for increasing the chances for translating research to commercial success based on my experience in industrial research. |
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2015/ |
PWC | https://paperswithcode.com/paper/on-the-challenges-of-translating-nlp-research |
Repo | |
Framework | |
A Web-Based Interactive Tool for Creating, Inspecting, Editing, and Publishing Etymological Datasets
Title | A Web-Based Interactive Tool for Creating, Inspecting, Editing, and Publishing Etymological Datasets |
Authors | Johann-Mattis List |
Abstract | The paper presents the Etymological DICtionary ediTOR (EDICTOR), a free, interactive, web-based tool designed to aid historical linguists in creating, editing, analysing, and publishing etymological datasets. The EDICTOR offers interactive solutions for important tasks in historical linguistics, including facilitated input and segmentation of phonetic transcriptions, quantitative and qualitative analyses of phonetic and morphological data, enhanced interfaces for cognate class assignment and multiple word alignment, and automated evaluation of regular sound correspondences. As a web-based tool written in JavaScript, the EDICTOR can be used in standard web browsers across all major platforms. |
Tasks | Word Alignment |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-3003/ |
PWC | https://paperswithcode.com/paper/a-web-based-interactive-tool-for-creating |
Repo | |
Framework | |
There’s no ‘Count or Predict’ but task-based selection for distributional models
Title | There’s no ‘Count or Predict’ but task-based selection for distributional models |
Authors | Martin Riedl, Chris Biemann |
Abstract | |
Tasks | |
Published | 2017-01-01 |
URL | https://www.aclweb.org/anthology/W17-6933/ |
PWC | https://paperswithcode.com/paper/theres-no-count-or-predict-but-task-based |
Repo | |
Framework | |
Curriculum Design for Code-switching: Experiments with Language Identification and Language Modeling with Deep Neural Networks
Title | Curriculum Design for Code-switching: Experiments with Language Identification and Language Modeling with Deep Neural Networks |
Authors | Monojit Choudhury, Kalika Bali, Sunayana Sitaram, Ashutosh Baheti |
Abstract | |
Tasks | Language Identification, Language Modelling |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/W17-7509/ |
PWC | https://paperswithcode.com/paper/curriculum-design-for-code-switching |
Repo | |
Framework | |
Multi-modal Summarization for Asynchronous Collection of Text, Image, Audio and Video
Title | Multi-modal Summarization for Asynchronous Collection of Text, Image, Audio and Video |
Authors | Haoran Li, Junnan Zhu, Cong Ma, Jiajun Zhang, Chengqing Zong |
Abstract | The rapid increase of the multimedia data over the Internet necessitates multi-modal summarization from collections of text, image, audio and video. In this work, we propose an extractive Multi-modal Summarization (MMS) method which can automatically generate a textual summary given a set of documents, images, audios and videos related to a specific topic. The key idea is to bridge the semantic gaps between multi-modal contents. For audio information, we design an approach to selectively use its transcription. For vision information, we learn joint representations of texts and images using a neural network. Finally, all the multi-modal aspects are considered to generate the textual summary by maximizing the salience, non-redundancy, readability and coverage through budgeted optimization of submodular functions. We further introduce an MMS corpus in English and Chinese. The experimental results on this dataset demonstrate that our method outperforms other competitive baseline methods. |
Tasks | Document Summarization, Speech Recognition, Video Summarization |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1114/ |
PWC | https://paperswithcode.com/paper/multi-modal-summarization-for-asynchronous |
Repo | |
Framework | |
Idiom-Aware Compositional Distributed Semantics
Title | Idiom-Aware Compositional Distributed Semantics |
Authors | Pengfei Liu, Kaiyu Qian, Xipeng Qiu, Xuanjing Huang |
Abstract | Idioms are peculiar linguistic constructions that impose great challenges for representing the semantics of language, especially in current prevailing end-to-end neural models, which assume that the semantics of a phrase or sentence can be literally composed from its constitutive words. In this paper, we propose an idiom-aware distributed semantic model to build representation of sentences on the basis of understanding their contained idioms. Our models are grounded in the literal-first psycholinguistic hypothesis, which can adaptively learn semantic compositionality of a phrase literally or idiomatically. To better evaluate our models, we also construct an idiom-enriched sentiment classification dataset with considerable scale and abundant peculiarities of idioms. The qualitative and quantitative experimental analyses demonstrate the efficacy of our models. |
Tasks | Machine Translation, Sentiment Analysis, Text Classification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1124/ |
PWC | https://paperswithcode.com/paper/idiom-aware-compositional-distributed |
Repo | |
Framework | |
QLUT at SemEval-2017 Task 1: Semantic Textual Similarity Based on Word Embeddings
Title | QLUT at SemEval-2017 Task 1: Semantic Textual Similarity Based on Word Embeddings |
Authors | Fanqing Meng, Wenpeng Lu, Yuteng Zhang, Jinyong Cheng, Yuehan Du, Shuwang Han |
Abstract | This paper reports the details of our submissions to Task 1 of SemEval 2017. This task aims at assessing the semantic textual similarity of two sentences or texts. We submit three unsupervised systems based on word embeddings. The runs differ in the preprocessing applied to the evaluation data. The best Pearson correlation achieved by these systems is 0.6887. Unsurprisingly, the results of our runs demonstrate that data preprocessing, such as tokenization, lemmatization, extraction of content words, and removal of stop words, is helpful and plays a significant role in improving the performance of models. |
Tasks | Lemmatization, Semantic Textual Similarity, Tokenization, Word Embeddings |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2020/ |
PWC | https://paperswithcode.com/paper/qlut-at-semeval-2017-task-1-semantic-textual |
Repo | |
Framework | |
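
The abstract above describes unsupervised semantic textual similarity built on word embeddings plus preprocessing. It does not say how word vectors are combined, so the sketch below assumes one common choice: average the vectors of the content words and compare sentences by cosine similarity. The tiny embedding table, the stop-word list, and the function names are invented for illustration, and lemmatization is omitted.

```python
import math

# Toy 3-dimensional vectors standing in for pretrained embeddings (made up).
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "sleeps": [0.1, 0.9, 0.2],
    "naps": [0.2, 0.8, 0.3],
    "stocks": [0.0, 0.1, 0.9],
    "fell": [0.1, 0.2, 0.8],
}
STOP_WORDS = {"the", "a", "an"}


def preprocess(sentence):
    """Lowercase, split on whitespace, drop stop words and unknown words."""
    tokens = sentence.lower().split()
    return [t for t in tokens if t not in STOP_WORDS and t in EMBEDDINGS]


def sentence_vector(tokens):
    """Average the embeddings of the tokens; zero vector if none remain."""
    dims = len(next(iter(EMBEDDINGS.values())))
    if not tokens:
        return [0.0] * dims
    return [sum(EMBEDDINGS[t][i] for t in tokens) / len(tokens) for i in range(dims)]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity(s1, s2):
    """Similarity of two sentences via averaged embeddings."""
    return cosine(sentence_vector(preprocess(s1)), sentence_vector(preprocess(s2)))


print(similarity("The cat sleeps", "A kitten naps"))
print(similarity("The cat sleeps", "Stocks fell"))
```

With these toy vectors the paraphrase pair scores much higher than the unrelated pair. In an evaluation like the one the abstract reports, such scores would be compared against human similarity judgments using Pearson correlation.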