Paper Group NANR 61
KGEval: Accuracy Estimation of Automatically Constructed Knowledge Graphs. NSEmo at EmoInt-2017: An Ensemble to Predict Emotion Intensity in Tweets. Incorporating Side Information by Adaptive Convolution. Accurate Supervised and Semi-Supervised Machine Reading for Long Documents. Hierarchical Clustering Beyond the Worst-Case. Leveraging Diverse Lex …
KGEval: Accuracy Estimation of Automatically Constructed Knowledge Graphs
Title | KGEval: Accuracy Estimation of Automatically Constructed Knowledge Graphs |
Authors | Prakhar Ojha, Partha Talukdar |
Abstract | Automatic construction of large knowledge graphs (KG) by mining web-scale text datasets has received considerable attention recently. Estimating accuracy of such automatically constructed KGs is a challenging problem due to their size and diversity. This important problem has largely been ignored in prior research; we fill this gap and propose KGEval. KGEval uses coupling constraints to bind facts and crowdsources those few that can infer large parts of the graph. We demonstrate that the objective optimized by KGEval is submodular and NP-hard, allowing guarantees for our approximation algorithm. Through experiments on real-world datasets, we demonstrate that KGEval best estimates KG accuracy compared to other baselines, while requiring significantly fewer human evaluations. |
Tasks | Knowledge Graphs |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1183/ |
https://www.aclweb.org/anthology/D17-1183 | |
PWC | https://paperswithcode.com/paper/kgeval-accuracy-estimation-of-automatically |
Repo | |
Framework | |
NSEmo at EmoInt-2017: An Ensemble to Predict Emotion Intensity in Tweets
Title | NSEmo at EmoInt-2017: An Ensemble to Predict Emotion Intensity in Tweets |
Authors | Sreekanth Madisetty, Maunendra Sankar Desarkar |
Abstract | In this paper, we describe a method to predict emotion intensity in tweets. Our approach is an ensemble of three regression methods. The first method uses content-based features (hashtags, emoticons, elongated words, etc.). The second method considers word n-grams and character n-grams for training. The final method uses lexicons, word embeddings, word n-grams, and character n-grams for training the model. An ensemble of these three methods gives better performance than the individual methods. We applied our method to the WASSA emotion dataset. The results are as follows: average Pearson correlation is 0.706, average Spearman correlation is 0.696, average Pearson correlation for gold scores in the range 0.5 to 1 is 0.539, and average Spearman correlation for gold scores in the range 0.5 to 1 is 0.514. |
Tasks | Emotion Recognition, Sentiment Analysis, Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5230/ |
https://www.aclweb.org/anthology/W17-5230 | |
PWC | https://paperswithcode.com/paper/nsemo-at-emoint-2017-an-ensemble-to-predict |
Repo | |
Framework | |
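The NSEmo abstract reports Pearson and Spearman correlations, both over all tweets and over the subset whose gold score lies between 0.5 and 1. The sketch below shows one way such numbers can be computed in plain Python. It is an illustration only, not the authors' evaluation script: the function names (`pearson`, `average_ranks`, `spearman`, `upper_range`) are hypothetical, and ties are assumed to receive average ranks.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


def upper_range(gold, pred, low=0.5):
    """Keep only the (gold, prediction) pairs whose gold score is >= low."""
    pairs = [(g, p) for g, p in zip(gold, pred) if g >= low]
    return [g for g, _ in pairs], [p for _, p in pairs]
```

To score only the upper range, filter first and then correlate, for example `pearson(*upper_range(gold, pred))`.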
Incorporating Side Information by Adaptive Convolution
Title | Incorporating Side Information by Adaptive Convolution |
Authors | Di Kang, Debarun Dhar, Antoni Chan |
Abstract | Computer vision tasks often have side information available that is helpful to solve the task. For example, for crowd counting, the camera perspective (e.g., camera angle and height) gives a clue about the appearance and scale of people in the scene. While side information has been shown to be useful for counting systems using traditional hand-crafted features, it has not been fully utilized in counting systems based on deep learning. In order to incorporate the available side information, we propose an adaptive convolutional neural network (ACNN), where the convolution filter weights adapt to the current scene context via the side information. In particular, we model the filter weights as a low-dimensional manifold within the high-dimensional space of filter weights. The filter weights are generated using a learned “filter manifold” sub-network, whose input is the side information. With the help of side information and adaptive weights, the ACNN can disentangle the variations related to the side information, and extract discriminative features related to the current context (e.g. camera perspective, noise level, blur kernel parameters). We demonstrate the effectiveness of ACNN incorporating side information on 3 tasks: crowd counting, corrupted digit recognition, and image deblurring. Our experiments show that ACNN improves the performance compared to a plain CNN with a similar number of parameters. Since existing crowd counting datasets do not contain ground-truth side information, we collect a new dataset with the ground-truth camera angle and height as the side information. |
Tasks | Crowd Counting, Deblurring |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6976-incorporating-side-information-by-adaptive-convolution |
http://papers.nips.cc/paper/6976-incorporating-side-information-by-adaptive-convolution.pdf | |
PWC | https://paperswithcode.com/paper/incorporating-side-information-by-adaptive |
Repo | |
Framework | |
Accurate Supervised and Semi-Supervised Machine Reading for Long Documents
Title | Accurate Supervised and Semi-Supervised Machine Reading for Long Documents |
Authors | Daniel Hewlett, Llion Jones, Alexandre Lacoste, Izzeddin Gur |
Abstract | We introduce a hierarchical architecture for machine reading capable of extracting precise information from long documents. The model divides the document into small, overlapping windows and encodes all windows in parallel with an RNN. It then attends over these window encodings, reducing them to a single encoding, which is decoded into an answer using a sequence decoder. This hierarchical approach allows the model to scale to longer documents without increasing the number of sequential steps. In a supervised setting, our model achieves state-of-the-art accuracy of 76.8 on the WikiReading dataset. We also evaluate the model in a semi-supervised setting by downsampling the WikiReading training set to create increasingly smaller amounts of supervision, while leaving the full unlabeled document corpus to train a sequence autoencoder on document windows. We evaluate models that can reuse autoencoder states and outputs without fine-tuning their weights, allowing for more efficient training and inference. |
Tasks | Question Answering, Reading Comprehension |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1214/ |
https://www.aclweb.org/anthology/D17-1214 | |
PWC | https://paperswithcode.com/paper/accurate-supervised-and-semi-supervised |
Repo | |
Framework | |
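The machine-reading abstract describes dividing a document into small, overlapping windows before encoding them in parallel. A minimal sketch of that windowing step follows. The function name and the `size` and `stride` parameters are assumptions for illustration; the abstract does not state the window length or overlap the authors use, and the RNN encoder and attention steps are not shown.

```python
def overlapping_windows(tokens, size, stride):
    """Split a token list into windows of length `size` starting every `stride` tokens.

    Consecutive windows overlap by `size - stride` tokens; the final window
    may be shorter than `size` when the document length does not line up.
    """
    if size <= 0 or stride <= 0 or stride > size:
        raise ValueError("require 0 < stride <= size")
    if not tokens:
        return []
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + size])
        # Stop once the current window reaches the end of the document.
        if start + size >= len(tokens):
            break
        start += stride
    return windows
```

Because every window is produced independently of the others, the windows can be batched and encoded in parallel, which is the property the abstract relies on to avoid extra sequential steps for longer documents.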
Hierarchical Clustering Beyond the Worst-Case
Title | Hierarchical Clustering Beyond the Worst-Case |
Authors | Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann-Trenn |
Abstract | Hierarchical clustering, that is, computing a recursive partitioning of a dataset to obtain clusters at increasingly finer granularity, is a fundamental problem in data analysis. Although hierarchical clustering has mostly been studied through procedures such as linkage algorithms, or top-down heuristics, rather than as optimization problems, recently Dasgupta [1] proposed an objective function for hierarchical clustering and initiated a line of work developing algorithms that explicitly optimize an objective (see also [2, 3, 4]). In this paper, we consider a fairly general random graph model for hierarchical clustering, called the hierarchical stochastic blockmodel (HSBM), and show that in certain regimes the SVD approach of McSherry [5] combined with specific linkage methods results in a clustering that gives an O(1)-approximation to Dasgupta’s cost function. We also show that an approach based on SDP relaxations for balanced cuts based on the work of Makarychev et al. [6], combined with the recursive sparsest cut algorithm of Dasgupta, yields an O(1)-approximation in slightly larger regimes and also in the semi-random setting, where an adversary may remove edges from the random graph generated according to an HSBM. Finally, we report empirical evaluation on synthetic and real-world data showing that our proposed SVD-based method does indeed achieve a better cost than other widely-used heuristics and also results in a better classification accuracy when the underlying problem was that of multi-class classification. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7200-hierarchical-clustering-beyond-the-worst-case |
http://papers.nips.cc/paper/7200-hierarchical-clustering-beyond-the-worst-case.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-clustering-beyond-the-worst-case |
Repo | |
Framework | |
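The hierarchical clustering abstract measures quality with Dasgupta's cost function: each weighted edge pays its weight times the number of leaves under the lowest common ancestor of its endpoints, so similar points should be separated as low in the tree as possible. The sketch below evaluates that cost for a given tree. The representation (nested tuples for internal nodes, a dict of edge weights) and the function name are assumptions chosen for illustration; this only scores a tree and is not the SVD-based or SDP-based method from the paper.

```python
def dasgupta_cost(tree, weights):
    """Dasgupta's cost of a hierarchy.

    tree:    nested tuples; any non-tuple value is a leaf label.
    weights: dict mapping (u, v) leaf pairs to a non-negative similarity.
    """
    def visit(node):
        if not isinstance(node, tuple):
            return {node}, 0.0
        child_sets = []
        cost = 0.0
        for child in node:
            leaves_below, child_cost = visit(child)
            child_sets.append(leaves_below)
            cost += child_cost
        leaves = set().union(*child_sets)
        # Map every leaf to the index of the child subtree containing it.
        owner = {leaf: idx for idx, s in enumerate(child_sets) for leaf in s}
        for (u, v), w in weights.items():
            # An edge is split here exactly when this node is its lowest common ancestor.
            if u in owner and v in owner and owner[u] != owner[v]:
                cost += w * len(leaves)
        return leaves, cost

    return visit(tree)[1]


# Two tight pairs (0,1) and (2,3) joined by one weak edge (1,2).
weights = {(0, 1): 1.0, (2, 3): 1.0, (1, 2): 0.5}
good = dasgupta_cost(((0, 1), (2, 3)), weights)  # cuts only the weak edge at the root
bad = dasgupta_cost(((0, 2), (1, 3)), weights)   # cuts every edge at the root
```

In this toy graph the tree that keeps the tight pairs together has the lower cost, which is the behaviour the objective is designed to reward.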
Leveraging Diverse Lexical Chains to Construct Essays for Chinese College Entrance Examination
Title | Leveraging Diverse Lexical Chains to Construct Essays for Chinese College Entrance Examination |
Authors | Liunian Li, Xiaojun Wan, Jin-ge Yao, Siming Yan |
Abstract | In this work we study the challenging task of automatically constructing essays for Chinese college entrance examination where the topic is specified in advance. We explore a sentence extraction framework based on diversified lexical chains to capture coherence and richness. Experimental analysis shows the effectiveness of our approach and reveals the importance of information richness in essay writing. |
Tasks | |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2060/ |
https://www.aclweb.org/anthology/I17-2060 | |
PWC | https://paperswithcode.com/paper/leveraging-diverse-lexical-chains-to |
Repo | |
Framework | |
Proceedings of the 15th Meeting on the Mathematics of Language
Title | Proceedings of the 15th Meeting on the Mathematics of Language |
Authors | |
Abstract | |
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/W17-3400/ |
https://www.aclweb.org/anthology/W17-3400 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-15th-meeting-on-the |
Repo | |
Framework | |
A Stylistic Analysis of a Philippine Essay, “The Will of the River”
Title | A Stylistic Analysis of a Philippine Essay, “The Will of the River” |
Authors | Pilar Caparas |
Abstract | |
Tasks | |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/Y17-1030/ |
https://www.aclweb.org/anthology/Y17-1030 | |
PWC | https://paperswithcode.com/paper/a-stylistic-analysis-of-a-philippine-essay |
Repo | |
Framework | |
Initializing Convolutional Filters with Semantic Features for Text Classification
Title | Initializing Convolutional Filters with Semantic Features for Text Classification |
Authors | Shen Li, Zhe Zhao, Tao Liu, Renfen Hu, Xiaoyong Du |
Abstract | Convolutional Neural Networks (CNNs) are widely used in NLP tasks. This paper presents a novel weight initialization method to improve the CNNs for text classification. Instead of randomly initializing the convolutional filters, we encode semantic features into them, which helps the model focus on learning useful features at the beginning of the training. Experiments demonstrate the effectiveness of the initialization technique on seven text classification tasks, including sentiment analysis and topic classification. |
Tasks | Sentiment Analysis, Text Classification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1201/ |
https://www.aclweb.org/anthology/D17-1201 | |
PWC | https://paperswithcode.com/paper/initializing-convolutional-filters-with |
Repo | |
Framework | |
Finding Patterns in Noisy Crowds: Regression-based Annotation Aggregation for Crowdsourced Data
Title | Finding Patterns in Noisy Crowds: Regression-based Annotation Aggregation for Crowdsourced Data |
Authors | Natalie Parde, Rodney Nielsen |
Abstract | Crowdsourcing offers a convenient means of obtaining labeled data quickly and inexpensively. However, crowdsourced labels are often noisier than expert-annotated data, making it difficult to aggregate them meaningfully. We present an aggregation approach that learns a regression model from crowdsourced annotations to predict aggregated labels for instances that have no expert adjudications. The predicted labels achieve a correlation of 0.594 with expert labels on our data, outperforming the best alternative aggregation method by 11.9%. Our approach also outperforms the alternatives on third-party datasets. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1204/ |
https://www.aclweb.org/anthology/D17-1204 | |
PWC | https://paperswithcode.com/paper/finding-patterns-in-noisy-crowds-regression |
Repo | |
Framework | |
Document-Level Multi-Aspect Sentiment Classification as Machine Comprehension
Title | Document-Level Multi-Aspect Sentiment Classification as Machine Comprehension |
Authors | Yichun Yin, Yangqiu Song, Ming Zhang |
Abstract | Document-level multi-aspect sentiment classification is an important task for customer relation management. In this paper, we model the task as a machine comprehension problem where pseudo question-answer pairs are constructed from a small number of aspect-related keywords and aspect ratings. A hierarchical iterative attention model is introduced to build aspect-specific representations by frequent and repeated interactions between documents and aspect questions. We adopt a hierarchical architecture to represent both word-level and sentence-level information, and apply the attention operations to aspect questions and documents alternately with the multiple-hop mechanism. Experimental results on the TripAdvisor and BeerAdvocate datasets show that our model outperforms classical baselines. We will release our code and data for method replicability. |
Tasks | Multi-Task Learning, Reading Comprehension, Sentiment Analysis |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1217/ |
https://www.aclweb.org/anthology/D17-1217 | |
PWC | https://paperswithcode.com/paper/document-level-multi-aspect-sentiment-1 |
Repo | |
Framework | |
What do we need to know about an unknown word when parsing German
Title | What do we need to know about an unknown word when parsing German |
Authors | Bich-Ngoc Do, Ines Rehbein, Anette Frank |
Abstract | We propose a new type of subword embedding designed to provide more information about unknown compounds, a major source for OOV words in German. We present an extrinsic evaluation where we use the compound embeddings as input to a neural dependency parser and compare the results to the ones obtained with other types of embeddings. Our evaluation shows that adding compound embeddings yields a significant improvement of 2% LAS over using word embeddings when no POS information is available. When adding POS embeddings to the input, however, the effect levels out. This suggests that it is not the missing information about the semantics of the unknown words that causes problems for parsing German, but the lack of morphological information for unknown words. To augment our evaluation, we also test the new embeddings in a language modelling task that requires both syntactic and semantic information. |
Tasks | Language Modelling, Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4117/ |
https://www.aclweb.org/anthology/W17-4117 | |
PWC | https://paperswithcode.com/paper/what-do-we-need-to-know-about-an-unknown-word |
Repo | |
Framework | |
Speeding Up Neural Machine Translation Decoding by Shrinking Run-time Vocabulary
Title | Speeding Up Neural Machine Translation Decoding by Shrinking Run-time Vocabulary |
Authors | Xing Shi, Kevin Knight |
Abstract | We speed up Neural Machine Translation (NMT) decoding by shrinking run-time target vocabulary. We experiment with two shrinking approaches: Locality Sensitive Hashing (LSH) and word alignments. Using the latter method, we get a 2x overall speed-up over a highly-optimized GPU implementation, without hurting BLEU. On certain low-resource language pairs, the same methods improve BLEU by 0.5 points. We also report a negative result for LSH on GPUs, due to relatively large overhead, though it was successful on CPUs. Compared with Locality Sensitive Hashing (LSH), decoding with word alignments is GPU-friendly, orthogonal to existing speedup methods and more robust across language pairs. |
Tasks | Machine Translation |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2091/ |
https://www.aclweb.org/anthology/P17-2091 | |
PWC | https://paperswithcode.com/paper/speeding-up-neural-machine-translation-1 |
Repo | |
Framework | |
Proactive Learning for Named Entity Recognition
Title | Proactive Learning for Named Entity Recognition |
Authors | Maolin Li, Nhung Nguyen, Sophia Ananiadou |
Abstract | The goal of active learning is to minimise the cost of producing an annotated dataset, in which annotators are assumed to be perfect, i.e., they always choose the correct labels. However, in practice, annotators are not infallible, and they are likely to assign incorrect labels to some instances. Proactive learning is a generalisation of active learning that can model different kinds of annotators. Although proactive learning has been applied to certain labelling tasks, such as text classification, there is little work on its application to named entity (NE) tagging. In this paper, we propose a proactive learning method for producing NE annotated corpora, using two annotators with different levels of expertise, and who charge different amounts based on their levels of experience. To optimise both cost and annotation quality, we also propose a mechanism to present multiple sentences to annotators at each iteration. Experimental results for several corpora show that our method facilitates the construction of high-quality NE labelled datasets at minimal cost. |
Tasks | Active Learning, Named Entity Recognition, Text Classification |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2314/ |
https://www.aclweb.org/anthology/W17-2314 | |
PWC | https://paperswithcode.com/paper/proactive-learning-for-named-entity |
Repo | |
Framework | |
Exploring Joint Neural Model for Sentence Level Discourse Parsing and Sentiment Analysis
Title | Exploring Joint Neural Model for Sentence Level Discourse Parsing and Sentiment Analysis |
Authors | Bita Nejat, Giuseppe Carenini, Raymond Ng |
Abstract | Discourse Parsing and Sentiment Analysis are two fundamental tasks in Natural Language Processing that have been shown to be mutually beneficial. In this work, we design and compare two Neural Based models for jointly learning both tasks. In the proposed approach, we first create a vector representation for all the text segments in the input sentence. Next, we apply three different Recursive Neural Net models: one for discourse structure prediction, one for discourse relation prediction and one for sentiment analysis. Finally, we combine these Neural Nets in two different joint models: Multi-tasking and Pre-training. Our results on two standard corpora indicate that both methods result in improvements in each task but Multi-tasking has a bigger impact than Pre-training. Specifically for Discourse Parsing, we see improvements in the prediction of the set of contrastive relations. |
Tasks | Sentiment Analysis |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-5535/ |
https://www.aclweb.org/anthology/W17-5535 | |
PWC | https://paperswithcode.com/paper/exploring-joint-neural-model-for-sentence |
Repo | |
Framework | |