Paper Group NANR 102
Predictor-Estimator using Multilevel Task Learning with Stack Propagation for Neural Quality Estimation. Arabic POS Tagging: Don’t Abandon Feature Engineering Just Yet. Word vectors, reuse, and replicability: Towards a community repository of large-text resources. Using English Dictionaries to generate Commonsense Knowledge in Natural Language. A c …
Predictor-Estimator using Multilevel Task Learning with Stack Propagation for Neural Quality Estimation
Title | Predictor-Estimator using Multilevel Task Learning with Stack Propagation for Neural Quality Estimation |
Authors | Hyun Kim, Jong-Hyeok Lee, Seung-Hoon Na |
Abstract | |
Tasks | Language Modelling, Machine Translation, Part-Of-Speech Tagging |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4763/ |
https://www.aclweb.org/anthology/W17-4763 | |
PWC | https://paperswithcode.com/paper/predictor-estimator-using-multilevel-task |
Repo | |
Framework | |
Arabic POS Tagging: Don’t Abandon Feature Engineering Just Yet
Title | Arabic POS Tagging: Don’t Abandon Feature Engineering Just Yet |
Authors | Kareem Darwish, Hamdy Mubarak, Ahmed Abdelali, Mohamed Eldesouki |
Abstract | This paper compares Support Vector Machine based ranking (SVM-Rank) with Bidirectional Long Short-Term Memory (bi-LSTM) neural-network based sequence labeling for building a state-of-the-art Arabic part-of-speech tagging system. Using SVM-Rank leads to state-of-the-art results, but with a fair amount of feature engineering. Using bi-LSTM, particularly when combined with word embeddings, may lead to competitive POS-tagging results by automatically deducing latent linguistic features. However, we show that augmenting bi-LSTM sequence labeling with some of the features that we used for the SVM-Rank based tagger yields further improvements. We also show that the gains realized by using embeddings may not be additive with the gains achieved by the features. We are open-sourcing both the SVM-Rank and the bi-LSTM based systems. |
Tasks | Feature Engineering, Named Entity Recognition, Part-Of-Speech Tagging, Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1316/ |
https://www.aclweb.org/anthology/W17-1316 | |
PWC | https://paperswithcode.com/paper/arabic-pos-tagging-dont-abandon-feature |
Repo | |
Framework | |
Word vectors, reuse, and replicability: Towards a community repository of large-text resources
Title | Word vectors, reuse, and replicability: Towards a community repository of large-text resources |
Authors | Murhaf Fares, Andrey Kutuzov, Stephan Oepen, Erik Velldal |
Abstract | |
Tasks | Semantic Textual Similarity, Word Embeddings |
Published | 2017-05-01 |
URL | https://www.aclweb.org/anthology/W17-0237/ |
https://www.aclweb.org/anthology/W17-0237 | |
PWC | https://paperswithcode.com/paper/word-vectors-reuse-and-replicability-towards |
Repo | |
Framework | |
Using English Dictionaries to generate Commonsense Knowledge in Natural Language
Title | Using English Dictionaries to generate Commonsense Knowledge in Natural Language |
Authors | Ali Almiman, Allan Ramsay |
Abstract | This paper presents an approach to generating common sense knowledge written in raw English sentences. Instead of using public contributors to feed this source, this system chose to employ expert linguistic decisions by using definitions from English dictionaries. Because the definitions in English dictionaries are not prepared to be transformed into inference rules, some preprocessing steps were taken to turn each word:definition relation in the dictionaries into an inference rule of the form left-hand side ⇒ right-hand side. In this paper, we applied this mechanism using two dictionaries: the MacMillan Dictionary and WordNet definitions. A random set of 200 inference rules was extracted equally from the two dictionaries, and then we used human judgment as to whether these rules are 'True' or not. For the MacMillan Dictionary the precision reaches 0.74 with 0.508 recall, and the WordNet definitions resulted in 0.73 precision with 0.09 recall. |
Tasks | Common Sense Reasoning, Natural Language Inference |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1009/ |
https://doi.org/10.26615/978-954-452-049-6_009 | |
PWC | https://paperswithcode.com/paper/using-english-dictionaries-to-generate |
Repo | |
Framework | |
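The word:definition ⇒ inference-rule transformation described in the abstract above can be sketched as a small string rewrite. This is a hypothetical, minimal illustration (the function name and rule template are my own, not the paper's actual preprocessing pipeline):

```python
def definition_to_rule(word, definition):
    """Naive sketch: turn a word:definition pair into an inference
    rule of the form LHS => RHS. The paper's real preprocessing is
    more involved (e.g. handling sense distinctions and phrasing)."""
    lhs = f"x is a {word}"
    # Drop the trailing period and lowercase, so the RHS reads as a clause.
    rhs = f"x is {definition.rstrip('.').lower()}"
    return f"{lhs} => {rhs}"
```

For example, `definition_to_rule("dog", "A domesticated carnivorous mammal.")` yields the rule `x is a dog => x is a domesticated carnivorous mammal`.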
A constrained graph algebra for semantic parsing with AMRs
Title | A constrained graph algebra for semantic parsing with AMRs |
Authors | Jonas Groschwitz, Meaghan Fowlie, Mark Johnson, Alexander Koller |
Abstract | |
Tasks | Semantic Parsing |
Published | 2017-01-01 |
URL | https://www.aclweb.org/anthology/W17-6810/ |
https://www.aclweb.org/anthology/W17-6810 | |
PWC | https://paperswithcode.com/paper/a-constrained-graph-algebra-for-semantic |
Repo | |
Framework | |
Attentive Language Models
Title | Attentive Language Models |
Authors | Giancarlo Salton, Robert Ross, John Kelleher |
Abstract | In this paper, we extend Recurrent Neural Network Language Models (RNN-LMs) with an attention mechanism. We show that an "attentive" RNN-LM (with 11M parameters) achieves a better perplexity than larger RNN-LMs (with 66M parameters) and achieves performance comparable to an ensemble of 10 similarly-sized RNN-LMs. We also show that an "attentive" RNN-LM needs less contextual information to achieve results similar to the state-of-the-art on the WikiText-2 dataset. |
Tasks | Image Captioning, Machine Translation, Speech Recognition |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1045/ |
https://www.aclweb.org/anthology/I17-1045 | |
PWC | https://paperswithcode.com/paper/attentive-language-models |
Repo | |
Framework | |
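The attention mechanism the abstract above adds to an RNN-LM boils down to a softmax-weighted sum over past hidden states. A minimal NumPy sketch of that core operation, not the authors' implementation (function and variable names are mine):

```python
import numpy as np

def attention_context(hidden_states, query):
    """Dot-product attention over past RNN hidden states:
    score each state against the query, softmax the scores over
    time steps, and return the weighted sum as a context vector."""
    scores = hidden_states @ query            # (T,) one score per time step
    weights = np.exp(scores - scores.max())   # stable softmax
    weights /= weights.sum()
    return weights @ hidden_states            # (d,) context vector
```

In an attentive LM, a context vector like this is combined with the current hidden state before the output projection, letting the model reach back to earlier context directly.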
Work Hard, Play Hard: Email Classification on the Avocado and Enron Corpora
Title | Work Hard, Play Hard: Email Classification on the Avocado and Enron Corpora |
Authors | Sakhar Alkhereyf, Owen Rambow |
Abstract | In this paper, we present an empirical study of email classification into two main categories, "Business" and "Personal". We train on the Enron email corpus, and test on the Enron and Avocado email corpora. We show that information from the email exchange networks improves the performance of classification. We represent the email exchange networks as social networks with graph structures. For this classification task, we extract social network features from the graphs in addition to lexical features from email content, and we compare the performance of SVM and Extra-Trees classifiers using these features. Combining graph features with lexical features improves the performance of both classifiers. We also provide manually annotated sets of the Avocado and Enron email corpora as a supplementary contribution. |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2408/ |
https://www.aclweb.org/anthology/W17-2408 | |
PWC | https://paperswithcode.com/paper/work-hard-play-hard-email-classification-on |
Repo | |
Framework | |
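The "social network features from the graphs" mentioned in the abstract above can be as simple as per-address degree counts in the exchange network. A toy sketch under that assumption (the exact feature set in the paper is richer):

```python
from collections import defaultdict

def graph_features(emails):
    """Toy social-network features from a list of (sender, recipient)
    pairs: (out-degree, in-degree) per address in the exchange graph."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    for sender, recipient in emails:
        out_deg[sender] += 1
        in_deg[recipient] += 1
    addresses = set(out_deg) | set(in_deg)
    return {a: (out_deg[a], in_deg[a]) for a in addresses}
```

Feature vectors like these would then be concatenated with lexical features before training the SVM or Extra-Trees classifier.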
Integrating Deep Linguistic Features in Factuality Prediction over Unified Datasets
Title | Integrating Deep Linguistic Features in Factuality Prediction over Unified Datasets |
Authors | Gabriel Stanovsky, Judith Eckle-Kohler, Yevgeniy Puzikov, Ido Dagan, Iryna Gurevych |
Abstract | Previous models for the assessment of commitment towards a predicate in a sentence (also known as factuality prediction) were trained and tested against a specific annotated dataset, subsequently limiting the generality of their results. In this work we propose an intuitive method for mapping three previously annotated corpora onto a single factuality scale, thereby enabling models to be tested across these corpora. In addition, we design a novel model for factuality prediction by first extending a previous rule-based factuality prediction system and applying it over an abstraction of dependency trees, and then using the output of this system in a supervised classifier. We show that this model outperforms previous methods on all three datasets. We make both the unified factuality corpus and our new model publicly available. |
Tasks | Knowledge Base Population, Question Answering |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2056/ |
https://www.aclweb.org/anthology/P17-2056 | |
PWC | https://paperswithcode.com/paper/integrating-deep-linguistic-features-in |
Repo | |
Framework | |
Accuracy First: Selecting a Differential Privacy Level for Accuracy Constrained ERM
Title | Accuracy First: Selecting a Differential Privacy Level for Accuracy Constrained ERM |
Authors | Katrina Ligett, Seth Neel, Aaron Roth, Bo Waggoner, Steven Z. Wu |
Abstract | Traditional approaches to differential privacy assume a fixed privacy requirement ε for a computation, and attempt to maximize the accuracy of the computation subject to the privacy constraint. As differential privacy is increasingly deployed in practical settings, it may often be that there is instead a fixed accuracy requirement for a given computation and the data analyst would like to maximize the privacy of the computation subject to the accuracy constraint. This raises the question of how to find and run a maximally private empirical risk minimizer subject to a given accuracy requirement. We propose a general "noise reduction" framework that can apply to a variety of private empirical risk minimization (ERM) algorithms, using them to "search" the space of privacy levels to find the empirically strongest one that meets the accuracy constraint, incurring only logarithmic overhead in the number of privacy levels searched. The privacy analysis of our algorithm leads naturally to a version of differential privacy where the privacy parameters are dependent on the data, which we term ex-post privacy, and which is related to the recently introduced notion of privacy odometers. We also give an ex-post privacy analysis of the classical AboveThreshold privacy tool, modifying it to allow for queries chosen depending on the database. Finally, we apply our approach to two common objective functions, regularized linear and logistic regression, and empirically compare our noise reduction methods to (i) inverting the theoretical utility guarantees of standard private ERM algorithms and (ii) a stronger empirical baseline based on binary search. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6850-accuracy-first-selecting-a-differential-privacy-level-for-accuracy-constrained-erm |
http://papers.nips.cc/paper/6850-accuracy-first-selecting-a-differential-privacy-level-for-accuracy-constrained-erm.pdf | |
PWC | https://paperswithcode.com/paper/accuracy-first-selecting-a-differential |
Repo | |
Framework | |
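The search problem in the abstract above (find the strongest privacy level whose empirical error meets the accuracy constraint) can be illustrated with a simplified linear scan over a grid of ε values, from most private to least. This is a hypothetical sketch, not the paper's noise-reduction algorithm, which achieves only logarithmic overhead:

```python
def max_privacy_for_accuracy(train_eval, eps_grid, max_error):
    """Scan privacy levels from strongest (smallest eps) to weakest,
    returning the first level whose empirical error meets the
    accuracy constraint, or None if no level qualifies.
    train_eval(eps) is assumed to train a private ERM model at
    privacy level eps and return its empirical error."""
    for eps in sorted(eps_grid):
        if train_eval(eps) <= max_error:
            return eps
    return None
```

Note that naively evaluating the model at each level spends privacy budget on every check; the paper's contribution is a noise-reduction scheme that makes this search cheap in both senses.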
Uprooting and Rerooting Higher-Order Graphical Models
Title | Uprooting and Rerooting Higher-Order Graphical Models |
Authors | Mark Rowland, Adrian Weller |
Abstract | The idea of uprooting and rerooting graphical models was introduced specifically for binary pairwise models by Weller (2016) as a way to transform a model to any of a whole equivalence class of related models, such that inference on any one model yields inference results for all others. This is very helpful since inference, or relevant bounds, may be much easier to obtain or more accurate for some model in the class. Here we introduce methods to extend the approach to models with higher-order potentials and develop theoretical insights. In particular, we show that the triplet-consistent polytope TRI is unique in being 'universally rooted'. We demonstrate empirically that rerooting can significantly improve the accuracy of inference methods for higher-order models at negligible computational cost. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6625-uprooting-and-rerooting-higher-order-graphical-models |
http://papers.nips.cc/paper/6625-uprooting-and-rerooting-higher-order-graphical-models.pdf | |
PWC | https://paperswithcode.com/paper/uprooting-and-rerooting-higher-order |
Repo | |
Framework | |
SUT System Description for Anti-Spoofing 2017 Challenge
Title | SUT System Description for Anti-Spoofing 2017 Challenge |
Authors | Mohammad Adiban, Hossein Sameti, Noushin Maghsoodi, Sajjad Shahsavari |
Abstract | |
Tasks | Quantization, Speaker Verification, Speech Synthesis |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/O17-1025/ |
https://www.aclweb.org/anthology/O17-1025 | |
PWC | https://paperswithcode.com/paper/sut-system-description-for-anti-spoofing-2017 |
Repo | |
Framework | |
Improving ROUGE for Timeline Summarization
Title | Improving ROUGE for Timeline Summarization |
Authors | Sebastian Martschat, Katja Markert |
Abstract | Current evaluation metrics for timeline summarization either ignore the temporal aspect of the task or require strict date matching. We introduce variants of ROUGE that allow alignment of daily summaries via temporal distance or semantic similarity. We argue for the suitability of these variants in a theoretical analysis and demonstrate it in a battery of task-specific tests. |
Tasks | Semantic Similarity, Semantic Textual Similarity, Timeline Summarization |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2046/ |
https://www.aclweb.org/anthology/E17-2046 | |
PWC | https://paperswithcode.com/paper/improving-rouge-for-timeline-summarization |
Repo | |
Framework | |
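The temporal-distance variant of ROUGE alignment described in the abstract above can be illustrated by greedily matching each system summary date to the nearest unused reference date within a window, instead of requiring exact date matches. A hypothetical sketch (function name, greedy strategy, and window size are mine, not necessarily the paper's exact formulation):

```python
from datetime import date

def align_by_temporal_distance(sys_dates, ref_dates, max_days=2):
    """Greedily align each system summary date to the nearest
    unused reference date within max_days; dates with no reference
    inside the window stay unaligned."""
    unused = set(ref_dates)
    alignment = {}
    for d in sorted(sys_dates):
        best = min(
            (r for r in unused if abs((d - r).days) <= max_days),
            key=lambda r: abs((d - r).days),
            default=None,
        )
        if best is not None:
            alignment[d] = best
            unused.remove(best)
    return alignment
```

ROUGE scores would then be computed between the aligned pairs of daily summaries rather than only between summaries sharing an identical date.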
MappSent: a Textual Mapping Approach for Question-to-Question Similarity
Title | MappSent: a Textual Mapping Approach for Question-to-Question Similarity |
Authors | Amir Hazem, Basma El Amel Boussaha, Nicolas Hernandez |
Abstract | Since the advent of word embedding methods, the representation of longer pieces of text such as sentences and paragraphs is gaining more and more interest, especially for textual similarity tasks. Mikolov et al. (2013) have demonstrated that words and phrases exhibit linear structures that allow words to be meaningfully combined by an element-wise addition of their vector representations. Recently, Arora et al. (2017) have shown that removing the projections of the weighted average sum of word embedding vectors on their first principal components outperforms sophisticated supervised methods including RNNs and LSTMs. Inspired by the findings of Mikolov et al. (2013) and Arora et al. (2017) and by a bilingual word mapping technique presented in Artetxe et al. (2016), we introduce MappSent, a novel approach for textual similarity. Based on a linear sentence embedding representation, its principle is to build a matrix that maps sentences in a joint subspace where similar sets of sentences are pushed closer. We evaluate our approach on the SemEval 2016/2017 question-to-question similarity task and show that overall MappSent achieves competitive results and in most cases outperforms state-of-the-art methods. |
Tasks | Community Question Answering, Question Answering, Question Similarity, Sentence Embedding, Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1040/ |
https://doi.org/10.26615/978-954-452-049-6_040 | |
PWC | https://paperswithcode.com/paper/mappsent-a-textual-mapping-approach-for |
Repo | |
Framework | |
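The two building blocks the abstract above leans on, Arora et al.-style principal-component removal and a learned linear mapping into a joint subspace, can be sketched in NumPy. This is a simplified stand-in (least-squares in place of the paper's exact mapping construction; function names are mine):

```python
import numpy as np

def remove_first_pc(X):
    """Arora et al. (2017)-style post-processing: subtract each
    sentence embedding's projection on the first principal component
    of the embedding matrix X (rows are sentence vectors)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    pc = Vt[0]                       # first principal direction (unit norm)
    return X - np.outer(X @ pc, pc)

def fit_mapping(S, T):
    """Least-squares matrix M mapping source sentence embeddings S
    close to their paired targets T, so similar sentences end up
    near each other in the joint subspace."""
    M, *_ = np.linalg.lstsq(S, T, rcond=None)
    return M
```

At query time, new sentence embeddings would be projected with `M` before computing cosine similarities for question-to-question ranking.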
Doubly Greedy Primal-Dual Coordinate Descent for Sparse Empirical Risk Minimization
Title | Doubly Greedy Primal-Dual Coordinate Descent for Sparse Empirical Risk Minimization |
Authors | Qi Lei, Ian En-Hsu Yen, Chao-yuan Wu, Inderjit S. Dhillon, Pradeep Ravikumar |
Abstract | We consider the popular problem of sparse empirical risk minimization with linear predictors and a large number of both features and observations. With a convex-concave saddle point objective reformulation, we propose a Doubly Greedy Primal-Dual Coordinate Descent algorithm that is able to exploit sparsity in both primal and dual variables. It enjoys a low cost per iteration, and our theoretical analysis shows that it converges linearly with a good iteration complexity, provided that the set of primal variables is sparse. We then extend this algorithm further to leverage active sets. The resulting new algorithm is even faster, and experiments on large-scale multi-class datasets show that our algorithm achieves up to 30 times speedup over several state-of-the-art optimization methods. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=864 |
http://proceedings.mlr.press/v70/lei17b/lei17b.pdf | |
PWC | https://paperswithcode.com/paper/doubly-greedy-primal-dual-coordinate-descent |
Repo | |
Framework | |
Graph-based Event Extraction from Twitter
Title | Graph-based Event Extraction from Twitter |
Authors | Amosse Edouard, Elena Cabrio, Sara Tonelli, Nhan Le-Thanh |
Abstract | Detecting which tweets describe a specific event and clustering them is one of the main challenging tasks related to Social Media currently addressed in the NLP community. Existing approaches have mainly focused on detecting spikes in clusters around specific keywords or Named Entities (NE). However, one of the main drawbacks of such approaches is the difficulty in understanding when the same keywords describe different events. In this paper, we propose a novel approach that exploits NE mentions in tweets and their entity context to create a temporal event graph. Then, using simple graph theory techniques and a PageRank-like algorithm, we process the event graphs to detect clusters of tweets describing the same events. Experiments on two gold standard datasets show that our approach achieves state-of-the-art results both in terms of evaluation performance and the quality of the detected events. |
Tasks | Sentiment Analysis |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1031/ |
https://doi.org/10.26615/978-954-452-049-6_031 | |
PWC | https://paperswithcode.com/paper/graph-based-event-extraction-from-twitter |
Repo | |
Framework | |
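The "PageRank-like algorithm" the abstract above applies to its temporal event graphs can be illustrated with plain power-iteration PageRank over an adjacency dict. A minimal sketch, not the paper's exact variant (it ignores dangling-node mass redistribution, and the node/edge semantics here are generic):

```python
def pagerank(graph, damping=0.85, iters=50):
    """Power-iteration PageRank over an adjacency dict
    {node: [neighbours]}. Each iteration redistributes each node's
    rank among its out-neighbours, mixed with a uniform teleport term."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for n, outs in graph.items():
            if not outs:
                continue  # dangling node: its mass is dropped in this sketch
            share = damping * rank[n] / len(outs)
            for m in outs:
                new[m] += share
        rank = new
    return rank
```

In the paper's setting, scores like these over nodes of each temporal event graph help pick out the clusters of tweets that describe the same event.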