Paper Group NAWR 8
Modeling topic dependencies in semantically coherent text spans with copulas
Title | Modeling topic dependencies in semantically coherent text spans with copulas |
Authors | Georgios Balikas, Hesam Amoualian, Marianne Clausel, Eric Gaussier, Massih R. Amini |
Abstract | The exchangeability assumption in topic models like Latent Dirichlet Allocation (LDA) often results in inferring inconsistent topics for the words of text spans like noun-phrases, which are usually expected to be topically coherent. We propose copulaLDA, that extends LDA by integrating part of the text structure to the model and relaxes the conditional independence assumption between the word-specific latent topics given the per-document topic distributions. To this end, we assume that the words of text spans like noun-phrases are topically bound and we model this dependence with copulas. We demonstrate empirically the effectiveness of copulaLDA on both intrinsic and extrinsic evaluation tasks on several publicly available corpora. |
Tasks | Topic Models |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1166/ |
PWC | https://paperswithcode.com/paper/modeling-topic-dependencies-in-semantically |
Repo | https://github.com/balikasg/topicModelling |
Framework | none |
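The abstract contrasts LDA's word-level topic independence with span-level topical binding. A minimal Python sketch of the two extremes (all names hypothetical; copulaLDA itself uses copulas to model partial dependence between these limits rather than either extreme):

```python
import random

def sample_topics_independent(theta, span_len, rng):
    """Standard LDA: each word's topic is drawn i.i.d. from the
    per-document topic distribution theta."""
    topics = list(range(len(theta)))
    return [rng.choices(topics, weights=theta)[0] for _ in range(span_len)]

def sample_topics_bound(theta, span_len, rng):
    """Fully bound span (e.g. a noun-phrase): one topic shared by all
    words -- the degenerate limiting case of copula dependence."""
    topics = list(range(len(theta)))
    t = rng.choices(topics, weights=theta)[0]
    return [t] * span_len

rng = random.Random(0)
theta = [0.7, 0.2, 0.1]
print(sample_topics_bound(theta, 3, rng))  # all three words share one topic
```

A copula lets the model sit anywhere between these two sampling schemes, keeping the LDA marginals while correlating the latent topics inside a span.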
CATENA: CAusal and TEmporal relation extraction from NAtural language texts
Title | CATENA: CAusal and TEmporal relation extraction from NAtural language texts |
Authors | Paramita Mirza, Sara Tonelli |
Abstract | We present CATENA, a sieve-based system to perform temporal and causal relation extraction and classification from English texts, exploiting the interaction between the temporal and the causal model. We evaluate the performance of each sieve, showing that the rule-based, the machine-learned and the reasoning components all contribute to achieving state-of-the-art performance on TempEval-3 and TimeBank-Dense data. Although causal relations are much sparser than temporal ones, the architecture and the selected features are mostly suitable to serve both tasks. The effects of the interaction between the temporal and the causal components, although limited, yield promising results and confirm the tight connection between the temporal and the causal dimension of texts. |
Tasks | Question Answering, Relation Classification, Relation Extraction, Temporal Information Extraction |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1007/ |
PWC | https://paperswithcode.com/paper/catena-causal-and-temporal-relation |
Repo | https://github.com/paramitamirza/CATENA |
Framework | none |
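CATENA is described as sieve-based: ordered components each label the relation pairs that earlier components left unlabeled. A toy Python sketch of that control flow (the sieve functions here are hypothetical stand-ins for the rule-based, machine-learned, and reasoning components):

```python
def run_sieves(pairs, sieves):
    """Apply sieves in order; a later sieve only labels pairs that are
    still unlabeled, so higher-precision sieves should come first."""
    labels = {}
    for sieve in sieves:
        for pair in pairs:
            if pair in labels:
                continue
            lab = sieve(pair)
            if lab is not None:
                labels[pair] = lab
    return labels

# Hypothetical rule-based sieve: an explicit "before" cue signals BEFORE.
def rule_sieve(pair):
    return "BEFORE" if "before" in pair[2] else None

# Hypothetical low-precision fallback sieve.
def fallback_sieve(pair):
    return "VAGUE"

pairs = [("e1", "e2", "left before dawn"), ("e3", "e4", "no cue")]
print(run_sieves(pairs, [rule_sieve, fallback_sieve]))
```

The per-sieve evaluation the abstract mentions falls out naturally: score `labels` after each sieve is appended to the list.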
Syntactic realization with data-driven neural tree grammars
Title | Syntactic realization with data-driven neural tree grammars |
Authors | Brian McMahan, Matthew Stone |
Abstract | A key component in surface realization in natural language generation is to choose concrete syntactic relationships to express a target meaning. We develop a new method for syntactic choice based on learning a stochastic tree grammar in a neural architecture. This framework can exploit state-of-the-art methods for modeling word sequences and generalizing across vocabulary. We also induce embeddings to generalize over elementary tree structures and exploit a tree recurrence over the input structure to model long-distance influences between NLG choices. We evaluate the models on the task of linearizing unannotated dependency trees, documenting the contribution of our modeling techniques to improvements in both accuracy and run time. |
Tasks | Language Modelling, Text Generation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1022/ |
PWC | https://paperswithcode.com/paper/syntactic-realization-with-data-driven-neural |
Repo | https://github.com/braingineer/neural_tree_grammar |
Framework | none |
Weighted Neural Bag-of-n-grams Model: New Baselines for Text Classification
Title | Weighted Neural Bag-of-n-grams Model: New Baselines for Text Classification |
Authors | Bofang Li, Zhe Zhao, Tao Liu, Puwei Wang, Xiaoyong Du |
Abstract | NBSVM is one of the most popular methods for text classification and has been widely used as baselines for various text representation approaches. It uses Naive Bayes (NB) feature to weight sparse bag-of-n-grams representation. N-gram captures word order in short context and NB feature assigns more weights to those important words. However, NBSVM suffers from sparsity problem and is reported to be exceeded by newly proposed distributed (dense) text representations learned by neural networks. In this paper, we transfer the n-grams and NB weighting to neural models. We train n-gram embeddings and use NB weighting to guide the neural models to focus on important words. In fact, our methods can be viewed as distributed (dense) counterparts of sparse bag-of-n-grams in NBSVM. We discover that n-grams and NB weighting are also effective in distributed representations. As a result, our models achieve new strong baselines on 9 text classification datasets, e.g. on IMDB dataset, we reach performance of 93.5% accuracy, which exceeds previous state-of-the-art results obtained by deep neural models. All source codes are publicly available at https://github.com/zhezhaoa/neural_BOW_toolkit. |
Tasks | Text Classification, Word Embeddings |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1150/ |
PWC | https://paperswithcode.com/paper/weighted-neural-bag-of-n-grams-model-new |
Repo | https://github.com/zhezhaoa/neural_BOW_toolkit |
Framework | none |
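The NB weighting the abstract carries over to neural models is the log-count ratio familiar from NBSVM. A minimal sketch of computing those weights over n-gram features (function name and add-one smoothing are illustrative):

```python
import math
from collections import Counter

def nb_weights(pos_docs, neg_docs, alpha=1.0):
    """Naive Bayes log-count ratio r_w = log((p_w/|p|) / (q_w/|q|)):
    the weight assigned to each n-gram feature, positive for features
    indicative of the positive class and negative otherwise."""
    p = Counter(g for d in pos_docs for g in d)
    q = Counter(g for d in neg_docs for g in d)
    vocab = set(p) | set(q)
    p_sum = sum(p[g] + alpha for g in vocab)
    q_sum = sum(q[g] + alpha for g in vocab)
    return {g: math.log(((p[g] + alpha) / p_sum) /
                        ((q[g] + alpha) / q_sum)) for g in vocab}

pos = [["good", "movie"], ["good", "plot"]]
neg = [["bad", "movie"]]
w = nb_weights(pos, neg)
print(w["good"] > 0 > w["bad"])  # → True
```

In the paper's setting these ratios scale n-gram *embeddings* rather than sparse counts, steering the neural model toward discriminative n-grams.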
MDSWriter: Annotation Tool for Creating High-Quality Multi-Document Summarization Corpora
Title | MDSWriter: Annotation Tool for Creating High-Quality Multi-Document Summarization Corpora |
Authors | Christian M. Meyer, Darina Benikova, Margot Mieskes, Iryna Gurevych |
Abstract | |
Tasks | Document Summarization, Multi-Document Summarization |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-4017/ |
PWC | https://paperswithcode.com/paper/mdswriter-annotation-tool-for-creating-high |
Repo | https://github.com/UKPLab/mdswriter |
Framework | none |
On the Impact of Seed Words on Sentiment Polarity Lexicon Induction
Title | On the Impact of Seed Words on Sentiment Polarity Lexicon Induction |
Authors | Dame Jovanoski, Veno Pachovski, Preslav Nakov |
Abstract | Sentiment polarity lexicons are key resources for sentiment analysis, and researchers have invested a lot of efforts in their manual creation. However, there has been a recent shift towards automatically extracted lexicons, which are orders of magnitude larger and perform much better. These lexicons are typically mined using bootstrapping, starting from very few seed words whose polarity is given, e.g., 50-60 words, and sometimes even just 5-6. Here we demonstrate that much higher-quality lexicons can be built by starting with hundreds of words and phrases as seeds, especially when they are in-domain. Thus, we combine (i) mid-sized high-quality manually crafted lexicons as seeds and (ii) bootstrapping, in order to build large-scale lexicons. |
Tasks | Sentiment Analysis, Text Classification |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1147/ |
PWC | https://paperswithcode.com/paper/on-the-impact-of-seed-words-on-sentiment |
Repo | https://github.com/badc0re/sent-lex |
Framework | none |
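The bootstrapping step the abstract builds on can be caricatured as iterative label propagation from seeds; real systems score candidates from corpus statistics rather than a fixed neighbor list. A toy sketch (all data hypothetical):

```python
def bootstrap_lexicon(seeds, neighbors, rounds=2):
    """Expand a polarity lexicon from seed words: each round, unlabeled
    words related to a known-polarity word inherit its polarity.
    `neighbors` stands in for corpus-derived association evidence."""
    lex = dict(seeds)
    for _ in range(rounds):
        new = {}
        for word, pol in lex.items():
            for nb in neighbors.get(word, []):
                if nb not in lex:
                    new[nb] = pol
        lex.update(new)
    return lex

seeds = {"good": +1, "bad": -1}
neighbors = {"good": ["great"], "great": ["superb"], "bad": ["awful"]}
print(bootstrap_lexicon(seeds, neighbors))
```

The paper's point is about the *seed set*: starting this loop from hundreds of in-domain seed words and phrases, rather than a handful, yields markedly better lexicons.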
Bad Company—Neighborhoods in Neural Embedding Spaces Considered Harmful
Title | Bad Company—Neighborhoods in Neural Embedding Spaces Considered Harmful |
Authors | Johannes Hellrich, Udo Hahn |
Abstract | We assess the reliability and accuracy of (neural) word embeddings for both modern and historical English and German. Our research provides deeper insights into the empirically justified choice of optimal training methods and parameters. The overall low reliability we observe, nevertheless, casts doubt on the suitability of word neighborhoods in embedding spaces as a basis for qualitative conclusions on synchronic and diachronic lexico-semantic matters, an issue currently high up in the agenda of Digital Humanities. |
Tasks | Word Embeddings |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1262/ |
PWC | https://paperswithcode.com/paper/bad-companyaneighborhoods-in-neural-embedding |
Repo | https://github.com/hellrich/coling2016 |
Framework | none |
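One way to operationalize the reliability question the abstract raises is to compare a word's nearest-neighbor set across two training runs; the sketch below uses Jaccard overlap of k-NN sets (a plausible measure, not necessarily the paper's exact protocol):

```python
import numpy as np

def knn(emb, word_idx, k):
    """Indices of the k nearest neighbors by cosine similarity."""
    x = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = x @ x[word_idx]
    sims[word_idx] = -np.inf            # exclude the word itself
    return set(np.argsort(sims)[-k:])

def neighborhood_reliability(emb_a, emb_b, word_idx, k=3):
    """Jaccard overlap of a word's k-NN sets from two training runs:
    1.0 means identical neighborhoods, 0.0 means disjoint ones."""
    a, b = knn(emb_a, word_idx, k), knn(emb_b, word_idx, k)
    return len(a & b) / len(a | b)

rng = np.random.default_rng(0)
run1 = rng.normal(size=(10, 5))                       # toy "run 1" vectors
run2 = run1 + rng.normal(scale=0.01, size=(10, 5))    # slightly perturbed run
print(neighborhood_reliability(run1, run2, word_idx=0))
```

Low average overlap across retrainings is exactly the kind of evidence the paper uses to caution against reading lexico-semantic conclusions off embedding neighborhoods.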
Data-Driven Morphological Analysis and Disambiguation for Morphologically Rich Languages and Universal Dependencies
Title | Data-Driven Morphological Analysis and Disambiguation for Morphologically Rich Languages and Universal Dependencies |
Authors | Amir More, Reut Tsarfaty |
Abstract | Parsing texts into universal dependencies (UD) in realistic scenarios requires infrastructure for the morphological analysis and disambiguation (MA&D) of typologically different languages as a first tier. MA&D is particularly challenging in morphologically rich languages (MRLs), where the ambiguous space-delimited tokens ought to be disambiguated with respect to their constituent morphemes, each morpheme carrying its own tag and a rich set of features. Here we present a novel, language-agnostic framework for MA&D, based on a transition system with two variants, word-based and morpheme-based, and a dedicated transition to mitigate the biases of variable-length morpheme sequences. Our experiments on a Modern Hebrew case study show state-of-the-art results, and we show that the morpheme-based MD consistently outperforms our word-based variant. We further illustrate the utility and multilingual coverage of our framework by morphologically analyzing and disambiguating the large set of languages in the UD treebanks. |
Tasks | Morphological Analysis |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1033/ |
PWC | https://paperswithcode.com/paper/data-driven-morphological-analysis-and |
Repo | https://github.com/habeanf/yap |
Framework | none |
Learning principled bilingual mappings of word embeddings while preserving monolingual invariance
Title | Learning principled bilingual mappings of word embeddings while preserving monolingual invariance |
Authors | Mikel Artetxe, Gorka Labaka, Eneko Agirre |
Abstract | |
Tasks | Machine Translation, Word Embeddings |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1250/ |
PWC | https://paperswithcode.com/paper/learning-principled-bilingual-mappings-of |
Repo | https://github.com/artetxem/vecmap |
Framework | none |
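The title's "principled ... while preserving monolingual invariance" corresponds to constraining the bilingual map to be orthogonal, which leaves monolingual dot products (and hence distances) unchanged. A numpy sketch of the orthogonal Procrustes solution under that assumption; the vecmap repo implements the full method, this shows only the core step:

```python
import numpy as np

def orthogonal_mapping(X, Z):
    """Least-squares linear map from source embeddings X to target
    embeddings Z (rows are dictionary-aligned word pairs), constrained
    to be orthogonal: W = U V^T where U S V^T = SVD(X^T Z)."""
    U, _, Vt = np.linalg.svd(X.T @ Z)
    return U @ Vt

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
# Build a target space that is an exact rotation of the source space.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
Z = X @ Q
W = orthogonal_mapping(X, Z)
print(np.allclose(X @ W, Z))  # → True: the rotation is recovered
```

Because `W.T @ W` is the identity, mapping the whole source vocabulary through `W` cannot distort its internal geometry, which is the invariance the title refers to.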
BIRA: Improved Predictive Exchange Word Clustering
Title | BIRA: Improved Predictive Exchange Word Clustering |
Authors | Jon Dehdari, Liling Tan, Josef van Genabith |
Abstract | |
Tasks | Chunking, Machine Translation, Word Alignment |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1139/ |
PWC | https://paperswithcode.com/paper/bira-improved-predictive-exchange-word |
Repo | https://github.com/jonsafari/clustercat |
Framework | none |
Hierarchical Attention Networks for Document Classification
Title | Hierarchical Attention Networks for Document Classification |
Authors | Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, Eduard Hovy |
Abstract | |
Tasks | Citation Intent Classification, Document Classification, Sentiment Analysis, Text Classification |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1174/ |
PWC | https://paperswithcode.com/paper/hierarchical-attention-networks-for-document |
Repo | https://github.com/ematvey/hierarchical-attention-networks |
Framework | tf |
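The hierarchical attention mechanism pools word vectors into a sentence vector (and, one level up, sentence vectors into a document vector) using a learned context vector. A numpy sketch of one such attention layer (shapes and names are illustrative, not the paper's code):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(H, W, b, u_ctx):
    """One attention layer in the HAN style:
    u_t = tanh(W h_t + b);  a_t = softmax(u_t . u_ctx);  s = sum_t a_t h_t.
    Returns the pooled vector and the attention weights."""
    U = np.tanh(H @ W + b)        # (T, d): hidden representation per position
    alpha = softmax(U @ u_ctx)    # (T,):  importance of each position
    return alpha @ H, alpha       # (d,):  attention-weighted sum

rng = np.random.default_rng(0)
H = rng.normal(size=(6, 8))       # 6 word vectors of dimension 8
W = rng.normal(size=(8, 8))
b = rng.normal(size=8)
u_ctx = rng.normal(size=8)        # learned "which positions matter" vector
s, alpha = attention_pool(H, W, b, u_ctx)
print(s.shape, round(alpha.sum(), 6))  # → (8,) 1.0
```

Stacking two such layers (words within a sentence, then sentences within a document) gives the hierarchical structure the title names.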
The Product Cut
Title | The Product Cut |
Authors | Thomas Laurent, James Von Brecht, Xavier Bresson, Arthur Szlam |
Abstract | We introduce a theoretical and algorithmic framework for multi-way graph partitioning that relies on a multiplicative cut-based objective. We refer to this objective as the Product Cut. We provide a detailed investigation of the mathematical properties of this objective and an effective algorithm for its optimization. The proposed model has strong mathematical underpinnings, and the corresponding algorithm achieves state-of-the-art performance on benchmark data sets. |
Tasks | graph partitioning |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6226-the-product-cut |
PDF | http://papers.nips.cc/paper/6226-the-product-cut.pdf |
PWC | https://paperswithcode.com/paper/the-product-cut |
Repo | https://github.com/xbresson/pcut |
Framework | none |
Transition-Based Neural Word Segmentation
Title | Transition-Based Neural Word Segmentation |
Authors | Meishan Zhang, Yue Zhang, Guohong Fu |
Abstract | |
Tasks | Chinese Word Segmentation, Feature Engineering |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1040/ |
PWC | https://paperswithcode.com/paper/transition-based-neural-word-segmentation |
Repo | https://github.com/SUTDNLP/NNTransitionSegmentor |
Framework | none |
Sublinear Time Orthogonal Tensor Decomposition
Title | Sublinear Time Orthogonal Tensor Decomposition |
Authors | Zhao Song, David Woodruff, Huan Zhang |
Abstract | A recent work (Wang et al., NIPS 2015) gives the fastest known algorithms for orthogonal tensor decomposition with provable guarantees. Their algorithm is based on computing sketches of the input tensor, which requires reading the entire input. We show in a number of cases one can achieve the same theoretical guarantees in sublinear time, i.e., even without reading most of the input tensor. Instead of using sketches to estimate inner products in tensor decomposition algorithms, we use importance sampling. To achieve sublinear time, we need to know the norms of tensor slices, and we show how to do this in a number of important cases. For symmetric tensors $T = \sum_{i=1}^k \lambda_i u_i^{\otimes p}$ with $\lambda_i > 0$ for all $i$, we estimate such norms in sublinear time whenever $p$ is even. For the important case of $p = 3$ and small values of $k$, we can also estimate such norms. For asymmetric tensors sublinear time is not possible in general, but we show if the tensor slice norms are just slightly below $\|T\|_F$ then sublinear time is again possible. One of the main strengths of our work is empirical: in a number of cases our algorithm is orders of magnitude faster than existing methods with the same accuracy. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6496-sublinear-time-orthogonal-tensor-decomposition |
PDF | http://papers.nips.cc/paper/6496-sublinear-time-orthogonal-tensor-decomposition.pdf |
PWC | https://paperswithcode.com/paper/sublinear-time-orthogonal-tensor |
Repo | https://github.com/huanzhang12/sampling_tensor_decomp |
Framework | none |
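The abstract's key substitution, importance sampling in place of sketches for estimating inner products, can be illustrated on plain vectors (the tensor-slice machinery and norm estimation are omitted from this sketch):

```python
import numpy as np

def is_inner_product(a, b, n_samples, rng):
    """Unbiased importance-sampling estimate of <a, b>: sample index i
    with probability p_i proportional to |a_i|, then average
    a_i * b_i / p_i over the samples. Only the sampled coordinates of
    b are read -- the source of the sublinear-time behavior."""
    p = np.abs(a) / np.abs(a).sum()
    idx = rng.choice(len(a), size=n_samples, p=p)
    return np.mean(a[idx] * b[idx] / p[idx])

rng = np.random.default_rng(0)
a = rng.normal(size=1000)
b = rng.normal(size=1000)
est = is_inner_product(a, b, 20000, rng)
print(est, a @ b)  # estimate vs. exact value
```

Sampling proportionally to `|a_i|` keeps the estimator's variance low; knowing the analogous norms of tensor slices is exactly what the paper shows how to obtain without reading the whole tensor.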
Phrasal Substitution of Idiomatic Expressions
Title | Phrasal Substitution of Idiomatic Expressions |
Authors | Changsheng Liu, Rebecca Hwa |
Abstract | |
Tasks | Automatic Post-Editing, Lexical Simplification, Machine Translation, Sentiment Analysis, Word Sense Disambiguation |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-1040/ |
PWC | https://paperswithcode.com/paper/phrasal-substitution-of-idiomatic-expressions |
Repo | https://github.com/liucs1986/idiom_corpus |
Framework | none |