January 24, 2020

2396 words 12 mins read

Paper Group NANR 227

Provable Guarantees on Learning Hierarchical Generative Models with Deep CNNs. SyntagNet: Challenging Supervised Word Sense Disambiguation with Lexical-Semantic Combinations. MoNoise: A Multi-lingual and Easy-to-use Lexical Normalization Tool. A Multi-Pairwise Extension of Procrustes Analysis for Multilingual Word Translation. Hybrid RNN at SemEval …

Provable Guarantees on Learning Hierarchical Generative Models with Deep CNNs

Title Provable Guarantees on Learning Hierarchical Generative Models with Deep CNNs
Authors Eran Malach, Shai Shalev-Shwartz
Abstract Learning deep networks is computationally hard in the general case. To show any positive theoretical results, one must make assumptions on the data distribution. Current theoretical works often make assumptions that are far from describing real data, such as sampling from a Gaussian distribution or linear separability of the data. We describe an algorithm that learns a convolutional neural network, assuming the data is sampled from a deep generative model that generates images level by level, where lower-resolution images correspond to latent semantic classes. We analyze the convergence rate of our algorithm assuming the data is indeed generated according to this model (along with additional assumptions). While we do not claim that these assumptions are realistic for natural images, we believe that they capture some true properties of real data. Furthermore, we show that on CIFAR-10, the algorithm we analyze achieves results in the same ballpark as vanilla convolutional neural networks trained with SGD.
Tasks
Published 2019-01-01
URL https://openreview.net/forum?id=HkxWrsC5FQ
PDF https://openreview.net/pdf?id=HkxWrsC5FQ
PWC https://paperswithcode.com/paper/provable-guarantees-on-learning-hierarchical
Repo
Framework
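
To make the layered generative assumption above more concrete, here is a toy sketch of a level-by-level image sampler in which each coarse cell holds a latent semantic class that stochastically expands into a patch of finer classes. The class counts, 2×2 refinement, random transition tables, and intensity mapping are all illustrative assumptions on our part, not the construction analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions: number of semantic classes, a 2x2 refinement per
# level, and random class-conditional transition tables.
N_CLASSES = 4
N_LEVELS = 3

# transition[c] is a distribution over child classes given parent class c.
transition = rng.dirichlet(np.ones(N_CLASSES), size=N_CLASSES)

def sample_image(root_size=2):
    """Sample a toy 'image' level by level: coarse latent semantic classes
    are refined into finer classes, then mapped to pixel intensities."""
    grid = rng.integers(N_CLASSES, size=(root_size, root_size))  # top-level latent classes
    for _ in range(N_LEVELS):
        h, w = grid.shape
        finer = np.empty((2 * h, 2 * w), dtype=int)
        for i in range(h):
            for j in range(w):
                # Each coarse cell expands into a 2x2 patch of child classes.
                children = rng.choice(N_CLASSES, size=(2, 2), p=transition[grid[i, j]])
                finer[2 * i:2 * i + 2, 2 * j:2 * j + 2] = children
        grid = finer
    # Map leaf classes to pixel intensities (again, purely illustrative).
    intensities = np.linspace(0.0, 1.0, N_CLASSES)
    return intensities[grid]

img = sample_image()   # 16x16 array for root_size=2 and 3 refinement levels
print(img.shape)
```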

SyntagNet: Challenging Supervised Word Sense Disambiguation with Lexical-Semantic Combinations

Title SyntagNet: Challenging Supervised Word Sense Disambiguation with Lexical-Semantic Combinations
Authors Marco Maru, Federico Scozzafava, Federico Martelli, Roberto Navigli
Abstract Current research in knowledge-based Word Sense Disambiguation (WSD) indicates that performance depends heavily on the Lexical Knowledge Base (LKB) employed. This paper introduces SyntagNet, a novel resource consisting of manually disambiguated lexical-semantic combinations. By capturing sense distinctions evoked by syntagmatic relations, SyntagNet enables knowledge-based WSD systems to establish a new state of the art which challenges the hitherto unrivaled performance attained by supervised approaches. To the best of our knowledge, SyntagNet is the first large-scale manually-curated resource of this kind made available to the community (at http://syntagnet.org).
Tasks Word Sense Disambiguation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1359/
PDF https://www.aclweb.org/anthology/D19-1359
PWC https://paperswithcode.com/paper/syntagnet-challenging-supervised-word-sense
Repo
Framework

MoNoise: A Multi-lingual and Easy-to-use Lexical Normalization Tool

Title MoNoise: A Multi-lingual and Easy-to-use Lexical Normalization Tool
Authors Rob van der Goot
Abstract In this paper, we introduce and demonstrate the online demo as well as the command line interface of MoNoise, a lexical normalization system for a variety of languages. We further improve this model by using features from the original word for every normalization candidate. For comparison with future work, we propose bundling seven datasets in six languages to form a new benchmark, together with a novel evaluation metric which is particularly suitable for cross-dataset comparisons. MoNoise reaches a new state-of-the-art performance for six out of seven of these datasets. Furthermore, we allow the user to tune the 'aggressiveness' of the normalization, and show how the model can be made more efficient with only a small loss in performance. The online demo can be found at http://www.robvandergoot.com/monoise and the corresponding code at https://bitbucket.org/robvanderg/monoise/
Tasks Lexical Normalization
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-3032/
PDF https://www.aclweb.org/anthology/P19-3032
PWC https://paperswithcode.com/paper/monoise-a-multi-lingual-and-easy-to-use
Repo
Framework

A Multi-Pairwise Extension of Procrustes Analysis for Multilingual Word Translation

Title A Multi-Pairwise Extension of Procrustes Analysis for Multilingual Word Translation
Authors Hagai Taitelbaum, Gal Chechik, Jacob Goldberger
Abstract In this paper we present a novel approach to simultaneously representing multiple languages in a common space. Procrustes Analysis (PA) is commonly used to find the optimal orthogonal word mapping in the bilingual case. The proposed Multi Pairwise Procrustes Analysis (MPPA) is a natural extension of the PA algorithm to multilingual word mapping. Unlike previous PA extensions that require a k-way dictionary, this approach requires only pairwise bilingual dictionaries that are much easier to construct.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1363/
PDF https://www.aclweb.org/anthology/D19-1363
PWC https://paperswithcode.com/paper/a-multi-pairwise-extension-of-procrustes
Repo
Framework
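
The bilingual building block referenced in the abstract above, orthogonal Procrustes analysis, has a closed-form SVD solution. The sketch below implements that standard bilingual step and then, purely as an illustration, an alternating scheme that maps several languages into one space using only pairwise dictionaries; the alternating loop, toy embeddings, and dictionary format are assumptions on our part, not the authors' exact MPPA update.

```python
import numpy as np

def procrustes(X, Y):
    """Standard orthogonal Procrustes: find orthogonal W minimizing ||X W - Y||_F.
    Closed-form solution via the SVD of X^T Y."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy setup: three "languages" with 300-dim embeddings and pairwise dictionaries
# given as aligned row indices (all illustrative assumptions).
rng = np.random.default_rng(0)
dim, n_words = 300, 1000
emb = {lang: rng.standard_normal((n_words, dim)) for lang in ("en", "de", "fr")}
pairs = [("en", "de"), ("en", "fr"), ("de", "fr")]            # pairwise dictionaries only
dicts = {p: (np.arange(500), np.arange(500)) for p in pairs}  # aligned word indices

# Hypothetical alternating refinement: re-estimate each language's orthogonal map
# against the current maps of its dictionary partners.
W = {lang: np.eye(dim) for lang in emb}
for _ in range(5):
    for lang in emb:
        src_rows, tgt_rows = [], []
        for (a, b), (ia, ib) in dicts.items():
            if a == lang:
                src_rows.append(emb[a][ia]); tgt_rows.append(emb[b][ib] @ W[b])
            elif b == lang:
                src_rows.append(emb[b][ib]); tgt_rows.append(emb[a][ia] @ W[a])
        W[lang] = procrustes(np.vstack(src_rows), np.vstack(tgt_rows))
```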

Hybrid RNN at SemEval-2019 Task 9: Blending Information Sources for Domain-Independent Suggestion Mining

Title Hybrid RNN at SemEval-2019 Task 9: Blending Information Sources for Domain-Independent Suggestion Mining
Authors Aysu Ezen-Can, Ethem F. Can
Abstract Social media contains an increasing amount of information that both customers and companies can benefit from. These social media posts can include Tweets or take the form of vocalized compliments and complaints (e.g., reviews) about a product or service. Researchers have been actively mining this invaluable information source to automatically generate insights. Mining the sentiment of customer reviews is an example that has gained momentum due to its potential to gather information about what customers are not happy with. Instead of reading millions of reviews, companies prefer sentiment analysis to obtain feedback and to improve their products or services. In this work, we aim to identify information that companies can act on, or that other customers can utilize to make their own experience better. This is different from identifying whether reviews of a product or service are negative, positive, or neutral. To that end, we classify the sentences of a given review as suggestion or non-suggestion, so that readers of the reviews do not have to go through thousands of reviews but can instead focus on actionable items and applicable suggestions. To identify suggestions within reviews, we employ a hybrid approach that utilizes a recurrent neural network (RNN) along with rule-based features to build a domain-independent suggestion mining model. In this way, a model trained on electronics reviews is used to extract suggestions from hotel reviews.
Tasks Sentiment Analysis
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2210/
PDF https://www.aclweb.org/anthology/S19-2210
PWC https://paperswithcode.com/paper/hybrid-rnn-at-semeval-2019-task-9-blending
Repo
Framework
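
As a rough illustration of blending an RNN sentence encoder with rule-based features for suggestion classification, a minimal PyTorch sketch follows. The GRU encoder, layer sizes, and number of rule features are our own assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class HybridSuggestionClassifier(nn.Module):
    """Blend an RNN encoding of the sentence with rule-based features
    (e.g. presence of 'should', 'would be nice', imperative cues)."""
    def __init__(self, vocab_size, emb_dim=100, hidden=64, n_rule_feats=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True)
        self.classifier = nn.Sequential(
            nn.Linear(hidden + n_rule_feats, 32),
            nn.ReLU(),
            nn.Linear(32, 2),          # suggestion vs. non-suggestion
        )

    def forward(self, token_ids, rule_feats):
        # token_ids: (batch, seq_len); rule_feats: (batch, n_rule_feats)
        _, h_n = self.rnn(self.embed(token_ids))
        sentence_vec = h_n.squeeze(0)  # final hidden state of the GRU
        return self.classifier(torch.cat([sentence_vec, rule_feats], dim=-1))

model = HybridSuggestionClassifier(vocab_size=10000)
logits = model(torch.randint(1, 10000, (4, 20)), torch.rand(4, 5))
print(logits.shape)   # torch.Size([4, 2])
```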

Context Dependent Modulation of Activation Function

Title Context Dependent Modulation of Activation Function
Authors Long Sha, Jonathan Schwarcz, Pengyu Hong
Abstract We propose a modification to traditional Artificial Neural Networks (ANNs) that provides them with new aptitudes motivated by biological neurons. Biological neurons do far more than linearly sum synaptic inputs and then transform the integrated information: a biological neuron changes firing modes according to peripheral factors (e.g., neuromodulators) as well as intrinsic ones. Our modification connects a new type of ANN node, which mimics the function of biological neuromodulators and is termed a modulator, to traditional ANN nodes so that they can adjust their activation sensitivities at run time based on their input patterns. In this manner, we enable the slope of the activation function to be context dependent. This modification produces statistically significant improvements over traditional ANN nodes in the context of Convolutional Neural Networks and Long Short-Term Memory networks.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=rylWVnR5YQ
PDF https://openreview.net/pdf?id=rylWVnR5YQ
PWC https://paperswithcode.com/paper/context-dependent-modulation-of-activation
Repo
Framework
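
The idea of a modulator node that makes the activation slope context dependent could look roughly like the following PyTorch sketch. The sigmoid modulator, the (0, 2) slope range, and the tanh activation are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

class ModulatedLinear(nn.Module):
    """A linear layer whose activation slope is scaled per example by a
    'modulator' computed from the same input (a sketch of the idea)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Modulator node: maps the input to a per-unit slope in (0, 2).
        self.modulator = nn.Linear(in_features, out_features)

    def forward(self, x):
        z = self.linear(x)
        slope = 2.0 * torch.sigmoid(self.modulator(x))  # context-dependent slope
        return torch.tanh(slope * z)                    # steeper or flatter tanh per input

layer = ModulatedLinear(16, 8)
print(layer(torch.randn(4, 16)).shape)   # torch.Size([4, 8])
```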

Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

Title Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Authors
Abstract
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/W19-2500/
PDF https://www.aclweb.org/anthology/W19-2500
PWC https://paperswithcode.com/paper/proceedings-of-the-3rd-joint-sighum-workshop
Repo
Framework

Improved Differentiable Architecture Search for Language Modeling and Named Entity Recognition

Title Improved Differentiable Architecture Search for Language Modeling and Named Entity Recognition
Authors Yufan Jiang, Chi Hu, Tong Xiao, Chunliang Zhang, Jingbo Zhu
Abstract In this paper, we study differentiable neural architecture search (NAS) methods for natural language processing. In particular, we improve differentiable architecture search by removing the softmax-local constraint. We also apply differentiable NAS to named entity recognition (NER); this is the first time that differentiable NAS methods have been adopted for NLP tasks other than language modeling. On both the PTB language modeling and CoNLL-2003 English NER data, our method outperforms strong baselines and achieves a new state of the art on the NER task.
Tasks Language Modelling, Named Entity Recognition, Neural Architecture Search
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1367/
PDF https://www.aclweb.org/anthology/D19-1367
PWC https://paperswithcode.com/paper/improved-differentiable-architecture-search
Repo
Framework
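
In standard differentiable architecture search (DARTS), each edge mixes candidate operations with softmax-normalized architecture weights. The sketch below shows such a mixed edge with a switch to drop that per-edge softmax, loosely illustrating the "removing the softmax-local constraint" idea; the candidate operations and the raw-weight alternative are our assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MixedEdge(nn.Module):
    """DARTS-style mixed operation: a weighted sum of candidate ops.
    With use_softmax=True the weights are softmax-normalized per edge
    (the 'softmax-local constraint'); with False, raw weights are used."""
    def __init__(self, ops, use_softmax=True):
        super().__init__()
        self.ops = nn.ModuleList(ops)
        self.alpha = nn.Parameter(1e-3 * torch.randn(len(ops)))  # architecture weights
        self.use_softmax = use_softmax

    def forward(self, x):
        w = torch.softmax(self.alpha, dim=0) if self.use_softmax else self.alpha
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

# Illustrative candidate operations on a 32-dim hidden state.
ops = [nn.Identity(), nn.Linear(32, 32), nn.Sequential(nn.Linear(32, 32), nn.Tanh())]
edge = MixedEdge(ops, use_softmax=False)   # per-edge softmax switched off
print(edge(torch.randn(4, 32)).shape)      # torch.Size([4, 32])
```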

Word Familiarity Rate Estimation Using a Bayesian Linear Mixed Model

Title Word Familiarity Rate Estimation Using a Bayesian Linear Mixed Model
Authors Masayuki Asahara
Abstract This paper presents research on word familiarity rate estimation using the 'Word List by Semantic Principles'. We collected rating information on 96,557 words in the 'Word List by Semantic Principles' via Yahoo! crowdsourcing. We asked 3,392 subject participants to use their introspection to rate the familiarity of words based on the five perspectives of 'KNOW', 'WRITE', 'READ', 'SPEAK', and 'LISTEN', and each word was rated by at least 16 subject participants. We used Bayesian linear mixed models to estimate the word familiarity rates. We also explored the ratings with the semantic labels used in the 'Word List by Semantic Principles'.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5902/
PDF https://www.aclweb.org/anthology/D19-5902
PWC https://paperswithcode.com/paper/word-familiarity-rate-estimation-using-a
Repo
Framework
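
The paper fits Bayesian linear mixed models to crowdsourced ratings. As a rough non-Bayesian analogue, the sketch below fits a frequentist linear mixed model with statsmodels, treating words as groups with random intercepts and reading the per-word intercepts as relative familiarity estimates; the toy data, column names, and single-perspective setup are our assumptions.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format ratings: one row per (word, rater) judgment
# for a single perspective such as KNOW (column names are our assumption).
df = pd.DataFrame({
    "word":   ["inu", "inu", "inu", "neko", "neko", "neko", "kasa", "kasa", "kasa"],
    "rater":  ["r1", "r2", "r3", "r1", "r2", "r3", "r1", "r2", "r3"],
    "rating": [5, 4, 5, 5, 5, 4, 3, 2, 3],
})

# Random intercept per word; a fuller model would also include rater effects
# (and a Bayesian fit, as in the paper, e.g. with a probabilistic programming tool).
model = smf.mixedlm("rating ~ 1", df, groups=df["word"])
result = model.fit()

# Per-word random intercepts serve as (relative) familiarity estimates.
familiarity = {word: float(re_.iloc[0]) for word, re_ in result.random_effects.items()}
print(familiarity)
```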

KnowSemLM: A Knowledge Infused Semantic Language Model

Title KnowSemLM: A Knowledge Infused Semantic Language Model
Authors Haoruo Peng, Qiang Ning, Dan Roth
Abstract Story understanding requires developing expectations of what events come next in text. Prior knowledge – both statistical and declarative – is essential in guiding such expectations. While existing semantic language models (SemLM) capture event co-occurrence information by modeling event sequences as semantic frames, entities, and other semantic units, this paper aims at augmenting them with causal knowledge (i.e., one event is likely to lead to another). Such knowledge is modeled at the frame and entity level, and can be obtained either statistically from text or stated declaratively. The proposed method, KnowSemLM, infuses this knowledge into a semantic LM by joint training and inference, and is shown to be effective on both the event cloze test and story/referent prediction tasks.
Tasks Language Modelling
Published 2019-11-01
URL https://www.aclweb.org/anthology/K19-1051/
PDF https://www.aclweb.org/anthology/K19-1051
PWC https://paperswithcode.com/paper/knowsemlm-a-knowledge-infused-semantic
Repo
Framework

Computational Argumentation Synthesis as a Language Modeling Task

Title Computational Argumentation Synthesis as a Language Modeling Task
Authors Roxanne El Baff, Henning Wachsmuth, Khalid Al Khatib, Manfred Stede, Benno Stein
Abstract Synthesis approaches in computational argumentation so far are restricted to generating claim-like argument units or short summaries of debates. Ultimately, however, we expect computers to generate whole new arguments for a given stance towards some topic, backing up claims following argumentative and rhetorical considerations. In this paper, we approach such an argumentation synthesis as a language modeling task. In our language model, argumentative discourse units are the "words", and arguments represent the "sentences". Given a pool of units for any unseen topic-stance pair, the model selects a set of unit types according to a basic rhetorical strategy (logos vs. pathos), arranges the structure of the types based on the units' argumentative roles, and finally "phrases" an argument by instantiating the structure with semantically coherent units from the pool. Our evaluation suggests that the model can, to some extent, mimic the human synthesis of strategy-specific arguments.
Tasks Language Modelling
Published 2019-10-01
URL https://www.aclweb.org/anthology/W19-8607/
PDF https://www.aclweb.org/anthology/W19-8607
PWC https://paperswithcode.com/paper/computational-argumentation-synthesis-as-a
Repo
Framework

Textual and Visual Characteristics of Mathematical Expressions in Scholar Documents

Title Textual and Visual Characteristics of Mathematical Expressions in Scholar Documents
Authors Vidas Daudaravicius
Abstract Mathematical expressions (MEs) are widely used in scholar documents. In this paper we analyze the textual and visual characteristics of MEs for the image-to-LaTeX translation task. While there are open datasets of LaTeX files with MEs included, it is very complicated to extract these MEs from a document and to compile a list of MEs. We therefore release a corpus of open-access scholar documents with parallel PDF and JATS-XML files. The MEs in these documents are LaTeX encoded and document independent. The data contains more than 1.2 million distinct annotated formulae and more than 80 million raw tokens of LaTeX MEs in more than 8 thousand documents. While the variety of textual lengths and visual sizes of MEs is not well defined, we found that the task of analyzing MEs in scholar documents can be reduced to the subtask of a particular text length and image width and height bounds, and that display MEs can be processed as arrays of partial MEs.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/W19-2610/
PDF https://www.aclweb.org/anthology/W19-2610
PWC https://paperswithcode.com/paper/textual-and-visual-characteristics-of
Repo
Framework

Automatic Generation and Semantic Grading of Esperanto Sentences in a Teaching Context

Title Automatic Generation and Semantic Grading of Esperanto Sentences in a Teaching Context
Authors Eckhard Bick
Abstract
Tasks
Published 2019-09-01
URL https://www.aclweb.org/anthology/W19-6302/
PDF https://www.aclweb.org/anthology/W19-6302
PWC https://paperswithcode.com/paper/automatic-generation-and-semantic-grading-of
Repo
Framework

EigenSent: Spectral sentence embeddings using higher-order Dynamic Mode Decomposition

Title EigenSent: Spectral sentence embeddings using higher-order Dynamic Mode Decomposition
Authors Subhradeep Kayal, George Tsatsaronis
Abstract Distributed representations of words, or word embeddings, have motivated methods for calculating semantic representations of word sequences such as phrases, sentences and paragraphs. Most of the existing methods to do so either use algorithms to learn such representations, or improve on calculating weighted averages of the word vectors. In this work, we experiment with spectral methods of signal representation and summarization as mechanisms for constructing such word-sequence embeddings in an unsupervised fashion. In particular, we explore an algorithm rooted in fluid dynamics, known as higher-order Dynamic Mode Decomposition, which is designed to capture the eigenfrequencies, and hence the fundamental transition dynamics, of periodic and quasi-periodic systems. It is empirically observed that this approach, which we call EigenSent, can summarize transitions in a sequence of words and generate an embedding that represents the sequence itself well. To the best of the authors' knowledge, this is the first application of a spectral decomposition and signal summarization technique on text to create sentence embeddings. We test the efficacy of this algorithm in creating sentence embeddings on three public datasets, where it performs appreciably well. Moreover, it is also shown that, due to the positive combination of their complementary properties, concatenating the embeddings generated by EigenSent with simple word vector averaging achieves state-of-the-art results.
Tasks Sentence Embeddings, Word Embeddings
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1445/
PDF https://www.aclweb.org/anthology/P19-1445
PWC https://paperswithcode.com/paper/eigensent-spectral-sentence-embeddings-using
Repo
Framework
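
Standard (first-order) Dynamic Mode Decomposition fits a linear operator that advances one word vector to the next, and its leading eigenvalues and modes summarize the transition dynamics. The sketch below builds a crude sentence feature this way and concatenates it with the average word vector, loosely mirroring the concatenation result mentioned in the abstract; it omits the higher-order (delay-embedded) variant and all of the authors' design details, so treat it only as a sketch.

```python
import numpy as np

def dmd_sentence_features(word_vectors, k=2):
    """Crude DMD-based sentence summary: fit A with x_{t+1} ≈ A x_t over the
    word-vector sequence and keep the k leading eigenvalues and mode energies."""
    X, Y = word_vectors[:-1].T, word_vectors[1:].T      # (dim, T-1) snapshot pairs
    A = Y @ np.linalg.pinv(X)                           # least-squares transition operator
    eigvals, eigvecs = np.linalg.eig(A)
    order = np.argsort(-np.abs(eigvals))[:k]            # dominant transition modes
    return np.concatenate([
        np.abs(eigvals[order]),                         # mode magnitudes (growth/decay)
        np.angle(eigvals[order]),                       # mode frequencies
        np.abs(eigvecs[:, order]).mean(axis=0),         # average mode energy
    ])

rng = np.random.default_rng(0)
sentence = rng.standard_normal((7, 50))                 # 7 words, 50-dim embeddings
embedding = np.concatenate([sentence.mean(axis=0),      # simple averaging baseline ...
                            dmd_sentence_features(sentence)])  # ... plus DMD summary
print(embedding.shape)                                  # (56,) here: 50 + 3*k features
```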

Annotating Shallow Discourse Relations in Twitter Conversations

Title Annotating Shallow Discourse Relations in Twitter Conversations
Authors Tatjana Scheffler, Berfin Aktaş, Debopam Das, Manfred Stede
Abstract We introduce our pilot study applying PDTB-style annotation to Twitter conversations. Lexically grounded coherence annotation for Twitter threads will enable detailed investigations of the discourse structure of conversations on social media. Here, we present our corpus of 185 threads and its annotation, including an inter-annotator agreement study. We discuss our observations as to how Twitter discourse differs from written news text with respect to discourse connectives and relations. We confirm our hypothesis that discourse relations in written social media conversations are expressed differently than in (news) text. We find that in Twitter, connective arguments frequently are not full syntactic clauses, and that a few general connectives expressing EXPANSION and CONTINGENCY make up the majority of the explicit relations in our data.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/W19-2707/
PDF https://www.aclweb.org/anthology/W19-2707
PWC https://paperswithcode.com/paper/annotating-shallow-discourse-relations-in
Repo
Framework