Paper Group NANR 75
Neural Regularized Domain Adaptation for Chinese Word Segmentation
Title | Neural Regularized Domain Adaptation for Chinese Word Segmentation |
Authors | Zuyi Bao, Si Li, Weiran Xu, Sheng Gao |
Abstract | For Chinese word segmentation, the large-scale annotated corpora mainly focus on newswire, and only a handful of annotated data is available in other domains such as patents and literature. Given the limited amount of annotated target-domain data, it is a challenge for segmenters to learn domain-specific information while avoiding over-fitting. In this paper, we propose a neural regularized domain adaptation method for Chinese word segmentation. Teacher networks trained in the source domain are employed to regularize the training of the student network by preserving general knowledge. In our experiments, the proposed method achieves better performance than previous methods. |
Tasks | Chinese Word Segmentation, Domain Adaptation, Model Compression |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/W17-6002/ |
https://www.aclweb.org/anthology/W17-6002 | |
PWC | https://paperswithcode.com/paper/neural-regularized-domain-adaptation-for |
Repo | |
Framework | |
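The teacher-student regularization described in the abstract can be sketched as a training objective that adds a KL term pulling the student's output distribution toward the source-domain teacher's, on top of the usual cross-entropy on gold labels. A minimal pure-Python sketch; the function name, the λ weighting, and the per-example interface are illustrative assumptions, not the authors' implementation:

```python
import math

def regularized_loss(student_probs, teacher_probs, gold_index, lam=0.5):
    """Cross-entropy on the gold label plus a KL(teacher || student) term
    that keeps the student close to the source-domain teacher, preserving
    general knowledge while fitting the small target-domain corpus."""
    ce = -math.log(student_probs[gold_index])
    kl = sum(t * math.log(t / s)
             for t, s in zip(teacher_probs, student_probs) if t > 0)
    return ce + lam * kl
```

When the student matches the teacher exactly, the KL term vanishes and the loss reduces to plain cross-entropy; any divergence from the teacher adds a penalty scaled by λ.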
Population Matching Discrepancy and Applications in Deep Learning
Title | Population Matching Discrepancy and Applications in Deep Learning |
Authors | Jianfei Chen, Chongxuan Li, Yizhong Ru, Jun Zhu |
Abstract | A differentiable estimation of the distance between two distributions based on samples is important for many deep learning tasks. One such estimation is maximum mean discrepancy (MMD). However, MMD suffers from its sensitive kernel bandwidth hyper-parameter, weak gradients, and large mini-batch size when used as a training objective. In this paper, we propose population matching discrepancy (PMD) for estimating the distribution distance based on samples, as well as an algorithm to learn the parameters of the distributions using PMD as an objective. PMD is defined as the minimum weight matching of sample populations from each distribution, and we prove that PMD is a strongly consistent estimator of the first Wasserstein metric. We apply PMD to two deep learning tasks, domain adaptation and generative modeling. Empirical results demonstrate that PMD overcomes the aforementioned drawbacks of MMD, and outperforms MMD on both tasks in terms of the performance as well as the convergence speed. |
Tasks | Domain Adaptation |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7206-population-matching-discrepancy-and-applications-in-deep-learning |
http://papers.nips.cc/paper/7206-population-matching-discrepancy-and-applications-in-deep-learning.pdf | |
PWC | https://paperswithcode.com/paper/population-matching-discrepancy-and |
Repo | |
Framework | |
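PMD is defined in the abstract as the minimum-weight matching between two equal-size sample populations. For tiny 1-D samples this can be sketched with brute force over permutations; an efficient implementation would use a polynomial-time matching algorithm such as the Hungarian method, and the names here are illustrative:

```python
import itertools

def pmd(xs, ys):
    """Population Matching Discrepancy between two equal-size 1-D samples:
    the cost of the minimum-weight perfect matching between the two
    populations, averaged over the population size. Brute force over all
    permutations, so only suitable for very small n."""
    n = len(xs)
    best = min(
        sum(abs(xs[i] - ys[p[i]]) for i in range(n))
        for p in itertools.permutations(range(n))
    )
    return best / n
```

In one dimension the optimal matching pairs sorted samples in order, which is why PMD estimates the first Wasserstein metric as the abstract states.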
Proceedings of the 4th Workshop on Argument Mining
Title | Proceedings of the 4th Workshop on Argument Mining |
Authors | |
Abstract | |
Tasks | Argument Mining |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5100/ |
https://www.aclweb.org/anthology/W17-5100 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-4th-workshop-on-argument |
Repo | |
Framework | |
Co-reference Resolution of Elided Subjects and Possessive Pronouns in Spanish-English Statistical Machine Translation
Title | Co-reference Resolution of Elided Subjects and Possessive Pronouns in Spanish-English Statistical Machine Translation |
Authors | Annette Rios Gonzales, Don Tuggener |
Abstract | This paper presents a straightforward method to integrate co-reference information into phrase-based machine translation to address the problems of i) elided subjects and ii) morphological underspecification of pronouns when translating from pro-drop languages. We evaluate the method for the language pair Spanish-English and find that translation quality improves with the addition of co-reference information. |
Tasks | Coreference Resolution, Machine Translation, Tokenization |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2104/ |
https://www.aclweb.org/anthology/E17-2104 | |
PWC | https://paperswithcode.com/paper/co-reference-resolution-of-elided-subjects |
Repo | |
Framework | |
Word Embeddings as Features for Supervised Coreference Resolution
Title | Word Embeddings as Features for Supervised Coreference Resolution |
Authors | Iliana Simova, Hans Uszkoreit |
Abstract | A common reason for errors in coreference resolution is the lack of semantic information to help determine the compatibility between mentions referring to the same entity. Distributed representations, which have been shown successful in encoding relatedness between words, could potentially be a good source of such knowledge. Moreover, being obtained in an unsupervised manner, they could help address data sparsity issues in labeled training data at a small cost. In this work we investigate whether, and to what extent, features derived from word embeddings can be successfully used for supervised coreference resolution. We experiment with several word embedding models and several different types of embedding-based features, including embedding cluster and cosine similarity-based features. Our evaluations show improvements in the performance of a supervised state-of-the-art coreference system. |
Tasks | Coreference Resolution, Dimensionality Reduction, Machine Translation, Question Answering, Relation Extraction, Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1088/ |
https://doi.org/10.26615/978-954-452-049-6_088 | |
PWC | https://paperswithcode.com/paper/word-embeddings-as-features-for-supervised |
Repo | |
Framework | |
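One of the feature types the abstract names is cosine similarity between embeddings. A minimal sketch of how an embedding-based feature extractor for a candidate mention pair might look; the feature names and the out-of-vocabulary handling are assumptions, not the paper's exact feature set:

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mention_pair_features(emb, head1, head2):
    """Embedding-derived features for a candidate coreference pair:
    head-word cosine similarity plus an in-vocabulary indicator, so the
    downstream classifier can learn to discount missing-embedding cases."""
    in_vocab = head1 in emb and head2 in emb
    sim = cosine(emb[head1], emb[head2]) if in_vocab else 0.0
    return {"head_cosine": sim, "both_in_vocab": float(in_vocab)}
```

These real-valued features would simply be appended to the existing feature vector of a supervised mention-pair classifier.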
Prerequisite Relation Learning for Concepts in MOOCs
Title | Prerequisite Relation Learning for Concepts in MOOCs |
Authors | Liangming Pan, Chengjiang Li, Juanzi Li, Jie Tang |
Abstract | What prerequisite knowledge should students master before moving on to subsequent coursewares? We study the extent to which the prerequisite relation between knowledge concepts in Massive Open Online Courses (MOOCs) can be inferred automatically, and in particular what kinds of information can be leveraged to uncover the potential prerequisite relation between knowledge concepts. We first propose a representation learning-based method for learning latent representations of course concepts, and then investigate how different features capture the prerequisite relations between concepts. Our experiments on three datasets from Coursera show that the proposed method achieves significant improvements (+5.9-48.0% by F1-score) compared with existing methods. |
Tasks | Representation Learning |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1133/ |
https://www.aclweb.org/anthology/P17-1133 | |
PWC | https://paperswithcode.com/paper/prerequisite-relation-learning-for-concepts |
Repo | |
Framework | |
Universal Dependencies for Arabic
Title | Universal Dependencies for Arabic |
Authors | Dima Taji, Nizar Habash, Daniel Zeman |
Abstract | We describe the process of creating NUDAR, a Universal Dependency treebank for Arabic. We present the conversion from the Penn Arabic Treebank to the Universal Dependency syntactic representation through an intermediate dependency representation. We discuss the challenges faced in the conversion of the trees, the decisions we made to solve them, and the validation of our conversion. We also present initial parsing results on NUDAR. |
Tasks | Machine Translation, Question Answering |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1320/ |
https://www.aclweb.org/anthology/W17-1320 | |
PWC | https://paperswithcode.com/paper/universal-dependencies-for-arabic |
Repo | |
Framework | |
Large-scale news entity sentiment analysis
Title | Large-scale news entity sentiment analysis |
Authors | Ralf Steinberger, Stefanie Hegele, Hristo Tanev, Leonida Della Rocca |
Abstract | We work on detecting positive or negative sentiment towards named entities in very large volumes of news articles. The aim is to monitor changes over time, as well as to work towards media bias detection by comparing differences across news sources and countries. With a view to applying the same method to dozens of languages, we use linguistically light-weight methods: searching for positive and negative terms in bags of words around entity mentions (also considering negation). Evaluation results are good and better than a third-party baseline system, but precision is not sufficiently high to display the results publicly in our multilingual news analysis system Europe Media Monitor (EMM). In this paper, we focus on describing our effort to improve the English language results by avoiding the biggest sources of errors. We also present new work on using a syntactic parser to identify safe opinion recognition rules, such as predicative structures in which sentiment words directly refer to an entity. The precision of this method is good, but recall is very low. |
Tasks | Part-Of-Speech Tagging, Sentiment Analysis |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1091/ |
https://doi.org/10.26615/978-954-452-049-6_091 | |
PWC | https://paperswithcode.com/paper/large-scale-news-entity-sentiment-analysis |
Repo | |
Framework | |
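The linguistically light-weight approach the abstract describes, counting positive and negative terms in a window around an entity mention while considering negation, can be sketched as follows. The lexicons, window size, and scoring scheme are illustrative assumptions, not EMM's actual resources:

```python
POS = {"praise", "success", "good"}
NEG = {"crisis", "scandal", "bad"}
NEGATION = {"not", "no", "never"}

def entity_sentiment(tokens, entity_idx, window=4):
    """Score sentiment toward the entity at entity_idx by counting
    positive/negative lexicon terms within a token window, flipping a
    term's polarity when a negation word immediately precedes it."""
    lo = max(0, entity_idx - window)
    hi = min(len(tokens), entity_idx + window + 1)
    score = 0
    for i in range(lo, hi):
        w = tokens[i].lower()
        if w in POS or w in NEG:
            sign = 1 if w in POS else -1
            if i > 0 and tokens[i - 1].lower() in NEGATION:
                sign = -sign
            score += sign
    return score
```

Because it only needs a sentiment lexicon and a negation list per language, this design scales to many languages cheaply, which is the trade-off the abstract discusses against the higher-precision parser-based rules.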
Efficient Online Bandit Multiclass Learning with O(sqrt{T}) Regret
Title | Efficient Online Bandit Multiclass Learning with O(sqrt{T}) Regret |
Authors | Alina Beygelzimer, Francesco Orabona, Chicheng Zhang |
Abstract | We present an efficient second-order algorithm with $\tilde{O}(1/\eta \sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by $\eta$, ranging from hinge loss ($\eta=0$) to squared hinge loss ($\eta=1$). This provides a solution to the open problem of (Abernethy, J. and Rakhlin, A. An efficient bandit algorithm for $\sqrt{T}$-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it performs favorably against earlier algorithms. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=557 |
http://proceedings.mlr.press/v70/beygelzimer17a/beygelzimer17a.pdf | |
PWC | https://paperswithcode.com/paper/efficient-online-bandit-multiclass-learning-1 |
Repo | |
Framework | |
COVER: Covering the Semantically Tractable Questions
Title | COVER: Covering the Semantically Tractable Questions |
Authors | Michael Minock |
Abstract | In semantic parsing, natural language questions map to expressions in a meaning representation language (MRL) over some fixed vocabulary of predicates. To do this reliably, one must guarantee that for a wide class of natural language questions (the so-called semantically tractable questions), correct interpretations are always in the mapped set of possibilities. In this demonstration, we introduce the system COVER, which significantly clarifies, revises and extends the basic notion of semantic tractability. COVER achieves coverage of 89% while the earlier PRECISE system achieved coverage of 77% on the well-known GeoQuery corpus. Like PRECISE, COVER requires only a simple domain lexicon and integrates off-the-shelf syntactic parsers. Beyond PRECISE, COVER also integrates off-the-shelf theorem provers to provide more accurate results. COVER is written in Python and uses the NLTK. |
Tasks | Automated Theorem Proving, Semantic Parsing |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-3001/ |
https://www.aclweb.org/anthology/E17-3001 | |
PWC | https://paperswithcode.com/paper/cover-covering-the-semantically-tractable |
Repo | |
Framework | |
Sew-Embed at SemEval-2017 Task 2: Language-Independent Concept Representations from a Semantically Enriched Wikipedia
Title | Sew-Embed at SemEval-2017 Task 2: Language-Independent Concept Representations from a Semantically Enriched Wikipedia |
Authors | Claudio Delli Bovi, Alessandro Raganato |
Abstract | This paper describes Sew-Embed, our language-independent approach to multilingual and cross-lingual semantic word similarity as part of the SemEval-2017 Task 2. We leverage the Wikipedia-based concept representations developed by Raganato et al. (2016), and propose an embedded augmentation of their explicit high-dimensional vectors, which we obtain by plugging in an arbitrary word (or sense) embedding representation, and computing a weighted average in the continuous vector space. We evaluate Sew-Embed with two different off-the-shelf embedding representations, and report their performances across all monolingual and cross-lingual benchmarks available for the task. Despite its simplicity, especially compared with supervised or overly tuned approaches, Sew-Embed achieves competitive results in the cross-lingual setting (3rd best result in the global ranking of subtask 2, score 0.56). |
Tasks | Semantic Textual Similarity |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2041/ |
https://www.aclweb.org/anthology/S17-2041 | |
PWC | https://paperswithcode.com/paper/sew-embed-at-semeval-2017-task-2-language |
Repo | |
Framework | |
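The embedded augmentation the abstract describes, a weighted average of word (or sense) embeddings computed in the continuous vector space, can be sketched as below. The interface and the shape of the weighting are assumptions; Sew-Embed derives its weights from the semantically enriched Wikipedia representations:

```python
def concept_embedding(concept_weights, word_vecs):
    """Continuous representation of a concept as the weighted average of
    the embeddings of its associated words. Words missing from the
    embedding vocabulary are skipped; weights are renormalized over the
    words that remain."""
    dims = len(next(iter(word_vecs.values())))
    total = [0.0] * dims
    z = 0.0
    for word, w in concept_weights.items():
        if word in word_vecs:
            z += w
            for d in range(dims):
                total[d] += w * word_vecs[word][d]
    return [t / z for t in total] if z else total
```

Because the averaging happens in a shared embedding space, concepts from different languages become directly comparable, which is what enables the cross-lingual similarity results the abstract reports.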
Adversarial Training for Unsupervised Bilingual Lexicon Induction
Title | Adversarial Training for Unsupervised Bilingual Lexicon Induction |
Authors | Meng Zhang, Yang Liu, Huanbo Luan, Maosong Sun |
Abstract | Word embeddings are well known to capture linguistic regularities of the language on which they are trained. Researchers also observe that these regularities can transfer across languages. However, previous endeavors to connect separate monolingual word embeddings typically require cross-lingual signals as supervision, either in the form of parallel corpus or seed lexicon. In this work, we show that such cross-lingual connection can actually be established without any form of supervision. We achieve this end by formulating the problem as a natural adversarial game, and investigating techniques that are crucial to successful training. We carry out evaluation on the unsupervised bilingual lexicon induction task. Even though this task appears intrinsically cross-lingual, we are able to demonstrate encouraging performance without any cross-lingual clues. |
Tasks | Word Embeddings |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1179/ |
https://www.aclweb.org/anthology/P17-1179 | |
PWC | https://paperswithcode.com/paper/adversarial-training-for-unsupervised |
Repo | |
Framework | |
A Unified Maximum Likelihood Approach for Estimating Symmetric Properties of Discrete Distributions
Title | A Unified Maximum Likelihood Approach for Estimating Symmetric Properties of Discrete Distributions |
Authors | Jayadev Acharya, Hirakendu Das, Alon Orlitsky, Ananda Theertha Suresh |
Abstract | Symmetric distribution properties such as support size, support coverage, entropy, and proximity to uniformity, arise in many applications. Recently, researchers applied different estimators and analysis tools to derive asymptotically sample-optimal approximations for each of these properties. We show that a single, simple, plug-in estimator—profile maximum likelihood (PML)—is sample competitive for all symmetric properties, and in particular is asymptotically sample-optimal for all the above properties. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=811 |
http://proceedings.mlr.press/v70/acharya17a/acharya17a.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-maximum-likelihood-approach-for-1 |
Repo | |
Framework | |
Zipporah: a Fast and Scalable Data Cleaning System for Noisy Web-Crawled Parallel Corpora
Title | Zipporah: a Fast and Scalable Data Cleaning System for Noisy Web-Crawled Parallel Corpora |
Authors | Hainan Xu, Philipp Koehn |
Abstract | We introduce Zipporah, a fast and scalable data cleaning system. We propose a novel type of bag-of-words translation feature, and train logistic regression models to classify good data and synthetic noisy data in the proposed feature space. The trained model is used to score parallel sentences in the data pool for selection. As shown in experiments, Zipporah selects a high-quality parallel corpus from a large, mixed-quality data pool. In particular, for one noisy dataset, Zipporah achieves a 2.1 BLEU score improvement while using only 1/5 of the data, compared to using the entire corpus. |
Tasks | Language Modelling, Machine Translation, Outlier Detection |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1319/ |
https://www.aclweb.org/anthology/D17-1319 | |
PWC | https://paperswithcode.com/paper/zipporah-a-fast-and-scalable-data-cleaning |
Repo | |
Framework | |
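The bag-of-words translation feature idea can be illustrated with a toy adequacy score over a probabilistic dictionary. The `prob_table` dictionary, the probability floor, and the max-over-source heuristic are all hypothetical simplifications; Zipporah's actual features differ in detail:

```python
import math

def translation_score(src_tokens, tgt_tokens, prob_table, floor=1e-4):
    """Toy bag-of-words adequacy feature: for each target word, take the
    best translation probability offered by any source word (from a
    hypothetical dictionary prob_table mapping (src, tgt) -> p), floor it
    to avoid log(0), and average the negative log probabilities. Lower
    scores indicate better-matched sentence pairs."""
    if not tgt_tokens:
        return 0.0
    total = 0.0
    for t in tgt_tokens:
        p = max((prob_table.get((s, t), 0.0) for s in src_tokens), default=0.0)
        total += -math.log(max(p, floor))
    return total / len(tgt_tokens)
```

A feature like this, computed in both translation directions and combined with fluency features, is the kind of input the logistic regression classifier in the abstract would separate good pairs from noise on.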
Tensor Decomposition via Simultaneous Power Iteration
Title | Tensor Decomposition via Simultaneous Power Iteration |
Authors | Po-An Wang, Chi-Jen Lu |
Abstract | Tensor decomposition is an important problem with many applications across several disciplines, and a popular approach for this problem is the tensor power method. However, previous works with theoretical guarantee based on this approach can only find the top eigenvectors one after one, unlike the case for matrices. In this paper, we show how to find the eigenvectors simultaneously with the help of a new initialization procedure. This allows us to achieve a better running time in the batch setting, as well as a lower sample complexity in the streaming setting. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=560 |
http://proceedings.mlr.press/v70/wang17i/wang17i.pdf | |
PWC | https://paperswithcode.com/paper/tensor-decomposition-via-simultaneous-power |
Repo | |
Framework | |
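The simultaneous power iteration idea is easiest to see in the matrix case: orthogonal iteration recovers the top-k eigenvectors at once, rather than one after another with deflation. A pure-Python sketch for symmetric matrices; the tensor procedure in the paper generalizes this, and all names here are illustrative:

```python
import math
import random

def orthogonal_iteration(mat, k, iters=200, seed=0):
    """Simultaneous (orthogonal) power iteration on a symmetric matrix:
    repeatedly multiply a random n-by-k block by the matrix and
    re-orthonormalize its columns with Gram-Schmidt. The columns converge
    to the top-k eigenvectors (up to sign) when the eigenvalues are
    separated."""
    rng = random.Random(seed)
    n = len(mat)
    Q = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(n)]
    for _ in range(iters):
        # Z = mat @ Q
        Z = [[sum(mat[i][j] * Q[j][c] for j in range(n)) for c in range(k)]
             for i in range(n)]
        # Gram-Schmidt re-orthonormalization of the columns of Z
        for c in range(k):
            for p in range(c):
                dot = sum(Z[i][c] * Z[i][p] for i in range(n))
                for i in range(n):
                    Z[i][c] -= dot * Z[i][p]
            norm = math.sqrt(sum(Z[i][c] ** 2 for i in range(n)))
            for i in range(n):
                Z[i][c] /= norm
        Q = Z
    return Q
```

Updating all k directions per multiplication is what yields the batch running-time and streaming sample-complexity gains the abstract claims over one-at-a-time power iterations.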