Paper Group NANR 75
Neural Regularized Domain Adaptation for Chinese Word Segmentation
Title | Neural Regularized Domain Adaptation for Chinese Word Segmentation |
Authors | Zuyi Bao, Si Li, Weiran Xu, Sheng Gao |
Abstract | For Chinese word segmentation, the large-scale annotated corpora mainly focus on newswire, and only a handful of annotated data is available in other domains such as patents and literature. Given the limited amount of annotated target-domain data, it is a challenge for segmenters to learn domain-specific information while avoiding over-fitting. In this paper, we propose a neural regularized domain adaptation method for Chinese word segmentation. Teacher networks trained in the source domain are employed to regularize the training of the student network by preserving general knowledge. In our experiments, the proposed method achieves better performance than previous methods. |
Tasks | Chinese Word Segmentation, Domain Adaptation, Model Compression |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/W17-6002/ |
https://www.aclweb.org/anthology/W17-6002 | |
PWC | https://paperswithcode.com/paper/neural-regularized-domain-adaptation-for |
Repo | |
Framework | |
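The teacher-student regularization described in the abstract can be sketched as a training objective that adds a KL term pulling the student's output distribution toward the source-domain teacher's, on top of the usual cross-entropy on gold labels. A minimal pure-Python sketch; the function name, the λ weighting, and the per-example interface are illustrative assumptions, not the authors' implementation:

```python
import math

def regularized_loss(student_probs, teacher_probs, gold_index, lam=0.5):
    """Cross-entropy on the gold label plus a KL(teacher || student) term
    that keeps the student close to the source-domain teacher, preserving
    general knowledge while fitting the small target-domain corpus."""
    ce = -math.log(student_probs[gold_index])
    kl = sum(t * math.log(t / s)
             for t, s in zip(teacher_probs, student_probs) if t > 0)
    return ce + lam * kl
```

When the student matches the teacher exactly, the KL term vanishes and the loss reduces to plain cross-entropy; any divergence from the teacher adds a penalty scaled by λ.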
Population Matching Discrepancy and Applications in Deep Learning
Title | Population Matching Discrepancy and Applications in Deep Learning |
Authors | Jianfei Chen, Chongxuan Li, Yizhong Ru, Jun Zhu |
Abstract | A differentiable estimation of the distance between two distributions based on samples is important for many deep learning tasks. One such estimation is maximum mean discrepancy (MMD). However, MMD suffers from its sensitive kernel bandwidth hyper-parameter, weak gradients, and large mini-batch size when used as a training objective. In this paper, we propose population matching discrepancy (PMD) for estimating the distribution distance based on samples, as well as an algorithm to learn the parameters of the distributions using PMD as an objective. PMD is defined as the minimum weight matching of sample populations from each distribution, and we prove that PMD is a strongly consistent estimator of the first Wasserstein metric. We apply PMD to two deep learning tasks, domain adaptation and generative modeling. Empirical results demonstrate that PMD overcomes the aforementioned drawbacks of MMD, and outperforms MMD on both tasks in terms of the performance as well as the convergence speed. |
Tasks | Domain Adaptation |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7206-population-matching-discrepancy-and-applications-in-deep-learning |
http://papers.nips.cc/paper/7206-population-matching-discrepancy-and-applications-in-deep-learning.pdf | |
PWC | https://paperswithcode.com/paper/population-matching-discrepancy-and |
Repo | |
Framework | |
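PMD is defined in the abstract as the minimum-weight matching between two equal-size sample populations. For tiny 1-D samples this can be sketched with brute force over permutations; an efficient implementation would use a polynomial-time matching algorithm such as the Hungarian method, and the names here are illustrative:

```python
import itertools

def pmd(xs, ys):
    """Population Matching Discrepancy between two equal-size 1-D samples:
    the cost of the minimum-weight perfect matching between the two
    populations, averaged over the population size. Brute force over all
    permutations, so only suitable for very small n."""
    n = len(xs)
    best = min(
        sum(abs(xs[i] - ys[p[i]]) for i in range(n))
        for p in itertools.permutations(range(n))
    )
    return best / n
```

In one dimension the optimal matching pairs sorted samples in order, which is why PMD estimates the first Wasserstein metric as the abstract states.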
Proceedings of the 4th Workshop on Argument Mining
Title | Proceedings of the 4th Workshop on Argument Mining |
Authors | |
Abstract | |
Tasks | Argument Mining |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5100/ |
https://www.aclweb.org/anthology/W17-5100 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-4th-workshop-on-argument |
Repo | |
Framework | |
Co-reference Resolution of Elided Subjects and Possessive Pronouns in Spanish-English Statistical Machine Translation
Title | Co-reference Resolution of Elided Subjects and Possessive Pronouns in Spanish-English Statistical Machine Translation |
Authors | Annette Rios Gonzales, Don Tuggener |
Abstract | This paper presents a straightforward method to integrate co-reference information into phrase-based machine translation to address the problems of i) elided subjects and ii) morphological underspecification of pronouns when translating from pro-drop languages. We evaluate the method for the language pair Spanish-English and find that translation quality improves with the addition of co-reference information. |
Tasks | Coreference Resolution, Machine Translation, Tokenization |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2104/ |
https://www.aclweb.org/anthology/E17-2104 | |
PWC | https://paperswithcode.com/paper/co-reference-resolution-of-elided-subjects |
Repo | |
Framework | |
Word Embeddings as Features for Supervised Coreference Resolution
Title | Word Embeddings as Features for Supervised Coreference Resolution |
Authors | Iliana Simova, Hans Uszkoreit |
Abstract | A common reason for errors in coreference resolution is the lack of semantic information to help determine the compatibility between mentions referring to the same entity. Distributed representations, which have been shown successful in encoding relatedness between words, could potentially be a good source of such knowledge. Moreover, being obtained in an unsupervised manner, they could help address data sparsity issues in labeled training data at a small cost. In this work we investigate whether, and to what extent, features derived from word embeddings can be successfully used for supervised coreference resolution. We experiment with several word embedding models and several different types of embedding-based features, including embedding cluster and cosine similarity-based features. Our evaluations show improvements in the performance of a supervised state-of-the-art coreference system. |
Tasks | Coreference Resolution, Dimensionality Reduction, Machine Translation, Question Answering, Relation Extraction, Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1088/ |
https://doi.org/10.26615/978-954-452-049-6_088 | |
PWC | https://paperswithcode.com/paper/word-embeddings-as-features-for-supervised |
Repo | |
Framework | |
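One of the feature types the abstract names is cosine similarity between embeddings. A minimal sketch of how an embedding-based feature extractor for a candidate mention pair might look; the feature names and the out-of-vocabulary handling are assumptions, not the paper's exact feature set:

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mention_pair_features(emb, head1, head2):
    """Embedding-derived features for a candidate coreference pair:
    head-word cosine similarity plus an in-vocabulary indicator, so the
    downstream classifier can learn to discount missing-embedding cases."""
    in_vocab = head1 in emb and head2 in emb
    sim = cosine(emb[head1], emb[head2]) if in_vocab else 0.0
    return {"head_cosine": sim, "both_in_vocab": float(in_vocab)}
```

These real-valued features would simply be appended to the existing feature vector of a supervised mention-pair classifier.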
Prerequisite Relation Learning for Concepts in MOOCs
Title | Prerequisite Relation Learning for Concepts in MOOCs |
Authors | Liangming Pan, Chengjiang Li, Juanzi Li, Jie Tang |
Abstract | What prerequisite knowledge should students master before moving on to subsequent coursewares? We study the extent to which the prerequisite relation between knowledge concepts in Massive Open Online Courses (MOOCs) can be inferred automatically, and in particular what kinds of information can be leveraged to uncover the potential prerequisite relation between knowledge concepts. We first propose a representation learning-based method for learning latent representations of course concepts, and then investigate how different features capture the prerequisite relations between concepts. Our experiments on three datasets from Coursera show that the proposed method achieves significant improvements (+5.9-48.0% by F1-score) compared with existing methods. |
Tasks | Representation Learning |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1133/ |
https://www.aclweb.org/anthology/P17-1133 | |
PWC | https://paperswithcode.com/paper/prerequisite-relation-learning-for-concepts |
Repo | |
Framework | |
Universal Dependencies for Arabic
Title | Universal Dependencies for Arabic |
Authors | Dima Taji, Nizar Habash, Daniel Zeman |
Abstract | We describe the process of creating NUDAR, a Universal Dependency treebank for Arabic. We present the conversion from the Penn Arabic Treebank to the Universal Dependency syntactic representation through an intermediate dependency representation. We discuss the challenges faced in the conversion of the trees, the decisions we made to solve them, and the validation of our conversion. We also present initial parsing results on NUDAR. |
Tasks | Machine Translation, Question Answering |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1320/ |
https://www.aclweb.org/anthology/W17-1320 | |
PWC | https://paperswithcode.com/paper/universal-dependencies-for-arabic |
Repo | |
Framework | |
Large-scale news entity sentiment analysis
Title | Large-scale news entity sentiment analysis |
Authors | Ralf Steinberger, Stefanie Hegele, Hristo Tanev, Leonida Della Rocca |
Abstract | We work on detecting positive or negative sentiment towards named entities in very large volumes of news articles. The aim is to monitor changes over time, as well as to work towards media bias detection by comparing differences across news sources and countries. With a view to applying the same method to dozens of languages, we use linguistically light-weight methods: searching for positive and negative terms in bags of words around entity mentions (also considering negation). Evaluation results are good and better than a third-party baseline system, but precision is not sufficiently high to display the results publicly in our multilingual news analysis system Europe Media Monitor (EMM). In this paper, we focus on describing our effort to improve the English language results by avoiding the biggest sources of errors. We also present new work on using a syntactic parser to identify safe opinion recognition rules, such as predicative structures in which sentiment words directly refer to an entity. The precision of this method is good, but recall is very low. |
Tasks | Part-Of-Speech Tagging, Sentiment Analysis |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1091/ |
https://doi.org/10.26615/978-954-452-049-6_091 | |
PWC | https://paperswithcode.com/paper/large-scale-news-entity-sentiment-analysis |
Repo | |
Framework | |
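The linguistically light-weight approach the abstract describes, counting positive and negative terms in a window around an entity mention while considering negation, can be sketched as follows. The lexicons, window size, and scoring scheme are illustrative assumptions, not EMM's actual resources:

```python
POS = {"praise", "success", "good"}
NEG = {"crisis", "scandal", "bad"}
NEGATION = {"not", "no", "never"}

def entity_sentiment(tokens, entity_idx, window=4):
    """Score sentiment toward the entity at entity_idx by counting
    positive/negative lexicon terms within a token window, flipping a
    term's polarity when a negation word immediately precedes it."""
    lo = max(0, entity_idx - window)
    hi = min(len(tokens), entity_idx + window + 1)
    score = 0
    for i in range(lo, hi):
        w = tokens[i].lower()
        if w in POS or w in NEG:
            sign = 1 if w in POS else -1
            if i > 0 and tokens[i - 1].lower() in NEGATION:
                sign = -sign
            score += sign
    return score
```

Because it only needs a sentiment lexicon and a negation list per language, this design scales to many languages cheaply, which is the trade-off the abstract discusses against the higher-precision parser-based rules.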
Efficient Online Bandit Multiclass Learning with O(sqrt{T}) Regret
Title | Efficient Online Bandit Multiclass Learning with O(sqrt{T}) Regret |
Authors | Alina Beygelzimer, Francesco Orabona, Chicheng Zhang |
Abstract | We present an efficient second-order algorithm with $\tilde{O}(1/\eta \sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by $\eta$, ranging from hinge loss ($\eta=0$) to squared hinge loss ($\eta=1$). This provides a solution to the open problem of (Abernethy, J. and Rakhlin, A. An efficient bandit algorithm for $\sqrt{T}$-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it performs favorably against earlier algorithms. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=557 |
http://proceedings.mlr.press/v70/beygelzimer17a/beygelzimer17a.pdf | |
PWC | https://paperswithcode.com/paper/efficient-online-bandit-multiclass-learning-1 |
Repo | |
Framework | |
COVER: Covering the Semantically Tractable Questions
Title | COVER: Covering the Semantically Tractable Questions |
Authors | Michael Minock |
Abstract | In semantic parsing, natural language questions map to expressions in a meaning representation language (MRL) over some fixed vocabulary of predicates. To do this reliably, one must guarantee that for a wide class of natural language questions (the so-called semantically tractable questions), correct interpretations are always in the mapped set of possibilities. In this demonstration, we introduce the system COVER, which significantly clarifies, revises and extends the basic notion of semantic tractability. COVER achieves coverage of 89% while the earlier PRECISE system achieved coverage of 77% on the well-known GeoQuery corpus. Like PRECISE, COVER requires only a simple domain lexicon and integrates off-the-shelf syntactic parsers. Beyond PRECISE, COVER also integrates off-the-shelf theorem provers to provide more accurate results. COVER is written in Python and uses the NLTK. |
Tasks | Automated Theorem Proving, Semantic Parsing |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-3001/ |
https://www.aclweb.org/anthology/E17-3001 | |
PWC | https://paperswithcode.com/paper/cover-covering-the-semantically-tractable |
Repo | |
Framework | |
Sew-Embed at SemEval-2017 Task 2: Language-Independent Concept Representations from a Semantically Enriched Wikipedia
Title | Sew-Embed at SemEval-2017 Task 2: Language-Independent Concept Representations from a Semantically Enriched Wikipedia |
Authors | Claudio Delli Bovi, Alessandro Raganato |
Abstract | This paper describes Sew-Embed, our language-independent approach to multilingual and cross-lingual semantic word similarity as part of the SemEval-2017 Task 2. We leverage the Wikipedia-based concept representations developed by Raganato et al. (2016), and propose an embedded augmentation of their explicit high-dimensional vectors, which we obtain by plugging in an arbitrary word (or sense) embedding representation, and computing a weighted average in the continuous vector space. We evaluate Sew-Embed with two different off-the-shelf embedding representations, and report their performances across all monolingual and cross-lingual benchmarks available for the task. Despite its simplicity, especially compared with supervised or overly tuned approaches, Sew-Embed achieves competitive results in the cross-lingual setting (3rd best result in the global ranking of subtask 2, score 0.56). |
Tasks | Semantic Textual Similarity |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2041/ |
https://www.aclweb.org/anthology/S17-2041 | |
PWC | https://paperswithcode.com/paper/sew-embed-at-semeval-2017-task-2-language |
Repo | |
Framework | |
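The embedded augmentation the abstract describes, a weighted average of word (or sense) embeddings computed in the continuous vector space, can be sketched as below. The interface and the shape of the weighting are assumptions; Sew-Embed derives its weights from the semantically enriched Wikipedia representations:

```python
def concept_embedding(concept_weights, word_vecs):
    """Continuous representation of a concept as the weighted average of
    the embeddings of its associated words. Words missing from the
    embedding vocabulary are skipped; weights are renormalized over the
    words that remain."""
    dims = len(next(iter(word_vecs.values())))
    total = [0.0] * dims
    z = 0.0
    for word, w in concept_weights.items():
        if word in word_vecs:
            z += w
            for d in range(dims):
                total[d] += w * word_vecs[word][d]
    return [t / z for t in total] if z else total
```

Because the averaging happens in a shared embedding space, concepts from different languages become directly comparable, which is what enables the cross-lingual similarity results the abstract reports.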
Adversarial Training for Unsupervised Bilingual Lexicon Induction
Title | Adversarial Training for Unsupervised Bilingual Lexicon Induction |
Authors | Meng Zhang, Yang Liu, Huanbo Luan, Maosong Sun |
Abstract | Word embeddings are well known to capture linguistic regularities of the language on which they are trained. Researchers also observe that these regularities can transfer across languages. However, previous endeavors to connect separate monolingual word embeddings typically require cross-lingual signals as supervision, either in the form of parallel corpus or seed lexicon. In this work, we show that such cross-lingual connection can actually be established without any form of supervision. We achieve this end by formulating the problem as a natural adversarial game, and investigating techniques that are crucial to successful training. We carry out evaluation on the unsupervised bilingual lexicon induction task. Even though this task appears intrinsically cross-lingual, we are able to demonstrate encouraging performance without any cross-lingual clues. |
Tasks | Word Embeddings |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1179/ |
https://www.aclweb.org/anthology/P17-1179 | |
PWC | https://paperswithcode.com/paper/adversarial-training-for-unsupervised |
Repo | |
Framework | |
A Unified Maximum Likelihood Approach for Estimating Symmetric Properties of Discrete Distributions
Title | A Unified Maximum Likelihood Approach for Estimating Symmetric Properties of Discrete Distributions |
Authors | Jayadev Acharya, Hirakendu Das, Alon Orlitsky, Ananda Theertha Suresh |
Abstract | Symmetric distribution properties such as support size, support coverage, entropy, and proximity to uniformity, arise in many applications. Recently, researchers applied different estimators and analysis tools to derive asymptotically sample-optimal approximations for each of these properties. We show that a single, simple, plug-in estimator—profile maximum likelihood (PML)—is sample competitive for all symmetric properties, and in particular is asymptotically sample-optimal for all the above properties. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=811 |
http://proceedings.mlr.press/v70/acharya17a/acharya17a.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-maximum-likelihood-approach-for-1 |
Repo | |
Framework | |
Zipporah: a Fast and Scalable Data Cleaning System for Noisy Web-Crawled Parallel Corpora
Title | Zipporah: a Fast and Scalable Data Cleaning System for Noisy Web-Crawled Parallel Corpora |
Authors | Hainan Xu, Philipp Koehn |
Abstract | We introduce Zipporah, a fast and scalable data cleaning system. We propose a novel type of bag-of-words translation feature, and train logistic regression models to classify good data and synthetic noisy data in the proposed feature space. The trained model is used to score parallel sentences in the data pool for selection. As shown in experiments, Zipporah selects a high-quality parallel corpus from a large, mixed-quality data pool. In particular, for one noisy dataset, Zipporah achieves a 2.1 BLEU score improvement while using only 1/5 of the data, compared to using the entire corpus. |
Tasks | Language Modelling, Machine Translation, Outlier Detection |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1319/ |
https://www.aclweb.org/anthology/D17-1319 | |
PWC | https://paperswithcode.com/paper/zipporah-a-fast-and-scalable-data-cleaning |
Repo | |
Framework | |
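The bag-of-words translation feature idea can be illustrated with a toy adequacy score over a probabilistic dictionary. The `prob_table` dictionary, the probability floor, and the max-over-source heuristic are all hypothetical simplifications; Zipporah's actual features differ in detail:

```python
import math

def translation_score(src_tokens, tgt_tokens, prob_table, floor=1e-4):
    """Toy bag-of-words adequacy feature: for each target word, take the
    best translation probability offered by any source word (from a
    hypothetical dictionary prob_table mapping (src, tgt) -> p), floor it
    to avoid log(0), and average the negative log probabilities. Lower
    scores indicate better-matched sentence pairs."""
    if not tgt_tokens:
        return 0.0
    total = 0.0
    for t in tgt_tokens:
        p = max((prob_table.get((s, t), 0.0) for s in src_tokens), default=0.0)
        total += -math.log(max(p, floor))
    return total / len(tgt_tokens)
```

A feature like this, computed in both translation directions and combined with fluency features, is the kind of input the logistic regression classifier in the abstract would separate good pairs from noise on.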
Tensor Decomposition via Simultaneous Power Iteration
Title | Tensor Decomposition via Simultaneous Power Iteration |
Authors | Po-An Wang, Chi-Jen Lu |
Abstract | Tensor decomposition is an important problem with many applications across several disciplines, and a popular approach for this problem is the tensor power method. However, previous works with theoretical guarantee based on this approach can only find the top eigenvectors one after one, unlike the case for matrices. In this paper, we show how to find the eigenvectors simultaneously with the help of a new initialization procedure. This allows us to achieve a better running time in the batch setting, as well as a lower sample complexity in the streaming setting. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=560 |
http://proceedings.mlr.press/v70/wang17i/wang17i.pdf | |
PWC | https://paperswithcode.com/paper/tensor-decomposition-via-simultaneous-power |
Repo | |
Framework | |
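The simultaneous power iteration idea is easiest to see in the matrix case: orthogonal iteration recovers the top-k eigenvectors at once, rather than one after another with deflation. A pure-Python sketch for symmetric matrices; the tensor procedure in the paper generalizes this, and all names here are illustrative:

```python
import math
import random

def orthogonal_iteration(mat, k, iters=200, seed=0):
    """Simultaneous (orthogonal) power iteration on a symmetric matrix:
    repeatedly multiply a random n-by-k block by the matrix and
    re-orthonormalize its columns with Gram-Schmidt. The columns converge
    to the top-k eigenvectors (up to sign) when the eigenvalues are
    separated."""
    rng = random.Random(seed)
    n = len(mat)
    Q = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(n)]
    for _ in range(iters):
        # Z = mat @ Q
        Z = [[sum(mat[i][j] * Q[j][c] for j in range(n)) for c in range(k)]
             for i in range(n)]
        # Gram-Schmidt re-orthonormalization of the columns of Z
        for c in range(k):
            for p in range(c):
                dot = sum(Z[i][c] * Z[i][p] for i in range(n))
                for i in range(n):
                    Z[i][c] -= dot * Z[i][p]
            norm = math.sqrt(sum(Z[i][c] ** 2 for i in range(n)))
            for i in range(n):
                Z[i][c] /= norm
        Q = Z
    return Q
```

Updating all k directions per multiplication is what yields the batch running-time and streaming sample-complexity gains the abstract claims over one-at-a-time power iterations.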