May 5, 2019


Paper Group NANR 117



VectorWeavers at SemEval-2016 Task 10: From Incremental Meaning to Semantic Unit (phrase by phrase)

Title VectorWeavers at SemEval-2016 Task 10: From Incremental Meaning to Semantic Unit (phrase by phrase)
Authors Andreas Scherbakov, Ekaterina Vylomova, Fei Liu, Timothy Baldwin
Abstract
Tasks Word Embeddings
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1145/
PDF https://www.aclweb.org/anthology/S16-1145
PWC https://paperswithcode.com/paper/vectorweavers-at-semeval-2016-task-10-from
Repo
Framework

LitWay, Discriminative Extraction for Different Bio-Events

Title LitWay, Discriminative Extraction for Different Bio-Events
Authors Chen Li, Zhiqiang Rao, Xiangrong Zhang
Abstract
Tasks Relation Extraction
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-3004/
PDF https://www.aclweb.org/anthology/W16-3004
PWC https://paperswithcode.com/paper/litway-discriminative-extraction-for
Repo
Framework

Wasserstein Training of Restricted Boltzmann Machines

Title Wasserstein Training of Restricted Boltzmann Machines
Authors Grégoire Montavon, Klaus-Robert Müller, Marco Cuturi
Abstract Boltzmann machines are able to learn highly complex, multimodal, structured and multiscale real-world data distributions. Parameters of the model are usually learned by minimizing the Kullback-Leibler (KL) divergence from training samples to the learned model. We propose in this work a novel approach for Boltzmann machine training which assumes that a meaningful metric between observations is known. This metric between observations can then be used to define the Wasserstein distance between the distribution induced by the Boltzmann machine on the one hand, and that given by the training sample on the other hand. We derive a gradient of that distance with respect to the model parameters. Minimization of this new objective leads to generative models with different statistical properties. We demonstrate their practical potential on data completion and denoising, for which the metric between observations plays a crucial role.
Tasks Denoising
Published 2016-12-01
URL http://papers.nips.cc/paper/6248-wasserstein-training-of-restricted-boltzmann-machines
PDF http://papers.nips.cc/paper/6248-wasserstein-training-of-restricted-boltzmann-machines.pdf
PWC https://paperswithcode.com/paper/wasserstein-training-of-restricted-boltzmann
Repo
Framework
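
The abstract above describes using a known metric between observations to define a Wasserstein distance between the model distribution and the training sample. The following is a minimal sketch of that idea only, not the authors' training procedure: it assumes Hamming distance as the metric between binary observations and approximates the transport cost with entropy-regularized Sinkhorn iterations. The function names, the `reg` and `iters` parameters, and the toy data are all hypothetical.

```python
import math


def hamming(x, y):
    """Number of positions where two equal-length binary tuples differ."""
    return sum(a != b for a, b in zip(x, y))


def sinkhorn_cost(p, q, cost, reg=0.1, iters=200):
    """Approximate transport cost between histograms p and q.

    p[i] is the mass on the i-th model state, q[j] the mass on the j-th
    data point, and cost[i][j] the metric between them. Entropic
    regularization (reg) makes this an approximation of the exact
    Wasserstein distance, not the exact value.
    """
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u = [1.0] * len(p)
    v = [1.0] * len(q)
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [p[i] / sum(kernel[i][j] * v[j] for j in range(len(q)))
             for i in range(len(p))]
        v = [q[j] / sum(kernel[i][j] * u[i] for i in range(len(p)))
             for j in range(len(q))]
    # Cost of the resulting transport plan diag(u) * kernel * diag(v).
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(len(p)) for j in range(len(q)))


# Toy example (assumed data): two model states, two training points.
model_states = [(0, 0, 1), (1, 1, 0)]
data_points = [(0, 0, 0), (1, 1, 1)]
cost_matrix = [[hamming(m, d) for d in data_points] for m in model_states]
distance = sinkhorn_cost([0.5, 0.5], [0.5, 0.5], cost_matrix)
```

In this toy case each model state sits one bit flip away from its nearest training point, so the approximate distance comes out close to 1. A KL-based objective would not reflect that closeness, because the two distributions share no support.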

Cross-lingual Pronoun Prediction with Linguistically Informed Features

Title Cross-lingual Pronoun Prediction with Linguistically Informed Features
Authors Rachel Bawden
Abstract
Tasks Coreference Resolution, Language Modelling, Machine Translation, Word Alignment
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2348/
PDF https://www.aclweb.org/anthology/W16-2348
PWC https://paperswithcode.com/paper/cross-lingual-pronoun-prediction-with
Repo
Framework

Feature-distributed sparse regression: a screen-and-clean approach

Title Feature-distributed sparse regression: a screen-and-clean approach
Authors Jiyan Yang, Michael W. Mahoney, Michael Saunders, Yuekai Sun
Abstract Most existing approaches to distributed sparse regression assume the data is partitioned by samples. However, for high-dimensional data (D ≫ N), it is more natural to partition the data by features. We propose an algorithm for distributed sparse regression when the data is partitioned by features rather than samples. Our approach allows the user to tailor our general method to various distributed computing platforms by trading off the total amount of data (in bits) sent over the communication network and the number of rounds of communication. We show that an implementation of our approach is capable of solving L1-regularized L2 regression problems with millions of features in minutes.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6187-feature-distributed-sparse-regression-a-screen-and-clean-approach
PDF http://papers.nips.cc/paper/6187-feature-distributed-sparse-regression-a-screen-and-clean-approach.pdf
PWC https://paperswithcode.com/paper/feature-distributed-sparse-regression-a
Repo
Framework
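
The "L1-regularized L2 regression" named in the abstract above is the lasso objective. As a point of reference, here is a minimal single-machine sketch that solves it by coordinate descent with soft-thresholding. This is a standard textbook solver, not the paper's feature-distributed screen-and-clean algorithm; the function names and toy data are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal step for the L1 term)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=100):
    """Minimize (1 / (2n)) * ||y - Xw||^2 + lam * ||w||_1 over w.

    X is a list of n rows with d features each; y has n targets.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(sweeps):
        for j in range(d):
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            if z == 0.0:
                continue  # an all-zero column cannot change the fit
            # Correlation of feature j with the residual that excludes it.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j])
                      for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / z
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Toy data (assumed): feature 0 is informative, feature 1 is nearly noise.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
weights = lasso_coordinate_descent(X, y, lam=0.1)
```

On this toy data the L1 penalty sets the weight of the weak second feature to exactly zero and shrinks the first, which is the sparsity behavior that makes screening out features plausible in the high-dimensional setting.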

Does String-Based Neural MT Learn Source Syntax?

Title Does String-Based Neural MT Learn Source Syntax?
Authors Xing Shi, Inkit Padhi, Kevin Knight
Abstract
Tasks Machine Translation
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1159/
PDF https://www.aclweb.org/anthology/D16-1159
PWC https://paperswithcode.com/paper/does-string-based-neural-mt-learn-source
Repo
Framework

Evaluating Embeddings using Syntax-based Classification Tasks as a Proxy for Parser Performance

Title Evaluating Embeddings using Syntax-based Classification Tasks as a Proxy for Parser Performance
Authors Arne Köhn
Abstract
Tasks Dependency Parsing
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2512/
PDF https://www.aclweb.org/anthology/W16-2512
PWC https://paperswithcode.com/paper/evaluating-embeddings-using-syntax-based
Repo
Framework

Overview of the 2016 ALTA Shared Task: Cross-KB Coreference

Title Overview of the 2016 ALTA Shared Task: Cross-KB Coreference
Authors Andrew Chisholm, Ben Hachey, Diego Mollá
Abstract
Tasks Coreference Resolution
Published 2016-12-01
URL https://www.aclweb.org/anthology/U16-1020/
PDF https://www.aclweb.org/anthology/U16-1020
PWC https://paperswithcode.com/paper/overview-of-the-2016-alta-shared-task-cross
Repo
Framework

融合多任務學習類神經網路聲學模型訓練於會議語音辨識之研究(Leveraging Multi-task Learning with Neural Network Based Acoustic Modeling for Improved Meeting Speech Recognition) [In Chinese]

Title 融合多任務學習類神經網路聲學模型訓練於會議語音辨識之研究(Leveraging Multi-task Learning with Neural Network Based Acoustic Modeling for Improved Meeting Speech Recognition) [In Chinese]
Authors Ming-Han Yang, Yao-Chi Hsu, Hsiao-Tsung Hung, Ying-Wen Chen, Berlin Chen, Kuan-Yu Chen
Abstract
Tasks Multi-Task Learning, Speech Recognition
Published 2016-10-01
URL https://www.aclweb.org/anthology/O16-1002/
PDF https://www.aclweb.org/anthology/O16-1002
PWC https://paperswithcode.com/paper/eaaaaa-ceccc2e-e2a-ae-c-14eeae3e34-ea1c
Repo
Framework

The on-line version of Grammatical Dictionary of Polish

Title The on-line version of Grammatical Dictionary of Polish
Authors Marcin Woliński, Witold Kieraś
Abstract We present the new online edition of a dictionary of Polish inflection, the Grammatical Dictionary of Polish (http://sgjp.pl). The dictionary is interesting for several reasons: it is comprehensive (over 330,000 lexemes corresponding to almost 4,300,000 different textual words; 1116 handcrafted inflectional patterns), the inflection is presented in an explicit manner in the form of carefully designed tables, the user interface facilitates advanced queries by several features (lemmas, forms, applicable grammatical categories, types of inflection). Moreover, the data of the dictionary is used in morphological analysers, including our product Morfeusz (http://sgjp.pl/morfeusz). From the start, the dictionary was meant to be comfortable for the human reader as well as to be ready for use in NLP applications. In the paper we briefly discuss both aspects of the resource.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1412/
PDF https://www.aclweb.org/anthology/L16-1412
PWC https://paperswithcode.com/paper/the-on-line-version-of-grammatical-dictionary
Repo
Framework

Metaphor as a Medium for Emotion: An Empirical Study

Title Metaphor as a Medium for Emotion: An Empirical Study
Authors Saif Mohammad, Ekaterina Shutova, Peter Turney
Abstract
Tasks
Published 2016-08-01
URL https://www.aclweb.org/anthology/S16-2003/
PDF https://www.aclweb.org/anthology/S16-2003
PWC https://paperswithcode.com/paper/metaphor-as-a-medium-for-emotion-an-empirical
Repo
Framework

Extracting Social Networks from Literary Text with Word Embedding Tools

Title Extracting Social Networks from Literary Text with Word Embedding Tools
Authors Gerhard Wohlgenannt, Ekaterina Chernyak, Dmitry Ilvovsky
Abstract In this paper a social network is extracted from a literary text. The social network shows how frequently the characters interact and how similar their social behavior is. Two types of similarity measures are used: the first applies co-occurrence statistics, while the second exploits cosine similarity on different types of word embedding vectors. The results are evaluated by a paid micro-task crowdsourcing survey. The experiments suggest that specific types of word embeddings like word2vec are well-suited for the task at hand and the specific circumstances of literary fiction text.
Tasks Language Modelling, Word Embeddings
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4004/
PDF https://www.aclweb.org/anthology/W16-4004
PWC https://paperswithcode.com/paper/extracting-social-networks-from-literary-text
Repo
Framework
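
The two similarity measures named in the abstract above, co-occurrence statistics and cosine similarity on embedding vectors, can be illustrated with a small sketch. It is not the authors' pipeline: the whitespace tokenization, the sentence-level co-occurrence window, the function names, and the toy sentences are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences, characters):
    """Count how often each pair of characters appears in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        present = sorted(c for c in characters if c.lower() in tokens)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Toy text (assumed), already lowercased and free of punctuation.
sentences = [
    "alice met bob",
    "bob wrote to carol",
    "alice and bob argued",
]
pair_counts = cooccurrence_counts(sentences, ["alice", "bob", "carol"])
```

With real data, the vectors passed to `cosine_similarity` would come from a trained embedding model such as word2vec, looked up by character name. The co-occurrence counts then give edge weights for interaction frequency, and the cosine scores give a second, embedding-based notion of similarity between characters.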

New release of Mixer-6: Improved validity for phonetic study of speaker variation and identification

Title New release of Mixer-6: Improved validity for phonetic study of speaker variation and identification
Authors Eleanor Chodroff, Matthew Maciejewski, Jan Trmal, Sanjeev Khudanpur, John Godfrey
Abstract The Mixer series of speech corpora were collected over several years, principally to support annual NIST evaluations of speaker recognition (SR) technologies. These evaluations focused on conversational speech over a variety of channels and recording conditions. One of the series, Mixer-6, added a new condition, read speech, to support basic scientific research on speaker characteristics, as well as technology evaluation. With read speech it is possible to make relatively precise measurements of phonetic events and features, which can be correlated with the performance of speaker recognition algorithms, or directly used in phonetic analysis of speaker variability. The read speech, as originally recorded, was adequate for large-scale evaluations (e.g., fixed-text speaker ID algorithms) but only marginally suitable for acoustic-phonetic studies. Numerous errors due largely to speaker behavior remained in the corpus, with no record of their locations or rate of occurrence. We undertook the effort to correct this situation with automatic methods supplemented by human listening and annotation. The present paper describes the tools and methods, resulting corrections, and some examples of the kinds of research studies enabled by these enhancements.
Tasks Speaker Recognition
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1210/
PDF https://www.aclweb.org/anthology/L16-1210
PWC https://paperswithcode.com/paper/new-release-of-mixer-6-improved-validity-for
Repo
Framework

Arabic Corpora for Credibility Analysis

Title Arabic Corpora for Credibility Analysis
Authors Ayman Al Zaatari, Rim El Ballouli, Shady ELbassouni, Wassim El-Hajj, Hazem Hajj, Khaled Shaban, Nizar Habash, Emad Yahya
Abstract A significant portion of data generated on blogging and microblogging websites is non-credible, as shown in many recent studies. To filter out such non-credible information, machine learning can be deployed to build automatic credibility classifiers. However, as is the case with most supervised machine learning approaches, sufficiently large and accurate training data must be available. In this paper, we focus on building a public Arabic corpus of blogs and microblogs that can be used for credibility classification. We focus on Arabic due to the recent popularity of blogs and microblogs in the Arab World and due to the lack of any such public corpora in Arabic. We discuss our data acquisition approach and annotation process, provide rigorous analysis of the annotated data, and finally report some results on the effectiveness of our data for credibility classification.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1696/
PDF https://www.aclweb.org/anthology/L16-1696
PWC https://paperswithcode.com/paper/arabic-corpora-for-credibility-analysis
Repo
Framework

Easy Questions First? A Case Study on Curriculum Learning for Question Answering

Title Easy Questions First? A Case Study on Curriculum Learning for Question Answering
Authors Mrinmaya Sachan, Eric Xing
Abstract
Tasks Active Learning, Question Answering
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-1043/
PDF https://www.aclweb.org/anthology/P16-1043
PWC https://paperswithcode.com/paper/easy-questions-first-a-case-study-on
Repo
Framework