May 5, 2019

Paper Group NANR 117

VectorWeavers at SemEval-2016 Task 10: From Incremental Meaning to Semantic Unit (phrase by phrase)


Title	VectorWeavers at SemEval-2016 Task 10: From Incremental Meaning to Semantic Unit (phrase by phrase)
Authors	Andreas Scherbakov, Ekaterina Vylomova, Fei Liu, Timothy Baldwin
Abstract
Tasks	Word Embeddings
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1145/
PDF	https://www.aclweb.org/anthology/S16-1145
PWC	https://paperswithcode.com/paper/vectorweavers-at-semeval-2016-task-10-from
Repo
Framework

LitWay, Discriminative Extraction for Different Bio-Events


Title	LitWay, Discriminative Extraction for Different Bio-Events
Authors	Chen Li, Zhiqiang Rao, Xiangrong Zhang
Abstract
Tasks	Relation Extraction
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-3004/
PDF	https://www.aclweb.org/anthology/W16-3004
PWC	https://paperswithcode.com/paper/litway-discriminative-extraction-for
Repo
Framework

Wasserstein Training of Restricted Boltzmann Machines


Title	Wasserstein Training of Restricted Boltzmann Machines
Authors	Grégoire Montavon, Klaus-Robert Müller, Marco Cuturi
Abstract	Boltzmann machines are able to learn highly complex, multimodal, structured and multiscale real-world data distributions. Parameters of the model are usually learned by minimizing the Kullback-Leibler (KL) divergence from training samples to the learned model. We propose in this work a novel approach for Boltzmann machine training which assumes that a meaningful metric between observations is known. This metric between observations can then be used to define the Wasserstein distance between the distribution induced by the Boltzmann machine on the one hand, and that given by the training sample on the other hand. We derive a gradient of that distance with respect to the model parameters. Minimization of this new objective leads to generative models with different statistical properties. We demonstrate their practical potential on data completion and denoising, for which the metric between observations plays a crucial role.
Tasks	Denoising
Published	2016-12-01
URL	http://papers.nips.cc/paper/6248-wasserstein-training-of-restricted-boltzmann-machines
PDF	http://papers.nips.cc/paper/6248-wasserstein-training-of-restricted-boltzmann-machines.pdf
PWC	https://paperswithcode.com/paper/wasserstein-training-of-restricted-boltzmann
Repo
Framework
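
To make the role of the metric in the abstract above concrete, here is a minimal sketch of a Wasserstein distance in the simplest possible setting: two histograms on the same evenly spaced 1-D grid, with |x - y| as the assumed ground metric. It is not the training procedure from the paper (no Boltzmann machine and no gradient are involved), and the function name `wasserstein_1d` and the toy histograms are illustrative assumptions.

```python
from itertools import accumulate


def wasserstein_1d(p, q, spacing=1.0):
    """Wasserstein-1 distance between two histograms on one evenly spaced grid.

    Assumes p and q each sum to 1 and that the ground metric is |x - y|.
    In one dimension the distance equals the area between the two CDFs.
    """
    if len(p) != len(q):
        raise ValueError("histograms must share the same grid")
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))
    return spacing * sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


# Toy data: all mass in bin 0, compared with mass moved one bin or three bins away.
data = [1.0, 0.0, 0.0, 0.0]
near = [0.0, 1.0, 0.0, 0.0]
far = [0.0, 0.0, 0.0, 1.0]

print(wasserstein_1d(data, near))  # mass moved by one bin
print(wasserstein_1d(data, far))   # mass moved by three bins
```

Both `near` and `far` have no overlap with `data`, so a KL-style comparison treats them alike, whereas the distance above grows with how far the mass has to travel under the chosen metric.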

Cross-lingual Pronoun Prediction with Linguistically Informed Features


Title	Cross-lingual Pronoun Prediction with Linguistically Informed Features
Authors	Rachel Bawden
Abstract
Tasks	Coreference Resolution, Language Modelling, Machine Translation, Word Alignment
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2348/
PDF	https://www.aclweb.org/anthology/W16-2348
PWC	https://paperswithcode.com/paper/cross-lingual-pronoun-prediction-with
Repo
Framework

Feature-distributed sparse regression: a screen-and-clean approach


Title	Feature-distributed sparse regression: a screen-and-clean approach
Authors	Jiyan Yang, Michael W. Mahoney, Michael Saunders, Yuekai Sun
Abstract	Most existing approaches to distributed sparse regression assume the data is partitioned by samples. However, for high-dimensional data (D ≫ N), it is more natural to partition the data by features. We propose an algorithm for distributed sparse regression when the data is partitioned by features rather than samples. Our approach allows the user to tailor our general method to various distributed computing platforms by trading off the total amount of data (in bits) sent over the communication network against the number of rounds of communication. We show that an implementation of our approach is capable of solving L1-regularized L2 regression problems with millions of features in minutes.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6187-feature-distributed-sparse-regression-a-screen-and-clean-approach
PDF	http://papers.nips.cc/paper/6187-feature-distributed-sparse-regression-a-screen-and-clean-approach.pdf
PWC	https://paperswithcode.com/paper/feature-distributed-sparse-regression-a
Repo
Framework
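
For readers unfamiliar with the problem class named in the abstract above, the following is a minimal single-machine sketch of L1-regularized L2 regression (the lasso), solved by plain coordinate descent. It is not the screen-and-clean algorithm from the paper and involves no distribution or communication; the objective scaling (1/2)·||y - Xb||² + lam·||b||₁, the function names, and the toy data are assumptions chosen for illustration. The design matrix is stored column by column only to echo the feature-wise view of the data.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(columns, y, lam, sweeps=100):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    columns: list of feature columns, each a list with len(y) entries.
    """
    beta = [0.0] * len(columns)
    residual = list(y)  # equals y - X @ beta, and beta starts at zero
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            norm_sq = sum(v * v for v in col)
            if norm_sq == 0.0:
                continue  # an all-zero feature keeps a zero coefficient
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(c * r for c, r in zip(col, residual)) + norm_sq * beta[j]
            new_beta = soft_threshold(rho, lam) / norm_sq
            delta = new_beta - beta[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                beta[j] = new_beta
    return beta


# Toy problem with orthonormal features, where the solution is soft-thresholding of y.
columns = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, 0.5, -2.0]
beta = lasso_coordinate_descent(columns, y, lam=1.0)
print(beta)
```

The middle coefficient is driven exactly to zero, which is the sparsity effect the L1 penalty is used for.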

Does String-Based Neural MT Learn Source Syntax?


Title	Does String-Based Neural MT Learn Source Syntax?
Authors	Xing Shi, Inkit Padhi, Kevin Knight
Abstract
Tasks	Machine Translation
Published	2016-11-01
URL	https://www.aclweb.org/anthology/D16-1159/
PDF	https://www.aclweb.org/anthology/D16-1159
PWC	https://paperswithcode.com/paper/does-string-based-neural-mt-learn-source
Repo
Framework

Evaluating Embeddings using Syntax-based Classification Tasks as a Proxy for Parser Performance


Title	Evaluating Embeddings using Syntax-based Classification Tasks as a Proxy for Parser Performance
Authors	Arne Köhn
Abstract
Tasks	Dependency Parsing
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2512/
PDF	https://www.aclweb.org/anthology/W16-2512
PWC	https://paperswithcode.com/paper/evaluating-embeddings-using-syntax-based
Repo
Framework

Overview of the 2016 ALTA Shared Task: Cross-KB Coreference


Title	Overview of the 2016 ALTA Shared Task: Cross-KB Coreference
Authors	Andrew Chisholm, Ben Hachey, Diego Mollá
Abstract
Tasks	Coreference Resolution
Published	2016-12-01
URL	https://www.aclweb.org/anthology/U16-1020/
PDF	https://www.aclweb.org/anthology/U16-1020
PWC	https://paperswithcode.com/paper/overview-of-the-2016-alta-shared-task-cross
Repo
Framework

融合多任務學習類神經網路聲學模型訓練於會議語音辨識之研究(Leveraging Multi-task Learning with Neural Network Based Acoustic Modeling for Improved Meeting Speech Recognition) [In Chinese]


Title	融合多任務學習類神經網路聲學模型訓練於會議語音辨識之研究(Leveraging Multi-task Learning with Neural Network Based Acoustic Modeling for Improved Meeting Speech Recognition) [In Chinese]
Authors	Ming-Han Yang, Yao-Chi Hsu, Hsiao-Tsung Hung, Ying-Wen Chen, Berlin Chen, Kuan-Yu Chen
Abstract
Tasks	Multi-Task Learning, Speech Recognition
Published	2016-10-01
URL	https://www.aclweb.org/anthology/O16-1002/
PDF	https://www.aclweb.org/anthology/O16-1002
PWC	https://paperswithcode.com/paper/eaaaaa-ceccc2e-e2a-ae-c-14eeae3e34-ea1c
Repo
Framework

The on-line version of Grammatical Dictionary of Polish


Title	The on-line version of Grammatical Dictionary of Polish
Authors	Marcin Woliński, Witold Kieraś
Abstract	We present the new online edition of a dictionary of Polish inflection, the Grammatical Dictionary of Polish (http://sgjp.pl). The dictionary is interesting for several reasons: it is comprehensive (over 330,000 lexemes corresponding to almost 4,300,000 different textual words; 1116 handcrafted inflectional patterns), the inflection is presented in an explicit manner in the form of carefully designed tables, the user interface facilitates advanced queries by several features (lemmas, forms, applicable grammatical categories, types of inflection). Moreover, the data of the dictionary is used in morphological analysers, including our product Morfeusz (http://sgjp.pl/morfeusz). From the start, the dictionary was meant to be comfortable for the human reader as well as to be ready for use in NLP applications. In the paper we briefly discuss both aspects of the resource.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1412/
PDF	https://www.aclweb.org/anthology/L16-1412
PWC	https://paperswithcode.com/paper/the-on-line-version-of-grammatical-dictionary
Repo
Framework

Metaphor as a Medium for Emotion: An Empirical Study


Title	Metaphor as a Medium for Emotion: An Empirical Study
Authors	Saif Mohammad, Ekaterina Shutova, Peter Turney
Abstract
Tasks
Published	2016-08-01
URL	https://www.aclweb.org/anthology/S16-2003/
PDF	https://www.aclweb.org/anthology/S16-2003
PWC	https://paperswithcode.com/paper/metaphor-as-a-medium-for-emotion-an-empirical
Repo
Framework

Extracting Social Networks from Literary Text with Word Embedding Tools


Title	Extracting Social Networks from Literary Text with Word Embedding Tools
Authors	Gerhard Wohlgenannt, Ekaterina Chernyak, Dmitry Ilvovsky
Abstract	In this paper a social network is extracted from a literary text. The social network shows how frequently the characters interact and how similar their social behavior is. Two types of similarity measures are used: the first applies co-occurrence statistics, while the second exploits cosine similarity on different types of word embedding vectors. The results are evaluated by a paid micro-task crowdsourcing survey. The experiments suggest that specific types of word embeddings like word2vec are well-suited for the task at hand and the specific circumstances of literary fiction text.
Tasks	Language Modelling, Word Embeddings
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4004/
PDF	https://www.aclweb.org/anthology/W16-4004
PWC	https://paperswithcode.com/paper/extracting-social-networks-from-literary-text
Repo
Framework
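
The two kinds of similarity measure mentioned in the abstract above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' pipeline: the character names, the sentences, and the two-dimensional vectors are made up, and real word embeddings (for example from word2vec) would have many more dimensions.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cooccurrence_counts(sentences, characters):
    """Count, for each pair of characters, the sentences in which both appear."""
    wanted = set(characters)
    counts = Counter()
    for tokens in sentences:
        present = sorted(set(tokens) & wanted)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Hypothetical tokenized sentences and character names.
sentences = [
    ["alice", "met", "bob"],
    ["bob", "saw", "alice"],
    ["carol", "left"],
]
counts = cooccurrence_counts(sentences, ["alice", "bob", "carol"])
print(counts[("alice", "bob")])  # number of sentences mentioning both

# Hypothetical 2-D stand-ins for embedding vectors.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Either measure yields a weight for each pair of characters, which can then serve as an edge weight in the extracted network.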

New release of Mixer-6: Improved validity for phonetic study of speaker variation and identification


Title	New release of Mixer-6: Improved validity for phonetic study of speaker variation and identification
Authors	Eleanor Chodroff, Matthew Maciejewski, Jan Trmal, Sanjeev Khudanpur, John Godfrey
Abstract	The Mixer series of speech corpora were collected over several years, principally to support annual NIST evaluations of speaker recognition (SR) technologies. These evaluations focused on conversational speech over a variety of channels and recording conditions. One of the series, Mixer-6, added a new condition, read speech, to support basic scientific research on speaker characteristics, as well as technology evaluation. With read speech it is possible to make relatively precise measurements of phonetic events and features, which can be correlated with the performance of speaker recognition algorithms, or directly used in phonetic analysis of speaker variability. The read speech, as originally recorded, was adequate for large-scale evaluations (e.g., fixed-text speaker ID algorithms) but only marginally suitable for acoustic-phonetic studies. Numerous errors due largely to speaker behavior remained in the corpus, with no record of their locations or rate of occurrence. We undertook the effort to correct this situation with automatic methods supplemented by human listening and annotation. The present paper describes the tools and methods, resulting corrections, and some examples of the kinds of research studies enabled by these enhancements.
Tasks	Speaker Recognition
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1210/
PDF	https://www.aclweb.org/anthology/L16-1210
PWC	https://paperswithcode.com/paper/new-release-of-mixer-6-improved-validity-for
Repo
Framework

Arabic Corpora for Credibility Analysis


Title	Arabic Corpora for Credibility Analysis
Authors	Ayman Al Zaatari, Rim El Ballouli, Shady ELbassouni, Wassim El-Hajj, Hazem Hajj, Khaled Shaban, Nizar Habash, Emad Yahya
Abstract	A significant portion of data generated on blogging and microblogging websites is non-credible, as shown in many recent studies. To filter out such non-credible information, machine learning can be deployed to build automatic credibility classifiers. However, as is the case with most supervised machine learning approaches, sufficiently large and accurate training data must be available. In this paper, we focus on building a public Arabic corpus of blogs and microblogs that can be used for credibility classification. We focus on Arabic due to the recent popularity of blogs and microblogs in the Arab World and due to the lack of any such public corpora in Arabic. We discuss our data acquisition approach and annotation process, provide a rigorous analysis of the annotated data, and finally report some results on the effectiveness of our data for credibility classification.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1696/
PDF	https://www.aclweb.org/anthology/L16-1696
PWC	https://paperswithcode.com/paper/arabic-corpora-for-credibility-analysis
Repo
Framework

Easy Questions First? A Case Study on Curriculum Learning for Question Answering


Title	Easy Questions First? A Case Study on Curriculum Learning for Question Answering
Authors	Mrinmaya Sachan, Eric Xing
Abstract
Tasks	Active Learning, Question Answering
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-1043/
PDF	https://www.aclweb.org/anthology/P16-1043
PWC	https://paperswithcode.com/paper/easy-questions-first-a-case-study-on
Repo
Framework