May 4, 2019


Paper Group NANR 201


Mistake Bounds for Binary Matrix Completion


Title	Mistake Bounds for Binary Matrix Completion
Authors	Mark Herbster, Stephen Pasteris, Massimiliano Pontil
Abstract	We study the problem of completing a binary matrix in an online learning setting. On each trial we predict a matrix entry and then receive the true entry. We propose a Matrix Exponentiated Gradient algorithm [1] to solve this problem. We provide a mistake bound for the algorithm, which scales with the margin complexity [2, 3] of the underlying matrix. The bound suggests an interpretation where each row of the matrix is a prediction task over a finite set of objects, the columns. Using this we show that the algorithm makes a number of mistakes which is comparable up to a logarithmic factor to the number of mistakes made by the Kernel Perceptron with an optimal kernel in hindsight. We discuss applications of the algorithm to predicting as well as the best biclustering and to the problem of predicting the labeling of a graph without knowing the graph in advance.
Tasks	Matrix Completion
Published	2016-12-01
URL	http://papers.nips.cc/paper/6567-mistake-bounds-for-binary-matrix-completion
PDF	http://papers.nips.cc/paper/6567-mistake-bounds-for-binary-matrix-completion.pdf
PWC	https://paperswithcode.com/paper/mistake-bounds-for-binary-matrix-completion
Repo
Framework
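
The online setting described in the abstract above (predict an entry, then receive the true entry) can be sketched as a simple loop. This is only an illustration of the protocol and of mistake counting, not the paper's Matrix Exponentiated Gradient algorithm: the `MemorizingLearner` class, its default guess of +1, and the function name `run_online_protocol` are all hypothetical placeholders.

```python
def run_online_protocol(matrix, trials, learner):
    """Run the predict-then-reveal protocol and count mistakes.

    matrix  : list of lists with entries in {-1, +1} (the hidden binary matrix)
    trials  : sequence of (row, column) index pairs, one per trial
    learner : any object with predict(i, j) and update(i, j, truth)
    """
    mistakes = 0
    for i, j in trials:
        guess = learner.predict(i, j)   # learner commits to a prediction
        truth = matrix[i][j]            # the true entry is then revealed
        if guess != truth:
            mistakes += 1
        learner.update(i, j, truth)     # learner may adapt after feedback
    return mistakes


class MemorizingLearner:
    """Placeholder learner (an assumption, not from the paper):
    recalls entries it has already seen and guesses +1 otherwise."""

    def __init__(self):
        self.seen = {}

    def predict(self, i, j):
        return self.seen.get((i, j), 1)

    def update(self, i, j, truth):
        self.seen[(i, j)] = truth


hidden = [[1, -1], [-1, 1]]
order = [(0, 0), (0, 1), (0, 1), (1, 0)]
total_mistakes = run_online_protocol(hidden, order, MemorizingLearner())
```

A mistake bound, as studied in the paper, is an upper limit on the value this loop returns for a given learner, regardless of the order of the trials.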

Building a learner corpus for Russian


Title	Building a learner corpus for Russian
Authors	Ekaterina Rakhilina, Anastasia Vyrenkova, Elmira Mustakimova, Alina Ladygina, Ivan Smirnov
Abstract
Tasks	Language Acquisition, Language Identification
Published	2016-11-01
URL	https://www.aclweb.org/anthology/W16-6509/
PDF	https://www.aclweb.org/anthology/W16-6509
PWC	https://paperswithcode.com/paper/building-a-learner-corpus-for-russian
Repo
Framework

Dependency Based Embeddings for Sentence Classification Tasks


Title	Dependency Based Embeddings for Sentence Classification Tasks
Authors	Alexandros Komninos, Suresh Manandhar
Abstract
Tasks	Chunking, Learning Word Embeddings, Named Entity Recognition, Relation Classification, Sentence Classification, Sentiment Analysis, Word Embeddings
Published	2016-06-01
URL	https://www.aclweb.org/anthology/N16-1175/
PDF	https://www.aclweb.org/anthology/N16-1175
PWC	https://paperswithcode.com/paper/dependency-based-embeddings-for-sentence
Repo
Framework

Improving Sequence to Sequence Learning for Morphological Inflection Generation: The BIU-MIT Systems for the SIGMORPHON 2016 Shared Task for Morphological Reinflection


Title	Improving Sequence to Sequence Learning for Morphological Inflection Generation: The BIU-MIT Systems for the SIGMORPHON 2016 Shared Task for Morphological Reinflection
Authors	Roee Aharoni, Yoav Goldberg, Yonatan Belinkov
Abstract
Tasks	Feature Engineering, Machine Translation, Morphological Inflection
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2007/
PDF	https://www.aclweb.org/anthology/W16-2007
PWC	https://paperswithcode.com/paper/improving-sequence-to-sequence-learning-for
Repo
Framework

Proceedings of the 1st Workshop on Representation Learning for NLP


Title	Proceedings of the 1st Workshop on Representation Learning for NLP
Authors
Abstract
Tasks	Representation Learning
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-1600/
PDF	https://www.aclweb.org/anthology/W16-1600
PWC	https://paperswithcode.com/paper/proceedings-of-the-1st-workshop-on
Repo
Framework

Detecting Grammatical Errors in Machine Translation Output Using Dependency Parsing and Treebank Querying


Title	Detecting Grammatical Errors in Machine Translation Output Using Dependency Parsing and Treebank Querying
Authors	Arda Tezcan, Veronique Hoste, Lieve Macken
Abstract
Tasks	Dependency Parsing, Machine Translation
Published	2016-01-01
URL	https://www.aclweb.org/anthology/W16-3409/
PDF	https://www.aclweb.org/anthology/W16-3409
PWC	https://paperswithcode.com/paper/detecting-grammatical-errors-in-machine
Repo
Framework

Graphonological Levenshtein Edit Distance: Application for Automated Cognate Identification


Title	Graphonological Levenshtein Edit Distance: Application for Automated Cognate Identification
Authors	Bogdan Babych
Abstract
Tasks	Transliteration
Published	2016-01-01
URL	https://www.aclweb.org/anthology/W16-3402/
PDF	https://www.aclweb.org/anthology/W16-3402
PWC	https://paperswithcode.com/paper/graphonological-levenshtein-edit-distance
Repo
Framework

A Graphical Pronoun Analysis Tool for the PROTEST Pronoun Evaluation Test Suite


Title	A Graphical Pronoun Analysis Tool for the PROTEST Pronoun Evaluation Test Suite
Authors	Christian Hardmeier, Liane Guillou
Abstract
Tasks	Machine Translation
Published	2016-01-01
URL	https://www.aclweb.org/anthology/W16-3418/
PDF	https://www.aclweb.org/anthology/W16-3418
PWC	https://paperswithcode.com/paper/a-graphical-pronoun-analysis-tool-for-the
Repo
Framework

Automated scalable segmentation of neurons from multispectral images


Title	Automated scalable segmentation of neurons from multispectral images
Authors	Uygar Sümbül, Douglas Roossien, Dawen Cai, Fei Chen, Nicholas Barry, John P. Cunningham, Edward Boyden, Liam Paninski
Abstract	Reconstruction of neuroanatomy is a fundamental problem in neuroscience. Stochastic expression of colors in individual cells is a promising tool, although its use in the nervous system has been limited due to various sources of variability in expression. Moreover, the intermingled anatomy of neuronal trees is challenging for existing segmentation algorithms. Here, we propose a method to automate the segmentation of neurons in such (potentially pseudo-colored) images. The method uses spatio-color relations between the voxels, generates supervoxels to reduce the problem size by four orders of magnitude before the final segmentation, and is parallelizable over the supervoxels. To quantify performance and gain insight, we generate simulated images, where the noise level and characteristics, the density of expression, and the number of fluorophore types are variable. We also present segmentations of real Brainbow images of the mouse hippocampus, which reveal many of the dendritic segments.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6549-automated-scalable-segmentation-of-neurons-from-multispectral-images
PDF	http://papers.nips.cc/paper/6549-automated-scalable-segmentation-of-neurons-from-multispectral-images.pdf
PWC	https://paperswithcode.com/paper/automated-scalable-segmentation-of-neurons
Repo
Framework

Relation- and Phrase-level Linking of FrameNet with Sar-graphs


Title	Relation- and Phrase-level Linking of FrameNet with Sar-graphs
Authors	Aleksandra Gabryszak, Sebastian Krause, Leonhard Hennig, Feiyu Xu, Hans Uszkoreit
Abstract	Recent research shows the importance of linking linguistic knowledge resources for the creation of large-scale linguistic data. We describe our approach for combining two English resources, FrameNet and sar-graphs, and illustrate the benefits of the linked data in a relation extraction setting. While FrameNet consists of schematic representations of situations, linked to lexemes and their valency patterns, sar-graphs are knowledge resources that connect semantic relations from factual knowledge graphs to the linguistic phrases used to express instances of these relations. We analyze the conceptual similarities and differences of both resources and propose to link sar-graphs and FrameNet on the levels of relations/frames as well as phrases. The former alignment involves a manual ontology mapping step, which allows us to extend sar-graphs with new phrase patterns from FrameNet. The phrase-level linking, on the other hand, is fully automatic. We investigate the quality of the automatically constructed links and identify two main classes of errors.
Tasks	Knowledge Graphs, Relation Extraction
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1383/
PDF	https://www.aclweb.org/anthology/L16-1383
PWC	https://paperswithcode.com/paper/relation-and-phrase-level-linking-of-framenet
Repo
Framework

Multi-step learning and underlying structure in statistical models


Title	Multi-step learning and underlying structure in statistical models
Authors	Maia Fraser
Abstract	In multi-step learning, where a final learning task is accomplished via a sequence of intermediate learning tasks, the intuition is that successive steps or levels transform the initial data into representations more and more "suited" to the final learning task. A related principle arises in transfer-learning where Baxter (2000) proposed a theoretical framework to study how learning multiple tasks transforms the inductive bias of a learner. The most widespread multi-step learning approach is semi-supervised learning with two steps: unsupervised, then supervised. Several authors (Castelli-Cover, 1996; Balcan-Blum, 2005; Niyogi, 2008; Ben-David et al, 2008; Urner et al, 2011) have analyzed SSL, with Balcan-Blum (2005) proposing a version of the PAC learning framework augmented by a "compatibility function" to link concept class and unlabeled data distribution. We propose to analyze SSL and other multi-step learning approaches, much in the spirit of Baxter’s framework, by defining a learning problem generatively as a joint statistical model on $X \times Y$. This determines in a natural way the class of conditional distributions that are possible with each marginal, and amounts to an abstract form of compatibility function. It also allows to analyze both discrete and non-discrete settings. As tool for our analysis, we define a notion of $\gamma$-uniform shattering for statistical models. We use this to give conditions on the marginal and conditional models which imply an advantage for multi-step learning approaches. In particular, we recover a more general version of a result of Poggio et al (2012): under mild hypotheses a multi-step approach which learns features invariant under successive factors of a finite group of invariances has sample complexity requirements that are additive rather than multiplicative in the size of the subgroups.
Tasks	Transfer Learning
Published	2016-12-01
URL	http://papers.nips.cc/paper/6197-multi-step-learning-and-underlying-structure-in-statistical-models
PDF	http://papers.nips.cc/paper/6197-multi-step-learning-and-underlying-structure-in-statistical-models.pdf
PWC	https://paperswithcode.com/paper/multi-step-learning-and-underlying-structure
Repo
Framework

Proceedings of the 19th Annual Conference of the EAMT: Projects/Products


Title	Proceedings of the 19th Annual Conference of the EAMT: Projects/Products
Authors	European Association for Machine Translation
Abstract
Tasks	Machine Translation
Published	2016-01-01
URL	https://www.aclweb.org/anthology/W16-3424/
PDF	https://www.aclweb.org/anthology/W16-3424
PWC	https://paperswithcode.com/paper/proceedings-of-the-19th-annual-conference-of
Repo
Framework

Comparing the Template-Based Approach to GF: the case of Afrikaans


Title	Comparing the Template-Based Approach to GF: the case of Afrikaans
Authors	Lauren Sanby, Ion Todd, Maria C. Keet
Abstract
Tasks	Text Generation
Published	2016-09-01
URL	https://www.aclweb.org/anthology/W16-3510/
PDF	https://www.aclweb.org/anthology/W16-3510
PWC	https://paperswithcode.com/paper/comparing-the-template-based-approach-to-gf
Repo
Framework

A Repository of Frame Instance Lexicalizations for Generation


Title	A Repository of Frame Instance Lexicalizations for Generation
Authors	Valerio Basile
Abstract
Tasks	Text Generation
Published	2016-09-01
URL	https://www.aclweb.org/anthology/W16-3502/
PDF	https://www.aclweb.org/anthology/W16-3502
PWC	https://paperswithcode.com/paper/a-repository-of-frame-instance
Repo
Framework

QCRI @ DSL 2016: Spoken Arabic Dialect Identification Using Textual Features


Title	QCRI @ DSL 2016: Spoken Arabic Dialect Identification Using Textual Features
Authors	Mohamed Eldesouki, Fahim Dalvi, Hassan Sajjad, Kareem Darwish
Abstract	The paper describes the QCRI submissions to the task of automatic Arabic dialect classification into 5 Arabic variants, namely Egyptian, Gulf, Levantine, North-African, and Modern Standard Arabic (MSA). The training data is relatively small and is automatically generated from an ASR system. To avoid over-fitting on such small data, we carefully selected and designed the features to capture the morphological essence of the different dialects. We submitted four runs to the Arabic sub-task. For all runs, we used a combined feature vector of character bi-grams, tri-grams, 4-grams, and 5-grams. We tried several machine-learning algorithms, namely Logistic Regression, Naive Bayes, Neural Networks, and Support Vector Machines (SVM) with linear and string kernels. However, our submitted runs used SVM with a linear kernel. In the closed submission, we got the best accuracy of 0.5136 and the third best weighted F1 score, with a difference less than 0.002 from the highest score.
Tasks	Machine Translation
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4828/
PDF	https://www.aclweb.org/anthology/W16-4828
PWC	https://paperswithcode.com/paper/qcri-dsl-2016-spoken-arabic-dialect
Repo
Framework
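
The combined feature vector of character bi-grams through 5-grams mentioned in the abstract above might be built roughly as follows. This is a minimal sketch under stated assumptions: the abstract does not say how the n-grams were weighted, normalized, or fed to the SVM, so raw counts are used here, and the function name `char_ngrams` and the sample string are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n_values=(2, 3, 4, 5)):
    """Count every character n-gram of the given lengths in one Counter.

    Raw counts are an assumption; the paper may have used a different
    weighting (for example binary presence or TF-IDF).
    """
    counts = Counter()
    for n in n_values:
        # Slide a window of width n over the text.
        for start in range(len(text) - n + 1):
            counts[text[start:start + n]] += 1
    return counts


# Hypothetical transliterated input, used only to show the shape of the output.
features = char_ngrams("salam")
```

Each distinct n-gram becomes one dimension of the feature vector; a linear classifier such as the linear-kernel SVM named in the abstract would then be trained on these vectors.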