May 5, 2019

Paper Group NANR 14

Solving Random Systems of Quadratic Equations via Truncated Generalized Gradient Flow. CNNpack: Packing Convolutional Neural Networks in the Frequency Domain. ELRA Activities and Services. Melbourne at SemEval 2016 Task 11: Classifying Type-level Word Complexity using Random Forests with Corpus and Word List Features. Creation of comparable corpora …

Solving Random Systems of Quadratic Equations via Truncated Generalized Gradient Flow

Title Solving Random Systems of Quadratic Equations via Truncated Generalized Gradient Flow
Authors Gang Wang, Georgios Giannakis
Abstract This paper puts forth a novel algorithm, termed truncated generalized gradient flow (TGGF), to solve for $\bm{x}\in\mathbb{R}^n/\mathbb{C}^n$ a system of $m$ quadratic equations $y_i=|\langle\bm{a}_i,\bm{x}\rangle|^2$, $i=1,2,\ldots,m$, which even for $\left\{\bm{a}_i\in\mathbb{R}^n/\mathbb{C}^n\right\}_{i=1}^m$ random is known to be NP-hard in general. We prove that as soon as the number of equations $m$ is on the order of the number of unknowns $n$, TGGF recovers the solution exactly (up to a global unimodular constant) with high probability and complexity growing linearly with the time required to read the data $\left\{\left(\bm{a}_i;\,y_i\right)\right\}_{i=1}^m$. Specifically, TGGF proceeds in two stages: s1) a novel orthogonality-promoting initialization that is obtained with simple power iterations; and s2) a refinement of the initial estimate by successive updates of scalable truncated generalized gradient iterations. The former is in sharp contrast to the existing spectral initializations, while the latter handles the rather challenging nonconvex and nonsmooth amplitude-based cost function. Numerical tests demonstrate that: i) the novel orthogonality-promoting initialization method returns more accurate and robust estimates relative to its spectral counterparts; and ii) even with the same initialization, our refinement/truncation outperforms Wirtinger-based alternatives, all corroborating the superior performance of TGGF over state-of-the-art algorithms.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6061-solving-random-systems-of-quadratic-equations-via-truncated-generalized-gradient-flow
PDF http://papers.nips.cc/paper/6061-solving-random-systems-of-quadratic-equations-via-truncated-generalized-gradient-flow.pdf
PWC https://paperswithcode.com/paper/solving-random-systems-of-quadratic-equations
Repo
Framework
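
The phrase "up to a global unimodular constant" in the abstract above can be illustrated with a small sketch. It assumes the measurements are the squared moduli $|\langle a_i, x\rangle|^2$, and it only shows why such a constant cannot be recovered from the data; it is not an implementation of TGGF. The helper `measurements`, the sizes `n` and `m`, and the phase angle are all hypothetical choices made for illustration.

```python
import cmath
import random


def measurements(rows, x):
    """Return y_i = |<a_i, x>|^2 for each sensing vector a_i in rows."""
    ys = []
    for a in rows:
        inner = sum(a_k.conjugate() * x_k for a_k, x_k in zip(a, x))
        ys.append(abs(inner) ** 2)
    return ys


random.seed(0)
n, m = 4, 12  # hypothetical problem size: n unknowns, m equations

# Random complex Gaussian sensing vectors and a random unknown vector.
rows = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
        for _ in range(m)]
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

# Multiply x by a unimodular constant exp(j * theta), i.e. a global phase.
phase = cmath.exp(1j * 0.7)
x_rotated = [phase * x_k for x_k in x]

y = measurements(rows, x)
y_rotated = measurements(rows, x_rotated)

# Largest difference between the two sets of measurements.
max_gap = max(abs(u - v) for u, v in zip(y, y_rotated))
print(max_gap)
```

Because the two measurement vectors agree up to rounding, any solver that sees only `y` can at best return `x` times some unit-modulus constant; in the real case that constant is a sign.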

CNNpack: Packing Convolutional Neural Networks in the Frequency Domain

Title CNNpack: Packing Convolutional Neural Networks in the Frequency Domain
Authors Yunhe Wang, Chang Xu, Shan You, Dacheng Tao, Chao Xu
Abstract Deep convolutional neural networks (CNNs) are successfully used in a number of applications. However, their storage and computational requirements have largely prevented their widespread use on mobile devices. Here we present an effective CNN compression approach in the frequency domain, which focuses not only on smaller weights but on all the weights and their underlying connections. By treating convolutional filters as images, we decompose their representations in the frequency domain as common parts (i.e., cluster centers) shared by other similar filters and their individual private parts (i.e., individual residuals). A large number of low-energy frequency coefficients in both parts can be discarded to produce high compression without significantly compromising accuracy. We relax the computational burden of convolution operations in CNNs by linearly combining the convolution responses of discrete cosine transform (DCT) bases. The compression and speed-up ratios of the proposed algorithm are thoroughly analyzed and evaluated on benchmark image datasets to demonstrate its superiority over state-of-the-art methods.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6390-cnnpack-packing-convolutional-neural-networks-in-the-frequency-domain
PDF http://papers.nips.cc/paper/6390-cnnpack-packing-convolutional-neural-networks-in-the-frequency-domain.pdf
PWC https://paperswithcode.com/paper/cnnpack-packing-convolutional-neural-networks
Repo
Framework
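
The idea of discarding low-energy frequency coefficients, mentioned in the abstract above, can be sketched on a single filter. This is a minimal illustration under stated assumptions, not the CNNpack method: it omits the clustering into shared and private parts and every other step of the paper. The 4x4 `weights` values, the helper names, and the choice `keep=4` are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dct2(block):
    """2D DCT of a square block: C * X * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2D DCT of a square block: C^T * Y * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def keep_largest(coeffs, keep):
    """Zero all but the `keep` largest-magnitude coefficients (ties are kept)."""
    flat = sorted((abs(v) for row in coeffs for v in row), reverse=True)
    threshold = flat[keep - 1]
    return [[v if abs(v) >= threshold else 0.0 for v in row] for row in coeffs]


def energy(a):
    return sum(v * v for row in a for v in row)


def diff(a, b):
    return [[u - v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical 4x4 convolution filter, treated as a tiny image.
weights = [[0.9, 0.7, 0.2, 0.1],
           [0.8, 0.5, 0.1, 0.0],
           [0.3, 0.2, -0.1, -0.2],
           [0.1, 0.0, -0.3, -0.4]]

coeffs = dct2(weights)
kept = keep_largest(coeffs, keep=4)
approx = idct2(kept)

roundtrip_error = energy(diff(weights, idct2(coeffs)))
total_energy = energy(weights)
discarded_energy = energy(diff(coeffs, kept))
err_energy = energy(diff(weights, approx))
print(err_energy / total_energy)
```

Since the basis is orthonormal, the reconstruction error energy equals the energy of the discarded coefficients, which is why dropping only low-energy coefficients changes the filter very little.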

ELRA Activities and Services

Title ELRA Activities and Services
Authors Khalid Choukri, Valérie Mapelli, Hélène Mazo, Vladimir Popescu
Abstract After celebrating its 20th anniversary in 2015, ELRA is carrying on its strong involvement in the HLT field. To share ELRA's expertise of those 21 past years, this article begins with a presentation of ELRA's strategic Data and LR Management Plan for a wide use by the language communities. Then, we further report on ELRA's activities and services provided since LREC 2014. When looking at the cataloguing and licensing activities, we can see that ELRA has been active in moving the Meta-Share repository toward new development steps, supporting Europe to obtain accurate LRs within the Connecting Europe Facility programme, promoting the use of LR citation, and creating the ELRA License Wizard web portal. The article further elaborates on the recent LR production activities of various written, speech and video resources, commissioned by public and private customers. In parallel, ELDA has also worked on several EU-funded projects centred on strategic issues related to the European Digital Single Market. The last part gives an overview of the latest dissemination activities, with a special focus on the celebration of its 20th anniversary organised in Dubrovnik (Croatia) and the following up of LREC, as well as the launching of the new ELRA portal.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1074/
PDF https://www.aclweb.org/anthology/L16-1074
PWC https://paperswithcode.com/paper/elra-activities-and-services
Repo
Framework

Melbourne at SemEval 2016 Task 11: Classifying Type-level Word Complexity using Random Forests with Corpus and Word List Features

Title Melbourne at SemEval 2016 Task 11: Classifying Type-level Word Complexity using Random Forests with Corpus and Word List Features
Authors Julian Brooke, Alexandra Uitdenbogerd, Timothy Baldwin
Abstract
Tasks Complex Word Identification, Text Simplification
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1150/
PDF https://www.aclweb.org/anthology/S16-1150
PWC https://paperswithcode.com/paper/melbourne-at-semeval-2016-task-11-classifying
Repo
Framework

Creation of comparable corpora for English-Urdu, Arabic, Persian

Title Creation of comparable corpora for English-Urdu, Arabic, Persian
Authors Murad Abouammoh, Kashif Shah, Ahmet Aker
Abstract Statistical Machine Translation (SMT) relies on the availability of rich parallel corpora. However, in the case of under-resourced languages or some specific domains, parallel corpora are not readily available. This leads to under-performing machine translation systems in those sparse data settings. To overcome the low availability of parallel resources, the machine translation community has recognized the potential of using comparable resources as training data. However, most efforts have focused on European languages and fewer on Middle Eastern languages. In this study, we report comparable corpora created from news articles for the language pairs English-{Arabic, Persian, Urdu}. The data was collected over a period of a year and covers Arabic, Persian and Urdu. Furthermore, using English as a pivot language, comparable corpora that involve more than one language can be created, e.g. English-Arabic-Persian, English-Arabic-Urdu, English-Urdu-Persian, etc. Upon request, the data can be provided for research purposes.
Tasks Machine Translation
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1663/
PDF https://www.aclweb.org/anthology/L16-1663
PWC https://paperswithcode.com/paper/creation-of-comparable-corpora-for-english
Repo
Framework

Designing Algorithms for Referring with Proper Names

Title Designing Algorithms for Referring with Proper Names
Authors Kees van Deemter
Abstract
Tasks Text Generation
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-6605/
PDF https://www.aclweb.org/anthology/W16-6605
PWC https://paperswithcode.com/paper/designing-algorithms-for-referring-with
Repo
Framework

Learning to Identify Sentence Parallelism in Student Essays

Title Learning to Identify Sentence Parallelism in Student Essays
Authors Wei Song, Tong Liu, Ruiji Fu, Lizhen Liu, Hanshi Wang, Ting Liu
Abstract Parallelism is an important rhetorical device. We propose a machine learning approach for automated sentence parallelism identification in student essays. We build an essay dataset with sentence-level parallelism annotated. We derive features by combining generalized word alignment strategies and the alignment measures between word sequences. The experimental results show that sentence parallelism can be effectively identified with an F1 score of 82% at pair-wise level and 72% at parallelism chunk level. Based on this approach, we automatically identify sentence parallelism in more than 2000 student essays and study the correlation between the use of sentence parallelism and the types and quality of essays.
Tasks Word Alignment
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1076/
PDF https://www.aclweb.org/anthology/C16-1076
PWC https://paperswithcode.com/paper/learning-to-identify-sentence-parallelism-in
Repo
Framework

A Corpus of Text Data and Gaze Fixations from Autistic and Non-Autistic Adults

Title A Corpus of Text Data and Gaze Fixations from Autistic and Non-Autistic Adults
Authors Victoria Yaneva, Irina Temnikova, Ruslan Mitkov
Abstract The paper presents a corpus of text data and its corresponding gaze fixations obtained from autistic and non-autistic readers. The data was elicited through reading comprehension testing combined with eye-tracking recording. The corpus consists of 1034 content words tagged with their POS, syntactic role and three gaze-based measures corresponding to the autistic and control participants. The reading skills of the participants were measured through multiple-choice questions and, based on the answers given, they were divided into groups of skillful and less-skillful readers. This division of the groups informs researchers on whether particular fixations were elicited from skillful or less-skillful readers and allows a fair between-group comparison for two levels of reading ability. In addition to describing the process of data collection and corpus development, we present a study on the effect that word length has on reading in autism. The corpus is intended as a resource for investigating the particular linguistic constructions which pose reading difficulties for people with autism and hopefully, as a way to inform future text simplification research intended for this population.
Tasks Eye Tracking, Reading Comprehension, Text Simplification
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1077/
PDF https://www.aclweb.org/anthology/L16-1077
PWC https://paperswithcode.com/paper/a-corpus-of-text-data-and-gaze-fixations-from
Repo
Framework

Semi-automatically Alignment of Predicates between Speech and OntoNotes data

Title Semi-automatically Alignment of Predicates between Speech and OntoNotes data
Authors Niraj Shrestha, Marie-Francine Moens
Abstract Speech data currently receives growing attention and is an important source of information. We still lack suitable corpora of transcribed speech annotated with semantic roles that can be used for semantic role labeling (SRL), which is not the case for written data. Semantic role labeling in speech data is a challenging and complex task due to the lack of sentence boundaries and the many transcription errors such as insertions, deletions and misspellings of words. In written data, SRL evaluation is performed at the sentence level, but in speech data sentence boundary identification is still a bottleneck, which makes evaluation more complex. In this work, we semi-automatically align the predicates found in transcribed speech obtained with an automatic speech recognizer (ASR) with the predicates found in the corresponding written documents of the OntoNotes corpus, and manually align the semantic roles of these predicates, thus obtaining annotated semantic frames in the speech data. This data can serve as gold standard alignments for future research in semantic role labeling of speech data.
Tasks Semantic Role Labeling
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1222/
PDF https://www.aclweb.org/anthology/L16-1222
PWC https://paperswithcode.com/paper/semi-automatically-alignment-of-predicates
Repo
Framework

The MultiTal NLP tool infrastructure

Title The MultiTal NLP tool infrastructure
Authors Driss Sadoun, Satenik Mkhitaryan, Damien Nouvel, Mathieu Valette
Abstract This paper gives an overview of the MultiTal project, which aims to create a research infrastructure that ensures long-term distribution of NLP tool descriptions. The goal is to make NLP tools more accessible and usable to end-users of different disciplines. The infrastructure is built on a meta-data scheme modelling and standardising multilingual NLP tool documentation. The model is conceptualised using an OWL ontology. The formal representation of the ontology allows us to automatically generate organised and structured documentation in different languages for each represented tool.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4021/
PDF https://www.aclweb.org/anthology/W16-4021
PWC https://paperswithcode.com/paper/the-multital-nlp-tool-infrastructure
Repo
Framework

A Constituent Syntactic Parse Tree Based Discourse Parser

Title A Constituent Syntactic Parse Tree Based Discourse Parser
Authors Zhongyi Li, Hai Zhao, Chenxi Pang, Lili Wang, Huan Wang
Abstract
Tasks Question Answering, Text Classification
Published 2016-08-01
URL https://www.aclweb.org/anthology/K16-2008/
PDF https://www.aclweb.org/anthology/K16-2008
PWC https://paperswithcode.com/paper/a-constituent-syntactic-parse-tree-based
Repo
Framework

AI-KU at SemEval-2016 Task 11: Word Embeddings and Substring Features for Complex Word Identification

Title AI-KU at SemEval-2016 Task 11: Word Embeddings and Substring Features for Complex Word Identification
Authors Onur Kuru
Abstract
Tasks Complex Word Identification, Document Classification, Lexical Simplification, Named Entity Recognition, Word Embeddings
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1163/
PDF https://www.aclweb.org/anthology/S16-1163
PWC https://paperswithcode.com/paper/ai-ku-at-semeval-2016-task-11-word-embeddings
Repo
Framework

IIIT at SemEval-2016 Task 11: Complex Word Identification using Nearest Centroid Classification

Title IIIT at SemEval-2016 Task 11: Complex Word Identification using Nearest Centroid Classification
Authors Ashish Palakurthi, Radhika Mamidi
Abstract
Tasks Complex Word Identification, Lexical Simplification
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1158/
PDF https://www.aclweb.org/anthology/S16-1158
PWC https://paperswithcode.com/paper/iiit-at-semeval-2016-task-11-complex-word
Repo
Framework

#WhoAmI in 160 Characters? Classifying Social Identities Based on Twitter Profile Descriptions

Title #WhoAmI in 160 Characters? Classifying Social Identities Based on Twitter Profile Descriptions
Authors Anna Priante, Djoerd Hiemstra, Tijs van den Broek, Aaqib Saeed, Michel Ehrenhard, Ariana Need
Abstract
Tasks Text Classification
Published 2016-11-01
URL https://www.aclweb.org/anthology/W16-5608/
PDF https://www.aclweb.org/anthology/W16-5608
PWC https://paperswithcode.com/paper/whoami-in-160-characters-classifying-social
Repo
Framework

Bridging Corpus for Russian in comparison with Czech

Title Bridging Corpus for Russian in comparison with Czech
Authors Anna Roitberg, Anna Nedoluzhko
Abstract
Tasks Coreference Resolution
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-0709/
PDF https://www.aclweb.org/anthology/W16-0709
PWC https://paperswithcode.com/paper/bridging-corpus-for-russian-in-comparison
Repo
Framework