May 5, 2019


Paper Group NANR 14


Solving Random Systems of Quadratic Equations via Truncated Generalized Gradient Flow


Title	Solving Random Systems of Quadratic Equations via Truncated Generalized Gradient Flow
Authors	Gang Wang, Georgios Giannakis
Abstract	This paper puts forth a novel algorithm, termed \emph{truncated generalized gradient flow} (TGGF), to solve for $\bm{x}\in\mathbb{R}^n/\mathbb{C}^n$ a system of $m$ quadratic equations $y_i=|\langle\bm{a}_i,\bm{x}\rangle|^2$, $i=1,2,\ldots,m$, which even for $\left\{\bm{a}_i\in\mathbb{R}^n/\mathbb{C}^n\right\}_{i=1}^m$ random is known to be \emph{NP-hard} in general. We prove that as soon as the number of equations $m$ is on the order of the number of unknowns $n$, TGGF recovers the solution exactly (up to a global unimodular constant) with high probability and complexity growing linearly with the time required to read the data $\left\{\left(\bm{a}_i;\,y_i\right)\right\}_{i=1}^m$. Specifically, TGGF proceeds in two stages: s1) A novel \emph{orthogonality-promoting} initialization that is obtained with simple power iterations; and, s2) a refinement of the initial estimate by successive updates of scalable \emph{truncated generalized gradient iterations}. The former is in sharp contrast to the existing spectral initializations, while the latter handles the rather challenging nonconvex and nonsmooth \emph{amplitude-based} cost function. Numerical tests demonstrate that: i) The novel orthogonality-promoting initialization method returns more accurate and robust estimates relative to its spectral counterparts; and ii) even with the same initialization, our refinement/truncation outperforms Wirtinger-based alternatives, all corroborating the superior performance of TGGF over state-of-the-art algorithms.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6061-solving-random-systems-of-quadratic-equations-via-truncated-generalized-gradient-flow
PDF	http://papers.nips.cc/paper/6061-solving-random-systems-of-quadratic-equations-via-truncated-generalized-gradient-flow.pdf
PWC	https://paperswithcode.com/paper/solving-random-systems-of-quadratic-equations
Repo
Framework
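
The two stages named in the abstract above can be illustrated with a minimal NumPy sketch for the real-valued case. This is not the authors' implementation: the abstract gives no formulas for either stage, so the subset fraction `keep_fraction`, the truncation parameter `gamma`, the step size `step`, the norm estimate `sqrt(mean(y))` (which assumes standard Gaussian $\bm{a}_i$), and all function names are assumptions chosen for illustration.

```python
import numpy as np

def orthogonality_promoting_init(A, y, keep_fraction=1 / 6, power_iters=100, seed=0):
    """Stage s1 (sketch): power iterations on the rows least orthogonal to x."""
    m, n = A.shape
    rng = np.random.default_rng(seed)
    row_norms = np.linalg.norm(A, axis=1)
    # Normalized amplitude |<a_i, x>| / ||a_i||: small values mean a_i is
    # nearly orthogonal to x, large values mean a_i is well aligned with x.
    score = np.sqrt(y) / row_norms
    k = max(1, int(keep_fraction * m))
    idx = np.argsort(score)[-k:]          # assumed rule: keep the k best-aligned rows
    B = A[idx] / row_norms[idx, None]     # unit-norm rows
    z = rng.standard_normal(n)
    z /= np.linalg.norm(z)
    for _ in range(power_iters):          # leading eigenvector of B^T B
        z = B.T @ (B @ z)
        z /= np.linalg.norm(z)
    # Assumed scaling: for standard Gaussian a_i, E[y_i] = ||x||^2.
    return np.sqrt(np.mean(y)) * z

def truncated_gradient_refine(A, y, z, step=0.6, gamma=0.7, iters=500):
    """Stage s2 (sketch): truncated gradient steps on the amplitude-based cost
    (1/2m) * sum_i (|<a_i, z>| - sqrt(y_i))^2."""
    m = A.shape[0]
    psi = np.sqrt(y)
    for _ in range(iters):
        Az = A @ z
        # Assumed truncation rule: drop equations whose current amplitude is
        # much smaller than the measured one, since their sign is unreliable.
        keep = np.abs(Az) >= psi / (1.0 + gamma)
        resid = Az[keep] - psi[keep] * np.sign(Az[keep])
        z = z - (step / m) * (A[keep].T @ resid)
    return z

def relative_error(z, x):
    """Distance up to the global sign ambiguity of the real-valued problem."""
    return min(np.linalg.norm(z - x), np.linalg.norm(z + x)) / np.linalg.norm(x)
```

A typical call draws a Gaussian matrix `A` and a vector `x`, forms `y = (A @ x) ** 2`, and passes the output of `orthogonality_promoting_init(A, y)` to `truncated_gradient_refine`. The result is then compared with `x` via `relative_error`, since the equations determine `x` only up to sign.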

CNNpack: Packing Convolutional Neural Networks in the Frequency Domain


Title	CNNpack: Packing Convolutional Neural Networks in the Frequency Domain
Authors	Yunhe Wang, Chang Xu, Shan You, Dacheng Tao, Chao Xu
Abstract	Deep convolutional neural networks (CNNs) are successfully used in a number of applications. However, their storage and computational requirements have largely prevented their widespread use on mobile devices. Here we present an effective CNN compression approach in the frequency domain, which focuses not only on smaller weights but on all the weights and their underlying connections. By treating convolutional filters as images, we decompose their representations in the frequency domain as common parts (i.e., cluster centers) shared by other similar filters and their individual private parts (i.e., individual residuals). A large number of low-energy frequency coefficients in both parts can be discarded to produce high compression without significantly compromising accuracy. We relax the computational burden of convolution operations in CNNs by linearly combining the convolution responses of discrete cosine transform (DCT) bases. The compression and speed-up ratios of the proposed algorithm are thoroughly analyzed and evaluated on benchmark image datasets to demonstrate its superiority over state-of-the-art methods.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6390-cnnpack-packing-convolutional-neural-networks-in-the-frequency-domain
PDF	http://papers.nips.cc/paper/6390-cnnpack-packing-convolutional-neural-networks-in-the-frequency-domain.pdf
PWC	https://paperswithcode.com/paper/cnnpack-packing-convolutional-neural-networks
Repo
Framework
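
The decomposition described in the abstract above (filters transformed to the frequency domain, split into shared cluster centers and private residuals, then sparsified) can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the CNNpack implementation: the plain Lloyd-style k-means, the "keep the largest-magnitude fraction" rule, and every function name and parameter here are hypothetical.

```python
import numpy as np

def dct_matrix(d):
    """Orthonormal DCT-II matrix C, so that C @ F @ C.T is the 2-D DCT of F."""
    k = np.arange(d)[:, None]
    n = np.arange(d)[None, :]
    C = np.sqrt(2.0 / d) * np.cos(np.pi * (2 * n + 1) * k / (2 * d))
    C[0, :] = np.sqrt(1.0 / d)
    return C

def kmeans(X, k, iters=20, seed=0):
    """A few Lloyd iterations; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, labels

def keep_largest(M, fraction):
    """Zero all but (roughly) the largest-magnitude `fraction` of the entries."""
    k = int(round(fraction * M.size))
    if k <= 0:
        return np.zeros_like(M)
    if k >= M.size:
        return M.copy()
    thresh = np.partition(np.abs(M).ravel(), M.size - k)[M.size - k]
    return np.where(np.abs(M) >= thresh, M, 0.0)

def pack(filters, n_clusters=4, keep=0.3):
    """filters: array of shape (N, d, d). Returns sparse centers, labels, residuals."""
    n, d, _ = filters.shape
    C = dct_matrix(d)
    coeffs = (C @ filters @ C.T).reshape(n, -1)    # 2-D DCT of every filter
    centers, labels = kmeans(coeffs, n_clusters)   # shared "common parts"
    residuals = coeffs - centers[labels]           # individual "private parts"
    return keep_largest(centers, keep), labels, keep_largest(residuals, keep)

def unpack(centers, labels, residuals, d):
    """Rebuild spatial-domain filters from the (sparsified) frequency parts."""
    C = dct_matrix(d)
    coeffs = (centers[labels] + residuals).reshape(-1, d, d)
    return C.T @ coeffs @ C                        # inverse 2-D DCT
```

With `keep=1.0` nothing is discarded and `unpack` reproduces the input filters up to rounding error. Lowering `keep` trades reconstruction error for sparser centers and residuals, which is where any storage saving would come from in this sketch.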

ELRA Activities and Services


Title	ELRA Activities and Services
Authors	Khalid Choukri, Valérie Mapelli, Hélène Mazo, Vladimir Popescu
Abstract	After celebrating its 20th anniversary in 2015, ELRA is carrying on its strong involvement in the HLT field. To share ELRA's expertise of those 21 past years, this article begins with a presentation of ELRA's strategic Data and LR Management Plan for a wide use by the language communities. Then, we further report on ELRA's activities and services provided since LREC 2014. When looking at the cataloguing and licensing activities, we can see that ELRA has been active at making the Meta-Share repository move toward new development steps, supporting Europe to obtain accurate LRs within the Connecting Europe Facility programme, promoting the use of LR citation, and creating the ELRA License Wizard web portal. The article further elaborates on the recent LR production activities of various written, speech and video resources, commissioned by public and private customers. In parallel, ELDA has also worked on several EU-funded projects centred on strategic issues related to the European Digital Single Market. The last part gives an overview of the latest dissemination activities, with a special focus on the celebration of its 20th anniversary organised in Dubrovnik (Croatia) and the following up of LREC, as well as the launching of the new ELRA portal.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1074/
PDF	https://www.aclweb.org/anthology/L16-1074
PWC	https://paperswithcode.com/paper/elra-activities-and-services
Repo
Framework

Melbourne at SemEval 2016 Task 11: Classifying Type-level Word Complexity using Random Forests with Corpus and Word List Features


Title	Melbourne at SemEval 2016 Task 11: Classifying Type-level Word Complexity using Random Forests with Corpus and Word List Features
Authors	Julian Brooke, Alexandra Uitdenbogerd, Timothy Baldwin
Abstract
Tasks	Complex Word Identification, Text Simplification
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1150/
PDF	https://www.aclweb.org/anthology/S16-1150
PWC	https://paperswithcode.com/paper/melbourne-at-semeval-2016-task-11-classifying
Repo
Framework

Creation of comparable corpora for English-Urdu, Arabic, Persian


Title	Creation of comparable corpora for English-Urdu, Arabic, Persian
Authors	Murad Abouammoh, Kashif Shah, Ahmet Aker
Abstract	Statistical Machine Translation (SMT) relies on the availability of rich parallel corpora. However, in the case of under-resourced languages or some specific domains, parallel corpora are not readily available. This leads to under-performing machine translation systems in those sparse data settings. To overcome the low availability of parallel resources, the machine translation community has recognized the potential of using comparable resources as training data. However, most efforts have focused on European languages and fewer on Middle Eastern languages. In this study, we report comparable corpora created from news articles for the language pairs English-{Arabic, Persian, Urdu}. The data was collected over a period of a year and covers the Arabic, Persian and Urdu languages. Furthermore, using English as a pivot language, comparable corpora that involve more than one language can be created, e.g. English-Arabic-Persian, English-Arabic-Urdu, English-Urdu-Persian, etc. Upon request, the data can be provided for research purposes.
Tasks	Machine Translation
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1663/
PDF	https://www.aclweb.org/anthology/L16-1663
PWC	https://paperswithcode.com/paper/creation-of-comparable-corpora-for-english
Repo
Framework

Designing Algorithms for Referring with Proper Names


Title	Designing Algorithms for Referring with Proper Names
Authors	Kees van Deemter
Abstract
Tasks	Text Generation
Published	2016-09-01
URL	https://www.aclweb.org/anthology/W16-6605/
PDF	https://www.aclweb.org/anthology/W16-6605
PWC	https://paperswithcode.com/paper/designing-algorithms-for-referring-with
Repo
Framework

Learning to Identify Sentence Parallelism in Student Essays


Title	Learning to Identify Sentence Parallelism in Student Essays
Authors	Wei Song, Tong Liu, Ruiji Fu, Lizhen Liu, Hanshi Wang, Ting Liu
Abstract	Parallelism is an important rhetorical device. We propose a machine learning approach for automated sentence parallelism identification in student essays. We build an essay dataset with sentence-level parallelism annotated. We derive features by combining generalized word alignment strategies and the alignment measures between word sequences. The experimental results show that sentence parallelism can be effectively identified with an F1 score of 82% at pair-wise level and 72% at parallelism chunk level. Based on this approach, we automatically identify sentence parallelism in more than 2000 student essays and study the correlation between the use of sentence parallelism and the types and quality of essays.
Tasks	Word Alignment
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1076/
PDF	https://www.aclweb.org/anthology/C16-1076
PWC	https://paperswithcode.com/paper/learning-to-identify-sentence-parallelism-in
Repo
Framework

A Corpus of Text Data and Gaze Fixations from Autistic and Non-Autistic Adults


Title	A Corpus of Text Data and Gaze Fixations from Autistic and Non-Autistic Adults
Authors	Victoria Yaneva, Irina Temnikova, Ruslan Mitkov
Abstract	The paper presents a corpus of text data and its corresponding gaze fixations obtained from autistic and non-autistic readers. The data was elicited through reading comprehension testing combined with eye-tracking recording. The corpus consists of 1034 content words tagged with their POS, syntactic role and three gaze-based measures corresponding to the autistic and control participants. The reading skills of the participants were measured through multiple-choice questions and, based on the answers given, they were divided into groups of skillful and less-skillful readers. This division of the groups informs researchers on whether particular fixations were elicited from skillful or less-skillful readers and allows a fair between-group comparison for two levels of reading ability. In addition to describing the process of data collection and corpus development, we present a study on the effect that word length has on reading in autism. The corpus is intended as a resource for investigating the particular linguistic constructions which pose reading difficulties for people with autism and hopefully, as a way to inform future text simplification research intended for this population.
Tasks	Eye Tracking, Reading Comprehension, Text Simplification
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1077/
PDF	https://www.aclweb.org/anthology/L16-1077
PWC	https://paperswithcode.com/paper/a-corpus-of-text-data-and-gaze-fixations-from
Repo
Framework

Semi-automatically Alignment of Predicates between Speech and OntoNotes data


Title	Semi-automatically Alignment of Predicates between Speech and OntoNotes data
Authors	Niraj Shrestha, Marie-Francine Moens
Abstract	Speech data currently receives growing attention and is an important source of information. We still lack suitable corpora of transcribed speech annotated with semantic roles that can be used for semantic role labeling (SRL), which is not the case for written data. Semantic role labeling in speech data is a challenging and complex task due to the lack of sentence boundaries and the many transcription errors such as insertions, deletions and misspellings of words. In written data, SRL evaluation is performed at the sentence level, but in speech data sentence boundary identification is still a bottleneck, which makes evaluation more complex. In this work, we semi-automatically align the predicates found in transcribed speech obtained with an automatic speech recognizer (ASR) with the predicates found in the corresponding written documents of the OntoNotes corpus, and manually align the semantic roles of these predicates, thus obtaining annotated semantic frames in the speech data. This data can serve as gold standard alignments for future research in semantic role labeling of speech data.
Tasks	Semantic Role Labeling
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1222/
PDF	https://www.aclweb.org/anthology/L16-1222
PWC	https://paperswithcode.com/paper/semi-automatically-alignment-of-predicates
Repo
Framework

The MultiTal NLP tool infrastructure


Title	The MultiTal NLP tool infrastructure
Authors	Driss Sadoun, Satenik Mkhitaryan, Damien Nouvel, Mathieu Valette
Abstract	This paper gives an overview of the MultiTal project, which aims to create a research infrastructure that ensures long-term distribution of NLP tool descriptions. The goal is to make NLP tools more accessible and usable to end-users of different disciplines. The infrastructure is built on a meta-data scheme modelling and standardising multilingual NLP tool documentation. The model is conceptualised using an OWL ontology. The formal representation of the ontology allows us to automatically generate organised and structured documentation in different languages for each represented tool.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4021/
PDF	https://www.aclweb.org/anthology/W16-4021
PWC	https://paperswithcode.com/paper/the-multital-nlp-tool-infrastructure
Repo
Framework

A Constituent Syntactic Parse Tree Based Discourse Parser


Title	A Constituent Syntactic Parse Tree Based Discourse Parser
Authors	Zhongyi Li, Hai Zhao, Chenxi Pang, Lili Wang, Huan Wang
Abstract
Tasks	Question Answering, Text Classification
Published	2016-08-01
URL	https://www.aclweb.org/anthology/K16-2008/
PDF	https://www.aclweb.org/anthology/K16-2008
PWC	https://paperswithcode.com/paper/a-constituent-syntactic-parse-tree-based
Repo
Framework

AI-KU at SemEval-2016 Task 11: Word Embeddings and Substring Features for Complex Word Identification


Title	AI-KU at SemEval-2016 Task 11: Word Embeddings and Substring Features for Complex Word Identification
Authors	Onur Kuru
Abstract
Tasks	Complex Word Identification, Document Classification, Lexical Simplification, Named Entity Recognition, Word Embeddings
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1163/
PDF	https://www.aclweb.org/anthology/S16-1163
PWC	https://paperswithcode.com/paper/ai-ku-at-semeval-2016-task-11-word-embeddings
Repo
Framework

IIIT at SemEval-2016 Task 11: Complex Word Identification using Nearest Centroid Classification


Title	IIIT at SemEval-2016 Task 11: Complex Word Identification using Nearest Centroid Classification
Authors	Ashish Palakurthi, Radhika Mamidi
Abstract
Tasks	Complex Word Identification, Lexical Simplification
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1158/
PDF	https://www.aclweb.org/anthology/S16-1158
PWC	https://paperswithcode.com/paper/iiit-at-semeval-2016-task-11-complex-word
Repo
Framework

#WhoAmI in 160 Characters? Classifying Social Identities Based on Twitter Profile Descriptions


Title	#WhoAmI in 160 Characters? Classifying Social Identities Based on Twitter Profile Descriptions
Authors	Anna Priante, Djoerd Hiemstra, Tijs van den Broek, Aaqib Saeed, Michel Ehrenhard, Ariana Need
Abstract
Tasks	Text Classification
Published	2016-11-01
URL	https://www.aclweb.org/anthology/W16-5608/
PDF	https://www.aclweb.org/anthology/W16-5608
PWC	https://paperswithcode.com/paper/whoami-in-160-characters-classifying-social
Repo
Framework

Bridging Corpus for Russian in comparison with Czech


Title	Bridging Corpus for Russian in comparison with Czech
Authors	Anna Roitberg, Anna Nedoluzhko
Abstract
Tasks	Coreference Resolution
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-0709/
PDF	https://www.aclweb.org/anthology/W16-0709
PWC	https://paperswithcode.com/paper/bridging-corpus-for-russian-in-comparison
Repo
Framework