October 15, 2019

2786 words 14 mins read

Paper Group NANR 134


Deep Neural Networks at the Service of Multilingual Parallel Sentence Extraction. LEARNING TO SHARE: SIMULTANEOUS PARAMETER TYING AND SPARSIFICATION IN DEEP LEARNING. Collecting Code-Switched Data from Social Media. Modifying Non-Local Variations Across Multiple Views. Edition 1.1 of the PARSEME Shared Task on Automatic Identification of Verbal Mul …

Deep Neural Networks at the Service of Multilingual Parallel Sentence Extraction

Title Deep Neural Networks at the Service of Multilingual Parallel Sentence Extraction
Authors Ahmad Aghaebrahimian
Abstract Wikipedia provides an invaluable source of parallel multilingual data, which are in high demand for various sorts of linguistic inquiry, including both theoretical and practical studies. We introduce a novel end-to-end neural model for large-scale parallel data harvesting from Wikipedia. Our model is language-independent, robust, and highly scalable. We use our system to collect parallel German-English, French-English, and Persian-English sentences. Human evaluations show the strong performance of this model in collecting high-quality parallel data. We also propose a statistical framework which extends the results of our human evaluation to other language pairs. Our model also obtained a state-of-the-art result on the German-English dataset of the BUCC 2017 shared task on parallel sentence extraction from comparable corpora.
Tasks Information Retrieval, Machine Translation
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1116/
PDF https://www.aclweb.org/anthology/C18-1116
PWC https://paperswithcode.com/paper/deep-neural-networks-at-the-service-of
Repo
Framework
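
The abstract above describes an end-to-end neural model for parallel sentence harvesting without giving architectural details, so the following is only a minimal, hedged sketch of the general idea of embedding-based parallel sentence mining: encode sentences from both languages into a shared space and keep the highest-scoring pairs. The `embed_src` and `embed_tgt` encoders and the similarity threshold are placeholders and assumptions, not the paper's model.

```python
# Minimal sketch of parallel sentence mining via embedding similarity.
# NOT the paper's architecture: embed_src/embed_tgt stand in for any
# multilingual sentence encoder mapping both languages into a shared space.
import numpy as np

def mine_parallel(src_sents, tgt_sents, embed_src, embed_tgt, threshold=0.8):
    """Return (source, target, score) triples whose cosine similarity exceeds the threshold."""
    S = np.array([embed_src(s) for s in src_sents])   # (n_src, d)
    T = np.array([embed_tgt(t) for t in tgt_sents])   # (n_tgt, d)
    S /= np.linalg.norm(S, axis=1, keepdims=True)
    T /= np.linalg.norm(T, axis=1, keepdims=True)
    sims = S @ T.T                                    # pairwise cosine similarities
    pairs = []
    for i, row in enumerate(sims):
        j = int(row.argmax())                         # best target candidate for each source sentence
        if row[j] >= threshold:
            pairs.append((src_sents[i], tgt_sents[j], float(row[j])))
    return pairs
```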

LEARNING TO SHARE: SIMULTANEOUS PARAMETER TYING AND SPARSIFICATION IN DEEP LEARNING

Title LEARNING TO SHARE: SIMULTANEOUS PARAMETER TYING AND SPARSIFICATION IN DEEP LEARNING
Authors Dejiao Zhang, Haozhu Wang, Mario Figueiredo, Laura Balzano
Abstract Deep neural networks (DNNs) usually contain millions, maybe billions, of parameters/weights, making both storage and computation very expensive. This has motivated a large body of work to reduce the complexity of the neural network by using sparsity-inducing regularizers. Another well-known approach for controlling the complexity of DNNs is parameter sharing/tying, where certain sets of weights are forced to share a common value. Some forms of weight sharing are hard-wired to express certain invariances, with a notable example being the shift-invariance of convolutional layers. However, there may be other groups of weights that may be tied together during the learning process, thus further reducing the complexity of the network. In this paper, we adopt a recently proposed sparsity-inducing regularizer, named GrOWL (group ordered weighted l1), which encourages sparsity and, simultaneously, learns which groups of parameters should share a common value. GrOWL has been proven effective in linear regression, being able to identify and cope with strongly correlated covariates. Unlike standard sparsity-inducing regularizers (e.g., l1, a.k.a. Lasso), GrOWL not only eliminates unimportant neurons by setting all the corresponding weights to zero, but also explicitly identifies strongly correlated neurons by tying the corresponding weights to a common value. This ability of GrOWL motivates the following two-stage procedure: (i) use GrOWL regularization in the training process to simultaneously identify significant neurons and groups of parameters that should be tied together; (ii) retrain the network, enforcing the structure that was unveiled in the previous phase, i.e., keeping only the significant neurons and enforcing the learned tying structure. We evaluate the proposed approach on several benchmark datasets, showing that it can dramatically compress the network with slight or even no loss in generalization performance.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=rypT3fb0b
PDF https://openreview.net/pdf?id=rypT3fb0b
PWC https://paperswithcode.com/paper/learning-to-share-simultaneous-parameter
Repo
Framework
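
As a companion to the abstract above, here is a small sketch of what a GrOWL-style penalty computes on a single weight matrix: the rows are treated as groups, their l2 norms are sorted in nonincreasing order, and a nonincreasing weight sequence is applied. The linearly decaying weight schedule and the row-wise grouping are illustrative assumptions, not necessarily the exact choices made in the paper.

```python
# Sketch of a GrOWL (group ordered weighted l1) penalty on one weight matrix,
# treating each row as a group. The weight schedule below is an assumption.
import numpy as np

def growl_penalty(W, lam1=1e-3, lam2=1e-4):
    """Weighted sum of sorted group (row) norms with nonincreasing weights."""
    norms = np.linalg.norm(W, axis=1)                  # one l2 norm per group (row)
    sorted_norms = np.sort(norms)[::-1]                # largest norm first
    n = len(sorted_norms)
    weights = lam1 + lam2 * np.arange(n - 1, -1, -1)   # nonincreasing: lam1 + lam2*(n-1), ..., lam1
    return float(np.dot(weights, sorted_norms))

# During training, this term would be added to the task loss for each layer;
# rows whose norms are driven to zero correspond to eliminated neurons, and
# rows pulled toward a common norm indicate candidates for parameter tying.
```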

Collecting Code-Switched Data from Social Media

Title Collecting Code-Switched Data from Social Media
Authors Gideon Mendels, Victor Soto, Aaron Jaech, Julia Hirschberg
Abstract
Tasks Language Identification, Language Modelling, Sentiment Analysis, Speech Recognition
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1107/
PDF https://www.aclweb.org/anthology/L18-1107
PWC https://paperswithcode.com/paper/collecting-code-switched-data-from-social
Repo
Framework

Modifying Non-Local Variations Across Multiple Views

Title Modifying Non-Local Variations Across Multiple Views
Authors Tal Tlusty, Tomer Michaeli, Tali Dekel, Lihi Zelnik-Manor
Abstract We present an algorithm for modifying small non-local variations between repeating structures and patterns in multiple images of the same scene. The modification is consistent across views, even though the images could have been photographed from different viewpoints and under different lighting conditions. We show that when each image is modified independently, the correspondence between the images breaks and the geometric structure of the scene gets distorted. Our approach modifies the views while maintaining correspondence; hence, we succeed in modifying appearance and structure variations consistently. We demonstrate our method on a number of challenging examples photographed under different lighting conditions, at different scales, and from different viewpoints.
Tasks
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Tlusty_Modifying_Non-Local_Variations_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Tlusty_Modifying_Non-Local_Variations_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/modifying-non-local-variations-across
Repo
Framework

Edition 1.1 of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions

Title Edition 1.1 of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions
Authors Carlos Ramisch, Silvio Ricardo Cordeiro, Agata Savary, Veronika Vincze, Verginica Barbu Mititelu, Archna Bhatia, Maja Buljan, Marie Candito, Polona Gantar, Voula Giouli, Tunga Güngör, Abdelati Hawwari, Uxoa Iñurrieta, Jolanta Kovalevskaitė, Simon Krek, Timm Lichte, Chaya Liebeskind, Johanna Monti, Carla Parra Escartín, Behrang QasemiZadeh, Renata Ramisch, Nathan Schneider, Ivelina Stoyanova, Ashwini Vaidya, Abigail Walsh
Abstract This paper describes the PARSEME Shared Task 1.1 on automatic identification of verbal multiword expressions. We present the annotation methodology, focusing on changes from last year's shared task. Novel aspects include enhanced annotation guidelines, additional annotated data for most languages, corpora for some new languages, and new evaluation settings. Corpora were created for 20 languages, which are also briefly discussed. We report organizational principles behind the shared task and the evaluation metrics employed for ranking. The 17 participating systems, their methods and obtained results are also presented and analysed.
Tasks
Published 2018-08-01
URL https://www.aclweb.org/anthology/W18-4925/
PDF https://www.aclweb.org/anthology/W18-4925
PWC https://paperswithcode.com/paper/edition-11-of-the-parseme-shared-task-on
Repo
Framework

Exploring word embeddings and phonological similarity for the unsupervised correction of language learner errors

Title Exploring word embeddings and phonological similarity for the unsupervised correction of language learner errors
Authors Ildikó Pilán, Elena Volodina
Abstract The presence of misspellings and other errors or non-standard word forms poses a considerable challenge for NLP systems. Although several supervised approaches have been proposed previously to normalize these, annotated training data is scarce for many languages. We therefore investigate an unsupervised method where correction candidates for Swedish language learners' errors are retrieved from word embeddings. Furthermore, we compare the usefulness of combining cosine similarity with orthographic and phonological similarity based on a neural grapheme-to-phoneme conversion system we train for this purpose. Although combinations of similarity measures have been explored for finding error correction candidates, it remains unclear how these measures relate to each other and how much they contribute individually to identifying the correct alternative. We experiment with different combinations of these and find that integrating phonological information is especially useful when the majority of learner errors are related to misspellings, but less so when errors are of a variety of types including, e.g., grammatical errors.
Tasks Language Acquisition, Word Embeddings
Published 2018-08-01
URL https://www.aclweb.org/anthology/W18-4514/
PDF https://www.aclweb.org/anthology/W18-4514
PWC https://paperswithcode.com/paper/exploring-word-embeddings-and-phonological
Repo
Framework
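
To make the candidate-ranking idea above concrete, here is a hedged sketch that combines embedding cosine similarity with a normalized edit-distance (orthographic) score. The mixing weight, the candidate set, and the omission of the phonological component (which would require the grapheme-to-phoneme model trained in the paper) are all simplifying assumptions.

```python
# Hedged sketch: rank correction candidates for a learner error by combining
# embedding similarity with orthographic similarity. Illustrative only.
import numpy as np

def levenshtein(a, b):
    """Plain dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def rank_candidates(error, candidates, vectors, alpha=0.6):
    """Score candidates by alpha * cosine + (1 - alpha) * orthographic similarity."""
    e = vectors[error]
    scored = []
    for c in candidates:
        v = vectors[c]
        cos = float(np.dot(e, v) / (np.linalg.norm(e) * np.linalg.norm(v)))
        orth = 1.0 - levenshtein(error, c) / max(len(error), len(c))
        scored.append((c, alpha * cos + (1 - alpha) * orth))
    return sorted(scored, key=lambda x: x[1], reverse=True)
```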

DNN Representations as Codewords: Manipulating Statistical Properties via Penalty Regularization

Title DNN Representations as Codewords: Manipulating Statistical Properties via Penalty Regularization
Authors Daeyoung Choi, Changho Shin, Hyunghun Cho, Wonjong Rhee
Abstract The performance of a deep neural network (DNN) heavily depends on the characteristics of its hidden-layer representations. Unlike the codewords of channel coding, however, the representations learned by a network cannot be directly designed or controlled. Therefore, we develop a family of penalty regularizers, each of which aims to affect one of the representation's statistical properties, such as sparsity, variance, or covariance. The regularizers are extended to perform class-wise regularization, and the extension is found to provide an outstanding shaping capability. A variety of statistical properties are investigated for 10 different regularization strategies, including dropout and batch normalization, and several interesting findings are reported. Using the family of regularizers, performance improvements are confirmed for MNIST, CIFAR-100, and CIFAR-10 classification problems. More importantly, our results suggest that understanding how to manipulate the statistical properties of representations can be an important step toward understanding DNNs, and that the role and effect of DNN regularizers need to be reconsidered.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=rkQu4Wb0Z
PDF https://openreview.net/pdf?id=rkQu4Wb0Z
PWC https://paperswithcode.com/paper/dnn-representations-as-codewords-manipulating
Repo
Framework
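
The abstract above refers to penalties on the sparsity, variance, and covariance of hidden representations. The snippet below gives illustrative numpy versions of what such penalties measure on a batch of activations; the exact definitions, class-wise variants, and weighting used in the paper may differ.

```python
# Illustrative activation penalties on a batch of hidden representations H
# with shape (batch_size, n_units). Definitions are assumptions, not the paper's.
import numpy as np

def sparsity_penalty(H):
    return float(np.abs(H).mean())            # pushes many activations toward zero

def variance_penalty(H):
    return float(H.var(axis=0).mean())        # shrinks per-unit variance across the batch

def covariance_penalty(H):
    Hc = H - H.mean(axis=0, keepdims=True)
    cov = (Hc.T @ Hc) / (H.shape[0] - 1)
    off_diag = cov - np.diag(np.diag(cov))
    return float((off_diag ** 2).sum())       # penalizes cross-unit covariance
```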

Exploiting Vector Fields for Geometric Rectification of Distorted Document Images

Title Exploiting Vector Fields for Geometric Rectification of Distorted Document Images
Authors Gaofeng Meng, Yuanqi Su, Ying Wu, Shiming Xiang, Chunhong Pan
Abstract This paper proposes a segment-free method for the geometric rectification of a distorted document image captured by a hand-held camera. The method can recover the 3D page shape by exploiting the intrinsic vector fields of the image. Based on the assumption that the curled page shape is a general cylindrical surface, we estimate the parameters related to the camera and the 3D shape model through weighted majority voting on the vector fields. The spatial directrix of the surface is then recovered by solving an ordinary differential equation (ODE) with the Euler method. Finally, the geometric distortions in images can be rectified by flattening the estimated 3D page surface onto a plane. Our method can exploit diverse types of visual cues available in a distorted document image to estimate its vector fields for 3D page shape recovery. In comparison to state-of-the-art methods, its great advantage is that it is segment-free and does not have to extract curved text lines or textual blocks, which is still a very challenging problem, especially for a distorted document image. Our method can therefore be freely applied to document images with extremely complicated page layouts and severe image quality degradation. Extensive experiments are conducted to demonstrate the effectiveness of the proposed method.
Tasks
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Gaofeng_Meng_Exploiting_Vector_Fields_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Gaofeng_Meng_Exploiting_Vector_Fields_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/exploiting-vector-fields-for-geometric
Repo
Framework
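
Since the abstract above states that the surface directrix is recovered by solving an ODE with the Euler method, here is a generic fixed-step forward-Euler integrator as a reminder of that numerical step. The actual ODE, derived from the estimated vector fields, is not reproduced here.

```python
# Generic forward-Euler integration of dy/dt = f(t, y); the specific ODE used
# for the page directrix comes from the estimated vector fields (not shown).
import numpy as np

def euler_integrate(f, y0, t0, t1, n_steps=1000):
    """Return the trajectory of y from t0 to t1 using fixed-step forward Euler."""
    t = t0
    y = np.asarray(y0, dtype=float)
    h = (t1 - t0) / n_steps
    trajectory = [y.copy()]
    for _ in range(n_steps):
        y = y + h * np.asarray(f(t, y))
        t += h
        trajectory.append(y.copy())
    return np.array(trajectory)

# Example: euler_integrate(lambda t, y: -y, [1.0], 0.0, 5.0) approximates exp(-t).
```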

Transfer Learning for a Letter-Ngrams to Word Decoder in the Context of Historical Handwriting Recognition with Scarce Resources

Title Transfer Learning for a Letter-Ngrams to Word Decoder in the Context of Historical Handwriting Recognition with Scarce Resources
Authors Adeline Granet, Emmanuel Morin, Harold Mouchère, Solen Quiniou, Christian Viard-Gaudin
Abstract Lack of data can be an issue when beginning a new study on historical handwritten documents. In order to deal with this, we present the character-based decoder part of a multilingual approach based on transductive transfer learning for a historical handwriting recognition task on Italian Comedy Registers. The decoder must build a sequence of characters that corresponds to a word from a vector of letter n-grams. As learning data, we created a new dataset from untapped resources that covers the same domain and period as our Italian Comedy data, as well as resources from common domains, periods, or languages. We obtain a 97.42% Character Recognition Rate and an 86.57% Word Recognition Rate on our Italian Comedy data, despite a lexical coverage of 67% between the Italian Comedy data and the training data. These results show that an efficient system can be obtained by carefully selecting the datasets used for the transfer learning.
Tasks Information Retrieval, Keyword Spotting, Language Modelling, Transfer Learning
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1125/
PDF https://www.aclweb.org/anthology/C18-1125
PWC https://paperswithcode.com/paper/transfer-learning-for-a-letter-ngrams-to-word
Repo
Framework
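
The decoder described above consumes a vector of letter n-grams per word. The snippet below sketches one common way to build such a representation as a bag of character n-grams with boundary markers; the n-gram orders and the markers are illustrative assumptions, not the paper's exact feature set.

```python
# Hedged sketch: turn a word into a bag of character n-grams with boundary
# markers. The choice of n = 1..3 and the '<' '>' markers are assumptions.
from collections import Counter

def letter_ngrams(word, n_values=(1, 2, 3)):
    """Return counts of character n-grams of the padded word."""
    padded = f"<{word}>"
    grams = Counter()
    for n in n_values:
        for i in range(len(padded) - n + 1):
            grams[padded[i:i + n]] += 1
    return grams

# Example: letter_ngrams("arte") includes '<a', 'ar', 'rt', 'te', 'e>', 'art', 'rte', ...
```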

Trading robust representations for sample complexity through self-supervised visual experience

Title Trading robust representations for sample complexity through self-supervised visual experience
Authors Andrea Tacchetti, Stephen Voinea, Georgios Evangelopoulos
Abstract Learning in small sample regimes is among the most remarkable features of the human perceptual system. This ability is related to robustness to transformations, which is acquired through visual experience in the form of weak or self-supervision during development. We explore the idea of allowing artificial systems to learn representations of visual stimuli through weak supervision prior to downstream supervised tasks. We introduce a novel loss function for representation learning using unlabeled image sets and video sequences, and experimentally demonstrate that these representations support one-shot learning and reduce the sample complexity of multiple recognition tasks. We establish the existence of a trade-off between the size of weakly supervised data sets, automatically obtained from video sequences, and the size of fully supervised data sets. Our results suggest that equivalence sets other than class labels, which are abundant in unlabeled visual experience, can be used for self-supervised learning of semantically relevant image embeddings.
Tasks One-Shot Learning, Representation Learning
Published 2018-12-01
URL http://papers.nips.cc/paper/8170-trading-robust-representations-for-sample-complexity-through-self-supervised-visual-experience
PDF http://papers.nips.cc/paper/8170-trading-robust-representations-for-sample-complexity-through-self-supervised-visual-experience.pdf
PWC https://paperswithcode.com/paper/trading-robust-representations-for-sample
Repo
Framework
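
The loss function proposed in the paper is not spelled out in the abstract, so the following is only a loose sketch of the general idea of learning from equivalence sets (e.g., frames of the same video): pull embeddings within a set together and push embeddings from different sets apart. It is not the paper's loss; the margin and distance choices are assumptions.

```python
# Loose sketch of an equivalence-set embedding loss (NOT the paper's loss).
# Z: (n, d) embeddings; set_ids: (n,) equivalence-set label for each sample.
# Assumes every set contributes at least two samples and >1 set is present.
import numpy as np

def equivalence_set_loss(Z, set_ids, margin=1.0):
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    dists = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)   # pairwise distances
    ids = np.asarray(set_ids)
    same = ids[:, None] == ids[None, :]
    off_diag = ~np.eye(len(ids), dtype=bool)
    pos = dists[same & off_diag].mean()                  # pull same-set pairs together
    neg = np.maximum(0.0, margin - dists[~same]).mean()  # push different-set pairs apart
    return float(pos + neg)
```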

User-Level Race and Ethnicity Predictors from Twitter Text

Title User-Level Race and Ethnicity Predictors from Twitter Text
Authors Daniel Preoţiuc-Pietro, Lyle Ungar
Abstract User demographic inference from social media text has the potential to improve a range of downstream applications, including real-time passive polling or quantifying demographic bias. This study focuses on developing models for user-level race and ethnicity prediction. We introduce a data set of users who self-report their race/ethnicity through a survey, in contrast to previous approaches that use distantly supervised data or perceived labels. We develop predictive models from text which accurately predict the membership of a user to the four largest racial and ethnic groups with up to .884 AUC and make these available to the research community.
Tasks
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1130/
PDF https://www.aclweb.org/anthology/C18-1130
PWC https://paperswithcode.com/paper/user-level-race-and-ethnicity-predictors-from
Repo
Framework
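
To make the reported evaluation concrete, here is a hedged sketch of training a text classifier on user-level text and scoring it with one-vs-rest AUC, the metric quoted above. The TF-IDF features and logistic regression model are illustrative stand-ins, not the paper's feature set or model.

```python
# Hedged sketch: user-level text classification evaluated with one-vs-rest AUC.
# Features and model are illustrative, not those used in the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

def evaluate_auc(user_texts, labels):
    """Train on held-in users and report macro one-vs-rest AUC on held-out users."""
    X_tr, X_te, y_tr, y_te = train_test_split(user_texts, labels, test_size=0.2, random_state=0)
    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=2),
                        LogisticRegression(max_iter=1000))
    clf.fit(X_tr, y_tr)
    scores = clf.predict_proba(X_te)
    return roc_auc_score(y_te, scores, multi_class="ovr", average="macro", labels=clf.classes_)
```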

Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences

Title Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences
Authors Daniel Khashabi, Snigdha Chaturvedi, Michael Roth, Shyam Upadhyay, Dan Roth
Abstract We present a reading comprehension challenge in which questions can only be answered by taking into account information from multiple sentences. We solicit and verify questions and answers for this challenge through a 4-step crowdsourcing experiment. Our challenge dataset contains 6,500+ questions for 1000+ paragraphs across 7 different domains (elementary school science, news, travel guides, fiction stories, etc.), bringing linguistic diversity to the texts and to the question wordings. On a subset of our dataset, we found human solvers to achieve an F1-score of 88.1%. We analyze a range of baselines, including a recent state-of-the-art reading comprehension system, and demonstrate the difficulty of this challenge, despite high human performance. The dataset is the first to study multi-sentence inference at scale, with an open-ended set of question types that requires reasoning skills.
Tasks Natural Language Inference, Question Answering, Reading Comprehension
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-1023/
PDF https://www.aclweb.org/anthology/N18-1023
PWC https://paperswithcode.com/paper/looking-beyond-the-surface-a-challenge-set
Repo
Framework

CLPsych 2018 Shared Task: Predicting Current and Future Psychological Health from Childhood Essays

Title CLPsych 2018 Shared Task: Predicting Current and Future Psychological Health from Childhood Essays
Authors Veronica Lynn, Alissa Goodman, Kate Niederhoffer, Kate Loveys, Philip Resnik, H. Andrew Schwartz
Abstract We describe the shared task for the CLPsych 2018 workshop, which focused on predicting current and future psychological health from an essay authored in childhood. Language-based predictions of a person's current health have the potential to supplement traditional psychological assessment such as questionnaires, improving intake risk measurement and monitoring. Predictions of future psychological health can aid with both early detection and the development of preventative care. Research into the mental health trajectory of people, beginning from their childhood, has thus far been an area of little work within the NLP community. This shared task represents one of the first attempts to evaluate the use of early language to predict future health; this has the potential to support a wide variety of clinical health care tasks, from early assessment of lifetime risk for mental health problems, to optimal timing for targeted interventions aimed at both prevention and treatment.
Tasks
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-0604/
PDF https://www.aclweb.org/anthology/W18-0604
PWC https://paperswithcode.com/paper/clpsych-2018-shared-task-predicting-current
Repo
Framework

Searching for the X-Factor: Exploring Corpus Subjectivity for Word Embeddings

Title Searching for the X-Factor: Exploring Corpus Subjectivity for Word Embeddings
Authors Maksim Tkachenko, Chong Cher Chia, Hady Lauw
Abstract We explore the notion of subjectivity, and hypothesize that word embeddings learnt from input corpora of varying levels of subjectivity behave differently on natural language processing tasks such as classifying a sentence by sentiment, subjectivity, or topic. Through systematic comparative analyses, we establish this to be the case indeed. Moreover, based on the discovery of the outsized role that sentiment words play on subjectivity-sensitive tasks such as sentiment classification, we develop a novel word embedding SentiVec which is infused with sentiment information from a lexical resource, and is shown to outperform baselines on such tasks.
Tasks Sentiment Analysis, Word Embeddings
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-1112/
PDF https://www.aclweb.org/anthology/P18-1112
PWC https://paperswithcode.com/paper/searching-for-the-x-factor-exploring-corpus
Repo
Framework

Predicting Adolescents’ Educational Track from Chat Messages on Dutch Social Media

Title Predicting Adolescents’ Educational Track from Chat Messages on Dutch Social Media
Authors Lisa Hilte, Walter Daelemans, Reinhild Vandekerckhove
Abstract We aim to predict Flemish adolescents' educational track based on their Dutch social media writing. We distinguish between the three main types of Belgian secondary education: General (theory-oriented), Vocational (practice-oriented), and Technical Secondary Education (hybrid). The best results are obtained with a Naive Bayes model, i.e. an F-score of 0.68 (std. dev. 0.05) in 10-fold cross-validation experiments on the training data and an F-score of 0.60 on unseen data. Many of the most informative features are character n-grams containing specific occurrences of chatspeak phenomena such as emoticons. While the detection of the most theory- and practice-oriented educational tracks seems to be a relatively easy task, the hybrid Technical level appears to be much harder to capture based on online writing style, as expected.
Tasks
Published 2018-10-01
URL https://www.aclweb.org/anthology/W18-6248/
PDF https://www.aclweb.org/anthology/W18-6248
PWC https://paperswithcode.com/paper/predicting-adolescents-educational-track-from
Repo
Framework
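
The setup described above (character n-gram features, a Naive Bayes classifier, F-score under 10-fold cross-validation) maps naturally onto a short scikit-learn pipeline. The sketch below assumes an n-gram range and macro-averaged F1; these details are not specified in the abstract.

```python
# Hedged sketch of the described setup: character n-grams + Naive Bayes,
# evaluated with macro F1 under 10-fold cross-validation. N-gram range assumed.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def crossval_f1(messages, tracks):
    """messages: list of chat texts; tracks: educational-track labels."""
    clf = make_pipeline(
        CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # character n-gram counts
        MultinomialNB(),
    )
    scores = cross_val_score(clf, messages, tracks, cv=10, scoring="f1_macro")
    return scores.mean(), scores.std()
```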