Paper Group NANR 134
Deep Neural Networks at the Service of Multilingual Parallel Sentence Extraction. LEARNING TO SHARE: SIMULTANEOUS PARAMETER TYING AND SPARSIFICATION IN DEEP LEARNING. Collecting Code-Switched Data from Social Media. Modifying Non-Local Variations Across Multiple Views. Edition 1.1 of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions …
Deep Neural Networks at the Service of Multilingual Parallel Sentence Extraction
Title | Deep Neural Networks at the Service of Multilingual Parallel Sentence Extraction |
Authors | Ahmad Aghaebrahimian |
Abstract | Wikipedia provides an invaluable source of parallel multilingual data, which are in high demand for various sorts of linguistic inquiry, including both theoretical and practical studies. We introduce a novel end-to-end neural model for large-scale parallel data harvesting from Wikipedia. Our model is language-independent, robust, and highly scalable. We use our system to collect parallel German-English, French-English, and Persian-English sentences. Human evaluations show the strong performance of this model in collecting high-quality parallel data. We also propose a statistical framework which extends the results of our human evaluation to other language pairs. Our model also obtained a state-of-the-art result on the German-English dataset of the BUCC 2017 shared task on parallel sentence extraction from comparable corpora. |
Tasks | Information Retrieval, Machine Translation |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1116/ |
https://www.aclweb.org/anthology/C18-1116 | |
PWC | https://paperswithcode.com/paper/deep-neural-networks-at-the-service-of |
Repo | |
Framework | |
LEARNING TO SHARE: SIMULTANEOUS PARAMETER TYING AND SPARSIFICATION IN DEEP LEARNING
Title | LEARNING TO SHARE: SIMULTANEOUS PARAMETER TYING AND SPARSIFICATION IN DEEP LEARNING |
Authors | Dejiao Zhang, Haozhu Wang, Mario Figueiredo, Laura Balzano |
Abstract | Deep neural networks (DNNs) usually contain millions, maybe billions, of parameters/weights, making both storage and computation very expensive. This has motivated a large body of work to reduce the complexity of the neural network by using sparsity-inducing regularizers. Another well-known approach for controlling the complexity of DNNs is parameter sharing/tying, where certain sets of weights are forced to share a common value. Some forms of weight sharing are hard-wired to express certain invariances, with a notable example being the shift-invariance of convolutional layers. However, there may be other groups of weights that may be tied together during the learning process, thus further reducing the complexity of the network. In this paper, we adopt a recently proposed sparsity-inducing regularizer, named GrOWL (group ordered weighted l1), which encourages sparsity and, simultaneously, learns which groups of parameters should share a common value. GrOWL has been proven effective in linear regression, being able to identify and cope with strongly correlated covariates. Unlike standard sparsity-inducing regularizers (e.g., l1, a.k.a. Lasso), GrOWL not only eliminates unimportant neurons by setting all the corresponding weights to zero, but also explicitly identifies strongly correlated neurons by tying the corresponding weights to a common value. This ability of GrOWL motivates the following two-stage procedure: (i) use GrOWL regularization during training to simultaneously identify significant neurons and groups of parameters that should be tied together; (ii) retrain the network, enforcing the structure that was unveiled in the previous phase, i.e., keeping only the significant neurons and enforcing the learned tying structure. We evaluate the proposed approach on several benchmark datasets, showing that it can dramatically compress the network with slight or even no loss in generalization performance. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rypT3fb0b |
https://openreview.net/pdf?id=rypT3fb0b | |
PWC | https://paperswithcode.com/paper/learning-to-share-simultaneous-parameter |
Repo | |
Framework | |
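The GrOWL regularizer described in the abstract above penalizes the sorted row norms of a layer's weight matrix with a non-increasing weight vector, which jointly encourages row sparsity and the tying of strongly correlated neurons. Below is a minimal NumPy sketch of that penalty; the choice of `lambdas`, the example dimensions, and the integration into a training loop (proximal updates, the retraining phase) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def growl_penalty(W, lambdas):
    """Group Ordered Weighted L1 (GrOWL) penalty on the rows of W.

    Sorts the row-wise L2 norms in decreasing order and takes their inner
    product with a non-increasing, non-negative weight vector `lambdas`.
    Larger rows are penalized more heavily, which encourages both row
    sparsity and the tying of rows with similar magnitude.
    """
    row_norms = np.linalg.norm(W, axis=1)        # one L2 norm per row/neuron
    sorted_norms = np.sort(row_norms)[::-1]      # decreasing order
    lambdas = np.asarray(lambdas, dtype=float)
    assert np.all(np.diff(lambdas) <= 0) and lambdas[-1] >= 0
    return float(np.dot(lambdas, sorted_norms))

# Example: a layer with 5 input neurons (rows) and 3 output units (columns).
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 3))
lambdas = np.linspace(1.0, 0.2, num=5)           # non-increasing weights
print(growl_penalty(W, lambdas))
```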
Collecting Code-Switched Data from Social Media
Title | Collecting Code-Switched Data from Social Media |
Authors | Gideon Mendels, Victor Soto, Aaron Jaech, Julia Hirschberg |
Abstract | |
Tasks | Language Identification, Language Modelling, Sentiment Analysis, Speech Recognition |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1107/ |
https://www.aclweb.org/anthology/L18-1107 | |
PWC | https://paperswithcode.com/paper/collecting-code-switched-data-from-social |
Repo | |
Framework | |
Modifying Non-Local Variations Across Multiple Views
Title | Modifying Non-Local Variations Across Multiple Views |
Authors | Tal Tlusty, Tomer Michaeli, Tali Dekel, Lihi Zelnik-Manor |
Abstract | We present an algorithm for modifying small non-local variations between repeating structures and patterns in multiple images of the same scene. The modification is consistent across views, even though the images could have been photographed from different viewpoints and under different lighting conditions. We show that when modifying each image independently, the correspondence between the images breaks and the geometric structure of the scene gets distorted. Our approach modifies the views while maintaining correspondence; hence, we succeed in modifying appearance and structure variations consistently. We demonstrate our method on a number of challenging examples photographed under different lighting, scales, and viewpoints. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Tlusty_Modifying_Non-Local_Variations_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Tlusty_Modifying_Non-Local_Variations_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/modifying-non-local-variations-across |
Repo | |
Framework | |
Edition 1.1 of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions
Title | Edition 1.1 of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions |
Authors | Carlos Ramisch, Silvio Ricardo Cordeiro, Agata Savary, Veronika Vincze, Verginica Barbu Mititelu, Archna Bhatia, Maja Buljan, Marie Candito, Polona Gantar, Voula Giouli, Tunga Güngör, Abdelati Hawwari, Uxoa Iñurrieta, Jolanta Kovalevskaitė, Simon Krek, Timm Lichte, Chaya Liebeskind, Johanna Monti, Carla Parra Escartín, Behrang QasemiZadeh, Renata Ramisch, Nathan Schneider, Ivelina Stoyanova, Ashwini Vaidya, Abigail Walsh |
Abstract | This paper describes the PARSEME Shared Task 1.1 on automatic identification of verbal multiword expressions. We present the annotation methodology, focusing on changes from last year's shared task. Novel aspects include enhanced annotation guidelines, additional annotated data for most languages, corpora for some new languages, and new evaluation settings. Corpora were created for 20 languages, which are also briefly discussed. We report organizational principles behind the shared task and the evaluation metrics employed for ranking. The 17 participating systems, their methods and obtained results are also presented and analysed. |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4925/ |
https://www.aclweb.org/anthology/W18-4925 | |
PWC | https://paperswithcode.com/paper/edition-11-of-the-parseme-shared-task-on |
Repo | |
Framework | |
Exploring word embeddings and phonological similarity for the unsupervised correction of language learner errors
Title | Exploring word embeddings and phonological similarity for the unsupervised correction of language learner errors |
Authors | Ildikó Pilán, Elena Volodina |
Abstract | The presence of misspellings and other errors or non-standard word forms poses a considerable challenge for NLP systems. Although several supervised approaches have been proposed previously to normalize these, annotated training data is scarce for many languages. We therefore investigate an unsupervised method in which correction candidates for Swedish language learners' errors are retrieved from word embeddings. Furthermore, we compare the usefulness of combining cosine similarity with orthographic and phonological similarity, based on a neural grapheme-to-phoneme conversion system we train for this purpose. Although combinations of similarity measures have been explored for finding error correction candidates, it remains unclear how these measures relate to each other and how much they contribute individually to identifying the correct alternative. We experiment with different combinations of these and find that integrating phonological information is especially useful when the majority of learner errors are related to misspellings, but less so when errors are of a variety of types including, e.g., grammatical errors. |
Tasks | Language Acquisition, Word Embeddings |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4514/ |
https://www.aclweb.org/anthology/W18-4514 | |
PWC | https://paperswithcode.com/paper/exploring-word-embeddings-and-phonological |
Repo | |
Framework | |
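The approach above ranks correction candidates by combining cosine similarity from word embeddings with orthographic and phonological similarity. The sketch below shows one hedged way to implement such a weighted combination: the candidate list and cosine scores are assumed to come from a pre-trained embedding model, the interpolation weights are arbitrary, and `fake_g2p` is a toy stand-in for the neural grapheme-to-phoneme system the authors train.

```python
import numpy as np

def levenshtein(a, b):
    """Standard edit distance via dynamic programming."""
    m, n = len(a), len(b)
    d = np.zeros((m + 1, n + 1), dtype=int)
    d[:, 0] = np.arange(m + 1)
    d[0, :] = np.arange(n + 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1, d[i - 1, j - 1] + cost)
    return int(d[m, n])

def similarity(a, b):
    """Edit distance normalized to a [0, 1] similarity."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))

def rank_candidates(error, candidates, cosine_sims, to_phonemes, weights=(0.4, 0.3, 0.3)):
    """Score embedding-retrieved candidates by a weighted combination of
    cosine, orthographic, and phonological similarity."""
    w_cos, w_orth, w_phon = weights
    err_phon = to_phonemes(error)
    scored = [(cand,
               w_cos * cos
               + w_orth * similarity(error, cand)
               + w_phon * similarity(err_phon, to_phonemes(cand)))
              for cand, cos in zip(candidates, cosine_sims)]
    return sorted(scored, key=lambda x: x[1], reverse=True)

# Toy usage with a naive "phonetic" proxy standing in for a trained G2P model.
fake_g2p = lambda w: w.lower().replace("ph", "f").replace("ck", "k")
print(rank_candidates("foto", ["photo", "moto", "font"], [0.8, 0.4, 0.3], fake_g2p))
```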
DNN Representations as Codewords: Manipulating Statistical Properties via Penalty Regularization
Title | DNN Representations as Codewords: Manipulating Statistical Properties via Penalty Regularization |
Authors | Daeyoung Choi, Changho Shin, Hyunghun Cho, Wonjong Rhee |
Abstract | The performance of a deep neural network (DNN) heavily depends on the characteristics of its hidden layer representations. Unlike the codewords of channel coding, however, the representations of learning cannot be directly designed or controlled. Therefore, we develop a family of penalty regularizers where each one aims to affect one of the representation's statistical properties, such as sparsity, variance, or covariance. The regularizers are extended to perform class-wise regularization, and the extension is found to provide an outstanding shaping capability. A variety of statistical properties are investigated for 10 different regularization strategies, including dropout and batch normalization, and several interesting findings are reported. Using the family of regularizers, performance improvements are confirmed for MNIST, CIFAR-100, and CIFAR-10 classification problems. More importantly, our results suggest that understanding how to manipulate the statistical properties of representations can be an important step toward understanding DNNs, and that the role and effect of DNN regularizers need to be reconsidered. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rkQu4Wb0Z |
https://openreview.net/pdf?id=rkQu4Wb0Z | |
PWC | https://paperswithcode.com/paper/dnn-representations-as-codewords-manipulating |
Repo | |
Framework | |
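The regularizers described above each target one statistical property of hidden representations, such as sparsity, variance, or covariance. The following NumPy sketch shows generic penalties of this kind computed on a batch of activations; the exact functional forms, coefficients, and class-wise extension from the paper are not reproduced, so treat this only as an illustration of the idea.

```python
import numpy as np

def sparsity_penalty(H):
    """Mean absolute activation; pushes hidden representations toward sparsity."""
    return float(np.mean(np.abs(H)))

def variance_penalty(H):
    """Mean per-unit variance across the batch; shrinks the spread of each unit."""
    return float(np.mean(np.var(H, axis=0)))

def covariance_penalty(H):
    """Sum of squared off-diagonal covariance entries; discourages correlated units."""
    Hc = H - H.mean(axis=0, keepdims=True)
    cov = Hc.T @ Hc / max(H.shape[0] - 1, 1)
    off_diag = cov - np.diag(np.diag(cov))
    return float(np.sum(off_diag ** 2))

# A batch of 32 hidden representations with 64 units each; the coefficients are arbitrary.
rng = np.random.default_rng(1)
H = rng.normal(size=(32, 64))
loss = 0.0  # the task loss would go here
loss += 1e-3 * sparsity_penalty(H) + 1e-3 * variance_penalty(H) + 1e-4 * covariance_penalty(H)
print(loss)
```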
Exploiting Vector Fields for Geometric Rectification of Distorted Document Images
Title | Exploiting Vector Fields for Geometric Rectification of Distorted Document Images |
Authors | Gaofeng Meng, Yuanqi Su, Ying Wu, Shiming Xiang, Chunhong Pan |
Abstract | This paper proposes a segment-free method for geometric rectification of a distorted document image captured by a hand-held camera. The method can recover the 3D page shape by exploiting the intrinsic vector fields of the image. Based on the assumption that the curled page shape is a general cylindrical surface, we estimate the parameters related to the camera and the 3D shape model through weighted majority voting on the vector fields. Then the spatial directrix of the surface is recovered by solving an ordinary differential equation (ODE) through the Euler method. Finally, the geometric distortions in images can be rectified by flattening the estimated 3D page surface onto a plane. Our method can exploit diverse types of visual cues available in a distorted document image to estimate its vector fields for 3D page shape recovery. In comparison to the state-of-the-art methods, the great advantage is that it is a segment-free method and does not have to extract curved text lines or textual blocks, which is still a very challenging problem especially for a distorted document image. Our method can therefore be freely applied to document images with extremely complicated page layouts and severe image quality degradation. Extensive experiments are implemented to demonstrate the effectiveness of the proposed method. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Gaofeng_Meng_Exploiting_Vector_Fields_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Gaofeng_Meng_Exploiting_Vector_Fields_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-vector-fields-for-geometric |
Repo | |
Framework | |
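The method above recovers the spatial directrix of the cylindrical page surface by solving an ODE with the Euler method. The sketch below is a generic forward Euler integrator; the slope function in the example is hypothetical and merely stands in for the ODE the authors derive from the image's vector fields.

```python
import numpy as np

def euler_integrate(f, y0, x0, x1, n_steps=1000):
    """Forward Euler integration of dy/dx = f(x, y) from x0 to x1.

    Returns the sampled xs and the integrated curve ys; in the paper's
    setting, y would trace the directrix of the cylindrical page surface.
    """
    xs = np.linspace(x0, x1, n_steps + 1)
    h = (x1 - x0) / n_steps
    ys = [np.asarray(y0, dtype=float)]
    for x in xs[:-1]:
        ys.append(ys[-1] + h * np.asarray(f(x, ys[-1])))
    return xs, np.stack(ys)

# Toy slope field standing in for the directrix ODE (hypothetical, not the paper's equation).
slope = lambda x, y: np.array([np.cos(x)])
xs, ys = euler_integrate(slope, y0=[0.0], x0=0.0, x1=np.pi)
print(ys[-1])  # close to sin(pi) = 0, up to Euler discretization error
```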
Transfer Learning for a Letter-Ngrams to Word Decoder in the Context of Historical Handwriting Recognition with Scarce Resources
Title | Transfer Learning for a Letter-Ngrams to Word Decoder in the Context of Historical Handwriting Recognition with Scarce Resources |
Authors | Adeline Granet, Emmanuel Morin, Harold Mouchère, Solen Quiniou, Christian Viard-Gaudin |
Abstract | Lack of data can be an issue when beginning a new study on historical handwritten documents. In order to deal with this, we present the character-based decoder part of a multilingual approach based on transductive transfer learning for a historical handwriting recognition task on Italian Comedy Registers. The decoder must build a sequence of characters that corresponds to a word from a vector of letter n-grams. As learning data, we created a new dataset from untapped resources that covers the same domain and period as our Italian Comedy data, as well as resources from common domains, periods, or languages. We obtain a 97.42% Character Recognition Rate and an 86.57% Word Recognition Rate on our Italian Comedy data, despite a lexical coverage of 67% between the Italian Comedy data and the training data. These results show that an efficient system can be obtained by carefully selecting the datasets used for transfer learning. |
Tasks | Information Retrieval, Keyword Spotting, Language Modelling, Transfer Learning |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1125/ |
https://www.aclweb.org/anthology/C18-1125 | |
PWC | https://paperswithcode.com/paper/transfer-learning-for-a-letter-ngrams-to-word |
Repo | |
Framework | |
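The decoder described above maps a vector of letter n-grams back to the character sequence of a word. The snippet below illustrates only the encoding side, i.e. how a word can be represented as a bag of letter n-grams with boundary markers; the n-gram orders and boundary symbol are assumptions, and the neural decoder itself is not shown.

```python
from collections import Counter

def letter_ngrams(word, n_values=(1, 2, 3), boundary="#"):
    """Bag of letter n-grams for a word, with boundary markers.

    This is only the encoding side: the paper's decoder learns to map
    such a vector back to the character sequence of the word.
    """
    padded = boundary + word + boundary
    grams = Counter()
    for n in n_values:
        for i in range(len(padded) - n + 1):
            grams[padded[i:i + n]] += 1
    return grams

print(letter_ngrams("comedie"))
```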
Trading robust representations for sample complexity through self-supervised visual experience
Title | Trading robust representations for sample complexity through self-supervised visual experience |
Authors | Andrea Tacchetti, Stephen Voinea, Georgios Evangelopoulos |
Abstract | Learning in small-sample regimes is among the most remarkable features of the human perceptual system. This ability is related to robustness to transformations, which is acquired through visual experience in the form of weak supervision or self-supervision during development. We explore the idea of allowing artificial systems to learn representations of visual stimuli through weak supervision prior to downstream supervised tasks. We introduce a novel loss function for representation learning using unlabeled image sets and video sequences, and experimentally demonstrate that these representations support one-shot learning and reduce the sample complexity of multiple recognition tasks. We establish the existence of a trade-off between the sizes of weakly supervised data sets, automatically obtained from video sequences, and fully supervised data sets. Our results suggest that equivalence sets other than class labels, which are abundant in unlabeled visual experience, can be used for self-supervised learning of semantically relevant image embeddings. |
Tasks | One-Shot Learning, Representation Learning |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8170-trading-robust-representations-for-sample-complexity-through-self-supervised-visual-experience |
http://papers.nips.cc/paper/8170-trading-robust-representations-for-sample-complexity-through-self-supervised-visual-experience.pdf | |
PWC | https://paperswithcode.com/paper/trading-robust-representations-for-sample |
Repo | |
Framework | |
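The work above is built around equivalence sets (e.g. frames of the same video sequence) as a self-supervisory signal, but the abstract does not spell out the loss function. As a rough illustration of the general idea only, the sketch below uses a standard contrastive pairwise loss over equivalence sets; this is explicitly not the loss proposed in the paper.

```python
import numpy as np

def equivalence_set_loss(embeddings, set_ids, margin=1.0):
    """Margin-based pairwise loss over equivalence sets.

    Pairs from the same set are pulled together; pairs from different
    sets are pushed at least `margin` apart. Generic stand-in, not the
    authors' loss.
    """
    loss, pairs = 0.0, 0
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(embeddings[i] - embeddings[j])
            if set_ids[i] == set_ids[j]:
                loss += d ** 2                      # attract members of the same set
            else:
                loss += max(0.0, margin - d) ** 2   # repel members of different sets
            pairs += 1
    return loss / pairs

rng = np.random.default_rng(2)
emb = rng.normal(size=(6, 16))   # 6 frames, 16-d embeddings
sets = [0, 0, 0, 1, 1, 1]        # two video sequences treated as equivalence sets
print(equivalence_set_loss(emb, sets))
```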
User-Level Race and Ethnicity Predictors from Twitter Text
Title | User-Level Race and Ethnicity Predictors from Twitter Text |
Authors | Daniel Preoţiuc-Pietro, Lyle Ungar |
Abstract | User demographic inference from social media text has the potential to improve a range of downstream applications, including real-time passive polling or quantifying demographic bias. This study focuses on developing models for user-level race and ethnicity prediction. We introduce a data set of users who self-report their race/ethnicity through a survey, in contrast to previous approaches that use distantly supervised data or perceived labels. We develop predictive models from text which accurately predict the membership of a user to the four largest racial and ethnic groups with up to .884 AUC and make these available to the research community. |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1130/ |
https://www.aclweb.org/anthology/C18-1130 | |
PWC | https://paperswithcode.com/paper/user-level-race-and-ethnicity-predictors-from |
Repo | |
Framework | |
Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences
Title | Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences |
Authors | Daniel Khashabi, Snigdha Chaturvedi, Michael Roth, Shyam Upadhyay, Dan Roth |
Abstract | We present a reading comprehension challenge in which questions can only be answered by taking into account information from multiple sentences. We solicit and verify questions and answers for this challenge through a 4-step crowdsourcing experiment. Our challenge dataset contains 6,500+ questions for 1000+ paragraphs across 7 different domains (elementary school science, news, travel guides, fiction stories, etc.), bringing linguistic diversity to the texts and to the wording of the questions. On a subset of our dataset, we found human solvers to achieve an F1-score of 88.1%. We analyze a range of baselines, including a recent state-of-the-art reading comprehension system, and demonstrate the difficulty of this challenge despite high human performance. The dataset is the first to study multi-sentence inference at scale, with an open-ended set of question types that require reasoning skills. |
Tasks | Natural Language Inference, Question Answering, Reading Comprehension |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-1023/ |
https://www.aclweb.org/anthology/N18-1023 | |
PWC | https://paperswithcode.com/paper/looking-beyond-the-surface-a-challenge-set |
Repo | |
Framework | |
CLPsych 2018 Shared Task: Predicting Current and Future Psychological Health from Childhood Essays
Title | CLPsych 2018 Shared Task: Predicting Current and Future Psychological Health from Childhood Essays |
Authors | Veronica Lynn, Alissa Goodman, Kate Niederhoffer, Kate Loveys, Philip Resnik, H. Andrew Schwartz |
Abstract | We describe the shared task for the CLPsych 2018 workshop, which focused on predicting current and future psychological health from an essay authored in childhood. Language-based predictions of a person's current health have the potential to supplement traditional psychological assessment such as questionnaires, improving intake risk measurement and monitoring. Predictions of future psychological health can aid with both early detection and the development of preventative care. Research into the mental health trajectory of people, beginning from their childhood, has thus far been an area of little work within the NLP community. This shared task represents one of the first attempts to evaluate the use of early language to predict future health; this has the potential to support a wide variety of clinical health care tasks, from early assessment of lifetime risk for mental health problems, to optimal timing for targeted interventions aimed at both prevention and treatment. |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-0604/ |
https://www.aclweb.org/anthology/W18-0604 | |
PWC | https://paperswithcode.com/paper/clpsych-2018-shared-task-predicting-current |
Repo | |
Framework | |
Searching for the X-Factor: Exploring Corpus Subjectivity for Word Embeddings
Title | Searching for the X-Factor: Exploring Corpus Subjectivity for Word Embeddings |
Authors | Maksim Tkachenko, Chong Cher Chia, Hady Lauw |
Abstract | We explore the notion of subjectivity, and hypothesize that word embeddings learnt from input corpora of varying levels of subjectivity behave differently on natural language processing tasks such as classifying a sentence by sentiment, subjectivity, or topic. Through systematic comparative analyses, we establish this to be the case indeed. Moreover, based on the discovery of the outsized role that sentiment words play on subjectivity-sensitive tasks such as sentiment classification, we develop a novel word embedding SentiVec which is infused with sentiment information from a lexical resource, and is shown to outperform baselines on such tasks. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1112/ |
https://www.aclweb.org/anthology/P18-1112 | |
PWC | https://paperswithcode.com/paper/searching-for-the-x-factor-exploring-corpus |
Repo | |
Framework | |
Predicting Adolescents’ Educational Track from Chat Messages on Dutch Social Media
Title | Predicting Adolescents’ Educational Track from Chat Messages on Dutch Social Media |
Authors | Lisa Hilte, Walter Daelemans, Reinhild Vandekerckhove |
Abstract | We aim to predict Flemish adolescents' educational track based on their Dutch social media writing. We distinguish between the three main types of Belgian secondary education: General (theory-oriented), Vocational (practice-oriented), and Technical Secondary Education (hybrid). The best results are obtained with a Naive Bayes model, i.e. an F-score of 0.68 (std. dev. 0.05) in 10-fold cross-validation experiments on the training data and an F-score of 0.60 on unseen data. Many of the most informative features are character n-grams containing specific occurrences of chatspeak phenomena such as emoticons. While the detection of the most theory- and practice-oriented educational tracks seems to be a relatively easy task, the hybrid Technical level appears to be much harder to capture based on online writing style, as expected. |
Tasks | |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6248/ |
https://www.aclweb.org/anthology/W18-6248 | |
PWC | https://paperswithcode.com/paper/predicting-adolescents-educational-track-from |
Repo | |
Framework | |
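The best model reported above is a Naive Bayes classifier over character n-gram features, evaluated with 10-fold cross-validation. Below is a hedged scikit-learn sketch of that setup; the toy messages and labels are invented placeholders (not the Flemish chat corpus), and the exact feature configuration and evaluation protocol differ from the paper's.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Toy chat messages and educational-track labels (hypothetical data, for illustration only).
texts = ["hey hoe gaat het :p", "examen morgen :(", "lol da was zot grappig xd",
         "wa doede gij vanavond", "ik moet nog leren voor wiskunde", "zie u straks :d"] * 10
labels = ["General", "Technical", "Vocational", "Vocational", "General", "Technical"] * 10

# Character n-grams capture chatspeak cues such as emoticons and non-standard spellings.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    MultinomialNB(),
)
scores = cross_val_score(model, texts, labels, cv=10, scoring="f1_macro")
print(scores.mean(), scores.std())
```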