Paper Group NANR 222
Similarity Dependent Chinese Restaurant Process for Cognate Identification in Multilingual Wordlists. Native Language Identification With Classifier Stacking and Ensembles. Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign. Learning What to Share: Leaky Multi-Task Network for Text Classification. Clipping F …
Similarity Dependent Chinese Restaurant Process for Cognate Identification in Multilingual Wordlists
Title | Similarity Dependent Chinese Restaurant Process for Cognate Identification in Multilingual Wordlists |
Authors | Taraka Rama |
Abstract | We present and evaluate two similarity dependent Chinese Restaurant Process (sd-CRP) algorithms at the task of automated cognate detection. The sd-CRP clustering algorithms do not require any predefined threshold for detecting cognate sets in a multilingual word list. We evaluate the performance of the algorithms on six language families (more than 750 languages) and find that both sd-CRP variants perform as well as InfoMap and better than UPGMA at the task of inferring cognate clusters. The algorithms presented in this paper are family agnostic and can be applied to any linguistically under-studied language family. |
Tasks | Language Identification |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-1027/ |
https://www.aclweb.org/anthology/K18-1027 | |
PWC | https://paperswithcode.com/paper/similarity-dependent-chinese-restaurant |
Repo | |
Framework | |
Native Language Identification With Classifier Stacking and Ensembles
Title | Native Language Identification With Classifier Stacking and Ensembles |
Authors | Shervin Malmasi, Mark Dras |
Abstract | Ensemble methods using multiple classifiers have proven to be among the most successful approaches for the task of Native Language Identification (NLI), achieving the current state of the art. However, a systematic examination of ensemble methods for NLI has yet to be conducted. Additionally, deeper ensemble architectures such as classifier stacking have not been closely evaluated. We present a set of experiments using three ensemble-based models, testing each with multiple configurations and algorithms. This includes a rigorous application of meta-classification models for NLI, achieving state-of-the-art results on several large data sets, evaluated in both intra-corpus and cross-corpus modes. |
Tasks | Language Acquisition, Language Identification, Native Language Identification, Text Classification |
Published | 2018-09-01 |
URL | https://www.aclweb.org/anthology/J18-3003/ |
https://www.aclweb.org/anthology/J18-3003 | |
PWC | https://paperswithcode.com/paper/native-language-identification-with |
Repo | |
Framework | |
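The contrast between a voting ensemble and classifier stacking mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' system: the `LookupMetaClassifier` class, the language labels, and the toy data are all hypothetical, and the meta-classifier is a simple lookup table rather than a trained statistical model. In practice the meta-classifier is usually trained on base-classifier outputs for data the base classifiers did not see during training.

```python
from collections import Counter, defaultdict


def plurality_vote(predictions):
    """Voting ensemble: return the most common label among base predictions."""
    return Counter(predictions).most_common(1)[0][0]


class LookupMetaClassifier:
    """Toy stacking meta-classifier (hypothetical, for illustration only).

    It memorizes, for each tuple of base-classifier predictions, the gold
    label that most often accompanied that tuple on held-out data.
    """

    def fit(self, base_outputs, gold_labels):
        table = defaultdict(Counter)
        for outputs, label in zip(base_outputs, gold_labels):
            table[tuple(outputs)][label] += 1
        self.table = {key: counts.most_common(1)[0][0]
                      for key, counts in table.items()}
        return self

    def predict(self, outputs):
        key = tuple(outputs)
        if key in self.table:
            return self.table[key]
        # Unseen prediction pattern: fall back to plain voting.
        return plurality_vote(outputs)


# Held-out predictions from three hypothetical base classifiers.
held_out_outputs = [("DE", "FR", "FR"), ("DE", "FR", "FR"),
                    ("FR", "FR", "FR"), ("DE", "DE", "DE")]
gold = ["DE", "DE", "FR", "DE"]

meta = LookupMetaClassifier().fit(held_out_outputs, gold)
print(plurality_vote(("DE", "FR", "FR")))  # FR: the majority wins
print(meta.predict(("DE", "FR", "FR")))    # DE: stacking trusts classifier 0 here
```

The point of the toy data is that voting treats all base classifiers equally, while a stacked model can learn that one classifier is more reliable for a particular pattern of disagreement.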
Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign
Title | Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign |
Authors | Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Ahmed Ali, Suwon Shon, James Glass, Yves Scherrer, Tanja Samardžić, Nikola Ljubešić, Jörg Tiedemann, Chris van der Lee, Stefan Grondelaers, Nelleke Oostdijk, Dirk Speelman, Antal van den Bosch, Ritesh Kumar, Bornini Lahiri, Mayank Jain |
Abstract | We present the results and the findings of the Second VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects. The campaign was organized as part of the fifth edition of the VarDial workshop, collocated with COLING'2018. This year, the campaign included five shared tasks, including two task re-runs – Arabic Dialect Identification (ADI) and German Dialect Identification (GDI) –, and three new tasks – Morphosyntactic Tagging of Tweets (MTT), Discriminating between Dutch and Flemish in Subtitles (DFS), and Indo-Aryan Language Identification (ILI). A total of 24 teams submitted runs across the five shared tasks, and contributed 22 system description papers, which were included in the VarDial workshop proceedings and are referred to in this report. |
Tasks | Dependency Parsing, Language Identification |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-3901/ |
https://www.aclweb.org/anthology/W18-3901 | |
PWC | https://paperswithcode.com/paper/language-identification-and-morphosyntactic |
Repo | |
Framework | |
Learning What to Share: Leaky Multi-Task Network for Text Classification
Title | Learning What to Share: Leaky Multi-Task Network for Text Classification |
Authors | Liqiang Xiao, Honglun Zhang, Wenqing Chen, Yongkun Wang, Yaohui Jin |
Abstract | Neural network based multi-task learning has achieved great success on many NLP problems by sharing knowledge among tasks through linked layers to enhance performance. However, most existing approaches suffer from interference between tasks because they lack a selection mechanism for feature sharing. As a result, the feature spaces of tasks may be easily contaminated by unhelpful features borrowed from others, which confuses the models and hinders correct prediction. In this paper, we propose a multi-task convolutional neural network with the Leaky Unit, which has memory and forgetting mechanisms to filter the feature flows between tasks. Experiments on five different datasets for text classification validate the benefits of our approach. |
Tasks | Information Retrieval, Multi-Task Learning, Representation Learning, Text Classification |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1175/ |
https://www.aclweb.org/anthology/C18-1175 | |
PWC | https://paperswithcode.com/paper/learning-what-to-share-leaky-multi-task |
Repo | |
Framework | |
Clipping Free Attacks Against Neural Networks
Title | Clipping Free Attacks Against Neural Networks |
Authors | Boussad ADDAD |
Abstract | In recent years, a remarkable breakthrough has been made in the AI domain thanks to artificial deep neural networks, which have achieved great success in many machine learning tasks in computer vision, natural language processing, speech recognition, malware detection and so on. However, they are highly vulnerable to easily crafted adversarial examples. Many investigations have pointed out this fact and different approaches have been proposed to generate attacks while adding a limited perturbation to the original data. The most robust known method so far is the so-called C&W attack [1]. Nonetheless, a countermeasure known as feature squeezing coupled with ensemble defense showed that most of these attacks can be destroyed [6]. In this paper, we present a new method we call Centered Initial Attack (CIA) whose advantage is twofold: first, it ensures by construction that the maximum perturbation is smaller than a threshold fixed beforehand, without the clipping process that degrades the quality of attacks. Second, it is robust against recently introduced defenses such as feature squeezing, JPEG encoding and even against a voting ensemble of defenses. While its application is not limited to images, we illustrate this using five of the current best classifiers on the ImageNet dataset, among which two are adversarially retrained on purpose to be robust against attacks. With a fixed maximum perturbation of only 1.5% on any pixel, around 80% of attacks (targeted) fool the voting ensemble defense and nearly 100% when the perturbation is only 6%. While this shows how difficult it is to defend against CIA attacks, the last section of the paper gives some guidelines to limit their impact. |
Tasks | Malware Detection, Speech Recognition |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rJqfKPJ0Z |
https://openreview.net/pdf?id=rJqfKPJ0Z | |
PWC | https://paperswithcode.com/paper/clipping-free-attacks-against-neural-networks |
Repo | |
Framework | |
Developing Production-Level Conversational Interfaces with Shallow Semantic Parsing
Title | Developing Production-Level Conversational Interfaces with Shallow Semantic Parsing |
Authors | Arushi Raghuvanshi, Lucien Carroll, Karthik Raghunathan |
Abstract | We demonstrate an end-to-end approach for building conversational interfaces from prototype to production that has proven to work well for a number of applications across diverse verticals. Our architecture improves on the standard domain-intent-entity classification hierarchy and dialogue management architecture by leveraging shallow semantic parsing. We observe that NLU systems for industry applications often require more structured representations of entity relations than provided by the standard hierarchy, yet without requiring full semantic parses which are often inaccurate on real-world conversational data. We distinguish two kinds of semantic properties that can be provided through shallow semantic parsing: entity groups and entity roles. We also provide live demos of conversational apps built for two different use cases: food ordering and meeting control. |
Tasks | Dialogue Management, Intent Classification, Named Entity Recognition, Relation Extraction, Semantic Parsing |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/D18-2027/ |
https://www.aclweb.org/anthology/D18-2027 | |
PWC | https://paperswithcode.com/paper/developing-production-level-conversational |
Repo | |
Framework | |
Face Super-resolution Guided by Facial Component Heatmaps
Title | Face Super-resolution Guided by Facial Component Heatmaps |
Authors | Xin Yu, Basura Fernando, Bernard Ghanem, Fatih Porikli, Richard Hartley |
Abstract | State-of-the-art face super-resolution methods use deep convolutional neural networks to learn a mapping between low-resolution (LR) facial patterns and their corresponding high-resolution (HR) counterparts by exploring local information. However, most of them do not account for face structure and suffer from degradations due to large pose variations and misalignments of faces. Our method incorporates structural information of faces explicitly into face super-resolution by using a multi-task convolutional neural network (CNN). Our CNN has two branches: one for super-resolving face images and the other for predicting salient regions of a face, termed facial component heatmaps. These heatmaps guide the up-sampling stream for generating better super-resolved faces with high-quality details. Our method uses not only low-level information (i.e., intensity similarity), but also middle-level information (i.e., face structure) to further explore spatial constraints of facial components from LR input images. Therefore, we are able to super-resolve very small unaligned face images (16×16 pixels) with a large upscaling factor of 8× while preserving face structure. Extensive experiments demonstrate that our network achieves superior face hallucination results and outperforms the state-of-the-art. |
Tasks | Face Hallucination, Super-Resolution |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Xin_Yu_Face_Super-resolution_Guided_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Xin_Yu_Face_Super-resolution_Guided_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/face-super-resolution-guided-by-facial |
Repo | |
Framework | |
Evaluating Inflectional Complexity Crosslinguistically: a Processing Perspective
Title | Evaluating Inflectional Complexity Crosslinguistically: a Processing Perspective |
Authors | Claudia Marzi, Marcello Ferro, Ouafae Nahli, Patrizia Belik, Stavros Bompolas, Vito Pirrelli |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1610/ |
https://www.aclweb.org/anthology/L18-1610 | |
PWC | https://paperswithcode.com/paper/evaluating-inflectional-complexity |
Repo | |
Framework | |
FARMI: A FrAmework for Recording Multi-Modal Interactions
Title | FARMI: A FrAmework for Recording Multi-Modal Interactions |
Authors | Patrik Jonell, Mattias Bystedt, Per Fallgren, Dimosthenis Kontogiorgos, José Lopes, Zofia Malisz, Samuel Mascarenhas, Catharine Oertel, Eran Raveh, Todd Shore |
Abstract | |
Tasks | Face Recognition, Speech Recognition |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1626/ |
https://www.aclweb.org/anthology/L18-1626 | |
PWC | https://paperswithcode.com/paper/farmi-a-framework-for-recording-multi-modal |
Repo | |
Framework | |
A Vietnamese Dialog Act Corpus Based on ISO 24617-2 standard
Title | A Vietnamese Dialog Act Corpus Based on ISO 24617-2 standard |
Authors | Thi-Lan Ngo, Pham Khac Linh, Hideaki Takeda |
Abstract | |
Tasks | Emotion Recognition, Sentiment Analysis, Speech Recognition |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1630/ |
https://www.aclweb.org/anthology/L18-1630 | |
PWC | https://paperswithcode.com/paper/a-vietnamese-dialog-act-corpus-based-on-iso |
Repo | |
Framework | |
Compilation of Corpora for the Study of the Information Structure–Prosody Interface
Title | Compilation of Corpora for the Study of the Information Structure–Prosody Interface |
Authors | Alicia Burga, Mónica Domínguez, Mireia Farrús, Leo Wanner |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1635/ |
https://www.aclweb.org/anthology/L18-1635 | |
PWC | https://paperswithcode.com/paper/compilation-of-corpora-for-the-study-of-the |
Repo | |
Framework | |
Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning
Title | Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning |
Authors | Ofir Marom, Benjamin Rosman |
Abstract | Object-oriented representations in reinforcement learning have shown promise in transfer learning, with previous research introducing a propositional object-oriented framework that has provably efficient learning bounds with respect to sample complexity. However, this framework has limitations in terms of the classes of tasks it can efficiently learn. In this paper we introduce a novel deictic object-oriented framework that has provably efficient learning bounds and can solve a broader range of tasks. Additionally, we show that this framework is capable of zero-shot transfer of transition dynamics across tasks and demonstrate this empirically for the Taxi and Sokoban domains. |
Tasks | Transfer Learning |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7497-zero-shot-transfer-with-deictic-object-oriented-representation-in-reinforcement-learning |
http://papers.nips.cc/paper/7497-zero-shot-transfer-with-deictic-object-oriented-representation-in-reinforcement-learning.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-transfer-with-deictic-object |
Repo | |
Framework | |
Increasing the Accessibility of Time-Aligned Speech Corpora with Spokes Mix
Title | Increasing the Accessibility of Time-Aligned Speech Corpora with Spokes Mix |
Authors | Piotr Pęzik |
Abstract | |
Tasks | Speech Recognition |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1678/ |
https://www.aclweb.org/anthology/L18-1678 | |
PWC | https://paperswithcode.com/paper/increasing-the-accessibility-of-time-aligned |
Repo | |
Framework | |
Evaluating the Stability of Embedding-based Word Similarities
Title | Evaluating the Stability of Embedding-based Word Similarities |
Authors | Maria Antoniak, David Mimno |
Abstract | Word embeddings are increasingly being used as a tool to study word associations in specific corpora. However, it is unclear whether such embeddings reflect enduring properties of language or if they are sensitive to inconsequential variations in the source documents. We find that nearest-neighbor distances are highly sensitive to small changes in the training corpus for a variety of algorithms. For all methods, including specific documents in the training set can result in substantial variations. We show that these effects are more prominent for smaller training corpora. We recommend that users never rely on single embedding models for distance calculations, but rather average over multiple bootstrap samples, especially for small corpora. |
Tasks | Semantic Textual Similarity, Word Embeddings |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/Q18-1008/ |
https://www.aclweb.org/anthology/Q18-1008 | |
PWC | https://paperswithcode.com/paper/evaluating-the-stability-of-embedding-based |
Repo | |
Framework | |
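The recommendation in the abstract above, averaging over multiple bootstrap samples instead of relying on a single embedding model, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: `train_cooccurrence_embeddings` is a hypothetical toy stand-in for a real embedding algorithm, and the function names, parameters, and example documents are invented for illustration.

```python
import math
import random
import statistics
from collections import Counter, defaultdict


def train_cooccurrence_embeddings(docs):
    """Toy stand-in for an embedding algorithm (assumption, not a real model):
    each word's vector counts the other words sharing a document with it."""
    vectors = defaultdict(Counter)
    for doc in docs:
        tokens = doc.split()
        for i, word in enumerate(tokens):
            for j, context in enumerate(tokens):
                if i != j:
                    vectors[word][context] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def bootstrap_similarity(docs, w1, w2, n_samples=50, seed=0):
    """Average the similarity of w1 and w2 over bootstrap resamples of docs.

    Returns (mean, population standard deviation, number of usable samples),
    or None if no resample contained both words.
    """
    rng = random.Random(seed)
    sims = []
    for _ in range(n_samples):
        # Resample whole documents with replacement.
        sample = rng.choices(docs, k=len(docs))
        vectors = train_cooccurrence_embeddings(sample)
        if w1 in vectors and w2 in vectors:
            sims.append(cosine(vectors[w1], vectors[w2]))
    if not sims:
        return None
    return statistics.mean(sims), statistics.pstdev(sims), len(sims)


corpus = ["cat dog pet", "cat dog animal food",
          "dog cat play ball", "cat sleeps dog barks"]
print(bootstrap_similarity(corpus, "cat", "dog", n_samples=20, seed=1))
```

Reporting the spread alongside the mean makes the instability visible: a large standard deviation suggests that a similarity value from any single model should not be trusted on its own.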
Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
Title | Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies |
Authors | |
Abstract | |
Tasks | |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-2000/ |
https://www.aclweb.org/anthology/K18-2000 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-conll-2018-shared-task |
Repo | |
Framework | |