October 15, 2019

Paper Group NANR 222

Similarity Dependent Chinese Restaurant Process for Cognate Identification in Multilingual Wordlists


Title	Similarity Dependent Chinese Restaurant Process for Cognate Identification in Multilingual Wordlists
Authors	Taraka Rama
Abstract	We present and evaluate two similarity dependent Chinese Restaurant Process (sd-CRP) algorithms at the task of automated cognate detection. The sd-CRP clustering algorithms do not require any predefined threshold for detecting cognate sets in a multilingual word list. We evaluate the performance of the algorithms on six language families (more than 750 languages) and find that both sd-CRP variants perform as well as InfoMap and better than UPGMA at the task of inferring cognate clusters. The algorithms presented in this paper are family agnostic and can be applied to any linguistically under-studied language family.
Tasks	Language Identification
Published	2018-10-01
URL	https://www.aclweb.org/anthology/K18-1027/
PDF	https://www.aclweb.org/anthology/K18-1027
PWC	https://paperswithcode.com/paper/similarity-dependent-chinese-restaurant
Repo
Framework
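
The abstract above builds on the Chinese Restaurant Process, a clustering prior that needs no preset number of clusters. As a rough illustration only, the sketch below samples a partition from the plain, similarity-independent CRP. It is not the paper's sd-CRP algorithm; the function name `crp_partition` and the concentration parameter `alpha` are assumptions introduced here.

```python
import random


def crp_partition(n_items, alpha, seed=0):
    """Sample a partition of n_items from a plain Chinese Restaurant Process.

    Each item joins an existing cluster with probability proportional to
    that cluster's size, or opens a new cluster with probability
    proportional to alpha (alpha must be > 0). This is the textbook CRP,
    not the similarity dependent variant described in the abstract.
    """
    rng = random.Random(seed)
    clusters = []  # each cluster is a list of item indices
    for item in range(n_items):
        # One weight per existing cluster, plus one for a new cluster.
        weights = [len(c) for c in clusters] + [alpha]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(clusters):
            clusters.append([item])        # open a new cluster
        else:
            clusters[choice].append(item)  # join an existing cluster
    return clusters


if __name__ == "__main__":
    print(crp_partition(10, alpha=1.0))
```

A similarity dependent variant would presumably replace the size-based weights with weights derived from word similarities, but that detail is not given in the abstract.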

Native Language Identification With Classifier Stacking and Ensembles


Title	Native Language Identification With Classifier Stacking and Ensembles
Authors	Shervin Malmasi, Mark Dras
Abstract	Ensemble methods using multiple classifiers have proven to be among the most successful approaches for the task of Native Language Identification (NLI), achieving the current state of the art. However, a systematic examination of ensemble methods for NLI has yet to be conducted. Additionally, deeper ensemble architectures such as classifier stacking have not been closely evaluated. We present a set of experiments using three ensemble-based models, testing each with multiple configurations and algorithms. This includes a rigorous application of meta-classification models for NLI, achieving state-of-the-art results on several large data sets, evaluated in both intra-corpus and cross-corpus modes.
Tasks	Language Acquisition, Language Identification, Native Language Identification, Text Classification
Published	2018-09-01
URL	https://www.aclweb.org/anthology/J18-3003/
PDF	https://www.aclweb.org/anthology/J18-3003
PWC	https://paperswithcode.com/paper/native-language-identification-with
Repo
Framework
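
To make the contrast between a plain ensemble and classifier stacking concrete, here is a minimal sketch. It assumes base classifiers are plain callables from text to label, and it uses a lookup table as the meta-classifier; both choices, and the names `majority_vote` and `train_meta_classifier`, are illustrative assumptions rather than the models evaluated in the paper.

```python
from collections import Counter, defaultdict


def majority_vote(base_predictions):
    """Plain ensemble: return the label predicted by most base classifiers."""
    return Counter(base_predictions).most_common(1)[0][0]


def train_meta_classifier(base_classifiers, held_out):
    """Stacking: learn a mapping from base-classifier outputs to true labels.

    held_out is a list of (text, label) pairs that were not used to train
    the base classifiers. The meta-classifier is a simple lookup table
    keyed on the tuple of base predictions.
    """
    table = defaultdict(Counter)
    for text, label in held_out:
        key = tuple(clf(text) for clf in base_classifiers)
        table[key][label] += 1

    def meta(text):
        preds = tuple(clf(text) for clf in base_classifiers)
        if preds in table:
            # Most frequent true label seen for this prediction pattern.
            return table[preds].most_common(1)[0][0]
        # Unseen pattern: fall back to a plain majority vote.
        return majority_vote(preds)

    return meta
```

The point of the second function is that the meta-classifier can learn systematic disagreements among base classifiers, which a fixed voting rule cannot.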

Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign


Title	Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign
Authors	Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Ahmed Ali, Suwon Shon, James Glass, Yves Scherrer, Tanja Samardžić, Nikola Ljubešić, Jörg Tiedemann, Chris van der Lee, Stefan Grondelaers, Nelleke Oostdijk, Dirk Speelman, Antal van den Bosch, Ritesh Kumar, Bornini Lahiri, Mayank Jain
Abstract	We present the results and the findings of the Second VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects. The campaign was organized as part of the fifth edition of the VarDial workshop, collocated with COLING'2018. This year, the campaign included five shared tasks, including two task re-runs – Arabic Dialect Identification (ADI) and German Dialect Identification (GDI) –, and three new tasks – Morphosyntactic Tagging of Tweets (MTT), Discriminating between Dutch and Flemish in Subtitles (DFS), and Indo-Aryan Language Identification (ILI). A total of 24 teams submitted runs across the five shared tasks, and contributed 22 system description papers, which were included in the VarDial workshop proceedings and are referred to in this report.
Tasks	Dependency Parsing, Language Identification
Published	2018-08-01
URL	https://www.aclweb.org/anthology/W18-3901/
PDF	https://www.aclweb.org/anthology/W18-3901
PWC	https://paperswithcode.com/paper/language-identification-and-morphosyntactic
Repo
Framework

Learning What to Share: Leaky Multi-Task Network for Text Classification


Title	Learning What to Share: Leaky Multi-Task Network for Text Classification
Authors	Liqiang Xiao, Honglun Zhang, Wenqing Chen, Yongkun Wang, Yaohui Jin
Abstract	Neural network based multi-task learning has achieved great success on many NLP problems. It focuses on sharing knowledge among tasks by linking some layers to enhance performance. However, most existing approaches suffer from interference between tasks because they lack a selection mechanism for feature sharing. As a result, the feature spaces of tasks may be easily contaminated by unhelpful features borrowed from other tasks, which confuses the models and hinders correct prediction. In this paper, we propose a multi-task convolutional neural network with the Leaky Unit, which has memory and forgetting mechanisms to filter the feature flows between tasks. Experiments on five different datasets for text classification validate the benefits of our approach.
Tasks	Information Retrieval, Multi-Task Learning, Representation Learning, Text Classification
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-1175/
PDF	https://www.aclweb.org/anthology/C18-1175
PWC	https://paperswithcode.com/paper/learning-what-to-share-leaky-multi-task
Repo
Framework

Clipping Free Attacks Against Neural Networks


Title	Clipping Free Attacks Against Neural Networks
Authors	Boussad ADDAD
Abstract	In recent years, remarkable breakthroughs have been made in AI thanks to artificial deep neural networks, which have achieved great success in many machine learning tasks in computer vision, natural language processing, speech recognition, malware detection and so on. However, they are highly vulnerable to easily crafted adversarial examples. Many investigations have pointed out this fact, and different approaches have been proposed to generate attacks while adding a limited perturbation to the original data. The most robust known method so far is the so-called C&W attack [1]. Nonetheless, a countermeasure known as feature squeezing coupled with ensemble defense showed that most of these attacks can be destroyed [6]. In this paper, we present a new method we call Centered Initial Attack (CIA) whose advantage is twofold: first, it ensures by construction that the maximum perturbation is smaller than a threshold fixed beforehand, without the clipping process that degrades the quality of attacks. Second, it is robust against recently introduced defenses such as feature squeezing, JPEG encoding and even against a voting ensemble of defenses. While its application is not limited to images, we illustrate this using five of the current best classifiers on the ImageNet dataset, two of which are adversarially retrained on purpose to be robust against attacks. With a fixed maximum perturbation of only 1.5% on any pixel, around 80% of attacks (targeted) fool the voting ensemble defense, and nearly 100% when the perturbation is only 6%. While this shows how difficult it is to defend against CIA attacks, the last section of the paper gives some guidelines to limit their impact.
Tasks	Malware Detection, Speech Recognition
Published	2018-01-01
URL	https://openreview.net/forum?id=rJqfKPJ0Z
PDF	https://openreview.net/pdf?id=rJqfKPJ0Z
PWC	https://paperswithcode.com/paper/clipping-free-attacks-against-neural-networks
Repo
Framework
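
The idea of bounding a perturbation "by construction", without clipping, can be illustrated with a tanh change of variable. The sketch below is one common way to do this and is an assumption made here for illustration; the abstract does not specify the paper's exact construction, and the name `bounded_pixel` and the [0, 1] pixel range are hypothetical choices.

```python
import math


def bounded_pixel(x, w, eps):
    """Map an unconstrained variable w to a pixel value near x.

    The result always lies in [max(0, x - eps), min(1, x + eps)], so the
    perturbation never exceeds eps and the value stays a valid pixel in
    [0, 1], with no clipping step. Illustrative assumption only, not the
    construction from the paper.
    """
    lo = max(0.0, x - eps)
    hi = min(1.0, x + eps)
    center = (lo + hi) / 2.0
    half_width = (hi - lo) / 2.0
    # tanh(w) is in [-1, 1] for every real w, so the output is in [lo, hi].
    return center + half_width * math.tanh(w)


if __name__ == "__main__":
    for w in (-5.0, 0.0, 5.0):
        print(w, bounded_pixel(0.3, w, eps=0.015))
```

An optimizer can then search freely over `w` while the bound on the perturbation holds for every candidate.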

Developing Production-Level Conversational Interfaces with Shallow Semantic Parsing


Title	Developing Production-Level Conversational Interfaces with Shallow Semantic Parsing
Authors	Arushi Raghuvanshi, Lucien Carroll, Karthik Raghunathan
Abstract	We demonstrate an end-to-end approach for building conversational interfaces from prototype to production that has proven to work well for a number of applications across diverse verticals. Our architecture improves on the standard domain-intent-entity classification hierarchy and dialogue management architecture by leveraging shallow semantic parsing. We observe that NLU systems for industry applications often require more structured representations of entity relations than provided by the standard hierarchy, yet without requiring full semantic parses which are often inaccurate on real-world conversational data. We distinguish two kinds of semantic properties that can be provided through shallow semantic parsing: entity groups and entity roles. We also provide live demos of conversational apps built for two different use cases: food ordering and meeting control.
Tasks	Dialogue Management, Intent Classification, Named Entity Recognition, Relation Extraction, Semantic Parsing
Published	2018-11-01
URL	https://www.aclweb.org/anthology/D18-2027/
PDF	https://www.aclweb.org/anthology/D18-2027
PWC	https://paperswithcode.com/paper/developing-production-level-conversational
Repo
Framework

Face Super-resolution Guided by Facial Component Heatmaps


Title	Face Super-resolution Guided by Facial Component Heatmaps
Authors	Xin Yu, Basura Fernando, Bernard Ghanem, Fatih Porikli, Richard Hartley
Abstract	State-of-the-art face super-resolution methods use deep convolutional neural networks to learn a mapping between low-resolution (LR) facial patterns and their corresponding high-resolution (HR) counterparts by exploring local information. However, most of them do not account for face structure and suffer from degradations due to large pose variations and misalignments of faces. Our method incorporates structural information of faces explicitly into face super-resolution by using a multi-task convolutional neural network (CNN). Our CNN has two branches: one for super-resolving face images and the other for predicting salient regions of a face, termed facial component heatmaps. These heatmaps guide the up-sampling stream for generating better super-resolved faces with high-quality details. Our method uses not only low-level information (i.e., intensity similarity), but also middle-level information (i.e., face structure) to further explore spatial constraints of facial components from LR input images. Therefore, we are able to super-resolve very small unaligned face images (16×16 pixels) with a large upscaling factor of 8× while preserving face structure. Extensive experiments demonstrate that our network achieves superior face hallucination results and outperforms the state-of-the-art.
Tasks	Face Hallucination, Super-Resolution
Published	2018-09-01
URL	http://openaccess.thecvf.com/content_ECCV_2018/html/Xin_Yu_Face_Super-resolution_Guided_ECCV_2018_paper.html
PDF	http://openaccess.thecvf.com/content_ECCV_2018/papers/Xin_Yu_Face_Super-resolution_Guided_ECCV_2018_paper.pdf
PWC	https://paperswithcode.com/paper/face-super-resolution-guided-by-facial
Repo
Framework

Evaluating Inflectional Complexity Crosslinguistically: a Processing Perspective


Title	Evaluating Inflectional Complexity Crosslinguistically: a Processing Perspective
Authors	Claudia Marzi, Marcello Ferro, Ouafae Nahli, Patrizia Belik, Stavros Bompolas, Vito Pirrelli
Abstract
Tasks
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1610/
PDF	https://www.aclweb.org/anthology/L18-1610
PWC	https://paperswithcode.com/paper/evaluating-inflectional-complexity
Repo
Framework

FARMI: A FrAmework for Recording Multi-Modal Interactions


Title	FARMI: A FrAmework for Recording Multi-Modal Interactions
Authors	Patrik Jonell, Mattias Bystedt, Per Fallgren, Dimosthenis Kontogiorgos, José Lopes, Zofia Malisz, Samuel Mascarenhas, Catharine Oertel, Eran Raveh, Todd Shore
Abstract
Tasks	Face Recognition, Speech Recognition
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1626/
PDF	https://www.aclweb.org/anthology/L18-1626
PWC	https://paperswithcode.com/paper/farmi-a-framework-for-recording-multi-modal
Repo
Framework

A Vietnamese Dialog Act Corpus Based on ISO 24617-2 standard


Title	A Vietnamese Dialog Act Corpus Based on ISO 24617-2 standard
Authors	Thi-Lan Ngo, Pham Khac Linh, Hideaki Takeda
Abstract
Tasks	Emotion Recognition, Sentiment Analysis, Speech Recognition
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1630/
PDF	https://www.aclweb.org/anthology/L18-1630
PWC	https://paperswithcode.com/paper/a-vietnamese-dialog-act-corpus-based-on-iso
Repo
Framework

Compilation of Corpora for the Study of the Information Structure–Prosody Interface


Title	Compilation of Corpora for the Study of the Information Structure–Prosody Interface
Authors	Alicia Burga, Mónica Domínguez, Mireia Farrús, Leo Wanner
Abstract
Tasks
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1635/
PDF	https://www.aclweb.org/anthology/L18-1635
PWC	https://paperswithcode.com/paper/compilation-of-corpora-for-the-study-of-the
Repo
Framework

Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning


Title	Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning
Authors	Ofir Marom, Benjamin Rosman
Abstract	Object-oriented representations in reinforcement learning have shown promise in transfer learning, with previous research introducing a propositional object-oriented framework that has provably efficient learning bounds with respect to sample complexity. However, this framework has limitations in terms of the classes of tasks it can efficiently learn. In this paper we introduce a novel deictic object-oriented framework that has provably efficient learning bounds and can solve a broader range of tasks. Additionally, we show that this framework is capable of zero-shot transfer of transition dynamics across tasks and demonstrate this empirically for the Taxi and Sokoban domains.
Tasks	Transfer Learning
Published	2018-12-01
URL	http://papers.nips.cc/paper/7497-zero-shot-transfer-with-deictic-object-oriented-representation-in-reinforcement-learning
PDF	http://papers.nips.cc/paper/7497-zero-shot-transfer-with-deictic-object-oriented-representation-in-reinforcement-learning.pdf
PWC	https://paperswithcode.com/paper/zero-shot-transfer-with-deictic-object
Repo
Framework

Increasing the Accessibility of Time-Aligned Speech Corpora with Spokes Mix


Title	Increasing the Accessibility of Time-Aligned Speech Corpora with Spokes Mix
Authors	Piotr Pęzik
Abstract
Tasks	Speech Recognition
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1678/
PDF	https://www.aclweb.org/anthology/L18-1678
PWC	https://paperswithcode.com/paper/increasing-the-accessibility-of-time-aligned
Repo
Framework

Evaluating the Stability of Embedding-based Word Similarities


Title	Evaluating the Stability of Embedding-based Word Similarities
Authors	Maria Antoniak, David Mimno
Abstract	Word embeddings are increasingly being used as a tool to study word associations in specific corpora. However, it is unclear whether such embeddings reflect enduring properties of language or if they are sensitive to inconsequential variations in the source documents. We find that nearest-neighbor distances are highly sensitive to small changes in the training corpus for a variety of algorithms. For all methods, including specific documents in the training set can result in substantial variations. We show that these effects are more prominent for smaller training corpora. We recommend that users never rely on single embedding models for distance calculations, but rather average over multiple bootstrap samples, especially for small corpora.
Tasks	Semantic Textual Similarity, Word Embeddings
Published	2018-01-01
URL	https://www.aclweb.org/anthology/Q18-1008/
PDF	https://www.aclweb.org/anthology/Q18-1008
PWC	https://paperswithcode.com/paper/evaluating-the-stability-of-embedding-based
Repo
Framework
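
The recommendation in the abstract above, averaging over multiple bootstrap samples instead of trusting a single model, can be sketched as follows. The helper names are assumptions, and `toy_similarity` is only a document co-occurrence stand-in for "train an embedding on the sample and measure the similarity of two words"; it is not an embedding model.

```python
import random
import statistics


def toy_similarity(docs, w1, w2):
    """Stand-in score: share of documents containing w1 or w2 that contain both."""
    both = sum(1 for d in docs if w1 in d and w2 in d)
    either = sum(1 for d in docs if w1 in d or w2 in d)
    return both / either if either else 0.0


def bootstrap_similarity(documents, similarity_fn, n_samples=50, seed=0):
    """Average a corpus-level similarity score over bootstrap resamples.

    Each resample draws len(documents) documents with replacement.
    Returns the mean and standard deviation of similarity_fn over the
    resamples; n_samples must be at least 2.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_samples):
        sample = [rng.choice(documents) for _ in documents]
        scores.append(similarity_fn(sample))
    return statistics.mean(scores), statistics.stdev(scores)


if __name__ == "__main__":
    corpus = [["cat", "dog"], ["cat", "fish"], ["dog", "bone"], ["cat", "dog", "vet"]]
    mean, spread = bootstrap_similarity(
        corpus, lambda docs: toy_similarity(docs, "cat", "dog")
    )
    print(f"mean={mean:.3f} stdev={spread:.3f}")
```

Reporting the spread alongside the mean shows how much a single-model estimate could have varied with the choice of training documents.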

Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies


Title	Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
Authors
Abstract
Tasks
Published	2018-10-01
URL	https://www.aclweb.org/anthology/K18-2000/
PDF	https://www.aclweb.org/anthology/K18-2000
PWC	https://paperswithcode.com/paper/proceedings-of-the-conll-2018-shared-task
Repo
Framework