January 24, 2020

Paper Group NANR 107

HIT-SCIR at MRP 2019: A Unified Pipeline for Meaning Representation Parsing via Efficient Training and Effective Encoding

Title HIT-SCIR at MRP 2019: A Unified Pipeline for Meaning Representation Parsing via Efficient Training and Effective Encoding
Authors Wanxiang Che, Longxu Dou, Yang Xu, Yuxuan Wang, Yijia Liu, Ting Liu
Abstract This paper describes our system (HIT-SCIR) for the CoNLL 2019 shared task: Cross-Framework Meaning Representation Parsing. We extended the basic transition-based parser with two improvements: a) Efficient Training by realizing Stack LSTM parallel training; b) Effective Encoding via adopting the deep contextualized word embeddings of BERT. Overall, we proposed a unified pipeline for meaning representation parsing, including framework-specific transition-based parsers, BERT-enhanced word representation, and post-processing. In the final evaluation, our system was ranked first according to ALL-F1 (86.2%) and, in particular, ranked first in the UCCA framework (81.67%).
Tasks Word Embeddings
Published 2019-11-01
URL https://www.aclweb.org/anthology/K19-2007/
PDF https://www.aclweb.org/anthology/K19-2007
PWC https://paperswithcode.com/paper/hit-scir-at-mrp-2019-a-unified-pipeline-for
Repo
Framework

Optimal Attacks against Multiple Classifiers

Title Optimal Attacks against Multiple Classifiers
Authors Juan C. Perdomo, Yaron Singer
Abstract We study the problem of designing provably optimal adversarial noise algorithms that induce misclassification in settings where a learner aggregates decisions from multiple classifiers. Given the demonstrated vulnerability of state-of-the-art models to adversarial examples, recent efforts within the field of robust machine learning have focused on the use of ensemble classifiers as a way of boosting the robustness of individual models. In this paper, we design provably optimal attacks against a set of classifiers. We demonstrate how this problem can be framed as finding strategies at equilibrium in a two-player, zero-sum game between a learner and an adversary, and consequently illustrate the need for randomization in adversarial attacks. The main technical challenge we consider is the design of best response oracles that can be implemented in a Multiplicative Weight Updates framework to find equilibrium strategies in the zero-sum game. We develop a series of scalable noise generation algorithms for deep neural networks, and show that they outperform state-of-the-art attacks on various image classification tasks. Although there are generally no guarantees for deep learning, we show this is a well-principled approach in that it is provably optimal for linear classifiers. The main insight is a geometric characterization of the decision space that reduces the problem of designing best response oracles to minimizing a quadratic function over a set of convex polytopes.
Tasks Image Classification
Published 2019-05-01
URL https://openreview.net/forum?id=rkl4M3R5K7
PDF https://openreview.net/pdf?id=rkl4M3R5K7
PWC https://paperswithcode.com/paper/optimal-attacks-against-multiple-classifiers
Repo
Framework
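
The game-theoretic framing in the abstract above can be illustrated on a toy scale. The sketch below is not the authors' algorithm: it assumes a small, explicit payoff matrix (hypothetical values) in place of real classifiers, and it replaces the paper's best response oracle with a brute-force search over a finite list of noise candidates. The function name `mwu_adversary_strategy` and its parameters are made up for illustration. The learner runs Multiplicative Weight Updates over classifiers, the adversary best-responds each round, and the empirical frequency of the adversary's plays approximates its randomized equilibrium strategy.

```python
import math


def mwu_adversary_strategy(payoff, rounds=2000, eta=0.05):
    """Approximate the adversary's equilibrium mix in a finite zero-sum game.

    payoff[i][j] in [0, 1] is the adversary's gain when it plays noise
    candidate i and the learner answers with classifier j (toy values,
    an assumption of this sketch).
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    weights = [1.0] * n_cols  # learner's MWU weights over classifiers
    plays = [0] * n_rows      # how often each noise candidate was the best response

    for _ in range(rounds):
        total = sum(weights)
        q = [w / total for w in weights]  # learner's current mixed strategy

        # Brute-force best response: the noise candidate with the highest
        # expected gain against the learner's current mix.
        gains = [sum(row[j] * q[j] for j in range(n_cols)) for row in payoff]
        best = max(range(n_rows), key=gains.__getitem__)
        plays[best] += 1

        # The learner down-weights classifiers that the chosen noise fools.
        weights = [w * math.exp(-eta * payoff[best][j]) for j, w in enumerate(weights)]

        # Rescale so the weights do not underflow over many rounds.
        top = max(weights)
        weights = [w / top for w in weights]

    return [count / rounds for count in plays]


if __name__ == "__main__":
    # Each noise candidate fools exactly one of two classifiers, so no single
    # deterministic attack works against a randomizing learner.
    print(mwu_adversary_strategy([[1.0, 0.0], [0.0, 1.0]]))
```

In this toy game the adversary has to mix between the two noise candidates, which mirrors the abstract's point that randomization is needed when attacking several classifiers at once.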

Data augmentation using back-translation for context-aware neural machine translation

Title Data augmentation using back-translation for context-aware neural machine translation
Authors Amane Sugiyama, Naoki Yoshinaga
Abstract A single sentence does not always convey enough information to translate it into other languages. Some target languages need to add or specialize words that are omitted or ambiguous in the source languages (e.g., zero pronouns in translating Japanese to English or epicene pronouns in translating English to French). To translate such ambiguous sentences, we need context beyond a single sentence, and context-aware neural machine translation (NMT) has been explored to exploit it. However, large parallel corpora are not easily available to train accurate context-aware NMT models. In this study, we first obtain large-scale pseudo parallel corpora by back-translating monolingual data, and then investigate their impact on the translation accuracy of context-aware NMT models. We evaluated context-aware NMT models trained with small parallel corpora and the large-scale pseudo parallel corpora on English-Japanese and English-French datasets to demonstrate the large impact of the data augmentation for context-aware NMT models.
Tasks Data Augmentation, Machine Translation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-6504/
PDF https://www.aclweb.org/anthology/D19-6504
PWC https://paperswithcode.com/paper/data-augmentation-using-back-translation-for
Repo
Framework
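
The back-translation step described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: `translate_back` is a hypothetical callable standing in for a trained target-to-source NMT model, and representing context as "the previous pseudo-source sentence in the same document" is an assumption made here, since the abstract does not specify the context format.

```python
from typing import Callable, List, Tuple

# (context, pseudo-source sentence, original target sentence)
Example = Tuple[str, str, str]


def back_translate_corpus(
    target_docs: List[List[str]],
    translate_back: Callable[[str], str],
) -> List[Example]:
    """Build pseudo-parallel, context-annotated examples from monolingual documents.

    target_docs: monolingual documents in the target language, each a list
        of sentences in document order.
    translate_back: hypothetical target-to-source translator (an assumption
        of this sketch; in practice this would be a trained NMT model).
    """
    examples: List[Example] = []
    for doc in target_docs:
        # Back-translate every target sentence into a synthetic source sentence.
        pseudo_source = [translate_back(sentence) for sentence in doc]
        for i, (src, tgt) in enumerate(zip(pseudo_source, doc)):
            # Assumed context format: the preceding pseudo-source sentence,
            # or an empty string at the start of a document.
            context = pseudo_source[i - 1] if i > 0 else ""
            examples.append((context, src, tgt))
    return examples


if __name__ == "__main__":
    # Toy stand-in translator: upper-casing plays the role of translation.
    docs = [["she arrived late", "she was tired"]]
    for example in back_translate_corpus(docs, str.upper):
        print(example)
```

Keeping sentences grouped by document matters here: a context-aware model needs the neighbouring sentences, so the monolingual data has to retain document boundaries rather than being a shuffled bag of sentences.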

ST MADAR 2019 Shared Task: Arabic Fine-Grained Dialect Identification

Title ST MADAR 2019 Shared Task: Arabic Fine-Grained Dialect Identification
Authors Mourad Abbas, Mohamed Lichouri, Abed Alhakim Freihat
Abstract This paper describes the solution that we propose for the MADAR 2019 Arabic Fine-Grained Dialect Identification task. The proposed solution utilized a set of classifiers that we trained on character and word features. These classifiers are: Support Vector Machines (SVM), Bernoulli Naive Bayes (BNB), Multinomial Naive Bayes (MNB), Logistic Regression (LR), Stochastic Gradient Descent (SGD), Passive Aggressive (PA) and Perceptron (PC). The system achieved competitive results, with a performance of 62.87% and 62.12% on the development and test sets, respectively.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4635/
PDF https://www.aclweb.org/anthology/W19-4635
PWC https://paperswithcode.com/paper/st-madar-2019-shared-task-arabic-fine-grained
Repo
Framework

Quasi-Globally Optimal and Efficient Vanishing Point Estimation in Manhattan World

Title Quasi-Globally Optimal and Efficient Vanishing Point Estimation in Manhattan World
Authors Haoang Li, Ji Zhao, Jean-Charles Bazin, Wen Chen, Zhe Liu, Yun-Hui Liu
Abstract The image lines projected from parallel 3D lines intersect at a common point called the vanishing point (VP). The Manhattan world assumption holds for scenes with three orthogonal VPs. In a Manhattan world, given several lines in a calibrated image, we aim at clustering them by three unknown-but-sought VPs. The VP estimation can be reformulated as computing the rotation between the Manhattan frame and the camera frame. To compute this rotation, state-of-the-art methods are based on either data sampling or parameter search, and they fail to guarantee accuracy and efficiency simultaneously. In contrast, we propose to hybridize these two strategies. We first compute two degrees of freedom (DOF) of the above rotation from two sampled image lines, and then search for the optimal third DOF using branch-and-bound. Our sampling accelerates our search by reducing the search space and simplifying the bound computation. Our search is not sensitive to noise and achieves quasi-global optimality in terms of maximizing the number of inliers. Experiments on synthetic and real-world images showed that our method outperforms state-of-the-art approaches in terms of accuracy and/or efficiency.
Tasks
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Li_Quasi-Globally_Optimal_and_Efficient_Vanishing_Point_Estimation_in_Manhattan_World_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Li_Quasi-Globally_Optimal_and_Efficient_Vanishing_Point_Estimation_in_Manhattan_World_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/quasi-globally-optimal-and-efficient
Repo
Framework
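
The abstract's opening definition, that image lines projected from parallel 3D lines meet at a vanishing point, can be made concrete with standard homogeneous coordinates. The sketch below only shows that basic geometry (the line through two points and the intersection of two lines are both cross products); it does not implement the paper's sampling plus branch-and-bound method. The helper names are made up for illustration.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p: Tuple[float, float], q: Tuple[float, float]) -> Vec3:
    """Homogeneous image line (a, b, c) with a*x + b*y + c = 0 through p and q."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def intersection(l1: Vec3, l2: Vec3, eps: float = 1e-12) -> Optional[Tuple[float, float]]:
    """Intersection point of two image lines, or None if they are parallel
    in the image (the candidate vanishing point then lies at infinity)."""
    x, y, w = cross(l1, l2)
    if abs(w) < eps:
        return None
    return (x / w, y / w)


if __name__ == "__main__":
    # Two image segments that could be projections of parallel 3D edges.
    edge_a = line_through((0.0, 0.0), (2.0, 1.0))
    edge_b = line_through((0.0, 4.0), (2.0, 3.0))
    print(intersection(edge_a, edge_b))
```

With noisy line detections, pairs of lines from the same 3D direction intersect at slightly different points, which is why methods such as the one in this paper cluster lines by maximizing the number of inliers instead of intersecting a single pair.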

KDEHatEval at SemEval-2019 Task 5: A Neural Network Model for Detecting Hate Speech in Twitter

Title KDEHatEval at SemEval-2019 Task 5: A Neural Network Model for Detecting Hate Speech in Twitter
Authors Umme Aymun Siddiqua, Abu Nowshed Chy, Masaki Aono
Abstract With the growing volume of content on microblog platforms, especially Twitter, hate speech propagation is now of great concern. However, due to the brevity of tweets and the informal nature of user-generated content, detecting and analyzing hate speech on Twitter is a formidable task. In this paper, we present our approach for detecting hate speech in tweets as defined in SemEval-2019 Task 5. Our team KDEHatEval employs different neural network models including multi-kernel convolution (MKC), nested LSTMs (NLSTMs), and multi-layer perceptron (MLP) in a unified architecture. Moreover, we utilize state-of-the-art pre-trained sentence embedding models including DeepMoji, InferSent, and BERT for effective tweet representation. We analyze the performance of our method and demonstrate the contribution of each component of our architecture.
Tasks Sentence Embedding
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2064/
PDF https://www.aclweb.org/anthology/S19-2064
PWC https://paperswithcode.com/paper/kdehateval-at-semeval-2019-task-5-a-neural
Repo
Framework

IB-GAN: Disentangled Representation Learning with Information Bottleneck GAN

Title IB-GAN: Disentangled Representation Learning with Information Bottleneck GAN
Authors Insu Jeon, Wonkwang Lee, Gunhee Kim
Abstract We present a novel GAN architecture for disentangled representation learning. The new model architecture is inspired by Information Bottleneck (IB) theory and is therefore named IB-GAN. The IB-GAN objective is similar to that of InfoGAN but has a crucial difference: a capacity regularization for mutual information is adopted, thanks to which the generator of IB-GAN can harness a latent representation in a disentangled and interpretable manner. To facilitate the optimization of IB-GAN in practice, a new variational upper bound is derived. With experiments on the CelebA, 3DChairs, and dSprites datasets, we demonstrate that the visual quality of samples generated by IB-GAN is often better than that of samples generated by β-VAEs. Moreover, IB-GAN achieves a much higher disentanglement metric score than β-VAEs or InfoGAN on the dSprites dataset.
Tasks Representation Learning
Published 2019-05-01
URL https://openreview.net/forum?id=ryljV2A5KX
PDF https://openreview.net/pdf?id=ryljV2A5KX
PWC https://paperswithcode.com/paper/ib-gan-disentangled-representation-learning
Repo
Framework

Readability of Twitter Tweets for Second Language Learners

Title Readability of Twitter Tweets for Second Language Learners
Authors Patrick Jacob, Alexandra Uitdenbogerd
Abstract Optimal language acquisition via reading requires learners to read slightly above their current language skill level. Identifying material at the right level is the essential role of automatic readability measurement. Short message platforms such as Twitter offer the opportunity for language practice while reading about current topics and engaging in conversation in small doses, and can be filtered according to linguistic criteria to suit the learner. In this research, we explore how readable tweets are for English language learners and which factors contribute to their readability. With participants from six language groups, we collected 14,659 data points, each representing a tweet from a pool of 4,100 tweets and a judgement of perceived readability. Traditional readability measures and features failed on the data set, but demographic data showed that judgements were largely genuine and reflected reported language skill, which is consistent with other recent studies. We report on the properties of the data set and implications for future research.
Tasks Language Acquisition
Published 2019-04-01
URL https://www.aclweb.org/anthology/U19-1003/
PDF https://www.aclweb.org/anthology/U19-1003
PWC https://paperswithcode.com/paper/readability-of-twitter-tweets-for-second
Repo
Framework

Assessing socioeconomic status of Twitter users: A survey

Title Assessing socioeconomic status of Twitter users: A survey
Authors Dhouha Ghazouani, Luigi Lancieri, Habib Ounelli, Chaker Jebari
Abstract Every day, the emotions and opinions of different people across the world are reflected in the form of short messages on microblogging platforms. Despite the enormous potential of this data source, the Twitter community remains poorly understood and is not yet fully explored. While a huge number of studies examine the possibilities of inferring gender and age, there is hardly any research on socioeconomic status (SES) inference of Twitter users. As socioeconomic status is essential to treating diverse questions linked to human behavior in several fields (sociology, demography, public health, etc.), we conducted a comprehensive literature review of SES studies, inference methods, and metrics. Based on the results reported in the literature, we outline the most critical challenges for researchers. To the best of our knowledge, this paper is the first review that introduces the different aspects of SES inference. This article should benefit practitioners who aim to process and explore Twitter SES inference.
Tasks
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-1046/
PDF https://www.aclweb.org/anthology/R19-1046
PWC https://paperswithcode.com/paper/assessing-socioeconomic-status-of-twitter
Repo
Framework

A Platform for Community-sourced Indic Knowledge Processing at Scale

Title A Platform for Community-sourced Indic Knowledge Processing at Scale
Authors Sai Susarla, Damodar Reddy Challa
Abstract
Tasks
Published 2019-10-01
URL https://www.aclweb.org/anthology/W19-7506/
PDF https://www.aclweb.org/anthology/W19-7506
PWC https://paperswithcode.com/paper/a-platform-for-community-sourced-indic
Repo
Framework

Framework for Question-Answering in Sanskrit through Automated Construction of Knowledge Graphs

Title Framework for Question-Answering in Sanskrit through Automated Construction of Knowledge Graphs
Authors Hrishikesh Terdalkar, Arnab Bhattacharya
Abstract
Tasks Knowledge Graphs, Question Answering
Published 2019-10-01
URL https://www.aclweb.org/anthology/W19-7508/
PDF https://www.aclweb.org/anthology/W19-7508
PWC https://paperswithcode.com/paper/framework-for-question-answering-in-sanskrit
Repo
Framework

Introduction to Sanskrit Shabdamitra: An Educational Application of Sanskrit Wordnet

Title Introduction to Sanskrit Shabdamitra: An Educational Application of Sanskrit Wordnet
Authors Malhar Kulkarni, Nilesh Joshi, Sayali Khare, Hanumant Redkar, Pushpak Bhattacharyya
Abstract
Tasks
Published 2019-10-01
URL https://www.aclweb.org/anthology/W19-7509/
PDF https://www.aclweb.org/anthology/W19-7509
PWC https://paperswithcode.com/paper/introduction-to-sanskrit-shabdamitra-an
Repo
Framework

Syntactic dependencies correspond to word pairs with high mutual information

Title Syntactic dependencies correspond to word pairs with high mutual information
Authors Richard Futrell, Peng Qian, Edward Gibson, Evelina Fedorenko, Idan Blank
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-7703/
PDF https://www.aclweb.org/anthology/W19-7703
PWC https://paperswithcode.com/paper/syntactic-dependencies-correspond-to-word
Repo
Framework

An Introduction to the Textual History Tool

Title An Introduction to the Textual History Tool
Authors Diptesh Kanojia, Malhar Kulkarni, Pushpak Bhattacharyya, Eivind Kahrs
Abstract
Tasks
Published 2019-10-01
URL https://www.aclweb.org/anthology/W19-7512/
PDF https://www.aclweb.org/anthology/W19-7512
PWC https://paperswithcode.com/paper/an-introduction-to-the-textual-history-tool
Repo
Framework

Character-level Annotation for Chinese Surface-Syntactic Universal Dependencies

Title Character-level Annotation for Chinese Surface-Syntactic Universal Dependencies
Authors Yixuan Li, Kim Gerdes, Chuanming Dong
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-7726/
PDF https://www.aclweb.org/anthology/W19-7726
PWC https://paperswithcode.com/paper/character-level-annotation-for-chinese
Repo
Framework