January 24, 2020

Paper Group NANR 107

HIT-SCIR at MRP 2019: A Unified Pipeline for Meaning Representation Parsing via Efficient Training and Effective Encoding


Title	HIT-SCIR at MRP 2019: A Unified Pipeline for Meaning Representation Parsing via Efficient Training and Effective Encoding
Authors	Wanxiang Che, Longxu Dou, Yang Xu, Yuxuan Wang, Yijia Liu, Ting Liu
Abstract	This paper describes our system (HIT-SCIR) for the CoNLL 2019 shared task: Cross-Framework Meaning Representation Parsing. We extended the basic transition-based parser with two improvements: a) Efficient Training by realizing Stack LSTM parallel training; b) Effective Encoding via adopting deep contextualized word embeddings BERT. Generally, we proposed a unified pipeline to meaning representation parsing, including framework-specific transition-based parsers, BERT-enhanced word representation, and post-processing. In the final evaluation, our system was ranked first according to ALL-F1 (86.2%) and especially ranked first in UCCA framework (81.67%).
Tasks	Word Embeddings
Published	2019-11-01
URL	https://www.aclweb.org/anthology/K19-2007/
PDF	https://www.aclweb.org/anthology/K19-2007
PWC	https://paperswithcode.com/paper/hit-scir-at-mrp-2019-a-unified-pipeline-for
Repo
Framework

Optimal Attacks against Multiple Classifiers


Title	Optimal Attacks against Multiple Classifiers
Authors	Juan C. Perdomo, Yaron Singer
Abstract	We study the problem of designing provably optimal adversarial noise algorithms that induce misclassification in settings where a learner aggregates decisions from multiple classifiers. Given the demonstrated vulnerability of state-of-the-art models to adversarial examples, recent efforts within the field of robust machine learning have focused on the use of ensemble classifiers as a way of boosting the robustness of individual models. In this paper, we design provably optimal attacks against a set of classifiers. We demonstrate how this problem can be framed as finding strategies at equilibrium in a two-player, zero-sum game between a learner and an adversary and consequently illustrate the need for randomization in adversarial attacks. The main technical challenge we consider is the design of best response oracles that can be implemented in a Multiplicative Weight Updates framework to find equilibrium strategies in the zero-sum game. We develop a series of scalable noise generation algorithms for deep neural networks, and show that they outperform state-of-the-art attacks on various image classification tasks. Although there are generally no guarantees for deep learning, we show this is a well-principled approach in that it is provably optimal for linear classifiers. The main insight is a geometric characterization of the decision space that reduces the problem of designing best response oracles to minimizing a quadratic function over a set of convex polytopes.
Tasks	Image Classification
Published	2019-05-01
URL	https://openreview.net/forum?id=rkl4M3R5K7
PDF	https://openreview.net/pdf?id=rkl4M3R5K7
PWC	https://paperswithcode.com/paper/optimal-attacks-against-multiple-classifiers
Repo
Framework
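
The game-theoretic framing in the Optimal Attacks abstract above can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a tiny, hand-written payoff table (`loss[a][c]` is 1.0 when a hypothetical noise vector `a` fools classifier `c`), and it replaces the paper's best response oracle with brute-force enumeration over that table. The function name `mwu_attack` and the parameters `rounds` and `eta` are made up for illustration.

```python
import math


def mwu_attack(loss, rounds=2000, eta=0.05):
    """Approximate the adversary's equilibrium mix in a toy zero-sum game.

    loss[a][c] is 1.0 if noise vector a fools classifier c, else 0.0.
    The learner runs Multiplicative Weight Updates over classifiers;
    the adversary best-responds each round by enumeration (an assumption,
    standing in for a real best response oracle).
    """
    n_actions, n_clf = len(loss), len(loss[0])
    p = [1.0 / n_clf] * n_clf      # learner's distribution over classifiers
    counts = [0] * n_actions       # how often each noise vector was played

    for _ in range(rounds):
        # Best response: the noise with the highest expected loss under p.
        best = max(
            range(n_actions),
            key=lambda a: sum(p[c] * loss[a][c] for c in range(n_clf)),
        )
        counts[best] += 1

        # The learner down-weights the classifiers that were just fooled.
        w = [p[c] * math.exp(-eta * loss[best][c]) for c in range(n_clf)]
        total = sum(w)
        p = [x / total for x in w]

    # The empirical frequency of the adversary's plays is its mixed strategy.
    return [k / rounds for k in counts]


if __name__ == "__main__":
    # No single noise vector fools both classifiers, so the attack must mix.
    print(mwu_attack([[1.0, 0.0], [0.0, 1.0]]))
```

In this toy table no single noise vector fools both classifiers, so the adversary ends up splitting its plays roughly evenly between the two, which is a small instance of the need for randomization that the abstract mentions.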

Data augmentation using back-translation for context-aware neural machine translation


Title	Data augmentation using back-translation for context-aware neural machine translation
Authors	Amane Sugiyama, Naoki Yoshinaga
Abstract	A single sentence does not always convey information that is enough to translate it into other languages. Some target languages need to add or specialize words that are omitted or ambiguous in the source languages (e.g., zero pronouns in translating Japanese to English or epicene pronouns in translating English to French). To translate such ambiguous sentences, we need contexts beyond a single sentence, and have so far explored context-aware neural machine translation (NMT). However, a large amount of parallel corpora is not easily available to train accurate context-aware NMT models. In this study, we first obtain large-scale pseudo parallel corpora by back-translating monolingual data, and then investigate its impact on the translation accuracy of context-aware NMT models. We evaluated context-aware NMT models trained with small parallel corpora and the large-scale pseudo parallel corpora on English-Japanese and English-French datasets to demonstrate the large impact of the data augmentation for context-aware NMT models.
Tasks	Data Augmentation, Machine Translation
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-6504/
PDF	https://www.aclweb.org/anthology/D19-6504
PWC	https://paperswithcode.com/paper/data-augmentation-using-back-translation-for
Repo
Framework

ST MADAR 2019 Shared Task: Arabic Fine-Grained Dialect Identification


Title	ST MADAR 2019 Shared Task: Arabic Fine-Grained Dialect Identification
Authors	Mourad Abbas, Mohamed Lichouri, Abed Alhakim Freihat
Abstract	This paper describes the solution that we propose for the MADAR 2019 Arabic Fine-Grained Dialect Identification task. The proposed solution utilized a set of classifiers that we trained on character and word features. These classifiers are: Support Vector Machines (SVM), Bernoulli Naive Bayes (BNB), Multinomial Naive Bayes (MNB), Logistic Regression (LR), Stochastic Gradient Descent (SGD), Passive Aggressive (PA) and Perceptron (PC). The system achieved competitive results, with a performance of 62.87% and 62.12% on the development and test sets, respectively.
Tasks
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-4635/
PDF	https://www.aclweb.org/anthology/W19-4635
PWC	https://paperswithcode.com/paper/st-madar-2019-shared-task-arabic-fine-grained
Repo
Framework

Quasi-Globally Optimal and Efficient Vanishing Point Estimation in Manhattan World


Title	Quasi-Globally Optimal and Efficient Vanishing Point Estimation in Manhattan World
Authors	Haoang Li, Ji Zhao, Jean-Charles Bazin, Wen Chen, Zhe Liu, Yun-Hui Liu
Abstract	The image lines projected from parallel 3D lines intersect at a common point called the vanishing point (VP). Manhattan world holds for the scenes with three orthogonal VPs. In Manhattan world, given several lines in a calibrated image, we aim at clustering them by three unknown-but-sought VPs. The VP estimation can be reformulated as computing the rotation between the Manhattan frame and the camera frame. To compute this rotation, state-of-the-art methods are based on either data sampling or parameter search, and they fail to guarantee the accuracy and efficiency simultaneously. In contrast, we propose to hybridize these two strategies. We first compute two degrees of freedom (DOF) of the above rotation by two sampled image lines, and then search for the optimal third DOF based on the branch-and-bound. Our sampling accelerates our search by reducing the search space and simplifying the bound computation. Our search is not sensitive to noise and achieves quasi-global optimality in terms of maximizing the number of inliers. Experiments on synthetic and real-world images showed that our method outperforms state-of-the-art approaches in terms of accuracy and/or efficiency.
Tasks
Published	2019-10-01
URL	http://openaccess.thecvf.com/content_ICCV_2019/html/Li_Quasi-Globally_Optimal_and_Efficient_Vanishing_Point_Estimation_in_Manhattan_World_ICCV_2019_paper.html
PDF	http://openaccess.thecvf.com/content_ICCV_2019/papers/Li_Quasi-Globally_Optimal_and_Efficient_Vanishing_Point_Estimation_in_Manhattan_World_ICCV_2019_paper.pdf
PWC	https://paperswithcode.com/paper/quasi-globally-optimal-and-efficient
Repo
Framework
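
The first sentence of the vanishing point abstract above (image lines projected from parallel 3D lines meet at a common point) can be made concrete with a few lines of homogeneous-coordinate geometry. This is only a sketch of that basic fact, not the paper's sampling plus branch-and-bound method; the helper names `line_through` and `vanishing_point` are hypothetical, and points are assumed to be plain pixel coordinates with no calibration applied.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line (a, b, c), with ax + by + c = 0, through 2D points p and q."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(seg_a, seg_b, eps=1e-12):
    """Intersection of the two image lines spanned by seg_a and seg_b.

    Each segment is a pair of 2D points. Returns (x, y), or None when the
    image lines are parallel and so only meet at infinity.
    """
    x, y, w = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(w) < eps:
        return None
    return (x / w, y / w)


if __name__ == "__main__":
    # Two segments whose supporting lines are y = x and y = 2 + 0.5x.
    print(vanishing_point(((0, 0), (1, 1)), ((0, 2), (2, 3))))
```

The two segments in the usage example lie on the lines y = x and y = 2 + 0.5x, which meet at (4, 4). With noisy line detections and three unknown orthogonal VPs, pairwise intersections like this are no longer enough, which is where clustering methods such as the one in the abstract come in.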

KDEHatEval at SemEval-2019 Task 5: A Neural Network Model for Detecting Hate Speech in Twitter


Title	KDEHatEval at SemEval-2019 Task 5: A Neural Network Model for Detecting Hate Speech in Twitter
Authors	Umme Aymun Siddiqua, Abu Nowshed Chy, Masaki Aono
Abstract	In the age of emerging volume of microblog platforms, especially Twitter, hate speech propagation is now of great concern. However, due to the brevity of tweets and informal user-generated content, detecting and analyzing hate speech on Twitter is a formidable task. In this paper, we present our approach for detecting hate speech in tweets defined in the SemEval-2019 Task 5. Our team KDEHatEval employs different neural network models including multi-kernel convolution (MKC), nested LSTMs (NLSTMs), and multi-layer perceptron (MLP) in a unified architecture. Moreover, we utilize the state-of-the-art pre-trained sentence embedding models including DeepMoji, InferSent, and BERT for effective tweet representation. We analyze the performance of our method and demonstrate the contribution of each component of our architecture.
Tasks	Sentence Embedding
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2064/
PDF	https://www.aclweb.org/anthology/S19-2064
PWC	https://paperswithcode.com/paper/kdehateval-at-semeval-2019-task-5-a-neural
Repo
Framework

IB-GAN: Disentangled Representation Learning with Information Bottleneck GAN


Title	IB-GAN: Disentangled Representation Learning with Information Bottleneck GAN
Authors	Insu Jeon, Wonkwang Lee, Gunhee Kim
Abstract	We present a novel GAN architecture for disentangled representation learning. The new model architecture is inspired by Information Bottleneck (IB) theory and is therefore named IB-GAN. The IB-GAN objective is similar to that of InfoGAN but has a crucial difference: a capacity regularization for mutual information is adopted, thanks to which the generator of IB-GAN can harness a latent representation in a disentangled and interpretable manner. To facilitate the optimization of IB-GAN in practice, a new variational upper-bound is derived. With experiments on CelebA, 3DChairs, and dSprites datasets, we demonstrate that the visual quality of samples generated by IB-GAN is often better than those by β-VAEs. Moreover, IB-GAN achieves much higher disentanglement metrics score than β-VAEs or InfoGAN on the dSprites dataset.
Tasks	Representation Learning
Published	2019-05-01
URL	https://openreview.net/forum?id=ryljV2A5KX
PDF	https://openreview.net/pdf?id=ryljV2A5KX
PWC	https://paperswithcode.com/paper/ib-gan-disentangled-representation-learning
Repo
Framework

Readability of Twitter Tweets for Second Language Learners


Title	Readability of Twitter Tweets for Second Language Learners
Authors	Patrick Jacob, Alexandra Uitdenbogerd
Abstract	Optimal language acquisition via reading requires the learners to read slightly above their current language skill level. Identifying material at the right level is the essential role of automatic readability measurement. Short message platforms such as Twitter offer the opportunity for language practice while reading about current topics and engaging in conversation in small doses, and can be filtered according to linguistic criteria to suit the learner. In this research, we explore how readable tweets are for English language learners and which factors contribute to their readability. With participants from six language groups, we collected 14,659 data points, each representing a tweet from a pool of 4100 tweets, and a judgement of perceived readability. Traditional readability measures and features failed on the data-set, but demographic data showed that judgements were largely genuine and reflected reported language skill, which is consistent with other recent studies. We report on the properties of the data set and implications for future research.
Tasks	Language Acquisition
Published	2019-04-01
URL	https://www.aclweb.org/anthology/U19-1003/
PDF	https://www.aclweb.org/anthology/U19-1003
PWC	https://paperswithcode.com/paper/readability-of-twitter-tweets-for-second
Repo
Framework

Assessing socioeconomic status of Twitter users: A survey


Title	Assessing socioeconomic status of Twitter users: A survey
Authors	Dhouha GHAZOUANI, Luigi LANCIERI, Habib OUNELLI, Chaker JEBARI
Abstract	Every day, the emotion and opinion of different people across the world are reflected in the form of short messages using microblogging platforms. Despite the existence of enormous potential introduced by this data source, the Twitter community is still ambiguous and is not fully explored yet. While there are a huge number of studies examining the possibilities of inferring gender and age, there is hardly any research on socioeconomic status (SES) inference of Twitter users. As socioeconomic status is essential to treating diverse questions linked to human behavior in several fields (sociology, demography, public health, etc.), we conducted a comprehensive literature review of SES studies, inference methods, and metrics. With reference to the research on literature's results, we came to outline the most critical challenges for researchers. To the best of our knowledge, this paper is the first review that introduces the different aspects of SES inference. Indeed, this article provides the benefits for practitioners who aim to process and explore Twitter SES inference.
Tasks
Published	2019-09-01
URL	https://www.aclweb.org/anthology/R19-1046/
PDF	https://www.aclweb.org/anthology/R19-1046
PWC	https://paperswithcode.com/paper/assessing-socioeconomic-status-of-twitter
Repo
Framework

A Platform for Community-sourced Indic Knowledge Processing at Scale


Title	A Platform for Community-sourced Indic Knowledge Processing at Scale
Authors	Sai Susarla, Damodar Reddy Challa
Abstract
Tasks
Published	2019-10-01
URL	https://www.aclweb.org/anthology/W19-7506/
PDF	https://www.aclweb.org/anthology/W19-7506
PWC	https://paperswithcode.com/paper/a-platform-for-community-sourced-indic
Repo
Framework

Framework for Question-Answering in Sanskrit through Automated Construction of Knowledge Graphs


Title	Framework for Question-Answering in Sanskrit through Automated Construction of Knowledge Graphs
Authors	Hrishikesh Terdalkar, Arnab Bhattacharya
Abstract
Tasks	Knowledge Graphs, Question Answering
Published	2019-10-01
URL	https://www.aclweb.org/anthology/W19-7508/
PDF	https://www.aclweb.org/anthology/W19-7508
PWC	https://paperswithcode.com/paper/framework-for-question-answering-in-sanskrit
Repo
Framework

Introduction to Sanskrit Shabdamitra: An Educational Application of Sanskrit Wordnet


Title	Introduction to Sanskrit Shabdamitra: An Educational Application of Sanskrit Wordnet
Authors	Malhar Kulkarni, Nilesh Joshi, Sayali Khare, Hanumant Redkar, Pushpak Bhattacharyya
Abstract
Tasks
Published	2019-10-01
URL	https://www.aclweb.org/anthology/W19-7509/
PDF	https://www.aclweb.org/anthology/W19-7509
PWC	https://paperswithcode.com/paper/introduction-to-sanskrit-shabdamitra-an
Repo
Framework

Syntactic dependencies correspond to word pairs with high mutual information


Title	Syntactic dependencies correspond to word pairs with high mutual information
Authors	Richard Futrell, Peng Qian, Edward Gibson, Evelina Fedorenko, Idan Blank
Abstract
Tasks
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-7703/
PDF	https://www.aclweb.org/anthology/W19-7703
PWC	https://paperswithcode.com/paper/syntactic-dependencies-correspond-to-word
Repo
Framework

An Introduction to the Textual History Tool


Title	An Introduction to the Textual History Tool
Authors	Diptesh Kanojia, Malhar Kulkarni, Pushpak Bhattacharyya, Eivind Kahrs
Abstract
Tasks
Published	2019-10-01
URL	https://www.aclweb.org/anthology/W19-7512/
PDF	https://www.aclweb.org/anthology/W19-7512
PWC	https://paperswithcode.com/paper/an-introduction-to-the-textual-history-tool
Repo
Framework

Character-level Annotation for Chinese Surface-Syntactic Universal Dependencies


Title	Character-level Annotation for Chinese Surface-Syntactic Universal Dependencies
Authors	Yixuan Li, Gerdes Kim, Dong Chuanming
Abstract
Tasks
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-7726/
PDF	https://www.aclweb.org/anthology/W19-7726
PWC	https://paperswithcode.com/paper/character-level-annotation-for-chinese
Repo
Framework