Paper Group NANR 233
Large-Scale Hierarchical Alignment for Data-driven Text Rewriting
Title | Large-Scale Hierarchical Alignment for Data-driven Text Rewriting |
Authors | Nikola Nikolov, Richard Hahnloser |
Abstract | We propose a simple unsupervised method for extracting pseudo-parallel monolingual sentence pairs from comparable corpora representative of two different text styles, such as news articles and scientific papers. Our approach does not require a seed parallel corpus, but instead relies solely on hierarchical search over pre-trained embeddings of documents and sentences. We demonstrate the effectiveness of our method through automatic and extrinsic evaluation on text simplification from the normal to the Simple Wikipedia. We show that pseudo-parallel sentences extracted with our method not only supplement existing parallel data, but can even lead to competitive performance on their own. |
Tasks | Text Simplification |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/R19-1098/ |
https://www.aclweb.org/anthology/R19-1098 | |
PWC | https://paperswithcode.com/paper/large-scale-hierarchical-alignment-for-data |
Repo | |
Framework | |
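The hierarchical search described in the abstract — match documents first, then align sentences only within matched document pairs — can be sketched as follows. This is a toy illustration, not the authors' implementation: the embedding format, thresholds, and greedy nearest-neighbour matching are all assumptions.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def hierarchical_align(src_docs, tgt_docs, doc_thresh=0.5, sent_thresh=0.7):
    """Two-level pseudo-parallel extraction. Each document is a list of
    (sentence_text, embedding) pairs; a document embedding is taken as
    the mean of its sentence embeddings."""
    pairs = []
    for sdoc in src_docs:
        s_emb = np.mean([e for _, e in sdoc], axis=0)
        # Level 1: find the closest target document.
        best = max(tgt_docs,
                   key=lambda td: cosine(s_emb, np.mean([e for _, e in td], axis=0)))
        if cosine(s_emb, np.mean([e for _, e in best], axis=0)) < doc_thresh:
            continue
        # Level 2: greedily match sentences inside the matched documents.
        for s_text, s_vec in sdoc:
            t_text, score = None, sent_thresh
            for cand_text, t_vec in best:
                sim = cosine(s_vec, t_vec)
                if sim >= score:
                    t_text, score = cand_text, sim
            if t_text is not None:
                pairs.append((s_text, t_text, score))
    return pairs
```

Restricting the sentence search to matched documents is what keeps the approach tractable over large comparable corpora.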
Simplification-induced transformations: typology and some characteristics
Title | Simplification-induced transformations: typology and some characteristics |
Authors | Anaïs Koptient, Rémi Cardon, Natalia Grabar |
Abstract | The purpose of automatic text simplification is to transform technical or difficult-to-understand texts into a friendlier version. The semantics must be preserved during this transformation. Automatic text simplification can be done at different levels (lexical, syntactic, semantic, stylistic…) and relies on the corresponding knowledge and resources (lexicon, rules…). Our objective is to propose methods and material for the creation of transformation rules from a small set of parallel sentences differentiated by their technicity. We also propose a typology of transformations and quantify them. We work with French-language data related to the medical domain, although we assume that the method can be exploited on texts in any language and from any domain. |
Tasks | Text Simplification |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5033/ |
https://www.aclweb.org/anthology/W19-5033 | |
PWC | https://paperswithcode.com/paper/simplification-induced-transformations |
Repo | |
Framework | |
Multi-Class Part Parsing With Joint Boundary-Semantic Awareness
Title | Multi-Class Part Parsing With Joint Boundary-Semantic Awareness |
Authors | Yifan Zhao, Jia Li, Yu Zhang, Yonghong Tian |
Abstract | Object part parsing in the wild, which requires simultaneously detecting multiple object classes in the scene and accurately segmenting semantic parts within each class, is challenging for the joint presence of class-level and part-level ambiguities. Despite its importance, however, this problem is not sufficiently explored in existing works. In this paper, we propose a joint parsing framework with boundary and semantic awareness to address this challenging problem. To handle part-level ambiguity, a boundary awareness module is proposed to make mid-level features at multiple scales attend to part boundaries for accurate part localization, which are then fused with high-level features for effective part recognition. For class-level ambiguity, we further present a semantic awareness module that selects discriminative part features relevant to a category to prevent irrelevant features from being merged together. The proposed modules are lightweight and implementation friendly, improving the performance substantially when plugged into various baseline architectures. Without bells and whistles, the full model sets new state-of-the-art results on the Pascal-Part dataset, in both the multi-class and the conventional single-class setting, while running substantially faster than recent high-performance approaches. |
Tasks | |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Zhao_Multi-Class_Part_Parsing_With_Joint_Boundary-Semantic_Awareness_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Zhao_Multi-Class_Part_Parsing_With_Joint_Boundary-Semantic_Awareness_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/multi-class-part-parsing-with-joint-boundary |
Repo | |
Framework | |
Barrage of Random Transforms for Adversarially Robust Defense
Title | Barrage of Random Transforms for Adversarially Robust Defense |
Authors | Edward Raff, Jared Sylvester, Steven Forsyth, Mark McLean |
Abstract | Defenses against adversarial examples, when using the ImageNet dataset, are historically easy to defeat. The common understanding is that a combination of simple image transformations and other various defenses are insufficient to provide the necessary protection when the obfuscated gradient is taken into account. In this paper, we explore the idea of stochastically combining a large number of individually weak defenses into a single barrage of randomized transformations to build a strong defense against adversarial attacks. We show that, even after accounting for obfuscated gradients, the Barrage of Random Transforms (BaRT) is a resilient defense against even the most difficult attacks, such as PGD. BaRT achieves up to a 24x improvement in accuracy compared to previous work, and has even extended effectiveness out to a previously untested maximum adversarial perturbation of ε=32. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Raff_Barrage_of_Random_Transforms_for_Adversarially_Robust_Defense_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Raff_Barrage_of_Random_Transforms_for_Adversarially_Robust_Defense_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/barrage-of-random-transforms-for |
Repo | |
Framework | |
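The core idea — stochastically composing many individually weak transforms into one randomized pipeline — can be sketched as follows. This is a toy stand-in, not the paper's defense: the three transforms, their parameter ranges, and the subset size `k` are invented for illustration (the paper operates on images with a much larger transform pool).

```python
import random

# Individually weak, parameterized "transforms"; here an image is just
# a flat list of pixel intensities to keep the sketch self-contained.
def add_noise(img, rng):
    return [p + rng.uniform(-0.05, 0.05) for p in img]

def rescale(img, rng):
    s = rng.uniform(0.9, 1.1)
    return [p * s for p in img]

def shift(img, rng):
    d = rng.uniform(-0.05, 0.05)
    return [p + d for p in img]

TRANSFORMS = [add_noise, rescale, shift]

def barrage(img, k=2, seed=None):
    """Apply a random subset of k transforms, in random order, with
    random parameters. The attacker must cope with this stochastic
    pipeline when crafting an adversarial example."""
    rng = random.Random(seed)
    chosen = rng.sample(TRANSFORMS, k)
    rng.shuffle(chosen)
    for t in chosen:
        img = t(img, rng)
    return img
```

The defense's strength comes from the combinatorics: the attacker faces a different subset, ordering, and parameterization on every forward pass.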
Sentence Length
Title | Sentence Length |
Authors | Gábor Borbély, András Kornai |
Abstract | |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/W19-5710/ |
https://www.aclweb.org/anthology/W19-5710 | |
PWC | https://paperswithcode.com/paper/sentence-length-1 |
Repo | |
Framework | |
VerbAtlas: a Novel Large-Scale Verbal Semantic Resource and Its Application to Semantic Role Labeling
Title | VerbAtlas: a Novel Large-Scale Verbal Semantic Resource and Its Application to Semantic Role Labeling |
Authors | Andrea Di Fabio, Simone Conia, Roberto Navigli |
Abstract | We present VerbAtlas, a new, hand-crafted lexical-semantic resource whose goal is to bring together all verbal synsets from WordNet into semantically-coherent frames. The frames define a common, prototypical argument structure while at the same time providing new concept-specific information. In contrast to PropBank, which defines enumerative semantic roles, VerbAtlas comes with an explicit, cross-frame set of semantic roles linked to selectional preferences expressed in terms of WordNet synsets, and is the first resource enriched with semantic information about implicit, shadow, and default arguments. We demonstrate the effectiveness of VerbAtlas in the task of dependency-based Semantic Role Labeling and show how its integration into a high-performance system leads to improvements on both the in-domain and out-of-domain test sets of CoNLL-2009. VerbAtlas is available at http://verbatlas.org. |
Tasks | Semantic Role Labeling |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1058/ |
https://www.aclweb.org/anthology/D19-1058 | |
PWC | https://paperswithcode.com/paper/verbatlas-a-novel-large-scale-verbal-semantic |
Repo | |
Framework | |
Accelerated Gravitational Point Set Alignment With Altered Physical Laws
Title | Accelerated Gravitational Point Set Alignment With Altered Physical Laws |
Authors | Vladislav Golyanik, Christian Theobalt, Didier Stricker |
Abstract | This work describes Barnes-Hut Rigid Gravitational Approach (BH-RGA) – a new rigid point set registration method relying on principles of particle dynamics. Interpreting the inputs as two interacting particle swarms, we directly minimise the gravitational potential energy of the system using non-linear least squares. Compared to solutions obtained by solving systems of second-order ordinary differential equations, our approach is more robust and less dependent on the parameter choice. We accelerate otherwise exhaustive particle interactions with a Barnes-Hut tree and efficiently handle massive point sets in quasilinear time while preserving the globally multiply-linked character of interactions. Among the advantages of BH-RGA is the possibility to define boundary conditions or additional alignment cues through varying point masses. Systematic experiments demonstrate that BH-RGA surpasses the performance of baseline methods in terms of the convergence basin and accuracy when handling incomplete, noisy and perturbed data. The proposed approach also compares favourably to the competing method for alignment with prior matches. |
Tasks | |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Golyanik_Accelerated_Gravitational_Point_Set_Alignment_With_Altered_Physical_Laws_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Golyanik_Accelerated_Gravitational_Point_Set_Alignment_With_Altered_Physical_Laws_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/accelerated-gravitational-point-set-alignment |
Repo | |
Framework | |
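The objective the abstract describes — treat the two point sets as particle swarms and minimise their gravitational potential energy over a rigid transform — can be sketched in 2D as follows. This is an illustration only: it uses a generic local optimizer instead of the paper's Barnes-Hut-accelerated non-linear least squares, and the softening constant `eps` (to avoid the singularity at zero distance) is an assumption.

```python
import numpy as np
from scipy.optimize import minimize

def potential(params, X, Y, eps=0.1):
    """Gravitational potential energy between template X, moved by a 2D
    rigid transform (theta, tx, ty), and reference Y; unit point masses,
    distances softened by eps."""
    theta, tx, ty = params
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    moved = X @ R.T + np.array([tx, ty])
    d = np.linalg.norm(moved[:, None, :] - Y[None, :, :], axis=2)
    return -np.sum(1.0 / (d + eps))

def align(X, Y):
    # Lower energy = better overlap; a generic derivative-free optimizer
    # stands in for the paper's accelerated solver.
    return minimize(potential, x0=np.zeros(3), args=(X, Y),
                    method="Nelder-Mead").x
```

Because every point attracts every other point, the energy landscape stays informative even for incomplete or noisy data — the property the paper exploits for a wide convergence basin.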
From Monolingual to Multilingual FAQ Assistant using Multilingual Co-training
Title | From Monolingual to Multilingual FAQ Assistant using Multilingual Co-training |
Authors | Mayur Patidar, Surabhi Kumari, Manasi Patwardhan, Shirish Karande, Puneet Agarwal, Lovekesh Vig, Gautam Shroff |
Abstract | Recent research on cross-lingual transfer shows state-of-the-art results on benchmark datasets using pre-trained language representation models (PLRM) like BERT. These results are achieved with the traditional training approaches, such as Zero-shot with no data, Translate-train or Translate-test with machine translated data. In this work, we propose an approach of "Multilingual Co-training" (MCT) where we augment the expert annotated dataset in the source language (English) with the corresponding machine translations in the target languages (e.g. Arabic, Spanish) and fine-tune the PLRM jointly. We observe that the proposed approach provides consistent gains in the performance of BERT for multiple benchmark datasets (e.g. 1.0% gain on MLDocs, and 1.2% gain on XNLI over translate-train with BERT), while requiring a single model for multiple languages. We further consider a FAQ dataset where the available English test dataset is translated by experts into Arabic and Spanish. On such a dataset, we observe an average gain of 4.9% over all other cross-lingual transfer protocols with BERT. We further observe that domain-specific joint pre-training of the PLRM using HR policy documents in English along with the machine translations in the target languages, followed by the joint finetuning, provides a further improvement of 2.8% in average accuracy. |
Tasks | Cross-Lingual Transfer |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-6113/ |
https://www.aclweb.org/anthology/D19-6113 | |
PWC | https://paperswithcode.com/paper/from-monolingual-to-multilingual-faq |
Repo | |
Framework | |
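The data-augmentation step behind Multilingual Co-training — keep the English annotations and add machine-translated copies in each target language, then fine-tune one model on the joint set — can be sketched as follows. The `translate` function and the language codes are hypothetical stand-ins; the actual work uses an MT system and fine-tunes BERT.

```python
def multilingual_cotrain_data(annotated_en, translate, target_langs=("ar", "es")):
    """Build a joint training set: each English (text, label) example is
    kept and also machine-translated into every target language with its
    label preserved, so a single model sees all languages during
    fine-tuning."""
    joint = []
    for text, label in annotated_en:
        joint.append((text, label, "en"))
        for lang in target_langs:
            joint.append((translate(text, lang), label, lang))
    return joint
```

The appeal over Translate-test is that translation cost is paid once at training time, and one fine-tuned model serves all languages.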
Learning to Link Grammar and Encyclopedic Information to Assist ESL Learners
Title | Learning to Link Grammar and Encyclopedic Information to Assist ESL Learners |
Authors | Jhih-Jie Chen, Chingyu Yang, Peichen Ho, Ming Chiao Tsai, Chia-Fang Ho, Kai-Wen Tuan, Chung-Ting Tsai, Wen-Bin Han, Jason Chang |
Abstract | We introduce a system aimed at improving and expanding second language learners' English vocabulary. In addition to word definitions, we provide rich lexical information such as collocations and grammar patterns for target words. We present Linggle Booster that takes an article, identifies target vocabulary, provides lexical information, and generates a quiz on target words. Linggle Booster also links named entities to their corresponding Wikipedia pages. Evaluation on a set of target words shows that the method has reasonably good performance in generating useful information for learning vocabulary. |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-3034/ |
https://www.aclweb.org/anthology/P19-3034 | |
PWC | https://paperswithcode.com/paper/learning-to-link-grammar-and-encyclopedic |
Repo | |
Framework | |
Discourse Analysis and Its Applications
Title | Discourse Analysis and Its Applications |
Authors | Shafiq Joty, Giuseppe Carenini, Raymond Ng, Gabriel Murray |
Abstract | Discourse processing is a suite of Natural Language Processing (NLP) tasks to uncover linguistic structures from texts at several levels, which can support many downstream applications. This involves identifying the topic structure, the coherence structure, the coreference structure, and the conversation structure for conversational discourse. Taken together, these structures can inform text summarization, machine translation, essay scoring, sentiment analysis, information extraction, question answering, and thread recovery. The tutorial starts with an overview of basic concepts in discourse analysis: monologue vs. conversation, synchronous vs. asynchronous conversation, and key linguistic structures in discourse analysis. We also give an overview of linguistic structures and corresponding discourse analysis tasks that discourse researchers are generally interested in, as well as key applications on which these discourse structures have an impact. |
Tasks | Machine Translation, Question Answering, Sentiment Analysis, Text Summarization |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-4003/ |
https://www.aclweb.org/anthology/P19-4003 | |
PWC | https://paperswithcode.com/paper/discourse-analysis-and-its-applications |
Repo | |
Framework | |
Journalist-in-the-Loop: Continuous Learning as a Service for Rumour Analysis
Title | Journalist-in-the-Loop: Continuous Learning as a Service for Rumour Analysis |
Authors | Twin Karmakharm, Nikolaos Aletras, Kalina Bontcheva |
Abstract | Automatically identifying rumours in social media and assessing their veracity is an important task with downstream applications in journalism. A significant challenge is how to keep rumour analysis tools up-to-date as new information becomes available for particular rumours that spread in a social network. This paper presents a novel open-source web-based rumour analysis tool that can continuously learn from journalists. The system features a rumour annotation service that allows journalists to easily provide feedback for a given social media post through a web-based interface. The feedback allows the system to improve an underlying state-of-the-art neural network-based rumour classification model. The system can be easily integrated as a service into existing tools and platforms used by journalists using a REST API. |
Tasks | Rumour Detection |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-3020/ |
https://www.aclweb.org/anthology/D19-3020 | |
PWC | https://paperswithcode.com/paper/journalist-in-the-loop-continuous-learning-as |
Repo | |
Framework | |
Unsupervised Learning of Consensus Maximization for 3D Vision Problems
Title | Unsupervised Learning of Consensus Maximization for 3D Vision Problems |
Authors | Thomas Probst, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool |
Abstract | Consensus maximization is a key strategy in 3D vision for robust geometric model estimation from measurements with outliers. Generic methods for consensus maximization, such as Random Sampling and Consensus (RANSAC), have played a tremendous role in the success of 3D vision, in spite of the ubiquity of outliers. However, replicating the same generic behaviour in a deeply learned architecture, using supervised approaches, has proven to be difficult. In that context, unsupervised methods have a huge potential to adapt to any unseen data distribution, and therefore are highly desirable. In this paper, we propose for the first time an unsupervised learning framework for consensus maximization, in the context of solving 3D vision problems. For that purpose, we establish a relationship between inlier measurements, represented by an ideal of inlier set, and the subspace of polynomials representing the space of target transformations. Using this relationship, we derive a constraint that must be satisfied by the sought inlier set. This constraint can be tested without knowing the transformation parameters, therefore allows us to efficiently define the geometric model fitting cost. This model fitting cost is used as a supervisory signal for learning consensus maximization, where the learning process seeks for the largest measurement set that minimizes the proposed model fitting cost. Using our method, we solve a diverse set of 3D vision problems, including 3D-3D matching, non-rigid 3D shape matching with piece-wise rigidity and image-to-image matching. Despite being unsupervised, our method outperforms RANSAC in all three tasks for several datasets. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Probst_Unsupervised_Learning_of_Consensus_Maximization_for_3D_Vision_Problems_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Probst_Unsupervised_Learning_of_Consensus_Maximization_for_3D_Vision_Problems_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-consensus |
Repo | |
Framework | |
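For context, the classic baseline the paper measures against — RANSAC — can be sketched in a few lines: repeatedly fit a model to a minimal random sample and keep the model with the largest consensus (inlier) set, which is exactly the quantity the paper's unsupervised network learns to maximize directly. The 2D line-fitting setting and thresholds below are illustrative choices, not from the paper.

```python
import random

def ransac_line(points, iters=200, thresh=0.1, seed=0):
    """Classic RANSAC for 2D line fitting (y = a*x + b): sample two
    points, fit the line through them, count inliers within thresh,
    and keep the model with the largest consensus set."""
    rng = random.Random(seed)
    best_inliers, best_model = [], None
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:  # skip degenerate (vertical) samples
            continue
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) < thresh]
        if len(inliers) > len(best_inliers):
            best_inliers, best_model = inliers, (a, b)
    return best_model, best_inliers
```

RANSAC needs the model fit inside the loop; the paper's contribution is a learned, differentiable stand-in for this consensus objective that works without transformation parameters.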
Unsupervised Bilingual Word Embedding Agreement for Unsupervised Neural Machine Translation
Title | Unsupervised Bilingual Word Embedding Agreement for Unsupervised Neural Machine Translation |
Authors | Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao |
Abstract | Unsupervised bilingual word embedding (UBWE), together with other technologies such as back-translation and denoising, has helped unsupervised neural machine translation (UNMT) achieve remarkable results in several language pairs. In previous methods, UBWE is first trained using non-parallel monolingual corpora and then this pre-trained UBWE is used to initialize the word embedding in the encoder and decoder of UNMT. That is, the training of UBWE and UNMT are separate. In this paper, we first empirically investigate the relationship between UBWE and UNMT. The empirical findings show that the performance of UNMT is significantly affected by the performance of UBWE. Thus, we propose two methods that train UNMT with UBWE agreement. Empirical results on several language pairs show that the proposed methods significantly outperform conventional UNMT. |
Tasks | Denoising, Machine Translation |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1119/ |
https://www.aclweb.org/anthology/P19-1119 | |
PWC | https://paperswithcode.com/paper/unsupervised-bilingual-word-embedding |
Repo | |
Framework | |
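As background for the UBWE component the abstract discusses: a standard building block in unsupervised bilingual word embedding pipelines is the closed-form orthogonal (Procrustes) mapping between the two embedding spaces. The sketch below shows that step only — it is not the paper's joint UBWE-UNMT training, and the toy matrices are invented.

```python
import numpy as np

def procrustes_mapping(X, Y):
    """Closed-form orthogonal W minimizing ||X @ W - Y||_F, where rows of
    X and Y are source- and target-language word embeddings assumed to be
    in correspondence. The solution is W = U @ Vt from the SVD of X^T Y."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt
```

Constraining W to be orthogonal preserves distances within the source space, which is why this refinement step is robust even with noisy correspondences; the paper's point is that keeping such a mapping in agreement with UNMT training, rather than freezing it after pre-training, improves translation.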
Learning Outcomes and Their Relatedness in a Medical Curriculum
Title | Learning Outcomes and Their Relatedness in a Medical Curriculum |
Authors | Sneha Mondal, Tejas Dhamecha, Shantanu Godbole, Smriti Pathak, Red Mendoza, K Gayathri Wijayarathna, Nabil Zary, Swarnadeep Saha, Malolan Chetlur |
Abstract | A typical medical curriculum is organized in a hierarchy of instructional objectives called Learning Outcomes (LOs); a few thousand LOs span five years of study. Gaining a thorough understanding of the curriculum requires learners to recognize and apply related LOs across years, and across different parts of the curriculum. However, given the large scope of the curriculum, manually labeling related LOs is tedious, and almost impossible to scale. In this paper, we build a system that learns relationships between LOs, and we achieve up to human-level performance in the LO relationship extraction task. We then present an application where the proposed system is employed to build a map of related LOs and Learning Resources (LRs) pertaining to a virtual patient case. We believe that our system can help medical students grasp the curriculum better, within classroom as well as in Intelligent Tutoring Systems (ITS) settings. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4442/ |
https://www.aclweb.org/anthology/W19-4442 | |
PWC | https://paperswithcode.com/paper/learning-outcomes-and-their-relatedness-in-a |
Repo | |
Framework | |
Weak contraction mapping and optimization
Title | Weak contraction mapping and optimization |
Authors | Siwei Luo |
Abstract | The weak contraction mapping is a self-mapping whose range is always a subset of the domain and which admits a unique fixed point. The iteration of a weak contraction mapping is a Cauchy sequence that yields the unique fixed point. A gradient-free optimization method is proposed as an application of weak contraction mapping to achieve global minimum convergence. The optimization method is robust to local minima and to the position of the initial point. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SygJSiA5YQ |
https://openreview.net/pdf?id=SygJSiA5YQ | |
PWC | https://paperswithcode.com/paper/weak-contraction-mapping-and-optimization |
Repo | |
Framework | |
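The fixed-point iteration underlying the abstract — iterate the mapping and let the Cauchy sequence converge to the unique fixed point, as in the Banach fixed-point theorem that weak contraction mappings generalize — can be illustrated with an ordinary contraction. The example below uses cos on [0, 1] (a contraction, since |sin x| ≤ sin 1 < 1 there); it demonstrates the iteration scheme, not the paper's optimization method.

```python
import math

def fixed_point(f, x0, tol=1e-10, max_iter=1000):
    """Iterate x_{n+1} = f(x_n) until successive iterates are within tol.
    For a contraction this sequence is Cauchy and converges to the unique
    fixed point regardless of the starting point x0."""
    x = x0
    for _ in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# The fixed point of cos is the Dottie number, approx. 0.7390851332.
```

The paper's optimizer builds on the same mechanism: design a mapping whose unique fixed point is the global minimizer, so the iteration itself performs gradient-free minimization.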