January 25, 2020

2542 words 12 mins read

Paper Group NANR 79


ZCU-NLP at MADAR 2019: Recognizing Arabic Dialects

Title ZCU-NLP at MADAR 2019: Recognizing Arabic Dialects
Authors Pavel Přibáň, Stephen Taylor
Abstract In this paper, we present our systems for the MADAR Shared Task: Arabic Fine-Grained Dialect Identification. The shared task consists of two subtasks. The goal of Subtask 1 (S-1) is to detect the Arabic city dialect of a given text, and the goal of Subtask 2 (S-2) is to predict the country of origin of a Twitter user from the tweets the user has posted. In S-1, our proposed systems are based on language modelling: we use language models to extract features that are then fed to other machine learning algorithms. We also experiment with recurrent neural networks (RNNs), but these experiments showed that simpler machine learning algorithms are more successful. Our system achieves a macro F1-score of 0.658 in S-1, ranking 6th out of 19 teams, and 0.475 in S-2, ranking 7th.
Tasks Language Modelling
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4623/
PDF https://www.aclweb.org/anthology/W19-4623
PWC https://paperswithcode.com/paper/zcu-nlp-at-madar-2019-recognizing-arabic
Repo
Framework
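The entry above describes using language models as feature extractors whose scores feed simpler classifiers. Below is a minimal sketch of that idea: per-dialect character n-gram models produce perplexity-style scores for a sentence, and those scores become the feature vector for a logistic-regression classifier. The dialect codes, toy sentences, and smoothing scheme are illustrative assumptions, not the authors' exact pipeline.

```python
# Sketch: per-dialect character n-gram scores as classifier features.
# Toy data and add-alpha smoothing are placeholders, not the paper's setup.
import math
from collections import Counter
from sklearn.linear_model import LogisticRegression

def char_ngrams(text, n=3):
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def train_lm(sentences, n=3):
    counts = Counter()
    for s in sentences:
        counts.update(char_ngrams(s, n))
    return counts, sum(counts.values())

def log_score(sentence, lm, n=3, alpha=1.0, vocab=10**5):
    counts, total = lm
    grams = char_ngrams(sentence, n) or [sentence]
    # Add-alpha smoothed average log-probability (a perplexity-like score).
    return sum(math.log((counts[g] + alpha) / (total + alpha * vocab))
               for g in grams) / len(grams)

# Toy corpus: dialect label -> training sentences (placeholders).
corpus = {"CAI": ["ezayak 3amel eh"], "RAB": ["kidayr labas"], "MSA": ["kayfa haluka"]}
lms = {d: train_lm(sents) for d, sents in corpus.items()}
dialects = sorted(lms)

def features(sentence):
    return [log_score(sentence, lms[d]) for d in dialects]

X = [features(s) for sents in corpus.values() for s in sents]
y = [d for d, sents in corpus.items() for _ in sents]
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict([features("ezayak")]))
```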

Arabic Dialect Identification with Deep Learning and Hybrid Frequency Based Features

Title Arabic Dialect Identification with Deep Learning and Hybrid Frequency Based Features
Authors Youssef Fares, Zeyad El-Zanaty, Kareem Abdel-Salam, Muhammed Ezzeldin, Aliaa Mohamed, Karim El-Awaad, Marwan Torki
Abstract Studies on Dialectal Arabic are growing more important by the day as it becomes the primary written and spoken form of Arabic online in informal settings. Among the important problems that should be explored is dialect identification. This paper reports different techniques that can be applied toward this goal and evaluates their performance on the Multi Arabic Dialect Applications and Resources (MADAR) Arabic Dialect Corpora. Our results show that improving on traditional systems that use frequency-based features and non-deep-learning classifiers is a challenging task. We propose different models based on different word and document representations. Our top model achieves a macro-averaged F1 score of 65.66 on MADAR's small-scale parallel corpus of 25 dialects and Modern Standard Arabic (MSA).
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4626/
PDF https://www.aclweb.org/anthology/W19-4626
PWC https://paperswithcode.com/paper/arabic-dialect-identification-with-deep
Repo
Framework

Optimistic Acceleration for Optimization

Title Optimistic Acceleration for Optimization
Authors Jun-Kun Wang, Xiaoyun Li, Ping Li
Abstract We consider new variants of optimization algorithms. Our algorithms are based on the observation that mini-batch stochastic gradients in consecutive iterations do not change drastically and may consequently be predictable. Inspired by the similar setting in the online learning literature, called Optimistic Online Learning, we propose two new optimistic algorithms, for AMSGrad and Adam respectively, that exploit the predictability of gradients. The new algorithms combine the ideas of the momentum method, adaptive gradient methods, and algorithms from Optimistic Online Learning, which leads to faster training of deep neural networks in practice.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=HkghV209tm
PDF https://openreview.net/pdf?id=HkghV209tm
PWC https://paperswithcode.com/paper/optimistic-acceleration-for-optimization
Repo
Framework
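The abstract above motivates an optimistic variant of Adam that exploits the predictability of consecutive mini-batch gradients. A minimal NumPy sketch of the general idea follows: keep Adam's moment estimates, but add a correction step driven by a cheap prediction of the next gradient (here simply the previous gradient). This illustrates optimistic acceleration in general, not the authors' exact algorithm or its convergence-preserving details.

```python
# Sketch of an "optimistic" Adam-style update. Hyperparameters, the toy
# objective, and the gradient predictor are illustrative assumptions.
import numpy as np

def optimistic_adam(grad_fn, w, steps=100, lr=1e-2,
                    beta1=0.9, beta2=0.999, eps=1e-8):
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    g_pred = np.zeros_like(w)           # prediction of the next gradient
    for t in range(1, steps + 1):
        g = grad_fn(w)                   # true gradient at the current point
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Standard Adam step, followed by an optimistic correction that
        # moves ahead along the difference between the observed gradient
        # and the gradient that was predicted for this step.
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
        w = w - lr * (g - g_pred) / (np.sqrt(v_hat) + eps)
        g_pred = g                       # simplest predictor: last gradient
    return w

# Toy usage: minimize f(w) = ||w||^2.
w_opt = optimistic_adam(lambda w: 2 * w, np.ones(5))
print(w_opt)
```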

The En-Ru Two-way Integrated Machine Translation System Based on Transformer

Title The En-Ru Two-way Integrated Machine Translation System Based on Transformer
Authors Doron Yu
Abstract Machine translation is one of the most popular areas in natural language processing. WMT is an evaluation campaign that assesses the machine translation capabilities of organizations around the world, and it is the evaluation we participated in. We took part in the two-way translation track between Russian and English (Russian-to-English and English-to-Russian). We used the official training data: 38 million parallel sentence pairs and 10 million monolingual sentences. Our overall framework is the Transformer neural machine translation model, supplemented by data filtering, post-processing, reordering and other related processing methods. Our final BLEU score is 38.7 for Russian-to-English, ranking 5th, and 27.8 for English-to-Russian, ranking 10th.
Tasks Machine Translation
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5349/
PDF https://www.aclweb.org/anthology/W19-5349
PWC https://paperswithcode.com/paper/the-en-ru-two-way-integrated-machine
Repo
Framework
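The system above supplements the Transformer with data filtering among other processing steps. A hedged sketch of one common form of parallel-data filtering, a length and length-ratio check, is shown below; the thresholds and the whitespace tokenization are assumptions, not the system's actual cleaning rules.

```python
# Sketch: a simple length-ratio filter for parallel data, one common form of
# the "data filtering" mentioned above. Thresholds are illustrative.
def keep_pair(src, tgt, max_len=100, max_ratio=2.0):
    s, t = src.split(), tgt.split()
    if not s or not t:
        return False
    if len(s) > max_len or len(t) > max_len:
        return False
    ratio = max(len(s), len(t)) / min(len(s), len(t))
    return ratio <= max_ratio

pairs = [("hello world", "привет мир"),
         ("a", "очень очень длинное предложение")]   # second pair is dropped
clean = [p for p in pairs if keep_pair(*p)]
print(clean)
```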

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Title Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
Authors
Abstract
Tasks
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-3000/
PDF https://www.aclweb.org/anthology/P19-3000
PWC https://paperswithcode.com/paper/proceedings-of-the-57th-conference-of-the-2
Repo
Framework

Learning to Describe Scenes with Programs

Title Learning to Describe Scenes with Programs
Authors Yunchao Liu, Zheng Wu, Daniel Ritchie, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu
Abstract Human scene perception goes beyond recognizing a collection of objects and their pairwise relations. We are able to understand the higher-level, abstract regularities within the scene such as symmetry and repetition. Current vision recognition modules and scene representations fall short in this dimension. In this paper, we present scene programs, representing a scene via a symbolic program for its objects and their attributes. We also propose a model that infers such scene programs by exploiting a hierarchical, object-based scene representation. Experiments demonstrate that our model works well on synthetic data and is able to transfer to real images with such compositional structure. The use of scene programs has enabled a number of applications, such as complex visual analogy-making and scene extrapolation.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=SyNPk2R9K7
PDF https://openreview.net/pdf?id=SyNPk2R9K7
PWC https://paperswithcode.com/paper/learning-to-describe-scenes-with-programs
Repo
Framework
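The abstract above proposes scene programs: symbolic descriptions of objects and their attributes that make regularities such as repetition explicit. The sketch below shows a toy DSL in that spirit, where a loop construct describes a row of objects once and expands into a flat object list; the Obj/Loop classes are hypothetical and not the paper's actual grammar.

```python
# Sketch: a tiny symbolic "scene program" that captures repetition explicitly
# and expands into a flat object list. The DSL (Obj, Loop) is hypothetical.
from dataclasses import dataclass
from typing import List

@dataclass
class Obj:
    shape: str
    color: str
    x: float
    y: float

@dataclass
class Loop:
    count: int
    dx: float
    dy: float
    template: Obj

    def expand(self) -> List[Obj]:
        return [Obj(self.template.shape, self.template.color,
                    self.template.x + i * self.dx,
                    self.template.y + i * self.dy)
                for i in range(self.count)]

# A row of three red cubes, described once and expanded on demand.
program = Loop(count=3, dx=1.0, dy=0.0, template=Obj("cube", "red", 0.0, 0.0))
for obj in program.expand():
    print(obj)
```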

Parsing Chinese Sentences with Grammatical Relations

Title Parsing Chinese Sentences with Grammatical Relations
Authors Weiwei Sun, Yufei Chen, Xiaojun Wan, Meichun Liu
Abstract We report our work on building linguistic resources and data-driven parsers for grammatical relation (GR) analysis of Mandarin Chinese. Chinese, as an analytic language, encodes grammatical information in a highly configurational rather than morphological way. Accordingly, it is possible and reasonable to represent almost all grammatical relations as bilexical dependencies. In this work, we propose to represent grammatical information using general directed dependency graphs. Both local and rich long-distance dependencies are explicitly represented. To create high-quality annotations, we take advantage of an existing treebank, namely the Chinese TreeBank (CTB), which is grounded in Government and Binding theory. We define a set of linguistic rules to explore CTB's implicit phrase-structural information and build deep dependency graphs. The reliability of this linguistically motivated GR extraction procedure is verified by manual evaluation. Based on the converted corpus, data-driven models, including graph- and transition-based ones, are explored for Chinese GR parsing. For graph-based parsing, a new perspective, graph merging, is proposed for building flexible dependency graphs: constructing complex graphs via constructing simple subgraphs. Two key problems are discussed in this perspective: (1) how to decompose a complex graph into simple subgraphs, and (2) how to combine subgraphs into a coherent complex graph. For transition-based parsing, we introduce a neural parser based on a list-based transition system. We also discuss several other key problems, including dynamic oracles and beam search for neural transition-based parsing. Evaluation gauges how successful GR parsing for Chinese can be with data-driven models. The empirical analysis suggests several directions for future study.
Tasks
Published 2019-03-01
URL https://www.aclweb.org/anthology/J19-1003/
PDF https://www.aclweb.org/anthology/J19-1003
PWC https://paperswithcode.com/paper/parsing-chinese-sentences-with-grammatical
Repo
Framework
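For the graph-merging perspective described above, the combination step can be pictured as taking several simple labelled subgraphs over the same sentence and unioning their arcs while flagging conflicts. The sketch below illustrates that step with (head, dependent, label) triples; the representation and the toy arcs are assumptions for illustration only.

```python
# Sketch of the "graph merging" idea: simple dependency subgraphs over the
# same sentence are combined into one labelled graph, keeping every arc and
# flagging conflicting labels for the same arc.
def merge_subgraphs(subgraphs):
    merged = {}                                  # (head, dep) -> label
    conflicts = []
    for sub in subgraphs:
        for head, dep, label in sub:
            key = (head, dep)
            if key in merged and merged[key] != label:
                conflicts.append((key, merged[key], label))
            else:
                merged[key] = label
    return merged, conflicts

# Two toy subgraphs; token indices stand in for words, and the second
# subgraph deliberately disagrees on one arc to show conflict handling.
local_arcs = [(2, 1, "subj"), (2, 3, "obj")]
long_distance_arcs = [(4, 1, "subj"), (2, 3, "comp")]
graph, conflicts = merge_subgraphs([local_arcs, long_distance_arcs])
print(graph)
print(conflicts)
```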

Predicting the Present and Future States of Multi-agent Systems from Partially-observed Visual Data

Title Predicting the Present and Future States of Multi-agent Systems from Partially-observed Visual Data
Authors Chen Sun, Per Karlsson, Jiajun Wu, Joshua B Tenenbaum, Kevin Murphy
Abstract We present a method which learns to integrate temporal information, from a learned dynamics model, with ambiguous visual information, from a learned vision model, in the context of interacting agents. Our method is based on a graph-structured variational recurrent neural network, which is trained end-to-end to infer the current state of the (partially observed) world, as well as to forecast future states. We show that our method outperforms various baselines on two sports datasets, one based on real basketball trajectories, and one generated by a soccer game engine.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=r1xdH3CcKX
PDF https://openreview.net/pdf?id=r1xdH3CcKX
PWC https://paperswithcode.com/paper/predicting-the-present-and-future-states-of
Repo
Framework
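The abstract above describes a graph-structured recurrent model over interacting agents. The sketch below shows the flavour of such an architecture: each agent's hidden state is updated from its own observation plus pooled messages from the other agents, then decoded into a predicted position. It is a plain recurrent illustration only; the paper's model is variational and also integrates a learned vision component, neither of which is reproduced here.

```python
# Sketch: a graph-structured recurrent state update for interacting agents.
# Dimensions and the mean-pooling message scheme are illustrative assumptions.
import torch
import torch.nn as nn

class AgentGraphRNN(nn.Module):
    def __init__(self, obs_dim=2, hidden_dim=32):
        super().__init__()
        self.message = nn.Linear(hidden_dim, hidden_dim)
        self.cell = nn.GRUCell(obs_dim + hidden_dim, hidden_dim)
        self.decode = nn.Linear(hidden_dim, 2)       # predicted (x, y)

    def forward(self, obs, h):
        # obs: [agents, obs_dim], h: [agents, hidden_dim]
        msgs = self.message(h)                        # per-agent messages
        # Mean of messages from all *other* agents.
        pooled = (msgs.sum(0, keepdim=True) - msgs) / max(h.shape[0] - 1, 1)
        h_new = self.cell(torch.cat([obs, pooled], dim=-1), h)
        return self.decode(h_new), h_new

model = AgentGraphRNN()
obs = torch.randn(5, 2)                  # 5 agents, 2-D observations
h = torch.zeros(5, 32)
pred, h = model(obs, h)
print(pred.shape)                        # torch.Size([5, 2])
```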

The role of over-parametrization in generalization of neural networks

Title The role of over-parametrization in generalization of neural networks
Authors Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann LeCun, Nathan Srebro
Abstract Despite existing work on ensuring generalization of neural networks in terms of scale-sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization. In this work we suggest a novel complexity measure based on unit-wise capacities, resulting in a tighter generalization bound for two-layer ReLU networks. Our capacity bound correlates with the behavior of test error with increasing network sizes (within the range reported in the experiments), and could partly explain the improvement in generalization with over-parametrization. We further present a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=BygfghAcYX
PDF https://openreview.net/pdf?id=BygfghAcYX
PWC https://paperswithcode.com/paper/the-role-of-over-parametrization-in
Repo
Framework
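The argument above follows the usual capacity-based template: a complexity measure of the hypothesis class bounds the gap between training and test error, so a measure that shrinks per unit as width grows can track the observed behaviour. For reference, a generic bound of this form (the standard Rademacher-complexity bound for a loss bounded in [0, 1], not the paper's unit-wise bound) is:

```latex
% Generic capacity-based generalization bound (standard Rademacher form,
% not the paper's unit-wise bound): for a loss in [0,1] and an i.i.d.
% sample of size m, with probability at least 1 - \delta, every
% f \in \mathcal{F} satisfies
L(f) \;\le\; \widehat{L}(f) \;+\; 2\,\mathfrak{R}_m(\mathcal{F})
       \;+\; \sqrt{\frac{\ln(1/\delta)}{2m}} .
```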

Combining Global Sparse Gradients with Local Gradients in Distributed Neural Network Training

Title Combining Global Sparse Gradients with Local Gradients in Distributed Neural Network Training
Authors Alham Fikri Aji, Kenneth Heafield, Nikolay Bogoychev
Abstract One way to reduce network traffic in multi-node data-parallel stochastic gradient descent is to only exchange the largest gradients. However, doing so damages the gradient and degrades the model's performance. Transformer models degrade dramatically while the impact on RNNs is smaller. We restore gradient quality by combining the compressed global gradient with the node's locally computed uncompressed gradient. Neural machine translation experiments show that Transformer convergence is restored while RNNs converge faster. With our method, training on 4 nodes converges up to 1.5x as fast as with uncompressed gradients and scales 3.5x relative to single-node training.
Tasks Machine Translation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1373/
PDF https://www.aclweb.org/anthology/D19-1373
PWC https://paperswithcode.com/paper/combining-global-sparse-gradients-with-local
Repo
Framework
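The method above sends only the largest gradient entries over the network and restores quality by mixing in the local gradient. The sketch below illustrates that combination: each node sparsifies its gradient to the top-k entries, the sparse contributions are averaged into a global gradient, and each node adds back the part of its own dense gradient that it did not send. The value of k, the averaging, and the toy shapes are illustrative assumptions.

```python
# Sketch: top-k gradient sparsification plus local-gradient combination.
import numpy as np

def top_k_sparsify(grad, k):
    idx = np.argsort(np.abs(grad))[-k:]          # indices of largest entries
    sparse = np.zeros_like(grad)
    sparse[idx] = grad[idx]
    return sparse

def combined_update(local_grads, k):
    # Each node sends only its sparsified gradient; the aggregate is averaged.
    sparsified = [top_k_sparsify(g, k) for g in local_grads]
    global_sparse = np.mean(sparsified, axis=0)
    updates = []
    for g, s in zip(local_grads, sparsified):
        local_residual = g - s                   # entries this node kept local
        updates.append(global_sparse + local_residual)
    return updates

grads = [np.random.randn(10) for _ in range(4)]  # 4 nodes, toy gradients
for u in combined_update(grads, k=2):
    print(np.round(u, 2))
```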

Deformable Surface Tracking by Graph Matching

Title Deformable Surface Tracking by Graph Matching
Authors Tao Wang, Haibin Ling, Congyan Lang, Songhe Feng, Xiaohui Hou
Abstract This paper addresses the problem of deformable surface tracking from monocular images. Specifically, we propose a graph-based approach that effectively explores the structure information of the surface to enhance tracking performance. Our approach solves simultaneously for feature correspondence, outlier rejection and shape reconstruction by optimizing a single objective function, which is defined by means of pairwise projection errors between graph structures instead of unary projection errors between matched points. Furthermore, an efficient matching algorithm is developed based on soft matching relaxation. For evaluation, our approach is extensively compared to state-of-the-art algorithms on a standard dataset of occluded surfaces, as well as a newly compiled dataset of different surfaces with rich, weak or repetitive texture. Experimental results reveal that our approach achieves robust tracking results for surfaces with different types of texture, and outperforms other algorithms in both accuracy and efficiency.
Tasks Graph Matching
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Wang_Deformable_Surface_Tracking_by_Graph_Matching_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Wang_Deformable_Surface_Tracking_by_Graph_Matching_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/deformable-surface-tracking-by-graph-matching
Repo
Framework
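The key modelling choice above is to score a candidate correspondence with pairwise projection errors between graph edges rather than unary errors between matched points. The sketch below contrasts the two error terms under a simple pinhole projection with identity intrinsics; the projection model and toy data are assumptions, not the paper's formulation or its matching optimizer.

```python
# Sketch: unary vs. pairwise projection errors for a candidate match between
# template points (3-D) and image points (2-D).
import numpy as np

def project(points_3d):
    # Pinhole projection with identity intrinsics: (x/z, y/z).
    return points_3d[:, :2] / points_3d[:, 2:3]

def unary_error(points_3d, points_2d):
    return np.sum(np.linalg.norm(project(points_3d) - points_2d, axis=1))

def pairwise_error(points_3d, points_2d, edges):
    proj = project(points_3d)
    err = 0.0
    for i, j in edges:
        # Compare the projected edge vector with the observed edge vector.
        err += np.linalg.norm((proj[i] - proj[j]) - (points_2d[i] - points_2d[j]))
    return err

pts3d = np.array([[0.0, 0.0, 2.0], [1.0, 0.0, 2.0], [0.0, 1.0, 2.0]])
pts2d = project(pts3d) + 0.01 * np.random.randn(3, 2)   # noisy observations
edges = [(0, 1), (1, 2), (0, 2)]                          # graph structure
print(unary_error(pts3d, pts2d), pairwise_error(pts3d, pts2d, edges))
```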

Semantic Role Labeling with Pretrained Language Models for Known and Unknown Predicates

Title Semantic Role Labeling with Pretrained Language Models for Known and Unknown Predicates
Authors Daniil Larionov, Artem Shelmanov, Elena Chistova, Ivan Smirnov
Abstract We build the first full pipeline for semantic role labelling of Russian texts. The pipeline implements predicate identification, argument extraction, argument classification (labeling), and global scoring via integer linear programming. We train supervised neural network models for argument classification using the Russian semantically annotated corpus FrameBank. However, we note that this resource provides annotations for only a very limited set of predicates. We combat the problem of annotation scarcity by introducing two models that rely on different sets of features: one for "known" predicates that are present in the training set and one for "unknown" predicates that are not. We show that the model for "unknown" predicates can alleviate the lack of annotation by using pretrained embeddings. We perform experiments with various types of embeddings, including ones generated by deep pretrained language models (word2vec, FastText, ELMo, BERT), and show that embeddings generated by deep pretrained language models are superior to classical shallow embeddings for argument classification of both "known" and "unknown" predicates.
Tasks Semantic Role Labeling
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-1073/
PDF https://www.aclweb.org/anthology/R19-1073
PWC https://paperswithcode.com/paper/semantic-role-labeling-with-pretrained
Repo
Framework
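The two-model strategy above routes each predicate to either a classifier trained on FrameBank-covered ("known") predicates or a fallback classifier built on pretrained embeddings ("unknown" predicates). The sketch below shows only that routing logic; the two model classes are trivial stand-ins, not the authors' networks.

```python
# Sketch of the known/unknown predicate routing. The model classes are
# placeholders; only the routing decision is the point.
class KnownPredicateModel:
    def predict(self, tokens, predicate):
        return ["ARG"] * len(tokens)          # placeholder labels

class UnknownPredicateModel:
    def predict(self, tokens, predicate):
        return ["ARG?"] * len(tokens)         # placeholder labels

def label_arguments(tokens, predicate, known_predicates, known, unknown):
    # Predicates covered by the annotated corpus use the specialised model;
    # everything else falls back on pretrained-embedding features.
    model = known if predicate in known_predicates else unknown
    return model.predict(tokens, predicate)

known_predicates = {"читать", "давать"}       # toy set of covered predicates
tokens = ["он", "читает", "книгу"]
print(label_arguments(tokens, "читать", known_predicates,
                      KnownPredicateModel(), UnknownPredicateModel()))
```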

Generating Sentential Arguments from Diverse Perspectives on Controversial Topic

Title Generating Sentential Arguments from Diverse Perspectives on Controversial Topic
Authors ChaeHun Park, Wonsuk Yang, Jong Park
Abstract Considering diverse aspects of an argumentative issue is an essential step for mitigating biased opinions and making reasonable decisions. Compared to retrieval-based methods, which may show unstable performance on unseen data, a generation model can produce flexible results that cover a wide range of topics. In this paper, we study the problem of generating sentential arguments from multiple perspectives, and we propose a neural method to address this problem. Our model, ArgDiver (Argument generation model from diverse perspectives), works in a manner similar to a conversational system and successfully generates high-quality sentential arguments. At the same time, the arguments automatically generated by our model show higher diversity than those generated by any of the baseline models. We believe that our work provides evidence for the potential of a good generation model to provide diverse perspectives on a controversial topic.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5007/
PDF https://www.aclweb.org/anthology/D19-5007
PWC https://paperswithcode.com/paper/generating-sentential-arguments-from-diverse
Repo
Framework

Team JUST at the MADAR Shared Task on Arabic Fine-Grained Dialect Identification

Title Team JUST at the MADAR Shared Task on Arabic Fine-Grained Dialect Identification
Authors Bashar Talafha, Ali Fadel, Mahmoud Al-Ayyoub, Yaser Jararweh, Mohammad AL-Smadi, Patrick Juola
Abstract In this paper, we describe our team's effort on the MADAR Shared Task on Arabic Fine-Grained Dialect Identification. The task requires building a system capable of differentiating between 25 different Arabic dialects in addition to MSA. Our approach is simple: after preprocessing the data, we use Data Augmentation (DA) to enlarge the training data six-fold. We then build a language model, extract n-gram word-level and character-level TF-IDF features, and feed them into an MNB classifier. Despite its simplicity, the resulting model performs remarkably well, producing the 4th highest F-measure and region-level accuracy and the 5th highest precision, recall, city-level accuracy and country-level accuracy among the participating teams.
Tasks Data Augmentation, Language Modelling
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4638/
PDF https://www.aclweb.org/anthology/W19-4638
PWC https://paperswithcode.com/paper/team-just-at-the-madar-shared-task-on-arabic
Repo
Framework
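The pipeline above is explicit enough to sketch: word-level and character-level TF-IDF n-gram features are combined and fed to a Multinomial Naive Bayes classifier. The snippet below is a hedged reconstruction with toy data; the n-gram ranges are assumptions, and the data augmentation step is omitted.

```python
# Sketch: word- and character-level TF-IDF n-gram features into an MNB
# classifier, in the spirit of the pipeline described above. Toy data and
# n-gram ranges are placeholders, not the team's configuration.
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["ezayak 3amel eh", "kidayr labas", "kayfa haluka"]  # toy sentences
labels = ["CAI", "RAB", "MSA"]                                # toy labels

pipeline = Pipeline([
    ("tfidf", FeatureUnion([
        ("word", TfidfVectorizer(analyzer="word", ngram_range=(1, 3))),
        ("char", TfidfVectorizer(analyzer="char", ngram_range=(1, 5))),
    ])),
    ("mnb", MultinomialNB()),
])
pipeline.fit(texts, labels)
print(pipeline.predict(["kayfa"]))
```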

Learning Protein Structure with a Differentiable Simulator

Title Learning Protein Structure with a Differentiable Simulator
Authors John Ingraham, Adam Riesselman, Chris Sander, Debora Marks
Abstract The Boltzmann distribution is a natural model for many systems, from brains to materials and biomolecules, but it is often of limited utility for fitting data because Monte Carlo algorithms are unable to simulate it in the available time. This gap between the expressive capabilities and sampling practicalities of energy-based models is exemplified by the protein folding problem, since energy landscapes underlie contemporary knowledge of protein biophysics but computer simulations are often unable to fold all but the smallest proteins from first principles. In this work we aim to bridge the gap between the expressive capacity of energy functions and the practical capabilities of their simulators by using an unrolled Monte Carlo simulation as a model for data. We compose a neural energy function with a novel and efficient simulator based on Langevin dynamics to build an end-to-end-differentiable model of atomic protein structure given amino acid sequence information. We introduce techniques for stabilizing backpropagation under long roll-outs and demonstrate the model's capacity to make multimodal predictions and to (sometimes) generalize to unobserved protein fold types when trained on a large corpus of protein structures.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=Byg3y3C9Km
PDF https://openreview.net/pdf?id=Byg3y3C9Km
PWC https://paperswithcode.com/paper/learning-protein-structure-with-a
Repo
Framework
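The core ingredient above is an unrolled, differentiable Langevin-dynamics simulator around a learned energy function. The sketch below shows such a rollout in PyTorch for a toy quadratic energy, keeping the computation graph so gradients can flow through the simulation; the step size, step count, and energy are illustrative assumptions, and the paper's stabilization techniques are not included.

```python
# Sketch: an unrolled Langevin-dynamics rollout around a differentiable
# energy function. The toy quadratic energy and hyperparameters are
# placeholders; the real model uses a learned neural energy over atoms.
import torch

def langevin_rollout(energy, x0, steps=50, eps=1e-2):
    x = x0.requires_grad_(True)
    for _ in range(steps):
        # Keep the graph (create_graph=True) so the whole rollout stays
        # differentiable with respect to upstream parameters.
        grad = torch.autograd.grad(energy(x).sum(), x, create_graph=True)[0]
        # Langevin update: gradient step on the energy plus injected noise.
        x = x - 0.5 * eps * grad + (eps ** 0.5) * torch.randn_like(x)
    return x

energy = lambda x: 0.5 * (x ** 2).sum(-1)     # toy quadratic energy
samples = langevin_rollout(energy, torch.randn(8, 3))
print(samples.shape)                           # torch.Size([8, 3])
```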