January 24, 2020

2688 words 13 mins read

Paper Group NANR 270

The OSU/Facebook Realizer for SRST 2019: Seq2Seq Inflection and Serialized Tree2Tree Linearization. Recycling a Pre-trained BERT Encoder for Neural Machine Translation. Learning with Limited Data for Multilingual Reading Comprehension. A new dog learns old tricks: RL finds classic optimization algorithms. Turkish Tweet Classification with Transform …

The OSU/Facebook Realizer for SRST 2019: Seq2Seq Inflection and Serialized Tree2Tree Linearization

Title The OSU/Facebook Realizer for SRST 2019: Seq2Seq Inflection and Serialized Tree2Tree Linearization
Authors Kartikeya Upasani, David King, Jinfeng Rao, Anusha Balakrishnan, Michael White
Abstract We describe our exploratory system for the shallow surface realization task, which combines morphological inflection using character sequence-to-sequence models with a baseline linearizer that implements a tree-to-tree model using sequence-to-sequence models on serialized trees. Results for morphological inflection were competitive across languages. Due to time constraints, we could only submit complete results (including linearization) for English. Preliminary linearization results were decent, with a small benefit from reranking to prefer valid output trees, but inadequate control over the words in the output led to poor quality on longer sentences.
Tasks Morphological Inflection
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-6309/
PDF https://www.aclweb.org/anthology/D19-6309
PWC https://paperswithcode.com/paper/the-osufacebook-realizer-for-srst-2019
Repo
Framework
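
A minimal sketch of the serialized-tree idea from the abstract above: a tree is flattened into a bracketed token sequence so that an off-the-shelf seq2seq model can operate on it. The bracket tokens and node format here are illustrative assumptions, not the authors' actual serialization scheme.

```python
# Flatten a tree of (lemma, children) pairs into a bracketed token
# sequence, depth-first. A seq2seq model can then be trained to map
# serialized unordered trees to serialized ordered (linearized) trees.
def serialize(node):
    lemma, children = node
    tokens = ["(", lemma]
    for child in children:
        tokens.extend(serialize(child))
    tokens.append(")")
    return tokens

# Toy unordered dependency tree for "the dog barked".
tree = ("bark", [("dog", [("the", [])])])
print(" ".join(serialize(tree)))  # ( bark ( dog ( the ) ) )
```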

Recycling a Pre-trained BERT Encoder for Neural Machine Translation

Title Recycling a Pre-trained BERT Encoder for Neural Machine Translation
Authors Kenji Imamura, Eiichiro Sumita
Abstract In this paper, a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model is applied to Transformer-based neural machine translation (NMT). In contrast to monolingual tasks, the number of unlearned model parameters in an NMT decoder is as large as the number of learned parameters in the BERT model. To train all the models appropriately, we employ two-stage optimization, which first trains only the unlearned parameters by freezing the BERT model, and then fine-tunes all the sub-models. In our experiments, two-stage optimization was stable, whereas the BLEU scores of direct fine-tuning were extremely low. Consequently, the BLEU scores of the proposed method were better than those of the Transformer base model and of the same model without pre-training. Additionally, we confirmed that NMT with the BERT encoder is more effective in low-resource settings.
Tasks Machine Translation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5603/
PDF https://www.aclweb.org/anthology/D19-5603
PWC https://paperswithcode.com/paper/recycling-a-pre-trained-bert-encoder-for
Repo
Framework
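
A minimal PyTorch sketch of the two-stage optimization described above: stage one freezes the pre-trained BERT encoder and trains only the randomly initialized decoder; stage two unfreezes everything and fine-tunes jointly. The model attribute names, optimizer, and learning rates are placeholder assumptions, not the paper's configuration.

```python
import torch

def stage1_train_decoder(model, run_epoch, lr=1e-4):
    # Freeze the pre-trained BERT encoder; only the "unlearned"
    # decoder parameters receive gradient updates.
    for p in model.encoder.parameters():
        p.requires_grad = False
    trainable = [p for p in model.parameters() if p.requires_grad]
    run_epoch(model, torch.optim.Adam(trainable, lr=lr))

def stage2_finetune_all(model, run_epoch, lr=1e-5):
    # Unfreeze all sub-models and fine-tune jointly, typically
    # with a smaller learning rate than stage one.
    for p in model.parameters():
        p.requires_grad = True
    run_epoch(model, torch.optim.Adam(model.parameters(), lr=lr))
```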

Learning with Limited Data for Multilingual Reading Comprehension

Title Learning with Limited Data for Multilingual Reading Comprehension
Authors Kyungjae Lee, Sunghyun Park, Hojae Han, Jinyoung Yeo, Seung-won Hwang, Juho Lee
Abstract This paper studies the problem of supporting question answering in a new language with limited training resources. As an extreme scenario, when no such resource exists, one can (1) transfer labels from another language, and (2) generate labels from unlabeled data, using a translator and an automatic labeling function, respectively. However, these approaches inevitably introduce noise into the training data, due to translation or generation errors, which requires a judicious use of data with varying confidence. To address this challenge, we propose a weakly-supervised framework that quantifies such noise in automatically generated labels, to deemphasize or fix noisy data in training. On the reading comprehension task, we demonstrate the effectiveness of our model on low-resource languages with varying similarity to English, namely Korean and French.
Tasks Question Answering, Reading Comprehension
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1283/
PDF https://www.aclweb.org/anthology/D19-1283
PWC https://paperswithcode.com/paper/learning-with-limited-data-for-multilingual
Repo
Framework
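
The core idea above, deemphasizing noisy automatically generated labels during training, can be sketched as a per-example confidence weight on the loss. The weighting below is a generic illustration, not the paper's actual noise-quantification model.

```python
import torch
import torch.nn.functional as F

def confidence_weighted_loss(logits, labels, confidence):
    # Standard cross-entropy, but each (possibly noisy) example is
    # scaled by an estimated confidence in [0, 1]; low-confidence
    # examples contribute little to the gradient.
    per_example = F.cross_entropy(logits, labels, reduction="none")
    return (confidence * per_example).mean()

logits = torch.randn(4, 10)                      # toy batch
labels = torch.randint(0, 10, (4,))
confidence = torch.tensor([1.0, 0.9, 0.3, 0.6])  # e.g. translated data
print(confidence_weighted_loss(logits, labels, confidence))
```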

A new dog learns old tricks: RL finds classic optimization algorithms

Title A new dog learns old tricks: RL finds classic optimization algorithms
Authors Weiwei Kong, Christopher Liaw, Aranyak Mehta, D. Sivakumar
Abstract This paper introduces a novel framework for learning algorithms to solve online combinatorial optimization problems. Towards this goal, we introduce a number of key ideas from traditional algorithms and complexity theory. First, we draw a new connection between primal-dual methods and reinforcement learning. Next, we introduce the concept of adversarial distributions (universal and high-entropy training sets), which are distributions that encourage the learner to find algorithms that work well in the worst case. We test our new ideas on a number of optimization problems such as the AdWords problem, the online knapsack problem, and the secretary problem. Our results indicate that the models have learned behaviours that are consistent with the traditional optimal algorithms for these problems.
Tasks Combinatorial Optimization
Published 2019-05-01
URL https://openreview.net/forum?id=rkluJ2R9KQ
PDF https://openreview.net/pdf?id=rkluJ2R9KQ
PWC https://paperswithcode.com/paper/a-new-dog-learns-old-tricks-rl-finds-classic
Repo
Framework
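
For context, one of the classic optimal algorithms the learned policies are compared against is the 1/e stopping rule for the secretary problem. The simulation below is an illustrative baseline, not the paper's RL training setup.

```python
import math, random

def secretary_rule(values):
    # Classic 1/e rule: skip the first n/e candidates, then accept
    # the first candidate better than everything seen so far.
    n = len(values)
    cutoff = int(n / math.e)
    best_seen = max(values[:cutoff], default=float("-inf"))
    for v in values[cutoff:]:
        if v > best_seen:
            return v
    return values[-1]  # forced to take the last candidate

# Empirical probability of picking the single best candidate;
# theory predicts roughly 1/e ~ 0.368 for large n.
trials, n, wins = 20000, 50, 0
for _ in range(trials):
    vals = random.sample(range(10000), n)
    wins += secretary_rule(vals) == max(vals)
print(wins / trials)
```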

Turkish Tweet Classification with Transformer Encoder

Title Turkish Tweet Classification with Transformer Encoder
Authors Atıf Emre Yüksel, Yaşar Alim Türkmen, Arzucan Özgür, Berna Altınel
Abstract Short-text classification is a challenging task, due to the sparsity and high dimensionality of the feature space. In this study, we aim to analyze and classify Turkish tweets based on their topics. Social media jargon and the agglutinative structure of the Turkish language make this classification task even harder. As far as we know, this is the first study that uses a Transformer encoder for short-text classification in Turkish. The model is trained in a weakly supervised manner, where the training data set has been labeled automatically. Our results on the test set, which has been manually labeled, show that performing morphological analysis improves the classification performance of the traditional machine learning algorithms Random Forest, Naive Bayes, and Support Vector Machines. Still, the proposed approach achieves an F-score of 89.3%, outperforming those algorithms by at least 5 points.
Tasks Morphological Analysis, Text Classification
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-1158/
PDF https://www.aclweb.org/anthology/R19-1158
PWC https://paperswithcode.com/paper/turkish-tweet-classification-with-transformer
Repo
Framework
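
A minimal PyTorch sketch of a Transformer-encoder text classifier of the kind described above; the dimensions, mean-pooling, and tokenization are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class TweetClassifier(nn.Module):
    # Embed tokens, run a Transformer encoder, mean-pool, classify.
    def __init__(self, vocab_size, n_classes, d_model=128,
                 n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, token_ids):              # (batch, seq_len)
        h = self.encoder(self.embed(token_ids))
        return self.head(h.mean(dim=1))        # pool over tokens

model = TweetClassifier(vocab_size=30000, n_classes=12)
print(model(torch.randint(0, 30000, (2, 24))).shape)  # (2, 12)
```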

Automatic Arabic Text Summarization Based on Fuzzy Logic

Title Automatic Arabic Text Summarization Based on Fuzzy Logic
Authors Lamees Al Qassem, Di Wang, Hassan Barada, Ahmad Al-Rubaie, Nawaf Almoosa
Abstract
Tasks Text Summarization
Published 2019-09-01
URL https://www.aclweb.org/anthology/W19-7406/
PDF https://www.aclweb.org/anthology/W19-7406
PWC https://paperswithcode.com/paper/automatic-arabic-text-summarization-based-on
Repo
Framework

Filter Training and Maximum Response: Classification via Discerning

Title Filter Training and Maximum Response: Classification via Discerning
Authors Lei Gu
Abstract This report introduces a training and recognition scheme in which classification is realized via class-wise discerning. Trained with datasets whose labels are randomly shuffled except for one class of interest, a neural network learns class-wise parameter values and remolds itself from a feature sorter into feature filters, each of which discerns objects belonging to only one of the classes. The classification of an input can be inferred from the maximum response of the filters. Checking against multiple versions of the filters can diminish fluctuation and yields better performance. This scheme of discerning, maximum response, and multiple checks is a generally applicable method for improving the performance of feedforward networks, and the filter training itself is a promising feature-abstraction procedure. In contrast to direct sorting, the scheme mimics a classification process mediated by a series of single-component selections.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=r1gKNs0qYX
PDF https://openreview.net/pdf?id=r1gKNs0qYX
PWC https://paperswithcode.com/paper/filter-training-and-maximum-response
Repo
Framework
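
A sketch of the two pieces described above: constructing a filter-training dataset (labels shuffled except for one class of interest) and classifying by maximum filter response. This is a simplified reading of the scheme, with stand-ins for the networks.

```python
import numpy as np

def filter_training_labels(labels, class_of_interest, rng):
    # Keep labels of the class of interest intact; randomly shuffle
    # all other labels, so the trained network can only learn to
    # discern membership in the one class reliably.
    out = labels.copy()
    other = labels != class_of_interest
    out[other] = rng.permutation(out[other])
    return out

def classify_by_max_response(filters, x):
    # Each per-class filter scores x for "its" class; predict the
    # class whose filter responds most strongly.
    return int(np.argmax([f(x) for f in filters]))

rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=100)
print(filter_training_labels(labels, class_of_interest=3, rng=rng)[:10])
```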

Reinforced Product Metadata Selection for Helpfulness Assessment of Customer Reviews

Title Reinforced Product Metadata Selection for Helpfulness Assessment of Customer Reviews
Authors Miao Fan, Chao Feng, Mingming Sun, Ping Li
Abstract To automatically assess the helpfulness of a customer review online, conventional approaches generally acquire various linguistic and neural embedding features solely from the textual content of the review itself as evidence. We, however, find that a helpful review is largely concerned with the metadata (such as the name, the brand, the category, etc.) of its target product. This leaves us with the challenge of choosing the correct key-value product metadata to help appraise the helpfulness of free-text reviews more precisely. To address this problem, we propose a novel framework composed of two mutually beneficial modules. Given a product, a selector (agent) learns from both the keys in the product metadata and one of its reviews to take an action that selects the correct value, and a successive predictor (network) makes the free-text review attend to this value to obtain better neural representations for helpfulness assessment. The predictor is directly optimized by SGD with the helpfulness-prediction loss, and the selector is updated via policy gradient, rewarded with the performance of the predictor. We use two real-world datasets, from Amazon.com and Yelp.com respectively, to compare the performance of our framework with other mainstream methods under two application scenarios: helpfulness identification and regression of customer reviews. Extensive results demonstrate that our framework can achieve state-of-the-art performance with substantial improvements.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1177/
PDF https://www.aclweb.org/anthology/D19-1177
PWC https://paperswithcode.com/paper/reinforced-product-metadata-selection-for
Repo
Framework
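
A minimal sketch of the selector's policy-gradient update described above: sample a metadata value, observe the predictor's performance as reward, and apply REINFORCE. The feature sizes and the stand-in scalar reward are illustrative assumptions.

```python
import torch

# Toy selector: scores each candidate metadata value for a review.
selector = torch.nn.Linear(16, 8)            # 8 candidate values
opt = torch.optim.SGD(selector.parameters(), lr=0.01)

review_feats = torch.randn(16)
logits = selector(review_feats)
dist = torch.distributions.Categorical(logits=logits)
action = dist.sample()                       # pick one metadata value

# Reward stands in for the predictor's performance when attending
# to the selected value (in the paper, from the helpfulness predictor).
reward = torch.tensor(0.7)

# REINFORCE: push up the log-probability of actions that led to
# better predictor performance.
loss = -reward * dist.log_prob(action)
opt.zero_grad()
loss.backward()
opt.step()
```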

DrivingStereo: A Large-Scale Dataset for Stereo Matching in Autonomous Driving Scenarios

Title DrivingStereo: A Large-Scale Dataset for Stereo Matching in Autonomous Driving Scenarios
Authors Guorun Yang, Xiao Song, Chaoqin Huang, Zhidong Deng, Jianping Shi, Bolei Zhou
Abstract Great progress has been made on estimating disparity maps from stereo images. However, with the limited stereo data available in the existing datasets and unstable ranging precision of current stereo methods, industry-level stereo matching in autonomous driving remains challenging. In this paper, we construct a novel large-scale stereo dataset named DrivingStereo. It contains over 180k images covering a diverse set of driving scenarios, which is hundreds of times larger than the KITTI Stereo dataset. High-quality labels of disparity are produced by a model-guided filtering strategy from multi-frame LiDAR points. For better evaluations, we present two new metrics for stereo matching in the driving scenes, i.e. a distance-aware metric and a semantic-aware metric. Extensive experiments show that compared with the models trained on FlyingThings3D or Cityscapes, the models trained on our DrivingStereo achieve higher generalization accuracy in real-world driving scenes, while the proposed metrics better evaluate the stereo methods on all-range distances and across different classes. Our dataset and code are available at https://drivingstereo-dataset.github.io.
Tasks Autonomous Driving, Stereo Matching, Stereo Matching Hand
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Yang_DrivingStereo_A_Large-Scale_Dataset_for_Stereo_Matching_in_Autonomous_Driving_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Yang_DrivingStereo_A_Large-Scale_Dataset_for_Stereo_Matching_in_Autonomous_Driving_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/drivingstereo-a-large-scale-dataset-for
Repo
Framework
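
A plausible shape for a distance-aware stereo metric: mean disparity error reported per ground-truth depth bin, so far-range errors are not swamped by the near range. The bin edges and error definition are assumptions, not the paper's exact metric.

```python
import numpy as np

def distance_aware_epe(pred_disp, gt_disp, gt_depth,
                       bins=(0, 20, 40, 60, 80, np.inf)):
    # End-point disparity error averaged within each depth bin (m).
    err = np.abs(pred_disp - gt_disp)
    results = {}
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (gt_depth >= lo) & (gt_depth < hi)
        results[f"{lo}-{hi}m"] = float(err[mask].mean()) if mask.any() else None
    return results

gt_depth = np.random.uniform(1, 100, 10000)
gt_disp = 500.0 / gt_depth  # toy focal-length * baseline product
pred_disp = gt_disp + np.random.normal(0, 0.5, gt_disp.shape)
print(distance_aware_epe(pred_disp, gt_disp, gt_depth))
```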

AdaTransform: Adaptive Data Transformation

Title AdaTransform: Adaptive Data Transformation
Authors Zhiqiang Tang, Xi Peng, Tingfeng Li, Yizhe Zhu, Dimitris N. Metaxas
Abstract Data augmentation is widely used to increase data variance in training deep neural networks. However, previous methods require either comprehensive domain knowledge or high computational cost. Can we learn data transformation automatically and efficiently with limited domain knowledge? Furthermore, can we leverage data transformation to improve not only network training but also network testing? In this work, we propose adaptive data transformation (AdaTransform) to achieve these two goals. AdaTransform can increase data variance in training and decrease data variance in testing. Experiments on different tasks prove that it can improve generalization performance.
Tasks Data Augmentation
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Tang_AdaTransform_Adaptive_Data_Transformation_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Tang_AdaTransform_Adaptive_Data_Transformation_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/adatransform-adaptive-data-transformation
Repo
Framework
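
Loosely, the train/test asymmetry stated above could look like the mode-dependent transform below: sample a variance-increasing transform during training, and apply a variance-reducing (canonicalizing) one at test time. This is only an illustration of the stated behavior, not the learned transformer the paper trains.

```python
import random

def ada_transform(x, augmentations, canonicalize, train):
    # Training: apply a randomly sampled, variance-increasing
    # transform. Testing: apply a variance-reducing transform
    # that maps inputs toward a canonical form.
    if train:
        return random.choice(augmentations)(x)
    return canonicalize(x)

# A toy scalar "sample" keeps the sketch self-contained and runnable.
augmentations = [lambda x: x + random.gauss(0, 1.0), lambda x: x * 1.1]
canonicalize = lambda x: round(x, 1)
print(ada_transform(3.14159, augmentations, canonicalize, train=True))
print(ada_transform(3.14159, augmentations, canonicalize, train=False))
```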

Transformer-based Model for Single Documents Neural Summarization

Title Transformer-based Model for Single Documents Neural Summarization
Authors Elozino Egonmwan, Yllias Chali
Abstract We propose a system that improves performance on the single-document summarization task, using the CNN/DailyMail and Newsroom datasets. It follows the popular encoder-decoder paradigm, but with an extra focus on the encoder. The intuition is that the probability of correctly decoding a piece of information depends significantly on the pattern and correctness of the encoder. Hence we introduce encode–encode–decode: a framework that encodes the source text first with a transformer, then with a sequence-to-sequence (seq2seq) model. We find that the transformer and the seq2seq model complement each other well, making for a richer encoded vector representation. We also find that paying more attention to the vocabulary of target words during abstraction improves performance. We test our hypothesis and framework on the tasks of extractive and abstractive single-document summarization and evaluate using the standard CNN/DailyMail dataset and the recently released Newsroom dataset.
Tasks Document Summarization
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5607/
PDF https://www.aclweb.org/anthology/D19-5607
PWC https://paperswithcode.com/paper/transformer-based-model-for-single-documents
Repo
Framework
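
A PyTorch sketch of the encode–encode layout described above: a Transformer encoder feeding a recurrent (seq2seq-style) encoder whose outputs go to the decoder. The layer sizes and the GRU choice are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class EncodeEncode(nn.Module):
    # First encoder: Transformer over source tokens. Second encoder:
    # a recurrent pass over the Transformer outputs, yielding the
    # representation handed to the decoder.
    def __init__(self, vocab, d_model=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        layer = nn.TransformerEncoderLayer(d_model, 4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, 2)
        self.seq2seq_enc = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, tokens):                 # (batch, seq)
        h = self.transformer(self.embed(tokens))
        outputs, state = self.seq2seq_enc(h)   # richer encoded vectors
        return outputs, state                  # fed to the decoder

enc = EncodeEncode(vocab=32000)
out, state = enc(torch.randint(0, 32000, (2, 40)))
print(out.shape, state.shape)                  # (2, 40, 256) (1, 2, 256)
```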

Subtopic-driven Multi-Document Summarization

Title Subtopic-driven Multi-Document Summarization
Authors Xin Zheng, Aixin Sun, Jing Li, Karthik Muthuswamy
Abstract In multi-document summarization, a set of documents to be summarized is assumed to be on the same topic, known in this paper as the underlying topic. That is, the underlying topic can be collectively represented by all the documents in the set. Meanwhile, different documents may cover various subtopics, and the same subtopic can span several documents. Inspired by topic models, the underlying topic of a document set can also be viewed as a collection of different subtopics of different importance. In this paper, we propose a summarization model called STDS. The model generates the underlying topic representation from both the document view and the subtopic view in parallel. The learning objective is to minimize the distance between the representations learned from the two views. Contextual information is encoded through a hierarchical RNN architecture. Sentence salience is estimated hierarchically, from subtopic salience and relative sentence salience, by considering the contextual information. Top-ranked sentences are then extracted as a summary. Note that the notion of subtopics enables us to bring in additional information (e.g., comments on news articles) that is helpful for document summarization. Experimental results show that the proposed solution outperforms state-of-the-art methods on benchmark datasets.
Tasks Document Summarization, Multi-Document Summarization
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1311/
PDF https://www.aclweb.org/anthology/D19-1311
PWC https://paperswithcode.com/paper/subtopic-driven-multi-document-summarization
Repo
Framework
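
The learning objective above, agreement between document-view and subtopic-view representations of the underlying topic, reduces to minimizing a distance between two pooled vectors. The mean-pooling and MSE distance below are illustrative choices, not necessarily the paper's.

```python
import torch
import torch.nn.functional as F

def two_view_loss(doc_reprs, subtopic_reprs):
    # Underlying-topic vector from the document view vs. the subtopic
    # view; training minimizes the distance between the two.
    topic_from_docs = doc_reprs.mean(dim=0)            # (d,)
    topic_from_subtopics = subtopic_reprs.mean(dim=0)  # (d,)
    return F.mse_loss(topic_from_docs, topic_from_subtopics)

docs = torch.randn(8, 128)       # 8 document encodings
subtopics = torch.randn(5, 128)  # 5 subtopic encodings
print(two_view_loss(docs, subtopics))
```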

BlackMarks: Black-box Multi-bit Watermarking for Deep Neural Networks

Title BlackMarks: Black-box Multi-bit Watermarking for Deep Neural Networks
Authors Huili Chen, Bita Darvish Rouhani, Farinaz Koushanfar
Abstract Deep Neural Networks (DNNs) are increasingly deployed in cloud servers and autonomous agents due to their superior performance. The deployed DNN is either leveraged in a white-box setting (model internals are publicly known) or a black-box setting (only model outputs are known), depending on the application. A practical concern in the rush to adopt DNNs is protecting the models against Intellectual Property (IP) infringement. We propose BlackMarks, the first end-to-end multi-bit watermarking framework that is applicable in the black-box scenario. BlackMarks takes the pre-trained unmarked model and the owner’s binary signature as inputs. The output is the corresponding marked model with specific keys that can later be used to trigger the embedded watermark. To do so, BlackMarks first designs a model-dependent encoding scheme that maps all possible classes in the task to bit ‘0’ or bit ‘1’. Given the owner’s watermark signature (a binary string), a set of key image and label pairs is designed using targeted adversarial attacks. The watermark (WM) is then encoded in the distribution of the DNN’s output activations by fine-tuning the model with a WM-specific regularized loss. To extract the WM, BlackMarks queries the model with the WM key images and decodes the owner’s signature from the corresponding predictions using the designed encoding scheme. We perform a comprehensive evaluation of BlackMarks’ performance on the MNIST, CIFAR-10, and ImageNet datasets and corroborate its effectiveness and robustness. BlackMarks preserves the functionality of the original DNN and incurs negligible WM embedding overhead, as low as 2.054%.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=S1MeM2RcFm
PDF https://openreview.net/pdf?id=S1MeM2RcFm
PWC https://paperswithcode.com/paper/blackmarks-black-box-multi-bit-watermarking
Repo
Framework
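
A sketch of the extraction step described above: query the black-box model with the WM key images, map each predicted class to bit '0' or '1' through the model-dependent encoding, and compare the decoded string with the owner's signature. The encoding table, stand-in model, and matching criterion are placeholders.

```python
import numpy as np

def extract_watermark(predict, key_images, class_to_bit):
    # Query the black-box model on each key image and decode one
    # signature bit per prediction via the class -> bit encoding.
    return "".join(str(class_to_bit[predict(img)]) for img in key_images)

def verify(extracted, signature, max_bit_errors=0):
    errors = sum(a != b for a, b in zip(extracted, signature))
    return errors <= max_bit_errors

# Toy setup: 10 classes split into bit-0 / bit-1 groups; the model
# is a stand-in function, not a real DNN.
class_to_bit = {c: c % 2 for c in range(10)}
key_images = [np.random.rand(8, 8) for _ in range(16)]
predict = lambda img: int(img.sum() * 100) % 10
signature = extract_watermark(predict, key_images, class_to_bit)
print(signature, verify(signature, signature))
```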

Generating Discourse Inferences from Unscoped Episodic Logical Formulas

Title Generating Discourse Inferences from Unscoped Episodic Logical Formulas
Authors Gene Kim, Benjamin Kane, Viet Duong, Muskaan Mendiratta, Graeme McGuire, Sophie Sackstein, Georgiy Platonov, Lenhart Schubert
Abstract Unscoped episodic logical form (ULF) is a semantic representation capturing the predicate-argument structure of English within the episodic logic formalism in relation to the syntactic structure, while leaving scope, word sense, and anaphora unresolved. We describe how ULF can be used to generate natural language inferences that are grounded in the semantic and syntactic structure through a small set of rules defined over interpretable predicates and transformations on ULFs. The semantic restrictions placed by ULF semantic types enable us to ensure that the inferred structures are semantically coherent, while the nearness to syntax enables accurate mapping to English. We demonstrate these inferences on four classes of conversationally oriented inferences, in a mixed-genre dataset, with 68.5% precision from human judgments.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3306/
PDF https://www.aclweb.org/anthology/W19-3306
PWC https://paperswithcode.com/paper/generating-discourse-inferences-from-unscoped
Repo
Framework
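
To make the rule-based flavor concrete, here is a toy pattern-matching transformation over nested-list expressions standing in for ULFs. The rule, the example expression, and all the symbols are invented for illustration; real ULF inference rules operate over episodic-logic predicates with semantic type restrictions.

```python
# Variables (strings starting with "?") bind whole subtrees; a rule
# is a (pattern, template) pair applied by matching then substituting.
def match(pattern, expr, binds):
    if isinstance(pattern, str) and pattern.startswith("?"):
        binds[pattern] = expr
        return True
    if isinstance(pattern, list) and isinstance(expr, list):
        return (len(pattern) == len(expr) and
                all(match(p, e, binds) for p, e in zip(pattern, expr)))
    return pattern == expr

def substitute(template, binds):
    if isinstance(template, str):
        return binds.get(template, template)
    return [substitute(t, binds) for t in template]

# Invented rule: from a yes/no question "does X P?", infer that the
# asker does not know whether (X P) holds.
pattern  = ["?x", ["pres", "do.aux-s"], "?p"]
template = ["asker", ["pres", "not_know_whether.v"], ["?x", "?p"]]

expr = [["the.d", "dog.n"], ["pres", "do.aux-s"], "bark.v"]
binds = {}
if match(pattern, expr, binds):
    print(substitute(template, binds))
```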

Machine Comprehension Improves Domain-Specific Japanese Predicate-Argument Structure Analysis

Title Machine Comprehension Improves Domain-Specific Japanese Predicate-Argument Structure Analysis
Authors Norio Takahashi, Tomohide Shibata, Daisuke Kawahara, Sadao Kurohashi
Abstract To improve the accuracy of predicate-argument structure (PAS) analysis, large-scale training data and knowledge for PAS analysis are indispensable. We focus on a specific domain, namely Japanese blogs about driving, and construct two wide-coverage datasets in QA form using crowdsourcing: a PAS-QA dataset and a reading comprehension QA (RC-QA) dataset. We train a machine comprehension (MC) model based on these datasets to perform PAS analysis. Our experiments show that a stepwise training method is the most effective: pre-train an MC model on the RC-QA dataset to acquire domain knowledge, and then fine-tune it on the PAS-QA dataset.
Tasks Reading Comprehension
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5814/
PDF https://www.aclweb.org/anthology/D19-5814
PWC https://paperswithcode.com/paper/machine-comprehension-improves-domain
Repo
Framework
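
The stepwise method above is sequential fine-tuning: one pass over the RC-QA data for domain knowledge, then a pass over the PAS-QA data for the target task. A generic sketch follows; the loop structure, optimizer, and learning rates are placeholders, not the paper's settings.

```python
import torch

def train_on(model, dataset, epochs, lr):
    # One fine-tuning pass over a dataset of (inputs, labels) batches.
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for inputs, labels in dataset:
            opt.zero_grad()
            loss_fn(model(inputs), labels).backward()
            opt.step()

# Stage 1: acquire domain knowledge; Stage 2: target PAS-QA task.
# train_on(mc_model, rc_qa_batches, epochs=2, lr=3e-5)
# train_on(mc_model, pas_qa_batches, epochs=2, lr=1e-5)
```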