February 1, 2020

Paper Group AWR 355

Adapting Multilingual Neural Machine Translation to Unseen Languages

Title Adapting Multilingual Neural Machine Translation to Unseen Languages
Authors Surafel M. Lakew, Alina Karakanta, Marcello Federico, Matteo Negri, Marco Turchi
Abstract Multilingual Neural Machine Translation (MNMT) for low-resource languages (LRL) can be enhanced by the presence of related high-resource languages (HRL), but the relatedness of HRL usually relies on predefined linguistic assumptions about language similarity. Recently, adapting MNMT to an LRL has been shown to greatly improve performance. In this work, we explore the problem of adapting an MNMT model to an unseen LRL using data selection and model adaptation. In order to improve NMT for LRL, we employ perplexity to select HRL data that are most similar to the LRL on the basis of language distance. We extensively explore data selection in popular multilingual NMT settings, namely in (zero-shot) translation, and in adaptation from a multilingual pre-trained model, for both directions (LRL-en). We further show that dynamic adaptation of the model’s vocabulary results in a more favourable segmentation for the LRL in comparison with direct adaptation. Experiments show reductions in training time and significant performance gains over LRL baselines, even with zero LRL data (+13.0 BLEU), up to +17.0 BLEU for pre-trained multilingual model dynamic adaptation with related data selection. Our method outperforms current approaches, such as massively multilingual models and data augmentation, on four LRL.
Tasks Data Augmentation, Machine Translation
Published 2019-10-30
URL https://arxiv.org/abs/1910.13998v1
PDF https://arxiv.org/pdf/1910.13998v1.pdf
PWC https://paperswithcode.com/paper/adapting-multilingual-neural-machine
Repo https://github.com/surafelml/adapt-mnmt
Framework none
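
The perplexity-based data selection described in the abstract can be illustrated with a minimal sketch. It assumes an add-one smoothed unigram model trained on a small LRL sample as a stand-in for whatever language model the authors use; the function names (`train_unigram`, `perplexity`, `select_similar`) are hypothetical and not taken from the linked repository.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count whitespace-separated tokens; returns (counts, total token count)."""
    counts = Counter(tok for s in sentences for tok in s.split())
    return counts, sum(counts.values())


def perplexity(sentence, counts, total, vocab_size):
    """Per-token perplexity under an add-one smoothed unigram model."""
    tokens = sentence.split()
    if not tokens:
        return float("inf")
    log_prob = sum(
        math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens
    )
    return math.exp(-log_prob / len(tokens))


def select_similar(lrl_sentences, hrl_sentences, k):
    """Return the k HRL sentences with the lowest perplexity under the LRL model."""
    counts, total = train_unigram(lrl_sentences)
    # Shared vocabulary over both sides, so unseen HRL tokens get smoothed mass.
    vocab = set(counts) | {t for s in hrl_sentences for t in s.split()}
    ranked = sorted(
        hrl_sentences,
        key=lambda s: perplexity(s, counts, total, len(vocab)),
    )
    return ranked[:k]
```

Lower perplexity means the HRL sentence looks more like the LRL sample, so the lowest-scoring sentences are kept. A real setup would likely use a stronger language model over subword units.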

Adversarial Policy Gradient for Deep Learning Image Augmentation

Title Adversarial Policy Gradient for Deep Learning Image Augmentation
Authors Kaiyang Cheng, Claudia Iriondo, Francesco Calivá, Justin Krogue, Sharmila Majumdar, Valentina Pedoia
Abstract The use of semantic segmentation for masking and cropping input images has proven to be a significant aid in medical imaging classification tasks by decreasing the noise and variance of the training dataset. However, implementing this approach with classical methods is challenging: the cost of obtaining a dense segmentation is high, and the precise input area that is most crucial to the classification task is difficult to determine a-priori. We propose a novel joint-training deep reinforcement learning framework for image augmentation. A segmentation network, weakly supervised with policy gradient optimization, acts as an agent, and outputs masks as actions given samples as states, with the goal of maximizing reward signals from the classification network. In this way, the segmentation network learns to mask unimportant imaging features. Our method, Adversarial Policy Gradient Augmentation (APGA), shows promising results on Stanford’s MURA dataset and on a hip fracture classification task with an increase in global accuracy of up to 7.33% and improved performance over baseline methods in 9/10 tasks evaluated. We discuss the broad applicability of our joint training strategy to a variety of medical imaging tasks.
Tasks Image Augmentation, Semantic Segmentation
Published 2019-09-09
URL https://arxiv.org/abs/1909.04108v1
PDF https://arxiv.org/pdf/1909.04108v1.pdf
PWC https://paperswithcode.com/paper/adversarial-policy-gradient-for-deep-learning
Repo https://github.com/victorychain/Adversarial-Policy-Gradient-Augmentation
Framework pytorch

TalkSumm: A Dataset and Scalable Annotation Method for Scientific Paper Summarization Based on Conference Talks

Title TalkSumm: A Dataset and Scalable Annotation Method for Scientific Paper Summarization Based on Conference Talks
Authors Guy Lev, Michal Shmueli-Scheuer, Jonathan Herzig, Achiya Jerbi, David Konopnicki
Abstract Currently, no large-scale training data is available for the task of scientific paper summarization. In this paper, we propose a novel method that automatically generates summaries for scientific papers, by utilizing videos of talks at scientific conferences. We hypothesize that such talks constitute a coherent and concise description of the papers’ content, and can form the basis for good summaries. We collected 1716 papers and their corresponding videos, and created a dataset of paper summaries. A model trained on this dataset achieves performance similar to that of models trained on a dataset of manually created summaries. In addition, we validated the quality of our summaries with human experts.
Tasks
Published 2019-06-04
URL https://arxiv.org/abs/1906.01351v2
PDF https://arxiv.org/pdf/1906.01351v2.pdf
PWC https://paperswithcode.com/paper/talksumm-a-dataset-and-scalable-annotation
Repo https://github.com/levguy/talksumm
Framework none

Story Ending Prediction by Transferable BERT

Title Story Ending Prediction by Transferable BERT
Authors Zhongyang Li, Xiao Ding, Ting Liu
Abstract Recent advances, such as GPT and BERT, have shown success in incorporating a pre-trained transformer language model and fine-tuning operation to improve downstream NLP systems. However, this framework still has some fundamental problems in effectively incorporating supervised knowledge from other related tasks. In this study, we investigate a transferable BERT (TransBERT) training framework, which can transfer not only general language knowledge from large-scale unlabeled data but also specific kinds of knowledge from various semantically related supervised tasks, for a target task. In particular, we propose utilizing three kinds of transfer tasks, including natural language inference, sentiment classification, and next action prediction, to further train BERT based on a pre-trained model. This enables the model to get a better initialization for the target task. We take story ending prediction as the target task to conduct experiments. The final result, an accuracy of 91.8%, dramatically outperforms previous state-of-the-art baseline methods. Several comparative experiments give some helpful suggestions on how to select transfer tasks. Error analysis shows the strengths and weaknesses of BERT-based models for story ending prediction.
Tasks Language Modelling, Natural Language Inference, Sentiment Analysis
Published 2019-05-17
URL https://arxiv.org/abs/1905.07504v2
PDF https://arxiv.org/pdf/1905.07504v2.pdf
PWC https://paperswithcode.com/paper/story-ending-prediction-by-transferable-bert
Repo https://github.com/eecrazy/TransBERT_ijcai2019
Framework pytorch

Learning Optimal Data Augmentation Policies via Bayesian Optimization for Image Classification Tasks

Title Learning Optimal Data Augmentation Policies via Bayesian Optimization for Image Classification Tasks
Authors Chunxu Zhang, Jiaxu Cui, Bo Yang
Abstract In recent years, deep learning has achieved remarkable results in many fields, including computer vision, natural language processing, speech recognition and others. Adequate training data is the key to ensuring the effectiveness of deep models. However, obtaining valid data requires a lot of time and labor. Data augmentation (DA) is an effective alternative approach, which can generate new labeled data based on existing data using label-preserving transformations. Although we can benefit a lot from DA, designing appropriate DA policies requires considerable expert experience and time, and evaluating candidate policies while searching for the optimal ones is costly. So we raise a new question in this paper: how to achieve automated data augmentation at as low cost as possible? We propose a method named BO-Aug for automating the process by finding the optimal DA policies using the Bayesian optimization approach. Our method can find the optimal policies at a relatively low search cost, and the searched policies based on a specific dataset are transferable across different neural network architectures or even different datasets. We validate the BO-Aug on three widely used image classification datasets, including CIFAR-10, CIFAR-100 and SVHN. Experimental results show that the proposed method can achieve state-of-the-art or near advanced classification accuracy. Code to reproduce our experiments is available at https://github.com/zhangxiaozao/BO-Aug.
Tasks Data Augmentation, Image Augmentation, Image Classification, Speech Recognition
Published 2019-05-06
URL https://arxiv.org/abs/1905.02610v2
PDF https://arxiv.org/pdf/1905.02610v2.pdf
PWC https://paperswithcode.com/paper/learning-optimal-data-augmentation-policies
Repo https://github.com/zhangxiaozao/BO-Aug
Framework tf

Improved Image Augmentation for Convolutional Neural Networks by Copyout and CopyPairing

Title Improved Image Augmentation for Convolutional Neural Networks by Copyout and CopyPairing
Authors Philip May
Abstract Image augmentation is a widely used technique to improve the performance of convolutional neural networks (CNNs). Commonly, image shifting, cropping, flipping, shearing and rotating are used for augmentation, but there are more advanced techniques such as Cutout and SamplePairing. In this work we present two improvements of the state-of-the-art Cutout and SamplePairing techniques. Our new method, called Copyout, takes a square patch of another random training image and copies it onto a random location of each image used for training. The second technique we discovered is called CopyPairing. It combines Copyout and SamplePairing for further augmentation and even better performance. We run different experiments with these augmentation techniques on the CIFAR-10 dataset to evaluate and compare them under different configurations. In our experiments we show that Copyout reduces the test error rate by 8.18% compared with Cutout and 4.27% compared with SamplePairing. CopyPairing reduces the test error rate by 11.97% compared with Cutout and 8.21% compared with SamplePairing. Copyout and CopyPairing implementations are available at https://github.com/t-systems-on-site-services-gmbh/coocop.
Tasks Image Augmentation
Published 2019-09-01
URL https://arxiv.org/abs/1909.00390v2
PDF https://arxiv.org/pdf/1909.00390v2.pdf
PWC https://paperswithcode.com/paper/improved-image-augmentation-for-convolutional
Repo https://github.com/t-systems-on-site-services-gmbh/coocop
Framework none
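
The Copyout step described in the abstract (copy a square patch from another random training image onto a random location) can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the name `copyout`, the NumPy array layout (height, width, channels), and the choice to keep the patch fully inside the image borders are all assumed here.

```python
import numpy as np


def copyout(image, donor, patch_size, rng):
    """Paste a square patch cut from `donor` onto a random location of `image`.

    Assumes both arrays have the same (H, W, ...) shape and that
    patch_size <= min(H, W). The input image is not modified.
    """
    h, w = image.shape[:2]
    # Random top-left corner of the source patch in the donor image.
    sy = rng.integers(0, h - patch_size + 1)
    sx = rng.integers(0, w - patch_size + 1)
    # Random top-left corner of the destination in the target image.
    ty = rng.integers(0, h - patch_size + 1)
    tx = rng.integers(0, w - patch_size + 1)
    out = image.copy()
    out[ty:ty + patch_size, tx:tx + patch_size] = (
        donor[sy:sy + patch_size, sx:sx + patch_size]
    )
    return out
```

In a training loop, `donor` would be another randomly drawn training image and `rng` a generator such as `np.random.default_rng()`. Unlike Cutout, the masked region is filled with real image content rather than a constant.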

Augmented Memory for Correlation Filters in Real-Time UAV Tracking

Title Augmented Memory for Correlation Filters in Real-Time UAV Tracking
Authors Yiming Li, Changhong Fu, Fangqiang Ding, Ziyuan Huang, Jia Pan
Abstract The outstanding computational efficiency of the discriminative correlation filter (DCF) fades away with various complicated improvements. Previous appearances are also gradually forgotten due to the exponential decay of historical views in the traditional appearance updating scheme of the DCF framework, reducing the model’s robustness. In this work, a novel tracker based on the DCF framework is proposed to augment the memory of previously appeared views while running at real-time speed. Several historical views and the current view are simultaneously introduced in training to allow the tracker to adapt to new appearances as well as memorize previous ones. A novel rapid compressed context learning is proposed to increase the discriminative ability of the filter efficiently. Substantial experiments on the UAVDT and UAV123 datasets have validated that the proposed tracker performs competitively against 26 other top DCF and deep-based trackers, with over 40 FPS on CPU.
Tasks
Published 2019-09-24
URL https://arxiv.org/abs/1909.10989v1
PDF https://arxiv.org/pdf/1909.10989v1.pdf
PWC https://paperswithcode.com/paper/augmented-memory-for-correlation-filters-in
Repo https://github.com/vision4robotics/AMCF-tracker
Framework none
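
The "exponential decay of historical views" mentioned in the abstract can be made concrete with the running-average appearance update commonly used in DCF trackers. The abstract does not state the formula, so the update rule and the symbols below (learning rate $\eta$, per-frame appearance $x_k$, model $\hat{x}_t$) are assumptions for illustration.

```latex
\hat{x}_t = (1-\eta)\,\hat{x}_{t-1} + \eta\,x_t, \qquad \hat{x}_1 = x_1
\quad\Longrightarrow\quad
\hat{x}_t = (1-\eta)^{t-1}\,x_1 + \eta \sum_{k=2}^{t} (1-\eta)^{t-k}\,x_k
```

Under this assumed update, the view from frame $k$ contributes with weight proportional to $(1-\eta)^{t-k}$, which shrinks geometrically as the frame recedes into the past.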

EDVR: Video Restoration with Enhanced Deformable Convolutional Networks

Title EDVR: Video Restoration with Enhanced Deformable Convolutional Networks
Authors Xintao Wang, Kelvin C. K. Chan, Ke Yu, Chao Dong, Chen Change Loy
Abstract Video restoration tasks, including super-resolution, deblurring, etc., are drawing increasing attention in the computer vision community. A challenging benchmark named REDS was released in the NTIRE19 Challenge. This new benchmark challenges existing methods from two aspects: (1) how to align multiple frames given large motions, and (2) how to effectively fuse different frames with diverse motion and blur. In this work, we propose a novel Video Restoration framework with Enhanced Deformable networks, termed EDVR, to address these challenges. First, to handle large motions, we devise a Pyramid, Cascading and Deformable (PCD) alignment module, in which frame alignment is done at the feature level using deformable convolutions in a coarse-to-fine manner. Second, we propose a Temporal and Spatial Attention (TSA) fusion module, in which attention is applied both temporally and spatially, so as to emphasize important features for subsequent restoration. Thanks to these modules, our EDVR takes first place and outperforms the runner-up by a large margin in all four tracks of the NTIRE19 video restoration and enhancement challenges. EDVR also demonstrates superior performance to state-of-the-art published methods on video super-resolution and deblurring. The code is available at https://github.com/xinntao/EDVR.
Tasks Deblurring, Super-Resolution, Video Super-Resolution
Published 2019-05-07
URL https://arxiv.org/abs/1905.02716v1
PDF https://arxiv.org/pdf/1905.02716v1.pdf
PWC https://paperswithcode.com/paper/edvr-video-restoration-with-enhanced
Repo https://github.com/xinntao/EDVR
Framework pytorch

How to Evaluate Word Representations of Informal Domain?

Title How to Evaluate Word Representations of Informal Domain?
Authors Yekun Chai, Naomi Saphra, Adam Lopez
Abstract Diverse word representations have surged in most state-of-the-art natural language processing (NLP) applications. Nevertheless, how to efficiently evaluate such word embeddings in informal domains such as Twitter or forums remains an ongoing challenge due to the lack of sufficient evaluation datasets. We derived a large list of variant spelling pairs from UrbanDictionary with the automatic approaches of weakly-supervised pattern-based bootstrapping and a self-training linear-chain conditional random field (CRF). With these extracted relation pairs, we improve the chances of skipping the text normalization step of traditional NLP pipelines and directly adopting representations of non-standard words in the informal domain. Our code is available.
Tasks Word Embeddings
Published 2019-11-12
URL https://arxiv.org/abs/1911.04669v2
PDF https://arxiv.org/pdf/1911.04669v2.pdf
PWC https://paperswithcode.com/paper/how-to-evaluate-word-representations-of
Repo https://github.com/cyk1337/UrbanDict
Framework none

KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning

Title KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning
Authors Bill Yuchen Lin, Xinyue Chen, Jamin Chen, Xiang Ren
Abstract Commonsense reasoning aims to empower machines with the human ability to make presumptions about ordinary situations in our daily life. In this paper, we propose a textual inference framework for answering commonsense questions, which effectively utilizes external, structured commonsense knowledge graphs to perform explainable inferences. The framework first grounds a question-answer pair from the semantic space to the knowledge-based symbolic space as a schema graph, a related sub-graph of external knowledge graphs. It represents schema graphs with a novel knowledge-aware graph network module named KagNet, and finally scores answers with graph representations. Our model is based on graph convolutional networks and LSTMs, with a hierarchical path-based attention mechanism. The intermediate attention scores make it transparent and interpretable, which thus produce trustworthy inferences. Using ConceptNet as the only external resource for Bert-based models, we achieved state-of-the-art performance on the CommonsenseQA, a large-scale dataset for commonsense reasoning.
Tasks Common Sense Reasoning, Knowledge Base Question Answering, Knowledge Graphs, Natural Language Inference
Published 2019-09-04
URL https://arxiv.org/abs/1909.02151v1
PDF https://arxiv.org/pdf/1909.02151v1.pdf
PWC https://paperswithcode.com/paper/kagnet-knowledge-aware-graph-networks-for
Repo https://github.com/INK-USC/KagNet
Framework pytorch

Mining Objects: Fully Unsupervised Object Discovery and Localization From a Single Image

Title Mining Objects: Fully Unsupervised Object Discovery and Localization From a Single Image
Authors Runsheng Zhang, Yaping Huang, Mengyang Pu, Jian Zhang, Qingji Guan, Qi Zou, Haibin Ling
Abstract The goal of our work is to discover dominant objects without using any annotations. We focus on performing unsupervised object discovery and localization in a very general setting where only a single image is given. This is far more challenging than typical co-localization or weakly-supervised localization tasks. To tackle this problem, we propose a simple but effective pattern mining-based method, called Object Location Mining (OLM), which exploits the advantages of data mining and feature representation of pre-trained convolutional neural networks (CNNs). Specifically, we first convert the feature maps from a pre-trained CNN model into a set of transactions, and then discover frequent patterns from the transaction database through pattern mining techniques. We observe that those discovered patterns, i.e., co-occurrence highlighted regions, typically hold appearance and spatial consistency. Motivated by this observation, we can easily discover and localize possible objects by merging relevant meaningful patterns in an unsupervised manner. Extensive experiments on eleven benchmarks demonstrate that OLM achieves competitive localization performance compared with the state-of-the-art methods. We also evaluate our approach against unsupervised saliency detection methods and achieve the best results on four benchmark datasets. Moreover, we conduct experiments on fine-grained classification to show that our proposed method can locate the entire object and parts accurately, which can significantly improve the classification results.
Tasks Saliency Detection
Published 2019-02-26
URL https://arxiv.org/abs/1902.09968v2
PDF https://arxiv.org/pdf/1902.09968v2.pdf
PWC https://paperswithcode.com/paper/mining-objects-fully-unsupervised-object
Repo https://github.com/anandhupvr/Mining-Objects
Framework tf

End-to-end Learning for GMI Optimized Geometric Constellation Shape

Title End-to-end Learning for GMI Optimized Geometric Constellation Shape
Authors Rasmus T. Jones, Metodi P. Yankov, Darko Zibar
Abstract Autoencoder-based geometric shaping is proposed that includes optimizing bit mappings. Up to 0.2 bits/QAM symbol gain in GMI is achieved for a variety of data rates and in the presence of transceiver impairments. The gains can be harvested with standard binary FEC at no cost w.r.t. conventional BICM.
Tasks
Published 2019-07-19
URL https://arxiv.org/abs/1907.08535v1
PDF https://arxiv.org/pdf/1907.08535v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-learning-for-gmi-optimized
Repo https://github.com/Rassibassi/claude
Framework tf

A Discrete Hard EM Approach for Weakly Supervised Question Answering

Title A Discrete Hard EM Approach for Weakly Supervised Question Answering
Authors Sewon Min, Danqi Chen, Hannaneh Hajishirzi, Luke Zettlemoyer
Abstract Many question answering (QA) tasks only provide weak supervision for how the answer should be computed. For example, TriviaQA answers are entities that can be mentioned multiple times in supporting documents, while DROP answers can be computed by deriving many different equations from numbers in the reference text. In this paper, we show it is possible to convert such tasks into discrete latent variable learning problems with a precomputed, task-specific set of possible “solutions” (e.g. different mentions or equations) that contains one correct option. We then develop a hard EM learning scheme that computes gradients relative to the most likely solution at each update. Despite its simplicity, we show that this approach significantly outperforms previous methods on six QA tasks, including absolute gains of 2–10%, and achieves the state-of-the-art on five of them. Using hard updates instead of maximizing marginal likelihood is key to these results as it encourages the model to find the one correct answer, which we show through detailed qualitative analysis.
Tasks Question Answering
Published 2019-09-11
URL https://arxiv.org/abs/1909.04849v1
PDF https://arxiv.org/pdf/1909.04849v1.pdf
PWC https://paperswithcode.com/paper/a-discrete-hard-em-approach-for-weakly
Repo https://github.com/shmsw25/qa-hard-em
Framework pytorch
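
The contrast the abstract draws between hard updates and maximizing marginal likelihood can be sketched as two loss functions over the log-probabilities a model assigns to the precomputed candidate solutions. This is a minimal sketch under assumptions: the names `mml_loss` and `hard_em_loss` are hypothetical, and a real implementation would compute these on tensors so that gradients flow through the chosen term.

```python
import math


def mml_loss(log_probs):
    """Negative log of the summed probability over all candidate solutions.

    Uses the log-sum-exp trick for numerical stability.
    """
    m = max(log_probs)
    return -(m + math.log(sum(math.exp(lp - m) for lp in log_probs)))


def hard_em_loss(log_probs):
    """Negative log-probability of the single most likely candidate solution."""
    return -max(log_probs)
```

With the hard variant, only the currently most likely candidate receives learning signal at each update, whereas the marginal-likelihood variant spreads it over every candidate in the set.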

Simulating Emergent Properties of Human Driving Behavior Using Multi-Agent Reward Augmented Imitation Learning

Title Simulating Emergent Properties of Human Driving Behavior Using Multi-Agent Reward Augmented Imitation Learning
Authors Raunak P. Bhattacharyya, Derek J. Phillips, Changliu Liu, Jayesh K. Gupta, Katherine Driggs-Campbell, Mykel J. Kochenderfer
Abstract Recent developments in multi-agent imitation learning have shown promising results for modeling the behavior of human drivers. However, it is challenging to capture emergent traffic behaviors that are observed in real-world datasets. Such behaviors arise due to the many local interactions between agents that are not commonly accounted for in imitation learning. This paper proposes Reward Augmented Imitation Learning (RAIL), which integrates reward augmentation into the multi-agent imitation learning framework and allows the designer to specify prior knowledge in a principled fashion. We prove that convergence guarantees for the imitation learning process are preserved under the application of reward augmentation. This method is validated in a driving scenario, where an entire traffic scene is controlled by driving policies learned using our proposed algorithm. Further, we demonstrate improved performance in comparison to traditional imitation learning algorithms both in terms of the local actions of a single agent and the behavior of emergent properties in complex, multi-agent settings.
Tasks Imitation Learning
Published 2019-03-14
URL http://arxiv.org/abs/1903.05766v1
PDF http://arxiv.org/pdf/1903.05766v1.pdf
PWC https://paperswithcode.com/paper/simulating-emergent-properties-of-human
Repo https://github.com/sisl/ngsim_env
Framework tf

Personalization and Optimization of Decision Parameters via Heterogenous Causal Effects

Title Personalization and Optimization of Decision Parameters via Heterogenous Causal Effects
Authors Ye Tu, Kinjal Basu, Jinyun Yan, Birjodh Tiwana, Shaunak Chatterjee
Abstract Randomized experimentation (also known as A/B testing or bucket testing) is very commonly used in the internet industry to measure the effect of a new treatment. Often, the decision on the basis of such A/B testing is to ramp the treatment variant that did best for the entire population. However, the effect of any given treatment varies across experimental units, and choosing a single variant to ramp to the whole population can be quite suboptimal. In this work, we propose a method which automatically identifies the collection of cohorts exhibiting heterogeneous treatment effect (using causal trees). We then use stochastic optimization to identify the optimal treatment variant in each cohort. We use two real-life examples - one related to serving notifications and the other related to modulating ads density on feed. In both examples, using offline simulation and online experimentation, we demonstrate the benefits of our approach. At the time of writing this paper, the method described has been deployed on the LinkedIn Ads and Notifications system.
Tasks Stochastic Optimization
Published 2019-01-29
URL http://arxiv.org/abs/1901.10550v2
PDF http://arxiv.org/pdf/1901.10550v2.pdf
PWC https://paperswithcode.com/paper/personalization-and-optimization-of-decision
Repo https://github.com/tuye0305/kdd2019prophet
Framework none
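
The abstract's point that ramping one variant to the whole population can be suboptimal is easy to see in a toy calculation. The sketch below assumes per-cohort effect estimates are already available (the paper obtains cohorts via causal trees) and replaces the paper's stochastic optimization with a plain per-cohort argmax; all function names and the data layout are hypothetical.

```python
def best_global_variant(effects, sizes):
    """Pick the single variant with the largest size-weighted total effect.

    effects: {cohort: {variant: estimated effect per unit}}
    sizes:   {cohort: number of units in the cohort}
    """
    variants = next(iter(effects.values())).keys()
    return max(
        variants,
        key=lambda v: sum(effects[c][v] * sizes[c] for c in effects),
    )


def best_variant_per_cohort(effects):
    """Pick, independently for each cohort, the variant with the largest effect."""
    return {c: max(ev, key=ev.get) for c, ev in effects.items()}


def total_effect(effects, sizes, assignment):
    """Size-weighted total effect of a {cohort: variant} assignment."""
    return sum(effects[c][assignment[c]] * sizes[c] for c in effects)
```

Comparing `total_effect` for the per-cohort assignment against the assignment that gives every cohort the global winner shows how much is lost by a one-size-fits-all ramp. The per-cohort total can never be lower in this unconstrained setting.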