January 31, 2020

2514 words 12 mins read

Paper Group AWR 416


Kernel Graph Attention Network for Fact Verification

Title Kernel Graph Attention Network for Fact Verification
Authors Zhenghao Liu, Chenyan Xiong, Maosong Sun
Abstract This paper presents Kernel Graph Attention Network (KGAT), which conducts more fine-grained evidence selection and reasoning for the fact verification task. Given a claim and a set of potential supporting evidence sentences, KGAT constructs a graph attention network using the evidence sentences as its nodes and learns to verify the claim integrity using its edge kernels and node kernels, where the edge kernels learn to propagate information across the evidence graph, and the node kernels learn to merge node level information to the graph level. KGAT reaches a comparable performance (69.4%) on FEVER, a large-scale benchmark for fact verification. Our experiments find that KGAT thrives on verification scenarios where multiple evidence pieces are required. This advantage mainly comes from the sparse and fine-grained attention mechanisms from our kernel technique.
Tasks
Published 2019-10-22
URL https://arxiv.org/abs/1910.09796v2
PDF https://arxiv.org/pdf/1910.09796v2.pdf
PWC https://paperswithcode.com/paper/kernel-graph-attention-network-for-fact
Repo https://github.com/thunlp/KernelGAT
Framework pytorch
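
Below is a minimal sketch of Gaussian kernel pooling over claim-evidence token similarities, in the spirit of KGAT's node kernels; the kernel centres, widths, and shapes are illustrative assumptions, not the paper's configuration (see the linked repo for the actual implementation).

```python
# Gaussian kernel pooling over a claim-evidence similarity matrix (illustrative sketch).
import torch

def kernel_pooling(sim, mus, sigma=0.1):
    """sim: [n_claim_tokens, n_evidence_tokens] cosine similarities.
    Returns a [n_kernels] soft-match feature vector."""
    # RBF kernel responses: [n_claim, n_evidence, n_kernels]
    k = torch.exp(-((sim.unsqueeze(-1) - mus) ** 2) / (2 * sigma ** 2))
    # Sum matches over evidence tokens, take log, then sum over claim tokens.
    return torch.log(k.sum(dim=1).clamp(min=1e-10)).sum(dim=0)

mus = torch.linspace(-0.9, 1.0, steps=11)   # kernel centres spread over [-1, 1]
sim = torch.rand(8, 20) * 2 - 1             # fake claim-evidence similarities
features = kernel_pooling(sim, mus)         # would feed a linear scorer in practice
print(features.shape)                       # torch.Size([11])
```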

Learning Individual Styles of Conversational Gesture

Title Learning Individual Styles of Conversational Gesture
Authors Shiry Ginosar, Amir Bar, Gefen Kohavi, Caroline Chan, Andrew Owens, Jitendra Malik
Abstract Human speech is often accompanied by hand and arm gestures. Given audio speech input, we generate plausible gestures to go along with the sound. Specifically, we perform cross-modal translation from “in-the-wild” monologue speech of a single speaker to their hand and arm motion. We train on unlabeled videos for which we only have noisy pseudo ground truth from an automatic pose detection system. Our proposed model significantly outperforms baseline methods in a quantitative comparison. To support research toward obtaining a computational understanding of the relationship between gesture and speech, we release a large video dataset of person-specific gestures. The project website with video, code and data can be found at http://people.eecs.berkeley.edu/~shiry/speech2gesture .
Tasks Speech-to-Gesture Translation
Published 2019-06-10
URL https://arxiv.org/abs/1906.04160v1
PDF https://arxiv.org/pdf/1906.04160v1.pdf
PWC https://paperswithcode.com/paper/learning-individual-styles-of-conversational-1
Repo https://github.com/amirbar/speech2gesture
Framework none
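
A toy audio-to-pose regressor illustrating the speech-to-gesture setup: a temporal convolutional stack maps log-mel audio features to 2D keypoint sequences and is trained with an L1 loss against noisy pose pseudo-labels. The layer sizes and keypoint count are assumptions for illustration; the authors' model lives in the linked repo.

```python
import torch
import torch.nn as nn

class AudioToPose(nn.Module):
    def __init__(self, n_mels=64, n_keypoints=49):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_mels, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(256, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(256, n_keypoints * 2, kernel_size=1),  # (x, y) per keypoint
        )

    def forward(self, mel):            # mel: [batch, n_mels, frames]
        out = self.net(mel)            # [batch, n_keypoints*2, frames]
        return out.transpose(1, 2)     # [batch, frames, n_keypoints*2]

model = AudioToPose()
mel = torch.randn(2, 64, 100)
pred = model(mel)
target = torch.randn(2, 100, 98)       # pseudo ground truth from a pose detector
loss = nn.functional.l1_loss(pred, target)
loss.backward()
```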

Improving Adversarial Robustness via Guided Complement Entropy

Title Improving Adversarial Robustness via Guided Complement Entropy
Authors Hao-Yun Chen, Jhao-Hong Liang, Shih-Chieh Chang, Jia-Yu Pan, Yu-Ting Chen, Wei Wei, Da-Cheng Juan
Abstract Adversarial robustness has emerged as an important topic in deep learning as carefully crafted attack samples can significantly disturb the performance of a model. Many recent methods have proposed to improve adversarial robustness by utilizing adversarial training or model distillation, which adds additional procedures to model training. In this paper, we propose a new training paradigm called Guided Complement Entropy (GCE) that is capable of achieving “adversarial defense for free,” which involves no additional procedures in the process of improving adversarial robustness. In addition to maximizing model probabilities on the ground-truth class like cross-entropy, we neutralize its probabilities on the incorrect classes, with a “guided” term that balances these two objectives. We show in the experiments that our method achieves better model robustness with even better performance compared to the commonly used cross-entropy training objective. We also show that our method can be used orthogonally to adversarial training across well-known methods, with noticeable robustness gains. To the best of our knowledge, our approach is the first one that improves model robustness without compromising performance.
Tasks Adversarial Defense
Published 2019-03-23
URL https://arxiv.org/abs/1903.09799v3
PDF https://arxiv.org/pdf/1903.09799v3.pdf
PWC https://paperswithcode.com/paper/improving-adversarial-robustness-via-guided
Repo https://github.com/Line290/FeatureAttack
Framework pytorch
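
A hedged sketch of a guided complement-entropy style loss term as described above: it flattens the predicted distribution over the incorrect classes, weighted ("guided") by the ground-truth probability raised to a power alpha. The exact normalization, sign convention, and alpha schedule should be taken from the paper and linked repo; this is an illustrative reading, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def guided_complement_entropy(logits, targets, alpha=0.2, eps=1e-10):
    probs = F.softmax(logits, dim=1)                        # [batch, classes]
    pg = probs.gather(1, targets.unsqueeze(1))              # probability of the true class
    mask = torch.ones_like(probs).scatter_(1, targets.unsqueeze(1), 0.0)
    comp = probs * mask                                     # zero out the true class
    comp = comp / (1.0 - pg + eps)                          # renormalize over wrong classes
    entropy = -(comp * torch.log(comp + eps)).sum(dim=1)    # entropy of the complement dist.
    # Maximizing the guided entropy == minimizing its negative.
    return -(pg.squeeze(1) ** alpha * entropy).mean()

logits = torch.randn(4, 10, requires_grad=True)
targets = torch.randint(0, 10, (4,))
loss = guided_complement_entropy(logits, targets)
loss.backward()
```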

Simple Applications of BERT for Ad Hoc Document Retrieval

Title Simple Applications of BERT for Ad Hoc Document Retrieval
Authors Wei Yang, Haotian Zhang, Jimmy Lin
Abstract Following recent successes in applying BERT to question answering, we explore simple applications to ad hoc document retrieval. This required confronting the challenge posed by documents that are typically longer than the length of input BERT was designed to handle. We address this issue by applying inference on sentences individually, and then aggregating sentence scores to produce document scores. Experiments on TREC microblog and newswire test collections show that our approach is simple yet effective, as we report the highest average precision on these datasets by neural approaches that we are aware of.
Tasks Ad-Hoc Information Retrieval, Question Answering
Published 2019-03-26
URL http://arxiv.org/abs/1903.10972v1
PDF http://arxiv.org/pdf/1903.10972v1.pdf
PWC https://paperswithcode.com/paper/simple-applications-of-bert-for-ad-hoc
Repo https://github.com/castorini/birch
Framework pytorch
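
A minimal sketch of the sentence-level aggregation idea: score each sentence of a long document independently with a relevance model, then combine the top sentence scores, optionally interpolated with the first-stage retrieval score, into a document score. The `sentence_scorer` below is a stub standing in for a fine-tuned BERT relevance model, and the decaying weights and interpolation value are assumptions, not the values tuned in the paper.

```python
from typing import Callable, List

def document_score(query: str,
                   sentences: List[str],
                   sentence_scorer: Callable[[str, str], float],
                   first_stage_score: float = 0.0,
                   top_k: int = 3,
                   interpolation: float = 0.5) -> float:
    scores = sorted((sentence_scorer(query, s) for s in sentences), reverse=True)
    # Decaying weights over the top-k sentence scores (an assumption for illustration).
    weighted = sum(s / (rank + 1) for rank, s in enumerate(scores[:top_k]))
    return interpolation * first_stage_score + (1 - interpolation) * weighted

# Usage with a dummy scorer in place of a neural relevance model.
dummy = lambda q, s: float(len(set(q.split()) & set(s.split())))
print(document_score("neural ranking", ["neural ranking models", "unrelated text"], dummy))
```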

Memeify: A Large-Scale Meme Generation System

Title Memeify: A Large-Scale Meme Generation System
Authors Suryatej Reddy Vyalla, Vishaal Udandarao, Tanmoy Chakraborty
Abstract Interest in the research areas related to meme propagation and generation has been increasing rapidly in the last couple of years. Meme datasets available online are either specific to a context or contain no class information. Here, we prepare a large-scale dataset of memes with captions and class labels. The dataset consists of 1.1 million meme captions from 128 classes. We also provide reasoning for the existence of broad categories, called “themes”, across the meme dataset; each theme consists of multiple meme classes. Our generation system uses a trained state-of-the-art transformer-based model for caption generation by employing an encoder-decoder architecture. We develop a web interface, called Memeify, that allows users to generate memes of their choice, and we explain in detail the working of the individual components of the system. We also perform a qualitative evaluation of the generated memes by conducting a user study. A link to the demonstration of the Memeify system is https://youtu.be/P_Tfs0X-czs.
Tasks
Published 2019-10-27
URL https://arxiv.org/abs/1910.12279v2
PDF https://arxiv.org/pdf/1910.12279v2.pdf
PWC https://paperswithcode.com/paper/memeify-a-large-scale-meme-generation-system
Repo https://github.com/suryatejreddy/Memeify
Framework none
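
A toy sketch of class-conditioned caption generation: the meme class is injected as a prefix embedding that the rest of the sequence attends to, so one model can generate captions for many classes. All sizes are illustrative, and this is not the authors' transformer configuration (see the linked repo).

```python
import torch
import torch.nn as nn

class ClassConditionedCaptioner(nn.Module):
    def __init__(self, vocab_size=5000, n_classes=128, d_model=256):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.cls = nn.Embedding(n_classes, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens, class_id):                  # tokens: [B, T], class_id: [B]
        prefix = self.cls(class_id).unsqueeze(1)          # [B, 1, D] class embedding
        x = torch.cat([prefix, self.tok(tokens)], dim=1)  # prepend class to the caption
        L = x.size(1)
        causal = torch.triu(torch.full((L, L), float("-inf")), diagonal=1)
        h = self.encoder(x, mask=causal)                  # left-to-right attention only
        return self.head(h)[:, :-1]                       # next-token logits per position

model = ClassConditionedCaptioner()
logits = model(torch.randint(0, 5000, (2, 12)), torch.tensor([3, 17]))
print(logits.shape)  # torch.Size([2, 12, 5000])
```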

Object-Oriented Dynamics Learning through Multi-Level Abstraction

Title Object-Oriented Dynamics Learning through Multi-Level Abstraction
Authors Guangxiang Zhu, Jianhao Wang, Zhizhou Ren, Zichuan Lin, Chongjie Zhang
Abstract Object-based approaches for learning action-conditioned dynamics have demonstrated promise for generalization and interpretability. However, existing approaches suffer from structural limitations and optimization difficulties for common environments with multiple dynamic objects. In this paper, we present a novel self-supervised learning framework, called Multi-level Abstraction Object-oriented Predictor (MAOP), which employs a three-level learning architecture that enables efficient object-based dynamics learning from raw visual observations. We also design a spatial-temporal relational reasoning mechanism for MAOP to support instance-level dynamics learning and handle partial observability. Our results show that MAOP significantly outperforms previous methods in terms of sample efficiency and generalization over novel environments for learning environment models. We also demonstrate that learned dynamics models enable efficient planning in unseen environments, comparable to true environment models. In addition, MAOP learns semantically and visually interpretable disentangled representations.
Tasks Relational Reasoning
Published 2019-04-16
URL https://arxiv.org/abs/1904.07482v4
PDF https://arxiv.org/pdf/1904.07482v4.pdf
PWC https://paperswithcode.com/paper/object-oriented-dynamics-learning-through
Repo https://github.com/mig-zh/OODP
Framework tf
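
A generic sketch of object-wise, relation-aware dynamics prediction: each object's next state is predicted from its own state, the action, and an aggregate of pairwise relation features. This only illustrates the general object-oriented dynamics idea; MAOP's three-level architecture, self-supervision, and training scheme are in the paper and repo, and every dimension below is an assumption.

```python
import torch
import torch.nn as nn

class RelationalDynamics(nn.Module):
    def __init__(self, state_dim=4, action_dim=2, hidden=64):
        super().__init__()
        self.rel = nn.Sequential(nn.Linear(2 * state_dim, hidden), nn.ReLU())
        self.dyn = nn.Sequential(nn.Linear(state_dim + action_dim + hidden, hidden),
                                 nn.ReLU(), nn.Linear(hidden, state_dim))

    def forward(self, states, action):           # states: [B, N, S], action: [B, A]
        B, N, S = states.shape
        src = states.unsqueeze(2).expand(B, N, N, S)
        dst = states.unsqueeze(1).expand(B, N, N, S)
        rel = self.rel(torch.cat([src, dst], dim=-1)).sum(dim=2)   # aggregate pairwise relations
        act = action.unsqueeze(1).expand(B, N, -1)
        return states + self.dyn(torch.cat([states, act, rel], dim=-1))  # predicted next states

model = RelationalDynamics()
next_states = model(torch.randn(2, 5, 4), torch.randn(2, 2))
print(next_states.shape)  # torch.Size([2, 5, 4])
```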

Scalable Learning-Based Sampling Optimization for Compressive Dynamic MRI

Title Scalable Learning-Based Sampling Optimization for Compressive Dynamic MRI
Authors Thomas Sanchez, Baran Gözcü, Ruud B. van Heeswijk, Armin Eftekhari, Efe Ilıcak, Tolga Çukur, Volkan Cevher
Abstract Compressed sensing applied to magnetic resonance imaging (MRI) makes it possible to reduce scanning time by enabling images to be reconstructed from highly undersampled data. In this paper, we tackle the problem of designing a sampling mask for an arbitrary reconstruction method and a limited acquisition budget. Namely, we look for an optimal probability distribution from which a mask with a fixed cardinality is drawn. We demonstrate that this problem admits a compactly supported solution, which leads to a deterministic optimal sampling mask. We then propose a stochastic greedy algorithm that (i) provides an approximate solution to this problem, and (ii) resolves the scaling issues of [1,2]. We validate its performance on in vivo dynamic MRI with retrospective undersampling, showing that our method preserves the performance of [1,2] while reducing the computational burden by a factor close to 200.
Tasks
Published 2019-02-01
URL https://arxiv.org/abs/1902.00386v5
PDF https://arxiv.org/pdf/1902.00386v5.pdf
PWC https://paperswithcode.com/paper/scalable-learning-based-sampling-optimization
Repo https://github.com/t-sanchez/stochasticGreedyMRI
Framework none
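
A schematic stochastic greedy loop for building a sampling mask: at each step a random batch of candidate locations is tried, and the one yielding the lowest reconstruction error is added to the mask. The `evaluate()` stub stands in for "undersample, reconstruct, measure error on training data"; this shows the generic greedy idea only, not the authors' algorithm or its scaling refinements.

```python
import random

def stochastic_greedy_mask(candidates, budget, evaluate, batch_size=32, seed=0):
    rng = random.Random(seed)
    mask = set()
    while len(mask) < budget:
        pool = [c for c in candidates if c not in mask]
        batch = rng.sample(pool, min(batch_size, len(pool)))
        # Add the candidate whose inclusion minimizes reconstruction error.
        best = min(batch, key=lambda c: evaluate(mask | {c}))
        mask.add(best)
    return mask

# Toy usage: the "error" here simply prefers lines near the k-space centre.
candidates = list(range(128))
toy_error = lambda m: sum(abs(line - 64) for line in m)
print(sorted(stochastic_greedy_mask(candidates, budget=8, evaluate=toy_error)))
```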

Spatio-temporal crop classification of low-resolution satellite imagery with capsule layers and distributed attention

Title Spatio-temporal crop classification of low-resolution satellite imagery with capsule layers and distributed attention
Authors John Brandt
Abstract Land use classification of low resolution spatial imagery is one of the most extensively researched fields in remote sensing. Despite significant advancements in satellite technology, high resolution imagery lacks global coverage and can be prohibitively expensive to procure for extended time periods. Accurately classifying land use change without high resolution imagery offers the potential to monitor vital aspects of the global development agenda, including climate smart agriculture, drought resistant crops, and sustainable land management. Utilizing a combination of capsule layers and long short-term memory layers with distributed attention, the present paper achieves state-of-the-art accuracy on temporal crop type classification at a 30x30m resolution with Sentinel-2 imagery.
Tasks Crop Classification
Published 2019-04-23
URL http://arxiv.org/abs/1904.10130v1
PDF http://arxiv.org/pdf/1904.10130v1.pdf
PWC https://paperswithcode.com/paper/spatio-temporal-crop-classification-of-low
Repo https://github.com/JohnMBrandt/capsule-attention-networks
Framework none
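
A reduced sketch of per-pixel temporal classification: a recurrent layer runs over a pixel's band values across the season and an attention layer pools the hidden states before classification. The capsule layers and the paper's exact distributed-attention scheme are omitted, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

class TemporalPixelClassifier(nn.Module):
    def __init__(self, n_bands=10, hidden=64, n_classes=6):
        super().__init__()
        self.lstm = nn.LSTM(n_bands, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, x):                            # x: [B, timesteps, bands]
        h, _ = self.lstm(x)                          # [B, T, H]
        w = torch.softmax(self.attn(h), dim=1)       # attention weights over timesteps
        pooled = (w * h).sum(dim=1)                  # [B, H]
        return self.out(pooled)

model = TemporalPixelClassifier()
logits = model(torch.randn(4, 24, 10))               # e.g. 24 acquisition dates, 10 bands
print(logits.shape)                                   # torch.Size([4, 6])
```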

Gradient Methods for Solving Stackelberg Games

Title Gradient Methods for Solving Stackelberg Games
Authors Roi Naveiro, David Ríos Insua
Abstract Stackelberg games have been gaining importance in recent years due to the rise of Adversarial Machine Learning (AML). Within this context, a new paradigm must be faced: in classical game theory, the intervening agents were humans, whose decisions are generally discrete and low dimensional. In AML, decisions are made by algorithms and are usually continuous and high dimensional, e.g. choosing the weights of a neural network. As closed form solutions for Stackelberg games generally do not exist, it is mandatory to have efficient algorithms to search for numerical solutions. We study two different procedures for solving this type of games using gradient methods. We study time and space scalability of both approaches and discuss in which situation it is more appropriate to use each of them. Finally, we illustrate their use in an adversarial prediction problem.
Tasks
Published 2019-08-19
URL https://arxiv.org/abs/1908.06901v3
PDF https://arxiv.org/pdf/1908.06901v3.pdf
PWC https://paperswithcode.com/paper/gradient-methods-for-solving-stackelberg
Repo https://github.com/roinaveiro/GM_SG
Framework pytorch
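
A small numeric sketch of one simple gradient scheme for a Stackelberg game: for each leader iterate, the follower is approximately best-responded by inner gradient descent on its own cost, then the leader takes a gradient step on its cost at that response (ignoring the follower's sensitivity dy/dx, which keeps the sketch simple). The quadratic costs and step sizes are toy assumptions; the paper studies the procedures and their scalability in general.

```python
import torch

def follower_cost(x, y):   # follower minimizes this in y
    return (y - x) ** 2 + 0.1 * y ** 2

def leader_cost(x, y):     # leader minimizes this in x, anticipating y
    return (x - 3) ** 2 + x * y

x = torch.tensor(0.0, requires_grad=True)
for _ in range(200):
    # Inner loop: approximate follower best response for the current x.
    y = torch.tensor(0.0, requires_grad=True)
    for _ in range(50):
        fy = follower_cost(x.detach(), y)
        gy, = torch.autograd.grad(fy, y)
        y = (y - 0.1 * gy).detach().requires_grad_(True)
    # Outer step: leader descends its own cost at the follower's response.
    fx = leader_cost(x, y.detach())
    gx, = torch.autograd.grad(fx, x)
    x = (x - 0.05 * gx).detach().requires_grad_(True)

print(float(x), float(y))
```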

Learning to navigate image manifolds induced by generative adversarial networks for unsupervised video generation

Title Learning to navigate image manifolds induced by generative adversarial networks for unsupervised video generation
Authors Isabela Albuquerque, João Monteiro, Tiago H. Falk
Abstract In this work, we introduce a two-step framework for generative modeling of temporal data. Specifically, the generative adversarial networks (GANs) setting is employed to generate synthetic scenes of moving objects. To do so, we propose a two-step training scheme within which: a generator of static frames is trained first. Afterwards, a recurrent model is trained with the goal of providing a sequence of inputs to the previously trained frames generator, thus yielding scenes which look natural. The adversarial setting is employed in both training steps. However, with the aim of avoiding known training instabilities in GANs, a multiple discriminator approach is used to train both models. Results in the studied video dataset indicate that, by employing such an approach, the recurrent part is able to learn how to coherently navigate the image manifold induced by the frames generator, thus yielding more natural-looking scenes.
Tasks Video Generation
Published 2019-01-23
URL http://arxiv.org/abs/1901.11384v1
PDF http://arxiv.org/pdf/1901.11384v1.pdf
PWC https://paperswithcode.com/paper/learning-to-navigate-image-manifolds-induced
Repo https://github.com/belaalb/frameGAN
Framework pytorch
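
A skeletal version of the two-step idea: a frame generator G maps latent codes to images, and a recurrent model produces a latent trajectory that G decodes frame by frame into a video. Both modules below are stubs with made-up sizes, and the adversarial training with multiple discriminators is not shown.

```python
import torch
import torch.nn as nn

class LatentNavigator(nn.Module):
    def __init__(self, z_dim=64, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(z_dim, hidden, batch_first=True)
        self.to_z = nn.Linear(hidden, z_dim)

    def forward(self, z0, n_frames):
        z, state = z0.unsqueeze(1), None
        frames_z = []
        for _ in range(n_frames):
            out, state = self.rnn(z, state)
            z = self.to_z(out)                  # next latent code in the trajectory
            frames_z.append(z.squeeze(1))
        return torch.stack(frames_z, dim=1)     # [B, n_frames, z_dim]

frame_generator = nn.Sequential(nn.Linear(64, 32 * 32), nn.Tanh())  # stand-in for a trained G
nav = LatentNavigator()
zs = nav(torch.randn(2, 64), n_frames=16)
video = frame_generator(zs).view(2, 16, 32, 32)   # decode each latent into a frame
print(video.shape)
```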

Video Person Re-ID: Fantastic Techniques and Where to Find Them

Title Video Person Re-ID: Fantastic Techniques and Where to Find Them
Authors Priyank Pathak, Amir Erfan Eshratifar, Michael Gormish
Abstract The ability to identify the same person from multiple camera views without the explicit use of facial recognition is receiving commercial and academic interest. The current status-quo solutions are based on attention neural models. In this paper, we propose Attention and CL loss, which is a hybrid of center and Online Soft Mining (OSM) loss added to the attention loss on top of a temporal attention-based neural network. The proposed loss function applied with bag-of-tricks for training surpasses the state of the art on the common person Re-ID datasets, MARS and PRID 2011. Our source code is publicly available on github.
Tasks Person Re-Identification, Video-Based Person Re-Identification
Published 2019-11-21
URL https://arxiv.org/abs/1912.05295v1
PDF https://arxiv.org/pdf/1912.05295v1.pdf
PWC https://paperswithcode.com/paper/video-person-re-id-fantastic-techniques-and
Repo https://github.com/ppriyank/Video-Person-Re-ID-Fantastic-Techniques-and-Where-to-Find-Them
Framework pytorch
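
A hedged sketch of one ingredient of the hybrid objective above: combining a classification loss with a center-loss style term on clip-level embeddings. The Online Soft Mining and attention-loss components are omitted, and the weighting is illustrative rather than the paper's value.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CenterLoss(nn.Module):
    def __init__(self, n_ids, feat_dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_ids, feat_dim))

    def forward(self, feats, labels):
        # Squared distance of each embedding to the centre of its identity.
        return ((feats - self.centers[labels]) ** 2).sum(dim=1).mean()

n_ids, feat_dim = 100, 128
center_loss = CenterLoss(n_ids, feat_dim)
classifier = nn.Linear(feat_dim, n_ids)

feats = torch.randn(8, feat_dim, requires_grad=True)     # clip-level embeddings
labels = torch.randint(0, n_ids, (8,))
loss = F.cross_entropy(classifier(feats), labels) + 0.0005 * center_loss(feats, labels)
loss.backward()
```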

Modular Multimodal Architecture for Document Classification

Title Modular Multimodal Architecture for Document Classification
Authors Tyler Dauphinee, Nikunj Patel, Mohammad Rashidi
Abstract Page classification is a crucial component to any document analysis system, allowing for complex branching control flows for different components of a given document. Utilizing both the visual and textual content of a page, the proposed method exceeds the current state-of-the-art performance on the RVL-CDIP benchmark at 93.03% test accuracy.
Tasks Document Classification
Published 2019-12-09
URL https://arxiv.org/abs/1912.04376v1
PDF https://arxiv.org/pdf/1912.04376v1.pdf
PWC https://paperswithcode.com/paper/modular-multimodal-architecture-for-document
Repo https://github.com/microsoft/unilm/tree/master/layoutlm
Framework pytorch
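
A minimal late-fusion sketch for page classification: a visual encoder over the page image and a text encoder over its OCR output each produce a feature vector, and a small head classifies their concatenation. Both encoders are stubbed out as random feature vectors here; the paper's specific backbones and fusion choice are in the repo linked above.

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, img_dim=512, txt_dim=768, n_classes=16):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(img_dim + txt_dim, 256), nn.ReLU(),
                                  nn.Linear(256, n_classes))

    def forward(self, img_feat, txt_feat):
        return self.head(torch.cat([img_feat, txt_feat], dim=1))

model = LateFusionClassifier()
logits = model(torch.randn(4, 512), torch.randn(4, 768))   # features from stub encoders
print(logits.shape)   # torch.Size([4, 16])
```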

FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension

Title FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension
Authors Yi-Ting Yeh, Yun-Nung Chen
Abstract Conversational machine comprehension requires deep understanding of the dialogue flow, and the prior work proposed FlowQA to implicitly model the context representations in reasoning for better understanding. This paper proposes to explicitly model the information gain through dialogue reasoning in order to allow the model to focus on more informative cues. The proposed model achieves state-of-the-art performance on the conversational QA dataset QuAC and the sequential instruction understanding dataset SCONE, which shows the effectiveness of the proposed mechanism and demonstrates its capability of generalization to different QA models and tasks.
Tasks Reading Comprehension
Published 2019-08-14
URL https://arxiv.org/abs/1908.05117v3
PDF https://arxiv.org/pdf/1908.05117v3.pdf
PWC https://paperswithcode.com/paper/flowdelta-modeling-flow-information-gain-in
Repo https://github.com/MiuLab/FlowDelta
Framework pytorch
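
A small sketch of the "information gain" idea as one reading of the abstract: take the difference between context representations from consecutive dialogue turns and feed it to the reasoning layer alongside the current representation. This is an illustrative interpretation of the delta mechanism, not the FlowQA/FlowDelta implementation, and the dimensions are assumptions.

```python
import torch
import torch.nn as nn

class DeltaReasoner(nn.Module):
    def __init__(self, d_model=128):
        super().__init__()
        self.rnn = nn.GRU(2 * d_model, d_model, batch_first=True)

    def forward(self, turn_states):                   # [B, n_turns, T, D] per-token states
        prev = torch.zeros_like(turn_states[:, :1])
        delta = turn_states - torch.cat([prev, turn_states[:, :-1]], dim=1)  # h_t - h_{t-1}
        x = torch.cat([turn_states, delta], dim=-1)   # current state + information gain
        B, N, T, D2 = x.shape
        out, _ = self.rnn(x.view(B * N, T, D2))
        return out.view(B, N, T, -1)

model = DeltaReasoner()
print(model(torch.randn(2, 3, 20, 128)).shape)        # torch.Size([2, 3, 20, 128])
```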

Massively Multilingual Transfer for NER

Title Massively Multilingual Transfer for NER
Authors Afshin Rahimi, Yuan Li, Trevor Cohn
Abstract In cross-lingual transfer, NLP models over one or more source languages are applied to a low-resource target language. While most prior work has used a single source model or a few carefully selected models, here we consider a “massive” setting with many such models. This setting raises the problem of poor transfer, particularly from distant languages. We propose two techniques for modulating the transfer, suitable for zero-shot or few-shot learning, respectively. Evaluating on named entity recognition, we show that our techniques are much more effective than strong baselines, including standard ensembling, and our unsupervised method rivals oracle selection of the single best individual model.
Tasks Cross-Lingual Transfer, Few-Shot Learning, Named Entity Recognition
Published 2019-02-01
URL https://arxiv.org/abs/1902.00193v4
PDF https://arxiv.org/pdf/1902.00193v4.pdf
PWC https://paperswithcode.com/paper/multilingual-ner-transfer-for-low-resource
Repo https://github.com/afshinrahimi/mmner
Framework tf
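
A bare-bones weighted ensemble over many source-language models: each model emits per-token label distributions for the target-language sentence, and a per-model weight modulates its contribution. Uniform weights recover standard ensembling; the paper's zero-shot and few-shot weighting techniques are more involved than this sketch, and the tensor shapes are assumptions.

```python
import torch

def weighted_ensemble(per_model_probs, weights=None):
    """per_model_probs: [n_models, n_tokens, n_labels]; weights: [n_models] or None (uniform)."""
    n_models = per_model_probs.shape[0]
    if weights is None:
        weights = torch.ones(n_models)
    weights = weights / weights.sum()                          # normalize the model weights
    mixed = (weights.view(-1, 1, 1) * per_model_probs).sum(dim=0)
    return mixed.argmax(dim=-1)                                # predicted label per token

probs = torch.softmax(torch.randn(12, 30, 9), dim=-1)   # 12 source models, 30 tokens, 9 NER tags
print(weighted_ensemble(probs).shape)                    # torch.Size([30])
```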

The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding Distillation with Ensemble Learning

Title The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding Distillation with Ensemble Learning
Authors Bonggun Shin, Hao Yang, Jinho D. Choi
Abstract Recent advances in deep learning have facilitated the demand of neural models for real applications. In practice, these applications often need to be deployed with limited resources while keeping high accuracy. This paper touches the core of neural models in NLP, word embeddings, and presents a new embedding distillation framework that remarkably reduces the dimension of word embeddings without compromising accuracy. A novel distillation ensemble approach is also proposed that trains a highly efficient student model using multiple teacher models. In our approach, the teacher models play a role only during training, so that the student model operates on its own without support from the teacher models during decoding, which makes it eighty times faster and lighter than other typical ensemble methods. All models are evaluated on seven document classification datasets and show a significant advantage over the teacher models for most cases. Our analysis depicts insightful transformation of word embeddings from distillation and suggests a future direction to ensemble approaches using neural models.
Tasks Document Classification, Word Embeddings
Published 2019-05-31
URL https://arxiv.org/abs/1906.00095v1
PDF https://arxiv.org/pdf/1906.00095v1.pdf
PWC https://paperswithcode.com/paper/190600095
Repo https://github.com/bgshin/distill_demo
Framework tf
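
A generic sketch of distilling into a small-embedding student: the student, built on low-dimensional word embeddings, is trained to match a teacher's softened output distribution (standard KL distillation). The paper's embedding-level distillation and multi-teacher ensemble strategy are richer than this; the model, vocabulary size, and temperature below are all illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BagOfEmbeddingsClassifier(nn.Module):
    def __init__(self, vocab=20000, dim=50, n_classes=5):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab, dim)     # mean-pooled word embeddings
        self.out = nn.Linear(dim, n_classes)

    def forward(self, token_ids):                  # token_ids: [B, T]
        return self.out(self.emb(token_ids))

student = BagOfEmbeddingsClassifier(dim=50)        # small embeddings
teacher = BagOfEmbeddingsClassifier(dim=400)       # stand-in for a large teacher model
tokens = torch.randint(0, 20000, (8, 40))
T = 2.0                                            # softening temperature
with torch.no_grad():
    soft_targets = F.softmax(teacher(tokens) / T, dim=1)
loss = F.kl_div(F.log_softmax(student(tokens) / T, dim=1), soft_targets, reduction="batchmean")
loss.backward()
```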