Paper Group AWR 416
Kernel Graph Attention Network for Fact Verification. Learning Individual Styles of Conversational Gesture. Improving Adversarial Robustness via Guided Complement Entropy. Simple Applications of BERT for Ad Hoc Document Retrieval. Memeify: A Large-Scale Meme Generation System. Object-Oriented Dynamics Learning through Multi-Level Abstraction. Scalable Learning-Based Sampling Optimization for Compressive Dynamic MRI. Spatio-temporal crop classification of low-resolution satellite imagery with capsule layers and distributed attention. Gradient Methods for Solving Stackelberg Games. Learning to navigate image manifolds induced by generative adversarial networks for unsupervised video generation. Video Person Re-ID: Fantastic Techniques and Where to Find Them. Modular Multimodal Architecture for Document Classification. FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension. Massively Multilingual Transfer for NER. The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding Distillation with Ensemble Learning.
Kernel Graph Attention Network for Fact Verification
Title | Kernel Graph Attention Network for Fact Verification |
Authors | Zhenghao Liu, Chenyan Xiong, Maosong Sun |
Abstract | This paper presents Kernel Graph Attention Network (KGAT), which conducts more fine-grained evidence selection and reasoning for the fact verification task. Given a claim and a set of potential supporting evidence sentences, KGAT constructs a graph attention network using the evidence sentences as its nodes and learns to verify the claim integrity using its edge kernels and node kernels, where the edge kernels learn to propagate information across the evidence graph, and the node kernels learn to merge node level information to the graph level. KGAT reaches a comparable performance (69.4%) on FEVER, a large-scale benchmark for fact verification. Our experiments find that KGAT thrives on verification scenarios where multiple evidence pieces are required. This advantage mainly comes from the sparse and fine-grained attention mechanisms from our kernel technique. |
Tasks | |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.09796v2 |
https://arxiv.org/pdf/1910.09796v2.pdf | |
PWC | https://paperswithcode.com/paper/kernel-graph-attention-network-for-fact |
Repo | https://github.com/thunlp/KernelGAT |
Framework | pytorch |
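A minimal sketch of the kernel-attention idea in the abstract: per-node (evidence) attention weights are derived from RBF kernel features over claim-evidence token similarities, the "node kernel" part of KGAT. Kernel means/widths, dimensions, and the readout layer below are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KernelReadout(nn.Module):
    """Scores each evidence node with RBF kernel features over
    claim-evidence token similarities (the 'node kernel' idea)."""
    def __init__(self, n_kernels: int = 11):
        super().__init__()
        # Evenly spaced kernel means on the cosine-similarity range.
        self.mus = torch.linspace(-0.9, 1.0, n_kernels)
        self.sigma = 0.1
        self.score = nn.Linear(n_kernels, 1)

    def forward(self, claim_tok: torch.Tensor, evid_tok: torch.Tensor) -> torch.Tensor:
        # claim_tok: (Lc, d), evid_tok: (n_nodes, Le, d)
        sim = torch.einsum("ld,nmd->nlm",
                           F.normalize(claim_tok, dim=-1),
                           F.normalize(evid_tok, dim=-1))              # (n, Lc, Le)
        # RBF kernels over the similarity matrix, then log-sum pooling.
        k = torch.exp(-((sim.unsqueeze(-1) - self.mus) ** 2) / (2 * self.sigma ** 2))
        feats = torch.log1p(k.sum(dim=2)).sum(dim=1)                   # (n, n_kernels)
        return torch.softmax(self.score(feats).squeeze(-1), dim=0)     # per-node attention

claim = torch.randn(8, 64)          # toy claim token embeddings
evidence = torch.randn(5, 12, 64)   # 5 evidence sentences, 12 tokens each
print(KernelReadout()(claim, evidence))  # attention over the 5 evidence nodes
```

The edge kernels in the paper apply the same kernel-pooling trick to propagate information between evidence nodes; the sketch only shows the node-level readout.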
Learning Individual Styles of Conversational Gesture
Title | Learning Individual Styles of Conversational Gesture |
Authors | Shiry Ginosar, Amir Bar, Gefen Kohavi, Caroline Chan, Andrew Owens, Jitendra Malik |
Abstract | Human speech is often accompanied by hand and arm gestures. Given audio speech input, we generate plausible gestures to go along with the sound. Specifically, we perform cross-modal translation from “in-the-wild” monologue speech of a single speaker to their hand and arm motion. We train on unlabeled videos for which we only have noisy pseudo ground truth from an automatic pose detection system. Our proposed model significantly outperforms baseline methods in a quantitative comparison. To support research toward obtaining a computational understanding of the relationship between gesture and speech, we release a large video dataset of person-specific gestures. The project website with video, code and data can be found at http://people.eecs.berkeley.edu/~shiry/speech2gesture. |
Tasks | Speech-to-Gesture Translation |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.04160v1 |
https://arxiv.org/pdf/1906.04160v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-individual-styles-of-conversational-1 |
Repo | https://github.com/amirbar/speech2gesture |
Framework | none |
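A minimal sketch of the audio-to-gesture regression setup described above: a temporal conv net maps log-mel audio features to per-frame 2D keypoints and is trained against pseudo ground truth from a pose detector. The layer sizes, the 64-keypoint layout, and the plain L1 objective are illustrative assumptions; the paper uses a specific U-Net-style architecture with an added adversarial term.

```python
import torch
import torch.nn as nn

class Speech2GestureSketch(nn.Module):
    def __init__(self, n_mels: int = 80, n_keypoints: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_mels, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(256, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(256, n_keypoints * 2, kernel_size=1),  # (x, y) per keypoint
        )

    def forward(self, mel: torch.Tensor) -> torch.Tensor:
        # mel: (batch, n_mels, T) -> poses: (batch, T, n_keypoints, 2)
        out = self.net(mel)
        return out.permute(0, 2, 1).reshape(mel.size(0), mel.size(2), -1, 2)

model = Speech2GestureSketch()
mel = torch.randn(2, 80, 120)              # 2 clips, 120 audio frames
pseudo_gt = torch.randn(2, 120, 64, 2)     # noisy poses from a pose detector
loss = nn.L1Loss()(model(mel), pseudo_gt)  # regression to pseudo ground truth
loss.backward()
```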
Improving Adversarial Robustness via Guided Complement Entropy
Title | Improving Adversarial Robustness via Guided Complement Entropy |
Authors | Hao-Yun Chen, Jhao-Hong Liang, Shih-Chieh Chang, Jia-Yu Pan, Yu-Ting Chen, Wei Wei, Da-Cheng Juan |
Abstract | Adversarial robustness has emerged as an important topic in deep learning as carefully crafted attack samples can significantly disturb the performance of a model. Many recent methods have proposed to improve adversarial robustness by utilizing adversarial training or model distillation, which adds additional procedures to model training. In this paper, we propose a new training paradigm called Guided Complement Entropy (GCE) that is capable of achieving “adversarial defense for free,” which involves no additional procedures in the process of improving adversarial robustness. In addition to maximizing model probabilities on the ground-truth class like cross-entropy, we neutralize its probabilities on the incorrect classes along with a “guided” term to balance between these two terms. We show in the experiments that our method achieves better model robustness with even better performance compared to the commonly used cross-entropy training objective. We also show that our method can be used orthogonally with adversarial training across well-known methods with noticeable robustness gain. To the best of our knowledge, our approach is the first one that improves model robustness without compromising performance. |
Tasks | Adversarial Defense |
Published | 2019-03-23 |
URL | https://arxiv.org/abs/1903.09799v3 |
https://arxiv.org/pdf/1903.09799v3.pdf | |
PWC | https://paperswithcode.com/paper/improving-adversarial-robustness-via-guided |
Repo | https://github.com/Line290/FeatureAttack |
Framework | pytorch |
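An illustrative approximation of the guided complement-entropy idea described above: cross-entropy on the ground-truth class plus a term that flattens the predicted distribution over the incorrect classes, scaled by the ground-truth confidence raised to a power alpha. The exact normalization, sign convention, and how the two terms are combined follow the paper; this sketch is only one plausible reading.

```python
import torch
import torch.nn.functional as F

def guided_complement_entropy_sketch(logits, targets, alpha: float = 0.2):
    probs = F.softmax(logits, dim=-1)                       # (batch, K)
    p_g = probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # ground-truth prob
    # Distribution over incorrect classes only, renormalized.
    comp = probs.scatter(1, targets.unsqueeze(1), 0.0)
    comp = comp / comp.sum(dim=1, keepdim=True).clamp_min(1e-12)
    comp_entropy = -(comp.clamp_min(1e-12).log() * comp).sum(dim=1)
    k = logits.size(1)
    # Maximize the (guided) complement entropy => minimize its negative.
    guided = -(p_g ** alpha) * comp_entropy / torch.log(torch.tensor(float(k - 1)))
    return F.cross_entropy(logits, targets) + guided.mean()

logits = torch.randn(4, 10, requires_grad=True)
targets = torch.randint(0, 10, (4,))
guided_complement_entropy_sketch(logits, targets).backward()
```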
Simple Applications of BERT for Ad Hoc Document Retrieval
Title | Simple Applications of BERT for Ad Hoc Document Retrieval |
Authors | Wei Yang, Haotian Zhang, Jimmy Lin |
Abstract | Following recent successes in applying BERT to question answering, we explore simple applications to ad hoc document retrieval. This required confronting the challenge posed by documents that are typically longer than the length of input BERT was designed to handle. We address this issue by applying inference on sentences individually, and then aggregating sentence scores to produce document scores. Experiments on TREC microblog and newswire test collections show that our approach is simple yet effective, as we report the highest average precision on these datasets by neural approaches that we are aware of. |
Tasks | Ad-Hoc Information Retrieval, Question Answering |
Published | 2019-03-26 |
URL | http://arxiv.org/abs/1903.10972v1 |
http://arxiv.org/pdf/1903.10972v1.pdf | |
PWC | https://paperswithcode.com/paper/simple-applications-of-bert-for-ad-hoc |
Repo | https://github.com/castorini/birch |
Framework | pytorch |
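A sketch of the sentence-level scoring and aggregation strategy described in the abstract: score each sentence of a long document against the query with a BERT cross-encoder, then combine the top sentence scores into a document score. The checkpoint below is an untuned placeholder and the decay weights are illustrative assumptions; the paper tunes its relevance model and aggregation on TREC data.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=1)

def document_score(query: str, sentences: list, top_k: int = 3) -> float:
    enc = tokenizer([query] * len(sentences), sentences,
                    padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        sent_scores = model(**enc).logits.squeeze(-1)        # one score per sentence
    top = sent_scores.topk(min(top_k, len(sentences))).values
    weights = torch.tensor([1.0, 0.5, 0.25])[: top.numel()]  # assumed decay weights
    return float((weights * top).sum())

doc = ["BERT adapts well to passage ranking.",
       "Long documents exceed BERT's input length.",
       "Sentence-level scores can be aggregated per document."]
print(document_score("ad hoc document retrieval with BERT", doc))
```

This sidesteps BERT's input-length limit because no single forward pass ever sees more than one sentence paired with the query.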
Memeify: A Large-Scale Meme Generation System
Title | Memeify: A Large-Scale Meme Generation System |
Authors | Suryatej Reddy Vyalla, Vishaal Udandarao, Tanmoy Chakraborty |
Abstract | Interest in the research areas related to meme propagation and generation has been increasing rapidly in the last couple of years. Meme datasets available online are either specific to a context or contain no class information. Here, we prepare a large-scale dataset of memes with captions and class labels. The dataset consists of 1.1 million meme captions from 128 classes. We also provide reasoning for the existence of broad categories, called “themes”, across the meme dataset; each theme consists of multiple meme classes. Our generation system uses a trained state-of-the-art transformer-based model for caption generation by employing an encoder-decoder architecture. We develop a web interface, called Memeify, for users to generate memes of their choice, and explain in detail the working of the individual components of the system. We also perform a qualitative evaluation of the generated memes by conducting a user study. A link to the demonstration of the Memeify system is https://youtu.be/P_Tfs0X-czs. |
Tasks | |
Published | 2019-10-27 |
URL | https://arxiv.org/abs/1910.12279v2 |
https://arxiv.org/pdf/1910.12279v2.pdf | |
PWC | https://paperswithcode.com/paper/memeify-a-large-scale-meme-generation-system |
Repo | https://github.com/suryatejreddy/Memeify |
Framework | none |
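A minimal sketch of class-conditioned caption generation in the spirit of the system above: prepend a meme-class token to the prompt and sample from a causal language model. The `<|class|>` prompt format and the off-the-shelf GPT-2 checkpoint are illustrative assumptions; Memeify trains its own transformer-based model on the 1.1 million captions.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical prompt format: class tag followed by a caption marker.
prompt = "<|class|> distracted boyfriend <|caption|>"
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20, do_sample=True, top_p=0.9,
                     pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```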
Object-Oriented Dynamics Learning through Multi-Level Abstraction
Title | Object-Oriented Dynamics Learning through Multi-Level Abstraction |
Authors | Guangxiang Zhu, Jianhao Wang, Zhizhou Ren, Zichuan Lin, Chongjie Zhang |
Abstract | Object-based approaches for learning action-conditioned dynamics have demonstrated promise for generalization and interpretability. However, existing approaches suffer from structural limitations and optimization difficulties for common environments with multiple dynamic objects. In this paper, we present a novel self-supervised learning framework, called Multi-level Abstraction Object-oriented Predictor (MAOP), which employs a three-level learning architecture that enables efficient object-based dynamics learning from raw visual observations. We also design a spatial-temporal relational reasoning mechanism for MAOP to support instance-level dynamics learning and handle partial observability. Our results show that MAOP significantly outperforms previous methods in terms of sample efficiency and generalization to novel environments for learning environment models. We also demonstrate that learned dynamics models enable efficient planning in unseen environments, comparable to true environment models. In addition, MAOP learns semantically and visually interpretable disentangled representations. |
Tasks | Relational Reasoning |
Published | 2019-04-16 |
URL | https://arxiv.org/abs/1904.07482v4 |
https://arxiv.org/pdf/1904.07482v4.pdf | |
PWC | https://paperswithcode.com/paper/object-oriented-dynamics-learning-through |
Repo | https://github.com/mig-zh/OODP |
Framework | tf |
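A sketch of object-centric, action-conditioned dynamics with simple relational reasoning, the core idea in the abstract: each object's next state is predicted from its own state, the action, and a sum of pairwise relation features. Dimensions and the MLPs are illustrative assumptions; MAOP additionally learns the object masks from pixels and uses a three-level architecture.

```python
import torch
import torch.nn as nn

class ObjectDynamicsSketch(nn.Module):
    def __init__(self, obj_dim=16, act_dim=4, hid=64):
        super().__init__()
        self.relation = nn.Sequential(nn.Linear(2 * obj_dim, hid), nn.ReLU(),
                                      nn.Linear(hid, hid))
        self.dynamics = nn.Sequential(nn.Linear(obj_dim + act_dim + hid, hid), nn.ReLU(),
                                      nn.Linear(hid, obj_dim))

    def forward(self, objects, action):
        # objects: (n_obj, obj_dim), action: (act_dim,)
        n = objects.size(0)
        pairs = torch.cat([objects.unsqueeze(1).expand(n, n, -1),
                           objects.unsqueeze(0).expand(n, n, -1)], dim=-1)
        rel = self.relation(pairs).sum(dim=1)                    # aggregate over partners
        act = action.unsqueeze(0).expand(n, -1)
        delta = self.dynamics(torch.cat([objects, act, rel], dim=-1))
        return objects + delta                                   # predicted next states

model = ObjectDynamicsSketch()
next_states = model(torch.randn(5, 16), torch.tensor([0., 1., 0., 0.]))
print(next_states.shape)  # torch.Size([5, 16])
```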
Scalable Learning-Based Sampling Optimization for Compressive Dynamic MRI
Title | Scalable Learning-Based Sampling Optimization for Compressive Dynamic MRI |
Authors | Thomas Sanchez, Baran Gözcü, Ruud B. van Heeswijk, Armin Eftekhari, Efe Ilıcak, Tolga Çukur, Volkan Cevher |
Abstract | Compressed sensing applied to magnetic resonance imaging (MRI) allows the scanning time to be reduced by enabling images to be reconstructed from highly undersampled data. In this paper, we tackle the problem of designing a sampling mask for an arbitrary reconstruction method and a limited acquisition budget. Namely, we look for an optimal probability distribution from which a mask with a fixed cardinality is drawn. We demonstrate that this problem admits a compactly supported solution, which leads to a deterministic optimal sampling mask. We then propose a stochastic greedy algorithm that (i) provides an approximate solution to this problem, and (ii) resolves the scaling issues of [1,2]. We validate its performance on in vivo dynamic MRI with retrospective undersampling, showing that our method preserves the performance of [1,2] while reducing the computational burden by a factor close to 200. |
Tasks | |
Published | 2019-02-01 |
URL | https://arxiv.org/abs/1902.00386v5 |
https://arxiv.org/pdf/1902.00386v5.pdf | |
PWC | https://paperswithcode.com/paper/scalable-learning-based-sampling-optimization |
Repo | https://github.com/t-sanchez/stochasticGreedyMRI |
Framework | none |
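A sketch of a stochastic greedy mask search as described above: grow the sampling mask one location at a time, at each step evaluating only a random subset of candidate locations on a random batch of training images. `reconstruct` and `quality` are hypothetical stand-ins for the reconstruction method and the image-quality metric (e.g. PSNR) that the real pipeline would supply.

```python
import random

def stochastic_greedy_mask(candidates, budget, train_images,
                           reconstruct, quality, n_cand=20, n_imgs=5):
    """Greedily select `budget` sampling locations, subsampling both the
    candidate set and the training images at every step to keep it cheap."""
    mask = set()
    while len(mask) < budget:
        pool = [c for c in candidates if c not in mask]
        sample_cand = random.sample(pool, k=min(n_cand, len(pool)))
        sample_imgs = random.sample(train_images, k=min(n_imgs, len(train_images)))
        best_c, best_q = None, float("-inf")
        for c in sample_cand:
            trial = mask | {c}
            # Score the trial mask by reconstruction quality on the image batch.
            q = sum(quality(img, reconstruct(img, trial)) for img in sample_imgs)
            if q > best_q:
                best_c, best_q = c, q
        mask.add(best_c)
    return mask
```

Subsampling candidates and images at every step is what buys the roughly 200x reduction in computation the abstract mentions, at the cost of an approximate rather than exact greedy choice.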
Spatio-temporal crop classification of low-resolution satellite imagery with capsule layers and distributed attention
Title | Spatio-temporal crop classification of low-resolution satellite imagery with capsule layers and distributed attention |
Authors | John Brandt |
Abstract | Land use classification of low-resolution spatial imagery is one of the most extensively researched fields in remote sensing. Despite significant advancements in satellite technology, high resolution imagery lacks global coverage and can be prohibitively expensive to procure for extended time periods. Accurately classifying land use change without high resolution imagery offers the potential to monitor vital aspects of the global development agenda, including climate-smart agriculture, drought-resistant crops, and sustainable land management. Utilizing a combination of capsule layers and long short-term memory layers with distributed attention, the present paper achieves state-of-the-art accuracy on temporal crop type classification at a 30x30m resolution with Sentinel 2 imagery. |
Tasks | Crop Classification |
Published | 2019-04-23 |
URL | http://arxiv.org/abs/1904.10130v1 |
http://arxiv.org/pdf/1904.10130v1.pdf | |
PWC | https://paperswithcode.com/paper/spatio-temporal-crop-classification-of-low |
Repo | https://github.com/JohnMBrandt/capsule-attention-networks |
Framework | none |
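A sketch of the temporal classification pipeline described above: per-timestep image features feed an LSTM, and an attention layer pools across timesteps before classification. A plain conv encoder stands in for the capsule layers, and the attention is a simple learned softmax over timesteps; both are illustrative simplifications of the paper's architecture.

```python
import torch
import torch.nn as nn

class TemporalCropClassifier(nn.Module):
    def __init__(self, n_classes=10, n_bands=10, hid=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(n_bands, 32, 3, padding=1), nn.ReLU(),
                                     nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.lstm = nn.LSTM(32, hid, batch_first=True)
        self.attn = nn.Linear(hid, 1)
        self.head = nn.Linear(hid, n_classes)

    def forward(self, x):
        # x: (batch, time, bands, H, W) -- e.g. a Sentinel-2 pixel-window time series
        b, t = x.shape[:2]
        feats = self.encoder(x.flatten(0, 1)).view(b, t, -1)
        seq, _ = self.lstm(feats)
        weights = torch.softmax(self.attn(seq), dim=1)   # attention over timesteps
        pooled = (weights * seq).sum(dim=1)
        return self.head(pooled)

logits = TemporalCropClassifier()(torch.randn(2, 12, 10, 8, 8))
print(logits.shape)  # torch.Size([2, 10])
```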
Gradient Methods for Solving Stackelberg Games
Title | Gradient Methods for Solving Stackelberg Games |
Authors | Roi Naveiro, David Ríos Insua |
Abstract | Stackelberg games have gained importance in recent years due to the rise of Adversarial Machine Learning (AML). Within this context, a new paradigm must be faced: in classical game theory, the intervening agents were humans whose decisions are generally discrete and low dimensional. In AML, decisions are made by algorithms and are usually continuous and high dimensional, e.g. choosing the weights of a neural network. As closed-form solutions for Stackelberg games generally do not exist, efficient algorithms are needed to search for numerical solutions. We study two different procedures for solving this type of game using gradient methods. We study the time and space scalability of both approaches and discuss in which situations it is more appropriate to use each of them. Finally, we illustrate their use in an adversarial prediction problem. |
Tasks | |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.06901v3 |
https://arxiv.org/pdf/1908.06901v3.pdf | |
PWC | https://paperswithcode.com/paper/gradient-methods-for-solving-stackelberg |
Repo | https://github.com/roinaveiro/GM_SG |
Framework | pytorch |
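A sketch of one generic gradient procedure for a Stackelberg game: for each leader update, approximate the follower's best response with a few inner gradient steps, then take a leader gradient step against that response. The toy quadratic payoffs, step sizes, and iteration counts are illustrative assumptions; the paper analyzes two specific procedures and their time/space scalability.

```python
import torch

leader = torch.zeros(2, requires_grad=True)
follower = torch.zeros(2, requires_grad=True)

def leader_cost(x, y):    # leader minimizes this, anticipating the follower
    return (x - 1).pow(2).sum() + (x - y).pow(2).sum()

def follower_cost(x, y):  # follower minimizes this, given the leader's move x
    return (y - x / 2).pow(2).sum()

for _ in range(200):
    # Inner loop: approximate the follower's best response to the current leader move.
    for _ in range(10):
        g = torch.autograd.grad(follower_cost(leader.detach(), follower), follower)[0]
        follower.data -= 0.1 * g
    # Outer loop: leader descends its own cost at the follower's (approximate) response.
    g = torch.autograd.grad(leader_cost(leader, follower.detach()), leader)[0]
    leader.data -= 0.05 * g

print(leader.detach(), follower.detach())
```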
Learning to navigate image manifolds induced by generative adversarial networks for unsupervised video generation
Title | Learning to navigate image manifolds induced by generative adversarial networks for unsupervised video generation |
Authors | Isabela Albuquerque, João Monteiro, Tiago H. Falk |
Abstract | In this work, we introduce a two-step framework for generative modeling of temporal data. Specifically, the generative adversarial networks (GANs) setting is employed to generate synthetic scenes of moving objects. To do so, we propose a two-step training scheme within which: a generator of static frames is trained first. Afterwards, a recurrent model is trained with the goal of providing a sequence of inputs to the previously trained frames generator, thus yielding scenes which look natural. The adversarial setting is employed in both training steps. However, with the aim of avoiding known training instabilities in GANs, a multiple discriminator approach is used to train both models. Results in the studied video dataset indicate that, by employing such an approach, the recurrent part is able to learn how to coherently navigate the image manifold induced by the frames generator, thus yielding more natural-looking scenes. |
Tasks | Video Generation |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.11384v1 |
http://arxiv.org/pdf/1901.11384v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-navigate-image-manifolds-induced |
Repo | https://github.com/belaalb/frameGAN |
Framework | pytorch |
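A sketch of the two-step wiring described above: a pretrained (here untrained, purely for illustration) frame generator stays frozen, while a recurrent model produces the sequence of latent codes that drives it, yielding a video. The layer sizes are assumptions, and the adversarial training of the recurrent part (with multiple discriminators in the paper) is omitted.

```python
import torch
import torch.nn as nn

latent_dim, seq_len = 32, 16

frame_generator = nn.Sequential(            # stand-in for a pretrained GAN generator
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 3 * 32 * 32), nn.Tanh(),
)
for p in frame_generator.parameters():       # step 2 keeps the frame generator frozen
    p.requires_grad_(False)

class LatentNavigator(nn.Module):
    """Recurrent model that learns to walk the generator's latent space."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(latent_dim, 128, batch_first=True)
        self.to_z = nn.Linear(128, latent_dim)

    def forward(self, z0):
        steps = z0.unsqueeze(1).repeat(1, seq_len, 1)   # seed every timestep
        h, _ = self.rnn(steps)
        return self.to_z(h)                             # (batch, seq_len, latent_dim)

z0 = torch.randn(4, latent_dim)
z_seq = LatentNavigator()(z0)
video = frame_generator(z_seq).view(4, seq_len, 3, 32, 32)
print(video.shape)  # torch.Size([4, 16, 3, 32, 32])
```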
Video Person Re-ID: Fantastic Techniques and Where to Find Them
Title | Video Person Re-ID: Fantastic Techniques and Where to Find Them |
Authors | Priyank Pathak, Amir Erfan Eshratifar, Michael Gormish |
Abstract | The ability to identify the same person from multiple camera views without the explicit use of facial recognition is receiving commercial and academic interest. The current status-quo solutions are based on attention neural models. In this paper, we propose Attention and CL loss, which is a hybrid of center and Online Soft Mining (OSM) loss added to the attention loss on top of a temporal attention-based neural network. The proposed loss function applied with bag-of-tricks for training surpasses the state of the art on the common person Re-ID datasets, MARS and PRID 2011. Our source code is publicly available on github. |
Tasks | Person Re-Identification, Video-Based Person Re-Identification |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1912.05295v1 |
https://arxiv.org/pdf/1912.05295v1.pdf | |
PWC | https://paperswithcode.com/paper/video-person-re-id-fantastic-techniques-and |
Repo | https://github.com/ppriyank/Video-Person-Re-ID-Fantastic-Techniques-and-Where-to-Find-Them |
Framework | pytorch |
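A sketch of the temporal attention idea underlying the model above: per-frame CNN embeddings are pooled into a single clip embedding with learned attention weights, which is then used for identity classification and metric losses. The backbone and dimensions are illustrative; the paper's hybrid center + OSM "Attention and CL" loss is not reproduced here.

```python
import torch
import torch.nn as nn

class TemporalAttentionPool(nn.Module):
    def __init__(self, feat_dim=512, n_ids=625):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(feat_dim, 128), nn.Tanh(), nn.Linear(128, 1))
        self.classifier = nn.Linear(feat_dim, n_ids)

    def forward(self, frame_feats):
        # frame_feats: (batch, n_frames, feat_dim), e.g. from a ResNet backbone
        weights = torch.softmax(self.attn(frame_feats), dim=1)   # attention over frames
        clip_embedding = (weights * frame_feats).sum(dim=1)
        return clip_embedding, self.classifier(clip_embedding)

emb, logits = TemporalAttentionPool()(torch.randn(8, 4, 512))
print(emb.shape, logits.shape)  # torch.Size([8, 512]) torch.Size([8, 625])
```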
Modular Multimodal Architecture for Document Classification
Title | Modular Multimodal Architecture for Document Classification |
Authors | Tyler Dauphinee, Nikunj Patel, Mohammad Rashidi |
Abstract | Page classification is a crucial component to any document analysis system, allowing for complex branching control flows for different components of a given document. Utilizing both the visual and textual content of a page, the proposed method exceeds the current state-of-the-art performance on the RVL-CDIP benchmark at 93.03% test accuracy. |
Tasks | Document Classification |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.04376v1 |
https://arxiv.org/pdf/1912.04376v1.pdf | |
PWC | https://paperswithcode.com/paper/modular-multimodal-architecture-for-document |
Repo | https://github.com/microsoft/unilm/tree/master/layoutlm |
Framework | pytorch |
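A sketch of the visual + textual fusion described above: an image encoder and a text encoder each produce a page embedding, and a small head classifies their concatenation. The encoders here are simple stand-ins (the paper combines a CNN over the page image with features from the page's text), and all sizes are illustrative assumptions; 16 classes matches RVL-CDIP.

```python
import torch
import torch.nn as nn

class MultimodalPageClassifier(nn.Module):
    def __init__(self, vocab=5000, n_classes=16):
        super().__init__()
        self.image_enc = nn.Sequential(nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
                                       nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.text_enc = nn.Sequential(nn.Linear(vocab, 128), nn.ReLU())  # bag-of-words stand-in
        self.head = nn.Sequential(nn.Linear(16 + 128, 128), nn.ReLU(),
                                  nn.Linear(128, n_classes))

    def forward(self, page_image, text_bow):
        fused = torch.cat([self.image_enc(page_image), self.text_enc(text_bow)], dim=-1)
        return self.head(fused)

model = MultimodalPageClassifier()
logits = model(torch.randn(2, 1, 64, 64), torch.rand(2, 5000))
print(logits.shape)  # torch.Size([2, 16])
```

Late fusion like this keeps the two branches modular, which is the property the title emphasizes: either encoder can be swapped without touching the other.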
FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension
Title | FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension |
Authors | Yi-Ting Yeh, Yun-Nung Chen |
Abstract | Conversational machine comprehension requires deep understanding of the dialogue flow, and the prior work proposed FlowQA to implicitly model the context representations in reasoning for better understanding. This paper proposes to explicitly model the information gain through dialogue reasoning in order to allow the model to focus on more informative cues. The proposed model achieves state-of-the-art performance in a conversational QA dataset QuAC and sequential instruction understanding dataset SCONE, which shows the effectiveness of the proposed mechanism and demonstrates its capability of generalization to different QA models and tasks. |
Tasks | Reading Comprehension |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.05117v3 |
https://arxiv.org/pdf/1908.05117v3.pdf | |
PWC | https://paperswithcode.com/paper/flowdelta-modeling-flow-information-gain-in |
Repo | https://github.com/MiuLab/FlowDelta |
Framework | pytorch |
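A sketch of the "information gain" idea described above: take the difference between a passage's flow representation at consecutive dialogue turns and feed it alongside the current representation, so the reader can attend to what each turn added. How the delta is consumed downstream (and the full FlowQA-style reader around it) is simplified away; the dimensions are illustrative.

```python
import torch
import torch.nn as nn

class FlowDeltaSketch(nn.Module):
    def __init__(self, hid=128):
        super().__init__()
        self.fuse = nn.Linear(2 * hid, hid)

    def forward(self, flow):
        # flow: (n_turns, passage_len, hid) -- context representation per dialogue turn
        delta = flow[1:] - flow[:-1]                        # information gained per turn
        delta = torch.cat([torch.zeros_like(flow[:1]), delta], dim=0)
        return self.fuse(torch.cat([flow, delta], dim=-1))  # current state + its gain

out = FlowDeltaSketch()(torch.randn(5, 40, 128))
print(out.shape)  # torch.Size([5, 40, 128])
```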
Massively Multilingual Transfer for NER
Title | Massively Multilingual Transfer for NER |
Authors | Afshin Rahimi, Yuan Li, Trevor Cohn |
Abstract | In cross-lingual transfer, NLP models over one or more source languages are applied to a low-resource target language. While most prior work has used a single source model or a few carefully selected models, here we consider a “massive” setting with many such models. This setting raises the problem of poor transfer, particularly from distant languages. We propose two techniques for modulating the transfer, suitable for zero-shot or few-shot learning, respectively. Evaluating on named entity recognition, we show that our techniques are much more effective than strong baselines, including standard ensembling, and our unsupervised method rivals oracle selection of the single best individual model. |
Tasks | Cross-Lingual Transfer, Few-Shot Learning, Named Entity Recognition |
Published | 2019-02-01 |
URL | https://arxiv.org/abs/1902.00193v4 |
https://arxiv.org/pdf/1902.00193v4.pdf | |
PWC | https://paperswithcode.com/paper/multilingual-ner-transfer-for-low-resource |
Repo | https://github.com/afshinrahimi/mmner |
Framework | tf |
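A sketch of modulated many-source transfer as described above: each source-language model votes a label distribution for every token, and per-source reliability weights modulate the combination (uniform weights recover standard ensembling). How those weights are estimated zero-shot or few-shot is the paper's contribution and is not reproduced here; shapes below are illustrative.

```python
import torch

n_sources, n_tokens, n_labels = 40, 12, 9   # e.g. 40 source models, BIO-style tag set

# Predicted label distributions from each source model on target-language text.
source_preds = torch.softmax(torch.randn(n_sources, n_tokens, n_labels), dim=-1)

# Per-source reliability weights (uniform here; the paper estimates these).
reliability = torch.full((n_sources,), 1.0 / n_sources)

combined = torch.einsum("s,stl->tl", reliability, source_preds)
predicted_tags = combined.argmax(dim=-1)
print(predicted_tags)
```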
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding Distillation with Ensemble Learning
Title | The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding Distillation with Ensemble Learning |
Authors | Bonggun Shin, Hao Yang, Jinho D. Choi |
Abstract | Recent advances in deep learning have facilitated the demand of neural models for real applications. In practice, these applications often need to be deployed with limited resources while keeping high accuracy. This paper touches the core of neural models in NLP, word embeddings, and presents a new embedding distillation framework that remarkably reduces the dimension of word embeddings without compromising accuracy. A novel distillation ensemble approach is also proposed that trains a highly efficient student model using multiple teacher models. In our approach, the teacher models play roles only during training such that the student model operates on its own without support from the teacher models during decoding, which makes it eighty times faster and lighter than other typical ensemble methods. All models are evaluated on seven document classification datasets and show a significant advantage over the teacher models in most cases. Our analysis depicts insightful transformations of word embeddings from distillation and suggests a future direction for ensemble approaches using neural models. |
Tasks | Document Classification, Word Embeddings |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1906.00095v1 |
https://arxiv.org/pdf/1906.00095v1.pdf | |
PWC | https://paperswithcode.com/paper/190600095 |
Repo | https://github.com/bgshin/distill_demo |
Framework | tf |
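A sketch of the ensemble-distillation idea above: several teacher classifiers built on large word embeddings guide a small student whose embeddings are much lower-dimensional, and at inference only the student is used. The averaging of teacher distributions, the objective, and all sizes below are illustrative assumptions about one reasonable instantiation, not the paper's exact scheme.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, n_classes, seq_len = 1000, 5, 20
teachers = [nn.Sequential(nn.EmbeddingBag(vocab, 300), nn.Linear(300, n_classes))
            for _ in range(3)]                       # large-embedding teachers
student = nn.Sequential(nn.EmbeddingBag(vocab, 50), nn.Linear(50, n_classes))

tokens = torch.randint(0, vocab, (8, seq_len))       # a toy batch of documents
labels = torch.randint(0, n_classes, (8,))

with torch.no_grad():                                # teachers are used at training time only
    teacher_probs = torch.stack([F.softmax(t(tokens), dim=-1) for t in teachers]).mean(0)

student_logits = student(tokens)
loss = F.cross_entropy(student_logits, labels) + \
       F.kl_div(F.log_softmax(student_logits, dim=-1), teacher_probs, reduction="batchmean")
loss.backward()   # only the student receives gradients; decoding later needs no teachers
```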