February 2, 2020


Paper Group AWR 41



MUREL: Multimodal Relational Reasoning for Visual Question Answering

Title MUREL: Multimodal Relational Reasoning for Visual Question Answering
Authors Remi Cadene, Hedi Ben-younes, Matthieu Cord, Nicolas Thome
Abstract Multimodal attentional networks are currently state-of-the-art models for Visual Question Answering (VQA) tasks involving real images. Although attention allows the model to focus on the visual content relevant to the question, this simple mechanism is arguably insufficient to model complex reasoning features required for VQA or other high-level tasks. In this paper, we propose MuRel, a multimodal relational network which is learned end-to-end to reason over real images. Our first contribution is the introduction of the MuRel cell, an atomic reasoning primitive representing interactions between question and image regions by a rich vectorial representation, and modeling region relations with pairwise combinations. Secondly, we incorporate the cell into a full MuRel network, which progressively refines visual and question interactions, and can be leveraged to define visualization schemes finer than mere attention maps. We validate the relevance of our approach with various ablation studies, and show its superiority to attention-based methods on three datasets: VQA 2.0, VQA-CP v2 and TDIUC. Our final MuRel network is competitive with or outperforms state-of-the-art results in this challenging context. Our code is available: https://github.com/Cadene/murel.bootstrap.pytorch
Tasks Relational Reasoning, Visual Question Answering
Published 2019-02-25
URL http://arxiv.org/abs/1902.09487v1
PDF http://arxiv.org/pdf/1902.09487v1.pdf
PWC https://paperswithcode.com/paper/murel-multimodal-relational-reasoning-for
Repo https://github.com/Cadene/murel.bootstrap.pytorch
Framework pytorch

Rates of Convergence for Sparse Variational Gaussian Process Regression

Title Rates of Convergence for Sparse Variational Gaussian Process Regression
Authors David R. Burt, Carl E. Rasmussen, Mark van der Wilk
Abstract Excellent variational approximations to Gaussian process posteriors have been developed which avoid the $\mathcal{O}\left(N^3\right)$ scaling with dataset size $N$. They reduce the computational cost to $\mathcal{O}\left(NM^2\right)$, with $M\ll N$ being the number of inducing variables, which summarise the process. While the computational cost seems to be linear in $N$, the true complexity of the algorithm depends on how $M$ must increase to ensure a certain quality of approximation. We address this by characterising the behavior of an upper bound on the KL divergence to the posterior. We show that with high probability the KL divergence can be made arbitrarily small by growing $M$ more slowly than $N$. A particular case of interest is that for regression with normally distributed inputs in $D$ dimensions with the popular Squared Exponential kernel, $M=\mathcal{O}(\log^D N)$ is sufficient. Our results show that as datasets grow, Gaussian process posteriors can truly be approximated cheaply, and provide a concrete rule for how to increase $M$ in continual learning scenarios.
Tasks Continual Learning
Published 2019-03-08
URL https://arxiv.org/abs/1903.03571v3
PDF https://arxiv.org/pdf/1903.03571v3.pdf
PWC https://paperswithcode.com/paper/rates-of-convergence-for-sparse-variational
Repo https://github.com/DavidBurt2/Rates-of-Convergence-SGPR
Framework none
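
As a rough illustration of the scaling claim in the abstract above, the sketch below compares the exact $N^3$ cost with the sparse $NM^2$ cost when $M$ grows like $\log^D N$. The constant factor c = 1 and the helper names inducing_points and cost_ratio are assumptions made for illustration; the constants in the paper's bound are not reproduced here.

```python
import math


def inducing_points(n: int, d: int, c: float = 1.0) -> int:
    """Hypothetical rule M = ceil(c * (log n)^d); c is an assumed constant."""
    return max(1, math.ceil(c * math.log(n) ** d))


def cost_ratio(n: int, d: int) -> float:
    """Ratio of the sparse cost N*M^2 to the exact cost N^3 (constants ignored)."""
    m = inducing_points(n, d)
    return (n * m * m) / float(n ** 3)


# Print N, the assumed M, and the relative cost for growing dataset sizes.
for n in (10**3, 10**4, 10**5, 10**6):
    print(n, inducing_points(n, d=2), cost_ratio(n, d=2))
```

Under these assumptions the ratio shrinks as N grows, which is the sense in which the approximation becomes relatively cheaper on larger datasets.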

Sliced Wasserstein Generative Models

Title Sliced Wasserstein Generative Models
Authors Jiqing Wu, Zhiwu Huang, Dinesh Acharya, Wen Li, Janine Thoma, Danda Pani Paudel, Luc Van Gool
Abstract In generative modeling, the Wasserstein distance (WD) has emerged as a useful metric to measure the discrepancy between generated and real data distributions. Unfortunately, it is challenging to approximate the WD of high-dimensional distributions. In contrast, the sliced Wasserstein distance (SWD) factorizes high-dimensional distributions into their multiple one-dimensional marginal distributions and is thus easier to approximate. In this paper, we introduce novel approximations of the primal and dual SWD. Instead of using a large number of random projections, as it is done by conventional SWD approximation methods, we propose to approximate SWDs with a small number of parameterized orthogonal projections in an end-to-end deep learning fashion. As concrete applications of our SWD approximations, we design two types of differentiable SWD blocks to equip modern generative frameworks—Auto-Encoders (AE) and Generative Adversarial Networks (GAN). In the experiments, we not only show the superiority of the proposed generative models on standard image synthesis benchmarks, but also demonstrate the state-of-the-art performance on challenging high resolution image and video generation in an unsupervised manner.
Tasks Image Generation, Video Generation
Published 2019-04-10
URL http://arxiv.org/abs/1904.05408v2
PDF http://arxiv.org/pdf/1904.05408v2.pdf
PWC https://paperswithcode.com/paper/sliced-wasserstein-generative-models-1
Repo https://github.com/musikisomorphie/swd
Framework tf
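
For orientation, here is a minimal sketch of the conventional sliced Wasserstein estimate that the abstract contrasts its method with: project both samples onto many random directions and average the one-dimensional distances. It does not implement the paper's parameterized orthogonal projections; the function names, the choice of the 1-Wasserstein distance, and the requirement of equal sample sizes are assumptions of this sketch.

```python
import math
import random


def wasserstein_1d(xs, ys):
    """1-D Wasserstein-1 distance between equal-sized samples, via sorting."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def random_direction(dim, rng):
    """Draw a unit vector by normalising a standard Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def sliced_wasserstein(x, y, n_proj=50, seed=0):
    """Average the 1-D distances over n_proj random projections."""
    rng = random.Random(seed)
    dim = len(x[0])
    total = 0.0
    for _ in range(n_proj):
        theta = random_direction(dim, rng)
        px = [sum(a * t for a, t in zip(p, theta)) for p in x]
        py = [sum(a * t for a, t in zip(p, theta)) for p in y]
        total += wasserstein_1d(px, py)
    return total / n_proj
```

Each projection reduces the problem to sorting two lists, which is why the sliced distance is easier to approximate than the full high-dimensional distance.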

Unifying Knowledge Graph Learning and Recommendation: Towards a Better Understanding of User Preferences

Title Unifying Knowledge Graph Learning and Recommendation: Towards a Better Understanding of User Preferences
Authors Yixin Cao, Xiang Wang, Xiangnan He, Zikun Hu, Tat-Seng Chua
Abstract Incorporating a knowledge graph (KG) into a recommender system is a promising way to improve recommendation accuracy and explainability. However, existing methods largely assume that a KG is complete and simply transfer the “knowledge” in the KG at the shallow level of entity raw data or embeddings. This may lead to suboptimal performance, since a practical KG can hardly be complete, and it is common that a KG has missing facts, relations, and entities. Thus, we argue that it is crucial to consider the incomplete nature of a KG when incorporating it into a recommender system. In this paper, we jointly learn the model of recommendation and knowledge graph completion. Distinct from previous KG-based recommendation methods, we transfer the relation information in the KG, so as to understand the reasons that a user likes an item. As an example, if a user has watched several movies directed by (relation) the same person (entity), we can infer that the director relation plays a critical role when the user makes the decision, which helps to understand the user’s preference at a finer granularity. Technically, we contribute a new translation-based recommendation model, which specially accounts for various preferences in translating a user to an item, and then jointly train it with a KG completion model by combining several transfer schemes. Extensive experiments on two benchmark datasets show that our method outperforms state-of-the-art KG-based recommendation methods. Further analysis verifies the positive effect of joint training on both tasks of recommendation and KG completion, and the advantage of our model in understanding user preference. We publish our project at https://github.com/TaoMiner/joint-kg-recommender.
Tasks Knowledge Graph Completion, Recommendation Systems
Published 2019-02-17
URL http://arxiv.org/abs/1902.06236v1
PDF http://arxiv.org/pdf/1902.06236v1.pdf
PWC https://paperswithcode.com/paper/unifying-knowledge-graph-learning-and
Repo https://github.com/TaoMiner/joint-kg-recommender
Framework pytorch

Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks

Title Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks
Authors Zhi-Qin John Xu, Yaoyu Zhang, Tao Luo, Yanyang Xiao, Zheng Ma
Abstract We study the training process of Deep Neural Networks (DNNs) from the Fourier analysis perspective. We demonstrate a very universal Frequency Principle (F-Principle) — DNNs often fit target functions from low to high frequencies — on high-dimensional benchmark datasets such as MNIST/CIFAR10 and deep neural networks such as VGG16. This F-Principle of DNNs is opposite to the behavior of most conventional iterative numerical schemes (e.g., Jacobi method), which exhibit faster convergence for higher frequencies for various scientific computing problems. With a simple theory, we illustrate that this F-Principle results from the regularity of the commonly used activation functions. The F-Principle implies an implicit bias that DNNs tend to fit training data by a low-frequency function. This understanding provides an explanation of good generalization of DNNs on most real datasets and bad generalization of DNNs on parity function or randomized dataset.
Tasks
Published 2019-01-19
URL https://arxiv.org/abs/1901.06523v5
PDF https://arxiv.org/pdf/1901.06523v5.pdf
PWC https://paperswithcode.com/paper/frequency-principle-fourier-analysis-sheds
Repo https://github.com/xuzhiqin1990/F-Principle
Framework tf

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

Title Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation
Authors Chenxi Liu, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Wei Hua, Alan Yuille, Li Fei-Fei
Abstract Recently, Neural Architecture Search (NAS) has successfully identified neural network architectures that exceed human designed ones on large-scale image classification. In this paper, we study NAS for semantic image segmentation. Existing works often focus on searching the repeatable cell structure, while hand-designing the outer network structure that controls the spatial resolution changes. This choice simplifies the search space, but becomes increasingly problematic for dense image prediction which exhibits a lot more network level architectural variations. Therefore, we propose to search the network level structure in addition to the cell level structure, which forms a hierarchical architecture search space. We present a network level search space that includes many popular designs, and develop a formulation that allows efficient gradient-based architecture search (3 P100 GPU days on Cityscapes images). We demonstrate the effectiveness of the proposed method on the challenging Cityscapes, PASCAL VOC 2012, and ADE20K datasets. Auto-DeepLab, our architecture searched specifically for semantic image segmentation, attains state-of-the-art performance without any ImageNet pretraining.
Tasks Image Classification, Neural Architecture Search, Semantic Segmentation
Published 2019-01-10
URL http://arxiv.org/abs/1901.02985v2
PDF http://arxiv.org/pdf/1901.02985v2.pdf
PWC https://paperswithcode.com/paper/auto-deeplab-hierarchical-neural-architecture
Repo https://github.com/NoamRosenberg/AutoML
Framework pytorch

Correlation-aware Adversarial Domain Adaptation and Generalization

Title Correlation-aware Adversarial Domain Adaptation and Generalization
Authors Mohammad Mahfujur Rahman, Clinton Fookes, Mahsa Baktashmotlagh, Sridha Sridharan
Abstract Domain adaptation (DA) and domain generalization (DG) have emerged as solutions to the domain shift problem where the distributions of the source and target data differ. The task of DG is more challenging than DA as the target data is totally unseen during the training phase in DG scenarios. The current state of the art employs adversarial techniques; however, these are rarely considered for the DG problem. Furthermore, these approaches do not consider correlation alignment, which has been proven highly beneficial for minimizing domain discrepancy. In this paper, we propose a correlation-aware adversarial DA and DG framework where the discrepancy between the features of the source and target data is minimized using correlation alignment along with adversarial learning. Incorporating the correlation alignment module along with adversarial learning helps to achieve a more domain-agnostic model due to the improved ability to reduce domain discrepancy with unlabeled target data. Experiments on benchmark datasets serve as evidence that our proposed method yields improved state-of-the-art performance.
Tasks Domain Adaptation, Domain Generalization
Published 2019-11-29
URL https://arxiv.org/abs/1911.12983v1
PDF https://arxiv.org/pdf/1911.12983v1.pdf
PWC https://paperswithcode.com/paper/correlation-aware-adversarial-domain
Repo https://github.com/mahfujur1/CA-DA-DG
Framework none
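
Correlation alignment is commonly implemented as a penalty on the difference between the feature covariance matrices of the two domains. The sketch below follows that common formulation (squared Frobenius distance scaled by 1/(4d^2), as in Deep CORAL); the abstract does not state the exact loss used in this paper, so treat the scaling and the function names as assumptions.

```python
def covariance(features):
    """Sample covariance matrix (d x d) of a list of d-dimensional rows."""
    n, d = len(features), len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in features]
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def coral_loss(source, target):
    """Squared Frobenius distance between covariances, scaled by 1/(4 d^2)."""
    d = len(source[0])
    cs, ct = covariance(source), covariance(target)
    sq = sum((cs[a][b] - ct[a][b]) ** 2 for a in range(d) for b in range(d))
    return sq / (4.0 * d * d)
```

Because the loss depends only on second-order statistics, shifting every target feature by a constant leaves it unchanged, while rescaling the features does not.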

Domain Generalization via Model-Agnostic Learning of Semantic Features

Title Domain Generalization via Model-Agnostic Learning of Semantic Features
Authors Qi Dou, Daniel C. Castro, Konstantinos Kamnitsas, Ben Glocker
Abstract Generalization capability to unseen domains is crucial for machine learning models when deploying to real-world conditions. We investigate the challenging problem of domain generalization, i.e., training a model on multi-domain source data such that it can directly generalize to target domains with unknown statistics. We adopt a model-agnostic learning paradigm with gradient-based meta-train and meta-test procedures to expose the optimization to domain shift. Further, we introduce two complementary losses which explicitly regularize the semantic structure of the feature space. Globally, we align a derived soft confusion matrix to preserve general knowledge about inter-class relationships. Locally, we promote domain-independent class-specific cohesion and separation of sample features with a metric-learning component. The effectiveness of our method is demonstrated with new state-of-the-art results on two common object recognition benchmarks. Our method also shows consistent improvement on a medical image segmentation task.
Tasks Domain Generalization, Medical Image Segmentation, Metric Learning, Object Recognition, Semantic Segmentation
Published 2019-10-29
URL https://arxiv.org/abs/1910.13580v1
PDF https://arxiv.org/pdf/1910.13580v1.pdf
PWC https://paperswithcode.com/paper/domain-generalization-via-model-agnostic
Repo https://github.com/biomedia-mira/masf
Framework tf

GluonTS: Probabilistic Time Series Models in Python

Title GluonTS: Probabilistic Time Series Models in Python
Authors Alexander Alexandrov, Konstantinos Benidis, Michael Bohlke-Schneider, Valentin Flunkert, Jan Gasthaus, Tim Januschowski, Danielle C. Maddix, Syama Rangapuram, David Salinas, Jasper Schulz, Lorenzo Stella, Ali Caner Türkmen, Yuyang Wang
Abstract We introduce Gluon Time Series (GluonTS, available at https://gluon-ts.mxnet.io), a library for deep-learning-based time series modeling. GluonTS simplifies the development of and experimentation with time series models for common tasks such as forecasting or anomaly detection. It provides all necessary components and tools that scientists need for quickly building new models, for efficiently running and analyzing experiments and for evaluating model accuracy.
Tasks Anomaly Detection, Time Series, Time Series Forecasting, Time Series Prediction
Published 2019-06-12
URL https://arxiv.org/abs/1906.05264v2
PDF https://arxiv.org/pdf/1906.05264v2.pdf
PWC https://paperswithcode.com/paper/gluonts-probabilistic-time-series-models-in
Repo https://github.com/mbohlkeschneider/gluon-ts
Framework mxnet

Decentralized & Collaborative AI on Blockchain

Title Decentralized & Collaborative AI on Blockchain
Authors Justin D. Harris, Bo Waggoner
Abstract Machine learning has recently enabled large advances in artificial intelligence, but these tend to be highly centralized. The large datasets required are generally proprietary; predictions are often sold on a per-query basis; and published models can quickly become out of date without effort to acquire more data and re-train them. We propose a framework for participants to collaboratively build a dataset and use smart contracts to host a continuously updated model. This model will be shared publicly on a blockchain where it can be free to use for inference. Ideal learning problems include scenarios where a model is used many times for similar input such as personal assistants, playing games, recommender systems, etc. In order to maintain the model’s accuracy with respect to some test set we propose both financial and non-financial (gamified) incentive structures for providing good data. A free and open source implementation for the Ethereum blockchain is provided at https://github.com/microsoft/0xDeCA10B.
Tasks Recommendation Systems
Published 2019-07-16
URL https://arxiv.org/abs/1907.07247v1
PDF https://arxiv.org/pdf/1907.07247v1.pdf
PWC https://paperswithcode.com/paper/decentralized-collaborative-ai-on-blockchain
Repo https://github.com/microsoft/0xDeCA10B
Framework none

Learning to Find Common Objects Across Few Image Collections

Title Learning to Find Common Objects Across Few Image Collections
Authors Amirreza Shaban, Amir Rahimi, Shray Bansal, Stephen Gould, Byron Boots, Richard Hartley
Abstract Given a collection of bags where each bag is a set of images, our goal is to select one image from each bag such that the selected images are from the same object class. We model the selection as an energy minimization problem with unary and pairwise potential functions. Inspired by recent few-shot learning algorithms, we propose an approach to learn the potential functions directly from the data. Furthermore, we propose a fast greedy inference algorithm for energy minimization. We evaluate our approach on few-shot common object recognition as well as object co-localization tasks. Our experiments show that learning the pairwise and unary terms greatly improves the performance of the model over several well-known methods for these tasks. The proposed greedy optimization algorithm achieves performance comparable to state-of-the-art structured inference algorithms while being ~10 times faster. The code is publicly available on https://github.com/haamoon/finding_common_object.
Tasks Few-Shot Learning, Object Recognition
Published 2019-04-29
URL https://arxiv.org/abs/1904.12936v2
PDF https://arxiv.org/pdf/1904.12936v2.pdf
PWC https://paperswithcode.com/paper/learning-to-find-common-objects-across-image
Repo https://github.com/haamoon/finding_common_object
Framework tf
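
To make the energy formulation concrete, the following sketch selects one image per bag by visiting the bags in order and choosing, at each step, the image with the lowest unary cost plus pairwise cost against the images already chosen. It is a generic greedy heuristic written for illustration, not the authors' inference algorithm, and the costs are hand-made rather than learned; the names greedy_select, unary, labels and label_mismatch are hypothetical.

```python
def greedy_select(unary, pairwise):
    """Pick one image index per bag, bag by bag.

    unary[b][i]: cost of choosing image i in bag b.
    pairwise(b, i, c, j): cost of pairing image i of bag b with image j of bag c.
    """
    chosen = []
    for b, costs in enumerate(unary):
        def total(i):
            # Unary term plus pairwise terms against earlier selections only.
            return costs[i] + sum(pairwise(b, i, c, j) for c, j in enumerate(chosen))
        chosen.append(min(range(len(costs)), key=total))
    return chosen


# Toy data: three bags of two images each, with assumed class labels.
labels = [["cat", "dog"], ["dog", "bird"], ["fish", "dog"]]
unary = [[0.5, 0.1], [0.2, 0.4], [0.3, 0.3]]


def label_mismatch(b, i, c, j):
    """Pairwise cost: 0 if the two images share a label, 1 otherwise."""
    return 0.0 if labels[b][i] == labels[c][j] else 1.0


print(greedy_select(unary, label_mismatch))
```

A greedy pass like this evaluates each candidate once per bag, which is the kind of saving over joint structured inference that the abstract refers to, although a greedy choice can commit early to a poor first selection.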

Revealing the Dark Secrets of BERT

Title Revealing the Dark Secrets of BERT
Authors Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
Abstract BERT-based architectures currently give state-of-the-art performance on many NLP tasks, but little is known about the exact mechanisms that contribute to their success. In the current work, we focus on the interpretation of self-attention, which is one of the fundamental underlying components of BERT. Using a subset of GLUE tasks and a set of handcrafted features-of-interest, we propose a methodology and carry out a qualitative and quantitative analysis of the information encoded by BERT’s individual heads. Our findings suggest that there is a limited set of attention patterns that are repeated across different heads, indicating the overall model overparametrization. While different heads consistently use the same attention patterns, they have varying impact on performance across different tasks. We show that manually disabling attention in certain heads leads to a performance improvement over the regular fine-tuned BERT models.
Tasks
Published 2019-08-21
URL https://arxiv.org/abs/1908.08593v2
PDF https://arxiv.org/pdf/1908.08593v2.pdf
PWC https://paperswithcode.com/paper/revealing-the-dark-secrets-of-bert
Repo https://github.com/KzKe/Transformer-models
Framework none

User Intent Prediction in Information-seeking Conversations

Title User Intent Prediction in Information-seeking Conversations
Authors Chen Qu, Liu Yang, Bruce Croft, Yongfeng Zhang, Johanne R. Trippas, Minghui Qiu
Abstract Conversational assistants are being progressively adopted by the general population. However, they are not capable of handling complicated information-seeking tasks that involve multiple turns of information exchange. Due to the limited communication bandwidth in conversational search, it is important for conversational assistants to accurately detect and predict user intent in information-seeking conversations. In this paper, we investigate two aspects of user intent prediction in an information-seeking setting. First, we extract features based on the content, structural, and sentiment characteristics of a given utterance, and use classic machine learning methods to perform user intent prediction. We then conduct an in-depth feature importance analysis to identify key features in this prediction task. We find that structural features contribute most to the prediction performance. Given this finding, we construct neural classifiers to incorporate context information and achieve better performance without feature engineering. Our findings can provide insights into the important factors and effective methods of user intent prediction in information-seeking conversations.
Tasks Feature Engineering, Feature Importance
Published 2019-01-11
URL http://arxiv.org/abs/1901.03489v1
PDF http://arxiv.org/pdf/1901.03489v1.pdf
PWC https://paperswithcode.com/paper/user-intent-prediction-in-information-seeking
Repo https://github.com/prdwb/UserIntentPrediction
Framework none

A Simple BERT-Based Approach for Lexical Simplification

Title A Simple BERT-Based Approach for Lexical Simplification
Authors Jipeng Qiang, Yun Li, Yi Zhu, Yunhao Yuan, Xindong Wu
Abstract Lexical simplification (LS) aims to replace complex words in a given sentence with their simpler alternatives of equivalent meaning. Recent unsupervised lexical simplification approaches rely only on the complex word itself, regardless of the given sentence, to generate candidate substitutions, which inevitably produces a large number of spurious candidates. We present a simple BERT-based LS approach that makes use of the pre-trained unsupervised deep bidirectional representations of BERT. Despite being entirely unsupervised, our approach obtains clear improvements over baselines that leverage linguistic databases and parallel corpora, outperforming the state of the art by more than 11 accuracy points on three well-known benchmarks.
Tasks Language Modelling, Lexical Simplification
Published 2019-07-14
URL https://arxiv.org/abs/1907.06226v4
PDF https://arxiv.org/pdf/1907.06226v4.pdf
PWC https://paperswithcode.com/paper/a-simple-bert-based-approach-for-lexical
Repo https://github.com/qiang2100/BERT-LS
Framework pytorch

SynC: A Unified Framework for Generating Synthetic Population with Gaussian Copula

Title SynC: A Unified Framework for Generating Synthetic Population with Gaussian Copula
Authors Colin Wan, Zheng Li, Alicia Guo, Yue Zhao
Abstract Synthetic population generation is the process of combining multiple socioeconomic and demographic datasets from different sources and/or granularity levels, and downscaling them to an individual level. Although it is a fundamental step for many data science tasks, an efficient and standard framework is absent. In this study, we propose a multi-stage framework called SynC (Synthetic Population via Gaussian Copula) to fill the gap. SynC first removes potential outliers in the data and then fits the filtered data with a Gaussian copula model to correctly capture dependencies and marginal distributions of sampled survey data. Finally, SynC leverages predictive models to merge datasets into one and then scales them accordingly to match the marginal constraints. We make four key contributions in this work: 1) we propose a novel framework for generating individual level data from aggregated data sources by combining state-of-the-art machine learning and statistical techniques, 2) we demonstrate, through two real-world datasets, its value as a feature engineering tool, as well as an alternative to data collection in situations where gathering is difficult, 3) we release an easy-to-use framework implementation for reproducibility, and 4) we ensure the methodology is scalable at the production level and can easily incorporate new data.
Tasks Feature Engineering
Published 2019-04-16
URL https://arxiv.org/abs/1904.07998v2
PDF https://arxiv.org/pdf/1904.07998v2.pdf
PWC https://paperswithcode.com/paper/sync-a-unified-framework-for-generating
Repo https://github.com/winstonll/SynC
Framework none
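
The Gaussian copula step can be illustrated with a small sampling sketch: draw correlated standard normals, push them through the normal CDF to obtain dependent uniforms, then map each uniform through the inverse CDF of its own marginal. This covers sampling only, not the fitting, outlier removal or merging stages described above; the correlation value, the exponential income marginal, the normal age marginal and all names are assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(n, rho, seed=0):
    """Draw n pairs (u1, u2) of uniforms coupled by a Gaussian copula."""
    rng = random.Random(seed)
    std = NormalDist()
    eps = 1e-12  # keep uniforms strictly inside (0, 1) for the inverse CDFs
    pairs = []
    for _ in range(n):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1 = g1
        z2 = rho * g1 + math.sqrt(1.0 - rho * rho) * g2  # corr(z1, z2) = rho
        u1 = min(max(std.cdf(z1), eps), 1.0 - eps)
        u2 = min(max(std.cdf(z2), eps), 1.0 - eps)
        pairs.append((u1, u2))
    return pairs


def to_marginals(pairs, income_rate=1.0 / 50000.0, age=NormalDist(40.0, 12.0)):
    """Map uniforms to assumed marginals: exponential income, normal age."""
    return [(-math.log(1.0 - u1) / income_rate, age.inv_cdf(u2))
            for u1, u2 in pairs]


# Generate a few synthetic (income, age) individuals with positive dependence.
people = to_marginals(sample_gaussian_copula(5, rho=0.8))
print(people)
```

The copula controls only the dependence between the variables, so the marginals can be swapped for whatever distributions the aggregated data suggest without touching the sampling step.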