April 1, 2020

Paper Group NANR 13

Weakly-supervised Knowledge Graph Alignment with Adversarial Learning. Utilizing Edge Features in Graph Neural Networks via Variational Information Maximization. Learning deep graph matching with channel-independent embedding and Hungarian attention. MMD GAN with Random-Forest Kernels. Locally Constant Networks. Learning Long- and Short-Term User L …

Weakly-supervised Knowledge Graph Alignment with Adversarial Learning


Title	Weakly-supervised Knowledge Graph Alignment with Adversarial Learning
Authors	Anonymous
Abstract	This paper studies aligning knowledge graphs from different sources or languages. Most existing approaches train supervised models for the alignment, which usually requires a large number of aligned knowledge triplets. However, such a large number of aligned knowledge triplets may be unavailable or expensive to obtain in many domains. Therefore, in this paper we propose to study aligning knowledge graphs in a fully unsupervised or weakly supervised fashion, i.e., without or with only a few aligned triplets. We propose an unsupervised framework to align the entity and relation embeddings of different knowledge graphs with an adversarial learning framework. Moreover, a regularization term which maximizes the mutual information between the embeddings of different knowledge graphs is used to mitigate the problem of mode collapse when learning the alignment functions. Such a framework can be further seamlessly integrated with existing supervised methods by utilizing a limited number of aligned triplets as guidance. Experimental results on multiple datasets demonstrate the effectiveness of our proposed approach in both the unsupervised and the weakly-supervised settings.
Tasks	Knowledge Graphs
Published	2020-01-01
URL	https://openreview.net/forum?id=SygfNCEYDH
PDF	https://openreview.net/pdf?id=SygfNCEYDH
PWC	https://paperswithcode.com/paper/weakly-supervised-knowledge-graph-alignment-2
Repo
Framework

Utilizing Edge Features in Graph Neural Networks via Variational Information Maximization


Title	Utilizing Edge Features in Graph Neural Networks via Variational Information Maximization
Authors	Anonymous
Abstract	Graph Neural Networks (GNNs) broadly follow the scheme that the representation vector of each node is updated recursively using the message from neighbor nodes, where the message of a neighbor is usually pre-processed with a parameterized transform matrix. To make better use of edge features, we propose the Edge Information maximized Graph Neural Network (EIGNN) that maximizes the Mutual Information (MI) between edge features and message passing channels. The MI is reformulated as a differentiable objective via a variational approach. We theoretically show that the newly introduced objective enables the model to preserve edge information, and empirically corroborate the enhanced performance of MI-maximized models across a broad range of learning tasks including regression on molecular graphs and relation prediction in knowledge graphs.
Tasks	Knowledge Graphs
Published	2020-01-01
URL	https://openreview.net/forum?id=BygZK2VYvB
PDF	https://openreview.net/pdf?id=BygZK2VYvB
PWC	https://paperswithcode.com/paper/utilizing-edge-features-in-graph-neural-1
Repo
Framework

Learning deep graph matching with channel-independent embedding and Hungarian attention


Title	Learning deep graph matching with channel-independent embedding and Hungarian attention
Authors	Tianshu Yu, Runzhong Wang, Junchi Yan, Baoxin Li
Abstract	Graph matching aims to establish node-wise correspondence between two graphs, which is a classic combinatorial problem and in general NP-complete. Only very recently have deep graph matching methods started to resort to deep networks to achieve unprecedented matching accuracy. Along this direction, this paper makes two complementary contributions which can also be reused as plugins in existing works: i) a novel node and edge embedding strategy which stimulates the multi-head strategy in attention models and allows the information in each channel to be merged independently. In contrast, only node embedding is accounted for in previous works; ii) a general masking mechanism over the loss function is devised to improve the smoothness of objective learning for graph matching. Using the Hungarian algorithm, it dynamically constructs a structured and sparsely connected layer, taking into account the most contributing matching pairs as hard attention. Our approach performs competitively, and can also improve state-of-the-art methods as a plugin, in terms of matching accuracy on three public benchmarks.
Tasks	Graph Matching
Published	2020-01-01
URL	https://openreview.net/forum?id=rJgBd2NYPH
PDF	https://openreview.net/pdf?id=rJgBd2NYPH
PWC	https://paperswithcode.com/paper/learning-deep-graph-matching-with-channel
Repo
Framework
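
The masking idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the optimal assignment is found by brute force instead of the Hungarian algorithm (fine only for tiny matrices), and taking the union of the predicted assignment and the ground truth as the mask is an assumption made for illustration. The names best_assignment, hard_attention_mask and masked_cross_entropy are hypothetical.

```python
from itertools import permutations
from math import log


def best_assignment(scores):
    """Return the permutation maximizing the total score.

    Brute-force stand-in for the Hungarian algorithm; small n only.
    """
    n = len(scores)
    return max(permutations(range(n)),
               key=lambda perm: sum(scores[i][perm[i]] for i in range(n)))


def hard_attention_mask(scores, truth):
    """Binary mask: predicted assignment OR ground truth (an assumption)."""
    n = len(scores)
    pred = best_assignment(scores)
    return [[1 if (pred[i] == j or truth[i] == j) else 0 for j in range(n)]
            for i in range(n)]


def masked_cross_entropy(scores, truth, eps=1e-12):
    """Binary cross-entropy evaluated only on the masked entries."""
    n = len(scores)
    mask = hard_attention_mask(scores, truth)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if mask[i][j]:
                target = 1.0 if truth[i] == j else 0.0
                p = min(max(scores[i][j], eps), 1.0 - eps)
                total -= target * log(p) + (1.0 - target) * log(1.0 - p)
    return total / n


# Soft matching scores: rows are nodes of graph 1, columns nodes of graph 2.
scores = [[0.7, 0.2, 0.1],
          [0.1, 0.3, 0.6],
          [0.2, 0.5, 0.3]]
truth = [0, 1, 2]  # ground-truth column for each row
mask = hard_attention_mask(scores, truth)
loss = masked_cross_entropy(scores, truth)
```

Only the entries selected by the mask contribute to the loss, so confidently correct non-matches are ignored while wrongly predicted pairs and missed ground-truth pairs are penalized.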

MMD GAN with Random-Forest Kernels


Title	MMD GAN with Random-Forest Kernels
Authors	Tao Huang, Zhen Han, Xu Jia, Hanyuan Hang
Abstract	In this paper, we propose a novel kind of kernel, the random-forest kernel, to enhance the empirical performance of MMD GAN. Unlike common forests with deterministic routings, our random-forest kernel uses a probabilistic routing variant, which can be merged with CNN frameworks. Our proposed random-forest kernel has the following advantages: From the perspective of random forests, the output of the GAN discriminator can be viewed as feature inputs to the forest, where each tree gets access to merely a fraction of the features, and thus the entire forest benefits from ensemble learning. From the perspective of kernel methods, the random-forest kernel is proved to be characteristic, and therefore suitable for the MMD structure. Besides, being an asymmetric kernel, our random-forest kernel is much more flexible in terms of capturing the differences between distributions. Sharing the advantages of CNNs, kernel methods, and ensemble learning, our random-forest kernel based MMD GAN obtains desirable empirical performance on the CIFAR-10, CelebA and LSUN bedroom data sets. Furthermore, for the sake of completeness, we also put forward a comprehensive theoretical analysis to support our experimental results.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=HJxhWa4KDr
PDF	https://openreview.net/pdf?id=HJxhWa4KDr
PWC	https://paperswithcode.com/paper/mmd-gan-with-random-forest-kernels
Repo
Framework
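
For readers unfamiliar with the MMD quantity that the kernel plugs into, here is a minimal sketch of the biased squared-MMD estimate between two samples. It uses a Gaussian kernel as a stand-in, not the random-forest kernel from the abstract; the function names, the bandwidth and the toy data are all assumptions for illustration.

```python
from math import exp


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel; a stand-in for the paper's random-forest kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, kernel=gaussian_kernel):
    """Biased (V-statistic) estimate of squared MMD between two samples."""
    k_xx = sum(kernel(a, b) for a in xs for b in xs) / (len(xs) ** 2)
    k_yy = sum(kernel(a, b) for a in ys for b in ys) / (len(ys) ** 2)
    k_xy = sum(kernel(a, b) for a in xs for b in ys) / (len(xs) * len(ys))
    return k_xx + k_yy - 2.0 * k_xy


# Toy 2-D samples (hypothetical): one generated set near the real data,
# one far from it.
real = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0)]
fake_close = [(0.1, 0.0), (0.0, 0.1), (-0.2, 0.1)]
fake_far = [(3.0, 3.1), (2.9, 3.0), (3.2, 2.8)]

print(mmd_squared(real, fake_close))
print(mmd_squared(real, fake_far))
```

The estimate is larger for the distant sample than for the nearby one, which is the signal an MMD GAN uses to compare generated and real distributions. Swapping in another kernel only requires passing a different function as the kernel argument.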

Locally Constant Networks


Title	Locally Constant Networks
Authors	Anonymous
Abstract	We show how neural models can be used to realize piece-wise constant functions such as decision trees. Our approach builds on ReLU networks that are piece-wise linear and hence their associated gradients with respect to the inputs are locally constant. We formally establish the equivalence between the classes of locally constant networks and decision trees. Moreover, we highlight several advantageous properties of locally constant networks, including how they realize decision trees with parameter sharing across branching / leaves. Indeed, only $M$ neurons suffice to implicitly model an oblique decision tree with $2^M$ leaf nodes. The neural representation also enables us to adopt many tools developed for deep networks (e.g., DropConnect (Wan et al., 2013)) while implicitly training decision trees. We demonstrate that our method outperforms alternative techniques for training oblique decision trees in the context of molecular property classification and regression tasks.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=Bke8UR4FPB
PDF	https://openreview.net/pdf?id=Bke8UR4FPB
PWC	https://paperswithcode.com/paper/locally-constant-networks-1
Repo
Framework
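
The claim that the input gradients of a ReLU network are locally constant can be checked on a toy case. The sketch below assumes a single hidden layer, f(x) = sum_m v_m * relu(w_m . x + b_m), with hand-picked weights; it is not the architecture from the paper. Each activation pattern of the M hidden neurons fixes one gradient value, so there are at most 2^M distinct gradients, which play the role of leaves.

```python
def activation_pattern(x, weights, biases):
    """Which hidden ReLU units are active at input x (tuple of 0/1)."""
    return tuple(
        1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) + b > 0 else 0
        for w, b in zip(weights, biases)
    )


def input_gradient(x, weights, biases, out_weights):
    """Gradient of f(x) = sum_m v_m * relu(w_m . x + b_m) with respect to x."""
    pattern = activation_pattern(x, weights, biases)
    return tuple(
        sum(v * a * w[d] for v, a, w in zip(out_weights, pattern, weights))
        for d in range(len(x))
    )


# Hypothetical toy network: M = 2 hidden neurons on a 2-D input.
weights = [(1.0, 0.0), (0.0, 1.0)]
biases = [0.0, 0.0]
out_weights = [2.0, -1.0]

# Group sample points by their (locally constant) input gradient.
points = [(0.5, 0.5), (0.9, 0.1), (-0.5, 0.5), (0.5, -0.5), (-0.5, -0.5)]
leaves = {}
for p in points:
    grad = input_gradient(p, weights, biases, out_weights)
    leaves.setdefault(grad, []).append(p)

print(len(leaves))  # number of distinct gradient values among the points
```

Points that share an activation pattern, such as (0.5, 0.5) and (0.9, 0.1), land in the same group, while each sign change of a hidden pre-activation acts like an oblique split.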

Learning Long- and Short-Term User Literal-Preference with Multimodal Hierarchical Transformer Network for Personalized Image Caption


Title	Learning Long- and Short-Term User Literal-Preference with Multimodal Hierarchical Transformer Network for Personalized Image Caption
Authors	Wei Zhang, Yue Ying, Pan Lu, Hongyuan Zha
Abstract	Personalized image caption, a natural extension of the standard image caption task, requires generating brief image descriptions tailored to users’ writing style and traits, and is more practical for meeting users’ real demands. Only a few recent studies shed light on this crucial task and learn static user representations to capture their long-term literal-preference. However, this is insufficient to achieve satisfactory performance due to the intrinsic existence of not only long-term user literal-preference, but also short-term literal-preference which is associated with users’ recent states. To bridge this gap, we develop a novel multimodal hierarchical transformer network (MHTN) for personalized image caption in this paper. It learns short-term user literal-preference based on users’ recent captions through a short-term user encoder at the low level. At the high level, the multimodal encoder integrates target image representations with short-term literal-preference, as well as long-term literal-preference learned from user IDs. These two encoders enjoy the advantages of the powerful transformer networks. Extensive experiments on two real datasets show the effectiveness of considering two types of user literal-preference simultaneously and better performance over the state-of-the-art models.
Tasks	Image Captioning
Published	2020-02-04
URL	https://lupantech.github.io/papers/aaai20_caption.pdf
PDF	https://lupantech.github.io/papers/aaai20_caption.pdf
PWC	https://paperswithcode.com/paper/learning-long-and-short-term-user-literal
Repo
Framework

Sparse and Structured Visual Attention


Title	Sparse and Structured Visual Attention
Authors	Anonymous
Abstract	Visual attention mechanisms have been widely used in image captioning models. In this paper, to better link the image structure with the generated text, we replace the traditional softmax attention mechanism by two alternative sparsity-promoting transformations: sparsemax and Total-Variation Sparse Attention (TVmax). With sparsemax, we obtain sparse attention weights, selecting relevant features. In order to promote sparsity and encourage fusing of the related adjacent spatial locations, we propose TVmax. By selecting relevant groups of features, the TVmax transformation improves interpretability. We present results on the Microsoft COCO and Flickr30k datasets, obtaining gains in comparison to softmax. TVmax outperforms the other compared attention mechanisms in terms of human-rated caption quality and attention relevance.
Tasks	Image Captioning
Published	2020-01-01
URL	https://openreview.net/forum?id=r1e8WTEYPB
PDF	https://openreview.net/pdf?id=r1e8WTEYPB
PWC	https://paperswithcode.com/paper/sparse-and-structured-visual-attention
Repo
Framework
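
To make the contrast with softmax concrete, the following is a minimal sketch of the sparsemax transformation mentioned in the abstract above: the Euclidean projection of a score vector onto the probability simplex, computed with the usual sort-and-threshold procedure. TVmax is not covered here, and the function name and example scores are illustrative only.

```python
def sparsemax(z):
    """Project the score vector z onto the probability simplex."""
    z_sorted = sorted(z, reverse=True)
    cumulative = 0.0
    support_size = 0
    support_sum = 0.0
    for k, value in enumerate(z_sorted, start=1):
        cumulative += value
        # The k largest scores stay in the support while this holds.
        if 1.0 + k * value > cumulative:
            support_size = k
            support_sum = cumulative
    # Threshold chosen so that the clipped scores sum to one.
    tau = (support_sum - 1.0) / support_size
    return [max(v - tau, 0.0) for v in z]


attention = sparsemax([1.0, 0.0, -1.0])
print(attention)
```

Unlike softmax, which assigns strictly positive weight to every location, sparsemax can set low-scoring locations to exactly zero, so the attention concentrates on a subset of the features.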

A Causal View on Robustness of Neural Networks


Title	A Causal View on Robustness of Neural Networks
Authors	Anonymous
Abstract	We present a causal view on the robustness of neural networks against input manipulations, which applies not only to traditional classification tasks but also to general measurement data. Based on this view, we design a deep causal manipulation augmented model (deep CAMA) which explicitly models the manipulations of data as a cause to the observed effect variables. We further develop data augmentation and test-time fine-tuning methods to improve deep CAMA’s robustness. When compared with discriminative deep neural networks, our proposed model shows superior robustness against unseen manipulations. As a by-product, our model achieves disentangled representation which separates the representation of manipulations from those of other latent causes.
Tasks	Data Augmentation
Published	2020-01-01
URL	https://openreview.net/forum?id=Hkxvl0EtDH
PDF	https://openreview.net/pdf?id=Hkxvl0EtDH
PWC	https://paperswithcode.com/paper/a-causal-view-on-robustness-of-neural
Repo
Framework

Concise Multi-head Attention Models


Title	Concise Multi-head Attention Models
Authors	Anonymous
Abstract	The attention-based Transformer architecture has enabled significant advances in the field of natural language processing. In addition to new pre-training techniques, recent improvements crucially rely on working with a relatively larger embedding dimension for tokens. This leads to models that are prohibitively large to be employed in downstream tasks. In this paper we identify one of the important factors contributing to the large embedding size requirement. In particular, our analysis highlights that the scaling between the number of heads and the size of each head in the existing architectures gives rise to this limitation, which we further validate with our experiments. As a solution, we propose a new way to set the projection size in attention heads that allows us to train models with a relatively smaller embedding dimension, without sacrificing performance.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=r1eVXa4KvH
PDF	https://openreview.net/pdf?id=r1eVXa4KvH
PWC	https://paperswithcode.com/paper/concise-multi-head-attention-models
Repo
Framework

On Variational Learning of Controllable Representations for Text without Supervision


Title	On Variational Learning of Controllable Representations for Text without Supervision
Authors	Anonymous
Abstract	The variational autoencoder (VAE) has found success in modelling the manifold of natural images on certain datasets, allowing meaningful images to be generated while interpolating or extrapolating in the latent code space, but it is unclear whether similar capabilities are feasible for text considering its discrete nature. In this work, we investigate the reason why unsupervised learning of controllable representations fails for text. We find that traditional sequence VAEs can learn disentangled representations through their latent codes to some extent, but they often fail to properly decode when the latent factor is being manipulated, because the manipulated codes often land in holes or vacant regions in the aggregated posterior latent space, which the decoding network is not trained to process. Both as a validation of the explanation and as a fix to the problem, we propose to constrain the posterior mean to a learned probability simplex, and perform manipulation within this simplex. Our proposed method mitigates the latent vacancy problem and achieves the first success in unsupervised learning of controllable representations for text. Empirically, our method significantly outperforms unsupervised baselines and is competitive with strong supervised approaches on text style transfer. Furthermore, when switching the latent factor (e.g., topic) during a long sentence generation, our proposed framework can often complete the sentence in a seemingly natural way, a capability that has never been attempted by previous methods.
Tasks	Style Transfer, Text Style Transfer
Published	2020-01-01
URL	https://openreview.net/forum?id=Hkex2a4FPr
PDF	https://openreview.net/pdf?id=Hkex2a4FPr
PWC	https://paperswithcode.com/paper/on-variational-learning-of-controllable
Repo
Framework

SCALABLE OBJECT-ORIENTED SEQUENTIAL GENERATIVE MODELS


Title	SCALABLE OBJECT-ORIENTED SEQUENTIAL GENERATIVE MODELS
Authors	Anonymous
Abstract	The most significant limitation of previous approaches to unsupervised learning for object-oriented representation is their scalability. Most of the previous models have been shown to work only on scenes with a few objects. In this paper, we propose SCALOR, a generative model for Scalable Sequential Object-Oriented Representation. With the spatially parallel attention and proposal-rejection mechanism, SCALOR is a scalable model that can deal with orders of magnitude more objects than previous models. Besides, we introduce the background model so that it can model the foreground objects and complex background together. In experiments on large-scale MNIST and DSprite datasets, we demonstrate that SCALOR can deal with scenes with nearly 100 objects as well as model complex natural background images. Importantly, using SCALOR, we demonstrate for the first time a result of modeling natural scenes with several tens of moving objects.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=SJxrKgStDH
PDF	https://openreview.net/pdf?id=SJxrKgStDH
PWC	https://paperswithcode.com/paper/scalable-object-oriented-sequential
Repo
Framework

Closed loop deep Bayesian inversion: Uncertainty driven acquisition for fast MRI


Title	Closed loop deep Bayesian inversion: Uncertainty driven acquisition for fast MRI
Authors	Anonymous
Abstract	This work proposes a closed-loop, uncertainty-driven adaptive sampling framework (CLUDAS) for accelerating magnetic resonance imaging (MRI) via deep Bayesian inversion. By closed-loop, we mean that our samples adapt in real time to the incoming data. To our knowledge, we demonstrate the first generative adversarial network (GAN) based framework for posterior estimation over a continuum of sampling rates of an inverse problem. We use this estimator to drive the sampling for accelerated MRI. Our numerical evidence demonstrates that the variance estimate strongly correlates with the expected MSE improvement for different acceleration rates even with few posterior samples. Moreover, the resulting masks bring improvements to the state-of-the-art fixed and active mask designing approaches across MSE, posterior variance and SSIM on real undersampled MRI scans.
Tasks
Published	2020-01-01
URL	https://openreview.net/forum?id=BJlPOlBKDB
PDF	https://openreview.net/pdf?id=BJlPOlBKDB
PWC	https://paperswithcode.com/paper/closed-loop-deep-bayesian-inversion
Repo
Framework
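
The closed-loop idea in the abstract above can be sketched as a greedy loop: draw posterior samples given the measurements so far, estimate the per-location variance, and measure next where the variance is highest. This is a simplified illustration, not the CLUDAS method itself. The greedy max-variance rule is an assumption, and toy_posterior is a hypothetical stand-in for the GAN-based posterior sampler.

```python
from statistics import pvariance


def next_measurement(posterior_samples, already_sampled):
    """Pick the unmeasured location with the largest posterior variance."""
    n_locations = len(posterior_samples[0])
    variances = [pvariance([s[i] for s in posterior_samples])
                 for i in range(n_locations)]
    candidates = [i for i in range(n_locations) if i not in already_sampled]
    return max(candidates, key=lambda i: variances[i])


def closed_loop(sample_posterior, budget):
    """Greedy acquisition: re-estimate uncertainty after every measurement."""
    sampled = set()
    for _ in range(budget):
        samples = sample_posterior(sampled)
        sampled.add(next_measurement(samples, sampled))
    return sampled


def toy_posterior(sampled):
    """Hypothetical sampler: measured locations are known exactly (zero
    spread); unmeasured ones vary with a fixed, location-specific spread."""
    spreads = [0.1, 0.5, 0.3, 0.9]
    return [[0.0 if i in sampled else sign * spreads[i] for i in range(4)]
            for sign in (-1.0, 1.0)]


print(closed_loop(toy_posterior, budget=2))
```

In a real setting the sampler would be conditioned on the acquired data, so the variance map changes after each measurement and the acquisition order adapts to the scan.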

Mutual Information Gradient Estimation for Representation Learning


Title	Mutual Information Gradient Estimation for Representation Learning
Authors	Anonymous
Abstract	Mutual information (MI) plays an important role in representation learning. However, MI is unfortunately intractable in continuous and high-dimensional settings. Recent advances establish tractable and scalable MI estimators to discover useful representations. However, most existing methods are not capable of providing accurate, low-variance estimates of MI when the MI is large. We argue that estimating gradients of MI is more appealing for representation learning than directly estimating MI due to the difficulty of estimating MI. Therefore, we propose the Mutual Information Gradient Estimator (MIGE) for representation learning based on score estimation of implicit distributions. It exhibits a tight and smooth gradient estimation of MI in the high-dimensional and large-MI setting. We expand the applications of MIGE in both unsupervised learning of deep representations based on InfoMax and the Information Bottleneck method. Experimental results indicate a remarkable performance improvement in learning useful representations.
Tasks	Representation Learning
Published	2020-01-01
URL	https://openreview.net/forum?id=ByxaUgrFvH
PDF	https://openreview.net/pdf?id=ByxaUgrFvH
PWC	https://paperswithcode.com/paper/mutual-information-gradient-estimation-for
Repo
Framework

Efficient Deep Representation Learning by Adaptive Latent Space Sampling


Title	Efficient Deep Representation Learning by Adaptive Latent Space Sampling
Authors	Anonymous
Abstract	Supervised deep learning requires a large number of training samples with annotations (e.g. label class for classification task, pixel- or voxel-wise label map for segmentation tasks), which are expensive and time-consuming to obtain. During the training of a deep neural network, the annotated samples are fed into the network in mini-batches, where they are often regarded as equally important. However, some of the samples may become less informative during training, as the magnitude of the gradient starts to vanish for these samples. In the meantime, other samples of higher utility or hardness may be in greater demand for the training process to proceed and require more exploitation. To address the challenges of expensive annotations and loss of sample informativeness, here we propose a novel training framework which adaptively selects informative samples that are fed to the training process. The adaptive selection or sampling is performed based on a hardness-aware strategy in the latent space constructed by a generative model. To evaluate the proposed training framework, we perform experiments on three different datasets, including MNIST and CIFAR-10 for the image classification task and a medical image dataset IVUS for a biophysical simulation task. On all three datasets, the proposed framework outperforms a random sampling method, which demonstrates the effectiveness of our framework.
Tasks	Image Classification, Representation Learning
Published	2020-01-01
URL	https://openreview.net/forum?id=Byl3HxBFwH
PDF	https://openreview.net/pdf?id=Byl3HxBFwH
PWC	https://paperswithcode.com/paper/efficient-deep-representation-learning-by
Repo
Framework

GraphQA: Protein Model Quality Assessment using Graph Convolutional Network


Title	GraphQA: Protein Model Quality Assessment using Graph Convolutional Network
Authors	Anonymous
Abstract	Proteins are ubiquitous molecules whose function in biological processes is determined by their 3D structure. Experimental identification of a protein’s structure can be time-consuming, prohibitively expensive, and not always possible. Alternatively, protein folding can be modeled using computational methods, which however are not guaranteed to always produce optimal results. GraphQA is a graph-based method to estimate the quality of protein models that possesses favorable properties such as representation learning, explicit modeling of both sequential and 3D structure, geometric invariance and computational efficiency. In this work, we demonstrate significant improvements over the state-of-the-art for both hand-engineered and representation-learning approaches, and carefully evaluate the individual contributions of GraphQA.
Tasks	Representation Learning
Published	2020-01-01
URL	https://openreview.net/forum?id=HyxgBerKwB
PDF	https://openreview.net/pdf?id=HyxgBerKwB
PWC	https://paperswithcode.com/paper/graphqa-protein-model-quality-assessment
Repo
Framework