February 2, 2020


Paper Group AWR 54


Distribution-Aware Coordinate Representation for Human Pose Estimation


Title	Distribution-Aware Coordinate Representation for Human Pose Estimation
Authors	Feng Zhang, Xiatian Zhu, Hanbin Dai, Mao Ye, Ce Zhu
Abstract	While being the de facto standard coordinate representation in human pose estimation, heatmap is never systematically investigated in the literature, to our best knowledge. This work fills this gap by studying the coordinate representation with a particular focus on the heatmap. Interestingly, we found that the process of decoding the predicted heatmaps into the final joint coordinates in the original image space is surprisingly significant for human pose estimation performance, which nevertheless was not recognised before. In light of the discovered importance, we further probe the design limitations of the standard coordinate decoding method widely used by existing methods, and propose a more principled distribution-aware decoding method. Meanwhile, we improve the standard coordinate encoding process (i.e. transforming ground-truth coordinates to heatmaps) by generating accurate heatmap distributions for unbiased model training. Taking the two together, we formulate a novel Distribution-Aware coordinate Representation of Keypoint (DARK) method. Serving as a model-agnostic plug-in, DARK significantly improves the performance of a variety of state-of-the-art human pose estimation models. Extensive experiments show that DARK yields the best results on two common benchmarks, MPII and COCO, consistently validating the usefulness and effectiveness of our novel coordinate representation idea.
Tasks	Keypoint Detection, Multi-Person Pose Estimation, Pose Estimation
Published	2019-10-14
URL	https://arxiv.org/abs/1910.06278v1
PDF	https://arxiv.org/pdf/1910.06278v1.pdf
PWC	https://paperswithcode.com/paper/distribution-aware-coordinate-representation
Repo	https://github.com/ShanghaiTechCVDL/Weekly_Group_Meeting_Paper_List
Framework	none
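
The encoding and decoding steps named in the abstract can be illustrated in one dimension. The following is a minimal sketch, not the authors' implementation: it assumes a Gaussian heatmap, and the function names, the `sigma` value, and the second-order Taylor refinement on the log-heatmap are illustrative choices for what a distribution-aware decoder could look like.

```python
import math

def encode_heatmap(mu, length, sigma=2.0):
    """Encode a (possibly sub-pixel) coordinate mu as a 1D Gaussian heatmap."""
    return [math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) for x in range(length)]

def decode_argmax(heatmap):
    """Standard decoding: index of the maximal activation (integer precision)."""
    return max(range(len(heatmap)), key=heatmap.__getitem__)

def decode_refined(heatmap):
    """Refine the argmax with a second-order Taylor step on the log-heatmap.

    Assumption: the heatmap is roughly Gaussian near its peak, so its
    logarithm is roughly quadratic and the step lands on the mode.
    """
    m = decode_argmax(heatmap)
    if m == 0 or m == len(heatmap) - 1:
        return float(m)  # a neighbour is missing, so no refinement is possible
    left, mid, right = (math.log(max(heatmap[i], 1e-12)) for i in (m - 1, m, m + 1))
    grad = (right - left) / 2.0        # central first difference
    hess = right - 2.0 * mid + left    # central second difference
    if hess >= 0:
        return float(m)                # not a proper peak, keep the argmax
    return m - grad / hess
```

For a heatmap encoded at 10.3, `decode_argmax` returns 10 while `decode_refined` recovers roughly 10.3, which shows why the decoding step can matter for localisation accuracy.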

Hierarchical Graph Pooling with Structure Learning


Title	Hierarchical Graph Pooling with Structure Learning
Authors	Zhen Zhang, Jiajun Bu, Martin Ester, Jianfeng Zhang, Chengwei Yao, Zhi Yu, Can Wang
Abstract	Graph Neural Networks (GNNs), which generalize deep neural networks to graph-structured data, have drawn considerable attention and achieved state-of-the-art performance in numerous graph related tasks. However, existing GNN models mainly focus on designing graph convolution operations. The graph pooling (or downsampling) operations, that play an important role in learning hierarchical representations, are usually overlooked. In this paper, we propose a novel graph pooling operator, called Hierarchical Graph Pooling with Structure Learning (HGP-SL), which can be integrated into various graph neural network architectures. HGP-SL incorporates graph pooling and structure learning into a unified module to generate hierarchical representations of graphs. More specifically, the graph pooling operation adaptively selects a subset of nodes to form an induced subgraph for the subsequent layers. To preserve the integrity of graph’s topological information, we further introduce a structure learning mechanism to learn a refined graph structure for the pooled graph at each layer. By combining HGP-SL operator with graph neural networks, we perform graph level representation learning with focus on graph classification task. Experimental results on six widely used benchmarks demonstrate the effectiveness of our proposed model.
Tasks	Graph Classification, Representation Learning
Published	2019-11-14
URL	https://arxiv.org/abs/1911.05954v3
PDF	https://arxiv.org/pdf/1911.05954v3.pdf
PWC	https://paperswithcode.com/paper/hierarchical-graph-pooling-with-structure
Repo	https://github.com/cszhangzhen/HGP-SL
Framework	pytorch
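
The node-selection part of the pooling described above can be sketched as follows. This is an illustration under assumptions, not the HGP-SL code: the per-node `scores` are taken as given (the paper's scoring function and its structure learning step are not reproduced), and `topk_pool` and `ratio` are hypothetical names.

```python
import math

def topk_pool(scores, edges, ratio=0.5):
    """Keep the highest-scoring nodes and return the induced subgraph.

    scores: one importance score per node (assumed to be given).
    edges:  list of (u, v) node-index pairs.
    Returns the kept node indices and the edges re-indexed to the pooled graph.
    """
    k = max(1, math.ceil(ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])
    remap = {old: new for new, old in enumerate(keep)}
    # An edge survives only if both of its endpoints were kept.
    pooled_edges = [(remap[u], remap[v]) for u, v in edges if u in remap and v in remap]
    return keep, pooled_edges
```

Edges whose endpoints are both dropped or split across the cut disappear, which is the loss of topological information that the abstract's structure learning mechanism is meant to compensate for.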

Learning Discriminative Model Prediction for Tracking


Title	Learning Discriminative Model Prediction for Tracking
Authors	Goutam Bhat, Martin Danelljan, Luc Van Gool, Radu Timofte
Abstract	The current strive towards end-to-end trainable computer vision systems imposes major challenges for the task of visual tracking. In contrast to most other vision problems, tracking requires the learning of a robust target-specific appearance model online, during the inference stage. To be end-to-end trainable, the online learning of the target model thus needs to be embedded in the tracking architecture itself. Due to these difficulties, the popular Siamese paradigm simply predicts a target feature template. However, such a model possesses limited discriminative power due to its inability of integrating background information. We develop an end-to-end tracking architecture, capable of fully exploiting both target and background appearance information for target model prediction. Our architecture is derived from a discriminative learning loss by designing a dedicated optimization process that is capable of predicting a powerful model in only a few iterations. Furthermore, our approach is able to learn key aspects of the discriminative loss itself. The proposed tracker sets a new state-of-the-art on 6 tracking benchmarks, achieving an EAO score of 0.440 on VOT2018, while running at over 40 FPS.
Tasks	Visual Object Tracking, Visual Tracking
Published	2019-04-15
URL	http://arxiv.org/abs/1904.07220v1
PDF	http://arxiv.org/pdf/1904.07220v1.pdf
PWC	https://paperswithcode.com/paper/190407220
Repo	https://github.com/visionml/pytracking
Framework	pytorch

Bidirectional Attentive Memory Networks for Question Answering over Knowledge Bases


Title	Bidirectional Attentive Memory Networks for Question Answering over Knowledge Bases
Authors	Yu Chen, Lingfei Wu, Mohammed J. Zaki
Abstract	When answering natural language questions over knowledge bases (KBs), different question components and KB aspects play different roles. However, most existing embedding-based methods for knowledge base question answering (KBQA) ignore the subtle inter-relationships between the question and the KB (e.g., entity types, relation paths and context). In this work, we propose to directly model the two-way flow of interactions between the questions and the KB via a novel Bidirectional Attentive Memory Network, called BAMnet. Requiring no external resources and only very few hand-crafted features, on the WebQuestions benchmark, our method significantly outperforms existing information-retrieval based methods, and remains competitive with (hand-crafted) semantic parsing based methods. Also, since we use attention mechanisms, our method offers better interpretability compared to other baselines.
Tasks	Information Retrieval, Knowledge Base Question Answering, Question Answering, Semantic Parsing
Published	2019-03-06
URL	https://arxiv.org/abs/1903.02188v3
PDF	https://arxiv.org/pdf/1903.02188v3.pdf
PWC	https://paperswithcode.com/paper/bidirectional-attentive-memory-networks-for
Repo	https://github.com/hugochan/BAMnet
Framework	none

Kervolutional Neural Networks


Title	Kervolutional Neural Networks
Authors	Chen Wang, Jianfei Yang, Lihua Xie, Junsong Yuan
Abstract	Convolutional neural networks (CNNs) have enabled the state-of-the-art performance in many computer vision tasks. However, little effort has been devoted to establishing convolution in non-linear space. Existing works mainly leverage on the activation layers, which can only provide point-wise non-linearity. To solve this problem, a new operation, kervolution (kernel convolution), is introduced to approximate complex behaviors of human perception systems leveraging on the kernel trick. It generalizes convolution, enhances the model capacity, and captures higher order interactions of features, via patch-wise kernel functions, but without introducing additional parameters. Extensive experiments show that kervolutional neural networks (KNN) achieve higher accuracy and faster convergence than baseline CNN.
Tasks
Published	2019-04-08
URL	https://arxiv.org/abs/1904.03955v2
PDF	https://arxiv.org/pdf/1904.03955v2.pdf
PWC	https://paperswithcode.com/paper/kervolutional-neural-networks
Repo	https://github.com/ryanaleksander/kernel-convolution
Framework	pytorch
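
To make the idea of a patch-wise kernel function concrete, here is a minimal 1D sketch. It assumes a polynomial kernel with hypothetical parameters `c` and `degree`; the function names are illustrative and are not taken from the linked repository.

```python
def linear_kernel(patch, weights):
    """Inner product: with this kernel, kervolution is ordinary convolution."""
    return sum(p * w for p, w in zip(patch, weights))

def polynomial_kernel(patch, weights, c=1.0, degree=2):
    """Polynomial kernel (x . w + c) ** degree applied to one patch."""
    return (linear_kernel(patch, weights) + c) ** degree

def kervolve1d(signal, weights, kernel):
    """Slide the filter over the signal and apply the kernel to every patch."""
    k = len(weights)
    return [kernel(signal[i:i + k], weights) for i in range(len(signal) - k + 1)]
```

Swapping `linear_kernel` for `polynomial_kernel` changes the response without adding any learnable weights, in line with the abstract's claim that no additional parameters are introduced.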

Consistency-Aware Recommendation for User-Generated ItemList Continuation


Title	Consistency-Aware Recommendation for User-Generated ItemList Continuation
Authors	Yun He, Yin Zhang, Weiwen Liu, James Caverlee
Abstract	User-generated item lists are popular on many platforms. Examples include video-based playlists on YouTube, image-based lists (or “boards”) on Pinterest, book-based lists on Goodreads, and answer-based lists on question-answer forums like Zhihu. As users create these lists, a common challenge is in identifying what items to curate next. Some lists are organized around particular genres or topics, while others are seemingly incoherent, reflecting individual preferences for what items belong together. Furthermore, this heterogeneity in item consistency may vary from platform to platform, and from sub-community to sub-community. Hence, this paper proposes a generalizable approach for user-generated item list continuation. Complementary to methods that exploit specific content patterns (e.g., as in song-based playlists that rely on audio features), the proposed approach models the consistency of item lists based on human curation patterns, and so can be deployed across a wide range of varying item types (e.g., videos, images, books). A key contribution is in intelligently combining two preference models via a novel consistency-aware gating network - a general user preference model that captures a user’s overall interests, and a current preference priority model that captures a user’s current (as of the most recent item) interests. In this way, the proposed consistency-aware recommender can dynamically adapt as user preferences evolve. Evaluation over four datasets (of songs, books, and answers) confirms these observations and demonstrates the effectiveness of the proposed model versus state-of-the-art alternatives. Further, all code and data are available at https://github.com/heyunh2015/ListContinuation_WSDM2020.
Tasks
Published	2019-12-30
URL	https://arxiv.org/abs/1912.13031v1
PDF	https://arxiv.org/pdf/1912.13031v1.pdf
PWC	https://paperswithcode.com/paper/consistency-aware-recommendation-for-user
Repo	https://github.com/heyunh2015/ListContinuation_WSDM2020
Framework	tf

Diachronic Embedding for Temporal Knowledge Graph Completion


Title	Diachronic Embedding for Temporal Knowledge Graph Completion
Authors	Rishab Goel, Seyed Mehran Kazemi, Marcus Brubaker, Pascal Poupart
Abstract	Knowledge graphs (KGs) typically contain temporal facts indicating relationships among entities at different times. Due to their incompleteness, several approaches have been proposed to infer new facts for a KG based on the existing ones, a problem known as KG completion. KG embedding approaches have proved effective for KG completion; however, they have been developed mostly for static KGs. Developing temporal KG embedding models is an increasingly important problem. In this paper, we build novel models for temporal KG completion through equipping static models with a diachronic entity embedding function which provides the characteristics of entities at any point in time. This is in contrast to the existing temporal KG embedding approaches where only static entity features are provided. The proposed embedding function is model-agnostic and can be potentially combined with any static model. We prove that combining it with SimplE, a recent model for static KG embedding, results in a fully expressive model for temporal KG completion. Our experiments indicate the superiority of our proposal compared to existing baselines.
Tasks	Knowledge Graph Completion, Knowledge Graphs
Published	2019-07-06
URL	https://arxiv.org/abs/1907.03143v1
PDF	https://arxiv.org/pdf/1907.03143v1.pdf
PWC	https://paperswithcode.com/paper/diachronic-embedding-for-temporal-knowledge
Repo	https://github.com/BorealisAI/DE-SimplE
Framework	none
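
A diachronic embedding is a function of time rather than a fixed vector. The sketch below shows one possible form, assumed for illustration: some coordinates vary as a sinusoid of the timestamp and the rest stay static. The parameter names (`amp`, `freq`, `phase`) and the choice of a sine activation are assumptions, not details confirmed by the abstract.

```python
import math

def diachronic_embedding(static, amp, freq, phase, t):
    """Return an entity embedding evaluated at time t.

    static:           time-independent coordinates.
    amp, freq, phase: per-coordinate parameters of the temporal part
                      (hypothetical names), which would be learned in practice.
    """
    temporal = [a * math.sin(w * t + b) for a, w, b in zip(amp, freq, phase)]
    return temporal + list(static)
```

Because the output is an ordinary vector for any given `t`, it can be passed to a static scoring function unchanged, which is the sense in which the abstract calls the approach model-agnostic.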

A physics-aware, probabilistic machine learning framework for coarse-graining high-dimensional systems in the Small Data regime


Title	A physics-aware, probabilistic machine learning framework for coarse-graining high-dimensional systems in the Small Data regime
Authors	Constantin Grigo, Phaedon-Stelios Koutsourelakis
Abstract	The automated construction of coarse-grained models represents a pivotal component in computer simulation of physical systems and is a key enabler in various analysis and design tasks related to uncertainty quantification. Pertinent methods are severely inhibited by the high-dimension of the parametric input and the limited number of training input/output pairs that can be generated when computationally demanding forward models are considered. Such cases are frequently encountered in the modeling of random heterogeneous media where the scale of the microstructure necessitates the use of high-dimensional random vectors and very fine discretizations of the governing equations. The present paper proposes a probabilistic Machine Learning framework that is capable of operating in the presence of Small Data by exploiting aspects of the physical structure of the problem as well as contextual knowledge. As a result, it can perform comparably well under extrapolative conditions. It unifies the tasks of dimensionality and model-order reduction through an encoder-decoder scheme that simultaneously identifies a sparse set of salient lower-dimensional microstructural features and calibrates an inexpensive, coarse-grained model which is predictive of the output. Information loss is accounted for and quantified in the form of probabilistic predictive estimates. The learning engine is based on Stochastic Variational Inference. We demonstrate how the variational objectives can be used not only to train the coarse-grained model, but also to suggest refinements that lead to improved predictions.
Tasks
Published	2019-02-11
URL	https://arxiv.org/abs/1902.03968v2
PDF	https://arxiv.org/pdf/1902.03968v2.pdf
PWC	https://paperswithcode.com/paper/a-physics-aware-probabilistic-machine
Repo	https://github.com/congriUQ/physics_aware_surrogate
Framework	none

Domain-Specific Embedding Network for Zero-Shot Recognition


Title	Domain-Specific Embedding Network for Zero-Shot Recognition
Authors	Shaobo Min, Hantao Yao, Hongtao Xie, Zheng-Jun Zha, Yongdong Zhang
Abstract	Zero-Shot Learning (ZSL) seeks to recognize a sample from either seen or unseen domain by projecting the image data and semantic labels into a joint embedding space. However, most existing methods directly adapt a well-trained projection from one domain to another, thereby ignoring the serious bias problem caused by domain differences. To address this issue, we propose a novel Domain-Specific Embedding Network (DSEN) that can apply specific projections to different domains for unbiased embedding, as well as several domain constraints. In contrast to previous methods, the DSEN decomposes the domain-shared projection function into one domain-invariant and two domain-specific sub-functions to explore the similarities and differences between two domains. To prevent the two specific projections from breaking the semantic relationship, a semantic reconstruction constraint is proposed by applying the same decoder function to them in a cycle consistency way. Furthermore, a domain division constraint is developed to directly penalize the margin between real and pseudo image features in respective seen and unseen domains, which can enlarge the inter-domain difference of visual features. Extensive experiments on four public benchmarks demonstrate the effectiveness of DSEN with an average of 9.2% improvement in terms of harmonic mean. The code is available at https://github.com/mboboGO/DSEN-for-GZSL.
Tasks	Zero-Shot Learning
Published	2019-08-12
URL	https://arxiv.org/abs/1908.04174v1
PDF	https://arxiv.org/pdf/1908.04174v1.pdf
PWC	https://paperswithcode.com/paper/domain-specific-embedding-network-for-zero
Repo	https://github.com/mboboGO/DSEN-for-GZSL
Framework	pytorch
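
The harmonic mean mentioned in the abstract is the usual summary metric in generalized zero-shot learning, combining accuracy on seen and unseen classes. A small sketch follows; the function name and the example accuracies are made up for illustration and are not results from the paper.

```python
def harmonic_mean(seen_acc, unseen_acc):
    """Harmonic mean H = 2 * S * U / (S + U) of seen and unseen accuracy."""
    if seen_acc + unseen_acc == 0:
        return 0.0  # avoid division by zero when both accuracies are zero
    return 2.0 * seen_acc * unseen_acc / (seen_acc + unseen_acc)
```

A model with 0.8 seen accuracy and 0.2 unseen accuracy scores 0.32, well below the arithmetic mean of 0.5, so the metric penalises a model that is biased towards the seen domain.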

Lower Bounds on Adversarial Robustness from Optimal Transport


Title	Lower Bounds on Adversarial Robustness from Optimal Transport
Authors	Arjun Nitin Bhagoji, Daniel Cullina, Prateek Mittal
Abstract	While progress has been made in understanding the robustness of machine learning classifiers to test-time adversaries (evasion attacks), fundamental questions remain unresolved. In this paper, we use optimal transport to characterize the minimum possible loss in an adversarial classification scenario. In this setting, an adversary receives a random labeled example from one of two classes, perturbs the example subject to a neighborhood constraint, and presents the modified example to the classifier. We define an appropriate cost function such that the minimum transportation cost between the distributions of the two classes determines the minimum 0-1 loss for any classifier. When the classifier comes from a restricted hypothesis class, the optimal transportation cost provides a lower bound. We apply our framework to the case of Gaussian data with norm-bounded adversaries and explicitly show matching bounds for the classification and transport problems as well as the optimality of linear classifiers. We also characterize the sample complexity of learning in this setting, deriving and extending previously known results as a special case. Finally, we use our framework to study the gap between the optimal classification performance possible and that currently achieved by state-of-the-art robustly trained neural networks for datasets of interest, namely, MNIST, Fashion MNIST and CIFAR-10.
Tasks
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12272v2
PDF	https://arxiv.org/pdf/1909.12272v2.pdf
PWC	https://paperswithcode.com/paper/lower-bounds-on-adversarial-robustness-from
Repo	https://github.com/inspire-group/robustness-via-transport
Framework	tf

A New Benchmark and Approach for Fine-grained Cross-media Retrieval


Title	A New Benchmark and Approach for Fine-grained Cross-media Retrieval
Authors	Xiangteng He, Yuxin Peng, Liu Xie
Abstract	Cross-media retrieval is to return the results of various media types corresponding to the query of any media type. Existing researches generally focus on coarse-grained cross-media retrieval. When users submit an image of “Slaty-backed Gull” as a query, coarse-grained cross-media retrieval treats it as “Bird”, so that users can only get the results of “Bird”, which may include other bird species with similar appearance (image and video), descriptions (text) or sounds (audio), such as “Herring Gull”. Such coarse-grained cross-media retrieval is not consistent with human lifestyle, where we generally have the fine-grained requirement of returning the exactly relevant results of “Slaty-backed Gull” instead of “Herring Gull”. However, few researches focus on fine-grained cross-media retrieval, which is a highly challenging and practical task. Therefore, in this paper, we first construct a new benchmark for fine-grained cross-media retrieval, which consists of 200 fine-grained subcategories of the “Bird”, and contains 4 media types, including image, text, video and audio. To the best of our knowledge, it is the first benchmark with 4 media types for fine-grained cross-media retrieval. Then, we propose a uniform deep model, namely FGCrossNet, which simultaneously learns 4 types of media without discriminative treatments. We jointly consider three constraints for better common representation learning: classification constraint ensures the learning of discriminative features, center constraint ensures the compactness characteristic of the features of the same subcategory, and ranking constraint ensures the sparsity characteristic of the features of different subcategories. Extensive experiments verify the usefulness of the new benchmark and the effectiveness of our FGCrossNet. They will be made available at https://github.com/PKU-ICST-MIPL/FGCrossNet_ACMMM2019.
Tasks	Representation Learning
Published	2019-07-10
URL	https://arxiv.org/abs/1907.04476v2
PDF	https://arxiv.org/pdf/1907.04476v2.pdf
PWC	https://paperswithcode.com/paper/a-new-benchmark-and-approach-for-fine-grained
Repo	https://github.com/PKU-ICST-MIPL/FGCrossNet_ACMMM2019
Framework	pytorch

A Restricted Black-box Adversarial Framework Towards Attacking Graph Embedding Models


Title	A Restricted Black-box Adversarial Framework Towards Attacking Graph Embedding Models
Authors	Heng Chang, Yu Rong, Tingyang Xu, Wenbing Huang, Honglei Zhang, Peng Cui, Wenwu Zhu, Junzhou Huang
Abstract	With the great success of graph embedding model on both academic and industry area, the robustness of graph embedding against adversarial attack inevitably becomes a central problem in graph learning domain. Regardless of the fruitful progress, most of the current works perform the attack in a white-box fashion: they need to access the model predictions and labels to construct their adversarial loss. However, the inaccessibility of model predictions in real systems makes the white-box attack impractical to real graph learning system. This paper promotes current frameworks in a more general and flexible sense – we demand to attack various kinds of graph embedding model with black-box driven. To this end, we begin by investigating the theoretical connections between graph signal processing and graph embedding models in a principled way and formulate the graph embedding model as a general graph signal process with corresponding graph filter. As such, a generalized adversarial attacker: GF-Attack is constructed by the graph filter and feature matrix. Instead of accessing any knowledge of the target classifiers used in graph embedding, GF-Attack performs the attack only on the graph filter in a black-box attack fashion. To validate the generalization of GF-Attack, we construct the attacker on four popular graph embedding models. Extensive experimental results validate the effectiveness of our attacker on several benchmark datasets. Particularly by using our attack, even small graph perturbations like one-edge flip is able to consistently make a strong attack in performance to different graph embedding models.
Tasks	Adversarial Attack, Graph Embedding, Representation Learning
Published	2019-08-04
URL	https://arxiv.org/abs/1908.01297v5
PDF	https://arxiv.org/pdf/1908.01297v5.pdf
PWC	https://paperswithcode.com/paper/the-general-black-box-attack-method-for-graph
Repo	https://github.com/SwiftieH/GFAttack
Framework	tf

Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation


Title	Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation
Authors	Chen-Yu Lee, Tanmay Batra, Mohammad Haris Baig, Daniel Ulbricht
Abstract	In this work, we connect two distinct concepts for unsupervised domain adaptation: feature distribution alignment between domains by utilizing the task-specific decision boundary and the Wasserstein metric. Our proposed sliced Wasserstein discrepancy (SWD) is designed to capture the natural notion of dissimilarity between the outputs of task-specific classifiers. It provides a geometrically meaningful guidance to detect target samples that are far from the support of the source and enables efficient distribution alignment in an end-to-end trainable fashion. In the experiments, we validate the effectiveness and genericness of our method on digit and sign recognition, image classification, semantic segmentation, and object detection.
Tasks	Domain Adaptation, Image Classification, Object Detection, Semantic Segmentation, Unsupervised Domain Adaptation
Published	2019-03-10
URL	http://arxiv.org/abs/1903.04064v1
PDF	http://arxiv.org/pdf/1903.04064v1.pdf
PWC	https://paperswithcode.com/paper/sliced-wasserstein-discrepancy-for
Repo	https://github.com/apple/ml-cvpr2019-swd
Framework	tf
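
A sliced Wasserstein distance between two sets of vectors can be estimated by projecting both sets onto random directions and comparing the sorted projections. The sketch below is a generic Monte Carlo estimate under assumptions (equal-sized sets, squared differences, a hypothetical `num_projections` parameter); it is not the code from the linked repository.

```python
import math
import random

def sliced_wasserstein(xs, ys, num_projections=64, seed=0):
    """Monte Carlo estimate of a squared sliced Wasserstein discrepancy.

    xs, ys: equal-length lists of equal-dimension vectors, for example the
            outputs of two classifiers on the same batch.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally sized sample sets")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(num_projections):
        # Draw a random direction on the unit sphere.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in direction))
        direction = [v / norm for v in direction]
        # In 1D, optimal transport simply matches the sorted projections.
        px = sorted(sum(a * b for a, b in zip(x, direction)) for x in xs)
        py = sorted(sum(a * b for a, b in zip(y, direction)) for y in ys)
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(px)
    return total / num_projections
```

Sorting is what makes each one-dimensional transport problem cheap, which is why slicing is attractive when the discrepancy has to be computed inside a training loop.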

Probabilistic Logic Neural Networks for Reasoning


Title	Probabilistic Logic Neural Networks for Reasoning
Authors	Meng Qu, Jian Tang
Abstract	Knowledge graph reasoning, which aims at predicting the missing facts through reasoning with the observed facts, is critical to many applications. Such a problem has been widely explored by traditional logic rule-based approaches and recent knowledge graph embedding methods. A principled logic rule-based approach is the Markov Logic Network (MLN), which is able to leverage domain knowledge with first-order logic and meanwhile handle their uncertainty. However, the inference of MLNs is usually very difficult due to the complicated graph structures. Different from MLNs, knowledge graph embedding methods (e.g. TransE, DistMult) learn effective entity and relation embeddings for reasoning, which are much more effective and efficient. However, they are unable to leverage domain knowledge. In this paper, we propose the probabilistic Logic Neural Network (pLogicNet), which combines the advantages of both methods. A pLogicNet defines the joint distribution of all possible triplets by using a Markov logic network with first-order logic, which can be efficiently optimized with the variational EM algorithm. In the E-step, a knowledge graph embedding model is used for inferring the missing triplets, while in the M-step, the weights of logic rules are updated based on both the observed and predicted triplets. Experiments on multiple knowledge graphs prove the effectiveness of pLogicNet over many competitive baselines.
Tasks	Graph Embedding, Knowledge Graph Embedding, Knowledge Graphs
Published	2019-06-20
URL	https://arxiv.org/abs/1906.08495v2
PDF	https://arxiv.org/pdf/1906.08495v2.pdf
PWC	https://paperswithcode.com/paper/probabilistic-logic-neural-networks-for
Repo	https://github.com/DeepGraphLearning/pLogicNet
Framework	pytorch

Graph Embedding on Biomedical Networks: Methods, Applications, and Evaluations


Title	Graph Embedding on Biomedical Networks: Methods, Applications, and Evaluations
Authors	Xiang Yue, Zhen Wang, Jingong Huang, Srinivasan Parthasarathy, Soheil Moosavinasab, Yungui Huang, Simon M. Lin, Wen Zhang, Ping Zhang, Huan Sun
Abstract	Graph embedding learning that aims to automatically learn low-dimensional node representations, has drawn increasing attention in recent years. To date, most recent graph embedding methods are evaluated on social and information networks and are not comprehensively studied on biomedical networks under systematic experiments and analyses. On the other hand, for a variety of biomedical network analysis tasks, traditional techniques such as matrix factorization (which can be seen as a type of graph embedding methods) have shown promising results, and hence there is a need to systematically evaluate the more recent graph embedding methods (e.g. random walk-based and neural network-based) in terms of their usability and potential to further the state-of-the-art. We select 11 representative graph embedding methods and conduct a systematic comparison on 3 important biomedical link prediction tasks: drug-disease association (DDA) prediction, drug-drug interaction (DDI) prediction, protein-protein interaction (PPI) prediction; and 2 node classification tasks: medical term semantic type classification, protein function prediction. Our experimental results demonstrate that the recent graph embedding methods achieve promising results and deserve more attention in the future biomedical graph analysis. Compared with three state-of-the-art methods for DDAs, DDIs and protein function predictions, the recent graph embedding methods achieve competitive performance without using any biological features and the learned embeddings can be treated as complementary representations for the biological features. By summarizing the experimental results, we provide general guidelines for properly selecting graph embedding methods and setting their hyper-parameters for different biomedical tasks.
Tasks	Graph Embedding, Link Prediction, Node Classification, Protein Function Prediction
Published	2019-06-12
URL	https://arxiv.org/abs/1906.05017v3
PDF	https://arxiv.org/pdf/1906.05017v3.pdf
PWC	https://paperswithcode.com/paper/graph-embedding-on-biomedical-networks
Repo	https://github.com/xiangyue9607/BioNEV
Framework	none