February 1, 2020

2989 words 15 mins read

Paper Group AWR 202

Hyper-Path-Based Representation Learning for Hyper-Networks. Advancing GraphSAGE with A Data-Driven Node Sampling. Point-to-Point Video Generation. Semi-Supervised Classification on Non-Sparse Graphs Using Low-Rank Graph Convolutional Networks. VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation. Handwritten Indic Character Re …

Hyper-Path-Based Representation Learning for Hyper-Networks

Title Hyper-Path-Based Representation Learning for Hyper-Networks
Authors Jie Huang, Xin Liu, Yangqiu Song
Abstract Network representation learning has attracted widespread interest in recent years. While most existing methods treat edges as pairwise relationships, only a few studies target hyper-networks to capture the more complicated tuplewise relationships among multiple nodes. A hyper-network is a network in which each edge, called a hyperedge, connects an arbitrary number of nodes. Unlike conventional networks, hyper-networks exhibit a certain degree of indecomposability: the nodes in a subset of a hyperedge may not possess a strong relationship. This is the main reason traditional algorithms fail to learn representations in hyper-networks by simply decomposing hyperedges into pairwise relationships. In this paper, we first define a metric that quantifies the degree of indecomposability of a hyper-network. We then propose a new concept called the hyper-path and design hyper-path-based random walks that preserve the structural information of hyper-networks according to the indecomposability analysis. A carefully designed algorithm, Hyper-gram, uses these random walks to capture both pairwise and tuplewise relationships across entire hyper-networks. Finally, we conduct extensive experiments on several real-world datasets covering link prediction and hyper-network reconstruction, and the results demonstrate the rationality, validity, and effectiveness of our methods compared with existing state-of-the-art models designed for conventional networks or hyper-networks.
Tasks Link Prediction, Representation Learning
Published 2019-08-24
URL https://arxiv.org/abs/1908.09152v2
PDF https://arxiv.org/pdf/1908.09152v2.pdf
PWC https://paperswithcode.com/paper/hyper-path-based-representation-learning-for
Repo https://github.com/HKUST-KnowComp/HPHG
Framework none
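To make the hyper-path idea concrete, here is a minimal, hypothetical Python sketch of a random walk that moves between nodes only through shared hyperedges. It is not the paper's Hyper-gram algorithm, which additionally constrains walks according to the indecomposability analysis and then trains a skip-gram-style model on them.

```python
import random

# Toy hyper-network: each hyperedge connects an arbitrary number of nodes.
hyperedges = [
    {"a", "b", "c"},
    {"c", "d"},
    {"b", "d", "e"},
]

# Index: node -> list of hyperedges containing it.
incident = {}
for edge in hyperedges:
    for node in edge:
        incident.setdefault(node, []).append(edge)

def hyper_path_walk(start, length, rng=random):
    """Random walk where each step first samples an incident hyperedge,
    then a next node inside that hyperedge (a simplified hyper-path)."""
    walk = [start]
    current = start
    for _ in range(length - 1):
        edge = rng.choice(incident[current])
        candidates = [n for n in edge if n != current]
        if not candidates:
            break
        current = rng.choice(candidates)
        walk.append(current)
    return walk

if __name__ == "__main__":
    random.seed(0)
    print(hyper_path_walk("a", 6))  # e.g. ['a', 'c', 'd', ...]
```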

Advancing GraphSAGE with A Data-Driven Node Sampling

Title Advancing GraphSAGE with A Data-Driven Node Sampling
Authors Jihun Oh, Kyunghyun Cho, Joan Bruna
Abstract As an efficient and scalable graph neural network, GraphSAGE has enabled an inductive capability for inferring unseen nodes or graphs by aggregating subsampled local neighborhoods and learning in a mini-batch gradient descent fashion. The neighborhood sampling used in GraphSAGE improves compute and memory efficiency when inferring a batch of target nodes with diverse degrees in parallel. Despite this advantage, the default uniform sampling suffers from high variance in training and inference, leading to sub-optimal accuracy. We propose a new data-driven sampling approach that reasons about the real-valued importance of a neighborhood with a non-linear regressor and uses that value as a criterion for subsampling neighborhoods. The regressor is learned with value-based reinforcement learning, and the implied importance of each combination of vertex and neighborhood is inductively extracted from the negative classification loss of GraphSAGE. As a result, on an inductive node classification benchmark over three datasets, our method improves on the uniform-sampling baseline and outperforms recent graph neural network variants in accuracy.
Tasks Node Classification
Published 2019-04-29
URL http://arxiv.org/abs/1904.12935v1
PDF http://arxiv.org/pdf/1904.12935v1.pdf
PWC https://paperswithcode.com/paper/advancing-graphsage-with-a-data-driven-node
Repo https://github.com/oj9040/GraphSAGE_RL
Framework tf
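A rough sketch of the core idea: subsample a node's neighborhood in proportion to a learned importance score rather than uniformly. The `toy_score` function below is a placeholder; the paper instead trains a non-linear regressor with value-based reinforcement learning on GraphSAGE's negative classification loss.

```python
import numpy as np

def subsample_neighbors(node_feat, neighbor_feats, k, score_fn, rng):
    """Keep k neighbors, sampled in proportion to an importance score
    instead of uniformly (as in default GraphSAGE)."""
    scores = np.array([score_fn(node_feat, nb) for nb in neighbor_feats])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    idx = rng.choice(len(neighbor_feats), size=min(k, len(neighbor_feats)),
                     replace=False, p=probs)
    return [neighbor_feats[i] for i in idx]

# Hypothetical stand-in scorer: dot-product similarity; the paper learns
# a non-linear regressor from the negative classification loss instead.
def toy_score(u, v):
    return float(np.dot(u, v))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    node = rng.normal(size=8)
    neighbors = [rng.normal(size=8) for _ in range(10)]
    picked = subsample_neighbors(node, neighbors, k=3, score_fn=toy_score, rng=rng)
    print(len(picked))  # 3
```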

Point-to-Point Video Generation

Title Point-to-Point Video Generation
Authors Tsun-Hsuan Wang, Yen-Chi Cheng, Chieh Hubert Lin, Hwann-Tzong Chen, Min Sun
Abstract While image manipulation has achieved tremendous breakthroughs in recent years (e.g., generating realistic faces), video generation is much less explored and harder to control, which limits its real-world applications. For instance, video editing requires temporal coherence across multiple clips and thus poses both start and end constraints within a video sequence. We introduce point-to-point video generation, which controls the generation process with two control points: the targeted start- and end-frames. The task is challenging since the model must not only generate a smooth transition of frames but also plan ahead to ensure that the generated end-frame conforms to the targeted end-frame for videos of various lengths. We propose to maximize a modified variational lower bound of the conditional data likelihood under a skip-frame training strategy. Our model generates sequences whose end-frame is consistent with the targeted end-frame without loss of quality or diversity. Extensive experiments are conducted on Stochastic Moving MNIST, Weizmann Human Action, and Human3.6M to evaluate the effectiveness of the proposed method. We demonstrate our method under a series of scenarios (e.g., dynamic-length generation), and the qualitative results showcase the potential and merits of point-to-point generation. For the project page, see https://zswang666.github.io/P2PVG-Project-Page/
Tasks Video Generation
Published 2019-04-05
URL https://arxiv.org/abs/1904.02912v2
PDF https://arxiv.org/pdf/1904.02912v2.pdf
PWC https://paperswithcode.com/paper/point-to-point-video-generation
Repo https://github.com/yccyenchicheng/p2pvg
Framework pytorch
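The following is a minimal PyTorch sketch of the conditioning interface suggested by the abstract: a decoder that consumes encodings of the targeted start- and end-frames plus the current (normalized) time step. It omits the paper's variational lower bound and skip-frame training entirely; all module names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class P2PDecoderSketch(nn.Module):
    """Toy decoder conditioned on start/end control points. The real model
    maximizes a modified variational lower bound with skip-frame training;
    this only illustrates the conditioning interface."""

    def __init__(self, frame_dim=64, latent_dim=16, hidden=128):
        super().__init__()
        self.encode = nn.Linear(frame_dim, hidden)
        # input: encoded start, encoded end, latent, normalized time index
        self.decode = nn.Sequential(
            nn.Linear(2 * hidden + latent_dim + 1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, frame_dim),
        )

    def forward(self, start_frame, end_frame, z, t, total_len):
        h_start = self.encode(start_frame)
        h_end = self.encode(end_frame)
        time = torch.full((start_frame.size(0), 1), t / max(total_len - 1, 1))
        return self.decode(torch.cat([h_start, h_end, z, time], dim=-1))

if __name__ == "__main__":
    model = P2PDecoderSketch()
    start, end = torch.randn(2, 64), torch.randn(2, 64)
    frames = [model(start, end, torch.randn(2, 16), t, total_len=10) for t in range(10)]
    print(len(frames), frames[0].shape)  # 10 torch.Size([2, 64])
```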

Semi-Supervised Classification on Non-Sparse Graphs Using Low-Rank Graph Convolutional Networks

Title Semi-Supervised Classification on Non-Sparse Graphs Using Low-Rank Graph Convolutional Networks
Authors Dominik Alfke, Martin Stoll
Abstract Graph Convolutional Networks (GCNs) have proven to be successful tools for semi-supervised learning on graph-based datasets. For sparse graphs, linear and polynomial filter functions have yielded impressive results. For large non-sparse graphs, however, network training and evaluation become prohibitively expensive. By introducing low-rank filters, we gain significant runtime acceleration and simultaneously improve accuracy. We further propose an architecture change mimicking techniques from Model Order Reduction in what we call a reduced-order GCN. Moreover, we show how our method can also be applied to hypergraph datasets and how hypergraph convolution can be implemented efficiently.
Tasks
Published 2019-05-24
URL https://arxiv.org/abs/1905.10224v1
PDF https://arxiv.org/pdf/1905.10224v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-classification-on-non-sparse
Repo https://github.com/dominikalfke/GCNModel
Framework tf
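A minimal NumPy sketch of a rank-r spectral filter applied as a GCN-style layer, assuming the leading eigenpairs of the graph Laplacian have been precomputed. The filter function and dimensions are illustrative and not the paper's exact parameterization.

```python
import numpy as np

def low_rank_gcn_layer(X, U_r, lam_r, W, filter_fn=lambda lam: 1.0 / (1.0 + lam)):
    """One GCN-style propagation with a rank-r spectral filter.
    U_r: (n, r) leading eigenvectors of the graph Laplacian,
    lam_r: (r,) corresponding eigenvalues.
    Cost per feature is O(n * r) instead of O(n^2) for a dense graph."""
    H = U_r @ (filter_fn(lam_r)[:, None] * (U_r.T @ X))   # low-rank spectral filtering
    return np.maximum(H @ W, 0.0)                          # linear map + ReLU

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, r, d_in, d_out = 200, 10, 16, 8
    # Hypothetical precomputed rank-r spectral factors of a dense graph Laplacian.
    U_r, _ = np.linalg.qr(rng.normal(size=(n, r)))
    lam_r = np.sort(rng.uniform(0, 2, size=r))
    X = rng.normal(size=(n, d_in))
    W = rng.normal(size=(d_in, d_out))
    print(low_rank_gcn_layer(X, U_r, lam_r, W).shape)  # (200, 8)
```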

VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation

Title VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation
Authors Manoj Kumar, Mohammad Babaeizadeh, Dumitru Erhan, Chelsea Finn, Sergey Levine, Laurent Dinh, Durk Kingma
Abstract Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions. However, a central challenge in video prediction is that the future is highly uncertain: a sequence of past observations of events can imply many possible futures. Although a number of recent works have studied probabilistic models that can represent uncertain futures, such models are either extremely expensive computationally as in the case of pixel-level autoregressive models, or do not directly optimize the likelihood of the data. To our knowledge, our work is the first to propose multi-frame video prediction with normalizing flows, which allows for direct optimization of the data likelihood, and produces high-quality stochastic predictions. We describe an approach for modeling the latent space dynamics, and demonstrate that flow-based generative models offer a viable and competitive approach to generative modelling of video.
Tasks Predict Future Video Frames, Video Generation, Video Prediction
Published 2019-03-04
URL https://arxiv.org/abs/1903.01434v3
PDF https://arxiv.org/pdf/1903.01434v3.pdf
PWC https://paperswithcode.com/paper/videoflow-a-flow-based-generative-model-for
Repo https://github.com/fatemehazimi990/Pytorch-VideoFlow
Framework pytorch
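Below is a toy conditional affine coupling layer of the kind flow-based video models build on: half of the variables are transformed with a scale and shift predicted from the other half plus a context vector summarizing past frames. The transform is invertible, so the data likelihood can be optimized directly. VideoFlow's actual multi-scale architecture is considerably richer; this is only a sketch.

```python
import torch
import torch.nn as nn

class CondAffineCoupling(nn.Module):
    """Toy conditional affine coupling: half of the current-frame variables are
    rescaled and shifted using a prediction from the other half and a context
    vector (e.g. features of past frames). Invertible, with tractable log-det."""

    def __init__(self, dim, ctx_dim, hidden=128):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half + ctx_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x, ctx):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        log_s, t = self.net(torch.cat([x1, ctx], dim=-1)).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)           # keep scales well-behaved
        y2 = x2 * torch.exp(log_s) + t
        log_det = log_s.sum(dim=-1)         # contribution to the exact likelihood
        return torch.cat([x1, y2], dim=-1), log_det

    def inverse(self, y, ctx):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        log_s, t = self.net(torch.cat([y1, ctx], dim=-1)).chunk(2, dim=-1)
        log_s = torch.tanh(log_s)
        x2 = (y2 - t) * torch.exp(-log_s)
        return torch.cat([y1, x2], dim=-1)

if __name__ == "__main__":
    layer = CondAffineCoupling(dim=32, ctx_dim=16)
    x, ctx = torch.randn(4, 32), torch.randn(4, 16)
    y, log_det = layer(x, ctx)
    print(torch.allclose(x, layer.inverse(y, ctx), atol=1e-5))  # True
```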

Handwritten Indic Character Recognition using Capsule Networks

Title Handwritten Indic Character Recognition using Capsule Networks
Authors Bodhisatwa Mandal, Suvam Dubey, Swarnendu Ghosh, Ritesh Sarkhel, Nibaran Das
Abstract Convolutional neural networks (CNNs) have become one of the primary algorithms for various computer vision tasks, and handwritten character recognition is a typical example of such a task that has attracted attention. CNN architectures such as LeNet and AlexNet have become very prominent over the last two decades; however, the spatial invariance of the different kernels has remained a prominent issue. With the introduction of capsule networks, kernels can work together in consensus with one another with the help of dynamic routing, which combines the individual opinions of multiple groups of kernels, called capsules, to achieve equivariance among kernels. In the current work, we implement capsule networks on handwritten Indic digit and character datasets to show their superiority over networks such as LeNet. Furthermore, we also show that they can boost the performance of other networks such as LeNet and AlexNet.
Tasks
Published 2019-01-01
URL http://arxiv.org/abs/1901.00166v1
PDF http://arxiv.org/pdf/1901.00166v1.pdf
PWC https://paperswithcode.com/paper/handwritten-indic-character-recognition-using
Repo https://github.com/prabhuomkar/hicr-capsnet
Framework tf
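For reference, a minimal NumPy sketch of the dynamic routing ("routing by agreement") step that capsules use to reach consensus; the shapes and iteration count are illustrative.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Capsule non-linearity: keeps direction, squashes norm into [0, 1)."""
    norm_sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq + eps)

def dynamic_routing(u_hat, n_iters=3):
    """Routing-by-agreement over prediction vectors.
    u_hat: (num_in_caps, num_out_caps, out_dim) predictions from lower capsules."""
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                              # routing logits
    for _ in range(n_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)    # coupling coefficients
        s = (c[:, :, None] * u_hat).sum(axis=0)                  # weighted sum per output capsule
        v = squash(s)                                            # output capsule vectors
        b = b + (u_hat * v[None, :, :]).sum(axis=-1)             # agreement update
    return v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u_hat = rng.normal(size=(6, 10, 16))   # 6 primary capsules -> 10 class capsules
    v = dynamic_routing(u_hat)
    print(v.shape, np.linalg.norm(v, axis=-1).max() < 1.0)  # (10, 16) True
```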

Safe Planning via Model Predictive Shielding

Title Safe Planning via Model Predictive Shielding
Authors Osbert Bastani
Abstract Reinforcement learning is a promising approach to synthesizing policies for robotics tasks. A key challenge is ensuring safety of the learned policy—e.g., that a walking robot does not fall over, or an autonomous car does not run into an obstacle. We focus on the setting where the dynamics are known, and the goal is to prove that a policy trained in simulation satisfies a given safety constraint. We build on an approach called shielding, which uses a backup policy to override the learned policy as needed to ensure safety. Our algorithm, called model predictive shielding (MPS), computes whether it is safe to use the learned policy on-the-fly instead of ahead-of-time. By doing so, our approach is computationally efficient, and can furthermore be used to ensure safety even in novel environments. Finally, we empirically demonstrate the benefits of our approach.
Tasks
Published 2019-05-25
URL https://arxiv.org/abs/1905.10691v2
PDF https://arxiv.org/pdf/1905.10691v2.pdf
PWC https://paperswithcode.com/paper/safe-reinforcement-learning-via-online
Repo https://github.com/obastani/model-predictive-shielding
Framework pytorch
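The shielding rule itself is compact enough to sketch: take the learned action only if, after applying it once, a finite-horizon rollout of the backup policy stays safe; otherwise use the backup policy immediately. The helpers below (`dynamics`, `is_safe`, both policies) are hypothetical problem-specific callables, and the exact recoverability check in the paper may differ.

```python
def mps_action(state, learned_policy, backup_policy, dynamics, is_safe, horizon=20):
    """Model predictive shielding (simplified): use the learned policy only if,
    after one learned step, a finite-horizon rollout of the backup policy
    remains safe; otherwise fall back to the backup policy now."""
    candidate = learned_policy(state)
    s = dynamics(state, candidate)
    recoverable = is_safe(s)
    for _ in range(horizon):
        if not recoverable:
            break
        s = dynamics(s, backup_policy(s))
        recoverable = is_safe(s)
    return candidate if recoverable else backup_policy(state)

if __name__ == "__main__":
    # Toy 1-D example: stay inside |x| <= 1; the backup policy pushes toward 0.
    dynamics = lambda x, u: x + u
    is_safe = lambda x: abs(x) <= 1.0
    learned = lambda x: 0.4          # hypothetical aggressive learned action
    backup = lambda x: -0.5 * x      # stabilizing backup controller
    print(mps_action(0.8, learned, backup, dynamics, is_safe))  # -0.4: learned action overridden
```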

EDIT: Exemplar-Domain Aware Image-to-Image Translation

Title EDIT: Exemplar-Domain Aware Image-to-Image Translation
Authors Yuanbin Fu, Jiayi Ma, Lin Ma, Xiaojie Guo
Abstract Image-to-image translation converts an image of a certain style into another of the target style while preserving the content. A desirable translator should be capable of generating diverse results in a controllable (many-to-many) fashion. To this end, we design a novel generative adversarial network, namely the exemplar-domain aware image-to-image translator (EDIT for short). The principle behind it is that, for images from multiple domains, the content features can be obtained by a uniform extractor, while (re-)stylization is achieved by mapping the extracted features to different purposes (domains and exemplars). The generator of our EDIT comprises blocks configured partly by shared parameters and partly by varied parameters exported by an exemplar-domain aware parameter network. In addition, a discriminator is used during the training phase to guarantee that the output satisfies the distribution of the target domain. Our EDIT can flexibly and effectively work on multiple domains and arbitrary exemplars in a unified, neat model. We conduct experiments to show the efficacy of our design and reveal its advantages over other state-of-the-art methods both quantitatively and qualitatively.
Tasks Image-to-Image Translation
Published 2019-11-24
URL https://arxiv.org/abs/1911.10520v1
PDF https://arxiv.org/pdf/1911.10520v1.pdf
PWC https://paperswithcode.com/paper/edit-exemplar-domain-aware-image-to-image
Repo https://github.com/ForawardStar/EDIT
Framework pytorch
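A hypothetical sketch of the "varied-parameter" idea: a small parameter network maps an exemplar/domain embedding to the weights of a convolution, so stylization changes with the exemplar while content blocks remain shared. Names, sizes, and the per-sample weight generation loop are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExemplarDomainConv(nn.Module):
    """Toy varied-parameter block: a parameter network produces the kernel and
    bias of a 3x3 conv from an exemplar/domain embedding."""

    def __init__(self, channels, embed_dim):
        super().__init__()
        self.channels = channels
        n_weights = channels * channels * 3 * 3
        self.param_net = nn.Sequential(
            nn.Linear(embed_dim, 256), nn.ReLU(),
            nn.Linear(256, n_weights + channels),   # conv kernel + bias
        )

    def forward(self, x, embed):
        # One embedding per sample; weights generated sample-by-sample for clarity.
        outs = []
        for i in range(x.size(0)):
            params = self.param_net(embed[i])
            w = params[:-self.channels].view(self.channels, self.channels, 3, 3)
            b = params[-self.channels:]
            outs.append(F.conv2d(x[i:i + 1], w, b, padding=1))
        return torch.cat(outs, dim=0)

if __name__ == "__main__":
    block = ExemplarDomainConv(channels=8, embed_dim=32)
    content = torch.randn(2, 8, 16, 16)
    style_embed = torch.randn(2, 32)    # hypothetical exemplar+domain embedding
    print(block(content, style_embed).shape)  # torch.Size([2, 8, 16, 16])
```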

SVGD: A Virtual Gradients Descent Method for Stochastic Optimization

Title SVGD: A Virtual Gradients Descent Method for Stochastic Optimization
Authors Zheng Li, Shi Shu
Abstract Inspired by dynamic programming, we propose the Stochastic Virtual Gradient Descent (SVGD) algorithm, in which the virtual gradient is defined via the computational graph and automatic differentiation. The method is computationally efficient and has low memory requirements. We also analyze the theoretical convergence properties and implementation of the algorithm. Experimental results on multiple datasets and network models show that SVGD has advantages over other stochastic optimization methods.
Tasks Stochastic Optimization
Published 2019-07-09
URL https://arxiv.org/abs/1907.04021v2
PDF https://arxiv.org/pdf/1907.04021v2.pdf
PWC https://paperswithcode.com/paper/svgd-a-virtual-gradients-descent-method-for
Repo https://github.com/LizhengMathAi/svgd
Framework tf

Super-resolution of Omnidirectional Images Using Adversarial Learning

Title Super-resolution of Omnidirectional Images Using Adversarial Learning
Authors Cagri Ozcinar, Aakanksha Rana, Aljosa Smolic
Abstract An omnidirectional image (ODI) enables viewers to look in every direction from a fixed point through a head-mounted display, providing an immersive experience compared to that of a standard image. Designing immersive virtual reality systems with ODIs is challenging as they require high-resolution content. In this paper, we study super-resolution for ODIs and propose an improved generative adversarial network-based model that is optimized to handle the artifacts obtained in the spherical observational space. Specifically, we propose to use a fast PatchGAN discriminator, as it needs fewer parameters and improves the super-resolution at a fine scale. We also explore generative models with adversarial learning by introducing a spherical-content specific loss function, called 360-SS. To train and test the performance of our proposed model, we prepare a dataset of 4500 ODIs. Our results demonstrate the efficacy of the proposed method and identify new challenges in ODI super-resolution for future investigations.
Tasks Super-Resolution
Published 2019-08-12
URL https://arxiv.org/abs/1908.04297v1
PDF https://arxiv.org/pdf/1908.04297v1.pdf
PWC https://paperswithcode.com/paper/super-resolution-of-omnidirectional-images
Repo https://github.com/V-Sense/360SR
Framework pytorch
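As a rough illustration of spherical-content-aware training signals for equirectangular images, the sketch below weights pixel errors by the cosine of latitude, since rows near the poles are oversampled. This is an assumption in the spirit of common 360-degree quality metrics; the paper's 360-SS loss is a structural-similarity-based term whose exact form is not given in the abstract.

```python
import numpy as np

def latitude_weighted_l1(sr, hr):
    """Toy spherical-content-aware loss for equirectangular images: pixel errors
    are weighted by cos(latitude) to compensate for oversampling near the poles.
    Not the paper's 360-SS loss; an illustrative stand-in only."""
    h = sr.shape[-2]
    lat = (np.arange(h) + 0.5) / h * np.pi - np.pi / 2      # latitude of each row
    w = np.cos(lat)[:, None]                                 # (H, 1) row weights
    return float(np.sum(w * np.abs(sr - hr)) / (np.sum(w) * sr.shape[-1]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hr = rng.random((64, 128))                    # toy equirectangular ground truth
    sr = hr + 0.05 * rng.normal(size=hr.shape)    # toy super-resolved estimate
    print(round(latitude_weighted_l1(sr, hr), 4))
```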

Sparse Variational Inference: Bayesian Coresets from Scratch

Title Sparse Variational Inference: Bayesian Coresets from Scratch
Authors Trevor Campbell, Boyan Beronov
Abstract The proliferation of automated inference algorithms in Bayesian statistics has provided practitioners newfound access to fast, reproducible data analysis and powerful statistical models. Designing automated methods that are also both computationally scalable and theoretically sound, however, remains a significant challenge. Recent work on Bayesian coresets takes the approach of compressing the dataset before running a standard inference algorithm, providing both scalability and guarantees on posterior approximation error. But the automation of past coreset methods is limited because they depend on the availability of a reasonable coarse posterior approximation, which is difficult to specify in practice. In the present work we remove this requirement by formulating coreset construction as sparsity-constrained variational inference within an exponential family. This perspective leads to a novel construction via greedy optimization, and also provides a unifying information-geometric view of present and past methods. The proposed Riemannian coreset construction algorithm is fully automated, requiring no problem-specific inputs aside from the probabilistic model and dataset. In addition to being significantly easier to use than past methods, experiments demonstrate that past coreset constructions are fundamentally limited by the fixed coarse posterior approximation; in contrast, the proposed algorithm is able to continually improve the coreset, providing state-of-the-art Bayesian dataset summarization with orders-of-magnitude reduction in KL divergence to the exact posterior.
Tasks
Published 2019-06-07
URL https://arxiv.org/abs/1906.03329v2
PDF https://arxiv.org/pdf/1906.03329v2.pdf
PWC https://paperswithcode.com/paper/sparse-variational-inference-bayesian
Repo https://github.com/trevorcampbell/bayesian-coresets
Framework none

Beyond Photometric Loss for Self-Supervised Ego-Motion Estimation

Title Beyond Photometric Loss for Self-Supervised Ego-Motion Estimation
Authors Tianwei Shen, Zixin Luo, Lei Zhou, Hanyu Deng, Runze Zhang, Tian Fang, Long Quan
Abstract Accurate relative pose is one of the key components in visual odometry (VO) and simultaneous localization and mapping (SLAM). Recently, the self-supervised learning framework that jointly optimizes the relative pose and target image depth has attracted the attention of the community. Previous works rely on the photometric error generated from depths and poses between adjacent frames, which contains large systematic error under realistic scenes due to reflective surfaces and occlusions. In this paper, we bridge the gap between geometric loss and photometric loss by introducing the matching loss constrained by epipolar geometry in a self-supervised framework. Evaluated on the KITTI dataset, our method outperforms the state-of-the-art unsupervised ego-motion estimation methods by a large margin. The code and data are available at https://github.com/hlzz/DeepMatchVO.
Tasks Motion Estimation, Simultaneous Localization and Mapping, Visual Odometry
Published 2019-02-25
URL http://arxiv.org/abs/1902.09103v1
PDF http://arxiv.org/pdf/1902.09103v1.pdf
PWC https://paperswithcode.com/paper/beyond-photometric-loss-for-self-supervised
Repo https://github.com/hlzz/DeepMatchVO
Framework tf
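A minimal NumPy sketch of an epipolar-geometry matching loss: given a predicted relative pose (R, t) and feature matches in normalized camera coordinates, penalize the epipolar residual |p2^T E p1| with E = [t]_x R. The paper's exact formulation and normalization may differ.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]], dtype=float)

def epipolar_matching_loss(p1, p2, R, t):
    """Mean absolute epipolar residual |p2^T E p1| over feature matches, with
    E = [t]_x R and p1, p2 in homogeneous normalized camera coordinates, shape (N, 3)."""
    E = skew(t) @ R
    residuals = np.einsum('ni,ij,nj->n', p2, E, p1)
    return float(np.mean(np.abs(residuals)))

if __name__ == "__main__":
    # Toy check: matches generated with the true pose give (near-)zero loss.
    rng = np.random.default_rng(0)
    R, t = np.eye(3), np.array([1.0, 0.0, 0.0])
    X = rng.uniform(1, 5, size=(50, 3))            # 3-D points in frame 1
    p1 = X / X[:, 2:3]                             # normalized projections, frame 1
    X2 = (R @ X.T).T + t                           # same points in frame 2
    p2 = X2 / X2[:, 2:3]
    print(round(epipolar_matching_loss(p1, p2, R, t), 6))  # ~0.0
```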

Generating Philosophical Statements using Interpolated Markov Models and Dynamic Templates

Title Generating Philosophical Statements using Interpolated Markov Models and Dynamic Templates
Authors Thomas Winters
Abstract Automatically imitating input text is a common task in natural language generation, often used to create humorous results. Classic algorithms for learning to imitate text, e.g. simple Markov chains, usually trade originality against syntactic correctness. We present two ways of automatically parodying philosophical statements from examples that overcome this issue, and show how these can work in interactive systems as well. The first algorithm uses interpolated Markov models with extensions that improve the quality of the generated texts. For the second algorithm, we propose dynamically extracting templates and filling these with new content. To illustrate these algorithms, we implemented TorfsBot, a Twitterbot imitating the witty, semi-philosophical tweets of professor Rik Torfs, the previous KU Leuven rector. We found that users preferred generative models that focused on locally coherent sentences over those mimicking the global structure of a philosophical statement. The proposed algorithms are thus valuable new tools for automatic parody as well as template learning systems.
Tasks Text Generation
Published 2019-09-19
URL https://arxiv.org/abs/1909.09480v1
PDF https://arxiv.org/pdf/1909.09480v1.pdf
PWC https://paperswithcode.com/paper/generating-philosophical-statements-using
Repo https://github.com/twinters/torfs-bot
Framework none
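A toy sketch of the interpolated-Markov-model idea: mix next-token distributions from chains of different orders, trading off the originality of low orders against the syntactic correctness of high orders. The weighting scheme and extensions used for TorfsBot are not reproduced here.

```python
import random
from collections import defaultdict, Counter

def train_markov(tokens, order):
    """Counts of next token given the previous `order` tokens."""
    model = defaultdict(Counter)
    for i in range(len(tokens) - order):
        model[tuple(tokens[i:i + order])][tokens[i + order]] += 1
    return model

def interpolated_next(history, models, weights, rng):
    """Mix next-token distributions from models of increasing order."""
    scores = Counter()
    for order, (model, w) in enumerate(zip(models, weights), start=1):
        dist = model.get(tuple(history[-order:]))
        if dist:
            total = sum(dist.values())
            for tok, c in dist.items():
                scores[tok] += w * c / total
    if not scores:
        return None
    toks, probs = zip(*scores.items())
    return rng.choices(toks, weights=probs, k=1)[0]

if __name__ == "__main__":
    rng = random.Random(0)
    corpus = ("the limits of my language mean the limits of my world "
              "the world is everything that is the case").split()
    models = [train_markov(corpus, 1), train_markov(corpus, 2)]
    out = ["the", "limits"]
    for _ in range(8):
        nxt = interpolated_next(out, models, weights=[0.3, 0.7], rng=rng)
        if nxt is None:
            break
        out.append(nxt)
    print(" ".join(out))
```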

3D Human Pose Machines with Self-supervised Learning

Title 3D Human Pose Machines with Self-supervised Learning
Authors Keze Wang, Liang Lin, Chenhan Jiang, Chen Qian, Pengxu Wei
Abstract Driven by recent computer vision and robotic applications, recovering 3D human poses has become increasingly important and has attracted growing interest. Completing this task is quite challenging due to the diverse appearances, viewpoints, occlusions, and inherent geometric ambiguities in monocular images. Most existing methods focus on designing elaborate priors/constraints to directly regress 3D human poses from the corresponding 2D human pose-aware features or 2D pose predictions. However, due to insufficient 3D pose data for training and the domain gap between 2D and 3D space, these methods have limited scalability across practical scenarios (e.g., outdoor scenes). To address this issue, this paper proposes a simple yet effective self-supervised correction mechanism to learn the intrinsic structures of human poses from abundant images. Specifically, the proposed mechanism involves two dual learning tasks, i.e., 2D-to-3D pose transformation and 3D-to-2D pose projection, which serve as a bridge between 3D and 2D human poses and provide a form of “free” self-supervision for accurate 3D human pose estimation. The 2D-to-3D pose transformation sequentially regresses intermediate 3D poses by lifting the pose representation from the 2D domain to the 3D domain under a sequence-dependent temporal context, while the 3D-to-2D pose projection refines the intermediate 3D poses by maintaining geometric consistency between the 2D projections of the 3D poses and the estimated 2D poses. We further apply our self-supervised correction mechanism to develop a 3D human pose machine, which jointly integrates 2D spatial relationships, temporal smoothness of predictions, and 3D geometric knowledge. Extensive evaluations demonstrate the superior performance and efficiency of our framework over the compared competing methods.
Tasks 3D Human Pose Estimation, Pose Estimation
Published 2019-01-12
URL http://arxiv.org/abs/1901.03798v2
PDF http://arxiv.org/pdf/1901.03798v2.pdf
PWC https://paperswithcode.com/paper/3d-human-pose-machines-with-self-supervised
Repo https://github.com/Khenu/Computer-Animation
Framework none
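A minimal sketch of the 3D-to-2D projection consistency signal: project the regressed 3-D joints back to 2-D and penalize deviation from the estimated 2-D pose. A fixed scaled-orthographic camera is assumed here purely for illustration; the paper learns the projection inside the network.

```python
import numpy as np

def projection_consistency_loss(pose_3d, pose_2d, scale=1.0, trans=(0.0, 0.0)):
    """Self-supervision signal: project predicted 3-D joints to 2-D with a simple
    scaled-orthographic camera and compare against the estimated 2-D pose.
    pose_3d: (J, 3) joints, pose_2d: (J, 2) joints."""
    proj = scale * pose_3d[:, :2] + np.asarray(trans)
    return float(np.mean(np.sum((proj - pose_2d) ** 2, axis=-1)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pose_3d = rng.normal(size=(17, 3))                             # e.g. 17 joints
    pose_2d = pose_3d[:, :2] + 0.01 * rng.normal(size=(17, 2))     # noisy 2-D estimate
    print(round(projection_consistency_loss(pose_3d, pose_2d), 5))
```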

Sogou Machine Reading Comprehension Toolkit

Title Sogou Machine Reading Comprehension Toolkit
Authors Jindou Wu, Yunlun Yang, Chao Deng, Hongyi Tang, Bingning Wang, Haoze Sun, Ting Yao, Qi Zhang
Abstract Machine reading comprehension has been intensively studied in recent years, and neural network-based models have shown dominant performance. In this paper, we present the Sogou Machine Reading Comprehension (SMRC) toolkit, which supports fast and efficient development of modern machine comprehension models, including both published models and original prototypes. To achieve this goal, the toolkit provides dataset readers, a flexible preprocessing pipeline, necessary neural network components, and built-in models, which make the whole process of data preparation, model construction, and training easier.
Tasks Machine Reading Comprehension, Reading Comprehension
Published 2019-03-28
URL http://arxiv.org/abs/1903.11848v2
PDF http://arxiv.org/pdf/1903.11848v2.pdf
PWC https://paperswithcode.com/paper/sogou-machine-reading-comprehension-toolkit
Repo https://github.com/sogou/SMRCToolkit
Framework tf