January 26, 2020

Paper Group ANR 1393

AutoCorrect: Deep Inductive Alignment of Noisy Geometric Annotations. Multifactorial Evolutionary Algorithm For Clustered Minimum Routing Cost Problem. MEGAN: A Generative Adversarial Network for Multi-View Network Embedding. HONEM: Network Embedding Using Higher-Order Patterns in Sequential Data. Scalable Syntax-Aware Language Models Using Knowled …

AutoCorrect: Deep Inductive Alignment of Noisy Geometric Annotations

Title AutoCorrect: Deep Inductive Alignment of Noisy Geometric Annotations
Authors Honglie Chen, Weidi Xie, Andrea Vedaldi, Andrew Zisserman
Abstract We propose AutoCorrect, a method to automatically learn object-annotation alignments from a dataset with annotations affected by geometric noise. The method is based on a consistency loss that enables deep neural networks to be trained, given only noisy annotations as input, to correct the annotations. When some noise-free annotations are available, we show that the consistency loss reduces to a stricter self-supervised loss. We also show that the method can implicitly leverage object symmetries to reduce the ambiguity arising in correcting noisy annotations. When multiple object-annotation pairs are present in an image, we introduce a spatial memory map that allows the network to correct annotations sequentially, one at a time, while accounting for all other annotations in the image and corrections performed so far. Through ablation, we show the benefit of these contributions, demonstrating excellent results on geo-spatial imagery. Specifically, we show results using a new Railway tracks dataset as well as the public INRIA Buildings benchmarks, achieving new state-of-the-art results for the latter.
Tasks
Published 2019-08-14
URL https://arxiv.org/abs/1908.05263v1
PDF https://arxiv.org/pdf/1908.05263v1.pdf
PWC https://paperswithcode.com/paper/autocorrect-deep-inductive-alignment-of-noisy
Repo
Framework

Multifactorial Evolutionary Algorithm For Clustered Minimum Routing Cost Problem

Title Multifactorial Evolutionary Algorithm For Clustered Minimum Routing Cost Problem
Authors Tran Ba Trung, Huynh Thi Thanh Binh, Le Tien Thanh, Ly Trung Hieu, Pham Dinh Thanh
Abstract The Minimum Routing Cost Clustered Tree Problem (CluMRCT) arises in various fields, in both theory and application. Because CluMRCT is NP-hard, approximate approaches are suitable for solving it. Recently, the Multifactorial Evolutionary Algorithm (MFEA) has emerged as one of the most efficient approximation algorithms for many different kinds of problems. This paper therefore applies MFEA to CluMRCT problems. In the proposed MFEA, we focus on crossover and mutation operators that create a valid CluMRCT solution in two levels: the first level constructs spanning trees for the graphs within clusters, while the second level builds a spanning tree connecting the clusters. To reduce resource consumption, we also introduce a new method of calculating the cost of a CluMRCT solution. The proposed algorithm is evaluated on numerous types of datasets. The experimental results demonstrate the effectiveness of the proposed algorithm, particularly on large instances.
Tasks
Published 2019-12-23
URL https://arxiv.org/abs/1912.10986v1
PDF https://arxiv.org/pdf/1912.10986v1.pdf
PWC https://paperswithcode.com/paper/multifactorial-evolutionary-algorithm-for
Repo
Framework
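The routing cost that the abstract above refers to is, for a spanning tree, the sum of path lengths over all vertex pairs. The sketch below is a minimal illustration of one standard way to compute it in linear time; it is not the cost-calculation method proposed in the paper, which the abstract does not describe. The function name `routing_cost` and the edge-list input format are assumptions made for this example. It counts each unordered pair once; definitions that count ordered pairs give exactly twice this value.

```python
from collections import defaultdict


def routing_cost(n, edges):
    """Sum of path lengths over all unordered vertex pairs of a tree.

    n     -- number of vertices, labelled 0 .. n-1
    edges -- list of (u, v, weight) tuples forming a spanning tree
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    # Iterative DFS from vertex 0, recording a preorder and each parent edge.
    parent = {0: None}
    parent_weight = {}
    order = []
    stack = [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for v, w in adj[u]:
            if v not in parent:
                parent[v] = u
                parent_weight[v] = w
                stack.append(v)

    # Removing an edge splits the tree into parts of size s and n - s,
    # so the edge lies on exactly s * (n - s) vertex-pair paths.
    size = [1] * n
    cost = 0
    for u in reversed(order):  # children are visited before their parents
        p = parent[u]
        if p is not None:
            cost += parent_weight[u] * size[u] * (n - size[u])
            size[p] += size[u]
    return cost
```

For the path 0-1-2 with edge weights 1 and 2, the pairwise distances are 1, 2 and 3, so `routing_cost(3, [(0, 1, 1), (1, 2, 2)])` returns 6.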

MEGAN: A Generative Adversarial Network for Multi-View Network Embedding

Title MEGAN: A Generative Adversarial Network for Multi-View Network Embedding
Authors Yiwei Sun, Suhang Wang, Tsung-Yu Hsieh, Xianfeng Tang, Vasant Honavar
Abstract Data from many real-world applications can be naturally represented by multi-view networks where the different views encode different types of relationships (e.g., friendship, shared interests in music, etc.) between real-world individuals or entities. There is an urgent need for methods to obtain low-dimensional, information preserving and typically nonlinear embeddings of such multi-view networks. However, most of the work on multi-view learning focuses on data that lack a network structure, and most of the work on network embeddings has focused primarily on single-view networks. Against this background, we consider the multi-view network representation learning problem, i.e., the problem of constructing low-dimensional information preserving embeddings of multi-view networks. Specifically, we investigate a novel Generative Adversarial Network (GAN) framework for Multi-View Network Embedding, namely MEGAN, aimed at preserving the information from the individual network views, while accounting for connectivity across (and hence complementarity of and correlations between) different views. The results of our experiments on two real-world multi-view data sets show that the embeddings obtained using MEGAN outperform the state-of-the-art methods on node classification, link prediction and visualization tasks.
Tasks Link Prediction, MULTI-VIEW LEARNING, Network Embedding, Node Classification, Representation Learning
Published 2019-08-20
URL https://arxiv.org/abs/1909.01084v1
PDF https://arxiv.org/pdf/1909.01084v1.pdf
PWC https://paperswithcode.com/paper/megan-a-generative-adversarial-network-for
Repo
Framework

HONEM: Network Embedding Using Higher-Order Patterns in Sequential Data

Title HONEM: Network Embedding Using Higher-Order Patterns in Sequential Data
Authors Mandana Saebi, Giovanni Luca Ciampaglia, Lance M Kaplan, Nitesh V Chawla
Abstract Representation learning offers a powerful alternative to the oft painstaking process of manual feature engineering, and as a result, has enjoyed considerable success in recent years. This success is especially striking in the context of graph mining, since networks can take advantage of vast troves of sequential data to encode information about interactions between entities of interest. But how do we learn embeddings on networks that have higher-order and sequential dependencies? Existing network embedding methods naively assume the Markovian property (first-order dependency) for node interactions, which may not capture the time-dependent and longer-range underlying complex interactions of the raw data. To address the limitation of current methods, we propose a network embedding method for higher-order networks (HON). We demonstrate that the higher-order network embedding (HONEM) method is able to extract higher-order dependencies from HON to construct the higher-order neighborhood matrix of the network, while existing methods are not able to capture these higher-order dependencies. We show that our method outperforms other state-of-the-art methods in node classification, network reconstruction, link prediction, and visualization.
Tasks Feature Engineering, Link Prediction, Network Embedding, Node Classification, Representation Learning
Published 2019-08-15
URL https://arxiv.org/abs/1908.05387v1
PDF https://arxiv.org/pdf/1908.05387v1.pdf
PWC https://paperswithcode.com/paper/honem-network-embedding-using-higher-order
Repo
Framework

Scalable Syntax-Aware Language Models Using Knowledge Distillation

Title Scalable Syntax-Aware Language Models Using Knowledge Distillation
Authors Adhiguna Kuncoro, Chris Dyer, Laura Rimell, Stephen Clark, Phil Blunsom
Abstract Prior work has shown that, on small amounts of training data, syntactic neural language models learn structurally sensitive generalisations more successfully than sequential language models. However, their computational complexity renders scaling difficult, and it remains an open question whether structural biases are still necessary when sequential models have access to ever larger amounts of training data. To answer this question, we introduce an efficient knowledge distillation (KD) technique that transfers knowledge from a syntactic language model trained on a small corpus to an LSTM language model, hence enabling the LSTM to develop a more structurally sensitive representation of the larger training data it learns from. On targeted syntactic evaluations, we find that, while sequential LSTMs perform much better than previously reported, our proposed technique substantially improves on this baseline, yielding a new state of the art. Our findings and analysis affirm the importance of structural biases, even in models that learn from large amounts of data.
Tasks Language Modelling
Published 2019-06-14
URL https://arxiv.org/abs/1906.06438v1
PDF https://arxiv.org/pdf/1906.06438v1.pdf
PWC https://paperswithcode.com/paper/scalable-syntax-aware-language-models-using
Repo
Framework
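As a rough illustration of the knowledge distillation idea in the abstract above, the sketch below interpolates between the usual next-word negative log-likelihood and a cross-entropy against a teacher's predicted distribution. This is the generic word-level distillation objective, not necessarily the exact loss used in the paper; the function names and the weight `alpha` are assumptions made for this example.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def distillation_loss(student_logits, teacher_probs, targets, alpha=0.5):
    """Interpolated distillation loss for a batch of next-word predictions.

    student_logits -- (batch, vocab) unnormalised student scores
    teacher_probs  -- (batch, vocab) teacher distribution over the next word
    targets        -- (batch,) integer indices of the observed next words
    alpha          -- weight on the teacher term (hypothetical default)
    """
    log_p = log_softmax(student_logits)
    # Standard language-modelling loss against the observed word.
    nll = -log_p[np.arange(len(targets)), targets].mean()
    # Cross-entropy between the teacher distribution and the student.
    soft_ce = -(teacher_probs * log_p).sum(axis=-1).mean()
    return alpha * soft_ce + (1.0 - alpha) * nll
```

Setting `alpha=0` recovers ordinary language-model training, while `alpha=1` trains the student purely to match the teacher.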

Stain Style Transfer using Transitive Adversarial Networks

Title Stain Style Transfer using Transitive Adversarial Networks
Authors Shaojin Cai, Yuyang Xue, Qinquan Gao, Min Du, Gang Chen, Hejun Zhang, Tong Tong
Abstract Digitized pathological diagnosis has been in increasing demand recently. It is well known that color information is critical to the automatic and visual analysis of pathological slides. However, color variations due to various factors not only have a negative impact on the pathologist’s diagnosis but also reduce the robustness of the algorithms. These color differences arise not only while the slices are prepared but also during digitization. Different strategies have been proposed to alleviate the color variations. Most such techniques rely on collecting color statistics to perform color matching across images and are highly dependent on a reference template slide. Since the pathological slides from different hospitals are usually unpaired, these methods do not yield good matching results. In this work, we propose a novel network that we refer to as Transitive Adversarial Networks (TAN) to transfer the color information among slides from different hospitals or centers. The proposed TAN method does not require an expert to pick a representative reference slide. We compare the proposed method with state-of-the-art methods quantitatively and qualitatively. Compared with these methods, our method yields an improvement of 0.87 dB in PSNR, demonstrating the effectiveness of the proposed TAN method in stain style transfer.
Tasks Style Transfer
Published 2019-10-23
URL https://arxiv.org/abs/1910.10330v1
PDF https://arxiv.org/pdf/1910.10330v1.pdf
PWC https://paperswithcode.com/paper/stain-style-transfer-using-transitive
Repo
Framework
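The abstract above reports its gain in PSNR (peak signal-to-noise ratio). For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, 10 * log10(MAX^2 / MSE). The function name and the 8-bit default `max_value=255.0` are assumptions made for this example; the abstract does not say how the authors configured the metric.

```python
import numpy as np


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a 0.87 dB improvement at a fixed peak value corresponds to an MSE that is lower by a factor of 10^(-0.087), roughly 18%.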

Disentangled Representation Learning with Wasserstein Total Correlation

Title Disentangled Representation Learning with Wasserstein Total Correlation
Authors Yijun Xiao, William Yang Wang
Abstract Unsupervised learning of disentangled representations involves uncovering the different factors of variation that contribute to the data generation process. Total correlation penalization has been a key component in recent methods towards disentanglement. However, Kullback-Leibler (KL) divergence-based total correlation is metric-agnostic and sensitive to data samples. In this paper, we introduce Wasserstein total correlation in both variational autoencoder and Wasserstein autoencoder settings to learn disentangled latent representations. A critic is adversarially trained along with the main objective to estimate the Wasserstein total correlation term. We discuss the benefits of using Wasserstein distance over KL divergence to measure independence and conduct quantitative and qualitative experiments on several data sets. Moreover, we introduce a new metric to measure disentanglement. We show that the proposed approach achieves comparable disentanglement performance with smaller sacrifices in reconstruction ability.
Tasks Representation Learning
Published 2019-12-30
URL https://arxiv.org/abs/1912.12818v1
PDF https://arxiv.org/pdf/1912.12818v1.pdf
PWC https://paperswithcode.com/paper/disentangled-representation-learning-with-2
Repo
Framework
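For reference, the two quantities contrasted in the abstract above can be written out as follows. The KL-based total correlation is the standard definition. The Wasserstein form shown here assumes the 1-Wasserstein distance in its Kantorovich-Rubinstein dual form, which is the usual setting when a critic is trained adversarially; the abstract itself does not state which order of Wasserstein distance is used, and the symbols $q$, $d$ and $f$ are notation chosen for this sketch.

```latex
% KL-based total correlation of a d-dimensional latent code z
\mathrm{TC}(z) = \mathrm{KL}\Big( q(z) \,\Big\|\, \prod_{j=1}^{d} q(z_j) \Big)

% Wasserstein total correlation (assumed: W_1, Kantorovich-Rubinstein dual);
% the supremum is over 1-Lipschitz critics f
\mathrm{WTC}(z) = W_1\Big( q(z),\, \prod_{j=1}^{d} q(z_j) \Big)
  = \sup_{\|f\|_{L} \le 1}
    \mathbb{E}_{z \sim q(z)}\big[ f(z) \big]
    - \mathbb{E}_{\bar{z} \sim \prod_{j} q(z_j)}\big[ f(\bar{z}) \big]
```

Both quantities are zero exactly when the latent dimensions are independent, that is, when the joint $q(z)$ equals the product of its marginals.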

Action Genome: Actions as Composition of Spatio-temporal Scene Graphs

Title Action Genome: Actions as Composition of Spatio-temporal Scene Graphs
Authors Jingwei Ji, Ranjay Krishna, Li Fei-Fei, Juan Carlos Niebles
Abstract Action recognition has typically treated actions and activities as monolithic events that occur in videos. However, there is evidence from Cognitive Science and Neuroscience that people actively encode activities into consistent hierarchical part structures. In Computer Vision, by contrast, representations that encode event partonomies have rarely been explored. Inspired by evidence that the prototypical unit of an event is an action-object interaction, we introduce Action Genome, a representation that decomposes actions into spatio-temporal scene graphs. Action Genome captures changes between objects and their pairwise relationships while an action occurs. It contains 10K videos with 0.4M objects and 1.7M visual relationships annotated. With Action Genome, we extend an existing action recognition model by incorporating scene graphs as spatio-temporal feature banks to achieve better performance on the Charades dataset. Next, by decomposing and learning the temporal changes in visual relationships that result in an action, we demonstrate the utility of a hierarchical event decomposition by enabling few-shot action recognition, achieving 42.7% mAP using as few as 10 examples. Finally, we benchmark existing scene graph models on the new task of spatio-temporal scene graph prediction.
Tasks
Published 2019-12-15
URL https://arxiv.org/abs/1912.06992v1
PDF https://arxiv.org/pdf/1912.06992v1.pdf
PWC https://paperswithcode.com/paper/action-genome-actions-as-composition-of
Repo
Framework

Semantics Disentangling for Text-to-Image Generation

Title Semantics Disentangling for Text-to-Image Generation
Authors Guojun Yin, Bin Liu, Lu Sheng, Nenghai Yu, Xiaogang Wang, Jing Shao
Abstract Synthesizing photo-realistic images from text descriptions is a challenging problem. Previous studies have shown remarkable progress on the visual quality of the generated images. In this paper, we consider semantics from the input text descriptions in helping render photo-realistic images. However, diverse linguistic expressions pose challenges in extracting consistent semantics even when they depict the same thing. To this end, we propose a novel photo-realistic text-to-image generation model that implicitly disentangles semantics to achieve both high-level semantic consistency and low-level semantic diversity. To be specific, we design (1) a Siamese mechanism in the discriminator to learn consistent high-level semantics, and (2) a visual-semantic embedding strategy by semantic-conditioned batch normalization to find diverse low-level semantics. Extensive experiments and ablation studies on CUB and MS-COCO datasets demonstrate the superiority of the proposed method in comparison to state-of-the-art methods.
Tasks Image Generation, Text-to-Image Generation
Published 2019-04-02
URL http://arxiv.org/abs/1904.01480v1
PDF http://arxiv.org/pdf/1904.01480v1.pdf
PWC https://paperswithcode.com/paper/semantics-disentangling-for-text-to-image
Repo
Framework

Quadratic video interpolation

Title Quadratic video interpolation
Authors Xiangyu Xu, Li Siyao, Wenxiu Sun, Qian Yin, Ming-Hsuan Yang
Abstract Video interpolation is an important problem in computer vision, which helps overcome the temporal limitation of camera sensors. Existing video interpolation methods usually assume uniform motion between consecutive frames and use linear models for interpolation, which cannot approximate the complex motion of the real world well. To address these issues, we propose a quadratic video interpolation method which exploits the acceleration information in videos. This method allows prediction with curvilinear trajectory and variable velocity, and generates more accurate interpolation results. For high-quality frame synthesis, we develop a flow reversal layer to estimate flow fields starting from the unknown target frame to the source frame. In addition, we present techniques for flow refinement. Extensive experiments demonstrate that our approach performs favorably against the existing linear models on a wide variety of video datasets.
Tasks
Published 2019-11-02
URL https://arxiv.org/abs/1911.00627v1
PDF https://arxiv.org/pdf/1911.00627v1.pdf
PWC https://paperswithcode.com/paper/quadratic-video-interpolation
Repo
Framework
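The difference between the linear and quadratic motion models contrasted in the abstract above can be sketched in a few lines. Assuming constant acceleration, a pixel's displacement from frame 0 is x(t) = v t + a t^2 / 2; the flows to frame 1 and to frame -1 then give a = x(1) + x(-1) and v = (x(1) - x(-1)) / 2. The function and argument names below are assumptions made for this example, and the sketch omits the paper's flow reversal and refinement steps entirely.

```python
import numpy as np


def linear_flow(flow_0_to_1, t):
    """Uniform-motion model: displacement grows linearly with time."""
    return flow_0_to_1 * t


def quadratic_flow(flow_0_to_1, flow_0_to_m1, t):
    """Constant-acceleration model built from the flows to frames 1 and -1."""
    accel = flow_0_to_1 + flow_0_to_m1           # a = x(1) + x(-1)
    velocity = 0.5 * (flow_0_to_1 - flow_0_to_m1)  # v = (x(1) - x(-1)) / 2
    return velocity * t + 0.5 * accel * t ** 2


# A pixel moving as x(t) = t + t^2 (v = 1, a = 2) along the first axis.
f01 = np.array([2.0, 0.0])   # x(1)  = 2
f0m1 = np.array([0.0, 0.0])  # x(-1) = 0
mid_quadratic = quadratic_flow(f01, f0m1, 0.5)  # true position is 0.75
mid_linear = linear_flow(f01, 0.5)              # linear model gives 1.0
```

When the motion really is uniform, the flow to frame -1 is the negative of the flow to frame 1, the acceleration term vanishes, and the two models agree.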

Eikonal Region-based Active Contours for Image Segmentation

Title Eikonal Region-based Active Contours for Image Segmentation
Authors Da Chen, Jean-Marie Mirebeau, Huazhong Shu, Laurent D. Cohen
Abstract The minimal path model based on the Eikonal partial differential equation (PDE) has served as a fundamental tool for the applications of image segmentation and boundary detection in the past three decades. However, the existing minimal paths-based image segmentation approaches commonly rely on the image boundary features, potentially limiting their performance in some situations. In this paper, we introduce a new variational image segmentation model based on the minimal path framework and the Eikonal PDE, where the region-based functional that defines the homogeneity criteria can be taken into account for estimating the associated geodesic paths. This is done by establishing a geodesic curve interpretation of the region-based active contour evolution problem. In our approach, the image segmentation is carried out iteratively. A crucial ingredient in each iteration is to construct an asymmetric Randers geodesic metric using a sufficiently small vector field, such that a set of geodesic paths can be tracked from the geodesic distance map, which is the solution to an Eikonal PDE. The object boundary can be delineated by the concatenation of the final geodesic paths. We invoke the Finsler variant of the fast marching method to estimate the geodesic distance map, yielding an efficient implementation of the proposed Eikonal region-based active contour model. Experimental results on both synthetic and real images show that our model indeed achieves encouraging segmentation performance.
Tasks Boundary Detection, Semantic Segmentation
Published 2019-12-20
URL https://arxiv.org/abs/1912.10122v1
PDF https://arxiv.org/pdf/1912.10122v1.pdf
PWC https://paperswithcode.com/paper/eikonal-region-based-active-contours-for
Repo
Framework

DropAttention: A Regularization Method for Fully-Connected Self-Attention Networks

Title DropAttention: A Regularization Method for Fully-Connected Self-Attention Networks
Authors Lin Zehui, Pengfei Liu, Luyao Huang, Junkun Chen, Xipeng Qiu, Xuanjing Huang
Abstract Variants of dropout methods have been designed for the fully-connected layer, convolutional layer and recurrent layer in neural networks, and shown to be effective in avoiding overfitting. As an appealing alternative to recurrent and convolutional layers, the fully-connected self-attention layer surprisingly lacks a specific dropout method. This paper explores the possibility of regularizing the attention weights in Transformers to prevent different contextualized feature vectors from co-adaptation. Experiments on a wide range of tasks show that DropAttention can improve performance and reduce overfitting.
Tasks
Published 2019-07-25
URL https://arxiv.org/abs/1907.11065v2
PDF https://arxiv.org/pdf/1907.11065v2.pdf
PWC https://paperswithcode.com/paper/dropattention-a-regularization-method-for
Repo
Framework
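To make the idea of regularizing attention weights concrete, the sketch below randomly zeroes entries of an attention matrix and rescales each row so that it still sums to one. This is one plausible variant written for illustration only: the abstract above does not say which units are dropped or how the weights are renormalized, and the function name `drop_attention` and its fallback behaviour are assumptions.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax over the last axis."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def drop_attention(weights, drop_prob, rng):
    """Randomly zero attention weights, then renormalise each row.

    weights   -- (queries, keys) attention matrix whose rows sum to 1
    drop_prob -- probability of dropping each individual weight
    rng       -- numpy.random.Generator used to sample the mask
    """
    keep = rng.random(weights.shape) >= drop_prob
    dropped = weights * keep
    row_sum = dropped.sum(axis=-1, keepdims=True)
    # Assumed fallback: if every weight in a row was dropped, keep the
    # original row rather than dividing by zero.
    safe_sum = np.where(row_sum > 0, row_sum, 1.0)
    return np.where(row_sum > 0, dropped / safe_sum, weights)
```

Renormalizing keeps every row a valid probability distribution, which differs from standard dropout, where surviving activations are scaled by the constant 1 / (1 - drop_prob). As with ordinary dropout, the mask would be applied only during training.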

Hypernym Detection Using Strict Partial Order Networks

Title Hypernym Detection Using Strict Partial Order Networks
Authors Sarthak Dash, Md Faisal Mahbub Chowdhury, Alfio Gliozzo, Nandana Mihindukulasooriya, Nicolas Rodolfo Fauceglia
Abstract This paper introduces Strict Partial Order Networks (SPON), a novel neural network architecture designed to enforce asymmetry and transitive properties as soft constraints. We apply it to induce hypernymy relations by training with is-a pairs. We also present an augmented variant of SPON that can generalize type information learned for in-vocabulary terms to previously unseen ones. An extensive evaluation over eleven benchmarks across different tasks shows that SPON consistently either outperforms or attains the state of the art on all but one of these benchmarks.
Tasks
Published 2019-09-23
URL https://arxiv.org/abs/1909.10572v2
PDF https://arxiv.org/pdf/1909.10572v2.pdf
PWC https://paperswithcode.com/paper/inducing-hypernym-relationships-based-on
Repo
Framework

Theme-Matters: Fashion Compatibility Learning via Theme Attention

Title Theme-Matters: Fashion Compatibility Learning via Theme Attention
Authors Jui-Hsin Lai, Bo Wu, Xin Wang, Dan Zeng, Tao Mei, Jingen Liu
Abstract Fashion compatibility learning is important to many fashion markets such as outfit composition and online fashion recommendation. Unlike previous work, we argue that fashion compatibility is not only a visual appearance compatibility problem but also a theme-matters problem. An outfit, which consists of a set of fashion items (e.g., shirt, suit, shoes, etc.), may be considered compatible for a “dating” event, yet not for a “business” occasion. In this paper, we aim at solving the fashion compatibility problem given specific themes. To this end, we built the first real-world theme-aware fashion dataset, comprising around 14K outfits labeled with 32 themes. In this dataset, there are more than 40K fashion items labeled with 152 fine-grained categories. We also propose an attention model that learns fashion compatibility given a specific theme. It starts with category-specific subspace learning, which projects compatible outfit items in certain categories to be close in the subspace. Thanks to strong connections between fashion themes and categories, we then build a theme-attention model over the category-specific embedding space. This model uses attention to associate themes with pairwise compatibility, and thus computes the outfit-wise compatibility. To the best of our knowledge, this is the first attempt to estimate outfit compatibility conditional on a theme. We conduct extensive qualitative and quantitative experiments on our new dataset. Our method outperforms the state-of-the-art approaches.
Tasks
Published 2019-12-12
URL https://arxiv.org/abs/1912.06227v2
PDF https://arxiv.org/pdf/1912.06227v2.pdf
PWC https://paperswithcode.com/paper/theme-matters-fashion-compatibility-learning
Repo
Framework

Extreme Learning Machine-Based Receiver for MIMO LED Communications

Title Extreme Learning Machine-Based Receiver for MIMO LED Communications
Authors Dawei Gao, Qinghua Guo
Abstract This work concerns receiver design for light-emitting diode (LED) multiple input multiple output (MIMO) communications where the LED nonlinearity can severely degrade the performance of communications. In this paper, we propose an extreme learning machine (ELM) based receiver to jointly handle the LED nonlinearity and cross-LED interference, and a circulant input weight matrix is employed, which significantly reduces the complexity of the receiver with the fast Fourier transform (FFT). It is demonstrated that the proposed receiver can efficiently handle the LED nonlinearity and cross-LED interference.
Tasks
Published 2019-02-27
URL http://arxiv.org/abs/1903.01551v1
PDF http://arxiv.org/pdf/1903.01551v1.pdf
PWC https://paperswithcode.com/paper/extreme-learning-machine-based-receiver-for
Repo
Framework
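The complexity saving mentioned in the abstract above comes from a standard property of circulant matrices: multiplying by one is a circular convolution, which the FFT computes in O(n log n) instead of O(n^2). The sketch below shows that property inside a toy extreme learning machine, with a fixed random circulant input layer and output weights fitted by least squares. All function names, the tanh activation, and the choice of a hidden layer as wide as the input are assumptions made for this example; this is not the receiver described in the paper.

```python
import numpy as np


def circulant_matvec(first_col, x):
    """Compute C @ x, where C is the circulant matrix with first column first_col."""
    return np.real(np.fft.ifft(np.fft.fft(first_col) * np.fft.fft(x)))


def elm_hidden(first_col, bias, inputs):
    """Hidden-layer outputs for a batch, using circulant input weights.

    inputs -- (batch, n) array; each row is multiplied by the same circulant
              matrix via the FFT, then passed through tanh.
    """
    spectrum = np.fft.fft(first_col)[None, :] * np.fft.fft(inputs, axis=1)
    pre_activation = np.real(np.fft.ifft(spectrum, axis=1))
    return np.tanh(pre_activation + bias)


def elm_fit(hidden, targets):
    """Output weights by least squares; the input weights are never trained."""
    beta, *_ = np.linalg.lstsq(hidden, targets, rcond=None)
    return beta


rng = np.random.default_rng(0)
n = 8
first_col = rng.normal(size=n)    # defines the whole n x n input weight matrix
bias = rng.normal(size=n)
inputs = rng.normal(size=(32, n))
targets = rng.normal(size=(32, 2))

hidden = elm_hidden(first_col, bias, inputs)
beta = elm_fit(hidden, targets)
predictions = hidden @ beta
```

Storing only the first column also cuts the memory for the input weights from n^2 values to n.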