January 27, 2020

Paper Group ANR 1189

Towards Interpretable Image Synthesis by Learning Sparsely Connected AND-OR Networks

Title Towards Interpretable Image Synthesis by Learning Sparsely Connected AND-OR Networks
Authors Xianglei Xing, Tianfu Wu, Song-Chun Zhu, Ying Nian Wu
Abstract This paper proposes interpretable image synthesis by learning hierarchical AND-OR networks of sparsely connected semantically meaningful nodes. The proposed method is based on the compositionality and interpretability of scene-objects-parts-subparts-primitives hierarchy in image representation. A scene has different types (i.e., OR) each of which consists of a number of objects (i.e., AND). This can be recursively formulated across the scene-objects-parts-subparts hierarchy and is terminated at the primitive level (e.g., Gabor wavelets-like basis). To realize this interpretable AND-OR hierarchy in image synthesis, the proposed method consists of two components: (i) Each layer of the hierarchy is represented by an over-completed set of basis functions. The basis functions are instantiated using convolution to be translation covariant. Off-the-shelf convolutional neural architectures are then exploited to implement the hierarchy. (ii) Sparsity-inducing constraints are introduced in end-to-end training, which facilitate a sparsely connected AND-OR network to emerge from initially densely connected convolutional neural networks. A straightforward sparsity-inducing constraint is utilized, that is to only allow the top-$k$ basis functions to be active at each layer (where $k$ is a hyperparameter). The learned basis functions are also capable of image reconstruction to explain away input images. In experiments, the proposed method is tested on five benchmark datasets. The results show that meaningful and interpretable hierarchical representations are learned with better qualities of image synthesis and reconstruction obtained than state-of-the-art baselines.
Tasks Image Generation, Image Reconstruction
Published 2019-09-10
URL https://arxiv.org/abs/1909.04324v1
PDF https://arxiv.org/pdf/1909.04324v1.pdf
PWC https://paperswithcode.com/paper/towards-interpretable-image-synthesis-by
Repo
Framework
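
The top-$k$ constraint described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `top_k_sparsify`, the choice to rank by absolute value, and the choice to apply the selection along the last (basis-function) axis are all assumptions made for illustration.

```python
import numpy as np


def top_k_sparsify(activations: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest-magnitude entries along the last axis and zero the rest.

    Assumption: the last axis indexes basis functions and "active" means
    largest absolute response; the paper may define this differently.
    """
    if not 1 <= k <= activations.shape[-1]:
        raise ValueError("k must be between 1 and the number of basis functions")
    # Indices of the k largest magnitudes per position (unordered within the top k).
    idx = np.argpartition(np.abs(activations), -k, axis=-1)[..., -k:]
    mask = np.zeros(activations.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=-1)
    return np.where(mask, activations, 0.0)


if __name__ == "__main__":
    responses = np.array([[0.1, -3.0, 2.0, 0.5]])
    print(top_k_sparsify(responses, k=2))
```

Here $k$ plays the role of the hyperparameter mentioned in the abstract: smaller values give sparser connections between layers.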

Training Deep Learning Models via Synthetic Data: Application in Unmanned Aerial Vehicles

Title Training Deep Learning Models via Synthetic Data: Application in Unmanned Aerial Vehicles
Authors Andreas Kamilaris, Corjan van den Brink, Savvas Karatsiolis
Abstract This paper describes preliminary work in the recent promising approach of generating synthetic training data for facilitating the learning procedure of deep learning (DL) models, with a focus on aerial photos produced by unmanned aerial vehicles (UAV). The general concept and methodology are described, and preliminary results are presented, based on a classification problem of fire identification in forests as well as a counting problem of estimating number of houses in urban areas. The proposed technique constitutes a new possibility for the DL community, especially related to UAV-based imagery analysis, with much potential, promising results, and unexplored ground for further research.
Tasks
Published 2019-08-18
URL https://arxiv.org/abs/1908.06472v1
PDF https://arxiv.org/pdf/1908.06472v1.pdf
PWC https://paperswithcode.com/paper/training-deep-learning-models-via-synthetic
Repo
Framework

DNNVM : End-to-End Compiler Leveraging Heterogeneous Optimizations on FPGA-based CNN Accelerators

Title DNNVM : End-to-End Compiler Leveraging Heterogeneous Optimizations on FPGA-based CNN Accelerators
Authors Yu Xing, Shuang Liang, Lingzhi Sui, Xijie Jia, Jiantao Qiu, Xin Liu, Yushun Wang, Yu Wang, Yi Shan
Abstract The convolutional neural network (CNN) has become a state-of-the-art method for several artificial intelligence domains in recent years. The increasingly complex CNN models are both computation-bound and I/O-bound. FPGA-based accelerators driven by custom instruction set architecture (ISA) achieve a balance between generality and efficiency, but there is much on them left to be optimized. We propose the full-stack compiler DNNVM, which is an integration of optimizers for graphs, loops and data layouts, and an assembler, a runtime supporter and a validation environment. The DNNVM works in the context of deep learning frameworks and transforms CNN models into the directed acyclic graph: XGraph. Based on XGraph, we transform the optimization challenges for both the data layout and pipeline into graph-level problems. DNNVM enumerates all potentially profitable fusion opportunities by a heuristic subgraph isomorphism algorithm to leverage pipeline and data layout optimizations, and searches for the best choice of execution strategies of the whole computing graph. On the Xilinx ZU2 @330 MHz and ZU9 @330 MHz, we achieve equivalently state-of-the-art performance on our benchmarks by naive implementations without optimizations, and the throughput is further improved up to 1.26x by leveraging heterogeneous optimizations in DNNVM. Finally, with ZU9 @330 MHz, we achieve state-of-the-art performance for VGG and ResNet50. We achieve a throughput of 2.82 TOPs/s and an energy efficiency of 123.7 GOPs/s/W for VGG. Additionally, we achieve 1.38 TOPs/s for ResNet50 and 1.41 TOPs/s for GoogleNet.
Tasks
Published 2019-02-20
URL https://arxiv.org/abs/1902.07463v2
PDF https://arxiv.org/pdf/1902.07463v2.pdf
PWC https://paperswithcode.com/paper/dnnvm-end-to-end-compiler-leveraging
Repo
Framework

Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity

Title Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity
Authors Shiyu Liang, Ruoyu Sun, R. Srikant
Abstract Traditional landscape analysis of deep neural networks aims to show that no sub-optimal local minima exist in some appropriate sense. From this, one may be tempted to conclude that descent algorithms which escape saddle points will reach a good local minimum. However, basic optimization theory tells us that it is also possible for a descent algorithm to diverge to infinity if there are paths leading to infinity along which the loss function decreases. It is not clear whether, for non-linear neural networks, there exists a setting in which no bad local minima and no decreasing paths to infinity can be achieved simultaneously. In this paper, we give the first positive answer to this question. More specifically, for a large class of over-parameterized deep neural networks with appropriate regularizers, the loss function has no bad local minima and no decreasing paths to infinity. The key mathematical trick is to show that the set of regularizers which may be undesirable can be viewed as the image of a Lipschitz continuous mapping from a lower-dimensional Euclidean space to a higher-dimensional Euclidean space, and thus has zero measure.
Tasks
Published 2019-12-31
URL https://arxiv.org/abs/1912.13472v1
PDF https://arxiv.org/pdf/1912.13472v1.pdf
PWC https://paperswithcode.com/paper/revisiting-landscape-analysis-in-deep-neural
Repo
Framework

Fast and Accurate Least-Mean-Squares Solvers

Title Fast and Accurate Least-Mean-Squares Solvers
Authors Alaa Maalouf, Ibrahim Jubran, Dan Feldman
Abstract Least-mean squares (LMS) solvers such as Linear / Ridge / Lasso-Regression, SVD and Elastic-Net not only solve fundamental machine learning problems, but are also the building blocks in a variety of other methods, such as decision trees and matrix factorizations. We suggest an algorithm that gets a finite set of $n$ $d$-dimensional real vectors and returns a weighted subset of $d+1$ vectors whose sum is exactly the same. The proof in Caratheodory’s Theorem (1907) computes such a subset in $O(n^2d^2)$ time and is thus not used in practice. Our algorithm computes this subset in $O(nd)$ time, using $O(\log n)$ calls to Caratheodory’s construction on small but “smart” subsets. This is based on a novel paradigm of fusion between different data summarization techniques, known as sketches and coresets. As an example application, we show how it can be used to boost the performance of existing LMS solvers, such as those in the scikit-learn library, by up to 100x. Generalization for streaming and distributed (big) data is trivial. Extensive experimental results and complete open source code are also provided.
Tasks Data Summarization
Published 2019-06-11
URL https://arxiv.org/abs/1906.04705v1
PDF https://arxiv.org/pdf/1906.04705v1.pdf
PWC https://paperswithcode.com/paper/fast-and-accurate-least-mean-squares-solvers
Repo
Framework
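
As background for the abstract, the sketch below implements the classical Caratheodory construction (the slow baseline the abstract refers to), not the paper's fast $O(nd)$ algorithm. The function name `caratheodory`, the tolerance `tol`, and the convention that the input weights are positive and sum to 1 are assumptions made for illustration. Each pass finds a direction in weight space that leaves the weighted sum unchanged and moves along it until one weight reaches zero.

```python
import numpy as np


def caratheodory(points, weights, tol=1e-12):
    """Return (indices, new_weights) using at most d + 1 points with the same weighted sum.

    Assumption: `weights` are positive and sum to 1. This is the classical
    iterative construction, written for clarity rather than speed.
    """
    P = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float).copy()
    idx = np.arange(len(P))
    d = P.shape[1]
    while len(idx) > d + 1:
        cur = P[idx]
        # d x (m - 1) matrix of differences to the first remaining point.
        A = (cur[1:] - cur[0]).T
        # With m - 1 > d columns, the last right singular vector lies in the null space of A.
        v_rest = np.linalg.svd(A, full_matrices=True)[2][-1]
        # Prepend the entry that makes v sum to zero, so sum_i v_i * p_i = 0 as well.
        v = np.concatenate(([-v_rest.sum()], v_rest))
        pos = v > tol
        ratios = np.full(len(v), np.inf)
        ratios[pos] = w[pos] / v[pos]
        j = int(np.argmin(ratios))
        # Largest step that keeps all weights non-negative; weight j becomes zero.
        w = w - ratios[j] * v
        w[j] = 0.0
        keep = w > tol
        idx, w = idx[keep], w[keep]
    return idx, w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(50, 3))
    uniform = np.full(50, 1 / 50)
    subset, new_w = caratheodory(pts, uniform)
    print(len(subset), np.allclose(new_w @ pts[subset], uniform @ pts))
```

In floating point the weighted sum is preserved only up to rounding error, so the check above uses a tolerance rather than exact equality.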

Batch Virtual Adversarial Training for Graph Convolutional Networks

Title Batch Virtual Adversarial Training for Graph Convolutional Networks
Authors Zhijie Deng, Yinpeng Dong, Jun Zhu
Abstract We present batch virtual adversarial training (BVAT), a novel regularization method for graph convolutional networks (GCNs). BVAT addresses the shortcoming of GCNs that do not consider the smoothness of the model’s output distribution against local perturbations around the input. We propose two algorithms, sample-based BVAT and optimization-based BVAT, which are suitable to promote the smoothness of the model for graph-structured data by either finding virtual adversarial perturbations for a subset of nodes far from each other or generating virtual adversarial perturbations for all nodes with an optimization process. Extensive experiments on three citation network datasets Cora, Citeseer and Pubmed and a knowledge graph dataset Nell validate the effectiveness of the proposed method, which establishes state-of-the-art results in the semi-supervised node classification tasks.
Tasks Node Classification
Published 2019-02-25
URL https://arxiv.org/abs/1902.09192v2
PDF https://arxiv.org/pdf/1902.09192v2.pdf
PWC https://paperswithcode.com/paper/batch-virtual-adversarial-training-for-graph
Repo
Framework

Unsupervised Learning Framework of Interest Point Via Properties Optimization

Title Unsupervised Learning Framework of Interest Point Via Properties Optimization
Authors Pei Yan, Yihua Tan, Yuan Xiao, Yuan Tai, Cai Wen
Abstract This paper presents an entirely unsupervised interest point training framework that jointly learns a detector and a descriptor, taking an image as input and outputting a probability and a description for every image point. The objective of the training framework is formulated as the joint probability distribution of the properties of the extracted points. The essential properties are chosen to be sparsity, repeatability and discriminability, each of which is formulated as a probability. To maximize the objective efficiently, a latent variable is introduced to represent the probability that a point satisfies the required properties. The original maximization can therefore be carried out with the Expectation Maximization (EM) algorithm. Considering the high computational cost of EM on large-scale image sets, we implement the optimization with an efficient strategy, a Mini-Batch approximation of EM (MBEM). In the experiments, both the detector and the descriptor are instantiated with a fully convolutional network named Property Network (PN). The experiments demonstrate that PN outperforms state-of-the-art methods on a number of image matching benchmarks without the need for retraining. PN also shows that the proposed training framework is flexible enough to adapt to diverse types of scenes.
Tasks
Published 2019-07-26
URL https://arxiv.org/abs/1907.11375v1
PDF https://arxiv.org/pdf/1907.11375v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-learning-framework-of-interest
Repo
Framework

Weak consistency of the 1-nearest neighbor measure with applications to missing data

Title Weak consistency of the 1-nearest neighbor measure with applications to missing data
Authors James Sharpnack
Abstract When data is partially missing at random, imputation and importance weighting are often used to estimate moments of the unobserved population. In this paper, we study 1-nearest neighbor (1NN) importance weighting, which estimates moments by replacing missing data with the complete data that is the nearest neighbor in the non-missing covariate space. We define an empirical measure, the 1NN measure, and show that it is weakly consistent for the measure of the missing data. The main idea behind this result is that the 1NN measure is performing inverse probability weighting in the limit. We study applications to missing data and mitigating the impact of covariate shift in prediction tasks.
Tasks Domain Adaptation, Imputation
Published 2019-02-06
URL https://arxiv.org/abs/1902.02408v2
PDF https://arxiv.org/pdf/1902.02408v2.pdf
PWC https://paperswithcode.com/paper/weak-consistency-of-the-1-nearest-neighbor
Repo
Framework
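
The 1NN importance-weighting idea in the abstract can be sketched in a few lines of NumPy. This is an illustrative reading of the abstract rather than the paper's code: the function names, the use of Euclidean distance, and the toy data are assumptions. Each unit with a missing response borrows the response of its nearest complete case in covariate space, which is equivalent to weighting each complete case by the share of missing units it is nearest to.

```python
import numpy as np


def one_nn_weights(x_complete: np.ndarray, x_missing: np.ndarray) -> np.ndarray:
    """Weight of each complete case: fraction of missing units whose nearest neighbor it is."""
    diff = x_missing[:, None, :] - x_complete[None, :, :]
    sq_dist = np.einsum("mnd,mnd->mn", diff, diff)  # (n_missing, n_complete)
    nearest = sq_dist.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(x_complete))
    return counts / len(x_missing)


def one_nn_moment(x_complete, y_complete, x_missing, g=lambda y: y) -> float:
    """Estimate E[g(Y)] over the missing population from the complete cases."""
    weights = one_nn_weights(np.asarray(x_complete), np.asarray(x_missing))
    return float(np.sum(weights * g(np.asarray(y_complete))))


if __name__ == "__main__":
    x_obs = np.array([[0.0], [10.0]])   # covariates of complete cases (toy data)
    y_obs = np.array([1.0, 5.0])        # their observed responses
    x_mis = np.array([[1.0], [2.0], [9.0]])  # covariates of units with missing responses
    print(one_nn_weights(x_obs, x_mis), one_nn_moment(x_obs, y_obs, x_mis))
```

Passing a different `g`, for example `lambda y: y ** 2`, estimates other moments with the same weights.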

Learning by Inertia: Self-supervised Monocular Visual Odometry for Road Vehicles

Title Learning by Inertia: Self-supervised Monocular Visual Odometry for Road Vehicles
Authors Chengze Wang, Yuan Yuan, Qi Wang
Abstract In this paper, we present iDVO (inertia-embedded deep visual odometry), a self-supervised learning based monocular visual odometry (VO) method for road vehicles. When modelling the geometric consistency between adjacent frames, most deep VO methods ignore the temporal continuity of the camera pose, which results in severe jagged fluctuations in the velocity curves. Observing that road vehicles tend to exhibit smooth dynamics most of the time, we design an inertia loss function to describe abnormal motion variation, which helps the model learn continuity from long-term camera ego-motion. Based on the recurrent convolutional neural network (RCNN) architecture, our method implicitly models the dynamics of road vehicles and their temporal continuity with an extended Long Short-Term Memory (LSTM) block. Furthermore, we develop a dynamic hard-edge mask to handle the non-consistency in fast camera motion by blocking the boundary region, which makes the overall non-consistency mask more efficient. The proposed method is evaluated on the KITTI dataset, and the results demonstrate state-of-the-art performance with respect to other monocular deep VO and SLAM approaches.
Tasks Monocular Visual Odometry, Visual Odometry
Published 2019-05-05
URL https://arxiv.org/abs/1905.01634v1
PDF https://arxiv.org/pdf/1905.01634v1.pdf
PWC https://paperswithcode.com/paper/learning-by-inertia-self-supervised-monocular
Repo
Framework

Hierarchical Soft Actor-Critic: Adversarial Exploration via Mutual Information Optimization

Title Hierarchical Soft Actor-Critic: Adversarial Exploration via Mutual Information Optimization
Authors Ari Azarafrooz, John Brock
Abstract We describe a novel extension of soft actor-critics for hierarchical Deep Q-Networks (HDQN) architectures using mutual information metric. The proposed extension provides a suitable framework for encouraging explorations in such hierarchical networks. A natural utilization of this framework is an adversarial setting, where meta-controller and controller play minimax over the mutual information objective but cooperate on maximizing expected rewards.
Tasks
Published 2019-06-17
URL https://arxiv.org/abs/1906.07122v1
PDF https://arxiv.org/pdf/1906.07122v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-soft-actor-critic-adversarial
Repo
Framework

High-dimensional Gaussian graphical model for network-linked data

Title High-dimensional Gaussian graphical model for network-linked data
Authors Tianxi Li, Cheng Qian, Elizaveta Levina, Ji Zhu
Abstract Graphical models are commonly used to represent conditional dependence relationships between variables. There are multiple methods available for exploring them from high-dimensional data, but almost all of them rely on the assumption that the observations are independent and identically distributed. At the same time, observations connected by a network are becoming increasingly common, and tend to violate these assumptions. Here we develop a Gaussian graphical model for observations connected by a network with potentially different mean vectors, varying smoothly over the network. We propose an efficient estimation algorithm and demonstrate its effectiveness on both simulated and real data, obtaining meaningful interpretable results on a statistics coauthorship network. We also prove that our method estimates both the inverse covariance matrix and the corresponding graph structure correctly under the assumption of network “cohesion”, which refers to the empirically observed phenomenon of network neighbors sharing similar traits.
Tasks
Published 2019-07-04
URL https://arxiv.org/abs/1907.02443v1
PDF https://arxiv.org/pdf/1907.02443v1.pdf
PWC https://paperswithcode.com/paper/high-dimensional-gaussian-graphical-model-for
Repo
Framework

An MBO scheme for clustering and semi-supervised clustering of signed networks

Title An MBO scheme for clustering and semi-supervised clustering of signed networks
Authors Mihai Cucuringu, Andrea Pizzoferrato, Yves van Gennip
Abstract We introduce a principled method for the signed clustering problem, where the goal is to partition a graph whose edge weights take both positive and negative values, such that edges within the same cluster are mostly positive, while edges spanning across clusters are mostly negative. Our method relies on a graph-based diffuse interface model formulation utilizing the Ginzburg-Landau functional, based on an adaptation of the classic numerical Merriman-Bence-Osher (MBO) scheme for minimizing such graph-based functionals. The proposed objective function aims to minimize the total weight of inter-cluster positively-weighted edges, while maximizing the total weight of the inter-cluster negatively-weighted edges. Our method scales to large sparse networks, and can be easily adjusted to incorporate labelled data information, as is often the case in the context of semi-supervised learning. We tested our method on a number of both synthetic stochastic block models and real-world data sets (including financial correlation matrices), and obtained promising results that compare favourably against a number of state-of-the-art approaches from the recent literature.
Tasks
Published 2019-01-10
URL https://arxiv.org/abs/1901.03091v5
PDF https://arxiv.org/pdf/1901.03091v5.pdf
PWC https://paperswithcode.com/paper/an-mbo-scheme-for-clustering-and-semi
Repo
Framework
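
The quantities the abstract's objective refers to can be made concrete with a small NumPy sketch. It only evaluates a given partition; it does not implement the MBO scheme or the Ginzburg-Landau functional. The function name `signed_cut_terms`, the dense symmetric adjacency matrix, and the toy graph are assumptions made for illustration.

```python
import numpy as np


def signed_cut_terms(adjacency, labels):
    """Return (positive weight across clusters, absolute negative weight across clusters).

    Assumption: `adjacency` is a symmetric matrix of signed edge weights with a
    zero diagonal. A good signed clustering makes the first term small and the
    second term large.
    """
    A = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    across = labels[:, None] != labels[None, :]
    upper = np.triu(np.ones(A.shape, dtype=bool), k=1)  # count each undirected edge once
    selected = across & upper
    positive_cut = A[selected & (A > 0)].sum()
    negative_cut = -A[selected & (A < 0)].sum()
    return float(positive_cut), float(negative_cut)


if __name__ == "__main__":
    # Toy graph: edges 0-1 (+1), 2-3 (+2), 0-2 (-1), 1-3 (+0.5).
    A = np.zeros((4, 4))
    for i, j, weight in [(0, 1, 1.0), (2, 3, 2.0), (0, 2, -1.0), (1, 3, 0.5)]:
        A[i, j] = A[j, i] = weight
    print(signed_cut_terms(A, [0, 0, 1, 1]))
    print(signed_cut_terms(A, [0, 1, 0, 1]))
```

Comparing the two partitions in the usage example shows why the first is preferable: it cuts less positive weight and more negative weight.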

Measuring Patent Claim Generation by Span Relevancy

Title Measuring Patent Claim Generation by Span Relevancy
Authors Jieh-Sheng Lee, Jieh Hsiang
Abstract Our goal of patent claim generation is to realize “augmented inventing” for inventors by leveraging the latest Deep Learning techniques. We envision the possibility of building an “auto-complete” function for inventors to conceive better inventions in the era of artificial intelligence. In order to generate patent claims with good quality, a fundamental question is how to measure it. We tackle the problem from a perspective of claim span relevancy. Patent claim language has rarely been explored in the NLP field. It is unique in its own way and contains rich explicit and implicit human annotations. In this work, we propose a span-based approach and a generic framework to measure patent claim generation quantitatively. In order to study the effectiveness of patent claim generation, we define a metric to measure whether two consecutive spans in a generated patent claim are relevant. We treat such relevancy measurement as a span-pair classification problem, following the concept of natural language inference. Technically, the span-pair classifier is implemented by fine-tuning a pre-trained language model. The patent claim generation is implemented by fine-tuning another pre-trained model. Specifically, we fine-tune a pre-trained Google BERT model to measure the patent claim spans generated by a fine-tuned OpenAI GPT-2 model. In this way, we re-use two of the state-of-the-art pre-trained models in the NLP field. Our result shows the effectiveness of the span-pair classifier after fine-tuning the pre-trained model. It further validates the quantitative metric of span relevancy in patent claim generation. In particular, we found that the span relevancy ratio measured by BERT becomes lower when the diversity in GPT-2 text generation becomes higher.
Tasks Language Modelling, Natural Language Inference, Text Generation
Published 2019-08-26
URL https://arxiv.org/abs/1908.09591v2
PDF https://arxiv.org/pdf/1908.09591v2.pdf
PWC https://paperswithcode.com/paper/measuring-patent-claim-generation-by-span
Repo
Framework

Derivational Morphological Relations in Word Embeddings

Title Derivational Morphological Relations in Word Embeddings
Authors Tomáš Musil, Jonáš Vidra, David Mareček
Abstract Derivation is a type of a word-formation process which creates new words from existing ones by adding, changing or deleting affixes. In this paper, we explore the potential of word embeddings to identify properties of word derivations in the morphologically rich Czech language. We extract derivational relations between pairs of words from DeriNet, a Czech lexical network, which organizes almost one million Czech lemmata into derivational trees. For each such pair, we compute the difference of the embeddings of the two words, and perform unsupervised clustering of the resulting vectors. Our results show that these clusters largely match manually annotated semantic categories of the derivational relations (e.g. the relation ‘bake–baker’ belongs to category ‘actor’, and a correct clustering puts it into the same cluster as ‘govern–governor’).
Tasks Word Embeddings
Published 2019-06-06
URL https://arxiv.org/abs/1906.02510v1
PDF https://arxiv.org/pdf/1906.02510v1.pdf
PWC https://paperswithcode.com/paper/derivational-morphological-relations-in-word
Repo
Framework
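
The difference-vector step in the abstract can be illustrated with a toy NumPy sketch. The three-dimensional vectors below are hand-made for illustration and are not real embeddings, the ‘quick–quickly’ pair is a hypothetical extra relation that does not come from the paper, and cosine similarity stands in for whatever clustering method the authors used. The point is only that pairs sharing a derivational relation should have similar offsets.

```python
import numpy as np

# Hand-made toy vectors (assumption: real experiments would use trained embeddings).
TOY_EMBEDDINGS = {
    "bake": np.array([1.0, 0.0, 0.2]),
    "baker": np.array([1.0, 1.0, 0.2]),
    "govern": np.array([0.0, 0.1, 1.0]),
    "governor": np.array([0.0, 1.1, 1.0]),
    "quick": np.array([0.5, 0.5, 0.0]),
    "quickly": np.array([0.5, 0.5, -1.0]),
}


def offset(embeddings, base: str, derived: str) -> np.ndarray:
    """Difference vector for one derivational pair (derived minus base)."""
    return embeddings[derived] - embeddings[base]


def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two non-zero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


if __name__ == "__main__":
    actor_1 = offset(TOY_EMBEDDINGS, "bake", "baker")
    actor_2 = offset(TOY_EMBEDDINGS, "govern", "governor")
    other = offset(TOY_EMBEDDINGS, "quick", "quickly")
    print(cosine(actor_1, actor_2), cosine(actor_1, other))
```

A clustering algorithm run on such offsets would be expected to group the two ‘actor’ pairs together and keep the third pair apart.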

Synthetic Video Generation for Robust Hand Gesture Recognition in Augmented Reality Applications

Title Synthetic Video Generation for Robust Hand Gesture Recognition in Augmented Reality Applications
Authors Varun Jain, Shivam Aggarwal, Suril Mehta, Ramya Hebbalaguppe
Abstract Hand gestures are a natural means of interaction in Augmented Reality and Virtual Reality (AR/VR) applications. Recently, there has been an increased focus on removing the dependence of accurate hand gesture recognition on the complex sensor setups found in expensive proprietary devices such as the Microsoft HoloLens, Daqri and Meta Glasses. Most such solutions rely either on multi-modal sensor data or on deep neural networks that can benefit greatly from an abundance of labelled data. Datasets are an integral part of any deep learning based research. They have been the principal reason for the substantial progress in this field, both in terms of providing enough data for the training of these models and for benchmarking competing algorithms. However, it is becoming increasingly difficult to generate enough labelled data for complex tasks such as hand gesture recognition. The goal of this work is to introduce a framework capable of generating photo-realistic videos with a labelled hand bounding box and fingertip, which can help in designing, training, and benchmarking models for hand-gesture recognition in AR/VR applications. We demonstrate the efficacy of our framework in generating videos with diverse backgrounds.
Tasks Gesture Recognition, Hand Gesture Recognition, Hand-Gesture Recognition, Video Generation
Published 2019-11-04
URL https://arxiv.org/abs/1911.01320v3
PDF https://arxiv.org/pdf/1911.01320v3.pdf
PWC https://paperswithcode.com/paper/synthetic-video-generation-for-robust-hand
Repo
Framework