February 1, 2020

3094 words 15 mins read

Paper Group AWR 305

Unsupervised Network Embedding for Graph Visualization, Clustering and Classification. On Generalizing Detection Models for Unconstrained Environments. Learning Topological Representation for Networks via Hierarchical Sampling. SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver. MLSL: Multi-Level Self- …

Unsupervised Network Embedding for Graph Visualization, Clustering and Classification

Title Unsupervised Network Embedding for Graph Visualization, Clustering and Classification
Authors Leonardo Gutiérrez-Gómez, Jean-Charles Delvenne
Abstract A main challenge in mining network-based data is finding effective ways to represent or encode graph structures so that they can be efficiently exploited by machine learning algorithms. Several methods have focused on network representation at the node/edge or substructure level. However, many real-life challenges, such as time-varying, multilayer, chemical-compound and brain networks, involve analyzing a family of graphs rather than a single one, opening additional challenges in graph comparison and representation. Traditional approaches to learning representations rely on hand-crafted, specialized heuristics to extract meaningful information about the graphs, e.g., statistical properties and structural features, as well as engineered graph distances to quantify dissimilarity between networks. In this work we provide an unsupervised approach to learning an embedding representation for a collection of graphs so that it can be used in numerous graph mining tasks. By applying an unsupervised neural network to the input graphs, we aim to capture the underlying distribution of the data in order to discriminate between different classes of networks. Our method is assessed empirically on synthetic and real-life datasets and evaluated in three different tasks: graph clustering, visualization and classification. Results reveal that our method outperforms well-known graph distances and graph kernels in clustering and classification tasks, while being highly efficient in runtime.
Tasks Graph Clustering, Network Embedding
Published 2019-02-25
URL http://arxiv.org/abs/1903.05980v2
PDF http://arxiv.org/pdf/1903.05980v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-network-embedding-for-graph
Repo https://github.com/leoguti85/GraphEmbs
Framework none
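
The abstract does not pin down the architecture, so the following is only a rough sketch of one plausible reading: an autoencoder over fixed-size per-graph descriptor vectors, whose bottleneck serves as the graph embedding. The degree-histogram descriptor and the tiny network below are illustrative assumptions, not the authors' model.

```python
# Hedged sketch: embedding a *collection* of graphs with an autoencoder.
# The degree-histogram descriptor and layer sizes are assumptions.
import numpy as np
import networkx as nx
import torch
import torch.nn as nn

def graph_descriptor(g: nx.Graph, bins: int = 32) -> np.ndarray:
    """Fixed-size descriptor: normalized degree histogram (an assumption)."""
    degrees = [d for _, d in g.degree()]
    hist, _ = np.histogram(degrees, bins=bins, range=(0, bins))
    return hist / max(1, g.number_of_nodes())

graphs = [nx.erdos_renyi_graph(50, p) for p in (0.05, 0.1, 0.3) for _ in range(20)]
X = torch.tensor(np.stack([graph_descriptor(g) for g in graphs]), dtype=torch.float32)

auto = nn.Sequential(nn.Linear(32, 8), nn.ReLU(), nn.Linear(8, 32))
opt = torch.optim.Adam(auto.parameters(), lr=1e-2)
for _ in range(200):                      # reconstruct the descriptors
    opt.zero_grad()
    loss = nn.functional.mse_loss(auto(X), X)
    loss.backward()
    opt.step()

embeddings = auto[0](X).detach()          # 8-d graph embeddings for clustering etc.
```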

On Generalizing Detection Models for Unconstrained Environments

Title On Generalizing Detection Models for Unconstrained Environments
Authors Prajjwal Bhargava
Abstract Object detection has seen tremendous progress in recent years. However, current algorithms don't generalize well when tested on diverse data distributions. We address the problem of incremental learning in object detection on the India Driving Dataset (IDD). Our approach involves using multiple domain-specific classifiers and effective transfer learning techniques focused on avoiding catastrophic forgetting. We evaluate our approach on the IDD and BDD100K datasets. Results show the effectiveness of our domain-adaptive approach in the case of domain shifts in environments.
Tasks Object Detection, Transfer Learning
Published 2019-09-28
URL https://arxiv.org/abs/1909.13080v1
PDF https://arxiv.org/pdf/1909.13080v1.pdf
PWC https://paperswithcode.com/paper/on-generalizing-detection-models-for
Repo https://github.com/prajjwal1/autonomous-object-detection
Framework pytorch
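
A minimal sketch of the "multiple domain-specific classifiers" pattern described above: a shared feature extractor with one classification head per domain, so that training a new domain's head leaves the other heads untouched. This illustrates the pattern only; the backbone, sizes, and head structure are assumptions, not the authors' detector.

```python
# Hedged sketch: shared backbone + per-domain heads (IDD, BDD100K).
import torch
import torch.nn as nn

class MultiDomainClassifier(nn.Module):
    def __init__(self, feat_dim: int = 256, num_classes: int = 10,
                 domains=("idd", "bdd100k")):
        super().__init__()
        self.backbone = nn.Sequential(            # stand-in for a detector backbone
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # one head per domain; only the active head is updated on new data,
        # which limits catastrophic forgetting on the other domains
        self.heads = nn.ModuleDict(
            {d: nn.Linear(feat_dim, num_classes) for d in domains})

    def forward(self, x: torch.Tensor, domain: str) -> torch.Tensor:
        return self.heads[domain](self.backbone(x))

model = MultiDomainClassifier()
logits = model(torch.randn(2, 3, 64, 64), domain="idd")
```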

Learning Topological Representation for Networks via Hierarchical Sampling

Title Learning Topological Representation for Networks via Hierarchical Sampling
Authors Guoji Fu, Chengbin Hou, Xin Yao
Abstract Topological information is essential for studying the relationships between nodes in a network. Recently, Network Representation Learning (NRL), which projects a network into a low-dimensional vector space, has shown its advantages in analyzing large-scale networks. However, most existing NRL methods are designed to preserve the local topology of a network and fail to capture its global topology. To tackle this issue, we propose a new NRL framework, named HSRL, to help existing NRL methods capture both the local and global topological information of a network. Specifically, HSRL recursively compresses an input network into a series of smaller networks using a community-aware compressing strategy. Then, an existing NRL method is used to learn node embeddings for each compressed network. Finally, the node embeddings of the input network are obtained by concatenating the node embeddings from all compressed networks. Empirical studies of link prediction on five real-world datasets demonstrate the advantages of HSRL over state-of-the-art methods.
Tasks Link Prediction, Representation Learning
Published 2019-02-15
URL http://arxiv.org/abs/1902.06684v1
PDF http://arxiv.org/pdf/1902.06684v1.pdf
PWC https://paperswithcode.com/paper/learning-topological-representation-for
Repo https://github.com/fuguoji/HSRL
Framework none
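
The compress/embed/concatenate pipeline above maps cleanly to code. Below is a hedged sketch: a simple spectral embedding stands in for "any existing NRL method" (HSRL is method-agnostic), and networkx's greedy modularity communities stand in for the paper's community-aware compression; both are assumptions for illustration.

```python
# Hedged sketch of the HSRL pipeline: coarsen recursively, embed each level,
# concatenate embeddings along each node's super-node lineage.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def coarsen(g: nx.Graph):
    """Collapse communities into super-nodes; return coarse graph + mapping."""
    comms = list(greedy_modularity_communities(g))
    node_to_super = {n: i for i, c in enumerate(comms) for n in c}
    coarse = nx.Graph()
    coarse.add_nodes_from(range(len(comms)))
    for u, v in g.edges():
        cu, cv = node_to_super[u], node_to_super[v]
        if cu != cv:
            coarse.add_edge(cu, cv)
    return coarse, node_to_super

def spectral_embed(g: nx.Graph, dim: int = 4) -> dict:
    """Toy stand-in for an NRL method: Laplacian eigenvectors as embeddings."""
    nodes = list(g.nodes())
    lap = nx.laplacian_matrix(g, nodelist=nodes).toarray().astype(float)
    _, vecs = np.linalg.eigh(lap)
    k = min(dim, len(nodes))
    return {n: vecs[i, :k] for i, n in enumerate(nodes)}

g = nx.karate_club_graph()
levels, mappings = [g], []
for _ in range(2):                       # two rounds of compression
    coarse, mapping = coarsen(levels[-1])
    levels.append(coarse)
    mappings.append(mapping)

per_level = [spectral_embed(lvl) for lvl in levels]
final = {}
for n in g.nodes():                      # concatenate along the lineage
    parts, cur = [per_level[0][n]], n
    for lvl, mapping in enumerate(mappings, start=1):
        cur = mapping[cur]
        parts.append(per_level[lvl][cur])
    final[n] = np.concatenate(parts)
```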

SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver

Title SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver
Authors Po-Wei Wang, Priya L. Donti, Bryan Wilder, Zico Kolter
Abstract Integrating logical reasoning within deep learning architectures has been a major goal of modern AI systems. In this paper, we propose a new direction toward this goal by introducing a differentiable (smoothed) maximum satisfiability (MAXSAT) solver that can be integrated into the loop of larger deep learning systems. Our (approximate) solver is based upon a fast coordinate descent approach to solving the semidefinite program (SDP) associated with the MAXSAT problem. We show how to analytically differentiate through the solution to this SDP and efficiently solve the associated backward pass. We demonstrate that by integrating this solver into end-to-end learning systems, we can learn the logical structure of challenging problems in a minimally supervised fashion. In particular, we show that we can learn the parity function using single-bit supervision (a traditionally hard task for deep networks) and learn how to play 9x9 Sudoku solely from examples. We also solve a "visual Sudoku" problem that maps images of Sudoku puzzles to their associated logical solutions by combining our MAXSAT solver with a traditional convolutional architecture. Our approach thus shows promise in integrating logical structures within deep learning.
Tasks Game of Sudoku
Published 2019-05-29
URL https://arxiv.org/abs/1905.12149v1
PDF https://arxiv.org/pdf/1905.12149v1.pdf
PWC https://paperswithcode.com/paper/satnet-bridging-deep-learning-and-logical
Repo https://github.com/locuslab/SATNet
Framework pytorch
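
The core numerical engine is a low-rank coordinate-descent ("mixing") method for the SDP relaxation. SATNet applies it to a smoothed MAXSAT SDP; the sketch below shows the same update on the simpler MAX-CUT SDP, where the coordinate-wise minimizer has a closed form. This is an illustration of the solver idea, not SATNet's actual forward pass.

```python
# Hedged sketch of the low-rank "mixing" coordinate descent behind SATNet's
# solver, demonstrated on the MAX-CUT SDP:
#   minimize sum_ij W_ij <v_i, v_j>  subject to ||v_i|| = 1,
# whose coordinate-wise minimizer is v_i = -normalize(sum_j W_ij v_j).
import numpy as np

def maxcut_mixing(W: np.ndarray, k: int = 8, iters: int = 50) -> np.ndarray:
    n = W.shape[0]
    rng = np.random.default_rng(0)
    V = rng.normal(size=(n, k))
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    for _ in range(iters):
        for i in range(n):               # exact coordinate update per vector
            g = W[i] @ V                 # sum_j W_ij v_j
            norm = np.linalg.norm(g)
            if norm > 1e-12:
                V[i] = -g / norm
    return V

# rounding: sign of a random hyperplane projection recovers the cut
W = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
V = maxcut_mixing(W)
cut = np.sign(V @ np.random.default_rng(1).normal(size=V.shape[1]))
```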

MLSL: Multi-Level Self-Supervised Learning for Domain Adaptation with Spatially Independent and Semantically Consistent Labeling

Title MLSL: Multi-Level Self-Supervised Learning for Domain Adaptation with Spatially Independent and Semantically Consistent Labeling
Authors Javed Iqbal, Mohsen Ali
Abstract Most recent deep semantic segmentation algorithms suffer from large generalization errors, even when powerful hierarchical representation models based on convolutional neural networks are employed. This can be attributed to limited training data and a large distribution gap between the training and test domain datasets. In this paper, we propose a multi-level self-supervised learning model for domain adaptation of semantic segmentation. Exploiting the idea that an object (and, given context, most of the stuff) should be labeled consistently regardless of its location, we generate spatially independent and semantically consistent (SISC) pseudo-labels by segmenting multiple sub-images using the base model and designing an aggregation strategy. Image-level pseudo weak-labels (PWL) are computed to guide domain adaptation by capturing global context similarity between the source and target domains at the latent-space level. This helps the latent space learn the representation even when very few pixels belong to a given category (a small object, for example) compared to the rest of the image. Our multi-level self-supervised learning (MLSL) approach outperforms existing state-of-the-art (self-supervised or adversarial learning) algorithms. Specifically, keeping all settings similar and employing MLSL, we obtain an mIoU gain of 5.1% on GTA-V to Cityscapes adaptation and 4.3% on SYNTHIA to Cityscapes adaptation compared to the existing state-of-the-art method.
Tasks Domain Adaptation, Semantic Segmentation
Published 2019-09-30
URL https://arxiv.org/abs/1909.13776v1
PDF https://arxiv.org/pdf/1909.13776v1.pdf
PWC https://paperswithcode.com/paper/mlsl-multi-level-self-supervised-learning-for
Repo https://github.com/engrjavediqbal/MLSL
Framework mxnet
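
A hedged sketch of the SISC pseudo-labeling step: segment several shifted sub-images, average the class probabilities where crops overlap, and keep only confident pixels. The crop size, stride, confidence threshold, and 19-class (Cityscapes-style) output are illustrative assumptions; `model` is a placeholder for any pretrained segmentation network, not the paper's exact aggregation strategy.

```python
# Hedged sketch: spatially independent, semantically consistent pseudo-labels.
import torch

def sisc_pseudo_labels(model, image: torch.Tensor, crop: int = 64,
                       stride: int = 32, thresh: float = 0.9,
                       num_classes: int = 19):
    _, H, W = image.shape
    probs = torch.zeros(num_classes, H, W)
    counts = torch.zeros(1, H, W)
    for top in range(0, H - crop + 1, stride):
        for left in range(0, W - crop + 1, stride):
            patch = image[:, top:top+crop, left:left+crop].unsqueeze(0)
            with torch.no_grad():
                p = torch.softmax(model(patch), dim=1)[0]   # (C, crop, crop)
            probs[:, top:top+crop, left:left+crop] += p
            counts[:, top:top+crop, left:left+crop] += 1
    probs /= counts.clamp(min=1)                  # average overlapping crops
    conf, labels = probs.max(dim=0)
    labels[conf < thresh] = 255                   # ignore uncertain pixels
    return labels
```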

Stochastic Gradient Trees

Title Stochastic Gradient Trees
Authors Henry Gouk, Bernhard Pfahringer, Eibe Frank
Abstract We present an algorithm for learning decision trees using stochastic gradient information as the source of supervision. In contrast to previous approaches to gradient-based tree learning, our method operates in the incremental learning setting rather than the batch learning setting, and does not make use of soft splits or require the construction of a new tree for every update. We demonstrate how one can apply these decision trees to different problems by changing only the loss function, using classification, regression, and multi-instance learning as example applications. In the experimental evaluation, our method performs similarly to standard incremental classification trees, outperforms state-of-the-art incremental regression trees, and achieves comparable performance with batch multi-instance learning methods.
Tasks Multi-Label Classification
Published 2019-01-23
URL https://arxiv.org/abs/1901.07777v3
PDF https://arxiv.org/pdf/1901.07777v3.pdf
PWC https://paperswithcode.com/paper/stochastic-gradient-trees
Repo https://github.com/henrygouk/stochastic-gradient-trees
Framework none
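
The core mechanic is scoring candidate splits from accumulated gradient statistics rather than raw labels. The sketch below shows the standard quadratic gain formula used in gradient-based tree learning; SGT additionally maintains these statistics incrementally and tests splits statistically, which this toy batch version does not attempt.

```python
# Hedged sketch: choosing a split from gradient/Hessian statistics.
import numpy as np

def split_gain(grad: np.ndarray, hess: np.ndarray, mask: np.ndarray,
               lam: float = 1.0) -> float:
    """Loss reduction from splitting instances into mask / ~mask leaves."""
    def score(g, h):
        return (g.sum() ** 2) / (h.sum() + lam)
    return 0.5 * (score(grad[mask], hess[mask])
                  + score(grad[~mask], hess[~mask])
                  - score(grad, hess))

# squared-error loss: gradient = prediction - target, Hessian = 1
x = np.random.default_rng(0).uniform(size=200)
y = (x > 0.5).astype(float)
pred = np.full_like(y, y.mean())
grad, hess = pred - y, np.ones_like(y)

best = max(((t, split_gain(grad, hess, x <= t))
            for t in np.linspace(0.1, 0.9, 9)), key=lambda p: p[1])
print("best threshold, gain:", best)
```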

Dynamics-aware Embeddings

Title Dynamics-aware Embeddings
Authors William Whitney, Rajat Agarwal, Kyunghyun Cho, Abhinav Gupta
Abstract In this paper we consider self-supervised representation learning to improve sample efficiency in reinforcement learning (RL). We propose a forward prediction objective for simultaneously learning embeddings of states and action sequences. These embeddings capture the structure of the environment’s dynamics, enabling efficient policy learning. We demonstrate that our action embeddings alone improve the sample efficiency and peak performance of model-free RL on control from low-dimensional states. By combining state and action embeddings, we achieve efficient learning of high-quality policies on goal-conditioned continuous control from pixel observations in only 1-2 million environment steps.
Tasks Continuous Control, Representation Learning
Published 2019-08-25
URL https://arxiv.org/abs/1908.09357v3
PDF https://arxiv.org/pdf/1908.09357v3.pdf
PWC https://paperswithcode.com/paper/dynamics-aware-embeddings
Repo https://github.com/willwhitney/dynamics-aware-embeddings
Framework pytorch
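
A hedged sketch of the forward-prediction objective described above: encode the current state and an action sequence, then predict the embedding of the resulting future state. Network sizes, the MSE latent loss, and the stop-gradient on the target embedding are assumptions for illustration.

```python
# Hedged sketch: joint state/action-sequence embeddings via forward prediction.
import torch
import torch.nn as nn

state_dim, act_dim, horizon, z_dim = 8, 2, 4, 16
enc_s = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))
enc_a = nn.Sequential(nn.Linear(act_dim * horizon, 64), nn.ReLU(), nn.Linear(64, z_dim))
predict = nn.Sequential(nn.Linear(2 * z_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))

def forward_prediction_loss(s_t, actions, s_future):
    z_s = enc_s(s_t)
    z_a = enc_a(actions.flatten(start_dim=1))      # embed the action sequence
    z_pred = predict(torch.cat([z_s, z_a], dim=-1))
    with torch.no_grad():                          # assumption: stop-gradient
        z_target = enc_s(s_future)                 # on the target embedding
    return nn.functional.mse_loss(z_pred, z_target)

loss = forward_prediction_loss(torch.randn(32, state_dim),
                               torch.randn(32, horizon, act_dim),
                               torch.randn(32, state_dim))
```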

R-Transformer: Recurrent Neural Network Enhanced Transformer

Title R-Transformer: Recurrent Neural Network Enhanced Transformer
Authors Zhiwei Wang, Yao Ma, Zitao Liu, Jiliang Tang
Abstract Recurrent neural networks have long been the dominant choice for sequence modeling. However, they suffer from two severe issues: they are weak at capturing very long-term dependencies, and their sequential computation cannot be parallelized. Therefore, many non-recurrent sequence models built on convolution and attention operations have been proposed recently. Notably, models with multi-head attention, such as the Transformer, have demonstrated extreme effectiveness in capturing long-term dependencies in a variety of sequence modeling tasks. Despite their success, however, these models lack the components needed to model local structures in sequences and rely heavily on position embeddings, which have limited effect and require considerable design effort. In this paper, we propose the R-Transformer, which enjoys the advantages of both RNNs and the multi-head attention mechanism while avoiding their respective drawbacks. The proposed model can effectively capture both local structures and global long-term dependencies in sequences without using any position embeddings. We evaluate R-Transformer through extensive experiments with data from a wide range of domains, and the empirical results show that R-Transformer outperforms state-of-the-art methods by a large margin in most tasks. We have made the code publicly available at https://github.com/DSE-MSU/R-transformer.
Tasks Language Modelling, Music Modeling, Sequential Image Classification
Published 2019-07-12
URL https://arxiv.org/abs/1907.05572v1
PDF https://arxiv.org/pdf/1907.05572v1.pdf
PWC https://paperswithcode.com/paper/r-transformer-recurrent-neural-network
Repo https://github.com/DSE-MSU/R-transformer
Framework pytorch
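
The recurrent ingredient is a "LocalRNN": a small RNN run over a sliding window at every position, whose last hidden state becomes that position's locality-aware representation, removing the need for position embeddings before the attention layers. The sketch below shows that windowing mechanic; the GRU cell, window size, and dimensions are illustrative choices, not the paper's exact configuration.

```python
# Hedged sketch of a LocalRNN layer: per-position windowed recurrence.
import torch
import torch.nn as nn

class LocalRNN(nn.Module):
    def __init__(self, d_model: int = 32, window: int = 5):
        super().__init__()
        self.window = window
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, T, d)
        B, T, d = x.shape
        pad = x.new_zeros(B, self.window - 1, d)           # left-pad so every
        x = torch.cat([pad, x], dim=1)                     # position has a window
        windows = x.unfold(1, self.window, 1)              # (B, T, d, window)
        windows = windows.permute(0, 1, 3, 2).reshape(B * T, self.window, d)
        _, h = self.rnn(windows)                           # last hidden per window
        return h.squeeze(0).view(B, T, d)

out = LocalRNN()(torch.randn(2, 10, 32))    # output then feeds multi-head attention
```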

Vector-Valued Graph Trend Filtering with Non-Convex Penalties

Title Vector-Valued Graph Trend Filtering with Non-Convex Penalties
Authors Rohan Varma, Harlin Lee, Jelena Kovačević, Yuejie Chi
Abstract This work studies the denoising of piecewise smooth graph signals that exhibit inhomogeneous levels of smoothness over a graph, where the value at each node can be vector-valued. We extend the graph trend filtering framework to denoising vector-valued graph signals with a family of non-convex regularizers, which exhibit superior recovery performance over existing convex regularizers. Using an oracle inequality, we establish the statistical error rates of first-order stationary points of the proposed non-convex method for generic graphs. Furthermore, we present an ADMM-based algorithm to solve the proposed method and establish its convergence. Numerical experiments are conducted on both synthetic and real-world data for denoising, support recovery, event detection, and semi-supervised classification.
Tasks Denoising
Published 2019-05-29
URL https://arxiv.org/abs/1905.12692v3
PDF https://arxiv.org/pdf/1905.12692v3.pdf
PWC https://paperswithcode.com/paper/vector-valued-graph-trend-filtering-with-non
Repo https://github.com/HarlinLee/nonconvex-GTF-public
Framework none
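
To make the ADMM structure concrete, here is a hedged sketch for the convex l2,1 special case of vector-valued graph trend filtering; the paper's non-convex penalties would replace the group soft-threshold step with a non-convex proximal step. The path graph and parameters are illustrative.

```python
# Hedged sketch: ADMM for  min_X 0.5*||Y - X||_F^2 + lam * sum_edges ||x_i - x_j||_2
import numpy as np

def gtf_admm(Y, edges, lam=1.0, rho=1.0, iters=100):
    n, d = Y.shape
    m = len(edges)
    D = np.zeros((m, n))                      # graph incidence matrix
    for k, (i, j) in enumerate(edges):
        D[k, i], D[k, j] = 1.0, -1.0
    Z = np.zeros((m, d))
    U = np.zeros((m, d))
    A = np.eye(n) + rho * D.T @ D             # fixed X-update system
    for _ in range(iters):
        X = np.linalg.solve(A, Y + rho * D.T @ (Z - U))
        V = D @ X + U
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        shrink = np.maximum(1 - (lam / rho) / np.maximum(norms, 1e-12), 0)
        Z = shrink * V                        # group soft-threshold per edge
        U += D @ X - Z
    return X

edges = [(0, 1), (1, 2), (2, 3)]              # path graph, vector-valued signal
Y = np.vstack([np.zeros((2, 3)), np.ones((2, 3))]) + 0.1 * np.random.randn(4, 3)
X = gtf_admm(Y, edges, lam=0.5)
```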

DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance

Title DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance
Authors Yilun Zhang, Ty Nguyen, Ian D. Miller, Shreyas S. Shivakumar, Steven Chen, Camillo J. Taylor, Vijay Kumar
Abstract Depth estimation is an important capability for autonomous vehicles to understand and reconstruct 3D environments and avoid obstacles during operation. Accurate depth sensors such as LiDARs are often heavy and expensive and can only provide sparse depth, while lighter depth sensors such as stereo cameras are noisier in comparison. We propose an end-to-end learning algorithm that is capable of using sparse, noisy input depth for refinement and depth completion. Our model also produces the camera pose as a byproduct, making it a great solution for autonomous systems. We evaluate our approach on both indoor and outdoor datasets. Empirical results show that our method performs well on the KITTI dataset when compared to other competing methods, while having superior performance in dealing with sparse, noisy input depth on the TUM dataset.
Tasks Autonomous Vehicles, Depth Completion, Depth Estimation, Motion Estimation
Published 2019-03-15
URL https://arxiv.org/abs/1903.06397v4
PDF https://arxiv.org/pdf/1903.06397v4.pdf
PWC https://paperswithcode.com/paper/dfinenet-ego-motion-estimation-and-depth
Repo https://github.com/Ougui9/DFineNet
Framework none

Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures

Title Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures
Authors Kyle Yee, Ayan Chakrabarti
Abstract Modern neural network-based algorithms are able to produce highly accurate depth estimates from stereo image pairs, nearly matching the reliability of measurements from more expensive depth sensors. However, this accuracy comes with a higher computational cost, since these methods use network architectures designed to compute and process matching scores across all candidate matches at all locations, with floating point computations repeated across a match volume with dimensions corresponding to both space and disparity. This leads to longer running times to process each image pair, making them impractical for real-time use in robots and autonomous vehicles. We propose a new stereo algorithm that employs a significantly more efficient network architecture. Our method builds an initial match cost volume using traditional matching costs that are fast to compute, and trains a network to estimate disparity from this volume. Crucially, our network only employs per-pixel and two-dimensional convolution operations: to summarize the match information at each location as a low-dimensional feature vector, and to spatially process these "cost-signature" features to produce a dense disparity map. Experimental results on the KITTI benchmark show that our method delivers competitive accuracy at significantly higher speeds, running at 48 frames per second on a modern GPU.
Tasks Autonomous Vehicles
Published 2019-03-08
URL http://arxiv.org/abs/1903.04939v1
PDF http://arxiv.org/pdf/1903.04939v1.pdf
PWC https://paperswithcode.com/paper/fast-deep-stereo-with-2d-convolutional
Repo https://github.com/ayanc/fdscs
Framework tf
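
A hedged sketch of the cost-signature idea: compress a traditional match-cost volume (D candidate disparities per pixel) to a low-dimensional per-pixel signature with 1x1 convolutions, then process signatures spatially with ordinary 2D convolutions to regress disparity. All layer sizes are illustrative, not the paper's architecture.

```python
# Hedged sketch: per-pixel summarization + 2D spatial processing of costs.
import torch
import torch.nn as nn

D = 64                                            # candidate disparities
net = nn.Sequential(
    nn.Conv2d(D, 32, kernel_size=1), nn.ReLU(),   # per-pixel summarization
    nn.Conv2d(32, 16, kernel_size=1), nn.ReLU(),  # -> 16-d cost signature
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),   # spatial 2D conv
    nn.Conv2d(32, 1, kernel_size=3, padding=1),   # dense disparity map
)

cost_volume = torch.randn(1, D, 96, 320)          # (B, D, H, W), e.g. census costs
disparity = net(cost_volume)                      # (1, 1, 96, 320)
```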

Leveraging Shape Completion for 3D Siamese Tracking

Title Leveraging Shape Completion for 3D Siamese Tracking
Authors Silvio Giancola, Jesus Zarzar, Bernard Ghanem
Abstract Point clouds are challenging to process due to their sparsity; therefore, autonomous vehicles rely more on appearance attributes than on pure geometric features. However, 3D LIDAR perception can provide crucial information for urban navigation in challenging light or weather conditions. In this paper, we investigate the versatility of shape completion for 3D object tracking in LIDAR point clouds. We design a Siamese tracker that encodes model and candidate shapes into a compact latent representation. We regularize the encoding by enforcing the latent representation to decode into an object model shape. We observe that 3D object tracking and 3D shape completion complement each other. Learning a more meaningful latent representation shows better discriminatory capabilities, leading to improved tracking performance. We test our method on the KITTI Tracking set using car 3D bounding boxes. Our model reaches a 76.94% Success rate and 81.38% Precision for 3D object tracking, with the shape completion regularization leading to an improvement of 3% in both metrics.
Tasks Autonomous Vehicles, Object Tracking
Published 2019-03-05
URL http://arxiv.org/abs/1903.01784v2
PDF http://arxiv.org/pdf/1903.01784v2.pdf
PWC https://paperswithcode.com/paper/leveraging-shape-completion-for-3d-siamese
Repo https://github.com/SilvioGiancola/ShapeCompletion3DTracking
Framework pytorch
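
A hedged sketch of the joint objective: a Siamese encoder scores candidate point clouds against the model shape via latent similarity, while a decoder regularizes the latent space to reconstruct the model shape. The PointNet-style max-pool encoder, Chamfer distance, cosine similarity, and loss weighting below are illustrative assumptions, not the paper's exact losses.

```python
# Hedged sketch: Siamese latent matching + shape-completion regularization.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 128))
decoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 3 * 256))

def encode(points):                               # points: (N, 3) -> (128,)
    return encoder(points).max(dim=0).values      # PointNet-style max pool

def chamfer(a, b):                                # symmetric point-set distance
    d = torch.cdist(a, b)
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

model_pts, candidate_pts = torch.randn(256, 3), torch.randn(256, 3)
z_model, z_cand = encode(model_pts), encode(candidate_pts)
similarity = torch.cosine_similarity(z_model, z_cand, dim=0)   # tracking score
recon = decoder(z_model).view(256, 3)
loss = (1 - similarity) + 0.1 * chamfer(recon, model_pts)      # joint objective
```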

Noise Flow: Noise Modeling with Conditional Normalizing Flows

Title Noise Flow: Noise Modeling with Conditional Normalizing Flows
Authors Abdelrahman Abdelhamed, Marcus A. Brubaker, Michael S. Brown
Abstract Modeling and synthesizing image noise is an important aspect in many computer vision applications. The long-standing additive white Gaussian and heteroscedastic (signal-dependent) noise models widely used in the literature provide only a coarse approximation of real sensor noise. This paper introduces Noise Flow, a powerful and accurate noise model based on recent normalizing flow architectures. Noise Flow combines well-established basic parametric noise models (e.g., signal-dependent noise) with the flexibility and expressiveness of normalizing flow networks. The result is a single, comprehensive, compact noise model containing fewer than 2500 parameters yet able to represent multiple cameras and gain factors. Noise Flow dramatically outperforms existing noise models, with 0.42 nats/pixel improvement over the camera-calibrated noise level functions, which translates to 52% improvement in the likelihood of sampled noise. Noise Flow represents the first serious attempt to go beyond simple parametric models to one that leverages the power of deep learning and data-driven noise distributions.
Tasks
Published 2019-08-22
URL https://arxiv.org/abs/1908.08453v1
PDF https://arxiv.org/pdf/1908.08453v1.pdf
PWC https://paperswithcode.com/paper/noise-flow-noise-modeling-with-conditional
Repo https://github.com/BorealisAI/noise_flow
Framework tf
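
For context, here is a hedged sketch of the signal-dependent (heteroscedastic) baseline that Noise Flow builds on and outperforms: Gaussian noise whose variance is an affine noise level function of intensity, scaled by sensor gain. The beta parameters and gain scaling are made-up illustrative values, not calibrated camera parameters.

```python
# Hedged sketch: sampling from a heteroscedastic noise level function (NLF).
import numpy as np

def sample_nlf_noise(clean: np.ndarray, gain: float,
                     beta1: float = 1e-4, beta2: float = 1e-6) -> np.ndarray:
    var = beta1 * clean * gain + beta2 * gain**2   # signal-dependent variance
    return np.random.default_rng(0).normal(0.0, np.sqrt(var))

clean = np.random.default_rng(1).uniform(0, 1, size=(64, 64))   # raw intensities
noisy = clean + sample_nlf_noise(clean, gain=4.0)
```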

Low-Level Linguistic Controls for Style Transfer and Content Preservation

Title Low-Level Linguistic Controls for Style Transfer and Content Preservation
Authors Katy Gero, Chris Kedzie, Jonathan Reeve, Lydia Chilton
Abstract Despite the success of style transfer in image processing, it has seen limited progress in natural language generation. Part of the problem is that content is not as easily decoupled from style in the text domain. Curiously, in the field of stylometry, content does not figure prominently in practical methods of discriminating stylistic elements, such as authorship and genre. Rather, syntax and function words are the most salient features. Drawing on this work, we model style as a suite of low-level linguistic controls, such as the frequency of pronouns, prepositions, and subordinate clause constructions. We train a neural encoder-decoder model to reconstruct reference sentences given only content words and the settings of the controls. We perform style transfer by keeping the content words fixed while adjusting the controls to be indicative of another style. In experiments, we show that the model reliably responds to the linguistic controls, and we perform both automatic and manual evaluations of style transfer. We find we can fool a style classifier 84% of the time, and that our model produces highly diverse and stylistically distinctive outputs. This work introduces a formal, extendable model of style that can add control to any neural text generation system.
Tasks Style Transfer, Text Generation
Published 2019-11-08
URL https://arxiv.org/abs/1911.03385v1
PDF https://arxiv.org/pdf/1911.03385v1.pdf
PWC https://paperswithcode.com/paper/low-level-linguistic-controls-for-style
Repo https://github.com/kedz/styleeq
Framework none
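
A hedged sketch of extracting two of the control values named above (pronoun and preposition frequency) from a sentence. The word lists are small illustrative samples and the tokenizer is naive; a real pipeline would use a POS tagger, and the exact control set here is an assumption.

```python
# Hedged sketch: computing low-level linguistic control values per sentence.
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they", "me", "him", "her"}
PREPOSITIONS = {"in", "on", "at", "by", "with", "from", "to", "of", "for"}

def control_vector(sentence: str) -> dict:
    tokens = sentence.lower().split()
    n = max(1, len(tokens))
    return {
        "pronoun_freq": sum(t in PRONOUNS for t in tokens) / n,
        "preposition_freq": sum(t in PREPOSITIONS for t in tokens) / n,
        "sentence_length": len(tokens),
    }

print(control_vector("She walked to the store with her brother"))
```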

Soft Prototyping Camera Designs for Car Detection Based on a Convolutional Neural Network

Title Soft Prototyping Camera Designs for Car Detection Based on a Convolutional Neural Network
Authors Zhenyi Liu, Trisha Lian, Joyce Farrell, Brian Wandell
Abstract Imaging systems are increasingly used as input to convolutional neural networks (CNN) for object detection; we would like to design cameras that are optimized for this purpose. It is impractical to build different cameras and then acquire and label the necessary data for every potential camera design; creating software simulations of the camera in context (soft prototyping) is the only realistic approach. We implemented soft-prototyping tools that can quantitatively simulate image radiance and camera designs to create realistic images that are input to a convolutional neural network for car detection. We used these methods to quantify the effect that critical hardware components (pixel size), sensor control (exposure algorithms) and image processing (gamma and demosaicing algorithms) have upon average precision of car detection. We quantify (a) the relationship between pixel size and the ability to detect cars at different distances, (b) the penalty for choosing a poor exposure duration, and (c) the ability of the CNN to perform car detection for a variety of post-acquisition processing algorithms. These results show that the optimal choices for car detection are not constrained by the same metrics used for image quality in consumer photography. It is better to evaluate camera designs for CNN applications using soft prototyping with task-specific metrics rather than consumer photography metrics.
Tasks Demosaicking, Object Detection
Published 2019-10-24
URL https://arxiv.org/abs/1910.10916v1
PDF https://arxiv.org/pdf/1910.10916v1.pdf
PWC https://paperswithcode.com/paper/soft-prototyping-camera-designs-for-car
Repo https://github.com/iset/iset3d
Framework none