February 1, 2020

3094 words 15 mins read

Paper Group AWR 305

Unsupervised Network Embedding for Graph Visualization, Clustering and Classification. On Generalizing Detection Models for Unconstrained Environments. Learning Topological Representation for Networks via Hierarchical Sampling. SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver. MLSL: Multi-Level Self- …

Unsupervised Network Embedding for Graph Visualization, Clustering and Classification

Title Unsupervised Network Embedding for Graph Visualization, Clustering and Classification
Authors Leonardo Gutiérrez-Gómez, Jean-Charles Delvenne
Abstract A main challenge in mining network-based data is finding effective ways to represent or encode graph structures so that they can be efficiently exploited by machine learning algorithms. Several methods have focused on network representation at the node/edge or substructure level. However, many real-life challenges, such as time-varying, multilayer, chemical-compound and brain networks, involve analyzing a family of graphs rather than a single one, opening additional challenges in graph comparison and representation. Traditional approaches to learning representations rely on hand-crafted, specialized heuristics to extract meaningful information about the graphs, e.g., statistical properties and structural features, as well as engineered graph distances to quantify dissimilarity between networks. In this work we provide an unsupervised approach to learning an embedding representation for a collection of graphs so that it can be used in numerous graph mining tasks. By applying an unsupervised neural network to the input graphs, we aim to capture the underlying distribution of the data in order to discriminate between different classes of networks. Our method is assessed empirically on synthetic and real-life datasets and evaluated in three different tasks: graph clustering, visualization and classification. Results reveal that our method outperforms well-known graph distances and graph kernels in clustering and classification tasks, while being highly efficient in runtime.
Tasks Graph Clustering, Network Embedding
Published 2019-02-25
URL http://arxiv.org/abs/1903.05980v2
PDF http://arxiv.org/pdf/1903.05980v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-network-embedding-for-graph
Repo https://github.com/leoguti85/GraphEmbs
Framework none
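
The abstract does not pin down the architecture, so the following is only a rough sketch of one plausible reading: an autoencoder over fixed-size per-graph descriptor vectors, whose bottleneck serves as the graph embedding. The degree-histogram descriptor and the tiny network below are illustrative assumptions, not the authors' model.

```python
# Hedged sketch: embedding a *collection* of graphs with an autoencoder.
# The degree-histogram descriptor and layer sizes are assumptions.
import numpy as np
import networkx as nx
import torch
import torch.nn as nn

def graph_descriptor(g: nx.Graph, bins: int = 32) -> np.ndarray:
    """Fixed-size descriptor: normalized degree histogram (an assumption)."""
    degrees = [d for _, d in g.degree()]
    hist, _ = np.histogram(degrees, bins=bins, range=(0, bins))
    return hist / max(1, g.number_of_nodes())

graphs = [nx.erdos_renyi_graph(50, p) for p in (0.05, 0.1, 0.3) for _ in range(20)]
X = torch.tensor(np.stack([graph_descriptor(g) for g in graphs]), dtype=torch.float32)

auto = nn.Sequential(nn.Linear(32, 8), nn.ReLU(), nn.Linear(8, 32))
opt = torch.optim.Adam(auto.parameters(), lr=1e-2)
for _ in range(200):                      # reconstruct the descriptors
    opt.zero_grad()
    loss = nn.functional.mse_loss(auto(X), X)
    loss.backward()
    opt.step()

embeddings = auto[0](X).detach()          # 8-d graph embeddings for clustering etc.
```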

On Generalizing Detection Models for Unconstrained Environments

Title On Generalizing Detection Models for Unconstrained Environments
Authors Prajjwal Bhargava
Abstract Object detection has seen tremendous progress in recent years. However, current algorithms don't generalize well when tested on diverse data distributions. We address the problem of incremental learning in object detection on the India Driving Dataset (IDD). Our approach involves using multiple domain-specific classifiers and effective transfer learning techniques focused on avoiding catastrophic forgetting. We evaluate our approach on the IDD and BDD100K datasets. Results show the effectiveness of our domain-adaptive approach in the case of domain shifts in environments.
Tasks Object Detection, Transfer Learning
Published 2019-09-28
URL https://arxiv.org/abs/1909.13080v1
PDF https://arxiv.org/pdf/1909.13080v1.pdf
PWC https://paperswithcode.com/paper/on-generalizing-detection-models-for
Repo https://github.com/prajjwal1/autonomous-object-detection
Framework pytorch
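
A minimal sketch of the "multiple domain-specific classifiers" pattern described above: a shared feature extractor with one classification head per domain, so that training a new domain's head leaves the other heads untouched. This illustrates the pattern only; the backbone, sizes, and head structure are assumptions, not the authors' detector.

```python
# Hedged sketch: shared backbone + per-domain heads (IDD, BDD100K).
import torch
import torch.nn as nn

class MultiDomainClassifier(nn.Module):
    def __init__(self, feat_dim: int = 256, num_classes: int = 10,
                 domains=("idd", "bdd100k")):
        super().__init__()
        self.backbone = nn.Sequential(            # stand-in for a detector backbone
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # one head per domain; only the active head is updated on new data,
        # which limits catastrophic forgetting on the other domains
        self.heads = nn.ModuleDict(
            {d: nn.Linear(feat_dim, num_classes) for d in domains})

    def forward(self, x: torch.Tensor, domain: str) -> torch.Tensor:
        return self.heads[domain](self.backbone(x))

model = MultiDomainClassifier()
logits = model(torch.randn(2, 3, 64, 64), domain="idd")
```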

Learning Topological Representation for Networks via Hierarchical Sampling

Title Learning Topological Representation for Networks via Hierarchical Sampling
Authors Guoji Fu, Chengbin Hou, Xin Yao
Abstract Topological information is essential for studying the relationships between nodes in a network. Recently, Network Representation Learning (NRL), which projects a network into a low-dimensional vector space, has shown its advantages in analyzing large-scale networks. However, most existing NRL methods are designed to preserve the local topology of a network and fail to capture its global topology. To tackle this issue, we propose a new NRL framework, named HSRL, to help existing NRL methods capture both the local and global topological information of a network. Specifically, HSRL recursively compresses an input network into a series of smaller networks using a community-aware compressing strategy. Then, an existing NRL method is used to learn node embeddings for each compressed network. Finally, the node embeddings of the input network are obtained by concatenating the node embeddings from all compressed networks. Empirical studies of link prediction on five real-world datasets demonstrate the advantages of HSRL over state-of-the-art methods.
Tasks Link Prediction, Representation Learning
Published 2019-02-15
URL http://arxiv.org/abs/1902.06684v1
PDF http://arxiv.org/pdf/1902.06684v1.pdf
PWC https://paperswithcode.com/paper/learning-topological-representation-for
Repo https://github.com/fuguoji/HSRL
Framework none
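
The compress/embed/concatenate pipeline above maps cleanly to code. Below is a hedged sketch: a simple spectral embedding stands in for "any existing NRL method" (HSRL is method-agnostic), and networkx's greedy modularity communities stand in for the paper's community-aware compression; both are assumptions for illustration.

```python
# Hedged sketch of the HSRL pipeline: coarsen recursively, embed each level,
# concatenate embeddings along each node's super-node lineage.
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def coarsen(g: nx.Graph):
    """Collapse communities into super-nodes; return coarse graph + mapping."""
    comms = list(greedy_modularity_communities(g))
    node_to_super = {n: i for i, c in enumerate(comms) for n in c}
    coarse = nx.Graph()
    coarse.add_nodes_from(range(len(comms)))
    for u, v in g.edges():
        cu, cv = node_to_super[u], node_to_super[v]
        if cu != cv:
            coarse.add_edge(cu, cv)
    return coarse, node_to_super

def spectral_embed(g: nx.Graph, dim: int = 4) -> dict:
    """Toy stand-in for an NRL method: Laplacian eigenvectors as embeddings."""
    nodes = list(g.nodes())
    lap = nx.laplacian_matrix(g, nodelist=nodes).toarray().astype(float)
    _, vecs = np.linalg.eigh(lap)
    k = min(dim, len(nodes))
    return {n: vecs[i, :k] for i, n in enumerate(nodes)}

g = nx.karate_club_graph()
levels, mappings = [g], []
for _ in range(2):                       # two rounds of compression
    coarse, mapping = coarsen(levels[-1])
    levels.append(coarse)
    mappings.append(mapping)

per_level = [spectral_embed(lvl) for lvl in levels]
final = {}
for n in g.nodes():                      # concatenate along the lineage
    parts, cur = [per_level[0][n]], n
    for lvl, mapping in enumerate(mappings, start=1):
        cur = mapping[cur]
        parts.append(per_level[lvl][cur])
    final[n] = np.concatenate(parts)
```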

SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver

Title SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver
Authors Po-Wei Wang, Priya L. Donti, Bryan Wilder, Zico Kolter
Abstract Integrating logical reasoning within deep learning architectures has been a major goal of modern AI systems. In this paper, we propose a new direction toward this goal by introducing a differentiable (smoothed) maximum satisfiability (MAXSAT) solver that can be integrated into the loop of larger deep learning systems. Our (approximate) solver is based upon a fast coordinate descent approach to solving the semidefinite program (SDP) associated with the MAXSAT problem. We show how to analytically differentiate through the solution to this SDP and efficiently solve the associated backward pass. We demonstrate that by integrating this solver into end-to-end learning systems, we can learn the logical structure of challenging problems in a minimally supervised fashion. In particular, we show that we can learn the parity function using single-bit supervision (a traditionally hard task for deep networks) and learn how to play 9x9 Sudoku solely from examples. We also solve a "visual Sudoku" problem that maps images of Sudoku puzzles to their associated logical solutions by combining our MAXSAT solver with a traditional convolutional architecture. Our approach thus shows promise in integrating logical structures within deep learning.
Tasks Game of Sudoku
Published 2019-05-29
URL https://arxiv.org/abs/1905.12149v1
PDF https://arxiv.org/pdf/1905.12149v1.pdf
PWC https://paperswithcode.com/paper/satnet-bridging-deep-learning-and-logical
Repo https://github.com/locuslab/SATNet
Framework pytorch
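
The core numerical engine is a low-rank coordinate-descent ("mixing") method for the SDP relaxation. SATNet applies it to a smoothed MAXSAT SDP; the sketch below shows the same update on the simpler MAX-CUT SDP, where the coordinate-wise minimizer has a closed form. This is an illustration of the solver idea, not SATNet's actual forward pass.

```python
# Hedged sketch of the low-rank "mixing" coordinate descent behind SATNet's
# solver, demonstrated on the MAX-CUT SDP:
#   minimize sum_ij W_ij <v_i, v_j>  subject to ||v_i|| = 1,
# whose coordinate-wise minimizer is v_i = -normalize(sum_j W_ij v_j).
import numpy as np

def maxcut_mixing(W: np.ndarray, k: int = 8, iters: int = 50) -> np.ndarray:
    n = W.shape[0]
    rng = np.random.default_rng(0)
    V = rng.normal(size=(n, k))
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    for _ in range(iters):
        for i in range(n):               # exact coordinate update per vector
            g = W[i] @ V                 # sum_j W_ij v_j
            norm = np.linalg.norm(g)
            if norm > 1e-12:
                V[i] = -g / norm
    return V

# rounding: sign of a random hyperplane projection recovers the cut
W = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)
V = maxcut_mixing(W)
cut = np.sign(V @ np.random.default_rng(1).normal(size=V.shape[1]))
```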

MLSL: Multi-Level Self-Supervised Learning for Domain Adaptation with Spatially Independent and Semantically Consistent Labeling

Title MLSL: Multi-Level Self-Supervised Learning for Domain Adaptation with Spatially Independent and Semantically Consistent Labeling
Authors Javed Iqbal, Mohsen Ali
Abstract Most recent deep semantic segmentation algorithms suffer from large generalization errors, even when powerful hierarchical representation models based on convolutional neural networks are employed. This can be attributed to limited training data and a large distribution gap between the training and test domain datasets. In this paper, we propose a multi-level self-supervised learning model for domain adaptation of semantic segmentation. Exploiting the idea that an object (and, given context, most of the stuff) should be labeled consistently regardless of its location, we generate spatially independent and semantically consistent (SISC) pseudo-labels by segmenting multiple sub-images using the base model and designing an aggregation strategy. Image-level pseudo weak-labels (PWL) are computed to guide domain adaptation by capturing global context similarity between the source and target domains at the latent-space level. This helps the latent space learn the representation even when very few pixels belong to a given category (a small object, for example) compared to the rest of the image. Our multi-level self-supervised learning (MLSL) approach outperforms existing state-of-the-art (self-supervised or adversarial learning) algorithms. Specifically, keeping all settings similar and employing MLSL, we obtain an mIoU gain of 5.1% on GTA-V to Cityscapes adaptation and 4.3% on SYNTHIA to Cityscapes adaptation compared to the existing state-of-the-art method.
Tasks Domain Adaptation, Semantic Segmentation
Published 2019-09-30
URL https://arxiv.org/abs/1909.13776v1
PDF https://arxiv.org/pdf/1909.13776v1.pdf
PWC https://paperswithcode.com/paper/mlsl-multi-level-self-supervised-learning-for
Repo https://github.com/engrjavediqbal/MLSL
Framework mxnet
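
A hedged sketch of the SISC pseudo-labeling step: segment several shifted sub-images, average the class probabilities where crops overlap, and keep only confident pixels. The crop size, stride, confidence threshold, and 19-class (Cityscapes-style) output are illustrative assumptions; `model` is a placeholder for any pretrained segmentation network, not the paper's exact aggregation strategy.

```python
# Hedged sketch: spatially independent, semantically consistent pseudo-labels.
import torch

def sisc_pseudo_labels(model, image: torch.Tensor, crop: int = 64,
                       stride: int = 32, thresh: float = 0.9,
                       num_classes: int = 19):
    _, H, W = image.shape
    probs = torch.zeros(num_classes, H, W)
    counts = torch.zeros(1, H, W)
    for top in range(0, H - crop + 1, stride):
        for left in range(0, W - crop + 1, stride):
            patch = image[:, top:top+crop, left:left+crop].unsqueeze(0)
            with torch.no_grad():
                p = torch.softmax(model(patch), dim=1)[0]   # (C, crop, crop)
            probs[:, top:top+crop, left:left+crop] += p
            counts[:, top:top+crop, left:left+crop] += 1
    probs /= counts.clamp(min=1)                  # average overlapping crops
    conf, labels = probs.max(dim=0)
    labels[conf < thresh] = 255                   # ignore uncertain pixels
    return labels
```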

Stochastic Gradient Trees

Title Stochastic Gradient Trees
Authors Henry Gouk, Bernhard Pfahringer, Eibe Frank
Abstract We present an algorithm for learning decision trees using stochastic gradient information as the source of supervision. In contrast to previous approaches to gradient-based tree learning, our method operates in the incremental learning setting rather than the batch learning setting, and does not make use of soft splits or require the construction of a new tree for every update. We demonstrate how one can apply these decision trees to different problems by changing only the loss function, using classification, regression, and multi-instance learning as example applications. In the experimental evaluation, our method performs similarly to standard incremental classification trees, outperforms state-of-the-art incremental regression trees, and achieves comparable performance with batch multi-instance learning methods.
Tasks Multi-Label Classification
Published 2019-01-23
URL https://arxiv.org/abs/1901.07777v3
PDF https://arxiv.org/pdf/1901.07777v3.pdf
PWC https://paperswithcode.com/paper/stochastic-gradient-trees
Repo https://github.com/henrygouk/stochastic-gradient-trees
Framework none
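
The core mechanic is scoring candidate splits from accumulated gradient statistics rather than raw labels. The sketch below shows the standard quadratic gain formula used in gradient-based tree learning; SGT additionally maintains these statistics incrementally and tests splits statistically, which this toy batch version does not attempt.

```python
# Hedged sketch: choosing a split from gradient/Hessian statistics.
import numpy as np

def split_gain(grad: np.ndarray, hess: np.ndarray, mask: np.ndarray,
               lam: float = 1.0) -> float:
    """Loss reduction from splitting instances into mask / ~mask leaves."""
    def score(g, h):
        return (g.sum() ** 2) / (h.sum() + lam)
    return 0.5 * (score(grad[mask], hess[mask])
                  + score(grad[~mask], hess[~mask])
                  - score(grad, hess))

# squared-error loss: gradient = prediction - target, Hessian = 1
x = np.random.default_rng(0).uniform(size=200)
y = (x > 0.5).astype(float)
pred = np.full_like(y, y.mean())
grad, hess = pred - y, np.ones_like(y)

best = max(((t, split_gain(grad, hess, x <= t))
            for t in np.linspace(0.1, 0.9, 9)), key=lambda p: p[1])
print("best threshold, gain:", best)
```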

Dynamics-aware Embeddings

Title Dynamics-aware Embeddings
Authors William Whitney, Rajat Agarwal, Kyunghyun Cho, Abhinav Gupta
Abstract In this paper we consider self-supervised representation learning to improve sample efficiency in reinforcement learning (RL). We propose a forward prediction objective for simultaneously learning embeddings of states and action sequences. These embeddings capture the structure of the environment’s dynamics, enabling efficient policy learning. We demonstrate that our action embeddings alone improve the sample efficiency and peak performance of model-free RL on control from low-dimensional states. By combining state and action embeddings, we achieve efficient learning of high-quality policies on goal-conditioned continuous control from pixel observations in only 1-2 million environment steps.
Tasks Continuous Control, Representation Learning
Published 2019-08-25
URL https://arxiv.org/abs/1908.09357v3
PDF https://arxiv.org/pdf/1908.09357v3.pdf
PWC https://paperswithcode.com/paper/dynamics-aware-embeddings
Repo https://github.com/willwhitney/dynamics-aware-embeddings
Framework pytorch
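
A hedged sketch of the forward-prediction objective described above: encode the current state and an action sequence, then predict the embedding of the resulting future state. Network sizes, the MSE latent loss, and the stop-gradient on the target embedding are assumptions for illustration.

```python
# Hedged sketch: joint state/action-sequence embeddings via forward prediction.
import torch
import torch.nn as nn

state_dim, act_dim, horizon, z_dim = 8, 2, 4, 16
enc_s = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))
enc_a = nn.Sequential(nn.Linear(act_dim * horizon, 64), nn.ReLU(), nn.Linear(64, z_dim))
predict = nn.Sequential(nn.Linear(2 * z_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))

def forward_prediction_loss(s_t, actions, s_future):
    z_s = enc_s(s_t)
    z_a = enc_a(actions.flatten(start_dim=1))      # embed the action sequence
    z_pred = predict(torch.cat([z_s, z_a], dim=-1))
    with torch.no_grad():                          # assumption: stop-gradient
        z_target = enc_s(s_future)                 # on the target embedding
    return nn.functional.mse_loss(z_pred, z_target)

loss = forward_prediction_loss(torch.randn(32, state_dim),
                               torch.randn(32, horizon, act_dim),
                               torch.randn(32, state_dim))
```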

R-Transformer: Recurrent Neural Network Enhanced Transformer

Title R-Transformer: Recurrent Neural Network Enhanced Transformer
Authors Zhiwei Wang, Yao Ma, Zitao Liu, Jiliang Tang
Abstract Recurrent neural networks have long been the dominant choice for sequence modeling. However, they suffer from two severe issues: they are weak at capturing very long-term dependencies, and their sequential computation cannot be parallelized. Therefore, many non-recurrent sequence models built on convolution and attention operations have been proposed recently. Notably, models with multi-head attention, such as the Transformer, have demonstrated extreme effectiveness in capturing long-term dependencies in a variety of sequence modeling tasks. Despite their success, however, these models lack the components needed to model local structures in sequences and rely heavily on position embeddings, which have limited effect and require considerable design effort. In this paper, we propose the R-Transformer, which enjoys the advantages of both RNNs and the multi-head attention mechanism while avoiding their respective drawbacks. The proposed model can effectively capture both local structures and global long-term dependencies in sequences without using any position embeddings. We evaluate R-Transformer through extensive experiments with data from a wide range of domains, and the empirical results show that R-Transformer outperforms state-of-the-art methods by a large margin in most tasks. We have made the code publicly available at https://github.com/DSE-MSU/R-transformer.
Tasks Language Modelling, Music Modeling, Sequential Image Classification
Published 2019-07-12
URL https://arxiv.org/abs/1907.05572v1
PDF https://arxiv.org/pdf/1907.05572v1.pdf
PWC https://paperswithcode.com/paper/r-transformer-recurrent-neural-network
Repo https://github.com/DSE-MSU/R-transformer
Framework pytorch
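
The recurrent ingredient is a "LocalRNN": a small RNN run over a sliding window at every position, whose last hidden state becomes that position's locality-aware representation, removing the need for position embeddings before the attention layers. The sketch below shows that windowing mechanic; the GRU cell, window size, and dimensions are illustrative choices, not the paper's exact configuration.

```python
# Hedged sketch of a LocalRNN layer: per-position windowed recurrence.
import torch
import torch.nn as nn

class LocalRNN(nn.Module):
    def __init__(self, d_model: int = 32, window: int = 5):
        super().__init__()
        self.window = window
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, T, d)
        B, T, d = x.shape
        pad = x.new_zeros(B, self.window - 1, d)           # left-pad so every
        x = torch.cat([pad, x], dim=1)                     # position has a window
        windows = x.unfold(1, self.window, 1)              # (B, T, d, window)
        windows = windows.permute(0, 1, 3, 2).reshape(B * T, self.window, d)
        _, h = self.rnn(windows)                           # last hidden per window
        return h.squeeze(0).view(B, T, d)

out = LocalRNN()(torch.randn(2, 10, 32))    # output then feeds multi-head attention
```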

Vector-Valued Graph Trend Filtering with Non-Convex Penalties

Title Vector-Valued Graph Trend Filtering with Non-Convex Penalties
Authors Rohan Varma, Harlin Lee, Jelena Kovačević, Yuejie Chi
Abstract This work studies the denoising of piecewise smooth graph signals that exhibit inhomogeneous levels of smoothness over a graph, where the value at each node can be vector-valued. We extend the graph trend filtering framework to denoising vector-valued graph signals with a family of non-convex regularizers, which exhibit superior recovery performance over existing convex regularizers. Using an oracle inequality, we establish the statistical error rates of first-order stationary points of the proposed non-convex method for generic graphs. Furthermore, we present an ADMM-based algorithm to solve the proposed method and establish its convergence. Numerical experiments are conducted on both synthetic and real-world data for denoising, support recovery, event detection, and semi-supervised classification.
Tasks Denoising
Published 2019-05-29
URL https://arxiv.org/abs/1905.12692v3
PDF https://arxiv.org/pdf/1905.12692v3.pdf
PWC https://paperswithcode.com/paper/vector-valued-graph-trend-filtering-with-non
Repo https://github.com/HarlinLee/nonconvex-GTF-public
Framework none
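
To make the ADMM structure concrete, here is a hedged sketch for the convex l2,1 special case of vector-valued graph trend filtering; the paper's non-convex penalties would replace the group soft-threshold step with a non-convex proximal step. The path graph and parameters are illustrative.

```python
# Hedged sketch: ADMM for  min_X 0.5*||Y - X||_F^2 + lam * sum_edges ||x_i - x_j||_2
import numpy as np

def gtf_admm(Y, edges, lam=1.0, rho=1.0, iters=100):
    n, d = Y.shape
    m = len(edges)
    D = np.zeros((m, n))                      # graph incidence matrix
    for k, (i, j) in enumerate(edges):
        D[k, i], D[k, j] = 1.0, -1.0
    Z = np.zeros((m, d))
    U = np.zeros((m, d))
    A = np.eye(n) + rho * D.T @ D             # fixed X-update system
    for _ in range(iters):
        X = np.linalg.solve(A, Y + rho * D.T @ (Z - U))
        V = D @ X + U
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        shrink = np.maximum(1 - (lam / rho) / np.maximum(norms, 1e-12), 0)
        Z = shrink * V                        # group soft-threshold per edge
        U += D @ X - Z
    return X

edges = [(0, 1), (1, 2), (2, 3)]              # path graph, vector-valued signal
Y = np.vstack([np.zeros((2, 3)), np.ones((2, 3))]) + 0.1 * np.random.randn(4, 3)
X = gtf_admm(Y, edges, lam=0.5)
```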

DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance

Title DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance
Authors Yilun Zhang, Ty Nguyen, Ian D. Miller, Shreyas S. Shivakumar, Steven Chen, Camillo J. Taylor, Vijay Kumar
Abstract Depth estimation is an important capability for autonomous vehicles to understand and reconstruct 3D environments and avoid obstacles during operation. Accurate depth sensors such as LiDARs are often heavy and expensive and can only provide sparse depth, while lighter depth sensors such as stereo cameras are noisier in comparison. We propose an end-to-end learning algorithm that is capable of using sparse, noisy input depth for refinement and depth completion. Our model also produces the camera pose as a byproduct, making it a great solution for autonomous systems. We evaluate our approach on both indoor and outdoor datasets. Empirical results show that our method performs well on the KITTI dataset when compared to other competing methods, while having superior performance in dealing with sparse, noisy input depth on the TUM dataset.
Tasks Autonomous Vehicles, Depth Completion, Depth Estimation, Motion Estimation
Published 2019-03-15
URL https://arxiv.org/abs/1903.06397v4
PDF https://arxiv.org/pdf/1903.06397v4.pdf
PWC https://paperswithcode.com/paper/dfinenet-ego-motion-estimation-and-depth
Repo https://github.com/Ougui9/DFineNet
Framework none

Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures

Title Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures
Authors Kyle Yee, Ayan Chakrabarti
Abstract Modern neural network-based algorithms are able to produce highly accurate depth estimates from stereo image pairs, nearly matching the reliability of measurements from more expensive depth sensors. However, this accuracy comes with a higher computational cost, since these methods use network architectures designed to compute and process matching scores across all candidate matches at all locations, with floating point computations repeated across a match volume with dimensions corresponding to both space and disparity. This leads to longer running times to process each image pair, making them impractical for real-time use in robots and autonomous vehicles. We propose a new stereo algorithm that employs a significantly more efficient network architecture. Our method builds an initial match cost volume using traditional matching costs that are fast to compute, and trains a network to estimate disparity from this volume. Crucially, our network only employs per-pixel and two-dimensional convolution operations: to summarize the match information at each location as a low-dimensional feature vector, and to spatially process these "cost-signature" features to produce a dense disparity map. Experimental results on the KITTI benchmark show that our method delivers competitive accuracy at significantly higher speeds, running at 48 frames per second on a modern GPU.
Tasks Autonomous Vehicles
Published 2019-03-08
URL http://arxiv.org/abs/1903.04939v1
PDF http://arxiv.org/pdf/1903.04939v1.pdf
PWC https://paperswithcode.com/paper/fast-deep-stereo-with-2d-convolutional
Repo https://github.com/ayanc/fdscs
Framework tf
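
A hedged sketch of the cost-signature idea: compress a traditional match-cost volume (D candidate disparities per pixel) to a low-dimensional per-pixel signature with 1x1 convolutions, then process signatures spatially with ordinary 2D convolutions to regress disparity. All layer sizes are illustrative, not the paper's architecture.

```python
# Hedged sketch: per-pixel summarization + 2D spatial processing of costs.
import torch
import torch.nn as nn

D = 64                                            # candidate disparities
net = nn.Sequential(
    nn.Conv2d(D, 32, kernel_size=1), nn.ReLU(),   # per-pixel summarization
    nn.Conv2d(32, 16, kernel_size=1), nn.ReLU(),  # -> 16-d cost signature
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),   # spatial 2D conv
    nn.Conv2d(32, 1, kernel_size=3, padding=1),   # dense disparity map
)

cost_volume = torch.randn(1, D, 96, 320)          # (B, D, H, W), e.g. census costs
disparity = net(cost_volume)                      # (1, 1, 96, 320)
```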

Leveraging Shape Completion for 3D Siamese Tracking

Title Leveraging Shape Completion for 3D Siamese Tracking
Authors Silvio Giancola, Jesus Zarzar, Bernard Ghanem
Abstract Point clouds are challenging to process due to their sparsity; therefore, autonomous vehicles rely more on appearance attributes than on pure geometric features. However, 3D LIDAR perception can provide crucial information for urban navigation in challenging light or weather conditions. In this paper, we investigate the versatility of shape completion for 3D object tracking in LIDAR point clouds. We design a Siamese tracker that encodes model and candidate shapes into a compact latent representation. We regularize the encoding by enforcing the latent representation to decode into an object model shape. We observe that 3D object tracking and 3D shape completion complement each other. Learning a more meaningful latent representation shows better discriminatory capabilities, leading to improved tracking performance. We test our method on the KITTI Tracking set using car 3D bounding boxes. Our model reaches a 76.94% Success rate and 81.38% Precision for 3D object tracking, with the shape completion regularization leading to an improvement of 3% in both metrics.
Tasks Autonomous Vehicles, Object Tracking
Published 2019-03-05
URL http://arxiv.org/abs/1903.01784v2
PDF http://arxiv.org/pdf/1903.01784v2.pdf
PWC https://paperswithcode.com/paper/leveraging-shape-completion-for-3d-siamese
Repo https://github.com/SilvioGiancola/ShapeCompletion3DTracking
Framework pytorch
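
A hedged sketch of the joint objective: a Siamese encoder scores candidate point clouds against the model shape via latent similarity, while a decoder regularizes the latent space to reconstruct the model shape. The PointNet-style max-pool encoder, Chamfer distance, cosine similarity, and loss weighting below are illustrative assumptions, not the paper's exact losses.

```python
# Hedged sketch: Siamese latent matching + shape-completion regularization.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 128))
decoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 3 * 256))

def encode(points):                               # points: (N, 3) -> (128,)
    return encoder(points).max(dim=0).values      # PointNet-style max pool

def chamfer(a, b):                                # symmetric point-set distance
    d = torch.cdist(a, b)
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

model_pts, candidate_pts = torch.randn(256, 3), torch.randn(256, 3)
z_model, z_cand = encode(model_pts), encode(candidate_pts)
similarity = torch.cosine_similarity(z_model, z_cand, dim=0)   # tracking score
recon = decoder(z_model).view(256, 3)
loss = (1 - similarity) + 0.1 * chamfer(recon, model_pts)      # joint objective
```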

Noise Flow: Noise Modeling with Conditional Normalizing Flows

Title Noise Flow: Noise Modeling with Conditional Normalizing Flows
Authors Abdelrahman Abdelhamed, Marcus A. Brubaker, Michael S. Brown
Abstract Modeling and synthesizing image noise is an important aspect in many computer vision applications. The long-standing additive white Gaussian and heteroscedastic (signal-dependent) noise models widely used in the literature provide only a coarse approximation of real sensor noise. This paper introduces Noise Flow, a powerful and accurate noise model based on recent normalizing flow architectures. Noise Flow combines well-established basic parametric noise models (e.g., signal-dependent noise) with the flexibility and expressiveness of normalizing flow networks. The result is a single, comprehensive, compact noise model containing fewer than 2500 parameters yet able to represent multiple cameras and gain factors. Noise Flow dramatically outperforms existing noise models, with 0.42 nats/pixel improvement over the camera-calibrated noise level functions, which translates to 52% improvement in the likelihood of sampled noise. Noise Flow represents the first serious attempt to go beyond simple parametric models to one that leverages the power of deep learning and data-driven noise distributions.
Tasks
Published 2019-08-22
URL https://arxiv.org/abs/1908.08453v1
PDF https://arxiv.org/pdf/1908.08453v1.pdf
PWC https://paperswithcode.com/paper/noise-flow-noise-modeling-with-conditional
Repo https://github.com/BorealisAI/noise_flow
Framework tf
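
For context, here is a hedged sketch of the signal-dependent (heteroscedastic) baseline that Noise Flow builds on and outperforms: Gaussian noise whose variance is an affine noise level function of intensity, scaled by sensor gain. The beta parameters and gain scaling are made-up illustrative values, not calibrated camera parameters.

```python
# Hedged sketch: sampling from a heteroscedastic noise level function (NLF).
import numpy as np

def sample_nlf_noise(clean: np.ndarray, gain: float,
                     beta1: float = 1e-4, beta2: float = 1e-6) -> np.ndarray:
    var = beta1 * clean * gain + beta2 * gain**2   # signal-dependent variance
    return np.random.default_rng(0).normal(0.0, np.sqrt(var))

clean = np.random.default_rng(1).uniform(0, 1, size=(64, 64))   # raw intensities
noisy = clean + sample_nlf_noise(clean, gain=4.0)
```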

Low-Level Linguistic Controls for Style Transfer and Content Preservation

Title Low-Level Linguistic Controls for Style Transfer and Content Preservation
Authors Katy Gero, Chris Kedzie, Jonathan Reeve, Lydia Chilton
Abstract Despite the success of style transfer in image processing, it has seen limited progress in natural language generation. Part of the problem is that content is not as easily decoupled from style in the text domain. Curiously, in the field of stylometry, content does not figure prominently in practical methods of discriminating stylistic elements, such as authorship and genre. Rather, syntax and function words are the most salient features. Drawing on this work, we model style as a suite of low-level linguistic controls, such as the frequency of pronouns, prepositions, and subordinate clause constructions. We train a neural encoder-decoder model to reconstruct reference sentences given only content words and the settings of the controls. We perform style transfer by keeping the content words fixed while adjusting the controls to be indicative of another style. In experiments, we show that the model reliably responds to the linguistic controls, and we perform both automatic and manual evaluations of style transfer. We find we can fool a style classifier 84% of the time, and that our model produces highly diverse and stylistically distinctive outputs. This work introduces a formal, extendable model of style that can add control to any neural text generation system.
Tasks Style Transfer, Text Generation
Published 2019-11-08
URL https://arxiv.org/abs/1911.03385v1
PDF https://arxiv.org/pdf/1911.03385v1.pdf
PWC https://paperswithcode.com/paper/low-level-linguistic-controls-for-style
Repo https://github.com/kedz/styleeq
Framework none
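
A hedged sketch of extracting two of the control values named above (pronoun and preposition frequency) from a sentence. The word lists are small illustrative samples and the tokenizer is naive; a real pipeline would use a POS tagger, and the exact control set here is an assumption.

```python
# Hedged sketch: computing low-level linguistic control values per sentence.
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they", "me", "him", "her"}
PREPOSITIONS = {"in", "on", "at", "by", "with", "from", "to", "of", "for"}

def control_vector(sentence: str) -> dict:
    tokens = sentence.lower().split()
    n = max(1, len(tokens))
    return {
        "pronoun_freq": sum(t in PRONOUNS for t in tokens) / n,
        "preposition_freq": sum(t in PREPOSITIONS for t in tokens) / n,
        "sentence_length": len(tokens),
    }

print(control_vector("She walked to the store with her brother"))
```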

Soft Prototyping Camera Designs for Car Detection Based on a Convolutional Neural Network

Title Soft Prototyping Camera Designs for Car Detection Based on a Convolutional Neural Network
Authors Zhenyi Liu, Trisha Lian, Joyce Farrell, Brian Wandell
Abstract Imaging systems are increasingly used as input to convolutional neural networks (CNN) for object detection; we would like to design cameras that are optimized for this purpose. It is impractical to build different cameras and then acquire and label the necessary data for every potential camera design; creating software simulations of the camera in context (soft prototyping) is the only realistic approach. We implemented soft-prototyping tools that can quantitatively simulate image radiance and camera designs to create realistic images that are input to a convolutional neural network for car detection. We used these methods to quantify the effect that critical hardware components (pixel size), sensor control (exposure algorithms) and image processing (gamma and demosaicing algorithms) have upon average precision of car detection. We quantify (a) the relationship between pixel size and the ability to detect cars at different distances, (b) the penalty for choosing a poor exposure duration, and (c) the ability of the CNN to perform car detection for a variety of post-acquisition processing algorithms. These results show that the optimal choices for car detection are not constrained by the same metrics used for image quality in consumer photography. It is better to evaluate camera designs for CNN applications using soft prototyping with task-specific metrics rather than consumer photography metrics.
Tasks Demosaicking, Object Detection
Published 2019-10-24
URL https://arxiv.org/abs/1910.10916v1
PDF https://arxiv.org/pdf/1910.10916v1.pdf
PWC https://paperswithcode.com/paper/soft-prototyping-camera-designs-for-car
Repo https://github.com/iset/iset3d
Framework none