January 30, 2020


Paper Group ANR 455

Tree Search Network for Sparse Regression. LCD: Learned Cross-Domain Descriptors for 2D-3D Matching. Snow avalanche segmentation in SAR images with Fully Convolutional Neural Networks. Unsupervised Monocular Depth Prediction for Indoor Continuous Video Streams. Vision: A Deep Learning Approach to provide walking assistance to the visually impaired. …

Tree Search Network for Sparse Regression


Title	Tree Search Network for Sparse Regression
Authors	Kyung-Su Kim, Sae-Young Chung
Abstract	We consider the classical sparse regression problem of recovering a sparse signal $x_0$ given a measurement vector $y = \Phi x_0+w$. We propose a tree search algorithm driven by a deep neural network for sparse regression (TSN). TSN improves the signal reconstruction performance of the deep neural network designed for sparse regression by performing a tree search with pruning. In both noiseless and noisy cases, TSN is observed to recover synthetic and real signals with lower complexity than a conventional tree search, and it outperforms existing algorithms by a large margin for various types of sensing matrix $\Phi$ widely used in sparse regression.
Tasks
Published	2019-04-01
URL	http://arxiv.org/abs/1904.00864v1
PDF	http://arxiv.org/pdf/1904.00864v1.pdf
PWC	https://paperswithcode.com/paper/tree-search-network-for-sparse-regression
Repo
Framework
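
To make the measurement model $y = \Phi x_0 + w$ in the abstract concrete, here is a minimal sketch of a classical greedy baseline, orthogonal matching pursuit (OMP). This is not TSN and not the authors' code; the function name `omp` and the sparsity argument `k` are assumptions chosen for illustration.

```python
import numpy as np

def omp(Phi, y, k):
    """Orthogonal matching pursuit (a classical baseline, not TSN).

    Greedily selects k columns of Phi, refitting by least squares
    on the selected support after each selection.
    """
    n = Phi.shape[1]
    support = []
    residual = np.array(y, dtype=float)
    coef = np.zeros(0)
    for _ in range(k):
        # Score each column by its correlation with the current residual.
        scores = np.abs(Phi.T @ residual)
        if support:
            scores[support] = -np.inf  # never pick the same column twice
        support.append(int(np.argmax(scores)))
        # Least-squares fit of y restricted to the chosen columns.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(n)
    x_hat[support] = coef
    return x_hat
```

A tree search differs from this greedy loop in that it keeps several candidate supports alive at each depth and prunes among them, rather than committing to a single column per step.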

LCD: Learned Cross-Domain Descriptors for 2D-3D Matching


Title	LCD: Learned Cross-Domain Descriptors for 2D-3D Matching
Authors	Quang-Hieu Pham, Mikaela Angelina Uy, Binh-Son Hua, Duc Thanh Nguyen, Gemma Roig, Sai-Kit Yeung
Abstract	In this work, we present a novel method to learn a local cross-domain descriptor for 2D image and 3D point cloud matching. Our proposed method is a dual auto-encoder neural network that maps 2D and 3D input into a shared latent space representation. We show that such local cross-domain descriptors in the shared embedding are more discriminative than those obtained from individual training in 2D and 3D domains. To facilitate the training process, we built a new dataset by collecting $\approx 1.4$ million 2D-3D correspondences with various lighting conditions and settings from publicly available RGB-D scenes. Our descriptor is evaluated in three main experiments: 2D-3D matching, cross-domain retrieval, and sparse-to-dense depth estimation. Experimental results confirm the robustness of our approach as well as its competitive performance, not only in solving cross-domain tasks but also in generalizing to purely 2D and purely 3D tasks. Our dataset and code are released publicly at https://hkust-vgd.github.io/lcd.
Tasks	3D Point Cloud Matching, Depth Estimation
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09326v1
PDF	https://arxiv.org/pdf/1911.09326v1.pdf
PWC	https://paperswithcode.com/paper/lcd-learned-cross-domain-descriptors-for-2d
Repo
Framework

Snow avalanche segmentation in SAR images with Fully Convolutional Neural Networks


Title	Snow avalanche segmentation in SAR images with Fully Convolutional Neural Networks
Authors	Filippo Maria Bianchi, Jakob Grahn, Markus Eckerstorfer, Eirik Malnes, Hannah Vickers
Abstract	Knowledge about the frequency and location of snow avalanche activity is essential for forecasting and mapping of snow avalanche hazard. Traditional field monitoring of avalanche activity has limitations, especially when surveying large and remote areas. In recent years, avalanche detection in Sentinel-1 radar satellite imagery has been developed to overcome this monitoring problem. Current state-of-the-art detection algorithms, based on radar signal processing techniques, have highly varying accuracy that is on average much lower than the accuracy of visual detections from human experts. To reduce this gap, we propose a deep learning architecture for detecting avalanches in Sentinel-1 radar images. We trained a neural network on 6345 manually labelled avalanches from 117 Sentinel-1 images, each one consisting of six channels with backscatter and topographical information. Then, we tested the best network configuration on one additional SAR image. Compared with the manual labelling (the gold standard), we achieved an F1 score above 66%, while the state-of-the-art detection algorithm produced an F1 score of 38%. A visual interpretation of the network’s results shows that it only fails to detect small avalanches, while it manages to detect some that were not labelled by the human expert.
Tasks
Published	2019-10-11
URL	https://arxiv.org/abs/1910.05411v1
PDF	https://arxiv.org/pdf/1910.05411v1.pdf
PWC	https://paperswithcode.com/paper/snow-avalanche-segmentation-in-sar-images
Repo
Framework
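
The F1 scores quoted in the abstract combine precision and recall into one number. The sketch below computes a pixel-wise F1 between a predicted and a reference binary mask; whether the paper scores per pixel or per avalanche object is not stated in the abstract, so the pixel-wise choice and the function name `f1_score_masks` are assumptions.

```python
import numpy as np

def f1_score_masks(pred, truth):
    """Pixel-wise F1 between two binary masks (assumed scoring scheme)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)    # predicted and labelled
    fp = np.sum(pred & ~truth)   # predicted but not labelled
    fn = np.sum(~pred & truth)   # labelled but missed
    denom = 2 * tp + fp + fn
    # Two empty masks agree perfectly; treat that case as F1 = 1.
    return float(2 * tp / denom) if denom else 1.0
```

Because F1 ignores true negatives, it is not inflated by the large background area that dominates an image in which avalanches occupy only a few pixels.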

Unsupervised Monocular Depth Prediction for Indoor Continuous Video Streams


Title	Unsupervised Monocular Depth Prediction for Indoor Continuous Video Streams
Authors	Yinglong Feng, Shuncheng Wu, Okan Köpüklü, Xueyang Kang, Federico Tombari
Abstract	This paper studies the unsupervised monocular depth prediction problem. Most existing unsupervised depth prediction algorithms are developed for outdoor scenarios, while depth prediction work for indoor environments is, to our knowledge, still very scarce. Therefore, this work focuses on narrowing the gap by first evaluating existing approaches in indoor environments and then improving the state-of-the-art architecture design. Unlike typical outdoor training datasets, such as KITTI with motion constraints, data for indoor environments contains more arbitrary camera movement and a short baseline between two consecutive images, which degrades network training for pose estimation. To address this issue, we propose two methods: first, we propose a novel reconstruction loss function to constrain pose estimation, resulting in improved accuracy of the predicted disparity map; second, we use ensemble learning with a flipping strategy along with a median filter, operating directly on the output disparity map. We evaluate our approaches on the TUM RGB-D and self-collected datasets. The results show that both approaches outperform the previous state-of-the-art unsupervised learning approaches.
Tasks	Depth Estimation, Pose Estimation
Published	2019-11-20
URL	https://arxiv.org/abs/1911.08995v1
PDF	https://arxiv.org/pdf/1911.08995v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-monocular-depth-prediction-for
Repo
Framework
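
The second method in the abstract, a flipping ensemble followed by a median filter on the output disparity map, can be sketched as below. The abstract does not say how the two predictions are combined or which filter size is used, so the plain average, the 3x3 window, and the `predict` callable (a stand-in for a trained network) are all assumptions.

```python
import numpy as np

def median_filter3(d):
    """3x3 median filter with edge replication, in plain NumPy."""
    p = np.pad(d, 1, mode="edge")
    h, w = d.shape
    # Nine shifted views of the padded map, one per window position.
    windows = [p[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return np.median(np.stack(windows, axis=0), axis=0)

def flip_ensemble(predict, image):
    """Average predictions on the image and its mirror, then smooth.

    `predict` is a hypothetical callable mapping an image of shape
    (H, W) or (H, W, C) to a disparity map of shape (H, W).
    """
    direct = predict(image)
    # Predict on the horizontally mirrored image, then mirror back.
    mirrored = predict(image[:, ::-1])[:, ::-1]
    return median_filter3(0.5 * (direct + mirrored))
```

Both steps act only on network outputs, so they can be applied at inference time without retraining.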

Vision: A Deep Learning Approach to provide walking assistance to the visually impaired


Title	Vision: A Deep Learning Approach to provide walking assistance to the visually impaired
Authors	Nikhil Thakurdesai, Anupam Tripathi, Dheeraj Butani, Smita Sankhe
Abstract	Blind people face a lot of problems in their daily routines. They have to struggle a lot just to do their day-to-day chores. In this paper, we have proposed a system with the objective to help the visually impaired by providing audio aid guiding them to avoid obstacles, which will assist them to move in their surroundings. Object Detection using YOLO will help them detect the nearby objects and Depth Estimation using monocular vision will tell the approximate distance of the detected objects from the user. Despite a higher accuracy, stereo vision has many hardware constraints, which makes monocular vision the preferred choice for this application.
Tasks	Depth Estimation, Object Detection
Published	2019-11-20
URL	https://arxiv.org/abs/1911.08739v1
PDF	https://arxiv.org/pdf/1911.08739v1.pdf
PWC	https://paperswithcode.com/paper/vision-a-deep-learning-approach-to-provide
Repo
Framework

One Homonym per Translation


Title	One Homonym per Translation
Authors	Bradley Hauer, Grzegorz Kondrak
Abstract	The study of homonymy is vital to resolving fundamental problems in lexical semantics. In this paper, we propose four hypotheses that characterize the unique behavior of homonyms in the context of translations, discourses, collocations, and sense clusters. We present a new annotated homonym resource that allows us to test our hypotheses on existing WSD resources. The results of the experiments provide strong empirical evidence for the hypotheses. This study represents a step towards a computational method for distinguishing between homonymy and polysemy, and constructing a definitive inventory of coarse-grained senses.
Tasks
Published	2019-04-17
URL	https://arxiv.org/abs/1904.08533v2
PDF	https://arxiv.org/pdf/1904.08533v2.pdf
PWC	https://paperswithcode.com/paper/one-homonym-per-translation
Repo
Framework

Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers


Title	Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers
Authors	Bubacarr Bah, Holger Rauhut, Ulrich Terstiege, Michael Westdickenberg
Abstract	We study the convergence of gradient flows related to learning deep linear neural networks (where the activation function is the identity map) from data. In this case, the composition of the network layers amounts to simply multiplying the weight matrices of all layers together, resulting in an overparameterized problem. The gradient flow with respect to these factors can be re-interpreted as a Riemannian gradient flow on the manifold of rank-$r$ matrices endowed with a suitable Riemannian metric. We show that the flow always converges to a critical point of the underlying functional. Moreover, we establish that, for almost all initializations, the flow converges to a global minimum on the manifold of rank $k$ matrices for some $k\leq r$.
Tasks
Published	2019-10-12
URL	https://arxiv.org/abs/1910.05505v3
PDF	https://arxiv.org/pdf/1910.05505v3.pdf
PWC	https://paperswithcode.com/paper/learning-deep-linear-neural-networks
Repo
Framework

Digital Passport: A Novel Technological Strategy for Intellectual Property Protection of Convolutional Neural Networks


Title	Digital Passport: A Novel Technological Strategy for Intellectual Property Protection of Convolutional Neural Networks
Authors	Lixin Fan, KamWoh Ng, Chee Seng Chan
Abstract	In order to prevent deep neural networks from being infringed by unauthorized parties, we propose a generic solution which embeds a designated digital passport into a network and, subsequently, either paralyzes the network functionalities for unauthorized usages or maintains its functionalities in the presence of a verified passport. Such a desired network behavior is successfully demonstrated in a number of implementation schemes, which provide reliable, preventive and timely protections against tens of thousands of fake-passport deceptions. Extensive experiments also show that the deep neural network performance under unauthorized usages deteriorates significantly (e.g. with 33% to 82% reductions of CIFAR10 classification accuracies), while networks endorsed with valid passports remain intact.
Tasks
Published	2019-05-10
URL	https://arxiv.org/abs/1905.04368v1
PDF	https://arxiv.org/pdf/1905.04368v1.pdf
PWC	https://paperswithcode.com/paper/digital-passport-a-novel-technological
Repo
Framework

Collaborative Execution of Deep Neural Networks on Internet of Things Devices


Title	Collaborative Execution of Deep Neural Networks on Internet of Things Devices
Authors	Ramyad Hadidi, Jiashen Cao, Michael S. Ryoo, Hyesoon Kim
Abstract	With recent advancements in deep neural networks (DNNs), we are able to solve traditionally challenging problems. Since DNNs are compute intensive, consumers who want to deploy a service need to rely on expensive and scarce compute resources in the cloud. This approach, in addition to its dependence on high-quality network infrastructure and data centers, raises new privacy concerns. These challenges may limit DNN-based applications, so many researchers have tried to optimize DNNs for local and in-edge execution. However, the inadequate power and computing resources of edge devices, along with the small number of requests, limit the applicability of current optimizations such as batch processing. In this paper, we propose an approach that utilizes the aggregated existing computing power of Internet of Things (IoT) devices surrounding an environment by creating a collaborative network. In this approach, IoT devices cooperate to conduct single-batch inferencing in real time. While exploiting several new model-parallelism methods and their distribution characteristics, our approach enhances the collaborative network by creating a balanced and distributed processing pipeline. We have illustrated our work using many Raspberry Pis, studying DNN models such as AlexNet, VGG16, Xception, and C3D.
Tasks
Published	2019-01-08
URL	http://arxiv.org/abs/1901.02537v1
PDF	http://arxiv.org/pdf/1901.02537v1.pdf
PWC	https://paperswithcode.com/paper/collaborative-execution-of-deep-neural
Repo
Framework

Neural-Attention-Based Deep Learning Architectures for Modeling Traffic Dynamics on Lane Graphs


Title	Neural-Attention-Based Deep Learning Architectures for Modeling Traffic Dynamics on Lane Graphs
Authors	Matthew A. Wright, Simon F. G. Ehlers, Roberto Horowitz
Abstract	Deep neural networks can be powerful tools, but require careful application-specific design to ensure that the most informative relationships in the data are learnable. In this paper, we apply deep neural networks to the nonlinear spatiotemporal physics problem of vehicle traffic dynamics. We consider problems of estimating macroscopic quantities (e.g., the queue at an intersection) at a lane level. First-principles modeling at the lane scale has been a challenge due to complexities in modeling social behaviors like lane changes, and those behaviors’ resultant macro-scale effects. Following domain knowledge that upstream/downstream lanes and neighboring lanes affect each other’s traffic flows in distinct ways, we apply a form of neural attention that allows the neural network layers to aggregate information from different lanes in different manners. Using a microscopic traffic simulator as a testbed, we obtain results showing that an attentional neural network model can use information from nearby lanes to improve predictions, and that explicitly encoding the lane-to-lane relationship types significantly improves performance. We also demonstrate the transfer of our learned neural network to a more complex road network, discuss how its performance degradation may be attributable to new traffic behaviors induced by increased topological complexity, and motivate learning dynamics models from many road network topologies.
Tasks
Published	2019-04-18
URL	https://arxiv.org/abs/1904.08831v3
PDF	https://arxiv.org/pdf/1904.08831v3.pdf
PWC	https://paperswithcode.com/paper/neural-attention-based-deep-learning
Repo
Framework

Semi-supervised Learning using Adversarial Training with Good and Bad Samples


Title	Semi-supervised Learning using Adversarial Training with Good and Bad Samples
Authors	Wenyuan Li, Zichen Wang, Yuguang Yue, Jiayun Li, William Speier, Mingyuan Zhou, Corey W. Arnold
Abstract	In this work, we investigate semi-supervised learning (SSL) for image classification using adversarial training. Previous results have illustrated that generative adversarial networks (GANs) can be used for multiple purposes. Triple-GAN, which aims to jointly optimize model components by incorporating three players, generates suitable image-label pairs to compensate for the lack of labeled data in SSL with improved benchmark performance. Conversely, Bad (or complementary) GAN optimizes generation to produce complementary data-label pairs and force a classifier’s decision boundary to lie between data manifolds. Although it generally outperforms Triple-GAN, Bad GAN is highly sensitive to the amount of labeled data used for training. Unifying these two approaches, we present unified-GAN (UGAN), a novel framework that enables a classifier to simultaneously learn from both good and bad samples through adversarial training. We perform extensive experiments on various datasets and demonstrate that UGAN: 1) achieves state-of-the-art performance among other deep generative models, and 2) is robust to variations in the amount of labeled data used for training.
Tasks	Image Classification
Published	2019-10-18
URL	https://arxiv.org/abs/1910.08540v1
PDF	https://arxiv.org/pdf/1910.08540v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-learning-using-adversarial
Repo
Framework

INTEL-TAU: A Color Constancy Dataset


Title	INTEL-TAU: A Color Constancy Dataset
Authors	Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis, Jarno Nikkanen, Moncef Gabbouj
Abstract	In this paper, we describe a new large dataset for illumination estimation. This dataset, called INTEL-TAU, contains 7022 images in total, which makes it the largest available high-resolution dataset for illumination estimation research. The variety of scenes captured using three different camera models, i.e., Canon 5DSR, Nikon D810, and Sony IMX135, makes the dataset appropriate for evaluating the camera and scene invariance of the different illumination estimation techniques. Privacy masking is done for sensitive information, e.g., faces. Thus, the dataset complies with the new General Data Protection Regulation (GDPR). Furthermore, the effect of color shading for mobile images can be evaluated with INTEL-TAU, as we provide both corrected and uncorrected versions of the raw data. We provide in this paper an evaluation of several color constancy approaches.
Tasks	Color Constancy
Published	2019-10-23
URL	https://arxiv.org/abs/1910.10404v3
PDF	https://arxiv.org/pdf/1910.10404v3.pdf
PWC	https://paperswithcode.com/paper/intel-tau-a-color-constancy-dataset
Repo
Framework

Multi-Level Batch Normalization In Deep Networks For Invasive Ductal Carcinoma Cell Discrimination In Histopathology Images


Title	Multi-Level Batch Normalization In Deep Networks For Invasive Ductal Carcinoma Cell Discrimination In Histopathology Images
Authors	Francisco Perdigon Romero, An Tang, Samuel Kadoury
Abstract	Breast cancer is the most diagnosed cancer and the most predominant cause of death in women worldwide. Imaging techniques such as breast cancer pathology help in the diagnosis and monitoring of the disease. However, identification of malignant cells can be challenging given the high heterogeneity in tissue absorption of staining agents. In this work, we present a novel approach for Invasive Ductal Carcinoma (IDC) cell discrimination in histopathology slides. We propose a model derived from the Inception architecture, with a multi-level batch normalization module between convolutional steps. This module was used as a base block for the feature extraction in a CNN architecture. We used the open IDC dataset, on which we obtained a balanced accuracy of 0.89 and an F1 score of 0.90, thus surpassing recent state-of-the-art classification algorithms tested on this public dataset.
Tasks
Published	2019-01-11
URL	http://arxiv.org/abs/1901.03684v1
PDF	http://arxiv.org/pdf/1901.03684v1.pdf
PWC	https://paperswithcode.com/paper/multi-level-batch-normalization-in-deep
Repo
Framework

Escaping from saddle points on Riemannian manifolds


Title	Escaping from saddle points on Riemannian manifolds
Authors	Yue Sun, Nicolas Flammarion, Maryam Fazel
Abstract	We consider minimizing a nonconvex, smooth function $f$ on a Riemannian manifold $\mathcal{M}$. We show that a perturbed version of the Riemannian gradient descent algorithm converges to a second-order stationary point (and hence is able to escape saddle points on the manifold). The rate of convergence depends as $1/\epsilon^2$ on the accuracy $\epsilon$, which matches a rate known only for unconstrained smooth minimization. The convergence rate depends polylogarithmically on the manifold dimension $d$, hence is almost dimension-free. The rate also has a polynomial dependence on the parameters describing the curvature of the manifold and the smoothness of the function. While the unconstrained problem (Euclidean setting) is well-studied, our result is the first to prove such a rate for nonconvex, manifold-constrained problems.
Tasks
Published	2019-06-18
URL	https://arxiv.org/abs/1906.07355v1
PDF	https://arxiv.org/pdf/1906.07355v1.pdf
PWC	https://paperswithcode.com/paper/escaping-from-saddle-points-on-riemannian
Repo
Framework
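
The idea of perturbing Riemannian gradient descent to leave a saddle point can be illustrated on the unit sphere with $f(x) = x^\top A x$, whose critical points are the eigenvectors of $A$. The sketch below is a simplified toy, not the paper's algorithm: the step size, perturbation radius, cooldown rule, and all function names are assumptions, and it makes no attempt to reproduce the stated convergence rate.

```python
import numpy as np

def riemannian_grad(A, x):
    """Gradient of f(x) = x^T A x projected onto the sphere's tangent space at x."""
    g = 2.0 * A @ x
    return g - (g @ x) * x

def retract(x, v):
    """Move from x along tangent vector v, then renormalise onto the sphere."""
    y = x + v
    return y / np.linalg.norm(y)

def perturbed_rgd(A, x0, step=0.1, iters=500, grad_tol=1e-3,
                  radius=1e-2, cooldown=20, seed=0):
    """Toy perturbed Riemannian gradient descent on the unit sphere."""
    rng = np.random.default_rng(seed)
    x = x0 / np.linalg.norm(x0)
    last_kick = -cooldown
    for t in range(iters):
        g = riemannian_grad(A, x)
        if np.linalg.norm(g) < grad_tol and t - last_kick >= cooldown:
            # Near a critical point: take a small random step in the tangent space.
            noise = rng.standard_normal(x.shape)
            noise -= (noise @ x) * x
            x = retract(x, radius * noise / np.linalg.norm(noise))
            last_kick = t
        else:
            x = retract(x, -step * g)
    return x

A = np.diag([1.0, 2.0, 3.0])
saddle = np.array([0.0, 1.0, 0.0])  # eigenvector for the middle eigenvalue
x_final = perturbed_rgd(A, saddle)
f_final = float(x_final @ A @ x_final)
```

Started exactly at the saddle, unperturbed descent never moves because the Riemannian gradient there is zero; the random tangent step supplies a component along the descent direction, which the subsequent gradient steps then amplify.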

Improved visible to IR image transformation using synthetic data augmentation with cycle-consistent adversarial networks


Title	Improved visible to IR image transformation using synthetic data augmentation with cycle-consistent adversarial networks
Authors	Kyongsik Yun, Kevin Yu, Joseph Osborne, Sarah Eldin, Luan Nguyen, Alexander Huyen, Thomas Lu
Abstract	Infrared (IR) images are essential to improve the visibility of dark or camouflaged objects. Object recognition and segmentation based on a neural network using IR images provide more accuracy and insight than color visible images. But the bottleneck is the amount of relevant IR images for training. It is difficult to collect real-world IR images for special purposes, including space exploration, military and fire-fighting applications. To solve this problem, we created color visible and IR images using a Unity-based 3D game editor. These synthetically generated color visible and IR images were used to train cycle consistent adversarial networks (CycleGAN) to convert visible images to IR images. CycleGAN has the advantage that it does not require precisely matching visible and IR pairs for transformation training. In this study, we discovered that additional synthetic data can help improve CycleGAN performance. Neural network training using real data (N = 20) performed more accurate transformations than training using real (N = 10) and synthetic (N = 10) data combinations. The result indicates that the synthetic data cannot exceed the quality of the real data. Neural network training using real (N = 10) and synthetic (N = 100) data combinations showed almost the same performance as training using real data (N = 20). At least 10 times more synthetic data than real data is required to achieve the same performance. In summary, CycleGAN is used with synthetic data to improve the IR image conversion performance of visible images.
Tasks	Data Augmentation, Object Recognition
Published	2019-04-25
URL	http://arxiv.org/abs/1904.11620v1
PDF	http://arxiv.org/pdf/1904.11620v1.pdf
PWC	https://paperswithcode.com/paper/improved-visible-to-ir-image-transformation
Repo
Framework