Paper Group ANR 455
Tree Search Network for Sparse Regression. LCD: Learned Cross-Domain Descriptors for 2D-3D Matching. Snow avalanche segmentation in SAR images with Fully Convolutional Neural Networks. Unsupervised Monocular Depth Prediction for Indoor Continuous Video Streams. Vision: A Deep Learning Approach to provide walking assistance to the visually impaired. …
Tree Search Network for Sparse Regression
Title | Tree Search Network for Sparse Regression |
Authors | Kyung-Su Kim, Sae-Young Chung |
Abstract | We consider the classical sparse regression problem of recovering a sparse signal $x_0$ given a measurement vector $y = \Phi x_0+w$. We propose a tree search algorithm driven by a deep neural network for sparse regression (TSN). TSN improves the signal reconstruction performance of a deep neural network designed for sparse regression by performing a tree search with pruning. We observe that, in both noiseless and noisy cases, TSN recovers synthetic and real signals with lower complexity than a conventional tree search and outperforms existing algorithms by a large margin for various types of sensing matrices $\Phi$ widely used in sparse regression. |
Tasks | |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.00864v1 |
http://arxiv.org/pdf/1904.00864v1.pdf | |
PWC | https://paperswithcode.com/paper/tree-search-network-for-sparse-regression |
Repo | |
Framework | |
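The measurement model $y = \Phi x_0 + w$ and a classical greedy baseline for this problem can be sketched in a few lines. This is not the paper's TSN, only plain orthogonal matching pursuit on a random Gaussian sensing matrix, with all dimensions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 40, 100, 3              # measurements, signal length, sparsity
Phi = rng.standard_normal((n, m)) / np.sqrt(n)   # sensing matrix
x0 = np.zeros(m)
x0[rng.choice(m, k, replace=False)] = [3.0, -2.0, 4.0]
y = Phi @ x0                      # noiseless measurements y = Phi x0

def omp(Phi, y, k):
    """Orthogonal matching pursuit: greedily pick the column most
    correlated with the residual, then re-fit by least squares."""
    residual, idx = y.copy(), []
    for _ in range(k):
        idx.append(int(np.argmax(np.abs(Phi.T @ residual))))
        coef, *_ = np.linalg.lstsq(Phi[:, idx], y, rcond=None)
        residual = y - Phi[:, idx] @ coef
    x = np.zeros(Phi.shape[1])
    x[idx] = coef
    return x

x_hat = omp(Phi, y, k)            # should recover x0 in this noiseless setup
```

TSN replaces this single greedy path with a network-guided tree search over candidate supports, pruning branches to keep the complexity below that of an exhaustive tree search.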
LCD: Learned Cross-Domain Descriptors for 2D-3D Matching
Title | LCD: Learned Cross-Domain Descriptors for 2D-3D Matching |
Authors | Quang-Hieu Pham, Mikaela Angelina Uy, Binh-Son Hua, Duc Thanh Nguyen, Gemma Roig, Sai-Kit Yeung |
Abstract | In this work, we present a novel method to learn a local cross-domain descriptor for 2D image and 3D point cloud matching. Our proposed method is a dual auto-encoder neural network that maps 2D and 3D input into a shared latent space representation. We show that such local cross-domain descriptors in the shared embedding are more discriminative than those obtained from individual training in 2D and 3D domains. To facilitate the training process, we built a new dataset by collecting $\approx 1.4$ million 2D-3D correspondences with various lighting conditions and settings from publicly available RGB-D scenes. Our descriptor is evaluated in three main experiments: 2D-3D matching, cross-domain retrieval, and sparse-to-dense depth estimation. Experimental results confirm the robustness of our approach as well as its competitive performance, not only in solving cross-domain tasks but also in generalizing to single-domain 2D and 3D tasks. Our dataset and code are released publicly at \url{https://hkust-vgd.github.io/lcd}. |
Tasks | 3D Point Cloud Matching, Depth Estimation |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09326v1 |
https://arxiv.org/pdf/1911.09326v1.pdf | |
PWC | https://paperswithcode.com/paper/lcd-learned-cross-domain-descriptors-for-2d |
Repo | |
Framework | |
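Once both modalities live in one embedding, 2D-3D matching reduces to nearest-neighbor search by cosine similarity. A minimal sketch with random unit vectors standing in for the two encoders' outputs (the real descriptors come from the paper's dual auto-encoder):

```python
import numpy as np

rng = np.random.default_rng(1)

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Stand-ins for encoder outputs: rows are unit-norm descriptors in the
# shared latent space.
desc_2d = l2_normalize(rng.standard_normal((5, 64)))
# A matching 3D descriptor is a slightly perturbed copy of its 2D partner.
desc_3d = l2_normalize(desc_2d + 0.05 * rng.standard_normal((5, 64)))

# 2D -> 3D matching: the highest cosine similarity wins.
sim = desc_2d @ desc_3d.T
matches = sim.argmax(axis=1)      # index of best 3D match per 2D descriptor
```

With descriptors this well separated, `matches` recovers the identity pairing; the paper's point is that a shared embedding makes real 2D and 3D descriptors separable in the same way.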
Snow avalanche segmentation in SAR images with Fully Convolutional Neural Networks
Title | Snow avalanche segmentation in SAR images with Fully Convolutional Neural Networks |
Authors | Filippo Maria Bianchi, Jakob Grahn, Markus Eckerstorfer, Eirik Malnes, Hannah Vickers |
Abstract | Knowledge about frequency and location of snow avalanche activity is essential for forecasting and mapping of snow avalanche hazard. Traditional field monitoring of avalanche activity has limitations, especially when surveying large and remote areas. In recent years, avalanche detection in Sentinel-1 radar satellite imagery has been developed to overcome this monitoring problem. Current state-of-the-art detection algorithms, based on radar signal processing techniques, have highly varying accuracy that is on average much lower than the accuracy of visual detections from human experts. To reduce this gap, we propose a deep learning architecture for detecting avalanches in Sentinel-1 radar images. We trained a neural network on 6345 manually labelled avalanches from 117 Sentinel-1 images, each one consisting of six channels with backscatter and topographical information. Then, we tested the best network configuration on one additional SAR image. Compared to the manual labelling (the gold standard), we achieved an F1 score above 66%, while the state-of-the-art detection algorithm produced an F1 score of 38%. A visual interpretation of the network’s results shows that it fails to detect only small avalanches, while it manages to detect some that were not labelled by the human expert. |
Tasks | |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05411v1 |
https://arxiv.org/pdf/1910.05411v1.pdf | |
PWC | https://paperswithcode.com/paper/snow-avalanche-segmentation-in-sar-images |
Repo | |
Framework | |
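The F1 scores quoted above compare predicted and manually labelled avalanche masks. The metric itself is straightforward to compute on binary segmentation masks, as this small sketch with hypothetical masks shows:

```python
import numpy as np

def f1_score(pred, truth):
    """F1 on binary masks: harmonic mean of precision and recall."""
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

truth = np.array([[0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]], dtype=bool)
pred  = np.array([[0, 1, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0]], dtype=bool)
print(f1_score(pred, truth))  # tp=3, fp=1, fn=1 -> 0.75
```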
Unsupervised Monocular Depth Prediction for Indoor Continuous Video Streams
Title | Unsupervised Monocular Depth Prediction for Indoor Continuous Video Streams |
Authors | Yinglong Feng, Shuncheng Wu, Okan Köpüklü, Xueyang Kang, Federico Tombari |
Abstract | This paper studies the unsupervised monocular depth prediction problem. Most existing unsupervised depth prediction algorithms are developed for outdoor scenarios, while, to our knowledge, work on depth prediction in indoor environments remains scarce. This work therefore focuses on narrowing the gap, first by evaluating existing approaches in indoor environments and then by improving on state-of-the-art architecture designs. Unlike typical outdoor training datasets such as KITTI, which impose motion constraints, indoor data contain more arbitrary camera movement and short baselines between consecutive images, which deteriorates network training for pose estimation. To address this issue, we propose two methods: first, a novel reconstruction loss function to constrain pose estimation, resulting in improved accuracy of the predicted disparity map; second, ensemble learning with a flipping strategy along with a median filter, operating directly on the output disparity map. We evaluate our approaches on the TUM RGB-D and self-collected datasets. The results show that both approaches outperform previous state-of-the-art unsupervised learning approaches. |
Tasks | Depth Estimation, Pose Estimation |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08995v1 |
https://arxiv.org/pdf/1911.08995v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-monocular-depth-prediction-for |
Repo | |
Framework | |
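The second method above (flipping ensemble plus median filter on the output disparity) can be sketched directly. The `predict_disparity` function here is a hypothetical stand-in for the trained network's forward pass; only the ensembling and filtering logic reflects the abstract:

```python
import numpy as np

def predict_disparity(img):
    # Hypothetical stand-in for the depth network's forward pass.
    return 0.5 * img

def flip_ensemble(img):
    """Average the predictions on the image and on its horizontal flip
    (flipped back), then smooth the result with a 3x3 median filter."""
    d = predict_disparity(img)
    d_flip = predict_disparity(img[:, ::-1])[:, ::-1]
    d_ens = 0.5 * (d + d_flip)
    h, w = d_ens.shape
    p = np.pad(d_ens, 1, mode="edge")            # edge-pad for the filter
    windows = np.stack([p[i:i + h, j:j + w]
                        for i in range(3) for j in range(3)])
    return np.median(windows, axis=0)            # 3x3 median per pixel

disp = flip_ensemble(np.arange(48.0).reshape(6, 8))
```

Averaging the flipped prediction cancels left/right-asymmetric errors, and the median filter removes isolated disparity outliers without blurring edges as much as a mean filter would.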
Vision: A Deep Learning Approach to provide walking assistance to the visually impaired
Title | Vision: A Deep Learning Approach to provide walking assistance to the visually impaired |
Authors | Nikhil Thakurdesai, Anupam Tripathi, Dheeraj Butani, Smita Sankhe |
Abstract | Visually impaired people face many difficulties in their daily routines and must struggle to complete even day-to-day chores. In this paper, we propose a system that helps the visually impaired by providing audio cues guiding them around obstacles, assisting them in moving through their surroundings. Object detection using YOLO detects nearby objects, and depth estimation using monocular vision gives the approximate distance of the detected objects from the user. Although stereo vision offers higher accuracy, it comes with many hardware constraints, which makes monocular vision the preferred choice for this application. |
Tasks | Depth Estimation, Object Detection |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08739v1 |
https://arxiv.org/pdf/1911.08739v1.pdf | |
PWC | https://paperswithcode.com/paper/vision-a-deep-learning-approach-to-provide |
Repo | |
Framework | |
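The core fusion step, pairing each detection with a distance read from the depth map, is simple to sketch. The depth values and box coordinates below are hypothetical; in the described system the boxes come from YOLO and the depth map from the monocular network:

```python
import numpy as np

def object_distances(depth_map, detections):
    """For each detection (label, x1, y1, x2, y2), report the median
    depth inside the box as its approximate distance to the user."""
    out = []
    for label, x1, y1, x2, y2 in detections:
        region = depth_map[y1:y2, x1:x2]
        out.append((label, float(np.median(region))))
    return out

# Hypothetical depth map (metres) and a YOLO-style bounding box.
depth = np.full((100, 100), 8.0)
depth[40:60, 40:60] = 2.0                       # a close obstacle
msgs = object_distances(depth, [("chair", 40, 40, 60, 60)])
print(msgs)  # [('chair', 2.0)]
```

Each `(label, distance)` pair can then be turned into an audio cue such as "chair, two metres ahead". The median is used rather than the mean so that background pixels inside the box do not skew the distance.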
One Homonym per Translation
Title | One Homonym per Translation |
Authors | Bradley Hauer, Grzegorz Kondrak |
Abstract | The study of homonymy is vital to resolving fundamental problems in lexical semantics. In this paper, we propose four hypotheses that characterize the unique behavior of homonyms in the context of translations, discourses, collocations, and sense clusters. We present a new annotated homonym resource that allows us to test our hypotheses on existing WSD resources. The results of the experiments provide strong empirical evidence for the hypotheses. This study represents a step towards a computational method for distinguishing between homonymy and polysemy, and constructing a definitive inventory of coarse-grained senses. |
Tasks | |
Published | 2019-04-17 |
URL | https://arxiv.org/abs/1904.08533v2 |
https://arxiv.org/pdf/1904.08533v2.pdf | |
PWC | https://paperswithcode.com/paper/one-homonym-per-translation |
Repo | |
Framework | |
Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers
Title | Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers |
Authors | Bubacarr Bah, Holger Rauhut, Ulrich Terstiege, Michael Westdickenberg |
Abstract | We study the convergence of gradient flows related to learning deep linear neural networks (where the activation function is the identity map) from data. In this case, the composition of the network layers amounts to simply multiplying the weight matrices of all layers together, resulting in an overparameterized problem. The gradient flow with respect to these factors can be re-interpreted as a Riemannian gradient flow on the manifold of rank-$r$ matrices endowed with a suitable Riemannian metric. We show that the flow always converges to a critical point of the underlying functional. Moreover, we establish that, for almost all initializations, the flow converges to a global minimum on the manifold of rank $k$ matrices for some $k\leq r$. |
Tasks | |
Published | 2019-10-12 |
URL | https://arxiv.org/abs/1910.05505v3 |
https://arxiv.org/pdf/1910.05505v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-deep-linear-neural-networks |
Repo | |
Framework | |
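The scalar case already illustrates the phenomenon the abstract describes: gradient descent on the factored, overparameterized objective still reaches the global minimum. A toy sketch (not the paper's Riemannian analysis), with the product a*b standing in for the end-to-end matrix W_N ... W_1:

```python
# Two-layer scalar "network": minimize L(a, b) = (a*b - t)^2 / 2,
# where the product a*b plays the role of W_N ... W_1.
t, a, b, lr = 2.0, 1.0, 0.5, 0.05
for _ in range(2000):
    g = a * b - t                             # dL/d(ab)
    a, b = a - lr * g * b, b - lr * g * a     # gradient step on each factor
```

Despite the nonconvexity in (a, b), the product converges to the global minimizer a*b = t, mirroring (for N = 2 scalar factors) the almost-sure convergence to a global minimum established in the paper.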
Digital Passport: A Novel Technological Strategy for Intellectual Property Protection of Convolutional Neural Networks
Title | Digital Passport: A Novel Technological Strategy for Intellectual Property Protection of Convolutional Neural Networks |
Authors | Lixin Fan, KamWoh Ng, Chee Seng Chan |
Abstract | To prevent deep neural networks from being infringed by unauthorized parties, we propose a generic solution that embeds a designated digital passport into a network and subsequently either paralyzes the network’s functionality for unauthorized usage or maintains its functionality in the presence of a verified passport. Such a desired network behavior is successfully demonstrated in a number of implementation schemes, which provide reliable, preventive and timely protection against tens of thousands of fake-passport deceptions. Extensive experiments also show that deep neural network performance under unauthorized usage deteriorates significantly (e.g. with 33% to 82% reductions in CIFAR10 classification accuracy), while networks endorsed with valid passports remain intact. |
Tasks | |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04368v1 |
https://arxiv.org/pdf/1905.04368v1.pdf | |
PWC | https://paperswithcode.com/paper/digital-passport-a-novel-technological |
Repo | |
Framework | |
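The passport-gating idea can be caricatured in a few lines. This is an illustrative toy, not the paper's actual scheme: here a layer's channel scales are simply derived from the presented passport, so only the genuine passport reproduces the intended behaviour and a fake one distorts the output:

```python
import numpy as np

rng = np.random.default_rng(5)
W = rng.standard_normal((4, 8))          # a layer's weight matrix
passport = rng.standard_normal(8)        # the designated digital passport

def forward(x, W, presented):
    # Channel scales are computed from the presented passport, so the
    # layer only behaves as trained when the genuine passport is shown.
    gamma = np.tanh(W @ presented)
    return gamma * (W @ x)

x = rng.standard_normal(8)
y_ok = forward(x, W, passport)                  # authorized usage
y_bad = forward(x, W, rng.standard_normal(8))   # fake passport
```

With a fake passport the scales are wrong, so `y_bad` differs from `y_ok`; in the paper this mismatch is what drives the large accuracy drops reported for unauthorized usage.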
Collaborative Execution of Deep Neural Networks on Internet of Things Devices
Title | Collaborative Execution of Deep Neural Networks on Internet of Things Devices |
Authors | Ramyad Hadidi, Jiashen Cao, Micheal S. Ryoo, Hyesoon Kim |
Abstract | With recent advancements in deep neural networks (DNNs), we are able to solve traditionally challenging problems. Since DNNs are compute intensive, consumers who want to deploy a service must rely on expensive and scarce compute resources in the cloud. This approach, in addition to its dependence on high-quality network infrastructure and data centers, raises new privacy concerns. These challenges may limit DNN-based applications, so many researchers have tried to optimize DNNs for local and in-edge execution. However, the inadequate power and computing resources of edge devices, along with the small number of requests, limit the applicability of current optimizations such as batch processing. In this paper, we propose an approach that aggregates the existing computing power of Internet of Things (IoT) devices surrounding an environment by creating a collaborative network. In this approach, IoT devices cooperate to conduct single-batch inferencing in real time. By exploiting several new model-parallelism methods and their distribution characteristics, our approach enhances the collaborative network by creating a balanced and distributed processing pipeline. We illustrate our work on a cluster of Raspberry Pis, studying DNN models such as AlexNet, VGG16, Xception, and C3D. |
Tasks | |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02537v1 |
http://arxiv.org/pdf/1901.02537v1.pdf | |
PWC | https://paperswithcode.com/paper/collaborative-execution-of-deep-neural |
Repo | |
Framework | |
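One of the model-parallelism methods the abstract alludes to, output-wise splitting of a fully connected layer, is easy to sketch: each cooperating device owns a row slice of the weight matrix, computes its partial output, and the slices are concatenated. (The devices here are simulated by function calls; the paper distributes them over real IoT hardware.)

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(8)           # layer input
W = rng.standard_normal((16, 8))     # full layer weights

def device_forward(W_slice, x):
    # Work done by one "device": its share of the matrix-vector product.
    return W_slice @ x

slices = np.array_split(W, 4)        # 4 cooperating devices
y_dist = np.concatenate([device_forward(s, x) for s in slices])

assert np.allclose(y_dist, W @ x)    # matches single-device execution
```

Because each slice touches only a quarter of the weights, memory and compute per device shrink proportionally, which is what makes single-batch inference feasible on constrained IoT hardware.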
Neural-Attention-Based Deep Learning Architectures for Modeling Traffic Dynamics on Lane Graphs
Title | Neural-Attention-Based Deep Learning Architectures for Modeling Traffic Dynamics on Lane Graphs |
Authors | Matthew A. Wright, Simon F. G. Ehlers, Roberto Horowitz |
Abstract | Deep neural networks can be powerful tools, but require careful application-specific design to ensure that the most informative relationships in the data are learnable. In this paper, we apply deep neural networks to the nonlinear spatiotemporal physics problem of vehicle traffic dynamics. We consider problems of estimating macroscopic quantities (e.g., the queue at an intersection) at a lane level. First-principles modeling at the lane scale has been a challenge due to complexities in modeling social behaviors like lane changes, and those behaviors’ resultant macro-scale effects. Following domain knowledge that upstream/downstream lanes and neighboring lanes affect each other’s traffic flows in distinct ways, we apply a form of neural attention that allows the neural network layers to aggregate information from different lanes in different manners. Using a microscopic traffic simulator as a testbed, we obtain results showing that an attentional neural network model can use information from nearby lanes to improve predictions, and that explicitly encoding the lane-to-lane relationship types significantly improves performance. We also demonstrate the transfer of our learned neural network to a more complex road network, discuss how its performance degradation may be attributable to new traffic behaviors induced by increased topological complexity, and motivate learning dynamics models from many road network topologies. |
Tasks | |
Published | 2019-04-18 |
URL | https://arxiv.org/abs/1904.08831v3 |
https://arxiv.org/pdf/1904.08831v3.pdf | |
PWC | https://paperswithcode.com/paper/neural-attention-based-deep-learning |
Repo | |
Framework | |
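The relation-typed attention the abstract describes can be sketched in miniature: neighbors are grouped by relation type, each type gets its own projection, and a dot-product attention weights the projected messages. The relation names and dimensions below are illustrative placeholders, not the paper's exact architecture:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(3)
d = 4
h_self = rng.standard_normal(d)                  # target lane's state
# Neighbor lane states grouped by (hypothetical) relation type.
neighbors = {"longitudinal": rng.standard_normal((2, d)),
             "lateral":      rng.standard_normal((3, d))}
# One projection per relation type: upstream/downstream and adjacent
# lanes influence a lane in distinct, separately learned ways.
W = {k: rng.standard_normal((d, d)) for k in neighbors}

msgs = np.vstack([n @ W[k].T for k, n in neighbors.items()])
scores = softmax(msgs @ h_self)                  # dot-product attention
h_new = scores @ msgs                            # attention-weighted sum
```

Dropping the per-type matrices `W` and using one shared projection recovers ordinary graph attention; the paper's ablation shows that keeping the types separate is what yields the significant performance gain.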
Semi-supervised Learning using Adversarial Training with Good and Bad Samples
Title | Semi-supervised Learning using Adversarial Training with Good and Bad Samples |
Authors | Wenyuan Li, Zichen Wang, Yuguang Yue, Jiayun Li, William Speier, Mingyuan Zhou, Corey W. Arnold |
Abstract | In this work, we investigate semi-supervised learning (SSL) for image classification using adversarial training. Previous results have illustrated that generative adversarial networks (GANs) can be used for multiple purposes. Triple-GAN, which aims to jointly optimize model components by incorporating three players, generates suitable image-label pairs to compensate for the lack of labeled data in SSL with improved benchmark performance. Conversely, Bad (or complementary) GAN optimizes generation to produce complementary data-label pairs and force a classifier’s decision boundary to lie between data manifolds. Although it generally outperforms Triple-GAN, Bad GAN is highly sensitive to the amount of labeled data used for training. Unifying these two approaches, we present unified-GAN (UGAN), a novel framework that enables a classifier to simultaneously learn from both good and bad samples through adversarial training. We perform extensive experiments on various datasets and demonstrate that UGAN: 1) achieves state-of-the-art performance among other deep generative models, and 2) is robust to variations in the amount of labeled data used for training. |
Tasks | Image Classification |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.08540v1 |
https://arxiv.org/pdf/1910.08540v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-learning-using-adversarial |
Repo | |
Framework | |
INTEL-TAU: A Color Constancy Dataset
Title | INTEL-TAU: A Color Constancy Dataset |
Authors | Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis, Jarno Nikkanen, Moncef Gabbouj |
Abstract | In this paper, we describe a new large dataset for illumination estimation. This dataset, called INTEL-TAU, contains 7022 images in total, making it the largest available high-resolution dataset for illumination estimation research. The variety of scenes, captured using three different camera models (Canon 5DSR, Nikon D810, and Sony IMX135), makes the dataset appropriate for evaluating the camera and scene invariance of different illumination estimation techniques. Privacy masking is applied to sensitive information, e.g., faces, making the dataset compliant with the General Data Protection Regulation (GDPR). Furthermore, the effect of color shading for mobile images can be evaluated with INTEL-TAU, as we provide both corrected and uncorrected versions of the raw data. Finally, we provide an evaluation of several color constancy approaches on the proposed dataset. |
Tasks | Color Constancy |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10404v3 |
https://arxiv.org/pdf/1910.10404v3.pdf | |
PWC | https://paperswithcode.com/paper/intel-tau-a-color-constancy-dataset |
Repo | |
Framework | |
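For readers new to the task: illumination estimation predicts the scene's light color from an image, and methods are conventionally scored by the angular error between estimated and ground-truth illuminants. A minimal sketch using the classical gray-world estimator on a synthetic flat scene (not one of the paper's evaluated methods specifically, though gray-world is a standard baseline):

```python
import numpy as np

def gray_world(img):
    """Gray-world estimate: the scene average is assumed achromatic,
    so the normalized per-channel mean is the illuminant estimate."""
    est = img.reshape(-1, 3).mean(axis=0)
    return est / np.linalg.norm(est)

def angular_error_deg(est, truth):
    """Recovery angular error, the standard color-constancy metric."""
    cos = est @ truth / (np.linalg.norm(est) * np.linalg.norm(truth))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# A flat scene lit by a reddish illuminant is recovered exactly.
illum = np.array([0.8, 0.5, 0.33])
img = np.ones((4, 4, 3)) * illum
err = angular_error_deg(gray_world(img), illum)
```

On real scenes the gray-world assumption breaks and the error grows, which is exactly the gap that learned estimators evaluated on datasets like INTEL-TAU aim to close.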
Multi-Level Batch Normalization In Deep Networks For Invasive Ductal Carcinoma Cell Discrimination In Histopathology Images
Title | Multi-Level Batch Normalization In Deep Networks For Invasive Ductal Carcinoma Cell Discrimination In Histopathology Images |
Authors | Francisco Perdigon Romero, An Tang, Samuel Kadoury |
Abstract | Breast cancer is the most frequently diagnosed cancer and a predominant cause of death in women worldwide. Imaging techniques such as breast cancer pathology help in the diagnosis and monitoring of the disease. However, identification of malignant cells can be challenging given the high heterogeneity in tissue absorption of staining agents. In this work, we present a novel approach for Invasive Ductal Carcinoma (IDC) cell discrimination in histopathology slides. We propose a model derived from the Inception architecture, adding a multi-level batch normalization module between each convolutional step. This module was used as a base block for feature extraction in a CNN architecture. On the open IDC dataset, we obtained a balanced accuracy of 0.89 and an F1 score of 0.90, thus surpassing recent state-of-the-art classification algorithms tested on this public dataset. |
Tasks | |
Published | 2019-01-11 |
URL | http://arxiv.org/abs/1901.03684v1 |
http://arxiv.org/pdf/1901.03684v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-level-batch-normalization-in-deep |
Repo | |
Framework | |
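The building block being stacked here, batch normalization, is standard and easy to state precisely: normalize each feature over the batch, then apply a learnable scale and shift. A minimal numpy sketch of the per-feature computation (the paper's contribution is where the modules are placed, not the operation itself):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch of activations per feature, then apply the
    learnable scale (gamma) and shift (beta)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
y = batch_norm(x)
# After normalization each column has mean ~0 and std ~1.
```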
Escaping from saddle points on Riemannian manifolds
Title | Escaping from saddle points on Riemannian manifolds |
Authors | Yue Sun, Nicolas Flammarion, Maryam Fazel |
Abstract | We consider minimizing a nonconvex, smooth function $f$ on a Riemannian manifold $\mathcal{M}$. We show that a perturbed version of Riemannian gradient descent algorithm converges to a second-order stationary point (and hence is able to escape saddle points on the manifold). The rate of convergence depends as $1/\epsilon^2$ on the accuracy $\epsilon$, which matches a rate known only for unconstrained smooth minimization. The convergence rate depends polylogarithmically on the manifold dimension $d$, hence is almost dimension-free. The rate also has a polynomial dependence on the parameters describing the curvature of the manifold and the smoothness of the function. While the unconstrained problem (Euclidean setting) is well-studied, our result is the first to prove such a rate for nonconvex, manifold-constrained problems. |
Tasks | |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07355v1 |
https://arxiv.org/pdf/1906.07355v1.pdf | |
PWC | https://paperswithcode.com/paper/escaping-from-saddle-points-on-riemannian |
Repo | |
Framework | |
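The Euclidean version of the perturbation mechanism is easy to demonstrate on the textbook saddle f(x, y) = x^2 - y^2: plain gradient descent started exactly at the saddle never moves, while injecting a small random perturbation whenever the gradient is tiny lets the iterates slide off along the negative-curvature direction. (This is a sketch of the general idea, not the paper's Riemannian algorithm.)

```python
import numpy as np

def grad(p):                          # f(x, y) = x^2 - y^2, saddle at origin
    return np.array([2 * p[0], -2 * p[1]])

rng = np.random.default_rng(4)
p, lr = np.zeros(2), 0.1              # start exactly at the saddle
for _ in range(300):
    g = grad(p)
    if np.linalg.norm(g) < 1e-3:      # gradient nearly zero: inject a
        p = p + 0.01 * rng.standard_normal(2)   # small random perturbation
    else:
        p = p - lr * g                # otherwise take a plain descent step
    if abs(p[1]) > 2.0:               # escaped along negative curvature
        break
```

The paper's contribution is proving that the manifold analogue of this scheme reaches a second-order stationary point at a rate matching the unconstrained case, with only polylogarithmic dependence on the dimension.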
Improved visible to IR image transformation using synthetic data augmentation with cycle-consistent adversarial networks
Title | Improved visible to IR image transformation using synthetic data augmentation with cycle-consistent adversarial networks |
Authors | Kyongsik Yun, Kevin Yu, Joseph Osborne, Sarah Eldin, Luan Nguyen, Alexander Huyen, Thomas Lu |
Abstract | Infrared (IR) images are essential to improve the visibility of dark or camouflaged objects. Neural-network-based object recognition and segmentation provide more accuracy and insight with IR images than with color visible images. The bottleneck, however, is the amount of relevant IR imagery available for training. It is difficult to collect real-world IR images for special purposes, including space exploration, military and fire-fighting applications. To solve this problem, we created color visible and IR images using a Unity-based 3D game editor. These synthetically generated color visible and IR images were used to train cycle-consistent adversarial networks (CycleGAN) to convert visible images to IR images. CycleGAN has the advantage that it does not require precisely matched visible and IR pairs for transformation training. In this study, we discovered that additional synthetic data can help improve CycleGAN performance. Neural network training using real data (N = 20) performed more accurate transformations than training using real (N = 10) and synthetic (N = 10) data combinations. The result indicates that the synthetic data cannot exceed the quality of the real data. Neural network training using real (N = 10) and synthetic (N = 100) data combinations showed almost the same performance as training using real data (N = 20). At least 10 times more synthetic data than real data is required to achieve the same performance. In summary, CycleGAN is used with synthetic data to improve the IR image conversion performance of visible images. |
Tasks | Data Augmentation, Object Recognition |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.11620v1 |
http://arxiv.org/pdf/1904.11620v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-visible-to-ir-image-transformation |
Repo | |
Framework | |
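The property that makes CycleGAN attractive here, no need for precisely matched visible/IR pairs, comes from its cycle-consistency loss: translating visible to IR and back should reproduce the input. A toy sketch with invertible affine maps standing in for the two generators:

```python
import numpy as np

def cycle_consistency_loss(x, G, F):
    """L1 cycle loss ||F(G(x)) - x||_1: mapping visible -> IR (G) and
    back IR -> visible (F) should reconstruct the input, so paired
    visible/IR training images are not required."""
    return float(np.mean(np.abs(F(G(x)) - x)))

# Toy "generators": invertible affine maps stand in for the networks.
G = lambda x: 2.0 * x + 1.0        # "visible -> IR"
F = lambda x: (x - 1.0) / 2.0      # "IR -> visible" (exactly G's inverse)
x = np.linspace(0.0, 1.0, 5)
loss = cycle_consistency_loss(x, G, F)   # 0.0 for a perfect cycle
```

During training this loss is minimized jointly with the adversarial losses, pushing the learned G and F toward being approximate inverses of each other just as the toy pair above is exactly.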