January 25, 2020

3086 words 15 mins read

Paper Group NAWR 23

6D-VNet: End-to-end 6DoF Vehicle Pose Estimation from Monocular RGB Images. TextPlace: Visual Place Recognition and Topological Localization Through Reading Scene Texts. Exploring Unexplored Tensor Network Decompositions for Convolutional Neural Networks. Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes. A Do …

6D-VNet: End-to-end 6DoF Vehicle Pose Estimation from Monocular RGB Images

Title 6D-VNet: End-to-end 6DoF Vehicle Pose Estimation from Monocular RGB Images
Authors Di Wu, Zhaoyong Zhuang, Canqun Xiang, Wenbin Zou and Xia Li
Abstract We present a conceptually simple framework for 6DoF object pose estimation, especially for the autonomous driving scenario. Our approach efficiently detects traffic participants in a monocular RGB image while simultaneously regressing their 3D translation and rotation vectors. The method, called 6D-VNet, extends Mask R-CNN by adding customised heads for predicting the vehicle's finer class, rotation and translation. The proposed 6D-VNet is trained end-to-end, in contrast to previous methods. Furthermore, we show that the inclusion of translational regression in the joint losses is crucial for the 6DoF pose estimation task, where the object translation distance along the longitudinal axis varies significantly, e.g., in autonomous driving scenarios. Additionally, we incorporate the mutual information between traffic participants via a modified non-local block. As opposed to the original non-local block implementation, the proposed weighting modification takes the spatial neighbouring information into consideration whilst counteracting the effect of extreme gradient values. Our 6D-VNet reaches the 1st place in the ApolloScape challenge 3D Car Instance task [21]. Code has been made available at: https://github.com/stevenwudi/6DVNET.
Tasks Autonomous Driving, Pose Estimation
Published 2019-06-15
URL http://openaccess.thecvf.com/content_CVPRW_2019/html/WAD/Wu_6D-VNet_End-To-End_6-DoF_Vehicle_Pose_Estimation_From_Monocular_RGB_Images_CVPRW_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPRW_2019/papers/WAD/Wu_6D-VNet_End-To-End_6-DoF_Vehicle_Pose_Estimation_From_Monocular_RGB_Images_CVPRW_2019_paper.pdf
PWC https://paperswithcode.com/paper/6d-vnet-end-to-end-6dof-vehicle-pose
Repo https://github.com/stevenwudi/6DVNET
Framework pytorch
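
Below is a minimal, hedged PyTorch sketch of what such customised heads could look like: a fine-class classifier, a quaternion rotation regressor, and a translation regressor that also sees the normalised 2D box, combined in a joint loss. The layer sizes, box encoding and loss weights are illustrative assumptions, not the authors' implementation (see the linked repo for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseHeads(nn.Module):
    """Fine-grained class, rotation (quaternion) and translation heads over per-RoI features."""
    def __init__(self, roi_feat_dim=1024, num_fine_classes=34):
        super().__init__()
        self.cls_head = nn.Linear(roi_feat_dim, num_fine_classes)
        self.rot_head = nn.Linear(roi_feat_dim, 4)                 # quaternion (qw, qx, qy, qz)
        # translation also conditions on the normalised 2D box (cx, cy, w, h): apparent size
        # carries much of the signal about depth along the longitudinal axis
        self.trans_head = nn.Sequential(
            nn.Linear(roi_feat_dim + 4, 256), nn.ReLU(),
            nn.Linear(256, 3),                                     # (x, y, z) in metres
        )

    def forward(self, roi_feats, boxes_norm):
        logits = self.cls_head(roi_feats)
        quat = F.normalize(self.rot_head(roi_feats), dim=-1)       # unit quaternion
        trans = self.trans_head(torch.cat([roi_feats, boxes_norm], dim=-1))
        return logits, quat, trans

def joint_loss(logits, quat, trans, gt_cls, gt_quat, gt_trans, w_rot=1.0, w_trans=1.0):
    l_cls = F.cross_entropy(logits, gt_cls)
    l_rot = (1.0 - (quat * gt_quat).sum(-1).abs()).mean()          # quaternion alignment loss
    l_trans = F.smooth_l1_loss(trans, gt_trans)
    return l_cls + w_rot * l_rot + w_trans * l_trans
```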

TextPlace: Visual Place Recognition and Topological Localization Through Reading Scene Texts

Title TextPlace: Visual Place Recognition and Topological Localization Through Reading Scene Texts
Authors Ziyang Hong, Yvan Petillot, David Lane, Yishu Miao, Sen Wang
Abstract Visual place recognition is a fundamental problem for many vision-based applications. Sparse-feature and deep-learning-based methods have been successful and dominant over the past decade. However, most of them do not explicitly leverage high-level semantic information to deal with challenging scenarios where they may fail. This paper proposes a novel visual place recognition algorithm, termed TextPlace, based on scene texts in the wild. Since scene texts are high-level information invariant to illumination changes and very distinct for different places when spatial correlation is considered, they are beneficial for visual place recognition under extreme appearance changes and perceptual aliasing. TextPlace also takes the spatial-temporal dependence between scene texts into account for topological localization. Extensive experiments show that TextPlace achieves state-of-the-art performance, verifying the effectiveness of using high-level scene texts for robust visual place recognition in urban areas.
Tasks Visual Place Recognition
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Hong_TextPlace_Visual_Place_Recognition_and_Topological_Localization_Through_Reading_Scene_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Hong_TextPlace_Visual_Place_Recognition_and_Topological_Localization_Through_Reading_Scene_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/textplace-visual-place-recognition-and
Repo https://github.com/ziyanghong/dataset
Framework none
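
A hedged, minimal illustration of the core idea (not the paper's pipeline): localise by fuzzily matching the scene texts read in a query frame against the texts stored at each node of a topological map. The map structure and scoring below are assumptions for illustration.

```python
from difflib import SequenceMatcher

def text_similarity(a, b):
    """Fuzzy match between two OCR strings (OCR output is noisy)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def place_score(query_texts, place_texts):
    """Score a map node by the best match found for each query text."""
    if not query_texts or not place_texts:
        return 0.0
    return sum(max(text_similarity(q, p) for p in place_texts) for q in query_texts) / len(query_texts)

def localize(query_texts, topo_map):
    """topo_map: {node_id: [texts seen at that place]} -> best matching node."""
    return max(topo_map, key=lambda node: place_score(query_texts, topo_map[node]))

# toy example
topo_map = {"node_3": ["STARBUCKS", "EXIT 4"], "node_7": ["TESCO", "CAR PARK"]}
print(localize(["STARBUCK5", "EX1T 4"], topo_map))  # -> node_3
```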

Exploring Unexplored Tensor Network Decompositions for Convolutional Neural Networks

Title Exploring Unexplored Tensor Network Decompositions for Convolutional Neural Networks
Authors Kohei Hayashi, Taiki Yamaguchi, Yohei Sugawara, Shin-Ichi Maeda
Abstract Tensor decomposition methods are widely used for model compression and fast inference in convolutional neural networks (CNNs). Although many decompositions are conceivable, only CP decomposition and a few others have been applied in practice, and no extensive comparisons have been made between available methods. Previous studies have not determined how many decompositions are available, nor which of them is optimal. In this study, we first characterize a decomposition class specific to CNNs by adopting a flexible graphical notation. The class includes such well-known CNN modules as depthwise separable convolution layers and bottleneck layers, but also previously unknown modules with nonlinear activations. We also experimentally compare the tradeoff between prediction accuracy and time/space complexity for modules found by enumerating all possible decompositions, or by using a neural architecture search. We find some nonlinear decompositions outperform existing ones.
Tasks Model Compression, Neural Architecture Search
Published 2019-12-01
URL http://papers.nips.cc/paper/8793-exploring-unexplored-tensor-network-decompositions-for-convolutional-neural-networks
PDF http://papers.nips.cc/paper/8793-exploring-unexplored-tensor-network-decompositions-for-convolutional-neural-networks.pdf
PWC https://paperswithcode.com/paper/exploring-unexplored-tensor-network
Repo https://github.com/pfnet-research/einconv
Framework pytorch
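
As a concrete member of this decomposition class, the depthwise separable convolution factors a dense KxK convolution into a per-channel spatial filter followed by a 1x1 channel-mixing convolution. The PyTorch sketch below (not the authors' einconv code) shows the factorisation and its parameter saving.

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch)  # per-channel spatial filter
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)                               # 1x1 channel mixing

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

dense = nn.Conv2d(64, 128, 3, padding=1)
factored = DepthwiseSeparableConv(64, 128, 3)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(dense), count(factored))  # the factored form is markedly smaller
```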

Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes

Title Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes
Authors Greg Yang
Abstract Wide neural networks with random weights and biases are Gaussian processes, as observed by Neal (1995) for shallow networks, and more recently by Lee et al.~(2018) and Matthews et al.~(2018) for deep fully-connected networks, as well as by Novak et al.~(2019) and Garriga-Alonso et al.~(2019) for deep convolutional networks. We show that this Neural Network-Gaussian Process correspondence surprisingly extends to all modern feedforward or recurrent neural networks composed of multilayer perceptron, RNNs (e.g. LSTMs, GRUs), (nD or graph) convolution, pooling, skip connection, attention, batch normalization, and/or layer normalization. More generally, we introduce a language for expressing neural network computations, and our result encompasses all such expressible neural networks. This work serves as a tutorial on the \emph{tensor programs} technique formulated in Yang (2019) and elucidates the Gaussian Process results obtained there. We provide open-source implementations of the Gaussian Process kernels of simple RNN, GRU, transformer, and batchnorm+ReLU network at github.com/thegregyang/GP4A. Please see our arxiv version for the complete and up-to-date version of this paper.
Tasks Gaussian Processes
Published 2019-12-01
URL http://papers.nips.cc/paper/9186-wide-feedforward-or-recurrent-neural-networks-of-any-architecture-are-gaussian-processes
PDF http://papers.nips.cc/paper/9186-wide-feedforward-or-recurrent-neural-networks-of-any-architecture-are-gaussian-processes.pdf
PWC https://paperswithcode.com/paper/wide-feedforward-or-recurrent-neural-networks
Repo https://github.com/thegregyang/GP4A
Framework none
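
For a flavour of the correspondence, the sketch below computes the NNGP kernel of an infinitely wide deep ReLU MLP via the classic arc-cosine recursion; the paper's tensor-programs result extends this style of computation to arbitrary architectures, with reference kernels in the linked GP4A repo. The weight/bias variances here are illustrative defaults.

```python
import numpy as np

def relu_nngp_kernel(X, depth=3, sigma_w2=2.0, sigma_b2=0.0):
    """X: (n, d) inputs -> (n, n) GP kernel of an infinitely wide depth-layer ReLU MLP."""
    K = sigma_w2 * (X @ X.T) / X.shape[1] + sigma_b2              # layer-0 (linear) kernel
    for _ in range(depth):
        diag = np.sqrt(np.diag(K))
        cos_theta = np.clip(K / np.outer(diag, diag), -1.0, 1.0)
        theta = np.arccos(cos_theta)
        # E[relu(u) relu(v)] for (u, v) jointly Gaussian with covariance K
        K = sigma_w2 * np.outer(diag, diag) * (np.sin(theta) + (np.pi - theta) * cos_theta) / (2 * np.pi) + sigma_b2
    return K

X = np.random.randn(5, 10)
print(relu_nngp_kernel(X).shape)  # (5, 5)
```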

A Domain Agnostic Measure for Monitoring and Evaluating GANs

Title A Domain Agnostic Measure for Monitoring and Evaluating GANs
Authors Paulina Grnarova, Kfir Y. Levy, Aurelien Lucchi, Nathanael Perraudin, Ian Goodfellow, Thomas Hofmann, Andreas Krause
Abstract Generative Adversarial Networks (GANs) have shown remarkable results in modeling complex distributions, but their evaluation remains an unsettled issue. Evaluations are essential for: (i) relative assessment of different models and (ii) monitoring the progress of a single model throughout training. The latter cannot be determined by simply inspecting the generator and discriminator loss curves as they behave non-intuitively. We leverage the notion of duality gap from game theory to propose a measure that addresses both (i) and (ii) at a low computational cost. Extensive experiments show the effectiveness of this measure to rank different GAN models and capture the typical GAN failure scenarios, including mode collapse and non-convergent behaviours. This evaluation metric also provides meaningful monitoring on the progression of the loss during training. It highly correlates with FID on natural image datasets, and with domain specific scores for text, sound and cosmology data where FID is not directly suitable. In particular, our proposed metric requires no labels or a pretrained classifier, making it domain agnostic.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9377-a-domain-agnostic-measure-for-monitoring-and-evaluating-gans
PDF http://papers.nips.cc/paper/9377-a-domain-agnostic-measure-for-monitoring-and-evaluating-gans.pdf
PWC https://paperswithcode.com/paper/a-domain-agnostic-measure-for-monitoring-and
Repo https://github.com/pgrnar/DualityGap
Framework none
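
A hedged sketch of the measure: for a training snapshot (G, D), the duality gap is max over D' of V(G, D') minus min over G' of V(G', D), approximated here by briefly fine-tuning copies of D and G while the other network is frozen. The number of steps, the learning rate, and the assumption that D outputs probabilities are illustrative choices, not the paper's exact protocol.

```python
import copy
import torch

def gan_value(G, D, real, z):
    """Non-saturating value V(G, D) = E[log D(x)] + E[log(1 - D(G(z)))]; assumes D outputs probabilities."""
    eps = 1e-6
    return torch.log(D(real) + eps).mean() + torch.log(1 - D(G(z)) + eps).mean()

def duality_gap(G, D, real, z, steps=50, lr=1e-3):
    # worst-case discriminator for the current generator
    D_adv = copy.deepcopy(D)
    opt = torch.optim.Adam(D_adv.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad(); (-gan_value(G, D_adv, real, z)).backward(); opt.step()
    # worst-case generator for the current discriminator
    G_adv = copy.deepcopy(G)
    opt = torch.optim.Adam(G_adv.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad(); gan_value(G_adv, D, real, z).backward(); opt.step()
    return (gan_value(G, D_adv, real, z) - gan_value(G_adv, D, real, z)).item()
```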

Recurrent Highway Networks with Grouped Auxiliary Memory

Title Recurrent Highway Networks with Grouped Auxiliary Memory
Authors Wei Luo, Feng Yu
Abstract Recurrent neural networks (RNNs) are challenging to train, let alone those with deep spatial structures. Architectures built upon highway connections, such as the Recurrent Highway Network (RHN), were developed to allow larger step-to-step transition depth, leading to more expressive models. However, problems that require capturing long-term dependencies still cannot be well addressed by these models. Moreover, the ability to keep long-term memories tends to diminish as the spatial depth increases, since deeper structures may accelerate gradient vanishing. In this paper, we address these issues by proposing a novel RNN architecture based on RHN, namely the Recurrent Highway Network with Grouped Auxiliary Memory (GAM-RHN). The proposed architecture interconnects the RHN with a set of auxiliary memory units dedicated to storing long-term information via reading and writing operations, analogous to Memory Augmented Neural Networks (MANNs). Experimental results on artificial long time lag tasks show that GAM-RHNs can be trained efficiently while being deep in both time and space. We also evaluate the proposed architecture on a variety of tasks, including language modeling, sequential image classification, and financial market forecasting. The potential of our approach is demonstrated by achieving state-of-the-art results on these tasks.
Tasks Image Classification, Language Modelling, Sequential Image Classification, Stock Trend Prediction
Published 2019-12-13
URL https://ieeexplore.ieee.org/document/8932404
PDF https://ieeexplore.ieee.org/document/8932404
PWC https://paperswithcode.com/paper/recurrent-highway-networks-with-grouped
Repo https://github.com/WilliamRo/gam_rhn
Framework tf
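
A minimal sketch of the MANN-style component described above: content-based read and soft write operations on an auxiliary memory (the recurrent highway cell itself is omitted). The shapes, the write rule and the decay factor are assumptions for illustration, not the paper's exact equations.

```python
import torch
import torch.nn.functional as F

def memory_read(memory, key):
    """memory: (slots, dim); key: (dim,) -> attention-weighted read vector and the weights."""
    weights = F.softmax(memory @ key / memory.shape[1] ** 0.5, dim=0)
    return weights @ memory, weights

def memory_write(memory, key, value, gamma=0.9):
    """Blend the new value into the slots addressed by the key (soft write with decay gamma)."""
    _, weights = memory_read(memory, key)
    return gamma * memory + (1 - gamma) * weights.unsqueeze(1) * value.unsqueeze(0)

mem = torch.zeros(8, 16)
mem = memory_write(mem, key=torch.randn(16), value=torch.randn(16))
read, _ = memory_read(mem, key=torch.randn(16))
```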

Preprocessing Method for Performance Enhancement in CNN-based STEMI Detection from 12-lead ECG

Title Preprocessing Method for Performance Enhancement in CNN-based STEMI Detection from 12-lead ECG
Authors YeongHyeon Park, Il Dong Yun, Si-Hyuck Kang
Abstract ST elevation myocardial infarction (STEMI) is an acute life-threatening disease. Mortality risk is high when a patient is not treated within the golden time, so prompt diagnosis with limited information such as the electrocardiogram (ECG) is crucial. However, previous studies among physicians and paramedics have shown that the accuracy of STEMI diagnosis from the ECG is not sufficient. Thus, we propose a convolutional neural network (CNN) based algorithm for detecting STEMI on the 12-lead ECG in order to support physicians, especially in the emergency room. We mostly focus on enhancing detection performance using a preprocessing technique. First, we reduce ECG noise using a notch filter and a high-pass filter. We also segment pulses from the ECG to focus on the ST segment. We use 96 normal and 179 STEMI records provided by Seoul National University Bundang Hospital (SNUBH) for the experiment. The sensitivity, specificity, and area under the curve (AUC) of the receiver operating characteristic (ROC) curve increase from 0.685, 0.350, and 0.526 to 0.932, 0.896, and 0.943, respectively, when the preprocessing is applied. As our results show, the proposed method is effective at enhancing STEMI detection performance. The proposed algorithm is also expected to help timely and accurate diagnosis of STEMI in clinical practice.
Tasks
Published 2019-07-24
URL https://ieeexplore.ieee.org/abstract/document/8771175
PDF https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8771175
PWC https://paperswithcode.com/paper/preprocessing-method-for-performance
Repo https://github.com/YeongHyeon/Enhancementing-Method-for-STEMI-Detection
Framework tf
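
A hedged sketch of the preprocessing stage described above: a notch filter for power-line interference followed by a high-pass filter for baseline wander, applied per lead. The sampling rate and cutoff frequencies below are common defaults, not necessarily the paper's values.

```python
import numpy as np
from scipy import signal

def preprocess_ecg(ecg, fs=500.0, mains_hz=60.0, highpass_hz=0.5):
    """ecg: 1-D array for a single lead, sampled at fs Hz."""
    # notch filter to suppress power-line interference
    b_notch, a_notch = signal.iirnotch(mains_hz, Q=30.0, fs=fs)
    ecg = signal.filtfilt(b_notch, a_notch, ecg)
    # high-pass filter to remove baseline wander
    b_hp, a_hp = signal.butter(2, highpass_hz, btype="highpass", fs=fs)
    return signal.filtfilt(b_hp, a_hp, ecg)

# toy signal: slow heartbeat-like wave + 60 Hz hum + baseline drift
t = np.arange(0, 10, 1 / 500.0)
noisy = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.sin(2 * np.pi * 60 * t) + 0.5 * t / t.max()
clean = preprocess_ecg(noisy)
```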

Activation Atlas

Title Activation Atlas
Authors Shan Carter, Zan Armstrong, Ludwig Schubert, Ian Johnson, Chris Olah
Abstract By using feature inversion to visualize millions of activations from an image classification network, we create an explorable activation atlas of features the network has learned which can reveal how the network typically represents some concepts.
Tasks Image Classification
Published 2019-03-06
URL https://distill.pub/2019/activation-atlas/
PDF https://distill.pub/2019/activation-atlas/
PWC https://paperswithcode.com/paper/activation-atlas
Repo https://github.com/tensorflow/lucid
Framework tf
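
A hedged sketch of the atlas layout step: sample spatial activation vectors from one layer across many images, project them to 2-D (the paper uses UMAP; the umap-learn package is assumed installed), and average the activations that land in each grid cell. Each averaged direction would then be rendered with feature inversion, e.g. via the linked lucid repo, which is omitted here.

```python
import numpy as np
import umap  # pip install umap-learn

def build_atlas_grid(activations, grid=20):
    """activations: (n, channels) vectors sampled from one layer across many images."""
    coords = umap.UMAP(n_components=2).fit_transform(activations)
    lo, hi = coords.min(0), coords.max(0)
    coords = (coords - lo) / (hi - lo + 1e-9)                      # normalize to [0, 1]
    cells = np.floor(coords * (grid - 1)).astype(int)
    atlas = {}
    for (i, j) in {tuple(c) for c in cells}:
        mask = (cells[:, 0] == i) & (cells[:, 1] == j)
        atlas[(i, j)] = activations[mask].mean(axis=0)             # one averaged direction per cell
    return atlas  # feed each entry to a feature-inversion renderer
```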

Practical Differentially Private Top-k Selection with Pay-what-you-get Composition

Title Practical Differentially Private Top-k Selection with Pay-what-you-get Composition
Authors David Durfee, Ryan M. Rogers
Abstract We study the problem of top-k selection over a large domain universe subject to user-level differential privacy. Typically, the exponential mechanism or report noisy max are the algorithms used to solve this problem. However, these algorithms require querying the database for the count of each domain element. We focus on the setting where the data domain is unknown, which is different from the setting of frequent itemsets, where an Apriori-type algorithm can help prune the space of domain elements to query. We design algorithms that ensure (approximate) differential privacy and only need access to the true top-k' elements from the data for any chosen k' ≥ k. This is a highly desirable feature for making differential privacy practical, since the algorithms require no knowledge of the domain. We consider both the setting where a user's data can modify an arbitrary number of counts by at most 1, i.e. unrestricted sensitivity, and the setting where a user's data can modify at most some small, fixed number of counts by at most 1, i.e. restricted sensitivity. Additionally, we provide a pay-what-you-get privacy composition bound for our algorithms. That is, our algorithms might return fewer than k elements when the top-k elements are queried, but the overall privacy budget only decreases by the size of the outcome set.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8612-practical-differentially-private-top-k-selection-with-pay-what-you-get-composition
PDF http://papers.nips.cc/paper/8612-practical-differentially-private-top-k-selection-with-pay-what-you-get-composition.pdf
PWC https://paperswithcode.com/paper/practical-differentially-private-top-k
Repo https://github.com/rrogers386/DPComposition
Framework none
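
A heavily simplified, hedged sketch in the spirit of the limited-domain idea: add Gumbel noise to the counts of only the top-k' elements plus a noisy threshold standing in for the unseen rest of the domain, and release at most k elements that beat the threshold (so fewer than k may come back). The noise scale and threshold handling here are illustrative, not the paper's calibrated mechanism.

```python
import numpy as np

def noisy_top_k(topk_counts, k, eps, threshold_count, rng=np.random.default_rng()):
    """topk_counts: dict {element: true count} restricted to the top-k' elements, k' >= k."""
    scale = 2.0 / eps                                   # illustrative sensitivity/epsilon scaling
    noisy = {e: c + rng.gumbel(0, scale) for e, c in topk_counts.items()}
    noisy_threshold = threshold_count + rng.gumbel(0, scale)
    ranked = sorted(noisy, key=noisy.get, reverse=True)[:k]
    return [e for e in ranked if noisy[e] > noisy_threshold]   # may be shorter than k

counts = {"a": 120, "b": 90, "c": 87, "d": 40, "e": 39}
print(noisy_top_k(counts, k=3, eps=1.0, threshold_count=38))
```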

Performance prediction of data streams on high-performance architecture

Title Performance prediction of data streams on high-performance architecture
Authors Bhaskar Gautam, Annappa Basava
Abstract Worldwide sensor streams are expanding continuously in volume and velocity, and to cope with this acceleration, large stream data processing systems are moving from homogeneous to rack-scale architectures, which raises serious concerns for workload optimization, scheduling, and resource management algorithms. Our proposed framework provides an architecture-independent performance prediction model to enable a resource-adaptive distributed stream data processing platform. It comprises seven pre-defined domains of dynamic data stream metrics, together with a self-driven model that fits these metrics using a ridge-regularized regression algorithm. Another significant contribution is a fully automated performance prediction model, inherited from state-of-the-art distributed data management systems and applied to distributed stream processing systems, which uses Gaussian process regression on metrics clustered with the help of a dimensionality reduction algorithm. We implemented the framework on Apache Heron and evaluated it with a proposed benchmark suite comprising five domain-specific topologies. To assess the proposed methodologies, we forcefully inject tuple skewness into the benchmarking topologies to establish ground truth for the predictions, and find that the accuracy of predicting data stream performance increases from 66.36% to 80.62%, while the error is reduced from 37.14% to 16.06%.
Tasks Dimensionality Reduction, Gaussian Processes
Published 2019-01-07
URL https://doi.org/10.1186/s13673-018-0163-4
PDF https://rdcu.be/bMVaG
PWC https://paperswithcode.com/paper/performance-prediction-of-data-streams-on
Repo https://github.com/bhaskar24/StreamBenchmark
Framework none
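
A hedged sketch of the two modelling ingredients named above: a ridge-regularised regression over raw stream metrics, and a Gaussian process regressor over dimensionality-reduced metrics (scikit-learn stand-ins; the metric set, shapes and target below are placeholders, not the paper's features).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
metrics = rng.normal(size=(200, 30))             # 200 topology runs x 30 stream metrics (placeholder)
throughput = metrics[:, :3].sum(axis=1) + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0).fit(metrics, throughput)           # self-driven ridge model
reduced = PCA(n_components=5).fit_transform(metrics)        # dimensionality-reduced metrics
gp = GaussianProcessRegressor().fit(reduced, throughput)    # GP on the reduced metrics
print(ridge.score(metrics, throughput), gp.predict(reduced[:1]))
```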

Optimal Decision Tree with Noisy Outcomes

Title Optimal Decision Tree with Noisy Outcomes
Authors Su Jia, Viswanath Nagarajan, Fatemeh Navidi, R Ravi
Abstract A fundamental task in active learning involves performing a sequence of tests to identify an unknown hypothesis that is drawn from a known distribution. This problem, known as optimal decision tree induction, has been widely studied for decades and the asymptotically best-possible approximation algorithm has been devised for it. We study a generalization where certain test outcomes are noisy, even in the more general case when the noise is persistent, i.e., repeating the test on the scenario gives the same noisy output, disallowing simple repetition as a way to gain confidence. We design new approximation algorithms for both the non-adaptive setting, where the test sequence must be fixed a-priori, and the adaptive setting where the test sequence depends on the outcomes of prior tests. Previous work in the area assumed at most a constant number of noisy outcomes per test and per scenario and provided approximation ratios that were problem dependent (such as the minimum probability of a hypothesis). Our new approximation algorithms provide guarantees that are nearly best-possible and work for the general case of a large number of noisy outcomes per test or per hypothesis where the performance degrades smoothly with this number. Our results adapt and generalize methods used for submodular ranking and stochastic set cover. We evaluate the performance of our algorithms on two natural applications with noise: toxic chemical identification and active learning of linear classifiers. Despite our logarithmic theoretical approximation guarantees, our methods give solutions with cost very close to the information theoretic minimum, demonstrating the effectiveness of our methods.
Tasks Active Learning
Published 2019-12-01
URL http://papers.nips.cc/paper/8592-optimal-decision-tree-with-noisy-outcomes
PDF http://papers.nips.cc/paper/8592-optimal-decision-tree-with-noisy-outcomes.pdf
PWC https://paperswithcode.com/paper/optimal-decision-tree-with-noisy-outcomes
Repo https://github.com/sjia1/ODT-with-noisy-outcomes
Framework none
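
For intuition, a hedged sketch of greedy adaptive test selection for hypothesis identification: at each step, pick the test that minimises the expected mass of hypotheses still consistent after observing its outcome. This is the noiseless baseline view; the paper's algorithms use more refined submodular-ranking scores and explicitly handle persistent noisy outcomes.

```python
# outcomes[h][t] is the outcome hypothesis h would produce on test t; prior[h] its probability.
def choose_test(hypotheses, tests, outcomes, prior):
    def expected_remaining(t):
        by_outcome = {}
        for h in hypotheses:
            by_outcome[outcomes[h][t]] = by_outcome.get(outcomes[h][t], 0.0) + prior[h]
        total = sum(by_outcome.values())
        return sum(p * (p / total) for p in by_outcome.values())   # E[mass left after observing t]
    return min(tests, key=expected_remaining)

def identify(true_h, hypotheses, tests, outcomes, prior):
    tests, hypotheses = list(tests), list(hypotheses)
    while len(hypotheses) > 1 and tests:
        t = choose_test(hypotheses, tests, outcomes, prior)
        result = outcomes[true_h][t]                               # noiseless oracle in this sketch
        hypotheses = [h for h in hypotheses if outcomes[h][t] == result]
        tests.remove(t)
    return hypotheses
```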

Pyramid U-Network for Skeleton Extraction From Shape Points

Title Pyramid U-Network for Skeleton Extraction From Shape Points
Authors Rowel Atienza
Abstract The knowledge about the skeleton of a given geometric shape has many practical applications such as shape animation, shape comparison, shape recognition, and estimating structural strength. Skeleton extraction becomes a more challenging problem when the topology is represented in the point cloud domain. In this paper, we present the network architecture, PSPU-SkelNet, for TeamPH, which ranked 3rd in the Point SkelNetOn 2019 challenge. PSPU-SkelNet is a pyramid of three U-Nets that predicts the skeleton from a given shape point cloud. PSPU-SkelNet achieves a Chamfer Distance (CD) of 2.9105 on the final test dataset. The code of PSPU-SkelNet is available at https://github.com/roatienza/skelnet.
Tasks
Published 2019-06-17
URL http://openaccess.thecvf.com/content_CVPRW_2019/papers/SkelNetOn/Atienza_Pyramid_U-Network_for_Skeleton_Extraction_From_Shape_Points_CVPRW_2019_paper.pdf
PDF http://openaccess.thecvf.com/content_CVPRW_2019/papers/SkelNetOn/Atienza_Pyramid_U-Network_for_Skeleton_Extraction_From_Shape_Points_CVPRW_2019_paper.pdf
PWC https://paperswithcode.com/paper/pyramid-u-network-for-skeleton-extraction
Repo https://github.com/roatienza/skelnet
Framework none
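
The challenge metric quoted above is the symmetric Chamfer Distance between the predicted and ground-truth skeleton point sets; a straightforward NumPy sketch (not the challenge's official scorer):

```python
import numpy as np

def chamfer_distance(pred, gt):
    """pred: (n, 2) predicted skeleton points; gt: (m, 2) ground-truth skeleton points."""
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)   # (n, m) pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

pred = np.random.rand(128, 2)
gt = np.random.rand(100, 2)
print(chamfer_distance(pred, gt))
```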

Sphere Generative Adversarial Network Based on Geometric Moment Matching

Title Sphere Generative Adversarial Network Based on Geometric Moment Matching
Authors Sung Woo Park, Junseok Kwon
Abstract We propose sphere generative adversarial network (GAN), a novel integral probability metric (IPM)-based GAN. Sphere GAN uses the hypersphere to bound IPMs in the objective function. Thus, it can be trained stably. On the hypersphere, sphere GAN exploits the information of higher-order statistics of data using geometric moment matching, thereby providing more accurate results. In the paper, we mathematically prove the good properties of sphere GAN. In experiments, sphere GAN quantitatively and qualitatively surpasses recent state-of-the-art GANs for unsupervised image generation problems with the CIFAR-10, STL-10, and LSUN bedroom datasets. Source code is available at https://github.com/pswkiki/SphereGAN.
Tasks Image Generation
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Park_Sphere_Generative_Adversarial_Network_Based_on_Geometric_Moment_Matching_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Park_Sphere_Generative_Adversarial_Network_Based_on_Geometric_Moment_Matching_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/sphere-generative-adversarial-network-based
Repo https://github.com/taki0112/SphereGAN-Tensorflow
Framework tf
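
A hedged sketch of the hypersphere ingredients: an inverse stereographic projection of discriminator features onto the unit hypersphere, and the moments of the geodesic distance to the north pole used for moment matching. The number of moments and the sign/weighting conventions here are illustrative, not the paper's exact objective.

```python
import torch

def to_hypersphere(v):
    """Inverse stereographic projection: R^d features -> points on the unit sphere S^d."""
    sq = (v ** 2).sum(dim=-1, keepdim=True)
    return torch.cat([2 * v, sq - 1], dim=-1) / (sq + 1)

def moment_score(features, max_moment=3):
    """Sum of the first max_moment moments of the geodesic distance to the north pole."""
    u = to_hypersphere(features)
    dist = torch.acos(u[..., -1].clamp(-1 + 1e-6, 1 - 1e-6))   # geodesic distance to N = (0, ..., 0, 1)
    return sum((dist ** r).mean() for r in range(1, max_moment + 1))

# discriminator objective (sketch): maximize moment_score(real_feats) - moment_score(fake_feats)
```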

Fixed That for You: Generating Contrastive Claims with Semantic Edits

Title Fixed That for You: Generating Contrastive Claims with Semantic Edits
Authors Christopher Hidey, Kathy McKeown
Abstract Understanding contrastive opinions is a key component of argument generation. Central to an argument is the claim, a statement that is in dispute. Generating a counter-argument then requires generating a response in contrast to the main claim of the original argument. To generate contrastive claims, we create a corpus of Reddit comment pairs self-labeled by posters using the acronym FTFY (fixed that for you). We then train neural models on these pairs to edit the original claim and produce a new claim with a different view. We demonstrate significant improvement over a sequence-to-sequence baseline in BLEU score and a human evaluation for fluency, coherence, and contrast.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1174/
PDF https://www.aclweb.org/anthology/N19-1174
PWC https://paperswithcode.com/paper/fixed-that-for-you-generating-contrastive
Repo https://github.com/chridey/fixedthat
Framework pytorch
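
For illustration, a hedged sketch of extracting the contrastive edit from a parent claim and its "FTFY" reply (the kind of pair the corpus is built from) using a simple word-level diff; the paper's corpus construction and filtering are more involved.

```python
import difflib

parent = "the best editor is emacs".split()
ftfy_reply = "the best editor is vim".split()

sm = difflib.SequenceMatcher(None, parent, ftfy_reply)
edits = [(op, parent[i1:i2], ftfy_reply[j1:j2])
         for op, i1, i2, j1, j2 in sm.get_opcodes() if op != "equal"]
print(edits)   # [('replace', ['emacs'], ['vim'])]
```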

Sparse and noisy LiDAR completion with RGB guidance and uncertainty

Title Sparse and noisy LiDAR completion with RGB guidance and uncertainty
Authors Wouter Van Gansbeke, Davy Neven, Bert De Brabandere, Luc Van Gool
Abstract This work proposes a new method to accurately complete sparse LiDAR maps guided by RGB images. For autonomous vehicles and robotics, the use of LiDAR is indispensable in order to achieve precise depth predictions. A multitude of applications depend on the awareness of their surroundings, and use depth cues to reason and react accordingly. On the one hand, monocular depth prediction methods fail to generate absolute and precise depth maps. On the other hand, stereoscopic approaches are still significantly outperformed by LiDAR-based approaches. The goal of the depth completion task is to generate dense depth predictions from sparse and irregular point clouds which are mapped to a 2D plane. We propose a new framework which extracts both global and local information in order to produce proper depth maps. We argue that simple depth completion does not require a deep network. However, we additionally propose a fusion method with RGB guidance from a monocular camera in order to leverage object information and to correct mistakes in the sparse input. This improves the accuracy significantly. Moreover, confidence masks are exploited in order to take into account the uncertainty in the depth predictions from each modality. This fusion method outperforms the state-of-the-art and ranks first on the KITTI depth completion benchmark.
Tasks Autonomous Vehicles, Depth Completion, Depth Estimation
Published 2019-02-14
URL https://arxiv.org/abs/1902.05356
PDF https://arxiv.org/pdf/1902.05356.pdf
PWC https://paperswithcode.com/paper/sparse-and-noisy-lidar-completion-with-rgb-1
Repo https://github.com/wvangansbeke/Sparse-Depth-Completion
Framework pytorch
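
A hedged sketch of the confidence-weighted fusion described above: a per-pixel softmax over the global and local confidence maps, then a weighted sum of the two depth predictions. The tensor shapes and the softmax formulation are assumptions; see the linked repo for the authors' full network.

```python
import torch

def fuse_depth(global_depth, local_depth, global_conf, local_conf):
    """All tensors are (B, 1, H, W); the confidences are unnormalized logits."""
    weights = torch.softmax(torch.cat([global_conf, local_conf], dim=1), dim=1)   # per-pixel weighting
    depth = torch.cat([global_depth, local_depth], dim=1)
    return (weights * depth).sum(dim=1, keepdim=True)

b, h, w = 2, 64, 64
fused = fuse_depth(torch.rand(b, 1, h, w) * 80, torch.rand(b, 1, h, w) * 80,
                   torch.randn(b, 1, h, w), torch.randn(b, 1, h, w))
```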