January 27, 2020


Paper Group ANR 1084


Concatenated Feature Pyramid Network for Instance Segmentation


Title	Concatenated Feature Pyramid Network for Instance Segmentation
Authors	Yongqing Sun, Pranav Shenoy K P, Jun Shimamura, Atsushi Sagata
Abstract	Low-level features like edges and textures play an important role in accurately localizing instances in neural networks. In this paper, we propose an architecture which improves the feature pyramid networks commonly used in instance segmentation networks by incorporating low-level features in all layers of the pyramid in an optimal and efficient way. Specifically, we introduce a new layer which learns new correlations from feature maps of multiple feature pyramid levels holistically and enhances the semantic information of the feature pyramid to improve accuracy. Our architecture is simple to implement in instance segmentation or object detection frameworks to boost accuracy. Using this method in Mask RCNN, our model achieves consistent improvement in precision on COCO Dataset with the computational overhead compared to the original feature pyramid network.
Tasks	Instance Segmentation, Object Detection, Semantic Segmentation
Published	2019-03-16
URL	http://arxiv.org/abs/1904.00768v1
PDF	http://arxiv.org/pdf/1904.00768v1.pdf
PWC	https://paperswithcode.com/paper/concatenated-feature-pyramid-network-for
Repo
Framework

Fine-Grained Visual Recognition with Batch Confusion Norm


Title	Fine-Grained Visual Recognition with Batch Confusion Norm
Authors	Yen-Chi Hsu, Cheng-Yao Hong, Ding-Jie Chen, Ming-Sui Lee, Davi Geiger, Tyng-Luh Liu
Abstract	We introduce a regularization concept based on the proposed Batch Confusion Norm (BCN) to address Fine-Grained Visual Classification (FGVC). The FGVC problem is notably characterized by two intriguing properties, significant inter-class similarity and intra-class variations, which make learning an effective FGVC classifier a challenging task. Inspired by the use of pairwise confusion energy as a regularization mechanism, we develop the BCN technique to improve FGVC learning by imposing class prediction confusion on each training batch, and consequently alleviate the possible overfitting due to exploring image features of fine details. In addition, our method is implemented with an attention gated CNN model, boosted by the incorporation of Atrous Spatial Pyramid Pooling (ASPP) to extract discriminative features and proper attentions. To demonstrate the usefulness of our method, we report state-of-the-art results on several benchmark FGVC datasets, along with comprehensive ablation comparisons.
Tasks	Fine-Grained Image Classification, Fine-Grained Visual Recognition
Published	2019-10-28
URL	https://arxiv.org/abs/1910.12423v1
PDF	https://arxiv.org/pdf/1910.12423v1.pdf
PWC	https://paperswithcode.com/paper/fine-grained-visual-recognition-with-batch
Repo
Framework

Point-less: More Abstractive Summarization with Pointer-Generator Networks


Title	Point-less: More Abstractive Summarization with Pointer-Generator Networks
Authors	Freek Boutkan, Jorn Ranzijn, David Rau, Eelco van der Wel
Abstract	The Pointer-Generator architecture has been shown to be a big improvement for abstractive summarization seq2seq models. However, the summaries produced by this model are largely extractive, as over 30% of the generated sentences are copied from the source text. This work proposes a multihead attention mechanism, pointer dropout, and two new loss functions to promote more abstractive summaries while maintaining similar ROUGE scores. Neither the multihead attention nor the dropout improves N-gram novelty; however, the dropout acts as a regularizer which improves the ROUGE score. The new loss function achieves significantly higher novel N-grams and sentences, at the cost of a slightly lower ROUGE score.
Tasks	Abstractive Text Summarization
Published	2019-04-18
URL	http://arxiv.org/abs/1905.01975v1
PDF	http://arxiv.org/pdf/1905.01975v1.pdf
PWC	https://paperswithcode.com/paper/190501975
Repo
Framework
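
The abstract above measures abstractiveness by the share of novel N-grams in a summary. The following is a minimal sketch of one way to compute such a rate, not the authors' evaluation code. Whitespace tokenization, lowercasing, counting unique N-grams, and the names `ngrams` and `novel_ngram_rate` are all assumptions made for illustration.

```python
def ngrams(tokens, n):
    """Return the set of n-grams (as tuples) found in a list of tokens."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def novel_ngram_rate(source, summary, n):
    """Fraction of unique summary n-grams that never occur in the source.

    Assumption: simple lowercase whitespace tokenization; a real evaluation
    would likely use a proper tokenizer.
    """
    source_ngrams = ngrams(source.lower().split(), n)
    summary_ngrams = ngrams(summary.lower().split(), n)
    if not summary_ngrams:
        return 0.0  # summary shorter than n tokens: nothing to count
    return len(summary_ngrams - source_ngrams) / len(summary_ngrams)


if __name__ == "__main__":
    src = "the cat sat on the mat"
    summ = "the cat lay on the mat"
    for n in (1, 2):
        print(n, novel_ngram_rate(src, summ, n))
```

A rate of 0 means every summary N-gram was copied from the source (fully extractive at that N), while higher values indicate more abstractive output.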

Tightly Coupled 3D Lidar Inertial Odometry and Mapping


Title	Tightly Coupled 3D Lidar Inertial Odometry and Mapping
Authors	Haoyang Ye, Yuying Chen, Ming Liu
Abstract	Ego-motion estimation is a fundamental requirement for most mobile robotic applications. By sensor fusion, we can compensate for the deficiencies of stand-alone sensors and provide more reliable estimations. We introduce a tightly coupled lidar-IMU fusion method in this paper. By jointly minimizing the cost derived from lidar and IMU measurements, the lidar-IMU odometry (LIO) can perform well with acceptable drift after long-term experiments, even in challenging cases where the lidar measurements can be degraded. Besides, to obtain more reliable estimations of the lidar poses, a rotation-constrained refinement algorithm (LIO-mapping) is proposed to further align the lidar poses with the global map. The experimental results demonstrate that the proposed method can estimate the poses of the sensor pair at the IMU update rate with high precision, even under fast motion conditions or with insufficient features.
Tasks	Motion Estimation, Sensor Fusion
Published	2019-04-15
URL	http://arxiv.org/abs/1904.06993v1
PDF	http://arxiv.org/pdf/1904.06993v1.pdf
PWC	https://paperswithcode.com/paper/tightly-coupled-3d-lidar-inertial-odometry
Repo
Framework

Coarse2Fine: A Two-stage Training Method for Fine-grained Visual Classification


Title	Coarse2Fine: A Two-stage Training Method for Fine-grained Visual Classification
Authors	Amir Erfan Eshratifar, David Eigen, Michael Gormish, Massoud Pedram
Abstract	Small inter-class and large intra-class variations are the main challenges in fine-grained visual classification. Objects from different classes share visually similar structures, and objects in the same class can have different poses and viewpoints. Therefore, the proper extraction of discriminative local features (e.g. a bird’s beak or a car’s headlight) is crucial. Most of the recent successes on this problem are based upon attention models, which can localize and attend to the local discriminative object parts. In this work, we propose a training method for visual attention networks, Coarse2Fine, which creates a differentiable path from the input space to the attended feature maps. Coarse2Fine learns an inverse mapping function from the attended feature maps to the informative regions in the raw image, which will guide the attention maps to better attend to the fine-grained features. We show that Coarse2Fine and orthogonal initialization of the attention weights can surpass the state-of-the-art accuracies on common fine-grained classification tasks.
Tasks	Fine-Grained Image Classification
Published	2019-09-06
URL	https://arxiv.org/abs/1909.02680v1
PDF	https://arxiv.org/pdf/1909.02680v1.pdf
PWC	https://paperswithcode.com/paper/coarse2fine-a-two-stage-training-method-for
Repo
Framework

Few-shot Learning for Domain-specific Fine-grained Image Classification


Title	Few-shot Learning for Domain-specific Fine-grained Image Classification
Authors	Xin Sun, Hongwei Xv, Junyu Dong, Qiong Li, Changrui Chen
Abstract	Learning to recognize novel visual categories from a few examples is a challenging task for machines in real-world applications. In contrast, humans have the ability to discriminate even similar objects with little supervision. This paper attempts to address the few-shot fine-grained recognition problem. We propose a feature fusion model to explore the largest discriminative features by focusing on key regions. The model utilizes focus-area location to discover the perceptually similar regions among objects. High-order integration is employed to capture the interaction information among intra-parts. We also design a Center Neighbor Loss to form robust embedding space distribution for generating discriminative features. Furthermore, we build a typical fine-grained and few-shot learning dataset miniPPlankton from the real-world application in the area of marine ecological environment. Extensive experiments are carried out to validate the performance of our model. First, the model is evaluated with two challenging experiments based on the miniDogsNet and Caltech-UCSD public datasets. The results demonstrate that our model achieves competitive performance compared with state-of-the-art models. Then, we implement our model for the real-world phytoplankton recognition task. The experimental results show the superiority of the proposed model compared with others on the miniPPlankton dataset.
Tasks	Few-Shot Learning, Fine-Grained Image Classification, Image Classification
Published	2019-07-23
URL	https://arxiv.org/abs/1907.09647v2
PDF	https://arxiv.org/pdf/1907.09647v2.pdf
PWC	https://paperswithcode.com/paper/few-shot-learning-for-domain-specfic-fine
Repo
Framework

Remote measurement of sea ice dynamics with regularized optimal transport


Title	Remote measurement of sea ice dynamics with regularized optimal transport
Authors	M. D. Parno, B. A. West, A. J. Song, T. S. Hodgdon, D. T. O’Connor
Abstract	As Arctic conditions rapidly change, human activity in the Arctic will continue to increase and so will the need for high-resolution observations of sea ice. While satellite imagery can provide high spatial resolution, it is temporally sparse and significant ice deformation can occur between observations. This makes it difficult to apply feature tracking or image correlation techniques that require persistent features to exist between images. With this in mind, we propose a technique based on optimal transport, which is commonly used to measure differences between probability distributions. When little ice enters or leaves the image scene, we show that regularized optimal transport can be used to quantitatively estimate ice deformation. We discuss the motivation for our approach and describe efficient computational implementations. Results are provided on a combination of synthetic and MODIS imagery to demonstrate the ability of our approach to estimate dynamics properties at the original image resolution.
Tasks
Published	2019-05-02
URL	https://arxiv.org/abs/1905.00989v1
PDF	https://arxiv.org/pdf/1905.00989v1.pdf
PWC	https://paperswithcode.com/paper/remote-measurement-of-sea-ice-dynamics-with
Repo
Framework
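
The abstract above relies on regularized optimal transport between two mass distributions. As an illustration only, the sketch below uses entropic regularization solved with Sinkhorn iterations, which is one common form of regularized optimal transport; whether the paper uses exactly this regularizer is an assumption here. The function name `sinkhorn`, the parameter `eps`, and the toy 1-D "images" are all hypothetical.

```python
import numpy as np


def sinkhorn(a, b, cost, eps=0.5, n_iter=200):
    """Entropy-regularized optimal transport plan between histograms a and b.

    a, b  : nonnegative 1-D arrays that each sum to 1 (e.g. normalized pixel mass)
    cost  : matrix of transport costs, cost[i, j] from bin i of a to bin j of b
    eps   : regularization strength (assumed value, tune for real data)
    """
    K = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)            # rescale to match column marginals
        u = a / (K @ v)              # rescale to match row marginals
    return u[:, None] * K * v[None, :]


if __name__ == "__main__":
    # Toy 1-D example: mass shifts from the left of the scene to the right.
    x = np.linspace(0.0, 1.0, 5)
    cost = (x[:, None] - x[None, :]) ** 2
    a = np.array([0.40, 0.30, 0.15, 0.10, 0.05])
    b = a[::-1].copy()
    plan = sinkhorn(a, b, cost)
    # Mass-weighted mean displacement of each source bin, a crude motion estimate.
    displacement = (plan * (x[None, :] - x[:, None])).sum(axis=1) / a
    print(displacement)
```

The transport plan assigns how much mass moves from each source location to each target location, so a per-location displacement can be read off without tracking persistent features. This only makes sense when total mass is roughly conserved, matching the abstract's condition that little ice enters or leaves the scene.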

Prestopping: How Does Early Stopping Help Generalization against Label Noise?


Title	Prestopping: How Does Early Stopping Help Generalization against Label Noise?
Authors	Hwanjun Song, Minseok Kim, Dongmin Park, Jae-Gil Lee
Abstract	Noisy labels are very common in real-world training data, and they lead to poor generalization on test data because of overfitting to the noisy labels. In this paper, we claim that such overfitting can be avoided by “early stopping” training a deep neural network before the noisy labels are severely memorized. Then, we resume training the early stopped network using a “maximal safe set,” which maintains a collection of almost certainly true-labeled samples at each epoch since the early stop point. Putting them all together, our novel two-phase training method, called Prestopping, realizes noise-free training under any type of label noise for practical use. Extensive experiments using four image benchmark data sets verify that our method significantly outperforms four state-of-the-art methods in test error by 0.4-8.2 percentage points in the presence of real-world noise.
Tasks
Published	2019-11-19
URL	https://arxiv.org/abs/1911.08059v1
PDF	https://arxiv.org/pdf/1911.08059v1.pdf
PWC	https://paperswithcode.com/paper/prestopping-how-does-early-stopping-help-1
Repo
Framework

Lumen boundary detection using neutrosophic c-means in IVOCT images


Title	Lumen boundary detection using neutrosophic c-means in IVOCT images
Authors	Mohammad Habibi, Ahmad Ayatollahi, Niyoosha Dallalazar, Ali Kermani
Abstract	In this paper, a novel method for lumen boundary identification is proposed using neutrosophic c-means. This method clusters the pixels of an intravascular optical coherence tomography image into several clusters using indeterminacy and neutrosophic theory, with the aim of detecting the boundaries. Intravascular optical coherence tomography images are cross-sectional, high-resolution images taken from the coronary arterial wall. Coronary artery disease causes many deaths each year. The first step in diagnosing this kind of disease is to detect the lumen boundary. Employing this approach, we obtained 0.972, 0.019, 0.076 mm², 0.32 mm, and 0.985 as mean values for the Jaccard measure (JACC), the percentage of area difference (PAD), average distance (AD), Hausdorff distance (HD), and Dice index (DI), respectively. Based on our results, this method achieves high accuracy.
Tasks	Boundary Detection
Published	2019-02-09
URL	http://arxiv.org/abs/1902.03489v2
PDF	http://arxiv.org/pdf/1902.03489v2.pdf
PWC	https://paperswithcode.com/paper/lumen-boundary-detection-using-neutrosophic-c
Repo
Framework
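
The abstract above reports the Jaccard measure and the Dice index for the detected lumen region. The sketch below shows the standard definitions of these two overlap scores on sets of pixel coordinates. It is not the authors' evaluation code, and the function names and the toy masks are made up for illustration.

```python
def jaccard(pred, truth):
    """Jaccard measure |A ∩ B| / |A ∪ B| for two collections of pixel coordinates."""
    pred, truth = set(pred), set(truth)
    union = pred | truth
    # Convention (an assumption): two empty regions count as a perfect match.
    return len(pred & truth) / len(union) if union else 1.0


def dice(pred, truth):
    """Dice index 2|A ∩ B| / (|A| + |B|) for two collections of pixel coordinates."""
    pred, truth = set(pred), set(truth)
    total = len(pred) + len(truth)
    return 2 * len(pred & truth) / total if total else 1.0


if __name__ == "__main__":
    predicted_lumen = {(0, 0), (0, 1), (1, 0)}
    reference_lumen = {(0, 1), (1, 0), (1, 1)}
    print(jaccard(predicted_lumen, reference_lumen))
    print(dice(predicted_lumen, reference_lumen))
```

For a single pair of regions the two scores are tied together by D = 2J / (1 + J), so Dice is always at least as large as Jaccard. Mean values over many images, as reported in the abstract, need not satisfy this identity exactly.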

A Compendium on Network and Host based Intrusion Detection Systems


Title	A Compendium on Network and Host based Intrusion Detection Systems
Authors	Rahul-Vigneswaran K, Prabaharan Poornachandran, Soman KP
Abstract	The techniques of deep learning have become the state-of-the-art methodology for executing complicated tasks in various domains, including computer vision, natural language processing, and several other areas. Due to its rapid development and promising benchmarks in those fields, researchers started experimenting with this technique in intrusion detection related tasks. Deep learning is a subset and a natural extension of classical machine learning and an evolved model of neural networks. This paper contemplates and discusses the methodologies related to leading-edge deep learning and neural network models applied to the area of Intrusion Detection Systems.
Tasks	Intrusion Detection
Published	2019-04-06
URL	http://arxiv.org/abs/1904.03491v1
PDF	http://arxiv.org/pdf/1904.03491v1.pdf
PWC	https://paperswithcode.com/paper/a-compendium-on-network-and-host-based
Repo
Framework

An Editorial Network for Enhanced Document Summarization


Title	An Editorial Network for Enhanced Document Summarization
Authors	Edward Moroshko, Guy Feigenblat, Haggai Roitman, David Konopnicki
Abstract	We suggest a new idea of Editorial Network - a mixed extractive-abstractive summarization approach, which is applied as a post-processing step over a given sequence of extracted sentences. Our network tries to imitate the decision process of a human editor during summarization. Within such a process, each extracted sentence may be either kept untouched, rephrased or completely rejected. We further suggest an effective way for training the “editor” based on a novel soft-labeling approach. Using the CNN/DailyMail dataset we demonstrate the effectiveness of our approach compared to state-of-the-art extractive-only or abstractive-only baseline methods.
Tasks	Abstractive Text Summarization, Document Summarization
Published	2019-02-27
URL	http://arxiv.org/abs/1902.10360v1
PDF	http://arxiv.org/pdf/1902.10360v1.pdf
PWC	https://paperswithcode.com/paper/an-editorial-network-for-enhanced-document
Repo
Framework

The continuous Bernoulli: fixing a pervasive error in variational autoencoders


Title	The continuous Bernoulli: fixing a pervasive error in variational autoencoders
Authors	Gabriel Loaiza-Ganem, John P. Cunningham
Abstract	Variational autoencoders (VAE) have quickly become a central tool in machine learning, applicable to a broad range of data types and latent variable models. By far the most common first step, taken by seminal papers and by core software libraries alike, is to model MNIST data using a deep network parameterizing a Bernoulli likelihood. This practice contains what appears to be and what is often set aside as a minor inconvenience: the pixel data is [0,1] valued, not {0,1} as supported by the Bernoulli likelihood. Here we show that, far from being a triviality or nuisance that is convenient to ignore, this error has profound importance to VAE, both qualitative and quantitative. We introduce and fully characterize a new [0,1]-supported, single parameter distribution: the continuous Bernoulli, which patches this pervasive bug in VAE. This distribution is not nitpicking; it produces meaningful performance improvements across a range of metrics and datasets, including sharper image samples, and suggests a broader class of performant VAE.
Tasks	Latent Variable Models
Published	2019-07-16
URL	https://arxiv.org/abs/1907.06845v5
PDF	https://arxiv.org/pdf/1907.06845v5.pdf
PWC	https://paperswithcode.com/paper/the-continuous-bernoulli-fixing-a-pervasive
Repo
Framework
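
The abstract above introduces the continuous Bernoulli, a single-parameter distribution on [0,1]. Its density is p(x | λ) = C(λ) λ^x (1 − λ)^(1 − x), where the normalizing constant is C(λ) = 2 artanh(1 − 2λ) / (1 − 2λ) for λ ≠ 1/2 and C(1/2) = 2. The sketch below evaluates this density and checks numerically that it integrates to one. The function names, the tolerance around λ = 1/2, and the midpoint-rule integrator are illustrative choices, not taken from the paper's code.

```python
import math


def cb_log_norm(lam):
    """Log of the continuous Bernoulli normalizing constant C(lam), 0 < lam < 1."""
    if abs(lam - 0.5) < 1e-6:
        return math.log(2.0)  # limit of C(lam) as lam -> 1/2
    z = 1.0 - 2.0 * lam
    return math.log(2.0 * math.atanh(z) / z)


def cb_pdf(x, lam):
    """Continuous Bernoulli density at x in [0, 1]."""
    log_p = cb_log_norm(lam) + x * math.log(lam) + (1.0 - x) * math.log(1.0 - lam)
    return math.exp(log_p)


def integrate_unit_interval(f, n=20000):
    """Midpoint-rule integral of f over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    for lam in (0.2, 0.5, 0.9):
        total = integrate_unit_interval(lambda x: cb_pdf(x, lam))
        print(lam, total)
```

Dropping the `cb_log_norm` term recovers the usual Bernoulli cross-entropy expression evaluated on [0,1]-valued pixels, which is the unnormalized likelihood the abstract identifies as the error.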

Variational Autoencoders and Nonlinear ICA: A Unifying Framework


Title	Variational Autoencoders and Nonlinear ICA: A Unifying Framework
Authors	Ilyes Khemakhem, Diederik P. Kingma, Ricardo Pio Monti, Aapo Hyvärinen
Abstract	The framework of variational autoencoders allows us to efficiently learn deep latent-variable models, such that the model’s marginal distribution over observed variables fits the data. Often, we’re interested in going a step further, and want to approximate the true joint distribution over observed and latent variables, including the true prior and posterior distributions over latent variables. This is known to be generally impossible due to unidentifiability of the model. We address this issue by showing that for a broad family of deep latent-variable models, identification of the true joint distribution over observed and latent variables is actually possible up to very simple transformations, thus achieving a principled and powerful form of disentanglement. Our result requires a factorized prior distribution over the latent variables that is conditioned on an additionally observed variable, such as a class label or almost any other observation. We build on recent developments in nonlinear ICA, which we extend to the case with noisy, undercomplete or discrete observations, integrated in a maximum likelihood framework. The result also trivially contains identifiable flow-based generative models as a special case.
Tasks	Latent Variable Models
Published	2019-07-10
URL	https://arxiv.org/abs/1907.04809v3
PDF	https://arxiv.org/pdf/1907.04809v3.pdf
PWC	https://paperswithcode.com/paper/variational-autoencoders-and-nonlinear-ica-a
Repo
Framework

Seeing Through Fog Without Seeing Fog: Deep Multimodal Sensor Fusion in Unseen Adverse Weather


Title	Seeing Through Fog Without Seeing Fog: Deep Multimodal Sensor Fusion in Unseen Adverse Weather
Authors	Mario Bijelic, Tobias Gruber, Fahim Mannan, Florian Kraus, Werner Ritter, Klaus Dietmayer, Felix Heide
Abstract	The fusion of multimodal sensor streams, such as camera, lidar, and radar measurements, plays a critical role in object detection for autonomous vehicles, which base their decision making on these inputs. While existing methods exploit redundant information under good conditions, they fail to do this in adverse weather where the sensory streams can be asymmetrically distorted. These rare “edge-case” scenarios are not represented in available datasets, and existing fusion architectures are not designed to handle them. To address this data challenge we present a novel multimodal dataset acquired by over 10,000 km of driving in northern Europe. Although this dataset is the first large multimodal dataset in adverse weather, with 100k labels for lidar, camera, radar and gated NIR sensors, it does not facilitate training as extreme weather is rare. To this end, we present a deep fusion network for robust fusion without a large corpus of labeled training data covering all asymmetric distortions. Departing from proposal-level fusion, we propose a single-shot model that adaptively fuses features, driven by measurement entropy. We validate the proposed method, trained on clean data, on our extensive validation dataset. The dataset and all models will be published.
Tasks	Autonomous Vehicles, Decision Making, Object Detection, Sensor Fusion
Published	2019-02-24
URL	https://arxiv.org/abs/1902.08913v2
PDF	https://arxiv.org/pdf/1902.08913v2.pdf
PWC	https://paperswithcode.com/paper/seeing-through-fog-without-seeing-fog-deep
Repo
Framework

Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond


Title	Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond
Authors	Xuechen Li, Denny Wu, Lester Mackey, Murat A. Erdogdu
Abstract	Sampling with Markov chain Monte Carlo methods often amounts to discretizing some continuous-time dynamics with numerical integration. In this paper, we establish the convergence rate of sampling algorithms obtained by discretizing smooth Itô diffusions exhibiting fast Wasserstein-$2$ contraction, based on local deviation properties of the integration scheme. In particular, we study a sampling algorithm constructed by discretizing the overdamped Langevin diffusion with the method of stochastic Runge-Kutta. For strongly convex potentials that are smooth up to a certain order, its iterates converge to the target distribution in $2$-Wasserstein distance in $\tilde{\mathcal{O}}(d\epsilon^{-2/3})$ iterations. This improves upon the best-known rate for strongly log-concave sampling based on the overdamped Langevin equation using only the gradient oracle without adjustment. In addition, we extend our analysis of stochastic Runge-Kutta methods to uniformly dissipative diffusions with possibly non-convex potentials and show they achieve better rates compared to the Euler-Maruyama scheme in terms of the dependence on tolerance $\epsilon$. Numerical studies show that these algorithms lead to better stability and lower asymptotic errors.
Tasks
Published	2019-06-19
URL	https://arxiv.org/abs/1906.07868v3
PDF	https://arxiv.org/pdf/1906.07868v3.pdf
PWC	https://paperswithcode.com/paper/stochastic-runge-kutta-accelerates-langevin
Repo
Framework
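
The abstract above compares stochastic Runge-Kutta against the Euler-Maruyama discretization of the overdamped Langevin diffusion. For context, the sketch below implements only the Euler-Maruyama baseline (often called the unadjusted Langevin algorithm), x_{k+1} = x_k − h ∇f(x_k) + sqrt(2h) ξ_k with ξ_k standard normal. It does not implement the paper's stochastic Runge-Kutta scheme. The function name `ula_sample`, the step size, and the standard Gaussian target are assumptions chosen for illustration.

```python
import numpy as np


def ula_sample(grad_potential, x0, step, n_steps, rng):
    """Euler-Maruyama discretization of dX = -grad f(X) dt + sqrt(2) dB.

    grad_potential : function returning the gradient of the potential f at x
    x0             : initial state (array); independent coordinates act as
                     independent chains when the gradient is elementwise
    """
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_potential(x) + np.sqrt(2.0 * step) * noise
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Target: standard Gaussian, potential f(x) = x^2 / 2, so grad f(x) = x.
    samples = ula_sample(lambda x: x, np.zeros(5000), step=0.1, n_steps=200, rng=rng)
    print(samples.mean(), samples.var())
```

For this Gaussian target the chain's stationary variance is 1 / (1 − h/2) rather than exactly 1, a simple instance of the discretization bias whose dependence on the tolerance the abstract says higher-order schemes improve.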