Paper Group ANR 1084
Concatenated Feature Pyramid Network for Instance Segmentation. Fine-Grained Visual Recognition with Batch Confusion Norm. Point-less: More Abstractive Summarization with Pointer-Generator Networks. Tightly Coupled 3D Lidar Inertial Odometry and Mapping. Coarse2Fine: A Two-stage Training Method for Fine-grained Visual Classification. Few-shot Learn …
Concatenated Feature Pyramid Network for Instance Segmentation
Title | Concatenated Feature Pyramid Network for Instance Segmentation |
Authors | Yongqing Sun, Pranav Shenoy K P, Jun Shimamura, Atsushi Sagata |
Abstract | Low-level features like edges and textures play an important role in accurately localizing instances in neural networks. In this paper, we propose an architecture which improves the feature pyramid networks commonly used in instance segmentation networks by incorporating low-level features in all layers of the pyramid in an optimal and efficient way. Specifically, we introduce a new layer which learns new correlations from feature maps of multiple feature pyramid levels holistically and enhances the semantic information of the feature pyramid to improve accuracy. Our architecture is simple to implement in instance segmentation or object detection frameworks to boost accuracy. Using this method in Mask R-CNN, our model achieves consistent improvement in precision on the COCO dataset with little additional computational overhead compared to the original feature pyramid network. |
Tasks | Instance Segmentation, Object Detection, Semantic Segmentation |
Published | 2019-03-16 |
URL | http://arxiv.org/abs/1904.00768v1 |
http://arxiv.org/pdf/1904.00768v1.pdf | |
PWC | https://paperswithcode.com/paper/concatenated-feature-pyramid-network-for |
Repo | |
Framework | |
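The concatenation idea above (resize multiple pyramid levels to a common resolution, stack them along channels, and mix with a learned 1x1 projection) can be sketched in NumPy. All names are hypothetical and the projection is random rather than learned; this is not the authors' architecture, only the shape of the operation:

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def concat_pyramid(levels, out_channels, rng):
    """Toy concatenation layer: bring every pyramid level to the finest
    resolution, stack along channels, and mix with a 1x1 projection
    (random here; learned in a real network)."""
    h = levels[0].shape[1]
    resized = [upsample_nearest(x, h // x.shape[1]) for x in levels]
    stacked = np.concatenate(resized, axis=0)          # (sum of C_i, H, W)
    proj = rng.standard_normal((out_channels, stacked.shape[0]))
    return np.einsum("oc,chw->ohw", proj, stacked)

rng = np.random.default_rng(0)
p2 = rng.standard_normal((8, 32, 32))   # finest pyramid level
p3 = rng.standard_normal((8, 16, 16))
p4 = rng.standard_normal((8, 8, 8))
out = concat_pyramid([p2, p3, p4], out_channels=8, rng=rng)
print(out.shape)  # (8, 32, 32)
```

The output has the same channel count and resolution as the finest level, so it can slot back into an FPN-style head unchanged.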
Fine-Grained Visual Recognition with Batch Confusion Norm
Title | Fine-Grained Visual Recognition with Batch Confusion Norm |
Authors | Yen-Chi Hsu, Cheng-Yao Hong, Ding-Jie Chen, Ming-Sui Lee, Davi Geiger, Tyng-Luh Liu |
Abstract | We introduce a regularization concept based on the proposed Batch Confusion Norm (BCN) to address Fine-Grained Visual Classification (FGVC). The FGVC problem is notably characterized by two intriguing properties, significant inter-class similarity and intra-class variations, which make learning an effective FGVC classifier a challenging task. Inspired by the use of pairwise confusion energy as a regularization mechanism, we develop the BCN technique to improve FGVC learning by imposing class prediction confusion on each training batch, consequently alleviating the overfitting that can result from exploiting fine-detail image features. In addition, our method is implemented with an attention-gated CNN model, boosted by the incorporation of Atrous Spatial Pyramid Pooling (ASPP) to extract discriminative features and proper attentions. To demonstrate the usefulness of our method, we report state-of-the-art results on several benchmark FGVC datasets, along with comprehensive ablation comparisons. |
Tasks | Fine-Grained Image Classification, Fine-Grained Visual Recognition |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12423v1 |
https://arxiv.org/pdf/1910.12423v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-visual-recognition-with-batch |
Repo | |
Framework | |
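The batch-level confusion idea can be sketched in the spirit of the pairwise confusion energy the abstract cites: penalize the distance between the predicted distributions of two batch halves, discouraging overconfident, sample-specific predictions. The paper's BCN is defined differently in detail; this is only a hedged sketch of the mechanism:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def batch_confusion_penalty(logits):
    """Pairwise-confusion-style regularizer computed over a batch: split the
    batch in two halves and penalize the squared distance between their
    predicted class distributions. Added to the classification loss, it
    pulls predictions toward each other (i.e. imposes confusion)."""
    p = softmax(logits)
    half = p.shape[0] // 2
    diff = p[:half] - p[half:2 * half]
    return float((diff ** 2).sum(axis=1).mean())

rng = np.random.default_rng(1)
logits = rng.standard_normal((8, 5))
print(batch_confusion_penalty(logits))  # > 0 when predictions differ
```

If every sample in the batch produced the same prediction, the penalty would be exactly zero.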
Point-less: More Abstractive Summarization with Pointer-Generator Networks
Title | Point-less: More Abstractive Summarization with Pointer-Generator Networks |
Authors | Freek Boutkan, Jorn Ranzijn, David Rau, Eelco van der Wel |
Abstract | The Pointer-Generator architecture has been shown to be a big improvement for abstractive summarization seq2seq models. However, the summaries produced by this model are largely extractive, as over 30% of the generated sentences are copied from the source text. This work proposes a multi-head attention mechanism, pointer dropout, and two new loss functions to promote more abstractive summaries while maintaining similar ROUGE scores. Neither the multi-head attention nor the dropout improves N-gram novelty; however, the dropout acts as a regularizer which improves the ROUGE score. The new loss function yields significantly more novel N-grams and sentences, at the cost of a slightly lower ROUGE score. |
Tasks | Abstractive Text Summarization |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1905.01975v1 |
http://arxiv.org/pdf/1905.01975v1.pdf | |
PWC | https://paperswithcode.com/paper/190501975 |
Repo | |
Framework | |
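A pointer-generator mixes a vocabulary distribution and a copy distribution with a gate p_gen; "pointer dropout" can be sketched as randomly disabling the copy distribution during training, forcing pure generation. The function name and the exact dropout placement are assumptions for illustration, not the paper's code:

```python
import numpy as np

def mix_with_pointer_dropout(p_vocab, p_copy, p_gen, drop_prob, rng):
    """Pointer-generator output mixture with 'pointer dropout': with
    probability drop_prob the copy distribution is disabled for this step,
    so the model must generate from the vocabulary alone."""
    if rng.random() < drop_prob:
        return p_vocab                              # pointer dropped
    return p_gen * p_vocab + (1.0 - p_gen) * p_copy

rng = np.random.default_rng(0)
p_vocab = np.array([0.7, 0.2, 0.1])                 # generator distribution
p_copy = np.array([0.0, 0.5, 0.5])                  # copy (attention) distribution
out = mix_with_pointer_dropout(p_vocab, p_copy, p_gen=0.6, drop_prob=0.0, rng=rng)
print(out)  # ≈ [0.42, 0.32, 0.26], still a valid distribution
```

Because both inputs are probability distributions and the mixture weights sum to one, the output always remains a valid distribution, dropped pointer or not.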
Tightly Coupled 3D Lidar Inertial Odometry and Mapping
Title | Tightly Coupled 3D Lidar Inertial Odometry and Mapping |
Authors | Haoyang Ye, Yuying Chen, Ming Liu |
Abstract | Ego-motion estimation is a fundamental requirement for most mobile robotic applications. Through sensor fusion, we can compensate for the deficiencies of stand-alone sensors and provide more reliable estimates. We introduce a tightly coupled lidar-IMU fusion method in this paper. By jointly minimizing the cost derived from lidar and IMU measurements, the lidar-IMU odometry (LIO) can perform well with acceptable drift over long-term experiments, even in challenging cases where the lidar measurements can be degraded. In addition, to obtain more reliable estimates of the lidar poses, a rotation-constrained refinement algorithm (LIO-mapping) is proposed to further align the lidar poses with the global map. The experimental results demonstrate that the proposed method can estimate the poses of the sensor pair at the IMU update rate with high precision, even under fast motion conditions or with insufficient features. |
Tasks | Motion Estimation, Sensor Fusion |
Published | 2019-04-15 |
URL | http://arxiv.org/abs/1904.06993v1 |
http://arxiv.org/pdf/1904.06993v1.pdf | |
PWC | https://paperswithcode.com/paper/tightly-coupled-3d-lidar-inertial-odometry |
Repo | |
Framework | |
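The "jointly minimizing the cost derived from lidar and IMU measurements" step can be illustrated with a deliberately tiny 1-D analogue: a quadratic cost with one lidar residual and one IMU residual, whose minimizer is the information-weighted average. Real LIO optimizes over full SE(3) poses, velocities, and biases; this toy only shows the shape of the joint cost:

```python
# Toy 1-D tight coupling: estimate a pose x by minimizing
#   J(x) = w_lidar * (x - z_lidar)**2 + w_imu * (x - z_imu)**2,
# whose closed-form minimizer is the weighted average below.
def fuse(z_lidar, w_lidar, z_imu, w_imu):
    return (w_lidar * z_lidar + w_imu * z_imu) / (w_lidar + w_imu)

# Degraded lidar (low weight): the estimate leans on the IMU prediction,
# which is exactly the robustness the abstract describes.
x = fuse(z_lidar=1.0, w_lidar=0.1, z_imu=2.0, w_imu=0.9)
print(x)  # ≈ 1.9
```

Down-weighting a degraded sensor (here the lidar) shifts the joint minimum toward the healthy one, without discarding either measurement.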
Coarse2Fine: A Two-stage Training Method for Fine-grained Visual Classification
Title | Coarse2Fine: A Two-stage Training Method for Fine-grained Visual Classification |
Authors | Amir Erfan Eshratifar, David Eigen, Michael Gormish, Massoud Pedram |
Abstract | Small inter-class and large intra-class variations are the main challenges in fine-grained visual classification. Objects from different classes share visually similar structures, and objects in the same class can have different poses and viewpoints. Therefore, the proper extraction of discriminative local features (e.g. a bird’s beak or a car’s headlight) is crucial. Most recent successes on this problem are based on attention models which can localize and attend to the discriminative local object parts. In this work, we propose a training method for visual attention networks, Coarse2Fine, which creates a differentiable path from the input space to the attended feature maps. Coarse2Fine learns an inverse mapping function from the attended feature maps to the informative regions in the raw image, which guides the attention maps to better attend to fine-grained features. We show that Coarse2Fine and orthogonal initialization of the attention weights can surpass state-of-the-art accuracies on common fine-grained classification tasks. |
Tasks | Fine-Grained Image Classification |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.02680v1 |
https://arxiv.org/pdf/1909.02680v1.pdf | |
PWC | https://paperswithcode.com/paper/coarse2fine-a-two-stage-training-method-for |
Repo | |
Framework | |
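The orthogonal initialization the abstract credits alongside Coarse2Fine is a standard construction: draw a Gaussian matrix, take its QR decomposition, and keep the orthonormal factor. A minimal NumPy version (the function name is ours; the Saxe-style recipe itself is standard):

```python
import numpy as np

def orthogonal_init(rows, cols, rng):
    """Orthogonal weight initialization via QR decomposition, as commonly
    used for attention or recurrent weights."""
    a = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, r = np.linalg.qr(a)
    q = q * np.sign(np.diag(r))          # fix the QR sign ambiguity
    return q[:rows, :cols] if rows >= cols else q[:cols, :rows].T

rng = np.random.default_rng(0)
w = orthogonal_init(6, 4, rng)
print(np.allclose(w.T @ w, np.eye(4)))  # True: columns are orthonormal
```

Orthonormal columns preserve the norm of incoming activations, which keeps early attention maps from collapsing or exploding.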
Few-shot Learning for Domain-specific Fine-grained Image Classification
Title | Few-shot Learning for Domain-specific Fine-grained Image Classification |
Authors | Xin Sun, Hongwei Xv, Junyu Dong, Qiong Li, Changrui Chen |
Abstract | Learning to recognize novel visual categories from a few examples is a challenging task for machines in real-world applications. In contrast, humans have the ability to discriminate even similar objects with little supervision. This paper addresses the few-shot fine-grained recognition problem. We propose a feature fusion model that extracts highly discriminative features by focusing on key regions. The model utilizes focus-area location to discover perceptually similar regions among objects. High-order integration is employed to capture the interaction information among intra-object parts. We also design a Center Neighbor Loss to form a robust embedding-space distribution for generating discriminative features. Furthermore, we build miniPPlankton, a typical fine-grained few-shot learning dataset from a real-world application in marine ecology. Extensive experiments are carried out to validate the performance of our model. First, the model is evaluated in two challenging experiments based on the miniDogsNet and Caltech-UCSD public datasets. The results demonstrate that our model achieves competitive performance compared with state-of-the-art models. Then, we apply our model to the real-world phytoplankton recognition task. The experimental results show the superiority of the proposed model over others on the miniPPlankton dataset. |
Tasks | Few-Shot Learning, Fine-Grained Image Classification, Image Classification |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09647v2 |
https://arxiv.org/pdf/1907.09647v2.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-learning-for-domain-specfic-fine |
Repo | |
Framework | |
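A center-based embedding loss of the kind the abstract names can be sketched as a pull term toward the sample's own class center plus a margin push away from the nearest other center. The paper's Center Neighbor Loss differs in its exact form; this only illustrates the pull/push structure such losses share:

```python
import numpy as np

def center_neighbor_loss(feats, labels, centers, margin=1.0):
    """Pull each feature toward its class center; push it at least a margin
    away from the nearest *other* center (hinge on the neighbor distance)."""
    loss = 0.0
    for f, y in zip(feats, labels):
        d = np.linalg.norm(centers - f, axis=1)
        pull = d[y] ** 2
        push = np.min(np.delete(d, y))            # nearest other center
        loss += pull + max(0.0, margin - push) ** 2
    return loss / len(feats)

centers = np.array([[0.0, 0.0], [10.0, 0.0]])
feats = centers.copy()            # each sample sits on its own center
labels = np.array([0, 1])
print(center_neighbor_loss(feats, labels, centers))  # 0.0
```

Samples exactly on their centers and far from all other centers incur zero loss; moving a sample off its center, or centers closer than the margin, makes the loss positive.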
Remote measurement of sea ice dynamics with regularized optimal transport
Title | Remote measurement of sea ice dynamics with regularized optimal transport |
Authors | M. D. Parno, B. A. West, A. J. Song, T. S. Hodgdon, D. T. O’Connor |
Abstract | As Arctic conditions rapidly change, human activity in the Arctic will continue to increase and so will the need for high-resolution observations of sea ice. While satellite imagery can provide high spatial resolution, it is temporally sparse and significant ice deformation can occur between observations. This makes it difficult to apply feature tracking or image correlation techniques that require persistent features to exist between images. With this in mind, we propose a technique based on optimal transport, which is commonly used to measure differences between probability distributions. When little ice enters or leaves the image scene, we show that regularized optimal transport can be used to quantitatively estimate ice deformation. We discuss the motivation for our approach and describe efficient computational implementations. Results are provided on a combination of synthetic and MODIS imagery to demonstrate the ability of our approach to estimate dynamic properties at the original image resolution. |
Tasks | |
Published | 2019-05-02 |
URL | https://arxiv.org/abs/1905.00989v1 |
https://arxiv.org/pdf/1905.00989v1.pdf | |
PWC | https://paperswithcode.com/paper/remote-measurement-of-sea-ice-dynamics-with |
Repo | |
Framework | |
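The standard workhorse for regularized optimal transport is the Sinkhorn algorithm: alternate scaling of a Gibbs kernel until the transport plan matches both mass histograms. A minimal 1-D NumPy version (a generic Sinkhorn sketch, not the authors' implementation):

```python
import numpy as np

def sinkhorn(a, b, cost, eps=0.5, iters=1000):
    """Entropically regularized optimal transport via Sinkhorn iterations.
    Returns a transport plan whose marginals match the histograms a and b;
    the displacements it encodes are what regularized OT uses to estimate
    deformation between images."""
    K = np.exp(-cost / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)                   # match column marginals
        u = a / (K @ v)                     # match row marginals
    return u[:, None] * K * v[None, :]

# 1-D toy: mass shifting to the right between two "images".
x = np.arange(4, dtype=float)
cost = (x[:, None] - x[None, :]) ** 2       # squared-distance ground cost
a = np.array([0.4, 0.3, 0.2, 0.1])          # source mass
b = np.array([0.1, 0.2, 0.3, 0.4])          # target mass
plan = sinkhorn(a, b, cost)
print(plan.sum())  # ≈ 1.0, total mass preserved
```

The entropic regularization eps trades sharpness of the plan for numerical stability and speed, which matters at full image resolution.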
Prestopping: How Does Early Stopping Help Generalization against Label Noise?
Title | Prestopping: How Does Early Stopping Help Generalization against Label Noise? |
Authors | Hwanjun Song, Minseok Kim, Dongmin Park, Jae-Gil Lee |
Abstract | Noisy labels are very common in real-world training data and lead to poor generalization on test data because of overfitting to the noisy labels. In this paper, we claim that such overfitting can be avoided by “early stopping” the training of a deep neural network before the noisy labels are severely memorized. We then resume training the early-stopped network using a “maximal safe set,” which maintains a collection of almost certainly true-labeled samples at each epoch since the early-stop point. Putting these together, our novel two-phase training method, called Prestopping, realizes noise-free training under any type of label noise for practical use. Extensive experiments on four image benchmark data sets verify that our method significantly outperforms four state-of-the-art methods in test error by 0.4-8.2 percentage points in the presence of real-world noise. |
Tasks | |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08059v1 |
https://arxiv.org/pdf/1911.08059v1.pdf | |
PWC | https://paperswithcode.com/paper/prestopping-how-does-early-stopping-help-1 |
Repo | |
Framework | |
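The "maximal safe set" bookkeeping can be sketched as keeping only the samples whose labels the network has agreed with at every epoch since the early-stop point. This is a hypothetical toy where per-epoch predictions are given as lists, with no real network trained; the paper's selection criterion is more involved:

```python
def maximal_safe_set(pred_history, labels):
    """pred_history: one list of predicted labels per epoch since the early
    stop. A sample stays 'safe' only if the prediction matched its label at
    every one of those epochs."""
    safe = set(range(len(labels)))
    for preds in pred_history:
        safe = {i for i in safe if preds[i] == labels[i]}
    return safe

labels = [0, 1, 1, 0]
history = [[0, 1, 0, 0],   # epoch t: sample 2 disagrees, so it is dropped
           [0, 1, 1, 0]]   # epoch t+1: sample 2 agrees again, but stays out
print(sorted(maximal_safe_set(history, labels)))  # [0, 1, 3]
```

Intersecting across epochs is what makes the set conservative: one disagreement is enough to exclude a sample from the resumed, "noise-free" training phase.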
Lumen boundary detection using neutrosophic c-means in IVOCT images
Title | Lumen boundary detection using neutrosophic c-means in IVOCT images |
Authors | Mohammad Habibi, Ahmad Ayatollahi, Niyoosha Dallalazar, Ali Kermani |
Abstract | In this paper, a novel method for lumen boundary identification is proposed using neutrosophic c-means. The method groups pixels of the intravascular optical coherence tomography image into several clusters using indeterminacy and neutrosophic theory, with the aim of detecting the boundaries. Intravascular optical coherence tomography images are cross-sectional, high-resolution images taken from the coronary arterial wall. Coronary artery disease causes many deaths each year, and the first step in diagnosing it is to detect the lumen boundary. Employing this approach, we obtained 0.972, 0.019, 0.076 mm2, 0.32 mm, and 0.985 as mean values for the Jaccard measure (JACC), the percentage of area difference (PAD), average distance (AD), Hausdorff distance (HD), and Dice index (DI), respectively. Based on our results, the method achieves high accuracy. |
Tasks | Boundary Detection |
Published | 2019-02-09 |
URL | http://arxiv.org/abs/1902.03489v2 |
http://arxiv.org/pdf/1902.03489v2.pdf | |
PWC | https://paperswithcode.com/paper/lumen-boundary-detection-using-neutrosophic-c |
Repo | |
Framework | |
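Neutrosophic c-means extends classical fuzzy c-means with an additional indeterminacy membership. The shared core, alternating membership and centroid updates, can be sketched on 1-D pixel intensities (a hedged sketch of plain fuzzy c-means, not the paper's exact algorithm):

```python
import numpy as np

def fuzzy_c_means(x, c=2, m=2.0, iters=50):
    """Classical fuzzy c-means on 1-D intensities: soft memberships from
    inverse distances, centroids from membership-weighted means."""
    centers = np.quantile(x, (np.arange(c) + 0.5) / c)  # spread initial centers
    u = None
    for _ in range(iters):
        d = np.abs(x[:, None] - centers[None, :]) + 1e-9
        u = 1.0 / d ** (2.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)               # memberships sum to 1
        centers = (u ** m * x[:, None]).sum(axis=0) / (u ** m).sum(axis=0)
    return centers, u

# Two intensity populations, e.g. lumen vs. wall pixels.
x = np.concatenate([np.full(50, 0.1), np.full(50, 0.9)])
centers, u = fuzzy_c_means(x)
print(np.sort(centers))  # ≈ [0.1, 0.9]
```

Thresholding the memberships of a real IVOCT image would give the cluster map from which a lumen boundary contour is then extracted.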
A Compendium on Network and Host based Intrusion Detection Systems
Title | A Compendium on Network and Host based Intrusion Detection Systems |
Authors | Rahul-Vigneswaran K, Prabaharan Poornachandran, Soman KP |
Abstract | Deep learning techniques have become the state-of-the-art methodology for executing complicated tasks across computer vision, natural language processing, and several other areas. Due to their rapid development and promising benchmarks in those fields, researchers have begun experimenting with these techniques on intrusion detection tasks as well. Deep learning is a subset and a natural extension of classical machine learning and an evolved model of neural networks. This paper surveys and discusses the leading-edge deep learning and neural network models applied to the area of Intrusion Detection Systems. |
Tasks | Intrusion Detection |
Published | 2019-04-06 |
URL | http://arxiv.org/abs/1904.03491v1 |
http://arxiv.org/pdf/1904.03491v1.pdf | |
PWC | https://paperswithcode.com/paper/a-compendium-on-network-and-host-based |
Repo | |
Framework | |
An Editorial Network for Enhanced Document Summarization
Title | An Editorial Network for Enhanced Document Summarization |
Authors | Edward Moroshko, Guy Feigenblat, Haggai Roitman, David Konopnicki |
Abstract | We propose the Editorial Network - a mixed extractive-abstractive summarization approach, which is applied as a post-processing step over a given sequence of extracted sentences. Our network tries to imitate the decision process of a human editor during summarization. Within such a process, each extracted sentence may be kept untouched, rephrased, or rejected entirely. We further suggest an effective way of training the “editor” based on a novel soft-labeling approach. Using the CNN/DailyMail dataset, we demonstrate the effectiveness of our approach compared to state-of-the-art extractive-only and abstractive-only baseline methods. |
Tasks | Abstractive Text Summarization, Document Summarization |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10360v1 |
http://arxiv.org/pdf/1902.10360v1.pdf | |
PWC | https://paperswithcode.com/paper/an-editorial-network-for-enhanced-document |
Repo | |
Framework | |
The continuous Bernoulli: fixing a pervasive error in variational autoencoders
Title | The continuous Bernoulli: fixing a pervasive error in variational autoencoders |
Authors | Gabriel Loaiza-Ganem, John P. Cunningham |
Abstract | Variational autoencoders (VAE) have quickly become a central tool in machine learning, applicable to a broad range of data types and latent variable models. By far the most common first step, taken by seminal papers and by core software libraries alike, is to model MNIST data using a deep network parameterizing a Bernoulli likelihood. This practice contains what appears to be, and is often set aside as, a minor inconvenience: the pixel data is [0,1] valued, not {0,1} as supported by the Bernoulli likelihood. Here we show that, far from being a triviality or nuisance that is convenient to ignore, this error has profound importance to VAE, both qualitative and quantitative. We introduce and fully characterize a new [0,1]-supported, single-parameter distribution: the continuous Bernoulli, which patches this pervasive bug in VAE. This distribution is not a nitpick; it produces meaningful performance improvements across a range of metrics and datasets, including sharper image samples, and suggests a broader class of performant VAE. |
Tasks | Latent Variable Models |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.06845v5 |
https://arxiv.org/pdf/1907.06845v5.pdf | |
PWC | https://paperswithcode.com/paper/the-continuous-bernoulli-fixing-a-pervasive |
Repo | |
Framework | |
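The continuous Bernoulli keeps the Bernoulli form lam^x * (1-lam)^(1-x) on [0,1] but adds the normalizing constant the plain Bernoulli lacks there. A small sketch of its log-density (the formula follows the paper's definition; the code itself is illustrative):

```python
import numpy as np

def log_cb(x, lam, tol=1e-6):
    """Log-density of the continuous Bernoulli on [0, 1]:
        p(x | lam) = C(lam) * lam**x * (1 - lam)**(1 - x),
    with normalizer C(lam) = 2*atanh(1 - 2*lam) / (1 - 2*lam),
    and C(0.5) = 2 by continuity."""
    if abs(lam - 0.5) < tol:
        log_c = np.log(2.0)
    else:
        log_c = np.log(2.0 * np.arctanh(1.0 - 2.0 * lam) / (1.0 - 2.0 * lam))
    return log_c + x * np.log(lam) + (1.0 - x) * np.log(1.0 - lam)

# The normalizer is exactly what makes this a proper density on [0, 1],
# unlike the raw Bernoulli likelihood applied to continuous pixels.
xs = np.linspace(0.0, 1.0, 2001)
dens = np.exp(log_cb(xs, 0.3))
integral = ((dens[:-1] + dens[1:]) / 2 * (xs[1] - xs[0])).sum()
print(integral)  # ≈ 1.0
```

In a VAE, swapping the Bernoulli reconstruction term for this log-density changes the training objective by exactly the missing log C(lam) term per pixel.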
Variational Autoencoders and Nonlinear ICA: A Unifying Framework
Title | Variational Autoencoders and Nonlinear ICA: A Unifying Framework |
Authors | Ilyes Khemakhem, Diederik P. Kingma, Ricardo Pio Monti, Aapo Hyvärinen |
Abstract | The framework of variational autoencoders allows us to efficiently learn deep latent-variable models, such that the model’s marginal distribution over observed variables fits the data. Often, we’re interested in going a step further, and want to approximate the true joint distribution over observed and latent variables, including the true prior and posterior distributions over latent variables. This is known to be generally impossible due to unidentifiability of the model. We address this issue by showing that for a broad family of deep latent-variable models, identification of the true joint distribution over observed and latent variables is actually possible up to very simple transformations, thus achieving a principled and powerful form of disentanglement. Our result requires a factorized prior distribution over the latent variables that is conditioned on an additionally observed variable, such as a class label or almost any other observation. We build on recent developments in nonlinear ICA, which we extend to the case with noisy, undercomplete or discrete observations, integrated in a maximum likelihood framework. The result also trivially contains identifiable flow-based generative models as a special case. |
Tasks | Latent Variable Models |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04809v3 |
https://arxiv.org/pdf/1907.04809v3.pdf | |
PWC | https://paperswithcode.com/paper/variational-autoencoders-and-nonlinear-ica-a |
Repo | |
Framework | |
Seeing Through Fog Without Seeing Fog: Deep Multimodal Sensor Fusion in Unseen Adverse Weather
Title | Seeing Through Fog Without Seeing Fog: Deep Multimodal Sensor Fusion in Unseen Adverse Weather |
Authors | Mario Bijelic, Tobias Gruber, Fahim Mannan, Florian Kraus, Werner Ritter, Klaus Dietmayer, Felix Heide |
Abstract | The fusion of multimodal sensor streams, such as camera, lidar, and radar measurements, plays a critical role in object detection for autonomous vehicles, which base their decision making on these inputs. While existing methods exploit redundant information under good conditions, they fail to do this in adverse weather where the sensory streams can be asymmetrically distorted. These rare “edge-case” scenarios are not represented in available datasets, and existing fusion architectures are not designed to handle them. To address this data challenge we present a novel multimodal dataset acquired over 10,000 km of driving in northern Europe. Although this dataset is the first large multimodal dataset in adverse weather, with 100k labels for lidar, camera, radar and gated NIR sensors, it does not facilitate training as extreme weather is rare. To this end, we present a deep fusion network for robust fusion without a large corpus of labeled training data covering all asymmetric distortions. Departing from proposal-level fusion, we propose a single-shot model that adaptively fuses features, driven by measurement entropy. We validate the proposed method, trained on clean data, on our extensive validation dataset. The dataset and all models will be published. |
Tasks | Autonomous Vehicles, Decision Making, Object Detection, Sensor Fusion |
Published | 2019-02-24 |
URL | https://arxiv.org/abs/1902.08913v2 |
https://arxiv.org/pdf/1902.08913v2.pdf | |
PWC | https://paperswithcode.com/paper/seeing-through-fog-without-seeing-fog-deep |
Repo | |
Framework | |
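The "fusion driven by measurement entropy" idea can be sketched by weighting each sensor's feature map with a decreasing function of its Shannon entropy, so a washed-out (foggy) stream is adaptively down-weighted. The paper does this at the feature level inside a single-shot detector; this toy, with hypothetical names, only shows the weighting step:

```python
import numpy as np

def entropy_weighted_fusion(features):
    """Weight each sensor's feature map by exp(-entropy) of its normalized
    response, then fuse as a convex combination."""
    weights = []
    for f in features:
        p = np.abs(f) / (np.abs(f).sum() + 1e-12)   # pseudo-distribution
        h = -(p * np.log(p + 1e-12)).sum()          # Shannon entropy
        weights.append(np.exp(-h))
    w = np.array(weights)
    w /= w.sum()
    fused = sum(wi * fi for wi, fi in zip(w, features))
    return fused, w

sharp = np.zeros((8, 8)); sharp[4, 4] = 1.0   # concentrated, informative response
foggy = np.ones((8, 8)) / 64.0                # uniform, uninformative response
fused, w = entropy_weighted_fusion([sharp, foggy])
print(w)  # the sharp stream gets almost all the weight
```

Because the weights are computed per measurement, the same network can rebalance itself scene by scene, which is the stated goal in unseen adverse weather.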
Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond
Title | Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond |
Authors | Xuechen Li, Denny Wu, Lester Mackey, Murat A. Erdogdu |
Abstract | Sampling with Markov chain Monte Carlo methods often amounts to discretizing some continuous-time dynamics with numerical integration. In this paper, we establish the convergence rate of sampling algorithms obtained by discretizing smooth Itô diffusions exhibiting fast Wasserstein-$2$ contraction, based on local deviation properties of the integration scheme. In particular, we study a sampling algorithm constructed by discretizing the overdamped Langevin diffusion with the method of stochastic Runge-Kutta. For strongly convex potentials that are smooth up to a certain order, its iterates converge to the target distribution in $2$-Wasserstein distance in $\tilde{\mathcal{O}}(d\epsilon^{-2/3})$ iterations. This improves upon the best-known rate for strongly log-concave sampling based on the overdamped Langevin equation using only the gradient oracle without adjustment. In addition, we extend our analysis of stochastic Runge-Kutta methods to uniformly dissipative diffusions with possibly non-convex potentials and show they achieve better rates compared to the Euler-Maruyama scheme in terms of the dependence on tolerance $\epsilon$. Numerical studies show that these algorithms lead to better stability and lower asymptotic errors. |
Tasks | |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.07868v3 |
https://arxiv.org/pdf/1906.07868v3.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-runge-kutta-accelerates-langevin |
Repo | |
Framework | |
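The baseline the paper improves on, the Euler-Maruyama discretization of the overdamped Langevin diffusion, is easy to write down and shows what the stochastic Runge-Kutta scheme is refining (this is the baseline, not the paper's SRK integrator):

```python
import numpy as np

def langevin_em(grad_log_p, x0, step, n_steps, rng):
    """Overdamped Langevin sampling via Euler-Maruyama:
        x_{k+1} = x_k + step * grad_log_p(x_k) + sqrt(2*step) * noise_k.
    A stochastic Runge-Kutta discretization of the same diffusion is what
    the paper shows to have a better tolerance dependence."""
    x = float(x0)
    out = np.empty(n_steps)
    for k in range(n_steps):
        x = x + step * grad_log_p(x) + np.sqrt(2.0 * step) * rng.standard_normal()
        out[k] = x
    return out

# Target: standard Gaussian, so grad log p(x) = -x.
rng = np.random.default_rng(0)
s = langevin_em(lambda x: -x, 0.0, step=0.05, n_steps=20000, rng=rng)
print(s[5000:].mean(), s[5000:].var())  # roughly 0 and 1 after burn-in
```

Note the systematic discretization bias (the empirical variance sits slightly above 1 for this step size); shrinking it at fixed cost is exactly where higher-order integrators pay off.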