April 1, 2020

Paper Group ANR 420

A Systematic Evaluation: Fine-Grained CNN vs. Traditional CNN Classifiers. LAVA NAT: A Non-Autoregressive Translation Model with Look-Around Decoding and Vocabulary Attention. Dual Multi-head Co-attention for Multi-choice Reading Comprehension. Multiscale Sparsifying Transform Learning for Image Denoising. Laplacian Denoising Autoencoder. Robust an …

A Systematic Evaluation: Fine-Grained CNN vs. Traditional CNN Classifiers

Title A Systematic Evaluation: Fine-Grained CNN vs. Traditional CNN Classifiers
Authors Saeed Anwar, Nick Barnes, Lars Petersson
Abstract To make the best use of the underlying minute and subtle differences, fine-grained classifiers collect information about inter-class variations. The task is very challenging due to the small differences between the colors, viewpoint, and structure in the same class entities. The classification becomes more difficult due to the similarities between the differences in viewpoint with other classes and differences with its own. In this work, we investigate the performance of the landmark general CNN classifiers, which presented top-notch results on large scale classification datasets, on the fine-grained datasets, and compare it against state-of-the-art fine-grained classifiers. In this paper, we pose two specific questions: (i) Do the general CNN classifiers achieve comparable results to fine-grained classifiers? (ii) Do general CNN classifiers require any specific information to improve upon the fine-grained ones? Throughout this work, we train the general CNN classifiers without introducing any aspect that is specific to fine-grained datasets. We show an extensive evaluation on six datasets to determine whether the fine-grained classifier is able to elevate the baseline in their experiments.
Tasks
Published 2020-03-24
URL https://arxiv.org/abs/2003.11154v1
PDF https://arxiv.org/pdf/2003.11154v1.pdf
PWC https://paperswithcode.com/paper/a-systematic-evaluation-fine-grained-cnn-vs
Repo
Framework

LAVA NAT: A Non-Autoregressive Translation Model with Look-Around Decoding and Vocabulary Attention

Title LAVA NAT: A Non-Autoregressive Translation Model with Look-Around Decoding and Vocabulary Attention
Authors Xiaoya Li, Yuxian Meng, Arianna Yuan, Fei Wu, Jiwei Li
Abstract Non-autoregressive translation (NAT) models generate multiple tokens in one forward pass and are highly efficient at the inference stage compared with autoregressive translation (AT) methods. However, NAT models often suffer from the multimodality problem, i.e., generating duplicated tokens or missing tokens. In this paper, we propose two novel methods to address this issue, the Look-Around (LA) strategy and the Vocabulary Attention (VA) mechanism. The Look-Around strategy predicts the neighbor tokens in order to predict the current token, and the Vocabulary Attention models long-term token dependencies inside the decoder by attending the whole vocabulary for each position to acquire knowledge of which token is about to be generated. Our proposed model uses significantly less time during inference compared with autoregressive models and most other NAT models. Our experiments on four benchmarks (WMT14 En→De, WMT14 De→En, WMT16 Ro→En and IWSLT14 De→En) show that the proposed model achieves competitive performance compared with the state-of-the-art non-autoregressive and autoregressive models while significantly reducing the time cost in the inference phase.
Tasks
Published 2020-02-08
URL https://arxiv.org/abs/2002.03084v1
PDF https://arxiv.org/pdf/2002.03084v1.pdf
PWC https://paperswithcode.com/paper/lava-nat-a-non-autoregressive-translation
Repo
Framework

Dual Multi-head Co-attention for Multi-choice Reading Comprehension

Title Dual Multi-head Co-attention for Multi-choice Reading Comprehension
Authors Pengfei Zhu, Hai Zhao, Xiaoguang Li
Abstract Multi-choice Machine Reading Comprehension (MRC) requires a model to decide the correct answer from a set of answer options when given a passage and a question. Thus, in addition to a powerful pre-trained Language Model as encoder, multi-choice MRC especially relies on a matching network design which is supposed to effectively capture the relationship among the triplet of passage, question and answers. Since the latest pre-trained Language Models have proven powerful even without the support of a matching network, and the latest matching networks have grown complicated, we propose a novel back-to-basics solution which straightforwardly models the MRC relationship as an attention mechanism inside the network. The proposed DUal Multi-head Co-Attention (DUMA) is shown to be simple but effective and is capable of generally promoting pre-trained Language Models. Our proposed method is evaluated on two benchmark multi-choice MRC tasks, DREAM and RACE, showing that even on top of strong Language Models, DUMA may still boost the model to reach new state-of-the-art performance.
Tasks Language Modelling, Machine Reading Comprehension, Reading Comprehension
Published 2020-01-26
URL https://arxiv.org/abs/2001.09415v4
PDF https://arxiv.org/pdf/2001.09415v4.pdf
PWC https://paperswithcode.com/paper/dual-multi-head-co-attention-for-multi-choice
Repo
Framework

Multiscale Sparsifying Transform Learning for Image Denoising

Title Multiscale Sparsifying Transform Learning for Image Denoising
Authors Ashkan Abbasi, Amirhassan Monadjemi, Leyuan Fang, Hossein Rabbani, Neda Noormohammadi
Abstract The data-driven sparse methods such as synthesis dictionary learning and sparsifying transform learning have been proven to be effective in image denoising. However, these methods are intrinsically single-scale, which ignores the multiscale nature of images. This often leads to suboptimal results. In this paper, we propose several strategies to exploit multiscale information in image denoising through the sparsifying transform learning denoising (TLD) method. To this end, we first employ a simple method of denoising each wavelet subband independently via TLD. Then, we show that this method can be greatly enhanced using wavelet subbands mixing, which is a cheap fusion technique, to combine the results of single-scale and multiscale methods. Finally, we remove the need for denoising detail subbands. This simplification leads to an efficient multiscale denoising method with competitive performance to its baseline. The effectiveness of the proposed methods is experimentally shown over two datasets: 1) classic test images corrupted with Gaussian noise, and 2) fluorescence microscopy images corrupted with real Poisson-Gaussian noise. The proposed multiscale methods improve over the single-scale baseline method by an average of about 0.2 dB (in terms of PSNR) for removing synthetic Gaussian noise from classic test images and real Poisson-Gaussian noise from microscopy images, respectively. Interestingly, the proposed multiscale methods keep their superiority over the baseline even when noise is relatively weak. More importantly, we show that the proposed methods lead to visually pleasing results, in which edges and textures are better recovered. Extensive experiments over these two different datasets show that the proposed methods offer a good trade-off between performance and complexity.
Tasks Denoising, Dictionary Learning, Image Denoising
Published 2020-03-25
URL https://arxiv.org/abs/2003.11265v1
PDF https://arxiv.org/pdf/2003.11265v1.pdf
PWC https://paperswithcode.com/paper/multiscale-sparsifying-transform-learning-for
Repo
Framework
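
The abstract above reports its gains in dB of PSNR. As a reminder of how such a figure is obtained, here is a minimal sketch, assuming 8-bit images flattened into equal-length lists of pixel values (peak value 255). The function name `psnr` and the flat-list representation are illustrative choices, not taken from the paper.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    `peak` is the maximum possible pixel value (255 for 8-bit images, an assumption).
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    # Mean squared error between the clean reference and the denoised estimate.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a 0.2 dB improvement corresponds to roughly a 4.5% reduction in MSE (a factor of 10^(-0.02)).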

Laplacian Denoising Autoencoder

Title Laplacian Denoising Autoencoder
Authors Jianbo Jiao, Linchao Bao, Yunchao Wei, Shengfeng He, Honghui Shi, Rynson Lau, Thomas S. Huang
Abstract While deep neural networks have been shown to perform remarkably well in many machine learning tasks, labeling a large amount of ground truth data for supervised training is usually very costly to scale. Therefore, learning robust representations with unlabeled data is critical in relieving human effort and vital for many downstream tasks. Recent advances in unsupervised and self-supervised learning approaches for visual data have benefited greatly from domain knowledge. Here we are interested in a more generic unsupervised learning framework that can be easily generalized to other domains. In this paper, we propose to learn data representations with a novel type of denoising autoencoder, where the noisy input data is generated by corrupting latent clean data in the gradient domain. This can be naturally generalized to span multiple scales with a Laplacian pyramid representation of the input data. In this way, the agent learns more robust representations that exploit the underlying data structures across multiple scales. Experiments on several visual benchmarks demonstrate that better representations can be learned with the proposed approach, compared to its counterpart with single-scale corruption and other approaches. Furthermore, we also demonstrate that the learned representations perform well when transferring to other downstream vision tasks.
Tasks Denoising
Published 2020-03-30
URL https://arxiv.org/abs/2003.13623v1
PDF https://arxiv.org/pdf/2003.13623v1.pdf
PWC https://paperswithcode.com/paper/laplacian-denoising-autoencoder-1
Repo
Framework
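
To make the idea of corrupting data across the scales of a Laplacian pyramid concrete, the following is a simplified 1-D sketch. It assumes a box filter (pairwise averaging) for downsampling and sample repetition for upsampling, rather than the Gaussian filtering typically used for images; all function names are hypothetical and this is not the authors' implementation.

```python
import random


def downsample(signal):
    # Average neighbouring pairs (box filter); length must be even.
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]


def upsample(signal):
    # Repeat every sample twice.
    return [v for v in signal for _ in range(2)]


def build_pyramid(signal, levels):
    """Return [detail_0, ..., detail_{levels-1}, coarse].

    len(signal) must be divisible by 2**levels.
    """
    pyramid, current = [], list(signal)
    for _ in range(levels):
        coarse = downsample(current)
        # Detail band = what is lost when going to the coarser scale.
        pyramid.append([c - u for c, u in zip(current, upsample(coarse))])
        current = coarse
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = [u + d for u, d in zip(upsample(current), detail)]
    return current


def corrupt(signal, levels, sigma, rng):
    """Add Gaussian noise to every pyramid band, then collapse the pyramid."""
    noisy = [[v + rng.gauss(0.0, sigma) for v in band]
             for band in build_pyramid(signal, levels)]
    return reconstruct(noisy)
```

A denoising autoencoder would then be trained to map `corrupt(x, ...)` back to `x`; the pyramid only determines where the noise is injected.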

Robust and On-the-fly Dataset Denoising for Image Classification

Title Robust and On-the-fly Dataset Denoising for Image Classification
Authors Jiaming Song, Lunjia Hu, Yann Dauphin, Michael Auli, Tengyu Ma
Abstract Memorization in over-parameterized neural networks could severely hurt generalization in the presence of mislabeled examples. However, mislabeled examples are hard to avoid in extremely large datasets collected with weak supervision. We address this problem by reasoning counterfactually about the loss distribution of examples with uniform random labels had they been trained with the real examples, and use this information to remove noisy examples from the training set. First, we observe that examples with uniform random labels have higher losses when trained with stochastic gradient descent under large learning rates. Then, we propose to model the loss distribution of the counterfactual examples using only the network parameters, which is able to model such examples with remarkable success. Finally, we propose to remove examples whose loss exceeds a certain quantile of the modeled loss distribution. This leads to On-the-fly Data Denoising (ODD), a simple yet effective algorithm that is robust to mislabeled examples, while introducing almost zero computational overhead compared to standard training. ODD is able to achieve state-of-the-art results on a wide range of datasets including real-world ones such as WebVision and Clothing1M.
Tasks Denoising, Image Classification
Published 2020-03-24
URL https://arxiv.org/abs/2003.10647v1
PDF https://arxiv.org/pdf/2003.10647v1.pdf
PWC https://paperswithcode.com/paper/robust-and-on-the-fly-dataset-denoising-for
Repo
Framework
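
The final step described in the abstract above, removing examples whose loss exceeds a quantile of a modeled loss distribution, can be sketched as follows. This is a minimal illustration, not the ODD algorithm itself: the paper models the counterfactual loss distribution from the network parameters, whereas this sketch assumes a plain sample of reference losses (`reference_losses`) as a stand-in. The function names and the quantile parameter `q` are hypothetical.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty sequence, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def keep_indices(train_losses, reference_losses, q):
    """Indices of training examples whose loss does not exceed the q-quantile
    of the reference (stand-in for "random-label") loss distribution."""
    threshold = quantile(reference_losses, q)
    return [i for i, loss in enumerate(train_losses) if loss <= threshold]
```

A smaller `q` filters more aggressively, since fewer training losses fall below a low quantile of the high-loss reference distribution.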

Patch-based Non-Local Bayesian Networks for Blind Confocal Microscopy Denoising

Title Patch-based Non-Local Bayesian Networks for Blind Confocal Microscopy Denoising
Authors Saeed Izadi, Ghassan Hamarneh
Abstract Confocal microscopy is essential for histopathologic cell visualization and quantification. Despite its significant role in biology, fluorescence confocal microscopy suffers from the presence of inherent noise during image acquisition. Non-local patch-wise Bayesian mean filtering (NLB) was until recently the state-of-the-art denoising approach. However, classic denoising methods have been outperformed by neural networks in recent years. In this work, we propose to exploit the strengths of NLB in the framework of Bayesian deep learning. We do so by designing a convolutional neural network and training it to learn parameters of a Gaussian model approximating the prior on noise-free patches given their nearest, similar yet non-local, neighbors. We then apply Bayesian reasoning to leverage the prior and information from the noisy patch in the process of approximating the noise-free patch. Specifically, we use the closed-form analytic maximum a posteriori (MAP) estimate in the NLB algorithm to obtain the noise-free patch that maximizes the posterior distribution. The performance of our proposed method is evaluated on confocal microscopy images with real Poisson-Gaussian noise. Our experiments reveal the superiority of our approach against state-of-the-art unsupervised denoising techniques.
Tasks Denoising
Published 2020-03-25
URL https://arxiv.org/abs/2003.11177v1
PDF https://arxiv.org/pdf/2003.11177v1.pdf
PWC https://paperswithcode.com/paper/patch-based-non-local-bayesian-networks-for
Repo
Framework
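
The closed-form MAP estimate mentioned in the abstract above combines a Gaussian prior with the noisy observation. The sketch below shows the simplest instance, assuming an independent per-pixel Gaussian prior (mean and variance per pixel) and additive Gaussian noise of known variance. The paper works with patch-level Gaussian models and real Poisson-Gaussian noise, so this per-pixel form and the name `map_estimate` are simplifying assumptions for illustration only.

```python
def map_estimate(noisy, prior_mean, prior_var, noise_var):
    """Per-pixel MAP estimate for x ~ N(prior_mean, prior_var), y = x + n,
    with n ~ N(0, noise_var).

    The posterior is Gaussian, so its mode equals its mean: a
    precision-weighted average of the prior mean and the observation.
    """
    return [
        (noise_var * m + v * y) / (v + noise_var)
        for y, m, v in zip(noisy, prior_mean, prior_var)
    ]
```

When the prior variance is small relative to the noise variance the estimate stays close to the prior mean; when it is large the estimate follows the noisy observation.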

Deep Survival Machines: Fully Parametric Survival Regression and Representation Learning for Censored Data with Competing Risks

Title Deep Survival Machines: Fully Parametric Survival Regression and Representation Learning for Censored Data with Competing Risks
Authors Chirag Nagpal, Xinyu Li, Artur Dubrawski
Abstract We describe a new approach to estimating relative risks in time-to-event prediction problems with censored data in a fully parametric manner. Our approach does not require making strong assumptions of constant baseline hazard of the underlying survival distribution, as required by the Cox-proportional hazard model. By jointly learning deep nonlinear representations of the input covariates, we demonstrate the benefits of our approach when used to estimate survival risks through extensive experimentation on multiple real world datasets with different levels of censoring. We further demonstrate advantages of our model in the competing risks scenario. To the best of our knowledge, this is the first work involving fully parametric estimation of survival times with competing risks in the presence of censoring.
Tasks Representation Learning, Time-to-Event Prediction
Published 2020-03-02
URL https://arxiv.org/abs/2003.01176v1
PDF https://arxiv.org/pdf/2003.01176v1.pdf
PWC https://paperswithcode.com/paper/deep-survival-machines-fully-parametric
Repo
Framework
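
Fully parametric survival estimation with censoring rests on a likelihood in which observed events contribute the density and censored records contribute the survival function. The sketch below illustrates this for a single Weibull distribution. The abstract does not name a parametric family, so the Weibull choice, the function name, and the `(time, event_observed)` record format are assumptions made for illustration.

```python
import math


def weibull_censored_loglik(records, shape, scale):
    """Log-likelihood of right-censored data under a Weibull(shape, scale) model.

    `records` is an iterable of (time, event_observed) pairs with time > 0.
    Observed events add log f(t); censored records add log S(t).
    """
    total = 0.0
    for t, observed in records:
        z = (t / scale) ** shape
        log_survival = -z  # log S(t) = -(t / scale) ** shape
        if observed:
            # log f(t) = log h(t) + log S(t), with hazard
            # h(t) = (shape / scale) * (t / scale) ** (shape - 1)
            log_hazard = math.log(shape / scale) + (shape - 1) * math.log(t / scale)
            total += log_hazard + log_survival
        else:
            total += log_survival
    return total
```

In a deep model, `shape` and `scale` would be produced from learned representations of the covariates and this quantity maximized by gradient ascent.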

Reinforcement co-Learning of Deep and Spiking Neural Networks for Energy-Efficient Mapless Navigation with Neuromorphic Hardware

Title Reinforcement co-Learning of Deep and Spiking Neural Networks for Energy-Efficient Mapless Navigation with Neuromorphic Hardware
Authors Guangzhi Tang, Neelesh Kumar, Konstantinos P. Michmizos
Abstract Energy-efficient mapless navigation is crucial for mobile robots as they explore unknown environments with limited on-board resources. Although the recent deep reinforcement learning (DRL) approaches have been successfully applied to navigation, their high energy consumption limits their use in many robotic applications. Here, we propose a neuromorphic approach that combines the energy-efficiency of spiking neural networks with the optimality of DRL to learn control policies for mapless navigation. Our hybrid framework, Spiking deep deterministic policy gradient (SDDPG), consists of a spiking actor network (SAN) and a deep critic network, where the two networks were trained jointly using gradient descent. The trained SAN was deployed on Intel’s Loihi neuromorphic processor. The co-learning enabled synergistic information exchange between the two networks, allowing them to overcome each other’s limitations through shared representation learning. When validated on both simulated and real-world complex environments, our method on Loihi not only consumed 75 times less energy per inference as compared to DDPG on Jetson TX2, but also had a higher rate of successful navigation to the goal, by 1% to 4.2% depending on the forward-propagation timestep size. These results reinforce our ongoing effort to design brain-inspired algorithms for controlling autonomous robots with neuromorphic hardware.
Tasks Representation Learning
Published 2020-03-02
URL https://arxiv.org/abs/2003.01157v1
PDF https://arxiv.org/pdf/2003.01157v1.pdf
PWC https://paperswithcode.com/paper/reinforcement-co-learning-of-deep-and-spiking
Repo
Framework

Channel Interaction Networks for Fine-Grained Image Categorization

Title Channel Interaction Networks for Fine-Grained Image Categorization
Authors Yu Gao, Xintong Han, Xun Wang, Weilin Huang, Matthew R. Scott
Abstract Fine-grained image categorization is challenging due to the subtle inter-class differences. We posit that exploiting the rich relationships between channels can help capture such differences since different channels correspond to different semantics. In this paper, we propose a channel interaction network (CIN), which models the channel-wise interplay both within an image and across images. For a single image, a self-channel interaction (SCI) module is proposed to explore channel-wise correlation within the image. This allows the model to learn the complementary features from the correlated channels, yielding stronger fine-grained features. Furthermore, given an image pair, we introduce a contrastive channel interaction (CCI) module to model the cross-sample channel interaction with a metric learning framework, allowing the CIN to distinguish the subtle visual differences between images. Our model can be trained efficiently in an end-to-end fashion without the need for multi-stage training and testing. Finally, comprehensive experiments are conducted on three publicly available benchmarks, where the proposed method consistently outperforms the state-of-the-art approaches, such as DFL-CNN (Wang, Morariu, and Davis 2018) and NTS (Yang et al. 2018).
Tasks Image Categorization, Metric Learning
Published 2020-03-11
URL https://arxiv.org/abs/2003.05235v1
PDF https://arxiv.org/pdf/2003.05235v1.pdf
PWC https://paperswithcode.com/paper/channel-interaction-networks-for-fine-grained-1
Repo
Framework

Whole-Body Control of a Mobile Manipulator using End-to-End Reinforcement Learning

Title Whole-Body Control of a Mobile Manipulator using End-to-End Reinforcement Learning
Authors Julien Kindle, Fadri Furrer, Tonci Novkovic, Jen Jen Chung, Roland Siegwart, Juan Nieto
Abstract Mobile manipulation is usually achieved by sequentially executing base and manipulator movements. This simplification, however, leads to a loss in efficiency and in some cases a reduction of workspace size. Even though different methods have been proposed to solve Whole-Body Control (WBC) online, they are either limited by a kinematic model or do not allow for reactive, online obstacle avoidance. In order to overcome these drawbacks, in this work, we propose an end-to-end Reinforcement Learning (RL) approach to WBC. We compared our learned controller against a state-of-the-art sampling-based method in simulation and achieved faster overall mission times. In addition, we validated the learned policy on our mobile manipulator RoyalPanda in challenging narrow corridor environments.
Tasks
Published 2020-02-25
URL https://arxiv.org/abs/2003.02637v1
PDF https://arxiv.org/pdf/2003.02637v1.pdf
PWC https://paperswithcode.com/paper/whole-body-control-of-a-mobile-manipulator
Repo
Framework

Discrimination-aware Network Pruning for Deep Model Compression

Title Discrimination-aware Network Pruning for Deep Model Compression
Authors Jing Liu, Bohan Zhuang, Zhuangwei Zhuang, Yong Guo, Junzhou Huang, Jinhui Zhu, Mingkui Tan
Abstract We study network pruning which aims to remove redundant channels/kernels and hence speed up the inference of deep networks. Existing pruning methods either train from scratch with sparsity constraints or minimize the reconstruction error between the feature maps of the pre-trained models and the compressed ones. Both strategies suffer from some limitations: the former kind is computationally expensive and difficult to converge, while the latter kind optimizes the reconstruction error but ignores the discriminative power of channels. In this paper, we propose a simple-yet-effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power. Note that a channel often consists of a set of kernels. Besides the redundancy in channels, some kernels in a channel may also be redundant and fail to contribute to the discriminative power of the network, resulting in kernel level redundancy. To solve this, we propose a discrimination-aware kernel pruning (DKP) method to further compress deep networks by removing redundant kernels. To prevent DCP/DKP from selecting redundant channels/kernels, we propose a new adaptive stopping condition, which helps to automatically determine the number of selected channels/kernels and often results in more compact models with better performance. Extensive experiments on both image classification and face recognition demonstrate the effectiveness of our methods. For example, on ILSVRC-12, the resultant ResNet-50 model with 30% reduction of channels even outperforms the baseline model by 0.36% in terms of Top-1 accuracy. The pruned MobileNetV1 and MobileNetV2 achieve 1.93x and 1.42x inference acceleration on a mobile device, respectively, with negligible performance degradation. The source code and the pre-trained models are available at https://github.com/SCUT-AILab/DCP.
Tasks Face Recognition, Image Classification, Model Compression, Network Pruning
Published 2020-01-04
URL https://arxiv.org/abs/2001.01050v1
PDF https://arxiv.org/pdf/2001.01050v1.pdf
PWC https://paperswithcode.com/paper/discrimination-aware-network-pruning-for-deep
Repo
Framework

Accelerated Dual-Averaging Primal-Dual Method for Composite Convex Minimization

Title Accelerated Dual-Averaging Primal-Dual Method for Composite Convex Minimization
Authors Conghui Tan, Yuqiu Qian, Shiqian Ma, Tong Zhang
Abstract Dual averaging-type methods are widely used in industrial machine learning applications due to their ability to promote solution structure (e.g., sparsity) efficiently. In this paper, we propose a novel accelerated dual-averaging primal-dual algorithm for minimizing a composite convex function. We also derive a stochastic version of the proposed method which solves empirical risk minimization, and its advantages in handling sparse data are demonstrated both theoretically and empirically.
Tasks
Published 2020-01-15
URL https://arxiv.org/abs/2001.05537v1
PDF https://arxiv.org/pdf/2001.05537v1.pdf
PWC https://paperswithcode.com/paper/accelerated-dual-averaging-primal-dual-method
Repo
Framework

DSR: A Collection for the Evaluation of Graded Disease-Symptom Relations

Title DSR: A Collection for the Evaluation of Graded Disease-Symptom Relations
Authors Markus Zlabinger, Sebastian Hofstätter, Navid Rekabsaz, Allan Hanbury
Abstract The effective extraction of ranked disease-symptom relationships is a critical component in various medical tasks, including computer-assisted medical diagnosis or the discovery of unexpected associations between diseases. While existing disease-symptom relationship extraction methods are used as the foundation in the various medical tasks, no collection is available to systematically evaluate the performance of such methods. In this paper, we introduce the Disease-Symptom Relation collection (DSR-collection), created by five fully trained physicians as expert annotators. We provide graded symptom judgments for diseases by differentiating between “symptoms” and “primary symptoms”. Further, we provide several strong baselines, based on the methods used in previous studies. The first method is based on word embeddings, and the second on co-occurrences of keywords in medical articles. For the co-occurrence method, we propose an adaption in which not only keywords are considered, but also the full text of medical articles. The evaluation on the DSR-collection shows the effectiveness of the proposed adaption in terms of nDCG, precision, and recall.
Tasks Medical Diagnosis, Word Embeddings
Published 2020-01-15
URL https://arxiv.org/abs/2001.05357v1
PDF https://arxiv.org/pdf/2001.05357v1.pdf
PWC https://paperswithcode.com/paper/dsr-a-collection-for-the-evaluation-of-graded
Repo
Framework
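
The graded judgments described in the abstract above lend themselves to nDCG, one of the metrics it names. Here is a minimal sketch, assuming grades of 0 (not a symptom), 1 (symptom) and 2 (primary symptom), a linear gain, and an ideal ranking built from the same returned list. The grade values, the gain function, and the function names are assumptions, not details taken from the collection.

```python
import math


def dcg(grades):
    """Discounted cumulative gain with linear gain and log2 position discount."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(grades))


def ndcg(grades):
    """nDCG of a ranked list of relevance grades (position 0 = top result).

    The ideal ordering is taken from the same list; a full evaluation would
    build it from all judged symptoms of the disease.
    """
    ideal = dcg(sorted(grades, reverse=True))
    return dcg(grades) / ideal if ideal > 0 else 0.0
```

For example, a method that ranks a primary symptom second, behind a non-symptom, scores `ndcg([0, 2])`, which equals 1 / log2(3), or about 0.63.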

Balancing the composition of word embeddings across heterogenous data sets

Title Balancing the composition of word embeddings across heterogenous data sets
Authors Stephanie Brandl, David Lassner, Maximilian Alber
Abstract Word embeddings capture semantic relationships based on contextual information and are the basis for a wide variety of natural language processing applications. Notably, these relationships are learned solely from the data, and consequently the data composition impacts the semantics of the embeddings, which arguably can lead to biased word vectors. Given qualitatively different data subsets, we aim to align the influence of single subsets on the resulting word vectors, while retaining their quality. In this regard we propose a criterion to measure the shift towards a single data subset and develop approaches to meet both objectives. We find that a weighted average of the two subset embeddings balances the influence of those subsets while word similarity performance decreases. We further propose a promising optimization approach to balance influences and quality of word embeddings.
Tasks Word Embeddings
Published 2020-01-14
URL https://arxiv.org/abs/2001.04693v1
PDF https://arxiv.org/pdf/2001.04693v1.pdf
PWC https://paperswithcode.com/paper/balancing-the-composition-of-word-embeddings
Repo
Framework
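
The weighted average of two subset embeddings mentioned in the abstract above can be sketched as follows. This is a minimal illustration, assuming both embeddings are stored as word-to-vector dictionaries that already live in a comparable vector space, and that only the shared vocabulary is combined. The function name and the weight parameter `alpha` are hypothetical.

```python
def weighted_average_embeddings(emb_a, emb_b, alpha):
    """Combine two embeddings as alpha * emb_a + (1 - alpha) * emb_b.

    Only words present in both vocabularies are kept (an assumption).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    combined = {}
    for word in emb_a.keys() & emb_b.keys():
        combined[word] = [
            alpha * a + (1.0 - alpha) * b
            for a, b in zip(emb_a[word], emb_b[word])
        ]
    return combined
```

Sweeping `alpha` between 0 and 1 moves the result from one subset's vectors to the other's, which is the knob such a balancing criterion would tune.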