April 2, 2020

3093 words 15 mins read

Paper Group ANR 371

HOTCAKE: Higher Order Tucker Articulated Kernels for Deeper CNN Compression. Efficient Topological Layer based on Persistent Landscapes. Classification of the Chinese Handwritten Numbers with Supervised Projective Dictionary Pair Learning. COKE: Communication-Censored Kernel Learning for Decentralized Non-parametric Learning. Discriminative Feature …

HOTCAKE: Higher Order Tucker Articulated Kernels for Deeper CNN Compression

Title HOTCAKE: Higher Order Tucker Articulated Kernels for Deeper CNN Compression
Authors Rui Lin, Ching-Yun Ko, Zhuolun He, Cong Chen, Yuan Cheng, Hao Yu, Graziano Chesi, Ngai Wong
Abstract The emergence of edge computing has promoted immense interest in compacting a neural network without sacrificing much accuracy. In this regard, low-rank tensor decomposition constitutes a powerful tool to compress convolutional neural networks (CNNs) by decomposing the 4-way kernel tensor into multi-stage smaller ones. Building on top of Tucker-2 decomposition, we propose a generalized Higher Order Tucker Articulated Kernels (HOTCAKE) scheme comprising four steps: input channel decomposition, guided Tucker rank selection, higher order Tucker decomposition and fine-tuning. By subjecting each CONV layer to HOTCAKE, a highly compressed CNN model with a graceful accuracy trade-off is obtained. Experiments show HOTCAKE can compress even pre-compressed models and produce state-of-the-art lightweight networks.
Tasks
Published 2020-02-28
URL https://arxiv.org/abs/2002.12663v1
PDF https://arxiv.org/pdf/2002.12663v1.pdf
PWC https://paperswithcode.com/paper/hotcake-higher-order-tucker-articulated
Repo
Framework
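
The core operation behind HOTCAKE-style compression is a Tucker-2-type factorization of the 4-way convolution kernel into two channel factors and a small core. Below is a minimal numpy sketch of that step via truncated SVDs on the channel unfoldings; the ranks are illustrative placeholders, and the guided rank selection, input channel decomposition, and fine-tuning stages of HOTCAKE are not shown.

```python
import numpy as np

def tucker2_conv_kernel(W, r_out, r_in):
    """Approximate a conv kernel W of shape (C_out, C_in, kH, kW) as
    U_out x core x U_in, i.e. a 1x1 conv, a small spatial conv, and a 1x1 conv."""
    C_out, C_in, kH, kW = W.shape

    # Mode-0 unfolding (output channels) and truncated SVD.
    W0 = W.reshape(C_out, -1)
    U_out = np.linalg.svd(W0, full_matrices=False)[0][:, :r_out]   # (C_out, r_out)

    # Mode-1 unfolding (input channels) and truncated SVD.
    W1 = np.transpose(W, (1, 0, 2, 3)).reshape(C_in, -1)
    U_in = np.linalg.svd(W1, full_matrices=False)[0][:, :r_in]     # (C_in, r_in)

    # Core tensor: project W onto the two factor subspaces.
    core = np.einsum('oikl,or,is->rskl', W, U_out, U_in)           # (r_out, r_in, kH, kW)

    # Reconstruction, to check the approximation error.
    W_hat = np.einsum('rskl,or,is->oikl', core, U_out, U_in)
    rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
    return U_out, core, U_in, rel_err

if __name__ == "__main__":
    W = np.random.randn(64, 32, 3, 3)
    U_out, core, U_in, rel_err = tucker2_conv_kernel(W, r_out=16, r_in=8)
    print(core.shape, rel_err)
```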

Efficient Topological Layer based on Persistent Landscapes

Title Efficient Topological Layer based on Persistent Landscapes
Authors Kwangho Kim, Jisu Kim, Joon Sik Kim, Frederic Chazal, Larry Wasserman
Abstract We propose a novel topological layer for general deep learning models based on persistent landscapes, in which we can efficiently exploit underlying topological features of the input data structure. We use the robust DTM function and show differentiability with respect to layer inputs, for a general persistent homology with arbitrary filtration. Thus, our proposed layer can be placed anywhere in the network architecture and feed critical information on the topological features of input data into subsequent layers to improve the learnability of the networks toward a given task. A task-optimal structure of the topological layer is learned during training via backpropagation, without requiring any input featurization or data preprocessing. We provide a tight stability theorem, and show that the proposed layer is robust towards noise and outliers. We demonstrate the effectiveness of our approach by classification experiments on various datasets.
Tasks
Published 2020-02-07
URL https://arxiv.org/abs/2002.02778v1
PDF https://arxiv.org/pdf/2002.02778v1.pdf
PWC https://paperswithcode.com/paper/efficient-topological-layer-based-on
Repo
Framework
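
For reference, a persistence landscape replaces each point (b, d) of a persistence diagram with a tent function and takes, at every t, the k-th largest tent value. The sketch below evaluates the first k landscape functions on a grid with numpy, using a toy diagram; it is not the DTM-based, differentiable layer proposed in the paper.

```python
import numpy as np

def persistence_landscape(diagram, grid, k=3):
    """Evaluate landscape functions lambda_1..lambda_k on `grid`.

    diagram: array of (birth, death) pairs, shape (n, 2)
    grid:    1-D array of t values
    Returns an array of shape (k, len(grid)).
    """
    births, deaths = diagram[:, 0][:, None], diagram[:, 1][:, None]   # (n, 1)
    t = grid[None, :]                                                 # (1, m)
    # Tent function of each diagram point: min(t - b, d - t), clipped at 0.
    tents = np.clip(np.minimum(t - births, deaths - t), 0.0, None)    # (n, m)
    # lambda_k(t) is the k-th largest tent value at t.
    tents_sorted = np.sort(tents, axis=0)[::-1]                       # descending rows
    return tents_sorted[:min(k, tents_sorted.shape[0])]

if __name__ == "__main__":
    diagram = np.array([[0.0, 1.0], [0.2, 0.9], [0.5, 0.6]])
    grid = np.linspace(0.0, 1.0, 101)
    print(persistence_landscape(diagram, grid, k=2).shape)   # (2, 101)
```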

Classification of the Chinese Handwritten Numbers with Supervised Projective Dictionary Pair Learning

Title Classification of the Chinese Handwritten Numbers with Supervised Projective Dictionary Pair Learning
Authors Rasool Ameri, Saideh Ferdowsi, Ali Alameer, Vahid Abolghasemi, Kianoush Nazarpour
Abstract Image classification has become a key ingredient in the field of computer vision. To enhance classification accuracy, current approaches heavily focus on increasing network depth and width, e.g., inception modules, at the cost of computational requirements. To mitigate this problem, in this paper a novel dictionary learning method is proposed and tested with Chinese handwritten numbers. We have considered three important characteristics to design the dictionary: discriminability, sparsity, and classification error. We formulated these metrics into a unified cost function. The proposed architecture i) obtains an efficient sparse code in a novel feature space without relying on $\ell_0$ and $\ell_1$ norms minimisation; and ii) includes the classification error within the cost function as an extra constraint. Experimental results show that the proposed method provides superior classification performance compared to recent dictionary learning methods. With a classification accuracy of $\sim$98%, the results suggest that our proposed sparse learning algorithm achieves comparable performance to existing well-known deep learning methods, e.g., SqueezeNet, GoogLeNet and MobileNetV2, but with a fraction of parameters.
Tasks Dictionary Learning, Image Classification, Sparse Learning
Published 2020-03-26
URL https://arxiv.org/abs/2003.11700v1
PDF https://arxiv.org/pdf/2003.11700v1.pdf
PWC https://paperswithcode.com/paper/classification-of-the-chinese-handwritten
Repo
Framework
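
Classification with a projective dictionary pair reduces to comparing class-wise reconstruction residuals ||x - D_c P_c x||. The numpy sketch below shows only that decision rule, with randomly initialized (untrained) pairs as placeholders; the actual method learns each D_c and P_c jointly under the discriminability, sparsity, and classification-error terms in the unified cost function described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_features, n_atoms = 10, 64, 16    # illustrative sizes

# Placeholder synthesis dictionaries D_c (features x atoms) and
# analysis projections P_c (atoms x features); in practice these are learned.
D = [rng.standard_normal((n_features, n_atoms)) for _ in range(n_classes)]
P = [rng.standard_normal((n_atoms, n_features)) for _ in range(n_classes)]

def classify(x):
    """Assign x to the class whose dictionary pair reconstructs it best."""
    residuals = [np.linalg.norm(x - D_c @ (P_c @ x)) for D_c, P_c in zip(D, P)]
    return int(np.argmin(residuals))

x = rng.standard_normal(n_features)
print("predicted class:", classify(x))
```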

COKE: Communication-Censored Kernel Learning for Decentralized Non-parametric Learning

Title COKE: Communication-Censored Kernel Learning for Decentralized Non-parametric Learning
Authors Ping Xu, Yue Wang, Xiang Chen, Zhi Tian
Abstract This paper studies the decentralized optimization and learning problem where multiple interconnected agents aim to learn an optimal decision function defined over a reproducing kernel Hilbert (RKH) space by jointly minimizing a global objective function, with access to locally observed data only. As a non-parametric approach, kernel learning faces a major challenge in distributed implementation: the decision variables of local objective functions are data-dependent with different sizes and thus cannot be optimized under the decentralized consensus framework without any raw data exchange among agents. To circumvent this major challenge and preserve data privacy, we leverage the random feature (RF) approximation approach to map the large-volume data represented in the RKH space into a smaller RF space, which facilitates the same-size parameter exchange and enables distributed agents to reach consensus on the function decided by the parameters in the RF space. For fast convergent implementation, we design an iterative algorithm for Decentralized Kernel Learning via Alternating direction method of multipliers (DKLA). Further, we develop a COmmunication-censored KErnel learning (COKE) algorithm to reduce the communication load in DKLA. To do so, we apply a communication-censoring strategy, which prevents an agent from transmitting at every iteration unless its local updates are deemed informative. Theoretical results in terms of linear convergence guarantee and generalization performance analysis of DKLA and COKE are provided. Comprehensive tests with both synthetic and real datasets are conducted to verify the communication efficiency and learning effectiveness of COKE.
Tasks
Published 2020-01-28
URL https://arxiv.org/abs/2001.10133v1
PDF https://arxiv.org/pdf/2001.10133v1.pdf
PWC https://paperswithcode.com/paper/coke-communication-censored-kernel-learning
Repo
Framework
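
The random feature (RF) mapping that lets agents exchange same-size parameters is, in its common random Fourier feature form for a Gaussian kernel, z(x) = sqrt(2/D) cos(Wx + b). A minimal numpy sketch with an illustrative bandwidth follows; the ADMM consensus updates and the communication-censoring rule of DKLA/COKE are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_fourier_features(X, n_features=200, bandwidth=1.0):
    """Map rows of X into an RF space approximating a Gaussian kernel."""
    d = X.shape[1]
    W = rng.normal(0.0, 1.0 / bandwidth, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = rng.standard_normal((500, 5))
Z = random_fourier_features(X)
# The linear kernel in RF space approximates the Gaussian kernel on X.
K_approx = Z @ Z.T
K_exact = np.exp(-0.5 * np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
print("mean approximation error:", np.abs(K_approx - K_exact).mean())
```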

Discriminative Feature and Dictionary Learning with Part-aware Model for Vehicle Re-identification

Title Discriminative Feature and Dictionary Learning with Part-aware Model for Vehicle Re-identification
Authors Huibing Wang, Jinjia Peng, Guangqi Jiang, Fengqiang Xu, Xianping Fu
Abstract With the development of smart cities, urban surveillance video analysis will play an increasingly significant role in intelligent transportation systems. Identifying the same target vehicle across large datasets captured by non-overlapping cameras has therefore grown into a hot topic in promoting intelligent transportation systems. However, vehicle re-identification (re-ID) is a challenging task since vehicles of the same design or manufacturer show similar appearance. To fill these gaps, we tackle this challenge by proposing a Triplet Center Loss based Part-aware Model (TCPM) that leverages the discriminative features in part details of vehicles to refine the accuracy of vehicle re-identification. TCPM is based on part discovery: it partitions the vehicle along horizontal and vertical directions to strengthen the details of the vehicle and reinforce the internal consistency of its parts. In addition, to eliminate intra-class differences in local regions of the vehicle, we propose external memory modules that emphasize the consistency of each part to learn discriminative features, forming a global dictionary over all categories in the dataset. In TCPM, a triplet-center loss is introduced to ensure that each extracted part feature has intra-class consistency and inter-class separability. Experimental results show that our proposed TCPM significantly outperforms existing state-of-the-art methods on the benchmark datasets VehicleID and VeRi-776.
Tasks Dictionary Learning, Vehicle Re-Identification
Published 2020-03-16
URL https://arxiv.org/abs/2003.07139v1
PDF https://arxiv.org/pdf/2003.07139v1.pdf
PWC https://paperswithcode.com/paper/discriminative-feature-and-dictionary
Repo
Framework
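
A minimal PyTorch sketch of a triplet-center loss of the kind TCPM uses: each embedding is pulled toward its own class center and pushed away from the nearest other center by a margin. The class count, feature dimension, and margin are illustrative, and the part partitioning and external memory modules are not shown.

```python
import torch
import torch.nn as nn

class TripletCenterLoss(nn.Module):
    def __init__(self, num_classes, feat_dim, margin=1.0):
        super().__init__()
        self.margin = margin
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features, labels):
        # Squared Euclidean distance from each feature to every class center.
        dists = torch.cdist(features, self.centers) ** 2          # (B, C)
        pos = dists.gather(1, labels.view(-1, 1)).squeeze(1)      # own-center distance
        # Mask out the own-class column, then take the closest other center.
        masked = dists.clone()
        masked.scatter_(1, labels.view(-1, 1), float('inf'))
        neg = masked.min(dim=1).values
        # Hinge: own center must be closer than any other center by `margin`.
        return torch.clamp(pos - neg + self.margin, min=0.0).mean()

if __name__ == "__main__":
    loss_fn = TripletCenterLoss(num_classes=5, feat_dim=128)
    feats = torch.randn(8, 128)
    labels = torch.randint(0, 5, (8,))
    print(loss_fn(feats, labels).item())
```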

3D Dynamic Point Cloud Denoising via Spatial-Temporal Graph Learning

Title 3D Dynamic Point Cloud Denoising via Spatial-Temporal Graph Learning
Authors Wei Hu, Qianjiang Hu, Zehua Wang, Xiang Gao
Abstract The prevalence of accessible depth sensing and 3D laser scanning techniques has enabled the convenient acquisition of 3D dynamic point clouds, which provide an efficient representation of arbitrarily-shaped objects in motion. Nevertheless, dynamic point clouds are often perturbed by noise due to hardware, software or other causes. While a plethora of methods have been proposed for static point cloud denoising, few efforts have been made for the denoising of dynamic point clouds with a varying number of irregularly-sampled points in each frame. In this paper, we represent dynamic point clouds naturally on graphs and address the denoising problem by inferring the underlying graph via spatio-temporal graph learning, exploiting both intra-frame similarity and inter-frame consistency. Firstly, assuming the availability of a relevant feature vector per node, we pose spatial-temporal graph learning as optimizing a Mahalanobis distance metric M, formulated as the minimization of a graph Laplacian regularizer. Secondly, to ease the optimization of the symmetric and positive definite metric matrix M, we decompose it as M = R^T R and solve for R instead via proximal gradient. Finally, based on the spatial-temporal graph learning, we formulate dynamic point cloud denoising as the joint optimization of the desired point cloud and the underlying spatio-temporal graph, which leverages both intra-frame affinities and inter-frame consistency and is solved via alternating minimization. Experimental results show that the proposed method significantly outperforms independent per-frame denoising by state-of-the-art static point cloud denoising approaches.
Tasks Denoising
Published 2020-03-17
URL https://arxiv.org/abs/2003.08355v1
PDF https://arxiv.org/pdf/2003.08355v1.pdf
PWC https://paperswithcode.com/paper/3d-dynamic-point-cloud-denoising-via-spatial
Repo
Framework
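
A small numpy sketch of the building block described above: per-node features define edge weights through a Mahalanobis metric M = R^T R, the weights define a graph Laplacian L, and x^T L x acts as the graph Laplacian regularizer. The sizes are illustrative, R is random rather than learned, and the proximal-gradient update of R and the alternating minimization are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 6, 4                        # number of points, feature dimension (illustrative)
F = rng.standard_normal((n, d))    # per-node feature vectors
x = rng.standard_normal(n)         # a signal on the graph (e.g. one coordinate)

R = rng.standard_normal((d, d))
M = R.T @ R                        # symmetric positive semi-definite metric

# Edge weights from the Mahalanobis distance between node features.
diff = F[:, None, :] - F[None, :, :]                  # (n, n, d)
dist2 = np.einsum('ijk,kl,ijl->ij', diff, M, diff)    # (f_i - f_j)^T M (f_i - f_j)
W = np.exp(-dist2)
np.fill_diagonal(W, 0.0)

# Combinatorial graph Laplacian and the regularizer x^T L x.
L = np.diag(W.sum(axis=1)) - W
print("graph Laplacian regularizer:", x @ L @ x)
```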

Gated Texture CNN for Efficient and Configurable Image Denoising

Title Gated Texture CNN for Efficient and Configurable Image Denoising
Authors Kaito Imai, Takamichi Miyata
Abstract Convolutional neural network (CNN)-based image denoising methods typically estimate the noise component contained in a noisy input image and restore a clean image by subtracting the estimated noise from the input. However, previous denoising methods tend to remove high-frequency information (e.g., textures) from the input, because the intermediate feature maps of the CNN contain texture information. A straightforward approach to this problem is to stack numerous layers, which leads to a high computational cost. To achieve both high performance and computational efficiency, we propose a gated texture CNN (GTCNN), which is designed to carefully exclude texture information from each intermediate feature map of the CNN by incorporating gating mechanisms. Our GTCNN achieves state-of-the-art performance with 4.8 times fewer parameters than previous state-of-the-art methods. Furthermore, the GTCNN allows us to interactively control the texture strength in the output image without any additional modules, training, or computational costs.
Tasks Denoising, Image Denoising
Published 2020-03-16
URL https://arxiv.org/abs/2003.07042v1
PDF https://arxiv.org/pdf/2003.07042v1.pdf
PWC https://paperswithcode.com/paper/gated-texture-cnn-for-efficient-and
Repo
Framework
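
The gating idea in the abstract (multiply each intermediate feature map by a learned soft gate so texture-carrying responses can be suppressed, and scale the gate at inference to control texture strength) can be sketched in a few lines of PyTorch. This is a generic gated convolution block under assumed layer sizes, not the authors' exact GTCNN architecture.

```python
import torch
import torch.nn as nn

class GatedConvBlock(nn.Module):
    """Feature branch modulated elementwise by a sigmoid gate branch."""
    def __init__(self, channels):
        super().__init__()
        self.feature = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.gate = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x, gate_scale=1.0):
        feat = torch.relu(self.feature(x))
        gate = torch.sigmoid(self.gate(x))
        # gate_scale allows interactively weakening/strengthening the gated content.
        return feat * (gate * gate_scale)

if __name__ == "__main__":
    block = GatedConvBlock(channels=16)
    x = torch.randn(1, 16, 32, 32)
    print(block(x).shape)          # torch.Size([1, 16, 32, 32])
```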

On Interpretability of Artificial Neural Networks

Title On Interpretability of Artificial Neural Networks
Authors Fenglei Fan, Jinjun Xiong, Ge Wang
Abstract Deep learning has achieved great success in many important areas dealing with text, images, video, graphs, and so on. However, the black-box nature of deep artificial neural networks has become the primary obstacle to their public acceptance and wide popularity in critical applications such as diagnosis and therapy. Due to the huge potential of deep learning, interpreting neural networks has become one of the most critical research directions. In this paper, we systematically review recent studies in understanding the mechanism of neural networks and shed light on some future directions of interpretability research (This work is still in progress).
Tasks
Published 2020-01-08
URL https://arxiv.org/abs/2001.02522v1
PDF https://arxiv.org/pdf/2001.02522v1.pdf
PWC https://paperswithcode.com/paper/on-interpretability-of-artificial-neural
Repo
Framework

Restore from Restored: Video Restoration with Pseudo Clean Video

Title Restore from Restored: Video Restoration with Pseudo Clean Video
Authors Seunghwan Lee, Seobin Park, Donghyeon Cho, Jiwon Kim, Tae Hyun Kim
Abstract In this paper, we propose a self-supervised video denoising method called “restore-from-restored” that fine-tunes a baseline network by using a pseudo clean video at the test phase. The pseudo clean video can be obtained by applying an input noisy video to the pre-trained baseline network. By adopting a fully convolutional network (FCN) as the baseline, we can restore videos without accurate optical flow and registration due to its translation-invariant property, unlike many conventional video restoration methods. Moreover, the proposed method can take advantage of the existence of many similar patches across consecutive frames (i.e., patch-recurrence), which can boost the performance of the baseline network by a large margin. We analyze the restoration performance of the FCN fine-tuned with the proposed self-supervision-based training algorithm, and demonstrate that the FCN can utilize recurring patches without the need for registration among adjacent frames. The proposed method can be applied to any FCN-based denoising model. In our experiments, we apply the proposed method to state-of-the-art denoisers, and our results indicate a considerable improvement in task performance.
Tasks Denoising, Optical Flow Estimation, Video Denoising
Published 2020-03-09
URL https://arxiv.org/abs/2003.04279v1
PDF https://arxiv.org/pdf/2003.04279v1.pdf
PWC https://paperswithcode.com/paper/restore-from-restored-video-restoration-with
Repo
Framework
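
The test-time fine-tuning loop described above is straightforward to sketch: run the pre-trained denoiser on the noisy video to obtain a pseudo-clean target, then fine-tune the same network to map the noisy frames to those targets. The PyTorch outline below uses placeholder tensors and a trivial stand-in network; `denoiser` represents any pre-trained fully convolutional model, and the step count and learning rate are assumptions.

```python
import torch
import torch.nn as nn

def restore_from_restored(denoiser, noisy_frames, steps=20, lr=1e-5):
    """Fine-tune `denoiser` at test time using its own outputs as pseudo-clean targets.

    noisy_frames: tensor of shape (T, C, H, W) holding one test video.
    """
    with torch.no_grad():
        pseudo_clean = denoiser(noisy_frames)        # initial restored video

    opt = torch.optim.Adam(denoiser.parameters(), lr=lr)
    loss_fn = nn.L1Loss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(denoiser(noisy_frames), pseudo_clean)
        loss.backward()
        opt.step()

    with torch.no_grad():
        return denoiser(noisy_frames)                # final restored video

if __name__ == "__main__":
    # A trivial stand-in for a pre-trained FCN denoiser.
    denoiser = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                             nn.Conv2d(16, 3, 3, padding=1))
    noisy = torch.rand(4, 3, 64, 64)
    print(restore_from_restored(denoiser, noisy).shape)
```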

Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN

Title Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN
Authors Hang Xu, Linpu Fang, Xiaodan Liang, Wenxiong Kang, Zhenguo Li
Abstract The dominant object detection approaches treat each dataset separately and fit towards a specific domain, which cannot adapt to other domains without extensive retraining. In this paper, we address the problem of designing a universal object detection model that exploits diverse category granularity from multiple domains and predicts all kinds of categories in one system. Existing works treat this problem by integrating multiple detection branches upon one shared backbone network. However, this paradigm overlooks the crucial semantic correlations between multiple domains, such as category hierarchy, visual similarity, and linguistic relationship. To address these drawbacks, we present a novel universal object detector called Universal-RCNN that incorporates graph transfer learning for propagating relevant semantic information across multiple datasets to reach semantic coherency. Specifically, we first generate a global semantic pool by integrating the high-level semantic representations of all categories. Then an Intra-Domain Reasoning Module learns and propagates the sparse graph representation within one dataset, guided by a spatial-aware GCN. Finally, an Inter-Domain Transfer Module is proposed to exploit diverse transfer dependencies across all domains and enhance the regional feature representation by attending to and transferring semantic contexts globally. Extensive experiments demonstrate that the proposed method significantly outperforms multiple-branch models and achieves state-of-the-art results on multiple object detection benchmarks (mAP: 49.1% on COCO).
Tasks Object Detection, Transfer Learning
Published 2020-02-18
URL https://arxiv.org/abs/2002.07417v1
PDF https://arxiv.org/pdf/2002.07417v1.pdf
PWC https://paperswithcode.com/paper/universal-rcnn-universal-object-detector-via
Repo
Framework
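
The graph propagation both modules rely on can be illustrated with a single normalized-adjacency GCN step over category embeddings, H' = ReLU(D^{-1/2} A D^{-1/2} H W). The numpy sketch below uses a toy random category graph and random embeddings; the spatial-aware attention and the cross-domain transfer of Universal-RCNN are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

n_categories, in_dim, out_dim = 6, 32, 16        # illustrative sizes
H = rng.standard_normal((n_categories, in_dim))  # category semantic embeddings

# Toy symmetric category graph with self-loops.
A = (rng.random((n_categories, n_categories)) > 0.6).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 1.0)

# Symmetric normalization D^{-1/2} A D^{-1/2}.
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_norm = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

W = rng.standard_normal((in_dim, out_dim))
H_next = np.maximum(A_norm @ H @ W, 0.0)         # one GCN propagation step
print(H_next.shape)                              # (6, 16)
```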

A Survey towards Federated Semi-supervised Learning

Title A Survey towards Federated Semi-supervised Learning
Authors Yilun Jin, Xiguang Wei, Yang Liu, Qiang Yang
Abstract The success of Artificial Intelligence (AI) should be largely attributed to the accessibility of abundant data. However, this is not exactly the case in reality, where it is common for developers in industry to face insufficient, incomplete and isolated data. Consequently, federated learning was proposed to alleviate such challenges by allowing multiple parties to collaboratively build machine learning models without explicitly sharing their data, thereby preserving data privacy. However, existing algorithms of federated learning mainly focus on scenarios where either the data do not require explicit labeling or all data are labeled. Yet in reality, we are often confronted with the case that labeling data is itself costly and there is no sufficient supply of labeled data. While such issues are commonly solved by semi-supervised learning, to the best of our knowledge, no existing effort has been devoted to federated semi-supervised learning. In this survey, we briefly summarize prevalent semi-supervised algorithms and offer a brief outlook on federated semi-supervised learning, including possible methodologies, settings and challenges.
Tasks
Published 2020-02-26
URL https://arxiv.org/abs/2002.11545v1
PDF https://arxiv.org/pdf/2002.11545v1.pdf
PWC https://paperswithcode.com/paper/a-survey-towards-federated-semi-supervised
Repo
Framework

Unsupervised Dictionary Learning for Anomaly Detection

Title Unsupervised Dictionary Learning for Anomaly Detection
Authors Paul Irofti, Andra Băltoiu
Abstract We investigate the possibilities of employing dictionary learning to address the requirements of most anomaly detection applications, such as the absence of supervision, online formulations, and low false positive rates. We present new results of our recent semi-supervised online algorithm, TODDLeR, on an anti-money laundering application. We also introduce a novel unsupervised method that uses the performance of the learning algorithm as an indication of the nature of the samples.
Tasks Anomaly Detection, Dictionary Learning
Published 2020-02-29
URL https://arxiv.org/abs/2003.00293v1
PDF https://arxiv.org/pdf/2003.00293v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-dictionary-learning-for-anomaly
Repo
Framework
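
One common way to use dictionary learning for unsupervised anomaly detection is to score each sample by its sparse-coding reconstruction error under a dictionary fitted to (presumed) normal data. The scikit-learn sketch below shows that generic baseline with assumed hyperparameters and synthetic data; it is not the TODDLeR algorithm from the paper.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)

# Mostly "normal" data lying near a low-dimensional subspace, plus a few outliers.
basis = rng.standard_normal((5, 20))
normal = rng.standard_normal((300, 5)) @ basis
outliers = rng.standard_normal((10, 20)) * 3.0
X = np.vstack([normal, outliers])

dico = MiniBatchDictionaryLearning(n_components=8, transform_algorithm='omp',
                                   transform_n_nonzero_coefs=3, random_state=0)
codes = dico.fit(normal).transform(X)            # fit the dictionary on normal data only
recon = codes @ dico.components_

# Anomaly score: reconstruction error under the learned dictionary.
scores = np.linalg.norm(X - recon, axis=1)
print("top-10 suspected anomalies:", np.argsort(scores)[-10:])
```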

Better Captioning with Sequence-Level Exploration

Title Better Captioning with Sequence-Level Exploration
Authors Jia Chen, Qin Jin
Abstract The sequence-level learning objective has been widely used in captioning tasks to achieve state-of-the-art performance for many models. Under this objective, the model is trained with a reward on the quality of its generated captions (sequence level). In this work, we show the limitation of the current sequence-level learning objective for captioning tasks from both theoretical and empirical perspectives. In theory, we show that the current objective is equivalent to optimizing only the precision side of the caption set generated by the model and therefore overlooks the recall side. Empirical results show that a model trained with this objective tends to score lower on the recall side. We propose to add a sequence-level exploration term to the current objective to boost recall; it guides the model to explore more plausible captions during training. In this way, the proposed objective takes both the precision and recall sides of generated captions into account. Experiments show the effectiveness of the proposed method on both video and image captioning datasets.
Tasks Image Captioning
Published 2020-03-08
URL https://arxiv.org/abs/2003.03749v1
PDF https://arxiv.org/pdf/2003.03749v1.pdf
PWC https://paperswithcode.com/paper/better-captioning-with-sequence-level
Repo
Framework
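
Sequence-level (reward-based) training is typically implemented with a REINFORCE-style loss on sampled captions, and one generic way to encourage exploration is to add an entropy bonus to that loss. The PyTorch sketch below shows that generic combination; it is not the paper's specific sequence-level exploration term, and all tensor names and shapes are illustrative.

```python
import torch

def sequence_level_loss(log_probs, sampled_tokens, rewards, baseline, entropy_weight=0.01):
    """REINFORCE-style sequence loss with an entropy bonus to encourage exploration.

    log_probs:      (B, T, V) log-probabilities over the vocabulary at each step
    sampled_tokens: (B, T) tokens of the sampled captions
    rewards:        (B,) sequence-level reward (e.g. CIDEr) of each sampled caption
    baseline:       (B,) baseline reward (e.g. reward of the greedy caption)
    """
    # Log-probability of each sampled caption.
    seq_logp = log_probs.gather(-1, sampled_tokens.unsqueeze(-1)).squeeze(-1).sum(dim=1)
    advantage = (rewards - baseline).detach()
    policy_loss = -(advantage * seq_logp).mean()
    # Entropy of the per-step distributions; higher entropy -> more exploration.
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()
    return policy_loss - entropy_weight * entropy

if __name__ == "__main__":
    B, T, V = 4, 7, 100
    log_probs = torch.log_softmax(torch.randn(B, T, V, requires_grad=True), dim=-1)
    sampled_tokens = torch.randint(0, V, (B, T))
    rewards, baseline = torch.rand(B), torch.rand(B)
    print(sequence_level_loss(log_probs, sampled_tokens, rewards, baseline).item())
```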

Deep Learning-Based Solvability of Underdetermined Inverse Problems in Medical Imaging

Title Deep Learning-Based Solvability of Underdetermined Inverse Problems in Medical Imaging
Authors Chang Min Hyun, Seong Hyeon Baek, Mingyu Lee, Sung Min Lee, Jin Keun Seo
Abstract Recently, with the significant developments in deep learning techniques, solving underdetermined inverse problems has become one of the major concerns in the medical imaging domain. Typical examples include undersampled magnetic resonance imaging, interior tomography, and sparse-view computed tomography, where deep learning techniques have achieved excellent performance. Although deep learning methods appear to overcome the limitations of existing mathematical methods when handling various underdetermined problems, there is a lack of rigorous mathematical foundations that would allow us to elucidate the reasons for the remarkable performance of deep learning methods. This study focuses on learning the causal relationship regarding the structure of the training data suitable for deep learning, to solve highly underdetermined inverse problems. We observe that a majority of the problems of solving underdetermined linear systems in medical imaging are highly non-linear. Furthermore, we analyze whether a desired reconstruction map is learnable from the training data and the underdetermined system.
Tasks
Published 2020-01-06
URL https://arxiv.org/abs/2001.01432v2
PDF https://arxiv.org/pdf/2001.01432v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-based-solvability-of
Repo
Framework
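
For concreteness, "underdetermined" here means recovering x from measurements y = Ax with far fewer measurements than unknowns, so A has a non-trivial null space and y alone cannot determine x. A tiny numpy illustration follows, with a random A standing in for an undersampled forward operator (e.g. partial Fourier sampling); the sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 100, 30                      # 100 unknowns, only 30 measurements
A = rng.standard_normal((m, n))     # stand-in for an undersampled forward operator
x_true = rng.standard_normal(n)
y = A @ x_true

# Minimum-norm solution via the pseudoinverse: consistent with y, but not x_true.
x_mn = np.linalg.pinv(A) @ y
print("data mismatch:", np.linalg.norm(A @ x_mn - y))        # ~0
print("recovery error:", np.linalg.norm(x_mn - x_true))      # large: null-space part lost
# Any x_mn + v with v in the null space of A fits the data equally well,
# which is why prior knowledge (e.g. learned from training data) is needed.
```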

Do you comply with AI? – Personalized explanations of learning algorithms and their impact on employees’ compliance behavior

Title Do you comply with AI? – Personalized explanations of learning algorithms and their impact on employees’ compliance behavior
Authors Niklas Kühl, Jodie Lobana, Christian Meske
Abstract Machine Learning algorithms are technological key enablers for artificial intelligence (AI). Due to the inherent complexity, these learning algorithms represent black boxes and are difficult to comprehend, therefore influencing compliance behavior. Hence, compliance with the recommendations of such artifacts, which can impact employees’ task performance significantly, is still subject to research - and personalization of AI explanations seems to be a promising concept in this regard. In our work, we hypothesize that, based on varying backgrounds like training, domain knowledge and demographic characteristics, individuals have different understandings and hence mental models about the learning algorithm. Personalization of AI explanations, related to the individuals’ mental models, may thus be an instrument to affect compliance and therefore employee task performance. Our preliminary results already indicate the importance of personalized explanations in industry settings and emphasize the importance of this research endeavor.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.08777v1
PDF https://arxiv.org/pdf/2002.08777v1.pdf
PWC https://paperswithcode.com/paper/do-you-comply-with-ai-personalized
Repo
Framework