April 2, 2020

3093 words 15 mins read

Paper Group ANR 371

HOTCAKE: Higher Order Tucker Articulated Kernels for Deeper CNN Compression. Efficient Topological Layer based on Persistent Landscapes. Classification of the Chinese Handwritten Numbers with Supervised Projective Dictionary Pair Learning. COKE: Communication-Censored Kernel Learning for Decentralized Non-parametric Learning. Discriminative Feature …

HOTCAKE: Higher Order Tucker Articulated Kernels for Deeper CNN Compression

Title HOTCAKE: Higher Order Tucker Articulated Kernels for Deeper CNN Compression
Authors Rui Lin, Ching-Yun Ko, Zhuolun He, Cong Chen, Yuan Cheng, Hao Yu, Graziano Chesi, Ngai Wong
Abstract The emergence of edge computing has promoted immense interest in compacting a neural network without sacrificing much accuracy. In this regard, low-rank tensor decomposition constitutes a powerful tool to compress convolutional neural networks (CNNs) by decomposing the 4-way kernel tensor into multi-stage smaller ones. Building on top of Tucker-2 decomposition, we propose a generalized Higher Order Tucker Articulated Kernels (HOTCAKE) scheme comprising four steps: input channel decomposition, guided Tucker rank selection, higher order Tucker decomposition and fine-tuning. By subjecting each CONV layer to HOTCAKE, a highly compressed CNN model with a graceful accuracy trade-off is obtained. Experiments show HOTCAKE can compress even pre-compressed models and produce state-of-the-art lightweight networks.
Tasks
Published 2020-02-28
URL https://arxiv.org/abs/2002.12663v1
PDF https://arxiv.org/pdf/2002.12663v1.pdf
PWC https://paperswithcode.com/paper/hotcake-higher-order-tucker-articulated
Repo
Framework
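
The core operation behind HOTCAKE-style compression is a Tucker-2-type factorization of the 4-way convolution kernel into two channel factors and a small core. Below is a minimal numpy sketch of that step via truncated SVDs on the channel unfoldings; the ranks are illustrative placeholders, and the guided rank selection, input channel decomposition, and fine-tuning stages of HOTCAKE are not shown.

```python
import numpy as np

def tucker2_conv_kernel(W, r_out, r_in):
    """Approximate a conv kernel W of shape (C_out, C_in, kH, kW) as
    U_out x core x U_in, i.e. a 1x1 conv, a small spatial conv, and a 1x1 conv."""
    C_out, C_in, kH, kW = W.shape

    # Mode-0 unfolding (output channels) and truncated SVD.
    W0 = W.reshape(C_out, -1)
    U_out = np.linalg.svd(W0, full_matrices=False)[0][:, :r_out]   # (C_out, r_out)

    # Mode-1 unfolding (input channels) and truncated SVD.
    W1 = np.transpose(W, (1, 0, 2, 3)).reshape(C_in, -1)
    U_in = np.linalg.svd(W1, full_matrices=False)[0][:, :r_in]     # (C_in, r_in)

    # Core tensor: project W onto the two factor subspaces.
    core = np.einsum('oikl,or,is->rskl', W, U_out, U_in)           # (r_out, r_in, kH, kW)

    # Reconstruction, to check the approximation error.
    W_hat = np.einsum('rskl,or,is->oikl', core, U_out, U_in)
    rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
    return U_out, core, U_in, rel_err

if __name__ == "__main__":
    W = np.random.randn(64, 32, 3, 3)
    U_out, core, U_in, rel_err = tucker2_conv_kernel(W, r_out=16, r_in=8)
    print(core.shape, rel_err)
```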

Efficient Topological Layer based on Persistent Landscapes

Title Efficient Topological Layer based on Persistent Landscapes
Authors Kwangho Kim, Jisu Kim, Joon Sik Kim, Frederic Chazal, Larry Wasserman
Abstract We propose a novel topological layer for general deep learning models based on persistent landscapes, in which we can efficiently exploit underlying topological features of the input data structure. We use the robust DTM function and show differentiability with respect to layer inputs, for a general persistent homology with arbitrary filtration. Thus, our proposed layer can be placed anywhere in the network architecture and feed critical information on the topological features of input data into subsequent layers to improve the learnability of the networks toward a given task. A task-optimal structure of the topological layer is learned during training via backpropagation, without requiring any input featurization or data preprocessing. We provide a tight stability theorem, and show that the proposed layer is robust towards noise and outliers. We demonstrate the effectiveness of our approach by classification experiments on various datasets.
Tasks
Published 2020-02-07
URL https://arxiv.org/abs/2002.02778v1
PDF https://arxiv.org/pdf/2002.02778v1.pdf
PWC https://paperswithcode.com/paper/efficient-topological-layer-based-on
Repo
Framework
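
For reference, a persistence landscape replaces each point (b, d) of a persistence diagram with a tent function and takes, at every t, the k-th largest tent value. The sketch below evaluates the first k landscape functions on a grid with numpy, using a toy diagram; it is not the DTM-based, differentiable layer proposed in the paper.

```python
import numpy as np

def persistence_landscape(diagram, grid, k=3):
    """Evaluate landscape functions lambda_1..lambda_k on `grid`.

    diagram: array of (birth, death) pairs, shape (n, 2)
    grid:    1-D array of t values
    Returns an array of shape (k, len(grid)).
    """
    births, deaths = diagram[:, 0][:, None], diagram[:, 1][:, None]   # (n, 1)
    t = grid[None, :]                                                 # (1, m)
    # Tent function of each diagram point: min(t - b, d - t), clipped at 0.
    tents = np.clip(np.minimum(t - births, deaths - t), 0.0, None)    # (n, m)
    # lambda_k(t) is the k-th largest tent value at t.
    tents_sorted = np.sort(tents, axis=0)[::-1]                       # descending rows
    return tents_sorted[:min(k, tents_sorted.shape[0])]

if __name__ == "__main__":
    diagram = np.array([[0.0, 1.0], [0.2, 0.9], [0.5, 0.6]])
    grid = np.linspace(0.0, 1.0, 101)
    print(persistence_landscape(diagram, grid, k=2).shape)   # (2, 101)
```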

Classification of the Chinese Handwritten Numbers with Supervised Projective Dictionary Pair Learning

Title Classification of the Chinese Handwritten Numbers with Supervised Projective Dictionary Pair Learning
Authors Rasool Ameri, Saideh Ferdowsi, Ali Alameer, Vahid Abolghasemi, Kianoush Nazarpour
Abstract Image classification has become a key ingredient in the field of computer vision. To enhance classification accuracy, current approaches heavily focus on increasing network depth and width, e.g., inception modules, at the cost of computational requirements. To mitigate this problem, in this paper a novel dictionary learning method is proposed and tested with Chinese handwritten numbers. We have considered three important characteristics to design the dictionary: discriminability, sparsity, and classification error. We formulated these metrics into a unified cost function. The proposed architecture i) obtains an efficient sparse code in a novel feature space without relying on $\ell_0$ and $\ell_1$ norms minimisation; and ii) includes the classification error within the cost function as an extra constraint. Experimental results show that the proposed method provides superior classification performance compared to recent dictionary learning methods. With a classification accuracy of $\sim$98%, the results suggest that our proposed sparse learning algorithm achieves comparable performance to existing well-known deep learning methods, e.g., SqueezeNet, GoogLeNet and MobileNetV2, but with a fraction of parameters.
Tasks Dictionary Learning, Image Classification, Sparse Learning
Published 2020-03-26
URL https://arxiv.org/abs/2003.11700v1
PDF https://arxiv.org/pdf/2003.11700v1.pdf
PWC https://paperswithcode.com/paper/classification-of-the-chinese-handwritten
Repo
Framework
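
Classification with a projective dictionary pair reduces to comparing class-wise reconstruction residuals ||x - D_c P_c x||. The numpy sketch below shows only that decision rule, with randomly initialized (untrained) pairs as placeholders; the actual method learns each D_c and P_c jointly under the discriminability, sparsity, and classification-error terms in the unified cost function described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_features, n_atoms = 10, 64, 16    # illustrative sizes

# Placeholder synthesis dictionaries D_c (features x atoms) and
# analysis projections P_c (atoms x features); in practice these are learned.
D = [rng.standard_normal((n_features, n_atoms)) for _ in range(n_classes)]
P = [rng.standard_normal((n_atoms, n_features)) for _ in range(n_classes)]

def classify(x):
    """Assign x to the class whose dictionary pair reconstructs it best."""
    residuals = [np.linalg.norm(x - D_c @ (P_c @ x)) for D_c, P_c in zip(D, P)]
    return int(np.argmin(residuals))

x = rng.standard_normal(n_features)
print("predicted class:", classify(x))
```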

COKE: Communication-Censored Kernel Learning for Decentralized Non-parametric Learning

Title COKE: Communication-Censored Kernel Learning for Decentralized Non-parametric Learning
Authors Ping Xu, Yue Wang, Xiang Chen, Zhi Tian
Abstract This paper studies the decentralized optimization and learning problem where multiple interconnected agents aim to learn an optimal decision function defined over a reproducing kernel Hilbert (RKH) space by jointly minimizing a global objective function, with access to locally observed data only. As a non-parametric approach, kernel learning faces a major challenge in distributed implementation: the decision variables of local objective functions are data-dependent with different sizes and thus cannot be optimized under the decentralized consensus framework without any raw data exchange among agents. To circumvent this major challenge and preserve data privacy, we leverage the random feature (RF) approximation approach to map the large-volume data represented in the RKH space into a smaller RF space, which facilitates the same-size parameter exchange and enables distributed agents to reach consensus on the function decided by the parameters in the RF space. For fast convergent implementation, we design an iterative algorithm for Decentralized Kernel Learning via Alternating direction method of multipliers (DKLA). Further, we develop a COmmunication-censored KErnel learning (COKE) algorithm to reduce the communication load in DKLA. To do so, we apply a communication-censoring strategy, which prevents an agent from transmitting at every iteration unless its local updates are deemed informative. Theoretical results in terms of linear convergence guarantee and generalization performance analysis of DKLA and COKE are provided. Comprehensive tests with both synthetic and real datasets are conducted to verify the communication efficiency and learning effectiveness of COKE.
Tasks
Published 2020-01-28
URL https://arxiv.org/abs/2001.10133v1
PDF https://arxiv.org/pdf/2001.10133v1.pdf
PWC https://paperswithcode.com/paper/coke-communication-censored-kernel-learning
Repo
Framework
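
The random feature (RF) mapping that lets agents exchange same-size parameters is, in its common random Fourier feature form for a Gaussian kernel, z(x) = sqrt(2/D) cos(Wx + b). A minimal numpy sketch with an illustrative bandwidth follows; the ADMM consensus updates and the communication-censoring rule of DKLA/COKE are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_fourier_features(X, n_features=200, bandwidth=1.0):
    """Map rows of X into an RF space approximating a Gaussian kernel."""
    d = X.shape[1]
    W = rng.normal(0.0, 1.0 / bandwidth, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = rng.standard_normal((500, 5))
Z = random_fourier_features(X)
# The linear kernel in RF space approximates the Gaussian kernel on X.
K_approx = Z @ Z.T
K_exact = np.exp(-0.5 * np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
print("mean approximation error:", np.abs(K_approx - K_exact).mean())
```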

Discriminative Feature and Dictionary Learning with Part-aware Model for Vehicle Re-identification

Title Discriminative Feature and Dictionary Learning with Part-aware Model for Vehicle Re-identification
Authors Huibing Wang, Jinjia Peng, Guangqi Jiang, Fengqiang Xu, Xianping Fu
Abstract With the development of smart cities, urban surveillance video analysis will play an increasingly significant role in intelligent transportation systems. Identifying the same target vehicle across large datasets captured by non-overlapping cameras has therefore grown into a hot topic in promoting intelligent transportation systems. However, vehicle re-identification (re-ID) is a challenging task since vehicles of the same design or manufacturer show similar appearance. To fill these gaps, we tackle this challenge by proposing a Triplet Center Loss based Part-aware Model (TCPM) that leverages the discriminative features in part details of vehicles to refine the accuracy of vehicle re-identification. TCPM is based on part discovery: it partitions the vehicle along horizontal and vertical directions to strengthen the details of the vehicle and reinforce the internal consistency of its parts. In addition, to eliminate intra-class differences in local regions of the vehicle, we propose external memory modules that emphasize the consistency of each part to learn discriminative features, forming a global dictionary over all categories in the dataset. In TCPM, a triplet-center loss is introduced to ensure that each extracted part feature has intra-class consistency and inter-class separability. Experimental results show that our proposed TCPM significantly outperforms existing state-of-the-art methods on the benchmark datasets VehicleID and VeRi-776.
Tasks Dictionary Learning, Vehicle Re-Identification
Published 2020-03-16
URL https://arxiv.org/abs/2003.07139v1
PDF https://arxiv.org/pdf/2003.07139v1.pdf
PWC https://paperswithcode.com/paper/discriminative-feature-and-dictionary
Repo
Framework
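
A minimal PyTorch sketch of a triplet-center loss of the kind TCPM uses: each embedding is pulled toward its own class center and pushed away from the nearest other center by a margin. The class count, feature dimension, and margin are illustrative, and the part partitioning and external memory modules are not shown.

```python
import torch
import torch.nn as nn

class TripletCenterLoss(nn.Module):
    def __init__(self, num_classes, feat_dim, margin=1.0):
        super().__init__()
        self.margin = margin
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features, labels):
        # Squared Euclidean distance from each feature to every class center.
        dists = torch.cdist(features, self.centers) ** 2          # (B, C)
        pos = dists.gather(1, labels.view(-1, 1)).squeeze(1)      # own-center distance
        # Mask out the own-class column, then take the closest other center.
        masked = dists.clone()
        masked.scatter_(1, labels.view(-1, 1), float('inf'))
        neg = masked.min(dim=1).values
        # Hinge: own center must be closer than any other center by `margin`.
        return torch.clamp(pos - neg + self.margin, min=0.0).mean()

if __name__ == "__main__":
    loss_fn = TripletCenterLoss(num_classes=5, feat_dim=128)
    feats = torch.randn(8, 128)
    labels = torch.randint(0, 5, (8,))
    print(loss_fn(feats, labels).item())
```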

3D Dynamic Point Cloud Denoising via Spatial-Temporal Graph Learning

Title 3D Dynamic Point Cloud Denoising via Spatial-Temporal Graph Learning
Authors Wei Hu, Qianjiang Hu, Zehua Wang, Xiang Gao
Abstract The prevalence of accessible depth sensing and 3D laser scanning techniques has enabled the convenient acquisition of 3D dynamic point clouds, which provide an efficient representation of arbitrarily-shaped objects in motion. Nevertheless, dynamic point clouds are often perturbed by noise due to hardware, software or other causes. While a plethora of methods have been proposed for static point cloud denoising, few efforts have been made for the denoising of dynamic point clouds with a varying number of irregularly-sampled points in each frame. In this paper, we represent dynamic point clouds naturally on graphs and address the denoising problem by inferring the underlying graph via spatio-temporal graph learning, exploiting both intra-frame similarity and inter-frame consistency. Firstly, assuming the availability of a relevant feature vector per node, we pose spatial-temporal graph learning as optimizing a Mahalanobis distance metric M, formulated as the minimization of a graph Laplacian regularizer. Secondly, to ease the optimization of the symmetric and positive definite metric matrix M, we decompose it as M = R^T R and solve for R instead via proximal gradient. Finally, based on the spatial-temporal graph learning, we formulate dynamic point cloud denoising as the joint optimization of the desired point cloud and the underlying spatio-temporal graph, which leverages both intra-frame affinities and inter-frame consistency and is solved via alternating minimization. Experimental results show that the proposed method significantly outperforms independent per-frame denoising by state-of-the-art static point cloud denoising approaches.
Tasks Denoising
Published 2020-03-17
URL https://arxiv.org/abs/2003.08355v1
PDF https://arxiv.org/pdf/2003.08355v1.pdf
PWC https://paperswithcode.com/paper/3d-dynamic-point-cloud-denoising-via-spatial
Repo
Framework
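
A small numpy sketch of the building block described above: per-node features define edge weights through a Mahalanobis metric M = R^T R, the weights define a graph Laplacian L, and x^T L x acts as the graph Laplacian regularizer. The sizes are illustrative, R is random rather than learned, and the proximal-gradient update of R and the alternating minimization are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 6, 4                        # number of points, feature dimension (illustrative)
F = rng.standard_normal((n, d))    # per-node feature vectors
x = rng.standard_normal(n)         # a signal on the graph (e.g. one coordinate)

R = rng.standard_normal((d, d))
M = R.T @ R                        # symmetric positive semi-definite metric

# Edge weights from the Mahalanobis distance between node features.
diff = F[:, None, :] - F[None, :, :]                  # (n, n, d)
dist2 = np.einsum('ijk,kl,ijl->ij', diff, M, diff)    # (f_i - f_j)^T M (f_i - f_j)
W = np.exp(-dist2)
np.fill_diagonal(W, 0.0)

# Combinatorial graph Laplacian and the regularizer x^T L x.
L = np.diag(W.sum(axis=1)) - W
print("graph Laplacian regularizer:", x @ L @ x)
```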

Gated Texture CNN for Efficient and Configurable Image Denoising

Title Gated Texture CNN for Efficient and Configurable Image Denoising
Authors Kaito Imai, Takamichi Miyata
Abstract Convolutional neural network (CNN)-based image denoising methods typically estimate the noise component contained in a noisy input image and restore a clean image by subtracting the estimated noise from the input. However, previous denoising methods tend to remove high-frequency information (e.g., textures) from the input, because the intermediate feature maps of the CNN contain texture information. A straightforward approach to this problem is to stack numerous layers, which leads to a high computational cost. To achieve both high performance and computational efficiency, we propose a gated texture CNN (GTCNN), which is designed to carefully exclude texture information from each intermediate feature map of the CNN by incorporating gating mechanisms. Our GTCNN achieves state-of-the-art performance with 4.8 times fewer parameters than previous state-of-the-art methods. Furthermore, the GTCNN allows us to interactively control the texture strength in the output image without any additional modules, training, or computational costs.
Tasks Denoising, Image Denoising
Published 2020-03-16
URL https://arxiv.org/abs/2003.07042v1
PDF https://arxiv.org/pdf/2003.07042v1.pdf
PWC https://paperswithcode.com/paper/gated-texture-cnn-for-efficient-and
Repo
Framework
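
The gating idea in the abstract (multiply each intermediate feature map by a learned soft gate so texture-carrying responses can be suppressed, and scale the gate at inference to control texture strength) can be sketched in a few lines of PyTorch. This is a generic gated convolution block under assumed layer sizes, not the authors' exact GTCNN architecture.

```python
import torch
import torch.nn as nn

class GatedConvBlock(nn.Module):
    """Feature branch modulated elementwise by a sigmoid gate branch."""
    def __init__(self, channels):
        super().__init__()
        self.feature = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.gate = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x, gate_scale=1.0):
        feat = torch.relu(self.feature(x))
        gate = torch.sigmoid(self.gate(x))
        # gate_scale allows interactively weakening/strengthening the gated content.
        return feat * (gate * gate_scale)

if __name__ == "__main__":
    block = GatedConvBlock(channels=16)
    x = torch.randn(1, 16, 32, 32)
    print(block(x).shape)          # torch.Size([1, 16, 32, 32])
```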

On Interpretability of Artificial Neural Networks

Title On Interpretability of Artificial Neural Networks
Authors Fenglei Fan, Jinjun Xiong, Ge Wang
Abstract Deep learning has achieved great success in many important areas dealing with text, images, video, graphs, and so on. However, the black-box nature of deep artificial neural networks has become the primary obstacle to their public acceptance and wide popularity in critical applications such as diagnosis and therapy. Due to the huge potential of deep learning, interpreting neural networks has become one of the most critical research directions. In this paper, we systematically review recent studies in understanding the mechanism of neural networks and shed light on some future directions of interpretability research (This work is still in progress).
Tasks
Published 2020-01-08
URL https://arxiv.org/abs/2001.02522v1
PDF https://arxiv.org/pdf/2001.02522v1.pdf
PWC https://paperswithcode.com/paper/on-interpretability-of-artificial-neural
Repo
Framework

Restore from Restored: Video Restoration with Pseudo Clean Video

Title Restore from Restored: Video Restoration with Pseudo Clean Video
Authors Seunghwan Lee, Seobin Park, Donghyeon Cho, Jiwon Kim, Tae Hyun Kim
Abstract In this paper, we propose a self-supervised video denoising method called “restore-from-restored” that fine-tunes a baseline network by using a pseudo clean video at the test phase. The pseudo clean video can be obtained by applying an input noisy video to the pre-trained baseline network. By adopting a fully convolutional network (FCN) as the baseline, we can restore videos without accurate optical flow and registration due to its translation-invariant property, unlike many conventional video restoration methods. Moreover, the proposed method can take advantage of the existence of many similar patches across consecutive frames (i.e., patch-recurrence), which can boost the performance of the baseline network by a large margin. We analyze the restoration performance of the FCN fine-tuned with the proposed self-supervision-based training algorithm, and demonstrate that the FCN can utilize recurring patches without the need for registration among adjacent frames. The proposed method can be applied to any FCN-based denoising model. In our experiments, we apply the proposed method to state-of-the-art denoisers, and our results indicate a considerable improvement in task performance.
Tasks Denoising, Optical Flow Estimation, Video Denoising
Published 2020-03-09
URL https://arxiv.org/abs/2003.04279v1
PDF https://arxiv.org/pdf/2003.04279v1.pdf
PWC https://paperswithcode.com/paper/restore-from-restored-video-restoration-with
Repo
Framework
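
The test-time fine-tuning loop described above is straightforward to sketch: run the pre-trained denoiser on the noisy video to obtain a pseudo-clean target, then fine-tune the same network to map the noisy frames to those targets. The PyTorch outline below uses placeholder tensors and a trivial stand-in network; `denoiser` represents any pre-trained fully convolutional model, and the step count and learning rate are assumptions.

```python
import torch
import torch.nn as nn

def restore_from_restored(denoiser, noisy_frames, steps=20, lr=1e-5):
    """Fine-tune `denoiser` at test time using its own outputs as pseudo-clean targets.

    noisy_frames: tensor of shape (T, C, H, W) holding one test video.
    """
    with torch.no_grad():
        pseudo_clean = denoiser(noisy_frames)        # initial restored video

    opt = torch.optim.Adam(denoiser.parameters(), lr=lr)
    loss_fn = nn.L1Loss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(denoiser(noisy_frames), pseudo_clean)
        loss.backward()
        opt.step()

    with torch.no_grad():
        return denoiser(noisy_frames)                # final restored video

if __name__ == "__main__":
    # A trivial stand-in for a pre-trained FCN denoiser.
    denoiser = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                             nn.Conv2d(16, 3, 3, padding=1))
    noisy = torch.rand(4, 3, 64, 64)
    print(restore_from_restored(denoiser, noisy).shape)
```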

Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN

Title Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN
Authors Hang Xu, Linpu Fang, Xiaodan Liang, Wenxiong Kang, Zhenguo Li
Abstract The dominant object detection approaches treat each dataset separately and fit towards a specific domain, which cannot adapt to other domains without extensive retraining. In this paper, we address the problem of designing a universal object detection model that exploits diverse category granularity from multiple domains and predicts all kinds of categories in one system. Existing works treat this problem by integrating multiple detection branches upon one shared backbone network. However, this paradigm overlooks the crucial semantic correlations between multiple domains, such as category hierarchy, visual similarity, and linguistic relationship. To address these drawbacks, we present a novel universal object detector called Universal-RCNN that incorporates graph transfer learning for propagating relevant semantic information across multiple datasets to reach semantic coherency. Specifically, we first generate a global semantic pool by integrating the high-level semantic representations of all categories. Then an Intra-Domain Reasoning Module learns and propagates the sparse graph representation within one dataset, guided by a spatial-aware GCN. Finally, an Inter-Domain Transfer Module is proposed to exploit diverse transfer dependencies across all domains and enhance the regional feature representation by attending to and transferring semantic contexts globally. Extensive experiments demonstrate that the proposed method significantly outperforms multiple-branch models and achieves state-of-the-art results on multiple object detection benchmarks (mAP: 49.1% on COCO).
Tasks Object Detection, Transfer Learning
Published 2020-02-18
URL https://arxiv.org/abs/2002.07417v1
PDF https://arxiv.org/pdf/2002.07417v1.pdf
PWC https://paperswithcode.com/paper/universal-rcnn-universal-object-detector-via
Repo
Framework
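
The graph propagation both modules rely on can be illustrated with a single normalized-adjacency GCN step over category embeddings, H' = ReLU(D^{-1/2} A D^{-1/2} H W). The numpy sketch below uses a toy random category graph and random embeddings; the spatial-aware attention and the cross-domain transfer of Universal-RCNN are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

n_categories, in_dim, out_dim = 6, 32, 16        # illustrative sizes
H = rng.standard_normal((n_categories, in_dim))  # category semantic embeddings

# Toy symmetric category graph with self-loops.
A = (rng.random((n_categories, n_categories)) > 0.6).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 1.0)

# Symmetric normalization D^{-1/2} A D^{-1/2}.
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_norm = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

W = rng.standard_normal((in_dim, out_dim))
H_next = np.maximum(A_norm @ H @ W, 0.0)         # one GCN propagation step
print(H_next.shape)                              # (6, 16)
```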

A Survey towards Federated Semi-supervised Learning

Title A Survey towards Federated Semi-supervised Learning
Authors Yilun Jin, Xiguang Wei, Yang Liu, Qiang Yang
Abstract The success of Artificial Intelligence (AI) should be largely attributed to the accessibility of abundant data. However, this is not exactly the case in reality, where it is common for developers in industry to face insufficient, incomplete and isolated data. Consequently, federated learning was proposed to alleviate such challenges by allowing multiple parties to collaboratively build machine learning models without explicitly sharing their data, thereby preserving data privacy. However, existing algorithms of federated learning mainly focus on scenarios where either the data do not require explicit labeling or all data are labeled. Yet in reality, we are often confronted with the case that labeling data is itself costly and there is no sufficient supply of labeled data. While such issues are commonly solved by semi-supervised learning, to the best of our knowledge, no existing effort has been devoted to federated semi-supervised learning. In this survey, we briefly summarize prevalent semi-supervised algorithms and offer a brief outlook on federated semi-supervised learning, including possible methodologies, settings and challenges.
Tasks
Published 2020-02-26
URL https://arxiv.org/abs/2002.11545v1
PDF https://arxiv.org/pdf/2002.11545v1.pdf
PWC https://paperswithcode.com/paper/a-survey-towards-federated-semi-supervised
Repo
Framework

Unsupervised Dictionary Learning for Anomaly Detection

Title Unsupervised Dictionary Learning for Anomaly Detection
Authors Paul Irofti, Andra Băltoiu
Abstract We investigate the possibilities of employing dictionary learning to address the requirements of most anomaly detection applications, such as the absence of supervision, online formulations, and low false positive rates. We present new results of our recent semi-supervised online algorithm, TODDLeR, on an anti-money laundering application. We also introduce a novel unsupervised method that uses the performance of the learning algorithm as an indication of the nature of the samples.
Tasks Anomaly Detection, Dictionary Learning
Published 2020-02-29
URL https://arxiv.org/abs/2003.00293v1
PDF https://arxiv.org/pdf/2003.00293v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-dictionary-learning-for-anomaly
Repo
Framework
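
One common way to use dictionary learning for unsupervised anomaly detection is to score each sample by its sparse-coding reconstruction error under a dictionary fitted to (presumed) normal data. The scikit-learn sketch below shows that generic baseline with assumed hyperparameters and synthetic data; it is not the TODDLeR algorithm from the paper.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)

# Mostly "normal" data lying near a low-dimensional subspace, plus a few outliers.
basis = rng.standard_normal((5, 20))
normal = rng.standard_normal((300, 5)) @ basis
outliers = rng.standard_normal((10, 20)) * 3.0
X = np.vstack([normal, outliers])

dico = MiniBatchDictionaryLearning(n_components=8, transform_algorithm='omp',
                                   transform_n_nonzero_coefs=3, random_state=0)
codes = dico.fit(normal).transform(X)            # fit the dictionary on normal data only
recon = codes @ dico.components_

# Anomaly score: reconstruction error under the learned dictionary.
scores = np.linalg.norm(X - recon, axis=1)
print("top-10 suspected anomalies:", np.argsort(scores)[-10:])
```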

Better Captioning with Sequence-Level Exploration

Title Better Captioning with Sequence-Level Exploration
Authors Jia Chen, Qin Jin
Abstract The sequence-level learning objective has been widely used in captioning tasks to achieve state-of-the-art performance for many models. Under this objective, the model is trained with a reward on the quality of its generated captions (sequence level). In this work, we show the limitation of the current sequence-level learning objective for captioning tasks from both theoretical and empirical perspectives. In theory, we show that the current objective is equivalent to optimizing only the precision side of the caption set generated by the model and therefore overlooks the recall side. Empirical results show that a model trained with this objective tends to score lower on the recall side. We propose to add a sequence-level exploration term to the current objective to boost recall; it guides the model to explore more plausible captions during training. In this way, the proposed objective takes both the precision and recall sides of generated captions into account. Experiments show the effectiveness of the proposed method on both video and image captioning datasets.
Tasks Image Captioning
Published 2020-03-08
URL https://arxiv.org/abs/2003.03749v1
PDF https://arxiv.org/pdf/2003.03749v1.pdf
PWC https://paperswithcode.com/paper/better-captioning-with-sequence-level
Repo
Framework
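
Sequence-level (reward-based) training is typically implemented with a REINFORCE-style loss on sampled captions, and one generic way to encourage exploration is to add an entropy bonus to that loss. The PyTorch sketch below shows that generic combination; it is not the paper's specific sequence-level exploration term, and all tensor names and shapes are illustrative.

```python
import torch

def sequence_level_loss(log_probs, sampled_tokens, rewards, baseline, entropy_weight=0.01):
    """REINFORCE-style sequence loss with an entropy bonus to encourage exploration.

    log_probs:      (B, T, V) log-probabilities over the vocabulary at each step
    sampled_tokens: (B, T) tokens of the sampled captions
    rewards:        (B,) sequence-level reward (e.g. CIDEr) of each sampled caption
    baseline:       (B,) baseline reward (e.g. reward of the greedy caption)
    """
    # Log-probability of each sampled caption.
    seq_logp = log_probs.gather(-1, sampled_tokens.unsqueeze(-1)).squeeze(-1).sum(dim=1)
    advantage = (rewards - baseline).detach()
    policy_loss = -(advantage * seq_logp).mean()
    # Entropy of the per-step distributions; higher entropy -> more exploration.
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()
    return policy_loss - entropy_weight * entropy

if __name__ == "__main__":
    B, T, V = 4, 7, 100
    log_probs = torch.log_softmax(torch.randn(B, T, V, requires_grad=True), dim=-1)
    sampled_tokens = torch.randint(0, V, (B, T))
    rewards, baseline = torch.rand(B), torch.rand(B)
    print(sequence_level_loss(log_probs, sampled_tokens, rewards, baseline).item())
```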

Deep Learning-Based Solvability of Underdetermined Inverse Problems in Medical Imaging

Title Deep Learning-Based Solvability of Underdetermined Inverse Problems in Medical Imaging
Authors Chang Min Hyun, Seong Hyeon Baek, Mingyu Lee, Sung Min Lee, Jin Keun Seo
Abstract Recently, with the significant developments in deep learning techniques, solving underdetermined inverse problems has become one of the major concerns in the medical imaging domain. Typical examples include undersampled magnetic resonance imaging, interior tomography, and sparse-view computed tomography, where deep learning techniques have achieved excellent performance. Although deep learning methods appear to overcome the limitations of existing mathematical methods when handling various underdetermined problems, there is a lack of rigorous mathematical foundations that would allow us to elucidate the reasons for the remarkable performance of deep learning methods. This study focuses on learning the causal relationship regarding the structure of the training data suitable for deep learning, to solve highly underdetermined inverse problems. We observe that a majority of the problems of solving underdetermined linear systems in medical imaging are highly non-linear. Furthermore, we analyze whether a desired reconstruction map is learnable from the training data and the underdetermined system.
Tasks
Published 2020-01-06
URL https://arxiv.org/abs/2001.01432v2
PDF https://arxiv.org/pdf/2001.01432v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-based-solvability-of
Repo
Framework
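
For concreteness, "underdetermined" here means recovering x from measurements y = Ax with far fewer measurements than unknowns, so A has a non-trivial null space and y alone cannot determine x. A tiny numpy illustration follows, with a random A standing in for an undersampled forward operator (e.g. partial Fourier sampling); the sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 100, 30                      # 100 unknowns, only 30 measurements
A = rng.standard_normal((m, n))     # stand-in for an undersampled forward operator
x_true = rng.standard_normal(n)
y = A @ x_true

# Minimum-norm solution via the pseudoinverse: consistent with y, but not x_true.
x_mn = np.linalg.pinv(A) @ y
print("data mismatch:", np.linalg.norm(A @ x_mn - y))        # ~0
print("recovery error:", np.linalg.norm(x_mn - x_true))      # large: null-space part lost
# Any x_mn + v with v in the null space of A fits the data equally well,
# which is why prior knowledge (e.g. learned from training data) is needed.
```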

Do you comply with AI? – Personalized explanations of learning algorithms and their impact on employees’ compliance behavior

Title Do you comply with AI? – Personalized explanations of learning algorithms and their impact on employees’ compliance behavior
Authors Niklas Kühl, Jodie Lobana, Christian Meske
Abstract Machine Learning algorithms are technological key enablers for artificial intelligence (AI). Due to the inherent complexity, these learning algorithms represent black boxes and are difficult to comprehend, therefore influencing compliance behavior. Hence, compliance with the recommendations of such artifacts, which can impact employees’ task performance significantly, is still subject to research - and personalization of AI explanations seems to be a promising concept in this regard. In our work, we hypothesize that, based on varying backgrounds like training, domain knowledge and demographic characteristics, individuals have different understandings and hence mental models about the learning algorithm. Personalization of AI explanations, related to the individuals’ mental models, may thus be an instrument to affect compliance and therefore employee task performance. Our preliminary results already indicate the importance of personalized explanations in industry settings and emphasize the importance of this research endeavor.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.08777v1
PDF https://arxiv.org/pdf/2002.08777v1.pdf
PWC https://paperswithcode.com/paper/do-you-comply-with-ai-personalized
Repo
Framework