January 28, 2020


Paper Group ANR 804

Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization. LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data. WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation. Regularized Estimation of High-Dimensional Vector AutoRegressions with Weakly Dependent Innovati …

Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization


Title	Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization
Authors	Qilong Wang, Jiangtao Xie, Wangmeng Zuo, Lei Zhang, Peihua Li
Abstract	Compared with global average pooling in existing deep convolutional neural networks (CNNs), global covariance pooling can capture richer statistics of deep features, with the potential to improve the representation and generalization abilities of deep CNNs. However, integrating global covariance pooling into deep CNNs brings two challenges: (1) robust covariance estimation given deep features of high dimension and small sample size; (2) appropriate use of the geometry of covariances. To address these challenges, we propose a global Matrix Power Normalized COVariance (MPN-COV) Pooling. Our MPN-COV conforms to a robust covariance estimator, well suited to scenarios of high dimension and small sample size. It can also be regarded as a power-Euclidean metric between covariances, effectively exploiting their geometry. Furthermore, a global Gaussian embedding method is proposed to incorporate first-order statistics into MPN-COV. For fast training of MPN-COV networks, we propose an iterative matrix square root normalization, avoiding the GPU-unfriendly eigen-decomposition inherent in MPN-COV. Additionally, progressive 1x1 and group convolutions are introduced to compact covariance representations. The MPN-COV and its variants are highly modular and readily plugged into existing deep CNNs. Extensive experiments are conducted on large-scale object classification, scene categorization, fine-grained visual recognition and texture classification, showing that our methods are superior to their counterparts and achieve state-of-the-art performance.
Tasks	Fine-Grained Visual Recognition, Object Classification, Texture Classification
Published	2019-04-15
URL	http://arxiv.org/abs/1904.06836v1
PDF	http://arxiv.org/pdf/1904.06836v1.pdf
PWC	https://paperswithcode.com/paper/deep-cnns-meet-global-covariance-pooling
Repo
Framework
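
The abstract mentions an iterative matrix square root normalization that avoids eigen-decomposition. A minimal NumPy sketch of one well-known iteration of this kind, the coupled Newton-Schulz scheme with trace pre-normalization, is given below. Treating this as the iteration the authors use is an assumption, and the function names `covariance_pooling` and `newton_schulz_sqrt` are hypothetical.

```python
import numpy as np


def covariance_pooling(features):
    """Covariance of an (n, d) feature map: n spatial positions, d channels."""
    centered = features - features.mean(axis=0, keepdims=True)
    return centered.T @ centered / features.shape[0]


def newton_schulz_sqrt(cov, num_iters=5):
    """Approximate the square root of a symmetric positive-definite matrix."""
    dim = cov.shape[0]
    identity = np.eye(dim)
    trace = np.trace(cov)
    # Pre-normalize by the trace so every eigenvalue lies in (0, 1],
    # which keeps the iteration inside its convergence region.
    y = cov / trace
    z = identity.copy()
    for _ in range(num_iters):
        t = 0.5 * (3.0 * identity - z @ y)
        y = y @ t  # y converges to sqrt(cov / trace)
        z = t @ z  # z converges to the inverse of that square root
    # Undo the pre-normalization.
    return y * np.sqrt(trace)


# Small illustrative matrix (made-up values, not from the paper).
cov = np.array([[2.0, 0.5], [0.5, 1.0]])
root = newton_schulz_sqrt(cov, num_iters=10)
```

The loop uses only matrix products, which is why iterations of this type map well onto GPUs. Convergence slows when the smallest normalized eigenvalue is close to zero, so the iteration count is a tuning choice.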

LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data


Title	LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data
Authors	Ali Eshragh, Fred Roosta, Asef Nazari, Michael W. Mahoney
Abstract	We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive (AR) model in big data regimes. We show that the accuracy of approximations lies within $(1+\mathcal{O}(\varepsilon))$ of the true leverage scores with high probability. These theoretical results are subsequently exploited to develop an efficient algorithm, called LSAR, for fitting an appropriate AR model to big time series data. Our proposed algorithm is guaranteed, with high probability, to find the maximum likelihood estimates of the parameters of the underlying true AR model and has a worst case running time that significantly improves those of the state-of-the-art alternatives in big data regimes. Empirical results on large-scale synthetic as well as real data highly support the theoretical results and reveal the efficacy of this new approach.
Tasks	Time Series
Published	2019-11-27
URL	https://arxiv.org/abs/1911.12321v2
PDF	https://arxiv.org/pdf/1911.12321v2.pdf
PWC	https://paperswithcode.com/paper/lsar-efficient-leverage-score-sampling
Repo
Framework
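
To make the term "leverage scores of an AR model" concrete, the sketch below builds the lagged design matrix of an AR(p) least-squares fit, computes its exact leverage scores from a thin QR factorization, and fits the coefficients on rows sampled in proportion to those scores. It is a baseline illustration only: the fast approximation that LSAR contributes is not reproduced here, and all function names and the simulated series are assumptions.

```python
import numpy as np


def ar_design_matrix(series, order):
    """Rows are [y[t-1], ..., y[t-order]]; targets are y[t], for t >= order."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    columns = [series[order - lag : n - lag] for lag in range(1, order + 1)]
    return np.column_stack(columns), series[order:]


def exact_leverage_scores(design):
    """Squared row norms of Q from a thin QR factorization of the design."""
    q, _ = np.linalg.qr(design)
    return np.sum(q ** 2, axis=1)


def leverage_sampled_fit(design, target, num_samples, rng):
    """Least squares on rows sampled with probability proportional to leverage."""
    scores = exact_leverage_scores(design)
    probs = scores / scores.sum()
    idx = rng.choice(len(target), size=num_samples, replace=True, p=probs)
    # Rescale sampled rows so the subsampled problem is unbiased.
    weights = 1.0 / np.sqrt(num_samples * probs[idx])
    coef, *_ = np.linalg.lstsq(
        design[idx] * weights[:, None], target[idx] * weights, rcond=None
    )
    return coef


# Simulated AR(2) series with made-up coefficients.
rng = np.random.default_rng(0)
true_coef = np.array([0.5, -0.3])
noise = rng.standard_normal(5000)
series = np.zeros(5000)
for t in range(2, 5000):
    series[t] = true_coef[0] * series[t - 1] + true_coef[1] * series[t - 2] + noise[t]

design, target = ar_design_matrix(series, order=2)
scores = exact_leverage_scores(design)
estimate = leverage_sampled_fit(design, target, num_samples=1000, rng=rng)
```

Computing exact scores costs as much as solving the full problem, which is the motivation for estimating them quickly in big data regimes.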

WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation


Title	WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation
Authors	Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo
Abstract	WaveCycleGAN has recently been proposed to bridge the gap between natural and synthesized speech waveforms in statistical parametric speech synthesis and provides fast inference with a moving average model rather than an autoregressive model and high-quality speech synthesis with the adversarial training. However, the human ear can still distinguish the processed speech waveforms from natural ones. One possible cause of this distinguishability is the aliasing observed in the processed speech waveform via down/up-sampling modules. To solve the aliasing and provide higher quality speech synthesis, we propose WaveCycleGAN2, which 1) uses generators without down/up-sampling modules and 2) combines discriminators of the waveform domain and acoustic parameter domain. The results show that the proposed method 1) alleviates the aliasing well, 2) is useful for both speech waveforms generated by analysis-and-synthesis and statistical parametric speech synthesis, and 3) achieves a mean opinion score comparable to those of natural speech and speech synthesized by WaveNet (open WaveNet) and WaveGlow while processing speech samples at a rate of more than 150 kHz on an NVIDIA Tesla P100.
Tasks	Speech Synthesis
Published	2019-04-05
URL	http://arxiv.org/abs/1904.02892v2
PDF	http://arxiv.org/pdf/1904.02892v2.pdf
PWC	https://paperswithcode.com/paper/wavecyclegan2-time-domain-neural-post-filter
Repo
Framework

Regularized Estimation of High-Dimensional Vector AutoRegressions with Weakly Dependent Innovations


Title	Regularized Estimation of High-Dimensional Vector AutoRegressions with Weakly Dependent Innovations
Authors	Ricardo P. Masini, Marcelo C. Medeiros, Eduardo F. Mendes
Abstract	There has been considerable advance in understanding the properties of sparse regularization procedures in high-dimensional models. Most of the work is limited to either the independent and identically distributed setting or time series with independent and/or (sub-)Gaussian innovations. We extend the current literature to a broader set of innovation processes by assuming that the error process is non-sub-Gaussian and conditionally heteroscedastic, and that the generating process is not necessarily sparse. This setting covers fat-tailed, conditionally dependent innovations, which are of particular interest for financial risk modeling. It covers several multivariate-GARCH specifications, such as the BEKK model, and other factor stochastic volatility specifications.
Tasks	Time Series
Published	2019-12-19
URL	https://arxiv.org/abs/1912.09002v1
PDF	https://arxiv.org/pdf/1912.09002v1.pdf
PWC	https://paperswithcode.com/paper/regularized-estimation-of-high-dimensional
Repo
Framework

“The Squawk Bot”: Joint Learning of Time Series and Text Data Modalities for Automated Financial Information Filtering


Title	“The Squawk Bot”: Joint Learning of Time Series and Text Data Modalities for Automated Financial Information Filtering
Authors	Xuan-Hong Dang, Syed Yousaf Shah, Petros Zerfos
Abstract	Multimodal analysis that uses numerical time series and textual corpora as input data sources is becoming a promising approach, especially in the financial industry. However, the main focus of such analysis has been on achieving high prediction accuracy, while little effort has been spent on the important task of understanding the association between the two data modalities. Performance on the time series hence receives little explanation even though human-understandable textual information is available. In this work, we address the following problem: given a numerical time series and a general corpus of textual stories collected over the same period, discover in a timely manner a succinct set of textual stories associated with that time series. Towards this goal, we propose a novel multi-modal neural model called MSIN that jointly learns both numerical time series and categorical text articles in order to unearth the association between them. Through multiple steps of data interrelation between the two data modalities, MSIN learns to focus on a small subset of text articles that best align with the performance in the time series. This succinct set is discovered in a timely manner and presented as recommended documents, acting as automated information filtering for the given time series. We empirically evaluate the performance of our model on discovering relevant news articles for two stock time series, from Apple and Google, along with the daily news articles collected from Thomson Reuters over a period of seven consecutive years. The experimental results demonstrate that MSIN achieves up to 84.9% and 87.2% in recalling the ground truth articles for the two examined time series, respectively, far superior to state-of-the-art algorithms that rely on conventional attention mechanisms in deep learning.
Tasks	Time Series
Published	2019-12-20
URL	https://arxiv.org/abs/1912.10858v1
PDF	https://arxiv.org/pdf/1912.10858v1.pdf
PWC	https://paperswithcode.com/paper/the-squawk-bot-joint-learning-of-time-series
Repo
Framework

Recent advances in deep learning applied to skin cancer detection


Title	Recent advances in deep learning applied to skin cancer detection
Authors	Andre G. C. Pacheco, Renato A. Krohling
Abstract	Skin cancer is a major public health problem around the world. Early detection is very important for improving patient prognosis. However, the lack of qualified professionals and medical instruments is a significant issue in this field. In this context, over the past few years, deep learning models applied to automated skin cancer detection have become a trend. In this paper, we present an overview of the recent advances reported in this field as well as a discussion about the challenges and opportunities for improvement in the current models. In addition, we also present some important aspects regarding the use of these models in smartphones and indicate future directions we believe the field will take.
Tasks
Published	2019-12-06
URL	https://arxiv.org/abs/1912.03280v1
PDF	https://arxiv.org/pdf/1912.03280v1.pdf
PWC	https://paperswithcode.com/paper/recent-advances-in-deep-learning-applied-to
Repo
Framework

Deep Feature Learning from a Hospital-Scale Chest X-ray Dataset with Application to TB Detection on a Small-Scale Dataset


Title	Deep Feature Learning from a Hospital-Scale Chest X-ray Dataset with Application to TB Detection on a Small-Scale Dataset
Authors	Ophir Gozes, Hayit Greenspan
Abstract	The use of ImageNet pre-trained networks is becoming widespread in the medical imaging community. It enables training on small datasets, commonly available in medical imaging tasks. The recent emergence of a large Chest X-ray dataset opened the possibility for learning features that are specific to the X-ray analysis task. In this work, we demonstrate that the features learned allow for better classification results for the problem of Tuberculosis detection and enable generalization to an unseen dataset. To accomplish the task of feature learning, we train a DenseNet-121 CNN on 112K images from the ChestXray14 dataset, which includes labels of 14 common thoracic pathologies. In addition to the pathology labels, we incorporate metadata that is available in the dataset: Patient Positioning, Gender and Patient Age. We term this architecture MetaChexNet. As a by-product of the feature learning, we demonstrate state-of-the-art performance on the task of patient Age & Gender estimation using CNNs. Finally, we show the features learned using ChestXray14 allow for better transfer learning on small-scale datasets for Tuberculosis.
Tasks	Transfer Learning
Published	2019-06-03
URL	https://arxiv.org/abs/1906.00768v1
PDF	https://arxiv.org/pdf/1906.00768v1.pdf
PWC	https://paperswithcode.com/paper/190600768
Repo
Framework

Material Segmentation of Multi-View Satellite Imagery


Title	Material Segmentation of Multi-View Satellite Imagery
Authors	Matthew Purri, Jia Xue, Kristin Dana, Matthew Leotta, Dan Lipsa, Zhixin Li, Bo Xu, Jie Shan
Abstract	Material recognition methods use image context and local cues for pixel-wise classification. In many cases only a single image is available to make a material prediction. Image sequences, routinely acquired in applications such as multiview stereo, can provide a sampling of the underlying reflectance functions that reveal pixel-level material attributes. We investigate multi-view material segmentation using two datasets generated for building material segmentation and scene material segmentation from the SpaceNet Challenge satellite image dataset. In this paper, we explore the impact of multi-angle reflectance information by introducing the reflectance residual encoding, which captures both the multi-angle and multispectral information present in our datasets. The residuals are computed by differencing the sparse-sampled reflectance function with a dictionary of pre-defined dense-sampled reflectance functions. Our proposed reflectance residual features improve material segmentation performance when integrated into pixel-wise and semantic segmentation architectures. At test time, predictions from individual segmentations are combined through softmax fusion and refined by building segment voting. We demonstrate robust and accurate pixelwise segmentation results using the proposed material segmentation pipeline.
Tasks	Material Recognition, Semantic Segmentation
Published	2019-04-17
URL	http://arxiv.org/abs/1904.08537v1
PDF	http://arxiv.org/pdf/1904.08537v1.pdf
PWC	https://paperswithcode.com/paper/material-segmentation-of-multi-view-satellite
Repo
Framework
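
The abstract says per-view predictions are combined through softmax fusion but does not define the operation. The sketch below shows one plausible reading, stated here as an assumption: convert each view's logits to class probabilities, average them across views, and take the most probable class per pixel. The function names and the toy logits are hypothetical.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def fuse_views(view_logits):
    """Fuse per-view logits of shape (views, height, width, classes).

    Returns the per-pixel label map and the averaged class probabilities.
    """
    probs = softmax(view_logits, axis=-1)
    fused = probs.mean(axis=0)  # average over views
    return fused.argmax(axis=-1), fused


# Two views of a single pixel with two material classes (made-up logits).
view_logits = np.array([[[[2.0, 0.0]]], [[[0.0, 1.0]]]])
labels, fused = fuse_views(view_logits)
```

Averaging probabilities rather than raw logits keeps a single overconfident view from dominating the fused prediction unless the other views are close to undecided.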

IoT based Smart Access Controlled Secure Smart City Architecture Using Blockchain


Title	IoT based Smart Access Controlled Secure Smart City Architecture Using Blockchain
Authors	Rourab Paul, Nimisha Ghosh, Suman Sau, Amlan Chakrabarti, Prasant Mahapatra
Abstract	Standard security protocols such as SSL, TLS and IPSec have high memory and processor consumption, which makes them unsuitable for resource-constrained platforms such as the Internet of Things (IoT). Blockchain (BC) finds efficient application in IoT platforms to preserve the five basic cryptographic primitives: confidentiality, authenticity, integrity, availability and non-repudiation. Conventional adoption of BC in IoT platforms causes high energy consumption, delay and computational overhead, which are not appropriate for various resource-constrained IoT devices. This work proposes a machine learning (ML) based smart access control framework in a public and a private BC for a smart city application, which makes it more efficient than existing IoT applications. The proposed IoT based smart city architecture adopts BC technology to address cryptographic security and privacy issues. Moreover, BC has very minimal overhead on the IoT platform as well. This work investigates the existing threat models and critical access control issues, handles multiple permissions of various nodes, and detects relevant inconsistencies to notify the corresponding nodes. Comparison in terms of all security issues with existing literature shows that the proposed architecture is competitively efficient in terms of security access control.
Tasks
Published	2019-08-30
URL	https://arxiv.org/abs/1908.11538v3
PDF	https://arxiv.org/pdf/1908.11538v3.pdf
PWC	https://paperswithcode.com/paper/iot-based-smart-access-controlled-secure
Repo
Framework

Mining Rules Incrementally over Large Knowledge Bases


Title	Mining Rules Incrementally over Large Knowledge Bases
Authors	Xiaofeng Zhou, Ali Sadeghian, Daisy Zhe Wang
Abstract	Multiple web-scale Knowledge Bases, e.g., Freebase, YAGO, NELL, have been constructed using semi-supervised or unsupervised information extraction techniques and many of them, despite their large sizes, are continuously growing. Much research effort has been put into mining inference rules from knowledge bases. To address the task of rule mining over evolving web-scale knowledge bases, we propose a parallel incremental rule mining framework. Our approach is able to efficiently mine rules based on the relational model and apply updates to large knowledge bases; we propose an alternative metric that reduces computation complexity without compromising quality; we apply multiple optimization techniques that reduce runtime by more than 2 orders of magnitude. Experiments show that our approach efficiently scales to web-scale knowledge bases and saves over 90% time compared to the state-of-the-art batch rule mining system. We also apply our optimization techniques to the batch rule mining algorithm, reducing runtime by more than half compared to the state-of-the-art. To the best of our knowledge, our incremental rule mining system is the first that handles updates to web-scale knowledge bases.
Tasks
Published	2019-04-20
URL	http://arxiv.org/abs/1904.09399v1
PDF	http://arxiv.org/pdf/1904.09399v1.pdf
PWC	https://paperswithcode.com/paper/190409399
Repo
Framework

FaceSpoof Buster: a Presentation Attack Detector Based on Intrinsic Image Properties and Deep Learning


Title	FaceSpoof Buster: a Presentation Attack Detector Based on Intrinsic Image Properties and Deep Learning
Authors	Rodrigo Bresan, Allan Pinto, Anderson Rocha, Carlos Beluzo, Tiago Carvalho
Abstract	Nowadays, the adoption of face recognition for biometric authentication systems is usual, mainly because this is one of the most accessible biometric modalities. Techniques that rely on trespassing this kind of system by using a forged biometric sample, such as a printed paper or a recorded video of a genuine access, are known as presentation attacks, but may also be referred to in the literature as face spoofing. Presentation attack detection is a crucial step for preventing this kind of unauthorized access to restricted areas and/or devices. In this paper, we propose a novel approach that relies on a combination of intrinsic image properties and deep neural networks to detect presentation attack attempts. Our method explores depth, salience and illumination maps, associated with a pre-trained Convolutional Neural Network, in order to produce robust and discriminant features. Each of these properties is individually classified and, at the end of the process, they are combined by a meta learning classifier, which achieves outstanding results on the most popular datasets for PAD. Results show that the proposed method is able to surpass state-of-the-art results in an inter-dataset protocol, which is defined as the most challenging in the literature.
Tasks	Face Recognition, Meta-Learning
Published	2019-02-07
URL	http://arxiv.org/abs/1902.02845v1
PDF	http://arxiv.org/pdf/1902.02845v1.pdf
PWC	https://paperswithcode.com/paper/facespoof-buster-a-presentation-attack
Repo
Framework

Neural Logic Rule Layers


Title	Neural Logic Rule Layers
Authors	Jan Niclas Reimann, Andreas Schwung
Abstract	Despite their great success in recent years, deep neural networks (DNNs) are mainly black boxes where the results obtained by running through the network are difficult to understand and interpret. Compared to, e.g., decision trees or Bayesian classifiers, DNNs suffer from poor interpretability, where by interpretability we mean that a human can easily derive the relations modeled by the network. A reasonable way to provide interpretability for humans is through logical rules. In this paper we propose neural logic rule layers (NLRL), which are able to represent arbitrary logic rules in terms of their conjunctive and disjunctive normal forms. Using various NLRL within one layer and correspondingly stacking various layers, we are able to represent arbitrarily complex rules by the resulting neural network architecture. The NLRL are end-to-end trainable, allowing logic rules to be learned directly from available data sets. Experiments show that NLRL-enhanced neural networks can learn to model arbitrarily complex logic and perform arithmetic operations over the input values.
Tasks
Published	2019-07-01
URL	https://arxiv.org/abs/1907.00878v1
PDF	https://arxiv.org/pdf/1907.00878v1.pdf
PWC	https://paperswithcode.com/paper/neural-logic-rule-layers
Repo
Framework

Adaptive Hierarchical Down-Sampling for Point Cloud Classification


Title	Adaptive Hierarchical Down-Sampling for Point Cloud Classification
Authors	Ehsan Nezhadarya, Ehsan Taghavi, Bingbing Liu, Jun Luo
Abstract	While several convolution-like operators have recently been proposed for extracting features out of point clouds, down-sampling an unordered point cloud in a deep neural network has not been rigorously studied. Existing methods down-sample the points regardless of their importance for the output. As a result, some important points in the point cloud may be removed, while less valuable points may be passed to the next layers. In contrast, adaptive down-sampling methods sample the points by taking into account the importance of each point, which varies based on the application, task and training data. In this paper, we propose a permutation-invariant learning-based adaptive down-sampling layer, called Critical Points Layer (CPL), which reduces the number of points in an unordered point cloud while retaining the important points. Unlike most graph-based point cloud down-sampling methods that use k-NN search algorithm to find the neighbouring points, CPL is a global down-sampling method, rendering it computationally very efficient. The proposed layer can be used along with any graph-based point cloud convolution layer to form a convolutional neural network, dubbed CP-Net in this paper. We introduce a CP-Net for 3D object classification that achieves the best accuracy for the ModelNet40 dataset among point cloud-based methods, which validates the effectiveness of the CPL.
Tasks	Object Classification
Published	2019-04-11
URL	http://arxiv.org/abs/1904.08506v1
PDF	http://arxiv.org/pdf/1904.08506v1.pdf
PWC	https://paperswithcode.com/paper/190408506
Repo
Framework
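
The abstract describes importance-aware, global down-sampling without spelling out how importance is scored. As an illustration only, the sketch below ranks points by how many feature channels they win under a global max-pool and keeps the top-ranked ones. This scoring rule is an assumption, not a statement of how CPL is defined, and `critical_point_indices` is a hypothetical name.

```python
import numpy as np


def critical_point_indices(features, keep):
    """Select `keep` point indices from an (n_points, n_channels) feature array.

    A point's importance is the number of channels in which it attains the
    column-wise maximum, i.e. how much it contributes to a global max-pool.
    """
    winners = features.argmax(axis=0)  # winning point per channel
    counts = np.bincount(winners, minlength=features.shape[0])
    order = np.argsort(-counts, kind="stable")  # most important first
    return np.sort(order[:keep])


# Four points with three feature channels each (made-up values).
features = np.array(
    [
        [0.1, 0.2, 0.3],
        [0.9, 0.8, 0.1],
        [0.2, 0.1, 0.7],
        [0.0, 0.0, 0.0],
    ]
)
kept = critical_point_indices(features, keep=2)
```

No neighbour search is involved: the selection needs one pass over the feature array, which is consistent with the abstract's point that a global method avoids the cost of k-NN queries. Ties between equally important points are broken by input order here, so strict permutation invariance would need a tie-breaking rule that does not depend on ordering.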

Know More about Each Other: Evolving Dialogue Strategy via Compound Assessment


Title	Know More about Each Other: Evolving Dialogue Strategy via Compound Assessment
Authors	Siqi Bao, Huang He, Fan Wang, Rongzhong Lian, Hua Wu
Abstract	In this paper, a novel Generation-Evaluation framework is developed for multi-turn conversations with the objective of letting both participants know more about each other. For the sake of rational knowledge utilization and coherent conversation flow, a dialogue strategy which controls knowledge selection is instantiated and continuously adapted via reinforcement learning. Under the deployed strategy, knowledge grounded conversations are conducted with two dialogue agents. The generated dialogues are comprehensively evaluated on aspects like informativeness and coherence, which are aligned with our objective and human instinct. These assessments are integrated as a compound reward to guide the evolution of dialogue strategy via policy gradient. Comprehensive experiments have been carried out on the publicly available dataset, demonstrating that the proposed method outperforms the other state-of-the-art approaches significantly.
Tasks
Published	2019-06-03
URL	https://arxiv.org/abs/1906.00549v1
PDF	https://arxiv.org/pdf/1906.00549v1.pdf
PWC	https://paperswithcode.com/paper/190600549
Repo
Framework

Assessing Capsule Networks With Biased Data


Title	Assessing Capsule Networks With Biased Data
Authors	Bruno Ferrarini, Shoaib Ehsan, Adrien Bartoli, Aleš Leonardis, Klaus D. McDonald-Maier
Abstract	Machine learning based methods achieve impressive results in object classification and detection. Utilizing representative data of the visual world during the training phase is crucial to achieve good performance with such data-driven approaches. However, it is not always possible to access bias-free datasets; thus, robustness to biased data is a desirable property for a learning system. Capsule Networks have been introduced recently and their tolerance to biased data has received little attention. This paper aims to fill this gap and proposes two experimental scenarios to assess the tolerance to imbalanced training data and to determine the generalization performance of a model with unfamiliar affine transformations of the images. This paper assesses dynamic routing and EM routing based Capsule Networks and proposes a comparison with Convolutional Neural Networks in the two tested scenarios. The presented results provide new insights into the behaviour of capsule networks.
Tasks	Object Classification
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04555v1
PDF	http://arxiv.org/pdf/1904.04555v1.pdf
PWC	https://paperswithcode.com/paper/assessing-capsule-networks-with-biased-data
Repo
Framework