Paper Group ANR 804
Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization. LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data. WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation. Regularized Estimation of High-Dimensional Vector AutoRegressions with Weakly Dependent Innovati …
Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization
Title | Deep CNNs Meet Global Covariance Pooling: Better Representation and Generalization |
Authors | Qilong Wang, Jiangtao Xie, Wangmeng Zuo, Lei Zhang, Peihua Li |
Abstract | Compared with global average pooling in existing deep convolutional neural networks (CNNs), global covariance pooling can capture richer statistics of deep features, with the potential to improve the representation and generalization abilities of deep CNNs. However, integrating global covariance pooling into deep CNNs brings two challenges: (1) robust covariance estimation given deep features of high dimension and small sample size; (2) appropriate use of the geometry of covariances. To address these challenges, we propose a global Matrix Power Normalized COVariance (MPN-COV) Pooling. Our MPN-COV conforms to a robust covariance estimator that is well suited to the high-dimension, small-sample scenario. It can also be regarded as a power-Euclidean metric between covariances, effectively exploiting their geometry. Furthermore, a global Gaussian embedding method is proposed to incorporate first-order statistics into MPN-COV. For fast training of MPN-COV networks, we propose an iterative matrix square root normalization, avoiding the GPU-unfriendly eigen-decomposition inherent in MPN-COV. Additionally, progressive 1x1 and group convolutions are introduced to compact covariance representations. The MPN-COV and its variants are highly modular and can be readily plugged into existing deep CNNs. Extensive experiments are conducted on large-scale object classification, scene categorization, fine-grained visual recognition and texture classification, showing that our methods are superior to their counterparts and achieve state-of-the-art performance. |
Tasks | Fine-Grained Visual Recognition, Object Classification, Texture Classification |
Published | 2019-04-15 |
URL | http://arxiv.org/abs/1904.06836v1 |
http://arxiv.org/pdf/1904.06836v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-cnns-meet-global-covariance-pooling |
Repo | |
Framework | |
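The abstract above mentions an iterative matrix square root normalization that avoids eigen-decomposition. As a rough illustration only, the sketch below uses the coupled Newton-Schulz iteration, a well-known scheme of this kind; whether it matches the authors' exact procedure is an assumption, and the function names `covariance_pool` and `newton_schulz_sqrt` are hypothetical.

```python
import numpy as np

def covariance_pool(features):
    """Sample covariance of N feature vectors of dimension d (features: N x d)."""
    centered = features - features.mean(axis=0, keepdims=True)
    return centered.T @ centered / features.shape[0]

def newton_schulz_sqrt(cov, num_iters=5):
    """Approximate the square root of a symmetric positive definite matrix
    using only matrix products (no eigen-decomposition)."""
    d = cov.shape[0]
    identity = np.eye(d)
    trace = np.trace(cov)
    a = cov / trace                # pre-normalize so the iteration converges
    y, z = a.copy(), identity.copy()
    for _ in range(num_iters):
        t = 0.5 * (3.0 * identity - z @ y)
        y, z = y @ t, t @ z        # y -> a^(1/2), z -> a^(-1/2)
    return y * np.sqrt(trace)      # undo the trace normalization
```

Because the loop consists only of matrix multiplications, it maps naturally onto GPU hardware; the number of iterations trades accuracy for speed.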
LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data
Title | LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data |
Authors | Ali Eshragh, Fred Roosta, Asef Nazari, Michael W. Mahoney |
Abstract | We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive (AR) model in big data regimes. We show that the accuracy of approximations lies within $(1+\mathcal{O}(\varepsilon))$ of the true leverage scores with high probability. These theoretical results are subsequently exploited to develop an efficient algorithm, called LSAR, for fitting an appropriate AR model to big time series data. Our proposed algorithm is guaranteed, with high probability, to find the maximum likelihood estimates of the parameters of the underlying true AR model and has a worst case running time that significantly improves those of the state-of-the-art alternatives in big data regimes. Empirical results on large-scale synthetic as well as real data highly support the theoretical results and reveal the efficacy of this new approach. |
Tasks | Time Series |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.12321v2 |
https://arxiv.org/pdf/1911.12321v2.pdf | |
PWC | https://paperswithcode.com/paper/lsar-efficient-leverage-score-sampling |
Repo | |
Framework | |
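To make the notion of leverage scores for an AR model concrete, here is a minimal sketch. It computes exact leverage scores through a thin QR factorization and then fits the AR coefficients on a leverage-score sample by weighted least squares. This is not the LSAR algorithm itself, which according to the abstract approximates the scores quickly; the helper names `ar_design`, `leverage_scores` and `sampled_ar_fit` are hypothetical.

```python
import numpy as np

def ar_design(series, p):
    """Build the AR(p) regression: row t holds [y_{t-1}, ..., y_{t-p}], target y_t."""
    n = len(series)
    X = np.column_stack([series[p - k - 1 : n - k - 1] for k in range(p)])
    y = series[p:]
    return X, y

def leverage_scores(X):
    """Exact leverage scores: squared row norms of Q from a thin QR of X."""
    Q, _ = np.linalg.qr(X)
    return np.sum(Q ** 2, axis=1)

def sampled_ar_fit(X, y, num_samples, rng):
    """Least squares on rows sampled proportionally to their leverage scores."""
    scores = leverage_scores(X)
    probs = scores / scores.sum()
    idx = rng.choice(len(y), size=num_samples, replace=True, p=probs)
    # Rescale sampled rows so the subsampled problem is unbiased for the full one.
    w = 1.0 / np.sqrt(num_samples * probs[idx])
    coef, *_ = np.linalg.lstsq(X[idx] * w[:, None], y[idx] * w, rcond=None)
    return coef
```

The exact scores cost as much as solving the full problem, which is why a fast approximation is the interesting part in big data regimes; the sketch only shows what is being approximated and how the scores are used.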
WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation
Title | WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation |
Authors | Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo |
Abstract | WaveCycleGAN has recently been proposed to bridge the gap between natural and synthesized speech waveforms in statistical parametric speech synthesis and provides fast inference with a moving average model rather than an autoregressive model and high-quality speech synthesis with the adversarial training. However, the human ear can still distinguish the processed speech waveforms from natural ones. One possible cause of this distinguishability is the aliasing observed in the processed speech waveform via down/up-sampling modules. To solve the aliasing and provide higher quality speech synthesis, we propose WaveCycleGAN2, which 1) uses generators without down/up-sampling modules and 2) combines discriminators of the waveform domain and acoustic parameter domain. The results show that the proposed method 1) alleviates the aliasing well, 2) is useful for both speech waveforms generated by analysis-and-synthesis and statistical parametric speech synthesis, and 3) achieves a mean opinion score comparable to those of natural speech and speech synthesized by WaveNet (open WaveNet) and WaveGlow while processing speech samples at a rate of more than 150 kHz on an NVIDIA Tesla P100. |
Tasks | Speech Synthesis |
Published | 2019-04-05 |
URL | http://arxiv.org/abs/1904.02892v2 |
http://arxiv.org/pdf/1904.02892v2.pdf | |
PWC | https://paperswithcode.com/paper/wavecyclegan2-time-domain-neural-post-filter |
Repo | |
Framework | |
Regularized Estimation of High-Dimensional Vector AutoRegressions with Weakly Dependent Innovations
Title | Regularized Estimation of High-Dimensional Vector AutoRegressions with Weakly Dependent Innovations |
Authors | Ricardo P. Masini, Marcelo C. Medeiros, Eduardo F. Mendes |
Abstract | There has been considerable advance in understanding the properties of sparse regularization procedures in high-dimensional models. Most of the work is limited to either the independent and identically distributed setting, or time series with independent and/or (sub-)Gaussian innovations. We extend the current literature to a broader set of innovation processes by assuming that the error process is non-sub-Gaussian and conditionally heteroscedastic, and that the generating process is not necessarily sparse. This setting covers fat-tailed, conditionally dependent innovations, which are of particular interest for financial risk modeling. It covers several multivariate-GARCH specifications, such as the BEKK model, and other factor stochastic volatility specifications. |
Tasks | Time Series |
Published | 2019-12-19 |
URL | https://arxiv.org/abs/1912.09002v1 |
https://arxiv.org/pdf/1912.09002v1.pdf | |
PWC | https://paperswithcode.com/paper/regularized-estimation-of-high-dimensional |
Repo | |
Framework | |
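For readers unfamiliar with sparse regularization in vector autoregressions, the sketch below fits a VAR(1) model with an L1 penalty using proximal gradient descent (ISTA). It illustrates the kind of estimator the abstract studies, not the authors' procedure or their theoretical setting; the names `soft_threshold` and `lasso_var1`, the VAR(1) restriction, and the choice of solver are all assumptions.

```python
import numpy as np

def soft_threshold(a, tau):
    """Proximal operator of the L1 norm."""
    return np.sign(a) * np.maximum(np.abs(a) - tau, 0.0)

def lasso_var1(series, lam, num_iters=500):
    """Fit Y_t = A Y_{t-1} + e_t by minimizing
    (1 / 2n) * ||Y - X B||_F^2 + lam * ||B||_1, with B = A^T.

    series: array of shape (T, k), one row per time step.
    """
    X, Y = series[:-1], series[1:]
    n = X.shape[0]
    # Step size 1/L, where L = sigma_max(X)^2 / n bounds the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(X, 2) ** 2)
    B = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(num_iters):
        grad = X.T @ (X @ B - Y) / n
        B = soft_threshold(B - step * grad, step * lam)
    return B.T
```

With `lam=0.0` the iteration reduces to plain gradient descent on least squares; larger values of `lam` push more entries of the coefficient matrix to exactly zero.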
“The Squawk Bot”: Joint Learning of Time Series and Text Data Modalities for Automated Financial Information Filtering
Title | “The Squawk Bot”: Joint Learning of Time Series and Text Data Modalities for Automated Financial Information Filtering |
Authors | Xuan-Hong Dang, Syed Yousaf Shah, Petros Zerfos |
Abstract | Multimodal analysis that uses numerical time series and textual corpora as input data sources is becoming a promising approach, especially in the financial industry. However, the main focus of such analysis has been on achieving high prediction accuracy, while little effort has been spent on the important task of understanding the association between the two data modalities. Performance on the time series hence receives little explanation even though human-understandable textual information is available. In this work, we address the following problem: given a numerical time series and a general corpus of textual stories collected over the same period, discover in a timely manner a succinct set of textual stories associated with that time series. Towards this goal, we propose a novel multi-modal neural model called MSIN that jointly learns both numerical time series and categorical text articles in order to unearth the association between them. Through multiple steps of data interrelation between the two data modalities, MSIN learns to focus on a small subset of text articles that best align with the performance in the time series. This succinct set is discovered in a timely manner and presented as recommended documents, acting as automated information filtering for the given time series. We empirically evaluate the performance of our model on discovering relevant news articles for two stock time series, from Apple and Google, along with the daily news articles collected from Thomson Reuters over a period of seven consecutive years. The experimental results demonstrate that MSIN achieves up to 84.9% and 87.2% recall of the ground truth articles for the two examined time series, respectively, far superior to state-of-the-art algorithms that rely on conventional attention mechanisms in deep learning. |
Tasks | Time Series |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.10858v1 |
https://arxiv.org/pdf/1912.10858v1.pdf | |
PWC | https://paperswithcode.com/paper/the-squawk-bot-joint-learning-of-time-series |
Repo | |
Framework | |
Recent advances in deep learning applied to skin cancer detection
Title | Recent advances in deep learning applied to skin cancer detection |
Authors | Andre G. C. Pacheco, Renato A. Krohling |
Abstract | Skin cancer is a major public health problem around the world. Its early detection is very important for improving patient prognosis. However, the lack of qualified professionals and medical instruments is a significant issue in this field. In this context, over the past few years, deep learning models applied to automated skin cancer detection have become a trend. In this paper, we present an overview of the recent advances reported in this field as well as a discussion about the challenges and opportunities for improvement in the current models. In addition, we also present some important aspects regarding the use of these models in smartphones and indicate future directions we believe the field will take. |
Tasks | |
Published | 2019-12-06 |
URL | https://arxiv.org/abs/1912.03280v1 |
https://arxiv.org/pdf/1912.03280v1.pdf | |
PWC | https://paperswithcode.com/paper/recent-advances-in-deep-learning-applied-to |
Repo | |
Framework | |
Deep Feature Learning from a Hospital-Scale Chest X-ray Dataset with Application to TB Detection on a Small-Scale Dataset
Title | Deep Feature Learning from a Hospital-Scale Chest X-ray Dataset with Application to TB Detection on a Small-Scale Dataset |
Authors | Ophir Gozes, Hayit Greenspan |
Abstract | The use of ImageNet pre-trained networks is becoming widespread in the medical imaging community. It enables training on small datasets, commonly available in medical imaging tasks. The recent emergence of a large Chest X-ray dataset opened the possibility for learning features that are specific to the X-ray analysis task. In this work, we demonstrate that the features learned allow for better classification results for the problem of Tuberculosis detection and enable generalization to an unseen dataset. To accomplish the task of feature learning, we train a DenseNet-121 CNN on 112K images from the ChestXray14 dataset which includes labels of 14 common thoracic pathologies. In addition to the pathology labels, we incorporate metadata which is available in the dataset: Patient Positioning, Gender and Patient Age. We term this architecture MetaChexNet. As a by-product of the feature learning, we demonstrate state-of-the-art performance on the task of patient Age & Gender estimation using CNNs. Finally, we show the features learned using ChestXray14 allow for better transfer learning on small-scale datasets for Tuberculosis. |
Tasks | Transfer Learning |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00768v1 |
https://arxiv.org/pdf/1906.00768v1.pdf | |
PWC | https://paperswithcode.com/paper/190600768 |
Repo | |
Framework | |
Material Segmentation of Multi-View Satellite Imagery
Title | Material Segmentation of Multi-View Satellite Imagery |
Authors | Matthew Purri, Jia Xue, Kristin Dana, Matthew Leotta, Dan Lipsa, Zhixin Li, Bo Xu, Jie Shan |
Abstract | Material recognition methods use image context and local cues for pixel-wise classification. In many cases only a single image is available to make a material prediction. Image sequences, routinely acquired in applications such as multiview stereo, can provide a sampling of the underlying reflectance functions that reveal pixel-level material attributes. We investigate multi-view material segmentation using two datasets generated for building material segmentation and scene material segmentation from the SpaceNet Challenge satellite image dataset. In this paper, we explore the impact of multi-angle reflectance information by introducing the reflectance residual encoding, which captures both the multi-angle and multispectral information present in our datasets. The residuals are computed by differencing the sparse-sampled reflectance function with a dictionary of pre-defined dense-sampled reflectance functions. Our proposed reflectance residual features improve material segmentation performance when integrated into pixel-wise and semantic segmentation architectures. At test time, predictions from individual segmentations are combined through softmax fusion and refined by building segment voting. We demonstrate robust and accurate pixelwise segmentation results using the proposed material segmentation pipeline. |
Tasks | Material Recognition, Semantic Segmentation |
Published | 2019-04-17 |
URL | http://arxiv.org/abs/1904.08537v1 |
http://arxiv.org/pdf/1904.08537v1.pdf | |
PWC | https://paperswithcode.com/paper/material-segmentation-of-multi-view-satellite |
Repo | |
Framework | |
IoT based Smart Access Controlled Secure Smart City Architecture Using Blockchain
Title | IoT based Smart Access Controlled Secure Smart City Architecture Using Blockchain |
Authors | Rourab Paul, Nimisha Ghosh, Suman Sau, Amlan Chakrabarti, Prasant Mahapatra |
Abstract | Standard security protocols like SSL, TLS, IPSec etc. have high memory and processor consumption, which makes them unsuitable for resource-constrained platforms such as the Internet of Things (IoT). Blockchain (BC) finds efficient application in IoT platforms to preserve the five basic cryptographic primitives, namely confidentiality, authenticity, integrity, availability and non-repudiation. Conventional adoption of BC in IoT platforms causes high energy consumption, delay and computational overhead, which are not appropriate for various resource-constrained IoT devices. This work proposes a machine learning (ML) based smart access control framework in a public and a private BC for a smart city application, which makes it more efficient than existing IoT applications. The proposed IoT based smart city architecture adopts BC technology for preserving all the cryptographic security and privacy issues. Moreover, BC has very minimal overhead on the IoT platform as well. This work investigates the existing threat models and critical access control issues which handle multiple permissions of various nodes and detects relevant inconsistencies to notify the corresponding nodes. Comparison in terms of all security issues with existing literature shows that the proposed architecture is competitively efficient in terms of security access control. |
Tasks | |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11538v3 |
https://arxiv.org/pdf/1908.11538v3.pdf | |
PWC | https://paperswithcode.com/paper/iot-based-smart-access-controlled-secure |
Repo | |
Framework | |
Mining Rules Incrementally over Large Knowledge Bases
Title | Mining Rules Incrementally over Large Knowledge Bases |
Authors | Xiaofeng Zhou, Ali Sadeghian, Daisy Zhe Wang |
Abstract | Multiple web-scale Knowledge Bases, e.g., Freebase, YAGO, NELL, have been constructed using semi-supervised or unsupervised information extraction techniques and many of them, despite their large sizes, are continuously growing. Much research effort has been put into mining inference rules from knowledge bases. To address the task of rule mining over evolving web-scale knowledge bases, we propose a parallel incremental rule mining framework. Our approach is able to efficiently mine rules based on the relational model and apply updates to large knowledge bases; we propose an alternative metric that reduces computation complexity without compromising quality; we apply multiple optimization techniques that reduce runtime by more than 2 orders of magnitude. Experiments show that our approach efficiently scales to web-scale knowledge bases and saves over 90% time compared to the state-of-the-art batch rule mining system. We also apply our optimization techniques to the batch rule mining algorithm, reducing runtime by more than half compared to the state-of-the-art. To the best of our knowledge, our incremental rule mining system is the first that handles updates to web-scale knowledge bases. |
Tasks | |
Published | 2019-04-20 |
URL | http://arxiv.org/abs/1904.09399v1 |
http://arxiv.org/pdf/1904.09399v1.pdf | |
PWC | https://paperswithcode.com/paper/190409399 |
Repo | |
Framework | |
FaceSpoof Buster: a Presentation Attack Detector Based on Intrinsic Image Properties and Deep Learning
Title | FaceSpoof Buster: a Presentation Attack Detector Based on Intrinsic Image Properties and Deep Learning |
Authors | Rodrigo Bresan, Allan Pinto, Anderson Rocha, Carlos Beluzo, Tiago Carvalho |
Abstract | Nowadays, the adoption of face recognition for biometric authentication systems is common, mainly because this is one of the most accessible biometric modalities. Techniques that attempt to trespass this kind of system by using a forged biometric sample, such as a printed paper or a recorded video of a genuine access, are known as presentation attacks, but may also be referred to in the literature as face spoofing. Presentation attack detection is a crucial step for preventing this kind of unauthorized access to restricted areas and/or devices. In this paper, we propose a novel approach which relies on a combination of intrinsic image properties and deep neural networks to detect presentation attack attempts. Our method explores depth, salience and illumination maps, associated with a pre-trained Convolutional Neural Network, in order to produce robust and discriminant features. Each of these properties is individually classified and, at the end of the process, they are combined by a meta learning classifier, which achieves outstanding results on the most popular datasets for PAD. Results show that the proposed method is able to surpass state-of-the-art results in an inter-dataset protocol, which is defined as the most challenging in the literature. |
Tasks | Face Recognition, Meta-Learning |
Published | 2019-02-07 |
URL | http://arxiv.org/abs/1902.02845v1 |
http://arxiv.org/pdf/1902.02845v1.pdf | |
PWC | https://paperswithcode.com/paper/facespoof-buster-a-presentation-attack |
Repo | |
Framework | |
Neural Logic Rule Layers
Title | Neural Logic Rule Layers |
Authors | Jan Niclas Reimann, Andreas Schwung |
Abstract | Despite their great success in recent years, deep neural networks (DNNs) are mainly black boxes where the results obtained by running through the network are difficult to understand and interpret. Compared to, e.g., decision trees or Bayesian classifiers, DNNs suffer from poor interpretability, where by interpretability we mean that a human can easily derive the relations modeled by the network. A reasonable way to provide interpretability for humans is logical rules. In this paper we propose neural logic rule layers (NLRL), which are able to represent arbitrary logic rules in terms of their conjunctive and disjunctive normal forms. Using various NLRL within one layer and correspondingly stacking various layers, we are able to represent arbitrarily complex rules by the resulting neural network architecture. The NLRL are end-to-end trainable, allowing logic rules to be learned directly from available data sets. Experiments show that NLRL-enhanced neural networks can learn to model arbitrarily complex logic and perform arithmetic operations over the input values. |
Tasks | |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00878v1 |
https://arxiv.org/pdf/1907.00878v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-logic-rule-layers |
Repo | |
Framework | |
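One way to picture a layer that represents rules in disjunctive normal form is through a differentiable relaxation of AND and OR. The sketch below uses a product-based relaxation with selection weights in [0, 1]; this is a common construction and only an assumption about how such a layer could look, not the formulation in the paper. The names `soft_and`, `soft_or` and `dnf_layer` are hypothetical.

```python
import numpy as np

def soft_and(x, w):
    """Relaxed AND over the inputs selected by w (w_i = 0 ignores input i)."""
    return np.prod(1.0 - w * (1.0 - x), axis=-1)

def soft_or(x, w):
    """Relaxed OR over the inputs selected by w (w_i = 0 ignores input i)."""
    return 1.0 - np.prod(1.0 - w * x, axis=-1)

def dnf_layer(x, and_weights, or_weights):
    """Evaluate an OR of AND-clauses over the literals x and 1 - x.

    x:           (n_inputs,) truth values in [0, 1]
    and_weights: (n_clauses, 2 * n_inputs) literal selection per clause
    or_weights:  (n_clauses,) clause selection for the final OR
    """
    literals = np.concatenate([x, 1.0 - x])
    clauses = soft_and(literals[None, :], and_weights)
    return soft_or(clauses, or_weights)

# XOR as (a AND NOT b) OR (NOT a AND b); literal order is [a, b, NOT a, NOT b].
xor_and = np.array([[1.0, 0.0, 0.0, 1.0],
                    [0.0, 1.0, 1.0, 0.0]])
xor_or = np.array([1.0, 1.0])
```

With binary inputs and binary weights the layer reproduces the exact truth table, while fractional weights keep every operation differentiable so the selections could in principle be trained by gradient descent.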
Adaptive Hierarchical Down-Sampling for Point Cloud Classification
Title | Adaptive Hierarchical Down-Sampling for Point Cloud Classification |
Authors | Ehsan Nezhadarya, Ehsan Taghavi, Bingbing Liu, Jun Luo |
Abstract | While several convolution-like operators have recently been proposed for extracting features out of point clouds, down-sampling an unordered point cloud in a deep neural network has not been rigorously studied. Existing methods down-sample the points regardless of their importance for the output. As a result, some important points in the point cloud may be removed, while less valuable points may be passed to the next layers. In contrast, adaptive down-sampling methods sample the points by taking into account the importance of each point, which varies based on the application, task and training data. In this paper, we propose a permutation-invariant learning-based adaptive down-sampling layer, called Critical Points Layer (CPL), which reduces the number of points in an unordered point cloud while retaining the important points. Unlike most graph-based point cloud down-sampling methods that use k-NN search algorithm to find the neighbouring points, CPL is a global down-sampling method, rendering it computationally very efficient. The proposed layer can be used along with any graph-based point cloud convolution layer to form a convolutional neural network, dubbed CP-Net in this paper. We introduce a CP-Net for 3D object classification that achieves the best accuracy for the ModelNet40 dataset among point cloud-based methods, which validates the effectiveness of the CPL. |
Tasks | Object Classification |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.08506v1 |
http://arxiv.org/pdf/1904.08506v1.pdf | |
PWC | https://paperswithcode.com/paper/190408506 |
Repo | |
Framework | |
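The idea of a global, importance-based down-sampling step can be sketched as follows. The abstract does not spell out how importance is measured, so this example assumes one plausible reading: a point is "critical" if it attains the maximum in at least one feature channel, and its importance is the sum of the maxima it contributes. The function name `critical_points` and the fixed-size padding rule are hypothetical.

```python
import numpy as np

def critical_points(features, num_keep):
    """Select point indices by their contribution to channel-wise maxima.

    features: (N, C) per-point feature matrix.
    Returns an index array of length num_keep.
    """
    winners = features.argmax(axis=0)   # point index attaining each channel's maximum
    importance = np.zeros(features.shape[0])
    # Accumulate each channel's maximum onto the point that attains it.
    np.add.at(importance, winners, features.max(axis=0))
    candidates = np.unique(winners)
    order = candidates[np.argsort(-importance[candidates], kind="stable")]
    # Truncate, or repeat cyclically, to reach a fixed output size.
    return np.resize(order, num_keep)
```

No neighbour search is involved, only a reduction over all points, and (barring ties) reordering the input points selects the same physical points.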
Know More about Each Other: Evolving Dialogue Strategy via Compound Assessment
Title | Know More about Each Other: Evolving Dialogue Strategy via Compound Assessment |
Authors | Siqi Bao, Huang He, Fan Wang, Rongzhong Lian, Hua Wu |
Abstract | In this paper, a novel Generation-Evaluation framework is developed for multi-turn conversations with the objective of letting both participants know more about each other. For the sake of rational knowledge utilization and coherent conversation flow, a dialogue strategy which controls knowledge selection is instantiated and continuously adapted via reinforcement learning. Under the deployed strategy, knowledge grounded conversations are conducted with two dialogue agents. The generated dialogues are comprehensively evaluated on aspects like informativeness and coherence, which are aligned with our objective and human instinct. These assessments are integrated as a compound reward to guide the evolution of dialogue strategy via policy gradient. Comprehensive experiments have been carried out on the publicly available dataset, demonstrating that the proposed method outperforms the other state-of-the-art approaches significantly. |
Tasks | |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00549v1 |
https://arxiv.org/pdf/1906.00549v1.pdf | |
PWC | https://paperswithcode.com/paper/190600549 |
Repo | |
Framework | |
Assessing Capsule Networks With Biased Data
Title | Assessing Capsule Networks With Biased Data |
Authors | Bruno Ferrarini, Shoaib Ehsan, Adrien Bartoli, Aleš Leonardis, Klaus D. McDonald-Maier |
Abstract | Machine learning based methods achieve impressive results in object classification and detection. Utilizing representative data of the visual world during the training phase is crucial to achieve good performance with such data-driven approaches. However, it is not always possible to access bias-free datasets; thus, robustness to biased data is a desirable property for a learning system. Capsule Networks have been introduced recently and their tolerance to biased data has received little attention. This paper aims to fill this gap and proposes two experimental scenarios to assess the tolerance to imbalanced training data and to determine the generalization performance of a model with unfamiliar affine transformations of the images. This paper assesses dynamic routing and EM routing based Capsule Networks and proposes a comparison with Convolutional Neural Networks in the two tested scenarios. The presented results provide new insights into the behaviour of capsule networks. |
Tasks | Object Classification |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04555v1 |
http://arxiv.org/pdf/1904.04555v1.pdf | |
PWC | https://paperswithcode.com/paper/assessing-capsule-networks-with-biased-data |
Repo | |
Framework | |