January 31, 2020

3335 words 16 mins read

Paper Group ANR 131

Stateful Detection of Black-Box Adversarial Attacks. Quantum Data Fitting Algorithm for Non-sparse Matrices. Loss Switching Fusion with Similarity Search for Video Classification. A Paired Sparse Representation Model for Robust Face Recognition from a Single Sample. Pathologist-Level Grading of Prostate Biopsies with Artificial Intelligence. Bandit …

Stateful Detection of Black-Box Adversarial Attacks

Title Stateful Detection of Black-Box Adversarial Attacks
Authors Steven Chen, Nicholas Carlini, David Wagner
Abstract The problem of adversarial examples, evasion attacks on machine learning classifiers, has proven extremely difficult to solve. This is true even when, as is the case in many practical settings, the classifier is hosted as a remote service and so the adversary does not have direct access to the model parameters. This paper argues that in such settings, defenders have a much larger space of actions than has previously been explored. Specifically, we deviate from the implicit assumption made by prior work that a defense must be a stateless function operating on individual examples, and explore the possibility of stateful defenses. To begin, we develop a defense designed to detect the process of adversarial example generation. By keeping a history of past queries, a defender can try to identify when a sequence of queries appears to serve the purpose of generating an adversarial example. We then introduce query blinding, a new class of attacks designed to bypass this kind of defense. We believe that expanding the study of adversarial examples from stateless classifiers to stateful systems is not only more realistic for many black-box settings, but also gives the defender a much-needed advantage in responding to the adversary.
Tasks
Published 2019-07-12
URL https://arxiv.org/abs/1907.05587v1
PDF https://arxiv.org/pdf/1907.05587v1.pdf
PWC https://paperswithcode.com/paper/stateful-detection-of-black-box-adversarial
Repo
Framework
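
A minimal editorial sketch of the query-history detection idea described in the abstract above, assuming a per-query feature vector is available; the similarity measure, neighbor count, and threshold are placeholders, not the authors' implementation (which uses a learned similarity encoder).

```python
# Hypothetical sketch of a stateful query-history detector (not the authors' code).
# A real system would use a learned similarity encoder; here we compare raw features.
from collections import deque
import numpy as np

class StatefulDetector:
    def __init__(self, k=50, threshold=1.0, history_size=10_000):
        self.history = deque(maxlen=history_size)  # features of past queries
        self.k = k                  # number of nearest past queries to compare against
        self.threshold = threshold  # mean k-NN distance below this => suspicious

    def observe(self, features: np.ndarray) -> bool:
        """Record a query; return True if it looks like part of an attack sequence."""
        suspicious = False
        if len(self.history) >= self.k:
            dists = np.linalg.norm(np.asarray(self.history) - features, axis=1)
            knn_mean = np.sort(dists)[: self.k].mean()
            suspicious = knn_mean < self.threshold  # many near-duplicate queries nearby
        self.history.append(features)
        return suspicious
```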

Quantum Data Fitting Algorithm for Non-sparse Matrices

Title Quantum Data Fitting Algorithm for Non-sparse Matrices
Authors Guangxi Li, Youle Wang, Yu Luo, Yuan Feng
Abstract We propose a quantum data fitting algorithm for non-sparse matrices, which is based on the Quantum Singular Value Estimation (QSVE) subroutine and a novel efficient method for recovering the signs of eigenvalues. Our algorithm generalizes the quantum data fitting algorithm of Wiebe, Braun, and Lloyd for sparse and well-conditioned matrices by adding a regularization term to avoid over-fitting, an important problem in machine learning. As a result, the algorithm achieves a sparsity-independent runtime of $O(\kappa^2\sqrt{N}\mathrm{polylog}(N)/(\epsilon\log\kappa))$ for an $N\times N$ Hermitian matrix $\bm{F}$, where $\kappa$ denotes the condition number of $\bm{F}$ and $\epsilon$ is the precision parameter. This amounts to a polynomial speedup in the dimension of the matrix compared with classical data fitting algorithms, and a strictly subquadratic dependence on $\kappa$.
Tasks
Published 2019-07-16
URL https://arxiv.org/abs/1907.06949v1
PDF https://arxiv.org/pdf/1907.06949v1.pdf
PWC https://paperswithcode.com/paper/quantum-data-fitting-algorithm-for-non-sparse
Repo
Framework
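
For orientation, the regularized fit the abstract alludes to is, in our notation (an assumption; the paper's exact formulation may differ), the standard ridge-regularized least-squares problem for a data vector $\bm{y}$:

$$\min_{\bm{x}}\ \lVert \bm{F}\bm{x}-\bm{y}\rVert_2^2+\lambda\lVert\bm{x}\rVert_2^2, \qquad \bm{x}^\star=(\bm{F}^\dagger\bm{F}+\lambda\bm{I})^{-1}\bm{F}^\dagger\bm{y}$$

Here $\lambda$ is the regularization weight; the closed form on the right is what a classical solver would compute, whereas the quantum algorithm estimates the spectrum of $\bm{F}$ via QSVE instead.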

Loss Switching Fusion with Similarity Search for Video Classification

Title Loss Switching Fusion with Similarity Search for Video Classification
Authors Lei Wang, Du Q. Huynh, Moussa Reda Mansour
Abstract From video streaming to security and surveillance applications, video data play an important role in daily life. However, managing large amounts of video data and retrieving the most useful information for the user remain challenging tasks. In this paper, we propose a novel video classification system that benefits the scene understanding task. We define our classification problem as classifying background and foreground motions in outdoor scenes using the same feature representation, which means the representation must be robust and adaptable to different classification tasks. We propose a lightweight Loss Switching Fusion Network (LSFNet) for the fusion of spatiotemporal descriptors, together with a similarity search scheme with soft voting to boost classification performance. The proposed system has a variety of potential applications, such as content-based video clustering and video filtering. Evaluation results on two private industry datasets show that our system is robust both in classifying different background motions and in detecting human motions against these background motions.
Tasks Scene Understanding, Video Classification
Published 2019-06-27
URL https://arxiv.org/abs/1906.11465v1
PDF https://arxiv.org/pdf/1906.11465v1.pdf
PWC https://paperswithcode.com/paper/loss-switching-fusion-with-similarity-search
Repo
Framework
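
A hedged sketch of similarity search with soft voting as we read it: the class distributions of the most similar gallery items are averaged, weighted by similarity. The cosine weighting and variable names are our assumptions, not the paper's exact scheme.

```python
# Illustrative soft voting over a similarity search (not the authors' code).
import numpy as np

def soft_vote(query, gallery_feats, gallery_probs, k=5):
    """Average the class distributions of the k most similar gallery items,
    weighting each neighbor by its cosine similarity to the query."""
    q = query / np.linalg.norm(query)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q                          # cosine similarity to every gallery item
    top = np.argsort(sims)[-k:]           # indices of the k nearest neighbors
    w = sims[top] / sims[top].sum()       # normalized similarity weights
    return (w[:, None] * gallery_probs[top]).sum(axis=0)  # fused class distribution
```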

A Paired Sparse Representation Model for Robust Face Recognition from a Single Sample

Title A Paired Sparse Representation Model for Robust Face Recognition from a Single Sample
Authors Fania Mokhayeri, Eric Granger
Abstract Sparse representation-based classification (SRC) has been shown to achieve a high level of accuracy in face recognition (FR). However, matching faces captured in unconstrained video against a gallery with a single reference facial still per individual typically yields low accuracy. For improved robustness to intra-class variations, SRC techniques for FR have recently been extended to incorporate variational information from an external generic set into an auxiliary dictionary. Despite their success in handling linear variations, non-linear variations (e.g., pose and expressions) between probe and reference facial images cannot be accurately reconstructed with a linear combination of images in the gallery and auxiliary dictionaries because they do not share the same type of variations. In order to account for non-linear variations due to pose, a paired sparse representation model is introduced allowing for joint use of variational information and synthetic face images. The proposed model, called synthetic plus variational model, reconstructs a probe image by jointly using (1) a variational dictionary and (2) a gallery dictionary augmented with a set of synthetic images generated over a wide diversity of pose angles. The augmented gallery dictionary is then encouraged to pair the same sparsity pattern with the variational dictionary for similar pose angles by solving a newly formulated simultaneous sparsity-based optimization problem. Experimental results obtained on Chokepoint and COX-S2V datasets, using different face representations, indicate that the proposed approach can outperform state-of-the-art SRC-based methods for still-to-video FR with a single sample per person.
Tasks Face Recognition, Robust Face Recognition, Sparse Representation-based Classification
Published 2019-10-05
URL https://arxiv.org/abs/1910.02192v1
PDF https://arxiv.org/pdf/1910.02192v1.pdf
PWC https://paperswithcode.com/paper/a-paired-sparse-representation-model-for
Repo
Framework
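
For orientation, the standard SRC recipe that the paired model extends is shown below (notation ours; the paper's formulation adds a simultaneous sparsity constraint pairing the two dictionaries across pose angles):

$$\min_{\alpha}\ \lVert \mathbf{y}-\mathbf{D}\alpha\rVert_2^2+\lambda\lVert\alpha\rVert_1, \qquad \hat{c}=\arg\min_{c}\ \lVert \mathbf{y}-\mathbf{D}\,\delta_c(\alpha)\rVert_2$$

Here $\mathbf{D}$ would concatenate the augmented gallery dictionary and the variational dictionary, and $\delta_c(\alpha)$ keeps only the coefficients associated with class $c$.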

Pathologist-Level Grading of Prostate Biopsies with Artificial Intelligence

Title Pathologist-Level Grading of Prostate Biopsies with Artificial Intelligence
Authors Peter Ström, Kimmo Kartasalo, Henrik Olsson, Leslie Solorzano, Brett Delahunt, Daniel M. Berney, David G. Bostwick, Andrew J. Evans, David J. Grignon, Peter A. Humphrey, Kenneth A. Iczkowski, James G. Kench, Glen Kristiansen, Theodorus H. van der Kwast, Katia R. M. Leite, Jesse K. McKenney, Jon Oxley, Chin-Chen Pan, Hemamali Samaratunga, John R. Srigley, Hiroyuki Takahashi, Toyonori Tsuzuki, Murali Varma, Ming Zhou, Johan Lindberg, Cecilia Bergström, Pekka Ruusuvuori, Carolina Wählby, Henrik Grönberg, Mattias Rantalainen, Lars Egevad, Martin Eklund
Abstract Background: An increasing volume of prostate biopsies and a worldwide shortage of uro-pathologists put a strain on pathology departments. Additionally, high intra- and inter-observer variability in grading can result in over- and undertreatment of prostate cancer. Artificial intelligence (AI) methods may alleviate these problems by assisting pathologists, reducing workload and harmonizing grading. Methods: We digitized 6,682 needle biopsies from 976 participants in the population-based STHLM3 diagnostic study to train deep neural networks for assessing prostate biopsies. The networks were evaluated by predicting the presence, extent, and Gleason grade of malignant tissue for an independent test set comprising 1,631 biopsies from 245 men. We additionally evaluated grading performance on 87 biopsies individually graded by 23 experienced urological pathologists from the International Society of Urological Pathology. We assessed discriminatory performance by receiver operating characteristics (ROC) and tumor extent predictions by correlating predicted millimeter cancer length against measurements by the reporting pathologist. We quantified the concordance between grades assigned by the AI and the expert urological pathologists using Cohen’s kappa. Results: The performance of the AI in detecting and grading cancer in prostate needle biopsy samples was comparable to that of international experts in prostate pathology. The AI achieved an area under the ROC curve of 0.997 for distinguishing between benign and malignant biopsy cores, and 0.999 for distinguishing between men with or without prostate cancer. The correlation between millimeters of cancer predicted by the AI and assigned by the reporting pathologist was 0.96. For assigning Gleason grades, the AI achieved an average pairwise kappa of 0.62, within the range of the corresponding values for the expert pathologists (0.60 to 0.73).
Tasks
Published 2019-07-02
URL https://arxiv.org/abs/1907.01368v1
PDF https://arxiv.org/pdf/1907.01368v1.pdf
PWC https://paperswithcode.com/paper/pathologist-level-grading-of-prostate
Repo
Framework
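
The concordance metric reported above can be computed as follows; this is a toy example with hypothetical grade labels, not the study's data.

```python
# Cohen's kappa between two raters' Gleason grade group assignments (toy data).
from sklearn.metrics import cohen_kappa_score

ai_grades   = [1, 2, 2, 3, 5, 4, 1, 2]   # hypothetical AI grade groups
path_grades = [1, 2, 3, 3, 5, 4, 1, 1]   # hypothetical pathologist grades

# kappa = 1.0 means perfect agreement, 0.0 means chance-level agreement
print(cohen_kappa_score(ai_grades, path_grades))
```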

Bandit Learning Through Biased Maximum Likelihood Estimation

Title Bandit Learning Through Biased Maximum Likelihood Estimation
Authors Xi Liu, Ping-Chun Hsieh, Anirban Bhattacharya, P. R. Kumar
Abstract We propose BMLE, a new family of bandit algorithms that is formulated in a general way based on the Biased Maximum Likelihood Estimation method originally appearing in the adaptive control literature. We design the cost-bias term to tackle the exploration-exploitation tradeoff in stochastic bandit problems. We provide an explicit closed-form expression for the index of an arm for Bernoulli bandits, which is trivial to compute. We also provide a general recipe for extending the BMLE algorithm to other families of reward distributions. We prove that for Bernoulli bandits, the BMLE algorithm achieves a logarithmic finite-time regret bound and hence attains order optimality. Through extensive simulations, we demonstrate that the proposed algorithms achieve regret performance comparable to the best of several state-of-the-art baselines, while holding a significant computational advantage over the other best-performing methods. The generality of the proposed approach makes it possible to address more complex models, including general adaptive control of Markovian systems.
Tasks
Published 2019-07-02
URL https://arxiv.org/abs/1907.01287v2
PDF https://arxiv.org/pdf/1907.01287v2.pdf
PWC https://paperswithcode.com/paper/bandit-learning-through-biased-maximum
Repo
Framework
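
A generic index-policy loop for Bernoulli bandits, to show where a BMLE-style index would plug in. The actual BMLE index has a closed form given in the paper that is not reproduced here; UCB1 is used below purely as a stand-in.

```python
# Generic index-policy simulation for Bernoulli bandits (illustration only).
import numpy as np

def run_index_policy(index_fn, arm_means, horizon=1000, rng=np.random.default_rng(0)):
    n_arms = len(arm_means)
    pulls = np.zeros(n_arms)      # times each arm was played
    wins = np.zeros(n_arms)       # observed successes per arm
    for t in range(horizon):
        if t < n_arms:
            a = t                 # play every arm once to initialize
        else:
            a = int(np.argmax([index_fn(wins[i], pulls[i], t) for i in range(n_arms)]))
        r = rng.random() < arm_means[a]   # Bernoulli reward
        pulls[a] += 1
        wins[a] += r
    return wins.sum()

# Example with a UCB1 index as a stand-in for the BMLE index:
ucb1 = lambda w, n, t: w / n + np.sqrt(2 * np.log(t + 1) / n)
print(run_index_policy(ucb1, arm_means=[0.3, 0.5, 0.7]))
```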

Spatio-Temporal Fusion Networks for Action Recognition

Title Spatio-Temporal Fusion Networks for Action Recognition
Authors Sangwoo Cho, Hassan Foroosh
Abstract Video-based CNN works have focused on effective ways to fuse appearance and motion networks, but they typically fail to exploit temporal information over video frames. In this work, we present a novel spatio-temporal fusion network (STFN) that integrates the temporal dynamics of appearance and motion information from entire videos. The captured temporal dynamics are then aggregated into a better video-level representation, learned via end-to-end training. The spatio-temporal fusion network consists of two sets of Residual Inception blocks that extract temporal dynamics, plus a fusion connection for appearance and motion features. The benefits of STFN are: (a) it captures local and global temporal dynamics of complementary data to learn video-wide information; and (b) it is applicable to any network for video classification to boost performance. We explore a variety of design choices for STFN and verify how network performance varies in ablation studies. We perform experiments on two challenging human activity datasets, UCF101 and HMDB51, and achieve state-of-the-art results with the best network.
Tasks Video Classification
Published 2019-06-17
URL https://arxiv.org/abs/1906.06822v1
PDF https://arxiv.org/pdf/1906.06822v1.pdf
PWC https://paperswithcode.com/paper/spatio-temporal-fusion-networks-for-action
Repo
Framework
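
A minimal two-stream fusion sketch in PyTorch, assuming 512-dimensional appearance and motion features; this is our simplification for illustration, not the paper's Residual Inception design.

```python
# Sketch: concatenate appearance and motion features, fuse with a residual MLP.
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    def __init__(self, dim=512):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(),
            nn.Linear(dim, dim),
        )
        self.skip = nn.Linear(2 * dim, dim)  # residual path for the fused input

    def forward(self, appearance, motion):
        x = torch.cat([appearance, motion], dim=-1)
        return torch.relu(self.fuse(x) + self.skip(x))

fused = FusionBlock()(torch.randn(8, 512), torch.randn(8, 512))  # shape (8, 512)
```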

Decoding Imagined Speech and Computer Control using Brain Waves

Title Decoding Imagined Speech and Computer Control using Brain Waves
Authors Abhiram Singh, Ashwin Gumaste
Abstract In this work, we explore the possibility of decoding Imagined Speech brain waves using machine learning techniques. We propose covariance matrices of Electroencephalogram channels as input features, projection to the tangent space of the covariance matrices to obtain feature vectors, principal component analysis for dimensionality reduction, an artificial feed-forward neural network as the classification model, and bootstrap aggregation for creating an ensemble of neural network models. After classification, two different Finite State Machines are designed that create an interface for controlling a computer system using an Imagined Speech-based BCI system. The proposed approach is able to decode the Imagined Speech signal with a maximum mean classification accuracy of 85% on the binary classification task of one long word versus one short word. We also show that our proposed approach is able to differentiate imagined speech brain signals from rest-state brain signals with a maximum mean classification accuracy of 94%. We compared our proposed method with other approaches for decoding imagined speech and show that our approach performs on par with the state-of-the-art approach on decoding long vs. short words and outperforms it significantly on the other two tasks, decoding three short words and three vowels, by average margins of 11% and 9%, respectively. We also obtain an information transfer rate of 21 bits per minute when using an Imagined Speech-based system to operate a computer. These results show that the proposed approach is able to decode a wide variety of imagined speech signals without any human-designed features.
Tasks Dimensionality Reduction
Published 2019-11-08
URL https://arxiv.org/abs/1911.04255v2
PDF https://arxiv.org/pdf/1911.04255v2.pdf
PWC https://paperswithcode.com/paper/decoding-imagined-speech-and-computer-control
Repo
Framework
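
A sketch of the described pipeline reconstructed from the abstract: channel covariance, tangent-space projection, PCA, then a bagged MLP ensemble. The arithmetic-mean reference point and sklearn defaults are our simplifying assumptions; the authors' choices may differ.

```python
# Reconstruction of the described EEG pipeline (our sketch, not the authors' code).
import numpy as np
from scipy.linalg import logm, sqrtm, inv
from sklearn.decomposition import PCA
from sklearn.ensemble import BaggingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

def tangent_vector(cov, ref):
    """Project an SPD covariance matrix to the tangent space at `ref` (simplified)."""
    ref_inv_sqrt = inv(sqrtm(ref))
    s = logm(ref_inv_sqrt @ cov @ ref_inv_sqrt)   # matrix log at the reference point
    return s[np.triu_indices_from(s)]             # upper triangle as a vector

def featurize(trials):
    """trials: (n_trials, n_channels, n_samples) EEG array -> feature matrix."""
    covs = np.array([np.cov(t) for t in trials])
    ref = covs.mean(axis=0)                       # arithmetic mean as reference
    return np.array([tangent_vector(c, ref).real for c in covs])

clf = make_pipeline(
    PCA(n_components=0.95),                       # keep 95% of the variance
    BaggingClassifier(MLPClassifier(hidden_layer_sizes=(64,), max_iter=500),
                      n_estimators=10),           # bootstrap-aggregated MLP ensemble
)
# clf.fit(featurize(train_trials), train_labels)  # with your own EEG trials/labels
```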

Wavelet regression and additive models for irregularly spaced data

Title Wavelet regression and additive models for irregularly spaced data
Authors Asad Haris, Noah Simon, Ali Shojaie
Abstract We present a novel approach for nonparametric regression using wavelet basis functions. Our proposal, $\texttt{waveMesh}$, can be applied to non-equispaced data with sample size not necessarily a power of 2. We develop an efficient proximal gradient descent algorithm for computing the estimator and establish adaptive minimax convergence rates. The main appeal of our approach is that it naturally extends to additive and sparse additive models for a potentially large number of covariates. We prove minimax optimal convergence rates under a weak compatibility condition for sparse additive models. The compatibility condition holds when we have a small number of covariates. Additionally, we establish convergence rates for when the condition is not met. We complement our theoretical results with empirical studies comparing $\texttt{waveMesh}$ to existing methods.
Tasks
Published 2019-03-11
URL http://arxiv.org/abs/1903.04631v1
PDF http://arxiv.org/pdf/1903.04631v1.pdf
PWC https://paperswithcode.com/paper/wavelet-regression-and-additive-models-for-1
Repo
Framework
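
The proximal gradient iteration for an $\ell_1$-penalized wavelet fit takes the standard form below (notation ours; waveMesh's interpolation scheme for irregular designs adds structure not shown here):

$$\beta^{(t+1)}=\operatorname{prox}_{\eta\lambda\lVert\cdot\rVert_1}\!\big(\beta^{(t)}-\eta\nabla f(\beta^{(t)})\big), \qquad \big[\operatorname{prox}_{\tau\lVert\cdot\rVert_1}(z)\big]_j=\operatorname{sign}(z_j)\max(\lvert z_j\rvert-\tau,0)$$

Here $f$ is the squared-error loss in the wavelet coefficients $\beta$, $\eta$ is the step size, and the proximal operator reduces to elementwise soft-thresholding.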

From-Below Boolean Matrix Factorization Algorithm Based on MDL

Title From-Below Boolean Matrix Factorization Algorithm Based on MDL
Authors Tatiana Makhalova, Martin Trnecka
Abstract During the past few years, Boolean matrix factorization (BMF) has become an important direction in data analysis. The minimum description length (MDL) principle has been successfully adapted in BMF for model order selection. Nevertheless, a BMF algorithm that performs well in terms of the standard BMF measures is still missing. In this paper, we propose a novel from-below Boolean matrix factorization algorithm based on formal concept analysis. The algorithm uses the MDL principle as the criterion for factor selection. In various experiments we show that the proposed algorithm outperforms existing state-of-the-art BMF algorithms from several standpoints.
Tasks
Published 2019-01-28
URL http://arxiv.org/abs/1901.09567v1
PDF http://arxiv.org/pdf/1901.09567v1.pdf
PWC https://paperswithcode.com/paper/from-below-boolean-matrix-factorization
Repo
Framework
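
In the usual two-part formulation (notation ours, not necessarily the paper's), the MDL criterion for a candidate factor set $\mathcal{F}$ of a Boolean matrix $\bm{I}$ is

$$\operatorname{cost}(\mathcal{F}) = L(\mathcal{F}) + L(\bm{I}\mid\mathcal{F})$$

where $L(\mathcal{F})$ is the description length of the factors themselves and $L(\bm{I}\mid\mathcal{F})$ that of the data given the factors (i.e., the cells where the factorization and $\bm{I}$ disagree); a candidate factor is retained only if adding it lowers the total.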

Patch Transformer for Multi-tagging Whole Slide Histopathology Images

Title Patch Transformer for Multi-tagging Whole Slide Histopathology Images
Authors Weijian Li, Viet-Duy Nguyen, Haofu Liao, Matt Wilder, Ke Cheng, Jiebo Luo
Abstract Demand for automated whole slide image (WSI) tagging has grown with the increasing volume and diversity of WSIs collected in histopathology. Various methods have been studied to classify WSIs with single tags, but none of them focuses on labeling WSIs with multiple tags. To this end, we propose a novel end-to-end trainable deep neural network named Patch Transformer, which can effectively predict multiple slide-level tags from WSI patches based on both the correlations and the uniqueness between the tags. Specifically, the proposed method learns patch characteristics considering 1) patch-wise relations through a patch transformation module and 2) tag-wise uniqueness for each tagging task through a multi-tag attention module. Extensive experiments on a large and diverse dataset consisting of 4,920 WSIs prove the effectiveness of the proposed model.
Tasks
Published 2019-06-10
URL https://arxiv.org/abs/1906.04151v3
PDF https://arxiv.org/pdf/1906.04151v3.pdf
PWC https://paperswithcode.com/paper/patch-transformer-for-multi-tagging-whole
Repo
Framework
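
A hedged reading of the multi-tag attention idea: one attention head per tag pools patch features into a tag-specific summary. This PyTorch sketch is our illustration, not the authors' code.

```python
# Per-tag attention pooling over patch features (illustrative sketch).
import torch
import torch.nn as nn

class MultiTagAttention(nn.Module):
    def __init__(self, dim=256, n_tags=5):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_tags, dim))  # one query per tag
        self.heads = nn.ModuleList([nn.Linear(dim, 1) for _ in range(n_tags)])

    def forward(self, patches):                      # patches: (batch, n_patches, dim)
        logits = []
        for q, head in zip(self.queries, self.heads):
            attn = torch.softmax(patches @ q, dim=1)        # (batch, n_patches)
            pooled = (attn.unsqueeze(-1) * patches).sum(1)  # tag-specific summary
            logits.append(head(pooled))
        return torch.cat(logits, dim=-1)             # one logit per tag

out = MultiTagAttention()(torch.randn(2, 100, 256))  # shape (2, 5)
```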

SUPER Learning: A Supervised-Unsupervised Framework for Low-Dose CT Image Reconstruction

Title SUPER Learning: A Supervised-Unsupervised Framework for Low-Dose CT Image Reconstruction
Authors Zhipeng Li, Siqi Ye, Yong Long, Saiprasad Ravishankar
Abstract Recent years have witnessed growing interest in machine learning-based models and techniques for low-dose X-ray CT (LDCT) imaging tasks. These methods can typically be categorized into supervised learning methods and unsupervised or model-based learning methods. Supervised learning methods have recently shown success in image restoration tasks. However, they often rely on large training sets. Model-based learning methods such as dictionary or transform learning do not require large or paired training sets and often have good generalization properties, since they learn general properties of CT image sets. Recent works have shown the promising reconstruction performance of methods such as PWLS-ULTRA that rely on clustering the underlying (reconstructed) image patches into a learned union of transforms. In this paper, we propose a new Supervised-UnsuPERvised (SUPER) reconstruction framework for LDCT image reconstruction that combines the benefits of supervised learning methods and (unsupervised) transform learning-based methods such as PWLS-ULTRA that involve highly image-adaptive clustering. The SUPER model consists of several layers, each of which includes a deep network learned in a supervised manner and an unsupervised iterative method that involves image-adaptive components. The SUPER reconstruction algorithms are learned in a greedy manner from training data. The proposed SUPER learning methods dramatically outperform both the constituent supervised learning-based networks and iterative algorithms for LDCT, and require far fewer iterations in the iterative reconstruction modules.
Tasks Image Reconstruction, Image Restoration
Published 2019-10-26
URL https://arxiv.org/abs/1910.12024v1
PDF https://arxiv.org/pdf/1910.12024v1.pdf
PWC https://paperswithcode.com/paper/super-learning-a-supervised-unsupervised
Repo
Framework
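
Schematically, the layered SUPER reconstruction alternates a supervised network with an unsupervised model-based refinement; a sketch of the inference loop under that reading, where all callable names are placeholders.

```python
# Schematic of layer-wise SUPER inference as we read the abstract (illustration only).
def super_reconstruct(measurements, layers, unsupervised_step, x0):
    """layers: list of trained networks (trained greedily, in order);
    unsupervised_step: e.g. a few PWLS-ULTRA iterations warm-started
    at the network output, enforcing data consistency with `measurements`."""
    x = x0
    for net in layers:
        x = net(x)                              # supervised refinement
        x = unsupervised_step(x, measurements)  # model-based, image-adaptive step
    return x
```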

Global Aggregations of Local Explanations for Black Box models

Title Global Aggregations of Local Explanations for Black Box models
Authors Ilse van der Linden, Hinda Haned, Evangelos Kanoulas
Abstract The decision-making process of many state-of-the-art machine learning models is inherently inscrutable, to the extent that it is impossible for a human to interpret the model directly: they are black box models. This has led to a call for research on explaining black box models, for which there are two main approaches: global explanations, which aim to explain a model’s decision-making process in general, and local explanations, which aim to explain a single prediction. Since it remains challenging to establish fidelity to black box models in globally interpretable approximations, much attention is devoted to local explanations. However, whether local explanations can reliably represent the black box model and provide useful insights remains an open question. We present Global Aggregations of Local Explanations (GALE) with the objective of providing insight into a model’s global decision-making process. Overall, our results reveal that the choice of aggregation matters. We find that the global importance introduced by Local Interpretable Model-agnostic Explanations (LIME) does not reliably represent the model’s global behavior, whereas our proposed aggregations better represent how features affect the model’s predictions and provide global insight by identifying distinguishing features.
Tasks Decision Making
Published 2019-07-05
URL https://arxiv.org/abs/1907.03039v1
PDF https://arxiv.org/pdf/1907.03039v1.pdf
PWC https://paperswithcode.com/paper/global-aggregations-of-local-explanations-for
Repo
Framework
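
An illustrative global aggregation of local explanations using a simple mean-absolute-weight summary; GALE's own aggregations are defined in the paper and differ from this toy example.

```python
# Aggregating LIME-style local weights into a global importance score (toy example).
import numpy as np

def global_importance(local_weights):
    """local_weights: (n_instances, n_features) array of local explanation
    weights. Aggregate by mean absolute weight across instances."""
    return np.abs(local_weights).mean(axis=0)

weights = np.array([[0.5, -0.1], [0.4, 0.0], [-0.6, 0.2]])
print(global_importance(weights))   # feature 0 dominates globally: [0.5, 0.1]
```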

Learn to synthesize and synthesize to learn

Title Learn to synthesize and synthesize to learn
Authors Behzad Bozorgtabar, Mohammad Saeed Rad, Hazım Kemal Ekenel, Jean-Philippe Thiran
Abstract Attribute-guided face image synthesis aims to manipulate attributes of a face image. Most existing methods for image-to-image translation either perform a fixed translation between any two image domains using a single attribute, or require training data with the attributes of interest for each subject. These methods can therefore only train one specific model for each pair of image domains, which limits their ability to handle more than two domains. Another disadvantage is that they often suffer from mode collapse, which degrades the quality of the generated images. To overcome these shortcomings, we propose an attribute-guided face image generation method that uses a single model capable of synthesizing multiple photo-realistic face images conditioned on the attributes of interest. In addition, we adopt the proposed model to increase the realism of simulated face images while preserving facial characteristics. Compared to existing models, synthetic face images generated by our method exhibit good photorealistic quality on several face datasets. Finally, we demonstrate that the generated facial images can be used for synthetic data augmentation and improve the performance of a facial expression recognition classifier.
Tasks Data Augmentation, Facial Expression Recognition, Image Generation, Image-to-Image Translation
Published 2019-05-01
URL http://arxiv.org/abs/1905.00286v1
PDF http://arxiv.org/pdf/1905.00286v1.pdf
PWC https://paperswithcode.com/paper/learn-to-synthesize-and-synthesize-to-learn
Repo
Framework
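
One common way to condition a single generator on attributes, StarGAN-style, is to tile the attribute vector over the spatial grid and stack it as extra input channels; a sketch under that assumption (the paper's architecture may differ).

```python
# Attribute conditioning by channel-wise tiling (our illustration, not the paper's design).
import torch

def condition_on_attributes(image, attrs):
    """image: (B, 3, H, W); attrs: (B, n_attrs) in {0, 1}.
    Tile each attribute over the spatial grid and stack as extra channels."""
    b, _, h, w = image.shape
    maps = attrs.view(b, -1, 1, 1).expand(b, attrs.size(1), h, w)
    return torch.cat([image, maps], dim=1)   # input to a single multi-domain generator

x = condition_on_attributes(torch.randn(2, 3, 64, 64),
                            torch.tensor([[1., 0.], [0., 1.]]))
print(x.shape)  # torch.Size([2, 5, 64, 64])
```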

Attention Based Image Compression Post-Processing Convolutional Neural Network

Title Attention Based Image Compression Post-Processing Convolutional Neural Network
Authors Yuyang Xue, Jiannan Su
Abstract Traditional image compressors such as BPG and H.266 achieve high image and video compression quality. Recently, convolutional neural networks have been widely used in image compression. We propose an attention-based convolutional neural network for low bit-rate compression that post-processes the output of a traditional image compression decoder. On the validation sets, the post-processing module trained with MAE and MS-SSIM losses yields the highest average PSNR of 32.10 at a bit rate of 0.15.
Tasks Image Compression, Video Compression
Published 2019-05-27
URL https://arxiv.org/abs/1905.11045v1
PDF https://arxiv.org/pdf/1905.11045v1.pdf
PWC https://paperswithcode.com/paper/attention-based-image-compression-post
Repo
Framework
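
A plausible combined loss for the post-processing network, since the abstract mentions training with MAE and MS-SSIM; the weighting `alpha` is our placeholder (0.84 is a common choice in the restoration literature), and `pytorch_msssim` is a third-party package, not necessarily what the authors used.

```python
# Hypothetical MAE + MS-SSIM training loss for the post-processing CNN.
import torch
import torch.nn.functional as F
from pytorch_msssim import ms_ssim  # pip install pytorch-msssim

def restoration_loss(pred, target, alpha=0.84):
    """pred, target: (B, C, H, W) tensors with values in [0, 1]."""
    mae = F.l1_loss(pred, target)
    msssim_term = 1.0 - ms_ssim(pred, target, data_range=1.0)  # 1 - similarity
    return alpha * msssim_term + (1.0 - alpha) * mae
```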