April 3, 2020

# Paper Group ANR 85

#### Flexible Log File Parsing using Hidden Markov Models

Title Flexible Log File Parsing using Hidden Markov Models
Abstract We aim to model unknown file processing. As the content of log files often evolves over time, we established a dynamic statistical model which learns and adapts processing and parsing rules. First, we limit the amount of unstructured text by focusing only on those frequent patterns which lead to the desired output table, similar to Vaarandi [10]. Second, we transform the found frequent patterns and the output stating the parsed table into a Hidden Markov Model (HMM). We use this HMM as a specific yet flexible representation of a pattern for log file processing. When changes in the raw log file distort learned patterns, we aim for the model to adapt automatically in order to maintain high-quality output. After training our model on one system type and applying the model and the resulting parsing rule to a different system with slightly different log file patterns, we achieve an accuracy over 99%.
Published 2020-01-05
URL https://arxiv.org/abs/2001.01216v1
PDF https://arxiv.org/pdf/2001.01216v1.pdf
PWC https://paperswithcode.com/paper/flexible-log-file-parsing-using-hidden-markov
Repo
Framework

#### Bi-objective Optimization of Biclustering with Binary Data

Title Bi-objective Optimization of Biclustering with Binary Data
Authors Fred Glover, Said Hanafi, Gintaras Palubeckis
Abstract Clustering consists of partitioning data objects into subsets called clusters according to some similarity criteria. This paper addresses a generalization called quasi-clustering that allows overlapping of clusters, and which we link to biclustering. Biclustering simultaneously groups the objects and features so that a specific group of objects has a special group of features. In recent years, biclustering has received a lot of attention in several practical applications. In this paper we consider a bi-objective optimization problem for biclustering with binary data. First we present an integer programming formulation for bi-objective biclustering. Next we propose a constructive heuristic based on the set intersection operation and its efficient implementation for solving a series of mono-objective problems used inside the Epsilon-constraint method (obtained by keeping only one objective function and integrating the other into the constraints). Finally, our experimental results show that using the CPLEX solver as an exact algorithm for finding an optimal solution drastically increases the computational cost for large instances, while our proposed heuristic provides very good results and significantly reduces the computational expense.
Published 2020-02-09
URL https://arxiv.org/abs/2002.04711v1
PDF https://arxiv.org/pdf/2002.04711v1.pdf
PWC https://paperswithcode.com/paper/bi-objective-optimization-of-biclustering
Repo
Framework

#### An Empirical Evaluation of Perturbation-based Defenses

Title An Empirical Evaluation of Perturbation-based Defenses
Abstract Recent work has extensively shown that randomized perturbations of a neural network can improve its robustness to adversarial attacks. The literature is, however, lacking a detailed compare-and-contrast of the latest proposals to understand what classes of perturbations work, when they work, and why they work. We contribute a detailed experimental evaluation that elucidates these questions and benchmarks perturbation defenses in a consistent way. In particular, we show five main results: (1) all input perturbation defenses, whether random or deterministic, are essentially equivalent in their efficacy, (2) such defenses offer almost no robustness to adaptive attacks unless these perturbations are observed during training, (3) a tuned sequence of noise layers across a network provides the best empirical robustness, (4) attacks transfer between perturbation defenses so the attackers need not know the specific type of defense only that it involves perturbations, and (5) adversarial examples very close to original images show an elevated sensitivity to perturbation in a first-order analysis. Based on these insights, we demonstrate a new robust model built on noise injection and adversarial training that achieves state-of-the-art robustness.
Published 2020-02-08
URL https://arxiv.org/abs/2002.03080v2
PDF https://arxiv.org/pdf/2002.03080v2.pdf
PWC https://paperswithcode.com/paper/an-empirical-evaluation-of-perturbation-based
Repo
Framework

#### MLFcGAN: Multi-level Feature Fusion based Conditional GAN for Underwater Image Color Correction

Title MLFcGAN: Multi-level Feature Fusion based Conditional GAN for Underwater Image Color Correction
Authors Xiaodong Liu, Zhi Gao, Ben M. Chen
Abstract Color correction for underwater images has received increasing interest, due to its critical role in facilitating available mature vision algorithms for underwater scenarios. Inspired by the stunning success of deep convolutional neural network (DCNN) techniques in many vision tasks, especially their strength in extracting features at multiple scales, we propose a deep multi-scale feature fusion net based on the conditional generative adversarial network (GAN) for underwater image color correction. In our network, multi-scale features are extracted first, followed by augmenting local features on each scale with global features. This design was verified to facilitate more effective and faster network learning, resulting in better performance in both color correction and detail preservation. We conducted extensive experiments and compared with the state-of-the-art approaches quantitatively and qualitatively, showing that our method achieves significant improvements.
Published 2020-02-13
URL https://arxiv.org/abs/2002.05333v1
PDF https://arxiv.org/pdf/2002.05333v1.pdf
PWC https://paperswithcode.com/paper/mlfcgan-multi-level-feature-fusion-based
Repo
Framework

#### Lung Infection Quantification of COVID-19 in CT Images with Deep Learning

Title Lung Infection Quantification of COVID-19 in CT Images with Deep Learning
Authors Fei Shan, Yaozong Gao, Jun Wang, Weiya Shi, Nannan Shi, Miaofei Han, Zhong Xue, Dinggang Shen, Yuxin Shi
Abstract CT imaging is crucial for diagnosis, assessment and staging of COVID-19 infection. Follow-up scans every 3-5 days are often recommended for disease progression. It has been reported that bilateral and peripheral ground glass opacification (GGO) with or without consolidation are predominant CT findings in COVID-19 patients. However, due to lack of computerized quantification tools, only qualitative impression and rough description of infected areas are currently used in radiological reports. In this paper, a deep learning (DL)-based segmentation system is developed to automatically quantify infection regions of interest (ROIs) and their volumetric ratios w.r.t. the lung. The performance of the system was evaluated by comparing the automatically segmented infection regions with the manually-delineated ones on 300 chest CT scans of 300 COVID-19 patients. For fast manual delineation of training samples and possible manual intervention of automatic results, a human-in-the-loop (HITL) strategy has been adopted to assist radiologists for infection region segmentation, which dramatically reduced the total segmentation time to 4 minutes after 3 iterations of model updating. The average Dice similarity coefficient showed 91.6% agreement between automatic and manual infection segmentations, and the mean estimation error of percentage of infection (POI) was 0.3% for the whole lung. Finally, possible applications, including but not limited to analysis of follow-up CT scans and infection distributions in the lobes and segments correlated with clinical findings, were discussed.
Published 2020-03-10
URL https://arxiv.org/abs/2003.04655v3
PDF https://arxiv.org/pdf/2003.04655v3.pdf
PWC https://paperswithcode.com/paper/lung-infection-quantification-of-covid-19-in
Repo
Framework
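
For readers unfamiliar with the two metrics quoted in the abstract, the sketch below shows how a Dice similarity coefficient and a percentage of infection (POI) can be computed from binary masks. It assumes masks are flattened to equal-length lists of 0/1 voxels and that POI means infected lung voxels divided by all lung voxels. The function names and toy masks are illustrative and not taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def percentage_of_infection(infection_mask, lung_mask):
    """Share of lung voxels that are also marked as infected, in percent."""
    lung = sum(1 for v in lung_mask if v)
    if lung == 0:
        raise ValueError("lung mask is empty")
    infected = sum(1 for i, l in zip(infection_mask, lung_mask) if i and l)
    return 100.0 * infected / lung


# Toy example: 8 voxels, all inside the lung.
automatic = [0, 1, 1, 1, 0, 0, 1, 0]
manual = [0, 1, 1, 0, 0, 0, 1, 1]
lung = [1] * 8

dice = dice_coefficient(automatic, manual)
poi_auto = percentage_of_infection(automatic, lung)
poi_manual = percentage_of_infection(manual, lung)
print(dice, abs(poi_auto - poi_manual))
```

The example also shows why the two metrics are reported together: the masks here disagree on two voxels (Dice 0.75) yet yield the same POI, so a small POI error alone does not imply good voxel-level overlap.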

#### Weakly-supervised 3D coronary artery reconstruction from two-view angiographic images

Title Weakly-supervised 3D coronary artery reconstruction from two-view angiographic images
Authors Lu Wang, Dong-xue Liang, Xiao-lei Yin, Jing Qiu, Zhi-yun Yang, Jun-hui Xing, Jian-zeng Dong, Zhao-yuan Ma
Abstract The reconstruction of three-dimensional models of coronary arteries is of great significance for the localization, evaluation and diagnosis of stenosis and plaque in the arteries, as well as for the assisted navigation of interventional surgery. In the clinical practice, physicians use a few angles of coronary angiography to capture arterial images, so it is of great practical value to perform 3D reconstruction directly from coronary angiography images. However, this is a very difficult computer vision task due to the complex shape of coronary blood vessels, as well as the lack of data set and key point labeling. With the rise of deep learning, more and more work is being done to reconstruct 3D models of human organs from medical images using deep neural networks. We propose an adversarial and generative way to reconstruct three dimensional coronary artery models, from two different views of angiographic images of coronary arteries. With 3D fully supervised learning and 2D weakly supervised learning schemes, we obtained reconstruction accuracies that outperform state-of-art techniques.
Published 2020-03-26
URL https://arxiv.org/abs/2003.11846v1
PDF https://arxiv.org/pdf/2003.11846v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-3d-coronary-artery
Repo
Framework

#### SYMOG: learning symmetric mixture of Gaussian modes for improved fixed-point quantization

Title SYMOG: learning symmetric mixture of Gaussian modes for improved fixed-point quantization
Authors Lukas Enderich, Fabian Timm, Wolfram Burgard
Abstract Deep neural networks (DNNs) have been proven to outperform classical methods on several machine learning benchmarks. However, they have high computational complexity and require powerful processing units. Especially when deployed on embedded systems, model size and inference time must be significantly reduced. We propose SYMOG (symmetric mixture of Gaussian modes), which significantly decreases the complexity of DNNs through low-bit fixed-point quantization. SYMOG is a novel soft quantization method such that the learning task and the quantization are solved simultaneously. During training the weight distribution changes from a unimodal Gaussian distribution to a symmetric mixture of Gaussians, where each mean value belongs to a particular fixed-point mode. We evaluate our approach with different architectures (LeNet5, VGG7, VGG11, DenseNet) on common benchmark data sets (MNIST, CIFAR-10, CIFAR-100) and we compare with state-of-the-art quantization approaches. We achieve excellent results and outperform 2-bit state-of-the-art performance with an error rate of only 5.71% on CIFAR-10 and 27.65% on CIFAR-100.
Published 2020-02-19
URL https://arxiv.org/abs/2002.08204v1
PDF https://arxiv.org/pdf/2002.08204v1.pdf
PWC https://paperswithcode.com/paper/symog-learning-symmetric-mixture-of-gaussian
Repo
Framework
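
To make "fixed-point modes" concrete, the sketch below enumerates a symmetric set of fixed-point values and snaps a weight to the nearest one. It assumes a step size of `2 ** -frac_bits` and a symmetric integer range of `±(2 ** (bits - 1) - 1)`, which yields three modes for 2 bits. These conventions, the function names, and the sample weights are assumptions for illustration. The abstract does not give SYMOG's exact mode definition or its soft, training-time quantization, which this hard rounding does not reproduce.

```python
def fixed_point_modes(bits, frac_bits):
    """Symmetric fixed-point values k * 2**-frac_bits for |k| <= 2**(bits-1) - 1."""
    step = 2.0 ** (-frac_bits)
    limit = 2 ** (bits - 1) - 1
    return [k * step for k in range(-limit, limit + 1)]


def quantize(weight, bits, frac_bits):
    """Snap a weight to the nearest symmetric fixed-point mode, clipping at the ends."""
    step = 2.0 ** (-frac_bits)
    limit = 2 ** (bits - 1) - 1
    # Python's round() resolves exact .5 ties to the nearest even integer.
    k = round(weight / step)
    k = max(-limit, min(limit, k))
    return k * step


modes = fixed_point_modes(bits=2, frac_bits=2)
weights = [0.31, -0.9, 0.05]
quantized = [quantize(w, bits=2, frac_bits=2) for w in weights]
print(modes, quantized)
```

With these assumed settings every weight ends up at -0.25, 0.0, or 0.25; a mixture-of-Gaussians prior centered on such values is one way to read the abstract's description of weights drifting toward the modes during training.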

#### A Visual Analytics System for Multi-model Comparison on Clinical Data Predictions

Title A Visual Analytics System for Multi-model Comparison on Clinical Data Predictions
Authors Yiran Li, Takanori Fujiwara, Yong K. Choi, Katherine K. Kim, Kwan-Liu Ma
Abstract There is a growing trend of applying machine learning methods to medical datasets in order to predict patients’ future status. Although some of these methods achieve high performance, challenges still exist in comparing and evaluating different models through their interpretable information. Such analytics can help clinicians improve evidence-based medical decision making. In this work, we develop a visual analytics system that compares multiple models’ prediction criteria and evaluates their consistency. With our system, users can generate knowledge on different models’ inner criteria and how confidently we can rely on each model’s prediction for a certain patient. Through a case study of a publicly available clinical dataset, we demonstrate the effectiveness of our visual analytics system to assist clinicians and researchers in comparing and quantitatively evaluating different machine learning methods.
Published 2020-02-18
URL https://arxiv.org/abs/2002.10998v2
PDF https://arxiv.org/pdf/2002.10998v2.pdf
PWC https://paperswithcode.com/paper/a-visual-analytics-system-for-multi-model
Repo
Framework

#### Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity

Title Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity
Authors Simon S. Du, Jason D. Lee, Gaurav Mahajan, Ruosong Wang
Abstract The current paper studies the problem of agnostic $Q$-learning with function approximation in deterministic systems where the optimal $Q$-function is approximable by a function in the class $\mathcal{F}$ with approximation error $\delta \ge 0$. We propose a novel recursion-based algorithm and show that if $\delta = O\left(\rho/\sqrt{\dim_E}\right)$, then one can find the optimal policy using $O\left(\dim_E\right)$ trajectories, where $\rho$ is the gap between the optimal $Q$-value of the best actions and that of the second-best actions and $\dim_E$ is the Eluder dimension of $\mathcal{F}$. Our result has two implications: 1) In conjunction with the lower bound in [Du et al., ICLR 2020], our upper bound suggests that the condition $\delta = \widetilde{\Theta}\left(\rho/\sqrt{\mathrm{dim}_E}\right)$ is necessary and sufficient for algorithms with polynomial sample complexity. 2) In conjunction with the lower bound in [Wen and Van Roy, NIPS 2013], our upper bound suggests that the sample complexity $\widetilde{\Theta}\left(\mathrm{dim}_E\right)$ is tight even in the agnostic setting. Therefore, we settle the open problem on agnostic $Q$-learning proposed in [Wen and Van Roy, NIPS 2013]. We further extend our algorithm to the stochastic reward setting and obtain similar results.
Published 2020-02-17
URL https://arxiv.org/abs/2002.07125v1
PDF https://arxiv.org/pdf/2002.07125v1.pdf
PWC https://paperswithcode.com/paper/agnostic-q-learning-with-function
Repo
Framework

#### Transformer-based language modeling and decoding for conversational speech recognition

Title Transformer-based language modeling and decoding for conversational speech recognition
Authors Kareem Nassar
Abstract We propose a way to use a transformer-based language model in conversational speech recognition. Specifically, we focus on decoding efficiently in a weighted finite-state transducer framework. We showcase an approach to lattice re-scoring that allows for longer range history captured by a transformer-based language model and takes advantage of a transformer’s ability to avoid computing sequentially.
Published 2020-01-04
URL https://arxiv.org/abs/2001.01140v1
PDF https://arxiv.org/pdf/2001.01140v1.pdf
PWC https://paperswithcode.com/paper/transformer-based-language-modeling-and
Repo
Framework

#### Remove Appearance Shift for Ultrasound Image Segmentation via Fast and Universal Style Transfer

Title Remove Appearance Shift for Ultrasound Image Segmentation via Fast and Universal Style Transfer
Authors Zhendong Liu, Xin Yang, Rui Gao, Shengfeng Liu, Haoran Dou, Shuangchi He, Yuhao Huang, Yankai Huang, Huanjia Luo, Yuanji Zhang, Yi Xiong, Dong Ni
Abstract Deep Neural Networks (DNNs) suffer from performance degradation when image appearance shift occurs, especially in ultrasound (US) image segmentation. In this paper, we propose a novel and intuitive framework to remove the appearance shift, and hence improve the generalization ability of DNNs. Our work has three highlights. First, we follow the spirit of universal style transfer to remove appearance shifts, which was not explored before for US images. Without sacrificing image structure details, it enables arbitrary style-content transfer. Second, accelerated with an Adaptive Instance Normalization block, our framework achieves the real-time speed required in clinical US scanning. Third, an efficient and effective style image selection strategy is proposed to ensure the target-style US image and testing content US image properly match each other. Experiments on two large US datasets demonstrate that our methods are superior to state-of-the-art methods on making DNNs robust against various appearance shifts.
Published 2020-02-14
URL https://arxiv.org/abs/2002.05844v1
PDF https://arxiv.org/pdf/2002.05844v1.pdf
PWC https://paperswithcode.com/paper/remove-appearance-shift-for-ultrasound-image
Repo
Framework
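
Adaptive Instance Normalization, mentioned in the abstract, re-scales each content feature channel so that its mean and standard deviation match those of the style features: `AdaIN(x, y) = σ(y) * (x − μ(x)) / σ(x) + μ(y)`. The sketch below applies this to a single flattened channel using plain Python lists. The function names, the `eps` value, and the toy activations are assumptions for illustration. How the paper embeds the block in its encoder-decoder is not described in the abstract.

```python
import math


def mean_std(values, eps=1e-5):
    """Population mean and standard deviation, with eps added for stability."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, math.sqrt(var + eps)


def adain_channel(content, style, eps=1e-5):
    """Normalize one content channel, then apply the style channel's statistics."""
    mu_c, sd_c = mean_std(content, eps)
    mu_s, sd_s = mean_std(style, eps)
    return [sd_s * (v - mu_c) / sd_c + mu_s for v in content]


# Toy activations for one channel of a content image and a style image.
content_features = [1.0, 2.0, 3.0, 4.0]
style_features = [10.0, 20.0, 30.0, 40.0]

stylized = adain_channel(content_features, style_features)
out_mean, out_std = mean_std(stylized, eps=0.0)
print(stylized, out_mean, out_std)
```

Because the operation only needs per-channel means and standard deviations, it involves no iterative optimization, which is consistent with the abstract's claim that the block helps reach real-time speed.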

#### Attention based on-device streaming speech recognition with large speech corpus

Title Attention based on-device streaming speech recognition with large speech corpus
Authors Kwangyoun Kim, Kyungmin Lee, Dhananjaya Gowda, Junmo Park, Sungsoo Kim, Sichen Jin, Young-Yoon Lee, Jinsu Yeo, Daehyun Kim, Seokyeong Jung, Jungin Lee, Myoungji Han, Chanwoo Kim
Abstract In this paper, we present a new on-device automatic speech recognition (ASR) system based on monotonic chunk-wise attention (MoChA) models trained with a large (> 10K hours) corpus. We attained a word recognition rate of around 90% for the general domain, mainly by using joint training of connectionist temporal classifier (CTC) and cross entropy (CE) losses, minimum word error rate (MWER) training, layer-wise pre-training and data augmentation methods. In addition, we compressed our models to be more than 3.4 times smaller using an iterative hyper low-rank approximation (LRA) method while minimizing the degradation in recognition accuracy. The memory footprint was further reduced with 8-bit quantization to bring down the final model size to lower than 39 MB. For on-demand adaptation, we fused the MoChA models with statistical n-gram models, and we could achieve a 36% relative improvement on average in word error rate (WER) for target domains including the general domain.
Tasks Data Augmentation, Quantization, Speech Recognition
Published 2020-01-02
URL https://arxiv.org/abs/2001.00577v1
PDF https://arxiv.org/pdf/2001.00577v1.pdf
PWC https://paperswithcode.com/paper/attention-based-on-device-streaming-speech
Repo
Framework

#### GSANet: Semantic Segmentation with Global and Selective Attention

Title GSANet: Semantic Segmentation with Global and Selective Attention
Authors Qingfeng Liu, Mostafa El-Khamy, Dongwoon Bai, Jungwon Lee
Abstract This paper proposes a novel deep learning architecture for semantic segmentation. The proposed Global and Selective Attention Network (GSANet) features Atrous Spatial Pyramid Pooling (ASPP) with a novel sparsemax global attention and a novel selective attention that deploys a condensation and diffusion mechanism to aggregate the multi-scale contextual information from the extracted deep features. A selective attention decoder is also proposed to process the GSA-ASPP outputs for optimizing the softmax volume. We are the first to benchmark the performance of semantic segmentation networks with the low-complexity feature extraction network (FXN) MobileNetEdge, that is optimized for low latency on edge devices. We show that GSANet can result in more accurate segmentation with MobileNetEdge, as well as with strong FXNs, such as Xception. GSANet improves the state-of-art semantic segmentation accuracy on both the ADE20k and the Cityscapes datasets.
Published 2020-02-14
URL https://arxiv.org/abs/2003.00830v1
PDF https://arxiv.org/pdf/2003.00830v1.pdf
PWC https://paperswithcode.com/paper/gsanet-semantic-segmentation-with-global-and
Repo
Framework
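
Sparsemax, which the abstract names as the basis of GSANet's global attention, is an alternative to softmax that projects a score vector onto the probability simplex and can assign exactly zero weight to low-scoring entries. The sketch below implements the standard sort-based sparsemax computation for a single score vector. The function name and sample scores are illustrative, and how GSANet wires sparsemax into its ASPP module is not specified in the abstract.

```python
def sparsemax(scores):
    """Euclidean projection of a score vector onto the probability simplex."""
    z_sorted = sorted(scores, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for k, z in enumerate(z_sorted, start=1):
        cumulative += z
        # Entry k stays in the support while 1 + k * z_(k) > sum of the top-k scores.
        if 1.0 + k * z > cumulative:
            tau = (cumulative - 1.0) / k
    # Entries at or below the threshold tau receive exactly zero weight.
    return [max(s - tau, 0.0) for s in scores]


peaked = sparsemax([2.0, 1.0, 0.1])
mixed = sparsemax([1.0, 0.8, 0.1])
print(peaked, mixed)
```

In the first call all attention mass lands on one entry, while in the second it is split between two entries and the third is dropped entirely. Softmax, by contrast, would give every entry a strictly positive weight.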

#### Physical Accuracy of Deep Neural Networks for 2D and 3D Multi-Mineral Segmentation of Rock micro-CT Images

Title Physical Accuracy of Deep Neural Networks for 2D and 3D Multi-Mineral Segmentation of Rock micro-CT Images
Authors Ying Da Wang, Mehdi Shabaninejad, Ryan T. Armstrong, Peyman Mostaghimi
Abstract Segmentation of 3D micro-Computed Tomographic ($\mu$CT) images of rock samples is essential for further Digital Rock Physics (DRP) analysis; however, conventional methods such as thresholding, watershed segmentation, and converging active contours are susceptible to user-bias. Deep Convolutional Neural Networks (CNNs) have produced accurate pixelwise semantic segmentation results with natural images and $\mu$CT rock images; however, physical accuracy is not well documented. The performance of 4 CNN architectures is tested for 2D and 3D cases in 10 configurations. Manually segmented $\mu$CT images of Mt. Simon Sandstone are treated as ground truth and used as training and validation data, with a high voxelwise accuracy (over 99%) achieved. Downstream analysis is then used to validate physical accuracy. The topology of each segmented phase is calculated, and the absolute permeability and multiphase flow are modelled with direct simulation in single and mixed wetting cases. These physical measures of connectivity and flow characteristics show high variance and uncertainty, with models that achieve 95%+ in voxelwise accuracy possessing permeabilities and connectivities orders of magnitude off. A new network architecture is also introduced as a hybrid fusion of U-net and ResNet, combining short and long skip connections in a Network-in-Network configuration. The 3D implementation outperforms all other tested models in voxelwise and physical accuracy measures. The network architecture and the volume fraction in the dataset (and associated weighting) are factors that not only influence the accuracy trade-off in the voxelwise case, but are especially important in training a physically accurate model for segmentation.
Published 2020-02-13
URL https://arxiv.org/abs/2002.05322v2
PDF https://arxiv.org/pdf/2002.05322v2.pdf
PWC https://paperswithcode.com/paper/physical-accuracy-of-deep-neural-networks-for
Repo
Framework

#### Learnable Bernoulli Dropout for Bayesian Deep Learning

Title Learnable Bernoulli Dropout for Bayesian Deep Learning
Authors Shahin Boluki, Randy Ardywibowo, Siamak Zamani Dadaneh, Mingyuan Zhou, Xiaoning Qian
Abstract In this work, we propose learnable Bernoulli dropout (LBD), a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters. By probabilistic modeling of Bernoulli dropout, our method enables more robust prediction and uncertainty quantification in deep models. In particular, when combined with variational auto-encoders (VAEs), LBD enables flexible semi-implicit posterior representations, leading to new semi-implicit VAE (SIVAE) models. We solve the optimization for training with respect to the dropout parameters using Augment-REINFORCE-Merge (ARM), an unbiased and low-variance gradient estimator. Our experiments on a range of tasks show the superior performance of our approach compared with other commonly used dropout schemes. Overall, LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation. Moreover, using SIVAE, we can achieve state-of-the-art performance on collaborative filtering for implicit feedback on several public datasets.