January 30, 2020

Paper Group ANR 267

Spatial CUSUM for Signal Region Detection

Title Spatial CUSUM for Signal Region Detection
Authors Xin Zhang, Zhengyuan Zhu
Abstract Detecting weak clustered signals in spatial data is important but challenging in applications such as medical imaging and epidemiology. A more efficient detection algorithm can provide more precise early warning, and effectively reduce the decision risk and cost. To date, many methods have been developed to detect signals with spatial structures. However, most of the existing methods are either too conservative for weak signals or computationally too intensive. In this paper, we consider a novel method named Spatial CUSUM (SCUSUM), which employs the idea of the CUSUM procedure and false discovery rate controlling. We develop theoretical properties of the method which indicate that asymptotically SCUSUM can reach high classification accuracy. In the simulation study, we demonstrate that SCUSUM is sensitive to weak spatial signals. This new method is applied to a real fMRI dataset as an illustration, and more irregular weak spatial signals are detected in the images compared to some existing methods, including the conventional FDR, FDR$_L$ and scan statistics.
Tasks Epidemiology
Published 2019-04-05
URL http://arxiv.org/abs/1904.03246v1
PDF http://arxiv.org/pdf/1904.03246v1.pdf
PWC https://paperswithcode.com/paper/spatial-cusum-for-signal-region-detection
Repo
Framework
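
The abstract lists the conventional FDR procedure among the baselines. For reference, here is a minimal sketch of the Benjamini-Hochberg step-up procedure. It assumes that "conventional FDR" refers to Benjamini-Hochberg, and it is not the SCUSUM method itself; the function name and the example p-values are illustrative.

```python
def benjamini_hochberg(p_values, q):
    """Return a list of booleans marking which hypotheses are rejected
    by the Benjamini-Hochberg step-up procedure at target FDR level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Illustrative p-values (not taken from the paper).
print(benjamini_hochberg([0.9, 0.035, 0.02, 0.03], q=0.05))
# prints [False, True, True, True]
```

Because the procedure is step-up, the two smallest p-values are rejected even though they miss their own thresholds: the third-smallest one passes, and everything ranked below it is rejected with it.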

Bayesian Robustness: A Nonasymptotic Viewpoint

Title Bayesian Robustness: A Nonasymptotic Viewpoint
Authors Kush Bhatia, Yi-An Ma, Anca D. Dragan, Peter L. Bartlett, Michael I. Jordan
Abstract We study the problem of robustly estimating the posterior distribution for the setting where observed data can be contaminated with potentially adversarial outliers. We propose Rob-ULA, a robust variant of the Unadjusted Langevin Algorithm (ULA), and provide a finite-sample analysis of its sampling distribution. In particular, we show that after $T= \tilde{\mathcal{O}}(d/\varepsilon_{\textsf{acc}})$ iterations, we can sample from $p_T$ such that $\text{dist}(p_T, p^*) \leq \varepsilon_{\textsf{acc}} + \tilde{\mathcal{O}}(\epsilon)$, where $\epsilon$ is the fraction of corruptions. We corroborate our theoretical analysis with experiments on both synthetic and real-world data sets for mean estimation, regression and binary classification.
Tasks
Published 2019-07-27
URL https://arxiv.org/abs/1907.11826v1
PDF https://arxiv.org/pdf/1907.11826v1.pdf
PWC https://paperswithcode.com/paper/bayesian-robustness-a-nonasymptotic-viewpoint
Repo
Framework
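
For readers unfamiliar with the base algorithm, the following is a minimal sketch of the plain Unadjusted Langevin Algorithm, which iterates x_{k+1} = x_k - h ∇U(x_k) + sqrt(2h) ξ_k with standard Gaussian noise ξ_k. It is not Rob-ULA: the abstract does not specify the robust modification, so none is attempted here. The function name, the step size and the standard normal target are assumptions made for illustration.

```python
import math
import random


def ula(grad_potential, x0, step, iters, seed=0):
    """Plain ULA: gradient step on the potential U plus Gaussian noise.

    grad_potential maps a list of floats to the gradient of U at that point,
    where the target density is proportional to exp(-U(x)).
    """
    rng = random.Random(seed)
    x = list(x0)
    samples = []
    noise_scale = math.sqrt(2.0 * step)
    for _ in range(iters):
        g = grad_potential(x)
        x = [xi - step * gi + noise_scale * rng.gauss(0.0, 1.0)
             for xi, gi in zip(x, g)]
        samples.append(list(x))
    return samples


# Illustrative target: standard normal, U(x) = x^2 / 2, so grad U(x) = x.
chain = ula(lambda x: list(x), x0=[0.0], step=0.1, iters=20000)
values = [s[0] for s in chain]
mean = sum(values) / len(values)
print(round(mean, 2))
```

With a fixed step size the chain does not target the posterior exactly. For this Gaussian example the stationary variance is 1 / (1 - h/2) rather than 1, which is the kind of discretization error a finite-sample analysis has to account for.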

Multiclass spectral feature scaling method for dimensionality reduction

Title Multiclass spectral feature scaling method for dimensionality reduction
Authors Momo Matsuda, Keiichi Morikuni, Akira Imakura, Xiucai Ye, Tetsuya Sakurai
Abstract Irregular features disrupt the desired classification. In this paper, we consider aggressively modifying scales of features in the original space according to the label information to form well-separated clusters in low-dimensional space. The proposed method exploits spectral clustering to derive scaling factors that are used to modify the features. Specifically, we reformulate the Laplacian eigenproblem of the spectral clustering as an eigenproblem of a linear matrix pencil whose eigenvector has the scaling factors. Numerical experiments show that the proposed method outperforms well-established supervised dimensionality reduction methods for toy problems with more samples than features and real-world problems with more features than samples.
Tasks Dimensionality Reduction
Published 2019-10-16
URL https://arxiv.org/abs/1910.07174v1
PDF https://arxiv.org/pdf/1910.07174v1.pdf
PWC https://paperswithcode.com/paper/multiclass-spectral-feature-scaling-method
Repo
Framework

Ensemble of 3D CNN regressors with data fusion for fluid intelligence prediction

Title Ensemble of 3D CNN regressors with data fusion for fluid intelligence prediction
Authors Marina Pominova, Anna Kuzina, Ekaterina Kondrateva, Svetlana Sushchinskaya, Maxim Sharaev, Evgeny Burnaev, Vyacheslav Yarkin
Abstract In this work, we aim at predicting children’s fluid intelligence scores based on structural T1-weighted MR images from the largest long-term study of brain development and child health. The target variable was regressed on data collection site, socio-demographic variables and brain volume, making it independent of these potentially informative factors, which are not directly related to brain functioning. We investigate both feature extraction and deep learning approaches, as well as different deep CNN architectures and their ensembles. We propose an advanced VoxCNN ensemble architecture, which yields an MSE of 92.838 on the blind test.
Tasks
Published 2019-05-25
URL https://arxiv.org/abs/1905.10550v1
PDF https://arxiv.org/pdf/1905.10550v1.pdf
PWC https://paperswithcode.com/paper/ensemble-of-3d-cnn-regressors-with-data
Repo
Framework

Cross-Lingual Transfer for Distantly Supervised and Low-resources Indonesian NER

Title Cross-Lingual Transfer for Distantly Supervised and Low-resources Indonesian NER
Authors Fariz Ikhwantri
Abstract Manually annotated corpora for low-resource languages are usually either small (gold) or large but distantly supervised (silver). Inspired by recent progress in applying pre-trained language models (LMs) to many Natural Language Processing (NLP) tasks, we propose to fine-tune a language model pre-trained on a high-resource language for a low-resource language, to improve performance in both scenarios. Our experiments demonstrate significant improvement when fine-tuning a pre-trained language model in cross-lingual transfer scenarios for a small gold corpus, and competitive results on a large silver corpus compared to supervised cross-lingual transfer, which is useful when there is no parallel annotation in the same task to begin with. We compare our proposed method of cross-lingual transfer using a pre-trained LM to other sources of transfer, such as a monolingual LM and Part-of-Speech (POS) tagging, on the downstream task for both the large silver and the small gold NER datasets, exploiting the character-level input of a bi-directional language model task.
Tasks Cross-Lingual Transfer, Language Modelling, Part-Of-Speech Tagging
Published 2019-07-25
URL https://arxiv.org/abs/1907.11158v1
PDF https://arxiv.org/pdf/1907.11158v1.pdf
PWC https://paperswithcode.com/paper/cross-lingual-transfer-for-distantly
Repo
Framework

Fast Calculation of Probabilistic Optimal Power Flow: A Deep Learning Approach

Title Fast Calculation of Probabilistic Optimal Power Flow: A Deep Learning Approach
Authors Yan Yang, Juan Yu, Zhifang Yang, Mingxu Xiang, Ren Liu
Abstract Probabilistic optimal power flow (POPF) is an important analytical tool to ensure the secure and economic operation of power systems. POPF needs to solve enormous nonlinear and nonconvex optimization problems. The huge computational burden has become the major bottleneck for practical application. This paper presents a deep learning approach to solve the POPF problem efficiently and accurately. Taking advantage of the deep structure and reconstructive strategy of stacked denoising autoencoders (SDAE), an SDAE-based optimal power flow (OPF) is developed to extract the high-level nonlinear correlations between the system operating condition and the OPF solution. A training process is designed to learn the features of POPF. The trained SDAE network can be used to conveniently calculate the OPF solution of random samples generated by Monte Carlo simulation (MCS) without the need for optimization. A modified IEEE 118-bus power system is simulated to demonstrate the effectiveness of the proposed method.
Tasks Denoising
Published 2019-06-24
URL https://arxiv.org/abs/1906.09951v1
PDF https://arxiv.org/pdf/1906.09951v1.pdf
PWC https://paperswithcode.com/paper/fast-calculation-of-probabilistic-optimal
Repo
Framework

Sub-query Fragmentation for Query Analysis and Data Caching in the Distributed Environment

Title Sub-query Fragmentation for Query Analysis and Data Caching in the Distributed Environment
Authors Santhilata Kuppili Venkata, Katarzyna Musial
Abstract When data stores and users are distributed geographically, it is essential to organize distributed data cache points at ideal locations to minimize data transfers. To answer this, we are developing an adaptive distributed data caching framework that can identify suitable data chunks to cache and move across a network of community cache locations.
Tasks
Published 2019-10-11
URL https://arxiv.org/abs/1910.04991v1
PDF https://arxiv.org/pdf/1910.04991v1.pdf
PWC https://paperswithcode.com/paper/sub-query-fragmentation-for-query-analysis
Repo
Framework

Learning Deformable Kernels for Image and Video Denoising

Title Learning Deformable Kernels for Image and Video Denoising
Authors Xiangyu Xu, Muchen Li, Wenxiu Sun
Abstract Most of the classical denoising methods restore clear results by selecting and averaging pixels in the noisy input. Instead of relying on hand-crafted selecting and averaging strategies, we propose to explicitly learn this process with deep neural networks. Specifically, we propose deformable 2D kernels for image denoising where the sampling locations and kernel weights are both learned. The proposed kernel naturally adapts to image structures and could effectively reduce the oversmoothing artifacts. Furthermore, we develop 3D deformable kernels for video denoising to more efficiently sample pixels across the spatial-temporal space. Our method is able to solve the misalignment issues of large motion from dynamic scenes. For better training our video denoising model, we introduce the trilinear sampler and a new regularization term. We demonstrate that the proposed method performs favorably against the state-of-the-art image and video denoising approaches on both synthetic and real-world data.
Tasks Denoising, Image Denoising, Video Denoising
Published 2019-04-15
URL http://arxiv.org/abs/1904.06903v1
PDF http://arxiv.org/pdf/1904.06903v1.pdf
PWC https://paperswithcode.com/paper/learning-deformable-kernels-for-image-and
Repo
Framework

Heterogeneous tissue characterization using ultrasound: a comparison of fractal analysis backscatter models on liver tumors

Title Heterogeneous tissue characterization using ultrasound: a comparison of fractal analysis backscatter models on liver tumors
Authors Omar S. Al-Kadi, Daniel Y. F. Chung, Constantin C. Coussios, J. Alison Noble
Abstract Assessing tumor tissue heterogeneity via ultrasound has recently been suggested for predicting early response to treatment. The ultrasound backscattering characteristics can assist in better understanding the tumor texture by highlighting local concentration and spatial arrangement of tissue scatterers. However, it is challenging to quantify the various tissue heterogeneities, ranging from fine to coarse, of the echo envelope peaks in tumor texture. Local parametric fractal features extracted via maximum likelihood estimation from five well-known statistical model families are evaluated for the purpose of ultrasound tissue characterization. The fractal dimension (self-similarity measure) was used to characterize the spatial distribution of scatterers, while the lacunarity (sparsity measure) was applied to determine scatterer number density. Performance was assessed based on 608 cross-sectional clinical ultrasound RF images of liver tumors (230 and 378 demonstrating respondent and non-respondent cases, respectively). Cross-validation via leave-one-tumor-out and with different k-fold methodologies using a Bayesian classifier was employed for validation. The fractal properties of the backscattered echoes based on the Nakagami model (Nkg) and its extended four-parameter Nakagami-generalized inverse Gaussian (NIG) distribution achieved the best results, with nearly identical performance, for characterizing liver tumor tissue. Accuracy, sensitivity and specificity for the Nkg/NIG were: 85.6%/86.3%, 94.0%/96.0%, and 73.0%/71.0%, respectively. Other statistical models, such as the Rician, Rayleigh, and K-distribution, were found to be less effective in characterizing the subtle changes in tissue texture as an indication of response to treatment. Employing the most relevant and practical statistical model could have potential consequences for the design of an early and effective clinical therapy.
Tasks
Published 2019-12-20
URL https://arxiv.org/abs/1912.09903v1
PDF https://arxiv.org/pdf/1912.09903v1.pdf
PWC https://paperswithcode.com/paper/heterogeneous-tissue-characterization-using
Repo
Framework

Development of email classifier in Brazilian Portuguese using feature selection for automatic response

Title Development of email classifier in Brazilian Portuguese using feature selection for automatic response
Authors Rogerio Bonatti, Arthur Gola de Paula
Abstract Automatic email categorization is an important application of text classification. We study the automatic reply to business email messages in Brazilian Portuguese. We present a novel corpus containing messages from a real application, and baseline categorization experiments using Naive Bayes and Support Vector Machines. We then discuss the effect of lemmatization and the role of part-of-speech tagging filtering on precision and recall. Support Vector Machines classification coupled with nonlemmatized selection of verbs, nouns and adjectives was the best approach, with 87.3% maximum accuracy. Straightforward lemmatization in Portuguese led to the lowest classification results in the group, with 85.3% and 81.7% precision in SVM and Naive Bayes respectively. Thus, while lemmatization reduced precision and recall, part-of-speech filtering improved overall results.
Tasks Feature Selection, Lemmatization, Part-Of-Speech Tagging, Text Classification
Published 2019-07-08
URL https://arxiv.org/abs/1907.04905v1
PDF https://arxiv.org/pdf/1907.04905v1.pdf
PWC https://paperswithcode.com/paper/development-of-email-classifier-in-brazilian
Repo
Framework

Mobile APP User Attribute Prediction by Heterogeneous Information Network Modeling

Title Mobile APP User Attribute Prediction by Heterogeneous Information Network Modeling
Authors Hekai Zhang, Jibing Gong, Zhiyong Teng, Dan Wang, Hongfei Wang, Linfeng Du, Zakirul Alam Bhuiyan
Abstract User-based attribute information, such as age and gender, is usually considered private user information, and it is difficult for enterprises to obtain. However, such private attribute information has a wide range of applications in personalized services, user behavior analysis and other areas. This paper advances the HetPathMine model and puts forward the TPathMine model. By using the number of clicks on attributes under each node to express the user’s emotional preference information, optimizations of the solution of the meta-path weights are also presented. Based on meta-paths in heterogeneous information networks, the new model integrates all relationships among objects into isomorphic relationships of classified objects. A matrix is used to propagate category knowledge among isomorphic objects. The experimental results show that: (1) the prediction of user attributes based on heterogeneous information networks can achieve higher accuracy than traditional machine learning classification methods; (2) the TPathMine model based on the number of clicks is more accurate in classifying users of different age groups, and the weight of each meta-path is consistent with human intuition or the real-world situation.
Tasks
Published 2019-10-06
URL https://arxiv.org/abs/1910.02450v1
PDF https://arxiv.org/pdf/1910.02450v1.pdf
PWC https://paperswithcode.com/paper/mobile-app-user-attribute-prediction-by
Repo
Framework

Enhancing Transformation-based Defenses using a Distribution Classifier

Title Enhancing Transformation-based Defenses using a Distribution Classifier
Authors Connie Kou, Hwee Kuan Lee, Ee-Chien Chang, Teck Khim Ng
Abstract Adversarial attacks on convolutional neural networks (CNN) have gained significant attention and there have been active research efforts on defense mechanisms. Stochastic input transformation methods have been proposed, where the idea is to recover the image from adversarial attack by random transformation, and to take the majority vote as consensus among the random samples. However, the transformation improves the accuracy on adversarial images at the expense of the accuracy on clean images. While it is intuitive that the accuracy on clean images would deteriorate, the exact mechanism by which this occurs is unclear. In this paper, we study the distribution of softmax induced by stochastic transformations. We observe that with random transformations on the clean images, although the mass of the softmax distribution could shift to the wrong class, the resulting distribution of softmax could be used to correct the prediction. Furthermore, on the adversarial counterparts, with the image transformation, the resulting shapes of the distribution of softmax are similar to the distributions from the clean images. With these observations, we propose a method to improve existing transformation-based defenses. We train a separate lightweight distribution classifier to recognize distinct features in the distributions of softmax outputs of transformed images. Our empirical studies show that our distribution classifier, by training on distributions obtained from clean images only, outperforms majority voting for both clean and adversarial images. Our method is generic and can be integrated with existing transformation-based defenses.
Tasks Adversarial Attack
Published 2019-06-01
URL https://arxiv.org/abs/1906.00258v2
PDF https://arxiv.org/pdf/1906.00258v2.pdf
PWC https://paperswithcode.com/paper/190600258
Repo
Framework
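
The majority-voting baseline described in the abstract can be sketched as below. This is a toy illustration under stated assumptions, not the authors' code: `softmax_fn`, `transform_fn` and the function name are hypothetical placeholders for a trained classifier and a random input transformation. The collected softmax outputs are returned as well, since they are the kind of input a separate distribution classifier would be trained on.

```python
import random
from collections import Counter


def majority_vote_defense(softmax_fn, transform_fn, image, n_samples, seed=0):
    """Classify `image` by majority vote over randomly transformed copies.

    softmax_fn(image) -> list of class probabilities (hypothetical model).
    transform_fn(image, rng) -> randomly transformed image (hypothetical).
    Returns the voted label and the list of softmax outputs.
    """
    rng = random.Random(seed)
    softmaxes = [softmax_fn(transform_fn(image, rng)) for _ in range(n_samples)]
    # Each transformed copy votes for its most probable class.
    votes = [max(range(len(p)), key=p.__getitem__) for p in softmaxes]
    label = Counter(votes).most_common(1)[0][0]
    return label, softmaxes


# Toy stand-ins: a two-class "model" and additive Gaussian jitter.
def toy_softmax(img):
    return [0.9, 0.1] if sum(img) > 0 else [0.1, 0.9]


def jitter(img, rng):
    return [v + rng.gauss(0.0, 0.1) for v in img]


label, dists = majority_vote_defense(toy_softmax, jitter, [5.0] * 4, n_samples=11)
print(label, len(dists))
```

Majority voting keeps only the arg-max of each softmax output and discards the rest of the distribution, which is the information the abstract proposes to exploit instead.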

Computationally Efficient CFD Prediction of Bubbly Flow using Physics-Guided Deep Learning

Title Computationally Efficient CFD Prediction of Bubbly Flow using Physics-Guided Deep Learning
Authors Han Bao, Jinyong Feng, Nam Dinh, Hongbin Zhang
Abstract To realize efficient computational fluid dynamics (CFD) prediction of two-phase flow, a multi-scale framework is proposed in this paper by applying a physics-guided data-driven approach. Instrumental to this framework, the Feature Similarity Measurement (FSM) technique was developed for error estimation in two-phase flow simulation using coarse-mesh CFD, to achieve accuracy comparable to fine-mesh simulations while remaining fast-running. By defining physics-guided parameters and variable gradients as physical features, FSM has the capability to capture the underlying local patterns in the coarse-mesh CFD simulation. Massive low-fidelity data and the respective high-fidelity data are used to explore the underlying information relevant to the main simulation errors and the effects of phenomenological scaling. By learning from previous simulation data, a surrogate model using a deep feedforward neural network (DFNN) can be developed and trained to estimate the simulation error of coarse-mesh CFD. The research documented supports the feasibility of physics-guided deep learning methods for coarse-mesh CFD simulations, which have potential for efficient industrial design.
Tasks
Published 2019-10-17
URL https://arxiv.org/abs/1910.08037v1
PDF https://arxiv.org/pdf/1910.08037v1.pdf
PWC https://paperswithcode.com/paper/computationally-efficient-cfd-prediction-of
Repo
Framework

Gradient Descent with Compressed Iterates

Title Gradient Descent with Compressed Iterates
Authors Ahmed Khaled, Peter Richtárik
Abstract We propose and analyze a new type of stochastic first-order method: gradient descent with compressed iterates (GDCI). In each iteration, GDCI first compresses the current iterate using a lossy randomized compression technique, and subsequently takes a gradient step. This method is a distillation of a key ingredient in the current practice of federated learning, where a model needs to be compressed by a mobile device before it is sent back to a server for aggregation. Our analysis provides a step towards closing the gap between the theory and practice of federated learning, and opens the possibility for many extensions.
Tasks
Published 2019-09-10
URL https://arxiv.org/abs/1909.04716v2
PDF https://arxiv.org/pdf/1909.04716v2.pdf
PWC https://paperswithcode.com/paper/gradient-descent-with-compressed-iterates
Repo
Framework
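
One reading of the update described in the abstract is x_{k+1} = C(x_k) - γ ∇f(C(x_k)), where C is a randomized compressor. The sketch below follows that reading and should be taken as an assumption rather than the paper's exact algorithm. The compressor used here, unbiased random sparsification, is likewise an illustrative choice, and all names and parameter values are hypothetical.

```python
import random


def rand_sparsify(x, keep_prob, rng):
    """Unbiased random sparsification (assumed compressor): keep each
    coordinate with probability keep_prob and rescale it by 1 / keep_prob,
    so that the expectation of the output equals x."""
    return [xi / keep_prob if rng.random() < keep_prob else 0.0 for xi in x]


def gdci(grad, x0, step, keep_prob, iters, seed=0):
    """Compress the iterate, then take a gradient step from the compressed point."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        cx = rand_sparsify(x, keep_prob, rng)            # lossy compression
        g = grad(cx)                                     # gradient at compressed iterate
        x = [ci - step * gi for ci, gi in zip(cx, g)]    # gradient step
    return x


# Illustrative objective: f(x) = 0.5 * ||x - b||^2, so grad f(x) = x - b.
b = [1.0, -2.0, 3.0]
grad = lambda x: [xi - bi for xi, bi in zip(x, b)]

# With keep_prob = 1.0 the compressor is the identity and this is plain
# gradient descent, which converges to b on this objective.
print(gdci(grad, [0.0, 0.0, 0.0], step=0.5, keep_prob=1.0, iters=60))
```

With `keep_prob` below 1 the iterates become random. The sketch makes no claim about how close they end up to the minimizer; that is the question the paper's analysis addresses.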

The Blessings of Multiple Causes: A Reply to Ogburn et al. (2019)

Title The Blessings of Multiple Causes: A Reply to Ogburn et al. (2019)
Authors Yixin Wang, David M. Blei
Abstract Ogburn et al. (2019, arXiv:1910.05438) discuss “The Blessings of Multiple Causes” (Wang and Blei, 2018, arXiv:1805.06826). Many of their remarks are interesting. But they also claim that the paper has “foundational errors” and that its “premise is…incorrect.” These claims are not substantiated. There are no foundational errors; the premise is correct.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.07320v3
PDF https://arxiv.org/pdf/1910.07320v3.pdf
PWC https://paperswithcode.com/paper/the-blessings-of-multiple-causes-a-reply-to
Repo
Framework