January 30, 2020


Paper Group ANR 267

Spatial CUSUM for Signal Region Detection. Bayesian Robustness: A Nonasymptotic Viewpoint. Multiclass spectral feature scaling method for dimensionality reduction. Ensemble of 3D CNN regressors with data fusion for fluid intelligence prediction. Cross-Lingual Transfer for Distantly Supervised and Low-resources Indonesian NER. Fast Calculation of Pr …

Spatial CUSUM for Signal Region Detection


Title	Spatial CUSUM for Signal Region Detection
Authors	Xin Zhang, Zhengyuan Zhu
Abstract	Detecting weak clustered signals in spatial data is important but challenging in applications such as medical imaging and epidemiology. A more efficient detection algorithm can provide more precise early warning, and effectively reduce the decision risk and cost. To date, many methods have been developed to detect signals with spatial structures. However, most of the existing methods are either too conservative for weak signals or computationally too intensive. In this paper, we consider a novel method named Spatial CUSUM (SCUSUM), which employs the idea of the CUSUM procedure and false discovery rate controlling. We develop theoretical properties of the method which indicate that asymptotically SCUSUM can reach high classification accuracy. In the simulation study, we demonstrate that SCUSUM is sensitive to weak spatial signals. This new method is applied to a real fMRI dataset as an illustration, and more irregular weak spatial signals are detected in the images compared to some existing methods, including the conventional FDR, FDR$_L$ and scan statistics.
Tasks	Epidemiology
Published	2019-04-05
URL	http://arxiv.org/abs/1904.03246v1
PDF	http://arxiv.org/pdf/1904.03246v1.pdf
PWC	https://paperswithcode.com/paper/spatial-cusum-for-signal-region-detection
Repo
Framework
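
The abstract above names false discovery rate control as one ingredient and "the conventional FDR" as a baseline. As a rough illustration of what FDR control looks like in practice, here is a minimal sketch of the standard Benjamini-Hochberg step-up procedure. It is not SCUSUM, and whether it matches the paper's baseline exactly is an assumption; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True if it is rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= alpha * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. one per pixel or voxel.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(p_values, alpha=0.05)
print(flags)
```

The procedure treats each location independently, which is one reason spatially aware methods such as the one described above may pick up weak clustered signals that it misses.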

Bayesian Robustness: A Nonasymptotic Viewpoint


Title	Bayesian Robustness: A Nonasymptotic Viewpoint
Authors	Kush Bhatia, Yi-An Ma, Anca D. Dragan, Peter L. Bartlett, Michael I. Jordan
Abstract	We study the problem of robustly estimating the posterior distribution for the setting where observed data can be contaminated with potentially adversarial outliers. We propose Rob-ULA, a robust variant of the Unadjusted Langevin Algorithm (ULA), and provide a finite-sample analysis of its sampling distribution. In particular, we show that after $T= \tilde{\mathcal{O}}(d/\varepsilon_{\textsf{acc}})$ iterations, we can sample from $p_T$ such that $\text{dist}(p_T, p^*) \leq \varepsilon_{\textsf{acc}} + \tilde{\mathcal{O}}(\epsilon)$, where $\epsilon$ is the fraction of corruptions. We corroborate our theoretical analysis with experiments on both synthetic and real-world data sets for mean estimation, regression and binary classification.
Tasks
Published	2019-07-27
URL	https://arxiv.org/abs/1907.11826v1
PDF	https://arxiv.org/pdf/1907.11826v1.pdf
PWC	https://paperswithcode.com/paper/bayesian-robustness-a-nonasymptotic-viewpoint
Repo
Framework
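
For readers unfamiliar with the base algorithm, the sketch below shows the plain Unadjusted Langevin Algorithm (ULA) on a one-dimensional Gaussian target. It does not implement Rob-ULA or any robustness mechanism; the target, step size, and function names are assumptions chosen only for illustration.

```python
import math
import random


def ula_samples(grad_potential, x0, step, n_steps, rng):
    """Run ULA: x <- x - step * grad U(x) + sqrt(2 * step) * standard normal noise."""
    x = x0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - step * grad_potential(x) + math.sqrt(2.0 * step) * noise
        samples.append(x)
    return samples


# Assumed target: N(MU, 1), i.e. potential U(x) = (x - MU)^2 / 2 with gradient x - MU.
MU = 3.0
rng = random.Random(0)
samples = ula_samples(lambda x: x - MU, x0=0.0, step=0.1, n_steps=20000, rng=rng)

# Discard early iterates, then estimate the target mean from the rest.
burn_in = 1000
kept = samples[burn_in:]
sample_mean = sum(kept) / len(kept)
print(round(sample_mean, 2))
```

Because ULA omits the Metropolis correction, its stationary distribution is slightly biased for any fixed step size; for this target the mean is still MU, while the variance is a little larger than 1.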

Multiclass spectral feature scaling method for dimensionality reduction


Title	Multiclass spectral feature scaling method for dimensionality reduction
Authors	Momo Matsuda, Keiichi Morikuni, Akira Imakura, Xiucai Ye, Tetsuya Sakurai
Abstract	Irregular features disrupt the desired classification. In this paper, we consider aggressively modifying scales of features in the original space according to the label information to form well-separated clusters in low-dimensional space. The proposed method exploits spectral clustering to derive scaling factors that are used to modify the features. Specifically, we reformulate the Laplacian eigenproblem of the spectral clustering as an eigenproblem of a linear matrix pencil whose eigenvector has the scaling factors. Numerical experiments show that the proposed method outperforms well-established supervised dimensionality reduction methods for toy problems with more samples than features and real-world problems with more features than samples.
Tasks	Dimensionality Reduction
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07174v1
PDF	https://arxiv.org/pdf/1910.07174v1.pdf
PWC	https://paperswithcode.com/paper/multiclass-spectral-feature-scaling-method
Repo
Framework

Ensemble of 3D CNN regressors with data fusion for fluid intelligence prediction


Title	Ensemble of 3D CNN regressors with data fusion for fluid intelligence prediction
Authors	Marina Pominova, Anna Kuzina, Ekaterina Kondrateva, Svetlana Sushchinskaya, Maxim Sharaev, Evgeny Burnaev, Vyacheslav Yarkin
Abstract	In this work, we aim at predicting children’s fluid intelligence scores based on structural T1-weighted MR images from the largest long-term study of brain development and child health. The target variable was regressed on a data collection site, socio-demographic variables and brain volume, making it independent of potentially informative factors that are not directly related to brain functioning. We investigate both feature extraction and deep learning approaches as well as different deep CNN architectures and their ensembles. We propose an advanced architecture, an ensemble of VoxCNNs, which yields an MSE of 92.838 on the blind test.
Tasks
Published	2019-05-25
URL	https://arxiv.org/abs/1905.10550v1
PDF	https://arxiv.org/pdf/1905.10550v1.pdf
PWC	https://paperswithcode.com/paper/ensemble-of-3d-cnn-regressors-with-data
Repo
Framework

Cross-Lingual Transfer for Distantly Supervised and Low-resources Indonesian NER


Title	Cross-Lingual Transfer for Distantly Supervised and Low-resources Indonesian NER
Authors	Fariz Ikhwantri
Abstract	Manually annotated corpora for low-resource languages are usually small in quantity (gold), or large but distantly supervised (silver). Inspired by recent progress in applying pre-trained language models (LMs) to many Natural Language Processing (NLP) tasks, we propose to fine-tune pre-trained language models from high-resource languages to low-resource languages to improve the performance in both scenarios. Our empirical experiments demonstrate significant improvement when fine-tuning a pre-trained language model in cross-lingual transfer scenarios for the small gold corpus, and competitive results on the large silver corpus compared to supervised cross-lingual transfer, which will be useful when there is no parallel annotation in the same task to begin with. We compare our proposed method of cross-lingual transfer using a pre-trained LM to different sources of transfer, such as a mono-lingual LM and Part-of-Speech (POS) tagging, in the downstream task on both the large silver and small gold NER datasets by exploiting the character-level input of the bi-directional language model task.
Tasks	Cross-Lingual Transfer, Language Modelling, Part-Of-Speech Tagging
Published	2019-07-25
URL	https://arxiv.org/abs/1907.11158v1
PDF	https://arxiv.org/pdf/1907.11158v1.pdf
PWC	https://paperswithcode.com/paper/cross-lingual-transfer-for-distantly
Repo
Framework

Fast Calculation of Probabilistic Optimal Power Flow: A Deep Learning Approach


Title	Fast Calculation of Probabilistic Optimal Power Flow: A Deep Learning Approach
Authors	Yan Yang, Juan Yu, Zhifang Yang, Mingxu Xiang, Ren Liu
Abstract	Probabilistic optimal power flow (POPF) is an important analytical tool to ensure the secure and economic operation of power systems. POPF needs to solve an enormous number of nonlinear and nonconvex optimization problems. The huge computational burden has become the major bottleneck for practical application. This paper presents a deep learning approach to solve the POPF problem efficiently and accurately. Taking advantage of the deep structure and reconstructive strategy of stacked denoising auto encoders (SDAE), an SDAE-based optimal power flow (OPF) is developed to extract the high-level nonlinear correlations between the system operating condition and the OPF solution. A training process is designed to learn the features of POPF. The trained SDAE network can be utilized to conveniently calculate the OPF solution of random samples generated by Monte-Carlo simulation (MCS) without the need for optimization. A modified IEEE 118-bus power system is simulated to demonstrate the effectiveness of the proposed method.
Tasks	Denoising
Published	2019-06-24
URL	https://arxiv.org/abs/1906.09951v1
PDF	https://arxiv.org/pdf/1906.09951v1.pdf
PWC	https://paperswithcode.com/paper/fast-calculation-of-probabilistic-optimal
Repo
Framework

Sub-query Fragmentation for Query Analysis and Data Caching in the Distributed Environment


Title	Sub-query Fragmentation for Query Analysis and Data Caching in the Distributed Environment
Authors	Santhilata Kuppili Venkata, Katarzyna Musial
Abstract	When data stores and users are distributed geographically, it is essential to organize distributed data cache points at ideal locations to minimize data transfers. To answer this, we are developing an adaptive distributed data caching framework that can identify suitable data chunks to cache and move across a network of community cache locations.
Tasks
Published	2019-10-11
URL	https://arxiv.org/abs/1910.04991v1
PDF	https://arxiv.org/pdf/1910.04991v1.pdf
PWC	https://paperswithcode.com/paper/sub-query-fragmentation-for-query-analysis
Repo
Framework

Learning Deformable Kernels for Image and Video Denoising


Title	Learning Deformable Kernels for Image and Video Denoising
Authors	Xiangyu Xu, Muchen Li, Wenxiu Sun
Abstract	Most of the classical denoising methods restore clear results by selecting and averaging pixels in the noisy input. Instead of relying on hand-crafted selecting and averaging strategies, we propose to explicitly learn this process with deep neural networks. Specifically, we propose deformable 2D kernels for image denoising where the sampling locations and kernel weights are both learned. The proposed kernel naturally adapts to image structures and could effectively reduce the oversmoothing artifacts. Furthermore, we develop 3D deformable kernels for video denoising to more efficiently sample pixels across the spatial-temporal space. Our method is able to solve the misalignment issues of large motion from dynamic scenes. For better training our video denoising model, we introduce the trilinear sampler and a new regularization term. We demonstrate that the proposed method performs favorably against the state-of-the-art image and video denoising approaches on both synthetic and real-world data.
Tasks	Denoising, Image Denoising, Video Denoising
Published	2019-04-15
URL	http://arxiv.org/abs/1904.06903v1
PDF	http://arxiv.org/pdf/1904.06903v1.pdf
PWC	https://paperswithcode.com/paper/learning-deformable-kernels-for-image-and
Repo
Framework

Heterogeneous tissue characterization using ultrasound: a comparison of fractal analysis backscatter models on liver tumors


Title	Heterogeneous tissue characterization using ultrasound: a comparison of fractal analysis backscatter models on liver tumors
Authors	Omar S. Al-Kadi, Daniel Y. F. Chung, Constantin C. Coussios, J. Alison Noble
Abstract	Assessing tumor tissue heterogeneity via ultrasound has recently been suggested for predicting early response to treatment. The ultrasound backscattering characteristics can assist in better understanding the tumor texture by highlighting local concentration and spatial arrangement of tissue scatterers. However, it is challenging to quantify the various tissue heterogeneities ranging from fine-to-coarse of the echo envelope peaks in tumor texture. Local parametric fractal features extracted via maximum likelihood estimation from five well-known statistical model families are evaluated for the purpose of ultrasound tissue characterization. The fractal dimension (self-similarity measure) was used to characterize the spatial distribution of scatterers, while the lacunarity (sparsity measure) was applied to determine scatterer number density. Performance was assessed based on 608 cross-sectional clinical ultrasound RF images of liver tumors (230 and 378 demonstrating respondent and non-respondent cases, respectively). Cross-validation via leave-one-tumor-out and with different k-fold methodologies using a Bayesian classifier was employed for validation. The fractal properties of the backscattered echoes based on the Nakagami model (Nkg) and its extended four-parameter Nakagami-generalized inverse Gaussian (NIG) distribution achieved the best results, with nearly similar performance, for characterizing liver tumor tissue. Accuracy, sensitivity and specificity for the Nkg/NIG were: 85.6%/86.3%, 94.0%/96.0%, and 73.0%/71.0%, respectively. Other statistical models, such as the Rician, Rayleigh, and K-distribution were found to not be as effective in characterizing the subtle changes in tissue texture as an indication of response to treatment. Employing the most relevant and practical statistical model could have potential consequences for the design of an early and effective clinical therapy.
Tasks
Published	2019-12-20
URL	https://arxiv.org/abs/1912.09903v1
PDF	https://arxiv.org/pdf/1912.09903v1.pdf
PWC	https://paperswithcode.com/paper/heterogeneous-tissue-characterization-using
Repo
Framework

Development of email classifier in Brazilian Portuguese using feature selection for automatic response


Title	Development of email classifier in Brazilian Portuguese using feature selection for automatic response
Authors	Rogerio Bonatti, Arthur Gola de Paula
Abstract	Automatic email categorization is an important application of text classification. We study the automatic reply to business email messages in Brazilian Portuguese. We present a novel corpus containing messages from a real application, and baseline categorization experiments using Naive Bayes and Support Vector Machines. We then discuss the effect of lemmatization and the role of part-of-speech tagging filtering on precision and recall. Support Vector Machines classification coupled with nonlemmatized selection of verbs, nouns and adjectives was the best approach, with 87.3% maximum accuracy. Straightforward lemmatization in Portuguese led to the lowest classification results in the group, with 85.3% and 81.7% precision in SVM and Naive Bayes respectively. Thus, while lemmatization reduced precision and recall, part-of-speech filtering improved overall results.
Tasks	Feature Selection, Lemmatization, Part-Of-Speech Tagging, Text Classification
Published	2019-07-08
URL	https://arxiv.org/abs/1907.04905v1
PDF	https://arxiv.org/pdf/1907.04905v1.pdf
PWC	https://paperswithcode.com/paper/development-of-email-classifier-in-brazilian
Repo
Framework

Mobile APP User Attribute Prediction by Heterogeneous Information Network Modeling


Title	Mobile APP User Attribute Prediction by Heterogeneous Information Network Modeling
Authors	Hekai Zhang, Jibing Gong, Zhiyong Teng, Dan Wang, Hongfei Wang, Linfeng Du, Zakirul Alam Bhuiyan
Abstract	User-based attribute information, such as age and gender, is usually considered as user privacy information. It is difficult for enterprises to obtain user-based privacy attribute information. However, user-based privacy attribute information has a wide range of applications in personalized services, user behavior analysis and other aspects. This paper advances the HetPathMine model and puts forward the TPathMine model. By using the number of clicks of attributes under each node to express the user’s emotional preference information, optimizations of the solution of meta-path weight are also presented. Based on meta-paths in heterogeneous information networks, the new model integrates all relationships among objects into isomorphic relationships of classified objects. A matrix is used to realize the dissemination of category knowledge among isomorphic objects. The experimental results show that: (1) the prediction of user attributes based on heterogeneous information networks can achieve higher accuracy than traditional machine learning classification methods; (2) the TPathMine model based on the number of clicks is more accurate in classifying users of different age groups, and the weight of each meta-path is consistent with human intuition or the real world situation.
Tasks
Published	2019-10-06
URL	https://arxiv.org/abs/1910.02450v1
PDF	https://arxiv.org/pdf/1910.02450v1.pdf
PWC	https://paperswithcode.com/paper/mobile-app-user-attribute-prediction-by
Repo
Framework

Enhancing Transformation-based Defenses using a Distribution Classifier


Title	Enhancing Transformation-based Defenses using a Distribution Classifier
Authors	Connie Kou, Hwee Kuan Lee, Ee-Chien Chang, Teck Khim Ng
Abstract	Adversarial attacks on convolutional neural networks (CNN) have gained significant attention and there have been active research efforts on defense mechanisms. Stochastic input transformation methods have been proposed, where the idea is to recover the image from adversarial attack by random transformation, and to take the majority vote as consensus among the random samples. However, the transformation improves the accuracy on adversarial images at the expense of the accuracy on clean images. While it is intuitive that the accuracy on clean images would deteriorate, the exact mechanism by which this occurs is unclear. In this paper, we study the distribution of softmax induced by stochastic transformations. We observe that with random transformations on the clean images, although the mass of the softmax distribution could shift to the wrong class, the resulting distribution of softmax could be used to correct the prediction. Furthermore, on the adversarial counterparts, with the image transformation, the resulting shapes of the distribution of softmax are similar to the distributions from the clean images. With these observations, we propose a method to improve existing transformation-based defenses. We train a separate lightweight distribution classifier to recognize distinct features in the distributions of softmax outputs of transformed images. Our empirical studies show that our distribution classifier, by training on distributions obtained from clean images only, outperforms majority voting for both clean and adversarial images. Our method is generic and can be integrated with existing transformation-based defenses.
Tasks	Adversarial Attack
Published	2019-06-01
URL	https://arxiv.org/abs/1906.00258v2
PDF	https://arxiv.org/pdf/1906.00258v2.pdf
PWC	https://paperswithcode.com/paper/190600258
Repo
Framework
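
The majority-voting baseline described in the abstract above can be sketched as follows. This is a toy illustration only: the threshold "classifier" and the additive-noise "transformation" are hypothetical stand-ins for a CNN and an image transformation, and the paper's proposed distribution classifier is not shown.

```python
import random
from collections import Counter


def majority_vote_predict(classify, transform, x, n_samples, rng):
    """Classify n_samples randomly transformed copies of x and return the top label and vote counts."""
    votes = Counter(classify(transform(x, rng)) for _ in range(n_samples))
    label = votes.most_common(1)[0][0]
    return label, votes


def toy_classify(x):
    """Hypothetical stand-in for a CNN: label 1 if the input is positive, else 0."""
    return 1 if x > 0.0 else 0


def toy_transform(x, rng):
    """Hypothetical stochastic input transformation: additive Gaussian noise."""
    return x + rng.gauss(0.0, 0.5)


rng = random.Random(0)
label, votes = majority_vote_predict(toy_classify, toy_transform, 1.0, 101, rng)
print(label, dict(votes))
```

Majority voting keeps only the most frequent label and discards the rest of the vote distribution. According to the abstract, the proposed method instead trains a lightweight classifier on the distribution of softmax outputs across the transformed samples.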

Computationally Efficient CFD Prediction of Bubbly Flow using Physics-Guided Deep Learning


Title	Computationally Efficient CFD Prediction of Bubbly Flow using Physics-Guided Deep Learning
Authors	Han Bao, Jinyong Feng, Nam Dinh, Hongbin Zhang
Abstract	To realize efficient computational fluid dynamics (CFD) prediction of two-phase flow, a multi-scale framework is proposed in this paper by applying a physics-guided data-driven approach. Instrumental to this framework, a Feature Similarity Measurement (FSM) technique was developed for error estimation in two-phase flow simulation using coarse-mesh CFD, to achieve accuracy comparable to fine-mesh simulations while remaining fast-running. By defining physics-guided parameters and variable gradients as physical features, FSM has the capability to capture the underlying local patterns in the coarse-mesh CFD simulation. Massive low-fidelity data and respective high-fidelity data are used to explore the underlying information relevant to the main simulation errors and the effects of phenomenological scaling. By learning from previous simulation data, a surrogate model using a deep feedforward neural network (DFNN) can be developed and trained to estimate the simulation error of coarse-mesh CFD. The research documented supports the feasibility of physics-guided deep learning methods for coarse-mesh CFD simulations, which have potential for efficient industrial design.
Tasks
Published	2019-10-17
URL	https://arxiv.org/abs/1910.08037v1
PDF	https://arxiv.org/pdf/1910.08037v1.pdf
PWC	https://paperswithcode.com/paper/computationally-efficient-cfd-prediction-of
Repo
Framework

Gradient Descent with Compressed Iterates


Title	Gradient Descent with Compressed Iterates
Authors	Ahmed Khaled, Peter Richtárik
Abstract	We propose and analyze a new type of stochastic first order method: gradient descent with compressed iterates (GDCI). GDCI in each iteration first compresses the current iterate using a lossy randomized compression technique, and subsequently takes a gradient step. This method is a distillation of a key ingredient in the current practice of federated learning, where a model needs to be compressed by a mobile device before it is sent back to a server for aggregation. Our analysis provides a step towards closing the gap between the theory and practice of federated learning, and opens the possibility for many extensions.
Tasks
Published	2019-09-10
URL	https://arxiv.org/abs/1909.04716v2
PDF	https://arxiv.org/pdf/1909.04716v2.pdf
PWC	https://paperswithcode.com/paper/gradient-descent-with-compressed-iterates
Repo
Framework
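
The iteration described in the abstract above (compress the current iterate, then take a gradient step) might look like the sketch below. The exact update rule, the random-sparsification compressor, the quadratic objective, and all names and parameter values are assumptions made for illustration, not details confirmed by the abstract.

```python
import random


def random_sparsify(x, keep_prob, rng):
    """Unbiased lossy compressor: keep each coordinate with probability keep_prob, rescaled by 1 / keep_prob."""
    return [xi / keep_prob if rng.random() < keep_prob else 0.0 for xi in x]


def gdci(grad, x0, step, keep_prob, n_iters, rng):
    """Assumed update: x <- C(x) - step * grad(C(x)), where C is the compressor."""
    x = list(x0)
    for _ in range(n_iters):
        cx = random_sparsify(x, keep_prob, rng)   # compress the current iterate
        g = grad(cx)                              # gradient at the compressed point
        x = [ci - step * gi for ci, gi in zip(cx, g)]
    return x


# Assumed toy objective: f(x) = 0.5 * sum(a_i * x_i^2), minimized at the origin.
CURVATURES = [1.0, 1.5, 2.0]


def quad_grad(x):
    return [a * xi for a, xi in zip(CURVATURES, x)]


rng = random.Random(0)
x_final = gdci(quad_grad, [5.0, -3.0, 2.0], step=0.5, keep_prob=0.8, n_iters=200, rng=rng)
print(x_final)
```

In this toy problem the minimizer is the origin, so the compression noise shrinks together with the iterate and the run reaches the minimizer. For a general objective, compression error would typically leave the iterates in a neighborhood of the solution.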

The Blessings of Multiple Causes: A Reply to Ogburn et al. (2019)


Title	The Blessings of Multiple Causes: A Reply to Ogburn et al. (2019)
Authors	Yixin Wang, David M. Blei
Abstract	Ogburn et al. (2019, arXiv:1910.05438) discuss “The Blessings of Multiple Causes” (Wang and Blei, 2018, arXiv:1805.06826). Many of their remarks are interesting. But they also claim that the paper has “foundational errors” and that its “premise is…incorrect.” These claims are not substantiated. There are no foundational errors; the premise is correct.
Tasks
Published	2019-10-15
URL	https://arxiv.org/abs/1910.07320v3
PDF	https://arxiv.org/pdf/1910.07320v3.pdf
PWC	https://paperswithcode.com/paper/the-blessings-of-multiple-causes-a-reply-to
Repo
Framework