October 17, 2019

Paper Group ANR 765

Detecting non-causal artifacts in multivariate linear regression models


Title	Detecting non-causal artifacts in multivariate linear regression models
Authors	Dominik Janzing, Bernhard Schoelkopf
Abstract	We consider linear models where $d$ potential causes $X_1,\dots,X_d$ are correlated with one target quantity $Y$ and propose a method to infer whether the association is causal or whether it is an artifact caused by overfitting or hidden common causes. We employ the idea that in the former case the vector of regression coefficients has ‘generic’ orientation relative to the covariance matrix $\Sigma_{XX}$ of $X$. Using an ICA-based model for confounding, we show that both confounding and overfitting yield regression vectors that concentrate mainly in the space of low eigenvalues of $\Sigma_{XX}$.
Tasks
Published	2018-03-02
URL	http://arxiv.org/abs/1803.00810v1
PDF	http://arxiv.org/pdf/1803.00810v1.pdf
PWC	https://paperswithcode.com/paper/detecting-non-causal-artifacts-in
Repo
Framework

Neural Control Variates for Variance Reduction


Title	Neural Control Variates for Variance Reduction
Authors	Ruosi Wan, Mingjun Zhong, Haoyi Xiong, Zhanxing Zhu
Abstract	In statistics and machine learning, approximation of an intractable integration is often achieved by using the unbiased Monte Carlo estimator, but the variances of the estimation are generally high in many applications. Control variates approaches are well-known to reduce the variance of the estimation. These control variates are typically constructed by employing predefined parametric functions or polynomials, determined by using those samples drawn from the relevant distributions. Instead, we propose to construct those control variates by learning neural networks to handle the cases when test functions are complex. In many applications, obtaining a large number of samples for Monte Carlo estimation is expensive, which may result in overfitting when training a neural network. We thus further propose to employ auxiliary random variables induced by the original ones to extend data samples for training the neural networks. We apply the proposed control variates with augmented variables to thermodynamic integration and reinforcement learning. Experimental results demonstrate that our method can achieve significant variance reduction compared with other alternatives.
Tasks
Published	2018-06-01
URL	https://arxiv.org/abs/1806.00159v2
PDF	https://arxiv.org/pdf/1806.00159v2.pdf
PWC	https://paperswithcode.com/paper/neural-control-variates-for-variance
Repo
Framework
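
The variance-reduction idea behind control variates can be illustrated with a minimal sketch. This is the classical linear control variate, not the neural-network construction proposed in the paper; the test function exp(U), the control variate U, and the function name `control_variate_estimates` are assumptions chosen for illustration.

```python
import math
import random
import statistics


def control_variate_estimates(n_samples, seed=0):
    """Estimate E[exp(U)], U ~ Uniform(0, 1), with and without a control variate.

    The control variate is g(U) = U, whose mean 0.5 is known exactly.
    Returns (plain_mean, adjusted_mean, plain_variance, adjusted_variance).
    """
    rng = random.Random(seed)
    us = [rng.random() for _ in range(n_samples)]
    f = [math.exp(u) for u in us]  # samples of the integrand
    g = us                         # samples of the control variate

    mean_f = statistics.fmean(f)
    mean_g = statistics.fmean(g)
    cov_fg = sum((a - mean_f) * (b - mean_g) for a, b in zip(f, g)) / (n_samples - 1)

    # Variance-minimizing coefficient, estimated from the same samples
    # (this reuse introduces a small bias that vanishes as n_samples grows).
    beta = cov_fg / statistics.variance(g)

    # Subtract the centred control variate from every sample.
    adjusted = [a - beta * (b - 0.5) for a, b in zip(f, g)]
    return (mean_f, statistics.fmean(adjusted),
            statistics.variance(f), statistics.variance(adjusted))


plain, adjusted, var_plain, var_adjusted = control_variate_estimates(10_000)
print(f"plain estimate:    {plain:.4f} (per-sample variance {var_plain:.4f})")
print(f"adjusted estimate: {adjusted:.4f} (per-sample variance {var_adjusted:.4f})")
print(f"exact value:       {math.e - 1:.4f}")
```

Both estimators target the same integral, e - 1; the adjusted samples have a much smaller variance because exp(U) and U are strongly correlated. The paper replaces the fixed function g with a learned network, which this sketch does not attempt.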

A predictor-corrector method for the training of deep neural networks


Title	A predictor-corrector method for the training of deep neural networks
Authors	Yatin Saraiya
Abstract	The training of deep neural nets is expensive. We present a predictor-corrector method for the training of deep neural nets. It alternates a predictor pass with a corrector pass using stochastic gradient descent with backpropagation such that there is no loss in validation accuracy. No special modifications to SGD with backpropagation are required by this methodology. Our experiments showed a time improvement of 9% on the CIFAR-10 dataset.
Tasks
Published	2018-01-19
URL	http://arxiv.org/abs/1803.05779v1
PDF	http://arxiv.org/pdf/1803.05779v1.pdf
PWC	https://paperswithcode.com/paper/a-predictor-corrector-method-for-the-training
Repo
Framework

Constrained Policy Improvement for Safe and Efficient Reinforcement Learning


Title	Constrained Policy Improvement for Safe and Efficient Reinforcement Learning
Authors	Elad Sarafian, Aviv Tamar, Sarit Kraus
Abstract	We propose a policy improvement algorithm for Reinforcement Learning (RL) which is called Rerouted Behavior Improvement (RBI). RBI is designed to take into account the evaluation errors of the Q-function. Such errors are common in RL when learning the $Q$-value from finite past experience data. Greedy policies or even constrained policy optimization algorithms which ignore these errors may suffer from an improvement penalty (i.e. a negative policy improvement). To minimize the improvement penalty, the RBI idea is to attenuate rapid policy changes of low probability actions which were less frequently sampled. This approach is shown to avoid catastrophic performance degradation and reduce regret when learning from a batch of past experience. Through an example of a two-armed bandit with Gaussian-distributed rewards, we show that it also increases data efficiency when the optimal action has a high variance. We evaluate RBI in two tasks in the Atari Learning Environment: (1) learning from observations of multiple behavior policies and (2) iterative RL. Our results demonstrate the advantage of RBI over greedy policies and other constrained policy optimization algorithms as a safe learning approach and as a general data efficient learning algorithm. An anonymous GitHub repository of our RBI implementation is found at https://github.com/eladsar/rbi.
Tasks
Published	2018-05-20
URL	https://arxiv.org/abs/1805.07805v3
PDF	https://arxiv.org/pdf/1805.07805v3.pdf
PWC	https://paperswithcode.com/paper/safe-policy-learning-from-observations
Repo
Framework

Phonology-Augmented Statistical Framework for Machine Transliteration using Limited Linguistic Resources


Title	Phonology-Augmented Statistical Framework for Machine Transliteration using Limited Linguistic Resources
Authors	Gia H. Ngo, Minh Nguyen, Nancy F. Chen
Abstract	Transliteration converts words in a source language (e.g., English) into words in a target language (e.g., Vietnamese). This conversion considers the phonological structure of the target language, as the transliterated output needs to be pronounceable in the target language. For example, a word in Vietnamese that begins with a consonant cluster is phonologically invalid and thus would be an incorrect output of a transliteration system. Most statistical transliteration approaches, albeit being widely adopted, do not explicitly model the target language’s phonology, which often results in invalid outputs. The problem is compounded by the limited linguistic resources available when converting foreign words to transliterated words in the target language. In this work, we present a phonology-augmented statistical framework suitable for transliteration, especially when only limited linguistic resources are available. We propose the concept of pseudo-syllables as structures representing how segments of a foreign word are organized according to the syllables of the target language’s phonology. We performed transliteration experiments on Vietnamese and Cantonese. We show that the proposed framework outperforms the statistical baseline by up to 44.68% relative, when there are limited training examples (587 entries).
Tasks	Transliteration
Published	2018-10-07
URL	http://arxiv.org/abs/1810.03184v1
PDF	http://arxiv.org/pdf/1810.03184v1.pdf
PWC	https://paperswithcode.com/paper/phonology-augmented-statistical-framework-for
Repo
Framework

$α$-Approximation Density-based Clustering of Multi-valued Objects


Title	$α$-Approximation Density-based Clustering of Multi-valued Objects
Authors	Zhilin Zhang
Abstract	Multi-valued data are commonly found in many real applications. During the process of clustering multi-valued data, most existing methods use sampling or aggregation mechanisms that cannot reflect the real distribution of objects and their instances and thus fail to obtain high-quality clusters. In this paper, a concept of $\alpha$-approximation distance is introduced to measure the connectivity between multi-valued objects by taking account of the distribution of the instances. An $\alpha$-approximation density-based clustering algorithm (DBCMO) is proposed to efficiently cluster the multi-valued objects by using global and local R* tree structures. To speed up the algorithm, four pruning rules on the tree structures are implemented. Empirical studies on synthetic and real datasets demonstrate that DBCMO can efficiently and effectively discover the multi-valued object clusters. A comparison with two existing methods further shows that DBCMO can better handle a continuous decrease in the cluster density and detect clusters of varying density.
Tasks
Published	2018-08-09
URL	http://arxiv.org/abs/1808.03300v2
PDF	http://arxiv.org/pdf/1808.03300v2.pdf
PWC	https://paperswithcode.com/paper/-approximation-density-based-clustering-of
Repo
Framework

Optimal terminal dimensionality reduction in Euclidean space


Title	Optimal terminal dimensionality reduction in Euclidean space
Authors	Shyam Narayanan, Jelani Nelson
Abstract	Let $\varepsilon\in(0,1)$ and $X\subset\mathbb R^d$ be arbitrary with $X$ having size $n>1$. The Johnson-Lindenstrauss lemma states there exists $f:X\rightarrow\mathbb R^m$ with $m = O(\varepsilon^{-2}\log n)$ such that $$ \forall x\in X\ \forall y\in X,\ \|x-y\|_2 \le \|f(x)-f(y)\|_2 \le (1+\varepsilon)\|x-y\|_2 . $$ We show that a strictly stronger version of this statement holds, answering one of the main open questions of [MMMR18]: “$\forall y\in X$” in the above statement may be replaced with “$\forall y\in\mathbb R^d$”, so that $f$ not only preserves distances within $X$, but also distances to $X$ from the rest of space. Previously this stronger version was only known with the worse bound $m = O(\varepsilon^{-4}\log n)$. Our proof is via a tighter analysis of (a specific instantiation of) the embedding recipe of [MMMR18].
Tasks	Dimensionality Reduction
Published	2018-10-22
URL	http://arxiv.org/abs/1810.09250v1
PDF	http://arxiv.org/pdf/1810.09250v1.pdf
PWC	https://paperswithcode.com/paper/optimal-terminal-dimensionality-reduction-in
Repo
Framework
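
The classical Johnson-Lindenstrauss statement quoted in the abstract can be checked empirically with a short sketch. It uses a plain Gaussian random projection, which gives the usual two-sided distortion around 1 rather than the one-sided normalization in the abstract, and it does not implement the terminal embedding of the paper. The dimensions (d = 500, m = 200, n = 20) and the helper names are assumptions chosen for illustration.

```python
import math
import random


def random_projection(points, m, seed=0):
    """Project points from R^d to R^m with an i.i.d. N(0, 1/m) matrix."""
    rng = random.Random(seed)
    d = len(points[0])
    scale = 1.0 / math.sqrt(m)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(m)]
    return [[sum(row[j] * p[j] for j in range(d)) for row in matrix]
            for p in points]


def distance_ratios(points, projected):
    """Ratio of projected to original Euclidean distance for every pair."""
    ratios = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            original = math.dist(points[i], points[j])
            embedded = math.dist(projected[i], projected[j])
            ratios.append(embedded / original)
    return ratios


# 20 random points in 500 dimensions, embedded into 200 dimensions.
data_rng = random.Random(1)
points = [[data_rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(20)]
projected = random_projection(points, m=200)
ratios = distance_ratios(points, projected)
print(f"distance ratios lie in [{min(ratios):.3f}, {max(ratios):.3f}]")
```

All pairwise ratios stay close to 1 even though the dimension drops from 500 to 200. The check only covers pairs inside the point set; the terminal guarantee discussed in the abstract, which also covers distances to arbitrary points of $\mathbb R^d$, needs the nonlinear construction analyzed in the paper.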

Fast Distribution Grid Line Outage Identification with $μ$PMU


Title	Fast Distribution Grid Line Outage Identification with $μ$PMU
Authors	Yizheng Liao, Yang Weng, Chin-Woo Tan, Ram Rajagopal
Abstract	The growing integration of distributed energy resources (DERs) in urban distribution grids raises various reliability issues due to DERs’ uncertain and complex behaviors. With a large-scale DER penetration, traditional outage detection methods, which rely on customers making phone calls and smart meters’ “last gasp” signals, will have limited performance, because the renewable generators can supply power after line outages and many urban grids are meshed, so line outages do not affect power supply. To address these drawbacks, we propose a data-driven outage monitoring approach based on the stochastic time series analysis from micro phasor measurement unit ($\mu$PMU). Specifically, we prove via power flow analysis that the dependency of time-series voltage measurements exhibits significant statistical changes after line outages. This makes the theory on optimal change-point detection suitable to identify line outages via $\mu$PMUs with fast and accurate sampling. However, existing change point detection methods require the post-outage voltage distribution, which is unknown in distribution systems. Therefore, we design a maximum likelihood-based method to directly learn the distribution parameters from $\mu$PMU data. We prove that the estimated parameters-based detection still achieves the optimal performance, making it extremely useful for distribution grid outage identifications. Simulation results show highly accurate outage identification in eight distribution grids with 14 configurations with and without DERs using $\mu$PMU data.
Tasks	Change Point Detection, Time Series, Time Series Analysis
Published	2018-11-14
URL	http://arxiv.org/abs/1811.05646v1
PDF	http://arxiv.org/pdf/1811.05646v1.pdf
PWC	https://paperswithcode.com/paper/fast-distribution-grid-line-outage
Repo
Framework

Speaker Adapted Beamforming for Multi-Channel Automatic Speech Recognition


Title	Speaker Adapted Beamforming for Multi-Channel Automatic Speech Recognition
Authors	Tobias Menne, Ralf Schlüter, Hermann Ney
Abstract	This paper presents, in the context of multi-channel ASR, a method to adapt a mask based, statistically optimal beamforming approach to a speaker of interest. The beamforming vector of the statistically optimal beamformer is computed by utilizing speech and noise masks, which are estimated by a neural network. The proposed adaptation approach is based on the integration of the beamformer, which includes the mask estimation network, and the acoustic model of the ASR system. This allows for the propagation of the training error, from the acoustic modeling cost function, all the way through the beamforming operation and through the mask estimation network. By using the results of a first pass recognition and by keeping all other parameters fixed, the mask estimation network can therefore be fine tuned by retraining. Utterances of a speaker of interest can thus be used in a two pass approach, to optimize the beamforming for the speech characteristics of that specific speaker. It is shown that this approach improves the ASR performance of a state-of-the-art multi-channel ASR system on the CHiME-4 data. Furthermore the effect of the adaptation on the estimated speech masks is discussed.
Tasks	Speech Recognition
Published	2018-06-19
URL	http://arxiv.org/abs/1806.07407v1
PDF	http://arxiv.org/pdf/1806.07407v1.pdf
PWC	https://paperswithcode.com/paper/speaker-adapted-beamforming-for-multi-channel
Repo
Framework

Deep Learning versus Classical Regression for Brain Tumor Patient Survival Prediction


Title	Deep Learning versus Classical Regression for Brain Tumor Patient Survival Prediction
Authors	Yannick Suter, Alain Jungo, Michael Rebsamen, Urspeter Knecht, Evelyn Herrmann, Roland Wiest, Mauricio Reyes
Abstract	Deep learning for regression tasks on medical imaging data has shown promising results. However, compared to other approaches, their power is strongly linked to the dataset size. In this study, we evaluate 3D-convolutional neural networks (CNNs) and classical regression methods with hand-crafted features for survival time regression of patients with high grade brain tumors. The tested CNNs for regression showed promising but unstable results. The best performing deep learning approach reached an accuracy of 51.5% on held-out samples of the training set. All tested deep learning experiments were outperformed by a Support Vector Classifier (SVC) using 30 radiomic features. The investigated features included intensity, shape, location and deep features. The submitted method to the BraTS 2018 survival prediction challenge is an ensemble of SVCs, which reached a cross-validated accuracy of 72.2% on the BraTS 2018 training set, 57.1% on the validation set, and 42.9% on the testing set. The results suggest that more training data is necessary for a stable performance of a CNN model for direct regression from magnetic resonance images, and that non-imaging clinical patient information is crucial along with imaging information.
Tasks
Published	2018-11-12
URL	http://arxiv.org/abs/1811.04907v1
PDF	http://arxiv.org/pdf/1811.04907v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-versus-classical-regression-for
Repo
Framework

Detecting Changes in User Preferences using Hidden Markov Models for Sequential Recommendation Tasks


Title	Detecting Changes in User Preferences using Hidden Markov Models for Sequential Recommendation Tasks
Authors	Farzad Eskandanian, Bamshad Mobasher
Abstract	Recommender systems help users find relevant items of interest based on the past preferences of those users. In many domains, however, the tastes and preferences of users change over time due to a variety of factors and recommender systems should capture these dynamics in user preferences in order to remain tuned to the most current interests of users. In this work we present a recommendation framework based on Hidden Markov Models (HMM) which takes into account the dynamics of user preferences. We propose an HMM-based approach to change point detection in the sequence of user interactions which reflect significant changes in preference according to the sequential behavior of all the users in the data. The proposed framework leverages the identified change points to generate recommendations in two ways. In one approach change points are used to create a sequence-aware non-negative matrix factorization model to generate recommendations that are aligned with the current tastes of the user. In the second approach the HMM is used directly to generate recommendations taking into account the identified change points. These models are evaluated in terms of accuracy of change point detection and also the effectiveness of recommendations using a real music streaming dataset.
Tasks	Change Point Detection, Recommendation Systems
Published	2018-09-29
URL	http://arxiv.org/abs/1810.00272v1
PDF	http://arxiv.org/pdf/1810.00272v1.pdf
PWC	https://paperswithcode.com/paper/181000272
Repo
Framework

Differentially Private Change-Point Detection


Title	Differentially Private Change-Point Detection
Authors	Rachel Cummings, Sara Krehbiel, Yajun Mei, Rui Tuo, Wanrong Zhang
Abstract	The change-point detection problem seeks to identify distributional changes at an unknown change-point k* in a stream of data. This problem appears in many important practical settings involving personal data, including biosurveillance, fault detection, finance, signal detection, and security systems. The field of differential privacy offers data analysis tools that provide powerful worst-case privacy guarantees. We study the statistical problem of change-point detection through the lens of differential privacy. We give private algorithms for both online and offline change-point detection, analyze these algorithms theoretically, and provide empirical validation of our results.
Tasks	Change Point Detection, Fault Detection
Published	2018-08-29
URL	http://arxiv.org/abs/1808.10056v1
PDF	http://arxiv.org/pdf/1808.10056v1.pdf
PWC	https://paperswithcode.com/paper/differentially-private-change-point-detection
Repo
Framework
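
The offline change-point problem described in the abstract can be sketched in its plain, non-private form. The example assumes Gaussian observations with a known pre-change mean `mu0`, a known post-change mean `mu1`, and a shared standard deviation `sigma`; these modeling choices, the function names, and the toy data are illustrative assumptions, and the privacy mechanism studied in the paper is not included.

```python
def log_likelihood_ratio(x, mu0, mu1, sigma):
    """log( N(x; mu1, sigma^2) / N(x; mu0, sigma^2) ) for a single observation."""
    return (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2.0)


def mle_change_point(data, mu0, mu1, sigma=1.0):
    """Return the 0-based index k that maximizes the likelihood of a change at k.

    Observations before k are modeled with mean mu0 and observations from k
    onward with mean mu1, so the score of k is the suffix sum of
    log-likelihood ratios over data[k:].
    """
    best_k, best_score = 0, float("-inf")
    suffix = 0.0
    # Walk backwards so each suffix sum is built incrementally in O(n).
    for k in range(len(data) - 1, -1, -1):
        suffix += log_likelihood_ratio(data[k], mu0, mu1, sigma)
        if suffix > best_score:
            best_k, best_score = k, suffix
    return best_k


# Toy stream: the mean shifts from about 0 to about 2 at index 5.
stream = [0.1, -0.2, 0.0, 0.3, -0.1, 1.9, 2.2, 2.0, 1.8, 2.1]
print(mle_change_point(stream, mu0=0.0, mu1=2.0))  # 5
```

A single backward pass suffices because the score of each candidate k differs from the score of k + 1 by one log-likelihood-ratio term.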

Characterizing the Influence of Features on Reading Difficulty Estimation for Non-native Readers


Title	Characterizing the Influence of Features on Reading Difficulty Estimation for Non-native Readers
Authors	Yi-Ting Huang, Meng Chang Chen, Yeali S. Sun
Abstract	In recent years, the number of people studying English as a second language (ESL) has surpassed the number of native speakers. Recent work has demonstrated the success of providing personalized content based on reading difficulty, such as information retrieval and summarization. However, almost all prior studies of reading difficulty are designed for native speakers, rather than non-native readers. In this study, we investigate various features for ESL readers, by conducting a linear regression to estimate the reading level of English language sources. This estimation is based not only on the complexity of lexical and syntactic features, but also on several novel concepts, including the age of word and grammar acquisition from several sources, word sense from WordNet, and the implicit relation between sentences. By employing the Bayesian Information Criterion (BIC) to select the optimal model, we find that the combination of the number of words, the age of word acquisition and the height of the parsing tree generates better results than alternative competing models. Thus, our results show that the proposed second language reading difficulty estimation outperforms other first language reading difficulty estimations.
Tasks	Information Retrieval
Published	2018-08-29
URL	http://arxiv.org/abs/1808.09718v1
PDF	http://arxiv.org/pdf/1808.09718v1.pdf
PWC	https://paperswithcode.com/paper/characterizing-the-influence-of-features-on
Repo
Framework

An Overview of Vulnerabilities of Voice Controlled Systems


Title	An Overview of Vulnerabilities of Voice Controlled Systems
Authors	Yuan Gong, Christian Poellabauer
Abstract	Over the last few years, a rapidly increasing number of Internet-of-Things (IoT) systems that adopt voice as the primary user input have emerged. These systems have been shown to be vulnerable to various types of voice spoofing attacks. However, how exactly these techniques differ or relate to each other has not been extensively studied. In this paper, we provide a survey of recent attack and defense techniques for voice controlled systems and propose a classification of these techniques. We also discuss the need for a universal defense strategy that protects a system from various types of attacks.
Tasks
Published	2018-03-24
URL	http://arxiv.org/abs/1803.09156v1
PDF	http://arxiv.org/pdf/1803.09156v1.pdf
PWC	https://paperswithcode.com/paper/an-overview-of-vulnerabilities-of-voice
Repo
Framework

High-Resolution Mammogram Synthesis using Progressive Generative Adversarial Networks


Title	High-Resolution Mammogram Synthesis using Progressive Generative Adversarial Networks
Authors	Dimitrios Korkinof, Tobias Rijken, Michael O’Neill, Joseph Yearsley, Hugh Harvey, Ben Glocker
Abstract	The ability to generate synthetic medical images is useful for data augmentation, domain transfer, and out-of-distribution detection. However, generating realistic, high-resolution medical images is challenging, particularly for Full Field Digital Mammograms (FFDM), due to the textural heterogeneity, fine structural details and specific tissue properties. In this paper, we explore the use of progressively trained generative adversarial networks (GANs) to synthesize mammograms, overcoming the underlying instabilities when training such adversarial models. This work is the first to show that generation of realistic synthetic medical images is feasible at up to 1280x1024 pixels, the highest resolution achieved for medical image synthesis, enabling visualizations within standard mammographic hanging protocols. We hope this work can serve as a useful guide and facilitate further research on GANs in the medical imaging domain.
Tasks	Data Augmentation, Image Generation, Out-of-Distribution Detection
Published	2018-07-09
URL	https://arxiv.org/abs/1807.03401v2
PDF	https://arxiv.org/pdf/1807.03401v2.pdf
PWC	https://paperswithcode.com/paper/high-resolution-mammogram-synthesis-using
Repo
Framework