Paper Group ANR 21
Correlated signal inference by free energy exploration. Natural Scene Character Recognition Using Robust PCA and Sparse Representation. Understanding data augmentation for classification: when to warp?. A note on the expected minimum error probability in equientropic channels. Measurement Bounds for Sparse Signal Reconstruction with Multiple Side I …
Correlated signal inference by free energy exploration
Title | Correlated signal inference by free energy exploration |
Authors | Torsten A. Enßlin, Jakob Knollmüller |
Abstract | The inference of correlated signal fields with unknown correlation structures is of high scientific and technological relevance, but poses significant conceptual and numerical challenges. To address these, we develop the correlated signal inference (CSI) algorithm within information field theory (IFT) and discuss its numerical implementation. To this end, we introduce the free energy exploration (FrEE) strategy for numerical information field theory (NIFTy) applications. The FrEE strategy is to let the mathematical structure of the inference problem determine the dynamics of the numerical solver. FrEE uses the Gibbs free energy formalism for all involved unknown fields and correlation structures without marginalization of nuisance quantities. It thereby avoids the complexity that marginalization often imposes on IFT equations. FrEE simultaneously solves for the mean and the uncertainties of signal, nuisance, and auxiliary fields, while exploiting any analytically calculable quantity. Finally, FrEE uses a problem-specific and self-tuning exploration strategy to swiftly identify the optimal field estimates as well as their uncertainty maps. For all estimated fields, properly weighted posterior samples drawn from their exact, fully non-Gaussian distributions can be generated. Here, we develop the FrEE strategies for the CSI of a normal, a log-normal, and a Poisson log-normal IFT signal inference problem and demonstrate their performances via their NIFTy implementations. |
Tasks | |
Published | 2016-12-26 |
URL | http://arxiv.org/abs/1612.08406v2 |
http://arxiv.org/pdf/1612.08406v2.pdf | |
PWC | https://paperswithcode.com/paper/correlated-signal-inference-by-free-energy |
Repo | |
Framework | |
Natural Scene Character Recognition Using Robust PCA and Sparse Representation
Title | Natural Scene Character Recognition Using Robust PCA and Sparse Representation |
Authors | Zheng Zhang, Yong Xu, Cheng-Lin Liu |
Abstract | Natural scene character recognition is challenging due to the cluttered background, which is hard to separate from text. In this paper, we propose a novel method for robust scene character recognition. Specifically, we first use robust principal component analysis (PCA) to denoise the character image by recovering the missing low-rank component and filtering out the sparse noise term; we then use simple Histogram of Oriented Gradients (HOG) features for image feature extraction, and finally use a sparse-representation-based classifier for recognition. In experiments on four public datasets, namely the Char74K dataset, the ICDAR 2003 robust reading dataset, the Street View Text (SVT) dataset, and the IIIT5K-word dataset, our method was demonstrated to be competitive with the state-of-the-art methods. |
Tasks | |
Published | 2016-06-15 |
URL | http://arxiv.org/abs/1606.04616v1 |
http://arxiv.org/pdf/1606.04616v1.pdf | |
PWC | https://paperswithcode.com/paper/natural-scene-character-recognition-using |
Repo | |
Framework | |
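The pipeline this abstract describes (robust PCA denoising, then HOG features, then a sparse-representation classifier) can be sketched compactly. Below is a minimal, hedged Python sketch of the robust PCA step via a simplified principal component pursuit; the threshold heuristics are common defaults, not the paper's settings, and the HOG and classifier stages are only indicated in comments.

```python
# Minimal sketch, assuming a grayscale character image X (2-D numpy array).
import numpy as np

def robust_pca(X, lam=None, n_iter=100):
    """Decompose X into low-rank L plus sparse S (inexact-ALM-style sketch)."""
    m, n = X.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))       # standard PCP weight
    mu = 0.25 * m * n / np.abs(X).sum()         # common step-size heuristic
    L = np.zeros_like(X); S = np.zeros_like(X); Y = np.zeros_like(X)
    for _ in range(n_iter):
        # singular-value thresholding for the low-rank part
        U, sig, Vt = np.linalg.svd(X - S + Y / mu, full_matrices=False)
        L = U @ np.diag(np.maximum(sig - 1.0 / mu, 0)) @ Vt
        # entrywise soft thresholding for the sparse noise part
        R = X - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0)
        Y += mu * (X - L - S)                   # dual update
    return L, S

# L, S = robust_pca(noisy_char)   # denoised character is L
# feats = hog(L)                  # e.g. skimage.feature.hog
# label = src_classify(feats, D)  # sparse-representation classifier; D and
#                                 # src_classify are hypothetical stand-ins
```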
Understanding data augmentation for classification: when to warp?
Title | Understanding data augmentation for classification: when to warp? |
Authors | Sebastien C. Wong, Adam Gatt, Victor Stamatescu, Mark D. McDonnell |
Abstract | In this paper we investigate the benefit of augmenting data with synthetically created samples when training a machine learning classifier. Two approaches for creating additional training samples are data warping, which generates additional samples through transformations applied in the data-space, and synthetic over-sampling, which creates additional samples in feature-space. We experimentally evaluate the benefits of data augmentation for a convolutional backpropagation-trained neural network, a convolutional support vector machine and a convolutional extreme learning machine classifier, using the standard MNIST handwritten digit dataset. We found that while it is possible to perform generic augmentation in feature-space, if plausible transforms for the data are known then augmentation in data-space provides a greater benefit for improving performance and reducing overfitting. |
Tasks | Data Augmentation |
Published | 2016-09-28 |
URL | http://arxiv.org/abs/1609.08764v2 |
http://arxiv.org/pdf/1609.08764v2.pdf | |
PWC | https://paperswithcode.com/paper/understanding-data-augmentation-for |
Repo | |
Framework | |
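The two augmentation routes this abstract contrasts can be illustrated directly. The sketch below shows a data-space warp (a random shear of the raw image) and a SMOTE-style feature-space over-sampler; parameter values and helper names are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from scipy.ndimage import affine_transform

def warp_image(img, max_shear=0.2, rng=np.random.default_rng()):
    """Data-space augmentation: a random shear applied to the pixels."""
    shear = rng.uniform(-max_shear, max_shear)
    A = np.array([[1.0, shear], [0.0, 1.0]])   # simple shear matrix
    center = np.array(img.shape) / 2
    offset = center - A @ center               # keep the digit centred
    return affine_transform(img, A, offset=offset, order=1)

def smote_sample(feats, labels, cls, k=5, rng=np.random.default_rng()):
    """Feature-space augmentation: interpolate between same-class neighbours."""
    X = feats[labels == cls]
    i = rng.integers(len(X))
    d = np.linalg.norm(X - X[i], axis=1)
    j = rng.choice(np.argsort(d)[1:k + 1])     # one of the k nearest neighbours
    return X[i] + rng.random() * (X[j] - X[i])
```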
A note on the expected minimum error probability in equientropic channels
Title | A note on the expected minimum error probability in equientropic channels |
Authors | Sebastian Weichwald, Tatiana Fomina, Bernhard Schölkopf, Moritz Grosse-Wentrup |
Abstract | While the channel capacity reflects a theoretical upper bound on the achievable information transmission rate in the limit of infinitely many bits, it does not characterise the information transfer of a given encoding routine with finitely many bits. In this note, we characterise the quality of a code (i.e., a given encoding routine) by an upper bound on the expected minimum error probability that can be achieved when using this code. We show that for equientropic channels this upper bound is minimal for codes with maximal marginal entropy. As an instructive example, we show for the additive white Gaussian noise (AWGN) channel that random coding (also a capacity-achieving code) indeed maximises the marginal entropy in the limit of infinitely many messages. |
Tasks | |
Published | 2016-05-23 |
URL | http://arxiv.org/abs/1605.07094v2 |
http://arxiv.org/pdf/1605.07094v2.pdf | |
PWC | https://paperswithcode.com/paper/a-note-on-the-expected-minimum-error |
Repo | |
Framework | |
Measurement Bounds for Sparse Signal Reconstruction with Multiple Side Information
Title | Measurement Bounds for Sparse Signal Reconstruction with Multiple Side Information |
Authors | Huynh Van Luong, Jurgen Seiler, Andre Kaup, Soren Forchhammer, Nikos Deligiannis |
Abstract | In the context of compressed sensing (CS), this paper considers the problem of reconstructing sparse signals with the aid of other given correlated sources as multiple side information. To address this problem, we theoretically study a generic weighted $n$-$\ell_{1}$ minimization framework and propose a reconstruction algorithm that leverages multiple side information signals (RAMSI). The proposed RAMSI algorithm adaptively computes optimal weights among the side information signals at every reconstruction iteration. In addition, we establish theoretical bounds on the number of measurements required to successfully reconstruct the sparse source using weighted $n$-$\ell_{1}$ minimization. The analysis of the established bounds reveals that weighted $n$-$\ell_{1}$ minimization can achieve sharper bounds and significant performance improvements over classical CS. We evaluate the proposed RAMSI algorithm and the established bounds experimentally, using synthetic sparse signals as well as correlated feature histograms extracted from a multiview image database for object recognition. The results show clearly that the proposed algorithm outperforms state-of-the-art algorithms, including classical CS, $\ell_1$-$\ell_1$ minimization, Modified-CS, regularized Modified-CS, and weighted $\ell_1$ minimization, in terms of both the theoretical bounds and the practical performance. |
Tasks | Object Recognition |
Published | 2016-05-10 |
URL | http://arxiv.org/abs/1605.03234v2 |
http://arxiv.org/pdf/1605.03234v2.pdf | |
PWC | https://paperswithcode.com/paper/measurement-bounds-for-sparse-signal |
Repo | |
Framework | |
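As a rough illustration of the weighted $n$-$\ell_{1}$ idea, the sketch below runs a proximal-gradient-style iteration that pulls the estimate toward each side-information signal and reweights the signals by how well they currently match. The weight rule and the averaged proximal step are simplified stand-ins for RAMSI's adaptive update, not the paper's exact algorithm.

```python
import numpy as np

def ramsi_sketch(y, A, Z, lam=0.1, n_iter=100):
    """y: measurements, A: sensing matrix, Z: list of side-information signals."""
    n = A.shape[1]
    x = np.zeros(n)
    w = np.full(len(Z), 1.0 / len(Z))            # uniform initial weights
    step = 1.0 / np.linalg.norm(A, 2) ** 2       # 1 / Lipschitz constant
    for _ in range(n_iter):
        g = x - step * A.T @ (A @ x - y)         # gradient step on the data fit
        # soft-threshold toward each side-information signal, then average
        x = sum(wj * (zj + np.sign(g - zj) *
                      np.maximum(np.abs(g - zj) - step * lam, 0))
                for wj, zj in zip(w, Z))
        err = np.array([np.abs(x - zj).sum() + 1e-12 for zj in Z])
        w = (1 / err) / (1 / err).sum()          # favour the closer sources
    return x
```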
Drift Analysis and Evolutionary Algorithms Revisited
Title | Drift Analysis and Evolutionary Algorithms Revisited |
Authors | Johannes Lengler, Angelika Steger |
Abstract | One of the simplest randomized greedy optimization algorithms is the following evolutionary algorithm, which aims at maximizing a boolean function $f:\{0,1\}^n \to \mathbb{R}$. The algorithm starts with a random search point $\xi \in \{0,1\}^n$, and in each round it flips each bit of $\xi$ with probability $c/n$ independently at random, where $c>0$ is a fixed constant. The offspring $\xi'$ thus created replaces $\xi$ if and only if $f(\xi') \ge f(\xi)$. The analysis of the runtime of this simple algorithm on monotone and on linear functions has turned out to be highly non-trivial. In this paper we review known results and provide new and self-contained proofs of partly stronger results. |
Tasks | |
Published | 2016-08-10 |
URL | http://arxiv.org/abs/1608.03226v4 |
http://arxiv.org/pdf/1608.03226v4.pdf | |
PWC | https://paperswithcode.com/paper/drift-analysis-and-evolutionary-algorithms |
Repo | |
Framework | |
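The algorithm in this abstract is simple enough to transcribe directly. The sketch below follows the description verbatim (random start, independent bit flips with probability $c/n$, accept if not worse); the fixed evaluation budget is an added practical assumption.

```python
import numpy as np

def one_plus_one_ea(f, n, c=1.0, budget=10_000, rng=np.random.default_rng()):
    xi = rng.integers(0, 2, size=n)          # random start in {0,1}^n
    for _ in range(budget):
        flips = rng.random(n) < c / n        # each bit flips w.p. c/n
        off = np.where(flips, 1 - xi, xi)    # the offspring xi'
        if f(off) >= f(xi):                  # accept if and only if not worse
            xi = off
    return xi

# Example on the linear function OneMax, f(x) = sum(x):
# best = one_plus_one_ea(lambda x: x.sum(), n=50)
```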
Citation Classification for Behavioral Analysis of a Scientific Field
Title | Citation Classification for Behavioral Analysis of a Scientific Field |
Authors | David Jurgens, Srijan Kumar, Raine Hoover, Dan McFarland, Dan Jurafsky |
Abstract | Citations are an important indicator of the state of a scientific field, reflecting how authors frame their work, and influencing uptake by future scholars. However, our understanding of citation behavior has been limited to small-scale manual citation analysis. We perform the largest behavioral study of citations to date, analyzing how citations are both framed and taken up by scholars in one entire field: natural language processing. We introduce a new dataset of nearly 2,000 citations annotated for function and centrality, and use it to develop a state-of-the-art classifier and label the entire ACL Reference Corpus. We then study how citations are framed by authors and use both papers and online traces to track how citations are followed by readers. We demonstrate that authors are sensitive to discourse structure and publication venue when citing, that online readers follow temporal links to previous and future work rather than methodological links, and that how a paper cites related work is predictive of its citation count. Finally, we use changes in citation roles to show that the field of NLP is undergoing a significant increase in consensus. |
Tasks | |
Published | 2016-09-02 |
URL | http://arxiv.org/abs/1609.00435v1 |
http://arxiv.org/pdf/1609.00435v1.pdf | |
PWC | https://paperswithcode.com/paper/citation-classification-for-behavioral |
Repo | |
Framework | |
Learning Arbitrary Sum-Product Network Leaves with Expectation-Maximization
Title | Learning Arbitrary Sum-Product Network Leaves with Expectation-Maximization |
Authors | Mattia Desana, Christoph Schnörr |
Abstract | Sum-Product Networks (SPNs) with complex probability distributions at the leaves have been shown to be powerful probabilistic models with tractable inference. However, while learning the internal parameters has been amply studied, learning complex leaf distributions is an open problem with only a few results available in special cases. In this paper we derive an efficient method to learn a very large class of leaf distributions with Expectation-Maximization (EM). The EM updates have the form of simple weighted maximum likelihood problems, allowing the use of any distribution that can be learned with maximum likelihood, even approximately. The algorithm has cost linear in the model size and converges even if only partial optimizations are performed. We demonstrate this approach with experiments on twenty real-life datasets for density estimation, using tree graphical models as leaves. Our model outperforms state-of-the-art methods for parameter learning despite using SPNs with far fewer parameters. |
Tasks | Density Estimation |
Published | 2016-04-25 |
URL | http://arxiv.org/abs/1604.07243v3 |
http://arxiv.org/pdf/1604.07243v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-arbitrary-sum-product-network-leaves |
Repo | |
Framework | |
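The abstract's central observation, that the EM update for a leaf reduces to a weighted maximum-likelihood problem, is easy to make concrete. A minimal sketch follows, assuming Gaussian leaves for simplicity (the paper itself uses tree graphical models); the weights stand for the per-sample responsibilities an SPN's EM pass would supply.

```python
import numpy as np

def weighted_mle_gaussian(x, w):
    """Weighted ML fit: exactly the form the leaf update takes under EM."""
    w = w / w.sum()                    # normalise the responsibilities
    mu = np.dot(w, x)                  # weighted mean
    var = np.dot(w, (x - mu) ** 2)     # weighted variance
    return mu, var

# x = samples reaching this leaf, w = EM responsibilities for the leaf:
# mu, var = weighted_mle_gaussian(x, w)
```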
No-Regret Replanning under Uncertainty
Title | No-Regret Replanning under Uncertainty |
Authors | Wen Sun, Niteesh Sood, Debadeepta Dey, Gireeja Ranade, Siddharth Prakash, Ashish Kapoor |
Abstract | This paper explores the problem of path planning under uncertainty. Specifically, we consider online receding-horizon planners that need to operate in a latent environment where the latent information can be modeled via Gaussian processes. Online path planning in latent environments is challenging because the robot needs to explore the environment to obtain a more accurate model of the latent information for better planning later, while also completing its task as quickly as possible. We propose UCB-style algorithms, popular in bandit settings, and show how their analyses can be adapted to online robotic path planning problems. The proposed algorithm trades off exploration and exploitation in a near-optimal manner and has appealing no-regret properties. We demonstrate the efficacy of the framework on aircraft flight path planning when the winds are only partially observed. |
Tasks | Gaussian Processes |
Published | 2016-09-16 |
URL | http://arxiv.org/abs/1609.05162v1 |
http://arxiv.org/pdf/1609.05162v1.pdf | |
PWC | https://paperswithcode.com/paper/no-regret-replanning-under-uncertainty |
Repo | |
Framework | |
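A hedged sketch of the UCB-style rule being adapted: score candidate waypoints by Gaussian-process posterior mean plus an exploration bonus. The kernel and `beta` below are illustrative choices, not the paper's.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def ucb_pick(gp, candidates, beta=2.0):
    """Pick the candidate maximising posterior mean + beta * posterior std."""
    mu, std = gp.predict(candidates, return_std=True)
    return candidates[np.argmax(mu + beta * std)]  # optimism under uncertainty

# gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0))
# gp.fit(visited_points, observed_values)          # latent field, e.g. winds
# next_waypoint = ucb_pick(gp, candidate_points)
```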
Minimal Gated Unit for Recurrent Neural Networks
Title | Minimal Gated Unit for Recurrent Neural Networks |
Authors | Guo-Bing Zhou, Jianxin Wu, Chen-Lin Zhang, Zhi-Hua Zhou |
Abstract | Recently, recurrent neural networks (RNNs) have been very successful in handling sequence data. However, understanding RNNs and finding best practices for them is a difficult task, partly because there are many competing and complex hidden units (such as the LSTM and the GRU). We propose a gated unit for RNNs, named the Minimal Gated Unit (MGU), since it contains only one gate, which is a minimal design among all gated hidden units. The design of MGU benefits from evaluation results on LSTM and GRU in the literature. Experiments on various sequence data show that MGU has comparable accuracy with GRU, but has a simpler structure, fewer parameters, and faster training. Hence, MGU is suitable for RNN applications. Its simple architecture also means that it is easier to evaluate and tune, and in principle it is easier to study MGU's properties theoretically and empirically. |
Tasks | |
Published | 2016-03-31 |
URL | http://arxiv.org/abs/1603.09420v1 |
http://arxiv.org/pdf/1603.09420v1.pdf | |
PWC | https://paperswithcode.com/paper/minimal-gated-unit-for-recurrent-neural |
Repo | |
Framework | |
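Since MGU is defined by a single gate, its forward step fits in a few lines. The sketch below follows the abstract's description: one forget-style gate both mixes the previous state into the candidate and interpolates the new hidden state. Weight shapes and initialisation are left to the caller.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mgu_step(x, h_prev, Wf, Uf, bf, Wh, Uh, bh):
    """One MGU step: a single gate f, versus the GRU's two gates."""
    f = sigmoid(Wf @ x + Uf @ h_prev + bf)               # the single gate
    h_tilde = np.tanh(Wh @ x + Uh @ (f * h_prev) + bh)   # candidate state
    return (1 - f) * h_prev + f * h_tilde                # gated interpolation
```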
Function-Specific Mixing Times and Concentration Away from Equilibrium
Title | Function-Specific Mixing Times and Concentration Away from Equilibrium |
Authors | Maxim Rabinovich, Aaditya Ramdas, Michael I. Jordan, Martin J. Wainwright |
Abstract | Slow mixing is the central hurdle when working with Markov chains, especially those used for Markov chain Monte Carlo (MCMC) approximations. In many applications, it is only of interest to estimate the stationary expectations of a small set of functions, and so the usual definition of mixing based on total variation convergence may be too conservative. Accordingly, we introduce function-specific analogs of mixing times and spectral gaps, and use them to prove Hoeffding-like function-specific concentration inequalities. These results show that it is possible for empirical expectations of functions to concentrate long before the underlying chain has mixed in the classical sense, and we show that the concentration rates we achieve are optimal up to constants. We use our techniques to derive confidence intervals that are sharper than those implied by both classical Markov chain Hoeffding bounds and Berry-Esseen-corrected CLT bounds. For applications that require testing, rather than point estimation, we show similar improvements over recent sequential testing results for MCMC. We conclude by applying our framework to real data examples of MCMC, providing evidence that our theory is both accurate and relevant to practice. |
Tasks | |
Published | 2016-05-06 |
URL | http://arxiv.org/abs/1605.02077v2 |
http://arxiv.org/pdf/1605.02077v2.pdf | |
PWC | https://paperswithcode.com/paper/function-specific-mixing-times-and |
Repo | |
Framework | |
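A toy illustration of the paper's premise: empirical averages of a function can concentrate long before the chain mixes in total variation. The chain and function below are deliberately extreme illustrative choices, not examples from the paper.

```python
import numpy as np

# A slow two-state chain: total-variation mixing takes on the order of
# 1 / 0.02 = 50 steps, since the chain rarely switches states.
P = np.array([[0.99, 0.01],
              [0.01, 0.99]])
f = np.array([1.0, 1.0])   # f is constant, so its "f-mixing" is instant

rng = np.random.default_rng(0)
s, total = 0, 0.0
for t in range(200):
    s = rng.choice(2, p=P[s])
    total += f[s]
print(total / 200)          # = 1.0, the stationary expectation of f,
                            # reached regardless of TV mixing progress
```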
Variational Hamiltonian Monte Carlo via Score Matching
Title | Variational Hamiltonian Monte Carlo via Score Matching |
Authors | Cheng Zhang, Babak Shahbaba, Hongkai Zhao |
Abstract | Traditionally, the field of computational Bayesian statistics has been divided into two main subfields: variational methods and Markov chain Monte Carlo (MCMC). In recent years, however, several methods have been proposed based on combining variational Bayesian inference and MCMC simulation in order to improve their overall accuracy and computational efficiency. This marriage of fast evaluation and flexible approximation provides a promising means of designing scalable Bayesian inference methods. In this paper, we explore the possibility of incorporating variational approximation into a state-of-the-art MCMC method, Hamiltonian Monte Carlo (HMC), to reduce the required gradient computation in the simulation of Hamiltonian flow, which is the bottleneck for many applications of HMC in big data problems. To this end, we use a *free-form* approximation induced by a fast and flexible surrogate function based on single-hidden-layer feedforward neural networks. The surrogate provides sufficiently accurate approximation while allowing for fast exploration of parameter space, resulting in an efficient approximate inference algorithm. We demonstrate the advantages of our method on both synthetic and real data problems. |
Tasks | Bayesian Inference |
Published | 2016-02-06 |
URL | http://arxiv.org/abs/1602.02219v2 |
http://arxiv.org/pdf/1602.02219v2.pdf | |
PWC | https://paperswithcode.com/paper/variational-hamiltonian-monte-carlo-via-score |
Repo | |
Framework | |
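The computational idea, replacing the exact gradient of the log-density with a cheap surrogate inside the leapfrog integrator, can be sketched independently of the score-matched network the paper fits. In the sketch below, `grad_logp` is assumed to be any callable approximating $\nabla \log \pi$; the accept/reject step of full HMC is omitted.

```python
import numpy as np

def leapfrog(q, p, grad_logp, eps=0.05, n_steps=20):
    """Leapfrog integration of Hamiltonian flow with a surrogate gradient."""
    p = p + 0.5 * eps * grad_logp(q)        # half step in momentum
    for _ in range(n_steps - 1):
        q = q + eps * p                     # full step in position
        p = p + eps * grad_logp(q)          # surrogate gradient, not exact
    q = q + eps * p
    p = p + 0.5 * eps * grad_logp(q)        # final half step
    return q, p

# grad_logp = surrogate_net.grad   # hypothetical cheap approximation of the
#                                  # true gradient; this is the paper's idea,
#                                  # realised there by score matching
# q_new, p_new = leapfrog(q, np.random.standard_normal(q.shape), grad_logp)
```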
Randomized Independent Component Analysis
Title | Randomized Independent Component Analysis |
Authors | Matan Sela, Ron Kimmel |
Abstract | Independent component analysis (ICA) is a method for recovering statistically independent signals from observations of unknown linear combinations of the sources. Some of the most accurate ICA decomposition methods require searching for the inverse transformation that minimizes different approximations of the mutual information, a measure of statistical independence of random vectors. Two such approximations are the Kernel Generalized Variance and the Kernel Canonical Correlation, which have been shown to reach the highest performance among ICA methods. However, the computational effort necessary just for computing these measures is cubic in the sample size. Hence, optimizing them becomes even more computationally demanding, in terms of both space and time. Here, we propose two novel alternative measures based on randomized features of the samples: the Randomized Generalized Variance and the Randomized Canonical Correlation. The computational complexity of calculating the proposed alternatives is linear in the sample size, and they provide a controllable approximation of their kernel-based non-random versions. We also show that optimizing the proposed statistical properties yields a comparable separation error an order of magnitude faster than with the kernel-based measures. |
Tasks | |
Published | 2016-09-22 |
URL | http://arxiv.org/abs/1609.06942v1 |
http://arxiv.org/pdf/1609.06942v1.pdf | |
PWC | https://paperswithcode.com/paper/randomized-independent-component-analysis |
Repo | |
Framework | |
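The randomized-feature idea behind the proposed measures can be illustrated with standard random Fourier features, which make kernel-style statistics computable in time linear in the sample size. The sketch below is the generic construction, not the paper's exact feature map.

```python
import numpy as np

def random_fourier_features(X, n_feat=200, gamma=1.0, rng=np.random.default_rng()):
    """Features z(x) with z(x).z(y) ~ exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_feat))
    b = rng.uniform(0, 2 * np.pi, size=n_feat)
    return np.sqrt(2.0 / n_feat) * np.cos(X @ W + b)

# Covariance statistics (e.g. canonical correlations) of the featurized
# signals then approximate their kernel counterparts at linear cost in
# the sample size, instead of the cubic cost of the kernel versions.
```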
Unbiased Comparative Evaluation of Ranking Functions
Title | Unbiased Comparative Evaluation of Ranking Functions |
Authors | Tobias Schnabel, Adith Swaminathan, Peter Frazier, Thorsten Joachims |
Abstract | Eliciting relevance judgments for ranking evaluation is labor-intensive and costly, motivating careful selection of which documents to judge. Unlike traditional approaches that make this selection deterministically, probabilistic sampling has shown intriguing promise since it enables the design of estimators that are provably unbiased even when reusing data with missing judgments. In this paper, we first unify and extend these sampling approaches by viewing the evaluation problem as a Monte Carlo estimation task that applies to a large number of common IR metrics. Drawing on the theoretical clarity that this view offers, we tackle three practical evaluation scenarios: comparing two systems, comparing $k$ systems against a baseline, and ranking $k$ systems. For each scenario, we derive an estimator and a variance-optimizing sampling distribution while retaining the strengths of sampling-based evaluation, including unbiasedness, reusability despite missing data, and ease of use in practice. In addition to the theoretical contribution, we empirically evaluate our methods against previously used sampling heuristics and find that they generally cut the number of required relevance judgments at least in half. |
Tasks | |
Published | 2016-04-25 |
URL | http://arxiv.org/abs/1604.07209v1 |
http://arxiv.org/pdf/1604.07209v1.pdf | |
PWC | https://paperswithcode.com/paper/unbiased-comparative-evaluation-of-ranking |
Repo | |
Framework | |
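The core unbiasedness mechanism, judging only a sampled subset of documents and reweighting by inclusion probabilities, can be shown in a few lines. The sketch below is a generic Horvitz-Thompson-style estimator for an additive IR metric, assuming known per-document gains; the paper's estimators and variance-optimal sampling distributions are richer.

```python
import numpy as np

def sampled_metric(gains, probs, judged):
    """Unbiased estimate of sum(gains) from a judged subset.

    gains  : per-document metric contribution if judged relevant
    probs  : probability each document was selected for judging
    judged : boolean mask of which documents were actually judged
    """
    return np.sum(gains[judged] / probs[judged])

# In expectation over the sampling, this equals gains.sum(): the estimate
# stays unbiased even though most relevance judgments are missing.
```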
Low-rank and Adaptive Sparse Signal (LASSI) Models for Highly Accelerated Dynamic Imaging
Title | Low-rank and Adaptive Sparse Signal (LASSI) Models for Highly Accelerated Dynamic Imaging |
Authors | Saiprasad Ravishankar, Brian E. Moore, Raj Rao Nadakuditi, Jeffrey A. Fessler |
Abstract | Sparsity-based approaches have been popular in many applications in image processing and imaging. Compressed sensing exploits the sparsity of images in a transform domain or dictionary to improve image recovery from undersampled measurements. In the context of inverse problems in dynamic imaging, recent research has demonstrated the promise of sparsity and low-rank techniques. For example, the patches of the underlying data may be modeled as sparse in an adaptive dictionary domain, with the resulting joint image and dictionary estimation from undersampled measurements called dictionary-blind compressed sensing; alternatively, the dynamic image sequence may be modeled as a sum of low-rank and sparse (in some transform domain) components (the L+S model) estimated from limited measurements. In this work, we investigate a data-adaptive extension of the L+S model, dubbed LASSI, in which the temporal image sequence is decomposed into a low-rank component and a component whose spatiotemporal (3D) patches are sparse in some adaptive dictionary domain. We investigate various formulations and efficient methods for jointly estimating the underlying dynamic signal components and the spatiotemporal dictionary from limited measurements. We also obtain efficient sparsity-penalized dictionary-blind compressed sensing methods as special cases of our LASSI approaches. Our numerical experiments demonstrate the promising performance of LASSI schemes for dynamic magnetic resonance image reconstruction from limited k-t space data, compared to recent methods such as k-t SLR and L+S and to the proposed dictionary-blind compressed sensing method. |
Tasks | Image Reconstruction |
Published | 2016-11-13 |
URL | http://arxiv.org/abs/1611.04069v2 |
http://arxiv.org/pdf/1611.04069v2.pdf | |
PWC | https://paperswithcode.com/paper/low-rank-and-adaptive-sparse-signal-lassi |
Repo | |
Framework | |
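As context for LASSI, the non-adaptive L+S baseline it extends can be sketched as an alternating singular-value / soft-thresholding iteration. The sketch below assumes fully sampled data, an identity sparsifying transform, and illustrative regularization weights; LASSI itself additionally learns a dictionary for spatiotemporal patches and handles undersampled k-t space measurements.

```python
import numpy as np

def l_plus_s(M, lam_L=1.0, lam_S=0.1, n_iter=50):
    """Split a dynamic sequence (frames as columns of M) into L + S."""
    L = np.zeros_like(M); S = np.zeros_like(M)
    for _ in range(n_iter):
        # low-rank update: singular-value thresholding of the residual
        U, sig, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = U @ np.diag(np.maximum(sig - lam_L, 0)) @ Vt
        # sparse update: entrywise soft thresholding of what remains
        R = M - L
        S = np.sign(R) * np.maximum(np.abs(R) - lam_S, 0)
    return L, S
```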