Bayesian Non-Negative Matrix Factorization (NMF) is a method of interest across fields including genomics, neuroscience, and audio and image processing. Bayesian Poisson NMF is of particular importance for counts data, for example in cancer mutational signatures analysis. However, MCMC methods for Bayesian Poisson NMF require a computationally intensive augmentation. Further, identifying latent rank is necessary, but commonly used heuristic approaches are slow and potentially subjective, and methods that learn rank automatically are unable to provide posterior uncertainties. In this paper, we introduce bayesNMF, a computationally efficient Gibbs sampler for Bayesian Poisson NMF. The desired Poisson-likelihood NMF is paired with a Normal-likelihood NMF used for high overlap proposal distributions in approximate Metropolis steps, avoiding augmentation. We additionally define Bayesian factor inclusion (BFI) and sparse Bayesian factor inclusion (SBFI) as methods to identify rank automatically while preserving posterior uncertainty quantification on the learned matrices. We provide an open-source R software package with all models and plotting capabilities demonstrated in this paper on GitHub at jennalandy/bayesNMF. While our applications focus on mutational signatures, our software and results can be extended to any use of Bayesian Poisson NMF.
翻译:暂无翻译