Spatial transcriptomics allows researchers to visualize and analyze gene expression at its precise location within tissues or cells. It provides spatially resolved gene expression data but often lacks cellular resolution, so cell type deconvolution is needed to infer the cellular composition at each spatial location. In this paper we propose BASIN for cell type deconvolution, which models deconvolution as a nonnegative matrix factorization (NMF) problem incorporating a graph Laplacian prior. Rather than finding a single deterministic optimum like other recent methods, we propose a matrix-variate Bayesian NMF method with nonnegativity and sparsity priors, in which the variables are kept in matrix form to derive a more efficient matrix normal posterior. BASIN employs a Gibbs sampler to approximate the posterior distribution of cell type proportions and other parameters; the resulting distribution of possible solutions enhances robustness and provides inherent uncertainty quantification. We evaluate BASIN on different spatial transcriptomics datasets, where it outperforms other deconvolution methods in terms of accuracy and efficiency. The results also show the effect of the incorporated priors and reflect a truncated matrix normal distribution, as expected.
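To make the sampling idea concrete, the following is a minimal sketch of a Gibbs sampler for a graph-regularized nonnegative factorization. It is not the BASIN algorithm: it updates one entry at a time instead of using a matrix-variate posterior, and it rests on several assumptions that do not come from the abstract, namely a fixed reference signature matrix `B`, Gaussian noise with known variance `sigma2`, an exponential sparsity prior with rate `alpha`, and a Laplacian smoothness weight `lam`. Under those assumptions the target is p(V | X) ∝ exp(-‖X - BV‖² / (2σ²) - (λ/2) tr(V L Vᵀ) - α Σ V) restricted to V ≥ 0, so each entry has a normal conditional truncated at zero. The function names and all hyperparameter values are hypothetical.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for cdf / inv_cdf


def sample_truncated_normal(mean, sd, rng):
    """Draw from N(mean, sd^2) restricted to [0, inf) via inverse-CDF sampling."""
    below_zero = _STD_NORMAL.cdf(-mean / sd)   # mass of the untruncated normal below 0
    u = below_zero + (1.0 - below_zero) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)        # inv_cdf requires 0 < u < 1
    return max(0.0, mean + sd * _STD_NORMAL.inv_cdf(u))


def gibbs_proportions(X, B, L, n_iter=300, burn_in=100,
                      sigma2=0.05, lam=1.0, alpha=0.1, seed=0):
    """Posterior-mean estimate of V (cell types x spots) in X ~ B V.

    X: genes x spots, B: genes x cell types (held fixed, an assumption of
    this sketch), L: spots x spots graph Laplacian.
    """
    rng = random.Random(seed)
    n_genes, n_spots, n_types = len(X), len(X[0]), len(B[0])
    V = [[1.0 / n_types] * n_spots for _ in range(n_types)]
    mean_V = [[0.0] * n_spots for _ in range(n_types)]
    col_sq = [sum(B[g][k] ** 2 for g in range(n_genes)) for k in range(n_types)]
    n_kept = n_iter - burn_in
    for it in range(n_iter):
        for k in range(n_types):
            for j in range(n_spots):
                # Data term: B[:, k] dotted with the residual that excludes type k.
                fit = 0.0
                for g in range(n_genes):
                    others = sum(B[g][m] * V[m][j] for m in range(n_types) if m != k)
                    fit += B[g][k] * (X[g][j] - others)
                # Graph term: off-diagonal Laplacian entries couple neighbouring spots.
                neigh = sum(L[i][j] * V[k][i] for i in range(n_spots) if i != j)
                precision = col_sq[k] / sigma2 + lam * L[j][j]
                linear = fit / sigma2 - lam * neigh - alpha
                V[k][j] = sample_truncated_normal(
                    linear / precision, precision ** -0.5, rng)
        if it >= burn_in:
            # Accumulate the running posterior mean over the kept samples.
            for k in range(n_types):
                for j in range(n_spots):
                    mean_V[k][j] += V[k][j] / n_kept
    return mean_V


# Toy example (made-up data): 4 genes, 2 cell types, 3 spots on a chain graph.
B = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
V_true = [[0.8, 0.5, 0.2], [0.2, 0.5, 0.8]]
X = [[sum(B[g][k] * V_true[k][j] for k in range(2)) for j in range(3)]
     for g in range(4)]
L = [[1.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]]
V_hat = gibbs_proportions(X, B, L)
```

The sketch returns only the posterior mean, and its columns are not constrained to sum to one; dividing each column by its sum would give proportions. Keeping the individual samples instead of averaging them would give per-entry credible intervals, which is the kind of uncertainty quantification the abstract refers to.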