A gene regulatory network (GRN) is the complex network formed by regulatory interactions between genes in living cells. In this paper, we consider inferring GRNs in single cells from single-cell RNA sequencing (scRNA-seq) data. In scRNA-seq, single cells are often profiled from mixed populations, and their cell identities are unknown. A common practice in single-cell GRN analysis is to first cluster the cells and then infer a GRN for each cluster separately. However, this two-step procedure ignores uncertainty in the clustering step and can therefore lead to inaccurate network estimates. To address this problem, we propose to model scRNA-seq data with the mixture multivariate Poisson log-normal (MPLN) distribution. The precision matrices of the MPLN represent the GRNs of the different cell types and can be jointly estimated by maximizing the lasso-penalized log-likelihood of the MPLN. We show that the MPLN model is identifiable and that the resulting penalized log-likelihood estimator is consistent. To avoid the intractable optimization of the MPLN log-likelihood, we develop an algorithm called VMPLN based on variational inference. Comprehensive simulations and analyses of real scRNA-seq data show that VMPLN performs better than state-of-the-art single-cell GRN methods.
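To make the generative structure concrete, the following is a minimal sketch of sampling counts from a mixture multivariate Poisson log-normal model. It is not the VMPLN algorithm and performs no inference. It assumes the usual Poisson log-normal form: each cell draws a cell-type label from the mixture weights, then a latent Gaussian vector whose covariance is the inverse of that type's precision matrix, then Poisson counts with rate `lib_size * exp(latent)`. The function names `simulate_mpln` and `lasso_penalty`, the scalar `lib_size`, and the choice to penalize only off-diagonal precision entries are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def simulate_mpln(n_cells, weights, means, precisions, lib_size=1.0, seed=0):
    """Sample counts from a mixture multivariate Poisson log-normal model.

    Hypothetical helper: the parameterization (scalar lib_size, one mean
    vector and one precision matrix per cell type) is an assumption.
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    n_genes = np.asarray(means[0]).shape[0]

    # Latent cell-type label for every cell.
    labels = rng.choice(len(weights), size=n_cells, p=weights)

    latent = np.empty((n_cells, n_genes))
    for g, (mu, theta) in enumerate(zip(means, precisions)):
        idx = np.flatnonzero(labels == g)
        if idx.size == 0:
            continue
        # Covariance is the inverse of the type-specific precision matrix.
        cov = np.linalg.inv(np.asarray(theta, dtype=float))
        cov = (cov + cov.T) / 2.0  # remove tiny numerical asymmetry
        latent[idx] = rng.multivariate_normal(np.asarray(mu), cov, size=idx.size)

    # Observed counts are Poisson given the latent log-expression.
    counts = rng.poisson(lib_size * np.exp(latent))
    return counts, labels


def lasso_penalty(precisions, lam):
    """L1 penalty on off-diagonal precision entries, summed over cell types."""
    total = 0.0
    for theta in precisions:
        theta = np.asarray(theta, dtype=float)
        total += np.abs(theta).sum() - np.abs(np.diag(theta)).sum()
    return lam * total


# Two cell types, three genes; each type has one nonzero regulatory edge.
theta_a = np.array([[2.0, -0.8, 0.0], [-0.8, 2.0, 0.0], [0.0, 0.0, 2.0]])
theta_b = np.array([[2.0, 0.0, 0.0], [0.0, 2.0, 0.8], [0.0, 0.8, 2.0]])
means = [np.zeros(3), np.ones(3)]

counts, labels = simulate_mpln(200, [0.5, 0.5], means, [theta_a, theta_b])
penalty = lasso_penalty([theta_a, theta_b], lam=0.1)
```

In this sketch the nonzero off-diagonal entries of `theta_a` and `theta_b` play the role of cell-type-specific GRN edges. Clustering the simulated `counts` first and then fitting one network per cluster would treat the estimated labels as fixed, which is the source of uncertainty that the joint approach in the abstract is meant to account for.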