The problem of estimating the structure of a graph from observed data is of growing interest in the context of high-throughput genomic data, and single-cell RNA sequencing in particular. These, however, are challenging applications, since the data consist of high-dimensional counts with high variance and over-abundance of zeros. Here, we present a general framework for learning the structure of a graph from single-cell RNA-seq data, based on the zero-inflated negative binomial distribution. We demonstrate with simulations that our approach is able to retrieve the structure of a graph in a variety of settings and we show the utility of the approach on real data.
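To make the count model concrete, here is a minimal sketch of a zero-inflated negative binomial (ZINB) probability mass function. The abstract does not state which parameterization the framework uses, so the mean/dispersion form below, along with the names `nb_pmf`, `zinb_pmf`, `mu`, `theta`, and `pi`, is an assumption chosen for illustration. It shows how a point mass at zero is mixed with an ordinary negative binomial to capture the excess zeros typical of single-cell RNA-seq counts.

```python
import math


def nb_pmf(y, mu, theta):
    """Negative binomial pmf with mean mu and dispersion theta.

    Assumed parameterization: variance = mu + mu**2 / theta.
    """
    log_p = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, theta, pi):
    """ZINB pmf: with probability pi the count is a structural zero,
    otherwise it is drawn from the negative binomial above."""
    point_mass = pi if y == 0 else 0.0
    return point_mass + (1.0 - pi) * nb_pmf(y, mu, theta)
```

Under this parameterization the probability of a zero is always at least as large as under the plain negative binomial, and the mean shrinks from `mu` to `(1 - pi) * mu`.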