In microbiome studies, a sample drawn from a microbial community, such as the gut microbiota, is used to estimate the population proportions of its taxa. However, biases introduced during sampling and preprocessing mean that observed taxa abundances may not reflect the true abundance patterns in the ecosystem. Repeated measures, including longitudinal study designs, can help mitigate the discrepancy between observed and true underlying abundances. Yet the zero-inflation and over-dispersion widely observed in such data can distort downstream statistical analyses that associate taxa abundances with covariates of interest. To address these challenges, we propose a Zero-Inflated Poisson Gamma (ZIPG) framework. Taking a measurement-error perspective, we accommodate the discrepancy between observed and true abundances by decomposing the mean parameter of a Poisson regression into a true abundance level and a multiplicative term that measures sampling variability from the microbial ecosystem. We then allow flexible modeling by linking both the mean abundance and the variability to possibly different covariates, and we build valid statistical inference procedures for parameter estimation and hypothesis testing. In comprehensive simulation studies and real data applications, the proposed ZIPG method provides insight into distinguishing differential variability from differential abundance.
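The decomposition described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes log links for both the abundance level and the variability, a Gamma-distributed multiplicative factor with mean 1, and a single constant zero-inflation probability. The function name `simulate_zipg` and all of its parameter names are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_zipg(X, beta, X_disp, beta_disp, zero_prob, rng):
    """Draw counts from a zero-inflated Poisson-Gamma mixture (illustrative only).

    Assumptions not stated in the abstract: log links, a mean-1 Gamma factor,
    and one shared zero-inflation probability.
    """
    lam = np.exp(X @ beta)                # assumed true abundance level per sample
    theta = np.exp(X_disp @ beta_disp)    # assumed variance of the multiplicative factor
    # Gamma with shape 1/theta and scale theta has mean 1 and variance theta,
    # so it perturbs the Poisson mean without shifting it on average.
    u = rng.gamma(shape=1.0 / theta, scale=theta)
    counts = rng.poisson(lam * u)
    # Structural zeros: with probability zero_prob the count is forced to 0.
    structural_zero = rng.random(lam.shape[0]) < zero_prob
    counts[structural_zero] = 0
    return counts


# Usage: two groups that share the same mean abundance but differ in variability.
rng = np.random.default_rng(2024)
n = 2000
group = rng.integers(0, 2, size=n)
design = np.column_stack([np.ones(n), group])
counts = simulate_zipg(
    X=design, beta=np.array([np.log(20.0), 0.0]),        # no abundance difference
    X_disp=design, beta_disp=np.array([np.log(0.2), 1.0]),  # group 1 is more variable
    zero_prob=0.2, rng=rng,
)
print("group means:    ", counts[group == 0].mean(), counts[group == 1].mean())
print("group variances:", counts[group == 0].var(), counts[group == 1].var())
```

Under these assumptions, the two groups have the same expected count while the second group has a larger variance, which is the kind of differential variability that a mean-only model would not separate from differential abundance.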