Understanding factors that contribute to the increased likelihood of disease transmission between two individuals is important for infection control. Measures of genetic relatedness of bacterial isolates between two individuals are often analyzed to determine their associations with these factors using simple correlation or regression analyses. However, these standard approaches ignore the potential for correlation in paired data of this type, arising from the presence of the same individual across multiple paired outcomes. We develop two novel hierarchical Bayesian methods for properly analyzing paired genetic relatedness data in the form of patristic distances and transmission probabilities. Using individual-level spatially correlated random effect parameters, we account for multiple sources of correlation in the outcomes as well as other important features of their distribution. Through simulation, we show that the standard analyses drastically underestimate uncertainty in the associations when correlation is present in the data, leading to incorrect conclusions regarding the covariates of interest. Conversely, the newly developed methods perform well under various levels of correlated and uncorrelated data. All methods are applied to Mycobacterium tuberculosis data from the Republic of Moldova where we identify factors associated with disease transmission and, through analysis of the random effect parameters, key individuals and areas with increased transmission activity. Model comparisons show the importance of the new methodology in this setting. The methods are implemented in the R package GenePair.
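To illustrate why paired outcomes that share an individual cannot be treated as independent, the following is a minimal sketch, not the model from the paper or the GenePair API. It assumes a deliberately simplified outcome, Y_ij = theta_i + theta_j + e_ij, with independent (not spatially correlated) individual-level random effects theta_i of standard deviation `tau` and pair-level noise of standard deviation `sigma`. All function and parameter names are hypothetical. Under these assumptions, two pairs sharing one individual have correlation tau^2 / (2 tau^2 + sigma^2), while disjoint pairs are uncorrelated.

```python
import math
import random


def simulate_pair_outcomes(n_reps=20000, tau=1.0, sigma=1.0, seed=0):
    """Simulate outcomes for pairs (1,2), (1,3), and (3,4) across replicates.

    Assumed toy model (not the paper's): Y_ij = theta_i + theta_j + e_ij,
    with theta_i ~ N(0, tau^2) and e_ij ~ N(0, sigma^2), all independent.
    """
    rng = random.Random(seed)
    y12, y13, y34 = [], [], []
    for _ in range(n_reps):
        # One random effect per individual; indices 0..3 are individuals 1..4.
        theta = [rng.gauss(0.0, tau) for _ in range(4)]
        y12.append(theta[0] + theta[1] + rng.gauss(0.0, sigma))
        y13.append(theta[0] + theta[2] + rng.gauss(0.0, sigma))
        y34.append(theta[2] + theta[3] + rng.gauss(0.0, sigma))
    return y12, y13, y34


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def theoretical_shared_correlation(tau, sigma):
    """Correlation between two pairs that share exactly one individual."""
    return tau ** 2 / (2 * tau ** 2 + sigma ** 2)


y12, y13, y34 = simulate_pair_outcomes()
shared = pearson(y12, y13)    # pairs (1,2) and (1,3) share individual 1
disjoint = pearson(y12, y34)  # pairs (1,2) and (3,4) share no individual
print(f"shared-individual correlation: {shared:.3f}")
print(f"disjoint-pair correlation:     {disjoint:.3f}")
print(f"theoretical shared value:      {theoretical_shared_correlation(1.0, 1.0):.3f}")
```

A regression that treats all pairs as independent ignores the nonzero shared-individual correlation, which is the mechanism by which standard analyses can understate uncertainty.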