Exposure to diverse non-genetic factors, collectively known as the exposome, is a critical determinant of health outcomes. However, analyzing the exposome poses significant methodological challenges, including high collinearity among exposures, the longitudinal nature of repeated measurements, and potentially complex interactions with individual characteristics. In this paper, we address these challenges by proposing a novel statistical framework that extends Bayesian profile regression. Our method integrates profile regression, which handles collinearity by clustering exposures into latent profiles, into a linear mixed model (LMM), a framework for longitudinal data analysis. This profile-LMM approach accounts for within-person variability over time while also incorporating interactions between the latent exposure clusters and individual characteristics. We validate our method on simulated data, demonstrating that it accurately estimates model parameters and recovers the true latent exposure cluster structure. Finally, we apply the approach to a large longitudinal data set from the Lifelines cohort to identify combinations of exposures that are significantly associated with diastolic blood pressure.
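As a rough illustration of the kind of data the profile-LMM approach targets, the sketch below simulates repeated outcome measurements whose mean depends on a latent exposure cluster, a cluster-by-covariate interaction, and a subject-level random intercept. The function name `simulate_profile_lmm`, all parameter values, the single covariate `age`, and the random-intercept-only structure are assumptions made for illustration; they are not taken from the paper, and the sketch only generates data rather than performing the Bayesian inference.

```python
import numpy as np


def simulate_profile_lmm(n_subjects=200, n_visits=4, n_exposures=6, seed=0):
    """Simulate longitudinal data with latent exposure clusters (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Hypothetical effect sizes, one entry per latent cluster.
    theta = np.array([-1.0, 0.0, 1.5])   # cluster-specific intercepts
    gamma = np.array([0.0, 0.8, -0.6])   # cluster-by-age interaction slopes
    beta_time = 0.5                      # shared fixed effect of visit number
    n_clusters = theta.size

    # Latent exposure profile of each subject; profile regression would infer this.
    cluster = rng.integers(0, n_clusters, size=n_subjects)

    # Exposures share a cluster-specific mean vector, which induces collinearity.
    cluster_means = rng.normal(0.0, 2.0, size=(n_clusters, n_exposures))
    exposures = cluster_means[cluster] + rng.normal(
        0.0, 1.0, size=(n_subjects, n_exposures)
    )

    # Time-invariant individual characteristic (assumed standardized).
    age = rng.normal(0.0, 1.0, size=n_subjects)

    # Subject-level random intercept captures within-person correlation over time.
    random_intercept = rng.normal(0.0, 1.0, size=n_subjects)

    # Visit index for every subject, shape (n_subjects, n_visits).
    time = np.tile(np.arange(n_visits), (n_subjects, 1))

    # Linear predictor: cluster effect + cluster-by-age interaction
    # + random intercept + time trend, then residual noise.
    subject_level = theta[cluster] + gamma[cluster] * age + random_intercept
    mean = subject_level[:, None] + beta_time * time
    y = mean + rng.normal(0.0, 0.5, size=mean.shape)

    return {
        "y": y,
        "exposures": exposures,
        "cluster": cluster,
        "age": age,
        "time": time,
    }


if __name__ == "__main__":
    data = simulate_profile_lmm()
    print(data["y"].shape, data["exposures"].shape)
```

In a simulation study of this kind, the returned `cluster` labels would serve as the ground truth against which recovered cluster assignments are compared.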