Understanding of the pathophysiology of obstructive lung disease (OLD) is limited by the methods available to examine the relationship between multi-omic molecular phenomena and clinical outcomes. Integrative factorization methods for multi-omic data can reveal latent patterns of variation describing important biological signal. However, most methods do not provide a framework for inference on the estimated factorization, do not simultaneously predict important disease phenotypes or clinical outcomes, and do not accommodate multiple imputation. To address these gaps, we propose Bayesian Simultaneous Factorization (BSF). We use conjugate normal priors and show that the posterior mode of this model can be estimated by solving a structured nuclear norm-penalized objective that also achieves rank selection and motivates the choice of hyperparameters. We then extend BSF to simultaneously predict a continuous or binary response, termed Bayesian Simultaneous Factorization and Prediction (BSFP). BSF and BSFP accommodate concurrent imputation and full posterior inference for missing data, including "blockwise" missingness, and BSFP offers prediction of unobserved outcomes. We show via simulation that BSFP is competitive in recovering latent variation structure, and we demonstrate the importance of propagating uncertainty from the estimated factorization to prediction. We also study the imputation performance of BSF via simulation under missing-at-random and missing-not-at-random assumptions. Lastly, we use BSFP to predict lung function based on the bronchoalveolar lavage metabolome and proteome from a study of HIV-associated OLD. Our analysis reveals a distinct cluster of patients with OLD driven by shared metabolomic and proteomic expression patterns, as well as multi-omic patterns related to lung function decline. Software is freely available at https://github.com/sarahsamorodnitsky/BSFP.
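To illustrate why a nuclear norm penalty also performs rank selection, the sketch below solves the simplest single-matrix version of such an objective, min over S of 0.5*||X - S||_F^2 + lam*||S||_*, whose minimizer is obtained by soft-thresholding the singular values of X. This is not the structured multi-source BSF objective and does not use the BSFP software; the function name `svd_soft_threshold`, the simulated data, and the threshold lam = sqrt(n) + sqrt(p) (a common random-matrix heuristic for unit-variance noise) are assumptions made only for illustration.

```python
import numpy as np

def svd_soft_threshold(X, lam):
    """Minimize 0.5*||X - S||_F^2 + lam*||S||_* over S.

    The solution shrinks each singular value of X by lam and sets
    those below lam to zero, so the rank of S is chosen automatically.
    (Hypothetical helper for illustration; not part of BSFP.)
    """
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    d_shrunk = np.maximum(d - lam, 0.0)       # soft-threshold singular values
    rank = int(np.sum(d_shrunk > 0))          # number of surviving components
    S = (U[:, :rank] * d_shrunk[:rank]) @ Vt[:rank, :]
    return S, rank

# Simulated data (assumed for the example): low-rank signal plus unit-variance noise.
rng = np.random.default_rng(0)
n, p, true_rank = 50, 40, 3
signal = rng.normal(size=(n, true_rank)) @ rng.normal(size=(true_rank, p))
X = signal + rng.normal(size=(n, p))

# Assumed heuristic threshold: roughly the largest singular value of pure noise.
lam = np.sqrt(n) + np.sqrt(p)
S_hat, est_rank = svd_soft_threshold(X, lam)
print("estimated rank:", est_rank)
```

Components whose singular values fall below the threshold are removed entirely, which is the sense in which the penalty selects the rank; the surviving components are shrunk toward zero by the same amount.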