Rapid advances in the collection and dissemination of multi-platform molecular and genomics data have created enormous opportunities to aggregate such data in order to understand, prevent, and treat human diseases. While multi-omic data integration methods have improved significantly at discovering biological markers and mechanisms underlying both prognosis and treatment, the precise cellular functions governing these complex mechanisms still need detailed, data-driven de novo evaluation. We propose a framework called Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data (fiBAG), which allows simultaneous identification of upstream functional evidence of proteogenomic biomarkers and the incorporation of such knowledge in Bayesian variable selection models to improve signal detection. fiBAG employs a conflation of Gaussian process models to quantify (possibly non-linear) functional evidence via Bayes factors, which are then mapped to a novel calibrated spike-and-slab prior, thus guiding selection and providing functional relevance to the associations with patient outcomes. Using simulations, we illustrate how integrative methods with functional calibration have higher power to detect disease-related markers than non-integrative approaches. We demonstrate the utility of fiBAG via a pan-cancer analysis of 14 cancer types to identify and assess the cellular mechanisms of proteogenomic markers associated with cancer stemness and patient survival.
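To make the idea of turning functional evidence into prior information concrete, the following is a minimal sketch and not the fiBAG implementation. It assumes a single upstream covariate, a Gaussian process with a fixed RBF kernel compared against a noise-only null model, and a simple logistic mapping from the log Bayes factor to a prior inclusion probability for a spike-and-slab prior. All function names, hyperparameter values, and the mapping itself are hypothetical choices made for illustration.

```python
import numpy as np


def gp_log_marginal(y, K):
    """Log density of y under a zero-mean Gaussian with covariance K."""
    n = len(y)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)


def rbf_kernel(x, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel matrix for a 1-D input vector x."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)


def log_bayes_factor(x, y, noise_var=1.0):
    """Log Bayes factor of 'y depends on x through a GP' vs 'y is pure noise'.

    Assumption: kernel and noise hyperparameters are fixed, not estimated.
    """
    n = len(y)
    K_null = noise_var * np.eye(n)
    K_alt = rbf_kernel(x) + K_null
    return gp_log_marginal(y, K_alt) - gp_log_marginal(y, K_null)


def calibrated_inclusion_prob(log_bf, base_prob=0.1):
    """Map a log Bayes factor to a prior inclusion probability.

    Assumption: the log Bayes factor shifts the prior log-odds additively,
    so log_bf = 0 returns base_prob. This is an illustrative calibration,
    not the one defined by the authors.
    """
    logit = np.log(base_prob / (1.0 - base_prob)) + log_bf
    logit = np.clip(logit, -50.0, 50.0)  # avoid overflow in exp
    return 1.0 / (1.0 + np.exp(-logit))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(-3.0, 3.0, 60)
    y_functional = 2.0 * np.sin(x) + 0.3 * rng.standard_normal(60)
    y_unrelated = rng.standard_normal(60)
    for name, y in [("functional", y_functional), ("unrelated", y_unrelated)]:
        lbf = log_bayes_factor(x, y)
        print(name, lbf, calibrated_inclusion_prob(lbf))
```

In this sketch, a marker whose downstream measurement varies smoothly with the upstream covariate receives a larger log Bayes factor, and therefore a larger prior inclusion probability, than one whose measurement is unrelated noise. That inclusion probability would then serve as the mixing weight of the spike-and-slab prior in the outcome model.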