Integrating multiple data sources is widely regarded as a reliable strategy because it can reveal functional features of genomic expression that may remain dormant in a single-source analysis. Moreover, several studies have shown that analyses of multi-platform data are more powerful. To this end, we consider the omics profiles of circadian genes, specifically copy number changes and RNA sequencing data, together with the survival outcomes of the subjects. We develop a Bayesian structural equation model that couples linear regressions with a log-normal accelerated failure time regression to integrate information across these two platforms and predict subject survival. We place conjugate priors on the regression parameters and derive a Gibbs sampler from their full conditional distributions. Our extensive simulation study shows that the integrative model fits the data better than its closest competitor. Analyses of glioblastoma and breast cancer data from The Cancer Genome Atlas (TCGA), the largest genomics and transcriptomics database, support our findings. The method is implemented in the R package semmcmc, available on CRAN.
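To illustrate the kind of computation described above, the following is a minimal sketch of a Gibbs sampler for a two-stage chain (copy number to expression, expression to log survival time) under conjugate priors. It is not the semmcmc implementation. It assumes a single gene, centered variables with no intercept, no censoring, and illustrative priors (normal on each slope, inverse-gamma on each error variance). The function names, prior settings, and simulated effect sizes are all hypothetical. Because every variable is observed here and the priors are independent, each regression can be sampled separately.

```python
import math
import random


def gibbs_simple_regression(y, x, n_iter=2000, burn_in=500,
                            prior_var=100.0, a0=2.0, b0=1.0, seed=0):
    """Gibbs sampler for y_i = beta * x_i + eps_i, eps_i ~ N(0, sigma2).

    Assumed priors: beta ~ N(0, prior_var), sigma2 ~ Inverse-Gamma(a0, b0).
    Returns posterior draws of beta and sigma2 after burn-in.
    """
    rng = random.Random(seed)
    n = len(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    beta, sigma2 = 0.0, 1.0
    beta_draws, sigma2_draws = [], []
    for it in range(n_iter):
        # beta | sigma2, data is normal (conjugate update).
        post_var = 1.0 / (sxx / sigma2 + 1.0 / prior_var)
        post_mean = post_var * sxy / sigma2
        beta = rng.gauss(post_mean, math.sqrt(post_var))
        # sigma2 | beta, data is Inverse-Gamma(a0 + n/2, b0 + SSR/2).
        ssr = sum((yi - beta * xi) ** 2 for xi, yi in zip(x, y))
        shape = a0 + n / 2.0
        rate = b0 + ssr / 2.0
        # random.gammavariate takes (shape, scale); invert a Gamma draw.
        sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        if it >= burn_in:
            beta_draws.append(beta)
            sigma2_draws.append(sigma2)
    return beta_draws, sigma2_draws


def simulate(n=200, seed=1):
    """Simulate toy data: copy number -> expression -> log survival time."""
    rng = random.Random(seed)
    cnv = [rng.gauss(0.0, 1.0) for _ in range(n)]
    expr = [0.8 * c + rng.gauss(0.0, 0.5) for c in cnv]
    log_time = [-0.5 * e + rng.gauss(0.0, 0.3) for e in expr]
    return cnv, expr, log_time


cnv, expr, log_time = simulate()
# Stage 1: linear regression of expression on copy number.
alpha_draws, _ = gibbs_simple_regression(expr, cnv, seed=10)
# Stage 2: log-normal AFT, i.e. a normal regression on log survival time.
gamma_draws, _ = gibbs_simple_regression(log_time, expr, seed=11)

alpha_hat = sum(alpha_draws) / len(alpha_draws)
gamma_hat = sum(gamma_draws) / len(gamma_draws)
print("posterior mean, copy number -> expression:", round(alpha_hat, 3))
print("posterior mean, expression -> log survival:", round(gamma_hat, 3))
```

With censored survival times, a sampler of this kind would typically add a data-augmentation step that imputes each censored log time from a truncated normal distribution. That step is omitted here to keep the sketch short.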