A major goal in genomics is to capture the complex dynamical behavior of gene regulatory networks (GRNs). This includes inferring the interactions between genes, which supports a wide range of genomic analyses, including the diagnosis and prognosis of diseases and the search for effective treatments for chronic diseases such as cancer. Boolean networks have emerged as a successful class of models for capturing the behavior of GRNs. In most practical settings, GRNs must be inferred from limited and temporally sparse genomics data. The large number of genes in a GRN leads to a large space of candidate topologies, which often cannot be searched exhaustively because of limited computational resources. This paper develops a scalable and efficient topology inference method for GRNs using Bayesian optimization and kernel-based methods. Rather than searching exhaustively over possible topologies, the proposed method constructs a Gaussian process (GP) with a topology-inspired kernel function to account for correlation in the likelihood function between topologies. Then, using the posterior distribution of the GP model, Bayesian optimization efficiently searches for the topology with the highest likelihood value by balancing exploration and exploitation. The performance of the proposed method is demonstrated through comprehensive numerical experiments using a well-known mammalian cell-cycle network.
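The search strategy described above can be illustrated with a minimal sketch. It is not the paper's implementation: the Hamming-distance kernel, the upper-confidence-bound acquisition rule, the toy likelihood, and all names (`hamming_kernel`, `gp_posterior`, `toy_log_likelihood`, `bayes_opt_topology`) are assumptions chosen for illustration, since the abstract does not specify the kernel or the acquisition function. Each candidate topology is encoded as a vector of edge types in {-1, 0, 1} (repression, no edge, activation).

```python
import itertools
import numpy as np

def hamming_kernel(A, B, length_scale=2.0):
    # Assumed "topology-inspired" kernel: similarity decays with the
    # number of edges on which two topologies differ.
    dist = (A[:, None, :] != B[None, :, :]).sum(axis=2)
    return np.exp(-dist / length_scale)

def gp_posterior(X_obs, y_obs, X_cand, noise=1e-6):
    # Standard GP regression with a constant mean set to the sample mean.
    K = hamming_kernel(X_obs, X_obs) + noise * np.eye(len(X_obs))
    K_s = hamming_kernel(X_obs, X_cand)
    mean_y = y_obs.mean()
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs - mean_y))
    mu = mean_y + K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.clip(1.0 - (v ** 2).sum(axis=0), 0.0, None)  # k(x, x) = 1
    return mu, np.sqrt(var)

def toy_log_likelihood(x, x_true):
    # Hypothetical stand-in for the data likelihood of a Boolean network:
    # the score is higher when more edges match a hidden reference topology.
    return -float((x != x_true).sum())

def bayes_opt_topology(candidates, log_lik, n_init=5, n_iter=40, beta=2.0, seed=0):
    rng = np.random.default_rng(seed)
    idx = list(rng.choice(len(candidates), size=n_init, replace=False))
    y = [log_lik(candidates[i]) for i in idx]
    for _ in range(n_iter):
        mu, sd = gp_posterior(candidates[idx], np.array(y), candidates)
        ucb = mu + beta * sd      # exploitation term + exploration term
        ucb[idx] = -np.inf        # never re-evaluate a topology
        nxt = int(np.argmax(ucb))
        idx.append(nxt)
        y.append(log_lik(candidates[nxt]))
    best = int(np.argmax(y))
    return candidates[idx[best]], y[best], len(idx)

# Toy problem: 6 unknown edges, 3 types each, so 3**6 = 729 candidate topologies.
candidates = np.array(list(itertools.product([-1, 0, 1], repeat=6)))
x_true = np.array([1, -1, 0, 1, 0, -1])
best_x, best_y, n_evals = bayes_opt_topology(
    candidates, lambda x: toy_log_likelihood(x, x_true))
print("evaluated", n_evals, "of", len(candidates), "topologies")
print("best topology found:", best_x, "log-likelihood:", best_y)
```

In this sketch the likelihood is evaluated for only 45 of the 729 candidates; the GP posterior decides which topology to score next. In a realistic setting the candidate space would be too large to enumerate even for the acquisition step, so the maximization of the acquisition function would itself need a heuristic search.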