In Bayesian Network Regression models, networks serve as the predictors of continuous responses. These models have been used successfully in brain research to identify regions of the brain associated with specific human traits, yet their potential to elucidate microbial drivers of biological phenotypes in microbiome research remains unknown. In particular, microbial networks are challenging because of their high dimensionality and high sparsity compared to brain networks. Furthermore, unlike in brain connectome research, in microbiome research it is usually expected that the presence of microbes has an effect on the response (main effects), not just the interactions among them. Here, we present the first thorough investigation of whether Bayesian Network Regression models are suitable for microbial datasets, using a variety of synthetic data generated under realistic biological scenarios. We test whether the Bayesian Network Regression model that accounts only for interaction effects (edges in the network) is able to identify the key drivers (microbes) of phenotypic variability. We show that this model is indeed able to identify influential nodes and edges in the microbial networks that drive changes in the phenotype for most biological settings. We also identify scenarios in which the method performs poorly, which allows us to provide practical advice for domain scientists aiming to apply these tools to their datasets. Finally, we implement the model in a publicly available Julia package at https://github.com/solislemuslab/BayesianNetworkRegression.jl.
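To make the interaction-only setting concrete, the following is a minimal sketch of how synthetic data of this kind could be simulated. It is not the paper's simulation design and does not use the BayesianNetworkRegression.jl API. The function name, the parameter values, and the rule that an edge affects the response only when both of its endpoints are influential microbes are all assumptions made for illustration.

```python
import numpy as np


def simulate_network_regression(n_samples=50, n_nodes=8, n_influential=3,
                                edge_prob=0.2, noise_sd=0.1, seed=0):
    """Simulate responses driven only by edge (interaction) effects.

    Illustrative assumptions (not taken from the paper):
    - each sample has a sparse, symmetric, binary adjacency matrix;
    - an edge has a nonzero coefficient only if both endpoints are
      influential nodes (microbes).
    """
    rng = np.random.default_rng(seed)

    # Hypothetical set of influential microbes.
    influential = np.sort(rng.choice(n_nodes, size=n_influential, replace=False))

    # Symmetric edge-coefficient matrix B with a zero diagonal.
    B = np.zeros((n_nodes, n_nodes))
    for i, a in enumerate(influential):
        for b in influential[i + 1:]:
            B[a, b] = B[b, a] = rng.normal(loc=1.0, scale=0.25)

    # Sparse symmetric adjacency matrix for every sample (no self-loops).
    upper = np.triu(rng.random((n_samples, n_nodes, n_nodes)) < edge_prob, k=1)
    upper = upper.astype(float)
    A = upper + upper.transpose(0, 2, 1)

    # Response: sum of coefficients over the edges present, plus Gaussian noise.
    rows, cols = np.triu_indices(n_nodes, k=1)
    signal = (A[:, rows, cols] * B[rows, cols]).sum(axis=1)
    y = signal + rng.normal(0.0, noise_sd, size=n_samples)
    return A, B, y, influential


if __name__ == "__main__":
    A, B, y, influential = simulate_network_regression()
    print("influential nodes:", influential)
    print("adjacency stack shape:", A.shape, "response shape:", y.shape)
```

Under these assumptions, a method that recovers the nonzero entries of `B` identifies the influential edges, and the nodes those edges share are the influential microbes.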