Network estimation and variable selection have been extensively studied in the statistical literature, but only recently have these two challenges been addressed simultaneously. In this paper, we develop a novel method to simultaneously estimate network interactions and associations with relevant covariates for count data, and specifically for compositional data, which have a fixed-sum constraint. We use a hierarchical Bayesian model with latent layers and employ spike-and-slab priors for both edge and covariate selection. For posterior inference, we develop a novel variational inference scheme with an expectation-maximization step to enable efficient estimation. Through simulation studies, we demonstrate that the proposed model outperforms existing methods in accuracy of network recovery. We show the practical utility of our model via an application to microbiome data. The human microbiome has been shown to contribute to many functions of the human body and to be linked with a number of diseases. In our application, we seek to better understand the interactions between microbes and relevant covariates, as well as the interactions among microbes. We provide a Python implementation of our algorithm, called SINC (Simultaneous Inference for Networks and Covariates), available online.