While the study of a single network is well established, technological advances now allow for the collection of multiple networks with relative ease. Increasingly, anywhere from several to thousands of networks can be created from brain imaging, gene co-expression data, or microbiome measurements. These networks, in turn, are being regarded as potentially powerful features for modeling. However, because networks are non-Euclidean in nature, it is not obvious how best to incorporate them into standard modeling tasks. In this paper, we propose a Bayesian modeling framework that provides a unified approach to binary classification, anomaly detection, and survival analysis with network inputs. We encode the networks in the kernel of a Gaussian process prior via their pairwise differences, and we discuss several choices of provably positive definite kernels that can be plugged into our models. Although our methods are widely applicable, we are motivated here in particular by microbiome research (where network analysis is emerging as the standard approach for capturing the interconnectedness of microbial taxa across both time and space) and its potential for reducing preterm delivery and improving personalization of prenatal care.
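As a rough illustration of encoding networks in a Gaussian process kernel via their pairwise differences, the following minimal sketch builds a kernel matrix over a set of weighted adjacency matrices. The Frobenius distance, the squared-exponential form, and all function and parameter names here are assumptions chosen for illustration, not necessarily the kernels discussed in the paper. This particular choice is positive definite because the Frobenius distance is the Euclidean distance between vectorized matrices.

```python
import numpy as np

def frobenius_sq_dists(adjs):
    """Pairwise squared Frobenius distances between adjacency matrices.

    adjs has shape (n_networks, n_nodes, n_nodes).
    """
    flat = adjs.reshape(adjs.shape[0], -1)       # vectorize each network
    sq_norms = np.sum(flat ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * flat @ flat.T
    return np.maximum(d2, 0.0)                   # clip tiny negative round-off

def network_kernel(adjs, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel on Frobenius distances (illustrative choice)."""
    d2 = frobenius_sq_dists(adjs)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

# Synthetic symmetric weighted networks (made-up data, for illustration only).
rng = np.random.default_rng(0)
n_networks, n_nodes = 6, 5
mask = np.triu(np.ones((n_nodes, n_nodes)), k=1)  # strict upper triangle
upper = rng.random((n_networks, n_nodes, n_nodes)) * mask
adjs = upper + np.transpose(upper, (0, 2, 1))     # symmetrize, zero diagonal

K = network_kernel(adjs, lengthscale=2.0)

# One draw of latent function values from the zero-mean GP prior;
# the small jitter keeps the covariance numerically positive definite.
f = rng.multivariate_normal(np.zeros(n_networks), K + 1e-8 * np.eye(n_networks))
```

The latent values `f` could then be passed through a likelihood suited to the task at hand, for example a logistic link for binary classification; that step is omitted here.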