Many popular models from the networks literature can be viewed through a common lens. We describe this lens and call the resulting class of models log-linear ERGMs. The class includes degree-based models, stochastic blockmodels, and combinations of these. Given the interest in combined node and block effects in network formation mechanisms, we introduce a general directed relative of the degree-corrected stochastic blockmodel: an exponential family model we call $p_1$-SBM, which generalizes several well-known variants of the blockmodel. We study the problem of testing model fit for the log-linear ERGM class. Our approach combines fast estimation algorithms borrowed from the contingency table literature with effective sampling methods rooted in graph theory and algebraic statistics; it yields an exact test whose $p$-value can be approximated efficiently for networks of moderate size. We showcase the performance of the method on two data sets from biology: the connectome of \emph{C. elegans} and the interactome of \emph{Arabidopsis thaliana}. These two networks, a neuronal network and a protein-protein interaction network, have been popular examples in the network science literature, but a model-based approach to studying them has been missing thus far.