We introduce the PAPER (Preferential Attachment Plus Erd\H{o}s--R\'{e}nyi) model for random networks, in which a random network G is the union of a preferential attachment (PA) tree T and additional Erd\H{o}s--R\'{e}nyi (ER) random edges. The PA tree captures the fact that real-world networks often have an underlying growth or recruitment process in which vertices and edges are added sequentially, while the ER component can be regarded as random noise. Given only a single snapshot of the final network G, we study the problem of constructing confidence sets for the early history of the unobserved growth process, in particular its root node; the root node could be, for example, patient zero in a disease infection network or the source of fake news in a social media network. We propose an inference algorithm based on Gibbs sampling that scales to networks with millions of nodes, and we provide theoretical analysis showing that the expected size of the confidence set is small so long as the noise level of the ER edges is not too large. We also propose variations of the model in which multiple growth processes occur simultaneously, reflecting the growth of multiple communities, and we use these models to provide a new approach to community detection.
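To make the generative model concrete, the following is a minimal sketch of sampling a network as the union of a PA tree and ER noise edges. It rests on assumptions not stated in the abstract: the attachment rule is taken to be purely degree-proportional, each non-tree pair receives a noise edge independently with probability `p`, and the names `sample_paper_graph`, `n`, `p`, and `seed` are hypothetical. It illustrates the model only, not the inference algorithm.

```python
import random


def sample_paper_graph(n, p, seed=None):
    """Sample (tree_edges, noise_edges) on nodes 0..n-1, labelled by arrival order.

    Assumptions (not from the abstract): attachment probability is
    proportional to current tree degree, and every non-tree pair becomes
    a noise edge independently with probability p.
    """
    rng = random.Random(seed)
    tree_edges = []
    # Each node appears in `endpoints` once per incident tree edge, so a
    # uniform draw from this list is a degree-proportional draw.
    endpoints = []
    for v in range(1, n):
        # The second node has no choice but to attach to the root (node 0).
        parent = 0 if v == 1 else rng.choice(endpoints)
        tree_edges.append((parent, v))
        endpoints.extend((parent, v))

    tree_set = set(tree_edges)
    # Quadratic loop over all pairs: fine for a small illustration only.
    noise_edges = [
        (u, v)
        for u in range(n)
        for v in range(u + 1, n)
        if (u, v) not in tree_set and rng.random() < p
    ]
    return tree_edges, noise_edges


if __name__ == "__main__":
    tree, noise = sample_paper_graph(n=10, p=0.1, seed=0)
    print("tree edges:", tree)
    print("noise edges:", noise)
```

In the setting described above, the observer would see only the union of the two edge lists with the arrival-order labels hidden, for instance after randomly permuting the node labels; recovering node 0 from that snapshot is the root inference problem.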