The spread of an infectious disease in a human community or the proliferation of fake news on social media can be modeled as a randomly growing tree-shaped graph. The history of the random growth process is often unobserved but contains important information, such as the source of the infection. We consider the problem of statistical inference on aspects of the latent history using only a single snapshot of the final tree. Our approach is to apply random labels to the observed unlabeled tree and analyze the resulting distribution of the growth process, conditional on the final outcome. We show that this conditional distribution is tractable under a shape-exchangeability condition, which we introduce here, and that this condition is satisfied by many popular models for randomly growing trees, such as uniform attachment, linear preferential attachment, and uniform attachment on a $D$-regular tree. For inference of the root under shape-exchangeability, we propose $O(n \log n)$ time algorithms for constructing confidence sets with valid frequentist coverage, and we derive bounds on the expected size of these confidence sets. We also provide efficient sampling algorithms that extend our methods to a wide class of inference problems.
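To make the setting concrete, the following is a minimal Python sketch of two of the growth models named above, together with the kind of snapshot an observer would see. It is an illustration only and not the paper's inference algorithm. The function names `grow_tree` and `relabel_snapshot` are hypothetical, and the preferential model is assumed to be the simplest degree-proportional variant (attachment probability proportional to current degree, with no additive offset).

```python
import random


def grow_tree(n, model="uniform", seed=None):
    """Grow a random tree on nodes 0..n-1, where node 0 is the root.

    Returns a list `parent` with parent[0] = None and parent[i] < i,
    so the node index doubles as the (latent) arrival time.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if model not in ("uniform", "preferential"):
        raise ValueError("model must be 'uniform' or 'preferential'")
    rng = random.Random(seed)
    parent = [None]
    # Every edge contributes both of its endpoints to this list, so a
    # uniform draw from it selects a node with probability proportional
    # to its degree (assumed variant of linear preferential attachment).
    endpoints = []
    for new in range(1, n):
        if model == "uniform" or not endpoints:
            # Uniform attachment: each existing node is equally likely.
            target = rng.randrange(new)
        else:
            target = rng.choice(endpoints)
        parent.append(target)
        endpoints.extend((target, new))
    return parent


def relabel_snapshot(parent, seed=None):
    """Hide the arrival order: return the edges under random node labels."""
    rng = random.Random(seed)
    labels = list(range(len(parent)))
    rng.shuffle(labels)
    edges = [
        tuple(sorted((labels[child], labels[par])))
        for child, par in enumerate(parent)
        if par is not None
    ]
    rng.shuffle(edges)  # edge order should not leak the history either
    return edges


if __name__ == "__main__":
    history = grow_tree(10, model="preferential", seed=0)
    print(relabel_snapshot(history, seed=1))
```

The list returned by `grow_tree` is the latent history, while the output of `relabel_snapshot` corresponds to the single observed snapshot: the root is still one of the labeled nodes, but nothing in the labels indicates which one.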