We are interested in recovering information on a stochastic block model from the subgraph discovered by an exploring random walk. Stochastic block models correspond to populations structured into a finite number of types, where two individuals are connected by an edge independently of the other pairs and with a probability depending on their types. We consider here the dense case, where the random network can be approximated by a graphon. This problem is motivated by the study of chain-referral surveys, where each interviewee provides information on her/his contacts in the social network. First, we write the likelihood of the subgraph discovered by the random walk: biases appear, since hubs and majority types are more likely to be sampled. Even when the types are observed, the maximum likelihood estimator is no longer explicit. When the types of the vertices are unobserved, we use an SAEM algorithm to maximize the likelihood. Second, we propose a different estimation strategy using new results by Athreya and Roellin. It consists in de-biasing the maximum likelihood estimator proposed in Daudin et al., which ignores the biases.
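To make the sampling setting concrete, the following is a minimal simulation sketch and not the authors' procedure. It draws a small dense stochastic block model, runs a simple random walk on it, and collects the visited vertices. The function names, the parameters `pi` (type proportions) and `kappa` (connection probabilities), and the choice to take the discovered subgraph as the one induced on the visited vertices are all assumptions made for illustration.

```python
import random


def sample_sbm(n, pi, kappa, rng):
    """Draw n vertex types i.i.d. from pi, then connect each pair
    independently with probability kappa[type_u][type_v].
    (pi and kappa are hypothetical parameter names.)"""
    types = rng.choices(range(len(pi)), weights=pi, k=n)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < kappa[types[u]][types[v]]:
                adj[u].add(v)
                adj[v].add(u)
    return types, adj


def explore_by_random_walk(adj, start, steps, rng):
    """Simple random walk: at each step, move to a uniformly chosen
    neighbour of the current vertex. Returns the visited path."""
    path = [start]
    current = start
    for _ in range(steps):
        neighbours = sorted(adj[current])
        if not neighbours:  # isolated vertex: the walk cannot move
            break
        current = rng.choice(neighbours)
        path.append(current)
    return path


def induced_subgraph(adj, vertices):
    """Assumption: the discovered subgraph is the one induced on the
    set of visited vertices."""
    keep = set(vertices)
    return {v: adj[v] & keep for v in keep}


if __name__ == "__main__":
    rng = random.Random(0)
    pi = [0.7, 0.3]                      # majority and minority type
    kappa = [[0.5, 0.1], [0.1, 0.3]]     # dense connection probabilities
    types, adj = sample_sbm(200, pi, kappa, rng)
    path = explore_by_random_walk(adj, start=0, steps=50, rng=rng)
    sub = induced_subgraph(adj, path)
    # Compare type frequencies among visited vertices with the population.
    visited_share = sum(types[v] == 0 for v in path) / len(path)
    population_share = sum(t == 0 for t in types) / len(types)
    print(visited_share, population_share)
```

Comparing the share of each type along the walk with its share in the whole population gives a rough view of the sampling bias the abstract describes, since the walk visits high-degree vertices more often than a uniform sample would.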