Random walk based sampling methods have been widely used in graph sampling in recent years, while it has bias towards higher degree nodes in the sample. To overcome this deficiency, classical methods such as GMD modify the topology of target graphs so that the long-term behavior of Markov chain can achieve uniform distribution. This modification, however, reduces the conductance of graphs, thus makes the sampler stay in the same node for long time, resulting in undersampling. To address this issue, we propose a new way of modifying target graph, thus propose Weighted Jump Random Walk (WJRW) with parameter C to improve the performance. We prove that WJRW can unify Simple Random Walk and uniform distribution through C, and we also conduct extensive experiments on real-world dataset. The experimental results show WJRW can promote the accuracy significantly under the same budget. We also investigate the effect of the parameter C, and give the suggested range for a better usage in application.
翻译:近年来,以随机步行为基础的抽样方法在图表抽样中广泛使用,但这种方法偏向于抽样中较高程度的节点。为克服这一缺陷,GMD等传统方法对目标图的地形学进行了修改,使Markov链的长期行为能够实现统一分布。然而,这种修改减少了图的进行,从而使取样者长期停留在同一节点,从而导致抽样调查不足。为了解决这一问题,我们提出了修改目标图的新方法,从而提议用WJRW(WJRW)参数C(WJRW)来改进性能。我们证明,WJRW(W)可以统一简单的随机行走和通过C的统一分布,我们还在现实世界数据集上进行了广泛的实验。实验结果显示,WJRW(W)可以大大促进同一预算下的准确性。我们还调查了参数C(C)的影响,并给出了更好的应用建议的范围。