Star sampling (SS) is a random sampling procedure on a graph in which each sample consists of a randomly selected vertex (the star center) and its one-hop neighbors (the star endpoints). We consider the use of star sampling to find any member of an arbitrary target set of vertices in a graph, where the figure of merit (cost) is either the expected number of samples (unit cost) or the expected number of star centers plus star endpoints (linear cost) until a vertex in the target set is encountered, either as a star center or as a star endpoint. We analyze this performance measure for three related star sampling paradigms: SS with replacement (SSR), SS without center replacement (SSC), and SS without star replacement (SSS). We derive exact and approximate expressions for the expected unit and linear costs of SSR, SSC, and SSS on Erdős-Rényi (ER) graphs. Our results show (i) little difference in unit cost but (ii) a significant difference in linear cost across the three paradigms. Although our results are derived for ER graphs, experiments on "real-world" graphs suggest that our performance expressions are reasonably accurate for non-ER graphs.
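The search procedure and the two cost measures can be illustrated with a small Monte Carlo sketch. It is not the authors' code or their analytical expressions. It assumes that SSC removes the sampled center from the graph after an unsuccessful sample, that SSS removes the center together with its endpoints, and that linear cost counts every center and endpoint of every sample drawn (repeats included under SSR). The names `er_graph`, `star_search`, and `mean_costs` are hypothetical helpers introduced here for illustration.

```python
import random


def er_graph(n, p, rng):
    """Adjacency sets of an Erdos-Renyi G(n, p) graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def star_search(adj, target, mode, rng):
    """Run one search; return (unit_cost, linear_cost).

    mode is "SSR", "SSC", or "SSS". The removal rules below are this
    sketch's reading of the three paradigms (an assumption).
    """
    target = set(target)
    if mode not in ("SSR", "SSC", "SSS"):
        raise ValueError("unknown mode: %r" % mode)
    if not target or not target <= set(adj):
        raise ValueError("target must be a non-empty subset of the vertices")
    work = {v: set(nb) for v, nb in adj.items()}  # working copy of the graph
    unit = linear = 0
    while True:
        center = rng.choice(list(work))      # uniform over remaining vertices
        star = {center} | work[center]       # center plus one-hop neighbors
        unit += 1                            # one more sample
        linear += len(star)                  # centers plus endpoints seen
        if star & target:                    # hit as center or as endpoint
            return unit, linear
        if mode == "SSC":
            removed = {center}               # drop only the center
        elif mode == "SSS":
            removed = star                   # drop the entire star
        else:
            removed = set()                  # SSR: graph is left unchanged
        for v in removed:
            for w in work.pop(v):
                if w in work:                # neighbor may already be removed
                    work[w].discard(v)


def mean_costs(adj, target, mode, trials, rng):
    """Average unit and linear cost over independent searches."""
    results = [star_search(adj, target, mode, rng) for _ in range(trials)]
    return (sum(u for u, _ in results) / trials,
            sum(c for _, c in results) / trials)


if __name__ == "__main__":
    rng = random.Random(1)
    g = er_graph(200, 0.02, rng)             # illustrative parameters only
    for mode in ("SSR", "SSC", "SSS"):
        print(mode, mean_costs(g, {0, 1, 2}, mode, 500, rng))
```

A target vertex is never removed before it is found, because any star containing it ends the search, so each call terminates. The graph size, edge probability, and target set in the demo are arbitrary choices made for this sketch, not values from the paper.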