We propose PROPAGATE, a fast approximation framework that estimates distance-based metrics, such as the (effective) diameter, the (effective) radius, or the average distance, on very large graphs within a small error. The framework assigns seeds to nodes and propagates them in a BFS-like fashion, computing the neighborhood sets until they cover either the whole vertex set (for the diameter) or a given percentage of it (for the effective diameter). At each iteration, we derive compressed Boolean representations of the neighborhood sets discovered so far. The PROPAGATE framework yields two algorithms: PROPAGATE-P, which propagates all $s$ seeds in parallel, and PROPAGATE-S, which propagates the seeds sequentially. For each node, the compressed representation of PROPAGATE-P requires $s$ bits, while that of PROPAGATE-S requires only $1$ bit. Both algorithms compute the average distance, the effective diameter, the diameter, and the connectivity rate within a small error with high probability: for any $\varepsilon>0$ and using $s=\Theta\left(\frac{\log n}{\varepsilon^2}\right)$ sample nodes, the error for the average distance is bounded by $\xi = \frac{\varepsilon \Delta}{\alpha}$, the errors for the effective diameter and the diameter are bounded by $\xi = \frac{\varepsilon}{\alpha}$, and the error for the connectivity rate is bounded by $\varepsilon$, where $n$ is the number of nodes, $\Delta$ is the diameter, and $\alpha$ is a measure of connectivity of the graph. The time complexity is $\mathcal{O}\left(m\Delta \frac{\log n}{\varepsilon^2}\right)$, where $m$ is the number of edges of the graph. The experimental results show that the PROPAGATE framework improves on the current state of the art in both accuracy and speed. Moreover, we experimentally show that PROPAGATE-S is also very efficient for solving the All-Pairs Shortest Paths problem on very large graphs.
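The parallel variant described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `propagate_p`, the adjacency-dict input, the use of Python integers as $s$-bit signatures, and the 0.9 threshold for the effective diameter are all assumptions made for illustration, and the graph is assumed to be undirected. Each node keeps one bit per seed, and one round ORs the signatures of a node's neighbors into its own, so after $d$ rounds bit $i$ of node $v$ is set exactly when seed $i$ is within distance $d$ of $v$.

```python
import random


def propagate_p(adj, s, tau=0.9, rng=None):
    """Illustrative sketch of parallel seed propagation (hypothetical helper).

    adj: dict mapping each node to an iterable of its neighbors (undirected).
    s:   number of seed nodes to sample.
    tau: fraction of reachable pairs used for the effective diameter (assumed).
    """
    rng = rng or random.Random(0)
    nodes = list(adj)
    seeds = rng.sample(nodes, min(s, len(nodes)))

    # One s-bit signature per node; bit i marks "seed i has reached this node".
    sig = {v: 0 for v in nodes}
    for i, u in enumerate(seeds):
        sig[u] |= 1 << i

    def popcount_total(signatures):
        return sum(bin(b).count("1") for b in signatures.values())

    # reached[d] = number of (seed, node) pairs at distance <= d.
    reached = [popcount_total(sig)]
    while True:
        new_sig = {}
        for v in nodes:
            b = sig[v]
            for w in adj[v]:
                b |= sig[w]  # a seed within d of w is within d + 1 of v
            new_sig[v] = b
        total = popcount_total(new_sig)
        if total == reached[-1]:  # no signature changed: propagation is done
            break
        reached.append(total)
        sig = new_sig

    # Pairs at distance exactly d are reached[d] - reached[d - 1].
    pairs = reached[-1] - reached[0]
    dist_sum = sum(d * (reached[d] - reached[d - 1]) for d in range(1, len(reached)))
    avg_distance = dist_sum / pairs if pairs else 0.0

    # Smallest d at which a tau fraction of all reachable pairs is covered.
    eff_diameter = next(d for d, r in enumerate(reached) if r >= tau * reached[-1])

    return {
        "avg_distance": avg_distance,
        "effective_diameter": eff_diameter,
        "diameter": len(reached) - 1,  # largest eccentricity among the seeds
        "connectivity": reached[-1] / (len(seeds) * len(nodes)),
    }


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(propagate_p(path, s=4))
```

When every node is a seed, the sketch returns exact values; with fewer seeds, the diameter it reports is the largest eccentricity among the sampled seeds and so can only underestimate the true diameter. A one-bit sequential variant in the spirit of PROPAGATE-S would run the same loop once per seed with a single Boolean per node.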