For a directed graph $G$ with $n$ vertices and a start vertex $u_{\sf start}$, we wish to (approximately) sample an $L$-step random walk over $G$ starting from $u_{\sf start}$, using as little space as possible, with a streaming algorithm that makes only a few passes over the edges of the graph. This problem has many applications, for instance in approximating the PageRank of a webpage. If only a single pass is allowed, the space complexity of this problem was shown to be $\tilde{\Theta}(n \cdot L)$. Prior to our work, a better space complexity was known only for algorithms that use $\tilde{O}(\sqrt{L})$ passes. We settle the space complexity of this random walk simulation problem for two-pass streaming algorithms, showing that it is $\tilde{\Theta}(n \cdot \sqrt{L})$, by giving almost matching upper and lower bounds. Our lower bound argument extends to every constant number of passes $p$, and shows that any $p$-pass algorithm for this problem uses $\tilde{\Omega}(n \cdot L^{1/p})$ space. In addition, we show a similar $\tilde{\Theta}(n \cdot \sqrt{L})$ bound on the space complexity of any algorithm (with any number of passes) for the related problem of sampling an $L$-step random walk from every vertex in the graph.
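To make the problem setting concrete, the following is a minimal sketch of the naive many-pass baseline, not an algorithm from this work. It takes one pass over the edge stream per step of the walk and uses reservoir sampling to pick a uniformly random out-edge of the current vertex, so its working memory beyond the output walk is a constant number of words. The function name `sample_walk_multipass` and the `edge_stream` callable (assumed to return a fresh iterator over directed edges `(u, v)` on every call, one call per pass) are hypothetical and chosen only for illustration.

```python
import random


def sample_walk_multipass(edge_stream, u_start, L, rng=None):
    """Sample an L-step random walk using one pass over the edges per step.

    edge_stream: zero-argument callable; each call starts a new pass and
                 returns an iterator over directed edges (u, v).
    Returns the list of visited vertices, starting with u_start.
    The walk stops early if it reaches a vertex with no out-edges.
    """
    rng = rng or random.Random()
    walk = [u_start]
    current = u_start
    for _ in range(L):
        chosen = None
        seen = 0  # number of out-edges of `current` seen so far in this pass
        for u, v in edge_stream():
            if u == current:
                seen += 1
                # Reservoir sampling: keep the new edge with probability 1/seen,
                # which leaves every out-edge equally likely at the end of the pass.
                if rng.randrange(seen) == 0:
                    chosen = v
        if chosen is None:
            break  # dead end: the current vertex has no out-edges
        walk.append(chosen)
        current = chosen
    return walk
```

For example, on the directed 3-cycle `edges = [(0, 1), (1, 2), (2, 0)]`, calling `sample_walk_multipass(lambda: iter(edges), 0, 5)` makes five passes and returns `[0, 1, 2, 0, 1, 2]`, since every vertex has exactly one out-edge. This baseline sits at the opposite extreme from the single-pass setting: it needs $L$ passes but almost no space, whereas the bounds above concern what is achievable when the number of passes is small.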