Graph stream algorithms face two challenges: each edge can be processed only once, and the algorithm must operate within highly constrained memory. Diffusion degree is a measure of node centrality that, for static graphs, can be computed trivially for all nodes using a single Breadth-First Search (BFS). Keeping track of the diffusion degree in a graph stream, however, is nontrivial: the memory required for exact calculation is equivalent to keeping the whole graph in memory. This paper proposes an estimator (or sketch) of diffusion degree for graph streams. We prove the correctness of the proposed sketch and an upper bound on its estimation error. Given $\epsilon, \delta \in (0,1)$, we achieve error below $\epsilon(b_u-a_u)d_u\lambda$ at node $u$ with probability $1-\delta$ using $O(n\frac{1}{\epsilon^2}\log{\frac{1}{\delta}})$ space, where $b_u$ and $a_u$ are the maximum and minimum degrees of the neighbors of $u$, $\lambda$ is the diffusion probability, and $d_u$ is the degree of node $u$. Using this sketch, we propose an algorithm to extract the top-$k$ influential nodes in a graph stream. Comparative experiments show that the spread achieved by the top-$k$ nodes from the proposed graph stream algorithm is equivalent to or better than the spread achieved by the top-$k$ nodes extracted by the exact algorithm.
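For orientation, the exact static-graph baseline that the sketch approximates can be illustrated as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: it assumes the commonly used definition of diffusion degree with a uniform diffusion probability $\lambda$, namely $\lambda\,(d_u + \sum_{v \in N(u)} d_v)$, which the abstract itself does not spell out. The function name `diffusion_degree` and the toy edge list are hypothetical.

```python
from collections import defaultdict


def diffusion_degree(edges, lam):
    """Exact diffusion degree for an undirected static graph.

    Assumed definition (uniform diffusion probability `lam`):
        score(u) = lam * (deg(u) + sum of deg(v) over neighbors v of u)
    """
    # Build the full adjacency structure; this is exactly the memory cost
    # that a streaming sketch is meant to avoid.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    deg = {u: len(nbrs) for u, nbrs in adj.items()}
    return {u: lam * (deg[u] + sum(deg[v] for v in adj[u])) for u in adj}


# Hypothetical toy graph: node 0 is linked to 1, 2, 3; nodes 1 and 2 are linked.
edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
scores = diffusion_degree(edges, lam=0.5)

# Rank nodes by score to obtain a top-k list (here k = 2).
top_k = sorted(scores, key=scores.get, reverse=True)[:2]
```

Note that the baseline stores every adjacency set, so its memory grows with the number of edges; the proposed sketch instead targets the $O(n\frac{1}{\epsilon^2}\log{\frac{1}{\delta}})$ space bound stated above.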