Core decomposition is a classic technique for discovering densely connected regions in a graph, with a wide range of applications. Formally, a $k$-core is a maximal subgraph in which each vertex has at least $k$ neighbors. A natural extension of a $k$-core is a $(k, h)$-core, in which each node must have at least $k$ nodes that can be reached by a path of length at most $h$. The downside of using $(k, h)$-core decomposition is the significant increase in computational complexity: whereas the standard core decomposition can be computed in $O(m)$ time, the generalization can require $O(n^2 m)$ time, where $n$ and $m$ are the number of nodes and edges in the given graph. In this paper we propose a randomized algorithm that produces an $\epsilon$-approximation of the $(k, h)$-core decomposition with probability $1 - \delta$ in $O(\epsilon^{-2} hm (\log^2 n - \log \delta))$ time. The approximation is based on sampling the neighborhoods of nodes, and we use the Chernoff bound to prove the approximation guarantee. We demonstrate empirically that approximating the decomposition complements the exact computation: computing the approximation is significantly faster than computing the exact solution on the networks where the exact computation is slow.
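For intuition, the two notions above can be sketched in code. The first function below is the standard linear-time peeling algorithm for ordinary $k$-core decomposition; the second computes the $h$-neighborhood degree used in the $(k,h)$-core definition by a depth-bounded BFS. This is an illustrative sketch (the function names and the dict-of-sets graph representation are our own assumptions), not the approximation algorithm proposed in the paper:

```python
from collections import defaultdict, deque

def core_decomposition(adj):
    """Peeling algorithm for standard k-core decomposition.

    adj: dict mapping each node to a set of its neighbors.
    Returns a dict mapping each node to its core number.
    Nodes are removed in order of minimum remaining degree.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    buckets = defaultdict(set)          # bucket nodes by current degree
    for v, d in degree.items():
        buckets[d].add(v)
    core = {}
    k = 0
    while len(core) < len(adj):
        while not buckets[k]:           # advance to smallest non-empty bucket
            k += 1
        v = buckets[k].pop()
        core[v] = k
        for u in adj[v]:                # decrement surviving neighbors,
            if u not in core and degree[u] > k:  # clamping degrees at k
                buckets[degree[u]].discard(u)
                degree[u] -= 1
                buckets[degree[u]].add(u)
    return core

def h_degree(adj, v, h):
    """Number of nodes reachable from v by a path of length at most h,
    computed by breadth-first search truncated at depth h."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == h:
            continue
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return len(dist) - 1                # exclude v itself
```

Running `h_degree` for every node at every peeling step is what drives the exact $(k,h)$-decomposition toward the $O(n^2 m)$ bound quoted above; the paper's contribution is to replace these exact neighborhood counts with sampled estimates.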