Core decomposition is a classic technique for discovering densely connected regions in a graph, with a wide range of applications. Formally, a $k$-core is a maximal subgraph where each vertex has at least $k$ neighbors. A natural extension of a $k$-core is a $(k, h)$-core, where each node must be able to reach at least $k$ other nodes with a path of length at most $h$. The downside of using $(k, h)$-core decomposition is the significant increase in computational complexity: whereas the standard core decomposition can be computed in $O(m)$ time, the generalization can require $O(n^2 m)$ time, where $n$ and $m$ are the number of nodes and edges in the given graph. In this paper we propose a randomized algorithm that produces an $\epsilon$-approximation of the $(k, h)$-core decomposition with probability $1 - \delta$ in $O(\epsilon^{-2} hm (\log^2 n - \log \delta))$ time. The approximation is based on sampling the neighborhoods of nodes, and we use the Chernoff bound to prove the approximation guarantee. We also study distance-generalized dense subgraphs, show that the problem is NP-hard, provide an algorithm for discovering such subgraphs using approximate core decompositions, and give theoretical guarantees for the quality of the discovered subgraphs. We demonstrate empirically that approximating the decomposition complements the exact computation: computing the approximation is significantly faster than computing the exact solution for the networks where the exact computation is slow.
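To illustrate the sampling idea behind such an approximation, the following is a minimal Python sketch and not the paper's exact algorithm; the functions `bfs_within_h` and `estimate_h_degrees` and the particular estimator are illustrative assumptions. It estimates every node's $h$-degree, i.e. the number of nodes within distance $h$, by running depth-$h$ BFS from uniformly sampled sources and rescaling the hit counts; in an undirected graph, a node $v$ lies within distance $h$ of a uniform random source with probability $|N_h(v)|/n$, so the rescaled count is unbiased and a Chernoff bound controls its deviation.

```python
import random

def bfs_within_h(adj, src, h):
    """Return the set of nodes (excluding src) reachable from src
    by a path of at most h edges. adj is a list of adjacency lists."""
    seen = {src}
    frontier = [src]
    for _ in range(h):
        nxt = []
        for u in frontier:
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    nxt.append(w)
        frontier = nxt
    seen.discard(src)
    return seen

def estimate_h_degrees(adj, h, num_samples):
    """Estimate |N_h(v)| for every node v (illustrative sketch).

    Each uniformly sampled source s 'hits' all nodes within distance h
    of s; E[hits[v]] = num_samples * |N_h(v)| / n, so scaling by
    n / num_samples gives an unbiased estimate whose error a standard
    Chernoff bound controls as num_samples grows."""
    n = len(adj)
    hits = [0] * n
    for _ in range(num_samples):
        s = random.randrange(n)          # sample a source with replacement
        for v in bfs_within_h(adj, s, h):
            hits[v] += 1
    return [c * n / num_samples for c in hits]

# Example: a 4-cycle; every node has h-degree 2 for h = 1 and 3 for h = 2.
adj = [[1, 3], [0, 2], [1, 3], [0, 2]]
print(estimate_h_degrees(adj, h=2, num_samples=10_000))
```

The required number of samples for a target accuracy follows from a standard Chernoff argument; the paper's actual algorithm samples per-node neighborhoods rather than global BFS sources, which is how it attains the stated $O(\epsilon^{-2} hm (\log^2 n - \log \delta))$ running time.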