A $k$-defective clique is a relaxation of the traditional clique definition, allowing up to $k$ missing edges. This relaxation is crucial in various real-world applications such as link prediction, community detection, and social network analysis. Although the problems of enumerating maximal $k$-defective cliques and searching a maximum $k$-defective clique have been extensively studied, existing algorithms suffer from limitations such as the combinatorial explosion of small partial solutions and sub-optimal search spaces. To address these limitations, we propose a novel clique-first branch-and-bound framework that first generates cliques and then adds missing edges. Furthermore, we introduce a new pivoting technique that achieves a search space size of $\mathcal{O}(3^{\frac{n}{3}} \cdot n^k)$, where $n$ is the number of vertices in the input graph. We prove that the worst-case number of maximal $k$-defective cliques is $Ω(3^{\frac{n}{3}} \cdot n^k)$ when $k$ is a constant, establishing that our algorithm's search space is worst-case optimal. Leveraging the diameter-two property of defective cliques, we further reduce the search space size to $\mathcal{O}(n \cdot 3^{\fracδ{3}} \cdot (δΔ)^k)$, where $δ$ is the degeneracy and $Δ$ is the maximum degree of the input graph. We also propose an efficient framework for maximum $k$-defective clique search based on our branch-and-bound, together with practical techniques to reduce the search space. Experiments on real-world benchmark datasets with more than 1 million edges demonstrate that each of our proposed algorithms for maximal $k$-defective clique enumeration and maximum $k$-defective clique search outperforms the respective state-of-the-art algorithms by up to four orders of magnitude in terms of processing time.
翻译:$k$-缺陷团是传统团定义的松弛形式,允许最多$k$条缺失边。这种松弛在链接预测、社区检测和社交网络分析等多种实际应用中至关重要。尽管枚举极大$k$-缺陷团和搜索最大$k$-缺陷团的问题已得到广泛研究,但现有算法存在局限性,例如小部分解的指数爆炸和次优搜索空间。为解决这些局限性,我们提出了一种新颖的团优先分支定界框架,首先生成团,然后添加缺失边。此外,我们引入了一种新的枢轴技术,实现了$\mathcal{O}(3^{\frac{n}{3}} \cdot n^k)$的搜索空间大小,其中$n$是输入图中的顶点数。我们证明了当$k$为常数时,极大$k$-缺陷团的最坏情况数量为$Ω(3^{\frac{n}{3}} \cdot n^k)$,从而确立了我们算法的搜索空间在最坏情况下是最优的。利用缺陷团的直径二特性,我们进一步将搜索空间大小减少到$\mathcal{O}(n \cdot 3^{\fracδ{3}} \cdot (δΔ)^k)$,其中$δ$是退化度,$Δ$是输入图的最大度数。我们还基于我们的分支定界方法,提出了一种用于最大$k$-缺陷团搜索的高效框架,并结合了减少搜索空间的实用技术。在超过100万条边的真实世界基准数据集上的实验表明,我们提出的极大$k$-缺陷团枚举算法和最大$k$-缺陷团搜索算法在处理时间上分别比现有最先进算法快多达四个数量级。