Enumerating maximal $k$-biplexes (MBPs) of a bipartite graph has been used in applications such as fraud detection. However, the number of MBPs is usually exponential, which raises two issues when enumerating them: the effectiveness issue (many MBPs are of low value) and the efficiency issue (enumerating all MBPs is not affordable on large graphs). Existing proposals tackle this problem by imposing constraints on the number of vertices of each MBP to be enumerated, yet they remain insufficient (e.g., they require the user to specify the constraints, which is often not user-friendly, and they cannot directly control the number of MBPs to be enumerated). Therefore, in this paper, we study the problem of finding the $K$ MBPs with the most edges, called MaxBPs, where $K$ is a positive integer specified by the user. This formulation avoids the drawbacks of existing proposals. We formally prove the NP-hardness of the problem. We then design two branch-and-bound algorithms; the better one, called FastBB, improves the worst-case time complexity to $O^*(\gamma_k^n)$, where $O^*$ suppresses polynomial factors, $\gamma_k$ is a real number that depends on $k$ and is strictly smaller than 2, and $n$ is the number of vertices in the graph. For example, for $k=1$, $\gamma_k$ is equal to $1.754$. We further introduce three techniques for boosting the performance of the branch-and-bound algorithms; the best one, called PBIE, further improves the time complexity to $O^*(\gamma_k^{d^3})$ for large sparse graphs, where $d$ is the maximum degree of the graph. We conduct extensive experiments on both real and synthetic datasets. The results show that our algorithm is up to four orders of magnitude faster than all baselines, and that finding MaxBPs works better than finding all MBPs for a fraud detection application.
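To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not FastBB or PBIE, and it is only usable on tiny graphs. It assumes the commonly used definition of a $k$-biplex: a pair of vertex sets $(X, Y)$, one from each side of the bipartite graph, in which every vertex is non-adjacent to at most $k$ vertices on the opposite side. The function names and the `(edge_count, X, Y)` return shape are illustrative choices, not taken from the paper, and the sketch applies no size constraint on either side.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_k_biplex(X, Y, adj, k):
    """Assumed definition: each vertex misses at most k vertices on the other side."""
    return (all(len(Y - adj[u]) <= k for u in X)
            and all(len(X - adj[v]) <= k for v in Y))


def count_edges(X, Y, adj):
    """Number of edges between X and Y."""
    return sum(len(adj[u] & Y) for u in X)


def top_k_maximal_biplexes(left, right, edges, k, K):
    """Brute force: enumerate all maximal k-biplexes, keep the K with most edges."""
    left, right = frozenset(left), frozenset(right)
    adj = {v: set() for v in left | right}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    found = []
    for X in subsets(left):
        for Y in subsets(right):
            if not is_k_biplex(X, Y, adj, k):
                continue
            # The k-biplex property is hereditary, so (X, Y) is maximal
            # exactly when no single extra vertex can be added.
            extendable = (
                any(is_k_biplex(X | {u}, Y, adj, k) for u in left - X)
                or any(is_k_biplex(X, Y | {v}, adj, k) for v in right - Y)
            )
            if not extendable:
                found.append((count_edges(X, Y, adj), X, Y))

    found.sort(key=lambda t: -t[0])  # most edges first
    return found[:K]
```

On the three-edge graph with edges `a-x`, `b-x`, `b-y`, calling `top_k_maximal_biplexes({"a", "b"}, {"x", "y"}, edges, k=1, K=1)` returns the whole graph, since each vertex misses at most one vertex on the other side. The exhaustive loop over all subset pairs is exactly the exponential cost that the branch-and-bound algorithms in the abstract are designed to reduce.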