We propose a highly parallel primal-dual algorithm for the multicut (a.k.a. correlation clustering) problem, a classical graph clustering problem widely used in machine learning and computer vision. Our algorithm consists of three steps executed recursively: (1) Finding conflicted cycles that correspond to violated inequalities of the underlying multicut relaxation, (2) Performing message passing between the edges and cycles to optimize the Lagrange relaxation coming from the found violated cycles producing reduced costs and (3) Contracting edges with high reduced costs through matrix-matrix multiplications. Our algorithm produces primal solutions and dual lower bounds that estimate the distance to optimum. We implement our algorithm on GPUs and show resulting one to two order-of-magnitudes improvements in execution speed without sacrificing solution quality compared to traditional serial algorithms that run on CPUs. We can solve very large scale benchmark problems with up to $\mathcal{O}(10^8)$ variables in a few seconds with small primal-dual gaps. We make our code available at https://github.com/pawelswoboda/RAMA.
翻译:我们建议对多切(a.k.a.相关组合)问题采用高度平行的原始双轨算法,这是在机器学习和计算机视觉中广泛使用的典型图形组合问题。我们的算法由循环执行的三个步骤组成:(1) 找到与根本多切放松中违反的不平等相对应的冲突周期,(2) 在边缘和周期之间传递信息,以优化从发现的被破坏周期产生的拉格朗放松,从而降低成本;(3) 通过矩阵矩阵矩阵倍增法,以降低高成本的边缘。我们的算法产生原始解决方案和双倍较低的边框,从而估计到最佳距离。我们在GPUs上应用了原始解决方案,并展示了执行速度方面的一至两阶级磁带改进,而没有牺牲解决方案的质量,而没有牺牲了在CPUs上运行的传统序列算法的质量。我们可以在短短几秒钟内解决高达$mathcal{O}(10 ⁇ 8)的非常大规模的基准问题,同时使用小的原始矩阵差距。我们可以在https://github.com/paweloswobda/RAMA。