Hypergraph clustering is a basic algorithmic primitive for analyzing complex datasets and systems characterized by multiway interactions, such as group email conversations, groups of co-purchased retail products, and co-authorship data. This paper presents a practical $O(\log n)$-approximation algorithm for a broad class of hypergraph ratio cut clustering objectives. This includes objectives involving generalized hypergraph cut functions, which allow a user to penalize cut hyperedges differently depending on the number of nodes in each cluster. Our method is a generalization of the cut-matching framework for graph ratio cuts, and relies only on solving maximum s-t flow problems in a special reduced graph. It is significantly faster than existing hypergraph ratio cut algorithms, while also solving a more general problem. In numerical experiments on various types of hypergraphs, we show that it quickly finds ratio cut solutions within a small factor of optimality.
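To make the notion of a generalized hypergraph cut function concrete, the following is a minimal sketch of how such an objective could be evaluated on a toy hypergraph. It is not the cut-matching algorithm described above; it only computes the quantity being minimized, and it uses brute force to find the best cluster on a tiny instance. The function names, the two penalty functions (`all_or_nothing`, `linear_penalty`), the example hypergraph `H`, and the choice to normalize by the smaller cluster size are all illustrative assumptions, not definitions taken from the paper.

```python
from itertools import combinations

# Hypothetical penalty functions: k is the number of nodes of a hyperedge
# that land on the smaller side of the split (k = 0 means the edge is uncut).
def all_or_nothing(k):
    # Every cut hyperedge costs 1, however it is split.
    return 1.0 if k >= 1 else 0.0

def linear_penalty(k):
    # The cost grows with the number of nodes on the smaller side.
    return float(k)

def generalized_cut(hyperedges, S, penalty):
    """Sum the penalty of each hyperedge for the cluster S."""
    S = set(S)
    total = 0.0
    for e in hyperedges:
        inside = sum(1 for v in e if v in S)
        k = min(inside, len(e) - inside)
        total += penalty(k)
    return total

def ratio_cut(hyperedges, nodes, S, penalty):
    """Cut value divided by the size of the smaller cluster (one assumed ratio form)."""
    S = set(S)
    denom = min(len(S), len(nodes) - len(S))
    if denom == 0:
        raise ValueError("S must be a nonempty proper subset of nodes")
    return generalized_cut(hyperedges, S, penalty) / denom

def brute_force_best(hyperedges, nodes, penalty):
    """Exhaustive search over clusters; only feasible for very small inputs."""
    nodes = sorted(nodes)
    best = None
    for size in range(1, len(nodes) // 2 + 1):
        for S in combinations(nodes, size):
            val = ratio_cut(hyperedges, nodes, S, penalty)
            if best is None or val < best[0]:
                best = (val, set(S))
    return best

# Toy hypergraph (assumed data): six nodes, five hyperedges.
NODES = [0, 1, 2, 3, 4, 5]
H = [(0, 1, 2), (3, 4, 5), (2, 3), (0, 1, 2, 3), (1, 2, 3, 4)]
```

For the cluster `{0, 1, 2}`, the all-or-nothing penalty counts three cut hyperedges, while the linear penalty charges the evenly split hyperedge `(1, 2, 3, 4)` twice as much as the others, so the two objectives can rank the same clustering differently.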