Several clustering methods (e.g., Normalized Cut and Ratio Cut) divide the Min Cut cost function by a cluster-dependent factor (e.g., the size or the degree of the clusters) in order to yield a more balanced partitioning. We instead investigate adding such regularization terms to the original cost function. We first consider the case where the regularization term is the sum of the squared sizes of the clusters, and then generalize it to an adaptive regularization of the pairwise similarities. This amounts to (adaptively) shifting the pairwise similarities, which may make some of them negative. We study the connection of this method to Correlation Clustering and propose an efficient local search optimization algorithm with a fast theoretical convergence rate to solve the new clustering problem. We then investigate the effect of shifting the pairwise similarities on some common clustering methods, and finally demonstrate the superior performance of the method through extensive experiments on different datasets.
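To make the link between the squared-size regularizer and the shift of similarities concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a symmetric similarity matrix `X`, a Min Cut cost that sums `X[i, j]` over all ordered pairs assigned to different clusters, and a hypothetical regularization weight `alpha`; none of these conventions are fixed by the abstract. Under these assumptions, the sum of squared cluster sizes equals the number of ordered same-cluster pairs, so adding `alpha` times that sum to the cut is the same as computing the cut on `X - alpha`, up to the constant `alpha * n**2`.

```python
import numpy as np


def cut_cost(X, labels):
    """Sum of X[i, j] over ordered pairs (i, j) placed in different clusters."""
    labels = np.asarray(labels)
    different = labels[:, None] != labels[None, :]
    return float(X[different].sum())


def regularized_cut_cost(X, labels, alpha):
    """Min Cut cost plus alpha times the sum of squared cluster sizes."""
    _, sizes = np.unique(np.asarray(labels), return_counts=True)
    return cut_cost(X, labels) + alpha * float(np.sum(sizes ** 2))


def shifted_cut_cost(X, labels, alpha):
    """Min Cut cost on the shifted similarities X - alpha, plus a constant.

    Entries of X smaller than alpha become negative after the shift.
    The constant alpha * n**2 does not depend on the clustering.
    """
    n = X.shape[0]
    return cut_cost(X - alpha, labels) + alpha * n ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((6, 6))
    X = (X + X.T) / 2  # symmetric similarities (assumed)
    labels = [0, 0, 0, 1, 1, 2]
    alpha = 0.3  # hypothetical regularization weight
    print(regularized_cut_cost(X, labels, alpha))
    print(shifted_cut_cost(X, labels, alpha))
```

Because the two costs differ only by a term that is independent of the clustering, minimizing one is equivalent to minimizing the other in this sketch, which is why the regularized problem can be viewed as Min Cut on similarities that may be negative.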