We propose "breathing $k$-means", a novel approximation algorithm for the $k$-means problem. After seeding the centroid set with the well-known $k$-means++ algorithm, the new method cyclically increases and decreases the number of centroids to find an improved solution. The $k$-means++ solutions used for seeding are typically improved significantly, at moderate extra computational cost. The effectiveness of our method is demonstrated on a variety of $k$-means problems, including all those used in the original $k$-means++ publication. The Python implementation of the new algorithm consists of 78 lines of code.
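To make the seed-then-breathe idea concrete, the following is a minimal pure-Python sketch and not the 78-line implementation mentioned above. The abstract only states that the centroid count is cyclically increased and decreased after $k$-means++ seeding; everything else here is an assumption made for illustration: the function names, the parameters `m` and `tol`, placing new centroids inside the clusters with the largest error, removing the centroids whose removal costs the least, and shrinking the step size `m` whenever a cycle fails to improve the error. Points are assumed to be tuples of floats with more than $k$ distinct values.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def sse(points, centroids):
    """k-means objective: sum of squared distances to the nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def nearest(p, centroids):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))

def kmeans_pp_seed(points, k, rng):
    """k-means++ seeding: sample each new centroid proportionally to D^2."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        d2 = [min(sq_dist(p, c) for c in centroids) for p in points]
        centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids

def lloyd(points, centroids, max_iter=100):
    """Lloyd iterations; a centroid with no members is left where it is."""
    centroids = list(centroids)
    for _ in range(max_iter):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        updated = [tuple(sum(x) / len(g) for x in zip(*g)) if g else c
                   for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def breathing_kmeans(points, k, m=3, tol=1e-4, seed=0):
    """Hypothetical breathing loop; m and tol are illustrative parameters."""
    rng = random.Random(seed)
    best = lloyd(points, kmeans_pp_seed(points, k, rng))
    best_err = sse(points, best)
    while m > 0:
        step = min(m, k)
        # Breathe in (assumed heuristic): add one centroid inside each of
        # the `step` clusters that currently contribute the most error.
        groups = [[] for _ in best]
        for p in points:
            groups[nearest(p, best)].append(p)
        cluster_err = [sum(sq_dist(p, c) for p in g)
                       for g, c in zip(groups, best)]
        worst = sorted(range(k), key=lambda j: cluster_err[j],
                       reverse=True)[:step]
        extra = [rng.choice(groups[j]) for j in worst if groups[j]]
        trial = lloyd(points, best + extra)
        # Breathe out (assumed heuristic): repeatedly drop the centroid
        # whose removal would increase the error the least.
        while len(trial) > k:
            util = [0.0] * len(trial)
            for p in points:
                d = sorted((sq_dist(p, c), j) for j, c in enumerate(trial))
                util[d[0][1]] += d[1][0] - d[0][0]
            trial.pop(min(range(len(trial)), key=lambda j: util[j]))
        trial = lloyd(points, trial)
        err = sse(points, trial)
        if err < best_err * (1 - tol):
            best, best_err = trial, err   # keep the improved solution
        else:
            m -= 1                        # no gain: take smaller breaths
    return best, best_err
```

Because a cycle is accepted only when it lowers the error, the returned solution in this sketch is never worse than the $k$-means++ plus Lloyd starting point; the extra cost comes from the additional Lloyd runs in each cycle.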