K-means is a classical clustering algorithm with wide applications. However, soft K-means, or fuzzy c-means at m=1, remains unsolved since 1981. To address this challenging open problem, we propose a novel clustering model, i.e. Probabilistic K-Means (PKM), which is also a nonlinear programming model constrained on linear equalities and linear inequalities. In theory, we can solve the model by active gradient projection, while inefficiently. Thus, we further propose maximum-step active gradient projection and fast maximum-step active gradient projection to solve it more efficiently. By experiments, we evaluate the performance of PKM and how well the proposed methods solve it in five aspects: initialization robustness, clustering performance, descending stability, iteration number, and convergence speed.
翻译:K- Means (PKM) 是一种传统组合算法, 具有广泛的应用。 然而, 软 K 表示, 或 m=1 模糊 c 表示, 自1981年以来仍未解决 。 为了解决这个具有挑战性的开放问题, 我们提议了一个新型组合模型, 即概率 K- Means (PKM), 这也是受线性平等和线性不平等制约的非线性编程模型。 在理论上, 我们可以通过主动梯度投影解决模型, 但效率不高 。 因此, 我们进一步提出 最大步主动梯度投影和快速最大步主动梯度预测, 以更有效地解决这个问题 。 通过实验, 我们评估 PKM 的性能, 以及拟议方法如何在五个方面很好地解决这个问题 : 初始化稳健性、 组合性性性能、 递增稳定性、 迭代数 和 趋同速度 。