The $k$-means algorithm is a widely used clustering method owing to its simplicity, effectiveness, and speed, but its main disadvantage is its high sensitivity to the initial positions of the cluster centers. Global $k$-means is a deterministic algorithm proposed to tackle the random initialization problem of $k$-means, but it incurs a high computational cost. It partitions the data into $K$ clusters by solving all $k$-means sub-problems incrementally for $k=1,\ldots,K$. For each $k$-cluster problem, the method executes the $k$-means algorithm $N$ times, where $N$ is the number of data points. In this paper, we propose the global $k$-means$++$ clustering algorithm, an effective way of acquiring clustering solutions of quality akin to those of global $k$-means at a reduced computational load. This is achieved by exploiting the center selection probability used in the $k$-means$++$ algorithm. The proposed method has been tested and compared on various well-known real and synthetic datasets, yielding very satisfactory results in terms of clustering quality and execution speed.
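The incremental scheme described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state how many candidates are drawn or exactly how they are chosen, so the sketch assumes that, for each $k$, a small number of candidate centers is sampled with the $k$-means$++$ probability (proportional to the squared distance of each point to its nearest current center) and that $k$-means is then run once per candidate. The names `lloyd`, `incremental_kmeans_pp`, and `n_candidates` are hypothetical.

```python
import numpy as np


def lloyd(X, centers, n_iter=100):
    """Plain k-means (Lloyd) iterations from given initial centers.

    Returns the final centers and the sum of squared distances of
    every point to its nearest center.
    """
    centers = centers.copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old center
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.min(axis=1).sum()


def incremental_kmeans_pp(X, K, n_candidates=10, seed=0):
    """Solve the k = 1..K problems incrementally (illustrative sketch).

    Assumption: n_candidates new-center candidates per step, sampled with
    the k-means++ selection probability, instead of trying all N points
    as global k-means does.
    """
    rng = np.random.default_rng(seed)
    centers = X.mean(axis=0, keepdims=True)  # the k = 1 solution is the data mean
    errors = [((X - centers) ** 2).sum()]
    for k in range(2, K + 1):
        # Squared distance of each point to its nearest current center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).min(axis=1)
        if d2.sum() == 0:  # every point already coincides with a center
            break
        probs = d2 / d2.sum()  # k-means++ selection probability
        size = min(n_candidates, np.count_nonzero(probs))
        candidates = rng.choice(len(X), size=size, replace=False, p=probs)
        best = None
        for i in candidates:
            # Keep the k-1 centers found so far and add one candidate.
            init = np.vstack([centers, X[i]])
            c, e = lloyd(X, init)
            if best is None or e < best[1]:
                best = (c, e)
        centers, err = best
        errors.append(err)
    return centers, errors


# Usage on three synthetic Gaussian blobs.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(10.0, 0.0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(0.0, 10.0), scale=0.5, size=(50, 2)),
])
centers, errors = incremental_kmeans_pp(X, K=3)
print(centers.shape, len(errors))
```

With `n_candidates` fixed, each step runs $k$-means a constant number of times rather than $N$ times, which is where the reduction in computational load would come from under these assumptions.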