$K$-means clustering is one of the most widely used partitioning algorithms in cluster analysis due to its simplicity and computational efficiency. However, $K$-means does not provide an appropriate clustering result when applied to data with non-spherically shaped clusters. We propose a novel partitioning clustering algorithm based on expectiles. The cluster centers are defined as multivariate expectiles, and clusters are searched via a greedy algorithm that minimizes the within-cluster '$\tau$-variance'. We suggest two schemes: fixed $\tau$ clustering and adaptive $\tau$ clustering. Simulation results show that this method outperforms both $K$-means and spectral clustering on data with asymmetrically shaped clusters or clusters with a complicated structure, including asymmetric normal, beta, skewed $t$, and $F$ distributed clusters. We provide two applications. The first applies adaptive $\tau$ clustering to crypto-currency (CC) market data, where the expectile clusters of CC markets exhibit the features of a market dominated by institutional investors. The second application is image segmentation. Compared to other center-based clustering methods, the adaptive $\tau$ cluster centers of pixel data better capture and describe the features of an image, while fixed $\tau$ clustering offers more flexibility in segmentation with decent accuracy.
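To make the fixed $\tau$ scheme concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the multivariate expectile is taken coordinate-wise, that the within-cluster $\tau$-variance is the asymmetrically weighted squared deviation $|\tau - \mathbf{1}\{x < e\}|\,(x - e)^2$ summed over coordinates, and that the greedy search is a Lloyd-style alternation between assignment and center updates. The function names `expectile`, `tau_loss`, and `fixed_tau_clustering` are hypothetical, and the adaptive $\tau$ scheme is not covered.

```python
import numpy as np

def expectile(x, tau, n_iter=100, tol=1e-10):
    """tau-expectile of a 1-D sample, computed by iteratively reweighted means.

    Assumes 0 < tau < 1 so that all weights are positive.
    """
    e = x.mean()
    for _ in range(n_iter):
        # Points above the current estimate get weight tau, the rest 1 - tau.
        w = np.where(x > e, tau, 1.0 - tau)
        e_new = np.sum(w * x) / np.sum(w)
        if abs(e_new - e) < tol:
            return e_new
        e = e_new
    return e

def tau_loss(X, centers, tau):
    """Asymmetric squared loss of every point to every center, shape (n, k)."""
    diff = X[:, None, :] - centers[None, :, :]
    w = np.where(diff > 0, tau, 1.0 - tau)
    return np.sum(w * diff**2, axis=2)

def fixed_tau_clustering(X, k, tau, n_iter=50, seed=0):
    """Lloyd-style alternation with coordinate-wise expectiles as centers (assumed)."""
    rng = np.random.default_rng(seed)
    # Initialise centers at k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Assignment step: each point joins the center with the smallest tau-loss.
        new_labels = tau_loss(X, centers, tau).argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Update step: recompute each center as the coordinate-wise tau-expectile.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = [expectile(members[:, d], tau)
                              for d in range(X.shape[1])]
    return labels, centers
```

With $\tau = 0.5$ the weights are symmetric, the expectile reduces to the mean, and the sketch coincides with standard $K$-means; other values of $\tau$ shift each center toward the upper or lower tail of its cluster.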