This paper introduces the equiwide clustering problem, in which a partition is valid only if every cluster satisfies an intra-cluster dissimilarity constraint. Unlike most existing clustering algorithms, equiwide clustering relies neither on density nor on a predefined number of expected classes, but on a dissimilarity threshold. Its main goal is to guarantee an upper bound on the error incurred when any object is ultimately replaced by its cluster representative. Subject to this constraint, the primary objective is to minimize the number of clusters, with possible secondary objectives. We argue that equiwide clustering is a sound clustering problem, and we discuss its relationship with other optimization problems, existing and novel implementations, and approximation strategies. We review and evaluate suitable clustering algorithms to identify the trade-offs between the practical solutions available for this problem.
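To make the threshold constraint concrete, the following is a minimal sketch of a greedy "leader"-style heuristic, not the method proposed in the paper. It assumes one particular reading of the constraint: every object must lie within the threshold of its cluster representative. The function name `greedy_threshold_clustering` and its parameters are hypothetical. The heuristic always returns a valid partition under that reading, but it only approximates the objective: the number of clusters depends on the input order and is not guaranteed to be minimal.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def greedy_threshold_clustering(
    objects: Sequence[T],
    dissimilarity: Callable[[T, T], float],
    threshold: float,
) -> Tuple[List[T], List[int]]:
    """Greedy heuristic (illustrative assumption, not the paper's algorithm).

    Each object joins the first existing cluster whose representative is
    within `threshold`; otherwise it becomes the representative of a new
    cluster. Returns the representatives and one cluster index per object.
    """
    representatives: List[T] = []
    labels: List[int] = []
    for obj in objects:
        for idx, rep in enumerate(representatives):
            if dissimilarity(obj, rep) <= threshold:
                labels.append(idx)
                break
        else:
            # No representative is close enough: open a new cluster.
            representatives.append(obj)
            labels.append(len(representatives) - 1)
    return representatives, labels


# Usage on one-dimensional data with absolute difference as dissimilarity.
points = [0.0, 0.4, 1.0, 5.0, 5.3, 9.0]
reps, labels = greedy_threshold_clustering(
    points, lambda a, b: abs(a - b), threshold=1.0
)
```

By construction, replacing any object with `reps[labels[i]]` incurs a dissimilarity of at most `threshold`, which is the kind of error bound the abstract describes. Finding the partition with the fewest clusters is the harder optimization problem that the paper addresses.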