Hierarchical clustering is a popular method for identifying distinct groups in a dataset. The most common way to prune a dendrogram is with a single horizontal cut. In this paper, we propose a new technique, "weakest link optimal pruning". We prove its superiority over horizontal pruning and provide examples illustrating how differently the two methods can behave.