External evaluation criteria for soft clustering (SC) have received limited attention: existing methods do not provide a general way to extend comparison measures to SC, and they cannot account for the uncertainty represented in the results of SC algorithms. In this article, we propose a general method that addresses these limitations, which we call \emph{distributional measures}, grounded in a novel interpretation of SC as distributions over hard clusterings. We provide an in-depth study of the complexity- and metric-theoretic properties of the proposed approach, and we describe approximation techniques that can make the calculations tractable. Finally, we illustrate our approach through a simple experiment.
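To make the idea of a soft clustering as a distribution over hard clusterings concrete, the following is a minimal sketch and not the construction used in the article. It assumes a fuzzy membership matrix whose rows sum to one, assumes that each object chooses its cluster independently with probability equal to its membership degree, and uses plain Monte Carlo sampling as a stand-in for the approximation techniques mentioned above. The function names (`rand_index`, `sample_hard`, `rand_distribution`) and the toy data are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def rand_index(a, b):
    """Rand index between two hard clusterings given as label lists."""
    pairs = list(combinations(range(len(a)), 2))
    # A pair agrees when both clusterings put it together or both split it.
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def sample_hard(memberships, rng):
    """Draw one hard clustering from a fuzzy membership matrix.

    Assumption (not from the article): objects are assigned independently,
    each with probability equal to its membership degree.
    """
    return [rng.choices(range(len(row)), weights=row)[0] for row in memberships]


def rand_distribution(memberships, reference, n_samples=2000, seed=0):
    """Empirical distribution of the Rand index over sampled hard clusterings."""
    rng = random.Random(seed)
    counts = Counter(
        rand_index(sample_hard(memberships, rng), reference)
        for _ in range(n_samples)
    )
    return {value: c / n_samples for value, c in sorted(counts.items())}


# Toy data: four objects, two clusters; the third object is fully ambiguous.
memberships = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.5, 0.5],
    [0.1, 0.9],
]
reference = [0, 0, 1, 1]

dist = rand_distribution(memberships, reference)
for value, prob in dist.items():
    print(f"Rand index {value:.3f}: estimated probability {prob:.3f}")
```

Under these assumptions the result is a distribution of Rand index values rather than a single score, so the spread of that distribution reflects the uncertainty in the soft clustering.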