Classifiers for medical image analysis are often trained with a single consensus label, obtained by combining labels given by experts or crowds. However, disagreement between annotators may be informative, so removing it may not be the best strategy. As a proof of concept, we predict whether a skin lesion from the ISIC 2017 dataset is a melanoma, based on crowd annotations of visual characteristics of that lesion. We compare the mean annotations, which represent consensus, with standard deviations and other distribution moments, which represent disagreement. We show that the mean annotations perform best, but that the disagreement measures are still informative. We also make the crowd annotations used in this paper available at \url{https://figshare.com/s/5cbbce14647b66286544}.
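To make the distinction between consensus and disagreement features concrete, the following is a minimal sketch of how several crowd scores for one lesion could be summarised by their mean (consensus) and by higher moments (disagreement). The function name `moment_features`, the example scores, and the choice of skewness and excess kurtosis as the "other distribution moments" are assumptions made for illustration; they are not taken from the paper's implementation.

```python
from statistics import fmean, pstdev


def moment_features(scores):
    """Summarise one lesion's crowd scores for a single visual characteristic.

    `scores` is a list of numeric ratings, one per annotator (hypothetical format).
    """
    n = len(scores)
    mean = fmean(scores)               # consensus feature
    std = pstdev(scores, mu=mean)      # disagreement: population standard deviation
    if std == 0:
        # All annotators agree; standardised moments are undefined,
        # so this sketch sets them to 0.0 by convention (an assumption).
        skew = 0.0
        kurt = 0.0
    else:
        z = [(x - mean) / std for x in scores]
        skew = sum(v ** 3 for v in z) / n          # third standardised moment
        kurt = sum(v ** 4 for v in z) / n - 3.0    # excess kurtosis
    return {"mean": mean, "std": std, "skewness": skew, "excess_kurtosis": kurt}


# Hypothetical ratings from five annotators for one characteristic of one lesion.
features = moment_features([1, 2, 2, 3, 2])
print(features)
```

A feature vector for a lesion could then be built by concatenating these summaries over all annotated characteristics, and classifiers trained on the mean-only columns could be compared with classifiers trained on the disagreement columns.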