Developing automatic segmentation techniques for medical imaging requires assessment metrics that can fairly judge and rank competing approaches on benchmarks. The Dice Similarity Coefficient (DSC) is a popular choice for measuring the agreement between a predicted segmentation and a ground-truth mask. However, DSC has been shown to be biased by the occurrence rate of the positive class in the ground truth, and hence should be considered in combination with other metrics. This work presents a detailed analysis of the recently proposed normalised Dice Similarity Coefficient (nDSC) for binary segmentation tasks, an adaptation of DSC that scales the precision at a fixed recall rate to tackle this bias. White matter lesion segmentation on magnetic resonance images of multiple sclerosis patients is selected as a case study to empirically assess the suitability of nDSC. We validate nDSC using two different models across 59 subject scans with a wide range of lesion loads. On standard white matter lesion segmentation benchmarks, nDSC is found to be less biased than DSC with respect to lesion load, as measured by standard rank correlation coefficients. An implementation of nDSC is available at https://github.com/NataliiaMolch/nDSC.
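The idea of rescaling false positives to a reference positive-class rate can be illustrated with a minimal sketch. The abstract does not state the formula, so the form used here is an assumption: nDSC = 2TP / (2TP + kappa * FP + FN), with kappa = ((1 - r) / r) * (P / N), where P and N are the positive and negative counts in the ground truth and r is a chosen reference positive fraction. The function names (`confusion_counts`, `dsc`, `ndsc`, `make_case`) and the toy numbers are hypothetical; the linked repository remains the authoritative implementation.

```python
from typing import List, Sequence, Tuple


def confusion_counts(truth: Sequence[int], pred: Sequence[int]) -> Tuple[int, int, int]:
    """Return (TP, FP, FN) for two equal-length flattened binary masks."""
    if len(truth) != len(pred):
        raise ValueError("masks must have the same length")
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    return tp, fp, fn


def dsc(truth: Sequence[int], pred: Sequence[int]) -> float:
    """Standard Dice Similarity Coefficient: 2TP / (2TP + FP + FN)."""
    tp, fp, fn = confusion_counts(truth, pred)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def ndsc(truth: Sequence[int], pred: Sequence[int], r: float = 0.001) -> float:
    """Normalised DSC under the ASSUMED formula described in the lead-in.

    r is the reference positive fraction (an assumed default, not taken
    from the abstract). False positives are rescaled by kappa so the score
    behaves as if the ground-truth positive fraction were r.
    """
    tp, fp, fn = confusion_counts(truth, pred)
    if tp + fp + fn == 0:
        return 1.0  # both masks are empty: treat as perfect agreement
    positives = tp + fn
    negatives = len(truth) - positives
    if positives == 0 or negatives == 0:
        kappa = 1.0  # degenerate ground truth: fall back to unscaled FP
    else:
        kappa = (1 - r) * positives / (r * negatives)
    return 2 * tp / (2 * tp + kappa * fp + fn)


def make_case(n_pos: int, n_neg: int, tp: int, fp: int) -> Tuple[List[int], List[int]]:
    """Build toy flattened masks with the requested TP and FP counts."""
    truth = [1] * n_pos + [0] * n_neg
    pred = [1] * tp + [0] * (n_pos - tp) + [1] * fp + [0] * (n_neg - fp)
    return truth, pred


# Two toy "scans" with the same recall (0.8) and false positive rate (0.1)
# but different lesion loads (10% vs 1% positive elements).
high_load = make_case(n_pos=100, n_neg=900, tp=80, fp=90)
low_load = make_case(n_pos=10, n_neg=990, tp=8, fp=99)

dsc_high, dsc_low = dsc(*high_load), dsc(*low_load)
ndsc_high, ndsc_low = ndsc(*high_load, r=0.1), ndsc(*low_load, r=0.1)
print(dsc_high, dsc_low)    # DSC drops as the lesion load drops
print(ndsc_high, ndsc_low)  # nDSC stays (numerically) the same
```

Under this assumed formula, a predictor with fixed recall and false positive rate receives the same nDSC regardless of lesion load, whereas its DSC falls as lesions become rarer. When the ground-truth positive fraction equals r, kappa is 1 and nDSC reduces to DSC. For image volumes, the masks would first be flattened to one dimension.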