Deep learning (DL) has shown great potential in digital pathology applications. The robustness of a diagnostic DL-based solution is essential for safe clinical deployment. In this work, we evaluate whether adding uncertainty estimates to DL predictions in digital pathology could increase their value for clinical applications, either by boosting general predictive performance or by detecting mispredictions. We compare the effectiveness of model-integrated methods (MC dropout and deep ensembles) with a model-agnostic approach (test-time augmentation, TTA). Moreover, four uncertainty metrics are compared. Our experiments focus on two domain shift scenarios: a shift to a different medical center and a shift to an underrepresented subtype of cancer. Our results show that uncertainty estimates increase reliability by reducing a model's sensitivity to the choice of classification threshold and by detecting between 70% and 90% of the mispredictions made by the model. Overall, the deep ensembles method achieved the best performance, closely followed by TTA.
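To make the idea of uncertainty-based misprediction detection concrete, here is a minimal sketch rather than the authors' implementation. It assumes a binary classifier whose positive-class probability has been collected over several stochastic passes (ensemble members, MC dropout samples, or augmented copies under TTA). The function names, the toy probabilities, the threshold of 0.5, and the two metrics used (entropy of the mean prediction and standard deviation across passes) are all illustrative assumptions; the abstract does not name the four metrics that were actually compared.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli distribution with parameter p."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def summarize_passes(probs):
    """Aggregate T stochastic passes over N samples.

    probs: array of shape (T, N) with positive-class probabilities, one row
    per ensemble member, MC dropout sample, or test-time augmentation.
    Returns the mean prediction and two illustrative uncertainty scores.
    """
    probs = np.asarray(probs, dtype=float)
    mean_prob = probs.mean(axis=0)       # averaged prediction per sample
    entropy = binary_entropy(mean_prob)  # entropy of the averaged prediction
    spread = probs.std(axis=0)           # disagreement between passes
    return mean_prob, entropy, spread


def flag_uncertain(uncertainty, threshold):
    """Mark samples whose uncertainty exceeds a (hypothetical) threshold."""
    return np.asarray(uncertainty) > threshold


# Toy data: 3 passes over 3 samples (values invented for illustration).
probs = np.array([
    [0.95, 0.55, 0.10],
    [0.90, 0.30, 0.05],
    [0.97, 0.70, 0.08],
])
mean_prob, entropy, spread = summarize_passes(probs)
flags = flag_uncertain(entropy, threshold=0.5)
print(mean_prob, entropy, spread, flags)
```

In this toy example only the second sample, on which the passes disagree, is flagged. In practice the flagged predictions could be routed to a pathologist for review, and the threshold would have to be tuned on validation data.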