Challenges have become the state-of-the-art approach to benchmarking image analysis algorithms in a comparative manner. While validation on identical data sets was a great step forward, the analysis of results is often restricted to pure ranking tables, leaving relevant questions unanswered. In particular, little effort has been put into the systematic investigation of what characterizes images on which state-of-the-art algorithms fail. To address this gap in the literature, we (1) present a statistical framework for learning from challenges and (2) instantiate it for the specific task of instrument instance segmentation in laparoscopic videos. Our framework relies on semantic metadata annotation of images, which serves as the foundation for a generalized linear mixed model (GLMM) analysis. Based on 51,542 metadata annotations performed on 2,728 images, we applied our approach to the results of the Robust Medical Instrument Segmentation (ROBUST-MIS) challenge 2019 and identified underexposure, motion and occlusion of instruments, as well as the presence of smoke or other objects in the background, as major sources of algorithm failure. Our subsequent method development, tailored to the specific remaining issues, yielded a deep learning model with state-of-the-art overall performance and specific strengths in the processing of images on which previous methods tended to fail, such as the segmentation of small, crossing, moving and transparent instrument(s) (parts). Due to the objectivity and generic applicability of our approach, it could become a valuable tool for validation in the field of medical image analysis and beyond.
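The full GLMM analysis is beyond the scope of an abstract, but its core idea, modeling the probability of algorithm failure as a function of semantic image properties, can be sketched with a plain fixed-effects logistic model on synthetic data. Everything here is illustrative: the property names, effect sizes, and failure rates are invented, and the actual framework additionally uses random effects (e.g. per patient or video) that a simple logistic fit omits.

```python
import math
import random

random.seed(0)

def simulate(n=2000):
    """Synthetic stand-in for challenge metadata: each image has binary
    semantic properties (hypothetical names) and a binary failure outcome."""
    rows = []
    for _ in range(n):
        underexposed = random.random() < 0.3
        smoke = random.random() < 0.2
        # Invented "true" effects: both properties raise the failure odds.
        logit = -2.0 + 1.5 * underexposed + 1.0 * smoke
        p = 1 / (1 + math.exp(-logit))
        failed = random.random() < p
        rows.append(([1.0, float(underexposed), float(smoke)], float(failed)))
    return rows

def fit_logistic(rows, lr=0.1, epochs=200):
    """Fit a logistic regression by batch gradient ascent on the
    log-likelihood (fixed effects only; no random effects)."""
    w = [0.0, 0.0, 0.0]  # intercept, underexposed, smoke
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for x, y in rows:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1 / (1 + math.exp(-z))
            for i in range(len(w)):
                grad[i] += (y - p) * x[i]
        for i in range(len(w)):
            w[i] += lr * grad[i] / len(rows)
    return w

w = fit_logistic(simulate())
# Positive coefficients flag image properties associated with failure.
print({"intercept": round(w[0], 2),
       "underexposed": round(w[1], 2),
       "smoke": round(w[2], 2)})
```

In the framework proper, such coefficients (with confidence intervals from the mixed-model fit) are what single out underexposure, motion, occlusion and smoke as significant failure sources, rather than eyeballing ranking tables.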