Image-based tracking of laparoscopic instruments plays a fundamental role in computer- and robot-assisted surgery by aiding surgeons and increasing patient safety. Computer vision contests, such as the Robust Medical Instrument Segmentation (ROBUST-MIS) Challenge, seek to encourage the development of robust models for such purposes by providing large, diverse, and annotated datasets. To date, most existing models for instance segmentation of medical instruments have been based on two-stage detectors, which provide robust results but are far from real-time, running at 5 frames per second (fps) at most. However, for a method to be clinically applicable, real-time capability is essential along with high accuracy. In this paper, we propose adding attention mechanisms to the YOLACT architecture, enabling real-time instance segmentation of instruments with improved accuracy on the ROBUST-MIS dataset. Our proposed approach achieves performance competitive with the winner of the 2019 ROBUST-MIS Challenge in terms of robustness scores, obtaining 0.313 MI_DSC and 0.338 MI_NSD, while running in real time (37 fps).
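To make the architectural change concrete, below is a minimal PyTorch sketch of a CBAM-style attention block of the kind that could refine a YOLACT feature map before its prediction heads. The `AttentionBlock` name, the channel/spatial ordering, and the placement on a 256-channel FPN level are illustrative assumptions, not the authors' exact module.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze global avg/max-pooled descriptors through a shared MLP (CBAM-style)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return torch.sigmoid(avg + mx)  # per-channel weights in [0, 1]


class SpatialAttention(nn.Module):
    """A 7x7 conv over channel-pooled maps yields a per-pixel attention mask."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)
        mx, _ = torch.max(x, dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))


class AttentionBlock(nn.Module):
    """Refine a feature map with channel attention followed by spatial attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        x = x * self.ca(x)
        return x * self.sa(x)


if __name__ == "__main__":
    # Hypothetical usage: refine one 256-channel FPN level of a YOLACT backbone.
    feat = torch.randn(1, 256, 69, 69)
    refined = AttentionBlock(256)(feat)
    print(refined.shape)  # torch.Size([1, 256, 69, 69])
```

Because the block only adds two small pooled-feature convolutions per level, it preserves the single-stage, real-time character of YOLACT while reweighting features, which is consistent with the speed/accuracy trade-off the abstract reports.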