Colonoscopy is a standard imaging tool for visualizing the gastrointestinal (GI) tract of patients and capturing lesion areas. However, clinicians must spend excessive time reviewing the large number of images extracted from colonoscopy videos. Automatic detection of anatomical landmarks within the colon is therefore in high demand, since it can reduce the burden on clinicians by providing guidance on the locations of lesion areas. In this article, we propose a novel deep learning-based approach to detect anatomical landmarks in colonoscopy videos. First, raw colonoscopy video sequences are pre-processed to reject interference frames. Second, a ResNet-101-based network detects each of three anatomical landmarks separately, yielding intermediate detection results. Third, to localize the landmark periods within the whole video more reliably, we post-process the intermediate detection results by identifying incorrectly predicted frames from their temporal distribution and reassigning them to the correct class. The average detection accuracy reaches 99.75\%, and the average intersection over union (IoU) of 0.91 shows a high degree of similarity between our predicted landmark periods and the ground truth. The experimental results demonstrate that the proposed model can accurately detect and localize anatomical landmarks in colonoscopy videos.
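The temporal post-processing step and the IoU measure over landmark periods can be illustrated with a small sketch. The abstract does not specify the exact rule used to identify and reassign mispredicted frames, so the sliding-window majority vote below is a stand-in assumption, not the authors' method. The function names (`smooth_labels`, `landmark_period`, `temporal_iou`), the window size, and the convention that a landmark period is the inclusive span from the first to the last frame carrying that label are likewise hypothetical.

```python
from collections import Counter


def smooth_labels(labels, window=5):
    """Sliding-window majority vote over per-frame class labels.

    Assumption: isolated frames that disagree with their temporal
    neighbourhood are treated as mispredictions and relabelled.
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        counts = Counter(labels[lo:hi])  # votes are taken from the raw labels
        best = max(counts.values())
        if counts[labels[i]] == best:
            # Keep the original label when it wins or ties the vote.
            smoothed.append(labels[i])
        else:
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed


def landmark_period(labels, target):
    """Return the inclusive (first, last) frame index labelled `target`."""
    idx = [i for i, lab in enumerate(labels) if lab == target]
    return (idx[0], idx[-1]) if idx else None


def temporal_iou(pred, truth):
    """Intersection over union of two inclusive frame intervals."""
    inter = max(0, min(pred[1], truth[1]) - max(pred[0], truth[0]) + 1)
    union = (pred[1] - pred[0] + 1) + (truth[1] - truth[0] + 1) - inter
    return inter / union


# Toy per-frame predictions: 1 marks the landmark, 0 marks everything else.
raw = [0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0]
smoothed = smooth_labels(raw, window=5)
period = landmark_period(smoothed, target=1)
print(smoothed, period, temporal_iou(period, (6, 11)))
```

On this toy sequence, the isolated positive at frame 3 and the isolated negative at frame 9 are both relabelled, leaving one contiguous landmark period that can be compared against an annotated interval. A larger window suppresses more noise but also blurs the period boundaries, so the window size would need tuning against annotated videos.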