Minimally invasive surgery (MIS) has many documented advantages, but the surgeon's limited visual contact with the scene can be problematic. Hence, systems that help surgeons navigate, such as a method that produces a 3D semantic map, can compensate for this limitation. In principle, we can borrow 3D semantic mapping techniques developed for robotics, but doing so requires solving the following challenges in MIS: 1) semantic segmentation, 2) depth estimation, and 3) pose estimation. In this paper, we propose the first 3D semantic mapping system for knee arthroscopy that addresses the three challenges above. Using out-of-distribution non-human datasets, where pose could be labeled, we jointly train depth and pose estimators with self-supervised and supervised losses. Using an in-distribution human knee dataset, we train a fully supervised semantic segmentation system to label arthroscopic image pixels as femur, ACL, or meniscus. Taking test images from human knees, we combine the results from these two systems to automatically create 3D semantic maps of the human knee. This work opens a pathway to intraoperative 3D semantic mapping, registration with pre-operative data, and robotic-assisted arthroscopy.
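The joint training of the depth and pose estimators can be sketched as a weighted sum of a self-supervised photometric reconstruction term and a supervised pose-regression term. This is a minimal illustration under assumed loss forms and weighting, not the paper's exact formulation:

```python
import numpy as np

def photometric_loss(target, reconstructed):
    # Self-supervised term: mean absolute error between the target frame
    # and the frame re-synthesized from the predicted depth and pose.
    return np.mean(np.abs(target - reconstructed))

def pose_loss(pred_pose, gt_pose):
    # Supervised term: mean squared error against ground-truth pose,
    # available only on the labeled non-human dataset.
    return np.mean((pred_pose - gt_pose) ** 2)

def joint_loss(target, reconstructed, pred_pose, gt_pose, w_pose=0.1):
    # Weighted combination driving the joint depth+pose training.
    # w_pose is a hypothetical balancing weight.
    return photometric_loss(target, reconstructed) + w_pose * pose_loss(pred_pose, gt_pose)
```

In such schemes, the photometric term applies to every frame pair, while the pose term contributes only where pose labels exist, which is what allows mixing the labeled non-human data with unlabeled imagery.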