Reconstructing soft tissue in robotic surgery from endoscopic stereo video is important for applications such as intra-operative navigation and image-guided robotic surgery automation. Previous work on this task relies mainly on SLAM-based approaches, which struggle to handle complex surgical scenes. Inspired by recent progress in neural rendering, we present a novel framework for deformable tissue reconstruction from binocular captures in robotic surgery under the single-viewpoint setting. Our framework adopts dynamic neural radiance fields to represent deformable surgical scenes in MLPs and optimizes shapes and deformations in a learning-based manner. Beyond non-rigid deformation, tool occlusion and poor 3D cues from a single viewpoint are particular challenges in soft tissue reconstruction. To overcome these difficulties, we present three strategies: tool mask-guided ray casting, stereo depth-cueing ray marching, and stereo depth-supervised optimization. In experiments on DaVinci robotic surgery videos, our method significantly outperforms the current state-of-the-art reconstruction method in handling various complex non-rigid deformations. To the best of our knowledge, this is the first work to leverage neural rendering for 3D reconstruction of surgical scenes, and it demonstrates remarkable potential. Code is available at: https://github.com/med-air/EndoNeRF.
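The three strategies named in the abstract can be illustrated with a small NumPy sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the function names are hypothetical, rays are drawn uniformly from pixels not covered by the tool mask, sample depths are drawn from a Gaussian around the stereo depth estimate, and the depth term is a plain L1 penalty. It is not the code from the EndoNeRF repository.

```python
import numpy as np

def sample_tissue_pixels(tool_mask, n_rays, rng):
    """Mask-guided ray casting (assumed form): pick pixels where the
    tool mask is 0, i.e. where tissue is visible."""
    rows, cols = np.nonzero(tool_mask == 0)
    # Sample with replacement only if there are fewer free pixels than rays.
    idx = rng.choice(rows.size, size=n_rays, replace=rows.size < n_rays)
    return np.stack([rows[idx], cols[idx]], axis=1)

def sample_depths_near_surface(stereo_depth, n_samples, sigma, near, far, rng):
    """Depth-cued ray marching (assumed form): draw per-ray sample depths
    from a Gaussian centred on the stereo depth, clipped to [near, far]."""
    t = rng.normal(loc=stereo_depth[:, None], scale=sigma,
                   size=(stereo_depth.size, n_samples))
    return np.sort(np.clip(t, near, far), axis=1)

def render_depth(density, t):
    """Expected depth per ray from standard volume-rendering weights."""
    # Distance between consecutive samples; the last interval is open-ended.
    deltas = np.diff(t, axis=1, append=t[:, -1:] + 1e10)
    alpha = 1.0 - np.exp(-density * deltas)
    # Transmittance: probability the ray reaches each sample unoccluded.
    ones = np.ones((t.shape[0], 1))
    trans = np.cumprod(
        np.concatenate([ones, 1.0 - alpha[:, :-1] + 1e-10], axis=1), axis=1)
    weights = alpha * trans
    return np.sum(weights * t, axis=1)

def total_loss(pred_rgb, gt_rgb, pred_depth, stereo_depth, depth_weight=1.0):
    """Depth-supervised objective (assumed form): colour MSE plus a
    weighted L1 penalty against the stereo depth estimate."""
    color_term = np.mean((pred_rgb - gt_rgb) ** 2)
    depth_term = np.mean(np.abs(pred_depth - stereo_depth))
    return color_term + depth_weight * depth_term
```

In a real pipeline the densities and colours would come from the radiance-field MLPs and the loss would be minimized with an autodiff framework; the sketch only shows how the mask, the stereo depth, and the rendering weights could fit together.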