Accurate and robust tracking and reconstruction of the surgical scene are critical enabling technologies for autonomous robotic surgery. Existing algorithms for 3D perception in surgery rely mainly on geometric information, whereas we propose to also leverage semantic information inferred from the endoscopic video with image segmentation algorithms. In this paper, we present a novel, comprehensive surgical perception framework, Semantic-SuPer, that integrates geometric and semantic information to facilitate data association, 3D reconstruction, and tracking of endoscopic scenes, benefiting downstream tasks such as surgical navigation. The proposed framework is demonstrated on challenging endoscopic data with deforming tissue, and shows advantages over our baseline and several other state-of-the-art approaches. Our code and dataset are available at https://github.com/ucsdarclab/Python-SuPer.
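To make the core idea concrete, the sketch below illustrates one simple way semantic labels can gate geometric data association: a tracked surfel is matched to a pixel of the incoming frame only if it both projects close to that pixel's back-projected depth and carries the same semantic class. This is a minimal, hypothetical example, assuming per-pixel labels from a segmentation network and a depth map; the function and variable names are illustrative and do not reflect the actual Semantic-SuPer implementation in the linked repository.

```python
# Minimal sketch of semantics-guided data association (illustrative only).
import numpy as np

def associate_surfels(surfel_xyz, surfel_label, depth, seg, K, max_dist=0.01):
    """Match tracked surfels to pixels of the incoming frame.

    A surfel yields a correspondence only if (a) it projects into the image,
    (b) the back-projected pixel lies within `max_dist` meters of it (the
    geometric cue), and (c) the pixel's semantic label agrees with the
    surfel's (the semantic cue that prunes matches across tissue types).
    """
    H, W = depth.shape
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    matches = []
    for i, (p, lbl) in enumerate(zip(surfel_xyz, surfel_label)):
        if p[2] <= 0:                              # surfel behind the camera
            continue
        u = int(round(fx * p[0] / p[2] + cx))      # perspective projection
        v = int(round(fy * p[1] / p[2] + cy))
        if not (0 <= u < W and 0 <= v < H):
            continue
        z = depth[v, u]
        if z <= 0:                                 # invalid depth reading
            continue
        # Back-project the observed pixel to a 3D point in the camera frame.
        q = np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])
        if seg[v, u] == lbl and np.linalg.norm(p - q) < max_dist:
            matches.append((i, u, v))
    return matches

if __name__ == "__main__":
    # Synthetic example: a flat scene at 10 cm with a single semantic class.
    K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
    depth = np.full((480, 640), 0.1)
    seg = np.zeros((480, 640), dtype=int)
    surfels = np.array([[0.0, 0.0, 0.1], [0.02, 0.0, 0.1]])
    labels = np.array([0, 1])   # second surfel carries a mismatched label
    print(associate_surfels(surfels, labels, depth, seg, K))
    # -> [(0, 320, 240)]: the label mismatch rejects the second surfel even
    #    though it is geometrically a perfect match.
```

The design point the sketch captures is that semantic agreement acts as an inexpensive filter on top of the geometric residual, so correspondences across different tissue or instrument classes are rejected before they can corrupt tracking.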