We present PVO, a novel panoptic visual odometry framework that models scene motion, geometry, and panoptic segmentation more comprehensively. PVO treats visual odometry (VO) and video panoptic segmentation (VPS) in a unified view, so that the two tasks benefit each other. Specifically, we introduce a panoptic update module into the VO Module, guided by image panoptic segmentation. This Panoptic-Enhanced VO Module uses a panoptic-aware dynamic mask to reduce the impact of dynamic objects on camera pose estimation. Conversely, the VO-Enhanced VPS Module improves segmentation accuracy by fusing the panoptic segmentation result of the current frame into adjacent frames on the fly, using geometric information from the VO Module such as camera pose, depth, and optical flow. The two modules reinforce each other through recurrent iterative optimization. Extensive experiments demonstrate that PVO outperforms state-of-the-art methods on both visual odometry and video panoptic segmentation.
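To make the two geometric ideas above concrete, the following is a minimal NumPy sketch, not PVO's actual implementation. It assumes a pinhole camera with intrinsics `K`, a per-pixel depth map, and a known 4x4 relative pose. The function names `static_weights` and `warp_labels`, the `UNKNOWN` marker, and the hard 0/1 weighting are all hypothetical choices made for illustration; the abstract does not specify how the dynamic mask is computed or how labels are fused.

```python
import numpy as np

UNKNOWN = -1  # hypothetical marker for target pixels that receive no label


def static_weights(labels, dynamic_ids):
    """Return 1.0 for static pixels and 0.0 for pixels of dynamic instances.

    A hard mask is an assumption of this sketch; it only illustrates how
    panoptic labels could down-weight dynamic objects in pose estimation.
    """
    return (~np.isin(labels, list(dynamic_ids))).astype(np.float64)


def warp_labels(labels, depth, K, T_adj_from_cur):
    """Forward-project panoptic labels from the current frame to an adjacent one.

    labels: (H, W) signed integer panoptic ids of the current frame
    depth: (H, W) depth of the current frame
    K: (3, 3) pinhole intrinsics
    T_adj_from_cur: (4, 4) rigid transform from current to adjacent camera
    """
    h, w = labels.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)]).astype(np.float64)

    # Back-project pixels to 3D points in the current camera frame.
    pts = (np.linalg.inv(K) @ pix) * depth.ravel()
    # Move the points into the adjacent camera frame.
    pts_adj = T_adj_from_cur[:3, :3] @ pts + T_adj_from_cur[:3, 3:4]

    z = pts_adj[2]
    front = z > 1e-6  # keep only points in front of the adjacent camera
    proj = K @ pts_adj[:, front]
    u2 = np.round(proj[0] / proj[2]).astype(int)
    v2 = np.round(proj[1] / proj[2]).astype(int)
    src = labels.ravel()[front]
    zf = z[front]

    inside = (u2 >= 0) & (u2 < w) & (v2 >= 0) & (v2 < h)
    u2, v2, src, zf = u2[inside], v2[inside], src[inside], zf[inside]

    # Write far points first so that nearer points overwrite them.
    order = np.argsort(-zf)
    warped = np.full((h, w), UNKNOWN, dtype=np.int64)
    warped[v2[order], u2[order]] = src[order]
    return warped
```

In this sketch, warped labels could then be compared or merged with the adjacent frame's own segmentation, and the static weights could multiply the per-pixel residuals of a pose solver. Both uses are illustrative only.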