We present a novel dual-flow representation of scene motion that decomposes the optical flow into a static flow field caused by camera motion and a dynamic flow field caused by the movements of objects in the scene. Based on this representation, we present a dynamic SLAM system, dubbed DeFlowSLAM, that exploits both static and dynamic pixels in the images to solve for the camera poses, rather than using only static background pixels as other dynamic SLAM systems do. We propose a dynamic update module to train DeFlowSLAM in a self-supervised manner, in which a dense bundle adjustment layer takes in the estimated static flow fields and the weights controlled by the dynamic mask, and outputs the residual of the optimized static flow fields, camera poses, and inverse depths. The static and dynamic flow fields are estimated by warping the current image to the neighboring images, and the optical flow is obtained by summing the two fields. Extensive experiments demonstrate that DeFlowSLAM generalizes well to both static and dynamic scenes: it exhibits performance comparable to the state-of-the-art DROID-SLAM in static and less dynamic scenes, while significantly outperforming DROID-SLAM in highly dynamic environments. Code and data are available on the project webpage: https://zju3dv.github.io/deflowslam/.
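The dual-flow decomposition described above can be illustrated with a minimal NumPy sketch. It is not the DeFlowSLAM implementation: it assumes a pinhole camera with known intrinsics, a known relative pose, and a known inverse depth map, whereas the paper estimates these quantities with a learned update module and a dense bundle adjustment layer. The function names `static_flow` and `split_flow` are hypothetical and introduced only for illustration.

```python
import numpy as np


def static_flow(inv_depth, K, R, t):
    """Flow induced by camera motion alone (illustrative helper, not from the paper).

    inv_depth: (H, W) inverse depth of the current frame.
    K:         (3, 3) pinhole intrinsics (assumed known here).
    R, t:      rotation (3, 3) and translation (3,) mapping points from the
               current camera frame to the neighboring camera frame.
    Returns an (H, W, 2) flow field in pixels.
    """
    h, w = inv_depth.shape
    u, v = np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)  # homogeneous pixels
    rays = pix @ np.linalg.inv(K).T                   # back-projected rays, z = 1
    # A 3D point is rays / inv_depth. Multiplying the transformed point by
    # inv_depth leaves its projection unchanged and avoids the division.
    pts = rays @ R.T + inv_depth[..., None] * t
    proj = pts @ K.T
    uv = proj[..., :2] / proj[..., 2:3]
    return uv - pix[..., :2]


def split_flow(optical_flow, inv_depth, K, R, t):
    """Split optical flow into static and dynamic parts: optical = static + dynamic."""
    f_static = static_flow(inv_depth, K, R, t)
    f_dynamic = optical_flow - f_static
    return f_static, f_dynamic
```

In this sketch the dynamic field is simply the residual left after removing camera-induced motion, so a pixel on a rigid, static surface has a dynamic flow close to zero, while a pixel on a moving object does not. The paper instead predicts both fields and uses a dynamic mask to weight the bundle adjustment, so this should be read as an illustration of the representation only.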