Scene flow represents the 3D motion of each point in a scene, explicitly describing the distance and direction of each point's movement. Scene flow estimation is used in applications such as autonomous driving, activity recognition, and virtual reality. Because annotating real-world data with ground-truth scene flow is challenging, no real-world dataset provides a large amount of data with ground truth for scene flow estimation. Many works therefore pre-train their networks on synthesized data and fine-tune them on real-world LiDAR data. Unlike previous unsupervised learning of scene flow in point clouds, we propose to use odometry information to assist the unsupervised learning of scene flow, and we train our network on real-world LiDAR data. Supervised odometry provides a more accurate shared cost volume for scene flow. In addition, the proposed network has mask-weighted warp layers that yield a more accurate predicted point cloud. The warp operation applies an estimated pose transformation or scene flow to a source point cloud to obtain a predicted point cloud, and it is the key to refining scene flow from coarse to fine. When warping, points in different states use different weights for the pose transformation and the scene flow transformation. We classify points as static, dynamic, or occluded: static masks separate static points from dynamic ones, and occlusion masks identify occluded points. In the mask-weighted warp layer, the static masks and occlusion masks serve as weights during the warp operation. Ablation experiments demonstrate the effectiveness of these designs. The experimental results show the promise of odometry-assisted unsupervised learning of 3D scene flow on real-world data.
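To make the mask-weighted warp described above more concrete, the following is a minimal NumPy sketch, not the network's actual layer. The abstract does not give the weighting formula, so the blending rule here is an assumption: static or occluded points follow the estimated pose transformation, while visible dynamic points follow the estimated scene flow. The function name `mask_weighted_warp`, its parameters, and the mask conventions (static mask 1 = static, occlusion mask 1 = visible) are all hypothetical.

```python
import numpy as np


def mask_weighted_warp(points, rotation, translation, flow,
                       static_mask, occlusion_mask):
    """Blend a pose warp and a scene-flow warp per point (illustrative only).

    points:         (N, 3) source point cloud
    rotation:       (3, 3) estimated rotation matrix
    translation:    (3,)   estimated translation vector
    flow:           (N, 3) estimated scene flow per point
    static_mask:    (N,)   values in [0, 1]; 1 means static (assumed convention)
    occlusion_mask: (N,)   values in [0, 1]; 1 means visible (assumed convention)
    """
    # Warp every point with the rigid pose transformation.
    pose_warped = points @ rotation.T + translation
    # Warp every point with its own scene flow vector.
    flow_warped = points + flow
    # Assumed rule: only visible, dynamic points trust the scene flow.
    w_flow = ((1.0 - static_mask) * occlusion_mask)[:, None]
    return (1.0 - w_flow) * pose_warped + w_flow * flow_warped


points = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
rotation = np.eye(3)
translation = np.array([1.0, 0.0, 0.0])
flow = np.array([[5.0, 5.0, 5.0], [0.0, 2.0, 0.0]])
static_mask = np.array([1.0, 0.0])     # first point static, second dynamic
occlusion_mask = np.array([1.0, 1.0])  # both points visible

predicted = mask_weighted_warp(points, rotation, translation, flow,
                               static_mask, occlusion_mask)
```

In this toy input the static point moves only with the pose transformation and the dynamic point moves only with its flow vector. Soft mask values between 0 and 1 would interpolate between the two warps, which is one plausible reading of using the masks "as weights".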