Scene flow estimation is a long-standing problem in computer vision, where the goal is to find the 3D motion of a scene from its consecutive observations. Recently, there have been efforts to compute the scene flow from 3D point clouds. A common approach is to train a regression model that consumes source and target point clouds and outputs the per-point translation vectors. An alternative is to learn point matches between the point clouds concurrently with regressing a refinement of the initial correspondence flow. In both cases, the learning task is very challenging since the flow regression is done in the free 3D space, and a typical solution is to resort to a large annotated synthetic dataset. We introduce SCOOP, a new method for scene flow estimation that can be learned on a small amount of data without employing ground-truth flow supervision. In contrast to previous work, we train a pure correspondence model focused on learning point feature representation and initialize the flow as the difference between a source point and its softly corresponding target point. Then, in the run-time phase, we directly optimize a flow refinement component with a self-supervised objective, which leads to a coherent and accurate flow field between the point clouds. Experiments on widespread datasets demonstrate the performance gains achieved by our method compared to existing leading techniques while using a fraction of the training data. Our code is publicly available at https://github.com/itailang/SCOOP.
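To make the correspondence-based flow initialization concrete, the sketch below shows one way it could look in PyTorch. This is a minimal illustration, not the authors' actual implementation: the function name, the use of cosine similarity, and the temperature value are assumptions introduced here for clarity.

```python
import torch

def soft_correspondence_flow(src_xyz, tgt_xyz, src_feat, tgt_feat, temperature=0.03):
    """Initialize scene flow from soft point correspondences (illustrative sketch).

    src_xyz:  (N, 3) source point coordinates
    tgt_xyz:  (M, 3) target point coordinates
    src_feat: (N, C) learned per-point features of the source cloud
    tgt_feat: (M, C) learned per-point features of the target cloud
    """
    # Similarity between every source and target feature vector (cosine similarity assumed).
    src_n = torch.nn.functional.normalize(src_feat, dim=1)   # (N, C)
    tgt_n = torch.nn.functional.normalize(tgt_feat, dim=1)   # (M, C)
    sim = src_n @ tgt_n.t()                                  # (N, M)

    # Soft matching weights over the target points (softmax with an assumed temperature).
    weights = torch.softmax(sim / temperature, dim=1)        # (N, M)

    # Softly corresponding target point for each source point.
    soft_tgt = weights @ tgt_xyz                             # (N, 3)

    # Initial flow: difference between the soft match and the source point.
    return soft_tgt - src_xyz
```

On top of such an initialization, the method described above would then optimize a flow refinement component at run time with a self-supervised objective to obtain the final coherent flow field.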