Autonomous vehicles operate in highly dynamic environments, which requires an accurate assessment of which parts of a scene are moving and where they are moving to. A popular approach to 3D motion estimation, termed scene flow, employs 3D point cloud data from consecutive LiDAR scans, although such approaches have been limited by the small size of real-world, annotated LiDAR datasets. In this work, we introduce a new large-scale benchmark for scene flow based on the Waymo Open Dataset. The dataset is $\sim$1,000$\times$ larger than previous real-world datasets in terms of the number of annotated frames and is derived from the corresponding tracked 3D objects. We demonstrate how previous works were limited by the amount of real LiDAR data available, suggesting that larger datasets are required to achieve state-of-the-art predictive performance. Furthermore, we show that previous heuristics for operating on point clouds, such as artificial down-sampling, heavily degrade performance, motivating a new class of models that are tractable on the full point cloud. To address this issue, we introduce the FastFlow3D model architecture, which provides real-time inference on the full point cloud. Finally, we demonstrate that this problem is amenable to techniques from semi-supervised learning by highlighting open problems in generalizing motion prediction methods to unlabeled objects. We hope that this dataset may provide new opportunities for developing real-world scene flow systems and motivate a new class of machine learning problems.