Exploiting past 3D LiDAR scans to predict future point clouds is a promising approach for autonomous mobile systems to realize foresighted state estimation, collision avoidance, and planning. In this paper, we address the problem of predicting future 3D LiDAR point clouds given a sequence of past LiDAR scans. Estimating the future scene at the sensor level does not require preceding steps such as localization or tracking, and the model can be trained in a self-supervised manner. We propose an end-to-end approach that exploits a 2D range image representation of each 3D LiDAR scan and concatenates a sequence of range images to obtain a 3D tensor. Based on such tensors, we develop an encoder-decoder architecture using 3D convolutions to jointly aggregate spatial and temporal information of the scene and to predict the future 3D point clouds. We evaluate our method on multiple datasets, and the experimental results suggest that it outperforms existing point cloud prediction architectures and generalizes well to new, unseen environments without additional fine-tuning. Our method operates online and is faster than the common LiDAR frame rate of 10 Hz.
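The range image representation mentioned above is commonly obtained by a spherical projection of each scan. A minimal sketch of this preprocessing step is shown below; the field-of-view bounds, image resolution, and function name are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def points_to_range_image(points, h=64, w=2048, fov_up=3.0, fov_down=-25.0):
    """Project an (N, 3) LiDAR point cloud onto an (h, w) range image
    via spherical projection. FOV bounds (degrees) are illustrative values
    typical for a 64-beam sensor, not the paper's exact settings."""
    fov_up_r = np.radians(fov_up)
    fov_down_r = np.radians(fov_down)
    fov = fov_up_r - fov_down_r

    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                       # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(r, 1e-8))   # elevation angle

    u = 0.5 * (1.0 - yaw / np.pi) * w                # column from azimuth
    v = (1.0 - (pitch - fov_down_r) / fov) * h       # row from elevation
    u = np.clip(np.floor(u), 0, w - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, h - 1).astype(np.int32)

    img = np.full((h, w), -1.0, dtype=np.float32)    # -1 marks empty pixels
    # Fill far-to-near so nearer points overwrite farther ones
    # when several points fall into the same pixel.
    order = np.argsort(r)[::-1]
    img[v[order], u[order]] = r[order]
    return img

# Stack T consecutive range images into a (T, h, w) tensor, the input
# shape over which 3D convolutions can aggregate space and time jointly.
scans = [np.random.rand(1000, 3) * 50.0 for _ in range(5)]
tensor = np.stack([points_to_range_image(s) for s in scans])
```

Predicted range images can be re-projected into 3D point clouds by inverting this mapping, which is what makes prediction on the range image level equivalent to prediction on the point cloud level.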