In the past few years, deep learning methods have driven great advances in 3D object detection. However, these methods typically rely on large amounts of high-quality labels to perform well, and such labels often require time-consuming and expensive work by human annotators. To address this, we propose an automatic annotation pipeline that generates accurate object trajectories in 3D space (i.e., 4D labels) from LiDAR point clouds. Unlike previous works that consider one frame at a time, our approach operates directly on sequential point clouds to combine richer object observations. The key idea is to decompose the 4D label into two parts: the 3D size of the object, and its motion path, which describes the evolution of the object's pose through time. More specifically, given a noisy but easy-to-obtain object track as initialization, our model first estimates the object size from temporally aggregated observations, and then refines the motion path by considering both frame-wise observations and temporal motion cues. We validate the proposed method on a large-scale driving dataset and show that it achieves significant improvements over the baselines. We also showcase the benefits of our approach in the annotator-in-the-loop setting.
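To make the decomposition concrete, the following is a minimal sketch of why temporally aggregated observations help with size estimation. It is not the proposed model: the abstract describes a learned estimator, whereas this sketch substitutes a naive bounding-extent computation. The function names (`to_object_frame`, `aggregate_observations`, `naive_size`) and the pose convention `(x, y, yaw)` are assumptions made for illustration only.

```python
import numpy as np


def to_object_frame(points, pose):
    """Map (N, 3) world-frame points into the object frame of one track pose.

    pose is (x, y, yaw): an assumed bird's-eye-view convention, not the paper's.
    """
    x, y, yaw = pose
    c, s = np.cos(yaw), np.sin(yaw)
    # Object-to-world rotation about the vertical axis.
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    # World-to-object for row vectors: (p - t) @ R, equivalent to R^T (p - t).
    return (points - np.array([x, y, 0.0])) @ rot


def aggregate_observations(frames, poses):
    """Stack the points of every frame in a shared object-centric frame."""
    return np.concatenate(
        [to_object_frame(pts, pose) for pts, pose in zip(frames, poses)], axis=0
    )


def naive_size(aggregated):
    """Axis-aligned extent of the aggregated points (stand-in for a learned estimator)."""
    return aggregated.max(axis=0) - aggregated.min(axis=0)
```

In this sketch, a frame that sees only one side of an object yields a degenerate extent, while points aggregated along the track recover the full extent, provided the initial track poses are reasonably accurate. Noise in the initial track blurs the aggregate, which is one reason the motion path is refined as a separate second step.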