Recent 3D multi-object tracking (MOT) methods tend toward one of two extremes. Many prioritize tracking accuracy and neglect computation speed, typically by designing complex cost functions and feature extractors, while others pursue computation speed at the expense of tracking accuracy. To address both issues, this paper proposes a robust and fast camera-LiDAR fusion-based MOT method that achieves a good trade-off between accuracy and speed. Exploiting the characteristics of camera and LiDAR sensors, we design an effective deep association mechanism and embed it in the proposed method. When an object is far away and detected only by the camera, the mechanism tracks it in the 2D domain; once the object enters the LiDAR field of view, the mechanism updates the 2D trajectory with the newly available 3D information, yielding a smooth fusion of the 2D and 3D trajectories. Extensive experiments on typical datasets indicate that the proposed method offers clear advantages over state-of-the-art MOT methods in both tracking accuracy and processing speed. Our code is publicly available for the benefit of the community.
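To make the 2D-to-3D handover concrete, the following is a minimal sketch and not the authors' implementation. It assumes a greedy 2D IoU matcher in place of the deep association mechanism, whose details the abstract does not give. All names (`Detection`, `Track`, `iou_2d`, `step`) and the `iou_threshold` value are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box2D = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels
Box3D = Tuple[float, float, float, float, float, float, float]  # (x, y, z, l, w, h, yaw)


@dataclass
class Detection:
    box2d: Box2D
    box3d: Optional[Box3D] = None  # None when only the camera sees the object


@dataclass
class Track:
    track_id: int
    box2d: Box2D
    box3d: Optional[Box3D] = None  # stays None while the track is 2D-only
    history: List[str] = field(default_factory=list)  # "2D" or "3D" per update

    def update(self, det: Detection) -> None:
        """Always refresh the 2D box; attach 3D information when it exists."""
        self.box2d = det.box2d
        if det.box3d is not None:
            self.box3d = det.box3d  # 2D trajectory is upgraded with 3D info
        self.history.append("3D" if det.box3d is not None else "2D")


def iou_2d(a: Box2D, b: Box2D) -> float:
    """Intersection over union of two axis-aligned image boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def step(tracks: List[Track], detections: List[Detection],
         next_id: int, iou_threshold: float = 0.3) -> int:
    """Process one frame in place and return the next unused track id.

    Assumption: greedy matching on 2D IoU, highest score first. Unmatched
    tracks are simply kept; track deletion is omitted for brevity.
    """
    scored = sorted(
        ((iou_2d(t.box2d, d.box2d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    used_tracks, used_dets = set(), set()
    for score, ti, di in scored:
        if score < iou_threshold:
            break  # remaining pairs overlap even less
        if ti in used_tracks or di in used_dets:
            continue
        tracks[ti].update(detections[di])
        used_tracks.add(ti)
        used_dets.add(di)

    # Every unmatched detection starts a new track (2D-only or 3D).
    for di, det in enumerate(detections):
        if di not in used_dets:
            track = Track(track_id=next_id, box2d=det.box2d)
            track.update(det)
            tracks.append(track)
            next_id += 1
    return next_id
```

In this sketch a distant object first creates a track with `box3d` left as `None`. A later detection that overlaps it in the image and carries a 3D box updates the same track rather than starting a new one, so the track identity is preserved across the 2D-to-3D transition.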