3D object tracking is a critical task in autonomous driving systems. It plays an essential role in the system's awareness of the surrounding environment. At the same time, there is increasing interest in algorithms for autonomous cars that rely solely on inexpensive sensors, such as cameras. In this paper we investigate the use of triplet embeddings in combination with motion representations for 3D object tracking. We start from an off-the-shelf 3D object detector and apply a tracking mechanism in which objects are matched by an affinity score computed on local object feature embeddings and motion descriptors. The feature embeddings are trained to include information about visual appearance and monocular 3D object characteristics, while the motion descriptors provide a strong representation of object trajectories. We show that our approach effectively re-identifies objects, behaves reliably and accurately under occlusions and missed detections, and can detect re-appearance across different fields of view. Experimental evaluation shows that our approach outperforms the state of the art on nuScenes by a large margin. We also obtain competitive results on KITTI.
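The matching step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the cosine similarity on embeddings, the constant-velocity motion term, the weight `alpha`, the Gaussian width `sigma`, the threshold, and the greedy assignment are all assumptions chosen for illustration, as are the dictionary keys `embedding`, `center`, and `velocity`.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def motion_affinity(track, det_center, sigma=2.0):
    """Assumed motion term: Gaussian score on the distance between a
    constant-velocity prediction of the track and the detection center."""
    predicted = [p + v for p, v in zip(track["center"], track["velocity"])]
    dist = math.dist(predicted, det_center)
    return math.exp(-(dist ** 2) / (2 * sigma ** 2))


def affinity(track, det, alpha=0.5):
    """Assumed combination: weighted sum of appearance and motion scores."""
    appearance = cosine_similarity(track["embedding"], det["embedding"])
    motion = motion_affinity(track, det["center"])
    return alpha * appearance + (1 - alpha) * motion


def match(tracks, detections, threshold=0.5):
    """Greedy one-to-one matching on descending affinity (an illustrative
    choice; an optimal assignment solver could be used instead).
    Returns a list of (track_index, detection_index) pairs."""
    scores = sorted(
        ((affinity(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for score, ti, di in scores:
        if score < threshold:
            break  # remaining pairs are all below the threshold
        if ti in used_tracks or di in used_dets:
            continue
        used_tracks.add(ti)
        used_dets.add(di)
        matches.append((ti, di))
    return matches
```

In this sketch, tracks left unmatched would be kept alive for a few frames so that an object can be re-identified after an occlusion or a missed detection, and detections left unmatched would start new tracks. That bookkeeping is omitted here.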