This work proposes an end-to-end multi-camera 3D multi-object tracking (MOT) framework that emphasizes spatio-temporal continuity and integrates both past and future reasoning for tracked objects; we therefore name it "Past-and-Future reasoning for Tracking" (PF-Track). Specifically, our method adapts the "tracking by attention" framework and represents tracked instances coherently over time with object queries. To explicitly use historical cues, our "Past Reasoning" module learns to refine the tracks and enhance the object features by cross-attending to queries from previous frames and other objects. The "Future Reasoning" module digests historical information and predicts robust future trajectories. Under long-term occlusions, our method maintains the object positions and enables re-association by integrating motion predictions. On the nuScenes dataset, our method improves AMOTA by a large margin and reduces ID-Switches by 90% compared to prior approaches, an order of magnitude fewer. The code and models are available at https://github.com/TRI-ML/PF-Track.
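The occlusion handling described above can be illustrated with a toy sketch. This is not the PF-Track implementation, which operates on learned object queries and attention. It only shows the general idea of keeping a track alive by advancing its position along previously predicted motion when no detection is available, so that the same identity can be re-associated later. The names `Track`, `step_track`, `future_offsets`, and `max_missed` are hypothetical, and the 2D positions and fixed per-frame offsets are simplifying assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Track:
    """Toy track state (hypothetical; not the PF-Track data structure)."""
    track_id: int
    position: Point  # last observed or extrapolated (x, y) center
    # Assumed output of some future-reasoning step: per-frame motion offsets.
    future_offsets: List[Point] = field(default_factory=list)
    missed_frames: int = 0  # consecutive frames without a matched detection


def step_track(track: Track, detection: Optional[Point], max_missed: int = 5) -> bool:
    """Advance a track by one frame.

    Returns False when the track has been unobserved for more than
    `max_missed` frames and should be dropped, True otherwise.
    """
    if detection is not None:
        # Object is visible: trust the observation and reset the counter.
        track.position = detection
        track.missed_frames = 0
        return True

    # Object is occluded: count the miss and check the budget.
    track.missed_frames += 1
    if track.missed_frames > max_missed:
        return False

    # Move along the predicted motion so the track stays near the object.
    if track.future_offsets:
        dx, dy = track.future_offsets.pop(0)
        track.position = (track.position[0] + dx, track.position[1] + dy)
    return True


if __name__ == "__main__":
    track = Track(track_id=7, position=(0.0, 0.0),
                  future_offsets=[(1.0, 0.0), (1.0, 0.0), (1.0, 0.0)])
    step_track(track, None)        # occluded frame 1
    step_track(track, None)        # occluded frame 2
    print(track.position)          # extrapolated position during occlusion
    step_track(track, (3.1, 0.0))  # object reappears; same track_id is kept
    print(track.track_id, track.position, track.missed_frames)
```

In this sketch the offsets are supplied once and consumed frame by frame; a learned tracker would typically refresh its trajectory prediction at every frame.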