We present our first-place solution to the Group Dance Multiple People Tracking Challenge. Building on MOTR (End-to-End Multiple-Object Tracking with Transformer), we explore: 1) representing detect queries as anchors, 2) formulating tracking as query denoising, 3) joint training on pseudo video clips generated from the CrowdHuman dataset, and 4) using YOLOX detection proposals to initialize the anchors of the detect queries. Our method achieves 73.4% HOTA on the DanceTrack test set, surpassing the second-place solution by 6.8% HOTA.
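As an illustration of item 3, the sketch below shows one common way to turn a single annotated still image into a pseudo video clip: a crop window slides linearly between two random offsets, and each annotated box keeps its index as a track identity across frames. This is a sketch under stated assumptions, not the procedure used in this work. The function name `make_pseudo_clip`, the parameters `num_frames` and `crop_frac`, and the linear-shift motion model are all hypothetical choices made for illustration.

```python
import random
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def make_pseudo_clip(
    img_w: int,
    img_h: int,
    boxes: List[Box],
    num_frames: int = 5,
    crop_frac: float = 0.8,
    seed: int = 0,
) -> List[Dict]:
    """Simulate a short clip from one still image (hypothetical helper).

    A fixed-size crop window moves linearly from a random start offset to a
    random end offset. Boxes are re-expressed in crop coordinates and clipped
    to the crop; the index of each box in `boxes` serves as its track id.
    """
    rng = random.Random(seed)
    crop_w, crop_h = int(img_w * crop_frac), int(img_h * crop_frac)
    max_x, max_y = img_w - crop_w, img_h - crop_h

    # Random start and end offsets of the crop window (assumed motion model).
    start = (rng.uniform(0, max_x), rng.uniform(0, max_y))
    end = (rng.uniform(0, max_x), rng.uniform(0, max_y))

    clip = []
    for t in range(num_frames):
        alpha = t / max(num_frames - 1, 1)
        off_x = start[0] + alpha * (end[0] - start[0])
        off_y = start[1] + alpha * (end[1] - start[1])

        frame_boxes: Dict[int, Box] = {}
        for track_id, (x1, y1, x2, y2) in enumerate(boxes):
            # Shift into crop coordinates, then clip to the crop bounds.
            nx1 = min(max(x1 - off_x, 0.0), crop_w)
            ny1 = min(max(y1 - off_y, 0.0), crop_h)
            nx2 = min(max(x2 - off_x, 0.0), crop_w)
            ny2 = min(max(y2 - off_y, 0.0), crop_h)
            # Drop boxes that have (almost) left the crop window.
            if nx2 - nx1 > 1 and ny2 - ny1 > 1:
                frame_boxes[track_id] = (nx1, ny1, nx2, ny2)

        clip.append({
            "crop": (off_x, off_y, off_x + crop_w, off_y + crop_h),
            "boxes": frame_boxes,
        })
    return clip
```

Each returned frame holds the crop window in image coordinates and a mapping from track id to box in crop coordinates, so the same person keeps the same id across the simulated frames. In a real pipeline the crop would also be applied to the pixels, typically alongside further augmentations.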