Transformer networks have been a focus of research in many fields in recent years and have surpassed state-of-the-art performance in a range of computer vision tasks. A few attempts have been made to apply them to Multiple Object Tracking (MOT); among these, the state of the art was TransCenter, a transformer-based MOT architecture that uses dense object queries to track all objects accurately while keeping a reasonable runtime. TransCenter is the first center-based transformer framework for MOT and is among the first to show the benefits of transformer-based architectures for MOT. In this paper we improve this tracker with a post-processing mechanism based on the Track-by-Detection paradigm: motion model estimation using a Kalman filter and target re-identification using an embedding network. Our new tracker shows significant improvements in the IDF1 and HOTA metrics and comparable results on the MOTA metric (70.9%, 59.8% and 75.8%, respectively) on the MOTChallenge MOT17 test dataset, and improvements on all three metrics (67.5%, 56.3% and 73.0%) on the MOT20 test dataset. Our tracker is currently ranked first among transformer-based trackers on these datasets. The code is publicly available at: https://github.com/amitgalor18/STC_Tracker
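
The two post-processing components named above, Kalman-filter motion estimation and embedding-based re-identification, can be illustrated with a minimal sketch. This is not the STC_Tracker code: the class name `ConstantVelocityKalman1D`, the constant-velocity motion model, the noise values `q` and `r`, and the use of cosine similarity to compare embeddings are all assumptions made for illustration. The filter tracks a single coordinate (for example, the x position of a box center); a real tracker would typically filter the full bounding-box state.

```python
import math
from dataclasses import dataclass


@dataclass
class ConstantVelocityKalman1D:
    """Kalman filter for one coordinate with state [position, velocity].

    Hypothetical illustration; noise values are assumed, not from the paper.
    """
    x: float            # estimated position
    v: float = 0.0      # estimated velocity
    # Entries of the symmetric covariance [[p_xx, p_xv], [p_xv, p_vv]]
    p_xx: float = 1.0
    p_xv: float = 0.0
    p_vv: float = 1.0
    q: float = 1e-2     # process noise added to each diagonal entry (assumed)
    r: float = 1e-1     # measurement noise variance (assumed)

    def predict(self, dt: float = 1.0) -> float:
        """Propagate the state one step: x' = F x, P' = F P F^T + Q."""
        self.x += dt * self.v
        p_xx = self.p_xx + dt * (2.0 * self.p_xv + dt * self.p_vv) + self.q
        p_xv = self.p_xv + dt * self.p_vv
        p_vv = self.p_vv + self.q
        self.p_xx, self.p_xv, self.p_vv = p_xx, p_xv, p_vv
        return self.x

    def update(self, z: float) -> float:
        """Correct the state with a position measurement z (H = [1, 0])."""
        residual = z - self.x
        s = self.p_xx + self.r          # innovation variance
        k_x = self.p_xx / s             # Kalman gain for position
        k_v = self.p_xv / s             # Kalman gain for velocity
        self.x += k_x * residual
        self.v += k_v * residual
        # P = (I - K H) P, written out for the 2x2 case
        p_xx = (1.0 - k_x) * self.p_xx
        p_xv = (1.0 - k_x) * self.p_xv
        p_vv = self.p_vv - k_v * self.p_xv
        self.p_xx, self.p_xv, self.p_vv = p_xx, p_xv, p_vv
        return self.x


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity between two appearance embeddings (assumed Re-ID measure)."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

In a tracking loop of this kind, `predict()` would be called once per frame for every track, `update()` only for tracks matched to a detection, and `cosine_similarity` could be used to decide whether a new detection is a reappearance of a previously lost target.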