Multi-Object Tracking (MOT) is one of the most fundamental computer vision tasks and underlies a wide range of video analysis applications. Despite recent progress, current MOT research is still limited to a fixed sampling frame rate of the input stream. In fact, we find empirically that the accuracy of all recent state-of-the-art trackers drops dramatically when the input frame rate changes. Toward a more intelligent tracking solution, we turn our attention to the problem of Frame Rate Agnostic MOT (FraMOT), which takes frame rate insensitivity into consideration. In this paper, we propose a Frame Rate Agnostic MOT framework with a Periodic training Scheme (FAPS) to tackle the FraMOT problem for the first time. Specifically, we propose a Frame Rate Agnostic Association Module (FAAM) that infers and encodes the frame rate information to aid identity matching across multi-frame-rate inputs, improving the ability of the learned model to handle the complex motion-appearance relations in FraMOT. Moreover, the association gap between training and inference is enlarged in FraMOT, because post-processing steps that are not included in training make a larger difference in lower frame rate scenarios. To address this, we propose a Periodic Training Scheme (PTS) that reflects all post-processing steps in training via tracking pattern matching and fusion. Along with the proposed approaches, we make the first attempt to establish an evaluation method for the new FraMOT task in two different modes, i.e., known frame rate and unknown frame rate, aiming to handle a more complex situation. Quantitative experiments on the challenging MOT17/20 datasets (FraMOT version) demonstrate that the proposed approaches handle different frame rates better and thus improve robustness in complicated scenarios.
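The abstract does not say how the FraMOT version of MOT17/20 is built or how frame rate is varied. As a minimal sketch, assuming lower frame rates are simulated by keeping every k-th frame of a fixed-rate sequence, the following Python shows why association gets harder as the rate drops: the gap between consecutive sampled frames grows, so an object moves further between observations. The function names and parameters are hypothetical illustrations and are not taken from the paper.

```python
from typing import List, Sequence


def subsample_frames(frame_ids: Sequence[int], stride: int, offset: int = 0) -> List[int]:
    """Keep every `stride`-th frame, starting at position `offset`.

    Assumption: a lower frame rate is simulated by uniform temporal
    subsampling of the original sequence.
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return list(frame_ids[offset::stride])


def effective_fps(native_fps: float, stride: int) -> float:
    """Frame rate that results from keeping one frame out of every `stride`."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return native_fps / stride


def displacement_per_step(speed_px_per_frame: float, stride: int) -> float:
    """Pixels travelled between two consecutive *sampled* frames by an object
    moving at a constant speed (given in pixels per original frame)."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return speed_px_per_frame * stride


if __name__ == "__main__":
    frames = list(range(1, 11))            # frame ids 1..10 of a hypothetical 30 fps clip
    print(subsample_frames(frames, 3))     # [1, 4, 7, 10]
    print(effective_fps(30.0, 3))          # 10.0
    print(displacement_per_step(4.0, 3))   # 12.0
```

Under this assumption, a tracker tuned for the native rate sees per-step displacements scaled by the stride, which is one plausible reason a fixed-rate association model degrades when the input frame rate changes.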