Multi-Object Tracking (MOT) is one of the most fundamental computer vision tasks and contributes to a variety of video analysis applications. Despite recent promising progress, current MOT research is still limited to a fixed sampling frame rate of the input stream. In fact, we empirically find that the accuracy of all recent state-of-the-art trackers drops dramatically when the input frame rate changes. Toward a more intelligent tracking solution, we shift our attention to the problem of Frame Rate Agnostic MOT (FraMOT). In this paper, we propose a Frame Rate Agnostic MOT framework with a Periodic training Scheme (FAPS) to tackle the FraMOT problem for the first time. Specifically, we propose a Frame Rate Agnostic Association Module (FAAM) that infers and encodes the frame rate information to aid identity matching across multi-frame-rate inputs, improving the capability of the learned model to handle the complex motion-appearance relations in FraMOT. In addition, the association gap between training and inference is enlarged in FraMOT, because the post-processing steps that are not included in training make a larger difference in lower frame rate scenarios. To address this, we propose a Periodic Training Scheme (PTS) that reflects all post-processing steps in training via tracking pattern matching and fusion. Along with the proposed approaches, we make the first attempt to establish an evaluation method for this new task of FraMOT in two different modes, i.e., known frame rate and unknown frame rate, aiming to handle a more complex situation. Quantitative experiments on the challenging MOT datasets (FraMOT version) demonstrate that the proposed approaches handle different frame rates better and thus improve robustness in complicated scenarios.
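To make the frame rate sensitivity concrete, the following is a minimal toy sketch and not the FAAM or PTS described above. It assumes a hypothetical greedy nearest-centre matcher with a distance gate (`associate`), emulates lower frame rates by keeping every `stride`-th frame (`subsample`), and compares a gate tuned for the original frame rate against one scaled by the frame interval. All names, the toy trajectories, and the gate value are illustrative assumptions.

```python
import math


def subsample(frames, stride):
    """Keep every `stride`-th frame to emulate a lower frame rate."""
    return frames[::stride]


def associate(prev, curr, gate):
    """Greedy nearest-centre matching between two frames.

    `prev` and `curr` are lists of (x, y) object centres. Pairs whose
    centre distance exceeds `gate` are rejected. Returns a dict mapping
    indices in `prev` to indices in `curr`.
    """
    candidates = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(prev)
        for j, c in enumerate(curr)
    )
    matches, used_prev, used_curr = {}, set(), set()
    for distance, i, j in candidates:
        if distance > gate:
            break  # candidates are sorted, so all remaining pairs also fail
        if i in used_prev or j in used_curr:
            continue
        matches[i] = j
        used_prev.add(i)
        used_curr.add(j)
    return matches


def count_matches(frames, gate):
    """Total number of matched pairs over all consecutive frame pairs."""
    return sum(len(associate(a, b, gate)) for a, b in zip(frames, frames[1:]))


# Toy scene (assumption): two objects moving right at 5 px per frame.
frames = [[(5.0 * t, 0.0), (5.0 * t, 100.0)] for t in range(9)]
BASE_GATE = 8.0  # hypothetical gate tuned for the original frame rate

for stride in (1, 2, 4):
    sequence = subsample(frames, stride)
    fixed = count_matches(sequence, BASE_GATE)
    scaled = count_matches(sequence, BASE_GATE * stride)
    print(f"stride={stride}: fixed gate -> {fixed}, scaled gate -> {scaled}")
```

In this toy setting the fixed gate loses every association once the stride reaches 2, while the gate scaled by the frame interval recovers all of them. Real sequences also involve appearance changes and non-linear motion, which a simple rescaling of this kind does not address.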