Multi-object tracking (MOT) is the problem of estimating the states of an unknown and time-varying number of objects from noisy measurements, with important applications in autonomous driving, animal behavior tracking, and defense systems, among others. In recent years, deep learning (DL) has been increasingly used in MOT to improve tracking performance, but mostly in settings where the measurements are high-dimensional and no models of the measurement likelihood or the object dynamics are available. The model-based setting, by contrast, has attracted less attention, and it remains unclear whether DL methods can outperform traditional model-based Bayesian methods, which are the state of the art (SOTA) in this context. In this paper, we propose a Transformer-based DL tracker and evaluate its performance in the model-based setting, comparing it to SOTA model-based Bayesian methods on a variety of tasks. Our results show that the proposed DL method can match the performance of the model-based methods on simple tasks, while outperforming them as the task becomes more complicated, whether because of increased data association complexity or because of stronger nonlinearities in the models of the environment.