Most deep trackers still follow the guidance of the Siamese paradigm and use a template that contains only the target without any contextual information, which makes it difficult for the tracker to cope with large appearance changes, rapid target movement, and distraction from similar objects. To alleviate this problem, we propose a long-term context attention (LCA) module that performs extensive information fusion on the target and its context over long-term frames, and computes target correlation while enhancing target features. The complete contextual information contains the location of the target as well as the state around the target. LCA uses the target state from the previous frame to exclude the interference of similar objects and complex backgrounds, thus locating the target accurately and enabling the tracker to achieve higher robustness and regression accuracy. By embedding the LCA module into a Transformer, we build a powerful online tracker with a target-aware backbone, termed TATrack. In addition, we propose a dynamic online update algorithm based on the classification confidence of historical information, which introduces no additional computational burden. Our tracker achieves state-of-the-art performance on multiple benchmarks, with 71.1\% AUC on LaSOT, 89.3\% NP on TrackingNet, and 73.0\% AO on GOT-10k. The code and trained models are available at https://github.com/hekaijie123/TATrack.
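To make the update rule concrete, below is a minimal sketch (not the authors' released code) of a confidence-driven online template update, assuming the tracker's classification head already produces a confidence score for each tracked frame, so the update reuses existing outputs and adds no extra forward pass. The class name `OnlineTemplateUpdater`, the threshold, and the update interval are illustrative assumptions.

```python
# A minimal sketch of confidence-based dynamic template selection.
# Assumptions: the tracker exposes per-frame features and a classification
# confidence; threshold and interval values here are hypothetical.
from dataclasses import dataclass
from typing import Optional

import torch


@dataclass
class OnlineTemplateUpdater:
    """Keeps the most confident recent frame as the dynamic online template."""
    conf_threshold: float = 0.7      # hypothetical acceptance threshold
    update_interval: int = 25        # hypothetical update period (in frames)
    best_conf: float = 0.0
    best_feat: Optional[torch.Tensor] = None
    template: Optional[torch.Tensor] = None
    frame_idx: int = 0

    def step(self, frame_feat: torch.Tensor, cls_conf: float) -> Optional[torch.Tensor]:
        """Record one tracked frame; return the current dynamic template.

        `cls_conf` is the score already produced by the classification head,
        so selecting the template adds no additional computation.
        """
        self.frame_idx += 1
        if cls_conf > self.best_conf:
            self.best_conf = cls_conf
            self.best_feat = frame_feat.detach()

        if self.frame_idx % self.update_interval == 0:
            if self.best_conf >= self.conf_threshold and self.best_feat is not None:
                self.template = self.best_feat
            # Reset the window statistics for the next interval.
            self.best_conf = 0.0
            self.best_feat = None
        return self.template
```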