Visual object tracking has traditionally been handled by divergent paradigms: either learning a model of the object's appearance exclusively online, or matching the object against the target in an offline-trained embedding space. Despite recent success, each paradigm suffers from an intrinsic limitation. Online-only approaches lack generalization in the models they learn and are therefore inferior at target regression, while offline-only approaches (e.g., conventional siamese trackers) lack video-specific context and are thus not discriminative enough to handle distractors. We therefore propose a parallel framework that integrates offline-trained siamese networks with a lightweight online module to enhance discriminative capability. We further apply a simple yet robust template update strategy to the siamese networks in order to handle object deformation. The robustness of our approach is validated by consistent improvements over three siamese baselines: SiamFC, SiamRPN++, and SiamMask. Beyond that, our model based on SiamRPN++ achieves the best results on six popular tracking benchmarks. Although the online module is updated as tracking proceeds, our approach inherits the high efficiency of its siamese baseline and operates beyond real time.
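The two key mechanisms described above can be illustrated with a minimal sketch. The fusion rule (a convex combination of the offline siamese response and the online module's response) and the linear moving-average template update are assumptions for illustration only; the abstract does not specify the actual formulas, and the function names `fuse_scores` and `update_template` are hypothetical.

```python
def fuse_scores(siamese_score, online_score, weight=0.5):
    # Hypothetical late fusion: convex combination of the offline-trained
    # siamese response map and the lightweight online module's response map.
    return [(1 - weight) * s + weight * o
            for s, o in zip(siamese_score, online_score)]

def update_template(template, new_feature, rate=0.1):
    # Hypothetical simple template update (exponential moving average),
    # one common way to handle object deformation over time.
    return [(1 - rate) * t + rate * f for t, f in zip(template, new_feature)]
```

With `weight=0.5`, fusing an all-ones siamese map with an all-zeros online map yields 0.5 everywhere; the template update similarly drifts the stored template toward the newest feature at rate `rate`.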