Tracking requires building a discriminative model for the target during inference. An effective way to achieve this is online learning, which can substantially outperform models trained only offline. Recent research shows that visual tracking benefits significantly from unifying tracking and segmentation, owing to the pixel-level discrimination that segmentation provides. However, performing online learning for such a unified model poses a great challenge: a segmentation model cannot easily learn from the prior information given in the visual tracking scenario. In this paper, we propose TrackMLP, a novel meta-learning method optimized to learn from only partial information, resolving this challenge. Our model can extensively exploit the limited prior information and hence possesses much stronger target-background discriminability than other online learning methods. Empirically, we show that our model achieves state-of-the-art performance and tangible improvement over competing models. It attains average overlaps of 66.0%, 67.1%, and 68.5% on the VOT2019, VOT2018, and VOT2016 datasets, which are 6.4%, 7.3%, and 6.4% higher than our baseline, respectively. Code will be made publicly available.