Siamese-based trackers have achieved promising performance on visual object tracking tasks. Most existing Siamese-based trackers contain two separate branches for tracking: a classification branch and a bounding box regression branch. In addition, image segmentation provides an alternative way to obtain a more accurate target region. In this paper, we propose a novel tracker with two stages: detection and segmentation. The detection stage locates the target with Siamese networks. The segmentation module then takes the coarse state estimate from the first stage and produces a more accurate tracking result. We conduct experiments on four benchmarks. Our approach achieves state-of-the-art results, with an EAO of 52.6$\%$ on VOT2016, 51.3$\%$ on VOT2018, and 39.0$\%$ on VOT2019.
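The two-stage flow described in the abstract (coarse localization, then refinement by segmentation) can be illustrated with a minimal sketch. This is not the authors' implementation: the Siamese detection network is replaced by a plain peak search over a given response map, the segmentation module by simple thresholding, and every name here (`detect_coarse_box`, `segment_in_box`, `mask_to_box`, `track_frame`, `half_size`, `threshold`) is hypothetical. The response map and the image are also assumed to share the same resolution, which need not hold in a real tracker.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), inclusive pixel coordinates
Grid = List[List[float]]         # row-major 2D array: grid[y][x]


def detect_coarse_box(response: Grid, half_size: int) -> Box:
    """Stage 1 stand-in: centre a fixed-size box on the response-map peak.

    A real detection stage would use Siamese classification and regression
    branches; here the peak of a precomputed map plays that role.
    """
    height, width = len(response), len(response[0])
    _, cy, cx = max(
        (response[y][x], y, x) for y in range(height) for x in range(width)
    )
    # Clip the box to the image bounds.
    return (
        max(cx - half_size, 0),
        max(cy - half_size, 0),
        min(cx + half_size, width - 1),
        min(cy + half_size, height - 1),
    )


def segment_in_box(image: Grid, box: Box, threshold: float) -> List[Tuple[int, int]]:
    """Stage 2 stand-in: mark pixels inside the coarse box as target by thresholding."""
    x0, y0, x1, y1 = box
    return [
        (x, y)
        for y in range(y0, y1 + 1)
        for x in range(x0, x1 + 1)
        if image[y][x] > threshold
    ]


def mask_to_box(pixels: List[Tuple[int, int]]) -> Optional[Box]:
    """Tight axis-aligned box around the mask pixels, or None for an empty mask."""
    if not pixels:
        return None
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return (min(xs), min(ys), max(xs), max(ys))


def track_frame(
    response: Grid, image: Grid, half_size: int = 2, threshold: float = 0.5
) -> Box:
    """Run both stages on one frame and return the refined target box."""
    coarse = detect_coarse_box(response, half_size)
    refined = mask_to_box(segment_in_box(image, coarse, threshold))
    # Fall back to the coarse estimate when segmentation finds nothing.
    return refined if refined is not None else coarse
```

The point of the sketch is only the division of labour: the first stage needs to place the target roughly, and the second stage tightens the estimate using pixel-level evidence restricted to that coarse region.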