Similarity learning has been recognized as a crucial step for object tracking. However, existing multiple object tracking methods use only sparse ground-truth matching as the training objective and ignore the majority of the informative regions in the images. In this paper, we present Quasi-Dense Similarity Learning, which densely samples hundreds of region proposals on a pair of images for contrastive learning. This similarity learning can be combined directly with existing detection methods to build Quasi-Dense Tracking (QDTrack), without resorting to displacement regression or motion priors. We also find that the resulting distinctive feature space admits a simple nearest neighbor search at inference time. Despite its simplicity, QDTrack outperforms all existing methods on the MOT, BDD100K, Waymo, and TAO tracking benchmarks. It achieves 68.7 MOTA at 20.3 FPS on MOT17 without using external training data. Compared to methods with similar detectors, it improves MOTA by almost 10 points and significantly reduces the number of ID switches on the BDD100K and Waymo datasets. Our code and trained models are available at http://vis.xyz/pub/qdtrack.
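The two ideas summarized in the abstract, contrastive learning over many proposal pairs and nearest neighbor association at inference time, can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration: the function names, the particular multi-positive loss form, the greedy one-to-one matching, and the `min_sim` threshold are hypothetical and are not taken from the released QDTrack code, which the abstract does not describe at this level of detail.

```python
import math


def dot(u, v):
    """Inner product of two equal-length embedding vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm > 0 else 0.0


def contrastive_loss(anchor, positives, negatives):
    """Multi-positive contrastive loss for one proposal embedding.

    Assumed form: log(1 + sum over (pos, neg) of exp(a.neg - a.pos)).
    `positives` are embeddings of proposals on the other image that
    cover the same object; `negatives` cover other objects/background.
    """
    total = 0.0
    for pos in positives:
        for neg in negatives:
            total += math.exp(dot(anchor, neg) - dot(anchor, pos))
    return math.log1p(total)


def nearest_neighbor_match(track_embeds, det_embeds, min_sim=0.5):
    """Associate detections with existing tracks by embedding similarity.

    track_embeds: dict mapping track id -> embedding
    det_embeds:   list of detection embeddings in the current frame
    Returns a list with one entry per detection: the matched track id,
    or None when no unused track is similar enough (hypothetical rule).
    """
    pairs = [
        (cosine(t, d), tid, di)
        for tid, t in track_embeds.items()
        for di, d in enumerate(det_embeds)
    ]
    # Visit the most similar pairs first and keep the matching one-to-one.
    pairs.sort(key=lambda p: p[0], reverse=True)
    matches, used_tracks = {}, set()
    for sim, tid, di in pairs:
        if sim < min_sim:
            break
        if tid in used_tracks or di in matches:
            continue
        matches[di] = tid
        used_tracks.add(tid)
    return [matches.get(di) for di in range(len(det_embeds))]


if __name__ == "__main__":
    tracks = {7: [1.0, 0.0], 9: [0.0, 1.0]}
    detections = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
    print(nearest_neighbor_match(tracks, detections))
    print(contrastive_loss([1.0, 0.0], [[1.0, 0.0]], [[0.0, 1.0]]))
```

In this sketch the loss shrinks as the anchor moves closer to its positives and away from its negatives, which is the property that lets a plain similarity lookup work at inference time. A real tracker would compute the embeddings with a detector's region features and would typically add track management (creation, termination, re-identification) on top of the matching step.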