Multi-camera multi-object tracking is currently drawing attention in the computer vision field due to its superior performance in real-world applications such as video surveillance in crowded scenes or in wide spaces. In this work, we propose a mathematically elegant multi-camera multi-object tracking approach based on a spatial-temporal lifted multicut formulation. Our model uses tracklets produced by state-of-the-art single-camera trackers as proposals. Because these tracklets may contain ID-switch errors, we refine them through a novel pre-clustering obtained from 3D geometry projections. As a result, we derive a better tracking graph without ID switches and more precise affinity costs for the data association phase. Tracklets are then matched to multi-camera trajectories by solving a global lifted multicut formulation that incorporates short- and long-range temporal interactions between tracklets within the same camera as well as across cameras. Experimental results on the WildTrack dataset yield near-perfect performance; the approach also outperforms state-of-the-art trackers on Campus while being on par on the PETS-09 dataset.
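For readers unfamiliar with the term, the following sketches the lifted multicut problem in its standard textbook form. It is not necessarily the exact objective used in this work, and the notation is an assumption introduced here for illustration: $G = (V, E)$ is a graph whose nodes are tracklets, $G' = (V, E')$ with $E \subseteq E'$ adds lifted (long-range) edges, $c_e \in \mathbb{R}$ is an affinity cost, and $y_e = 1$ means that edge $e$ is cut, i.e. its two tracklets are assigned to different trajectories.

```latex
\begin{align}
\min_{y \in \{0,1\}^{E'}} \quad & \sum_{e \in E'} c_e \, y_e \\
\text{s.t.} \quad
& y_e \le \sum_{e' \in C \setminus \{e\}} y_{e'}
  && \forall\, \text{cycles } C \text{ in } G,\ \forall e \in C \\
& y_{vw} \le \sum_{e \in P} y_e
  && \forall\, vw \in E' \setminus E,\ \forall\, vw\text{-paths } P \text{ in } G \\
& 1 - y_{vw} \le \sum_{e \in C} (1 - y_e)
  && \forall\, vw \in E' \setminus E,\ \forall\, vw\text{-cuts } C \text{ in } G
\end{align}
```

Under this sign convention, a positive cost $c_e$ penalizes cutting edge $e$ and so favors joining its endpoints, while a negative cost favors separating them. The cycle constraints ensure the labeling corresponds to a valid partition of the tracklets. The path and cut constraints tie each lifted edge to the regular edges: a lifted edge is uncut exactly when its endpoints are connected by a path of uncut regular edges.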