This paper presents a multi-camera tracking method for soccer players in long-shot video recordings from multiple calibrated cameras installed around the playing field. The large distance to the cameras makes it difficult to visually distinguish individual players, which degrades the performance of traditional solutions that rely on the appearance of tracked objects. Our method instead focuses on individual player dynamics and interactions between neighboring players to improve tracking performance. To overcome the difficulty of reliably merging detections from multiple cameras in the presence of calibration errors, we propose a novel tracking approach in which the tracker operates directly on raw detection heat maps from multiple cameras. To reduce the cost of ground-truth preparation, our model is trained on a large synthetic dataset generated using the Google Research Football Environment and then fine-tuned on real-world data.