Cross-view multi-object tracking aims to link objects across frames and across camera views that overlap substantially. Although the task has received increasing attention in recent years, existing datasets still have several shortcomings: 1) they miss real-world scenarios, 2) they lack diverse scenes, 3) they contain a limited number of tracks, 4) they comprise only static cameras, and 5) they lack standard benchmarks. These shortcomings hinder the investigation and comparison of cross-view tracking methods. To address them, we introduce DIVOTrack, a new cross-view multi-object tracking dataset for DIVerse Open scenes with densely tracked pedestrians in realistic, non-experimental environments. DIVOTrack has ten distinct scenarios and 550 cross-view tracks, surpassing all cross-view multi-object tracking datasets currently available. Furthermore, we provide a novel baseline cross-view tracking method, CrossMOT, a unified joint detection and cross-view tracking framework that learns object detection, single-view association, and cross-view matching with an all-in-one embedding model. Finally, we summarize current methodologies and establish a set of standard benchmarks on DIVOTrack to enable a fair comparison and a comprehensive analysis of current approaches and our proposed CrossMOT. The dataset and code are available at https://github.com/shengyuhao/DIVOTrack.