Multi-object tracking (MOT) has been dominated by tracking-by-detection approaches, owing to the success of convolutional neural networks (CNNs) in detection over the last decade. As datasets and benchmarking sites have been published, research has shifted towards achieving the best accuracy on generic scenarios, including re-identification (reID) of objects while tracking. In this study, we narrow the scope of MOT to surveillance by providing a dedicated dataset of pedestrians, and we focus on in-depth analyses of well-performing multi-object trackers to observe the strengths and weaknesses of state-of-the-art (SOTA) techniques in real-world applications. For this purpose, we introduce the SOMPT22 dataset, a new set for multi-person tracking with annotated short videos captured from static cameras mounted on poles 6-8 meters high and positioned for city surveillance. This provides a more focused and specific benchmark of MOT for outdoor surveillance than public MOT datasets. We analyze MOT trackers, classified as one-shot or two-stage according to how they use detection and reID networks, on this new dataset. The experimental results on our new dataset indicate that SOTA is still far from high efficiency, and that one-shot trackers are good candidates for unifying fast execution and accuracy while delivering competitive performance. The dataset will be available at: sompt22.github.io