Multiple human tracking is a fundamental problem in scene understanding. Although real-world applications require both accuracy and speed, recent tracking methods based on deep learning have focused on accuracy and incur substantial running time. This study aims to improve running speed by performing human detection only at a certain frame interval, because detection accounts for most of the running time. The question is how to maintain accuracy while skipping human detection. In this paper, we propose a method that complements the detection results with optical flow, based on the observation that a person's appearance changes little between adjacent frames. To maintain tracking accuracy, we introduce robust interest point selection within human regions and a tracking termination metric calculated from the distribution of the interest points. On the MOT20 dataset of the MOTChallenge, the proposed SDOF-Tracker achieved the best total running speed while maintaining the MOTA metric. Our code is available at https://anonymous.4open.science/r/sdof-tracker-75AE.
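The idea of skipping detection and filling the gaps with optical flow can be illustrated with a minimal sketch. It is not the SDOF-Tracker implementation: the function names, the median-displacement update, and the thresholds `min_points` and `min_spread_ratio` are all assumptions made for illustration, and the interest point positions are taken as given rather than computed by an actual optical flow routine.

```python
from statistics import median, pstdev


def is_detection_frame(frame_idx, interval):
    """Run the expensive detector only on every `interval`-th frame."""
    return frame_idx % interval == 0


def propagate_box(box, pts_prev, pts_next):
    """Shift a box (x1, y1, x2, y2) by the median displacement of its points.

    Assumption: the median is used so that a few badly tracked points
    do not drag the box away; the paper may use a different update.
    """
    dx = median(n[0] - p[0] for p, n in zip(pts_prev, pts_next))
    dy = median(n[1] - p[1] for p, n in zip(pts_prev, pts_next))
    x1, y1, x2, y2 = box
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


def should_terminate(points, box, min_points=3, min_spread_ratio=0.05):
    """Hypothetical termination test based on the point distribution.

    A track is stopped when too few points remain inside the box, or when
    the remaining points have collapsed onto a small area of it.
    """
    x1, y1, x2, y2 = box
    inside = [(x, y) for x, y in points if x1 <= x <= x2 and y1 <= y <= y2]
    if len(inside) < min_points:
        return True
    width = max(x2 - x1, 1e-9)
    height = max(y2 - y1, 1e-9)
    spread_x = pstdev([x for x, _ in inside]) / width
    spread_y = pstdev([y for _, y in inside]) / height
    return spread_x < min_spread_ratio or spread_y < min_spread_ratio


# Usage: one tracked person between two frames without detection.
box = (10, 10, 30, 50)
pts_prev = [(12, 15), (20, 30), (28, 45)]
pts_next = [(14, 16), (22, 31), (40, 45)]  # last point drifted off the person
new_box = propagate_box(box, pts_prev, pts_next)
stop = should_terminate(pts_next, new_box)
```

In this example the drifting point does not affect the box update because of the median, but it leaves the propagated box, so only two points remain inside and the hypothetical test ends the track until the next detection frame.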