Multiple object tracking (MOT) has attracted considerable research interest in recent years and has become one of the trending problems in computer vision, especially with recent advances in autonomous driving. MOT is a critical yet challenging vision task because of issues such as occlusion in crowded scenes, similar appearance among objects, difficulty detecting small objects, and ID switching. To tackle these challenges, researchers have exploited the attention mechanism of transformers, modeled the interrelation of tracklets with graph convolutional neural networks, and compared the appearance of objects across frames with Siamese networks; they have also tried simple IoU-matching-based CNNs and motion prediction with LSTMs. To bring these scattered techniques under one umbrella, we have studied more than a hundred papers published over the last three years and have tried to extract the techniques that researchers have focused on most in recent times to solve the problems of MOT. We have listed numerous applications and possibilities of MOT and described how it relates to real life. Our review tries to show the different perspectives of the techniques that researchers have used over time and to give some future directions for prospective researchers. Moreover, we have included popular benchmark datasets and metrics in this review.