Joint object detection and online multi-object tracking (JDT) methods have recently been proposed to achieve one-shot tracking. Yet existing works overlook the importance of detection itself and often miss detections under occlusion or motion blur. These missed detections degrade not only detection performance but also tracking performance, because they produce inconsistent tracklets. Hence, we propose a new JDT model that recovers missed detections while associating the detection candidates of consecutive frames, by learning object-level spatio-temporal consistency through edge features in a Graph Neural Network (GNN). Our proposed model, Sparse Graph Tracker (SGT), converts video data into a graph in which the nodes are the top-$K$ scored detection candidates and the edges are relations between nodes at different times, such as position difference and visual similarity. Two nodes are connected if they are close in either Euclidean or feature space, which yields a sparsely connected graph. Without motion prediction or Re-Identification (ReID), association is performed by predicting an edge score that represents the probability that two connected nodes refer to the same object. Under the online setting, SGT achieves state-of-the-art (SOTA) results on the MOT17/20 Detection and MOT16/20 benchmarks in terms of AP and MOTA, respectively. In particular, SGT surpasses the previous SOTA on the crowded MOT20 dataset, where partial occlusion is dominant, showing the effectiveness of detection recovery against partial occlusion. Code will be released at https://github.com/HYUNJS/SGT.
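
The graph construction described above can be illustrated with a minimal NumPy sketch. It is not the released SGT implementation: the function names `select_top_k` and `build_sparse_edges`, the neighbour count `k`, the use of box centers for Euclidean distance, and cosine similarity as the feature-space measure are all assumptions made for illustration. The learned GNN that predicts edge scores is omitted; the sketch only shows how top-$K$ candidates from two frames could be sparsely connected and how position difference and visual similarity could serve as edge features.

```python
import numpy as np

def select_top_k(scores, k):
    """Return indices of the k highest-scoring detection candidates (hypothetical helper)."""
    return np.argsort(-scores)[:k]

def build_sparse_edges(centers_prev, feats_prev, centers_curr, feats_curr, k=2):
    """Connect each current-frame node to previous-frame nodes that are close
    in Euclidean space OR in feature space (assumed: k nearest in each)."""
    # Pairwise center offsets and distances, shape (N_curr, N_prev, 2) and (N_curr, N_prev).
    diff = centers_curr[:, None, :] - centers_prev[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Pairwise cosine similarity between appearance features (eps avoids division by zero).
    eps = 1e-12
    fc = feats_curr / (np.linalg.norm(feats_curr, axis=1, keepdims=True) + eps)
    fp = feats_prev / (np.linalg.norm(feats_prev, axis=1, keepdims=True) + eps)
    sim = fc @ fp.T

    k = min(k, centers_prev.shape[0])
    near_pos = np.argsort(dist, axis=1)[:, :k]    # closest in Euclidean space
    near_feat = np.argsort(-sim, axis=1)[:, :k]   # most similar in feature space

    # Union of both neighbour sets gives a sparse edge list (curr_idx, prev_idx).
    edges = set()
    for i in range(centers_curr.shape[0]):
        for j in near_pos[i]:
            edges.add((i, int(j)))
        for j in near_feat[i]:
            edges.add((i, int(j)))
    edges = sorted(edges)

    # Edge features: position difference (dx, dy) and visual similarity.
    edge_feats = np.array([[diff[i, j, 0], diff[i, j, 1], sim[i, j]] for i, j in edges])
    return edges, edge_feats

# Toy example: three candidates in the previous frame, two in the current frame.
centers_prev = np.array([[0.0, 0.0], [10.0, 10.0], [100.0, 100.0]])
feats_prev = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
centers_curr = np.array([[1.0, 1.0], [99.0, 99.0]])
feats_curr = np.array([[0.0, 1.0], [1.0, 0.0]])

edges, edge_feats = build_sparse_edges(
    centers_prev, feats_prev, centers_curr, feats_curr, k=1
)
top2 = select_top_k(np.array([0.2, 0.9, 0.5]), 2)
```

In this toy example each current-frame node ends up with two edges, one to its spatially nearest neighbour and one to its most visually similar neighbour, rather than being connected to every previous-frame node. In a full model, an edge classifier would then score each edge as the probability that both endpoints are the same object.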