Recent Multiple Object Tracking (MOT) methods increasingly integrate object detection and instance re-identification (Re-ID) into a unified network to form a one-stage solution. Typically, these methods use two separate branches within a single network to perform detection and Re-ID respectively, without studying the inter-relationship between the two tasks, which inevitably limits tracking performance. In this paper, we propose an online multi-object tracking framework based on a hierarchical single-branch network to address this problem. Specifically, the proposed single-branch network uses an improved Hierarchical Online Instance Matching (iHOIM) loss to explicitly model the inter-relationship between object detection and Re-ID. The iHOIM loss unifies the objectives of the two sub-tasks and encourages better detection performance and feature learning, even in extremely crowded scenes. Moreover, we propose to use the object positions predicted by a motion model as additional region proposals for subsequent object detection; the intuition is that detection results and motion predictions can complement each other in different scenarios. Experimental results on the MOT16 and MOT20 datasets show that the method achieves state-of-the-art tracking performance, and an ablation study verifies the effectiveness of each proposed component.
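As a rough illustration of the idea of feeding motion predictions to the detector as region proposals, the sketch below extrapolates each tracked box one frame ahead and appends the result to the detector's own proposals. The abstract does not specify the motion model, so the constant-velocity extrapolation here is an assumption, and the names `predict_box`, `clip_box`, and `merge_proposals` are hypothetical helpers, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

# A box is (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


def predict_box(prev: Sequence[float], curr: Sequence[float]) -> Box:
    """Constant-velocity guess (an assumption): next = curr + (curr - prev)."""
    x1, y1, x2, y2 = (float(c) + (float(c) - float(p)) for p, c in zip(prev, curr))
    return (x1, y1, x2, y2)


def clip_box(box: Sequence[float], width: float, height: float) -> Box:
    """Clamp a box to the image bounds so it is a valid proposal."""
    x1, y1, x2, y2 = box
    return (
        min(max(float(x1), 0.0), float(width)),
        min(max(float(y1), 0.0), float(height)),
        min(max(float(x2), 0.0), float(width)),
        min(max(float(y2), 0.0), float(height)),
    )


def merge_proposals(
    detector_proposals: Sequence[Sequence[float]],
    tracks: Sequence[Tuple[Sequence[float], Sequence[float]]],
    width: float,
    height: float,
) -> List[Box]:
    """Return detector proposals followed by one motion-predicted box per track.

    Each track is a (previous_box, current_box) pair.
    """
    merged = [clip_box(b, width, height) for b in detector_proposals]
    for prev, curr in tracks:
        merged.append(clip_box(predict_box(prev, curr), width, height))
    return merged


if __name__ == "__main__":
    proposals = [(100, 100, 150, 200)]
    tracks = [((10, 10, 50, 90), (14, 10, 54, 90))]  # moving 4 px right per frame
    for box in merge_proposals(proposals, tracks, width=1920, height=1080):
        print(box)
```

In this sketch the motion-predicted boxes are simply appended, so a target that the proposal stage misses (for example under occlusion) still yields a candidate region for the detection head to score; how the real framework weights or filters such proposals is not stated in the abstract.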