Tracking multiple objects individually differs from tracking groups of related objects. When an object is part of a group, its trajectory depends on the trajectories of the other group members. Most current state-of-the-art trackers track each object independently, with a mechanism to handle overlapping trajectories where necessary. Such an approach does not take inter-object relations into account, which may cause unreliable tracking of group members, especially in crowded scenarios, where individual cues become unreliable due to occlusions. To overcome these limitations and to extend such trackers to crowded scenes, we propose a plug-in Relation Encoding Module (REM). REM encodes relations between tracked objects by running message passing over a corresponding spatio-temporal graph, computing relation embeddings for the tracked objects. Our experiments on MOT17 and MOT20 demonstrate that the baseline tracker improves its results after a simple extension with REM. The proposed module allows for tracking severely or even fully occluded objects by utilizing relational cues.
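To make the idea of message passing over a spatio-temporal graph concrete, the following is a minimal sketch and not the REM architecture itself, whose details the abstract does not give. It assumes nodes are (track, frame) pairs, spatial edges link nearby objects in the same frame, temporal edges link the same track across consecutive frames, and the update is an unlearned mean aggregation over raw 2D positions. The names `build_graph`, `message_passing`, and `radius` are hypothetical and chosen for illustration only; a learned module would replace the averaging with trained message and update functions over appearance and motion features.

```python
from itertools import combinations
from math import dist


def build_graph(detections, radius):
    """Build an undirected spatio-temporal graph (illustrative assumption).

    detections: dict mapping (track_id, frame) -> (x, y) position.
    Spatial edges link different tracks in the same frame that lie within
    `radius`; temporal edges link the same track across consecutive frames.
    """
    neighbors = {node: set() for node in detections}
    for a, b in combinations(detections, 2):
        (track_a, frame_a), (track_b, frame_b) = a, b
        spatial = frame_a == frame_b and dist(detections[a], detections[b]) <= radius
        temporal = track_a == track_b and abs(frame_a - frame_b) == 1
        if spatial or temporal:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def message_passing(features, neighbors, rounds=2):
    """Run `rounds` of mean-aggregation message passing (no learned weights).

    Each node's embedding becomes the average of its own embedding and the
    mean of its neighbors' embeddings. Isolated nodes keep their embedding.
    """
    emb = {node: list(vec) for node, vec in features.items()}
    for _ in range(rounds):
        updated = {}
        for node, vec in emb.items():
            nbrs = neighbors[node]
            if not nbrs:
                updated[node] = vec
                continue
            mean_msg = [sum(emb[n][i] for n in nbrs) / len(nbrs) for i in range(len(vec))]
            updated[node] = [(v + m) / 2 for v, m in zip(vec, mean_msg)]
        emb = updated
    return emb


# Toy scene: tracks 1 and 2 walk together, track 3 is far away.
detections = {
    (1, 0): (0.0, 0.0), (2, 0): (1.0, 0.0), (3, 0): (50.0, 50.0),
    (1, 1): (0.0, 1.0), (2, 1): (1.0, 1.0), (3, 1): (50.0, 51.0),
}
graph = build_graph(detections, radius=2.0)
relation_emb = message_passing(detections, graph, rounds=2)
```

In this toy scene, tracks 1 and 2 exchange messages with each other and with their own past and future nodes, while track 3 only receives temporal messages from itself. This is the property a relational cue relies on: if track 1 were occluded in some frame, its embedding would still carry information from its group partner.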