This study follows many classical approaches to multi-object tracking (MOT) that model the problem using dynamic graphical data structures, and adapts this formulation to make it amenable to modern neural networks. Our main contributions in this work are the creation of a framework based on dynamic undirected graphs that represent the data association problem over multiple timesteps, and a message passing graph neural network (MPNN) that operates on these graphs to produce the desired likelihood for every association therein. We also provide solutions and propositions for the computational problems that need to be addressed to create a memory-efficient, real-time, online algorithm that can reason over multiple timesteps, correct previous mistakes, update beliefs, and handle missed/false detections. To demonstrate the efficacy of our approach, we only use the 2D box location and object category ID to construct the descriptor for each object instance. Despite this, our model performs on par with state-of-the-art approaches that make use of additional sensors, as well as multiple hand-crafted and/or learned features. This illustrates that given the right problem formulation and model design, raw bounding boxes (and their kinematics) from any off-the-shelf detector are sufficient to achieve competitive tracking results on challenging MOT benchmarks.
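As an illustration of the graph formulation described above, the following minimal sketch builds an undirected association graph over a short window of timesteps, using only the 2D box and the category ID as the node descriptor. It is not the authors' implementation: the names `Detection`, `descriptor`, `build_association_graph`, the `window` parameter, and the rule that only same-category detections are linked are all assumptions made for this example. The message passing network that would score each edge is omitted.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Detection:
    frame: int      # timestep index of the detection
    box: tuple      # (x1, y1, x2, y2) in pixels
    category: int   # object category ID from the detector


def descriptor(det):
    """Node descriptor: box centre, width, height, and category ID (assumed layout)."""
    x1, y1, x2, y2 = det.box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1, det.category)


def build_association_graph(detections, window=3):
    """Return (nodes, edges) for a hypothetical association graph.

    nodes: one descriptor per detection.
    edges: undirected candidate associations stored as index pairs (i, j), i < j.
    Two detections are linked when they share a category and lie in
    different frames that are at most `window` timesteps apart.
    """
    nodes = [descriptor(d) for d in detections]
    edges = []
    for i, j in combinations(range(len(detections)), 2):
        a, b = detections[i], detections[j]
        gap = abs(a.frame - b.frame)
        if a.category == b.category and 0 < gap <= window:
            edges.append((i, j))
    return nodes, edges
```

In a full system, each edge in `edges` would receive a likelihood from the network, and the graph would be updated as new frames arrive and old ones leave the window.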