In this paper, we demonstrate that deep learning based methods can be used to fuse multi-object densities. Given a scenario with several sensors with possibly different fields of view, tracking is performed locally at each sensor by a tracker, which produces random finite set multi-object densities. To fuse the outputs of the different trackers, we adapt a recently proposed transformer-based multi-object tracker, where the fusion result is a global multi-object density describing the set of all objects alive at the current time. We compare the performance of the transformer-based fusion method with that of a well-performing model-based Bayesian fusion method in several simulated scenarios with different parameter settings, using synthetic data. The simulation results show that the transformer-based fusion method outperforms the model-based Bayesian method in our experimental scenarios.