Motion forecasting in highly interactive scenarios is a challenging problem in autonomous driving. In such scenarios, we need to accurately predict the joint behavior of interacting agents to ensure the safe and efficient navigation of autonomous vehicles. Recently, goal-conditioned methods have gained increasing attention due to their strong performance and their ability to capture the multimodality of the trajectory distribution. In this work, we study the joint trajectory prediction problem within the goal-conditioned framework. In particular, we introduce a model based on a conditional variational autoencoder (CVAE) to explicitly encode different interaction modes into the latent space. However, we find that the vanilla model suffers from posterior collapse and fails to induce an informative latent space as desired. To address these issues, we propose a novel approach that avoids KL vanishing and induces an interpretable interactive latent space using pseudo labels. The proposed pseudo labels allow us to incorporate domain knowledge about interaction in a flexible manner. We motivate the proposed method with an illustrative toy example. In addition, we validate our framework on the Waymo Open Motion Dataset with both quantitative and qualitative evaluations.