We address the problem of forecasting pedestrian and vehicle trajectories in unknown environments, conditioned on their past motion and scene structure. Trajectory forecasting is challenging because scene structure varies widely and the distribution of future trajectories is multimodal. Unlike prior approaches that directly learn one-to-many mappings from observed context to multiple future trajectories, we propose to condition trajectory forecasts on plans sampled from a grid-based policy learned using maximum entropy inverse reinforcement learning (MaxEnt IRL). We reformulate MaxEnt IRL to allow the policy to jointly infer plausible agent goals and paths to those goals on a coarse 2-D grid defined over the scene. We propose an attention-based trajectory generator that generates continuous-valued future trajectories conditioned on state sequences sampled from the MaxEnt policy. Quantitative and qualitative evaluation on the publicly available Stanford Drone and nuScenes datasets shows that our model generates trajectories that are diverse, representing the multimodal predictive distribution, and precise, conforming to the underlying scene structure over long prediction horizons.
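To make the idea of sampling plans from a grid-based MaxEnt policy concrete, the following is a minimal sketch and not the authors' implementation. It assumes hand-set per-cell rewards (`path_reward`, `goal_reward`), four grid moves plus a hypothetical terminating `END` action so that the cell where a plan stops plays the role of the inferred goal, and a fixed number of soft value-iteration backups. In the proposed model the rewards would be learned from scene and motion features; here they are placeholders.

```python
import numpy as np

# Grid moves: up, down, left, right. A fifth "end" action terminates the plan,
# so the final cell of a plan acts as its goal (an assumption of this sketch).
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]
END = len(MOVES)


def soft_value_iteration(path_reward, goal_reward, n_iters=100):
    """Return a stochastic policy of shape (H, W, 5) from per-cell rewards."""
    H, W = path_reward.shape
    V = goal_reward.copy()  # with no backups left, the only option is to end
    Q = np.empty((H, W, len(MOVES) + 1))
    for _ in range(n_iters):
        # Pad with -inf so that moves leaving the grid get zero probability.
        Vp = np.pad(V, 1, constant_values=-np.inf)
        for a, (dr, dc) in enumerate(MOVES):
            Q[..., a] = path_reward + Vp[1 + dr:1 + dr + H, 1 + dc:1 + dc + W]
        Q[..., END] = goal_reward
        # Soft maximum (log-sum-exp) over actions, computed stably.
        m = Q.max(axis=-1)
        V = m + np.log(np.exp(Q - m[..., None]).sum(axis=-1))
    # MaxEnt policy: pi(a | s) = exp(Q(s, a) - V(s)).
    return np.exp(Q - V[..., None])


def sample_plan(policy, start, rng, max_steps=50):
    """Sample a sequence of grid cells by following the stochastic policy."""
    cell = start
    plan = [cell]
    for _ in range(max_steps):
        a = rng.choice(policy.shape[-1], p=policy[cell])
        if a == END:
            break
        dr, dc = MOVES[a]
        cell = (cell[0] + dr, cell[1] + dc)
        plan.append(cell)
    return plan


# Placeholder rewards: a cheaper corridor in row 2 and one attractive goal cell.
H, W = 5, 5
path_reward = np.full((H, W), -3.0)
path_reward[2, :] = -1.5
goal_reward = np.full((H, W), -8.0)
goal_reward[2, 4] = 0.0

rng = np.random.default_rng(0)
policy = soft_value_iteration(path_reward, goal_reward)
plans = [sample_plan(policy, (2, 0), rng) for _ in range(3)]
for plan in plans:
    print(plan)
```

Because the policy is stochastic, repeated calls to `sample_plan` yield different state sequences, which is what would allow a downstream trajectory generator to produce diverse forecasts from the same observed context.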