We develop a generic mechanism for generating vehicle-type-specific sequences of waypoints from a probabilistic foundation model of driving behavior. Many foundation behavior models are trained on data that does not include vehicle information, which limits their utility in downstream applications such as planning. Our methodology conditionally specializes such a behavior prediction model to a vehicle type by using byproducts of the reinforcement learning algorithms that produce vehicle-specific controllers. We show how to compose a vehicle-specific value function estimate with a generic probabilistic behavior model to generate vehicle-type-specific waypoint sequences that are more likely to be physically plausible than their vehicle-agnostic counterparts.
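To make the idea of composing a value estimate with a generic behavior model concrete, the following is a minimal sketch of one simple way such a composition could work: candidate waypoint sequences are sampled from a generic model, and each is reweighted by the exponentiated vehicle-specific value (self-normalized importance reweighting). This is an illustrative assumption, not necessarily the composition used in this work. The random-walk sampler, the turn-rate-based value function, and the names `sample_generic_trajectories`, `toy_value`, `max_turn`, and `temperature` are all hypothetical stand-ins.

```python
import numpy as np


def sample_generic_trajectories(rng, num_samples, horizon):
    """Hypothetical stand-in for a vehicle-agnostic behavior model.

    Draws random-walk headings and integrates unit-length steps into
    waypoint sequences of shape (num_samples, horizon, 2).
    """
    headings = np.cumsum(rng.normal(0.0, 0.3, size=(num_samples, horizon)), axis=1)
    steps = np.stack([np.cos(headings), np.sin(headings)], axis=-1)
    return np.cumsum(steps, axis=1)


def toy_value(trajectories, max_turn):
    """Hypothetical vehicle-specific value estimate (assumption, not learned).

    Penalizes heading changes between consecutive segments that exceed
    max_turn (radians), mimicking a vehicle with a limited turn rate.
    """
    deltas = np.diff(trajectories, axis=1)
    headings = np.arctan2(deltas[..., 1], deltas[..., 0])
    turn = np.abs(np.diff(headings, axis=1))
    turn = np.minimum(turn, 2.0 * np.pi - turn)  # handle angle wrap-around
    excess = np.clip(turn - max_turn, 0.0, None)
    return -excess.sum(axis=1)


def reweight(values, temperature):
    """Softmax weights proportional to exp(value / temperature)."""
    logits = values / temperature
    logits = logits - logits.max()  # subtract max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()


def specialize(rng, trajectories, values, temperature=0.1):
    """Resample generic trajectories in proportion to their value weights."""
    weights = reweight(values, temperature)
    idx = rng.choice(len(trajectories), size=len(trajectories), p=weights)
    return trajectories[idx], weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    candidates = sample_generic_trajectories(rng, num_samples=256, horizon=12)
    values = toy_value(candidates, max_turn=0.2)
    specialized, weights = specialize(rng, candidates, values)
    print("mean value before reweighting:", values.mean())
    print("weighted mean value after reweighting:", np.dot(weights, values))
```

In this sketch a lower `temperature` concentrates the weights on sequences the toy value function rates as feasible, while a higher one stays closer to the generic model's samples. In practice the value estimate would come from the critic of a trained vehicle-specific controller rather than a hand-written penalty.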