Autonomous vehicles must often contend with conflicting planning requirements; for example, safety and comfort can be at odds if avoiding a collision calls for slamming on the brakes. To resolve such conflicts, assigning an importance ranking to rules (i.e., imposing a rule hierarchy) has been proposed, which in turn induces a ranking on trajectories based on the importance of the rules they satisfy. On the one hand, rule hierarchies can enhance interpretability but introduce combinatorial complexity to planning; on the other hand, differentiable reward structures can be leveraged by modern gradient-based optimization tools but are less interpretable and unintuitive to tune. In this paper, we present an approach to equivalently express rule hierarchies as differentiable reward structures amenable to modern gradient-based optimizers, thereby achieving the best of both worlds. We achieve this by formulating rank-preserving reward functions that are monotonic in the rank of the trajectories induced by the rule hierarchy; i.e., higher-ranked trajectories receive higher reward. Equipped with a rule hierarchy and its corresponding rank-preserving reward function, we develop a two-stage planner that can efficiently resolve conflicting planning requirements. We demonstrate that our approach can generate motion plans at ~7-10 Hz for various challenging road navigation and intersection negotiation scenarios.
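To make the notion of a rank-preserving reward concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes that trajectories are ranked lexicographically by which rules they satisfy, with the most important rule first, and it uses power-of-two weights so that satisfying one rule always outweighs satisfying all less important rules combined. The names `hard_reward`, `soft_reward`, and `is_rank_preserving`, the sigmoid relaxation, and the `sharpness` parameter are all hypothetical choices made for illustration.

```python
import itertools
import math


def hard_reward(satisfied):
    """Reward for a tuple of booleans; satisfied[0] is the most important rule.

    Assumption: rule i gets weight 2**(n-1-i), so a higher-priority rule
    outweighs the sum of all lower-priority rules.
    """
    n = len(satisfied)
    return sum(2 ** (n - 1 - i) for i, s in enumerate(satisfied) if s)


def soft_reward(robustness, sharpness=10.0):
    """Smooth surrogate of hard_reward (illustrative assumption).

    robustness[i] > 0 means rule i is satisfied. The step function is
    replaced by a sigmoid so that the reward has useful gradients. The
    ordering is only approximately preserved when values are near zero.
    """
    n = len(robustness)
    return sum(
        2 ** (n - 1 - i) / (1.0 + math.exp(-sharpness * rho))
        for i, rho in enumerate(robustness)
    )


def is_rank_preserving(n_rules):
    """Check that hard_reward orders all outcomes like the lexicographic rank."""
    outcomes = list(itertools.product([False, True], repeat=n_rules))
    # Python compares tuples lexicographically, which matches the assumed rank.
    return all(
        (a > b) == (hard_reward(a) > hard_reward(b))
        for a in outcomes
        for b in outcomes
    )


if __name__ == "__main__":
    print(is_rank_preserving(3))
    # Satisfying only the top rule beats satisfying only the two lower rules.
    print(hard_reward((True, False, False)), hard_reward((False, True, True)))
```

Under these assumptions, a gradient-based optimizer could maximize `soft_reward` directly, while `hard_reward` serves as the exact reference ordering. The sigmoid surrogate can misorder trajectories whose rule robustness lies close to zero, so this sketch does not carry the equivalence guarantee claimed in the abstract.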