Automated vehicles must be able to cooperate with humans to integrate smoothly into today's traffic. While the concept of cooperation is well known, developing a robust and efficient cooperative trajectory planning method remains a challenge. One aspect of this challenge is uncertainty about the state of the environment caused by limited sensor accuracy. Planning under such uncertainty can be modeled as a Partially Observable Markov Decision Process (POMDP). Our work addresses this problem by extending an existing cooperative trajectory planning approach based on Monte Carlo Tree Search (MCTS) for continuous action spaces. The extension explicitly models uncertainty in the form of a root belief state, from which the start states of the search trees are sampled. After the trees have been constructed with MCTS, their results are aggregated into return distributions using kernel regression. For the final selection, we apply two risk metrics: a Lower Confidence Bound and a Conditional Value at Risk. We demonstrate that integrating risk metrics into the final selection policy consistently outperforms a baseline in uncertain environments, generating considerably safer trajectories.
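To make the final selection step concrete, the following is a minimal Python sketch of risk-aware selection over sampled returns. It is an illustration under stated assumptions, not the implementation described above: the function names, the parameters `kappa` and `alpha`, and the toy return values are hypothetical, the Lower Confidence Bound is assumed to be the mean minus a multiple of the standard deviation, the Conditional Value at Risk is assumed to be the mean of the worst `alpha` fraction of returns, and the kernel regression step over continuous actions is omitted in favor of two discrete candidate actions.

```python
import math
from statistics import mean, stdev


def lower_confidence_bound(returns, kappa=1.0):
    """Assumed LCB form: sample mean minus kappa sample standard deviations."""
    if len(returns) < 2:
        return mean(returns)
    return mean(returns) - kappa * stdev(returns)


def conditional_value_at_risk(returns, alpha=0.2):
    """Assumed CVaR form: mean of the worst alpha fraction of returns."""
    ordered = sorted(returns)
    k = max(1, math.ceil(alpha * len(ordered)))
    return mean(ordered[:k])


def select_action(returns_by_action, metric):
    """Pick the action whose return samples score highest under `metric`."""
    return max(returns_by_action, key=lambda a: metric(returns_by_action[a]))


# Hypothetical returns, one per tree whose start state was sampled
# from the root belief state.
returns_by_action = {
    "merge_now": [12, 12, 12, 12, -20],  # good on average, one severe outcome
    "yield": [4, 5, 5, 6, 5],            # slightly lower mean, low spread
}

baseline_choice = select_action(returns_by_action, mean)
lcb_choice = select_action(returns_by_action, lower_confidence_bound)
cvar_choice = select_action(returns_by_action, conditional_value_at_risk)
```

In this toy setting a mean-based baseline prefers `merge_now`, while both risk metrics prefer `yield` because they penalize the rare severe outcome. This is the kind of behavior a risk-aware selection policy is intended to produce.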