Decision making under uncertainty can be framed as a partially observable Markov decision process (POMDP). Solving POMDPs exactly is computationally intractable in general, but solutions can be approximated by sampling-based approaches. These sampling-based POMDP solvers rely on multi-armed bandit (MAB) heuristics, which assume that the outcomes of different actions are uncorrelated. In some applications, however, such as motion planning in continuous spaces, similar actions yield similar outcomes. In this paper, we utilize variants of MAB heuristics that place Lipschitz continuity assumptions on the outcomes of actions to improve the efficiency of sampling-based planning approaches. We demonstrate the effectiveness of this approach in the context of motion planning for automated driving.
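To illustrate the core idea, the following is a minimal sketch of how a Lipschitz continuity assumption can tighten standard UCB1 action selection: if the value of actions is assumed to satisfy |Q(a) - Q(a')| ≤ L·|a - a'|, then each arm's upper confidence bound can be capped by bounds transferred from nearby arms. The function name `lipschitz_ucb`, the scalar-action setting, and the constant `L` are illustrative assumptions for this sketch, not the paper's implementation.

```python
import math

def lipschitz_ucb(actions, means, counts, total, L, c=1.0):
    """Select an action index using UCB1 bounds tightened by a Lipschitz
    assumption |Q(a) - Q(a')| <= L * |a - a'| on action outcomes.

    actions : list of scalar actions (e.g., candidate accelerations)
    means   : empirical mean return per action
    counts  : visit count per action (each assumed >= 1)
    total   : total number of samples across all actions
    L       : assumed Lipschitz constant of the value over actions
    """
    n = len(actions)
    # Plain UCB1 bound for each sampled action.
    ucb = [means[i] + c * math.sqrt(2 * math.log(total) / counts[i])
           for i in range(n)]
    # Lipschitz tightening: arm i's value can exceed arm j's bound
    # by at most L times the distance between the two actions, so
    # take the minimum over all such transferred bounds (j == i
    # recovers the plain UCB1 bound).
    tightened = [min(ucb[j] + L * abs(actions[i] - actions[j])
                     for j in range(n))
                 for i in range(n)]
    return max(range(n), key=lambda i: tightened[i])

# Usage example with hypothetical statistics from a search node:
actions = [-2.0, -1.0, 0.0, 1.0, 2.0]
means = [0.1, 0.4, 0.5, 0.45, 0.2]
counts = [3, 8, 10, 7, 2]
best = lipschitz_ucb(actions, means, counts, sum(counts), L=0.3)
print(actions[best])
```

Because each tightened bound is at most the plain UCB1 bound, this selection rule explores poorly valued regions less aggressively when their neighbors already look unpromising, which is the efficiency gain the Lipschitz assumption is meant to deliver.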