In probabilistic control, a controller is designed by matching the modelled closed-loop trajectory distribution to some arbitrary but desired closed-loop trajectory distribution. In this work we review several productive approaches to measuring the proximity between the probable and the desired behaviour. We then illustrate how the associated optimization problems yield uncertain policies. Our main result is to show that these probabilistic control objectives majorize conventional optimal control objectives, both stochastic and risk-sensitive. This observation allows us to identify two probabilistic fixed-point iterations that converge to the deterministic optimal control policies. Based on these insights, we discuss directions for future algorithmic development and point out some remaining challenges.