Distributionally Robust Trajectory Optimization Under Uncertain Dynamics via Relative Entropy Trust-Regions

Trajectory optimization and model predictive control are essential techniques underpinning advanced robotic applications, ranging from autonomous driving to full-body humanoid control. State-of-the-art algorithms have focused on data-driven approaches that infer the system dynamics online and incorporate posterior uncertainty during planning and control. Despite their success, such approaches are still susceptible to catastrophic errors that may arise due to statistical learning biases, unmodeled disturbances, or even directed adversarial attacks. In this paper, we tackle the problem of dynamics mismatch and propose a distributionally robust optimal control formulation that alternates between two relative entropy trust-region optimization problems. Our method finds the worst-case maximum entropy Gaussian posterior over the dynamics parameters and the corresponding robust policy. Furthermore, we show that our approach admits a closed-form backward-pass for a certain class of systems. Finally, we demonstrate the resulting robustness on linear and nonlinear numerical examples.
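The abstract does not spell out the optimization details, so the following is only a generic illustration of what a worst-case distribution inside a relative entropy trust region can look like, not the paper's algorithm. It assumes a finite set of sampled dynamics parameters with nominal weights `p` and a cost per sample, and uses the standard result that maximizing expected cost subject to KL(q || p) <= eps yields an exponentially tilted distribution q ∝ p · exp(cost / eta). The names `worst_case_weights`, `eps`, and `eta` are hypothetical.

```python
import numpy as np


def kl(q, p):
    """KL(q || p) for discrete distributions; terms with q == 0 contribute 0."""
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))


def worst_case_weights(p, cost, eps, iters=100):
    """Maximize sum(q * cost) subject to KL(q || p) <= eps.

    Illustrative assumption: p is strictly positive and the adversary may
    only reweight the given samples. The temperature eta is found by
    bisection, since KL(q_eta || p) decreases as eta grows.
    """
    def tilt(eta):
        logw = np.log(p) + cost / eta
        logw -= logw.max()          # stabilize the exponentials
        w = np.exp(logw)
        return w / w.sum()

    lo, hi = 1e-6, 1e6              # hi is always feasible (q close to p)
    for _ in range(iters):
        mid = np.sqrt(lo * hi)      # bisection on a log scale
        if kl(tilt(mid), p) > eps:
            lo = mid                # too adversarial, raise the temperature
        else:
            hi = mid
    return tilt(hi)


# Five hypothetical dynamics samples with uniform nominal weights.
p = np.full(5, 0.2)
cost = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
q = worst_case_weights(p, cost, eps=0.1)
```

The adversarial weights `q` shift probability mass toward the high-cost samples while staying within the KL budget. A robust planner in this spirit would then optimize its controls against `q` rather than `p`, and the two steps would alternate.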


Related content

Relative entropy


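Relative entropy, also known as Kullback-Leibler divergence or information divergence, is an asymmetric measure of the difference between two probability distributions. In information theory, the relative entropy of a distribution P with respect to a distribution Q equals the cross-entropy of P and Q minus the Shannon entropy of P.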
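A minimal sketch of this definition for discrete distributions, assuming both distributions are strictly positive on the same support. The function names and the example vectors `p` and `q` are illustrative choices, not taken from the source.

```python
import numpy as np


def entropy(p):
    """Shannon entropy H(P) in nats."""
    return float(-np.sum(p * np.log(p)))


def cross_entropy(p, q):
    """Cross-entropy H(P, Q) in nats."""
    return float(-np.sum(p * np.log(q)))


def kl_divergence(p, q):
    """Relative entropy D_KL(P || Q) = sum_i p_i * log(p_i / q_i)."""
    return float(np.sum(p * np.log(p / q)))


# Two hypothetical distributions over three outcomes.
p = np.array([0.5, 0.4, 0.1])
q = np.array([0.3, 0.3, 0.4])

forward = kl_divergence(p, q)    # D_KL(P || Q)
backward = kl_divergence(q, p)   # D_KL(Q || P), generally different
identity_gap = forward - (cross_entropy(p, q) - entropy(p))
```

Here `forward` and `backward` differ, which illustrates the asymmetry, and `identity_gap` is zero up to floating-point error, which matches the cross-entropy relation stated above.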
