There is a widespread intuition that model-based control methods should be able to surpass the data efficiency of model-free approaches. In this paper we evaluate this intuition on a range of challenging locomotion tasks. We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning; the learned policy serves as a proposal for MPC. We find that well-tuned model-free agents are strong baselines even for high-DoF control problems, but MPC with learned proposals and models (trained on the fly or transferred from related tasks) can significantly improve performance and data efficiency in hard multi-task/multi-goal settings. Finally, we show that it is possible to distil a model-based planner into a policy that amortizes the planning computation without any loss of performance. Videos of agents performing different tasks can be seen at https://sites.google.com/view/mbrl-amortization/home.
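The abstract's core mechanism, a sampling-based MPC loop that uses a learned policy as a proposal distribution for candidate action sequences, can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: the dynamics model, proposal policy, reward, and all hyperparameters (`horizon`, `num_samples`, `noise`) are stand-in assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(state, action):
    # Stand-in "learned" dynamics: additive transitions with a reward
    # that favours states near the origin (purely illustrative).
    next_state = state + action
    reward = -np.abs(next_state).sum()
    return next_state, reward

def proposal_policy(state):
    # Stand-in "learned" proposal: nudge the state toward the origin.
    return -0.5 * state

def mpc_plan(state, horizon=5, num_samples=64, noise=0.1):
    """Sample action sequences around the proposal policy, roll each out
    through the learned model, and return the first action of the
    best-scoring sequence (receding-horizon control)."""
    best_return, best_first_action = -np.inf, None
    for _ in range(num_samples):
        s, total, first_action = state.copy(), 0.0, None
        for t in range(horizon):
            # Perturb the proposal's action with exploration noise.
            a = proposal_policy(s) + noise * rng.standard_normal(s.shape)
            if t == 0:
                first_action = a
            s, r = model(s, a)
            total += r
        if total > best_return:
            best_return, best_first_action = total, first_action
    return best_first_action

state = np.array([1.0, -2.0])
action = mpc_plan(state)
```

A good proposal concentrates the sampled sequences near high-return behaviour, which is why the paper's planner benefits from the model-free policy; distillation then trains that policy to imitate the planner's chosen actions, amortizing the planning computation.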