Model Predictive Control (MPC) is attracting tremendous attention in autonomous driving as a powerful control technique. The success of an MPC controller strongly depends on an accurate internal dynamics model. However, static parameters, usually learned by system identification, often fail to adapt to both internal and external perturbations in real-world scenarios. In this paper, we (1) reformulate the problem as a Partially Observable Markov Decision Process (POMDP) that absorbs the uncertainties into the observations and maintains the Markov property in the hidden states; (2) learn a recurrent policy that continually adapts the parameters of the dynamics model via Recurrent Reinforcement Learning (RRL) for optimal and adaptive control; and (3) evaluate the proposed algorithm (referred to as $\textit{MPC-RRL}$) in the CARLA simulator, showing robust behaviour under a wide range of perturbations.
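For readers less familiar with the formalism, the reformulation in (1) can be summarized in standard POMDP notation (textbook notation, not taken from the paper itself):

\[
\mathcal{M} = \left(\mathcal{S}, \mathcal{A}, T, R, \Omega, O, \gamma\right),
\]

where $\mathcal{S}$ is the hidden state space, $\mathcal{A}$ the action space, $T(s' \mid s, a)$ the transition kernel, $R(s, a)$ the reward, $\Omega$ the observation space, $O(o \mid s', a)$ the observation model, and $\gamma$ the discount factor. In the reformulation described above, internal and external perturbations are absorbed into the observation model $O$, while the hidden state $s \in \mathcal{S}$ retains the Markov property.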
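To make the control loop in (2) concrete, below is a minimal, self-contained sketch of the MPC-RRL idea: a small recurrent network summarizes the observation history in its hidden state and emits dynamics-model parameters, which a sampling-based MPC then uses to plan. Everything here is illustrative and assumed, not the authors' implementation: the kinematic bicycle model, the random-shooting planner, the parameter bounds, and all names (`RecurrentPolicy`, `mpc_plan`, `bicycle_step`) are hypothetical, and the network weights are random rather than trained with RRL.

```python
# Minimal sketch (not the authors' implementation) of a recurrent policy
# that adapts the parameters of an MPC internal dynamics model online.
import numpy as np

H = 10       # MPC planning horizon (assumed)
N_CAND = 64  # number of sampled action sequences (random-shooting MPC)

def bicycle_step(state, action, params):
    """One step of a simple kinematic bicycle model (hypothetical stand-in).
    state = (x, y, yaw, v); action = (accel, steer);
    params = (wheelbase, drag) -- the quantities the recurrent policy adapts."""
    x, y, yaw, v = state
    accel, steer = action
    wheelbase, drag = params
    dt = 0.1
    x += v * np.cos(yaw) * dt
    y += v * np.sin(yaw) * dt
    yaw += v / max(wheelbase, 1e-3) * np.tan(steer) * dt
    v += (accel - drag * v) * dt
    return np.array([x, y, yaw, v])

def mpc_plan(state, params, goal, rng):
    """Random-shooting MPC: sample action sequences, roll out the adapted
    model over H steps, return the first action of the cheapest sequence."""
    best_cost, best_a0 = np.inf, np.zeros(2)
    for _ in range(N_CAND):
        seq = rng.uniform([-3.0, -0.5], [3.0, 0.5], size=(H, 2))
        s, cost = state.copy(), 0.0
        for a in seq:
            s = bicycle_step(s, a, params)
            cost += np.sum((s[:2] - goal) ** 2)  # track the goal position
        if cost < best_cost:
            best_cost, best_a0 = cost, seq[0]
    return best_a0

class RecurrentPolicy:
    """Tiny GRU-like recurrent policy: maps the observation history,
    summarized in a hidden state, to dynamics-model parameters. In MPC-RRL
    this network would be trained with recurrent RL; here the weights are
    random purely to keep the sketch self-contained."""
    def __init__(self, obs_dim, hid_dim=16, rng=None):
        rng = rng if rng is not None else np.random.default_rng(0)
        self.Wz = rng.normal(0, 0.1, (hid_dim, obs_dim + hid_dim))
        self.Wh = rng.normal(0, 0.1, (hid_dim, obs_dim + hid_dim))
        self.Wo = rng.normal(0, 0.1, (2, hid_dim))
        self.h = np.zeros(hid_dim)

    def step(self, obs):
        xh = np.concatenate([obs, self.h])
        z = 1.0 / (1.0 + np.exp(-self.Wz @ xh))  # update gate
        h_tilde = np.tanh(self.Wh @ xh)          # candidate hidden state
        self.h = (1 - z) * self.h + z * h_tilde
        raw = self.Wo @ self.h
        # Squash outputs into physically plausible ranges (assumed bounds).
        wheelbase = 2.5 + 1.0 * np.tanh(raw[0])
        drag = 0.1 * (1.0 + np.tanh(raw[1]))
        return np.array([wheelbase, drag])

rng = np.random.default_rng(1)
policy = RecurrentPolicy(obs_dim=4)
state, goal = np.array([0.0, 0.0, 0.0, 5.0]), np.array([20.0, 5.0])
for t in range(50):
    params = policy.step(state)                  # adapt model parameters online
    action = mpc_plan(state, params, goal, rng)  # plan with the adapted model
    state = bicycle_step(state, action, params)  # stand-in for the real env
print("final position:", state[:2])
```

Random-shooting MPC is used here purely to keep the sketch dependency-free; any gradient-based or iLQR-style planner could consume the same adapted parameters.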