Approximate model-predictive control (AMPC) aims to imitate an MPC's behavior with a neural network, removing the need to solve an expensive optimization problem at runtime. However, during deployment, the parameters of the underlying MPC must usually be fine-tuned. This often renders AMPC impractical, as it requires repeatedly generating a new dataset and retraining the neural network. Recent work addresses this problem by adapting AMPC without retraining, using approximated sensitivities of the MPC's optimization problem. Currently, this adaptation must be done by hand, which is labor-intensive and can be unintuitive for high-dimensional systems. To solve this issue, we propose using Bayesian optimization to tune the parameters of AMPC policies based on experimental data. By combining model-based control with direct and local learning, our approach achieves performance superior to nominal AMPC on hardware, with minimal experimentation. This allows automatic and data-efficient adaptation of AMPC to new system instances and fine-tuning to cost functions that are difficult to implement directly in MPC. We demonstrate the proposed method in hardware experiments for the swing-up maneuver on an inverted cartpole and for yaw control of an underactuated balancing unicycle robot, a challenging control problem.
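To illustrate the core idea of tuning controller parameters from a handful of experiments, the following is a minimal, self-contained sketch of Bayesian optimization with a Gaussian-process surrogate and an expected-improvement acquisition. It is not the authors' implementation: the scalar parameter, the bounds, the kernel hyperparameters, and the function names (`bayes_opt`, `closed_loop_cost`) are illustrative assumptions, and a cheap quadratic stands in for the measured closed-loop cost of a hardware rollout.

```python
import numpy as np
from math import erf

def rbf_kernel(a, b, length=0.2, var=1.0):
    """Squared-exponential kernel between 1-D parameter arrays."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """GP posterior mean/variance at candidates Xs, given data (X, y)."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf_kernel(X, Xs)
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = rbf_kernel(Xs, Xs).diagonal() - np.sum(v ** 2, axis=0)
    return mu, np.maximum(var, 1e-12)

def expected_improvement(mu, var, best):
    """EI acquisition for minimization of the observed cost."""
    sigma = np.sqrt(var)
    z = (best - mu) / sigma
    Phi = np.array([0.5 * (1.0 + erf(zi / np.sqrt(2.0))) for zi in z])
    phi = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return (best - mu) * Phi + sigma * phi

def bayes_opt(closed_loop_cost, bounds=(0.0, 1.0), n_init=4, n_iter=10, seed=0):
    """Tune one scalar controller parameter from sequential experiments."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(bounds[0], bounds[1], n_init)   # initial experiments
    y = np.array([closed_loop_cost(x) for x in X])
    cand = np.linspace(bounds[0], bounds[1], 201)   # acquisition grid
    for _ in range(n_iter):
        mu, var = gp_posterior(X, y, cand)
        x_next = cand[np.argmax(expected_improvement(mu, var, y.min()))]
        X = np.append(X, x_next)                    # run one more experiment
        y = np.append(y, closed_loop_cost(x_next))
    i = np.argmin(y)
    return X[i], y[i]

# Stand-in for a hardware rollout: quadratic closed-loop cost, minimum at 0.6.
best_x, best_y = bayes_opt(lambda q: (q - 0.6) ** 2)
```

In the setting of the paper, each evaluation of `closed_loop_cost` would correspond to one hardware rollout of the (sensitivity-adapted) AMPC policy, which is why a sample-efficient optimizer such as Bayesian optimization is the natural choice.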