There has recently been increased interest in reinforcement learning for nonlinear control problems. However, standard reinforcement learning algorithms often struggle even on seemingly simple set-point control problems. This paper argues that three ideas can improve reinforcement learning methods even for highly nonlinear set-point control problems: 1) make use of a prior feedback controller to aid amplitude exploration; 2) use integrated errors; 3) train on model ensembles. Together, these ideas lead to more efficient training and a trained set-point controller that is more robust to modelling errors and can therefore be deployed directly to real-world nonlinear systems. The claim is supported by experiments on a real-world nonlinear cascaded tank process and a simulated, strongly nonlinear pH-control system.
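The three ideas can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not the paper's actual setup: a toy first-order tank model stands in for the process, a fixed proportional controller stands in for the prior feedback controller, and the ensemble is formed by perturbing the model parameters.

```python
import numpy as np

class SetPointEnv:
    """Toy first-order tank: x' = -a*x + b*u; goal is to drive x to ref.
    Hypothetical minimal environment used only to illustrate the three ideas."""

    def __init__(self, a, b, dt=0.1, ref=1.0):
        self.a, self.b, self.dt, self.ref = a, b, dt, ref
        self.x = 0.0
        self.int_err = 0.0  # idea 2: integrated set-point error

    def reset(self):
        self.x, self.int_err = 0.0, 0.0
        return self._obs()

    def _obs(self):
        # The observation includes the integrated error, so a learned policy
        # can remove steady-state offset, analogous to the I-term in PID.
        return np.array([self.x, self.ref - self.x, self.int_err])

    def step(self, residual_u):
        err = self.ref - self.x
        # Idea 1: a prior P-controller supplies the baseline action;
        # the RL policy only needs to learn a residual correction.
        u = 2.0 * err + residual_u
        self.x += (-self.a * self.x + self.b * u) * self.dt
        self.int_err += err * self.dt
        reward = -err ** 2
        return self._obs(), reward

# Idea 3: train on an ensemble of models with perturbed parameters,
# so the learned controller is robust to modelling errors.
ensemble = [SetPointEnv(a=1.0 * k, b=0.5 * k) for k in (0.8, 1.0, 1.2)]
```

Rolling out the nominal model with a zero residual action shows the prior controller alone reduces the error but leaves a steady-state offset, which motivates feeding the integrated error to the learned policy.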