We propose a parameterization of nonlinear output-feedback controllers for linear dynamical systems that combines a recently developed class of neural networks, the recurrent equilibrium network (REN), with a nonlinear version of the Youla parameterization. Our approach guarantees closed-loop stability for partially observable linear dynamical systems without requiring any constraints to be satisfied. This significantly simplifies model fitting, since any unconstrained optimization procedure can be applied while stability is still maintained. We demonstrate our method on reinforcement learning tasks using both exact and approximate gradient methods. Simulation studies show that our method is significantly more scalable than, and substantially outperforms, other approaches in the same problem setting.
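The core idea of the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation and does not use a REN: it assumes a hypothetical scalar plant, hand-picked stabilizing gains `K` and `L`, and a small contracting recurrent map standing in for the Youla parameter. It shows only the general mechanism: an observer-based controller is augmented with a stable nonlinear system `Q` driven by the innovation signal, and the parameters `theta` of `Q` may take any real values without destabilizing the closed loop.

```python
import math
import random

# Hypothetical scalar plant (all values are assumptions for illustration):
#   x[t+1] = A*x[t] + B*u[t] + w[t],   y[t] = C*x[t] + v[t]
A, B, C = 1.2, 1.0, 1.0  # open-loop unstable because |A| > 1
K = 0.9                  # state-feedback gain, A - B*K = 0.3
L = 0.8                  # observer gain,       A - L*C = 0.4


def make_q(theta):
    """Build a stable nonlinear Youla parameter from unconstrained theta.

    theta = (p, beta, gamma) may be any real numbers. The recurrent weight
    alpha = tanh(p) always satisfies |alpha| < 1, and tanh is 1-Lipschitz,
    so the state update is a contraction for every choice of theta.
    This simple map is a stand-in for a REN, not a REN itself.
    """
    p, beta, gamma = theta
    alpha = math.tanh(p)

    def step(q, y_tilde):
        q_next = math.tanh(alpha * q + beta * y_tilde)
        u_tilde = gamma * q_next
        return q_next, u_tilde

    return step


def simulate(theta, steps=200, noise=0.1, seed=0):
    """Simulate the closed loop and return the plant state trajectory."""
    rng = random.Random(seed)
    q_step = make_q(theta)
    x, x_hat, q = 1.0, 0.0, 0.0
    xs = []
    for _ in range(steps):
        w = rng.uniform(-noise, noise)   # bounded process disturbance
        v = rng.uniform(-noise, noise)   # bounded measurement noise
        y = C * x + v
        y_tilde = y - C * x_hat          # innovation (output prediction error)
        q, u_tilde = q_step(q, y_tilde)  # Youla parameter acts on the innovation
        u = -K * x_hat + u_tilde         # base controller plus Youla correction
        x = A * x + B * u + w            # plant update
        x_hat = A * x_hat + B * u + L * y_tilde  # observer update
        xs.append(x)
    return xs


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(3):
        theta = tuple(rng.uniform(-5.0, 5.0) for _ in range(3))
        peak = max(abs(x) for x in simulate(theta))
        print(theta, "peak |x| =", peak)
```

Because the estimation error evolves independently of the control input, the innovation that drives `Q` does not depend on the output of `Q`. The Youla parameter therefore sits in an open-loop cascade between two stable linear systems, which is why an optimizer can update `theta` freely in this toy setting.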