Recent research shows that supervised learning can be an effective tool for designing optimal feedback controllers for high-dimensional nonlinear dynamic systems. However, the behavior of neural network controllers is still not well understood. In particular, some neural networks with high test accuracy can fail to even locally stabilize the dynamic system. To address this challenge, we propose several novel neural network architectures, which we show guarantee local asymptotic stability while retaining the approximation capacity to learn the optimal feedback policy semi-globally. The proposed architectures are compared against standard neural network feedback controllers through numerical simulations of two high-dimensional nonlinear optimal control problems: stabilization of an unstable Burgers-type partial differential equation, and altitude and course tracking for an unmanned aerial vehicle. The simulations demonstrate that standard neural networks can fail to stabilize the dynamics even when trained well, while the proposed architectures are always at least locally stabilizing. Moreover, the proposed controllers are found to be nearly optimal in testing.
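The abstract does not spell out the proposed architectures, so the following is only a minimal sketch of one generic way a network-based controller can be made locally stabilizing by construction. It is not presented as the authors' method. Everything in it is an assumption made for illustration: the two-state toy plant, the hand-picked stabilizing gain `K`, the random (untrained) network weights, and the function names. The idea is to subtract the network's value and its linear part at the origin, so that the closed-loop linearization is fixed by the linear gain alone, regardless of the weights.

```python
import numpy as np

# Hypothetical toy plant x' = A x + B u + (cubic term); the origin is unstable
# in open loop because A has an eigenvalue at +1.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
# Hand-picked stabilizing gain (assumption): A - B K has eigenvalues -1 and -2.
K = np.array([[3.0, 3.0]])

# Random, untrained one-hidden-layer network standing in for a learned policy.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 2)), rng.normal(size=16)
W2, b2 = rng.normal(size=(1, 16)), rng.normal(size=1)


def mlp(x):
    """Plain network output N(x), shape (1,)."""
    return W2 @ np.tanh(W1 @ x + b1) + b2


# Analytic Jacobian of the network at x = 0: W2 diag(1 - tanh(b1)^2) W1.
J0 = (W2 * (1.0 - np.tanh(b1) ** 2)) @ W1


def plain_controller(x):
    """Standard network feedback u = N(x); no stability guarantee."""
    return mlp(x)


def corrected_controller(x):
    """u = -K x + N(x) - N(0) - DN(0) x, so u(0) = 0 and Du(0) = -K."""
    return -K @ x + mlp(x) - mlp(np.zeros(2)) - J0 @ x


def dynamics(x, u):
    """Toy nonlinear dynamics whose linearization at the origin is (A, B)."""
    return A @ x + B @ u + np.array([0.0, x[0] ** 3])


def closed_loop_jacobian(controller, eps=1e-6):
    """Central-difference Jacobian of x -> f(x, u(x)) at the origin."""
    J = np.zeros((2, 2))
    for i in range(2):
        e = np.zeros(2)
        e[i] = eps
        J[:, i] = (dynamics(e, controller(e)) - dynamics(-e, controller(-e))) / (2 * eps)
    return J


if __name__ == "__main__":
    for name, ctrl in [("plain", plain_controller), ("corrected", corrected_controller)]:
        eigs = np.linalg.eigvals(closed_loop_jacobian(ctrl))
        print(name, "u(0) =", ctrl(np.zeros(2)), "closed-loop eigenvalues:", eigs)
```

With the correction, the closed-loop Jacobian at the origin equals `A - B @ K` for any weights, so local asymptotic stability follows from the choice of `K` rather than from training. The plain network generally has `u(0) != 0`, so the origin need not even be an equilibrium. In practice the gain would come from a linear design such as LQR, and the weights would be trained on optimal control data.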