We present a deep learning approach for approximating the solution of the Hamilton-Jacobi-Bellman partial differential equation (HJB PDE) associated with the Nonlinear Quadratic Regulator (NLQR) problem. A state-dependent Riccati equation (SDRE) control law is first used to generate a gradient-augmented synthetic dataset for supervised learning. The resulting model serves as a warm start for the minimization of a loss function based on the residual of the HJB PDE. Combining supervised learning with residual minimization avoids spurious solutions and mitigates the data inefficiency of a supervised-learning-only approach. Numerical tests validate the advantages of the proposed methodology.
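The two ingredients of the loss can be illustrated on the linear-quadratic special case, where the true value function is quadratic, V(x) = xᵀPx with P solving the algebraic Riccati equation. The sketch below is illustrative only and not the paper's implementation: all system matrices (`A`, `B`, `Q`, `R`) and helper names are hypothetical, and a quadratic ansatz stands in for the neural network so that gradients are analytic. It shows a gradient-augmented supervised loss (fitting both V and ∇V samples, as an SDRE-generated dataset would provide) and the HJB residual, which vanishes at the exact value function.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical 2-D linear system dx/dt = A x + B u (LQR special case of NLQR)
A = np.array([[0.0, 1.0], [-1.0, 0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)          # state cost weight
R = np.array([[1.0]])  # control cost weight

def hjb_residual(P, x):
    """HJB residual for the quadratic ansatz V(x) = x^T P x, so grad V = 2 P x.
    The minimizing control is u* = -R^{-1} B^T P x; the residual vanishes
    when P solves the algebraic Riccati equation."""
    grad_V = 2.0 * P @ x
    u = -np.linalg.solve(R, B.T @ P @ x)
    return float(grad_V @ (A @ x + B @ u) + x @ Q @ x + u @ R @ u)

def supervised_loss(P, xs, V_data, gradV_data):
    """Gradient-augmented supervised loss: fit sampled values AND gradients."""
    loss = 0.0
    for x, v, g in zip(xs, V_data, gradV_data):
        loss += (x @ P @ x - v) ** 2 + np.sum((2.0 * P @ x - g) ** 2)
    return loss / len(xs)

# Exact value function from the algebraic Riccati equation
P_star = solve_continuous_are(A, B, Q, R)

# Synthetic gradient-augmented dataset (values and gradients of the true V)
rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 2))
V_data = [x @ P_star @ x for x in xs]
gradV_data = [2.0 * P_star @ x for x in xs]

print(supervised_loss(P_star, xs, V_data, gradV_data))  # ~0 at the true P
print(max(abs(hjb_residual(P_star, x)) for x in xs))    # ~0 at the true P
```

In the method described above, a weighted sum of these two terms would be minimized over network parameters, with the supervised fit providing the warm start before the residual term takes over.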