This paper presents a novel approach for optimal control of nonlinear stochastic systems using infinitesimal generator learning within infinite-dimensional reproducing kernel Hilbert spaces. Our learning framework leverages data samples of system dynamics and stage cost functions, with only control penalties and constraints provided. The proposed method directly learns the diffusion operator of a controlled Fokker-Planck-Kolmogorov equation in an infinite-dimensional hypothesis space. This operator models the continuous-time evolution of the probability measure of the control system's state. We demonstrate that this approach seamlessly integrates with modern convex operator-theoretic Hamilton-Jacobi-Bellman recursions, enabling a data-driven solution to the optimal control problem. Furthermore, our statistical learning framework includes nonparametric estimators for uncontrolled forward infinitesimal generators as a special case. Numerical experiments, ranging from synthetic differential equations to simulated robotic systems, showcase the advantages of our approach compared to both modern data-driven and classical nonlinear programming methods for optimal control.