In this paper, a highly parallel and derivative-free martingale neural network learning method is proposed to solve Hamilton-Jacobi-Bellman (HJB) equations arising from stochastic optimal control problems (SOCPs), as well as general quasilinear parabolic partial differential equations (PDEs). In both cases, the PDEs are reformulated into a martingale formulation such that loss functions will not require the computation of the gradient or Hessian matrix of the PDE solution, while its implementation can be parallelized in both time and spatial domains. Moreover, the martingale conditions for the PDEs are enforced using a Galerkin method in conjunction with adversarial learning techniques, eliminating the need for direct computation of the conditional expectations associated with the martingale property. For SOCPs, a derivative-free implementation of the maximum principle for optimal controls is also introduced. The numerical results demonstrate the effectiveness and efficiency of the proposed method, which is capable of solving HJB and quasilinear parabolic PDEs accurately in dimensions as high as 10,000.
翻译:暂无翻译