The solution to a stochastic optimal control problem can be determined by computing the value function from a discretisation of the associated Hamilton-Jacobi-Bellman equation. Alternatively, the problem can be reformulated in terms of a pair of forward-backward SDEs, which makes Monte Carlo techniques applicable. More recently, the problem has also been viewed from the perspective of forward and reverse time SDEs and their associated Fokker-Planck equations. This approach is closely related to techniques used in score-based generative models. Forward and reverse time formulations express the value function as the ratio of two probability density functions: one stemming from a forward SDE and the other from a reverse time SDE. In this note, we extend this approach to a more general class of stochastic optimal control problems and combine it with ensemble Kalman filter type approximation techniques in order to obtain an efficient and robust numerical scheme.