Devising optimal interventions for diffusive systems often requires solving the Hamilton-Jacobi-Bellman (HJB) equation, a nonlinear backward partial differential equation (PDE) that is, in general, nontrivial to solve. Existing control methods either tackle the HJB equation directly with grid-based PDE solvers, or resort to iterative stochastic path sampling to obtain the necessary controls. Here, we present a framework that interpolates between these two approaches. By reformulating the optimal interventions in terms of logarithmic gradients (scores) of two forward probability flows, and by employing deterministic particle methods for solving Fokker-Planck equations, we introduce a novel, fully deterministic framework that computes the required optimal interventions in one shot.
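
To make the phrase "deterministic particle methods for solving Fokker-Planck equations" concrete, the following is a minimal sketch, not the framework's actual implementation. It relies on the standard identity that a Fokker-Planck equation with drift f and constant noise amplitude sigma can be written as a transport equation with velocity f(x) - (sigma^2 / 2) * d/dx log p_t(x), so particles can be moved deterministically once the score is estimated. The Ornstein-Uhlenbeck drift, the parameter values, and the helper names `gaussian_score` and `deterministic_fp_flow` are illustrative assumptions. In particular, the Gaussian-fit score estimator is a simplification chosen here for brevity (it is exact only while the density stays Gaussian) and is not claimed to be the estimator used in the work.

```python
import numpy as np


def gaussian_score(x):
    """Score d/dx log p(x) of a Gaussian fitted to the particles.

    Assumption for illustration only: exact when p is Gaussian,
    a crude approximation otherwise.
    """
    m = x.mean()
    v = x.var()
    return -(x - m) / v


def deterministic_fp_flow(x0, drift, sigma, dt, n_steps, score=gaussian_score):
    """Propagate particles with Euler steps of
    dx/dt = drift(x) - 0.5 * sigma**2 * score(x).

    No noise is injected: the diffusion term of the Fokker-Planck
    equation enters only through the estimated score.
    """
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        velocity = drift(x) - 0.5 * sigma**2 * score(x)
        x = x + dt * velocity
    return x


# Hypothetical test problem: Ornstein-Uhlenbeck process dX = -theta * X dt + sigma dW.
theta, sigma = 1.0, 1.0
rng = np.random.default_rng(0)            # randomness is used only for the initial particles
x0 = rng.normal(2.0, 0.5, size=500)
xT = deterministic_fp_flow(x0, lambda x: -theta * x, sigma, dt=0.01, n_steps=1000)
print(xT.mean(), xT.var())
```

For this linear test case the particle variance relaxes to the known stationary value sigma^2 / (2 * theta) and the mean decays toward zero, which gives a quick sanity check. For nonlinear drifts the Gaussian fit would have to be replaced by a more flexible score estimator.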