Optimal control of diffusion processes is intimately connected to the problem of solving certain Hamilton-Jacobi-Bellman equations. Building on recent machine-learning-inspired approaches to high-dimensional PDEs, we investigate the potential of $\textit{iterative diffusion optimisation}$ techniques, in particular considering applications in importance sampling and rare event simulation, and focusing on problems without diffusion control, with linearly controlled drift and running costs that depend quadratically on the control. More generally, our methods apply to nonlinear parabolic PDEs with a certain shift invariance. The choice of an appropriate loss function being a central element in the algorithmic design, we develop a principled framework based on divergences between path measures, encompassing various existing methods. Motivated by connections to forward-backward SDEs, we propose and study the novel $\textit{log-variance}$ divergence, showing favourable properties of corresponding Monte Carlo estimators. The promise of the developed approach is exemplified by a range of high-dimensional and metastable numerical examples.
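To make the idea of a loss built from a divergence between measures more concrete, the following is a minimal sketch and not the method of the paper. It assumes that the log-variance divergence takes the form of the variance, under a reference measure, of the log-density ratio between two measures, and it replaces path measures by one-dimensional Gaussians so that densities are available in closed form. The function names (`gaussian_logpdf`, `log_variance_estimate`), the choice of reference distribution and all numerical values are hypothetical.

```python
import numpy as np


def gaussian_logpdf(x, mean, std):
    """Log-density of N(mean, std^2), evaluated elementwise."""
    return -0.5 * np.log(2.0 * np.pi * std**2) - 0.5 * ((x - mean) / std) ** 2


def log_variance_estimate(log_p1, log_p2, samples):
    """Sample variance of log(p2 / p1) over draws from a reference measure.

    Assumption: this is a toy, finite-dimensional stand-in for a
    log-variance divergence between path measures.
    """
    log_ratio = log_p2(samples) - log_p1(samples)
    return np.var(log_ratio, ddof=1)


rng = np.random.default_rng(0)
# Hypothetical reference measure: it need not equal either of the two measures.
samples = rng.normal(loc=0.5, scale=1.5, size=10_000)


def target(x):
    return gaussian_logpdf(x, 0.0, 1.0)


def shifted(x):
    return gaussian_logpdf(x, 1.0, 1.0)


def unnormalised_target(x):
    # Same measure as `target`, but with an unknown additive constant in the log-density.
    return target(x) + 3.0


loss_mismatch = log_variance_estimate(shifted, target, samples)
loss_match = log_variance_estimate(target, target, samples)
loss_unnorm = log_variance_estimate(target, unnormalised_target, samples)

print(loss_mismatch, loss_match, loss_unnorm)
```

In this toy setting the estimate is strictly positive when the two Gaussians differ, and it vanishes (up to floating-point rounding) when they agree, even if one log-density is only known up to an additive constant. Whether and how this carries over to path measures is the subject of the paper itself.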