Novel numerical estimators are proposed for the forward-backward stochastic differential equations (FBSDEs) appearing in the Feynman-Kac representation of the value function. In contrast to current numerical approaches, which discretize the continuous-time FBSDE results, we propose the converse approach: we first obtain a discrete-time approximation of the on-policy value function, and then develop a discrete-time result that resembles its continuous-time counterpart. This approach yields improved numerical estimators in the function approximation phase and supports an enhanced error analysis of those value function estimators. Numerical results and error analysis are demonstrated on a scalar nonlinear stochastic optimal control problem, and they show improved performance of the proposed estimators relative to state-of-the-art methods.