We propose two algorithms for the solution of the optimal control of ergodic McKean-Vlasov dynamics. Both algorithms are based on approximations of the theoretical solutions by neural networks, the latter being characterized by their architecture and a set of parameters. This allows the use of modern machine learning tools and efficient implementations of stochastic gradient descent. The first algorithm is based on the idiosyncrasies of the ergodic optimal control problem. We provide a mathematical proof of the convergence of the approximation scheme, and we rigorously analyze the approximation by controlling the different sources of error. The second method is an adaptation of the deep Galerkin method to the system of partial differential equations arising from the optimality conditions. We demonstrate the efficiency of these algorithms on several numerical examples, some of them chosen to show that our algorithms succeed where existing ones fail. We also argue that both methods can easily be applied to problems in dimensions higher than those treated in the existing literature. Finally, we illustrate that, although the first algorithm is specifically designed for mean field control problems, the second one is more general and can also be applied to the partial differential equation systems arising in the theory of mean field games.
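To fix ideas, the following is a minimal, illustrative sketch of the general approach described above (not the paper's actual algorithm or cost functional): a feedback control is parameterized by a neural network and trained by stochastic gradient descent to reduce an empirical time-averaged (ergodic-type) cost along a particle approximation of the McKean-Vlasov dynamics. The dynamics, the running cost f(x, m, a) = (x - m)^2 + a^2 with m the empirical mean, and all hyperparameters are assumptions chosen for readability.

```python
# Illustrative sketch only: NN feedback control trained by SGD on a particle
# approximation of controlled mean-field dynamics with an assumed quadratic cost.
import torch
import torch.nn as nn

torch.manual_seed(0)

control = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(control.parameters(), lr=1e-2)

n_particles, n_steps, dt = 256, 200, 0.05

for iteration in range(100):
    x = torch.randn(n_particles, 1)            # particles approximating the law of X_t
    total_cost = 0.0
    for _ in range(n_steps):
        a = control(x)                          # feedback control alpha_theta(X_t)
        m = x.mean()                            # empirical mean (mean-field interaction)
        total_cost = total_cost + ((x - m) ** 2 + a ** 2).mean() * dt
        noise = torch.randn_like(x)
        x = x + a * dt + noise * dt ** 0.5      # Euler-Maruyama step of the dynamics
    ergodic_cost = total_cost / (n_steps * dt)  # estimate of the long-run average cost
    optimizer.zero_grad()
    ergodic_cost.backward()                     # backpropagate through the simulation
    optimizer.step()
```

The second method would instead train a network (or a system of networks) by minimizing the residuals of the partial differential equations coming from the optimality conditions, in the spirit of the deep Galerkin method; that variant is not shown here.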