Controlling systems of ordinary differential equations (ODEs) is ubiquitous in science and engineering. For finding an optimal feedback controller, the value function and the associated fundamental equations, such as the Bellman equation and the Hamilton-Jacobi-Bellman (HJB) equation, are essential. The numerical treatment of these equations poses formidable challenges due to their non-linearity and their (possibly) high dimensionality. In this paper we consider a finite-horizon control system with an associated Bellman equation. After a time discretization, we obtain a sequence of short-horizon problems which we call local optimal control problems. For solving the local optimal control problems we apply two different methods. The first is the well-known policy iteration, where a fixed-point iteration is required for every time step. The second algorithm borrows ideas from Model Predictive Control (MPC): it solves the local optimal control problem via open-loop control methods on a short time horizon, which allows us to replace the fixed-point iteration by an adjoint method. For high-dimensional systems we apply low-rank hierarchical tensor product approximations (tree-based tensor formats), in particular tensor trains (TT tensors) combined with multivariate polynomials, together with high-dimensional quadrature, e.g. Monte Carlo methods. We prove linear error propagation with respect to the time discretization and give numerical evidence by controlling a diffusion equation with an unstable reaction term and an Allen-Cahn equation.
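The policy iteration mentioned above alternates between evaluating the value function under a fixed feedback law and improving that law. The paper applies it to time-discretized Bellman equations; as a minimal, hedged illustration of the same evaluate/improve structure, the sketch below implements the classical discrete-time LQR variant (Hewer's method), which is not the paper's algorithm. It assumes a stable system matrix `A` so that the zero initial gain is admissible; all names are illustrative.

```python
import numpy as np

def policy_iteration_lqr(A, B, Q, R, iters=30):
    """Policy iteration for discrete-time LQR (Hewer's method):
    alternate policy evaluation (solve a Lyapunov equation for the
    quadratic value-function matrix P) and policy improvement
    (recompute the feedback gain K). Assumes A is stable so that the
    initial policy u = 0 yields a finite value function."""
    n, _ = B.shape
    K = np.zeros((B.shape[1], n))  # initial policy u = -K x = 0
    for _ in range(iters):
        # Policy evaluation: P solves P = Q + K'RK + (A-BK)' P (A-BK).
        Ac = A - B @ K
        # Vectorized Lyapunov solve: (I - Ac'⊗Ac') vec(P) = vec(Q + K'RK).
        M = np.eye(n * n) - np.kron(Ac.T, Ac.T)
        rhs = (Q + K.T @ R @ K).reshape(n * n)
        P = np.linalg.solve(M, rhs).reshape(n, n)
        # Policy improvement: K = (R + B'PB)^{-1} B'PA.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P
```

The fixed point of this iteration is the solution of the discrete algebraic Riccati equation, so the result can be checked against a direct Riccati fixed-point iteration.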
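The tensor-train (TT) format used for the high-dimensional value-function approximation represents a d-dimensional array as a chain of small three-dimensional cores. The sketch below is a generic TT-SVD decomposition via sequential truncated SVDs, intended only to illustrate the format; it is not the paper's solver, and the function names are illustrative.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a d-dimensional array into tensor-train (TT) cores
    by sequential truncated SVDs (the standard TT-SVD scheme).
    Each core has shape (r_{k-1}, n_k, r_k) with r_0 = r_d = 1."""
    shape = tensor.shape
    d = len(shape)
    cores = []
    rank = 1
    mat = tensor.reshape(rank * shape[0], -1)
    for k in range(d - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(rank, shape[k], r))
        # Carry the remaining factor to the next mode.
        mat = (np.diag(s[:r]) @ vt[:r]).reshape(r * shape[k + 1], -1)
        rank = r
    cores.append(mat.reshape(rank, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a full array."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.squeeze(axis=(0, -1))
```

For a tensor of exact low TT rank, the decomposition is lossless; for general tensors, `max_rank` trades accuracy for storage, which is what makes the format attractive in high dimensions.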