Optimal control problems can be solved by first applying the Pontryagin maximum principle and then computing a solution of the corresponding unconstrained Hamiltonian dynamical system. In this paper, to balance robustness and efficiency, we learn a reduced Hamiltonian of the unconstrained Hamiltonian. This reduced Hamiltonian is learned by going backward in time and minimizing the loss function that results from applying the conditions of the Pontryagin maximum principle. The robustness of the learning process is then further improved by progressively learning a posterior distribution of reduced Hamiltonians, which leads to more efficient sampling of the generalized coordinates (position, velocity) of the phase space. Our solution framework applies not only to optimal control problems with finite-dimensional phase (state) spaces but also to the infinite-dimensional case.
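To make the first step concrete, the following is a minimal sketch on a toy problem that is not taken from the paper: minimize the integral of (x^2 + u^2)/2 over [0, T] subject to dx/dt = u, with no terminal cost. Under these assumptions the Pontryagin maximum principle gives u* = p, the maximized Hamiltonian H(x, p) = (p^2 - x^2)/2, and the terminal condition p(T) = 0. The function names `hamiltonian_rhs` and `integrate_backward` are hypothetical, and the fixed-step RK4 integrator stands in for whatever solver one would actually use. The sketch only illustrates integrating the Hamiltonian system backward in time; it does not implement the learned reduced Hamiltonian or the posterior sampling described above.

```python
import math

def hamiltonian_rhs(x, p):
    # Canonical equations for the assumed toy Hamiltonian H(x, p) = (p**2 - x**2) / 2:
    #   dx/dt =  dH/dp = p
    #   dp/dt = -dH/dx = x
    return p, x

def integrate_backward(x_T, T, steps=1000):
    """Integrate the Hamiltonian system with classical RK4 from t = T down to t = 0.

    Starts at the terminal point (x(T), p(T)) = (x_T, 0), where p(T) = 0 is the
    transversality condition for a problem without terminal cost.
    Returns the pair (x(0), p(0)).
    """
    h = -T / steps  # negative step size: we march backward in time
    x, p = x_T, 0.0
    for _ in range(steps):
        k1x, k1p = hamiltonian_rhs(x, p)
        k2x, k2p = hamiltonian_rhs(x + 0.5 * h * k1x, p + 0.5 * h * k1p)
        k3x, k3p = hamiltonian_rhs(x + 0.5 * h * k2x, p + 0.5 * h * k2p)
        k4x, k4p = hamiltonian_rhs(x + h * k3x, p + h * k3p)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return x, p

if __name__ == "__main__":
    x0, p0 = integrate_backward(x_T=1.0, T=1.0)
    # For this toy problem the optimal control at t = 0 equals the costate p(0).
    print("x(0) =", x0, "u*(0) = p(0) =", p0)
```

For this linear toy system the backward solution is known in closed form, x(t) = x_T cosh(T - t) and p(t) = -x_T sinh(T - t), which gives a simple way to check the integrator before replacing the analytic Hamiltonian with a learned one.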