In this paper, we propose novel learning frameworks to tackle optimal control problems by applying the Pontryagin maximum principle and then solving for a Hamiltonian dynamical system. Applying the Pontryagin maximum principle to the original optimal control problem shifts the learning focus to reduced Hamiltonian dynamics and corresponding adjoint variables. Then, the reduced Hamiltonian networks can be learned by going backwards in time and then minimizing loss function deduced from the Pontryagin maximum principle's conditions. The learning process is further improved by progressively learning a posterior distribution of the reduced Hamiltonians. This is achieved through utilizing a variational autoencoder which leads to more effective path exploration process. We apply our learning frameworks called NeuralPMP to various control tasks and obtain competitive results.
翻译:暂无翻译