Learning-based predictive control is a promising alternative to optimization-based model predictive control (MPC). However, efficiently learning the optimal control policy, the optimal value function, or the Q-function requires suitable function approximators. Artificial neural networks (ANNs) are a common choice, but selecting a suitable topology is itself non-trivial. Against this background, it has recently been shown that tailored ANNs can, in principle, exactly describe the optimal control policy in linear MPC by exploiting its piecewise affine structure. In this paper, we provide a similar result for representing the optimal value function and the Q-function, both of which are known to be piecewise quadratic for linear MPC.
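To make the idea of an exact ANN representation of a piecewise affine function concrete, the following minimal sketch uses a toy example that is not taken from the paper: a hypothetical scalar saturated linear feedback u = clip(k x, -u_max, u_max), which is piecewise affine with three regions. The gain `k`, the bound `u_max`, and all function names are assumptions chosen purely for illustration. The sketch shows that one hidden layer with two ReLU units reproduces this function exactly rather than approximately.

```python
def relu(z: float) -> float:
    """Rectified linear unit, max(0, z)."""
    return max(0.0, z)


def pwa_policy(x: float, k: float = -0.5, u_max: float = 1.0) -> float:
    """Reference piecewise affine law: linear feedback k*x saturated at +/- u_max.

    k and u_max are hypothetical values, not parameters from the paper.
    """
    return min(u_max, max(-u_max, k * x))


def relu_policy(x: float, k: float = -0.5, u_max: float = 1.0) -> float:
    """The same law written as a network with one hidden layer of two ReLU units.

    Hidden layer: h1 = relu(k*x + u_max), h2 = relu(k*x - u_max)
    Output layer: u  = h1 - h2 - u_max
    """
    h1 = relu(k * x + u_max)
    h2 = relu(k * x - u_max)
    return h1 - h2 - u_max


if __name__ == "__main__":
    # Compare both representations in all three regions of the toy law.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(x, pwa_policy(x), relu_policy(x))
```

The identity can be checked region by region: for k x above u_max both units are active and the output is u_max, for k x between the bounds only the first unit is active and the output is k x, and for k x below -u_max neither unit is active and the output is -u_max. The piecewise quadratic value function and Q-function addressed in the paper require a different construction that this sketch does not attempt to reproduce.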