While many theoretical works on Adaptive Dynamic Programming (ADP) have been proposed, application results are scarce. We therefore design an ADP-based optimal trajectory tracking controller and apply it to a large-scale ball-on-plate system. Instead of setpoint tracking, our method incorporates an approximated reference trajectory and automatically compensates for constant offset terms. Because the algorithm is off-policy, only a small amount of measured data is required to train the controller. Our experimental results show that this tracking mechanism significantly reduces the control cost compared to setpoint controllers. Furthermore, a comparison with a model-based optimal controller highlights the benefits of our model-free, data-based ADP tracking controller: neither a system model nor manual tuning is required, since the controller is tuned automatically from measured data.