Traditional linear control strategies have been extensively researched and used in many robotic and industrial applications, yet they do not account for the full dynamics of the systems they control. Nonlinear control schemes such as H-infinity control and Predictive Control require tedious calculations, and Reinforcement Learning (RL) can provide an alternative. This article presents the implementation of RL control with Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO) on a mobile self-balancing Extendible Wheeled Inverted Pendulum (E-WIP) system. Such RL models make it easier to find a satisfactory control scheme and respond effectively to the system dynamics while self-tuning their parameters to provide better control. The two RL-based controllers are compared against a Model Predictive Controller (MPC), with performance evaluated on the state variables of the E-WIP system as it follows a specified desired trajectory.