In the last few years, researchers have applied machine learning strategies to vehicular platoons to increase the safety and efficiency of cooperative transportation. Reinforcement Learning methods have been employed in the longitudinal spacing control of Cooperative Adaptive Cruise Control systems, but to date none of these studies has addressed disturbance rejection in such scenarios. Factors such as uncertain model parameters and external disturbances may prevent agents from reaching zero spacing error when traveling at cruising speed. In addition, complex communication topologies lead to topology-specific training processes that cannot be generalized to other contexts, so the agents must be retrained every time the configuration changes. Therefore, in this paper, we propose an approach that generalizes the training process of a vehicular platoon so that the acceleration command of each agent becomes independent of the network topology. We also model the acceleration input as a term with integral action, so that the Artificial Neural Network can learn corrective actions when the states are disturbed by unknown effects. We illustrate the effectiveness of our proposal with experiments using different network topologies, uncertain parameters, and external forces. We compare the results, in terms of steady-state error and overshoot, against state-of-the-art approaches from the literature. The findings offer new insights into the generalization and robustness of Reinforcement Learning in the control of autonomous platoons.
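The role of integral action in removing steady-state spacing error can be illustrated with a minimal sketch. This is not the learned policy described above: it replaces the Artificial Neural Network with fixed linear gains, and the double-integrator follower model, the function name `simulate`, the gains, and the constant disturbance value are all assumptions chosen for illustration.

```python
def simulate(kp, kd, ki, disturbance=-0.5, dt=0.01, steps=6000):
    """Final spacing error of one follower behind a leader at constant speed.

    Hypothetical toy model (not from the paper): the follower is a double
    integrator and `disturbance` is an unknown constant acceleration
    (for example drag or road grade) added to the commanded acceleration.
    """
    e = 1.0   # spacing error: actual gap minus desired gap [m]
    v = 0.0   # relative speed: leader speed minus follower speed [m/s]
    z = 0.0   # running integral of the spacing error [m*s]
    for _ in range(steps):
        u = kp * e + kd * v + ki * z   # commanded acceleration
        a = u + disturbance            # acceleration actually achieved
        # Forward-Euler step; the right-hand side uses the old state values.
        e, v, z = e + dt * v, v - dt * a, z + dt * e
    return e


error_without_integral = simulate(kp=1.0, kd=1.5, ki=0.0)
error_with_integral = simulate(kp=1.0, kd=1.5, ki=0.3)
print(f"without integral action: {error_without_integral:.4f} m")
print(f"with integral action:    {error_with_integral:.4f} m")
```

In this toy model, the controller without an integral term settles at a spacing error of `-disturbance / kp`, whereas the integral term accumulates error until the commanded acceleration cancels the disturbance and the spacing error decays toward zero.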