Safety Enhancement for Deep Reinforcement Learning in Autonomous Separation Assurance

The separation assurance task becomes extremely challenging for air traffic controllers in complex, high-density airspace. In our previous work, we used deep reinforcement learning (DRL) to develop an autonomous separation assurance framework in which the learned model advises speed maneuvers. To improve the safety of this model in unseen environments with uncertainties, this work proposes a safety module for DRL in autonomous separation assurance applications. The proposed module directly addresses both model uncertainty and state uncertainty. It consists of two sub-modules: (1) the state safety sub-module, which uses execution-time data augmentation to introduce disturbances into the model's input state; and (2) the model safety sub-module, a Monte Carlo dropout extension that learns the posterior distribution of the DRL model's policy. We demonstrate the effectiveness of the two sub-modules in an open-source air traffic simulator with challenging environment settings. Extensive numerical experiments show that the proposed safety sub-modules help the DRL agent significantly improve its safety performance in an autonomous separation assurance task.
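The two sub-modules can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy two-layer policy network, the three discrete speed actions, the Gaussian noise model, the sample counts, and the aggregation rules (majority vote for the state safety sub-module, mean action probability for the model safety sub-module). The function names `state_safety_action` and `model_safety_action` are hypothetical.

```python
import numpy as np

def policy_logits(state, weights, rng=None, drop_p=0.0):
    """Toy two-layer policy network (hypothetical); dropout is active when drop_p > 0."""
    w1, w2 = weights
    hidden = np.maximum(0.0, state @ w1)            # ReLU hidden layer
    if drop_p > 0.0:
        mask = rng.random(hidden.shape) >= drop_p   # keep each unit with prob 1 - drop_p
        hidden = hidden * mask / (1.0 - drop_p)     # inverted-dropout scaling
    return hidden @ w2

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def state_safety_action(state, weights, rng, n_samples=32, noise_std=0.05):
    """Execution-time augmentation: perturb the input state, then majority-vote (assumed rule)."""
    votes = np.zeros(weights[1].shape[1], dtype=int)
    for _ in range(n_samples):
        noisy = state + rng.normal(0.0, noise_std, size=state.shape)
        votes[np.argmax(policy_logits(noisy, weights))] += 1
    return int(np.argmax(votes)), votes

def model_safety_action(state, weights, rng, n_samples=32, drop_p=0.2):
    """Monte Carlo dropout: average action probabilities over stochastic forward passes."""
    probs = np.stack([
        softmax(policy_logits(state, weights, rng, drop_p))
        for _ in range(n_samples)
    ])
    mean_probs = probs.mean(axis=0)                 # approximate posterior-mean policy
    return int(np.argmax(mean_probs)), mean_probs

rng = np.random.default_rng(0)
# Random weights stand in for a trained policy: 4 state features, 3 speed actions (assumed).
weights = (rng.normal(size=(4, 16)), rng.normal(size=(16, 3)))
state = rng.normal(size=4)

a_state, votes = state_safety_action(state, weights, rng)
a_model, mean_probs = model_safety_action(state, weights, rng)
print("state-safety action:", a_state, "votes:", votes)
print("model-safety action:", a_model, "mean probs:", mean_probs)
```

In both cases the agent acts on an aggregate over many sampled forward passes rather than on a single one, so the spread of the votes or probabilities gives a rough indication of how sensitive the chosen maneuver is to state noise or to model uncertainty.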

