This work aims to enable autonomous agents for network cyber operations (CyOps) through reinforcement learning and deep reinforcement learning (RL/DRL). The required RL training environment is particularly challenging: it must balance high fidelity, best achieved through real network emulation, against the ability to run large numbers of training episodes, best achieved through simulation. A unified training environment, namely the Cyber Gym for Intelligent Learning (CyGIL), is developed in which the emulated CyGIL-E automatically generates the simulated CyGIL-S. Preliminary experimental results show that CyGIL-S can train agents in minutes, compared with the days required in CyGIL-E. Agents trained in CyGIL-S transfer directly to CyGIL-E, showing full decision proficiency in the emulated "real" network. By enabling offline RL, the CyGIL solution presents a promising sim-to-real direction for leveraging RL agents in real-world cyber networks.