Autonomous cyber agents may be developed by applying reinforcement learning and deep reinforcement learning (RL/DRL), where agents are trained in a representative environment. The training environment must simulate with high fidelity the network Cyber Operations (CyOp) that the agent aims to explore. Given the complexity of network CyOps, a good simulator is difficult to achieve. This work presents a systematic solution for automatically generating a high-fidelity simulator in the Cyber Gym for Intelligent Learning (CyGIL). Through representation learning and continuous learning, CyGIL provides a unified CyOp training environment in which an emulated CyGIL-E automatically generates a simulated CyGIL-S. The simulator generation is integrated with the agent training process to further reduce the required agent training time. An agent trained in CyGIL-S is directly transferable to CyGIL-E, showing full transferability to the emulated "real" network. Experimental results are presented to demonstrate the CyGIL training performance. By enabling offline RL, the CyGIL solution presents a promising sim-to-real direction for leveraging RL agents in real-world cyber networks.