Recently, reinforcement and deep reinforcement learning (RL/DRL) have been applied to develop autonomous agents for cyber network operations(CyOps), where the agents are trained in a representative environment using RL and particularly DRL algorithms. The training environment must simulate CyOps with high fidelity, which the agent aims to learn and accomplish. A good simulator is hard to achieve due to the extreme complexity of the cyber environment. The trained agent must also be generalizable to network variations because operational cyber networks change constantly. The red agent case is taken to discuss these two issues in this work. We elaborate on their essential requirements and potential solution options, illustrated by some preliminary experimentations in a Cyber Gym for Intelligent Learning (CyGIL) testbed.
翻译:暂无翻译