Developing autonomous cyber defense strategies and action recommendations for real-world systems is challenging: it requires characterizing uncertainty in system state as well as attack-defense dynamics. We propose a data-driven deep reinforcement learning (DRL) framework that learns proactive, context-aware defense countermeasures which adapt dynamically to evolving adversarial behavior while minimizing the loss of cyber system operations. We formulate a dynamic defense optimization problem with multiple protective postures against different types of adversaries with varying levels of skill and persistence. We developed a custom simulation environment and devised experiments to systematically evaluate four model-free DRL algorithms against realistic, multi-stage attack sequences. Our results suggest the efficacy of DRL algorithms for proactive cyber defense under multi-stage attack profiles and system uncertainties.