In this work, we present a learning-based approach to analyzing cyberspace security configurations. Unlike prior methods, our approach learns from past experience and improves over time. In particular, as we train over a greater number of agents acting as attackers, our method becomes better at discovering attack paths that previous methods miss, especially in multi-domain cyberspace. To achieve these results, we pose attack path discovery as a Reinforcement Learning (RL) problem and train an agent to discover multi-domain cyberspace attack paths. To enable our RL policy to discover more hidden attack paths and shorter attack paths, we introduce a multi-domain action selection module into RL. Our objective is to use the discovered attack paths to analyze weaknesses in the cyberspace security configuration. Finally, we design a simulated cyberspace experimental environment to verify the proposed method. The experimental results show that our method discovers more hidden multi-domain attack paths, and shorter attack paths, than existing baseline methods.