Reinforcement learning (RL) has proven suitable for developing agents that play complex games at human-level performance. However, it is not yet well understood how to use RL effectively for cybersecurity tasks. Building that understanding requires developing RL agents in simulation and emulation systems that allow researchers to model a broad class of realistic threats and network conditions. Demonstrating that a specific RL algorithm can defend a network effectively under certain conditions does not necessarily give insight into how the algorithm performs when the threats, network conditions, and security goals change. This paper introduces a novel approach to network environment design, together with a software framework, to address the fundamental problem that network defense cannot be defined as a single game with a simple set of fixed rules. We show that our approach is necessary to facilitate the development of RL network defenders that are robust against attacks aimed at the agent's learning. Our framework enables the development and simulation of adversaries with sophisticated behavior, including poisoning and evasion attacks on RL network defenders.