Network Security Games (NSGs) model how resources are deployed to secure critical targets in networks. While recent advances in deep learning (DL) provide a powerful approach to large-scale NSGs, DL methods such as NSG-NFSP suffer from data inefficiency. Moreover, because they rely on centralized control, they cannot scale to scenarios with a large number of resources. In this paper, we propose NSGZero, a novel DL-based method that learns a non-exploitable policy in NSGs. NSGZero improves data efficiency by planning with neural Monte Carlo Tree Search (MCTS). Our main contributions are threefold. First, we design deep neural networks (DNNs) to perform neural MCTS in NSGs. Second, we enable neural MCTS with decentralized control, making NSGZero applicable to NSGs with many resources. Third, we provide an efficient learning paradigm to jointly train the DNNs in NSGZero. Compared with state-of-the-art algorithms, our method achieves significantly better data efficiency and scalability.
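To make the neural-MCTS idea concrete, the following is a minimal, illustrative sketch of the standard PUCT selection rule used in neural MCTS (AlphaZero-style), where a policy network supplies prior probabilities that bias tree search. All class and function names here are hypothetical; the abstract does not specify NSGZero's actual components, and this sketch is not the paper's implementation.

```python
import math

class Node:
    """One node of a (hypothetical) neural-MCTS search tree."""
    def __init__(self, prior):
        self.prior = prior        # P(s, a): prior from the policy network
        self.visit_count = 0      # N(s, a)
        self.value_sum = 0.0      # W(s, a): cumulative backed-up value
        self.children = {}        # action -> Node

    def q_value(self):
        # Mean action value Q(s, a) = W(s, a) / N(s, a)
        return self.value_sum / self.visit_count if self.visit_count else 0.0

def select_child(node, c_puct=1.5):
    """Pick the child action maximizing Q + U, where the exploration
    bonus U is weighted by the network's prior probability."""
    total_n = sum(ch.visit_count for ch in node.children.values())
    best_action, best_score = None, -float("inf")
    for action, child in node.children.items():
        u = c_puct * child.prior * math.sqrt(total_n + 1) / (1 + child.visit_count)
        score = child.q_value() + u
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

With unvisited children, the rule reduces to picking the highest-prior action, so the learned network steers early exploration; as visit counts grow, empirical Q-values dominate.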