Increasing connectivity and intricate remote-access environments have made traditional perimeter-based network defense vulnerable. Zero trust has emerged as a promising approach that provides defense policies based on agent-centric trust evaluation. However, the limited observations of an agent's trace introduce information asymmetry into the decision-making. To facilitate human understanding of the policy and the adoption of the technology, one needs a zero-trust defense that is explainable to humans and adaptable to different attack scenarios. To this end, we propose a scenario-agnostic zero-trust defense based on Partially Observable Markov Decision Processes (POMDPs) and first-order meta-learning using only a handful of sample scenarios. The framework yields an explainable and generalizable trust-threshold defense policy. To address the distribution shift between empirical security datasets and reality, we extend the model to a robust zero-trust defense that minimizes the worst-case loss. We use case studies and real-world attacks to corroborate the results.
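To illustrate the first-order meta-learning step over a handful of sample scenarios, the following is a minimal sketch using a Reptile-style update on a scalar trust threshold. The quadratic surrogate loss, the per-scenario targets, and the learning rates are illustrative placeholders, not the paper's model; the actual framework would instead evaluate the POMDP defense cost under the belief-based trust evaluation.

```python
import numpy as np

# Hypothetical per-scenario surrogate loss for a scalar trust threshold `theta`:
# each sampled scenario i has its own best threshold, stood in by `targets[i]`.
rng = np.random.default_rng(0)
targets = rng.uniform(0.3, 0.7, size=5)   # a handful of sample scenarios

def scenario_loss_grad(theta, target):
    """Gradient of a quadratic surrogate loss (theta - target)^2."""
    return 2.0 * (theta - target)

def adapt(theta, target, inner_lr=0.05, inner_steps=10):
    """Few-step inner adaptation of the trust threshold to one scenario."""
    for _ in range(inner_steps):
        theta = theta - inner_lr * scenario_loss_grad(theta, target)
    return theta

# First-order (Reptile-style) meta-update: move the meta-threshold toward
# the scenario-adapted thresholds so it transfers to unseen attack scenarios.
meta_theta, meta_lr = 0.5, 0.1
for epoch in range(200):
    target = rng.choice(targets)          # sample one training scenario
    adapted = adapt(meta_theta, target)
    meta_theta = meta_theta + meta_lr * (adapted - meta_theta)

print(f"meta-learned trust threshold: {meta_theta:.3f}")
```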