Threat analysts routinely rely on natural-language reports that describe attacker actions without enumerating the full kill chain or the dependencies between phases, making automated reconstruction of ATT&CK-consistent intrusion paths a difficult open problem. We propose a reasoning framework that infers complete seven-phase kill chains by coupling phase-conditioned semantic priors from Transformer models with a symbolic Markov Decision Process and an AlphaZero-style Monte Carlo Tree Search guided by a Policy-Value Network. The framework enforces semantic relevance, phase cohesion, and transition plausibility through a multi-objective reward function while allowing the search to explore alternative interpretations of the cyber threat intelligence (CTI) narrative. Applied to three real intrusions (FIN6, APT24, and UNC1549), the approach yields kill chains that surpass Transformer baselines in semantic fidelity and operational coherence, and that frequently align with expert-selected TTPs. Our results demonstrate that combining contextual embeddings with search-based decision-making offers a practical path toward automated, interpretable kill-chain reconstruction for cyber defense.
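To make the search component concrete, the following is a minimal sketch of the child-selection rule commonly used in AlphaZero-style Monte Carlo Tree Search (PUCT), together with a weighted sum over the three reward terms named above. It is an illustration under stated assumptions, not the framework's actual implementation: the `Node` class, the exploration constant `c_puct`, the linear form of `combined_reward`, and its weights are all hypothetical, and the abstract does not specify how the reward terms are combined.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node; each child corresponds to a candidate technique."""
    prior: float                      # policy-head probability for this action (assumed input)
    visits: int = 0                   # number of simulations through this node
    value_sum: float = 0.0            # sum of backed-up values
    children: dict = field(default_factory=dict)

    def q(self) -> float:
        """Mean backed-up value; 0.0 for an unvisited node."""
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Standard PUCT: exploitation term Q plus a prior-weighted exploration bonus."""
    bonus = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return child.q() + bonus


def select_child(parent: Node, c_puct: float = 1.5):
    """Return the (action, child) pair with the highest PUCT score."""
    return max(parent.children.items(),
               key=lambda item: puct_score(parent, item[1], c_puct))


def combined_reward(relevance: float, cohesion: float, plausibility: float,
                    weights=(1 / 3, 1 / 3, 1 / 3)) -> float:
    """Hypothetical linear combination of the three objectives (weights are assumptions)."""
    w_r, w_c, w_p = weights
    return w_r * relevance + w_c * cohesion + w_p * plausibility


# Illustrative usage with two ATT&CK technique IDs as candidate actions.
root = Node(prior=1.0, visits=4)
root.children["T1566"] = Node(prior=0.6, visits=3, value_sum=1.5)
root.children["T1190"] = Node(prior=0.4)
action, _ = select_child(root)
```

In this toy tree the unvisited candidate receives the larger exploration bonus and is selected, which is the mechanism that lets the search consider alternative readings of a report rather than only the highest-prior technique.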