Insider threats pose a persistent and critical security risk, yet they are notoriously difficult to detect in complex enterprise environments, where malicious actions are often hidden within seemingly benign user behaviors. Although machine-learning-based insider threat detection (ITD) methods have shown promise, their effectiveness is fundamentally limited by the scarcity of high-quality and realistic training data. Enterprise internal data is highly sensitive and rarely accessible, while existing public and synthetic datasets are either small-scale or lack sufficient realism, semantic richness, and behavioral diversity. To address this challenge, we propose Chimera, an LLM-based multi-agent framework that automatically simulates both benign and malicious insider activities and generates comprehensive system logs across diverse enterprise environments. Chimera models each agent as an individual employee with fine-grained roles and supports group meetings, pairwise interactions, and self-organized scheduling to capture realistic organizational dynamics. Based on 15 insider attacks abstracted from real-world incidents, we deploy Chimera in three representative data-sensitive organizational scenarios and construct ChimeraLog, a new dataset for developing and evaluating ITD methods. We evaluate ChimeraLog through human studies and quantitative analyses, demonstrating its diversity and realism. Experiments show that existing ITD methods achieve substantially lower detection performance on ChimeraLog than on prior datasets, indicating a more challenging and realistic benchmark. Moreover, despite distribution shifts, models trained on ChimeraLog exhibit strong generalization, highlighting the practical value of LLM-based multi-agent simulation for advancing insider threat detection.