Cybersecurity vulnerability information is often recorded across multiple channels, including government vulnerability repositories, individually maintained vulnerability-gathering platforms, and vulnerability-disclosure mailing lists and forums. Integrating vulnerability information from these channels enables comprehensive threat assessment and rapid deployment to various security mechanisms. Efforts to gather such information automatically, however, are impeded by the limitations of today's entity alignment techniques. In this study, we annotate the first cybersecurity-domain entity alignment dataset and reveal the unique characteristics of security entities. Based on these observations, we propose the first cybersecurity entity alignment model, CEAM, which equips GNN-based entity alignment with two mechanisms: asymmetric masked aggregation and partitioned attention. Experimental results on cybersecurity-domain entity alignment datasets demonstrate that CEAM significantly outperforms state-of-the-art entity alignment methods.