Owing to its convenience, open-source software is widely used. For reasons they consider beneficial, open-source maintainers often fix vulnerabilities silently, leaving their users unaware of the updates and exposed to threats. Previous work focuses on black-box binary detection of silent dependency alerts, which suffers from high false-positive rates and forces open-source software users to analyze and interpret the AI predictions themselves. Explainable AI has emerged as a complement to black-box AI models, providing details in various forms to explain AI decisions. Noticing that no existing technique can discover silent dependency alerts in a timely and explainable manner, we propose a framework that combines an encoder-decoder model with a binary detector to provide explainable silent dependency alert prediction. Our model generates four types of vulnerability key aspects, namely vulnerability type, root cause, attack vector, and impact, to enhance the trustworthiness of alert predictions and users' acceptance of them. Through experiments with several models and input settings, we confirm that CodeBERT, given both commit messages and code changes, achieves the best results. Our user study shows that explainable alert predictions help users find silent dependency alerts more easily than black-box predictions. To the best of our knowledge, this is the first research work to apply Explainable AI to silent dependency alert prediction, which opens the door to related domains.
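As a rough illustration of the detection half of such a framework, the sketch below pairs the public microsoft/codebert-base checkpoint with a binary classification head that consumes a commit message and its code change as a sentence pair, mirroring the finding that both inputs together work best. This is a minimal, assumption-laden sketch rather than the authors' implementation: the classification head is freshly initialized and would need fine-tuning on commits labeled as silent vulnerability fixes, and the example commit is invented for illustration.

```python
# Minimal sketch: CodeBERT-based binary detection of silent vulnerability-fixing
# commits. Not the paper's implementation; the head below is untrained.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/codebert-base",
    num_labels=2,  # 0 = ordinary commit, 1 = silent vulnerability fix
)

# Hypothetical commit, made up for illustration.
commit_message = "fix boundary check in parser"
code_change = "- if (len > MAX) {\n+ if (len >= MAX) {"

# Pack the commit message and code change as a sentence pair, so the encoder
# sees both input types at once.
inputs = tokenizer(
    commit_message,
    code_change,
    truncation=True,
    max_length=512,
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
prob_fix = torch.softmax(logits, dim=-1)[0, 1].item()
print(f"P(silent vulnerability fix) = {prob_fix:.2f}")
```

The generation half of the framework would pair the same encoder with a decoder fine-tuned to emit the four key-aspect texts (vulnerability type, root cause, attack vector, impact) for commits the detector flags.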