Due to the ever-increasing number of security breaches, practitioners are motivated to produce more secure software. In the United States, the White House Office released a memorandum on Executive Order (EO) 14028 that mandates that organizations provide self-attestation of their use of secure software development practices. The OpenSSF Scorecard project allows practitioners to measure the use of software security practices automatically. However, little research has been done to determine whether the use of security practices improves package security, and in particular which security practices have the biggest impact on security outcomes. The goal of this study is to assist practitioners and researchers in making informed decisions on which security practices to adopt through the development of models relating software security practice scores to security vulnerability counts. To that end, we developed five supervised machine learning models for npm and PyPI packages, using the OpenSSF Scorecard security practice scores and aggregate security scores as predictors and the number of externally reported vulnerabilities as the target variable. Our models found that four security practices (Maintained, Code Review, Branch Protection, and Security Policy) were the most important practices influencing vulnerability count. However, the models had low R^2 (ranging from 9% to 12%) when we tested them on predicting vulnerability counts. Additionally, we observed that the number of reported vulnerabilities increased rather than decreased as the aggregate security score of the packages increased. Both findings indicate that additional factors may influence the package vulnerability count. We suggest that vulnerability count and security score data be refined so that these measures can be used to provide actionable guidance on security practices.
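As a rough illustration of the modeling setup described above, the sketch below regresses a vulnerability count on an aggregate security score and reports R^2. It is not the study's method: the abstract does not name the five models, so a single-predictor ordinary least squares fit is swapped in for simplicity. The helper names (`fit_simple_ols`, `r_squared`) and every data value are hypothetical and chosen only to make the example run.

```python
# Illustrative sketch only. The data is synthetic and the model (simple OLS)
# is an assumption; neither comes from the study.
from statistics import mean


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical aggregate security scores (0-10 scale) for eight packages,
# paired with hypothetical counts of externally reported vulnerabilities.
scores = [2.0, 3.5, 4.0, 5.5, 6.0, 7.0, 8.5, 9.0]
vuln_counts = [1, 0, 4, 1, 6, 0, 3, 5]

intercept, slope = fit_simple_ols(scores, vuln_counts)
predictions = [intercept + slope * s for s in scores]
r2 = r_squared(vuln_counts, predictions)

print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f}")
```

An R^2 near zero means the score explains little of the variation in vulnerability counts, which is how the 9% to 12% range reported above should be read. A positive slope in a fit like this would correspond to the observation that reported vulnerabilities rise with the aggregate score.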