Society needs more secure software, but subject matter experts in software security are in short supply, so development teams are motivated to make the most of their limited inspection time. The goal of this paper is to improve the efficiency of software vulnerability inspection with HARMLESS, an active-learning-based inspection support tool. HARMLESS incrementally updates its vulnerability prediction model (VPM) from the latest human inspection results, then applies the model to direct human inspection effort toward the source code files that are more likely to contain vulnerabilities. HARMLESS is designed to have three advantages over conventional software vulnerability prediction methods. First, by integrating human inspectors and the VPM in an active learning environment, HARMLESS keeps refining its VPM and can find vulnerabilities with reduced human inspection effort before a software's first release. Second, by estimating the total number of vulnerabilities in a software project, HARMLESS guides humans to stop the inspection once a target recall is reached. Third, HARMLESS applies redundant inspection (source code files inspected multiple times by different humans) to the source code files that are more likely to contain missed vulnerabilities, so that vulnerabilities missed by human inspectors can be retrieved efficiently. We evaluate HARMLESS via a simulation with Mozilla Firefox vulnerability data. Our results show that (1) HARMLESS finds 60, 70, 80, 90, 95, and 99% of the vulnerabilities by inspecting 6, 8, 10, 16, 20, and 34% of the source code files, respectively; (2) during the simulation, when targeting 90, 95, and 99% recall, HARMLESS could stop early after 23, 30, and 47% of the source code files had been inspected, respectively; and (3) even when human reviewers fail to identify half of the vulnerabilities, HARMLESS is able to cover 96% of the missed vulnerabilities by redundantly inspecting half of the classified files.
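The inspect-retrain-reprioritize loop described above can be illustrated with a minimal sketch. This is not the HARMLESS implementation: the token-based log-odds scorer, the random sampling used before the first vulnerability is found, the fixed file budget used in place of the recall-based stopping rule, and all names (`train`, `score`, `inspect_loop`, `oracle`) are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def train(labeled):
    """Fit per-token log-odds weights from (tokens, is_vulnerable) pairs.

    A stand-in for the VPM; uses add-one smoothing on document counts.
    """
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    for tokens, is_vulnerable in labeled:
        if is_vulnerable:
            pos.update(set(tokens))
            n_pos += 1
        else:
            neg.update(set(tokens))
            n_neg += 1
    vocab = set(pos) | set(neg)
    return {
        t: math.log((pos[t] + 1) / (n_pos + 2))
        - math.log((neg[t] + 1) / (n_neg + 2))
        for t in vocab
    }


def score(weights, tokens):
    """Higher score means the file looks more like known vulnerable files."""
    return sum(weights.get(t, 0.0) for t in set(tokens))


def inspect_loop(files, oracle, batch_size=10, budget=None, seed=0):
    """Simulate inspection; return file indices in the order inspected.

    files:  list of token lists, one per source file (hypothetical features)
    oracle: callable index -> bool, standing in for the human inspector
    budget: optional cap on inspected files (a simplification; the paper
            stops at a target recall using an estimate of total vulnerabilities)
    """
    rng = random.Random(seed)
    unlabeled = list(range(len(files)))
    rng.shuffle(unlabeled)  # assumed: random order until a vulnerability is found
    labeled, order = [], []
    while unlabeled and (budget is None or len(order) < budget):
        if any(is_vulnerable for _, is_vulnerable in labeled):
            # Retrain on everything inspected so far, then re-rank the rest.
            weights = train(labeled)
            unlabeled.sort(key=lambda i: score(weights, files[i]), reverse=True)
        take = batch_size if budget is None else min(batch_size, budget - len(order))
        batch, unlabeled = unlabeled[:take], unlabeled[take:]
        for i in batch:
            labeled.append((files[i], oracle(i)))
            order.append(i)
    return order
```

Calling `inspect_loop(files, oracle)` on a labeled dataset yields an inspection order from which recall at any fraction of files inspected can be computed, which is the kind of curve the results above report.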