Peer code review has been found to be effective in identifying security vulnerabilities. However, despite mandatory code reviews, many Open Source Software (OSS) projects still encounter a large number of post-release security vulnerabilities, because some security defects escape review. A project manager may therefore wonder whether any weakness or inconsistency in a code review caused it to miss a security vulnerability. Answers to this question may help a manager pinpoint areas of concern and take measures to improve the effectiveness of their project's code reviews in identifying security defects. This study therefore aims to identify the factors that differentiate code reviews that successfully identified security defects from those that missed such defects. With this goal, we conduct a case-control study of the Chromium OS project. Using multi-stage semi-automated approaches, we build a dataset of 516 code reviews that successfully identified security defects and 374 code reviews where security defects escaped. The results of our empirical study suggest that there are significant differences between the categories of security defects that are identified and those that are missed during code reviews. A logistic regression model fitted on our dataset achieved an AUC score of 0.91 and identified nine code review attributes that influence the identification of security defects. While the time to complete a review, the number of mutual reviews between two developers, and whether the review is for a bug fix have positive impacts on vulnerability identification, opposite effects are observed from the number of directories under review, the number of total reviews by a developer, and the total number of prior commits for the file under review.
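To make the modeling step concrete, the following is a minimal sketch of fitting a logistic regression to code review attributes and scoring it with AUC. It is not the study's pipeline: the data are synthetic, only two of the nine attributes are imitated, and the names `review_time`, `num_directories`, and `make_synthetic_reviews` are hypothetical. The coefficients and AUC it prints say nothing about the reported result of 0.91.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    # Predicted probability that the review identifies the security defect.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic_reviews(n, seed=0):
    """Hypothetical standardized attributes: [review_time, num_directories].

    Assumption for illustration only: longer reviews raise the chance of
    identifying a defect and more directories lower it.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        review_time = rng.gauss(0, 1)
        num_directories = rng.gauss(0, 1)
        p = sigmoid(1.5 * review_time - 1.5 * num_directories)
        X.append([review_time, num_directories])
        y.append(1 if rng.random() < p else 0)
    return X, y


X, y = make_synthetic_reviews(400)
weights = fit_logistic(X, y)
scores = [predict(weights, x) for x in X]
model_auc = auc(scores, y)
print("coefficients (bias, review_time, num_directories):", weights)
print("AUC on the synthetic training data:", round(model_auc, 3))
```

In such a model the sign of each fitted coefficient indicates the direction of an attribute's association with defect identification, which is how positive and negative effects of review attributes can be read off. A real analysis would also evaluate AUC on held-out data instead of the training set used here.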