Code review is a broadly adopted software quality practice where developers critique each other's patches. In addition to providing constructive feedback, reviewers may provide a score to indicate whether the patch should be integrated. Since reviewer opinions may differ, patches can receive both positive and negative scores. If reviews with divergent scores are not carefully resolved, they may contribute to a tense reviewing culture and may slow down integration. In this paper, we study patches with divergent review scores in the OPENSTACK and QT communities. Quantitative analysis indicates that patches with divergent review scores: (1) account for 15%-37% of patches that receive multiple review scores; (2) are integrated more often than they are abandoned; and (3) receive negative scores after positive ones in 70% of cases. Furthermore, a qualitative analysis indicates that patches with strongly divergent scores: (4) when abandoned, suffer from external issues (e.g., integration planning, content duplication) more often than patches with weakly divergent scores and patches without divergent scores; and (5) when integrated, often have reviewer concerns addressed indirectly (i.e., without changing the patches). Our results suggest that review tooling should integrate with release schedules and detect concurrent development of similar patches to optimize review discussions with divergent scores. Moreover, patch authors should note that even the most divisive patches are often integrated through discussion, integration timing, and careful revision.