Using a Balanced Scorecard to Identify Opportunities to Improve Code Review Effectiveness: An Industrial Experience Report

Peer code review is a widely adopted software engineering practice for ensuring code quality and software reliability in both commercial and open-source software projects. Because code reviews carry a large effort overhead, project managers often wonder whether their code reviews are effective and whether there are opportunities for improvement. Since project managers at Samsung Research Bangladesh (SRBD) were also intrigued by these questions, we developed, deployed, and evaluated a production-ready solution using the Balanced Scorecard (BSC) strategy that SRBD managers can use in their day-to-day management to monitor the code review effectiveness of an individual developer, a particular project, or the entire organization. Following the four-step framework of the BSC strategy, we: 1) defined the operational goals of this research, 2) defined a set of metrics to measure the effectiveness of code reviews, 3) developed an automated mechanism to measure those metrics, and 4) developed and evaluated a monitoring application to inform the key stakeholders. Our automated model to identify useful code reviews achieves 7.88% and 14.39% improvements in accuracy and minority class F_1 score, respectively, over the models proposed in prior studies. It also outperforms the human evaluators from SRBD, whom the model replaces, by margins of 19.01% and 13.72% in accuracy and minority class F_1 score, respectively. In our post-deployment survey, SRBD developers and managers indicated that they found our solution useful and that it provided them with important insights to support their decision making.
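The two evaluation measures named above, accuracy and minority class F_1 score, can be illustrated with a minimal sketch. The labels "useful" and "not_useful", the sample data, and the function names are hypothetical and chosen only for illustration. The abstract does not state which class is the minority in the SRBD data, so the sketch simply takes the least frequent label among the true labels.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def minority_class_f1(y_true, y_pred):
    """Return (minority_label, F1) where the minority label is the least
    frequent label in y_true (an assumption made for this sketch)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    counts = Counter(y_true)
    minority = min(counts, key=counts.get)

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == minority and p == minority for t, p in pairs)
    fp = sum(t != minority and p == minority for t, p in pairs)
    fn = sum(t == minority and p != minority for t, p in pairs)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return minority, 0.0
    return minority, 2 * precision * recall / (precision + recall)


# Hypothetical labels for ten review comments (not data from the study).
y_true = ["useful"] * 6 + ["not_useful"] * 4
y_pred = ["useful"] * 5 + ["not_useful"] * 4 + ["useful"]

print(accuracy(y_true, y_pred))           # 0.8
print(minority_class_f1(y_true, y_pred))  # ('not_useful', 0.75)
```

Reporting the F_1 score of the minority class alongside accuracy matters when the classes are imbalanced, because a model that always predicts the majority label can still reach a high accuracy while scoring zero on the minority class.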
