Using Large-scale Heterogeneous Graph Representation Learning for Code Review Recommendations at Microsoft

Code review is an integral part of any mature software development process, and identifying the best reviewer for a code change is a well-accepted problem within the software engineering community. Selecting a reviewer who lacks expertise and understanding can slow development or result in more defects. To date, most reviewer recommendation systems rely primarily on historical file change and review information, on the assumption that those who changed or reviewed a file in the past are best positioned to review it in the future. We posit that while these approaches are able to identify and suggest qualified reviewers, they may be blind to reviewers who have the needed expertise but have simply never interacted with the changed files before. Fortunately, at Microsoft, we have a wealth of work artifacts across many repositories that can yield valuable information about our developers. To address the aforementioned problem, we present CORAL, a novel approach to reviewer recommendation that leverages a socio-technical graph built from the rich set of entities (developers, repositories, files, pull requests (PRs), work items, etc.) and their relationships in modern source code management systems. We employ a graph convolutional neural network on this graph and train it on two and a half years of history on 332 repositories within Microsoft. We show that CORAL is able to model the manual history of reviewer selection remarkably well. Further, based on an extensive user study, we demonstrate that this approach identifies relevant and qualified reviewers whom traditional reviewer recommenders miss, and that these developers desire to be included in the review process. Finally, we find that "classical" reviewer recommendation systems perform better on smaller (in terms of developers) software projects while CORAL excels on larger projects, suggesting that there is "no one model to rule them all."
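The gap between file-history recommenders and a graph-based approach can be illustrated with a toy example. The sketch below is not CORAL: it replaces the trained graph convolutional network with a few rounds of unweighted mean aggregation over neighbors (no learned parameters, no nonlinearity), and every node name, relation name, and edge in it is a hypothetical assumption made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy socio-technical graph: (source, relation, target) triples.
# Node names and relation names are illustrative, not taken from the paper.
EDGES = [
    ("dev:alice", "authored", "pr:1"),
    ("dev:bob", "reviewed", "pr:1"),
    ("pr:1", "changes", "file:auth.py"),
    ("pr:1", "linked_to", "work:login-bug"),
    ("dev:carol", "authored", "pr:2"),
    ("pr:2", "changes", "file:token.py"),
    ("pr:2", "linked_to", "work:login-bug"),
    ("dev:dave", "authored", "pr:3"),
    ("pr:3", "changes", "file:readme.md"),
]


def build_graph(edges):
    """Build an undirected adjacency map from typed edges."""
    adj = defaultdict(set)
    for src, _rel, dst in edges:
        adj[src].add(dst)
        adj[dst].add(src)
    return dict(adj)


def file_history_candidates(edges, changed_files):
    """Baseline: developers who authored or reviewed a PR touching the files."""
    prs = {s for s, rel, d in edges if rel == "changes" and d in changed_files}
    return {s for s, rel, d in edges
            if rel in ("authored", "reviewed") and d in prs}


def propagate(adj, seeds, hops):
    """Simplified message passing: each hop averages a node's own score
    with the mean score of its neighbors (no learned weights)."""
    scores = {node: (1.0 if node in seeds else 0.0) for node in adj}
    for _ in range(hops):
        scores = {
            node: 0.5 * scores[node]
            + 0.5 * sum(scores[m] for m in nbrs) / len(nbrs)
            for node, nbrs in adj.items()
        }
    return scores


def rank_developers(adj, changed_files, hops=4):
    """Rank developer nodes by the score that reaches them from the files."""
    scores = propagate(adj, set(changed_files), hops)
    devs = [(n, s) for n, s in scores.items() if n.startswith("dev:")]
    return sorted(devs, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    graph = build_graph(EDGES)
    changed = {"file:auth.py"}
    print("file-history baseline:", sorted(file_history_candidates(EDGES, changed)))
    for dev, score in rank_developers(graph, changed):
        print(f"{dev}: {score:.4f}")
```

In this toy graph, the file-history baseline only surfaces the developers who authored or reviewed a PR touching `file:auth.py`. The propagation step also assigns a small nonzero score to `dev:carol`, who never touched that file but worked on a PR linked to the same work item, while the unconnected `dev:dave` stays at zero. A trained model would learn how much each entity and relation type should contribute rather than weighting all neighbors equally.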
