Software bugs claim approximately 50% of development time and cost the global economy billions of dollars. Once a bug is reported, the assigned developer attempts to identify and understand the source code responsible for the bug and then corrects the code. Over the last five decades, there has been significant research on automatically finding or correcting software bugs. However, there has been little research on automatically explaining bugs to developers, which is an essential but highly challenging task. In this paper, we propose Bugsplainer, a transformer-based generative model that generates natural language explanations for software bugs by learning from a large corpus of bug-fix commits. Bugsplainer can leverage structural information and buggy patterns from the source code to generate an explanation for a bug. Our evaluation using three performance metrics shows that Bugsplainer can generate understandable and good explanations according to Google's standard, and can outperform multiple baselines from the literature. We also conduct a developer study involving 20 participants, in which the explanations from Bugsplainer were found to be more accurate, more precise, more concise, and more useful than those from the baselines.