Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation

Software bugs claim approximately 50% of development time and cost the global economy billions of dollars. Once a bug is reported, the assigned developer attempts to identify and understand the source code responsible for the bug and then corrects the code. Over the last five decades, there has been significant research on automatically finding or correcting software bugs. However, there has been little research on automatically explaining bugs to developers, an essential but highly challenging task. In this paper, we propose Bugsplainer, a transformer-based generative model that generates natural language explanations for software bugs by learning from a large corpus of bug-fix commits. Bugsplainer can leverage structural information and buggy patterns from the source code to generate an explanation for a bug. Our evaluation using three performance metrics shows that Bugsplainer can generate understandable and good explanations according to Google's standard, and can outperform multiple baselines from the literature. We also conduct a developer study involving 20 participants, in which the explanations from Bugsplainer were found to be more accurate, more precise, more concise and more useful than those from the baselines.
