Automatic code generation, the task of generating new code snippets from existing code or comments, has long been of interest. Numerous code generation models have been proposed and evaluated on various benchmark datasets. However, little is known about whether this objective has truly been achieved, or how and why code generation models transform code sequences effectively. In other words, can we fully trust these automatic code generation models? Consequently, there is a pressing need to understand the inner logic of code generation models and to investigate their replicability, reliability, and explainability. To bridge these research gaps, we conduct a thorough empirical study of five code generation models on four representative code generation datasets to assess the limits and capabilities of automatic code generation approaches. We further employ advanced explainable AI approaches to highlight the input tokens that contribute significantly to the generated code. Our experiments show that we successfully replicate state-of-the-art code generation approaches. We find that these approaches suffer from severe data duplication and input insensitivity, subtle issues with significant implications. Our explainability analysis reveals that, across various experimental scenarios, code generation models can recognize code grammar and structural information, but cannot capture the key tokens that need to be updated. From these results we distill several lessons and guidelines for future work in this area.
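To make the token-highlighting idea concrete, the following is a minimal sketch of one common explainable-AI technique, input-times-gradient attribution over a sequence-to-sequence code model. The CodeT5 checkpoint, the example snippets, and the attribution heuristic are illustrative assumptions for this sketch, not the paper's exact experimental setup.

```python
# Minimal sketch: input-x-gradient attribution for a seq2seq code generation model.
# The checkpoint name and example snippets below are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "Salesforce/codet5-small"  # assumed checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
model.eval()

source = "def add(a, b): return a - b"   # input snippet (buggy)
target = "def add(a, b): return a + b"   # reference output snippet

inputs = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

# Embed the source tokens as a leaf tensor so gradients can be taken w.r.t. it.
embeddings = model.get_input_embeddings()(inputs.input_ids).detach()
embeddings.requires_grad_(True)

outputs = model(
    inputs_embeds=embeddings,
    attention_mask=inputs.attention_mask,
    labels=labels,
)
outputs.loss.backward()

# Input-x-gradient score per source token; larger magnitude suggests a
# larger contribution of that token to the generated (reference) code.
scores = (embeddings.grad * embeddings).sum(dim=-1).squeeze(0).abs()
for tok, score in zip(tokenizer.convert_ids_to_tokens(inputs.input_ids[0]), scores):
    print(f"{tok:>12s}  {score.item():.4f}")
```

Inspecting which source tokens receive high scores is one way to check whether a model attends to the tokens that actually need to change (here, the `-` operator) rather than only to surrounding structure.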