Source code summaries are short natural language descriptions of code snippets that help developers understand and maintain source code. There has been a surge of work on automatic code summarization to reduce the burden of writing summaries manually. However, most contemporary approaches rely mainly on information within the boundary of the method being summarized (i.e., local context) and ignore the broader context that could assist with code summarization. This paper explores two global contexts, namely intra-class and inter-class contexts, and proposes the model CoCoSUM: Contextual Code Summarization with Multi-Relational Graph Neural Networks. CoCoSUM first incorporates class names as the intra-class context to generate class semantic embeddings. Then, relevant Unified Modeling Language (UML) class diagrams are extracted as the inter-class context and encoded into class relational embeddings using a novel Multi-Relational Graph Neural Network (MRGNN). The class semantic embeddings and class relational embeddings, together with the outputs of the code token encoder and the AST encoder, are passed to a decoder equipped with a two-level attention mechanism to generate high-quality, context-aware code summaries. We conduct extensive experiments to evaluate our approach and compare it with other automatic code summarization models. The experimental results show that CoCoSUM is effective and outperforms state-of-the-art methods. Our source code and experimental data are available in the supplementary materials and will be made publicly available.
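To make the idea of encoding inter-class context concrete, the following is a minimal sketch of one relation-aware message-passing step over a small class graph. It is not the MRGNN described above, whose details the abstract does not give; it substitutes a generic R-GCN-style update (one weight matrix per relation type, mean aggregation, ReLU). The class names, the relation labels `inherits` and `uses`, the weights, and the function names `matvec` and `relational_layer` are all illustrative assumptions.

```python
from collections import defaultdict


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relational_layer(node_feats, edges, rel_weights, self_weight):
    """One R-GCN-style update (an assumed stand-in, not the paper's MRGNN).

    node_feats:  dict mapping class name -> feature vector
    edges:       list of (src, relation, dst); messages flow src -> dst
    rel_weights: dict mapping relation name -> weight matrix
    self_weight: weight matrix applied to each node's own features
    """
    # Group incoming neighbours by (destination, relation type).
    incoming = defaultdict(list)
    for src, rel, dst in edges:
        incoming[(dst, rel)].append(src)

    updated = {}
    for node, feat in node_feats.items():
        total = matvec(self_weight, feat)  # self-loop term
        for rel, weight in rel_weights.items():
            neighbours = incoming.get((node, rel), [])
            if not neighbours:
                continue
            # Mean of neighbour features for this relation type.
            mean = [
                sum(node_feats[n][i] for n in neighbours) / len(neighbours)
                for i in range(len(feat))
            ]
            message = matvec(weight, mean)
            total = [t + m for t, m in zip(total, message)]
        updated[node] = [max(0.0, t) for t in total]  # ReLU
    return updated


# Hypothetical three-class UML fragment.
node_feats = {
    "Shape": [1.0, 0.0],
    "Circle": [0.0, 1.0],
    "Canvas": [1.0, 1.0],
}
edges = [("Circle", "inherits", "Shape"), ("Canvas", "uses", "Shape")]
rel_weights = {
    "inherits": [[0.5, 0.0], [0.0, 0.5]],
    "uses": [[0.0, 1.0], [1.0, 0.0]],
}
identity = [[1.0, 0.0], [0.0, 1.0]]

class_relational = relational_layer(node_feats, edges, rel_weights, identity)
print(class_relational["Shape"])
```

In this toy graph only `Shape` has incoming edges, so its embedding changes while `Circle` and `Canvas` keep their own features. Using a separate weight matrix per relation type lets an inheritance edge and a usage edge contribute differently to the same class.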