Code summarization generates brief natural language descriptions of source code snippets, which can help developers understand code and reduce their documentation workload. Recent neural models for code summarization are trained and evaluated on large-scale, multi-project datasets of independent code-summary pairs. Despite these technical advances, the models' effectiveness on a specific project is rarely explored. In practical scenarios, however, developers care most about generating high-quality summaries for the projects they are working on, and such projects may not maintain sufficient documentation, leaving few historical code-summary pairs. To this end, we investigate low-resource project-specific code summarization, a novel task that better matches developers' actual needs. To characterize project-specific knowledge with limited training samples, we propose a meta transfer learning method that incorporates a lightweight fine-tuning mechanism into a meta-learning framework. Experimental results on nine real-world projects verify that our method outperforms alternative approaches and reveal how project-specific knowledge is learned.
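Since the abstract names the method only at a high level, the sketch below illustrates one plausible reading of "meta transfer learning with lightweight fine-tuning": each project is treated as a meta-learning task, the inner loop adapts only a small adapter module on a project's few support examples, and the outer loop performs a first-order meta-update across projects. The model class, adapter design, toy data, and all hyperparameters here are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of meta transfer learning with lightweight fine-tuning
# (first-order, FOMAML-style). NOT the paper's implementation: the model,
# adapter, toy tensors, and hyperparameters are all illustrative assumptions.
import copy
import torch
import torch.nn as nn

class SummarizerWithAdapter(nn.Module):
    """Toy network standing in for a code-summarization model."""
    def __init__(self, dim=32):
        super().__init__()
        self.backbone = nn.Linear(dim, dim)   # heavy weights, shared across projects
        self.adapter = nn.Linear(dim, dim)    # lightweight, project-specific weights
        self.head = nn.Linear(dim, dim)

    def forward(self, x):
        h = torch.relu(self.backbone(x))
        h = h + self.adapter(h)               # residual adapter
        return self.head(h)

def inner_adapt(model, support_x, support_y, steps=3, lr=1e-2):
    """Inner loop: fine-tune ONLY the adapter on a project's support set."""
    adapted = copy.deepcopy(model)
    opt = torch.optim.SGD(adapted.adapter.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(adapted(support_x), support_y).backward()
        opt.step()
    return adapted

model = SummarizerWithAdapter()
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Outer loop: meta-update over a batch of sampled projects (tasks).
for meta_step in range(100):
    meta_opt.zero_grad()
    for _ in range(4):                                     # 4 projects per meta-batch
        sx, sy = torch.randn(8, 32), torch.randn(8, 32)    # few-shot support set
        qx, qy = torch.randn(8, 32), torch.randn(8, 32)    # held-out query set
        adapted = inner_adapt(model, sx, sy)
        query_loss = loss_fn(adapted(qx), qy)
        # First-order approximation: take gradients at the adapted weights
        # and accumulate them onto the shared meta-parameters.
        grads = torch.autograd.grad(query_loss, adapted.parameters())
        for p, g in zip(model.parameters(), grads):
            p.grad = g if p.grad is None else p.grad + g
    meta_opt.step()
```

Under these assumptions, restricting the inner loop to the adapter keeps per-project adaptation cheap and less prone to overfitting when only a handful of historical code-summary pairs exist, while the outer loop still transfers knowledge across all projects.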