Source code summarization of a subroutine is the task of writing a short, natural language description of that subroutine. The description usually serves in documentation aimed at programmers, where even a brief phrase (e.g. "compresses data to a zip file") can help readers rapidly comprehend what a subroutine does without resorting to reading the code itself. Techniques based on neural networks (and encoder-decoder model designs in particular) have established themselves as the state of the art. Yet a widely recognized problem with these models is that they assume the information needed to create a summary is present within the code being summarized, an assumption at odds with the program comprehension literature. Thus a current research frontier lies in the question of encoding source code context into neural models of summarization. In this paper, we present a project-level encoder to improve models of code summarization. By project-level, we mean that we create a vectorized representation of selected code files in a software project, and use that representation to augment the encoder of state-of-the-art neural code summarization techniques. We demonstrate how our encoder improves several existing models, and provide guidelines for maximizing improvement while controlling the time and resource costs of model size.
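The core idea above, encoding selected project files into a vector that augments the subroutine encoding, can be illustrated with a minimal sketch. This is an illustrative toy, not the authors' implementation: a hash-based token embedding stands in for a learned embedding layer, and simple mean pooling stands in for a trained encoder. All function names here are hypothetical.

```python
# Toy sketch of a project-level encoder (assumption: not the paper's actual
# model). Selected project files are each embedded into a fixed-size vector,
# pooled into one project vector, and concatenated onto the subroutine
# encoding that a seq2seq summarizer would otherwise use alone.
import hashlib

DIM = 8  # toy embedding dimensionality

def embed_tokens(tokens):
    """Hash each token into a DIM-dim vector and mean-pool
    (a stand-in for a learned embedding layer)."""
    vec = [0.0] * DIM
    for tok in tokens:
        digest = hashlib.md5(tok.encode()).digest()
        for i in range(DIM):
            vec[i] += digest[i] / 255.0
    n = max(len(tokens), 1)
    return [v / n for v in vec]

def project_encoding(project_files):
    """Encode each selected file, then mean-pool across files
    into a single project-level vector."""
    file_vecs = [embed_tokens(src.split()) for src in project_files]
    return [sum(col) / len(file_vecs) for col in zip(*file_vecs)]

def augmented_encoding(subroutine_src, project_files):
    """Concatenate the subroutine encoding with the project vector;
    in a real model the combined representation would feed the
    decoder's attention mechanism."""
    return embed_tokens(subroutine_src.split()) + project_encoding(project_files)

enc = augmented_encoding("def zip_data(path): ...",
                         ["import zlib", "class Archive: pass"])
print(len(enc))  # 2 * DIM = 16
```

In the real setting, the pooling and embedding would be learned jointly with the summarization model, and file selection (which project files to encode) is one of the cost/benefit knobs the paper's guidelines address.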