Deliberation is a common and natural behavior in human daily life. For example, when writing papers or articles, we usually first write drafts and then iteratively polish them until satisfied. Inspired by this cognitive process, we propose DECOM, a multi-pass deliberation framework for automatic comment generation. DECOM consists of multiple Deliberation Models and one Evaluation Model. Given a code snippet, we first extract keywords from the code and retrieve a similar code fragment from a pre-defined corpus. We then treat the comment of the retrieved code as the initial draft and feed it, together with the code and keywords, into DECOM to start the iterative deliberation process. In each deliberation pass, the deliberation model polishes the draft and generates a new comment, and the evaluation model measures the quality of that comment to decide whether to end the iterative process. When the process terminates, the best generated comment is selected as the target comment. We evaluate our approach on two real-world datasets, Java (87K) and Python (108K), and the experimental results show that it outperforms the state-of-the-art baselines. A human evaluation study also confirms that the comments generated by DECOM tend to be more readable, informative, and useful.
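The control flow described above (extract keywords, retrieve a draft, then polish and evaluate repeatedly) can be illustrated with a minimal Python sketch. This is not the DECOM implementation: the function names, the token-overlap retrieval, the stopword list, the toy `polish` and `score` stand-ins, and the stopping rule (stop when the score no longer improves or after `max_passes`) are all assumptions made for illustration. In DECOM the polishing and scoring steps are learned neural models, and the abstract does not specify the exact termination criterion.

```python
import re
from typing import Callable, List, Sequence, Tuple

# Hypothetical stopword list; the paper's keyword extraction is not specified here.
STOPWORDS = {"def", "return", "public", "static", "void", "int", "self"}


def extract_keywords(code: str) -> List[str]:
    """Toy extractor: split identifiers on camelCase, lowercase, deduplicate."""
    keywords: List[str] = []
    for token in re.findall(r"[A-Za-z]+", code):
        spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", token)
        for part in spaced.lower().split():
            if part not in STOPWORDS and part not in keywords:
                keywords.append(part)
    return keywords


def retrieve_draft(code: str, corpus: Sequence[Tuple[str, str]]) -> str:
    """Return the comment of the corpus entry whose keywords overlap most (Jaccard)."""
    if not corpus:
        return ""
    query = set(extract_keywords(code))

    def similarity(entry: Tuple[str, str]) -> float:
        other = set(extract_keywords(entry[0]))
        union = query | other
        return len(query & other) / len(union) if union else 0.0

    return max(corpus, key=similarity)[1]


def deliberate(
    code: str,
    corpus: Sequence[Tuple[str, str]],
    polish: Callable[[str, List[str], str], str],  # stand-in for a deliberation model
    score: Callable[[str, str], float],            # stand-in for the evaluation model
    max_passes: int = 3,
) -> str:
    """Polish the retrieved draft pass by pass and keep the best-scoring comment."""
    keywords = extract_keywords(code)
    draft = retrieve_draft(code, corpus)
    best, best_score = draft, score(code, draft)
    for _ in range(max_passes):
        candidate = polish(code, keywords, draft)
        candidate_score = score(code, candidate)
        if candidate_score <= best_score:  # assumed stopping rule: no improvement
            break
        best, best_score = candidate, candidate_score
        draft = candidate
    return best


def toy_score(code: str, comment: str) -> float:
    """Toy quality measure: fraction of code keywords mentioned in the comment."""
    keywords = set(extract_keywords(code))
    words = set(comment.lower().split())
    return len(keywords & words) / len(keywords) if keywords else 0.0


def toy_polish(code: str, keywords: List[str], draft: str) -> str:
    """Toy polisher: append the first keyword the draft does not mention yet."""
    words = draft.lower().split()
    for keyword in keywords:
        if keyword not in words:
            return (draft + " " + keyword).strip()
    return draft
```

A call such as `deliberate(code, corpus, toy_polish, toy_score)` takes a list of `(code, comment)` pairs as the corpus. Passing the models in as callables keeps the loop independent of how polishing and scoring are actually implemented, so the toy functions could be swapped for trained models without changing the control flow.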