In software development, programmers commonly copy-paste or port code snippets and then adapt them to their use case. This scenario motivates the code adaptation task, a variant of program repair that aims to adapt variable identifiers in a pasted snippet of code to the surrounding, preexisting source code. However, no existing approach has been shown to address this task effectively. In this paper, we introduce AdaptivePaste, a learning-based approach to source code adaptation, based on transformers and a dedicated dataflow-aware deobfuscation pre-training task that learns meaningful representations of variable usage patterns. We evaluate AdaptivePaste on a dataset of code snippets in Python. Results suggest that our model can learn to adapt source code with 79.8% accuracy. To evaluate how valuable AdaptivePaste is in practice, we perform a user study with 10 Python developers on 100 real-world copy-paste instances. The results show that AdaptivePaste reduces dwell time to nearly half of that required for manual code adaptation, and helps to avoid bugs. In addition, we use the participant feedback to identify potential avenues for improving AdaptivePaste.
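To make the code adaptation task concrete, the following minimal Python sketch shows what adapting a pasted snippet could look like. It is an illustration only, not the AdaptivePaste model: the identifier mapping is written by hand here, whereas a learned approach would have to predict it from the surrounding code. The names `RenameVariables`, `adapt_snippet`, `pasted`, and `mapping` are hypothetical, and `ast.unparse` assumes Python 3.9 or later.

```python
import ast


class RenameVariables(ast.NodeTransformer):
    """Rename identifiers according to a fixed mapping (hypothetical helper)."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Replace the identifier only when the mapping covers it;
        # builtins such as sum and len are left unchanged.
        node.id = self.mapping.get(node.id, node.id)
        return node


def adapt_snippet(snippet, mapping):
    """Parse the pasted snippet, rename its variables, and return new source."""
    tree = ast.parse(snippet)
    tree = RenameVariables(mapping).visit(tree)
    return ast.unparse(tree)


# Pasted snippet whose variable names do not match the target context.
pasted = "total = sum(values) / len(values)"

# Assumed mapping from pasted names to names already used in the target file.
# In the task described above, this mapping is what must be inferred.
mapping = {"values": "scores", "total": "mean_score"}

adapted = adapt_snippet(pasted, mapping)
print(adapted)
```

The renaming step itself is mechanical. The difficulty in the task lies in choosing the mapping, which is the part a learned model would address.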