Automated Program Repair (APR) techniques have shown increasingly promising results in fixing real-world bugs. Despite their effectiveness, APR techniques still suffer from an overfitting problem: a generated patch can be incorrect even though it passes all tests. Manually evaluating the correctness of generated patches that pass all tests is time-consuming. To address this problem, many approaches have been proposed to automatically assess the correctness of patches generated by APR techniques. However, existing approaches require a large set of manually labeled patches as training data. To mitigate this issue, we propose PatchZero, an approach to patch correctness assessment that adopts large pre-trained models. Specifically, for patches generated by a new or unseen APR tool, PatchZero does not need labeled patches from that tool for training (i.e., it is zero-shot); instead, it directly queries the large pre-trained model to predict correctness labels without any training. In this way, PatchZero reduces the manual labeling effort required to build a model that automatically assesses the correctness of patches generated by new APR tools. To provide the large pre-trained model with knowledge about the automatic patch correctness assessment (APCA) task, we also design an instance-wise demonstration formation strategy based on contrastive learning. Specifically, for each unlabeled patch, PatchZero selects semantically similar patches as demonstrations to help the large pre-trained model give a more accurate prediction. Our experimental results show that PatchZero achieves an accuracy of 82.7% and an F1-score of 86.0% on average, even though no labeled patch from the new or unseen APR tool is available. In addition, our proposed technique outperforms the prior state-of-the-art by a large margin.
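The instance-wise demonstration selection described above can be illustrated with a minimal sketch. It is not PatchZero's actual implementation: the names `LabeledPatch`, `select_demonstrations`, and `build_prompt`, the label strings, the prompt layout, and the toy two-dimensional embeddings are all assumptions made for illustration. In practice the embeddings would come from an encoder trained with contrastive learning, and the resulting prompt would be sent to a large pre-trained model.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class LabeledPatch:
    """A labeled patch from an existing APR tool (hypothetical structure)."""
    diff: str
    label: str              # assumed label names: "correct" or "overfitting"
    embedding: List[float]  # assumed output of a contrastively trained encoder


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_demonstrations(query_embedding: List[float],
                          pool: List[LabeledPatch],
                          k: int = 2) -> List[LabeledPatch]:
    """Return the k labeled patches most similar to the query patch."""
    ranked = sorted(pool,
                    key=lambda p: cosine(query_embedding, p.embedding),
                    reverse=True)
    return ranked[:k]


def build_prompt(query_diff: str, demonstrations: List[LabeledPatch]) -> str:
    """Format demonstrations followed by the unlabeled patch (assumed layout)."""
    parts = [f"Patch:\n{d.diff}\nLabel: {d.label}\n" for d in demonstrations]
    parts.append(f"Patch:\n{query_diff}\nLabel:")
    return "\n".join(parts)


# Toy pool of labeled patches from previously seen APR tools.
pool = [
    LabeledPatch("- if (x > 0)\n+ if (x >= 0)", "correct", [1.0, 0.0]),
    LabeledPatch("- return a;\n+ return null;", "overfitting", [0.0, 1.0]),
    LabeledPatch("- i < n\n+ i <= n", "overfitting", [0.7, 0.7]),
]

# Unlabeled patch from a new APR tool, with its (toy) embedding.
query_diff = "- if (y > 1)\n+ if (y >= 1)"
demos = select_demonstrations([1.0, 0.1], pool, k=2)
prompt = build_prompt(query_diff, demos)
print(prompt)
```

The prompt ends with an open `Label:` field, so the model's completion can be read as the predicted correctness label. Because the demonstrations are chosen per query patch, no labeled patch from the new APR tool is needed.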