ChatGPT has revolutionized many research and industrial fields. In software engineering, it has shown great potential to boost traditional tasks such as program repair, code understanding, and code generation. However, whether automatic program repair (APR) applies to deep learning (DL) programs is still unknown. DL programs, whose decision logic is not explicitly encoded in the source code, pose unique challenges to APR: to repair a DL program, an APR approach must not only parse the source code syntactically but also understand the code's intention. Even with the best prior work, fault localization performance remains far from satisfactory (only about 30\%). Therefore, in this paper, we explore ChatGPT's capability for DL program repair by asking three research questions: (1) Can ChatGPT debug DL programs effectively? (2) How can ChatGPT's repair performance be improved by prompting? (3) In which ways can dialogue facilitate the repair? On top of that, we categorize the common aspects useful for prompt design for DL program repair. We also propose various prompt templates to improve repair performance and summarize the advantages and disadvantages of ChatGPT's abilities, such as detecting bad code smells, code refactoring, and detecting API misuse/deprecation.