MatchFixAgent: Language-Agnostic Autonomous Repository-Level Code Translation Validation and Repair

Code translation transforms source code from one programming language (PL) to another. Validating the functional equivalence of a translation and repairing it when necessary are critical steps in code translation. Existing automated validation and repair approaches struggle to generalize to many PLs because of high engineering overhead, and they rely on existing, often inadequate test suites, which leads to false claims of equivalence and ineffective translation repair. We develop MatchFixAgent, a large language model (LLM)-based, PL-agnostic framework for equivalence validation and repair of translations. MatchFixAgent features a multi-agent architecture that divides equivalence validation into several sub-tasks to ensure thorough and consistent semantic analysis of the translation. It then feeds this analysis to the test agent, which writes and executes tests. Upon observing a test failure, the repair agent attempts to fix the translation bug. The final (in)equivalence decision is made by the verdict agent, which considers both the semantic analyses and the test execution results. We compare MatchFixAgent's validation and repair results with those of four repository-level code translation techniques. We use 2,219 translation pairs from their artifacts, which cover 6 PL pairs and are collected from 24 GitHub projects totaling over 900K lines of code. Our results demonstrate that MatchFixAgent produces (in)equivalence verdicts for 99.2% of translation pairs and reaches the same equivalence validation result as prior work on 72.8% of them. When MatchFixAgent's result disagrees with prior work, we find that MatchFixAgent's result is the correct one 60.7% of the time. In addition, we show that MatchFixAgent can repair 50.6% of inequivalent translations, compared to prior work's 18.5%. These results indicate that MatchFixAgent adapts to many PL pairs far more readily than prior work while producing highly accurate validation results.
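The control flow described in the abstract (semantic analyses, then test generation and execution, then repair on failure, then a final verdict) can be sketched roughly as follows. This is only an illustrative skeleton, not the authors' implementation: every name here (`ExecutionResult`, `Verdict`, `validate_and_repair`, the agent callables, and the `max_repairs` budget) is hypothetical, and the agents are passed in as plain functions where the real system would call an LLM.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ExecutionResult:
    """Outcome of running the tests written by the test agent (hypothetical type)."""
    passed: bool
    log: str = ""


@dataclass
class Verdict:
    """Final decision about the original translation, plus an optional fix."""
    equivalent: bool
    repaired_code: Optional[str] = None


def validate_and_repair(
    source: str,
    translation: str,
    analyzers: List[Callable[[str, str], str]],
    test_agent: Callable[[str, str, List[str]], ExecutionResult],
    repair_agent: Callable[[str, str, ExecutionResult], str],
    verdict_agent: Callable[[List[str], ExecutionResult], bool],
    max_repairs: int = 1,  # assumed repair budget; the abstract does not state one
) -> Verdict:
    # 1. Each analyzer handles one sub-task of the semantic comparison.
    analyses = [analyze(source, translation) for analyze in analyzers]

    # 2. The test agent writes and executes tests, guided by the analyses.
    first_result = test_agent(source, translation, analyses)

    # 3. On a test failure, the repair agent tries to fix the translation.
    #    Simplification: the analyses are not recomputed for repaired code.
    repaired_code = None
    candidate, result, attempts = translation, first_result, 0
    while not result.passed and attempts < max_repairs:
        candidate = repair_agent(source, candidate, result)
        result = test_agent(source, candidate, analyses)
        attempts += 1
        if result.passed:
            repaired_code = candidate

    # 4. The verdict agent judges the ORIGINAL translation, using both the
    #    semantic analyses and the first test execution result.
    equivalent = verdict_agent(analyses, first_result)
    return Verdict(equivalent=equivalent, repaired_code=repaired_code)
```

Keeping the verdict separate from the repair loop mirrors the abstract's split between deciding (in)equivalence and fixing the bug: a translation can be judged inequivalent and still come back with a repaired candidate.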

