Translating C to a memory-safe language such as Rust prevents critical memory safety vulnerabilities that are prevalent in legacy C software. Existing approaches to translating C into safe Rust, including LLM-assisted ones, do not generalize to larger (> 500 LoC) C codebases because they depend on complex program analyses that frequently break. In this work, we present ACToR (Adversarial C To Rust translator), a simple LLM agent-based approach. Inspired by GANs, ACToR pits a translator agent (the generator) against a discriminator agent, and the two together iteratively produce a Rust translation. On each iteration, the translator agent synthesizes and refines a Rust translation until it passes the existing test suite, and the discriminator agent then searches for new tests that the translation fails. We demonstrate that ACToR translates all 63 real-world command-line utilities in our benchmarks, which average 473 lines of code, and that it achieves a test pass rate of over 90% with zero human intervention during translation. To our knowledge, this is the first work to show evidence that an agent-centric approach can reliably and automatically convert standalone command-line C programs at this scale. Furthermore, ACToR improves translation correctness by up to 25.1% compared to baseline, non-adversarial approaches.
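The translate-then-attack loop described in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not ACToR's implementation: the abstract does not give the agents' interfaces, so every name below (`reference`, `make_translation`, `translator_step`, `discriminator_step`, `adversarial_loop`) is hypothetical. The LLM agents are replaced by simple stand-ins, and the stopping rule and per-round test budget are assumptions.

```python
from typing import Callable, Dict, List, Tuple

# A test is (input, expected output observed from the original program).
Test = Tuple[int, int]


def reference(x: int) -> int:
    """Hypothetical stand-in for running the original C program on input x."""
    return abs(x) % 7


def make_translation(patches: Dict[int, int]) -> Callable[[int], int]:
    """Hypothetical stand-in for the Rust translation: a buggy base plus fixes."""
    def translated(x: int) -> int:
        if x in patches:
            return patches[x]
        # Deliberate bug: negative inputs yield a negative result.
        return x % 7 if x >= 0 else -((-x) % 7)
    return translated


def translator_step(patches: Dict[int, int], suite: List[Test]) -> Dict[int, int]:
    """Refine the translation until it passes every test in the current suite.

    A real translator agent would edit Rust code; recording the expected
    output per failing input is only a placeholder for that refinement.
    """
    fixed = dict(patches)
    current = make_translation(fixed)
    for test_input, expected in suite:
        if current(test_input) != expected:
            fixed[test_input] = expected
    return fixed


def discriminator_step(translated: Callable[[int], int],
                       candidates: List[int], budget: int) -> List[Test]:
    """Search for inputs where the translation disagrees with the original."""
    found: List[Test] = []
    for x in candidates:
        expected = reference(x)
        if translated(x) != expected:
            found.append((x, expected))
            if len(found) == budget:
                break
    return found


def adversarial_loop(seed_tests: List[Test], candidates: List[int],
                     max_iters: int = 10, budget: int = 2):
    """Alternate translator and discriminator until no failing test is found."""
    suite = list(seed_tests)
    patches: Dict[int, int] = {}
    for iteration in range(1, max_iters + 1):
        patches = translator_step(patches, suite)
        translated = make_translation(patches)
        new_tests = discriminator_step(translated, candidates, budget)
        if not new_tests:  # assumed stopping rule: discriminator comes up empty
            return translated, suite, iteration
        suite.extend(new_tests)
    # Iteration budget exhausted: refine once more against the final suite.
    patches = translator_step(patches, suite)
    return make_translation(patches), suite, max_iters


if __name__ == "__main__":
    translated, suite, iters = adversarial_loop([(0, 0), (3, 3)], list(range(-5, 6)))
    print(f"stopped after {iters} iterations with {len(suite)} tests in the suite")
```

The point of the sketch is the division of labor: the translator only ever sees the current suite, while the discriminator's sole job is to grow that suite with counterexamples, so each round exposes behavior the previous tests missed.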


