Automatic code optimization remains a difficult challenge, particularly for complex loop nests on modern hardware. This paper investigates a novel approach in which Large Language Models (LLMs) guide the optimization process through a closed-loop interaction with a compiler. We present ComPilot, an experimental framework that leverages off-the-shelf LLMs, without any task-specific fine-tuning, as interactive optimization agents. ComPilot establishes a feedback loop in which the LLM proposes transformations for a given loop nest; the compiler attempts them and reports back their legality status and the measured speedup or slowdown. The LLM uses this concrete feedback to iteratively refine its optimization strategy. An extensive evaluation across the PolyBench benchmark suite demonstrates the effectiveness of this zero-shot approach: ComPilot achieves geometric mean speedups of 2.66x (single run) and 3.54x (best of 5 runs) over the original code. ComPilot is also competitive with the state-of-the-art Pluto polyhedral optimizer, outperforming it in many cases. These results show that general-purpose LLMs, when grounded by compiler feedback, can effectively guide code optimization, opening promising research directions for agentic AI in this area.
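To make the closed loop concrete, here is a minimal illustrative sketch in Python. All names (`query_llm`, `try_transformation`, `Feedback`, the iteration budget) are hypothetical placeholders standing in for the components described above, not ComPilot's actual API.

```python
# Minimal sketch of the closed-loop protocol described in the abstract.
# All identifiers here are hypothetical; a real system would plug in an
# actual LLM API and a compiler toolchain in place of the stubs.

from dataclasses import dataclass


@dataclass
class Feedback:
    legal: bool      # did the compiler accept the transformation as valid?
    speedup: float   # measured speedup (>1.0) or slowdown (<1.0) vs. original


def query_llm(loop_nest: str, history: list[tuple[str, Feedback]]) -> str:
    """Ask an off-the-shelf LLM for the next transformation to try, given
    the loop nest and all (proposal, feedback) pairs so far. Stub only."""
    return "interchange(i, j)"  # e.g., a loop-interchange directive


def try_transformation(loop_nest: str, proposal: str) -> Feedback:
    """Hand the proposal to the compiler, which checks legality and, if
    legal, benchmarks the transformed code. Stub only."""
    return Feedback(legal=True, speedup=1.8)


def optimize(loop_nest: str, budget: int = 10) -> tuple[str, float]:
    """Closed-loop optimization: propose, measure, refine."""
    history: list[tuple[str, Feedback]] = []
    best_proposal, best_speedup = "", 1.0
    for _ in range(budget):
        proposal = query_llm(loop_nest, history)
        fb = try_transformation(loop_nest, proposal)
        history.append((proposal, fb))  # ground the next query in real feedback
        if fb.legal and fb.speedup > best_speedup:
            best_proposal, best_speedup = proposal, fb.speedup
    return best_proposal, best_speedup
```

The essential design choice the sketch captures is that the LLM never reasons in the dark: every proposal is validated by the compiler's legality check and scored by a measured runtime before the next round of refinement.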