AI agents powered by large language models (LLMs) are being used to solve increasingly complex software engineering challenges, but they still struggle with hardware design tasks. Register Transfer Level (RTL) code presents a unique challenge for LLMs because it encodes complex, dynamic, time-evolving behaviors using the low-level language features of SystemVerilog. LLMs struggle to infer these behaviors from the syntax of RTL alone, which limits their ability to complete downstream tasks such as code completion, documentation, or verification. In response, we present DUET, a general methodology for developing Design Understanding via Experimentation and Testing. DUET mimics how hardware design experts come to understand complex designs: not just through a one-off read-through of the RTL, but through iterative experimentation with a range of tools. DUET iteratively generates hypotheses, tests them with EDA tools (e.g., simulation, waveform inspection, and formal verification), and integrates the results to build a bottom-up understanding of the design. In our evaluations, we show that DUET improves AI agent performance on formal verification compared to a baseline flow without experimentation.
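To make the hypothesize, test, and integrate loop concrete, the following is a minimal Python sketch and not DUET's actual implementation. Everything in it is an assumption made for illustration: the design under study is a toy 2-bit wrapping counter modeled as a Python function rather than SystemVerilog, `simulate_counter` stands in for a real simulator, the hypotheses are hand-written rather than generated by an LLM, and the names `Hypothesis`, `Understanding`, and `run_experiments` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def simulate_counter(enables: List[int]) -> List[int]:
    """Toy stand-in for a simulator: a 2-bit wrapping counter with an enable.

    Returns the counter value after each clock cycle, starting from 0.
    """
    value, trace = 0, []
    for en in enables:
        if en:
            value = (value + 1) % 4
        trace.append(value)
    return trace


def previous(trace: List[int], i: int) -> int:
    """Counter value before cycle i (the reset value 0 before the first cycle)."""
    return trace[i - 1] if i > 0 else 0


@dataclass
class Hypothesis:
    claim: str
    # check(enables, trace) returns True if the claim holds on this run
    check: Callable[[List[int], List[int]], bool]


@dataclass
class Understanding:
    confirmed: List[str] = field(default_factory=list)
    refuted: List[str] = field(default_factory=list)


def run_experiments(hypotheses: List[Hypothesis],
                    stimuli: List[List[int]]) -> Understanding:
    """Test each hypothesis against every stimulus and record the outcome."""
    understanding = Understanding()
    for hyp in hypotheses:
        holds = all(hyp.check(en, simulate_counter(en)) for en in stimuli)
        bucket = understanding.confirmed if holds else understanding.refuted
        bucket.append(hyp.claim)
    return understanding


# Hand-written hypotheses (in the sketch's framing, an agent would propose these).
HYPOTHESES = [
    Hypothesis(
        "holds when enable is low",
        lambda en, tr: all(tr[i] == previous(tr, i)
                           for i in range(len(tr)) if not en[i]),
    ),
    Hypothesis("never exceeds 3", lambda en, tr: all(v <= 3 for v in tr)),
    Hypothesis(
        "saturates at 3",
        lambda en, tr: all(tr[i] == 3 for i in range(len(tr))
                           if en[i] and previous(tr, i) == 3),
    ),
]

# Two enable sequences: always enabled, and enabled on alternating cycles.
STIMULI = [[1, 1, 1, 1, 1], [1, 0, 1, 0, 1]]

result = run_experiments(HYPOTHESES, STIMULI)
print("confirmed:", result.confirmed)
print("refuted:", result.refuted)
```

In this sketch the saturation hypothesis is refuted by the always-enabled stimulus, where the counter wraps from 3 to 0. Such a behavior is easy to misread from syntax alone and is settled by a single experiment. A fuller flow would feed the refuted and confirmed claims back into the next round of hypothesis generation.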