Using intrinsic functions to exploit hardware-specific capabilities is an important approach to optimizing library performance. Many mainstream libraries implement a large number of vectorized algorithms with Arm or x86 SIMD intrinsics. With the rapid expansion of the RISC-V hardware-software ecosystem, demand for RISC-V Vector (RVV) extension support is growing. Translating existing vectorized intrinsic code to RVV intrinsics is a practical and effective approach. However, current cross-architecture translation relies largely on manual rewriting, which is time-consuming and error-prone. Furthermore, while some rule-based methods can reduce the need for manual intervention, their translation success rate is limited by incomplete rule coverage and syntactic constraints, and the performance of the translated code suffers from inadequate use of RVV-specific features. We present IntrinTrans, an LLM-based multi-agent approach that uses compile-and-test feedback to translate intrinsic code across architectures automatically, and further optimizes the generated RVV intrinsics using register-usage information derived from liveness analysis. To evaluate the effectiveness of our approach, we collected 34 vectorized algorithm cases from open-source libraries. Each case includes an Arm Neon intrinsics implementation and an RVV intrinsics implementation contributed by the open-source community, together with correctness and performance tests. Our experiments show that advanced LLMs produce semantically correct RVV intrinsics in most cases within a limited number of iterations, and in some cases achieve up to 5.93x the performance of the native implementation from the open-source community.