OMPILOT: Harnessing Transformer Models for Auto Parallelization to Shared Memory Computing Paradigms

Recent advances in large language models (LLMs) have significantly accelerated progress in code translation, enabling more accurate and efficient transformation across programming languages. While originally developed for natural language processing, LLMs have shown strong capabilities in modeling programming language syntax and semantics, outperforming traditional rule-based systems in both accuracy and flexibility. These models have streamlined cross-language conversion, reduced development overhead, and accelerated legacy code migration. In this paper, we introduce OMPILOT, a novel domain-specific encoder-decoder transformer tailored for translating C++ code into OpenMP, enabling effective shared-memory parallelization. OMPILOT leverages custom pre-training objectives that incorporate the semantics of parallel constructs and combines both unsupervised and supervised learning strategies to improve code translation robustness. Unlike previous work that focused primarily on loop-level transformations, OMPILOT operates at the function level to capture a wider semantic context. To evaluate our approach, we propose OMPBLEU, a novel composite metric specifically crafted to assess the correctness and quality of OpenMP parallel constructs, addressing limitations in conventional translation metrics.
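To make the translation task concrete, the following is a minimal illustrative sketch of a function-level C++ to OpenMP pair. It is not taken from the paper or from OMPILOT's output; the function names `dot_serial` and `dot_openmp` are hypothetical, and the example only assumes the standard OpenMP `parallel for` directive with a `reduction` clause. Note that the directive is correct only in light of the accumulator declared outside the loop, which is the kind of context a function-level view exposes.

```cpp
#include <cstddef>
#include <vector>

// Serial input (hypothetical): a plain C++ function whose loop
// accumulates into a single shared variable.
double dot_serial(const std::vector<double>& a, const std::vector<double>& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

// One plausible parallel target (an assumption, not OMPILOT output):
// the same function with an OpenMP work-sharing loop. The reduction
// clause gives each thread a private copy of `sum` and combines the
// copies at the end, which avoids a data race on the accumulator.
double dot_openmp(const std::vector<double>& a, const std::vector<double>& b) {
    double sum = 0.0;
    const long n = static_cast<long>(a.size());
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

When compiled without OpenMP support (for example, without `-fopenmp` on GCC or Clang), the pragma is ignored and `dot_openmp` runs serially, so both functions compute the same sum. With OpenMP enabled, the order of floating-point additions may differ, so results can vary in the last bits for general inputs.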
