As the hardware industry moves towards specialized heterogeneous many-core processors to mitigate the effects of the power wall, software developers are finding it hard to deal with the complexity of these systems. This article shares our experience in developing a programming model, together with its supporting compiler and libraries, for Matrix-3000, which is designed for next-generation exascale supercomputers but has a complex memory hierarchy and processor organization. To support software development for it, we built a software stack from scratch that includes a low-level programming interface and a high-level OpenCL compiler. The low-level programming model offers native support for programming the bare-metal accelerators of Matrix-3000, while the high-level model lets programmers use the OpenCL programming standard. We detail our design choices and highlight the lessons learned from developing systems software that enables the programming of bare-metal accelerators. Our programming models have been deployed in the production environment of an exascale prototype system.