Continuous integration (CI) has become a ubiquitous practice in modern software development, with major code hosting services offering free automation on popular platforms. CI offers major benefits because it enables detecting bugs in code before changes are committed. While high-performance computing (HPC) research relies heavily on software, HPC machines are not considered "common" platforms. This presents several challenges that hinder the adoption of CI in HPC environments, making it difficult to maintain bug-free HPC projects and adversely affecting the research community. In this article, we explore the challenges that impede HPC CI, such as hardware diversity, security, isolation, administrative policies, and non-standard authentication, environments, and job submission mechanisms. We propose several solutions that could enhance the quality of HPC software and the experience of developers. Implementing these solutions would require significant changes at HPC centers, but, if made, these changes would ultimately enable faster and better science.