Background. The rapidly growing popularity of machine learning (ML) applications has led to increasing interest in MLOps, that is, the practice of continuous integration and deployment (CI/CD) of ML-enabled systems. Aims. Since changes may affect not only the code but also the ML model parameters and the data themselves, the automation of traditional CI/CD needs to be extended to manage model retraining in production. Method. In this paper, we present an initial investigation of the MLOps practices implemented in a set of ML-enabled systems retrieved from GitHub, focusing on GitHub Actions and CML, two solutions for automating the development workflow. Results. Our preliminary results suggest that the adoption of MLOps workflows in open-source GitHub projects is currently rather limited. Conclusions. We also identify issues that can guide future research.
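To make the kind of repository inspection described under Method more concrete, the following is a minimal sketch of how workflow files in a cloned repository might be scanned for signs of CML usage. It is not the procedure used in the paper: the function names `classify_workflow` and `classify_repo`, the labels `"cml"` and `"actions-only"`, and the matching heuristic (a reference to the `iterative/setup-cml` action or a `cml` command) are all assumptions made for illustration. The only convention relied on is that GitHub Actions workflows are YAML files stored under `.github/workflows`.

```python
import re
from pathlib import Path

# Heuristic markers of CML usage (an assumption for illustration, not the
# paper's criteria): the setup action, or a "cml <command>" / "cml-<command>" call.
CML_PATTERN = re.compile(r"iterative/setup-cml|\bcml[ -][a-z]", re.IGNORECASE)


def classify_workflow(text: str) -> str:
    """Label the text of one workflow file as 'cml' or 'actions-only'."""
    return "cml" if CML_PATTERN.search(text) else "actions-only"


def classify_repo(repo_root: str) -> dict:
    """Map each workflow file name in a local clone to its label.

    Returns an empty dict when the repository defines no workflows.
    """
    workflows = Path(repo_root) / ".github" / "workflows"
    if not workflows.is_dir():
        return {}
    results = {}
    for path in sorted(workflows.iterdir()):
        if path.suffix in {".yml", ".yaml"}:
            text = path.read_text(encoding="utf-8", errors="replace")
            results[path.name] = classify_workflow(text)
    return results
```

A text-based heuristic like this can misclassify files (for example, a job that merely happens to be named `cml-test`), so any counts derived from it would need manual validation.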