Over the past decades, revolutionary advances in Machine Learning (ML) have led to the rapid adoption of ML models in software systems of diverse types. Such Machine Learning Software Applications (MLSAs) are gaining importance in our daily lives, which makes their Quality Assurance (QA) of paramount importance. Several research efforts have been dedicated to determining the specific challenges that arise when adopting ML models into software systems. However, we are aware of no research that offers a holistic view of how those ML quality assurance challenges are distributed across the phases of the software development life cycle (SDLC). This paper presents an in-depth literature review of a large volume of research papers focused on the quality assurance of ML models. We develop a taxonomy of MLSA quality assurance issues by mapping the various ML adoption challenges to the different phases of the SDLC. Based on the taxonomy, we provide recommendations and research opportunities to improve SDLC practices. This mapping can help prioritize quality assurance efforts for MLSAs where the adoption of ML models can be considered crucial.