Context: Software startups already struggle to apply proper software engineering practices, and the non-deterministic nature of machine learning techniques makes this even harder for machine learning (ML) startups. Objective: Our study therefore aims to build a complete picture of the software engineering practices followed by ML startups and to identify their additional needs. Method: We conducted a systematic literature review of 37 papers published over the last 21 years, selecting papers on both general software startups and ML startups. We collected data on software engineering (SE) practices in five phases of the software development life-cycle: requirements engineering, design, development, quality assurance, and deployment. Results: We find differences between the software engineering practices of ML startups and those of general software startups. These differences are most prominent in the data management and model learning phases. Conclusion: While ML startups face many of the same challenges as general software startups, the additional difficulties of using stochastic ML models require different strategies for applying software engineering practices to produce high-quality products.