The development of Machine Learning (ML) models is more than just a special case of software development (SD): ML models acquire properties and fulfill requirements without direct human interaction, in a seemingly uncontrollable manner. Nonetheless, the underlying processes can be described formally. We define a comprehensive SD process model for ML that consistently encompasses most tasks and artifacts described in the literature. In addition to the production of the necessary artifacts, we also focus on generating and validating fitting descriptions in the form of specifications. We stress the importance of further evolving the ML model throughout its life cycle, even after initial training and testing. Thus, we provide various interaction points with standard SD processes, in which ML is often an encapsulated task. Further, our SD process model allows ML to be formulated as a (meta-)optimization problem. If automated rigorously, it can be used to realize self-adaptive autonomous systems. Finally, our SD process model features a description of time that allows reasoning about the progress within ML development processes. This might lead to further applications of formal methods within the field of ML.