Context and motivation: The development and operation of critical software that contains machine learning (ML) models requires diligence and established processes. In particular, the training data used during the development of an ML model has a major influence on the later behaviour of the system. Runtime monitors are used to provide guarantees for that behaviour. Question / problem: We see major uncertainty in how to specify training data and runtime monitoring for critical ML models, and thereby how to specify the final functionality of the system. In this interview-based study we investigate the challenges underlying these difficulties. Principal ideas/results: Based on ten interviews with practitioners who develop ML models for critical applications in the automotive and telecommunication sectors, we identified 17 underlying challenges in 6 challenge groups that relate to the difficulty of specifying training data and runtime monitoring. Contribution: The article provides a list of the identified underlying challenges related to the difficulties practitioners experience when specifying training data and runtime monitoring for ML models. Furthermore, interconnections between the challenges were found, and based on these connections recommendations are proposed to address the root causes of the challenges.