The massive trend of integrating data-driven AI capabilities into traditional software systems is raising intriguing new challenges. One such challenge is achieving a smooth transition from the explorative phase of Machine Learning projects, in which data scientists build prototypical models in the lab, to their production phase, in which software engineers translate prototypes into production-ready AI components. To narrow the gap between these two phases, the tools and practices adopted by data scientists might be improved by incorporating consolidated software engineering solutions. In particular, computational notebooks play a prominent role in determining the quality of data science prototypes. In my research project, I address this challenge by studying best practices for collaboration with computational notebooks and proposing proof-of-concept tools to foster compliance with guidelines.