Modern software systems and products increasingly rely on machine learning models to make data-driven decisions based on interactions with users and systems, e.g., compute infrastructure. For broader adoption, this practice must (i) accommodate software engineers without ML backgrounds, and (ii) provide mechanisms to optimize for product goals. In this work, we describe general principles and a specific end-to-end ML platform, Looper, which offers easy-to-use APIs for decision-making and feedback collection. Looper supports the full ML lifecycle, from online data collection through model training, deployment, and inference, and extends to evaluation and tuning against product goals. We outline the platform architecture and the overall impact of its production deployment: Looper currently hosts 700 ML models and makes 6 million decisions per second. We also describe the learning curve and summarize the experiences of platform adopters.