In recent years, Web services have become increasingly intelligent (e.g., in understanding user preferences) thanks to the integration of components that rely on Machine Learning (ML). Before users can interact with an ML-based service (ML-service) in the inference phase, the underlying ML model must learn from existing data in the training phase, a process that requires long-running batch computations. Managing these two distinct phases is complex, and meeting time and quality requirements is hard to achieve with manual approaches. This paper highlights some of the major issues in managing ML-services in both training and inference modes and presents some initial solutions that are able to meet set requirements with minimal user input. A preliminary evaluation shows that our solutions make these systems more efficient and more predictable with respect to their response time and accuracy.