Many cloud applications are being migrated from the monolithic model to a microservices framework in which hundreds of loosely coupled microservices run concurrently, with significant benefits in terms of scalability, rapid development, modularity, and isolation. However, dependencies among microservices with uneven execution times may result in longer queues, idle resources, or Quality-of-Service (QoS) violations. In this paper, we introduce Reclaimer, a deep reinforcement learning model that adapts to runtime changes in the number and behavior of microservices in order to minimize CPU core allocation while meeting QoS requirements. When evaluated with two benchmark microservice-based applications, Reclaimer reduces the mean CPU core allocation by 38.4% to 74.4% relative to the industry-standard scaling solution, and by 27.5% to 58.1% relative to a current state-of-the-art method.