Increasing scale and heterogeneity in data centers have led to the development of federated clusters such as KubeFed, Hydra, and Pigeon, which federate individual data center clusters. In this work, we introduce Megha, a novel decentralized resource management framework for such federated clusters. Megha employs flexible logical partitioning of clusters to distribute its scheduling load, ensuring that workload requirements are satisfied with very low scheduling overhead. Unlike schedulers that use a tiered architecture or rely on centralized databases, Megha uses a distributed global scheduler that does not depend on a centralized data store and instead works with eventual consistency. Our experiments show that Megha can schedule tasks while taking fairness and placement constraints into account, with low resource allocation times on the order of tens of milliseconds.