Cloud resource management is often modeled by two-dimensional bin packing with a set of items that correspond to tasks having fixed CPU and memory requirements. However, applications running in clouds are much more flexible: modern frameworks allow to (horizontally) scale a single application to dozens, even hundreds of instances; and then the load balancer can precisely divide the workload between them. We analyze a model that captures this (semi)-flexibility of cloud resource management. Each cloud application is characterized by its memory footprint and its momentary CPU load. Combining the scheduler and the load balancer, the resource manager decides how many instances of each application will be created and how the CPU load will be balanced between them. In contrast to the divisible load model, each instance of the application requires a certain amount of memory, independent of the number of instances. Thus, the resource manager effectively trades additional memory for more evenly balanced load. We study two objectives: the bin-packing-like minimization of the number of machines used; and the makespan-like minimization of the maximum load among all the machines. We prove NP-hardness of the general problems, but also propose polynomial-time exact algorithms for boundary special cases. Notably, we show that (semi)-flexibility may result in reducing the required number of machines by a tight factor of $2-\varepsilon$. For the general case, we propose heuristics that we validate by simulation on instances derived from the Azure trace.
翻译:云层资源管理通常以一套符合固定 CPU 和 内存要求的任务的二维垃圾包装成模。 但是, 在云层中运行的应用程序要灵活得多: 现代框架允许( horizontally) 将一个应用程序缩放到几十个甚至数百个实例; 然后负载平衡器可以精确地将工作量区分在它们之间。 我们分析一个模型, 捕捉云层资源管理的这种( 半) 灵活性。 每个云层应用的特征是它的记忆足迹和瞬间负载。 将调度器和负载平衡器合并起来, 资源管理器将决定每个应用程序将创建多少实例, 以及如何在它们之间实现平衡。 与可辨别的负载模型相比, 每个应用程序的每个实例都需要一定的记忆量, 独立于实例的数量。 因此, 资源管理器可以有效地交换额外的记忆, 以更平衡的负载量。 我们研究两个目标: 将使用的机器的数量与平面包装相仿为最小; 以及将所有机器的最大负载量最小化为最小的最小。 我们证明, 将普通的 NP- hard- hardlestralal- sleval 需要一个常规的硬度 。