Resource allocation problems in many computer systems can be formulated as mathematical optimization problems. Solving these problems exactly with off-the-shelf solvers, however, is often intractable at large problem sizes under tight SLAs, which leads system designers to rely on cheap heuristic algorithms. We observe that many allocation problems are granular: they consist of a large number of clients and resources, each client requests a small fraction of the total number of resources, and clients can interchangeably use different resources. For these problems, we propose an alternative approach that reuses the original optimization problem formulation and leads to better allocations than domain-specific heuristics. Our technique, Partitioned Optimization Problems (POP), randomly splits the problem into smaller problems (each with a subset of the clients and resources in the system) and coalesces the resulting sub-allocations into a global allocation for all clients. We provide theoretical and empirical evidence as to why random partitioning works well. In our experiments, POP achieves allocations within 1.5% of the optimal with orders-of-magnitude improvements in runtime compared to existing systems for cluster scheduling, traffic engineering, and load balancing.
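The split-solve-coalesce structure described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the function names `pop_solve` and `max_min_fair`, the choice to model interchangeable resources as a pooled capacity, the assignment of whole resources to partitions, and the use of a water-filling max-min fair allocator as a toy stand-in for the optimization solver that would normally handle each sub-problem.

```python
import random
from typing import Callable, Dict

Allocation = Dict[str, float]
Solver = Callable[[Dict[str, float], float], Allocation]


def max_min_fair(demands: Dict[str, float], capacity: float) -> Allocation:
    """Toy sub-problem solver (assumption): water-filling max-min fairness.

    Clients are served in order of increasing demand; each receives the
    smaller of its demand and an equal share of the remaining capacity.
    """
    allocation: Allocation = {}
    remaining = capacity
    pending = sorted(demands.items(), key=lambda item: item[1])
    for index, (client, demand) in enumerate(pending):
        fair_share = remaining / (len(pending) - index)
        grant = min(demand, fair_share)
        allocation[client] = grant
        remaining -= grant
    return allocation


def pop_solve(
    demands: Dict[str, float],
    capacities: Dict[str, float],
    k: int,
    solve: Solver = max_min_fair,
    seed: int = 0,
) -> Allocation:
    """Randomly partition clients and resources into k sub-problems,
    solve each independently, and merge the sub-allocations."""
    rng = random.Random(seed)
    clients = list(demands)
    resources = list(capacities)
    rng.shuffle(clients)
    rng.shuffle(resources)

    allocation: Allocation = {}
    for part in range(k):
        # Striding over the shuffled lists yields k random, near-equal subsets.
        sub_demands = {c: demands[c] for c in clients[part::k]}
        # Resources are interchangeable here, so a subset is just its pooled capacity.
        sub_capacity = sum(capacities[r] for r in resources[part::k])
        # Coalescing is a plain merge because the client subsets are disjoint.
        allocation.update(solve(sub_demands, sub_capacity))
    return allocation


if __name__ == "__main__":
    demo_rng = random.Random(1)
    demands = {f"client{i}": demo_rng.uniform(1.0, 5.0) for i in range(1000)}
    capacities = {f"resource{j}": 20.0 for j in range(100)}
    full = pop_solve(demands, capacities, k=1)
    split = pop_solve(demands, capacities, k=10)
    print("smallest allocation, unpartitioned:", min(full.values()))
    print("smallest allocation, 10 partitions:", min(split.values()))
```

With `k=1` the sketch reduces to solving the full problem once, which gives a baseline for comparing partitioned results. The toy solver is already fast, so the sketch shows only the structure of the approach; any runtime benefit would come from sub-problem solvers whose cost grows faster than linearly with problem size.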