The distributed matrix multiplication problem with an unknown number of stragglers is considered, where the goal is to efficiently and flexibly obtain the product of two massive matrices by distributing the computation across N servers. There are up to N - R stragglers, but the exact number is not known a priori. To reduce the computation load of each server, a flexible solution is proposed that fully utilizes the computation capability of the available servers. The computing task for each server is separated into several subtasks, constructed based on Entangled Polynomial codes by Yu et al. The final result can be obtained either from a larger number of servers, each completing a smaller amount of computation, or from a smaller number of servers, each completing a larger amount of computation. The required finite field size of the proposed solution is less than 2N. Moreover, the optimal design parameters, such as the partitioning of the input matrices, are discussed. Our constructions can also be generalized to other settings such as batch distributed matrix multiplication and secure distributed matrix multiplication.
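To make the building block concrete, the following is a minimal sketch of the baseline Entangled Polynomial code of Yu et al., on which the subtasks are based. It is not the flexible subtask construction proposed here. It computes C = A^T B over a prime field, with A split into p x m blocks and B into p x n blocks, and decodes from any K = pmn + p - 1 server results by polynomial interpolation. The field size Q = 257, the evaluation points 1, ..., N, and all function names are assumptions chosen for illustration.

```python
import random

Q = 257  # prime field size; an illustrative assumption, not a value from the paper


def matmul_t(A, B):
    """Return A^T B mod Q, for A (s x a) and B (s x b) given as lists of rows."""
    s, a, b = len(A), len(A[0]), len(B[0])
    return [[sum(A[k][i] * B[k][j] for k in range(s)) % Q for j in range(b)]
            for i in range(a)]


def encode(M, p, parts, exponent, x):
    """Evaluate sum_{j,k} M_{j,k} * x^exponent(j,k) mod Q over a p x parts block grid."""
    rows, cols = len(M) // p, len(M[0]) // parts
    out = [[0] * cols for _ in range(rows)]
    for j in range(p):
        for k in range(parts):
            coeff = pow(x, exponent(j, k), Q)
            for r in range(rows):
                for c in range(cols):
                    entry = M[j * rows + r][k * cols + c]
                    out[r][c] = (out[r][c] + coeff * entry) % Q
    return out


def vandermonde_inverse(xs):
    """Invert the Vandermonde matrix [x_t^d] mod Q by Gauss-Jordan elimination."""
    K = len(xs)
    aug = [[pow(x, d, Q) for d in range(K)] + [int(i == t) for i in range(K)]
           for t, x in enumerate(xs)]
    for col in range(K):
        piv = next(r for r in range(col, K) if aug[r][col])
        aug[col], aug[piv] = aug[piv], aug[col]
        inv = pow(aug[col][col], Q - 2, Q)  # modular inverse via Fermat's little theorem
        aug[col] = [v * inv % Q for v in aug[col]]
        for r in range(K):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * w) % Q for v, w in zip(aug[r], aug[col])]
    return [row[K:] for row in aug]


def coded_matmul(A, B, p, m, n, N, available):
    """Recover A^T B mod Q from the servers listed in `available` (indices 0..N-1)."""
    K = p * m * n + p - 1            # recovery threshold of the baseline code
    xs = list(range(1, N + 1))       # distinct nonzero evaluation points; requires N < Q
    results = {}
    for i in available:              # each non-straggling server does one block product
        A_enc = encode(A, p, m, lambda j, k: j + k * p, xs[i])
        B_enc = encode(B, p, n, lambda j, k: p - 1 - j + k * p * m, xs[i])
        results[i] = matmul_t(A_enc, B_enc)
    used = sorted(results)[:K]
    if len(used) < K:
        raise ValueError("too many stragglers: need at least K results")
    V_inv = vandermonde_inverse([xs[i] for i in used])
    a, b = len(A[0]) // m, len(B[0]) // n
    C = [[0] * (n * b) for _ in range(m * a)]
    for k in range(m):
        for k2 in range(n):
            d = p - 1 + k * p + k2 * p * m   # exponent whose coefficient is block C_{k,k2}
            for r in range(a):
                for c in range(b):
                    C[k * a + r][k2 * b + c] = sum(
                        V_inv[d][t] * results[i][r][c] for t, i in enumerate(used)) % Q
    return C


# Example: N = 8 servers, p = 2, m = 2, n = 1, so K = 5 and up to 3 stragglers are tolerated.
random.seed(0)
A = [[random.randrange(Q) for _ in range(4)] for _ in range(4)]
B = [[random.randrange(Q) for _ in range(2)] for _ in range(4)]
C = coded_matmul(A, B, p=2, m=2, n=1, N=8, available=[0, 2, 3, 5, 7])
```

In this baseline every server performs one fixed-size product, and results beyond the first K go unused. The proposed scheme addresses this by splitting each server's task into subtasks, so that work done by additional available servers also counts towards decoding.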