Motivated by its value in practical applications, we consider the coded distributed matrix multiplication problem of computing $AA^\top$ in a distributed computing system with $N$ worker nodes and a master node, where the input matrices $A$ and $A^\top$ are partitioned into $m$-by-$p$ and $p$-by-$m$ grids of equal-size sub-matrices, respectively. For effective straggler mitigation, we propose a novel computation strategy, named \emph{folded polynomial code}, obtained by modifying the entangled polynomial code. Moreover, we characterize a lower bound on the optimal recovery threshold among all linear computation strategies when the underlying field is the real number field, and show that folded polynomial codes achieve this bound when $m=1$. Compared with all known computation strategies for coded distributed matrix multiplication, folded polynomial codes perform better in terms of recovery threshold, download cost, and decoding complexity.
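To make the setting concrete, the partition can be written out as below. This is only an illustrative sketch: the block labels $A_{i,k}$ and $C_{i,j}$ are notation assumed here and are not fixed by the text above.

\[
A = \begin{bmatrix} A_{1,1} & \cdots & A_{1,p} \\ \vdots & \ddots & \vdots \\ A_{m,1} & \cdots & A_{m,p} \end{bmatrix},
\qquad
AA^\top = \bigl[ C_{i,j} \bigr]_{i,j=1}^{m},
\qquad
C_{i,j} = \sum_{k=1}^{p} A_{i,k} A_{j,k}^\top .
\]

Because $C_{j,i} = C_{i,j}^\top$, the product contains at most $m(m+1)/2$ distinct blocks rather than $m^2$. This symmetry is plausibly the redundancy that a strategy specialized to $AA^\top$ can exploit, although the text above does not state how the folded construction uses it.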