We introduce two generalizations of the paradigm of using Random Khatri-Rao Product (RKRP) codes for distributed matrix multiplication. First, we introduce a class of codes called Sparse Random Khatri-Rao Product (SRKRP) codes, which have sparse generator matrices. When the input matrices are sparse, SRKRP codes incur lower encoding, computation, and communication costs than RKRP codes, while exhibiting numerical stability similar to that of other state-of-the-art schemes. We empirically study how the probability that the generator matrix of a randomly chosen SRKRP code, restricted to the set of non-stragglers, is rank deficient depends on various parameters of the coding scheme, including the degree of sparsity of the generator matrix and the number of non-stragglers. Second, we show that if the master node can perform a very small number of matrix product computations in addition to the computations performed by the workers, the failure probability can be substantially reduced.
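The kind of empirical study described in the abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' construction: the sparsity model (each coefficient kept independently with probability `density`, with Gaussian nonzero values), the uniform choice of non-stragglers, and all function and parameter names are assumptions made for illustration. Each worker's generator row is taken to be the Kronecker product of two sparse random coefficient vectors of lengths `m` and `n`, and decoding is counted as a failure when the rows belonging to the non-stragglers have rank below `m * n`.

```python
import numpy as np


def sparse_krp_generator(num_workers, m, n, density, rng):
    """Build a (num_workers, m*n) generator whose rows are Kronecker products.

    Assumed sparsity model (not from the source): each coefficient is kept
    with probability `density`, and nonzero values are standard Gaussian.
    """
    P = rng.standard_normal((num_workers, m)) * (rng.random((num_workers, m)) < density)
    Q = rng.standard_normal((num_workers, n)) * (rng.random((num_workers, n)) < density)
    # Row-wise Kronecker (Khatri-Rao) product: row i equals kron(P[i], Q[i]).
    return np.einsum("ij,ik->ijk", P, Q).reshape(num_workers, m * n)


def estimate_failure_probability(num_workers, num_non_stragglers, m, n,
                                 density, trials=200, seed=0):
    """Estimate how often the non-straggler rows are rank deficient."""
    rng = np.random.default_rng(seed)
    failures = 0
    for _ in range(trials):
        G = sparse_krp_generator(num_workers, m, n, density, rng)
        # Assumption: the set of non-stragglers is chosen uniformly at random.
        survivors = rng.choice(num_workers, size=num_non_stragglers, replace=False)
        # Decoding needs m*n linearly independent rows among the survivors.
        if np.linalg.matrix_rank(G[survivors]) < m * n:
            failures += 1
    return failures / trials


if __name__ == "__main__":
    for density in (0.3, 0.5, 0.8, 1.0):
        p_fail = estimate_failure_probability(
            num_workers=12, num_non_stragglers=6, m=2, n=2, density=density)
        print(f"density={density:.1f}  estimated failure probability={p_fail:.3f}")
```

Under these assumptions, sweeping `density` and `num_non_stragglers` gives a rough picture of the trade-off the abstract refers to: sparser rows are cheaper to encode but more likely to leave the surviving rows rank deficient, while additional non-stragglers make a full-rank set more likely.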