Gradient coding is a coding-theoretic framework that provides robustness against slow or unresponsive machines, known as stragglers, in distributed machine learning applications. Recently, Kadhe et al. proposed a gradient code based on a combinatorial design called a balanced incomplete block design (BIBD), which was shown to outperform many existing gradient codes in worst-case adversarial straggling scenarios. However, the parameters for which such BIBD constructions exist are very limited. In this paper, we aim to overcome this limitation by constructing gradient codes that exist for a wide range of parameters while retaining the superior performance of BIBD gradient codes. We propose two such constructions: one is a probabilistic construction that relaxes the stringent BIBD gradient code constraints, and the other takes the Kronecker product of existing gradient codes. We derive theoretical error bounds for worst-case adversarial straggling scenarios. Simulations show that the proposed constructions can outperform existing gradient codes with similar redundancy per data piece.
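To make the Kronecker product idea concrete, the following is a minimal sketch rather than the paper's implementation. It assumes that a gradient code is represented by a 0/1 assignment matrix whose rows index data pieces and whose columns index workers, and it uses the Fano plane, a (7, 3, 1)-BIBD, as a small example. The function names are hypothetical, and the least-squares residual against the all-ones vector is an assumed error measure that may differ from the one analyzed in the paper.

```python
import numpy as np

# Blocks of the Fano plane, a (7, 3, 1)-BIBD: 7 points, blocks of size 3,
# and every pair of points lies together in exactly one block.
FANO_BLOCKS = [
    (0, 1, 2), (0, 3, 4), (0, 5, 6),
    (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5),
]


def incidence_matrix(blocks, num_points):
    """Assumed convention: rows are data pieces, columns are workers."""
    A = np.zeros((num_points, len(blocks)), dtype=int)
    for worker, block in enumerate(blocks):
        A[list(block), worker] = 1
    return A


def kronecker_code(A1, A2):
    """Combine two assignment matrices into a larger one via np.kron."""
    return np.kron(A1, A2)


def least_squares_error(A, stragglers):
    """Squared residual of the all-ones vector against the surviving columns.

    Assumed error measure for illustration only.
    """
    straggler_set = set(stragglers)
    survivors = [j for j in range(A.shape[1]) if j not in straggler_set]
    ones = np.ones(A.shape[0])
    if not survivors:
        return float(ones @ ones)
    sub = A[:, survivors].astype(float)
    weights, *_ = np.linalg.lstsq(sub, ones, rcond=None)
    residual = sub @ weights - ones
    return float(residual @ residual)


fano = incidence_matrix(FANO_BLOCKS, 7)
product = kronecker_code(fano, fano)  # 49 data pieces, 49 workers
```

In this sketch, the product of the Fano code with itself assigns each of the 49 data pieces to 9 workers, so the redundancy per data piece is the product of the redundancies of the two component codes.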