Gradient coding is a coding-theoretic framework that provides robustness against slow or unresponsive machines, known as stragglers, in distributed machine learning applications. Recently, Kadhe et al. proposed a gradient code based on a combinatorial design called a balanced incomplete block design (BIBD), which was shown to outperform many existing gradient codes in worst-case adversarial straggling scenarios. However, BIBDs exist only for a very limited set of parameters. In this paper, we aim to overcome this limitation and construct gradient codes that exist for a wide range of system parameters while retaining the superior performance of BIBD gradient codes. We propose two such constructions: one is a probabilistic construction that relaxes the stringent constraints of BIBD gradient codes, and the other takes the Kronecker product of existing gradient codes. The proposed gradient codes allow flexible choices of system parameters while retaining comparable error performance.
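As a rough illustration of the two ingredients named in the abstract, the sketch below builds the incidence matrix of the (7, 3, 1) BIBD (the Fano plane) and takes its Kronecker product with itself. It is not the paper's construction or decoder. The helper names `incidence_matrix` and `is_bibd`, and the convention that rows are data partitions and columns are workers, are assumptions made for this example only.

```python
import numpy as np

# Blocks of the (7, 3, 1) BIBD (Fano plane). Assumed convention for this
# sketch: each block lists the data partitions assigned to one worker.
FANO_BLOCKS = [
    (0, 1, 2), (0, 3, 4), (0, 5, 6),
    (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5),
]


def incidence_matrix(blocks, num_points):
    """Return a 0/1 matrix with rows = partitions and columns = workers."""
    B = np.zeros((num_points, len(blocks)), dtype=int)
    for worker, block in enumerate(blocks):
        B[list(block), worker] = 1
    return B


def is_bibd(B, lam):
    """Check the defining BIBD properties of an incidence matrix.

    Every partition must be replicated equally often, every worker must
    hold equally many partitions, and every pair of distinct partitions
    must share exactly `lam` workers.
    """
    row_weights = B.sum(axis=1)
    col_weights = B.sum(axis=0)
    if np.unique(row_weights).size != 1 or np.unique(col_weights).size != 1:
        return False
    gram = B @ B.T  # entry (i, j) counts workers holding both i and j
    off_diagonal = gram[~np.eye(B.shape[0], dtype=bool)]
    return bool(np.all(off_diagonal == lam))


B = incidence_matrix(FANO_BLOCKS, num_points=7)

# Kronecker product of two assignment matrices: 49 partitions, 49 workers.
K = np.kron(B, B)

print("Fano plane is a BIBD:", is_bibd(B, lam=1))
print("Product shape:", K.shape)
print("Product row weights:", np.unique(K.sum(axis=1)))
print("Product column weights:", np.unique(K.sum(axis=0)))
print("Product is a BIBD with lambda = 1:", is_bibd(K, lam=1))
```

In this toy case the product matrix keeps uniform row and column weights (the weights of the factors multiply), but its pairwise overlaps are no longer constant, so it is not itself a BIBD. This is consistent with the abstract's framing that the new codes relax the BIBD constraints in exchange for a wider range of parameters.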