In this paper, we study gradient coding in a hierarchical setting, where intermediate nodes sit between the server and the workers. This structure reduces the bandwidth requirement at the server, which is a bottleneck in conventional gradient coding systems. The intermediate nodes, referred to as $\textit{relays}$, process the data received from the workers and send the results to the server for the final gradient computation. Our main contribution is deriving the optimal communication-computation trade-off by designing a linear coding scheme, inspired by coded computing techniques, that accounts for straggling and adversarial nodes among both relays and workers. Processing the data at the relays makes it possible for the relay-to-server and worker-to-relay communication loads to be simultaneously optimal with respect to the computation load.