Deep neural networks (DNNs) are the de facto standard for essential use cases, such as image classification, computer vision, and natural language processing. As DNNs and datasets grow larger, they require distributed training on increasingly large clusters. A major bottleneck is then the resulting communication overhead, as workers exchange model updates (i.e., gradients) every round. To address this bottleneck and accelerate training, a widely deployed approach is gradient compression. However, previous deployments often construct bi-directional compression schemes by simply applying a uni-directional gradient compression scheme in each direction. This results in significant computational overhead at the parameter server (PS) and increased compression error, leading to longer training and lower accuracy. We introduce Tensor Homomorphic Compression (THC), a novel bi-directional compression framework that enables the direct aggregation of compressed values while optimizing the bandwidth-to-accuracy tradeoff, thus eliminating the aforementioned overheads. Moreover, THC is compatible with in-network aggregation (INA), which allows for further acceleration. Evaluation on a testbed shows that THC improves time-to-accuracy compared with alternatives by up to 1.32x with a software PS and up to 1.51x using INA. Finally, we demonstrate that THC is scalable and tolerant to acceptable packet-loss rates.
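The abstract's key property is that compressed gradients can be summed directly at the PS (or a switch), avoiding a decompress-aggregate-recompress round trip. The sketch below is a minimal illustration of that homomorphic property using a shared uniform quantization grid; it is not THC's actual scheme, and all function names (`uniform_quantize`, `dequantize`) are illustrative assumptions.

```python
# Minimal sketch, assuming a shared uniform quantization grid across workers.
# Because every worker uses the same scale and grid, integer codes are additive:
# sum(q_i) decodes to (approximately) sum(g_i), so a PS or switch can aggregate
# the compressed values directly. This is NOT THC's actual algorithm.
import numpy as np

def uniform_quantize(grad: np.ndarray, scale: float, levels: int = 256) -> np.ndarray:
    """Stochastically round a gradient onto a shared grid of `levels` values."""
    x = np.clip(grad / scale, -1.0, 1.0)                   # normalize to [-1, 1]
    pos = (x + 1.0) / 2.0 * (levels - 1)                    # map to [0, levels-1]
    low = np.floor(pos)
    q = low + (np.random.rand(*pos.shape) < (pos - low))    # unbiased rounding
    return q.astype(np.int32)

def dequantize(q_sum: np.ndarray, scale: float, levels: int, num_workers: int) -> np.ndarray:
    """Recover the approximate gradient sum from the summed integer codes."""
    return (q_sum / (levels - 1) * 2.0 - num_workers) * scale

# Toy run with 4 workers and a shared normalization scale.
workers = [np.random.randn(10).astype(np.float32) for _ in range(4)]
scale = max(np.abs(g).max() for g in workers)
codes = [uniform_quantize(g, scale) for g in workers]       # compress at workers
summed_codes = np.sum(codes, axis=0)                        # aggregate compressed values
approx_sum = dequantize(summed_codes, scale, 256, len(workers))

err = np.abs(approx_sum - np.sum(workers, axis=0)).max()
print(f"max aggregation error: {err:.4f} (grid step = {2 * scale / 255:.4f})")
```

The essential design point is that aggregation happens entirely in the compressed (integer) domain, which is also what makes the approach compatible with in-network aggregation on programmable switches.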