Gradient tracking (GT) is an algorithm designed for solving decentralized optimization problems over a network (such as training a machine learning model). A key feature of GT is a tracking mechanism that allows it to overcome data heterogeneity between nodes. We develop a novel decentralized tracking mechanism, $K$-GT, that enables communication-efficient local updates in GT while inheriting the data-independence property of GT. We prove a convergence rate for $K$-GT on smooth non-convex functions and show that it asymptotically reduces the communication overhead by a linear factor $K$, where $K$ denotes the number of local steps. We illustrate the robustness and effectiveness of this heterogeneity correction on convex and non-convex benchmark problems and on a non-convex neural network training task with the MNIST dataset.
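To make the tracking mechanism concrete, the following is a minimal conceptual sketch of gradient tracking with $K$ local steps on toy scalar quadratics, not the paper's exact $K$-GT updates: each node holds a correction term that estimates the gap between its local gradient and the global gradient, applies it during $K$ local steps, and refreshes it at each communication round. All problem data (the quadratics, ring topology, and step sizes) are illustrative assumptions.

```python
import numpy as np

def grad(x, b):
    # Gradient of the toy local objective f_i(x) = 0.5 * (x - b_i)^2.
    return x - b

n, K, eta, T = 4, 5, 0.1, 60          # nodes, local steps, step size, rounds
b = np.array([1.0, 2.0, 3.0, 10.0])   # heterogeneous local optima; global optimum = mean(b) = 4.0
# Doubly stochastic gossip matrix for a 4-node ring (each node averages with its two neighbors).
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

def run(tracking):
    x = np.zeros(n)   # node iterates
    c = np.zeros(n)   # tracking corrections (estimate global minus local gradient)
    for _ in range(T):
        y = x.copy()
        for _ in range(K):
            # K local gradient steps, each corrected by the tracking term.
            y -= eta * (grad(y, b) + c)
        x_new = W @ y                      # one gossip communication round
        if tracking:
            # Refresh the correction from the drift removed by mixing
            # (an illustrative tracking update, not the paper's exact rule).
            c += (y - x_new) / (eta * K)
        x = x_new
    return x

x_gt = run(tracking=True)     # with correction: all nodes reach the global optimum 4.0
x_local = run(tracking=False) # without correction: nodes are biased by their local data
```

In this toy run, the corrected variant drives every node to the global optimum despite the heterogeneous `b`, while plain local steps with gossip leave a node-dependent bias, which is the failure mode the tracking mechanism is designed to remove.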