We show that the convergence proof of dist-EF-SGD, a recent algorithm of Zheng et al. (NeurIPS 2019) for communication-efficient distributed stochastic gradient descent with error feedback, is mathematically flawed. Concretely, the original error bound for arbitrary learning rate sequences is incorrect, which invalidates the upper bound in the convergence theorem for the algorithm. As evidence, we explicitly provide several counterexamples, in both the convex and non-convex settings, showing that the error bound does not hold. We fix the issue by proving a new error bound, which yields a new convergence theorem for the dist-EF-SGD algorithm and thereby restores its mathematical analysis.
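For readers unfamiliar with the error-feedback mechanism underlying dist-EF-SGD, the following is a minimal single-worker Python sketch of a generic error-feedback compressed SGD update, not the exact multi-worker dist-EF-SGD algorithm of Zheng et al.; the toy quadratic objective, the top-k compressor, and the constant learning-rate schedule are illustrative assumptions only.

```python
import numpy as np

def topk_compress(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest (assumed compressor)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef_sgd(grad, x0, lrs, k):
    """Generic error-feedback compressed SGD (single-worker sketch).

    grad: callable returning a (stochastic) gradient at x
    x0:   initial iterate
    lrs:  sequence of learning rates eta_t
    k:    number of coordinates kept by the compressor
    """
    x = x0.copy()
    e = np.zeros_like(x0)            # accumulated compression error e_t
    for eta in lrs:
        p = eta * grad(x) + e        # corrected step: gradient step plus carried-over error
        delta = topk_compress(p, k)  # the part that would actually be communicated
        x = x - delta                # apply only the compressed update
        e = p - delta                # feed the residual back into the next iteration
    return x

# Usage on a toy quadratic f(x) = 0.5 * ||x||^2, so grad f(x) = x (illustrative only).
x_final = ef_sgd(lambda x: x, np.ones(10), lrs=[0.1] * 100, k=3)
print(np.linalg.norm(x_final))
```

The quantity of interest in the disputed error bound is the size of the residual e_t as a function of the learning rate sequence; the sketch above only illustrates how that residual is generated and recycled, not the bound itself.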