We consider distributed optimization over a $d$-dimensional space, where $K$ remote clients send coded gradient estimates over an {\em additive Gaussian Multiple Access Channel (MAC)} with noise variance $\sigma_z^2$. The codewords from the clients must satisfy an average power constraint $P$, resulting in a signal-to-noise ratio (SNR) of $KP/\sigma_z^2$. In this paper, we study the fundamental limits imposed by the MAC on the convergence rate of any distributed optimization algorithm and design optimal communication schemes to achieve these limits. Our first result is a lower bound on the convergence rate, showing that communicating over a MAC imposes a slowdown of $\sqrt{d/\left(\frac{1}{2}\log(1+\SNR)\right)}$ on any protocol compared to the centralized setting. Next, we design a computationally tractable digital communication scheme that, when combined with a projected stochastic gradient descent algorithm, matches the lower bound up to a logarithmic factor in $K$. At the heart of our communication scheme is a careful combination of several compression and modulation ideas, such as quantizing along random bases, {\em Wyner-Ziv compression}, {\em modulo-lattice decoding}, and {\em amplitude shift keying}. We also show that analog schemes, which are popular due to their ease of implementation, can achieve near-optimal convergence rates at low $\SNR$ but experience a slowdown of roughly $\sqrt{d}$ at high $\SNR$.
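
To give a feel for the size of the slowdown factor in the lower bound, the following is a minimal numerical sketch rather than anything taken from the paper. The function names `mac_snr` and `slowdown` are hypothetical, and the logarithm base is an assumption: the abstract does not state it, so base 2 (bits per channel use) is used as the default and exposed as a parameter.

```python
import math


def mac_snr(num_clients: int, power: float, noise_var: float) -> float:
    """SNR of the Gaussian MAC as stated in the abstract: K * P / sigma_z^2."""
    return num_clients * power / noise_var


def slowdown(dim: int, snr: float, log_base: float = 2.0) -> float:
    """Slowdown factor sqrt(d / (0.5 * log(1 + SNR))).

    The log base is an assumption (not given in the abstract); base 2 is
    used by default, base math.e gives the natural-log version.
    """
    capacity = 0.5 * math.log(1.0 + snr) / math.log(log_base)
    return math.sqrt(dim / capacity)


if __name__ == "__main__":
    # Illustrative values only: K = 3 clients, P = 1, sigma_z^2 = 1, d = 16.
    snr = mac_snr(num_clients=3, power=1.0, noise_var=1.0)
    print(snr, slowdown(dim=16, snr=snr))
```

With these illustrative values the SNR is 3, so $\frac{1}{2}\log_2(1+\SNR) = 1$ and the factor reduces to $\sqrt{d}$; larger SNR shrinks the factor only logarithmically.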