We study first-order optimization algorithms under the constraint that the descent direction is quantized using a pre-specified budget of $R$ bits per dimension, where $R \in (0, \infty)$. We propose computationally efficient optimization algorithms with convergence rates matching the information-theoretic performance lower bounds for: (i) Smooth and Strongly-Convex objectives with access to an Exact Gradient oracle, as well as (ii) General Convex and Non-Smooth objectives with access to a Noisy Subgradient oracle. The crux of these algorithms is a polynomial-complexity source coding scheme that embeds a vector into a random subspace before quantizing it. These embeddings are such that, with high probability, their projection along any of the canonical directions of the transform space is small. As a consequence, quantizing these embeddings and then applying an inverse transform back to the original space yields a source coding method with optimal covering efficiency while utilizing just $R$ bits per dimension. Our algorithms guarantee optimality for arbitrary values of the bit-budget $R$, which includes both the sub-linear budget regime ($R < 1$) and the high-budget regime ($R \geq 1$), while requiring $O\left(n^2\right)$ multiplications, where $n$ is the dimension. We also propose an efficient relaxation of this coding scheme using Hadamard subspaces that requires only near-linear time, i.e., $O\left(n \log n\right)$ additions. Furthermore, we show that the utility of our proposed embeddings can be extended to significantly improve the performance of gradient sparsification schemes. Numerical simulations validate our theoretical claims. Our implementations are available at https://github.com/rajarshisaha95/DistOptConstrComm.
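The embed, quantize, and invert pipeline described above can be illustrated with a minimal sketch. This is not the implementation from the linked repository. It assumes a randomized Hadamard transform (random sign flips followed by an orthonormal Walsh-Hadamard transform), a plain uniform scalar quantizer, an integer budget $R \geq 1$, a dimension $n$ that is a power of two, and a scale factor sent as full-precision side information. The names `fwht`, `encode`, `decode`, `signs`, and `bits` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def fwht(v):
    """Orthonormal fast Walsh-Hadamard transform, O(n log n) additions.

    The normalized transform is its own inverse. Length must be a power of two.
    """
    v = np.asarray(v, dtype=float).copy()
    n = v.shape[0]
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = v[i:i + h].copy()
            b = v[i + h:i + 2 * h].copy()
            v[i:i + h] = a + b
            v[i + h:i + 2 * h] = a - b
        h *= 2
    return v / np.sqrt(n)


def encode(x, signs, bits):
    """Embed x with a randomized Hadamard transform, then quantize uniformly.

    Assumption of this sketch: `bits` is an integer >= 1 and the scale
    (largest embedded magnitude) is transmitted separately.
    """
    y = fwht(signs * x)              # embedding with small coordinates
    scale = float(np.max(np.abs(y)))
    levels = 2 ** bits
    if scale == 0.0:
        return np.zeros(y.shape[0], dtype=int), 0.0
    step = 2.0 * scale / levels      # cell width on [-scale, scale]
    idx = np.clip(np.floor((y + scale) / step), 0, levels - 1).astype(int)
    return idx, scale


def decode(idx, scale, signs, bits):
    """Map indices to cell midpoints, then invert the embedding."""
    step = 2.0 * scale / (2 ** bits)
    y_hat = -scale + (idx + 0.5) * step
    return signs * fwht(y_hat)       # inverse transform, then undo sign flips


rng = np.random.default_rng(0)
n, bits = 64, 2
x = rng.standard_normal(n)
signs = rng.choice([-1.0, 1.0], size=n)   # shared by encoder and decoder
idx, scale = encode(x, signs, bits)
x_hat = decode(idx, scale, signs, bits)
print("relative error:", np.linalg.norm(x - x_hat) / np.linalg.norm(x))
```

Because the transform is orthonormal, the reconstruction error in the original space equals the quantization error in the transform space, which in this sketch is at most $\sqrt{n}\,\cdot\,\text{scale} / 2^{R}$. The embedding helps because it shrinks `scale`: a vector with a single nonzero entry, for example, is spread evenly over all $n$ transform coordinates. The sub-linear regime $R < 1$ needs a different quantizer and is not covered here.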