分布式学习在通用错误上的速率扭曲理论误差 (Rate-Distortion Theoretic Bounds on Generalization Error for Distributed Learning)

In this paper, we use tools from rate-distortion theory to establish new upper bounds on the generalization error of statistical distributed learning algorithms. Specifically, there are $K$ clients whose individually chosen models are aggregated by a central server. The bounds depend on the compressibility of each client's algorithm while keeping other clients' algorithms un-compressed, and leverage the fact that small changes in each local model change the aggregated model by a factor of only $1/K$. Adopting a recently proposed approach by Sefidgaran et al., and extending it suitably to the distributed setting, this enables smaller rate-distortion terms which are shown to translate into tighter generalization bounds. The bounds are then applied to the distributed support vector machines (SVM), suggesting that the generalization error of the distributed setting decays faster than that of the centralized one with a factor of $\mathcal{O}(\log(K)/\sqrt{K})$. This finding is validated also experimentally. A similar conclusion is obtained for a multiple-round federated learning setup where each client uses stochastic gradient Langevin dynamics (SGLD).

翻译：在本文中,我们使用调试率理论的工具来为统计分布式学习算法的普及错误设定新的上限。具体地说, 有K美元客户的客户个人选择的模型由中央服务器汇总。这些界限取决于每个客户的算法的压缩, 同时保持其他客户的算法不受压缩, 并充分利用以下事实,即每个本地模型的微小变化仅以1/ K$的系数改变综合模型。采用Sefidgaran 等人最近提出的一种方法, 并将其扩展为适合分布式设置, 这样可以使较小的调试率术语被显示为更紧凑的通用界限。这些界限随后被应用到分布式的辅助矢量机( SVM), 这表明分布式设置的普遍误差比集中式的差快, 系数为$\mathcal{O} (\log (K)/\qrt{K} 。这一发现也得到实验性验证。每个客户都使用渐变的梯状学习设置, 获得类似的结论。

相关内容

泛化误差

关注 107

学习方法的泛化能力（Generalization Error）是由该方法学习到的模型对未知数据的预测能力，是学习方法本质上重要的性质。现实中采用最多的办法是通过测试泛化误差来评价学习方法的泛化能力。泛化误差界刻画了学习算法的经验风险与期望风险之间偏差和收敛速度。一个机器学习的泛化误差（Generalization Error），是一个描述学生机器在从样品数据中学习之后，离教师机器之间的差距的函数。

【2022新书】机器学习中的统计建模:概念和应用，398页pdf

专知会员服务

142+阅读 · 2022年11月5日

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

INRIA最新「机器学习理论」新书，229页pdf原理性阐述机器学习

专知会员服务

69+阅读 · 2021年3月27日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日