We consider the problem of federated learning in a one-shot setting, in which there are $m$ machines, each observing $n$ sample functions from an unknown distribution over non-convex loss functions. Let $F:[-1,1]^d\rightarrow\mathbb{R}$ be the expected loss function with respect to this unknown distribution. The goal is to estimate the minimizer of $F$. Based on its observations, each machine generates a signal of length at most $B$ bits and sends it to a server. The server collects the signals from all machines and outputs an estimate of the minimizer of $F$. We show that the expected loss of any algorithm is lower bounded by $\max\big(1/(\sqrt{n}(mB)^{1/d}),\, 1/\sqrt{mn}\big)$, up to a logarithmic factor. We then prove that this lower bound is order optimal in $m$ and $n$ by presenting a distributed learning algorithm, called Multi-Resolution Estimator for Non-Convex loss function (MRE-NC), whose expected loss matches the lower bound for large $mn$, up to polylogarithmic factors.
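To make the one-shot communication model concrete, the following is a minimal sketch, assuming a placeholder local summary (a clipped sample mean) and a uniform per-coordinate quantizer; it is not the MRE-NC construction, whose multi-resolution signal design is not detailed in the abstract. Each machine's $n$ sample functions are simplified here to noisy $d$-dimensional observations of the minimizer, and \texttt{machine\_message}, \texttt{server\_estimate}, and the budget split \texttt{B // d} are illustrative assumptions.

\begin{verbatim}
import numpy as np

def machine_message(samples, B, d):
    # One machine's B-bit signal: quantize a local summary of its samples.
    # The mean-based summary and uniform quantizer are placeholders, not MRE-NC.
    local_est = np.clip(samples.mean(axis=0), -1.0, 1.0)  # stay inside [-1, 1]^d
    bits_per_coord = max(B // d, 1)  # split the B-bit budget across coordinates
    levels = 2 ** bits_per_coord
    quantized = np.round((local_est + 1) / 2 * (levels - 1)).astype(int)
    return quantized, levels

def server_estimate(messages):
    # Server: decode each machine's B-bit signal and average the m decoded points.
    decoded = [q / (levels - 1) * 2 - 1 for q, levels in messages]
    return np.mean(decoded, axis=0)

# Toy run: m machines, n samples each, dimension d, budget B bits per machine.
m, n, d, B = 20, 50, 2, 16
rng = np.random.default_rng(0)
minimizer = rng.uniform(-0.5, 0.5, size=d)  # stands in for the minimizer of F
messages = [
    machine_message(minimizer + 0.1 * rng.standard_normal((n, d)), B, d)
    for _ in range(m)
]
print("estimate:", server_estimate(messages), "truth:", minimizer)
\end{verbatim}

Even this toy protocol exhibits the two regimes of the lower bound: for large $B$ the quantization error is negligible and the centralized statistical rate $1/\sqrt{mn}$ dominates, while for small $B$ the term $1/(\sqrt{n}(mB)^{1/d})$, which grows as the per-machine budget shrinks, becomes binding.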