There is a dearth of convergence results for differentially private federated learning (FL) with non-Lipschitz objective functions (i.e., when gradient norms are not uniformly bounded). The primary reason is that the clipping operation (i.e., projection onto an $\ell_2$ ball of a fixed radius called the clipping threshold), used to bound the sensitivity of the average update to each client's update, introduces a bias that depends on the clipping threshold and the number of local steps in FL, and this bias is hard to analyze. For Lipschitz functions, the Lipschitz constant serves as a trivial clipping threshold with zero bias. However, Lipschitzness does not hold in many practical settings; moreover, verifying it and computing the Lipschitz constant are hard. Thus, the choice of the clipping threshold is non-trivial and requires extensive tuning in practice. In this paper, we provide the first convergence result for private FL on smooth \textit{convex} objectives \textit{for a general clipping threshold} -- \textit{without assuming Lipschitzness}. We also consider a simpler alternative to clipping for bounding sensitivity, namely \textit{normalization}, where we use only a scaled version of the unit vector along each client update, completely discarding the magnitude information. The resulting normalization-based private FL algorithm is theoretically shown to have better convergence than its clipping-based counterpart on smooth convex functions. We corroborate our theory with synthetic experiments as well as experiments on benchmark datasets.
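For concreteness, the two sensitivity-bounding maps discussed above can be written as follows, with $C > 0$ denoting the clipping threshold (or scaling factor) and $u$ a client update; the notation here is illustrative and need not match the paper's:
\[
\mathrm{clip}_C(u) \;=\; u \cdot \min\!\left(1, \frac{C}{\lVert u \rVert_2}\right),
\qquad
\mathrm{normalize}_C(u) \;=\; C \cdot \frac{u}{\lVert u \rVert_2}.
\]
Clipping leaves updates with $\lVert u \rVert_2 \le C$ unchanged and rescales larger ones to norm $C$, whereas normalization always rescales to norm $C$, discarding the magnitude information entirely; both maps bound the per-client contribution by $C$, which is what enables the addition of calibrated noise for differential privacy.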