Normalization is an important but understudied challenge in privacy-related application domains such as federated learning (FL), differential privacy (DP), and differentially private federated learning (DP-FL). While the unsuitability of batch normalization for these domains has already been shown, the impact of other normalization methods on the performance of federated or differentially private models is not well known. To address this, we compare the performance of layer normalization (LayerNorm), group normalization (GroupNorm), and the recently proposed kernel normalization (KernelNorm) in FL, DP, and DP-FL settings. Our results indicate that LayerNorm and GroupNorm provide no performance gain over the baseline (i.e., no normalization) for shallow models in FL and DP. They do, however, considerably improve the performance of shallow models in DP-FL and of deeper models in FL and DP. Moreover, KernelNorm significantly outperforms its competitors in terms of accuracy and convergence rate (or communication efficiency) for both shallow and deeper models in all considered learning environments. Given these key observations, we propose a kernel normalized ResNet architecture called KNResNet-13 for differentially private learning. Using the proposed architecture, we report new state-of-the-art accuracy values on the CIFAR-10 and Imagenette datasets when training from scratch.
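As a rough illustration of the distinction the abstract relies on, the following pure-Python sketch contrasts batch-independent normalization (GroupNorm, with LayerNorm as the single-group case) with batch normalization. The data layout (one sample as a list of channels, each a flat list of floats) and all function names are assumptions made for this example and do not come from the paper; learnable scale and shift parameters are omitted, and KernelNorm is not shown. The dependence of batch normalization on the other samples in a batch is a commonly cited reason for its poor fit with per-sample processing, although the abstract itself only states that its unsuitability has been shown.

```python
from statistics import fmean, pvariance


def _normalize(values, eps=1e-5):
    """Shift and scale a flat list of floats to zero mean and (near) unit variance."""
    mu = fmean(values)
    var = pvariance(values, mu)
    return [(v - mu) / (var + eps) ** 0.5 for v in values]


def group_norm(sample, num_groups, eps=1e-5):
    """Normalize ONE sample; statistics come from that sample only.

    `sample` is a list of channels, each a flat list of floats of equal length
    (an assumed layout for this sketch).
    """
    channels = len(sample)
    if channels % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    size = channels // num_groups
    width = len(sample[0])
    out = []
    for start in range(0, channels, size):
        # Pool every value of every channel in this group, then normalize jointly.
        flat = [v for channel in sample[start:start + size] for v in channel]
        normed = _normalize(flat, eps)
        out.extend(normed[i:i + width] for i in range(0, len(normed), width))
    return out


def layer_norm(sample, eps=1e-5):
    """LayerNorm over all channels and positions: GroupNorm with a single group."""
    return group_norm(sample, 1, eps)


def batch_norm(batch, eps=1e-5):
    """Normalize each channel using statistics pooled across the WHOLE batch."""
    channels = len(batch[0])
    out = [[None] * channels for _ in batch]
    for c in range(channels):
        flat = [v for sample in batch for v in sample[c]]
        mu = fmean(flat)
        var = pvariance(flat, mu)
        for n, sample in enumerate(batch):
            out[n][c] = [(v - mu) / (var + eps) ** 0.5 for v in sample[c]]
    return out


# Three toy samples, each with 2 channels of 2 values.
a = [[1.0, 2.0], [3.0, 5.0]]
b = [[0.0, 4.0], [8.0, 1.0]]
c = [[10.0, -3.0], [2.0, 7.0]]

# The batch-normalized version of `a` changes with its batch companions,
# whereas group_norm(a, ...) and layer_norm(a) never see other samples.
a_with_b = batch_norm([a, b])[0]
a_with_c = batch_norm([a, c])[0]
```

In this sketch `a_with_b` and `a_with_c` differ even though the input sample `a` is identical, while `group_norm` and `layer_norm` take a single sample and so cannot depend on the rest of the batch.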