The training of neural networks with Differentially Private Stochastic Gradient Descent offers formal Differential Privacy guarantees but introduces accuracy trade-offs. In this work, we propose to alleviate these trade-offs in residual networks with Group Normalisation through a simple architectural modification termed ScaleNorm, by which an additional normalisation layer is introduced after the residual block's addition operation. Our method allows us to further improve on the recently reported state of the art on CIFAR-10, achieving a top-1 accuracy of 82.5% ($\epsilon = 8.0$) when trained from scratch.
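To make the modification concrete, the following is a minimal PyTorch sketch of a Group-Normalised residual block with the extra normalisation layer placed after the skip-connection addition. The class name, the `post_add_norm` attribute, the group count, and the placement of the activations are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class ScaleNormResidualBlock(nn.Module):
    """Residual block with Group Normalisation and an additional
    normalisation layer after the residual addition (hypothetical sketch)."""

    def __init__(self, channels: int, groups: int = 8):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.gn1 = nn.GroupNorm(groups, channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.gn2 = nn.GroupNorm(groups, channels)
        self.act = nn.ReLU()
        # Extra normalisation applied after the residual addition.
        self.post_add_norm = nn.GroupNorm(groups, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.act(self.gn1(self.conv1(x)))
        out = self.gn2(self.conv2(out))
        out = out + x                   # residual (skip-connection) addition
        out = self.post_add_norm(out)   # additional normalisation layer
        return self.act(out)
```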