We demonstrate that self-supervised pretraining (SSP) is a scalable solution to deep learning with differential privacy (DP) in image classification, regardless of the size of the available public dataset. When no public dataset is available, we show that features generated by SSP on a single image enable a private classifier to obtain much better utility than non-learned, handcrafted features under the same privacy budget. When a moderate or large public dataset is available, the features produced by SSP greatly outperform features trained with labels on various complex private datasets under the same privacy budget. We also compare multiple DP-enabled training frameworks for training a private classifier on the features generated by SSP. Finally, we report a non-trivial utility of 25.3\% on the private ImageNet-1K dataset when $\epsilon=3$.
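To make the pipeline concrete, the sketch below illustrates the general approach of training a private linear classifier on frozen SSP features with DP-SGD. It is a minimal illustration, not the paper's exact setup: the feature dimension, dataset, $\delta$, epochs, and clipping bound are placeholder assumptions, and Opacus is used here as one representative DP-enabled training framework.

```python
# Minimal sketch: DP-SGD (via Opacus) on frozen SSP features.
# All hyperparameters below are illustrative assumptions, not the paper's.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Assume `features` (N x D) were precomputed by a frozen SSP encoder on the
# private images, and `labels` (N,) are the corresponding private labels.
features = torch.randn(1024, 2048)      # placeholder SSP features
labels = torch.randint(0, 10, (1024,))  # placeholder labels
train_loader = DataLoader(TensorDataset(features, labels), batch_size=256)

model = nn.Linear(2048, 10)             # private linear classifier head
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

# Attach DP-SGD: per-sample gradient clipping plus calibrated noise so the
# trained classifier satisfies (epsilon, delta)-DP over the given epochs.
privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    target_epsilon=3.0,                 # matches the epsilon = 3 budget above
    target_delta=1e-5,                  # illustrative delta
    epochs=10,
    max_grad_norm=1.0,                  # per-sample gradient clipping bound
)

for epoch in range(10):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
```

Because the SSP encoder is frozen and only the small linear head is trained privately, the per-sample gradients are low-dimensional, which is one reason learned features can yield better utility than end-to-end DP training at the same budget.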