We demonstrate that self-supervised pretraining (SSP) is a scalable solution to deep learning with differential privacy (DP) for image classification, regardless of the size of the available public dataset. When no public dataset is available, we show that the features generated by SSP on a single image enable a private classifier to obtain much better utility than non-learned handcrafted features under the same privacy budget. When a moderately sized or large public dataset is available, the features produced by SSP greatly outperform features trained with labels on various complex private datasets under the same privacy budget. We also compare multiple DP-enabled training frameworks for training a private classifier on the features generated by SSP. Finally, we report a non-trivial utility of 25.3\% on a private ImageNet-1K dataset at $\epsilon=3$. Our source code can be found at \url{https://github.com/UnchartedRLab/SSP}.
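The general recipe summarized above, freezing an SSP feature extractor and training only a small classifier on its outputs with DP-SGD, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' released code: it assumes the SSP features have already been precomputed into tensors (here random placeholders), uses a hypothetical linear probe and placeholder hyperparameters, and relies on Opacus' DP-SGD wrapper with a target budget of $\epsilon=3$ as in the ImageNet-1K result.

\begin{verbatim}
# Minimal sketch: DP-SGD training of a linear classifier on frozen SSP
# features with Opacus. All dimensions/hyperparameters are placeholders.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

feat_dim, num_classes = 2048, 1000            # assumed feature/label sizes
train_feats = torch.randn(50_000, feat_dim)   # placeholder for SSP features
train_labels = torch.randint(0, num_classes, (50_000,))

loader = DataLoader(TensorDataset(train_feats, train_labels),
                    batch_size=1024, shuffle=True)
model = nn.Linear(feat_dim, num_classes)      # private linear probe
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

# Wrap model/optimizer/loader so gradients are clipped and noised to meet
# the target (epsilon, delta) over the planned number of epochs.
privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private_with_epsilon(
    module=model, optimizer=optimizer, data_loader=loader,
    epochs=10, target_epsilon=3.0, target_delta=1e-5, max_grad_norm=1.0,
)

for epoch in range(10):
    for feats, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(feats), labels)
        loss.backward()
        optimizer.step()
\end{verbatim}

Because only the small linear head is trained privately while the SSP backbone stays frozen, the per-example gradient clipping and noise of DP-SGD act on far fewer parameters than end-to-end private training, which is what makes this recipe scale.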