A major direction in differentially private machine learning is differentially private fine-tuning: pretraining a model on a source of "public data" and transferring the extracted features to downstream tasks. This is an important setting because many industry deployments fine-tune publicly available feature extractors on proprietary data for downstream tasks. In this paper, we carefully integrate techniques, both new and from prior work, to solve benchmark tasks in computer vision and natural language processing using differentially private fine-tuning. Our key insight is that, by accelerating training through the choice of key hyperparameters, we can quickly drive the model parameters to regions of parameter space where the impact of noise is minimized. We obtain new state-of-the-art performance on CIFAR10, CIFAR100, FashionMNIST, STL10, and PersonaChat, including $99\%$ accuracy on CIFAR10 under $(\varepsilon=1, \delta=10^{-5})$-DP.
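As a concrete illustration of this setting (a minimal sketch, not the paper's exact recipe), the code below fine-tunes a linear head on features from a frozen, publicly pretrained backbone with DP-SGD via Opacus, calibrating the noise to a target $(\varepsilon=1, \delta=10^{-5})$ budget. The backbone, dataset, epoch count, and hyperparameters are illustrative assumptions, not the tuned values from the paper.

```python
# Sketch of differentially private fine-tuning: train a linear head with DP-SGD
# on features from a frozen, publicly pretrained feature extractor.
# Backbone, dataset, and hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn
import torchvision
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen feature extractor pretrained on public data (ImageNet, as an example).
backbone = torchvision.models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = nn.Identity()  # expose the 512-d penultimate features
backbone.eval().to(device)

transform = torchvision.transforms.Compose([
    torchvision.transforms.Resize(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225]),
])
train_set = torchvision.datasets.CIFAR10("data", train=True, download=True,
                                         transform=transform)

# Extract features once; only the linear head is trained on private gradients.
feats, labels = [], []
with torch.no_grad():
    for x, y in DataLoader(train_set, batch_size=256):
        feats.append(backbone(x.to(device)).cpu())
        labels.append(y)
features = TensorDataset(torch.cat(feats), torch.cat(labels))

head = nn.Linear(512, 10).to(device)
optimizer = torch.optim.SGD(head.parameters(), lr=0.1, momentum=0.9)
loader = DataLoader(features, batch_size=1024, shuffle=True)

epochs = 10  # must match the epoch budget given to the privacy engine below

# Opacus calibrates the noise multiplier to hit the target (epsilon, delta)
# over the stated number of epochs, with per-example gradient clipping.
privacy_engine = PrivacyEngine()
head, optimizer, loader = privacy_engine.make_private_with_epsilon(
    module=head,
    optimizer=optimizer,
    data_loader=loader,
    epochs=epochs,
    target_epsilon=1.0,
    target_delta=1e-5,
    max_grad_norm=1.0,  # per-example clipping bound
)

criterion = nn.CrossEntropyLoss()
for _ in range(epochs):
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(head(x.to(device)), y.to(device))
        loss.backward()
        optimizer.step()
```

Freezing the backbone keeps the privatized parameter count small (a single linear layer here), which is one simple way the impact of the injected noise can be limited; the paper's actual hyperparameter choices for accelerating training are described in the main text.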