Privacy noise may negate the benefits of using adaptive optimizers in differentially private model training. Prior works typically address this issue by using auxiliary information (e.g., public data) to boost the effectiveness of adaptive optimization. In this work, we explore techniques to estimate and efficiently adapt to gradient geometry in private adaptive optimization without auxiliary data. Motivated by the observation that adaptive methods can tolerate stale preconditioners, we propose differentially private adaptive training with delayed preconditioners (DP^2), a simple method that constructs delayed but less noisy preconditioners to better realize the benefits of adaptivity. Theoretically, we provide convergence guarantees for our method for both convex and non-convex problems, and analyze trade-offs between delay and privacy noise reduction. Empirically, we explore DP^2 across several real-world datasets, demonstrating that it can improve convergence speed by as much as 4x relative to non-adaptive baselines and match the performance of state-of-the-art optimization methods that require auxiliary data.
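To make the core mechanism concrete, below is a minimal NumPy sketch of the delayed-preconditioner idea, not the authors' implementation. Everything beyond the abstract is an assumption for illustration: a toy linear-regression objective, an RMSProp-style diagonal preconditioner `v`, a fixed delay of `s` steps, and placeholder constants (clip norm `C`, noise multiplier `sigma`, learning rate `lr`). The key point it demonstrates is that averaging the privatized gradients over the delay window shrinks the injected noise in the preconditioner (variance falls roughly as 1/s), at the cost of the preconditioner being `s` steps stale.

```python
# Hypothetical sketch of DP training with a delayed preconditioner.
# Not the DP^2 reference implementation; constants are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = X @ w_true + noise (assumed problem, for illustration only).
n, d = 512, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

w = np.zeros(d)
lr, C, sigma, eps = 0.1, 1.0, 1.0, 1e-8
s = 50                  # delay: refresh the preconditioner every s steps
beta = 0.9              # EMA coefficient (RMSProp-style, assumed)
v = np.ones(d)          # stale diagonal preconditioner (starts at identity)
window = []             # privatized gradients accumulated over the window
batch = 32

for t in range(500):
    idx = rng.choice(n, batch, replace=False)
    # Per-example gradients of the squared loss, clipped to norm C.
    residual = X[idx] @ w - y[idx]
    per_ex = residual[:, None] * X[idx]
    norms = np.linalg.norm(per_ex, axis=1, keepdims=True)
    clipped = per_ex / np.maximum(1.0, norms / C)
    # Gaussian mechanism: noise scale proportional to sigma * C.
    g = clipped.sum(0) / batch + rng.normal(scale=sigma * C, size=d) / batch

    window.append(g)
    if (t + 1) % s == 0:
        # Average the already-privatized gradients over the window; the
        # independent noise averages out, so the refreshed preconditioner
        # is less noisy, though s steps stale.
        g_bar = np.mean(window, axis=0)
        v = beta * v + (1.0 - beta) * g_bar ** 2
        window.clear()

    # Adaptive step using the delayed preconditioner.
    w -= lr * g / (np.sqrt(v) + eps)

print("final loss:", 0.5 * np.mean((X @ w - y) ** 2))
```

Note the design choice the sketch highlights: the preconditioner is built only from gradients that have already been clipped and noised, so refreshing it consumes no additional privacy budget; the delay `s` trades staleness against noise reduction, which is the trade-off the paper analyzes.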