Is overparameterization a privacy liability? In this work, we study the effect that the number of parameters has on a classifier's vulnerability to membership inference attacks. We first demonstrate how the number of parameters of a model can induce a privacy--utility trade-off: increasing the number of parameters generally improves generalization performance at the expense of lower privacy. However, remarkably, we then show that if coupled with proper regularization, increasing the number of parameters of a model can actually simultaneously increase both its privacy and performance, thereby eliminating the privacy--utility trade-off. Theoretically, we demonstrate this curious phenomenon for logistic regression with ridge regularization in a bi-level feature ensemble setting. Pursuant to our theoretical exploration, we develop a novel leave-one-out analysis tool to precisely characterize the vulnerability of a linear classifier to the optimal membership inference attack. We empirically exhibit this "blessing of dimensionality" for neural networks on a variety of tasks using early stopping as the regularizer.
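To make the central objects concrete, here is a minimal, hypothetical sketch (not the paper's actual analysis or attack) of a simple loss-threshold membership inference attack against a ridge-regularized logistic regression model, on synthetic data. All names and parameters below are illustrative assumptions: members of the training set tend to have lower per-example loss than held-out points, and an attack accuracy above 0.5 signals membership leakage.

```python
# Hypothetical sketch: loss-threshold membership inference attack on
# ridge (L2)-regularized logistic regression. Illustrative only; this is
# not the paper's optimal attack or theoretical setting.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic binary classification data.
n, d = 200, 50
X = rng.normal(size=(2 * n, d))
w_true = rng.normal(size=d)
y = (X @ w_true + rng.normal(size=2 * n) > 0).astype(int)

X_train, y_train = X[:n], y[:n]  # "members" (training set)
X_out, y_out = X[n:], y[n:]      # "non-members" (held out)

# C is the inverse of the ridge regularization strength; smaller C means
# stronger regularization, which typically shrinks the member/non-member
# loss gap and weakens the attack.
clf = LogisticRegression(C=1.0, max_iter=1000).fit(X_train, y_train)

def per_example_loss(model, X, y):
    """Cross-entropy loss of each example under the fitted model."""
    p = model.predict_proba(X)
    return -np.log(np.clip(p[np.arange(len(y)), y], 1e-12, None))

loss_in = per_example_loss(clf, X_train, y_train)
loss_out = per_example_loss(clf, X_out, y_out)

# Attack: predict "member" when the loss falls below a threshold.
tau = np.median(np.concatenate([loss_in, loss_out]))
attack_acc = 0.5 * ((loss_in < tau).mean() + (loss_out >= tau).mean())
print(f"membership attack accuracy: {attack_acc:.2f}")
```

Varying `d` (model size) and `C` (regularization strength) in this sketch gives a rough feel for the trade-off the abstract describes: more parameters without regularization widen the loss gap, while stronger regularization narrows it.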