Distribution inference, sometimes called property inference, infers statistical properties about a training set from access to a model trained on that data. Distribution inference attacks can pose serious risks when models are trained on private data, but are difficult to distinguish from the intrinsic purpose of statistical machine learning -- namely, to produce models that capture statistical properties about a distribution. Motivated by Yeom et al.'s membership inference framework, we propose a formal definition of distribution inference attacks that is general enough to describe a broad class of attacks distinguishing between possible training distributions. We show how our definition captures previous ratio-based property inference attacks as well as new kinds of attacks, including revealing the average node degree or clustering coefficient of a training graph. To understand distribution inference risks, we introduce a metric that quantifies observed leakage by relating it to the leakage that would occur if samples from the training distribution were provided directly to the adversary. We report on a series of experiments across a range of different distributions using both novel black-box attacks and improved versions of state-of-the-art white-box attacks. Our results show that inexpensive attacks are often as effective as expensive meta-classifier attacks, and that there are surprising asymmetries in the effectiveness of attacks. Code is available at https://github.com/iamgroot42/FormEstDistRisks.
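As a rough illustration of the distinguishing game that such a definition formalizes, the sketch below shows a toy black-box distribution inference test: an adversary with query access tries to guess which of two candidate attribute ratios a victim model's training data followed. This is not the paper's code or any of its attacks; the synthetic data generator, helper names, and shadow-model threshold test are all illustrative assumptions.

```python
# Illustrative sketch (not the paper's implementation) of a ratio-based,
# black-box distribution inference test. The adversary distinguishes two
# candidate training distributions that differ in the fraction of records
# with a sensitive binary attribute, using only model queries.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample_data(n, attr_ratio):
    """Synthetic binary-classification data; attr_ratio sets P(attribute=1),
    and the attribute shifts both the features and the label."""
    attr = rng.random(n) < attr_ratio
    x = rng.normal(size=(n, 4)) + attr[:, None] * 0.8
    y = (x[:, 0] + x[:, 1] + 0.5 * attr > 0).astype(int)
    return np.column_stack([x, attr]), y

def train_model(attr_ratio, n=2000):
    X, y = sample_data(n, attr_ratio)
    return LogisticRegression(max_iter=1000).fit(X, y)

# Two candidate training distributions the adversary wants to distinguish,
# e.g. 30% vs. 70% of records having the sensitive attribute set.
RATIO_0, RATIO_1 = 0.3, 0.7

# Adversary side: train a few shadow models per candidate distribution and
# record their average confidence on a fixed probe set.
probe_X, _ = sample_data(500, 0.5)

def avg_confidence(model):
    return model.predict_proba(probe_X)[:, 1].mean()

shadow_scores = {r: np.mean([avg_confidence(train_model(r)) for _ in range(5)])
                 for r in (RATIO_0, RATIO_1)}
threshold = (shadow_scores[RATIO_0] + shadow_scores[RATIO_1]) / 2

# Victim side: a model secretly trained on one of the two distributions.
secret_ratio = RATIO_1
victim = train_model(secret_ratio)

# Attack: compare the victim's probe confidence against the shadow threshold.
guess = RATIO_1 if avg_confidence(victim) > threshold else RATIO_0
print(f"guessed ratio = {guess}, true ratio = {secret_ratio}")
```

The same distinguishing structure extends beyond attribute ratios to other distribution-level properties (such as the average node degree of a training graph), with the probe statistic and threshold replaced accordingly.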