Dataset correlation inference attacks against machine learning models

Machine learning models are often trained on sensitive and proprietary datasets. Yet what a model leaks about its dataset, and under which conditions, is not well understood. Most previous work studies the leakage of information about individual records. In many situations, however, global information about a dataset, such as its underlying distribution (e.g., $k$-way marginals or correlations), is similarly sensitive or secret. We explore, for the first time, whether a model leaks information about the correlations between the input variables of its training dataset, which we call a correlation inference attack. First, we propose a model-less attack, showing how an attacker can exploit the spherical parametrization of correlation matrices to make an informed guess based solely on the correlations between the input variables and the target variable. Second, we propose a model-based attack, showing how an attacker can exploit black-box access to the model to infer the correlations using shadow models trained on synthetic datasets. Our synthetic data generation approach combines Gaussian copula-based generative modeling with a carefully adapted procedure for sampling correlation matrices under constraints. Third, we evaluate our model-based attack against Logistic Regression and Multilayer Perceptron models and show that it strongly outperforms the model-less attack on three real-world tabular datasets, indicating that the models leak information about the correlations. We also propose a novel correlation inference-based attribute inference attack (CI-AIA) and show that it obtains state-of-the-art performance. Taken together, our results show how attackers can use the model to extract information about the dataset distribution and use it to improve their prior on sensitive attributes of individual records.
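To make the intuition behind the model-less attack concrete, the following is a minimal sketch for the three-variable case (two inputs $X_1$, $X_2$ and a target $Y$). It is an illustration under stated assumptions, not the procedure evaluated in the paper. With the spherical parametrization, $\rho(X_1, X_2) = \cos\theta_1 \cos\theta_2 + \sin\theta_1 \sin\theta_2 \cos\phi$, where $\cos\theta_i = \rho(X_i, Y)$ and $\phi \in [0, \pi]$ is the only free angle, so knowing the two correlations with the target already restricts $\rho(X_1, X_2)$ to an interval. The function names and the uniform sampling of $\phi$ are hypothetical choices made for this example.

```python
import numpy as np


def correlation_bounds(r1: float, r2: float) -> tuple[float, float]:
    """Interval of rho(X1, X2) compatible with rho(X1, Y) = r1 and rho(X2, Y) = r2.

    Follows from requiring the 3x3 correlation matrix to be positive semidefinite.
    """
    half_width = np.sqrt((1.0 - r1**2) * (1.0 - r2**2))
    return r1 * r2 - half_width, r1 * r2 + half_width


def sample_rho12(r1: float, r2: float, rng: np.random.Generator) -> float:
    """Draw one admissible rho(X1, X2) via the spherical parametrization.

    Assumption for illustration: the free angle phi is uniform on [0, pi].
    """
    theta1, theta2 = np.arccos(r1), np.arccos(r2)
    phi = rng.uniform(0.0, np.pi)
    return float(
        np.cos(theta1) * np.cos(theta2)
        + np.sin(theta1) * np.sin(theta2) * np.cos(phi)
    )


def build_correlation_matrix(r1: float, r2: float, rho12: float) -> np.ndarray:
    """Assemble the correlation matrix in the variable order (Y, X1, X2)."""
    return np.array([
        [1.0, r1, r2],
        [r1, 1.0, rho12],
        [r2, rho12, 1.0],
    ])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    r1, r2 = 0.8, 0.6
    low, high = correlation_bounds(r1, r2)
    rho12 = sample_rho12(r1, r2, rng)
    corr = build_correlation_matrix(r1, r2, rho12)
    print("admissible interval:", (low, high))
    print("sampled rho(X1, X2):", rho12)
    print("smallest eigenvalue:", np.linalg.eigvalsh(corr).min())
```

For $\rho(X_1, Y) = 0.8$ and $\rho(X_2, Y) = 0.6$, the admissible interval is approximately $[0, 0.96]$, so an attacker who has never seen the model can already rule out negative correlations between the two inputs. The stronger the correlations with the target, the narrower the interval and the more informative such a guess becomes.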
