Property inference attacks consider an adversary who has access to a trained model and tries to extract global statistics of its training data. In this work, we study property inference in scenarios where the adversary can maliciously control part of the training data (poisoning data) with the goal of increasing the leakage. Previous work on poisoning attacks focused on degrading the accuracy of models, either on the whole population or on specific sub-populations or instances. Here, for the first time, we study poisoning attacks where the goal of the adversary is to increase the information leakage of the model. Our findings suggest that poisoning attacks can boost the information leakage significantly and should be considered a stronger threat model in sensitive applications where some of the data sources may be malicious. We describe our \emph{property inference poisoning attack}, which allows the adversary to learn the prevalence in the training data of any property it chooses. We theoretically prove that our attack can always succeed as long as the learning algorithm used has good generalization properties. We then verify the effectiveness of our attack by experimentally evaluating it on two datasets: a Census dataset and the Enron email dataset. We were able to achieve above $90\%$ attack accuracy with $9$--$10\%$ poisoning in all of our experiments.