Property inference attacks allow an adversary to extract global properties of the training dataset from a machine learning model. Such attacks have privacy implications for data owners who share their datasets to train machine learning models. Several approaches for property inference attacks against deep neural networks have been proposed, but they all rely on the attacker training a large number of shadow models, which incurs a large computational overhead. In this paper, we consider the setting of property inference attacks in which the attacker can poison a subset of the training dataset and query the trained target model. Motivated by our theoretical analysis of model confidences under poisoning, we design an efficient property inference attack, SNAP, which achieves higher attack success and requires less poisoning than the state-of-the-art poisoning-based property inference attack by Mahloujifar et al. For example, on the Census dataset, SNAP achieves a 34% higher success rate than Mahloujifar et al. while being 56.5x faster. We also extend our attack to determine whether a certain property is present at all in the training data, and to efficiently estimate the exact proportion of a property of interest. We evaluate our attack on several properties of varying proportions from four datasets, and demonstrate SNAP's generality and effectiveness.