Publishing tables synthesized by Generative Adversarial Networks (GANs) lets people learn insights privately, without access to the private table. However, existing studies of Membership Inference (MI) attacks show promising results in disclosing the membership of the training datasets behind GAN-synthesized tables. Unlike those works, which focus on determining the membership of a given data point, in this paper we propose a novel Membership Collision Attack against GANs (TableGAN-MCA), which allows an adversary given only synthetic entries randomly sampled from a black-box generator to recover part of the GAN training data. That is, a GAN-synthesized table that is immune to state-of-the-art MI attacks is still vulnerable to TableGAN-MCA. The success of TableGAN-MCA is driven by the observation that GAN-synthesized tables potentially collide with the training data of the generator. Our experimental evaluation of TableGAN-MCA yields five main findings. First, TableGAN-MCA achieves a satisfactory training data recovery rate on three commonly used real-world datasets against four generative models. Second, factors including the size of the GAN training data, the number of GAN training epochs, and the number of synthetic samples available to the adversary are positively correlated with the success of TableGAN-MCA. Third, highly frequent data points are at high risk of being recovered by TableGAN-MCA. Fourth, some unique data points are exposed to unexpectedly high recovery risk under TableGAN-MCA, which may be attributable to the GAN's generalization. Fifth, as expected, differential privacy that does not account for correlations between features offers little mitigation against TableGAN-MCA. Finally, we propose two mitigation methods and show that they offer promising privacy-utility trade-offs when protecting against TableGAN-MCA.
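To make the notion of a "collision" concrete, the following is a minimal sketch that counts how many synthetic rows exactly match training rows and what fraction of distinct training rows is exposed this way. It is not the TableGAN-MCA attack itself: it assumes access to the training table, so it is only an evaluation-side measurement a data owner could run, whereas the adversary described above sees synthetic entries only. The function names `find_collisions` and `recovery_rate`, the exact-match definition of a collision, and the toy rows are all assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, Iterable, Sequence, Tuple

# A table row is modeled as a tuple of hashable feature values (assumption).
Row = Tuple[Hashable, ...]


def find_collisions(synthetic_rows: Iterable[Row],
                    training_rows: Sequence[Row]) -> Counter:
    """Count how often each training row appears verbatim in the synthetic sample."""
    training_set = set(training_rows)
    # Keep only synthetic rows that exactly match some training row.
    return Counter(row for row in synthetic_rows if row in training_set)


def recovery_rate(synthetic_rows: Iterable[Row],
                  training_rows: Sequence[Row]) -> float:
    """Fraction of distinct training rows that also occur in the synthetic sample."""
    training_set = set(training_rows)
    if not training_set:
        return 0.0
    collided = set(synthetic_rows) & training_set
    return len(collided) / len(training_set)


# Toy data, invented purely for illustration.
training = [("A", 30, "yes"), ("B", 41, "no"), ("C", 25, "yes")]
synthetic = [("A", 30, "yes"), ("A", 30, "yes"), ("D", 52, "no"), ("C", 25, "yes")]

collisions = find_collisions(synthetic, training)
rate = recovery_rate(synthetic, training)
print(collisions)
print(f"exposed fraction of distinct training rows: {rate:.2f}")
```

In this sketch, a row that the generator emits repeatedly shows up with a higher collision count, which loosely mirrors the finding that highly frequent data points are at high risk of recovery.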