Membership inference (MI) attacks highlight a privacy weakness in present stochastic training methods for neural networks. It is not well understood, however, why they arise. Are they simply a natural consequence of imperfect generalization? Which underlying causes should we address during training to mitigate these attacks? Towards answering such questions, we propose the first approach based on principled causal reasoning to explain MI attacks and their connection to generalization. We offer causal graphs that quantitatively explain the observed MI attack performance achieved for $6$ attack variants. We refute several prior non-quantitative hypotheses that over-simplify or over-estimate the influence of underlying causes and thereby fail to capture the complex interplay between several factors. Our causal models also reveal a new connection between generalization and MI attacks via their shared causal factors. Our causal models have high predictive power ($0.90$), i.e., their analytical predictions often match observations in unseen experiments, which makes analysis via them a pragmatic alternative.