Modern deep learning requires large volumes of data, which may contain sensitive or private information that must not be leaked. Recent work has shown that, for homogeneous neural networks, a large portion of the training data can be reconstructed with access only to the trained network parameters. While the attack was demonstrated empirically, there is little formal understanding of the regime in which it is effective or of how to defend against it. In this work, we first build a stronger version of the dataset reconstruction attack and show that it can provably recover the entire training set in the infinite-width regime. We then empirically study the characteristics of this attack on two-layer networks and reveal that its success depends heavily on deviations from the frozen infinite-width Neural Tangent Kernel (NTK) limit. More importantly, we formally show for the first time that dataset reconstruction attacks are a variant of dataset distillation. This key theoretical result, unifying dataset reconstruction and distillation, not only sheds more light on the characteristics of the attack but also enables us to design defense mechanisms against it via distillation algorithms.
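To make the attack setting concrete, the sketch below illustrates one plausible form of a parameter-matching reconstruction attack on a two-layer network, in the spirit of KKT-based attacks on homogeneous networks: the attacker optimizes candidate inputs and dual coefficients so that a nonnegative combination of per-example margin gradients matches the observed trained parameters. This is a minimal illustration under assumed dimensions and hyperparameters, not the paper's actual algorithm; all names here (`victim`, `param_grad`, `lam`, etc.) are hypothetical.

```python
# A minimal, illustrative sketch of a parameter-matching reconstruction attack.
# All dimensions and hyperparameters below are assumptions for illustration.
import torch
import torch.nn as nn

torch.manual_seed(0)

d, m, n_candidates = 10, 64, 8  # input dim, hidden width, candidate count (assumed)

# The "victim": a trained two-layer network whose parameters the attacker observes.
victim = nn.Sequential(nn.Linear(d, m), nn.ReLU(), nn.Linear(m, 1))
theta_star = torch.cat([p.detach().flatten() for p in victim.parameters()])

def param_grad(model, x, y):
    """Flattened gradient of the margin y * f(x) w.r.t. the model parameters,
    kept differentiable w.r.t. x via create_graph=True."""
    out = y * model(x.unsqueeze(0)).squeeze()
    grads = torch.autograd.grad(out, list(model.parameters()), create_graph=True)
    return torch.cat([g.flatten() for g in grads])

# Attacker's variables: candidate inputs, fixed +/-1 labels, dual coefficients.
X = torch.randn(n_candidates, d, requires_grad=True)
y = torch.tensor([1.0, -1.0] * (n_candidates // 2))
lam = torch.rand(n_candidates, requires_grad=True)

opt = torch.optim.Adam([X, lam], lr=1e-2)
for step in range(500):
    opt.zero_grad()
    # Stationarity residual: the trained parameters should be (approximately)
    # a nonnegative combination of per-example margin gradients at the candidates.
    combo = sum(lam[i].clamp(min=0) * param_grad(victim, X[i], y[i])
                for i in range(n_candidates))
    loss = (theta_star - combo).pow(2).sum()
    loss.backward()
    opt.step()
# After optimization, rows of X are the reconstructed training examples.
```

The design choice of matching a stationarity condition, rather than retraining on candidates, is what lets the attacker work from the final parameters alone; the same bilevel flavor of optimizing synthetic inputs against a trained model is what connects this attack to dataset distillation.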