Pretrained deep models hold their learnt knowledge in the form of model parameters. These parameters act as "memory" for the trained models and help them generalize well on unseen data. However, in the absence of training data, the utility of a trained model is limited to either inference or serving as a better initialization for a target task. In this paper, we go further and extract synthetic data by leveraging the learnt model parameters. We dub these samples "Data Impressions", which act as a proxy for the training data and can be used to realize a variety of tasks. They are useful in scenarios where only the pretrained models are available and the training data is not shared (e.g., due to privacy or sensitivity concerns). We show the applicability of data impressions in solving several computer vision tasks such as unsupervised domain adaptation, continual learning, and knowledge distillation. We also study the adversarial robustness of lightweight models trained via knowledge distillation using these data impressions. Further, we demonstrate the efficacy of data impressions in generating data-free Universal Adversarial Perturbations (UAPs) with better fooling rates. Extensive experiments on benchmark datasets demonstrate the competitive performance achieved using data impressions in the absence of the original training data.
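The sketch below illustrates the core idea of extracting synthetic proxy data from a frozen pretrained classifier: random noise inputs are optimized so that the model's softmax output matches a sampled soft target label. This is a minimal, hedged approximation rather than the paper's exact Data Impressions recipe (which additionally models class similarities when sampling the targets); the choice of `resnet18`, the plain Dirichlet prior, and all hyperparameters here are illustrative assumptions.

```python
# Minimal sketch (not the exact method): synthesize proxy samples by optimizing
# noise images so a frozen pretrained classifier assigns them sampled soft labels.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in pretrained model; in the data-free setting this is all we are given.
model = resnet18(weights="IMAGENET1K_V1").to(device).eval()
for p in model.parameters():
    p.requires_grad_(False)  # only the synthetic inputs are optimized

num_classes, batch = 1000, 8
# Sample soft target label vectors. A plain Dirichlet(1) prior is assumed here
# for simplicity; a class-similarity-aware prior can be used instead.
targets = torch.distributions.Dirichlet(
    torch.ones(num_classes)).sample((batch,)).to(device)

# Start from random noise and optimize it to match the sampled targets.
x = torch.randn(batch, 3, 224, 224, device=device, requires_grad=True)
opt = torch.optim.Adam([x], lr=0.05)

for step in range(500):
    opt.zero_grad()
    logits = model(x)
    # Cross-entropy between the sampled soft targets and the model's prediction.
    loss = -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    loss.backward()
    opt.step()

# The optimized inputs act as proxy training data (e.g., for distillation or UAPs).
data_impressions = x.detach()
```

In practice, such proxy samples are generated class-wise and in large numbers, and then used in place of the original training set for downstream tasks such as knowledge distillation.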