Transfer learning has become a common solution to the scarcity of training data in practice. It trains a specified student model by reusing or fine-tuning the early layers of a well-trained teacher model that is usually publicly available. However, besides improving utility, the transferred public knowledge also poses a potential threat to model confidentiality and can further raise other security and privacy issues. In this paper, we present the first comprehensive investigation of the teacher model exposure threat in the transfer learning context, aiming to gain deeper insight into the tension between public knowledge and model confidentiality. To this end, we propose a teacher model fingerprinting attack to infer the origin of a student model, i.e., the teacher model it transfers from. Specifically, we propose a novel optimization-based method to carefully generate queries that probe the student model and realize our attack. Unlike existing model reverse engineering approaches, our fingerprinting method relies neither on fine-grained model outputs, e.g., posteriors, nor on auxiliary information about the model architecture or training dataset. We systematically evaluate the effectiveness of the proposed attack. The empirical results demonstrate that our attack can accurately identify the model origin with only a few probing queries. Moreover, we show that the proposed attack can serve as a stepping stone to facilitate other attacks against machine learning models, such as model stealing.