Tree models are very widely used in practice of machine learning and data mining. In this paper, we study the problem of model integrity authentication in tree models. In general, the task of model integrity authentication is the design \& implementation of mechanisms for checking/detecting whether the model deployed for the end-users has been tampered with or compromised, e.g., malicious modifications on the model. We propose an authentication framework that enables the model builders/distributors to embed a signature to the tree model and authenticate the existence of the signature by only making a small number of black-box queries to the model. To the best of our knowledge, this is the first study of signature embedding on tree models. Our proposed method simply locates a collection of leaves and modifies their prediction values, which does not require any training/testing data nor any re-training. The experiments on a large number of public classification datasets confirm that the proposed signature embedding process has a high success rate while only introducing a minimal prediction accuracy loss.
翻译:植树模型在机器学习和数据挖掘实践中被广泛使用。在本文中,我们研究了树类模型的完整性认证模型的问题。一般来说,模型的完整性认证的任务是设计用于检查/检测为终端用户部署的模型是否被篡改或损坏的机制,例如对模型的恶意修改。我们提出了一个认证框架,使模型建设者/分配者能够将签名嵌入树类模型,并通过仅对模型进行少量黑盒查询来验证签名的存在。根据我们所知,这是首次在树类模型上嵌入签名的研究。我们提议的方法只是查找树叶的集合并修改其预测值,而不需要任何培训/测试数据或任何再培训。对大量公共分类数据集的实验证实,拟议的签名嵌入过程的成功率很高,而只是引入了最低限度的预测准确性损失。